Platform for AI: Model Gallery Quick Start

Last updated: Nov 21, 2025

Model Gallery wraps PAI-DLC and PAI-EAS so that you can deploy and train open-source large models efficiently without writing any code. This topic uses the Qwen3-0.6B model as an example to show how to use Model Gallery; the same workflow applies to other models.

Prerequisites

Use an Alibaba Cloud account to activate PAI and create a workspace. Log on to the PAI console, select a region in the upper-left corner, and then complete authorization and activation with one click.

Billing

The examples in this topic create a DLC job and an EAS service on public resources, billed on a pay-as-you-go basis. For detailed billing rules, see DLC billing and EAS billing.

Model deployment

Deploy the model

  1. Log on to the PAI console. In the left-side navigation pane, click Model Gallery, search for and find the Qwen3-0.6B card, and then click Deploy.


  2. Configure the deployment parameters. The deployment configuration page is pre-filled with default values; simply click Deploy > OK. Deployment takes about 5 minutes. When the service is in the Running state, the deployment has succeeded.

    By default, the model is deployed on public resources and billed on a pay-as-you-go basis.


Call the model

  1. View the invocation information. On the service details page, click View Invocation Information to obtain the endpoint and token.

    To view the deployment task details later, click Model Gallery > Task Management > Deployment Tasks in the left-side navigation pane, and then click the service name.


  2. Try out the model service. Common invocation methods are as follows:

    Online debugging

    Switch to the Online Debugging tab. In the request's content field, enter a question such as 你好,你是誰 ("Hello, who are you?"), and then click Send Request. The model's answer is returned on the right.


    Use the Cherry Studio client

    Cherry Studio is a popular large-model chat client with built-in MCP support, which makes it easy to chat with large models.

    Connect to the Qwen3 model deployed on PAI

    Use the Python SDK

    from openai import OpenAI
    import os
    
    # If the 'Token' environment variable is not set, replace the following line with
    # your EAS service token, e.g.: token = 'YTA1NTEzMzY3ZTY4Z******************'
    token = os.environ.get("Token")
    if token is None:
        print("Please set the 'Token' environment variable, or assign your token directly to the token variable.")
        exit()
    
    # Do not remove the "/v1" suffix after the endpoint.
    client = OpenAI(
        api_key=token,
        base_url='<endpoint>/v1',
    )
    
    query = '你好,你是誰'
    messages = [{'role': 'user', 'content': query}]
    
    resp = client.chat.completions.create(model='Qwen3-0.6B', messages=messages, max_tokens=512, temperature=0)
    response = resp.choices[0].message.content
    print(f'query: {query}')
    print(f'response: {response}')

Important

This topic creates model services on public resources, billed on a pay-as-you-go basis. When you no longer need a service, stop or delete it to avoid continued charges.


Model fine-tuning

If you want the model to perform better in a specific domain, you can fine-tune it on a dataset from that domain. This topic uses the following scenario to illustrate the purpose and steps of model fine-tuning.

Sample scenario

In logistics, structured information (such as the recipient, address, and phone number) often needs to be extracted from natural language. Directly using a large model (for example, Qwen3-235B-A22B) works well but is costly and slow. To balance quality and cost, you can first label data with the large model and then use that data to fine-tune a small model (such as Qwen3-0.6B) so that it achieves comparable performance on the same task. This process is also known as model distillation.

On the same structured-information extraction task, the original Qwen3-0.6B model achieves 50% accuracy; after fine-tuning, accuracy can exceed 90%.

Recipient address sample:

Amina Patel - Phone number (474) 598-1543 - 1425 S 5th St, Apt 3B, Allentown, Pennsylvania 18104

Structured information sample:

{
    "state": "Pennsylvania",
    "city": "Allentown",
    "zip_code": "18104",
    "street_address": "1425 S 5th St, Apt 3B",
    "name": "Amina Patel",
    "phone": "(474) 598-1543"
}
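Downstream systems usually require the extraction to be well-formed before it can be used. As a minimal illustrative sketch (not part of the official workflow), the following check verifies that a model response parses as JSON and carries exactly the six keys shown above:

```python
import json

# The six keys from the structured information sample above.
REQUIRED_KEYS = {"name", "street_address", "city", "state", "zip_code", "phone"}

def is_valid_extraction(raw: str) -> bool:
    """Return True if raw parses as a JSON object with exactly the expected keys."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(record, dict) and set(record) == REQUIRED_KEYS

sample = ('{"state": "Pennsylvania", "city": "Allentown", "zip_code": "18104", '
          '"street_address": "1425 S 5th St, Apt 3B", "name": "Amina Patel", '
          '"phone": "(474) 598-1543"}')
print(is_valid_extraction(sample))        # True
print(is_valid_extraction("not json"))    # False
```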

Data preparation

To distill the teacher model's (Qwen3-235B-A22B) knowledge of this task into Qwen3-0.6B, first call the teacher model's API to extract the recipient address information into structured JSON data. Generating this JSON data can take a long time, so this topic provides a ready-made sample training set, train.json, and validation set, eval.json, that you can download and use directly.

In model distillation, the large model is also called the teacher model. All data used in this topic was synthetically generated by a large model and contains no sensitive user information.

Recommendations for collecting data for production use

If you plan to apply this approach to a real business scenario, we recommend preparing data in the following ways:

Real business data (recommended)

Real business data best reflects your actual scenario, so a model fine-tuned on it adapts better to your business. After collecting business data, you need to programmatically convert it into a JSON file in the following format.

[
    {
        "instruction": "You are an expert assistant for extracting structured JSON from US shipping information. The JSON keys are name, street_address, city, state, zip_code, and phone.  Name: Isabella Rivera Cruz | 182 Calle Luis Lloréns Torres, Apt 3B, Mayagüez, Puerto Rico 00680 | MOBILE: (640) 486-5927",
        "output": "{\"name\": \"Isabella Rivera Cruz\", \"street_address\": \"182 Calle Luis Lloréns Torres, Apt 3B\", \"city\": \"Mayagüez\", \"state\": \"Puerto Rico\", \"zip_code\": \"00680\", \"phone\": \"(640) 486-5927\"}"
    },
    {
        "instruction": "You are an expert assistant for extracting structured JSON from US shipping information. The JSON keys are name, street_address, city, state, zip_code, and phone.  1245 Broadwater Avenue, Apt 3B, Bozeman, Montana 59715Receiver: Aisha PatelP: (429) 763-9742",
        "output": "{\"name\": \"Aisha Patel\", \"street_address\": \"1245 Broadwater Avenue, Apt 3B\", \"city\": \"Bozeman\", \"state\": \"Montana\", \"zip_code\": \"59715\", \"phone\": \"(429) 763-9742\"}"
    }
]

The JSON file contains multiple training samples. Each sample has two fields, instruction and output:

  • instruction: the prompt that guides the large model's behavior, together with the input data;

  • output: the expected reference answer, usually produced by human experts or a large model (such as qwen3-235b-a22b);
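The conversion itself is straightforward; here is a minimal sketch (the prompt text matches the sample above, while the records list is hypothetical business data that you would in practice load from your own systems):

```python
import json

SYSTEM_PROMPT = ("You are an expert assistant for extracting structured JSON from US "
                 "shipping information. The JSON keys are name, street_address, city, "
                 "state, zip_code, and phone.")

# Hypothetical business records: raw text paired with its verified extraction.
records = [
    ("To: Jane Doe | 1 Main St, Springfield, Illinois 62701 | P: (217) 555-0100",
     {"name": "Jane Doe", "street_address": "1 Main St", "city": "Springfield",
      "state": "Illinois", "zip_code": "62701", "phone": "(217) 555-0100"}),
]

# Each sample joins the prompt and the raw input into `instruction`,
# and serializes the reference answer into `output`.
samples = [
    {"instruction": f"{SYSTEM_PROMPT}  {raw}",
     "output": json.dumps(label, ensure_ascii=False)}
    for raw, label in records
]

with open("train.json", "w", encoding="utf-8") as f:
    json.dump(samples, f, ensure_ascii=False, indent=4)
```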

Generation with a large model

When business data is limited, you can use a large model for data augmentation to improve the diversity and coverage of the data. To avoid leaking user privacy, this solution used a large model to generate a batch of fictitious address data; the generation code below is provided for reference.

Sample code for generating the synthetic dataset

This sample code calls the large model service in Alibaba Cloud Model Studio (Bailian), so you need to obtain a Model Studio API key. The code uses qwen-plus-latest to generate business data and qwen3-235b-a22b to label it.

# -*- coding: utf-8 -*-
import os
import asyncio
import random
import json
import sys
from typing import List
import platform
from openai import AsyncOpenAI

# Create an asynchronous client instance
# NOTE: This script uses the DashScope-compatible API endpoint.
# If you are using a different OpenAI-compatible service, change the base_url.
client = AsyncOpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
)

# List of US States and Territories
us_states = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "Delaware",
    "Florida", "Georgia", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky",
    "Louisiana", "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota", "Mississippi",
    "Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico",
    "New York", "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon", "Pennsylvania",
    "Rhode Island", "South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", "Vermont",
    "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming", "District of Columbia",
    "Puerto Rico", "Guam", "American Samoa", "U.S. Virgin Islands", "Northern Mariana Islands"
]

# Recipient templates
recipient_templates = [
    "To: {name}", "Recipient: {name}", "Deliver to {name}", "For: {name}",
    "ATTN: {name}", "{name}", "Name: {name}", "Contact: {name}", "Receiver: {name}"
]

# Phone number templates
phone_templates = [
    "Tel: {phone}", "Tel. {phone}", "Mobile: {phone}", "Phone: {phone}",
    "Contact number: {phone}", "Phone number {phone}", "TEL: {phone}", "MOBILE: {phone}",
    "Contact: {phone}", "P: {phone}", "{phone}", "Call: {phone}",
]


# Generate a plausible US-style phone number
def generate_us_phone():
    """Generates a random 10-digit US phone number in (XXX) XXX-XXXX format."""
    area_code = random.randint(201, 999)  # Avoid 0xx, 1xx area codes
    exchange = random.randint(200, 999)
    line = random.randint(1000, 9999)
    return f"({area_code}) {exchange}-{line}"


# Use LLM to generate recipient and address information
async def generate_recipient_and_address_by_llm(state: str):
    """Uses LLM to generate a recipient's name and address details for a given state."""
    prompt = f"""Please generate recipient information for a location in {state}, USA, including:
1. A realistic full English name (can be common or less common, aim for diversity).
2. A real city name within that state.
3. A specific street address (e.g., street number + name, apartment number, etc., should be realistic).
4. A corresponding 5-digit ZIP code for that city/area.

Please return only the JSON object in the following format:
{{"name": "Recipient Name", "city": "City Name", "street_address": "Specific Street Address", "zip_code": "ZIP Code"}}

Do not include any other text, just the JSON. Ensure names are diverse, not just John Doe.
"""

    try:
        response = await client.chat.completions.create(
            messages=[{"role": "user", "content": prompt}],
            model="qwen-plus-latest",
            temperature=1.5,  # Increase temperature for more diverse names and addresses
        )

        result = response.choices[0].message.content.strip()
        # Clean up potential markdown code block markers
        if result.startswith('```'):
            result = result.split('\n', 1)[1]
        if result.endswith('```'):
            result = result.rsplit('\n', 1)[0]

        # Try to parse JSON
        info = json.loads(result)
        print(info)
        return info
    except Exception as e:
        print(f"Failed to generate recipient and address: {e}, using fallback.")
        # Fallback mechanism
        backup_names = ["Michael Johnson", "Emily Williams", "David Brown", "Jessica Jones", "Christopher Davis",
                        "Sarah Miller"]
        return {
            "name": random.choice(backup_names),
            "city": "Anytown",
            "street_address": f"{random.randint(100, 9999)} Main St",
            "zip_code": f"{random.randint(10000, 99999)}"
        }


# Generate a single raw data record
async def generate_record():
    """Generates one messy, combined string of US address information."""
    # Randomly select a state
    state = random.choice(us_states)

    # Use LLM to generate recipient and address info
    info = await generate_recipient_and_address_by_llm(state)

    # Format recipient name
    recipient = random.choice(recipient_templates).format(name=info['name'])

    # Generate a phone number
    phone = generate_us_phone()
    phone_info = random.choice(phone_templates).format(phone=phone)

    # Assemble the full address line
    full_address = f"{info['street_address']}, {info['city']}, {state} {info['zip_code']}"

    # Combine all components
    components = [recipient, phone_info, full_address]

    # Randomize the order of components
    random.shuffle(components)

    # Choose a random separator
    separators = [' ', ', ', '; ', ' | ', '\t', ' - ', ' // ', '', '  ']
    separator = random.choice(separators)

    # Join the components
    combined_data = separator.join(components)
    return combined_data.strip()


# Generate a batch of data
async def generate_batch_data(count: int) -> List[str]:
    """Generates a specified number of data records."""
    print(f"Starting to generate {count} records...")

    # Use a semaphore to control concurrency (e.g., up to 20 concurrent requests)
    semaphore = asyncio.Semaphore(20)

    async def generate_single_record(index):
        async with semaphore:
            try:
                record = await generate_record()
                print(f"Generated record #{index + 1}: {record}")
                return record
            except Exception as e:
                print(f"Failed to generate record #{index + 1}: {e}")
                return None

    # Concurrently generate data
    tasks = [generate_single_record(i) for i in range(count)]

    data = await asyncio.gather(*tasks)

    successful_data = [record for record in data if record is not None]

    return successful_data


# Save data to a file
def save_data(data: List[str], filename: str = "us_recipient_data.json"):
    """Saves the generated data to a JSON file."""
    with open(filename, 'w', encoding='utf-8') as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
    print(f"Data has been saved to {filename}")


# Phase 1: Data Production
async def produce_data_phase():
    """Handles the generation of raw recipient data."""
    print("=== Phase 1: Starting Raw Recipient Data Generation ===")

    # Generate 2000 records
    batch_size = 2000
    data = await generate_batch_data(batch_size)

    # Save the data
    save_data(data, "us_recipient_data.json")

    print(f"\nTotal records generated: {len(data)}")
    print("\nSample Data:")
    for i, record in enumerate(data[:3]):  # Show first 3 as examples
        print(f"{i + 1}. Raw Data: {record}\n")

    print("=== Phase 1 Complete ===\n")
    return True


# Define the system prompt for the extraction model
def get_system_prompt_for_extraction():
    """Returns the system prompt for the information extraction task."""
    return """You are a professional information extraction assistant specializing in parsing US shipping addresses from unstructured text.

## Task Description
Based on the given input text, accurately extract and generate a JSON object containing the following six fields:
- name: The full name of the recipient.
- street_address: The complete street address, including number, street name, and any apartment or suite number.
- city: The city name.
- state: The full state name (e.g., "California", not "CA").
- zip_code: The 5 or 9-digit ZIP code.
- phone: The complete contact phone number.

## Extraction Rules
1.  **Address Handling**:
    -   Accurately identify the components: street, city, state, and ZIP code.
    -   The `state` field must be the full official name (e.g., "New York", not "NY").
    -   The `street_address` should contain all details before the city, such as "123 Apple Lane, Apt 4B".
2.  **Name Identification**:
    -   Extract the full recipient name.
3.  **Phone Number Handling**:
    -   Extract the complete phone number, preserving its original format.
4.  **ZIP Code**:
    -   Extract the 5-digit or 9-digit (ZIP+4) code.

## Output Format
Strictly adhere to the following JSON format. Do not add any explanatory text or markdown.
{
  "name": "Recipient's Full Name",
  "street_address": "Complete Street Address",
  "city": "City Name",
  "state": "Full State Name",
  "zip_code": "ZIP Code",
  "phone": "Contact Phone Number"
}
"""


# Use LLM to predict structured data from raw text
async def predict_structured_data(raw_data: str):
    """Uses an LLM to predict structured data from a raw string."""
    system_prompt = get_system_prompt_for_extraction()

    try:
        response = await client.chat.completions.create(
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": raw_data}
            ],
            model="qwen3-235b-a22b",  # A powerful model is recommended for this task
            temperature=0.0,  # Lower temperature for higher accuracy in extraction
            response_format={"type": "json_object"},
            extra_body={"enable_thinking": False}
        )

        result = response.choices[0].message.content.strip()

        # Clean up potential markdown code block markers
        if result.startswith('```'):
            lines = result.split('\n')
            for i, line in enumerate(lines):
                if line.strip().startswith('{'):
                    result = '\n'.join(lines[i:])
                    break
        if result.endswith('```'):
            result = result.rsplit('\n```', 1)[0]

        structured_data = json.loads(result)
        return structured_data

    except Exception as e:
        print(f"Failed to predict structured data: {e}, Raw data: {raw_data}")
        # Return an empty structure on failure
        return {
            "name": "",
            "street_address": "",
            "city": "",
            "state": "",
            "zip_code": "",
            "phone": ""
        }


# Phase 2: Data Conversion
async def convert_data_phase():
    """Reads raw data, predicts structured format, and saves as SFT data."""
    print("=== Phase 2: Starting Data Conversion to SFT Format ===")

    try:
        print("Reading us_recipient_data.json file...")
        with open('us_recipient_data.json', 'r', encoding='utf-8') as f:
            raw_data_list = json.load(f)

        print(f"Successfully read {len(raw_data_list)} records.")
        print("Starting to predict structured data using the extraction model...")

        # A simple and clear system message can improve training and inference speed.
        system_prompt = "You are an expert assistant for extracting structured JSON from US shipping information. The JSON keys are name, street_address, city, state, zip_code, and phone."
        output_file = 'us_recipient_sft_data.json'

        # Use a semaphore to control concurrency
        semaphore = asyncio.Semaphore(10)

        async def process_single_item(index, raw_data):
            async with semaphore:
                structured_data = await predict_structured_data(raw_data)
                print(f"Processing record #{index + 1}: {raw_data}")

                conversation = {
                    "instruction": system_prompt + '  ' + raw_data,
                    "output": json.dumps(structured_data, ensure_ascii=False)
                }

                return conversation

        print(f"Starting conversion to {output_file}...")

        tasks = [process_single_item(i, raw_data) for i, raw_data in enumerate(raw_data_list)]
        conversations = await asyncio.gather(*tasks)

        with open(output_file, 'w', encoding='utf-8') as outfile:
            json.dump(conversations, outfile, ensure_ascii=False, indent=4)

        print(f"Conversion complete! Processed {len(raw_data_list)} records.")
        print(f"Output file: {output_file}")
        print("=== Phase 2 Complete ===")

    except FileNotFoundError:
        print("Error: us_recipient_data.json not found.")
        sys.exit(1)
    except json.JSONDecodeError as e:
        print(f"JSON decoding error: {e}")
        sys.exit(1)
    except Exception as e:
        print(f"An error occurred during conversion: {e}")
        sys.exit(1)


# Main function
async def main():
    print("Starting the data processing pipeline...")
    print("This program will execute two phases in sequence:")
    print("1. Generate raw US recipient data.")
    print("2. Predict structured data and convert it to SFT format.")
    print("-" * 50)

    # Phase 1: Generate data
    success = await produce_data_phase()

    if success:
        # Phase 2: Convert data
        await convert_data_phase()

        print("\n" + "=" * 50)
        print("All processes completed successfully!")
        print("Generated files:")
        print("- us_recipient_data.json: Raw, unstructured data list.")
        print("- us_recipient_sft_data.json: SFT-formatted training data.")
        print("=" * 50)
    else:
        print("Data generation phase failed. Terminating.")


if __name__ == '__main__':
    # Set event loop policy for Windows if needed
    if platform.system() == 'Windows':
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())

    # Run the main coroutine
    asyncio.run(main(), debug=False)

Fine-tune the model

  1. In the left-side navigation pane, click Model Gallery, search for and find the Qwen3-0.6B card, and then click Train.


  2. Configure the training job parameters. You only need to set the following key parameters; keep the defaults for the rest.

    • Training method: SFT (supervised fine-tuning) with the LoRA method is selected by default.

      LoRA is an efficient fine-tuning technique that updates only a small subset of the model's parameters, reducing the resources required for training.
    • Training dataset: first click to download the sample training set train.json. On the configuration page, choose OSS File or Directory, click the icon to select a Bucket, click Upload File to upload the downloaded training set to OSS, and then select that file.


    • Validation dataset: first click to download the validation set eval.json, then click Add Validation Dataset and upload and select the file in the same way as for the training set.

      The validation set is used during training to gauge model performance, helping you evaluate how the model behaves on unseen data.
    • Model output path: by default, the fine-tuned model is saved to OSS. If the OSS directory is empty, create a directory and specify it.

    • Resource group type: select the public resource group. This fine-tuning run requires about 5 GB of GPU memory; the console has already filtered the list to instance types that meet this requirement. Select one such as ecs.gn7i-c16g1.4xlarge.

    • Hyperparameters

      • learning_rate: set to 0.0005

      • num_train_epochs: set to 4

      • per_device_train_batch_size: set to 8

      • seq_length: set to 512

      Then click Train > OK. The training job enters the Creating state; when it is Running, fine-tuning has started.

  3. View the training job and wait for it to finish. Fine-tuning takes about 10 minutes. During fine-tuning, the job details page shows the job logs and metric curves. After training completes, the fine-tuned model is saved to the configured OSS directory.

    To view the training job details later, click Model Gallery > Task Management > Training Jobs in the left-side navigation pane, and then click the job name.


    (Optional) Adjust hyperparameters based on the loss curves to improve model quality

    On the job details page you can see the train_loss curve (training-set loss) and the eval_loss curve (validation-set loss):


    You can use the trend of the loss values to make an initial judgment about the training result:

    • train_loss and eval_loss are both still falling when training ends (underfitting)

      You can increase num_train_epochs (the number of training epochs, positively correlated with training depth) or moderately increase lora_rank (the rank of the low-rank matrices; a higher rank lets the model express more complex tasks but makes overtraining more likely) and then train again, to make the model fit the training data more closely;

    • train_loss keeps falling while eval_loss starts to rise before training ends (overfitting)

      You can decrease num_train_epochs or moderately decrease lora_rank and then train again, to prevent the model from overtraining;

    • train_loss and eval_loss are both stable before training ends (good fit)

      When the model is in this state, you can proceed to the next steps.
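The three cases above amount to a simple decision rule; a minimal sketch (the trend labels are hypothetical inputs that you would read off the curves yourself):

```python
def diagnose_fit(train_trend: str, eval_trend: str) -> str:
    """Map loss-curve trends ('down', 'up', 'flat') to the fit states described above."""
    if train_trend == "down" and eval_trend == "down":
        return "underfitting: increase num_train_epochs or lora_rank"
    if train_trend == "down" and eval_trend == "up":
        return "overfitting: decrease num_train_epochs or lora_rank"
    if train_trend == "flat" and eval_trend == "flat":
        return "good fit: proceed to the next steps"
    return "inconclusive: inspect the curves manually"

print(diagnose_fit("down", "up"))  # overfitting: decrease num_train_epochs or lora_rank
```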

Deploy the fine-tuned model

On the training job details page, click Deploy to open the deployment configuration page. For the resource type, select public resources. Deploying the 0.6B model requires about 5 GB of GPU memory; the resource specification list is already filtered to instance types that meet this requirement. Select one such as ecs.gn7i-c8g1.2xlarge, keep the defaults for the other parameters, and then click Deploy > OK.

Deployment takes about 5 minutes. When the service is in the Running state, the deployment has succeeded.

To view the training job details later, click Model Gallery > Task Management > Training Jobs in the left-side navigation pane, and then click the job name.


After the training job shows Succeeded, if the Deploy button cannot be clicked, the output model is still being registered; wait about 1 minute.


The subsequent invocation steps are the same as in Call the model.

Validate the fine-tuned model

Before deploying the fine-tuned model to a production environment, we recommend systematically evaluating it to make sure it is stable and accurate, and to avoid unexpected problems after launch.

Prepare test data

Prepare test data that does not overlap with the training data to test the model. This solution provides a ready-made test set, which is downloaded automatically when you run the accuracy test code below.

Because the test data does not overlap with the training samples, it more accurately reflects the model's generalization on new data and avoids inflated scores caused by samples the model has already seen.
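One way to enforce this separation, sketched here under the assumption that both splits use the instruction/output format from the data-preparation section (the tiny in-memory splits are hypothetical):

```python
def find_leaks(train_samples, test_samples):
    """Return instructions that appear in both the training and test splits."""
    train_instructions = {s["instruction"] for s in train_samples}
    return {s["instruction"] for s in test_samples
            if s["instruction"] in train_instructions}

# Tiny hypothetical splits for illustration.
train = [{"instruction": "Extract: A", "output": "{}"},
         {"instruction": "Extract: B", "output": "{}"}]
test = [{"instruction": "Extract: B", "output": "{}"},
        {"instruction": "Extract: C", "output": "{}"}]

leaks = find_leaks(train, test)
print(f"overlapping samples: {len(leaks)}")  # overlapping samples: 1
```

Any overlapping samples should be removed from the test set before measuring accuracy.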

Design evaluation metrics

Evaluation criteria should closely track your actual operational goals. In this solution, besides checking whether the generated JSON string is valid, you should also check whether the corresponding keys and values are correct.

You need to define evaluation metrics programmatically. For this example's implementation, see the compare_address_info method in the accuracy test code below.

Verify the fine-tuning results

Run the following test code to print the model's accuracy on the test set.

Sample code for testing model accuracy

Note: Replace the token and service URL with the real invocation information you obtained above.

# pip3 install openai
from openai import AsyncOpenAI
import requests
import json
import asyncio
import os

# If the 'Token' environment variable is not set, replace the following line with
# your token from the EAS service: token = 'YTA1NTEzMzY3ZTY4Z******************'
token = os.environ.get("Token")
if token is None:
    print("Please set the 'Token' environment variable, or assign your token directly to the 'token' variable.")
    exit()

# Do not remove the "/v1" suffix after the service URL.
client = AsyncOpenAI(
    api_key=token,
    base_url='<Your_Service_URL>/v1',
)

system_prompt = """You are a professional information extraction assistant specializing in parsing US shipping addresses from unstructured text.

## Task Description
Based on the given input text, accurately extract and generate a JSON object containing the following six fields:
- name: The full name of the recipient.
- street_address: The complete street address, including number, street name, and any apartment or suite number.
- city: The city name.
- state: The full state name (e.g., "California", not "CA").
- zip_code: The 5 or 9-digit ZIP code.
- phone: The complete contact phone number.

## Extraction Rules
1.  **Address Handling**:
    -   Accurately identify the components: street, city, state, and ZIP code.
    -   The `state` field must be the full official name (e.g., "New York", not "NY").
    -   The `street_address` should contain all details before the city, such as "123 Apple Lane, Apt 4B".
2.  **Name Identification**:
    -   Extract the full recipient name.
3.  **Phone Number Handling**:
    -   Extract the complete phone number, preserving its original format.
4.  **ZIP Code**:
    -   Extract the 5-digit or 9-digit (ZIP+4) code.

## Output Format
Strictly adhere to the following JSON format. Do not add any explanatory text or markdown.
{
  "name": "Recipient's Full Name",
  "street_address": "Complete Street Address",
  "city": "City Name",
  "state": "Full State Name",
  "zip_code": "ZIP Code",
  "phone": "Contact Phone Number"
}
"""


def compare_address_info(actual_address_str, predicted_address_str):
    """Compares two JSON strings representing address information to see if they are identical."""
    try:
        # Parse the actual address information
        if actual_address_str:
            actual_address_json = json.loads(actual_address_str)
        else:
            actual_address_json = {}

        # Parse the predicted address information
        if predicted_address_str:
            predicted_address_json = json.loads(predicted_address_str)
        else:
            predicted_address_json = {}

        # Directly compare if the two JSON objects are identical
        is_same = actual_address_json == predicted_address_json

        return {
            "is_same": is_same,
            "actual_address_parsed": actual_address_json,
            "predicted_address_parsed": predicted_address_json,
            "comparison_error": None
        }

    except json.JSONDecodeError as e:
        return {
            "is_same": False,
            "actual_address_parsed": None,
            "predicted_address_parsed": None,
            "comparison_error": f"JSON parsing error: {str(e)}"
        }
    except Exception as e:
        return {
            "is_same": False,
            "actual_address_parsed": None,
            "predicted_address_parsed": None,
            "comparison_error": f"Comparison error: {str(e)}"
        }


async def predict_single_conversation(conversation_data):
    """Predicts the label for a single conversation."""
    try:
        # Extract user content (excluding assistant message)
        messages = conversation_data.get("messages", [])
        user_content = None

        for message in messages:
            if message.get("role") == "user":
                user_content = message.get("content", "")
                break

        if not user_content:
            return {"error": "User message not found"}

        response = await client.chat.completions.create(
            model="Qwen3-0.6B",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_content}
            ],
            response_format={"type": "json_object"},
            extra_body={
                "enable_thinking": False
            }
        )

        predicted_labels = response.choices[0].message.content.strip()
        return {"prediction": predicted_labels}

    except Exception as e:
        return {"error": f"Prediction failed: {str(e)}"}


async def process_batch(batch_data, batch_id):
    """Processes a batch of data."""
    print(f"Processing batch {batch_id}, containing {len(batch_data)} items...")

    tasks = []
    for i, conversation in enumerate(batch_data):
        task = predict_single_conversation(conversation)
        tasks.append(task)

    results = await asyncio.gather(*tasks, return_exceptions=True)

    batch_results = []
    for i, result in enumerate(results):
        if isinstance(result, Exception):
            batch_results.append({"error": f"Exception: {str(result)}"})
        else:
            batch_results.append(result)

    return batch_results


async def main():
    output_file = "predicted_labels.jsonl"
    batch_size = 20  # Number of items to process per batch

    # Read test data
    url = 'https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20251015/yghxco/test.jsonl'
    conversations = []

    try:
        response = requests.get(url)
        response.raise_for_status()  # Check if the request was successful
        for line_num, line in enumerate(response.text.splitlines(), 1):
            try:
                data = json.loads(line.strip())
                conversations.append(data)
            except json.JSONDecodeError as e:
                print(f"JSON parsing error on line {line_num}: {e}")
                continue
    except requests.exceptions.RequestException as e:
        print(f"Request error: {e}")
        return

    print(f"Successfully read {len(conversations)} conversation data items")

    # Process in batches
    all_results = []
    total_batches = (len(conversations) + batch_size - 1) // batch_size

    for batch_id in range(total_batches):
        start_idx = batch_id * batch_size
        end_idx = min((batch_id + 1) * batch_size, len(conversations))
        batch_data = conversations[start_idx:end_idx]

        batch_results = await process_batch(batch_data, batch_id + 1)
        all_results.extend(batch_results)

        print(f"Batch {batch_id + 1}/{total_batches} completed")

        # Add a small delay to avoid making requests too quickly
        if batch_id < total_batches - 1:
            await asyncio.sleep(1)

    # Save results
    same_count = 0
    different_count = 0
    error_count = 0

    with open(output_file, 'w', encoding='utf-8') as f:
        for i, (original_data, prediction_result) in enumerate(zip(conversations, all_results)):
            result_entry = {
                "index": i,
                "original_user_content": None,
                "actual_address": None,
                "predicted_address": None,
                "prediction_error": None,
                "address_comparison": None
            }

            # Extract original user content
            messages = original_data.get("messages", [])
            for message in messages:
                if message.get("role") == "user":
                    result_entry["original_user_content"] = message.get("content", "")
                    break

            # Extract actual address information (if assistant message exists)
            for message in messages:
                if message.get("role") == "assistant":
                    result_entry["actual_address"] = message.get("content", "")
                    break

            # Save prediction result
            if "error" in prediction_result:
                result_entry["prediction_error"] = prediction_result["error"]
                error_count += 1
            else:
                result_entry["predicted_address"] = prediction_result.get("prediction", "")

                # Compare address information
                comparison_result = compare_address_info(
                    result_entry["actual_address"],
                    result_entry["predicted_address"]
                )
                result_entry["address_comparison"] = comparison_result

                # Tally comparison results
                if comparison_result["comparison_error"]:
                    error_count += 1
                elif comparison_result["is_same"]:
                    same_count += 1
                else:
                    different_count += 1

            f.write(json.dumps(result_entry, ensure_ascii=False) + '\n')

    print(f"All predictions complete! Results have been saved to {output_file}")

    # Statistics
    success_count = sum(1 for result in all_results if "error" not in result)
    prediction_error_count = len(all_results) - success_count
    print(f"Number of samples: {success_count}")
    print(f"Correct responses: {same_count}")
    print(f"Incorrect responses: {different_count}")
    print(f"Accuracy: {same_count * 100 / success_count} %")


if __name__ == "__main__":
    asyncio.run(main())

Output:

All predictions complete! Results have been saved to predicted_labels.jsonl
Number of samples: 400
Correct responses: 382
Incorrect responses: 18
Accuracy: 95.5 %

Because of the random seed used during fine-tuning and the inherent randomness of large-model output, the accuracy you measure may differ from this result. This is normal.

As you can see, accuracy reaches 95.5%, a large improvement over the original Qwen3-0.6B model's 50%, which shows that fine-tuning significantly strengthened the model's ability to extract structured information in the logistics form-filling domain.

To shorten your training time while learning, this topic uses only 4 training epochs, and accuracy already rises to 95.5%. You can increase the number of epochs to improve accuracy further.

Important

This topic creates model services on public resources, billed on a pay-as-you-go basis. When you no longer need a service, stop or delete it to avoid continued charges.


Related topics

  • For more Model Gallery features, such as evaluation and compression, see Model Gallery.

  • For more EAS features, such as auto scaling, stress testing, and monitoring and alerting, see EAS overview.