
Platform for AI: Text summarization training

Last Updated: Mar 17, 2026

Train Transformer models to generate summaries and headlines from source documents.

Limitations

This component runs only on Deep Learning Containers (DLC) computing resources.

Model architecture

The component uses a standard Transformer encoder-decoder architecture. During training, the model takes the source text as input and learns to generate the corresponding summary.

Prerequisites

Connect a Sentence Split component upstream to split text into one sentence per line.
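
To illustrate what "one sentence per line" means, here is a minimal Python sketch. It is not the Sentence Split component's actual algorithm: the punctuation set and the function names `split_sentences` and `to_lines` are assumptions made for this example.

```python
import re

# Assumed sentence-ending punctuation for Chinese and English text.
# A sentence is a run of non-terminator characters followed by any terminators.
_SENTENCE = re.compile(r"[^。！？!?.]+[。！？!?.]*")


def split_sentences(text):
    """Split text into sentences on common Chinese/English end punctuation."""
    sentences = []
    for match in _SENTENCE.findall(text):
        # Collapse internal whitespace (including newlines) to single spaces.
        cleaned = " ".join(match.split())
        if cleaned:
            sentences.append(cleaned)
    return sentences


def to_lines(text):
    """Return the text with one sentence per line."""
    return "\n".join(split_sentences(text))


if __name__ == "__main__":
    print(to_lines("今天天气很好。我们去公园吧！"))
    print(to_lines("It rained all day. We stayed home."))
```

This naive rule also splits on periods inside decimals and abbreviations (for example "3.5"), so use it only to understand the expected layout, not as a replacement for the upstream component.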

Parameters

Configure component parameters in Designer.

  • Input ports

    | Port (left to right) | Data type | Upstream component | Required |
    | --- | --- | --- | --- |
    | Training data | OSS | Read OSS Data | Yes |
    | Validation data | OSS | Read OSS Data | Yes |

  • Component parameters

    | Tab | Parameter | Description |
    | --- | --- | --- |
    | Field Settings | Input data format | Column names in the input file. Default: target:str:1,source:str:1. |
    | Field Settings | Source column | Name of the source text column. Default: source. |
    | Field Settings | Summary Column Selection | Name of the summary text column. Default: target. |
    | Field Settings | Model save path | OSS directory where the trained model is saved. |
    | Parameter Settings | Pre-trained model | Pre-trained model to start training from. Default: alibaba-pai/mt5-title-generation-zh. |
    | Parameter Settings | Batch size | Number of samples processed per training batch. INT type. Default: 16. On multi-GPU servers, this value is the batch size per GPU. |
    | Parameter Settings | Maximum text length | Maximum sequence length. INT type. Range: 1-512. Default: 512. |
    | Parameter Settings | Number of epochs | Number of training epochs. INT type. Default: 3. |
    | Parameter Settings | Learning rate | Learning rate used for training. FLOAT type. Default: 3e-5. |
    | Parameter Settings | Steps to Save a Model File | Number of training steps between model evaluations and saves. Default: 150. |
    | Parameter Settings | Language | Training language: zh (Chinese) or en (English). |
    | Parameter Settings | Copy text from source | Whether to copy text from the source to the output: false (default) does not copy; true copies text. |
    | Parameter Settings | Minimum decoder length | Minimum output length. INT type. Default: 12. |
    | Parameter Settings | Maximum decoder length | Maximum output length. INT type. Default: 32. |
    | Parameter Settings | Minimum non-repeated n-gram | Size of the n-grams that must not repeat in the output. INT type. Default: 2. For example, a value of 1 prevents any single word from repeating, as in "day day". |
    | Parameter Settings | Beam search size | Size of the search space for candidate outputs. INT type. Default: 5. Larger values slow down prediction. |
    | Parameter Settings | Number of returned candidates | Number of top-scoring candidates to return. INT type. Default: 5. |
    | Execution Tuning | GPU machine type | GPU instance type used for training. Default: gn5-c8g1.2xlarge. |

  • Output port

    | Output port | Data type | Downstream component | Required |
    | --- | --- | --- | --- |
    | Output model | OSS path specified by the Model save path parameter on the Field Settings tab. The model is saved in SavedModel format. | Text Summarization Predict | No |
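
The Minimum non-repeated n-gram parameter is easier to follow with a small example. The Python sketch below shows one common way such a constraint is enforced during decoding: before each step, it collects the tokens that would complete an n-gram already present in the output. This assumes the parameter behaves like the usual no-repeat n-gram constraint. The component's actual implementation is not described here, and `banned_next_tokens` is a hypothetical helper.

```python
def banned_next_tokens(generated, n):
    """Return the tokens that would create a repeated n-gram if emitted next.

    generated: tokens produced so far (words are used here for readability).
    n: n-gram size that must not repeat (assumed meaning of the parameter).
    """
    if n <= 0 or len(generated) < n - 1:
        return set()
    # The last n-1 tokens are the prefix of the n-gram about to be completed.
    prefix = tuple(generated[len(generated) - (n - 1):]) if n > 1 else ()
    banned = set()
    for i in range(len(generated) - n + 1):
        ngram = tuple(generated[i:i + n])
        # Any earlier n-gram with the same prefix bans its final token.
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


if __name__ == "__main__":
    # n = 1: no token may appear twice, so "day" cannot follow "day".
    print(banned_next_tokens(["day"], 1))
    # n = 2: "day day" is still allowed, but the bigram may not occur twice.
    print(banned_next_tokens(["day"], 2))
    print(banned_next_tokens(["day", "day"], 2))
```

Under this interpretation, the default value of 2 still lets a single word appear more than once, but blocks any two-word sequence from appearing twice.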

Usage example

The following workflow shows how to use the Text Summarization Training component. Configure and run the workflow as follows:

  1. Prepare a training dataset (cn_train.txt) and a validation dataset (cn_dev.txt), then upload them to OSS. This example uses TXT files with tab-separated fields.

    CSV files are also supported. You can use Tunnel commands in the MaxCompute client to upload datasets. For more information, see Connect using the client (odpscmd) and Tunnel commands.

  2. Use the Read OSS Data-1 and Read OSS Data-2 components to read the training and validation datasets. Set OSS Data Path to the location of each dataset.

  3. Connect the training and validation datasets to the Text Summarization Training-1 component. Configure the parameters as described in Parameters.

  4. Click the run icon to run the workflow. After the workflow completes, view the output model. The model is saved to the OSS path specified by the Model save path parameter.
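
For step 1, the following Python sketch shows one way to produce a tab-separated dataset file. It rests on several assumptions that are not confirmed above: each line holds one record, the fields follow the order given in Input data format (target first, then source, with the default value), the file has no header row, and each schema entry has the form name:type:length. The helper names and the sample record are hypothetical.

```python
DEFAULT_SCHEMA = "target:str:1,source:str:1"  # default Input data format


def parse_input_schema(schema):
    """Parse entries assumed to have the form 'name:type:length'."""
    columns = []
    for entry in schema.split(","):
        name, dtype, length = entry.strip().split(":")
        columns.append((name, dtype, int(length)))
    return columns


def format_row(record, schema=DEFAULT_SCHEMA):
    """Build one tab-separated line, with fields in schema order (assumed)."""
    fields = []
    for name, _dtype, _length in parse_input_schema(schema):
        # Tabs or newlines inside a field would break the line layout,
        # so collapse all whitespace runs to single spaces.
        fields.append(" ".join(str(record[name]).split()))
    return "\t".join(fields)


def write_dataset(records, path, schema=DEFAULT_SCHEMA):
    """Write records to a UTF-8 text file, one record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(format_row(record, schema) + "\n")


if __name__ == "__main__":
    # Hypothetical sample record, for illustration only.
    sample = [{"source": "Long article text goes here.", "target": "Short headline"}]
    write_dataset(sample, "cn_train.txt")
```

If your files use a different column order, change the Input data format value to match the files rather than relying on the default.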
