This topic describes the Table to KV component provided by Machine Learning Studio and how to use it to convert a regular table into a table in the key-value (KV) format.

Limits

  • Null values in the input table are omitted from the output table. You can specify columns to retain in the output table. The retained columns keep their original formats.
  • If the input includes a key_map table, only the columns whose keys exist in both the key_map table and the key-value table are converted.
  • If the input includes a key_map table whose data types differ from those of the input table, the key_map table in the output uses the data types that you specified.
  • The columns that you want to convert into the key-value format in the input table must be of the BIGINT or DOUBLE data type.
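The limits above can be illustrated with a minimal Python sketch. This is not the component's implementation: the function name table_to_kv, the dictionary-based row format, and the kv output column name (borrowed from the example output later in this topic) are assumptions made for illustration only.

```python
from typing import Any, Dict, List


def table_to_kv(
    rows: List[Dict[str, Any]],
    selected_cols: List[str],
    append_cols: List[str],
    kv_delimiter: str = ":",
    item_delimiter: str = ",",
) -> List[Dict[str, Any]]:
    """Hypothetical helper: convert the selected columns of each row into one KV string."""
    converted_rows = []
    for row in rows:
        pairs = []
        for col in selected_cols:
            value = row.get(col)
            if value is None:
                # Null values are omitted from the output.
                continue
            # Only BIGINT-like (int) and DOUBLE-like (float) values are accepted.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(f"column {col!r} must hold BIGINT or DOUBLE values")
            pairs.append(f"{col}{kv_delimiter}{value}")
        # Reserved columns are copied unchanged.
        converted = {col: row[col] for col in append_cols}
        converted["kv"] = item_delimiter.join(pairs)
        converted_rows.append(converted)
    return converted_rows


rows = [
    {"rowid": 0, "col0": 1, "col1": 1.1, "col2": 2},
    {"rowid": 2, "col0": 1, "col1": 2.3, "col2": None},
]
result = table_to_kv(rows, ["col0", "col1", "col2"], ["rowid"])
for converted in result:
    print(converted)
```

In this sketch, the second row has a null col2, so its KV string contains only the col0 and col1 pairs, while rowid is carried over as-is.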

Table to KV

You can configure the component by using one of the following methods:
  • Machine Learning Platform for AI (PAI) console
    Tab | Parameter | Description
    Fields Setting | Columns to Convert | The names of the columns that need to be converted.
    Fields Setting | Reserved Columns | The names of the columns that do not need to be converted.
    Fields Setting | KV Delimiter | The delimiter used between keys and values. Colons (:) are used by default.
    Fields Setting | KV Pair Delimiter | The delimiter used between key-value pairs. Commas (,) are used by default.
    Parameters Setting | Convert Columns to IDs | Specifies whether to convert the columns into IDs. Valid values: Yes and No.
    Tuning | Cores | The number of cores. The system automatically allocates the cores used for training based on the volume of input data.
    Tuning | Memory Size | The memory size of each core. Unit: MB. The system automatically allocates the memory size based on the volume of input data.
  • PAI command
    PAI -name TableToKV
        -project algo_public
        -DinputTableName=maple_tabletokv_basic_input
        -DoutputTableName=maple_tabletokv_basic_output
        -DselectedColNames=col0,col1,col2
        -DappendColNames=rowid;
    Parameter | Required | Description | Default value
    inputTableName | Yes | The name of the input table. | No default value
    inputTablePartitions | No | The partitions selected from the input table for training. Specify this parameter in the Partition_name=value format, or in the name1=value1/name2=value2 format for multi-level partitions. If you specify multiple partitions, separate them with commas (,). | All partitions
    selectedColNames | No | The names of the selected columns for conversion. The data types of the selected columns must be BIGINT or DOUBLE. | Full table
    appendColNames | No | The names of the columns that you want to retain. The specified columns are retained in their original formats. | No default value
    outputTableName | Yes | The name of the output table after conversion. | No default value
    kvDelimiter | No | The delimiter used between keys and values. | :
    itemDelimiter | No | The delimiter used between key-value pairs. | ,
    convertColToIndexId | No | Specifies whether to convert columns into IDs. Valid values: 1 (Yes) and 0 (No). | 0
    inputKeyMapTableName | No | The name of the input index table. This parameter is valid only when convertColToIndexId is set to 1. If this parameter is not specified, the system automatically calculates a set of IDs. | ""
    outputKeyMapTableName | Determined by convertColToIndexId | The name of the output index table. This parameter is required only when convertColToIndexId is set to 1. | No default value
    lifecycle | No | The lifecycle of the output table. The value of this parameter must be a positive integer. | No default value
    coreNum | No | The number of cores. The value of this parameter must be a positive integer. Valid values: [1,9999]. This parameter is used with memSizePerCore. | Automatically allocated
    memSizePerCore | No | The memory size of each core. Unit: MB. The value of this parameter must be a positive integer. Valid values: [1024,64 × 1024]. | Automatically allocated
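The constraints in the parameter descriptions can be checked before a command is submitted. The following Python sketch is a hypothetical client-side helper, not part of PAI: the function name validate_params and its return format are assumptions, numeric values are assumed to be passed as integers, and PAI may apply its own validation differently.

```python
from typing import Any, Dict, List


def validate_params(params: Dict[str, Any]) -> List[str]:
    """Hypothetical pre-flight check that returns a list of problems found."""
    problems = []

    # inputTableName and outputTableName are the two always-required parameters.
    for name in ("inputTableName", "outputTableName"):
        if not params.get(name):
            problems.append(f"{name} is required")

    convert = params.get("convertColToIndexId", 0)
    if convert not in (0, 1):
        problems.append("convertColToIndexId must be 0 or 1")
    elif convert == 1 and not params.get("outputKeyMapTableName"):
        problems.append("outputKeyMapTableName is required when convertColToIndexId is 1")
    elif convert == 0 and params.get("inputKeyMapTableName"):
        problems.append("inputKeyMapTableName is valid only when convertColToIndexId is 1")

    # Documented ranges: coreNum in [1,9999], memSizePerCore in [1024,64 x 1024] MB.
    ranges = {"coreNum": (1, 9999), "memSizePerCore": (1024, 64 * 1024)}
    for name, (low, high) in ranges.items():
        if name in params and not low <= params[name] <= high:
            problems.append(f"{name} must be in [{low},{high}]")

    if "lifecycle" in params and params["lifecycle"] < 1:
        problems.append("lifecycle must be a positive integer")

    return problems


example = {
    "inputTableName": "maple_tabletokv_basic_input",
    "outputTableName": "maple_tabletokv_basic_output",
    "convertColToIndexId": 1,
    "coreNum": 0,
}
for problem in validate_params(example):
    print(problem)
```

For the example dictionary, the sketch reports the missing outputKeyMapTableName and the out-of-range coreNum.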

Example 1

  • Data generation
    rowid | kv
    0 | col0:1,col1:1.1,col2:2
    1 | col0:0,col1:1.2,col2:3
    2 | col0:1,col1:2.3
    3 | col0:1,col1:0.0,col2:4
  • PAI command
    PAI -name TableToKV
        -project algo_public
        -DinputTableName=maple_tabletokv_basic_input
        -DoutputTableName=maple_tabletokv_basic_output
        -DselectedColNames=col0,col1,col2
        -DappendColNames=rowid;
  • Output
    Output table maple_tabletokv_basic_output
    rowid:bigint | kv:string
    0 | 1:1.1,2:2
    1 | 1:1.2,2:3
    2 | 1:2.3
    3 | 1:0.0,2:4

Example 2

  • PAI command
    PAI -name TableToKV
        -project projectxlib4
        -DinputTableName=maple_tabletokv_basic_input
        -DoutputTableName=maple_tabletokv_basic_output
        -DselectedColNames=col0,col1,col2
        -DappendColNames=rowid
        -DconvertColToIndexId=1
        -DinputKeyMapTableName=maple_test_tabletokv_basic_map_input
        -DoutputKeyMapTableName=maple_test_tabletokv_basic_map_output;
  • Output
    Output table maple_test_tabletokv_basic_map_output
    col_name:string | col_index:string | col_datatype:string
    col1 | 1 | bigint
    col2 | 2 | double
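The effect of a key_map table on the conversion can be sketched in Python as follows. This is an illustration under stated assumptions, not the component's implementation: the function name table_to_kv_with_key_map is hypothetical, the input rows are reconstructed from the kv listing in Example 1, and the column-to-ID mapping is taken from the key_map output above. The sketch also assumes that a column is converted only if its name appears in both the selected columns and the key_map.

```python
from typing import Any, Dict, List


def table_to_kv_with_key_map(
    rows: List[Dict[str, Any]],
    selected_cols: List[str],
    append_cols: List[str],
    key_map: Dict[str, int],
    kv_delimiter: str = ":",
    item_delimiter: str = ",",
) -> List[Dict[str, Any]]:
    """Hypothetical helper: emit KV strings whose keys are IDs from the key_map."""
    # Columns that are absent from the key_map are not converted.
    convertible = [col for col in selected_cols if col in key_map]
    converted_rows = []
    for row in rows:
        pairs = [
            f"{key_map[col]}{kv_delimiter}{row[col]}"
            for col in convertible
            if row.get(col) is not None  # null values are omitted
        ]
        converted = {col: row[col] for col in append_cols}
        converted["kv"] = item_delimiter.join(pairs)
        converted_rows.append(converted)
    return converted_rows


# Assumed input rows, reconstructed from the kv listing in Example 1.
rows = [
    {"rowid": 0, "col0": 1, "col1": 1.1, "col2": 2},
    {"rowid": 1, "col0": 0, "col1": 1.2, "col2": 3},
    {"rowid": 2, "col0": 1, "col1": 2.3, "col2": None},
    {"rowid": 3, "col0": 1, "col1": 0.0, "col2": 4},
]
# Column-to-ID mapping taken from the key_map output table.
key_map = {"col1": 1, "col2": 2}

result = table_to_kv_with_key_map(rows, ["col0", "col1", "col2"], ["rowid"], key_map)
for converted in result:
    print(converted["rowid"], converted["kv"])
```

Because col0 has no entry in the key_map, it is dropped from every KV string, and the remaining keys appear as the IDs 1 and 2.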