The Table to KV component converts ordinary tabular data into key-value (KV) format strings. Each selected column becomes a key-value pair in the output.
Only columns of the BIGINT or DOUBLE data type can be converted. Null values are excluded from the output. To keep other columns, list them in appendColNames; these columns are passed through in their original format.
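The conversion rule can be illustrated with a short Python sketch. It only mimics the behavior described above and is not the component's implementation; the function name `row_to_kv` and the dict-based row are assumptions made for this example.

```python
from typing import Mapping, Optional, Sequence, Union

Number = Union[int, float]


def row_to_kv(
    row: Mapping[str, Optional[Number]],
    selected_cols: Sequence[str],
    kv_delimiter: str = ":",
    item_delimiter: str = ",",
) -> str:
    """Join the selected columns of one row into a KV string, skipping nulls."""
    pairs = []
    for col in selected_cols:
        value = row[col]
        if value is None:  # null values are excluded from the output
            continue
        pairs.append(f"{col}{kv_delimiter}{value}")
    return item_delimiter.join(pairs)


# Hypothetical row: col1 is null, rowid is not selected for conversion.
row = {"rowid": 7, "col0": 1, "col1": None, "col2": 2.5}
kv = row_to_kv(row, ["col0", "col1", "col2"])
print(kv)
```

Here `col1` is dropped because it is null, and `rowid` does not appear in the KV string because it is not among the selected columns.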
Configure the Table to KV component
Two configuration methods are supported.
Method 1: Configure in Machine Learning Designer
Configure the component parameters on the pipeline page in Machine Learning Designer (formerly Machine Learning Studio) of Machine Learning Platform for AI (PAI).
| Tab | Parameter | Description |
|---|---|---|
| Fields Setting | Columns to Convert | Names of the columns to convert. Columns must be of BIGINT or DOUBLE type. |
| Fields Setting | Reserved Columns | Names of the columns to pass through unchanged in their original format. |
| Fields Setting | KV Delimiter | Delimiter between keys and values. Default: colon (:). |
| Fields Setting | KV Pair Delimiter | Delimiter between key-value pairs. Default: comma (,). |
| Parameters Setting | Convert Columns to IDs | Whether to replace column names with integer IDs in the output. Valid values: Yes, No. |
| Tuning | Cores | Number of cores. The system allocates cores automatically based on the size of the input data. |
| Tuning | Memory Size | Memory per core in MB. The system allocates memory automatically based on the size of the input data. |
Method 2: Configure using PAI commands
Run PAI commands through the SQL Script component. For more information, see SQL Script.
PAI -name TableToKV
-project algo_public
-DinputTableName=maple_tabletokv_basic_input
-DoutputTableName=maple_tabletokv_basic_output
-DselectedColNames=col0,col1,col2
-DappendColNames=rowid;
| Parameter | Required | Description | Default |
|---|---|---|---|
| inputTableName | Yes | Name of the input table. | None |
| outputTableName | Yes | Name of the output table. | None |
| selectedColNames | No | Columns to convert. Must be of BIGINT or DOUBLE type. | All columns |
| appendColNames | No | Columns to pass through unchanged in their original format. | None |
| inputTablePartitions | No | Partitions to read from the input table. Use the Partition_name=value format. For multi-level partitions, use name1=value1/name2=value2. Separate multiple partitions with commas (,). | All partitions |
| kvDelimiter | No | Delimiter between keys and values. | Colon (:) |
| itemDelimiter | No | Delimiter between key-value pairs. | Comma (,) |
| convertColToIndexId | No | Whether to replace column names with integer IDs. Set to 1 to convert, or 0 to keep column names. | 0 |
| inputKeyMapTableName | No | Name of an existing index table to use for column ID mapping. Only used when convertColToIndexId=1. If not specified, the system generates a new set of IDs. | null |
| outputKeyMapTableName | Yes (when convertColToIndexId=1) | Name of the output index table that stores the column-to-ID mapping. | None |
| lifecycle | No | Lifecycle of the output table in days. Must be a positive integer. | None |
| coreNum | No | Number of cores. Valid values: 1–9999. Must be specified together with memSizePerCore. | System-determined |
| memSizePerCore | No | Memory per core in MB. Valid values: 1024–65536. Must be specified together with coreNum. | System-determined |
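The parameter constraints in the table can be checked before a command is submitted. The following Python sketch assembles the command text and enforces three of the documented rules; the helper name `build_table_to_kv_command` is hypothetical, and the sketch only builds a string, it does not call PAI.

```python
from typing import Dict


def build_table_to_kv_command(params: Dict[str, str]) -> str:
    """Assemble a TableToKV PAI command string after a few sanity checks."""
    for name in ("inputTableName", "outputTableName"):
        if name not in params:
            raise ValueError(f"{name} is required")
    # outputKeyMapTableName is required when convertColToIndexId=1.
    if params.get("convertColToIndexId") == "1" and "outputKeyMapTableName" not in params:
        raise ValueError("outputKeyMapTableName is required when convertColToIndexId=1")
    # coreNum and memSizePerCore must be specified together.
    if ("coreNum" in params) != ("memSizePerCore" in params):
        raise ValueError("coreNum and memSizePerCore must be specified together")
    lines = ["PAI -name TableToKV", "-project algo_public"]
    lines += [f"-D{name}={value}" for name, value in params.items()]
    return "\n".join(lines) + ";"


command = build_table_to_kv_command(
    {
        "inputTableName": "maple_tabletokv_basic_input",
        "outputTableName": "maple_tabletokv_basic_output",
        "selectedColNames": "col0,col1,col2",
        "appendColNames": "rowid",
    }
)
print(command)
```

The printed text matches the command shown above and can be pasted into the SQL Script component.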
Limitations
If an input key_map table is provided, only the columns whose names appear in both the key_map table and the input table are converted.
If the data type specified in the input key_map table differs from the actual data type of a column in the input table, the output key_map table uses the data type from the input key_map table.
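The two limitations can be sketched in Python as follows. The in-memory structures (a schema dict and a key map dict) and the function name `resolve_columns` are assumptions made for illustration; the component itself works on tables.

```python
from typing import Dict, List, Tuple


def resolve_columns(
    input_schema: Dict[str, str],
    key_map: Dict[str, Tuple[int, str]],
) -> List[Tuple[str, int, str]]:
    """Return (col_name, col_index, col_datatype) for columns present in both inputs."""
    resolved = []
    for col_name, (col_index, map_type) in key_map.items():
        if col_name in input_schema:
            # The data type recorded in the key map takes precedence
            # over the column's actual type in the input table.
            resolved.append((col_name, col_index, map_type))
    return sorted(resolved, key=lambda item: item[1])


# Hypothetical input table schema and input key map.
input_schema = {"col0": "bigint", "col1": "double", "col3": "bigint"}
key_map = {"col0": (0, "bigint"), "col1": (1, "bigint"), "col2": (2, "bigint")}

output_key_map = resolve_columns(input_schema, key_map)
print(output_key_map)
```

In this sketch `col2` is skipped because it is missing from the input table, `col3` is skipped because it is missing from the key map, and `col1` is reported as bigint because the key map says so.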
Example
Input table
drop table if exists test;
create table test as
select * from
(
select 0 as rowid, 1 as col0, 1.1 as col1, 2 as col2 union all
select 1 as rowid, 0 as col0, 1.2 as col1, 3 as col2 union all
select 2 as rowid, 1 as col0, 2.3 as col1, 4 as col2 union all
select 3 as rowid, 1 as col0, 0.0 as col1, 5 as col2
) tmp;
PAI command
PAI -name TableToKV
-project algo_public
-DinputTableName=test
-DoutputTableName=test_output
-DselectedColNames=col0,col1,col2
-DconvertColToIndexId=1
-DoutputKeyMapTableName=test_key_map
-DappendColNames=rowid;
col0, col1, and col2 are converted into KV format. rowid is passed through unchanged. Column names are replaced with integer IDs because convertColToIndexId=1.
Output table: `test_output`
| rowid | kv |
|---|---|
| 0 | 0:1,1:1.1,2:2 |
| 1 | 0:0,1:1.2,2:3 |
| 2 | 0:1,1:2.3,2:4 |
| 3 | 0:1,1:0,2:5 |
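The rows of `test_output` can be reproduced locally with a small Python sketch. Two details are assumptions inferred from the sample output rather than documented behavior: IDs are assigned in the order of selectedColNames, and DOUBLE values are printed in a compact form (so 0.0 appears as 0), which is approximated here with Python's `g` format.

```python
rows = [
    {"rowid": 0, "col0": 1, "col1": 1.1, "col2": 2},
    {"rowid": 1, "col0": 0, "col1": 1.2, "col2": 3},
    {"rowid": 2, "col0": 1, "col1": 2.3, "col2": 4},
    {"rowid": 3, "col0": 1, "col1": 0.0, "col2": 5},
]
selected = ["col0", "col1", "col2"]

# Assumption: IDs follow the order of the selected columns.
col_to_id = {name: idx for idx, name in enumerate(selected)}


def format_value(value):
    # Assumption: compact rendering of doubles, so 0.0 becomes "0".
    return format(value, "g") if isinstance(value, float) else str(value)


def to_kv(row):
    return ",".join(
        f"{col_to_id[col]}:{format_value(row[col])}"
        for col in selected
        if row[col] is not None  # null values are excluded
    )


output = [(row["rowid"], to_kv(row)) for row in rows]
for rowid, kv in output:
    print(rowid, kv)
```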
Output table: `test_key_map`
| col_name | col_index | col_datatype |
|---|---|---|
| col0 | 0 | bigint |
| col1 | 1 | double |
| col2 | 2 | bigint |
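Downstream code often needs to turn a KV string back into named values. Decoding is not part of the Table to KV component; the sketch below is a hypothetical helper (`decode_kv`) that uses the rows of `test_key_map` to restore column names and data types.

```python
from typing import Dict, Tuple, Union


def decode_kv(
    kv: str,
    index_to_col: Dict[int, Tuple[str, str]],
    kv_delimiter: str = ":",
    item_delimiter: str = ",",
) -> Dict[str, Union[int, float]]:
    """Map a KV string with integer IDs back to {column name: typed value}."""
    result: Dict[str, Union[int, float]] = {}
    if not kv:
        return result
    for item in kv.split(item_delimiter):
        key, value = item.split(kv_delimiter, 1)
        col_name, col_type = index_to_col[int(key)]
        result[col_name] = int(value) if col_type == "bigint" else float(value)
    return result


# Rows of test_key_map: (col_name, col_index, col_datatype).
key_map_rows = [("col0", 0, "bigint"), ("col1", 1, "double"), ("col2", 2, "bigint")]
index_to_col = {idx: (name, dtype) for name, idx, dtype in key_map_rows}

decoded = decode_kv("0:1,1:1.1,2:2", index_to_col)
print(decoded)
```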