Platform For AI: columns to vector

Last Updated: Mar 08, 2024

The columns to vector component converts multiple numeric columns to a single column of vector data.

Limits

The supported compute engines are MaxCompute and Realtime Compute for Apache Flink.

Introduction

For each row, the columns to vector component combines the values of the selected numeric columns into one vector and writes that vector to a new column.
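The conversion can be pictured with a minimal pure-Python sketch. It is only an illustration of the idea, not the component's implementation: the helper name `columns_to_vector` is hypothetical, and rendering the vector as space-separated numbers is an assumption about the output format.

```python
from typing import Dict, List


def columns_to_vector(row: Dict[str, object], selected_cols: List[str]) -> str:
    """Join the selected numeric columns of one row into a vector string."""
    # Cast every selected value to float, preserving the order of selected_cols.
    values = [float(row[col]) for col in selected_cols]
    # Assumption: a dense vector is rendered as space-separated numbers.
    return " ".join(str(v) for v in values)


# Hypothetical sample row with one label column and two numeric columns.
sample = {"row": "a", "f0": 1, "f1": 2.5}
print(columns_to_vector(sample, ["f0", "f1"]))
```

The order of the names passed as the selected columns determines the order of the elements in the resulting vector in this sketch.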

Configure the component in Machine Learning Designer

Input ports

| Input port (from left to right) | Data type | Recommended upstream component | Required |
| --- | --- | --- | --- |
| data | Integer | Read Table, Read CSV File | Yes |

Component parameters

| Tab | Parameter | Description |
| --- | --- | --- |
| Field Setting | reservedCols | The names of the columns that you want to retain in the output. By default, all columns are retained. |
| Field Setting | selectedCols | The names of the numeric columns whose data you want to convert to vectors. |
| Parameter Setting | vectorCol | The name of the generated column that contains the vector data. |
| Parameter Setting | handleInvalid | The policy that is used to handle exceptions. Default value: ERROR. Valid values: ERROR (throws an exception) and SKIP (skips the exception and returns NULL). |
| Parameter Setting | vectorSize | The number of elements in a vector. Default value: -1. |
| Execution Tuning | Number of Workers | The number of workers. This parameter must be used together with the Memory per worker, unit MB parameter. The value must be a positive integer in the range [1, 9999]. |
| Execution Tuning | Memory per worker, unit MB | The memory size of each worker. Valid values: 1024 to 65536. Unit: MB. |
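The two handleInvalid policies can be sketched in plain Python as follows. This is a hedged illustration rather than the component's code: the function name `to_vector` is hypothetical, and treating missing, non-numeric, and NaN values as the invalid cases is an assumption made for the example.

```python
import math
from typing import List, Optional, Sequence


def to_vector(values: Sequence[object], handle_invalid: str = "ERROR") -> Optional[List[float]]:
    """Build a vector from one row's values under an ERROR or SKIP policy."""
    if handle_invalid not in ("ERROR", "SKIP"):
        raise ValueError(f"unknown policy: {handle_invalid!r}")
    vector = []
    for value in values:
        try:
            number = float(value)
        except (TypeError, ValueError):
            # Assumption: None and non-numeric values count as invalid.
            number = math.nan
        if math.isnan(number):
            if handle_invalid == "ERROR":
                # ERROR: stop and report the problem.
                raise ValueError(f"invalid value: {value!r}")
            # SKIP: give up on this row and return NULL (None in Python).
            return None
        vector.append(number)
    return vector


print(to_vector([1, 2.5]))           # valid row
print(to_vector([1, None], "SKIP"))  # invalid row under SKIP
```

With the default ERROR policy, the second call would raise an exception instead of returning None, which is the behavior to expect when invalid rows should stop the job rather than pass through as NULL.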

Output ports

| Output port (from left to right) | Storage location | Recommended downstream component | Model type |
| --- | --- | --- | --- |
| Output result | N/A | None | None |

Example

You can copy the following code to the code editor of the PyAlink Script component. The PyAlink Script component then functions like the columns to vector component.

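from pyalink.alink import *

def main(sources, sinks, parameter):
    # The first input port provides the source table.
    data = sources[0]
    # Merge columns f0 and f1 into the vector column "vec" and keep the "row" column.
    op = ColumnsToVectorBatchOp()\
        .setSelectedCols(["f0", "f1"])\
        .setReservedCols(["row"])\
        .setVectorCol("vec")
    # Link the operator to the input exactly once.
    result = op.linkFrom(data)
    # Write the result to the first output port and run the job.
    result.link(sinks[0])
    BatchOperator.execute()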