The Semantic Vector Distance (Double Table) component takes two input tables: the left input port receives the query table, and the right input port receives the dictionary table. The output table contains the top N distances between the data in the query table and the data in the dictionary table, along with the rank order of those distances.
- The component calculates the distance for every pair in the Cartesian product of the two input tables and then sorts the distances. Because the number of pairs grows with the product of the two table sizes, we recommend that you use no more than tens of millions of samples.
- By default, only a small amount of resources is specified on the Tuning tab. If an out-of-memory (OOM) error occurs, increase the values of the Cores and Memory Size per Core parameters.
- If you use the Cosine distance calculation method for data of the DOUBLE type, the output table may contain negative numbers. This is expected.
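The pairwise calculation described in the notes above can be sketched in plain Python. This is a minimal illustration, not the component's implementation. The function names (`euclidean`, `cosine_similarity`, `top_n`) and the in-memory dictionaries are hypothetical. The sketch also assumes that the Cosine method reports cosine similarity, whose range of -1 to 1 would explain negative output values, and that Euclidean results rank smallest first while Cosine results rank largest first. The source does not confirm either assumption.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine similarity in [-1, 1]; negative when the vectors point apart."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector is scored as 0
    return dot / (norm_a * norm_b)


def top_n(query, dictionary, n, method="euclidean"):
    """For each query ID, return the n best-ranked (dictionary ID, score) pairs.

    Every query vector is compared with every dictionary vector, so the work
    is proportional to len(query) * len(dictionary).
    """
    results = {}
    for qid, qvec in query.items():
        if method == "euclidean":
            scored = ((euclidean(qvec, dvec), did) for did, dvec in dictionary.items())
            best = heapq.nsmallest(n, scored)  # assumption: smaller distance ranks first
        elif method == "cosine":
            scored = ((cosine_similarity(qvec, dvec), did) for did, dvec in dictionary.items())
            best = heapq.nlargest(n, scored)  # assumption: larger similarity ranks first
        else:
            raise ValueError(f"unsupported method: {method}")
        results[qid] = [(did, score) for score, did in best]
    return results


# Hypothetical sample data: one query vector and three dictionary vectors.
query = {"q1": [1.0, 0.0]}
dictionary = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [-1.0, 0.0]}

euclid_result = top_n(query, dictionary, 2, "euclidean")
cosine_result = top_n(query, dictionary, 3, "cosine")
```

In this sample, the dictionary vector `d3` points in the opposite direction to `q1`, so its cosine score is negative. This mirrors the note that negative numbers in the output are not an error.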
Configure the component
| Tab | Parameter | Description |
| --- | --- | --- |
| Fields Setting | Vector Column | The column that stores the vector values. Each vector must be written to a single field, with the values separated by spaces. |
| | ID Column | The column that uniquely identifies each row. |
| Parameters Setting | Distance Calculation Method | Valid values: Euclidean and Cosine. |
| | Number of Highest Similarity Scores | The number of top results to return. The value must be a positive integer. |
| Tuning | Cores | The number of CPU cores to use for the computation. Default value: 3. If an OOM error occurs during computing, increase the values of the Cores and Memory Size per Core parameters. |
| | Memory Size per Core | The memory size of each CPU core. Default value: 2046. Unit: MB. If an OOM error occurs during computing, increase the values of the Cores and Memory Size per Core parameters. |
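The Vector Column format, in which one field holds all values separated by spaces, can be illustrated with a short parsing sketch. The helper names (`parse_vector`, `load_rows`) and the column names `id` and `vector` are hypothetical and chosen only for this example. The component's own parsing rules may differ.

```python
def parse_vector(field):
    """Split a space-separated vector field into a list of floats."""
    values = [float(token) for token in field.split()]
    if not values:
        raise ValueError("vector field is empty")
    return values


def load_rows(rows, id_col="id", vector_col="vector"):
    """Build an {id: vector} mapping and check that all vectors share one dimension."""
    table = {}
    dimension = None
    for row in rows:
        vector = parse_vector(row[vector_col])
        if dimension is None:
            dimension = len(vector)
        elif len(vector) != dimension:
            raise ValueError(f"row {row[id_col]!r} has {len(vector)} values, expected {dimension}")
        table[row[id_col]] = vector
    return table


# Hypothetical rows: each vector is stored in a single field.
rows = [
    {"id": "a", "vector": "0.5 -1 2e-1"},
    {"id": "b", "vector": "1 2 3"},
]
table = load_rows(rows)
```

Checking the dimension up front is a design choice of this sketch: a distance between vectors of different lengths is undefined, so the mismatch is reported before any pairwise work starts.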