Platform For AI: JOIN

Last Updated: Mar 20, 2024

The JOIN component joins tables in the map stage instead of the reduce stage. Because the join completes before the shuffle stage, a large amount of data does not need to be transferred between stages, which shortens the job run time. For example, when you join a large table with a small table, the small table can be loaded into memory so that each row of the large table is matched locally.
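The idea of joining in the map stage can be illustrated with a minimal in-memory sketch. It is not the component's actual implementation, which the documentation does not describe. The helper name `map_side_join` and the sample tables are hypothetical. The small table is indexed once by join key, and the large table is then streamed through that index without being sorted or regrouped.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def map_side_join(large: Iterable[Row], small: Iterable[Row], key: str) -> Iterator[Row]:
    """Inner join that keeps the small table in memory (hypothetical helper)."""
    # Build phase: index the small table by join key.
    # This is the part that has to fit in memory.
    index: Dict[object, List[Row]] = {}
    for row in small:
        index.setdefault(row[key], []).append(row)

    # Probe phase: read the large table once and look up each row's key.
    # No sorting or regrouping of the large table is needed.
    for row in large:
        for match in index.get(row[key], []):
            yield {**match, **row}


# Hypothetical sample data: a larger "orders" table and a small "users" table.
orders = [
    {"user_id": 1, "amount": 30},
    {"user_id": 2, "amount": 15},
    {"user_id": 1, "amount": 7},
    {"user_id": 9, "amount": 99},  # no matching user, dropped by the inner join
]
users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]

joined = list(map_side_join(orders, users, "user_id"))
```

The memory cost depends only on the size of the small table, which is why this approach suits joins between one large and one small table.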

Configure the component

You can configure the JOIN component only on the pipeline page of Machine Learning Designer. The following table describes the parameters.

Parameter

Description

Join Type

The join type. Valid values: Left Join, Inner Join, Right Join, and Full Join.

MapJoin Optimization

Specifies whether to load the small table into memory to speed up the JOIN operation. Valid values:

  • Not Optimized: The small table is not loaded into memory.

  • Optimize Left Table: The left table is the small table. Its data is loaded into memory to speed up access.

  • Optimize Right Table: The right table is the small table. Its data is loaded into memory to speed up access.

Join Condition

The join conditions, each expressed as an equality between columns. You can manually add or remove join conditions.

Select Output Columns from the Left Table

The columns of the left table to include in the output.

Select Output Columns from the Right Table

The columns of the right table to include in the output.
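To make the interaction between Join Type, Join Condition, and the output column selections concrete, the following sketch reproduces their semantics on small in-memory tables. Everything here is an assumption for illustration: the function `join_tables`, its arguments, and the `how` values `"inner"`, `"left"`, `"right"`, and `"full"` are hypothetical stand-ins for the component's settings, not its API. It assumes that each condition pairs one left column with one right column. The nested loop is chosen for readability, not performance, and the example assumes the selected left and right column names do not collide.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, object]


def join_tables(
    left: Sequence[Row],
    right: Sequence[Row],
    conditions: Sequence[Tuple[str, str]],  # (left_column, right_column) equalities
    how: str,                               # "inner", "left", "right", or "full"
    left_cols: Sequence[str],
    right_cols: Sequence[str],
) -> List[Row]:
    if how not in ("inner", "left", "right", "full"):
        raise ValueError(f"unknown join type: {how}")

    def pick(row: Optional[Row], cols: Sequence[str]) -> Row:
        # A missing side contributes None for each of its selected columns.
        return {c: (row[c] if row is not None else None) for c in cols}

    def matches(l: Row, r: Row) -> bool:
        # All conditions must hold; None never matches, as with SQL NULL.
        return all(l[lc] is not None and l[lc] == r[rc] for lc, rc in conditions)

    out: List[Row] = []
    matched_right = set()
    for l in left:
        hit = False
        for i, r in enumerate(right):
            if matches(l, r):
                hit = True
                matched_right.add(i)
                out.append({**pick(l, left_cols), **pick(r, right_cols)})
        if not hit and how in ("left", "full"):
            out.append({**pick(l, left_cols), **pick(None, right_cols)})
    if how in ("right", "full"):
        for i, r in enumerate(right):
            if i not in matched_right:
                out.append({**pick(None, left_cols), **pick(r, right_cols)})
    return out


# Hypothetical sample tables joined on id = uid.
left_table = [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lima"}]
right_table = [{"uid": 2, "score": 0.5}, {"uid": 3, "score": 0.9}]
args = ([("id", "uid")],)
cols = (["id", "city"], ["score"])

inner = join_tables(left_table, right_table, *args, "inner", *cols)
left_join = join_tables(left_table, right_table, *args, "left", *cols)
right_join = join_tables(left_table, right_table, *args, "right", *cols)
full = join_tables(left_table, right_table, *args, "full", *cols)
```

With this data, the inner join keeps only the row with `id` 2. The left and right joins each add the unmatched row from their own side, with `None` in the other side's columns, and the full join returns all three rows.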