For a tree network, the Tree Depth component outputs the depth of each vertex and the ID of the tree that contains the vertex. This topic describes the Tree Depth component provided by Machine Learning Studio.
You can configure the component by using one of the following methods:
Machine Learning Platform for AI console
Tab | Parameter | Description |
---|---|---|
Fields Setting | Edge Table: Start Vertex Column | The start vertex column in the edge table. |
Fields Setting | Edge Table: End Vertex Column | The end vertex column in the edge table. |
Tuning | Workers | The number of vertices for parallel job execution. The parallelism level and framework communication costs increase with the value of this parameter. |
Tuning | Memory Size per Worker | The maximum size of memory that a single job can use. By default, the system allocates 4,096 MB for each job. If the used memory size exceeds the value of this parameter, the OutOfMemory exception is reported. |
Tuning | Data Split Size | The data split size. Default value: 64. |
PAI command
PAI -name TreeDepth
-project algo_public
-DinputEdgeTableName=TreeDepth_func_test_edge
-DfromVertexCol=flow_out_id
-DtoVertexCol=flow_in_id
-DoutputTableName=TreeDepth_func_test_result;
Parameter | Required | Description | Default value |
---|---|---|---|
inputEdgeTableName | Yes | The name of the input edge table. | No default value |
inputEdgeTablePartitions | No | The partitions in the input edge table. | Full table |
fromVertexCol | Yes | The start vertex column in the input edge table. | No default value |
toVertexCol | Yes | The end vertex column in the input edge table. | No default value |
outputTableName | Yes | The name of the output table. | No default value |
outputTablePartitions | No | The partitions in the output table. | No default value |
lifecycle | No | The lifecycle of the output table. | No default value |
workerNum | No | The number of vertices for parallel job execution. The parallelism level and framework communication costs increase with the value of this parameter. | Not configured |
workerMem | No | The maximum size of memory that a single job can use. By default, the system allocates 4,096 MB for each job. If the used memory size exceeds the value of this parameter, the OutOfMemory exception is reported. | 4096 |
splitSize | No | The data split size. | 64 |
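The required and optional parameters above can be combined into a command string programmatically. The following Python sketch is only an illustration: `build_tree_depth_command` is a hypothetical helper, not part of any PAI SDK, and it assumes that optional parameters are passed in the same `-Dname=value` form as the required ones in the sample command.

```python
# Hypothetical helper: assembles a TreeDepth command string from the
# parameters listed in the table. It does not submit the command.
OPTIONAL_PARAMS = {
    "inputEdgeTablePartitions",
    "outputTablePartitions",
    "lifecycle",
    "workerNum",
    "workerMem",
    "splitSize",
}


def build_tree_depth_command(input_edge_table, from_vertex_col,
                             to_vertex_col, output_table, **optional):
    """Return a PAI TreeDepth command as a multi-line string."""
    unknown = set(optional) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")

    # Required parameters first, in the order used by the sample command.
    params = {
        "inputEdgeTableName": input_edge_table,
        "fromVertexCol": from_vertex_col,
        "toVertexCol": to_vertex_col,
        "outputTableName": output_table,
    }
    params.update(optional)

    lines = ["PAI -name TreeDepth", "-project algo_public"]
    lines += [f"-D{name}={value}" for name, value in params.items()]
    return "\n".join(lines) + ";"


command = build_tree_depth_command(
    "TreeDepth_func_test_edge",
    "flow_out_id",
    "flow_in_id",
    "TreeDepth_func_test_result",
    workerMem=4096,
)
print(command)
```

Rejecting unknown parameter names early keeps a typo such as `workerMemory` from being silently passed through in the command text.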
Examples
- Generate training data.
drop table if exists TreeDepth_func_test_edge;
create table TreeDepth_func_test_edge as
select * from
(
  select '0' as flow_out_id, '1' as flow_in_id from dual
  union all
  select '0' as flow_out_id, '2' as flow_in_id from dual
  union all
  select '1' as flow_out_id, '3' as flow_in_id from dual
  union all
  select '1' as flow_out_id, '4' as flow_in_id from dual
  union all
  select '2' as flow_out_id, '4' as flow_in_id from dual
  union all
  select '2' as flow_out_id, '5' as flow_in_id from dual
  union all
  select '4' as flow_out_id, '6' as flow_in_id from dual
  union all
  select 'a' as flow_out_id, 'b' as flow_in_id from dual
  union all
  select 'a' as flow_out_id, 'c' as flow_in_id from dual
  union all
  select 'c' as flow_out_id, 'd' as flow_in_id from dual
  union all
  select 'c' as flow_out_id, 'e' as flow_in_id from dual
)tmp;
drop table if exists TreeDepth_func_test_result;
create table TreeDepth_func_test_result
(
  node string,
  root string,
  depth bigint
);
The following figure shows the tree depth structure.
- View training results.
0,0,0
1,0,1
2,0,1
3,0,2
4,0,2
5,0,2
6,0,3
a,a,0
b,a,1
c,a,1
d,a,2
e,a,2
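Each result row follows the column order of the output table: node, root, depth. To make the relationship between the sample edges and these rows concrete, the following Python sketch recomputes them with a plain breadth-first search. It is an assumption-laden illustration, not the component's actual implementation: it treats every vertex without an incoming edge as a root, and it assigns a vertex that is reachable from two parents (such as `4`) the depth at which it is first visited.

```python
from collections import defaultdict, deque

# Edges copied from the TreeDepth_func_test_edge sample table
# as (flow_out_id, flow_in_id) pairs.
EDGES = [
    ("0", "1"), ("0", "2"), ("1", "3"), ("1", "4"),
    ("2", "4"), ("2", "5"), ("4", "6"),
    ("a", "b"), ("a", "c"), ("c", "d"), ("c", "e"),
]


def tree_depth(edges):
    """Return sorted (node, root, depth) rows using breadth-first search."""
    children = defaultdict(list)
    vertices = set()
    has_parent = set()
    for start, end in edges:
        children[start].append(end)
        vertices.update((start, end))
        has_parent.add(end)

    # Assumption: a root is any vertex that never appears as an end vertex.
    roots = sorted(vertices - has_parent)

    result = {}
    for root in roots:
        result[root] = (root, 0)
        queue = deque([root])
        while queue:
            node = queue.popleft()
            for child in children[node]:
                if child not in result:  # keep the first (shallowest) visit
                    result[child] = (root, result[node][1] + 1)
                    queue.append(child)

    return sorted((node, root, depth) for node, (root, depth) in result.items())


rows = tree_depth(EDGES)
for node, root, depth in rows:
    print(f"{node},{root},{depth}")
```

Under these assumptions, the printed rows match the result rows listed above for the sample table.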