The edge clustering coefficient is used to calculate the density of each edge in an undirected graph G. This topic describes the Edge Clustering Coefficient component provided by Machine Learning Studio.

You can configure the component by using one of the following methods:

Machine Learning Platform for AI console

| Tab | Parameter | Description |
| --- | --- | --- |
| Fields Setting | Start Vertex | The start vertex column in the edge table. |
| Fields Setting | End Vertex | The end vertex column in the edge table. |
| Tuning | Workers | The number of workers for parallel job execution. The parallelism level and framework communication costs increase with the value of this parameter. |
| Tuning | Memory Size per Worker (MB) | The maximum memory, in MB, that a single worker can use. By default, the system allocates 4,096 MB for each worker. If memory usage exceeds the value of this parameter, an OutOfMemory exception is reported. |
| Tuning | Data Split Size (MB) | The data split size, in MB. Default value: 64. |

PAI command

PAI -name EdgeDensity
    -project algo_public
    -DinputEdgeTableName=EdgeDensity_func_test_edge
    -DfromVertexCol=flow_out_id
    -DtoVertexCol=flow_in_id
    -DoutputTableName=EdgeDensity_func_test_result;

| Parameter | Required | Description | Default value |
| --- | --- | --- | --- |
| inputEdgeTableName | Yes | The name of the input edge table. | No default value |
| inputEdgeTablePartitions | No | The partitions in the input edge table. | Full table |
| fromVertexCol | Yes | The start vertex column in the input edge table. | No default value |
| toVertexCol | Yes | The end vertex column in the input edge table. | No default value |
| outputTableName | Yes | The name of the output table. | No default value |
| outputTablePartitions | No | The partitions in the output table. | No default value |
| lifecycle | No | The lifecycle of the output table. | No default value |
| workerNum | No | The number of workers for parallel job execution. The parallelism level and framework communication costs increase with the value of this parameter. | Not configured |
| workerMem | No | The maximum memory, in MB, that a single worker can use. By default, the system allocates 4,096 MB for each worker. If memory usage exceeds the value of this parameter, an OutOfMemory exception is reported. | 4096 |
| splitSize | No | The data split size, in MB. | 64 |
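
The optional parameters are appended to the command in the same `-Dname=value` form as the required ones. The following Python sketch shows one way to assemble such a command string. The helper name `build_edge_density_command` and its keyword arguments are hypothetical and are not part of PAI. Only the `-D` parameter names come from the parameter list above.

```python
def build_edge_density_command(input_table, from_col, to_col, output_table,
                               worker_num=None, worker_mem=None, split_size=None):
    """Assemble a PAI EdgeDensity command string (hypothetical helper)."""
    params = {
        "inputEdgeTableName": input_table,
        "fromVertexCol": from_col,
        "toVertexCol": to_col,
        "outputTableName": output_table,
        # Optional tuning parameters; left out of the command when None.
        "workerNum": worker_num,
        "workerMem": worker_mem,
        "splitSize": split_size,
    }
    lines = ["PAI -name EdgeDensity", "    -project algo_public"]
    lines += [f"    -D{key}={value}"
              for key, value in params.items() if value is not None]
    return "\n".join(lines) + ";"


# Example: raise the per-worker memory and keep the other defaults.
command = build_edge_density_command(
    "EdgeDensity_func_test_edge", "flow_out_id", "flow_in_id",
    "EdgeDensity_func_test_result", worker_mem=8192)
print(command)
```

Leaving an optional argument as `None` omits it from the command, so the component falls back to the default value listed for that parameter.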

Examples

  1. Generate training data.
    drop table if exists EdgeDensity_func_test_edge;
    create table EdgeDensity_func_test_edge as
    select * from
    (
      select '1' as flow_out_id,'2' as flow_in_id from dual
      union all
      select '1' as flow_out_id,'3' as flow_in_id from dual
      union all
      select '1' as flow_out_id,'5' as flow_in_id from dual
      union all
      select '1' as flow_out_id,'7' as flow_in_id from dual
      union all
      select '2' as flow_out_id,'5' as flow_in_id from dual
      union all
      select '2' as flow_out_id,'4' as flow_in_id from dual
      union all
      select '2' as flow_out_id,'3' as flow_in_id from dual
      union all
      select '3' as flow_out_id,'5' as flow_in_id from dual
      union all
      select '3' as flow_out_id,'4' as flow_in_id from dual
      union all
      select '4' as flow_out_id,'5' as flow_in_id from dual
      union all
      select '4' as flow_out_id,'8' as flow_in_id from dual
      union all
      select '5' as flow_out_id,'6' as flow_in_id from dual
      union all
      select '5' as flow_out_id,'7' as flow_in_id from dual
      union all
      select '5' as flow_out_id,'8' as flow_in_id from dual
      union all
      select '7' as flow_out_id,'6' as flow_in_id from dual
      union all
      select '6' as flow_out_id,'8' as flow_in_id from dual
    )tmp;
    drop table if exists EdgeDensity_func_test_result;
    create table EdgeDensity_func_test_result
    (
      node1 string,
      node2 string,
      node1_edge_cnt bigint,
      node2_edge_cnt bigint,
      triangle_cnt bigint,
      density double
    );
    The following figure shows the structure of the edge clustering coefficient graph.
    [Figure: Structure of the edge clustering coefficient graph]
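  2. View training results. The columns are, in order, node1, node2, node1_edge_cnt, node2_edge_cnt, triangle_cnt, and density.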
    1,2,4,4,2,0.5
    2,3,4,4,3,0.75
    2,5,4,7,3,0.75
    3,1,4,4,2,0.5
    3,4,4,4,2,0.5
    4,2,4,4,2,0.5
    4,5,4,7,3,0.75
    5,1,7,4,3,0.75
    5,3,7,4,3,0.75
    5,6,7,3,2,0.66667
    5,8,7,3,2,0.66667
    6,7,3,3,1,0.33333
    7,1,3,4,1,0.33333
    7,5,3,7,2,0.66667
    8,4,3,4,1,0.33333
    8,6,3,3,1,0.33333
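
The rows above are consistent with density being the number of triangles that contain the edge divided by the smaller of the two endpoint degrees, for example 2 / min(7, 3) ≈ 0.66667 for the edge between vertices 5 and 6. This formula is inferred from the sample output and is not stated by the component documentation. The following Python sketch recomputes the values for the sample edges under that assumption. The function name `edge_density` is illustrative, and the sketch keeps each edge in its input orientation, which can differ from the node1/node2 order in the component output.

```python
from collections import defaultdict

# Edges copied from the sample edge table (flow_out_id, flow_in_id).
EDGES = [
    ("1", "2"), ("1", "3"), ("1", "5"), ("1", "7"),
    ("2", "5"), ("2", "4"), ("2", "3"),
    ("3", "5"), ("3", "4"),
    ("4", "5"), ("4", "8"),
    ("5", "6"), ("5", "7"), ("5", "8"),
    ("7", "6"), ("6", "8"),
]


def edge_density(edges):
    """Return {(u, v): (deg_u, deg_v, triangle_cnt, density)}.

    Assumes density = triangle_cnt / min(deg_u, deg_v), as inferred
    from the sample results; the graph is treated as undirected.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    result = {}
    for u, v in edges:
        # Each common neighbor of u and v closes one triangle on the edge.
        triangle_cnt = len(neighbors[u] & neighbors[v])
        deg_u, deg_v = len(neighbors[u]), len(neighbors[v])
        density = round(triangle_cnt / min(deg_u, deg_v), 5)
        result[(u, v)] = (deg_u, deg_v, triangle_cnt, density)
    return result


rows = edge_density(EDGES)
for (u, v), stats in sorted(rows.items()):
    print(u, v, *stats, sep=",")
```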