Modularity is a metric that evaluates the community structure of a network by measuring how strongly the network is divided into communities. A value greater than 0.3 indicates a strong community structure. This topic describes the Modularity component provided by Machine Learning Studio.
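For reference, the widely used Newman-Girvan definition of modularity for an undirected, unweighted network is shown below. This topic does not state which variant the component implements, so treat the formula as background rather than as the exact computation. Here, $A_{ij}$ is the adjacency matrix, $k_i$ is the degree of vertex $i$, $m$ is the number of edges, $c_i$ is the community of vertex $i$, $e_c$ is the number of edges inside community $c$, and $d_c$ is the sum of the degrees of the vertices in community $c$.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
  = \sum_{c} \left[ \frac{e_c}{m} - \left( \frac{d_c}{2m} \right)^{2} \right]
```

The first term rewards edges that fall inside a community, and the second term subtracts the fraction of such edges that would be expected if edges were placed at random while the vertex degrees stayed the same.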
You can configure the component by using one of the following methods:
Machine Learning Platform for AI console
Tab | Parameter | Description |
---|---|---|
Fields Setting | Source Vertex Column | The start vertex column in the edge table. |
Fields Setting | Initial Vertex Label Column | The column in the edge table that contains the group of each start vertex. |
Fields Setting | Target Vertex Column | The end vertex column in the edge table. |
Fields Setting | Target Vertex Label Column | The column in the edge table that contains the group of each end vertex. |
Tuning | Workers | The number of worker nodes for parallel job execution. The parallelism level and framework communication costs increase with the value of this parameter. |
Tuning | Memory Size per Worker | The maximum size of memory that a single job can use. By default, the system allocates 4,096 MB for each job. If the used memory size exceeds the value of this parameter, an OutOfMemory exception is reported. |
PAI command
PAI -name Modularity
-project algo_public
-DinputEdgeTableName=Modularity_func_test_edge
-DfromVertexCol=flow_out_id
-DfromGroupCol=group_out_id
-DtoVertexCol=flow_in_id
-DtoGroupCol=group_in_id
-DoutputTableName=Modularity_func_test_result;
Parameter | Required | Description | Default value |
---|---|---|---|
inputEdgeTableName | Yes | The name of the input edge table. | No default value |
inputEdgeTablePartitions | No | The partitions in the input edge table. | Full table |
fromVertexCol | Yes | The start vertex column in the input edge table. | No default value |
fromGroupCol | Yes | The column in the input edge table that contains the group of each start vertex. | No default value |
toVertexCol | Yes | The end vertex column in the input edge table. | No default value |
toGroupCol | Yes | The column in the input edge table that contains the group of each end vertex. | No default value |
outputTableName | Yes | The name of the output table. | No default value |
outputTablePartitions | No | The partitions in the output table. | No default value |
lifecycle | No | The lifecycle of the output table. | No default value |
workerNum | No | The number of worker nodes for parallel job execution. The parallelism level and framework communication costs increase with the value of this parameter. | Not configured |
workerMem | No | The maximum size of memory that a single job can use, in MB. If the used memory size exceeds the value of this parameter, an OutOfMemory exception is reported. | 4096 |
splitSize | No | The data split size. | 64 |
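The parameter table above can be turned into a small command builder. The following Python sketch is illustrative only: the helper name `build_modularity_command` is hypothetical, and the code only assembles a command string from the parameter names documented above. It does not submit anything to the platform.

```python
# Parameter names are taken from the table above; the helper itself is hypothetical.
REQUIRED = (
    "inputEdgeTableName", "fromVertexCol", "fromGroupCol",
    "toVertexCol", "toGroupCol", "outputTableName",
)
OPTIONAL = (
    "inputEdgeTablePartitions", "outputTablePartitions",
    "lifecycle", "workerNum", "workerMem", "splitSize",
)


def build_modularity_command(params, project="algo_public"):
    """Return a PAI command string after checking the documented parameters."""
    missing = [name for name in REQUIRED if not params.get(name)]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    unknown = [name for name in params if name not in REQUIRED + OPTIONAL]
    if unknown:
        raise ValueError("unknown parameters: " + ", ".join(unknown))
    lines = ["PAI -name Modularity", "    -project " + project]
    # Emit parameters in table order so the output is stable.
    for name in REQUIRED + OPTIONAL:
        if name in params:
            lines.append("    -D{}={}".format(name, params[name]))
    return "\n".join(lines) + ";"


example = build_modularity_command({
    "inputEdgeTableName": "Modularity_func_test_edge",
    "fromVertexCol": "flow_out_id",
    "fromGroupCol": "group_out_id",
    "toVertexCol": "flow_in_id",
    "toGroupCol": "group_in_id",
    "outputTableName": "Modularity_func_test_result",
    "workerMem": 4096,
})
print(example)
```

Checking required and unknown names before building the string surfaces typos such as a misspelled column parameter early, instead of at job submission time.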
Examples
- Generate training data.
The data is similar to that of the label propagation algorithm. For more information, see the topic that describes the label propagation algorithm.
- View training results.
+--------------+
| val          |
+--------------+
| 0.4230769    |
+--------------+
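To make the output value easier to interpret, the following Python sketch computes modularity from rows shaped like the input edge table (start vertex, start group, end vertex, end group). It assumes the standard Newman-Girvan formula for an undirected, unweighted graph, which may differ from what the component computes internally. The function name and the sample graph are hypothetical, and the sample is not the data behind the result shown above.

```python
from collections import defaultdict


def modularity(edges):
    """Compute Newman-Girvan modularity.

    edges: iterable of (from_vertex, from_group, to_vertex, to_group) rows.
    Each row is treated as one undirected, unweighted edge (an assumption).
    """
    m = 0                        # total number of edges
    intra = defaultdict(int)     # edges with both endpoints in the same group
    degree = defaultdict(int)    # sum of vertex degrees per group
    for _, g_from, _, g_to in edges:
        m += 1
        degree[g_from] += 1
        degree[g_to] += 1
        if g_from == g_to:
            intra[g_from] += 1
    if m == 0:
        raise ValueError("the edge list is empty")
    # Q = sum over groups of (intra-group edge fraction - expected fraction)
    return sum(intra[g] / m - (degree[g] / (2 * m)) ** 2 for g in degree)


# Hypothetical sample: two triangles (groups "a" and "b") joined by one edge.
sample_edges = [
    (1, "a", 2, "a"), (2, "a", 3, "a"), (1, "a", 3, "a"),
    (4, "b", 5, "b"), (5, "b", 6, "b"), (4, "b", 6, "b"),
    (3, "a", 4, "b"),
]
q = modularity(sample_edges)
print(round(q, 4))  # prints 0.3571
```

In this sample, each group holds 3 of the 7 edges and half of the total degree, so Q = 2 × (3/7 − 1/4) = 5/14, which is above the 0.3 threshold mentioned at the start of this topic.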