Configure code editor input widget - Dataphin - Alibaba Cloud Documentation Center

The code editor widget allows for the configuration of items not supported by system components, catering to specific business needs.

Procedure

On the Dataphin homepage, select Development > Data Integration from the top menu bar.
On the integration page, select Project from the top menu bar (Dev-Prod mode requires selecting the environment).
In the navigation pane on the left, click Batch Pipeline. Then, in the Batch Pipeline list, click the Offline Pipeline you want to develop to open its configuration page.
Click the Component Library icon in the upper right corner of the page to open the Component Library panel.
In the Component Library panel's left-side navigation pane, select Input. Find the Code Editor Input widget in the list on the right and drag it onto the canvas.
Click the icon in the code editor input widget card to open the Script Mode Input Configuration dialog box.

In the Script Mode Input Configuration dialog box, configure the parameters.

Parameter	Description
Step Name	This is the name of the code editor input widget. Dataphin automatically generates the step name, and you can also modify it according to business scenarios. The naming convention is as follows: Can only contain Chinese characters, letters, underscores (_), and numbers. Cannot exceed 64 characters.
Data Source Type	Select the data source type.
Datasource	Select the corresponding data source instance.
Generate Sync Configuration File	Click Generate Sync Configuration File. The system will automatically generate a configuration JSON template in the Configuration File area on the right based on the selected data source type. You can delete or modify configuration items according to actual business needs.
Configuration File	Click the icon to expand the editor and then view or write the configuration file in full screen. The template configuration consists of three main parts: `table` and `tables`: The single input table and multiple input tables for the component. The key for multiple input tables is `tables`. The key for a single input table is `table`. This parameter is not available if the component does not involve tables. `column`: The input fields for the input table. A field must have at least a `name` and a `type`. `parameter`: Parameters related to the sync configuration. The configuration varies based on the data source. For component parameters, see the documentation for the specific component. `{ "table":"tableName",// "tables":["tableName"] "column":[{ "name":"", "type":"" }], "parameter":{} }`

Click Confirm to finalize the property configuration for the Script Mode input widget.

Best Practices

The code editor widget uses a configuration JSON to set up offline integration tasks. It is primarily used in scenarios such as:

Configuring advanced items of components that cannot be set up with a graphical interface.
Manually configuring tables and fields when metadata support is unavailable, such as during errors or outages.
Addressing issues with special keywords for table and field names.
Utilizing database functions.
Implementing sharding strategies.
Directly executing query SQL statements.

The following explains how to use the Sync MySQL Multi-table JOIN Result Data as an example. Other scenarios can be configured similarly.

One: configuration introduction

In the Script Mode Input Configuration dialog box, set the parameters.

Parameter	Description
Step Name	Enter the name of the code editor input widget.
Data Source Type	Select MySQL.
Datasource	Select the corresponding data source instance.
Generate Sync Configuration File	Click Generate Sync Configuration File. The system will automatically generate a configuration JSON template in the Configuration File area on the right based on the selected data source type. You can delete or modify configuration items according to actual business needs.
Configuration File	The template configuration mainly consists of three parts: `table`: Configures the input table. If you have multiple tables, you can enter just one of them. `column`: The input fields for the input table. A field must have at least a `name` and a `type` value. Some input components may also require the index position of the field in the field list, which is the value of `index`. `parameter`: Parameters related to the sync configuration. The configuration varies based on the data source. For component parameters, see the documentation for the specific component. `{ "table":"tableName", "column":[ { "name":"", "type":"" } ], "parameter":{ } }`

Two: widget usage

Widget usage varies across different scenarios, as shown in the examples below.

Using query SQL

For instance, to sync the result data of a multi-table JOIN or other custom SQL queries, the configuration file content is as follows.

{
    "table": "sff_id_name",
    "parameter": {
        "connection": [
            {
                "querySql": [
                    "select a.id, b.name from sff_id_name a join sff_id_name_1 b on a.id = b.id"
                ]
            }
        ]
    },
    "column": [
        {
            "name": "id",
            "type": "Long"
        },
        {
            "name": "name",
            "type": "String"
        }
    ]
}

For complex SQL statements, use the querySql configuration item. This item is a string array that supports multiple SELECT statements with the same structure and must be nested within a connection. You can use multiple connection items to connect to multiple databases.

Sharding

Sharding involves distributing data across several databases and tables while maintaining a consistent table structure.

Using querySql

{
    "table": "sff_id_name",
    "parameter": {
        "connection": [
            {
                "querySql": [
                    "select a.id, b.name from sff_id_name a join sff_id_name_1 b on a.id = b.id"
                ]
            },
            {
                "jdbcUrl": [
                    "jdbc:mysql://ip:3306/dataphin"
                ],
                "querySql": [
                    "select id,name from sff_id_name"
                ]
            }
        ]
    },
    "column": [
        {
            "name": "id",
            "type": "Long"
        },
        {
            "name": "name",
            "type": "String"
        }
    ]
}

Directly Configure Table Name

{
    "table": "sff_id_name1",
    "parameter": {
        "connection": [
            {
                "table": [
                    "sff_id_name1","sff_id_name2"
                ]
            },
            {
                "jdbcUrl": [
                    "jdbc:mysql://xxx:3306/dataphin1"
                ],
                "table": [
                    "sff_id_name3","sff_id_name4"
                ]
            },
            {
                "jdbcUrl": [
                    "jdbc:mysql://xxx:3306/dataphin2"
                ],
                "table": [
                    "sff_id_name5","sff_id_name6"
                ]
            }
        ]
    },
    "column": [
        {
            "name": "id",
            "type": "Long"
        },
        {
            "name": "name",
            "type": "String"
        }
    ]
}

Using functions

You can directly use functions that the database supports, such as the MySQL upper function.

{
  "tables":[
    "sff_id_name_2"
  ],
  "parameter":{
    "where":"id>0",
    "splitPk":"id"
  },
  "column":[
    {
      "name":"id",
      "type":"Long"
    },
    {
      "name":"upper(name)",
      "type":"String"
    }
  ]
}

Keywords

MySQL uses backticks (``) to identify keywords. For example, if the keyword is id, you can use `id`.

{
	"tables":[
		"sff_id_name_2"
	],
	"parameter":{
		"where":"id>0",
		"splitPk":"id"
	},
	"column":[
		{
			"name":"`id`",
			"type":"Long"
		},
        {
			"name":"upper(name)",
			"type":"String"
		}
	]
}