
Data source configuration

Last Updated: May 04, 2018

Data source configuration is the primary task of data integration. When developing data synchronization (data import or export) tasks, the project administrator must configure reachable data sources that support the entire data development project.

The project administrator can create, edit, and delete data sources in the current project space. Multiple data source types are currently supported. For more information, see Supported Data Source Types.

Note:

Virtual Private Cloud (VPC) provides an isolated network environment that you can customize with an IP address range, network segments, and a gateway. As VPC security has improved, more applications run on VPCs, and Data Integration therefore supports RDS-MySQL, RDS-SQL Server, and RDS-PostgreSQL data sources on VPC networks. For these RDS data sources, you do not need an ECS instance on the same network as the VPC, because the system automatically detects the network and ensures connectivity through a reverse proxy. Support for other Alibaba Cloud databases, including PPAS, OceanBase, Redis, MongoDB, Memcache, TableStore, and HBase, is planned for the future. For any non-RDS data source on a VPC network, an ECS instance on the same network is required to configure Data Integration synchronization tasks and ensure interconnectivity.

Add a data source

Create a MaxCompute data source

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the corresponding project in the project list.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Click Add Data Source.

  4. In the data source creation dialog box, select ODPS as the data source type.

  5. Complete the configuration items of the MaxCompute data source.

    Configuration item description:

    • Data source name: The name can contain letters, digits, and underscores, must start with a letter or underscore, and cannot exceed 30 characters.

    • Data source description: A brief description of the data source, which must not exceed 80 characters.

    • Data source type: The currently selected data source type is ODPS.

    • ODPS Endpoint: Read-only by default. The setting is automatically read from the system configuration.

    • ODPS project name: It refers to the corresponding MaxCompute Project identifier.

    • Access ID: The AccessID corresponding to the MaxCompute Project Owner cloud account.

    • AccessKey: The AccessKey corresponding to the MaxCompute Project Owner cloud account. It is paired with the AccessID. An AccessKey (AK) is equivalent to a logon password.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.
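
The naming and description limits listed in step 5 can be checked before the form is submitted. The following is a minimal sketch, assuming the rules apply exactly as stated above and that "letters" means ASCII letters. The function names are hypothetical and are not part of DataWorks.

```python
import re

# Assumption: "letters" means ASCII letters; the console may accept more.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_name(name: str, max_length: int = 30) -> bool:
    """Return True if the name follows the documented naming rule."""
    return len(name) <= max_length and NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(description: str, max_length: int = 80) -> bool:
    """Return True if the description fits within the documented limit."""
    return len(description) <= max_length
```

Other data source types on this page state different limits (for example, 60 characters for the name), so the limit is passed as a parameter rather than hard-coded.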

Create an RDS > MySQL data source

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Click Add Data Source.

  4. In the data source creation dialog box, select “RDS > MySQL” as the data source type.

  5. Choose to configure the MySQL data source as an RDS instance.

    Configuration item description:

    • Data source name: The name can contain letters, digits, and underscores, must start with a letter or underscore, and cannot exceed 60 characters.

    • Data source description: A brief description of the data source, which must not exceed 80 characters.

    • Data source type: The currently selected data source type (RDS > MySQL > RDS instance type).

    • RDS instance ID: The ID of the RDS instance of the MySQL data source.

    • RDS instance purchaser ID: The purchaser ID of the RDS instance of the MySQL data source.

      Note: If you select JDBC to configure the data source, the format of the JDBC connection information is jdbc:mysql://IP:Port/database.

    • Database name: The database name of the data source.

    • User name/password: The user name and password of the database.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.
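
For the JDBC form mentioned in the note in step 5, the connection string can be assembled from its parts. This is a minimal sketch; the helper name, host, port, and database name are illustrative assumptions, not values from DataWorks.

```python
def mysql_jdbc_url(host: str, port: int, database: str) -> str:
    """Build a URL in the documented form jdbc:mysql://IP:Port/database."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"jdbc:mysql://{host}:{port}/{database}"


# Example values only: 192.0.2.10 is a documentation address.
print(mysql_jdbc_url("192.0.2.10", 3306, "sales"))
# prints jdbc:mysql://192.0.2.10:3306/sales
```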

Create an RDS > SQLServer data source

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Click Add Data Source.

  4. In the data source creation dialog box, select “RDS > SQLServer” as the data source type.

  5. Choose to configure the SQLServer data source as an RDS instance.

    Configuration item description:

    • Data source name: The name can contain letters, digits, and underscores, must start with a letter or underscore, and cannot exceed 60 characters.

    • Data source description: A brief description of the data source, which must not exceed 80 characters.

    • Data source type: The currently selected data source type (RDS > SQLServer > RDS instance type).

    • RDS instance ID: The ID of the RDS instance of the SQLServer data source.

    • RDS instance purchaser ID: The purchaser ID of the RDS instance of the data source.

      Note: If you select JDBC to configure the data source, the format of the JDBC connection information is jdbc:sqlserver://IP:Port;DatabaseName=database.

    • Database name: The database name of the data source.

    • User name/password: The user name and password of the database.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.

Create an RDS > PostgreSQL data source

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Click Add Data Source.

  4. In the data source creation dialog box, select “RDS > PostgreSQL” as the data source type.

  5. Choose to configure the PostgreSQL data source as an RDS instance.

    Configuration item description:

    • Data source name: The name can contain letters, digits, and underscores, must start with a letter or underscore, and cannot exceed 60 characters.

    • Data source description: A brief description of the data source, which must not exceed 80 characters.

    • Data source type: The currently selected data source type (RDS > PostgreSQL > RDS instance type).

    • RDS instance ID: The ID of the RDS instance of the PostgreSQL data source.

    • RDS instance purchaser ID: The purchaser ID of the RDS instance of the data source.

      Note: If you select JDBC to configure the data source, the format of the JDBC connection information is jdbc:postgresql://IP:Port/database.

    • Database name: The database name of the data source.

    • User name/password: The user name and password of the database.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.

Create an Oracle data source

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Click Add Data Source.

  4. In the data source creation dialog box, select Oracle as the data source type.

  5. Complete the configuration items of the Oracle data source.

    Configuration item description:

    • Data source name: The name can contain letters, digits, and underscores, must start with a letter or underscore, and cannot exceed 60 characters.

    • Data source description: A brief description of the data source, which must not exceed 80 characters.

    • Data source type: The currently selected data source type is Oracle.

    • Network type: The currently selected network type.

    • JDBCUrl: It refers to the JDBC connection information in the following format: jdbc:oracle:thin:@serverIP:Port:Database.

    • User name/password: The user name and password of the database.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.
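
The JDBCUrl format given in step 5 can be checked locally before running the connectivity test. The sketch below assumes the URL follows the documented form exactly (host, port, and database separated by colons). The function name and sample values are hypothetical.

```python
import re

# Matches the documented form jdbc:oracle:thin:@serverIP:Port:Database.
ORACLE_URL = re.compile(
    r"jdbc:oracle:thin:@(?P<host>[^:/@\s]+):(?P<port>\d+):(?P<database>[^:/\s]+)"
)


def parse_oracle_jdbc_url(url: str) -> dict:
    """Split a documented-form Oracle JDBC URL into host, port, and database."""
    match = ORACLE_URL.fullmatch(url)
    if match is None:
        raise ValueError(f"not in the form jdbc:oracle:thin:@serverIP:Port:Database: {url}")
    return {
        "host": match["host"],
        "port": int(match["port"]),
        "database": match["database"],
    }
```

Rejecting a malformed URL early gives a clearer message than a failed connectivity test, which cannot tell a format mistake from a network problem.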

Create an ADS data source

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Click Add Data Source.

  4. In the data source creation dialog box, select ADS as the data source type.

  5. Complete the configuration items of the ADS data source.

    Configuration item description:

    • Data source name: The name can contain letters, digits, and underscores, must start with a letter or underscore, and cannot exceed 60 characters.

    • Data source description: A brief description of the data source, which must not exceed 80 characters.

    • Data source type: The currently selected data source type is ADS.

    • Connection URL: It refers to the ADS connection information in the following format: serverIP:Port.

    • Schema: It refers to the corresponding ADS Schema information.

    • AccessID/AccessKey: An AccessKey (AK) is equivalent to a logon password.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.
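
The Connection URL in step 5 uses the form serverIP:Port. As a minimal sketch, the value can be split and range-checked before it is entered; the helper name and the sample address are assumptions made for illustration.

```python
from typing import Tuple


def split_ads_connection_url(url: str) -> Tuple[str, int]:
    """Split a serverIP:Port string into (host, port)."""
    host, sep, port_text = url.rpartition(":")
    if not sep or not host or not port_text.isdecimal():
        raise ValueError(f"expected serverIP:Port, got: {url}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port
```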

Create an OSS data source

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Click Add Data Source.

  4. In the data source creation dialog box, select OSS as the data source type.

  5. Complete the configuration items of the OSS data source.

    Configuration item description:

    • Data source name: The name can contain letters, digits, and underscores, must start with a letter or underscore, and cannot exceed 60 characters.

    • Data source description: A brief description of the data source, which must not exceed 80 characters.

    • Data source type: The currently selected data source type is OSS.

    • Network type:

      • Classic network: IP addresses are centrally allocated by Alibaba Cloud. Classic networks are easy to configure and use. This network type is suitable for users who require high ease of operation and need to use the ECS quickly.

      • VPC: A VPC is a logically isolated private network. Network topologies and IP addresses can be customized. VPC supports private line connection and is suitable for users who are familiar with network management.

    • Endpoint: The OSS endpoint, in the following format: http://oss.aliyuncs.com. An OSS endpoint is tied to a region, so a different domain must be entered to access each region.

      Note:

      Enter the Endpoint in the format http://oss.aliyuncs.com. If you prefix the endpoint with the bucket name and a dot, such as http://xxx.oss.aliyuncs.com, the connectivity test passes but an error is still reported.

    • Bucket: The OSS bucket. A bucket is a storage space that serves as the container for objects. You can create one or more buckets and add one or more files to each bucket. Data synchronization tasks search for files in the bucket entered here; files in buckets that have not been added cannot be searched.

    • AccessID/AccessKey: An AccessKey (AK) is equivalent to a logon password.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.
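
The Endpoint note in step 5 concerns endpoints that are prefixed with the bucket name, such as http://xxx.oss.aliyuncs.com. Because the connectivity test does not catch this case, a local pre-check can help. The following is a minimal sketch using only the Python standard library; the function name is hypothetical.

```python
from urllib.parse import urlparse


def endpoint_has_bucket_prefix(endpoint: str, bucket: str) -> bool:
    """Return True if the endpoint host starts with '<bucket>.'."""
    # urlparse lowercases the host name, so the bucket is lowercased to match.
    host = urlparse(endpoint).hostname or ""
    # Caveat: a bucket named like the endpoint's first label (for example "oss")
    # would be flagged as well.
    return host.startswith(bucket.lower() + ".")
```

With a bucket named xxx, the check flags http://xxx.oss.aliyuncs.com and accepts http://oss.aliyuncs.com.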

Create an OCS data source

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Click Add Data Source.

  4. In the data source creation dialog box, select OCS as the data source type.

  5. Complete the configuration items of the OCS data source.

    Configuration item description:

    • Data source name: The name can contain letters, digits, and underscores, must start with a letter or underscore, and cannot exceed 30 characters.

    • Data source description: A brief description of the data source, which must not exceed 1,024 characters.

    • Data source type: The currently selected data source type is OCS.

    • Network type: The currently selected network type.

    • PROXY: The OCS Proxy.

    • Port: The OCS port.

    • User name/password: The user name and password of the database.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.

Create a DRDS data source

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Click Add Data Source.

  4. In the data source creation dialog box, select DRDS as the data source type.

  5. Complete the configuration items of the DRDS data source.

    Configuration item description:

    • Data source name: The name can contain letters, digits, and underscores, must start with a letter or underscore, and cannot exceed 60 characters.

    • Data source description: A brief description of the data source, which must not exceed 80 characters.

    • Data source type: The currently selected data source type is DRDS.

    • Network type: The currently selected network type.

    • JDBCUrl: The JDBC URL. Format: jdbc:mysql://serverIP:Port/database.

    • User name/password: The user name and password of the database.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.

Create an FTP data source

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Click Add Data Source.

  4. In the data source creation dialog box, select FTP as the data source type.

  5. Complete the configuration items of the FTP data source.

    Configuration item description:

    • Data source name: The name can contain letters, digits, and underscores, must start with a letter or underscore, and cannot exceed 60 characters.

    • Data source description: A brief description of the data source, which must not exceed 80 characters.

    • Data source type: The currently selected data source type is FTP.

    • Network type: The currently selected network type.

    • Protocol: Only the FTP and SFTP protocols are supported currently.

    • Host: The FTP host IP address.

    • Port: If FTP is selected, the port is 21 by default. If SFTP is selected, the port is 22 by default.

    • User name/password: The user name and password for accessing the FTP server.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.
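
The protocol and port rules in step 5 (port 21 for FTP and port 22 for SFTP by default) can be expressed as a small lookup. This is a minimal sketch; the function name is hypothetical and the behavior for an explicit port is an assumption.

```python
from typing import Optional

# Default ports as stated in the configuration item description.
DEFAULT_PORTS = {"ftp": 21, "sftp": 22}


def resolve_port(protocol: str, port: Optional[int] = None) -> int:
    """Return the explicit port if given, else the default for the protocol."""
    key = protocol.lower()
    if key not in DEFAULT_PORTS:
        raise ValueError(f"unsupported protocol: {protocol}")
    return DEFAULT_PORTS[key] if port is None else port
```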

Edit data source

The project administrator can modify the configuration information of existing data sources as needed.

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Enter the data source name in the search box to run a fuzzy search for the data source to edit.

  4. Click Edit in the action bar of the corresponding data source.

  5. Complete the configuration items of the data source. For more information, see the section Add a data source.

  6. Click Test connectivity.

  7. Click OK when the connectivity test is passed.

Delete data source

The project administrator can delete the existing data source configuration.

Procedure

  1. Enter the DataWorks console as a developer, and click Enter the Work Zone in the action bar of the project.

  2. Click Data Integration in the top navigation bar and navigate to the Data Source page.

  3. Enter the data source name in the search box to run a fuzzy search for the data source to delete.

  4. Click Delete in the action bar of the corresponding data source.

  5. Click OK in the data source deletion dialog box to delete the data source.

    Note:

    Editing or deleting a data source may disrupt the workflows and code that reference it and thus cause a production failure. Project administrators must therefore exercise caution when editing or deleting a data source.
