DataWorks provides PostgreSQL Reader and PostgreSQL Writer for you to read data from and write data to PostgreSQL data sources. You can use the codeless user interface (UI) or code editor to configure sync nodes for PostgreSQL data sources.

Background information

Workspaces in standard mode support the data source isolation feature. You can add data sources separately for the development and production environments to isolate the data sources. This helps keep your data secure. For more information, see Isolate a data source in the development and production environments.

Precautions

PostgreSQL data sources support password authentication and MD5 authentication. If the password and authentication method of the PostgreSQL database that you add to DataWorks as a data source are changed, you must update the configurations of the data source and test the network connectivity between the data source and your resource group again. In addition, you must also manually run the related data synchronization node again to ensure that the configurations of the node meet your business requirements.

Procedure

  1. Go to the Data Source page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region in which the workspace you want to manage resides. On the Workspaces page, find the workspace, move the pointer over the More icon in the Actions column, and then select Workspace Settings.
    4. In the left-side navigation pane of the page that appears, click Data Source to go to the Data Source page.
    Note You can also go to the Data Sources page in Data Integration to add a data source. However, you can add a data source on the Data Sources page in Data Integration only to the production environment.
  2. On the Data Source page, click Add data source in the upper-right corner.
  3. In the Add data source dialog box, click PostgreSQL in the Relational Database section.
  4. In the Add PostgreSQL data source dialog box, configure the parameters.
    1. Configure the basic information of the PostgreSQL data source.
      You can add a PostgreSQL data source in one of the following modes: Alibaba Cloud instance mode and connection string mode.
      • The following table describes the parameters you can configure when you add a PostgreSQL data source in Alibaba Cloud instance mode. PostgreSQL
        ParameterDescription
        Data Source TypeThe mode in which you want to add the data source to DataWorks. Set this parameter to Alibaba Cloud Instance Mode.
        Data Source NameThe name of the data source. The name can contain only letters, digits, and underscores (_), and must start with a letter.
        Data Source DescriptionThe description of the data source. The description cannot exceed 80 characters in length.
        EnvironmentThe environment in which the data source is used. Valid values: Development and Production.
        Note This parameter is displayed only if the workspace is in standard mode.
        RegionThe region where the data source resides.
        RDS instance IDThe ID of the ApsaraDB RDS for PostgreSQL instance. You can log on to the ApsaraDB RDS console to view the ID.
        RDS instance account IDThe ID of the Alibaba Cloud account that is used to purchase the ApsaraDB RDS for PostgreSQL instance. You can log on to the DataWorks console with the Alibaba Cloud account, move the pointer over the profile picture in the upper-right corner, and then select Security Settings to view the ID of the account.
        Default Database Name
        The name of the default ApsaraDB RDS for MySQL database. The following descriptions provide instructions for you to configure a data synchronization node or solution that uses a MySQL data source:
        • When you configure a database-level real-time or batch synchronization node or solution that uses a MySQL data source, you can select one or more databases on which you have access permissions in the ApsaraDB for MySQL instance.
        • If you select multiple databases when you configure a batch synchronization node, you must add a data source for each database.
        UsernameThe username that is used to connect to the database.
        PasswordThe password that is used to connect to the database. Do not use at signs (@) in the password.
        Configure Secondary DatabaseIf the ApsaraDB RDS for PostgreSQL instance has a read-only instance or a secondary ApsaraDB RDS for PostgreSQL instance, you can enable Configure Secondary Database when you add the ApsaraDB RDS for PostgreSQL instance to DataWorks as a data source, and select the ID of the read-only instance that you want to use. This prevents the primary ApsaraDB RDS for PostgreSQL instance from being affected and ensures the performance of the primary ApsaraDB RDS for PostgreSQL instance. If the secondary ApsaraDB RDS for PostgreSQL instance has multiple read-only ApsaraDB RDS for PostgreSQL instances, only data of one of the read-only instances is read.
      • The following table describes the parameters you can configure when you add a PostgreSQL data source in connection string mode. JDBC
        ParameterDescription
        Data Source TypeThe mode in which you want to add the data source to DataWorks. Set this parameter to Connection String Mode.
        Data Source NameThe name of the data source. The name can contain only letters, digits, and underscores (_), and must start with a letter.
        Data Source DescriptionThe description of the data source. The description cannot exceed 80 characters in length.
        EnvironmentThe environment in which the data source is used. Valid values: Development and Production.
        Note This parameter is displayed only if the workspace is in standard mode.
        JDBC URLThe Java Database Connectivity (JDBC) URL of the database. Configure this parameter in the format of jdbc:postgresql://ServerIP:Port/Database. You must replace Database with the name of the default database in the data source. However, when you configure a data synchronization node, you can use all databases in the ApsaraDB RDS for PostgreSQL instance.
        UsernameThe username that is used to connect to the database.
        PasswordThe password that is used to connect to the database.
  5. Set Resource Group connectivity to Data Integration.
  6. Find the desired resource group in the resource group list in the lower part of the dialog box and click Test connectivity in the Actions column.
    A synchronization node can use only one type of resource group. To ensure that your synchronization nodes can be normally run, you must test the connectivity of all the resource groups for Data Integration on which your synchronization nodes will be run. If you want to test the connectivity of multiple resource groups for Data Integration at a time, select the resource groups and click Batch test connectivity. For more information, see Establish a network connection between a resource group and a data source.
    Note
    • By default, the resource group list displays only exclusive resource groups for Data Integration. To ensure the stability and performance of data synchronization, we recommend that you use exclusive resource groups for Data Integration.
    • If you want to test the network connectivity between the shared resource group or a custom resource group and the data source, click Advanced below the resource group list. In the Warning message, click Confirm. Then, all available shared and custom resource groups appear in the resource group list.
  7. If the data source passes the network connectivity test, click Complete.

What to do next

You have learned how to add a PostgreSQL data source. You can proceed to subsequent tutorials. In subsequent tutorials, you will learn how to configure PostgreSQL Reader and PostgreSQL Writer. For more information, see PostgreSQL Reader and PostgreSQL Writer.