Dataphin: Create an Amazon Redshift data source

Last Updated: Jul 07, 2025

An Amazon Redshift data source enables Dataphin to read business data from and write data to Amazon Redshift. This topic describes how to create one.

Permission requirements

Only the following roles can create data sources: super administrator, data source administrator, domain architect, project administrator, and custom global roles that have the permission to create data sources.

Procedure

  1. On the Dataphin homepage, choose Management Center > Datasource Management from the top navigation bar.

  2. On the Datasource page, click +Create Data Source.

  3. On the Create Data Source page, select Amazon Redshift in the Big Data section.

    If you have recently used Amazon Redshift, you can also select it in the Recently Used section, or enter keywords in the search box to find it quickly.

  4. On the Create Amazon Redshift Data Source page, configure the connection parameters.

    1. Configure the basic information of the data source

      Parameter

      Description

      Datasource Name

      Enter a name for the data source. The name must meet the following requirements:

      • The name can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-).

      • The name cannot exceed 64 characters in length.
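The two naming rules above can be checked before you fill in the form. The following is a minimal sketch, not Dataphin's own validation logic. It assumes that "Chinese characters" means the CJK Unified Ideographs range U+4E00 to U+9FFF and that an empty name is rejected. The function name `is_valid_datasource_name` is hypothetical.

```python
import re

# Assumption: "Chinese characters" is approximated by U+4E00..U+9FFF.
# Assumption: the name must contain at least one character.
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")


def is_valid_datasource_name(name: str) -> bool:
    """Return True if the name uses only allowed characters and has 1 to 64 of them."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_datasource_name("redshift_prod-01"))  # allowed characters only
print(is_valid_datasource_name("bad name"))          # contains a space
```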

      Datasource Code

      After you configure the data source code, you can access Dataphin data source tables in Flink SQL tasks or through the Dataphin JDBC client by using the format data source code.table name or data source code.schema.table name. To switch data sources automatically based on the task execution environment, use the variable format ${data source code}.table or ${data source code}.schema.table instead. For more information, see Development methods for Dataphin data source tables.

      Important
      • The data source code cannot be modified after it is configured successfully.

      • After the data source code is configured successfully, you can preview data on the object details page in the asset directory and asset inventory.

      • In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
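To make the reference formats for the data source code concrete, the sketch below builds each documented variant as a string. The helper `table_reference` and the names `redshift_sales`, `public`, and `orders` are hypothetical and used only for illustration.

```python
from typing import Optional


def table_reference(code: str, table: str, schema: Optional[str] = None,
                    as_variable: bool = False) -> str:
    """Build a data source table reference in one of the documented formats."""
    # The ${code} form is the variable format used to switch data sources
    # based on the task execution environment.
    prefix = "${" + code + "}" if as_variable else code
    parts = [prefix, schema, table] if schema else [prefix, table]
    return ".".join(parts)


# Hypothetical data source code, schema, and table names.
print(table_reference("redshift_sales", "orders"))
print(table_reference("redshift_sales", "orders", schema="public"))
print(table_reference("redshift_sales", "orders", schema="public", as_variable=True))
```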

      Version

      Currently, only version 1.0.x is supported.

      Data Source Description

      Enter a brief description of the data source. The description cannot exceed 128 characters.

      Time Zone

      The time zone will be used to process time format data in integration tasks. The default time zone is Asia/Shanghai. Click Modify to select a target time zone. The options are as follows:

      • GMT: GMT-12:00, GMT-11:00, GMT-10:00, GMT-09:30, GMT-09:00, GMT-08:00, GMT-07:00, GMT-06:00, GMT-05:00, GMT-04:00, GMT-03:00, GMT-02:30, GMT-02:00, GMT-01:00, GMT+00:00, GMT+01:00, GMT+02:00, GMT+03:00, GMT+03:30, GMT+04:00, GMT+04:30, GMT+05:00, GMT+05:30, GMT+05:45, GMT+06:00, GMT+06:30, GMT+07:00, GMT+08:00, GMT+08:45, GMT+09:00, GMT+09:30, GMT+10:00, GMT+10:30, GMT+11:00, GMT+12:00, GMT+12:45, GMT+13:00, GMT+14:00.

      • Daylight Saving Time: Africa/Cairo, America/Chicago, America/Denver, America/Los_Angeles, America/New_York, America/Sao_Paulo, Asia/Bangkok, Asia/Dubai, Asia/Kolkata, Asia/Shanghai, Asia/Tokyo, Atlantic/Azores, Australia/Sydney, Europe/Berlin, Europe/London, Europe/Moscow, Europe/Paris, Pacific/Auckland, Pacific/Honolulu.
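The time zone matters because a time value without zone information denotes a different instant depending on the zone applied to it. The sketch below illustrates this with two of the GMT options listed above. The helper `parse_gmt_offset` is hypothetical, and how Dataphin applies the setting internally is not described here, so treat this only as an illustration of the effect.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_gmt_offset(label: str) -> timezone:
    """Convert a label such as 'GMT+05:45' into a fixed-offset tzinfo."""
    match = re.fullmatch(r"GMT([+-])(\d{2}):(\d{2})", label)
    if match is None:
        raise ValueError(f"not a GMT offset label: {label!r}")
    sign = 1 if match.group(1) == "+" else -1
    delta = timedelta(hours=int(match.group(2)), minutes=int(match.group(3)))
    return timezone(sign * delta, name=label)


# The same wall-clock value interpreted in two different time zones.
naive = datetime(2025, 7, 7, 12, 0, 0)
in_gmt8 = naive.replace(tzinfo=parse_gmt_offset("GMT+08:00"))
in_gmt545 = naive.replace(tzinfo=parse_gmt_offset("GMT+05:45"))

# The two interpretations are 2 hours 15 minutes apart as instants.
print(in_gmt545 - in_gmt8)
```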

      Data Source Configuration

      Select the data source that you want to configure:

      • If your business data source distinguishes between production and development data sources, select Production + Development Data Source.

      • If your business data source does not distinguish between production and development data sources, select Production Data Source.

      Tag

      You can categorize and tag data sources based on tags. For information about how to create tags, see Manage data source tags.

    2. Configure the connection parameters between the data source and Dataphin

      If you select Production + Development Data Source for Data Source Configuration, you must configure connection information for both the production and the development data source. If you select Production Data Source, you only need to configure connection information for the production data source.

      Note

      In most cases, the production data source and development data source should be configured as different data sources to achieve environment isolation and reduce the impact of the development data source on the production data source. However, Dataphin also supports configuring them as the same data source with identical parameter values.

      Parameter

      Description

      Server Address

      Enter the IP address and port number of the server.

      You can click +Add to add multiple sets of IP addresses and port numbers, and click the delete icon to remove a set that you no longer need. At least one set must be retained.

      Parameter Checking

      • Parameter name: You can select an existing parameter name or enter a custom parameter name.

        Custom parameter names can contain only letters, digits, periods (.), underscores (_), and hyphens (-).

      • Parameter value: When a parameter name is selected, the parameter value is required. It can contain only letters, digits, periods (.), underscores (_), and hyphens (-), and cannot exceed 256 characters in length.

      Note

      You can click +Add Parameter to add more parameters, and click the delete icon to remove parameters that you no longer need. You can add up to 30 parameters.
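The parameter rules above (allowed characters, a required value of at most 256 characters, and at most 30 parameters) can be sketched as a pre-check. This is an illustrative sketch, not Dataphin's validation code. The function `validate_parameters` is hypothetical, as are the example parameter names. It applies the character rule for custom names to every name.

```python
import re

# Letters, digits, periods, underscores, and hyphens only.
_TOKEN = re.compile(r"[A-Za-z0-9._-]+")
MAX_VALUE_LENGTH = 256
MAX_PARAMETERS = 30


def validate_parameters(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problems."""
    errors = []
    if len(params) > MAX_PARAMETERS:
        errors.append(f"too many parameters: {len(params)} > {MAX_PARAMETERS}")
    for name, value in params.items():
        if _TOKEN.fullmatch(name) is None:
            errors.append(f"invalid parameter name: {name!r}")
        if not value:
            errors.append(f"value required for parameter: {name!r}")
        elif _TOKEN.fullmatch(value) is None or len(value) > MAX_VALUE_LENGTH:
            errors.append(f"invalid value for parameter: {name!r}")
    return errors


# Hypothetical parameter names and values.
print(validate_parameters({"ssl": "true", "bad name": "x"}))
```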

      dbname

      Enter the database name.

      Schema

      Enter the schema information associated with the username.

      Username, Password

      Enter the authentication username and password. To ensure that tasks run properly, make sure that the user has the required data permissions.

    3. Advanced configuration

      Parameter

      Description

      connectTimeout

      The connection timeout period of the database in seconds. The default value is 900 seconds (15 minutes).

      Note

      For data sources created before Dataphin V3.11, the default value of connectTimeout is -1, which indicates no timeout limit.

      socketTimeout

      The socket timeout period of the database in seconds. The default value is 1800 seconds (30 minutes).

      Note

      For data sources created before Dataphin V3.11, the default value of socketTimeout is -1, which indicates no timeout limit.
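The two timeout settings share the same convention: a value in seconds, with -1 meaning no limit on data sources created before Dataphin V3.11. The sketch below maps a configured value to the form that Python APIs commonly expect, where `None` means no timeout. The function `effective_timeout` is hypothetical, and rejecting other negative values is an assumption.

```python
from typing import Optional

DEFAULT_CONNECT_TIMEOUT_S = 900   # 15 minutes
DEFAULT_SOCKET_TIMEOUT_S = 1800   # 30 minutes
NO_LIMIT = -1                     # legacy default before Dataphin V3.11


def effective_timeout(configured_seconds: int) -> Optional[float]:
    """Map a configured timeout to seconds, or None when there is no limit."""
    if configured_seconds == NO_LIMIT:
        return None
    if configured_seconds < 0:
        # Assumption: -1 is the only negative value with a defined meaning.
        raise ValueError(f"unsupported timeout: {configured_seconds}")
    return float(configured_seconds)


print(effective_timeout(DEFAULT_CONNECT_TIMEOUT_S))
print(effective_timeout(NO_LIMIT))
```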

      Connection Retries

      If the database connection times out, the system automatically retries the connection until the specified number of retries is reached. If the connection still cannot be established after the maximum number of retries, it is reported as failed.

      Note
      • The default number of retries is 1. You can set a value between 0 and 10.

      • The number of connection retries applies by default to offline integration tasks and global quality (which requires the asset quality feature to be enabled). In offline integration tasks, you can configure the number of retries separately at the task level.
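The retry behavior described above can be sketched as a simple loop. This is an illustration under stated assumptions, not Dataphin's implementation: it assumes the retry count means additional attempts after the first one, and that only timeouts trigger a retry. The function `connect_with_retries` and the `connect` callable are hypothetical.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retries(connect: Callable[[], T], retries: int = 1) -> T:
    """Call connect(), retrying on timeout up to `retries` additional times."""
    if not 0 <= retries <= 10:
        raise ValueError("retries must be between 0 and 10")
    attempts = retries + 1  # assumption: the first attempt is not a retry
    last_error = None
    for _ in range(attempts):
        try:
            return connect()
        except TimeoutError as exc:
            last_error = exc
    raise ConnectionError(f"connection failed after {attempts} attempts") from last_error
```

With the default of 1 retry, a connection that times out once and then succeeds is treated as successful. With 0 retries, the first timeout is final.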

  5. Select a Default Resource Group. This resource group is used to run tasks related to the current data source, including database SQL, offline database migration, and data preview.

  6. Click Test Connection or directly click OK to save and complete the creation of the Amazon Redshift data source.

    When you click Test Connection, the system tests whether the data source can connect to Dataphin properly. If you directly click OK, the system automatically tests the connection for all selected clusters. However, even if all selected clusters fail the connection test, the data source can still be created normally.

    Test Connection tests the connection for the Default Cluster or Registered Scheduling Clusters that have been registered in Dataphin and are in normal use. The Default Cluster is selected by default and cannot be deselected. If there are no resource groups under a Registered Scheduling Cluster, connection testing is not supported. You need to create a resource group first before testing the connection.

    • The selected clusters are only used to test network connectivity with the current data source and are not used for running related tasks later.

    • The connection test usually takes less than 2 minutes. If it times out, you can click the icon to view the specific reason and then retry.

    • The system records the time at which the final result was generated, whether that result is Connection Failed, Connection Successful, or Succeeded With Warning.

      Note

      Only the test results for the Default Cluster include three connection statuses: Succeeded With Warning, Connection Successful, and Connection Failed. The test results for Registered Scheduling Clusters in Dataphin only include two connection statuses: Connection Successful and Connection Failed.

    • When the test result is Connection Failed, you can click the icon to view the specific failure reason.

    • When the test result is Succeeded With Warning, the connection to the application cluster succeeded but the connection to the scheduling cluster failed. In this state, the current data source cannot be used for data development and integration. You can click the icon to view the log information.
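The three statuses reported for the Default Cluster can be summarized as a small decision function. The sketch below follows the description of Succeeded With Warning given above. Treating a failed application cluster connection as Connection Failed regardless of the scheduling cluster is an assumption, and the names `ConnectionStatus` and `default_cluster_status` are hypothetical.

```python
from enum import Enum


class ConnectionStatus(Enum):
    SUCCESSFUL = "Connection Successful"
    WARNING = "Succeeded With Warning"
    FAILED = "Connection Failed"


def default_cluster_status(application_ok: bool, scheduling_ok: bool) -> ConnectionStatus:
    """Derive the Default Cluster test result from the two connection checks."""
    if application_ok and scheduling_ok:
        return ConnectionStatus.SUCCESSFUL
    if application_ok:
        # Application cluster connected, scheduling cluster did not.
        return ConnectionStatus.WARNING
    # Assumption: any application cluster failure is reported as failed.
    return ConnectionStatus.FAILED


def usable_for_development(status: ConnectionStatus) -> bool:
    """Only a fully successful test leaves the data source usable for development and integration."""
    return status is ConnectionStatus.SUCCESSFUL


print(default_cluster_status(True, False).value)
```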