DataWorks: Isolate a data source in the development and production environments

Last Updated: Sep 07, 2023

DataWorks provides the data source isolation feature for workspaces in standard mode. This way, data of the development environment can be isolated from data of the production environment.

Background information

In a workspace in standard mode, a data source has two sets of settings: one for the development environment and one for the production environment. You configure each set separately, so that the same data source points to one database or data warehouse in the development environment and to another in the production environment. When you run a synchronization task, the environment in which the task runs determines which database of the data source the task accesses. This way, data of the development environment is isolated from data of the production environment. For more information about workspaces in standard mode, see Differences between workspaces in basic mode and workspaces in standard mode.

  • In a workspace in standard mode, Operation Center in the development environment and DataStudio access the data source that is configured in the development environment by default.

  • When you run a task in Operation Center in the production environment, Operation Center in the production environment accesses the data source that is configured in the production environment by default.
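The default behavior described above can be pictured as a lookup keyed by data source name and environment. The following Python sketch is only an illustration and does not use any DataWorks API: the registry, the entry-point labels, the host names, and the `resolve` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionSettings:
    jdbc_url: str
    username: str


# Hypothetical registry: one data source name, two sets of settings.
DATA_SOURCES = {
    "orders_db": {
        "dev": ConnectionSettings("jdbc:mysql://dev-host:3306/orders_dev", "dev_user"),
        "prod": ConnectionSettings("jdbc:mysql://prod-host:3306/orders", "prod_user"),
    }
}

# Environment that each entry point uses by default (labels are illustrative).
DEFAULT_ENVIRONMENT = {
    "DataStudio": "dev",
    "OperationCenter(dev)": "dev",
    "OperationCenter(prod)": "prod",
}


def resolve(name, entry_point):
    """Return the settings that a task started from entry_point would use."""
    env = DEFAULT_ENVIRONMENT[entry_point]
    settings = DATA_SOURCES[name].get(env)
    if settings is None:
        raise LookupError(f"data source {name!r} is not configured in {env}")
    return settings


print(resolve("orders_db", "DataStudio").jdbc_url)
print(resolve("orders_db", "OperationCenter(prod)").jdbc_url)
```

The task definition refers only to the name `orders_db`. Which database is actually reached depends on where the task runs.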

Note
  • You can configure different databases, usernames, and passwords for the same data source in the development and production environments. As a result, a synchronization task that uses the data source may run successfully on the DataStudio page but fail in the production environment. Make sure that the databases or data warehouses of the data source in both environments are configured based on your business requirements. If a task succeeds on the DataStudio page but fails in the production environment, or if the amount of data differs between the two environments, troubleshoot the issue by comparing the success log of the task in the development environment with the error log of the task in the production environment.

  • Tasks are deployed to the production environment for running. If the configurations of the data source in the development and production environments differ, make sure that a network connection is established between the resource group that you want to use and the data source in each environment.
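When a task succeeds in DataStudio but fails in production, it can help to list exactly which settings differ between the two environments before reading the logs. The sketch below assumes that you have copied the two sets of settings into plain dictionaries by hand. The field names and the `diff_settings` helper are hypothetical and are not part of DataWorks.

```python
# Fields whose values should never be printed (assumed field name).
SECRET_FIELDS = {"password"}


def diff_settings(dev, prod):
    """Return {field: (dev_value, prod_value)} for every field that differs."""
    differences = {}
    for field in sorted(set(dev) | set(prod)):
        dev_value, prod_value = dev.get(field), prod.get(field)
        if dev_value == prod_value:
            continue
        if field in SECRET_FIELDS:
            # Report that the secret differs without revealing either value.
            dev_value, prod_value = "***", "***"
        differences[field] = (dev_value, prod_value)
    return differences


dev = {"database": "orders_dev", "username": "etl", "password": "dev-secret"}
prod = {"database": "orders", "username": "etl", "password": "prod-secret"}

for field, (dev_value, prod_value) in diff_settings(dev, prod).items():
    print(f"{field}: dev={dev_value} prod={prod_value}")
```

A difference is not an error in itself, because isolation relies on the two environments pointing to different databases. The list only narrows down which setting may explain a failure or a mismatch in the amount of data.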

The data source isolation feature has the following impacts on workspaces:

  • Only workspaces in standard mode support the data source isolation feature. You can specify different databases or data warehouses for the same data source in a workspace in standard mode when you add the data source to the workspace.

    Note

    A workspace in basic mode provides only one environment. Therefore, data cannot be isolated by environment. For more information about workspace modes, see Scenario: Upgrade a workspace from the basic mode to the standard mode.

  • After you upgrade a workspace from the basic mode to the standard mode, the original data source is configured in both the development and production environments.

Procedure

  1. Go to the Data Source page.

    1. Log on to the DataWorks console. In the left-side navigation pane, click Management Center. On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.

    2. In the left-side navigation pane of the page that appears, click Data Source. The Data Source page appears.

  2. On the Data Source page, configure the following parameters or perform the following operations.

    Operation or parameter

    Description

    Batch add data source

    You can click this button to add multiple MySQL, SQL Server, PolarDB, or Oracle data sources at a time. Other data sources do not support batch addition.

    DataWorks provides templates that you can use to add multiple data sources at a time. You can download a template, configure the fields in the template, and then upload the template. The progress and results are displayed in the Start new field of the Batch add data source dialog box. Fields in the template: DataSourceType, DataSourceName, description, environment classification (0 for the development environment, 1 for the production environment), JDBC URL, username, and password.

    Note

    The name of the data source in the development environment must be the same as the name of the data source in the production environment.
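Before you upload a filled-in template, a quick local check can catch rows that would break the naming rule above. The following sketch assumes that the template rows have already been read into dictionaries. The key names (`Environment` in particular), the lowercase type spellings, and the `check_rows` helper are assumptions made for illustration and may not match the actual template.

```python
# Types that support batch addition according to the text; spellings are assumed.
SUPPORTED_TYPES = {"mysql", "sqlserver", "polardb", "oracle"}
ENVIRONMENTS = {"0": "dev", "1": "prod"}


def check_rows(rows):
    """Return a list of human-readable problems found in the template rows."""
    problems = []
    envs_by_name = {}
    for number, row in enumerate(rows, start=1):
        if row["DataSourceType"].lower() not in SUPPORTED_TYPES:
            problems.append(f"row {number}: type {row['DataSourceType']!r} does not support batch addition")
        env = ENVIRONMENTS.get(str(row["Environment"]))
        if env is None:
            problems.append(f"row {number}: environment must be 0 (dev) or 1 (prod)")
            continue
        envs_by_name.setdefault(row["DataSourceName"], set()).add(env)
    # A name that appears in only one environment often indicates a typo.
    for name, envs in sorted(envs_by_name.items()):
        for missing in sorted({"dev", "prod"} - envs):
            problems.append(f"{name!r}: no row for {missing}; check that the name matches in both environments")
    return problems


rows = [
    {"DataSourceType": "mysql", "DataSourceName": "orders", "Environment": 0},
    {"DataSourceType": "mysql", "DataSourceName": "order", "Environment": 1},
]
for problem in check_rows(rows):
    print(problem)
```

A name that exists in only one environment is reported as something to review. It is not necessarily invalid, because a data source can be added to a single environment on purpose.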

    Add data source

    • Data sources in the development environment: You can select such a data source when you create a synchronization task and run the task in the development environment. You cannot commit the task to the production environment for running.

    • Data sources in the production environment: You can use such a data source only in the production environment. You cannot select such a data source when you create a synchronization task.

    Environment

    This parameter is not displayed for a workspace in basic mode.

    Actions

    • New: If no data source is configured in the related environment, New is displayed in the Operation column. You can click New to add a data source in the environment.

    • Edit and Delete: If a data source is configured in the related environment, Edit and Delete are displayed in the Operation column. You can click Edit to modify the data source or click Delete to delete the data source.

      • Before you delete a data source from the development and production environments, check whether the data source is used by a synchronization task in the production environment. The deletion cannot be rolled back. After the data source is deleted, you cannot select it when you configure a synchronization task in the development environment.

        If a synchronization task in the production environment uses the data source, the synchronization task cannot be run after the data source is deleted. Before you delete the data source, delete the synchronization task that uses the data source.

      • Before you delete a data source from the development environment, check whether the data source is used by a synchronization task in the production environment. The deletion cannot be rolled back. After the data source is deleted, you cannot select it when you configure a synchronization task in the development environment.

        If a synchronization task in the production environment uses the data source, after the data source is deleted, you cannot obtain metadata when you edit the synchronization task. However, the synchronization task can be run in the production environment.

      • Before you delete a data source from the production environment, check whether the data source is used by a synchronization task in the production environment. If you select the data source when you configure a synchronization task in the development environment, you cannot commit the synchronization task to the production environment after the data source is deleted.

        If a synchronization task in the production environment uses the data source, the synchronization task cannot be run after the data source is deleted.

    Select

    You can select multiple data sources in the Select column to test the network connectivity of the data sources or delete the data sources at a time.
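The deletion rules listed for the Edit and Delete actions can be condensed into a small decision function. This is a sketch that restates those three cases and is not DataWorks code. The `deletion_impact` name and the wording of the messages are illustrative.

```python
def deletion_impact(delete_dev, delete_prod):
    """List the consequences of deleting a data source from the given environments."""
    effects = []
    if delete_dev:
        effects.append("cannot select the data source when configuring a task in the development environment")
    if delete_dev and not delete_prod:
        effects.append("production tasks still run, but metadata cannot be obtained when editing them")
    if delete_prod and not delete_dev:
        effects.append("tasks that use the data source cannot be committed to the production environment")
    if delete_prod:
        effects.append("production tasks that use the data source cannot run")
    return effects


for case in [(True, True), (True, False), (False, True)]:
    print(case, deletion_impact(*case))
```

In every case the deletion cannot be rolled back, so check the list for the environments that you plan to delete from before you proceed.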