This topic describes the project data protection mechanism and how to export data after data protection is enabled.

Background information

Some enterprises such as financial institutions have high data security requirements. They take various measures to prevent leakage of sensitive data. For example, their employees can only perform their jobs in the workplace, and are not allowed to take work materials out of the office. All USB ports on office computers are disabled.

As a MaxCompute project administrator, you may also encounter similar situations where users are not allowed to transfer data out of a project.

As shown in the following figure, the user Alice has access permissions on both Project1 and Project2. Therefore, Alice may transfer sensitive data from Project1 to Project2.
Specifically, assume that Project1 is named myprj and Project2 is named prj2, and that Alice has the SELECT permission on myprj.table1 and the CREATE TABLE permission in prj2. In this case, Alice can copy the data from Project1 to Project2 by executing the following SQL statement:
create table prj2.table2 as select * from myprj.table1;

To prevent sensitive data from flowing out to other projects, MaxCompute offers a data protection mechanism.

Data protection

Users who are authorized to access multiple projects can use cross-project data access operations to transfer data between those projects. If a project stores highly sensitive data, we recommend that the administrator enable the project protection mechanism.

To enable data protection for a project, run the following command in the project:
-- Set the project protection rule to allow inbound data flow but forbid outbound data flow.
set projectProtection=true;

The default value of ProjectProtection is false. After project protection is enabled, the data flow of the project is controlled. Data can only flow in, but cannot flow out. Cross-project data access operations fail because they violate the project protection rule.

Project protection controls data flow, but not data access. Data flow control is effective only if users can access the target data.
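
The difference between flow control and access control can be pictured with a toy model. The following Python sketch is illustrative only: the function can_move_data and its parameters are hypothetical and are not part of any MaxCompute API. It assumes that a cross-project operation succeeds only if the user can access the source data and the source project does not block outbound flow.

```python
def can_move_data(has_access: bool, source_protected: bool,
                  source_project: str, target_project: str) -> bool:
    """Toy model of the rule described above (not a MaxCompute API)."""
    # Access control comes first: without permission there is nothing to move.
    if not has_access:
        return False
    # Data that stays inside the same project never flows out.
    if source_project == target_project:
        return True
    # Outbound flow is blocked only when the source project is protected.
    return not source_protected


print(can_move_data(True, False, "myprj", "prj2"))   # unprotected source: allowed
print(can_move_data(True, True, "myprj", "prj2"))    # protected source: blocked
print(can_move_data(False, False, "myprj", "prj2"))  # no access: blocked
```

In this model, enabling protection changes the result only in the second call. A user without access is blocked either way, which is why project protection does not replace permission management.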

Outbound data flow after project protection is enabled

After project protection is enabled for a project, MaxCompute provides two data export methods.
  • Set an exception policy
    A project owner can configure an exception policy when enabling project protection. The command is as follows:
    SET ProjectProtection=true WITH EXCEPTION <policyFile>;
    Although an exception policy uses the same syntax as an authorization policy, it is not an authorization. An exception policy defines an exception to the project protection mechanism: the project protection rule does not apply to any access request that matches the policy.
    Note: Run the following command to check whether any exception policy is configured:
    show SecurityConfiguration;

    Example

    The following example allows the user Alice@aliyun.com to export data out of the alipay project when performing the SELECT operation on the alipay.table_test table in an SQL task.

        {
        "Version": "1",
        "Statement":
        [{
            "Effect":"Allow",
            "Principal":"ALIYUN$Alice@aliyun.com",
            "Action":["odps:Select"],
            "Resource":"acs:odps:*:projects/alipay/tables/table_test",
            "Condition":{
                "StringEquals": {
                    "odps:TaskType":["DT", "SQL"]
                }
            }
        }]
        }
    Note
    • An exception policy does not grant permissions. If the user Alice does not have the SELECT permission on the alipay.table_test table, Alice cannot export data even if the preceding exception policy is configured.
    • Common values of odps:TaskType are DT, SQL, and MapReduce. DT refers to Tunnel (the batch data tunnel) and covers tools that are built on the Tunnel SDK, such as DataWorks Data Integration and the open-source DataX.
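
If you manage several exception policies, it can help to generate the policy file instead of editing JSON by hand. The following Python sketch reproduces the structure of the preceding example. The helper build_exception_policy is hypothetical and is not part of any MaxCompute SDK. The sketch only builds the JSON text that you would save as the policy file.

```python
import json


def build_exception_policy(account, project, table, task_types):
    """Return an exception policy dict shaped like the preceding example."""
    return {
        "Version": "1",
        "Statement": [{
            "Effect": "Allow",
            "Principal": f"ALIYUN${account}",
            "Action": ["odps:Select"],
            "Resource": f"acs:odps:*:projects/{project}/tables/{table}",
            "Condition": {
                "StringEquals": {"odps:TaskType": list(task_types)}
            },
        }],
    }


policy = build_exception_policy(
    "Alice@aliyun.com", "alipay", "table_test", ["DT", "SQL"]
)
# Serialize to text; save this text as the file that is passed as <policyFile>.
policy_text = json.dumps(policy, indent=4)
print(policy_text)
```

Keeping the table name and task types as explicit arguments makes each exception easy to review before it is applied.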
    An exception policy can still leak data through a time-of-check to time-of-use (TOCTOU) race condition. The following sequence shows how this happens:
    1. [TOC stage] User A submits an application to the project owner to export table t1. After verifying that t1 does not contain sensitive data, the project owner configures an exception policy to authorize user A to export t1.
    2. Between the TOC and TOU, a malicious user writes sensitive data to table t1.
    3. [TOU stage] User A exports t1. However, the exported t1 is not the same as the t1 that the project owner reviewed and authorized.

    Suggestions for preventing TOCTOU problems: For each table that a user applies to export, the project owner must ensure that no other user (including the administrator) can update the table (UPDATE) or create a table with the same name (DROP + CREATE TABLE). In the preceding example, we recommend that the project owner create a snapshot of t1 in step 1, and then use this snapshot when setting the exception policy. Additionally, no other user should be granted the admin role.
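
The effect of the snapshot recommendation can be shown with a small in-memory simulation. This Python sketch is a toy model, not MaxCompute code: the tables dictionary and the table names are hypothetical stand-ins for project tables. It assumes that nobody except the project owner can write to the snapshot.

```python
import copy

# Hypothetical in-memory stand-in for project tables: name -> list of rows.
tables = {"t1": [{"id": 1, "note": "public"}]}

# [TOC] The owner reviews t1 and freezes a copy. The exception policy
# would then reference the snapshot instead of the live table.
tables["t1_snapshot"] = copy.deepcopy(tables["t1"])
reviewed = copy.deepcopy(tables["t1"])

# Between TOC and TOU, another user writes sensitive data to the live table.
tables["t1"].append({"id": 2, "note": "SENSITIVE"})

# [TOU] Exporting the live table leaks the new row; the snapshot does not.
live_export = tables["t1"]
snapshot_export = tables["t1_snapshot"]

print(live_export == reviewed)      # False: the live table changed after review
print(snapshot_export == reviewed)  # True: the snapshot matches what was reviewed
```

The snapshot closes the gap only if it is itself protected from writes, which is why the recommendation also restricts who holds the admin role.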

  • Configure a trusted project

    If the current project is protected and the target project is a trusted project, data flow to the target project is not a violation of the project protection rule. If multiple projects are configured as mutually trusted projects, they form a trusted project group. Data can flow only within this project group.

    You can run the following commands to manage trusted projects.
        list trustedprojects;
          -- View all trusted projects of the current project.
        add trustedproject <projectname>;
          -- Add a trusted project to the current project.
        remove trustedproject <projectname>;
          -- Remove a trusted project from the current project.
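
The trusted project rule can be sketched as a simple check. The following Python model is illustrative only: flow_allowed, the project names, and the data structures are hypothetical and do not correspond to a MaxCompute API. It assumes that trust is configured per project, as the commands above suggest, so a group forms only when projects trust each other.

```python
def flow_allowed(source, target, protected, trusted):
    """Toy check: may data flow from `source` to `target`?

    protected: set of project names with ProjectProtection=true
    trusted:   dict mapping a project to the set of projects it trusts
    """
    if source == target or source not in protected:
        return True
    # A protected project only lets data out to projects that it trusts.
    return target in trusted.get(source, set())


protected = {"prj_a", "prj_b"}
# prj_a and prj_b trust each other, so they form a trusted project group.
trusted = {"prj_a": {"prj_b"}, "prj_b": {"prj_a"}}

print(flow_allowed("prj_a", "prj_b", protected, trusted))  # True: inside the group
print(flow_allowed("prj_b", "prj_a", protected, trusted))  # True: inside the group
print(flow_allowed("prj_a", "prj_c", protected, trusted))  # False: prj_c is not trusted
```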
    							
  • Resource sharing and data protection

    In MaxCompute, package-based cross-project resource sharing and project protection are independent mechanisms that take effect at the same time, but they constrain each other.

    Resource sharing takes precedence over project protection. If a data object is made accessible to users from other projects through resource sharing, the object is not subject to the project protection rule.

Best practices

To prevent data outflow, you must set ProjectProtection=true and check the following settings:
  • Make sure that no trusted projects are added. If any trusted project is added, you must assess potential risks.
  • Make sure that no data sharing packages are used. If any data sharing package is used, you must ensure that no sensitive data exists in the package.