This topic describes the main points of a data security self-check and the adjustments to make based on the results. Use it as a starting point for tightening data security in your own projects.

Scenario

In the early stage of a project, user and permission management is often kept loose to speed up progress. When the project enters a stable development stage, data security becomes increasingly important. At this stage, you must check and analyze data security and make adjustments based on the analysis results.

Main points of a self-check

  1. Number of accounts: Collect the numbers of DataWorks workspace members and MaxCompute project members. To facilitate accountability and management, make sure that each member has only one account.
  2. Accounts that are no longer used and permissions for the accounts: For RAM users that have roles in a MaxCompute project or a DataWorks workspace, revoke the roles and remove the users from the project before you delete them. Otherwise, the deleted RAM users are displayed as p4_xxxxxxxxxxxxxxxxxxxx and cannot be removed from the project. This issue does not affect the use of the project. However, if an account is no longer used because of personnel changes, you must reclaim the account and its permissions. After you notify the account owners and investigate usage, delete the accounts that have not been used for a long time, and apply for new accounts if required.
  3. Personal account survey and analysis: Query the data submitted by personal accounts in the development phase within the last three months, collect statistics on the top N users, and select typical accounts to analyze their daily tasks. The submitted data includes data involved in retrieval and computing tasks, which are mainly SQL tasks. You can use the TASKS_HISTORY view provided by the MaxCompute metadata service Information Schema to analyze the data. Examples:
    • An account belongs to a member of an algorithm development project. Most of the daily tasks of the account are SQL tasks, and the SQL tasks mainly involve queries and table write operations in the development environment. The number of algorithm tasks and MapReduce tasks is less than the number of SQL tasks. This is normal in data development because SQL tasks are preferentially used to process data whenever possible.
    • Many tasks are submitted by the same account. This is because the owner of the account used an SDK to build a program that lets other users obtain the AccessKey pair of this account, so those users can submit tasks as this account. To prevent multiple users from sharing the same account, exercise caution when you grant permissions.
  4. Data download statistics: Collect the data download requests for each project, analyze them, and plan which projects should allow downloads. You can use the TUNNELS_HISTORY view provided by the MaxCompute metadata service Information Schema to analyze and collect statistics on these requests.
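
The account survey and download statistics described above can be sketched as a small aggregation. The following Python is a minimal sketch, not a MaxCompute API: it assumes the history rows have already been exported from the Information Schema history views into dictionaries, and the field names (owner_name, task_type, project, operate_type, data_size) and sample values are hypothetical placeholders that you should check against the actual view columns.

```python
from collections import Counter, defaultdict

# Hypothetical rows standing in for records exported from the
# Information Schema history views. Field names are assumptions.
task_rows = [
    {"owner_name": "alice", "task_type": "SQL"},
    {"owner_name": "alice", "task_type": "SQL"},
    {"owner_name": "alice", "task_type": "MapReduce"},
    {"owner_name": "bob", "task_type": "SQL"},
    {"owner_name": "carol", "task_type": "SQL"},
    {"owner_name": "carol", "task_type": "SQL"},
]
tunnel_rows = [
    {"project": "proj_a", "operate_type": "DOWNLOAD", "data_size": 500},
    {"project": "proj_a", "operate_type": "UPLOAD", "data_size": 900},
    {"project": "proj_b", "operate_type": "DOWNLOAD", "data_size": 200},
    {"project": "proj_a", "operate_type": "DOWNLOAD", "data_size": 300},
]


def top_submitters(rows, n):
    """Return the n accounts that submitted the most tasks, busiest first."""
    counts = Counter(row["owner_name"] for row in rows)
    return counts.most_common(n)


def task_mix(rows, owner):
    """Count one account's tasks by type, to profile its daily work."""
    return dict(Counter(r["task_type"] for r in rows if r["owner_name"] == owner))


def download_bytes_by_project(rows):
    """Sum downloaded bytes per project, ignoring uploads."""
    totals = defaultdict(int)
    for row in rows:
        if row["operate_type"] == "DOWNLOAD":
            totals[row["project"]] += row["data_size"]
    return dict(totals)


print(top_submitters(task_rows, 2))
print(task_mix(task_rows, "alice"))
print(download_bytes_by_project(tunnel_rows))
```

An account that dominates the top N list with an unusual task mix is a candidate for the shared-account check described in item 3.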

Adjustment points

  • Allocation of accounts

    Each member in a project must have their own account.

    Grant different data access permissions to different members based on their business development teams and roles. Sharing accounts is not allowed. Avoid data security risks caused by excessive user permissions. For example, allocate accounts by business group in the data development process. Business groups include the management group, data integration group, data model group, algorithm group, analysis group, O&M group, and security group.
  • Data flow control

    Restrict the export of data from some projects and control the permissions of some members. Unrestricted data flows among projects may disrupt the data architecture of the cloud platform and create a risk of data leaks. Therefore, cross-project data flows need to be restricted for most projects.

    For example, to prevent risks caused by unknown data flows, you can allow data to flow only to specified projects or locations at the MaxCompute level.

  • Data export limits

    If data is exported from MaxCompute as files, you can no longer control where the data is transmitted. Therefore, we recommend that you minimize the possibility of data export from MaxCompute. You can restrict the data export permissions for some business groups based on the division of user roles. This does not affect the daily development work of the users.
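
The adjustment points above can be summarized as a simple policy check. The following Python is only an illustrative model, not a MaxCompute feature or API: the choice of which business groups keep export permission, the project names, the member names, and the EXPORT_GROUPS, ALLOWED_FLOWS, and MEMBER_GROUP tables are all hypothetical assumptions.

```python
# Hypothetical policy tables; adjust them to your own organization.
# Business groups that keep data export permission (an assumption).
EXPORT_GROUPS = {"management", "security"}

# Source project -> projects it may send data to (hypothetical names).
ALLOWED_FLOWS = {
    "dev_project": {"prod_project"},
    "prod_project": set(),
}

# One account per member, each mapped to exactly one business group.
MEMBER_GROUP = {
    "alice": "algorithm",
    "bob": "security",
}


def can_export(member):
    """A member may export only if their business group is on the export list."""
    return MEMBER_GROUP.get(member) in EXPORT_GROUPS


def can_flow(source, target):
    """Data may flow only to targets explicitly listed for the source project."""
    return target in ALLOWED_FLOWS.get(source, set())


print(can_export("alice"), can_export("bob"))
print(can_flow("dev_project", "prod_project"), can_flow("prod_project", "dev_project"))
```

Both checks deny by default: an unknown member or an unlisted project pair is rejected, which matches the recommendation to allow data to flow only to specified projects.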