All Products
Search
Document Center

MaxCompute:Schema-related operations

Last Updated:Mar 08, 2024

MaxCompute provides the schema feature. You can use schemas to classify tables, resources, and functions in projects. This topic describes the concepts and permissions that are related to schemas. This topic also describes how to use schemas.

Background information

A MaxCompute project is a basic organizational unit of MaxCompute and is used for isolation and access control of multiple users. A project contains objects, such as tables, resources, and functions. Before MaxCompute provides the schema feature, the objects are directly placed in projects. Projects serve as databases or schemas based on the hierarchy of traditional databases. This causes operation inconvenience for users, especially when a large number of tables or other objects exist. To resolve this issue, MaxCompute provides the schema feature. You can use schemas to classify tables, resources, and functions in projects. The following figure shows the hierarchy of schemas.层级图

The schema feature of MaxCompute is in public preview. If you want to use this feature, submit an application to test this feature and provide the ID of your Alibaba Cloud account and the name of your project in the application.

Note
  • After you submit an application, you need to wait for your project to be updated. Project updates are performed on every Monday and Thursday. After your project is updated, a schema named DEFAULT is created in your project and the schema feature for the tenant that has the ID you specified in the application is enabled. You can also create and use a schema in the project.

  • If you do not create a project, you can perform the following operations to enable the schema feature for all projects that you create later: Log on to the MaxCompute console and select a region. In the left-side navigation pane, choose Tenants > Tenant Property. On the Tenants page, turn on Tenant-level Information Schema Syntax.

Key terms

  • Schema

    A schema is created in a project. You can use schemas to classify objects in projects, such as tables, resources, and user-defined functions (UDFs). You can create multiple schemas in a project.

  • Parameter used to enable or disable the schema feature

    If you enable the schema feature, you must configure set odps.namespace.schema=true|false; to determine the SQL semantics of project.schema.table. You can enable the schema feature for a tenant or a job.

    • If no project exists in the current tenant, you can perform the following operations to enable the schema feature for the tenant: Log on to the MaxCompute console and select a region. In the left-side navigation pane, choose Tenants > Tenant Property. On the Tenants page, turn on Tenant-level Information Schema Syntax. If specific projects exist in the current tenant, you cannot change the setting of the schema feature.

    • You can configure set odps.namespace.schema=true; to enable the schema feature for a job. The schema feature identifies information in the a.b.c format in SQL statements as project.schema.table and identifies information in the a.b format in SQL statements as schema.table. You can run the set odps.namespace.schema=false; command to disable the schema feature. This way, in an SQL statement, information in the a.b.c format cannot be identified, and information in the a.b format is identified as project.table.

  • Default schema

    A schema named DEFAULT exists in each project for which the schema feature is enabled. This schema cannot be modified or deleted.

Limits

The schema feature of MaxCompute is in public preview. Specific modules do not support this feature. The operations that are initiated from these modules on MaxCompute schemas can be performed only if you configure set odps.namespace.schema=false. Take note of the following limits:

  • You can run a MaxCompute Spark job to configure custom schemas only in the regions in the Chinese mainland. You can run a MaxCompute Spark job to configure custom schemas in regions other than the regions in the Chinese mainland in the future.

  • MaxCompute Mars and MapReduce jobs do not support projects for which the schema feature is enabled. If you enable the schema feature for your project, you cannot run Mars and MapReduce jobs in your project.

  • Alibaba Cloud services such as Platform for AI (PAI) and Quick BI do not support custom schemas.

  • The schema feature of MaxCompute is supported for MaxCompute Studio V4.0.0 or later.

  • Views that are created based on the configuration odps.namespace.schema=false can be accessed only when the schema feature is disabled. Similarly, views that are created based on the configuration odps.namespace.schema=true can be accessed only when the schema feature is enabled.

  • The schema feature of MaxCompute is supported for Hologres V1.3 or later. For more information about how to upgrade a Hologres instance, see Instance upgrades.

  • MaxCompute SDK 0.40.8 for Java or later, Java Database Connectivity (JDBC) 3.3.2 or later, MaxCompute client V0.40.8 or later, and PyODPS 0.11.3.1 or later support the schema feature of MaxCompute.

  • DataWorks provides custom schemas only if the schema feature is enabled for a tenant. If the schema feature is disabled for a tenant but custom schemas are used in specific projects under the tenant, you cannot manage custom schemas in the DataWorks console. In this case, you can use scripts on Data Integration to specify objects in the custom schemas. For more information about the scenarios in which DataWorks supports the schema feature, see Use the MaxCompute schema feature in DataWorks.

Usage notes

This section describes how to use schemas.

  • Create a project.

    Before you can use schemas, you must create a project. For more information, see Create a MaxCompute project. If you set odps.namespace.schema to true when you create a project in the MaxCompute console, the project that you create supports the schema feature by default. If you set odps.namespace.schema to false, you must submit an application to enable your project to support the schema feature.

    Note

    After you submit an application, you need to wait for your project to be updated. Project updates are performed on every Monday and Thursday.

  • Manage schemas.

    Manage schemas by using SQL statements

    Use the following SQL statements to manage schemas.

    • View schemas.

      show schemas;
    • Create a schema.

      create schema <schema_name>;

      schema_name specifies the name of the custom schema.

    • View schema information.

      desc schema <schema_name>;

      schema_name specifies the name of the custom schema.

    • Drop a schema.

      drop schema <schema_name>;

      schema_name specifies the name of the custom schema.

    Manage schemas in the MaxCompute console

    1. Log on to the MaxCompute console. In the top navigation bar, select a region.

    2. In the left-side navigation pane, choose Workspace > Projects.

    3. On the Projects page, find the desired project and click Manage in the Actions column.

    4. On the Projects page, click Schema.

      Note

      The Schema tab appears only for the projects that support the schema feature.

    5. On the Schema tab, view schemas, create a schema, and drop a schema.

  • Perform operations on an object in a schema.

    If you want to perform operations on an object such as a table, view, resource, or function in a schema, specify the object in the projct.schema.table format in SQL statements.

    • When you perform operations on a table in a schema, take note of the following items:

      Note

      All descriptions of tables in this topic also apply to views, resources, and functions.

      • If you want to perform operations on a table across projects, you must specify the table in the project.schema.table format in SQL statements.

      • If you want to perform operations on an object across schemas in a project, you can specify the object in the schema.table format in SQL statements. In this case, a in the a.b format is parsed as a schema and b in the a.b format is parsed as a table. The project is the current project.

      • In the same project, you can execute the use schema <schema_name> statement to specify the current schema. You can use an SQL statement that is similar to select * from a. In this case, a in the statement indicates a table and the current project and the specified schema are automatically obtained.

      • If no schema is specified in the context, a in the SQL statement that is similar to select * from a indicates a table and the schema named DEFAULT and the current project are automatically obtained.

    • Examples

      • Perform operations on an object in a specified schema in projectA.

        • Perform operations on an object in the DEFAULT schema.

          use projectA;
          set odps.namespace.schema=true;-- If you configure this setting at the tenant level, skip this configuration.
          -- Perform operations on the t_a table.
          create table t_a(c1 string,c2 bigint);
          insert into/overwrite table t_a values ('a',1),('b',2),('c',3);
          select * from t_a;
          show tables;
          desc t_a;
          
          tunnel upload <path> t_a[/<pt_spc>];
          tunnel download t_a[/pt_spc] <path>;
          
          -- Perform operations on the res_a.jar resource.
          add jar <path>/res_a.jar ;
          desc resource res_a.jar;
          list resources;
          get resource res_a.jar D:\;
          drop resource res_a.jar;
          
          -- Perform operations on the fun_a function.
           create function fun_a as 'xx' using 'res_a.jar';
           desc function fun_a;
           list functions;
           drop function fun_a;

          path: the path and name of the file. pt_spc: the partitions that are specified in a partitioned table. You must specify the lowest-level partition. The value of this parameter is in the pt_spcpartition_col1=col1_value1, partition_col2=col2_value1...pt_spc format.

        • Perform operations on objects in the custom schemas s_1 and s_2, including cross-schema operations.

          use projectA;
          set odps.namespace.schema=true;-- If you configure this setting at the tenant level, skip this configuration.
          
          -- Perform operations on the t_c table in the s_1 schema.
          use schema s_1;
          create table t_c(c1 string,c2 bigint);
          insert into/overwrite table t_c values ('a',1),('b',2),('c',3);
          select * from t_c;
          show tables;
          drop table t_c;
          
          
          tunnel upload <path> t_c[/<pt_spc>];
          tunnel download t_c[/pt_spc] <path>;
          
          -- Perform operations on the t_d table in the s_2 schema.
          creata table s_2.t_d(c1 string,c2 bigint);
          insert into/overwrite table s_2.t_d values ('a',1),('b',2),('c',3);
          select * from s_2.t_d;
          show tables in s_2;
          drop table s_2.t_d;
          
          tunnel upload <path> s_2.t_d[/<pt_spc>];
          tunnel download s_2.t_d[/pt_spc] <path>;
          
          
          -- Perform operations on the res_b.jar resource in the s_1 schema.
          use schema s_1;
          add jar <path>/res_b.jar  ;
          desc resource res_b.jar;
          list resources;
          get resource res_b.jar D:\;
          drop resource res_b.jar;
          
          -- Perform operations on the res_c.jar resource in the s_2 schema.
          add jar xxx ;-- You can add resources only to the current schema or project. You cannot perform this operation across schemas or projects. To perform this operation, you must switch to the s_2 schema.
          -- When you perform operations on resources across schemas or projects, separate resource levels with colons (:).
          desc resource s_2:res_c.jar;
          list resources in s_2;
          get resource s_2:res_c.jar D:\;
          drop resource s_2:res_c.jar;
          
          -- Perform operations on the fun_b function in the s_1 schema.
           use schema s_1;
           create function fun_b as 'xx' using 'res_b.jar'
           desc function fun_b;
           list functions;
           drop function fun_b;
          
           -- Perform operations on the fun_c function in the s_2 schema.
           create function s_2.fun_c as 'xx' using 's_2/resources/res_c.jar'
           drop function s_2.fun_c;
           desc function s_2.fun_c;
           list functions in s_2;
           drop function s_2.fun_c;
      • Perform cross-project operations. For example, perform operations on objects of projectB in projectA.

        use projectA;
        set odps.namespace.schema=true;  -- If you configure this setting at the tenant level, skip this configuration.
        
        -- Perform operations on the t_f table of the s_3 schema in projectB.
        create table projectB.s_3.t_f(c1 string,c2 bigint);
        insert into/overwrite table projectB.s_3.t_f values ('a',1),('b',2),('c',3);
        select * from projectB.s_3.t_f;
        show tables in projectB.s_3;
        desc projectB.s_3.t_f;
        drop table projectB.s_3.t_f;
        
        tunnel upload <path> projectB.s_3.t_f[/<pt_spc>];
        tunnel download projectB.s_3.t_f[/pt_spc] <path>;
        
        -- Perform operations on the res_f.jar resource of the s_3 schema in projectB.
        add jar xxx ;-- You can add resources only to the current schema or project. You cannot perform this operation across schemas or projects. To perform this operation, you must switch to the s_3 schema in projectB.
        -- When you perform operations on resources across schemas or projects, separate resource levels with colons (:).
        desc resource projectB:s_3:res_f.jar;
        list resources in projectB.s_3;
        get resource projectB:s_3:res_f.jar D:\;
        drop resource projectB:s_3:res_f.jar;
        
        -- Perform operations on the fun_f function of the s_3 schema in projectB.
         create function projectB.s_3.fun_f as 'xx' using 'projectB/schemas/s_3/resources/res_f.jar'
         desc function projectB.s_3.fun_f;
         list functions in projectB.s_3;
         drop function projectB.s_3.fun_f;
                                        

Permissions

  • Permissions on schemas

    Some permissions, such as CreateTable, CreateResource, and CreateFunction, cannot be granted at the schema level. Instead, the permissions can be granted only at the project level. After you grant a user the CreateTable, CreateResource, or CreateFunction permission on a project, the user can perform the related operations in all schemas of the project. Schema-level management of these permissions will be available in the future.

    Note
    • By default, the owner of a schema has all access permissions on the schema and the objects in the schema and can manage the access permissions of other users on the schema and the objects in the schema.

    • If you have the CreateTable, CreateResource, and CreateFunction permissions on a project, you also have these permissions on the schemas of the project.

  • Permissions on objects in a schema

    When you perform ACL-based access control on an object in a schema, you must specify the object in the project.schema.table format in SQL statements. The following sample code shows the syntax for authorization. For more information about the permissions on objects such as tables, resources, and functions in a schema, see MaxCompute permissions. You can also grant permissions to users in the new MaxCompute console. For more information, see Manage user permissions in the MaxCompute console (new version).

    -- Grant a role the permissions to manage all tables in a schema.
    GRANT schemaObjectPrivileges ON TABLE <project_name>.<schema_name>.* TO role {rolename};
    -- Revoke the permissions to manage all tables in a schema from a role.
    REVOKE schemaObjectPrivileges ON TABLE <project_name>.<schema_name>.* FROM role {rolename};
    -- Grant a role or a user the permissions to manage a table in a schema.
    GRANT schemaObjectPrivileges ON TABLE <project_name>.<schema_name>.<tablename> TO {role|user} {rolename | USER name};-- Grant a role or a user the permissions to manage a table in a schema.
    -- Revoke the permissions to manage a table in a schema from a role or a user.
    REVOKE schemaObjectPrivileges ON TABLE <project_name>.<schema_name>.<tablename> FROM {role|user} {rolename | USER name};
    -- View the permissions on a table.
    SHOW GRANTS ON TABLE <project_name>.<schema_name>.<tablename>;
    Note

    The GRANT schemaObjectPrivileges ON TABLE <project_name>.<schema_name>.xxx* TO role {rolename}; syntax is not supported.