Schemas let you classify tables, resources, and functions within a MaxCompute project into finer-grained namespaces, giving you more granular control over data organization and access within a single project.
How schemas work
In MaxCompute, a schema sits between a project and its objects. The object hierarchy is: project → schema → tables/resources/functions.
After you enable the schema feature, a MaxCompute entry object operates on objects in the schema named DEFAULT unless you specify otherwise. To work with objects in a different schema, specify the target schema when you initialize the entry object or when you call an operation method.
Prerequisites
Before you begin, ensure that you have:
- Enabled the schema feature for your MaxCompute project. For details, see Schema-related operations.
- Set up a runtime environment. PyODPS runs on a PyODPS node in DataWorks or on an on-premises machine:
  - DataWorks: Create a PyODPS 2 node or a PyODPS 3 node. For details, see Use PyODPS in DataWorks.
  - On-premises machine: Install PyODPS and initialize the MaxCompute entry object.
Create a schema
schema = o.create_schema("schema_name")
print(schema)
Delete a schema
# Deleting a schema does not return a schema object to keep, so no assignment is needed.
o.delete_schema("schema_name")
List schemas
To list all schemas in the current project:
for schema in o.list_schemas():
    print(schema)
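Listing and creating can be combined into an idempotent "create if missing" helper. The following is a minimal sketch, not part of the PyODPS API: `ensure_schema` is a hypothetical name, and the code assumes that each object yielded by `list_schemas()` exposes a `name` attribute and that names can be compared as exact strings.

```python
def ensure_schema(o, schema_name):
    """Create schema_name if it is not already listed.

    Returns True if the schema was created, False if it already existed.
    `o` is an initialized MaxCompute entry object.
    """
    # Assumption: each listed schema object has a `name` attribute.
    existing = {schema.name for schema in o.list_schemas()}
    if schema_name in existing:
        return False
    o.create_schema(schema_name)
    return True
```

Calling `ensure_schema(o, "schema_name")` twice in a row should create the schema on the first call and do nothing on the second.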
Work with objects in a specific schema
Initialize the entry object for a schema
To operate on objects in a schema other than DEFAULT, pass the schema parameter when initializing the ODPS entry object.
import os
from odps import ODPS
# Store your credentials in environment variables — do not hardcode AccessKey ID or AccessKey Secret.
o = ODPS(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='<your-default-project>',
    endpoint='<your-end-point>',
    schema='<your-schema-name>',
)
Replace the following placeholders with your actual values:
| Placeholder | Description |
|---|---|
| <your-default-project> | The name of your MaxCompute project. |
| <your-end-point> | The endpoint of your MaxCompute project, based on the region and network connection method you selected. For details, see Endpoints. An invalid endpoint causes an error when you access MaxCompute. |
| <your-schema-name> | The name of the schema. |
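Because `os.getenv` returns `None` when a variable is unset, a missing credential may only surface later as an authentication error. The following sketch checks the environment first and builds the keyword arguments for the entry object. The helper name `odps_kwargs_from_env` is hypothetical, and the keyword names `access_id` and `secret_access_key` are an assumption about the `ODPS` constructor that you should verify against your installed PyODPS version.

```python
import os

def odps_kwargs_from_env(project, endpoint, schema=None):
    """Build keyword arguments for ODPS(...), failing early on missing credentials."""
    required = ("ALIBABA_CLOUD_ACCESS_KEY_ID", "ALIBABA_CLOUD_ACCESS_KEY_SECRET")
    missing = [name for name in required if not os.getenv(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    kwargs = {
        # Assumption: these keyword names match the ODPS constructor.
        "access_id": os.environ["ALIBABA_CLOUD_ACCESS_KEY_ID"],
        "secret_access_key": os.environ["ALIBABA_CLOUD_ACCESS_KEY_SECRET"],
        "project": project,
        "endpoint": endpoint,
    }
    # Only bind a schema when one is requested; otherwise DEFAULT applies.
    if schema is not None:
        kwargs["schema"] = schema
    return kwargs
```

With PyODPS installed, the result could then be passed as `o = ODPS(**odps_kwargs_from_env('<your-default-project>', '<your-end-point>', schema='<your-schema-name>'))`.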
Specify a schema for individual operations
Pass schema directly to operation methods to target a specific schema without rebinding the entry object.
# List all tables in the schema named schema_name.
for table in o.list_tables(schema='schema_name'):
    print(table)
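Since the schema is chosen per call, one entry object can inspect several schemas in turn. The sketch below illustrates this with a hypothetical helper, `table_names_by_schema`. It relies only on `list_tables(schema=...)` as shown above and assumes each returned table object exposes a `name` attribute.

```python
def table_names_by_schema(o, schema_names):
    """Map each schema name to the names of the tables listed in it.

    `o` is an initialized MaxCompute entry object; it is not rebound
    to any schema, because the schema is passed on each call.
    """
    result = {}
    for schema_name in schema_names:
        # Assumption: each table object has a `name` attribute.
        result[schema_name] = [
            table.name for table in o.list_tables(schema=schema_name)
        ]
    return result
```

For example, `table_names_by_schema(o, ["schema_a", "schema_b"])` would return a dictionary with one list of table names per schema, where `schema_a` and `schema_b` are placeholder names.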
Specify a default schema for SQL statements
Pass default_schema to execute_sql to set the schema context for a single SQL statement.
o.execute_sql("SELECT * FROM dual", default_schema="schema_name")
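To read the result of such a statement, the returned instance can be opened with a reader. The following is a minimal sketch under stated assumptions: `fetch_rows` is a hypothetical helper, the instance is assumed to provide `open_reader()` as a context manager, and each record is assumed to expose its column values through a `values` attribute.

```python
def fetch_rows(o, sql, schema_name):
    """Run one SQL statement with schema_name as its default schema
    and return the result rows as plain lists."""
    # default_schema applies to this statement only; the entry object
    # itself stays bound to whatever schema it was initialized with.
    instance = o.execute_sql(sql, default_schema=schema_name)
    # Assumption: open_reader() works as a context manager and yields
    # records that carry their column values in `values`.
    with instance.open_reader() as reader:
        return [list(record.values) for record in reader]
```

A call such as `fetch_rows(o, "SELECT * FROM dual", "schema_name")` would resolve the unqualified table name `dual` within `schema_name`.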