Data Lakehouse Solution 1.0 (external projects) is no longer being developed and will be discontinued. If you use MaxCompute to access federated data sources, upgrade to Data Lakehouse Solution 2.0, which replaces external projects with external schemas and introduces a unified Foreign Server for connection management.
This topic explains what changes between the two solutions, how enabling schema modes affects existing jobs, and what SQL modifications each scenario requires.
What changes in Data Lakehouse Solution 2.0
Data Lakehouse Solution 1.0 supports two federation patterns using external projects:
-
Data Lake Formation (DLF) + Object Storage Service (OSS): The external project stores connection information directly. MaxCompute connects to DLF metadata and OSS data using external project properties.
-
Hive Metastore Service (HMS) + Hadoop Distributed File System (HDFS): A Foreign Server object stores connection information. MaxCompute uses this external data source to read Hive metadata and data.
In both cases, the project level maps to a database in DLF or Hive, and tables are stored directly in the project (Project.Table).
Data Lakehouse Solution 2.0 introduces a Project.Schema.Table structure and replaces the per-project connection model with a unified Foreign Server and external schema.
| Data Lakehouse Solution 1.0 | Data Lakehouse Solution 2.0 | |
|---|---|---|
| External data source | Connection info varies across federation scenarios. No shared abstraction for tenant-level access control. | A single unified Foreign Server separates tenant-level permissions from data-level permissions and simplifies sharing. |
| Project model | Tables stored directly in projects (Project.Table). Each database in a data source requires a separate external project. |
Projects support a schema layer (Project.Schema.Table). One project can map to multiple databases or data sources. |
| Computation | External project depends on a separate data warehouse project to run jobs, increasing cross-project authorization complexity. | External schemas use the computing resources of the data warehouse project they belong to. |
Choose a migration path
Before enabling schema modes, decide which project will host the external schema.
| Scenario | Recommended path |
|---|---|
Enabling schema mode for project p1 (which runs existing jobs) is acceptable |
Use case 1: Create the external schema in the same project |
Enabling schema mode for p1 would disrupt too many existing jobs |
Use case 2: Create the external schema in a separate project |
Enabling the schema mode for project-level metadata is irreversible. Once enabled, the project structure changes from Project.Table to Project.Schema.Table and cannot be reverted. This may break existing jobs that reference tables using the p.t syntax when the schema mode for SQL syntax is also enabled. Test at the session level before committing to tenant-level changes.
Schema modes
MaxCompute provides two schema switches that work together:
-
Schema mode for project-level metadata: Changes the project structure to
Project.Schema.Tableand creates a built-in Default schema. All existing internal tables move to the Default schema. The Default schema cannot be deleted. Irreversible once enabled. -
Schema mode for SQL syntax: Controls how SQL statements parse table references (for example, whether
p.tis read asproject.tableorschema.table). Can be enabled at session level or tenant level.
Enable the schema mode for project-level metadata
-
Log on to the MaxCompute console and select a region in the upper-left corner.MaxCompute console
-
In the left navigation pane, choose Manage Configurations > Projects.
-
On the Projects page, find the target project and click Enable Schema in the Actions column.
After enabling, a Default schema is created automatically. All internal tables in the project belong to this schema. To access custom schemas, also enable the schema mode for SQL syntax.
Enable the schema mode for SQL syntax
The schema mode for SQL syntax can be enabled at two levels:
Session level — Affects only the current session. Takes precedence over tenant-level settings. Use this to test compatibility before committing to a broader change. You can enable or disable the schema mode for SQL syntax at the session level.
SET odps.namespace.schema= true | false;
Tenant level — Applies to all projects and all jobs under the tenant. New projects support the schema mode for project-level metadata by default. Cannot be disabled after enabling.
To enable at the tenant level:
-
Log on to the MaxCompute console and select a region in the upper-left corner.MaxCompute console
-
In the left navigation pane, choose Manage Configurations > Tenants.
-
On the Tenants page, click the Tenant Property tab.
-
If no projects exist under the current tenant: turn on Tenant-level Schema Syntax.
-
If projects exist under the current tenant: do not modify this setting.
-
For more information, see Tenant properties.
Compatibility reference
Compatibility errors occur when the two schema modes are in different states. The tables below show how each combination affects existing and new jobs, using a data warehouse project p with table t as the baseline.
Metadata mode disabled, SQL syntax mode enabled
| Job type | SQL expression | Parsed as | Compatible? | Action |
|---|---|---|---|---|
| Existing | FROM t |
current_project(p).table(t) |
Yes | None |
| Existing | FROM p.t |
current_project.schema(p).table(t) — schema p does not exist |
No | Change to FROM p.default.t |
| New | p.default.t |
project(p).schema(default).table(t) |
Yes | None |
| New | default.t |
current_project.schema(default).table(t) |
Yes | None |
| New | t |
current_project.current_schema.table(t) |
Yes | None |
| New | p.s.t |
project(p).schema(s).table(t) — error if schema s doesn't exist |
Conditional | Ensure schema s exists |
Metadata mode enabled, SQL syntax mode disabled
| Job type | SQL expression | Parsed as | Compatible? | Action |
|---|---|---|---|---|
| Existing | FROM t |
current_project(p).schema(default).table(t) |
Yes | None |
| Existing | FROM p.t |
project(p).schema(default).table(t) — Default schema added automatically |
Yes | None |
| New | p.t |
project(p).schema(default).table(t) |
Yes | None |
| New | t |
project.schema(default).table(t) |
Yes | None |
| Note | — | Custom schemas are not accessible in this state. USE SCHEMA <schema_name> is not supported. |
— | Enable SQL syntax mode to access custom schemas |
Metadata mode enabled, SQL syntax mode enabled
| Job type | SQL expression | Parsed as | Compatible? | Action |
|---|---|---|---|---|
| Existing | FROM t |
current_project(p).schema(default).table(t) |
Yes | None |
| Existing | FROM p.t |
current_project.schema(p).table(t) — schema p does not exist |
No | Change to FROM p.default.t |
| New | Any | Follows full Project.Schema.Table syntax rules |
Yes | None |
Migration guide
Prerequisites
Before you begin, ensure that you have:
-
A MaxCompute project to serve as the data warehouse (to run compute jobs)
-
Access to the MaxCompute console with project management permissions
-
Reviewed the compatibility reference and identified any existing jobs that require SQL updates
For details on creating a Foreign Server and an external schema, see the Data Lakehouse Solution 2.0 user guide. For schema concepts and operations, see Schema and Schema operations.
Use case 1: Create an external schema in the project that runs existing jobs
Use this path when enabling schema mode for the existing job project (p1) is acceptable.
Scenario: Project p1 runs existing jobs. Queries target external_project(e).table(t). After migration, the same data is accessible as project(p1).external_schema(e).table(t).
Step 1: Enable the schema mode for project-level metadata on `p1`.
After enabling, the Default schema is created automatically. All existing internal tables in p1 belong to p1.default.
Step 2: Create an external schema in `p1` with the same name as the external project (`e`).
Set the external schema name to match the external project name. This allows existing queries that reference e to resolve correctly after you enable the SQL syntax schema mode.
Step 3: Enable the schema mode for SQL syntax at the session level and test existing jobs.
After enabling the SQL syntax schema mode:
-
SELECT * FROM e.t;—e.tis parsed asexternal_schema(e).table(t). The query runs inp1without modification. -
SELECT * FROM e.t1 JOIN p1.t2;—p1.t2is now parsed asschema(p1).table(t2), notproject(p1).table(t2). Update this statement toSELECT * FROM e.t1 JOIN p1.default.t2;.
Step 4: After verifying jobs run correctly, enable the schema mode for SQL syntax at the tenant level (optional).
Use case 2: Create an external schema in a separate project
Use this path when enabling schema mode for p1 would break too many existing jobs, or when you want to keep p1 unchanged during migration.
Scenario: Project p1 runs existing jobs. Queries target external_project(e).table(t). You create (or select) a separate project p2 to host the external schema. After migration, the federated data is accessible as project(p2).external_schema(e).table(t).
Step 1: Enable the schema mode for project-level metadata on `p2`.
p1 is not affected. Existing jobs in p1 continue to run normally.
Step 2: Create an external schema in `p2` with the same name as the external project (`e`).
Step 3: Enable the schema mode for SQL syntax and update SQL statements in `p1`.
After enabling the SQL syntax schema mode:
-
SELECT * FROM e.t;inp1— update toSELECT * FROM p2.e.t;, sinceenow resolves to a schema insidep1, not the external project. -
Running
SELECT * FROM e.t;directly inp2— no modification required. -
Cross-project join in
p1:SELECT * FROM e.t1 JOIN p1.t2;— update toSELECT * FROM p2.e.t1 JOIN p1.default.t2;.
What's next
-
Data Lakehouse Solution 2.0 user guide — create Foreign Servers and external schemas for DLF+OSS and HMS+HDFS sources
-
Schema — concepts and structure of schemas in MaxCompute
-
Schema operations — create, manage, and delete schemas
-
Tenant properties — configure tenant-level schema syntax settings