Migrate External Projects to Data Lakehouse 2.0 Schemas Seamlessly - MaxCompute

What changes in Data Lakehouse Solution 2.0

Data Lakehouse Solution 1.0 supports two federation patterns using external projects:

Data Lake Formation (DLF) + Object Storage Service (OSS): The external project stores connection information directly. MaxCompute connects to DLF metadata and OSS data using external project properties.
Hive Metastore Service (HMS) + Hadoop Distributed File System (HDFS): A Foreign Server object stores connection information. MaxCompute uses this external data source to read Hive metadata and data.

In both cases, the project level maps to a database in DLF or Hive, and tables are stored directly in the project (Project.Table).

Data Lakehouse Solution 2.0 introduces a Project.Schema.Table structure and replaces the per-project connection model with a unified Foreign Server and external schema.

	Data Lakehouse Solution 1.0	Data Lakehouse Solution 2.0
External data source	Connection info varies across federation scenarios. No shared abstraction for tenant-level access control.	A single unified Foreign Server separates tenant-level permissions from data-level permissions and simplifies sharing.
Project model	Tables stored directly in projects (`Project.Table`). Each database in a data source requires a separate external project.	Projects support a schema layer (`Project.Schema.Table`). One project can map to multiple databases or data sources.
Computation	External project depends on a separate data warehouse project to run jobs, increasing cross-project authorization complexity.	External schemas use the computing resources of the data warehouse project they belong to.

Choose a migration path

Before enabling schema modes, decide which project will host the external schema.

Scenario	Recommended path
Enabling schema mode for project `p1` (which runs existing jobs) is acceptable	Use case 1: Create the external schema in the same project
Enabling schema mode for `p1` would disrupt too many existing jobs	Use case 2: Create the external schema in a separate project

Important

Enabling the schema mode for project-level metadata is irreversible. Once enabled, the project structure changes from Project.Table to Project.Schema.Table and cannot be reverted. This may break existing jobs that reference tables using the p.t syntax when the schema mode for SQL syntax is also enabled. Test at the session level before committing to tenant-level changes.

Schema modes

MaxCompute provides two schema switches that work together:

Schema mode for project-level metadata: Changes the project structure to Project.Schema.Table and creates a built-in Default schema. All existing internal tables move to the Default schema. The Default schema cannot be deleted. Irreversible once enabled.
Schema mode for SQL syntax: Controls how SQL statements parse table references (for example, whether p.t is read as project.table or schema.table). Can be enabled at session level or tenant level.

Enable the schema mode for project-level metadata

Log on to the MaxCompute console and select a region in the upper-left corner.MaxCompute console
In the left navigation pane, choose Manage Configurations > Projects.
On the Projects page, find the target project and click Enable Schema in the Actions column.

After enabling, a Default schema is created automatically. All internal tables in the project belong to this schema. To access custom schemas, also enable the schema mode for SQL syntax.

Enable the schema mode for SQL syntax

The schema mode for SQL syntax can be enabled at two levels:

Session level — Affects only the current session. Takes precedence over tenant-level settings. Use this to test compatibility before committing to a broader change. You can enable or disable the schema mode for SQL syntax at the session level.

SET odps.namespace.schema= true | false;

Tenant level — Applies to all projects and all jobs under the tenant. New projects support the schema mode for project-level metadata by default. Cannot be disabled after enabling.

To enable at the tenant level:

Log on to the MaxCompute console and select a region in the upper-left corner.MaxCompute console
In the left navigation pane, choose Manage Configurations > Tenants.
On the Tenants page, click the Tenant Property tab.
- If no projects exist under the current tenant: turn on Tenant-level Schema Syntax.
- If projects exist under the current tenant: do not modify this setting.

For more information, see Tenant properties.

Compatibility reference

Compatibility errors occur when the two schema modes are in different states. The tables below show how each combination affects existing and new jobs, using a data warehouse project p with table t as the baseline.

Metadata mode disabled, SQL syntax mode enabled

Job type	SQL expression	Parsed as	Compatible?	Action
Existing	`FROM t`	`current_project(p).table(t)`	Yes	None
Existing	`FROM p.t`	`current_project.schema(p).table(t)` — schema `p` does not exist	No	Change to `FROM p.default.t`
New	`p.default.t`	`project(p).schema(default).table(t)`	Yes	None
New	`default.t`	`current_project.schema(default).table(t)`	Yes	None
New	`t`	`current_project.current_schema.table(t)`	Yes	None
New	`p.s.t`	`project(p).schema(s).table(t)` — error if schema `s` doesn't exist	Conditional	Ensure schema `s` exists

Metadata mode enabled, SQL syntax mode disabled

Job type	SQL expression	Parsed as	Compatible?	Action
Existing	`FROM t`	`current_project(p).schema(default).table(t)`	Yes	None
Existing	`FROM p.t`	`project(p).schema(default).table(t)` — Default schema added automatically	Yes	None
New	`p.t`	`project(p).schema(default).table(t)`	Yes	None
New	`t`	`project.schema(default).table(t)`	Yes	None
Note	—	Custom schemas are not accessible in this state. `USE SCHEMA <schema_name>` is not supported.	—	Enable SQL syntax mode to access custom schemas

Metadata mode enabled, SQL syntax mode enabled

Job type	SQL expression	Parsed as	Compatible?	Action
Existing	`FROM t`	`current_project(p).schema(default).table(t)`	Yes	None
Existing	`FROM p.t`	`current_project.schema(p).table(t)` — schema `p` does not exist	No	Change to `FROM p.default.t`
New	Any	Follows full `Project.Schema.Table` syntax rules	Yes	None

Migration guide

Prerequisites

Before you begin, ensure that you have:

A MaxCompute project to serve as the data warehouse (to run compute jobs)
Access to the MaxCompute console with project management permissions
Reviewed the compatibility reference and identified any existing jobs that require SQL updates

For details on creating a Foreign Server and an external schema, see the Data Lakehouse Solution 2.0 user guide. For schema concepts and operations, see Schema and Schema operations.

Use case 1: Create an external schema in the project that runs existing jobs

Use this path when enabling schema mode for the existing job project (p1) is acceptable.

Scenario: Project p1 runs existing jobs. Queries target external_project(e).table(t). After migration, the same data is accessible as project(p1).external_schema(e).table(t).

Step 1: Enable the schema mode for project-level metadata on `p1`.

After enabling, the Default schema is created automatically. All existing internal tables in p1 belong to p1.default.

Step 2: Create an external schema in `p1` with the same name as the external project (`e`).

Set the external schema name to match the external project name. This allows existing queries that reference e to resolve correctly after you enable the SQL syntax schema mode.

Step 3: Enable the schema mode for SQL syntax at the session level and test existing jobs.

After enabling the SQL syntax schema mode:

SELECT * FROM e.t; — e.t is parsed as external_schema(e).table(t). The query runs in p1 without modification.
SELECT * FROM e.t1 JOIN p1.t2; — p1.t2 is now parsed as schema(p1).table(t2), not project(p1).table(t2). Update this statement to SELECT * FROM e.t1 JOIN p1.default.t2;.

Step 4: After verifying jobs run correctly, enable the schema mode for SQL syntax at the tenant level (optional).

Use case 2: Create an external schema in a separate project

Use this path when enabling schema mode for p1 would break too many existing jobs, or when you want to keep p1 unchanged during migration.

Scenario: Project p1 runs existing jobs. Queries target external_project(e).table(t). You create (or select) a separate project p2 to host the external schema. After migration, the federated data is accessible as project(p2).external_schema(e).table(t).

Step 1: Enable the schema mode for project-level metadata on `p2`.

p1 is not affected. Existing jobs in p1 continue to run normally.

Step 2: Create an external schema in `p2` with the same name as the external project (`e`).

Step 3: Enable the schema mode for SQL syntax and update SQL statements in `p1`.

After enabling the SQL syntax schema mode:

SELECT * FROM e.t; in p1 — update to SELECT * FROM p2.e.t;, since e now resolves to a schema inside p1, not the external project.
Running SELECT * FROM e.t; directly in p2 — no modification required.
Cross-project join in p1: SELECT * FROM e.t1 JOIN p1.t2; — update to SELECT * FROM p2.e.t1 JOIN p1.default.t2;.

What's next

Data Lakehouse Solution 2.0 user guide — create Foreign Servers and external schemas for DLF+OSS and HMS+HDFS sources
Schema — concepts and structure of schemas in MaxCompute
Schema operations — create, manage, and delete schemas
Tenant properties — configure tenant-level schema syntax settings