
AnalyticDB: Lake storage

Last Updated: Dec 01, 2025

AnalyticDB for MySQL offers the lake storage feature to simplify data lake construction, storage management, and optimization. Lake storage supports structured table data, such as Iceberg and Paimon tables, and unstructured file objects. This feature is ideal for scenarios that require both offline batch processing and real-time analytics. This topic describes how to create and modify a lake storage and view its data volume.

Important

The lake storage feature is in invitational preview. To use this feature, submit a ticket to contact technical support.

Prerequisites

An AnalyticDB for MySQL Enterprise Edition, Basic Edition, or Data Lakehouse Edition cluster is created.

Feature introduction

AnalyticDB for MySQL lake storage can store structured table data in formats such as Iceberg and Paimon, and unstructured file objects. With its fully managed architecture, lake storage automatically handles complex tasks such as data lake construction, metadata governance, and storage optimization. This reduces storage costs and improves query performance. The unified metadata management and service-based resource scheduling allow you to quickly complete data ingestion, storage optimization, and multi-engine collaborative computing without maintaining the underlying infrastructure.

Billing

Notes

  • An Alibaba Cloud account can create a maximum of five lake storages in the same region.

  • Storage usage statistics for a lake storage are updated with a delay. Therefore, the data volume is not visible immediately after data is written.

  • Before you delete a lake storage, make sure that all of its data has been deleted. Otherwise, the deletion fails.

  • After you create a lake storage, AnalyticDB for MySQL automatically creates a bucket in the OSS service account. This bucket is in the same region as the AnalyticDB for MySQL cluster and has the same name as the lake storage. You can view this bucket in the OSS console of your Alibaba Cloud account by adding a favorite path.

    View the OSS bucket

    1. Log on to the OSS console.

    2. In the navigation pane on the left, click the image button next to Favorite Paths.

    3. In the Add Favorite Path dialog box, configure the following parameters:

      Adding Method: Select Add from other authorized buckets.

      Region: Select the region where the AnalyticDB for MySQL cluster resides.

      Bucket: Enter the lake storage name.

  • The backup and recovery feature of AnalyticDB for MySQL does not support data in lake storage.

Create a lake storage

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.

  2. In the navigation pane on the left, choose Data Management > Lake Storages.

  3. In the upper-right corner of the page, click Create Lake Storage.

  4. In the Create Lake Storage dialog box that appears, click OK.

    Important

    The lake storage name is automatically generated by the system in the format adb-lake-RegionID-RandomString. The name is globally unique and cannot be changed after creation.

  5. (Optional) Modify the lake storage description.

    The lake storage name is automatically generated and cannot be modified. You can modify the description to distinguish between lake storages that are used in different business scenarios.

    1. Click the image button in the Lake Storage Description column.

    2. In the Change Lake Storage Description dialog box that appears, enter a description and click OK.

Use lake storage

A lake storage table is a fully managed, lake-format table provided by AnalyticDB for MySQL. It supports the Iceberg and Paimon data formats. After a lake storage table is created, the system automatically generates a unique path for it. The default path is oss://<LakeStorageName>/lakehouse/default/tables/<table_uuid>. For example, if the data of the lake storage table test_iceberg_tbl is stored in `adb-lake-cn-shanghai-6gml****`, the generated path is oss://adb-lake-cn-shanghai-6gml****/lakehouse/default/tables/b22cd225-528d-421c-a2****.

You can create, read from, and write to lake storage tables using the XIHE and Spark engines. You can also manage the permissions and lifecycles of lake storage tables in the same way that you manage regular internal tables. The following sections describe how to read from and write to lake storage tables:

Read from and write to a lake storage table using Spark SQL

Prerequisites

The kernel version of the cluster must be 3.2.3.0 or later.

Note

To view and update the minor version of an AnalyticDB for MySQL cluster, log on to the AnalyticDB for MySQL console and go to the Configuration Information section of the Cluster Information page.

Procedure

  1. Go to Data Development.

    1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.

    2. In the left-side navigation pane, choose Job Development > SQL Development.

    3. On the SQLConsole tab, select the Spark engine and a resource group (a job resource group or an interactive resource group with the Spark engine).

  2. Create an external database and an Iceberg lake storage table.

    1. Create a database.

      CREATE DATABASE adb_external_db_iceberg 
      WITH DBPROPERTIES ('adb_lake_bucket' = 'adb-lake-cn-shanghai-6gml****');

      The following table describes the parameter:

      Parameter

      Description

      adb_lake_bucket

      Specifies the storage location for the data of the lake storage table.

      If you specify a lake storage in the CREATE DATABASE statement, all tables in the database are stored in that lake storage. If you do not want all tables in the database to be stored in the lake storage, you can set this parameter when you create a table.

    2. Create an Iceberg external table.

      SET spark.adb.lakehouse.enabled=true;                      -- Enable lake storage
      CREATE TABLE adb_external_db_iceberg.test_iceberg_tbl (
        `id` int,
        `name` string,
        `age` int
      ) USING iceberg
      PARTITIONED BY (age)
      TBLPROPERTIES ( 'adb_lake_bucket' = 'adb-lake-cn-shanghai-6gml****' );

      The following table describes the parameter.

      Parameter

      Description

      adb_lake_bucket

      Specifies the storage location for the data of the lake storage table.

      • If you specified a lake storage when you created the database, all tables in the database are stored in that lake storage. You do not need to specify this parameter again when you create a table.

      • If you did not specify a lake storage when you created the database, you must explicitly specify this parameter. Otherwise, an error occurs when you create the table. After you specify the parameter, the table data is stored in the specified lake storage.

      • If you explicitly specify a lake storage both when you create the database and when you create the table, the table data is stored in the lake storage that is specified during table creation. Other tables in the database are stored in the lake storage that is specified in the CREATE DATABASE statement.

  3. Write data to the Iceberg table.

    SET spark.adb.lakehouse.enabled=true;                      -- Enable lake storage
    INSERT INTO adb_external_db_iceberg.test_iceberg_tbl VALUES (1, 'anna', 10), (2, 'jams', 20);
  4. (Optional) Delete data from the Iceberg table.

    SET spark.adb.lakehouse.enabled=true;                      -- Enable lake storage
    DELETE FROM adb_external_db_iceberg.test_iceberg_tbl WHERE id = 1;
    DELETE FROM adb_external_db_iceberg.test_iceberg_tbl WHERE age = 20;
  5. Query data from the Iceberg external table.

    SET spark.adb.lakehouse.enabled=true;                      -- Enable lake storage
    SELECT * FROM adb_external_db_iceberg.test_iceberg_tbl;

    If you did not perform the optional delete operations, the following result is returned:

    +---+----+---+
    |id |name|age|
    +---+----+---+
    |1  |anna|10 |
    |2  |jams|20 |
    +---+----+---+
  6. (Optional) Delete the Iceberg external table.

    The following statement deletes the Iceberg external table from AnalyticDB for MySQL and its data from OSS.

    SET spark.adb.lakehouse.enabled=true;
    DROP TABLE adb_external_db_iceberg.test_iceberg_tbl;
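
  Iceberg tables in Spark SQL generally also support row-level UPDATE statements in addition to the INSERT and DELETE statements shown above. The following sketch assumes that the same spark.adb.lakehouse.enabled switch applies and that your cluster's kernel version supports Iceberg UPDATE; verify this before relying on it:

    -- Sketch (assumption): row-level update on the Iceberg lake storage table.
    -- Iceberg tables in Spark SQL generally support UPDATE; confirm that your
    -- AnalyticDB for MySQL kernel version does as well.
    SET spark.adb.lakehouse.enabled=true;
    UPDATE adb_external_db_iceberg.test_iceberg_tbl
    SET name = 'anna'
    WHERE id = 2;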

Read from and write to a lake storage table using XIHE SQL

Prerequisites

The kernel version of the cluster must be 3.2.5.3 or later.

Note

To view and update the minor version of an AnalyticDB for MySQL cluster, log on to the AnalyticDB for MySQL console and go to the Configuration Information section of the Cluster Information page.

Notes

The DELETE statement cannot be used to delete a single row of data from a lake storage table. You can only delete the entire table.
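
If you need to delete individual rows from a lake storage table, you can use the Spark engine instead, which supports row-level DELETE as shown in the previous section. The following sketch uses the table names from the procedure below and assumes that the table is visible to both engines through the unified metadata:

    -- Run with the Spark engine, not XIHE (see the previous section).
    SET spark.adb.lakehouse.enabled=true;
    DELETE FROM test_db.test_iceberg_tbl WHERE id = 1;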

Procedure

  1. Go to Data Development.

    1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.

    2. In the left-side navigation pane, choose Job Development > SQL Development.

    3. On the SQLConsole tab, select the XIHE engine and an interactive resource group.

  2. Create an external database and an Iceberg external table.

    1. Create an external database.

      CREATE EXTERNAL DATABASE test_db 
      WITH DBPROPERTIES ('adb_lake_bucket' = 'adb-lake-cn-shanghai-6gml****');

      The following table describes the parameter.

      Parameter

      Description

      adb_lake_bucket

      Specifies the storage location for the data of the lake storage table.

      If you specify a lake storage in the CREATE DATABASE statement, all tables in the database are stored in that lake storage. If you do not want all tables in the database to be stored in the lake storage, you can set this parameter when you create a table.

    2. Create an Iceberg external table.

      CREATE TABLE test_db.test_iceberg_tbl (
        `id` int,
        `name` string
      ) PARTITIONED BY (age int)
      STORED AS ICEBERG
      TBLPROPERTIES (
        'catalog_type' = 'ADB', 
        'adb_lake_bucket' = 'adb-lake-cn-shanghai-6gml****'
      );

      The following table describes the parameters.

      Parameter

      Description

      catalog_type

      Specifies the catalog type. Set this parameter to ADB.

      adb_lake_bucket

      Specifies the storage location for the data of the lake storage table.

      • If you specified a lake storage when you created the database, all tables in the database are stored in that lake storage. You do not need to specify this parameter again when you create a table.

      • If you did not specify a lake storage when you created the database, you must explicitly specify this parameter. Otherwise, an error occurs when you create the table. After you specify the parameter, the table data is stored in the specified lake storage.

      • If you explicitly specify a lake storage both when you create the database and when you create the table, the table data is stored in the lake storage that is specified during table creation. Other tables in the database are stored in the lake storage that is specified in the CREATE DATABASE statement.

  3. Write data to the Iceberg table.

    INSERT INTO test_db.test_iceberg_tbl SELECT 1, 'anna', 10;
    INSERT INTO test_db.test_iceberg_tbl SELECT 2, 'jams', 20;
  4. Query data from the Iceberg external table.

    SELECT * FROM test_db.test_iceberg_tbl;

    The following result is returned:

    +---+----+---+
    |id |name|age|
    +---+----+---+
    |1  |anna|10 |
    |2  |jams|20 |
    +---+----+---+
  5. (Optional) Delete the Iceberg external table.

    The following statement deletes the Iceberg external table from AnalyticDB for MySQL and its data from OSS.

    DROP TABLE test_db.test_iceberg_tbl;
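
  The adb_lake_bucket precedence rules described above can be sketched as follows. The database and table names here are hypothetical, and the assumption that an external database can be created without adb_lake_bucket follows from the parameter description; verify this behavior on your cluster:

    -- Hypothetical names: demo_db, demo_iceberg_tbl.
    -- Create an external database that is not bound to a lake storage.
    CREATE EXTERNAL DATABASE demo_db;

    -- Because the database does not specify a lake storage, adb_lake_bucket
    -- must be set at table level; otherwise, table creation fails.
    CREATE TABLE demo_db.demo_iceberg_tbl (
      `id` int,
      `name` string
    ) PARTITIONED BY (age int)
    STORED AS ICEBERG
    TBLPROPERTIES (
      'catalog_type' = 'ADB',
      'adb_lake_bucket' = 'adb-lake-cn-shanghai-6gml****'
    );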

Query the data volume of a lake storage

  1. In the navigation pane on the left, choose Data Management > Lake Storages.

  2. In the Storage Size column of the target lake storage, view the data volume.

    Important

    Storage usage statistics for a lake storage are updated with a delay. Therefore, the data volume is not visible immediately after data is written.

Delete a lake storage

  1. In the navigation pane on the left, choose Data Management > Lake Storages.

  2. In the Actions column of the target lake storage, click Delete.

  3. In the Delete dialog box that appears, click OK.

    Important

    Before you delete a lake storage, make sure that all of its data has been deleted. Otherwise, the deletion fails.
