AnalyticDB: Read and write Iceberg external tables using Spark SQL

Last Updated: Nov 28, 2025

This topic describes how to use Spark SQL to read and write Iceberg external tables in AnalyticDB for MySQL.

Prerequisites

Step 1: Go to the SQL Development page

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Job Development > SQL Development.

  3. On the SQLConsole tab, select the Spark engine and a resource group (a job resource group or an interactive resource group with the Spark engine).

Step 2: Create an external database and an Iceberg external table

Note

You can execute the following SQL statements in batch or interactive mode.

Data Lake Storage tables

  1. Create a database.

    CREATE DATABASE adb_external_db_iceberg 
    WITH DBPROPERTIES ('adb_lake_bucket' = 'adb-lake-cn-shanghai-6gml****');

    The following parameter is used in the statement:

    Parameter: adb_lake_bucket

    Description: Specifies the storage location of the Data Lake Storage table data. If you specify this parameter in the CREATE DATABASE statement, all tables in the database are stored in the specified Data Lake Storage bucket. If you do not want all tables in the database to use the same storage location, you can omit this parameter here and set it when you create each table.

  2. Create an Iceberg external table.

    SET spark.adb.lakehouse.enabled=true;                      --Enable Data Lake Storage
    CREATE TABLE adb_external_db_iceberg.test_iceberg_tbl (
      `id` int,
      `name` string,
      `age` int
    ) USING iceberg
    PARTITIONED BY (age)
    TBLPROPERTIES ( 'adb_lake_bucket' = 'adb-lake-cn-shanghai-6gml****' );

    The following parameter is used in the statement:

    Parameter: adb_lake_bucket

    Description: Specifies the storage location of the Data Lake Storage table data.

    • If you specified this parameter when you created the database, all tables in the database are stored in that Data Lake Storage bucket, and you do not need to specify the parameter again when you create a table.

    • If you did not specify this parameter when you created the database, you must explicitly specify it when you create a table. Otherwise, an error occurs. The data of the table is stored in the specified Data Lake Storage bucket.

    • If you explicitly specify this parameter both when you create the database and when you create a table, the data of the table is stored in the Data Lake Storage bucket specified at table creation, and other tables in the database are stored in the bucket specified at database creation. The sketch after this list illustrates this override behavior.
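    A minimal sketch of the override behavior (the database name, table names, and both bucket names are placeholders):

    SET spark.adb.lakehouse.enabled=true;                      --Enable Data Lake Storage
    CREATE DATABASE adb_db_example
    WITH DBPROPERTIES ('adb_lake_bucket' = 'adb-lake-bucket-a');  --Database-level bucket

    CREATE TABLE adb_db_example.tbl_default (`id` int) USING iceberg;  --Stored in adb-lake-bucket-a

    CREATE TABLE adb_db_example.tbl_override (`id` int) USING iceberg
    TBLPROPERTIES ('adb_lake_bucket' = 'adb-lake-bucket-b');  --Stored in adb-lake-bucket-b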

Non-Data Lake Storage tables

  1. Check whether you can create an Iceberg external table in an existing database, or create a new database.

    • Execute the SHOW CREATE DATABASE statement to view the DDL statement of an existing database (see the sketch after this list). If one of the following conditions is met, you can use the existing database. Otherwise, create a new database.

      • No Location parameter is specified in the DDL statement.

      • The Location parameter is specified in the DDL statement and the value of the Catalog parameter is mix.

    • Create a new database.

      CREATE DATABASE adb_external_db_iceberg;
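
    As referenced above, a minimal sketch of the check (the database name is the one used in this topic; substitute your own):

    SHOW CREATE DATABASE adb_external_db_iceberg;  --Inspect the returned DDL for the Location and Catalog parameters
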
  2. Create an Iceberg external table.

    Important

    If the Location parameter is specified for the database, make sure that the OSS path that you specify in the spark.iceberg.warehouse parameter when you create, read, or write the Iceberg external table matches the Location parameter of the database.

    SET spark.adb.version=3.5;                                  --Specify the Spark version, which must be 3.5
    SET spark.iceberg.warehouse=oss://testBucketName/iceberg/;  --The storage path of the metadata and data files of the Iceberg external table
    CREATE TABLE adb_external_db_iceberg.test_iceberg_tbl (
      `id` int,
      `name` string,
      `age` int
    ) USING iceberg
    PARTITIONED BY (age);
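
    To verify the table definition, you can inspect it with standard Spark SQL statements (a sketch; assuming SHOW CREATE TABLE and DESCRIBE are available in your environment):

    SET spark.adb.version=3.5;                                  --Specify the Spark version, which must be 3.5
    SET spark.iceberg.warehouse=oss://testBucketName/iceberg/;  --The storage path of the metadata and data files of the Iceberg external table
    SHOW CREATE TABLE adb_external_db_iceberg.test_iceberg_tbl;  --Returns the table DDL
    DESCRIBE TABLE adb_external_db_iceberg.test_iceberg_tbl;     --Lists columns and partition information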

Step 3: Write data to or delete data from the Iceberg external table

Data Lake Storage tables

Write data

When you perform write operations, you must execute the following SET statement before the SQL statements. Otherwise, the write operations fail. A complete example is provided after the following list.

SET spark.adb.lakehouse.enabled=true;                      --Enable Data Lake Storage
  • Execute the INSERT INTO statement to write data.

    INSERT INTO adb_external_db_iceberg.test_iceberg_tbl VALUES (1, 'lisa', 10), (2, 'jams', 20);
  • Execute the INSERT OVERWRITE statement to write data.

    INSERT OVERWRITE adb_external_db_iceberg.test_iceberg_tbl VALUES (1, 'lisa', 10), (2, 'jams', 30);
  • Execute the INSERT OVERWRITE statement to write data to a static partition.

    INSERT OVERWRITE adb_external_db_iceberg.test_iceberg_tbl PARTITION(age=10) VALUES (1, 'anna');
  • Execute the INSERT OVERWRITE statement to write data to a dynamic partition.

    SET spark.sql.sources.partitionOverwriteMode=dynamic;    --Overwrite only the partitions that receive new data. If this parameter is not set, all data in the table is overwritten
    INSERT OVERWRITE adb_external_db_iceberg.test_iceberg_tbl PARTITION(age) VALUES (1, 'bom', 10);
  • Execute the UPDATE statement to write data.

    UPDATE adb_external_db_iceberg.test_iceberg_tbl SET name = 'box' WHERE id = 2;
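
As noted above, each write submission must include the SET statement, so a complete write job combines the configuration and the SQL statements, as in the following sketch (the inserted values are examples):

SET spark.adb.lakehouse.enabled=true;                      --Enable Data Lake Storage
INSERT INTO adb_external_db_iceberg.test_iceberg_tbl VALUES (3, 'tom', 30);
UPDATE adb_external_db_iceberg.test_iceberg_tbl SET name = 'tim' WHERE id = 3;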

Delete data

SET spark.adb.lakehouse.enabled=true;                      --Enable Data Lake Storage
DELETE FROM adb_external_db_iceberg.test_iceberg_tbl WHERE id = 1;
DELETE FROM adb_external_db_iceberg.test_iceberg_tbl WHERE age = 20;

Non-Data Lake Storage tables

When you perform write and delete operations, you must add the following parameters before the SQL statements. Otherwise, the write and delete operations fail.

SET spark.adb.version=3.5;                                  --Specify the Spark version, which must be 3.5
SET spark.iceberg.warehouse=oss://testBucketName/iceberg/;  --The storage path of the metadata and data files of the Iceberg external table

Write data

  • Execute the INSERT INTO statement to write data.

    INSERT INTO adb_external_db_iceberg.test_iceberg_tbl VALUES (1, 'Frank', 10), (2, 'Amy', 10);
  • Execute the INSERT OVERWRITE statement to write data.

    INSERT OVERWRITE adb_external_db_iceberg.test_iceberg_tbl VALUES (1, 'Frank', 10), (2, 'Amy', 20);
  • Execute the INSERT OVERWRITE statement to write data to a static partition.

    INSERT OVERWRITE adb_external_db_iceberg.test_iceberg_tbl PARTITION(age=10) VALUES (1, 'Frank');
  • Execute the INSERT OVERWRITE statement to write data to a dynamic partition.

    SET spark.sql.sources.partitionOverwriteMode=dynamic;    --Overwrite only the partitions that receive new data. If this parameter is not set, all data in the table is overwritten
    INSERT OVERWRITE adb_external_db_iceberg.test_iceberg_tbl PARTITION(age) VALUES (1, 'Bom', 10);
  • Execute the UPDATE statement to write data.

    UPDATE adb_external_db_iceberg.test_iceberg_tbl SET name = 'box' WHERE id = 2;

Delete data

SET spark.adb.version=3.5;                                  --Specify the Spark version, which must be 3.5
SET spark.iceberg.warehouse=oss://testBucketName/iceberg/;  --The storage path of the metadata and data files of the Iceberg external table
DELETE FROM adb_external_db_iceberg.test_iceberg_tbl WHERE id = 1;
DELETE FROM adb_external_db_iceberg.test_iceberg_tbl WHERE age = 20;

Step 4: Query data from the Iceberg external table

Data Lake Storage tables

SET spark.adb.lakehouse.enabled=true;                      --Enable Data Lake Storage
SELECT * FROM adb_external_db_iceberg.test_iceberg_tbl;

Return values:

+---+----+---+
|id |name|age|
+---+----+---+
|1  |anna|10 |
|2  |jams|20 |
+---+----+---+
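
Because the table is partitioned by age, queries that filter on the partition column let Iceberg prune non-matching partitions (a minimal sketch):

SET spark.adb.lakehouse.enabled=true;                      --Enable Data Lake Storage
SELECT * FROM adb_external_db_iceberg.test_iceberg_tbl WHERE age = 10;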

Non-Data Lake Storage tables

SET spark.adb.version=3.5;                                  --Specify the Spark version, which must be 3.5
SET spark.iceberg.warehouse=oss://testBucketName/iceberg/;  --The storage path of the metadata and data files of the Iceberg external table
SELECT * FROM adb_external_db_iceberg.test_iceberg_tbl;

Return values:

+---+----+---+
|id |name|age|
+---+----+---+
|1  |anna|10 |
|2  |jams|20 |
+---+----+---+

Step 5: Delete the Iceberg external table

Data Lake Storage tables

Delete the metadata and data of the Iceberg Data Lake Storage table from AnalyticDB for MySQL:

SET spark.adb.lakehouse.enabled=true;
DROP TABLE adb_external_db_iceberg.test_iceberg_tbl;

Non-Data Lake Storage tables

  • Delete the Iceberg external table from AnalyticDB for MySQL while retaining the Iceberg external table data in OSS:

    DROP TABLE adb_external_db_iceberg.test_iceberg_tbl;
  • Delete the Iceberg external table from AnalyticDB for MySQL and the Iceberg external table data from OSS:

    DROP TABLE adb_external_db_iceberg.test_iceberg_tbl PURGE;