How to import and export data using an OSS foreign table - AnalyticDB

Limitations

The AnalyticDB for PostgreSQL instance and the OSS bucket must be in the same region.

Supported object formats

OSS foreign tables support the following formats:

Uncompressed CSV and TEXT objects
GZIP-compressed CSV and TEXT objects
ORC binary objects

For ORC data type mappings, see the "Data type mappings between ORC and AnalyticDB for PostgreSQL" section of Data type mappings for OSS foreign tables.

Prerequisites

Before you begin, ensure that you have created the following objects in this order:

An OSS server — registers the OSS bucket as a Foreign Data Wrapper (FDW) data source in AnalyticDB for PostgreSQL. See the "Create an OSS server" section of Use OSS foreign tables for data lake analysis.
A user mapping — maps a database user to the AccessKey credentials that authorize access to OSS. See the "Create a user mapping to the OSS server" section of Use OSS foreign tables for data lake analysis.
An OSS foreign table — defines the schema and the OSS location of the data. See the "Create an OSS foreign table" section of Use OSS foreign tables for data lake analysis.

These three objects form a chain: the server identifies the OSS bucket, the user mapping supplies the credentials to access it, and the foreign table exposes the OSS data as a queryable table.

Import data from OSS

The following steps walk through a complete example that imports a CSV file from OSS into an AnalyticDB for PostgreSQL table.

Upload a CSV file to OSS. This example uses a file named example.csv. For preparation steps, see the "Preparations" section of Use OSS foreign tables for data lake analysis. For upload instructions, see Upload objects.

Use the same encoding format for your OSS objects and your database to avoid conversion overhead. The default database encoding is UTF-8.
Connect to an AnalyticDB for PostgreSQL database. See Client connection.

Create an OSS server.

Parameter	Description
`endpoint`	The internal endpoint for the OSS bucket. Only internal endpoints are supported. See the "Alibaba Cloud public cloud" section of Regions and endpoints.
`bucket`	The name of the OSS bucket. The bucket must be in the same region as the AnalyticDB for PostgreSQL instance.

CREATE SERVER oss_serv
    FOREIGN DATA WRAPPER oss_fdw
    OPTIONS (
        endpoint 'oss-cn-********.aliyuncs.com',
        bucket 'adb-pg'
  );

Create a user mapping.

For cross-account data transfer, use the AccessKey ID and AccessKey secret of the Alibaba Cloud account that owns the OSS bucket.
```
CREATE USER MAPPING FOR PUBLIC
    SERVER oss_serv
    OPTIONS (
        id 'LTAI****************',
        key 'yourAccessKeySecret'
    );
```

Create an OSS foreign table named ossexample.

CREATE FOREIGN TABLE ossexample (
    date text,
    time text,
    open float,
    high float,
    low float,
    volume int
) SERVER oss_serv OPTIONS (dir 'oss_adb/', format 'csv');

Import the data using one of the following methods. Method 1: INSERT (recommended for loading into an existing table) Create a local table with the same schema as the foreign table, then insert the data.
```
CREATE TABLE adbexample (
    date text,
    time text,
    open float,
    high float,
    low float,
    volume int
) WITH (APPENDONLY=TRUE, ORIENTATION=COLUMN, COMPRESSTYPE=ZSTD, COMPRESSLEVEL=5);

INSERT INTO adbexample SELECT * FROM ossexample;
```
Method 2: CREATE TABLE AS (creates the table and loads data in one step)
```
CREATE TABLE adbexample AS SELECT * FROM ossexample DISTRIBUTED BY (volume);
```
All compute nodes read OSS objects in parallel using a polling mechanism. By default, four objects are read in parallel. For best throughput, set the number of parallel objects to a multiple of the total number of compute node cores (compute nodes × cores per node). See the "Split large objects" section of Best practices for performing operations on OSS foreign tables.

If the import fails — for example, due to a file encoding mismatch or a data type incompatibility — check that the foreign table schema matches the OSS object structure and that the encoding settings are consistent.

Export data to OSS

Create an OSS foreign table pointing to the output directory.

CREATE FOREIGN TABLE foreign_x (i int, j int)
SERVER oss_serv
OPTIONS (format 'csv', dir 'tt_csv/');

Export data from a local table to OSS.

INSERT INTO foreign_x SELECT * FROM local_x;

Verify the export by navigating to the tt_csv/ directory in your OSS bucket. Each compute node writes objects in parallel, so you will see multiple files — one or more per segment.

Naming convention for exported objects

Exported objects follow this naming format:

{tablename | prefix}_{timestamp}_{random_key}_{seg}{segment_id}_{fileno}.{ext}[.gz]

Component	Description
`{tablename \| prefix}`	If `prefix` is specified, the value of `prefix` is used. If `dir` is specified, the OSS foreign table name is used.
`{timestamp}`	The time the export started, in `YYYYMMDDHH24MISS` format.
`{random_key}`	A random key value.
`{seg}{segment_id}`	The segment identifier — for example, `seg1` for compute node 1.
`{fileno}`	The object segment number, starting from `0`.
`{ext}`	The object format: `.csv`, `.txt`, or `.orc`.
`[.gz]`	Present when the object is GZIP-compressed.

Example: export to a GZIP-compressed CSV object using the `dir` parameter

CREATE FOREIGN TABLE fdw_t_out_1(a int)
SERVER oss_serv
OPTIONS (format 'csv', filetype 'gzip', dir 'test/');

Exported objects are named in this format:

fdw_t_out_1_20200805110207_1718599661_seg-1_0.csv.gz

Example: export to an ORC object using the `prefix` parameter

CREATE FOREIGN TABLE fdw_t_out_2(a int)
SERVER oss_serv
OPTIONS (format 'orc', prefix 'test/my_orc_test');