OSS foreign tables let you move data in bulk between Object Storage Service (OSS) and AnalyticDB for PostgreSQL — including across Alibaba Cloud accounts.
Limitations
The AnalyticDB for PostgreSQL instance and the OSS bucket must be in the same region.
Supported object formats
OSS foreign tables support the following formats:
-
Uncompressed CSV and TEXT objects
-
GZIP-compressed CSV and TEXT objects
-
ORC binary objects
For ORC data type mappings, see the "Data type mappings between ORC and AnalyticDB for PostgreSQL" section of Data type mappings for OSS foreign tables.
Prerequisites
Before you begin, ensure that you have created the following objects in this order:
-
An OSS server — registers the OSS bucket as a Foreign Data Wrapper (FDW) data source in AnalyticDB for PostgreSQL. See the "Create an OSS server" section of Use OSS foreign tables for data lake analysis.
-
A user mapping — maps a database user to the AccessKey credentials that authorize access to OSS. See the "Create a user mapping to the OSS server" section of Use OSS foreign tables for data lake analysis.
-
An OSS foreign table — defines the schema and the OSS location of the data. See the "Create an OSS foreign table" section of Use OSS foreign tables for data lake analysis.
These three objects form a chain: the server identifies the OSS bucket, the user mapping supplies the credentials to access it, and the foreign table exposes the OSS data as a queryable table.
Import data from OSS
The following steps walk through a complete example that imports a CSV file from OSS into an AnalyticDB for PostgreSQL table.
-
Upload a CSV file to OSS. This example uses a file named
example.csv. For preparation steps, see the "Preparations" section of Use OSS foreign tables for data lake analysis. For upload instructions, see Upload objects.Use the same encoding format for your OSS objects and your database to avoid conversion overhead. The default database encoding is UTF-8.
-
Connect to an AnalyticDB for PostgreSQL database. See Client connection.
-
Create an OSS server.
Parameter Description endpointThe internal endpoint for the OSS bucket. Only internal endpoints are supported. See the "Alibaba Cloud public cloud" section of Regions and endpoints. bucketThe name of the OSS bucket. The bucket must be in the same region as the AnalyticDB for PostgreSQL instance. CREATE SERVER oss_serv FOREIGN DATA WRAPPER oss_fdw OPTIONS ( endpoint 'oss-cn-********.aliyuncs.com', bucket 'adb-pg' ); -
Create a user mapping.
For cross-account data transfer, use the AccessKey ID and AccessKey secret of the Alibaba Cloud account that owns the OSS bucket.
CREATE USER MAPPING FOR PUBLIC SERVER oss_serv OPTIONS ( id 'LTAI****************', key 'yourAccessKeySecret' ); -
Create an OSS foreign table named
ossexample.CREATE FOREIGN TABLE ossexample ( date text, time text, open float, high float, low float, volume int ) SERVER oss_serv OPTIONS (dir 'oss_adb/', format 'csv'); -
Import the data using one of the following methods. Method 1: INSERT (recommended for loading into an existing table) Create a local table with the same schema as the foreign table, then insert the data.
CREATE TABLE adbexample ( date text, time text, open float, high float, low float, volume int ) WITH (APPENDONLY=TRUE, ORIENTATION=COLUMN, COMPRESSTYPE=ZSTD, COMPRESSLEVEL=5); INSERT INTO adbexample SELECT * FROM ossexample;Method 2: CREATE TABLE AS (creates the table and loads data in one step)
CREATE TABLE adbexample AS SELECT * FROM ossexample DISTRIBUTED BY (volume);All compute nodes read OSS objects in parallel using a polling mechanism. By default, four objects are read in parallel. For best throughput, set the number of parallel objects to a multiple of the total number of compute node cores (compute nodes × cores per node). See the "Split large objects" section of Best practices for performing operations on OSS foreign tables.
If the import fails — for example, due to a file encoding mismatch or a data type incompatibility — check that the foreign table schema matches the OSS object structure and that the encoding settings are consistent.
Export data to OSS
-
Create an OSS foreign table pointing to the output directory.
CREATE FOREIGN TABLE foreign_x (i int, j int) SERVER oss_serv OPTIONS (format 'csv', dir 'tt_csv/'); -
Export data from a local table to OSS.
INSERT INTO foreign_x SELECT * FROM local_x; -
Verify the export by navigating to the
tt_csv/directory in your OSS bucket. Each compute node writes objects in parallel, so you will see multiple files — one or more per segment.
Naming convention for exported objects
Exported objects follow this naming format:
{tablename | prefix}_{timestamp}_{random_key}_{seg}{segment_id}_{fileno}.{ext}[.gz]
| Component | Description |
|---|---|
{tablename | prefix} |
If prefix is specified, the value of prefix is used. If dir is specified, the OSS foreign table name is used. |
{timestamp} |
The time the export started, in YYYYMMDDHH24MISS format. |
{random_key} |
A random key value. |
{seg}{segment_id} |
The segment identifier — for example, seg1 for compute node 1. |
{fileno} |
The object segment number, starting from 0. |
{ext} |
The object format: .csv, .txt, or .orc. |
[.gz] |
Present when the object is GZIP-compressed. |
Example: export to a GZIP-compressed CSV object using the `dir` parameter
CREATE FOREIGN TABLE fdw_t_out_1(a int)
SERVER oss_serv
OPTIONS (format 'csv', filetype 'gzip', dir 'test/');
Exported objects are named in this format:
fdw_t_out_1_20200805110207_1718599661_seg-1_0.csv.gz
Example: export to an ORC object using the `prefix` parameter
CREATE FOREIGN TABLE fdw_t_out_2(a int)
SERVER oss_serv
OPTIONS (format 'orc', prefix 'test/my_orc_test');
Exported objects are named in this format:
my_orc_test_20200924153043_1737154096_seg0_0.orc