Import data from Object Storage Service (OSS) into ApsaraDB for ClickHouse to run log queries, analytics, and data processing workloads. Two import methods are supported: the OSS table engine for persistent access to a fixed OSS data source, and the oss() table function for one-off or ad hoc imports.
Prerequisites
Before you begin, ensure that you have:
OSS activated. For details, see Activate OSS.
A bucket in the same region as your ApsaraDB for ClickHouse cluster. For details, see Create a bucket.
Read permissions on the OSS objects you want to import. For details, see Overview of permissions and access control.
Choose a method
| Method | Best for |
|---|---|
| OSS table engine | A fixed OSS data source you query repeatedly. Define it once as an external table and reuse it. |
| `oss()` table function | A one-off or ad hoc import. No external table required — OSS connection details are passed directly in the INSERT INTO statement. |
Both methods use version-specific syntax. Check your cluster version in the Cluster Properties section on the Cluster Information page before running any commands.
Prepare test data
Save the following data as test.csv and upload it to your OSS bucket. ApsaraDB for ClickHouse uses a comma (,) as the default column delimiter.
1,yang,32,shanghai,http://example1.com
2,wang,22,beijing,http://example2.com
3,xiao,23,shenzhen,http://example3.com
4,jess,45,hangzhou,http://example4.com
5,jack,14,shanghai,http://example5.com
6,tomy,25,hangzhou,http://example6.com
7,lucy,45,shanghai,http://example7.com
8,tengyin,26,shanghai,http://example8.com
9,wangli,27,shenzhen,http://example9.com
10,xiaohua,37,shanghai,http://example10.com
Import data from OSS
Step 1: Connect to the cluster
Step 2: Create a local table
Create the local table oss_test_tbl_local. Select the CREATE TABLE statement that matches your cluster's replica configuration, which you can view in the Cluster Properties section on the Cluster Information page in the console.
The schema of your ApsaraDB for ClickHouse table must match the schema of the OSS data. Pay special attention to nullable fields: mismatches cause data parsing failures and cluster exceptions.
Single-replica cluster
CREATE TABLE oss_test_tbl_local ON CLUSTER default
(
id UInt8,
user_name String,
age UInt16,
city String,
access_url String
)
ENGINE = MergeTree()
ORDER BY id;
Dual-replica cluster
CREATE TABLE oss_test_tbl_local ON CLUSTER default
(
id UInt8,
user_name String,
age UInt16,
city String,
access_url String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{database}/{table}/{shard}', '{replica}')
ORDER BY id;
For more information about table engines, see Table engines.
Step 3: Create a distributed table (optional)
Create the distributed table oss_test_tbl_distributed if you want to distribute imported data across all local tables simultaneously. Skip this step if you only need the data in a single local table.
CREATE TABLE oss_test_tbl_distributed ON CLUSTER default
(
id UInt8,
user_name String,
age UInt16,
city String,
access_url String
)
ENGINE = Distributed(default, default, oss_test_tbl_local, rand());
Step 4: Import data
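Choose one of the two import methods described above. The statements below are a minimal sketch based on the test data in this topic; the bucket name, OSS endpoint, and AccessKey pair are placeholders you must replace with your own values, and the exact parameter syntax is version-specific, so confirm it against the reference for your cluster version before running the statements.

Option 1: OSS table engine. Create an external table bound to the OSS object, then copy its rows into the distributed table:

```sql
-- External table mapped to test.csv in OSS.
-- Placeholders: bucket endpoint URL and AccessKey pair.
CREATE TABLE oss_test_tbl_oss ON CLUSTER default
(
    id UInt8,
    user_name String,
    age UInt16,
    city String,
    access_url String
)
ENGINE = OSS('https://testBucketName.oss-cn-hangzhou.aliyuncs.com/test.csv',
             'your-access-key-id', 'your-access-key-secret', 'CSV');

-- Import by copying from the external table.
INSERT INTO oss_test_tbl_distributed SELECT * FROM oss_test_tbl_oss;
```

Option 2: `oss()` table function. No external table is needed; the connection details and schema are passed directly in the INSERT INTO statement:

```sql
INSERT INTO oss_test_tbl_distributed
SELECT * FROM oss('https://testBucketName.oss-cn-hangzhou.aliyuncs.com/test.csv',
                  'your-access-key-id', 'your-access-key-secret', 'CSV',
                  'id UInt8, user_name String, age UInt16, city String, access_url String');
```

If you skipped Step 3, insert into the local table oss_test_tbl_local instead of the distributed table.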
Step 5: Verify the import
Query the distributed table to confirm the data was imported correctly:
SELECT * FROM oss_test_tbl_distributed;
Expected output:
┌─id─┬─user_name─┬──age──┬───city─────┬─────access_url────────┐
│ 1 │ yang │ 32 │ shanghai │ http://example1.com │
│ 2 │ wang │ 22 │ beijing │ http://example2.com │
│ 3 │ xiao │ 23 │ shenzhen │ http://example3.com │
│ 4 │ jess │ 45 │ hangzhou │ http://example4.com │
│ 5 │ jack │ 14 │ shanghai │ http://example5.com │
│ 6 │ tomy │ 25 │ hangzhou │ http://example6.com │
│ 7 │ lucy │ 45 │ shanghai │ http://example7.com │
│ 8 │ tengyin │ 26 │ shanghai │ http://example8.com │
│ 9 │ wangli │ 27 │ shenzhen │ http://example9.com │
│ 10 │ xiaohua │ 37 │ shanghai │ http://example10.com │
└────┴───────────┴───────┴────────────┴───────────────────────┘
Match multiple OSS files with wildcards
When OSS stores many small files that follow a consistent naming pattern, use wildcard characters in the oss-file-path parameter to match them in a single import.
| Wildcard | Matches | Example |
|---|---|---|
| `*` | Any file name or folder name | `/dir/*` matches all files in `/dir` |
| `{x,y,z}` | Any of the listed values | `file_{x,y,z}` matches `file_x`, `file_y`, or `file_z` |
| `{num1..num2}` | Any value in the numeric range | `file_{1..3}` matches `file_1`, `file_2`, or `file_3` |
| `?` | Any single character | `file_?` matches `file_a`, `file_b`, or `file_c` |
Example
Given the following folder structure in OSS:
oss://testBucketName/
doc-data/
oss-import/
small_files/
access_log_csv_1.txt
access_log_csv_2.txt
access_log_csv_3.txt
The following oss-file-path values all match these files:
oss://testBucketName/doc-data/oss-import/small_files/*
oss://testBucketName/doc-data/oss-import/small_files/access*
oss://testBucketName/doc-data/oss-import/small_files/access_log_csv_{1,2,3}.txt
oss://testBucketName/doc-data/oss-import/*/access_log_csv_{1,2,3}.txt
oss://testBucketName/doc-data/oss-import/*/*
oss://testBucketName/doc-data/oss-import/*/access_log_csv_{1..3}.txt
oss://testBucketName/doc-data/oss-import/*/access_log_csv_?.txt
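A wildcard path can be passed wherever a single object path is accepted. The following sketch imports all three log files in one statement with the `oss()` table function; it assumes the files share the five-column schema of test.csv, and the bucket name, endpoint, and AccessKey pair are placeholders:

```sql
INSERT INTO oss_test_tbl_distributed
SELECT * FROM oss('https://testBucketName.oss-cn-hangzhou.aliyuncs.com/doc-data/oss-import/small_files/access_log_csv_{1..3}.txt',
                  'your-access-key-id', 'your-access-key-secret', 'CSV',
                  'id UInt8, user_name String, age UInt16, city String, access_url String');
```

Matching many small files in one statement avoids issuing a separate INSERT per object, which is noticeably faster when the folder contains hundreds of files.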