All Products
Search
Document Center

Hologres:Access unstructured OSS data with object table

Last Updated:Mar 26, 2026

Object tables map files stored in OSS into a queryable Hologres table—similar to an external table, but purpose-built for unstructured data such as images, PDFs, audio, and video. Hologres V4.0 and later support this feature.

With object tables, you can:

Prerequisites

Before you begin, ensure that you have:

  • A Hologres instance running V4.0 or later. To upgrade, see Upgrade an instance

  • An OSS bucket containing the files you want to access

  • A RAM role with read access to the OSS bucket. To find the role ARN (Alibaba Cloud Resource Name), go to the Resource Access Management (RAM) console, click Identity Management > Role, and then click the role name

Limitations

  • Only classic network endpoints are supported.

  • Only manual refresh is supported. There is no automatic refresh.

  • By default, object tables scan only the top-level directory of the specified path. Subdirectories are not included. To scan subdirectories recursively, run:

    Note

    Recursive scans consume more resources than top-level scans. Set path to the most specific directory that contains your target files.

    ALTER DATABASE <db_name> SET hg_experimental_enable_oss_meta_recursive = on;

Create an object table

Object tables have a fixed schema—you do not specify columns when creating one.

Syntax:

CREATE OBJECT TABLE [IF NOT EXISTS] [schema_name.]<table_name>
WITH (
  -- Required
  path          = '<oss_path>',
  oss_endpoint  = '<oss_endpoint>',
  role_arn      = '<role_arn>',

  -- Optional
  [orientation                  = 'column | row | row,column',]
  [table_group                  = '<table_group_name>',]
  [distribution_key             = '<column_name>[,...]',]
  [clustering_key               = '<column_name>[:asc][,...]',]
  [storage_mode                 = 'hot | cold',]
  [event_time_column            = '<column_name>[,...]',]
  [bitmap_columns               = '<column_name>[,...]',]
  [dictionary_encoding_columns  = '<column_name>[,...]',]
  [time_to_live_in_seconds      = '<non_negative_integer>']
);

After creating an object table, refresh it once to load data:

REFRESH OBJECT TABLE [schema_name.]<table_name>;

Required parameters

Parameter Description Example
path The OSS directory path containing your files. The object table reads metadata from files in this directory. oss://my-bucket/my-dir
oss_endpoint The OSS classic network endpoint for the bucket region. For endpoints by region, see Regions and endpoints. oss-ap-southeast-1-internal.aliyuncs.com
role_arn The ARN of a RAM role with read access to the OSS bucket. acs:ram::role-id:role/role-name

Optional parameters

Parameter Default Description
orientation column Storage format. Supported values: column, row, row,column. See Table storage formats.
table_group Default table group The table group and shard count for the object table.
distribution_key object_uri The distribution key.
clustering_key object_uri The clustering key.
storage_mode hot Storage mode. Supported values: hot (hot storage), cold (cold storage). See Data tiering.
event_time_column last_modified_at The event time column (segment key).
bitmap_columns object_uri:auto,etag:auto The bitmap index columns.
dictionary_encoding_columns object_uri:auto,etag:auto Dictionary encoding columns.
time_to_live_in_seconds 3153600000 Data lifecycle for the table in seconds.

Query an object table

After the initial refresh, query an object table like any standard table:

SELECT * FROM <object_table_name>;

Object tables have the following fixed columns:

Column Type Description
object_uri TEXT The exact OSS file path.
etag TEXT A unique identifier for the file content, assigned when the object is created.
file FILE A JSON-like structure containing detailed file metadata. Parse it using JSON and JSONB types. Fields include: object_uri, etag, size, last_modified_at, owner_name, object_source (always OSS), oss_endpoint, object_table_id, and role_arn.
metadata JSON Custom metadata attached to the file.

To view the DDL or storage properties of an object table, use either of the following:

-- View DDL
SELECT hg_dump_script('[schema_name.]<object_table_name>');

-- View storage properties
SELECT * FROM hologres.hg_table_properties WHERE table_name = '<object_table_name>';

Refresh an object table

Refresh an object table to sync file metadata from OSS. Only manual refresh is supported.

REFRESH OBJECT TABLE [schema_name.]<table_name>;
Note

Refreshing consumes compute resources. To minimize cost, set path to the most specific directory that contains your target files.

Manage an object table

Rename or move an object table

-- Rename
ALTER OBJECT TABLE [IF EXISTS] <table_name> RENAME TO <new_name>;

-- Move to a different schema
ALTER OBJECT TABLE [IF EXISTS] [schema_name.]<table_name> SET SCHEMA <new_schema>;

Delete an object table

-- Moves to the recycle bin if it is enabled; otherwise, permanently deletes
DROP OBJECT TABLE [IF EXISTS] <table_name>;

-- Permanently deletes without moving to the recycle bin
DROP OBJECT TABLE [IF EXISTS] <table_name> FORCE;

If the database recycle bin is enabled, dropping an object table moves it to the recycle bin automatically. The object table can be recovered using the recycle bin recovery command, and it remains an object table after recovery. For details, see Recycle bin.

Warning

The FORCE option permanently deletes the object table. Recovery is not possible.

Best practices

Object tables work with Dynamic Tables, AI Functions, vector search, and full-text search to automate processing, searching, and analysis of unstructured data. For details, see: