All Products
Search
Document Center

PolarDB:Fast import of vector and raster data into GanosBase

Last Updated:Mar 28, 2026

GanosBase is the spatial database engine in PolarDB for PostgreSQL (Compatible with Oracle). Because PolarDB uses a shared storage architecture with no local disk, the standard PostGIS command-line tools (shp2pgsql, raster2pgsql) cannot be used to load data. GanosBase reads from Object Storage Service (OSS) directly and provides SQL functions for import instead.

Choose an import method

Select a method before you start.

Data typeMethodBest for
Vector (Shapefile and other formats)FDWAll vector imports; preferred for scripted or automated workflows
Vector (any format)GIS desktop tool (QGIS)One-off imports from a GUI
Raster (TIFF, HDF5, NetCDF)ST_ImportFromProduction data that must remain accessible even if the OSS source file is deleted
Raster (TIFF, HDF5, NetCDF)ST_CreateRastExploratory queries where data stays in OSS long-term and you control the source files

Why PostGIS tools don't work for PolarDB

PostGIS provides shp2pgsql and raster2pgsql as command-line tools for loading vector and raster data. These tools don't work for PolarDB for two reasons:

  • They require direct access to the server's local disk. PolarDB uses a shared storage architecture with no local disk.

  • PostGIS Raster and GanosBase Raster use different internal storage structures.

Import vector data

Import using FDW (recommended)

Foreign Data Wrapper (FDW) is a PostgreSQL extension for accessing external data sources. GanosBase FDW automatically maps geometric spatial data types to the Geometry type, letting you query foreign tables and local tables together in a single SQL statement.

The following steps import poly.shp from OSS into a local GanosBase table.

Prerequisites

Before you begin, ensure that you have:

  • A PolarDB for PostgreSQL (Compatible with Oracle) instance

  • An OSS bucket containing your Shapefile

  • The AccessKey ID and AccessKey secret for your Alibaba Cloud account

Steps

  1. Install the required extensions.

    CREATE EXTENSION ganos_spatialref;
    CREATE EXTENSION ganos_geometry;
    CREATE EXTENSION ganos_fdw;
  2. Register the Shapefile as a foreign table using ST_RegForeignTables.

    SELECT ST_RegForeignTables('oss://<access_id>:<secret_key>@[<endpoint>]/<bucket>/path_to/poly.shp');

    Replace the placeholders with your actual credentials and file path:

    PlaceholderDescriptionExample
    <access_id>Alibaba Cloud AccessKey IDLTAI5tXxx
    <secret_key>Alibaba Cloud AccessKey secretxXxXxXx
    <endpoint>OSS endpointoss-cn-hangzhou.aliyuncs.com
    <bucket>OSS bucket namemy-geo-bucket
  3. Confirm the foreign table was registered.

    SELECT foreign_table_name FROM information_schema.foreign_tables
    ORDER BY foreign_table_name ASC;

    The table poly appears in the results. At this point, queries on poly read from OSS on demand — no data has been written to the database.

  4. To write the data into the database, run:

    CREATE TABLE poly_db AS SELECT * FROM poly;

Import using a GIS desktop tool

PolarDB for PostgreSQL is fully compatible with PostgreSQL Community Edition, so QGIS and other GIS desktop tools work without modification. Create a PostgreSQL data source connection pointing to your PolarDB instance, then use the tool's built-in import capability.

Import raster data

How ST_ImportFrom and ST_CreateRast differ

Both functions create a Raster object, but they handle pixel data differently:

ST_ImportFromST_CreateRast
What gets stored in the databaseRaster metadata + pixel data (chunked at 256 x 256 pixels by default)Raster metadata only (logical mapping information to the external image)
If the OSS source file is deletedData remains accessibleData is no longer viewable
Storage costHigher (pixel data lives in the database)Lower (pixel data stays in OSS)

Use ST_ImportFrom when data durability matters. Use ST_CreateRast when you want to query data in-place without copying it.

Import TIFF data

  1. Upload the TIFF file to an OSS bucket.

  2. Create a table with a Raster column.

    CREATE TABLE raster_table (
      id      integer,
      format  text,
      rast    raster
    );
  3. Import the TIFF using either function.

    • With ST_ImportFrom — pixel data is stored in chunk_table:

      INSERT INTO raster_table
      SELECT 1, 'TIFF', ST_ImportFrom('chunk_table', 'oss://&lt;access_id&gt;:&lt;secret_key&gt;@[&lt;endpoint&gt;]/&lt;bucket&gt;/path_to/file');
    • With ST_CreateRast — only a pointer to the OSS file is stored:

      INSERT INTO raster_table
      SELECT 2, 'TIFF', ST_CreateRast('oss://&lt;access_id&gt;:&lt;secret_key&gt;@[&lt;endpoint&gt;]/&lt;bucket&gt;/path_to/file');
  4. Query the imported Raster objects using GanosBase UDFs.

    SELECT
      id,
      ST_Georeference(rast),
      st_extent(rast),
      ST_NumBands(rast),
      ST_SRID(rast),
      st_value(rast, 0, 100, 100),
      st_value(rast, 1, 200, 200)
    FROM raster_table;

    image

Import HDF5 and NetCDF data

HDF5 and NetCDF are used in earth observation, scientific computing, and geoscience (such as meteorology and oceanography). The import syntax is the same as for TIFF.

  • With ST_ImportFrom:

    INSERT INTO raster_table
    SELECT 3, 'NC', ST_ImportFrom('chunk_table', 'oss://<access_id>:<secret_key>@[<endpoint>]/<bucket>/path_to/file');
  • With ST_CreateRast:

    INSERT INTO raster_table
    SELECT 4, 'NC', ST_CreateRast('oss://<access_id>:<secret_key>@[<endpoint>]/<bucket>/path_to/file');

Files with subdatasets

HDF5 and NetCDF files can contain multiple named datasets. Append the subdataset name to the file path.

  • NetCDF:

    INSERT INTO raster_table
    SELECT 5, 'NC', ST_ImportFrom('chunk_table', 'oss://<access_id>:<secret_key>@[<endpoint>]/<bucket>/path_to/file.nc:sub_name');
  • HDF5:

    INSERT INTO raster_table
    SELECT 5, 'HDF5', ST_ImportFrom('chunk_table', 'oss://<access_id>:<secret_key>@[<endpoint>]/<bucket>/path_to/file.hdf5://path/sub_name');

Files with more than three dimensions

GanosBase stores raster data as (w, h, b) chunks, where w is width, h is height, and b is the number of bands. For files with more than three dimensions, specify chunkdim explicitly so GanosBase knows how to map the extra dimensions to bands.

For NetCDF files, the number of bands is the product of all dimensions except the x (longitude) and y (latitude) axes. For example, a file with lon, lat, time, and isobaric dimensions has 120 bands (the product of the time and isobaric dimension sizes).

image
INSERT INTO nc_table
VALUES (
  1,
  ST_ImportFrom(
    'nc_rbt',                         -- chunk table name
    '/Users/xiaofei/Data/raster/nc_demo.nc',
    '{"chunkdim":"(256,256,120)"}'    -- 256x256 tiles, 120 bands
  )
);

Import data in batches

For bulk imports from OSS, use a Python script with the oss2 and psycopg2 libraries. Before running the script, initialize the OSS Python SDK. For details, see Initialization.

import oss2
import psycopg2
import logging

# Connect to OSS
auth   = oss2.Auth('AK_ID', 'AK_Secret')
bucket = oss2.Bucket(auth, '***endpoint***', '***bucket_name***')

# Connect to the database
con = psycopg2.connect(
    database="***dataq***",
    user="***dde***",
    options="-c search_path=aster,public",  # schema
    password="******",
    host="*******",
    port="******"
)
cur = con.cursor()

# SQL template for data import
insert_sql = "INSERT into {raster_table_name}(filename, band_index, raster_obj) VALUES ..."

# Collect HDF5 file names from OSS
for obj in oss2.ObjectIterator(bucket, prefix="..."):
    ff = open("upload.txt", mode='a', encoding='utf-8')
    filename = obj.key.split('/')[-1]
    if filename.find(".hdf5") == -1:
        continue
    else:
        ff.write(filename + '\n')

# Read the file list
with open('upload.txt', 'r') as f:
    fileList = f.read().split('\n')

# Import each file
for file in fileList:
    # Build insert_sql with actual parameters...

    try:
        cur.execute(insert_sql)
        con.commit()
        logging.info(filename + " finished")
    except (Exception, psycopg2.Error) as e:
        logging.info(filename + " error!")
        print(filename + " upload failed\n")
Note

Replace all placeholder values (*endpoint*, *bucket_name*, AK_ID, AK_Secret, and database connection parameters) with your actual credentials before running the script.