All Products
Search
Document Center

AnalyticDB:Quick Start for Vector Engine

Last Updated:Jan 27, 2026

AnalyticDB for PostgreSQL provides a vector engine optimization feature for vector analysis scenarios, such as AIGC and vector retrieval. This tutorial guides you through the entire process, from creating an AnalyticDB for PostgreSQL instance to performing vector analysis.

Preparations

  • An Alibaba Cloud account is required. If you do not have an account, you can register one on the Alibaba Cloud official website.

  • Authorize a service-linked role. The first time you use AnalyticDB for PostgreSQL, you must authorize the creation of a service-linked role in the console. To do this, perform the following steps:

    1. Log on to the AnalyticDB for PostgreSQL console.
    2. In the upper-right corner of the page, click Create Instance.

    3. In the Create Service Linked Role dialog box that appears, click OK.

Billing

When you create an instance, you are charged for its compute and storage resources. For more information, see Pricing.

Free trial

Alibaba Cloud offers a free trial for Storage-elastic Mode instances. If you are a new user of AnalyticDB for PostgreSQL, you can visit Alibaba Cloud Free Trial to apply for a free trial. If you are not eligible for a free trial, follow the steps in this topic to create an instance in the console.

Procedure

Create an instance

  1. Log on to the AnalyticDB for PostgreSQL console.
  2. In the upper-right corner of the page, click Create Instance to open the buy page.

  3. On the instance purchase page, configure the core parameters to quickly select an instance type. You can keep the default values for other parameters. For more information about the parameters, see Create an instance.

    Configuration Item

    Description

    Example for This Tutorial

    Product Type

    • Subscription: This is a subscription billing method. You pay upfront when you create the instance. This method is suitable for long-term use and is more cost-effective than pay-as-you-go. The longer the subscription duration, the higher the discount.

    • Pay-as-you-go: This is a post-paid billing method. You are billed hourly. This method is suitable for short-term use. You can release the instance immediately after use to save costs.

    Pay-as-you-go

    Region and Zone

    The geographic location of the instance.

    You cannot change the region and zone after the instance is created. Create the instance in the same region as the ECS instances that you want to connect to. This enables service interconnection over the internal network.

    China (Hangzhou): Zone J

    Instance Resource Type

    • Elastic Storage Mode: Supports independent disk scale-out and smooth online scale-out.

    • Serverless Pro: Specify only the required computing resources. You do not need to reserve storage resources.

    Elastic Storage Mode

    Database Engine Version

    Select 7.0 Standard Edition for a richer feature experience. 6.0 Standard Edition is also supported.

    7.0 Standard Edition

    Instance Edition

    • High-performance (Basic Edition): Suitable for most business analysis scenarios.

    • High-availability Edition: Recommended for core business services.

    High-performance (Basic Edition)

    Vector Engine Optimization

    Select Enable.

    Enable

    Virtual Private Cloud

    Select the ID of the VPC.

    To interconnect with an ECS instance in the same region over the internal network, select the same VPC as the ECS instance. You can select an existing VPC or create a VPC and vSwitch as prompted on the page.

    vpc-xxxx

    vSwitch

    Select a vSwitch in the VPC. If no vSwitches are available, no vSwitch resources are available in the zone. You can switch to another zone or create a vSwitch in the current zone as prompted on the page.

    vsw-xxxx

  4. Click Buy Now, confirm the order information, and click Activate Now.

  5. After you complete the payment, click Management Console to go to the instance list and view the new instance.

    Note

    The initialization of an AnalyticDB for PostgreSQL instance takes some time. Wait for the instance status to change to Running before you proceed with the next steps.

Create an initial account

Important

AnalyticDB for PostgreSQL provides two types of users:

  • Privileged user: The initial account is a privileged user with the RDS_SUPERUSER role. This role grants all operational permissions on the database.

  • Regular user: By default, a regular user has no permissions. A privileged user, or another user with the GRANT permission, must explicitly grant permissions on database objects to the regular user. For more information about how to create a regular user, see Create and manage users.

  1. In the navigation pane on the left, click Account Management.

  2. Click Create Account. In the Create Account window, enter an account name, set a password, and then click OK.

    Parameter

    Description

    Account

    The name of the initial account.

    • Contain lowercase letters, digits, and underscores (_).

    • Must start with a lowercase letter and end with a lowercase letter or a digit.

    • Cannot start with gp.

    • Must be 2 to 16 characters in length.

    New Password and Confirm Password

    The password of the initial account.

    • Must contain at least three of the following character types: uppercase letters, lowercase letters, digits, and special characters.

    • Special characters include ! @ # $ % ^ & * ( ) _ + - =.

    • Must be 8 to 32 characters in length.

    Important

    For security, change your password regularly and avoid reusing old ones.

Configure a whitelist

Note

If you use only Data Management (DMS) to access the instance, you can skip this step. If you access the database from a local integrated development environment (IDE) or an ECS instance, you must configure an IP address whitelist. To obtain the client IP address, see Preparations.

  1. In the navigation pane on the left of the instance details page, click Security Controls.

  2. Click Create Whitelist and configure the following parameters:

    Configuration

    Description

    Whitelist Name

    The name of the new whitelist group. The name must meet the following requirements:

    • Consist of lowercase letters, digits, and underscores (_).

    • Start with a lowercase letter and end with a lowercase letter or a digit.

    • Be 2 to 32 characters in length.

    IP Addresses

    The IP addresses to add to the whitelist. Note the following:

    • Separate multiple IP addresses with commas (,). You can add up to 999 unique IP addresses.

    • The supported formats are specific IP addresses, such as 10.23.12.24, and CIDR blocks, such as 10.23.12.24/24. CIDR stands for Classless Inter-Domain Routing. The number after the slash (/) indicates the prefix length, which can be from 1 to 32.

    • If you set the prefix length to 0, such as 0.0.0.0/0 or 127.0.0.1/0, all IP addresses can access the instance. This poses a high security risk. Use this setting with caution.

    • The IP address 127.0.0.1 blocks access from all external IP addresses.

  3. Click OK.

Connect to the database

This section uses the psql tool as an example to demonstrate how to log on to the database. For information about how to connect using other tools, see Client connection.

  1. Install psql.

    1. Download the psql tool.

      wget https://docs-aliyun.cn-hangzhou.oss.aliyun-inc.com/assets/attach/181125/cn_zh/1598426198114/adbpg_client_package.el7.x86_64.tar.gz

      The preceding download command applies only if the client operating system is RHEL 7 or CentOS 7. To use other versions of the psql tool, see Client connection.

    2. Decompress the psql tool.

      tar -xzvf adbpg_client_package.el7.x86_64.tar.gz
    3. Go to the directory where the psql tool is located.

      cd adbpg_client_package/bin
  2. Log on to the database.

    ./psql -h <Endpoint of the AnalyticDB for PostgreSQL instance> -p 5432 -d <Database to connect to> -U <Account of the AnalyticDB for PostgreSQL instance>

    To obtain the endpoint:

    • If the client is deployed on an ECS instance that is in the same region and has the same network type as the AnalyticDB for PostgreSQL instance, you can use the internal endpoint to connect. Log on to the AnalyticDB for PostgreSQL console, and click the ID of the target instance to open the instance details page. In the navigation pane on the left, click Basic Information. In the Database Connection Information section, you can view the Internal Endpoint.

    • If the client and the AnalyticDB for PostgreSQL instance are deployed in different regions, on ECS instances with different network types, or on systems outside of Alibaba Cloud, you need to request a public endpoint and use it to connect. For more information, see Manage public endpoints.

    The following is an example of a connection command:

    psql -h gp-bp13zq652yy4p****-master.gpdb.rds.aliyuncs.com -p 5432 -d postgres -U testuser

    Press Enter and enter the password. If the system displays postgres=> , the connection is successful.

Import vector data using SQL

  1. Confirm that the FastANN vector retrieval extension is installed.

    You can run the \dx fastann command to check whether the extension is installed. If information about the FastANN extension is returned, the extension is installed. If no information is returned, submit a ticket to request the installation.

  2. You can download the test data.

    For testing purposes, AnalyticDB for PostgreSQL provides sample data. To download the data, click vector_sample_data.csv.

    The schema of the sample data table is as follows.

    Field

    Type

    Description

    id

    bigint

    ID.

    market_time

    timestamp

    The time when the car was launched.

    color

    varchar(10)

    The color of the car.

    price

    int

    The price of the car.

    feature

    float4[]

    The feature vector of the car photo.

    The command to download the sample data on a Linux server is as follows.

    wget https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20230606/uzkx/vector_sample_data.csv
  3. Import the data.

    1. Create and switch to a database.

      -- Create a database named testdb.
      CREATE DATABASE testdb;
      
      -- Switch to the database.
      \c testdb
    2. Create a table that contains a vector field based on the sample data.

      CREATE SCHEMA IF NOT EXISTS vector_test;
      CREATE TABLE IF NOT EXISTS vector_test.car_info
      (
        id bigint NOT NULL,
        market_time timestamp,
        color varchar(10),
        price int,
        feature float4[],
        PRIMARY KEY(id)
      ) 
      USING heap -- If vector search engine optimization is enabled when you create the instance, the default engine is heap.
      DISTRIBUTED BY(id);
    3. Create indexes.

      In this tutorial, a structured index is created for fields such as car launch time, color, and price, and a vector index is created for the feature vector of the car photo.

      -- Change the storage format of the vector field to PLAIN.
      ALTER TABLE vector_test.car_info ALTER COLUMN feature SET STORAGE PLAIN;
      
      -- Create structured indexes.
      CREATE INDEX ON vector_test.car_info(market_time);
      CREATE INDEX ON vector_test.car_info(color);
      CREATE INDEX ON vector_test.car_info(price);
      
      -- Create a vector index.
      CREATE INDEX ON vector_test.car_info USING ann(feature) WITH (dim='10', pq_enable='0');
    4. Load the sample data into the data table.

      \COPY vector_test.car_info FROM '/DATA_PATH/vector_sample_data.csv';

      Replace /DATA_PATH/vector_sample_data.csv in the example with the path where the sample data file is stored. For example, if the sample data is downloaded to the /home directory, the path should be /home/vector_sample_data.csv.

      After the data is imported, the system returns COPY 10000.

Vector analysis

This tutorial demonstrates vector analysis by providing examples of how to obtain the squared Euclidean distance, dot product distance, and cosine similarity.

Get the Euclidean distance

Perform vector analysis and obtain the squared Euclidean distance.

SELECT id, l2_squared_distance(feature, array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]) AS distance 
  FROM vector_test.car_info 
  ORDER BY feature <-> array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[] 
  LIMIT 10;

The following is a sample output.

  id  |      distance
------+--------------------
    2 |                  0
 1331 | 0.0677967891097069
 1543 |  0.079616591334343
 5606 | 0.0892329216003418
 6423 | 0.0894578248262405
 1667 | 0.0903968289494514
 8215 | 0.0936210229992867
 7801 | 0.0952572822570801
 2581 | 0.0965127795934677
 2645 | 0.0987173467874527
(10 rows)

Get the dot product distance (cosine similarity)

Perform vector analysis and obtain the dot product distance. When the data is normalized, the dot product distance is equal to the cosine similarity.

SELECT id, dp_distance(feature, array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]) AS similarity 
  FROM vector_test.car_info 
  ORDER BY feature <-> array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[] 
  LIMIT 10;

The following is a sample output.

  id  |    similarity
------+-------------------
    2 |                 1
 1331 | 0.966101586818695
 1543 | 0.960191607475281
 5606 | 0.955383539199829
 6423 | 0.955271065235138
 1667 | 0.954801559448242
 8215 | 0.953189492225647
 7801 |  0.95237135887146
 2581 | 0.951743602752686
 2645 | 0.950641334056854
(10 rows)

Hybrid search query

To perform a hybrid search of structured and unstructured data, you can use the following SQL statement.

SELECT id, dp_distance(feature, array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]) AS similarity 
  FROM vector_test.car_info 
  WHERE market_time >= '2020-10-30 00:00:00' 
  AND market_time < '2021-01-01 00:00:00' 
  AND color in ('red', 'white', 'blue') 
  AND price < 100 
  ORDER BY feature <-> array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[] 
  LIMIT 10;

The following is a sample output.

  id  |    similarity
------+-------------------
 7645 | 0.922723233699799
 8956 | 0.920517802238464
 8219 |  0.91210675239563
 8503 | 0.895939946174622
 5113 | 0.895431876182556
 7680 | 0.893448948860168
 8433 | 0.893425941467285
 3604 |  0.89293098449707
 3945 | 0.891274154186249
 7153 | 0.891128540039062
(10 rows)

References

For more information about vector analysis, see Vector analysis.