All Products
Search
Document Center

Hologres:Use Python to access Hologres

Last Updated:Mar 05, 2024

Psycopg is a PostgreSQL database adapter for Python. Hologres is compatible with PostgreSQL 11. You can use Psycopg to access Hologres. This topic describes how to use Psycopg 2 to access Hologres. In this example, Python 3.8 installed on CentOS 7 is used.

Install Python 3.8

You can install Python 3.8 in a Miniconda or Anaconda environment. In this example, Python 3.8 is installed on CentOS 7.

  1. Install Python 3.8.

    Download the Python and run the following commands to install Python:

    # yum install centos-release-scl
    # yum install rh-python38
    # scl enable rh-python38 bash
    # python --version
    Python 3.8.6
  2. Install the Psycopg 2 package.

    Run the following command to install the Psycopg 2 package:

     # pip install psycopg2-binary

Connect to Hologres

After you install Python 3.8 and Psycopg 2, perform the following steps to connect to Hologres:

  1. Load the Psycopg 2 package.

    To use Psycopg 2, run the import psycopg2 command to load the Psycopg 2 package that you installed.

  2. Connect to your Hologres database.

    You can use the psycopg2.connect() function to connect to your Hologres database. The following sample code provides an example on how to use this function. The following table describes the parameters of this function.

    conn = psycopg2.connect(host="<Endpoint>", port=<Port>, dbname="<database>", user="<Access ID>", password="<Access Key>", application_name="<Application Name>")

    Parameter

    Description

    host

    The endpoint of your Hologres instance.

    You can view the endpoint of the Hologres instance in the Network Information section of the instance details page in the Hologres console.

    port

    The port number of your Hologres instance.

    You can view the port number of the Hologres instance in the Network Information section of the Instance Details page in the Hologres console.

    dbname

    The name of the Hologres database.

    user

    The AccessKey ID of the Alibaba Cloud account.

    You can obtain the AccessKey ID from the Security Management page.

    password

    The AccessKey secret of the Alibaba Cloud account.

    You can obtain the AccessKey secret from the Security Management page.

    application_name

    (Optional) The name of your application.

    The parameter is used to specify the application on which the SQL statement is executed, which helps you understand the usage of the statement when you query logs.

Use Hologres

After you connect to the Hologres database, you can use Psycopg 2 to develop data in the database. You can perform the following steps to create a table, insert data into the table, query the data, and then release resources. If you want to use fixed plans to perform read and write operations that require better performance, use the related Grand Unified Configuration (GUC) parameters. For more information, see Accelerate the execution of SQL statements by using fixed plans.

  1. Create a cursor.

    Before you develop data, run the cur = conn.cursor() command to create a cursor for the connection.

  2. Develop data.

    1. Create a table.

      The following sample command creates a table named holo_test and specifies that the type of data in the table is INTEGER. You can set a custom table name and specify a custom data type based on your business requirements.

      cur.execute("CREATE TABLE holo_test (num integer);")
    2. Insert data.

      The following sample command inserts integers from 1 to 1,000 to the holo_test table.

      cur.execute("INSERT INTO holo_test SELECT generate_series(%s, %s)", (1, 1000))
    3. Query the data.

      cur.execute("SELECT sum(num) FROM holo_test;")
      cur.fetchone()
  3. Commit the transaction.

    After you run the command for the data query, run the conn.commit() command. This ensures that the operation is committed as a transaction. You can also set the autocommit parameter to true to automatically commit the operation.

  4. Release resources.

    To ensure that subsequent operations are not affected, after the preceding commands are run, run the following commands to close the cursor and disconnect Psycopg 2 from your Hologres database:

    cur.close()
    conn.close()

Best practice of efficiently importing DataFrames to Hologres

In most cases where you use Python, you need to use pandas to convert data entries into a DataFrame, process the DataFrame, and then import the DataFrame to Hologres. In this case, you may want to efficiently import the DataFrame to Hologres. The to_sql function is usually used to import DataFrames to Hologres. For more information, see pandas.DataFrame.to_sql.

The version of pandas must be V1.4.2 or later. You can run the following command to forcibly install pandas whose version is V1.5.1:

# pip install Pandas==1.5.1

When you import a DataFrame to Hologres, we recommend that you use a callable of the to_sql function to execute the COPY FROM STDIN statement. Sample code in Python:

# Load a dependency.
import pandas as pd
import psycopg2

# Generate a connection string.
host="hgpostcn-cn-xxxxxx-cn-hangzhou.hologres.aliyuncs.com"
port=80
dbname="demo"
user="LTAI5xxxxx"
password="fa8Kdgxxxxx"
application_name="Python Test"
conn = "postgresql+psycopg2://{}:{}@{}:{}/{}?application_name={}".format(user, password, host, port, dbname,application_name)
print(conn)

# Convert data entries into a DataFrame.
data = [('1','1','1'),('2','2','2')]
cols = ('col1','col2','col3')
pd_data = pd.DataFrame(data, columns=cols)

# Specify a callable of the to_sql function.
import csv
from io import StringIO

def psql_insert_copy(table, conn, keys, data_iter):
    """
    Execute SQL statement inserting data

    Parameters
    ----------
    table : pandas.io.sql.SQLTable
    conn : sqlalchemy.engine.Engine or sqlalchemy.engine.Connection
    keys : list of str
        Column names
    data_iter : Iterable that iterates the values to be inserted
    """
    # gets a DBAPI connection that can provide a cursor
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cur:
        s_buf = StringIO()
        writer = csv.writer(s_buf)
        writer.writerows(data_iter)
        s_buf.seek(0)

        columns = ', '.join('"{}"'.format(k) for k in keys)
        if table.schema:
            table_name = '{}.{}'.format(table.schema, table.name)
        else:
            table_name = table.name

        sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(
            table_name, columns)
        cur.copy_expert(sql=sql, file=s_buf)

# Import the DataFrame.
pd_data.to_sql(
    name="pd_data",
    con=conn,
    if_exists="replace",
    index=False,
    method=psql_insert_copy
)

You can view historical queries in the HoloWeb console to check whether the DataFrame is imported to Hologres by executing the COPY FROM STDIN statement.