PyODPS is integrated with SQLAlchemy. You can use SQLAlchemy to query data in MaxCompute. This topic describes how to connect SQLAlchemy to a MaxCompute project and call the SQLAlchemy interface.

Connect SQLAlchemy to a MaxCompute project

Syntax
from sqlalchemy import create_engine
engine = create_engine('odps://<access_id>:<access_key>@<project>/?endpoint=<endpoint>')
conn = engine.connect()
  • access_id: the AccessKey ID that is used to access the MaxCompute project.

    You can obtain the AccessKey ID from the AccessKey Management page.

  • access_key: the AccessKey secret that corresponds to the AccessKey ID.

    You can obtain the AccessKey secret from the AccessKey Management page.

  • project: the name of the MaxCompute project that you want to access.

    This parameter specifies the name of your MaxCompute project instead of the DataWorks workspace to which the MaxCompute project corresponds. Log on to the MaxCompute console. In the top navigation bar, select a region and view the name of your MaxCompute project on the Project management tab.

  • endpoint: the endpoint of the region where your MaxCompute project resides.

    For more information about the endpoints of MaxCompute in different regions, see Endpoints.

For the existing ODPS object o, if you call o.to_global() to configure the object as a global object, you do not need to specify the preceding parameters in the connection string. Sample statements:
from sqlalchemy import create_engine
o.to_global()  # set ODPS object as global one
engine = create_engine('odps://')

Call the SQLAlchemy interface

After you connect SQLAlchemy to a MaxCompute project, you can call the SQLAlchemy interface. The following statements show how to create tables, insert data into tables, and query data from tables.
  • Create a table
    from sqlalchemy import Table, Column, Integer, String, MetaData
    metadata = MetaData()
    
    users = Table('users', metadata,
        Column('id', Integer),
        Column('name', String),
        Column('fullname', String),
    )
    
    metadata.create_all(engine)
  • Insert data
    ins = users.insert().values(id=1, name='jack', fullname='Jack Jones')
    conn.execute(ins)
  • Query data
    from sqlalchemy.sql import select
    s = select([users])
    result = conn.execute(s)
    for row in result:
    print(row)
    (1, 'jack', 'Jack Jones')