PyODPS is integrated with SQLAlchemy. You can use SQLAlchemy to query data in MaxCompute. This topic describes how to connect SQLAlchemy to a MaxCompute project and call the SQLAlchemy interface.
Connect SQLAlchemy to a MaxCompute project
from sqlalchemy import create_engine
engine = create_engine('odps://<access_id>:<access_key>@<project>/?endpoint=<endpoint>')
conn = engine.connect()
- access_id: the AccessKey ID that is used to access the MaxCompute project.
You can obtain the AccessKey ID from the AccessKey Management page.
- access_key: the AccessKey secret that corresponds to the AccessKey ID.
You can obtain the AccessKey secret from the AccessKey Management page.
- project: the name of the MaxCompute project that you want to access.
This parameter specifies the name of your MaxCompute project instead of the DataWorks workspace to which the MaxCompute project corresponds. Log on to the MaxCompute console. In the top navigation bar, select a region and view the name of your MaxCompute project on the Project management tab.
- endpoint: the endpoint of the region where your MaxCompute project resides.
For more information about the endpoints of MaxCompute in different regions, see Endpoints.
o.to_global()
to configure the object as a global object, you do not need to specify the preceding
parameters in the connection string. Sample statements: from sqlalchemy import create_engine
o.to_global() # set ODPS object as global one
engine = create_engine('odps://')
Call the SQLAlchemy interface
- Create a table
from sqlalchemy import Table, Column, Integer, String, MetaData metadata = MetaData() users = Table('users', metadata, Column('id', Integer), Column('name', String), Column('fullname', String), ) metadata.create_all(engine)
- Insert data
ins = users.insert().values(id=1, name='jack', fullname='Jack Jones') conn.execute(ins)
- Query data
from sqlalchemy.sql import select s = select([users]) result = conn.execute(s) for row in result: print(row) (1, 'jack', 'Jack Jones')