This topic describes the configuration options provided by PyODPS.

You can use odps.options to read and modify the configuration options of PyODPS, as shown in the following example:
from odps import options
# Set the lifecycle option to specify the lifecycle of all output tables.
options.lifecycle = 30
# Set the tunnel.string_as_binary option to True to download data of the STRING type as bytes instead of Unicode.
options.tunnel.string_as_binary = True
# When you execute PyODPS DataFrames in MaxCompute, set the limit that is applied during a sort operation to a relatively large value.
options.df.odps.sort.limit = 100000000

General configurations

| Option | Description | Default value |
| --- | --- | --- |
| end_point | The endpoint of MaxCompute. | None |
| default_project | The default project. | None |
| log_view_host | The hostname of Logview. | None |
| log_view_hours | The retention time of Logview. Unit: hours. | 24 |
| local_timezone | The time zone to use. True indicates local time, False indicates UTC, and a pytz time zone can also be specified. | None |
| lifecycle | The lifecycle of all tables. | None |
| temp_lifecycle | The lifecycle of temporary tables. | 1 |
| biz_id | The user ID. | None |
| verbose | Specifies whether to display logs. | False |
| verbose_log | The log receiver. | None |
| chunk_size | The size of the write buffer. | 1496 |
| retry_times | The number of request retries. | 4 |
| pool_connections | The number of cached connections in the connection pool. | 10 |
| pool_maxsize | The maximum capacity of the connection pool. | 10 |
| connect_timeout | The connection timeout period. | 5 |
| read_timeout | The read timeout period. | 120 |
| api_proxy | The API proxy server. | None |
| data_proxy | The data proxy server. | None |
| completion_size | The limit on the number of items for object completion listing. | 10 |
| notebook_repr_widget | Specifies whether to use interactive graphs. | True |
| sql.settings | Global hints for MaxCompute SQL. | None |
| sql.use_odps2_extension | Specifies whether to enable MaxCompute 2.0 language extensions. | False |
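
The general options are plain attributes on the options object and can be set at any time after the import. The following is a minimal sketch; the timeout values and the hint name odps.sql.mapper.split.size are illustrative assumptions, not recommended settings.
from odps import options
# Allow slower connections and longer-running reads (example values).
options.connect_timeout = 10
options.read_timeout = 300
# Attach global hints to every MaxCompute SQL statement that PyODPS submits;
# the hint below is only an example.
options.sql.settings = {'odps.sql.mapper.split.size': 16}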

Data upload and download configurations

| Option | Description | Default value |
| --- | --- | --- |
| tunnel.endpoint | The endpoint of MaxCompute Tunnel. | None |
| tunnel.use_instance_tunnel | Specifies whether to use InstanceTunnel to obtain execution results. | True |
| tunnel.limit_instance_tunnel | Specifies whether to limit the number of data records obtained by using InstanceTunnel. | None |
| tunnel.string_as_binary | Specifies whether to use bytes instead of Unicode for data of the STRING type. | False |
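
The tunnel options are set in the same way. In the following sketch, the endpoint URL is a placeholder, and disabling the InstanceTunnel record limit is shown only as an illustration.
from odps import options
# Route data transfers through a specific Tunnel endpoint (placeholder URL).
options.tunnel.endpoint = 'http://dt.example-region.maxcompute.example.com'
# Obtain execution results through InstanceTunnel without limiting the number of records.
options.tunnel.use_instance_tunnel = True
options.tunnel.limit_instance_tunnel = False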

DataFrame configurations

| Option | Description | Default value |
| --- | --- | --- |
| interactive | Specifies whether DataFrames are used in an interactive environment. | Depends on the detection result. |
| df.analyze | Specifies whether to enable functions that are not built into MaxCompute. | True |
| df.optimize | Specifies whether to enable full DataFrame optimization. | True |
| df.optimizes.pp | Specifies whether to enable DataFrame predicate pushdown optimization. | True |
| df.optimizes.cp | Specifies whether to enable DataFrame column pruning optimization. | True |
| df.optimizes.tunnel | Specifies whether to enable DataFrame tunnel optimization. | True |
| df.quote | Specifies whether to use backquotes (``) to quote field and table names in the MaxCompute SQL that is generated at the backend. | True |
| df.libraries | The resource names of the third-party libraries that are used in DataFrame operations. | None |
| df.supersede_libraries | Specifies whether to use the self-uploaded NumPy library to replace the version provided by the service. | False |
| df.odps.sort.limit | The default limit on the number of records that is appended to a DataFrame sort operation. | 10000 |
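
The following sketch adjusts a few DataFrame options; the library resource names and the larger sort limit are illustrative assumptions.
from odps import options
# Use third-party packages that have been uploaded as MaxCompute resources
# (the archive names below are placeholders).
options.df.libraries = ['six.whl', 'python_dateutil.whl']
# Allow a sort operation to return far more records than the default 10000.
options.df.odps.sort.limit = 100000000
# Disable predicate pushdown, for example when debugging the generated SQL.
options.df.optimizes.pp = False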

Machine learning configurations

| Option | Description | Default value |
| --- | --- | --- |
| ml.xflow_settings | The XFlow execution configuration. | None |
| ml.xflow_project | The default XFlow project name. | algo_public |
| ml.use_model_transfer | Specifies whether to use ModelTransfer to obtain the Predictive Model Markup Language (PMML) files of models. | False |
| ml.model_volume | The name of the volume that is used by ModelTransfer. | pyodps_volume |
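
The machine learning options follow the same pattern. The project and volume names in the following sketch are placeholders, not values required by PyODPS.
from odps import options
# Run XFlow algorithms from a project other than the default algo_public.
options.ml.xflow_project = 'my_algo_project'
# Use ModelTransfer to obtain PMML files of models, storing intermediate data on a dedicated volume.
options.ml.use_model_transfer = True
options.ml.model_volume = 'my_model_volume'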