You can use the MaxCompute client (odpscmd) to access MaxCompute projects. This topic describes how to install, configure, and run odpscmd.
- Java 8 or later is installed on the device on which you want to install odpscmd. odpscmd is developed in Java, so a Java runtime is required.
- A project is created. For more information, see Create a project.
- If you are a RAM user and have not created a project, you have been added as a member to a project that belongs to an Alibaba Cloud account and have been assigned a role. For more information, see Add project members and configure roles.
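The Java prerequisite can be checked before installation. The following is a minimal sketch, not part of odpscmd: it assumes that `java` is on the PATH and that the version banner follows the common `version "1.8.0_281"` or `version "11.0.2"` patterns. The function names `parse_java_major` and `installed_java_major` are hypothetical helpers introduced for illustration.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Legacy scheme: "1.8.0_281" means Java 8. Modern scheme: "11.0.2" means Java 11.
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major():
    """Run `java -version` and parse its output; None if Java cannot be started."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except OSError:
        return None
    # The JVM usually prints the version banner to stderr, so both streams are scanned.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java was not found or its version could not be parsed.")
    elif major < 8:
        print(f"Java {major} found; odpscmd needs Java 8 or later.")
    else:
        print(f"Java {major} found; the Java prerequisite is met.")
```

If a vendor uses a different banner format, the script reports that the version could not be parsed rather than guessing.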
- MaxCompute client (odpscmd): For more information, see Client.
- MaxCompute Studio: supports the complete data development procedure, in which you can develop SQL, Java (UDF, MapReduce, and Graph), and Python scripts. odpscmd is integrated into MaxCompute Studio. We recommend that you use MaxCompute Studio for end-to-end data development. For more information, see What is Studio.
- DataWorks: In the DataWorks console, you can click Workspaces in the left-side navigation pane. On the Workspaces page, find your workspace and click Data Analytics in the Actions column to complete data development.
- Third-party tools: You can use IntelliJ IDEA to develop programs with SDK for Java and SDK for Python, use the MaxCompute JDBC driver with Tableau for visualized data analytics, and use SQL Workbench/J to execute SQL statements.
- Download the odpscmd package.
- Decompress the downloaded package to obtain the bin, conf, lib, and plugins folders.
- Edit the odps_config.ini file in the conf folder to configure odpscmd. Sample configurations in the odps_config.ini file:
    # Specify the name of the project that you want to access.
    project_name=<my_project>
    # access_id and access_key specify the AccessKey pair of your Alibaba Cloud account. To obtain the AccessKey pair, log on to the Alibaba Cloud Management console and view the AccessKey pair on the AccessKey Management page.
    access_id=*******************
    access_key=*********************
    # Specify the endpoint of MaxCompute. The endpoint is determined based on the region and network environment of your MaxCompute project.
    end_point=xxxxxxxxxxx
    # The Logview URL that odpscmd returns after a job is run. After you access the Logview URL, you can view detailed operational logs of the job. The URL is fixed.
    log_view_host=http://logview.odps.aliyun.com
    # Specify whether to enable HTTPS access.
    https_check=true
    # Specify the maximum size of input data. Unit: GB.
    data_size_confirm=100.0
    # Specify the URL to update odpscmd.
    update_url=http://repo.aliyun.com/odpscmd
    # Specify whether to download the execution results of SQL statements by using InstanceTunnel.
    use_instance_tunnel=true
    # Specify the maximum number of records in the execution results of SQL statements downloaded by using InstanceTunnel.
    instance_tunnel_max_record=10000
    # Specify the public endpoint of MaxCompute Tunnel. The endpoint is determined based on the region and network environment of your MaxCompute project.
    tunnel_endpoint=xxxxxxxxxxx
- We recommend that you specify end_point and tunnel_endpoint based on the region that you selected when you created the project. Otherwise, an access error may occur. For more information, see Configure endpoints.
- If two MaxCompute projects are associated with a DataWorks workspace in standard mode, take note of the name difference between the project in the production environment and the project in the development environment when you specify project_name. The name of a project in the development environment is suffixed with _dev. For more information, see Basic mode and standard mode.
- A number sign (#) is used to comment out a line in the odps_config.ini file. Two consecutive minus signs (--) are used to comment out a command line in odpscmd.
- MaxCompute is accessible over the Internet, the classic network, and a virtual private cloud (VPC). Your download fees are subject to the connection method that you use. If you do not specify the endpoint of MaxCompute Tunnel, MaxCompute Tunnel may be automatically routed to the Internet and you may be charged for downloads over the Internet. For more information, see Configure endpoints.
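Before you start odpscmd, it can help to confirm that odps_config.ini contains the settings described above. The following is a minimal sketch, not an official tool: it reads `key=value` lines, skips lines commented out with `#`, and reports empty values. Treating `project_name`, `access_id`, `access_key`, and `end_point` as mandatory is an assumption of this sketch, and `parse_odps_config` and `check_odps_config` are hypothetical helper names.

```python
# Assumption of this sketch: these four keys must be present and non-empty.
REQUIRED_KEYS = ("project_name", "access_id", "access_key", "end_point")


def parse_odps_config(text):
    """Parse key=value lines, skipping blank lines and lines commented out with #."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # Not a key=value line; ignored in this sketch.
        settings[key.strip()] = value.strip()
    return settings


def check_odps_config(settings):
    """Return a list of human-readable problems found in the parsed settings."""
    problems = []
    for key in REQUIRED_KEYS:
        if not settings.get(key):
            problems.append(f"{key} is missing or empty")
    if not settings.get("tunnel_endpoint"):
        problems.append("tunnel_endpoint is not set; Tunnel may be routed to the Internet")
    return problems


if __name__ == "__main__":
    # Hypothetical values for illustration only; replace with the content of conf/odps_config.ini.
    sample = """
    # Development project of a standard-mode workspace (name ends with _dev).
    project_name=my_project_dev
    access_id=placeholder_id
    access_key=placeholder_key
    end_point=
    """
    for problem in check_odps_config(parse_odps_config(sample)):
        print(problem)
```

The check only verifies that values are present. It does not verify that an endpoint matches the region or network environment of the project.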
- Run ./bin/odpscmd in Linux or ./bin/odpscmd.bat in Windows.
If the following information appears, odpscmd is running properly.