Before you process data in a MaxCompute project, you must select development tools and prepare the required environment based on your business requirements. This topic describes how to prepare an environment and install the required development tools.
The following table describes the development tools supported by MaxCompute.
|Development tool||Manual installation required||Scenario|
|Query editor (MaxCompute console)||No||
|MaxCompute client||Yes||The MaxCompute client is a command-line tool. It is suitable for all scenarios and lets you run commands to process data.|
|DataWorks||No||DataWorks implements comprehensive features, such as data development, data integration, and data services in a visual manner based on MaxCompute projects. If you need to periodically schedule jobs, we recommend that you use DataWorks.|
|MaxCompute Studio||Yes||MaxCompute Studio is a development plug-in that is based on IntelliJ IDEA. MaxCompute Studio helps you develop data more easily and quickly. If you are familiar with IntelliJ IDEA, we recommend that you use MaxCompute Studio.|
Prepare an environment
The following table describes the environment requirements of the preceding development tools.
|Development tool||Environment requirement|
|Query editor (MaxCompute console)||We recommend that you use the latest version of Google Chrome.|
|MaxCompute client||You must install Java 8 or later.|
|DataWorks||We recommend that you use the latest version of Google Chrome.|
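The Java requirement for the MaxCompute client can be checked with a short script before you install the client. The following Python sketch is illustrative only and is not part of MaxCompute. It assumes that `java -version` reports a line such as `java version "1.8.0_292"` or `openjdk version "11.0.2"`, which is the common format but may differ between Java distributions. The function names `parse_java_major` and `installed_java_major` are hypothetical.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output.

    Returns None if no version string is found. Legacy version strings
    such as "1.8.0_292" are mapped to their second component (8).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major():
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        # The java executable is not on the PATH.
        return None
    # Most JVMs print the version banner to stderr; check both streams to be safe.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java was not found. Install Java 8 or later.")
    elif major < 8:
        print(f"Java {major} is installed. Java 8 or later is required.")
    else:
        print(f"Java {major} is installed and meets the requirement.")
```

The parsing is kept separate from the subprocess call so that the version logic can be checked without a Java installation.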
Install and configure the MaxCompute client
To install and configure the MaxCompute client, perform the following steps:
- Download the MaxCompute client installation package.
- Decompress the downloaded package to obtain the bin, conf, lib, and plugins folders.
- Open the conf folder and configure the odps_config.ini file. The following example shows the content in the odps_config.ini file.
    project_name=
    access_id=
    access_key=
    end_point=
    log_view_host=
    https_check=
    # confirm threshold for query input size(unit: GB)
    data_size_confirm=
    # this url is for odpscmd update
    update_url=
    # download sql results by instance tunnel
    use_instance_tunnel=
    # the max records when download sql results by instance tunnel
    instance_tunnel_max_record=
    # IMPORTANT:
    # If leaving tunnel_endpoint untouched, console will try to automatically get one from odps service, which might charge networking fees in some cases.
    # Please refer to Configure endpoints
    # tunnel_endpoint=
    # use set.<key>=
    # e.g. set.odps.sql.select.output.format=
In the odps_config.ini file, lines that start with a number sign (#) are comments. The following table describes the parameters in the file.
|Parameter||Required||Description||Example|
|project_name||Yes||The name of the MaxCompute project that you want to access. If you create a workspace in standard mode, pay attention to the differences between the project names in the production environment and the development environment when you specify this parameter. The names of the projects in the development environment end with _dev. For more information, see Basic mode and standard mode. You can log on to the MaxCompute console and view the MaxCompute project names on the Project Management tab.||doc_test_dev|
|access_id||Yes||The AccessKey ID of your Alibaba Cloud account or a RAM user within the Alibaba Cloud account. You can obtain the AccessKey ID from the Security Management page.||None|
|access_key||Yes||The AccessKey secret that corresponds to the AccessKey ID. You can obtain the AccessKey secret from the Security Management page.||None|
|end_point||Yes||The endpoint of MaxCompute. You must set this parameter based on the region and network connection method that you selected when you created the MaxCompute project. For more information about the endpoints that correspond to each region and network, see Endpoints. Notice: If the endpoint that you configured is invalid, an error occurs when you access MaxCompute.||http://service.cn-hangzhou.maxcompute.aliyun.com/api|
|log_view_host||No||The Logview Uniform Resource Locator (URL). You can view the detailed runtime information of a job by using this URL. This information helps you locate job errors. Set the value to http://logview.odps.aliyun.com. Note: We recommend that you set this parameter. If you do not set this parameter, you cannot locate the cause of job errors.||http://logview.odps.aliyun.com|
|https_check||No||Specifies whether to enable HTTPS access. If HTTPS access is enabled, requests to access MaxCompute projects are encrypted. Valid values: True (HTTPS access is used) and False (HTTP access is used). Default value: False.||True|
|data_size_confirm||No||The maximum size of input data, in GB. The value range is unlimited. We recommend that you set this parameter to 100.||100|
|update_url||No||A reserved parameter.||None|
|use_instance_tunnel||No||Specifies whether to use InstanceTunnel to download the results of SQL statements. Valid values: True (InstanceTunnel is used to download the results of SQL statements) and False (InstanceTunnel is not used to download the results of SQL statements). Default value: False.||True|
|instance_tunnel_max_record||No||The maximum number of SQL execution results that can be returned by the client. You must specify this parameter if the use_instance_tunnel parameter is set to True. Maximum value: 10000.||10000|
|tunnel_endpoint||No||The public endpoint of MaxCompute Tunnel. If you do not specify this parameter, traffic is automatically routed to the Tunnel endpoint that corresponds to the network where MaxCompute resides. If you specify this parameter, traffic is routed to the specified endpoint and automatic routing is not performed. For more information about the Tunnel endpoints that correspond to each region and network, see Endpoints.||http://dt.cn-hangzhou.maxcompute.aliyun.com|
|set.<key>||No||The properties of the MaxCompute project. For more information about the properties of MaxCompute projects, see Properties.||
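Before you start the client, it can help to confirm that every required parameter in odps_config.ini has a value. The following Python sketch is one way to do that and is not a MaxCompute utility. It assumes the file uses plain `key=value` lines with `#` comments, as in the example content of this topic. The names `REQUIRED`, `parse_odps_config`, and `missing_required` are hypothetical, and the list of required parameters is taken from the parameter descriptions in this topic.

```python
# Parameters that this topic marks as required.
REQUIRED = ("project_name", "access_id", "access_key", "end_point")


def parse_odps_config(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            # Split on the first "=" only, so values may contain "=" themselves.
            params[key.strip()] = value.strip()
    return params


def missing_required(params):
    """Return the required parameters that are absent or have an empty value."""
    return [name for name in REQUIRED if not params.get(name)]


if __name__ == "__main__":
    # Example values are taken from the Example column; credentials are left empty.
    sample = "\n".join([
        "project_name=doc_test_dev",
        "access_id=",
        "access_key=",
        "end_point=http://service.cn-hangzhou.maxcompute.aliyun.com/api",
        "# tunnel_endpoint=",
    ])
    print("Missing required parameters:", missing_required(parse_odps_config(sample)))
```

To check a real file, read its content with `open(path, encoding="utf-8").read()` and pass the text to `parse_odps_config`.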
Install and configure MaxCompute Studio
To install and configure MaxCompute Studio, perform the following steps:
- Install IntelliJ IDEA: MaxCompute Studio is a plug-in that is integrated with IntelliJ IDEA. To install MaxCompute Studio, you must install IntelliJ IDEA first.
- Install MaxCompute Studio: Install the MaxCompute Studio plug-in in IntelliJ IDEA.
- Configure MaxCompute Studio: Set the configuration items of MaxCompute Studio.
- Connect to a MaxCompute project: After you connect to a MaxCompute project by using MaxCompute Studio, you can view the information of the MaxCompute project in MaxCompute Studio.
What to do next
- If you use the query editor to process data, see Query editor for more information.
- If you use the MaxCompute client to process data, see MaxCompute client for more information.
- If you use DataWorks to process data, perform the operations by following the instructions provided in Quick start of DataWorks.
- If you use MaxCompute Studio to process data, perform the operations by following the instructions provided in MaxCompute Studio.