This topic recommends what to read based on your role.
MaxCompute beginners
Module | Description |
---|---|
Product Introduction | Provides an overview of MaxCompute and describes its features. This topic gives you a general understanding of MaxCompute. |
Quick Start | Describes how to apply for an account, install the client, create a table, grant permissions, and import and export data. It also describes how to run SQL jobs, user-defined functions (UDFs), and MapReduce jobs. |
Terms | Introduces the basic terms of MaxCompute. |
Commonly used commands | Describes the commonly used commands in MaxCompute. This topic helps you familiarize yourself with operations on MaxCompute. |
Tools and Downloads | Before you analyze data, you must understand how to download, configure, and use the commonly used tools in MaxCompute. |
Client | You can use the client to perform operations on MaxCompute. |
Configure endpoints | Describes the regions in which MaxCompute is available, MaxCompute connection methods, and issues that arise from its use with other Alibaba Cloud services, such as Elastic Compute Service (ECS), Tablestore, and Object Storage Service (OSS). These issues include network connectivity issues and issues related to data download charges. |
Data analysts
MaxCompute SQL provides the following capabilities:
- Supports data definition language (DDL) statements.
- Uses CREATE, DROP, and ALTER statements to manage both tables and partitions.
- Uses the SELECT statement to query data records in a table and the WHERE clause to filter for data records that meet specific conditions.
- Joins two tables by using equi-joins.
- Uses the GROUP BY clause to aggregate columns.
- Uses the INSERT OVERWRITE or INSERT INTO statement to insert data records into another table.
- Uses built-in functions and UDFs to complete a variety of computations.
- Collects table statistics and configures table lifecycles.
- Supports regular expressions.
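
The statement types listed above can be tried out in miniature before you work against a real project. The following is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in engine, not MaxCompute itself. The table names, columns, and sample rows are hypothetical, and MaxCompute SQL differs in places (for example, partitions, lifecycles, and `INSERT OVERWRITE` have no direct SQLite equivalent).

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: create two source tables and one result table (hypothetical schema).
cur.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
CREATE TABLE customers (customer_id INTEGER, region TEXT);
CREATE TABLE region_totals (region TEXT, total REAL);
""")

# Load a few sample rows.
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 10, 20.0), (2, 10, 5.0), (3, 11, 7.5), (4, 12, 1.0)])
cur.executemany("INSERT INTO customers VALUES (?, ?)",
                [(10, "east"), (11, "west"), (12, "west")])

# SELECT with a WHERE clause: keep only records that meet a condition.
large_orders = cur.execute(
    "SELECT order_id FROM orders WHERE amount > 6 ORDER BY order_id"
).fetchall()

# Equi-join two tables, aggregate with GROUP BY, and write the result
# into another table with INSERT INTO ... SELECT.
cur.execute("""
INSERT INTO region_totals
SELECT c.region, SUM(o.amount)
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
GROUP BY c.region
""")
totals = dict(cur.execute("SELECT region, total FROM region_totals").fetchall())

print(large_orders)
print(totals)
```

The same join, aggregate, and insert pattern is the one the SQL module describes; consult the MaxCompute SQL topics for the exact syntax that MaxCompute accepts.
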
Users with development experience
Module | Description |
---|---|
MapReduce | MaxCompute provides the MapReduce programming model for Java. You can use the Java API provided by MapReduce to write MapReduce programs and process MaxCompute data. |
Graph | Graph is a processing framework for iterative graph computing. A graph consists of vertices and edges, both of which contain values. MaxCompute Graph iteratively edits and evolves graphs to obtain analysis results. |
Tunnel | MaxCompute Tunnel enables you to upload or download large amounts of data to or from MaxCompute at a time. |
Java SDK | A Java API is provided for developers. |
Python SDK | A Python API is provided for developers. |
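
If the MapReduce programming model is new to you, the following sketch may help before you read the MapReduce module. It is plain Python that imitates the map, shuffle, and reduce stages for a word count. It does not use the MaxCompute Java API, and every function name in it (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) is hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run the three stages in sequence on an in-memory list of records."""
    pairs = [kv for record in records for kv in map_phase(record)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = run_job(["a b a", "B c"])
print(counts)
```

In a real MapReduce job, the framework distributes the map and reduce stages across many workers and performs the shuffle for you; this sketch runs everything in one process only to show the data flow.
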
Project owners or administrators
Module | Feature | Description |
---|---|---|
Project management | Prepare for project creation | A project is the basic organizational unit of MaxCompute. Similar to a database or schema in a traditional database system, a project is used to isolate users and control access requests. A user can have permissions on multiple projects. After a user is authorized to access multiple projects, the user can access objects across those projects, such as tables, resources, functions, and instances. MaxCompute manages the various objects in projects. Complete the required preparations before you create a project. |
Project management | Create a project | For more information, see Create a project. |
Project management | Manage project members | Members are managed from the perspectives of responsibilities and data security. If you use MaxCompute in the DataWorks console, you must understand the relationship between the permissions for the two services. |
Project management | Manage RAM users | You can manage MaxCompute projects by using your Alibaba Cloud account or a RAM user. You can add RAM users under your Alibaba Cloud account to a MaxCompute project. However, MaxCompute does not authenticate these RAM users based on the permissions that are granted to them in Resource Access Management (RAM). For more information about RAM users, see Prepare a RAM user. If you manage MaxCompute projects and DataWorks workspaces in the DataWorks console, you can add only RAM users under your Alibaba Cloud account as members. Therefore, you must use your Alibaba Cloud account to create and manage these RAM users. |
Project management | Manage scheduling resources | DataWorks scheduling resources are used to execute or distribute the tasks that are delivered by the scheduling system. These resources are categorized into several types. For more information, see View a resource group list. |
Project management | Configure projects | Only the owner of a project has the permissions to configure the project. For example, the project owner can specify whether to enable full table scan and whether to use the MaxCompute V2.0 data type edition for a project by default. For more information, see Project operations. |
Cost management | None | Budgets for resources help you estimate costs before you use the resources. It is difficult to estimate the precise costs due to the billing methods of MaxCompute. You must manage costs during the entire business development process. |