MaxCompute is a fast and fully managed data warehouse that can process terabytes, petabytes, or even exabytes of data. This topic describes the open source features of MaxCompute.

SDK

MaxCompute provides an SDK for Java and an SDK for Python. You can use them to create, view, and delete MaxCompute tables and to manage MaxCompute from your own code.
  • Java SDK

    For more information about how to use SDK for Java, see SDK for Java.

    How to obtain service support: See the official documentation or submit a ticket.

  • Python SDK

    PyODPS is the MaxCompute SDK for Python. It provides the DataFrame framework and lets you perform basic operations on MaxCompute objects, which helps you analyze data in MaxCompute. For more information, see aliyun-odps-python-sdk on GitHub and the PyODPS documentation, which describes all related interfaces and classes in detail.

    How to obtain service support: See the official documentation or submit a ticket.
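As a rough illustration of the create, view, and delete operations mentioned above, the following sketch uses PyODPS. The `connect` and `table_lifecycle` helpers, the table name `demo_table`, and the two-column schema are illustrative assumptions and are not part of this topic. The AccessKey pair, project name, and endpoint are placeholders that you must supply yourself.

```python
def connect(access_id, secret_access_key, project, endpoint):
    """Return a PyODPS entry object (requires `pip install pyodps`)."""
    # Imported lazily so that table_lifecycle() can be exercised without PyODPS installed.
    from odps import ODPS
    return ODPS(access_id, secret_access_key, project, endpoint=endpoint)


def table_lifecycle(o, name="demo_table"):
    """Create a table, check that it exists, delete it, and check again.

    `o` is assumed to be a PyODPS entry object such as the one returned by connect().
    Returns a tuple: (exists_after_create, exists_after_delete).
    """
    # Hypothetical schema, given as a "name type" string.
    o.create_table(name, "id bigint, name string", if_not_exists=True)
    exists_after_create = o.exist_table(name)

    o.delete_table(name, if_exists=True)
    exists_after_delete = o.exist_table(name)

    return exists_after_create, exists_after_delete
```

With valid credentials and sufficient permissions, `table_lifecycle(connect(access_id, secret_access_key, project, endpoint))` would be expected to return `(True, False)`: the table exists after it is created and no longer exists after it is deleted.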

MaxCompute RODPS

RODPS is a plug-in that MaxCompute provides for R. For more information, see ODPS Plugin for R on GitHub.

How to obtain service support: Leave a message or create an issue in ODPS Plugin for R on GitHub.

MaxCompute JDBC

MaxCompute JDBC is the official JDBC driver provided by MaxCompute. It gives Java programs a set of interfaces for running SQL tasks on MaxCompute. The project is hosted in ODPS JDBC on GitHub.

How to obtain service support: Leave a message or create an issue in ODPS JDBC on GitHub.

Mars

Mars is a tensor-based unified distributed computing framework. With Mars, you can write a large-scale scientific computing task in a few lines of code, whereas an equivalent MapReduce implementation can require hundreds of lines. Mars also improves computing performance.
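To give a sense of what "a few lines of code" can look like, here is a minimal sketch based on the Mars tensor API, which mirrors NumPy. The `row_sums` helper and its `mt` parameter are illustrative assumptions. The parameter exists only so that a stand-in module can be passed in place of Mars. Behavior differs between Mars versions: some releases may require you to create a session first (for example with `mars.new_session()`), and `execute()` may return either the computed values or a handle to them.

```python
def row_sums(rows, cols, mt=None):
    """Build a random rows x cols tensor, add 1 to every element, and sum each row."""
    if mt is None:
        # Requires Mars to be installed (the package is published as `pymars`).
        import mars.tensor as mt

    a = mt.random.rand(rows, cols)  # lazy: nothing is computed yet
    # execute() triggers the computation. Depending on the Mars version, it
    # returns the computed values or an object from which they can be fetched.
    return (a + 1).sum(axis=1).execute()
```

The expression reads like ordinary NumPy code, and Mars is responsible for splitting the tensor into chunks and scheduling the work.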

The source code of Mars is available in Mars on GitHub, and contributions are welcome.

For more information about Mars, see Mars Documentation.

How to obtain service support: Leave a message or create an issue in Mars on GitHub.

Data collector

MaxCompute provides a set of open source data collectors for the following services:
  • Flume
  • Oracle GoldenGate (OGG)
  • Sqoop
  • Kettle
  • Hive Data Transfer UDTF

The Flume and OGG data collectors are built on the DataHub SDK. DataHub is a real-time data transfer channel, so these collectors transfer data in real time. The data collectors for Sqoop, Kettle, and Hive Data Transfer UDTF are built on the Tunnel SDK. Tunnel is a batch data transfer channel, so these collectors transfer data in batches in offline mode.

For the source code, see Aliyun MaxCompute Data Collectors on GitHub. For more information about the data collectors, see the wiki.

How to obtain service support: Leave a message or create an issue in Aliyun MaxCompute Data Collectors on GitHub.