MaxCompute Lightning is an interactive query service that is provided by MaxCompute. It complies with the PostgreSQL protocol and syntax, so you can use familiar tools and standard SQL statements to query and analyze data in MaxCompute projects and obtain the analysis results.

You can use mainstream business intelligence (BI) tools, such as Tableau and FineReport, or SQL clients to access data in your MaxCompute projects for BI analysis or ad hoc queries. You can also use the query acceleration feature of MaxCompute Lightning to encapsulate your table data as APIs for external use. This way, you can use the data in a variety of scenarios without data migration.

MaxCompute Lightning provides serverless computing services. You do not need to manage infrastructure, and you are charged only for the queries that you run.

Key features

  • Compatibility with PostgreSQL

    MaxCompute Lightning provides Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) APIs that are compatible with the PostgreSQL protocol. These APIs allow tools or applications that support PostgreSQL databases to connect to MaxCompute projects by using default JDBC or ODBC drivers. You can also use PostgreSQL tools to analyze MaxCompute data.

  • Improved query performance

    The query acceleration feature of MaxCompute Lightning allows you to query data in MaxCompute tables more efficiently, especially when small datasets need to be scanned in parallel. For example, you can use this feature to generate reports in fixed formats or to expose data in MaxCompute tables as APIs.

  • Centralized permission management

    As a component of MaxCompute, MaxCompute Lightning accesses MaxCompute projects under the same permission system as the projects themselves. Users can query only the data that they are authorized to access.

  • Out-of-the-box use and pay-by-query billing

    MaxCompute Lightning provides serverless computing services, which are not based on existing MaxCompute computing resources. You can use MaxCompute Lightning to connect to MaxCompute projects and query data without the need to configure, manage, or maintain MaxCompute Lightning resources.

    When you use MaxCompute Lightning, you are charged based on the amount of data that is processed in each query. If you do not query data, you are not charged.
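
The pay-by-query model described above can be illustrated with a small calculation. The following is a minimal sketch, not the actual billing formula: the function names, the unit price, and the assumption that 1 GB equals 2^30 bytes are all hypothetical, and real bills may apply rules that are not described here.

```python
# Assumption: 1 GB = 2**30 bytes. The real billing unit may differ.
BYTES_PER_GB = 1024 ** 3


def estimate_query_cost(bytes_processed, price_per_gb):
    """Estimate the charge for one query from the amount of data it processed.

    price_per_gb is a hypothetical unit price supplied by the caller.
    """
    if bytes_processed < 0 or price_per_gb < 0:
        raise ValueError("bytes_processed and price_per_gb must be non-negative")
    return bytes_processed / BYTES_PER_GB * price_per_gb


def estimate_total_cost(bytes_per_query, price_per_gb):
    """Sum the estimated charge over several queries.

    An empty list (no queries run) yields a total of 0, which matches the
    statement that you are not charged if you do not query data.
    """
    return sum(estimate_query_cost(b, price_per_gb) for b in bytes_per_query)


if __name__ == "__main__":
    # Two queries that processed 1 GB and 3 GB at a placeholder price of 0.25 per GB.
    print(estimate_total_cost([1 * BYTES_PER_GB, 3 * BYTES_PER_GB], 0.25))
```

Passing the unit price as a parameter keeps the sketch independent of any particular price list.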

Architecture

MaxCompute Lightning provides an endpoint that clients and applications use to call the JDBC or ODBC API of MaxCompute Lightning through the PostgreSQL driver. This way, clients and applications can connect to MaxCompute projects and access project data under the unified permission system of MaxCompute projects.

Query tasks that are submitted by using the JDBC or ODBC API run on the serverless computing services of MaxCompute Lightning, not on the computing resources of your MaxCompute projects. This ensures query performance.
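
Because access goes through a PostgreSQL driver, a client typically needs only a PostgreSQL-style connection string that points at the MaxCompute Lightning endpoint. The following minimal sketch builds such strings with the Python standard library. The host `lightning.example.com`, the port, the use of the project name as the database name, and the `sslmode` parameter are illustrative assumptions; take the real endpoint, credentials, and required parameters from the service documentation.

```python
from urllib.parse import quote, urlencode


def build_postgres_url(host, port, database, user, password, **params):
    """Build a PostgreSQL connection URI of the form postgresql://user:pass@host:port/db.

    User, password, and database are percent-encoded so that special
    characters such as '/' or '@' do not break the URI.
    """
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    url = f"postgresql://{auth}@{host}:{port}/{quote(database, safe='')}"
    if params:
        url += "?" + urlencode(params)
    return url


def build_jdbc_url(host, port, database, **params):
    """Build a JDBC URL for the PostgreSQL JDBC driver.

    Credentials are normally passed to the driver separately, so they are
    not part of this URL.
    """
    url = f"jdbc:postgresql://{host}:{port}/{quote(database, safe='')}"
    if params:
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    # All values below are placeholders, not real endpoints or credentials.
    print(build_jdbc_url("lightning.example.com", 443, "my_project", sslmode="require"))
    print(build_postgres_url("lightning.example.com", 443, "my_project",
                             "my_user", "my_password", sslmode="require"))
```

The resulting strings can be passed to any tool or driver that accepts PostgreSQL connection strings, such as a BI tool or an SQL client.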

Scenarios

  • Ad hoc queries

    MaxCompute Lightning provides high performance for queries on datasets of up to 100 GB. You can therefore query data in MaxCompute tables directly and at low latency, without the need to import the data into other systems, such as AnalyticDB for MySQL or ApsaraDB RDS, for query acceleration. This saves resources and reduces costs.

    This scenario has the following characteristics: The data to query can be flexibly specified. The query logic is complex. Users want to obtain the query result and adjust the query logic within a short period of time. The expected query latency is within tens of seconds. Users are data analysts who are proficient in SQL and want to use familiar client tools to query and analyze data.

  • Report analysis

    You can use MaxCompute Lightning to generate analysis reports from MaxCompute project data that has been processed by extract, transform, load (ETL) jobs. The management team and related personnel can view the reports on a regular basis.

    This scenario has the following characteristics: The data to query is a small amount of aggregated data. The query logic is fixed and simple. Queries are latency-sensitive, and results must be returned in seconds. For example, most queries must return within 5 seconds. The actual time that each query takes depends on the size of the queried data and the complexity of the query logic.

  • Online application

    Data in MaxCompute projects is encapsulated as RESTful APIs for online applications to call.

    In this scenario, MaxCompute Lightning serves as the query acceleration engine and is used with DataService Studio of Alibaba Cloud DataWorks to expose the data in MaxCompute tables as APIs. No manual development or O&M is required in this process. For more information, see DataService Studio.
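
The latency expectations in the scenarios above can be checked from the client side. The following minimal sketch times a fixed aggregate query on any Python DB-API 2.0 connection and compares the result with the 5-second report budget mentioned in the text. The helper names and the `sales` table are hypothetical. The demo uses an in-memory SQLite database only so that it runs offline; for MaxCompute Lightning, the connection would instead come from a PostgreSQL driver such as psycopg2.

```python
import sqlite3
import time

# From the report analysis scenario: most queries should return within 5 seconds.
REPORT_LATENCY_BUDGET_S = 5.0


def timed_query(conn, sql, params=()):
    """Run one query on a DB-API 2.0 connection and return (rows, elapsed seconds)."""
    cur = conn.cursor()
    start = time.perf_counter()
    cur.execute(sql, params)
    rows = cur.fetchall()
    elapsed = time.perf_counter() - start
    cur.close()
    return rows, elapsed


def within_budget(elapsed, budget=REPORT_LATENCY_BUDGET_S):
    """Return True if the measured latency meets the budget."""
    return elapsed <= budget


def demo():
    """Offline stand-in: a fixed, simple aggregate query of the kind a report would issue."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10), ("west", 5), ("east", 7)],
    )
    rows, elapsed = timed_query(
        conn,
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region",
    )
    conn.close()
    return rows, within_budget(elapsed)


if __name__ == "__main__":
    print(demo())
```

Measuring on the client includes network and driver overhead, which is the latency that a report viewer actually experiences.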

Limits

For more information about the limits on Data Definition Language (DDL) and Data Manipulation Language (DML) statements, queries, and user-defined functions (UDFs), see Limits.