This topic describes how to create a foreign table in Hologres to achieve accelerated queries of MaxCompute data.
Background information
MaxCompute is a fast and fully managed computing platform for large-scale data warehousing. It can process exabytes of data. MaxCompute is used to store and compute large amounts of structured data at a time. It provides various data warehousing solutions as well as big data analytics and modeling services.
Hologres is a real-time interactive analytics engine that is compatible with PostgreSQL. Hologres seamlessly integrates with MaxCompute storage resources. In Hologres, you can perform accelerated queries of MaxCompute data by creating foreign tables, without the need to store data in Hologres or import data to and export data from Hologres. You can use Hologres to connect to mainstream business intelligence (BI) tools.
You can also import data to Hologres before you query the data. Compared with other similar services that are not oriented for the big data ecosystem, Hologres offers higher performance and faster data import and export.
- Query MaxCompute data in Hologres.
This method is suitable for scenarios in which the amount of underlying data to be scanned in a query is less than 200 GB, and the number of partitions to be queried at a time is less than 512.Note The amount of underlying data to be scanned in a query is the volume of data in the partitions that hit the filter conditions, not the size of queried fields. The limit on the amount of underlying data to be scanned in a query ensures the stability of data queries. If you perform accelerated queries of MaxCompute data by creating foreign tables, Hologres loads the MaxCompute data in the queried partitions to its memory and cache. If the amount of underlying data to be scanned is too large, a lot of bandwidth and computing resources are occupied, which affects the performance of concurrent queries.
- Import MaxCompute data to Hologres before you query the data.
This method has no limit on the amount of underlying data to be scanned in a query. The method is suitable for the following scenarios: complex queries, index-based queries, and queries that involve UPDATE, INSERT, and DELETE operations.
Usage notes
When you create a foreign table in Hologres to accelerate queries of MaxCompute data, take note of the following items:
- You can accelerate queries of only internal tables in MaxCompute. Queries of foreign tables and views in MaxCompute cannot be accelerated.
- The amount of underlying data to be scanned in a query cannot exceed 200 GB, and the number of partitions to be queried at a time cannot exceed 512. However, if you import MaxCompute data to Hologres internal tables before you query the data, the amount of underlying data to be scanned and the number of partitions to be queried are not limited.
- After data in a MaxCompute table is updated, the cache in Hologres is automatically updated within 5 minutes. Before that, you cannot accelerate queries of the updated data. If you want to query the updated data in real time, execute the IMPORT FOREIGN SCHEMA statement to update the metadata of the foreign table. If you import data from the MaxCompute table to Hologres V1.1.25 or later, you do not need to manually update the metadata of the foreign table. After you execute the statement used to import data, Hologres automatically obtains the updated metadata of the MaxCompute table.
- If the schema of a MaxCompute table is updated, the cache in Hologres is not automatically updated. You must manually update the foreign table.
- Partition fields in MaxCompute tables can be mapped to partition fields or regular fields in Hologres tables. Therefore, you can import data from a partitioned MaxCompute table to a partitioned or non-partitioned Hologres table.
- Data types that are supported by MaxCompute and those supported by Hologres have a one-to-one mapping relationship. You can create tables based on the mappings between data types. For more information, see Data types.
- You can query MaxCompute data across regions. However, if your Hologres instance resides in China, you can accelerate only the queries of data in MaxCompute tables that reside in China. In addition, if your Hologres instance resides outside China, you cannot accelerate queries of MaxCompute data across regions. We recommend that you do not accelerate queries of MaxCompute data across regions. We recommend that you create a foreign table in the region in which a MaxCompute table resides. Otherwise, the stability of data queries cannot be ensured because of possible network errors.
- Foreign tables do not store data and are used only to map the source data. The source data is stored in MaxCompute tables.
- MaxCompute transactional tables and tables in the schema evolution state are not supported.
- When you use a streaming tunnel to write data to a MaxCompute table, the table is in the streaming state. The data in the table is asynchronously compacted and converted to the Optimized Row Columnar (ORC) format. Hologres V1.1.45 and later can read tables in the streaming state. Hologres of earlier versions cannot read tables whose compaction is not complete.
Query data in a non-partitioned MaxCompute table
Query data in a partitioned MaxCompute table
Create multiple foreign tables at a time
To accelerate queries of a MaxCompute table that contains a large amount of data, you can create multiple foreign tables at a time. In Hologres, you can create multiple foreign tables at a time by executing SQL statements or by using the Hologres console.
- Hologres allows you to execute the
IMPORT FOREIGN SCHEMA
statement to create multiple foreign tables at a time. For more information, see IMPORT FOREIGN SCHEMA. - You can also create multiple foreign tables at a time by using HoloWeb. For more information, see Create multiple foreign tables at a time.
Create a foreign table in the HoloWeb console
You can use HoloWeb to create a foreign table and query table data in a visualized way, without the need to write SQL statements. To do so, perform the following steps:
Optimize the performance of querying MaxCompute tables in Hologres
If the performance of querying MaxCompute tables in Hologres does not meet your requirements, you can merge small files in MaxCompute or optimize Hologres SQL statements. Hologres V0.10 and later adopt a new engine to query MaxCompute data. Compared with the versions earlier than V0.10, Hologres V0.10 and later improve query performance by about 30% to 100%. For more information, see Optimize the performance of querying MaxCompute tables in Hologres.
FAQ
For more information about the answers to frequently asked questions about the feature of creating a foreign table in Hologres to accelerate queries of MaxCompute data, see Common errors and troubleshooting methods of integration with MaxCompute.