StarRocks 3.1 or later supports Paimon catalogs. A Paimon catalog is an external catalog. You can use a Paimon catalog to query data in Paimon. This topic describes how to create a Paimon catalog in an E-MapReduce (EMR) StarRocks cluster and use the Paimon catalog to query data in Paimon.
Prerequisites
A cluster that contains the Paimon service, such as a DataLake cluster or a custom cluster, is created. For more information, see Create a cluster.
A cluster that contains the StarRocks service, such as an online analytical processing (OLAP) cluster or a custom cluster, is created, and you have logged on to the cluster. For more information, see Create a cluster and Getting started.
Limits
The preceding clusters must be deployed in the same virtual private cloud (VPC) and zone.
Create a Paimon catalog
Syntax
CREATE EXTERNAL CATALOG <catalog_name>
PROPERTIES
(
"key"="value",
...
);
Parameter description
catalog_name: the name of the Paimon catalog. This parameter is required. The name must meet the following requirements:
The name can contain letters, digits, and underscores (_). It must start with a letter.
The name must be 1 to 64 characters in length.
PROPERTIES: the properties of the Paimon catalog. This parameter is required.
Note: The Paimon catalogs of StarRocks have a one-to-one mapping relationship with the catalogs in the native Paimon API. The names and meanings of configuration items for the two types of catalogs are the same.
| Property | Required | Description |
| --- | --- | --- |
| type | Yes | The type of the data source. Set the value to paimon. |
| paimon.catalog.type | Yes | The metadata storage type that is used by Paimon. Valid values: hive (use the Hive metastore to store metadata), filesystem (use a file system to store metadata), and dlf (use Data Lake Formation (DLF) to store metadata). |
| paimon.catalog.warehouse | Yes | The path where the warehouse resides. HDFS and OSS paths are supported. |
| hive.metastore.uris | No | The Uniform Resource Identifier (URI) of the Hive metastore. This parameter is required if you set paimon.catalog.type to hive. Specify the value in the following format: thrift://<IP address of the Hive metastore>:<port number>. The default port number is 9083. |
| aliyun.oss.endpoint | No | The endpoint of OSS. This parameter is required if you set the paimon.catalog.warehouse parameter to an OSS path. |
| dlf.catalog.id | No | The ID of the DLF data catalog. This parameter applies only if you set the paimon.catalog.type parameter to dlf. If you do not configure the dlf.catalog.id parameter, the default DLF catalog is used. |
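As an illustration of how these properties combine, the following is a minimal sketch of a catalog that stores metadata in a Hive metastore and keeps the warehouse in OSS. The catalog name paimon_hive_catalog and the placeholder <yourOSSEndpoint> are hypothetical and are not taken from this topic. Replace every placeholder with values from your own environment.

```sql
-- Sketch only: the catalog name and all placeholder values are assumptions.
CREATE EXTERNAL CATALOG paimon_hive_catalog
PROPERTIES
(
    "type" = "paimon",
    -- Store Paimon metadata in the Hive metastore.
    "paimon.catalog.type" = "hive",
    "paimon.catalog.warehouse" = "oss://<yourBucketName>/<yourPath>/",
    -- Required because paimon.catalog.type is set to hive. 9083 is the default port.
    "hive.metastore.uris" = "thrift://<IP address of the Hive metastore>:9083",
    -- Required because the warehouse is an OSS path.
    "aliyun.oss.endpoint" = "<yourOSSEndpoint>"
);
```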
Example
Execute the following statement to create a Paimon catalog named paimon_catalog. The paimon.catalog.type parameter is set to dlf.
CREATE EXTERNAL CATALOG paimon_catalog
PROPERTIES
(
"type" = "paimon",
"paimon.catalog.type" = "dlf",
"paimon.catalog.warehouse" = "oss://<yourBucketName>/<yourPath>/"
);
Query data in a Paimon table
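Before you run a query, it can help to confirm that the catalog is registered and to check which databases it exposes. The following is a minimal sketch that assumes the catalog is named paimon_catalog, as in the preceding example. <database_name> is a placeholder for a database that exists in your Paimon warehouse.

```sql
-- List all catalogs in the StarRocks cluster. paimon_catalog should appear in the output.
SHOW CATALOGS;

-- List the databases that the Paimon catalog exposes.
SHOW DATABASES FROM paimon_catalog;

-- Optionally switch the current session to the Paimon catalog and a database
-- so that later statements can use unqualified table names.
SET CATALOG paimon_catalog;
USE <database_name>;
SHOW TABLES;
```

If you switch the session in this way, you can omit the catalog and database prefixes in subsequent queries. Otherwise, use the fully qualified table name.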
Execute the following statement to query data in a specific table of a database:
SELECT * FROM <catalog_name>.<database_name>.<table_name>;
References
For more information about Paimon, see Overview.