Data Map collects metadata from an External Catalog using indirect association. If you use the External Catalog feature in a StarRocks database and want to view the tables and metadata details within that External Catalog from a StarRocks data source in Data Map, you can follow the steps in this topic. After you configure and collect the metadata, you can use the search feature in Data Map to find and view metadata, such as tables and fields, that are associated with the StarRocks External Catalog.
Background information
After you configure a StarRocks data source in DataWorks and start metadata collection, Data Map by default collects metadata only from the StarRocks internal catalog. To retrieve metadata from a StarRocks External Catalog, you must configure a connection to the associated data source in DataWorks and collect its metadata. After the collection is complete, Data Map automatically associates the metadata. You can then view the External Catalog and its associated metadata within the StarRocks data source.
Prerequisites
You have added your StarRocks database to DataWorks as a StarRocks data source. For more information, see Add a StarRocks data source.
To collect metadata from a data source that has whitelist access control enabled, you must configure the required whitelist permissions in advance. For more information, see Configure a metadata collection whitelist.
Limits
Elasticsearch Catalog is not supported as an external catalog.
Paimon Catalog from an OSS source is not supported as an external catalog.
Procedure
This topic uses a MySQL External Catalog in StarRocks as an example. If you configure a MySQL database as mysql_catalog_db, you must configure and create a MySQL metadata collector in Data Map. After you collect the metadata from the MySQL database, you can search for and view the metadata of mysql_catalog_db in StarRocks.
Step 1: Prepare data
Create a MySQL data source
Create a MySQL data source with the database name mysql_catalog_db and create a sample table named mysql_catalog_table. The following is a sample script:
CREATE TABLE mysql_catalog_table (
    catalog_table_id INT,
    catalog_table_name VARCHAR(255)
);
Prepare the MySQL JDBC driver package
Upload the MySQL Java Database Connectivity (JDBC) driver JAR package that matches your MySQL version to OSS.
Log on to the OSS console and click Buckets in the navigation pane on the left.
Click the name of the destination bucket to open the Objects page. This topic uses the catalog-bucket-oss bucket as an example.
Click Create Directory to create a directory to store the JAR package. Set Directory Name to libs.
Navigate to the directory for the JDBC driver JAR package. Click Upload Object. In the Files To Upload area, click Select Files, add the mysql-connector-java-8.0.28.jar JDBC driver JAR package, and then click Upload Object.
Find the JDBC driver JAR package that you uploaded. In the Actions column for the file, click View Details. On the View Details page, click Set ACL. In the Set ACL dialog box, set the file permission to Public Read/Write and click OK to enable external referencing.
Step 2: Configure the external data source connection
Go to the query list of the StarRocks instance.
Log on to the EMR console. In the navigation pane on the left, choose to open the Instance List page.
Find the StarRocks instance that you created and click Connect in the Actions column. The New Connection tab appears.
On the New Connection tab, select the Region and Instance name of your StarRocks instance. Enter the Connection Name, Username, and Password. Click Test Network Connectivity. After the connection is successful, click OK to open the Queries page of the StarRocks instance.
Configure the MySQL external connection in the StarRocks data source.
Below the query list, click the + File button. In the Create File dialog box, enter a Name, select a Storage Path, and click Confirm to create the file.
Under All Files, double-click the file that you created to open the StarRocks instance editor. Enter the following sample script for an external connection. For more information, see Sample configurations for a StarRocks External Catalog.
CREATE EXTERNAL CATALOG mysql_db_catalog
PROPERTIES (
    "driver_class" = "com.mysql.cj.jdbc.Driver",
    "driver_url" = "https://catalog-bucket-oss.oss-cn-hangzhou-internal.aliyuncs.com/libs/mysql-connector-java-8.0.28.jar",
    "type" = "jdbc",
    "user" = "<UserName>",
    "password" = "<PassWord>",
    "jdbc_uri" = "jdbc:mysql://xxx:3306/mysql_catalog_db"
);
Note
mysql_db_catalog: The name of the external storage data directory (the External Catalog) for the MySQL data source.
UserName and PassWord: Set these parameters as needed.
jdbc_uri: Enter the connection path for the database that you created.
After you finish editing, click Run to execute the script. After the script runs successfully, you can view information about the associated table on the Databases tab.
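As a quick sanity check before you continue, you can also query the new catalog directly from the same editor. The following is a minimal sketch that assumes the catalog, database, and table names used in this topic (mysql_db_catalog, mysql_catalog_db, and mysql_catalog_table). Adjust the names if yours differ.

```sql
-- List all catalogs; mysql_db_catalog should appear next to default_catalog.
SHOW CATALOGS;

-- List the databases that StarRocks can see through the external catalog.
SHOW DATABASES FROM mysql_db_catalog;

-- List the tables in the MySQL database exposed by the catalog.
SHOW TABLES FROM mysql_db_catalog.mysql_catalog_db;

-- Read through the catalog to confirm that the JDBC driver loads
-- and that the credentials in the catalog definition work.
SELECT catalog_table_id, catalog_table_name
FROM mysql_db_catalog.mysql_catalog_db.mysql_catalog_table
LIMIT 10;
```

An empty result from the last query is expected if no rows have been inserted into the sample table. An error at this point usually indicates a problem with driver_url, jdbc_uri, or the credentials rather than with Data Map.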

Step 3: Configure data sources
Log on to the DataWorks console. Switch to the destination region. In the navigation pane on the left, click Workspace. On the Workspaces page, find the workspace you created, and in the Actions column, click Manage to open the Management Center.
On the Management Center page, in the navigation pane on the left, choose . On the Data Source List page, add the StarRocks and MySQL data sources. For more information, see Add a StarRocks data source and Add a MySQL data source.
Note: For the MySQL data source, set Configuration Mode to Connection String Mode. This mode is also required for any external data source of the JDBC type.
Step 4: Configure metadata collection
Log on to the DataWorks console. Switch to the destination region. In the navigation pane on the left, choose and click Go To Data Map. In Data Map, configure metadata collection.
Configure metadata collection for the internal StarRocks catalog
Follow the steps in Create a custom collector to create a StarRocks collector.
Configure metadata collection for the StarRocks External Catalog
Similarly, you must complete metadata collection for the MySQL data source. Otherwise, you cannot search for the External Catalog information of the MySQL data source.
For the MySQL data source metadata collection, set Metadata Collection Type to MySQL.
Step 5: Search for metadata
Wait for the StarRocks and MySQL metadata collection tasks to complete. Click the icon in the navigation pane on the left to open the search page.
In the Type section, on the Data Source tab, select the StarRocks data source. In the Filter Conditions section, select the StarRocks Instance that you created, the name of the external storage Data Catalog for the MySQL data source in the StarRocks instance, and the corresponding MySQL Database. You can then view the MySQL Catalog information in the StarRocks data source. The following figure shows the result:

Alternatively, in the Type section, on the Data Source tab, select the MySQL data source. In the Filter Conditions section, select the name of the MySQL database that you created and verify that the table information is consistent.
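If you want to cross-check what Data Map shows against the engines themselves, you can compare the column definitions on both sides. This is a hedged sketch that assumes the sample names from this topic. The first statement runs in MySQL and the second in StarRocks.

```sql
-- Run in MySQL: columns as defined in the source database.
SELECT column_name, column_type
FROM information_schema.columns
WHERE table_schema = 'mysql_catalog_db'
  AND table_name = 'mysql_catalog_table'
ORDER BY ordinal_position;

-- Run in StarRocks: the same table as seen through the external catalog.
DESC mysql_db_catalog.mysql_catalog_db.mysql_catalog_table;
```

The column names should match. The reported type names can differ slightly because StarRocks maps MySQL types to its own types.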

View table details.
Click the table name to view its details, as shown in the following figure.

The following describes the details.

Sample configurations for a StarRocks External Catalog
The following shows the syntax for configuring a StarRocks External Catalog:
CREATE EXTERNAL CATALOG <Catalog_Name> COMMENT '' PROPERTIES ("type"="", "xxx1"="", "xxx2"="");
The Catalog_Name parameter specifies the name of the external storage data directory. You can customize the name.
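If a catalog was created with incorrect properties, one possible way to correct it is to inspect its definition, drop it, and create it again. The sketch below uses the mysql_db_catalog name from this topic as an example. SHOW CREATE CATALOG is assumed to be available on your StarRocks version (it was introduced in the 3.x line).

```sql
-- Display the statement that defines the catalog (assumes StarRocks 3.0 or later).
SHOW CREATE CATALOG mysql_db_catalog;

-- Remove the catalog. This deletes only the mapping in StarRocks;
-- the data in the external source is not touched.
DROP CATALOG mysql_db_catalog;
```

After you create the catalog again, rerun metadata collection so that Data Map reflects the corrected definition.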
The following table provides sample configurations for a StarRocks External Catalog. For more information, see Data Catalog:
Collection method | Connection method | Sample configuration for a StarRocks External Catalog |
Default Catalog | default | When the collection method is set to Default Catalog, StarRocks internal metadata is collected by default. Therefore, you do not need to configure an External Catalog. For more information, see DataAnalysis. |
ODPS Catalog | VPC | Note You must replace |
Hive Catalog | Hive Metastore (HMS) | Note Replace the |
Hive Catalog | Data Lake Formation (DLF) | Note Replace the value of the |
Iceberg Catalog | Hive | Note When you use Hive Metastore as the metadata service to configure an Iceberg External Catalog, replace the value of the |
Hudi Catalog | Hive | Note When you use Hive Metastore as the metadata service to configure a Hudi External Catalog, replace the value of the |
Hudi Catalog | Data Lake Formation (DLF) | Note When you use DLF as the metadata service to configure a Hudi External Catalog, replace the value of the |
Delta Lake Catalog | Hive | Note |
Delta Lake Catalog | Data Lake Formation (DLF) | Note Replace the |
JDBC Catalog | MySQL | Note |
JDBC Catalog | PostgreSQL | Note |
Paimon Catalog | Hive (supported in StarRocks 3.1 and later) | Note When you use Hive as the metadata service to configure a Paimon Catalog: |
Paimon Catalog | Data Lake Formation (DLF, supported in StarRocks 3.1 and later) | Note When you use DLF as the metadata service to configure a Paimon Catalog: |
Unified Catalog | Hive (supported in StarRocks 3.2 and later) | Note When you use Hive as the metadata service to configure a Unified Catalog, replace the |
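For the JDBC Catalog rows, a PostgreSQL catalog follows the same pattern as the MySQL example in Step 2. The following is an illustrative sketch, not the sample from the table above: the catalog name pg_db_catalog is hypothetical, and every value in angle brackets is a placeholder that you must replace.

```sql
-- Hypothetical catalog name; replace every <...> placeholder with your own value.
CREATE EXTERNAL CATALOG pg_db_catalog
PROPERTIES (
    "type" = "jdbc",
    "driver_class" = "org.postgresql.Driver",
    "driver_url" = "<URL of the PostgreSQL JDBC driver JAR>",
    "user" = "<UserName>",
    "password" = "<PassWord>",
    "jdbc_uri" = "jdbc:postgresql://<host>:5432/<database>"
);
```

As with MySQL, the driver JAR must be reachable from the StarRocks instance, and the PostgreSQL data source in DataWorks must use Connection String Mode.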