The Dataphin metadata warehouse is a data warehouse that manages the business metadata in Dataphin and the metadata of the corresponding compute engine. It resides in a Dataphin project space within the metadata warehouse tenant (OPS tenant) and comprises periodic data integration nodes, SQL script nodes, and Shell nodes. Initializing the metadata warehouse involves configuring the compute engine type for the Dataphin system and initializing the metadata. This topic explains how to initialize the metadata warehouse with Databricks as the compute engine.
Limits
Only accounts with the metadata warehouse tenant super administrator or system administrator role can perform initialization.
Safeguard the account and password of the metadata warehouse tenant super administrator or system administrator. Operate with caution after logging in with the metadata warehouse tenant super administrator account.
Procedure
On the Dataphin home page, select Management Center > System Settings from the top menu bar.
In the navigation pane on the left, select System Operations And Maintenance > Metadata Warehouse Settings.
On the Metadata Warehouse Settings configuration wizard page, click Start.
For the initialization engine type, select Databricks.
Important: If the metadata warehouse has already been initialized, the engine type defaults to the last successfully initialized engine. Switching to an incompatible compute engine may render the administration feature unavailable.
Click Next.
On the Parameter Checking page, configure the parameters for the Databricks compute engine.
Parameter
Description
Authentication Type
Select Service Principal (M2M) or Personal Access Token (PAT).
Service Principal (M2M): Authenticates with a service principal. You must provide the Service Principal and the Secret.
Personal Access Token (PAT): Authenticates with a personal access token. You must provide the token of the personal account.
Server hostname
Enter the workspace URL. The format is {workspace-host-name}.cloud.databricks.com.
Service Principal
Enter the Service Principal, which is the Client ID.
Note: This option is supported only when Authentication Type is set to Service Principal (M2M).
Secret
Enter the Client Secret.
Note: This option is supported only when Authentication Type is set to Service Principal (M2M).
token
Enter the personal access token of the account used to access the server.
Note: This option is supported only when Authentication Type is set to Personal Access Token (PAT).
HTTP path
Select the HTTP path. The list will display options based on the entered authentication information.
Catalog
Select the Catalog. The list will display options based on the entered authentication information.
Schema
Select the Schema of Databricks. The list will display options based on the entered authentication information.
Meta Project
The project space used for metadata production and processing logic. We recommend that you set it to dataphin_meta. Keep the name unchanged when you reinitialize the metadata warehouse. Otherwise, initialization fails.
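The rules in the parameter table can be checked before you enter the values in the wizard. The following Python sketch is illustrative only and is not part of Dataphin: the class name DatabricksEngineParams, the function validate, and the short codes "M2M" and "PAT" are hypothetical, and the hostname pattern assumes the {workspace-host-name}.cloud.databricks.com format described above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumption: the hostname follows the format given in the parameter table,
# without a scheme (no "https://") and without a trailing path.
HOSTNAME_PATTERN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]*\.cloud\.databricks\.com$")


@dataclass
class DatabricksEngineParams:
    """Hypothetical container mirroring the fields on the Parameter Checking page."""
    authentication_type: str          # "M2M" or "PAT" (hypothetical short codes)
    server_hostname: str
    http_path: str
    catalog: str
    schema: str
    meta_project: str = "dataphin_meta"
    service_principal: Optional[str] = None  # Client ID, M2M only
    secret: Optional[str] = None             # Client Secret, M2M only
    token: Optional[str] = None              # PAT only


def validate(params: DatabricksEngineParams,
             previous_meta_project: Optional[str] = None) -> List[str]:
    """Return a list of problems; an empty list means the values look consistent."""
    problems: List[str] = []

    if not HOSTNAME_PATTERN.match(params.server_hostname):
        problems.append("server_hostname does not match {workspace-host-name}.cloud.databricks.com")

    # Each authentication type requires its own credentials.
    if params.authentication_type == "M2M":
        if not params.service_principal:
            problems.append("service_principal is required for Service Principal (M2M)")
        if not params.secret:
            problems.append("secret is required for Service Principal (M2M)")
    elif params.authentication_type == "PAT":
        if not params.token:
            problems.append("token is required for Personal Access Token (PAT)")
    else:
        problems.append("authentication_type must be 'M2M' or 'PAT'")

    for name in ("http_path", "catalog", "schema", "meta_project"):
        if not getattr(params, name):
            problems.append(f"{name} is required")

    # On reinitialization, the Meta Project name must stay the same.
    if previous_meta_project is not None and params.meta_project != previous_meta_project:
        problems.append("meta_project must stay unchanged when reinitializing")

    return problems
```

For example, validate(DatabricksEngineParams("PAT", "my-ws.cloud.databricks.com", "/sql/1.0/warehouses/abc", "main", "default", token="...")) returns an empty list, whereas a hostname that includes https:// or an M2M configuration without a Secret is reported as a problem.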
Click Test Connection. After a successful connection test, click Next.
On the initialization page, click Start.
Note: The system initialization process takes approximately 15 minutes.
Once the page indicates that the execution is successful, click Finish to complete the configuration.
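If Test Connection fails, it can help to check the hostname and token outside Dataphin. The following Python sketch, which uses only the standard library, builds a request that lists the SQL warehouses of a workspace and derives candidate HTTP path values from the response. It rests on assumptions that do not come from this topic: the endpoint /api/2.0/sql/warehouses and the /sql/1.0/warehouses/<id> path pattern are taken from the Databricks REST API as generally documented, the function names are hypothetical, and the sketch does not describe how Dataphin itself populates the HTTP path list.

```python
import json
import urllib.request
from typing import List


def build_warehouse_list_request(server_hostname: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request that lists SQL warehouses in the workspace."""
    # Assumption: the workspace exposes the SQL warehouses list endpoint at this path.
    url = f"https://{server_hostname}/api/2.0/sql/warehouses"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def http_paths_from_response(body: str) -> List[str]:
    """Derive candidate HTTP paths from a warehouse-list response body."""
    payload = json.loads(body)
    # Assumption: each warehouse entry carries an "id", and its HTTP path
    # follows the /sql/1.0/warehouses/<id> pattern.
    return [f"/sql/1.0/warehouses/{w['id']}" for w in payload.get("warehouses", [])]


def list_http_paths(server_hostname: str, token: str, timeout: float = 10.0) -> List[str]:
    """Send the request and return candidate HTTP paths (performs a network call)."""
    request = build_warehouse_list_request(server_hostname, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return http_paths_from_response(response.read().decode("utf-8"))
```

Calling list_http_paths with the Server hostname and token values from the wizard raises urllib.error.HTTPError if the token is rejected, which helps separate credential problems from hostname or network problems.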
What to do next
After you initialize the system metadata, you can set the compute engine for the Dataphin instance. If the compute engine for the metadata warehouse is Databricks, you can set the compute engine for a business tenant to any engine type except MaxCompute. For more information, see Compute settings.