Create Databricks clusters in Dataphin to run compute tasks, and manage cluster access, configuration, and versioning.
Permissions
-
Super administrators, system administrators, and custom global roles with Databricks Cluster-Management permission can create and manage Databricks clusters, assign cluster administrators, and control which users can reference the cluster in compute sources.
-
Cluster administrators can manage the clusters for which they are responsible.
-
Users with the Compute Source Management-Create global role can select Databricks clusters they have usage permissions for when creating a compute source.
Create a Databricks cluster
-
On the Dataphin homepage, choose Planning > Compute Source.
-
On the Compute Source page, click Manage Databricks Clusters.
-
In the Manage Databricks Clusters dialog box, click +Create Databricks Cluster.
-
On the Create Databricks Cluster page, configure the following parameters.
-
Basic Information
Parameter
Description
Cluster Name
The cluster name. Supports Chinese characters, English letters, digits, underscores (_), and hyphens (-). Maximum length: 128 characters.
Cluster Administrator
Select one or more tenant members as cluster administrators. Cluster administrators can edit the cluster, view its historical versions, and delete it.
Description (optional)
A brief description of the cluster. Maximum length: 128 characters.
-
Cluster Security Control
Available Members: Specify which users can reference this cluster when creating a compute source. Select Roles With "Create Compute Source" Permission or Specific Users.
-
Roles With "Create Compute Source" Permission: Default selection.
-
Specific Users: Select one or more personal accounts and user groups.
-
Cluster Configuration
Parameter
Description
Time Zone
The time zone used to process time-format data in integration tasks. Default: GMT+00:00. Click Modify to select a different time zone. Options:
-
GMT: GMT-12:00, GMT-11:00, GMT-10:00, GMT-09:30, GMT-09:00, GMT-08:00, GMT-07:00, GMT-06:00, GMT-05:00, GMT-04:00, GMT-03:00, GMT-02:30, GMT-02:00, GMT-01:00, GMT+00:00, GMT+01:00, GMT+02:00, GMT+03:00, GMT+03:30, GMT+04:00, GMT+04:30, GMT+05:00, GMT+05:30, GMT+05:45, GMT+06:00, GMT+06:30, GMT+07:00, GMT+08:00, GMT+08:45, GMT+09:00, GMT+09:30, GMT+10:00, GMT+10:30, GMT+11:00, GMT+12:00, GMT+12:45, GMT+13:00, GMT+14:00.
-
Daylight Saving Time: Africa/Cairo, America/Chicago, America/Denver, America/Los_Angeles, America/New_York, America/Sao_Paulo, Asia/Bangkok, Asia/Dubai, Asia/Kolkata, Asia/Shanghai, Asia/Tokyo, Atlantic/Azores, Australia/Sydney, Europe/Berlin, Europe/London, Europe/Moscow, Europe/Paris, Pacific/Auckland, Pacific/Honolulu.
Authentication Type
Select Service Principal (M2M) or Personal Access Token (PAT).
-
Service Principal (M2M): Authenticates using a Service Principal. Requires Service Principal and Secret.
-
Personal Access Token (PAT): Authenticates using a personal token.
Server hostname
The workspace URL, in the format {workspace-host-name}.cloud.databricks.com.
Service Principal
The Service Principal (Client ID).
Note: Available only when Authentication Type is set to Service Principal (M2M).
Secret
The Client Secret.
Note: Available only when Authentication Type is set to Service Principal (M2M).
token
The personal access token for server authentication.
Note: Available only when Authentication Type is set to Personal Access Token (PAT).
HTTP path
Select an HTTP path. Options are populated based on the authentication credentials you provided.
Click +Add HTTP Path to add a path. Maximum: 50 HTTP paths.
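The Time Zone parameter offers two groups of options: fixed GMT offsets and region names listed under Daylight Saving Time. The following is a minimal Python sketch of the general difference between the two kinds of zone, using only the standard library. It does not show how Dataphin applies the setting internally, which this page does not describe; the helper name `utc_offset_hours` and the sample dates are chosen for illustration.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

# A fixed-offset option such as GMT-05:00: the offset never changes.
fixed = timezone(timedelta(hours=-5))

# A region-based option such as America/New_York: the offset follows
# that region's daylight saving time rules.
regional = ZoneInfo("America/New_York")


def utc_offset_hours(tz, year, month, day):
    """Return the UTC offset, in hours, that tz applies at noon on the given date."""
    moment = datetime(year, month, day, 12, 0, tzinfo=tz)
    return moment.utcoffset().total_seconds() / 3600


# Illustrative dates: one in winter, one in summer.
winter_fixed = utc_offset_hours(fixed, 2024, 1, 15)
summer_fixed = utc_offset_hours(fixed, 2024, 7, 15)
winter_regional = utc_offset_hours(regional, 2024, 1, 15)
summer_regional = utc_offset_hours(regional, 2024, 7, 15)

print(winter_fixed, summer_fixed)        # same offset in both seasons
print(winter_regional, summer_regional)  # offset shifts by one hour in summer
```

A fixed offset suits data whose timestamps should always be interpreted with the same shift, while a region name suits data recorded in local wall-clock time for a place that observes daylight saving time.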
-
Click Submit.
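The constraints listed in the parameter tables above can be checked before you fill in the form, for example when cluster definitions are prepared in a script. The following Python sketch is an illustration only, not a Dataphin API: `ClusterForm`, `validate`, and the `"M2M"`/`"PAT"` labels are hypothetical names, and the pattern assumes that "Chinese characters" means the basic CJK Unified Ideographs range (U+4E00 to U+9FFF).

```python
import re
from dataclasses import dataclass, field

# Assumption: "Chinese characters" is approximated by U+4E00..U+9FFF.
# Allowed: Chinese characters, English letters, digits, "_" and "-", 1 to 128 chars.
NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,128}")

MAX_DESCRIPTION_LENGTH = 128
MAX_HTTP_PATHS = 50


@dataclass
class ClusterForm:
    """Hypothetical container mirroring the fields of the Create Databricks Cluster page."""
    name: str
    auth_type: str  # "M2M" or "PAT" (labels invented for this sketch)
    server_hostname: str
    http_paths: list = field(default_factory=list)
    description: str = ""
    service_principal: str = ""
    secret: str = ""
    token: str = ""


def validate(form):
    """Return a list of problems; an empty list means every documented rule passes."""
    problems = []
    if not NAME_PATTERN.fullmatch(form.name):
        problems.append("Cluster Name has invalid characters or length")
    if len(form.description) > MAX_DESCRIPTION_LENGTH:
        problems.append("Description exceeds 128 characters")
    if form.auth_type == "M2M":
        # Service Principal (M2M) requires both the client ID and the secret.
        if not form.service_principal or not form.secret:
            problems.append("M2M requires Service Principal and Secret")
    elif form.auth_type == "PAT":
        if not form.token:
            problems.append("PAT requires a token")
    else:
        problems.append("Unknown Authentication Type")
    if len(form.http_paths) > MAX_HTTP_PATHS:
        problems.append("More than 50 HTTP paths")
    return problems
```

For example, `validate(ClusterForm(name="bad name!", auth_type="M2M", server_hostname="example.cloud.databricks.com", service_principal="id"))` reports two problems: the name contains a space and an exclamation mark, and the secret is missing. The hostname and credential values here are placeholders.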
Manage Databricks clusters
-
On the Dataphin homepage, choose Planning > Compute Source.
-
On the Compute Source page, click Manage Databricks Clusters.
-
In the Manage Databricks Clusters dialog box, view the cluster list. The list displays cluster names, administrators, associated compute sources, creation info, and modification info.
-
Associated Compute Sources: Shows the total number of associated compute sources. Click the icon to view the list, then click a source name to go to its detail page.
-
Creation Information: Records the creator and creation time.
-
Modification Information: Records the last editor and modification time.
Note: Each compute task runs in a single cluster. Data from different Databricks clusters cannot be joined.
-
(Optional) Enter a cluster name in the search box for fuzzy search.
-
In the Actions column, manage the target cluster. Supported operations:
Operation
Description
View
Click the Actions icon to view the current version details. Users with Databricks Cluster-Management permission can download the cluster configuration file.
Edit
Click the Actions icon to open the Edit Databricks Cluster page. Modify the configuration, click Save, enter a Change Description, and click OK.
Clone
Click the Actions icon. The system clones all data from the cluster and opens the Create Databricks Cluster page, where you can modify it.
Historical Versions
Click the Actions icon and select History. The dialog box lists each version with its name, modifier, and change description. You can view and compare historical versions.
-
View: Click the Actions icon to view the version details. Users with Databricks Cluster-Management permission can download the configuration file.
-
Compare: Click the Actions icon to open the Version Comparison page. Select the versions to compare from the dropdown lists. By default, the current version is compared with the selected version.
Delete
Click the Actions icon, select Delete, and click OK.
Note:
-
A Databricks cluster can be deleted only when it has no associated compute sources.
-
Deleted clusters cannot be recovered.
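The Compare operation under Historical Versions shows the differences between two versions of a cluster configuration. If you download configuration files for two versions, a similar comparison can be made offline. The following Python sketch illustrates the idea with the standard library. It assumes each version has already been loaded into a dictionary; the format of the downloaded file is not described on this page, and the key names `time_zone` and `auth_type` are hypothetical.

```python
import difflib
import json


def compare_versions(selected, current):
    """Return unified-diff lines between two configuration dictionaries.

    Keys are sorted so that ordering differences do not appear as changes.
    """
    selected_lines = json.dumps(selected, indent=2, sort_keys=True).splitlines()
    current_lines = json.dumps(current, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            selected_lines,
            current_lines,
            fromfile="selected version",
            tofile="current version",
            lineterm="",
        )
    )


# Hypothetical configuration snapshots for two versions of one cluster.
v1 = {"time_zone": "GMT+00:00", "auth_type": "PAT"}
v2 = {"time_zone": "GMT+08:00", "auth_type": "PAT"}

for line in compare_versions(v1, v2):
    print(line)
```

Lines prefixed with `-` come from the selected version and lines prefixed with `+` from the current version. Identical versions produce an empty result.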