DataWorks supports Kerberos as a third-party identity authentication mechanism for data synchronization. To use Kerberos authentication with HBase, HDFS, or Hive data sources on CDH 6.x clusters, upload the required credential files on the Authentication File Management page, then reference them in the data source configuration.
Certificates have a validity period. If a certificate expires, the corresponding data synchronization tasks fail because authorization cannot be granted. Replace certificates before they expire.
Prerequisites
Before you begin, ensure that you have:
A DataWorks workspace with access to Management Center
A Kerberos keytab file (.keytab) and a Kerberos configuration file (krb5.conf)
An exclusive resource group for Data Integration or a Serverless resource group
A data source of type HBase, Hadoop Distributed File System (HDFS), or Hive
Limitations
Kerberos authentication is supported only on CDH 6.x clusters. Authentication may fail on other CDH versions or on self-managed clusters that have not been tested with Kerberos.
Kerberos authentication supports only HBase, HDFS, and Hive data sources.
Kerberos authentication requires an exclusive resource group for Data Integration or a Serverless resource group.
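The limitations above can be expressed as a small pre-check. The following Python sketch is illustrative only: the function name `kerberos_precheck`, its parameters, and the string labels used for resource groups are hypothetical and are not part of any DataWorks API.

```python
from typing import List

SUPPORTED_SOURCES = {"hbase", "hdfs", "hive"}
SUPPORTED_RESOURCE_GROUPS = {"exclusive", "serverless"}  # hypothetical labels


def kerberos_precheck(source_type: str, cdh_version: str, resource_group: str) -> List[str]:
    """Return the reasons why Kerberos authentication would not be supported."""
    problems = []
    if source_type.lower() not in SUPPORTED_SOURCES:
        problems.append(f"unsupported data source type: {source_type}")
    # Only the major version matters: CDH 6.x is the supported range.
    if cdh_version.split(".")[0] != "6":
        problems.append(f"unsupported CDH version: {cdh_version}")
    if resource_group.lower() not in SUPPORTED_RESOURCE_GROUPS:
        problems.append(f"unsupported resource group: {resource_group}")
    return problems
```

An empty list means that none of the three limitations is violated. It does not guarantee that authentication will succeed.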
Upload an authentication file
Go to the Data Sources page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose More > Management Center.
On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.
In the left-side navigation pane of the SettingCenter page, click Data Sources.
Click the Authentication File Management tab.
In the upper-right corner, click Upload Authentication File. In the dialog box, click Upload File, select the target file, enter a File Description, and then click OK.
Repeat this step to upload both the keytab file and the krb5.conf file.
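Before uploading, it can help to confirm locally that the two files are what they claim to be. The following Python sketch is a heuristic, not a DataWorks feature: the function names are hypothetical, and the file-name checks assume the naming given in the prerequisites (.keytab and krb5.conf). The header check relies on the MIT keytab format, in which the first byte is 0x05 and the second byte is the format version (1 or 2).

```python
from pathlib import Path
from typing import List


def has_keytab_header(data: bytes) -> bool:
    """Check the two-byte header defined by the MIT keytab format."""
    # Byte 0 is always 0x05; byte 1 is the format version (1 or 2).
    return len(data) >= 2 and data[0] == 0x05 and data[1] in (0x01, 0x02)


def check_before_upload(keytab_path: str, conf_path: str) -> List[str]:
    """Return problems found in the two local files (empty list if none)."""
    problems = []
    keytab, conf = Path(keytab_path), Path(conf_path)

    if keytab.suffix != ".keytab":
        problems.append(f"{keytab.name}: expected a .keytab extension")
    if not keytab.is_file():
        problems.append(f"{keytab.name}: file not found")
    else:
        with keytab.open("rb") as fh:
            if not has_keytab_header(fh.read(2)):
                problems.append(f"{keytab.name}: missing keytab header")

    if conf.name != "krb5.conf":
        problems.append(f"{conf.name}: expected the name krb5.conf")
    if not conf.is_file():
        problems.append(f"{conf.name}: file not found")
    return problems
```

A passing check only shows that the files look plausible. It does not verify that the keys in the keytab are still valid.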
Reference an authentication file in a data source
After uploading the authentication files, reference them on the data source configuration page. The following example uses an HDFS data source. For other supported data sources, see Supported data sources.
For full instructions on adding a data source, see Configure a data source.

Configure the following Kerberos-specific parameters:
| Parameter | Description | Required |
|---|---|---|
| Special Authentication Method | Set to Kerberos Authentication. | Yes |
| Keytab File | The .keytab file registered in the Kerberos environment. It stores the credential keys used for authentication. To upload a new file, click Add Authentication File. | Yes |
| Conf File | The Kerberos configuration file, krb5.conf. It stores the Key Distribution Center (KDC) server configuration. To upload a new file, click Add Authentication File. | Yes |
| principal | The Kerberos principal in the keytab file. A principal identifies a user or service and has a unique name with an associated encrypted key. Two formats are supported. User principal: username@REALM (example: hdfs@CORP.COM). Service principal: service/hostname@REALM (example: hive/datanode01.example.com@CORP.COM). | Yes |
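
The two principal formats can be told apart mechanically. The following Python sketch splits a principal string into its parts. The `Principal` class and `parse_principal` function are hypothetical names introduced for illustration, and the parser covers only the two formats listed in the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principal:
    primary: str             # user name or service name
    instance: Optional[str]  # host name for service principals, else None
    realm: str


def parse_principal(text: str) -> Principal:
    """Parse 'username@REALM' or 'service/hostname@REALM'."""
    name, at, realm = text.rpartition("@")
    if not at or not name or not realm:
        raise ValueError(f"expected name@REALM, got: {text!r}")
    primary, slash, instance = name.partition("/")
    if not primary or (slash and not instance):
        raise ValueError(f"malformed principal name: {name!r}")
    return Principal(primary, instance or None, realm)
```

For the examples in the table, `parse_principal("hdfs@CORP.COM")` yields a user principal with no instance, and `parse_principal("hive/datanode01.example.com@CORP.COM")` yields a service principal whose instance is the host name.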
More operations
The Authentication File Management page also supports the following operations:

| Operation | Description |
|---|---|
| Batch Deletion | Delete multiple authentication files at once. |
| Re-upload | Replace an existing authentication file with a new version. |
| View References | See which data sources reference a given authentication file. |
Appendix: How Kerberos authentication works
Kerberos is a network authentication protocol based on symmetric-key cryptography. Its main feature is single sign-on (SSO): a user authenticates once to obtain a Ticket-Granting Ticket (TGT), then uses that TGT to access multiple services without re-entering credentials.
Both clients and servers rely on a Key Distribution Center (KDC) for identity verification. For background on Kerberos, see Overview.

Kerberos authentication in DataWorks follows four stages:
Client requests a TGT. When a client (principal) accesses a Kerberos-enabled data source, it first requests a TGT from the KDC.
KDC issues the TGT. The KDC authenticates the client and, if successful, issues an encrypted TGT with a limited validity period.
Client requests service access. The client uses the TGT to obtain a service ticket from the KDC, then presents that ticket to the server that hosts the requested resources.
Server authenticates the client. The server verifies the client's credentials. If authentication succeeds, the client is granted access.
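The stages above can be illustrated with a toy model of ticket issuance and validation. This Python sketch is not real Kerberos: it uses an HMAC signature as a stand-in for ticket encryption, it collapses the KDC and the verifying server into one shared secret, and every name and value in it (`KDC_SECRET`, `REGISTERED`, `issue_tgt`, `ticket_is_valid`) is hypothetical. It shows only that a ticket carries a limited validity period and is rejected once it expires or is altered.

```python
import hashlib
import hmac

KDC_SECRET = b"toy-kdc-secret"   # hypothetical; stands in for the KDC's key
REGISTERED = {"hdfs@CORP.COM"}   # hypothetical principal database


def _sign(principal: str, expires_at: int) -> str:
    message = f"{principal}|{expires_at}".encode()
    return hmac.new(KDC_SECRET, message, hashlib.sha256).hexdigest()


def issue_tgt(principal: str, now: int, lifetime_s: int = 3600) -> dict:
    """Stages 1 and 2: the client asks for a TGT and the KDC issues one."""
    if principal not in REGISTERED:
        raise PermissionError(f"unknown principal: {principal}")
    expires_at = now + lifetime_s
    return {
        "principal": principal,
        "expires_at": expires_at,
        "signature": _sign(principal, expires_at),
    }


def ticket_is_valid(ticket: dict, now: int) -> bool:
    """Stage 4: accept the ticket only if it is authentic and not expired."""
    expected = _sign(ticket["principal"], ticket["expires_at"])
    authentic = hmac.compare_digest(expected, ticket["signature"])
    return authentic and now < ticket["expires_at"]
```

In this model, a ticket issued at time 0 with the default one-hour lifetime validates at time 10 and is rejected at time 7200. A ticket whose expiry field was changed after issuance fails the signature check.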
The keytab file stores principals and their encrypted keys, which serve as the authentication credentials of the resource principal. The krb5.conf file stores the KDC server configuration. Both files must be uploaded on the Authentication File Management page and referenced in the data source configuration before Kerberos authentication can work.
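As a rough illustration of what the krb5.conf file contains, the following Python sketch extracts the default realm and the KDC hosts from a configuration text. The function name `summarize_krb5_conf` and the sample values (the CORP.COM realm and the kdc01.corp.com host) are hypothetical. The parser handles only the common `key = value` layout and is not a complete krb5.conf parser.

```python
import re

# Hypothetical sample; real files usually contain more settings.
SAMPLE = """
[libdefaults]
    default_realm = CORP.COM

[realms]
    CORP.COM = {
        kdc = kdc01.corp.com
        admin_server = kdc01.corp.com
    }
"""


def summarize_krb5_conf(text: str) -> dict:
    """Return the default realm and the list of KDC hosts found in the text."""
    default_realm = None
    kdcs = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        header = re.fullmatch(r"\[(\w+)\]", line)
        if header:
            section = header.group(1)
            continue
        key, eq, value = line.partition("=")
        if not eq:
            continue  # for example, a closing brace
        key, value = key.strip(), value.strip()
        if section == "libdefaults" and key == "default_realm":
            default_realm = value
        elif section == "realms" and key == "kdc":
            kdcs.append(value)
    return {"default_realm": default_realm, "kdcs": kdcs}
```

A missing default realm or an empty KDC list in the result suggests that the file is incomplete and worth reviewing before upload.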
Supported data sources
The following data sources support Kerberos authentication in DataWorks:
| Data source | Configuration guide |
|---|---|
| HBase | Configure an HBase data source |
| HDFS | Configure an HDFS data source |
| Hive | Configure a Hive data source |