The data synchronization feature of DataWorks supports third-party identity authentication. To use this feature, you must upload the required authentication files on the Authentication File Management page and enable third-party authentication when you configure a data source. This process ensures that only trusted applications and services can access your data resources. This topic describes how to upload and reference authentication files.
Background
Third-party identity authentication provides strong authentication for users and services. This prevents untrusted programs or services from accessing data and improves security during data synchronization. DataWorks provides a centralized Authentication File Management page to manage authentication files. On this page, you can upload authentication files and view their references.
Limitations
DataWorks supports only the Kerberos authentication mechanism. For more information, see Configure Kerberos authentication.
Usage notes
Certificates have a validity period. You must monitor the validity period of your uploaded certificates. If a certificate expires, the corresponding data synchronization task fails because authorization cannot be granted. You must replace certificates with new, valid ones before they expire.
Upload an authentication file
Before you can use the authentication feature, you must prepare the authentication files and upload them to the Authentication File Management page.
Go to the Data Sources page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.
In the left-side navigation pane of the SettingCenter page, click Data Sources.
Click the Authentication File Management tab.
In the upper-right corner of the page, click Upload Authentication File. In the Upload Authentication File dialog box, click Upload File, select the target file, provide a File Description, and then click OK.
Reference an authentication file
To use third-party identity authentication, enable the authentication method on the data source configuration page, and then configure the relevant parameters and reference the uploaded authentication files. DataWorks supports only Kerberos authentication. For more information, see Configure Kerberos authentication.
This section uses an HDFS data source as an example to describe the key configuration items for Kerberos authentication. For more information about how to configure a data source, see Configure a data source.

Configuration item | Description |
Special Authentication Method | Set Special Authentication Method to Kerberos Authentication. |
Keytab File | The .keytab file registered in the Kerberos environment. It stores key information for authentication. To upload a new authentication file, click Add Authentication File. |
Conf File | The Kerberos configuration file, krb5.conf. To upload a new authentication file, click Add Authentication File. |
principal | The Kerberos principal contained in the keytab file. A principal can be a user or a service and has a unique name and an associated encrypted key. |
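When you fill in the principal field, it can help to confirm which principals the keytab file actually contains. The following is a minimal Python sketch, not part of DataWorks, that lists the principals found in keytab bytes. It assumes the common keytab file format version 0x0502 as documented by MIT Kerberos. The function names are hypothetical, and malformed input is not handled gracefully.

```python
import struct
from typing import List, Tuple


def _read_counted(buf: bytes, off: int) -> Tuple[bytes, int]:
    """Read a 16-bit length followed by that many bytes; return (value, next offset)."""
    (length,) = struct.unpack_from(">H", buf, off)
    off += 2
    return buf[off:off + length], off + length


def _parse_entry(buf: bytes) -> Tuple[str, int]:
    """Parse one keytab entry into (principal, key version number)."""
    (num_components,) = struct.unpack_from(">H", buf, 0)
    realm, off = _read_counted(buf, 2)
    parts = []
    for _ in range(num_components):
        part, off = _read_counted(buf, off)
        parts.append(part.decode("utf-8"))
    off += 4                      # name type (32 bits)
    off += 4                      # timestamp (32 bits)
    kvno = buf[off]               # 8-bit key version
    off += 1
    off += 2                      # encryption type (16 bits)
    _key, off = _read_counted(buf, off)   # key bytes are skipped, never printed
    if len(buf) - off >= 4:       # optional 32-bit key version overrides the 8-bit one
        (kvno32,) = struct.unpack_from(">I", buf, off)
        if kvno32 != 0:
            kvno = kvno32
    return "/".join(parts) + "@" + realm.decode("utf-8"), kvno


def list_keytab_principals(data: bytes) -> List[Tuple[str, int]]:
    """Return (principal, key version) pairs from keytab bytes in format 0x0502."""
    if len(data) < 2 or data[0] != 0x05 or data[1] != 0x02:
        raise ValueError("not a version 0x0502 keytab")
    entries = []
    pos = 2
    while pos + 4 <= len(data):
        (size,) = struct.unpack_from(">i", data, pos)
        pos += 4
        if size == 0:
            break
        if size < 0:              # a negative size marks a deleted entry (hole)
            pos += -size
            continue
        entries.append(_parse_entry(data[pos:pos + size]))
        pos += size
    return entries
```

To try it on a local file, read the file in binary mode and pass the bytes to list_keytab_principals. Where the Kerberos client tools are installed, klist -k with the keytab path reports the same information.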
Other operations
On the Authentication File Management page, you can also perform other operations on authentication files, such as Batch Deletion, Re-upload, and View References.
Appendix: Configure Kerberos authentication
The data synchronization feature of DataWorks supports only Kerberos authentication. After you configure Kerberos authentication, only applications and services that pass authentication can access your data resources.
Kerberos is a computer network authentication protocol. Its main feature is single sign-on (SSO), which allows a user to enter credentials once to obtain a Ticket-Granting Ticket (TGT). The user can then use the TGT to access multiple services. When you use the Kerberos protocol, a shared key is established between each client and service. The client and the service use this key to communicate, which prevents untrusted services or applications from accessing data resources.
Limitations
The Kerberos authentication feature supports only CDH 6.X clusters. Authentication may fail for other versions or for self-managed clusters that have not been tested with Kerberos authentication.
The Kerberos authentication feature supports only HBase, HDFS, and Hive data sources.
The Kerberos authentication feature can be used only on exclusive resource groups for Data Integration and Serverless resource groups.
How Kerberos authentication works
Kerberos is a third-party authentication protocol based on symmetric keys. Both clients and servers rely on a Key Distribution Center (KDC) to perform identity authentication. The KDC is the server program for Kerberos. For more information about Kerberos, see Overview.
As shown in the preceding figure, Kerberos authentication in DataWorks consists of the following four stages:
A client requests a TGT: When a client (principal) accesses a data source for which Kerberos authentication is enabled, the client first requests a TGT from the KDC. The TGT serves as proof of identity for the client when it requests access to specific services from the KDC.
The KDC issues a TGT: After the KDC receives the request, it authenticates the client. If the client passes authentication, the KDC issues an encrypted TGT with a limited validity period to the client.
The client requests access to the server: After the client obtains the TGT, it requests access to specific service resources from the server based on the service name.
The server authenticates the client: After the server receives the request, it authenticates the client. If the client passes authentication, the client is allowed to access the service resources.
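The four stages above can be illustrated with a toy Python model. This is only a sketch under loud assumptions: it is not the Kerberos wire protocol and not DataWorks code. Real Kerberos encrypts tickets, uses a separate ticket-granting service, and establishes session keys. Here an HMAC stands in for all of that, there is no replay protection, and every class and function name is hypothetical.

```python
import hashlib
import hmac
from typing import Dict, Tuple

Ticket = Tuple[bytes, bytes]  # (payload, MAC over the payload)


def _mac(key: bytes, message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()


class ToyKDC:
    """Toy KDC that knows each client's long-term key and the server's key."""

    def __init__(self, client_keys: Dict[str, bytes], server_key: bytes) -> None:
        self._client_keys = client_keys
        self._server_key = server_key

    def issue_ticket(self, principal: str, timestamp: int, proof: bytes,
                     lifetime: int = 600) -> Ticket:
        # Stage 2: authenticate the client, then issue a time-limited ticket.
        key = self._client_keys.get(principal)
        expected = _mac(key, str(timestamp).encode()) if key else b""
        if not key or not hmac.compare_digest(proof, expected):
            raise PermissionError("client failed authentication")
        payload = f"{principal}|{timestamp + lifetime}".encode()
        return payload, _mac(self._server_key, payload)


class ToyServer:
    """Toy service that shares a key with the KDC."""

    def __init__(self, server_key: bytes) -> None:
        self._server_key = server_key

    def accept(self, ticket: Ticket, now: int) -> bool:
        # Stage 4: check that the ticket is genuine and has not expired.
        payload, mac = ticket
        if not hmac.compare_digest(mac, _mac(self._server_key, payload)):
            return False
        expires = int(payload.rsplit(b"|", 1)[1])
        return now < expires


def request_ticket(kdc: ToyKDC, principal: str, client_key: bytes, now: int) -> Ticket:
    # Stage 1: the client proves it holds its key without sending the key itself.
    proof = _mac(client_key, str(now).encode())
    return kdc.issue_ticket(principal, now, proof)
```

In this model, stage 3 is simply the client passing the returned ticket to ToyServer.accept. A ticket issued at time 1000 with the default lifetime is accepted before time 1600 and rejected from then on, which mirrors the limited validity period described above.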
The Kerberos authentication procedure requires a keytab authentication file and a krb5.conf configuration file. The krb5.conf file stores the KDC server configuration. The keytab file stores the authentication credentials of the resource principal, including principals and encrypted principal keys. Before you can use Kerberos authentication, you must upload these two files to the Authentication File Management page. Then, you must reference and configure the authentication files on the data source configuration page. For more information about how to upload authentication files, see Upload an authentication file.
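Before uploading a krb5.conf file, a quick local sanity check can catch obvious omissions. The following Python sketch is an illustration only and is not something DataWorks runs. It assumes a conventionally laid out file in which [libdefaults] sets default_realm and [realms] lists kdc entries inside a braced block per realm. It ignores include directives and other syntax, and a file that relies on DNS to locate the KDC can be valid even if this check reports a problem. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def summarize_krb5_conf(text: str) -> Tuple[Optional[str], Dict[str, List[str]]]:
    """Return (default_realm, {realm: [kdc, ...]}) from krb5.conf text."""
    section = None
    realm = None                  # realm whose { ... } block is currently open
    default_realm = None
    kdcs: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue              # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section, realm = line[1:-1].strip(), None
            continue
        if line == "}":
            realm = None
            continue
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if section == "libdefaults" and key == "default_realm":
            default_realm = value
        elif section == "realms":
            if value == "{":
                realm = key
                kdcs.setdefault(realm, [])
            elif realm is not None and key == "kdc":
                kdcs[realm].append(value)
    return default_realm, kdcs


def check_krb5_conf(text: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    default_realm, kdcs = summarize_krb5_conf(text)
    problems = []
    if default_realm is None:
        problems.append("no default_realm in [libdefaults]")
    elif not kdcs.get(default_realm):
        problems.append(f"no kdc entry for realm {default_realm}")
    return problems
```

The realm and host names used to exercise this sketch, such as EXAMPLE.COM and kdc1.example.com, are placeholders. Use the values from your own Kerberos environment.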
Data source types that support Kerberos authentication
The following table lists the data source types that support Kerberos authentication and provides links to their configuration guides.
Data source type | Configuration guide |
HBase | |
HDFS | |
Hive | |