DataWorks can generate sample libraries from the sample files that you provide. You can associate a sample library with a data identification rule to identify data: if the data to be identified contains data from the sample library, it hits the data identification rule. Sample libraries are suited to identifying enumerated values, such as employee names and user addresses. This topic describes how to create and manage sample libraries.
Limits
You can upload only UTF-8-encoded files in the TXT format as sample files to DataWorks. A sample file cannot exceed 500 KB in size. Each data entry in a sample file must occupy its own line.
A data identification rule can identify only one type of data, so we recommend that each sample library contain data of a single type. To identify multiple types of data, you must configure multiple sample libraries. For example, to identify both employee names and home addresses, you must configure two sample libraries: one for employee names and one for home addresses.
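The file limits above can be checked locally before you upload a sample file. The following is a minimal sketch, not part of DataWorks: the function name check_sample_file is hypothetical, and the code assumes that "500 KB" means 500 × 1024 bytes and that the TXT format is indicated by a .txt file extension.

```python
from pathlib import Path

# Assumption: "500 KB" is interpreted as 500 * 1024 bytes.
MAX_SAMPLE_FILE_BYTES = 500 * 1024


def check_sample_file(path):
    """Return a list of problems found in a sample file; an empty list means none."""
    path = Path(path)
    problems = []

    # Assumption: "TXT format" is checked by file extension only.
    if path.suffix.lower() != ".txt":
        problems.append("file extension is not .txt")

    raw = path.read_bytes()
    if len(raw) > MAX_SAMPLE_FILE_BYTES:
        problems.append(f"file is {len(raw)} bytes, over the 500 KB limit")

    # The file must decode cleanly as UTF-8.
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")

    return problems
```

For example, check_sample_file("employee_names.txt") returns an empty list if the file passes all three checks. The check does not verify that the file contains only one type of data; that remains a manual decision.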
Create a sample library
Go to the Data Security Guard page.
Log on to the DataWorks console. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to DataStudio.
Click the icon in the upper-left corner and choose .
Click Try now to go to the Data Security Guard page.
In the left-side navigation pane, choose .
Click the Sample Management tab.
On the Sample Management tab, click Add Sample. In the dialog box that appears, specify a name for the sample library and upload a sample file.
Note: You can upload only UTF-8-encoded files in the TXT format as sample files to DataWorks. A sample file cannot exceed 500 KB in size. Each data entry in a sample file must occupy its own line.
Note: A data identification rule can identify only one type of data, so we recommend that each sample library contain data of a single type. To identify multiple types of data, you must configure multiple sample libraries. For example, to identify both employee names and home addresses, you must configure two sample libraries: one for employee names and one for home addresses.
You can upload multiple sample files to a sample library in DataWorks.
Click Save. The sample library is created.
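Because a sample library accepts multiple sample files but each file is limited to 500 KB, a long list of entries may need to be split across several files before upload. The following is a minimal sketch of one way to do that. The names split_entries and write_sample_files and the sample_N.txt naming pattern are hypothetical, and the code assumes that 500 KB means 500 × 1024 bytes. This topic does not state whether a sample library limits the number of files or their total size, so the sketch does not check for that.

```python
from pathlib import Path

# Assumption: "500 KB" is interpreted as 500 * 1024 bytes.
MAX_SAMPLE_FILE_BYTES = 500 * 1024


def split_entries(entries, max_bytes=MAX_SAMPLE_FILE_BYTES):
    """Group entries into chunks whose UTF-8 size, one entry per line, fits max_bytes."""
    chunks, current, current_size = [], [], 0
    for entry in entries:
        size = len(entry.encode("utf-8")) + 1  # +1 for the trailing newline
        if size > max_bytes:
            raise ValueError(f"entry too large for one file: {entry[:30]!r}")
        # Start a new chunk when adding this entry would exceed the limit.
        if current and current_size + size > max_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(entry)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


def write_sample_files(entries, directory, prefix="sample",
                       max_bytes=MAX_SAMPLE_FILE_BYTES):
    """Write each chunk to its own UTF-8 .txt file and return the file paths."""
    paths = []
    for index, chunk in enumerate(split_entries(entries, max_bytes), start=1):
        path = Path(directory) / f"{prefix}_{index}.txt"
        # newline="\n" keeps one byte per line break on every platform.
        with open(path, "w", encoding="utf-8", newline="\n") as handle:
            for entry in chunk:
                handle.write(entry + "\n")
        paths.append(path)
    return paths
```

Each resulting file can then be uploaded to the same sample library in the Add Sample dialog box.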
After you create a sample library, you can associate it with a data identification rule. If the data to be identified contains data from the sample library, it hits the data identification rule. For more information about how to associate a sample library with a data identification rule, see Identify sensitive data.
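The hit condition described above can be pictured with a small sketch. This is an illustration only and not the DataWorks implementation: the function names load_sample_library and hits_rule are hypothetical, and the code assumes that a hit means a value exactly equals one entry in the sample file after surrounding whitespace is trimmed. This topic does not specify the actual matching behavior, such as case sensitivity, partial matches, or hit thresholds.

```python
def load_sample_library(lines):
    """Build a set of entries from sample-file lines, skipping blank lines."""
    return {line.strip() for line in lines if line.strip()}


def hits_rule(values, sample_library):
    """Return True if any value exactly matches an entry in the sample library.

    Assumption: exact, whitespace-trimmed equality stands in for "contains".
    """
    return any(value.strip() in sample_library for value in values)


# Hypothetical usage: one entry per line, as in a sample file.
employee_names = load_sample_library(["Alice\n", "Bob\n", "\n"])
column_with_hit = ["Carol", "Bob"]
column_without_hit = ["Carol", "Dave"]
```

With these inputs, hits_rule(column_with_hit, employee_names) is True and hits_rule(column_without_hit, employee_names) is False.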
Manage sample libraries
On the Sample Management tab, you can perform the following operations to manage existing sample libraries.
View the sample libraries.
For each existing sample library, you can view the number of samples that it contains and the data identification rules that are associated with it. To view the details of a sample library, find the sample library and click the icon in the Actions column.
Edit a sample library.
Find the sample library that you want to edit and click the icon in the Actions column. You can upload a new sample file or replace existing sample files.
Delete a sample library.
To delete a sample library, find the sample library and click the icon in the Actions column.
Note: If the sample library is associated with a data identification rule, you cannot delete the sample library. You can view the data identification rule that is associated with the sample library on the Sample Management tab. Before you delete the sample library, you must disassociate the sample library from the data identification rule on the configuration page of the data identification rule. For more information about how to configure a data identification rule, see Identify sensitive data.