This topic describes the limits that apply when you embed watermarks into data sources.

Which types of data sources support watermark embedding?

You can embed watermarks into any type of data source for which Data Security Center (DSC) supports static de-identification. For the list of these data source types, see Supported data assets.

What requirements must a data source meet before I can embed watermarks into it?

Watermark embedding works by writing the watermark information into columns that have different characteristics. The more characteristics the source data has, the more complete the embedded watermark information is, and the higher the success rate of watermark extraction. Watermarks can be extracted even if part of the data is missing. When you embed watermarks into a data source, take note of the following items:
  • Make sure that the data source contains at least 1,000 rows.

    If the data source contains fewer than 1,000 rows, watermark extraction may fail because the data has insufficient characteristics.

  • Embed watermarks into columns that have a wide range of values. If you embed watermarks into a column that has only a few enumerated values, watermark extraction may fail because the column has insufficient characteristics.

    Typically, columns that store addresses, names, UUIDs, amounts, or totals are suitable for watermark embedding. Columns that store values such as gender or status are not.

  • The values of a column may change after you embed watermarks into the column. Before you embed watermarks, make sure that these changes are acceptable.

    For more information, see the "Does the source data change after watermarks are embedded?" section of this topic.
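
The checks in the preceding list can be sketched as a small pre-flight routine. This is a minimal illustration and not part of DSC: the function name `check_watermark_candidate`, the list-of-dicts row format, and the `MIN_DISTINCT_VALUES` cut-off of 100 are assumptions made for the example. Only the 1,000-row minimum comes from this topic.

```python
MIN_ROWS = 1000            # minimum row count stated in this topic
MIN_DISTINCT_VALUES = 100  # hypothetical cut-off; DSC does not document a number


def check_watermark_candidate(rows, column):
    """Return a list of warnings about embedding a watermark into `column`.

    `rows` is a list of dicts, one dict per row. An empty list means that
    neither check raised a concern.
    """
    warnings = []

    # Check 1: the data source should contain at least 1,000 rows.
    if len(rows) < MIN_ROWS:
        warnings.append(
            f"only {len(rows)} rows; at least {MIN_ROWS} are recommended"
        )

    # Check 2: the column should have a wide range of values.
    distinct = len({row[column] for row in rows})
    if distinct < MIN_DISTINCT_VALUES:
        warnings.append(
            f"column '{column}' has only {distinct} distinct values"
        )

    return warnings


# Sample data: a UUID-like column and a two-value gender column.
rows = [{"uuid": f"id-{i}", "gender": "F" if i % 2 else "M"} for i in range(1500)]

uuid_warnings = check_watermark_candidate(rows, "uuid")
gender_warnings = check_watermark_candidate(rows, "gender")
print(uuid_warnings)
print(gender_warnings)
```

In this sample, the `uuid` column passes both checks, and the `gender` column is flagged because it has only two distinct values.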

Which watermark embedding algorithm do I need to select?

For a column of the string type, select the space algorithm. For a column of the numeric type, select the least significant bit algorithm.
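
The rule above can be written as a simple lookup on a sample value from the column. This is a sketch under assumptions: the function name `pick_algorithm` and the returned labels are hypothetical and do not correspond to DSC API values, and the mapping from Python types to "string" and "numeric" columns is a simplification.

```python
from decimal import Decimal


def pick_algorithm(sample_value):
    """Return the algorithm suggested by this topic for a column,
    based on the Python type of one sample value from that column."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(sample_value, bool):
        raise TypeError("boolean columns are not covered by this rule")
    if isinstance(sample_value, str):
        return "space"
    if isinstance(sample_value, (int, float, Decimal)):
        return "least significant bit"
    raise TypeError(f"unsupported column type: {type(sample_value).__name__}")


print(pick_algorithm("12 Example Road"))
print(pick_algorithm(1299))
print(pick_algorithm(Decimal("19.99")))
```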

Does the source data change after watermarks are embedded?

If you configure a task in the DSC console to embed watermarks into columns of Table A and write the resulting data to Table B, the data in Table A remains unchanged. Only the data that is written to Table B is changed.

Some of the watermarked values differ from the corresponding source values. Watermarks are embedded by using the space algorithm or the least significant bit algorithm.
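
This topic does not describe how DSC implements the two algorithms. To give a rough idea of why watermarked values differ from the source values, the sketch below uses generic textbook versions: a least significant bit scheme that overwrites the lowest bit of an integer, and a whitespace scheme that encodes a bit as the presence or absence of a trailing space. All function names and the sample data are hypothetical, and the actual DSC behavior may differ.

```python
def embed_lsb(value: int, bit: int) -> int:
    """Overwrite the lowest bit of an integer with a watermark bit."""
    return (value & ~1) | bit


def extract_lsb(value: int) -> int:
    """Read the watermark bit back from the lowest bit."""
    return value & 1


def embed_space(text: str, bit: int) -> str:
    """Encode a watermark bit as a trailing space (1) or no trailing space (0)."""
    return text.rstrip(" ") + (" " if bit else "")


def extract_space(text: str) -> int:
    """Read the watermark bit back from the trailing space."""
    return 1 if text.endswith(" ") else 0


bits = [1, 0, 1, 1]                      # watermark bits to embed
amounts = [1200, 1351, 998, 42]          # "Table A" numeric column
names = ["Alice", "Bob", "Carol", "Dave"]  # "Table A" string column

# The watermarked copies play the role of "Table B"; the source lists are not modified.
marked_amounts = [embed_lsb(v, b) for v, b in zip(amounts, bits)]
marked_names = [embed_space(s, b) for s, b in zip(names, bits)]

print(marked_amounts)                             # [1201, 1350, 999, 43]
print([extract_lsb(v) for v in marked_amounts])   # [1, 0, 1, 1]
print([extract_space(s) for s in marked_names])   # [1, 0, 1, 1]
```

In this sketch, each integer changes by at most 1 and each sample name gains at most one trailing space, while the source lists stay as they were.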