This topic provides answers to some frequently asked questions (FAQ) about sensitive data scanning and detection.

Does data scanning affect the performance of my database?

Data Security Center (DSC) provides several data scanning modes: full data scans, incremental data scans, and scheduled data scans. A full data scan has little impact on the performance of your database and does not affect your business. An incremental data scan covers only modified data files, so its impact on database performance is negligible.

DSC performs a full data scan only after you complete asset authorization and manually initiate a new full data scan, or when the scheduled time for a full data scan arrives. DSC performs an incremental data scan only when the data or tables in your database are modified. To reduce the impact of data scanning on the performance of your database, take note of the following suggestions when you specify the data scanning cycle.
  • Increase the interval between data scans so that scans run less often.
  • Schedule data scans for a time range in which your database is not frequently accessed.
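
The second suggestion can be made concrete with a small calculation. The following is a minimal sketch, assuming you already have an hourly access metric for your database (for example, average queries per hour exported from your own monitoring). The function name `quietest_window` and the input format are hypothetical and are not part of DSC.

```python
from typing import Sequence


def quietest_window(hourly_load: Sequence[float], window_hours: int) -> int:
    """Return the start hour (0-23) of the contiguous window with the lowest total load.

    hourly_load holds one load value per hour of the day (index 0 = 00:00-01:00).
    The window is allowed to wrap past midnight.
    """
    if len(hourly_load) != 24:
        raise ValueError("hourly_load must contain exactly 24 values")
    if not 1 <= window_hours <= 24:
        raise ValueError("window_hours must be between 1 and 24")

    best_start, best_total = 0, float("inf")
    for start in range(24):
        # Sum the load over the window, wrapping around the end of the day.
        total = sum(hourly_load[(start + i) % 24] for i in range(window_hours))
        if total < best_total:
            best_start, best_total = start, total
    return best_start


# Hypothetical load profile: busy all day except 02:00-05:00.
load = [10.0] * 24
load[2] = load[3] = load[4] = 1.0
start_hour = quietest_window(load, window_hours=3)
```

The returned hour is only a starting point for choosing the scheduled scan time. Whether a scan fits inside the window depends on the size of the data asset.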

What types of data assets can DSC scan?

DSC can scan data assets that store structured data, unstructured data, or big data. The following types of data assets are supported:
  • ApsaraDB RDS instances, PolarDB instances, PolarDB-X instances, ApsaraDB for MongoDB instances, ApsaraDB for OceanBase clusters, and self-managed databases, which store structured data.
  • Object Storage Service (OSS) buckets, which store unstructured data.
  • Tablestore instances, MaxCompute projects, AnalyticDB for MySQL instances, and AnalyticDB for PostgreSQL instances, which store big data.

How long does it take to scan the data in a data asset after I authorize DSC to access the data asset?

DSC starts to scan the data in your data asset within 2 hours after it is authorized to access the data asset. The time required to complete the scan depends on the total size of the data. A scan takes a long time if the data asset contains a large number of tables, such as more than 10,000 tables, or if the total size of the objects stored in an OSS bucket is large, such as more than 1 PB. While DSC scans your data, the scan results are progressively updated on the Overview page in the DSC console. For more information, see View summary information.

How does DSC scan the unstructured data in a data asset, such as an OSS bucket?

DSC scans the unstructured data that is stored in a data asset and determines whether the data is sensitive.

  • First scan: After you authorize DSC to access an OSS bucket, DSC scans all objects that are stored in the OSS bucket.
  • Scan of incremental data: If you add objects to or modify the existing objects stored in the OSS bucket, DSC scans the added or modified objects.
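
The two cases above amount to a comparison between the current bucket listing and the state recorded at the last scan. The following is a minimal sketch of that selection logic, not a description of how DSC is implemented. It assumes that each object carries a change-detecting fingerprint such as an ETag. The names `ObjectVersion` and `objects_to_scan` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ObjectVersion:
    key: str   # object name in the bucket
    etag: str  # assumed fingerprint that changes when the content changes


def objects_to_scan(previous: Dict[str, str], current: List[ObjectVersion]) -> List[str]:
    """Return the keys of objects that are new or modified since the last scan.

    previous maps object key -> fingerprint recorded at the last scan.
    An empty mapping models the first scan, in which every object is selected.
    Objects that were deleted do not appear in current and are never selected.
    """
    return [obj.key for obj in current if previous.get(obj.key) != obj.etag]


listing = [
    ObjectVersion("reports/a.csv", "v1"),  # unchanged
    ObjectVersion("reports/b.csv", "v2"),  # modified
    ObjectVersion("reports/c.csv", "v1"),  # newly added
]
last_scan = {"reports/a.csv": "v1", "reports/b.csv": "v1", "reports/old.csv": "v1"}

first_scan_keys = objects_to_scan({}, listing)
incremental_keys = objects_to_scan(last_scan, listing)
```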

Can DSC rescan an OSS object after the object is scanned?

If the object remains unchanged, DSC does not rescan the object. If the object is modified, DSC rescans the object within 24 hours after the modification.

You can manually rescan OSS objects based on your business requirements. For more information, see Rescan your data assets.

How does DSC scan the structured data in a data asset, such as a MaxCompute project?

DSC scans the names and values of fields in databases or projects, and determines whether the fields are sensitive. For example, DSC scans the name and values of the age field. If DSC cannot determine whether a field is sensitive based on the values of the field, DSC also checks the name of the field.
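
The value-first, name-as-fallback order described above can be illustrated as follows. This is a minimal sketch under stated assumptions: the patterns, name hints, labels, and the 0.8 match threshold are all hypothetical examples and do not reflect the detection rules that DSC actually ships.

```python
import re
from typing import Optional, Sequence

# Hypothetical value patterns: a label applies if most sampled values match.
VALUE_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}

# Hypothetical name hints, compared against whole tokens of the field name.
NAME_HINTS = {
    "age": "age",
    "email": "email",
}


def classify_field(name: str, sampled_values: Sequence[str],
                   min_match_ratio: float = 0.8) -> Optional[str]:
    """Return a sensitivity label for a field, or None if nothing matches."""
    values = [v for v in sampled_values if v]

    # Step 1: try to decide from the sampled values.
    if values:
        for label, pattern in VALUE_PATTERNS.items():
            matches = sum(1 for v in values if pattern.match(v))
            if matches / len(values) >= min_match_ratio:
                return label

    # Step 2: the values were inconclusive, so fall back to the field name.
    # Splitting into tokens avoids matching "age" inside "page_count".
    tokens = set(re.split(r"[^a-z0-9]+", name.lower()))
    for hint, label in NAME_HINTS.items():
        if hint in tokens:
            return label
    return None


by_value = classify_field("contact", ["a@b.com", "c@d.org"])
by_name = classify_field("user_age", ["34", "27"])
no_match = classify_field("page_count", ["3", "7"])
```

In this sketch, plain numbers such as 34 do not identify an age by themselves, so the `user_age` field is labeled only because of its name, which mirrors the age example above.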

  • First scan: After you authorize DSC to access a database or project, DSC scans all tables in the database or project.
  • Scan of incremental data: If you add tables to the database or project, DSC scans the added tables. If you modify the schema of an existing table by changing fields, DSC rescans the modified table.

Does DSC log on to a database to obtain data?

If authorized, DSC logs on to a database and samples data to detect sensitive data. DSC does not save data from your data assets.

When will a rescan be triggered?

DSC automatically rescans data in an authorized data asset in the scenarios described in the following table.

| Scenario | Scanning logic | Billing |
| --- | --- | --- |
| You authorize DSC to access your data asset for the first time. | DSC scans all data in the data asset. | DSC charges you for a full scan on the data in the data asset. |
| You modify the data in a data asset after DSC has scanned the data asset with authorization. | If you add fields to or delete fields from a MaxCompute or database table, DSC automatically rescans the table. If you add rows to or delete rows from a table, DSC does not automatically rescan the table. | DSC charges you for a full scan on the data in the data asset. |
| | If you add objects to or modify the existing objects stored in an OSS bucket, DSC automatically scans the added or modified objects. Note: If you delete objects from an OSS bucket, DSC does not automatically rescan the bucket. | DSC charges you for scanning the added or modified objects. |
| You change sensitive data detection rules. For example, you create, delete, enable, or disable a rule. | DSC automatically scans all data in all authorized data assets. | DSC charges you for a full scan on the data in all authorized data assets. |
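
The scanning logic in the table can be summarized as a lookup from a change event to the scope of the automatic rescan. The following is a minimal sketch that only restates the table. The `Event` names and the `rescan_scope` function are hypothetical and do not correspond to any DSC API.

```python
from enum import Enum, auto
from typing import Optional


class Event(Enum):
    FIRST_AUTHORIZATION = auto()
    TABLE_FIELDS_ADDED_OR_DELETED = auto()
    TABLE_ROWS_ADDED_OR_DELETED = auto()
    OSS_OBJECTS_ADDED_OR_MODIFIED = auto()
    OSS_OBJECTS_DELETED = auto()
    DETECTION_RULES_CHANGED = auto()


# Events that trigger an automatic rescan, with the scope of that rescan.
# Events missing from this mapping do not trigger an automatic rescan.
RESCAN_SCOPE = {
    Event.FIRST_AUTHORIZATION: "all data in the data asset",
    Event.TABLE_FIELDS_ADDED_OR_DELETED: "the modified table",
    Event.OSS_OBJECTS_ADDED_OR_MODIFIED: "the added or modified objects",
    Event.DETECTION_RULES_CHANGED: "all data in all authorized data assets",
}


def rescan_scope(event: Event) -> Optional[str]:
    """Return what is rescanned automatically for an event, or None if nothing is."""
    return RESCAN_SCOPE.get(event)


rules_scope = rescan_scope(Event.DETECTION_RULES_CHANGED)
rows_scope = rescan_scope(Event.TABLE_ROWS_ADDED_OR_DELETED)
```

Because a rule change rescans every authorized data asset and is billed as a full scan, it can help to batch rule changes instead of applying them one at a time.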