This topic provides answers to frequently asked questions about sensitive data scanning and detection.

What types of data assets can DSC scan?

DSC can scan data assets that store structured or unstructured data, including the following types:
  • ApsaraDB RDS databases and self-managed databases, which store structured data
  • MaxCompute projects, which store structured data
  • Object Storage Service (OSS) buckets, which store unstructured data

How long does it take to scan data in my data asset after I authorize DSC to access the data asset?

DSC starts to scan your data asset within 2 hours after you authorize DSC to access it. The scan duration depends on the data volume. A data asset that contains a large number of tables (for example, more than 10,000) takes a long time to scan, as does an OSS bucket whose objects have a large total size (for example, more than 1 PB). While DSC scans your data, the scan results are progressively updated on the Overview page in the DSC console. For more information, see View summary information.

How does DSC scan data in an unstructured data asset, such as an OSS bucket?

DSC scans the objects that are stored in an unstructured data asset and determines whether the objects are sensitive.

  • First scan: After you authorize DSC to access an OSS bucket, DSC scans all objects that are stored in the OSS bucket.
  • Scan of incremental data: If you add objects to or modify objects stored in the OSS bucket, DSC scans the new or modified objects.
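
The first-scan and incremental-scan behavior described above can be sketched as a comparison between what was recorded at the last scan and what the bucket contains now. This is only an illustration. How DSC detects changes is not described in this topic, so comparing per-object ETags is an assumption, and `ObjectMeta` and `objects_to_scan` are hypothetical names, not part of any DSC or OSS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectMeta:
    """Hypothetical metadata for one OSS object (not a DSC type)."""
    key: str
    etag: str  # assumed content fingerprint that changes when the object is modified


def objects_to_scan(previous, current):
    """Return the keys of objects that are new or modified since the last scan.

    previous: dict mapping key -> etag recorded at the last scan. It is empty
              before the first scan, so every object is selected.
    current:  iterable of ObjectMeta describing the bucket now.
    """
    selected = []
    for obj in current:
        # A missing key means a new object; a different etag means a modified one.
        if previous.get(obj.key) != obj.etag:
            selected.append(obj.key)
    return sorted(selected)


# First scan: nothing has been recorded yet, so all objects are selected.
bucket = [ObjectMeta("a.csv", "e1"), ObjectMeta("b.txt", "e2")]
first_scan = objects_to_scan({}, bucket)

# Incremental scan: b.txt was modified and c.log was added.
recorded = {obj.key: obj.etag for obj in bucket}
bucket_now = [ObjectMeta("a.csv", "e1"), ObjectMeta("b.txt", "e9"), ObjectMeta("c.log", "e3")]
incremental_scan = objects_to_scan(recorded, bucket_now)
```

In this sketch, objects that were only deleted never appear in the result, because the loop iterates over the objects that currently exist.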

Does DSC rescan an OSS object after the object is scanned?

If the object remains unchanged, DSC does not rescan it. If you modify the object, DSC rescans the object within 4 to 8 hours after the modification.

How does DSC scan data in a structured data asset, such as a MaxCompute project or an ApsaraDB RDS database?

DSC scans the names and values of fields in databases or projects to determine whether the fields are sensitive. For example, for a field named age, DSC scans both the field name and the field values. If DSC cannot determine whether a field is sensitive based only on its values, DSC also checks the field name.
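
The "values first, field name as a fallback" idea can be illustrated with a minimal sketch. The email pattern, the name hints, the 0.8 threshold, and the function `is_sensitive_field` are all assumptions made for this example. They do not represent the detection rules that DSC actually applies.

```python
import re

# Hypothetical value pattern and name hints; real DSC rules are not shown here.
EMAIL_PATTERN = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
SENSITIVE_NAME_HINTS = {"age", "email", "phone"}


def is_sensitive_field(name, sampled_values, threshold=0.8):
    """Classify a field from its sampled values, falling back to its name."""
    values = [str(v) for v in sampled_values if v is not None]
    if values:
        hits = sum(1 for v in values if EMAIL_PATTERN.match(v))
        if hits / len(values) >= threshold:
            return True  # the values alone are conclusive

    # The values were inconclusive (for example, plain integers in an age
    # column), so the field name is used as the deciding signal.
    tokens = re.split(r"[^a-z0-9]+", name.lower())
    return any(token in SENSITIVE_NAME_HINTS for token in tokens)


by_values = is_sensitive_field("contact", ["alice@example.com", "bob@example.org"])
by_name = is_sensitive_field("user_age", [31, 45, 27])
not_sensitive = is_sensitive_field("page_count", [31, 45, 27])
```

The name is split into tokens rather than searched as a substring so that a field such as page_count is not mistaken for an age field.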

  • First scan: After you authorize DSC to access a database or project, DSC scans all tables in the database or project.
  • Scan of incremental data: If you add tables to the database or project, DSC scans the new tables. If you modify the schema of an existing table by changing fields, DSC rescans the table.
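
As a rough illustration of the two bullets above, the following sketch selects the tables to scan by comparing table schemas between two points in time. The function `tables_to_scan` and the dictionary-based schema snapshot are hypothetical. This topic does not describe how DSC tracks schema changes internally.

```python
def tables_to_scan(previous_schemas, current_schemas):
    """Return the names of tables that are new or whose set of fields changed.

    Both arguments map a table name to an iterable of field names. Only the
    schema is compared; row-level changes are not part of the input.
    """
    selected = []
    for table, fields in current_schemas.items():
        old_fields = previous_schemas.get(table)
        # None means the table did not exist at the previous scan.
        if old_fields is None or set(old_fields) != set(fields):
            selected.append(table)
    return sorted(selected)


# First scan: no schemas have been recorded, so every table is selected.
first_scan = tables_to_scan({}, {"users": ["id", "age"], "orders": ["id"]})

# Incremental scan: users gained a field and logs is a new table.
incremental_scan = tables_to_scan(
    {"users": ["id", "age"], "orders": ["id"]},
    {"users": ["id", "age", "email"], "orders": ["id"], "logs": ["ts"]},
)
```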

Does DSC log on to a database to obtain data?

If authorized, DSC logs on to a database and samples data to detect sensitive data. DSC does not save data from databases or MaxCompute projects.

When will a rescan be triggered?

DSC automatically rescans data in an authorized data asset in the scenarios described in the following table.

| Scenario | Scan logic | Billing |
| --- | --- | --- |
| You authorize DSC to access your data asset for the first time. | DSC scans all data in the data asset. | DSC charges you for a full scan on data in the data asset. |
| You modify data in a data asset after DSC has scanned the data asset with authorization. | If you add fields to or delete fields from a MaxCompute or database table, DSC automatically rescans the table. If you add rows to or delete rows from a table, DSC does not automatically rescan the table. | DSC charges you for a full scan on data in the data asset. |
| You modify data in a data asset after DSC has scanned the data asset with authorization. | If you add objects to or modify objects stored in an OSS bucket, DSC automatically scans the new or modified objects. Note: If you only delete objects from an OSS bucket, DSC does not automatically rescan the bucket. | DSC charges you for scanning the new or modified objects. |
| You change sensitive data detection rules. For example, you create, delete, enable, or disable rules. | DSC automatically scans all data in all authorized data assets. | DSC charges you for a full scan on data in all authorized data assets. |
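
The scenarios in the table can also be read as a lookup from a change event to a scan scope and a billing basis. The sketch below encodes that reading. The `Event` names and the `rescan_plan` helper are hypothetical labels chosen for this example, and the strings paraphrase the table rather than quote any DSC API.

```python
from enum import Enum


class Event(Enum):
    """Hypothetical labels for the scenarios in the table."""
    FIRST_AUTHORIZATION = "first_authorization"
    TABLE_FIELDS_ADDED_OR_DELETED = "table_fields_added_or_deleted"
    TABLE_ROWS_ADDED_OR_DELETED = "table_rows_added_or_deleted"
    OSS_OBJECTS_ADDED_OR_MODIFIED = "oss_objects_added_or_modified"
    OSS_OBJECTS_DELETED = "oss_objects_deleted"
    DETECTION_RULES_CHANGED = "detection_rules_changed"


# Event -> (scan scope, billing basis). None means no automatic rescan.
RESCAN_RULES = {
    Event.FIRST_AUTHORIZATION: (
        "all data in the data asset",
        "full scan of the data asset",
    ),
    Event.TABLE_FIELDS_ADDED_OR_DELETED: (
        "the changed table",
        "full scan of the data asset",
    ),
    Event.TABLE_ROWS_ADDED_OR_DELETED: None,
    Event.OSS_OBJECTS_ADDED_OR_MODIFIED: (
        "the new or modified objects",
        "the new or modified objects",
    ),
    Event.OSS_OBJECTS_DELETED: None,
    Event.DETECTION_RULES_CHANGED: (
        "all data in all authorized data assets",
        "full scan of all authorized data assets",
    ),
}


def rescan_plan(event):
    """Return (scope, billing) for an event, or None if no rescan is triggered."""
    return RESCAN_RULES[event]
```

For example, `rescan_plan(Event.TABLE_ROWS_ADDED_OR_DELETED)` returns `None`, which mirrors the statement that row-level changes do not trigger an automatic rescan.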