What data sources can SDDP scan?

Sensitive Data Discovery and Protection (SDDP) can scan data in both structured and unstructured data sources.

| Release date | Supported data source |
| --- | --- |
| July 2019 | Relational Database Service (RDS) for MySQL database (structured data source) |
| July 2019 | MaxCompute project (structured data source) |
| July 2019 | Object Storage Service (OSS) bucket (unstructured data source) |

How long does it take to scan data in my data source after I authorize SDDP to scan the data source?

SDDP starts scanning a data source within 2 hours after you authorize it to do so. How long the scan takes depends on the data volume: a data source that contains a large number of tables (for example, more than 10,000) or an OSS bucket whose total object size is large (for example, more than 1 PB) takes longer to scan. During a scan, the results are updated progressively on the Overview page in the SDDP console. For more information, see Use the Overview page.

How does SDDP scan data in an unstructured data source, that is, OSS?

SDDP scans objects stored in OSS for sensitive data in the following way:

  • First scan: After you authorize SDDP to scan an OSS bucket, SDDP scans all objects stored in the OSS bucket during the first scan.
  • Scan of incremental data: If you add objects to or modify objects stored in the OSS bucket, SDDP scans the new or modified objects.
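The two scan modes above amount to a simple change-detection rule, which might be sketched as follows. This is only an illustration: it assumes each object carries some version marker (for example, an ETag or a last-modified timestamp), and the function name `select_objects_to_scan` is hypothetical. How SDDP detects changes internally is not described here.

```python
from typing import Dict, List


def select_objects_to_scan(current: Dict[str, str], scanned: Dict[str, str]) -> List[str]:
    """Return the keys of objects that are new or whose marker has changed.

    current: object key -> version marker as the bucket looks now (assumed input)
    scanned: object key -> version marker recorded at the previous scan
    """
    # On the first scan `scanned` is empty, so every object is selected.
    # Keys that exist only in `scanned` (deleted objects) are never selected.
    return sorted(key for key, marker in current.items() if scanned.get(key) != marker)
```

With an empty `scanned` mapping the function returns every key, which corresponds to the first scan. On later calls it returns only the added or modified objects.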

Can SDDP rescan an OSS object after the object is scanned?

If a scanned object remains unchanged, SDDP will not rescan it. If you modify a scanned object, SDDP rescans the object within 4 to 8 hours after the modification.
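As a small worked example of the timing above, the following sketch computes the window in which a rescan could be expected after a modification. The function name is hypothetical, and reading "within 4 to 8 hours" as a window that opens 4 hours and closes 8 hours after the modification is an assumption.

```python
from datetime import datetime, timedelta
from typing import Tuple


def expected_rescan_window(modified_at: datetime) -> Tuple[datetime, datetime]:
    """Return the (earliest, latest) expected rescan time for a modified object."""
    # Assumption: the rescan happens between 4 and 8 hours after the modification.
    return modified_at + timedelta(hours=4), modified_at + timedelta(hours=8)
```

For an object modified at 09:00, this gives a window from 13:00 to 17:00 on the same day.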

SDDP will provide a manual scan feature that lets you manually scan the objects stored in specified OSS buckets.

How does SDDP scan data in a structured data source, that is, MaxCompute or RDS?

SDDP scans the names and values of fields (for example, an age field) in MaxCompute or RDS to determine whether the fields are sensitive. If the values of a field alone are not enough to decide, SDDP also checks the name of the field.

  • First scan: After you authorize SDDP to scan a MaxCompute project or an RDS database, SDDP scans all tables in the project or database during the first scan.
  • Scan of incremental data: If you add tables to the database or project, SDDP scans the new tables. If you modify the schema of an existing table by changing columns, SDDP scans the table again.
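The field-level check described before the list (values first, then the field name as a fallback) could look roughly like the sketch below. The patterns, the name hints, and the function `classify_field` are all hypothetical and are not SDDP's actual detection rules.

```python
import re
from typing import Iterable, Optional

# Hypothetical value-based rules: label -> pattern that every sampled value must match.
VALUE_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "phone": re.compile(r"\+?\d{10,15}"),
}

# Hypothetical name-based hints: token in the field name -> label.
NAME_HINTS = {"age": "age", "email": "email", "phone": "phone", "mobile": "phone"}


def classify_field(name: str, sample_values: Iterable[str]) -> Optional[str]:
    """Return a sensitivity label for a field, or None if nothing matches."""
    values = [v for v in sample_values if v]
    # Step 1: try to decide from the sampled values alone.
    for label, pattern in VALUE_PATTERNS.items():
        if values and all(pattern.fullmatch(v) for v in values):
            return label
    # Step 2: values were inconclusive, so fall back to the field name.
    for token in re.split(r"[^a-z0-9]+", name.lower()):
        if token in NAME_HINTS:
            return NAME_HINTS[token]
    return None
```

For example, values such as `31` and `45` are ambiguous on their own, but a field named `user_age` would still be labeled through the name fallback.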

Does SDDP log on to a database to obtain data?

If authorized, SDDP connects to a MaxCompute project or logs on to an RDS database and samples data to detect sensitive data. SDDP does not save any data from the MaxCompute project or RDS database.

In which scenarios will a scan be triggered?

Currently, SDDP automatically scans data in an authorized data source in the scenarios listed in the following table.

| Scenario | Scan logic | Billing |
| --- | --- | --- |
| You authorize SDDP to scan your data source for the first time. | SDDP scans all data in the data source. | SDDP charges you for a full scan on data in the data source. |
| You change data in a data source after SDDP has scanned data in the data source with authorization. | If you add columns to or delete columns from a MaxCompute or RDS table, SDDP automatically rescans the table. If you add rows to or delete rows from a MaxCompute or RDS table, SDDP does not automatically rescan the table. | SDDP charges you for a full scan on data in the data source. |
| You change data in a data source after SDDP has scanned data in the data source with authorization. | If you add objects to or modify objects stored in an OSS bucket, SDDP automatically scans the new or modified objects. Note: If you only delete objects from an OSS bucket, SDDP does not automatically scan the bucket. | SDDP charges you for scanning the new or modified objects. |
| You change sensitive data detection rules, including adding, deleting, enabling, or disabling rules. | SDDP automatically scans all data in all authorized data sources. | SDDP charges you for a full scan on data in all authorized data sources. |
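The scenarios in the table can be summarized as a lookup from the kind of change to the resulting scan scope and billing. The sketch below only mirrors the table; the names `Change`, `ScanPlan`, and `plan_scan` are hypothetical and do not correspond to any SDDP API.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Change(Enum):
    FIRST_AUTHORIZATION = "first_authorization"
    TABLE_COLUMNS_CHANGED = "table_columns_changed"
    TABLE_ROWS_CHANGED = "table_rows_changed"
    OSS_OBJECTS_ADDED_OR_MODIFIED = "oss_objects_added_or_modified"
    OSS_OBJECTS_DELETED = "oss_objects_deleted"
    RULES_CHANGED = "rules_changed"


class ScanPlan(NamedTuple):
    scope: str       # what is scanned
    billed_for: str  # what the charge is based on


# Changes that are absent from this mapping do not trigger an automatic scan.
_PLANS = {
    Change.FIRST_AUTHORIZATION: ScanPlan(
        "all data in the data source", "full scan of the data source"),
    Change.TABLE_COLUMNS_CHANGED: ScanPlan(
        "the changed table", "full scan of the data source"),
    Change.OSS_OBJECTS_ADDED_OR_MODIFIED: ScanPlan(
        "new or modified objects", "new or modified objects"),
    Change.RULES_CHANGED: ScanPlan(
        "all data in all authorized data sources",
        "full scan of all authorized data sources"),
}


def plan_scan(change: Change) -> Optional[ScanPlan]:
    """Return the scan plan for a change, or None if no scan is triggered."""
    return _PLANS.get(change)
```

Row-only changes to a table and delete-only changes to a bucket return `None`, matching the cases in which the table says no automatic scan takes place.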

In which scenarios will SDDP not scan my data?

Currently, SDDP scans only OSS objects that are smaller than 200 MB. An object whose size is 200 MB or larger is not scanned.

Note: A package is treated as a single OSS object. If the total size of all files in the package is 200 MB or larger, SDDP does not scan the package.
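To estimate in advance which objects fall under the limit, the size rule can be expressed as a short check. This is a sketch under assumptions: the helper names are hypothetical, and 1 MB is taken to mean 1,048,576 bytes, which the text above does not specify.

```python
from typing import Iterable

# Assumption: 1 MB = 1024 * 1024 bytes; the limit may instead be defined in decimal megabytes.
SCAN_SIZE_LIMIT_BYTES = 200 * 1024 * 1024


def is_object_scannable(size_bytes: int) -> bool:
    """An object is scanned only if it is strictly smaller than the limit."""
    return size_bytes < SCAN_SIZE_LIMIT_BYTES


def is_package_scannable(file_sizes: Iterable[int]) -> bool:
    """A package counts as one object: the sizes of its files are summed."""
    return sum(file_sizes) < SCAN_SIZE_LIMIT_BYTES
```

Note that the comparison is strict, so an object of exactly 200 MB is skipped, and a package of small files is skipped once their combined size reaches the limit.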