DLA uses the pay-as-you-go (post-payment) billing method, which charges fees based on the number of bytes scanned. Cluster setup, maintenance, and upgrades are free of charge.
The billing rule is as follows: USD 4 is charged for every TB of data scanned.
The minimum scanned data size that the billing system bills is 32 MB. A scan of less than 32 MB is billed as 32 MB. The system creates one bill per hour and deducts fees from your Alibaba Cloud account. To view the consumption records, log on to the Alibaba Cloud console and choose Billing Management > Bills.
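The billing rule and the 32 MB minimum can be sketched in Python as follows. This is only an illustration, not the actual DLA metering logic. It assumes binary units (1 TB = 1024^4 bytes, 1 MB = 1024^2 bytes) and assumes that the minimum applies to each scan separately; the text above states neither. The function and constant names are hypothetical.

```python
USD_PER_TB = 4.0                      # price stated in the billing rule
BYTES_PER_MB = 1024 ** 2              # assumption: binary units
BYTES_PER_TB = 1024 ** 4              # assumption: binary units
MIN_BILLED_BYTES = 32 * BYTES_PER_MB  # minimum billable scan size (32 MB)


def billed_bytes(scanned_bytes: int) -> int:
    """Return the billable size: scans under 32 MB are billed as 32 MB."""
    if scanned_bytes < 0:
        raise ValueError("scanned_bytes must be non-negative")
    return max(scanned_bytes, MIN_BILLED_BYTES)


def scan_cost_usd(scanned_bytes: int) -> float:
    """Return the fee in USD for one scan (assumed to be billed per scan)."""
    return billed_bytes(scanned_bytes) / BYTES_PER_TB * USD_PER_TB


if __name__ == "__main__":
    print(scan_cost_usd(BYTES_PER_TB))       # a full 1 TB scan
    print(scan_cost_usd(5 * BYTES_PER_MB))   # a 5 MB scan, billed as 32 MB
```

Under these assumptions, a 5 MB scan costs the same as a 32 MB scan, so many very small scans are proportionally more expensive than a few larger ones.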
The following example helps you understand the DLA billing method.
Assume that you store a 1 TB CSV file and a 1 TB JSON file in OSS and a 1 TB table in RDS.
If you use DLA to perform an association analysis on the data in OSS and RDS, DLA scans all three data sources in full (3 TB in total). Based on the scanned data size and the DLA billing method, you pay 4 + 4 + 4 = USD 12.
To reduce costs, you can process the original data by using any of the following methods before you use DLA to scan it.
Format conversion: You can convert the original data format into a high-performance format.
DLA supports multiple high-performance storage formats, such as Apache ORC, Apache Parquet, and Avro. You can convert the original data into one of these formats according to your service requirements, so that DLA scans only the required data columns instead of all data. For details about how to convert data into a high-performance storage format, see File format conversion.
Data compression: You can compress the original data to reduce the data size, and then use DLA to scan the compressed data.
Data partitioning: You can store the original data in different partitions, and then use DLA to scan one or more partitions instead of all partitions.
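As an illustration of the compression and partitioning methods, the following Python sketch splits a local CSV file by the value of one column and writes one GZIP-compressed file per value. It uses only the standard library. The function name, the file name part-0.csv.gz, and the key=value directory naming are assumptions made for this example, and uploading the result to OSS is not shown. The sketch holds all rows in memory, so it suits small files only.

```python
import csv
import gzip
from collections import defaultdict
from pathlib import Path


def write_partitioned_gzip(src_csv: Path, out_dir: Path, partition_col: str) -> list[Path]:
    """Split a CSV by one column and write one gzip-compressed CSV per value."""
    rows_by_key = defaultdict(list)
    with src_csv.open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        for row in reader:
            rows_by_key[row[partition_col]].append(row)

    written = []
    for key, rows in sorted(rows_by_key.items()):
        # Assumed layout: one directory per partition value, named key=value.
        part_dir = out_dir / f"{partition_col}={key}"
        part_dir.mkdir(parents=True, exist_ok=True)
        part_file = part_dir / "part-0.csv.gz"
        # Text-mode gzip stream; the header row is repeated in every file.
        with gzip.open(part_file, "wt", newline="") as gz:
            writer = csv.DictWriter(gz, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        written.append(part_file)
    return written
```

A query that filters on the partition column then needs to read only the files under the matching directory, and each of those files is smaller than its uncompressed equivalent.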
In the preceding billing example, you can reduce the DLA scanning cost as follows:
Compress the 1 TB CSV file in GZIP format. The size of the compressed file is 0.4 TB. Then partition the compressed data so that the data to be scanned is stored in a single partition. DLA scans only that partition, so the scanned data size is reduced to 0.2 TB.
Convert the 1 TB JSON file into the ORC format. DLA then scans only the required columns, which make up 10% of the data, so the scanned data size is reduced to 0.1 TB.
After format conversion, compression, and partitioning, the DLA scanning fee is 4 x 0.2 + 4 x 0.1 + 4 = USD 5.2. The last term is the 1 TB RDS table, which is still scanned in full. This reduces the cost by USD 6.8 in total.
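The arithmetic of this example can be checked with a short Python sketch. The sizes are the ones given in the example. The 0.5 partition fraction is inferred from the 0.4 TB to 0.2 TB reduction and is not stated directly. The variable and function names are hypothetical.

```python
USD_PER_TB = 4  # price per TB scanned, from the billing rule

# Scanned sizes in TB before any optimization.
baseline_tb = {"csv": 1.0, "json": 1.0, "rds": 1.0}

# Scanned sizes in TB after the optimizations described in the example.
optimized_tb = {
    "csv": 0.4 * 0.5,   # GZIP: 1 TB -> 0.4 TB; one partition scanned (inferred half) -> 0.2 TB
    "json": 1.0 * 0.1,  # ORC: only 10% of the data is scanned by column -> 0.1 TB
    "rds": 1.0,         # RDS table unchanged, still scanned in full
}


def total_cost_usd(scanned_tb: dict) -> float:
    """Return the total fee in USD, rounded to cents to hide float noise."""
    return round(USD_PER_TB * sum(scanned_tb.values()), 2)


baseline = total_cost_usd(baseline_tb)
optimized = total_cost_usd(optimized_tb)
savings = round(baseline - optimized, 2)

print(f"baseline:  USD {baseline}")
print(f"optimized: USD {optimized}")
print(f"savings:   USD {savings}")
```

Because the RDS table is untouched, it accounts for USD 4 of the remaining USD 5.2; any further reduction would have to come from scanning less of that table.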