Pricing overview

Last Updated: Apr 26, 2019

Pricing

With Data Lake Analytics (DLA), you pay only for the data that each query scans. There are no upfront infrastructure or maintenance costs.

The price is $4 per TB of data scanned, and you are billed on an hourly basis.

You can scan up to 100 GB for free within 30 days of activating the DLA service. The regular price applies once you have scanned the first 100 GB or once the 30 days have passed, whichever comes first.
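The free quota rule can be sketched as a small calculation. This is only an illustration, not the billing system's actual logic: the function names are hypothetical, and the sketch assumes 1 TB = 1024 GB and that day 30 still counts as inside the free window, neither of which is stated above.

```python
from decimal import Decimal

PRICE_PER_TB = Decimal("4")      # USD per TB scanned (from the pricing above)
FREE_QUOTA_GB = Decimal("100")   # free scan quota (from the text above)
FREE_WINDOW_DAYS = 30            # free window length in days (from the text above)
GB_PER_TB = Decimal("1024")      # ASSUMPTION: the conversion factor is not stated


def billable_gb(scanned_gb, free_gb_used, days_since_activation):
    """Return how many GB of a scan are billed at the regular price.

    Hypothetical helper: assumes the free quota is consumed first and
    that day 30 is still inside the free window.
    """
    scanned = Decimal(str(scanned_gb))
    if days_since_activation > FREE_WINDOW_DAYS:
        return scanned  # free window has ended, everything is billable
    remaining_free = max(FREE_QUOTA_GB - Decimal(str(free_gb_used)), Decimal("0"))
    return max(scanned - remaining_free, Decimal("0"))


def cost_usd(scanned_gb, free_gb_used, days_since_activation):
    """Return the cost in USD for one scan under the assumptions above."""
    gb = billable_gb(scanned_gb, free_gb_used, days_since_activation)
    return gb * PRICE_PER_TB / GB_PER_TB


# 150 GB scanned on day 10 with no quota used yet: 50 GB is billable.
print(billable_gb(150, 0, 10))
# The same scan on day 31 is fully billable.
print(billable_gb(150, 0, 31))
```

Decimal arithmetic is used so that the dollar amounts are not affected by binary floating-point rounding.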

For example, suppose you want to perform association analysis on a CSV file (1 TB) and a JSON file (1 TB) stored in OSS, and on a table (1 TB) stored in RDS. If the query scans all three sources in full, the cost is as follows:

$12 = $4/TB x 1 TB (CSV) + $4/TB x 1 TB (JSON) + $4/TB x 1 TB (RDS)
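The same arithmetic can be written as a short script. This is a minimal sketch: the helper name `scan_cost` and the source labels are illustrative only, and the script assumes that every source is scanned in full.

```python
from decimal import Decimal

PRICE_PER_TB = Decimal("4")  # USD per TB scanned (from the pricing above)


def scan_cost(scanned_tb):
    """Return the total cost in USD for an iterable of scan sizes in TB."""
    return sum((PRICE_PER_TB * Decimal(str(tb)) for tb in scanned_tb), Decimal("0"))


# Scan sizes from the example above, in TB (labels are illustrative).
sources = {"CSV (OSS)": 1, "JSON (OSS)": 1, "table (RDS)": 1}

total = scan_cost(sources.values())
print(f"${total}")  # prints "$12"
```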

How to save more

You can reduce costs and improve query performance by compressing raw data, converting data formats, or partitioning data.

Compressing: Compressed files are smaller, so DLA scans less data, which reduces the overall cost.

Converting data formats: DLA supports Apache ORC, Apache Parquet, and Apache Avro. Depending on your business needs, you can use filters so that DLA scans only part of the target files, tables, or objects.

Partitioning: You can partition data to limit the amount of data that DLA scans and avoid the cost of full scans.

Example:

Assume you compress the 1 TB CSV file from the preceding example to gzip format, which reduces the file size to 0.4 TB. You then partition the compressed data and scan only 50% of it, or 0.2 TB. Additionally, you convert the JSON data to ORC and scan only 10% of the file, or 0.1 TB. The RDS table (1 TB) is still scanned in full.

The cost to perform the preceding operation is as follows:

$5.20 = $4/TB x 0.2 TB (partitioned gzip) + $4/TB x 0.1 TB (ORC) + $4/TB x 1 TB (RDS)
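To see how much the optimizations save, the optimized cost can be compared with the cost of scanning all 3 TB in full. The following is a minimal sketch using only the figures from the examples on this page. The variable names are illustrative.

```python
from decimal import Decimal

PRICE_PER_TB = Decimal("4")  # USD per TB scanned (from the pricing above)

# Unoptimized case: 1 TB CSV + 1 TB JSON + 1 TB RDS table, all scanned in full.
baseline_tb = Decimal("1") + Decimal("1") + Decimal("1")

# Optimized case, using the figures from the example above.
gzip_scanned_tb = Decimal("0.4") * Decimal("0.5")  # 50% of the 0.4 TB gzip data
orc_scanned_tb = Decimal("0.1")                    # 10% of the ORC file
rds_scanned_tb = Decimal("1")                      # RDS table still scanned in full
optimized_tb = gzip_scanned_tb + orc_scanned_tb + rds_scanned_tb

baseline_cost = PRICE_PER_TB * baseline_tb
optimized_cost = PRICE_PER_TB * optimized_tb
savings = baseline_cost - optimized_cost

print(f"baseline:  ${baseline_cost}")   # prints "baseline:  $12"
print(f"optimized: ${optimized_cost}")  # prints "optimized: $5.20"
print(f"savings:   ${savings}")         # prints "savings:   $6.80"
```

Most of the remaining cost comes from the RDS table, which is the only source that is still scanned in full.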