Pay-per-byte is a pay-as-you-go billing method under which you are charged only for the number of bytes that are scanned. The same rule applies when you use Data Lake Analytics (DLA) to perform association analysis on data in local or third-party data sources. This topic describes the billing rules, billing examples, and preferential policies of the pay-per-byte billing method.

Billing rules

The minimum billable size of scanned data is 32 MB. DLA generates bills on an hourly basis and deducts the fees from the balance of your Alibaba Cloud account. To view your bills, log on to the DLA console and choose Expenses > Orders.
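
The following is a minimal sketch of how the 32 MB minimum could affect a fee estimate. It rests on several assumptions that this topic does not confirm: the minimum is applied per scan, smaller scans are rounded up to 32 MB, and 1 MB is 1,024 × 1,024 bytes. The function names and the `price_per_gb` parameter are hypothetical placeholders, not DLA rates or APIs.

```python
# 32 MB minimum from the billing rules (binary megabytes assumed).
MIN_BILLABLE_BYTES = 32 * 1024 * 1024


def billable_bytes(scanned_bytes: int) -> int:
    """Return the bytes billed for one scan.

    Assumption: scans below the minimum are rounded up to 32 MB.
    """
    if scanned_bytes < 0:
        raise ValueError("scanned_bytes must be non-negative")
    return max(scanned_bytes, MIN_BILLABLE_BYTES)


def estimate_fee(scanned_bytes: int, price_per_gb: float) -> float:
    """Estimate the fee for one scan.

    price_per_gb is a hypothetical placeholder, not a published DLA price.
    """
    return billable_bytes(scanned_bytes) / (1024 ** 3) * price_per_gb
```

Under these assumptions, a scan of 1 KB and a scan of 32 MB produce the same estimate, so many very small scans can cost more than their combined size suggests.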

Methods to reduce costs

To reduce costs, you can use one or more of the following methods to process the raw data before you use DLA to scan the data:
  • Format conversion: convert the raw data into a high-performance data format.

    DLA supports multiple high-performance data formats, such as Apache ORC, Apache Parquet, and Apache Avro. Convert your data into one of these formats, and then use DLA to scan only the data in the required columns.

  • Data compression: compress the raw data to reduce its size. We recommend that you store the compressed data in files in the Apache Parquet or Apache ORC format, and then use DLA to scan the data in those files.
  • Data partitioning: store the raw data in different partitions. Then, use DLA to scan only the data in the partitions that you require.
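
The effect of partitioning and compression on the amount of data to scan can be illustrated with a small sketch that uses only the Python standard library. It is not a DLA API. Gzip-compressed CSV stands in for formats such as Apache Parquet or Apache ORC, which the standard library cannot write, and the sample rows, the `dt` partition key, and the function name `partition_and_compress` are hypothetical.

```python
import csv
import gzip
import io
from collections import defaultdict


def partition_and_compress(rows, partition_key):
    """Group rows (dicts) by partition_key and gzip each group's CSV bytes."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)

    compressed = {}
    for value, group in groups.items():
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=list(group[0].keys()))
        writer.writeheader()
        writer.writerows(group)
        compressed[value] = gzip.compress(buf.getvalue().encode("utf-8"))
    return compressed


def raw_csv_size(rows):
    """Size in bytes of all rows written as one uncompressed CSV."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return len(buf.getvalue().encode("utf-8"))


# Hypothetical sample data: three days of log rows, 1,000 rows per day.
rows = [
    {"dt": f"2024-01-0{day}", "user": f"user{i % 10}", "action": "click"}
    for day in (1, 2, 3)
    for i in range(1000)
]

parts = partition_and_compress(rows, "dt")
raw_total = raw_csv_size(rows)                       # unpartitioned, uncompressed
compressed_total = sum(len(b) for b in parts.values())  # all partitions, compressed
one_partition = len(parts["2024-01-01"])             # a single day, compressed

print(raw_total, compressed_total, one_partition)
```

In this sketch, a query that needs only one day reads `one_partition` bytes instead of `raw_total` bytes. The actual savings in DLA depend on the data, the file format, and the query.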