The pay-as-you-go model is a key benefit of cloud services because it eliminates the need for resource pre-provisioning. This model requires robust metering and billing for every cloud service. This document describes a solution for implementing pay-as-you-go billing based on Log Service. Many cloud services use this solution to process hundreds of billions of metering logs daily.
Pay-as-you-go billing logs
Use cases
-
Power companies: A log is generated every 10 seconds, recording each user's power consumption, peak usage, and average usage for that period. Power companies use this data to provide customers with daily, hourly, and monthly bills.
-
Telecommunication carriers: A base station sends a log every 10 seconds that contains a user's activity, such as web browsing, calls, text messages, or VoIP, along with data usage and duration. The backend billing service uses this information to calculate charges for that period.
-
Weather forecast API services: Users are charged based on the type of API call, city, query type, and the size of the result.
Requirements and challenges
Accuracy and precision are critical for billing systems. The system must meet the following requirements:
-
Accuracy and reliability: The system must not overcharge or undercharge customers.
-
Flexibility: The system must support scenarios like data backfilling. This allows the system to perform recalculations if data was pushed incorrectly.
-
Real-time processing: The system must support near-real-time billing to quickly suspend services for overdue accounts.
Additional requirements:
-
Bill correction: If real-time billing fails, you can perform reconciliation to correct the bill.
-
Detailed queries: You must be able to view detailed consumption data.
The system also faces two primary challenges in practice:
-
Data volume growth: As the number of users and API calls increases, the data volume grows. The architecture must support elastic scaling to accommodate this growth.
-
Fault tolerance: The billing application might have bugs. The metering data must be independent of the billing application to ensure data integrity.
This document presents a pay-as-you-go billing solution using Log Service. This solution has operated reliably in production for many years without any miscalculations or delays.
How it works
This solution uses the LogHub feature of Log Service. The process is as follows:
-
Ingest real-time metering logs by using LogHub. You can connect your metering application to LogHub, which supports over 50 data ingestion methods.
-
At fixed intervals, the metering application consumes incremental data from LogHub and calculates billing results in memory.
-
(Optional) To query detailed data, you can create indexes for the metering logs.
-
(Optional) Ship metering logs to long-term storage services like OSS for offline storage, T+1 reconciliation, and analysis.

The following diagram shows the internal structure of the real-time metering application:
-
Use the LogHub GetCursor API to obtain a cursor for a specific time range, for example, from 10:00 to 11:00.
-
Use the PullLogs API to consume data within that time range.
-
Perform statistical calculations on the data in memory to generate billing data.
You can adjust the time window for calculation to one minute, 10 seconds, or any other interval.
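The three steps above can be sketched as a small window loop. This is a minimal illustration, not the Log Service SDK: `InMemoryShard`, its `get_cursor` and `pull_logs` methods, and `bill_window` are hypothetical stand-ins that only mimic the cursor-then-pull pattern, and timestamps are plain seconds since midnight.

```python
from bisect import bisect_left
from collections import defaultdict


class InMemoryShard:
    """Hypothetical stand-in for one LogHub shard (not the real SDK)."""

    def __init__(self, logs):
        # logs: iterable of (arrival_time_in_seconds, record_dict)
        self._logs = sorted(logs, key=lambda item: item[0])
        self._times = [ts for ts, _ in self._logs]

    def get_cursor(self, ts):
        # Position of the first log that arrived at or after ts.
        return bisect_left(self._times, ts)

    def pull_logs(self, cursor, end_cursor, count=1000):
        # Return up to `count` records plus the cursor to continue from.
        stop = min(cursor + count, end_cursor)
        return [rec for _, rec in self._logs[cursor:stop]], stop


def bill_window(shard, start_ts, end_ts, batch_size=1000):
    """Sum InFlow per user for logs that arrived in [start_ts, end_ts)."""
    cursor = shard.get_cursor(start_ts)
    end_cursor = shard.get_cursor(end_ts)
    usage = defaultdict(int)
    while cursor < end_cursor:
        batch, cursor = shard.pull_logs(cursor, end_cursor, batch_size)
        for rec in batch:
            usage[rec["UserId"]] += rec["InFlow"]
    return dict(usage)


shard = InMemoryShard([
    (36000, {"UserId": "44", "InFlow": 1409}),  # 10:00:00
    (37800, {"UserId": "44", "InFlow": 591}),   # 10:30:00
    (38000, {"UserId": "45", "InFlow": 100}),   # 10:33:20
    (39600, {"UserId": "44", "InFlow": 999}),   # 11:00:00, next window
])
hourly = bill_window(shard, 36000, 39600, batch_size=2)
print(hourly)
```

Shrinking the window only changes the two timestamps passed to `bill_window`; the loop itself stays the same.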

Performance analysis:
-
Assume you have 1 billion metering logs per day, and each log is 200 bytes. This results in 200 GB of data.
-
LogHub SDKs and Logtail provide a default compression feature. With a typical 5:1 compression ratio, the actual storage size is 40 GB, which is about 1.6 GB per hour.
-
The LogHub PullLogs API can read up to 1,000 log groups in a single call, with each log group up to 5 MB in size. A gigabit network carries at most about 125 MB per second, so reading one hour of compressed data (about 1.6 GB) takes roughly 13 seconds at line rate.
-
Including the time for in-memory data aggregation and calculation, processing one hour of metering logs takes on the order of tens of seconds, a small fraction of the one-hour window.
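The volume figures in this analysis can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers stated above (1 billion logs per day, 200 bytes per log, a 5:1 compression ratio, and 1,000 log groups of up to 5 MB per call) and assumes decimal units (1 GB = 10^9 bytes). The variable names are illustrative.

```python
LOGS_PER_DAY = 1_000_000_000
BYTES_PER_LOG = 200
COMPRESSION_RATIO = 5              # 5:1, as assumed in the text
MAX_GROUPS_PER_PULL = 1_000
MAX_BYTES_PER_GROUP = 5 * 10**6    # 5 MB, decimal units

raw_bytes_per_day = LOGS_PER_DAY * BYTES_PER_LOG
compressed_bytes_per_day = raw_bytes_per_day // COMPRESSION_RATIO
compressed_bytes_per_hour = compressed_bytes_per_day / 24
max_bytes_per_pull = MAX_GROUPS_PER_PULL * MAX_BYTES_PER_GROUP

print(f"raw per day:         {raw_bytes_per_day / 1e9:.0f} GB")
print(f"compressed per day:  {compressed_bytes_per_day / 1e9:.0f} GB")
print(f"compressed per hour: {compressed_bytes_per_hour / 1e9:.2f} GB")
print(f"upper bound per pull: {max_bytes_per_pull / 1e9:.0f} GB")
```

The hourly figure comes out at about 1.67 GB, in line with the roughly 1.6 GB cited above, and it is below the 5 GB upper bound implied by 1,000 log groups of 5 MB each.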
Generate billing data from metering logs
Metering logs record the billable items for your services. A backend billing module processes these logs based on billing rules to generate the final invoice. For example, the following raw access log records the usage of a project:
microtime:1457517269818107 Method:PostLogStoreLogs Status:200 Source:203.0.113.10 ClientIP:198.51.100.10 Latency:1968 InFlow:1409 NetFlow:474 OutFlow:0 UserId:44 AliUid:1264425845****** ProjectName:app-myapplication ProjectId:573 LogStore:perf UserAgent:ali-sls-logtail APIVersion:0.5.0 RequestId:56DFF2D58B3D939D691323C7
The metering and billing application reads the raw logs and, based on predefined rules, aggregates usage data across multiple dimensions, such as inbound traffic (InFlow), number of requests, and outbound traffic (OutFlow).
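As a rough illustration of this aggregation step, the sketch below parses log lines in the `key:value` layout shown above and totals them per project. The sample line is a shortened copy of the example log. `parse_log`, `aggregate`, and the output field names are hypothetical, and real billing rules would add pricing on top of these totals.

```python
from collections import defaultdict

# Shortened copy of the raw access log shown above.
RAW_LOG = (
    "microtime:1457517269818107 Method:PostLogStoreLogs Status:200 "
    "Source:203.0.113.10 ClientIP:198.51.100.10 Latency:1968 InFlow:1409 "
    "NetFlow:474 OutFlow:0 UserId:44 ProjectName:app-myapplication "
    "ProjectId:573 LogStore:perf"
)


def parse_log(line):
    """Split whitespace-separated 'key:value' tokens into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def aggregate(lines):
    """Total inbound flow, outbound flow, and request count per project."""
    totals = defaultdict(lambda: {"in_flow": 0, "out_flow": 0, "requests": 0})
    for line in lines:
        rec = parse_log(line)
        entry = totals[rec["ProjectName"]]
        entry["in_flow"] += int(rec["InFlow"])
        entry["out_flow"] += int(rec["OutFlow"])
        entry["requests"] += 1
    return {project: dict(values) for project, values in totals.items()}


usage = aggregate([RAW_LOG, RAW_LOG])
print(usage)
```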
How to handle large data volumes
In some scenarios, such as for telecom carriers or IoT applications, the volume of metering logs can be massive. For example, 10 trillion logs of 200 bytes each amount to 2 PB of data per day. After 5:1 compression, this is still about 16 TB per hour. A 10-gigabit network carries at most about 1.25 GB per second, so reading one hour of data takes roughly 13,000 seconds (more than three hours), which is far too slow for rapid billing. Two approaches address this:
-
Control the volume of generated metering logs
You can modify the application that generates metering logs, such as Nginx, to pre-aggregate data in memory and emit one summary log per user per interval. With this approach, the data volume depends on the number of active users rather than the number of raw events. For example, if Nginx serves 1,000 users and emits one 200-byte log per user every minute, the hourly data volume is 1,000 users × 200 bytes/log × 60 minutes = 12 MB, or about 2.4 MB after compression.
-
Parallelize the processing of metering logs
Each Logstore in LogHub can be divided into multiple shards. For example, you can create three shards and run three instances of the metering consumer application, one per shard. To ensure the same consumer instance processes all data for a single user, route each log to a shard based on a hash of the user ID. You can also partition on a coarser key: for example, write data for users in region A to shard 1 and data for users in region B to shard 2. This design allows the backend metering application to scale horizontally.
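Both techniques can be sketched in a few lines. In this illustration, `pre_aggregate` collapses raw events into one record per user per minute, and `shard_for` picks a shard index from a hash of the user ID. Both functions and the event layout are hypothetical, and the simple modulo scheme is an assumption made for clarity. How the service itself maps hash keys to shards is not covered here.

```python
import hashlib
from collections import defaultdict


def pre_aggregate(events):
    """Collapse raw (user_id, timestamp, nbytes) events into one record
    per (user, minute), so volume tracks active users, not raw events."""
    totals = defaultdict(int)
    for user_id, ts, nbytes in events:
        minute_start = ts // 60 * 60
        totals[(user_id, minute_start)] += nbytes
    return [
        {"UserId": user, "minute": minute, "bytes": total}
        for (user, minute), total in sorted(totals.items())
    ]


def shard_for(user_id, shard_count):
    """Stable shard index derived from a SHA-256 hash of the user ID,
    so every record of one user reaches the same consumer instance."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


events = [("u1", 0, 100), ("u1", 30, 50), ("u1", 61, 10), ("u2", 5, 7)]
records = pre_aggregate(events)
for rec in records:
    print(shard_for(rec["UserId"], 3), rec)
```

Four raw events become three summary records here, and every record for `u1` is assigned the same shard index.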

Additional topics
-
How to backfill data?
You can configure a data retention period from 1 to 365 days for each Logstore in LogHub. If your billing application needs to re-consume data for correction, you can reprocess the data for any time range within the retention period.
-
How to handle metering logs scattered across multiple servers?
-
Use Logtail to collect logs in real time.
-
Use user-defined identifiers to create a dynamic machine group for elastic scaling.
-
How to perform detailed queries?
You can create indexes on your LogHub data to enable real-time queries and analysis.
Inflow>300000 and Method=Post* and Status in [200 300]
You can also add statistical analysis to your queries:
Inflow>300000 and Method=Post* and Status in [200 300] | select max(Inflow) as s, ProjectName group by ProjectName order by s desc
-
How to store logs for T+1 reconciliation?
Log Service allows you to ship data from LogHub. You can configure custom partitions and storage formats. Store the logs in OSS and then use services like E-MapReduce, Hadoop, Hive, Presto, or Spark for computation.
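For backfilling and T+1 reconciliation, the core operation is comparing the totals produced in real time with totals recomputed later from retained or shipped logs. The sketch below is one possible shape for that comparison. `reconcile` and the sample totals are hypothetical, and treating the offline result as authoritative is an assumption.

```python
def reconcile(realtime, offline):
    """Return {user: offline - realtime} for every user whose totals differ.

    A positive value means the real-time bill undercounted usage,
    a negative value means it overcounted.
    """
    corrections = {}
    for user in sorted(set(realtime) | set(offline)):
        delta = offline.get(user, 0) - realtime.get(user, 0)
        if delta != 0:
            corrections[user] = delta
    return corrections


realtime_bill = {"44": 2000, "45": 100}
offline_bill = {"44": 2000, "45": 130, "46": 20}
corrections = reconcile(realtime_bill, offline_bill)
print(corrections)
```

Users whose totals match are omitted, so an empty result means that the real-time bill needs no correction.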