This topic describes how to collect data over the public network from multiple sources, such as mobile clients, web pages, PCs, servers, hardware devices, and cameras. In a traditional architecture, you integrate front-end servers with Kafka to collect this data. The Log Service LogHub solution can replace that architecture and is more reliable, cost-effective, elastic, and secure.

Scenarios

In scenarios where you collect data over the public network, you may need to access multiple sources, such as mobile clients, external servers, and web pages. Then, you can use the data for various purposes such as real-time computing and data warehousing.

Solution 1: Integrate front-end servers with Kafka

Kafka does not provide a RESTful interface and is used in clusters in most cases. Therefore, you must set up an NGINX server as a public network proxy before you can use Logstash or call API operations to write data to Kafka. The following table describes the required services.
| Service | Quantity | Configuration | Description | Price |
| --- | --- | --- | --- | --- |
| ECS instance | 2 | Single-core CPU, 2 GB memory | Both ECS instances are used as front-end servers. | CNY 108 per ECS instance per month |
| Server Load Balancer (SLB) instance | 1 | Standard | The SLB instance is used as a pay-as-you-go instance. | CNY 14.4 per month (lease fee) and CNY 0.8 per GB (data traffic fee) |
| Server on which Kafka and ZooKeeper are installed | 3 | Single-core CPU, 2 GB memory | The servers write and process data. | CNY 108 per device per month |

Solution 2: Use LogHub

You can use mobile SDKs, Logtail, or Web Tracking to write data to the LogHub endpoint. The following table describes the required services.
| Service | Description | Price |
| --- | --- | --- |
| LogHub | LogHub is used to collect data in real time. | Less than CNY 0.18 per GB. For more information, see Billing method. |
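
Of the three write paths, Web Tracking is the easiest to illustrate because it is a plain HTTP GET request. The following Python sketch only builds such a request URL and does not send anything. The URL layout and the APIVersion value are assumptions based on the Web Tracking documentation and should be verified against the current API reference. The project name, Logstore name, endpoint, and field names are hypothetical placeholders.

```python
from urllib.parse import urlencode


def build_track_url(project, endpoint, logstore, fields, api_version="0.6.0"):
    """Build a Web Tracking GET URL. No request is sent.

    Assumed layout (verify against the Web Tracking reference):
    https://<project>.<endpoint>/logstores/<logstore>/track?APIVersion=...&key=value
    """
    # urlencode percent-encodes keys and values, so arbitrary field text is safe.
    query = urlencode({"APIVersion": api_version, **fields})
    return f"https://{project}.{endpoint}/logstores/{logstore}/track?{query}"


url = build_track_url(
    project="my-project",                     # hypothetical project name
    endpoint="cn-hangzhou.log.aliyuncs.com",  # example regional endpoint
    logstore="my-logstore",                   # hypothetical Logstore name
    fields={"event": "page_view", "page": "/home"},
)
print(url)
```

A client such as a web page or a device can issue a GET request to the resulting URL. Because no front-end server or proxy is involved, this is the part of Solution 1 that LogHub replaces.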

Solution comparison

Scenario 1: You collect 10 GB of data from 1 million write requests each day. In this scenario, 10 GB is the size of the compressed data. The size of the raw data is 50 GB to 100 GB.

Solution 1:
--------------
SLB lease fee: 0.02 × 24 × 30 = CNY 14.4
SLB traffic fee: 0 (Upstream traffic does not incur fees. No downstream traffic is generated.)
ECS instance fee: 108 × 2 = CNY 216 (Free-of-charge disks are used.)
Kafka servers: No fee is counted, assuming that the ECS instances on which Kafka is installed are shared with other services.
Total: 14.4 + 0 + 216 = CNY 230.4 per month

Solution 2:
--------------
LogHub traffic fee: 10 × 0.18 × 30 = CNY 54
LogHub request fee: 0.12 × 1 × 30 = CNY 3.6 (1 million requests per day)
Total: CNY 57.6 per month

Scenario 2: You collect 1 TB of data from 100 million write requests each day.

Solution 1:
--------------
SLB lease fee: 0.02 × 24 × 30 = CNY 14.4
SLB traffic fee: 0 (Upstream traffic does not incur fees. No downstream traffic is generated.)
SLB specification fee (specification: slb.s2.medium): 0.63 × 24 × 30 = CNY 453.6
ECS instance fee (specification: ecs.g6.large): 240 × 2 = CNY 480
ECS disk fee: CNY 4,800 for a single SSD whose peak throughput is twice the average throughput. Data is written to each replica at a speed of 50 MB/s. The SSD stores 6 TB of data for three days.
Kafka servers: No fee is counted, assuming that the ECS instances on which Kafka is installed are shared with other services.
Total: 14.4 + 0 + 453.6 + 480 + 4,800 = CNY 5,748 per month

Solution 2:
--------------
LogHub traffic fee: 1,000 × 0.15 × 30 = CNY 4,500 (tiered pricing)
LogHub request fee: 0.12 × 100 × 30 = CNY 360 (100 million requests per day)
Total: CNY 4,860 per month
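
The Solution 2 figures in both scenarios follow the same formula: a traffic fee plus a request fee, each multiplied by 30 days. The following Python sketch reproduces that arithmetic. The request price of CNY 0.12 per million requests is inferred from the calculations above, and the per-GB price is passed in as a parameter because the tier boundaries are not stated in this topic. The function name and parameter names are illustrative only.

```python
def loghub_monthly_cost(gb_per_day, million_requests_per_day, price_per_gb,
                        price_per_million_requests=0.12, days=30):
    """Estimate the monthly LogHub cost in CNY.

    price_per_gb depends on the pricing tier (0.18 and 0.15 in the scenarios).
    price_per_million_requests is inferred from the arithmetic in this topic.
    """
    traffic_fee = gb_per_day * price_per_gb * days
    request_fee = million_requests_per_day * price_per_million_requests * days
    # Round to two decimals to hide floating-point noise in currency values.
    return round(traffic_fee + request_fee, 2)


# Scenario 1: 10 GB and 1 million write requests per day at CNY 0.18 per GB.
scenario_1 = loghub_monthly_cost(10, 1, 0.18)

# Scenario 2: 1 TB (1,000 GB) and 100 million write requests per day
# at the tiered price of CNY 0.15 per GB.
scenario_2 = loghub_monthly_cost(1000, 100, 0.15)

print(scenario_1, scenario_2)
```

The two results match the Solution 2 totals of CNY 57.6 and CNY 4,860 per month. To estimate other workloads, look up the per-GB price for the relevant tier in the billing documentation and pass it as price_per_gb.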

Conclusion

In both of the preceding scenarios, Solution 2 costs less than Solution 1, and it requires no servers to deploy or maintain. In addition, Solution 2 has the following advantages:

  • Auto scaling: LogHub can collect megabytes to petabytes of data per day.
  • Reliable permission control: You can use an access control list (ACL) to control the read and write permissions.
  • HTTPS compatibility: Data can be sent over HTTPS so that it is encrypted in transit.
  • Free log shipping: No additional fees are incurred when you ship data to data warehouses.
  • Complete monitoring data: You can track the status of your business in detail.
  • SDK-based connections with upstream and downstream systems: LogHub supports multiple SDKs, which you can use to connect LogHub with upstream and downstream systems. LogHub is also deeply integrated with open-source products and other Alibaba Cloud products.

For more information, visit the product landing page of Log Service.