Simple Log Service supports various collection methods and data sources. This topic describes the collection methods that are used in various scenarios.
Data collection methods
Before you can perform operations on logs or data in Simple Log Service, you must transfer the logs or data to Simple Log Service and store them there. Select an appropriate data collection method based on your scenario. After the data is collected, you can perform operations on it by using other features of Simple Log Service.
Data collection methods include Logtail-based data collection, LoongCollector-based data collection, SDK or API-based data collection, Alibaba Cloud service-based log collection, data import, and other collection methods.
Logtail-based data collection: Logtail is a log collection agent that is provided by Simple Log Service. You can use Logtail to collect logs from multiple data sources, such as Alibaba Cloud Elastic Compute Service (ECS) instances, servers in data centers, and servers from third-party cloud service providers. Logtail supports non-intrusive log collection based on log files. You do not need to modify your application code, and log collection does not affect the operation of your applications.
LoongCollector-based data collection: LoongCollector is a new-generation log collection agent that is provided by Simple Log Service. LoongCollector is an upgraded version of Logtail. LoongCollector is expected to integrate the capabilities of specific collection agents of Application Real-Time Monitoring Service (ARMS), such as Managed Service for Prometheus-based data collection and Extended Berkeley Packet Filter (eBPF) technology-based non-intrusive data collection.
SDK or API-based data collection: Simple Log Service allows you to call SDKs or API operations to collect data. You can create custom settings in the related code based on your business requirements. Compared with other data collection methods, this method offers a high degree of flexibility.
Alibaba Cloud service-based log collection: Simple Log Service can collect logs from multiple types of Alibaba Cloud services, such as elastic computing, storage, security, and database services. The logs record operational statistics, such as the user operations, running status, and business dynamics of Alibaba Cloud services. If you want to collect logs from Alibaba Cloud services other than Simple Log Service, you can select data collection methods based on the Alibaba Cloud services.
Data import: Simple Log Service allows you to import existing data, including data from other applications and historical files.
Other collection methods: Simple Log Service allows you to use third-party collection tools to collect logs to Simple Log Service by using a specific protocol.
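As an illustration of SDK-based data collection, the following minimal sketch writes one log entry by using Simple Log Service SDK for Python (the aliyun-log-python-sdk package). It assumes that the SDK is installed and that the project and Logstore already exist. The helper names to_contents and send_record are hypothetical and chosen only for this example, and the endpoint, AccessKey pair, project, and Logstore values are placeholders that you must supply.

```python
import time


def to_contents(record: dict) -> list:
    """Convert a dict into the (key, value) string pairs stored in a log entry."""
    return [(str(key), str(value)) for key, value in record.items()]


def send_record(endpoint, access_key_id, access_key_secret,
                project, logstore, record):
    """Write one log entry to a Logstore (sketch; assumes the SDK is installed)."""
    # Imported lazily so that to_contents() can be used without the SDK.
    from aliyun.log import LogClient, LogItem, PutLogsRequest

    client = LogClient(endpoint, access_key_id, access_key_secret)

    item = LogItem()
    item.set_time(int(time.time()))          # log time in seconds
    item.set_contents(to_contents(record))   # list of (key, value) pairs

    # The topic and source are left empty in this sketch.
    request = PutLogsRequest(project, logstore, "", "", [item])
    return client.put_logs(request)
```

For example, send_record("cn-hangzhou.log.aliyuncs.com", key_id, key_secret, "my-project", "my-logstore", {"status": 200}) would write a single entry with one field. The project and Logstore names here are placeholders.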
The following figure shows the data collection overview of Simple Log Service.
Data import
If you want to import existing data to Simple Log Service for analysis, you can use the data import feature in the following scenarios:
You can import log data from Object Storage Service (OSS) buckets to Simple Log Service. For more information, see Import data from OSS to Simple Log Service.
You can import data from a self-managed MySQL database or a database on an ApsaraDB RDS for MySQL instance to Simple Log Service. For more information, see Import data from a MySQL database to Simple Log Service.
You can import data from Elasticsearch to Simple Log Service. For more information, see Import data from Elasticsearch to Simple Log Service.
You can import log data from Amazon S3 objects to Simple Log Service. For more information, see Import Amazon S3 objects to Simple Log Service.
You can import Kafka data to Simple Log Service. For more information, see Import data from Kafka to Simple Log Service.
Logtail collects only incremental logs. If you also want to collect historical logs, see Import historical logs from log files.
Other collection methods
If the preceding data collection methods are not suitable for your scenario, you can collect logs to Simple Log Service in real time by using the web tracking feature, the Kafka protocol, or the syslog protocol.
Syslog protocol
You can use Syslog-ng to collect logs to Simple Log Service by using the syslog protocol. For more information, see Use the syslog protocol to upload logs.
Kafka protocol
You can use collection tools such as Beats, collectd, Fluentd, Logstash, Telegraf, and Vector to upload logs to Simple Log Service over the Kafka protocol. For more information, see Kafka protocol.
Web page and JavaScript
You can use the web tracking feature to collect and analyze user information on browsers and apps. For more information, see Use the web tracking feature to collect logs. You can also use the web tracking feature to collect Unity3D logs. For more information, see Collect Unity3D logs.
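For the Kafka protocol option described above, the following sketch shows how the client settings for a Kafka-compatible tool might be assembled. Every value in it is an assumption that is not stated in this topic and must be checked against the Kafka protocol topic: the port (10012 is assumed here for Internet access), SASL_SSL with the PLAIN mechanism, the project name as the username, and the AccessKey ID and AccessKey secret joined by a number sign (#) as the password. The function name kafka_client_settings is hypothetical, and the keys follow the librdkafka naming style.

```python
def kafka_client_settings(project, endpoint, access_key_id,
                          access_key_secret, port=10012):
    """Build Kafka client settings for uploading logs (sketch; values are assumptions)."""
    return {
        # Assumed address format: <project>.<endpoint>:<port>
        "bootstrap.servers": f"{project}.{endpoint}:{port}",
        # Assumed authentication scheme: SASL over TLS with PLAIN credentials
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # Assumed credentials: project name and "<AccessKey ID>#<AccessKey secret>"
        "sasl.username": project,
        "sasl.password": f"{access_key_id}#{access_key_secret}",
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    settings = kafka_client_settings(
        "my-project", "cn-hangzhou.log.aliyuncs.com", "my-key-id", "my-key-secret"
    )
    print(settings["bootstrap.servers"])
```

Under these assumptions, the Logstore name would be used as the Kafka topic when the tool produces messages.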
Logtail configuration generator
When you use Logtail or call API operations to collect logs, you can use the Logtail configuration generator provided by Simple Log Service. For more information, see Logtail configuration generator. The Logtail configuration generator automatically generates the AliyunPipelineConfig custom resource definition (CRD) and a parameter script for the CreateLogtailPipelineConfig operation. For more information, see [Recommended] Use AliyunPipelineConfig to manage a Logtail configuration and CreateLogtailPipelineConfig. This way, you can use the AliyunPipelineConfig CRD or call the CreateLogtailPipelineConfig operation to create a Logtail configuration.
Best practice scenarios
The following scenarios link to data collection tutorials that are based on actual use cases.
A company deploys its website application on an ECS instance that resides in Region A and deploys its Simple Log Service project in Region B. The company wants to collect logs from the ECS instance to its Simple Log Service project by using a Logtail configuration. This scenario involves cross-region log collection. For more information, see Use Logtail to collect logs across regions.
A growing number of Internet of Things (IoT) devices, such as smart routers, TV sticks, Tmall Genie, and cleaning robots, have improved our lives. However, the embedded development model of the traditional software industry faces multiple challenges when it is applied to IoT devices. For example, IoT devices are numerous, widely distributed, difficult to debug, and limited in hardware capabilities. For more information about how to process IoT device logs, see Collect IoT or embedded development logs.
You may want to collect user behavior data, for example, to check how many users view a promotional web page, to check whether a recipient reads a leaflet after the leaflet is sent, or to analyze the page views (PVs) of a promotional event page in a mobile app. To collect such data efficiently and meet personalized collection and statistical requirements, you can use the web tracking feature. For more information, see Use web tracking to collect logs.
For more information, see Best practices.
FAQ
Which network type do I need to select if I use Express Connect together with Simple Log Service?
We recommend that you select the internal network type.
How do I select a network type and an endpoint?
For more information about the scenarios in which you can select a specific network type, see Select a network type. For more information about endpoints, see Endpoints. For more information about how to enable the transfer acceleration feature, see Use the transfer acceleration feature.
Can I obtain public IP addresses when I collect data from the Internet?
Yes, you can turn on Log Public IP for your Logstore to obtain public IP addresses. For more information, see Create a Logstore.
Which network type do I need to select if I want to collect ECS logs from Region A to a Simple Log Service project in Region B?
We recommend that you select the Internet type and specify the region name of Region B in the command used to install Logtail on your ECS instance in Region A. This allows Logtail to access the project in Region B over the Internet. For more information about how to select a network type in other scenarios, see Select a network type.
How do I check whether an endpoint is accessible?
You can run the following command. If output is returned, the endpoint is accessible.
curl $myproject.cn-hangzhou.log.aliyuncs.com
In the command, $myproject specifies the project name, and cn-hangzhou.log.aliyuncs.com specifies the endpoint.
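If curl is not available, a similar check can be sketched in Python by using only the standard library. The function names endpoint_url and is_reachable are hypothetical, and my-project is a placeholder project name. Like the curl check, the sketch treats any HTTP response, including an error status, as proof that the endpoint is reachable.

```python
import urllib.error
import urllib.request


def endpoint_url(project: str, endpoint: str, scheme: str = "http") -> str:
    """Join the project name and the endpoint into the URL that curl would request."""
    return f"{scheme}://{project}.{endpoint}"


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the server returns any HTTP response within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means that the endpoint answered.
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout: not reachable.
        return False


if __name__ == "__main__":
    url = endpoint_url("my-project", "cn-hangzhou.log.aliyuncs.com")
    print(url, "reachable" if is_reachable(url) else "not reachable")
```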
For more information, see FAQs about data collection.