By Bruce Wu
The Alibaba Cloud LOG Java Producer is a high-performance library for writing data to LogHub, designed for Java applications running in big data and high-concurrency scenarios. Compared with calling the APIs or SDKs directly, using the Alibaba Cloud LOG Java Producer (Producer) has many advantages, such as high performance, separation of computing and I/O logic, and controllable resource usage. To understand the features and mechanisms of the Producer, see the article Alibaba Cloud LOG Java Producer - A powerful tool to migrate logs to the cloud. This article focuses on how to use the Producer.
You can perform three steps to use the Producer as shown in the following figure.
The following objects are involved when you create the Producer.
The ProjectConfig object contains the service endpoint information of the target project, and the access credential that indicates the identity of the caller.
The final access address is composed of the project name and the service endpoint. For more information about how to determine the endpoint of a project, see Service endpoint.
You can configure the AccessKey or Security Token Service (STS) token for the Producer. If you use the STS token, you must regularly create new ProjectConfig objects, and put them into ProjectConfigs.
If you need to write data to different projects, you can create multiple ProjectConfig objects and put them into ProjectConfigs. ProjectConfigs maintains the different project configurations in a map. The key of the map is the project name, and the value is the client of the project.
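The map-based bookkeeping described above can be pictured with a small JDK-only sketch. The class and field names below (ProjectRegistrySketch, Config) are hypothetical stand-ins rather than the Producer's real classes, and for simplicity the map stores the configuration itself where the real ProjectConfigs stores a client. The sketch only shows that one entry exists per project name and that putting a new object for the same project (for example, after an STS token is renewed) replaces the old one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: hypothetical stand-ins, not the Producer's real classes.
class ProjectRegistrySketch {
    /** Simplified stand-in for one project's configuration. */
    static final class Config {
        final String project;
        final String endpoint;
        final String accessKeyId;
        final String accessKeySecret;
        final String stsToken; // null when a plain AccessKey is used

        Config(String project, String endpoint, String accessKeyId,
               String accessKeySecret, String stsToken) {
            this.project = project;
            this.endpoint = endpoint;
            this.accessKeyId = accessKeyId;
            this.accessKeySecret = accessKeySecret;
            this.stsToken = stsToken;
        }
    }

    // Key: project name. Value: the configuration for that project.
    private final Map<String, Config> configs = new ConcurrentHashMap<>();

    /** Adds a configuration, replacing any earlier one for the same project. */
    void put(Config config) {
        configs.put(config.project, config);
    }

    Config get(String project) {
        return configs.get(project);
    }

    int size() {
        return configs.size();
    }
}
```

Putting a second Config for an existing project name leaves the number of entries unchanged and makes later lookups return the newer credentials.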
ProducerConfig is used to configure the sending policy. You can specify different values according to different business scenarios. The descriptions of these parameters are provided in the following table.
Please refer to TimeoutException and Attempts for more details.
LogProducer is the implementation class of Producer and it only accepts the producerConfig parameter. After you have prepared the producerConfig, you can create a Producer instance as follows:
Producer producer = new LogProducer(producerConfig);
A series of threads is created when you create a Producer instance, which consumes a considerable amount of resources. We recommend that you use only one Producer instance per application. Threads of a Producer instance are listed as follows, where N is the ordinal number of the Producer instance in the current process and N starts from 0.
In addition, all methods provided by LogProducer are thread-safe. You can run these methods safely in a multithreading environment.
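One common way to keep a single shared instance of an expensive, thread-safe object in a Java application is the initialization-on-demand holder idiom. The sketch below is a generic illustration under that assumption: ExpensiveClient is a hypothetical stand-in for any costly object, and nothing here is part of the Producer library.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: ExpensiveClient is a hypothetical stand-in, not the Producer.
class SharedInstanceSketch {
    /** Stand-in for an object that starts threads and is costly to create. */
    static final class ExpensiveClient {
        static final AtomicInteger CREATED = new AtomicInteger();

        ExpensiveClient() {
            CREATED.incrementAndGet(); // counts how many instances were built
        }
    }

    // The JVM initializes Holder lazily and exactly once, even under concurrency.
    private static final class Holder {
        static final ExpensiveClient INSTANCE = new ExpensiveClient();
    }

    /** Returns the single shared instance. */
    static ExpensiveClient shared() {
        return Holder.INSTANCE;
    }
}
```

Every caller of shared() receives the same object, so the cost of creating it is paid once per process.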
After creating a Producer instance, you can use methods it provides to send data.
The Producer provides multiple data sending methods. Parameters of these methods are described as follows.
To be merged into one larger batch, data must have the same project, logstore, topic, source, and shardHash properties. To ensure that merging works properly and to save memory, we recommend that you limit the range of values these five properties can take. If a field, for example topic, takes too many distinct values, we recommend that you add these values to the logItem instead of using topic directly.
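To see why the value range of these properties matters, consider a minimal grouping sketch. It assumes that records are grouped by a key made of the five properties; GroupKey and BatchKeySketch are hypothetical names used only for illustration and do not reflect how the Producer is actually implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Illustrative only: hypothetical names, not classes from the Producer library.
class BatchKeySketch {
    /** The five properties that must match for two records to share a batch. */
    static final class GroupKey {
        final String project, logStore, topic, source, shardHash;

        GroupKey(String project, String logStore, String topic, String source, String shardHash) {
            this.project = project;
            this.logStore = logStore;
            this.topic = topic;
            this.source = source;
            this.shardHash = shardHash;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof GroupKey)) return false;
            GroupKey k = (GroupKey) o;
            return Objects.equals(project, k.project)
                && Objects.equals(logStore, k.logStore)
                && Objects.equals(topic, k.topic)
                && Objects.equals(source, k.source)
                && Objects.equals(shardHash, k.shardHash);
        }

        @Override
        public int hashCode() {
            return Objects.hash(project, logStore, topic, source, shardHash);
        }
    }

    private final Map<GroupKey, List<String>> batches = new HashMap<>();

    /** Appends a record to the batch that matches its key, creating the batch if needed. */
    void append(GroupKey key, String record) {
        batches.computeIfAbsent(key, k -> new ArrayList<>()).add(record);
    }

    int batchCount() {
        return batches.size();
    }

    /** Appends n records and reports how many separate batches result. */
    static int simulate(int n, boolean requestIdAsTopic) {
        BatchKeySketch sketch = new BatchKeySketch();
        for (int i = 0; i < n; i++) {
            String topic = requestIdAsTopic ? "request-" + i : "orders";
            sketch.append(new GroupKey("project", "logStore", topic, null, null), "requestId=" + i);
        }
        return sketch.batchCount();
    }
}
```

With a fixed topic, simulate(1000, false) produces a single batch of 1000 records. With a per-request value used as the topic, simulate(1000, true) produces 1000 batches of one record each, so nothing is merged and every batch carries its own overhead.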
The Producer sends data asynchronously. You obtain the data-sending result either from the future that the Producer returns or from a callback that you register.
The send method returns a ListenableFuture. You can call its get() method, which blocks the calling thread until the data-sending result is available, or you can register a callback that is invoked once the future completes. The following snippet shows how to use ListenableFuture: it registers a FutureCallback on the future and submits the callback to EXECUTOR_SERVICE, a thread pool provided by the application, for execution. For complete example code, see SampleProducerWithFuture.java.
ListenableFuture<Result> f = producer.send("project", "logStore", logItem);
Futures.addCallback(f,
    new FutureCallback<Result>() {
        @Override
        public void onSuccess(@Nullable Result result) {
        }
        @Override
        public void onFailure(Throwable t) {
        }
    },
    EXECUTOR_SERVICE);
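The effect of passing an application-owned executor can be demonstrated without the Producer or Guava on the classpath. The sketch below swaps in the JDK's CompletableFuture for ListenableFuture, and fakeSend is a hypothetical stand-in for an asynchronous send. It is only meant to show that the callback body runs on the application's pool rather than on the thread that completed the send.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: CompletableFuture stands in for ListenableFuture,
// and fakeSend is a hypothetical substitute for producer.send(...).
class FutureCallbackSketch {
    /** Completes asynchronously on a thread the caller does not control. */
    static CompletableFuture<String> fakeSend(String logLine) {
        return CompletableFuture.supplyAsync(() -> "sent:" + logLine);
    }

    /** Sends one line and returns the name of the thread that ran the callback. */
    static String sendAndHandle(String logLine) {
        ExecutorService appPool =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "app-callback-pool"));
        AtomicReference<String> callbackThread = new AtomicReference<>();
        try {
            fakeSend(logLine)
                .whenCompleteAsync((result, error) -> {
                    // Runs on appPool, so slow work here does not hold up the sender.
                    callbackThread.set(Thread.currentThread().getName());
                    if (error != null) {
                        System.err.println("send failed: " + error);
                    } else {
                        System.out.println("send succeeded: " + result);
                    }
                }, appPool)
                .join(); // wait until the callback has finished
            return callbackThread.get();
        } finally {
            appPool.shutdown();
        }
    }
}
```

Calling sendAndHandle("hello") returns "app-callback-pool", the name given to the application's thread, which confirms where the callback ran.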
In addition to using a future, you can register a callback when you call the send method to obtain the data-sending result. The code snippet is as follows. For complete example code, see SampleProducerWithCallback.java.
producer.send(
    "project",
    "logStore",
    logItem,
    new Callback() {
        @Override
        public void onCompletion(Result result) {
        }
    });
Callbacks are executed by internal threads of the Producer. The space occupied by a batch can be released only after the corresponding callback has run. To avoid blocking the Producer and reducing overall throughput, do not run time-consuming operations in callbacks. Calling the send method inside a callback to resend a failed batch is not recommended either. Instead, increase the value of the retries parameter, or retry in a callback registered on the ListenableFuture object.
Which should you choose to obtain the data-sending result, a future or a callback? If the processing logic after you obtain the result is relatively simple and does not block the I/O thread of the Producer, use a callback directly. Otherwise, we recommend that you use ListenableFuture and run the subsequent processing logic in a separate thread or thread pool.
If you do not have more data to be sent, or if you are going to exit the current process, you need to shut down the Producer. By doing so, the cached data of the Producer can be fully processed. Currently, the Producer supports safe shutdown and limited shutdown.
In most cases, we recommend that you use safe shutdown. You can call the close() method to safely shut down the Producer. This method returns only when all cached data of the Producer is processed, all threads are terminated, the registered callbacks are executed, and all futures have been set.
Although close() waits until all data is processed, once shutdown begins, cached batches are sent immediately and are not retried if they fail. Therefore, as long as callbacks do not block, the close method usually returns quickly.
If your callback is likely to block when it is executed, and you want the close method to return within a short time, use the limited shutdown mode by calling the close(long timeoutMs) method. If the Producer is not completely shut down when the specified timeoutMs expires, the method throws an IllegalStateException. In this case, the Producer is shut down regardless of whether the cached data has been processed or whether the registered callbacks have been executed.
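The difference between the two shutdown modes can be sketched with a plain JDK executor. ShutdownSketch below is a hypothetical stand-in that mimics only the behavior described above: close() waits for queued work to finish, while close(long timeoutMs) gives up after the timeout and throws an IllegalStateException. It is not the Producer's implementation, and the real methods may declare different exceptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative only: a hypothetical stand-in that mimics the two shutdown modes.
class ShutdownSketch {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    void submit(Runnable task) {
        worker.submit(task);
    }

    /** Safe shutdown: returns only after every queued task has run. */
    void close() {
        worker.shutdown(); // stop accepting new work, keep draining the queue
        try {
            while (!worker.awaitTermination(1, TimeUnit.SECONDS)) {
                // keep waiting until all queued tasks are done
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Limited shutdown: waits at most timeoutMs, then abandons remaining work. */
    void close(long timeoutMs) {
        worker.shutdown();
        boolean finished = false;
        try {
            finished = worker.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (!finished) {
            worker.shutdownNow(); // interrupt running tasks and drop queued ones
            throw new IllegalStateException(
                "not completely shut down within " + timeoutMs + " ms");
        }
    }
}
```

A caller that cannot rule out slow callbacks might wrap close(timeoutMs) in a try block and log the IllegalStateException, accepting that some cached data may not have been delivered.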
To make learning Producer easier, we have prepared the Alibaba Cloud LOG Java Producer Sample Application for you. The sample covers the entire process from Producer creation to shutdown.