How Idle Fish Uses RxJava to Improve the Asynchronous Programming Capability - Part 1

By Kunming, from Idle Fish Technology

RxJava is a Java implementation of reactive programming. It is an event-based library that provides a powerful and elegant way to make asynchronous calls. Since 2018, the application architecture upgrade project initiated by the Taobao Technology Department has aimed to improve overall system performance and machine resource utilization through a reactive architecture and a fully asynchronous transformation. The project seeks to reduce network latency, increase resource reuse, and provide agile architecture support for rapid business innovation. Basic Idle Fish operations, such as batch product updates and batch order queries, take advantage of the asynchronous programming capability of RxJava.

RxJava is easy to get started with but difficult to master, and developers can easily fall into its traps. Today, let's look at the usage, basic principles, and precautions of RxJava (spread across two articles).

1. Before You Start

Let's look at the pain points of the callback code we wrote before using RxJava.

When our application needs to handle user events and perform asynchronous calls, the difficulty of implementing the code grows sharply as the complexity of the event streams and processing logic increases. For example, we sometimes need to combine multiple event streams, handle exceptions or timeouts in an event stream, and clean up after the event stream ends. If we implement all of this from scratch, we must carefully handle many difficult problems, such as callbacks, monitoring, and concurrency.

The term "callback hell" describes how unreadable such code becomes:

Code 1.1

// Example from callbackhell.com
fs.readdir(source, function (err, files) {
  if (err) {
    console.log('Error finding files: ' + err)
  } else {
    files.forEach(function (filename, fileIndex) {
      console.log(filename)
      gm(source + filename).size(function (err, values) {
        if (err) {
          console.log('Error identifying file size: ' + err)
        } else {
          console.log(filename + ' : ' + values)
          aspect = (values.width / values.height)
          widths.forEach(function (width, widthIndex) {
            height = Math.round(width / aspect)
            console.log('resizing ' + filename + 'to ' + height + 'x' + height)
            this.resize(width, height).write(dest + 'w' + width + '_' + filename, function(err) {
              if (err) console.log('Error writing file: ' + err)
            })
          }.bind(this))
        }
      })
    })
  }
})

The JavaScript code above has two defects:

Because each callback is nested inside the previous one, the code ends with a long run of "})".
The order in which the code is written does not match the order in which it runs: the body of a callback runs later than the statements written after the call that registers it.

RxJava can handle callbacks and exceptions with ease.

2. An Introduction to RxJava

Let's say you want to asynchronously obtain a list of users and then process the results, such as displaying them on the UI or writing them to the cache. With RxJava, the code looks like this:

Code 2.1

Observable<Object> observable = Observable.create(new ObservableOnSubscribe<Object>() {
    @Override
    public void subscribe(@NotNull ObservableEmitter<Object> emitter) throws Exception {
        System.out.println(Thread.currentThread().getName() + "----TestRx.subscribe");
        List<UserDo> result = userService.getAllUser();
        for (UserDo st : result) {
            emitter.onNext(st);
        }
        // Signal the end of the stream so that the "finish" callback below runs
        emitter.onComplete();
    }
});
Observable<String> map = observable.map(s -> s.toString());
// Create subscription relationship
map.subscribe(o -> System.out.println(Thread.currentThread().getName() + "----sub1 = " + o)/*Update to UI*/);
map.subscribe(o -> System.out.println(Thread.currentThread().getName() + "----sub2 = " + o)/*Write to the cache*/,
              e -> System.out.println("e = " + e),
              () -> System.out.println("finish"));

userService.getAllUser() is an ordinary synchronous method, but we wrap it in an Observable. When the result is returned, we send the users to the listeners one by one. The first listener updates the UI with each result, and the second listener writes it to the cache. If an exception occurs upstream, the second listener prints it. When the event stream ends, it prints "finish".

You can also easily configure an upstream timeout, the thread pool to run on, and a fallback result. Isn't it powerful?

Note: RxJava code looks easy to use and readable in the example above, but unexpected bugs are prone to occur if it is not fully understood. Beginners may think that in the code above, each element will be asynchronously sent to two downstream observers that print the results in their respective threads after a user list is returned. However, this is not the case. userService.getAllUser() is called twice. The getAllUser() method is called whenever a subscription relationship is established. After the user list is queried, it is synchronously sent to two observers that also print each element synchronously.

sub1 = user1, sub1 = user2, sub1 = user3, sub2 = user1, sub2 = user2, sub2 = user3
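The behavior described in the note can be reproduced without RxJava. The following is a minimal plain-Java sketch, not RxJava's implementation: ColdSource and ColdSourceDemo are hypothetical names, and ColdSource only imitates an Observable whose producer runs again, on the calling thread, for every subscriber.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a default (unscheduled) Observable; not an RxJava class.
class ColdSource<T> {
    private final Consumer<Consumer<T>> producer;

    ColdSource(Consumer<Consumer<T>> producer) {
        this.producer = producer;
    }

    // The producer runs again for each subscriber, synchronously on the calling thread.
    void subscribe(Consumer<T> observer) {
        producer.accept(observer);
    }
}

class ColdSourceDemo {
    // Returns everything that happened, in the order it happened.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        ColdSource<String> users = new ColdSource<>(emitter -> {
            log.add("getAllUser called"); // stands in for userService.getAllUser()
            for (String user : Arrays.asList("user1", "user2", "user3")) {
                emitter.accept(user);
            }
        });
        users.subscribe(u -> log.add("sub1 = " + u));
        users.subscribe(u -> log.add("sub2 = " + u));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this sketch the log contains "getAllUser called" twice, once before the sub1 entries and once before the sub2 entries, and all sub1 entries come before any sub2 entry.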

As you can see, without any other configuration, RxJava is synchronous and blocking by default! So, how do we use its asynchronous non-blocking capability?

Code 2.2

Observable
    .fromCallable(() -> {
         System.out.println(Thread.currentThread().getName() + "----observable fromCallable");
         Thread.sleep(1000); // Imitate expensive computation
         return "event";
     })
    .subscribeOn(Schedulers.io())
    .observeOn(Schedulers.single())
    .map(i->{
        System.out.println(Thread.currentThread().getName() + "----observable map");
        return i;
    })
    .observeOn(Schedulers.newThread())
    .subscribe(str -> System.out.println(Thread.currentThread().getName() + "----inputStr=" + str));

System.out.println(Thread.currentThread().getName() + "----end");

Thread.sleep(2000); // <--- Wait for the flow to finish. In RxJava the default Schedulers run on daemon threads

We use Observable.fromCallable() instead of the lower-level Observable.create method used in Code 2.1 to create an Observable (the observed). The fromCallable method creates a lazy Observable: the code passed in is executed only when someone subscribes to it. (We will discuss this later. Here, we just want to show that there are many ways to create an Observable.)

Then, subscribeOn(Schedulers.io()) specifies the thread pool on which the observed runs. observeOn(Schedulers.single()) specifies the thread pool on which the downstream observer runs. (The map method also acts as an observer of its upstream.) Like the map method in many stream programming APIs, map transforms each upstream element into another element. Finally, observeOn(Schedulers.newThread()) specifies the thread pool for the current downstream observer, that is, the observer passed as a lambda to the final subscribe call.

After the preceding code is executed, the printed thread names show that the observed, map, and the observer run on different threads, and "end" is printed first by the main thread. This means asynchronous non-blocking execution is achieved.

3. Usage

This article series is not an API reference for RxJava and will not introduce each API in detail. It discusses some common or special APIs to illustrate the capabilities of RxJava.

3.1 Basic Components

The core principle of RxJava is very simple. It is similar to the observer pattern. Observable is the observed and generates data as a data source. Observer consumes the upstream data source.

You can register multiple Observers for each Observable. However, by default, the subscribe method of the Observable is called each time an Observer is registered. If you only want to produce the data once, you can call the Observable.cache method.

Observable has multiple variants, such as Single and Flowable. Single represents a data source that produces only one element. Flowable is a data source that supports backpressure: the downstream listener can send feedback upstream, which makes it possible to control the transmission rate.

Observable and Observer are connected through layers of wrapping in the decorator pattern. When you call a conversion API such as map, a new ObservableMap (a subclass of Observable) is created, and the original Observable is wrapped as its source. When it is executed, the conversion is applied first, and the converted element is then sent to the downstream observer.

Scheduler is a support class provided by RxJava for multi-threaded execution. It packages the execution logic of a producer or consumer into a Worker and submits it to a common thread pool provided by the framework, such as Schedulers.io() and Schedulers.newThread(). To make this easier to understand, you can think of a Scheduler as a thread pool and a Worker as a thread in that pool. Observable.subscribeOn and Observable.observeOn specify the threads on which the observed and the observer run, which is how asynchronous non-blocking execution is achieved.
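The wrapping described above can be illustrated with a heavily simplified plain-Java model. This is only a sketch under stated assumptions, not RxJava's source code: MiniObservable, MiniObservableMap, and DecoratorDemo are hypothetical names, and disposal, errors, and completion are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical, simplified model of an Observable; not RxJava's code.
abstract class MiniObservable<T> {
    abstract void subscribe(Consumer<T> observer);

    // map does no work itself; it wraps this Observable in a new one.
    <R> MiniObservable<R> map(Function<T, R> mapper) {
        return new MiniObservableMap<>(this, mapper);
    }

    static <T> MiniObservable<T> just(List<T> items) {
        return new MiniObservable<T>() {
            @Override
            void subscribe(Consumer<T> observer) {
                items.forEach(observer);
            }
        };
    }
}

// Plays the role the article assigns to ObservableMap: it keeps the upstream
// as "source" and decorates the downstream observer with the conversion.
class MiniObservableMap<T, R> extends MiniObservable<R> {
    private final MiniObservable<T> source;
    private final Function<T, R> mapper;

    MiniObservableMap(MiniObservable<T> source, Function<T, R> mapper) {
        this.source = source;
        this.mapper = mapper;
    }

    @Override
    void subscribe(Consumer<R> observer) {
        // Convert each upstream element first, then pass it downstream.
        source.subscribe(item -> observer.accept(mapper.apply(item)));
    }
}

class DecoratorDemo {
    static List<String> run() {
        List<String> received = new ArrayList<>();
        MiniObservable.just(Arrays.asList(1, 2, 3))
                .map(i -> i * 10)
                .map(i -> "v" + i)
                .subscribe(received::add);
        return received;
    }
}
```

Each map call adds one more wrapper around the source, and nothing flows until subscribe is called on the outermost wrapper.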

The core architecture diagram of RxJava is listed below:

3.2 Conversion of API

map: Please see Code 2.2. This is a one-to-one conversion that, as in many stream programming APIs, transforms each upstream element into another element.
flatMap: This is a one-to-many conversion that converts each upstream element into zero or more elements. It is comparable to Java 8: the function passed to Stream.flatMap returns a Stream, while the function passed to Observable.flatMap returns an Observable. Note: This method is very powerful, and many APIs are built on it under the hood. Since the multiple Observables returned by flatMap are independent of each other, you can implement concurrency based on this feature.
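The one-to-many shape of flatMap can be seen with Java 8's Stream.flatMap, which the comparison above refers to. This is a small sketch with made-up data: FlatMapDemo, itemsOf, and the order and item names are hypothetical. With Observable.flatMap the function would return an Observable instead of a Stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapDemo {
    // Expands each order ID into its item IDs: one input, zero or more outputs.
    static List<String> itemsOf(List<String> orders) {
        return orders.stream()
                .flatMap(order -> {
                    if (order.equals("empty")) {
                        return Stream.<String>empty();            // zero elements
                    }
                    return Stream.of(order + "-a", order + "-b"); // two elements
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(itemsOf(Arrays.asList("o1", "empty", "o2")));
    }
}
```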

3.3 API Combination

merge: Merge two event streams into one event stream. The order of the elements after merging is the order in which they arrive from the two streams.

zip: Receive each element of multiple upstream streams one by one, combine them one by one, and send them to the downstream after conversion. Please see Code 3.1 for an example:

Code 3.1

//The first stream outputs an even number every 1 second.
Observable<Long> even = Observable.interval(1000, TimeUnit.MILLISECONDS).map(i -> i * 2L);
//The second stream outputs an odd number every 3 seconds.
Observable<Long> odd = Observable.interval(3000, TimeUnit.MILLISECONDS).map(i -> i * 2L + 1);
//zip can also be passed in multiple streams. Here, it is passed in only two streams.
Observable.zip(even, odd, (e, o) -> e + "," + o).forEach(x -> {
    System.out.println("observer = " + x);
});

/* The output is as follows. We can see that when one stream has an element, zip waits until every other stream has also produced an element. Then, the elements are combined, processed, and sent downstream.
observer = 0,1
observer = 2,3
observer = 4,5
observer = 6,7
...
*/

Code 3.1 does not seem to have any problem. The two streams are executed concurrently, and zip is used to wait for their results. However, it hides a very important issue: RxJava is synchronous and blocking by default! When we send multiple requests in the same way and use zip to wait for all the results, there is a strange phenomenon. In Code 3.2, the ob2 code always runs after the ob1 code has finished. The two requests are not executed concurrently as we expected. The printed thread names also show that the two Singles are executed sequentially on the same thread!

Code 3.2

//Single is a variant of Observable that returns only one element.
Single<String> ob1 = Single.fromCallable(() -> {
        System.out.println(Thread.currentThread().getName() + "----observable 1");
        TimeUnit.SECONDS.sleep(3);
        return userService.queryById(1).getName();
    });

Single<String> ob2 = Single.fromCallable(() -> {
        System.out.println(Thread.currentThread().getName() + "----observable 2");
        TimeUnit.SECONDS.sleep(1);
        return userService.queryById(1).getName();
    });

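// The original listing is cut off after the println call; the return value
// and blockingGet() below are an assumed completion.
String s = Single.zip(ob1, ob2,
                      (e, o) -> {
                          System.out.println(e + "++++" + o);
                          return e + o;
                      }).blockingGet();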

Why, then, are the two streams in Code 3.1 executed concurrently? The source code shows that zip subscribes to the first stream and then to the second, so the subscriptions themselves happen sequentially. However, streams created through Observable.interval are submitted to the thread pool provided by Schedulers.computation() by default, so they emit on other threads. The thread pool is explained later in this article.
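The difference between the two cases can be modeled in plain Java. The sketch below deliberately swaps in CompletableFuture and a CountDownLatch instead of RxJava, and all names in it are hypothetical. Each task reports whether the other task was running at the same time. In RxJava terms, the concurrent variant roughly corresponds to giving each source its own scheduler with subscribeOn before zipping.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class ZipConcurrencyDemo {
    // A task returns true only if both tasks were in flight at the same time.
    private static Supplier<Boolean> task(CountDownLatch bothStarted) {
        return () -> {
            bothStarted.countDown();
            try {
                return bothStarted.await(1, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        };
    }

    // Models zipping two sources that work on the subscribing thread:
    // the second one starts only after the first one has finished.
    static boolean sequentialOverlaps() {
        CountDownLatch latch = new CountDownLatch(2);
        boolean first = task(latch).get();  // times out: the second task has not started
        boolean second = task(latch).get();
        return first && second;
    }

    // Models giving each source its own worker thread before combining:
    // both start right away, and the results are combined when both are done.
    static boolean concurrentOverlaps() {
        CountDownLatch latch = new CountDownLatch(2);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<Boolean> f1 = CompletableFuture.supplyAsync(task(latch), pool);
            CompletableFuture<Boolean> f2 = CompletableFuture.supplyAsync(task(latch), pool);
            return f1.thenCombine(f2, (a, b) -> a && b).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("sequential overlaps: " + sequentialOverlaps());
        System.out.println("concurrent overlaps: " + concurrentOverlaps());
    }
}
```

The one-second wait is an arbitrary choice for the sketch: it is long enough for two pool threads to start, and it is the time the sequential variant spends finding out that nothing else is running.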

3.4 API Creation

create: The most primitive creation method, which defines the subscribe logic directly. Other creation methods are based on it.

Code 3.3

//The returned subclass is ObservableCreate.
Observable<String> observable = Observable.create(new ObservableOnSubscribe<String>() {
    @Override
    public void subscribe(ObservableEmitter<String> emitter) throws Exception {
        emitter.onNext("event");
        emitter.onNext("event2");
        emitter.onComplete();
    }
});
//Subscribe to the observable.
observable.subscribe(new Observer<String>() {
    @Override
    public void onSubscribe(Disposable d) {
        System.out.println(Thread.currentThread().getName() + " ,TestRx.onSubscribe");
    }
    @Override
    public void onNext(String s) {
        System.out.println(Thread.currentThread().getName() + " ,s = " + s);
    }
    @Override
    public void onError(Throwable e) {}
    @Override
    public void onComplete() {
        System.out.println(Thread.currentThread().getName() + " ,TestRx.onComplete");
    }
});

just: Observable.just("e1","e2"). Simply create an Observable that sends the specified n elements.
interval: Code 3.1 gives an example. It creates an Observable that generates elements continuously at fixed intervals. By default, it runs in the thread pool provided by Schedulers.computation().
defer: Creates an Observable lazily. This can be a bit confusing. An Observable created by Observable.create is already lazy in the sense that it starts generating data only when someone subscribes, but the code that constructs the Observable is executed immediately. With Observable.defer, even the construction of the Observable is postponed until someone subscribes to it.

Code 3.4

public String myFun() {
    String now = new Date().toString();
    System.out.println("myFun = " + now);
    return now;
}

public void testDefer(){
    // The code immediately executes myFun().
    Observable<String> ob1 = Observable.just(myFun());
    // The code calls myFun() only when subscription is performed. This is similar to the Supplier interface of Java 8.
    Observable<String> ob2 = Observable.defer(() -> Observable.just(myFun()));
}

fromCallable: Creates a lazily evaluated Observable. It is a simplified form of defer: Observable.fromCallable(() -> myFun()) is equivalent to Observable.defer(() -> Observable.just(myFun()) ).
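The eager and lazy behaviors can be compared with the Java 8 Supplier interface that Code 3.4 mentions. This is a plain-Java analogy rather than RxJava code. DeferDemo and its call counter are hypothetical, and myFun is a simplified version of the method in Code 3.4.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class DeferDemo {
    static final List<String> calls = new ArrayList<>();

    static String myFun() {
        calls.add("myFun");
        return "value";
    }

    static List<String> run() {
        calls.clear();
        List<String> log = new ArrayList<>();

        String eager = myFun();                   // like Observable.just(myFun()): runs now
        log.add("after eager: " + calls.size());

        Supplier<String> lazy = DeferDemo::myFun; // like defer or fromCallable: nothing runs yet
        log.add("after lazy creation: " + calls.size());

        lazy.get();                               // like the first subscription
        lazy.get();                               // like a second subscription
        log.add("after two gets: " + calls.size());
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Creating the Supplier does not call myFun, and each get() calls it again, just as each subscription to a deferred Observable would.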

So far, we have introduced RxJava and explored its usage. In Part 2, we will continue to explore the basic principles and precautions of RxJava.
