Community Blog How Does Taobao Implement Real-Time Product Selection with Real-Time Computing?

How Does Taobao Implement Real-Time Product Selection with Real-Time Computing?

This article describes Alibaba's Blink real-time stream computing technology, which is used to implement real-time product selection

The Alibaba Cloud 2021 Double 11 Cloud Services Sale is live now! For a limited time only you can turbocharge your cloud journey with core Alibaba Cloud products available from just $1, while you can win up to $1,111 in cash plus $1,111 in Alibaba Cloud credits in the Number Guessing Contest.

With the development of Taobao's content-based strategy, the demand for real-time product selection has strengthened. For some products (SPUs) with demanding real-time requirements, the operation personnel want the product pools that they define to take effect in real time to be instantly available to article writers. For this purpose, stream computing can be used to implement real-time product selection. This article describes Alibaba's Blink real-time stream computing technology, which is used to implement real-time product selection and has achieved satisfying results in trials.

Problem Analysis

The following problems must be resolved to implement real-time product selection:

  1. Real-time triggering: In the context where the triggering data source is required for stream computing and the feature data submitted by users is stored in the IDB, how can the IDB be associated with the Blink computing process?
  2. Intermediate state storage: During the Blink computing process, the previous intermediate computing state must be recorded based on the business scenario. In this case, how can these intermediate states be stored and read in real time when needed?
  3. Real-time validation: Given that the Blink computing results will eventually take effect on the search engine, how does Blink interact with the search engine so that the results can take effect in real time?
  4. Incremental data: If incremental data processing is not available, part of the updated data will be overwritten during the offline transfer of full data. In this case, how can the incremental data be appended?


The "TT + Blink + HBase + Swift" solution addresses these problems. Specifically, TimeTunnel (TT) can resolve the real-time triggering problem, HBase can resolve the intermediate state storage problem, and Swift can resolve the real-time validation and incremental data addition problems. TT, HBase, and Swift are described as follows:

  1. TT: As the Alibaba Cloud log collection system, it allows users to subscribe to logs. In addition, it supports IDB and Blink and is an important interaction medium between IDB and Blink.
  2. HBase: It is an open source non-relational distributed database product. It can properly interface with Blink and can be used to store and read intermediate computing states.
  3. Swift: It is a messaging system developed by the Alibaba Search Business Unit. Currently, the primary search engine transfers messages in real time through the Swift system. Swift can be used to address the real-time validation and incremental data addition problems of the engine.

Implementation Process

The workload of the Blink process is scattered to six nodes, namely the log resolution node, query splitting node, SP requesting node, data processing node, TT writeback node, and Swift messaging node. The real-time computing process is as follows:


  1. A user submits the feature data for product selection, which is stored into IDB and synchronized to the TT log.
  2. The update of the TT log triggers a Blink task, and the log resolution node parses the TT log to retrieve the feature data for product selection.
  3. The query splitting node estimates the number of required SPUs, determines the number of concurrent requests based on that of SPUs, and concatenates the SP parameters.
  4. The SP requesting node sends a concurrent SP service requests to obtain the SPU information.
  5. The data processing node reads the intermediate state from the HBase and performs computation based on the service logic.
  6. The data processing node writes the computing result back to the HBase database for the next computation.
  7. The TT writeback node and the Swift message node write the computing result back to the TT and Swift, respectively.
  8. The dump module accepts the Swift message and updates the data to the engine to bring it into effect in real time.
  9. TT records the computing result and writes it back to ODPS for full computing offline.

Implementation Details

The implementation of the product selection function mainly depends on developing Blink tasks. Before developing Blink tasks, you must understand the concepts of UDF, UDTF, and UDAF.


A major task during Blink development is to implement UDFs. First, multiple compute nodes are divided based on the stream computing process, for example, the query splitting node and the SP requesting node are independent compute nodes during implementation. Then, UDF classes must be determined and implemented based on the implementation logic of each node. The following uses the SP requesting node as an example to explain the implementation process in detail.

  1. Node analysis: The business scenario of the SP requesting node is a one-to-many process. Therefore, the UDTF class is used for implementation.
  2. UDTF class encapsulation: This class needs to inherit the TableFunction, where TableFunction is the pojo defined for itself and passed to the next running node.
  3. Node output: You need to define your own pojo class (namely the "T" item mentioned in the previous step) so that the output of the node is visible on the next node.
  4. Main function association: The Blink development process requires a master function to associate each compute node for stream computing purposes. We recommend that you develop the main function in the Scala language to facilitate the understanding of the code.

Reference Code

The following lists the UDTF implementation code for the SP requesting node. The basic idea of the code is to concurrently output the returned results of the SP to the next node.

public class SearchEngineUdtf extends TableFunction<EngineFields> {

    private static final Logger logger = LoggerFactory.getLogger(SearchEngineUdtf.class);

     * Request the engine to retrieve the recall field
     * @param params
    public void eval(String params) {
        SpuSearchResult<String> spuSearchResult = SpuSearchEngineUtil.getFromSpuSearch(params);
            // Result parsing
            JSONObject kxuanObj = SpuSearchEngineUtil.getSpResponseJson(spuSearchResult, "sp_kxuan");
            if(null == kxuanObj || kxuanObj.isEmpty()){
                logger.error("sp query: " + spuSearchResult.getSearchURL());
                logger.error(String.format("[%s],%s", Constant.ERR_PAR_SP_RESULT,"get key:sp_kxuan data failed! "));
            }else {
                List<EngineFields> engineFieldsList = SpuSearchEngineUtil.getSpAuction(kxuanObj);
                // Concurrently output to the data stream
                for(EngineFields engineFields : engineFieldsList){
        }else {
            logger.error(String.format("[%s],%s",Constant.ERR_REQ_SP, "request SpuEngine failed!"));


Launching a Blink Task


Currently, the process of launching a Blink task in a cluster is not fully automated. After finishing developing the Blink task, you need to launch it by going through the procedure shown in the preceding figure. After launching the task, you can log on to Yarn to check the running status of the task node.


After the function is launched, a selection pool with over 10,000 SPUs can take effect in minutes, greatly improving the product selection efficiency for creators.

About the Author

Author: Cui Qinglei (nickname: Chenxin), Senior Development Engineer of the Search System Service Platform, Alibaba Search Business Unit

Having joined Alibaba in 2015, Cui Qinglei is mainly engaged in the development of content-based product selection servers and is familiar with technologies related to search engine services and stream computing.


0 0 0
Share on

Alibaba Clouder

2,606 posts | 737 followers

You may also like


Alibaba Clouder

2,606 posts | 737 followers

Related Products

  • ECS(Elastic Compute Service)

    An online computing service that offers elastic and secure virtual cloud servers to cater all your cloud hosting needs.

    Learn More
  • ApsaraDB for RDS

    An on-demand database hosting service for MySQL, SQL Server and PostgreSQL with automated monitoring, backup and disaster recovery capabilities

    Learn More
  • Function Compute

    Alibaba Cloud Function Compute is a fully-managed event-driven compute service. It allows you to focus on writing and uploading code without the need to manage infrastructure such as servers.

    Learn More