Forum Moderator
  • UID555
  • Fans14
  • Follows2
  • Posts69

[Others]What kind of beast must the database be to support 175,000 transactions per second?

Posted time:Feb 7, 2017 14:08
The 2016 Singles' Day shopping carnival closed with 120.7 billion yuan in transaction value. At the opening, orders were created at a peak of 175,000 transactions per second, with 120,000 payments per second, another record high. How does the back-end IT system process this huge amount of data? Let's start with the improvements to database technologies.

You are also welcome to join the discussion and share the data types you are using, how you handle deployment (self-built databases or ApsaraDB directly), and how you cope with high-concurrency, high-load scenarios.

Let's name a few scenarios as examples:
1. Logistics algorithms - The scheduling of parcel deliveries and collections is a very important subject in the logistics industry during the Singles' Day period, and it closely relates to the effectiveness of delivery and collection services and the operating costs of logistics companies. A good algorithm can improve service effectiveness, reduce costs, and even better mobilize social resources, much as the DiDi taxi service encourages participation from everyone.
2. Optimization of flash sales - For the transaction platform, flash sales during the Singles' Day period pose a huge technical challenge, facing heavy stress from bots, scripts and external plug-ins for automated rush purchasing. There are many ways to handle flash sale scenarios: some databases adopt a queuing mechanism, some adopt asynchronous messages, and others adopt transaction merging. Which one are you using and what are its advantages?
3. Word segmentation for search - There are massive numbers of promotional products during the Singles' Day shopping carnival. How can we help users quickly locate their target items through word segmentation for search, regular expressions and similarity matching? What is your solution? Do you have any questions?
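As a concrete illustration of the queuing mechanism mentioned in scenario 2, here is a minimal sketch (all names are hypothetical, not from any real platform): purchase requests are queued as they arrive, and a single drain pass decrements the stock, so the database sees one ordered stream of writes instead of a thundering herd of concurrent ones.

```python
from collections import deque

class FlashSaleQueue:
    """Queuing sketch for a flash sale: buyers enqueue in arrival order;
    one drain loop decides winners, so stock is never decremented concurrently."""

    def __init__(self, stock):
        self.stock = stock
        self.queue = deque()

    def enqueue(self, user_id):
        # First come, first served: just record the request.
        self.queue.append(user_id)

    def drain(self):
        # Process requests in arrival order until stock runs out.
        winners = []
        while self.queue and self.stock > 0:
            winners.append(self.queue.popleft())
            self.stock -= 1
        self.queue.clear()  # everyone left over gets a sold-out response
        return winners

sale = FlashSaleQueue(stock=2)
for uid in ["u1", "u2", "u3"]:
    sale.enqueue(uid)
print(sale.drain())  # first two buyers win: ['u1', 'u2']
```

In a real system the queue would live in something like Redis or a message broker rather than process memory, but the ordering idea is the same.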

Everybody is welcome to join the discussion if you have more opinions and questions about the database.

Assistant Engineer
  • UID622
  • Fans3
  • Follows0
  • Posts52
1st Reply#
Posted time:Feb 8, 2017 11:19 AM
For me, I use a mix of SQL and NoSQL. The old businesses run directly on RDS. My website is not big, so it doesn't experience much stress. I use standalone Redis as the cache, and the results are good.
New businesses run on MongoDB and the speed is fantastic. I personally feel MongoDB is suitable for new businesses, but most people are still using MySQL, perhaps out of habit.
I like MongoDB for its flexibility.

  • UID623
  • Fans4
  • Follows1
  • Posts72
2nd Reply#
Posted time:Feb 9, 2017 9:13 AM
Here are my personal opinions.
Database regular expressions are a great idea for addressing word similarity. I will see whether I can apply this idea to my memory-based retrieval.
The computation of logistics routes is not handled by a single Dijkstra algorithm, but using the algorithm as a database plug-in is awesome.
I value map tiles very much from a data perspective, probably because I personally work on projects related to the Internet of Things. We cannot work out precise routes from A to B all at once. Why? Because transportation information is real-time: for a user who wants to ship cargo from Beijing to Shanghai, you can only plan a rough route at the beginning, and you cannot guarantee that this route remains the best. For instance, after I drive for an hour, the originally planned route may no longer be the best option due to a traffic control. At this point you need to guide the user to the optimal route, which is where a continuous correction algorithm comes into play. Therefore, when an event happens (such as a traffic control), we should judge whether the controlled section overlaps the planned route and estimate the driver's current location. If the driver is not expected to reach the controlled section within the next two hours, we can withhold this information entirely, because we don't know whether the control will still be in effect two hours later. Only when the time between the driver's current position and the controlled section falls below a certain threshold is the re-calculation triggered. The re-calculation does not cover the entire route; it offers the optimal alternative for the section from the current location to the end of the controlled section, and the remaining route still follows the original plan. It is better to let the business clusters handle this real-time correction mechanism.
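The trigger rule described above (only recompute when the driver is within a time threshold of an overlapping controlled section) can be sketched in a few lines. The two-hour threshold comes from the post; the function and its inputs are illustrative.

```python
RECALC_THRESHOLD_HOURS = 2.0  # from the post: ignore controls more than two hours away

def should_recalculate(planned_route, controlled_section, eta_hours_to_section):
    """Return True only when the traffic control overlaps the planned
    route AND the driver will reach it within the threshold."""
    overlaps = any(seg in planned_route for seg in controlled_section)
    if not overlaps:
        return False  # control is elsewhere; no impact on this route
    # A far-away control may be lifted before arrival, so withhold it.
    return eta_hours_to_section < RECALC_THRESHOLD_HOURS

# Control overlaps the route but is 3 hours away: no recalculation yet.
print(should_recalculate(["A", "B", "C", "D"], ["C"], 3.0))  # False
# Same control, driver now 1 hour away: recalculate that section only.
print(should_recalculate(["A", "B", "C", "D"], ["C"], 1.0))  # True
```

As the post notes, the actual recomputation would then replan only the stretch from the current location to the end of the controlled section, leaving the rest of the route untouched.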
We currently use a shared-memory linked list for word segmentation. We once considered using a database, but when there are many data requests, the stress is huge if it all lands on the database's full-text retrieval. The key to word segmentation is, first, to let users understand what words they are looking for; second, the order of articles displayed in the list should be determined by the user's click history and by reference-count algorithms (different tools use different algorithms). Data hotspots may be suitable for small-scale databases.
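A toy version of that in-memory index idea: a term-to-articles map whose results are ordered by click counts, standing in for the click-history ranking the post mentions. All names here are hypothetical, and a dict replaces the shared-memory linked list for brevity.

```python
class InMemoryIndex:
    """Tiny in-memory search index: terms map to article ids,
    and results are ranked by how often each article was clicked."""

    def __init__(self):
        self.postings = {}  # term -> set of article ids containing it
        self.clicks = {}    # article id -> accumulated click count

    def add(self, article_id, terms):
        for t in terms:
            self.postings.setdefault(t, set()).add(article_id)
        self.clicks.setdefault(article_id, 0)

    def click(self, article_id):
        # Record user feedback; this drives the ranking.
        self.clicks[article_id] += 1

    def search(self, term):
        # Rank matching articles by click count, most-clicked first.
        hits = self.postings.get(term, set())
        return sorted(hits, key=lambda a: -self.clicks[a])

idx = InMemoryIndex()
idx.add("a1", ["phone", "case"])
idx.add("a2", ["phone", "charger"])
idx.click("a2")
print(idx.search("phone"))  # ['a2', 'a1']
```

Serving this from memory keeps the heavy read traffic off the database's full-text retrieval, which is exactly the stress problem the post describes.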
Flash sales themselves may be comparatively intuitive as far as database implementation goes, but there may be another way of thinking: implement them through pools, such as memory pools. I can regard the flash-sale product as a service, or an API. For example, there are at most 50 slots, corresponding to 50 products. Data requests from different servers (possibly the same front-end cluster) all call this API. The API is mainly responsible for returning failure directly when the pool is full, and queuing requests (first come, first served) when it is not. The effective request count is the data in the queue plus the effective data in the pool, and the judgment is based on this sum. When the pool times out and is closed, or has filled up, run the validation algorithm to verify request effectiveness and then store the data into the database in one batch.
Of course, a pure in-memory implementation also carries risks, such as downtime and data-security risks. It is just one way of thinking.
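The pool idea above (a fixed number of slots, reject immediately when full, queue otherwise, persist in one batch on close) might look like this. This is a memory-only sketch under the post's assumptions; the validation step and the bulk database write are only stubbed in comments.

```python
class FlashSalePool:
    """Fixed-capacity pool: e.g. 50 slots for 50 products in the post's
    example. Requests beyond capacity fail at once; the rest queue in order."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = []  # queued requests, first come first served

    def request(self, user_id):
        # Judgment is based on queued requests + effective data in the pool.
        if len(self.pending) >= self.capacity:
            return "full"  # return failure directly when the pool is full
        self.pending.append(user_id)
        return "queued"

    def close(self):
        # On timeout or when full: validate requests, then persist in one batch.
        valid = [u for u in self.pending if u]  # placeholder validation step
        # ... a single bulk INSERT into the database would go here ...
        self.pending = []
        return valid

pool = FlashSalePool(capacity=2)
print(pool.request("u1"))  # 'queued'
print(pool.request("u2"))  # 'queued'
print(pool.request("u3"))  # 'full'
print(pool.close())        # ['u1', 'u2']
```

Batching the final write is what keeps the database out of the hot path; as the post warns, the trade-off is that everything held in memory is lost if the process goes down before close().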

Assistant Engineer
  • UID634
  • Fans0
  • Follows0
  • Posts44
3rd Reply#
Posted time:Feb 10, 2017 9:42 AM
The idea is awesome and can be used to constantly improve any product, including PostgreSQL. I posted a topic some time ago about scale: the scale supported by a PG database on a single node is directly proportional to the hardware, and you can bring the hardware's performance into full play. A machine priced around 100,000 yuan can typically be equipped with about 64 cores, a PCI-E SSD, 256 GB of memory and a 10Gb NIC. Such a machine has no problem supporting 1 billion QPS with millisecond-level regular expressions and full-text retrieval. In addition, there are multiple logistics algorithms besides Dijkstra; I only listed a simple example in the article. There are a dozen algorithms built into the pgRouting plug-in, and you can also add your own.
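For reference, the classic Dijkstra algorithm that pgRouting exposes in SQL (as `pgr_dijkstra`) reduces to a short priority-queue loop. This standalone sketch computes the shortest-path cost over a small weighted graph; the city names and costs are purely illustrative.

```python
import heapq

def dijkstra(graph, start, goal):
    """Classic Dijkstra: graph maps node -> list of (neighbor, cost) edges.
    Returns the minimum total cost from start to goal."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            return cost
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for nxt, w in graph.get(node, []):
            new_cost = cost + w
            if new_cost < dist.get(nxt, float("inf")):
                dist[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return float("inf")  # goal unreachable

# Hypothetical road graph with edge costs (e.g. hours of driving).
roads = {
    "Beijing": [("Jinan", 4), ("Tianjin", 1)],
    "Tianjin": [("Jinan", 2)],
    "Jinan":   [("Shanghai", 5)],
}
print(dijkstra(roads, "Beijing", "Shanghai"))  # 8, via Tianjin and Jinan
```

Running this inside the database as a plug-in, as the thread suggests, avoids shipping the whole edge table to the application just to plan one route.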

Assistant Engineer
  • UID627
  • Fans3
  • Follows0
  • Posts55
Posted time:Feb 13, 2017 13:38
I keep communities, forums and company websites on servers that still use MySQL. Key website databases have been moved onto RDS, which is very convenient to use. I haven't had to back up the databases myself in the past two years. I have to give it a thumbs up.

  • UID619
  • Fans3
  • Follows2
  • Posts59
Posted time:Feb 14, 2017 9:44 AM
Some processing logic can be moved into stored procedures to minimize the volume of data transmitted between the database and the application server.

Assistant Engineer
  • UID626
  • Fans1
  • Follows1
  • Posts52
Posted time:Feb 16, 2017 14:26
I currently use Alibaba Cloud for a forum website and a blog. The traffic is okay. At the beginning I used a self-built MySQL database co-located with the application server, since Alibaba hadn't launched RDS back then. When traffic started surging, the stress on the server became significant (my server's configuration is 1 core, 1 GB of memory and 2 Mbps bandwidth). Alibaba Cloud ran an RDS promotion and I bought the lowest-configuration RDS (that plan seems unavailable now; I can only renew it). I equipped the forum with Memcache for caching and the speed has improved.
For the company, I use a self-built server for the time being. After all, it is up to the boss whether to migrate the website onto the cloud, right?
We use MySQL, with read/write splitting and a master-slave setup to cope with high concurrency. Solr handles full-text search (we also tested Alibaba's OpenSearch). Open-source ELK handles log collection, and Redis is configured for queues.
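The read/write-splitting setup mentioned above can be sketched as a simple router that sends writes to the master and rotates reads across replicas. The connection names here are hypothetical, and the SELECT-prefix check is a deliberately naive classifier; real proxies parse statements properly.

```python
import itertools

class ReadWriteRouter:
    """Send writes to the master; spread reads across replicas round-robin."""

    def __init__(self, master, replicas):
        self.master = master
        self._reads = itertools.cycle(replicas)  # round-robin over replicas

    def route(self, sql):
        # Naive classification: SELECTs go to a replica, everything else
        # (INSERT/UPDATE/DELETE/DDL) goes to the master.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._reads)
        return self.master

router = ReadWriteRouter("master-db", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM orders"))    # replica-1
print(router.route("SELECT * FROM users"))     # replica-2
print(router.route("INSERT INTO orders ..."))  # master-db
```

One caveat with this pattern: replication lag means a read issued right after a write may hit a replica that hasn't caught up, so read-your-own-writes traffic is often pinned to the master.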