×
Community Blog Database Design and Implementation of a Ride Hailing Dispatch System

Database Design and Implementation of a Ride Hailing Dispatch System

This article discusses the database design of an order dispatch system for ride hailing apps by using DiDi as an example.

By digoal

Background

If a ride-hailing app dispatches orders solely relying on its scheduling system, the system must be very robust.

Take DiDi as an example. DiDi needs to answer how to match the requirements of passengers and drivers and efficiently dispatch orders.

With the increase in business demands, such as ride-sharing and ride-scheduling, the scheduling rules also pile up.

This article describes a simple order dispatch system that implements proximity-based order dispatch and ride-sharing using PostgreSQL and the spatial database plugin PostGIS.

You can use SKIP LOCKED and advisory locks to avoid lock conflicts during peak hours.

PostgreSQL Design

1) Spatial Database Plugin PostGIS

Creates a spatial database in PostgreSQL.

postgres=# create extension postgis;  
CREATE EXTENSION  

2) Car Location Table

Records the real-time locations of cars, whether they have trips to fulfill, and whether there are seats available (sets available seats of a car to 0 if the car is fulfilling a non-sharing trip).

create table car (  
  id int primary key,                          -- 车辆ID,主键  
  pos geometry,                                -- 实时位置, 使用PostGIS,geometry类型  
  sites int2 not null default 4,               -- 总座位数  
  rest_sites int2,                             -- 剩余座位数 (因为有拼车业务)  
  mod_time timestamp,                          -- 位置修改时间  
  order_pos geometry[],                        -- 当前订单对应用户打车的目的地位置  
  check (rest_sites <= sites and rest_sites>=0 and sites>0)  
);  

3) User Table

create table users (  
  id int8 primary key,  -- ID  
  otherinfo jsonb       -- 其他信息,请允许我偷懒一下使用JSON,实际上我这里派单只需要记录ID  
);  

4) Order Table

Records all orders and their status.

create table orders (  
  id serial8 primary key,   -- 订单号  
  carid int,             -- 车辆ID  
  uid int8,              -- 用户ID  
  crt_time timestamp,    -- 订单创建时间  
  pos1 geometry,         -- 上车位置  
  pos2 geometry,         -- 目的地  
  sites int2,            -- 乘坐几人  
  status int2            -- 订单状态(进行中 2, 取消 1, 结束 0)  
);  

5) Real-time Update of Car Locations

The location data is reported and updated every N seconds (for example, 5 seconds).

Assume that there are 10 million DiDi cars in a city.

假定车辆的活动范围经纬度(110~120, 25~30)  
  
vi test.sql  
  
\set id random(1,10000000)  
insert into car(id, pos, mod_time) values (  
  :id,   
  ST_SetSRID(ST_Point(round((random()*(120-110)+110)::numeric,6), round((random()*(30-25)+25)::numeric,6)), 4326),  
  now()  
) on conflict (id) do update set pos=ST_SetSRID(ST_Point(ST_X(excluded.pos)+random()-0.5, ST_Y(excluded.pos)+random()-0.5), 4326), mod_time=excluded.mod_time  
where car.sites <> car.rest_sites or car.rest_sites is null;     -- 不能被叫的车辆不更新位置(例如他的座位满了)  

The stress testing results (with spatial index enabled) are as follows:

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 56 -j 56 -T 120  
  
  
progress: 4.0 s, 200614.4 tps, lat 0.279 ms stddev 0.357  
progress: 5.0 s, 202598.0 tps, lat 0.276 ms stddev 0.336  
progress: 6.0 s, 196562.4 tps, lat 0.285 ms stddev 0.785  
progress: 7.0 s, 200305.4 tps, lat 0.280 ms stddev 0.534  
progress: 8.0 s, 207505.2 tps, lat 0.270 ms stddev 0.270  
progress: 9.0 s, 204128.0 tps, lat 0.274 ms stddev 0.347  

6) Ride-Hailing and Order Dispatch

1) A user reports the location.

2) The system searches for available cars nearby and returns car info by GiST index and filtering conditions such as lock, site, and distance.

3) The system creates spatial indexes.

Ride-sharing index:

create index idx_car_pos_1 on car using gist(pos) where rest_sites>0 or rest_sites is null;  

Non-sharing index:

create index idx_car_pos_2 on car using gist(pos) where rest_sites=sites or rest_sites is null;  

4) The system runs a function to verify whether ride-sharing is available for the user (the input parameters are the current user's destinations and existing passengers' destinations on nearby cars).

Here, the distance is used as one of the filtering conditions. The revolving door algorithm is also used. If the destination of the current user is not in the same "door" as those of the existing passengers on nearby cars (within a distance less than x km) that have available seats, ride-sharing is unavailable for the user. You can use Plugins such as pgRouting to determine, based on the map and map algorithms, whether ride-sharing is recommended.

create or replace function f_isbulk(  
  i_pos geometry,      -- 目的地  
  i_poss geometry[]    -- 该车已有乘客目的地  
) returns boolean as $$  
declare  
  vspheroid spheroid := 'SPHEROID["WGS84",6378137,298.257223563]' ;  -- WGS84椭球体参数定义  
begin  
  -- 先使用最简单的算法,例如任意已有乘客目的地与当前请求目的地距离在2000米以内则返回TRUE,允许拼车  
  -- 测试时,建议把允许的拼车目的地距离调大一点,否则可能很难找到合适的车辆  
  -- perform 1 from unnest(i_poss) t(pos) where ST_DistanceSpheroid(i_pos, pos, vspheroid) <= 2000 limit 1;   
  perform 1 from unnest(i_poss) t(pos) where ST_DistanceSpheroid(i_pos, pos, vspheroid) <= 200000000 limit 1;   
  if found then   
    return true;   
  else   
    return false;  -- 距离超过2000米,不与之拼车  
  end if;   
end;   
$$ language plpgsql strict;   
  
测试时,建议把允许的拼车目的地距离调大一点,否则可能很难找到合适的车辆  

5) For a user who prefers ride-sharing, the system uses the following function to generate a ride-sharing order and returns information such as order ID and vehicle ID.

create or replace function getcar_isbulk(  
  i_uid int8,        -- 用户ID  
  i_pos1 geometry,   -- 上车位置  
  i_pos2 geometry,   -- 目的地  
  i_sites int2       -- 乘坐几人  
) returns int8 as $$  -- 返回订单号  
declare  
  v_car_ctid tid;  -- car表被请求到的CAR的记录行号  
  v_carid int;     -- carid  
  v_orderid int8;  -- 订单号  
  v_pos geometry;  -- 锁定的车辆位置  
  vspheroid spheroid := 'SPHEROID["WGS84",6378137,298.257223563]' ;  -- WGS84椭球体参数定义  
begin  
  set local enable_seqscan=off;  
  set local enable_bitmapscan=off;  
    
  -- 根据条件锁定车辆ID,同时使用了try advisory lock避免行锁冲突  
  -- 与秒杀方法类似,大幅度提高吞吐,PG中锁定单条记录的吞吐可以达到将近 30万tps  
  select ctid,pos into v_car_ctid,v_pos from car where   
    (rest_sites > 0                                   -- 剩余座位数大于0    
    and rest_sites >= i_sites or rest_sites is null)  -- 剩余座位数大于等于请求座位数    
    and (order_pos is null or f_isbulk(i_pos2, order_pos))    -- 目的地满足拼车要求  
    and pg_try_advisory_xact_lock(id)                         -- adlock,提高秒杀吞吐  
    order by i_pos1 <-> pos for update limit 1;               -- 根据距离排序,以上条件满足,锁定1条即可  
    
  -- 如果车辆位置超出一定公里数(比如5公里),直接返回,不生成订单  
  -- 测试时,建议把公里数调大,便于找到车辆  
  -- if not found or ST_DistanceSpheroid(i_pos1, v_pos, vspheroid) > 5000 then  
  if not found or ST_DistanceSpheroid(i_pos1, v_pos, vspheroid) > 500000000 then  
    -- raise notice 'no car near your pos, the car leave you % meters', ST_DistanceSpheroid(i_pos1, v_pos, vspheroid);  
    return -1;  
  end if;  
    
  -- 更新车辆状态  
  update car set   
    rest_sites=coalesce(rest_sites-i_sites, sites-i_sites),         -- 减少剩余座位  
    order_pos=coalesce(order_pos||i_pos2, array[i_pos2])            -- 将目的地追加到车辆所有目的地中  
    where ctid=v_car_ctid    
    and coalesce(rest_sites-i_sites, sites-i_sites) >= 0  
    returning id into v_carid;         -- 返回车辆ID  
  
  if found then  
  -- 生成订单  
  insert into orders (carid, uid, crt_time, pos1, pos2, sites, status)   
    values(v_carid, i_uid, now(), i_pos1, i_pos2, i_sites, 2)   -- 状态为进行中  
    returning id into v_orderid;  -- 返回订单号  
  else  
    return -2;  
  end if;  
    
  return v_orderid;  
  
end;  
$$ language plpgsql strict;  

Optimized on April 16, 2018.

create index idx_car_pos_pc_1 on car using gist(pos) where rest_sites=1;  
create index idx_car_pos_pc_2 on car using gist(pos) where rest_sites=2;  
create index idx_car_pos_pc_3 on car using gist(pos) where rest_sites=3;  
create index idx_car_pos_pc_4 on car using gist(pos) where rest_sites=4;  
create index idx_car_pos_pc_5 on car using gist(pos) where rest_sites>4;  
create index idx_car_pos_pc_6 on car using gist(pos) where rest_sites is null;  


CREATE OR REPLACE FUNCTION public.f_isbulk(i_pos geometry, i_poss geometry[])
 RETURNS int
 LANGUAGE plpgsql
 STRICT
AS $function$  
declare  
  vspheroid spheroid := 'SPHEROID["WGS84",6378137,298.257223563]' ;  -- WGS84椭球体参数定义  
begin  
  -- 先使用最简单的算法,例如任意已有乘客目的地与当前请求目的地距离在2000米以内则返回TRUE,允许拼车  
  -- 测试时,建议把允许的拼车目的地距离调大一点,否则可能很难找到合适的车辆  
  perform 1 from unnest(i_poss) t(pos) where ST_DistanceSpheroid(i_pos, pos, vspheroid) <= 2000000000000 limit 1;   
  if found then   
    return 1;   
  else   
    return 0;    -- 距离超过2000米,不与之拼车  
  end if;   
end;   
$function$;

create or replace function getcar_isbulk(  
  i_uid int8,        -- 用户ID  
  i_pos1 geometry,   -- 上车位置  
  i_pos2 geometry,   -- 目的地  
  i_sites int2       -- 乘坐几人  
) returns int8 as $$  -- 返回订单号  
declare  
  v_carid int;     -- carid  
  v_orderid int8;  -- 订单号  
  v_pos geometry;  -- 锁定的车辆位置  
  vspheroid spheroid := 'SPHEROID["WGS84",6378137,298.257223563]' ;  -- WGS84椭球体参数定义  
  v_sites_step int;         -- 步调
  -- 如果车辆位置在给定范围内(比如5公里),则退出循环,否则继续. (测试时为了更容易找到车辆,需要把范围设大一些)
  -- v_dist float8 := 5000;  -- 车辆离你的距离 , 如果5公里开外,那么说明附近没有车
  v_dist float8 := 500000000;    
begin  
  set local enable_seqscan=off;  
  set local enable_bitmapscan=off;  
  
  case 
  when i_sites <= 4 then 
    for v_sites_step in i_sites..4 loop
      -- 根据条件锁定车辆ID,同时使用了try advisory lock避免行锁冲突  
      -- 与秒杀方法类似,大幅度提高吞吐,PG中锁定单条记录的吞吐可以达到将近 30万tps  
      select id,pos into v_carid,v_pos from car where   
        rest_sites = v_sites_step                                 -- 剩余座位数等于请求座位数 (或大于, loop超过i_sites时)
        and coalesce(f_isbulk(i_pos2, order_pos),1)=1                           -- 目的地满足拼车条件
        and pg_try_advisory_xact_lock(id)                         -- adlock,提高秒杀吞吐  
        and ST_DistanceSpheroid(i_pos1, pos, vspheroid) < v_dist   -- 车辆位置在给定范围内(比如5公里) . (测试时为了更容易找到车辆,需要把范围设大一些)
        order by i_pos1 <-> pos for update limit 1;               -- 根据距离排序,以上条件满足,锁定1条即可  

      -- 如果有符合条件的车辆,则退出 case
      if found then  
        exit; 
      end if;  
    end loop;

    -- 如果有符合条件的车辆,则退出 case
    if not found then  

      -- 4个座位内都没有找到符合条件的车辆, 找剩余座位大于4个的 
      select id,pos into v_carid,v_pos from car where   
        rest_sites > 4                                            -- 剩余座位数等于请求座位数 (或大于, loop超过i_sites时)
        and coalesce(f_isbulk(i_pos2, order_pos),1)=1                           -- 目的地满足拼车条件        
    and ST_DistanceSpheroid(i_pos1, pos, vspheroid) < v_dist   -- 车辆位置在给定范围内(比如5公里) . (测试时为了更容易找到车辆,需要把范围设大一些)
        and pg_try_advisory_xact_lock(id)                         -- adlock,提高秒杀吞吐  
        order by i_pos1 <-> pos for update limit 1;               -- 根据距离排序,以上条件满足,锁定1条即可  

      -- 如果有符合条件的车辆,则退出 case 
      if not found then  

        -- 大于4个座位的车辆没有,则找刚注册的车辆,即rest_sites is null
        select id,pos into v_carid,v_pos from car where   
          rest_sites is null                                        -- 剩余座位数等于请求座位数 (或大于, loop超过i_sites时)
      and ST_DistanceSpheroid(i_pos1, pos, vspheroid) < v_dist   -- 车辆位置在给定范围内(比如5公里) . (测试时为了更容易找到车辆,需要把范围设大一些)
          and pg_try_advisory_xact_lock(id)                         -- adlock,提高秒杀吞吐  
          order by i_pos1 <-> pos for update limit 1;               -- 根据距离排序,以上条件满足,锁定1条即可  
      end if;
    end if;

  -- 请求座位数大于4
  else

    -- 4个座位内都没有找到符合条件的车辆, 找剩余座位大于4个的 
    select id,pos into v_carid,v_pos from car where   
      rest_sites > 4                                            -- 剩余座位数等于请求座位数 (或大于, loop超过i_sites时)
      and coalesce(f_isbulk(i_pos2, order_pos),1)=1                          -- 目的地满足拼车条件      
      and ST_DistanceSpheroid(i_pos1, pos, vspheroid) < v_dist  -- 车辆位置在给定范围内(比如5公里) . (测试时为了更容易找到车辆,需要把范围设大一些)
      and pg_try_advisory_xact_lock(id)                         -- adlock,提高秒杀吞吐  
      order by i_pos1 <-> pos for update limit 1;               -- 根据距离排序,以上条件满足,锁定1条即可  

    -- 如果有符合条件的车辆,则退出 case 
    if not found then  

      -- 大于4个座位的车辆没有,则找刚注册的车辆,即rest_sites is null
      select id,pos into v_carid,v_pos from car where   
        rest_sites is null                                        -- 剩余座位数等于请求座位数 (或大于, loop超过i_sites时)
        and ST_DistanceSpheroid(i_pos1, pos, vspheroid) < v_dist   -- 车辆位置在给定范围内(比如5公里) . (测试时为了更容易找到车辆,需要把范围设大一些)
        and pg_try_advisory_xact_lock(id)                         -- adlock,提高秒杀吞吐  
        order by i_pos1 <-> pos for update limit 1;               -- 根据距离排序,以上条件满足,锁定1条即可  
    end if;
  end case;

  if not found then
    -- raise notice 'no car near your pos, the car leave you % meters', ST_DistanceSpheroid(i_pos1, v_pos, vspheroid);  
    return -1;
  end if;

  -- 更新车辆状态  
  update car set   
    rest_sites=coalesce(rest_sites-i_sites, sites-i_sites),         -- 减少剩余座位  
    order_pos=coalesce(order_pos||i_pos2, array[i_pos2])            -- 将目的地追加到车辆所有目的地中  
    where id=v_carid    
    and coalesce(rest_sites-i_sites, sites-i_sites) >= 0  
    returning id into v_carid;         -- 返回车辆ID  
  
  if found then  
  -- 生成订单  
  insert into orders (carid, uid, crt_time, pos1, pos2, sites, status)   
    values(v_carid, i_uid, now(), i_pos1, i_pos2, i_sites, 2)   -- 状态为进行中  
    returning id into v_orderid;  -- 返回订单号  
  else  
    return -2;  
  end if;  
    
  return v_orderid;  
  
end;  
$$ language plpgsql strict; 

6) For a user who chooses not to share the ride, the system uses the following function to generate a non-sharing order and returns information such as order ID and vehicle ID.

create or replace function getcar(  
  i_uid int8,        -- 用户ID  
  i_pos1 geometry,   -- 上车位置  
  i_pos2 geometry    -- 目的地  
) returns int8 as $$  -- 返回订单号  
declare  
  v_car_ctid tid;  -- car表被请求到的CAR的记录行号  
  v_carid int;     -- carid  
  v_orderid int8;  -- 订单号  
  v_sites int2;    -- 座位数  
  v_pos geometry;  -- 锁定的车辆位置  
  vspheroid spheroid := 'SPHEROID["WGS84",6378137,298.257223563]' ;  -- WGS84椭球体参数定义  
begin  
  set local enable_seqscan=off;  
  set local enable_bitmapscan=off;  
    
  -- 根据条件锁定车辆ID,同时使用了try advisory lock避免行锁冲突  
  -- 与秒杀方法类似,大幅度提高吞吐,PG中锁定单条记录的吞吐可以达到将近 30万tps  
  select ctid,pos into v_car_ctid,v_pos from car where   
    (rest_sites=sites or rest_sites is null)                  -- 剩余座位数等于能提供的座位数,说明没有订单在手,满足不拼车需求  
    and pg_try_advisory_xact_lock(id)                         -- adlock,提高秒杀吞吐  
    order by i_pos1 <-> pos for update limit 1;               -- 根据距离排序,以上条件满足,锁定1条即可  
    
  -- 如果车辆位置超出一定公里数(比如5公里),直接返回,不生成订单  
  -- 测试时,建议把公里数调大,便于找到车辆  
  -- if not found or ST_DistanceSpheroid(i_pos1, v_pos, vspheroid) > 5000 then  
  if not found or ST_DistanceSpheroid(i_pos1, v_pos, vspheroid) > 500000000 then  
    -- raise notice 'no car near your pos, the car leave you % meters', ST_DistanceSpheroid(i_pos1, v_pos, vspheroid);  
    return -1;  
  end if;  
  
  -- 更新车辆状态  
  update car set   
    rest_sites=0                      -- 剩余座位减少为0  
    where ctid=v_car_ctid                
    returning id,sites into v_carid,v_sites;        -- 返回车辆ID  
    
  -- 生成订单  
  insert into orders (carid, uid, crt_time, pos1, pos2, sites, status)   
    values(v_carid, i_uid, now(), i_pos1, i_pos2, v_sites, 2)    -- 状态为进行中  
    returning id into v_orderid;                 -- 返回订单号  
    
  return v_orderid;  
end;  
$$ language plpgsql strict; 

7) Ending or Canceling an Order

Update the order status, update the car's available seats, and delete the dropped-off passengers' destinations.

create or replace function change_order(  
  i_id int8,     -- 订单ID  
  i_status int2  -- 状态, 进行中2,取消1,结束0   
) returns int as $$  
declare  
  i_carid int;  
  i_pos geometry;  
  i_sites int2;  
begin  
  set local enable_seqscan=off;  
  set local enable_bitmapscan=off;    
  update orders set status=i_status where id=i_id and status<>0 returning carid, pos2, sites into i_carid, i_pos, i_sites;  
  
  if found then  
    update car set rest_sites=rest_sites+i_sites, order_pos=array_remove(order_pos, i_pos) where id=i_carid and pg_try_advisory_xact_lock(id);  
      
    -- 测试时加上这段,因为不存在锁冲突  
    if not found then  
      raise EXCEPTION '';  
    end if;  
  
    return 1;  
  end if;  
    
  raise EXCEPTION '';  
  exception when others then  
    return -1;  
end;  
$$ language plpgsql strict;  

8) Stress Testing

Create a function to generate random pick-up locations of ride-hailing users for stress testing.

create or replace function gen_pos() returns geometry as $$  
  select ST_SetSRID(ST_Point(round((random()*(120-110)+110)::numeric,6), round((random()*(30-25)+25)::numeric,6)), 4326);  
$$ language sql strict;  

Create a function to obtain the IDs of unfinished orders for stress testing.

create or replace function gen_orderid() returns int8 as $$  
declare  
  res int8;  
  n int := random()*56;
begin  
  set local enable_seqscan=off;  
  set local enable_bitmapscan=off;  
  -- 随机采样得到订单ID,避免并发执行时大家得到的是同一个ID
  -- select id into res from orders TABLESAMPLE SYSTEM(0.1) where status = 2 limit 1;  
  -- 或者
  select id into res from orders where status = 2 offset n limit 1;  
  return res;  
end;  
$$ language plpgsql strict volatile;  
create index idx_orders_1 on orders (id) where status=2;  

Perform stress testing on the assumption that there are two billion users.

1) Generate ride-sharing orders.

vi test1.sql  
  
\set uid random(1,2000000000)  
select getcar_isbulk(:uid, gen_pos(), gen_pos(), 1::int2);  

2) Generate non-sharing orders.

vi test2.sql  
  
\set uid random(1,2000000000)  
select getcar(:uid, gen_pos(), gen_pos());  

3) End orders.

vi test3.sql  
  
select change_order(odid, 0::int2) from gen_orderid() t(odid) where pg_try_advisory_xact_lock(odid);  

4) Add or update car locations.

vi test4.sql  
  
\set id random(1,10000000)  
insert into car(id, pos, mod_time) values (  
  :id,   
  ST_SetSRID(ST_Point(round((random()*(120-110)+110)::numeric,6), round((random()*(30-25)+25)::numeric,6)), 4326),  
  now()  
) on conflict (id) do update set pos=ST_SetSRID(ST_Point(ST_X(excluded.pos)+random()-0.5, ST_Y(excluded.pos)+random()-0.5), 4326), mod_time=excluded.mod_time  
where car.sites<>car.rest_sites or car.rest_sites is null;   -- 不能被叫的车辆不更新位置(例如他的座位满了)  

Single Scenario-based Stress Testing Performance

1) Add or update car locations: About 167,000 rows are added per second.

pgbench -M prepared -n -r -P 1 -f ./test4.sql -c 56 -j 56 -T 120
number of transactions actually processed: 20120910  
latency average = 0.334 ms  
latency stddev = 0.431 ms  
tps = 167648.839197 (including connections establishing)  
tps = 167663.075494 (excluding connections establishing)  

2) Generate ride-sharing orders: About 17,300 orders are generated per second.

pgbench -M simple -n -r -P 1 -f ./test1.sql -c 13 -j 13 -T 120
tps = 17338.303267 (including connections establishing)
tps = 17339.077083 (excluding connections establishing)

3) Generate non-sharing orders: About 12,300 orders are generated per second.

pgbench -M simple -n -r -P 1 -f ./test2.sql -c 13 -j 13 -T 120
tps = 26224.271953 (including connections establishing)
tps = 26225.359067 (excluding connections establishing)

4) End orders: About 200,000 orders are ended per second.

pgbench -M prepared -n -r -P 1 -f ./test3.sql -c 56 -j 56 -T 120
tps = 204695.541432 (including connections establishing)  
tps = 204713.293460 (excluding connections establishing)  
Single scenario TPS
Locations updated 167,000
Ride-sharing orders generated 17,300
Non-sharing orders generated 26,000
Orders ended 200,000

Hybrid Scenario-based Stress Testing Performance

pgbench -M prepared -n -r -P 3 -f ./test4.sql -c 16 -j 16 -T 120 > ./log1 2>&1 &  
  
pgbench -M simple -n -r -P 3 -f ./test1.sql -c 10 -j 10 -T 120 > ./log2 2>&1 &  
  
pgbench -M simple -n -r -P 3 -f ./test2.sql -c 10 -j 10 -T 120 > ./log3 2>&1 &  
  
pgbench -M prepared -n -r -P 3 -f ./test3.sql -c 16 -j 16 -T 120 > ./log4 2>&1 &  
Hybrid scenario TPS
Locations updated 54,300
Ride-sharing orders generated 10,647
Non-sharing orders generated 12,045
Orders ended 74,000

The total number of ride-sharing and non-sharing orders generated per second is 22,692.

Some stress-testing data is as follows:

postgres=# select * from car where rest_sites between 1 and 2 limit 10;  
   id    |                        pos                         | sites | rest_sites |          mod_time          |      order_pos     
---------+----------------------------------------------------+-------+------------+----------------------------+----------------------------------  
 7317265 | 0101000020E6100000A489994A8FAB5D40CC44BFBDA9343D40 |     4 |          1 | 2018-04-14 22:58:03.407874 | {0101000020E6100000B6BFB33D7AA55B40D28A6F287C9A3940:0101000020E6100000B6BFB33D7AA55B40D28A6F287C9A3940:0101000020E6100000B6BFB33D7AA55B40D28A6F287C9A3940}  
 3111010 | 0101000020E6100000BB1DFDD578835B405280A6641A673C40 |     4 |          2 | 2018-04-14 23:12:53.34535  | {0101000020E6100000FEB7921D1B1E5C400CC9C9C4AD7E3B40:0101000020E6100000FEB7921D1B1E5C400CC9C9C4AD7E3B40}  
 7541005 | 0101000020E610000086554818387E5C40B728ADCD43D03D40 |     4 |          2 | 2018-04-14 23:11:29.206046 | {0101000020E61000000F9BC8CC05065D400B2AAA7EA5373A40:0101000020E61000000F9BC8CC05065D400B2AAA7EA5373A40}  
 8690828 | 0101000020E6100000000CAF61E2675C40D3DC0C1EA9D13940 |     4 |          1 | 2018-04-14 23:14:19.151509 | {}  
 6457811 | 0101000020E610000032F3008486A25D40EC2EF2553E843B40 |     4 |          2 | 2018-04-14 23:14:03.251394 | {0101000020E6100000AFD007CBD8555D402DAF5C6F9BD93D40:0101000020E6100000AFD007CBD8555D402DAF5C6F9BD93D40}  
 8742742 | 0101000020E6100000A9F5D11666C25C40BD187A2ED8943A40 |     4 |          1 | 2018-04-14 23:07:05.165694 | {0101000020E6100000B0E76B96CBC95B401CCD91955FAE3B40:0101000020E6100000B0E76B96CBC95B401CCD91955FAE3B40:0101000020E6100000B0E76B96CBC95B401CCD91955FAE3B40}  
 2817265 | 0101000020E610000039565329D09F5C403F56DACE20EF3C40 |     4 |          2 | 2018-04-14 23:11:18.579623 | {0101000020E61000005260014C19C55B407AC6BE64E3393D40:0101000020E61000005260014C19C55B407AC6BE64E3393D40}  
 7150506 | 0101000020E610000061A6600487595D405D7086861CBB3C40 |     4 |          2 | 2018-04-14 23:15:33.078539 | {0101000020E61000006743FE9941FD5D40F0FB372F4E483C40:0101000020E61000006743FE9941FD5D40F0FB372F4E483C40}  
 5583272 | 0101000020E6100000857D1D6801D25D40C58F015D4F2B3D40 |     4 |          2 | 2018-04-14 23:10:02.842235 | {0101000020E61000004B92E7FA3E9B5D40FCA6B05241193940:0101000020E61000004B92E7FA3E9B5D40FCA6B05241193940}  
 1367076 | 0101000020E610000072D31806729E5D403A3F7D95E3703D40 |     4 |          1 | 2018-04-14 23:14:02.91879  | {0101000020E61000009450FA42C88C5B4003B4AD669DCD3C40:0101000020E61000009450FA42C88C5B4003B4AD669DCD3C40:0101000020E61000009450FA42C88C5B4003B4AD669DCD3C40}  
(10 rows)  
  
postgres=# select * from orders limit 10;  
   id    |  carid  |    uid     |          crt_time          |                        pos1                        |                        pos2                        | sites | status   
---------+---------+------------+----------------------------+----------------------------------------------------+----------------------------------------------------+-------+--------  
 5350583 | 8219421 |  656330079 | 2018-04-14 22:58:50.85801  | 0101000020E6100000F1B913ECBFA55B40E86A2BF697213C40 | 0101000020E6100000A6F0A0D975CE5C40543A58FFE7BC3A40 |     4 |      0  
 5350594 |  387903 | 1251082211 | 2018-04-14 22:58:50.899007 | 0101000020E61000005E4A5D328E2F5D40BF9CD9AED0BB3B40 | 0101000020E610000073F4F8BD4D465D409DD66D50FB353C40 |     1 |      0  
 5350601 | 6032695 |  633527755 | 2018-04-14 22:58:50.901435 | 0101000020E6100000095053CBD6D45B401232906797BF3D40 | 0101000020E6100000FDFA213658D85C4010035DFB02023A40 |     1 |      0  
 5350645 | 9332236 | 1950115872 | 2018-04-14 22:58:50.906247 | 0101000020E610000021CD58349D835B400A9E42AED4F33C40 | 0101000020E6100000988922A46E9D5C40FD2E6CCD56663C40 |     4 |      0  
 5534249 | 1426569 |  982096157 | 2018-04-14 22:59:13.750311 | 0101000020E6100000115322895E535D404EB857E6ADEE3B40 | 0101000020E61000002EAEF199ECF65C40D845D1031FE33940 |     1 |      0  
 5350660 |  725764 | 1023537035 | 2018-04-14 22:58:50.907513 | 0101000020E61000004DDBBFB2D2575C404580D3BB78C73940 | 0101000020E6100000E544BB0A294E5C407CD2890453593B40 |     1 |      0  
 5534260 | 1777824 | 1176511003 | 2018-04-14 22:59:13.751194 | 0101000020E6100000FCE4284014305D404A0A2C80297B3B40 | 0101000020E610000079B29B19FD9E5B4015C8EC2C7A6B3940 |     1 |      0  
 5350677 | 9205198 | 1483861832 | 2018-04-14 22:58:50.9087   | 0101000020E61000002BD9B11188C35B4094FB1D8A024D3D40 | 0101000020E6100000F6EE8FF7AA2D5D402B685A6265203D40 |     4 |      0  
 5350704 | 4722183 | 1707309465 | 2018-04-14 22:58:50.910806 | 0101000020E610000010E7E104A66C5D4080F44D9A069D3A40 | 0101000020E61000007D2079E7509D5D4084D4EDEC2B3B3B40 |     1 |      0  
 5350729 | 1273928 | 1122725930 | 2018-04-14 22:58:50.91219  | 0101000020E6100000815CE2C803805D40AED85F764F0A3C40 | 0101000020E61000004E7B4ACE89595D404CA59F70763F3C40 |     1 |      0  
(10 rows)  
  
postgres=# select * from orders where status=2 limit 10;  
   id    |  carid  |    uid     |          crt_time          |                        pos1                        |                        pos2                        | sites | status   
---------+---------+------------+----------------------------+----------------------------------------------------+----------------------------------------------------+-------+--------  
 5432604 | 6569377 | 1058047186 | 2018-04-14 22:58:54.188969 | 0101000020E6100000CB13083BC5785C40DBA6785C54EF3C40 | 0101000020E6100000E0F42EDE8FE25B40FE43FAEDEB783940 |     1 |      2  
 5432645 |  850296 | 1314115523 | 2018-04-14 22:58:54.190902 | 0101000020E61000006878B306EF335C40A0E1CD1ABC173D40 | 0101000020E6100000139A249694D35C40F513CE6E2DE73A40 |     4 |      2  
 5432709 | 6569377 |  283077597 | 2018-04-14 22:58:54.194083 | 0101000020E6100000CB13083BC5785C40DBA6785C54EF3C40 | 0101000020E6100000E0F42EDE8FE25B40FE43FAEDEB783940 |     1 |      2  
 5432752 | 9088745 |   11001436 | 2018-04-14 22:58:54.195929 | 0101000020E61000006878B306EF335C40A0E1CD1ABC173D40 | 0101000020E6100000139A249694D35C40F513CE6E2DE73A40 |     4 |      2  
 5432817 | 1126957 |  577778747 | 2018-04-14 22:58:54.199477 | 0101000020E6100000CB13083BC5785C40DBA6785C54EF3C40 | 0101000020E6100000E0F42EDE8FE25B40FE43FAEDEB783940 |     1 |      2  
 5432857 |  480496 |  269272019 | 2018-04-14 22:58:54.201194 | 0101000020E61000006878B306EF335C40A0E1CD1ABC173D40 | 0101000020E6100000139A249694D35C40F513CE6E2DE73A40 |     4 |      2  
 5432917 | 1126957 | 1665973989 | 2018-04-14 22:58:54.203993 | 0101000020E6100000CB13083BC5785C40DBA6785C54EF3C40 | 0101000020E6100000E0F42EDE8FE25B40FE43FAEDEB783940 |     1 |      2  
 5432962 | 7515414 |   14898049 | 2018-04-14 22:58:54.206027 | 0101000020E61000006878B306EF335C40A0E1CD1ABC173D40 | 0101000020E6100000139A249694D35C40F513CE6E2DE73A40 |     4 |      2  
 5433012 | 7323377 | 1678751369 | 2018-04-14 22:58:54.208529 | 0101000020E61000006878B306EF335C40A0E1CD1ABC173D40 | 0101000020E6100000139A249694D35C40F513CE6E2DE73A40 |     4 |      2  
 5433048 | 1126957 |  281166362 | 2018-04-14 22:58:54.210329 | 0101000020E6100000CB13083BC5785C40DBA6785C54EF3C40 | 0101000020E6100000E0F42EDE8FE25B40FE43FAEDEB783940 |     1 |      2  
(10 rows)  

Assume that the average price for each order is CNY 30, the commission for DiDi is 20%, and 22,692 orders are generated per second. The maximum profit brought about by this PostgreSQL database is 22,692 orders/s × CNY 6/order = CNY 136 thousand/s (USD 19.6 thousand/s), that is, CNY 11.7 billion/day (USD 1.685 billion/day). The peak time cannot last all day long, but we can expect at least one peak hour per day, meaning a CNY 500 million profit.

Then, let's see how much this database costs. A standalone server for this database may cost about CNY 20,000 (USD 2,881). Considering the costs of servers for primary/secondary deployment, disaster recovery, and backup, the hardware costs can be controlled within CNY 300,000 (USD 43,224).

It costs only CNY 6,140 (USD 884.6) per month if you buy an Alibaba Cloud RDS for PostgreSQL instance — with a specification of 56 cores, 480 GB memory, and 2 TB SSD storage and provided with features such as high cloud disk reliability, high availability and database backup, and most importantly database kernel-level technical support service.

Optimization

1) Bottleneck diagnosis for code.

2) Bottleneck diagnosis for user-defined functions (UDFs) (the testing data has been updated after optimization).

How do we diagnose the performance when all logic is written into UDFs? Two options are available.

One is to use the auto_explain module:

load 'auto_explain';
set auto_explain.log_analyze =on;
set auto_explain.log_buffers =on;
set auto_explain.log_min_duration =0;
set auto_explain.log_nested_statements =on;
set auto_explain.log_time=on;
set auto_explain.log_verbose =on;
set client_min_messages ='log';
postgres=# select getcar_isbulk(1, st_setsrid(st_point(120,30),4326), st_setsrid(st_point(120,30.1),4326), 1::int2);
LOG:  duration: 0.047 ms  plan:
Query Text: SELECT 1 from unnest(i_poss) t(pos) where ST_DistanceSpheroid(i_pos, pos, vspheroid) <= 2000000000 limit 1
Limit  (cost=0.00..1.56 rows=1 width=4) (actual time=0.039..0.039 rows=1 loops=1)
  Output: 1
  ->  Function Scan on pg_catalog.unnest t  (cost=0.00..51.25 rows=33 width=4) (actual time=0.038..0.038 rows=1 loops=1)
        Output: 1
        Function Call: unnest('{0101000020E61000000000000000005E409A99999999193E40:0101000020E61000000000000000005E409A99999999193E40:0101000020E61000000000000000005E409A99999999193E40}'::geometry[])
        Filter: (st_distancespheroid('0101000020E61000000000000000005E409A99999999193E40'::geometry, t.pos, 'SPHEROID("WGS84",6378137,298.257223562997)'::spheroid) <= '2000000000'::double precision)


LOG:  duration: 0.485 ms  plan:
Query Text: select id,pos                    from car where   
    (rest_sites > 0                                   -- 剩余座位数大于0    
    and rest_sites >= i_sites or rest_sites is null)  -- 剩余座位数大于等于请求座位数    
    and (order_pos is null or f_isbulk(i_pos2, order_pos))    -- 目的地满足拼车要求  
    and pg_try_advisory_xact_lock(id)                         -- adlock,提高秒杀吞吐  
    order by i_pos1 <-> pos for update skip locked limit 1
Limit  (cost=0.41..1.50 rows=1 width=50) (actual time=0.482..0.482 rows=1 loops=1)
  Output: id, pos, (('0101000020E61000000000000000005E400000000000003E40'::geometry <-> pos)), ctid
  Buffers: shared hit=9
  ->  LockRows  (cost=0.41..1269428.33 rows=1169416 width=50) (actual time=0.481..0.481 rows=1 loops=1)
        Output: id, pos, (('0101000020E61000000000000000005E400000000000003E40'::geometry <-> pos)), ctid
        Buffers: shared hit=9
        ->  Index Scan using idx_car_pos_1 on public.car  (cost=0.41..1257734.17 rows=1169416 width=50) (actual time=0.465..0.465 rows=1 loops=1)
              Output: id, pos, ('0101000020E61000000000000000005E400000000000003E40'::geometry <-> pos), ctid
              Order By: (car.pos <-> '0101000020E61000000000000000005E400000000000003E40'::geometry)
              Filter: ((((car.rest_sites > 0) AND (car.rest_sites >= '1'::smallint)) OR (car.rest_sites IS NULL)) AND pg_try_advisory_xact_lock((car.id)::bigint) AND ((car.order_pos IS NULL) OR f_isbulk('0101000020E61000000000000000005E409A99999999193E40'::geometry, car.order_pos)))
              Buffers: shared hit=7


LOG:  duration: 0.102 ms  plan:
Query Text: update car set   
    rest_sites=coalesce(rest_sites-i_sites, sites-i_sites),         -- 减少剩余座位  
    order_pos=coalesce(order_pos||i_pos2, array[i_pos2])            -- 将目的地追加到车辆所有目的地中  
    where id=v_carid    
    and coalesce(rest_sites-i_sites, sites-i_sites) >= 0  
    returning id
Update on public.car  (cost=0.43..2.67 rows=1 width=86) (actual time=0.098..0.099 rows=1 loops=1)
  Output: id
  Buffers: shared hit=15
  ->  Index Scan using car_pkey on public.car  (cost=0.43..2.67 rows=1 width=86) (actual time=0.017..0.018 rows=1 loops=1)
        Output: id, pos, sites, COALESCE((rest_sites - '1'::smallint), (sites - '1'::smallint)), mod_time, COALESCE((order_pos || '0101000020E61000000000000000005E409A99999999193E40'::geometry), '{0101000020E61000000000000000005E409A99999999193E40}'::geometry[]), ctid
        Index Cond: (car.id = 1112283)
        Filter: (COALESCE((car.rest_sites - '1'::smallint), (car.sites - '1'::smallint)) >= 0)
        Buffers: shared hit=4


LOG:  duration: 0.032 ms  plan:
Query Text: insert into orders (carid, uid, crt_time, pos1, pos2, sites, status)   
    values(v_carid, i_uid, now(), i_pos1, i_pos2, i_sites, 2)   -- 状态为进行中  
    returning id
Insert on public.orders  (cost=0.00..0.02 rows=1 width=96) (actual time=0.030..0.031 rows=1 loops=1)
  Output: id
  Buffers: shared hit=8
  ->  Result  (cost=0.00..0.02 rows=1 width=96) (actual time=0.010..0.010 rows=1 loops=1)
        Output: nextval('orders_id_seq'::regclass), 1112283, '1'::bigint, now(), '0101000020E61000000000000000005E400000000000003E40'::geometry, '0101000020E61000000000000000005E409A99999999193E40'::geometry, '1'::smallint, '2'::smallint
        Buffers: shared hit=1


LOG:  duration: 1.686 ms  plan:
Query Text: select getcar_isbulk(1, st_setsrid(st_point(120,30),4326), st_setsrid(st_point(120,30.1),4326), 1::int2);
Result  (cost=0.00..0.26 rows=1 width=8) (actual time=1.681..1.681 rows=1 loops=1)
  Output: getcar_isbulk('1'::bigint, '0101000020E61000000000000000005E400000000000003E40'::geometry, '0101000020E61000000000000000005E409A99999999193E40'::geometry, '1'::smallint)
  Buffers: shared hit=56
 getcar_isbulk 
---------------
      18854017
(1 row)

The other is to use plprofiler.

Database Sharding

Generally, the purpose of database sharding is to reduce the number of requests per database. However, how do we shard a spatiotemporal database?

1) If you shard a database by space, the geographical boundaries are hard to define.

2) If you shard a database by other attributes, data of any space may exist in all shards. For example, when you query information about a car, you must search all shards.

Business Logic:

To reduce the number of requests, you have to shard the database by space. You must overcome the problems of boundaries and car location migration.

1) First, shard the database by regular activity regions (represented by polygons defined by location points) of cars, build the metadata, and store the polygon-to-shard mapping relationship.

2) Select the polygon that covers the pickup position and find one or more shards corresponding to this polygon. PostgreSQL has outstanding performance in polygon selection.

3) Shard the database by order ID and store the order-to-shard mapping relationship. This relationship indicates the shard to which an order belongs and avoids a space overlapping problem.

4) Finally, correct the location of a car. For example, when a car goes far away from its regular activity region to fulfill a long trip, you can update its location to migrate its information from the original shard to the destination shard. As a result, its information appears in the destination shard and is visible to users.

The location of a car can be updated based on an electronic fence. Specifically, when the car goes out of the electronic fence, the driver is prompted to choose whether to update the location. Some apps also detect your location based on electronic fences. And when you are on a business trip to another city, remind you to change your location to the current city. The driver may also change settings to accept return-trip orders.

Summary

PostgreSQL is an excellent full-stack database, and this example uses the following database features:

1) PostGIS: A spatial database plugin.

2) Array: Serves as the arithmetic foundation for ride-hailing.

3) Partial index: Allows creating indexes for data that need to be searched.

4) UDF: User-defined functions used in database programming. Order dispatch in this example is completed by using PL/pgSQL functions supported by PostgreSQL.

5) SKIP LOCKED and advisory locks: Similar to their application in the flash sale scenario, SKIP LOCKED and advisory locks are used to improve the ride-hailing throughput during peak hours and prevent multiple customers from waiting for the same car.

The server used for testing is a 56-core Alibaba Cloud ECS VM with SSD cloud disks.

Performance: In a hybrid scenario, this server is able to dispatch 10,000 orders per second. If there are 1 million orders to be dispatched during peak hours, you can shard the database into 100 shards.

Do Simple Math

Assume that the average price for each order is CNY 30, the commission for DiDi is 20%, and 22,692 orders are generated per second. The maximum profit brought about by this PostgreSQL database is 22,692 orders/s × CNY 6/order = CNY 136 thousand/s (USD 19.6 thousand/s), that is, CNY 11.7 billion/day (USD 1.685 billion/day). The peak time cannot last all day long, but we can expect at least one peak hour per day, meaning a CNY 500 million profit.

It takes only CNY 6,140 (USD 884.6) per month to buy an Alibaba Cloud RDS for PostgreSQL instance — with a specification of 56 cores, 480 GB memory, and 2 TB SSD storage and provided with features such as high cloud disk reliability, high availability and database backup, and most importantly database kernel-level technical support service.

Optimization

This article mainly provides some ideas and examples for your reference. Further optimizations are certainly available.

Optimization 1: The locations of unavailable cars are not updated (or can be updated less frequently) to reduce the number of updates. This is already implemented in this example (supported by the INSERT ON CONFLICT syntax).

Optimization 2: Facilitating ride-sharing orders of the same directions during peak hours. Similar to the group commit feature of a database, a time window of a specified length can be held for customers during ride-hailing. Then a clustering algorithm such as K-means is used to aggregate customers whose destination locations are close to facilitate ride-sharing among these customers. (Of course, there are more methods you can use to facilitate ride-sharing.)

Optimization 3: If multiple customers hail rides from the same place at the same time, the same car may be selected multiple times if the system uses the proximity-based dispatch policy.

This article uses advisory locks to avoid row lock conflicts. This method is not the best solution. Although it solves the problem of lock conflicts, the scan still goes from near to far. When multiple scans are performed in parallel, some sessions must scan many rows to find a row that is not locked. Here is an approach that is similar to group commit. In this approach, the system retrieves multiple cars for the customers who hail rides from the same place at the same time and allocates these cars to different customers.

There is another approach that must be implemented in a database. Given a discrete factor, the system can retrieve nearby cars (within a specified distance) and randomly pick one based on the index. Therefore, it needs to be ensured that only one entry is retrieved during index scans and heap scans. This is similar to the random scanning of indexed points.

Optimization 4: You can design a reasonable fill-factor for the frequently updated car table and its indexes. For example, you may set the fill-factor to 70.

0 0 0
Share on

digoal

148 posts | 10 followers

You may also like

Comments