×
Community Blog Supporting 200 Billion Data Records in a Single RDS PostgreSQL Instance

Supporting 200 Billion Data Records in a Single RDS PostgreSQL Instance

Varbitx, an extension provided by ApsaraDB for RDS PostgreSQL, enables real-time tagging for trillions of USER_TAGs using a single RDS PG.

By Digoal

Alibaba Cloud ApsaraDB for PostgreSQL is capable of handling large amounts of records at a time. How much? Well, let's look at an example to find out. Let's assume we have 2 billion users, each with 1000 tags, and we need to perform user tagging and pivoting based on any combination of tags (the business requirement is to compute a combination of up to 100 tags at a time).

This is equivalent to handling 200 billion records at a time, each requiring real-time response.

You may think that this would need at least a hundred machines. But in fact, this amount of data only requires one ApsaraDB for RDS PostgreSQL instance. This article discusses the cutting-edge RDS PG technology that helps us solve this sort of business requirements while consuming minimum resources.

Optimization Solution to Improve Response Speed

  1. bitmap segmentation
  2. Use parallel computing to determine the USER COUNT which meets the tag conditions (using dblink asynchronous calls)
  3. Return streaming cursors when determining user IDs.

Sample Demo

Install required plug-ins

create extension dblink;  
create extension varbitx;  

Create a tag table, and split it into segments, for example a table of 2 billion users may be split into 400 segments, with each segment having 50 million user BITs.

postgres=# create table t_bitmap (  
  tagid int,   -- Tag ID  
  ofid int,    -- Offset value multiplied by 50 million  
  v varbit     -- userid bit  
);  
CREATE TABLE  

Create an index (constraint)

create unique index idx_t_bitmap_1 on t_bitmap (tagid, ofid);  

Create BITMAP data for 1000 tags, with 400 pieces of data for each tag, and each piece of data having a length of 50 million bits.

postgres=# do language plpgsql 
$$
  
declare v varbit := repeat('1',5000000)::varbit;  
begin  
  for i in 1..100 loop  
    for x in 0..399 loop  
      insert into t_bitmap values (i, x, v);                            
    end loop;  
  end loop;  
end;  

$$
;  
  
  
DO  
Time: 150468.359 ms (02:30.468)  

Create a function that generates dblink connections (does not report errors for repeated creations).

create or replace function conn(  
  name,   -- dblink name  
  text    -- Connection string, URL  
) returns void as 
$$
    
declare    
begin    
  perform dblink_connect($1, $2);   
  return;    
exception when others then    
  return;    
end;    

$$
 language plpgsql strict;    

Parallel computing function for AND tag combinations (dblink asynchronous parallelism) returns the USERID bitcount.

create or replace function get_bitcount_and(  
  and_tagids int[],   -- Enters the tag ID array  
  v_bit int,          -- Determines the bitcount of 1 or 0  
  conn text,          -- Connection string
  OUT cnt int8        -- Returns value, the count of 1 or 0  
) returns setof int8 as 
$$
   
declare  
begin  
for i in 0..399 loop   -- Generates 400 links. There are exactly 400 links, because the length for each link is 50 million bits, and there are a total of 2 billion bits. Then LOOP  
  perform conn('link'||i,  conn);   -- Connects  
  perform dblink_get_result('link'||i);                        -- Consumes the result of the last asynchronous link, otherwise an error will be thrown.  
    
  -- Sends asynchronous DBLINK calls  
  -- One bit segment for each operation, returns the bitcount where the bit is either 0 or 1  
  perform dblink_send_query('link'||i, format('select bit_count(bit_and(v), %s) from t_bitmap where tagid = any (%L) and ofid=%s', v_bit, and_tagids, i));    
end loop;  
  
for i in 0..399 loop  
  -- Returns results of asynchronous calls, including all segments  
  return query SELECT * FROM dblink_get_result('link'||i) as t(cnt int8);  
end loop;  
end;  

$$
 language plpgsql strict;  

Parallel computing function for OR tag combinations (dblink asynchronous parallelism) returns the USERID bitcount.

create or replace function get_bitcount_or(  
  or_tagids int[],   
  v_bit int,   
  conn text,          -- Connection string
  OUT cnt int8  
) returns setof int8 as 
$$
   
declare  
begin  
for i in 0..399 loop   
  perform conn('link'||i,  conn);   
  perform dblink_get_result('link'||i);  
  perform dblink_send_query('link'||i, format('select bit_count(bit_or(v), %s) from t_bitmap where tagid = any (%L) and ofid=%s', v_bit, or_tagids, i));  
end loop;  
  
for i in 0..399 loop  
  return query SELECT * FROM dblink_get_result('link'||i) as t(cnt int8);  
end loop;  
end;  

$$
 language plpgsql strict;  

Parallel computing function for AND/OR tag combinations (dblink asynchronous parallelism) returns the USERID bitcount.

create or replace function get_bitcount_and_or(  
  and_tagids int[],   
  or_tagids int[],   
  v_bit int,   
  conn text,          -- Connection string
  OUT cnt int8  
) returns setof int8 as 
$$
   
declare  
begin  
for i in 0..399 loop   
  perform conn('link'||i,  conn);   
  perform dblink_get_result('link'||i);  
  perform dblink_send_query('link'||i, format('  
    with t1 as (select bit_and(v) b from t_bitmap where tagid = any (%L) and ofid=%s),   
         t2 as (select bit_or(v) b from t_bitmap where tagid = any (%L) and ofid=%s)   
    select bit_count(bitor(t1.b, t2.b), %s) from t1,t2',   
    and_tagids, i, or_tagids, i, v_bit));  
end loop;  
  
for i in 0..399 loop  
  return query SELECT * FROM dblink_get_result('link'||i) as t(cnt int8);  
end loop;  
end;  

$$
 language plpgsql strict;  
-- A more complex query, parameters of which can be modified based on your needs. This is rarely needed for actual business.  
-- (a and b andc or d) or (a and c) or (d and not b)..........  

Performance of bitcount pivoting is as follows: only 1.5 seconds for a 50-tag combination, and 2.6 seconds for a 100-tag combination:

It only takes 2.6 seconds to count the combination of 200 billion user_tags (one record for each user, and 1000 tags for each record)!

One tag:  
postgres=# select sum(cnt) from (select get_bitcount_and(array_agg(id),1,'dbname=postgres user=postgres') cnt from generate_series(1,1) t(id)) t;  
    sum       
------------  
 2000000000  
(1 row)  
  
Time: 791.392 ms  
  
A 10-tag combination:  
postgres=# select sum(cnt) from (select get_bitcount_and(array_agg(id),1,'dbname=postgres user=postgres') cnt from generate_series(1,10) t(id)) t;  
    sum       
------------  
 2000000000  
(1 row)  
  
Time: 847.427 ms  
  
A 50-tag combination:  
postgres=# select sum(cnt) from (select get_bitcount_and(array_agg(id),1,'dbname=postgres user=postgres') cnt from generate_series(1,50) t(id)) t;  
    sum       
------------  
 2000000000  
(1 row)  
  
Time: 1478.847 ms (00:01.479)  
  
A 100-tag combination:  
postgres=# select sum(cnt) from (select get_bitcount_and(array_agg(id),1,'dbname=postgres user=postgres') cnt from generate_series(1,100) t(id)) t;  
    sum       
------------  
 2000000000  
(1 row)  
  
Time: 2574.761 ms (00:02.575)  

The performance of the AND/OR combination is as follows, and is also very good:

postgres=# select sum(cnt) from (select get_bitcount_and_or(array_agg(case mod(id,2) when 0 then id end), array_agg(case mod(id,2) when 1 then id end), 1,'dbname=postgres user=postgres') cnt from generate_series(1,1) t(id)) t;  
 sum   
-----  
      
(1 row)  
  
Time: 854.934 ms  
postgres=# select sum(cnt) from (select get_bitcount_and_or(array_agg(case mod(id,2) when 0 then id end), array_agg(case mod(id,2) when 1 then id end), 1,'dbname=postgres user=postgres') cnt from generate_series(1,10) t(id)) t;  
    sum       
------------  
 2000000000  
(1 row)  
  
Time: 889.472 ms  
postgres=# select sum(cnt) from (select get_bitcount_and_or(array_agg(case mod(id,2) when 0 then id end), array_agg(case mod(id,2) when 1 then id end), 1,'dbname=postgres user=postgres') cnt from generate_series(1,50) t(id)) t;  
    sum       
------------  
 2000000000  
(1 row)  
  
Time: 1519.031 ms (00:01.519)  
postgres=# select sum(cnt) from (select get_bitcount_and_or(array_agg(case mod(id,2) when 0 then id end), array_agg(case mod(id,2) when 1 then id end), 1,'dbname=postgres user=postgres') cnt from generate_series(1,100) t(id)) t;  
    sum       
------------  
 2000000000  
(1 row)  
  
Time: 2597.701 ms (00:02.598)  

The AND function that determines USERID is as follows. In order to achieve high speed response, we return cursors.

create or replace function get_pos_and(  
  and_tagids int[],    -- Tag combination  
  v_bit int            -- Determines the bitcount of 1 or 0, returns the cursor. A cursor contains the ofid and the position subscript (of course, the translation action can also be done by the program, and then returns BIT and ofid)  
) returns setof refcursor as 
$$
  
declare  
  ref refcursor[];    -- Returns a cursor array  
  res refcursor;      -- Returns a cursor  
  sql text;           -- SQL text corresponding to the cursor, which is at the position of USERID  
begin  
  for x in 1..400 loop   -- Generates 400 cursor names  
    ref[x] := 'cur'||x;  
  end loop;  
  
  for i in 0..399 loop   
    -- Uses the offset value from 0 to 399, and then multiplies it by the 50 million coefficient.  
      
    -- Grants a name to a cursor  
    res := ref[i+1];   
    -- Generates the dynamic SQL statement for the cursor (ofid, bit position). Note that the bit position can be either left untranslated or be translated by the program. When program translation is used, use the in query dictionary after the translation is done  
    -- select uid from uid_mapping where pos in (pos_array);  
    -- 100 million, in 1 million, 380 ms  
    --[HTAP database PostgreSQL scenarios and performance tests—No. 25 (OLTP) IN, EXISTS query](201711/20171107_26.md)    
    sql := format('select %s, bit_posite(bit_and(v), %s, true) from t_bitmap where tagid = any (%L) and ofid=%s', i, v_bit, and_tagids, i);   
    -- Opens a cursor  
    open res for execute sql ;  
    -- Returns a cursor  
    return next res;  
  end loop;  
end;  

$$
 language plpgsql strict;  

The OR function that determines USERID is as follows. In order to achieve high speed response, we return cursors.

create or replace function get_pos_or(  
  or_tagids int[],   
  v_bit int   
) returns setof refcursor as 
$$
  
declare  
  ref refcursor[];    
  res refcursor;   
  sql text;  
begin  
  for x in 1..400 loop  
    ref[x] := 'cur'||x;  
  end loop;  
  
  for i in 0..399 loop   
    res := ref[i+1];   
    sql := format('select %s, bit_posite(bit_or(v), %s, true) from t_bitmap where tagid = any (%L) and ofid=%s', i, v_bit, or_tagids, i);  
    open res for execute sql ;  
    return next res;  
  end loop;  
end;  

$$
 language plpgsql strict;  

The AND_OR function that determines USERID is as follows. In order to achieve high speed response, we return cursors.

create or replace function get_pos_and_or(  
  and_tagids int[],   
  or_tagids int[],   
  v_bit int   
) returns setof refcursor as 
$$
  
declare  
  ref refcursor[];    
  res refcursor;   
  sql text;  
begin  
  for x in 1..400 loop  
    ref[x] := 'cur'||x;  
  end loop;  
  
  for i in 0..399 loop   
    res := ref[i+1];   
    sql := format('with t1 as  
                   (select bit_and(v) v from t_bitmap where tagid = any (%L) and ofid=%s),  
                   t2 as  
                   (select bit_or(v) v from t_bitmap where tagid = any (%L) and ofid=%s)  
                   select %s, bit_posite(bitor(t1.v, t2.v), %s, true) from t1,t2',   
                  and_tagids, i, or_tagids, i, i, v_bit);  
    open res for execute sql ;  
    return next res;  
  end loop;  
end;  

$$
 language plpgsql strict;  

The following example very quickly determines the USERID in only 88 ms.

postgres=# begin;  
BEGIN  
Time: 0.031 ms  
postgres=# select * from get_pos_and_or(array[1,2,3], array[4,5,6], 1);  
 get_pos_and_or   
----------------  
 cur1  
 cur2  
 cur3  
 cur4  
 cur5  
 cur6  
 cur7  
 ....  
 cur399  
 cur400  
(400 rows)  
  
Time: 88.069 ms  

It takes only 692 ms to determine the cursor values for 50 million IDs:

fetch 1 from cur1;  
Time: 692.408 ms  

If bit position translation is done on the client, then we only need to get the resulting bitmap, which is much faster. It only takes 224 ms to return a bitmap for 50 million bits. This can be made concurrent, so that each client can get different ofids.

CREATE OR REPLACE FUNCTION public.get_pos_and(and_tagids integer[])
 RETURNS SETOF refcursor
 LANGUAGE plpgsql
 STRICT
AS $function$
declare
  ref refcursor[];  
  res refcursor; 
  sql text;
begin
  for x in 1..400 loop
    ref[x] := 'cur'||x;
  end loop;

  for i in 0..399 loop 
    res := ref[i+1]; 
    -- sql := format('select %s, bit_posite(bit_and(v), %s, true) from t_bitmap where tagid = any (%L) and ofid=%s', i, v_bit, and_tagids, i);
    sql := format('select %s, bit_and(v) from t_bitmap where tagid = any (%L) and ofid=%s', i, and_tagids, i);
    open res for execute sql ;
    return next res;
  end loop;
end;
$function$;


postgres=# \timing
Timing is on.
postgres=# begin;
BEGIN
Time: 0.045 ms
postgres=# select get_pos_and(array_agg(id)) from generate_series(1,100) t(id);
 get_pos_and 
-------------
 cur1
 cur2
 cur3
...
 cur397
 cur398
 cur399
 cur400
(400 rows)


fetch 1 from cur1;
Time: 224.776 ms

We can also use the BIT operation to get USERIDs for users that contain a certain tag but do not contain another.

Example:

Users that contain b1, but do not contain b2

postgres=# select b1 & bitxor(b1,b2) from (values (bit'11001100', bit'11000001')) as t(b1,b2);
 ?column? 
----------
 00001100
(1 row)
  
In order to use this method, we can simply add a new UDF.

Summary

varbitx, an extension provided by Alibaba Cloud ApsaraDB for RDS PostgreSQL, enables real-time tagging for trillions of USER_TAGs using a single RDS PG.

We can use BITMAP segmentation, DBLINK asynchronous query, cursor, and other technologies to improve performance.

Performance indicators:

  1. Response speed of 2.6 seconds for determining the count of users that meet the tag conditions for 200 billion (2 billion users, a 100-tag combination) USER_IDs.
  2. It only takes 692 ms to determine USERID details, and return BIT positions of 50 million user IDs.
  3. It only takes 224 ms to determine USERID details and return a 50 million bit BITMAP.
1 1 1
Share on

digoal

56 posts | 2 followers

You may also like

Comments

Raja_KT March 15, 2019 at 1:53 pm

Good one. But you have not shared the infra for the set up. Oracle has procedure, functions, packages, varray, composite data type..... and if those test cases can be covered , with complex joins, nesting , then it will be more comprehensive.