ShardingSphere is an open source ecosystem that consists of a set of distributed database middleware solutions.

Prerequisites

All PostgreSQL versions used with ApsaraDB for RDS support ShardingSphere.

Background information

ApsaraDB RDS for PostgreSQL supports database-integrated sharding plug-ins (for example, Citus, Postgres-XC, and AntDB) and massively parallel processing (MPP) products. It also supports sharding middleware products similar to those widely used in MySQL. ShardingSphere is one of such sharding middleware products.

ShardingSphere is suitable for services that run in databases with thorough, well-organized logical sharding. It offers the following features:

  • Data sharding
    • Database sharding and table sharding
    • Read/write splitting
    • Sharding strategy customization
    • Centerless distributed primary key
  • Distributed transaction
    • Unified transaction API
    • XA transaction
    • BASE transaction
  • Database orchestration
    • Dynamic configuration
    • Orchestration and governance
    • Data encryption
    • Tracing and observability
    • Elastic scaling out (planning)

For more information, see the ShardingSphere documentation.

ShardingSphere products

ShardingSphere includes three independent products. You can choose the product that best meets your business needs. The following table describes these products.

Category Sharding-JDBC Sharding-Proxy Sharding-Sidecar
Supported database engine All JDBC-compatible database engines such as MySQL, PostgreSQL, Oracle, and SQL Server MySQL and PostgreSQL MySQL and PostgreSQL
Connections consumed High Low High
Supported heterogeneous language Java All All
Performance Low consumption Moderate consumption Low consumption
Centerless Yes No Yes
Stateless API No Yes No

Prepare configuration templates

  1. On your ECS instance, run the following commands to go to the directory where configuration templates are stored.
    cd apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin  
    cd conf
  2. Run the following command to view all files stored in the directory:
    # ll  
    total 24  
    -rw-r--r-- 1 501 games 3019 Jul 30  2019 config-encrypt.yaml  
    -rw-r--r-- 1 501 games 3582 Apr 22  2019 config-master_slave.yaml  
    -rw-r--r-- 1 501 games 4278 Apr 22  2019 config-sharding.yaml  
    -rw-r--r-- 1 501 games 1918 Jul 30  2019 server.yaml
    Note
    • config-encrypt.yaml: the data encryption configuration file.
    • config-master_slave.yaml: the read/write splitting configuration file.
    • config-sharding.yaml: the data sharding configuration file.
    • server.yaml: the common configuration file.
  3. Modify the configuration files.
    Note For more information about the configuration files, see the ShardingSphere documentation. This topic only describes how to modify the data sharding and common configuration files.
    • Example data sharding configuration file
      schemaName: # The name of the logical data source.  
      
      dataSources: # The configuration of the data source. You can configure more than one data source by using the data_source_name element. 
        <data_source_name>: # You do not need to configure a database connection pool. This is different in Sharding-JDBC.
          url: # The URL used to connect to your database.  
          username: # The username used to log on to the database.  
          password: # The password used to log on to the database.  
          connectionTimeoutMilliseconds: 30000 # The connection timeout duration in milliseconds.
          idleTimeoutMilliseconds: 60000 # The idle-connection reclaiming timeout duration in milliseconds.
          maxLifetimeMilliseconds: 1800000 # The maximum connection time to live (TTL) in milliseconds.
          maxPoolSize: 65 # The maximum number of connections allowed.
      
      shardingRule: # You do not need to configure a sharding rule, because it is the same in Sharding-JDBC.
    • Example common configuration file
      Proxy properties  
      # You do not need to configure proxy properties that are the same in Sharding-JDBC  
      
      props:  
        acceptor.size: # The number of worker threads that receive requests from the client. The default number is equal to the number of CPU cores multiplied by 2.  
        proxy.transaction.type: # The type of transaction processed by the proxy. Valid values: LOCAL | XA | BASE. Default value: LOCAL. Value XA specifies to use Atomikos as the transaction manager. Value BASE specifies to copy the .jar package that implements the ShardingTransactionManager API to the lib directory.  
        proxy.opentracing.enabled: # Specifies whether to enable link tracing. Link tracing is disabled by default.
        check.table.metadata.enabled: # Specifies whether to check the consistency of metadata among sharding tables during startup. Default value: false.
        proxy.frontend.flush.threshold: # The number of packets returned in a batch during a complex query.
      
      
      Permission verification  
      This part of the configuration is used to verify your permissions when you attempt to log on to Sharding-Proxy. After you configure the username, password, and authorized databases, you must use the correct username and password to log on to Sharding-Proxy from the authorized databases.  
      
      authentication:  
        users:  
          root: # The username of the root user.  
            password: root # The password of the root user.  
          sharding: # The username of the sharding user.  
            password: sharding # The password of the sharding user.  
            authorizedSchemas: sharding_db, masterslave_db # The databases in which the specified user is authorized. If you want to specify more than one database, separate them with commas (,). You are granted the credentials of the root user by default. This means that you can access all databases.

Set up a test environment

  • On your ECS instance, install Java.
    yum install -y java
  • Configure an ApsaraDB for RDS instance that runs PostgreSQL 12.
    • Create an account with username r1.
    • Set the password of the account to "PW123321!".
    • Creates the following databases whose owners are user r1: db0, db1, db2, and db3.
    • Add the IP address of your ECS instance to an IP address whitelist of the ApsaraDB RDS for PostgreSQL instance.
    Note
    Create an ApsaraDB RDS for PostgreSQL instance
  • Configure the common configuration file.
    vi /root/apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin/conf/server.yaml  
    
    authentication:  
      users:  
        r1:  
          password: PW123321!  
          authorizedSchemas: db0,db1,db2,db3  
    props:  
      executor.size: 16  
      sql.show: false

Test horizontal sharding

  1. Modify the data sharding configuration file.
    vi /root/apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin/conf/config-sharding.yaml  
    
    
    schemaName: sdb
    
    dataSources:
      db0:
        url: jdbc:postgresql://pgm-bpxxxxx.pg.rds.aliyuncs.com:1433/db0
        username: r1
        password: PW123321!
        connectionTimeoutMilliseconds: 30000
        idleTimeoutMilliseconds: 60000
        maxLifetimeMilliseconds: 1800000
        maxPoolSize: 65
      db1:
        url: jdbc:postgresql://pgm-bpxxxxx.pg.rds.aliyuncs.com:1433/db1
        username: r1
        password: PW123321!
        connectionTimeoutMilliseconds: 30000
        idleTimeoutMilliseconds: 60000
        maxLifetimeMilliseconds: 1800000
        maxPoolSize: 65
      db2:
        url: jdbc:postgresql://pgm-bpxxxxx.pg.rds.aliyuncs.com:1433/db2
        username: r1
        password: PW123321!
        connectionTimeoutMilliseconds: 30000
        idleTimeoutMilliseconds: 60000
        maxLifetimeMilliseconds: 1800000
        maxPoolSize: 65
      db3:
        url: jdbc:postgresql://pgm-bpxxxxx.pg.rds.aliyuncs.com:1433/db3
        username: r1
        password: PW123321!
        connectionTimeoutMilliseconds: 30000
        idleTimeoutMilliseconds: 60000
        maxLifetimeMilliseconds: 1800000
        maxPoolSize: 65
    
    shardingRule:
      tables:
        t_order:
          actualDataNodes: db${0..3}.t_order${0..7}
          databaseStrategy:
            inline:
              shardingColumn: user_id
              algorithmExpression: db${user_id % 4}
          tableStrategy:
            inline:
              shardingColumn: order_id
              algorithmExpression: t_order${order_id % 8}
          keyGenerator:
            type: SNOWFLAKE
            column: order_id
        t_order_item:
          actualDataNodes: db${0..3}.t_order_item${0..7}
          databaseStrategy:
            inline:
              shardingColumn: user_id
              algorithmExpression: db${user_id % 4}
          tableStrategy:
            inline:
              shardingColumn: order_id
              algorithmExpression: t_order_item${order_id % 8}
          keyGenerator:
            type: SNOWFLAKE
            column: order_item_id
      bindingTables:
        - t_order,t_order_item
      defaultTableStrategy:
        none:
  2. Start ShardingSphere and listen to Port 8001.
    cd ~/apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin/bin/
    ./start.sh 8001
  3. Connect to the destination database.
    psql -h 127.0.0.1 -p 8001 -U r1 sdb
  4. Create a table.
    create table t_order(order_id int8 primary key, user_id int8, info text, c1 int, crt_time timestamp);  
    create table t_order_item(order_item_id int8 primary key, order_id int8, user_id int8, info text, c1 int, c2 int, c3 int, c4 int, c5 int, crt_time timestamp);
    Note When you create a table, the system automatically creates horizontal shards in the destination database based on the sharding strategy you specify. Table sharding succeeded

More

  • If you want to know the SQL parsing and routing statements used in ShardingSphere, run the following command:
    vi /root/apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin/conf/server.yaml  
    
    authentication:  
      users:  
        r1:  
          password: PW123321!  
          authorizedSchemas: db0,db1,db2,db3  
    props:  
      executor.size: 16  
      sql.show: true  # Specifies to log parsed SQL statements.
  • If you want to test writes and queries, run the following commands:
    sdb=> insert into t_order (user_id, info, c1, crt_time) values (0,'a',1,now());  
    
    sdb=> insert into t_order (user_id, info, c1, crt_time) values (1,'b',2,now());  
    
    sdb=> insert into t_order (user_id, info, c1, crt_time) values (2,'c',3,now());  
    
    sdb=> insert into t_order (user_id, info, c1, crt_time) values (3,'c',4,now());  
    
    
    sdb=> select * from t_order;  
          order_id      | user_id | info | c1 |          crt_time            
    --------------------+---------+------+----+----------------------------  
     433352561047633921 |       0 | a    |  1 | 2020-02-09 19:48:21.856555  
     433352585668198400 |       1 | b    |  2 | 2020-02-09 19:48:27.726815  
     433352610813050881 |       2 | c    |  3 | 2020-02-09 19:48:33.721754  
     433352628370407424 |       3 | c    |  4 | 2020-02-09 19:48:37.907683  
    (4 rows)  
    
    sdb=> select * from t_order where user_id=1;  
          order_id      | user_id | info | c1 |          crt_time            
    --------------------+---------+------+----+----------------------------  
     433352585668198400 |       1 | b    |  2 | 2020-02-09 19:48:27.726815  
    (1 row)
  • If you want to view ShardingSphere logs, run the following command:
    /root/apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin/logs/stdout.log
  • If you want to use pgbench for stress testing, run the following commands:
    vi test.sql  
    \set user_id random(1,100000000)  
    \set order_id random(1,2000000000)  
    \set order_item_id random(1,2000000000)  
    insert into t_order (user_id, order_id, info, c1 , crt_time) values (:user_id, :order_id,random()::text, random()*1000, now()) on conflict (order_id) do update set info=excluded.info,c1=excluded.c1,crt_time=excluded.crt_time;   
    insert into t_order_item (order_item_id, user_id, order_id, info, c1,c2,c3,c4,c5,crt_time) values (:order_item_id, :user_id,:order_id,random()::text, random()*1000,random()*1000,random()*1000,random()*1000,random()*1000, now()) on conflict(order_item_id) do nothing;  
    
    pgbench -M simple -n -r -P 1 -f ./test.sql -c 24 -j 24  -h 127.0.0.1 -p 8001 -U r1 sdb -T 120  
    progress: 1.0 s, 1100.9 tps, lat 21.266 ms stddev 6.349  
    progress: 2.0 s, 1253.0 tps, lat 18.779 ms stddev 7.913  
    progress: 3.0 s, 1219.0 tps, lat 20.083 ms stddev 13.212