ShardingSphere is an open source ecosystem that consists of a set of distributed database middleware solutions.

Prerequisites

All PostgreSQL versions that are used with ApsaraDB RDS support ShardingSphere.

Background information

ShardingSphere is suitable for services that run in databases with thorough, well-organized logical sharding. ShardingSphere provides the following features:

  • Data sharding
    • Database sharding and table sharding
    • Read/write splitting
    • Sharding strategy customization
    • Centerless distributed primary key
  • Distributed transaction
    • Unified transaction API
    • XA transaction
    • BASE transaction
  • Database orchestration
    • Dynamic configuration
    • Orchestration and governance
    • Data encryption
    • Tracing and observability
    • Elastic scaling out (planning)

For more information, see the ShardingSphere documentation.

ShardingSphere products

ShardingSphere includes three independent products. You can choose the product that best meets your business requirements. The following table describes these products.

CategorySharding-JDBCSharding-ProxySharding-Sidecar
Supported database engineAll JDBC-compatible database engines such as MySQL, PostgreSQL, Oracle, and SQL ServerMySQL and PostgreSQLMySQL and PostgreSQL
Number of connections consumedHighLowHigh
Supported heterogeneous languageJavaNot limitedNot limited
Impact on performanceLowModerateLow
CenterlessSupportedNot supportedSupported
Stateless APINot supportedSupportedNot supported

Modify configuration files

  1. On your Elastic Compute Service (ECS) instance, run the following commands to go to the directory in which the configuration files are stored.
    cd apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin  
    cd conf
  2. Run the ll command to view all files that are stored in the directory.
    The information similar to the following command output is displayed:
    total 24  
    -rw-r--r-- 1 501 games 3019 Jul 30  2019 config-encrypt.yaml  
    -rw-r--r-- 1 501 games 3582 Apr 22  2019 config-master_slave.yaml  
    -rw-r--r-- 1 501 games 4278 Apr 22  2019 config-sharding.yaml  
    -rw-r--r-- 1 501 games 1918 Jul 30  2019 server.yaml
    Note
    • config-encrypt.yaml: the data encryption configuration file.
    • config-master_slave.yaml: the read/write splitting configuration file.
    • config-sharding.yaml: the data sharding configuration file.
    • server.yaml: the common configuration file.
  3. Modify the configuration files.
    Note For more information about the configuration files, see the ShardingSphere documentation. This topic only describes how to modify the data sharding configuration file and the common configuration file.
    • Example on modifying the data sharding configuration file:
      schemaName: #The name of the logical data source.  
      
      dataSources: #The configuration of the data source. You can configure more than one data source by using the data_source_name element.  
        <data_source_name>: #You do not need to configure a database connection pool. This is different in Sharding-JDBC. 
          url: #The URL that is used to connect to your RDS instance.  
          username: #The username that is used to log on to your RDS instance.  
          password: #The password that is used to log on to your RDS instance.  
          connectionTimeoutMilliseconds: 30000 #The connection timeout period, which is measured in milliseconds.
          idleTimeoutMilliseconds: 60000 #The idle-connection reclaiming timeout period, which is measured in milliseconds.
          maxLifetimeMilliseconds: 1800000 #The maximum connection time to live (TTL), which is measured in milliseconds.
          maxPoolSize: 65 #The maximum number of connections that are allowed.
      
      shardingRule: #You do not need to configure a sharding rule. The sharding rule is the same as the sharding rule in Sharding-JDBC. 
    • Example on modifying the common configuration file:
      • Proxy properties
        #You do not need to configure the proxy properties that you can find in Sharding-JDBC.  
        
        props:  
          acceptor.size: #The number of worker threads that receive requests from the client. The default number is equal to the number of cores multiplied by 2.   
          proxy.transaction.type: #The type of transaction processed by the proxy. Valid values: LOCAL, XA, and BASE. Default value: LOCAL. The value XA specifies to use Atomikos as the transaction manager. The value BASE specifies to copy the .jar package that implements the ShardingTransactionManager operation to the lib directory.   
          proxy.opentracing.enabled: #Specifies whether to enable the link tracing feature. By default, this feature is disabled. 
          check.table.metadata.enabled: #Specifies whether to check the consistency of metadata among the shaded tables during startup. Default value: false. 
          proxy.frontend.flush.threshold: #The number of packets that are returned in a batch during a complex query. 
      • Permission verification
        This part of the configuration is used to verify your permissions when you attempt to log on to Sharding-Proxy. After you configure the username, password, and authorized databases, you must use the correct username and password to log on to Sharding-Proxy from the authorized databases.   
        
        authentication:  
          users:  
            username: # The username.  
              password: **** # The password.  
            sharding: # The username.  
              password: sharding# The password of the sharding user.  
              authorizedSchemas: sharding_db, masterslave_db # The databases on which the specified user has permissions. If you want to specify more than one database, separate them with commas (,). You are granted the permissions of the root user by default. This means that you can access all databases. 

Set up a test environment

  • On your ECS instance, install a database client that runs Java.
    sodu yum install -y java
  • Configure your RDS instance, which runs PostgreSQL 12.
    • Create an account whose username is r1.
    • Set the password of the account to "PW123321!".
    • Create the following databases whose owners are user r1: db0, db1, db2, and db3.
    • Add the IP address of your ECS instance to an IP address whitelist of the RDS instance.
    Note
  • Run the vi command on the common configuration file.
    vi /home/apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin/conf/server.yaml  
  • Configure the common configuration file based on the following example:
    authentication:  
      users:  
        r1:  
          password: PW123321!  
          authorizedSchemas: db0,db1,db2,db3  
    props:  
      executor.size: 16  
      sql.show: false

Test horizontal sharding

  1. Modify the data sharding configuration file.
    1. Run the vi command to open the common configuration file.
      vi /home/apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin/conf/config-sharding.yaml
    2. Configure the common configuration file based on the following example:
      schemaName: sdb
      
      dataSources:
        db0:
          url: jdbc:postgresql://pgm-bpxxxxx.pg.rds.aliyuncs.com:1433/db0
          username: r1
          password: PW123321!
          connectionTimeoutMilliseconds: 30000
          idleTimeoutMilliseconds: 60000
          maxLifetimeMilliseconds: 1800000
          maxPoolSize: 65
        db1:
          url: jdbc:postgresql://pgm-bpxxxxx.pg.rds.aliyuncs.com:1433/db1
          username: r1
          password: PW123321!
          connectionTimeoutMilliseconds: 30000
          idleTimeoutMilliseconds: 60000
          maxLifetimeMilliseconds: 1800000
          maxPoolSize: 65
        db2:
          url: jdbc:postgresql://pgm-bpxxxxx.pg.rds.aliyuncs.com:1433/db2
          username: r1
          password: PW123321!
          connectionTimeoutMilliseconds: 30000
          idleTimeoutMilliseconds: 60000
          maxLifetimeMilliseconds: 1800000
          maxPoolSize: 65
        db3:
          url: jdbc:postgresql://pgm-bpxxxxx.pg.rds.aliyuncs.com:1433/db3
          username: r1
          password: PW123321!
          connectionTimeoutMilliseconds: 30000
          idleTimeoutMilliseconds: 60000
          maxLifetimeMilliseconds: 1800000
          maxPoolSize: 65
      
      shardingRule:
        tables:
          t_order:
            actualDataNodes: db${0..3}.t_order${0..7}
            databaseStrategy:
              inline:
                shardingColumn: user_id
                algorithmExpression: db${user_id % 4}
            tableStrategy:
              inline:
                shardingColumn: order_id
                algorithmExpression: t_order${order_id % 8}
            keyGenerator:
              type: SNOWFLAKE
              column: order_id
          t_order_item:
            actualDataNodes: db${0..3}.t_order_item${0..7}
            databaseStrategy:
              inline:
                shardingColumn: user_id
                algorithmExpression: db${user_id % 4}
            tableStrategy:
              inline:
                shardingColumn: order_id
                algorithmExpression: t_order_item${order_id % 8}
            keyGenerator:
              type: SNOWFLAKE
              column: order_item_id
        bindingTables:
          - t_order,t_order_item
        defaultTableStrategy:
          none:
  2. Start ShardingSphere and listen to port 8001.
    cd ~/apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin/bin/
    ./start.sh 8001
  3. Connect to a database whose owner is user r1.
    psql -h 127.0.0.1 -p 8001 -U r1 sdb
  4. Create a table.
    create table t_order(order_id int8 primary key, user_id int8, info text, c1 int, crt_time timestamp);  
    create table t_order_item(order_item_id int8 primary key, order_id int8, user_id int8, info text, c1 int, c2 int, c3 int, c4 int, c5 int, crt_time timestamp);
    Note When you create a table, the system automatically creates horizontal shards in the specific database based on the sharding strategy that you specify.

More

  • If you want to know the SQL parsing and routing statements that are used in ShardingSphere, perform the following operations:
    • Run the vi command to open the common configuration file.
      vi /home/apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin/conf/server.yaml
    • Configure the common configuration file based on the following example:
      authentication:  
        users:  
          r1:  
            password: PW123321!  
            authorizedSchemas: db0,db1,db2,db3  
      props:  
        executor.size: 16  
        sql.show: true # Specifies to log the SQL statements that are parsed.
  • If you want to test writes and queries, run the following commands:
    • Write examples
      insert into t_order (user_id, info, c1, crt_time) values (0,'a',1,now());  
      insert into t_order (user_id, info, c1, crt_time) values (1,'b',2,now());  
      insert into t_order (user_id, info, c1, crt_time) values (2,'c',3,now());  
      insert into t_order (user_id, info, c1, crt_time) values (3,'c',4,now());  
    • Query example 1
      select * from t_order;
      Sample result
            order_id      | user_id | info | c1 |          crt_time            
      --------------------+---------+------+----+----------------------------  
       433352561047633921 |       0 | a    |  1 | 2020-02-09 19:48:21.856555  
       433352585668198400 |       1 | b    |  2 | 2020-02-09 19:48:27.726815  
       433352610813050881 |       2 | c    |  3 | 2020-02-09 19:48:33.721754  
       433352628370407424 |       3 | c    |  4 | 2020-02-09 19:48:37.907683  
      (4 rows)  
    • Query example 2
      sdb=> select * from t_order where user_id=1;
      Sample result
           order_id      | user_id | info | c1 |          crt_time            
      --------------------+---------+------+----+----------------------------  
       433352585668198400 |       1 | b    |  2 | 2020-02-09 19:48:27.726815  
      (1 row)
  • If you want to view ShardingSphere logs, go to the following path:
    /home/apache-shardingsphere-incubating-4.0.0-sharding-proxy-bin/logs/stdout.log
  • If you want to use the pgbench stress testing, perform the following operations:
    1. Create a file named test sql and open the file.
      vi test.sql
    2. Configure the commands for the pgbench stress testing based on the following example:
      \set user_id random(1,100000000)  
      \set order_id random(1,2000000000)  
      \set order_item_id random(1,2000000000)  
      insert into t_order (user_id, order_id, info, c1 , crt_time) values (:user_id, :order_id,random()::text, random()*1000, now()) on conflict (order_id) do update set info=excluded.info,c1=excluded.c1,crt_time=excluded.crt_time;   
      insert into t_order_item (order_item_id, user_id, order_id, info, c1,c2,c3,c4,c5,crt_time) values (:order_item_id, :user_id,:order_id,random()::text, random()*1000,random()*1000,random()*1000,random()*1000,random()*1000, now()) on conflict(order_item_id) do nothing;  
      
      pgbench -M simple -n -r -P 1 -f ./test.sql -c 24 -j 24  -h 127.0.0.1 -p 8001 -U r1 sdb -T 120  
      progress: 1.0 s, 1100.9 tps, lat 21.266 ms stddev 6.349  
      progress: 2.0 s, 1253.0 tps, lat 18.779 ms stddev 7.913  
      progress: 3.0 s, 1219.0 tps, lat 20.083 ms stddev 13.212