Tablestore: Split data in a table into several logical splits whose sizes are approximately the same as the specified value

Last Updated: Aug 19, 2024

You can call the ComputeSplitPointsBySize operation to logically split the data in a table into several splits whose sizes are close to a specified value. The operation returns the split points between the splits and information about the hosts on which the splits reside. In most cases, compute engines use this operation to determine execution plans, such as concurrency plans.

Note

For more information, see ComputeSplitPointsBySize.

Prerequisites

API operation

    /**
     * Logically split data in a table into several splits whose sizes are close to the specified size, and return the split points between the splits and information about the hosts on which the splits reside.
     * In most cases, this operation is used to determine execution plans such as concurrency plans for compute engines. 
     * @api
     * @param [] $request The request parameters. 
     * @return [] The response. 
     * @throws OTSClientException The exception that is thrown when a parameter error occurs or the Tablestore server returns a verification error. 
     * @throws OTSServerException The exception that is thrown when the Tablestore server returns an error. 
     */
    public function computeSplitPointsBySize(array $request)
            

Parameters

Request information

Request parameters

Parameter

Description

table_name

The name of the data table.

split_size

The approximate size of each split.

Unit: 100 MB. For example, a value of 1 specifies splits of approximately 100 MB.
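
Because split_size is expressed in units of 100 MB, a target size in megabytes has to be converted before it is passed in. The following is a minimal sketch of that arithmetic. The helper name splitSizeFromMegabytes is hypothetical and not part of the SDK, and rounding up to a whole unit with a minimum of 1 is an assumption rather than documented behavior.

```php
<?php
// Hypothetical helper (not part of the Tablestore SDK): converts a target
// split size in MB into the split_size unit of 100 MB.
// Assumption: round up to a whole unit and never return less than 1.
function splitSizeFromMegabytes(int $megabytes): int
{
    if ($megabytes <= 0) {
        throw new InvalidArgumentException('The target size must be positive.');
    }
    // Integer ceiling division by 100.
    return max(1, intdiv($megabytes + 99, 100));
}

echo splitSizeFromMegabytes(100), "\n"; // prints 1
echo splitSizeFromMegabytes(250), "\n"; // prints 3
```

The returned value could then be used as the split_size request parameter.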

Request syntax

$result = $client->computeSplitPointsBySize([
    'table_name' => '<string>', // The name of the data table. This parameter is required.
    'split_size' => <integer>   // The approximate size of each split, in units of 100 MB. This parameter is required.
]);

Response information

Response parameters

Parameter

Description

consumed

The number of capacity units (CUs) that are consumed by this operation.

capacity_unit indicates the number of read and write CUs that are consumed.

  • read: the read throughput.

  • write: the write throughput.

primary_key_schema

The schema of the primary key for the data table, which is the same as the schema that is specified when the data table is created.

splits

The splits and the split points between them. Each split contains the following fields:

  • lower_bound: the minimum value in the range of the primary key.

    The lower_bound value can be passed to GetRange to read data by range.

    • Each item contains the primary key column name, the primary key value (PrimaryKeyValue), and the primary key type (PrimaryKeyType), in that order.

    • PrimaryKeyType can be one of the following values: PrimaryKeyTypeConst::CONST_INTEGER (an INTEGER value), PrimaryKeyTypeConst::CONST_STRING (a UTF-8 encoded string), PrimaryKeyTypeConst::CONST_BINARY (a BINARY value), PrimaryKeyTypeConst::CONST_INF_MIN (INF_MIN, negative infinity), or PrimaryKeyTypeConst::CONST_INF_MAX (INF_MAX, positive infinity).

  • upper_bound: the maximum value in the range of the primary key. The format of upper_bound is the same as that of lower_bound.

    The upper_bound value can be passed to GetRange to read data by range.

  • location: the host on which the split resides. The value of this parameter can be empty.
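
As a rough illustration of passing the bounds to GetRange, the sketch below turns one entry of splits into a request array. The helper name buildRangeRequest is hypothetical. The request keys (inclusive_start_primary_key, exclusive_end_primary_key, direction, max_versions, and limit) are assumed from the GetRange operation of the PHP SDK and should be checked against its reference. The direction is supplied by the caller, and the type strings in the sample data are placeholders, so that the block runs without the SDK.

```php
<?php
// Hypothetical helper (not part of the SDK): builds a GetRange request that
// scans one split taken from the 'splits' response parameter.
function buildRangeRequest(string $tableName, array $split, string $direction, int $limit = 100): array
{
    return [
        'table_name' => $tableName,
        'direction' => $direction,
        // Assumed GetRange keys: the scan starts at lower_bound and stops
        // before upper_bound.
        'inclusive_start_primary_key' => $split['lower_bound'],
        'exclusive_end_primary_key' => $split['upper_bound'],
        'max_versions' => 1,
        'limit' => $limit,
    ];
}

// Sample split in the documented shape. The type strings stand in for
// PrimaryKeyTypeConst values and are placeholders for illustration only.
$split = [
    'lower_bound' => [['PK0', null, 'INF_MIN']],
    'upper_bound' => [['PK0', 1000, 'INTEGER']],
    'location' => '',
];

// 'FORWARD' is a placeholder for the SDK's forward-direction constant.
$request = buildRangeRequest('MyTable', $split, 'FORWARD');
print_r($request);
```

With the SDK loaded, the resulting array would presumably be passed to the client's GetRange call, one request per split, which is what lets a compute engine read the splits in parallel.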

Response syntax

[
    'consumed' => [
        'capacity_unit' => [
            'read' => <integer>,
            'write' => <integer>
        ]
    ],
    'primary_key_schema' => [
        ['<string>', <PrimaryKeyType>],
        ['<string>', <PrimaryKeyType>, <PrimaryKeyOption>]
    ],
    'splits' => [
        [ 
            'lower_bound' => [
                ['<string>', <PrimaryKeyValue>, <PrimaryKeyType>],
                ['<string>', <PrimaryKeyValue>, <PrimaryKeyType>]
            ],
            'upper_bound' => [
                ['<string>', <PrimaryKeyValue>, <PrimaryKeyType>],
                ['<string>', <PrimaryKeyValue>, <PrimaryKeyType>]
            ],
            'location' => '<string>'
        ],
        // ...
    ]
]           

Examples

The following sample code shows how to logically split the data in a table into multiple splits whose sizes are close to 100 MB:

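    // Request splits of approximately 100 MB each (split_size is in units of 100 MB).
    $result = $client->computeSplitPointsBySize([
        'table_name' => 'MyTable',
        'split_size' => 1
    ]);
    // Print the host and the primary key range of each split.
    foreach ($result['splits'] as $split) {
        print_r($split['location']);
        print_r($split['lower_bound']);
        print_r($split['upper_bound']);
    }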
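
Since the operation is typically used to plan concurrency, one possible next step is to group the returned splits by host. The sketch below is an assumption about how a caller might do this, not part of the SDK. The helper name groupSplitsByLocation, the 'unknown' bucket for empty locations, and the sample host names are all hypothetical.

```php
<?php
// Hypothetical helper (not part of the SDK): groups splits by the host
// reported in 'location' so that splits on the same host can be scheduled together.
function groupSplitsByLocation(array $splits): array
{
    $groups = [];
    foreach ($splits as $split) {
        // 'location' can be empty, so such splits go into an 'unknown' bucket.
        $host = ($split['location'] ?? '') !== '' ? $split['location'] : 'unknown';
        $groups[$host][] = $split;
    }
    return $groups;
}

// Stand-in for $result['splits']; the host names are made up for illustration.
$splits = [
    ['location' => 'host-a', 'lower_bound' => [], 'upper_bound' => []],
    ['location' => 'host-a', 'lower_bound' => [], 'upper_bound' => []],
    ['location' => '',       'lower_bound' => [], 'upper_bound' => []],
];

foreach (groupSplitsByLocation($splits) as $host => $group) {
    echo $host, ': ', count($group), " split(s)\n";
}
// Prints:
// host-a: 2 split(s)
// unknown: 1 split(s)
```

A scheduler could then assign one worker per split, or per host, depending on how much parallelism the compute engine allows.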