Tablestore: Split data for parallel scans

Last Updated: Jun 25, 2026

To scan an entire table in parallel from a compute engine, divide the data into sub-ranges that can be processed concurrently. The Tablestore SDK for Java can compute such primary key range splits at approximately the size you specify. Pass the range of each split to the range read APIs to retrieve the data concurrently.

Prerequisites

Description

public ComputeSplitsBySizeResponse computeSplitsBySize(ComputeSplitsBySizeRequest request) throws TableStoreException, ClientException

On the server side, Tablestore logically divides the table into splits of approximately the specified size. Each split is returned with its primary key range (lowerBound and upperBound) and a location value that hints at the machine hosting the split. Pass each primary key range directly to RangeRowQueryCriteria, then read the splits in parallel with getRange (see Read data by range) or createRangeIterator (see Read data with an iterator). Compute engines typically use this pattern to plan execution parallelism.
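The fan-out that a compute engine performs over splits can be sketched in plain Java. This is a minimal illustration under stated assumptions, not SDK code: KeyRange is a hypothetical stand-in for the bounds of one split, and readRange is a caller-supplied placeholder for the real range read (for example, a getRange or createRangeIterator call built from a RangeRowQueryCriteria). Neither name comes from the Tablestore SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

// Hypothetical stand-in for one split's primary key range; not an SDK type.
// In real code the bounds would come from Split.getLowerBound() and getUpperBound().
final class KeyRange {
    final String lowerBound;
    final String upperBound;

    KeyRange(String lowerBound, String upperBound) {
        this.lowerBound = lowerBound;
        this.upperBound = upperBound;
    }
}

final class ParallelSplitScan {
    // Submits one task per range to a fixed-size pool and sums the row counts
    // that the tasks report. readRange is a placeholder for the actual read.
    static long scanAll(List<KeyRange> ranges, ToLongFunction<KeyRange> readRange, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (KeyRange range : ranges) {
                Callable<Long> task = () -> readRange.applyAsLong(range);
                pending.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : pending) {
                total += result.get(); // blocks until this range has been read
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while scanning splits", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A split read failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A fixed-size pool caps the number of concurrent range reads regardless of how many splits the server returns, which keeps the read load on the table predictable.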

The following example divides the split_demo table into splits of approximately 200 MB and prints the location and primary key range of each split.

import java.util.List;

import com.alicloud.openservices.tablestore.model.ComputeSplitsBySizeRequest;
import com.alicloud.openservices.tablestore.model.ComputeSplitsBySizeResponse;
import com.alicloud.openservices.tablestore.model.Split;

// Assumes client is an already initialized Tablestore client.
String tableName = "split_demo";

// Divide the full data of the table into splits of approximately 200 MB (2 * 100 MB).
ComputeSplitsBySizeRequest request =
        new ComputeSplitsBySizeRequest(tableName, 2);

ComputeSplitsBySizeResponse response = client.computeSplitsBySize(request);

System.out.println("RequestId: " + response.getRequestId());
System.out.println("PrimaryKeySchema: " + response.getPrimaryKeySchema());
System.out.println("ConsumedCapacity: " + response.getConsumedCapacity().jsonize());

List<Split> splits = response.getSplits();
System.out.println("Splits size: " + splits.size());

for (Split split : splits) {
    // The primary key ranges returned by getLowerBound() and getUpperBound() can be passed directly
    // to RangeRowQueryCriteria for parallel reads with getRange or createRangeIterator.
    System.out.println("Location: " + split.getLocation());
    System.out.println("LowerBound: " + split.getLowerBound().jsonize());
    System.out.println("UpperBound: " + split.getUpperBound().jsonize());
}
Note
  • splitSize is measured in units of 100 MB. Each split is approximately N × 100 MB, where N is the value you pass.

  • Splits are logical, and their sizes are approximate. Exact sizes cannot be guaranteed.

  • The location field hints at the machine that hosts the split. The field may be empty in some cases.
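Because the location value is only a hint and may be empty, a scheduler that wants to use it has to handle the missing case. The following sketch shows one possible way to do that in plain Java: it groups split indexes by their location string and places splits without a hint in a catch-all bucket. LocalityPlanner, ANY_HOST, and the idea of scheduling by locality are illustrative assumptions, not part of the Tablestore SDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the name and the bucket label are assumptions.
final class LocalityPlanner {
    // Bucket for splits whose location hint is null or empty.
    static final String ANY_HOST = "<any>";

    // Takes the location hint of each split (in split order) and returns a map
    // from location to the indexes of the splits that reported it.
    static Map<String, List<Integer>> groupByLocation(List<String> locations) {
        Map<String, List<Integer>> plan = new LinkedHashMap<>();
        for (int i = 0; i < locations.size(); i++) {
            String location = locations.get(i);
            String key = (location == null || location.isEmpty()) ? ANY_HOST : location;
            plan.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
        }
        return plan;
    }
}
```

In real code the input list would be built from split.getLocation() for each returned split. Splits in the ANY_HOST bucket can run on any worker.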

Parameters

Name | Type | Description
tableName (required) | String | The name of the table.
splitSize (required) | long | The approximate size of each split, in units of 100 MB. For example, passing 2 divides the data into splits of approximately 200 MB each.
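The unit arithmetic behind splitSize can be worked through in a few lines. The sketch below converts a splitSize value to an approximate byte size and gives a rough estimate of how many splits a table of a known size might produce. It assumes that 100 MB means 100 × 1024 × 1024 bytes and that splitSize must be positive; the source states neither. SplitSizing and its methods are hypothetical names, and the server determines the actual split count.

```java
// Hypothetical helper for planning; not part of the Tablestore SDK.
final class SplitSizing {
    // Assumption: one splitSize unit (100 MB) is 100 * 1024 * 1024 bytes.
    static final long UNIT_BYTES = 100L * 1024 * 1024;

    // Approximate size in bytes of one split for the given splitSize value.
    static long approxSplitBytes(long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        return Math.multiplyExact(splitSize, UNIT_BYTES);
    }

    // Rough estimate of the number of splits: table size divided by split size, rounded up.
    static long estimateSplitCount(long tableBytes, long splitSize) {
        long perSplit = approxSplitBytes(splitSize);
        return (tableBytes + perSplit - 1) / perSplit;
    }
}
```

Under these assumptions, a 1 GiB table with a splitSize of 2 yields an estimate of 6 splits. Because split sizes are approximate, treat the result as a planning figure and rely on the size of the list returned by getSplits() at run time.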