In big data scenarios, you can perform batch read operations to read offline data from a data table. After data is written to a data table, you can batch read the data based on specified conditions.
Prerequisites
A client is initialized. For more information, see Initialize a Tablestore client. A minimal initialization sketch is shown after this list.
A data table is created, and data is written to the data table.
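For reference, the following is a minimal client initialization sketch. The endpoint, instance name, and credential environment variable names are placeholders that you must replace with your own values; see Initialize a Tablestore client for the recommended configuration.

// A minimal sketch of client initialization. The placeholder values and the
// environment variable names are assumptions; replace them with your own.
final String endpoint = "https://<INSTANCE_NAME>.<REGION>.ots.aliyuncs.com";
final String instanceName = "<INSTANCE_NAME>";
SyncClient client = new SyncClient(endpoint,
        System.getenv("TABLESTORE_ACCESS_KEY_ID"),
        System.getenv("TABLESTORE_ACCESS_KEY_SECRET"),
        instanceName);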
Parameters
Parameter | Description |
tableName | The name of the data table. |
inclusiveStartPrimaryKey | The start primary key of the range to read. The value must be a valid primary key or a virtual point that consists of values of the INF_MIN and INF_MAX types. The number of columns in the virtual point must be the same as the number of primary key columns. INF_MIN specifies an infinitely small value; values of all other types are greater than a value of the INF_MIN type. INF_MAX specifies an infinitely large value; values of all other types are smaller than a value of the INF_MAX type. For how to build virtual points, see the sketch after this table. The rows in a data table are sorted in ascending order of primary key values, and the range that is used to read data is a left-closed, right-open interval: if data is read in the forward direction, the rows whose primary key values are greater than or equal to the start primary key value and less than the end primary key value are returned. |
exclusiveEndPrimaryKey | The end primary key of the range to read. The same requirements that apply to inclusiveStartPrimaryKey apply to this parameter. |
columnsToGet | The columns that you want to return. You can specify the names of primary key columns or attribute columns. Note: If you do not specify this parameter, all columns of each row are returned. |
filter | The filter that you want to use to filter the query results on the server side. Only rows that meet the filter conditions are returned. For more information, see Configure a filter. Note: If you configure both the columnsToGet and filter parameters, Tablestore first queries the columns that are specified by the columnsToGet parameter and then returns the rows that meet the filter conditions. |
dataBlockType | The encoding format of the data returned by this read request. Valid values: PlainBuffer and SimpleRowMatrix. |
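The following sketch shows how to build virtual points from INF_MIN and INF_MAX values to cover the full primary key range, and how a filter can be attached to the criteria. It assumes the same table schema as the sample below, with a single primary key column named pk; the setFilter call on BulkExportQueryCriteria is assumed to behave as described in Configure a filter.

// A minimal sketch: use virtual points as range boundaries to scan a full table.
// Assumes a single primary key column named "pk"; adjust the column name and the
// number of columns to match your table schema.
PrimaryKeyBuilder startBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
startBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.INF_MIN); // smaller than any real value
PrimaryKeyBuilder endBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
endBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.INF_MAX); // greater than any real value
BulkExportQueryCriteria criteria = new BulkExportQueryCriteria("<TABLE_NAME>");
criteria.setInclusiveStartPrimaryKey(startBuilder.build());
criteria.setExclusiveEndPrimaryKey(endBuilder.build());
// Server-side filter: return only the rows whose DC1 column equals "a".
SingleColumnValueFilter filter = new SingleColumnValueFilter("DC1",
        SingleColumnValueFilter.CompareOperator.EQUAL, ColumnValue.fromString("a"));
criteria.setFilter(filter);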
Examples
The following sample code shows how to batch read data whose primary key values are within a specific range:
private static void bulkExport(SyncClient client, String start, String end) {
    // Specify the start primary key.
    PrimaryKeyBuilder startPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    startPrimaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(start));
    PrimaryKey startPrimaryKey = startPrimaryKeyBuilder.build();
    // Specify the end primary key.
    PrimaryKeyBuilder endPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    endPrimaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(end));
    PrimaryKey endPrimaryKey = endPrimaryKeyBuilder.build();
    // Create a BulkExportRequest.
    BulkExportRequest bulkExportRequest = new BulkExportRequest();
    // Create a BulkExportQueryCriteria.
    BulkExportQueryCriteria bulkExportQueryCriteria = new BulkExportQueryCriteria("<TABLE_NAME>");
    bulkExportQueryCriteria.setInclusiveStartPrimaryKey(startPrimaryKey);
    bulkExportQueryCriteria.setExclusiveEndPrimaryKey(endPrimaryKey);
    // Use the DBT_PLAIN_BUFFER encoding method.
    bulkExportQueryCriteria.setDataBlockType(DataBlockType.DBT_PLAIN_BUFFER);
    // To use the DBT_SIMPLE_ROW_MATRIX encoding method instead, uncomment the following line.
    // bulkExportQueryCriteria.setDataBlockType(DataBlockType.DBT_SIMPLE_ROW_MATRIX);
    // Specify the columns to return.
    bulkExportQueryCriteria.addColumnsToGet("pk");
    bulkExportQueryCriteria.addColumnsToGet("DC1");
    bulkExportQueryCriteria.addColumnsToGet("DC2");
    bulkExportRequest.setBulkExportQueryCriteria(bulkExportQueryCriteria);
    // Obtain the BulkExportResponse.
    BulkExportResponse bulkExportResponse = client.bulkExport(bulkExportRequest);
    // If you set DataBlockType to DBT_SIMPLE_ROW_MATRIX, use the following code to print the result.
    //{
    //    SimpleRowMatrixBlockParser parser = new SimpleRowMatrixBlockParser(bulkExportResponse.getRows());
    //    List<Row> rows = parser.getRows();
    //    for (int i = 0; i < rows.size(); i++) {
    //        System.out.println(rows.get(i));
    //    }
    //}
    // Because DataBlockType is set to DBT_PLAIN_BUFFER, parse and print the result with PlainBufferBlockParser.
    {
        PlainBufferBlockParser parser = new PlainBufferBlockParser(bulkExportResponse.getRows());
        List<Row> rows = parser.getRows();
        for (int i = 0; i < rows.size(); i++) {
            System.out.println(rows.get(i));
        }
    }
}
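A single call may not return the entire range. The following sketch pages through a large range by feeding the returned next start primary key back into the criteria. It assumes that BulkExportResponse exposes getNextStartPrimaryKey() in the same way that GetRangeResponse does, returning null after the whole range has been read; verify the method against your SDK version.

// A hedged paging sketch. The getNextStartPrimaryKey() call is an assumption
// modeled on GetRangeResponse; verify it against your SDK version.
BulkExportQueryCriteria criteria = new BulkExportQueryCriteria("<TABLE_NAME>");
criteria.setInclusiveStartPrimaryKey(startPrimaryKey);
criteria.setExclusiveEndPrimaryKey(endPrimaryKey);
criteria.setDataBlockType(DataBlockType.DBT_PLAIN_BUFFER);
BulkExportRequest request = new BulkExportRequest();
request.setBulkExportQueryCriteria(criteria);
while (true) {
    BulkExportResponse response = client.bulkExport(request);
    for (Row row : new PlainBufferBlockParser(response.getRows()).getRows()) {
        System.out.println(row);
    }
    if (response.getNextStartPrimaryKey() == null) {
        break; // The whole range has been read.
    }
    // Continue from where the previous call stopped.
    criteria.setInclusiveStartPrimaryKey(response.getNextStartPrimaryKey());
}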
References
If you want to use indexes to accelerate data queries, you can use the secondary index or search index feature. For more information, see Secondary index or Search index.
If you want to visualize data in a table, you can connect the table to DataV or Grafana. For more information, see Data visualization.
If you want to download data from a table to a local file, you can use DataX or the Tablestore CLI. For more information, see Download Tablestore data to a local file.
If you want to compute and analyze data in a table, you can use the SQL query feature of Tablestore. For more information, see SQL query.
Note: You can also use compute engines, such as MaxCompute, Spark, Hive, HadoopMR, Function Compute, and Flink, to compute and analyze data in tables. For more information, see Overview.