Log data is stored on the shards of a Logstore, and all read/write operations are performed on shards. A Logstore consists of multiple shards. Each shard is allocated an MD5 hash value interval that is left-closed and right-open. The intervals of different shards do not overlap, and together they cover the entire range of MD5 hash values.
You must specify the number of shards when creating a Logstore. The entire MD5 hash value range is automatically and evenly divided based on the specified number of shards. The MD5 hash value interval allocated to a shard must reside within the following range: [00000000000000000000000000000000,ffffffffffffffffffffffffffffffff).
An MD5 hash value interval is composed of a left-closed BeginKey and a right-open EndKey as follows:
- BeginKey: indicates the start of a shard. This value is included in the interval that is allocated to the shard.
- EndKey: indicates the end of a shard. This value is excluded from the interval that is allocated to the shard.
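The even division described above can be sketched in a few lines of Python. This is a minimal illustration, not the service's actual algorithm: the function name `split_range_evenly`, the `hex_digits` parameter, and the way the last interval absorbs any rounding remainder are all assumptions made for the example.

```python
def split_range_evenly(shard_count, hex_digits=32):
    """Divide [00...0, ff...f) into shard_count left-closed, right-open intervals.

    Hypothetical helper: returns a list of (begin_key, end_key) hex strings.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    upper = 16 ** hex_digits - 1          # ff...f, the excluded end of the whole range
    step = (upper + 1) // shard_count     # assumed width of each interval
    intervals = []
    for i in range(shard_count):
        begin = i * step
        # Assumption: the last shard always ends at the upper bound of the range.
        end = upper if i == shard_count - 1 else (i + 1) * step
        intervals.append((format(begin, f"0{hex_digits}x"),
                          format(end, f"0{hex_digits}x")))
    return intervals


if __name__ == "__main__":
    # Two-digit keys keep the output short; real keys have 32 hex digits.
    for begin_key, end_key in split_range_evenly(4, hex_digits=2):
        print(f"[{begin_key},{end_key})")
```

Each EndKey equals the BeginKey of the next interval, so the intervals neither overlap nor leave gaps.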
The interval allocated to the shard allows you to use hash keys to write logs to specific shards, and to identify shards to split or merge. To read data from a shard, you must specify the shard ID. To write data to a shard, you can specify a hash key or use the load balancing method. If you use the load balancing method, each data packet is randomly written to an available shard. If you specify a hash key, data is written to the shard whose allocated interval includes the value of the specified hash key.
For example, a Logstore has four shards and the MD5 value range of this Logstore is [00,FF). Because the range is divided evenly, the interval that is allocated to each shard is as follows:
- Shard0: [00,40)
- Shard1: [40,80)
- Shard2: [80,C0)
- Shard3: [C0,FF)
If you specify a hash key with the value of 5F to write logs to the Logstore, the log data is written to Shard1 because the interval of Shard1 contains the MD5 hash value 5F. If the value of the specified hash key is 8C, the log data is written to Shard2, whose interval contains the MD5 hash value 8C.
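The lookup in this example can be sketched as follows. The `SHARDS` list assumes the four-shard even division of [00,FF), and `find_shard` is a hypothetical helper written for illustration, not a Log Service API.

```python
# Assumed layout: four shards that evenly divide [00,FF); the list index is the shard ID.
SHARDS = [("00", "40"), ("40", "80"), ("80", "C0"), ("C0", "FF")]


def find_shard(hash_key, shards=SHARDS):
    """Return the ID of the shard whose [BeginKey, EndKey) interval contains hash_key."""
    key = int(hash_key, 16)  # compare numerically so letter case does not matter
    for shard_id, (begin_key, end_key) in enumerate(shards):
        # BeginKey is included in the interval, EndKey is excluded.
        if int(begin_key, 16) <= key < int(end_key, 16):
            return shard_id
    raise ValueError(f"hash key {hash_key} is outside the range of all shards")


if __name__ == "__main__":
    print(find_shard("5F"))  # Shard1
    print(find_shard("8C"))  # Shard2
```

Because EndKey is excluded, a key equal to the upper bound FF matches no shard in this sketch and raises an error.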
The read/write capacities of a shard are as follows:
- Write: 5 MB/s, 500 times/s
- Read: 10 MB/s, 100 times/s
You can specify the number of shards in a Logstore based on your business needs. If the read/write capacities are insufficient, you can split the shards to increase the number of shards and achieve greater read/write capacities. If the read/write capacities exceed your needs, you can merge the shards to reduce the number of shards and cut down costs.
For example, you have two shards in the read/write mode, so the total write capacity is 10 MB/s. If you need to write data at 14 MB/s in real time, you can split one of the shards to obtain three shards in the read/write mode, for a total write capacity of 15 MB/s. If you need to write data at only 3 MB/s in real time, you can merge the two shards because one shard is sufficient.
- If the error code 403 or 500 is frequently returned when you write data by calling API operations, you can use the monitoring feature to view the write traffic and the service status of the Logstore. Then you can determine whether to increase the number of shards.
- If read/write operations exceed the capacities of the shards, Log Service makes its best effort to serve the requests, but service quality is not guaranteed.
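The sizing arithmetic from the capacity example can be written as a small calculation. This is a rough sketch that considers only the throughput limits of 5 MB/s for writes and 10 MB/s for reads per shard. It ignores the limits on the number of requests per second, and the function name `shards_needed` is an assumption made for illustration.

```python
import math

WRITE_MB_PER_SHARD = 5    # write capacity of one shard, in MB/s
READ_MB_PER_SHARD = 10    # read capacity of one shard, in MB/s


def shards_needed(write_mb_per_s, read_mb_per_s=0.0):
    """Estimate how many read/write shards cover the given traffic (throughput only)."""
    by_write = math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD)
    by_read = math.ceil(read_mb_per_s / READ_MB_PER_SHARD)
    # At least one shard must stay in the read/write mode.
    return max(1, by_write, by_read)


if __name__ == "__main__":
    print(shards_needed(14))  # 3: two shards (10 MB/s) are not enough, so split one
    print(shards_needed(3))   # 1: one shard is sufficient, so the two can be merged
```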
The shard status includes:
- readwrite: supports read/write operations.
- readonly: supports only read operations.
When a shard is created, it is in the read/write mode. When you split or merge shards, the original shards change to the read-only mode, and the newly generated shards are in the read/write mode. The status of a shard does not affect its read performance. You can write data to a shard in the read/write mode, but you cannot write data to a shard in the read-only mode.
When you split a shard, you must specify the ID of a shard that is in the read/write mode and an MD5 hash key. The value of the MD5 hash key must be greater than the BeginKey and smaller than the EndKey of the shard. After the shard is split, two new shards are added and the original shard enters the read-only mode. The two new shards are in the read/write mode, and their IDs come after the original shard ID. The MD5 intervals that are allocated to the two new shards together make up the interval that was allocated to the original shard.
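The split rules can be modeled with a short in-memory sketch. It is not the Log Service API: the dictionary layout, the function name `split_shard`, and the choice to number new shards from the highest existing ID plus one are assumptions made for illustration.

```python
def split_shard(shards, shard_id, split_key):
    """Split a read/write shard at split_key and return the two new shards.

    shards is a list of dicts with the keys "id", "begin", "end", and "status".
    """
    target = next(s for s in shards if s["id"] == shard_id)
    if target["status"] != "readwrite":
        raise ValueError("only a shard in the read/write mode can be split")
    # The hash key must lie strictly between BeginKey and EndKey.
    if not int(target["begin"], 16) < int(split_key, 16) < int(target["end"], 16):
        raise ValueError("split key must be greater than BeginKey and smaller than EndKey")

    target["status"] = "readonly"                # the original shard becomes read-only
    next_id = max(s["id"] for s in shards) + 1   # assumed ID allocation scheme
    left = {"id": next_id, "begin": target["begin"], "end": split_key,
            "status": "readwrite"}
    right = {"id": next_id + 1, "begin": split_key, "end": target["end"],
             "status": "readwrite"}
    shards.extend([left, right])
    return left, right


if __name__ == "__main__":
    shards = [{"id": 0, "begin": "00", "end": "80", "status": "readwrite"},
              {"id": 1, "begin": "80", "end": "ff", "status": "readwrite"}]
    split_shard(shards, 0, "40")
    for shard in shards:
        print(shard)
```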
When you merge shards, you must specify a shard in the read/write mode. The specified shard cannot be the last shard in the Logstore that is in the read/write mode. Log Service then finds the shard whose MD5 value interval is adjacent on the right to that of the specified shard and merges the two shards. After the two shards are merged, they enter the read-only mode. The newly generated shard is in the read/write mode, and its MD5 value interval is the combination of the two intervals that were allocated to the original two shards.
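The merge rules can be modeled in the same way. This is again a hypothetical in-memory sketch rather than the Log Service API. The dictionary layout, the function name `merge_shard`, and the new shard ID are assumptions. The sketch also interprets "the last shard" as the read/write shard that has no read/write neighbor on its right.

```python
def merge_shard(shards, shard_id):
    """Merge a read/write shard with its right-hand neighbor and return the new shard.

    shards is a list of dicts with the keys "id", "begin", "end", and "status".
    """
    target = next(s for s in shards if s["id"] == shard_id)
    if target["status"] != "readwrite":
        raise ValueError("only a shard in the read/write mode can be merged")
    # The right-hand neighbor is the read/write shard whose BeginKey equals this EndKey.
    neighbor = next((s for s in shards
                     if s["status"] == "readwrite"
                     and int(s["begin"], 16) == int(target["end"], 16)), None)
    if neighbor is None:
        raise ValueError("the specified shard has no read/write shard on its right")

    target["status"] = neighbor["status"] = "readonly"   # both originals become read-only
    merged = {"id": max(s["id"] for s in shards) + 1,    # assumed ID allocation scheme
              "begin": target["begin"], "end": neighbor["end"],
              "status": "readwrite"}
    shards.append(merged)
    return merged


if __name__ == "__main__":
    shards = [{"id": 0, "begin": "00", "end": "80", "status": "readwrite"},
              {"id": 1, "begin": "80", "end": "ff", "status": "readwrite"}]
    print(merge_shard(shards, 0))
```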