SmartData versions 3.0.X through 3.5.X have a known defect that may corrupt cached data. As a result, an error may be reported when you read cached data from SmartData. This topic describes the impact of the issue, the solution, and the procedure to fix the issue.
Impact of the issue
- Affected components: all components for which the data caching feature of SmartData is enabled
Notice: If the SmartData service is deployed in your cluster but you do not use the data caching feature, you can ignore this issue.
SmartData allows you to use JindoFS in block storage mode or cache mode.
- Affected versions:
- E-MapReduce (EMR): V3.30.X, V4.5.X, V3.32.X, V4.6.X, V3.33.X, V4.7.X, V3.34.X, V4.8.X, V3.35.X, and V4.9.X
- SmartData: 3.0.X, 3.1.X, 3.2.X, 3.3.X, 3.4.X, and 3.5.X
- Severity level: critical. The issue occurs only occasionally, but when it occurs, it affects the accuracy of data. We recommend that you fix the issue.
- Issue description: If JindoFS is used in block storage mode or cache mode, cached data may be corrupted with a low probability, and an error may be reported when you read the data. For example, a data parsing error is reported when you read data from ORC or Parquet files, or an HFile format error is reported when you read HBase data. In block storage mode, the data caching feature is enabled by default. JindoFS is used in cache mode when the jfs.cache.data-cache.enable parameter is set to true.
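The conditions listed above can be combined into a quick self-check. The following Python code is a minimal sketch, not an official tool: the function names and the way the inputs are supplied are assumptions made for illustration, and only the version range (SmartData 3.0.X to 3.5.X) and the jfs.cache.data-cache.enable parameter come from this topic.

```python
import re

# Affected SmartData minor versions taken from this topic: 3.0.X through 3.5.X.
AFFECTED_MAJOR = 3
AFFECTED_MINORS = range(0, 6)


def smartdata_version_affected(version: str) -> bool:
    """Return True if a SmartData version string such as '3.2.1' is in the affected range."""
    match = re.match(r"^(\d+)\.(\d+)(?:\.|$)", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized SmartData version: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return major == AFFECTED_MAJOR and minor in AFFECTED_MINORS


def cluster_affected(version: str, block_mode_used: bool, cache_enable_value=None) -> bool:
    """Hypothetical helper: decide whether a cluster should apply the fix.

    block_mode_used    -- True if JindoFS block storage mode is used
                          (data caching is enabled by default in this mode).
    cache_enable_value -- the configured value of jfs.cache.data-cache.enable,
                          or None if the parameter is not set.
    """
    cache_mode_used = (
        cache_enable_value is not None
        and str(cache_enable_value).strip().lower() == "true"
    )
    return smartdata_version_affected(version) and (block_mode_used or cache_mode_used)


if __name__ == "__main__":
    # Example inputs; read the real values from your own cluster configuration.
    print(cluster_affected("3.2.1", block_mode_used=False, cache_enable_value="true"))
    print(cluster_affected("3.6.0", block_mode_used=True))
```

The check is deliberately conservative: a cluster is treated as affected when the version is in range and either caching path is in use.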
Solution
In SmartData 3.0.X through 3.5.X, cached data is damaged by a defect in the process that merges small files. To avoid the issue, modify the configuration to disable the merging of small files, and then restart the SmartData service.
If the issue has already occurred, first disable the data caching feature to eliminate the impact of cached data and recover online business as soon as possible. If you use JindoFS only in cache mode, you can use a tool to format the cache system, which clears all cached data in your cluster, including any cached data blocks that may be damaged. After the clearing operation is complete, re-enable data caching.
Fixing procedure
Common fixing procedure
If the issue has not occurred in your cluster, perform the following steps to avoid the issue:
- On the SmartData service page in the EMR console, add a custom configuration item.
- Restart Jindo Storage Service.
Emergency fixing procedure
If the issue has occurred, perform the following steps to recover business and fix the caching mechanism: