E-MapReduce:SmartData 3.2.X

Last Updated: Jun 16, 2026

SmartData is a storage service for the E-MapReduce (EMR) Jindo engine. It provides unified storage, caching optimization, and computing acceleration for EMR computing engines. SmartData consists of JindoFS, JindoTable, and related tools. This topic describes the updates in SmartData 3.2.X.

OSS storage scalability on JindoFS

  • JindoFS supports multiple password-free methods for obtaining OSS access tokens, and these methods can be customized and extended.
  • Alibaba Cloud Tablestore is used to implement mutual exclusion for concurrent rename operations.
  • Data can be written to OSS by using Delta or Hudi.

JindoFS-based caching optimization

JindoFS optimizes metadata caching for large volumes of small files in AI training scenarios, improving the performance of metadata preloading and list operations.

JindoTable-based computing optimization

  • JindoTable integrates with AliORC to provide a native Optimized Row Columnar (ORC) reader, which lets Spark and Presto read ORC files faster and improves computing performance.
  • JindoTable collects statistics on how frequently Hive tables are accessed through Presto.

Ecosystem support for JindoFS

When you use Spark to write data to OSS, you can set spark.hadoop.mapreduce.fileoutputcommitter.marksuccessfuljobs to false to avoid generating a _SUCCESS file.
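The setting above can be supplied either on the spark-submit command line or when the session is built. The following is a minimal sketch rather than an official SmartData example: the helper names (success_marker_conf, as_submit_args, build_session) are hypothetical and introduced only for illustration, and build_session assumes that PySpark is installed.

```python
# Key taken from the text above: a Hadoop FileOutputCommitter option
# passed to Hadoop through Spark's "spark.hadoop." prefix.
MARK_SUCCESS_KEY = "spark.hadoop.mapreduce.fileoutputcommitter.marksuccessfuljobs"


def success_marker_conf(enabled: bool = False) -> dict:
    """Return Spark settings that enable or disable the _SUCCESS marker file."""
    return {MARK_SUCCESS_KEY: "true" if enabled else "false"}


def as_submit_args(conf: dict) -> list:
    """Render settings as `--conf key=value` arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


def build_session(app_name: str, conf: dict):
    """Create a SparkSession with the given settings (assumes PySpark is installed)."""
    from pyspark.sql import SparkSession  # imported lazily: optional dependency

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


if __name__ == "__main__":
    # Prints the arguments to append to a spark-submit command.
    print(" ".join(as_submit_args(success_marker_conf(enabled=False))))
```

With the marker disabled, the job's output directory should contain only the data files, so downstream jobs that rely on the presence of _SUCCESS to detect completion need another signal.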