Introduction: [AI training acceleration] Lecture 17
Subject: Fluid + JindoFS: accelerating training data on HDFS
Lecturer: Chen Shan, EMR technical expert, Alibaba Computing Platform Division
- What is Fluid + JindoFS (JindoRuntime)
- Why use JindoRuntime to accelerate HDFS?
- How to use JindoRuntime
Live playback link: (Lecture 17)
1. What is Fluid + JindoFS (JindoRuntime)
Introduction to Fluid
CNCF Fluid is an open-source Kubernetes-native distributed dataset orchestration and acceleration engine that mainly serves data-intensive applications in cloud-native scenarios, such as big data applications and AI applications.
Reference URL: https://github.com/fluid-cloudnative/fluid
Fluid functional concepts
Fluid does not accelerate and manage the storage system as a whole; it accelerates and manages the datasets that applications use.
- Dataset: a set of logically related data with consistent file characteristics that is used by the same compute engine.
- Runtime: the API of the execution engine that implements dataset security, version management, and data acceleration. It defines a series of lifecycle methods.
- JindoRuntime: an efficient execution engine implementation, built on the JindoFS kernel, that supports Dataset data management and caching.
Background: in a cloud-native environment, the JindoFS cache acceleration engine is used to orchestrate cached datasets and the applications that read them.
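To make the Dataset and JindoRuntime concepts concrete, here is a minimal sketch in Python that builds the two manifests as plain dictionaries and prints them as a JSON `List`, a form `kubectl apply -f` accepts alongside YAML. The field layout follows the `data.fluid.io/v1alpha1` CRDs as described in the Fluid documentation. The dataset name `hadoop`, the namenode address, the cache path, the quota, and the watermark values are placeholder assumptions, not values from the lecture.

```python
import json

FLUID_API = "data.fluid.io/v1alpha1"


def build_dataset(name, hdfs_uri):
    """Dataset manifest: tells Fluid which HDFS path holds the data."""
    return {
        "apiVersion": FLUID_API,
        "kind": "Dataset",
        "metadata": {"name": name},
        "spec": {"mounts": [{"mountPoint": hdfs_uri, "name": name}]},
    }


def build_jindo_runtime(name, replicas, cache_path, quota):
    """JindoRuntime manifest: the cache engine that serves the Dataset.

    Fluid binds a runtime to the Dataset with the same name and
    namespace, so `name` must match the Dataset's name.
    """
    return {
        "apiVersion": FLUID_API,
        "kind": "JindoRuntime",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,  # number of cache workers
            "tieredstore": {
                "levels": [
                    {
                        "mediumtype": "HDD",   # assumed cache medium
                        "path": cache_path,    # assumed local cache directory
                        "quota": quota,        # cache capacity per worker
                        "high": "0.9",         # assumed eviction watermarks
                        "low": "0.8",
                    }
                ]
            },
        },
    }


# Placeholder values for illustration only.
manifests = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        build_dataset("hadoop", "hdfs://namenode.example:8020/"),
        build_jindo_runtime("hadoop", 2, "/mnt/disk1", "100Gi"),
    ],
}
print(json.dumps(manifests, indent=2))
```

Saving the printed output to a file and applying it with `kubectl apply -f` would create both objects; the linked Fluid and JindoFS documents remain the authoritative reference for the supported fields.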
2. Why use JindoRuntime to accelerate HDFS?
HDFS storage and AI training
Problems faced by HDFS in AI training scenarios
- Because compute and storage are separated, data read performance is poor, which limits the I/O performance of AI training jobs.
- Many deep learning training frameworks are not adapted to the native HDFS API, which greatly increases development difficulty.
- HDFS clusters come under heavy load and can even develop stability problems.
Using Fluid JindoRuntime to accelerate access to HDFS
- The Master supports Raft-based high availability.
- Supports data affinity scheduling (nodeAffinity) to select appropriate cache nodes.
- Supports data preloading through the DataLoad CRD.
- Lets you specify the FUSE user that accesses HDFS.
Reference URL: https://github.com/aliyun/alibabacloud-jindofs/blob/master/docs/jindo_fluid/jindo_fluid_overview.md
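The DataLoad preloading feature listed above can be sketched as a manifest as well. The Python below builds a `DataLoad` object as a dictionary and prints it as JSON. The fields follow the `data.fluid.io/v1alpha1` DataLoad CRD as described in the Fluid documentation; the object name `hadoop-dataload`, the dataset name `hadoop`, the `default` namespace, and the target path `/` are assumptions chosen for illustration.

```python
import json


def build_dataload(name, dataset, namespace="default", paths=("/",), replicas=1):
    """DataLoad manifest: asks Fluid to warm the cache for a Dataset.

    `paths` are paths inside the dataset to preload, and `replicas`
    is the number of cached copies requested for each path.
    """
    return {
        "apiVersion": "data.fluid.io/v1alpha1",
        "kind": "DataLoad",
        "metadata": {"name": name},
        "spec": {
            "dataset": {"name": dataset, "namespace": namespace},
            "loadMetadata": True,  # sync file metadata before loading data
            "target": [{"path": p, "replicas": replicas} for p in paths],
        },
    }


# Placeholder names for illustration only.
dataload = build_dataload("hadoop-dataload", "hadoop")
print(json.dumps(dataload, indent=2))
```

Preloading before the first training epoch means the first read of each file is served from the cache workers rather than from the remote HDFS cluster.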
3. How to use JindoRuntime
Using JindoRuntime to accelerate HDFS
- Download and install Fluid: https://github.com/aliyun/alibabacloud-jindodata/blob/master/docs/jindo_fluid/jindo_fluid_jindofs_hdfs_introduce.md
- Create the Dataset
- Create the JindoRuntime
- Preload the cache with DataLoad
- Run the AI training job
Environment requirements:
- Kubernetes version > 1.14, with CSI support
- Golang 1.12+
- Helm 3
- Fluid 0.6.0
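The final step above, running the training job, usually comes down to mounting the dataset into the job's pod. Once a Dataset is bound to its runtime, Fluid exposes it as a PersistentVolumeClaim with the same name as the Dataset, so the job reads the data through an ordinary file path instead of the HDFS API. The sketch below builds such a Pod manifest in Python; the dataset name `hadoop`, the image `example.com/trainer:latest`, the command, and the mount path `/data` are hypothetical placeholders.

```python
import json


def build_training_pod(name, dataset, image, command, mount_path="/data"):
    """Pod manifest that mounts a Fluid dataset through its PVC."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "trainer",
                    "image": image,
                    "command": command,
                    # Cached HDFS files appear under mount_path.
                    "volumeMounts": [{"name": "dataset", "mountPath": mount_path}],
                }
            ],
            "volumes": [
                {
                    "name": "dataset",
                    # PVC name is assumed to equal the Dataset name.
                    "persistentVolumeClaim": {"claimName": dataset},
                }
            ],
        },
    }


# Placeholder values for illustration only.
pod = build_training_pod(
    "train-demo",
    "hadoop",
    "example.com/trainer:latest",
    ["python", "train.py", "--data-dir", "/data"],
)
print(json.dumps(pod, indent=2))
```

Because the data is presented as a mounted directory, training frameworks that lack native HDFS support can read it without code changes.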
Demo: accelerating data access on HDFS
Links to related documents:
- Fluid JindoRuntime reference
https://github.com/aliyun/alibabacloud-jindofs/blob/master/docs/jindo_fluid/jindo_fluid_overview.md
- Embracing cloud native, Fluid with JindoFS: HDFS acceleration user guide
https://github.com/aliyun/alibabacloud-jindodata/blob/master/docs/jindo_fluid/jindo_fluid_jindofs_hdfs_introduce.md
- ImageNet dataset acceleration test
- InsightFace dataset acceleration test