Alibaba Cloud E-MapReduce (EMR) is a big data platform built on open source engines such as Hadoop, Spark, HBase, Hive, and Flink. It runs on Alibaba Cloud ECS instances and provides a variety of big data solutions, letting you use open source technologies in the cloud to process big data in scenarios such as data warehousing, batch processing, stream processing, and ad hoc queries.
Learning Path
Start your EMR journey here to discover infinite possibilities with Alibaba Cloud
About EMR
Introduction to EMR
- What is E-MapReduce?
- Benefits
- Architecture
- Use scenarios
- Limits
How Is EMR Billed?
- Billable items
- Subscription
- Pay-as-you-go
- Expiration and overdue payments
- Renewal
- Switch from pay-as-you-go to subscription
Get Started with EMR
Quick Start
- Overview
- Make preparations
- Create a cluster
- Create and run a job
Use EMR
Quick Start
Cluster Management
- Create a cluster
- Security groups
- Scale out a cluster
- Scale in a cluster
- Connect to the master node of an EMR cluster in SSH mode
Data Development
- Manage projects
- Edit jobs
- Edit a workflow
- Implement ad hoc queries
- Scheduling center
Metadata Management
- Manage Hive metadata in a centralized manner
- Basic operations on Hive metadata
- Configure an independent ApsaraDB RDS instance
- Manage Kafka metadata
Practices
Best Practices
- Use PyFlink jobs to process Kafka data
- Use Spark Streaming jobs to process Kafka data
- Use Flink jobs to process OSS data
- Use Kafka Connect to migrate data
Development
Developer Resources
- List of operations by function
- SDK reference
FAQ
View more frequently asked questions and related solutions
- FAQ about cluster planning and configuration
- FAQ about data development
- FAQ about metadata management