This topic describes how to use Flume to synchronize data from Kafka in an E-MapReduce (EMR) Dataflow cluster to the HDFS service of an EMR data lake cluster.
Prerequisites
- An EMR data lake cluster is created, and the Flume service is selected from the optional services during cluster creation. For more information, see Create a cluster.
- An EMR Dataflow cluster is created, and the Kafka service is selected from the optional services during cluster creation. For more information, see Create a cluster.
Procedure
- Configure the Flume service.
- Start the Flume service.
  - On the Status tab of the Flume service, find the FlumeAgent component and choose in the Actions column.
  - In the dialog box that appears, enter an execution reason and click OK.
  - In the Confirm message, click OK.
- Test data synchronization.
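
The "Configure the Flume service" step does not list the configuration items here. As a rough illustration only, a Flume agent that reads from a Kafka topic and writes to HDFS might be configured along the following lines. The agent name `a1`, the component names, the broker address, the topic, the consumer group, and the HDFS path are all hypothetical placeholders rather than values from this topic, so replace them with the values of your own clusters and check them against the configuration page of the Flume service.

```properties
# Hypothetical agent name "a1" with one source, one channel, and one sink.
a1.sources = kafka-src
a1.channels = mem-ch
a1.sinks = hdfs-sink

# Kafka source: consumes messages from the Dataflow cluster.
# The broker address, topic, and group ID below are placeholders.
a1.sources.kafka-src.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.kafka-src.kafka.bootstrap.servers = kafka-broker-1:9092
a1.sources.kafka-src.kafka.topics = flume-test
a1.sources.kafka-src.kafka.consumer.group.id = flume-test-group
a1.sources.kafka-src.channels = mem-ch

# Memory channel: buffers events between the source and the sink.
a1.channels.mem-ch.type = memory
a1.channels.mem-ch.capacity = 10000
a1.channels.mem-ch.transactionCapacity = 1000

# HDFS sink: writes events as plain text files in the data lake cluster.
# The target path is a placeholder.
a1.sinks.hdfs-sink.type = hdfs
a1.sinks.hdfs-sink.hdfs.path = /tmp/flume/test-data
a1.sinks.hdfs-sink.hdfs.fileType = DataStream
# Roll files every 60 seconds; disable size-based and count-based rolling.
a1.sinks.hdfs-sink.hdfs.rollInterval = 60
a1.sinks.hdfs-sink.hdfs.rollSize = 0
a1.sinks.hdfs-sink.hdfs.rollCount = 0
a1.sinks.hdfs-sink.channel = mem-ch
```

With a configuration of this shape, testing the synchronization usually comes down to writing a few messages to the Kafka topic and then checking whether new files appear under the configured HDFS path.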