    FAQ

    Last Updated: Feb 09, 2021

    If you encounter an error or another issue when you use Alibaba Cloud E-MapReduce (EMR), use the following FAQ topics to troubleshoot it:

    • FAQ about cluster planning and configuration
    • FAQ about data development
    • FAQ