Q: How do I create a pay-as-you-go E-MapReduce cluster with high-specification ECS instances?

When you create a pay-as-you-go E-MapReduce cluster, Elastic Compute Service (ECS) instances with eight or more vCPUs are unavailable by default. To use such high-specification ECS instances to create a pay-as-you-go E-MapReduce cluster, you must go to the ECS console and submit a ticket. We recommend that you use subscription E-MapReduce clusters, which do not limit the specifications of ECS instances.

Q: What is a high-security cluster?

A high-security cluster is a Kerberos-authenticated cluster. You can enable the Kerberos mode when you create a cluster. For more information, see Introduction to Kerberos. You cannot disable the Kerberos mode for a high-security cluster of a version earlier than V3.12. If you want to use a non-high-security cluster, create a new cluster. For clusters of V3.12 and later, you can directly disable the Kerberos mode.

Q: Why does the error message "The specified instance Type exceeds the maximum limit for the PostPaid instances" appear when I fail to create a cluster?

In most cases, this error message appears because the maximum number of pay-as-you-go instances that you can create is exceeded. The quota of pay-as-you-go instances depends on your initial purchase. To apply for a higher quota of pay-as-you-go instances, submit a ticket. Another cause is that you do not have the permission to create instances of the specified type. To use such instances, you must apply for the corresponding permission on the instance type in the ECS console.

Q: How do I renew a cluster?

For more information about how to renew a cluster, see Cluster renewal. Renewal may fail even after you renew the subscription for ECS. This happens when the subscription for E-MapReduce has not been renewed, because renewing a cluster requires renewing both subscriptions. You can view the expiration dates of ECS and E-MapReduce on the cluster renewal page.

Q: How do I enable auto-renewal?

You can enable auto-renewal in the E-MapReduce console to renew the subscription for E-MapReduce and ECS automatically.

Q: Can I add an existing ECS instance to an E-MapReduce cluster?

No. Currently, an E-MapReduce cluster can use only the ECS instances that are created in the E-MapReduce console when you create the cluster.

Q: Can I install software on the master node of an E-MapReduce cluster?

Technically, you are allowed to install software on the master node provided that the cluster environment is not affected. However, we recommend that you do not perform this operation because running software on the master node may impact the stability of a cluster.

Q: How do I log on to a core node and gain root permissions?

For more information about how to log on to a core node and gain root permissions, see the "Log on to the core node" section in Create an E-MapReduce cluster.

Q: Can I disassociate an insecure EIP from a master node in the ECS console? Does the operation affect E-MapReduce services?

Elastic IP Addresses (EIPs) are used to connect E-MapReduce clusters to unified metadatabases. If you do not use unified metadatabases, you can disassociate the EIP from the master node.

Q: Do services start automatically when nodes are powered on, and restart automatically after they are stopped unexpectedly?

Yes. Services start automatically when nodes are powered on and restart automatically after they are stopped unexpectedly. If a service fails to start, the start operation is retried up to three times.

Q: What permissions are required for a RAM user to use E-MapReduce?

To use E-MapReduce, a Resource Access Management (RAM) user must be granted the permissions specified by the EMRfullaccessrole policy. Make sure that the Alibaba Cloud account to which the RAM user belongs has the permissions specified by the EMRdefaultRole and ECSdefaultRole policies. Otherwise, you cannot activate E-MapReduce for the RAM user.

Q: How do I use Hue in E-MapReduce?

You must use a username and password to use Hue in E-MapReduce. For more information about the default username and password of Hue, see Hue. You can also refer to this topic if you forget your username or password. If you fail to access Hadoop Distributed File System (HDFS) from Hue, set the dfs.webhdfs.enabled parameter of HDFS to true in the E-MapReduce console and restart HDFS.

Q: By default, anonymous users are allowed to access Zeppelin. How do I disable this feature? Do I need to modify the configuration file on the master node if no configuration item is available on the configuration page of Zeppelin in the E-MapReduce console?

To disable anonymous access for Zeppelin, you must modify the configuration file manually and restart Zeppelin.

Q: I failed to use Thrift to connect to HBase through port 9090. Which port is available?

You can use Thrift to connect to HBase through port 9099.
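
To confirm which port accepts connections from your client machine, a plain TCP probe is often enough. The following is a minimal sketch that uses only the Python standard library. The function name is_port_open is illustrative, and the probe only checks that something is listening on the port; it does not verify that the listener is the HBase Thrift server.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling is_port_open(host, 9099) with the address of the node that runs the Thrift server should return True if the port is reachable, while the same call with port 9090 is expected to return False in the situation described above.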

Q: After I create a database on the Tables page of the E-MapReduce console and refresh the page, the database disappears. If I run the SHOW DATABASES command in Zeppelin, the database is listed in the return result. I can also find the database by running commands in Hive CLI. What is the cause?

You cannot find the created database on the Tables page because the Unified Metabases feature is not enabled for the current cluster. To view the created database, run commands in the Hive command-line interface (CLI). Currently, the Cluster Management page does not show whether the Unified Metabases feature is enabled for a cluster. To check, follow these steps:
  1. Log on to the header 1 node.
  2. Find the hive-site.xml file in the /etc/ecm/hive-conf directory and check the value of the javax.jdo.option.ConnectionURL parameter. If the value contains emr-header-1, the Unified Metabases feature is not enabled.
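
The check on hive-site.xml can also be scripted. The following is a minimal sketch that assumes the file uses the standard Hadoop layout of property entries with name and value children, and that "the value is emr-header-1" means the connection URL contains that hostname. The function names are illustrative and are not part of E-MapReduce.

```python
import xml.etree.ElementTree as ET


def get_hadoop_property(xml_text: str, name: str):
    """Return the value of a <property> entry in a Hadoop-style XML config, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def uses_local_metastore(xml_text: str) -> bool:
    """Return True if the metastore URL points at emr-header-1.

    Assumption: a URL that contains "emr-header-1" means that the
    Unified Metabases feature is not enabled for the cluster.
    """
    url = get_hadoop_property(xml_text, "javax.jdo.option.ConnectionURL")
    return url is not None and "emr-header-1" in url
```

On the header 1 node, you could read /etc/ecm/hive-conf/hive-site.xml into a string and pass it to uses_local_metastore.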

Q: Does E-MapReduce support spot instances (preemptible instances)?

If you enable the auto scaling feature for a cluster, you can use preemptible instances. For more information, see Preemptible instances in AutoScaling.

Q: Why do I fail to run Hive queries for a cluster with Unified Metabases enabled? Why does the error message FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient appear?

This error message appears because no EIP is associated with your cluster. EIPs are required for connecting to unified metadatabases. If the number of EIPs exceeds the upper limit, no EIP is available when you create the cluster. As a result, you cannot connect to unified metadatabases. To resolve this issue, you must associate an EIP with the cluster manually and submit a ticket to Alibaba Cloud to add the EIP to the security group of your databases.

Q: What is a low-specification ECS instance?

A low-specification ECS instance has only two vCPUs and 8 GB of memory. You can create low-specification ECS instances and add specific users to the whitelist so that they can use these instances. Note that low-specification ECS instances are prone to exceptions. You have to resolve these exceptions by yourself.

Q: Why is no instance type available for a master node?

No instance type is available for the master node because the current zone has no instance types in stock. Switch to another zone and try again.

Q: How do I use other ECS instances to submit jobs and obtain return results?

With gateway clusters, you can use other ECS instances to submit jobs and obtain return results. You can purchase gateway clusters in the E-MapReduce console. For more information, see Gateway clusters.

Q: How do I expand the disk capacity of an E-MapReduce cluster?

In the near future, you can expand the disk capacity in the E-MapReduce console. For more information, see Expand disks.

Q: Can I enable an existing cluster to save operational logs of OSS?

Currently, you cannot enable an existing cluster to save operational logs to Object Storage Service (OSS). As an alternative, we recommend that you create a workflow to run your jobs, which lets you view the logs without enabling the logging feature.

Q: If an E-MapReduce cluster uses the ZooKeeper, Kafka, or Storm service, can the E-MapReduce cluster communicate with an ApsaraDB for HBase cluster?

An E-MapReduce cluster that uses the ZooKeeper, Kafka, or Storm service can communicate with an ApsaraDB for HBase cluster if both of the following conditions are met: the E-MapReduce cluster and the HBase cluster are in the same Virtual Private Cloud (VPC), and the IP address of the E-MapReduce cluster is added to the whitelist of the HBase cluster.

Q: How do I remove core nodes and task nodes from an E-MapReduce cluster? For example, I have a cluster with four core nodes and two task nodes. Can I remove all task nodes and reduce the number of core nodes to two?

Currently, you cannot directly remove core nodes from a cluster in the E-MapReduce console. If you want to remove core nodes, submit a ticket in the ECS console. Note that you must remove core nodes running the ZooKeeper service based on the reverse sequence of the node IDs. For example, if you have four core nodes named worker1, worker2, worker3, and worker4, remove worker4 first and worker1 last. Currently, you cannot remove subscription core nodes and task nodes in the E-MapReduce console. Only pay-as-you-go task nodes can be removed in the E-MapReduce console.
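
The removal order described above can be computed rather than worked out by hand. The following is a minimal sketch that assumes each node name ends with its numeric ID, as in the worker1 to worker4 example. The function name removal_order is illustrative.

```python
import re


def removal_order(node_names):
    """Sort node names by their trailing numeric ID, highest ID first."""

    def node_id(name):
        match = re.search(r"(\d+)$", name)
        if match is None:
            raise ValueError(f"no numeric ID at the end of {name!r}")
        return int(match.group(1))

    # Compare IDs as integers so that worker10 sorts after worker9.
    return sorted(node_names, key=node_id, reverse=True)
```

Comparing the IDs as integers instead of strings matters once a cluster has ten or more nodes, because a plain string sort would place worker10 before worker2.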

Q: How do I get refunds for E-MapReduce?

To get refunds for E-MapReduce, submit a ticket to the E-MapReduce product team and describe the reason for refunds in the ticket.

Q: What is the difference between E-MapReduce and MaxCompute?

Both E-MapReduce and MaxCompute are used to process big data. E-MapReduce is a big data platform built entirely on open source technologies and is fully compatible with open source usage and practices. Because it is designed based on the open source Hadoop ecosystem, developers with prior Hadoop knowledge can get started more easily. MaxCompute is a proprietary platform developed by Alibaba Cloud and is not open source. After you encapsulate the connection to MaxCompute, you can use its functions whenever and wherever needed. MaxCompute also reduces the cost of operations and maintenance. Using MaxCompute requires minor modifications to existing code.

Q: How do I view the password for connecting to MySQL from an E-MapReduce cluster?

To view the password for connecting to MySQL from an E-MapReduce cluster, follow these steps: On the configuration page of the cluster, choose Cluster Service > Hive in the left-side navigation pane. On the page that appears, click the Configure tab and find the values of the following parameters:
javax.jdo.option.ConnectionURL
javax.jdo.option.ConnectionUserName
javax.jdo.option.ConnectionPassword

Q: I encountered an error message similar to "Spark 2.2.1 is not supported" when running Spark 2.2.X in E-MapReduce V3.X with Zeppelin V0.71 installed. How do I resolve the issue?

Zeppelin does not support Spark 2.2.X in E-MapReduce of versions earlier than V3.11. E-MapReduce V3.11 and later have resolved this issue by upgrading Zeppelin to V0.73.

Q: I created an E-MapReduce cluster of V3.4.3 before. Why do I fail to create another E-MapReduce cluster of V3.4.3 now?

E-MapReduce V3.4.3 is unpublished. Before trying to create an E-MapReduce cluster of an unpublished version, check the service versions that you need in the unpublished E-MapReduce version. Some services such as Hive and Spark in unpublished versions of E-MapReduce are still supported by a later version of E-MapReduce. In this case, we recommend that you create an E-MapReduce cluster of an available version. If no available versions of E-MapReduce support the service versions that you need, contact Alibaba Cloud engineers. They will add your IP address to the whitelist so that you can create an E-MapReduce cluster of an unpublished version.

Q: Is automatic storage balancing available? How do I manually rebalance storage?

To manually rebalance storage, follow these steps: Log on to the E-MapReduce console. On the E-MapReduce homepage, click Cluster Management. On the Cluster Management page, click the name of the target cluster. On the page that appears, choose Cluster Service > HDFS in the left-side navigation pane. On the HDFS page, choose Actions > Rebalance in the upper-right corner.

Q: Does E-MapReduce support downgrading the specification, for example, reducing 16 vCPUs and 32 GB of memory to 8 vCPUs and 16 GB of memory for the master node, core nodes, or task nodes?

Currently, E-MapReduce does not support downgrading the specification.

Q: Why does the error message "The specified DataDisk Size beyond the permitted range, or the capacity of snapshot exceeds the size limit of the specified disk category" appear?

This error message appears because the value that you specified for the DataDisk Size parameter is too small when you created the cluster through the SDK or API. We recommend that you specify a disk size larger than 40 GB and try again.

Q: Why does the error message "Your account does not have enough balance" appear?

This error message appears because your account balance is insufficient.

Q: Why does the error message "The maximum number of Pay-As-You-Go instances is exceeded: create ecs vcpu quota per region limited by user quota [xxx]" appear?

This error message appears because your quota for pay-as-you-go instances is insufficient for creating an E-MapReduce cluster. You can apply for a higher quota in the ECS console or release some pay-as-you-go instances.

Q: I cannot find Flume in the E-MapReduce console. How do I push data to the OSS path configured for my Hadoop cluster by using Flume?

Currently, Flume has not been integrated with E-MapReduce. To use Flume, you must install it manually.

Q: In E-MapReduce, does Spark support submitting jobs in standalone mode?

In E-MapReduce, Spark does not support submitting jobs in standalone mode. Instead, Spark submits jobs in Spark On Yarn mode.
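
As an illustration, the following sketch builds a spark-submit command line that targets YARN. The --master and --deploy-mode options are standard spark-submit options. The helper name spark_on_yarn_command and the application path in the usage note are hypothetical.

```python
def spark_on_yarn_command(app_path, deploy_mode="cluster", app_args=()):
    """Build the argument list for submitting a Spark application to YARN."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", "yarn",            # submit to YARN instead of a standalone master
        "--deploy-mode", deploy_mode,  # where the driver runs: client or cluster
        app_path,
        *app_args,
    ]
```

The returned list can be passed to subprocess.run on a node where spark-submit is available, for example spark_on_yarn_command("my_job.py", "client").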

Q: How do I modify software configurations?

Earlier versions of E-MapReduce do not support modifying software configurations in the E-MapReduce console. To modify software configurations, follow these steps:
  1. Log on to the master node of your cluster.
  2. Go to the directory of configuration templates.
    cd /var/lib/ecm-agent/cache/ecm/service/
  3. Go to the directory of the target service, for example, Hue.
  4. Go to the directory of the corresponding version of Hue, for example, /var/lib/ecm-agent/cache/ecm/service/HUE/4.1.0.1.3.
  5. Go to the /package/templates/ directory, where the corresponding configuration files are stored.
  6. Modify the configurations as needed. You can add configurations or modify existing configurations:
    • If you want to add a configuration, make sure that the format is correct and line breaks and spaces are used properly.
    • If you want to modify an existing configuration, comment out the existing line and add a new one, or modify the parameter value directly.
  7. After the modification is complete, restart the service so that the configurations take effect.
  8. Verify that the configuration file in the /etc/ecm/<service name>-conf/ directory reflects the modification.
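
The directories used in the steps above follow a regular pattern, so they can be derived from the service name and version. The following is a minimal sketch. The function names are illustrative, and the lowercase service name in the deployed directory is an assumption based on paths such as /etc/ecm/hive-conf; verify the actual directory names on your cluster.

```python
from pathlib import PurePosixPath

# Root of the configuration templates on the master node (from the steps above).
TEMPLATE_ROOT = PurePosixPath("/var/lib/ecm-agent/cache/ecm/service")


def templates_dir(service: str, version: str) -> PurePosixPath:
    """Directory that holds the configuration templates of a service version."""
    return TEMPLATE_ROOT / service / version / "package" / "templates"


def deployed_conf_dir(service: str) -> PurePosixPath:
    """Directory of the deployed configuration files of a service.

    Assumption: the directory name is the lowercase service name plus "-conf".
    """
    return PurePosixPath("/etc/ecm") / f"{service.lower()}-conf"
```

For the Hue example, templates_dir("HUE", "4.1.0.1.3") gives the directory to edit in step 5, and deployed_conf_dir("HUE") gives the directory to inspect in step 8.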