This topic provides answers to some frequently asked questions about ZooKeeper.

What do I do if the ZooKeeper service is unstable and unexpectedly restarts?

The ZooKeeper service can become unstable for various reasons. The most common cause is that the number of znodes (ZooKeeper data nodes) or the size of a snapshot is excessively large. ZooKeeper keeps all znodes in memory and synchronizes data between the servers in the cluster, so a large number of znodes or a large snapshot makes the service unstable. ZooKeeper is a distributed coordination service and is not intended to be used as a file system. We recommend that you keep the number of znodes below 100,000 and the size of each snapshot below 800 MB.
  • View the number of znodes on the Status tab of the ZooKeeper service page.
  • View the size of snapshots.
    1. On the Configure tab of the ZooKeeper service page, search for the zk_data_dirs parameter and obtain its value. The value indicates the data directory of ZooKeeper.
    2. View the size of snapshots in the data directory.
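
The snapshot size check can also be scripted. The following is a minimal sketch, assuming a Bash shell and a find implementation that accepts the M size suffix (GNU find does). The function name large_snapshots is hypothetical, the 800 MB default is the recommended limit stated above, and snapshot files are assumed to follow ZooKeeper's usual snapshot.<zxid> naming.

```shell
# List snapshot files under a ZooKeeper data directory that exceed a size limit.
# Usage: large_snapshots <data_dir> [limit_mb]
#   data_dir  value of the zk_data_dirs parameter
#   limit_mb  size limit in MiB (default: 800, the recommended maximum)
large_snapshots() {
  local data_dir="$1"
  local limit_mb="${2:-800}"
  # Search recursively, because snapshots are usually stored in a
  # version-2 subdirectory of the data directory.
  find "$data_dir" -type f -name 'snapshot.*' -size +"${limit_mb}"M
}
```

For example, large_snapshots /mnt/disk1/zookeeper prints nothing if every snapshot in that directory is within the limit. Depending on file permissions, the command may need to be run with sudo.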

If the number of znodes or the size of a snapshot is excessively large, check the distribution of znodes. Then, stop upper-layer applications from excessively using ZooKeeper based on the distribution of znodes.
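
To confirm from the command line whether the znode count is above the recommended limit, the output of ZooKeeper's mntr four-letter command can be parsed. This is a sketch under assumptions: nc is installed, ZooKeeper listens on the default client port 2181, and the mntr command is enabled (newer ZooKeeper versions require it to be listed in 4lw.commands.whitelist). The function names are hypothetical.

```shell
# Read `mntr` output on stdin and print the value of zk_znode_count.
znode_count() {
  awk '$1 == "zk_znode_count" { print $2 }'
}

# Read `mntr` output on stdin and succeed (exit status 0) only when the
# znode count exceeds the limit. The default limit is the recommended
# maximum of 100,000 znodes.
too_many_znodes() {
  local limit="${1:-100000}"
  local count
  count="$(znode_count)"
  [ -n "$count" ] && [ "$count" -gt "$limit" ]
}

# Example (port 2181 and localhost are assumptions):
#   echo mntr | nc localhost 2181 | znode_count
#   echo mntr | nc localhost 2181 | too_many_znodes && echo "reduce znode usage"
```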

What do I do if an error message that contains "Too many connections" is reported?

A possible cause is that the number of connections from a single client IP address to a ZooKeeper node exceeds the upper limit.

On the Configure tab of the ZooKeeper service page, search for the maxClientCnxns parameter, which specifies the maximum number of connections that a single client IP address can establish to each ZooKeeper node. Then, set the parameter to a larger value based on your business requirements and restart ZooKeeper for the configuration to take effect.

If the issue persists after you modify the maxClientCnxns parameter, view the connection metric on the Status tab. If the number of connections continues to increase, check the processes of the ZooKeeper client. For example, check whether all unnecessary connections are closed. After you resolve the issues, the ZooKeeper client can run as expected.
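
To see which client IP addresses hold the most connections, the output of ZooKeeper's cons four-letter command can be summarized. This is a minimal sketch, assuming that nc is installed, that ZooKeeper listens on the default client port 2181, that the cons command is enabled, and that each connection line has the form " /<ip>:<port>[...](...)". The function name connections_per_ip is hypothetical.

```shell
# Read `cons` output on stdin and print "<client_ip> <connection_count>",
# sorted so that the client with the most connections comes first.
connections_per_ip() {
  # Keep only connection lines and reduce each one to the client IP address.
  sed -n 's|^ */\([0-9.]*\):[0-9]*\[.*|\1|p' \
    | sort \
    | uniq -c \
    | sort -rn \
    | awk '{ print $2, $1 }'
}

# Example (port 2181 and localhost are assumptions):
#   echo cons | nc localhost 2181 | connections_per_ip
```

A client IP address whose count is close to the value of maxClientCnxns is a likely source of the "Too many connections" error.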

How do I smoothly migrate data from the data directory of ZooKeeper to a new data directory?

If you want to change the data directory of ZooKeeper to a new one due to issues such as insufficient disk space or poor disk performance, perform the following steps on each node in your cluster to achieve smooth data migration without interrupting the ZooKeeper service.
Note: In the following example, the data directory of ZooKeeper needs to be changed from /mnt/disk1/zookeeper to /mnt/disk2/zookeeper. The EMR cluster consists of three nodes: emr-worker-2, emr-header-1, and emr-worker-1. emr-worker-2 is the leader. emr-header-1 and emr-worker-1 are followers. When you migrate data, we recommend that you perform operations on the followers first.
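
To find out which node is currently the leader before you start, the Mode line in the output of ZooKeeper's srvr four-letter command can be read. This is a sketch that assumes nc is installed and ZooKeeper listens on the default client port 2181. The function name zk_mode is hypothetical.

```shell
# Read `srvr` output on stdin and print the server mode
# (for example, leader or follower).
zk_mode() {
  awk -F': ' '$1 == "Mode" { print $2 }'
}

# Example for the three nodes in this topic (port 2181 is an assumption):
#   for host in emr-header-1 emr-worker-1 emr-worker-2; do
#     printf '%s: ' "$host"
#     echo srvr | nc "$host" 2181 | zk_mode
#   done
```
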
  1. Change the data directory and save the configurations.
    1. On the Configure tab of the ZooKeeper service page, search for the zk_data_dirs parameter and change the value of this parameter to /mnt/disk2/zookeeper.
    2. In the upper-right corner of the Configure tab, click Save.
    3. In the Confirm Changes dialog box, configure Description and click OK.
  2. Deploy client configurations.
    1. In the upper-right corner of the Configure tab of the ZooKeeper service page, click Deploy Client Configuration.
    2. In the Cluster Activities dialog box, configure Description and click OK.
    3. In the Confirm message, click OK.
  3. Optional: Check the new data directory.
    1. Log on to the master node of the EMR cluster in SSH mode. For more information, see Log on to a cluster.
    2. Run the following command to view the value of the dataDir parameter in the zoo.cfg configuration file:
       cat /etc/ecm/zookeeper-conf/zoo.cfg
      If the value of the dataDir parameter in the command output is /mnt/disk2/zookeeper, the data directory of ZooKeeper is changed.
  4. Stop the emr-header-1 node.
    1. On the Component Deployment tab of the ZooKeeper service page, find the emr-header-1 node and click Stop in the Actions column.
    2. In the Cluster Activities dialog box, configure Description and click OK.
    3. In the Confirm message, click OK.
    4. In the upper-right corner of the Component Deployment tab of the ZooKeeper service page, click History to view the stop progress. The emr-header-1 node is stopped if the value of the Status parameter is Successful.
  5. Copy data to the new data directory.
    1. Log on to the emr-header-1 node, which is the master node of the EMR cluster, in SSH mode. For more information, see Log on to a cluster.
    2. Run the following command to copy the data to the new data directory and change the owner of the new directory to the hadoop user for the emr-header-1 node:
      sudo rm -rf /mnt/disk2/zookeeper && sudo cp -rf /mnt/disk1/zookeeper /mnt/disk2/zookeeper && sudo chown hadoop:hadoop -R /mnt/disk2/zookeeper
  6. Start the emr-header-1 node.
    1. On the Component Deployment tab of the ZooKeeper service page, find the emr-header-1 node and click Start in the Actions column.
    2. In the Cluster Activities dialog box, configure Description and click OK.
    3. In the Confirm message, click OK.

      Refresh the page until the value of the Status parameter for the emr-header-1 node is Good.

  7. Find the emr-worker-1 node and repeat Step 4 to Step 6.
  8. Find the emr-worker-2 node and repeat Step 4 to Step 6.
    Data is migrated to the new data directory after the value of the Status parameter for all nodes is Good.
    Note: The emr-worker-2 node was originally the leader. After you click Stop for the node in the Actions column, the emr-worker-1 or emr-header-1 node becomes the leader, and the emr-worker-2 node becomes a follower after it is started again.
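
The checks in Step 3 and after Step 5 can be scripted on each node. The following is a minimal sketch, assuming a Bash shell and that ZooKeeper on the node is stopped while the directories are compared. The function names configured_data_dir and same_contents are hypothetical. The comparison covers file contents only, not ownership.

```shell
# Print the value of the dataDir parameter from a zoo.cfg file.
# Usage: configured_data_dir <path_to_zoo.cfg>
configured_data_dir() {
  local cfg="$1"
  # Tolerate optional whitespace around the equal sign.
  sed -n 's/^[[:space:]]*dataDir[[:space:]]*=[[:space:]]*//p' "$cfg"
}

# Succeed (exit status 0) only if both directories have identical contents.
# Usage: same_contents <old_data_dir> <new_data_dir>
same_contents() {
  diff -rq "$1" "$2" > /dev/null
}

# Example with the paths used in this topic:
#   configured_data_dir /etc/ecm/zookeeper-conf/zoo.cfg
#   same_contents /mnt/disk1/zookeeper /mnt/disk2/zookeeper && echo "copy verified"
```

If the files in the data directory are not readable by the current user, run the comparison with sudo.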