E-MapReduce: Install Hue in DataLake clusters

Last Updated: Aug 25, 2023

In the new E-MapReduce (EMR) console, Hue is no longer available when you create a DataLake cluster of EMR V5.8.0 or a later minor version, or of EMR V3.42.0 or a later minor version. This topic describes how to build and install Hue as the root user in an EMR DataLake cluster and how to access the web UI of Hue.

Prerequisites

A DataLake cluster is created. For more information, see Create a cluster.

Limits

You must turn on Assign Public Network IP for the master node group when you create the DataLake cluster.

Procedure

  1. Log on to the master node of the DataLake cluster. For more information, see Log on to a cluster.

  2. Download the hue-release package from the Git repository. Upload the package to the master node of the DataLake cluster and run the following command to decompress the package:

    cd $hue_dir
    tar zxf hue-release-4.10.0.tar.gz

    $hue_dir specifies the directory to which the package is uploaded. In this example, $hue_dir is set to /tmp/.
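Before decompressing, it can help to confirm that the upload completed. The following is a minimal sketch, not part of the original procedure: the function name extract_hue is hypothetical, and the archive name is the one used in this example.

```shell
# Hypothetical helper: extract a Hue release archive into the directory that
# holds it, failing early if the archive is missing or unreadable.
extract_hue() {
  # $1: directory that holds the archive, $2: archive file name
  archive="$1/$2"
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  # List the archive first so that a truncated upload is caught before extraction.
  tar tzf "$archive" >/dev/null || return 1
  tar zxf "$archive" -C "$1"
}
# Example call: extract_hue /tmp hue-release-4.10.0.tar.gz
```

If the function returns a non-zero status, upload the package again before you continue.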

  3. Run the following commands to install dependencies:

    sudo yum -y install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain gcc gcc-c++ krb5-devel libffi-devel libxml2-devel libxslt-devel make mysql-devel openldap-devel python3-devel sqlite-devel gmp-devel rsync
    sudo yum -y install nodejs npm
    sudo yum -y install git
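
After the packages are installed, a quick check that the main build tools are on PATH can catch a failed installation early. This is a sketch under assumptions: missing_tools is a hypothetical helper, and the command names in the example call are the ones these packages usually provide, which may differ on your image.

```shell
# Hypothetical helper: print the name of each given command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
# Example call (assumed command names): missing_tools ant gcc g++ make node npm git rsync
```

Empty output means that every listed command was found.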
  4. Create a database and modify Hue-related configurations to connect to MySQL.

    1. Run the following command to log on to MySQL Shell:

      mysql -u root -pEMRroot1234
      Note

      The username that is used to log on to MySQL Shell is root, and the password is EMRroot1234.

    2. Run the following commands to create a database named hue and an account named hue, and grant all permissions on the database to the hue account:

      CREATE DATABASE IF NOT EXISTS hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
      CREATE USER 'hue'@'localhost' IDENTIFIED BY '******';
      GRANT ALL ON hue.* TO 'hue'@'localhost';
      FLUSH PRIVILEGES;
      Note

      In the preceding commands, ****** specifies the password of the hue account. You can change the password based on your business requirements.

    3. Go to the $hue_dir/hue-release-4.10.0/desktop/conf directory, rename pseudo-distributed.ini.tmpl to pseudo-distributed.ini, and then modify the configurations in the pseudo-distributed.ini file based on your business requirements. In this example, you need to modify the configurations under [desktop] and [[database]].

        [desktop]
          gunicorn_work_class=sync
        [[database]]
          # Database engine is typically one of:
          # postgresql_psycopg2, mysql, sqlite3 or oracle.
          #
          # Note that for sqlite3, 'name', below is a path to the filename. For other backends, it is the database name
          # Note for Oracle, options={"threaded":true} must be set in order to avoid crashes.
          # Note for Oracle, you can use the Oracle Service Name by setting "host=" and "port=" and then "name=<host>:<port>/<service_name>".
          # Note for MariaDB use the 'mysql' engine.
          engine=mysql
          host=localhost
          port=3306
          user=hue
          password=******
          # conn_max_age option to make database connection persistent value in seconds
          # https://docs.djangoproject.com/en/1.11/ref/databases/#persistent-connections
          ## conn_max_age=0
          # Execute this script to produce the database password. This will be used when 'password' is not set.
          ## password_script=/path/script
          name=hue
          ## options={}
          # Database schema, to be used only when public schema is revoked in postgres
          ## schema=public

      | Parameter | Description |
      | --- | --- |
      | gunicorn_work_class | Set the value to sync. |
      | engine | The database engine. In this example, set the value to mysql. |
      | host | The hostname that is used to access the database. In MySQL, the default value is localhost. |
      | port | The port number that is used for communication with the database. In MySQL, the default value is 3306. |
      | user | Set this parameter to the name of the account that you created. In this example, the account name is hue. |
      | password | Set this parameter to the password for the account that you created. In this example, the password is ******. |
      | name | Set this parameter to the name of the database that you created. In this example, the database name is hue. |
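
After you edit the file, you may want to read the values back to confirm that they are no longer commented out. The following sketch assumes a hypothetical helper named ini_value. It does not track sections, so it is only reliable when the key appears uncommented once in the file.

```shell
# Hypothetical helper: print the value of the first uncommented "key=value"
# line for the given key. Lines that start with "#" or "##" are ignored
# because the pattern only allows whitespace before the key.
ini_value() {
  # $1: path to the ini file, $2: key name
  sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}
# Example call: ini_value pseudo-distributed.ini engine
```

An empty result for engine, user, or name suggests that the line is still commented out or was not saved.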

  5. Run the following commands to configure environment variables:

    export PYTHON_VER=python3.6
    export SKIP_PYTHONDEV_CHECK=true
    Note

    In this topic, Hue is built by using Python 3.6 that is deployed on the master node. You can specify the Python version based on your business requirements.
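
Because the build uses the interpreter named by PYTHON_VER, it may be worth confirming that this name resolves on the node before you start the build. A minimal sketch, in which check_python_ver is a hypothetical helper:

```shell
# Hypothetical helper: succeed only if PYTHON_VER is set and the interpreter
# it names can be found on PATH.
check_python_ver() {
  if [ -z "${PYTHON_VER:-}" ]; then
    echo "PYTHON_VER is not set" >&2
    return 1
  fi
  command -v "$PYTHON_VER" >/dev/null 2>&1
}
# Example call: check_python_ver || echo "interpreter $PYTHON_VER not found"
```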

  6. Download dependencies and install Hue. If the node can reliably access GitHub, use automatic download and installation. Otherwise, manually download the required software packages and install Hue.

    Automatic download and installation

    Run the following commands to install Hue. During the automatic download and installation process, the node accesses GitHub and downloads the required dependencies.

    Hue is installed in the /opt/apps/ directory.

    cd $hue_dir/hue-release-4.10.0
    rm -rf desktop/core/ext-py/
    rm -rf /opt/apps/hue
    PREFIX=/opt/apps make install
    Note

    If the node cannot access GitHub in a stable manner, Hue may fail to be installed. In this case, we recommend that you manually download all required software packages and install Hue.
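
One way to choose between the two installation methods is to probe GitHub from the node first. The following is a sketch under assumptions: install_mode and the PROBE_CMD override are hypothetical names, the 10-second timeout is an arbitrary choice, and a single successful request does not guarantee that the full download will succeed.

```shell
# Hypothetical helper: print "automatic" if the probe succeeds, otherwise "manual".
# By default the probe is a curl request to GitHub with a 10-second limit.
# PROBE_CMD is a made-up override that lets you substitute another probe command.
install_mode() {
  if ${PROBE_CMD:-curl -fsS --max-time 10 -o /dev/null https://github.com} >/dev/null 2>&1; then
    echo automatic
  else
    echo manual
  fi
}
# Example call: install_mode
```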

    Manual download and installation

    1. Modify the last two lines of the $hue_dir/hue-release-4.10.0/desktop/core/requirements.txt file so that the build does not download these dependencies from GitHub. The following example shows the content before and after modification.

      # Before modification
      # git+https://github.com/gethue/django-babel.git
      # git+https://github.com/gethue/django-mako.git
      # After modification
      django-babel
      django-mako
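
The same edit can be scripted with sed. This sketch assumes that the two entries appear in requirements.txt as bare git+https lines without a leading comment marker. rewrite_requirements is a hypothetical helper, and it keeps a backup copy with the .bak suffix.

```shell
# Hypothetical helper: replace the two GitHub-based entries with plain package
# names. The original file is kept next to it with a .bak suffix.
rewrite_requirements() {
  # $1: path to requirements.txt
  sed -i.bak \
    -e 's#^git+https://github.com/gethue/django-babel\.git$#django-babel#' \
    -e 's#^git+https://github.com/gethue/django-mako\.git$#django-mako#' \
    "$1"
}
# Example call: rewrite_requirements "$hue_dir/hue-release-4.10.0/desktop/core/requirements.txt"
```

Lines that do not match either pattern are left unchanged, so check the last two lines of the file after you run the helper.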
    2. Go to the root directory of Hue and run the following commands to install Hue. Hue is installed in the /opt/apps/ directory.

      rm -rf desktop/core/ext-py/
      rm -rf /opt/apps/hue
      PREFIX=/opt/apps make install
    3. Download the django-mako and django-babel packages from the Git repository. Upload the packages to the node where the Hue package resides and run the following commands to decompress the packages:

      unzip django-babel-master.zip
      unzip django-mako-master.zip

      In this example, the packages are uploaded to the /tmp/ directory of the master node. The decompressed packages are stored in the /tmp/django-babel-master and /tmp/django-mako-master directories.

    4. Run the following commands in the root directory of each decompressed package (django-babel and django-mako) to install the package:

      source /opt/apps/hue/build/env/bin/activate
      pip install -e .
  7. Run the following commands to start and use Hue:

    source /opt/apps/hue/build/env/bin/activate
    sudo useradd hue
    supervisor

    You can run the following commands to create a superuser and use the superuser to access the web UI of Hue:

    source /opt/apps/hue/build/env/bin/activate
    hue createsuperuser  # Starts an interactive prompt. Enter the username and password of the superuser.

  8. Enter an address in the http://<Public IP address of the master node>:8000 format in the address bar of your browser and press Enter to access the web UI of Hue.

    Note

    In this example, only the basic settings of Hue are configured. To use other features of Hue, view the configurations of Hue, modify the /opt/apps/hue/desktop/conf/pseudo-distributed.ini configuration file, and then restart Hue.

References

Official O&M documentation of Hue: ADMINISTRATOR.