This topic describes how to use the Impala shell tool in E-MapReduce (EMR).


Prerequisite: An EMR Hadoop cluster has been created, and Impala was selected from the optional services during cluster creation. For more information, see Create a cluster.


  1. Log on to the cluster in SSH mode.
    For more information, see Log on to a cluster.
  2. Run the following command to start the Impala shell tool:
    • Common cluster:
      impala-shell
    • High-security cluster:
      impala-shell -k
      Note: Make sure that the account that you use to connect to Impala has passed the security authentication. For more information, see Configure MIT Kerberos authentication.
    If the returned information contains the following content, the Impala shell tool is started:
    Welcome to the Impala shell.
    Before you connect to Impala, you can run the impala-shell --help command to view the options that the Impala shell tool supports. The output includes the following options:
      -h, --help            show this help message and exit
      -i IMPALAD, --impalad=IMPALAD
                            <host:port> of impalad to connect to
                            [default: emr-header-1.cluster-20****:2****]
      -q QUERY, --query=QUERY
                            Execute a query without the shell [default: none]
      -f QUERY_FILE, --query_file=QUERY_FILE
                            Execute the queries in the query file, delimited by ;.
                            If the argument to -f is "-", then queries are read
                            from stdin and terminated with ctrl-d. [default: none]
      -k, --kerberos        Connect to a kerberized impalad [default: False]
      -o OUTPUT_FILE, --output_file=OUTPUT_FILE
                            If set, query results are written to the given file.
                            Results from multiple semicolon-terminated queries
                            will be appended to the same file [default: none]
      -B, --delimited       Output rows in delimited mode [default: False]
      --print_header        Print column names in delimited mode when pretty-
                            printed. [default: False]
      --output_delimiter=OUTPUT_DELIMITER
                            Field delimiter to use for output in delimited mode
                            [default: \t]
      -s KERBEROS_SERVICE_NAME, --kerberos_service_name=KERBEROS_SERVICE_NAME
                            Service name of a kerberized impalad [default: impala]
      -V, --verbose         Verbose output [default: True]
      -p, --show_profiles   Always display query profiles after execution
                            [default: False]
      --quiet               Disable verbose output [default: False]
      -v, --version         Print version information [default: False]
      -c, --ignore_query_failure
                            Continue on query failure [default: False]
      -r, --refresh_after_connect
                            Refresh Impala catalog after connecting
                            [default: False]
      -d DEFAULT_DB, --database=DEFAULT_DB
                            Issues a use database command on startup
                            [default: none]
      -l, --ldap            Use LDAP to authenticate with Impala. Impala must be
                            configured to allow LDAP authentication.
                            [default: False]
      -u USER, --user=USER  User to authenticate with. [default: root]
      --ssl                 Connect to Impala via SSL-secured connection
                            [default: False]
      --ca_cert=CA_CERT     Full path to certificate file used to authenticate
                            Impala's SSL certificate. May either be a copy of
                            Impala's certificate (for self-signed certs) or the
                            certificate of a trusted third-party CA. If not set,
                            but SSL is enabled, the shell will NOT verify Impala's
                            server certificate [default: none]
      --config_file=CONFIG_FILE
                            Specify the configuration file to load options. The
                            following sections are used: [impala],
                            [impala.query_options]. Section names are case
                            sensitive. Specifying this option within a config file
                            will have no effect. Only specify this as an option in
                            the commandline. [default: /root/.impalarc]
      --live_summary        Print a query summary every 1s while the query is
                            running. [default: False]
      --live_progress       Print a query progress every 1s while the query is
                            running. [default: False]
      --auth_creds_ok_in_clear
                            If set, LDAP authentication may be used with an
                            insecure connection to Impala. WARNING: Authentication
                            credentials will therefore be sent unencrypted, and
                            may be vulnerable to attack. [default: none]
      --ldap_password_cmd=LDAP_PASSWORD_CMD
                            Shell command to run to retrieve the LDAP password
                            [default: none]
      --var=KEYVAL          Defines a variable to be used within the Impala
                            session. Can be used multiple times to set different
                            variables. It must follow the pattern "KEY=VALUE", KEY
                            starts with an alphabetic character and contains
                            alphanumeric characters or underscores. [default:
                            none]
      -Q QUERY_OPTIONS, --query_option=QUERY_OPTIONS
                            Sets the default for a query option. Can be used
                            multiple times to set different query options. It must
                            follow the pattern "KEY=VALUE", KEY must be a valid
                            query option. Valid query options can be listed by
                            command 'set'. [default: none]
  3. Optional: Run the quit; command to exit the Impala shell tool.
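
The options listed above can also be combined to run a query without entering the interactive shell. The following is a minimal sketch, not an official EMR script: the function name run_impala_query and the IMPALA_SHELL variable are hypothetical names introduced here for illustration. It uses only the -B, --print_header, -q, and -o options described in the help output, and it assumes that impala-shell is on the PATH of the node that you logged on to.

```shell
#!/bin/sh
# Hypothetical helper: run a single query non-interactively and write the
# result to a file in delimited mode (tab-separated by default).
# IMPALA_SHELL is an assumed override hook. Set it to another command to
# preview the arguments without connecting to Impala.
run_impala_query() {
    query=$1          # SQL statement to execute
    output_file=$2    # file that receives the query results

    # -B              output rows in delimited mode
    # --print_header  print column names in delimited mode
    # -q              execute the query without the interactive shell
    # -o              write the query results to the given file
    "${IMPALA_SHELL:-impala-shell}" -B --print_header \
        -q "$query" \
        -o "$output_file"
}

# Example call (commented out so that sourcing this file has no side effects):
# run_impala_query "SHOW DATABASES" /tmp/databases.tsv
```

On a high-security cluster, the same call would also need the -k option, and the account must have passed Kerberos authentication first.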