By default, E-MapReduce provides a Hive environment. You can use Hive to create tables and perform operations on the tables and the data in them.

Prerequisites

  • A project is created. For more information, see Manage projects.
  • A Hive SQL script file, such as uservisits_aggre_hdfs.hive, is uploaded to a directory in OSS, such as oss://path/to/uservisits_aggre_hdfs.hive.

Procedure

  1. Log on to the Alibaba Cloud E-MapReduce console with an Alibaba Cloud account.
  2. Click the Data Platform tab.
  3. In the Projects section, click Edit Job in the row of a project.
  4. In the left-side navigation pane, right-click the required folder and choose Create Job from the shortcut menu.
    Note: You can also right-click the folder to create a subfolder, rename the folder, or delete the folder.
  5. In the dialog box that appears, set the Name and Description parameters, and select Hive from the Job Type drop-down list.
    Selecting Hive creates a Hive job. A Hive job is submitted with the following command syntax:
    hive [user provided parameters]
  6. Click OK.
  7. In the Content field, specify the command line arguments required to submit the job.
    For example, to use the Hive script uploaded to OSS, enter the following arguments:
    -f ossref://path/to/uservisits_aggre_hdfs.hive

    The content of uservisits_aggre_hdfs.hive is as follows:

    -- Work in the default database.
    USE DEFAULT;
    -- Drop any previous definition, then define the input table.
    DROP TABLE IF EXISTS uservisits;
    CREATE EXTERNAL TABLE IF NOT EXISTS uservisits (sourceIP STRING,destURL STRING,visitDate STRING,adRevenue DOUBLE,userAgent STRING,countryCode STRING,languageCode STRING,searchWord STRING,duration INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS SEQUENCEFILE LOCATION '/HiBench/Aggregation/Input/uservisits';
    -- Drop any previous definition, then define the output table.
    DROP TABLE IF EXISTS uservisits_aggre;
    CREATE EXTERNAL TABLE IF NOT EXISTS uservisits_aggre (sourceIP STRING, sumAdRevenue DOUBLE) STORED AS SEQUENCEFILE LOCATION '/HiBench/Aggregation/Output/uservisits_aggre';
    -- Sum the ad revenue per source IP and overwrite the output table with the result.
    INSERT OVERWRITE TABLE uservisits_aggre SELECT sourceIP, SUM(adRevenue) FROM uservisits GROUP BY sourceIP;
    Note: To have the system complete the OSS path of the Hive script for you, click Enter an OSS path in the lower part of the page. In the dialog box that appears, set File Prefix to OSSREF and specify the file in File Path. File Prefix must be set to OSSREF to make sure that E-MapReduce can download the file.
  8. Click Save.
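
The relationship between the OSS path in the prerequisites, the value entered in the Content field, and the hive [user provided parameters] syntax can be sketched in shell as follows. This is only an illustration under stated assumptions: the helper name to_ossref is hypothetical, the path is the placeholder used above, and the sketch assumes that the Content value is simply appended to hive. It prints the strings and does not run Hive.

```shell
# Hypothetical helper (not part of E-MapReduce): replace the oss:// scheme
# with ossref:// and leave any other path unchanged.
to_ossref() {
  case "$1" in
    oss://*) printf 'ossref://%s\n' "${1#oss://}" ;;
    *)       printf '%s\n' "$1" ;;
  esac
}

# Placeholder script location from the prerequisites.
script_oss_path='oss://path/to/uservisits_aggre_hdfs.hive'

# Value for the Content field: the arguments only, without the leading "hive".
job_content="-f $(to_ossref "$script_oss_path")"

# Assumption: the full command line is "hive" followed by the Content value.
full_command="hive $job_content"

echo "$job_content"
echo "$full_command"
```

In the console, the Enter an OSS path dialog box completes the path for you, so a helper like this is not needed in practice. The sketch only makes explicit that the Content field holds the arguments and not the hive command itself.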