When users apply for a cluster in E-MapReduce, a Hive environment is provided by default. Users can create and operate on their tables and data directly in Hive. The procedure is as follows.
Prepare the Hive script in advance, for example:
DROP TABLE uservisits;
CREATE EXTERNAL TABLE IF NOT EXISTS uservisits (sourceIP STRING, destURL STRING, visitDate STRING, adRevenue DOUBLE, userAgent STRING, countryCode STRING, languageCode STRING, searchWord STRING, duration INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS SEQUENCEFILE LOCATION '/HiBench/Aggregation/Input/uservisits';
DROP TABLE uservisits_aggre;
CREATE EXTERNAL TABLE IF NOT EXISTS uservisits_aggre ( sourceIP STRING, sumAdRevenue DOUBLE) STORED AS SEQUENCEFILE LO
INSERT OVERWRITE TABLE uservisits_aggre SELECT sourceIP, SUM(adRevenue) FROM uservisits GROUP BY sourceIP;
Save the script to a file, such as uservisits_aggre_hdfs.hive, and then upload it to an OSS directory (for example, oss://path/to/uservisits_aggre_hdfs.hive).
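As a rough sketch of this upload step, the shell snippet below builds the upload command for the script file. It assumes the ossutil command-line tool is installed and configured, which this document does not state. The variable names are hypothetical, and the destination is the placeholder path used above. The command is only printed for review, not run.

```shell
# File name and destination are the examples used in this document.
SCRIPT="uservisits_aggre_hdfs.hive"
DEST="oss://path/to/uservisits_aggre_hdfs.hive"

# Assumption: ossutil is the upload tool; uploading through the OSS console works as well.
UPLOAD_CMD="ossutil cp $SCRIPT $DEST"

# Print the command instead of running it, so the placeholder path can be replaced first.
echo "$UPLOAD_CMD"
```

Replace the placeholder destination with a real bucket path before running the printed command.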
Log on to the Alibaba Cloud E-MapReduce Console Job List.
Click Create a job in the upper right corner to enter the job creation page.
Enter the job name.
Select the Hive job type to create a Hive job. In the background, a job of this type is submitted with the following command:
hive [user provided parameters]
In the Parameters box, enter the parameters that follow the hive command. For example, to run a Hive script that has been uploaded to OSS, enter the following:
-f ossref://path/to/uservisits_aggre_hdfs.hive
You can also click Select OSS path to browse OSS and select the script. The system then automatically completes the path of the Hive script on OSS. Switch the Hive script prefix to ossref (click Switch resource type) to ensure that E-MapReduce downloads the file properly.
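To illustrate how the Parameters value combines with the hive command described above, here is a minimal shell sketch. The variable names are hypothetical, and the sketch assumes the script is passed with Hive's -f option. It only prints the resulting command line and does not submit a job.

```shell
# OSS location of the script, as in the example above.
OSS_PATH="oss://path/to/uservisits_aggre_hdfs.hive"

# Switch the prefix from oss:// to ossref:// (strip the old scheme, prepend the new one).
REF_PATH="ossref://${OSS_PATH#oss://}"

# Assumption: -f tells Hive to run the statements in the given file.
PARAMS="-f $REF_PATH"

# The job is submitted as: hive [user provided parameters]
CMD="hive $PARAMS"
echo "$CMD"
```

With the example path, this prints hive -f ossref://path/to/uservisits_aggre_hdfs.hive, which shows why the Parameters box needs only the part after hive.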
Select the policy for failed operations.
Click OK to complete the Hive job configuration.