E-MapReduce: ECS application role (used in EMR V3.32.0 or an earlier minor version, or EMR V4.5.0 or an earlier minor version)

Last Updated: Jan 17, 2024

E-MapReduce (EMR) provides MetaService, which serves as a special ECS application role. In EMR V3.32.0 and earlier V3.X.X versions as well as in EMR V4.5.0 and earlier V4.X.X versions, when you create a cluster, this role is automatically bound to your cluster. Applications that run on your EMR cluster use this role to access other Alibaba Cloud resources without an AccessKey pair. This avoids the disclosure of the AccessKey pair in a configuration file.

Prerequisites

This role is authorized. For more information, see Assign roles to an Alibaba Cloud account.

Background information

MetaService allows you to access only Object Storage Service (OSS), Log Service, and Message Service (MNS) without an AccessKey pair.

Permissions

The default role AliyunEmrEcsDefaultRole is configured with the policy AliyunEmrECSRolePolicy. The following table describes the permissions in this policy, which cover OSS (oss), Tablestore (ots), and Data Lake Formation (dlf) actions.
| Permission (Action) | Description |
| --- | --- |
| oss:PutObject | Uploads a file or folder. |
| oss:GetObject | Queries a file or folder. |
| oss:ListObjects | Queries files. |
| oss:DeleteObject | Deletes a file. |
| oss:ListBuckets | Queries buckets. |
| oss:AbortMultipartUpload | Terminates a multipart upload event. |
| oss:ListMultipartUploads | Queries all ongoing multipart upload events. |
| oss:RestoreObject | Restores an Archive or Cold Archive object. |
| oss:GetBucketInfo | Queries the information about a bucket. |
| oss:ListObjectVersions | Queries the versions of all objects in a bucket, including delete markers. |
| oss:DeleteObjectVersion | Deletes a specific version of an object. |
| oss:PostDataLakeStorageFileOperation | Accesses OSS-HDFS. |
| ots:CreateTable | Creates a table based on the specified table schema. |
| ots:DeleteTable | Deletes a specific table from the current instance. |
| ots:GetRow | Reads data in a single row based on a specific primary key. |
| ots:PutRow | Inserts data into a specific row. |
| ots:UpdateRow | Updates data in a specific row. |
| ots:DeleteRow | Deletes a row of data. |
| ots:GetRange | Reads data within a specific value range of the primary key. |
| ots:BatchWriteRow | Inserts, modifies, or deletes multiple rows of data from one or more tables at a time. |
| ots:BatchGetRow | Reads multiple rows of data from one or more tables at a time. |
| ots:ComputeSplitPointsBySize | Logically splits data in a table into several shards whose sizes are close to the specified size, and returns the split points between the shards and the prompt about hosts where the partitions reside. |
| ots:StartLocalTransaction | Creates a local transaction based on a specified partition key value and queries the ID of the local transaction. |
| ots:CommitTransaction | Commits a local transaction. |
| ots:AbortTransaction | Aborts a local transaction. |
| dlf:BatchCreatePartitions | Creates multiple partitions at a time. |
| dlf:BatchCreateTables | Creates multiple tables at a time. |
| dlf:BatchDeletePartitions | Deletes multiple partitions at a time. |
| dlf:BatchDeleteTables | Deletes multiple tables at a time. |
| dlf:BatchGetPartitions | Queries information about multiple partitions at a time. |
| dlf:BatchGetTables | Queries information about multiple tables at a time. |
| dlf:BatchUpdatePartitions | Updates multiple partitions at a time. |
| dlf:BatchUpdateTables | Updates multiple tables at a time. |
| dlf:CreateDatabase | Creates a database. |
| dlf:CreateFunction | Creates a function. |
| dlf:CreatePartition | Creates a partition. |
| dlf:CreateTable | Creates a table. |
| dlf:DeleteDatabase | Deletes a database. |
| dlf:DeleteFunction | Deletes a function. |
| dlf:DeletePartition | Deletes a partition. |
| dlf:DeleteTable | Deletes a table. |
| dlf:GetDatabase | Queries information about a database. |
| dlf:GetFunction | Queries information about a function. |
| dlf:GetPartition | Queries information about a partition. |
| dlf:GetTable | Queries information about a table. |
| dlf:ListCatalogs | Queries catalogs. |
| dlf:ListDatabases | Queries databases. |
| dlf:ListFunctionNames | Queries the names of the functions. |
| dlf:ListFunctions | Queries functions. |
| dlf:ListPartitionNames | Queries the names of the partitions. |
| dlf:ListPartitions | Queries partitions. |
| dlf:ListPartitionsByExpr | Queries metadata table partitions by conditions. |
| dlf:ListPartitionsByFilter | Queries metadata table partitions by conditions. |
| dlf:ListTableNames | Queries the names of tables. |
| dlf:ListTables | Queries tables. |
| dlf:RenamePartition | Renames a partition. |
| dlf:RenameTable | Renames a table. |
| dlf:UpdateDatabase | Updates a database. |
| dlf:UpdateFunction | Updates a function. |
| dlf:UpdateTable | Updates a table. |
| dlf:UpdateTableColumnStatistics | Updates the statistics of a metadata table. |
| dlf:GetTableColumnStatistics | Queries the statistics of a metadata table. |
| dlf:DeleteTableColumnStatistics | Deletes the statistics of a metadata table. |
| dlf:UpdatePartitionColumnStatistics | Updates the statistics of a partition. |
| dlf:GetPartitionColumnStatistics | Queries the statistics of a partition. |
| dlf:DeletePartitionColumnStatistics | Deletes the statistics of a partition. |
| dlf:BatchGetPartitionColumnStatistics | Queries the statistics of multiple partitions at a time. |
| dlf:CreateLock | Creates a metadata lock. |
| dlf:UnLock | Unlocks a specific metadata lock. |
| dlf:AbortLock | Aborts a metadata lock. |
| dlf:RefreshLock | Refreshes a metadata lock. |
| dlf:GetLock | Queries information about a metadata lock. |
| dlf:GetAsyncTaskStatus | Queries the status of an asynchronous task. |
| dlf:DeltaGetPermissions | Queries permissions. |
| dlf:GetPermissions | Queries information about data permissions. |
| dlf:GetServiceInfo | Queries information about a service. |
| dlf:GetRoles | Queries information about roles in data permissions. |
| dlf:CheckPermissions | Verifies data permissions. |

Important: Modify or delete the AliyunEmrEcsDefaultRole role with caution. Changing or deleting the role can cause cluster creation or job runs to fail.

Data sources that support MetaService

MetaService allows you to access OSS, Log Service, and MNS. You can use an EMR SDK in your EMR cluster to read data from and write data to the preceding data sources without an AccessKey pair.

By default, only access to OSS is enabled. If you want to read data from and write data to Log Service and MNS, log on to the RAM console and configure the required permissions for the AliyunEmrEcsDefaultRole role. For more information, see RAM console.

For more information about how to authorize a RAM role, see Grant permissions to a RAM role.
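
As an alternative to the console steps above, the following is a minimal sketch that attaches a system policy to the role by using the Alibaba Cloud CLI. It assumes that the aliyun CLI is installed and configured with an identity that is allowed to manage RAM. The helper name attach_system_policy and the APPLY switch are hypothetical and exist only for this illustration. By default the helper only prints the command so that you can review it before you run it.

```shell
# Role name taken from this topic.
ROLE_NAME="AliyunEmrEcsDefaultRole"

# attach_system_policy POLICY_NAME
# Hypothetical helper: builds the RAM AttachPolicyToRole call for a system policy.
# Prints the command by default; set APPLY=1 to actually run it.
attach_system_policy() {
  # Replace the positional parameters with the full command line.
  set -- aliyun ram AttachPolicyToRole \
    --PolicyType System \
    --PolicyName "$1" \
    --RoleName "$ROLE_NAME"
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}
```

For example, attach_system_policy AliyunLogFullAccess prints the call that would grant Log Service access. The policy name here is an assumption chosen for illustration. A narrower custom policy is usually a better fit for the principle of least privilege.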

Use MetaService

MetaService allows you to access OSS, Log Service, and MNS without an AccessKey pair. MetaService provides the following benefits:
  • Reduces the risk of AccessKey pair leaks. To minimize the security risk, authorize roles in the RAM console based on the principle of least privilege.
  • Improves user experience. MetaService shortens the OSS path that you need to enter during interactive access to OSS resources.
  • Brings the following benefits for services in your EMR cluster:

    The jobs that you run in the services can access Alibaba Cloud resources (OSS, Log Service, and MNS) without an AccessKey pair.

    Comparison of operations before and after MetaService is used:
    • Run the hadoop fs -ls command to view OSS data.
      • MetaService is not used:
        hadoop fs -ls oss://ZaH******As1s:Ba23N**************sdaBj2@bucket.oss-cn-hangzhou-internal.aliyuncs.com/a/b/c
      • MetaService is used:
        hadoop fs -ls oss://bucket/a/b/c
    • Create an external table in Hive.
      • MetaService is not used:
        CREATE EXTERNAL TABLE test_table(id INT, name string)
                ROW FORMAT DELIMITED
                FIELDS TERMINATED BY '\t'
                LOCATION 'oss://ZaH******As1s:Ba23N**************sdaBj2@bucket.oss-cn-hangzhou-internal.aliyuncs.com/a/b/c';
      • MetaService is used:
        CREATE EXTERNAL TABLE test_table(id INT, name string)
                ROW FORMAT DELIMITED
                FIELDS TERMINATED BY '\t'
                LOCATION 'oss://bucket/a/b/c';
    • Use Spark to view OSS data.
      • MetaService is not used:
        val data = sc.textFile("oss://ZaH******As1s:Ba23N**************sdaBj2@bucket.oss-cn-hangzhou-internal.aliyuncs.com/a/b/c")
      • MetaService is used:
        val data = sc.textFile("oss://bucket/a/b/c")
  • Brings the following benefits for self-deployed services:
    MetaService is an HTTP service. You can access the URL of this HTTP service to obtain a Security Token Service (STS) temporary credential. Then, you can use the STS temporary credential to access Alibaba Cloud resources without an AccessKey pair in self-managed systems.
    Important: A new STS temporary credential is generated 30 minutes before the current one expires. Both credentials remain valid during these 30 minutes.

    For example, you can run curl http://localhost:10011/cluster-region to obtain the region where your cluster resides.

    You can use MetaService to obtain the following information:
    • Region: /cluster-region
    • Role name: /cluster-role-name
    • AccessKey ID: /role-access-key-id
    • AccessKey secret: /role-access-key-secret
    • Security token: /role-security-token
    • Network type: /cluster-network-type
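
The endpoints listed above can be combined into a small helper for self-deployed services. The following is a minimal sketch, not an official client. It assumes that MetaService listens on http://localhost:10011, as in the curl example, and that each endpoint returns its value as plain text. The names META_BASE, meta_url, meta_get, load_sts_credentials, and the STS_* variables are hypothetical.

```shell
# Base address of MetaService; the default matches the curl example in this topic.
META_BASE="${META_BASE:-http://localhost:10011}"

# meta_url PATH
# Prints the full URL for an endpoint path such as /cluster-region.
meta_url() {
  printf '%s/%s\n' "${META_BASE%/}" "${1#/}"
}

# meta_get PATH
# Prints the response body; returns non-zero on HTTP errors or timeouts.
meta_get() {
  curl --silent --show-error --fail --max-time 5 "$(meta_url "$1")"
}

# load_sts_credentials
# Reads the three parts of the STS temporary credential and exports them.
# The variable names are illustrative; map them to whatever your SDK expects.
load_sts_credentials() {
  STS_ACCESS_KEY_ID="$(meta_get /role-access-key-id)" || return 1
  STS_ACCESS_KEY_SECRET="$(meta_get /role-access-key-secret)" || return 1
  STS_SECURITY_TOKEN="$(meta_get /role-security-token)" || return 1
  export STS_ACCESS_KEY_ID STS_ACCESS_KEY_SECRET STS_SECURITY_TOKEN
}
```

After you source these functions on a cluster node, meta_get /cluster-region returns the region and load_sts_credentials populates the exported variables. Because the credential is temporary, a long-running process would need to call load_sts_credentials again periodically rather than cache the values indefinitely.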