This topic describes how to integrate Spark with Ranger and how to configure permissions.

Background information

Spark can be integrated with Ranger for permission control. The integration takes effect only for Spark SQL queries that are executed through Spark Thrift Server, for example, Spark SQL jobs submitted from the Beeline client or over JDBC.
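
As a minimal sketch of what "submitted through Spark Thrift Server" looks like in practice, the following Python snippet assembles a Beeline command line. Spark Thrift Server speaks the HiveServer2 protocol, so the JDBC URL uses the jdbc:hive2 scheme. The host name emr-header-1 and port 10001 are assumptions for illustration only; use the address and port of Spark Thrift Server in your own cluster.

```python
import shlex

def beeline_command(host, port, user, sql):
    """Build a Beeline command line that sends one SQL statement to Spark Thrift Server."""
    # Spark Thrift Server is HiveServer2-compatible, hence the hive2 JDBC scheme.
    url = f"jdbc:hive2://{host}:{port}"
    # -u: JDBC URL, -n: user name, -e: statement to execute
    return ["beeline", "-u", url, "-n", user, "-e", sql]

# Host and port are hypothetical; replace them with your cluster's values.
cmd = beeline_command("emr-header-1", 10001, "foo", "SELECT a FROM testdb.test")
print(shlex.join(cmd))
```

Queries that go through this path are checked against Ranger policies. Jobs that bypass Spark Thrift Server are not covered by the integration.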

Integrate Spark SQL with Ranger

  1. Integrate Hive with Ranger.
    Spark SQL and Hive share permission configurations in Ranger. To control Spark SQL permissions by using Ranger, you must integrate Hive with Ranger. For more information, see Integrate Ranger into Hive.
  2. Enable Spark in Ranger.
    1. On the Cluster Management tab, click a cluster ID.
    2. In the Services section, click Ranger.
    3. On the Ranger Cluster Service page, choose Actions > EnabledSpark in the upper-right corner.
    4. In the Cluster Activities dialog box that appears, configure the parameters as required.
    5. Click OK.
    6. In the Confirm message that appears, click OK.
    7. Click History in the upper-right corner to view the task progress.
  3. After the task is complete, restart Spark Thrift Server.
    1. In the left-side navigation pane, choose Cluster Service > Spark.
    2. On the Spark Cluster Service page, choose Actions > Restart ThriftServer in the upper-right corner.
    3. In the Cluster Activities dialog box that appears, configure the parameters as required.
    4. Click OK.
    5. In the Confirm message that appears, click OK.
    6. Click History in the upper-right corner to view the task progress.

Permission configuration example 1 (configure permissions in Ranger UI)

Grant user foo the SELECT permission on column a of the testdb.test table.

  1. In the left-side navigation pane, click Connect Strings.
  2. On the Public Connect Strings page, click the link in the row where the service is RANGER UI.
  3. On the Ranger page, click emr-hive and configure the permission.
    Because Spark SQL and Hive share permission configurations, permissions for Spark SQL are configured in the emr-hive service.
  4. On the configuration page, click Add New Policy.
  5. In the Create Policy dialog box that appears, configure the parameters as required.

    The following table describes the parameters.

    Parameter      Description
    Policy Name    Enter a policy name.
    database       Add a database in Hive, such as testdb.
    table          Add a table, such as test.
    Hive Column    Add a column, such as a. An asterisk (*) indicates all columns.
    Description    Add a description.
    Select User    Select a user, such as foo.
    Permissions    Select a permission, such as select.

    Note: After you add, delete, or modify a policy, it takes about one minute for the change to take effect.
    After the policy is configured, user foo can access column a of the testdb.test table.
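
For reference, the policy from this example can also be written down as data. The sketch below builds a policy body in the shape used by the Apache Ranger public REST API (version 2) for Hive services, where resources are keyed by database, table, and column. It only constructs and prints the JSON and does not send it anywhere. The policy name foo_select_testdb_test_a and the description are hypothetical, and this topic itself configures the policy only through the Ranger UI, so treat the payload as an illustration rather than a required step.

```python
import json

def build_select_policy(service, policy_name, database, table, column, user):
    """Return a Ranger policy body that grants `user` SELECT on a single column."""
    return {
        "service": service,            # Ranger service that holds the policy
        "name": policy_name,           # hypothetical policy name
        "description": "Allow SELECT on one column",  # hypothetical description
        "isEnabled": True,
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
            "column": {"values": [column]},   # use "*" for all columns
        },
        "policyItems": [
            {
                "users": [user],
                "accesses": [{"type": "select", "isAllowed": True}],
            }
        ],
    }

policy = build_select_policy(
    "emr-hive", "foo_select_testdb_test_a", "testdb", "test", "a", "foo"
)
print(json.dumps(policy, indent=2))
```

The fields map one-to-one to the parameters in the table above: Policy Name to name, database, table, and Hive Column to resources, Select User to users, and Permissions to accesses.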