Use ciphertext management to encrypt and dynamically reference sensitive information. This improves data security and maintenance efficiency in cloud environments. - E-MapReduce

Storing sensitive information, such as AccessKeys or passwords, in plaintext during data development and task execution poses a security risk. The ciphertext management feature lets you encrypt and store sensitive information. You can then dynamically reference the encrypted information during data development and in session configurations. This approach helps prevent data leakage and improves maintenance efficiency.

Create a ciphertext

1. Go to the Ciphertexts page.
   a. Log on to the E-MapReduce console.
   b. In the navigation pane on the left, choose EMR Serverless > Spark.
   c. On the Spark page, click the name of the target workspace.
   d. On the EMR Serverless Spark page, click Ciphertexts in the navigation pane on the left.
2. On the Ciphertexts page, click Add Ciphertext.
3. On the Add Ciphertext page, configure the following parameters and click Confirm.

Parameter	Description
Variable Name	The variable name must be unique within the same workspace. It cannot be modified after creation.
Ciphertext	The ciphertext is case-sensitive. It cannot be modified or viewed again after creation.

Use a ciphertext

Use in a Notebook

In a Notebook job, you can use ciphertexts with the emrssutils.utils library. The DPI engine version must be esr-2.8.0, esr-3.4.0, esr-4.4.0, or a later version.

Example

Import the library and load the ciphertext.

# Example code for getting a ciphertext
import emrssutils.utils
# Dynamically get the decrypted value
password = emrssutils.utils.get_secret(key='<variable_name>')

Reference the ciphertext.

# Example code for referencing a ciphertext
# Parentheses are used for line continuation so that the inline comment is valid Python.
df = (
    spark.read
    .format("jdbc")
    .option("url", "jdbc:mysql://<jdbc_url>")
    .option("dbtable", "<db>.<table>")
    .option("user", "<username>")
    .option("password", password)  # Reference the ciphertext
    .load()
)
df.show()
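
The same pattern can be wrapped in a small helper so that the decrypted value is resolved only when the reader is built and is never printed by accident. This is a minimal sketch, not part of the product: `jdbc_options` and `mask` are hypothetical names, and the secret lookup is passed in as a callable so that the helper can be tried without the emrssutils library. In a Notebook, you would pass `emrssutils.utils.get_secret`.

```python
from typing import Callable, Dict


def jdbc_options(get_secret: Callable[..., str], variable_name: str,
                 url: str, table: str, user: str) -> Dict[str, str]:
    """Build JDBC reader options, resolving the password at call time."""
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        # Assumption: get_secret accepts key=<variable_name>, which is how
        # emrssutils.utils.get_secret is called in the example above.
        "password": get_secret(key=variable_name),
    }


def mask(options: Dict[str, str]) -> Dict[str, str]:
    """Return a copy that is safe to print: the password is replaced by asterisks."""
    return {k: ("******" if k == "password" else v) for k, v in options.items()}
```

The resulting dictionary can be passed to the reader with `spark.read.format("jdbc").options(**opts).load()`. Print `mask(opts)` instead of `opts` when debugging.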

Use in Spark configurations

In the Spark configuration of a session or batch job, reference a ciphertext in the format ${secret_values.variable_name}, where variable_name is the variable name that you specified when you created the ciphertext.

Example

When you read data from or write data to MaxCompute, first add the AccessKey to Ciphertext Management. The following example assumes that the ciphertext was created with the variable name AccessKey. Then, reference the ciphertext in the Spark configurations of the SQL session. For more information, see Read from and write to MaxCompute.

spark.sql.catalog.odps                        org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog
spark.sql.extensions                          org.apache.spark.sql.execution.datasources.v2.odps.extension.OdpsExtensions
spark.sql.sources.partitionOverwriteMode      dynamic
spark.hadoop.odps.tunnel.quota.name           pay-as-you-go
spark.hadoop.odps.project.name                <project_name>
spark.hadoop.odps.end.point                   https://service.cn-hangzhou-vpc.maxcompute.aliyun-inc.com/api
spark.hadoop.odps.access.id                   <accessId>
# Reference the ciphertext 
spark.hadoop.odps.access.key                  ${secret_values.AccessKey}
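
As a mental model only, the placeholder behaves like a template variable that is replaced with the decrypted value before the configuration takes effect. The service performs this substitution itself. The sketch below is not the product's implementation: the `resolve` function, the in-memory `secrets` mapping, and the set of characters allowed in a variable name are all assumptions made for illustration.

```python
import re
from typing import Mapping

# Matches ${secret_values.<variable_name>}.
# The allowed name characters are an assumption for this sketch.
_PLACEHOLDER = re.compile(r"\$\{secret_values\.([A-Za-z0-9_]+)\}")


def resolve(value: str, secrets: Mapping[str, str]) -> str:
    """Replace each placeholder with its secret; raise KeyError for an unknown name."""
    return _PLACEHOLDER.sub(lambda m: secrets[m.group(1)], value)
```

Values without a placeholder pass through unchanged, and an unknown variable name fails loudly instead of silently leaving the placeholder in the configuration.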

Use in batch or stream jobs

For the runtime parameters of a batch or stream job, you can use a ciphertext in the format ${secret_values.variable_name}.

Example

When you create a JAR batch job, add the sensitive information to Ciphertext Management first. Then, reference the ciphertext in the runtime parameters of the job instead of entering the plaintext value.
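
The following sketch shows how a job entry point might consume such a parameter. It is written in Python for consistency with the Notebook example, although the same idea applies to a JAR job's main method. It assumes runtime parameters such as `--db-user <username> --db-password ${secret_values.db_password}`, where the argument names and the variable name `db_password` are hypothetical. It also assumes that the job receives the substituted value as an ordinary command-line argument.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the runtime parameters of a hypothetical batch job."""
    parser = argparse.ArgumentParser(description="Hypothetical batch job entry point")
    parser.add_argument("--db-user", required=True)
    # Expected to hold the value substituted for ${secret_values.db_password}.
    parser.add_argument("--db-password", required=True)
    return parser.parse_args(argv)
```

Inside the job, call `parse_args()` and use `args.db_password` where the password is needed. Avoid writing the value to logs.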