In AnalyticDB for MySQL, the Basic Edition of the Apache Spark based confidential compute engine encrypts sensitive data, which is then transmitted and stored as ciphertext. Only key owners can decrypt the data, which prevents data breaches. This topic uses ApsaraDB RDS for MySQL as an example to describe how to use the Basic Edition of the Apache Spark based confidential compute engine to encrypt data, perform computations and analysis on the encrypted data, and decrypt the computation results.
Prerequisites
An AnalyticDB for MySQL Enterprise Edition, Basic Edition, or Data Lakehouse Edition cluster is created.
A database account is created for the AnalyticDB for MySQL cluster.
If you use an Alibaba Cloud account, you need to only create a privileged account.
If you use a Resource Access Management (RAM) user, you must create a privileged account and a standard account and associate the standard account with the RAM user.
The AliyunADBFullAccess, AliyunADBSparkProcessingDataRole, and read and write permissions for AnalyticDB for MySQL databases and tables are granted to a RAM user. For more information, see Account authorization.
A Java 8 development environment is installed to run the client tool. This topic uses an ECS instance that runs a Linux operating system as an example.
Usage notes
The adb-crypto-tool client tool supports encrypting or decrypting data stored only in MySQL and PostgreSQL databases.
NoteThe MySQL or PostgreSQL database can be an ApsaraDB database, a self-managed database, or a third-party cloud database.
The confidential engine does not support the Decimal data type or nested data types. You can use the Float or Double data type as an alternative.
For PostgreSQL databases, the confidential engine does not automatically remove trailing spaces from the
Char(n)data type. Use theVarchar(n)type instead.
Data preparation
This topic uses an ApsaraDB RDS for MySQL database named adb. A table named customer is created in the database. The following example shows the statements:
CREATE DATABASE IF NOT EXIST adb;
CREATE TABLE IF NOT EXISTS adb.customer(
id int,
name varchar(1023),
address varchar(1023),
age int
);Use the following statements to insert data into the customer table:
INSERT INTO customer (id,name,address,age) VALUES('1', 'james', 'beijing', '10');
INSERT INTO customer (id,name,address,age) VALUES('2', 'bond', 'beijing', '15');
INSERT INTO customer (id,name,address,age) VALUES('3', 'jack', 'shanghai', '20');
INSERT INTO customer (id,name,address,age) VALUES('4', 'lucy', 'hangzhou', '25');
INSERT INTO customer (id,name,address,age) VALUES('5', 'tom', 'qingdao','30');Procedure
Step 1: Install the encryption and decryption client. You can download and install the adb-crypto-tool client.
Step 2: Encrypt and upload data. You can use the client tool to convert plaintext data into ciphertext and upload the ciphertext to Object Storage Service (OSS).
Step 3: Execute SQL computation on ciphertext. You can compute the ciphertext in the SQL window of the Apache Spark based confidential compute engine on AnalyticDB for MySQL. The computation results are stored in OSS as ciphertext.
Step 4: Download and decrypt the ciphertext computation results. You can download the ciphertext computation results to your on-premises device. After you use the client tool to decrypt the data, the decrypted data is automatically inserted into the plaintext table in the database.
Install the encryption and decryption client
Download the client tool: adb-crypto-tool-1.0.4.zip.
Decompress the adb-crypto-tool-1.0.4.zip package.
unzip adb-crypto-tool-1.0.4.zipDecompressing the package creates the following three files and folders:
cryptoTool: The entry point for the encryption and decryption tool.
NoteThe cryptoTool depends on the bash environment. On a Windows system, you can use a tool such as Windows Subsystem for Linux (WSL) to run it.
demo.json: A sample configuration file.
dist: The dependency package for the encryption and decryption tool.
Encrypt and upload data
Prepare an encryption configuration file. You can create a configuration file based on the demo.json file or directly modify the
table,jdbcUrl,username, andpasswordparameters in the demo.json file. This topic uses a new configuration file namedencryption_config.jsonas an example. The following example shows the configuration:{ "dataDir": "file:///tmp", "table": [ "customer" ], "encAlgo": "AES_128_GCM", "mek": "00112233445566778899aabbccdd****", "jdbcUrl": "jdbc:mysql://rm-bp1487t8613zz****.mysql.rds.aliyuncs.com:3306/adb", "username": "demo_user", "password": "demo_password", "saveMode": "Overwrite", "format":"EncBlocksSource" }The following table describes the parameters.
Parameter
Description
dataDirThe local path to store the ciphertext dataset.
tableThe name of the table to encrypt. This topic uses customer as an example.
encAlgoThe encryption and decryption algorithm for the ciphertext. Only AES_128_GCM is supported.
mekThe master encryption key (MEK) that is used to encrypt data encryption keys (DEKs). For more information, see the "Keys" section of the Overview topic.
The MEK must be a 16-byte hexadecimal string. You can use the OpenSSL tool to generate a key. This topic uses
00112233445566778899aabbccdd****as an example.WarningThe MEK is the root credential for accessing encrypted data. If you lose the MEK, you can no longer access the existing data. Keep the MEK secure.
The confidential engine does not generate or back up MEKs. You must provide an MEK in the required format.
jdbcUrlThe connection address and name of the database. Only MySQL and PostgreSQL databases are supported. Use one of the following formats for the connection address:
MySQL database:
jdbc:mysql://ip:port/database_namePostgreSQL database:
jdbc:postgresql://ip:port/database_name
This topic uses
jdbc:mysql://rm-bp1487t8613zz****.mysql.rds.aliyuncs.com:3306/adbas an example.userNameThe database account.
passwordThe password for the account.
saveModeThe mode used to save the ciphertext dataset if the destination path for the encrypted file already exists. The following modes are supported:
Append: Appends the data.
ErrorIfExists: Reports an error and exits if the destination exists. You can view the error details in the logs.
Ignore: Skips writing the data if the destination exists. You cannot view error details in the logs.
Overwrite: Overwrites the existing data.
formatThe ciphertext format. Only the EncBlocksSource format is supported.
Run the client tool to encrypt the data in the
customertable of theadbdatabase. The encrypted dataset is stored in the path specified by thedataDirparameter in the configuration file../cryptoTool encrypt encryption_config.jsonUpload the
customerfolder that contains the encrypted dataset to OSS. This topic uses the pathoss://testBucketName/adb/Spark/customeras an example. For more information, see Simple upload.
Run SQL queries on the ciphertext data
Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the navigation pane on the left, choose Job Development > SQL Development.
In the SQLConsole window, select the Spark engine and a job resource group.
Run the following statements to perform SQL computations on the ciphertext.
Enable confidential computing and set the customer master key.
-- Enable confidential computing. set spark.adb.tee.enabled=true; -- Set the customer master key. This key must be the same as the MEK in the encryption configuration file. set spark.adb.tee.encryption.key=00112233445566778899aabbccdd****; -- Create a database to store the ciphertext data and computation results. CREATE DATABASE IF NOT EXISTS test_db;Use the confidential engine to create an external table named
customer. The LOCATION parameter specifies the OSS path of the ciphertext dataset folder. This topic usesoss://testBucketName/adb/Spark/customeras an example.set spark.adb.tee.enabled=true; set spark.adb.tee.encryption.key=00112233445566778899aabbccdd****; CREATE TABLE IF NOT EXISTS test_db.customer USING EncBlocksSource LOCATION 'oss://testBucketName/adb/Spark/customer';Create an external table named
customer_output. Write the SQL computation results from thecustomertable to thecustomer_outputexternal table. The data for thecustomer_outputexternal table is stored atoss://testBucketName/adb/Spark/customer_output.set spark.adb.tee.enabled=true; set spark.adb.tee.encryption.key=00112233445566778899aabbccdd****; CREATE TABLE IF NOT EXISTS test_db.customer_output USING EncBlocksSource LOCATION 'oss://testBucketName/adb/Spark/customer_output' AS SELECT * FROM test_db.customer WHERE id = 1;
Download and decrypt the ciphertext results
Download the ciphertext computation results from the
oss://testBucketName/adb/Spark/customer_outputpath to your local device. For more information, see Download objects.Prepare a decryption configuration file. This topic uses a new configuration file named
decryption_config.jsonas an example. The following example shows the configuration.{ "dataDir": "file:///tmp", "table": [ "customer_output" ], "encAlgo": "AES_128_GCM", "outTblPrefix": "abc_", "mek": "00112233445566778899aabbccdd****", "jdbcUrl": "jdbc:mysql://rm-bp1487t8613zz****.mysql.rds.aliyuncs.com:3306/adb", "username": "demo_user", "password": "demo_password", "saveMode": "Overwrite", "format":"EncBlocksSource" }The following table describes the parameters:
Parameter
Description
dataDirThe local path to store the ciphertext dataset.
tableThe folder name of the ciphertext dataset.
encAlgoThe encryption and decryption algorithm for the ciphertext. Only AES_128_GCM is supported.
outTblPrefix
The prefix for the table name in the database after decryption. The destination table name consists of the prefix and the folder name of the ciphertext dataset.
mekThe master encryption key (MEK) that is used to encrypt data encryption keys (DEKs). For more information, see the "Keys" section of the Overview topic.
The MEK must be a 16-byte hexadecimal string. You can use the OpenSSL tool to generate a key. This topic uses
00112233445566778899aabbccdd****as an example.WarningThe MEK is the root credential for accessing encrypted data. If you lose the MEK, you can no longer access the existing data. Keep the MEK secure.
The confidential engine does not generate or back up MEKs. You must provide an MEK in the required format.
jdbcUrl
The connection address and name of the database. Only MySQL and PostgreSQL databases are supported. Use one of the following formats for the connection address:
MySQL database:
jdbc:mysql://ip:port/database_namePostgreSQL database:
jdbc:postgresql://ip:port/database_name
This topic uses
jdbc:mysql://rm-bp1487t8613zz****.mysql.rds.aliyuncs.com:3306/adbas an example.userNameThe database account.
passwordThe password for the account.
saveModeThe mode used to generate the plaintext dataset if the destination database, where the decrypted data is stored, already exists. The following modes are supported:
Append: Appends the data.
ErrorIfExists: Reports an error and exits if the destination exists. You can view the error details in the logs.
Ignore: Skips writing the data if the destination exists. You cannot view error details in the logs.
Overwrite: Overwrites the existing data.
formatThe ciphertext format. Only the EncBlocksSource format is supported.
Run the client to decrypt the data. The decrypted data is automatically inserted into the
adbApsaraDB RDS for MySQL database. The name of the destination table is determined by the `outTblePrefix` parameter and the folder name of the encrypted dataset. In this example, the table name isadb_customer_output../cryptoTool decrypt decryption_config.jsonView the decrypted data in the ApsaraDB RDS for MySQL database. The computation result is displayed in plaintext.
SELECT * FROM adb_customer_output;Sample result:
+--------+----------+----------+----------+ | id | name | address | age | +--------+----------+----------+----------+ | 1 | james | beijing | 10 | +--------+----------+----------+----------+