This topic describes how to create a MaxCompute sink connector to export data from a data source topic of a Message Queue for Apache Kafka instance to a MaxCompute table.
Prerequisites
- Message Queue for Apache Kafka
- The connector feature is enabled for the Message Queue for Apache Kafka instance. For more information, see Enable the connector feature.
- A topic is created in the Message Queue for Apache Kafka instance. For more information, see Step 1: Create a topic.
A topic named maxcompute-test-input is used in this example.
- MaxCompute
- A MaxCompute table is created on the MaxCompute client. For more information, see Create tables.
In this example, a MaxCompute table named test_kafka is created in a project named connector_test. You can execute the following statement to create the table:
CREATE TABLE IF NOT EXISTS test_kafka (topic STRING, partition BIGINT, offset BIGINT, key STRING, value STRING) PARTITIONED BY (pt STRING);
- Optional: EventBridge
- EventBridge is activated. For more information about how to activate EventBridge, see Activate EventBridge and grant permissions to a RAM user.
Note You need to activate EventBridge only if the instance that contains the data source topic resides in the China (Hangzhou) or China (Chengdu) region.
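Each message that the connector exports becomes one row of the test_kafka table defined above. As an illustration only, not the connector's actual internals, the mapping from a consumed Kafka record to the table's columns can be sketched in Python; the record field names and the format of the pt partition value are assumptions:

```python
from datetime import date


def to_maxcompute_row(record, pt=None):
    """Map a consumed Kafka record to the columns of the test_kafka table.

    The column set mirrors the CREATE TABLE statement above: topic STRING,
    partition BIGINT, offset BIGINT, key STRING, value STRING, partitioned
    by pt STRING. The pt date format is an assumption for illustration.
    """
    return {
        "topic": record["topic"],
        "partition": record["partition"],
        "offset": record["offset"],
        "key": record["key"],
        "value": record["value"],
        "pt": pt or date.today().strftime("%Y%m%d"),
    }


# Example: a message sent to maxcompute-test-input, partition 0, offset 42.
row = to_maxcompute_row(
    {"topic": "maxcompute-test-input", "partition": 0, "offset": 42,
     "key": "demo", "value": '{"key": "test"}'},
    pt="20240101",
)
print(row["topic"], row["pt"])  # maxcompute-test-input 20240101
```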
Precautions
- You can only export data from a data source topic of a Message Queue for Apache Kafka instance to a MaxCompute table within the same region. For more information about the limits on connectors, see Limits.
- If the instance that contains the data source topic is in the China (Hangzhou) or China (Chengdu) region, the connector task is published to EventBridge.
- At present, EventBridge is free of charge. For more information, see Billing.
- When you create a connector, EventBridge creates the AliyunServiceRoleForEventBridgeSourceKafka service-linked role for you.
- If the service-linked role is not available, EventBridge automatically creates one for you to allow EventBridge to access Message Queue for Apache Kafka.
- If the service-linked role is available, EventBridge does not create a new one.
- You cannot view the operational logs of connector tasks that are published to EventBridge. After a connector task is completed, you can view the consumption details of the groups that subscribe to the data source topic to see the status of the connector task. For more information, see View consumption details.
Procedure
To export data from a data source topic of a Message Queue for Apache Kafka instance to a MaxCompute table by using a MaxCompute sink connector, perform the following steps:
- Grant Message Queue for Apache Kafka the permissions to access MaxCompute.
- Optional: Create the topics and group that are required by a MaxCompute sink connector.
If you do not want to manually create the topics and group, skip this step and set the Resource Creation Method parameter to Auto in the next step.
Important Some of the topics that are required by a MaxCompute sink connector must use a local storage engine. If the major version of your Message Queue for Apache Kafka instance is 0.10.2, topics that use a local storage engine cannot be created manually and must be created automatically.
- Create and deploy a MaxCompute sink connector.
- Verify the result.
Create a RAM role
You cannot select Message Queue for Apache Kafka as the trusted service when you create a RAM role. Therefore, select any supported service as the trusted service when you create the role. Then, manually modify the trust policy of the RAM role.
Add permissions
To use a MaxCompute sink connector to export messages to a MaxCompute table, you must grant the following permissions to the RAM role.
| Object | Operation | Description |
| --- | --- | --- |
| Project | CreateInstance | The permissions to create instances in projects. |
| Table | Describe | The permissions to read the metadata of tables. |
| Table | Alter | The permissions to modify the metadata of tables and to create and delete partitions. |
| Table | Update | The permissions to overwrite data in tables and to insert data into tables. |
For more information about the preceding permissions and how to grant these permissions, see MaxCompute permissions.
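For reference, the permissions in the preceding table correspond to MaxCompute GRANT statements. The following sketch only assembles the statement strings; the project, table, and user names are the examples from this topic, the account ID is a placeholder, and the exact GRANT syntax and principal format should be verified against the MaxCompute permissions documentation:

```python
def grant_statements(project: str, table: str, principal: str):
    """Assemble GRANT statements matching the permission table above:
    CreateInstance on the project; Describe, Alter, and Update on the table.

    The statement syntax and the principal format are assumptions for
    illustration; verify them against the MaxCompute permissions docs.
    """
    return [
        f"GRANT CreateInstance ON PROJECT {project} TO USER {principal};",
        f"GRANT Describe, Alter, Update ON TABLE {table} TO USER {principal};",
    ]


# Example values from this topic; <account> is a placeholder account ID.
for statement in grant_statements("connector_test", "test_kafka",
                                  "RAM$<account>:AliyunKafkaMaxComputeUser1"):
    print(statement)
```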
To grant the required permissions to AliyunKafkaMaxComputeUser1, perform the following steps:
Create the topics that are required by a MaxCompute sink connector
In the Message Queue for Apache Kafka console, you can manually create the five topics that a MaxCompute sink connector requires. The five topics are the task offset topic, task configuration topic, task status topic, dead-letter queue topic, and error data topic. The five topics differ in partition count and storage engine. For more information, see Parameters in the Configure Source Service step.
Create the group that is required by a MaxCompute sink connector
In the Message Queue for Apache Kafka console, you can manually create the group that is required by a MaxCompute sink connector. The name of the group must be in the connect-&lt;task name&gt; format. For more information, see Parameters in the Configure Source Service step.
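The naming rule above can be expressed as a small helper; the task name used here is a placeholder for illustration:

```python
def connector_group_name(task_name: str) -> str:
    """Return the group name required by a MaxCompute sink connector.

    Per the naming rule described above, the group name must be the
    connector task name prefixed with "connect-".
    """
    return f"connect-{task_name}"


print(connector_group_name("maxcompute-sink"))  # connect-maxcompute-sink
```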
Create and deploy a MaxCompute sink connector
To create and deploy a MaxCompute sink connector that is used to export data from Message Queue for Apache Kafka to MaxCompute, perform the following steps:
- Log on to the Message Queue for Apache Kafka console.
- In the Resource Distribution section of the Overview page, select the region where your instance is deployed.
- On the Instances page, click the name of the instance that you want to manage.
- In the left-side navigation pane, click Connectors.
- On the Connectors page, click Create Connector.
- In the Create Connector wizard, perform the following steps:
- Go to the Connectors page, find the connector that you created, and then click Deploy in the Actions column.
Send a test message
After you deploy the MaxCompute sink connector, you can send a message to the data source topic in Message Queue for Apache Kafka to test whether the message can be exported to MaxCompute.
- On the Connectors page, find the connector that you want to use and click Test in the Actions column.
- In the Send Message panel, configure the required parameters to send a test message.
- Set the Method of Sending parameter to Console.
- In the Message Key field, enter the key of the message. For example, you can enter demo as the key of the message.
- In the Message Content field, enter the content of the message. For example, you can enter {"key": "test"} as the content of the message.
- Configure the Send to Specified Partition parameter to specify whether to send the message to a specific partition.
- If you want to send the message to a specific partition, click Yes and enter the partition ID in the Partition ID field. For example, you can enter 0 as the partition ID. For information about how to query partition IDs, see View partition status.
- If you do not want to send the message to a specific partition, click No.
- Set the Method of Sending parameter to Docker and run the Docker commands that are provided in the Run the Docker container to produce a sample message section to send a test message.
- Set the Method of Sending parameter to SDK and click the link to the topic that describes how to obtain and use the SDK that you want to use. Then, use the SDK to send and consume a test message. Message Queue for Apache Kafka provides topics that describe how to use SDKs for different programming languages based on different connection types.
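If you prefer to produce the test message from code, the message on the wire is the same key and content as in the console example. The following is a minimal sketch that uses the third-party kafka-python client; the bootstrap endpoint is a placeholder, and depending on your instance's access-control settings (for example, SASL), additional client configuration may be required:

```python
def build_record(key: str, value: str):
    """Encode a message key and content as the bytes Kafka expects."""
    return key.encode("utf-8"), value.encode("utf-8")


def send_test_message(bootstrap_servers, topic="maxcompute-test-input"):
    """Send one test message mirroring the console example above
    (key "demo", content {"key": "test"})."""
    # Third-party client; install with: pip install kafka-python
    from kafka import KafkaProducer

    key, value = build_record("demo", '{"key": "test"}')
    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    producer.send(topic, key=key, value=value)
    producer.flush()


if __name__ == "__main__":
    # Placeholder endpoint; replace it with your instance's endpoint.
    send_test_message(["kafka-broker:9092"])
```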
View data in the MaxCompute table
After you send a message to the data source topic in Message Queue for Apache Kafka, you can log on to the MaxCompute client to check whether the message is received.
To view the test_kafka table, perform the following steps: