Creates a data ingestion task to load data from an Apache Kafka topic into an AnalyticDB for MySQL Data Lakehouse Edition (V3.0) cluster.
Authorization information
The following table shows the authorization information corresponding to the API. The authorization information can be used in the Action policy element to grant a RAM user or RAM role the permissions to call this API operation. Description:
- Operation: the value that you can use in the Action element to specify the operation on a resource.
- Access level: the access level of each operation. The levels are read, write, and list.
- Resource type: the type of the resource on which you can authorize the RAM user or the RAM role to perform the operation. Take note of the following items:
  - Required resource types are marked with an asterisk (*).
  - If the permissions cannot be granted at the resource level, All Resources is used in the Resource type column of the operation.
- Condition key: the condition key that is defined by the cloud service.
- Associated operation: other operations that the RAM user or the RAM role must have permissions to perform before the current operation can be completed.
| Operation | Access level | Resource type | Condition key | Associated operation |
|---|---|---|---|---|
| adb:CreateApsKafkaHudiJob | none | *DBClusterLakeVersion acs:adb:{#regionId}:{#accountId}:dbcluster/{#DBClusterId} | none | none |
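For reference, the Action value and the Resource ARN format from the table above can be combined into a RAM policy statement. The following is a minimal sketch that builds such a statement as a Python dict and prints it as JSON; the region ID, account ID, and cluster ID are placeholders that you must replace with your own values.

```python
import json

# Minimal sketch of a RAM policy that allows calling CreateApsKafkaHudiJob
# on one specific cluster. All IDs below are placeholders.
policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "adb:CreateApsKafkaHudiJob",
            # Resource ARN format taken from the authorization table above.
            "Resource": "acs:adb:cn-hangzhou:123456789012****:dbcluster/amv-bp11q28kvl688****",
        }
    ],
}

print(json.dumps(policy, indent=2))
```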
Request parameters
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| DBClusterId | string | Yes | The cluster ID. Note: You can call the DescribeDBClusters operation to query the IDs of all clusters within a region. | amv-bp11q28kvl688**** |
| RegionId | string | Yes | The region ID of the cluster. | cn-hangzhou |
| PartitionSpecs | array<object> | No | The partition information. | |
| | object | No | The partition definition. "SourceColumn": the name of the source partition field. "Strategy": the partitioning policy. "SourceTypeFormat": the format of the source field; each value corresponds to a time precision or an output pattern: timestamps accurate to milliseconds, microseconds, or seconds (for example, APSLiteralTimestampSecond is a timestamp accurate to seconds); APSMIDyyyyMMddHHmmss -> yyyy-MM-dd HH:mm:ss; APSMIDyyyyMMdd -> yyyy-MM-dd; APSyyyyMMddHHmmss -> yyyyMMddHHmmss; APSyyyyMMdd -> yyyyMMdd; APSyyyyMM -> yyyyMM; additional values map to the yyyy-MM-dd HH:mm:ss.SSS and yyyy/MM/dd patterns. "TargetTypeFormat": the format of the destination partition field. "TargetColumn": the name of the destination partition field. | [{ "SourceColumn": "NetOutFlow", "Strategy": "ParseAsTimeAndFormat", "SourceTypeFormat": "APSLiteralTimestampSecond", "TargetTypeFormat": "yyyy-MM-dd", "TargetColumn": "NetOutFlow" }] |
| Columns | array<object> | Yes | The column information. | |
| | object | Yes | The column information. | |
| Name | string | No | The name of the source column. | a |
| MapName | string | No | The name of the column in the destination table. | b |
| Type | string | No | The data type of the source column. | string |
| MapType | string | No | The data type of the column in the destination table. | string |
| PrimaryKeyDefinition | string | No | The primary key settings. Two policies are supported: the UUID policy: {"Strategy": "uuid"}; and the mapping policy: {"Strategy": "mapping", "Values": ["f1", "f2"], "RecordVersionField": "xxx"}, where RecordVersionField specifies the field that is used as the Hudi record version. | "Strategy": "mapping" |
| WorkloadName | string | Yes | The name of the workload. | test |
| LakehouseId | long | No | The ID of the lakehouse. | 123 |
| ResourceGroup | string | Yes | The resource group name. | aps |
| HudiAdvancedConfig | string | No | The HUDI configuration of the destination. | hoodie.keep.min.commits=20 |
| AdvancedConfig | string | No | The advanced configurations. | - |
| FullComputeUnit | string | No | The compute resources for full data synchronization. | 2ACU |
| IncrementalComputeUnit | string | Yes | The compute resources for incremental data synchronization. | 2ACU |
| KafkaClusterId | string | No | The ID of the Apache Kafka instance. You can obtain the ID in the Kafka console. | xxx |
| KafkaTopic | string | No | The name of the Kafka topic. You can obtain the name in the Kafka console. | test |
| StartingOffsets | string | Yes | The position from which messages start to be consumed. Valid values: begin_cursor: consume from the earliest message. end_cursor: consume from the latest message. timestamp: consume from a specified point in time. | begin_cursor |
| MaxOffsetsPerTrigger | long | No | The maximum number of records to fetch in a single batch. | 50000 |
| DbName | string | Yes | The name of the user-defined database. | testDB |
| TableName | string | Yes | The name of the user-defined table. | testTB |
| OutputFormat | string | No | The format of the output data. | HUDI |
| TargetType | string | No | The destination type. | OSS |
| TargetGenerateRule | string | No | The rules for generating the destination database. | xxx |
| AcrossUid | string | No | The ID of the Alibaba Cloud account to which the source Kafka instance belongs. | 123************ |
| AcrossRole | string | No | The RAM role that is created within the source Alibaba Cloud account to grant AnalyticDB for MySQL permission to access the source Kafka resources. This parameter is required for cross-account data ingestion. For more information, see Create a RAM role for a trusted Alibaba Cloud account. | aps |
| JsonParseLevel | integer | No | The number of layers of nested JSON fields that are parsed. Valid values: 0: nested JSON fields are not parsed. 1 (default): one layer is parsed. 2: two layers are parsed. 3: three layers are parsed. 4: four layers are parsed. For an illustration of how nested JSON fields are parsed at different levels, see the sketch after this table. | 0 |
| DataOutputFormat | string | No | The format of the source data. Valid values: Single: the source is a single JSON record per row. Multi: the source is a JSON array; each element is output as a single JSON record. | Single |
| OssLocation | string | No | The path of the destination data lakehouse in an Object Storage Service (OSS) bucket. | oss://test-xx-zzz/yyy/ |
| DatasourceId | long | No | The data source ID. | 1 |
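The parameters above can be assembled into a call. The following is a minimal sketch that uses the Alibaba Cloud Python SDK core library (aliyunsdkcore) and its generic CommonRequest, so no generated model classes are needed. The endpoint, the API version string (2021-12-01), and passing structured parameters such as Columns and PartitionSpecs as JSON strings are assumptions; verify them against the SDK and API documentation for your region.

```python
import json

from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest

# Credentials and region are placeholders.
client = AcsClient("<access_key_id>", "<access_key_secret>", "cn-hangzhou")

request = CommonRequest()
request.set_accept_format("json")
request.set_domain("adb.cn-hangzhou.aliyuncs.com")  # assumed endpoint
request.set_version("2021-12-01")                   # assumed API version
request.set_action_name("CreateApsKafkaHudiJob")
request.set_method("POST")

# Required parameters (values taken from the examples in the table above).
request.add_query_param("DBClusterId", "amv-bp11q28kvl688****")
request.add_query_param("RegionId", "cn-hangzhou")
request.add_query_param("WorkloadName", "test")
request.add_query_param("ResourceGroup", "aps")
request.add_query_param("IncrementalComputeUnit", "2ACU")
request.add_query_param("StartingOffsets", "begin_cursor")
request.add_query_param("DbName", "testDB")
request.add_query_param("TableName", "testTB")

# Structured parameters, serialized as JSON strings (an assumption).
request.add_query_param(
    "Columns",
    json.dumps([{"Name": "a", "MapName": "b", "Type": "string", "MapType": "string"}]),
)
request.add_query_param(
    "PartitionSpecs",
    json.dumps([{
        "SourceColumn": "NetOutFlow",
        "Strategy": "ParseAsTimeAndFormat",
        "SourceTypeFormat": "APSLiteralTimestampSecond",
        "TargetTypeFormat": "yyyy-MM-dd",
        "TargetColumn": "NetOutFlow",
    }]),
)
request.add_query_param(
    "PrimaryKeyDefinition",
    json.dumps({"Strategy": "mapping", "Values": ["f1", "f2"], "RecordVersionField": "xxx"}),
)

response = client.do_action_with_exception(request)
print(json.loads(response))
```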
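To make JsonParseLevel more concrete, the sketch below flattens a nested record to different depths. This illustrates only the concept; the actual column-naming scheme that AnalyticDB for MySQL uses for parsed fields is not documented here, and the dot-separated names are an assumption.

```python
import json

# Conceptual illustration of JsonParseLevel; not the service's implementation.
record = {"user": {"name": "alice", "addr": {"city": "hz"}}, "bytes": 10}

def flatten(obj, level, prefix=""):
    """Flatten nested dicts up to `level` layers; deeper values stay JSON strings."""
    if level == 0 or not isinstance(obj, dict):
        return {prefix or "value": json.dumps(obj)}
    out = {}
    for key, val in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(val, dict) and level > 1:
            out.update(flatten(val, level - 1, name))
        else:
            out[name] = json.dumps(val) if isinstance(val, dict) else val
    return out

print(flatten(record, 0))  # 0: the whole record stays one JSON value
print(flatten(record, 1))  # 1 (default): 'user' stays JSON, 'bytes' becomes a field
print(flatten(record, 2))  # 2: 'user.name' is parsed out, 'user.addr' stays JSON
```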
Response parameters

| Parameter | Type | Description | Example |
|---|---|---|---|
| HttpStatusCode | integer | The HTTP status code. | 200 |
| Data | string | The returned data. | xxx |
| RequestId | string | The request ID. | 1A943417-5B0E-1DB9-A8**-A566****C3 |
| Success | boolean | Indicates whether the request was successful. | true |
| Code | integer | The response code. | 200 |
| Message | string | The returned message. | ok |
Examples
Sample success responses
JSON format
{
"HttpStatusCode": 200,
"Data": "xxx",
"RequestId": "1A943417-5B0E-1DB9-A8**-A566****C3",
"Success": true,
"Code": 200,
"Message": "ok"
}

Error codes
For a list of error codes, see Service error codes.
