Dataphin can output tag tables to downstream application systems through batch tag query services. This topic describes how to create and manage a tag offline service.
Limitations
To ensure high data availability, a tag offline service first writes data into a temporary table ({target table}_dpfx_b). After the write completes, the original target table is renamed to {target table}_dpfx_tmp, and the temporary table is then renamed to the target table. Finally, the renamed original table ({target table}_dpfx_tmp) is deleted. The data is briefly unavailable from the moment the original target table is renamed until the temporary table has been renamed to the target table.
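The swap sequence described above can be sketched as follows. This is a minimal illustration, not Dataphin's implementation: it uses Python's built-in sqlite3 module as a stand-in database, and the function name `swap_load` and the columns `user_id` and `tag_value` are hypothetical. Only the `_dpfx_b` and `_dpfx_tmp` suffixes come from the text.

```python
import sqlite3

def swap_load(conn, target, rows):
    """Load rows into `target` using the temporary-table swap (illustrative only)."""
    staging = f"{target}_dpfx_b"    # temporary table that receives the new data
    retired = f"{target}_dpfx_tmp"  # name given to the original table before deletion
    cur = conn.cursor()

    # 1. Write the new data into the temporary table (hypothetical columns).
    cur.execute(f'DROP TABLE IF EXISTS "{staging}"')
    cur.execute(f'CREATE TABLE "{staging}" (user_id TEXT PRIMARY KEY, tag_value TEXT)')
    cur.executemany(f'INSERT INTO "{staging}" VALUES (?, ?)', rows)
    conn.commit()

    # 2. Rename the original target table out of the way.
    cur.execute(f'ALTER TABLE "{target}" RENAME TO "{retired}"')
    # Between steps 2 and 3 no table carries the target name:
    # this is the brief unavailability window.

    # 3. Rename the temporary table to the target name.
    cur.execute(f'ALTER TABLE "{staging}" RENAME TO "{target}"')

    # 4. Delete the renamed original table.
    cur.execute(f'DROP TABLE "{retired}"')
    conn.commit()
```

Readers of the target table see either the old data or the new data, except during the short interval between the two renames.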
Prerequisites
Before creating a tag offline service, ensure you have selected the required tags in the tag asset market and obtained usage permissions for the application where the service will be located. For specific operations, see request tag or audience group permissions.
The application associated with the tag offline service must already be created. For specific operations, see create application.
Before creating a tag offline service, you need to create the corresponding entity. For specific operations, see create entity.
Create tag offline service
On the Dataphin home page, select Tag > Tag Application from the top menu bar.
In the left-side navigation pane, select Service Management > Tag Offline Service.
On the Tag Offline Service page, click Create Task.
On the Create Offline Service configuration page, set the required parameters.
Parameter
Description
Basic Information
Task Name
Enter the name of the offline task. The name can contain Chinese characters, English letters, digits, and underscores (_), and can be up to 64 characters long.
Application Selection
Select the application associated with the project.
Owner
Select the owner of the offline service. You can search by keyword.
Entity
Select the entity name corresponding to the offline service.
Entity ID Selection
Select the ID name corresponding to the entity.
Note: In Field Mapping, the tags that can be selected under Enter Tag are filtered based on the selected entity ID.
Schedule Type
Supports recurring schedule and manual schedule task types.
Manual Schedule: One-time integration. After the task is published, you can select manual execution on the task list page.
Recurring Schedule: Scheduled execution according to the configured recurring schedule.
Description
Enter a brief description of up to 1000 characters.
Field Mapping
Target Data Source
Select the target data source corresponding to the offline service. The target data source can be a MySQL, Oracle, AnalyticDB for PostgreSQL, Greenplum, or openGauss data source, or a project created in Dataphin.
Note: When the target data source is a project, only projects that the current account has joined (General and Tag projects) are supported, and the project tenant account must have write-through permission.
If there is no required data source, you can click +create Data Source to create one. For specific operations, see data source management.
Schema
When the target data source type is openGauss, you can select the schema of the data source.
Target Table
Select the target table in the target data source. For an openGauss data source, select a target table under the selected schema.
Multi-level partitioned tables are not supported.
When cross-project safe mode is enabled, cross-project table creation is not supported. For more information, see security settings.
If you do not have write table data permission for the current target table production environment, you can click Request Permission to apply for permission. For more information, see request, renew, and return table permissions.
If there is no corresponding target table, after selecting the input tags under Enter Tag, you can click One-click Create Table to create the required target table.
In the system-generated create table statement, confirm whether the table name, field type, precision, etc., meet the requirements before clicking Create.
The table name and table comments are automatically generated by the system and can be modified as needed.
The field types in the system-generated create table statement are derived from the types of the input tags through a preliminary conversion. You can modify them as needed.
When the target data source is a project, the system generates a partitioned table by default. We recommend that you do not change this.
When the target data source type is AnalyticDB for PostgreSQL, partitioned tables are not supported.
After selecting the input tags, you can configure whether to export the tag value, the code name, or both for each exported tag. At least one of the two must be exported.
When one-click creating a table, if the tag has a configured lookup table, you can choose to export the code name. The exported code name is {tag code}_codename. After one-click creating a table, the system will automatically map fields.
Date Partition
Select the partition field of the target table.
If the selected target table is a partitioned table, the system will default to the first partition field of the table.
If the selected target table is a non-partitioned table, there is no need to select a date partition.
Partition Field Format
Enter the date format or select an existing date format. You can choose yyyyMMdd, yyyy-MM-dd, yyyy/MM/dd, or yyyy.MM.dd.
Note: The formats yyyymmdd, yyyy-mm-dd, yyyy/mm/dd, and yyyy.mm.dd can be selected only when the compute engine is MaxCompute.
Loading Policy
Only the overwrite loading policy is supported. Under this policy, when a primary key or constraint violation occurs, the original data is deleted first, and the entire new data row is then inserted.
Enter Tag
Select the tags that need to be mapped under the entity and click the button to configure the data source field mapping relationship.
Mapping
The system displays the selected tags and their mapping fields. Select a mapping field for each output tag.
Same name mapping: Click Same Name Mapping to associate tags and their mapping fields with the same name.
Purge: Click the icon to purge the mapped relationships.
Output content: For tags with a configured lookup table, select the content to output. You can choose tag value, code name, or both. By default, the tag value is selected for all tags. To apply a choice to every tag on the current page, select tag value or code name at the bottom of the output tag list.
Maintenance Configuration
Recurrence
The scheduling cycle of the task within a specific time range. Daily scheduling is supported: the tag offline task runs every day at the configured schedule time.
Click Publish to complete the creation of the tag offline service task.
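Some of the constraints in the parameter table above can be checked before filling in the form. The following is a minimal sketch, not part of Dataphin: the helper names are hypothetical, "Chinese characters" is assumed to mean the CJK Unified Ideographs range U+4E00 to U+9FFF, and the mapping from the listed partition formats to Python strftime patterns is an illustration of what each format produces. The lowercase MaxCompute-only variants are not covered.

```python
import re
from datetime import date

# Assumption: Chinese characters are approximated by U+4E00..U+9FFF.
# Allowed: Chinese characters, English letters, digits, underscores; 1 to 64 characters.
TASK_NAME_RE = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")

# Partition field formats listed in the table, mapped to strftime patterns.
PARTITION_FORMATS = {
    "yyyyMMdd": "%Y%m%d",
    "yyyy-MM-dd": "%Y-%m-%d",
    "yyyy/MM/dd": "%Y/%m/%d",
    "yyyy.MM.dd": "%Y.%m.%d",
}

def is_valid_task_name(name: str) -> bool:
    """Return True if the whole name satisfies the Task Name rule."""
    return TASK_NAME_RE.fullmatch(name) is not None

def partition_value(day: date, fmt: str) -> str:
    """Render a date as a partition value in one of the listed formats."""
    return day.strftime(PARTITION_FORMATS[fmt])

print(is_valid_task_name("标签_task_01"))             # True
print(is_valid_task_name("bad-name"))                 # False (hyphen not allowed)
print(partition_value(date(2024, 3, 5), "yyyyMMdd"))  # 20240305
```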
Manage tag offline services
The tag offline service page displays information such as task name, owner, application name, entity, entity ID, task status, execution status, tag, schedule type, and available operations.
Hover your mouse over the target table to view its full name and the name of the data source. If the data source type is openGauss, you can also view the schema of the target table.
(Optional) Filter tasks by selecting Only Show Mine, entering the task name, or using the Filter option to filter by application name, entity-entity ID, target source type, task status, execution status, or schedule type.
In the Actions column of the tag offline service task list, you can perform the following operations.
Operation item
Description
Edit
When the task status is not publishing or unpublishing, you can click the icon to edit the task on the Edit Offline Service page and republish it. The task name, schedule type, target data source type, and loading policy cannot be modified.
Details
Click the icon to view the detailed information of the current tag offline service on the View Offline Service page. When the task status is editing, publish failed, published, or offline, you can click Edit at the bottom of the page to edit the current tag offline service.
View instance
Click the icon to view the execution instances generated by the current tag offline service.
Unpublish
When the task status is published or unpublish failed, you can click the unpublish icon to unpublish the current tag offline service.
Data backfill
For tag offline services with a recurring schedule whose task status is published, you can click the data backfill icon to backfill data. By default, the backfill uses yesterday's (T-1) data timestamp.
Run
For tasks with a manual schedule type, you can click the run icon, select the data timestamp in the Run dialog box, and run the task manually.
Delete
When the task status is offline or publish failed, you can click the delete icon to delete the current tag offline service.
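The status conditions in the operations table can be summarized as a small lookup. This is a hedged sketch for illustration only: the function name and the lowercase status and schedule-type strings are hypothetical encodings of the values named in the table, Details and View instance are assumed to be available in every status because the table states no condition for them, and the published requirement for Run is taken from the Schedule Type description rather than from the Run row.

```python
def allowed_operations(status: str, schedule_type: str) -> set[str]:
    """Return the operations the table above allows for a task (illustrative)."""
    # Assumption: no status condition is stated for these two operations.
    ops = {"details", "view instance"}

    # Edit: any status except publishing or unpublishing.
    if status not in {"publishing", "unpublishing"}:
        ops.add("edit")

    # Unpublish: published or unpublish failed.
    if status in {"published", "unpublish failed"}:
        ops.add("unpublish")

    # Data backfill: published tasks with a recurring schedule.
    if status == "published" and schedule_type == "recurring":
        ops.add("data backfill")

    # Run: manual-schedule tasks, after the task is published.
    if status == "published" and schedule_type == "manual":
        ops.add("run")

    # Delete: offline or publish failed.
    if status in {"offline", "publish failed"}:
        ops.add("delete")

    return ops

print(sorted(allowed_operations("published", "recurring")))
# ['data backfill', 'details', 'edit', 'unpublish', 'view instance']
```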