This topic describes how to configure system settings for Data Service.
Permissions
Only super administrators and system administrators can modify system configurations.
Access the system configuration page
On the Dataphin home page, in the top menu bar, choose Service > Service Management.
In the navigation pane on the left, click System Configuration to open the System Configuration page.
API call authentication configuration
Prerequisites
You can configure API call authentication only after the gateway configuration is complete. For more information about how to configure the gateway, see Data Service settings.
Limits
If you use an Alibaba Cloud API Gateway (dedicated or shared instance) or Apsara Stack API Gateway, you can enable token-based authentication for calls.
If you use the Dataphin self-developed gateway, you can enable the application IP whitelist.
If you use an Alibaba Cloud API Gateway-Dedicated Instance or Alibaba Cloud API Gateway-Shared Instance, the API call authentication section is hidden after you enable API call authentication. You must republish existing APIs to enable token-based authentication for them.
If you use an Apsara Stack API Gateway, you can modify the API call authentication configuration.
Enable token-based authentication for calls
In the API Call Authentication Configuration section, click Modify and select whether to enable token-based authentication for calls.
Enable: Requires you to obtain the AppKey and AppSecret for the application and generate a token using the SDK code. The token is used for authentication when you call the API. This method enhances data security.
Disable: Does not require an authentication token. API calls are not authenticated using a token. You can call the API by providing only its basic information. This option provides low data security and should be used with caution.
Click Save to complete the API call authentication configuration.
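For reference, the following minimal sketch shows the general flow a calling application follows when token-based authentication is enabled: obtain the AppKey and AppSecret, derive a token, and send it with the request. The HMAC-style signature, header names, and endpoint below are assumptions for illustration only; the actual token is generated by the Dataphin SDK and validated by your gateway.

```python
# Minimal sketch of token-based API authentication (illustrative only).
# The real token is produced by the Dataphin SDK; the signature scheme,
# header names, and endpoint below are assumptions, not the actual protocol.
import base64
import hashlib
import hmac
import time

import requests  # third-party HTTP client

APP_KEY = "your-app-key"        # obtained for the calling application
APP_SECRET = "your-app-secret"  # obtained for the calling application
API_URL = "https://gateway.example.com/your-api-path"  # placeholder endpoint


def build_token(app_key: str, app_secret: str, timestamp: str) -> str:
    """Sign the key and timestamp with the secret (hypothetical scheme)."""
    message = f"{app_key}:{timestamp}".encode("utf-8")
    digest = hmac.new(app_secret.encode("utf-8"), message, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("utf-8")


timestamp = str(int(time.time() * 1000))
headers = {
    "x-app-key": APP_KEY,                                     # assumed header name
    "x-timestamp": timestamp,                                 # assumed header name
    "x-token": build_token(APP_KEY, APP_SECRET, timestamp),   # assumed header name
}
response = requests.get(API_URL, params={"id": "123"}, headers=headers, timeout=10)
print(response.status_code, response.text)
```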
Enable application IP whitelist
In the API Call Authentication Configuration section, click Modify and select whether to enable the application IP whitelist.
After you enable this feature, you can configure an IP address whitelist for an application on the Call-Application Management page. When an application calls an API, if its IP address is in the whitelist, the call proceeds without authentication.
Click Save to complete the API call authentication configuration.
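The whitelist check itself is performed by the Dataphin self-developed gateway. The short sketch below only illustrates the kind of matching involved, assuming (hypothetically) that whitelist entries can be single IP addresses or CIDR blocks.

```python
# Illustration of application IP whitelist matching. The actual check is
# performed by the gateway; CIDR support here is an assumption for the example.
from ipaddress import ip_address, ip_network

whitelist = ["10.0.0.8", "192.168.1.0/24"]  # example whitelist entries


def is_whitelisted(client_ip: str) -> bool:
    """Return True if the client IP matches any whitelist entry."""
    addr = ip_address(client_ip)
    return any(addr in ip_network(entry, strict=False) for entry in whitelist)


print(is_whitelisted("192.168.1.42"))  # True: the call proceeds
print(is_whitelisted("172.16.0.5"))    # False: the call is rejected
```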
API cached data storage location
The following conditions must be met to enable data caching for active and standby links:
You must use the Dataphin self-developed gateway. Other gateway types are not supported. For more information, see API gateway configuration.
You must enable the high availability (HA) feature for DataService Studio or the online tagging service.
In the API Cached Data Storage Location section, click Modify to specify the storage location for API cached data. The configurations for the active and standby links are the same.
System Redis + Application Memory: Stores cached data in the system public Redis instance. The storage space is shared with other modules. This option is suitable for scenarios with a small amount of cached data.
Note: The standby link stores cached data in the public Redis instance of the Dataphin standby system.
Application Memory Only: This option is not recommended if the amount of cached data is large because memory usage affects the system response time. This option is suitable for scenarios where only a few APIs require caching and the amount of data is very small.
Note: The cache duration is determined by the memory data cache time that is set during the deployment of the Dataphin application. The cache duration that you define when you create an API does not take effect.
Specified Redis Instance: For optimal stability and performance, you can use a dedicated Redis instance. This option stores cached data in the specified Redis instance and is suitable for scenarios where many APIs have caching enabled or the amount of cached data is large. To add a Redis instance, see Create a Redis data source.
Important: Do not delete the Redis instance used for API cached data. Otherwise, cached data storage fails and the enabled cache for the API becomes invalid.
Click Save to complete the configuration of the API cached data storage location.
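To make the storage options above more concrete, the following sketch shows keyed, TTL-based caching of an API result in a Redis instance. It is not Dataphin's internal implementation; the key format, TTL handling, and the run_query helper are assumptions for illustration only.

```python
# Illustrative API result caching in a Redis instance (key format and TTL
# handling are assumptions; Dataphin manages its own cache layout).
import hashlib
import json

import redis  # redis-py client

client = redis.Redis(host="redis.example.com", port=6379, db=0)  # placeholder instance


def run_query(api_name: str, params: dict) -> dict:
    """Stand-in for the actual backend query executed by the API."""
    return {"api": api_name, "rows": []}


def cache_key(api_name: str, params: dict) -> str:
    """Derive a stable cache key from the API name and request parameters."""
    payload = json.dumps(params, sort_keys=True)
    return f"api-cache:{api_name}:{hashlib.md5(payload.encode()).hexdigest()}"


def get_or_query(api_name: str, params: dict, ttl_seconds: int = 300) -> dict:
    key = cache_key(api_name, params)
    cached = client.get(key)
    if cached is not None:
        return json.loads(cached)                         # cache hit
    result = run_query(api_name, params)                  # cache miss: query backend
    client.setex(key, ttl_seconds, json.dumps(result))    # store with a TTL
    return result
```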
SQL injection check
You must enable the DataService Studio or online tagging service feature.
Registered APIs do not support SQL injection checks.
In the SQL Injection Check section, click Modify to enable or disable the SQL injection check.
Enable: When enabled, input SQL is checked during API calls. If the check fails, the API call is blocked. You can view the reason for the failure by checking the error code that is returned on the page.
Disable: When disabled, the input SQL is not checked during API calls.
Click Save to complete the SQL injection check configuration.
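Dataphin's actual SQL injection rules are not documented here. The sketch below illustrates the general idea of such a check: input values are matched against suspicious patterns, and the call is blocked if any pattern matches. The patterns are examples only.

```python
# Simplified illustration of an SQL injection check on API input values.
# The patterns below are examples only; the product's real rules are more thorough.
import re

SUSPICIOUS_PATTERNS = [
    r"(?i)\bunion\b\s+\bselect\b",      # UNION-based injection
    r"(?i);\s*(drop|delete|update)\b",  # stacked destructive statements
    r"--",                              # inline comment used to truncate a query
    r"(?i)\bor\b\s+1\s*=\s*1",          # classic tautology
]


def passes_injection_check(value: str) -> bool:
    """Return False if the input value matches a suspicious pattern."""
    return not any(re.search(pattern, value) for pattern in SUSPICIOUS_PATTERNS)


print(passes_injection_check("shanghai"))      # True: the call proceeds
print(passes_injection_check("x' OR 1=1 --"))  # False: the call is blocked
```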
Log and O&M statistics settings
Prerequisites
Log and O&M statistics can be configured only after the gateway is configured. If gateway log collection is not configured or is disabled after configuration, failed gateway logs cannot be collected. For more information about how to configure the gateway, see Data Service settings.
Limits
The current log collection framework does not guarantee complete high availability. Therefore, some log information may be lost or duplicated when the application restarts or an API call fails.
When DataService Studio HA is enabled, you can test the connectivity of the external log storage data source for only the active link. Log collection for both the active and standby links is successful only if both links can connect to the log storage database.
If you use an Alibaba Cloud API Gateway (dedicated or shared instance), call logs for previously published APIs do not contain `requestid` information. You must republish the APIs to include this information in the logs.
If you use an Alibaba Cloud API Gateway-Shared Instance, the gateway does not support collecting information such as `requestQueryString`, `requestHeaders`, `requestBody`, `responseHeaders`, or `responseBody`. When authentication, throttling, or other failures occur on the gateway, call logs do not contain request parameters, request SQL, or other information.
If log collection is not enabled for the API gateway, the number of failed calls from the gateway is not included in the call statistics, and logs for failed calls cannot be collected. We strongly recommend that you configure log collection for the gateway to collect complete log information.
When the storage table for call detail logs is changed, call detail logs are queried from the new storage table. Data in the previous storage table can no longer be searched. Similarly, when the statistics log table is changed, data in the previous statistics log table can no longer be searched. Plan your table changes accordingly to prevent the loss of access to historical logs.
Procedure
Configuration changes take effect on the same day. Call detail logs and call statistics that exceed the specified storage duration are automatically purged. A long retention period can consume a large amount of storage. Therefore, configure this setting with caution.
In the Log And O&M Statistics Settings section, click Modify at the bottom to configure the storage location and duration for API call detail logs and call statistics.
Call Detail Logs: This feature is disabled by default. After you enable this feature, you can configure the storage location and duration for successful and failed API call detail logs.
Storage Database: Select the storage location for call detail logs. Options include Built-in Storage and External Data Source.
Built-in Storage: Stores up to 500,000 successful and 500,000 failed call detail logs. Available storage durations are 3 Days, 7 Days, 14 Days, Custom (1 to 9,999 days), Do Not Store, and Unlimited.
External Data Source: Stores up to 100 million successful and 100 million failed call detail logs. You must select the data source type, data source, and log table name for log storage. Available storage durations are 7 Days, 14 Days, 30 Days, Custom (1 to 9,999 days), Do Not Store, and Unlimited.
Data Source Type: Only PostgreSQL data sources are supported.
Data Source: Select a data source instance of the specified type. Click Test Connectivity to test the data source connection for the transactional processing (TP) application. To create a data source, click Create to open the Create Data Source dialog box in Management Center > Data Source Management.
Log Table Name - Primary Table: Select the table for storing API call log information. You can also click Create Table With One Click. In the Create Log Table dialog box, click Check Table Name Existence. The system then creates the log table and its related index tables.
The log table contains the following fields. The value of each field cannot exceed 10,240 characters in length. An illustrative DDL sketch follows the field list.
| Field name | Data type | Field description |
| --- | --- | --- |
| biz_code_describe | VARCHAR(1024) | Description of the business code. |
| tenant_id | INT4 | Tenant ID. |
| status_code | VARCHAR(16) | HTTP status code. |
| biz_code | VARCHAR(64) | Business code. |
| create_time | TIMESTAMP(3) | Time when the log was created. |
| api_no | INT4 | API number. |
| ip | VARCHAR(32) | Client IP address. |
| end_time | TIMESTAMP(3) | End time of the query. |
| response_size | INT4 | Response size in bytes. |
| sql | TEXT | SQL statement for the query. |
| start_time | TIMESTAMP(3) | Start time of the query. |
| app_key | INT4 | Application key. |
| response_parameter | TEXT | Response parameters. |
| status_describe | VARCHAR(1024) | Description of the HTTP status code. |
| cost_time | INT4 | Query duration in milliseconds. |
| id | INT8 | Auto-incrementing ordinal number. |
| request_size | INT4 | Request size in bytes. |
| result_count | INT4 | Number of API query results. |
| request_id | VARCHAR(64) | Unique ID of the query. |
| request_parameter | TEXT | Request parameters. |
| successful | BOOLEAN | Indicates whether the query was successful. |
| job_id | VARCHAR(64) | Task ID of the asynchronous call. |
| execute_mode | INT2 | Execution mode of the call. |
| execute_cost_time | INT4 | SQL execution duration of the asynchronous call in milliseconds. |
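If you prepare the external PostgreSQL log table manually instead of using Create Table With One Click, the following hedged sketch builds a table that matches the field list above. The table name, the BIGSERIAL declaration for the auto-incrementing id column, and the connection string are assumptions, and the system-generated index tables are not reproduced; verify the result against a table created by Dataphin.

```python
# Hedged sketch: create a PostgreSQL call detail log table matching the field
# list above. The table name, BIGSERIAL choice for id, and connection string
# are assumptions; the system-generated index tables are not included.
import psycopg2  # PostgreSQL driver

DDL = """
CREATE TABLE IF NOT EXISTS dataservice_api_log_example (
    id                 BIGSERIAL PRIMARY KEY,  -- auto-incrementing ordinal number
    tenant_id          INT4,
    api_no             INT4,
    app_key            INT4,
    request_id         VARCHAR(64),
    ip                 VARCHAR(32),
    status_code        VARCHAR(16),
    status_describe    VARCHAR(1024),
    biz_code           VARCHAR(64),
    biz_code_describe  VARCHAR(1024),
    "sql"              TEXT,
    request_parameter  TEXT,
    response_parameter TEXT,
    request_size       INT4,
    response_size      INT4,
    result_count       INT4,
    successful         BOOLEAN,
    start_time         TIMESTAMP(3),
    end_time           TIMESTAMP(3),
    cost_time          INT4,
    create_time        TIMESTAMP(3),
    job_id             VARCHAR(64),
    execute_mode       INT2,
    execute_cost_time  INT4
);
"""

conn = psycopg2.connect("dbname=logs user=dataphin host=pg.example.com")  # placeholder DSN
with conn, conn.cursor() as cur:
    cur.execute(DDL)
conn.close()
```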
Supplementary Information Table: Stores log information for calls to public APIs in asynchronous invocation mode. You can also click Create Table With One Click. In the Create Log Table dialog box, click Check Table Name Existence. The system then creates a supplementary information table (public API call detail log table) for you.
The supplementary information table contains the following fields.
| Field name | Data type | Field description |
| --- | --- | --- |
| id | INT8 | Auto-incrementing ordinal number. |
| api_no | INT4 | API number. |
| job_id | VARCHAR(64) | Task ID of the asynchronous call. |
| ext_content | TEXT | Result of the public API call. |
| create_time | TIMESTAMP(3) | Time when the log record was created. |
| update_time | TIMESTAMP(3) | Time when the log record was updated. |
| tenant_id | INT4 | Tenant ID. |
Note: Successful and failed logs are stored in the same table. The table name is unique for each tenant. You can use the `successful` field to distinguish between them.
Expired logs are deleted by the system daily, starting at 02:00. Logs that satisfy the condition current date - storage duration - 1 < creation time have not expired; older logs are deleted. For example, with a 7-day storage duration, logs are kept for roughly eight days (seven days plus one buffer day) before being purged.
Built-in storage is suitable for short-term storage scenarios, such as storing successful logs, which can be used to troubleshoot API calls with long response times. External data sources are suitable for long-term storage scenarios, such as storing failed logs, which can be used to troubleshoot API call errors.
After an upgrade in an on-premises environment (including public cloud and private cloud on-premises deployments), the table name for call detail logs in the configuration file changes from `dataservice_api_log` to `dataservice_api_log_${tenantid}`. If you have configured a log collection task for this table, you must update the log table name by navigating to DataService Studio > Management > System Configuration > Call Detail Logs > Log Table Name.
Call Statistics: Configure the storage location and duration for 1-minute and 5-minute API call statistics logs.
Storage Database: Select the storage location for call statistics logs. Options include Built-in Storage and External Data Source.
Built-in Storage: Stores up to 500,000 1-minute and 500,000 5-minute call statistics logs. Available storage durations are 7 Days, 14 Days, 30 Days, and Custom (1 to 9,999 days).
External Data Source: Stores up to 100 million 1-minute and 100 million 5-minute call statistics logs. You must select the data source type, data source, and table name for call statistics log storage. Available storage durations for 1-minute call statistics logs are 7 Days, 14 Days, 30 Days, and Custom (1 to 9,999 days). Available storage durations for 5-minute call statistics logs are 183 Days, 366 Days, 2 Years, 3 Years, and Custom (1 to 9,999 days).
Data Source Type: Only PostgreSQL data sources are supported.
Data Source: Select a data source instance of the specified type. Click Test Connectivity to test the data source connection in the TP application. To create a data source, click Create, which takes you to the Management Center > Data Source Management > Create Data Source dialog box.
Table Name: Select the table for storing call statistics logs. You can also click Create Table With One Click. In the Create Log Table dialog box, click Check Table Name Existence. The system then creates the log table and its related index tables.
The log table contains the following fields.
| Field name | Data type | Field description |
| --- | --- | --- |
| tenant_id | INT4 | Tenant ID. |
| client_ips | VARCHAR(256) | Summary of client IP addresses (up to 10). |
| biz_error_count | INT8 | Number of business errors in API queries. |
| create_time | TIMESTAMP(3) | Time when the log record was created. |
| total_count | INT8 | Total number of API queries. |
| api_no | INT4 | API number. |
| total_time_cost | INT8 | Total time consumed by API queries in milliseconds. |
| offline_count | INT8 | Number of server-side failures in API queries. |
| total_succ_time_cost | INT8 | Total time consumed by successful API queries in milliseconds. |
| minute | VARCHAR(16) | Time window for statistics. |
| update_time | TIMESTAMP(3) | Time when the log record was updated. |
| app_key | INT4 | Application key. |
| client_fail_count | INT8 | Number of client-side failures in API queries. |
| execute_mode | INT2 | Execution mode of the API call. |
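As with the call detail log tables, the following hedged DDL sketch mirrors the field list above for a call statistics table. The table name is a placeholder, and 1-minute and 5-minute statistics would each need their own table:

```python
# Hedged sketch of a call statistics log table matching the field list above.
# The table name and the absence of indexes are assumptions; 1-minute and
# 5-minute statistics are stored in separate tables.
STATISTICS_DDL = """
CREATE TABLE IF NOT EXISTS dataservice_api_stat_example (
    tenant_id            INT4,
    api_no               INT4,
    app_key              INT4,
    minute               VARCHAR(16),  -- time window for the statistics
    total_count          INT8,
    biz_error_count      INT8,
    client_fail_count    INT8,
    offline_count        INT8,
    total_time_cost      INT8,         -- milliseconds
    total_succ_time_cost INT8,         -- milliseconds
    client_ips           VARCHAR(256),
    execute_mode         INT2,
    create_time          TIMESTAMP(3),
    update_time          TIMESTAMP(3)
);
"""
```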
Note: 1-minute and 5-minute call statistics logs are stored in different tables. The table name is unique for each tenant.
Expired logs are deleted by the system daily, starting at 02:00. The expiration rule is the same as for call detail logs: logs that satisfy current date - storage duration - 1 < creation time have not expired; older logs are deleted.
Built-in storage is suitable for short-term storage scenarios, such as storing 1-minute call statistics logs, which can be used to troubleshoot API calls with long response times. External data sources are suitable for long-term storage scenarios, such as storing 5-minute call statistics logs, which can be used to troubleshoot API call errors.
The `dws_dataphin_service_api_mi` table, which is the Data Service call count statistics table in the Dataphin metadata warehouse shared model, records only API call statistics logs from built-in storage. If you use an external data source for log storage, this table does not collect data from it. To retrieve log data from an external data source, you must use an integration task.
Click Save to complete the log and O&M statistics configuration.