Log configuration

Last Updated: Nov 03, 2025

Debug log configuration

After a service is published, you can collect various kinds of information to analyze its online performance. You can print logs for offline debugging, but that approach is not flexible enough for an online service.

This configuration collects online debug information. You can enable it to print the information to the console, output it to DataHub for detailed analysis (for example, to analyze the effect of each recall), or write it to a local file for downstream use. Currently, this configuration collects item data after the recall, filtering, and coarse-ranking stages.

The debug configuration corresponds to `DebugConfs` in the configuration overview. `DebugConfs` is a `Map[string]object` structure whose key is the scenario name, so you can maintain an isolated configuration for each scenario; a multi-scenario sketch follows the reference example below.

{
    "DebugConfs": {
        "${scene_name}": {
            "OutputType": "datahub",
            "Rate": 0,
            "DebugUsers": [
                "1001"
            ],
            "DatahubName": "dh_debug_log",
            "KafkaName": "pairec_debug_log",
            "FilePath": "/Users/username/pairec/debug_log/",
            "MaxFileNum": 20
        }
    }
}
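
Because the key of `DebugConfs` is the scenario name, each scenario can use a different output type. The following is a minimal sketch, assuming two hypothetical scenarios named `home_feed` and `detail_page`:

{
    "DebugConfs": {
        "home_feed": {
            "OutputType": "datahub",
            "Rate": 10,
            "DatahubName": "dh_debug_log"
        },
        "detail_page": {
            "OutputType": "console",
            "Rate": 100
        }
    }
}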

OutputType (string, required)

  The output method for debug logs. Valid values:

  • console: Outputs to the console.

  • datahub: Outputs to a topic in DataHub.

  • kafka: Outputs to a topic in Kafka.

  • file: Outputs to a local file.

Rate (int, required)

  The log sampling ratio, in the range 0 to 100. Adjust this value based on the queries per second (QPS) of your online service. Note: if the value is 0, no debug information is output.

DebugUsers ([]string, optional)

  Records logs only for the specified UIDs. For example, `"DebugUsers": ["1001"]` records logs only for user 1001.

DatahubName (string, required when `OutputType` is `datahub`)

  The custom name defined in `DatahubConfs` in the data source configuration.

KafkaName (string, required when `OutputType` is `kafka`)

  The instance name defined in `KafkaConfs` in the data source configuration.

FilePath (string, required when `OutputType` is `file`)

  The path to which debug logs are written. The path is created automatically if it does not exist.

MaxFileNum (int, optional; applies when `OutputType` is `file`)

  The maximum number of log files kept in the `FilePath` directory. The default value is 20. Each log file is limited to 1 GB; when a file exceeds this limit, it is rotated. When the number of files exceeds `MaxFileNum`, the oldest log file is deleted.
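
As a concrete sketch of the `file` output mode, the following configuration samples roughly 10% of requests and keeps at most 20 rotated log files in the target directory. The scenario name and path are hypothetical:

{
    "DebugConfs": {
        "home_feed": {
            "OutputType": "file",
            "Rate": 10,
            "FilePath": "/home/admin/pairec/debug_log/",
            "MaxFileNum": 20
        }
    }
}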

The output information includes the following fields:

  • request_id: A unique ID for each recommendation request.

  • module: The module that generated the log. Current modules include `recall`, `filter`, and `general_rank`.

  • scene_id: The scenario ID.

  • exp_id: The experiment ID.

  • request_time: The request UNIX timestamp in seconds.

  • uid: The user ID.

  • retrieveid: The recall ID.

  • items: The list of items, in the format `"item1:score1:{'dbmtl_prob_click':'0.03'},item2:score2:{'dbmtl_prob_click':'0.04'}"`. If `module` is `recall` or `filter`, `score` is the recall score. If `module` is `general_rank`, `rank`, or `sort`, `score` is the score returned by the model service during the general rank or rank stage, and the braces contain the model's target scores. If the model service has multiple targets, the braces contain all target scores. The actual target names depend on the model service; `dbmtl_prob_click` is only an example.

Currently, within a module, items are output separately for each `retrieveid`. For example, if there are five recall sources, five records are generated when `module` is `recall`, one per recall source. The `filter` and `general_rank` modules follow the same logging logic.
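
For illustration, a single record from the `general_rank` module might look like the following, shown as JSON for readability. All field values are hypothetical:

{
    "request_id": "7f3c2e10-0a1b-4c5d-9e8f-123456789abc",
    "module": "general_rank",
    "scene_id": "home_feed",
    "exp_id": "ER_2",
    "request_time": 1700000000,
    "uid": "1001",
    "retrieveid": "hot_items",
    "items": "item1:0.95:{'dbmtl_prob_click':'0.03'},item2:0.87:{'dbmtl_prob_click':'0.04'}"
}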

DataHub configuration

For the DataHub configuration, you do not need to create the topic in advance. The engine creates it automatically when you specify the topic name and schema.

{
    "DatahubConfs": {
        "dh_debug_log": {
            "Endpoint": "http://dh-cn-beijing-int-vpc.aliyuncs.com",
            "ProjectName": "project_test",
            "TopicName": "pairec_debug_log",
            "Schemas": [
                {
                    "Field": "request_id",
                    "Type": "string"
                },
                {
                    "Field": "module",
                    "Type": "string"
                },
                {
                    "Field": "scene_id",
                    "Type": "string"
                },
                {
                    "Field": "request_time",
                    "Type": "integer"
                },
                {
                    "Field": "exp_id",
                    "Type": "string"
                },
                {
                    "Field": "items",
                    "Type": "string"
                },
                {
                    "Field": "retrieveid",
                    "Type": "string"
                },
                {
                    "Field": "uid",
                    "Type": "string"
                }
            ]
        }
    }
}

When the service runs, the engine automatically creates the `pairec_debug_log` topic and applies the specified schema. After you see log output in the DataHub console, you can create a subscription between DataHub and MaxCompute to save the DataHub data to a MaxCompute table.

Endpoint (string, required)

  A domain name from the DataHub endpoint list. If the DataHub project and PAI-Rec are in the same region, use the VPC endpoint. Otherwise, use the public endpoint.

ProjectName (string, required)

  The name of the DataHub project.

TopicName (string, required)

  The name of the DataHub topic.

Schemas ([]map, required)

  The DataHub schema definition.

Kafka configuration

For the Kafka configuration, configure the instance name (which corresponds to `KafkaName` in `DebugConfs`), the `BootstrapServers` ingest endpoint, and the name of an existing topic. Unlike DataHub, the Kafka topic is not created automatically by the engine.

{
    "KafkaConfs": {
        "pairec_debug_log": {
            "BootstrapServers": "alikafka-post-cn-xxxxx-1.alikafka.aliyuncs.com:9093,alikafka-post-cn-xxxxx-2.alikafka.aliyuncs.com:9093,alikafka-post-cn-xxxxx-3.alikafka.aliyuncs.com:9093",
            "Topic": "debug_log"
        }
    }
}
Important

If you configure the service in the PAI-Rec console and deploy it to EAS, the virtual private cloud (VPC) and vSwitch in the network configuration must be the same as those of the Kafka instance.

Feature log configuration

This configuration collects online user-side and item-side features and outputs them to DataHub for detailed analysis.

The feature log configuration is defined in `FeatureLogConfs` in the configuration overview. `FeatureLogConfs` is a `Map[string]object` structure. The key represents the scenario, which lets you isolate the configuration for each scenario.

{
    "FeatureLogConfs": {
        "${scene_name}": {
            "OutputType": "datahub",
            "DatahubName": "",
            "UserFeatures": "",
            "ItemFeatures": ""
        }
    }
}

OutputType (string, required)

  The output method for feature logs. Currently, only `datahub` is supported.

DatahubName (string, required)

  The custom name defined in `DatahubConfs` in the data source configuration.

UserFeatures (string, optional)

  A user can have many features; this parameter selects the subset of user-side features to record. Separate multiple feature names with commas. Use an asterisk (`*`) to record all user-side features. If this parameter is empty or omitted, no user-side features are recorded.

ItemFeatures (string, optional)

  Selects the subset of item-side features to record, in the same format as `UserFeatures`. Use an asterisk (`*`) to record all item-side features in the engine. If this parameter is empty, only the recall ID and model scoring features are recorded.
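
For example, a configuration that records three user-side features and all item-side features might look like the following. The scenario name and feature names are hypothetical:

{
    "FeatureLogConfs": {
        "home_feed": {
            "OutputType": "datahub",
            "DatahubName": "dh_feature_log",
            "UserFeatures": "age,gender,city",
            "ItemFeatures": "*"
        }
    }
}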

DataHub configuration

For the DataHub configuration, you do not need to create the topic in advance. The engine creates it automatically when you specify the topic name and schema.

{
    "DatahubConfs": {
        "dh_feature_log": {
            "Endpoint": "http://dh-cn-beijing-int-vpc.aliyuncs.com",
            "ProjectName": "",
            "TopicName": "pairec_feature_log",
            "Schemas": [
                {
                    "Field": "request_id",
                    "Type": "string"
                },
                {
                    "Field": "scene_id",
                    "Type": "string"
                },
                {
                    "Field": "exp_id",
                    "Type": "string"
                },
                {
                    "Field": "request_time",
                    "Type": "integer"
                },
                {
                    "Field": "user_id",
                    "Type": "string"
                },
                {
                    "Field": "user_features",
                    "Type": "string"
                },
                {
                    "Field": "item_id",
                    "Type": "string"
                },
                {
                    "Field": "position",
                    "Type": "string"
                },
                {
                    "Field": "item_features",
                    "Type": "string"
                }
            ]
        }
    }
}

When the service runs, the engine automatically creates the `pairec_feature_log` topic and applies the specified schema.
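
For illustration, one feature log record might look like the following, shown as JSON for readability. All values are hypothetical, including the serialization of the `user_features` and `item_features` strings, which depends on the engine:

{
    "request_id": "7f3c2e10-0a1b-4c5d-9e8f-123456789abc",
    "scene_id": "home_feed",
    "exp_id": "ER_2",
    "request_time": 1700000000,
    "user_id": "1001",
    "user_features": "age:25,gender:male,city:Beijing",
    "item_id": "item1",
    "position": "1",
    "item_features": "recall_name:hot_items,dbmtl_prob_click:0.03"
}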

Endpoint (string, required)

  A domain name from the DataHub endpoint list. If the DataHub project and PAI-Rec are in the same region, a VPC endpoint is typically used. Otherwise, a public endpoint is typically used.

ProjectName (string, required)

  The name of the DataHub project.

TopicName (string, required)

  The name of the DataHub topic.

Schemas ([]map, required)

  The DataHub schema definition.