Data registration
Before data registration, make sure that you have activated MaxCompute (previously known as ODPS), bound a MaxCompute project, and registered a MaxCompute table. This table is the intermediate table for performance statistics collection, and is also the source table of experiment reports. This table uses user_id, exp_id, and a dimension field as the composite primary key. The composite primary key indicates the metrics of a user in a certain experiment, such as impressions and clicks. For example, you can configure country as the dimension field and use this table as the intermediate statistical table that is collected by user_id, country, and exp_id. The configurations of single-dimension metrics and derived metrics depend on this intermediate table. Before you understand the source table of experiment reports, you must be familiar with the overall experiment architecture, as shown in the following figure.

PAI-Rec is the recommendation service provided by Alibaba Cloud. This service integrates the SDK of the configuration center from which experiment information is obtained. The response returned by the recommendation service contains the experiment ID (exp_id) and the request ID (reqid).
When a user performs an operation on an app, behavioral logs are returned to MaxCompute by the front-end event tracking task on the app. A source table of experiment reports is generated during SQL-based statistics collection. After several metrics are configured in the configuration center, a SQL statement is dynamically assembled to collect statistics from the source table of experiment reports and save the results to a Hologres database. You can query metric data on the Performance Reports page of the PAI-Rec console.
I. Offline experiment reports
Code for generating an offline source table of experiment reports
CREATE TABLE IF NOT EXISTS experiment_report (
user_id STRING COMMENT 'User ID',
exp_id STRING COMMENT 'Experiment ID',
exposure_count BIGINT COMMENT 'Exposure count',
click_count BIGINT COMMENT 'Click count',
like_comment_collect_count BIGINT COMMENT 'Like count'
)
PARTITIONED by (dt STRING COMMENT 'Date')
STORED AS ALIORC;
INSERT OVERWRITE TABLE experiment_report PARTITION (dt = '${bdp.system.bizdate}')
SELECT user_id
,exp_id
,SUM(IF(event == 'expose',1,0))
,SUM(IF(event == 'click',1,0))
,SUM(IF(event IN ('like','comment','collect'),1,0))
FROM rec_sln_demo_behavior_table_v1
WHERE ds = '${bdp.system.bizdate}'
GROUP BY user_id
,exp_idFields in an offline source table of experiment reports
Field | Description | Required | Remarks |
user_id | User ID | Yes | The value of this field can be a user ID, a device ID, or an International Mobile Equipment Identity (IMEI) number. |
exp_id | Experiment ID | Yes | The value of this field is the experiment ID returned by the A/B test, which is logged by the event tracking task. If you use a self-managed recommendation service, you can specify a custom experiment ID. Example: exp_id:ER1_L1#EG1#E1_L2#EG2#E2 |
Dimension field | Used for more granular performance report statistics | No | You can use this dimension field to collect statistics and display corresponding metrics. For example, you can set a province field to collect statistics and analyze product performance in different provinces. You can set a maximum of two fields such as province and city. When you set a third field, there are too many combinations, which slows down the calculation. Common dimension fields can also include recall ID (recall_id), which is used to analyze the performance of different recall sources. You need to first set up the backend logs of the PAI-Rec engine. For more information, see Other configurations for pairec_debug_log. |
Compute field | You can specify multiple compute fields, such as show_count (impressions); click_count (clicks) | Yes | You need to enter numeric values for the compute fields, such as the number of impressions, the number of clicks, and view duration. |
dt | The date field. | Yes | The value of this field is in the yyyyMMdd format. |
hh | The hour field. | No | The 24-hour system is used. The value of this field ranges from 00 to 23. |
mm | The minute field. | No | The value of this field ranges from 00 to 59. |
The source table can be generated offline on a daily basis or on an hourly basis (for example, scheduled once per hour in DataWorks to collect statistics on impression, click, and purchase log data from the previous hour), depending on your business requirements. If the source table is generated offline, the user_id, exp_id, and dt fields are required. If the source table is generated on an hourly basis, the user_id, exp_id, dt, and hh fields are required. The mm field is optional.
Metrics are divided into offline metrics and real-time metrics (corresponding to hourly reports). If data sources are generated in real time, offline metrics and real-time metrics can share the same data source. If data sources are generated offline, you can only create offline metrics, not real-time metrics.
The dimension field in the preceding table is optional. After the dimension field is specified, you can analyze metrics by the specified dimension. For example, you can view report data by common dimension fields including gender and OS (such as iOS and Android).
II. Real-time source table of experiment reports
Code for generating a source table of experiment reports
CREATE TABLE IF NOT EXISTS experiment_report_real (
user_id STRING COMMENT 'User ID',
exp_id STRING COMMENT 'Experiment ID',
exposure_count BIGINT COMMENT 'Exposure count',
click_count BIGINT COMMENT 'Click count',
like_comment_collect_count BIGINT COMMENT 'Like count'
)
PARTITIONED by (
dt string
,hh string
)
STORED AS ALIORC;
INSERT OVERWRITE TABLE experiment_report_real PARTITION (dt = '${bdp.system.bizdate}',hh = '${hour}')
SELECT user_id
,exp_id
,SUM(IF(event == 'expose',1,0))
,SUM(IF(event == 'click',1,0))
,SUM(IF(event IN ('like','comment','collect'),1,0))
FROM rec_sln_demo_behavior_table_v1
WHERE ds = '${bdp.system.bizdate}'
AND hh = hour(now()) -1
GROUP BY user_id
,exp_id;Fields in a real-time source table of experiment reports
Field | Description | Required | Remarks |
user_id | User ID | Yes | The value of this field can be a user ID, a device ID, or an International Mobile Equipment Identity (IMEI) number. |
exp_id | Experiment ID | Yes | The value of this field is the experiment ID returned by the A/B test, which is logged by the instrumentation task. If you use a self-managed recommendation service, you can specify a custom experiment ID. Example: exp_id:ER1_L1#EG1#E1_L2#EG2#E2 |
Dimension field | The dimension field, such as OS | No | You can filter metrics based on the specified dimension field. |
Compute field | You can specify multiple compute fields, such as show_count (impressions) and click_count (clicks) | Yes | You need to enter numeric values for the compute fields, such as the number of impressions, the number of clicks, and view duration. |
dt | Date partition | Yes | The value of this field is in the yyyyMMdd format. |
hh | Hour partition | Yes | The 24-hour system is used. The value of this field ranges from 00 to 23. |
mm | Minute partition | No | The value of this field ranges from 00 to 59. |
The fields in result tables are fixed, except the dimension field. Make sure that the definition of the dimension field in result tables is consistent with that in source tables.
Offline metrics and real-time metrics cannot share a result table. Among the three time fields, only the dt field is required for the result table of offline metrics, and the dt and hh fields are required for the result table of real-time metrics.
After you obtain the source table of experiment reports, you can go to the Data Registration page on the PAI-Rec console to add a MaxCompute table. To access the Data Registration page, choose Metric Management > Data Registration in the navigation pane on the left.
We recommend that you use the default Hologres table to write back metric calculation results. This table does not consume resources. The system automatically creates tables and stores relevant metric data for you. You can also use the Hologres table you created instead of the default Hologres table.
III. Register a MaxCompute table
Click Create Data Table. In the panel that appears, select a MaxCompute project, select the MaxCompute data table that you want to add, specify a name for the data table, and then click Import.
If the fields in the data table that you registered are modified, we recommend that you import the data table again at the earliest opportunity. Otherwise, the system may not recognize the modified fields.

Field configuration
The registered data table automatically appears in the list. You can click View Fields in the Actions column of the data table to view the fields in the data table or modify the fields. The user_id, exp_id, and dt fields are required. These fields represent the total number of behaviors such as impressions, clicks, and likes of a user (user_id) in a set of experiments (exp_id) on a specific day (dt). Custom metrics are defined and calculated based on this table.

Dimension fields
You can add one or two dimension fields to the preceding table. For example, if you set city as a dimension field, you can view the comparative effects of experiments under different city dimensions in the experiment report. If you set city and gender as dimension fields, the experiment comparison results will be based on the combination of these two fields.