Artificial Intelligence Recommendation: Push data

Last Updated: Feb 20, 2024

Overview

This topic describes how to push real-time data to the Artificial Intelligence Recommendation (AIRec) service. After an instance is started, every data change in the user, item, and behavior tables (addition, modification, or deletion) must be synchronized to AIRec.

Call the PushDocument operation to push real-time data to AIRec instances of Industry Operation Edition and Algorithm Configuration Edition.

Terms

data table

A data table is a collection of data that is organized based on data attributes. The types of data tables vary based on specific scenarios. In most cases, the following types of tables are required: user table, item table, and behavior table.

document

A document is the basic unit for data uploads. Documents are structured in the JSON format.

Reserved field

| Field | Valid value | Description |
| --- | --- | --- |
| cmd | add, update, delete | Specifies whether to add, update, or delete a document. |

Important
  1. An add operation adds a document, whereas an update operation updates an existing document.

  2. If you add a document that already exists, the existing document is replaced by the new document. For an add operation, you must submit the primary key and all required fields.

  3. For an update operation, submit the primary key and only the fields that you want to update. If you update a document that does not exist, a dirty data record is generated, and that record is unusable. Therefore, do not use an update operation to add a document.

  4. To delete the value of a field in a data record, update the field to none instead of leaving the field empty.
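The difference between add and update can be illustrated with a small in-memory model. This is only a sketch of the semantics described above, not the AIRec implementation: the class name `CmdSemanticsModel` and its methods are hypothetical, and the local store stands in for a data table.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of the cmd semantics; not part of any AIRec SDK. */
final class CmdSemanticsModel {
    // Stand-in for a data table: primary key -> field map.
    private final Map<String, Map<String, String>> store = new HashMap<>();

    /** add: stores the document and replaces any existing document with the same key. */
    void add(String key, Map<String, String> fields) {
        store.put(key, new HashMap<>(fields));
    }

    /**
     * update: merges only the supplied fields into an existing document.
     * Returns false when the key is unknown, which is the case that the
     * text says produces an unusable dirty record on the service side.
     */
    boolean update(String key, Map<String, String> changedFields) {
        Map<String, String> existing = store.get(key);
        if (existing == null) {
            return false;
        }
        existing.putAll(changedFields);
        return true;
    }

    /** delete: removes the document if it is present. */
    void delete(String key) {
        store.remove(key);
    }

    /** Returns the stored fields, or null if the key is unknown. */
    Map<String, String> get(String key) {
        return store.get(key);
    }
}
```

In this model, an add call discards fields that the new document omits, while an update call keeps them. That is why an add operation needs all required fields and an update operation needs only the changed ones.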

Content field

Content fields carry the actual information of a document. For the content fields that you can specify, see Data specifications. These data specifications do not apply to Cold Start Edition.

Push request

Multiple documents from a data table can be pushed in a request. We recommend that you push as many documents as possible in a request to improve the processing performance. For example, you can trigger a push request when 800 documents, the maximum for a single request, have accumulated, or trigger a push request every 10 seconds.

The cmd field can be set to different values for documents that are pushed in a request.
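The batching recommendation above could be sketched as follows. The class `DocumentBatcher` and its `sender` hook are hypothetical. The hook stands for whatever code calls PushDocument, and the thresholds are the 800-document and 10-second figures from the text. The current time is passed in explicitly so that the logic is easy to test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper that groups documents into batches before they are pushed. */
final class DocumentBatcher {
    static final int MAX_DOCS = 800;              // per-request document limit from the text
    static final long MAX_WAIT_MILLIS = 10_000L;  // 10-second interval from the text

    private final Consumer<List<String>> sender;  // assumed hook that performs the push
    private final List<String> buffer = new ArrayList<>();
    private long firstBufferedAt;

    DocumentBatcher(Consumer<List<String>> sender) {
        this.sender = sender;
    }

    /** Buffers one document (a JSON object string) and flushes when a threshold is reached. */
    synchronized void offer(String documentJson, long nowMillis) {
        if (buffer.isEmpty()) {
            firstBufferedAt = nowMillis;
        }
        buffer.add(documentJson);
        if (buffer.size() >= MAX_DOCS || nowMillis - firstBufferedAt >= MAX_WAIT_MILLIS) {
            flush();
        }
    }

    /** Sends whatever is buffered; does nothing when the buffer is empty. */
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sender.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

In this sketch the time threshold is evaluated only when a new document arrives, so a caller would also invoke `flush()` from a timer and at shutdown. The sketch does not track the request size, which is subject to a separate limit.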

Quota

If the data in a user table or an item table exceeds the purchased quota, the PushDocument operation returns an error code which indicates that the quota is exceeded. For more information, see 14. What are the common errors that occur when data is pushed by using a server SDK?

In this case, a push request can contain only documents for which the cmd field is set to delete. You can use such requests to delete unnecessary documents.

Limits

  1. A single request can push a maximum of 800 documents.

  2. The data size in a single request cannot exceed 1 MB.

  3. The size of a document cannot exceed 10 KB.

  4. The data size in a field cannot exceed 5 KB.

  5. Make sure that you do not push data when an instance is being started or restarted.
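The size limits above can be checked locally before a request is sent. The following is a minimal sketch with hypothetical names (`PushLimits`, `checkRequest`, `fieldFits`). It assumes that sizes are measured in UTF-8 bytes and that 1 KB is 1,024 bytes, neither of which is stated in this topic.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight checks for the documented push limits. */
final class PushLimits {
    static final int MAX_DOCS_PER_REQUEST = 800;
    static final int MAX_REQUEST_BYTES = 1024 * 1024; // assumes 1 MB = 1,048,576 bytes
    static final int MAX_DOCUMENT_BYTES = 10 * 1024;  // assumes 1 KB = 1,024 bytes
    static final int MAX_FIELD_BYTES = 5 * 1024;

    /** Size of a string in UTF-8 bytes (assumed unit of measurement). */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Returns true if a single field value is within the per-field limit. */
    static boolean fieldFits(String value) {
        return utf8Length(value) <= MAX_FIELD_BYTES;
    }

    /** Returns the list of violations for a request; an empty list means no violation was found. */
    static List<String> checkRequest(List<String> documentJsons) {
        List<String> problems = new ArrayList<>();
        if (documentJsons.size() > MAX_DOCS_PER_REQUEST) {
            problems.add("too many documents: " + documentJsons.size());
        }
        long total = 2; // the enclosing '[' and ']' of the JSON array
        for (int i = 0; i < documentJsons.size(); i++) {
            int len = utf8Length(documentJsons.get(i));
            if (len > MAX_DOCUMENT_BYTES) {
                problems.add("document " + i + " is too large: " + len + " bytes");
            }
            total += len + (i > 0 ? 1 : 0); // plus one separating comma
        }
        if (total > MAX_REQUEST_BYTES) {
            problems.add("request is too large: " + total + " bytes");
        }
        return problems;
    }
}
```

Because the exact way the service measures size is an assumption here, a caller may prefer to stay well below each threshold rather than rely on the boundary values.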

Manage documents

The cmd field specifies the operation that you want to perform on a data table.

If the cmd field is set to add, documents are added to the specified data table. If existing documents in the data table have the same unique identifiers as the documents that you want to add, the existing documents are replaced by the new documents.

If the cmd field is set to update, existing documents in the specified data table are updated.

If the cmd field is set to delete, existing documents in the specified data table are deleted.

The following table lists the unique identifiers of user tables, item tables, and behavior tables.

| Table | Unique identifier |
| --- | --- |
| User table | The user_id field. |
| Item table | The item_id and item_type fields. |
| Behavior table | N/A. In a behavior table, documents cannot be updated or deleted. |

The following table describes the fields that a document must contain for each value of the cmd field.

| Operation | Reserved field | Content field |
| --- | --- | --- |
| add | All reserved fields | The combination of fields that form a unique identifier in a data table and the required fields specified in Data specifications. |
| update | All reserved fields | The combination of fields that form a unique identifier in a data table and the fields that you want to update. |
| delete | All reserved fields | The combination of fields that form a unique identifier in a data table. |
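The identifier rules above lend themselves to a local check before documents are pushed. The following sketch uses hypothetical names (`DocumentIntegrity`, `identifierFields`, `problems`). It verifies only the cmd value and the unique-identifier fields. The required fields listed in Data specifications are not reproduced in this topic, so the sketch does not check them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical local validation of cmd and unique-identifier fields. */
final class DocumentIntegrity {
    /** Fields that form the unique identifier of each table, as listed in this topic. */
    static List<String> identifierFields(String tableName) {
        switch (tableName) {
            case "user":
                return Collections.singletonList("user_id");
            case "item":
                return Arrays.asList("item_id", "item_type");
            case "behavior":
                return Collections.emptyList(); // behavior documents have no unique identifier
            default:
                throw new IllegalArgumentException("unknown table: " + tableName);
        }
    }

    /** Returns the problems found for one document; an empty list means none were found. */
    static List<String> problems(String tableName, String cmd, Map<String, String> fields) {
        List<String> out = new ArrayList<>();
        if (!Arrays.asList("add", "update", "delete").contains(cmd)) {
            out.add("invalid cmd: " + cmd);
        }
        if ("behavior".equals(tableName) && !"add".equals(cmd)) {
            out.add("behavior documents cannot be updated or deleted");
        }
        for (String name : identifierFields(tableName)) {
            String value = fields.get(name);
            if (value == null || value.isEmpty()) {
                out.add("missing identifier field: " + name);
            }
        }
        return out;
    }
}
```

For example, a delete document for the item table that carries only `item_id` would be reported as missing `item_type`.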

Important

If you add a behavioral data record, make sure that the corresponding item data and user data are uploaded on the same day. Otherwise, the behavioral data is invalid. You can query invalid behavioral data on the Data Diagnostics page of the AIRec console.

Description of API operations

When you call an API operation, you must specify the instance ID, the data table name, and the documents that you want to push. The documents must be formatted as a JSON array. Note: You can add or modify specific fields. For more information, see Data specifications.

1. Add data.

// Push added item data.
[
    {
        "cmd": "add",
        "fields": {
            "item_id": "1",
            "item_type": "article",
            "title": "The college student who wins the Touching China award becomes the Deputy Secretary",
            "content": "Content",
            "pub_time": "1590327038",
            "status":"1",
            "scene_id":"test01",
            "weight":"100",
            "category_level":"3",
            "category_path":"12_1024_56",
            "tags":"News, Touching China"
        }
    },
    {
        "cmd": "add",
        "fields": {
            "item_id": "2",
            "item_type": "article",
            "title": "The college student who wins the Touching China award, XXX",
            "content": "Content",
            "pub_time": "1590327038",
            "status":"1",
            "scene_id":"test01",
            "weight":"1",
            "category_level":"3",
            "category_path":"12_1024_56",
            "tags":"Touching China"
        }
    }
]



// Push added behavioral data.
[{
        "cmd": "add",
        "fields": {
            "item_id": "1024233",
            "item_type": "article",
            "bhv_type": "expose",
            "bhv_value": "1",
            "trace_id": "Alibaba",
            "trace_info": "1007.5911.12351.1002000:::::::",
            "scene_id": "testScene01",
            "bhv_time": "1600852251",
            "user_id": "1"
        }
    },
    {
        "cmd": "add",
        "fields": {
            "item_id": "1024234",
            "item_type": "article",
            "bhv_type": "expose",
            "bhv_value": "1",
            "trace_id": "Alibaba",
            "trace_info": "1007.5911.12351.1002000:::::::",
            "scene_id": "testScene01",
            "bhv_time": "1600852251",
            "user_id": "1"
        }
    }
]

2. Update data. You must specify the item ID, item type, field, and content that you want to update. Example:

[
    {
        "cmd": "update",
        "fields": {
            "item_id": "2",
            "item_type": "article",
            "title": "Title of an updated item"
        }
    }
]

3. Delete data. You must specify the item ID and item type that you want to delete. Example:

[
    {
        "cmd": "delete",
        "fields": {
            "item_id": "2",
            "item_type": "article"
        }
    }
]

Data requirements:

  1. Even if only one document is pushed in a request, the document data must be formatted as a JSON array.

  2. You can use the Pretty Print JSON format, or use a compact style without line breaks or indentations.
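Both requirements can be met by always serializing through one helper that wraps the documents in an array. The following is a minimal sketch that uses only the Java standard library. The class `PushContent` and its methods are hypothetical, and all field values are treated as strings, as in the examples in this topic. A JSON library could be used instead.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical serializer for push content in compact JSON style. */
final class PushContent {
    /** Escapes a string for use inside a JSON string literal. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Serializes one document as {"cmd":...,"fields":{...}}. */
    static String document(String cmd, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{\"cmd\":\"");
        sb.append(escape(cmd)).append("\",\"fields\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
        }
        return sb.append("}}").toString();
    }

    /** Wraps serialized documents in a JSON array; a single document is wrapped too. */
    static String array(List<String> documents) {
        return "[" + String.join(",", documents) + "]";
    }
}
```

The string returned by `array` is the kind of value that the sample code in this topic passes to `setHttpContent` as the request content.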

Parameters

| Parameter | Type | Description | Required | Value |
| --- | --- | --- | --- | --- |
| tableName | String | The name of the table to which you want to push data. | Yes | user, item, behavior |
| content | JSON | The documents that you want to push. For more information, see the preceding content. | Yes | N/A |
| content-cmd | String | The operation type. For more information, see the preceding content. | Yes | add, update, delete |

Causes of common failures when you push data by using server SDKs

For more information, see 14. What are the common errors that occur when data is pushed by using a server SDK?

Sample code

package com.aliyun.airec;

import com.aliyuncs.DefaultAcsClient;
import com.aliyuncs.airec.model.v20181012.PushDocumentRequest;
import com.aliyuncs.airec.model.v20181012.PushDocumentResponse;
import com.aliyuncs.http.FormatType;
import com.aliyuncs.profile.DefaultProfile;
import com.aliyuncs.profile.IClientProfile;

public class PushDocument {

    public static void main(String[] args) {
        // Note: The region ID in the following code must be the same as the ID of the region where the instance resides. For example, if the instance is deployed in the China (Beijing) region, enter cn-beijing in the following code.
        // The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations in AIRec is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. 
        // We recommend that you do not save the AccessKey ID and AccessKey secret in your project code. Otherwise, the AccessKey pair may be leaked, and the security of all resources that belong to your account may be compromised. 

        // In this example, the AccessKey ID and AccessKey secret are stored in the environment variables to implement identity verification. 
        IClientProfile profile = DefaultProfile.getProfile("cn-hangzhou", System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));

        // Note: The region ID in the following code must be the same as the ID of the region where the instance resides. For example, if the instance is deployed in the China (Beijing) region, enter ("cn-beijing", "Airec", "airec.cn-beijing.aliyuncs.com") in the following code.
        DefaultProfile.addEndpoint("cn-hangzhou", "Airec", "airec.cn-hangzhou.aliyuncs.com");

        DefaultAcsClient client = new DefaultAcsClient(profile);

        PushDocumentRequest request = new PushDocumentRequest();
        request.setAcceptFormat(FormatType.JSON);

        // Enter the instance ID.
        request.setInstanceId("airec-xxxx");
        // Enter the data table name, which can be user, item, or behavior.
        request.setTableName("item");

        String content = "JSON data that you want to upload to the AIRec instance";
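        // Encode the content explicitly as UTF-8 so that the bytes match the declared charset on every platform.
        request.setHttpContent(content.getBytes(java.nio.charset.StandardCharsets.UTF_8), "UTF-8", FormatType.JSON);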

        try {
            PushDocumentResponse response = client.getAcsResponse(request);
            System.out.println(response.getResult());

        } catch (Exception e) {
            e.printStackTrace();  
        }
    }
}

Demo for pushing data and obtaining recommendation results

For more information, see airec-demo.