Data collection 2.0

Last Updated: Sep 09, 2021

Benefits of uploading behavioral data to OpenSearch

  • You can use behavioral data to understand how users react to search results through actions such as browsing, clicking, dwelling, liking, sharing, adding to favorites, and purchasing. This insight helps you optimize search results.

  • The report statistics feature of OpenSearch allows you to view various search reports for applications, such as the reports of page views (PVs), item page views (IPVs), and click-through rate (CTR). You can improve your business operations based on the reports.

  • OpenSearch provides an algorithm platform that allows you to use feedback data of search behavior to train sort algorithm models for search. This helps you improve search quality.

Usage notes

  • Data refers to the feedback data of user reactions to search results.

  • Collection refers to the process of uploading search behavioral data to OpenSearch by using OpenSearch SDKs. In the latest version, OpenSearch allows you to collect search behavioral data only by using a server SDK. The features of collecting search behavioral data by using a mobile SDK or web SDK will be available soon.

  • Compared with the earlier data collection feature, data collection 2.0 makes it easier to pass parameters and use SDKs. If you are new to OpenSearch, you can use OpenSearch SDKs to upload behavioral data by using the fields that are described in this topic.

    Note: The SDK for Java 3.4.0 and SDK for PHP 3.2.0 support data collection 2.0.

Enable data collection

  1. Log on to the OpenSearch console. In the left-side navigation pane, choose Feature Extensions > data collection.

  2. On the data collection page, select an application for which you want to enable data collection, and select Open Server-side data collection service and Read Usage Commitment to enable data collection.

  3. The enabling process may take several minutes to complete. After Activation status becomes Successfully opened, you can upload behavioral data.

Upload behavioral data

Note: After you enable behavioral data collection in the OpenSearch console, we recommend that you upload behavioral data by using SDKs. The following items describe the fields that are required to upload behavioral data:

  1. To upload behavioral data by using SDKs, you must specify the following fields: imei or user_id, biz_id, trace_id, trace_info, rn, bhv_type, bhv_time, item_id, and item_type. At least one of imei and user_id is required.

  2. To upload behavioral data by calling API operations, you must specify the reach_time field in addition to the preceding fields.

  3. For sample code that uploads behavioral data by using SDKs or by calling API operations, see SDKs for data collection 2.0.
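The field requirements above can be checked before a record is uploaded. The following Python sketch is only an illustration: the function name missing_required_fields, the via_api flag, and the sample values are assumptions made for this example and are not part of any OpenSearch SDK.

```python
# Fields that must always be present, as listed in this topic.
REQUIRED_FIELDS = (
    "biz_id", "trace_id", "trace_info", "rn",
    "bhv_type", "bhv_time", "item_id", "item_type",
)


def _is_blank(value):
    """Treat None and the empty string as missing; 0 is a valid value."""
    return value is None or value == ""


def missing_required_fields(record, via_api=False):
    """Return the names of required fields that are absent or empty.

    Hypothetical helper: set via_api=True when the record is sent by
    calling API operations, in which case reach_time is also required.
    """
    missing = [name for name in REQUIRED_FIELDS if _is_blank(record.get(name))]
    # At least one of imei and user_id is required.
    if _is_blank(record.get("imei")) and _is_blank(record.get("user_id")):
        missing.append("imei|user_id")
    if via_api and _is_blank(record.get("reach_time")):
        missing.append("reach_time")
    return missing


# Sample record with placeholder values (not real identifiers).
record = {
    "user_id": "user-123",
    "biz_id": 10001,
    "trace_id": "Alibaba",
    "trace_info": "<ops_request_misc value from the search results>",
    "rn": "<request_id value from the search results>",
    "bhv_type": "click",
    "bhv_time": "1631145600",
    "item_id": "doc-42",
    "item_type": "goods",
}

print(missing_required_fields(record))                # []
print(missing_required_fields(record, via_api=True))  # ['reach_time']
```

A check of this kind runs on the client side only. It does not replace the data quality verification that OpenSearch performs after the upload.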

Description of behavioral data fields

| ID | Field | Type | Description | Value | Required |
|----|-------|------|-------------|-------|----------|
| 1 | app_version | STRING | The version number of the website or mobile app that collects behavioral data. | | No |
| 2 | sdk_type | STRING | The type of the SDK that is used to upload behavioral data. OpenSearch uses this field to distinguish whether behavioral data is uploaded or collected by using a server SDK or mobile SDK. | | No. If you upload behavioral data by using OpenSearch SDKs, the value of this field is set to opensearch_sdk by default. |
| 3 | sdk_version | STRING | The version number of the SDK that is used to upload behavioral data. | | No. If you upload behavioral data by using OpenSearch SDKs, the value of this field is set by default. |
| 4 | login | STRING | Specifies whether the user has logged on to the website or mobile app that collects behavioral data. | Valid values: 0 and 1. A value of 0 indicates that the user is not logged on. A value of 1 indicates that the user is logged on. | No |
| 5 | user_id | STRING | The ID used to uniquely identify the user. | | No. However, you must specify either the imei field or this field. |
| 6 | imei | STRING | The ID of the user device. Valid values: imei, device_id, and idfa. | | No. However, you must specify either the user_id field or this field. |
| 7 | biz_id | BIGINT | A numeric ID used to distinguish between different search services. Generally, a biz_id field corresponds to an OpenSearch application. | | Yes |
| 8 | trace_id | STRING | The provider of the search service from which the document is searched and collected. | If the document is searched and collected from OpenSearch, set this field to Alibaba. If the document is searched and collected from another service provider, set this field as needed. | Yes |
| 9 | trace_info | STRING | The value of this field is the value of the ops_request_misc parameter that OpenSearch returns in the search results. Pass in the value of the ops_request_misc parameter as it is. | | Yes. Note: You must pass in this field when the value of the trace_id field is set to Alibaba. This field is used to check whether the search results are provided from OpenSearch. |
| 10 | rn | STRING | This field is used to identify a page view (PV). The value of this field is the value of the request_id parameter that OpenSearch returns in the search results. Pass in the value of the request_id parameter as it is. | | Yes |
| 11 | item_id | STRING | The primary key value of a document. The value of this field is the primary key value of the primary table in the OpenSearch application. | | Yes |
| 12 | item_type | STRING | The business type of the document. | For more information about valid values of this field, see Description of the item_type field. | Yes |
| 13 | bhv_type | STRING | The type of the behavior, such as expose, dwell, browse, add to favorites, and download. | For more information about valid values of this field, see Common behavior types. | Yes |
| 14 | bhv_value | STRING | The value used to measure the behavior, such as the dwell time and number of items that are purchased. | For more information about valid values of this field, see Common behavior types. | No |
| 15 | bhv_time | STRING | The time at which the behavior occurs. The value is a UNIX timestamp that is accurate to the second. | | Yes |
| 16 | bhv_detail | STRING | The detailed description of the behavior. | The format of this field is key=value{,key=value}. One or more key=value pairs can be used. | No |
| 17 | ip | STRING | The IP address of the mobile phone or terminal device on which the behavior occurs. | | No. However, we recommend that you set this field. |
| 18 | longitude | STRING | The longitude of the location at which the behavior occurs. | | No. However, we recommend that you set this field. |
| 19 | latitude | STRING | The latitude of the location at which the behavior occurs. | | No. However, we recommend that you set this field. |
| 20 | session_id | STRING | The ID of a user session. | | No. However, we recommend that you set this field. |
| 21 | spm | STRING | This field is used to track the page module at which the behavior occurs. | The encoding format of this field is a.b.c.d.e, which indicates the site ID, page ID, module ID, and location ID. | No |
| 22 | report_src | STRING | This field is used to identify the method that is used to upload behavioral data. | Valid values: 1, 2, and 3. A value of 1 indicates that behavioral data is uploaded by calling OpenSearch SDKs. A value of 2 indicates that behavioral data is collected by calling mobile SDKs. A value of 3 indicates that behavioral data is uploaded by calling OpenSearch API operations. | No |
| 23 | mac | STRING | The media access control (MAC) address of the mobile phone or terminal device that collects behavioral data. | | No |
| 24 | brand | STRING | The brand of the mobile phone or terminal device that collects behavioral data. | | No. However, we recommend that you set this field. |
| 25 | device_model | STRING | The model of the mobile phone or terminal device that collects behavioral data. | | No |
| 26 | resolution | STRING | The screen resolution of the mobile phone or terminal device that collects behavioral data. | | No |
| 27 | carrier | STRING | The carrier of the mobile phone or terminal device that collects behavioral data. | | No |
| 28 | access | STRING | The network connected to the mobile phone or terminal device that collects behavioral data. | | No |
| 29 | access_subtype | STRING | The type of the network connected to the mobile phone or terminal device that collects behavioral data. | | No |
| 30 | os | STRING | The operating system of the mobile phone or terminal device that collects behavioral data. | | No |
| 31 | os_version | STRING | The version of the operating system of the mobile phone or terminal device that collects behavioral data. | | No |
| 32 | language | STRING | The language set for the mobile phone or terminal device that collects behavioral data. | | No |
| 33 | phone_md5 | STRING | The MD5 hash of the mobile number of the user. | | No |
| 34 | reserve1 | STRING | A reserved field. | | No |
| 35 | reserve2 | STRING | A reserved field. | | No |
| 36 | reach_time | BIGINT | The time at which the data reaches the server. The value of this field is in the yyyyMMddHHmmss format. | | Yes. If you upload behavioral data by using OpenSearch SDKs, this field is automatically set by the SDKs. If you upload behavioral data by calling API operations, you must set this field. |
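The bhv_time and reach_time fields use different time formats: bhv_time is a UNIX timestamp in seconds that is passed as a STRING, and reach_time is a yyyyMMddHHmmss value that is passed as a BIGINT. The following Python sketch shows one way to produce both values. The helper names are hypothetical, and this topic does not state which time zone reach_time is expected in, so the sketch formats whatever time zone the datetime object carries.

```python
from datetime import datetime, timezone


def bhv_time_from(moment):
    """Return the UNIX timestamp in whole seconds, as a string."""
    return str(int(moment.timestamp()))


def reach_time_from(moment):
    """Return the yyyyMMddHHmmss representation as an integer."""
    # Assumption: the datetime is already in the time zone that the
    # service expects. This topic does not specify the time zone.
    return int(moment.strftime("%Y%m%d%H%M%S"))


moment = datetime(2021, 9, 9, 8, 30, 0, tzinfo=timezone.utc)
print(bhv_time_from(moment))    # 1631176200
print(reach_time_from(moment))  # 20210909083000
```

When you upload behavioral data by using OpenSearch SDKs, reach_time is set by the SDKs, so a helper like reach_time_from is relevant only when you call API operations.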

Description of the item_type field

| ID | item_type | Description |
|----|-----------|-------------|
| 1 | goods | Goods and commodities |
| 2 | article | Articles, blogs, and fiction |
| 3 | ask | Q&A |
| 4 | bbs | Forum posts |
| 5 | download | Downloads |
| 6 | image | Images |
| 7 | media | Multimedia such as movies, TV series, and music |
| 8 | recipe | Food and recipes |
| 9 | news | News and information |
| 10 | institution | Organizations |
| 11 | other | Others |

Common behavior types

| ID | bhv_type | Description | bhv_value | bhv_detail |
|----|----------|-------------|-----------|------------|
| 1 | expose | The behavior to expose an item | Empty | Empty |
| 2 | stay | The behavior to dwell on an item | Dwell time. Unit: seconds | Empty |
| 3 | click | The behavior to click an item | The number of clicks. Default value: 1 | Empty |
| 4 | cart | The behavior to add an item to a shopping cart, bookshelf, or playlist | Empty | Empty |
| 5 | buy | The behavior to purchase an item | The number of items that are purchased. Default value: 1 | buy_price=12,price_unit=RMB. Note: You must perform URL encoding on the values. buy_price indicates the price of the item when the order is placed. By default, the price_unit parameter is set to RMB. |
| 6 | collect | The behavior to add an item to favorites | Empty | Empty |
| 7 | like | The behavior to like an item | The number of likes. Default value: 1 | Empty |
| 8 | dislike | The behavior to dislike an item | The number of dislikes. Default value: 1 | Empty |
| 9 | comment | The behavior to comment on an item | The number of comments. Default value: 1 | Empty |
| 10 | share | The behavior to share or forward an item | The number of shares or forwards. Default value: 1 | Empty |
| 11 | subscribe | The behavior to follow or subscribe to an item | Empty | Empty |
| 12 | gift | The behavior to send gifts | Empty | Empty |
| 13 | download | The behavior to download an item | Empty | Empty |
| 14 | read | The behavior to read an item | Empty | Empty |
| 15 | tip | The behavior to reward an item | Empty | Empty |
| 16 | complain | The behavior to complain about an item | Empty | Empty |
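The bhv_detail field uses the key=value{,key=value} format, and the values must be URL-encoded. The following Python sketch shows one way to build such a string with the standard library. The helper name build_bhv_detail is an assumption made for this example and is not an OpenSearch SDK function.

```python
from urllib.parse import quote


def build_bhv_detail(pairs):
    """Join key=value pairs with commas and URL-encode each value.

    Encoding the values keeps a comma or an equal sign inside a value
    from being read as a separator.
    """
    return ",".join(
        f"{key}={quote(str(value), safe='')}" for key, value in pairs.items()
    )


# Detail string for a buy behavior, matching the example in the table.
print(build_bhv_detail({"buy_price": 12, "price_unit": "RMB"}))
# buy_price=12,price_unit=RMB
```

Only the values are encoded in this sketch because the source requires URL encoding on the values. The keys are assumed to be plain identifiers such as buy_price.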

View a data report

After you enable behavioral data collection and upload a specific amount of behavioral data, you can view the data status and quality on the data collection page.

Data status

Data status can be either Normal (Available) or Abnormal (Unavailable). Normal (Available) indicates that the behavioral data is verified and has no quality issues. Abnormal (Unavailable) indicates that the behavioral data has a quality issue.

If the data status is Abnormal (Unavailable), the creation and training of popularity models and category prediction may be affected.

Abnormal data status

Normal data status

Data quality

If the quality check on the behavioral data fails, an error message appears on the Data Verification page in the OpenSearch console. If the quality check is passed, no error message appears on the Data Verification page.

Note: The sample data that is checked in the preceding figure is the behavioral data that is synchronized to OpenSearch within an hour before a sample quality check is performed at the beginning of each hour.