Create a data source

Last Updated: May 31, 2021

Artificial Intelligence Recommendation (AIRec) uses only MaxCompute (formerly known as ODPS) as its full data source. Before you synchronize full data to AIRec, you must upload the data to a DataWorks workspace.

If you have already created a MaxCompute project and uploaded the data to it, skip this step.

DataWorks is a PaaS-layer service of Alibaba Cloud that is built on MaxCompute. For more information, see "What is DataWorks?" This topic describes how to create a DataWorks workspace, which serves as the data source for the full data that you want to upload to AIRec.

Procedure

1. Log on to the DataWorks console

(1) Log on to the Alibaba Cloud Management Console. Enter DataWorks in the search box and click DataWorks in the Console Entry section to go to the DataWorks console.

DataWorks is in public preview and free of charge.

2. Create a workspace in DataWorks

When you create a workspace in DataWorks, a MaxCompute project is actually created. AIRec uses the MaxCompute project only as the data source for starting an AIRec instance. This project can be frozen or deleted if AIRec has steady input data in subsequent phases.

(1) Select a region and create a workspace in the region

Select a region, such as Singapore, and activate DataWorks in that region. Then, create a workspace in the region based on your volume of full data.

(2) Configure the workspace

a. Specify Workspace Name. The workspace name is also used as the name of the MaxCompute project. You must provide this name to AIRec.

b. Specify Mode. We recommend that you set the value to Basic Mode (Production Environment Only). If you use this mode, you can execute SQL statements to query and modify the data in the MaxCompute project on the DataStudio page of the DataWorks console.

c. Retain the default settings for the other parameters.
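Because Basic Mode lets you run SQL statements against the MaxCompute project on the DataStudio page, a quick row count is one way to confirm that a table was uploaded. The following is a minimal sketch that only builds the SQL text to paste into DataStudio. The helper name, the identifier check, and the names airec_demo and user_table are assumptions made for illustration; they are not part of DataWorks or AIRec.

```python
import re

# Assumed identifier rule for this sketch: a letter followed by letters,
# digits, or underscores. Adjust it if your project uses other names.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def build_row_count_query(project: str, table: str) -> str:
    """Return a SQL statement that counts the rows of project.table."""
    for name in (project, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    # The workspace name doubles as the MaxCompute project name.
    return f"SELECT COUNT(*) AS row_count FROM {project}.{table};"


if __name__ == "__main__":
    # "airec_demo" and "user_table" are placeholder names.
    print(build_row_count_query("airec_demo", "user_table"))
```

Rejecting unexpected characters before formatting keeps the generated statement limited to a plain table reference.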

(3) Select a computing engine

Select MaxCompute as the computing engine. AIRec uses only the storage feature of MaxCompute, so you do not need to enable other MaxCompute features.

If you do not have other dependencies on MaxCompute, select the pay-as-you-go billing method. If you are using MaxCompute for the first time, you must click Buy Now to purchase it before you can select the pay-as-you-go billing method.

After AIRec reads the full data and the AIRec instance starts, you can freeze or delete the MaxCompute project.
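The rule above can be read as a small state check: the MaxCompute project is needed only until the AIRec instance has finished reading the full data and started. The sketch below models that reasoning. The state names and the function are hypothetical and do not correspond to any AIRec API.

```python
from enum import Enum


class InstanceState(Enum):
    """Hypothetical phases of an AIRec instance, for illustration only."""
    NOT_STARTED = "not_started"
    READING_FULL_DATA = "reading_full_data"
    RUNNING = "running"


def project_still_required(state: InstanceState) -> bool:
    """Return True while the MaxCompute project must be kept available."""
    # Full data is read only during startup, so a running instance
    # no longer depends on the project.
    return state is not InstanceState.RUNNING


if __name__ == "__main__":
    for state in InstanceState:
        print(state.value, project_still_required(state))
```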

(4) Configure the computing engine

[Screenshot: Engine Details]

(5) Complete the configuration

a. You can find the newly created workspace on the Workspaces page.

b. AIRec uses MaxCompute only to store data. The name of the DataWorks workspace is the same as that of the MaxCompute project, so you need to provide AIRec with only the workspace name and the names of the three tables.

c. AIRec reads the full data in the MaxCompute project only when the AIRec instance is being started. Subsequent incremental data is not stored in the project. When you add, delete, modify, or query data in the project, the services provided by AIRec are not affected.
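The information that you hand over to AIRec is small: one workspace name, which is also the MaxCompute project name, and three table names. The following is a minimal sketch of how that information could be recorded and checked before you enter it. The class name, field names, and sample table names are placeholders chosen for illustration; this topic does not name the three tables.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FullDataSource:
    """Hypothetical record of what AIRec needs to locate the full data."""
    workspace_name: str
    table_names: Tuple[str, str, str]

    def __post_init__(self) -> None:
        if not self.workspace_name:
            raise ValueError("workspace name must not be empty")
        if len(self.table_names) != 3:
            raise ValueError("exactly three table names are expected")
        if not all(self.table_names) or len(set(self.table_names)) != 3:
            raise ValueError("table names must be non-empty and distinct")

    @property
    def maxcompute_project(self) -> str:
        # The workspace and the MaxCompute project share one name.
        return self.workspace_name


if __name__ == "__main__":
    # All names below are placeholders.
    source = FullDataSource("airec_demo", ("table_one", "table_two", "table_three"))
    print(source.maxcompute_project, source.table_names)
```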