This topic describes how to prepare data by uploading it in the integrated development environment (IDE).
A project is created. For more information, see Create a project.
- Data stored in MaxCompute is used by general-purpose algorithm components.
Note: If the data is smaller than 20 MB, we recommend that you upload it in the IDE. If the data is larger than 20 MB, we recommend that you use the command-line tool to upload it. For more information, see Tunnel command usage.
- Structured or unstructured data stored in OSS is used by algorithm components of deep learning.
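The size recommendation in the note can be sketched as a small helper. This is an illustrative sketch only: the function names and the return labels `"ide"` and `"tunnel"` are hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and because the note does not say what to do at exactly 20 MB, the sketch assumes the command-line tool in that case.

```python
import os

# Threshold from the note above. Assumption: 1 MB = 1024 * 1024 bytes.
IDE_UPLOAD_LIMIT_BYTES = 20 * 1024 * 1024


def recommend_upload_method(size_bytes: int) -> str:
    """Return a hypothetical label for the recommended upload method."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    # Assumption: a file of exactly 20 MB is treated like a larger file.
    return "ide" if size_bytes < IDE_UPLOAD_LIMIT_BYTES else "tunnel"


def recommend_for_file(path: str) -> str:
    """Apply the same rule to an on-premises file."""
    return recommend_upload_method(os.path.getsize(path))
```

For example, `recommend_upload_method(5 * 1024 * 1024)` returns `"ide"`, which corresponds to the IDE upload procedure described in this topic.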
- Go to a Machine Learning Studio project.
  - Log on to the PAI console.
  - In the left-side navigation pane, choose .
  - In the upper-left corner of the page, select the region that you want.
  - Optional: In the search box on the PAI Visualization Modeling page, enter the name of a project to search for the project.
  - Find the project that you want and click Machine Learning in the Operation column.
- Upload data.
  - In the left-side navigation pane, click Data Source.
  - In the lower-left corner, click Create Table.
  - In the Create Table dialog box, set the Table name and Lifecycle (Days) parameters.
  - In the Schema section, click the icon. Then, enter a name in the Column Name column and specify the data type in the Type column.
  - Click Next.
  - Click Select File and follow the instructions to upload on-premises files.
  - Click OK.
- Create an experiment.
  - In the left-side navigation pane, click Home.
  - In the upper-right corner, click New and select New Experiment.
  - In the New Experiment dialog box, set the Name parameter and click OK.
- Configure the data source.
  - In the left-side navigation pane, click Components.
  - In the Components pane, click Data Source/Target. Then, drag the Read MaxCompute Table component to the canvas.
  - Click the Read MaxCompute Table component on the canvas. On the Select Table tab on the right side, enter the name of the created table in the Table Name field.
  - Click the Fields Information tab to view the columns, data types, and value ranges of the first 100 rows of the table.
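To make the parameters of the Create Table dialog box more concrete, the following sketch builds a MaxCompute `CREATE TABLE ... LIFECYCLE` statement from a table name, a list of columns, and a lifecycle in days. It is an illustration of how these settings relate to each other, not a description of what the console does internally. The function name, the identifier check, and the sample table `demo_table` are hypothetical.

```python
import re
from typing import List, Tuple

# Assumption: simple identifiers made of letters, digits, and underscores.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_create_table_sql(
    table_name: str,
    columns: List[Tuple[str, str]],
    lifecycle_days: int,
) -> str:
    """Build a CREATE TABLE statement from (column name, type) pairs."""
    if not _IDENTIFIER.match(table_name):
        raise ValueError(f"invalid table name: {table_name!r}")
    if not columns:
        raise ValueError("at least one column is required")
    if lifecycle_days <= 0:
        raise ValueError("lifecycle_days must be a positive integer")

    definitions = []
    for column_name, column_type in columns:
        if not _IDENTIFIER.match(column_name):
            raise ValueError(f"invalid column name: {column_name!r}")
        # Types are passed through in uppercase, for example BIGINT or STRING.
        definitions.append(f"{column_name} {column_type.upper()}")

    column_list = ", ".join(definitions)
    return (
        f"CREATE TABLE IF NOT EXISTS {table_name} "
        f"({column_list}) LIFECYCLE {lifecycle_days};"
    )
```

For example, `build_create_table_sql("demo_table", [("age", "bigint"), ("name", "string")], 30)` produces a statement for a two-column table whose data is retained for 30 days. The name passed as `table_name` is the same name that you enter in the Table Name field of the Read MaxCompute Table component.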
What to do next
After data preparation is complete, preprocess the data. For more information, see Preprocess data.