Alibaba Cloud Model Studio: Data import

Last Updated: Dec 04, 2025

Before you can build a knowledge base, you must import your knowledge data into Alibaba Cloud Model Studio. This data serves as the initial source for your knowledge base.

User guide

Import local files

  1. Go to the File tab.

  2. In the Category Management section on the left, select an existing category or click the icon to create one.

    Alibaba Cloud Model Studio uses categories to manage imported files.
  3. Click Import Data. On the Import Data page, set the import method to Upload Local File.

    The platform does not currently support the direct import of JSON, CSV, or YAML files. You must convert these files to the XLSX or XLS format before importing them.
  4. Review the parser. It is set to Intelligent Document Parsing by default and cannot be changed. To improve parsing results, use Data Parsing Settings to configure parsing rules for different formats.

    Data parsing settings

    You can configure the parsing policy according to your requirements. If you are unsure, you can keep the default settings.

    • Digital Parsing: Does not parse illustrations or charts in files.

    • Intelligent Parsing: For illustrations in a file, the parser detects and extracts text from the images and generates text summaries. These summaries, along with other non-image content, are chunked, converted into vectors, and used for knowledge base retrieval.

    • LLM Parsing: Agent applications that use the Qwen-VL model can answer questions about illustrations and charts in files. To enable the model to detect and understand this content, select LLM Parsing.

    • Qwen VL Parsing: Supports only image formats. You can select a Qwen-VL model and use a prompt to specify the layout, elements, and content for the model to detect. Other features are the same as those of LLM Parsing.

    How to make an Alibaba Cloud Model Studio application properly display illustrations from a file in its answers

  5. (Optional) Configure Tags for the file.

    When calling an application using an API, you can specify tags in the tags request parameter. The application then filters relevant files by these tags when retrieving information from the knowledge base to improve retrieval efficiency. For agent applications, you can set tags when you debug the knowledge base in the console.
  6. Click Confirm to begin parsing and importing the data. You can view the task progress on the page.

    The file is converted into a format that Model Studio can process. This process may take several hours during peak business hours. Please wait for the process to complete.
  7. After the import is complete, click Details next to the file to view it.

    After a file is imported into Model Studio, it is stored as an independent replica in the free storage space provided by the platform. This replica is not associated with the original raw data, and no capacity limit is imposed.
    You can view only files that were imported within the last 90 days. After this period, the imported files cannot be viewed, but they are not deleted.
    Imported files can be used only by users in the current workspace. Model Studio does not use them for any commercial purposes or make them public.
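
The tag filtering described in the optional tagging step can be pictured with a small local simulation. This is only a conceptual sketch and does not call Model Studio: the ImportedFile class and filter_by_tags function are hypothetical names, and the any-of matching rule is an assumption, because this topic does not state whether a file must carry one or all of the requested tags.

```python
from dataclasses import dataclass, field


@dataclass
class ImportedFile:
    """Hypothetical local stand-in for a file stored in a knowledge base."""
    name: str
    tags: set[str] = field(default_factory=set)


def filter_by_tags(files: list[ImportedFile], request_tags: list[str]) -> list[ImportedFile]:
    """Keep files that share at least one tag with the request.

    Assumption: any-of matching. If the request carries no usable tags,
    no filtering is applied and every file stays a retrieval candidate.
    """
    wanted = {t.strip() for t in request_tags if t.strip()}
    if not wanted:
        return list(files)
    return [f for f in files if f.tags & wanted]


files = [
    ImportedFile("pricing.pdf", {"sales", "2025"}),
    ImportedFile("onboarding.docx", {"hr"}),
    ImportedFile("faq.md"),
]

matched = filter_by_tags(files, ["sales"])
print([f.name for f in matched])  # ['pricing.pdf']
```

Narrowing the candidate set before retrieval is what makes tags useful for efficiency: fewer files have to be searched for each request.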

Import local tables

  1. Go to the Table tab.

  2. In the Table Management section on the left, select an existing data table or click the icon to create a new one.

    Alibaba Cloud Model Studio manages imported data using data tables.

    Import to a new data table

    1. Enter a Table Name. Then, configure the table schema by selecting either Upload Excel File or Custom Header.

      • Upload Excel: Model Studio automatically detects the table header in the uploaded file, uses the header to create the data table schema, and imports the remaining content as data records into the table.

      • Custom Header: The Column Name and Type fields are required. The Description field is optional.

        Important
        • The structure of the data table, including the column name, description, and type, cannot be modified after it is confirmed.

        • The schema of the uploaded file, including the number of columns and column names, must exactly match the schema of the target data table. Otherwise, the import fails. For example, if the data table to be imported has two columns, you must configure two fields for the table schema with identical column names. You can add or remove fields by clicking New Columns or Delete in the Actions column.

        • To help the model understand the meaning of each field, provide a clear, natural-language description in the "Description" field. For example, you can specify that the age field represents a user's age.

        • If you set the field type to image_url, ensure the value is a publicly accessible image URL. The knowledge base retrieves the image from this URL to generate a vector index, which is used for scenarios such as search by image.

          Example image_url format: https://example.com/downloads/pic.jpg
          When you create a knowledge base, fields of the image_url type are used to generate an image index. Model Studio accesses the target image, extracts its features, converts the features into a vector using image embedding, and then saves the vector. During knowledge base retrieval, this vector is compared with the vector of the user-uploaded image for similarity.
    2. Click the icon to select and upload a file (XLSX or XLS format).

      The file must contain a table header. Otherwise, the import fails.
      The platform does not currently support the direct import of JSON, CSV, or YAML files. You must convert these files to the XLSX or XLS format before importing them.
    3. Click OK to start the import. The new data table will then appear in the Table Management navigation tree on the left.
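
Before you upload a table that contains image_url fields, it may help to check the values locally. The following is a minimal sketch, assuming that a valid value is an http or https URL whose path ends in a common image extension. The extension list and the function name looks_like_image_url are assumptions, not platform rules, and the check cannot confirm that a URL is publicly accessible.

```python
from urllib.parse import urlparse

# Assumed extension list; this topic only shows a .jpg example.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp")


def looks_like_image_url(value: str) -> bool:
    """Return True if the value has the shape of a web-hosted image URL.

    This is a format check only. It does not fetch the URL, so it cannot
    tell whether the image is actually reachable from the public internet.
    """
    parsed = urlparse(value.strip())
    if parsed.scheme not in ("http", "https"):
        return False
    if not parsed.netloc:
        return False
    return parsed.path.lower().endswith(IMAGE_EXTENSIONS)


print(looks_like_image_url("https://example.com/downloads/pic.jpg"))  # True
print(looks_like_image_url("/local/pic.jpg"))  # False
```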

    Import to an existing data table

    1. From the Table Management list on the left, select the data table and click Import Data.

    2. Set the import type to Upload and Overwrite or Incremental Upload.

      Click Download Template to download a blank file that contains only the table header. You can insert new data into this file and then use it for an overwrite or incremental upload.
    3. Click the icon to select and upload a file (XLSX or XLS format).

      The file must contain a table header that matches the structure of the header of the current data table. Otherwise, the import fails.
      The platform does not currently support the direct import of JSON, CSV, or YAML files. You must convert these files to the XLSX or XLS format before importing them.
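
Because a header mismatch causes the import to fail, you may want to compare headers before uploading. The sketch below assumes you have already read the first row of the file into a list of strings. The function name header_problems is hypothetical, and the comparison rules (case-sensitive names, column order ignored) are assumptions, because this topic does not say whether order matters.

```python
def header_problems(file_header: list[str], table_columns: list[str]) -> list[str]:
    """Describe how a file header differs from the target table schema.

    Returns an empty list when the column count and column names match.
    Assumptions: names are compared case-sensitively and order is ignored.
    """
    problems = []
    if len(file_header) != len(table_columns):
        problems.append(
            f"column count differs: file has {len(file_header)}, "
            f"table has {len(table_columns)}"
        )
    missing = [c for c in table_columns if c not in file_header]
    extra = [c for c in file_header if c not in table_columns]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    if extra:
        problems.append("unexpected columns: " + ", ".join(extra))
    return problems


print(header_problems(["name", "age"], ["name", "age"]))  # []
print(header_problems(["name", "years"], ["name", "age"]))
# ['missing columns: age', 'unexpected columns: years']
```

Starting from the file produced by Download Template avoids most of these mismatches, because the template already carries the header of the current data table.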

Import OSS files

  1. Go to the File tab.

  2. In the Category Management section on the left, select an existing category or click the icon to create a new one.

    Alibaba Cloud Model Studio organizes imported files into categories.
  3. Click Import Data to open the Import Data page. Set the import method to OSS.

    When you import data from OSS to Alibaba Cloud Model Studio for the first time, you must complete the authorization as prompted and add the bailian-datahub-access tag to the target bucket. For more information, see Configure file import from OSS.
    Buckets of the Archive, Cold Archive, or Deep Cold Archive storage class are not supported.
    Accessing files in the root directory of a bucket is not supported. Select an existing subdirectory or create a new one for Model Studio to access.
    Buckets with content encryption are supported. Private buckets are supported.
    If you want to use a bucket that has Referer hotlink protection enabled, you must add the domain name *.console.aliyun.com to the Referer whitelist. For more information, see Allow access only from trusted websites.
  4. The parser is set to Intelligent Document Parsing by default and cannot be changed. You can configure parsing rules for different formats in Data Parsing Settings to improve the parsing results.

    Data parsing settings

    Configure the parsing policy according to your requirements. If you are unsure, you can keep the default settings.

    • Digital Parsing: Does not parse illustrations or charts in files.

    • Intelligent Parsing: For illustrations in a file, the parser detects and extracts text from the images and generates text summaries. These summaries, along with other non-image content, are chunked, converted into vectors, and used for knowledge base retrieval.

    • LLM Parsing: Agent applications that use the Qwen-VL model can answer questions about illustrations and charts in files. To enable the model to detect and understand this visual content, select LLM Parsing.

    • Qwen VL Parsing: Supports only image formats. You can select a Qwen-VL model and use a prompt to specify the layout, elements, and content for the model to detect. Other features are the same as those of LLM Parsing.

    How to make a Model Studio application display illustrations from a file in its answers

  5. (Optional) Configure Tags for the file.

    When calling an application using an API, you can specify tags in the tags request parameter. The application then filters relevant files by these tags when retrieving information from the knowledge base to improve retrieval efficiency. For agent applications, you can set tags when you debug the knowledge base in the console.
  6. Click OK. The system then begins to parse and import the data. You can monitor the task progress on the page.

    The file is converted into a format that Model Studio can process. This process may take several hours during peak business hours. Please wait for the process to complete.
  7. After the import is complete, click Details next to the file to view the results.

    After a file is imported into Model Studio, it is stored as an independent replica in the free storage space provided by the platform. This replica is not associated with the original raw data, and no capacity limit is imposed.
    Imported files can be used only by users in the current workspace. Model Studio does not use them for any commercial purposes or make them public.
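
The OSS restrictions listed in step 3 can be collected into a local pre-flight check. This is a sketch under stated assumptions: the BucketInfo class and oss_import_problems function are hypothetical, the bucket metadata is supplied by you rather than read from OSS, and the storage class identifiers are assumed spellings of the Archive, Cold Archive, and Deep Cold Archive classes.

```python
from dataclasses import dataclass, field

# Assumed identifiers for the storage classes that cannot be imported.
UNSUPPORTED_STORAGE_CLASSES = {"Archive", "ColdArchive", "DeepColdArchive"}
REQUIRED_TAG_KEY = "bailian-datahub-access"


@dataclass
class BucketInfo:
    """Hypothetical summary of the bucket properties relevant to import."""
    storage_class: str
    tags: dict[str, str] = field(default_factory=dict)


def oss_import_problems(bucket: BucketInfo, object_key: str) -> list[str]:
    """List the documented restrictions that the bucket or object violates."""
    problems = []
    if bucket.storage_class in UNSUPPORTED_STORAGE_CLASSES:
        problems.append(f"storage class {bucket.storage_class} is not supported")
    if REQUIRED_TAG_KEY not in bucket.tags:
        problems.append(f"bucket is missing the {REQUIRED_TAG_KEY} tag")
    # A key without a directory separator sits in the bucket root.
    if "/" not in object_key.strip("/"):
        problems.append("object is in the bucket root; move it into a subdirectory")
    return problems


bucket = BucketInfo("Standard", {"bailian-datahub-access": "read"})
print(oss_import_problems(bucket, "docs/report.pdf"))  # []
print(oss_import_problems(bucket, "report.pdf"))
# ['object is in the bucket root; move it into a subdirectory']
```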

Next step

Create a knowledge base

More information

Configure file import from OSS

The first time you import files from OSS, you need to grant Model Studio access to your OSS resources. The authorization process is different for an Alibaba Cloud account and a RAM user.

Alibaba Cloud account authorization

  1. Click Authorize Now.

  2. In the dialog box that appears, click Confirm Authorization. This automatically creates an OSS service-linked role that grants Alibaba Cloud Model Studio access to your OSS resources.

    The authorization usually takes effect within seconds, but a slight delay may occur during peak hours.
    If the error 'This request failed. Try submitting again or contact an administrator. Error code: 10041495' appears, see the FAQ entry for error code 10041495 at the end of this topic.

  3. Add the bailian-datahub-access tag to the target OSS bucket.

    This tag marks the buckets that Model Studio can access. Model Studio cannot access buckets that do not have this tag.
    1. Log on to the OSS console. In the navigation pane on the left, click Buckets. Then, find the target bucket.

    2. Hover over its icon and click Edit.

    3. On the Bucket Tag page, if no tags are set, click Create Tag. Otherwise, click Settings.

    4. Click Tag, set the tag key to bailian-datahub-access and the tag value to read, and then click Save.

  4. Return to the Import Data page, reselect the target bucket, and retry the import.

    Note that Model Studio does not support accessing files in the root directory of a bucket. Select an existing subdirectory or create a new one for Model Studio to access.
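
If you manage bucket tags with your own tooling instead of the console, make sure that adding the access tag does not drop tags the bucket already has. The sketch below works on plain dictionaries and does not call OSS. The function names are hypothetical, and checking for the value read is an assumption based on the console steps above.

```python
REQUIRED_TAG_KEY = "bailian-datahub-access"
REQUIRED_TAG_VALUE = "read"


def with_access_tag(existing: dict[str, str]) -> dict[str, str]:
    """Return a copy of the tag set that also contains the access tag.

    Existing tags are preserved, which matters if your tooling replaces
    the whole tag set of a bucket in one operation.
    """
    merged = dict(existing)
    merged[REQUIRED_TAG_KEY] = REQUIRED_TAG_VALUE
    return merged


def has_access_tag(tags: dict[str, str]) -> bool:
    """Assumption: the tag counts only when its value is 'read'."""
    return tags.get(REQUIRED_TAG_KEY) == REQUIRED_TAG_VALUE


current = {"team": "search"}
updated = with_access_tag(current)
print(updated)  # {'team': 'search', 'bailian-datahub-access': 'read'}
print(has_access_tag(current), has_access_tag(updated))  # False True
```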

RAM user authorization

  1. Click Authorize Now.

  2. In the dialog box, click Confirm Authorization. If you receive an Authorization Failed or No Permission error, you must first grant the RAM user permission to create service-linked roles.

    1. Log on to the RAM console. In the navigation pane on the left, select Permissions > Policies, and then click Create Policy.

    2. Click the JSON tab. Copy and paste the following policy, and then click OK.

      {
          "Action": [
              "ram:CreateServiceLinkedRole"
          ],
          "Resource": "*",
          "Effect": "Allow",
          "Condition": {
              "StringEquals": {
                  "ram:ServiceName": "datahub.sfm.aliyuncs.com"
              }
          }
      }

    3. Enter a policy name and click OK.

    4. In the navigation pane on the left, choose Identities > Users. On the page, find the RAM user that you want to authorize, and then in the Actions column, click Add Permissions.

    5. In the access policy list, select the custom policy you just created and click Grant permissions. The RAM user now has permission to create service-linked roles.

  3. Grant the RAM user permission to access OSS through Model Studio.

    1. Return to the Import Data page and click Authorize Now.

    2. In the dialog box that appears, click Confirm Authorization to automatically create the required OSS service-linked role.

      The authorization usually takes effect within seconds, but a slight delay may occur during peak hours.
      If the error "The request failed. Try to submit again or contact an administrator. Error code: 10041495" appears, see the FAQ entry for error code 10041495 at the end of this topic.

  4. Add the bailian-datahub-access tag to the target OSS bucket.

    This tag marks the buckets that Model Studio can access. Model Studio cannot access buckets that do not have this tag.
    1. Log on to the OSS console. In the navigation pane on the left, click Buckets. Then, find the target bucket.

    2. Hover over its icon and click Edit.

    3. On the Bucket Tag page, click Create Tag if no tags are set. Otherwise, click Settings.

    4. Click Tag, set the tag key to bailian-datahub-access and the tag value to read, and then click Save.

  5. Return to the Import Data page, reselect the target bucket, and retry the import.

    Note that Model Studio does not support accessing files in the root directory of a bucket. Select an existing subdirectory or create a new one for Model Studio to access.
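
The policy in step 2 of this procedure is shown as a single statement. If the JSON editor you paste it into expects a complete policy document, the statement may need to be wrapped first. The sketch below assumes that a complete RAM policy document consists of a "Version" field set to "1" and a "Statement" list. Verify this against the RAM policy reference before relying on it. The function name as_policy_document is hypothetical.

```python
import json

# The statement exactly as given in this topic.
statement = {
    "Action": ["ram:CreateServiceLinkedRole"],
    "Resource": "*",
    "Effect": "Allow",
    "Condition": {
        "StringEquals": {"ram:ServiceName": "datahub.sfm.aliyuncs.com"}
    },
}


def as_policy_document(statements: list[dict]) -> dict:
    """Wrap statements in a policy document.

    Assumption: the document layout is a "Version" of "1" plus a
    "Statement" list. This is not confirmed by this topic.
    """
    return {"Version": "1", "Statement": list(statements)}


policy = as_policy_document([statement])
print(json.dumps(policy, indent=4))
```

The Condition block limits the permission to creating the service-linked role for the datahub.sfm.aliyuncs.com service, so the RAM user cannot create service-linked roles for other services.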

Quotas and limits

For more information about supported data formats and capacity, see Knowledge base quotas and limits.

FAQ

Permissions and security

  • When I import data, the error message "You are not authorized to access this module" appears. What do I do?

    By default, a RAM user cannot perform write operations such as importing data or creating a knowledge base. To enable these operations, an Alibaba Cloud account must grant the RAM user the page permissions for Administrator, or at least permissions that include Application Data - Operations and Knowledge Base - Operations.

Importing OSS files

  • What should I do if error code "10041495" is returned?

    This error usually occurs because the Alibaba Cloud account has not activated OSS. To resolve this issue, perform the following steps:

    1. Log on to the OSS console using the Alibaba Cloud account and activate OSS as prompted.

    2. Return to the Model Studio Import Data page and retry the authorization.