
Tutorial: Connect OSS to a DTS RAGFlow knowledge base

Last Updated: Mar 28, 2026

KBSync is a command-line tool that syncs files from an Alibaba Cloud Object Storage Service (OSS) bucket into a Data Transmission Service (DTS) RAGFlow knowledge base. Run KBSync from a Linux host with network access to both OSS and RAGFlow, point it at a config file, and it transfers the files into the specified knowledge base dataset.

Supported file types

KBSync can sync the following file types:

  • Documents: DOC, DOCX, PPT, PPTX, YML, XML, HTML, JSON, CSV, TXT, XLS, XLSX, WPS, RTF, MD, SQL

  • Images: JPG, JPEG, PNG

  • Other: INI, MP3
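
The list above can be turned into a pre-flight check that flags objects KBSync is unlikely to sync. The following is a minimal sketch, not part of KBSync: it assumes that file types are recognized by extension and case-insensitively (this tutorial does not say how KBSync detects types), and is_supported is a hypothetical helper name.

```python
from pathlib import PurePosixPath

# Extensions copied from the supported file types listed above.
SUPPORTED_EXTENSIONS = {
    # Documents
    "doc", "docx", "ppt", "pptx", "yml", "xml", "html", "json",
    "csv", "txt", "xls", "xlsx", "wps", "rtf", "md", "sql",
    # Images
    "jpg", "jpeg", "png",
    # Other
    "ini", "mp3",
}


def is_supported(object_key: str) -> bool:
    """Return True if the OSS object key ends in a listed extension.

    Assumption: matching is by extension and case-insensitive.
    """
    suffix = PurePosixPath(object_key).suffix  # ".DOCX", or "" if none
    return suffix[1:].lower() in SUPPORTED_EXTENSIONS
```

For example, is_supported("docs/report.DOCX") is true under these assumptions, while a key ending in .pdf is rejected because PDF is not in the list.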

Prerequisites

Before you begin, make sure you have:

  • A RAGFlow knowledge base created in DTS, with an IP whitelist configured

  • An OSS bucket containing the files to sync

  • A Linux host that can reach both OSS and RAGFlow over the network

  • The KBSync binary (see step 1 of the procedure below)

Gather required values

Before configuring KBSync, collect the following values. Each is required in the config file.

OSS credentials and bucket details

  1. Create an AccessKey pair and record the AccessKey ID and AccessKey secret.

    If you use an AccessKey pair from a Resource Access Management (RAM) user, grant that RAM user either the AliyunOSSReadOnlyAccess (read-only) or AliyunOSSFullAccess (management) permission for OSS.
  2. Record your OSS bucket name and region ID:

    1. Log on to the OSS console.

    2. In the navigation pane, click Buckets.

    3. Find the target bucket and record its Bucket Name.

    4. Note the Region, then find the corresponding region ID (for example, cn-beijing).

RAGFlow connection details

Collect the following three values from the RAGFlow page. To log on, follow the steps in Log on to the RAGFlow page.

API endpoint (ragflowUrl)

  1. In the navigation pane, click API.

  2. Copy the API Server value.

API key (ragflowApiKey)

  1. In the navigation pane, click API.

  2. Next to RAGFlow API, click API KEY.

  3. In the API KEY dialog box, click Create New Key.

  4. Click the copy icon to record the token.

    Important

    The API key must start with Bearer followed by a space, for example: Bearer ragflow-RhMjc0NjFhNTZmNTExZjBiYWY****.

Knowledge base ID (ragflowDatasetId)

  1. On the Knowledge Base page, click the target knowledge base.

  2. In the URL, record the value after id=. That value is the knowledge base ID.
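
If you prefer to extract the ID programmatically, the following sketch reads the id parameter from a copied URL. It assumes the ID appears as a regular query-string parameter; the URL path in the usage note is a made-up illustration, and dataset_id_from_url is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlparse


def dataset_id_from_url(page_url: str) -> str:
    """Return the value of the id query parameter of a knowledge base URL."""
    query = parse_qs(urlparse(page_url).query)
    values = query.get("id")
    if not values:
        raise ValueError("no id= parameter found in the URL")
    return values[0]
```

Under that assumption, dataset_id_from_url("http://192.0.2.10/knowledge/dataset?id=b2abcd1234ef") returns the string after id=, which is the value to use for ragflowDatasetId.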

Sync OSS files to RAGFlow

Step 1: Get the KBSync binary

Join the DingTalk group (ID: 79690034672) and contact the helpdesk to get the KBSync binary.

Step 2: Create the config file

  1. On your Linux host, create a file named config.

  2. Copy the following template into the file:

    whiteList=
    blackList=
    sinkType=RagFlow
    sourceType=OSS
    
    ragflowUrl=http://XX.XX.XX.XX
    ragflowApiKey=Bearer ragflow-Rh******
    ragflowDatasetId=******
    
    sourceOSSAccessKeyId=******
    sourceOSSAccessKeySecret=******
    sourceOSSRegion=cn-beijing
    sourceOSSBucket=kbsync
  3. Replace the placeholder values with the values you collected in the previous section:

    Important

    Leave optional parameters blank rather than removing them. blackList takes precedence over whiteList when both are set.

    | Parameter | Required | Description | Example |
    | --- | --- | --- | --- |
    | whiteList | No | Space-separated paths of OSS files or folders to include. Supports regular expressions. Leave blank to include everything. | docs/ reports/2024 |
    | blackList | No | Space-separated paths of OSS files or folders to exclude. Supports regular expressions. Takes precedence over whiteList. | drafts/ *.tmp |
    | sinkType | Yes | Must be RagFlow. | RagFlow |
    | sourceType | Yes | Must be OSS. | OSS |
    | ragflowUrl | Yes | The RAGFlow API Server endpoint. | http://192.0.2.10 |
    | ragflowApiKey | Yes | The RAGFlow API key. Must start with Bearer followed by a space. | Bearer ragflow-Rh**** |
    | ragflowDatasetId | Yes | The ID of the target knowledge base. | b2abcd1234ef |
    | sourceOSSAccessKeyId | Yes | Your AccessKey ID. | LTAI5tXxx |
    | sourceOSSAccessKeySecret | Yes | Your AccessKey secret. | xXxXxXx |
    | sourceOSSRegion | Yes | The region ID of your OSS bucket. | cn-beijing |
    | sourceOSSBucket | Yes | The name of your OSS bucket. | my-bucket |
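
Before running KBSync, it can help to check the config file against the rules in this step. The following is a minimal sketch, not KBSync's own parser: it assumes the file contains only key=value lines and blank lines, and parse_config and find_problems are hypothetical helper names.

```python
# Parameters marked as required in this step.
REQUIRED_KEYS = [
    "sinkType", "sourceType",
    "ragflowUrl", "ragflowApiKey", "ragflowDatasetId",
    "sourceOSSAccessKeyId", "sourceOSSAccessKeySecret",
    "sourceOSSRegion", "sourceOSSBucket",
]
# Optional parameters that should stay in the file even when blank.
OPTIONAL_KEYS = ["whiteList", "blackList"]
# Parameters that accept exactly one value.
FIXED_VALUES = {"sinkType": "RagFlow", "sourceType": "OSS"}


def parse_config(text: str) -> dict[str, str]:
    """Parse key=value lines into a dict; blank lines are skipped."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")  # split on the first "=" only
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        config[key.strip()] = value.strip()
    return config


def find_problems(config: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for key in REQUIRED_KEYS:
        if not config.get(key):
            problems.append(f"{key} is required but missing or blank")
    for key in OPTIONAL_KEYS:
        if key not in config:
            problems.append(f"{key} should be left blank, not removed")
    for key, expected in FIXED_VALUES.items():
        if config.get(key) and config[key] != expected:
            problems.append(f"{key} must be {expected}")
    api_key = config.get("ragflowApiKey", "")
    if api_key and not api_key.startswith("Bearer "):
        problems.append("ragflowApiKey must start with 'Bearer '")
    return problems
```

A typical use is to read the file with open("config").read(), pass the text to parse_config, and print whatever find_problems returns. The sketch checks only what this tutorial states; it does not verify that the credentials or IDs are valid.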

Step 3: Run KBSync

  1. Place the KBSync binary and the config file in the same directory.

  2. Run the following command:

    ./KBSync --config config

Verify the sync

If KBSync starts successfully, the output looks similar to the following:

INFO config SourceType=OSS, SinkType=RagFlow
INFO config whiteList=, blackList=
INFO config ragflowUrl=http://XX.XX.XX.XX ragflowApiKey=Bearer ragflow-Rh******
INFO config ragflowDatasetId=b2******
INFO config sourceOssKeyId=******, sourceOssRegion=cn-beijing
INFO Verifying RAGFlow connection...
INFO Attempting to list datasets to validate the connection...
INFO Successfully found matching dataset: Name='test', ID='b2******'
INFO RAGFlow connection verified successfully.

The key indicator is the line RAGFlow connection verified successfully. Once you see it, KBSync is connected and syncing files from your OSS bucket to the knowledge base.
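
If you run KBSync from a script, the same check can be automated by scanning captured output for the success line. The following is a small sketch; connection_verified is a hypothetical helper, and whether KBSync writes its log to standard output or standard error is not stated here, so capture both if unsure.

```python
# The success line shown in the sample output above.
SUCCESS_MARKER = "RAGFlow connection verified successfully."


def connection_verified(output: str) -> bool:
    """Return True if any line of the captured output contains the marker."""
    return any(SUCCESS_MARKER in line for line in output.splitlines())
```

The function only reports that the connection check passed. It says nothing about whether individual files were later transferred.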

Troubleshooting

KBSync cannot connect to RAGFlow

Check that ragflowUrl points to the correct API Server address and that the Linux host can reach that address over the network. Verify that the RAGFlow instance's IP whitelist includes the host's IP address.
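
To separate network problems from KBSync problems, you can query RAGFlow directly from the same Linux host. The sketch below assumes that the DTS RAGFlow instance exposes the list-datasets endpoint of the open-source RAGFlow HTTP API (GET /api/v1/datasets) and that a response body with code equal to 0 means success. Both are assumptions that this tutorial does not confirm, and the function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_list_datasets_request(ragflow_url, api_key, dataset_id):
    """Build a GET request that asks RAGFlow for one dataset by ID.

    Assumption: the endpoint path is /api/v1/datasets.
    api_key must already include the "Bearer " prefix, as in the config file.
    """
    query = urlencode({"id": dataset_id})
    endpoint = ragflow_url.rstrip("/") + "/api/v1/datasets?" + query
    return urllib.request.Request(
        endpoint, headers={"Authorization": api_key}, method="GET"
    )


def check_ragflow(ragflow_url, api_key, dataset_id, timeout=10.0):
    """Return (ok, detail). Network and parsing errors are reported, not raised."""
    request = build_list_datasets_request(ragflow_url, api_key, dataset_id)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError) as exc:
        # OSError covers URLError, HTTPError, and timeouts;
        # ValueError covers invalid JSON or text encoding.
        return False, f"request failed: {exc}"
    # Assumption: code == 0 in the JSON body signals success.
    if isinstance(body, dict) and body.get("code") == 0:
        return True, "dataset is reachable"
    return False, f"unexpected response: {body}"
```

Calling check_ragflow with the ragflowUrl, ragflowApiKey, and ragflowDatasetId values from the config file returns a pair such as (False, "request failed: ...") when the host cannot reach the address, which points to a routing or IP whitelist issue rather than a KBSync setting.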

Authentication fails

Confirm that ragflowApiKey starts with Bearer followed by a space and that the token has not expired. Create a new API key if needed.

OSS access is denied

Verify that the AccessKey ID and AccessKey secret are correct. If you are using a RAM user's AccessKey pair, confirm the RAM user has AliyunOSSReadOnlyAccess or AliyunOSSFullAccess on the bucket.

Files are not being synced

If you set whiteList, check that the paths match the actual OSS object paths. If blackList is also set, it takes precedence: a path matched by blackList is excluded even if it also matches whiteList.
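
The precedence rule can be modeled in a few lines to test a path against your settings before a run. This is a sketch of the documented behavior, not KBSync's implementation: it assumes each space-separated entry is a regular expression that matches when found anywhere in the object path, and should_sync is a hypothetical helper name.

```python
import re


def compile_patterns(setting: str) -> list[re.Pattern]:
    """Split a space-separated setting and compile each entry as a regex."""
    return [re.compile(entry) for entry in setting.split()]


def should_sync(object_key: str, white_list: str, black_list: str) -> bool:
    """Apply the documented rules: blackList wins, blank whiteList includes all.

    Assumption: an entry matches if it is found anywhere in the key
    (re.search); KBSync's real matching rule may differ.
    """
    # A blackList match excludes the object regardless of whiteList.
    if any(p.search(object_key) for p in compile_patterns(black_list)):
        return False
    allowed = compile_patterns(white_list)
    # An empty whiteList includes everything that was not excluded.
    return not allowed or any(p.search(object_key) for p in allowed)
```

For example, should_sync("docs/drafts/a.md", "docs/", "drafts/") is false because the blackList entry matches. Note that in Python an entry such as *.tmp is a glob rather than a valid regular expression and makes re.compile raise an error, so a pattern like \.tmp$ is used when testing with this sketch.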
