Data Transmission Service: Tutorial: Connect Lark to a DTS RAGFlow knowledge base

Last Updated: Aug 12, 2025

This topic describes how to transfer data from Lark to a Data Transmission Service (DTS) RAGFlow knowledge base.

Prerequisites

You have created a RAGFlow knowledge base in DTS and configured an IP whitelist.

Background information

Supported data types

DTS RAGFlow supports connecting to Lark Docs, workbooks, Bitable, and knowledge bases in Lark.

Methods for accessing Lark

You can access Lark data using an application access credential (tenant_access_token) or a user access credential (user_access_token).

tenant_access_token (Recommended)

  Permission type: Application identity

  Pros:

    • Supports resumable transfers.

    • To transfer a single folder or knowledge base, you only need to configure access once.

  Cons: Requires more authorization steps. You must configure extra access permissions for Cloud Drive and knowledge bases.

user_access_token

  Permission type: User identity

  Pros: Requires fewer authorization steps. It can access all folders, so you do not need to configure extra access permissions for Cloud Drive and knowledge bases.

  Cons:

    • You must obtain a new Lark authorization code each time you run the KBSync program.

    • The Lark authorization code expires. You must obtain a new one after it expires.

Preparations

  1. Log on to the Lark Open Platform and go to the Developer Console.

  2. Create an application.

    Click Create Custom App, configure information such as Name and App Description, and then click Create.

  3. Click the application card to go to the application editing page.

    By default, the Basic Information > Credentials & Basic Info page appears.

  4. On the Credentials & Basic Info page, in the App Credentials section, record the App ID and App Secret.

Procedure

Note

For more information about Lark operations, see the official Lark documentation (Help Center and Developer Documentation).

Step 1: Configure access permissions

Use tenant_access_token

  1. Log on to the Lark Open Platform and go to the Developer Console.

  2. Click the application that you created in the Preparations section.

  3. Add a bot and publish the application.

    1. In the navigation pane on the left, choose App Features > Add App Features.

    2. On the Add By Feature tab, find the Bot card and click Add.

    3. At the top of the page, click Create Version.

      Note

      Alternatively, in the navigation pane on the left, choose App Release > Version Management & Release, and then click Create Version.

    4. On the Version Details page, enter the App Version and Update Notes.

      Note

      Keep the default value Bot for Default Feature On Mobile and Default Feature On Desktop.

    5. Click Save.

    6. In the dialog box that appears, click Confirm Release.

  4. Configure API permissions.

    1. In the navigation pane on the left, choose Development Configuration > Permissions & Scopes.

    2. Click Bulk Import/Export Scopes.

    3. On the Import tab, in the JSON text box, enter the following permissions for the application.

      {
        "scopes": {
          "tenant": [
            "docs:document:export",
            "drive:drive",
            "wiki:wiki"
          ],
          "user": []
        }
      }
    4. Click Next, Confirm New Scopes.

    5. Click Request To Activate.

  5. Log on to the Lark client, create a new group, and add the application that you created in the Preparations section as a Group Bot.

  6. Configure access permissions for Cloud Drive and knowledge bases.

    Configure Cloud Drive access permissions

    1. Go to the target Cloud Drive folder.

    2. On the right side of the page, click Share.

    3. For Invite Collaborators, enter the group that you created in step 5.

      The default Can View permission is sufficient.

    4. Click Send.

    Configure knowledge base access permissions

    1. Go to the All Knowledge Bases page.

    2. Hover over the target knowledge base, and then click the Knowledge Base Settings icon that appears.

    3. Click the Member Settings tab. In the Roles & Permissions section, on the Administrator tab, click Add Administrator.

    4. In the dialog box that appears, add the group that you created in step 5, and then click Next.

    5. Click Send.
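
As an optional check after the steps above, the App ID and App Secret can be exchanged for a tenant_access_token. The following is a minimal, unofficial sketch. It assumes that the endpoint that appears in the KBSync log output in Step 3 accepts a JSON body with app_id and app_secret fields, and that the response has the shape shown in that log. The function names are hypothetical and are not part of KBSync.

```python
import json
import urllib.request

# Endpoint copied from the KBSync log output shown in Step 3 of this topic.
TOKEN_URL = "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal"


def build_token_request(app_id: str, app_secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a tenant_access_token."""
    payload = json.dumps({"app_id": app_id, "app_secret": app_secret}).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=payload,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def parse_token_response(body: bytes) -> tuple:
    """Return (token, seconds until expiry), or raise if Lark reports an error."""
    data = json.loads(body)
    if data.get("code") != 0:
        raise RuntimeError(f"Lark returned code={data.get('code')}: {data.get('msg')}")
    return data["tenant_access_token"], int(data["expire"])


def fetch_tenant_access_token(app_id: str, app_secret: str) -> tuple:
    """Send the request; requires network access to the Lark Open Platform."""
    request = build_token_request(app_id, app_secret)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_token_response(response.read())
```

Calling fetch_tenant_access_token with the App ID and App Secret that you recorded in the Preparations section should return a token if the credentials are valid. A non-zero code in the response raises an error that includes the message from Lark.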

Use user_access_token

  1. Log on to the Lark Open Platform and go to the Developer Console.

  2. Click the application that you created in the Preparations section.

  3. Configure API permissions.

    1. In the navigation pane on the left, choose Development Configuration > Permissions & Scopes.

    2. Click Bulk Import/Export Scopes.

    3. On the Import tab, in the JSON text box, enter the following user identity permissions for the application that you created in the Preparations section.

      {
        "scopes": {
          "tenant": [],
          "user": [
            "offline_access",
            "docs:document:export",
            "drive:drive",
            "wiki:wiki"
          ]
        }
      }
    4. Click Next, Confirm New Scopes.

    5. Click Request To Activate.

  4. Configure the redirect URL.

    1. In the navigation pane on the left, choose Development Configuration > Security Settings.

    2. In the Redirect URLs text box, enter https://www.aliyun.com.

    3. Click Add to the right of the text box.

    4. Turn on the Refresh User_access_token switch.

      Note

      If this switch is not available, the feature is enabled by default.

  5. Obtain the authorization code.

    1. Construct the URL for the Lark authorization page.

      Note

      Replace YOUR_FEISHU_CLIENT_ID in the following URL with the App ID that you recorded in the Preparations section.

      https://accounts.feishu.cn/open-apis/authen/v1/authorize?client_id=YOUR_FEISHU_CLIENT_ID&redirect_uri=https://www.aliyun.com&scope=drive:drive offline_access docs:document:export wiki:wiki
    2. Open the authorization page in a browser.

    3. Click Authorize.

    4. Retrieve the authorization code (code) from the redirection URL.

      Note

      The authorization code does not include code=, &, or any information that follows the & symbol.
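
The URL construction and the code extraction described above can be sketched as follows. This is an illustrative helper, not part of KBSync: the function names are hypothetical, and the exact shape of the redirect URL (a code query parameter that may be followed by other parameters) is an assumption based on the note above. The sketch percent-encodes the scope list so that the URL contains no literal spaces.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Values taken from the authorization URL shown in this topic.
AUTHORIZE_URL = "https://accounts.feishu.cn/open-apis/authen/v1/authorize"
REDIRECT_URI = "https://www.aliyun.com"
SCOPES = ["drive:drive", "offline_access", "docs:document:export", "wiki:wiki"]


def build_authorize_url(app_id: str) -> str:
    """Return the authorization page URL with every parameter percent-encoded."""
    query = urlencode(
        {
            "client_id": app_id,
            "redirect_uri": REDIRECT_URI,
            "scope": " ".join(SCOPES),
        },
        quote_via=quote,  # encode spaces as %20 instead of '+'
    )
    return f"{AUTHORIZE_URL}?{query}"


def extract_authorization_code(redirect_url: str) -> str:
    """Return only the value of the code parameter from the redirect URL."""
    values = parse_qs(urlsplit(redirect_url).query).get("code")
    if not values:
        raise ValueError("no code parameter found in the redirect URL")
    return values[0]
```

Pass the App ID to build_authorize_url, open the result in a browser, and then pass the URL from the address bar after authorization to extract_authorization_code. The returned value is what the feishuUserAccessCode parameter expects.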

Step 2: Get the token of the Lark Doc folder and the ID of the knowledge base

  1. Log on to the Lark client.

  2. Obtain the token of the Lark Doc folder.

    1. Go to the target folder.

    2. Copy the URL of the folder from the address bar.

    3. Extract the token from the URL. The token is the string of characters after folder/.

      Note

      The token does not include the ? symbol or the information that follows it.

  3. Obtain the ID of the knowledge base.

    1. Go to the All Knowledge Bases page.

    2. Hover over the target knowledge base, and then click the Knowledge Base Settings icon that appears.

    3. Copy the URL of the target knowledge base from the address bar.

    4. Extract the ID of the knowledge base from the URL. The ID is the string of characters after settings/.

      Note

      The knowledge base ID contains only digits and does not include the # symbol or the information that follows it.
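
The two extraction rules above can be sketched as follows. The function names and the sample URLs are hypothetical and for illustration only; the sketch assumes that the token and the ID appear in the path of the URL, before any ? or # symbol.

```python
import re
from urllib.parse import urlsplit


def extract_folder_token(folder_url: str) -> str:
    """Return the path segment after 'folder/' (query string and fragment excluded)."""
    match = re.search(r"/folder/([^/]+)", urlsplit(folder_url).path)
    if match is None:
        raise ValueError("no folder/ segment found in the URL")
    return match.group(1)


def extract_wiki_space_id(settings_url: str) -> str:
    """Return the digits after 'settings/' (fragment excluded)."""
    match = re.search(r"/settings/(\d+)", urlsplit(settings_url).path)
    if match is None:
        raise ValueError("no numeric settings/ segment found in the URL")
    return match.group(1)


# Hypothetical URLs, used only to show the expected input shape.
folder_token = extract_folder_token("https://example.feishu.cn/drive/folder/AbCd1234?from=space")
wiki_space_id = extract_wiki_space_id("https://example.feishu.cn/wiki/settings/1234567890#members")
```

The first value is what the feishuCloudSpaceDirToken parameter expects, and the second is what the feishuWikiSpaceId parameter expects.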

Step 3: Run the KBSync program

  1. Obtain the KBSync file.

    Note

    You can join the DingTalk group (ID: 79690034672) and contact the helpdesk to obtain the KBSync file.

  2. Prepare the runtime environment for the KBSync program.

    Note

    The KBSync program must run in a Linux environment that can access the Lark Open Platform and RAGFlow.

  3. Prepare the config configuration file.

    1. In the Linux environment, create a file named config.

    2. Copy the following code to the config file.

      whiteList=
      blackList=
      sinkType=RagFlow
      sourceType=FeiShu
      ragflowUrl=http://XX.XX.XX.XX
      ragflowApiKey=Bearer RAGFlow-BmND******MDI0Mm
      ragflowDatasetId=928d061******2ac120006
      feishuAppId=cli_a8a******d00d
      feishuAppSecret=pMp73Si******UDrWXBSOa
      feishuUserAccessCode=bGzpx6******B9KFCdzdCDHG
      feishuCloudSpaceDirToken=ESJm*******CRdn002cii3bnAc
      feishuWikiSpaceId=7504968******8674
    3. Replace the parameters in the config file.

      Important
      • For parameters that you do not need to configure, leave their values empty.

      • If you set both feishuCloudSpaceDirToken and feishuWikiSpaceId, only feishuCloudSpaceDirToken takes effect. In this case, only the specified folder and the Lark Docs in it are transferred.

      • The blackList parameter has a higher priority than the whiteList parameter.

      whiteList, blackList
        Required: No
        Description: The paths to transfer (whiteList) and the paths to exclude (blackList). These include folder paths in Lark Docs and document paths in the knowledge base. Regular expressions are supported. Separate multiple paths with spaces.
        How to obtain: Obtain these from the Lark client.

      sinkType
        Required: Yes
        Description: The type of the sink.
        How to obtain: Keep the value as RagFlow.

      sourceType
        Required: Yes
        Description: The type of the source.
        How to obtain: Keep the value as FeiShu.

      ragflowUrl
        Required: Yes
        Description: The RAGFlow address (API Server).
        How to obtain: See "Get the API endpoint of the RAGFlow knowledge base" in the Appendix.

      ragflowApiKey
        Required: Yes
        Description: The API key of the RAGFlow knowledge base.
        How to obtain: See "Get the API key of the RAGFlow knowledge base" in the Appendix.

      ragflowDatasetId
        Required: Yes
        Description: The ID of the RAGFlow knowledge base.
        How to obtain: See "Get the ID of the RAGFlow knowledge base" in the Appendix.

      feishuAppId
        Required: Yes
        Description: The ID of the application in Lark (App ID).
        How to obtain: Use the App ID that you recorded in the Preparations section.

      feishuAppSecret
        Required: Yes
        Description: The secret of the application in Lark (App Secret).
        How to obtain: Use the App Secret that you recorded in the Preparations section.

      feishuUserAccessCode
        Required: No. This parameter is required only when you use the user_access_token method to access Lark data.
        Description: The Lark authorization code.
        How to obtain: See "Obtain the authorization code" in Step 1.

      feishuCloudSpaceDirToken
        Required: No. You only need to set either this parameter or feishuWikiSpaceId.
        Description: The token of the folder that contains the Lark Docs.
        How to obtain: See Step 2.

      feishuWikiSpaceId
        Required: No. You only need to set either this parameter or feishuCloudSpaceDirToken.
        Description: The ID of the Lark knowledge base.
        How to obtain: See Step 2.

  4. Place the KBSync file and the config configuration file in the same folder in the Linux environment.

  5. In the Linux environment, run the ./KBSync --config config command to start the KBSync program.

    If the output is similar to the following, the KBSync program is running correctly.

    ./KBSync --config config
    
    INFO config whiteList=, blackList=
    INFO config ragflowUrl=http://XX.XX.XX.XX/, ragflowApiKey=Bearer RAGFlow-BmND******MDI0Mm
    INFO config ragflowDatasetId=928d061******2ac120006
    INFO config feishuAppId=cli_a8a******d00d, feishuAppSecret=pMp73Si******UDrWXBSOa
    INFO Response from https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal: 200, headers: {'Server': 'Tengine', 'Content-Type': 'application/json', 'Content-Length': '102', 'Connection': 'keep-alive', 'Date': 'Tue, 08 Jul 2025 02:49:01 GMT', 'Request-Id': '25bf****-d386-4a86-****-f440f070****', 'Tt_st****': '1', 'X-Lgw-Dst-Svc': 'jbpiSR****OiA0J3d****-Oz0xugYAH9otZIFg4x****', 'X-Request-Id': '25bf****-d386-4a86-b9f4-f440f070****', 'X-Tt-Logid': '202507081049012933B870245850D****', 'server-timing': 'inner; dur=73, cdn-cache;desc=MISS,edge;dur=0,origin;dur=129', 'x-tt-trace-host': '****', 'x-tt-trace-tag': '****', 'x-tt-trace-id': '00-****', 'X-Timestamp': '175194****.952', 'Via': 'cache8.cn6540[129,0]', 'Timing-Allow-Origin': '*', 'EagleId': '6ae3651c1751942941849****'}, body: b'{"code":0,"expire":4340,"msg":"ok","tenant_access_token":"t-g10478a*******CSC3YVY"}'
    INFO set feishu tenant access token expires in: 4340
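
Before you start KBSync, you may want to check the config file against the parameter rules in this step. The following is a minimal sketch, not part of KBSync. It assumes that the file consists of plain key=value lines, as in the sample above. The function names are hypothetical, and treating a config with neither source set as a problem is an interpretation of the parameter descriptions, not documented KBSync behavior.

```python
# Parameters marked as required in the parameter descriptions in this topic.
REQUIRED_KEYS = (
    "sinkType",
    "sourceType",
    "ragflowUrl",
    "ragflowApiKey",
    "ragflowDatasetId",
    "feishuAppId",
    "feishuAppSecret",
)


def parse_config(text: str) -> dict:
    """Read key=value lines; a value may itself contain '=' characters."""
    config = {}
    for raw_line in text.splitlines():
        key, separator, value = raw_line.strip().partition("=")
        if separator:  # skip blank lines and lines without '='
            config[key.strip()] = value.strip()
    return config


def check_config(config: dict) -> list:
    """Return human-readable problems; an empty list means none were found."""
    problems = [
        f"missing required parameter: {key}"
        for key in REQUIRED_KEYS
        if not config.get(key)
    ]
    folder = config.get("feishuCloudSpaceDirToken", "")
    wiki = config.get("feishuWikiSpaceId", "")
    if folder and wiki:
        problems.append("both sources are set; only feishuCloudSpaceDirToken takes effect")
    elif not folder and not wiki:
        problems.append("set feishuCloudSpaceDirToken or feishuWikiSpaceId")
    return problems
```

To use the sketch, read the config file as text, pass the text to parse_config, and then pass the result to check_config. Each returned string describes one setting to review.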

Appendix

Get the API endpoint of the RAGFlow knowledge base

  1. Log on to the RAGFlow page.

  2. In the navigation pane on the left, click API.

  3. Copy the API Server value.

Get the API key of the RAGFlow knowledge base

  1. Log on to the RAGFlow page.

  2. In the navigation pane on the left, click API.

  3. On the right side of RAGFlow API, click API KEY.

  4. In the API KEY dialog box, click Create New Key.

  5. Copy the generated key and record it.

Get the ID of the RAGFlow knowledge base

  1. Log on to the RAGFlow page.

  2. On the Knowledge Base page, click the target knowledge base.

  3. In the URL of the current page, record the ID of the knowledge base.

    Note

    The information after id= is the ID of the knowledge base.
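
The rule in the note above can be sketched as follows. The function name and the sample URL are hypothetical; the sketch assumes that id is an ordinary query parameter in the page URL.

```python
from urllib.parse import parse_qs, urlsplit


def extract_dataset_id(page_url: str) -> str:
    """Return the value of the id query parameter from the knowledge base page URL."""
    values = parse_qs(urlsplit(page_url).query).get("id")
    if not values:
        raise ValueError("no id parameter found in the URL")
    return values[0]


# Hypothetical URL, used only to show the expected input shape.
dataset_id = extract_dataset_id("http://ragflow.example/knowledge/dataset?id=0123abcd")
```

The returned value is what the ragflowDatasetId parameter expects.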