This topic describes how to transfer data from Lark to a Data Transmission Service (DTS) RAGFlow knowledge base.
Prerequisites
You have created a RAGFlow knowledge base in DTS and configured an IP whitelist.
Background information
Supported data types
DTS RAGFlow supports connecting to Lark Docs, workbooks, Bitable, and knowledge bases in Lark.
Methods for accessing Lark
You can access Lark data using an application access credential (tenant_access_token) or a user access credential (user_access_token).
Method | Permission type | Pros | Cons |
tenant_access_token (Recommended) | Application identity | | Requires more steps for authorization. You need to configure extra access permissions for Cloud Drive and knowledge bases. |
user_access_token | User identity | Requires fewer steps for authorization. It has access permissions to all folders, so you do not need to configure extra access permissions for Cloud Drive and knowledge bases. | |
Preparations
Log on to the Lark Open Platform and go to the Developer Console.
Click Create Custom App, configure information such as Name and App Description, and then click Create.
Click the application card to go to the application editing page.
By default, the Credentials & Basic Info page appears.
On the Credentials & Basic Info page, in the App Credentials section, record the App ID and App Secret.
Procedure
For more information about Lark operations, see the official Lark documentation (Help Center and Developer Documentation).
Step 1: Configure access permissions
Use tenant_access_token
Log on to the Lark Open Platform and go to the Developer Console.
Click the application that you created in the Preparations section.
Add a bot and publish the application.
In the navigation pane on the left, choose .
On the Add By Feature tab, find the Bot card and click Add.
At the top of the page, click Create Version.
Note: Alternatively, in the navigation pane on the left, choose , and then click Create Version.
On the Version Details page, enter the App Version and Update Notes.
Note: Keep the default value Bot for Default Feature On Mobile and Default Feature On Desktop.
Click Save.
In the dialog box that appears, click Confirm Release.
Configure API permissions.
In the navigation pane on the left, choose .
Click Bulk Import/Export Scopes.
On the Import tab, in the JSON text box, enter the following permissions for the application.
{
  "scopes": {
    "tenant": [
      "docs:document:export",
      "drive:drive",
      "wiki:wiki"
    ],
    "user": []
  }
}
Click Next, and then click Confirm New Scopes.
Click Request To Activate.
Log on to the Lark client, create a new group, and add the application that you created in the Preparations section as a Group Bot.
Configure access permissions for Cloud Drive and knowledge bases.
Configure Cloud Drive access permissions
Go to the target Cloud Drive folder.
On the right side of the page, click Share.
For Invite Collaborators, enter the group that you created in Step 5.
The default Can View permission is sufficient.
Click Send.
Configure knowledge base access permissions
Go to the All Knowledge Bases page.
Hover over the target knowledge base, and then click the Knowledge Base Settings icon that appears.
Click the Member Settings tab. In the Roles & Permissions section, on the Administrator tab, click Add Administrator.
In the dialog box that appears, add the group that you created in Step 5, and then click Next.
Click Send.
Use user_access_token
Log on to the Lark Open Platform and go to the Developer Console.
Click the application that you created in the Preparations section.
Configure API permissions.
In the navigation pane on the left, choose .
Click Bulk Import/Export Scopes.
Configure the user identity permissions for the application created in the Preparations section.
{
  "scopes": {
    "tenant": [],
    "user": [
      "offline_access",
      "docs:document:export",
      "drive:drive",
      "wiki:wiki"
    ]
  }
}
Click Next, and then click Confirm New Scopes.
Click Request To Activate.
Configure the redirect URL.
In the navigation pane on the left, choose .
In the Redirect URLs text box, enter https://www.aliyun.com, and then click Add to the right of the text box.
Turn on the Refresh User_access_token switch.
Note: If this switch is not available, the feature is enabled by default.
Obtain the authorization code.
Construct the URL for the Lark authorization page.
Note: Replace YOUR_FEISHU_CLIENT_ID in the following URL with the App ID that you recorded in the Preparations section.
https://accounts.feishu.cn/open-apis/authen/v1/authorize?client_id=YOUR_FEISHU_CLIENT_ID&redirect_uri=https://www.aliyun.com&scope=drive:drive offline_access docs:document:export wiki:wiki
Open the authorization page in a browser.
Click Authorize.
Retrieve the authorization code (code) from the redirection URL.
Note: The authorization code does not include code=, &, or any information that follows the & symbol.
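The URL construction and code extraction above can be sketched in Python. This is a minimal illustration, not part of KBSync: the function names are hypothetical, and percent-encoding the query parameters is a choice made here (the URL in this topic is shown unencoded, and browsers typically encode it when the page is opened).

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Endpoint, redirect URI, and scopes are taken from this topic.
AUTHORIZE_ENDPOINT = "https://accounts.feishu.cn/open-apis/authen/v1/authorize"
REDIRECT_URI = "https://www.aliyun.com"
SCOPES = ["drive:drive", "offline_access", "docs:document:export", "wiki:wiki"]


def build_authorize_url(app_id: str) -> str:
    """Build the authorization page URL for the given App ID (hypothetical helper)."""
    query = urlencode(
        {
            "client_id": app_id,
            "redirect_uri": REDIRECT_URI,
            "scope": " ".join(SCOPES),  # scopes are space-separated
        },
        quote_via=quote,  # encode spaces as %20 rather than +
    )
    return f"{AUTHORIZE_ENDPOINT}?{query}"


def extract_code(redirect_url: str) -> str:
    """Return only the value of the code parameter, without 'code=' or anything after '&'."""
    values = parse_qs(urlsplit(redirect_url).query).get("code")
    if not values:
        raise ValueError("the redirection URL has no code parameter")
    return values[0]
```

Calling `build_authorize_url` with the recorded App ID yields the URL to open in a browser. After you click Authorize, pass the full address-bar URL to `extract_code` to obtain the value for feishuUserAccessCode.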
Step 2: Get the token of the Lark Doc folder and the ID of the knowledge base
Log on to the Lark client.
Obtain the token of the Lark Doc folder.
Go to the target folder.
Copy the URL of the folder from the address bar.
Extract the token from the URL. The token is the string of characters after folder/.
Note: The token does not include the ? symbol or the information that follows it.
Obtain the ID of the knowledge base.
Go to the All Knowledge Bases page.
Hover over the target knowledge base, and then click the Knowledge Base Settings icon that appears.
Copy the URL of the target knowledge base from the address bar.
Extract the ID of the knowledge base from the URL. The ID is the string of characters after settings/.
Note: The knowledge base ID contains only digits and does not include the # symbol or the information that follows it.
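The two extraction rules in this step can be expressed as a short Python sketch. The helper names are hypothetical, and the only assumption about the URL layout is what this step states: the folder token follows folder/ and the knowledge base ID is the run of digits after settings/.

```python
import re
from urllib.parse import urlsplit


def extract_folder_token(folder_url: str) -> str:
    """Return the characters after 'folder/', excluding '?' and what follows it."""
    path = urlsplit(folder_url).path  # urlsplit drops the query string and fragment
    match = re.search(r"folder/([^/]+)", path)
    if match is None:
        raise ValueError("no folder/ segment found in the URL")
    return match.group(1)


def extract_wiki_space_id(settings_url: str) -> str:
    """Return the digits after 'settings/', excluding '#' and what follows it."""
    path = urlsplit(settings_url).path
    match = re.search(r"settings/(\d+)", path)
    if match is None:
        raise ValueError("no numeric ID found after settings/ in the URL")
    return match.group(1)
```

The returned values correspond to the feishuCloudSpaceDirToken and feishuWikiSpaceId parameters used in Step 3.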
Step 3: Run the KBSync program
Obtain the KBSync file.
Note: You can join the DingTalk group (ID: 79690034672) and contact the helpdesk to obtain the KBSync file.
Prepare the runtime environment for the KBSync program.
Note: The KBSync program must run in a Linux environment that can access the Lark Open Platform and RAGFlow.
Prepare the config configuration file.
In the Linux environment, create a file named config.
Copy the following code to the config file.
whiteList=
blackList=
sinkType=RagFlow
sourceType=FeiShu
ragflowUrl=http://XX.XX.XX.XX
ragflowApiKey=Bearer RAGFlow-BmND******MDI0Mm
ragflowDatasetId=928d061******2ac120006
feishuAppId=cli_a8a******d00d
feishuAppSecret=pMp73Si******UDrWXBSOa
feishuUserAccessCode=bGzpx6******B9KFCdzdCDHG
feishuCloudSpaceDirToken=ESJm*******CRdn002cii3bnAc
feishuWikiSpaceId=7504968******8674
Replace the parameters in the config file.
Important:
For parameters that you do not need to configure, leave their values empty.
If you pass parameters for both feishuCloudSpaceDirToken and feishuWikiSpaceId, only the Lark Docs and their parent folder are transferred (only the feishuCloudSpaceDirToken parameter takes effect).
The blackList parameter has a higher priority than the whiteList parameter.
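Parameter | Required | Description | How to obtain |
whiteList | No | The paths to transfer. These include folder paths in Lark Docs and document paths in the knowledge base. Regular expressions are supported. Separate multiple paths with spaces. | Obtain these from the Lark client. |
blackList | No | The paths not to transfer. These include folder paths in Lark Docs and document paths in the knowledge base. Regular expressions are supported. Separate multiple paths with spaces. | Obtain these from the Lark client. |
sinkType | Yes | The type of the sink. | Keep the value as RagFlow. |
sourceType | Yes | The type of the source. | Keep the value as Feishu. |
ragflowUrl | Yes | The RAGFlow address (API Server). | See "Get the API endpoint of the RAGFlow knowledge base" in the Appendix. |
ragflowApiKey | Yes | The API key of the RAGFlow knowledge base. | See "Get the API key of the RAGFlow knowledge base" in the Appendix. |
ragflowDatasetId | Yes | The ID of the RAGFlow knowledge base. | See "Get the ID of the RAGFlow knowledge base" in the Appendix. |
feishuAppId | Yes | The ID of the application in Lark (App ID). | Recorded in the Preparations section. |
feishuAppSecret | Yes | The secret of the application in Lark (App Secret). | Recorded in the Preparations section. |
feishuUserAccessCode | No. Required only when you use the user_access_token method to access Lark data. | The Lark authorization code. | See "Obtain the authorization code" in Step 1. |
feishuCloudSpaceDirToken | No. You only need to set one of feishuCloudSpaceDirToken and feishuWikiSpaceId. | The token of the folder that contains the Lark Docs. | See Step 2: Get the token of the Lark Doc folder and the ID of the knowledge base. |
feishuWikiSpaceId | No. You only need to set one of feishuCloudSpaceDirToken and feishuWikiSpaceId. | The ID of the Lark knowledge base. | See Step 2: Get the token of the Lark Doc folder and the ID of the knowledge base. |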
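Before you start the program, the rules described for the config file can be checked with a small script. The following Python sketch is a hypothetical pre-flight check and does not reflect how KBSync itself reads the file. In particular, `is_selected` only illustrates the stated precedence (blackList over whiteList). It assumes that an empty whiteList allows every path and that a pattern matches if it is found anywhere in the path. Neither assumption is confirmed by this topic.

```python
import re

# Parameters marked as required in the parameter table.
REQUIRED_KEYS = (
    "sinkType", "sourceType", "ragflowUrl", "ragflowApiKey",
    "ragflowDatasetId", "feishuAppId", "feishuAppSecret",
)


def parse_config(text):
    """Parse key=value lines into a dict; lines without '=' are skipped."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")  # split on the first '=' only
        if sep and key:
            config[key.strip()] = value.strip()
    return config


def check_config(config):
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = [
        f"missing required parameter: {key}"
        for key in REQUIRED_KEYS
        if not config.get(key)
    ]
    has_dir = bool(config.get("feishuCloudSpaceDirToken"))
    has_wiki = bool(config.get("feishuWikiSpaceId"))
    if has_dir and has_wiki:
        findings.append(
            "both feishuCloudSpaceDirToken and feishuWikiSpaceId are set; "
            "only feishuCloudSpaceDirToken takes effect"
        )
    elif not has_dir and not has_wiki:
        findings.append("neither feishuCloudSpaceDirToken nor feishuWikiSpaceId is set")
    return findings


def is_selected(path, config):
    """Illustrative filter: a blackList match always excludes the path."""
    black = config.get("blackList", "").split()  # patterns are space-separated
    white = config.get("whiteList", "").split()
    if any(re.search(pattern, path) for pattern in black):
        return False
    # ASSUMPTION: an empty whiteList means every path is allowed.
    return not white or any(re.search(pattern, path) for pattern in white)
```

For example, `check_config(parse_config(open("config").read()))` returns an empty list when all required parameters are set and exactly one of the two Lark source parameters is filled in.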
Place the KBSync file and the config configuration file in the same folder in the Linux environment.
In the Linux environment, run the ./KBSync --config config command to start the KBSync program.
If the output is similar to the following, the KBSync program is running correctly.
./KBSync --config config
INFO config whiteList=, blackList=
INFO config ragflowUrl=http://XX.XX.XX.XX/, ragflowApiKey=Bearer RAGFlow-BmND******MDI0Mm
INFO config ragflowDatasetId=928d061******2ac120006
INFO config feishuAppId=cli_a8a******d00d, feishuAppSecret=pMp73Si******UDrWXBSOa
INFO Response from https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal: 200, headers: {'Server': 'Tengine', 'Content-Type': 'application/json', 'Content-Length': '102', 'Connection': 'keep-alive', 'Date': 'Tue, 08 Jul 2025 02:49:01 GMT', 'Request-Id': '25bf****-d386-4a86-****-f440f070****', 'Tt_st****': '1', 'X-Lgw-Dst-Svc': 'jbpiSR****OiA0J3d****-Oz0xugYAH9otZIFg4x****', 'X-Request-Id': '25bf****-d386-4a86-b9f4-f440f070****', 'X-Tt-Logid': '202507081049012933B870245850D****', 'server-timing': 'inner; dur=73, cdn-cache;desc=MISS,edge;dur=0,origin;dur=129', 'x-tt-trace-host': '****', 'x-tt-trace-tag': '****', 'x-tt-trace-id': '00-****', 'X-Timestamp': '175194****.952', 'Via': 'cache8.cn6540[129,0]', 'Timing-Allow-Origin': '*', 'EagleId': '6ae3651c1751942941849****'}, body: b'{"code":0,"expire":4340,"msg":"ok","tenant_access_token":"t-g10478a*******CSC3YVY"}'
INFO set feishu tenant access token expires in: 4340
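If the program does not print a successful token response, the same credential check can be reproduced outside KBSync. The following Python sketch calls the endpoint that appears in the log and reads the response fields shown there (code, msg, expire, tenant_access_token). The function names are hypothetical, and the request body fields app_id and app_secret follow the Lark Open Platform API for custom apps rather than anything stated in this topic.

```python
import json
import urllib.request

# Endpoint as it appears in the KBSync log output.
TOKEN_URL = "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal"


def build_token_request(app_id, app_secret):
    """Build the POST request that exchanges App ID and App Secret for a token."""
    body = json.dumps({"app_id": app_id, "app_secret": app_secret}).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def parse_token_response(raw):
    """Return (token, seconds until expiry); raise if the API reports an error."""
    payload = json.loads(raw)
    if payload.get("code") != 0:
        raise RuntimeError(f"token request failed: {payload.get('msg')}")
    return payload["tenant_access_token"], payload["expire"]


def fetch_tenant_access_token(app_id, app_secret, timeout=10.0):
    """Send the request over the network and parse the response."""
    request = build_token_request(app_id, app_secret)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_token_response(response.read())
```

A non-zero code in the response usually points to a wrong App ID or App Secret. A connection error suggests that the Linux environment cannot reach the Lark Open Platform.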
Appendix
Get the API endpoint of the RAGFlow knowledge base
In the navigation pane on the left, click API.
Copy the API Server value.
Get the API key of the RAGFlow knowledge base
In the navigation pane on the left, click API.
On the right side of RAGFlow API, click API KEY.
In the API KEY dialog box, click Create New Key.
Click
to record the token.
Get the ID of the RAGFlow knowledge base
On the Knowledge Base page, click the target knowledge base.
In the URL of the current page, record the ID of the knowledge base.
Note: The information after id= is the ID of the knowledge base.
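The three values collected in this appendix can be cross-checked with a short script. The following Python sketch extracts the ID after id= and builds a request that looks up that knowledge base. It assumes that the DTS-hosted instance exposes the dataset listing endpoint of the open-source RAGFlow HTTP API (GET /api/v1/datasets), which this topic does not confirm. The function names are hypothetical.

```python
import re
import urllib.request
from urllib.parse import urlencode


def extract_dataset_id(page_url):
    """Return the value after 'id=' in the knowledge base page URL."""
    match = re.search(r"[?&]id=([^&#]+)", page_url)
    if match is None:
        raise ValueError("no id= parameter found in the URL")
    return match.group(1)


def build_list_datasets_request(ragflow_url, api_key, dataset_id):
    """Build a GET request for the dataset with the given ID.

    ASSUMPTION: the endpoint path follows the open-source RAGFlow HTTP API.
    api_key is passed as written in the config file, including the 'Bearer ' prefix.
    """
    base = ragflow_url.rstrip("/")  # tolerate a trailing slash in ragflowUrl
    url = f"{base}/api/v1/datasets?{urlencode({'id': dataset_id})}"
    return urllib.request.Request(url, headers={"Authorization": api_key}, method="GET")
```

Sending the request with `urllib.request.urlopen` from the Linux environment also checks that the environment can reach the RAGFlow address.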