Use the KBSync tool to sync content from Lark — including Lark Docs, workbooks, Bitable, and knowledge bases — into a Data Transmission Service (DTS) RAGFlow knowledge base, making your Lark content searchable through RAGFlow.
Prerequisites
Before you begin, make sure you have:
A RAGFlow knowledge base created in DTS
An IP whitelist configured for the RAGFlow knowledge base
Choose an access method
KBSync connects to Lark using one of two credential types. Choose based on your sync pattern:
| Method | Permission type | Pros | Cons |
|---|---|---|---|
| tenant_access_token (recommended) | Application identity | Supports resumable transfers. For a single folder or knowledge base, configure once and reuse. | Requires extra steps to grant Cloud Drive and knowledge base permissions. |
| user_access_token | User identity | Fewer setup steps: has access to all folders with no extra permission grants needed. | Must get a new authorization code each time KBSync runs. The code expires and cannot be reused. |
Which method should I use? Use tenant_access_token for recurring or automated syncs — you configure permissions once and reuse the credentials indefinitely. Use user_access_token for one-time or ad hoc syncs where setup simplicity matters more than reusability.
Step 1: Create a Lark application
Create a custom application in the Lark Open Platform to get the App ID and App Secret that KBSync needs to authenticate with Lark.
For more information about Lark platform operations, see the Lark Help Center and Lark Developer Documentation.
Log in to the Lark Open Platform and go to the Developer Console.
Click Create Custom App, fill in Name and App Description, and then click Create.
Click the application card to open it. The Basic Information > Credentials & Basic Info page appears by default.
In the App Credentials section, record the App ID and App Secret. You will need these values when configuring KBSync.
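To illustrate how the App ID and App Secret are used, the following sketch builds the request that exchanges them for a tenant_access_token. The endpoint is the one that appears in the KBSync startup log in Step 4. The body field names app_id and app_secret are an assumption based on the Lark Open Platform API and should be checked against the Lark Developer Documentation. The function name build_token_request is hypothetical and not part of KBSync.

```python
import json
import urllib.request

# Endpoint as it appears in the KBSync startup log in Step 4.
TOKEN_URL = "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal"


def build_token_request(app_id: str, app_secret: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a tenant_access_token.

    Assumption: the JSON body fields are named app_id and app_secret.
    """
    body = json.dumps({"app_id": app_id, "app_secret": app_secret}).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen from a host that can reach the Lark Open Platform sends the request. A rejected request at this point usually indicates a mistyped App ID or App Secret.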
Step 2: Configure access permissions
The configuration steps differ depending on which access method you chose.
Use tenant_access_token (recommended)
Add a bot and publish the application
KBSync uses a bot to access Lark data on behalf of the application. You must add a bot and publish the application to activate it before you can configure permissions.
Log in to the Lark Open Platform and open the application you created.
In the left navigation pane, choose App Features > Add App Features.
On the Add By Feature tab, find the Bot card and click Add.
At the top of the page, click Create Version.
Alternatively, choose App Release > Version Management & Release in the left navigation pane, and then click Create Version.
On the Version Details page, enter App Version and Update Notes, then click Save.
Keep the default Bot value for Default Feature On Mobile and Default Feature On Desktop.
In the confirmation dialog, click Confirm Release.
Configure API permissions
Grant the application the scopes it needs to read Lark Docs and knowledge bases. Without these scopes, KBSync cannot export or access your content.
In the left navigation pane, choose Development Configuration > Permissions & Scopes.
Click Bulk Import/Export Scopes.
On the Import tab, paste the following JSON into the JSON text box:
{
  "scopes": {
    "tenant": ["docs:document:export", "drive:drive", "wiki:wiki"],
    "user": []
  }
}
Click Next, Confirm New Scopes, then click Request To Activate.
Create a group and add the bot
Before you can grant Cloud Drive and knowledge base access to the application, you must create a Lark group and add the application as a Group Bot.
Log in to the Lark client.
Create a new group.
Add the application you created in Step 1 as a Group Bot in the group.
Grant access to Cloud Drive and knowledge bases
Because tenant_access_token uses application identity, the bot must be explicitly granted access to the specific folders and knowledge bases it will sync. This is a one-time setup per folder or knowledge base.
Grant Cloud Drive access:
In the Lark client, go to the target Cloud Drive folder.
Click Share on the right side of the page.
Under Invite Collaborators, enter the group that contains the bot. The default Can View permission is sufficient.
Click Send.
Grant knowledge base access:
Go to the All Knowledge Bases page.
Hover over the target knowledge base, then click the Knowledge Base Settings icon.
Click the Member Settings tab. In the Roles & Permissions section, on the Administrator tab, click Add Administrator.
Add the group that contains the bot, then click Next and Send.
Use user_access_token
Configure API permissions
Grant the application the user-level scopes it needs to access Lark content on behalf of the user.
Log in to the Lark Open Platform and open the application you created.
In the left navigation pane, choose Development Configuration > Permissions & Scopes.
Click Bulk Import/Export Scopes.
On the Import tab, paste the following JSON into the JSON text box:
{
  "scopes": {
    "tenant": [],
    "user": ["offline_access", "docs:document:export", "drive:drive", "wiki:wiki"]
  }
}
Click Next, Confirm New Scopes, then click Request To Activate.
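The import payloads for the two access methods differ only in which identity carries the scopes and in the extra offline_access scope for user identity. The following sketch generates either payload so that it can be pasted into the JSON text box. The scope names are copied from this topic. The function name scopes_payload is hypothetical.

```python
import json

# Scope names copied from the import JSON in this topic.
BASE_SCOPES = ["docs:document:export", "drive:drive", "wiki:wiki"]


def scopes_payload(method: str) -> str:
    """Return the bulk-import JSON for the chosen access method."""
    if method == "tenant_access_token":
        scopes = {"tenant": list(BASE_SCOPES), "user": []}
    elif method == "user_access_token":
        # User identity additionally requests offline_access.
        scopes = {"tenant": [], "user": ["offline_access", *BASE_SCOPES]}
    else:
        raise ValueError(f"unknown access method: {method}")
    return json.dumps({"scopes": scopes}, indent=2)
```

For example, print(scopes_payload("user_access_token")) writes the payload used in this section.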
Configure the redirect URL
The redirect URL is where Lark sends the authorization code after the user grants access. KBSync uses this code to authenticate on the user's behalf.
In the left navigation pane, choose Development Configuration > Security Settings.
In the Redirect URLs field, enter https://www.aliyun.com and click Add.
Turn on the Refresh User_access_token switch.
If the switch is not visible, the feature is already enabled by default.
Get the authorization code
The authorization code is a credential that KBSync uses to get a user_access_token. You must get a new code each time you run KBSync, as the code expires.
Build the Lark authorization URL by replacing YOUR_FEISHU_CLIENT_ID with the App ID you recorded in Step 1:
https://accounts.feishu.cn/open-apis/authen/v1/authorize?client_id=YOUR_FEISHU_CLIENT_ID&redirect_uri=https://www.aliyun.com&scope=drive:drive offline_access docs:document:export wiki:wiki
Open the URL in a browser and click Authorize.
After authorization, the browser redirects to https://www.aliyun.com. Copy the value of the code parameter from the redirect URL.
Important: The authorization code is the value of the code parameter only. Do not include code=, &, or anything after &.
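Because the scope value contains spaces and the code must be copied without any surrounding characters, both steps are easy to get wrong by hand. The following sketch builds the authorization URL with percent-encoding and extracts the code from the redirect URL. The endpoint, redirect URL, and scopes are the ones given above. The function names and the sample redirect URL in the usage note are hypothetical.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

AUTHORIZE_ENDPOINT = "https://accounts.feishu.cn/open-apis/authen/v1/authorize"
REDIRECT_URI = "https://www.aliyun.com"
SCOPES = ["drive:drive", "offline_access", "docs:document:export", "wiki:wiki"]


def build_authorize_url(app_id: str) -> str:
    """Return the authorization URL with each query value percent-encoded."""
    query = urlencode(
        {
            "client_id": app_id,
            "redirect_uri": REDIRECT_URI,
            "scope": " ".join(SCOPES),  # scopes are space-separated
        },
        quote_via=quote,  # encode spaces as %20 rather than +
    )
    return f"{AUTHORIZE_ENDPOINT}?{query}"


def extract_code(redirect_url: str) -> str:
    """Return only the value of the code parameter from the redirect URL."""
    values = parse_qs(urlsplit(redirect_url).query).get("code")
    if not values:
        raise ValueError("redirect URL has no code parameter")
    return values[0]
```

For a made-up redirect URL such as https://www.aliyun.com/?code=abc123&state=xyz, extract_code returns abc123, which is the value to set for feishuUserAccessCode.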
Step 3: Get folder tokens and knowledge base IDs
KBSync uses a folder token to identify which Lark Docs folder to sync, and a knowledge base ID to identify which Lark knowledge base to sync.
Get the Lark Docs folder token:
In the Lark client, navigate to the target folder.
Copy the folder URL from the address bar.
Extract the token: it is the string after folder/ in the URL. Do not include ? or anything that follows.
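The extraction rule above can be sketched as follows. The function name extract_folder_token and the sample URL in the usage note are hypothetical; the only rule taken from this topic is that the token is the string after folder/, up to but not including the ? character.

```python
from urllib.parse import urlsplit


def extract_folder_token(folder_url: str) -> str:
    """Return the path segment that follows folder/ in a Lark Docs folder URL."""
    path = urlsplit(folder_url).path  # the path excludes ?query and #fragment
    marker = "folder/"
    index = path.find(marker)
    if index == -1:
        raise ValueError("URL does not contain folder/")
    token = path[index + len(marker):].split("/", 1)[0]
    if not token:
        raise ValueError("no token found after folder/")
    return token
```

For a made-up URL such as https://example.feishu.cn/drive/folder/AbCdEf123?from=home, the function returns AbCdEf123, which is the value to set for feishuCloudSpaceDirToken.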
Get the Lark knowledge base ID:
Go to the All Knowledge Bases page.
Hover over the target knowledge base, then click the Knowledge Base Settings icon.
Copy the URL from the address bar.
Extract the ID: it is the string of digits after settings/ in the URL. Do not include # or anything that follows.
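A similar sketch applies to the knowledge base ID, which is the run of digits after settings/. The function name extract_wiki_space_id and the sample URL in the usage note are hypothetical.

```python
import re
from urllib.parse import urlsplit


def extract_wiki_space_id(settings_url: str) -> str:
    """Return the digits that follow settings/ in a knowledge base settings URL."""
    path = urlsplit(settings_url).path  # the path excludes the #fragment
    match = re.search(r"settings/(\d+)", path)
    if match is None:
        raise ValueError("URL does not contain settings/ followed by digits")
    return match.group(1)
```

For a made-up URL such as https://example.feishu.cn/wiki/settings/1234567890123456789#members, the function returns 1234567890123456789, which is the value to set for feishuWikiSpaceId.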
Step 4: Run KBSync
Get the KBSync file
Join the DingTalk group (ID: 79690034672) and contact the helpdesk to get the KBSync file.
KBSync runs in a Linux environment that has network access to both the Lark Open Platform and RAGFlow.
Create the config file
In the Linux environment, create a file named config.
Copy the following template into the file:
whiteList=
blackList=
sinkType=RagFlow
sourceType=FeiShu
ragflowUrl=http://XX.XX.XX.XX
ragflowApiKey=Bearer RAGFlow-BmND******MDI0Mm
ragflowDatasetId=928d061******2ac120006
feishuAppId=cli_a8a******d00d
feishuAppSecret=pMp73Si******UDrWXBSOa
feishuUserAccessCode=bGzpx6******B9KFCdzdCDHG
feishuCloudSpaceDirToken=ESJm*******CRdn002cii3bnAc
feishuWikiSpaceId=7504968******8674
Replace the placeholder values with your actual values. Use the parameter reference below.
Parameter reference:
| Parameter | Required | Description | How to get |
|---|---|---|---|
| whiteList | No | Folder paths in Lark Docs or document paths in knowledge bases to include. Separate multiple paths with spaces. Regular expressions are supported. | From the Lark client |
| blackList | No | Paths to exclude. Takes priority over whiteList. Separate multiple paths with spaces. Regular expressions are supported. | From the Lark client |
| sinkType | Yes | Type of the sync destination. | Set to RagFlow |
| sourceType | Yes | Type of the sync source. | Set to FeiShu |
| ragflowUrl | Yes | RAGFlow API Server address. | Get the API endpoint |
| ragflowApiKey | Yes | API key for the RAGFlow knowledge base. | Get the API key |
| ragflowDatasetId | Yes | ID of the RAGFlow knowledge base. | Get the knowledge base ID |
| feishuAppId | Yes | App ID of the Lark application. | Recorded in Step 1 |
| feishuAppSecret | Yes | App Secret of the Lark application. | Recorded in Step 1 |
| feishuUserAccessCode | Required for user_access_token only | Lark authorization code. Get a new one before each run, as the code expires. | Get the authorization code |
| feishuCloudSpaceDirToken | One of these two is required | Token of the Lark Docs folder to sync. If both feishuCloudSpaceDirToken and feishuWikiSpaceId are set, only feishuCloudSpaceDirToken takes effect, so only the Lark Docs and their parent folder are transferred. | Get the folder token |
| feishuWikiSpaceId | One of these two is required | ID of the Lark knowledge base to sync. | Get the knowledge base ID |
Leave the value blank for any parameter you do not need — do not remove the parameter line.
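Before running KBSync, it can help to check the config file against the rules in the parameter reference. The following is a minimal sketch, not part of KBSync: it assumes the file holds one key=value pair per line, and the function names parse_config and check_config are hypothetical. It reports missing parameter lines, empty required values, and the case where neither a folder token nor a knowledge base ID is set.

```python
from typing import Dict, List

# Parameter names and requirements taken from the parameter reference.
ALL_KEYS = [
    "whiteList", "blackList", "sinkType", "sourceType", "ragflowUrl",
    "ragflowApiKey", "ragflowDatasetId", "feishuAppId", "feishuAppSecret",
    "feishuUserAccessCode", "feishuCloudSpaceDirToken", "feishuWikiSpaceId",
]
REQUIRED = [
    "sinkType", "sourceType", "ragflowUrl", "ragflowApiKey",
    "ragflowDatasetId", "feishuAppId", "feishuAppSecret",
]
FIXED_VALUES = {"sinkType": "RagFlow", "sourceType": "FeiShu"}


def parse_config(text: str) -> Dict[str, str]:
    """Parse key=value lines (assumed format: one parameter per line)."""
    config: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip()
    return config


def check_config(config: Dict[str, str], use_user_token: bool) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems: List[str] = []
    for key in ALL_KEYS:
        if key not in config:
            # Unused parameters must stay in the file with a blank value.
            problems.append(f"missing parameter line: {key}=")
    for key in REQUIRED:
        if key in config and not config[key]:
            problems.append(f"required value is empty: {key}")
    for key, expected in FIXED_VALUES.items():
        if config.get(key) and config[key] != expected:
            problems.append(f"{key} must be set to {expected}")
    if use_user_token and not config.get("feishuUserAccessCode"):
        problems.append("feishuUserAccessCode is required for user_access_token")
    if not (config.get("feishuCloudSpaceDirToken") or config.get("feishuWikiSpaceId")):
        problems.append("set feishuCloudSpaceDirToken or feishuWikiSpaceId")
    return problems
```

A typical use is to read the file, call check_config(parse_config(text), use_user_token=False) for the tenant_access_token method, and fix any reported problems before starting the sync.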
Run the program
Place the KBSync binary and the config file in the same directory.
Run the following command:
./KBSync --config config
If KBSync starts successfully, the output looks similar to:
INFO config whiteList=, blackList=
INFO config ragflowUrl=http://XX.XX.XX.XX/, ragflowApiKey=Bearer RAGFlow-BmND******MDI0Mm
INFO config ragflowDatasetId=928d061******2ac120006
INFO config feishuAppId=cli_a8a******d00d, feishuAppSecret=pMp73Si******UDrWXBSOa
INFO Response from https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal: 200, ...
INFO set feishu tenant access token expires in: 4340
Appendix
Get the API endpoint of the RAGFlow knowledge base
In the left navigation pane, click API.
Copy the API Server value.
Get the API key of the RAGFlow knowledge base
In the left navigation pane, click API.
Next to RAGFlow API, click API KEY.
In the API KEY dialog, click Create New Key.
Click the copy icon to save the token.
Get the ID of the RAGFlow knowledge base
On the Knowledge Base page, click the target knowledge base.
In the URL of the page, copy the value after id=.