Build and manage a knowledge base

Lingma can use private data from your knowledge base to generate answers that fit your needs. To utilize retrieval-augmented generation (RAG) technology, it's important to build a high-quality enterprise knowledge base and manage its access. This topic outlines how to build and manage a high-quality enterprise knowledge base.

Who can use this feature?

Editions: Lingma Enterprise Dedicated
Roles: Lingma admins and global admins in an organization

When to use

Lingma possesses a wide range of general knowledge, but lacks in-depth knowledge of specific companies. By adding an enterprise knowledge base, you can help the model understand your private domain knowledge more accurately. This helps generate more personalized and relevant answers. Lingma can answer questions, optimize code, and generate suggested code based on the knowledge base. With these features, Lingma is often used for compliance checks and technical support.

For example, you can use the "knowledge base as context" feature in the following scenarios:

AI Chat: Train new staff, ensure that security rules are followed, help with product maintenance and troubleshooting, and support work on internal platforms and APIs.
Code optimization and generation: Make sure code style is consistent and follows company standards. Find code weaknesses and suggest fixes based on security rules.

To get the most out of Lingma, follow these two tips:

Build a good knowledge base with high-quality data.
Set up permissions so the right members can see the knowledge base.

The knowledge base admin needs to:

Provide AI-friendly and high-quality knowledge data, such as documents and code. Old or incorrect information can be harmful and lead to incorrect answers.
Create a well-organized knowledge base with clear permission settings. This keeps data private and makes it easier to manage. Poor permission settings can put data at risk.

Build a high-quality knowledge base

Lingma transforms the files you upload into retrieval-enhanced knowledge data. Before you upload, prepare your files according to the following principles and methods for file-based knowledge data.

File format

You can upload up to 10 files at a time.
Supported formats include pdf, docx, txt, markdown, and csv.
Each file must be 5 MB or smaller.
File names must be 200 characters or fewer. Only UTF-8 and GBK encoded files are supported.
Use structured documents to make sure the information is easy to find.
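The limits above can be checked locally before an upload. The following Python sketch is one possible pre-upload check, not part of Lingma. It assumes that 5 MB means 5 × 1024 × 1024 bytes, that the 200-character limit applies to the full file name including its extension, and that the listed formats map to the extensions shown. The function names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List, Optional

MAX_FILES_PER_UPLOAD = 10
MAX_FILE_BYTES = 5 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_NAME_CHARS = 200              # assumption: counts the extension too
# Assumption: how the documented formats map to file extensions.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".markdown", ".csv"}
TEXT_EXTENSIONS = {".txt", ".md", ".markdown", ".csv"}


def detect_text_encoding(data: bytes) -> Optional[str]:
    """Return 'utf-8' or 'gbk' if the bytes decode cleanly, else None."""
    for encoding in ("utf-8", "gbk"):
        try:
            data.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    return None


def check_file(path: Path) -> List[str]:
    """Return a list of problems found for one file (empty if none)."""
    problems = []
    suffix = path.suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if len(path.name) > MAX_NAME_CHARS:
        problems.append(f"file name longer than {MAX_NAME_CHARS} characters")
    data = path.read_bytes()
    if len(data) > MAX_FILE_BYTES:
        problems.append("file larger than 5 MB")
    # Only plain-text formats are decoded; PDF and DOCX are binary containers.
    if suffix in TEXT_EXTENSIONS and detect_text_encoding(data) is None:
        problems.append("not valid UTF-8 or GBK text")
    return problems


def check_batch(paths: List[Path]) -> Dict[str, List[str]]:
    """Check every file, plus the per-upload file count."""
    report = {str(p): check_file(p) for p in paths}
    if len(paths) > MAX_FILES_PER_UPLOAD:
        report["(batch)"] = [f"more than {MAX_FILES_PER_UPLOAD} files in one upload"]
    return report
```

For example, `check_batch(sorted(Path("docs").glob("*")))` returns a dictionary that maps each path to its list of problems. Note that the GBK check is permissive: many byte sequences decode as GBK without being meaningful text, so a clean result does not prove that the encoding is correct.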

A single file

If you upload a single file, check if it follows the rules for file name, title, format, and content. See more details and examples below.

File type and naming

Type: Markdown format is preferred. Compared with Word and PDF, the Markdown format is better suited for file processing.
Encoding: Use UTF-8 for the best character compatibility.

File naming: Keep names short and clear, and make sure the model can easily understand them. Do not use unclear abbreviations, numbers, or symbols. Here are some examples:

Don'ts: Do not use names that are too general, too similar, or confusing:

Coding Specification
Security Specification 1
Security Specification 2
SR3

Dos: Use names that specify the content and purpose of the file:

Java Language Programming Specification
API Data Security Management Specification
Cloud Account Security Management Specification
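The naming rules can be approximated with a few heuristics. The following Python sketch is illustrative only: the three-word minimum and the abbreviation pattern are assumptions chosen to match the examples above, and `lint_file_name` is a hypothetical helper, not a Lingma feature.

```python
import re
from typing import List

MIN_WORDS = 3  # assumption: shorter names are usually too general


def lint_file_name(stem: str) -> List[str]:
    """Return warnings for a file name given without its extension."""
    words = stem.split()
    warnings = []
    if len(words) < MIN_WORDS:
        warnings.append("too general: state the subject and purpose")
    # A trailing number usually marks a series such as "Specification 1".
    if words and words[-1].isdigit():
        warnings.append("numbered series: name the file after its content")
    # Short letter-plus-digit tokens such as "SR3" are treated as unclear.
    if any(re.fullmatch(r"[A-Za-z]{1,4}\d+", word) for word in words):
        warnings.append("unclear abbreviation: spell it out")
    return warnings
```

With these rules, "Security Specification 1" and "SR3" produce warnings, while "API Data Security Management Specification" produces none. A heuristic like this can only flag candidates for review. Whether a name is clear still needs human judgment.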

File structure

Hierarchical structure: Use headings to organize the content in a file. Define each technical term in its own entry, on its own line, so that the model can retrieve and understand it.

Titles at all levels: Titles should be clear, concise, and easily distinguishable. Do not use unclear abbreviations, numbers, or symbols. Do not include a keyword list in the file, because it adds noise to retrieval.

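The first example below shows the Don'ts. The second example shows the Dos for the same file, with comments between /* and */ that explain each improvement.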

Specifications on Safe Use of AccessKey Pairs
[Contents]
Keywords: AK, security specifications, AccessKey
1. Definition
An AccessKey pair is a key pair used for authentication. Each AccessKey pair consists of an AccessKey ID and an AccessKey secret. AccessKey pairs allow you to securely access system services by sending API requests. This File aims to clarify the rules for using AccessKey pairs and ensure the security and stability of the system. The AccessKey ID is used to verify the identity of the user. The AccessKey secret is used to encrypt your signature string and ensure the uniqueness and non-repudiation of your request. 
2. Usage notes
Ensure the confidentiality of the AccessKey secret and do not disclose it to any unauthorized third party. Follow the principle of least privilege to grant permissions on API operations, and grant only the permissions necessary to complete the task. Change the AccessKey secret every 90 days. Record the usage of AccessKey pairs and regularly review the usage logs to eliminate abnormal behaviors and revoke unnecessary permissions. 
3. Security practices
To ensure the security of AccessKey pairs, we implement the following simplified security practices: In the production environment, we preferentially use environment variables to store AccessKey pairs to prevent hard coding. We manage AccessKey pairs by using a configuration management system in a unified manner to prevent their direct exposure in code. At the same time, we filter logs to ensure that no AccessKey secret is recorded. We regularly review permissions to ensure that AccessKey pairs have only the minimum permissions required to perform the necessary operations. In addition, we establish an anomaly detection mechanism to quickly identify and respond to any suspicious AccessKey usage activities. These measures together guarantee the safe and reasonable use of AccessKey pairs. 
4. API calling examples
● Example 1
Use AccessKey pairs for API calls in Node.js: In Node.js, you can use the Axios library to send API requests and include AccessKey pairs in the request header. The following sample code shows an API request that uses AccessKey pairs for signature authentication.
[Sample code block]
● Example 2
Use AccessKey pairs for API calls in Python: In Python, you can use the requests library to send API requests with AccessKey pairs. The following sample code shows how to construct a request and add a signature:
[Sample code block]

Specifications on Safe Use of AccessKey Pairs
/*
Remove distracting elements: Remove information that does not need to be retrieved, such as the table of contents and the keyword list at the beginning of a file.
Explain technical terms: List technical terms and their explanations as separate entries so that the model can retrieve and understand them more easily.
 */
1. Definition
● AccessKey pair: An AccessKey pair is a key pair used for authentication. Each AccessKey pair consists of an AccessKey ID and an AccessKey secret. AccessKey pairs allow you to securely access system services by sending API requests. 
● AccessKey ID: The AccessKey ID is used to verify the identity of the user. 
● AccessKey secret: The AccessKey secret is used to encrypt your signature string and ensure the uniqueness and non-repudiation of your request. 
/*
Avoid long paragraphs. Use bullet points, which are easier for the model to understand.
 */
2. Usage notes
● Confidentiality: Keep the AccessKey secret strictly confidential and do not disclose it to any unauthorized third party. 
● Principle of least privilege: Follow the principle of least privilege to grant permissions on API operations, and grant only the permissions necessary to complete the task. 
● Regular rotation: We recommend that you change the AccessKey secret every 90 days. 
● Monitoring and auditing: Record the usage of AccessKey pairs and regularly review the usage logs to eliminate abnormal behaviors. 
● Timely revocation: If you no longer need to use an AccessKey pair, revoke its permissions in a timely manner. 
3. Security practices
● Environment variables: In the production environment, store AccessKey pairs by using environment variables instead of hard coding. 
● Configuration management: Manage AccessKey pairs by using a configuration management system to prevent direct exposure in code. 
● Log filtering: Make sure that no AccessKey secret appears in log entries. 
● Permission review: Regularly check the permission settings of AccessKey pairs to ensure compliance with the principle of least privilege. 
● Anomaly detection: Establish an anomaly detection mechanism to detect and handle suspicious activities in a timely manner. 
/*
The headings in API calling examples are unclear. Change Example 1 and Example 2 to more specific names.
*/
4. API calling examples
● Example of using AccessKey pairs for API calls in Node.js
 In Node.js, you can use the Axios library to send API requests and include AccessKey pairs in the request header. The following sample code shows an API request that uses AccessKey pairs for signature authentication.
[Sample code block]
● Example of using AccessKey pairs for API calls in Python
In Python, you can use the requests library to send API requests with AccessKey pairs. The following sample code shows how to construct a request and add a signature:
[Sample code block]
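The improvements annotated in the second example can be partly checked automatically. The following Python sketch flags the four problems called out above: a table of contents, a keyword list, generic headings such as "Example 1", and long paragraphs. The three-sentence threshold and the function name `lint_structure` are assumptions for illustration.

```python
import re
from typing import List

MAX_SENTENCES_PER_PARAGRAPH = 3  # assumption: longer paragraphs become bullets


def lint_structure(text: str) -> List[str]:
    """Return one finding per line that breaks a file-structure guideline."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Ignore leading bullet markers and surrounding whitespace.
        content = line.strip().lstrip("●-* ").strip()
        lowered = content.lower()
        if lowered in {"[contents]", "contents", "table of contents"}:
            findings.append(f"line {number}: remove the table of contents")
        elif lowered.startswith("keywords:"):
            findings.append(f"line {number}: remove the keyword list")
        elif re.fullmatch(r"example \d+", lowered):
            findings.append(f"line {number}: use a specific heading")
        # Count sentence-ending punctuation followed by a space or line end.
        elif len(re.findall(r"[.!?](?:\s|$)", content)) > MAX_SENTENCES_PER_PARAGRAPH:
            findings.append(f"line {number}: split the paragraph into bullets")
    return findings
```

Run against the first example above, a check like this would report the [Contents] line, the Keywords line, the two "Example" headings, and the long paragraphs under each section. The second example should pass without findings.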

File sections and paragraphs

Gather relevant content in the same paragraph or section to keep the information clear and connected.
Do not abbreviate or shorten important details. If the same content applies in more than one place, repeat it in full instead of writing "same as above."
Do not use blank lines that don't add meaning.

Use bullet points and proper indentation to make the content easier to understand.

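Each of the following examples shows a Don't followed by the corresponding Do.

Don't: Leave meaningless blank lines between related statements.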

6.3 Method receiver
We recommend that you name the receiver after the first lowercase letter of the class name. 

If the function exceeds 20 lines, do not use a single character as the name of the receiver. 

Do not use confusing names such as 'me', 'this', and 'self'.

Do: Remove meaningless blank lines.

6.3 Method receiver
● We recommend that you name the receiver after the first lowercase letter of the class name. 
● If the function exceeds 20 lines, do not use a single character as the name of the receiver. 
● Do not use confusing names such as 'me', 'this', and 'self'.

Don't: Use "same as above," abbreviations, or pronouns in place of the actual content.

4.6  Interface naming conventions in Go
The naming conventions are the same as those for structs. 
or
Same as Chapter 3.2 Naming Conventions and Struct Naming Conventions

Do: Clearly describe and explain the specific content.

4.6  Interface naming conventions in Go
● Use the CamelCase naming method. Use uppercase or lowercase for the first letter according to the access control requirements. 
● Use nouns or noun phrases as struct names, such as Customer, WikiPage, Account, and AddressParser. Do not use verbs. 
● Do not use struct names with too broad meanings, such as Data and Info.
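Both problems shown in these examples can be detected with simple text processing. The following Python sketch finds phrases such as "same as" that point elsewhere instead of stating the content, and collapses runs of two or more blank lines into one. The phrase list and the function names are assumptions, and the phrase match is a heuristic that can report false positives.

```python
import re
from typing import List, Tuple

# Assumption: a small, non-exhaustive list of cross-referencing phrases.
VAGUE_REFERENCE = re.compile(
    r"\b(same as|see above|as above|as mentioned above|ditto)\b",
    re.IGNORECASE,
)


def find_vague_references(text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that refer to content elsewhere."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if VAGUE_REFERENCE.search(line)
    ]


def collapse_blank_runs(text: str) -> str:
    """Replace two or more consecutive blank lines with a single one."""
    return re.sub(r"\n(?:[ \t]*\n){2,}", "\n\n", text)
```

`collapse_blank_runs` keeps single blank lines, because a single blank line may separate sections. Removing the blank lines between related statements, as in the 6.3 example, still requires an editor to decide which statements belong together.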

Images, tables, and multimedia content

If you use images and tables in file paragraphs, here are some tips:

Processing of tables in files:
- Requirements for table format:
  - Use the first row for table headers.
  - Do not put the table name in the first row.
- Table structure description: There are no special rules for table structure. You can design columns and rows based on your content.
- Keep the style concise: Remove extra formatting, like the background color and font style. Use clear table lines and the default style.
- Additional notes:
  - In Enterprise Dedicated Edition, Lingma has advanced features to keep the table data accurate.
Processing of images in files:
- Use text instead of images: Use text to share information. If an image has a small amount of text and contains important details, write the information as text.
- Add descriptions: Include a clear text description for each important diagram to help explain the subject matter.
Others:
- Special characters: Do not use emojis or other special characters, to avoid parsing issues.
- Headers, footers, watermarks, and annotations: Do not use these, because they can obscure the main content.
- File background: Do not use a background, because it may cause issues with user experience.
- Uniform text orientation: Make sure that all text faces the same direction.
- Audio and video: Do not include audio or video.

Files in different formats

Markdown: We recommend using the Markdown format.
Word:
- Use updated format: Use Word version 2007 or later.
- Use global styles: Use global heading and paragraph styles.
- Do not use character styles, such as special font formatting, borders, and shading.
- Use paragraph styles: Use paragraph styles for consistent formatting.
PDF:
- Do not use images: Do not convert images directly to PDF files. Write important information from images as text, and follow the file requirements described in this topic.
- Do not include compressed files: Make sure that the file does not include compressed files.
- Keep the single-column layout: Use a one-column layout to make sure the content can be properly parsed.
CSV:
- Do not use images: This makes sure the text can be searched.
- Do not embed compressed files: Do not embed compressed files in the file.
- Use the first row as table headers: Do not put the table name in the first row.
  Notes:
  - Recommended: Use CSV files to store question-answer pairs, such as frequently asked questions (FAQs). To improve retrieval accuracy, make sure that:
    - The questions are clearly expressed.
    - The answers are concise and easy to understand.
    - The terms are familiar to users.
    - The keywords are highlighted.
  - Do not upload a complex data table in CSV. It may take a very long time to process, and might fail.
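A question-answer CSV file that follows these rules can be produced with Python's standard csv module. The following sketch writes the column headers in the first row and one question-answer pair per row. The column names "question" and "answer" and the function name are assumptions. This topic does not prescribe specific header names.

```python
import csv
from typing import Iterable, TextIO, Tuple


def write_faq_csv(pairs: Iterable[Tuple[str, str]], stream: TextIO) -> None:
    """Write question-answer pairs as CSV with a header row."""
    writer = csv.writer(stream)
    # First row holds the column headers, not a table name.
    writer.writerow(["question", "answer"])  # assumption: header names
    for question, answer in pairs:
        writer.writerow([question.strip(), answer.strip()])


if __name__ == "__main__":
    faq = [
        (
            "How often should the AccessKey secret be rotated?",
            "We recommend that you change the AccessKey secret every 90 days.",
        ),
    ]
    # newline="" lets the csv module control line endings; UTF-8 matches
    # the recommended encoding.
    with open("faq.csv", "w", encoding="utf-8", newline="") as handle:
        write_faq_csv(faq, handle)
```

Keeping the file to two plain-text columns also avoids the complex tables that the last note warns against.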

Multiple files

When writing documents, make sure they can be understood on their own, and that they are aggregated, consistent, and comprehensive. This helps users quickly find what they need, and makes the documents more useful. Here are some tips to keep your documents organized and easy to handle:

Knowledge-independent: Each document should have all the information it needs on its own.

Don't repeat information in different documents.
Make sure that each file is a self-contained unit of knowledge that provides complete and accurate information on its own.

Knowledge-aggregated: Gather all the information that is relevant to a topic in the same place.

This keeps related topics together and makes them easy to understand.

Consistent: Make sure that similar information in different files is consistent and standardized.

Use the same style and words in all documents.
Create and follow style guides and glossaries to ensure consistency.

Comprehensive: Make sure all documents are complete, up-to-date, and accurate.

Make sure the knowledge base can answer all common questions clearly. When you write about an API, include all the parts and details that matter.
Regularly check and update to add missing information and remove old content.

The preceding guidelines can help enterprise knowledge base administrators create high-quality product documentation, and improve user satisfaction and experience. A systematic approach to organizing and managing documents ensures the accuracy, ease of use, and integrity of information, providing users with a more reliable knowledge resource base.

Set up permissions

You can determine who has access to the knowledge base based on its content and the members:

For general-purpose knowledge, consider establishing a public knowledge base accessible to all authorized developers in the enterprise. These can include important documents and guidelines, like code rules and security standards.
For knowledge specific to your teams or departments, you can create private knowledge bases. These can include business-specific development documents, training materials, and operation and maintenance guides, or technical guides for new employees in the enterprise.

Create a knowledge base

In the Lingma console, go to the Knowledge Base section, click Create Knowledge Base, and select Chat for Scenarios. Set the Access Control parameter to Private.

If you set Access Control to Public, all developers authorized to use Lingma can access the knowledge base.
If you set Access Control to Private, you can choose which members can access the knowledge base. We recommend using this setting.

Access control

In the Lingma console, go to the Knowledge Base section, and select the knowledge base you want to manage. Add developers to or remove developers from the list of members with access. See Knowledge base as context for chat for details.

Note

When you manage the access control of a knowledge base, make sure that each member has access only to content they need. This keeps data private, reduces unnecessary information, and helps users focus and find what they need more easily.
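The Public and Private settings described above can be summarized as a small decision rule. The following Python sketch is a conceptual model for planning permissions, not Lingma's implementation or API. The class, field, and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class KnowledgeBase:
    """Conceptual model of a knowledge base and its access settings."""
    name: str
    access_control: str = "Private"  # "Public" or "Private"
    members: Set[str] = field(default_factory=set)

    def can_access(self, developer: str, authorized: Set[str]) -> bool:
        """Apply the Public/Private rule for one developer."""
        # Developers who are not authorized to use Lingma never have access.
        if developer not in authorized:
            return False
        # Public: every authorized developer can access the knowledge base.
        if self.access_control == "Public":
            return True
        # Private: only members on the access list can access it.
        return developer in self.members


authorized_developers = {"alice", "bob"}
standards = KnowledgeBase("Security standards", access_control="Public")
payments = KnowledgeBase("Payments team runbooks", members={"alice"})
```

In this model, `payments.can_access("bob", authorized_developers)` is false until "bob" is added to `payments.members`, which mirrors adding a developer to the member list in the console.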


For more information about how to prepare code-based knowledge data, see Practices for enterprise-level code completion enhancement.