Updated on 2025-11-06 GMT+08:00

Configuring a Knowledge Base

A knowledge base is a core component of the agent development platform. It stores, manages, and retrieves domain knowledge, and gives agents precise information through structured storage, intelligent retrieval, and dynamic update mechanisms. You can upload documents in multiple formats, such as DOC, PDF, PPTX, XLSX, and CSV. By converging knowledge from multiple sources and vectorizing it, the knowledge base supports retrieval over complex semantics and provides reliable knowledge for agent decision-making, Q&A, and task execution. Developers can flexibly configure knowledge sources, update policies, and retrieval modes so that agents can quickly retrieve accurate information in different scenarios.

Constraints

Table 1 Knowledge base constraints

Constraint | Description
Maximum number of knowledge bases | The maximum number is 10.
Knowledge base size | The maximum size of a single document that can be uploaded is 128 MB.
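The limits in Table 1 can be checked on the client side before you create a knowledge base or upload documents. The following is a minimal sketch, not a platform API: the function name `check_constraints` and its parameters are hypothetical, and it assumes that 1 MB equals 1024 × 1024 bytes.

```python
from typing import List

MAX_KNOWLEDGE_BASES = 10                 # limit from Table 1
MAX_DOCUMENT_BYTES = 128 * 1024 * 1024   # 128 MB, assuming 1 MB = 1024 * 1024 bytes


def check_constraints(kb_count: int, document_sizes: List[int]) -> List[str]:
    """Return a list of constraint violations (empty if there are none).

    kb_count: number of knowledge bases that would exist after the operation.
    document_sizes: size in bytes of each document to be uploaded.
    """
    problems = []
    if kb_count > MAX_KNOWLEDGE_BASES:
        problems.append(
            f"{kb_count} knowledge bases exceed the limit of {MAX_KNOWLEDGE_BASES}"
        )
    for index, size in enumerate(document_sizes):
        if size > MAX_DOCUMENT_BYTES:
            problems.append(
                f"document {index} is {size} bytes, above the 128 MB limit"
            )
    return problems
```

An empty return value means that the planned operation stays within both limits.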

Adding a Knowledge Base

You can add a knowledge base to an agent. When replying to a message, the agent can reference the content in the knowledge base to answer users' questions. Currently, only one knowledge base can be associated with an agent.

Before you add a knowledge base, ensure that it has been created. For details about how to create a knowledge base, see Managing Knowledge Bases.

To add a knowledge base, perform the following steps:

  1. In the Knowledge area, click .
  2. In the Add knowledge dialog box, click to add a knowledge base, and then click OK.
    Figure 1 Adding a knowledge base
  3. In the skill > Knowledge area, view the added knowledge base.
    Figure 2 Viewing the knowledge base

Knowledge Base Hit Test

The platform supports hit tests on a created knowledge base so that you can evaluate how effectively and accurately it retrieves content.

To run a hit test, perform the following steps:

  1. On the Knowledge tab page of the Workbench page, click the required knowledge base. On the basic information page of the knowledge base, click Hit Test in the upper right corner.
  2. Enter a question in the text box and click Hit Test. The lower part of the page displays the matched content for each search mode, sorted in descending order by matching score.

    You can evaluate whether the current knowledge base meets the requirements based on the score and the amount of matched information.
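If you copy hit test results out of the console, the evaluation described above can be sketched in a few lines of code. This is an illustration only: the `Hit` structure, the `mode` labels, and the `min_score` cutoff are assumptions made for the example and are not a platform export format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hit:
    mode: str      # hypothetical search mode label, e.g. "semantic" or "keyword"
    content: str   # matched slice text
    score: float   # matching score shown in the hit test


def summarize_hits(hits: List[Hit], min_score: float) -> Dict[str, Dict[str, float]]:
    """Per search mode, report the best score and how many hits exceed min_score."""
    by_mode: Dict[str, List[Hit]] = {}
    for hit in hits:
        by_mode.setdefault(hit.mode, []).append(hit)

    summary = {}
    for mode, group in by_mode.items():
        # Same ordering as the hit test page: descending by matching score.
        ranked = sorted(group, key=lambda h: h.score, reverse=True)
        summary[mode] = {
            "best_score": ranked[0].score,
            "usable": sum(1 for h in ranked if h.score > min_score),
            "total": len(ranked),
        }
    return summary
```

A mode with a low best score or few usable hits suggests that the knowledge base content, or the chosen retrieval settings, may not yet meet the requirements for that kind of question.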

Knowledge Base Recall Strategy

  1. In the Knowledge area, click to configure advanced settings for the knowledge base, including Retrieval policy, Relevance threshold, and Topk recall quantity.
    • Retrieval policy: Different retrieval technologies are used for knowledge base retrieval. The following retrieval technologies are supported:
      • Semantic retrieval: Uses vector retrieval to search documents and structured data and recall slices that are highly relevant to the user's intent. Recommended for scenarios that require contextual relevance and an understanding of user intent.
      • Keyword retrieval: Uses inverted index retrieval to search documents and structured data and recall slices that match the query keywords. Recommended for scenarios where user questions closely match keywords in the knowledge base.
      • Hybrid retrieval: Uses both vector retrieval and keyword retrieval to search the knowledge base. Recommended for scenarios where both user intent and keyword matching need to be considered.
    • Relevance threshold: Search results that exceed the relevance threshold will be submitted to the large model for summary. The other search results will be filtered out. You can adjust the threshold based on the relevance score in the hit test for the knowledge base.
    • Topk recall quantity: The number of highest-scoring search results to recall from the results that meet the relevance threshold. For example, if Topk recall quantity is 5, the top 5 relevant results will be recalled and submitted to the large model for summary.
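
The way Relevance threshold and Topk recall quantity work together can be illustrated with a short sketch. It is not the platform's implementation: the function name `select_slices` and the sample scores are hypothetical, and it assumes that a result must score strictly above the threshold to pass.

```python
from typing import List, Tuple


def select_slices(
    scored: List[Tuple[str, float]],
    relevance_threshold: float,
    top_k: int,
) -> List[Tuple[str, float]]:
    """Keep results above the threshold, then return the top_k highest scores."""
    # Step 1: filter out results that do not exceed the relevance threshold.
    passing = [item for item in scored if item[1] > relevance_threshold]
    # Step 2: rank the remaining results in descending order by score.
    passing.sort(key=lambda item: item[1], reverse=True)
    # Step 3: keep at most top_k results for the large model to summarize.
    return passing[:top_k]


# Hypothetical scores for five slices returned by a search.
results = [("a", 0.91), ("b", 0.42), ("c", 0.77), ("d", 0.65), ("e", 0.30)]
print(select_slices(results, relevance_threshold=0.5, top_k=2))
# prints [('a', 0.91), ('c', 0.77)]
```

In this example, three slices pass the threshold of 0.5, but only the two with the highest scores are recalled. Raising the threshold reduces noise at the risk of returning fewer results than Topk recall quantity allows.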