Managing Knowledge Bases
You can manage knowledge bases on the LakeSearch web UI, including creating and deleting a knowledge base. DOC, DOCX, PDF, and JSON data can be uploaded to a knowledge base. Due to HBase storage restrictions, a single document to be uploaded cannot exceed 10 MB.
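The format and size limits above can be checked locally before a file is uploaded. The following is a minimal sketch, not part of LakeSearch: the function name `check_upload` is hypothetical, and it assumes that 10 MB means 10 × 1024 × 1024 bytes and that formats are recognized by file extension.

```python
from pathlib import Path

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_BYTES = 10 * 1024 * 1024
# Formats named in this section, matched case-insensitively by extension.
ALLOWED_SUFFIXES = {".doc", ".docx", ".pdf", ".json"}


def check_upload(path: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    suffix = Path(path).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; the limit is {MAX_BYTES}")
    return problems
```

For a file on disk, the size can be taken from `Path(path).stat().st_size`. FAQ spreadsheets (XLSX and XLS) are imported separately and are not covered by this check.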
Prerequisites
You have created a LakeSearch user, for example, lakeuser, and added it to the lakesearchgroup. For details about user creation and permission management, see Creating a LakeSearch Role.
Creating a Knowledge Base and Uploading Documents
- Log in to the LakeSearch web UI as lakeuser. For details, see Accessing the LakeSearch Web UI.
- Create a knowledge base.
- Upload documents to the knowledge base.
  - Click the ID of the knowledge base you created and set the following basic parameters of the knowledge base.

    Table 1 Basic knowledge base parameters

    Parameter | Description |
    ---|---|
    Top K Recalls | Number of top-ranked vectors (top k) recalled by a vector query. A larger value recalls more vectors, which can improve precision but consumes more resources. Default value: 50. Value range: 10 to 300. |
    Reference Documents | Number of reference documents passed to the model for dialogs. Documents are sorted by their relevance to questions and answers. Default value: 3. Value range: 1 to 10. |
    Refined Ranking | Whether to use the refined ranking model to sort query results a second time. This function is disabled by default. Value range: off and on. |
    Custom Prompt | Prompts guide the model to generate the expected results. Custom prompts are supported. Click Configure to view the default prompt and set a new one. |

  - Upload data.
    - Upload documents. DOC, DOCX, and PDF formats are supported.
      In the Document Management tab, click Upload, and then Select Document. Select the document you want to upload, and click Confirm. When the Document Status changes to Normal, the upload is successful.
    - Create an FAQ (enter questions and answers).
      On the Q&A Management page, click Create, enter the standard question and answer, and click Confirm. The Q&A pair is used to answer the question and similar questions so that users can quickly find the desired answer.
    - Import FAQ data in batches in XLSX or XLS format.
      In the FAQs Import tab, click Upload, and then Select Document. Select the document you want to upload, and click Confirm. When the Document Status changes to Normal, the upload is successful.
    - Upload structured data. UTF-8 encoded JSON documents are supported.
      In the Structured Data tab, click Upload, and then Select Document. Select the document you want to upload, and click Confirm. When the Document Status changes to Normal, the upload is successful.
- Toggle on the switch to the right of Knowledge Base Status to enable the knowledge base.
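The defaults and value ranges in Table 1 can be captured in a small settings object, which is handy for checking planned values before entering them on the web UI. This is an illustrative sketch only: the class and field names are hypothetical and do not correspond to a LakeSearch API.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeBaseSettings:
    # Field names are hypothetical; defaults and ranges are taken from Table 1.
    top_k_recalls: int = 50
    reference_documents: int = 3
    refined_ranking: bool = False  # disabled by default

    def validate(self) -> list[str]:
        """Return the range violations; an empty list means all values are valid."""
        errors = []
        if not 10 <= self.top_k_recalls <= 300:
            errors.append("Top K Recalls must be between 10 and 300")
        if not 1 <= self.reference_documents <= 10:
            errors.append("Reference Documents must be between 1 and 10")
        return errors
```

Calling `KnowledgeBaseSettings().validate()` checks the defaults, and passing other values checks a planned configuration.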
FAQ Batch Import Table
- Only Excel (XLSX and XLS) files can be imported.
- A maximum of 1,000 FAQs (that is, 1,000 rows in an Excel file) can be imported.
- You do not need to add a table header. You can directly enter answers and questions in the table.
- The answer and question columns are mandatory. The similar question columns are optional.
Answer (Mandatory) | Question (Mandatory) | Similar Question (Optional) | Similar Question (Optional) | Similar Question (Optional) | Similar Question (Optional) | Similar Question (Optional) |
---|---|---|---|---|---|---|
Answer 1 | Question 1 | Similar question A1 | Similar question B1 | Similar question C1 | Similar question D1 | Similar question E1 |
Answer 2 | Question 2 | Similar question A2 | Similar question B2 | Similar question C2 | Similar question D2 | - |
Answer 3 | Question 3 | Similar question A3 | - | - | - | - |
Answer 4 | Question 4 | - | - | - | - | - |
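The import rules above can be checked on the rows before they are written to a spreadsheet. The sketch below is a hypothetical helper, not a LakeSearch tool. It assumes each row is a list of cell strings in the column order shown in the table (answer, question, then similar questions) and that empty cells are empty strings. Treating the five similar-question columns of the template as an upper limit is also an assumption.

```python
MAX_FAQS = 1000   # at most 1,000 rows per import
MAX_SIMILAR = 5   # assumption: the template shows five similar-question columns


def validate_faq_rows(rows: list[list[str]]) -> list[str]:
    """Check rows laid out as [answer, question, similar_1, ..., similar_5]."""
    errors = []
    if len(rows) > MAX_FAQS:
        errors.append(f"{len(rows)} rows exceed the limit of {MAX_FAQS}")
    for number, row in enumerate(rows, start=1):
        cells = [cell.strip() for cell in row]
        # The first two columns (answer, question) are mandatory.
        if len(cells) < 2 or not cells[0] or not cells[1]:
            errors.append(f"row {number}: answer and question are mandatory")
        if len(cells) > 2 + MAX_SIMILAR:
            errors.append(f"row {number}: more than {MAX_SIMILAR} similar questions")
    return errors
```

Writing the validated rows to an XLSX or XLS file requires a spreadsheet library and is left out of this sketch.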
Structured Data Format
JSON documents encoded in UTF-8 format are supported. The documents must meet the field requirements of StructureData.
Parameter | Mandatory | Description |
---|---|---|
id | Yes | ID of each data record, which can contain 4 to 64 characters. |
content | | Content of each data record, which can contain 1 to 1,000 characters. |
cmd | Yes | Operation to perform on the data record. The example after this table uses ADD, UPDATE, and DELETE. |
title | No | Title, which can contain a maximum of 640 characters. |
category | No | Data category, which can contain a maximum of 640 characters. |
url | No | URL for uploading data, which can contain a maximum of 2,000 characters. Format: "((http\|https)://)(www.)?[a-zA-Z0-9@: %._\\+~#?&//=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9@: %._\\+~#?&//=]*)" |
docTime | No | Time when a document is uploaded. The time is in YYYY-MM-DD HH:MM:SS format. |
tags | No | Tags of each data record. Format: ["tag1","tag2","tag3"] |
[
  {
    "cmd": "ADD",
    "id": "100001",
    "content": "content for the first data"
  },
  {
    "cmd": "ADD",
    "id": "100002",
    "title": "title for the second data",
    "content": "content for the second data",
    "url": "https://www.xxx.com/intl/zh-cn/",
    "docTime": "2015/01/01 12:10:30",
    "category": "category1",
    "tags": ["tag1", "tag2", "tag3"]
  },
  {
    "cmd": "UPDATE",
    "id": "100002",
    "content": "The content for the second data is updated",
    "category": "newCategory"
  },
  {
    "cmd": "DELETE",
    "id": "100001"
  }
]
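A structured data file can be checked against the field requirements before upload. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the allowed cmd values are limited to the three that appear in the example above, and content is checked only when it is present because the table does not state whether it is mandatory. The url pattern and the docTime format are not checked here.

```python
import json

ALLOWED_CMDS = {"ADD", "UPDATE", "DELETE"}  # assumption: values seen in the example
MAX_LEN = {"title": 640, "category": 640, "url": 2000}


def validate_record(record: dict) -> list[str]:
    """Return the field violations found in one data record."""
    errors = []
    record_id = record.get("id")
    if not isinstance(record_id, str) or not 4 <= len(record_id) <= 64:
        errors.append("id must be a string of 4 to 64 characters")
    if record.get("cmd") not in ALLOWED_CMDS:
        errors.append("cmd must be one of ADD, UPDATE, DELETE")
    content = record.get("content")
    if content is not None and not (
        isinstance(content, str) and 1 <= len(content) <= 1000
    ):
        errors.append("content must contain 1 to 1,000 characters")
    for field, limit in MAX_LEN.items():
        value = record.get(field)
        if value is not None and not (isinstance(value, str) and len(value) <= limit):
            errors.append(f"{field} must not exceed {limit} characters")
    tags = record.get("tags")
    if tags is not None and not (
        isinstance(tags, list) and all(isinstance(tag, str) for tag in tags)
    ):
        errors.append("tags must be a list of strings")
    return errors


def validate_document(text: str) -> dict[str, list[str]]:
    """Parse a JSON array of records and map each failing record to its errors."""
    report = {}
    for index, record in enumerate(json.loads(text)):
        errors = validate_record(record)
        if errors:
            report[f"record {index}"] = errors
    return report
```

When generating the file from Python, writing it with `json.dumps(records, ensure_ascii=False)` and `encoding="utf-8"` keeps non-ASCII text readable while meeting the UTF-8 requirement.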