Obtaining the Knowledge Base List
Function
This API is used to obtain the list of all knowledge bases under the current account. The list includes the knowledge base ID, name, status, creator, creation time, and update time.
URI
GET /v1/koosearch/repos
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
name |
No |
String |
Knowledge base name |
status |
No |
String |
Status (open: enabled; close: disabled) |
page_num |
No |
Integer |
Request page number. |
page_size |
Yes |
Integer |
Number of records returned per page, for example, 5 or 10. |
tag |
No |
String |
Tag information, which consists of key and value connected by colons (:), for example, key1:value1. |
lod |
No |
String |
Level of detail of the returned result. Options: simple: simple result. detail: detailed result. |
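The query parameters above can be assembled into a request URL as in the following minimal sketch. The helper name build_list_url and the host example.com are hypothetical; only the path and the parameter names come from this page.

```python
from urllib.parse import urlencode


def build_list_url(endpoint, page_size, page_num=None, name=None,
                   status=None, tag=None, lod=None):
    """Build the GET URL for listing knowledge bases (hypothetical helper)."""
    params = {
        "page_num": page_num,
        "page_size": page_size,  # marked mandatory in the table above
        "name": name,
        "status": status,        # "open" or "close"
        "tag": tag,              # "key:value", for example "key1:value1"
        "lod": lod,              # "simple" or "detail"
    }
    # Drop optional parameters that were not supplied, then percent-encode.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"https://{endpoint}/v1/koosearch/repos?{query}"


print(build_list_url("example.com", page_size=10, page_num=1, tag="key1:value1"))
```

Using urlencode rather than manual string concatenation avoids stray spaces and unescaped characters (such as the colon in a tag) in the query string.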
Request Parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
X-Auth-Token |
Yes |
String |
Parameter description: Token used for API authentication. For how to obtain the token, see section 3.2 "Authentication." Constraints: N/A. |
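As an illustration of the header requirement, the sketch below attaches X-Auth-Token to a GET request using Python's standard library. The function name make_list_request, the URL, and the token value are placeholders; obtaining a real token is covered by the authentication section referenced above.

```python
import urllib.request


def make_list_request(url, token):
    """Prepare (but do not send) a GET request carrying the auth token."""
    req = urllib.request.Request(url, method="GET")
    # The token authenticates the caller; the value here is supplied by the user.
    req.add_header("X-Auth-Token", token)
    return req


req = make_list_request(
    "https://example.com/v1/koosearch/repos?page_size=10",  # placeholder URL
    "my-token",                                              # placeholder token
)
print(req.get_method(), req.full_url)
```

Sending the request would then be a matter of passing it to urllib.request.urlopen and decoding the JSON body; that step is omitted here because it needs a real endpoint and token.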
Response Parameters
Status code: 200
Parameter |
Type |
Description |
---|---|---|
data_list |
Array of KnowledgeRepoListInfo objects |
List of knowledge bases. |
total |
Integer |
Total number of knowledge bases. |
region_ocr_enabled |
Boolean |
OCR switch: whether to enable the OCR parsing service. |
region_rac_enabled |
Boolean |
RAC switch: whether to enable the RAC service. |
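Because the response carries both data_list and total, a client can page through the full list. The following is a sketch only: fetch_page is a hypothetical callable that returns the parsed 200 response body for a given page, and starting at page 1 is an assumption based on the example request on this page.

```python
def iter_knowledge_bases(fetch_page, page_size=10):
    """Yield every knowledge base by requesting pages until `total` is reached.

    fetch_page(page_num, page_size) is a hypothetical callable that returns
    the parsed response body (a dict with "data_list" and "total").
    """
    page_num = 1  # assumption: page numbering starts at 1
    seen = 0
    while True:
        body = fetch_page(page_num, page_size)
        items = body.get("data_list", [])
        yield from items
        seen += len(items)
        # Stop on an empty page or once all records have been seen.
        if not items or seen >= body.get("total", 0):
            break
        page_num += 1
```

The empty-page check guards against looping forever if total changes while paging.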
KnowledgeRepoListInfo
Parameter |
Type |
Description |
---|---|---|
id |
String |
Knowledge base ID. |
name |
String |
Knowledge base name |
detail |
String |
Description |
status |
String |
Status. |
create_user |
String |
Creator of the knowledge base. |
create_time |
String |
Creation time. |
update_time |
String |
Update time. |
top_k |
Integer |
Number of top K records. |
prompt |
String |
Prompt |
common_prompt |
String |
General prompt. |
rerank_enabled |
Boolean |
Rerank switch. |
moderate_enabled |
Boolean |
Content moderation switch. |
search_plan_enabled |
Boolean |
Whether to enable the search planning function. |
query_rewrite_enabled |
Boolean |
Query rewriting switch. |
reference_count |
Integer |
The number of reference documents. A reference document is input to the NLP model with a query to generate the final answer. |
fields |
Array of KnowledgeRepoFieldSchema objects |
Field description |
search_threshold |
Float |
Search API filtering threshold. When result reranking is disabled, the threshold ranges from 0 to 200. When reranking is enabled, the threshold ranges from 0 to 1. |
chat_ref_threshold |
Float |
Threshold for reference document filtering. When result reranking is disabled, the threshold ranges from 0 to 200. When reranking is enabled, the threshold ranges from 0 to 1. |
faq_threshold |
Float |
FAQs whose relevance score exceeds this threshold have their answers output directly, without being summarized by the NLP model. Notes:
|
embedding_model |
String |
Embedding model name |
rerank_model |
String |
Rerank model name |
nlp_model |
String |
NLP model name |
file_extract |
FileExtract object |
Document parsing details |
search_plan_category_ids |
Array of strings |
Search plan category IDs. Default categories (locale: zh): talk: Chit-chat. language_task: Language task. human: Characteristics. common: General knowledge. special_knowledge: Industry knowledge. |
language_id |
String |
Knowledge base language ID |
cache_enabled |
Boolean |
Whether to enable cache. |
session_config |
SessionConfig object |
Cache policy. |
answer_reference_enabled |
Boolean |
Whether to include references in answers. |
answer_image_reference_enabled |
Boolean |
Whether to include both text and images. |
extend_config |
KnowledgeRepoExtendConfig object |
Knowledge base extension configuration. |
refs |
String |
List of referenced knowledge base IDs, which are separated by commas (,). |
prompt_info |
KnowledgeRepoPromptInfo object |
Associated prompt information. |
version |
String |
Knowledge base version. |
pangu_nlp_model |
String |
NLP model name |
search_plan_model |
String |
Name of the search planning model. |
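The valid range of search_threshold and chat_ref_threshold depends on rerank_enabled, as described above. A minimal sketch of that rule follows; the function names are hypothetical, and treating the range bounds as inclusive is an assumption.

```python
def threshold_range(rerank_enabled):
    """Return the (low, high) range for search_threshold / chat_ref_threshold."""
    # With reranking enabled the range is 0 to 1; otherwise it is 0 to 200.
    return (0.0, 1.0) if rerank_enabled else (0.0, 200.0)


def threshold_is_valid(value, rerank_enabled):
    """Check a threshold against the range (bounds assumed inclusive)."""
    low, high = threshold_range(rerank_enabled)
    return low <= value <= high


print(threshold_is_valid(0.5, rerank_enabled=True))
print(threshold_is_valid(50, rerank_enabled=True))
```

A check like this is mainly useful when reranking is toggled, since a threshold that was valid on the 0 to 200 scale may fall outside the 0 to 1 scale.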
KnowledgeRepoFieldSchema
Parameter |
Type |
Description |
---|---|---|
name |
String |
Column name |
field_type |
String |
Field type. |
name_zh |
String |
Field Name (Chinese) |
FileExtract
Parameter |
Type |
Description |
---|---|---|
parse_conf |
ParseConf object |
Document parsing configuration, including whether to use OCR enhancement, whether to parse images, whether to extract text during image parsing, whether to parse the header and footer, and whether to parse the contents page. |
split_conf |
SplitConf object |
Split configuration, including the segmentation mode, level parsing mode, title level depth, title saving mode, segment length, and title matching pattern. |
id |
String |
Document parsing rule ID. |
ParseConf
Parameter |
Type |
Description |
---|---|---|
ocr_enabled |
Boolean |
Parameter description: Whether the current knowledge base uses OCR enhancement. Default value: false |
image_enabled |
Boolean |
Parameter description: Whether the current knowledge base parses images. Options: true: Parse images. The parsing mode is configured in image_conf. false: Skip images in the document (default). Constraints: N/A. Default value: false |
header_footer_enabled |
Boolean |
Parameter description: Whether to parse the header and footer of the file in the current knowledge base. true: The parsing result contains the header and footer. false: The parsing result does not contain the header and footer. (If the header and footer do not contain key text information, you are advised to set this parameter to false to avoid interference.) Constraints: N/A Default value: false |
catalog_enabled |
Boolean |
Parameter description: Whether to parse the contents page of files in the current knowledge base. Options: false: The parsing result does not contain the contents page. (A contents page generally contains a large number of keywords, which may affect search results, so keep the default value false unless the page holds information that must be retained.) true: The parsing result contains the contents page. Constraints: N/A. Default value: false |
image_conf |
String |
Parameter description: Image parsing mode used when image parsing is enabled (image_enabled is set to true). Default value: TEXT |
SplitConf
Parameter |
Type |
Description |
---|---|---|
split_mode |
String |
Parameter description: Mode for splitting a document. Options: Four modes are available:
Constraints: N/A Default value: AUTO |
separator_ids |
Array of strings |
Parameter description: IDs of the segment delimiters used in the automatic and length-based segmentation modes. A segment delimiter is the character that ends a segment. Options: period_zh: Chinese period (。). period_en: English period (.). exclamation_mark_zh: Chinese exclamation mark (！). exclamation_mark_en: English exclamation mark (!). question_mark_zh: Chinese question mark (？). question_mark_en: English question mark (?). comma_zh: Chinese comma (，). comma_en: English comma (,). space_en: space. Constraints: N/A. Default value: ["period_zh", "period_en", "exclamation_mark_zh", "exclamation_mark_en", "question_mark_zh", "question_mark_en"] |
rule_regex_id |
String |
Parameter description: ID of the selected user-defined parsing rule. Constraints: N/A. |
chunk_size |
Integer |
Parameter description: Maximum length of a document segment. A document is segmented based on the maximum length. Constraints: N/A. Default value: 500 |
title_level |
Integer |
Parameter description: Depth of the title levels retained for a segment. For example: If the depth is 3 and the current paragraph is under title 1.1.3, the parent titles 1.1 and 1 are retained. If the depth is 2 and the current paragraph is under title 1.1.3, the parent title 1.1 is retained and the parent title 1 is discarded. Constraints: N/A. Default value: 3 |
combine_title |
Boolean |
Parameter description: Whether to retain the hierarchical title combination. Options: false: Only the last-level title is retained. true: The combination of titles from the first level to the last level is saved. For example, if title 1.1 is "Usage description" and title 1.1.1 is "How to open the refrigerator", the titles are saved together with the segment. Constraints: N/A. Default value: false |
merge_titles |
Boolean |
Parameter description: Whether to merge titles. Options: true: If paragraphs under different titles each contain little text, they are automatically merged up to the specified segment length to generate more comprehensive results. For example, if two adjacent sub-paragraphs are each shorter than 200 characters and the expected segment length is 500, the two paragraphs are combined into one. false: Paragraphs with different titles are not merged. Constraints: N/A. Default value: true |
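The title_level examples above can be expressed as a small function. This is an illustrative sketch, not the service's implementation: retained_titles is a hypothetical name, and representing a paragraph's position as a list of titles from the first level down to its own title is an assumption made for the example.

```python
def retained_titles(title_path, title_level):
    """Return the titles kept for a segment, given the configured depth.

    title_path lists the titles from the first level down to the current
    paragraph's own title, for example ["1", "1.1", "1.1.3"].
    """
    if title_level <= 0:
        return []
    # Keep the deepest `title_level` entries; shallower parents are discarded.
    return title_path[-title_level:]


path = ["1", "1.1", "1.1.3"]
print(retained_titles(path, 3))  # keeps 1, 1.1 and 1.1.3
print(retained_titles(path, 2))  # keeps 1.1 and 1.1.3; title 1 is discarded
```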
SessionConfig
Parameter |
Type |
Description |
---|---|---|
similarity_threshold |
Float |
Parameter description: Threshold of the query2query similarity for matching cached questions. Options: 0.1 to 1.0. A higher threshold indicates a higher similarity between the query and the cached question. Constraints: N/A. |
answer_select_policy |
String |
Parameter description: Cache hit selection policy. Options: FIRST: Select the result with the highest score as the answer. RANDOM: Randomly select a result as the answer. Constraints: N/A. |
eviction |
Eviction object |
Cache expiration policy. |
model_name |
String |
Parameter description: Name of the query2query model used when the cache is hit. This parameter is used to calculate the similarity between the new query and the cached query. Constraints: N/A. |
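To make the interaction between similarity_threshold and answer_select_policy concrete, here is a hedged sketch of how a cache lookup could pick an answer. The function name, the (score, answer) tuple layout, and the choice to treat a score equal to the threshold as a hit are all assumptions; the page does not describe the service's internal logic.

```python
import random


def select_cached_answer(candidates, similarity_threshold, policy, rng=random):
    """Pick a cached answer from (similarity_score, answer) candidates."""
    # Keep only cached questions that are similar enough to the new query.
    hits = [c for c in candidates if c[0] >= similarity_threshold]
    if not hits:
        return None  # cache miss
    if policy == "FIRST":
        # FIRST: the result with the highest score is the answer.
        return max(hits, key=lambda c: c[0])[1]
    if policy == "RANDOM":
        # RANDOM: any qualifying result may be the answer.
        return rng.choice(hits)[1]
    raise ValueError(f"unknown answer_select_policy: {policy}")


cached = [(0.92, "answer A"), (0.97, "answer B"), (0.40, "answer C")]
print(select_cached_answer(cached, 0.9, "FIRST"))
```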
Eviction
Parameter |
Type |
Description |
---|---|---|
policy |
String |
Parameter description: Expiration policy used by the cache. Options: LRU (Least Recently Used): An entry is cleared when now - accessTime > ttl. FIFO (First In First Out): An entry is cleared when now - createTime > ttl. LFU (Least Frequently Used): An entry is cleared when hit_count < threshold. Constraints: N/A. |
ttl |
Long |
Parameter description: Cache expiration time. A cache entry is cleared once it is older than this value. Unit: milliseconds. Constraints: N/A. |
hit_count_threshold |
Long |
Parameter description: Cache hit threshold. When the number of cache hits reaches the threshold, the cache result is not used. Constraints: N/A. |
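The three expiration conditions listed for policy translate directly into a predicate. The sketch below mirrors those conditions only; the function name and argument names are hypothetical, and all times are assumed to be in milliseconds to match ttl.

```python
def should_clear(policy, now_ms, create_ms, access_ms, hit_count,
                 ttl_ms, hit_count_threshold):
    """Return True if a cache entry should be cleared under the given policy."""
    if policy == "LRU":
        # Least Recently Used: too long since the last access.
        return now_ms - access_ms > ttl_ms
    if policy == "FIFO":
        # First In First Out: too long since creation.
        return now_ms - create_ms > ttl_ms
    if policy == "LFU":
        # Least Frequently Used: not hit often enough.
        return hit_count < hit_count_threshold
    raise ValueError(f"unknown eviction policy: {policy}")


# An entry created 10 s ago and last read 1 s ago, with a 5 s ttl.
entry = dict(now_ms=10_000, create_ms=0, access_ms=9_000, hit_count=3,
             ttl_ms=5_000, hit_count_threshold=5)
for name in ("LRU", "FIFO", "LFU"):
    print(name, should_clear(name, **entry))
```

The same entry can survive under one policy and be cleared under another, which is why the policy choice matters for frequently read but old entries.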
KnowledgeRepoExtendConfig
Parameter |
Type |
Description |
---|---|---|
extend_context |
Boolean |
Parameter description: Whether to extend the context of the reference segments. A wider context helps the model produce more complete answers. Constraints: N/A. |
effective_input_length |
Integer |
Parameter description: Length of the selected context when extended context is enabled. The appropriate value depends on the model: it keeps the input tokens within a valid length so that the output is optimal. Constraints: For multi-round dialogs, it is recommended that the value be 60% (rounded up) of the model context length. Options: 2 to 128, in KB |
top_p |
Float |
Parameter description: An alternative to temperature sampling, called nucleus sampling, which controls the diversity of generated text by limiting the range of vocabulary choices. A higher top_p value means a wider choice of tokens and hence a higher text diversity. Constraints: You are advised to adjust either top_p or temperature to tune the generation tendency, but not both. Options: 0.1 ~ 1 Default value: 0.1 |
max_tokens |
Integer |
Parameter description: Maximum number of new tokens generated by the model. Constraints: The value of max_tokens is related to the maximum context length supported by the model. max_tokens must be less than the maximum context length supported by the model minus the length of the tokens input to the model. Options: 1 ~ 131072 Default value: 131072 |
chat_temperature |
Float |
Parameter description: Diversity and creativity of text generated by the model in non-search enhancement scenarios. A value close to 0 indicates the lowest randomness, while 1 indicates the highest randomness. Generally, a lower temperature suits deterministic tasks, while a higher temperature favors creative tasks. Constraints: N/A. Options: 0 ~ 1. |
search_temperature |
Float |
Parameter description: Diversity and creativity of text generated by the model in search enhancement scenarios. A value close to 0 indicates the lowest randomness, while 1 indicates the highest randomness. Generally, a lower temperature suits deterministic tasks, while a higher temperature favors creative tasks. Constraints: N/A. Options: 0 to 1. Generally, the value is set to 0.2 or 0.3 for the Pangu NLP model. Default value: 0.3 |
presence_penalty |
Float |
Parameter description: Degree of duplication in the generated text. The purpose of presence_penalty is to reduce the repeated use of the same or similar content when the model generates text, so as to improve the diversity of the generated text. If a token has appeared in the previous text, the model will be penalized when generating this token. A smaller presence_penalty indicates that the model considers fewer previously generated tokens, which may result in repeated content in the text. A larger value of presence_penalty indicates that the model tends to generate new tokens that have not appeared before, and the generated text is more diversified. Options: The value ranges from –2 to 2. The actual value needs to be determined depending on the situation. Generally, the value 1.1 is used for the Pangu NLP model. Default value: 0 |
use_system_prompt |
Boolean |
Parameter description: Whether to use the system prompt. The prompt standard combination scheme in the RAG scenario of the Pangu NLP model is used. Constraints: When the Pangu NLP model is used, the system prompt can be used in common scenarios. Default value: false |
system_prompt |
String |
Parameter description: System prompt. Constraints:
|
embedding_search_enable |
Boolean |
Parameter description: Specifies whether to enable vector retrieval when related documents are retrieved. Constraints: N/A. Default value: true |
keyword_search_enable |
Boolean |
Parameter description: Specifies whether to enable keyword retrieval when related documents are retrieved. Constraints: N/A. Default value: false |
keyword_top_k |
Integer |
Parameter description: Specifies the number of top results returned when keyword retrieval is used. Constraints: N/A. Options: 0 ~ 100 Default value: 10 |
refuse_enable |
Boolean |
Parameter description: Whether the platform skips the model call and directly refuses to answer when no related reference document content is found. Constraints: N/A. Default value: false |
refuse_answer |
String |
Parameter description: Refusal wording. When refuse_enable is set to true and no related reference document content is found, the platform replies with this configured wording. Constraints: N/A. |
image_match_type |
String |
Parameter description: Specifies the image recall mode in the image and text recall scenario. Options: Currently, only context_match and reference_match are supported. context_match: Only semantically related images are recalled. If the context of the image in the reference paragraph is semantically similar to the generated paragraph, the image is recalled. Otherwise, the image is not recalled. reference_match: All images in the reference paragraph are recalled. Constraints: N/A. Default value: context_match |
custom_types |
Map<String,String> |
Parameter description: Mapping type dictionary of a custom field, which applies to structured data scenarios and specifies queryable fields. Example: {"companyName": "keyword"} companyName: field to be retrieved; keyword: retrieval mode Constraints: The value in the mapping dictionary must be of a type supported by Elasticsearch queries, for example, keyword, integer, or text. |
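The numeric ranges listed in this table can be checked on the client side before a configuration is used. The sketch below collects the documented ranges into a lookup table; the names RANGES, out_of_range, and max_tokens_fits are hypothetical, and treating the range bounds as inclusive is an assumption.

```python
# Ranges as listed in the table above (bounds assumed inclusive).
RANGES = {
    "effective_input_length": (2, 128),
    "top_p": (0.1, 1),
    "max_tokens": (1, 131072),
    "chat_temperature": (0, 1),
    "search_temperature": (0, 1),
    "presence_penalty": (-2, 2),
    "keyword_top_k": (0, 100),
}


def out_of_range(extend_config):
    """Return the names of parameters whose values fall outside their range."""
    problems = []
    for key, (low, high) in RANGES.items():
        if key in extend_config and not low <= extend_config[key] <= high:
            problems.append(key)
    return problems


def max_tokens_fits(max_tokens, model_context_length, input_tokens):
    """max_tokens must be less than the context length minus the input length."""
    return max_tokens < model_context_length - input_tokens


config = {"top_p": 0.1, "max_tokens": 2048, "chat_temperature": 0.8,
          "search_temperature": 0.3, "presence_penalty": 0}
print(out_of_range(config))
print(out_of_range({"top_p": 1.5, "keyword_top_k": 10}))
```

The model context length passed to max_tokens_fits depends on the model in use and is not specified on this page.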
KnowledgeRepoPromptInfo
Parameter |
Type |
Description |
---|---|---|
prompt_id |
String |
Parameter description: ID of the NLP model prompt used by the current knowledge base. Constraints: The value of prompt_id must be an existing prompt_id in the prompt management. |
qa_question_prompt_id |
String |
Parameter description: ID of the QA question generation prompt used by the current knowledge base. This prompt is used to automatically generate QA pairs from documents. Constraints: The value of qa_question_prompt_id must be an existing prompt ID in prompt management. |
qa_answer_prompt_id |
String |
Parameter description: ID of the QA answer generation prompt used by the current knowledge base. This prompt is used to automatically generate QA pairs from documents. Constraints: The value of qa_answer_prompt_id must be an existing prompt ID in prompt management. |
Status code: 400
Parameter |
Type |
Description |
---|---|---|
error_code |
String |
Error code |
error_msg |
String |
Error description |
Status code: 500
Parameter |
Type |
Description |
---|---|---|
error_code |
String |
Error code |
error_msg |
String |
Error description |
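Both error responses share the error_code and error_msg fields, so a client can handle them uniformly. A minimal sketch follows; the exception class KooSearchError, the function raise_for_error, and the sample error code are all hypothetical placeholders rather than values defined by the service.

```python
class KooSearchError(Exception):
    """Hypothetical exception carrying the fields of a 400/500 response body."""

    def __init__(self, status_code, error_code, error_msg):
        super().__init__(f"HTTP {status_code}: {error_code}: {error_msg}")
        self.status_code = status_code
        self.error_code = error_code
        self.error_msg = error_msg


def raise_for_error(status_code, body):
    """Raise KooSearchError for the error statuses documented above."""
    if status_code in (400, 500):
        raise KooSearchError(status_code, body.get("error_code"),
                             body.get("error_msg"))
    return body


try:
    # The error code below is a made-up placeholder, not a real service code.
    raise_for_error(400, {"error_code": "EXAMPLE.0001",
                          "error_msg": "Incorrect request body parameter"})
except KooSearchError as exc:
    print(exc)
```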
Example Requests
GET https://{endpoint}/v1/koosearch/repos?page_num=1&page_size=10&name=knowledge&status=open
Example Responses
Status code: 200
Knowledge Base List Response Body
{
  "data_list" : [ {
    "id" : "acd90739-2e22-4870-b2db-35018699b623",
    "name" : "Knowledge base A",
    "detail" : "",
    "version" : "c0c1bcb0-aa9a-4b86-8436-a1ae71483496",
    "status" : "OPEN",
    "create_user" : "",
    "create_time" : "1731033664912",
    "update_time" : "1731034693520",
    "top_k" : 50,
    "prompt" : "You are a Q&A assistant. Please answer the question based on the documents below. Before answering the question, please carefully evaluate whether the documents provided can answer the question. If the given documents are irrelevant to the question, answer \"Sorry, I can't answer this question\"; if they are relevant, answer the question based on these documents.\nDocuments provided:\n{0}\nQuestion: {1}\nPlease provide your answer after carefully consideration based on the requirements given.}",
    "common_prompt" : "You are a Q&A assistant. Please answer the question based on the documents below. Before answering the question, please carefully evaluate whether the documents provided can answer the question. If the given documents are irrelevant to the question, answer \"Sorry, I can't answer this question\"; if they are relevant, answer the question based on these documents.\nDocuments provided:\n{0}\nQuestion: {1}\nPlease provide your answer after carefully consideration based on the requirements given.}",
    "rerank_enabled" : true,
    "moderate_enabled" : false,
    "query_rewrite_enabled" : true,
    "reference_count" : 3,
    "fields" : [ ],
    "search_threshold" : 0,
    "embedding_model" : "pangu_embedding",
    "rerank_model" : "pangu_rerank",
    "pangu_nlp_model" : "KooSearch-N1",
    "search_plan_model" : "search-plan",
    "file_extract" : {
      "id" : "35bcd5d3-68ca-41ef-a21e-b2b705b91552",
      "parse_conf" : {
        "ocr_enabled" : true,
        "image_enabled" : true,
        "image_conf" : "IMAGE",
        "header_footer_enabled" : true,
        "catalog_enabled" : false
      },
      "split_conf" : {
        "split_mode" : "AUTO"
      }
    },
    "search_plan_category_ids" : [ ],
    "language_id" : "zh",
    "cache_enabled" : false,
    "answer_reference_enabled" : false,
    "answer_image_reference_enabled" : false,
    "chat_ref_threshold" : 0,
    "faq_threshold" : 0.95,
    "extend_config" : {
      "extend_context" : false,
      "effective_input_length" : 5,
      "top_p" : 0.1,
      "max_tokens" : 2048,
      "chat_temperature" : 0.8,
      "search_temperature" : 0.3,
      "presence_penalty" : 0,
      "use_system_prompt" : false,
      "system_prompt" : "When replying to a user request based on the conversation history and a given document, comply with the following principles: 1. Strictly comply with the terms and description logic of the document. 2. If a document segment is used in the reply, use [No.] to add a reference in the corresponding position. 3. If you cannot reply to a user's request based on the dialog history and given document, or the user's question involves sensitive security information, reply [You cannot reply to your request based on the existing information].\nBasic document information:\n\n<#list docs as doc>\n[${doc?counter}] Document title: ${doc.title!}\nA piece of the document: ${doc.content!}\n</#list>",
      "refuse_enable" : false,
      "image_match_type" : "context_match"
    },
    "prompt_info" : {
      "prompt_id" : "default_chat_prompt",
      "qa_question_prompt_id" : "default_qa_question_prompt",
      "qa_answer_prompt_id" : "default_qa_answer_prompt"
    }
  } ],
  "total" : 11
}
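A client will often reduce a response like the one above to a short summary. The sketch below does so and converts create_time to a readable timestamp. It assumes that create_time is a Unix timestamp in milliseconds encoded as a string, which the 13-digit example value suggests but this page does not state; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_ms_timestamp(value):
    """Convert a millisecond Unix timestamp string to an aware UTC datetime.

    Assumption: the API's time fields are millisecond timestamps.
    """
    ms = int(value)
    # Split into whole seconds and milliseconds to avoid float rounding.
    return (datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
            + timedelta(milliseconds=ms % 1000))


def summarize(body):
    """Reduce a parsed 200 response body to id, name, status and creation time."""
    return [
        {
            "id": repo["id"],
            "name": repo["name"],
            "status": repo["status"],
            "created": parse_ms_timestamp(repo["create_time"]).isoformat(),
        }
        for repo in body.get("data_list", [])
    ]


sample = {"data_list": [{"id": "acd90739-2e22-4870-b2db-35018699b623",
                         "name": "Knowledge base A", "status": "OPEN",
                         "create_time": "1731033664912"}],
          "total": 11}
print(summarize(sample))
```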
Status Codes
Status Code |
Description |
---|---|
200 |
Knowledge Base List Response Body |
400 |
Incorrect request body parameter |
500 |
Internal error |
Error Codes
See Error Codes.