Modifying Knowledge Base Configuration
Function
This API is used to modify knowledge base settings, including:
- Parsing settings: whether to use OCR enhancement, and whether to parse images, headers and footers, and the contents page.
- Document splitting settings: automatic segmentation, segmentation by text length, or segmentation by subheading, with customizable subheading parsing rules.
- Search model settings: select a reranking model.
- NLP model settings: select a generative model.
- Other settings: recall quantity, reranking switch, reference document quantity, intent classification, and query rewriting switch.
URI
PUT /v1/{project_id}/applications/{application_id}/uni-search/knowledge-repo/{repo_id}
Path parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
project_id |
Yes |
String |
Definition: Project ID. For details about how to obtain the project ID, see Obtaining a Project ID. Constraints: N/A Value range: The value can contain 1 to 64 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. The value must start with a letter. Default value: N/A |
application_id |
Yes |
String |
Definition: Application ID. For details about how to obtain the application ID, see Obtaining an Application ID. Constraints: Character string Value range: The value can contain 1 to 64 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. The value must start with a letter. Default value: N/A |
repo_id |
Yes |
String |
Definition: Knowledge base ID. How to obtain: Log in to the KooSearch experience platform. In the navigation tree on the left, choose Knowledge Bases to view knowledge base IDs. Each knowledge base has a unique ID stored in the vector database. Constraints: N/A Value range: Length: 1 to 64 characters. The value can contain only digits, letters, hyphens (-), and underscores (_). Default value: N/A |
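As a rough illustration of how the three path parameters fit into the URI, the sketch below assembles the request path and checks each ID against the length and character set listed in the table. The helper name `build_uri` is hypothetical. The "must start with a letter" rule is deliberately not enforced, because the project ID in this page's example request starts with a digit.

```python
import re

# Character set and length taken from the path-parameter table above.
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")

URI_TEMPLATE = (
    "/v1/{project_id}/applications/{application_id}"
    "/uni-search/knowledge-repo/{repo_id}"
)


def build_uri(project_id: str, application_id: str, repo_id: str) -> str:
    """Return the request path after checking each ID's length and characters."""
    ids = {
        "project_id": project_id,
        "application_id": application_id,
        "repo_id": repo_id,
    }
    for name, value in ids.items():
        if not _ID_PATTERN.fullmatch(value):
            raise ValueError(f"{name} must be 1-64 characters from [A-Za-z0-9_-]")
    return URI_TEMPLATE.format(**ids)


# IDs copied from the example request on this page.
uri = build_uri(
    "1ed40ceefc8d40f8b884edb6a84e7768",
    "fb9731ab-7085-474f-b6c7-64473586f0f3",
    "5bb86225-e2ea-4404-8125-aaa3b79419ad",
)
print(uri)
```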
Request Parameters
Request header parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
X-Auth-Token |
Yes |
String |
Definition: Token used for API authentication. For details about how to obtain the token, see Obtaining an IAM User Token. Constraints: N/A Value range: N/A Default value: N/A |
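A minimal sketch of how the token header and the PUT method might be combined using Python's standard library. The endpoint host, the token placeholder, the IDs in the path, and the `Content-Type` header are assumptions made for illustration; they are not stated on this page. The request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical values: replace with your real endpoint, token, and IDs.
ENDPOINT = "https://koosearch.example.com"
TOKEN = "<IAM user token>"
PATH = "/v1/my_project/applications/my_app/uni-search/knowledge-repo/my_repo"

# A small subset of the optional body parameters.
body = {"top_k": 50, "rerank_enabled": True, "reference_count": 3}

request = urllib.request.Request(
    url=ENDPOINT + PATH,
    data=json.dumps(body).encode("utf-8"),
    method="PUT",
    headers={
        "X-Auth-Token": TOKEN,
        # Assumption: the body is sent as JSON.
        "Content-Type": "application/json",
    },
)

# Sending is left commented out so the sketch runs without network access.
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
print(request.get_method(), request.full_url)
```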
Request body parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
id |
No |
String |
Definition: Knowledge base ID. Constraints: N/A Value range: 1 to 64 characters. Default value: N/A |
name |
No |
String |
Definition: Knowledge base name. Constraints: N/A Value range: The value can contain a maximum of 64 characters. It must start with a letter or digit and can contain letters, digits, and underscores (_). Default value: N/A |
top_k |
No |
Integer |
Definition: Number of chunks to recall. The k chunks most relevant to the query are recalled. Constraints: N/A Value range: 10-500 Default value: N/A |
reference_count |
No |
Integer |
Definition: Number of reference documents passed to the NLP model together with the query to generate the final answer. Constraints: N/A Value range: 1-50 Default value: N/A |
rerank_enabled |
No |
Boolean |
Definition: Whether to enable result reranking. When enabled, the recalled top_k results are reranked using the reranking model. When disabled, the recalled top_k results are not reranked. Constraints: N/A Value range: N/A Default value: N/A |
query_rewrite_enabled |
No |
Boolean |
Definition: Whether to use the rewritten query for search. Constraints: N/A Value range: N/A Default value: N/A |
search_plan_category_ids |
No |
Array of strings |
Definition: Search planning categories. The list can contain a maximum of 10 elements. Each element can contain a maximum of 64 characters. Constraints: N/A Value range: The list length cannot exceed 10. Values:
Default value: N/A |
file_extract |
No |
FileExtract object |
Definition: Overall configuration of document parsing, including the components used for document parsing and document splitting rules. Constraints: N/A Value range: N/A Default value: N/A |
rerank_model |
No |
String |
Definition: Reranking model name. Constraints: N/A Value range: The value can contain 1 to 32 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. The value must start with a letter or digit. Default value: N/A |
search_plan_model |
No |
String |
Definition: The name of the search planning model. Constraints: N/A Value range: The value can contain 1 to 32 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. The value must start with a letter or digit. Default value: N/A |
pangu_nlp_model |
No |
String |
Definition: NLP model name. Constraints: N/A Value range: The value can contain 1 to 32 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. The value must start with a letter or digit. Default value: N/A |
search_threshold |
No |
Float |
Definition: Threshold for filtering the results returned by the search API. Constraints: If reranking is disabled, the threshold ranges from 0 to 200. If reranking is enabled, the threshold ranges from 0 to 1. Value range: 0-200 Default value: N/A |
chat_ref_threshold |
No |
Float |
Definition: Reference document filtering threshold. Constraints: If reranking is disabled, the threshold ranges from 0 to 200. If reranking is enabled, the threshold ranges from 0 to 1. Value range: 0-200 Default value: N/A |
faq_threshold |
No |
Float |
Definition: FAQ threshold. If the relevance score of an FAQ exceeds this threshold, its answer is returned directly without being summarized by the large model. Constraints:
Value range: 0-200 Default value: N/A |
cache_enabled |
No |
Boolean |
Definition: Whether to enable cache. Constraints: N/A Value range: N/A Default value: N/A |
session_config |
No |
SessionConfig object |
Definition: Cache policy. Constraints: N/A Value range: N/A Default value: N/A |
answer_reference_enabled |
No |
Boolean |
Definition: Whether to enable the reference function. Constraints: N/A Value range: N/A Default value: N/A |
answer_image_reference_enabled |
No |
Boolean |
Definition: Whether to include both text and images. Constraints: N/A Value range: N/A Default value: N/A |
refs |
No |
String |
Definition: List of referenced knowledge base IDs, which are separated by commas (,). Constraints: N/A Value range: The value contains a maximum of 1024 characters. Default value: N/A |
tags |
No |
Array of TagInfo objects |
Definition: Tag list. Constraints: N/A Value range: N/A Default value: N/A |
extend_config |
No |
KnowledgeRepoExtendConfig object |
Definition: Knowledge base extension configuration. Constraints: N/A Value range: N/A Default value: N/A |
prompt_info |
No |
KnowledgeRepoPromptInfo object |
Definition: Prompts associated with the knowledge base. Constraints: N/A Value range: N/A Default value: N/A |
table_rag_enabled |
No |
Boolean |
Definition: Whether to enable tableRAG. Constraints: N/A Value range: N/A Default value: N/A |
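The threshold constraints above depend on whether reranking is enabled, which is easy to get wrong when assembling a request body. The following sketch shows one way to check a body on the client side before sending it. The function name `validate_body` is hypothetical, and treating a missing `rerank_enabled` as disabled is an assumption, since the table lists no default.

```python
def validate_body(body: dict) -> list[str]:
    """Return a list of problems found; an empty list means no violation was detected."""
    problems = []

    # Integer ranges from the table above.
    for name, low, high in (("top_k", 10, 500), ("reference_count", 1, 50)):
        value = body.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}-{high}")

    # The threshold ceiling depends on whether reranking is enabled.
    # Assumption: a missing rerank_enabled is treated as disabled here.
    ceiling = 1 if body.get("rerank_enabled", False) else 200
    for name in ("search_threshold", "chat_ref_threshold"):
        value = body.get(name)
        if value is not None and not 0 <= value <= ceiling:
            problems.append(f"{name}={value} is outside 0-{ceiling}")

    return problems


print(validate_body({"top_k": 50, "rerank_enabled": True, "search_threshold": 0}))
print(validate_body({"top_k": 5, "rerank_enabled": True, "chat_ref_threshold": 30}))
```

The service performs its own validation; a check like this only surfaces obvious range errors earlier.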
FileExtract parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
parse_conf |
No |
ParseConf object |
Definition: Document parsing configuration, including whether to use OCR enhancement, whether to parse images, whether to extract text during image parsing, whether to parse the header and footer, and whether to parse the contents page. Constraints: N/A Value range: N/A Default value: N/A |
split_conf |
No |
SplitConf object |
Definition: Split configuration, including the segmentation mode, level parsing mode, title level depth, title saving mode, segment length, and title matching pattern. Constraints: N/A Value range: N/A Default value: N/A |
ParseConf parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
ocr_enabled |
No |
Boolean |
Definition: OCR enhancement. Constraints: N/A Value range: N/A Default value: false |
mllm_enabled |
No |
Boolean |
Definition: Multimodal enhancement. Constraints: N/A Value range: N/A Default value: false |
mllm_model |
No |
String |
Definition: Multimodal model name. Constraints: The mllm_plan model must have already been configured on the platform. You can check the models configured on the platform using the ListModels API. Value range: The value can contain 1 to 32 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. The value must start with a letter or digit. Default value: N/A |
mllm_prompt |
No |
Map<String,String> |
Definition: Prompt of the multimodal model. Constraints: A default prompt is provided. You can also configure custom prompts. Value range: N/A Default value: N/A |
image_enabled |
No |
Boolean |
Definition: Image parsing. Constraints: N/A Value range: N/A Default value: false |
header_footer_enabled |
No |
Boolean |
Definition: Parse the header and footer. Constraints: N/A Value range: N/A Default value: false |
catalog_enabled |
No |
Boolean |
Definition: Parse contents page. Constraints: N/A Value range: N/A Default value: false |
image_conf |
No |
String |
Definition: Image parsing mode used when image_enabled is set to true. Constraints: To return answers with images, use the IMAGE mode so that the original images are retained. Value range:
Default value: TEXT |
footnote_enabled |
No |
Boolean |
Definition: Parse footnotes. Constraints: N/A Value range: N/A Default value: false |
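The relationship between `image_enabled` and `image_conf` can be sketched as a small builder. The helper `build_parse_conf` and its arguments are hypothetical; the sketch assumes that IMAGE mode only makes sense when image parsing is switched on, which follows from the description of `image_conf` but is not stated as a hard rule.

```python
def build_parse_conf(parse_images: bool, answers_with_images: bool = False) -> dict:
    """Build a parse_conf dict; other switches are left at their documented defaults."""
    if answers_with_images and not parse_images:
        raise ValueError("returning images requires image parsing to be enabled")
    conf = {"image_enabled": parse_images}
    if parse_images:
        # IMAGE keeps the original images; TEXT is the documented default.
        conf["image_conf"] = "IMAGE" if answers_with_images else "TEXT"
    return conf


print(build_parse_conf(parse_images=True, answers_with_images=True))
print(build_parse_conf(parse_images=False))
```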
SplitConf parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
split_mode |
No |
String |
Definition: Document segmentation mode. Value range: The value can be:
Constraints: N/A Default value: AUTO |
separator_ids |
No |
Array of strings |
Definition: List of separator IDs used in the automatic and length-based segmentation modes. Each separator ID identifies a character that ends a chunk. Constraints: N/A Value range: Value mapping:
Default value: ["period_zh", "period_en", "exclamation_mark_zh", "exclamation_mark_en", "question_mark_zh", "question_mark_en"] |
rule_regex_id |
No |
String |
Definition: ID of a user-defined parsing rule. Constraints: N/A Value range: N/A Default value: N/A |
chunk_size |
No |
Integer |
Definition: Maximum length of a document chunk. The document is segmented based on the maximum chunk length. Constraints: N/A Value range: 0-6000 Default value: 500 |
title_level |
No |
Integer |
Definition: Title hierarchy depth retained in a chunk. For example: If the depth is 3 and the current paragraph is 1.1.3, then the parent titles 1.1 and 1 are both retained. If the depth is 2 and the current paragraph is 1.1.3, then the parent title 1.1 is retained, and the parent title 1 is discarded. Constraints: N/A Value range: 1-10 Default value: 3 |
combine_title |
No |
Boolean |
Definition: Whether to retain the hierarchical title combination. Constraints: N/A Value range: N/A Default value: false |
merge_titles |
No |
Boolean |
Definition: Whether to merge text across titles. When paragraphs under different titles contain little text, they are automatically merged up to the specified segment length, which yields more complete chunks. Constraints: N/A Value range: N/A Default value: false |
rule_regexs |
No |
Array of strings |
Definition: User-defined parsing rules. Constraints: N/A Value range: The list length ranges from 1 to 100. Default value: N/A |
merge_last_chunk |
No |
Boolean |
Definition: Whether to merge the most recent modified segments. Constraints: N/A Value range: N/A Default value: N/A |
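The `title_level` examples in the table can be read as "keep the innermost N titles of the current paragraph's title path". The sketch below illustrates that reading only; it assumes the paragraph's own title counts as one level, which matches both documented examples but is not confirmed as the service's actual implementation. The function name is hypothetical.

```python
def retained_titles(title_path: list[str], title_level: int) -> list[str]:
    """Return the titles kept in a chunk, outermost first.

    title_path lists the titles from the document root down to the
    current paragraph, for example ["1", "1.1", "1.1.3"].
    """
    if not 1 <= title_level <= 10:
        raise ValueError("title_level must be between 1 and 10")
    # Keep the innermost title_level entries; outer ancestors are dropped.
    return title_path[-title_level:]


path = ["1", "1.1", "1.1.3"]
print(retained_titles(path, 3))  # ['1', '1.1', '1.1.3']
print(retained_titles(path, 2))  # ['1.1', '1.1.3']
```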
SessionConfig parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
similarity_threshold |
Yes |
Float |
Definition: Query-to-query similarity threshold for a cache hit. A higher threshold requires the new query to be more similar to the cached query. Constraints: N/A Value range: 0.1-1.0 Default value: 0.9 |
answer_select_policy |
Yes |
String |
Definition: Cache hit selection policy. Constraints: N/A Value range: The value can be:
Default value: N/A |
eviction |
Yes |
Eviction object |
Definition: Cache expiration policy. Constraints: N/A Value range: N/A Default value: N/A |
model_name |
Yes |
String |
Definition: Name of the query-to-query model used to calculate the similarity between the new query and cached queries when determining a cache hit. Constraints: N/A Value range: 1 to 64 characters. Default value: N/A |
Eviction parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
policy |
Yes |
String |
Definition: Cache expiration policy. Constraints: N/A Value range: The value can be:
Default value: N/A |
ttl |
No |
Long |
Definition: Cache expiration time, in milliseconds. Constraints: N/A Value range: 0-31536000000 Default value: N/A |
hit_count_threshold |
No |
Long |
Definition: Threshold of cache hits. Constraints: N/A Value range: 1-10000 Default value: N/A |
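Because `ttl` is expressed in milliseconds, it is easy to be off by a factor of 1000. The sketch below converts a `timedelta` to milliseconds and checks the documented range (the upper bound equals 365 days). The helper `ttl_ms` is hypothetical; the `session_config` values are copied from the example request on this page.

```python
from datetime import timedelta

MAX_TTL_MS = 31_536_000_000  # upper bound of ttl in the table above (365 days)


def ttl_ms(duration: timedelta) -> int:
    """Convert a duration to whole milliseconds and check the documented range."""
    value = duration // timedelta(milliseconds=1)
    if not 0 <= value <= MAX_TTL_MS:
        raise ValueError(f"ttl {value} ms is outside 0-{MAX_TTL_MS}")
    return value


session_config = {
    "model_name": "embedding-zh_faq",
    "similarity_threshold": 0.9,
    "answer_select_policy": "first",
    "eviction": {
        "policy": "lru",
        "hit_count_threshold": 1,
        "ttl": ttl_ms(timedelta(days=1)),
    },
}
print(session_config["eviction"]["ttl"])  # 86400000
```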
TagInfo parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
tag_key |
Yes |
String |
Definition: Knowledge base tag key. Constraints: N/A Value range: 1 to 128 characters. Default value: N/A |
tag_value |
Yes |
String |
Definition: Knowledge base tag value. Constraints: N/A Value range: 1 to 128 characters. Default value: N/A |
KnowledgeRepoExtendConfig parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
extend_context |
No |
Boolean |
Definition: Extend the context length to generate more comprehensive responses, for example:
Constraints: N/A Value range: N/A Default value: false |
effective_input_length |
No |
Integer |
Definition: Optimal context length, which varies by model. Set the effective length of input tokens to ensure optimal output. To allow for multi-turn dialogues, set this length to 60% (rounded up) of the maximum context length supported by the model. Constraints: N/A Value range: 2-256 Default value: 32 |
top_p |
No |
Float |
Definition: An alternative to sampling with temperature, called nucleus sampling, where the model only takes into account the tokens with the probability mass determined by the top_p parameter. Constraints: N/A Value range: 0.1-1 Default value: 0.1 |
max_tokens |
No |
Integer |
Definition: Maximum number of tokens in the generated text. The total length of the input text plus the generated text cannot exceed the maximum length that the model can process. Constraints: N/A Value range: 1-262144 Default value: 2048 |
chat_temperature |
No |
Float |
Definition: Diversity of the non-RAG model's output. Constraints: N/A Value range: 0-1 Default value: N/A |
search_temperature |
No |
Float |
Definition: Diversity of the RAG model's output. Constraints: N/A Value range: 0-1 Default value: 0.6 |
presence_penalty |
No |
Float |
Definition: Text repetition penalty. Constraints: N/A Value range: -2 - 2 Default value: 0 |
use_system_prompt |
No |
Boolean |
Definition: Whether to use system prompts. Keep consistent with the standard prompt assembly solution used by Pangu RAG. Constraints: N/A Value range: N/A Default value: false |
system_prompt |
No |
String |
Definition: System prompt. Note:
Constraints: N/A Value range: 0-8192 Default value: N/A |
qa_question_prompt |
No |
String |
Definition: Prompt for generating questions during QA generation. Constraints: N/A Value range: 0-8192 Default value: N/A |
qa_answer_prompt |
No |
String |
Definition: Prompt for generating answers during QA generation. Constraints: N/A Value range: 0-8192 Default value: N/A |
refuse_enable |
No |
Boolean |
Definition: Whether to reject certain questions. Constraints: N/A Value range: N/A Default value: false |
refuse_answer |
No |
String |
Definition: Rejection answer. Constraints: N/A Value range: 1-8192 Default value: N/A |
image_match_type |
No |
String |
Definition: Image description parameter. Constraints: N/A Value range: context_match, reference_match, or model_match. Default value: context_match |
custom_types |
No |
Map<String,Map<String,String>> |
Definition: Custom structure type. Constraints: N/A Value range: N/A Default value: N/A |
directory_enable |
No |
Boolean |
Definition: Whether to enable directory management. Constraints: N/A Value range: N/A Default value: false |
embedding_search_enable |
No |
Boolean |
Definition: Whether to enable vector search. Constraints: N/A Value range: N/A Default value: true |
keyword_search_enable |
No |
Boolean |
Definition: Whether to enable keyword-based search. Constraints: N/A Value range: N/A Default value: N/A |
keyword_top_k |
No |
Integer |
Definition: Top-k for keyword-based search. Constraints: N/A Value range: 0-100 Default value: 10 |
search_engine_type |
No |
String |
Definition: Search engine type. Constraints: N/A Value range: The value can be:
Default value: N/A |
search_engine_name |
No |
String |
Definition: Search engine name. Constraints: N/A Value range: 0 to 64 characters. Default value: N/A |
think_model_name |
No |
String |
Definition: Name of the deep thinking model. Constraints: N/A Value range: 0 to 64 characters. Default value: N/A |
faq_top_k |
No |
Integer |
Definition: Top-k for Q&A and hybrid search where results are not directly from preset FAQs. Constraints: N/A Value range: 0-50 Default value: 2 |
faq_similarity_threshold |
No |
Float |
Definition: Threshold for Q&A and hybrid search where results are not directly from preset FAQs. Constraints: N/A Value range: 0-1 Default value: 0.8 |
extract_model_name |
No |
String |
Definition: Graph extraction model name. Constraints: N/A Value range: 0 to 64 characters. Default value: N/A |
optimize_model_name |
No |
String |
Definition: Name of the graph optimization model. Constraints: N/A Value range: The value cannot exceed 64 characters. Default value: N/A |
graph_search_enable |
No |
Boolean |
Definition: Whether to enable graph search. Constraints: N/A Value range: N/A Default value: false |
graph_reference_count |
No |
Integer |
Definition: Number of graph search reference documents. This parameter takes effect when graph search is enabled. Constraints: N/A Value range: 1-50 Default value: 10 |
graph_top_k |
No |
Integer |
Definition: Top-k for graph vector recall. Constraints: N/A Value range: 1-500 Default value: 50 |
graph_keyword_top_k |
No |
Integer |
Definition: Top-k for keyword-based graph search. Constraints: N/A Value range: 1-100 Default value: 20 |
graph_threshold |
No |
Float |
Definition: Graph re-ranking threshold. Constraints: N/A Value range: 0-200 Default value: 0.3 |
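The recommendation for `effective_input_length` (60% of the model's maximum context length, rounded up) can be computed with integer arithmetic to avoid floating-point rounding surprises. This is a sketch only: the function name is hypothetical, and it assumes the maximum context length is given in the same unit as `effective_input_length`, which the table does not specify.

```python
def recommended_effective_input_length(max_context_length: int) -> int:
    """Return 60% of max_context_length, rounded up, using integer arithmetic."""
    if max_context_length <= 0:
        raise ValueError("max_context_length must be positive")
    # ceil(0.6 * n) == ceil(3n / 5) == (3n + 4) // 5 for positive integers.
    return (3 * max_context_length + 4) // 5


for context in (32, 128):
    print(context, recommended_effective_input_length(context))
```

The result should still be checked against the documented value range of 2-256 before it is used.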
KnowledgeRepoPromptInfo parameters
Parameter |
Mandatory |
Type |
Description |
---|---|---|---|
prompt_id |
No |
String |
Definition: Prompt ID. Constraints: N/A Value range: The value can contain 1 to 64 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. Default value: N/A |
qa_question_prompt_id |
No |
String |
Definition: QA question generation prompt ID. Constraints: N/A Value range: The value can contain 1 to 64 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. Default value: N/A |
qa_answer_prompt_id |
No |
String |
Definition: QA answer generation prompt ID. Constraints: N/A Value range: The value can contain 1 to 64 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. Default value: N/A |
mllm_prompt_id |
No |
String |
Definition: Multimodal model (MLLM) prompt ID. Constraints: N/A Value range: The value can contain 1 to 64 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. Default value: N/A |
table_rag_config |
No |
String |
Definition: Prompts related to tabular enhancement, including chat_prompt_with_sqlresults_id (Q&A prompt for tabular enhancement), nl2sql_prompt_id (prompt for generating SQL statements), and table_rag_prompt_id (tabular Q&A prompt). Constraints: N/A Value range: 1 to 512 characters. Default value: N/A |
Response Parameters
Status code: 200
Parameter |
Type |
Description |
---|---|---|
repo_id |
String |
Definition: Knowledge base ID. Value range: N/A |
Status code: 400
Parameter |
Type |
Description |
---|---|---|
error_code |
String |
Definition: Error code. Value range: N/A |
error_msg |
String |
Definition: Error message. Value range: N/A |
Status code: 500
Parameter |
Type |
Description |
---|---|---|
error_code |
String |
Definition: Error code. Value range: N/A |
error_msg |
String |
Definition: Error message. Value range: N/A |
Example Requests
This API is used to modify knowledge base settings.
/v1/1ed40ceefc8d40f8b884edb6a84e7768/applications/fb9731ab-7085-474f-b6c7-64473586f0f3/uni-search/knowledge-repo/5bb86225-e2ea-4404-8125-aaa3b79419ad

{
  "id" : "5bb86225-e2ea-4404-8125-aaa3b79419ad",
  "name" : "test_20250425",
  "tags" : [ ],
  "top_k" : 50,
  "rerank_enabled" : true,
  "query_rewrite_enabled" : true,
  "reference_count" : 3,
  "search_threshold" : 0,
  "rerank_model" : "rerank-zh",
  "pangu_nlp_model" : "dp-r1",
  "search_plan_model" : "search_ai_plan",
  "file_extract" : {
    "parse_conf" : {
      "ocr_enabled" : true,
      "image_enabled" : true,
      "header_footer_enabled" : false,
      "catalog_enabled" : false,
      "image_conf" : "TEXT"
    },
    "split_conf" : {
      "split_mode" : "LENGTH",
      "separator_ids" : [
        "period_zh",
        "period_en",
        "exclamation_mark_zh",
        "exclamation_mark_en",
        "question_mark_zh",
        "question_mark_en"
      ],
      "chunk_size" : 500
    }
  },
  "search_plan_category_ids" : [ ],
  "cache_enabled" : true,
  "session_config" : {
    "model_name" : "embedding-zh_faq",
    "similarity_threshold" : 0.9,
    "answer_select_policy" : "first",
    "eviction" : {
      "policy" : "lru",
      "hit_count_threshold" : 1,
      "ttl" : 86400000
    }
  },
  "answer_reference_enabled" : false,
  "answer_image_reference_enabled" : false,
  "chat_ref_threshold" : 0,
  "faq_threshold" : 0.95,
  "extend_config" : {
    "extend_context" : false,
    "effective_input_length" : 3,
    "top_p" : 0.1,
    "max_tokens" : 2048,
    "chat_temperature" : 0.6,
    "search_temperature" : 0.6,
    "presence_penalty" : 0,
    "search_engine_name" : "bocha",
    "think_model_name" : "dp-r1",
    "refuse_enable" : false,
    "image_match_type" : "context_match",
    "directory_enable" : false,
    "embedding_search_enable" : true,
    "keyword_search_enable" : false,
    "keyword_top_k" : 10,
    "faq_top_k" : 2,
    "faq_similarity_threshold" : 0.8,
    "refuse_answer" : ""
  },
  "prompt_info" : {
    "prompt_id" : "default_chat_prompt",
    "qa_answer_prompt_id" : "default_qa_answer_prompt",
    "qa_question_prompt_id" : "default_qa_question_prompt"
  }
}
Example Responses
Status code: 200
Response to a knowledge base configuration modification request.
{ "repo_id" : "1235abc" }
Status Codes
Status Code |
Description |
---|---|
200 |
Response to a knowledge base configuration modification request. |
400 |
Incorrect request body parameter. |
500 |
Internal error. |
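A caller typically branches on the status code and reads either `repo_id` or the `error_code` and `error_msg` fields. The sketch below shows one way to do that for the three documented status codes. The function name `interpret_response` and the choice of `RuntimeError` are illustrative assumptions.

```python
import json


def interpret_response(status_code: int, body_text: str) -> str:
    """Return the repo_id on success; raise RuntimeError with the error details otherwise."""
    body = json.loads(body_text) if body_text else {}
    if status_code == 200:
        return body["repo_id"]
    if status_code in (400, 500):
        code = body.get("error_code", "unknown")
        message = body.get("error_msg", "no message")
        raise RuntimeError(f"HTTP {status_code}: {code}: {message}")
    raise RuntimeError(f"unexpected HTTP status {status_code}")


# Body copied from the example response on this page.
print(interpret_response(200, '{ "repo_id" : "1235abc" }'))  # 1235abc
```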
Error Codes
See Error Codes.