Uploading a document from the local host
Function
This API uploads a document from the local host for parsing.
URI
POST /v1/koosearch/doc-search/files
Request Parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
X-Auth-Token | Yes | String | Parameter description: Token used for API authentication. For how to obtain the token, see section 3.2 "Authentication." Constraints: N/A. |
Parameter | Mandatory | Type | Description |
---|---|---|---|
file | Yes | File | Parameter description: Document to be uploaded for parsing. Constraints: N/A. Values: File. Default value: N/A. |
language | No | String | Parameter description: Language of the document. This parameter can be left empty for Chinese and English documents. Constraints: N/A. Values: CHINESE, ENGLISH, ARABIC, THAI. Default value: N/A. |
mode | No | Integer | Parameter description: Document parsing and splitting mode. Constraints: If split_mode is also specified, split_mode takes priority. Values: 1 (hierarchical parsing), 2 (rule-based parsing), 3 (length-based parsing), 4 (automatic parsing). Default value: N/A. |
ocr | No | Boolean | Parameter description: Whether to use OCR for parsing. Constraints: If ocr_enabled is also specified, ocr_enabled takes priority. Values: true (OCR is used for parsing), false (OCR is not used for parsing). Default value: false. |
ocr_enabled | No | Boolean | Parameter description: Whether to use OCR for parsing. Constraints: N/A. Values: true (OCR is used for parsing), false (OCR is not used for parsing). Default value: false. |
image_enabled | No | Boolean | Parameter description: Whether to parse images. Constraints: N/A. Values: true (parse images), false (do not parse images). Default value: false. |
image_conf | No | String | Parameter description: Image parsing mode. Constraints: This parameter does not take effect when image_enabled is set to false. Values: TEXT (extracts the text in the image), IMAGE (retains the original image), BASE64 (returns the image data encoded in Base64). Default value: IMAGE. |
header_footer_enabled | No | Boolean | Parameter description: Whether to parse the header and footer. Constraints: N/A. Values: true (parse the header and footer), false (do not parse the header and footer). Default value: false. |
catalog_enabled | No | Boolean | Parameter description: Whether to parse the table of contents. Constraints: N/A. Values: true (parse the table of contents), false (do not parse the table of contents). Default value: false. |
separators | No | Array of strings | Parameter description: List of paragraph separators. The value is an array of strings, each of which is one separator. Constraints: 50 character limit. Values: N/A. Default value: [". ", ".", "? ", "! ", "!", "?", "\n"] |
rule_regexs | No | Array of strings | Parameter description: Expressions used to match titles in rule-based splitting. The value is an array of strings, each of which is one expression. Constraints: The length cannot exceed 10 characters. Values: N/A. Default value: N/A. |
split_mode | No | String | Parameter description: Text splitting mode. Constraints: N/A. Values: LENGTH (split by text length), CATALOG (split by table of contents), RULE (split by defined rules), AUTO (automatic splitting). Default value: N/A. |
chunk_size | No | Integer | Parameter description: Maximum length of a chunk. Constraints: N/A. Values: 1-. Default value: N/A. |
title_level | No | Integer | Parameter description: Maximum depth of the title. Constraints: N/A. Values: 1-. Default value: N/A. |
combine_title | No | Boolean | Parameter description: Whether to merge titles. When titles are merged, the result takes the form Title 1 Title 2 Title 3. When they are not merged, it takes the form Title 3. Constraints: N/A. Values: true (merge titles), false (do not merge titles). Default value: true. |
merge_titles | No | Boolean | Parameter description: Whether to merge different titles. Constraints: N/A. Values: true (merge different titles), false (do not merge different titles). Default value: true. |
reference_enabled | No | Boolean | Parameter description: Whether to parse references. Constraints: N/A. Values: true (parse references), false (do not parse references). Default value: false. |
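The documented values and precedence rules above lend themselves to a client-side check before uploading. The following Python sketch is one possible way to do that. The helper name check_options and the wording of its messages are hypothetical and not part of the API. Upper bounds for chunk_size and title_level are not checked because the table does not state them, and the length constraints on separators and rule_regexs are left out because their exact scope is not specified.

```python
from typing import Any

# Allowed values copied from the request parameter table.
ALLOWED_VALUES = {
    "language": ("CHINESE", "ENGLISH", "ARABIC", "THAI"),
    "mode": (1, 2, 3, 4),
    "image_conf": ("TEXT", "IMAGE", "BASE64"),
    "split_mode": ("LENGTH", "CATALOG", "RULE", "AUTO"),
}
# Integer parameters whose documented values start at 1.
MIN_ONE = ("chunk_size", "title_level")


def check_options(options: dict[str, Any]) -> list[str]:
    """Return readable notes about the optional form fields (hypothetical helper)."""
    problems = []
    for name, allowed in ALLOWED_VALUES.items():
        if name in options and options[name] not in allowed:
            problems.append(f"{name}: {options[name]!r} is not one of {list(allowed)}")
    for name in MIN_ONE:
        if name in options:
            value = options[name]
            if not isinstance(value, int) or value < 1:
                problems.append(f"{name}: must be an integer of at least 1")
    # Documented precedence: split_mode over mode, ocr_enabled over ocr.
    for winner, loser in (("split_mode", "mode"), ("ocr_enabled", "ocr")):
        if winner in options and loser in options:
            problems.append(f"{loser}: {winner} is also set and takes priority")
    # Documented constraint: image_conf has no effect when image_enabled is false
    # (false is also the documented default).
    if "image_conf" in options and not options.get("image_enabled", False):
        problems.append("image_conf: no effect unless image_enabled is true")
    return problems


# Example: mode and split_mode are both set, and chunk_size is below 1.
for problem in check_options({"mode": 3, "split_mode": "RULE", "chunk_size": 0}):
    print(problem)
```

An empty list means none of the checks fired. It does not guarantee that the service will accept the request.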
Response Parameters
Status code: 200
Parameter | Type | Description |
---|---|---|
task_id | String | ID of a document parsing task. You can use this ID to query the document parsing status and result. |
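As a minimal sketch, the task ID can be read from a 200 response body with the Python standard library. The function name extract_task_id is hypothetical, and the sample value is the one shown in this page's example response.

```python
import json


def extract_task_id(body: bytes) -> str:
    """Return the task_id field from a 200 response body (hypothetical helper)."""
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("response body is not a JSON object")
    task_id = payload.get("task_id")
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response body has no task_id string")
    return task_id


# Sample body taken from the example response on this page.
print(extract_task_id(b'{ "task_id" : "00c7591f88af4f3fb2f3d7c7191865e6" }'))
```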
Status code: 400
Parameter | Type | Description |
---|---|---|
error_code | String | Error code |
error_msg | String | Error description |
Status code: 401
Parameter | Type | Description |
---|---|---|
error_code | String | Error code |
error_msg | String | Error description |
Status code: 500
Parameter | Type | Description |
---|---|---|
error_code | String | Error code |
error_msg | String | Error description |
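The 400, 401, and 500 responses share the same two fields, so a client can treat them uniformly. The sketch below is one possible approach, not the service's own SDK: the ApiError class and parse_error function are hypothetical names, and the error code in the usage line is a made-up placeholder rather than a real code from the Error Codes list.

```python
import json


class ApiError(Exception):
    """Hypothetical exception carrying the documented error fields."""

    def __init__(self, status: int, error_code: str, error_msg: str) -> None:
        super().__init__(f"HTTP {status} [{error_code}]: {error_msg}")
        self.status = status
        self.error_code = error_code
        self.error_msg = error_msg


def parse_error(status: int, body: bytes) -> ApiError:
    """Build an ApiError from a 400, 401, or 500 response body."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:  # covers both invalid UTF-8 and invalid JSON
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    # Missing fields fall back to empty strings instead of failing.
    return ApiError(
        status,
        str(payload.get("error_code", "")),
        str(payload.get("error_msg", "")),
    )


# "EXAMPLE.0001" is a placeholder, not a documented error code.
err = parse_error(400, b'{"error_code": "EXAMPLE.0001", "error_msg": "Invalid request parameters"}')
print(err)
```

Returning the exception instead of raising it lets the caller decide whether to raise, log, or retry.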
Example Requests
None
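Although no official example is provided, a request could plausibly be assembled as in the following Python sketch, which uses only the standard library. It rests on several assumptions that this page does not confirm: that the body is sent as multipart/form-data, that Boolean fields are encoded as the strings true and false, and that the file part can use the generic application/octet-stream content type. The function name build_upload_request and the endpoint https://koosearch.example.com are hypothetical placeholders. Array parameters (separators, rule_regexs) are omitted because their form encoding is not specified.

```python
import urllib.request
import uuid


def build_upload_request(endpoint, token, file_name, file_bytes, options=None):
    """Build (but do not send) a multipart/form-data POST for the upload API."""
    boundary = uuid.uuid4().hex
    parts = []
    # Scalar options become plain text form fields.
    for name, value in (options or {}).items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # assumed Boolean encoding
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # The document itself goes in the mandatory "file" field.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/v1/koosearch/doc-search/files",
        data=b"".join(parts),
        method="POST",
        headers={
            "X-Auth-Token": token,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder endpoint, token, and file content; nothing is sent here.
req = build_upload_request(
    "https://koosearch.example.com",
    "<your-token>",
    "guide.txt",
    b"Sample document text.",
    {"split_mode": "LENGTH", "chunk_size": 500, "ocr_enabled": False},
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req) returns the HTTP response.
```

Building the request separately from sending it keeps the sketch testable without network access.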
Example Responses
Status code: 200
File content parsing task creation result
{ "task_id" : "00c7591f88af4f3fb2f3d7c7191865e6" }
Status Codes
Status Code | Description |
---|---|
200 | File content parsing task creation result |
400 | Invalid request parameters |
401 | Authentication exception |
500 | Internal error |
Error Codes
See Error Codes.