Updating Document Parsing
Function
This API uploads a local document and creates a document parsing task for it.
URI
POST /v1/{project_id}/applications/{app_id}/doc-search/files
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Definition: Specifies the project ID. For details about how to obtain the project ID, see Obtaining a Project ID. Constraints: N/A Value range: The value can contain 1 to 64 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. The value must start with a letter. Default value: N/A |
| app_id | Yes | String | Definition: Specifies the application ID. For details about how to obtain the application ID, see Obtaining an Application ID. Constraints: String Value range: The value can contain 1 to 64 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. The value must start with a letter. Default value: N/A |
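As a rough illustration of how the path parameters fit into the URI, the sketch below builds the request URL and checks an ID against the value range stated in the table. The function names and the `endpoint` argument are hypothetical and not part of this API. The check is advisory only: the `app_id` in the Example Requests section starts with a digit, so the service may not enforce the rule strictly.

```python
import re
from urllib.parse import quote

# Value range as documented in the table: 1 to 64 characters; digits,
# letters, hyphens, and underscores only; must start with a letter.
_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]{0,63}")


def matches_documented_id_rule(value: str) -> bool:
    """Return True if value satisfies the documented ID value range."""
    return _ID_PATTERN.fullmatch(value) is not None


def build_upload_url(endpoint: str, project_id: str, app_id: str) -> str:
    """Fill project_id and app_id into the documented URI template.

    `endpoint` (scheme, host, and port) is an assumption; this page does
    not say how to obtain it.
    """
    base = endpoint.rstrip("/")
    return (
        f"{base}/v1/{quote(project_id, safe='')}"
        f"/applications/{quote(app_id, safe='')}/doc-search/files"
    )
```

For example, `build_upload_url("https://example.com", "proj-1", "app_1")` yields `https://example.com/v1/proj-1/applications/app_1/doc-search/files`.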
Request Parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| X-Auth-Token | Yes | String | Definition: Token used for API authentication. For details about how to obtain the token, see Obtaining an IAM User Token. Constraints: N/A Value range: N/A Default value: N/A |

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| file | Yes | File | Definition: Document to be uploaded and parsed. Constraints: N/A Value range: N/A Default value: N/A |
| language | No | String | Definition: Document language. The options are zh (Chinese), en (English), ar (Arabic), th (Thai), pt (Portuguese), and es (Spanish). This parameter is optional for Chinese and English documents. Constraints: N/A Value range: ar: Arabic; pt: Portuguese Default value: N/A |
| mode | No | Integer | Definition: Splitting mode. Constraints: N/A Value range: Default value: N/A |
| ocr | No | Boolean | Definition: Whether to use OCR for document parsing. Constraints: N/A Value range: Default value: N/A |
| priority | No | Integer | Definition: Job priority. A larger value indicates a higher priority. Constraints: N/A Value range: N/A Default value: 0 |
| ocr_enabled | No | Boolean | Definition: Whether to use OCR for document parsing. Constraints: N/A Value range: Default value: N/A |
| mllm_enabled | No | Boolean | Definition: Whether to use multi-modal parsing. Constraints: N/A Value range: Default value: N/A |
| image_enabled | No | Boolean | Definition: Whether to parse images. Constraints: N/A Value range: Default value: N/A |
| image_conf | No | String | Definition: Image parsing method. Constraints: N/A Value range: Enumerated value Default value: N/A |
| header_footer_enabled | No | Boolean | Definition: Whether to parse headers and footers. Constraints: N/A Value range: Default value: N/A |
| catalog_enabled | No | Boolean | Definition: Whether to parse the table of contents. Constraints: N/A Value range: Default value: N/A |
| separators | No | Array of strings | Definition: Paragraph ID, which is used to split sentences. Constraints: N/A Value range: N/A Default value: N/A |
| rule_regexs | No | Array of strings | Definition: Title matching expressions used in the rule-based splitting scenario. Constraints: N/A Value range: N/A Default value: N/A |
| split_mode | No | String | Definition: Document splitting mode. Constraints: N/A Value range: Enumerated value Default value: N/A |
| chunk_size | No | Integer | Definition: Maximum chunk length. Constraints: N/A Value range: N/A Default value: N/A |
| title_level | No | Integer | Definition: Maximum title depth. Constraints: N/A Value range: N/A Default value: N/A |
| combine_title | No | Boolean | Definition: Whether to merge titles. Merged format: title 1 title 2 title 3. Non-merged format: title 3. Constraints: N/A Value range: Default value: N/A |
| merge_titles | No | Boolean | Definition: Whether to merge across titles. Constraints: N/A Value range: Default value: N/A |
| overlap | No | Float | Definition: Chunk overlap ratio. Constraints: N/A Value range: N/A Default value: N/A |
| reference_enabled | No | Boolean | Definition: Whether to parse reference documents. Constraints: N/A Value range: Default value: N/A |
| footnote_enabled | No | Boolean | Definition: Whether to parse footnotes. Constraints: N/A Value range: Default value: N/A |
| mllm_model | No | String | Definition: Multi-modal model to use for parsing. Multiple multi-modal models can be configured and matched by name. Constraints: N/A Value range: N/A Default value: N/A |
| mllm_prompt | No | String | Definition: Multi-modal prompt, which is of the map type. Example: {"en":"Please parse this image"}. Constraints: N/A Value range: N/A Default value: N/A |
Response Parameters
Status code: 200
| Parameter | Type | Description |
|---|---|---|
| task_id | String | Definition: ID of a document parsing task. You can use this ID to query the document parsing progress and result. Value range: N/A |
Status code: 400
| Parameter | Type | Description |
|---|---|---|
| error_code | String | Definition: Error code. Value range: N/A |
| error_msg | String | Definition: Error description. Value range: N/A |
Status code: 401
| Parameter | Type | Description |
|---|---|---|
| error_code | String | Definition: Error code. Value range: N/A |
| error_msg | String | Definition: Error description. Value range: N/A |
Status code: 500
| Parameter | Type | Description |
|---|---|---|
| error_code | String | Definition: Error code. Value range: N/A |
| error_msg | String | Definition: Error description. Value range: N/A |
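A client might interpret these responses along the following lines. The sketch assumes the response body is JSON with the fields listed above. The `DocParseError` class and the `parse_upload_response` function are hypothetical names introduced for illustration.

```python
import json
from typing import Optional


class DocParseError(Exception):
    """Raised for any non-200 response from the upload call."""

    def __init__(self, status: int, error_code: Optional[str],
                 error_msg: Optional[str]) -> None:
        super().__init__(f"HTTP {status}: {error_code}: {error_msg}")
        self.status = status
        self.error_code = error_code
        self.error_msg = error_msg


def parse_upload_response(status: int, body: str) -> str:
    """Return task_id for a 200 response; raise DocParseError otherwise."""
    try:
        payload = json.loads(body) if body else {}
    except ValueError:
        # Body was not valid JSON; keep going with no fields.
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    if status == 200 and "task_id" in payload:
        return payload["task_id"]
    raise DocParseError(status, payload.get("error_code"),
                        payload.get("error_msg"))
```

The returned `task_id` is the value to keep for querying parsing progress and results later.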
Example Requests
POST http://100.85.216.4:31628/v1/ee51ecd9-bc3c-4e98-b7df-ba6647350af2/applications/01d3c218-4d37-489a-98ff-69d69ea44bb1/doc-search/files
{
"file" : "/D:/Documents/Identifying existing management practices in the control of Striga asiatica within rice–maize systems in mid-west Madagascar.pdf",
"mode" : "4"
}
Example Responses
Status code: 200
File content parsing task creation result
{
"task_id" : "00c7591f88af4f3fb2f3d7c7191865e6"
}
Status Codes
| Status Code | Description |
|---|---|
| 200 | File content parsing task creation result |
| 400 | Request parameter error |
| 401 | Authentication exception |
| 500 | Internal error |
Error Codes
See Error Codes.