Querying the Result of an Asynchronous Document Parsing Task
Function
This API queries the progress and result of an asynchronous document parsing task.
URI
GET /v1/{project_id}/applications/{app_id}/doc-search/tasks/{task_id}
Path parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Definition: Project ID. For details about how to obtain the project ID, see Obtaining a Project ID. Constraints: N/A. Value range: The value can contain 1 to 64 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. The value must start with a letter. Default value: N/A |
| app_id | Yes | String | Definition: Application ID. For details about how to obtain the application ID, see Obtaining an Application ID. Constraints: N/A. Value range: The value can contain 1 to 64 characters. Only digits, letters, hyphens (-), and underscores (_) are allowed. The value must start with a letter. Default value: N/A |
| task_id | Yes | String | Definition: ID of the document parsing task. Use this ID to query the parsing progress and result. Constraints: The value contains 32 characters. Only digits and letters are allowed. Value range: N/A. Default value: N/A |
Query parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| is_other_original_table | No | Boolean | Definition: Whether to return the big content field for XLSX, XLS, and ET files. Constraints: N/A. Value range: true or false. Default value: N/A |
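The URI template and the parameters above can be combined into a full request URL. The following is a minimal sketch, not an official client: the function name `build_task_url` and the `endpoint` argument are assumptions made for illustration, and the IDs in the usage note are the ones from the example request on this page.

```python
from urllib.parse import quote, urlencode


def build_task_url(endpoint, project_id, app_id, task_id,
                   is_other_original_table=None):
    """Assemble the task query URL from the path and query parameters.

    `endpoint` (scheme, host, and port) is an assumption; use the
    endpoint of your own deployment.
    """
    path = "/v1/{}/applications/{}/doc-search/tasks/{}".format(
        quote(project_id, safe=""),
        quote(app_id, safe=""),
        quote(task_id, safe=""),
    )
    url = endpoint.rstrip("/") + path
    if is_other_original_table is not None:
        # Serialize the Boolean in lowercase, as in the example request.
        flag = str(bool(is_other_original_table)).lower()
        url += "?" + urlencode({"is_other_original_table": flag})
    return url
```

Calling `build_task_url("https://127.0.0.1:8081", "729cbd739854470da5426ed26bd900ca", "01d3c218-4d37-489a-98ff-69d69ea44bb1", "37d4e2c03b76455a9e2b9e3fad2e5967", False)` reproduces the example request URL. Omitting the last argument leaves the optional query parameter out entirely.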
Request Parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| X-Auth-Token | Yes | String | Definition: Token used for API authentication. For details about how to obtain the token, see Obtaining an IAM User Token. Constraints: N/A. Value range: N/A. Default value: N/A |
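As an illustration of how the token is passed, the sketch below sends the GET request with the `X-Auth-Token` header using only the Python standard library. The helper names `make_task_request` and `fetch_task` and the 30-second timeout are assumptions, not part of this API.

```python
import json
import urllib.request


def make_task_request(url, token):
    """Build a GET request that carries the IAM token in X-Auth-Token."""
    return urllib.request.Request(
        url, headers={"X-Auth-Token": token}, method="GET"
    )


def fetch_task(url, token, timeout=30):
    """Send the request and decode the JSON response body.

    urllib raises urllib.error.HTTPError for non-2xx status codes
    such as 400, 401, and 500.
    """
    request = make_task_request(url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the header logic easy to check without a network connection.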
Response Parameters
Status code: 200
| Parameter | Type | Description |
|---|---|---|
| task_desc | String | Definition: Task description. It mainly carries the failure information when the task fails. Constraints: N/A. Value range: N/A. Default value: N/A |
| result | ParsedDocument object | Definition: Document parsing result. This field is returned only when parsing succeeds. |
ParsedDocument

| Parameter | Type | Description |
|---|---|---|
| doc_id | String | Definition: Document ID, generated from a UUID. Value range: N/A |
| doc_name | String | Definition: Document name. Value range: N/A |
| doc_type | String | Definition: Document type, for example, PDF or DOCX. Value range: Enumerated value |
| preview_file_url | String | Definition: Address of the preview file. Constraints: N/A. Value range: N/A. Default value: N/A |
| original_file | String | Definition: Path of the original document. Value range: N/A |
| html_path | String | Definition: Path of the generated HTML file. Value range: N/A |
| file_size | Integer | Definition: Size of the original document, in bytes. Value range: N/A |
| pages | Array of ParsedDocumentPage objects | Definition: Document page information. Value range: N/A |
| images | Array of ParsedDocumentImage objects | Definition: Document image information. Value range: N/A |
| original_tables | Array of OriginalTable objects | Definition: Original table information. Value range: N/A |
ParsedDocumentPage

| Parameter | Type | Description |
|---|---|---|
| page_num | Integer | Definition: Page number, which indicates the sequence number of a page in the document. Value range: N/A |
| preview_image_url | String | Definition: Address of the document page preview image. Value range: N/A |
| components | Array of ParsedDocumentComponent objects | Definition: Paragraph information on the page. Value range: N/A |
ParsedDocumentComponent

| Parameter | Type | Description |
|---|---|---|
| id | String | Definition: Paragraph ID, generated from a UUID. Value range: N/A |
| text | String | Definition: Paragraph content. Value range: N/A |
| component_num | Integer | Definition: Sequence number of the paragraph in the document, starting from 1. Value range: N/A |
| pdf_coordinate | Array<Array<Integer>> | Definition: Coordinates of the paragraph on the page, given as four points in the order upper left, upper right, lower right, and lower left. Used for highlighting. Value range: N/A |
| original_table_id | String | Definition: ID of the original table. This parameter has a value only when a table has been split. It references the saved original long table to support the small2big feature. Value range: N/A |
| type | String | Definition: Chunk type. Value range: N/A |
| title | String | Definition: Paragraph title. Value range: N/A |
| original_title | String | Definition: Original title. Value range: N/A |
| element_id | String | Definition: HTML element ID, used for locating text. Value range: N/A |
| elements | Array of strings | Definition: Set of HTML element IDs, used for text highlighting. Value range: N/A |
| original_page_nums | Array of integers | Definition: Original page numbers of the chunk. Value range: N/A |
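Because `pdf_coordinate` lists four corner points, a highlight rectangle can be derived by taking the minimum and maximum of each axis. The sketch below assumes each point is an `[x, y]` pair, as in the example response on this page; the function name `bounding_box` is hypothetical.

```python
def bounding_box(pdf_coordinate):
    """Return (left, top, width, height) for a list of [x, y] corner points.

    Using min/max rather than fixed indices keeps the result stable even
    if the corner order differs from the documented one.
    """
    xs = [point[0] for point in pdf_coordinate]
    ys = [point[1] for point in pdf_coordinate]
    left, top = min(xs), min(ys)
    return left, top, max(xs) - left, max(ys) - top
```

For the first component in the example response, `bounding_box([[72, 107], [521, 107], [521, 444], [72, 444]])` returns `(72, 107, 449, 337)`.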
ParsedDocumentImage

| Parameter | Type | Description |
|---|---|---|
| image_id | String | Definition: Image ID, consisting of the prefix img- followed by a UUID. Value range: N/A |
| url | String | Definition: OBS path to which the image is uploaded. Value range: N/A |
| data | String | Definition: Base64-encoded image data. Value range: N/A |
| title | String | Definition: Image title. Value range: N/A |
| desc | String | Definition: Image description. Value range: N/A |
| width | Integer | Definition: Image width. Value range: N/A |
| height | Integer | Definition: Image height. Value range: N/A |
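In the example response on this page, a component's `text` refers to an image with a placeholder of the form `{img-...}` that matches an `image_id`. That convention is inferred from the example only and is not stated in the parameter tables, so treat the sketch below as an assumption. It looks up the referenced image entries and decodes the Base64 `data` field when present; the names `referenced_images` and `image_bytes` are hypothetical.

```python
import base64
import re

# Assumed placeholder form, taken from the example response: {img-<id>}
IMAGE_REF = re.compile(r"\{(img-[0-9A-Za-z-]+)\}")


def referenced_images(text, images):
    """Return the image entries whose image_id appears as a placeholder in text."""
    by_id = {image["image_id"]: image for image in images}
    return [by_id[ref] for ref in IMAGE_REF.findall(text) if ref in by_id]


def image_bytes(image):
    """Decode the Base64 `data` field, or return None when it is absent."""
    data = image.get("data")
    return base64.b64decode(data) if data else None
```

Placeholders with no matching entry in `images` are skipped rather than raising an error, since the response may omit image data.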
OriginalTable

| Parameter | Type | Description |
|---|---|---|
| id | String | Definition: Table ID. ParsedDocumentComponent objects reference this ID through original_table_id so that the table is not stored multiple times. Value range: N/A |
| content | String | Definition: Table content. Value range: N/A |
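The nested structure above (document, pages, components, original tables) can be walked as follows. This is a minimal sketch that assumes `result` is the decoded ParsedDocument object as a Python dict; the function names are hypothetical, and missing fields are tolerated because several of them are optional.

```python
def iter_components(result):
    """Yield (page_num, component) pairs in page and paragraph order."""
    pages = sorted(result.get("pages", []), key=lambda p: p.get("page_num", 0))
    for page in pages:
        components = sorted(
            page.get("components", []),
            key=lambda c: c.get("component_num", 0),
        )
        for component in components:
            yield page.get("page_num"), component


def original_table_for(component, result):
    """Return the content of the original table a split chunk refers to, if any."""
    table_id = component.get("original_table_id")
    if not table_id:
        return None
    for table in result.get("original_tables", []):
        if table.get("id") == table_id:
            return table.get("content")
    return None
```

A caller that wants plain text in reading order can join `component["text"]` over `iter_components(result)`, and fall back to `original_table_for` when a chunk is a fragment of a longer table.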
Status code: 400
| Parameter | Type | Description |
|---|---|---|
| error_code | String | Definition: Error code. Value range: N/A |
| error_msg | String | Definition: Error description. Value range: N/A |
Status code: 401
| Parameter | Type | Description |
|---|---|---|
| error_code | String | Definition: Error code. Value range: N/A |
| error_msg | String | Definition: Error description. Value range: N/A |
Status code: 500
| Parameter | Type | Description |
|---|---|---|
| error_code | String | Definition: Error code. Value range: N/A |
| error_msg | String | Definition: Error description. Value range: N/A |
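The 400, 401, and 500 responses share the same two fields, so a client can handle them with one code path. The sketch below turns an error body into a Python exception; the class name `DocSearchError`, the function `parse_error`, and the error code used in the usage note are hypothetical.

```python
import json


class DocSearchError(Exception):
    """Carries the HTTP status plus the error_code and error_msg fields."""

    def __init__(self, status, error_code, error_msg):
        super().__init__(f"HTTP {status}: {error_code}: {error_msg}")
        self.status = status
        self.error_code = error_code
        self.error_msg = error_msg


def parse_error(status, body):
    """Build a DocSearchError from a raw response body (str or bytes).

    Bodies that are not a JSON object yield empty code and message
    instead of raising a second error.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return DocSearchError(
        status, payload.get("error_code", ""), payload.get("error_msg", "")
    )
```

For example, `raise parse_error(401, '{"error_code": "EXAMPLE.0001", "error_msg": "invalid token"}')` surfaces both fields in the exception message.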
Example Requests
GET https://127.0.0.1:8081/v1/729cbd739854470da5426ed26bd900ca/applications/01d3c218-4d37-489a-98ff-69d69ea44bb1/doc-search/tasks/37d4e2c03b76455a9e2b9e3fad2e5967?is_other_original_table=false
Example Responses
Status code: 200
Result of an asynchronous document parsing task
{
"process" : 100,
"task_status" : "SUCCESS",
"result" : {
"pages" : [ {
"components" : [ {
"id" : "9074e9104fae45fa9c5ddb7e596a21ca",
"text" : "2844, an increase of 101 compared with the previous year. There are 2879 listed stocks, an increase of 101. Among them, there are 2838 A-shares, an increase of 102; and 41 B-shares, a decrease of 1. The total issued share capital is 26414.94 billion shares, an increase of 3.0/ %over the previous year. The total circulating share capital is 22970.66 billion shares, an increase of 4.4%.\nThe original insurance premium income of insurance institutions in the whole year was 171.956 billion yuan, an increase of 12.6/ %over the previous year. Among them, the income of property insurance business is 44.269 billion yuan, an increase of 5.9%; the income of personal insurance business is 127.687 billion yuan, an increase of 15.1%. The total amount of compensation expenses is 56.157 billion yuan, an increase of 27.5%. Among them, the expenditure of property insurance business is 26.454 billion yuan, an increase of 19.7%; the expenditure of personal insurance business is 29.703 billion yuan, an increase of 35.2%.\nIX. People's Life and Social Security\nThe per capita disposable income of residents in the whole year was 76,910 yuan, an increase of 5.8/ %over the previous year. The per capita consumption expenditure of residents was 49,013 yuan, an increase of 9.4%. The Engel coefficient is 29.4%.",
"component_num" : 1,
"pdf_coordinate" : [ [ 72, 107 ], [ 521, 107 ], [ 521, 444 ], [ 72, 444 ] ],
"element_id" : "9074e9104fae45fa9c5ddb7e596a21ca",
"elements" : [ "9074e9104fae45fa9c5ddb7e596a21ca", "dbd121825fae4137b641b79f52d4e4ec", "89ac39fe379a43319ae7ef299316a563", "4e401be69a5a4b89a8dc0e9cc0b94955" ],
"original_page_nums" : [ 1 ]
}, {
"id" : "3ac0437f4bad46fc86ead8b72cf4ccfe",
"text" : "```echarts-117dfebaf04648e08dd4b09638c83dac\n{\n \"title\": {\n \"text\": \"\"\n },\n \"legend\": {\n \"data\": [\n \"Food and tobacco and alcohol\",\n \"Clothing\",\n \"Housing\",\n \"Essential goods and services\",\n \"Transportation and communication\",\n \"Education, culture, and entertainment\",\n \"Healthcare\",\n \"Other goods and services\"\n ]\n },\n \"tooltip\": {\n \"trigger\": \"item\",\n \"formatter\": \"{a} <br/>{b}: {c} ({d}%)\"\n },\n \"series\": [\n {\n \"name\": \"\",\n \"type\": \"pie\",\n \"radius\": \"55%\",\n \"center\": [\"50%\", \"50%\"],\n \"data\": [\n { \"value\": 14429.44, \"name\": \"Food, tobacco, and alcohol\", \"percent\": 29.4 },\n { \"value\": 2169.98, \"name\": \"Clothing\", \"percent\": 4.4 },\n { \"value\": 12990.05, \"name\": \"Housing\", \"percent\": 26.5 },\n { \"value\": 2508.15, \"name\": \"Essential goods and services\", \"percent\": 5.1 },\n { \"value\": 7362.83, \"name\": \"Transportation and communication\", \"percent\": 15.0 },\n { \"value\": 5034.22, \"name\": \"Education, culture, and entertainment\", \"percent\": 10.3 },\n { \"value\": 2513.40, \"name\": \"Healthcare\", \"percent\": 5.1 },\n { \"value\": 2004.93, \"name\": \"Other goods and services\", \"percent\": 4.1 }\n ],\n \"emphasis\": {\n \"itemStyle\": {\n \"shadowBlur\": 10,\n \"shadowOffsetX\": 0,\n \"shadowColor\": \"rgba(0, 0, 0, 0.5)\"\n }\n }\n }\n ]\n}\n```\n{img-117dfebaf04648e08dd4b09638c83dac}\nFigure 11\nPer capita consumption expenditure and composition of residents in 2023 <br>",
"component_num" : 2,
"pdf_coordinate" : [ [ 81, 478 ], [ 512, 478 ], [ 512, 702 ], [ 81, 702 ] ],
"element_id" : "3ac0437f4bad46fc86ead8b72cf4ccfe",
"elements" : [ "3ac0437f4bad46fc86ead8b72cf4ccfe" ],
"original_page_nums" : [ 1 ]
} ],
"page_num" : 1
} ],
"images" : [ {
"image_id" : "img-117dfebaf04648e08dd4b09638c83dac",
"url" : "kos-docs/guangqi/images/ec/ecd9d8294ba04bba968ab88e15e15779.jpg",
"title" : "Figure 11\nPer Capita Consumption Expenditure and Composition of Residents in 2023",
"width" : 897,
"height" : 467
} ],
"doc_id" : "053e86000f9e4ed5aca7ee6670fb6474",
"doc_name" : "17-Reference Materials_Shenzhen_2023_National Economic and Social Development Statistical Bulletin_Super Application of Assistant Mayor_Economic Operation Report .pdf",
"doc_type" : "PDF",
"html_path" : "kos-docs/haier/output/html/05/053e86000f9e4ed5aca7ee6670fb6474.html",
"json_path" : "kos-docs/haier/output/json/05/053e86000f9e4ed5aca7ee6670fb6474.json",
"md_path" : "kos-docs/haier/output/md/05/053e86000f9e4ed5aca7ee6670fb6474.md",
"file_size" : 80549
}
}
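Since parsing is asynchronous, a client typically polls this API until the task finishes. The sketch below relies on the `task_status` and `process` fields shown in the example response; only the value `SUCCESS` appears on this page, so the loop treats every other status as "not finished yet" and gives up after a fixed number of attempts. The function name, the `fetch` callable, and the default interval are assumptions.

```python
import time


def wait_for_result(fetch, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll until task_status is SUCCESS, then return the `result` object.

    `fetch` is any zero-argument callable that returns the decoded
    response body as a dict. Failure status values are not listed on
    this page, so failures surface as a TimeoutError that includes the
    last seen status and task_desc.
    """
    last = {}
    for _ in range(max_attempts):
        last = fetch()
        if last.get("task_status") == "SUCCESS":
            return last.get("result")
        sleep(interval)
    raise TimeoutError(
        "task not finished: status={!r}, process={!r}, task_desc={!r}".format(
            last.get("task_status"), last.get("process"), last.get("task_desc")
        )
    )
```

Passing `sleep` as a parameter lets tests run without real delays. A production client would likely also stop early on a known failure status once those values are confirmed.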
Status Codes
| Status Code | Description |
|---|---|
| 200 | Result of an asynchronous document parsing task |
| 400 | Request parameter error |
| 401 | Authentication error |
| 500 | Internal service error |
Error Codes
See Error Codes.