General Text OCR
Function
General Text OCR recognizes the text in an image and returns the recognized text and its coordinates in JSON format. It works on a range of inputs, such as scanned files, electronic documents, books, notes, and forms. For details about the constraints on using this API, see Constraints. For details about how to use this API, see Introduction to OCR.
Prerequisites
Before using General Text OCR, you need to apply for the service and complete authentication. For details, see Subscribing to OCR and Authentication.
URI
POST https://{endpoint}/v2/{project_id}/ocr/general-text
| Parameter | Mandatory | Description |
|---|---|---|
| endpoint | Yes | Domain name or IP address of the server that hosts the REST service. The endpoint varies by service and region. For more details, see Endpoints. For example, the endpoint of OCR in the CN North-Beijing4 region is ocr.cn-north-4.myhuaweicloud.com. |
| project_id | Yes | Project ID, which can be obtained from Obtaining a Project ID. |
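The URI template above can be filled in with a small helper. This is a minimal sketch: `build_request_url` is a hypothetical name rather than part of any OCR SDK, and `my-project-id` is a placeholder, not a real project ID.

```python
def build_request_url(endpoint: str, project_id: str) -> str:
    """Fill the General Text OCR URI template with an endpoint and a project ID."""
    host = endpoint.strip()
    # Tolerate an endpoint pasted with a scheme or a trailing slash.
    if host.startswith("https://"):
        host = host[len("https://"):]
    host = host.rstrip("/")
    if not host or not project_id:
        raise ValueError("endpoint and project_id are both mandatory")
    return f"https://{host}/v2/{project_id}/ocr/general-text"


# Placeholder project ID; the endpoint is the CN North-Beijing4 example from the table.
print(build_request_url("ocr.cn-north-4.myhuaweicloud.com", "my-project-id"))
```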
Request Parameters
Request header parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| X-Auth-Token | Yes | String | User token. During API authentication using a token, the token is added to requests to obtain permissions for calling the API. The value of X-Subject-Token in the response header is the obtained token. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |
Request body parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| image | No. Set either this parameter or url. | String | Base64 character string converted from the image. The size cannot exceed 10 MB. The narrow edge must contain at least 15 pixels and the wide edge at most 4,096 pixels. Only images in JPEG, JPG, PNG, BMP, or TIFF format can be recognized. |
| url | No. Set either this parameter or image. | String | Image URL. |
| detect_direction | No | Boolean | Whether to align tilted images before recognition. Set this parameter to true to enable alignment or to false to disable it. An image tilted at any angle can be aligned. If this parameter is not specified, the default value false is used. |
| quick_mode | No | Boolean | Whether to enable the quick mode. When the quick mode is enabled, a single-line text image (an image that contains only one line of text, where the text area occupies more than 50% of the total text area) can be recognized quickly. Set this parameter to true to enable the quick mode or to false to disable it. If this parameter is not specified, the default value false is used, indicating that the quick mode is disabled. |
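The body parameters above can be assembled with a helper like the following. This is a sketch under stated assumptions: `build_payload` and `MAX_IMAGE_BYTES` are hypothetical names, the helper reads "set either image or url" as "exactly one of the two", and it applies the 10 MB limit to the raw image bytes because the table does not say whether the limit applies before or after Base64 encoding.

```python
import base64

# Assumption: the 10 MB limit is checked against the raw image bytes.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def build_payload(image_bytes=None, image_url=None,
                  detect_direction=False, quick_mode=False):
    """Build the JSON request body from either raw image bytes or an image URL."""
    # Assumption: exactly one of the two image sources must be provided.
    if (image_bytes is None) == (image_url is None):
        raise ValueError("set exactly one of image_bytes or image_url")

    payload = {"detect_direction": detect_direction, "quick_mode": quick_mode}
    if image_bytes is not None:
        if len(image_bytes) > MAX_IMAGE_BYTES:
            raise ValueError("image exceeds 10 MB")
        payload["image"] = base64.b64encode(image_bytes).decode("utf-8")
    else:
        payload["url"] = image_url
    return payload


# The URL below is the placeholder used in this page's request example.
print(build_payload(image_url="https://BucketName.obs.xxxx.com/ObjectName"))
```

The pixel and format constraints are not checked here because doing so would require decoding the image.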
Response Parameters
Response parameters and status codes vary with the recognition result. They are described as follows.
Status code: 200
| Parameter | Type | Description |
|---|---|---|
| result | GeneralTextResult object | Result of a successful API call. This parameter is not included when the API call fails. |
GeneralTextResult
| Parameter | Type | Description |
|---|---|---|
| direction | Integer | Image direction |
| words_block_count | Integer | Number of recognized text blocks |
| words_block_list | Array of GeneralTextWordsBlockList objects | List of recognized text blocks. The output sequence is from left to right and from top to bottom. |
GeneralTextWordsBlockList
| Parameter | Type | Description |
|---|---|---|
| words | String | Recognition result of a text block |
| location | Array of integers | Location of a text block, given as the two-dimensional coordinates (x, y) of the four vertices of the text area. The coordinate origin is the upper left corner of the image, the X axis is horizontal, and the Y axis is vertical. |
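A successful response body can be reduced to text plus an axis-aligned bounding box per block. This is a minimal sketch: `extract_blocks` is a hypothetical helper, and it assumes that location is a list of four [x, y] pairs, as in the successful response example on this page, even though the table lists the type as an array of integers.

```python
def extract_blocks(response_json):
    """Return the words and an axis-aligned bounding box for each text block."""
    result = response_json.get("result", {})
    blocks = []
    for block in result.get("words_block_list", []):
        # Assumption: location holds four [x, y] vertices.
        xs = [point[0] for point in block["location"]]
        ys = [point[1] for point in block["location"]]
        blocks.append({
            "words": block["words"],
            # (left, top, right, bottom), with the origin at the upper left corner.
            "bbox": (min(xs), min(ys), max(xs), max(ys)),
        })
    return blocks


sample = {
    "result": {
        "direction": -1,
        "words_block_count": 1,
        "words_block_list": [
            {
                "words": "Words recognized from the text recognition area",
                "location": [[15, 15], [30, 15], [30, 30], [15, 30]],
            }
        ],
    }
}
for item in extract_blocks(sample):
    print(item["words"], item["bbox"])
```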
Status code: 400
| Parameter | Type | Description |
|---|---|---|
| error_code | String | Error code of a failed API call. For details, see Error Codes. If error code ModelArts.4204 is displayed, refer to Why Is a Message Stating "ModelArts.4204" Displayed When the OCR API Is Called? This parameter is not included when the API is successfully called. |
| error_msg | String | Error message returned when the API call fails. This parameter is not included when the API is successfully called. |
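Because result appears only on success and error_code and error_msg appear only on failure, a caller can branch on them. The sketch below is illustrative: `OcrError` and `check_response` are hypothetical names, and the fallback values for a missing error_code or error_msg are assumptions.

```python
class OcrError(Exception):
    """Raised when the response carries error_code and error_msg instead of result."""

    def __init__(self, error_code, error_msg):
        super().__init__(f"{error_code}: {error_msg}")
        self.error_code = error_code
        self.error_msg = error_msg


def check_response(status_code, body):
    """Return the result object on success; raise OcrError otherwise."""
    if status_code == 200 and "result" in body:
        return body["result"]
    # Assumption: fall back to placeholder values if the error fields are absent.
    raise OcrError(body.get("error_code", "unknown"), body.get("error_msg", ""))


try:
    check_response(400, {
        "error_code": "AIS.0103",
        "error_msg": "The image size does not meet the requirements.",
    })
except OcrError as exc:
    print("OCR call failed:", exc)
```

With the requests library, this could be called as `check_response(response.status_code, response.json())`.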
Request Example
- The endpoint is the request URL for calling an API. Endpoints vary depending on services and regions. For details, see Endpoints.
For example, General Text OCR is deployed in the CN North-Beijing4 region. The endpoint is ocr.cn-north-4.myhuaweicloud.com. The request URL is https://ocr.cn-north-4.myhuaweicloud.com/v2/{project_id}/ocr/general-text. project_id is the project ID. For details about how to obtain the project ID, see Obtaining a Project ID.
- For details about how to obtain a token, see Making an API Request.
- Request example (Method 1: Use the image Base64 string.)
    POST https://{endpoint}/v2/{project_id}/ocr/general-text

    Request Header:
    Content-Type: application/json
    X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

    Request Body:
    {
        "image":"/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA...",
        "detect_direction":false
    }
- Request example (Method 2: Use the image URL.)
    POST https://{endpoint}/v2/{project_id}/ocr/general-text

    Request Header:
    Content-Type: application/json
    X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

    Request Body:
    {
        "url":"https://BucketName.obs.xxxx.com/ObjectName",
        "detect_direction":false
    }
- Sample code for a Python 3 request (For code in other languages, refer to the following sample or use the OCR SDK.)
    # encoding:utf-8
    import base64

    import requests

    # Replace {endpoint} and {project_id} with actual values.
    url = "https://{endpoint}/v2/{project_id}/ocr/general-text"
    token = "Actual token value obtained by the user"
    headers = {'Content-Type': 'application/json', 'X-Auth-Token': token}

    imagepath = r'./data/general-text-demo.png'
    with open(imagepath, "rb") as bin_data:
        image_data = bin_data.read()
    image_base64 = base64.b64encode(image_data).decode("utf-8")  # Base64 encoding of images.
    payload = {"image": image_base64}
    # imageurl = 'https://BucketName.obs.xxxx.com/ObjectName'  # URL of the image.
    # payload = {'url': imageurl}  # url or image.

    response = requests.post(url, headers=headers, json=payload)
    print(response.text)
Example Response
Status code: 200
Successful response example
{
"result": {
"direction": -1,
"words_block_count": 1,
"words_block_list": [
{
"words": "Words recognized from the text recognition area",
"location":[
[15,15],
[30,15],
[30,30],
[15,30]
]
}
]
}
}
Status code: 400
{
"error_code": "AIS.0103",
"error_msg": "The image size does not meet the requirements."
}
Status Codes
| Status Code | Description |
|---|---|
| 200 | Success response |
| 400 | Failure response |
For details about status codes, see Status Codes.
Error Codes
For details about error codes, see Error Codes.