Web Image
Function
This API detects and extracts text from web images and converts the text into a structured JSON format.
For details about the constraints on using this API, see Constraints and Limitations. For details about how to use this API, see Introduction to OCR.
Constraints and Limitations
- English and Chinese are supported, but support for traditional Chinese characters is limited.
- Only images in JPG, JPEG, PNG, BMP, TIFF, TGA, WebP, ICO, PCX, or GIF format can be recognized.
- Common image types are supported, such as mobile phone or desktop screenshots, e-commerce product images, and advertisement design drawings.
- No side of the image can be smaller than 15 pixels or larger than 30,000 pixels.
- The characters to be recognized must occupy more than 60% of the image.
- The web image to be recognized can be rotated to any angle, provided that direction detection (detect_direction) is enabled.
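The format and dimension limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service or its SDK: the helper name check_image is hypothetical, and the caller is assumed to already know the image format, width, and height.

```python
# Hypothetical client-side pre-check that mirrors the documented limits.
SUPPORTED_FORMATS = {"jpg", "jpeg", "png", "bmp", "tiff", "tga", "webp", "ico", "pcx", "gif"}
MIN_SIDE_PX = 15
MAX_SIDE_PX = 30_000


def check_image(fmt: str, width: int, height: int) -> list[str]:
    """Return a list of constraint violations; an empty list means the image passes."""
    problems = []
    # Accept "PNG", "png", or ".png".
    if fmt.lower().lstrip(".") not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    for name, side in (("width", width), ("height", height)):
        if side < MIN_SIDE_PX or side > MAX_SIDE_PX:
            problems.append(f"{name} {side}px is outside {MIN_SIDE_PX}-{MAX_SIDE_PX}px")
    return problems


print(check_image("PNG", 1280, 720))  # no violations
print(check_image("svg", 10, 720))    # format and width violations
```

The sketch does not cover the text-coverage constraint, which cannot be verified without running recognition.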
Calling Method
For details, see Calling APIs.
Prerequisites
Before using this API, subscribe to the service and complete authentication. For details, see Subscribing to an OCR Service and Authentication.
You only need to subscribe to the service once. Before the first call, log in to the OCR console and click Subscribe for the corresponding service, in the same region where you want to call this API. If you call this API without having subscribed, the error "ModelArts.4204" is returned.
URI
POST /v2/{project_id}/ocr/web-image
URI parameters

Parameter | Mandatory | Description |
---|---|---|
endpoint | Yes | Endpoint, which is the request address for calling an API. The endpoint varies depending on services in different regions. For more details, see Endpoints. |
project_id | Yes | Project ID, which can be obtained from Obtaining a Project ID. |
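As an illustration of how the two URI parameters combine, the sketch below assembles the full request URL. The helper name build_url and the project ID value are hypothetical; the endpoint shown is the AP-Bangkok endpoint quoted later on this page.

```python
def build_url(endpoint: str, project_id: str) -> str:
    """Assemble the request URL from the endpoint and project ID."""
    if not endpoint or not project_id:
        raise ValueError("endpoint and project_id are both mandatory")
    # Tolerate an endpoint pasted with a scheme or a trailing slash.
    host = endpoint.removeprefix("https://").rstrip("/")
    return f"https://{host}/v2/{project_id}/ocr/web-image"


print(build_url("ocr.ap-southeast-2.myhuaweicloud.com", "my-project-id"))
```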
Request Parameters
Request header parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
X-Auth-Token | Yes | String | User token, used to obtain permission to call APIs. The token is the value of X-Subject-Token in the response header returned during authentication. For details, see Authentication. |
Content-Type | Yes | String | MIME type of the request body. The value is application/json. |

Request body parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
image | No | String | Base64-encoded image file. Set either this parameter or url. The image file cannot exceed 10 MB. No side of the image can be smaller than 15 pixels or larger than 30,000 pixels. Only images in JPG, JPEG, PNG, BMP, TIFF, TGA, WebP, ICO, PCX, or GIF format can be recognized. An example is /9j/4AAQSkZJRgABAg... If the image data contains an unnecessary prefix, the error "The image format is not supported" is reported. |
url | No | String | Image URL. Set either this parameter or image. |
detect_direction | No | Boolean | Whether to align a tilted image: true (align) or false (do not align). An image tilted to any angle can be aligned. If this parameter is not specified, false is used by default. If the image to be recognized is tilted, you are advised to set this parameter to true. |
extract_type | No | Array of strings | List of structured data items to extract. Currently, only the image width and height are supported; pass image_size to request them. If this parameter is not set, no structured data is extracted. |
detect_font | No | Boolean | Whether to detect the font of each text block. If this parameter is not specified, fonts are not detected. If it is set to true, the font of each text block is detected and the five most similar font names are returned. |
detect_text_direction | No | Boolean | Whether to detect the text direction of each field. If this parameter is not specified, the default value true is used. If it is set to false, the text direction is not detected. If all text in the image faces up, you are advised to set this parameter to false. |
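The body parameters above can be assembled as in the following sketch. It is illustrative only: the helper names are hypothetical, the 10 MB check is applied to the raw file bytes (the page does not say whether the limit applies before or after encoding), and the "unnecessary prefix" is assumed to mean a data-URI prefix such as data:image/jpeg;base64,.

```python
import base64

MAX_IMAGE_BYTES = 10 * 1024 * 1024  # 10 MB limit, assumed to apply to the raw file


def strip_data_uri_prefix(b64: str) -> str:
    """Drop a leading 'data:...;base64,' prefix (assumed to be the rejected prefix)."""
    if b64.startswith("data:") and "," in b64:
        return b64.split(",", 1)[1]
    return b64


def build_body(image_bytes=None, url=None, detect_direction=False, extract_type=None):
    """Build the JSON body as a dict, using exactly one of image_bytes or url."""
    if (image_bytes is None) == (url is None):
        raise ValueError("set exactly one of image_bytes or url")
    body = {}
    if image_bytes is not None:
        if len(image_bytes) > MAX_IMAGE_BYTES:
            raise ValueError("image file exceeds 10 MB")
        body["image"] = base64.b64encode(image_bytes).decode("ascii")
    else:
        body["url"] = url
    # Optional parameters are only sent when they differ from the defaults.
    if detect_direction:
        body["detect_direction"] = True
    if extract_type:
        body["extract_type"] = list(extract_type)
    return body


print(build_body(url="https://BucketName.obs.xxxx.com/ObjectName", extract_type=["image_size"]))
```

Rejecting a request that sets both image and url is a choice made in this sketch; the page only says to set either one.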
Response Parameters
The status code may vary depending on the recognition results. For example, 200 indicates that the API is successfully called, and 400 indicates that the API fails to be called. The following describes the status codes and corresponding response parameters.
Status code: 200
Response body parameters

Parameter | Type | Description |
---|---|---|
result | WebImageResult object | Result of a successful API call. This parameter is not included when the API call fails. |

WebImageResult

Parameter | Type | Description |
---|---|---|
words_block_count | Integer | Number of recognized text blocks. This parameter is not included when the API call fails. |
words_block_list | Array of WebImageWordsBlockList objects | List of recognized text blocks, output from left to right and from top to bottom. |
extracted_data | WebImageExtractedData object | Structured JSON results extracted. The keys in this dictionary match the values of extract_type in the request. Currently, only the contact information (contact_info) and image size (image_size) can be extracted. If extract_type is left blank or missing, no information is extracted. |

WebImageWordsBlockList

Parameter | Type | Description |
---|---|---|
words | String | Recognition result of a text block. |
confidence | Float | Confidence of the recognized field. A higher confidence indicates a higher likelihood that the field is correct. The confidence is calculated by an algorithm and is not equal to the accuracy. |
location | Array<Array<Integer>> | Location of a text block, given as the 2D coordinates (x, y) of the four vertices of the text area. The coordinate origin is the upper-left corner of the image, the X axis is horizontal, and the Y axis is vertical. |
font_list | Array of strings | List of the font types closest to the font of the text in a text block. |
font_scores | Array of numbers | List of probabilities that the text in a text block belongs to each font type. The entries correspond one-to-one to font_list. |
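Because location holds four (x, y) vertices and text may be rotated, callers often want an axis-aligned box around a block. The sketch below shows one way to derive it; the helper name bounding_box is hypothetical, and the sample block is taken from the example response on this page.

```python
def bounding_box(location):
    """Return (left, top, right, bottom) of the axis-aligned box enclosing the vertices."""
    xs = [x for x, _ in location]
    ys = [y for _, y in location]
    return min(xs), min(ys), max(xs), max(ys)


block = {
    "words": "Text block 1",
    "confidence": 0.9950,
    "location": [[13, 476], [91, 332], [125, 351], [48, 494]],
}
print(bounding_box(block["location"]))  # (13, 332, 125, 494)
```

Since the origin is the upper-left corner of the image, the smallest y value is the top edge of the box.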

WebImageExtractedData

Parameter | Type | Description |
---|---|---|
contact_info | WebImageContactInfo object | Extracted contact information, including the name, phone number, province, city, and detailed address. If extract_type does not contain this parameter, this parameter is not included in the response. |
image_size | WebImageImageSize object | Width and height of an image. If extract_type does not contain this parameter, this parameter is not included in the response. |

WebImageContactInfo

Parameter | Type | Description |
---|---|---|
name | String | Name, which is returned when contact_info is specified. |
phone | String | Contact phone number, which is returned when contact_info is specified. |
province | String | Province, which is returned when contact_info is specified. |
city | String | City, which is returned when contact_info is specified. |
district | String | County or district, which is returned when contact_info is specified. |
detail_address | String | Detailed address (excluding the province, city, and county or district), which is returned when contact_info is specified. |

WebImageImageSize

Parameter | Type | Description |
---|---|---|
height | Integer | Image height, which is returned when image_size is specified. |
width | Integer | Image width, which is returned when image_size is specified. |
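Since extracted_data only contains the keys that were requested in extract_type, code that reads it should tolerate missing entries. A minimal sketch follows; the helper name image_size and the sample values are hypothetical.

```python
def image_size(response: dict):
    """Return (width, height) if image_size was extracted, otherwise None."""
    extracted = response.get("result", {}).get("extracted_data") or {}
    size = extracted.get("image_size")
    if not size:
        return None
    return size["width"], size["height"]


# Hypothetical response fragments for illustration.
with_size = {"result": {"extracted_data": {"image_size": {"height": 720, "width": 1280}}}}
without_size = {"result": {"extracted_data": {}}}
print(image_size(with_size), image_size(without_size))
```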
Status code: 400
Response body parameters

Parameter | Type | Description |
---|---|---|
error_code | String | Error code of a failed API call. For details, see Error Codes. This parameter is not included when the API is successfully called. |
error_msg | String | Error message of a failed API call. This parameter is not included when the API is successfully called. |
Example Request
- endpoint is the request URL for calling an API. Endpoints vary depending on services and regions. For details, see Endpoints.
For example, Web Image OCR is deployed in the AP-Bangkok region. The endpoint is ocr.ap-southeast-2.myhuaweicloud.com or ocr.ap-southeast-2.myhuaweicloud.cn. The request URL is https://ocr.ap-southeast-2.myhuaweicloud.com/v2/{project_id}/ocr/web-image. project_id is the project ID. For how to obtain the project ID, see Obtaining a Project ID.
- For details about how to obtain a token, see Authentication.
- Transfer the Base64 code of a web image for recognition.

      POST https://{endpoint}/v2/{project_id}/ocr/web-image

      Request Header:
      Content-Type: application/json
      X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

      Request Body:
      {
          "image": "/9j/4AAQSkZJRgABAgEASABIAAD/..."
      }

- Transfer the URL of a web image for recognition.

      POST https://{endpoint}/v2/{project_id}/ocr/web-image

      Request Header:
      Content-Type: application/json
      X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

      Request Body:
      {
          "url": "https://BucketName.obs.xxxx.com/ObjectName"
      }

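The raw requests above can be reproduced without the SDK. The following sketch uses only the Python standard library to build the POST request; the helper name make_request, the project ID, and the token value are placeholders, and the request is constructed but not sent.

```python
import json
import urllib.request


def make_request(endpoint: str, project_id: str, token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for the web image API."""
    url = f"https://{endpoint}/v2/{project_id}/ocr/web-image"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


req = make_request(
    "ocr.ap-southeast-2.myhuaweicloud.com",
    "my-project-id",  # placeholder project ID
    "my-token",       # placeholder token
    {"url": "https://BucketName.obs.xxxx.com/ObjectName"},
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it. With urllib, a 400 response is raised as urllib.error.HTTPError, and the error body can be read from the exception object.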
Example Response
Status code: 200
Example response for a successful request

    {
        "result": {
            "words_block_count": 3,
            "words_block_list": [
                {
                    "words": "Text block 1",
                    "confidence": 0.9950,
                    "location": [[13, 476], [91, 332], [125, 351], [48, 494]]
                },
                {
                    "words": "Text block 2",
                    "confidence": 0.9910,
                    "location": [[13, 476], [91, 332], [125, 351], [48, 494]]
                },
                {
                    "words": "Text block 3",
                    "confidence": 0.9910,
                    "location": [[13, 476], [91, 332], [125, 351], [48, 494]]
                }
            ],
            "extracted_data": {}
        }
    }

Status code: 400
Example response for a failed request

    {
        "error_code": "AIS.0103",
        "error_msg": "The image size does not meet the requirements."
    }

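A caller has to distinguish the two response shapes shown above. The sketch below parses a response body, raises on an error payload, and otherwise returns the recognized text in output order. The names OcrError and extract_words and the min_confidence threshold are hypothetical conveniences, not part of the API.

```python
import json


class OcrError(Exception):
    """Raised when the response body carries error_code and error_msg."""


def extract_words(payload: str, min_confidence: float = 0.0) -> list[str]:
    """Return recognized text in output order, skipping blocks below min_confidence."""
    data = json.loads(payload)
    if "error_code" in data:
        raise OcrError(f"{data['error_code']}: {data.get('error_msg', '')}")
    blocks = data["result"]["words_block_list"]
    return [b["words"] for b in blocks if b["confidence"] >= min_confidence]


ok = (
    '{"result": {"words_block_count": 2, "words_block_list": ['
    '{"words": "Text block 1", "confidence": 0.9950},'
    '{"words": "Text block 2", "confidence": 0.9910}]}}'
)
print(extract_words(ok, min_confidence=0.993))  # ['Text block 1']
```

Because the confidence is not equal to the accuracy, any threshold is best tuned on representative images rather than treated as a fixed rule.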
Example SDK Code
The example SDK code is as follows.
Update the SDKs to the latest version before use. Outdated local SDKs may not support the latest OCR functions.
- Java: Transfer the Base64 code of a web image for recognition. During recognition, the service detects the tilt angle of the image, detects the font type of the text, and checks whether the image contains contact information.

      package com.huaweicloud.sdk.test;

      import com.huaweicloud.sdk.core.auth.ICredential;
      import com.huaweicloud.sdk.core.auth.BasicCredentials;
      import com.huaweicloud.sdk.core.exception.ConnectionException;
      import com.huaweicloud.sdk.core.exception.RequestTimeoutException;
      import com.huaweicloud.sdk.core.exception.ServiceResponseException;
      import com.huaweicloud.sdk.ocr.v1.region.OcrRegion;
      import com.huaweicloud.sdk.ocr.v1.*;
      import com.huaweicloud.sdk.ocr.v1.model.*;

      import java.util.List;
      import java.util.ArrayList;

      public class RecognizeWebImageSolution {

          public static void main(String[] args) {
              // Hard-coding the AK and SK or storing them in plaintext poses serious security risks.
              // Store them in ciphertext in configuration files or environment variables and decrypt them when they are used.
              // This example reads the AK and SK from environment variables. Before running it, set
              // CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
              String ak = System.getenv("CLOUD_SDK_AK");
              String sk = System.getenv("CLOUD_SDK_SK");

              ICredential auth = new BasicCredentials()
                      .withAk(ak)
                      .withSk(sk);

              OcrClient client = OcrClient.newBuilder()
                      .withCredential(auth)
                      .withRegion(OcrRegion.valueOf("<YOUR REGION>"))
                      .build();
              RecognizeWebImageRequest request = new RecognizeWebImageRequest();
              WebImageRequestBody body = new WebImageRequestBody();
              List<String> listbodyExtractType = new ArrayList<>();
              listbodyExtractType.add("contact_info");
              listbodyExtractType.add("image_size");
              body.withDetectFont(true);
              body.withExtractType(listbodyExtractType);
              body.withDetectDirection(true);
              body.withImage("/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA...");
              request.withBody(body);
              try {
                  RecognizeWebImageResponse response = client.recognizeWebImage(request);
                  System.out.println(response.toString());
              } catch (ConnectionException e) {
                  e.printStackTrace();
              } catch (RequestTimeoutException e) {
                  e.printStackTrace();
              } catch (ServiceResponseException e) {
                  e.printStackTrace();
                  System.out.println(e.getHttpStatusCode());
                  System.out.println(e.getRequestId());
                  System.out.println(e.getErrorCode());
                  System.out.println(e.getErrorMsg());
              }
          }
      }

- Java: Transfer the URL of a web image for recognition. During recognition, the service detects the tilt angle of the image, detects the font type of the text, and checks whether the image contains contact information.

      package com.huaweicloud.sdk.test;

      import com.huaweicloud.sdk.core.auth.ICredential;
      import com.huaweicloud.sdk.core.auth.BasicCredentials;
      import com.huaweicloud.sdk.core.exception.ConnectionException;
      import com.huaweicloud.sdk.core.exception.RequestTimeoutException;
      import com.huaweicloud.sdk.core.exception.ServiceResponseException;
      import com.huaweicloud.sdk.ocr.v1.region.OcrRegion;
      import com.huaweicloud.sdk.ocr.v1.*;
      import com.huaweicloud.sdk.ocr.v1.model.*;

      import java.util.List;
      import java.util.ArrayList;

      public class RecognizeWebImageSolution {

          public static void main(String[] args) {
              // Hard-coding the AK and SK or storing them in plaintext poses serious security risks.
              // Store them in ciphertext in configuration files or environment variables and decrypt them when they are used.
              // This example reads the AK and SK from environment variables. Before running it, set
              // CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
              String ak = System.getenv("CLOUD_SDK_AK");
              String sk = System.getenv("CLOUD_SDK_SK");

              ICredential auth = new BasicCredentials()
                      .withAk(ak)
                      .withSk(sk);

              OcrClient client = OcrClient.newBuilder()
                      .withCredential(auth)
                      .withRegion(OcrRegion.valueOf("<YOUR REGION>"))
                      .build();
              RecognizeWebImageRequest request = new RecognizeWebImageRequest();
              WebImageRequestBody body = new WebImageRequestBody();
              List<String> listbodyExtractType = new ArrayList<>();
              listbodyExtractType.add("contact_info");
              listbodyExtractType.add("image_size");
              body.withDetectFont(true);
              body.withExtractType(listbodyExtractType);
              body.withDetectDirection(true);
              body.withUrl("https://BucketName.obs.myhuaweicloud.com/ObjectName");
              request.withBody(body);
              try {
                  RecognizeWebImageResponse response = client.recognizeWebImage(request);
                  System.out.println(response.toString());
              } catch (ConnectionException e) {
                  e.printStackTrace();
              } catch (RequestTimeoutException e) {
                  e.printStackTrace();
              } catch (ServiceResponseException e) {
                  e.printStackTrace();
                  System.out.println(e.getHttpStatusCode());
                  System.out.println(e.getRequestId());
                  System.out.println(e.getErrorCode());
                  System.out.println(e.getErrorMsg());
              }
          }
      }

- Python: Transfer the Base64 code of a web image for recognition. During recognition, the service detects the tilt angle of the image, detects the font type of the text, and checks whether the image contains contact information.

      # coding: utf-8

      import os

      from huaweicloudsdkcore.auth.credentials import BasicCredentials
      from huaweicloudsdkocr.v1.region.ocr_region import OcrRegion
      from huaweicloudsdkcore.exceptions import exceptions
      from huaweicloudsdkocr.v1 import *

      if __name__ == "__main__":
          # Hard-coding the AK and SK or storing them in plaintext poses serious security risks.
          # Store them in ciphertext in configuration files or environment variables and decrypt them when they are used.
          # This example reads the AK and SK from environment variables. Before running it, set
          # CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
          ak = os.getenv("CLOUD_SDK_AK")
          sk = os.getenv("CLOUD_SDK_SK")

          credentials = BasicCredentials(ak, sk)

          client = OcrClient.new_builder() \
              .with_credentials(credentials) \
              .with_region(OcrRegion.value_of("<YOUR REGION>")) \
              .build()

          try:
              request = RecognizeWebImageRequest()
              listExtractTypebody = [
                  "contact_info",
                  "image_size"
              ]
              request.body = WebImageRequestBody(
                  detect_font=True,
                  extract_type=listExtractTypebody,
                  detect_direction=True,
                  image="/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA..."
              )
              response = client.recognize_web_image(request)
              print(response)
          except exceptions.ClientRequestException as e:
              print(e.status_code)
              print(e.request_id)
              print(e.error_code)
              print(e.error_msg)

- Python: Transfer the URL of a web image for recognition. During recognition, the service detects the tilt angle of the image, detects the font type of the text, and checks whether the image contains contact information.

      # coding: utf-8

      import os

      from huaweicloudsdkcore.auth.credentials import BasicCredentials
      from huaweicloudsdkocr.v1.region.ocr_region import OcrRegion
      from huaweicloudsdkcore.exceptions import exceptions
      from huaweicloudsdkocr.v1 import *

      if __name__ == "__main__":
          # Hard-coding the AK and SK or storing them in plaintext poses serious security risks.
          # Store them in ciphertext in configuration files or environment variables and decrypt them when they are used.
          # This example reads the AK and SK from environment variables. Before running it, set
          # CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
          ak = os.getenv("CLOUD_SDK_AK")
          sk = os.getenv("CLOUD_SDK_SK")

          credentials = BasicCredentials(ak, sk)

          client = OcrClient.new_builder() \
              .with_credentials(credentials) \
              .with_region(OcrRegion.value_of("<YOUR REGION>")) \
              .build()

          try:
              request = RecognizeWebImageRequest()
              listExtractTypebody = [
                  "contact_info",
                  "image_size"
              ]
              request.body = WebImageRequestBody(
                  detect_font=True,
                  extract_type=listExtractTypebody,
                  detect_direction=True,
                  url="https://BucketName.obs.myhuaweicloud.com/ObjectName"
              )
              response = client.recognize_web_image(request)
              print(response)
          except exceptions.ClientRequestException as e:
              print(e.status_code)
              print(e.request_id)
              print(e.error_code)
              print(e.error_msg)

- Go: Transfer the Base64 code of a web image for recognition. During recognition, the service detects the tilt angle of the image, detects the font type of the text, and checks whether the image contains contact information.

      package main

      import (
          "fmt"
          "os"

          "github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
          ocr "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1"
          "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/model"
          region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/region"
      )

      func main() {
          // Hard-coding the AK and SK or storing them in plaintext poses serious security risks.
          // Store them in ciphertext in configuration files or environment variables and decrypt them when they are used.
          // This example reads the AK and SK from environment variables. Before running it, set
          // CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
          ak := os.Getenv("CLOUD_SDK_AK")
          sk := os.Getenv("CLOUD_SDK_SK")

          auth := basic.NewCredentialsBuilder().
              WithAk(ak).
              WithSk(sk).
              Build()

          client := ocr.NewOcrClient(
              ocr.OcrClientBuilder().
                  WithRegion(region.ValueOf("<YOUR REGION>")).
                  WithCredential(auth).
                  Build())

          request := &model.RecognizeWebImageRequest{}
          var listExtractTypebody = []string{
              "contact_info",
              "image_size",
          }
          detectFontWebImageRequestBody := true
          detectDirectionWebImageRequestBody := true
          imageWebImageRequestBody := "/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA..."
          request.Body = &model.WebImageRequestBody{
              DetectFont:      &detectFontWebImageRequestBody,
              ExtractType:     &listExtractTypebody,
              DetectDirection: &detectDirectionWebImageRequestBody,
              Image:           &imageWebImageRequestBody,
          }
          response, err := client.RecognizeWebImage(request)
          if err == nil {
              fmt.Printf("%+v\n", response)
          } else {
              fmt.Println(err)
          }
      }

- Go: Transfer the URL of a web image for recognition. During recognition, the service detects the tilt angle of the image, detects the font type of the text, and checks whether the image contains contact information.

      package main

      import (
          "fmt"
          "os"

          "github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
          ocr "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1"
          "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/model"
          region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/region"
      )

      func main() {
          // Hard-coding the AK and SK or storing them in plaintext poses serious security risks.
          // Store them in ciphertext in configuration files or environment variables and decrypt them when they are used.
          // This example reads the AK and SK from environment variables. Before running it, set
          // CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
          ak := os.Getenv("CLOUD_SDK_AK")
          sk := os.Getenv("CLOUD_SDK_SK")

          auth := basic.NewCredentialsBuilder().
              WithAk(ak).
              WithSk(sk).
              Build()

          client := ocr.NewOcrClient(
              ocr.OcrClientBuilder().
                  WithRegion(region.ValueOf("<YOUR REGION>")).
                  WithCredential(auth).
                  Build())

          request := &model.RecognizeWebImageRequest{}
          var listExtractTypebody = []string{
              "contact_info",
              "image_size",
          }
          detectFontWebImageRequestBody := true
          detectDirectionWebImageRequestBody := true
          urlWebImageRequestBody := "https://BucketName.obs.myhuaweicloud.com/ObjectName"
          request.Body = &model.WebImageRequestBody{
              DetectFont:      &detectFontWebImageRequestBody,
              ExtractType:     &listExtractTypebody,
              DetectDirection: &detectDirectionWebImageRequestBody,
              Url:             &urlWebImageRequestBody,
          }
          response, err := client.RecognizeWebImage(request)
          if err == nil {
              fmt.Printf("%+v\n", response)
          } else {
              fmt.Println(err)
          }
      }

For more SDK code examples in various programming languages, see the Sample Code tab on the right of the API Explorer page, which can automatically generate corresponding SDK code examples.
Status Codes
Status Code | Description |
---|---|
200 | The API is successfully called. |
400 | The API fails to be called. |

See Status Codes.
Error Codes
See Error Codes.