Web Image

Function

This API detects and extracts text from web images and returns the recognized text in a structured JSON format.

For details about the constraints on using this API, see Notes and Constraints. For details about how to use this API, see Introduction to OCR.

Notes and Constraints

English and Chinese are supported, but support for traditional Chinese characters is limited.
Only images in JPG, JPEG, PNG, BMP, TIFF, TGA, WebP, ICO, PCX, or GIF format can be recognized.
Common image types are supported, such as mobile phone or desktop screenshots, e-commerce product images, and advertisement design drawings.
No side of the image can be smaller than 15 pixels or larger than 30,000 pixels. The file size of a single image after Base64 encoding should not exceed 10 MB.
The characters to be recognized must occupy more than 60% of the image.
The web image to be recognized can be rotated to any angle (direction detection must be enabled).
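
The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service or its SDKs: the function name check_image is hypothetical, the caller is assumed to supply the pixel dimensions, the file extension is used only as a rough proxy for the real image format, and 1 MB is assumed to mean 1024 x 1024 bytes.

```python
import base64
import os

# Limits copied from the notes above.
SUPPORTED_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".bmp", ".tiff",
    ".tga", ".webp", ".ico", ".pcx", ".gif",
}
MIN_SIDE_PX = 15
MAX_SIDE_PX = 30000
MAX_BASE64_BYTES = 10 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_image(filename, data, width, height):
    """Return a list of problems; an empty list means the image passes.

    filename: used only to read the extension (a proxy for the format).
    data:     raw image bytes, before Base64 encoding.
    width, height: pixel dimensions supplied by the caller.
    """
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if min(width, height) < MIN_SIDE_PX or max(width, height) > MAX_SIDE_PX:
        problems.append(f"side length out of range: {width}x{height}")
    encoded_size = len(base64.b64encode(data))
    if encoded_size > MAX_BASE64_BYTES:
        problems.append(f"Base64 size too large: {encoded_size} bytes")
    return problems
```

A call such as check_image("banner.png", raw_bytes, 800, 600) returns an empty list when all three checks pass. The share of the image occupied by characters cannot be verified this way and is left to the service.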

Calling Method

For details, see Calling APIs.

Prerequisites

Before using this API, subscribe to the service and complete authentication. For details, see Subscribing to an OCR Service and Authentication.

Before calling this API for the first time, log in to the OCR console and click Subscribe to subscribe to the corresponding service. You only need to subscribe to the service once. If you call this API without a subscription, error "ModelArts.4204" is returned. Ensure that you subscribe to the service in the same region where you want to call this API.

URI

POST /v2/{project_id}/ocr/web-image

**Table 1** URI parameters
Parameter	Mandatory	Description
endpoint	Yes	Endpoint, which is the request address for calling an API. The endpoint varies depending on services in different regions. For more details, see Endpoints.
project_id	Yes	Project ID. For how to obtain the project ID, see Obtaining a Project ID.

Request Parameters

**Table 2** Request header parameters
Parameter	Mandatory	Type	Description
X-Auth-Token	Yes	String	User token, which is used to obtain the permission to call APIs. The token is the value of X-Subject-Token in the response header in Authentication.
Content-Type	Yes	String	MIME type of the request body. The value is application/json.

**Table 3** Request body parameters
Parameter	Mandatory	Type	Description
image	No	String	Base64-encoded image data. Set either this parameter or url. The file size of a single image after Base64 encoding should not exceed 10 MB. Since images increase in size after Base64 encoding, it is recommended that the original image size not exceed 7 MB. No side of the image can be smaller than 15 pixels or larger than 30,000 pixels. Only images in JPG, JPEG, PNG, BMP, TIFF, TGA, WebP, ICO, PCX, or GIF format can be recognized. An example is /9j/4AAQSkZJRgABAg.... If the image data contains an unnecessary prefix, the error "The image format is not supported" is reported.
url	No	String	Image URL. Set either this parameter or image. The Base64-encoded file size of the image at the URL should not exceed 10 MB. Since images increase in size after Base64 encoding, it is recommended that the original image size not exceed 7 MB. Currently, the following URLs are supported: (1) Public HTTP/HTTPS URLs. (2) URLs provided by OBS. You need to be authorized to use OBS data, including service authorization, temporary authorization, and anonymous public authorization. For details, see Configuring Access Permissions of OBS. NOTE: The API response time depends on the image download time. If the image download takes a long time, the API call will fail. Ensure that the storage service where the image resides is stable and reliable. OBS is recommended for storing image data. The URL cannot contain Chinese characters. If Chinese characters exist, they must be encoded using UTF-8.
detect_direction	No	Boolean	Whether to align the tilted image. The options are as follows: true: The tilted image will be aligned. false: The tilted image will not be aligned. An image tilted to any angle can be aligned. If this parameter is not specified, false is used by default. If the image to be recognized is tilted, you are advised to set this parameter to true.
extract_type	No	Array of strings	Structured data extraction parameter list. Currently, only the image width and height are supported. To extract the image width and height, pass image_size. If this parameter is not set, no structured data is extracted.
detect_font	No	Boolean	Whether to detect the font type of text slices. If this parameter is set to true, the font type of each slice is detected and the five most similar font names are returned. If this parameter is not specified, fonts are not detected.
detect_text_direction	No	Boolean	Whether to detect the text direction of each field. If this parameter is not specified, the default value true is used and the text direction of each field is detected. If this parameter is set to false, the text direction is not detected. If all text in the image faces up, you are advised to set this parameter to false.
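
The rules in Table 3 can be enforced when the request body is assembled. The sketch below is illustrative only: build_body is a hypothetical helper, it assumes that exactly one of image and url is expected, that the "unnecessary prefix" is a data URI prefix such as data:image/jpeg;base64, and that "encoded using UTF-8" means percent-encoding the UTF-8 bytes of non-ASCII characters.

```python
from urllib.parse import quote


def build_body(image=None, url=None, detect_direction=None,
               extract_type=None, detect_font=None,
               detect_text_direction=None):
    """Build a JSON-serializable request body from the Table 3 parameters.

    Optional parameters left as None are omitted so that the service
    applies its own defaults.
    """
    if (image is None) == (url is None):
        raise ValueError("set exactly one of image or url")

    body = {}
    if image is not None:
        # Assumption: strip a data URI prefix such as
        # "data:image/jpeg;base64," so only the Base64 payload is sent.
        if image.startswith("data:") and "," in image:
            image = image.split(",", 1)[1]
        body["image"] = image
    else:
        # Percent-encode non-ASCII characters as UTF-8 and keep URL
        # delimiters and existing %XX escapes unchanged.
        body["url"] = quote(url, safe=":/?&=%#@+,;")

    optional = {
        "detect_direction": detect_direction,
        "extract_type": extract_type,
        "detect_font": detect_font,
        "detect_text_direction": detect_text_direction,
    }
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```

For example, build_body(url="https://example.com/a.png", detect_direction=True, extract_type=["image_size"]) yields a dictionary that can be passed to json.dumps and sent as the request body.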

Response Parameters

The status code may vary depending on the recognition results. For example, 200 indicates that the API is successfully called, and 400 indicates that the API fails to be called. The following describes the status codes and corresponding response parameters.

Status code: 200

**Table 4** Response body parameters
Parameter	Type	Description
result	WebImageResult object	Calling result of a successful API call. This parameter is not included when the API fails to be called.

**Table 5** WebImageResult
Parameter	Type	Description
words_block_count	Integer	Number of recognized text blocks. This parameter is not included when the API fails to be called.
words_block_list	Array of WebImageWordsBlockList objects	List of recognized text blocks. The output sequence is from left to right and from top to bottom.
extracted_data	WebImageExtractedData object	Structured JSON results extracted. The key value in the dictionary is the same as the value of extract_type in the input parameter list. Currently, only the contact (contact_info) and image size (image_size) can be extracted. If extract_type is left blank or missing, no information is extracted.

**Table 6** WebImageWordsBlockList
Parameter	Type	Description
words	String	Recognition result of a text block
confidence	Float	Confidence of related fields. A higher confidence indicates a higher accuracy of the field identified. The confidence is calculated using algorithms and is not equal to the accuracy.
location	Array<Array<Integer>>	List of location information about a text block, including the 2D coordinates (x, y) of four vertexes in the text area, where the coordinate origin is the upper-left corner of the image, the X axis is horizontal, and the Y axis is vertical.
font_list	Array of strings	Font type of a text block, such as SimHei, Arial, and STZhongsong. It is presented in a list format, indicating the font type that most closely matches the text within a text block.
font_scores	Array of numbers	Probability of the font type to which a text block belongs, in list format, corresponding to font_list, indicating the probability that the text in a text block belongs to a font type.
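
Because location holds four (x, y) vertexes and the text area may be rotated, client code often reduces it to an axis-aligned rectangle. The sketch below shows one way to do that and to pair font_list with font_scores by index. The helper names are hypothetical, and the font scores in the usage note are invented for illustration.

```python
def bounding_box(location):
    """Return (left, top, right, bottom) enclosing the four vertexes.

    The origin is the upper-left corner of the image, so "top" is the
    smallest y value.
    """
    xs = [point[0] for point in location]
    ys = [point[1] for point in location]
    return min(xs), min(ys), max(xs), max(ys)


def ranked_fonts(block):
    """Pair font_list with font_scores by index, best match first.

    Both fields are absent unless font detection was requested, in which
    case an empty list is returned.
    """
    fonts = block.get("font_list") or []
    scores = block.get("font_scores") or []
    return sorted(zip(fonts, scores), key=lambda pair: pair[1], reverse=True)
```

With the location from the example response, bounding_box([[13, 476], [91, 332], [125, 351], [48, 494]]) returns (13, 332, 125, 494). For a block with font_list ["SimHei", "Arial"] and hypothetical font_scores [0.2, 0.7], ranked_fonts returns [("Arial", 0.7), ("SimHei", 0.2)].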

**Table 7** WebImageExtractedData
Parameter	Type	Description
contact_info	WebImageContactInfo object	Extracted contact information, including the name, phone number, province, city, and detailed address. If extract_type does not contain this parameter, this parameter is not included in the response.
image_size	WebImageImageSize object	Width and height of an image. If extract_type does not contain this parameter, this parameter is not included in the response.

**Table 8** WebImageContactInfo
Parameter	Type	Description
name	String	Name, which is returned when contact_info is specified
phone	String	Contact phone number, which is returned when contact_info is specified
province	String	Province, which is returned when contact_info is specified
city	String	City, which is returned when contact_info is specified
district	String	County or district, which is returned when contact_info is specified
detail_address	String	Detailed address (excluding the province, city, and county or district), which is returned when contact_info is specified

**Table 9** WebImageImageSize
Parameter	Type	Description
height	Integer	Image height, which is returned when image_size is specified
width	Integer	Image width, which is returned when image_size is specified

Status code: 400

**Table 10** Response body parameters
Parameter	Type	Description
error_code	String	Error code of a failed API call. For details, see Error Codes. This parameter is not returned when the API is successfully called.
error_msg	String	Error message when the API call fails. This parameter is not included when the API is successfully called.

Example Request

endpoint is the request URL for calling an API. Endpoints vary depending on services and regions. For details, see Endpoints.
For example, Web Image OCR is deployed in the AP-Bangkok region. The endpoint is ocr.ap-southeast-2.myhuaweicloud.com or ocr.ap-southeast-2.myhuaweicloud.cn. The request URL is https://ocr.ap-southeast-2.myhuaweicloud.com/v2/{project_id}/ocr/web-image. project_id is the project ID. For how to obtain the project ID, see Obtaining a Project ID.
For details about how to obtain a token, see Authentication.

Transfer the Base64 code of a web image for recognition.

POST https://{endpoint}/v2/{project_id}/ocr/web-image
Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:  
{  
    "image":"/9j/4AAQSkZJRgABAgEASABIAAD/..."
}

Transfer the URL of a web image for recognition.

POST https://{endpoint}/v2/{project_id}/ocr/web-image
Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:
{
    "url":"https://BucketName.obs.xxxx.com/ObjectName"
}
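
The two example requests above can also be assembled with the Python standard library instead of an SDK. This is a sketch under stated assumptions: build_request is a hypothetical helper, and the endpoint, project ID, and token passed to it are placeholders that you must replace with your own values.

```python
import json
import urllib.request


def build_request(endpoint, project_id, token, body):
    """Assemble the POST request described in the URI and header tables."""
    url = f"https://{endpoint}/v2/{project_id}/ocr/web-image"
    headers = {
        "Content-Type": "application/json",
        "X-Auth-Token": token,
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Sending the request needs a valid token and network access, so it is
# shown as a comment only:
#
#   req = build_request("ocr.ap-southeast-2.myhuaweicloud.com",
#                       "<project_id>", "<token>",
#                       {"url": "https://BucketName.obs.xxxx.com/ObjectName"})
#   with urllib.request.urlopen(req, timeout=30) as resp:
#       result = json.load(resp)
```

Note that urllib.request.urlopen raises urllib.error.HTTPError for a 400 response. The error body can still be read from the exception object.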

Example Response

Status code: 200

Example response for a successful request

{ 
  "result": { 
      "words_block_count": 3, 
      "words_block_list": [ 
          { 
              "words": "Text block 1",
              "confidence": 0.9950,
              "location": [ 
                  [13, 476], 
                  [91, 332], 
                  [125, 351], 
                  [48, 494] 
              ] 
          }, 
          { 
              "words": "Text block 2",
              "confidence": 0.9910,
              "location": [ 
                  [13, 476], 
                  [91, 332], 
                  [125, 351], 
                  [48, 494] 
              ] 
          }, 
          { 
              "words": "Text block 3",
              "confidence": 0.9910,
              "location": [ 
                  [13, 476], 
                  [91, 332], 
                  [125, 351], 
                  [48, 494] 
              ] 
          } 
      ],
      "extracted_data": {}
  } 
}

Status code: 400

Example response for a failed request

{
    "error_code": "AIS.0103", 
    "error_msg": "The image size does not meet the requirements." 
}
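
The two example responses differ in their top-level keys: a successful call returns result, and a failed call returns error_code and error_msg. The following sketch shows one way to handle both shapes in client code. OcrError and extract_words are hypothetical names, and the min_confidence filter is an illustrative addition rather than a service feature.

```python
import json


class OcrError(Exception):
    """Raised when the response body carries error_code and error_msg."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def extract_words(payload, min_confidence=0.0):
    """Return the recognized text of each block from a response body.

    payload: response body as a JSON string.
    min_confidence: blocks whose confidence is below this value are skipped.
    """
    data = json.loads(payload)
    if "error_code" in data:
        raise OcrError(data["error_code"], data.get("error_msg", ""))
    blocks = data["result"]["words_block_list"]
    return [
        block["words"]
        for block in blocks
        if block.get("confidence", 0.0) >= min_confidence
    ]
```

Applied to the successful example above, extract_words returns ["Text block 1", "Text block 2", "Text block 3"]. Applied to the failed example, it raises OcrError with code "AIS.0103".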

Example SDK Code

The example SDK code is as follows:

You are advised to update the SDKs to the latest versions before use. Outdated local SDKs may not support the latest OCR functions.

Java: Transfer the Base64 code of a web image for recognition. During the recognition, the service checks the tilt angle of the image, detects the font type of the text, and checks whether the image contains contact information.

package com.huaweicloud.sdk.test;

import com.huaweicloud.sdk.core.auth.ICredential;
import com.huaweicloud.sdk.core.auth.BasicCredentials;
import com.huaweicloud.sdk.core.exception.ConnectionException;
import com.huaweicloud.sdk.core.exception.RequestTimeoutException;
import com.huaweicloud.sdk.core.exception.ServiceResponseException;
import com.huaweicloud.sdk.ocr.v1.region.OcrRegion;
import com.huaweicloud.sdk.ocr.v1.*;
import com.huaweicloud.sdk.ocr.v1.model.*;

import java.util.List;
import java.util.ArrayList;

public class RecognizeWebImageSolution {

    public static void main(String[] args) {
        // Hard-coding the AK and SK or storing them in plaintext poses significant security risks. Store them in ciphertext in configuration files or environment variables and decrypt them during use.
        // This example reads the AK and SK from environment variables. Before running it, set the environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
        String ak = System.getenv("CLOUD_SDK_AK");
        String sk = System.getenv("CLOUD_SDK_SK");

        ICredential auth = new BasicCredentials()
                .withAk(ak)
                .withSk(sk);

        OcrClient client = OcrClient.newBuilder()
                .withCredential(auth)
                .withRegion(OcrRegion.valueOf("<YOUR REGION>"))
                .build();
        RecognizeWebImageRequest request = new RecognizeWebImageRequest();
        WebImageRequestBody body = new WebImageRequestBody();
        List<String> listbodyExtractType = new ArrayList<>();
        listbodyExtractType.add("contact_info");
        listbodyExtractType.add("image_size");
        body.withDetectFont(true);
        body.withExtractType(listbodyExtractType);
        body.withDetectDirection(true);
        body.withImage("/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA...");
        request.withBody(body);
        try {
            RecognizeWebImageResponse response = client.recognizeWebImage(request);
            System.out.println(response.toString());
        } catch (ConnectionException e) {
            e.printStackTrace();
        } catch (RequestTimeoutException e) {
            e.printStackTrace();
        } catch (ServiceResponseException e) {
            e.printStackTrace();
            System.out.println(e.getHttpStatusCode());
            System.out.println(e.getRequestId());
            System.out.println(e.getErrorCode());
            System.out.println(e.getErrorMsg());
        }
    }
}

Java: Transfer the URL of a web image for recognition. During the recognition, the service checks the tilt angle of the image, detects the font type of the text, and checks whether the image contains contact information.

package com.huaweicloud.sdk.test;

import com.huaweicloud.sdk.core.auth.ICredential;
import com.huaweicloud.sdk.core.auth.BasicCredentials;
import com.huaweicloud.sdk.core.exception.ConnectionException;
import com.huaweicloud.sdk.core.exception.RequestTimeoutException;
import com.huaweicloud.sdk.core.exception.ServiceResponseException;
import com.huaweicloud.sdk.ocr.v1.region.OcrRegion;
import com.huaweicloud.sdk.ocr.v1.*;
import com.huaweicloud.sdk.ocr.v1.model.*;

import java.util.List;
import java.util.ArrayList;

public class RecognizeWebImageSolution {

    public static void main(String[] args) {
        // Hard-coding the AK and SK or storing them in plaintext poses significant security risks. Store them in ciphertext in configuration files or environment variables and decrypt them during use.
        // This example reads the AK and SK from environment variables. Before running it, set the environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
        String ak = System.getenv("CLOUD_SDK_AK");
        String sk = System.getenv("CLOUD_SDK_SK");

        ICredential auth = new BasicCredentials()
                .withAk(ak)
                .withSk(sk);

        OcrClient client = OcrClient.newBuilder()
                .withCredential(auth)
                .withRegion(OcrRegion.valueOf("<YOUR REGION>"))
                .build();
        RecognizeWebImageRequest request = new RecognizeWebImageRequest();
        WebImageRequestBody body = new WebImageRequestBody();
        List<String> listbodyExtractType = new ArrayList<>();
        listbodyExtractType.add("contact_info");
        listbodyExtractType.add("image_size");
        body.withDetectFont(true);
        body.withExtractType(listbodyExtractType);
        body.withDetectDirection(true);
        body.withUrl("https://BucketName.obs.myhuaweicloud.com/ObjectName");
        request.withBody(body);
        try {
            RecognizeWebImageResponse response = client.recognizeWebImage(request);
            System.out.println(response.toString());
        } catch (ConnectionException e) {
            e.printStackTrace();
        } catch (RequestTimeoutException e) {
            e.printStackTrace();
        } catch (ServiceResponseException e) {
            e.printStackTrace();
            System.out.println(e.getHttpStatusCode());
            System.out.println(e.getRequestId());
            System.out.println(e.getErrorCode());
            System.out.println(e.getErrorMsg());
        }
    }
}

Python: Transfer the Base64 code of a web image for recognition.

# coding: utf-8

import os

from huaweicloudsdkcore.auth.credentials import BasicCredentials
from huaweicloudsdkocr.v1.region.ocr_region import OcrRegion
from huaweicloudsdkcore.exceptions import exceptions
from huaweicloudsdkocr.v1 import *

if __name__ == "__main__":
    # Hard-coding the AK and SK or storing them in plaintext poses significant security risks. Store them in ciphertext in configuration files or environment variables and decrypt them during use.
    # This example reads the AK and SK from environment variables. Before running it, set the environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
    ak = os.getenv("CLOUD_SDK_AK")
    sk = os.getenv("CLOUD_SDK_SK")

    credentials = BasicCredentials(ak, sk)

    client = OcrClient.new_builder() \
        .with_credentials(credentials) \
        .with_region(OcrRegion.value_of("<YOUR REGION>")) \
        .build()

    try:
        request = RecognizeWebImageRequest()
        listExtractTypebody = [
            "contact_info",
            "image_size"
        ]
        request.body = WebImageRequestBody(
            detect_font=True,
            extract_type=listExtractTypebody,
            detect_direction=True,
            image="/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA..."
        )
        response = client.recognize_web_image(request)
        print(response)
    except exceptions.ClientRequestException as e:
        print(e.status_code)
        print(e.request_id)
        print(e.error_code)
        print(e.error_msg)

Python: Transfer the URL of a web image for recognition.

# coding: utf-8

import os

from huaweicloudsdkcore.auth.credentials import BasicCredentials
from huaweicloudsdkocr.v1.region.ocr_region import OcrRegion
from huaweicloudsdkcore.exceptions import exceptions
from huaweicloudsdkocr.v1 import *

if __name__ == "__main__":
    # Hard-coding the AK and SK or storing them in plaintext poses significant security risks. Store them in ciphertext in configuration files or environment variables and decrypt them during use.
    # This example reads the AK and SK from environment variables. Before running it, set the environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
    ak = os.getenv("CLOUD_SDK_AK")
    sk = os.getenv("CLOUD_SDK_SK")

    credentials = BasicCredentials(ak, sk)

    client = OcrClient.new_builder() \
        .with_credentials(credentials) \
        .with_region(OcrRegion.value_of("<YOUR REGION>")) \
        .build()

    try:
        request = RecognizeWebImageRequest()
        listExtractTypebody = [
            "contact_info",
            "image_size"
        ]
        request.body = WebImageRequestBody(
            detect_font=True,
            extract_type=listExtractTypebody,
            detect_direction=True,
            url="https://BucketName.obs.myhuaweicloud.com/ObjectName"
        )
        response = client.recognize_web_image(request)
        print(response)
    except exceptions.ClientRequestException as e:
        print(e.status_code)
        print(e.request_id)
        print(e.error_code)
        print(e.error_msg)

Go: Transfer the Base64 code of a web image for recognition.

package main

import (
    "fmt"
    "os"

    "github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
    ocr "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1"
    "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/model"
    region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/region"
)

func main() {
    // Hard-coding the AK and SK or storing them in plaintext poses significant security risks. Store them in ciphertext in configuration files or environment variables and decrypt them during use.
    // This example reads the AK and SK from environment variables. Before running it, set the environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
    ak := os.Getenv("CLOUD_SDK_AK")
    sk := os.Getenv("CLOUD_SDK_SK")

    auth := basic.NewCredentialsBuilder().
        WithAk(ak).
        WithSk(sk).
        Build()

    client := ocr.NewOcrClient(
        ocr.OcrClientBuilder().
            WithRegion(region.ValueOf("<YOUR REGION>")).
            WithCredential(auth).
            Build())

    request := &model.RecognizeWebImageRequest{}
    var listExtractTypebody = []string{
        "contact_info",
        "image_size",
    }
    detectFontWebImageRequestBody := true
    detectDirectionWebImageRequestBody := true
    imageWebImageRequestBody := "/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA..."
    request.Body = &model.WebImageRequestBody{
        DetectFont:      &detectFontWebImageRequestBody,
        ExtractType:     &listExtractTypebody,
        DetectDirection: &detectDirectionWebImageRequestBody,
        Image:           &imageWebImageRequestBody,
    }
    response, err := client.RecognizeWebImage(request)
    if err == nil {
        fmt.Printf("%+v\n", response)
    } else {
        fmt.Println(err)
    }
}

Go: Transfer the URL of a web image for recognition.

package main

import (
    "fmt"
    "os"

    "github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
    ocr "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1"
    "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/model"
    region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/region"
)

func main() {
    // Hard-coding the AK and SK or storing them in plaintext poses significant security risks. Store them in ciphertext in configuration files or environment variables and decrypt them during use.
    // This example reads the AK and SK from environment variables. Before running it, set the environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment.
    ak := os.Getenv("CLOUD_SDK_AK")
    sk := os.Getenv("CLOUD_SDK_SK")

    auth := basic.NewCredentialsBuilder().
        WithAk(ak).
        WithSk(sk).
        Build()

    client := ocr.NewOcrClient(
        ocr.OcrClientBuilder().
            WithRegion(region.ValueOf("<YOUR REGION>")).
            WithCredential(auth).
            Build())

    request := &model.RecognizeWebImageRequest{}
    var listExtractTypebody = []string{
        "contact_info",
        "image_size",
    }
    detectFontWebImageRequestBody := true
    detectDirectionWebImageRequestBody := true
    urlWebImageRequestBody := "https://BucketName.obs.myhuaweicloud.com/ObjectName"
    request.Body = &model.WebImageRequestBody{
        DetectFont:      &detectFontWebImageRequestBody,
        ExtractType:     &listExtractTypebody,
        DetectDirection: &detectDirectionWebImageRequestBody,
        Url:             &urlWebImageRequestBody,
    }
    response, err := client.RecognizeWebImage(request)
    if err == nil {
        fmt.Printf("%+v\n", response)
    } else {
        fmt.Println(err)
    }
}

For more SDK code examples in various programming languages, see the Sample Code tab on the right of the API Explorer page, which can automatically generate corresponding SDK code examples.