VAT Invoice OCR

Function

VAT Invoice OCR recognizes the category of a VAT invoice and the text in the VAT invoice image, and returns the structured recognition result in JSON format. For details about the constraints on using this API, see Constraints. For details about how to use this API, see Introduction to OCR.

  • Only VAT invoices issued in the People's Republic of China can be recognized.
  • The following types of VAT invoices are supported: special VAT invoices, general VAT invoices, electronic VAT invoices (including toll invoices), and general VAT invoices (roll invoices).

Prerequisites

Before using VAT Invoice OCR, you need to apply for the service and complete authentication. For details, see Subscribing to OCR and Authentication.

URI

POST https://{endpoint}/v2/{project_id}/ocr/vat-invoice

Table 1 Path parameters

| Parameter | Mandatory | Description |
| --- | --- | --- |
| endpoint | Yes | Domain name or IP address of the server bearing the REST service endpoint. The endpoint varies depending on services in different regions. For more details, see Endpoints. For example, the endpoint of OCR in the CN North-Beijing4 region is ocr.cn-north-4.myhuaweicloud.com. |
| project_id | Yes | Project ID. For details about how to obtain it, see Obtaining a Project ID. |
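As a minimal sketch of how the two path parameters combine into the request URL, the helper below fills in the URI template shown above. The function name build_vat_invoice_url and the project ID value are hypothetical and used only for illustration.

```python
def build_vat_invoice_url(endpoint: str, project_id: str) -> str:
    """Fill the URI template POST https://{endpoint}/v2/{project_id}/ocr/vat-invoice."""
    endpoint = endpoint.strip().rstrip("/")
    # Accept an endpoint given with or without the https:// scheme.
    if endpoint.startswith("https://"):
        endpoint = endpoint[len("https://"):]
    if not endpoint or not project_id:
        raise ValueError("endpoint and project_id are both mandatory")
    return f"https://{endpoint}/v2/{project_id}/ocr/vat-invoice"


# "my-project-id" is a placeholder, not a real project ID.
print(build_vat_invoice_url("ocr.cn-north-4.myhuaweicloud.com", "my-project-id"))
```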

Request Parameters

Table 2 Request header parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| X-Auth-Token | Yes | String | User token. With token-based authentication, the token is added to each request to obtain permission to call the API. The token is the value of X-Subject-Token in the response header returned when the token is obtained. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |

Table 3 Request body parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| image | No. Set either this parameter or url. | String | Base64-encoded string of the image. The size cannot exceed 10 MB. The shorter side must be at least 100 pixels and the longer side at most 8,192 pixels. The JPEG, JPG, PNG, BMP, and TIFF formats are supported. |
| url | No. Set either this parameter or image. | String | Image URL. Currently, the following URLs are supported: (1) a public network HTTP/HTTPS URL; (2) a URL provided by OBS. To use OBS data, you need to be authorized through service authorization, temporary authorization, or anonymous public authorization. For details, see Configuring Access Permissions of OBS. |
| advanced_mode | No | Boolean | The default value is false. If this parameter is set to true, more fields are returned. For details, see Table 5 and Table 6. |

NOTE (applies to the url parameter):
  • The API response time depends on the image download time. If the image download takes a long time, the API call will fail.
  • Ensure that the storage service where the images to be detected reside is stable and reliable. OBS is recommended for storing image data.
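The constraints in Table 3 can be checked on the client before a request is sent. The following is a minimal sketch, not an official helper: build_payload is a hypothetical name, the 10 MB limit is interpreted here as 10 × 1024 × 1024 bytes of raw image data (the table does not say whether the limit applies before or after Base64 encoding), and the pixel-size limits are not checked.

```python
import base64
from typing import Optional

# Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes of the raw image.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def build_payload(image_bytes: Optional[bytes] = None,
                  image_url: Optional[str] = None,
                  advanced_mode: bool = False) -> dict:
    """Build the JSON request body with exactly one of image or url."""
    # Table 3: set either image or url, so reject both-set and neither-set.
    if (image_bytes is None) == (image_url is None):
        raise ValueError("set exactly one of image_bytes or image_url")

    payload = {"advanced_mode": advanced_mode}
    if image_bytes is not None:
        if len(image_bytes) > MAX_IMAGE_BYTES:
            raise ValueError("image exceeds the 10 MB limit")
        # The API expects the image as a Base64 character string.
        payload["image"] = base64.b64encode(image_bytes).decode("utf-8")
    else:
        payload["url"] = image_url
    return payload
```

For example, build_payload(image_url="https://BucketName.obs.xxxx.com/ObjectName", advanced_mode=True) produces the same body as Method 2 in the request examples.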

Response Parameters

Response parameters and status codes vary depending on the recognition result, as described below.

Status code: 200

Table 4 Response body parameter

| Parameter | Type | Description |
| --- | --- | --- |
| result | VatInvoiceResult object | Recognition result returned when the API call succeeds. This parameter is not included when the API fails to be called. |

Table 5 VatInvoiceResult

| Parameter | Type | Description |
| --- | --- | --- |
| type | String | VAT invoice type. Possible values are as follows: special: special VAT invoice; normal: general VAT invoice; electronic: electronic VAT invoice (including the toll invoice); roll: general VAT invoice (roll invoice). |
| serial_number | String | Serial number of the special voucher form. This parameter is returned only when advanced_mode is set to true. |
| attribution | String | Attribution of the invoice. This parameter is returned only when advanced_mode is set to true. |
| supervision_seal | Array of strings | Supervision seal of the invoice. This parameter is returned only when advanced_mode is set to true. |
| code | String | Invoice code |
| machine_number | String | Machine number. This parameter is returned only when advanced_mode is set to true. |
| print_number | String | Machine-printed number. This parameter is returned only when advanced_mode is set to true. |
| check_code | String | Invoice verification code. If the invoice does not carry a verification code, an empty string is returned. |
| number | String | Invoice number |
| issue_date | String | Issue date |
| encryption_block | String | Password encryption block |
| buyer_name | String | Buyer's name |
| buyer_id | String | Buyer's taxpayer identifier |
| buyer_address | String | Buyer's address and phone number |
| buyer_bank | String | Buyer's deposit bank and bank account |
| seller_name | String | Seller's name |
| seller_id | String | Seller's taxpayer identifier |
| seller_address | String | Seller's address and phone number |
| seller_bank | String | Seller's deposit bank and bank account |
| subtotal_amount | String | Total amount |
| subtotal_tax | String | Total amount of tax |
| total | String | Total price including tax |
| total_in_words | String | Total price including tax, in words. This parameter is returned only when advanced_mode is set to true. |
| remarks | String | Remarks. This parameter is returned only when advanced_mode is set to true. |
| receiver | String | Payee. This parameter is returned only when advanced_mode is set to true. |
| reviewer | String | Reviewer. This parameter is returned only when advanced_mode is set to true. |
| issuer | String | Issuer. This parameter is returned only when advanced_mode is set to true. |
| seller_seal | Array of strings | Seller's invoice seal. This parameter is returned only when advanced_mode is set to true. |
| item_list | Array of ItemList objects | List of goods or taxable labor services |
| confidence | Object | Confidence of each field. This parameter is returned only when advanced_mode is set to true. |
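Because the amount fields in Table 5 are returned as strings, a caller may want to convert them before doing arithmetic. The sketch below is one possible sanity check, under two assumptions that the table does not state: that total should equal subtotal_amount plus subtotal_tax, and that recognized amounts may carry currency symbols or thousands separators. The function names are hypothetical.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_amount(text: Optional[str]) -> Optional[Decimal]:
    """Convert a recognized amount string to Decimal, or None if unparsable."""
    # Assumption: drop everything except digits, the decimal point, and a sign.
    cleaned = re.sub(r"[^0-9.\-]", "", text or "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


def totals_are_consistent(result: dict,
                          tolerance: Decimal = Decimal("0.01")) -> Optional[bool]:
    """Check subtotal_amount + subtotal_tax against total.

    Returns None when any of the three fields cannot be parsed.
    """
    amount = parse_amount(result.get("subtotal_amount"))
    tax = parse_amount(result.get("subtotal_tax"))
    total = parse_amount(result.get("total"))
    if amount is None or tax is None or total is None:
        return None
    return abs(amount + tax - total) <= tolerance
```

A False or None result does not prove that the invoice is wrong; it only marks the recognition result as worth a manual review.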

Table 6 ItemList

| Parameter | Type | Description |
| --- | --- | --- |
| name | String | Name of the goods or taxable labor service |
| specification | String | Specifications |
| unit | String | Unit |
| quantity | String | Quantity |
| unit_price | String | Unit price |
| license_plate_number | String | License plate number. This parameter is returned only when advanced_mode is set to true. |
| amount | String | Amount |
| tax_rate | String | Tax rate |
| tax | String | Amount of tax |
| end_date | String | End date of the pass. This parameter is returned only when advanced_mode is set to true. |
| start_date | String | Start date of the pass. This parameter is returned only when advanced_mode is set to true. |
| vehicle_type | String | Vehicle type. This parameter is returned only when advanced_mode is set to true. |
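The line-item fields in Table 6 can be cross-checked against each other. The sketch below assumes, as is usual for invoices but not stated in the table, that amount is quantity × unit_price and that tax is amount × tax_rate, with tax_rate given as a percentage string such as "17%". The helper names are hypothetical, and fields that cannot be parsed (for example an empty string) are skipped rather than reported.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP
from typing import List, Optional


def to_decimal(text: Optional[str]) -> Optional[Decimal]:
    """Parse a numeric string such as "28.00" or "17%"; None if not numeric."""
    try:
        return Decimal(text.strip().rstrip("%"))
    except (InvalidOperation, AttributeError):
        return None


def check_item(item: dict, tolerance: Decimal = Decimal("0.01")) -> List[str]:
    """Return a list of inconsistencies found in one ItemList entry."""
    problems = []
    quantity = to_decimal(item.get("quantity"))
    unit_price = to_decimal(item.get("unit_price"))
    amount = to_decimal(item.get("amount"))
    rate = to_decimal(item.get("tax_rate"))
    tax = to_decimal(item.get("tax"))

    if quantity is not None and unit_price is not None and amount is not None:
        if abs(quantity * unit_price - amount) > tolerance:
            problems.append("quantity * unit_price != amount")
    if amount is not None and rate is not None and tax is not None:
        # tax_rate is a percentage, so divide by 100 and round to cents.
        expected = (amount * rate / 100).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        if abs(expected - tax) > tolerance:
            problems.append("amount * tax_rate != tax")
    return problems
```

With the values from the response example (quantity 300, unit price 28.00, amount 8400.00, tax rate 17%, tax 1428.00), both checks pass. Real invoices may round unit prices, so the tolerance may need to be widened.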

Status code: 400

Table 7 Response body parameters

| Parameter | Type | Description |
| --- | --- | --- |
| error_code | String | Error code of a failed API call. For details, see Error Codes. If error code ModelArts.4204 is displayed, refer to Why Is a Message Stating "ModelArts.4204" Displayed When the OCR API Is Called? This parameter is not included when the API is successfully called. |
| error_msg | String | Error message returned when the API fails to be called. This parameter is not included when the API is successfully called. |
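Since result appears only on success and error_code and error_msg appear only on failure, a caller can branch on which keys are present in the parsed response body. The following sketch shows one way to do that; OcrApiError and extract_result are hypothetical names, not part of any SDK.

```python
class OcrApiError(Exception):
    """Raised when the response body carries error_code and error_msg."""

    def __init__(self, error_code: str, error_msg: str):
        super().__init__(f"{error_code}: {error_msg}")
        self.error_code = error_code
        self.error_msg = error_msg


def extract_result(body: dict) -> dict:
    """Return the VatInvoiceResult object, or raise if the call failed."""
    if "error_code" in body:
        raise OcrApiError(body["error_code"], body.get("error_msg", ""))
    if "result" not in body:
        raise ValueError("response contains neither result nor error_code")
    return body["result"]
```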

Request Example

  • The endpoint is the request URL for calling an API. Endpoints vary depending on services and regions. For details, see Endpoints.

    For example, VAT Invoice OCR is deployed in the CN North-Beijing4 region. The endpoint is ocr.cn-north-4.myhuaweicloud.com. The request URL is https://ocr.cn-north-4.myhuaweicloud.com/v2/{project_id}/ocr/vat-invoice. project_id is the project ID. For details about how to obtain the project ID, see Obtaining a Project ID.

  • For details about how to obtain a token, see Making an API Request.
  • Request example (Method 1: Use the image Base64 string.)
    POST https://{endpoint}/v2/{project_id}/ocr/vat-invoice
    Request Header:
    Content-Type: application/json
    X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...
    Request Body:
    {
        "image": "/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAAj...",
        "advanced_mode": true
    }
  • Request example (Method 2: Use the image URL.)
    POST https://{endpoint}/v2/{project_id}/ocr/vat-invoice
    Request Header:
    Content-Type: application/json
    X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...
    Request Body:
    {
        "url": "https://BucketName.obs.xxxx.com/ObjectName",
        "advanced_mode": true
    }
  • Sample code for a Python 3 request (For code in other languages, refer to the following sample or use the OCR SDK.)
    # encoding:utf-8
    
    import requests
    import base64
    
    url = "https://{endpoint}/v2/{project_id}/ocr/vat-invoice"
    token = "Actual token value obtained by the user"
    headers = {'Content-Type': 'application/json', 'X-Auth-Token': token}
    
    imagepath = r'./data/vat-invoice-demo.png'
    with open(imagepath, "rb") as bin_data:
        image_data = bin_data.read()
    image_base64 = base64.b64encode(image_data).decode("utf-8")  # Base64 encoding of images.
    payload = {"image": image_base64}  # Set either "image" or "url", not both.
    response = requests.post(url, headers=headers, json=payload)
    print(response.text)
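The sample above covers Method 1 only. As a complementary sketch, the code below sends the image by URL (Method 2) and handles both the 200 and 400 response bodies. It uses the standard library urllib instead of requests purely to avoid a third-party dependency; build_request and call_api are hypothetical names, and the 30-second timeout is an arbitrary choice.

```python
import json
import urllib.error
import urllib.request


def build_request(endpoint: str, project_id: str, token: str,
                  image_url: str, advanced_mode: bool = False) -> urllib.request.Request:
    """Prepare the POST request for Method 2 (image passed by URL)."""
    body = json.dumps({"url": image_url, "advanced_mode": advanced_mode}).encode("utf-8")
    return urllib.request.Request(
        f"https://{endpoint}/v2/{project_id}/ocr/vat-invoice",
        data=body,
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


def call_api(request: urllib.request.Request, timeout: float = 30.0):
    """Send the request and return (status_code, parsed JSON body)."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # A 400 response carries error_code and error_msg as JSON.
        # Other failures may not return JSON, in which case json.loads raises.
        return err.code, json.loads(err.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholders: substitute a real endpoint, project ID, token, and image URL.
    req = build_request("{endpoint}", "{project_id}",
                        "Actual token value obtained by the user",
                        "https://BucketName.obs.xxxx.com/ObjectName",
                        advanced_mode=True)
    print(req.full_url)
```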

Response Example

Status code: 200

Successful response example

{
    "result": {
        "type": "special",
        "serial_number": "Serial number recognized from the image",
        "attribution": "Attribution recognized from the image",
        "supervision_seal": [
            "Content recognized from the supervision seal",
            "Content recognized from the supervision seal",
            "Content recognized from the supervision seal"
        ],
        "code": "310316XXXX",
        "check_code": "",
        "machine_number": "310316XXXX",
        "print_number": "",
        "number": "60543XXX",
        "issue_date": "Issue date recognized from the image",
        "encryption_block": "6/+1+733<672085+063>82>30<1872/1<>*312671<9<1-11208-746599*6/>+7>2163+141-8737*4932+7970*11892126>0*-+7+78>1",
        "buyer_name": "Buyer name recognized from the image",
        "buyer_id": "917107277650880665",
        "buyer_address": "XXXX",
        "buyer_bank": "Buyer bank recognized from the image",
        "seller_name": "Seller name recognized from the image",
        "seller_id": "9351099411892126",
        "seller_address": "XXXX",
        "seller_bank": "Seller bank recognized from the image",
        "subtotal_amount": "Subtotal amount recognized from the image",
        "subtotal_tax": "Subtotal tax recognized from the image",
        "total": "Total recognized from the image",
        "total_in_words": "Total in words recognized from the image",
        "remarks": "Remarks recognized from the image",
        "receiver": "XX",
        "reviewer": "XX",
        "issuer": "XX",
        "seller_seal": [
            "Content recognized from the seller's seal",
            "Content recognized from the seller's seal",
            "Content recognized from the seller's seal"
        ],
        "item_list": [
            {
                "name": "Name recognized from the image",
                "specification": "Specification recognized from the image",
                "unit": "Unit recognized from the image",
                "quantity": "300",
                "unit_price": "28.00",
                "license_plate_number": "",
                "vehicle_type": "",
                "start_date": "",
                "end_date": "",
                "amount": "8400.00",
                "tax_rate": "17%",
                "tax": "1428.00"
            }
        ],
        "confidence": {
            "type": 0.9960,
            "serial_number": 0.9652,
            "attribution": 0.9960,
            "supervision_seal": [
                0.9970,
                0.9945,
                0.9960
            ],
            "code": 0.99999,
            "check_code": 0.8430,
            "machine_number": 0.9070,
            "print_number": 0.0000,
            "number": 0.9856,
            "issue_date": 0.9848,
            "encryption_block": 0.9922,
            "buyer_name": 0.9854,
            "buyer_id": 0.9869,
            "buyer_address": 0.0000,
            "buyer_bank": 0.0000,
            "seller_name": 0.9883,
            "seller_id": 0.9914,
            "seller_address": 0.9952,
            "seller_bank": 0.9829,
            "subtotal_amount": 0.9533,
            "subtotal_tax": 0.9167,
            "total": 0.9444,
            "total_in_words": 0.9854,
            "remarks": 0.8762,
            "receiver": 0.9850,
            "reviewer": 0.9759,
            "issuer": 0.9872,
            "seller_seal": [
                0.9883,
                0.9914,
                0.9999
            ],
            "item_list": [
                {
                    "name": 0.9779,
                    "specification": 0.0000,
                    "unit": 0.0000,
                    "quantity": 0.0000,
                    "unit_price": 0.0000,
                    "license_plate_number": 0.0000,
                    "vehicle_type": 0.0000,
                    "start_date": 0.0000,
                    "end_date": 0.0000,
                    "amount": 0.8227,
                    "tax_rate": 0.5183,
                    "tax": 0.8394
                }
            ]
        }
    }
}
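The confidence object mirrors the layout of result, including the nested arrays and the item_list entries. One way to use it is to walk that structure and collect the fields whose confidence falls below a chosen threshold. This is only a sketch: low_confidence_fields is a hypothetical name and the 0.9 threshold is an arbitrary assumption, not a documented recommendation.

```python
def low_confidence_fields(confidence, threshold: float = 0.9, prefix: str = "") -> dict:
    """Return {field path: confidence} for every value below the threshold."""
    flagged = {}
    if isinstance(confidence, dict):
        for key, value in confidence.items():
            path = f"{prefix}.{key}" if prefix else key
            flagged.update(low_confidence_fields(value, threshold, path))
    elif isinstance(confidence, list):
        # Arrays (seals, item_list) are addressed by index, e.g. item_list[0].tax.
        for index, value in enumerate(confidence):
            flagged.update(low_confidence_fields(value, threshold, f"{prefix}[{index}]"))
    elif isinstance(confidence, (int, float)) and confidence < threshold:
        flagged[prefix] = confidence
    return flagged
```

In the example above, several entries have a confidence of 0.0000, and some of them correspond to empty strings in result. Comparing a flagged path with the recognized value helps separate absent fields from uncertain ones.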

Status code: 400

Failure response example

{
    "error_code": "AIS.0103",
    "error_msg": "The image size does not meet the requirements."
}

Status Codes

| Status Code | Description |
| --- | --- |
| 200 | Success response |
| 400 | Failure response |

For details about status codes, see Status Codes.

Error Codes

For details about error codes, see Error Codes.