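Handwriting Recognition
Constraints and Limitations
- Only images in PNG, JPG, JPEG, BMP, or TIFF format are supported.
- Each side of the image must be between 15 px and 8192 px.
- The recognition area must occupy more than 80% of the image, and all text and its edges must lie within the image.
- Images rotated by any angle in the image plane are supported (direction detection must be enabled).
- Text recognition is currently not supported for images with complex backgrounds (such as outdoor natural scenes or anti-counterfeiting watermarks) or with distorted table lines.
- The neater the handwriting, the higher the recognition rate.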
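The format and size limits above can be checked locally before calling the API. The following is a minimal sketch, not part of the OCR service or its SDK: the helper name `check_image_constraints` is hypothetical, and the caller is assumed to already know the file extension and pixel dimensions of the image.

```python
# Hypothetical pre-flight check; not part of the OCR API or SDK.
SUPPORTED_FORMATS = {"png", "jpg", "jpeg", "bmp", "tiff"}
MIN_SIDE_PX = 15
MAX_SIDE_PX = 8192


def check_image_constraints(extension, width, height):
    """Return a list of constraint violations (empty if there are none)."""
    problems = []
    if extension.lower().lstrip(".") not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {extension}")
    for name, side in (("width", width), ("height", height)):
        if not MIN_SIDE_PX <= side <= MAX_SIDE_PX:
            problems.append(
                f"{name} {side}px is outside {MIN_SIDE_PX}-{MAX_SIDE_PX}px"
            )
    return problems
```

An empty list means the image passes these two checks. The 80% effective-area rule and the background restrictions depend on image content and are not covered by this sketch.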
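Calling Method
See Calling APIs.
Prerequisites
Before using this API, apply for the service and complete authentication. For details, see the Enabling the Service and Authentication sections.
First-time users must apply to enable the service. The service only needs to be enabled once; no further application is needed afterwards. If the service has not been enabled, calls fail with the error ModelArts.4204. Enable the service on the console before calling it, and make sure the region where the service is enabled is the same as the region where it is called.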
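A client can recognize the "service not enabled" case and point the user to the console instead of retrying. The sketch below assumes that the ModelArts.4204 error arrives in a body with an `error_code` field, like the 400 response described in this document; the function name is hypothetical.

```python
# Error code named in the prerequisites above.
SERVICE_NOT_ENABLED = "ModelArts.4204"


def is_service_not_enabled(error_body):
    """Return True if the error body reports that the service is not enabled.

    Assumption: error_body is a dict with an "error_code" key, shaped like
    the 400 response body documented for this API.
    """
    return error_body.get("error_code") == SERVICE_NOT_ENABLED
```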
URI
POST /v2/{project_id}/ocr/handwriting
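Request Parameters
Request header parameters

Parameter | Mandatory | Type | Description
---|---|---|---
X-Auth-Token | Yes | String | User token, used to obtain permission to call the API. The token is the value of X-Subject-Token in the response header of the API for obtaining a token.
Content-Type | Yes | String | MIME type of the request body. The value is "application/json".
Enterprise-Project-Id | No | String | Enterprise project ID. OCR supports cost allocation through Enterprise Project Management (EPS) for resources used by different user groups and users. To obtain the ID, go to the "Enterprise Project Management" page, click the enterprise project name, and find Enterprise-Project-Id (the enterprise project ID) on the project details page. For how to create an enterprise project, see the user guide. Note: After an enterprise project is created, there are three scenarios when passing this parameter.

Request body parameters

Parameter | Mandatory | Type | Description
---|---|---|---
image | No | String | Specify either this parameter or url. Base64-encoded image. The Base64-encoded data must not exceed 10 MB. The shortest side of the image must be at least 15 px and the longest side at most 8192 px. JPEG, JPG, PNG, BMP, and TIFF formats are supported. An example of a Base64-encoded image is /9j/4AAQSkZJRgABAg.... An extra prefix causes the error "The image format is not supported".
url | No | String | Specify either this parameter or image. URL of the image; only specific URL types are currently supported.
quick_mode | No | Boolean | Quick mode switch for single-line text images (the image must contain only one line of text, and the text area must occupy more than 50% of the image). When enabled, the recognition result is returned faster. Defaults to false (quick mode disabled) when omitted.
char_set | No | String | Character set setting, which limits the range of output characters as needed. Defaults to "general" when omitted.
detect_direction | No | Boolean | Whether to correct the skew angle of the image. Correction of any angle is supported. Defaults to "false" when omitted. If the image to be recognized may be skewed, set this parameter to "true".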
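The body parameters can be assembled as follows. This is a sketch under stated assumptions rather than an official helper: the function name `build_request_body` is hypothetical, and the 10 MB limit is interpreted here as 10 × 1024 × 1024 characters of Base64 text, which the source does not specify.

```python
import base64

# Assumption: "10 MB" is taken as 10 * 1024 * 1024 Base64 characters.
MAX_BASE64_LENGTH = 10 * 1024 * 1024


def build_request_body(image_bytes=None, url=None, quick_mode=False,
                       char_set="general", detect_direction=False):
    """Build the JSON body; exactly one of image_bytes or url is required."""
    if (image_bytes is None) == (url is None):
        raise ValueError("provide exactly one of image_bytes or url")
    body = {
        "quick_mode": quick_mode,
        "char_set": char_set,
        "detect_direction": detect_direction,
    }
    if url is not None:
        body["url"] = url
        return body
    # b64encode yields the bare encoding, with no "data:image/..." prefix.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    if len(encoded) > MAX_BASE64_LENGTH:
        raise ValueError("Base64-encoded image exceeds 10 MB")
    body["image"] = encoded
    return body
```

Passing the raw file bytes keeps the encoding free of any extra prefix, which the table above says triggers "The image format is not supported".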
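Response Parameters
Different HTTP status codes may be returned depending on the recognition result. For example, 200 indicates that the API call succeeded and 400 indicates that it failed. The status codes and response parameters are described below.
Status code: 200

Parameter | Type | Description
---|---|---
result | HandwritingResult object | Recognition result. Not returned when the call fails.

HandwritingResult

Parameter | Type | Description
---|---|---
words_block_count | Integer | Number of detected text blocks.
words_block_list | Array of HandwritingItemsResponse objects | List of recognized text blocks, ordered from left to right and from top to bottom.

HandwritingItemsResponse

Parameter | Type | Description
---|---|---
words | String | Recognition result of the text block.
type | String | Type of the recognition result. The returned value is "text".
confidence | Float | Confidence of the field, ranging from 0 to 1. A higher confidence means the recognized field is more reliable; statistically, higher confidence corresponds to higher accuracy. The confidence is produced by the algorithm and is not directly equivalent to the accuracy of the field.
location | Array<Array<Integer>> | Position of the text block "words", given as a list of the (x, y) coordinates of the block's vertices. The image coordinate system is used: the origin is the top-left corner of the image, the x-axis runs horizontally, and the y-axis runs vertically.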
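Because `location` lists vertex coordinates rather than a rectangle, a caller that needs an axis-aligned box (for cropping or drawing) has to derive it. A minimal sketch follows; the function name `bounding_box` is hypothetical.

```python
def bounding_box(location):
    """Return (left, top, right, bottom) enclosing the given vertices.

    location is a list of [x, y] pairs in image coordinates, with the
    origin at the top-left corner of the image.
    """
    xs = [point[0] for point in location]
    ys = [point[1] for point in location]
    return min(xs), min(ys), max(xs), max(ys)
```

Taking the minimum and maximum over all vertices works even when the block is slightly rotated, at the cost of a box that is a little larger than the text itself.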
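Status code: 400

Parameter | Type | Description
---|---|---
error_code | String | Error code returned when the call fails. For details, see Error Codes. Not returned when the call succeeds.
error_msg | String | Error message returned when the call fails. Not returned when the call succeeds.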
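Since `result` and the `error_code`/`error_msg` pair are mutually exclusive, a caller can turn a failed response into an exception in one place. The sketch below illustrates that pattern; `OcrRequestError` and `unwrap_result` are hypothetical names, not SDK classes.

```python
class OcrRequestError(Exception):
    """Hypothetical exception carrying the fields of a failed response."""

    def __init__(self, status_code, error_code, error_msg):
        super().__init__(f"HTTP {status_code} {error_code}: {error_msg}")
        self.status_code = status_code
        self.error_code = error_code
        self.error_msg = error_msg


def unwrap_result(status_code, payload):
    """Return payload["result"] on success, otherwise raise OcrRequestError."""
    if status_code == 200 and "result" in payload:
        return payload["result"]
    raise OcrRequestError(
        status_code, payload.get("error_code"), payload.get("error_msg")
    )
```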
Request Examples
- Recognize a handwritten-text image passed as a Base64 string. The recognition range covers digits, letters, and Chinese characters; quick mode is disabled and image skew is not corrected.

```
POST https://{endpoint}/v2/{project_id}/ocr/handwriting

Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:
{
    "image": "/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA...",
    "quick_mode": false,
    "char_set": "general",
    "detect_direction": false
}
```

- Recognize a handwritten-text image passed as a URL. The recognition range covers digits, letters, and Chinese characters; quick mode is disabled and image skew is not corrected.

```
POST https://{endpoint}/v2/{project_id}/ocr/handwriting

Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:
{
    "url": "https://BucketName.obs.xxxx.com/ObjectName",
    "quick_mode": false,
    "char_set": "general",
    "detect_direction": false
}
```
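The raw requests above can also be issued without the SDK. The following sketch builds such a request with Python's standard `urllib.request`; the function name `make_request` and the values passed in the usage note are placeholders, and a valid token is assumed to have been obtained beforehand.

```python
import json
import urllib.request


def make_request(endpoint, project_id, token, body):
    """Build (but do not send) the POST request for handwriting recognition."""
    url = f"https://{endpoint}/v2/{project_id}/ocr/handwriting"
    headers = {
        "Content-Type": "application/json",
        "X-Auth-Token": token,
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")
```

Calling `urllib.request.urlopen(make_request(...))` sends the request. A 400 response surfaces as `urllib.error.HTTPError`, whose body holds the `error_code` and `error_msg` fields.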
Response Examples
Status code: 200
Example of a successful response

```
{
    "result": {
        "words_block_count": 2,
        "words_block_list": [
            {
                "words": "大江东去",
                "type": "text",
                "confidence": 0.98,
                "location": [[282, 45], [461, 47], [460, 77], [280, 76]]
            },
            {
                "words": "浪淘尽",
                "type": "text",
                "confidence": 0.99,
                "location": [[949, 52], [1095, 53], [1100, 87], [953, 86]]
            }
        ]
    }
}
```

Status code: 400
Example of a failed response

```
{
    "error_code": "AIS.0103",
    "error_msg": "The image size does not meet the requirements."
}
```
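A common next step is to keep only the blocks whose confidence is high enough and join them into plain text. The sketch below does this on the successful response shown above; the function name and the 0.9 default threshold are illustrative choices, not values recommended by the service.

```python
# The successful response from the example above.
SAMPLE_RESPONSE = {
    "result": {
        "words_block_count": 2,
        "words_block_list": [
            {"words": "大江东去", "type": "text", "confidence": 0.98,
             "location": [[282, 45], [461, 47], [460, 77], [280, 76]]},
            {"words": "浪淘尽", "type": "text", "confidence": 0.99,
             "location": [[949, 52], [1095, 53], [1100, 87], [953, 86]]},
        ],
    }
}


def join_confident_text(response, min_confidence=0.9, separator="\n"):
    """Join the words of all blocks whose confidence meets the threshold."""
    blocks = response.get("result", {}).get("words_block_list", [])
    return separator.join(
        block["words"] for block in blocks
        if block["confidence"] >= min_confidence
    )
```

Blocks are joined in the order returned by the service, which the response parameter table describes as left to right and top to bottom.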
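SDK Code Examples
The SDK code examples are as follows.
Before using the SDK, update it to the latest version so that an outdated local SDK does not lack the newest OCR features.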
- Java: recognize a handwritten-text image passed as a Base64 string. The recognition range covers digits, letters, and Chinese characters; quick mode is disabled and image skew is not corrected.

```java
package com.huaweicloud.sdk.test;

import com.huaweicloud.sdk.core.auth.ICredential;
import com.huaweicloud.sdk.core.auth.BasicCredentials;
import com.huaweicloud.sdk.core.exception.ConnectionException;
import com.huaweicloud.sdk.core.exception.RequestTimeoutException;
import com.huaweicloud.sdk.core.exception.ServiceResponseException;
import com.huaweicloud.sdk.ocr.v1.region.OcrRegion;
import com.huaweicloud.sdk.ocr.v1.*;
import com.huaweicloud.sdk.ocr.v1.model.*;

public class RecognizeHandwritingSolution {

    public static void main(String[] args) {
        // The AK and SK used for authentication are hard-coded or stored in plaintext, which has great security risks. It is recommended that the AK and SK be stored in ciphertext in configuration files or environment variables and decrypted during use to ensure security.
        // In this example, AK and SK are stored in environment variables for authentication. Before running this example, set environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment
        String ak = System.getenv("CLOUD_SDK_AK");
        String sk = System.getenv("CLOUD_SDK_SK");

        ICredential auth = new BasicCredentials()
                .withAk(ak)
                .withSk(sk);

        OcrClient client = OcrClient.newBuilder()
                .withCredential(auth)
                .withRegion(OcrRegion.valueOf("<YOUR REGION>"))
                .build();
        RecognizeHandwritingRequest request = new RecognizeHandwritingRequest();
        HandwritingRequestBody body = new HandwritingRequestBody();
        body.withDetectDirection(false);
        body.withCharSet("general");
        body.withQuickMode(false);
        body.withImage("/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA...");
        request.withBody(body);
        try {
            RecognizeHandwritingResponse response = client.recognizeHandwriting(request);
            System.out.println(response.toString());
        } catch (ConnectionException e) {
            e.printStackTrace();
        } catch (RequestTimeoutException e) {
            e.printStackTrace();
        } catch (ServiceResponseException e) {
            e.printStackTrace();
            System.out.println(e.getHttpStatusCode());
            System.out.println(e.getRequestId());
            System.out.println(e.getErrorCode());
            System.out.println(e.getErrorMsg());
        }
    }
}
```
- Java: recognize a handwritten-text image passed as a URL. The recognition range covers digits, letters, and Chinese characters; quick mode is disabled and image skew is not corrected.

```java
package com.huaweicloud.sdk.test;

import com.huaweicloud.sdk.core.auth.ICredential;
import com.huaweicloud.sdk.core.auth.BasicCredentials;
import com.huaweicloud.sdk.core.exception.ConnectionException;
import com.huaweicloud.sdk.core.exception.RequestTimeoutException;
import com.huaweicloud.sdk.core.exception.ServiceResponseException;
import com.huaweicloud.sdk.ocr.v1.region.OcrRegion;
import com.huaweicloud.sdk.ocr.v1.*;
import com.huaweicloud.sdk.ocr.v1.model.*;

public class RecognizeHandwritingSolution {

    public static void main(String[] args) {
        // The AK and SK used for authentication are hard-coded or stored in plaintext, which has great security risks. It is recommended that the AK and SK be stored in ciphertext in configuration files or environment variables and decrypted during use to ensure security.
        // In this example, AK and SK are stored in environment variables for authentication. Before running this example, set environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment
        String ak = System.getenv("CLOUD_SDK_AK");
        String sk = System.getenv("CLOUD_SDK_SK");

        ICredential auth = new BasicCredentials()
                .withAk(ak)
                .withSk(sk);

        OcrClient client = OcrClient.newBuilder()
                .withCredential(auth)
                .withRegion(OcrRegion.valueOf("<YOUR REGION>"))
                .build();
        RecognizeHandwritingRequest request = new RecognizeHandwritingRequest();
        HandwritingRequestBody body = new HandwritingRequestBody();
        body.withDetectDirection(false);
        body.withCharSet("general");
        body.withQuickMode(false);
        body.withUrl("https://BucketName.obs.myhuaweicloud.com/ObjectName");
        request.withBody(body);
        try {
            RecognizeHandwritingResponse response = client.recognizeHandwriting(request);
            System.out.println(response.toString());
        } catch (ConnectionException e) {
            e.printStackTrace();
        } catch (RequestTimeoutException e) {
            e.printStackTrace();
        } catch (ServiceResponseException e) {
            e.printStackTrace();
            System.out.println(e.getHttpStatusCode());
            System.out.println(e.getRequestId());
            System.out.println(e.getErrorCode());
            System.out.println(e.getErrorMsg());
        }
    }
}
```
- Python: recognize a handwritten-text image passed as a Base64 string. The recognition range covers digits, letters, and Chinese characters; quick mode is disabled and image skew is not corrected.

```python
# coding: utf-8

import os

from huaweicloudsdkcore.auth.credentials import BasicCredentials
from huaweicloudsdkocr.v1.region.ocr_region import OcrRegion
from huaweicloudsdkcore.exceptions import exceptions
from huaweicloudsdkocr.v1 import *

if __name__ == "__main__":
    # The AK and SK used for authentication are hard-coded or stored in plaintext, which has great security risks. It is recommended that the AK and SK be stored in ciphertext in configuration files or environment variables and decrypted during use to ensure security.
    # In this example, AK and SK are stored in environment variables for authentication. Before running this example, set environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment
    ak = os.getenv("CLOUD_SDK_AK")
    sk = os.getenv("CLOUD_SDK_SK")

    credentials = BasicCredentials(ak, sk)

    client = OcrClient.new_builder() \
        .with_credentials(credentials) \
        .with_region(OcrRegion.value_of("<YOUR REGION>")) \
        .build()

    try:
        request = RecognizeHandwritingRequest()
        request.body = HandwritingRequestBody(
            detect_direction=False,
            char_set="general",
            quick_mode=False,
            image="/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA..."
        )
        response = client.recognize_handwriting(request)
        print(response)
    except exceptions.ClientRequestException as e:
        print(e.status_code)
        print(e.request_id)
        print(e.error_code)
        print(e.error_msg)
```
- Python: recognize a handwritten-text image passed as a URL. The recognition range covers digits, letters, and Chinese characters; quick mode is disabled and image skew is not corrected.

```python
# coding: utf-8

import os

from huaweicloudsdkcore.auth.credentials import BasicCredentials
from huaweicloudsdkocr.v1.region.ocr_region import OcrRegion
from huaweicloudsdkcore.exceptions import exceptions
from huaweicloudsdkocr.v1 import *

if __name__ == "__main__":
    # The AK and SK used for authentication are hard-coded or stored in plaintext, which has great security risks. It is recommended that the AK and SK be stored in ciphertext in configuration files or environment variables and decrypted during use to ensure security.
    # In this example, AK and SK are stored in environment variables for authentication. Before running this example, set environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment
    ak = os.getenv("CLOUD_SDK_AK")
    sk = os.getenv("CLOUD_SDK_SK")

    credentials = BasicCredentials(ak, sk)

    client = OcrClient.new_builder() \
        .with_credentials(credentials) \
        .with_region(OcrRegion.value_of("<YOUR REGION>")) \
        .build()

    try:
        request = RecognizeHandwritingRequest()
        request.body = HandwritingRequestBody(
            detect_direction=False,
            char_set="general",
            quick_mode=False,
            url="https://BucketName.obs.myhuaweicloud.com/ObjectName"
        )
        response = client.recognize_handwriting(request)
        print(response)
    except exceptions.ClientRequestException as e:
        print(e.status_code)
        print(e.request_id)
        print(e.error_code)
        print(e.error_msg)
```
- Go: recognize a handwritten-text image passed as a Base64 string. The recognition range covers digits, letters, and Chinese characters; quick mode is disabled and image skew is not corrected.

```go
package main

import (
	"fmt"
	"os"

	"github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
	ocr "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1"
	"github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/model"
	region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/region"
)

func main() {
	// The AK and SK used for authentication are hard-coded or stored in plaintext, which has great security risks. It is recommended that the AK and SK be stored in ciphertext in configuration files or environment variables and decrypted during use to ensure security.
	// In this example, AK and SK are stored in environment variables for authentication. Before running this example, set environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment
	ak := os.Getenv("CLOUD_SDK_AK")
	sk := os.Getenv("CLOUD_SDK_SK")

	auth := basic.NewCredentialsBuilder().
		WithAk(ak).
		WithSk(sk).
		Build()

	client := ocr.NewOcrClient(
		ocr.OcrClientBuilder().
			WithRegion(region.ValueOf("<YOUR REGION>")).
			WithCredential(auth).
			Build())

	request := &model.RecognizeHandwritingRequest{}
	detectDirectionHandwritingRequestBody := false
	charSetHandwritingRequestBody := "general"
	quickModeHandwritingRequestBody := false
	imageHandwritingRequestBody := "/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA..."
	request.Body = &model.HandwritingRequestBody{
		DetectDirection: &detectDirectionHandwritingRequestBody,
		CharSet:         &charSetHandwritingRequestBody,
		QuickMode:       &quickModeHandwritingRequestBody,
		Image:           &imageHandwritingRequestBody,
	}
	response, err := client.RecognizeHandwriting(request)
	if err == nil {
		fmt.Printf("%+v\n", response)
	} else {
		fmt.Println(err)
	}
}
```
- Go: recognize a handwritten-text image passed as a URL. The recognition range covers digits, letters, and Chinese characters; quick mode is disabled and image skew is not corrected.

```go
package main

import (
	"fmt"
	"os"

	"github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
	ocr "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1"
	"github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/model"
	region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ocr/v1/region"
)

func main() {
	// The AK and SK used for authentication are hard-coded or stored in plaintext, which has great security risks. It is recommended that the AK and SK be stored in ciphertext in configuration files or environment variables and decrypted during use to ensure security.
	// In this example, AK and SK are stored in environment variables for authentication. Before running this example, set environment variables CLOUD_SDK_AK and CLOUD_SDK_SK in the local environment
	ak := os.Getenv("CLOUD_SDK_AK")
	sk := os.Getenv("CLOUD_SDK_SK")

	auth := basic.NewCredentialsBuilder().
		WithAk(ak).
		WithSk(sk).
		Build()

	client := ocr.NewOcrClient(
		ocr.OcrClientBuilder().
			WithRegion(region.ValueOf("<YOUR REGION>")).
			WithCredential(auth).
			Build())

	request := &model.RecognizeHandwritingRequest{}
	detectDirectionHandwritingRequestBody := false
	charSetHandwritingRequestBody := "general"
	quickModeHandwritingRequestBody := false
	urlHandwritingRequestBody := "https://BucketName.obs.myhuaweicloud.com/ObjectName"
	request.Body = &model.HandwritingRequestBody{
		DetectDirection: &detectDirectionHandwritingRequestBody,
		CharSet:         &charSetHandwritingRequestBody,
		QuickMode:       &quickModeHandwritingRequestBody,
		Url:             &urlHandwritingRequestBody,
	}
	response, err := client.RecognizeHandwriting(request)
	if err == nil {
		fmt.Printf("%+v\n", response)
	} else {
		fmt.Println(err)
	}
}
```
For SDK code examples in other programming languages, see the code example tab of API Explorer, which automatically generates the corresponding SDK code.
Status Codes
Status Code | Description
---|---
200 | Successful response.
400 | Failed response.

For details about status codes, see Status Codes.
Error Codes
For details about error codes, see Error Codes.