What Is OCR?

Optical Character Recognition (OCR) detects and extracts text from images and returns the recognition results as editable text in JSON format.

OCR provides open APIs that you can call from programming languages such as Python and Java to extract text from images. This lets you automate the collection of key data and build an intelligent service system that improves efficiency. For details about how to obtain APIs, see Optical Character Recognition API Reference.
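As a rough illustration of calling an OCR API from Python, the sketch below builds and sends an HTTP request that carries a base64-encoded image. The endpoint URL, the `image` field, and the `X-Auth-Token` header are assumptions made for this example, not values confirmed by this document; take the real ones from the API Reference.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint, for illustration only.
OCR_ENDPOINT = "https://ocr.example.com/v1/general-text"


def build_ocr_request(image_bytes: bytes, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a base64-encoded image."""
    # Assumed request body: a JSON object with a single "image" field.
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        OCR_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # assumed authentication header
        },
        method="POST",
    )


def recognize(image_bytes: bytes, token: str) -> dict:
    """Send the request and parse the JSON response body."""
    request = build_ocr_request(image_bytes, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the request without network access. A call might look like `recognize(open("invoice.png", "rb").read(), token)`, where the file name and token are placeholders.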

OCR also provides software development kits (SDKs) for multiple programming languages. For details about how to use SDKs, see the Optical Character Recognition SDK Reference.

Before You Start

You will need some basic programming skills. Familiarity with Java, Python, or Node.js, or with iOS or Android development, is recommended.

To use OCR, you need to call its APIs and then either transmit the JSON results to your service system or convert them to TXT or Excel format.
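The conversion step can be sketched as follows. The response layout used here (`result.words_block_list[].words`) is an assumption for illustration; check the actual field names in the API Reference. CSV is used as a spreadsheet format that Excel can open, because writing native Excel files would require a third-party library.

```python
import csv
import io
import json


def extract_lines(ocr_json: str) -> list[str]:
    """Return the recognized text lines from an OCR response string."""
    data = json.loads(ocr_json)
    # Assumed structure: {"result": {"words_block_list": [{"words": "..."}]}}
    blocks = data.get("result", {}).get("words_block_list", [])
    return [block["words"] for block in blocks if "words" in block]


def to_txt(lines: list[str]) -> str:
    """Join the recognized lines into plain text, one line per row."""
    return "\n".join(lines) + "\n"


def to_csv(lines: list[str]) -> str:
    """Render the lines as CSV with a header row and 1-based line numbers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["line_no", "text"])
    writer.writerows(enumerate(lines, start=1))
    return buffer.getvalue()
```

Both helpers return strings, so the caller decides where the output goes, for example `pathlib.Path("result.csv").write_text(to_csv(lines), encoding="utf-8")`.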

OCR Capabilities

General OCR
OCR automatically identifies text in images, including web images.
Card OCR
OCR automatically identifies information in images of certificates such as passports, ID cards, and driving licenses, and converts the information into editable text.

Using OCR for the First Time

If you are a first-time user, the following sections are a good place to start:

Function Description
Learn about the different OCR functions, including General OCR and Card OCR.
Getting Started
Learn how to use OCR by referring to Optical Character Recognition Getting Started.
Using OCR
If you are a developer who is more comfortable writing code, learn how to call OCR services by referring to Optical Character Recognition API Reference or Optical Character Recognition SDK Reference.
Progressive Knowledge
Learn how to get started using OCR.