Today, digital technologies have largely displaced traditional media and mass communication channels. Revenues for print media are declining every year as readers prefer spending time on digital alternatives. While outdoor advertising still uses large hoardings and billboards, digital signage is encroaching on that space too. Nonetheless, the world hasn’t become completely paperless.
Numerous business practices and operations still rely on the exchange of physical bills, receipts, invoices, agreements, NOCs and other printed documents. Legal and administrative departments in several countries use a vast amount of paper documentation. This creates considerable scope for counterfeiting and fraud. Paper documentation also depends on manual processes, which makes both storage and processing inefficient.
In this article, we will explore how Alibaba Cloud’s OCR technology helps solve these challenges. We will also look at other major application scenarios for OCR technology.
Optical Character Recognition (OCR) is a field of science closely associated with pattern recognition, artificial intelligence, and computer vision. The technology enables conversion of non-digital forms of text into digital versions for easy storage and processing by computers. OCR helps machines in recognizing text from printed and handwritten documents, text on billboards, screens, or any other surface.
Today, several industries use OCR applications to improve operational workflows and digitize data. Some of the common applications are:
● Invoice imaging
● Legal documents imaging
● Imaging of banking and financial documents
● Healthcare documentation imaging
● Automatic number plate recognition (ANPR)
By definition, OCR might not sound very complicated. It simply allows computers to read and interpret text. However, to accomplish this seemingly simple task, computers need special capabilities. First of all, they need to convert printed text into an image by using a camera or an optical scanner. To an ordinary computer, this image is still just a graphics file, which at best can be enhanced with software like Photoshop. It is OCR that allows computers to convert the pixels in these images into machine-readable text.
The main challenge for OCR is to identify letters, characters, and symbols in various fonts and styles. Differences in handwriting add another level of complexity to this challenge. There are two approaches to resolving this challenge:
● Pattern Recognition: Recognizing the complete character
● Feature Extraction: Breaking down characters into individual lines and strokes, and using these "features" to identify characters
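The first approach can be illustrated with a toy template matcher. This is only a minimal sketch, not how any production OCR engine works: the 5x5 glyphs, the `TEMPLATES` table, and the function names are all made up for illustration. Each unknown character is compared pixel by pixel with every stored template, and the template with the most agreeing pixels wins.

```python
# Toy pattern recognition: compare a whole character bitmap against templates.
# '#' is ink, '.' is background. The glyphs below are hypothetical examples.
TEMPLATES = {
    "L": ["#....",
          "#....",
          "#....",
          "#....",
          "#####"],
    "T": ["#####",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
    "I": ["..#..",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
}


def match_score(glyph, template):
    """Count the pixels on which the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the label of the template with the highest pixel agreement."""
    return max(TEMPLATES, key=lambda label: match_score(glyph, TEMPLATES[label]))


# A "T" with one stray ink pixel, as a scanner might produce.
noisy_t = ["#####",
           "..#..",
           "..#..",
           ".##..",
           "..#.."]

print(recognize(noisy_t))
```

Because the whole bitmap is compared at once, a matcher like this tolerates a little noise but breaks down as soon as the font or handwriting differs noticeably from the stored templates.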
The origins of pattern recognition go back to the 1960s when banks had to process a lot of checks on a daily basis. To develop a system that could help automate this processing, computer scientists proposed a universal font called OCR-A. The font used exactly the same width and clear spacing for every character in check numbers. It soon became a standard, and banks started issuing newly printed checkbooks to their customers. They installed OCR equipment that could easily read the check numbers, which made check processing a lot faster for banks.
However, OCR-A didn’t become popular outside the banking industry. In later years, developers improved OCR systems and made them capable of recognizing a wider variety of fonts. Even so, the pattern recognition approach had its limitations. For instance, it could not reliably read handwritten text.
Feature Extraction, also known as intelligent character recognition, relies on detecting every character’s salient features such as curves, slanted lines, vertical lines, and horizontal strokes. For instance, the letter V in English is made up of two slanted lines that converge at the bottom. By identifying letters in this manner, Feature Extraction enables computers to recognize characters irrespective of their fonts and styles.
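The V example can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the method used by Alibaba Cloud or any particular OCR product: it assumes the strokes have already been traced into straight segments, and `stroke_kind`, `lowest_point`, `looks_like_v`, and the tolerance values are hypothetical names and settings chosen for this sketch.

```python
import math


def stroke_kind(stroke, tolerance_deg=15):
    """Classify a stroke ((x1, y1), (x2, y2)) as horizontal, vertical, or slanted."""
    (x1, y1), (x2, y2) = stroke
    # Fold the direction into [0, 180) so the endpoint order does not matter.
    angle = abs(math.degrees(math.atan2(y2 - y1, x2 - x1))) % 180
    if angle < tolerance_deg or angle > 180 - tolerance_deg:
        return "horizontal"
    if abs(angle - 90) < tolerance_deg:
        return "vertical"
    return "slanted"


def lowest_point(stroke):
    """Return the endpoint with the largest y (image coordinates: y grows downward)."""
    return max(stroke, key=lambda point: point[1])


def looks_like_v(strokes, join_tolerance=1.0):
    """True if there are exactly two slanted strokes whose lowest endpoints meet."""
    if len(strokes) != 2 or any(stroke_kind(s) != "slanted" for s in strokes):
        return False
    (ax, ay), (bx, by) = lowest_point(strokes[0]), lowest_point(strokes[1])
    return math.hypot(ax - bx, ay - by) <= join_tolerance


narrow_v = [((0, 0), (4, 10)), ((8, 0), (4, 10))]
wide_v = [((0, 0), (10, 10)), ((20, 0), (10, 10))]
inverted_v = [((5, 0), (0, 10)), ((5, 0), (10, 10))]

print(looks_like_v(narrow_v), looks_like_v(wide_v), looks_like_v(inverted_v))
```

The narrow and wide shapes are both accepted because the rule looks only at stroke types and where they join, not at exact pixel positions, which is why this approach copes with different fonts and styles.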
Alibaba Cloud OCR uses methods similar to Feature Extraction for recognition of Chinese language characters. It also uses Artificial Intelligence tools to automatically extract patterns from a field of view in a manner that is similar to what humans use in reading.
In addition to static images, Alibaba Cloud OCR can also identify textual content (subtitles) in a video. These abilities made it possible for Alibaba Cloud OCR to win the Global ICDAR Robust Reading competition. The competition challenges participants to face a wide range of real-world situations and display their OCR capabilities.
OCR is just one of the imaging capabilities offered by Alibaba Cloud. It closely integrates with image search which is capable of offline indexing and online querying of billions of photos. Businesses can use semantic search for cross-media retrieval. By integrating OCR with these imaging technologies, businesses can develop numerous applications for their specific business requirements.
For instance, businesses can utilize these technologies to detect the use of their brand image, logo, taglines, etc. by unauthorized parties in real time. These intelligent systems can scan offline (physical goods) and online (websites) properties to make sure that the brand is safe from copyright infringement.
Similarly, law enforcement agencies can combine OCR technology with other video analysis tools to identify license plates on moving vehicles. The intelligent system can use this information to locate a stolen vehicle in city traffic. We discuss more of these applications below.
Alibaba Cloud OCR can identify a variety of IDs such as driver's licenses, government-issued identity cards, bank cards, and business licenses. The system scores better than its human counterparts in collecting information from these cards. It can easily identify driving licenses and collect information such as the license holder’s name, ID number, vehicle type and license validity period. Its business license identification service extracts information such as the registration number, company name, and address.
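Once the text on a card has been recognized, turning it into named fields is largely a parsing step. The sketch below shows one simple way this could be done. It is not Alibaba Cloud's API: the label wording (`Name:`, `ID No:`, `Class:`, `Valid until:`), the `FIELD_PATTERNS` table, and the sample card text are assumptions invented for this example.

```python
import re

# Hypothetical label formats; real cards and real OCR output will differ.
FIELD_PATTERNS = {
    "name": re.compile(r"^Name:[ \t]*(.+)$", re.MULTILINE),
    "id_number": re.compile(r"^ID No\.?:[ \t]*([A-Z0-9]+)$", re.MULTILINE),
    "vehicle_type": re.compile(r"^Class:[ \t]*(\w+)$", re.MULTILINE),
    "valid_until": re.compile(r"^Valid until:[ \t]*(\d{4}-\d{2}-\d{2})$", re.MULTILINE),
}


def extract_fields(ocr_text):
    """Map recognized text lines to named fields; missing fields become None."""
    fields = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[field] = match.group(1).strip() if match else None
    return fields


# Made-up OCR output for a fictional license.
sample = """Name: Jane Doe
ID No: X1234567
Class: C1
Valid until: 2027-05-31"""

print(extract_fields(sample))
```

Returning `None` for a missing field, rather than raising an error, lets a downstream workflow route incomplete cards to manual review.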
Education, banking, finance, insurance, hospitality and several other industries can use Alibaba Cloud OCR to process their paperwork. For instance, healthcare professionals have to deal with a lot of documentation, including patients’ past medical histories, prescriptions, and insurance forms. Manual errors in reading a doctor’s handwriting, dosage instructions, and other medical information are a common cause of medical mistakes. Jiande Hangzhou’s First People’s Hospital in China faced similar issues in the management of its medical services.
The hospital utilized OCR, image recognition, and other ET Medical Brain capabilities to improve its medical records management. The system performed multiple checks to ensure that the patients received the right dosage of medicines. It also improved diagnosis of diseases.
With this system, the hospital could securely digitize all its data. In practice, this meant that healthcare professionals could retrieve, share, and analyze patient information anytime, and from anywhere. With fewer medical errors and an overall improvement in medical services, the hospital received HIMSS6 certification. In fact, this was an industry first for a hospital in China.
Counterfeiting is prevalent across all regions, and often online retailers have to bear the brunt. As most of these companies operate on a marketplace model, they can easily shrug off responsibility by claiming to be intermediaries. However, in the long term, this behavior can severely damage their brand and customers could favor buying from those who offer some guarantee.
Alibaba Group also faced similar issues a few years ago when brands like Tiffany, Michael Kors, and Gucci accused it of selling counterfeit products on its platform. To crack down on counterfeiting, Alibaba used OCR, machine learning, and big data models. The system analyzed images of the products listed on its portal and compared them with the product descriptions on the website to find discrepancies.
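The idea of checking image text against a listing's description can be sketched as follows. This is a deliberately simple illustration, not Alibaba's actual system, which the article describes only at a high level. It assumes the text in the product image has already been recognized, and the `KNOWN_BRANDS` list and function names are hypothetical.

```python
import re

# Illustrative brand list only; a real system would use a much larger catalog.
KNOWN_BRANDS = {"tiffany", "gucci", "michael kors"}


def brands_in(text):
    """Return the known brand names mentioned in the text, ignoring case and punctuation."""
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    return {
        brand
        for brand in KNOWN_BRANDS
        if re.search(rf"\b{re.escape(brand)}\b", normalized)
    }


def find_discrepancies(image_text, description):
    """Compare brands seen in the product image with brands named in the listing."""
    in_image = brands_in(image_text)
    in_description = brands_in(description)
    return {
        "only_in_image": in_image - in_description,
        "only_in_description": in_description - in_image,
    }


report = find_discrepancies(
    image_text="GUCCI  Made in Italy",          # text recognized in the photo
    description="Stylish leather handbag",       # seller's written description
)
print(report)
```

A brand that appears in the photo but not in the description, or the reverse, is only a signal for further review, not proof of counterfeiting.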
On identifying a suspicious product, Alibaba analyzes the seller's history and reputation using big data. These details include the seller's past transactions, shipping and return addresses, and any linked accounts. Together, they help the system identify large-scale counterfeiting activity. This system helped Alibaba successfully remove 120 million suspicious items from its e-commerce platform.
With the ongoing developments in OCR and other artificial intelligence technologies, the line between the physical and virtual worlds is blurring fast. OCR is among the technologies that are helping computers extract real-world information in an automated manner. As part of ET Brain, Alibaba Cloud OCR integrates easily with ET Brain's other information gathering and analysis functions. This makes it easier to leverage a broad range of AI capabilities with Alibaba Cloud, and it makes Alibaba Cloud OCR a strong contender in the field of OCR systems.
At the time of writing, Alibaba Cloud OCR is not yet available to the international market. To learn more about other Alibaba Cloud ET Brain solutions, visit www.alibabacloud.com/et.