- Source: Intelligent character recognition
Intelligent character recognition (ICR) is used to extract handwritten text from images. It is a more sophisticated type of OCR technology that recognizes different handwriting styles and fonts to intelligently interpret data on forms and physical documents.
These paper-based papers are scanned, the information is extracted, and the data is then digitally stored in a database program using ICR technology. The data is utilized for analytical reporting and is integrated with business processes. ICR technology is used by businesses to organize unstructured data and obtain current information from these reports. Users can rapidly read handwritten data on paper using ICR, then convert it to a digital format. ICR algorithms collaborate with OCR to automate data entry from forms by removing the need for keystrokes. It has a high degree of accuracy and is a dependable method for processing various papers quickly.
Capabilities
Most ICR software has a self-learning system referred to as a neural network, which automatically updates the recognition database for new handwriting patterns. It extends the usefulness of scanning devices for the purpose of document processing, from printed character recognition (a function of OCR) to hand-written matter recognition. Because this process is involved in recognizing hand writing, accuracy levels may, in some circumstances, not be very good but can achieve 97%+ accuracy rates in reading handwriting in structured forms. Often to achieve these high recognition rates several read engines are used within the software and each is given elective voting rights to determine the true reading of characters. In numeric fields, engines which are designed to read numbers take preference, while in alpha fields, engines designed to read hand written letters have higher elective rights. When used in conjunction with a bespoke interface hub, hand-written data can be automatically populated into a back office system avoiding laborious manual keying and can be more accurate than traditional human data entry.
= Automated Forms Processing
=An important development of ICR was the invention of Automated Forms Processing in 1993 by Joseph Corcoran who was awarded a patent on the invention. This involved a three-stage process of capturing the image of the form to be processed by ICR and preparing it to enable the ICR engine to give best results, then capturing the information using the ICR engine and finally processing the results to automatically validate the output from the ICR engine.
This application of ICR increased the usefulness of the technology and made it applicable for use with real world forms in normal business applications. Modern software applications use ICR as a technology of recognizing text in forms filled in by hand (hand-printed).
Differences between ICR and OCR
= OCR
=Optical character recognition (OCR) is commonly considered to apply to any recognition technique that reads machine printed text. An example of a traditional OCR use case would be to translate the characters from an image of a printed document, such as a book page, newspaper clipping, or legal contract, into a separate file that could be searched and updated with a word processor or document viewer. It's also quite helpful for automating the processing of forms. Information can be swiftly extracted from form fields and entered into another application, like a spreadsheet or database, by zonally applying the OCR engine to those fields.
Yet, data is typically manually input rather than typed into form fields. Character identification becomes even more challenging while reading handwritten material. The diversity of more than 700,000 printed font variants is tiny compared to the near unlimited variations in hand-printed characters. The recognition program must take into account not just stylistic differences but also the kind of writing implement used, the standard of the paper, errors, hand stability, and smudges or running ink.
= ICR
=Intelligent character recognition (ICR) makes use of continuously improving algorithms to collect more information about the variances in hand-printed characters and more precisely identify them. ICR, which was created in the early 1990s to aid in the automation of forms processing, enables the conversion of manually entered data into text that is simple to read, search for, and change. When used to read characters that are obviously divided into distinct areas or zones, such as fixed fields seen on many structured forms, it works best.
Both OCR and ICR can be configured to read a variety of languages; however, limiting the expected character set to a smaller number of languages will produce better recognition outcomes. ICR cannot read cursive handwriting since it must still be able to assess each character individually. While writing in cursive, it might be difficult to tell where one character ends and another one begins, and there are more differences across samples than when hand-printing text. A more recent method called intelligent word recognition (IWR) focuses on reading a word in context rather than recognizing individual characters.
Intelligent word recognition
Intelligent word recognition (IWR) can recognize and extract not only printed-handwritten information, cursive handwriting as well. ICR recognizes on the character-level, whereas IWR works with full words or phrases. Capable of capturing unstructured information from every day pages, IWR is said to be more evolved than hand print ICR.
Not meant to replace conventional ICR and OCR systems, IWR is optimized for processing real-world documents that contain mostly free-form, hard-to-recognize data fields that are inherently unsuitable for ICR. This means that the highest and best use of IWR is to eliminate a high percentage of the manual entry of handwritten data and run-on hand print fields on documents that otherwise could be keyed only by humans.
See also
Optical character recognition (OCR)
Document automation
Document layout analysis
Document modelling
Machine learning
Outsourced document processing
Text mining
Reference List
Kata Kunci Pencarian:
- Sekolah Teknik Elektro dan Informatika Institut Teknologi Bandung
- Pengenalan karakter optis
- Sensus Penduduk Indonesia 2010
- Richardus Eko Indrajit
- Mamalia
- Akselerator kecerdasan buatan
- AI-komplit
- Samsung SDS
- Intelligent character recognition
- Optical character recognition
- Handwriting recognition
- Intelligent word recognition
- ICR
- Turnaround document
- Document processing
- Word n-gram language model
- Enterprise content management
- Smart data capture