821 |
Classifier combination for optical character recognition systems utilizing normalized weights and samples of characters |
US13659289 |
2012-10-24 |
US08548259B2 |
2013-10-01 |
Diar Tuganbaev |
Techniques and methods are disclosed herein for combining and weighting values from, and associated with, classifiers. Classifiers are used to recognize characters as part of an optical character recognition (OCR) system. Various methods of normalization facilitate combining the results of classifiers. For example, weight values may be entered into a weight table having two columns: one includes weights from comparing patterns with images of correct characters, and the other includes weights from comparing patterns with images of incorrect characters. |
822 |
Removing inserted text from an image using extrapolation for replacement pixels after optical character recognition |
US12250852 |
2008-10-14 |
US08457448B2 |
2013-06-04 |
Gerold K. Shelton; Curtis Gold; Michael J. Shelton |
A method of removing inserted text from a digital image includes recognizing the inserted text in the digital image using optical character recognition; and replacing pixels of the digital image corresponding to the inserted text so as to remove the inserted text from the digital image. A computer program product for removing inserted text from a digital image includes an inserted text removal program stored on a computer-readable medium, the program including an optical character recognition module for recognizing inserted text in a digital image; and an extrapolation module for replacing pixels corresponding to the inserted text in the digital image with replacement image pixels so as to remove the inserted text from the digital image. A photo printing kiosk includes an interface for receiving a digital image; an optical character recognition module for recognizing inserted text in the digital image; and an extrapolation module for replacing pixels corresponding to the inserted text in the digital image with replacement image pixels so as to remove the inserted text from the digital image. |
823 |
Efficient identification and correction of optical character recognition errors through learning in a multi-engine environment |
US12357367 |
2009-01-21 |
US08331739B1 |
2012-12-11 |
Ahmad Abdulkader; Matthew R. Casey |
OCR errors are identified and corrected through learning. An error probability estimator is trained using ground truths to learn error probability estimation. Multiple OCR engines process a text image, and convert it into texts. The error probability estimator compares the outcomes of the multiple OCR engines for mismatches, and determines an error probability for each of the mismatches. If the error probability of a mismatch exceeds an error probability threshold, a suspect is generated and grouped together with similar suspects in a cluster. A question for the cluster is generated and rendered to a human operator for answering. The answer from the human operator is then applied to all suspects in the cluster to correct OCR errors in the resulting text. The answer is also used to further train the error probability estimator. |
824 |
Optical character recognition (OCR) engines having confidence values for text types |
US12954865 |
2010-11-27 |
US20120134589A1 |
2012-05-31 |
Prakash Reddy |
An image of a known text sample having a text type is generated. The image of the known text sample is input into each OCR engine of a number of OCR engines. Output text corresponding to the image of the known text sample is received from each OCR engine. For each OCR engine, the output text received from the OCR engine is compared with the known text sample, to determine a confidence value of the OCR engine for the text type of the known text sample. |
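The per-engine comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes the confidence value is a plain string-similarity ratio from Python's standard `difflib`, and the names `engine_confidence` and `confidence_table` are hypothetical.

```python
from difflib import SequenceMatcher


def engine_confidence(known_text: str, output_text: str) -> float:
    """Similarity in [0, 1] between the known sample and one engine's output.

    Assumption: similarity ratio stands in for the confidence value;
    the abstract does not say how the comparison is scored.
    """
    return SequenceMatcher(None, known_text, output_text).ratio()


def confidence_table(known_text: str, text_type: str, engine_outputs: dict) -> dict:
    """Map (engine name, text type) to a confidence value.

    `engine_outputs` maps a hypothetical engine name to the text that
    engine produced for the image of the known sample.
    """
    return {
        (name, text_type): engine_confidence(known_text, output)
        for name, output in engine_outputs.items()
    }
```

Repeating the call for samples of other text types would give each engine one confidence value per type, which is the shape of result the abstract describes.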
825 |
Use of level detection while capturing and presenting text with optical character recognition |
US11729663 |
2007-03-28 |
US07792363B2 |
2010-09-07 |
Benjamin Perkins Foss |
A system for presenting text found on an object. The system comprises an object manipulation subsystem configured to position the substantially planar object for imaging; an imaging module configured to capture an image of the substantially planar object; a text capture module configured to capture text from the image of the substantially planar object; an Optical Character Recognition (“OCR”) component configured to convert the text to a digital text; a material context component configured to associate a media type with the text found on the substantially planar object; and an output module configured to convert the digital text to an output format, wherein the system is configured to organize the digital text according to the media type before converting the digital text to an output format. |
826 |
Dynamic transrating based on optical character recognition analysis of multimedia content |
US12707417 |
2010-02-17 |
US20100150449A1 |
2010-06-17 |
Indra Laksono |
Exemplary techniques for modifying multimedia data based on content are disclosed. One technique comprises determining whether a first portion of multimedia content of multimedia data has a first content characteristic and performing one or more content actions associated with the first content characteristic when the first portion of the multimedia content is determined to have the first content characteristic, wherein the one or more content actions modify a first portion of the multimedia data associated with the first portion of the multimedia content. |
827 |
Multiple image input for optical character recognition processing systems and methods |
US11560026 |
2006-11-15 |
US07734092B2 |
2010-06-08 |
Donald B. Curtis; Shawn Reid |
A method of processing an image includes receiving a digital version of the image, processing the digital version of the image through at least two binarization processes to thereby create a first binarization and a second binarization, and processing the first binarization through a first optical character recognition process to thereby create a first OCR output file. Processing the first binarization through a first optical character recognition process includes compiling first metrics associated with the first OCR output file. The method also includes processing the second binarization through the first optical character recognition process to thereby create a second OCR output file. Processing the second binarization through the first optical character recognition process includes compiling second metrics associated with the second OCR output file. The method also includes using the metrics, at least in part, to select a final OCR output file from among the OCR output files. |
828 |
Optical character recognition using digital information from encoded text embedded in the document |
US11274805 |
2005-11-15 |
US07505180B2 |
2009-03-17 |
Dennis C. DeYoung; Devin J. Rosenbauer |
A method for performing optical character recognition (OCR) on an image of a document including text includes embedding a physical manifestation of digital information associated with the text on the document. When the document is scanned with a scanning device, the digital information and a digital text file are produced. The digital text file is proofed using the digital information. |
829 |
Shape clustering and cluster-level manual identification in post optical character recognition processing |
US11519368 |
2006-09-11 |
US20080063278A1 |
2008-03-13 |
Luc Vincent; Raymond W. Smith |
Techniques for shape clustering and applications in processing various documents, including an output of an optical character recognition (OCR) process. |
830 |
Method and apparatus for performing optical character recognition (OCR) and text stitching |
US10092772 |
2002-03-07 |
US07343049B2 |
2008-03-11 |
Mark Melvin Butterworth |
A method of generating an electronic text file from a paper-based document that includes a plurality of characters includes capturing a plurality of partially overlapping digital images of the document. Optical character recognition is performed on each one of the plurality of captured digital images, thereby generating a corresponding plurality of electronic text files. Each one of the electronic text files includes a portion of the plurality of characters in the document. The plurality of electronic text files are compared with one another to identify characters that are in common between the electronic text files. The plurality of electronic text files are combined into a combined text file based on the comparison. The combined text file includes the plurality of characters in the document. |
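The compare-and-combine step in this abstract can be illustrated with a small sketch. It rests on two assumptions that the abstract does not state: the text fragments arrive in reading order, and the characters in each overlap were recognized identically in both fragments. The function names are hypothetical.

```python
def overlap_length(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    limit = min(len(left), len(right))
    for size in range(limit, 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def stitch(fragments) -> str:
    """Combine ordered, partially overlapping text fragments into one string.

    Each fragment contributes only the characters that follow its
    overlap with the text combined so far.
    """
    combined = ""
    for fragment in fragments:
        combined += fragment[overlap_length(combined, fragment):]
    return combined
```

A practical system would likely need approximate matching in the overlap, since two OCR passes over the same characters can disagree.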
831 |
Display control method, and program, information processing apparatus and optical character recognizer |
US10434503 |
2003-05-08 |
US07260262B2 |
2007-08-21 |
Toshimichi Arima |
A method for controlling the display of a screen which allows the user to discriminate the scanned image and the recognition result intuitively and easily. The display control method for allowing the user to verify the recognition result of a character on a verification screen form is implemented as follows. First of all, the standard pattern of a specific character stored in a memory and a plurality of character images recognized as the specific character are read, upon an operation of the user. The read standard pattern is then displayed in a recognition result character display portion within the verification screen form, and a plurality of the read character images are displayed sequentially in a character image display portion adjacent or proximal to the recognition result character display portion and at a predetermined position of a character image list display portion. A plurality of character images are listed in the character image list display portion, in which the already displayed character images are shifted one position from the predetermined position and displayed. |
832 |
Personal information retrieval using knowledge bases for optical character recognition correction |
US11299453 |
2005-12-12 |
US20070133874A1 |
2007-06-14 |
Marco Bressan; Herve Dejean; Christopher Dance |
In a system for updating a contacts database (42, 46), a portable imager (12) acquires a digital business card image (10). An image segmenter (16) extracts text image segments from the digital business card image. An optical character recognizer (OCR) (26) generates one or more textual content candidates for each text image segment. A scoring processor (36) scores each textual content candidate based on results of database queries respective to the textual content candidates. A content selector (38) selects a textual content candidate for each text image segment based at least on the assigned scores. An interface (50) is configured to update the contacts list based on the selected textual content candidates. |
833 |
Image scanner and optical character recognition system using said image scanner |
US09830639 |
1999-10-27 |
US06901166B1 |
2005-05-31 |
Mitsuo Nakayama |
The present invention is directed to providing an image scanner and an optical recognition system using said image scanner which can scan only the “intended region” to carry out character recognition and can carry out character recognition in the background of an application software and can input the recognition result directly to said application software. Image picture data which are captured by scanning the “intended region” of a document with the image scanner mouse 20 are converted to text data by a character recognition software in the personal computer 10 and are inputted directly to an application software. Designation and confirmation of input starting position for the “intended region” on a document are made easily and surely with the LCD 26 at hand on the image scanner 20. |
834 |
Display control method, and program, information processing apparatus and optical character recognizer |
US10434503 |
2003-05-08 |
US20040001629A1 |
2004-01-01 |
Toshimichi Arima |
[Object] To provide a method for controlling the display of a screen which allows the user to discriminate the scanned image and the recognition result intuitively and easily. [Constitution] The display control method for allowing the user to verify the recognition result of a character on a verification screen form 300 is implemented as follows. First of all, the standard pattern of a specific character stored in a memory and a plurality of character images recognized as the specific character are read, upon an operation of the user. The read standard pattern is then displayed in a recognition result character display portion 310 within the verification screen form 300, and a plurality of the read character images are displayed sequentially in a character image display portion 320 adjacent or proximal to the recognition result character display portion 310 and at a predetermined position of a character image list display portion 330. A plurality of character images are listed in the character image list display portion 330, in which the already displayed character images are shifted one position from the predetermined position and displayed. |
835 |
Method and apparatus for performing optical character recognition (OCR) and text stitching |
US10092772 |
2002-03-07 |
US20030169923A1 |
2003-09-11 |
Mark Melvin Butterworth |
A method of generating an electronic text file from a paper-based document that includes a plurality of characters includes capturing a plurality of partially overlapping digital images of the document. Optical character recognition is performed on each one of the plurality of captured digital images, thereby generating a corresponding plurality of electronic text files. Each one of the electronic text files includes a portion of the plurality of characters in the document. The plurality of electronic text files are compared with one another to identify characters that are in common between the electronic text files. The plurality of electronic text files are combined into a combined text file based on the comparison. The combined text file includes the plurality of characters in the document. |
836 |
Method and apparatus for optical character recognition utilizing proportional nonpredominant color analysis |
US10967 |
1993-01-29 |
US5835625A |
1998-11-10 |
Gregory P. Fitzpatrick; Marvin L. Williams |
A method and apparatus for optical identification of an unknown character from a plurality of known characters. Each of the known characters includes a predominant color and a nonpredominant color in preselected proportions. The unknown character has at least one geometric feature and a plurality of pixels including a predominant color and a nonpredominant color. The method and apparatus of the present invention include an examination of at least one geometric feature of the unknown character. A hypothetical identity for the unknown character is generated in response to the examination of at least one geometric feature of the unknown character. A portion of the plurality of pixels of the unknown character is sampled and a proportion between the predominant color and the nonpredominant color within the sampled portion of the plurality of pixels is determined from the sampled pixels. A comparison of the determined proportion with a preselected proportion associated with a known character corresponding to the hypothetical identity is made. The hypothetical identity is assigned to the unknown character if the determined proportion falls within the preselected range of the preselected proportion for the known character corresponding to the hypothetical identity. |
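The proportion check at the end of this abstract can be sketched as below. Several details are assumptions made for illustration only: the proportion is taken as the fraction of sampled pixels that are not the predominant color, the preselected range is a symmetric tolerance, and the names and the default tolerance of 0.05 are hypothetical.

```python
def nonpredominant_proportion(pixels, predominant) -> float:
    """Fraction of sampled pixels whose color differs from the predominant color."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("at least one sampled pixel is required")
    other = sum(1 for pixel in pixels if pixel != predominant)
    return other / len(pixels)


def confirm_identity(hypothesis, sampled_pixels, predominant,
                     expected_proportions, tolerance=0.05) -> bool:
    """Accept the hypothetical identity if the measured proportion is close
    enough to the preselected proportion stored for that known character.

    `expected_proportions` maps each known character to its preselected
    proportion; `tolerance` is an assumed half-width of the accepted range.
    """
    expected = expected_proportions[hypothesis]
    measured = nonpredominant_proportion(sampled_pixels, predominant)
    return abs(measured - expected) <= tolerance
```

The hypothesis itself would come from the earlier geometric-feature examination, which this sketch does not cover.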
837 |
High accuracy optical character recognition using neural networks with centroid dithering |
US55523 |
1993-04-29 |
US5475768A |
1995-12-12 |
Thanh A. Diep; Hadar I. Avi-Itzhak; Harry T. Garland |
Pattern recognition, for instance optical character recognition, is achieved by training a neural network, scanning an image, segmenting the image to detect a pattern, preprocessing the detected pattern, and applying the preprocessed detected pattern to the trained neural network. The preprocessing includes determining a centroid of the pattern and centrally positioning the centroid in a frame containing the pattern. The training of the neural network includes randomly displacing template patterns within frames before applying the template patterns to the neural network. |
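The two preprocessing ideas in this abstract, centering a pattern on its centroid and randomly displacing training templates, can be sketched in plain Python. The frame is assumed to be a list of rows of 0/1 pixels, shifts are rounded to whole pixels, and the displacement range `max_offset` is an assumed parameter; none of these details come from the abstract, and the function names are hypothetical.

```python
import random


def centroid(frame):
    """Center of mass (row, column) of the nonzero pixels in a 2D list."""
    total = row_sum = col_sum = 0
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        raise ValueError("frame contains no pattern pixels")
    return row_sum / total, col_sum / total


def shift(frame, d_row, d_col):
    """Translate the pattern, dropping pixels that would leave the frame."""
    height, width = len(frame), len(frame[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if frame[r][c]:
                new_r, new_c = r + d_row, c + d_col
                if 0 <= new_r < height and 0 <= new_c < width:
                    out[new_r][new_c] = frame[r][c]
    return out


def center_on_centroid(frame):
    """Move the pattern so its centroid sits at the frame center (whole pixels)."""
    row_c, col_c = centroid(frame)
    target_r = (len(frame) - 1) / 2
    target_c = (len(frame[0]) - 1) / 2
    return shift(frame, round(target_r - row_c), round(target_c - col_c))


def dither(frame, max_offset=1, rng=random):
    """Randomly displace a template before it is used as a training input."""
    return shift(frame,
                 rng.randint(-max_offset, max_offset),
                 rng.randint(-max_offset, max_offset))
```

In this reading, `center_on_centroid` would be applied to detected patterns at recognition time, while `dither` would be applied to templates during training so that the network tolerates small centering errors.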
838 |
System and method for correction of optical character recognition with display of image segments according to character data |
US100941 |
1993-08-03 |
US5455875A |
1995-10-03 |
Dan Chevion; Ittai Gilat; Andre Heilper; Oren Kagan; Amir Kolsky; Yoav Medan; Eugene Walach |
A data entry system generates an electronically stored coded representation of a character sequence from one or more electronically stored document images. The system comprises optical character recognition logic for generating, from the document image or images, character data specifying one of a plurality of possible character values for corresponding segments of the document images. The system also has an interactive display means for generating and sequentially displaying one or more types of composite image, each composite image comprising segments of the document image or images arranged according to the character data, and a correction mechanism responsive to a user input operation to enable the operator to correct the character data associated with displayed segments. |
839 |
Method of segmenting characters in lines which may be skewed, for allowing improved optical character recognition |
US361031 |
1989-06-02 |
US5062141A |
1991-10-29 |
Hiroshi Nakayama; Keiji Kojima; Gen Sato |
A method of segmenting characters of a document image comprises the steps of dividing the document image into a plurality of divided regions and setting a check width with respect to each of the divided regions, where each check width is greater than or equal to a width of a corresponding one of the divided regions so that the check widths of two mutually adjacent divided regions partially overlap each other, reading image data amounting to one line of the document image, obtaining from the image data horizontal projections of each line data within each of the check widths, where each horizontal projection is a number of black picture elements in a corresponding data line within a check width and each data line is made up of a plurality of picture elements arranged horizontally, segmenting a line based on the horizontal projections, obtaining from the image data vertical projections, where each vertical projection is a number of black picture elements in a vertical direction, determining a character segmentation range based on the vertical projections, and segmenting each character of the line within the character segmentation range. |
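The projection profiles named in this abstract can be sketched as follows. The image is assumed to be a list of rows of 0/1 values with 1 for a black picture element, and the function names are hypothetical. The sketch covers only the projections and the ranges derived from them, not the overlapping check widths across divided regions that the method uses to cope with skew.

```python
def horizontal_projection(image, col_start, col_end):
    """Black-pixel count of each data line (row) within one check width."""
    return [sum(row[col_start:col_end]) for row in image]


def vertical_projection(image, row_start, row_end):
    """Black-pixel count of each column within a band of rows."""
    width = len(image[0])
    return [
        sum(image[r][c] for r in range(row_start, row_end))
        for c in range(width)
    ]


def runs(profile):
    """Half-open (start, end) index ranges where the profile is nonzero."""
    ranges, start = [], None
    for index, count in enumerate(profile):
        if count and start is None:
            start = index
        elif not count and start is not None:
            ranges.append((start, index))
            start = None
    if start is not None:
        ranges.append((start, len(profile)))
    return ranges
```

Applying `runs` to a horizontal projection gives candidate text-line ranges, and applying it to the vertical projection of one such line gives candidate character ranges.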
840 |
Optical character recognition neural network system for machine-printed characters |
US474587 |
1990-02-02 |
US5048097A |
1991-09-10 |
Roger S. Gaborski; Louis J. Beato; Lori L. Barski; Hin-Leong Tan; Andrew M. Assad; Dawn L. Dutton |
Character images which are to be sent to a neural network trained to recognize a predetermined set of symbols are first processed by an optical character recognition pre-processor which normalizes the character images. The output of the neural network is processed by an optical character recognition post-processor. The post-processor corrects erroneous symbol identifications made by the neural network. The post-processor identifies special symbols and symbol cases not identifiable by the neural network following character normalization. For characters identified by the neural network with low scores, the post-processor attempts to find and separate adjacent characters which are kerned and characters which are touching. The touching characters are separated in one of nine successively initiated processes depending upon the geometric parameters of the image. When all else fails, the post-processor selects either the second or third highest scoring symbol identified by the neural network based upon the likelihood of the second or third highest scoring symbol being confused with the highest scoring symbol. |