The article by Sarunya Kanjanawattana and Masaomi Kimura is a study of Optical Character Recognition (OCR), a common tool for transforming image-based characters into computer-editable characters. The two authors present a novel method that combines graph-component extraction with OCR-error correction.
In recent years, graphs have become very important to researchers, as they contain significant information that can be extracted and used. Graphs summarize data, presenting essential information that is interpreted by acquiring small descriptive details. To obtain a primary interpretation, OCR is used, as it is an appropriate solution for acquiring graph components in digital form as characters. The study uses a collection of bar graphs, each containing at least axis descriptions and a legend, to illustrate OCR.
OCR is widely used: thousands of paper-based documents have been converted to digitized information with it. It does not, however, provide a 100% correct result, as errors can occur. Poor printing quality, low image resolution, language-specific requirements and image noise cause misrecognition, which produces OCR errors. Take the word “BED”: it can be recognized as “8ED”, which is an error. These errors can be classified into non-word errors and real-word errors. The difference is that non-word errors produce words that do not exist, while real-word errors produce a word that exists but differs from the one printed. These distinctions matter, as people who work with OCR should watch for such errors in order to notice incorrectly recognized words. In addition, OCR should not be applied directly to graph images, as this can cause recognition noise.
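The distinction between non-word and real-word errors can be illustrated with a short Python sketch. It is only an illustration, not the authors' implementation: the lexicon and the function name classify_ocr_token are hypothetical, and a real system would rely on a full dictionary.

```python
# Hypothetical mini-lexicon, used only for this illustration.
LEXICON = {"bed", "bad", "red", "sales", "year"}


def classify_ocr_token(intended: str, recognized: str, lexicon=LEXICON) -> str:
    """Label an OCR result as correct, a non-word error, or a real-word error."""
    if recognized.lower() == intended.lower():
        return "correct"
    # The recognized string differs from the intended word.
    if recognized.lower() in lexicon:
        # It is a valid word, just not the one that was printed.
        return "real-word error"
    # It is not a word at all.
    return "non-word error"


print(classify_ocr_token("BED", "8ED"))  # non-word error
print(classify_ocr_token("BED", "BAD"))  # real-word error
```

Real-word errors are the harder case in practice, because a plain dictionary lookup accepts them as valid words.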
The article refers to previous studies on image segmentation (a technique used to capture and separate dominant objects from image backgrounds) and on OCR-error correction. This study, however, uses a pre-processing step and proposes a post-processing method to overcome the difficulty of OCR errors. The methodology is divided into graph-component extraction (whose task is to separate the components into individual images) and, as in previous work, OCR-error correction (which uses ontologies and integrates edit distance and NLP into the correction system).
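To make the role of edit distance concrete, the following Python sketch computes the Levenshtein distance and uses it to replace a misrecognized token with the closest vocabulary entry. It covers only the edit-distance part, not the ontology or NLP components described in the article. The function names, the vocabulary and the max_distance threshold are assumptions made for this illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def correct_token(token: str, vocabulary, max_distance: int = 1) -> str:
    """Return the closest vocabulary word, or the token itself if none is close."""
    if not vocabulary:
        return token
    best = min(vocabulary, key=lambda w: edit_distance(token.lower(), w.lower()))
    if edit_distance(token.lower(), best.lower()) <= max_distance:
        return best
    return token


# Hypothetical vocabulary, e.g. terms expected in a bar graph.
print(correct_token("8ED", ["BED", "LEGEND", "AXIS"]))  # BED
```

Edit distance on its own can only repair non-word errors; a real-word error is already a valid word, which is presumably why the article pairs it with semantic information.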
To evaluate the proposed methods and theory, Sarunya Kanjanawattana and Masaomi Kimura conducted four experiments. The first experiment combined the image partition method with edit distance. All of its performance rates took the lowest values, except the noise ratio, which reached 29.48%. The second experiment combined graph-component extraction with edit distance; accuracy and F-measure increased to 57.28% and 50.54%. The third experiment combined the image partition method with OCR-error correction. Its performance rates improved on those of the first experiment: accuracy rose to 80.75% and the F-measure reached 92.28%. Finally, the last experiment combined the first and second models proposed by this study, yielding an accuracy of 84.23% and an F-measure of 86.02%.
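For readers unfamiliar with the reported measures, the sketch below shows how accuracy and F-measure are commonly computed from counts of true and false positives and negatives. These are the standard textbook definitions. Whether the article defines its measures in exactly this way is an assumption, and the counts in the example are invented for illustration, not taken from the experiments.

```python
def evaluate(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard accuracy, precision, recall and F-measure from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Invented counts, for illustration only.
scores = evaluate(tp=8, fp=2, fn=2, tn=8)
print({name: round(value, 4) for name, value in scores.items()})
```

Because accuracy counts true negatives and the F-measure does not, the two values can move in different directions from one experiment to the next.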
Based on these results, the researchers calculated the token errors and the differences between the results of the four experiments. On this basis, they proposed a new method of OCR-error correction for bar graph images that uses semantics. They obtained the desired results and showed that the method achieved higher performance rates than the other methods.
The next stage of the research consists of extracting graph-content information, designing a new ontology to support the extractable graph information, and using other ontologies to reveal latent information.