Handwritten Text Recognition System Based On Neural Network
Computer Science & Technology (IJARCST 2016) Vol. 4, Issue 1 (Jan. - Mar. 2016) ISSN : 2347 - 9817 (Print)
Faculty of Computer Science and Information Systems, Mansoura University, Mansoura, Egypt
Abstract
Handwritten text recognition is still an open research issue in the domain of Optical Character Recognition (OCR). This paper proposes an efficient approach towards the development of handwritten text recognition systems. A 3-layer Artificial Neural Network (ANN) is used with a supervised learning approach. The choice of optimal feature vectors greatly affects the accuracy of any text recognition system; therefore, a bit-map representation of the input samples is used as the feature vector. The feature vectors are first pre-processed and then applied to the ANN along with the target vectors, which are generated on the basis of the input samples. 55 samples of each English letter are used to train the ANN in order to ensure that the system generalizes to new inputs. Two different learning algorithms are used in this paper. Additional image processing algorithms are also developed to deal with multiple characters in a single image, tilted images and rotated images. The trained system provides an average accuracy of more than 95% on unseen test images.
Keywords
OCR, Image Processing, Handwritten Text Recognition, Artificial Neural Network, English Alphabet Recognition, Supervised
Learning
statistical/structural features [15] have also been successfully used. In a recent study [16] on handwritten Devanagari and Bangla character recognition, the wavelet transforms of the input character image were subjected to a three-layer approach.
Rajib et al. [17] proposed a handwritten English character recognition system based on the Hidden Markov Model (HMM). This method uses two kinds of feature extraction, global and local. The global features comprise four gradient features, six projection features and four curvature features. The local features are calculated by dividing the sample image into nine equal blocks; the gradient feature of each block is described by a four-element feature vector, which gives 36 local features. This results in fifty features (local + global) for each sample image. These features are then fed into the HMM in order to train it. Data post-processing is also used in order to decrease the cross-classification between classes. This method takes a lot of time in training and feature extraction. Moreover, it performs poorly when many characters are combined in a single image.
Velappa Ganapathy et al. [18] proposed a recognition method based on multiscale neural network training. In order to improve the accuracy, this method uses a selective threshold, which is calculated with a minimum distance technique. The method also includes a GUI that can locate characters throughout the scanned image. It provides an accuracy of 85% with a moderate level of training. It uses large-resolution images (20 × 28 pixels) for training with a shorter training time.
T. Som et al. [19] used a fuzzy membership function to improve the accuracy of handwritten text recognition. In this method, text images are normalized to 20 × 10 pixels and a fuzzy approach is then applied to each class. A bounding box is created around the character in order to determine the vertical and horizontal projections of the text. Once the image is cropped to the bounding box, it is re-scaled to 10 × 10 pixels. The cropped images are then thinned. In order to create the test matrix, all of these pre-processed images are placed into a single matrix, one after another. When the user presents a new (test) image, it is matched against the test matrix. The method is fast but provides low accuracy.
Rakesh Kumar et al. [20] proposed a method that reduces the training time of the system by using a single-layer neural network. Segmented characters are scaled to 80 × 80 pixels. Data normalization is performed on the input matrices to improve the training performance. However, their results have a low accuracy rate.
Other notable work includes feature extraction using the diagonal method [21] and an improved version of this work [22]. The latter used zone-based hybrid feature extraction from the text together with the Euler number approach, which led to an improvement in speed and accuracy. Several pre-processing operations such as thresholding, thinning and filtering are performed on the input image so that the cross-error rate can be minimized. Three techniques are used for better segmentation. After segmentation, the input image is resized to 90 × 60 pixels. The Euler number is then calculated for each character, and the image is divided into 54 zones of 10 × 10 pixels each. The average value of each zone (row- and column-wise) is used as the feature vector of the character. These features are fed into a feed-forward back-propagation neural network (FF-BP-NN) with a configuration of 69-100-100-26. This system classifies the data into 26 different English letters. The method performs well, but it does not cover the classification of lowercase English letters.
Anshul Mehta et al. [23] based their work on a heuristic segmentation algorithm. Their system identifies valid segmentation points between handwritten letters quite well. Fourier descriptors are used for feature extraction. After successful segmentation, the discrete Fourier coefficients a[k] and b[k] are calculated for the input image, where k varies from 0 to L-1 and L is the number of boundary points of the input image. This method attempts to classify a total of 52 characters (26 uppercase and 26 lowercase English letters). It also provides a comparative analysis of different classification methods. The method incorporates post-processing in order to reduce the error rate, but it suffers from low accuracy and a high cross-classification rate because of the non-optimal choice of features.
Serrano et al. [24] proposed a novel interactive approach for handwritten character recognition. The system asks for human input only for those samples on which it is uncertain. Although this keeps the accuracy at a high level, it increases the human load. The main drawback is that the system is not fully automatic and requires human intervention for operation.
Amma et al. [25] proposed a wearable input system that recognizes text written in the air. It is a 3D integration method for handwriting recognition. The handwriting gestures are captured wirelessly with motion sensors (accelerometers and gyroscopes) that are placed strategically on the back of the hand. The proposal is good, but it cannot be applied to already written data.
In this paper, a bit-map version of the input image sample is used as the feature vector. Optimal feature vector selection is an integral part of any recognition system. The aim of the proposed feature extraction algorithm is to classify patterns correctly using a minimum number of features that are effective in discriminating pattern classes. The bit-map version contains all the major information of the parent image in a small neighborhood. The proposed work also studies the changes that arise in the system due to different learning mechanisms, and it shows the effect of parameters such as the number of hidden layers, the size of the hidden layer and the number of epochs. The pre-processing in the proposed method involves steps such as noise removal, segmentation of characters, normalization and de-skewing. In this work, an effort is made towards the recognition of English characters with an accuracy of up to 95%. Due to its logical simplicity, ease of use and high recognition rate, the proposed system may turn out to be very useful in practice.
III. The Proposed System
The purpose of this study is to develop a system that takes handwritten English characters as input, processes the input, extracts the optimal features, trains the neural network using either Resilient Back-propagation or Scaled Conjugate Gradient, recognizes the class of the input text, and finally generates the computerized form of the input text. The complete system is divided into two major parts: training of the ANN with an image database and testing of the ANN with test images. Figure 1 shows the block diagram of the training part of the ANN and Figure 2 shows the block diagram of the testing part of the ANN.
Fig. 1: Block diagram of training part of ANN
The training part of the proposed work involves: creation of the dataset, pre-processing of that dataset, feature extraction from the pre-processed dataset, generation of a feature vector and test vector, training of the ANN and saving of the trained ANN for testing.
The testing part involves some extra pre-processing steps, since the number of characters in the input image must be determined, but it does not include any training of the ANN. Instead, it uses the trained ANN directly after the feature vector is generated. Segmentation is an important step of the test procedure because it determines the number of characters.
The blocks involved in the training and testing procedures, shown in figure (1) and figure (2) respectively, are explained in detail below.
Fig. 2: Block diagram of testing part of ANN (blocks: Input Image, Pre-processing, Tilt and Segmentation, Select Character, Feature Vector Extraction, Classification, Trained Vectors, "Is all characters scanned?" with Yes/No branches, Generated digital Document, Output final Handwritten Document)
A. Pre-processing
A series of operations is performed on the input image during pre-processing, in the testing as well as the training stage. Pre-processing enhances the image rendering and makes the image suitable for segmentation. Its main objective is to remove the background noise, enhance the region of interest in the image and make a clear distinction between foreground and background. To achieve these goals, noise filtering, conversion to binary and smoothing operations are performed on the input image. Figure 3 shows an example of image normalization.
Fig. 3: (a) Tilted input image and (b) image after pre-processing is performed
B. Segmentation
Segmentation of the image is done in the testing stage only. Here, a complete image is decomposed into a sequence of sub-images, each containing an individual character. The segmentation is based on edge detection and on the gaps between the characters. After segmentation, the sub-images are labeled and then processed one by one. This labeling is done in order to find the number of characters in the entire image. Each sub-image is then resized to 70×50 and normalized with respect to itself, which helps in extracting good-quality features from the image. Valid segmentation points in the scanned image are identified from the locations of minima or arcs between the characters, which are easy to find in handwritten text. The segmentation points are also checked for erroneous points by comparing every point against the average distance between two segmentation points in the complete image (will be shown later).
C. Feature Extraction
The feature vector is calculated by converting the pre-processed image into a bit-mapped version of size 7×5. Figure 4 shows a few examples of the bit-map versions of different characters used in the proposed system. The bit-map version preserves the major features of the input image in a shorter data length, which reduces the time spent in NN training without affecting the accuracy of character recognition.
After that, the bit-map images are converted into a single vector of size 35×1, which serves as the input vector to the ANN.
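The reduction of a 70×50 character image to the 7×5 bit map and 35×1 input vector described under Feature Extraction can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper does not state how each bit is derived, so the block-majority rule, the `threshold` parameter and the function name `bitmap_features` are assumptions made for this example.

```python
def bitmap_features(image, rows=7, cols=5, threshold=0.5):
    """Reduce a binary image to a rows x cols bit map and flatten it.

    `image` is a list of equal-length rows holding 0 (background) or
    1 (ink). The block-majority rule and `threshold` are assumptions.
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be a multiple of the bit-map size")
    block_h, block_w = height // rows, width // cols
    features = []
    for r in range(rows):
        for c in range(cols):
            ink = sum(
                image[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            )
            # A cell is "on" when enough of its block is ink (assumed rule).
            features.append(1 if ink >= threshold * block_h * block_w else 0)
    return features  # length rows * cols, in row-major order


# Synthetic 70x50 sample (70 rows, 50 columns): a vertical stroke in columns 20-29.
sample = [[1 if 20 <= x < 30 else 0 for x in range(50)] for _ in range(70)]
vector = bitmap_features(sample)

# Print the 7x5 bit map, one row per line.
for r in range(7):
    print("".join("#" if bit else "." for bit in vector[r * 5:(r + 1) * 5]))
```

With a 70×50 input, each bit summarizes one 10×10 block, so the 3500 pixels shrink to 35 network inputs.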
Fig. 4: Examples of the bit-map version for different characters
D. Learning Algorithms
Two different learning algorithms are used in this research study.
1. Resilient Back-propagation
Resilient back-propagation is a learning algorithm for supervised learning of feed-forward ANNs. It was proposed by Martin Riedmiller and Heinrich Braun in 1992 [26]. Like the Manhattan update rule, Resilient back-propagation considers only the sign of the partial derivative, irrespective of its magnitude, and it works on each weight independently. The rule scales the update step of each weight by a factor of η− or η+ depending on whether the sign of the partial derivative has changed. Here η represents the update factor. A detailed description of Resilient Back-propagation can be found in [26].
2. Scaled conjugate gradient
Scaled Conjugate Gradient (SCG) is a second-order supervised learning algorithm. It is based on the principle of the gradient descent algorithm, which is a very old guided search method. The advantage of SCG is that it does not need user-defined parameters for the ANN during training. The step size is adjusted automatically with each epoch, which results in faster convergence and better accuracy. A detailed description of Scaled Conjugate Gradient can be found in [27].
3. ANN Train and Test
Training the ANN involves specifying the hidden layers and choosing the learning algorithm. The input and target vectors are normalized to the range [-1, 1] so that training can be done efficiently. During training, the minimum gradient is set to 1e-10 and the maximum number of iterations to 1000. 55 samples of each character are used to create the training dataset.
New test images are then created in order to check the validity of the designed system. Figure (6) shows a sample handwritten document.
IV. Results
The proposed handwritten character recognition system was tested on several scanned handwritten images written in different styles. The results were highly encouraging. The proposed system performs pre-processing on the image in order to remove the noise. Feature extraction is performed on the bit-map image representation, which gives a classification accuracy of around 95%. The proposed system is advantageous because it uses fewer features to train the neural network, which results in faster convergence (less training time). A further advantage is the reduced computation involved in feature extraction, training and testing. The feature comparison of the proposed system with earlier systems is shown in table (1).
Table (1): Feature comparison of proposed system with other systems
| Description | Training Time | Accuracy | English Character (Capital / Small) | Performance (can extract many characters in single image?) | Can Perform Tilt? | Automatic Extraction | Treat with symbols |
| Velappa et al. [18] | Low | Medium | Only Capital | Able to extract | No | Yes | Yes |
| Rajib et al. [17] | High | High | Only Capital | Unable to extract | No | Yes | Yes |
| Rakesh et al. [20] | Medium | Medium | Only Capital | Unable to extract | No | Yes | Yes |
| Anshul et al. [23] | Low | High | Both | Unable to extract | Yes | Yes | No |
| Serrano et al. [24] | Medium | High | Only Capital | Unable to extract | Yes | No | Yes |
| Proposed | Medium | High | Capital and Small both | Able to extract | Yes | Yes | No |
The proposed system gives good results for images that contain handwritten text written in different styles, sizes and alignments with varying backgrounds. It classifies most of the handwritten characters correctly even if the image contains noise in either the characters or the background.
Table (1) shows that the proposed system compares favorably with the other systems, except in its handling of symbols.
It is evident from table (2) and table (3) that the Scaled Conjugate Gradient learning algorithm outperforms the Resilient Back-propagation algorithm in terms of both accuracy and training time. Tables (2) and (3) also show that the larger the hidden layer, the longer the training time, because the number of weights to be trained grows with the hidden layer size. The accuracy also increases with the hidden layer size, but beyond a certain point it starts decreasing again. This happens because of the over-saturation of the available weights, which need to be trained with limited constraints. Choosing the size of the hidden layer is always a heuristic problem, and most research papers use double the number of input-layer neurons. The situation is similar to a mathematical problem in which more than 'n' equations are available to determine 'n' variables. This results in over-training, which degrades the performance of the system. It is clear from tables (2) and (3) that 80 is the optimal hidden layer size for the proposed system.
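To make the link between hidden layer size and training cost concrete, the short sketch below counts the trainable parameters of the 35-h-52 configurations listed in tables (2) and (3). It assumes a fully connected network with one bias per neuron, which the paper does not state explicitly, and it assumes a training set of 55 × 52 = 2860 samples (55 samples for each of the 52 classes). The helper name `trainable_parameters` is hypothetical.

```python
def trainable_parameters(layers):
    """Count weights plus biases of a fully connected network.

    `layers` lists the layer sizes, e.g. (35, 80, 52). One bias per
    neuron is assumed; the paper does not specify this detail.
    """
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(layers, layers[1:]))


# Assumed training set size: 55 samples for each of the 52 classes.
training_samples = 55 * 52

for hidden in (10, 80, 110):
    count = trainable_parameters((35, hidden, 52))
    ratio = count / training_samples
    print(f"35-{hidden}-52: {count} parameters, "
          f"{ratio:.2f} per training sample")
```

Under these assumptions the parameter count grows linearly with the hidden layer size (88h + 52 for h hidden neurons), which is consistent with the steadily rising training times in both tables.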
Table (2): Resilient Back-propagation Results
| Configuration of ANN (input-hidden-output) layers | Accuracy (%) | Training Time (second) |
| 35-10-52 | 64.44 | 69.33 |
| 35-20-52 | 66.54 | 75.76 |
| 35-30-52 | 69.43 | 85.43 |
| 35-40-52 | 73.76 | 98.87 |
| 35-50-52 | 77.32 | 109.43 |
| 35-60-52 | 83.65 | 117.65 |
| 35-70-52 | 87.35 | 139.34 |
| 35-80-52 | 93.54 | 151.54 |
| 35-90-52 | 91.11 | 169.85 |
| 35-100-52 | 88.32 | 187.76 |
| 35-110-52 | 85.43 | 195.42 |
Table (3): Scaled conjugate gradient Results
| Configuration of ANN (input-hidden-output) layers | Accuracy (%) | Training Time (second) |
| 35-10-52 | 65.36 | 65.33 |
| 35-20-52 | 67.32 | 72.34 |
| 35-30-52 | 70.34 | 82.44 |
| 35-40-52 | 74.36 | 94.27 |
| 35-50-52 | 79.34 | 101.37 |
| 35-60-52 | 86.36 | 110.33 |
| 35-70-52 | 90.63 | 130.32 |
| 35-80-52 | 95.62 | 142.77 |
| 35-90-52 | 92.44 | 159.34 |
| 35-100-52 | 89.13 | 178.33 |
| 35-110-52 | 87.36 | 189.34 |
V. Conclusion
A handwritten character recognition system has been designed and tested, and a comparison with related work has been presented. ANNs have been trained for this purpose with various types of input samples, which is why the developed program is able to classify an input character into 52 different classes with an accuracy of more than 95%. Two different learning algorithms have been used. The Scaled Conjugate Gradient algorithm has turned out to be a better learning algorithm than the Resilient Back-propagation algorithm in terms of accuracy and training time when the same configuration is used. In future work, hybrid feature extraction methods will be developed in order to enhance the accuracy. Better classification methods will also be investigated in order to minimize the number of misclassified images. Finally, the proposed work will be extended to recognize the Arabic language.
References
[1] N. Arica and F. Yarman-Vural, An Overview of Character Recognition Focused on Off-Line Handwriting, IEEE Trans. Systems, Man, and Cybernetics Part C: Applications and Rev., vol. 31, pp. 216-233, 2001.
[2] A. Amin and H. B. Al-Sadoun, Hand Printed Arabic Character Recognition System, Proc. of 12th Int. Conf. Pattern Recognition, pp. 536-539, 1994.
[3] R. Plamondon and S. N. Srihari, On-Line and Off-Line Handwriting Recognition: A Comprehensive Survey, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, pp. 63-84, 2000.
[4] H. Liu and X. Ding, Handwritten Character Recognition using Gradient Feature and Quadratic Classifier with Multiple Discrimination Schemes, Proc. 8th Int. Conf. on Document Analysis and Recognition, pp. 19-25, 2005.
[5] Pratap, Neeraj, and Dr Shwetank Arya. “A Review of Devnagari Character Recognition from Past to Future.” International Journal of Computer Science and Telecommunications 3, no. 6 (2012): 77-82.
[6] U. Bhattacharya, M. Shridhar, S. K. Parui, P. K. Sen and B. B. Chaudhuri, Offline recognition of handwritten Bangla characters: an efficient two-stage approach, Pattern Analysis and Applications, vol. 15(4), pp. 445-458, 2012.
[7] John, Jomy, and Kannan Balakrishnan. “A system for offline recognition of handwritten characters in Malayalam script.” International Journal of Image, Graphics and Signal Processing (IJIGSP) 5, no. 4 (2013): 53.
[8] Fischer, Andreas, Ching Y. Suen, Volkmar Frinken, Kaspar Riesen, and Horst Bunke. “A fast matching algorithm for graph-based handwriting recognition.” In Graph-Based Representations in Pattern Recognition, pp. 194-203. Springer Berlin Heidelberg, 2013.
[9] Parvez, Mohammad Tanvir, and Sabri A. Mahmoud. “Arabic handwriting recognition using structural and syntactic pattern attributes.” Pattern Recognition 46, no. 1 (2013): 141-154.
[10] Zamora-Martínez, Francisco, Volkmar Frinken, Salvador España-Boquera, Maria Jose Castro-Bleda, Andreas Fischer, and Horst Bunke. “Neural network language models for off-line handwriting recognition.” Pattern Recognition 47, no. 4 (2014): 1642-1652.
[11] Tao, Dapeng, Lingyu Liang, Lianwen Jin, and Yan Gao. “Similar handwritten Chinese character recognition by kernel discriminative locality alignment.” Pattern Recognition Letters 35 (2014): 186-194.
[12] Das, Soumendu, and Sreeparna Banerjee. “An Algorithm for Japanese Character Recognition.” International Journal of Image, Graphics and Signal Processing (IJIGSP) 7, no. 1 (2014): 9.
[13] M. Shi, Y. Fujisawa, T. Wakabayashi, and F. Kimura, Handwritten Numeral Recognition Using Gradient and Curvature of Gray Scale Image, Pattern Recognition, vol. 35(10), pp. 2051-2059, 2002.
[14] Singh, Sukhpreet, Ashutosh Aggarwal, and Renu Dhir. “Use of Gabor Filters for recognition of Handwritten Gurmukhi character.” International Journal of Advanced Research in Computer Science and Software Engineering 2, no. 5 (2012).
[15] Wu X-Q, Wang K-Q, Zhang D (2005) Wavelet energy feature extraction and matching for palmprint recognition. J Comput Sci Technol 20(3):411–418.
[16] Bhattacharya, U. and B. B. Chaudhuri, Handwritten Numeral Databases of Indian Scripts and Multistage Recognition of Mixed Numerals, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31(3), pp. 444-457, 2009.
[17] Rajib Lochan Das, Binod Kumar Prasad, Goutam Sanyal, “HMM based Offline Handwritten Writer Independent English Character Recognition using Global and Local Feature Extraction”, International Journal of Computer Applications (0975 – 8887), Volume 46, No. 10, pp. 45-50, May 2012.
[18] Velappa Ganapathy, Kok Leong Liew, “Handwritten Character Recognition Using Multiscale Neural Network Training Technique”, World Academy of Science, Engineering and Technology, pp. 32-37, 2008.
[19] T. Som, Sumit Saha, “Handwritten Character Recognition Using Fuzzy Membership Function”, International Journal of Emerging Technologies in Sciences and Engineering, Vol. 5, No. 2, pp. 11-15, Dec 2011.
[20] Rakesh Kumar Mandal, N R Manna, “Hand Written English Character Recognition using Row-wise Segmentation Technique”, International Symposium on Devices MEMS, Intelligent Systems & Communication, pp. 5-9, 2011.
[21] J Pradeep, E Shrinivasan and S. Himavathi, “Diagonal Based Feature Extraction for Handwritten Alphabets Recognition System Using Neural Network”, International Journal of Computer Science & Information Technology (IJCSIT), vol. 3, No 1, Feb 2011.
[22] Om Prakash Sharma, M. K. Ghose, Krishna Bikram Shah, “An Improved Zone Based Hybrid Feature Extraction Model for Handwritten Alphabets Recognition Using Euler Number”, International Journal of Soft Computing and Engineering (ISSN: 2231 - 2307), Vol. 2, Issue 2, pp. 504-508, May 2012.
[23] Anshul Mehta, Manisha Srivastava, Chitralekha Mahanta, “Offline handwritten character recognition using neural network”, IEEE 2011 International Conference on Computer Applications and Industrial Electronics.
[24] Serrano, Nicolás, Adrià Giménez, Jorge Civera, Alberto Sanchis, and Alfons Juan. “Interactive handwriting recognition with limited user effort.” International Journal on Document Analysis and Recognition (IJDAR) 17, no. 1 (2014): 47-59.
[25] Amma, Christoph, Marcus Georgi, and Tanja Schultz. “Airwriting: a wearable handwriting recognition system.” Personal and Ubiquitous Computing 18, no. 1 (2014): 191-203.
[26] Martin Riedmiller and Heinrich Braun: Rprop - A Fast Adaptive Learning Algorithm. Proceedings of the International Symposium on Computer and Information Science VII, 1992.
[27] Møller, Martin Fodslette. “A scaled conjugate gradient algorithm for fast supervised learning.” Neural Networks 6.4 (1993): 525-533.
[28] Hazem M. El-Bakry, “A New High Speed Neural Model For Character Recognition Using Cross Correlation and Matrix Decomposition,” International Journal of Signal Processing, vol. 2, no. 3, 2005, pp. 183-202.
[29] Hazem M. El-Bakry, “A Novel High Speed Neural Model for Fast Pattern Recognition,” Soft Computing Journal, vol. 14, no. 6, 2010, pp. 647-666.
[30] Hazem M. El-Bakry, “An Efficient Algorithm for Pattern Detection using Combined Classifiers and Data Fusion,” Information Fusion Journal, vol. 11, issue 2, April 2010, pp. 133-148.
[31] Hazem M. El-Bakry, “A New Neural Design for Faster Pattern Detection Using Cross Correlation and Matrix Decomposition,” Neural Network World journal, 2009, vol. 19, no. 2, pp. 131-164.
[32] Hazem M. El-Bakry, and Nikos Mastorakis, “New Fast Normalized Neural Networks for Pattern Detection,” Image and Vision Computing Journal, vol. 25, issue 11, 2007, pp. 1767-1784.
[33] Hazem M. El-Bakry, “New Fast Time Delay Neural Networks Using Cross Correlation Performed in the Frequency Domain,” Neurocomputing Journal, vol. 69, October 2006, pp. 2360-2363.
[34] Hazem M. El-Bakry, and Qiangfu Zhao, “Fast Normalized Neural Processors For Pattern Detection Based on Cross Correlation Implemented in the Frequency Domain,” Journal of Research and Practice in Information Technology, Vol. 38, No. 2, May 2006, pp. 151-170.