Article

Recognition of Urdu Handwritten Characters Using Convolutional Neural Network

by Mujtaba Husnain 1, Malik Muhammad Saad Missen 1, Shahzad Mumtaz 1, Muhammad Zeeshan Jhanidr 1, Mickaël Coustaty 2, Muhammad Muzzamil Luqman 2, Jean-Marc Ogier 2 and Gyu Sang Choi 3,*
1 Department of Computer Science & IT, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
2 L3i Lab, Université of La Rochelle, Av. Michel Crépeau, 17000 La Rochelle, France
3 Department of Information and Communication Engineering, Yeungnam University, Gyeongsan 712-749, Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2019, 9(13), 2758; https://doi.org/10.3390/app9132758
Submission received: 2 June 2019 / Revised: 26 June 2019 / Accepted: 1 July 2019 / Published: 8 July 2019
Figure 1. Urdu character set with phonemes and numerals with their Roman equivalences.
Figure 2. Basic geometrical strokes of Urdu script [33].
Figure 3. The Urdu characters: (a) the matching order of pen-strokes among Urdu characters; (b) groups of Urdu isolated characters based on the number of strokes [21].
Figure 4. Statistical features: (a) common trend while writing Urdu character “Alif”; (b) slope value of the character-box for Urdu character “Seen”; (c) identification of a cusp in Urdu character “Hey”; (d) feature vector having intersection points of Urdu character “Daal” (U: up, D: down, R: right, L: left); (e) finishing trend of Urdu character “A’en” [21].
Figure 5. Urdu characters “De” (left) and “Daal” (right) were repeatedly misrecognized.
Figure 6. Classification of the initial half forms on the basis of the number of strokes [23]. MDLSTM, multidimensional long short-term memory.
Figure 7. Different samples of input images with noise and without noise [47].
Figure 8. Recognition rate and recognition time using different Daubechies wavelets for handwritten numerals [51].
Figure 9. A sample of the handwritten Urdu characters is shown in (a,b), while (c) depicts a sample of handwritten Urdu numerals.
Figure 10. Grouping of Urdu characters according to shape similarity. Each group is numbered from right to left.
Figure 11. Block diagram of proposed Urdu handwritten character classification system.
Figure 12. Details of the types of structural (geometrical) features of Urdu characters and numerals.
Figure 13. Inside architecture of our proposed CNN for Urdu handwritten character recognition system.
Figure 14. Effect of (a) batch size and (b) learning rate on accuracy.
Figure 15. Confusion matrices and respective performance graphs in Urdu handwritten numeral classification. (a) Accuracy result with 10 hidden neurons; (b) best validation performance with 10 hidden neurons; (c) accuracy result with 30 hidden neurons; (d) best validation performance with 30 hidden neurons; (e) accuracy result with 50 hidden neurons; (f) best validation performance with 50 hidden neurons.
Figure 16. Confusion matrices and respective performance graphs in Urdu character classification. (a) Accuracy result with 10 hidden neurons; (b) best validation performance with 10 hidden neurons; (c) accuracy result with 30 hidden neurons; (d) best validation performance with 30 hidden neurons; (e) accuracy result with 50 hidden neurons; (f) best validation performance with 50 hidden neurons.

Abstract

In pattern recognition and pattern matching, methods based on deep learning models have recently attracted considerable research attention because of their strong performance. In this paper, we propose the use of a convolutional neural network to recognize multifont offline Urdu handwritten characters in an unconstrained environment. We also propose a novel dataset of Urdu handwritten characters, since there is no publicly available dataset of this kind. A series of experiments is performed on the proposed dataset. The accuracy achieved for character recognition is among the best when compared with the results reported in the literature for the same task.

1. Introduction

In pattern recognition and computer vision research, handwritten text recognition is regarded as one of the most challenging tasks. The cursive nature of text, the shape similarity of individual characters, and the variety of writing styles are some of the key issues that make recognition difficult. High accuracy rates are reported in the literature for recognizing isolated words and characters in printed text; however, there is still a need for an efficient recognition system that performs well on handwritten text [1,2,3,4,5]. Urdu is one of the cursive languages that is widely spoken and written in South Asia, including Pakistan, India, Bangladesh, Afghanistan, etc. [6]. Optical character recognition (OCR) of Urdu script started in late 2000, and the first work on Urdu OCR was published in 2004. The literature review shows that far less research effort has gone into Urdu handwritten text recognition than into the recognition of scripts of other languages [4,7,8,9]. Furthermore, a few Urdu OCR systems for printed text are commercially available [10,11], but no system is available for the recognition of Urdu handwritten text to date. It is pertinent to mention that in computer vision and pattern recognition, handwritten text recognition is termed ICR (intelligent character recognition), while analysis of printed text is known as OCR (optical character recognition). In this text, we use ICR for handwritten text recognition.
The literature survey shows that several machine learning models, such as SVM (support vector machines) [12], NB (naive Bayes) [13], and ANN (artificial neural network) [14,15], have been applied to the analysis of Urdu handwritten text. These approaches have proved competitive in analyzing text images. Many researchers recommend using CNNs (convolutional neural networks) [16,17] to extract information from images containing text. Furthermore, the notable work reported in [18,19,20] concluded that the CNN is one of the most commonly used DNNs (deep neural networks) in image processing for complex tasks such as pattern matching and pattern analysis. A CNN is equally applicable to a data corpus at either the word or character level, without any prior knowledge of the syntactic (or semantic) structures of the language. The CNN model is also applicable to a variety of data science tasks, ranging from computer vision applications to speech recognition and others. The reason behind the successful use of DNNs is that these models build the required mathematical relationship between the given input and the output regardless of whether that relationship is linear or non-linear. Moreover, the information in a DNN moves through the underlying layers, which calculate the probability of each output. This capability makes the DNN a reliable and efficient model for solving the tasks mentioned earlier. The ability of deep learning models to extract and identify distinctive features also plays a key role in producing precise and reliable results. These approaches have also been shown to be competitive with traditional models. The literature related to Urdu handwritten text recognition [21,22,23,24,25,26,27] likewise recommends deep network models to obtain better results in less time.
To our knowledge, and with few exceptions, this paper is among the first to report the results of applying a CNN to classify Urdu handwritten characters. Our contribution in this work is to show that a deep CNN does not require knowledge of each character individually when it is trained on a large-scale dataset. A practical benefit of the proposed recognition system is that it can help children who are learning to write Urdu characters and numerals, since it can correctly classify the characters they write.
The paper is organized as follows: Section 2 gives a brief introduction to the Urdu script. A detailed literature review is given in Section 3. Our proposed dataset is explained in Section 4. In Section 5, the proposed model is explained in detail, and experimental results are shown in Section 6. Finally, future work and conclusion are given in Section 7.

2. Urdu Script

Urdu is the national language of Pakistan and is also one of its two official languages [6] (the other being English). It is widely spoken and understood as a second language by a majority of the people of Pakistan [28,29] and is increasingly being adopted as a first language by people living in urban areas of Pakistan.
Urdu script is written from right to left, while numerals are written from left to right; for this reason, Urdu is considered one of the bidirectional languages [6]. Urdu script consists of 38 basic letters and 10 numerals, as shown in Figure 1. This character set is also considered a superset of some other related scripts, i.e., Arabic contains 28 and Persian contains 32 characters [30]. Furthermore, the Urdu script contains some additional characters to express Hindi phonemes. Hindi and Urdu [30] share the same phonology and differ only in their written script. All languages written in this family of scripts, such as Arabic and Persian, have some unique characteristics: (i) the script is written from right to left in cursive style, and (ii) the script is context sensitive, i.e., written in the form of ligatures, each of which is a combination of one or more characters. Due to this context sensitivity, most characters have different shapes depending on their position and the adjoining character in the word [7]. The connectivity of characters [31] has enriched the Urdu vocabulary with almost 24,000 ligatures.

3. Literature Review

In this section, we assess in detail the different approaches used in the recognition of Urdu handwritten characters. We categorize the tasks and issues related to character-level analysis into two subsections: (i) Urdu handwritten character recognition and (ii) Urdu handwritten numeral recognition.

3.1. Urdu Handwritten Character Recognition

Handwritten text recognition at the character level is a challenging task because of the large number of variations in writing styles (even for a single author). The literature on character-level recognition of Urdu script shows that the artificial neural network (ANN) and its variants are widely used. An ANN [32] is a collection of nodes (also known as artificial neurons) linked with each other. The links between artificial neurons transmit signals from one neuron to another within the network. Each neuron processes the signals it receives and propagates the result to the connected neurons in subsequent layers. The structure of the ANN may be affected by the kind of information flowing through it because a neural network usually trains itself using the input and labeled output.
Developing a generic ICR that can resolve the issues associated with any language is challenging, since different languages exhibit different characteristic features, and thus generalizing this type of system is difficult. To overcome this problem, a novel approach was proposed in [33] that explores how the character set of any language can be represented by primitive geometrical strokes. One of the promising features of the approach is that the recognizer (an artificial neural network) has to be trained only once. The character set is represented in the form of geometrical strokes in an XML file. This file is used to train the neural network once, rather than separately for each word in the language. Figure 2 shows a set of thirteen basic geometrical strokes. For evaluation purposes, a set of 25 handwritten Urdu text samples was tested, and a success rate of 75–80% was achieved. One limitation of this approach is that it does not apply to words having dots and diacritics.
Because the character (or alphabet) set is large, there is inherent similarity among some major strokes, as shown in Figure 2. This similarity between characters is one of the main causes of incorrect recognition of Urdu handwritten text. Keeping this in view, the authors of [21] divided the Urdu character set into four groups according to the number of strokes, as shown in Figure 3. The authors performed online Urdu ICR considering single-stroke characters only. Some novel features (shown in Figure 4) were extracted and then fed to three different classifiers, namely the back propagation neural network (BPNN), the probabilistic neural network (PNN), and the correlation-based classifier. The proposed approach was tested on 85 instances of single-stroke characters taken from 35 writers of different age groups. The results showed that the PNN classifier achieved a higher accuracy of 95% as compared to the other two classifiers. Unlike BPNN, PNN-based classifiers require no initial training. This is the reason given for PNN-based classifiers achieving higher accuracy than BPNN.
For isolated character recognition, the authors in [22] proposed a technique that builds the feature vector by analyzing the primary and secondary strokes made while writing Urdu characters in isolated form. Some of the stroke features used to train the classifier were as follows: the diagonal length of the bounding box; the sine-cosine angle ratio of the bounding box diagonal; the displacement of the first and last point while tracing the bounding box; the corresponding sine-cosine ratio of the angle between the first and last point; the total length (in pixels) of the primary stroke; and the total angle traversed. A linear classifier was applied to a dataset of five samples of each of the 38 Urdu characters, i.e., a total of 190 characters provided by two different writers who could write Urdu characters smoothly. The classifier recognized the characters with an error rate of almost 6%, because some characters share quite similar shapes (see Figure 5) and were not correctly recognized.
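The stroke features listed above can be illustrated with a minimal sketch. It assumes a pen stroke is available as an ordered list of (x, y) points; the function name `stroke_features` and the returned keys are hypothetical, and since the exact form of the "sine-cosine ratio" in [22] is not given here, the sketch simply reports the corresponding angles.

```python
import math


def stroke_features(points):
    """Compute simple geometric features of one pen stroke.

    `points` is a list of (x, y) pen positions in drawing order.
    All names here are illustrative, not taken from [22].
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)

    # Diagonal of the bounding box and the angle it makes with the x-axis.
    diag_len = math.hypot(width, height)
    diag_angle = math.atan2(height, width)

    # Displacement between the first and the last pen position.
    dx = points[-1][0] - points[0][0]
    dy = points[-1][1] - points[0][1]
    displacement = math.hypot(dx, dy)
    end_angle = math.atan2(dy, dx)

    # Total path length and accumulated absolute turning angle.
    path_len = 0.0
    total_turn = 0.0
    prev_heading = None
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue  # skip repeated points
        path_len += seg
        heading = math.atan2(y1 - y0, x1 - x0)
        if prev_heading is not None:
            turn = heading - prev_heading
            # Wrap the turn into [-pi, pi) before accumulating.
            turn = (turn + math.pi) % (2 * math.pi) - math.pi
            total_turn += abs(turn)
        prev_heading = heading

    return {
        "diag_len": diag_len,
        "diag_angle": diag_angle,
        "displacement": displacement,
        "end_angle": end_angle,
        "path_len": path_len,
        "total_turn": total_turn,
    }
```

For an L-shaped stroke such as `[(0, 0), (3, 0), (3, 4)]`, the bounding-box diagonal and the start-to-end displacement are both 5, the path length is 7, and the total turning angle is a quarter turn.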
Similar work was reported in [23], which considered the initial half forms of different Urdu characters. In this work, only those characters were considered that change their shape depending on their position and context in a word. Figure 6 depicts Urdu characters in their initial half forms, classified by the number of strokes. Almost 100 native Urdu writers and speakers were invited to write in Urdu script. The writers were provided with a stylus and a digitizing tablet, yielding a dataset of 3600 instances of Urdu letters in the initial half form. A combination of multilevel one-dimensional wavelet analysis with the Daubechies wavelet [34,35,36] was applied to extract features from these instances. Several neural networks with different configurations were trained for recognition purposes. Among these networks, BPNN provided the maximum recognition rate of 92%.
The MDLSTM (multidimensional long short-term memory) neural network is one of the RNNs (recurrent neural networks) that is used for sequence learning and segmentation in multidimensional environments [37,38,39]. This model was used for the first time for Urdu script recognition in the work of [26]. One of the promising features of the model is that it can scan the input image in all four directions, thus reducing the chance of ambiguity. For evaluation purposes, the UPTI (Urdu Printed Text Image) dataset [40] was used, which contains 10,000 scanned images of both Urdu handwritten and printed text. MDLSTM is a supervised technique; therefore, each input sample in the dataset is tagged and labeled with appropriate information. The dataset is divided according to the following ratio: 68% for training and 16% each for testing and validation. To evaluate the accuracy of the proposed approach, the Levenshtein edit distance [41] was computed between the output text and the baseline results. The approach achieved an accuracy of 94.04%, compared with the 88.94% and 89% accuracy reported in the works of [42,43], respectively. Table 1 shows a comparison of the proposed approach on the UPTI dataset [40] with other techniques.
Promising work was reported in [46], in which Urdu handwritten text was recognized using the UNHD (Urdu Nastaliq Handwritten Dataset) dataset [47]. This dataset can be accessed publicly at https://sites.google.com/site/researchonUrdulanguage1/databases (UNHD Database). The dataset contains 312,000 words (including both Urdu script and Urdu numerals) written on a total of 10,000 lines by 500 writers of different age groups. The writers were directed to write on white A4 pages. Each writer was provided with six blank pages labeled with the author ID and the page number. One of the written pages is shown in Figure 7. Furthermore, in order to maintain uniformity in the data, the writers were asked to copy the provided printed text.
In order to recognize the text, a one-dimensional bidirectional long short-term memory (BLSTM) approach was proposed, based on the RNN (recurrent neural network), which is capable of retaining previous sequence information. For evaluation purposes, the dataset was divided into 50% for training, 30% for validation, and 20% for testing. The approach achieved a 6–8% error rate, which, as proposed by the authors, can be improved using two-dimensional BLSTM. Table 2 gives a summary of the accuracy reported on common datasets in the Urdu domain.
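To make the evaluation measure concrete, the following is a minimal sketch of the Levenshtein edit distance and a character-level accuracy derived from it. The normalization used in `character_accuracy` (one minus the distance divided by the reference length) is an assumption for illustration; the exact formula used in [26] is not stated here.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(predicted, reference):
    """Illustrative accuracy: 1 - edit_distance / len(reference), floored at 0."""
    if not reference:
        return 1.0 if not predicted else 0.0
    return max(0.0, 1.0 - levenshtein(predicted, reference) / len(reference))
```

For example, `levenshtein("kitten", "sitting")` is 3, and a four-character prediction with one wrong character against a four-character reference gives an accuracy of 0.75.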
In [27], the authors proposed a novel approach for Urdu text recognition at the character level, written in Nastaliq font by combining CNN (convolutional neural network) and MDLSTM. In the first phase, CNN was deployed to extract the characteristic features, which were then fed to MDLSTM in the second phase. This approach outperformed the state-of-the-art systems on the UPTI dataset. Table 3 shows the comparison of Urdu recognition on UPTI datasets.

3.2. Urdu Numeral Recognition

It is quite easy for a human being to recognize handwritten numerals, but a computer system needs an intelligent approach based on machine learning algorithms developed for this kind of job. The writing stroke, length, width, orientation, and other geometrical features tend to change when the same numeral is written, even by the same author. These different writing styles may introduce shape variations of Urdu numerals that break the stroke primitives and also change their topology. These issues make Urdu handwritten numeral recognition one of the active research areas in the field of image processing. Unfortunately, there is no commercially available standard dataset of Urdu numerals. Due to this lack of resources, researchers have developed their own datasets and reported results on them. This section covers some notable work related to handwritten numeral recognition in the Urdu domain.
In [51], different transformations of the Daubechies wavelet [34,35,36] were applied for feature extraction from a dataset of about 2150 samples of handwritten Urdu numerals. For evaluation purposes, 2000 samples were used for training the neural network and 150 instances for testing. In order to decompose the images into different frequency bands, both low-pass and high-pass filtering were applied at each phase of the Daubechies wavelet [34,35,36] filtering. For classification purposes, BPNN was used and achieved an average recognition rate of 92.05%, as shown in Figure 8.
In [52,53], the authors presented the similarities and dissimilarities between Urdu and Arabic script with respect to the recognition of handwritten numeric data. A hybrid technique combining HMM and fuzzy rules was used to recognize the handwritten numerals of both Arabic and Urdu script. The dataset was prepared by inviting 30 trained users to write both Urdu and Arabic numerals, giving 900 samples in total. The system obtained 97%, 96%, and 97.8% recognition rates using the fuzzy rule, HMM, and the hybrid approach, respectively. The authors also concluded that separating numerals from Urdu script in handwritten text is still a challenging issue because of shape similarity, e.g., the first character of the Urdu script (Alif) and the Urdu numeral one have exactly the same shape. A new algorithm is proposed in [54] to preprocess complex input while preserving the shape of the actual input. Fuzzy association rules are used to link secondary strokes with their respective primary strokes. Different classifiers such as the hidden Markov model (HMM), fuzzy logic, the k-nearest neighbor (KNN), hybrid fuzzy HMM, hybrid KNN fuzzy, and the convolutional neural network (CNN) were used for the classification. Statistical tests were applied to determine the significance of the classifiers’ results. Similarly, a newly developed OCR algorithm was introduced in the work reported in [55] that used semi-supervised multi-level clustering for categorization of the ligatures. Classification was performed using four machine learning techniques, i.e., decision trees, linear discriminant analysis, naive Bayes, and k-nearest neighbor (k-NN). The system was implemented, and the results showed 62, 61, 73, and 9% accuracy for the decision tree, linear discriminant analysis, naive Bayes, and k-NN, respectively.
In a very recent work [56], the authors presented a simple and robust line segmentation algorithm for Urdu handwritten and printed text. In the proposed line segmentation algorithm, a modified header and a baseline detection method were used. This technique relies purely on a pixel-counting approach, which efficiently segments Urdu handwritten and printed text lines along with skew detection. The handwritten and printed Urdu text datasets were manually generated for evaluating the algorithm. The handwritten dataset consisted of 80 pages with 687 handwritten Urdu text lines, and the printed dataset consisted of 48 pages with 495 printed text lines. The algorithm performed significantly well on printed documents and on handwritten Urdu text documents with well-separated lines, and moderately well on documents containing overlapping words.
The literature on Urdu text recognition at the character level indicates that the ANN outperformed other machine learning approaches. The results generated by ANN-based character recognition systems were two-fold, i.e., the systems were applicable not only to Latin script, but also to handwritten cursive characters of Arabic-based scripts. We present a novel CNN-based approach to recognize Urdu handwritten characters that embeds both pixel-based and geometrical features. The geometrical features were extracted for each text image using a hybrid of connected-components labeling [57] and the upper-lower profile [58]. The upper-lower profile works by dividing the image into four columns, detecting the position of both the first and last black pixels in each column, and providing the bounding box covering the area of interest. The extracted features are then combined with the pixel-based features into a feature vector, which is processed by our proposed model (discussed in the subsequent section) in order to recognize and classify characters using a test set of variable size and invariant font.
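The low-pass/high-pass decomposition described above can be sketched with the Haar wavelet, which is the simplest member of the Daubechies family (db1). This is an illustrative assumption: the source does not state here which Daubechies orders were used in [51], and the function names `haar_step` and `wavelet_features` are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the Haar (db1) discrete wavelet transform.

    Returns (approximation, detail): the low-pass and high-pass halves.
    The input length must be even and non-zero.
    """
    if len(signal) == 0 or len(signal) % 2 != 0:
        raise ValueError("signal length must be even and non-zero")
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_features(signal, levels=2):
    """Multilevel decomposition: repeatedly split the low-pass band.

    Returns the final approximation followed by the detail bands,
    ordered from coarsest to finest.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.insert(0, detail)  # keep coarsest band first
    features = list(approx)
    for band in details:
        features.extend(band)
    return features
```

For a 2D image, the same step would typically be applied along rows and then along columns; the sketch keeps to one dimension for brevity. Because the transform is orthonormal, the sum of squares of the coefficients equals that of the input signal.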
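The upper-lower profile step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from [58]: the image is taken to be a list of rows of 0/1 values with 1 marking a black (ink) pixel, "columns" is read as four vertical strips of equal width, and the names `upper_lower_profile` and `bounding_box` are hypothetical.

```python
def upper_lower_profile(image, n_strips=4):
    """For each vertical strip, return (first_row, last_row) containing ink,
    or None if the strip is empty. `image` is a list of rows of 0/1 values."""
    height = len(image)
    width = len(image[0])
    profile = []
    for s in range(n_strips):
        c0 = s * width // n_strips
        c1 = (s + 1) * width // n_strips
        rows = [r for r in range(height) if any(image[r][c0:c1])]
        profile.append((rows[0], rows[-1]) if rows else None)
    return profile


def bounding_box(image):
    """Return (top, left, bottom, right) of the ink region, or None if blank."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:
        return None
    return (rows[0], cols[0], rows[-1], cols[-1])
```

A small usage example:

```python
img = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(upper_lower_profile(img))  # [(1, 2), (1, 1), (2, 2), None]
print(bounding_box(img))         # (1, 1, 2, 5)
```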

4. Our Dataset

In research activities related to data science, a data-rich dataset plays a key role in generating correct results. A precise and established dataset leads to the correct evaluation of the mathematical models that are applied to it. Furthermore, each data-science domain needs a standard dataset in order to establish benchmark results. While performing our experimental work, we found that there was no publicly available standard dataset, as mentioned earlier. In order to bridge this gap, we developed a novel dataset of Urdu handwritten isolated characters and numerals. Our dataset contained 800 images of each of the 38 Urdu characters and 10 numerals. The dataset was built by inviting 500 native Urdu speakers from different social groups. Each author was directed to write both the Urdu characters and numerals in his or her own handwriting in Nastaliq font in a column, as shown in Figure 9. As mentioned earlier, the dataset was not from a single writer; therefore, there was little chance of overfitting the classification model. It is pertinent to mention that the numeral part of this dataset was also used successfully for visualization in our work [59]. The ground truth and information about the authors of our proposed dataset, e.g., age, gender, hand preference while writing (left hand, right hand, or both), physical impairment (if any), and profession, were also recorded in a suitable XML-based repository. After dataset collection, the text pages were scanned on a flatbed scanner at 300 dpi and segmented manually into images of 28 by 28 pixels for each Urdu character and numeral. As mentioned earlier, the dataset consisted of 800 × 10 = 8000 numeral images and 800 × 38 = 30,400 Urdu character images. We plan to increase the number of authors to 1000 later in order to enrich the dataset by adding as many variations of handwriting as possible. Upon completion of the dataset, we will make it publicly available to researchers.
In the case of Urdu characters, we grouped the characters by shape similarity rather than by the number of strokes shown in Figure 3. On this basis, we divided the Urdu characters into 12 groups, as shown in Figure 10. It is pertinent to mention that our way of grouping the Urdu characters differs from that reported in [21], which is based on the number of strokes.

5. Proposed Model

The block diagram of the proposed Urdu handwritten character classification system is shown in Figure 11. The proposed recognition technique relies on a convolutional neural network (CNN) model with a feature-mapped output layer. When classifying Urdu numerals, the model assigns the given input to one of 10 classes. Similarly, when classifying Urdu characters, the same model assigns the given character to one of 12 classes (see Figure 10). The different phases of the proposed model are detailed in the following subsections.

5.1. Preprocessing

In this phase, the images of our proposed dataset went through some preprocessing steps to prepare the data for further processing. First, the images were processed in order to remove noise using the algorithms reported in the noteworthy work [60]. Then, we converted the images of our dataset to gray-scale and resized them to 28 × 28 pixels while keeping the aspect ratio locked.
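The gray-scale conversion and aspect-preserving resize can be sketched in plain Python as below. Several details are assumptions made for illustration and are not stated in the source: the luminance weights (the common ITU-R BT.601 values), nearest-neighbour sampling, and centring the scaled image on a white 28 × 28 canvas. The noise-removal step from [60] is not reproduced.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to rows of 0-255 gray values
    using BT.601 luminance weights (an assumed choice)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def resize_keep_aspect(gray, size=28, background=255):
    """Scale `gray` so its longer side equals `size` (nearest neighbour),
    then centre it on a size x size canvas filled with `background`."""
    h = len(gray)
    w = len(gray[0])
    scale = size / max(h, w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    resized = [
        [gray[min(h - 1, int(r / scale))][min(w - 1, int(c / scale))]
         for c in range(new_w)]
        for r in range(new_h)
    ]
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas = [[background] * size for _ in range(size)]
    for r in range(new_h):
        canvas[top + r][left:left + new_w] = resized[r]
    return canvas
```

With this padding choice, a tall 56 × 28 input is scaled to 28 × 14 and placed in the middle of the canvas, so the character shape is not stretched.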

5.2. Feature Extraction

Along with pixel-based data of images, each Urdu handwritten character was processed in order to extract the structural/geometrical features like the width of the character, the height of the character, the aspect ratio of the text image, the number of horizontal and vertical lines in the image, the number and position of loops and arcs, etc. These features were then embedded with the pixel-based data of the image in order to obtain accurate results in the classification. The structural features of the Urdu handwritten characters are shown in Figure 12.
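As a rough illustration of combining structural and pixel-based data, the sketch below computes three of the listed features (width, height, and aspect ratio of the character) from a binary image and appends them to the flattened pixel values. The remaining features (line, loop, and arc counts) are omitted, and simple concatenation is an assumption: the source does not specify how the features are embedded. The function names are hypothetical.

```python
def structural_features(binary):
    """Return [width, height, aspect_ratio] of the ink region.

    `binary` is a list of rows of 0/1 values, with 1 marking ink.
    A blank image yields [0.0, 0.0, 0.0].
    """
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    if not rows:
        return [0.0, 0.0, 0.0]
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    return [float(width), float(height), width / height]


def build_feature_vector(binary):
    """Flatten the pixels row by row and append the structural features."""
    pixels = [float(v) for row in binary for v in row]
    return pixels + structural_features(binary)
```

For a 28 × 28 image, this produces a vector of 784 pixel values followed by the three structural values.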

5.3. Convolutional Neural Network

The architecture of a CNN is quite different from that of a conventional neural network. In a conventional neural network, input values are transformed by passing through a series of hidden layers. Every layer is made up of a set of neurons, where each neuron is fully connected to all neurons in the previous layer. The reason behind the better performance of CNNs is that these networks capture the inherent properties of images [61]. This significant feature of CNNs gave us the confidence to use one in the analysis of our proposed dataset.
Our proposed dataset of Urdu handwritten numerals (31K images with 10 labels (0–9), each image being 32 × 32 pixels) is fed into the CNN model. In our model, the first layer was a 2D convolution layer with a 5 × 5 kernel. This layer passes over each and every pixel of the input image. The output of this layer was a feature map of size 28 × 28. After that, the structural features were embedded to build a feature vector. Each output of the convolution was activated using the ReLU (rectified linear unit) activation function [62]. The ReLU function handles the vanishing gradient problem better than the “sigmoid” function [63]. Furthermore, ReLU plays an efficient role in simulating the brain mechanism of humans using the inherent threshold invariant. Finally, the fully connected layer at the end classifies the given input. The model was tested for Urdu characters and numerals separately. The internal functioning of the CNN used in our experimental work is depicted in Figure 13.
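The relation between the 32 × 32 input, the 5 × 5 kernel, and the 28 × 28 feature map follows from the standard output-size formula for a convolution without padding and with stride 1: 32 − 5 + 1 = 28. The sketch below illustrates this together with the ReLU activation. It assumes no padding and unit stride, which is consistent with the sizes quoted but not stated explicitly in the text, and it is a naive single-channel illustration rather than the network of Figure 13.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Standard output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def relu(x):
    """Rectified linear unit: pass positive values, clamp the rest to zero."""
    return x if x > 0 else 0.0


def conv2d_relu(image, kernel):
    """Naive single-channel 'valid' convolution followed by ReLU.

    As in most deep learning frameworks, the kernel is not flipped
    (strictly speaking, this is cross-correlation).
    """
    k = len(kernel)
    out_h = conv_output_size(len(image), k)
    out_w = conv_output_size(len(image[0]), k)
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = sum(image[r + i][c + j] * kernel[i][j]
                      for i in range(k) for j in range(k))
            row.append(relu(acc))
        out.append(row)
    return out
```

Here `conv_output_size(32, 5)` evaluates to 28, matching the feature-map size given above.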

6. Experimental Setup and Results

In order to evaluate the accuracy of our proposed model (see Figure 11, we used CNN to classify both the Urdu handwritten characters and numerals in two different experiments. In order to classify the Urdu handwritten numerals, 3 / 4 of the 8000 data was selected as training data and 1 / 4 as test data. The same proportion was used for classifying the Urdu handwritten character dataset in the second set of experiments.
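The 3/4 to 1/4 partition can be sketched as a simple shuffled split. The shuffling, the fixed seed, and the function name `train_test_split` are assumptions for illustration; the source does not say how samples were assigned to each part or whether the split was stratified by class.

```python
import random


def train_test_split(samples, train_fraction=0.75, seed=0):
    """Shuffle a copy of `samples` reproducibly and split it into
    (train, test) according to `train_fraction`."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Applied to the 8000 numeral images, this gives 6000 training and 2000 test samples; applied to the 30,400 character images, it gives 22,800 and 7600.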
While training CNN having four convolutional layers for both the experiments, we considered the learning rate, the number of hidden neurons, and the batch size as parameters. It was observed from the results that CNN worked efficiently by increasing the network scale with one major drawback of the problem of over-fitting due to a longer time incurred while training. On the other hand, it is possible to get the optimal state of the model by tuning the batch size. The rule of thumb is that the model cannot be trained when the batch size is increased to some certain value [64]. Furthermore, the batch size was also dependent on the available memory. The effect of both the batch size and learning rate on accuracy is shown in Figure 14. In order to avoid the issues mentioned above, we trained the model in a controlled environment using a momentum value of 0.8 . This specific value of momentum helped with obtaining the optimal results.
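The role of the momentum value can be illustrated with the classical momentum update for gradient descent. The update rule below is an assumption, since the source does not give the exact formulation used; only the momentum value of 0.8 comes from the text, and the learning rate is left as a parameter.

```python
def sgd_momentum_step(weights, grads, velocity, learning_rate, momentum=0.8):
    """One classical-momentum update.

    velocity <- momentum * velocity - learning_rate * grad
    weights  <- weights + velocity
    Returns the new (weights, velocity) as lists.
    """
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

As a toy usage example, minimizing f(w) = w² (gradient 2w) from w = 1.0 with a learning rate of 0.1 moves w to 0.8 after the first step and to 0.48 after the second, because the second step reuses 80% of the previous velocity in addition to the new gradient term.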
In order to reach the optimal state of the network, we had to increase the number of convolution kernels gradually, since increasing them all at once would cause overfitting. In the case of batch size, relatively large values were needed to approximate the global gradient. We chose a learning rate of 0.0025 with a batch size of 132. The confusion matrices with efficiency graphs for different numbers of hidden neurons for the Urdu handwritten numerals are shown in Figure 15, with an average accuracy rate of 98.03%. The diagonal values show the classification accuracy for each individual Urdu numeral, while the overall accuracy achieved in each experiment is highlighted in a box (bottom right) in the corresponding confusion matrix. Furthermore, the off-diagonal cells correspond to incorrectly classified observations. Both the number of observations and the percentage of the total number of observations are shown in each cell. The column on the far right of the matrix shows the percentages of all the correctly and incorrectly predicted examples belonging to each class.
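The way such a confusion matrix is read can be sketched as follows. The counts below are hypothetical and are not taken from Figure 15, and the sketch assumes that rows are true classes and columns are predicted classes.

```python
def per_class_and_overall_accuracy(matrix):
    """Return (per-class accuracy list, overall accuracy) for a square count matrix.

    Assumes rows are true classes and columns are predicted classes.
    """
    per_class = [row[i] / sum(row) for i, row in enumerate(matrix)]
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return per_class, correct / total


# Hypothetical counts for three classes (not taken from the paper).
counts = [
    [48, 1, 1],
    [2, 47, 1],
    [0, 3, 47],
]
per_class, overall = per_class_and_overall_accuracy(counts)
print([round(a, 2) for a in per_class])  # [0.96, 0.94, 0.94]
print(round(overall, 4))  # 0.9467
```

The diagonal entries divided by their row totals give the per-class rates, and the sum of the diagonal divided by the grand total gives the overall accuracy shown in the bottom-right box.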
It is clear from the performance graphs that the number of epochs needed to achieve higher accuracy increased with the number of hidden neurons. Here, the number of hidden neurons does not represent the number of classes. In practice, the number of hidden neurons in each new hidden layer equals the number of connections to be made. As shown in the structure of the CNN (Figure 13), the internal layers may comprise different numbers of hidden neurons. These neurons help in extracting the features of the input image as deeply as possible. Adding neurons may increase the complexity, but it helps achieve a higher accuracy rate. The output classes here were the labels of the input, i.e., 0–9 in the case of numeral recognition.
The same set of experiments was performed with the dataset of Urdu handwritten characters. In this experiment, the CNN model was modified by increasing the number of outputs from 10 to 12, since the Urdu characters were grouped into 12 classes based on shape similarity (see Figure 10). The same set of parameters, with different values, was applied for this set of experiments. We chose a learning rate of 0.08 with a starting batch size of 40. The confusion matrices with efficiency graphs using different numbers of hidden neurons for the Urdu handwritten numerals and isolated characters are shown in Figure 15 and Figure 16, respectively. Our final test accuracy was around 98.3% for Urdu numerals and 96.04% for Urdu characters. Table 4 compares our proposed approach with other techniques for the same task.
It is noteworthy that the results shown in blocks (last two red-colored rows of Figure 16) were quite similar regardless of the number of hidden neurons in the case of Urdu handwritten character classification.
The experiments were also performed using variations of the n-fold cross-validation approach in order to avoid any doubt regarding the ratio of training to testing data. The confusion matrices for Urdu handwritten numerals are given in Table 5 and Table 6, showing average accuracies of 92.7% and 95.6% using 10-fold and 8-fold cross-validation, respectively. Similarly, Table 7 and Table 8 show the results for Urdu handwritten characters (shown in groups in Figure 10) with 10-fold and 8-fold cross-validation. Overall, it was observed that our proposed CNN model showed better predictive accuracy compared with other classification models.
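The n-fold procedure can be sketched as below. This is a generic partitioning scheme rather than the authors' exact implementation: the contiguous folds, the sample count of 8000 (taken from the numeral experiment), and the function names are assumptions. In practice the samples would normally be shuffled before being divided into folds.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def mean_accuracy(fold_accuracies):
    """Average the per-fold accuracies into a single score."""
    return sum(fold_accuracies) / len(fold_accuracies)


folds = list(k_fold_indices(n_samples=8000, k=10))
print(len(folds), len(folds[0][0]), len(folds[0][1]))  # 10 7200 800
```

Each sample appears in the test set of exactly one fold, so every sample is used for both training and testing across the full run.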

7. Conclusions

In this paper, we used a CNN (convolutional neural network) to recognize and classify Urdu handwritten characters. We also generated a novel dataset of Urdu handwritten characters and numerals. While performing experiments on our proposed dataset using the CNN, we compared the results of different approaches in order to propose recommendations based on parameter tuning. The application of CNNs to the classification of Urdu handwritten characters provides a platform for developing applications that help children at the beginner level learn to write Urdu characters and numerals correctly. Furthermore, there is a lack of standard data resources in the Urdu domain for generating benchmark results.
In the field of machine learning, deep CNNs have brought a revolutionary change by providing quite efficient results in comparison with conventional approaches. However, some inherent open issues remain, such as the lack of knowledge of how to determine the number of layers and the number of hidden neurons in each layer. Furthermore, a large-scale dataset is required to check the validity and efficiency of deep network models. Therefore, in our experiment, we had to train the CNN with many samples. In addition, finding a set of optimal parameters that generates error-free results is also a research issue. Moreover, our proposed classifier can be assessed against other neural network models such as two-dimensional BLSTM or bidirectional LSTM. Similarly, some complex future tasks, such as recognizing characters in rotated, mirrored-text, and noisy images by extracting novel features, could benefit from this work. We also plan to develop a system that recognizes individual Urdu characters rather than groups. Since data science is continuously providing multifaceted large-scale datasets, it is essential to design and develop more efficient CNN models that are cost effective in their use of resources such as memory and computational bandwidth.
According to Table 4, our proposed model was significantly better than the approaches used in the related literature in terms of the number of parameters and the amount of calculation. Furthermore, our proposed model was quite efficient (in terms of accuracy) and effective at performing the recognition and classification since it provided better accuracy in the minimum time as compared with the others, and it is suitable for developing a learning application for children on mobile phones.

Author Contributions

Conceptualization, M.H.; Data curation, M.M.S.M.; Formal analysis, S.M.; Funding acquisition, M.Z.J. and G.S.C.; Investigation, M.Z.J. and M.M.L.; Methodology, S.M.; Project administration, M.C.; Resources, M.H. and M.M.L.; Software, M.H.; Supervision, M.C. and J.-M.O.; Validation, M.M.S.M.; Writing—original draft, M.H.

Funding

This research was supported by the Ministry of Trade, Industry & Energy (MOTIE, Korea) under the Industrial Technology Innovation Program. No. 10063130, the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2019R1A2C1006159), and MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2019-2016-0-00313) supervised by the IITP (Institute for Information & communications Technology Promotion), and the 2018 Yeungnam University Research Grant.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Plamondon, R.; Srihari, S.N. Online and off-line handwriting recognition: A comprehensive survey. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 63–84.
2. Steinherz, T.; Rivlin, E.; Intrator, N. Offline cursive script word recognition—A survey. Int. J. Doc. Anal. Recognit. 1999, 2, 90–110.
3. Arica, N.; Yarman-Vural, F.T. An overview of character recognition focused on off-line handwriting. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2001, 31, 216–233.
4. Khan, N.H.; Adnan, A.; Basar, S. An analysis of off-line and on-line approaches in Urdu character recognition. In Proceedings of the 15th International Conference on Artificial Intelligence, Knowledge Engineering and Data Bases (AIKED 16), Venice, Italy, 29–31 January 2016; pp. 29–31.
5. Fujisawa, H. Forty years of research in character and document recognition–an industrial perspective. Pattern Recognit. 2008, 41, 2435–2446.
6. Simons, G.F.; Fennig, C.D. Ethnologue: Languages of Asia; Sil International: Dallas, TX, USA, 2017.
7. Jan, Z.; Shabir, M.; Khan, M.; Ali, A.; Muzammal, M. Online Urdu Handwriting Recognition System Using Geometric Invariant Features. Nucleus 2016, 53, 89–98.
8. Sagheer, M.W.; He, C.L.; Nobile, N.; Suen, C.Y. A new large Urdu database for off-line handwriting recognition. In Proceedings of the International Conference on Image Analysis and Processing, Vietri sul Mare, Italy, 8–11 September 2009; Springer: London, UK, 2009; pp. 538–546.
9. Sagheer, M.W.; He, C.L.; Nobile, N.; Suen, C.Y. Holistic Urdu handwritten word recognition using support vector machine. In Proceedings of the 2010 20th International Conference on Pattern Recognition (ICPR), Istanbul, Turkey, 23–26 August 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 1900–1903.
10. Wali, A.; Hussain, S. Context sensitive shape-substitution in nastaliq writing system: Analysis and formulation. In Innovations and Advanced Techniques in Computer and Information Sciences and Engineering; Springer: London, UK, 2007; pp. 53–58.
11. Akram, Q.U.A.; Hussain, S. Ligature-based font size independent OCR for Noori Nastalique writing style. In Proceedings of the 2017 1st International Workshop on Arabic Script Analysis and Recognition (ASAR), Nancy, France, 3–5 April 2017; pp. 129–133.
12. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
13. Maron, M.E. Automatic indexing: An experimental inquiry. J. ACM 1961, 8, 404–417.
14. McCulloch, W.S.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 5, 115–133.
15. Werbos, P. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Ph.D. Thesis, Harvard University, Cambridge, MA, USA, 1974.
16. LeCun, Y.; Boser, B.; Denker, J.S.; Henderson, D.; Howard, R.E.; Hubbard, W.; Jackel, L.D. Backpropagation applied to handwritten zip code recognition. Neural Comput. 1989, 1, 541–551.
17. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
18. Dos Santos, C.; Gatti, M. Deep convolutional neural networks for sentiment analysis of short texts. In Proceedings of the 25th International Conference on Computational Linguistics: Technical Papers, COLING 2014, Dublin, Ireland, 23–29 August 2014; pp. 69–78.
19. Johnson, R.; Zhang, T. Effective use of word order for text categorization with convolutional neural networks. arXiv 2014, arXiv:1412.1058.
20. Kim, Y. Convolutional neural networks for sentence classification. arXiv 2014, arXiv:1408.5882.
21. Haider, I.; Khan, K.U. Online recognition of single stroke handwritten Urdu characters. In Proceedings of the 2009 IEEE 13th International Multitopic Conference, Islamabad, Pakistan, 14–15 December 2009; pp. 1–6.
22. Shahzad, N.; Paulson, B.; Hammond, T. Urdu Qaeda: Recognition system for isolated Urdu characters. In Proceedings of the IUI Workshop on Sketch Recognition, Sanibel Island, FL, USA, 8–11 February 2009.
23. Khan, K.U.; Safdar, Q.-T.-A. Online Urdu handwritten character recognition: Initial half form single stroke characters. In Proceedings of the 2014 12th International Conference on Frontiers of Information Technology (FIT), Islamabad, Pakistan, 17–19 December 2014; pp. 292–297.
24. Naz, S.; Umar, A.I.; Shirazi, S.H.; Khan, S.A.; Ahmed, I.; Khan, A.A. Challenges of Urdu named entity recognition: A scarce resourced language. Res. J. Appl. Sci. Eng. Technol. 2014, 8, 1272–1278.
25. Ahmad, Z.; Orakzai, J.K.; Shamsher, I.; Adnan, A. Urdu nastaleeq optical character recognition. In Proceedings of the World Academy of Science, Engineering and Technology, Istanbul, Turkey, 21–22 December 2017; Citeseer: Forest Grove, OR, USA, 2007; Volume 26, pp. 249–252.
26. Naz, S.; Umar, A.I.; Ahmad, R.; Ahmed, S.B.; Shirazi, S.H.; Razzak, M.I. Urdu Nasta’liq text recognition system based on multi-dimensional recurrent neural network and statistical features. Neural Comput. Appl. 2017, 28, 219–231.
27. Naz, S.; Umar, A.I.; Ahmad, R.; Siddiqi, I.; Ahmed, S.B.; Razzak, M.I.; Shafait, F. Urdu Nastaliq recognition using convolutional recursive deep learning. Neurocomputing 2017, 243, 80–87.
28. Rahman, T. Language and politics in Pakistan. Language 1998, 133, 9.
29. Mahboob, A.; Jain, R. Bilingual Education in India and Pakistan. In Bilingual and Multilingual Education; Springer: Berlin, Germany, 2016; pp. 1–14.
30. Razzak, M.I. Online Urdu Character Recognition in Unconstrained Environment. Ph.D. Thesis, International Islamic University, Islamabad, Pakistan, 2011.
31. Lehal, G.S. Choice of recognizable units for URDU OCR. In Proceeding of the Workshop on Document Analysis and Recognition, Mumbai, India, 16 December 2012; ACM: New York, NY, USA, 2012; pp. 79–85.
32. Van Gerven, M.; Bohte, S. Artificial Neural Networks as Models of Neural Information Processing; Frontiers Media SA: Lausanne, Switzerland, 2018.
33. Ali, A.; Ahmad, M.; Rafiq, N.; Akber, J.; Ahmad, U.; Akmal, S. Language independent optical character recognition for hand written text. In Proceedings of the 8th International Multitopic Conference INMIC 2004, Lahore, Pakistan, 24–26 December 2004; pp. 79–84.
34. Chui, C.K. Wavelets: A tutorial in theory and applications. In First and Second Volume of Wavelet Analysis and Its Applications; Academic Press: New York, NY, USA, 1992.
35. Daubechies, I. The wavelet transform, time-frequency localization and signal analysis. IEEE Trans. Inf. Theory 1990, 36, 961–1005.
36. Daubechies, I. Ten Lectures on Wavelets; SIAM: Philadelphia, PA, USA, 1992; Volume 61.
37. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
38. Kalchbrenner, N.; Danihelka, I.; Graves, A. Grid long short-term memory. arXiv 2015, arXiv:1507.01526.
39. Sak, H.; Senior, A.; Beaufays, F. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In Proceedings of the Fifteenth Annual Conference of the International Speech Communication Association, Singapore, 14–18 September 2014.
40. Sabbour, N.; Shafait, F. A segmentation-free approach to Arabic and Urdu OCR. In Document Recognition and Retrieval XX. International Society for Optics and Photonics; GFDRR: Washington, DC, USA, 2013; Volume 8658, p. 86580N.
41. Yujian, L.; Bo, L. A normalized Levenshtein distance metric. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 1091–1095.
42. Ahmed, S.B.; Naz, S.; Razzak, M.I.; Rashid, S.F.; Afzal, M.Z.; Breuel, T.M. Evaluation of cursive and non-cursive scripts using recurrent neural networks. Neural Comput. Appl. 2016, 27, 603–613.
43. Ul-Hasan, A.; Ahmed, S.B.; Rashid, F.; Shafait, F.; Breuel, T.M. Offline printed Urdu Nastaleeq script recognition with bidirectional LSTM networks. In Proceedings of the 2013 12th International Conference on Document Analysis and Recognition, Washington, DC, USA, 25–28 August 2013; pp. 1061–1065.
44. Naz, S.; Umar, A.I.; Ahmed, R.; Razzak, M.I.; Rashid, S.F.; Shafait, F. Urdu Nasta’liq text recognition using implicit segmentation based on multi-dimensional long short term memory neural networks. SpringerPlus 2016, 5, 2010.
45. Naz, S.; Umar, A.I.; Shirazi, S.H.; Ahmed, S.B.; Razzak, M.I.; Siddiqi, I. Segmentation techniques for recognition of Arabic-like scripts: A comprehensive survey. Educ. Inf. Technol. 2016, 21, 1225–1241.
46. Ahmed, S.B.; Naz, S.; Swati, S.; Razzak, M.I. Handwritten Urdu character recognition using one-dimensional BLSTM classifier. Neural Comput. Appl. 2017, 31, 1–9.
47. Gosselin, B. Multilayer perceptrons combination applied to handwritten character recognition. Neural Process. Lett. 1996, 3, 3–10.
48. Naz, S.; Umar, A.I.; Ahmad, R.; Ahmed, S.B.; Shirazi, S.H.; Siddiqi, I.; Razzak, M.I. Offline cursive Urdu-Nastaliq script recognition using multidimensional recurrent neural networks. Neurocomputing 2016, 177, 228–241.
49. Rashid, S.F.; Schambach, M.P.; Rottland, J.; von der Nüll, S. Low resolution arabic recognition with multidimensional recurrent neural networks. In Proceedings of the 4th International Workshop on Multilingual OCR, Washington, DC, USA, 24 August 2013; ACM: New York, NY, USA, 2013; p. 6.
50. Pham, V.; Bluche, T.; Kermorvant, C.; Louradour, J. Dropout improves recurrent neural networks for handwriting recognition. In Proceedings of the 2014 14th International Conference on Frontiers in Handwriting Recognition, Heraklion, Greece, 1–4 September 2014; pp. 285–290.
51. Borse, R.; Ansari, I. Offline Handwritten and Printed Urdu Digits Recognition Using Daubechies Wavelet; ER Publication: New Delhi, India, 2015.
52. Razzak, M.I.; Hussain, S.; Belaïd, A.; Sher, M. Multi-font numerals recognition for Urdu script based languages. Int. J. Recent Trends Eng. 2009, 2, 70–76.
53. Razzak, M.I.; Hussain, S.; Sher, M. Numeral recognition for Urdu script in unconstrained environment. In Proceedings of the 2009 International Conference on Emerging Technologies, Islamabad, Pakistan, 19–20 October 2009; pp. 44–47.
54. Anwar, F. Online Urdu Handwritten Text Recognition for Mobile Devices Using Intelligent Techniques. Ph.D. Thesis, International Islamic University, Islamabad, Pakistan, 2019.
55. Khan, N.H.; Adnan, A.; Basar, S. Urdu ligature recognition using multi-level agglomerative hierarchical clustering. Clust. Comput. 2018, 21, 503–514.
56. Malik, S.A.; Maqsood, M.; Aadil, F.; Khan, M.F. An Efficient Segmentation Technique for Urdu Optical Character Recognizer (OCR). In Future of Information and Communication Conference; Springer: Berlin, Germany, 2019; pp. 131–141.
57. Samet, H.; Tamminen, M. Efficient component labeling of images of arbitrary dimension represented by linear bintrees. IEEE Trans. Pattern Anal. Mach. Intell. 1988, 10, 579–586.
58. Kavallieratou, E.; Stamatatos, S. Discrimination of machine-printed from handwritten text using simple structural characteristics. In Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004, Cambridge, UK, 23–26 August 2004; Volume 1, pp. 437–440.
59. Husnain, M.; Missen, M.M.S.; Mumtaz, S.; Luqman, M.M.; Coustaty, M.; Ogier, J.M. Visualization of High-Dimensional Data by Pairwise Fusion Matrices Using t-SNE. Symmetry 2019, 11, 107.
60. Soman, R.; Thomas, J. A Novel Approach for Mixed Noise Removal using ‘ROR’ Statistics Combined with ACWMF and DPVM. Int. J. Comput. Appl. 2014, 86, 11–17.
61. Sainath, T.N.; Vinyals, O.; Senior, A.; Sak, H. Convolutional, long short-term memory, fully connected deep neural networks. In Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, 19–24 April 2015; pp. 4580–4584.
62. Nair, V.; Hinton, G.E. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010; pp. 807–814.
63. Zhang, X.; Zhao, J.; LeCun, Y. Character-level convolutional networks for text classification. In Advances in Neural Information Processing Systems; NIPS: Grenada, Spain, 2015; pp. 649–657.
64. Radiuk, P.M. Impact of training set batch size on the performance of convolutional neural networks for diverse datasets. Inf. Technol. Manag. Sci. 2017, 20, 20–24.
Figure 1. Urdu character set with phonemes and numerals with their Roman equivalences.
Figure 2. Basic geometrical strokes of Urdu script [33].
Figure 3. The Urdu characters: (a) the matching order of pen-strokes among Urdu characters; (b) groups of Urdu isolated characters based on the number of strokes [21].
Figure 4. Statistical features: (a) common trend while writing Urdu character “Alif”; (b) slope value of the character-box for Urdu character “Seen”; (c) identification of a cusp in Urdu character “Hey”; (d) feature vector having intersection points of Urdu character “Daal” (U: up, D: down, R: right, L: left); (e) finishing trend of Urdu character “A’en” [21].
Figure 5. Urdu character “De” (left) and “Daal” (right) were repeatedly misrecognized.
Figure 6. Classification of the initial half forms on the basis of the number of strokes [23]. MDLSTM, multidimensional long short-term memory.
Figure 7. Different samples of input images with noise and without noise [47].
Figure 8. Recognition rate and recognition time using different Daubechies wavelets for handwritten numerals [51].
Figure 9. A sample of the handwritten Urdu characters is shown in (a,b), while (c) depicts a sample of handwritten Urdu numerals.
Figure 10. Grouping of Urdu characters according to shape similarity. Each group is numbered from right to left.
Figure 11. Block diagram of proposed Urdu handwritten character classification system.
Figure 12. Details of the types of structural (geometrical) features of Urdu characters and numerals.
Figure 13. Inside architecture of our proposed CNN for Urdu handwritten character recognition system.
Figure 14. Effect of (a) batch size and (b) learning rate on accuracy.
Figure 15. Confusion matrices and respective performance graphs in Urdu handwritten numeral classification. (a) Accuracy result with 10 hidden neurons; (b) best validation performance with 10 hidden neurons; (c) accuracy result with 30 hidden neurons; (d) best validation performance with 30 hidden neurons; (e) accuracy result with 50 hidden neurons; (f) best validation performance with 50 hidden neurons.
Figure 16. Confusion matrices and respective performance graphs in Urdu character classification. (a) Accuracy result with 10 hidden neurons; (b) best validation performance with 10 hidden neurons; (c) accuracy result with 30 hidden neurons; (d) best validation performance with 30 hidden neurons; (e) accuracy result with 50 hidden neurons; (f) best validation performance with 50 hidden neurons.
Table 1. Comparison of the proposed approach on the Urdu Printed Text Image (UPTI) dataset with other techniques [44].

| Authors | Features | Approach | UPTI Dataset | Accuracy (%) |
| --- | --- | --- | --- | --- |
| [44] | pixels | BLSTM (Bidirectional Long Short Term Memory) | 46% Training, 34% Validation, 20% Test | 94.85% |
| [42] | pixels | BLSTM | 46% Training, 44% Validation, 10% Test | 94.85% |
| [43] | statistical features | MDLSTM (Multidimensional Long Short-Term Memory) | 46% Training, 16% Validation, 16% Test | 94.97% |
| [45] | pixels | MDLSTM | 68% Training, 16% Validation, 16% Test | 98.25% |
Table 2. Accuracy reported on common datasets. UNHD, Urdu Nastaliq Handwritten Dataset.

| Dataset | Article Reference | Accuracy | Approaches |
| --- | --- | --- | --- |
| UPTI [40] | [26] | 98% | MDLSTM (Leven.Dist) |
| UPTI [40] | [48] | 94.6% | MDLSTM (CTC) |
| UNHD | [46] | 92% | BDLSTM |
| CENPARMI [8] | [9] | 96.02% | SVM |
Table 3. Comparison of Urdu recognition on UPTI datasets.

| Systems | Segmentation | Features | Classifier | Acc (%) |
| --- | --- | --- | --- | --- |
| [40] | Holistic | Convolution | MDLSTM | 91.00% |
| [42] | Implicit | Pixels | BLSTM | 88.94% |
| [43] | Implicit | Pixels | BLSTM | 94.85% |
| [49] | Implicit | Statistical | MDLSTM | 96.40% |
| [50] | Implicit | Statistical | MDLSTM | 96.97% |
| Proposed [27] | Implicit | Convolution | MDLSTM | 98.12% |
Table 4. Comparison of our proposed approach for Urdu handwritten character classification with other techniques.

Urdu Handwritten Character Classification

| Reference | Approach | Features | Accuracy (%) |
| --- | --- | --- | --- |
| [33] | Neural Network | geometrical strokes | 75–80% |
| [21] | BPNN, PNN | geometrical strokes | 66% |
| [22] | Linear classifier | statistical features | 66% |
| [46] | BLSTM | pixel-based | 92–94% |
| Our proposed approach | CNN | pixel- and geometrical-based | 96.04% |

Urdu Handwritten Numeral Classification

| Reference | Approach | Features | Accuracy (%) |
| --- | --- | --- | --- |
| [51] | Daubechies wavelet | pixel-based | 92.05% |
| [52,53] | HMM, fuzzy rule | pixel-based | 97.45%, 97.09% |
| Our proposed approach | CNN | pixel- and geometrical-based | 98.3% |
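Table 5. Confusion matrix of n-fold cross-validation (10-fold) results for Urdu handwritten numeral classification.

| Eight | Five | Four | Nine | One | Seven | Six | Three | Two | Zero | Classified as |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 92% | 1% | 0 | 0 | 1% | 0 | 2% | 0 | 0 | 4% | Eight |
| 0 | 95% | 0 | 1% | 0 | 0 | 0 | 0 | 0 | 4% | Five |
| 0 | 0 | 93% | 1% | 1% | 0 | 1% | 0 | 3% | 1% | Four |
| 0 | 0 | 0 | 96% | 0 | 1% | 0 | 2% | 0 | 1% | Nine |
| 2% | 0 | 0 | 0 | 93% | 2% | 2% | 0 | 0 | 1% | One |
| 3% | 0 | 0 | 1% | 2% | 91% | 0 | 0 | 2% | 1% | Seven |
| 0 | 2% | 0 | 1% | 2% | 0 | 92% | 1% | 2% | 0 | Six |
| 0 | 0 | 2% | 0 | 0 | 0 | 0 | 94% | 4% | 0 | Three |
| 0 | 0 | 6% | 0 | 3% | 0 | 0 | 0 | 91% | 0 | Two |
| 0 | 2% | 0 | 0 | 2% | 0 | 0 | 0 | 6% | 90% | Zero |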
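
The 92.7% average accuracy quoted in the text for 10-fold cross-validation appears to correspond to the mean of the diagonal entries of Table 5. A quick check, with the values copied from the diagonal of the table and the variable names chosen only for illustration:

```python
# Diagonal of Table 5 in class order Eight, Five, Four, Nine, One, Seven, Six, Three, Two, Zero.
table5_diagonal = [92, 95, 93, 96, 93, 91, 92, 94, 91, 90]
average = sum(table5_diagonal) / len(table5_diagonal)
print(average)  # 92.7
```

This treats every class as equally weighted, which is an assumption about how the average was formed.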
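Table 6. Confusion matrix of n-fold cross-validation (8-fold) results for Urdu handwritten numeral classification.

| Eight | Five | Four | Nine | One | Seven | Six | Three | Two | Zero | Classified as |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 93% | 2% | 0 | 0 | 1% | 1% | 1% | 0 | 2% | 0 | Eight |
| 0 | 96% | 0 | 0 | 0 | 0 | 0 | 0 | 2% | 2% | Five |
| 0 | 0 | 95% | 0 | 0 | 0 | 2% | 0 | 0 | 3% | Four |
| 0 | 0 | 0 | 97% | 0 | 0 | 0 | 1% | 1% | 1% | Nine |
| 3% | 0 | 0 | 1% | 95% | 1% | 0 | 0 | 0 | 0 | One |
| 2% | 0 | 0 | 1% | 3% | 93% | 0 | 0 | 1% | 0 | Seven |
| 0 | 1% | 0 | 3% | 3% | 0 | 93% | 0 | 0 | 0 | Six |
| 0 | 0 | 1% | 0 | 0 | 0 | 0 | 95% | 2% | 2% | Three |
| 0 | 0 | 3% | 2% | 0 | 0 | 0 | 1% | 94% | 0 | Two |
| 0 | 5% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 95% | Zero |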
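Table 7. Confusion matrix of n-fold cross-validation (10-fold) results for Urdu handwritten characters classification.

| Grp 1 | Grp 2 | Grp 3 | Grp 4 | Grp 5 | Grp 6 | Grp 7 | Grp 8 | Grp 9 | Grp 10 | Grp 11 | Grp 12 | Classified as |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 93% | 0 | 0 | 1% | 2% | 0 | 0 | 2% | 1% | 1% | 0 | 0 | Grp 1 |
| 0 | 89% | 0 | 3% | 0 | 0 | 0 | 3% | 2% | 1% | 1% | 1% | Grp 2 |
| 0 | 2% | 92% | 0 | 2% | 1% | 0 | 0 | 0 | 0 | 2% | 1% | Grp 3 |
| 2% | 0 | 3% | 91% | 0 | 2% | 0 | 2% | 0 | 1% | 0 | 0 | Grp 4 |
| 2% | 0 | 5% | 0 | 93% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Grp 5 |
| 0 | 0 | 2% | 3% | 2% | 90% | 0 | 0 | 2% | 0 | 1% | 0 | Grp 6 |
| 0 | 0 | 0 | 0 | 0 | 0 | 93% | 4% | 0 | 0 | 1% | 2% | Grp 7 |
| 0 | 2% | 3% | 0 | 0 | 0 | 0 | 87% | 4% | 1% | 1% | 2% | Grp 8 |
| 3% | 1% | 0 | 0 | 0 | 0 | 0 | 0 | 89% | 0 | 3% | 4% | Grp 9 |
| 0 | 2% | 0 | 0 | 2% | 0 | 0 | 3% | 1% | 90% | 0 | 2% | Grp 10 |
| 0 | 1% | 0 | 0 | 1% | 0 | 0 | 0 | 3% | 0 | 91% | 0 | Grp 11 |
| 3% | 2% | 0 | 0 | 0 | 1% | 1% | 0 | 2% | 0 | 1% | 90% | Grp 12 |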
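Table 8. Confusion matrix of n-fold cross-validation (8-fold) results for Urdu handwritten characters classification.

| Grp 1 | Grp 2 | Grp 3 | Grp 4 | Grp 5 | Grp 6 | Grp 7 | Grp 8 | Grp 9 | Grp 10 | Grp 11 | Grp 12 | Classified as |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 91% | 3% | 0 | 0 | 0 | 0 | 2% | 0 | 2% | 1% | 0 | 1% | Grp 1 |
| 0 | 85% | 4% | 0 | 1% | 3% | 0 | 2% | 0 | 1% | 3% | 1% | Grp 2 |
| 0 | 0 | 90% | 3% | 0 | 1% | 2% | 0 | 0 | 0 | 2% | 2% | Grp 3 |
| 1% | 1% | 2% | 89% | 0 | 3% | 0 | 2% | 0 | 1% | 0 | 1% | Grp 4 |
| 0 | 0 | 1% | 0 | 95% | 0 | 1% | 0 | 1% | 0 | 1% | 1% | Grp 5 |
| 0 | 0 | 2% | 0 | 0 | 94% | 2% | 0 | 2% | 0 | 0 | 0 | Grp 6 |
| 0 | 0 | 0 | 0 | 0 | 0 | 93% | 4% | 0 | 0 | 1% | 2% | Grp 7 |
| 0 | 0 | 0 | 0 | 2% | 1% | 0 | 94% | 0 | 1% | 1% | 1% | Grp 8 |
| 0 | 0 | 1% | 0 | 3% | 0 | 3% | 0 | 87% | 0 | 2% | 4% | Grp 9 |
| 0 | 0 | 0 | 3% | 2% | 0 | 2% | 0 | 0 | 90% | 1% | 2% | Grp 10 |
| 0 | 0 | 3% | 0 | 0 | 0 | 0 | 0 | 4% | 0 | 90% | 3% | Grp 11 |
| 0 | 0 | 0 | 3% | 0 | 2% | 1% | 0 | 1% | 0 | 0 | 93% | Grp 12 |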

Share and Cite

MDPI and ACS Style

Husnain, M.; Saad Missen, M.M.; Mumtaz, S.; Jhanidr, M.Z.; Coustaty, M.; Muzzamil Luqman, M.; Ogier, J.-M.; Sang Choi, G. Recognition of Urdu Handwritten Characters Using Convolutional Neural Network. Appl. Sci. 2019, 9, 2758. https://doi.org/10.3390/app9132758
