Ieee Format
I. INTRODUCTION
Diseases in agricultural crops pose a significant threat,
reducing output and affecting the quality and quantity of farm
products. In India, where 70% of the population relies on
agriculture, this sector contributes 17% to the country's GDP.
Ensuring sufficient food production for a growing global
population is crucial, yet food security faces challenges from
climate change, declining pollinators, and plant diseases.
Smallholder farmers, dependent on healthy crops, suffer the
most, with reports of yield loss exceeding 50% in the
developing world, where over 80% of agricultural production
comes from them.
To address the challenge of feeding an ever-expanding population, innovative agricultural approaches are necessary, as crop diseases significantly threaten food security, causing reduced yields and economic losses. Timely and accurate disease detection is vital for effective management, and recent technological advancements, especially image-based methods leveraging smartphone cameras, make such detection scalable.

Fig 1: Number of samples per class

III. RELATED WORK
The IEEE paper "Efficient Disease Detection of Paddy Crop using CNN" by P. A. H. Vardhini et al. employs a CNN and a Raspberry Pi for efficient paddy crop disease detection. While accurate and implemented on low-cost platforms, its applicability is limited to paddy crops [1].

S. M. Hassan and A. K. Maji introduce a novel CNN model for plant disease identification, featuring inception and residual connections to reduce the parameter count. However, its accuracy is affected by an imbalanced cassava dataset, suggesting potential improvements through data augmentation or ensemble learning [2].

Manish Kumar et al. utilize exploratory data analysis and machine learning to predict plant diseases from soil sensor data, enabling early detection and intervention. Challenges include the need for a robust data collection infrastructure and the difficulty of capturing all disease factors solely through soil sensor data [3].

An innovative methodology integrating IoT and machine learning is proposed for predicting Blister Blight in tea plants, offering real-time monitoring and improved crop management. Concerns involve data privacy and scalability, addressed through encryption, a distributed architecture, and adaptive learning algorithms [4].

The survey in [5] explores automated disease diagnosis of herbal plants via digital image processing, featuring image pre-processing, feature extraction, and classification algorithms. Challenges include image quality variability and limited datasets, tackled through advanced enhancement techniques, crowd-sourcing, and parallel processing for scalability.

IV. EXISTING MODEL
When utilizing classical machine learning (ML) algorithms, certain pre-processing steps are imperative. The fundamental steps are depicted in Figure 2.

A. Region Segmentation
In the context of image classification, typical pre-processing steps involve standardizing images to uniform dimensions and eliminating background and artifacts. As the PlantVillage dataset already contains segmented and scaled images, these procedures were deemed unnecessary. However, further pre-processing was conducted by segmenting the images to extract potentially infected leaf areas. This was achieved by eliminating pixels with green channel values exceeding those of the red and blue channels. Examples of segmented images and images with removed green pixels are illustrated in Figure 2.

B. Feature Extraction
Feature extraction is a critical aspect of ML algorithm implementation, often considered challenging yet pivotal. In this study, texture features and general color statistical features were employed. Texture features were obtained by analyzing the grey level co-occurrence matrix (GLCM), capturing spatial relationships of neighboring pixels. Features such as correlation, contrast, energy, homogeneity, and dissimilarity were derived from the GLCM. Color features were extracted through histogram analysis, providing a comprehensive description of color statistics in the image. Specifically, 120 texture features and 96 color features were computed, totaling 216 features. GLCM analysis involved calculating 12 matrices for both full images and images with removed green pixels, considering 4 distances and 3 angles. Color features were exclusively calculated for full images, encompassing 6 features per color channel. Additionally, a histogram with 26 buckets per channel was utilized, resulting in 78 features.

C. Support Vector Machines (SVM)
Support Vector Machines (SVM) are supervised learning algorithms employed for classification or regression tasks. SVM achieves classification by defining a separating hyperplane in the feature space. It is capable of linear classification and can extend to nonlinear classification through kernel functions, facilitating efficient transformation into high-dimensional feature spaces. Multiclass classification with SVM can be implemented using one-vs-all or one-vs-one strategies. In this study, a radial basis function kernel and regularization parameter C = 100 were found to yield optimal results. The one-vs-all approach was adopted, achieving an accuracy of 91.74% on the test set.
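The one-vs-all decision rule with a radial basis function kernel, as described for the SVM classifier in Section C, can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the support vectors, coefficients, biases, `gamma` value, and class names below are hypothetical toy values, and the regularization parameter C = 100 only influences training, which is not shown here.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def decision_score(x, model, gamma):
    """Signed score of one binary (class-vs-rest) SVM for sample x."""
    total = model["bias"]
    for sv, coef in zip(model["support_vectors"], model["dual_coefs"]):
        total += coef * rbf_kernel(x, sv, gamma)
    return total


def one_vs_all_predict(x, models, gamma):
    """Return the class whose class-vs-rest SVM gives the largest score."""
    return max(models, key=lambda label: decision_score(x, models[label], gamma))


# Hypothetical, hand-made models for two classes in a 2-D feature space
# (illustrative values only, not taken from the paper).
TOY_MODELS = {
    "healthy": {"support_vectors": [(0.0, 0.0)], "dual_coefs": [1.0], "bias": -0.5},
    "diseased": {"support_vectors": [(1.0, 1.0)], "dual_coefs": [1.0], "bias": -0.5},
}

prediction = one_vs_all_predict((0.9, 1.1), TOY_MODELS, gamma=1.0)
print(prediction)
```

With these toy values, the sample (0.9, 1.1) lies much closer to the "diseased" support vector than to the "healthy" one, so the "diseased" classifier produces the highest score and wins the vote.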
D. k-Nearest Neighbors
k-NN [7] is a very simple algorithm often used for classification problems. It is both non-parametric (it does not have a fixed number of parameters) and a lazy learner (it does not have a training phase). k-NN works under the assumption that most samples from the same class are close to each other in the feature space. When determining the class of a sample, k-NN looks at its k closest neighbors and decides to which class it belongs by the simple majority rule. Small values of k allow for higher non-linearity but are sensitive to outliers. High values of k achieve good generalization but fail to fit complex boundaries. The best value for parameter k is determined experimentally.
For this dataset, small values of k were shown to give the best results. Varying k from 1 to 9 does not change the accuracy much, with the best result being 78.06%, much lower than the SVM. We used k = 5 in this work.

E. Fully Connected Neural Network
FCNN is the simplest type of artificial neural network. It is a supervised learning algorithm able to model highly non-linear functions. As opposed to SVM and k-NN, it is not guaranteed to converge to the global optimum, but when properly configured, it usually gives good enough results. Important neural network configuration parameters are:
• number of hidden layers
• activation function
• number of neurons per layer
• optimization method
In this paper, we used an FCNN with four hidden layers with 300, 200, 100 and 50 neurons per layer, respectively. The activation function in the hidden layers is a rectified linear unit (ReLU), with a softmax in the output layer [8]. We used L2 regularization with the regularization parameter equal to 0.3. The Adam optimizer with default parameters was used. This configuration gave us an accuracy of 91.46% on the test set.

F. Overall Workflow
The workflow involved pre-processing image data, extracting relevant features, and applying three distinct classical ML algorithms: SVM, k-NN, and FCNN. Each algorithm was evaluated based on its accuracy in classifying healthy and diseased plant leaves using the PlantVillage dataset. The performance metrics indicated SVM as the most effective classifier in this context, achieving the highest accuracy on the test set.

V. METHODOLOGY
Deep Learning Based Approach
In this section, we describe the proposed methodology used to evaluate the effectiveness and efficiency of existing real-time detection schemes for plant disease detection. Deep Learning (DL) represents a subset of Machine Learning algorithms that employ multiple layers to learn features in a hierarchical manner. Predominantly based on artificial neural networks, DL algorithms are at the forefront of contemporary Artificial Intelligence (AI) solutions. These models have demonstrated exceptional capability in learning intricate patterns given sufficient data. One of the primary advantages of DL algorithms is their ability to automatically learn relevant features from raw data, obviating the need for manual feature engineering. For tasks like image recognition, Convolutional Neural Networks (CNNs) are commonly employed.

To contrast with classical models, we utilized a GoogLeNet model with parameters as detailed in [3]. This model leveraged pretraining on the ImageNet dataset and was configured with the following parameters:
- Optimizer: Stochastic Gradient Descent
- Learning rate: 0.005
- Momentum: 0.9
- Weight decay: 0.0005
- Batch size: 24
- Number of epochs: 10

Converging within 10 epochs, this DL model achieved an accuracy of 99.32%, surpassing the classical algorithms by a significant margin. These results were obtained through experiments run over the stated number of epochs to detect and identify diseases across the plant image classes in the dataset.

TABLE I
THE COMPARATIVE ANALYSIS OF PREVIOUS STUDIES

Reference | Technique | Dataset | Outcome | Limitations
--- | --- | --- | --- | ---
[1] | Disease prediction, Raspberry Pi, CNN, Artificial Intelligence | Paddy crop | Accurate disease detection | Limited applicability to paddy crop plants
[2] | Novel CNN model based on inception and residual connections | Imbalanced cassava dataset | Reduced parameters, innovative architecture | Utilization of an imbalanced dataset leads to lower accuracy
[3] | Exploratory data analysis, Machine learning | Soil sensor data | Early disease detection, timely intervention | Requires robust data collection infrastructure, may not capture all disease factors through soil data
[4] | IoT, Machine learning algorithms | Sensor networks data | Real-time monitoring, early disease detection | Data privacy concerns, scalability issues, need for robustness in handling diverse conditions
[5] | Automated disease diagnosis, classification | Herbal plant image dataset | Non-invasive diagnosis, high accuracy | Variability in image quality, limited labeled datasets, high computational complexity

Based on the survey, Table I summarizes the related work in terms of the techniques used, the datasets employed, the outcomes achieved, and the limitations of each study.

... of features. The number of convolutional layers varies depending on the size of the input images.

After the convolutional layer, pooling is performed, which is responsible for reducing the dimension of the convolutional feature map. This down-sampling ultimately helps in reducing the computational complexity required to process the data. Different types of pooling operations exist, such as max-pooling, min-pooling, and average-pooling. The output feature maps of the convolution or pooling layer are transformed into a one-dimensional vector in which every input is connected to every output by a weight. This layer is also called a dense layer. There can be one or more fully connected layers, and the final fully connected layer has the same number of outputs as the number of classes.

Here is a brief description of how a CNN-based approach works:

Convolutional Layers: CNNs consist of multiple layers, starting with convolutional layers. In these layers, filters, also known as kernels, are applied to the input image to extract a variety of features. Each filter traverses the input, conducting element-wise multiplication and accumulating the outcomes to generate feature maps.
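The convolution and max-pooling operations described above can be illustrated with a small pure-Python sketch. It assumes stride 1 and no padding for the convolution and non-overlapping 2x2 windows for the pooling; the 5x5 input and the 2x2 kernel are made-up values chosen only for illustration. As in most CNN frameworks, the kernel is applied without flipping (cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    return the resulting feature map as a list of rows."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            # Element-wise multiplication of the window and the kernel,
            # accumulated into a single output value.
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        feature_map.append(row)
    return feature_map


def max_pool(feature_map, size=2):
    """Non-overlapping max-pooling; rows or columns that do not fill
    a complete window are dropped."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Illustrative 5x5 single-channel input and 2x2 kernel (assumed values).
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 3, 1, 0],
    [2, 1, 0, 2, 1],
    [1, 0, 1, 3, 2],
    [0, 2, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]

feature_map = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool(feature_map)             # 2x2 after pooling
print(feature_map)
print(pooled)
```

The 5x5 input yields a 4x4 feature map, which pooling reduces to 2x2, showing how each stage shrinks the data that later layers must process.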
Fine-tuning and Transfer Learning: CNNs offer the capability for fine-tuning or transfer learning. Fine-tuning entails training the network on a new dataset with a diminished learning rate, while transfer learning involves leveraging pre-trained CNN models on novel tasks or datasets by repurposing the acquired features.

Overall, CNN-based approaches have demonstrated state-of-the-art performance in various computer vision tasks and are widely used in both research and practical applications.

This paper delineates a comprehensive methodology for both the development and evaluation of deep learning models tailored for image classification tasks. The proposed method consists of several key steps: dataset preparation, data splitting into training and testing sets, selection and implementation of various models including Convolutional Neural Networks (CNN), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Fully Connected Neural Networks (FCNN), training and validation of the models, performance metric evaluation, and visualization techniques. Each step is described in detail to provide a comprehensive guide for researchers and practitioners in the field of image classification.

Dataset Preparation: This step involves acquiring and pre-processing the dataset suitable for the image classification task. Pre-processing may include resizing images, normalization, and augmentation techniques to enhance the diversity of the dataset.

Model Architecture Selection and Implementation: Various models including CNN, SVM, KNN, and FCNN are considered for the image classification task. Each model's architecture is selected based on its suitability for the dataset and the complexity of the classification problem. The implementation of each model is described in detail, including the configuration of layers, activation functions, and optimization algorithms.

Training and Validation: The selected models are trained using the training dataset and validated using the validation dataset to optimize their performance. Techniques such as cross-validation and hyperparameter tuning may be employed to fine-tune the models and prevent overfitting.

By adhering to this suggested methodology, researchers and practitioners can methodically devise, assess, and compare deep learning models for image classification tasks, fostering progress in the realm of computer vision and pattern recognition.

VI. EXPERIMENTAL RESULTS

Experimental results indicate that the CNN architecture consistently outperforms SVM, KNN, and FCNN models in terms of accuracy, showcasing its effectiveness in learning hierarchical features from images. While SVM and KNN models demonstrate competitive performance, particularly on simpler datasets, FCNNs excel in capturing complex nonlinear relationships within the data. Overall, the findings validate the proposed methodology's efficacy in guiding model selection and development for image classification tasks.
ACCURACY OF PROPOSED CNN MODEL BEFORE AND AFTER TRAINING
After Training: 96.54

We can see that k-NN has a much lower score than the other choices. SVM and FCNN have comparable results, although still much lower than CNN, which was shown to give the best results by far. The error rate for CNN was less than 1%, compared to 8-9% for SVM and FCNN, and more than 20% for k-NN.
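The error rates quoted above follow directly from the test-set accuracies reported for each model (error rate = 100% minus accuracy). The short check below reproduces that arithmetic; the dictionary layout and function name are illustrative choices, while the accuracy values are the ones stated in the text.

```python
# Test-set accuracies (%) as reported in the text for each model.
ACCURACY = {"CNN": 99.32, "SVM": 91.74, "FCNN": 91.46, "k-NN": 78.06}


def error_rate(accuracy_percent):
    """Error rate in percent, rounded to two decimals to hide float noise."""
    return round(100.0 - accuracy_percent, 2)


error_rates = {name: error_rate(acc) for name, acc in ACCURACY.items()}
for name, err in error_rates.items():
    print(f"{name}: {err:.2f}% error")

# How many times larger the SVM error is than the CNN error.
svm_to_cnn_ratio = error_rates["SVM"] / error_rates["CNN"]
print(f"SVM/CNN error ratio: {svm_to_cnn_ratio:.1f}")
```

The computed values are 0.68% for CNN, 8.26% for SVM, 8.54% for FCNN, and 21.94% for k-NN, so the SVM error is roughly twelve times the CNN error.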
VII. CONCLUSION
This paper highlights the superiority of the DL method compared to classical ML algorithms. The simplicity of the approach and the attained accuracy affirm that DL is the preferred approach for addressing image classification challenges with relatively large datasets. Since the achieved accuracy of the DL method is already exceedingly high, attempting to enhance its results on the same dataset would yield minimal benefits. Further work with the DL model could be done by expanding the dataset with more diverse images, collected from multiple sources, in order to allow it to generalize better.
The considered ML algorithms achieved relatively high accuracy, but with error rates still an order of magnitude higher than the DL model. Further work in improving the accuracy of the classical approach can be done by experimenting with other algorithms and by improving the features, as most likely they are the limiting factor of this approach.

REFERENCES
[1] P. A. H. Vardhini, S. Asritha, and Y. S. Devi, "Efficient Disease Detection of Paddy Crop using CNN," in 2020 International Conference on Smart Technologies in Computing, Electrical, and Electronics (ICSTCEE), Bengaluru, India, 2020, pp. 116-119, doi: 10.1109/ICSTCEE49637.2020.9276775.
[2] S. M. Hassan and A. K. Maji, "Plant Disease Identification Using a Novel Convolutional Neural Network," in IEEE Access, vol. 10, pp. 5390-5401, 2022, doi: 10.1109/ACCESS.2022.3141371.
[3] M. Kumar, A. Kumar, and V. S. Palaparthy, "Soil Sensors-Based Prediction System for Plant Diseases Using Exploratory Data Analysis and Machine Learning," in IEEE Sensors Journal, vol. 21, no. 16, pp. 17455-17468, Aug. 15, 2021, doi: 10.1109/JSEN.2020.3046295.
[4] Z. Liu, R. N. Bashir, S. Iqbal, M. M. A. Shahid, M. Tausif, and Q. Umer, "Internet of Things (IoT) and Machine Learning Model of Plant Disease Prediction–Blister Blight for Tea Plant," in IEEE Access, vol. 10, pp. 44934-44944, 2022, doi: 10.1109/ACCESS.2022.3169147.
[5] A. Khandelwal, A. Shukla, and M. Sain, "A Survey on Automated Disease Diagnosis and Classification of Herbal Plants Using Digital Image Processing," in 6th International Conference on Information Systems and Computer Networks (ISCON), Mathura, India, 2023.
[6] S. Savary et al., "The global burden of pathogens and pests on major food crops," Nature Ecology & Evolution, vol. 3, no. 3, p. 430, 2019.
[7] S. P. Mohanty, D. P. Hughes, and M. Salathé, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, p. 1419, 2016.
[8] E. Fujita et al., "A practical plant diagnosis system for field leaf images and feature visualization," International Journal of Engineering & Technology, vol. 7, no. 4.11, pp. 49-54, 2018.
[9] R. M. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Transactions on Systems, Man, and Cybernetics, vol. 6, pp. 610-621, 1973.
[10] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273-297, 1995.