
FINAL PROJECT (GROUP A) - IT (5th SEMESTER) DOCUMENT


Declaration :-

We hereby declare that the project work being presented in the project proposal entitled “HAND
DRAWN GEOMETRIC SHAPES (TRIANGLE AND SQUARE) CLASSIFICATION USING
DEEP LEARNING”, in partial fulfilment of the requirements for the award of the degree of
BACHELOR OF INFORMATION TECHNOLOGY at GURU NANAK INSTITUTE OF
TECHNOLOGY, SODEPUR, KOLKATA, WEST BENGAL, is an authentic work carried out
under the guidance of MR. TRIDIP CHAKRABORTY. To the best of our knowledge and belief, the
matter embodied in this project work has not been submitted elsewhere for the award of any degree.

December 2020

Name of the Student:

Debayan Roy - Roll No - 500418011011


Sayani Sengupta - Roll No - 500418021046
Gourav Nandy - Roll No - 500418011014
Sriparna Ghosh - Roll No - 500418021054
Pallab Nath - Roll No - 500418011033
Roumita Singha - Roll No – 500418021041
Certificate

This is to certify that this proposal of minor project entitled “HAND DRAWN GEOMETRIC
SHAPES (TRIANGLE AND SQUARE) CLASSIFICATION USING DEEP LEARNING” is
a record of bonafide work, carried out by Debayan Roy, Sayani Sengupta, Sriparna Ghosh,
Pallab Nath, Gourav Nandy and Roumita Singha under my guidance at GURU NANAK
INSTITUTE OF TECHNOLOGY. In my opinion, the report in its present form is in partial
fulfilment of the requirements for the award of the degree of BACHELOR OF INFORMATION
TECHNOLOGY and as per regulations of the institution. To the best of my knowledge, the
results embodied in this report are original in nature and worthy of incorporation in the present
version of the report.

Debayan Roy - Roll No - 500418011011


Sayani Sengupta - Roll No - 500418021046
Gourav Nandy - Roll No - 500418011014
Sriparna Ghosh - Roll No - 500418021054
Pallab Nath - Roll No - 500418011033
Roumita Singha - Roll No – 500418021041

-------------------------------------- -------------------------------------------
HEAD OF THE DEPARTMENT PROJECT SUPERVISOR
Mr. Sudeep Ghosh Mr. Tridip Chakraborty
Head Of the Department Assistant Professor
Department of Information Technology Department of Information Technology .
GNIT , Kolkata GNIT , Kolkata

Invigilator :- ___________________________
Acknowledgement

The success of any project depends largely on the encouragement and guidance of many others. We take
this opportunity to express our sincere gratitude to the people who have been instrumental in the
successful completion of this project work.

We would like to show our greatest appreciation to Mr. Tridip Chakraborty, Assistant Professor,
Department of Information Technology at Guru Nanak Institute of Technology, Kolkata. We always
felt motivated and encouraged by his valuable advice and constant inspiration; without
his encouragement and guidance this project would not have materialized.

Words are inadequate in offering our thanks to our classmates, teachers and other members at
GNIT, Kolkata for their encouragement and cooperation in carrying out this project work. The
guidance and support received from all the members who contributed to this project was
vital for its success.
Index :-

1. Aim of the Project
2. Introduction
3. Glossary
4. Problem Definition
5. Scope of the Project
6. Technology Used
7. Proposed Scheme
   A. Working of the Model
   B. Model Implementation
8. Review Work
   A. Detailed Designing
   B. Project Code
   C. Dataset Representation
   D. Output Screen
9. Understandability and Cost Effectiveness
10. Benefits of the Project
11. Future Scope
12. Conclusion
13. Bibliography

Aim Of The Project

Classification of hand drawn geometric shapes (triangle and square) using deep learning

Introduction

Image classification refers to the task of extracting information classes from a multiband raster
image. The resulting raster from image classification can be used to create thematic maps.
Depending on the interaction between the analyst and the computer during classification,
there are two types of classification: supervised and unsupervised.

Supervised Classification
Supervised classification uses the spectral signatures obtained from training samples to classify an
image. With the assistance of the Image Classification toolbar, you can easily create training
samples to represent the classes you want to extract. You can also easily create a signature file
from the training samples, which is then used by the multivariate classification tools to classify the
image.

Unsupervised Classification
Unsupervised classification finds spectral classes (or clusters) in a multiband image without the
analyst’s intervention. The Image Classification toolbar aids in unsupervised classification by
providing access to the tools to create the clusters, capability to analyze the quality of the clusters,
and access to classification tools.

Hand Drawn Image Classification and Detection :-

The project discusses an approach involving hand drawn digital image processing and geometric
shape analysis for recognition of two-dimensional shapes of objects such as squares and triangles,
as well as the color of the object. This approach can be extended to applications like robotic vision
and computer intelligence. The methods involved are conversion of a three-dimensional RGB image
to a two-dimensional black and white image, color pixel classification for object-background
separation, area-based filtering, and use of the bounding box and its properties for calculating
object metrics.
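The preprocessing steps named above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the project's code: the luminance weights, the threshold of 128, and the helper names `to_grayscale`, `to_binary` and `bounding_box` are all chosen here for the example.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB array into an (H, W) luminance array."""
    weights = np.array([0.299, 0.587, 0.114])  # common luma weights (assumption)
    return rgb @ weights

def to_binary(gray, threshold=128):
    """Mark dark pixels (the pen stroke) as True and light background as False."""
    return gray < threshold

def bounding_box(mask):
    """Return (top, left, bottom, right) of the True region, inclusive, or None."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None  # nothing was drawn
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])

# Toy 6 x 6 white canvas with a dark 3 x 2 block standing in for a drawn shape.
canvas = np.full((6, 6, 3), 255.0)
canvas[1:4, 2:4] = 0.0
box = bounding_box(to_binary(to_grayscale(canvas)))
print(box)  # (1, 2, 3, 3)
```

The height and width of the box, and the ratio of object pixels to box area, are examples of the object metrics that can be derived from it.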

We try to find optimal ways of recognizing hand drawn geometric shapes (squares and triangles)
in files of different formats. Shape recognition is a field of artificial intelligence which includes
all representation and decision techniques used to automate the process of identifying similarities
between objects or phenomena. An application of shape recognition requires the definition of
descriptors and the choice of a distance measure, and is carried out in two phases:
learning and recognition.

This detection is carried out by generating a dataset which has two sets, Training and Validation.
Once the image format has been checked, the data generators come into the picture.
Then the model is created, compiled and trained to produce the output.

Glossary

Machine learning implementations are classified into the following major categories, depending on
the nature of the learning “signal” or “response” available to a learning system:

Supervised learning : An algorithm learns from example data and associated target
responses, which can consist of numeric values or string labels such as classes or tags, in order to
later predict the correct response when posed with new examples. This approach is similar to
human learning under the supervision of a teacher. The teacher provides good examples for the
student to memorize, and the student then derives general rules from these specific examples.

Unsupervised learning : An algorithm learns from plain examples without any
associated response, leaving the algorithm to determine the data patterns on its own. This type
of algorithm tends to restructure the data into something else, such as new features that may
represent a class or a new series of uncorrelated values. Such algorithms are quite useful in providing
humans with insights into the meaning of data and new useful inputs to supervised machine learning
algorithms. As a kind of learning, it resembles the methods humans use to figure out that certain
objects or events are from the same class, such as by observing the degree of similarity between
objects. Some recommendation systems that you find on the web in the form of marketing
automation are based on this type of learning.

Reinforcement learning : The algorithm is presented with examples that lack labels, as
in unsupervised learning. However, each example can be accompanied by positive or negative
feedback according to the solution the algorithm proposes. Reinforcement learning is connected to
applications for which the algorithm must make decisions (so the product is prescriptive, not just
descriptive, as in unsupervised learning), and the decisions bear consequences. In the human world,
it is just like learning by trial and error. Errors help you
learn because they have a penalty added (cost, loss of time, regret, pain, and so on), teaching you
that a certain course of action is less likely to succeed than others. An interesting example of
reinforcement learning occurs when computers learn to play video games by themselves. In this
case, an application presents the algorithm with examples of specific situations, such as having the
gamer stuck in a maze while avoiding an enemy. The application lets the algorithm know the
outcome of actions it takes, and learning occurs while trying to avoid what it discovers to be
dangerous and to pursue survival.

Semi-supervised learning : An incomplete training signal is given: a training set with
some (often many) of the target outputs missing. There is a special case of this principle known as
transduction, where the entire set of problem instances is known at learning time, except that part
of the targets are missing.

Categorizing on the basis of required Output

Another categorization of machine learning tasks arises when one considers the desired output of a
machine-learned system:

1. Classification : Inputs are divided into two or more classes, and the learner must produce
a model that assigns unseen inputs to one or more (multi-label classification) of these classes. This
is typically tackled in a supervised way. Spam filtering is an example of classification, where the
inputs are email (or other) messages and the classes are “spam” and “not spam”.

2. Regression : Also a supervised problem, in which the outputs are continuous
rather than discrete.

3. Clustering : A set of inputs is to be divided into groups. Unlike in classification, the
groups are not known beforehand, making this typically an unsupervised task.

Problem Definition

Hand Drawn Geometric Shapes (Triangle and Square) Classification Using Deep Learning.

Scope of the Project

This project helps us in different ways, such as:

➢ Image and shape detection
➢ Image Classification

Classic deep learning methods are based on the assumption that the data are vectors, in order to
exploit basic operations such as convolutions. While this suffices for many signal classification
problems such as speech, image, and video classification/segmentation, in various applications the
data have other structures.

Technology Used

❖ Software - Performed on Google Colaboratory (accessible on all kinds of platforms)

❖ Hardware – Any kind of hardware that supports recent technology.



Proposed Scheme

Project Planning

➢ Working of the Model :-

The first layer of a neural network takes in all the pixels within an image. After all the data has
been fed into the network, different filters are applied to the image, which forms representations
of different parts of the image. This is feature extraction and it creates "feature maps". This process
of extracting features from an image is accomplished with a "convolutional layer", and convolution
is simply forming a representation of part of an image. It is from this convolution concept that we
get the term Convolutional Neural Network (CNN), the type of neural network most commonly
used in image classification.

Digital images are rendered as height, width, and some RGB value that defines the pixel's colors,
so the "depth" that is being tracked is the number of color channels the image has. Grayscale
(non-color) images have only 1 color channel while color images have 3. All of this
means that for a filter of size 3 applied to a full-color image, the dimensions of that filter will be 3
x 3 x 3. For every pixel covered by that filter, the network multiplies the filter values with the
values in the pixels themselves to get a numerical representation of that pixel. This process is then
done for the entire image to achieve a complete representation. The filter is moved across the rest
of the image according to a parameter called "stride", which defines how many pixels the filter is
to be moved by after it calculates the value in its current position. The end result of all this
calculation is a feature map. This process is typically done with more than one filter, which helps
preserve the complexity of the image.
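To make the filter and stride mechanics concrete, here is a minimal NumPy sketch of one filter sliding over an image without padding. It is an illustration only; the function name `conv2d_single` and the toy 5 x 5 image are assumptions made for the example, and a real convolutional layer also adds a bias and runs many filters at once.

```python
import numpy as np

def conv2d_single(image, kernel, stride=1):
    """Slide one (k, k, C) filter over an (H, W, C) image with no padding."""
    k = kernel.shape[0]
    out_h = (image.shape[0] - k) // stride + 1
    out_w = (image.shape[1] - k) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride          # top-left corner of the window
            patch = image[r:r + k, c:c + k, :]     # k x k x C block under the filter
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

image = np.ones((5, 5, 3))    # toy full-color image
kernel = np.ones((3, 3, 3))   # one 3 x 3 x 3 filter
fmap_stride1 = conv2d_single(image, kernel, stride=1)
fmap_stride2 = conv2d_single(image, kernel, stride=2)
print(fmap_stride1.shape, fmap_stride2.shape)  # (3, 3) (2, 2)
```

A larger stride visits fewer positions, so the resulting feature map is smaller.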

After the feature map of the image has been created, the values that represent the image are passed
through an activation function or activation layer. The activation function takes values that
represent the image, which are in a linear form (i.e. just a list of numbers) thanks to the
convolutional layer, and increases their non-linearity since images themselves are non-linear. The
typical activation function used to accomplish this is a Rectified Linear Unit (ReLU), although
there are some other activation functions that are occasionally used.
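As a small illustration, ReLU simply replaces every negative value in a feature map with zero and leaves positive values unchanged. The sample values below are made up for the example.

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: element-wise max(x, 0)."""
    return np.maximum(x, 0)

feature_map = np.array([[-2.0, 0.5],
                        [3.0, -0.1]])
activated = relu(feature_map)
print(activated.tolist())  # [[0.0, 0.5], [3.0, 0.0]]
```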

After the data is activated, it is sent through a pooling layer. Pooling "downsamples" an image,
meaning that it takes the information which represents the image and compresses it, making it
smaller. The pooling process makes the network more flexible and more adept at recognizing
objects/images based on the relevant features.

When we look at an image, we typically aren't concerned with all the information in the
background of the image, only the features we care about, such as people or animals. Similarly, a
pooling layer in a CNN will abstract away the unnecessary parts of the image, keeping only the
parts of the image it thinks are relevant, as controlled by the specified size of the pooling layer.
Because it has to make decisions about the most relevant parts of the image, the hope is that the
network will learn only the parts of the image that truly represent the object in question. This helps
prevent overfitting, where the network learns aspects of the training case too well and fails to
generalize to new data.
There are various ways to pool values, but max pooling is most commonly used. Max pooling
obtains the maximum value of the pixels within a single filter (within a single spot in the image).
This drops 3/4ths of information, assuming 2 x 2 filters are being used. The maximum values of
the pixels are used in order to account for possible image distortions, and the parameters/size of
the image are reduced in order to control for overfitting. There are other pooling types such as
average pooling or sum pooling, but these aren't used as frequently because max pooling tends to
yield better accuracy.
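The 2 x 2 max pooling described above can be sketched in NumPy as follows. The helper name `max_pool_2x2` and the sample values are assumptions for illustration; the sketch handles a single-channel feature map only.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the maximum of every non-overlapping 2 x 2 window."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Split the map into (h2, 2, w2, 2) blocks and reduce over each 2 x 2 block.
    blocks = feature_map[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 0, 5, 6],
                        [1, 2, 7, 8]])
pooled = max_pool_2x2(feature_map)
print(pooled.tolist())  # [[4, 2], [2, 8]]
```

The 16 input values are reduced to 4, which is the three-quarters reduction mentioned in the text.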

➢ Block diagram

Fig (1.1) – Block Diagram Representation of the Model



➢ Model Implementation

A. Part 1 — Data Preprocessing

I. Importing the libraries


from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
import numpy as np
from keras.preprocessing import image

II. Generate Data


The dataset contains triangle and square images. Using these images as
features for our model, we try to predict whether a particular image is a triangle
or a square.
Image classification assigns images to their respective category
classes using a method such as:
• Training a small network from scratch
• Fine tuning the top layers of a pre-trained model such as VGG16

Let’s discuss how to train a model from scratch and classify the data containing triangles and squares.
Train Data : The training data contains 160 images each of triangles and squares, i.e. 320 images
in total.
Test Data : The test data contains 40 images each of triangles and squares, i.e. 80 images
in total.

First, we prepare the dataset in the arrangement below.


- > TRAIN
    - SQUARE
        Square-1.jpg
        Square-2.jpg
        …………..
    - TRIANGLE
        Triangle-1.jpg
        Triangle-2.jpg
        …………..
- > VALIDATION
    - SQUARE
        Square-1.jpg
        Square-2.jpg
        …………..
    - TRIANGLE
        Triangle-1.jpg
        Triangle-2.jpg
        …………..

img_width, img_height = 150, 150

Every image in the dataset is of size 150 x 150.

train_data_dir = '/content/drive/MyDrive/Dataset/train'
validation_data_dir = '/content/drive/MyDrive/Dataset/validation'
nb_train_samples = 320
nb_validation_samples = 80
epochs = 2
batch_size = 1

Here, train_data_dir is the training dataset directory and validation_data_dir is the directory for
validation data. nb_train_samples is the total number of training samples and nb_validation_samples
is the total number of validation samples.

B. Part 2 - Checking format of Image

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

This part checks the image data format, i.e. whether the RGB channel axis comes first or last.
Whichever it is, the check runs first and the input shape is then fed to the model
accordingly.

C. Part 3 – Using DataGenerator

Now the data generators come into the picture. We have used:

# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

The training ImageDataGenerator rescales the image, applies shear within a given range, zooms the
image and flips it horizontally, so the model sees several randomly transformed variants of each
image. train_datagen.flow_from_directory is the function that prepares batches of data from the
training directory; target_size specifies the size to which each image is resized.
test_datagen.flow_from_directory prepares the validation/test data in the same way, but with
rescaling only. fit_generator is used to fit the model to the data. steps_per_epoch is the number
of batches drawn from the training generator in one epoch. epochs is the number of complete passes
over the training data, each batch going through a forward and a backward pass. validation_data
feeds the validation/test data into the model, and validation_steps is the number of validation
batches evaluated at the end of each epoch.
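The relationship between sample counts, batch size and steps can be checked with a little arithmetic. The sketch below uses the report's sample counts (320 and 80); the batch size of 16 in the second call is a hypothetical value added only for comparison, and `steps_for` is a helper name invented for this example.

```python
def steps_for(num_samples, batch_size):
    """Number of full batches needed to see every sample once."""
    return num_samples // batch_size

# Values from the report: 320 training and 80 validation images, batch size 1.
train_steps = steps_for(320, 1)        # 320 batches per epoch
validation_steps = steps_for(80, 1)    # 80 validation batches

# Hypothetical larger batch size, for comparison only.
train_steps_16 = steps_for(320, 16)    # 20 batches per epoch
print(train_steps, validation_steps, train_steps_16)  # 320 80 20
```

With a batch size of 1 the step counts equal the sample counts, which is why the two are easy to confuse.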

D. Part 4 – Creating Model


model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.summary()
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.1))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.summary()

Conv2D is the layer that convolves the image with a set of filters, producing multiple feature maps.

Activation applies the activation function.
The rectified linear unit (ReLU) activation function has been the most widely used activation
function for deep learning applications with state-of-the-art results. It usually achieves better
performance and generalization in deep learning compared to the sigmoid activation function.
By its mathematical definition, the sigmoid function takes any real number and returns
an output value in the range 0 to 1, which is why it is used on the output layer of this two-class
model. The sigmoid function produces an “S”-shaped curve.
MaxPooling2D takes the maximum value from each window of the given size; the same is done in the
next two blocks. Then Flatten flattens the feature maps obtained after convolution into a
one-dimensional vector.
The first Dense layer is a fully connected hidden layer.
Dropout is used to reduce overfitting on the dataset.
The final Dense layer is the output layer; it contains only one neuron, which decides to which
category the image belongs.
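The layer stack can be traced by hand to see how the image shrinks and how many weights each layer holds. The sketch below is plain Python arithmetic, assuming 3 x 3 convolutions without padding, 2 x 2 pooling and a 150 x 150 x 3 input as in the model above; the helper names are invented for the example. Its totals can be compared against the output of model.summary().

```python
def conv_out(size, kernel=3):
    """Spatial size after an unpadded convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping pooling."""
    return size // pool

def conv_params(kernel, channels_in, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * channels_in + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

size, channels, total_params = 150, 3, 0
for filters in (32, 32, 64):                 # the three Conv2D + MaxPooling2D blocks
    total_params += conv_params(3, channels, filters)
    size = pool_out(conv_out(size))          # 150 -> 74 -> 36 -> 17
    channels = filters

flattened = size * size * channels           # 17 * 17 * 64 values after Flatten
total_params += dense_params(flattened, 64)  # hidden Dense(64)
total_params += dense_params(64, 1)          # output Dense(1)
print(size, flattened, total_params)  # 17 18496 1212513
```

Almost all of the parameters sit in the first Dense layer, because it connects every one of the flattened values to each of its 64 units.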

E. Part 5 – Model Compilation

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

The compile function is used here; it takes the loss, the optimizer and the metrics. The loss
function used is binary_crossentropy and the optimizer used is rmsprop.
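To show what the loss measures, here is a minimal sketch of the sigmoid and of binary cross-entropy for a single example, written with the standard library only. It is an illustration of the formulas, not of how Keras computes them internally; the clipping constant `eps` is an assumption added to avoid taking the logarithm of zero.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one example: y_true is 0 or 1, p is the predicted probability."""
    p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

undecided = binary_crossentropy(1, sigmoid(0.0))  # p = 0.5, loss = ln 2
confident_right = binary_crossentropy(1, 0.9)
confident_wrong = binary_crossentropy(0, 0.9)
print(undecided, confident_right, confident_wrong)
```

A confident correct prediction gives a small loss, while a confident wrong prediction is penalized heavily.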

F. Part 6 – Train Model

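model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)
# Note: newer Keras releases deprecate fit_generator; model.fit accepts the same generators.

model.save_weights('first_try.h5')

# Load one validation image and prepare it the same way as the generator output.
# Append the file extension to the path if the file on disk has one.
img_pred = image.load_img('/content/drive/MyDrive/Dataset/validation/triangle/Triangle-20',
                          target_size=(150, 150))
img_pred = image.img_to_array(img_pred)
img_pred = img_pred / 255.0                   # same rescaling as the generators
img_pred = np.expand_dims(img_pred, axis=0)   # add the batch dimension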

G. Part 7 – Run Model

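rslt = model.predict(img_pred)
print(rslt)
# The sigmoid output is a probability, so compare it with 0.5 instead of testing for exactly 1.
if rslt[0][0] >= 0.5:
    prediction = "Triangle"
else:
    prediction = "Square"
print(prediction)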
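A sketch of turning the sigmoid output into a class name is given below. It assumes the class-to-index mapping that flow_from_directory builds from the sub-folder names in alphanumeric order (available as the generator's class_indices attribute); the dictionary literal and the helper name `label_from_probability` are hypothetical stand-ins so that the example runs without the dataset.

```python
def label_from_probability(probability, class_indices, threshold=0.5):
    """Map a sigmoid output to the class name stored under index 0 or 1."""
    index_to_label = {index: name for name, index in class_indices.items()}
    predicted_index = 1 if probability >= threshold else 0
    return index_to_label[predicted_index]

# Assumed mapping; in the notebook it would come from train_generator.class_indices.
class_indices = {"square": 0, "triangle": 1}

print(label_from_probability(0.93, class_indices))  # triangle
print(label_from_probability(0.08, class_indices))  # square
```

Reading the mapping from the generator instead of hard-coding the labels keeps the prediction step correct even if the folder names change.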

Review Work

Project Designing

A. Detailed Designing

Fig. (1.2) Summarized Diagram of the Project



Fig (1.2) Process Diagram of the Project



B. Project Code

#import Libraries
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
import numpy as np
from keras.preprocessing import image

#Generate Data
img_width, img_height = 150, 150
train_data_dir = '/content/drive/MyDrive/Dataset/train'
validation_data_dir = '/content/drive/MyDrive/Dataset/validation'
nb_train_samples = 320
nb_validation_samples = 80
epochs = 2
batch_size = 1

# Put the channel axis where the Keras backend expects it.
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)


# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

#Create Model
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.summary()
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.1))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.summary()

#Compile Model
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

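#Train Model
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)
# Note: newer Keras releases deprecate fit_generator; model.fit accepts the same generators.

model.save_weights('first_try.h5')

# Load one validation image and prepare it the same way as the generator output.
# Append the file extension to the path if the file on disk has one.
img_pred = image.load_img('/content/drive/MyDrive/Dataset/validation/triangle/Triangle-20',
                          target_size=(150, 150))
img_pred = image.img_to_array(img_pred)
img_pred = img_pred / 255.0                   # same rescaling as the generators
img_pred = np.expand_dims(img_pred, axis=0)   # add the batch dimension

#run model
rslt = model.predict(img_pred)
print(rslt)
# The sigmoid output is a probability, so compare it with 0.5 instead of testing for exactly 1.
if rslt[0][0] >= 0.5:
    prediction = "Triangle"
else:
    prediction = "Square"
print(prediction)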

C. Dataset Representation

Fig (1.3) – Pictorial Representation of Dataset of the Model .



Output Screen

Fig. (1.5) - Output Snapshot



Understandability and Cost Effectiveness

❖ Understandability:

A method is understandable if someone other than its creator can understand
the code. Using methods that are small and coherent helps to accomplish this.

❖ Cost-Effectiveness:

It is within the cost budget. It is desirable to aim for a system with minimum cost, subject to
the condition that it must satisfy all the requirements.

Benefits of the Project

The goal of this project was to detect geometric shaped objects from an image, separate them and
then recognize these objects.

Moreover, day by day people are discovering more and more technologies to decrease the
sufferings of people. Robots have been invented to reduce human effort in the hard
parts of various fields of invention and research.

Object detection is a hard part of various fields of invention and research. Object
detection is a crucial requirement in robotics. Without detection or recognition of objects,
a robot can’t perform any significant role.

Future Scope

➢ Future Scope of the Project :

This model can be easily implemented under various situations. We can add new features
as required. Reusability is possible as and when required in this application. There is
flexibility in all the modules.

➢ Further Extension :
In this paper we proposed a system that uses a convolutional
neural network for extracting and selecting the features of any given image and classifying
the images into appropriate classes.

▪ The convolutional neural network can give high accuracy compared to other classifiers.

▪ The performance and accuracy are tested on a simple CPU as well as a GPU.

▪ Hence, we conclude that convolutional neural networks are a good choice for image
classification. Further, this system can be extended to applications such as biometric
recognition.

▪ As future work, we will consider several algorithms and different weight adjustment
functions of deep learning in order to compare the performance enhancement with a GPU platform.

Conclusion

The geometric features have been analyzed using the immediate output of the 2D classification
algorithm, in which the borders of shapes have a toothed form. If line simplification algorithms are
applied, the correlation and importance of features may be different. The geometric feature
“rectilinearity” has not been analyzed together with the other features under the scope of this study,
because it requires using line simplification algorithms. Therefore, it must be studied independently
in order to compare the best combination of algorithms and their input parameters with the features
researched under this study. The combinations of statistical, spatial and geometric features belong
to different groups of parameters. Therefore, correlation among them must be minimal, but clusters
are located at a sufficient distance from one another, providing good conditions for automatic
classification.

Detection of objects from an image is an important technological revolution. The goal of this
project was to detect geometric shaped objects from an image, separate them and then recognize these
objects.

For the input image, 80 squares and 80 triangles were used as training data and 8 squares and 8
triangles were used as validation data. A detailed experiment with different input images and
different detection accuracy has been shown in … Moreover, day by day people are discovering more
and more technologies to decrease the sufferings of people. Robots have been invented to
reduce human effort in the hard parts of various fields of invention and research. Object
detection is a crucial requirement in robotics. Without detection or recognition of objects, a robot
can’t perform any significant role. The solution presented in this paper will enhance the capability
of object detection and recognition for industrial robots.

Bibliography

References :-

[1] T. O. Binford, "Visual perception by computer," in IEEE Conference on Systems and
Control, 1971.
[2] C. Liu, L. Sharan, E. H. Adelson and R. Rosenholtz, "Exploring features in a Bayesian
framework for material recognition," in The Twenty-Third IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 13-18 June 2010.
[3] A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet classification with deep
convolutional neural networks," Advances in Neural Information Processing Systems, pp. 1106-
1114, 2012.
[4] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in
Computer Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September
6-12, 2014.
[5] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng and T. Darrell, "DeCAF: A
Deep Convolutional Activation Feature for Generic Visual Recognition," CoRR, vol.
abs/1310.1531, 2013.
[6] A. S. Razavian, H. Azizpour, J. Sullivan and S. Carlsson, "CNN Features off-the-shelf: an
Astounding Baseline for Recognition," CoRR, vol. abs/1403.6382, 2014.
[7] R. Girshick, J. Donahue, T. Darrell and J. Malik, "Rich Feature Hierarchies for Accurate
Object Detection and Semantic Segmentation," in Conference on Computer Vision and Pattern
Recognition (CVPR), Columbus, OH, USA, pp. 580-587, 2014.
[8] K. J. Dana, B. V. Ginneken, S. K. Nayar and J. J. Koenderink, "Reflectance and Texture of
Real-World Surfaces," ACM Trans. Graph., vol. 18, pp. 1-34, 1999.
[9] M. Varma and A. Zisserman, "A Statistical Approach to Material Classification Using
Image Patch Exemplars," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, pp. 2032-2047, 2009.
[10] R. Davis, "Position statement and overview: sketch recognition at MIT," in AAAI Sketch
Understanding Symposium, 2002.

THANK YOU
