Reinforcement Learning: Parallelizing Genetic Algorithms
Learning Task
IV Q Learning
Evaluating Hypotheses
Genetic Algorithms
An Illustrative Example
Parallelizing Genetic Algorithms
UNIT I
Introduction to Machine Learning
1. Introduction
Definition of learning
Definition
A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks T, as measured by P, improves with experience E.
Examples
i) Handwriting recognition learning problem
• Task T: Recognising and classifying handwritten words within images
• Performance P: Percent of words correctly classified
• Training experience E: A dataset of handwritten words with given classifications
ii) A robot driving learning problem
• Task T: Driving on highways using vision sensors
• Performance measure P: Average distance traveled before an error
• Training experience E: A sequence of images and steering commands recorded while
observing a human driver
iii) A chess learning problem
• Task T: Playing chess
• Performance measure P: Percent of games won against opponents
• Training experience E: Playing practice games against itself
Definition
A computer program which learns from experience is called a machine learning program or
simply a learning program. Such a program is sometimes also referred to as a learner.
1. Data storage
Facilities for storing and retrieving huge amounts of data are an important component of the
learning process. Humans and computers alike utilize data storage as a foundation for advanced
reasoning.
• In a human being, data is stored in the brain and retrieved using electrochemical signals.
• Computers use hard disk drives, flash memory, random access memory and similar devices to store
data and use cables and other technology to retrieve data.
2. Abstraction
The second component of the learning process is known as abstraction.
Abstraction is the process of extracting knowledge about stored data. This involves creating general
concepts about the data as a whole. The creation of knowledge involves application of known models
and creation of new models.
The process of fitting a model to a dataset is known as training. When the model has been trained, the
data is transformed into an abstract form that summarizes the original information.
3. Generalization
The third component of the learning process is known as generalization.
The term generalization describes the process of turning the knowledge about stored data into a form
that can be utilized for future action. These actions are to be carried out on tasks that are similar, but
not identical, to those that have been seen before. In generalization, the goal is to discover those
properties of the data that will be most relevant to future tasks.
4. Evaluation
Evaluation is the last component of the learning process.
It is the process of giving feedback to the user to measure the utility of the learned knowledge. This
feedback is then used to improve the whole learning process.
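The four components above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the data values, the one-dimensional threshold model, and the helper names `fit_threshold`, `predict`, and `accuracy` are all invented for this sketch and do not come from the text.

```python
# 1. Data storage: labelled examples as (feature value, class label).
#    All numbers here are made up for illustration.
train = [(1.0, 0), (1.5, 0), (2.0, 0), (4.0, 1), (4.5, 1), (5.0, 1)]
test = [(1.2, 0), (2.4, 0), (3.9, 1), (4.8, 1)]


def fit_threshold(examples):
    """2. Abstraction (training): summarise the data as a single number,
    the midpoint between the mean of class 0 and the mean of class 1."""
    zeros = [x for x, y in examples if y == 0]
    ones = [x for x, y in examples if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def predict(threshold, x):
    """3. Generalization: apply the learned summary to an unseen input."""
    return 1 if x > threshold else 0


def accuracy(threshold, examples):
    """4. Evaluation: fraction of examples classified correctly."""
    correct = sum(1 for x, y in examples if predict(threshold, x) == y)
    return correct / len(examples)


threshold = fit_threshold(train)
test_accuracy = accuracy(threshold, test)
print(threshold, test_accuracy)
```

The test set is kept separate from the training set so that the evaluation step measures performance on examples that are similar, but not identical, to those seen during training.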
There are mainly two kinds of logical models: tree models and rule models.
Rule models consist of a collection of implications or IF-THEN rules. For tree-based models, the ‘if-part’
defines a segment and the ‘then-part’ defines the behaviour of the model for this segment. Rule models
follow the same reasoning.
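A rule model of this kind might be sketched as an ordered list of IF-THEN pairs. The attributes (`raining`, `temperature`), the rules themselves, and the `apply_rules` helper are hypothetical and chosen only to illustrate the idea; checking rules in order with a default outcome is one possible design, not the only one.

```python
# Each rule pairs an IF-part (a condition that defines a segment of the
# instance space) with a THEN-part (the model's output for that segment).
# The attributes and rules are hypothetical, for illustration only.
rules = [
    (lambda d: d["raining"], "stay in"),
    (lambda d: d["temperature"] >= 20, "go out"),
]


def apply_rules(instance, rules, default="stay in"):
    """Return the THEN-part of the first rule whose IF-part holds.

    Falls back to `default` when no rule fires.
    """
    for condition, outcome in rules:
        if condition(instance):
            return outcome
    return default


print(apply_rules({"raining": False, "temperature": 25}, rules))
print(apply_rules({"raining": True, "temperature": 25}, rules))
```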
The following example explains this idea in more detail.
A concept learning task called “Enjoy Sport”, as shown above, is defined by a set of data from
some example days. Each example day is described by six attributes. The task is to learn to predict the
value of Enjoy Sport for an arbitrary day based on the values of its attributes. The problem can be
represented by a series of hypotheses. Each hypothesis is described by a conjunction of constraints on
the attributes. The training data represents a set of positive and negative examples of the target
function. In the example above, each hypothesis is a vector of six constraints, specifying the values of
the six attributes: Sky, AirTemp, Humidity, Wind, Water, and Forecast. The training phase involves
learning the set of days (as a conjunction of attributes) for which Enjoy Sport = yes.
Given the instances X, the set of all possible days, each described by the attributes:
o Sky – (values: Sunny, Cloudy, Rainy),
o AirTemp – (values: Warm, Cold),
o Humidity – (values: Normal, High),
o Wind – (values: Strong, Weak),
o Water – (values: Warm, Cold),
o Forecast – (values: Same, Change).
The goal is to identify a function that predicts the target variable Enjoy Sport as yes/no, i.e., 1 or 0.
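A hypothesis of this form can be sketched as a tuple of six constraints that is checked against a day. The sketch below assumes the common convention of writing "?" for "any value is acceptable"; that wildcard, the example hypothesis `h`, the two example days, and the helper names are assumptions made for illustration and are not taken from the training data referred to above.

```python
from itertools import product

# Attribute values as listed in the text.
VALUES = {
    "Sky": ("Sunny", "Cloudy", "Rainy"),
    "AirTemp": ("Warm", "Cold"),
    "Humidity": ("Normal", "High"),
    "Wind": ("Strong", "Weak"),
    "Water": ("Warm", "Cold"),
    "Forecast": ("Same", "Change"),
}

ANY = "?"  # assumed wildcard: the attribute may take any value


def matches(hypothesis, instance):
    """True if every constraint is the wildcard or equals the instance's value."""
    return all(c == ANY or c == v for c, v in zip(hypothesis, instance))


def predict(hypothesis, instance):
    """Enjoy Sport as 1 (yes) if the conjunction of constraints holds, else 0 (no)."""
    return 1 if matches(hypothesis, instance) else 0


# Hypothetical hypothesis: sunny, warm days with strong wind.
h = ("Sunny", "Warm", ANY, "Strong", ANY, ANY)
day_a = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
day_b = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")

# The instance space X is the product of the attribute value sets.
all_days = list(product(*VALUES.values()))

print(predict(h, day_a), predict(h, day_b), len(all_days))
```

With three values for Sky and two for each of the other five attributes, the instance space X contains 3 × 2 × 2 × 2 × 2 × 2 = 96 distinct days.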
Linear models
Linear models are relatively simple. In this case, the function is represented as a linear
combination of its inputs. Thus, if x1 and x2 are two scalars or vectors of the same dimension
and a and b are arbitrary scalars, then ax1 + bx2 represents a linear combination of x1 and x2. In the
simplest case, where f(x) represents a straight line, we have an equation of the form
f(x) = mx + c, where c represents the intercept and m represents the slope.
Linear models are parametric, which means that they have a fixed form with a small number of numeric
parameters that need to be learned from data. For example, in f(x) = mx + c, m and c are the
parameters that we are trying to learn from the data. This differs from tree or rule
models, where the structure of the model (e.g., which features to use in the tree, and where) is not
fixed in advance.
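As a minimal sketch of learning m and c from data, the following uses ordinary least squares, which is one common way to fit a straight line; the text does not prescribe a fitting method, so this choice, the `fit_line` helper, and the sample points are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Estimate slope m and intercept c of f(x) = m*x + c by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c


# Made-up points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

m, c = fit_line(xs, ys)
print(m, c)
```

Whatever the data, the learned model is always described by just the two numbers m and c, which is what makes it parametric.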
Linear models are stable, i.e., small variations in the training data have only a limited impact on the
learned model. In contrast, tree models tend to vary more with the training data, as the choice of a
different split at the root of the tree typically means that the rest of the tree is different as well. As a
result of having relatively few parameters, linear models have low variance and high bias. This implies
that linear models are less likely to overfit the training data than some other models. However, they
are more likely to underfit. For example, if we want to learn the boundaries between countries based
on labelled data, then linear models are not likely to give a good approximation.
Distance-based models
Distance-based models are the second class of geometric models. Like linear models, distance-based
models are based on the geometry of data. As the name implies, distance-based models work on
the concept of distance. In the context of machine learning, the concept of distance is not based
merely on the physical distance between two points. Instead, we could think of the distance between two
points in terms of the mode of transport between them. Travelling between two cities by plane