
Biologically Inspired Computing: Optimisation

This is mandatory additional material for `Biologically Inspired Computing'.

Contents:
Optimisation; hard problems and easy problems; computational complexity; trial and error search.
This Material
• A formal statement about optimisation, and an explanation of the close relationship with classification.
• About optimisation problems: formal notions for hard problems and easy ones.
• Formal notions about optimisation algorithms: exact algorithms and approximate algorithms.
• The status of EAs (and other bio-inspired optimisation algorithms) in this context.
• The classical computing `alternatives' to EAs.
What to take from this material:
• A clear understanding of what an optimisation problem is, and an appreciation of how classification problems are also optimisation problems.
• What it means, technically, for an optimisation problem to be hard or easy.
• What it means, technically, for an optimisation algorithm to be exact or approximate.
• What kinds of algorithm, technically speaking, EAs are.
• An appreciation of non-EA techniques for optimisation.
• A basic understanding of when an EA might be applicable to a problem, and when it might not.
Search and Optimisation
Imagine we have 4 items as follows:
(item 1: 20kg; item 2: 75kg; item 3: 60kg; item 4: 35kg)
Suppose we want to find the subset of items with the highest total weight.
Each subset can be encoded as a 4-bit string, where bit i is 1 if item i is included; the 16 possible solutions are:
0000 0100 1000 1100
0001 0101 1001 1101
0010 0110 1010 1110
0011 0111 1011 1111

Here is a standard treatment of this as an optimisation problem:
• The set S of all possible solutions is indicated above.
• The fitness of a solution s is the function total_weight(s).
• We can solve the problem (i.e. find the fittest s) by working out the fitness of each one in turn. But … any thoughts? I mean, how hard is this?
• Of course, this problem is much easier to solve than that.
We already know that 1111 has to be the best solution.
• We can prove, mathematically, that any problem of the form "find the heaviest subset of a set of k things" is solved by the "all of them" subset, as long as each thing has positive weight.
• In general, some problems are easy in this sense, in that
there is an easily provable way to construct an optimal
solution. Sometimes it is less obvious than in this case
though.
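As a concrete illustration, here is a minimal Python sketch of the enumerate-and-evaluate approach just described (the code and names are illustrative, not part of the module material):

    from itertools import product

    # Item weights from the example above (item 1 .. item 4).
    weights = [20, 75, 60, 35]

    def total_weight(s):
        """Fitness: total weight of the subset encoded by bitstring s."""
        return sum(w for bit, w in zip(s, weights) if bit)

    # Enumerate all 2^4 candidate solutions and keep the fittest.
    best = max(product([0, 1], repeat=4), key=total_weight)
    print(best, total_weight(best))  # (1, 1, 1, 1) 190

As expected, the `all of them' solution 1111 wins, with fitness 190.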
Search and Optimisation
In general, optimisation means that you are trying to find
the best solution you can (usually in a short time) to a
given problem.

We always have a set S of all possible S


solutions
s1 s2 s3 …
In realistic problems, S is too large to search
one by one. So we need to find some other way
to search through S.

One way is random search. E.g. in a 500-iteration random search, we might randomly choose something in S and evaluate its fitness, repeating that 500 times and keeping the fittest solution found.
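For instance, a 500-iteration random search could be sketched in Python as follows (illustrative code; `sample' and `fitness' stand for whatever suits the problem at hand, and higher fitness is assumed to be better):

    def random_search(sample, fitness, iterations=500):
        """Evaluate `iterations` random candidates from S;
        return the best one seen."""
        best, best_f = None, float("-inf")
        for _ in range(iterations):
            s = sample()          # draw a random candidate from S
            f = fitness(s)
            if f > best_f:
                best, best_f = s, f
        return best, best_f

    # E.g., on the 4-item problem from earlier:
    # import random
    # random_search(lambda: [random.randint(0, 1) for _ in range(4)], total_weight)

Note that nothing here guarantees the returned solution is the best in S.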
The Fitness function
Every candidate solution s in the set of all solutions S can
be given a score, or a “fitness”, by a so-called fitness
function. We usually write f(s) to indicate the fitness of
solution s. Obviously, we want to find the s in S which
has the best score.

Examples:
• timetabling: f could be the number of clashes
• wing design: f could be aerodynamic drag
• delivery schedule: f could be the total distance travelled
Note that `best' may mean the lowest score: in all three of these examples, we want to minimise f.
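For the delivery-schedule case, a fitness function might look like this sketch (assuming stops are given as 2-D coordinates; lower f is better here):

    import math

    def route_length(schedule, coords):
        """Fitness for a delivery schedule: total distance travelled
        when the stops are visited in the given order."""
        return sum(math.dist(coords[a], coords[b])
                   for a, b in zip(schedule, schedule[1:]))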
An Aside about Classification
In a classification problem, we have a set of things to classify,
and a number of possible classes.

• To classify s we use an algorithm called a classifier. So, classifier(s) gives us a class label for s.
• We can assign a fitness value to a classifier – this can simply be the percentage of examples it gets right.
• In finding a good classifier, we are solving the following optimisation problem: search a space of classifiers, and find the one that gives the best accuracy.
• E.g. the classifier might be a neural network, and we may use an EA to evolve the NN with the best connection weights.
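A sketch of such a fitness function (illustrative, not a prescribed interface):

    def classifier_fitness(classifier, examples):
        """Fitness of a classifier: the fraction of labelled
        (input, label) examples it classifies correctly
        (multiply by 100 for a percentage)."""
        correct = sum(1 for x, label in examples if classifier(x) == label)
        return correct / len(examples)

An EA searching a space of classifiers would simply use this as its f.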
Searching through S
When S is small (e.g. 10, 100, or only 1,000,000 or so items), we can simply do so-called exhaustive search.
Exhaustive search: generate every possible solution, work out its fitness, and hence discover which is best (or which set share the best fitness).
This is also called enumeration.
However …
In all interesting/important cases, S is much, much too large for exhaustive search (ever).
There are two kinds of `too-big' problem:
– easy (or `tractable', or `in P')
– hard (or `intractable', or `not known to be in P')
There are rigorous mathematical definitions of the two types – that's what the `P' is about – it stands for the class of problems that can be solved in polynomial time by a deterministic Turing machine. The details are outside the scope of this module.
But, what’s important for you to know is that almost
all important problems are technically hard.
About Optimisation Problems
To solve a problem means to find an optimal solution, i.e. to deliver an element of S whose fitness is guaranteed to be the best in S.
An exact algorithm is one which can do this (i.e. solve a problem, guaranteeing to find the best).
Is 500-iteration random search an exact algorithm?
Problem complexity
‘Problem complexity’, in the context of computer science, is all
about characterising how hard it is to solve a given problem.
Statements are made in terms of functions of n, which is meant
to be some indication of the size of the problem. E.g.:

Correctly sort a set of n numbers


• Can be done in around n log n steps

Find the closest pair out of n vectors


• Can be done in O(n2) steps

Find best design for an n-structural-element bridge


• Can be done in O(10n) steps …
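To make the middle case concrete, here is an O(n^2) closest-pair sketch in Python: it compares every pair of vectors exactly once, i.e. n(n-1)/2 comparisons (illustrative code):

    import math

    def closest_pair(vectors):
        """Return (distance, i, j) for the closest pair of vectors,
        by brute force over all n(n-1)/2 pairs."""
        best = (float("inf"), None, None)
        for i in range(len(vectors)):
            for j in range(i + 1, len(vectors)):
                d = math.dist(vectors[i], vectors[j])
                if d < best[0]:
                    best = (d, i, j)
        return best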
Polynomial and Exponential Complexity
Given some problem Q, with `size' n, imagine that A is the fastest algorithm known for solving that problem exactly. The complexity of problem Q is the time it takes A to solve it, as a function of n.
There are two key kinds of complexity:
• Polynomial: the dominant term in the expression is polynomial in n. E.g. n^34, n log n, n^2.2, etc …
• Exponential: the dominant term is exponential in n. E.g. 1.1^n, n^(n+2), 2^n, …
Polynomial and Exponential Complexity

n            2     3     4     5     6     10    20    50    100
1.1^n (exp)  1.21  1.33  1.46  1.61  1.77  2.59  6.73  117   13,780
n^1.1 (poly) 2.14  3.35  4.59  5.87  7.18  12.6  27.0  73.9  159

Problems with exponential complexity take too long to solve at large n. Note how an exponential always eventually catches up with, and then rapidly overtakes, a polynomial.
Hard and Easy Problems
Polynomial complexity: these are called tractable, or easy, problems. Fast algorithms are known which provide the best solution. Pairwise alignment is one such problem. Sorting is another.
Exponential complexity: these are called intractable, or hard, problems. The fastest known algorithm which exactly solves such a problem is usually not significantly faster than exhaustive search.
[Figure: runtime curves for exponential and polynomial complexity against increasing n; the exponential curve always eventually overtakes the polynomial one.]
E.g. the time needed on the fastest computers to search all protein structures with 500 amino acids: trillions of times longer than the current age of the universe.
Example: Minimum Spanning Tree Problems
This is a case of an easy problem, but not as obvious as our first easy example.
• Find the cheapest tree which connects all (i.e. spans) the nodes of a given graph.
• Applications: comms network backbone design, electricity distribution networks, water distribution networks, etc …
A graph, showing the costs of building each pair-to-pair link:
[Figure: a graph whose edges are labelled with costs 12, 4, 6, 14, 7, 9, 3, 5, 8 and 6.]
What is the minimal-cost spanning tree?
(Spanning tree = visits all nodes, has no cycles; cost is the sum of the costs of the edges used in the tree.)
Here’s one tree:

12

14
7 3

With cost = 36
Here’s a cheaper one

6
7 3

With cost 20 …
The problem "find the minimal-cost spanning tree" (aka the `MST' problem) is easy in the technical sense.
[Figure: the same graph, with edge costs 12, 4, 14, 6, 7, 9, 3, 5, 8 and 6.]
Several fast algorithms are known which solve this in polynomial time. Here is the classic one, Prim's algorithm:
Start with an empty tree (no edges)
Repeat: choose the cheapest edge which feasibly extends the tree
Until: n - 1 edges have been chosen.
Prim’s step 1:

12
4

14
6
7 9 3
5

8
6
Prim’s step 2:

12
4

14
6
7 9 3
5

8
6
Prim’s step 3:

12
4

14
6
7 9 3
5

8
6
Prim’s step 4:

12
4

14
6
7 9 3
5

8
6
Prim’s step 4:

6
3
5

Guaranteed to have minimal possible cost for this graph;


i.e. this is the (or a) MST in this case.
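A minimal Python sketch of Prim's algorithm as described above (illustrative; it assumes the graph is connected, its nodes are labelled 0..n-1, and its edges are given as (cost, u, v) triples):

    def prim_mst(n, edges):
        """Grow a tree from node 0, repeatedly adding the cheapest edge
        that connects exactly one new node; returns the n-1 tree edges."""
        in_tree = {0}
        tree = []
        while len(in_tree) < n:
            # Cheapest edge which feasibly extends the tree.
            cost, u, v = min(e for e in edges
                             if (e[1] in in_tree) != (e[2] in in_tree))
            in_tree |= {u, v}
            tree.append((cost, u, v))
        return tree

A priority-queue version runs faster, but the point is simply that this greedy step is provably safe for the unconstrained MST, so the problem is solved exactly in polynomial time.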
But change the problem slightly:
• We may want the degree-constrained MST (i.e. the MST, but where no node in the tree has a degree above 4).
• Or we may want the optimal communication spanning tree – which is the MST, but chosen from among those trees which satisfy certain bandwidth requirements between certain pairs of nodes.
There are many constrained/different forms of the MST. These are essentially problems where we seek the cheapest tree structure, but where many, or even most, trees are not actually feasible solutions.
Here's the thing: these constrained versions are almost always technically hard, and real-world MST-style problems are invariably of this kind.
Approximate Algorithms
For hard optimisation problems (again, which turn out to be nearly all the important ones), we need approximate algorithms.
These:
• deliver solutions in reasonable time;
• try to find pretty good (`near-optimal') solutions, and often get optimal ones;
• do not (cannot) guarantee that they have delivered the optimal solution.
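A simple example of such an algorithm is a hill climber, sketched here (illustrative; assumes higher fitness is better, and `neighbour' makes a random small change to a solution):

    def hill_climb(initial, neighbour, fitness, iterations=10_000):
        """Repeatedly try a random small change and keep it if fitness
        does not get worse. Fast and often good, but it may stop at a
        local optimum: no guarantee of optimality."""
        s, f = initial, fitness(initial)
        for _ in range(iterations):
            s2 = neighbour(s)
            f2 = fitness(s2)
            if f2 >= f:
                s, f = s2, f2
        return s, f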
Typical Performance of
Approximate Methods
Evolutionary Algorithms turn out to be the most successful and generally useful approximate algorithms around. They often take a long time though – it's worth getting used to the following curve, which tends to apply across the board.
[Figure: solution quality against time. A sophisticated method is slow, but gets better solutions eventually; a simple method gets good solutions fast.]
So …
• Most real world problems that we need to
solve are hard problems
• We can’t expect truly optimal solutions for
these, so we use approximate algorithms
and do as well as we can.
• EAs (and other BIC methods we will see)
are very successful approximate algorithms
