ML 01
Introduction to
Computer Intelligence and
Machine Learning
• Intelligence
– The ability to solve problems correctly
• Artificial Intelligence
– The ability of machines to carry out intelligent tasks
• Intelligent Programs
– Programs that carry out specific intelligent tasks
[Diagram: Artificial Intelligence as a program performing intelligent processing, mapping an Input to an Output]
Our class:
– Machine used = Computer
– Machine Learning = Computer Intelligence
Machine Learning
• Machine Learning (Mitchell 1997): A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Why is Machine Learning Important?
• Some tasks cannot be defined well, except by examples
(e.g., recognizing people).
• Relationships and correlations can be hidden within large
amounts of data. Machine Learning/Data Mining may be
able to find these relationships.
• Human designers often produce machines that do not
work as well as desired in the environments in which they
are used.
• The amount of knowledge available about certain tasks
might be too large for explicit encoding by humans (e.g.,
medical diagnosis).
• Environments change over time.
• New knowledge about tasks is constantly being
discovered by humans. It may be difficult to continuously
re-design systems “by hand”.
Example 1: Text Classification
[Figure: labeled example documents serve as the training data for a text classifier]
Example 2: Game Playing
Games played:
– Game 1’s move list: Win
– Game 2’s move list: Lose
– …
[Figure: training on these labeled games yields a strategy; given a new match, representing the current board, then searching and evaluating candidate moves, produces the best move]
Areas of Influence for Machine Learning
• Statistics: How best to use samples drawn from unknown
probability distributions to help decide from which distribution
some new sample is drawn?
• Brain Models: Non-linear elements with weighted inputs
(Artificial Neural Networks) have been suggested as simple
models of biological neurons.
• Adaptive Control Theory: How to deal with controlling a
process having unknown parameters that must be estimated
during operation?
• Psychology: How to model human performance on various
learning tasks?
• Artificial Intelligence: How to write algorithms that acquire
knowledge at least as well as humans can?
• Evolutionary Models: How to model certain aspects of
biological evolution to improve the performance of computer
programs?
Why Is Machine Learning Possible?
• Mass Storage
– More data available
• Adaptive
– Adapts to changing conditions
– Easy to migrate to new domains
ML in a Nutshell
• Tens of thousands of machine learning algorithms
• Hundreds of new ones every year
• Most machine learning algorithms have three components:
– Representation
– Evaluation
– Optimization
Representation
• Decision trees
• Sets of rules / Logic programs
• Instances
• Graphical models (Bayes/Markov nets)
• Neural networks
• Support vector machines
• Model ensembles
• Etc.
Evaluation
• Accuracy
• Precision and recall
• Squared error
• Likelihood
• Posterior probability
• Cost / Utility
• Margin
• Entropy
• K-L divergence
• Etc.
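As a concrete illustration, a few of these measures for a binary classifier can be computed in a few lines; the label lists below are invented for the example.

```python
# Accuracy, precision, and recall for a binary classifier
# (invented labels, purely illustrative).

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred, positive=1):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    predicted_pos = sum(p == positive for p in y_pred)
    return tp / predicted_pos if predicted_pos else 0.0

def recall(y_true, y_pred, positive=1):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return tp / actual_pos if actual_pos else 0.0

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))   # 6 of 8 correct -> 0.75
print(precision(y_true, y_pred))  # 3 of 4 predicted positives correct -> 0.75
print(recall(y_true, y_pred))     # 3 of 4 actual positives found -> 0.75
```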
Optimization
• Combinatorial optimization
– E.g.: Greedy search
• Convex optimization
– E.g.: Gradient descent
• Constrained optimization
– E.g.: Linear programming
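A minimal sketch of convex optimization by gradient descent (the function and step size are invented for illustration): minimize f(x) = (x - 3)^2, whose gradient is f'(x) = 2(x - 3).

```python
# Gradient descent on a simple convex function (illustrative sketch).

def gradient_descent(grad, x0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)   # move against the gradient
    return x

# minimize f(x) = (x - 3)^2, gradient f'(x) = 2 * (x - 3)
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))  # converges to 3.0
```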
Types of Learning
• Association Analysis
• Supervised (inductive) learning
– Training data includes desired outputs
– Classification
– Regression/Prediction
• Unsupervised learning
– Training data does not include desired outputs
• Semi-supervised learning
– Training data includes a few desired outputs
• Reinforcement learning
– Rewards from a sequence of actions
Learning Associations
• Basket analysis:
– P(Y | X): the probability that somebody who buys product/service X also buys product/service Y.
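A minimal sketch of estimating P(Y | X) from transaction data; the products and transactions below are invented for illustration.

```python
# Estimate P(Y | X) from a tiny, invented transaction list:
# among the baskets containing X, what fraction also contain Y?

transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

def conditional_prob(transactions, x, y):
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return 0.0
    return sum(y in t for t in with_x) / len(with_x)

# 2 of the 3 bread baskets also contain milk
print(conditional_prob(transactions, "bread", "milk"))
```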
Techniques
• Supervised learning
– Decision tree induction
– Rule induction
– Instance-based learning
– Bayesian learning
– Neural networks
– Support vector machines
– Model ensembles
– Learning theory
Classification
• Example: Credit scoring
• Differentiating between low-risk and high-risk customers from their income and savings
[Figure: a model (discriminant) separating low-risk from high-risk customers in the income–savings plane]
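A hedged sketch of the credit-scoring idea: a hand-written discriminant that labels a customer low-risk when income and savings both exceed thresholds. All numbers are invented; a real system would learn the discriminant from data rather than hard-code it.

```python
# Hand-built discriminant for the credit-scoring example
# (thresholds are invented, not learned from data).

def classify(income, savings, income_thr=40_000, savings_thr=5_000):
    if income > income_thr and savings > savings_thr:
        return "low-risk"
    return "high-risk"

print(classify(income=60_000, savings=10_000))  # low-risk
print(classify(income=25_000, savings=2_000))   # high-risk
```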
Classification: Applications
• Face recognition: Pose, lighting, occlusion (glasses,
beard), make-up, hair style
• Character recognition: Different handwriting styles.
• Speech recognition: Temporal dependency.
– Use of a dictionary or the syntax of the language.
– Sensor fusion: Combine multiple modalities; e.g.,
visual (lip image) and acoustic for speech
• Medical diagnosis: From symptoms to illnesses
• Web Advertising: Predict whether a user will click on an ad on the
Internet.
Face Recognition
[Figure: example training and test face images]
Prediction: Regression
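A minimal regression sketch on invented one-dimensional data: fit y ≈ w·x + b by ordinary least squares, using the closed-form formulas for the slope and intercept.

```python
# Ordinary least squares for y ≈ w*x + b (invented data, roughly y = 2x).

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.1, 8.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = covariance(x, y) / variance(x); intercept from the means
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - w * mean_x

print(round(w, 2), round(b, 2))  # 1.99 0.05
```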
Unsupervised Learning
• Learning “what normally happens”
• No output
• Clustering: Grouping similar instances
• Other applications: Summarization, Association Analysis
• Example applications
– Customer segmentation in CRM
– Image compression: Color quantization
– Bioinformatics: Learning motifs
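Clustering can be sketched with a compact k-means loop on invented one-dimensional data; real applications such as color quantization work the same way in higher dimensions.

```python
# k-means on invented 1-D data: alternate assigning points to the
# nearest center and moving each center to the mean of its cluster.

def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        # assignment step: each point joins its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[i].append(p)
        # update step: each center moves to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centers = kmeans_1d(points, centers=[0.0, 10.0])
print(centers)  # two centers, one near 1.0 and one near 9.07
```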
Techniques
• Unsupervised learning
– Clustering
– Dimensionality reduction
Reinforcement Learning
• Topics:
– Policies: what actions should an agent take in a
particular situation
– Utility estimation: how good is a state (used by
policy)
• No supervised output, but a delayed reward
• Credit assignment problem (what was responsible for the
outcome)
• Applications:
– Game playing
– Robot in a maze
– Multiple agents, partial observability, ...
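The delayed-reward setting can be sketched with tabular Q-learning on an invented toy problem: a 1-D corridor where the reward arrives only at the goal, so credit for the outcome must propagate back through earlier moves.

```python
import random

# Tabular Q-learning sketch (all details invented): states 0..4,
# actions -1 (left) and +1 (right), reward 1 only on reaching state 4.

N, GOAL = 5, 4
ACTIONS = (-1, 1)
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}

def step(s, a):
    """Move in the corridor; reward appears only at the goal."""
    s2 = max(0, min(N - 1, s + a))
    return s2, (1.0 if s2 == GOAL else 0.0)

random.seed(0)
alpha, gamma, eps = 0.5, 0.9, 0.2   # learning rate, discount, exploration
for _ in range(500):                 # episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        # Q-learning update: bootstrap from the best next-state value
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, act)] for act in ACTIONS)
                              - Q[(s, a)])
        s = s2

policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)]
print(policy)  # the learned policy moves right (+1) in every state
```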
Solving Real World Problems
[Figure: disjoint validation data sets; the labeled data is split into partitions, and in each round one partition (e.g., the 1st) serves as the validation data while the remaining partitions serve as the training data]
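The disjoint validation sets above can be sketched as a k-fold split (details assumed): partition the data into k disjoint folds, then in each round hold one fold out for validation and train on the rest.

```python
# k-fold style splits: k disjoint partitions, each used once as the
# validation set while the others form the training set.

def k_fold_splits(data, k):
    folds = [data[i::k] for i in range(k)]   # k disjoint partitions
    for i in range(k):
        validation = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i
                    for x in fold]
        yield training, validation

data = list(range(10))
splits = list(k_fold_splits(data, k=5))
for training, validation in splits:
    print("training:", sorted(training), "validation:", validation)
```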