
MATLAB Machine Learning Recipes: A Problem-Solution Approach
Ebook · 665 pages · 2 hours


About this ebook

Harness the power of MATLAB to resolve a wide range of machine learning challenges. This book provides a series of examples of technologies critical to machine learning. Each example solves a real-world problem. All code in MATLAB Machine Learning Recipes: A Problem-Solution Approach is executable. The toolbox that the code uses provides a complete set of functions needed to implement all aspects of machine learning. Authors Michael Paluszek and Stephanie Thomas show how all of these technologies allow the reader to build sophisticated applications to solve problems with pattern recognition, autonomous driving, expert systems, and much more.
What you'll learn:
  • How to write code for machine learning, adaptive control and estimation using MATLAB
  • How these three areas complement each other
  • How these three areas are needed for robust machine learning applications
  • How to use MATLAB graphics and visualization tools for machine learning
  • How to code real-world examples in MATLAB for major applications of machine learning in big data
 Who is this book for: The primary audience is engineers, data scientists, and students wanting a comprehensive cookbook, rich in examples, on machine learning using MATLAB.
Language: English
Publisher: Apress
Release date: Jan 31, 2019
ISBN: 9781484239162
Author

Michael Paluszek

Mr. Paluszek is President of Princeton Satellite Systems (PSS), which he founded in 1992. He holds an Engineer’s degree in Aeronautics and Astronautics (1979), an SM in Aeronautics and Astronautics (1979), and an SB in Electrical Engineering (1976), all from MIT. He is the PI on the ARPA-E OPEN grant to develop a compact nuclear fusion reactor based on the Princeton Field Reversed Configuration concept. He is also PI on the ARPA-E GAMOW project to develop power electronics for the fusion industry. He is PI on a project to design a closed-loop Brayton Cycle heat engine for space applications. Prior to founding PSS, he worked at GE Astro Space in East Windsor, NJ. At GE, he designed or led the design of several attitude control systems including GPS IIR, Inmarsat 3, and GGS Polar platform. He also was an ACS analyst on over a dozen satellite launches, including the GSTAR III recovery. Before joining GE, he worked at the Draper Laboratory and at MIT, where he still teaches Attitude Control Systems (course 16.S685/16.S890). He has 14 patents registered to his name.


    Book preview

    MATLAB Machine Learning Recipes - Michael Paluszek

    © Michael Paluszek and Stephanie Thomas  2019

    Michael Paluszek and Stephanie Thomas, MATLAB Machine Learning Recipes, https://doi.org/10.1007/978-1-4842-3916-2_1

    1. An Overview of Machine Learning

    Michael Paluszek¹ and Stephanie Thomas¹

    (1) Plainsboro, NJ, USA

    1.1 Introduction

    Machine learning is a field in computer science where data are used to predict, or respond to, future data. It is closely related to the fields of pattern recognition, computational statistics, and artificial intelligence. The data may be historical or updated in real time. Machine learning is important in areas such as facial recognition and spam filtering, where it is not feasible, or even possible, to write algorithms to perform a task.

    For example, early attempts at filtering junk emails had the user write rules to determine what was junk or spam. Your success depended on your ability to correctly identify the attributes of the message that would categorize an email as junk, such as a sender address or words in the subject, and the time you were willing to spend on tweaking your rules. This was only moderately successful, as junk mail generators had little difficulty anticipating people’s hand-made rules. Modern systems use machine-learning techniques with much greater success. Most of us are now familiar with the concept of simply marking a given message as junk or not junk, and take for granted that the email system can quickly learn which features of these emails identify them as junk and prevent them from appearing in our inbox. This could now be any combination of IP or email addresses and words and phrases in the subject or body of the email, with a variety of matching criteria. Note how the machine learning in this example is data-driven, autonomous, and continuously updating itself as you receive email and flag it. However, even today, these systems are not completely successful since they do not yet understand the meaning of the text that they are processing.

    In a more general sense, what does machine learning mean? Machine learning can mean using machines (computers and software) to gain meaning from data. It can also mean giving machines the ability to learn from their environment. Machines have been used to assist humans for thousands of years. Consider a simple lever, which can be fashioned using a rock and a length of wood, or the inclined plane. Both of these machines perform useful work and assist people but neither has the ability to learn. Both are limited by how they are built. Once built, they cannot adapt to changing needs without human interaction. Figure 1.1 shows early machines that do not learn.


    Figure 1.1

    Simple machines that do not have the capability to learn.

    Both of these machines do useful work and amplify the capabilities of people. The knowledge is inherent in their parameters, which are just the dimensions. The function of the inclined plane is determined by its length and height. The function of the lever is determined by the two lengths and the height. The dimensions are chosen by the designer, essentially building in the designer’s knowledge of the application and physics.

    Machine learning involves memory that can be changed while the machine operates. In the case of the two simple machines described above, knowledge is implanted in them by their design. In a sense, they embody the ideas of the builder, and are thus a form of fixed memory. Learning versions of these machines would automatically change the dimensions after evaluating how well the machines were working. As the loads moved or changed the machines would adapt. A modern crane is an example of a machine that adapts to changing loads, albeit at the direction of a human being. The length of the crane can be changed depending on the needs of the operator.

    In the context of the software we will be writing in this book, machine learning refers to the process by which an algorithm converts the input data into parameters it can use when interpreting future data. Many of the processes used to mechanize this learning derive from optimization techniques, and in turn are related to the classic field of automatic control. In the remainder of this chapter, we will introduce the nomenclature and taxonomy of machine learning systems.

    1.2 Elements of Machine Learning

    This section introduces key nomenclature for the field of machine learning.

    1.2.1 Data

    All learning methods are data driven. Sets of data are used to train the system. These sets may be collected and edited by humans or gathered autonomously by other software tools. Control systems may collect data from sensors as the systems operate and use that data to identify parameters, or train, the system. The data sets may be very large, and it is the explosion of data storage infrastructure and available databases that is largely driving the growth in machine learning software today. It is still true that a machine learning tool is only as good as the data used to create it, and the selection of training data is practically a field unto itself.

    Note

    When collecting data for training, one must be careful to ensure that the time variation of the system is understood. If the structure of a system changes with time, it may be necessary to discard old data before training the system. In automatic control, this is sometimes called a forgetting factor in an estimator.

    1.2.2 Models

    Models are often used in learning systems. A model provides a mathematical framework for learning and is usually human-derived, based on human observations and experiences. For example, a model of a car, seen from above, might show that it is of rectangular shape with dimensions that fit within a standard parking spot. However, some forms of machine learning develop their own models without a human-derived structure.

    1.2.3 Training

    A system, which maps an input to an output, needs training to do this in a useful way. Just as people need to be trained to perform tasks, machine learning systems need to be trained. Training is accomplished by giving the system an input and the corresponding output and modifying the structure (models or data) in the learning machine so that the mapping is learned. In some ways, this is like curve fitting or regression. If we have enough training pairs, then the system should be able to produce correct outputs when new inputs are introduced. For example, if we give a face recognition system thousands of cat images and tell it that those are cats, we hope that when it is given new cat images it will also recognize them as cats. Problems can arise when you don’t give it enough training sets or the training data are not sufficiently diverse, for instance, identifying a long-haired cat or hairless cat when the training data only consist of shorthaired cats. Diversity of training data is required for a functioning neural net.
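The curve-fitting view of training can be made concrete with a toy regression. The book's recipes are written in MATLAB; the sketch below is plain Python with made-up data and an illustrative helper name (`fit_line`), showing how training pairs determine parameters that then map a new, unseen input to an output.

```python
def fit_line(xs, ys):
    """Return slope a and intercept b minimizing the squared error of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Training pairs: inputs and the outputs we want the system to learn.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]           # data generated by y = 2x + 1

a, b = fit_line(xs, ys)             # "training" recovers a = 2, b = 1

# A new input the system has never seen:
prediction = a * 10.0 + b           # → 21.0
```

Once the parameters are learned from the training pairs, the same mapping generalizes to inputs outside the training set, exactly as the text describes.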

    1.2.3.1 Supervised Learning

    Supervised learning means that specific training sets of data are applied to the system. The learning is supervised in that the training sets are human-derived. It does not necessarily mean that humans are actively validating the results. The process of classifying the system’s outputs for a given set of inputs is called labeling, that is, you explicitly say which results are correct or which outputs are expected for each set of inputs.

    The process of generating training sets can be time consuming. Great care must be taken to ensure that the training sets will provide sufficient training so that when real-world data are collected, the system will produce the correct results. They must cover the full range of expected inputs and desired outputs. The training is followed by test sets to validate the results. If the results aren’t good then the test sets are cycled into the training sets and the process repeated.

    A human example would be a ballet dancer trained exclusively in classical ballet technique. If she were then asked to dance a modern dance, the results might not be as good as required because the dancer did not have the appropriate training sets; her training sets were not sufficiently diverse.

    1.2.3.2 Unsupervised Learning

    Unsupervised learning does not utilize training sets. It is often used to discover patterns in data for which there is no right answer. For example, if you used unsupervised learning to train a face identification system the system might cluster the data in sets, some of which might be faces. Clustering algorithms are generally examples of unsupervised learning. The advantage of unsupervised learning is that you can learn things about the data that you might not know in advance. It is a way of finding hidden structures in data.
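Clustering, the canonical unsupervised method mentioned above, can be sketched in a few lines. This is a plain-Python toy (the book works in MATLAB): two-cluster k-means on scalar data, with no labels supplied; the grouping emerges from the data alone. The data and starting centers are invented for illustration.

```python
def kmeans_1d(points, c1, c2, iterations=10):
    """Two-cluster k-means on scalars, starting from center guesses c1 and c2."""
    for _ in range(iterations):
        # Assign each point to the nearer center...
        group1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        group2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        # ...then move each center to the mean of its group.
        if group1:
            c1 = sum(group1) / len(group1)
        if group2:
            c2 = sum(group2) / len(group2)
    return c1, c2

data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]   # two obvious clumps, but unlabeled
c1, c2 = kmeans_1d(data, 0.0, 5.0)
# c1 settles near 1.0 and c2 near 9.5: structure found without a "right answer"
```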

    1.2.3.3 Semi-Supervised Learning

    With this approach, some of the data are in the form of labeled training sets and other data are not [11]. In fact, typically only a small amount of the input data is labeled while most are not, as the labeling may be an intensive process requiring a skilled human. The small set of labeled data is leveraged to interpret the unlabeled data.

    1.2.3.4 Online Learning

    The system is continually updated with new data [11]. This is called online because many of the learning systems use data collected online. It could also be called recursive learning. It can be beneficial to periodically batch process data used up to a given time and then return to the online learning mode. The spam filtering systems from the introduction utilize online learning.
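The recursive flavor of online learning can be shown with the simplest possible "parameter," a running mean. This plain-Python sketch (not from the book's MATLAB toolbox) updates the estimate one sample at a time instead of reprocessing the whole batch:

```python
def online_mean(stream):
    """Recursively updated mean: each new sample nudges the current estimate."""
    mean = 0.0
    for n, x in enumerate(stream, start=1):
        mean += (x - mean) / n      # recursive update; no stored history needed
    return mean

# Processing one sample at a time gives the same answer as a batch average:
m = online_mean([2.0, 4.0, 6.0, 8.0])   # → 5.0
```

Replacing the `1/n` weight with a fixed constant would make old data fade away, which is the forgetting-factor idea mentioned in the note of Section 1.2.1.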

    1.3 The Learning Machine

    Figure 1.2 shows the concept of a learning machine. The machine absorbs information from the environment and adapts. The inputs may be separated into those that produce an immediate response and those that lead to learning. In some cases they are completely separate. For example, in an aircraft a measurement of altitude is not usually used directly for control. Instead, it is used to help select parameters for the actual control laws. The data required for learning and regular operation may be the same, but in some cases separate measurements or data are needed for learning to take place. Measurements do not necessarily mean data collected by a sensor such as radar or a camera. It could be data collected by polls, stock market prices, data in accounting ledgers or any other means. The machine learning is then the process by which the measurements are transformed into parameters for future operation.

    Note that the machine produces output in the form of actions. A copy of the actions may be passed to the learning system so that it can separate the effects of the machine actions from those of the environment. This is akin to a feedforward control system, which can result in improved performance.

    A few examples will clarify the diagram. We will discuss a medical example, a security system, and spacecraft maneuvering.

    A doctor may want to diagnose diseases more quickly. She would collect data on tests on patients and then collate the results. Patient data may include age, height, weight, historical data such as blood pressure readings and medications prescribed, and exhibited symptoms. The machine learning algorithm would detect patterns so that when new tests were performed on a patient, the machine learning algorithm would be able to suggest diagnoses, or additional tests to narrow down the possibilities. As the machine-learning algorithm was used, it would, hopefully, get better with each success or failure. Of course, the definition of success or failure is fuzzy. In this case, the environment would be the patients themselves. The machine would use the data to generate actions, which would be new diagnoses. This system could be built in two ways. In the supervised learning process, test data and known correct diagnoses are used to train the machine. In an unsupervised learning process, the data would be used to generate patterns that may not have been known before and these could lead to diagnosing conditions that would normally not be associated with those symptoms.


    Figure 1.2

    A learning machine that senses the environment and stores data in memory.

    A security system may be put into place to identify faces. The measurements are camera images of people. The system would be trained with a wide range of face images taken from multiple angles. The system would then be tested with these known persons and its success rate validated. Those that are in the database memory should be readily identified and those that are not should be flagged as unknown. If the success rate were not acceptable, more training might be needed or the algorithm itself might need to be tuned. This type of face recognition is now common, used in Mac OS X’s Faces feature in Photos, face identification on the new iPhone X, and Facebook when tagging friends in photos.

    For precision maneuvering of a spacecraft, the inertia of the spacecraft needs to be known. If the spacecraft has an inertial measurement unit that can measure angular rates, the inertia matrix can be identified. This is where machine learning is tricky. The torque applied to the spacecraft, whether by thrusters or momentum exchange devices, is only known to a certain degree of accuracy. Thus, the system identification must sort out, if it can, the torque scaling factor from the inertia. The inertia can only be identified if torques are applied. This leads to the issue of stimulation. A learning system cannot learn if the system to be studied does not have known inputs and those inputs must be sufficiently diverse to stimulate the system so that the learning can be accomplished. Training a face recognition system with one picture will not work.

    1.4 Taxonomy of Machine Learning

    In this book, we take a bigger view of machine learning than is typical. Machine learning as described above is the collecting of data, finding patterns, and doing useful things based on those patterns. We expand machine learning to include adaptive and learning control. These fields started off independently, but are now adapting technology and methods from machine learning. Figure 1.3 shows how we organize the technology of machine learning into a consistent taxonomy. You will notice that we created a title that encompasses three branches of learning; we call the whole subject area Autonomous Learning. That means learning without human intervention during the learning process. This book is not solely about traditional machine learning. There are other, more specialized books that focus on any one of the machine-learning topics. Optimization is part of the taxonomy because the results of optimization can be new discoveries, such as a new type of spacecraft or aircraft trajectory. Optimization is also often a part of learning systems.


    Figure 1.3

    Taxonomy of machine learning.

    There are three categories under Autonomous Learning. The first is Control. Feedback control is used to compensate for uncertainty in a system or to make a system behave differently than it would normally behave. If there were no uncertainty you wouldn’t need feedback. For example, if you are a quarterback throwing a football at a running player, assume for a moment that you know everything about the upcoming play. You know exactly where the player should be at a given time, so you can close your eyes, count, and just throw the ball to that spot. Assuming that the player has good hands, you would have a 100% reception rate! More realistically, you watch the player, estimate the player’s speed and throw the ball. You are applying feedback to the problem. As stated, this is not a learning system. However, if you now practice the same play repeatedly, look at your success rate and modify the mechanics and timing of your throw using that information, you would have an adaptive control system, the box second from the top of the control list. Learning in control takes place in adaptive control systems and also in the general area of system identification.

    System identification is learning about a system. By system we mean the data that represent the system and the relationships between elements of those data. For example, a particle moving in a straight line is a system defined by its mass, the force on that mass, its velocity, and its position. The position is related to the velocity times time, and the velocity is determined by the acceleration, which is the force divided by the mass.
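System identification for the particle above amounts to recovering an unknown parameter from input/output data. The book does this in MATLAB; here is a plain-Python sketch with invented data, estimating the mass by least squares on F = m·a from logged forces and measured accelerations:

```python
def identify_mass(forces, accels):
    """Least-squares estimate of m in F = m*a (minimizes sum of (F - m*a)^2)."""
    num = sum(f * a for f, a in zip(forces, accels))
    den = sum(a * a for a in accels)
    return num / den

forces = [2.0, 4.0, 6.0]                 # applied forces, N (illustrative)
accels = [1.0, 2.0, 3.0]                 # measured accelerations, m/s^2
mass_hat = identify_mass(forces, accels)  # → 2.0 kg
```

Note that the estimate only exists because forces were actually applied: with zero input there is nothing to identify, foreshadowing the stimulation issue discussed in Section 1.3.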

    Optimal control may not involve any learning. For example, what is known as full state feedback produces an optimal control signal, but does not involve learning. In full state feedback, the combination of model and data tells us everything we need to know about the system. However, in more complex systems we can’t measure all the states and don’t know the parameters perfectly so some form of learning is needed to produce optimal or the best possible results.

    The second category is what many people consider true Machine Learning. This is making use of data to produce behavior that solves problems. Much of its background comes from statistics and optimization. The learning process may be done once in a batch process or continually in a recursive process. For example, in a stock-buying package, a developer may have processed stock data for several years, say prior to 2008, and used that to decide which stocks to buy. That software may not have worked well during the financial crash. A recursive program would continuously incorporate new data. Pattern recognition and data mining fall into this category. Pattern recognition is looking for patterns in images. For example, the early AI Blocks World software could identify a block in its field of view. It could find one block in a pile of blocks. Data mining is taking large amounts of data and looking for patterns, for example, taking stock market data and identifying companies that have strong growth potential. Classification techniques and fuzzy logic are also in this category.

    The third category of autonomous learning is Artificial Intelligence. Machine learning traces some of its origins to artificial intelligence. Artificial Intelligence is the area of study whose goal is to make machines reason. Although many would say the goal is to think like people, this is not necessarily the case. There may be ways of reasoning that are not similar to human reasoning, but are just as valid. In the classic Turing test, Turing proposes that the computer only needs to imitate a human in its output to be a thinking machine, regardless of how those outputs are generated. In any case, intelligence generally involves learning, so learning is inherent in many Artificial Intelligence technologies; our diagram includes two of these, inductive learning and expert systems.

    The recipe chapters of this book are grouped according to this taxonomy. The first chapters cover state estimation using the Kalman Filter and adaptive control. Fuzzy logic is then introduced, which is a control methodology that uses classification. Additional machine-learning recipes follow with chapters on data classification with binary trees, neural nets including deep learning, and multiple hypothesis testing. We then have a chapter on aircraft control that incorporates neural nets, showing the synergy between the different technologies. Finally, we conclude with a chapter on an artificial intelligence technique, case-based expert systems.

    1.5 Control

    Feedback control algorithms inherently learn about the environment through measurements used for control. These chapters show how control algorithms can be extended to effectively design themselves using measurements. The measurements may be the same as used for control, but the adaptation, or learning, happens more slowly than the control response time. An important aspect of control design is stability. A stable controller will produce bounded outputs for bounded inputs. It will also produce smooth, predictable behavior of the system that is controlled. An unstable controller will typically experience growing oscillations in the quantities (such as speed or position) that are controlled. In these chapters, we explore both the performance of learning control and the stability of such controllers. We often break control into two parts, control and estimation. The latter may be done independently of feedback control.

    1.5.1 Kalman Filters

    Chapter 4 shows how Kalman filters allow you to learn about dynamical systems for which we already have a model. This chapter provides an example of a variable gain Kalman Filter for a spring system, that is, a system with a mass connected to its base via a spring and a damper. This is a linear system. We write the system in discrete time. This provides an introduction to Kalman Filtering. We show how Kalman Filters can be derived from Bayesian Statistics. This ties it into many machine-learning algorithms. Originally, the Kalman Filter, developed by R. E. Kalman, C. Bucy, and R. Battin, was not derived in this fashion.
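The predict/update cycle at the heart of the Kalman filter can be shown in miniature. The book's recipe uses a two-state spring-damper model in MATLAB; the sketch below is a plain-Python scalar filter estimating a constant from noisy measurements, which is the same machinery in its simplest form. The noise variances and measurements are made-up values.

```python
def kalman_constant(measurements, r, q, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a constant state x observed as z = x + noise.
    r: measurement noise variance; q: process noise variance."""
    x, p = x0, p0
    for z in measurements:
        p = p + q                    # predict: uncertainty grows between measurements
        k = p / (p + r)              # Kalman gain: how much to trust the measurement
        x = x + k * (z - x)          # update: blend prediction and measurement
        p = (1.0 - k) * p            # uncertainty shrinks after the update
    return x

zs = [1.1, 0.9, 1.05, 0.98, 1.02]    # noisy readings of a true value of 1.0
estimate = kalman_constant(zs, r=0.1, q=1e-4)
# estimate converges toward 1.0 as measurements accumulate
```

The gain k is recomputed every step from the current uncertainty, which is the "variable gain" idea the chapter develops for the spring system.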

    The second recipe adds a nonlinear measurement. A linear measurement is proportional to the state (in this case, position) that it measures. Our nonlinear measurement will be the angle of a tracking device that points at the mass from a distance off the line of movement. One way to handle this nonlinearity is to use an Unscented Kalman Filter (UKF) for state estimation. The UKF lets us use a nonlinear measurement model easily.

    The last part of the chapter describes the Unscented Kalman Filter configured for parameter estimation. This system learns the parameters of the model, albeit a model with an existing mathematical structure. As such, it is an example of model-based learning. In this example, the filter estimates the oscillation frequency of the spring-mass system. It will demonstrate how the system needs to be stimulated to identify the parameters.

    1.5.2 Adaptive Control

    Adaptive control is a branch of control systems in which the gains of the control system change based on measurements of the system. A gain is a number that multiplies a measurement from a sensor to produce a control action such as driving a motor or other actuator. In a nonlearning control system, the gains are computed prior to operation and remain fixed. This works very well most of the time since we can usually pick gains so that the control system is tolerant of parameter changes in the system. Our gain margins tell us how tolerant we are to uncertainties in the system. If we are tolerant to big changes in parameters, we say that our system is robust.

    Adaptive control systems change the gain based on measurements during operation. This can help a control system perform even better. The better we know a system’s model, the tighter we can control the system. This is much like driving a new car. At first, you have to be cautious driving a new car, because you don’t know how sensitive the steering is to turning the wheel or how fast it accelerates when you depress the gas pedal. As you learn about the car you can maneuver it with more confidence. If you didn’t learn about the car you would need to drive every car in the same fashion.
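The learn-the-car idea can be sketched as a toy adaptive loop. This is plain Python, not the book's MATLAB recipe: the plant output is y = b·u with an unknown gain b, and the controller refines its estimate of b from each input/output pair, adjusting its own gain k = 1/b̂ so the output tracks the reference. The plant gain, learning rate, and step count are all invented for illustration.

```python
def adapt(b_true, reference, steps, rate=0.5):
    """Adapt a control gain for the unknown plant y = b_true * u."""
    b_hat = 1.0                               # initial guess of the plant gain
    outputs = []
    for _ in range(steps):
        u = reference / b_hat                 # control using the current estimate
        y = b_true * u                        # plant response (noise-free, for clarity)
        b_hat += rate * (y - b_hat * u) * u   # gradient step on the prediction error
        outputs.append(y)
    return b_hat, outputs

b_hat, ys = adapt(b_true=2.0, reference=1.0, steps=50)
# b_hat converges toward 2.0, and the output toward the reference 1.0
```

At first the controller "drives cautiously" with a poor gain estimate; as the estimate improves, so does tracking, which is exactly the new-car analogy above.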

    Chapter 5 starts with a simple example of adding damping to a spring using a control system. Our goal
