DK201770681A1 - A method for (re-)training a machine learning component

- Publication number: DK201770681A1
- Authority: DK (Denmark)
- Prior art keywords: training data, machine learning, learning component, augmented, data
Classifications

- G09B9/00: Simulators for teaching or training purposes
- G09B9/006: Simulators for teaching or training purposes for locating or ranging of objects
- G06N3/02: Neural networks
- G06N3/045: Combinations of networks
- G06N3/084: Backpropagation, e.g. using gradient descent
- G06N3/086: Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
Abstract
A computer-implemented method, comprising: evolving a set of augmented training data (209) and training a machine learning component (204) by: synthesizing (304) a set of augmentation data (208) based on a set of parameter values in accordance with a parametric representation of an artefact; generating (305) a set of augmented training data (209) by augmenting training data (203) based on the augmentation data (208); evolving (309) the set of augmented training data (209) over generations based on evolving the set of parameter values in accordance with optimization of a fitness function, which is configured to reward a performance deficiency associated with an output produced by the machine learning component in response to receiving augmented training data (209) as its input; among the augmented training data (209), determining (310) a set of adversarial augmented training data (211), which are augmented training data in the set of augmented training data (209) that caused a performance deficiency associated with an output produced by the machine learning component (204) in response to receiving augmented training data as its input; and training the machine learning component (204) based on the set of adversarial training data (211). The computer-implemented method relates to improving machine learning components e.g. for self-driving cars and image-based identification of persons.
Description
A method for (re-)training a machine learning component
Machine learning components are increasingly used in connection with complex control systems such as those used in self-driving cars and image-based authentication of persons.
However, Szegedy et al. have recently reported that machine learning components, such as deep neural networks, can be ‘fooled’ into misclassifying an image when it is modified by a certain imperceptible perturbation - leading to situations wherein humans would hardly ever be fooled, but where the machine learning component clearly is. This may be critical or at least problematic e.g. in connection with self-driving cars and image-based authentication of persons, but also in connection with other applications of machine learning components - especially when observations (e.g. in the form of images) input to the machine learning component may represent an almost infinite number of possible observations (such as related to observing traffic in a first-person view).
Such machine learning components may have a component or portion which is herein denoted a ‘black box’, which can be observed at its input and output, but whose internal states and parameters, such as the weights of a neural network, may not be observable or enabled for having their values changed. The machine learning component may also have a trainable component or portion which is enabled for training, so that the machine learning component can be improved e.g. as a controller.
The present disclosure relates to techniques for analysing such machine learning components to discover critical flaws and to fix those flaws automatically.
Machine learning components may comprise artificial neural networks, such as deep neural networks, convolutional neural networks, discriminator-based neural networks, policy-based neural networks, regression neural networks, etc. Machine learning components may also be based on learning trees or on reinforcement learning, such as Q-learning.
Generally, there is a need to improve such machine learning components e.g. to improve their performance in technical fields such as authentication, self-driving vehicles, automated diagnostics etc.
RELATED PRIOR ART
Szegedy, Zaremba, Sutskever, Bruna, Erhan, Goodfellow, and Fergus, “Intriguing properties of neural networks” find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. Specifically, Szegedy et al. find that the network can be made to misclassify an image by applying a certain imperceptible perturbation, which is found by maximizing the network's prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, trained on a different subset of the dataset, to misclassify the same input.
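To illustrate the idea of a perturbation found by maximizing the prediction error, a minimal sketch follows, assuming a PyTorch classifier. Note that Szegedy et al. use a box-constrained L-BFGS formulation; the sketch below substitutes plain gradient ascent on the loss, and the step size, iteration count and clamping bound are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def find_perturbation(model, image, true_label, steps=50, step_size=1e-3, bound=0.01):
    """Search for a small perturbation r such that model(image + r) misclassifies."""
    r = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(image + r), true_label)  # prediction error
        loss.backward()
        with torch.no_grad():
            r += step_size * r.grad.sign()  # ascend the prediction-error surface
            r.clamp_(-bound, bound)         # keep the perturbation imperceptible
            r.grad.zero_()
    return r.detach()
```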
Nguyen A, Yosinski J, Clune J., “Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images”, in Computer Vision and Pattern Recognition, IEEE, 2015, (https://arxiv.org/pdf/1412.1897.pdf) demonstrates that it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labelling with certainty that white noise static is a lion). Such images, completely unrecognizable to humans, are denoted “fooling images” (more generally, “fooling examples”).
Nguyen et al. describe that convolutional neural networks (CNNs) are trained through gradient descent to perform well on an image dataset, and that images which the CNN labels with high confidence as belonging to each dataset class are then identified. The images identified in this way are created by an evolutionary algorithm or a gradient ascent algorithm and are subsequently “optimized” to generate high-confidence predictions for each class in the dataset the CNN is trained on.
Thereby, images that fool DNNs can be produced.
Nguyen et al. use an algorithm called the multi-dimensional archive of phenotypic elites (MAP-Elites), which enables simultaneously evolving a population that contains individuals that score well on many classes. MAP-Elites works by keeping the best individual found so far for each objective. At each iteration the algorithm chooses a random individual from the population, mutates it randomly, and replaces the current champion for any objective if the new individual has higher fitness on that objective. Fitness is determined by propagating the image through the DNN; if the image generates a higher prediction score for any class than has been seen before, the newly generated individual becomes the champion in the population for that class.
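A minimal sketch of that loop in Python, assuming the callables `random_genome`, `mutate` and `fitness_per_class` (the DNN's per-class prediction scores) are supplied by the caller; this illustrates the described bookkeeping, not the authors' implementation:

```python
import random

def map_elites(random_genome, mutate, fitness_per_class, iterations=10000):
    """Keep the best individual found so far for each class (objective)."""
    seed = random_genome()
    champions = {c: (seed, s) for c, s in enumerate(fitness_per_class(seed))}
    for _ in range(iterations):
        parent, _ = random.choice(list(champions.values()))  # random individual
        child = mutate(parent)                               # random mutation
        for c, s in enumerate(fitness_per_class(child)):     # DNN prediction scores
            if s > champions[c][1]:        # higher fitness on this objective?
                champions[c] = (child, s)  # child becomes champion for class c
    return champions
```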
Nguyen et al. describe an evolutionary algorithm with an indirect encoding, which is more likely to produce regular images, meaning images that contain compressible patterns (e.g. symmetry and repetition). Indirectly encoded images tend to be regular because elements in the genome can affect multiple parts of the image. Specifically, the indirect encoding is a compositional pattern-producing network (CPPN), which can evolve complex, regular images that resemble natural and man-made objects.
However, Nguyen et al. report the detected flaws of the DNNs and conclude that while retrained DNNs learn to classify the negative examples as fooling images, a new batch of fooling images can be produced that fools these new (i.e. retrained) networks. Thus, Nguyen et al. do not describe how to improve DNNs, but only report the detected flaws.
SUMMARY
The present method relates to detecting and fixing critical flaws in machine learning components, e.g. comprising black box systems, through an active parametric content generation approach as set out below.
There is provided a method, comprising:
evolving a set of augmented training data and training a machine learning component by:
synthesizing augmentation data based on a set of parameter values in accordance with a parametric representation of an artefact;
generating a set of augmented training data by augmenting training data based on the augmentation data;
evolving the set of augmented training data over generations based on evolving the set of parameter values in accordance with optimization of a fitness function, which is configured to reward a performance deficiency associated with an output produced by the machine learning component in response to receiving augmented training data as its input;
among the set of augmented training data, determining a set of adversarial augmented training data which are augmented training data in the set of augmented training data that caused a performance deficiency associated with an output produced by the machine learning component in response to receiving augmented training data as its input; and training the machine learning component based on the set of adversarial training data.
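A compact sketch of the claimed loop follows; every helper callable (`synthesize` standing in for the parametric content generator, `augment`, `deficiency` for the fitness function rewarding failures, `evolve` for mutation, cross-over and selection) and the `mlc.train` interface are assumptions, not part of the claims:

```python
import random

def evolve_and_retrain(training_data, mlc, synthesize, augment, deficiency,
                       init_population, evolve, generations=50):
    """Sketch of the claimed method under the assumptions named above."""
    population = init_population()          # sets of parameter values (genotypes)
    adversarial = []
    for _ in range(generations):
        scored = []
        for params in population:
            aug = synthesize(params)                          # augmentation data
            example = augment(random.choice(training_data), aug)
            score = deficiency(mlc, example)  # fitness rewards a deficiency
            scored.append((params, score))
            if score > 0:                     # caused a performance deficiency
                adversarial.append(example)
        population = evolve(scored)           # mutation, cross-over, selection
    mlc.train(training_data + adversarial)    # (re-)train on adversarial data
    return mlc
```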
Thereby, the machine learning component is improved by training or retraining based on new training data that are developed over generations, targeting the particular areas of training data wherein the machine learning component has difficulties in performing as desired. Thus the machine learning component is improved in the particular areas wherein it was likely to fail.
The method is guided by the fitness function to generate augmented training data in the particular areas wherein the machine learning component was likely to fail. Thus the fitness function guides the method towards areas of high failure rates. The fitness function is based on how well the synthesized augmented training data can ‘fool’ the machine learning component.
The machine learning component may be e.g. a discriminator-based machine learning component or a policy-based machine learning component. The fitness function drives the method to generate evolutions of the set of augmented training data by evolving parameters that make e.g. a parametric content generator produce examples that ‘fool’ the machine learning component.
Thus the method involves both evolving a set of augmented training data; selecting a set of adversarial augmented training data; and training, such as retraining, of the machine learning component. The set of adversarial augmented training data may be the result of evolving augmented training data over one or multiple generations in readiness of training or retraining the machine learning component. During the course of evolving the set of augmented training data, such as at every generation, the machine learning component may be used in connection with computing a performance deficiency associated with an output produced by the machine learning component in response to receiving augmented training data as its input. The fitness function may take the computed performance deficiency as its input.
The generations may also be denoted ‘iterations’; however, herein it is generally preferred to use the term ‘generations’ in connection with the evolving of a set of augmented training data and in readiness of selecting adversarial augmented training data therefrom. The term ‘iterations’ is used herein in connection with the determination of the set of adversarial training data at each first iteration and in connection with training of the machine learning component at each second iteration.
The fitness function, which is configured to reward such a performance deficiency, drives the method to generate augmented training data in areas wherein the machine learning component is more likely to fail in outputting a desired output. Adversarial training data are training data which cause the machine learning component to fail in outputting a desired output. An example of a desired output is a correct classification, in case the machine learning component is a discriminator-based machine learning component. A desired output may also be the ability to output values of state variables in a control system causing the control system to have a desired performance, such as, in the case of a control system for a self-driving vehicle, that the control system drives the vehicle according to traffic rules and avoids crashing. A simulator may be configured to determine when a desired performance is achieved by the controller and when it is not.
Once a set of adversarial training data is determined, the machine learning component is trained based on these adversarial training data. The machine learning component may be (re-)trained based on the set of training data supplemented by the adversarial training data - or it may be (re-)trained based on the adversarial training data alone, while forgoing training on the training data.
The machine learning component may comprise a non-trainable portion and a trainable portion coupled in series and/or in parallel. A non-trainable portion may also be denoted a ‘black box’; however, its input and output should be available in connection with determining the training data that caused the machine learning component to fail or have a poor performance.
Augmentation data may be data that are in a format that is compatible with, or convertible to be compatible with, training data or at least an input portion thereof. Augmentation data supplements training data or modifies training data. Augmentation data is a representation of an artefact. An artefact may be an object made by a human being; something observed in a scientific investigation or experiment that is not naturally present but occurs as a result of the preparative or investigative procedure; or something which resembles a natural phenomenon. In case the training data comprises images of faces of human beings, the augmentation data may be images of artefacts such as e.g. glasses (which is an object made by a human being), or the augmentation data may represent an artefact such as darkness (which is a natural phenomenon). In the latter case, the augmentation data may itself be an image or layer thereof (with low transparency) or it may be brightness parameters controlling the training data or a copy thereof. The parameters may then control transparency of the glass in the glasses, thickness and colour of the frame of the glasses, and brightness of an image. The augmentation data may be synthesized by a procedural content generator.
The set of augmented training data is synthesized by augmenting the training data based on the augmentation data. The augmentation may take place in an image domain or video domain, in an audio domain or in a parametric domain. A parameter represents a degree of manipulation; for instance a brightness parameter may represent the brightness of an entire image or set of images or of a single pixel.
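As an illustration of augmentation in the image domain, a minimal sketch of overlaying a glasses artefact and applying a brightness parameter; the array conventions and value ranges are assumptions:

```python
import numpy as np

def augment_image(image, glasses, x, y, alpha, brightness):
    """Overlay a glasses artefact at (x, y) with transparency `alpha`, then scale
    overall brightness. Arrays are floats in [0, 1]; the overlay is assumed to
    fit inside the image at the given position."""
    out = image * brightness                   # brightness parameter (whole image)
    h, w = glasses.shape[:2]
    region = out[y:y + h, x:x + w]
    out[y:y + h, x:x + w] = (1.0 - alpha) * region + alpha * glasses  # transparency
    return np.clip(out, 0.0, 1.0)
```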
In general, at each iteration of evolving a set of augmented training data, the augmented training data are generated by synthesizing augmentation data from parameter values. The augmentation data may be generated by a content generator which deterministically, or substantially deterministically, synthesizes the augmentation data from the parameter values. The parameter values are evolved over generations e.g. by means of an evolutionary algorithm (EA) or another type of optimisation algorithm using a search strategy.
In the present method, evolving the set of augmented training data over generations is based on evolving the set of parameter values in accordance with optimization of a fitness function. The fitness function is configured to reward a set of parameter values causing a performance deficiency associated with an output produced by the machine learning component in response to receiving augmented training data, generated from the parameter values, as its input. The deficiency may be detectable by:
1) Comparing an output to a desired output and registering the parameter values that caused a predefined discrepancy between output and a desired output e.g. in case of a discriminator-based machine learning component.
In accordance therewith, parameter values related to augmented training data which e.g. caused the machine learning component to generate an output which deviates from a desired output determine the adversarial augmented training data.
2) Evaluating the output of a performance indicator function e.g. in case of a policy-based machine learning component. The performance indicator function may relate to a performance of a controller in a control system or performance of the control system; wherein the machine learning component is a part of the controller.
In accordance therewith, parameter values related to augmented training data which occurred at the time a performance deficiency occurred, or at times before and leading to the occurrence of a performance deficiency, are registered and determine the adversarial augmented training data.
Thus, among the augmented training data, the set of adversarial augmented training data are those which caused a performance deficiency associated with an output produced by the machine learning component in response to receiving augmented training data as its input.
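A brief sketch of both detection routes, with `mlc.classify` and `simulator.run` as hypothetical interfaces standing in for the machine learning component and the simulator:

```python
def is_adversarial_discriminator(mlc, example, expected_label):
    """Case 1: register a discrepancy between output and desired output."""
    return mlc.classify(example) != expected_label

def is_adversarial_policy(mlc, simulator, track, threshold=0.5):
    """Case 2: evaluate a performance indicator of the controller; a low value
    (here below an assumed threshold) indicates a performance deficiency."""
    performance = simulator.run(controller=mlc, track=track)
    return performance < threshold
```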
Parameters may be initialized with random values that are constrained to lie in a predefined interval.
The present method may be run by a computer system which receives a representation of the machine learning component in a first state which it was brought to by training based on the training data. The computer system may enter a first mode, wherein the present method is executed and wherein the machine learning component is trained as set out herein to have a second state which it is brought to by training based on at least the augmented training data. The computer system may complete the first mode providing the machine learning component in the second state.
In some embodiments evolving the set of augmented training data comprises:
from one generation to another generation of evolving the set of augmented training data:
the synthesizing of augmentation data based on the set of parameter values;
the generating of augmented training data by augmenting the training data based on the augmentation data; and

forgo changing the training data.
Thus, the augmentation data are evolved over generations, driven by the fitness function towards or into areas wherein the machine learning component has difficulties in performing as desired. Thereby the machine learning component is improved by training in those areas wherein it has difficulties in performing as desired without changing the (original) training data as such. Thus, the (original) training data may be preserved.
In some embodiments evolving the set of augmented training data from one generation to another generation comprises:
the synthesizing of augmentation data based on the set of parameter values;

the generating of augmented training data by augmenting the training data based on the augmentation data; and

at least partially replacing the training data with the augmented training data.
Thereby the (original) training data are changed in accordance with the augmentation data. Thereby, training data may be overwritten or otherwise replaced, which reduces the amount of storage required in connection with (re-)training the machine learning component.
In some embodiments the determining of the set of adversarial augmented training data comprises:
inputting a current generation of the set of augmented training data to the machine learning component and evaluating an output associated with the current generation of augmented training data;
determining examples, among the current generation of the set of augmented training data, which have a performance deficiency;
including the determined examples in the set of adversarial augmented training data.
Thus, the machine learning component is used to test whether it had any problems in producing a desired output. In case testing reveals no deficiencies the method may forgo including such examples in (re-)training the machine learning component. Thereby processing power of a computer system executing the method may be reduced.
In case testing reveals deficiencies, the method may include such examples in the set of adversarial augmented training data in connection with (re-)training the machine learning component. Thereby processing power of a computer system executing the method can be utilized where the likelihood of improving the machine learning component is greatest.
In some embodiments the method comprises:
using a parametric content generator to synthesize augmentation data;
determining a parameter range or limits within which the parametric content generator generates a variation of augmentation data for which it applies, in one or both of first input elements of the augmented training data and the augmentation data, that predefined features are detectable;
constraining parameters input to the parametric content generator during the evolving of the set of training data to the determined parameter range or limits.
Thereby a parameter range is defined which constrains the procedural content generator to generate augmentation data that is relevant in the sense that the machine learning component has a fair chance of detecting features for performing its task e.g. as a discriminator-based or policy-based machine learning component.
By constraining the output of the problem generator (by restricting the genetic representation and/or the mutation operators), the (re-)training of the machine learning component focusses on examples that are relevant for expected functioning of the machine learning component and enables effectively learning via adversarial augmented training data.
The parameter values may be restricted to generate augmented training data (examples) with known class labels. For example, a genetic representation may be restricted to only create valid car tracks (class 1) instead of invalid tracks that always lead to a crash (class 2); a system that adds accessories to faces, like glasses, can be restricted to only add glasses with a certain size and position to allow potential recognition.
The predefined features may be determined by human intervention, e.g. via visual inspection; by processing the training data to compute the predefined features and compare values thereof against a threshold, which is decisive for determining whether features are detectable; or by computing salient features in the training data based on the machine learning component, such as weights thereof.
By constraining parameters input to the parametric content generator a relevant area of augmented training data can be more densely (or less sparsely) sampled - this in turn improves efficiency such as in terms of how much the machine learning component becomes better at performing in relevant areas per unit of processing power.
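A small sketch of such a constraint, assuming the genotype is a flat NumPy array and the bounds come from the range-determination step; the mutation step size is illustrative:

```python
import numpy as np

rng = np.random.default_rng()

def constrain(params, lower, upper):
    """Clip a genotype into the determined parameter range or limits."""
    return np.clip(params, lower, upper)

def mutate_constrained(params, lower, upper, sigma=0.05):
    """Mutation operator that keeps offspring inside the relevant area."""
    return constrain(params + rng.normal(0.0, sigma, size=params.shape),
                     lower, upper)
```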
In some embodiments the set of parameter values, based on which the set of augmented training data is evolved, is changed in accordance with an evolutionary algorithm performing one or both of mutation and cross-over of a set of genotype components comprising the set of parameters; and wherein the determination of adversarial training data comprises performing selection from the set of genotype components.
An evolutionary algorithm works excellently for (re-)training the machine learning component, since the performance of the machine learning component depends very much on the data it receives as its input and the fitness landscape may have many local minima and/or maxima which cannot be resolved analytically. In some aspects a phenotype is the augmented training data.
A fitness function guides the evolutionary algorithm. In case of a discriminator-based machine learning component, the fitness may be based on whether the augmented training data (i.e. the phenotype; in case of the face fooling this would be the actual picture) can induce misclassification or another type of problematic behaviour in the trainable discriminator. Augmented training data (i.e. individuals) rewarded by a high fitness value are then mutated and crossed over to generate the next generation. Evolutionary algorithms have the advantage that they can find combinations of features that are effective at fooling the machine learning component (e.g. faces with a certain colour of glasses, a moustache but no brown eyes, etc.).
Parameters are restricted to generate augmented training data with known class labels. For example, in case the machine learning component is a policy-based machine learning component, a parametric representation may be restricted to create only valid car tracks (e.g. class 1) instead of invalid tracks that always lead to a crash (e.g. class 2). In case the machine learning component is a discriminator-based machine learning component, a parametric representation may be restricted to create only artefacts (for images of faces), like glasses with a certain size and position, to allow potential recognition.
In some embodiments the machine learning component comprises a first trainable portion and a second trainable or non-trainable portion; and wherein the training of the machine learning component involves:
training of the first portion of the machine learning component; and forgo training the second portion of the machine learning component.
In some aspects it is an object to improve a so-called ‘black-box' component which has an interface for accessing its input and output, but not for training the ‘black-box' component itself. By the present method, a system based on such a ‘black-box' can be improved by adding a trainable machine learning component, i.e. the first portion. The second portion may be trainable or non-trainable (‘black-box'). In some aspects both the first portion and the second portion of the machine learning component are trainable.
Such a ‘black-box' component may be comprised by the machine learning component as a non-trainable, second portion thereof. The first portion of the machine learning component may however have an interface for training the component. The first portion may then be coupled to compensate one or both of input and output to the second portion (‘black-box') of the machine learning component. Compensation may be performed by additive operations, multiplicative operations or in other ways.
The ‘black-box' may be a machine learning component that has been trained and then supplied as a ‘black-box', leaving it non-trainable. In some aspects the ‘black-box' is a naturally occurring object, which is not trainable to generate accurate outputs to inputs.
In some embodiments the machine learning component is a discriminator-based machine learning component; and wherein the training data comprises first input elements or portions thereof associated with class labels.
A discriminator-based machine learning component is trained based on discriminating between different input elements in accordance with the class labels. In some aspects the first input elements are images or a representation of images. Such a task is also denoted classification.
In other aspects the machine learning component is a predictor. The first input elements may be audio data e.g. audio data comprising speech.
In some embodiments manipulating the set of training data based on the content data comprises:
augmenting first input elements of the training data with the augmentation data while preserving the class labels associated with the first input elements of the training data.
Thereby, it is possible to generate new training data based on original training data and thus without having to retrieve all new training data from an original source. Retrieving all new training data from an original source for a discriminator-based machine learning component may involve significant work.
New training data, i.e. augmented training data that modify or partially mask the first input elements while preserving features in the first input elements that are essential for performing the classification task, may thereby be generated. The augmented training data may be synthesized as described above.
In some embodiments the fitness function is configured to reward adversarial training data; and wherein adversarial training data are augmented training data wherein the augmentation of the training data caused the machine learning component to misclassify the augmented training data.
Thereby a simple, yet effective method for improving the machine learning component is provided. The fitness function may be configured to evaluate an output from the machine learning component based on augmented training data and an expected output from the machine learning component. The expected output from the machine learning component may be the value of a class label associated with the training data that were augmented in connection with the generating of the augmented training data. Thus, the information in the class label value is reused firstly for determining whether the augmented training data should be included in the set of adversarial training data and, in the affirmative event thereof, secondly, using the adversarial training data to train the machine learning component.
The fitness function may output a first value, e.g. 1, in the event of a discrepancy between the output and the expected output, and output a second value, e.g. 0, in case of a match between the output of the machine learning component and an expected output. In some cases, the fitness function may output a distance or difference value, sometimes denoted a ‘norm’, between the output of the machine learning component and the expected output.
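A sketch of the two variants just described (binary reward and norm-based reward); the function shape is illustrative:

```python
import numpy as np

def fitness(output, expected_output, binary=True):
    """Reward a deficiency: 1 for a mismatch, 0 for a match; alternatively a
    distance ('norm') between output and expected output."""
    if binary:
        return 0.0 if np.array_equal(output, expected_output) else 1.0
    return float(np.linalg.norm(np.asarray(output) - np.asarray(expected_output)))
```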
In some embodiments the method comprises defining a set of artefacts; the artefacts comprising an image representation of items selected from the group of: glasses, hats, wigs, beards, and make-up layers.
The artefacts may be represented e.g. as augmentation data in the form of images or layers thereof. Such artefacts, e.g. in connection with face recognition, may make it more difficult for the machine learning component to recognise a face augmented by such an artefact, albeit the face should be recognizable since it relates to the person on the image without the artefact.
As mentioned above, it may be possible to disguise a face to an extent where the person cannot be recognised any longer even by an adult human being. Therefore, it may be advantageous to constrain parameters controlling the extent of disguise (e.g. one or more of the expanse; transparency and colour of the artefact).
For instance, in some aspects, in order to augment an image of a face, positions of the face and eyes are detected using e.g. the known Haar Cascades method. Having determined the positions of the face and eyes, it is possible to determine a position at which an artefact such as glasses should be placed to generate realistic augmented training data. The thereby augmented training data are then presented to the machine learning component to check the identification result of the new variation generated. If the augmented training data are misclassified, they are added to the set of adversarial training data (failure cases).
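A sketch of the detection step using the Haar cascades shipped with OpenCV; the glasses-placement rule and overlay handling are illustrative assumptions:

```python
import cv2

# Standard cascade files bundled with opencv-python.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def place_glasses(image, glasses_overlay):
    """Detect face and eyes, then position a glasses artefact over the eyes."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    for (fx, fy, fw, fh) in face_cascade.detectMultiScale(gray, 1.3, 5):
        eyes = eye_cascade.detectMultiScale(gray[fy:fy + fh, fx:fx + fw])
        if len(eyes) >= 2:
            # Anchor the overlay at the mean eye position (illustrative rule;
            # the overlay is assumed to fit inside the image).
            ex = fx + int(sum(e[0] for e in eyes[:2]) / 2)
            ey = fy + int(sum(e[1] for e in eyes[:2]) / 2)
            h, w = glasses_overlay.shape[:2]
            image[ey:ey + h, ex:ex + w] = glasses_overlay
    return image
```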
In some embodiments, the machine learning component is a policy-based machine learning component;
the training data represents tracks or maps;
the machine learning component is configured to act as a controller coupled to observe the track or map via a first simulator interface of a vehicle simulator and to control the vehicle simulator via a second simulator interface; and

the vehicle simulator is configured to output a performance value indicating performance of the machine learning component.
The performance value may relate to how far, e.g. in terms of time and/or distance, the machine learning component was able to drive the vehicle without a crash. The performance value may also relate to how smoothly the vehicle was driven. The fitness function may be configured to output a value between 0 and 1, depending on how far the car was able to drive before crashing (higher values mean the car crashed earlier). Tracks that make the simulator crash the vehicle therefore receive a higher score, and more similar tracks will be generated in future populations.
In some aspects the vehicle simulator is a simulator for a self-driving vehicle, such as a car or a truck or movable construction equipment. The track or map may represent a road and its equipment such as road signs, lane marks etc. The track or map may represent paths in a landscape. The first simulator interface may comprise what is often denoted a ‘first person view' e.g. in the form of a sequence of images such as video. The first simulator interface may comprise radar images such as videos e.g. from a LIDAR or simulated LIDAR. The first simulator interface may comprise sensor measurements such as speed sensor measurements, acceleration measurements, gyroscope measurements etc. The second interface may comprise speed control, power control, steering control, etc. as is known in the field of controlling a vehicle.
In some aspects the vehicle simulator is a simulator for a vessel, such as a ship. The track or map may represent a nautical chart.
In some aspects the vehicle simulator is a simulator for a robot, such as a robot with one or more arms and one or more hands. The track or map may represent a path for one or more arms and one or more hands.
In some embodiments the fitness function is configured to reward adversarial training data; and wherein adversarial training data are augmented training data that caused the vehicle simulator to output a poor performance value.
In some embodiments the method comprises:
saving states, such as a crash log, at the simulator associated with an event that exceeded a predefined threshold, comprising registering a time index;

retrieving, from one or more of the track or map and the simulator, observations observed via the first interface at a range about the time index; and

generating adversarial training data from one or both of the saved states and the retrieved observations.
There is also provided a computer system comprising:
- a first interface for accessing training data;
- a first repository comprising one or both of: augmentation data and augmented training data;
- a second repository for storing at least a portion of the machine learning component;
wherein the computer system is configured to perform the method set out above.
There is also provided a computer-readable medium encoded with a program configured to perform the method set out above when run by a computer.
Generally, herein the terms training and retraining are both represented by the term (re-)training. In some embodiments the machine learning component may be trained and then enabled for retraining according to the methods described herein. In other embodiments the machine learning component may comprise a ‘black box' component which is not enabled for retraining; hence the method described herein may train a portion of the machine learning component e.g. for the first time. The term (re-)training should encompass both cases.
Generally, herein, the terms ‘component’, such as ‘machine learning component'; ‘unit’, ‘portion’, ‘processing means’ and ‘processing unit’ are intended to comprise any circuit and/or device suitably adapted to perform the functions described herein. In particular, the above term comprises general purpose or proprietary programmable microprocessors, Digital Signal Processors (DSP), Application Specific Integrated Circuits (ASIC), Programmable Logic Arrays (PLA), Field Programmable Gate Arrays (FPGA), special purpose electronic circuits, etc., or a combination thereof.
BRIEF DESCRIPTION OF THE FIGURES
A more detailed description follows below with reference to the drawing, in which:
fig. 1 illustrates a computer system that generates adversarial training data for a black box system and retrains a machine learning component based on the adversarial training data;
fig. 2 shows a block diagram of a computer system for retraining a machine learning component;
fig. 3 shows a flowchart for a method of retraining a machine learning component in a (re-)training mode;
fig. 4 shows a flowchart of determining a parameter range;
fig. 5 shows a flowchart for a method of generating a set of parameter values;
fig. 6 shows a first and a second portion of a machine learning component;

fig. 7 shows a first method of augmenting training data;

fig. 8 shows a second method of augmenting training data; and

fig. 9 shows a simulator for a self-driving vehicle comprising a machine learning component.
DETAILED DESCRIPTION
Fig. 1 illustrates a computer system that generates adversarial training data for a system, comprising a machine learning component, and retrains the machine learning component based on the adversarial training data.
The machine learning component 104 may be a discriminator-based machine learning component as described below. Alternatively, as described further below, the system may be a policy-based machine learning component or another type of machine learning component.
An evolutionary algorithm 101 receives or computes a fitness value for each member of a population of candidate training data and outputs a set of parameters, i.e. the genotype, to a parametric content generator 102 for each member. The evolutionary algorithm 101 performs a search for parameters that make a parametric content generator generate candidate training data (i.e. adversarial training data) that ‘fool’ the discriminator-based machine learning component 104.
In some embodiments the machine learning component 104 is configured, such as by training, to discriminate between persons based on images of their faces e.g. in connection with an authentication application. An example representation of parameters for such an embodiment could be in the form of an array [o1, x1, y1, s1, o2, x2, y2, s2, ...], wherein ‘o’ represents presence of an artefact (e.g. glasses); wherein ‘x’ and ‘y’ represent the position coordinate of this artefact on the given face; and wherein ‘s’ represents the size of the artefact. The values of the parameters may be restricted to predefined ranges. For instance, the size parameter ‘s’ is restricted such that the size of the glasses does not cover more than 20% of the face. The predefined ranges may be investigated automatically by means of an image processing algorithm or semi-automatically by means of a user interface providing a user with a visual representation of the artefact in response to a user selecting a value for one or more parameters. Thereby the user can perform visual inspection of an artefact and record end values of ranges within which realistic artefacts are generated.
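A sketch of that parameter array and its random initialization within predefined ranges; apart from the 20% size cap mentioned above, the concrete bounds and normalization are assumptions:

```python
import numpy as np

rng = np.random.default_rng()

# Per artefact: [o, x, y, s] = presence, x/y position, size. Positions are
# assumed normalized to the face; 's' is capped so the glasses cover at most
# 20% of the face, per the restriction above.
LOW = np.array([0.0, 0.0, 0.0, 0.0])
HIGH = np.array([1.0, 1.0, 1.0, 0.2])

def random_individual(n_artefacts=2):
    """Initialize a genotype with random values inside the predefined intervals."""
    return rng.uniform(np.tile(LOW, n_artefacts), np.tile(HIGH, n_artefacts))
```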
In a first iteration, i.e. the first generation provided by the evolutionary algorithm 101, a number of these parametric representations, individuals, are created (i.e. a population of individuals), wherein each individual is initialized with random values that lie in the predetermined intervals. Each of these individuals in the population is then given a ‘fitness’ value through a fitness function, which the evolutionary algorithm uses as its guide in generating next generations of individuals.
The fitness value is based on whether the produced individual (i.e. the phenotype; in case of the face fooling this would be the actual picture) can induce misclassification or another type of problematic behaviour in the discriminator-based machine learning component.
Individuals with a high fitness value are then mutated and/or crossed over to generate the next generation. The evolutionary algorithm may be configured to increase the chance that an individual is chosen for the next generation proportionally to its fitness value, while also giving individuals with low fitness values a chance for reproduction (the stepping stones towards complex behaviours could have low fitness values; i.e. the fitness landscape is deceptive). Mutations add small random numbers, with a certain given probability, to each element of the vector. For crossover, two parents with high fitness are chosen to create a new offspring, and a random point p in the genome is chosen, where the first elements of the genotypic representation are taken from one parent and the rest from the other parent. In contrast to pure random sampling, evolutionary algorithms have the advantage that they can find combinations of features that are effective at fooling the discriminator-based machine learning component (e.g. faces with a certain colour of glasses, a moustache but no brown eyes, etc.).
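These operators could look as follows in a minimal NumPy sketch; the mutation probability, step size and selection smoothing are illustrative values:

```python
import numpy as np

rng = np.random.default_rng()

def mutate(genome, prob=0.1, sigma=0.02):
    """Add small random numbers to each element with a given probability."""
    mask = rng.random(genome.shape) < prob
    return genome + mask * rng.normal(0.0, sigma, genome.shape)

def crossover(parent_a, parent_b):
    """One-point crossover: first elements from one parent, rest from the other."""
    p = rng.integers(1, len(parent_a))          # random point p in the genome
    return np.concatenate([parent_a[:p], parent_b[p:]])

def select(population, fitnesses):
    """Fitness-proportional selection that still gives low-fitness individuals
    a chance (the fitness landscape may be deceptive)."""
    probs = np.asarray(fitnesses, dtype=float) + 1e-6
    idx = rng.choice(len(population), p=probs / probs.sum())
    return population[idx]
```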
The representation and/or mutation parameters are restricted to generate examples with known class labels. For example, a genetic representation that is restricted to only create glasses with a certain size and position to allow potential recognition of the face.
The parametric content generator 102 receives a parametric representation (genotype) as described above and outputs a phenotype (e.g. an image of a face). The parametric content generator takes as input the parameters produced by the evolutionary algorithm, and optionally training data e.g. comprising images of faces, and produces augmented training data (i.e. the phenotype), such as a decorated image of a face.
The machine learning component 104 receives augmented training data and outputs one or both of a performance value and a fitness value to thereby indicate whether a weakness was found. The discriminator-based machine learning component 104 receives as its input the augmented training data generated by the parametric content generator 102 and performs a computational operation on it, such as classifying it into one of a number of predefined classes.
The training data 105 contain a number of content examples that are used to train the trainable discriminator in the training process. Initially, the training data 105 comprise a number of real-life examples wherein the classes are known. As more adversarial examples are generated by the parametric content generator, these are added to the dataset for retraining of the discriminator-based machine learning component. For the adversarial training data, the classes are known; they are known from the original training data.
A training component 106 receives a representation of the machine learning component 104 and outputs a (re-)trained machine learning component. The training component 106 trains the discriminator-based machine learning component using a dataset containing multiple examples. The exact training mechanism depends on the internal architecture of the trainable discriminator or policy. For a discriminator implemented as a neural network, stochastic gradient descent through the backpropagation algorithm can be used.
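For a neural-network discriminator, the retraining step could look like the following PyTorch sketch; the dataset objects, batch size and learning rate are assumptions:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, ConcatDataset

def retrain(discriminator, original_dataset, adversarial_dataset, epochs=5):
    """Stochastic gradient descent via backpropagation on the original
    training data supplemented by the adversarial examples."""
    loader = DataLoader(ConcatDataset([original_dataset, adversarial_dataset]),
                        batch_size=32, shuffle=True)
    optimizer = torch.optim.SGD(discriminator.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:   # class labels are known (from originals)
            optimizer.zero_grad()
            loss_fn(discriminator(images), labels).backward()
            optimizer.step()
    return discriminator
```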
The machine learning component 104 may alternatively or additionally be a policy-based machine learning component or another type of machine learning component. A policy-based machine learning component is configured, such as by training, to control a component, device or system, such as a vehicle or a computer model of a vehicle, to drive the vehicle in accordance with a predefined performance criterion. In the domain of controlling (“driving”) a vehicle such as a car, the fitness function may output values e.g. between 0 and 1, depending on how far the car was able to drive on a track before crashing (higher values mean the car crashed earlier).
In connection therewith, the parametric content generator 102 may receive a genotypic description of a track (e.g. a list of coordinates). The description of the track may then be used to create a 3D track in a simulator. The machine learning component 104, which may be a deep neural network, may be applied to control a car in a driving simulation (i.e. the machine learning component has outputs for controlling the car to turn right, turn left, accelerate and brake). The machine learning component may receive a first person view of the track as its input from one or more cameras and optionally input from other types of sensors as is known in the art. A simulation is run for a certain amount of time, wherein, at time steps, the machine learning component receives the inputs mentioned above and its outputs are translated into control signals for controlling the car. When the allotted time runs out or the car crashes (e.g. by failing to satisfy a performance criterion), the simulation may be terminated and the inverse of the distance travelled before crashing (the network's fitness) is reported back to the evolutionary algorithm.
Tracks that make the car “crash” within a shorter period or distance, rather than longer period or distance, are “rewarded” by the fitness function by a higher value. Adversarial examples, in the form of tracks similar to the ones that scored a high fitness value, will then be generated for coming populations - and for (re-)training of the machine learning component. In some embodiments the machine learning component may be a policy-based machine learning component implemented as a neural network e.g. implemented as a Q-learning component.
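A sketch of such a simulation rollout and the inverse-distance fitness; the `simulator` interface (`reset`, `step`) and the controller's `act` method are assumptions standing in for the driving simulation:

```python
def track_fitness(mlc, simulator, track, max_steps=10000):
    """Run the controller on a track; reward tracks that crash the car early."""
    state = simulator.reset(track)
    distance = 0.0
    for _ in range(max_steps):
        controls = mlc.act(state)             # turn left/right, accelerate, brake
        state, step_distance, crashed = simulator.step(controls)
        distance += step_distance
        if crashed:
            return 1.0 / (1.0 + distance)     # early crash -> higher fitness
    return 0.0                                # no crash within the allotted time
```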
Fig. 2 shows a block diagram of a computer system for retraining a machine learning component. The computer system 201 may operate in a first mode 202, wherein the machine learning component 204 is trained as it is known in the art, and operate in a second mode 206, wherein the machine learning component is improved by being (re-)trained. In some aspects thereof, the machine learning component comprises a non-trainable portion (sometimes denoted a ‘black-box') and a trainable component.
In the first mode 202, the computer system 201 comprises training data 203, a machine learning component 204 and an MLC trainer for training the machine learning component 204 based on the training data. The training data 203 may be stored in a repository such as on a storage drive and/or be accessible from a remote location via an interface. The first mode 202 comprises a step of testing whether the machine learning component performs as desired on the training data or a test set, as is known in the art.
In the second mode 206, the computer system 201 comprises the machine learning component 204 and the training data 203 as described in connection with the first mode.
A parametric content generator 207 is configured to synthesize a set of augmentation data 208 based on a set of parameter values in accordance with a parametric representation of an artefact. Thus, the parametric content generator receives parameter values from a parameter population 210 as its input and generates augmentation data 208 as its output in a format that is compatible with the training data 203.
Augmented training data 209 are generated based on the training data 203 and the augmentation data 208.
The parameter population 210 is generated by an evolutionary algorithm that evolves the set of augmented training data 209 over generations based on evolving the parameter population (a set of parameter values) in accordance with optimization of a fitness function, which is configured to reward a performance deficiency associated with an output produced by the machine learning component in response to receiving augmented training data as its input. Thus, the machine learning component 204 is used in the process of optimization of the fitness function.
In evolutionary algorithm terms, the parameters are a genotype and the augmentation data are a phenotype.
During the course of evolving the augmented training data, a set of adversarial augmented training data 211 is determined. Adversarial augmented training data are augmented training data in the set of augmented training data that caused a performance deficiency associated with an output produced by the machine learning component in response to receiving augmented training data as its input. That is, augmented training data that were rewarded by the fitness function.
The computer system also comprises a machine learning training component 212, which (re-)trains the machine learning component based on at least the set of adversarial training data 211.
The second mode 206 may be completed when the machine learning component has been trained over a number of iterations until a termination criterion has been met. A termination criterion may be that a generalization error on a test set, comprising augmentation data, has reached a predefined level or progresses slower than a predefined rate.
Fig. 3 shows a flowchart for a method of (re-)training a machine learning component in a (re-)training mode. The (re-)training mode, which is entered in step 302, may correspond to the second mode 206 of the computer system described above.
The method of (re-)training a machine learning component is generally designated by reference numeral 301 and starts at step 303 wherein a set of parameter values (PV) are generated. The parameter values may be initialized with random values that are constrained to lie in a predefined interval as described herein. The method then proceeds to step 304.
In step 304 a set of augmentation data (AD) is synthesized based on the set of parameter values (PV) in accordance with a parametric representation of an artefact. As described herein, augmentation data may be data that are in a format that is compatible with, or convertible to be compatible with, training data or at least an input portion thereof. Augmentation data supplements training data or modifies training data. Augmentation data is a representation of an artefact. An artefact may be an object made by a human being; something observed in a scientific investigation or experiment that is not naturally present but occurs as a result of the preparative or investigative procedure; or something which resembles a natural phenomenon. In case the training data comprises images of faces of human beings, the augmentation data may be images of artefacts such as e.g. glasses (which is an object made by a human being), or the augmentation data may represent an artefact such as darkness (which is a natural phenomenon). In the latter case, the augmentation data may itself be an image or layer thereof or it may be brightness parameters controlling the training data or a copy thereof. The parameters may then control transparency of the glass in the glasses, thickness and colour of the frame of the glasses, and brightness of an image. The augmentation data may be synthesized by a parametric content generator. The method then proceeds to step 305.
In step 305 a set of augmented training data (ATD) is generated. Step 305 may comprise retrieving training data based on which the machine learning component has been trained and augmenting the training data with the augmentation data (AD) to generate the augmented training data. In case of a discriminator-based machine learning component, the training data may comprise pairs of images (e.g. an image of a person) and class labels (e.g. an identification code for that person). The augmented training data may then be generated by augmenting the image portion of the training data while preserving the class label thereby generating a new training example. The method then proceeds to step 306.
However, the augmented training data are not yet used for training the machine learning component. Firstly, in step 306 the augmented training data are input to the machine learning component and in response thereto the machine learning component is activated to compute an output. In case of the discriminator-based machine learning component, the input may be an augmented image and the output may be a class label. The method then proceeds to step 307.
In step 307 the output, e.g. the class label, is compared to an expected output and the value of a fitness function is computed. The expected output may be the class label for the (non-augmented) training data (which were augmented). For instance, consider a training example being a pair of an image (IMG_10) and a class label (ID1); the augmented image may be denoted AIMG_10_17. Then the expected output when inputting the image AIMG_10_17 would be class label ID1, since AIMG_10_17 is generated by augmenting image IMG_10. The fitness function is configured to reward a performance deficiency associated with an output produced by the machine learning component in response to receiving augmented training data as its input. Thus, in the example above, if the class label output from the machine learning component is not correct, the fitness function may reward this case over the case that the class label output from the machine learning component is correct. The method then proceeds to step 308.
In step 308, the method branches off to either revert to step 304 via step 309, in which a next generation of parameter values is evolved by one or both of mutation and cross-over of the set of parameter values (i.e. in the genotype domain) generated in step 303, or to proceed to step 310 wherein a set of adversarial augmented training data (AATD) is selected. The branching in step 308 may be based on an evaluation of fitness values or an evaluation of the number of elapsed generations. Following step 310, the method proceeds to step 311.
In step 311 the machine learning component is (re-)trained based on at least the set of adversarial augmented training data. (Re-)training is performed as is known in the art and may involve various training methods e.g. to improve (minimize) a generalisation error. Following step 311, the method proceeds to step 312, wherein the method branches off to either revert to step 303 for starting a new iteration of generating adversarial augmented training data and (re-)training the machine learning component, or to complete the (re-)training mode in step 313. Step 313 may comprise reporting performance of the machine learning component e.g. relative to performance of the machine learning component when the (re-)training mode was entered in step 302.
Fig. 4 shows a flowchart for a method of determining a parameter range. In step 402 an artefact is loaded from a repository of artefacts. In case of a discriminator-based machine learning component for image-based face recognition, the artefacts may comprise a representation of items selected from the group of: glasses, hats, wigs, beards, and make-up layers. The items may be represented in an electronic file, e.g. in a CAD format or in an image format. The method then proceeds to step 403.
In step 403 the loaded artefact is manipulated, either by changing parameters controlling a graphical rendering of the artefact, or by editing a graphical presentation of the artefact via a graphical user interface (GUI), whereby the parameters associated with the graphical presentation are updated to reflect the edits. Thus parameter values and the graphical presentation are interrelated. The method then proceeds to step 404, providing at least a parameter and an interrelated graphical representation of the artefact.
In step 404, detectability of predefined features in the augmented training data is examined. The examination may be performed by visual inspection and/or by computer processing to measure detectability of the predefined features. The examination may be based on one or both of the augmentation data alone and the augmented training data. The method then proceeds to step 405, wherein the method branches off to resume at step 403 in case the predefined features are detectable, or to step 406, wherein a parameter value is registered as an endpoint of a range of the parameter value.
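The automated part of the detectability examination could, for face-recognition training data, be sketched with an off-the-shelf detector; the choice of OpenCV's Haar cascade here is purely illustrative and not taken from the patent:

```python
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def features_detectable(augmented_image):
    """Step 404 check: is the predefined feature (here, a face) still
    detectable after augmentation?"""
    gray = cv2.cvtColor(augmented_image, cv2.COLOR_BGR2GRAY)
    return len(detector.detectMultiScale(gray)) > 0
```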
The method then proceeds to step 407, wherein the method branches: it may resume at step 403 to continue manipulation of the artefact to determine other parameter ranges or endpoints thereof for the artefact; resume at step 402, wherein a further artefact is loaded; or proceed by providing a parameter range 408.
Once the parameter range is determined, synthesizing of augmentation data may be initialized or evolved as described herein to develop adversarial augmented training data.
Fig. 5 shows a flowchart for a method of generating a set of parameter values. The method of generating a set of parameter values is performed by steps 303 and 309 described above. In a first step 502, the method branches either to initialize a set of parameters (a population) based on the parameter range 408, cf. step 303 above, or to perform one or both of mutation and cross-over of parameter values (in the genotype domain). As a result a set of parameters 505 is generated. In steps 503 and 504 parameters are restricted or confined to lie within the parameter range 408. The parameter range 408 may thus represent artefacts that can augment the training data without destroying the machine learning component’s ability, ideally, to perform well. The set of parameters 505 thus represents ‘realistic’ artefacts as approved in connection with the method described in Fig. 4.
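Fig. 5 could be sketched as follows, reusing `evolve_generation` from the sketch above; the population size, genome length, and uniform initializer are assumptions:

```python
import random

def generate_parameter_set(parameter_range, population=None,
                           size=20, genome_length=5):
    """Step 502 branches to initialization (cf. step 303) or to
    mutation/cross-over (step 309); steps 503/504 then confine every
    value to the approved parameter range 408."""
    lo, hi = parameter_range
    if population is None:                              # step 303
        candidates = [[random.uniform(lo, hi) for _ in range(genome_length)]
                      for _ in range(size)]
    else:                                               # step 309
        candidates = evolve_generation(population, parameter_range)
    # steps 503/504: confinement to the parameter range 408
    return [[min(hi, max(lo, v)) for v in genome] for genome in candidates]
```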
Fig. 6 shows a first and a second portion of a machine learning component. According to a first embodiment, the first portion 603 of the machine learning component 601 may be a ‘black box’ component in the sense that it is not enabled to be trained, is not trainable, or is simply not trained.
The second portion 602 of the machine learning component 601 is enabled to be trained e.g. in connection with the computer system and methods described above. The second portion 602 is trainable via an interface designated T.
Input to the machine learning component 601 is provided via an interface designated ‘MLC Input’. Output from the first portion 603 and the second portion 602 of the machine learning component 601 is provided via interfaces designated ‘MLC Output1’ and ‘MLC Output2’, respectively. The inputs and outputs may be used both during (re-)training of the machine learning component and during use of the machine learning component, e.g. as a discriminator or policy, in connection with determining performance of the machine learning component and/or computing a value of the fitness function.
In some embodiments, input to the first portion 603 and the second portion 602 may be provided in parallel. In some embodiments, input to the second portion 602 is provided by feeding output from the first portion 603 to the input of the second portion 602.
In some embodiments, output from the first portion 603 and output from the second portion 602 are provided in parallel on respective outputs. In some embodiments, output from the first portion 603 is modified by a unit 604, which receives output from the first portion 603 and output from the second portion 602 and provides a common output on the output designated ‘MLC Output1’. A common output from unit 604 may be generated by one or more linear operations, such as additive or multiplicative operations, or by non-linear operations.
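Unit 604 might combine the two outputs as in this sketch; the particular set of modes is illustrative, not prescribed by the patent:

```python
import numpy as np

def combine_outputs(mlc_output1, mlc_output2, mode="additive"):
    """Sketch of unit 604: merge output from the first portion 603 and
    the second portion 602 into a common output."""
    if mode == "additive":
        return mlc_output1 + mlc_output2
    if mode == "multiplicative":
        return mlc_output1 * mlc_output2
    if mode == "nonlinear":
        return np.tanh(mlc_output1 + mlc_output2)
    raise ValueError(f"unknown mode: {mode}")
```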
Fig. 7 shows a first method of augmenting training data. In some embodiments, this first method is used in connection with discriminator-based machine learning components. According to this first method, augmentation data 703 are generated by a parametric content generator 702, e.g. in the form of images or layers of images. Augmentation data are thus explicit in the sense that they are represented in a format compatible with the training data.
The parametric content generator 702 generates the augmentation data 703 based on parameter values. A unit 705 augments the training data 704 with the augmentation data 703.
As mentioned the augmentation data and the training data may be in the form of images or layers thereof. The unit 705 may then be configured to perform image processing to ‘add’ or ‘blend’ the images. In some embodiments the training data 704 and the augmentation data 703 are in different formats; the unit 705 may then be configured to convert or transform from at least one format into at least another format to augment the training data. In some embodiments the unit 705 provides the augmentation across different formats of the training data 704 and the augmentation data 703.
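As a sketch of unit 705 blending across formats, assume (purely for illustration) that the training image is plain RGB while the augmentation layer carries its own alpha channel:

```python
import numpy as np

def blend_layer(image_rgb, layer_rgba):
    """Alpha-composite an RGBA augmentation layer onto an RGB training
    image; the format conversion is part of the augmentation."""
    alpha = layer_rgba[..., 3:4].astype(float) / 255.0
    blended = (1 - alpha) * image_rgb + alpha * layer_rgba[..., :3]
    return blended.astype(np.uint8)
```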
Fig. 8 shows a second method of augmenting training data. In some embodiments, this second method is used in connection with policy-based machine learning components. According to this second method, augmentation data are implicit in the sense that they are augmentation parameters 802, e.g. parameters that describe artefacts such as traffic equipment like signs, lane marks or the layout of roads, paths or tracks.
The augmentation parameters 802 may be generated as a set of parameters as described above wherein a renderer 805 provides a visual presentation of the augmentation represented by the parameters. Thereby realistic augmentation may be generated within ranges of the augmentation parameters.
The augmentation parameters 802 may be compatible with track parameters 803 that are in accordance with a parameter description or computer language, such as a declarative computer language, for rendering a graphical presentation, e.g. as a so-called first person view, of the track and compatible with the renderer 805.
A unit 804 may be configured to merge such track parameters 803 with such augmentation parameters 802 to provide an augmented rendering of the track, its equipment, track surroundings, weather and light conditions etc. The track parameters 803 and the augmentation parameters 802 may be mutually exclusive in their parameter domain or they may overlap; in the latter case allowing augmentation parameters to modify or overrule track parameters.
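Assuming both parameter sets are name-to-value mappings (an assumption; the patent only requires a merge), unit 804 could be sketched as:

```python
def merge_parameters(track_params, augmentation_params):
    """Sketch of unit 804: merge augmentation parameters 802 into the
    track parameters 803; where the parameter domains overlap, the
    augmentation parameters overrule."""
    merged = dict(track_params)
    merged.update(augmentation_params)
    return merged

# e.g. merge_parameters({"lane_width": 3.5, "weather": "clear"},
#                       {"weather": "fog"}) leaves lane_width untouched
# but overrules weather.
```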
Fig. 9 shows a simulator for a self-driving vehicle comprising a machine learning component. The machine learning component 904 may be a policy-based machine learning component configured for controlling the vehicle as known in the art of self-driving or autonomous vehicles; here, however, the machine learning component is shown in connection with the simulator 901, wherein it is subject to (re-)training.
The machine learning component is configured to act as a controller for the vehicle and is coupled to observe the track or map via a first simulator interface 909 of a vehicle simulator and to control the vehicle simulator via a second simulator interface 910. The first simulator interface 909 may comprise radar images, such as videos, e.g. from a LIDAR or a simulated LIDAR. The first simulator interface 909 may also comprise sensor measurements such as speed sensor measurements, acceleration measurements, gyroscope measurements etc. The second simulator interface 910 may comprise speed control, power control, steering control etc., as known in the field of controlling a vehicle.
In this example embodiment, output from the machine learning component 904 is input to a simulator engine 902 which simulates a drive of a vehicle, represented by a vehicle model 905, on a track 906 represented by a set of parameters 907. The simulator engine 902 is configured to output a performance value 908 indicating performance of the machine learning component related to driving the vehicle on the track. The performance value may relate to how far, e.g. in terms of time and/or distance, the machine learning component was able to drive the vehicle without a crash. The performance value may also relate to how smoothly the vehicle was driven.
The aforementioned fitness function may be configured to output a value between 0 and 1, depending on how far the car was able to drive before crashing (higher values mean the car crashed earlier). Tracks that make the simulator crash the vehicle therefore receive a higher score, and more similar tracks will be generated in future populations of the set of parameters 907. The set of parameters 907 may be evolved, e.g. by an evolutionary algorithm as described above.
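A minimal sketch of such a fitness value, assuming a linear mapping from distance driven to the interval [0, 1] (the linearity is an assumption):

```python
def track_fitness(distance_before_crash, track_length):
    """Higher values mean the vehicle crashed earlier, so tracks that
    provoke a crash are rewarded and propagated to future populations."""
    completed = min(distance_before_crash / track_length, 1.0)
    return 1.0 - completed

# track_fitness(120.0, 1000.0) == 0.88: an early crash, high reward.
```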
In connection therewith, the machine learning component 904 is (re-)trainable via an interface designated T. The interface T may give access, e.g. as an application programming interface (API) or directly, to change parameters, such as weights, of the machine learning component.
The simulator engine 902 may be configured to save states (e.g. a crash log) associated with an event that exceeded a predefined threshold, comprising registering a time index; retrieving, from one or more of the track or map and the simulator, observations observed via the first interface in a range about the time index; and generating adversarial training data from one or both of the saved states and the retrieved observations. Thereby, adversarial training data can be evolved around crash events.
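A sketch of harvesting such observations, assuming (as an illustration) that the simulator records (time_index, observation) pairs via the first interface:

```python
def observations_around_crash(log, crash_time, window=2.0):
    """Collect observations in a range about the registered time index;
    these can seed adversarial training data around the crash event."""
    return [obs for t, obs in log
            if crash_time - window <= t <= crash_time + window]
```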
In some embodiments, a method may proceed in accordance with the following steps (a code sketch follows the list):
1. Run the original training set through the machine learning component and identify failure and success examples;
2. Create an augmented training set by augmenting the original training set for the success examples;
3. Calculate the system accuracy (e.g. performance) on the augmented training set; and
4. Add failure examples from the augmented training set to the original training set and continue at 1.
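The four steps might be sketched as below; `model` is assumed to expose a per-example `predict` and a `fit` over labelled pairs, `augment` maps examples to augmented counterparts, and the stopping criterion is an assumption (the steps as listed loop indefinitely):

```python
def retraining_loop(model, original_set, augment, target_accuracy=0.99):
    training_set = list(original_set)
    while True:
        # 1. run the training set through the component; keep successes
        successes = [(x, y) for x, y in training_set
                     if model.predict(x) == y]
        # 2. augment the success examples
        augmented = augment(successes)
        # 3. accuracy on the augmented set
        failures = [(x, y) for x, y in augmented if model.predict(x) != y]
        accuracy = 1.0 - len(failures) / max(len(augmented), 1)
        if accuracy >= target_accuracy:
            return model
        # 4. add augmented failures to the training set and continue at 1
        training_set += failures
        model.fit(training_set)
```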
Claims (16)
1. A method, comprising:
evolving a set of augmented training data (209) and training a machine learning component (204) by:
synthesizing (304) augmentation data (204) based on a set of parameter values in accordance with a parametric representation of an artefact;
generating (305) a set of augmented training data (209) by augmenting training data (203) based on the augmentation data (204);
evolving (309) the set of augmented training data (209) over generations based on evolving the set of parameter values in accordance with optimization of a fitness function, which is configured to reward a performance deficiency associated with an output produced by the machine learning component in response to receiving augmented training data (209) as its input;
among the set of augmented training data (209), determining (310) a set of adversarial augmented training data (211) which are augmented training data in the set of augmented training data (209) that caused a performance deficiency associated with an output produced by the machine learning component (204) in response to receiving augmented training data as its input; and training the machine learning component (204) based on the set of adversarial training data (211).
2. A method according to claim 1, wherein evolving the set of augmented training data comprises:
from one generation to another generation of evolving the set of augmented training data:
the synthesizing of augmentation data based on the set of parameter values;
the generating of augmented training data by augmenting the training data based on the augmentation data; and forgo changing the training data.
3. A method according to claim 1, wherein evolving the set of augmented training data from one generation to another generation comprises:
the synthesizing of augmentation data based on the set of parameter values; and the generating of augmented training data by augmenting the training data based on the augmentation data; and at least partially replacing the training data with the augmented training data.
4. A method according to any of claims 1-3, wherein the determining of the set of adversarial augmented training data comprises:
inputting a current generation of the set of augmented training data to the machine learning component and evaluating an output associated with the current generation of augmented training data;
determining examples, among the current generation of the set of augmented training data, which have a performance deficiency;
including the determined examples in the set of adversarial augmented training data.
5. A method according to any of claims 1-4, comprising:
using a parametric content generator to synthesize augmentation data;
determining a parameter range or limits within which the parametric content generator generates a variation of augmentation data for which it applies, in one or both of first input elements of the augmented training data and the augmentation data, that predefined features are detectable;
constraining parameters input to the parametric content generator during the evolving of the set of augmented training data to the determined parameter range or limits.
6. A method according to any of claims 1-5, wherein the set of parameter values, based on which the set of augmented training data is evolved, is changed in accordance with an evolutionary algorithm performing one or both of mutation and cross-over of a set of genotype components comprising the set of parameters; and wherein the determination of adversarial training data comprises performing selection from the set of genotype components.
7. A method according to any of claims 1-6, wherein the machine learning component comprises a first trainable portion and a second trainable or non-trainable portion; and wherein the training of the machine learning component involves:
training of the first portion of the machine learning component; and
forgo training the second portion of the machine learning component.
8. A method according to any of claims 1-7, wherein the machine learning component is a discriminator-based machine learning component; and
wherein the training data comprises first input elements or portions thereof associated with class labels.
9. A method according to claim 8, wherein manipulating the set of training data based on the augmentation data comprises:
augmenting first input elements of the training data with the augmentation data while preserving the class labels associated with the first input elements of the training data.
10. A method according to claim 8 or 9, wherein the fitness function is configured to reward adversarial training data; and wherein adversarial training data are augmented training data wherein the augmentation of the training data caused the machine learning component to misclassify the augmented training data.
11. A method according to any of claims 1-10, comprising defining a set of artefacts; the artefacts comprising an image representation of items selected from the group of: glasses, hats, wigs, beards, and make-up layers.
12. A method according to any of claims 1-7,
wherein the machine learning component is a policy-based machine learning component;
wherein the training data represents tracks or maps;
wherein the machine learning component is configured to act as a controller coupled to observe the track or map via a first simulator interface of a vehicle simulator and to control the vehicle simulator via a second simulator interface;
wherein the vehicle simulator is configured to output a performance value indicating performance of the machine learning component.
13. A method according to claim 12, wherein the fitness function is configured to reward adversarial training data; and wherein adversarial training data are augmented training data that caused the vehicle simulator to output a poor performance value.
14. A method according to claim 12 or 13, comprising:
saving states at the simulator associated with an event that exceeded a predefined threshold, comprising registering a time index; retrieving, from one or more of the track or map and the simulator, observations observed via the first interface at a range about the time index; and generating adversarial training data from one or both of the saved states and the retrieved observations.
15. A computer system comprising:
- a first interface for accessing training data;
- a first repository comprising one or both of: augmentation data and augmented training data;
- a second repository for storing at least a portion of the machine learning component;
wherein the computer system is configured to perform the method according to any of claims 1-14.
16. A computer-readable medium encoded with a program configured to perform the method according to any of the preceding claims when run by a computer.