
WO2003009074A1 - Behavior control apparatus and method - Google Patents


Info

Publication number
WO2003009074A1
WO2003009074A1 PCT/JP2002/007224 JP0207224W WO03009074A1 WO 2003009074 A1 WO2003009074 A1 WO 2003009074A1 JP 0207224 W JP0207224 W JP 0207224W WO 03009074 A1 WO03009074 A1 WO 03009074A1
Authority
WO
WIPO (PCT)
Prior art keywords
behavior
mobile unit
target object
target
location
Prior art date
Application number
PCT/JP2002/007224
Other languages
French (fr)
Inventor
Takamasa Koshizen
Hiroshi Tsujino
Hideaki Ono
Original Assignee
Honda Giken Kogyo Kabushiki Kaisha
Priority date
Filing date
Publication date
Application filed by Honda Giken Kogyo Kabushiki Kaisha
Priority to JP2003514353A (patent JP2004536400A)
Priority to EP02746100A (patent EP1407336A1)
Priority to US10/484,147 (patent US7054724B2)
Publication of WO2003009074A1

Links

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05D: SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00: Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/0088: Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots, characterized by the autonomous decision making process, e.g. artificial intelligence, predefined behaviours

Definitions

  • the present invention relates to a behavior control apparatus and method for mobile unit, in particular, to a behavior control apparatus and method for recognizing a target object in acquired images and controlling behavior of the mobile unit with high accuracy based on the recognized target object.
  • To control a mobile unit with high accuracy based on input images, the control system must recognize an object in the image as a target for the behavior of the mobile unit.
  • One approach is that the control system learns training data pre-selected by an operator prior to recognition. Specifically, the control system searches the input images to extract some shapes or colors therefrom designated as features of the target. Then, the control system outputs commands to make the mobile unit move toward the extracted target.
  • An alternative approach is to prepare a template for the target and, while controlling the mobile unit, apply the template continuously to the input images to search for and extract the shape and location of the target in detail.
  • the computational cost would become huge because a computer has to keep calculating the shape and location of the target.
  • the search for the target may fall into a local solution.
  • No. H7-13461, a method for guiding autonomous mobile robots that manage indoor air-conditioning units is disclosed. According to the method, a target object for guidance is detected through image processing and the robot is guided toward the target. However, the method requires the blowing outlets of air-conditioning units as target objects, and therefore lacks generality.
  • a behavior control apparatus for controlling behavior of a mobile unit.
  • the apparatus comprises sensory input capturing method for capturing sensory inputs and motion estimating method for estimating motion of the mobile unit.
  • the apparatus further comprises target segregation method for segregating the portion which includes a target object to be target for behavior of the mobile unit from sensory inputs, and target object matching method for extracting the target object from the segregated portion.
  • the apparatus still further comprises target location acquiring method for acquiring the location of the target object and behavior decision method for deciding behavior command for controlling the mobile unit based on the location of the target object.
  • the behavior control apparatus roughly segregates the portion that includes a target object of behavior from sensory inputs, such as images, based on the estimation of motion.
  • the apparatus specifies a target object from the portion, acquires the location of the target object and outputs a behavior command that moves the mobile unit toward the location.
  • detailed feature of the target object need not be predetermined.
  • the computational load is reduced. Therefore, highly efficient and accurate control for the mobile unit may be implemented.
  • mobile unit refers to a unit which has a driving mechanism and moves in accordance with behavior commands.
  • the sensory inputs may be images of the external environment of the mobile unit.
  • the motion estimating method comprises behavior command output method for outputting the behavior command and behavior evaluation method for evaluating the result of the behavior of the mobile unit.
  • the motion estimating method further comprises learning method for learning the motion of the mobile unit using the relationship between the sensory inputs and the behavior result and storing method for storing the learning result.
  • the behavior control apparatus pre-learns the relationship between sensory inputs and behavior commands. The apparatus then updates the learning result when a new feature is acquired in the behavior control stage.
  • the learning result is represented as probabilistic density distribution. Thus, motion of the mobile unit on behavior control stage may be estimated with high accuracy.
  • the motion of the mobile unit may be captured using a gyroscope instead of estimating it.
  • the target segregation method segregates the portion by comparing the sensory inputs and the estimated motion using, for example, optical flow.
  • the behavior control apparatus may roughly segregate the portion that includes a target object
  • the target location acquiring method defines the center of the target object as the location of the target object and the behavior decision method outputs the behavior command to move the mobile unit toward the location of the target object.
  • the mobile unit may be controlled stably.
  • the behavior decision method calculates the distance between the mobile unit and the location of the target object, and decides the behavior command so as to decrease the calculated distance. This calculation is very simple and helps to reduce the amount of computation.
  • the target segregation method repeats segregating the portion which includes a target object.
  • the target object matching method extracts the target object by pattern matching between the sensory inputs and predetermined templates.
  • the target object may be extracted more accurately.
  • Fig. 1 shows an overall view of a radio-controlled (RC) helicopter according to one embodiment of the invention;
  • Fig. 2 is a functional block diagram illustrating one exemplary configuration of a behavior control apparatus according to the invention;
  • Fig. 3 is a graph illustrating the relationship between a generative model and minimum variance;
  • Fig. 4 shows a conceptual illustration of a target object recognized by means of target segregation;
  • Fig. 5 is a chart illustrating that the range of the target object is narrowed by learning;
  • Fig. 6 is a flowchart illustrating the control routine of an RC helicopter;
  • Fig. 7 is a chart illustrating the distance between a target location and the center of motion;
  • Fig. 8 is a graph illustrating the unstable control status of the mobile unit in the initial stage of behavior control;
  • Fig. 9 is a graph illustrating that the vibration of motion of the mobile unit is getting smaller; and
  • Fig. 10 is a graph illustrating the stable control status of the mobile unit in the last stage of behavior control.
  • Best Mode for Carrying Out the Invention
  • a behavior control apparatus recognizes a target object, which is a reference for controlling a mobile unit, from input images and then controls behavior of the mobile unit based on the recognized target object.
  • the apparatus is used as installed on the mobile unit, which has driving mechanism and is movable by itself.
  • Fig. 1 shows a radio-controlled (RC) helicopter 100 according to one embodiment of the invention.
  • the RC helicopter 100 consists of body 101, main rotor 102 and tail rotor 103.
  • On the body 101 are installed a CCD camera 104, a behavior control apparatus 105 and a servomotor 106.
  • At the base of the tail rotor 103 there is link mechanism 107, which is coupled with the servomotor 106 through a rod 108.
  • the RC helicopter 100 can float in the air by rotating the main rotor 102 and the tail rotor 103.
  • the CCD camera 104 takes images of the frontal view of the RC helicopter. The area captured by the camera is shown in Fig. 1 as visual space 109.
  • the behavior control apparatus 105 autonomously recognizes the location of a target object 110 (hereinafter simply referred to as "target location 110"), which is to be a target for behavior control, and also recognizes a self-referential point in the visual space 109 based on the image taken by the CCD camera 104.
  • the target location 110 is represented as a probabilistic density distribution, as described later, and is conceptually illustrated as an ellipse in Fig. 1.
  • the RC helicopter 100 is tuned so that only control of the yaw orientation (shown by the arrow in Fig. 1, around the vertical line) is enabled. Therefore, the term "stable" as used herein means that the vibration of the RC helicopter's directed orientation is small.
  • the behavior control apparatus 105 outputs behavior commands to move the self-referential point of the image captured by the CCD camera 104 (the visual space 109), for example its center (hereinafter referred to as COM 111, an acronym of "center of motion"), toward the target location 110 in order to control the RC helicopter 100 stably.
  • the behavior commands are sent to the servomotor 106.
  • the servomotor 106 drives the rod 108, activating the link mechanism to alter the angle of tail rotor 103 so as to rotate the RC helicopter 100 in yaw orientation.
  • for simplicity of explanation, the controllable orientation is limited to a one-dimensional operation such that the COM moves from side to side.
  • the present invention may also be applied to position control in two or three dimensions.
  • Although the RC helicopter 100 is described as an example of a mobile unit having the behavior control apparatus of the present invention, the apparatus may be installed on any mobile unit that has a driving mechanism and is able to move by itself.
  • the mobile unit is not limited to flying objects like a helicopter, but includes, for example, vehicles traveling on the ground.
  • the mobile unit further includes a unit of which only a part can move.
  • the behavior control apparatus of the present invention may be installed on industrial robots whose base is fixed to the floor, to recognize an operation target of the robot.
  • Fig. 2 is a functional block diagram of the behavior control apparatus 105.
  • the behavior control apparatus 105 comprises an image capturing block 202, a behavior command output block 204, a behavior evaluation block 206, a learning block 208, a storage block 210, a target segregation block 212, a matching block 214, a target location acquiring block 216 and a behavior decision block 218.
  • the behavior control apparatus 105 may be implemented by running a program according to the present invention on a general-purpose computer, and it can also be implemented by means of hardware having functionality of the invention.
  • the behavior control apparatus 105 first learns the relationship between features of inputs (e.g., images taken by the CCD camera 104) and the behavior of the mobile unit. These operations are collectively referred to as the "learning stage". After completing the learning stage, the apparatus may estimate the motion of the mobile unit based on the captured images using the learned knowledge. The apparatus further searches for and extracts the target location in the image autonomously using the estimated motion. Finally, the apparatus controls the motion of the mobile unit with reference to the target location. These operations are collectively referred to as the "behavior control stage".
  • the behavior control apparatus 105 shown in Fig. 2 is configured for use on the RC helicopter 100, and the apparatus may be configured in various manner depending on the characteristic of the mobile unit installed thereon.
  • the apparatus may further include a gyroscope sensor.
  • the apparatus uses the signals generated from the gyroscope sensor to estimate motion of the mobile unit, and uses the sensory input captured by the image capturing block 202 only for recognizing the target location.
  • the behavior control apparatus 105 learns relationship between features of input images taken by an image pickup device and behavior result in response to behavior command from the behavior command output block 204.
  • the apparatus then stores learning result in the storage block 210. This learning enables the apparatus to estimate motion of the mobile unit accurately based on input images in the behavior control stage described later.
  • the behavior command output block 204 outputs behavior commands $Q_i(t)$, which direct the behavior of the mobile unit. While the learning is immature in the initial stage, behavior commands are read from a command sequence that is selected randomly beforehand. While the mobile unit moves randomly, the behavior control apparatus 105 may learn the knowledge necessary for estimating the motion of the mobile unit. For the RC helicopter 100 shown in Fig. 1, the behavior commands correspond to the driving current of the servomotor 106, which drives the link mechanism 107 to change the yaw orientation. The behavior command is sent to a driving mechanism such as the servomotor 106 and to the behavior evaluation block 206.
  • the relationship between the sensory inputs $I_i(t)$ and the behavior commands $Q_i(t)$ is represented by the following mapping $f$.
  • the mapping $f$ may be given as a non-linear approximation translation using the well-known Fourier series or the like.
  • the behavior command output block
  • the behavior evaluation block 206 receives signal from an external device and outputs behavior commands in accordance with the signal.
  • the behavior evaluation block 206 generates a reward depending on both the sensory inputs $I_i(t)$ from the image capturing block 202 and the behavior result in response to the behavior command $Q_i(t)$, based on a predetermined evaluation function under a reinforcement learning scheme.
  • an example of the evaluation function is a function that yields reward "1" when the mobile unit controlled by the behavior command is stable, and otherwise yields reward "2".
  • After the rewards are yielded, the behavior evaluation block 206 generates a plurality of columns 1, 2, 3, ..., m, as many as the number of types of rewards, and distributes the behavior commands into the columns according to the type of their rewards.
  • the behavior commands $Q_i(t)$ distributed in column $l$ are denoted as $Q^l(t)$.
  • Sensory inputs $I_i(t)$ and behavior commands $Q_i(t)$ are supplied to the learning block 208 and used for learning the relationship between them.
  • the purpose of the evaluation function is to minimize the variance of the behavior commands.
  • reinforcement learning satisfying $\sigma(Q^1) < \sigma(Q^2)$ is executed with the evaluation function.
  • the minimum variance of the behavior commands needs to be reduced for smooth control. Learning with the evaluation function allows the behavior control apparatus 105 to eliminate unnecessary sensory inputs and to learn important sensory inputs selectively.
  • both sensory inputs and the behavior commands are stored according to the type of rewards given to the behavior commands.
  • Each column 1,2,3, ...,m corresponds to a cluster model of the behavior commands.
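
The reward-based distribution of behavior commands into columns can be illustrated with a minimal sketch. The stability test (`stable_band` on a yaw error), the sample values and the function names are assumptions made for illustration; the text only specifies that reward "1" is given when the unit is stable and reward "2" otherwise, and that the aim is a small variance of the behavior commands.

```python
from collections import defaultdict
from statistics import pvariance


def evaluate(yaw_error, stable_band=0.2):
    """Return reward 1 when the unit is stable, otherwise 2.

    stable_band is a hypothetical threshold on the yaw error.
    """
    return 1 if abs(yaw_error) < stable_band else 2


def distribute(samples):
    """Group (sensory_input, command, yaw_error) samples into columns keyed by reward."""
    columns = defaultdict(list)
    for sensory_input, command, yaw_error in samples:
        columns[evaluate(yaw_error)].append((sensory_input, command))
    return columns


# Made-up samples: (sensory input feature, behavior command, resulting yaw error).
samples = [
    (0.10, 0.05, 0.05),
    (0.12, 0.04, 0.10),
    (0.80, 0.90, 0.70),
    (0.75, -0.60, 0.90),
    (0.11, 0.06, 0.02),
]

columns = distribute(samples)
# Variance of the behavior commands stored in each column.
variances = {
    reward: pvariance([command for _, command in column])
    for reward, column in columns.items()
}
```

With these samples, the column for reward 1 holds the commands issued while the unit was stable, and its command variance is smaller than that of the column for reward 2.
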
  • Each column is used to calculate generative models $g(\Omega_l)$, where $l$ denotes the number of attention classes applied.
  • A generative model is a storage model generated through learning, and may be represented by a probabilistic density function in statistical learning.
  • Non-linear estimation such as neural network may be used to model g( ⁇ ⁇ ), which gives the estimation of probabilistic density distribution P(Q
  • ⁇ 0 takes the form of Gaussian mixture model, which may make approximation for any of probabilistic density function.
  • Fig. 3 shows the relationship between the number of generative models (horizontal axis) and the minimum variance (vertical axis).
  • a behavior command for minimizing the variance of the normal distribution curve of behavior commands for a new sensory input may be selected out of the column by means of a statistical learning scheme, and the rapid stability of the mobile unit may be attained.
  • the learning block 208 calculates the class of attention $\Omega_l$ corresponding one-to-one to each column $l$ that contains the behavior commands, using an identity mapping translation. This translation is represented by the following mapping $h$.
  • the purpose of the class of attention $\Omega_l$ is efficient learning by focusing on particular sensory inputs out of the massive sensory inputs when new sensory inputs are given. Generally, the amount of sensory input far exceeds the processing capacity of the computer. Thus, appropriate filtering of the sensory inputs with the classes of attention $\Omega_l$ improves the efficiency of the learning. Therefore, the learning block 208 may eliminate all sensory inputs except a selected small subset of them.
  • the learning block 208 may directly determine the class of attention corresponding to the sensory input using the statistical probability, without calculating the mapping $f$ and/or $h$ one by one. More specifically, each of the classes of attention $\Omega_l$ is a parameter for modeling the behavior commands
  • the probabilistic density function of each class of attention $\Omega_l$ may be obtained.
  • each class of attention $\Omega_l$ is assigned as an element of the probabilistic density function
  • the learning block 208 learns the relation between the sensory inputs and the classes of attention by means of a supervised learning scheme using a neural network. More specifically, this learning is executed by obtaining the conditional probabilistic density function $p(I_i(t) \mid \Omega_l)$ of the class of attention $\Omega_l$ and the sensory input $I_i(t)$ using a hierarchical neural network with the class of attention $\Omega_l$ as the supervising signal. It should be noted that the class of attention may be calculated by the composite of the mappings $f$ and $h$. The obtained conditional probabilistic density function $p(I_i(t) \mid \Omega_l)$ corresponds to the probabilistic relation between the sensory input and the class of attention.
  • New sensory inputs gained by CCD camera 104 are provided to the behavior control apparatus 105 after the learning is over.
  • the learning block 208 selects the class of attention corresponding to provide sensory input using statistical learning scheme such as bayes' learning. This operation corresponds to calculating conditional probabilistic density function ?( ⁇ Z
  • Since the conditional probabilistic density function of the sensory inputs and the class of attention has already been estimated by the hierarchical neural network, newly given sensory inputs may be directly assigned to a particular class of attention. In other words, after the supervised learning with the neural network is over, calculation of the mapping $f$ and/or $h$ becomes unnecessary for selecting the class of attention $\Omega_l$ relative to the sensory input $I_i(t)$.
  • Bayes' learning scheme is used as the statistical learning scheme. Assume that sensory inputs $I_i(t)$ are given and that both the prior probability $P(\Omega_l(t))$ and the probabilistic density function $p(I_i(t) \mid \Omega_l)$ have been calculated beforehand. The maximum posterior probability for each class of attention is calculated by the following Bayes' rule.
  • the ⁇ ( ⁇ t)) may be called the "belief of ⁇ and is the probability that a sensory input I : (t) belongs to a class of attention
  • the class with the highest probability (belief) is selected as the class of attention $\Omega_l$ corresponding to the provided sensory input $I_i(t)$.
  • the behavior control apparatus 105 may obtain the class of attention $\Omega_l$, which is a hidden parameter, from the directly observable sensory input $I_i(t)$ using Bayes' rule, and assign the sensory input $I_i(t)$ to the corresponding class of attention $\Omega_l$.
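
The selection of an attention class by Bayes' rule can be sketched as follows. This is only an illustration under stated assumptions: the text obtains the conditional density with a hierarchical neural network, whereas this sketch substitutes a one-dimensional Gaussian likelihood per class, and the class names, priors, means and standard deviations are made up.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution (stand-in for the learned conditional density)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def select_attention_class(sensory_input, classes):
    """Apply Bayes' rule and return (most probable class, posterior per class).

    classes maps a class name to (prior, mean, std); all values are hypothetical.
    """
    joint = {
        name: prior * gaussian_pdf(sensory_input, mean, std)
        for name, (prior, mean, std) in classes.items()
    }
    evidence = sum(joint.values())
    posteriors = {name: value / evidence for name, value in joint.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


classes = {
    "class_1": (0.5, 0.0, 1.0),
    "class_2": (0.3, 3.0, 1.0),
    "class_3": (0.2, 6.0, 1.0),
}
best, posteriors = select_attention_class(2.8, classes)
```

The posterior plays the role of the belief: the input 2.8 lies closest to the mean of `class_2`, so that class receives the highest posterior even though its prior is not the largest.
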
  • the learning block 208 further searches for the behavior command corresponding to the sensory input among those stored in the column for the selected class of attention, and then sends the retrieved behavior command to the target segregation block 212.
  • the behavior control apparatus may estimate the motion of the mobile unit accurately based on input images. Therefore, these blocks are collectively referred to as "motion estimating method" in the appended claims.
  • the behavior control apparatus 105 estimates the motion based on the input image and roughly segregates the location of the target object (target location). The apparatus then performs pattern matching with templates stored in memory as target objects and calculates the target location more accurately. The apparatus then outputs the behavior command based on the distance between the target location and the center of motion (COM). By repeating this process, the target location is refined and the mobile unit reaches a stably controlled status. In other words, the apparatus segregates the target based on motion estimation and understands what the target object is. The functionality of each block is now described.
  • Target segregation block 212 roughly segregates and extracts a portion including the target object, which is to be the behavior reference of the mobile unit, from the visual space. For example, the segregation is done by comparing the optical flow of the image with the estimated motion.
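
A rough segregation by comparing optical flow with estimated motion might look like the sketch below. The text does not spell out the comparison rule, so everything here is an assumption: flow is reduced to one horizontal value per image column, regions whose flow deviates from the flow predicted by the estimated motion are treated as target candidates, and `threshold` is a hypothetical tuning value.

```python
def segregate(observed_flow, estimated_flow, threshold=0.5):
    """Return indices of image columns whose flow deviates from the estimated motion."""
    return [
        index
        for index, (observed, estimated) in enumerate(zip(observed_flow, estimated_flow))
        if abs(observed - estimated) > threshold
    ]


def bounding_span(indices):
    """Return the (first, last) column of the candidate portion, or None if empty."""
    return (min(indices), max(indices)) if indices else None


# Made-up horizontal flow per image column, and the flow predicted from estimated motion.
observed = [1.0, 1.1, 0.9, 2.4, 2.6, 2.5, 1.0, 1.05]
estimated = [1.0] * len(observed)

candidates = segregate(observed, estimated)
span = bounding_span(candidates)
```

Only the coarse span is produced here; extracting the object itself is left to the matching step.
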
  • Target object matching block 214 uses templates to extract the target object more accurately.
  • the target object matching block 214 compares the template and the segregated portion and determines whether the portion is the object to be targeted or not.
  • the templates are prepared beforehand. If there are a plurality of target objects, or a plurality of objects that match the templates, the object with the largest matching index is selected.
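
Selecting the object with the largest matching index can be sketched as below. The text does not define the matching index, so normalized cross-correlation is used here purely as an assumed stand-in, and the flattened patches are made-up values.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equal-length flattened patches, in [-1, 1]."""
    mean_p = sum(patch) / len(patch)
    mean_t = sum(template) / len(template)
    dp = [value - mean_p for value in patch]
    dt = [value - mean_t for value in template]
    denom = math.sqrt(sum(v * v for v in dp) * sum(v * v for v in dt))
    # A constant patch has no structure to correlate, so score it as 0.
    return sum(a * b for a, b in zip(dp, dt)) / denom if denom else 0.0


def best_match(portions, template):
    """Return (index of the portion with the largest matching index, all scores)."""
    scores = [ncc(portion, template) for portion in portions]
    best = max(range(len(portions)), key=scores.__getitem__)
    return best, scores


template = [0, 1, 2, 1, 0]
portions = [
    [5, 5, 5, 5, 5],  # flat background
    [1, 2, 4, 2, 1],  # same shape as the template, different scale
    [2, 1, 0, 1, 2],  # inverted shape
]
best, scores = best_match(portions, template)
```

In practice a minimum score would also be needed to decide whether the best portion is a target at all; that threshold is omitted here.
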
  • a target location acquiring block 216 defines the center point of the target object as the target location.
  • behavior decision block 218 supplies request signal to behavior command output block 204.
  • behavior command output block 204 outputs the behavior command to move such that center of motion (COM) of the mobile unit overlaps the location of the target object.
  • Fig. 4 is a diagram illustrating a target object segregation recognized by the target segregation block 212.
  • Ellipses 401, 402 and 403 are the clusters that are candidates for the location of the target object, calculated based on the estimated motion and represented as normal distributions $\Omega_1$, $\Omega_2$ and $\Omega_3$, respectively. These are attention classes extracted from feature information of the image.
  • Fig. 5 is a chart illustrating that range of the target location is refined (reduced) by the learning.
  • Learning block 208 narrows down the uncertainty range (in other words, the variance $\sigma$ of the probabilistic density distribution) of the target location by, for example, Bayes' learning.
  • the EM algorithm is an iterative algorithm for estimating the maximum likelihood parameter when observed data can be viewed as incomplete data.
  • the parameter $\theta$ is represented by $\theta(\mu, \sigma)$.
  • the model of the feature vector is built by means of Bayes' parameter estimation. This is employed to estimate the number of clusters that best represents the data structure. An algorithm to estimate the parameters of a Gaussian mixture model will be described. This algorithm is essentially similar to conventional clustering, but differs in that it can estimate parameters closely even when clusters overlap. A sample of training data is used to determine the number of subclasses and the parameters of each subclass.
  • Let Y be an M-dimensional random vector to be modeled using a Gaussian mixture distribution. Assume that this model has K subclasses. The following parameters are required to completely specify the k-th subclass.
  • $\pi_k$: the probability that a pixel has subclass k
  • $\mu_k$: the M-dimensional spectral mean vector for subclass k
  • $R_k$: the M × M spectral covariance matrix for subclass k
  • $\pi$, $\mu$ and $R$ denote the following parameter sets, respectively.
  • the set of admissible $\theta$ for a K-th order model is denoted by p .
  • Let $Y_1, Y_2, \ldots, Y_N$ be N multispectral pixels sampled from the class of interest.
  • For each pixel $Y_n$, the subclass of that pixel is given by the random variable $X_n$.
  • The MDL estimator works by attempting to find the model order that minimizes the number of bits required to code both the data samples $y_n$ and the parameter vector $\theta$.
  • The MDL criterion is expressed by the following expression.
  • the objective is to minimize the MDL criterion.
  • the objective of the EM algorithm is hereby to iteratively optimize with respect to ⁇ until a local minimum of the
  • the Q function is optimized in the following way.
  • the number K of subclasses starts at a sufficiently large value and is then decremented sequentially.
  • the EM algorithm is applied until it converges to a local minimum of the MDL function.
  • the value of K and the corresponding parameters that resulted in the smallest value of the MDL criterion may simply be selected.
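
The combination of EM fitting and MDL-based selection of the number of subclasses can be illustrated with a small one-dimensional sketch. Several points are assumptions: the MDL expression is not reproduced in the text, so a commonly used form for Gaussian mixtures is assumed, $-\log p(y \mid K, \theta) + \tfrac{1}{2} L \log(NM)$ with $L = K(1 + M + M(M+1)/2) - 1$; each order K is refitted independently rather than reduced sequentially from the previous fit; the initialization is a simple quantile rule; and the data are synthetic. The order with the smallest criterion value is chosen, in line with the stated objective of minimizing the MDL criterion.

```python
import math


def gauss(x, mu, var):
    """Density of a 1-D normal distribution with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm(data, k, iters=50, min_var=1e-3):
    """Fit a 1-D Gaussian mixture by EM; return (weights, means, variances, log-likelihood)."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic initialization: means at evenly spaced quantiles, shared variance.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    mean_all = sum(xs) / n
    variances = [max(sum((x - mean_all) ** 2 for x in xs) / n, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each subclass for each sample.
        resp = []
        for x in xs:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([value / total for value in p])
        # M-step: re-estimate weight, mean and variance of each subclass.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, min_var)
    loglik = sum(
        math.log(sum(w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)) or 1e-300)
        for x in xs
    )
    return weights, means, variances, loglik


def mdl(loglik, k, n, m=1):
    """Assumed MDL criterion: negative log-likelihood plus a parameter-count penalty."""
    free_params = k * (1 + m + m * (m + 1) // 2) - 1
    return -loglik + 0.5 * free_params * math.log(n * m)


def select_order(data, k_max):
    """Try K = k_max down to 1 and return (K with the smallest MDL value, all MDL values)."""
    scores = {k: mdl(fit_gmm(data, k)[3], k, len(data)) for k in range(k_max, 0, -1)}
    return min(scores, key=scores.get), scores


# Synthetic data: two well-separated groups centered at -5 and +5.
data = [-5 + 0.1 * i for i in range(-10, 11)] + [5 + 0.1 * i for i in range(-10, 11)]
best_k, scores = select_order(data, k_max=4)
```

Adding subclasses beyond the two real groups raises the likelihood only slightly, so the penalty term dominates and the smaller order is preferred.
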
  • the learning stage and the behavior control stage need not be clearly divided; both may be executed simultaneously, as in the example described below.
  • behavior evaluation block 206 determines whether a newly provided image feature should be reflected in the knowledge acquired by previous learning during the behavior control stage. Furthermore, behavior evaluation block 206 receives the motion estimated from the image. When a change in the external environment that was not learned in previous learning is captured by image capturing block 202, the feature is sent to behavior evaluation block 206, which outputs an attentional demand indicating that an attention class should be generated. In response to this, learning block 208 generates an attention class. Thus the learning result is always updated; therefore, the precision of the motion estimation is improved, too.
  • Fig. 6 is a flowchart of the process. The chart can be divided into two steps, shown as two dotted-line rectangles in Fig. 6. One is the coarse step in the left column, where rough segregation of target/non-target is executed. The other is the fine step in the right column, where the target location is narrowed (refined) gradually.
  • At step 602, the probabilistic density distributions $P(\Omega_l)$ for all attention classes $\Omega_l$ of motion are assumed to be uniform.
  • the mobile unit moves randomly for collecting data for learning.
  • a data set collected for stabilizing the RC helicopter 100 was used to generate 500 training data points and 200 test points.
  • the CCA reinforced EM algorithm is executed to calculate the parameters $\theta(\mu, \sigma)$ that define the probabilistic density distribution $\Omega_l$.
  • 20 subclasses were used at first, but the number of subclasses converged under the CCA reinforced EM algorithm and was finally reduced to 3, as shown in Fig. 4.
  • $P(Q \mid \Omega_l)$ is calculated with $\theta$, where Q represents the behavior command.
  • the probabilistic relation between the feature vector I and the attention class $\Omega_l$ is calculated with a neural network.
  • the motion of the mobile unit is estimated by Bayes' rule. Steps 602 to 612 correspond to the learning stage.
  • Gaussian mixture model is calculated with the use of each probabilistic density function. Part of the image which is not included in Gaussian mixture model is separated as non-target.
  • the target object is recognized by template matching and probabilistic density distribution ⁇ TL of the target location is calculated.
  • the center of this is defined as target location.
  • difference D between center of motion (COM) and the target location (TL) is calculated.
  • the map outputs behavior command expanding the width of motion when the helicopter is far from the target location, otherwise outputs command reducing the width of the motion.
  • Fig. 7 shows an example of output behavior command.
  • a map is stored in memory which takes different output value depending on D and corresponding value is searched and transmitted to the servomotor.
  • the unit may estimate ⁇ accurately and thus predict the target location accurately.
  • When D is smaller than the threshold at step 624, the helicopter is stable with sufficient accuracy relative to the target location, and so the process is terminated.
  • the unit may control both the location of helicopter and the duration during which the helicopter remains at that location. Steps 614 to 624 correspond to the behavior control stage.
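
The behavior control stage (compute the distance D between COM and target location, look up a command, stop once D is below the threshold) can be sketched as a simple loop. The proportional map, its `gain` and clipping, and the idealized response in which the COM shifts by exactly the commanded amount are all assumptions for illustration; only the threshold value 0.1826 and the rule "larger motion when far, smaller when near" come from the text.

```python
THRESHOLD = 0.1826  # stability threshold value reported for the graphs


def behavior_command(distance, gain=0.5, max_command=1.0):
    """Map the signed COM-to-target distance to a command.

    Larger distances give wider motion, smaller distances narrower motion.
    gain and max_command are hypothetical tuning values.
    """
    return max(-max_command, min(max_command, gain * distance))


def control_loop(com, target, threshold=THRESHOLD, max_steps=100):
    """Repeat command output until |D| falls below the threshold; return (COM, |D| history)."""
    history = []
    for _ in range(max_steps):
        distance = target - com
        history.append(abs(distance))
        if abs(distance) < threshold:
            break
        # Idealized plant: the COM moves by exactly the commanded amount.
        com += behavior_command(distance)
    return com, history


final_com, history = control_loop(com=0.0, target=2.0)
```

In this idealized run the distance halves on every trial until it drops below the threshold, at which point the loop terminates.
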
  • Figs. 8 to 10 are graphs illustrating the control status of the RC helicopter.
  • the horizontal axis represents the number of trials and the vertical axis represents the distance between the center of motion (COM) and the target location (TL) when controlling the helicopter to be stable.
  • The two dotted straight lines in the graphs represent the threshold values used to determine the stability of the control.
  • the threshold value is set to 0.1826 in the graphs.
  • Fig. 8 is a graph of the control immediately after the behavior control is initiated. In this case, the distance D does not fall below the threshold and the vibration is still large, so the control is determined to be unstable. As the target location is narrowed, the vibration becomes smaller (as in Fig. 9). Finally, the control status becomes stable, as shown in Fig. 10.
  • the behavior control apparatus need not be installed on the mobile unit.
  • the behavior control apparatus roughly segregates the target area that includes a target object of behavior from sensory inputs, such as images, based on the estimation of motion.
  • the apparatus specifies a target object from the target area, acquires the location of the target object and outputs a behavior command that moves the mobile unit toward the location.
  • detailed feature of the target object need not be predetermined.
  • the computational load is reduced. Therefore, highly efficient and accurate control for the mobile unit may be implemented.
  • the behavior control apparatus pre-learns the relationship between sensory inputs and behavior commands. The apparatus then updates the learning result when a new feature is acquired in the behavior control stage.
  • the learning result is represented as probabilistic density distribution.
  • motion of the mobile unit on behavior control stage may be estimated with high accuracy.

Landscapes

  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Game Theory and Decision Science (AREA)
  • Business, Economics & Management (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Health & Medical Sciences (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention relates to a behavior control apparatus and method for autonomously controlling a mobile unit based on visual information in practical applications, without the need for a great deal of preparation or computational cost and without limiting the type of target object. According to one aspect of the invention, a method for controlling behavior of a mobile unit using behavior commands is provided. First, sensory inputs are captured and then the motion of the mobile unit is estimated. The portion that includes a target object to be the target for behavior of the mobile unit is segregated from the sensory inputs. The target object is extracted from the segregated portion and the location of the target object is acquired. Finally, the mobile unit is controlled based on the location of the target object.

Description

DESCRIPTION
Behavior Control Apparatus and Method
Technical Field
The present invention relates to a behavior control apparatus and method for mobile unit, in particular, to a behavior control apparatus and method for recognizing a target object in acquired images and controlling behavior of the mobile unit with high accuracy based on the recognized target object.
Background Art
To control a mobile unit with high accuracy based on input images, it is necessary for a control system to recognize an object in the image as a target for behavior of the mobile unit. One approach is that the control system learns training data pre-selected by an operator prior to recognition. Specifically, the control system searches the input images to extract some shapes or colors therefrom designated as features of the target. Then, the control system outputs commands to make the mobile unit move toward the extracted target.
However, it is necessary for the operator to teach features such as shape or color of the target in detail to the control system and therefore preparation for that is a burden in terms of time and labor. In addition, since the control would be interrupted when the target goes off the input image, it is difficult to apply this approach to practical use.
An alternative approach is to prepare a template for the target and, while controlling the mobile unit, to apply the template continuously to the input images in order to search for and extract the shape and location of the target in detail. In this case, however, the computational cost becomes huge because a computer has to keep calculating the shape and location of the target. Furthermore, the search for the target may fall into a local solution.
Therefore, to control behavior of the mobile unit efficiently and flexibly, it is preferable to make the mobile unit move autonomously rather than to use a supervised learning method in which the target is specified beforehand. To achieve that, a method for recognizing the target autonomously and learning the location of the target is needed.
In Japanese Patent Application Unexamined Publication (Kokai) No. H8-126981, an image position recognition method in a robot system is disclosed. According to the method, the target object is searched for autonomously even when the target object is missing from the input image due to an error. However, the method requires that the work plane for recognizing images be painted with various colors prior to working, which is a substantially time-consuming task.
In Japanese Patent Application Unexamined Publication (Kokai) No. H7-13461, a method for leading autonomous moving robots for managing indoor air-conditioning units is disclosed. According to the method, a target object for leading is detected through image processing and the robot is led toward the target. However, the method needs blowing outlets of air-conditioning units as target objects, and thus lacks generality.
Therefore, it is an objective of the present invention to provide a behavior control apparatus and method for autonomously controlling a mobile unit based on visual information in practical applications, without the need for a great deal of preparation or computational cost and without limiting the type of target object.
Disclosure of Invention
According to one aspect of the invention, a behavior control apparatus for controlling behavior of a mobile unit is provided. The apparatus comprises sensory input capturing method for capturing sensory inputs and motion estimating method for estimating motion of the mobile unit. The apparatus further comprises target segregation method for segregating the portion which includes a target object to be target for behavior of the mobile unit from sensory inputs, and target object matching method for extracting the target object from the segregated portion. The apparatus still further comprises target location acquiring method for acquiring the location of the target object and behavior decision method for deciding behavior command for controlling the mobile unit based on the location of the target object.
The behavior control apparatus roughly segregates the portion that includes a target object of behavior from sensory inputs, such as images, based on the estimation of motion. The apparatus then specifies a target object from the portion, acquires the location of the target object and outputs a behavior command which moves the mobile unit toward the location. Thus, detailed features of the target object need not be predetermined. In addition, because the features irrelevant to the present behavior are eliminated, the computational load is reduced. Therefore, highly efficient and accurate control of the mobile unit may be implemented.
As used herein, "mobile unit" refers to a unit which has a driving mechanism and moves in accordance with behavior commands.
The sensory inputs may be images of the external environment of the mobile unit. The motion estimating method comprises behavior command output method for outputting the behavior command and behavior evaluation method for evaluating the result of the behavior of the mobile unit. The motion estimating method further comprises learning method for learning the motion of the mobile unit using the relationship between the sensory inputs and the behavior result and storing method for storing the learning result.
The behavior control apparatus pre-learns the relationship between sensory inputs and behavior commands. Then the apparatus updates the learning result when a new feature is acquired in the behavior control stage. The learning result is represented as a probabilistic density distribution. Thus, motion of the mobile unit in the behavior control stage may be estimated with high accuracy.
The motion of the mobile unit may be captured using a gyroscope instead of estimating it. The target segregation method segregates the portion by comparing the sensory inputs and the estimated motion using, for example, optical flow. Thus the behavior control apparatus may roughly segregate the portion that includes a target object.
The target location acquiring method defines the center of the target object as the location of the target object and the behavior decision method outputs the behavior command to move the mobile unit toward the location of the target object. Thus the mobile unit may be controlled stably.
The behavior decision method calculates the distance between the mobile unit and the location of the target object, and decides the behavior command so as to decrease the calculated distance. This calculation is very simple and helps to reduce the amount of computation.
If the calculated distance is greater than a predetermined value, the target segregation method repeats segregating the portion which includes a target object.
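The distance-based decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names, the proportional gain, the command limit and the one-dimensional coordinates are all hypothetical.

```python
def decide_command(com_x, target_x, gain=0.5, max_command=1.0):
    """Return (command, distance) for a one-dimensional (yaw-like) control.

    com_x and target_x are horizontal positions in the image. The command
    is proportional to the signed offset, so applying it decreases the
    distance. gain and max_command are illustrative values only.
    """
    error = target_x - com_x                      # signed offset to the target
    command = gain * error                        # move so as to shrink the offset
    command = max(-max_command, min(max_command, command))  # clamp the output
    return command, abs(error)


def needs_resegregation(distance, threshold):
    """True when the distance exceeds the predetermined value, in which case
    the target segregation is repeated."""
    return distance > threshold
```

For example, `decide_command(0.0, 1.0)` returns a command of 0.5 and a distance of 1.0, and `needs_resegregation(10.0, 5.0)` is true.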
The target object matching method extracts the target object by pattern matching between the sensory inputs and predetermined templates. Thus the target object may be extracted more accurately.
Other embodiments and features will be apparent by reference to the following description in connection with the accompanying drawings.
Brief Description of Drawings
Fig. 1 shows an overall view of a radio-controlled (RC) helicopter according to one embodiment of the invention;
Fig. 2 is a functional block diagram illustrating one exemplary configuration of a behavior control apparatus according to the invention;
Fig. 3 is a graph illustrating the relationship between a generative model and minimum variance;
Fig. 4 shows a conceptual illustration of a target object recognized by means of target segregation;
Fig. 5 is a chart illustrating that the range of the target object is narrowed by learning;
Fig. 6 is a flowchart illustrating the control routine of an RC helicopter;
Fig. 7 is a chart illustrating a distance between a target location and the center of motion;
Fig. 8 is a graph illustrating unstable control status of the mobile unit in the initial stage of behavior control;
Fig. 9 is a graph illustrating that the vibration of motion of the mobile unit is getting smaller; and
Fig. 10 is a graph illustrating stable control status of the mobile unit in the last stage of behavior control.
Best Mode for Carrying Out the Invention
Preferred embodiments of the present invention will be described as follows with reference to the drawings.
A behavior control apparatus according to the invention recognizes a target object, which is a reference for controlling a mobile unit, from input images and then controls behavior of the mobile unit based on the recognized target object. The apparatus is installed on the mobile unit, which has a driving mechanism and is movable by itself.
Configuration
Fig. 1 shows a radio-controlled (RC) helicopter 100 according to one embodiment of the invention. The RC helicopter 100 consists of a body 101, a main rotor 102 and a tail rotor 103. On the body 101 are installed a CCD camera 104, a behavior control apparatus 105 and a servomotor 106. At the base of the tail rotor 103, there is a link mechanism 107, which is coupled with the servomotor 106 through a rod 108. The RC helicopter 100 can float in the air by rotating the main rotor 102 and the tail rotor 103.
The CCD camera 104 takes images of the frontal view of the RC helicopter. The area captured by the camera is shown in Fig. 1 as visual space 109. The behavior control apparatus 105 autonomously recognizes a location of a target object 110 (hereinafter simply referred to as "target location 110"), which is to be a target for behavior control, and also recognizes a self-referential point in the visual space 109 based on the image taken by the CCD camera 104. The target location 110 is represented as a probabilistic density distribution, as described later, and is conceptually illustrated as an ellipse in Fig. 1.
The RC helicopter 100 is tuned so that only the control of yaw orientation (indicated by the arrow in Fig. 1, around the vertical line) is enabled. Therefore, the term "stable" as used herein means that the vibration of the RC helicopter's directed orientation is small.
The behavior control apparatus 105 outputs behavior commands to move the self-referential point of the image captured by the CCD camera 104 (the visual space 109) toward the target location 110 in order to control the RC helicopter 100 stably. The self-referential point is, for example, the center of the image, and is hereinafter referred to as COM 111, an acronym of "center of motion". The behavior commands are sent to the servomotor 106. In response to the behavior commands, the servomotor 106 drives the rod 108, activating the link mechanism 107 to alter the angle of the tail rotor 103 so as to rotate the RC helicopter 100 in the yaw orientation.
In the embodiment described above, the controllable orientation is limited to a one-dimensional operation in which the COM moves from side to side, for the purpose of simple explanation. However, the present invention may also be applied to position control in two or three dimensions.
Although the RC helicopter 100 is described as an example of the mobile unit having the behavior control apparatus of the present invention, the apparatus may be installed on any mobile unit that has a driving mechanism and is able to move by itself. In addition, the mobile unit is not limited to flying objects like a helicopter, but includes, for example, vehicles traveling on the ground. The mobile unit further includes a unit of which only a part can move. For example, the behavior control apparatus of the present invention may be installed on industrial robots whose base is fixed to the floor, to recognize an operation target of the robot.
Fig. 2 is a functional block diagram of the behavior control apparatus 105. The behavior control apparatus 105 comprises an image capturing block 202, a behavior command output block 204, a behavior evaluation block 206, a learning block 208, a storage block 210, a target segregation block 212, a matching block 214, a target location acquiring block 216 and a behavior decision block 218. The behavior control apparatus 105 may be implemented by running a program according to the present invention on a general-purpose computer, and it can also be implemented by means of hardware having functionality of the invention.
The behavior control apparatus 105 first learns the relationship between features of inputs (e.g., images taken by the CCD camera 104) and the behavior of the mobile unit. These operations are collectively referred to as the "learning stage". After completing the learning stage, the apparatus may estimate the motion of the mobile unit based on the captured images using the learned knowledge. The apparatus further searches for and extracts the target location in the image autonomously using the estimated motion. Finally, the apparatus controls the motion of the mobile unit with reference to the target location. These operations are collectively referred to as the "behavior control stage".
It should be noted that the behavior control apparatus 105 shown in Fig. 2 is configured for use on the RC helicopter 100, and the apparatus may be configured in various manners depending on the characteristics of the mobile unit on which it is installed. For example, the apparatus may further include a gyroscope sensor. In this case, the apparatus uses the signals generated by the gyroscope sensor to estimate the motion of the mobile unit, and uses the sensory input captured by the image capturing block 202 only for recognizing the target location.
Learning
In the learning stage, while moving the mobile unit, the behavior control apparatus 105 learns the relationship between features of input images taken by an image pickup device and the behavior result in response to a behavior command from the behavior command output block 204. The apparatus then stores the learning result in the storage block 210. This learning enables the apparatus to estimate the motion of the mobile unit accurately based on input images in the behavior control stage described later.
The image capturing block 202 receives images at predetermined intervals from an image pickup device such as the CCD camera 104 installed at the front of the RC helicopter 100. The block 202 then extracts features as sensory inputs $I_i(t)$ ($i = 1, 2, \ldots$) from the images. This feature extraction may be implemented by any prior-art approach such as optical flow. The extracted features are sent to the behavior evaluation block 206.
The behavior command output block 204 outputs behavior commands $Q_i(t)$, which direct the behavior of the mobile unit. While the learning is immature in the initial stage, behavior commands are read from a command sequence that is selected randomly beforehand. While the mobile unit moves randomly, the behavior control apparatus 105 may learn the knowledge necessary for estimating the motion of the mobile unit. As for the RC helicopter 100 shown in Fig. 1, the behavior commands correspond to the driving current of the servomotor 106, which drives the link mechanism 107 to change the yaw orientation. The behavior command is sent to the driving mechanism, such as the servomotor 106, and to the behavior evaluation block 206. The relationship between the sensory inputs $I_i(t)$ and the behavior commands $Q_i(t)$ is represented by the following mapping $f$.
$$f : I_i(t) \to Q_i(t) \qquad (1)$$
where the subscript $i$ ($i = 1, 2, \ldots$) denotes the $i$-th data item. For example, the mapping $f$ may be given as a non-linear approximation using the well-known Fourier series or the like.
In an alternative embodiment, the behavior command output block 204 receives a signal from an external device and outputs behavior commands in accordance with the signal.
The behavior evaluation block 206 generates a reward depending on both the sensory inputs $I_i(t)$ from the image capturing block 202 and the behavior result in response to the behavior command $Q_i(t)$, based on a predetermined evaluation function under a reinforcement learning scheme. An example of the evaluation function is a function that yields reward "1" when the mobile unit controlled by the behavior command is stable, and otherwise yields reward "2". After the rewards are yielded, the behavior evaluation block 206 generates a plurality of columns 1, 2, 3, ..., m, as many as the number of types of rewards, and distributes the behavior commands into the columns according to the type of their rewards. Hereinafter the behavior commands $Q_i(t)$ distributed in column $l$ are denoted as $Q^l(t)$. The sensory inputs $I_i(t)$ and behavior commands $Q_i(t)$ are supplied to the learning block 208 and used for learning the relationship between them.
The purpose of the evaluation function is to minimize the variance of the behavior commands. In other words, reinforcement learning satisfying $\sigma(Q^1) < \sigma(Q^2)$ is executed with the evaluation function. The minimum variance of the behavior commands needs to be reduced for smooth control. Learning with the evaluation function allows the behavior control apparatus 105 to eliminate unnecessary sensory inputs and to learn important sensory inputs selectively.
In each column, both the sensory inputs and the behavior commands are stored according to the type of reward given to the behavior commands.
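The reward assignment and the distribution into columns can be sketched as follows. This is only an illustration under stated assumptions: the stability test (spread of recent yaw readings below a threshold), the threshold value and the names `evaluate` and `distribute` are hypothetical; the text above fixes only the reward values "1" (stable) and "2" (otherwise).

```python
from collections import defaultdict


def evaluate(yaw_history, stable_threshold=0.1):
    """Return reward 1 when the unit is judged stable, otherwise reward 2.

    Stability is approximated here by the spread of recent yaw readings,
    which is an assumption made for this sketch.
    """
    spread = max(yaw_history) - min(yaw_history)
    return 1 if spread <= stable_threshold else 2


def distribute(samples):
    """Group (sensory_input, command) pairs into columns keyed by reward.

    samples is an iterable of (sensory_input, command, yaw_history).
    """
    columns = defaultdict(list)
    for sensory_input, command, yaw_history in samples:
        reward = evaluate(yaw_history)
        columns[reward].append((sensory_input, command))
    return dict(columns)
```

Each resulting column then holds the commands of one reward type, so that the variance of the commands can be examined per column.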
Each column 1, 2, 3, ..., m corresponds to a cluster model of the behavior commands. Each column is used to calculate a generative model $g(\Omega_l)$, where $l$ denotes the number of attention classes applied. A generative model is a storage model generated through learning, and may be represented by a probabilistic density function in statistical learning. Non-linear estimation such as a neural network may be used to model $g(\Omega_l)$, which gives the estimate of the probabilistic density distribution $P(Q \mid \Omega_l)$. In the present embodiment, it is assumed that $P(Q \mid \Omega_l)$ takes the form of a Gaussian mixture model, which can approximate any probabilistic density function. Fig. 3 shows the relationship between the number of generative models (horizontal axis) and the minimum variance (vertical axis). A single column does not accelerate the convergence of the learning, because it takes much time until the normal distribution curve of the behavior commands stored in the column is sharpened and the variance becomes small. In order to control the mobile unit stably, learning must proceed in such a way that the variance of the normal distribution of the motor output becomes smaller. One feature of the invention is that the normal distribution curve is sharpened rapidly since a plurality of columns are generated. A method utilizing such minimum variance theory is described in Japanese Patent Application Unexamined Publication (Kokai) No. 2001-028758.
Then the learning process described later is executed in the learning block 208. After the learning process is completed, a behavior command for minimizing the variance of the normal distribution curve of behavior commands for a new sensory input may be selected out of the column by means of a statistical learning scheme, and the rapid stability of the mobile unit may be attained. Now the learning process at the learning block 208 will be described in detail.
The learning block 208 calculates the class of attention $\Omega_l$, corresponding one-to-one to each column $l$ that contains the behavior commands, using an identity mapping. This translation is represented by the following mapping $h$.
$$h : Q_i(t) \to \Omega_l(t) \qquad (2)$$
The purpose of the class of attention $\Omega_l$ is efficient learning by focusing on particular sensory inputs out of the massive sensory inputs when new sensory inputs are given. Generally, the amount of sensory input far exceeds the processing capacity of the computer. Thus, appropriate filtering of sensory inputs with the classes of attention $\Omega_l$ improves the efficiency of the learning. Therefore, the learning block 208 may eliminate the sensory inputs except for a selected small subset of them.
As the learning progresses, the learning block 208 may determine directly the class of attention corresponding to the sensory input using statistical probability, without calculating the mappings $f$ and/or $h$ one by one. More specifically, each of the classes of attention $\Omega_l$ is a parameter for modeling the behavior commands $Q_i(t)$ stored in each column using the probabilistic density function of the normal distribution. To obtain the probabilistic density function, a mean $\mu$ and covariance $\Sigma$ need to be calculated for the behavior commands $Q_i(t)$ stored in each column. This calculation is performed by an unsupervised Expectation Maximization (EM) algorithm using the clustered component algorithm (CCA), which will be described later. It should be noted that the classes of attention $\Omega_l$ are modeled on the assumption that a true probabilistic distribution $p(I_i(t) \mid \Omega_l)$ exists for each class of attention $\Omega_l$.
Using the obtained parameters, the probabilistic density function of each class of attention $\Omega_l$ may be obtained. The obtained density functions are used as the prior probability $p(\Omega_l(t))$ ($= p(Q_i(t) \mid \Omega_l(t))$) of each class of attention before sensory inputs are given. In other words, each class of attention $\Omega_l$ is assigned as an element of the probabilistic density function
Figure imgf000017_0001
After the classes of attention $\Omega_l$ are calculated, the learning block 208 learns the relation between the sensory inputs and the classes of attention by means of a supervised learning scheme using a neural network. More specifically, this learning is executed by obtaining the conditional probabilistic density function $p_\lambda(I_i(t) \mid \Omega_l)$ of the class of attention $\Omega_l$ and the sensory input $I_i(t)$ using a hierarchical neural network with the class of attention $\Omega_l$ as the supervising signal. It should be noted that the class of attention may be calculated by the composite of the mappings $f$ and $h$. The obtained conditional probabilistic density function $p_\lambda(I_i(t) \mid \Omega_l)$ corresponds to the probabilistic relation between the sensory input and the class of attention.
New sensory inputs gained by the CCD camera 104 are provided to the behavior control apparatus 105 after the learning is over. The learning block 208 selects the class of attention corresponding to the provided sensory input using a statistical learning scheme such as Bayes' learning. This operation corresponds to calculating the conditional probabilistic density function $p(\Omega_l \mid I_i(t))$ of the class of attention $\Omega_l$ relative to the sensory inputs $I_i(t)$. As noted above, since the probabilistic density function of the sensory inputs and the class of attention has already been estimated by the hierarchical neural network, newly given sensory inputs may be directly assigned to a particular class of attention. In other words, after the supervised learning with the neural network is over, calculation of the mappings $f$ and/or $h$ becomes unnecessary for selecting the class of attention $\Omega_l$ relative to the sensory input $I_i(t)$.
In this embodiment, Bayes' learning scheme is used as the statistical learning scheme. Assume that sensory inputs $I_i(t)$ are given and both the prior probability $p(\Omega_l(t))$ and the probabilistic density function $p(I_i(t) \mid \Omega_l)$ have been calculated beforehand. The maximum posterior probability for each class of attention is calculated by the following Bayes' rule.
Figure imgf000018_0001
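The equation in the figure above is not recoverable from this text. For orientation only, and assuming it follows the standard form of Bayes' rule with the prior and likelihood named in the preceding paragraph (the summation index $k$ is introduced here for illustration), it would read:

```latex
p(\Omega_l(t) \mid I_i(t)) =
  \frac{p(I_i(t) \mid \Omega_l)\, p(\Omega_l(t))}
       {\sum_{k} p(I_i(t) \mid \Omega_k)\, p(\Omega_k(t))}
```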
The $p(\Omega_l(t))$ may be called the "belief" of $\Omega_l$ and is the probability that a sensory input $I_i(t)$ belongs to a class of attention $\Omega_l(t)$. Calculating the probability that a sensory input $I_i(t)$ belongs to a class of attention $\Omega_l$ using Bayes' rule implies that one class of attention $\Omega_l$ can be identified selectively by increasing the belief (weight) through learning with Bayes' rule.
The class with the highest probability (belief) is selected as the class of attention $\Omega_l$ corresponding to the provided sensory input $I_i(t)$. Thus, the behavior control apparatus 105 may obtain the class of attention $\Omega_l$, which is a hidden parameter, from the directly observable sensory input $I_i(t)$ using Bayes' rule, and assign the sensory input $I_i(t)$ to the corresponding class of attention $\Omega_l$.
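The selection of the class with the highest belief can be sketched as follows. The sketch assumes scalar sensory inputs and one-dimensional Gaussian likelihoods per class, which is a simplification made here; in the embodiment the likelihood is obtained from a hierarchical neural network. The function names and the class labels are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def select_attention_class(x, classes):
    """Pick the class of attention with the largest posterior for input x.

    classes maps a label to (prior, mean, var). Returns (best_label,
    posteriors), where posteriors maps each label to its belief.
    """
    joint = {label: prior * gaussian_pdf(x, mean, var)
             for label, (prior, mean, var) in classes.items()}
    evidence = sum(joint.values())          # normalizing constant
    posteriors = {label: value / evidence for label, value in joint.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors
```

With two equally likely classes centered at 0 and 4, an input of 0.5 is assigned to the first class and an input of 3.5 to the second. An input so far from every class that all likelihoods underflow to zero is not handled in this sketch.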
The learning block 208 further searches for the behavior command according to the sensory input stored in the column corresponding to the selected class of attention, and then sends the behavior command found to the target segregation block 212.
As noted above, using the blocks 204-210, the behavior control apparatus may estimate motion of the mobile unit accurately based on input images. Therefore, these blocks are inclusively referred to as "motion estimating method" in appended claims.
Behavior Control
In the behavior control stage, the behavior control apparatus 105 estimates the motion based on the input image and roughly segregates the location of the target object (target location). The apparatus then performs pattern matching with templates, which are stored in memory as target objects, and calculates the target location more accurately. The apparatus then outputs the behavior command based on the distance between the target location and the center of motion (COM). By repeating this process, the target location is progressively refined and the mobile unit reaches a stably controlled state. In other words, the apparatus segregates the target based on motion estimation and determines what the target object is. Now the functionality of each block is described.
The target segregation block 212 roughly segregates and extracts a portion including the target object, which is to be the behavior reference of the mobile unit, from the visual space. For example, the segregation is done by comparing the optical flow of the image and the estimated motion.
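One way to read the comparison of optical flow with the estimated motion is sketched below. This is an interpretation, not the disclosed procedure: it assumes that the estimated motion has already been converted into a predicted flow vector per pixel, and that pixels whose observed flow deviates from the prediction by more than a threshold belong to the portion of interest. The function name, the dictionary layout and the threshold are hypothetical.

```python
def segregate(flow, predicted_flow, threshold=1.0):
    """Return the bounding box (x_min, y_min, x_max, y_max) of pixels whose
    observed flow differs from the flow predicted from the estimated motion,
    or None when no pixel deviates.

    flow and predicted_flow map (x, y) to a flow vector (u, v).
    """
    deviating = []
    for position, (u, v) in flow.items():
        pu, pv = predicted_flow[position]
        residual = ((u - pu) ** 2 + (v - pv) ** 2) ** 0.5
        if residual > threshold:
            deviating.append(position)
    if not deviating:
        return None
    xs = [x for x, _ in deviating]
    ys = [y for _, y in deviating]
    return (min(xs), min(ys), max(xs), max(ys))
```

The returned box is only a rough region; the more accurate extraction is left to the template matching step.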
The target object matching block 214 uses templates to extract the target object more accurately. The target object matching block 214 compares the template and the segregated portion and determines whether the portion is the object to be targeted or not. The templates are prepared beforehand. If there are a plurality of target objects, or if there are a plurality of objects which match the templates, the object having the largest matching index is selected.
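The selection of the object with the largest matching index can be sketched as follows. The text does not define the matching index; normalized cross-correlation is used here as an assumed stand-in, and the function names, the flattened-patch representation and the acceptance threshold are hypothetical.

```python
import math


def matching_index(patch, template):
    """Normalized cross-correlation of two equal-length intensity lists.

    Returns a value in [-1, 1]; 0.0 is returned when either input is flat.
    """
    n = len(template)
    mean_p = sum(patch) / n
    mean_t = sum(template) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, template))
    den = math.sqrt(sum((p - mean_p) ** 2 for p in patch)
                    * sum((t - mean_t) ** 2 for t in template))
    return num / den if den > 0 else 0.0


def best_match(candidates, template, min_index=0.5):
    """Return the label of the candidate with the largest matching index,
    or None when there is no candidate or none reaches min_index.

    candidates is a list of (label, patch) pairs.
    """
    if not candidates:
        return None
    label, patch = max(candidates, key=lambda c: matching_index(c[1], template))
    return label if matching_index(patch, template) >= min_index else None
```

A candidate that is a brightness-scaled copy of the template scores 1.0 and is therefore preferred over partially matching candidates.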
A target location acquiring block 216 defines the center point of the target object as the target location. When the target location is defined, behavior decision block 218 supplies request signal to behavior command output block 204. When the request signal is received, behavior command output block 204 outputs the behavior command to move such that center of motion (COM) of the mobile unit overlaps the location of the target object.
To determine the behavior command autonomously, it is indispensable to segregate the target from the non-target. This is because a target object segregated by target segregation may be used to select the optimal behavior for moving the mobile unit toward the target location. In other words, the most suitable behavior is selected by predicting the center of motion (COM) based on selective attention. This allows the behavior control apparatus to search for the location of the target object accurately in the captured image.
Fig. 4 is a diagram illustrating a target object segregation recognized by the target segregation block 212. Ellipses 401, 402 and 403 are the clusters that are candidates for the location of the target object, calculated based on the estimated motion and represented as normal distributions $\Omega_1$, $\Omega_2$ and $\Omega_3$, respectively. These are attention classes extracted from feature information of the image. The mixture distribution of the three normal distribution models $\Omega_1$, $\Omega_2$ and $\Omega_3$ is shown as a dotted ellipse in Fig. 4. The center of motion is acquired as the center of the mixture distribution in the visual space. Each Gaussian distribution in the visual space is produced by projecting the clustered behavior space, based on the center of motion, onto the visual space with a non-linear mapping such as a neural network.
Assuming that $\Omega_{TL}$ represents the target location and $\tau$ represents the area where segregation may be executed in the captured image, the location of the target object is modeled by the probability density function $P(\Omega_{TL} \mid \tau)$. Since the location $\Omega_{TL}$ is basically an uncertain value, it is assumed that the location has behavior control noise (that is, the variance of the probabilistic density distribution). By repeating the feedback process, the noise (variance) of the target location is reduced and the location is refined. In the present invention, the reduction of noise (variance) depends on the accuracy of the motion estimation of the mobile unit.
Fig. 5 is a chart illustrating that the range of the target location is refined (reduced) by the learning. The learning block 208 narrows down the uncertain probability range (in other words, the variance of the probabilistic density distribution) $\sigma$ of the target location by, for example, Bayes' learning.
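How repeated feedback can narrow the variance of the target location may be illustrated with a conjugate Gaussian update. This is one standard form of Bayesian updating chosen here as an assumption; the text states only that Bayes' learning is used. The function name, the scalar location and the known observation variance are hypothetical.

```python
def update_location(prior_mean, prior_var, observation, obs_var):
    """One Bayesian update of a Gaussian belief about the target location.

    Returns (posterior_mean, posterior_var). The posterior variance is
    always smaller than the prior variance when obs_var is finite.
    """
    gain = prior_var / (prior_var + obs_var)       # weight given to the observation
    post_mean = prior_mean + gain * (observation - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var
```

Starting from a variance of 4.0 and applying three observations with unit variance yields posterior variances of about 0.8, 0.44 and 0.31, so the uncertain range shrinks with every feedback cycle.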
CCA reinforced EM algorithm
Now the CCA reinforced EM algorithm is described in detail. The EM algorithm is an iterative algorithm for estimating the maximum likelihood parameter when the observed data can be viewed as incomplete data. When the observed data follow the normal distribution, the parameter $\theta$ is represented by $\theta(\mu, \Sigma)$.
In one embodiment of the invention, the model of the feature vector is built by means of Bayes' parameter estimation. This is employed to estimate the number of clusters which best represents the data structure. An algorithm to estimate the parameters of a Gaussian mixture model will be described. This algorithm is essentially similar to conventional clustering, but differs in that it can estimate parameters closely when clusters overlap. Therefore, a sample of training data is used to determine the number of subclasses and the parameters of each subclass.
Let $Y$ be an $M$-dimensional random vector to be modeled using a Gaussian mixture distribution. Assume that this model has $K$ subclasses. The following parameters are required to completely specify the $k$-th subclass.
$\pi_k$: the probability that a pixel has subclass $k$
$\mu_k$: the $M$-dimensional spectral mean vector for subclass $k$
$R_k$: the $M \times M$ spectral covariance matrix for subclass $k$
$\pi$, $\mu$, $R$ denote the following parameter sets, respectively.
Figure imgf000023_0001
The complete set of parameters for the class is then given by $K$ and $\theta = (\pi, \mu, R)$. Note that the parameters are constrained in a variety of ways. In particular, $K$ must be an integer greater than 0, $\pi_k \ge 0$ with $\sum_k \pi_k = 1$, and $\det(R) \ge \varepsilon$, where $\varepsilon$ may be chosen depending on the application. The set of admissible $\theta$ for a $K$-th order model is denoted by $\rho^{(K)}$. Let $Y_1, Y_2, \ldots, Y_N$ be $N$ multispectral pixels sampled from the class of interest. Moreover, assume that the subclass of each pixel $Y_n$ is given by the random variable $X_n$. Of course, $X_n$ is normally not known, but it can still be useful for analyzing the problem. Letting each subclass be a multivariate Gaussian distribution, the probability density function for the pixel $Y_n$ given $X_n = k$ is
$$p_{y_n \mid x_n}(y_n \mid k, \theta) = \frac{1}{(2\pi)^{M/2}} |R_k|^{-1/2} \exp\left\{-\frac{1}{2}(y_n - \mu_k)^{t} R_k^{-1} (y_n - \mu_k)\right\} \qquad (5)$$
Since the subclass $X_n$ of each sample is not known, to compute the density function of $Y_n$ for a given parameter $\theta$, the following definition of conditional probability is applied.
$$p_{y_n}(y_n \mid \theta) = \sum_{k=1}^{K} p_{y_n \mid x_n}(y_n \mid k, \theta)\,\pi_k \qquad (6)$$
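The logarithm of the probability of the entire sequence
$$y = \{y_n\}_{n=1}^{N} \qquad (7)$$
is as follows.
$$\log p_y(y \mid K, \theta) = \sum_{n=1}^{N} \log\left(\sum_{k=1}^{K} p_{y_n \mid x_n}(y_n \mid k, \theta)\,\pi_k\right) \qquad (8)$$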
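The mixture density and the log-likelihood of the whole sample sequence can be evaluated as sketched below. For brevity the sketch assumes scalar samples (M = 1), so that each covariance matrix reduces to a variance; the function names are hypothetical.

```python
import math


def gaussian_pdf(y, mean, var):
    """Density of a one-dimensional normal distribution (subclass density)."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_density(y, weights, means, variances):
    """Density of one sample: subclass densities weighted by pi_k and summed."""
    return sum(w * gaussian_pdf(y, m, v)
               for w, m, v in zip(weights, means, variances))


def log_likelihood(samples, weights, means, variances):
    """Log-probability of the entire sample sequence under the mixture."""
    return sum(math.log(mixture_density(y, weights, means, variances))
               for y in samples)
```

As a sanity check, splitting one Gaussian into two identical components with weights 0.5 each leaves the log-likelihood unchanged.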
The objective is then to estimate the parameters $K$ and $\theta \in \rho^{(K)}$.
The minimum description length (MDL) estimator works by attempting to find the model order which minimizes the number of bits that would be required to code both the data samples $y_n$ and the parameter vector $\theta$. The MDL criterion is expressed as follows.
$$\mathrm{MDL}(K, \theta) = -\log p_y(y \mid K, \theta) + \frac{1}{2} L \log(NM) \qquad (9)$$
Therefore, the objective is to minimize the MDL criterion
Figure imgf000024_0003
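Evaluating the MDL criterion of Eq. (9) for one candidate model can be sketched as follows. Two points are assumptions made for this sketch: the penalty is taken as one half of L log(NM), the usual MDL form, and L is taken as the number of free parameters of a K-component Gaussian mixture in M dimensions, since the text above does not define L. The function names are hypothetical.

```python
import math


def num_free_parameters(K, M):
    """Assumed parameter count L: K - 1 mixing weights, K * M mean entries
    and K * M * (M + 1) / 2 covariance entries."""
    return K * (1 + M + M * (M + 1) // 2) - 1


def mdl(log_likelihood, K, M, N):
    """MDL value for a model with the given log-likelihood, K subclasses,
    M-dimensional samples and N samples (smaller is better)."""
    L = num_free_parameters(K, M)
    return -log_likelihood + 0.5 * L * math.log(N * M)
```

The first term rewards a good fit and the second term penalizes model complexity, so adding subclasses lowers the criterion only when the gain in likelihood outweighs the extra parameters.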
In order to derive the EM algorithm update equations, it is required to compute the following equation (Expectation step)
Q{θ;θ ) = E[logpytX (γ,X \ Θ) \ Y = y, θU]- Llog(NM) (U)
where Y and X are the sets of random variables fc .WL (i2) respectively, and y and x are realizations of these random objects. Thus the following equation holds.
MDL(K,θ)- MDL(κ,θ) < Q(θ( ))- Q(θ;θ{i)) (13)
This results in a useful optimization method since any value of θ that increases the value of Q( θ , ' θ (i)) is guaranteed to reduce the
MDL criteria. The objective of the EM algorithm is hereby to iteratively optimize with respect to θ until a local minimum of the
MDL function is reached. The Q function is optimized in the following way.
$$Q(E, \pi; E^{(i)}, \pi^{(i)}) = E\left[\log p_{y,x}(y, X \mid E, \pi) \mid y, E^{(i)}, \pi^{(i)}\right] - KM \log(NM) \qquad (14)$$
In this case,
Figure imgf000025_0001
(15) where
Figure imgf000025_0002
The EM update equations are then as follows.
Figure imgf000026_0001
The solution is given as follows.
$$e_k^{(i+1)} = \text{principal eigenvector of } R_k$$
$$\pi_k^{(i+1)} = N_k / N \qquad (18)$$
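The update of Eq. (18) can be sketched as follows. This is only an illustration: the definitions of $N_k$ and $R_k$ are given in equations that survive only as images, so the code assumes that $N_k$ is the sum of the posterior subclass probabilities and that $R_k$ is the corresponding weighted second-moment matrix. The names `m_step`, `resp`, `pi_new` and `e_new` are hypothetical, and every subclass is assumed to have nonzero weight.

```python
import numpy as np

def m_step(y, resp):
    """One parameter update in the spirit of Eq. (18).

    y    : (N, M) array of samples.
    resp : (N, K) array of posterior subclass probabilities (E-step output).
    """
    N, K = resp.shape
    N_k = resp.sum(axis=0)            # ASSUMPTION: effective count per subclass
    pi_new = N_k / N                  # pi_k^(i+1) = N_k / N
    e_new = np.empty((K, y.shape[1]))
    for k in range(K):
        # ASSUMPTION: R_k is the responsibility-weighted second-moment matrix.
        R_k = (resp[:, k, None] * y).T @ y / N_k[k]
        eigvals, eigvecs = np.linalg.eigh(R_k)   # eigenvalues in ascending order
        e_new[k] = eigvecs[:, -1]                # principal eigenvector of R_k
    return pi_new, e_new
```

The sign of each returned eigenvector is arbitrary, which is a property of any eigen-decomposition rather than of the method.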
Initially, the number of subclasses K is set to a sufficiently large value and is then decremented sequentially. For each value of K, the EM algorithm is applied until it converges to a local minimum of the MDL function. Finally, the value of K and the corresponding parameters that resulted in the smallest value of the MDL criterion are selected.
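The order-selection procedure can be outlined as a loop that keeps the model with the smallest MDL value, consistent with the stated objective of minimizing the MDL criterion. The sketch below is deliberately abstract: `run_em`, `merge_closest` and `mdl_value` are hypothetical caller-supplied functions standing in for the EM iteration, the subclass merge and the MDL evaluation, none of which is specified here in code form.

```python
def estimate_order(theta0, K0, run_em, merge_closest, mdl_value):
    """Decrement K from K0 down to 1 and keep the best (K, theta) pair.

    run_em(theta, K)     -> theta after EM has converged for this K
    merge_closest(theta) -> theta with two subclasses merged (K - 1 subclasses)
    mdl_value(theta, K)  -> MDL criterion for the converged model
    """
    best = None
    theta, K = theta0, K0
    while True:
        theta = run_em(theta, K)
        score = mdl_value(theta, K)
        if best is None or score < best[0]:
            best = (score, K, theta)      # smallest MDL value seen so far
        if K == 1:
            break
        theta = merge_closest(theta)      # reduce the number of subclasses
        K -= 1
    return best[1], best[2]
```

Passing the three operations in as arguments keeps the loop independent of how the parameters are represented.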
One method to effectively reduce K is to constrain the parameters of two classes to be equal, such that $e_l = e_m$ for classes l and m. Moreover, letting $E^*$ and $E^*_{lm}$ be the unconstrained and constrained solutions to Eq. (17), a distance function may be defined as follows.
Figure imgf000026_0002
$$\sigma_{\max}(R_m) - \sigma_{\max}(R_l + R_m) > 0 \qquad (19)$$
where $\sigma_{\max}(R)$ denotes the principal eigenvalue of R. At each step, the two components that minimize the class distance are computed.
$$(l^*, m^*) = \arg\min_{l,m} d(l, m) \qquad (20)$$
Finally, the two classes are merged and the number of subclasses K is decreased.
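The pair selection of Eq. (20) and the subsequent merge can be sketched as below. Only the tail of Eq. (19) is legible in the text, so the full distance used here, $\sigma_{\max}(R_l) + \sigma_{\max}(R_m) - \sigma_{\max}(R_l + R_m)$, is an assumption, as is the choice to merge two subclasses by summing their matrices and mixing weights. All function names are hypothetical.

```python
import itertools
import numpy as np

def sigma_max(R):
    """Principal (largest) eigenvalue of a symmetric matrix R."""
    return float(np.linalg.eigvalsh(R)[-1])

def class_distance(R_l, R_m):
    # ASSUMPTION: full form of the distance; only its tail is legible in Eq. (19).
    return sigma_max(R_l) + sigma_max(R_m) - sigma_max(R_l + R_m)

def closest_pair(R):
    """Eq. (20): the pair (l*, m*), with l < m, that minimizes d(l, m)."""
    pairs = itertools.combinations(range(len(R)), 2)
    return min(pairs, key=lambda p: class_distance(R[p[0]], R[p[1]]))

def merge(R, pi, l, m):
    """Merge subclass m into subclass l (requires l < m); K decreases by one."""
    R_new = [R_k for k, R_k in enumerate(R) if k != m]
    pi_new = [p for k, p in enumerate(pi) if k != m]
    # Index l is unaffected by the removal because l < m.
    R_new[l] = R[l] + R[m]        # ASSUMPTION: matrices add on merging
    pi_new[l] = pi[l] + pi[m]     # mixing weights of merged subclasses add
    return R_new, pi_new
```

Under this assumed distance, two subclasses whose principal directions coincide have a distance near zero and are therefore merged first.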
Process of Behavior Control Apparatus
It should be noted that the learning stage and the behavior control stage need not be clearly separated; both may be executed simultaneously, as in the example described below.
In other words, in the behavior control stage, behavior evaluation block 206 determines whether a feature of a newly provided image should be reflected in the knowledge acquired by previous learning. Furthermore, behavior evaluation block 206 receives the motion estimated from the image. When image capturing block 202 captures a change in the external environment that was not learned previously, the feature is sent to behavior evaluation block 206, which outputs an attentional demand indicating that an attention class should be generated. In response, learning block 208 generates an attention class. The learning result is thus continually updated, and the precision of the motion estimation improves accordingly.
The control process of the behavior control apparatus of the invention, as installed on an RC helicopter in a practical application, will now be described. Fig. 6 is a flowchart of the process. The chart can be divided into two steps, shown as two dotted-line rectangles in Fig. 6. One is the coarse step in the left column, where rough segregation of target and non-target is executed. The other is the fine step in the right column, where the target location is gradually narrowed (refined).
At step 602, the probabilistic density distribution $P(\Omega_i)$ is assumed to be uniform for all attention classes $\Omega_i$ of motion. At step 604, the mobile unit moves randomly to collect data for learning. In this example, the data set collected for stabilizing the RC helicopter 100 was used to generate 500 training data points and 200 test points.
At step 606, the CCA-reinforced EM algorithm is executed to calculate the parameters θ(μ, Σ) that define the probabilistic density distribution of $\Omega_i$. In the present example, 20 subclasses were used at first; the number of subclasses was reduced as the CCA-reinforced EM algorithm converged, finally reaching 3 as shown in Fig. 4.
At step 608, $P(Q \mid \Omega_i)$ is calculated with θ, where Q represents the behavior command. At step 610, the probabilistic relation between feature vector I and attention class $\Omega_i$ is calculated with a neural network. At step 612, the motion of the mobile unit is estimated by Bayes' rule. Steps 602 to 612 correspond to the learning stage.
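The Bayes' rule step can be illustrated with a short sketch. It shows one plausible reading of step 612, in which the posterior over attention classes is proportional to a per-class likelihood of the feature vector times the class prior (uniform at step 602). How the likelihood is obtained from the neural network is not reproduced here, and the names `posterior_over_classes` and `estimate_motion` are hypothetical.

```python
import numpy as np

def posterior_over_classes(likelihood, prior):
    """Bayes' rule: P(class | I) is proportional to P(I | class) * P(class)."""
    joint = np.asarray(likelihood, dtype=float) * np.asarray(prior, dtype=float)
    return joint / joint.sum()

def estimate_motion(likelihood, prior):
    """Return the index of the most probable attention class and the posterior."""
    post = posterior_over_classes(likelihood, prior)
    return int(np.argmax(post)), post
```

With a uniform prior the posterior is simply the normalized likelihood, so the prior only starts to matter once learning has made it non-uniform.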
At step 614, a Gaussian mixture model is calculated using each probabilistic density function. The part of the image that is not covered by the Gaussian mixture model is separated as non-target.
At step 616, the target object is recognized by template matching and the probabilistic density distribution $\Omega_{TL}$ of the target location is calculated. At step 618, the center of this distribution is defined as the target location.
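Template matching and the choice of a center point can be sketched as follows. This is a simplification under stated assumptions: the match score is a brute-force sum of squared differences, and the target location is taken as the center of the best-matching window rather than the center of a probabilistic density distribution as described above. The function names are hypothetical.

```python
import numpy as np

def match_template(image, template):
    """Return the (row, col) of the window with the smallest squared difference."""
    H, W = image.shape
    h, w = template.shape
    best, best_pos = np.inf, (0, 0)
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            ssd = np.sum((image[r:r + h, c:c + w] - template) ** 2)
            if ssd < best:
                best, best_pos = ssd, (r, c)
    return best_pos

def target_location(image, template):
    """ASSUMPTION: use the center of the best-matching window as the location."""
    r, c = match_template(image, template)
    h, w = template.shape
    return (r + (h - 1) / 2.0, c + (w - 1) / 2.0)
```

In the flow described above, the search would be restricted to the region segregated as target at step 614, which keeps the exhaustive scan small.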
At step 620, the difference D between the center of motion (COM) and the target location (TL) is calculated. At step 622, the map outputs a behavior command that expands the width of motion when the helicopter is far from the target location, and otherwise outputs a command that reduces the width of motion. Fig. 7 shows an example of the output behavior command. As shown, a map that takes a different output value depending on D is stored in memory, and the value corresponding to D is looked up and transmitted to the servomotor. At step 624, it is determined whether D is smaller than the allowable error ε. If D is larger than ε, the accuracy of the target location is not sufficient and the process returns to step 606 to recalculate θ. This comes down to the normalization problem of how many Gaussian mixture functions are needed to estimate the state of motion. By increasing the number of Gaussian mixture functions applied each time the process returns to step 606, the unit may estimate θ accurately and thus predict the target location accurately.
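Steps 620 to 624 can be sketched as a lookup followed by a threshold test. The bin edges and widths in the map below are hypothetical placeholders, since the actual values of the map in Fig. 7 are not given in the text; only the threshold 0.1826 is taken from the Results section. D is assumed here to be the Euclidean distance between COM and TL.

```python
import bisect
import math

EPSILON = 0.1826                  # allowable error quoted in the Results section
D_EDGES = [0.2, 0.5, 1.0]         # HYPOTHETICAL bin edges for D
WIDTHS = [0.05, 0.2, 0.5, 1.0]    # HYPOTHETICAL motion widths, one per bin

def lookup_width(D):
    """Map a larger D to a wider motion and a smaller D to a narrower one."""
    return WIDTHS[bisect.bisect_right(D_EDGES, D)]

def control_step(com, tl):
    """Return (D, width, done) for one pass through steps 620 to 624."""
    D = math.dist(com, tl)        # ASSUMPTION: Euclidean distance
    return D, lookup_width(D), D < EPSILON
```

When `done` is false the process described above returns to step 606, and when it is true the control loop terminates.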
When D is smaller than ε at step 624, the helicopter is stable with sufficient accuracy with respect to the target location, and so the process is terminated. By setting ε small, the unit may control both the location of the helicopter and the duration for which the helicopter remains at that location. Steps 614 to 624 correspond to the behavior control stage.
Results
Figs. 8 to 10 are graphs illustrating the control status of the RC helicopter. In these graphs, the horizontal axis represents the number of trials and the vertical axis represents the distance between the center of motion (COM) and the target location (TL) when controlling the helicopter to be stable. The two dotted straight lines in the graphs represent the threshold values ε used to determine the stability of the control. The value of ε is set to 0.1826 in the graphs.
Fig. 8 is a graph of the control immediately after the behavior control is initiated. In this case, the distance D does not fall below ε and the vibration is still large, so the control is determined to be unstable. As the target location is narrowed, the vibration becomes smaller (Fig. 9). Finally, the control status becomes stable as shown in Fig. 10.
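One way to turn the graphs into a yes/no stability decision is sketched below. The exact criterion used to read the graphs is not stated, so the rule here (all of the most recent trials lie within the band bounded by the two threshold lines) and the window length are assumptions, and `is_stable` is a hypothetical name.

```python
EPSILON = 0.1826  # threshold value stated for the graphs

def is_stable(distances, window=10, eps=EPSILON):
    """True when the last `window` trials all lie inside the band [-eps, +eps].

    ASSUMPTION: stability is judged over a fixed number of recent trials.
    """
    recent = distances[-window:]
    return len(recent) == window and all(abs(d) <= eps for d in recent)
```

A trace like that of Fig. 8, where D never falls below ε, is reported as unstable under this rule.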
Some preferred embodiments have been described, but this invention is not limited to such embodiments. For example, the behavior control apparatus need not be installed on the mobile unit. In this case, only the CCD camera is installed on the mobile unit and the behavior control apparatus is installed elsewhere. Information is then transmitted between the camera and the apparatus through wireless communication.
Industrial Applicability
According to one aspect of the invention, the behavior control apparatus roughly segregates a target area that includes a target object of behavior from sensory inputs, such as images, based on the estimation of motion. The apparatus then specifies a target object from the target area, acquires the location of the target object, and outputs a behavior command that moves the mobile unit toward the location. Thus, detailed features of the target object need not be predetermined. In addition, because features irrelevant to the present behavior are eliminated, the computational load is reduced. Therefore, highly efficient and accurate control of the mobile unit may be implemented.
According to another aspect of the invention, the behavior control apparatus pre-learns the relationship between sensory inputs and behavior commands. The apparatus then updates the learning result when a new feature is acquired in the behavior control stage. The learning result is represented as a probabilistic density distribution. Thus, the motion of the mobile unit in the behavior control stage may be estimated with high accuracy.

Claims

1. A behavior control apparatus for controlling behavior of a mobile unit, comprising: sensory input capturing method for capturing sensory inputs; motion estimating method for estimating motion of the mobile unit; target segregation method for segregating the portion which includes a target object to be target for behavior of the mobile unit from sensory inputs; target object matching method for extracting target object from said segregated portion; target location acquiring method for acquiring the location of the target object; behavior decision method for deciding behavior command for controlling the mobile unit based on the location of target object.
2. The behavior control apparatus claimed in claim 1, said motion estimating method comprising: behavior command output method for outputting said behavior command; behavior evaluation method for evaluating the result of the behavior of the mobile unit; learning method for learning the motion of the mobile unit using the relationship between said sensory inputs and said behavior result; and storage method for storing the learning result.
3. The behavior control apparatus claimed in claim 2, wherein said learning result is probabilistic density distribution.
4. The behavior control apparatus claimed in claims 1 to 3, wherein said target segregation method segregates said portion by comparing the sensory inputs and said estimated motion.
5. The behavior control apparatus claimed in claim 4, wherein said segregation is done by utilizing optical flow.
6. The behavior control apparatus claimed in claims 1 to 5, wherein said target location acquiring method defines the center of the target object as the location of said target object; said behavior decision method outputs the behavior command to move the mobile unit toward said location of the target object.
7. The behavior control apparatus claimed in claim 6, wherein said behavior decision method calculates the distance between the mobile unit and the location of said target object, said behavior decision method deciding the behavior command to decrease the calculated distance.
8. The behavior control apparatus claimed in claim 7, wherein if the calculated distance is greater than a predetermined value, said target segregation method repeats segregating said portion which includes a target object.
9. The behavior control apparatus claimed in claims 1 to 8, wherein said sensory input capturing method captures images of the external environment of the mobile unit as the sensory inputs.
10. The behavior control apparatus claimed in claims 1 to 8, wherein said target object matching method extracts target object by pattern matching between the sensory inputs and predetermined templates.
11. The behavior control apparatus claimed in claims 1 to 8, wherein said sensory inputs capturing method is a gyroscope which captures motion of the mobile unit.
12. A method for controlling behavior of a mobile unit using behavior command, comprising: capturing sensory inputs; estimating motion of the mobile unit; segregating the portion which includes a target object to be target for behavior of the mobile unit from sensory inputs; extracting the target object from said segregated portion; acquiring the location of said target object; and controlling the mobile unit based on the location of target object.
13. The method claimed in claim 12, said estimating step further comprising: outputting said behavior command; evaluating the result of the behavior of the mobile unit; learning the motion of the mobile unit using the relationship between said sensory inputs and said behavior result; and storing the learning result.
14. The method claimed in claim 13, wherein said learning result is probabilistic density distribution.
15. The method claimed in claims 12 to 14, wherein said segregating is done by comparing the sensory inputs and said estimated motion.
16. The method claimed in claim 15, wherein said segregation is done by utilizing optical flow.
17. The method claimed in claims 12 to 16, wherein the center of the target object is defined as the location of said target object; said behavior command being determined so as to move the mobile unit toward said location of the target object.
18. The method claimed in claim 17, wherein the distance between the center of motion of the mobile unit and the location of said target object is calculated, and then the behavior command is determined to decrease the calculated distance.
19. The method claimed in claim 18, wherein if the calculated distance is greater than a predetermined value, said segregating step is repeated.
20. The method claimed in claims 12 to 19, wherein said sensory inputs are images of the external environment of the mobile unit.
21. The method claimed in claims 12 to 19, wherein said extracting is done by pattern matching between the sensory inputs and predetermined templates.
22. The method claimed in claims 12 to 19, wherein motion of the mobile unit is captured using a gyroscope.
23. Computer software program for implementing any of the methods claimed in claims 12 to 22 on a computer.
24. A recording medium for recording said computer software program claimed in claim 23.
PCT/JP2002/007224 2001-07-16 2002-07-16 Behavior control apparatus and method WO2003009074A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2003514353A JP2004536400A (en) 2001-07-16 2002-07-16 Behavior control device and method
EP02746100A EP1407336A1 (en) 2001-07-16 2002-07-16 Behavior control apparatus and method
US10/484,147 US7054724B2 (en) 2001-07-16 2002-07-16 Behavior control apparatus and method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2001-214907 2001-07-16
JP2001214907 2001-07-16

Publications (1)

Publication Number Publication Date
WO2003009074A1 true WO2003009074A1 (en) 2003-01-30

Family

ID=19049647

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2002/007224 WO2003009074A1 (en) 2001-07-16 2002-07-16 Behavior control apparatus and method

Country Status (4)

Country Link
US (1) US7054724B2 (en)
EP (1) EP1407336A1 (en)
JP (1) JP2004536400A (en)
WO (1) WO2003009074A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2403365A (en) * 2003-06-27 2004-12-29 Hewlett Packard Development Co Camera having behaviour memory

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7400291B2 (en) * 2003-12-04 2008-07-15 Sony Corporation Local positioning system which operates based on reflected wireless signals
JP4634842B2 (en) * 2005-03-31 2011-02-16 株式会社デンソーアイティーラボラトリ Landscape estimation device
WO2007124014A2 (en) * 2006-04-19 2007-11-01 Swope John M System for position and velocity sense and control of an aircraft
US7643893B2 (en) * 2006-07-24 2010-01-05 The Boeing Company Closed-loop feedback control using motion capture systems
US7813888B2 (en) * 2006-07-24 2010-10-12 The Boeing Company Autonomous vehicle rapid development testbed systems and methods
US7885732B2 (en) * 2006-10-25 2011-02-08 The Boeing Company Systems and methods for haptics-enabled teleoperation of vehicles and other devices
US20090319096A1 (en) * 2008-04-25 2009-12-24 The Boeing Company Control and monitor heterogeneous autonomous transport devices
US8068983B2 (en) * 2008-06-11 2011-11-29 The Boeing Company Virtual environment systems and methods
US20100312386A1 (en) * 2009-06-04 2010-12-09 Microsoft Corporation Topological-based localization and navigation
US8285659B1 (en) * 2009-08-18 2012-10-09 The United States of America as represented by the Administrator of the National Aeronautics & Space Administration (NASA) Aircraft system modeling error and control error
US9015093B1 (en) 2010-10-26 2015-04-21 Michael Lamport Commons Intelligent control with hierarchical stacked neural networks
US8775341B1 (en) 2010-10-26 2014-07-08 Michael Lamport Commons Intelligent control with hierarchical stacked neural networks
US9336302B1 (en) 2012-07-20 2016-05-10 Zuci Realty Llc Insight and algorithmic clustering for automated synthesis
US9769367B2 (en) 2015-08-07 2017-09-19 Google Inc. Speech and computer vision-based control
US9836819B1 (en) 2015-12-30 2017-12-05 Google Llc Systems and methods for selective retention and editing of images captured by mobile image capture device
US9836484B1 (en) 2015-12-30 2017-12-05 Google Llc Systems and methods that leverage deep learning to selectively store images at a mobile image capture device
US9838641B1 (en) 2015-12-30 2017-12-05 Google Llc Low power framework for processing, compressing, and transmitting images at a mobile image capture device
US10225511B1 (en) 2015-12-30 2019-03-05 Google Llc Low power framework for controlling image sensor mode in a mobile image capture device
US10732809B2 (en) 2015-12-30 2020-08-04 Google Llc Systems and methods for selective retention and editing of images captured by mobile image capture device
US11205103B2 (en) 2016-12-09 2021-12-21 The Research Foundation for the State University Semisupervised autoencoder for sentiment analysis
US20200364567A1 (en) * 2019-05-17 2020-11-19 Samsung Electronics Co., Ltd. Neural network device for selecting action corresponding to current state based on gaussian value distribution and action selecting method using the neural network device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4873644A (en) * 1987-09-16 1989-10-10 Kubota, Ltd. Guide system for a working machine having a product identifying system
EP0390051A2 (en) * 1989-03-31 1990-10-03 Honeywell Inc. Method and apparatus for computing the self-motion of moving imaging devices
JPH09170898A (en) * 1995-12-20 1997-06-30 Mitsubishi Electric Corp Guiding apparatus
DE19645556A1 (en) * 1996-04-02 1997-10-30 Bodenseewerk Geraetetech Steering signal generating device for target tracking of e.g. military missile

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4092716A (en) * 1975-07-11 1978-05-30 Mcdonnell Douglas Corporation Control means and method for controlling an object
JPH05150607A (en) 1991-11-29 1993-06-18 Konica Corp Color image forming device
JPH06266507A (en) 1993-03-12 1994-09-22 Victor Co Of Japan Ltd Multivolume continuous reproducing device
JP2000185720A (en) 1998-12-18 2000-07-04 Sato Corp Label affixing device
US6326763B1 (en) * 1999-12-20 2001-12-04 General Electric Company System for controlling power flow in a power bus generally powered from reformer-based fuel cells
DE10102243A1 (en) * 2001-01-19 2002-10-17 Xcellsis Gmbh Device for generating and distributing electrical energy to consumers in a vehicle

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
NELSON R C: "VISUAL HOMING USING AN ASSOCIATIVE MEMORY", BIOLOGICAL CYBERNETICS, SPRINGER VERLAG, HEIDELBERG, DE, vol. 65, no. 4, 1 August 1991 (1991-08-01), pages 281 - 291, XP000227586, ISSN: 0340-1200 *
PATENT ABSTRACTS OF JAPAN vol. 1997, no. 10 31 October 1997 (1997-10-31) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2403365A (en) * 2003-06-27 2004-12-29 Hewlett Packard Development Co Camera having behaviour memory
GB2403365B (en) * 2003-06-27 2008-01-30 Hewlett Packard Development Co An autonomous camera having exchangeable behaviours
US7742625B2 (en) 2003-06-27 2010-06-22 Hewlett-Packard Development Company, L.P. Autonomous camera having exchangable behaviours

Also Published As

Publication number Publication date
JP2004536400A (en) 2004-12-02
EP1407336A1 (en) 2004-04-14
US20040162647A1 (en) 2004-08-19
US7054724B2 (en) 2006-05-30

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2002746100

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2003514353

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 10484147

Country of ref document: US

WWP Wipo information: published in national office

Ref document number: 2002746100

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWW Wipo information: withdrawn in national office

Ref document number: 2002746100

Country of ref document: EP