JP6567998B2

JP6567998B2 - Control method

Info

Publication number: JP6567998B2
Application number: JP2016058370A
Authority: JP
Inventors: 中村　仁彦; 高野　渉; 高橋　太郎
Original assignee: University of Tokyo NUC; Toyota Motor Corp
Current assignee: University of Tokyo NUC; Toyota Motor Corp
Priority date: 2016-03-23
Filing date: 2016-03-23
Publication date: 2019-08-28
Anticipated expiration: 2036-03-23
Also published as: JP2017170553A

Description

The present invention relates to a control method for controlling a movable part using a statistical model.

A control method is known that controls a movable part using a statistical model (such as a hidden Markov model) constructed from position information of the moving movable part (such as the joint angles of a robot); see, for example, Patent Document 1.

Patent Document 1: JP 2004-330361 A

In the above control method, the statistical model is constructed based on position information. Consequently, if, for example, the positional relationship between the movable part and the operation target changes between the time the statistical model is constructed and the time the movable part is actually controlled, an excessive force may be generated between the movable part and the operation target, or the movable part may fail to contact the operation target.
The present invention has been made to solve this problem. Its main object is to provide a control method that can suppress situations in which an excessive force is generated between the movable part and the operation target, or in which the movable part cannot contact the operation target, even when the positional relationship between the movable part and the operation target differs between construction of the statistical model and control of the movable part.

To achieve the above object, one aspect of the present invention is a control method comprising:
a step of sampling target position information or target velocity information of a movable part at time t+1, based on (i) a statistical model constructed from position or velocity information of the moving movable part and force information on the movable part, in which transition probabilities between nodes are set and each node has a distribution of the position or velocity information and the force information, and (ii) the position or velocity information and the force information of the movable part at time t (t being a natural number);
a step of calculating, based on the sampled target position information or target velocity information, the target position information or target velocity information and target acceleration information of the movable part at time t+1;
a step of performing an inverse dynamics calculation based on the calculated target position information, target velocity information, and target acceleration information at time t+1 to calculate target force information of the movable part at time t+1;
a step of calculating, based on the position information and force information at time t and on the statistical model, the probability that the calculated target position information and target force information at time t+1 are generated from the statistical model;
a step of calculating expected values of the target position information and target force information at time t+1, based on the calculated target position information and target force information at time t+1 and on the calculated probability; and
a step of controlling the movable part based on the calculated expected values of the target position information and target force information.
According to this aspect, the position command value and the force command value of the movable part can be made consistent with each other. Therefore, even when, for example, the positional relationship between the movable part and the operation target differs between construction of the statistical model and control of the movable part, situations in which an excessive force is generated between the movable part and the operation target, or in which the movable part cannot contact the operation target, can be suppressed.

According to the present invention, it is possible to provide a control method that can suppress situations in which an excessive force is generated between the movable part and the operation target, or in which the movable part cannot contact the operation target, even when the positional relationship between the movable part and the operation target differs between construction of the statistical model and control of the movable part.

FIG. 1 is a diagram showing the schematic configuration of the robot arm according to Embodiment 1 of the present invention.
FIG. 2 is a block diagram showing the schematic system configuration of the robot arm and the control device according to Embodiment 1 of the present invention.
FIG. 3 is a diagram showing an example of a hidden Markov model.
FIG. 4 is a diagram showing target trajectory generation for each joint portion and robot arm control.
FIG. 5 is a flowchart showing the flow of the control method according to Embodiment 1 of the present invention.

Embodiments of the present invention are described below with reference to the drawings.
Embodiment 1
FIG. 1 is a diagram showing the schematic configuration of the robot arm according to Embodiment 1 of the present invention. The control device according to Embodiment 1 controls, for example, an articulated robot arm 2 (a specific example of a movable part).

The robot arm 2 has a plurality of links 21, joint portions 22 (a wrist joint, an elbow joint, a shoulder joint, and so on) that rotatably connect the links 21, and an end effector 23 that is provided at the tip of the arm and manipulates the operation target.

FIG. 2 is a block diagram showing the schematic system configuration of the robot arm and the control device according to Embodiment 1 of the present invention. Each joint portion 22 is provided with an angle sensor 24, such as an encoder, that detects the joint angle of the joint portion 22 (a specific example of position information); an actuator 25, such as a servo motor, that drives the joint portion 22; and a force sensor 26 that detects the operating force of the joint portion 22.

The force sensor 26 is, for example, a torque sensor that detects the joint torque of each joint portion 22 (a specific example of force information). Each joint portion 22 is provided with a speed reduction mechanism. The end effector 23 applies an operating force to an object by, for example, gripping or contacting it. The end effector 23 is provided with an actuator 25 that drives the end effector 23 and a force sensor 26 that detects the operating force of the end effector 23.

The control device 1 feedback-controls the robot arm 2 by controlling the actuators 25 of the joint portions 22 and the end effector 23 based on, for example, the angle information (such as joint angles) from the angle sensors 24 of the joint portions 22 and the operating forces from the force sensors 26.

The control device 1 is implemented in hardware centered on a microcomputer comprising, for example, a CPU (Central Processing Unit) 1a that performs arithmetic processing and the like; a memory 1b consisting of a ROM (Read Only Memory) and a RAM (Random Access Memory) that store the arithmetic programs, control programs, and the like executed by the CPU 1a; and an interface unit (I/F) 1c that inputs and outputs signals to and from the outside. The CPU 1a, the memory 1b, and the interface unit 1c are connected to one another via a data bus 1d or the like.

The control device 1 according to Embodiment 1 uses a statistical model such as a hidden Markov model (HMM) to learn human motions by imitation, and thereby has the robot arm 2 execute physical tasks that involve applying force, such as wiping a table.

For a robot arm to execute such a physical task, the control device must learn not only the position information of the robot arm (joint angles and hand position) but also its force information (joint torques and hand reaction force). If the control device simply learns the force information in the same way as the position information, the following problem arises. For example, if the positional relationship between the robot arm and the operation target changes between learning and control (for instance, because the height of the table has changed), an excessive force may be generated between the robot arm and the operation target, or the robot arm may fail to contact the operation target.

In contrast, the control device 1 according to Embodiment 1 ensures dynamic consistency by using the dynamics equation described later when generating the target trajectory of the robot arm 2 with the hidden Markov model.

The control device 1 executes a learning process, in which it learns from previously prepared data and constructs a hidden Markov model, and an execution process, in which it controls the robot arm 2 using the hidden Markov model constructed in the learning process.

(Learning process)
In the learning process, the control device 1 learns from, for example, previously prepared time-series data (such as simulation data) of the joint angle and joint torque of each joint portion 22 of the robot arm 2, and constructs a left-right hidden Markov model by the Baum-Welch method (FIG. 3).

Alternatively, the robot arm 2 may actually be operated remotely, and the control device 1 may learn from the time-series data of the joint angle and joint torque of each joint portion 22 detected at that time by the angle sensors 24 and force sensors 26 to construct the hidden Markov model.

As shown in FIG. 3, the hidden Markov model has nodes q and transition probabilities a for transitions between the nodes q. Each node q has a distribution of the joint angle θ and joint torque τ of each joint portion 22. Using the time-series data of the joint angle and joint torque of each joint portion 22, the control device 1 constructs a hidden Markov model by learning the transition probabilities a between nodes and the parameters that represent the distribution of the joint angle θ and joint torque τ of each joint portion 22.
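The structure described above can be sketched as a small data structure. This is only an illustration: the `Node` class, the `make_left_right_hmm` helper, the per-node Gaussian parameters, and the hand-set `stay_prob` are all assumptions introduced here. In the patent the parameters are learned by the Baum-Welch method rather than set by hand, and the form of the per-node distribution is not specified.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One HMM node q. A Gaussian per quantity is an assumption of this sketch."""
    angle_mean: float   # mean joint angle theta [rad]
    angle_std: float
    torque_mean: float  # mean joint torque tau [N*m]
    torque_std: float


def make_left_right_hmm(nodes, stay_prob):
    """Return (nodes, a), where a[i][j] is the transition probability i -> j.

    Left-right topology: each node either stays where it is or advances to
    the next node; the last node can only stay.
    """
    n = len(nodes)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i == n - 1:
            a[i][i] = 1.0
        else:
            a[i][i] = stay_prob
            a[i][i + 1] = 1.0 - stay_prob
    return nodes, a


# Hand-set example values (hypothetical, not learned).
nodes, a = make_left_right_hmm(
    [Node(0.0, 0.05, 0.0, 0.1), Node(0.4, 0.05, 1.0, 0.1), Node(0.8, 0.05, 2.0, 0.1)],
    stay_prob=0.7,
)
```

Each row of `a` sums to 1, and entries below the diagonal are zero, which is what makes the model left-right.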

Thus, in the learning process, the hidden Markov model is constructed using the joint angles and joint torques of the robot arm 2. This makes it possible to store the time-series data together with the correlation between joint angle and joint torque, and to generate simultaneously both the position command and the force command that the robot arm 2 needs to perform a contact motion on the operation target. It also allows motion recognition and the like for the robot arm 2 while compressing the amount of data.

(Execution process)
In the execution process, as shown in FIG. 4, the control device 1 performs target trajectory generation for each joint portion 22 and robot arm control simultaneously. During robot arm control, the control device 1 determines the target position information θ_plan and target force information τ_plan of the robot arm 2 for time t+1, the next instant, from the sensor values at the current time t detected by the force sensors 26 and angle sensors 24, and executes compliance control of the robot arm 2 based on θ_plan and τ_plan. This avoids situations in which a command value far from the current state of the robot arm 2 is generated and the robot arm 2 moves abruptly. Furthermore, even when the position command and the force command of the robot arm 2 cannot both be satisfied exactly at the same time, the target position information and target force information allow the position and force of the robot arm 2 to be controlled appropriately, in a balance determined by preset parameters.

At predetermined time intervals, the control device 1 repeats the following [prediction] steps (1) to (8) and [resampling] to generate the target trajectory of the joint portion 22.

[prediction]
(1) The control device 1 first holds the node q_t of the hidden Markov model at the current time t, together with the joint angle θ_t and joint torque τ_t at the current time t detected by the angle sensor 24 and the force sensor 26.

(2) Using the transition probabilities a between the nodes of the hidden Markov model, the control device 1 samples the node q_{t+1} at time t+1, the next instant.
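Step (2) can be sketched as a draw from one row of the transition matrix. The function name `sample_next_node`, the example matrix, and the use of a seeded `random.Random` instance are assumptions made for illustration.

```python
import random


def sample_next_node(a, q_t, rng):
    """Draw q_{t+1} from row q_t of the transition matrix a."""
    states = range(len(a[q_t]))
    return rng.choices(states, weights=a[q_t], k=1)[0]


# Hypothetical left-right transition matrix with three nodes.
a = [
    [0.7, 0.3, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]
rng = random.Random(0)
draws = [sample_next_node(a, 0, rng) for _ in range(1000)]
```

From node 0 the draw can only return 0 or 1, because the transition to node 2 has probability zero.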

(3) Based on the joint-angle distribution of the sampled node q_{t+1} (the output probability at node q_{t+1}), the control device 1 samples the target joint angle θ_{t+1}^{ref} (a specific example of target position information) of the joint portion 22 at time t+1.
In this way, when generating the target trajectory of the joint portion 22, the target joint angle is sampled based on the hidden Markov model, which includes the joint angle and joint torque of the joint portion 22, and on the joint angle and joint torque at the current time t detected by the angle sensor 24 and the force sensor 26.
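Step (3) can be sketched as a draw from the sampled node's output distribution. The text does not state the form of that distribution; a one-dimensional Gaussian is assumed here, and `sample_target_angle` and the numeric values are hypothetical.

```python
import random


def sample_target_angle(angle_mean, angle_std, rng):
    """Draw theta_{t+1}^{ref} from the node's joint-angle distribution.

    Assumption: the output distribution is a 1-D Gaussian.
    """
    return rng.gauss(angle_mean, angle_std)


rng = random.Random(0)
samples = [sample_target_angle(0.4, 0.05, rng) for _ in range(2000)]
sample_mean = sum(samples) / len(samples)  # close to the node mean 0.4
```

Repeating the draw once per particle yields a set of candidate target angles, which the later steps weight and average.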

(4) The control device 1 takes the first derivative of the sampled target joint angle θ_{t+1}^{ref} to calculate the target joint angular velocity θ(dot)_{t+1}^{ref} of the joint portion 22 (a specific example of target velocity information), and takes the second derivative of θ_{t+1}^{ref} to calculate the target joint angular acceleration θ(2dot)_{t+1}^{ref} of the joint portion 22 (a specific example of target acceleration information). Here, θ with one dot above it (the first derivative) is written θ(dot), and θ with two dots above it (the second derivative) is written θ(2dot); the same notation is used for other parameters below.
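The text does not say how the derivatives in step (4) are discretized. One simple possibility, shown purely as an assumption, is a backward finite difference over the control period `dt`, using the angles at the two preceding time steps. The function name and arguments are hypothetical.

```python
def finite_differences(theta_prev2, theta_prev1, theta_next, dt):
    """Backward-difference velocity and acceleration at time t+1.

    theta_prev2, theta_prev1, theta_next are the angles at t-1, t, and t+1.
    """
    vel = (theta_next - theta_prev1) / dt
    acc = (theta_next - 2.0 * theta_prev1 + theta_prev2) / (dt * dt)
    return vel, acc


# Angles sampled from theta(t) = t**2 at t = 0.0, 0.1, 0.2 (hypothetical data).
vel, acc = finite_differences(0.0, 0.01, 0.04, 0.1)
```

For this quadratic example the second difference recovers the constant acceleration of 2, while the backward-difference velocity lags the true derivative by half a step.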

(5) The control device 1 corrects the calculated target joint angular acceleration θ(2dot)_{t+1}^{ref} using (Equation 1). In (Equation 1), K_p and K_d are preset coefficient matrices.
This correction can further improve control stability.

In Embodiment 1, the control device 1 may skip the correction of the target joint angular acceleration θ(2dot)_{t+1}^{ref} in (5) in order to speed up processing.

(6) The control device 1 calculates the target joint torque τ_{t+1} that realizes the calculated target joint angle θ_{t+1}^{ref}, the target joint angular velocity θ(dot)_{t+1}^{ref}, the corrected target joint angular acceleration θ(2dot)_{t+1}^{ref}, and the target reaction force F_{t+1}^{ref}.
The target reaction force F_{t+1}^{ref} is set by the user according to the operation the robot arm 2 is to perform. For example, it is set to 10 [N] when the robot arm 2 is to press firmly against the operation target, and to 1 [N] when it is to touch the target lightly.
Based on θ_{t+1}^{ref}, θ(dot)_{t+1}^{ref}, the corrected θ(2dot)_{t+1}^{ref}, and F_{t+1}^{ref}, the control device 1 calculates the target joint torque τ_{t+1} of the joint portion 22 (a specific example of target force information) using the inverse dynamics equation (equation of motion) (Equation 2). In (Equation 2), M is the inertia matrix, C is the centrifugal and Coriolis term, and G is the gravity term. J is the Jacobian matrix that relates the joint angular velocity to the velocity of the robot arm 2 (the movable part).
By calculating the target joint torque τ_{t+1} with this inverse dynamics equation, the dynamic consistency of the robot arm 2 is ensured when the target trajectory of the joint portion 22 is generated by the hidden Markov model.
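(Equation 2) itself does not survive in this text. With M, C, G, and J as defined in step (6), an inverse dynamics relation of this kind is conventionally written in the form below. This is a sketch of the standard equation of motion, not a transcription of the patent's formula; in particular, the sign of the reaction-force term depends on the direction in which F is defined, which is an assumption here.

```latex
\tau_{t+1}
  = M\!\left(\theta^{ref}_{t+1}\right)\ddot{\theta}^{ref}_{t+1}
  + C\!\left(\theta^{ref}_{t+1},\,\dot{\theta}^{ref}_{t+1}\right)
  + G\!\left(\theta^{ref}_{t+1}\right)
  + J^{\top} F^{ref}_{t+1}
```

The last term maps the target reaction force at the contact point into joint torques, which is why a larger user-set F_{t+1}^{ref} yields a larger torque command for the same motion.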

(7) Based on the joint angle θ_t and joint torque τ_t at the current time t detected by the angle sensor 24 and the force sensor 26, and on the hidden Markov model, the control device 1 calculates the probability P(θ_{t+1}^{ref}, τ_{t+1} | θ_t, τ_t, λ) that the target joint angle θ_{t+1}^{ref} and the target joint torque τ_{t+1} are generated from the hidden Markov model (that is, that q_{t+1}, θ_{t+1}^{ref}, and τ_{t+1} are reached). In other words, the probability (likelihood) of arriving at the target joint angle θ_{t+1}^{ref} and target joint torque τ_{t+1} is calculated from the joint angle and joint torque (sensor values) at the current time t and from the hidden Markov model.
The control device 1 calculates this probability P for each particle. So that the probabilities P of all particles sum to 1, the control device 1 then applies the following correction: for example, it calculates the sum of the probabilities of all particles and divides the probability of each particle by that sum.
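The correction in step (7) is a plain normalization of the per-particle probabilities. A minimal sketch follows; the function name `normalize` and the uniform fallback for an all-zero input are assumptions not taken from the text.

```python
def normalize(probs):
    """Divide each particle's probability by the sum over all particles."""
    total = sum(probs)
    if total <= 0.0:
        # Assumption of this sketch: fall back to uniform weights
        # when every particle has zero probability.
        return [1.0 / len(probs)] * len(probs)
    return [p / total for p in probs]


weights = normalize([0.2, 0.1, 0.1])
```

After normalization the weights can be used directly as the coefficients of a weighted average.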

(8) Based on the target joint angle θ_{t+1}^{ref}, the target joint torque τ_{t+1}, and the corrected (normalized) probabilities, the control device 1 uses (Equation 3) to calculate the estimated target joint angle θ(hat)_{t+1}^{ref} and the estimated target joint torque τ(hat)_{t+1} at time t+1, which are the command values for the next instant, as the expected values of the per-particle joint angles and joint torques. Here, θ and τ with a hat symbol above them are written θ(hat) and τ(hat), respectively; the same notation is used for other parameters below.
In this way, the command values for the next instant are calculated as an average weighted by the probabilities calculated in (7).
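The weighted average described in step (8) can be sketched as follows. The function name `expected_command` and the example numbers are hypothetical; the weights are assumed to be already normalized to sum to 1.

```python
def expected_command(angles, torques, weights):
    """Probability-weighted average of per-particle angles and torques."""
    theta_hat = sum(w * th for w, th in zip(weights, angles))
    tau_hat = sum(w * ta for w, ta in zip(weights, torques))
    return theta_hat, tau_hat


# Two particles with normalized weights 0.25 and 0.75 (hypothetical values).
theta_hat, tau_hat = expected_command([0.0, 1.0], [2.0, 4.0], [0.25, 0.75])
```

Here `theta_hat` is 0.75 and `tau_hat` is 3.5, both pulled toward the more probable particle.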

The control device 1 performs compliance control of each joint portion 22 based on the estimated target joint angle θ(hat)_{t+1}^{ref} and the estimated target joint torque τ(hat)_{t+1} calculated in (8) above, and on the joint angle θ of each joint portion 22 detected by the angle sensor 24. Taking θ(hat)_{t+1}^{ref} and τ(hat)_{t+1} as the angle command θ_plan and the torque command τ_plan, respectively, the control device 1 calculates the torque command τ_ref using (Equation 4). In (Equation 4), K and D are preset coefficient matrices.
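(Equation 4) is not reproduced in this text, so the following is not that equation. It is a generic single-joint compliance law of the kind the paragraph describes, given only as an assumption: a feedforward torque plus a stiffness term on the angle error and a damping term on the measured angular velocity. The scalar gains `k` and `d` stand in for the matrices K and D, and the function name and arguments are hypothetical.

```python
def compliance_torque(theta_plan, tau_plan, theta, theta_dot, k, d):
    """Assumed compliance law: tau_ref = tau_plan + k*(theta_plan - theta) - d*theta_dot."""
    return tau_plan + k * (theta_plan - theta) - d * theta_dot


# On target and at rest, the command reduces to the planned torque.
tau_on_target = compliance_torque(0.5, 2.0, 0.5, 0.0, k=10.0, d=1.0)
# A 0.1 rad position error adds a restoring torque of about k * 0.1.
tau_with_error = compliance_torque(0.5, 2.0, 0.4, 0.0, k=10.0, d=1.0)
```

With a law of this shape, the ratio between the gains and the planned torque sets how position and force are traded off when both commands cannot be met exactly.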

The control device 1 controls each actuator 25 by sending the calculated torque command τ_ref to the actuator 25 of each joint portion 22.
In addition, the control device 1 performs feedback control of the actuator 25 of the joint portion 22 so that the joint torque τ detected by the force sensor 26 follows the calculated torque command τ_ref.

As described above, when the target trajectory is generated, the target joint angle is sampled based on the hidden Markov model, which includes the joint angle and joint torque of the joint portion 22, and on the joint angle and joint torque detected by the angle sensor 24 and the force sensor 26. The target joint angular velocity and target joint angular acceleration are calculated from the sampled target joint angle, and the target joint torque is calculated by an inverse dynamics calculation based on the target joint angle, the calculated target joint angular velocity and target joint angular acceleration, and the target reaction force, which ensures the dynamic consistency of the robot arm 2. The probability of arriving at that target joint angle and target joint torque is calculated from the angle sensor value and torque sensor value at the current time t and from the hidden Markov model. The expected values of the target joint angle and target joint torque at time t+1 are calculated as an average weighted by the calculated probabilities, and the joint portion 22 of the robot arm 2 is controlled using these expected values.

As a result, the position command value and the force command value of the robot arm 2 can be made consistent with each other. Therefore, even when conditions differ between construction of the hidden Markov model and control of the robot arm, for example when the size or shape of the operation target differs or when the positional relationship between the robot arm 2 and the operation target differs (such as a different table height), situations in which an excessive force is generated between the robot arm 2 and the operation target, or in which the robot arm cannot contact the operation target, can be suppressed.

[resampling]
The control device 1 normalizes the calculated probabilities P to obtain the probability density distribution p_k.
The control device 1 scatters the particles according to the probability density distribution p_k so that they reflect the probability P of each particle, and updates the particles at time t+1. This removes low-probability particles and keeps high-probability ones.
The control device 1 sets the node q_{t+1} of each selected particle s as q_t for the next frame, returns to (1) above, and repeats the process.
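The resampling step can be sketched as a draw with replacement in proportion to the normalized weights. Multinomial resampling via `random.choices` is an assumption of this sketch; the text does not name a specific resampling scheme. Particles are represented here simply by their node index, which is also an illustrative choice.

```python
import random


def resample(particles, weights, rng):
    """Draw len(particles) particles with replacement, proportional to weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


rng = random.Random(0)
# Three particles identified by node index; only the second has nonzero weight.
new_particles = resample([0, 1, 2], [0.0, 1.0, 0.0], rng)
```

In this extreme case every resampled particle is a copy of the only particle with nonzero probability, which shows how low-probability particles drop out.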

FIG. 5 is a flowchart showing the flow of the control method according to Embodiment 1. The control process shown in FIG. 5 is executed repeatedly, for example at predetermined time intervals.
Using the transition probabilities a between the nodes of the hidden Markov model, the control device 1 samples the node q_{t+1} at time t+1 (the next instant) (step S101).

Based on the joint-angle distribution of the sampled node q_{t+1}, the control device 1 samples the target joint angle θ_{t+1}^{ref} at time t+1 (step S102).
The control device 1 takes the first derivative of the sampled target joint angle θ_{t+1}^{ref} to calculate the target joint angular velocity θ(dot)_{t+1}^{ref}, and takes the second derivative of θ_{t+1}^{ref} to calculate the target joint angular acceleration θ(2dot)_{t+1}^{ref} (step S103).

The control device 1 corrects the calculated target joint angular acceleration θ(2dot)_{t+1}^{ref} using (Equation 1) above (step S104).
Based on the calculated target joint angle θ_{t+1}^{ref}, the target joint angular velocity θ(dot)_{t+1}^{ref}, the corrected target joint angular acceleration θ(2dot)_{t+1}^{ref}, and the target reaction force F_{t+1}^{ref}, the control device 1 calculates the target joint torque τ_{t+1} using the inverse dynamics equation (Equation 2) above (step S105).

Based on the joint angle θ_t and joint torque τ_t at the current time t detected by the angle sensor 24 and the force sensor 26, and on the hidden Markov model, the control device 1 calculates the probability P(θ_{t+1}^{ref}, τ_{t+1} | θ_t, τ_t, λ) that the target joint angle θ_{t+1}^{ref} and the target joint torque τ_{t+1} are generated from the hidden Markov model (step S106).

Based on the target joint angle θ_{t+1}^{ref}, the target joint torque τ_{t+1}, and the corrected (normalized) probabilities, the control device 1 uses (Equation 3) above to calculate the estimated target joint angle θ(hat)_{t+1}^{ref} and the estimated target joint torque τ(hat)_{t+1} at time t+1 (the command values for the next instant) as the expected values of the per-particle joint angles and joint torques (step S107).

The control device 1 performs compliance control of each joint portion 22 based on the estimated value θ(hat)_{t+1}^{ref} of the target joint angle and the estimated value τ(hat)_{t+1} of the target joint torque calculated in (8) above, and on the joint angle θ of each joint portion detected by the angle sensor 24 (step S108).
The control device 1 scatters the particles according to the probability density distribution p_k so that the probability P of each particle is reproduced, and updates the particles at time t+1 (step S109).
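One common way to scatter particles according to a discrete probability distribution, as in step S109, is multinomial resampling. This description does not name a specific resampling scheme, so the sketch below is only one possible reading; the function name `resample` is hypothetical.

```python
import random


def resample(particles, weights, rng=random):
    """Multinomial resampling.

    Draws len(particles) particles with replacement, each chosen with
    probability proportional to its weight, so that high-probability
    particles are duplicated and low-probability ones tend to vanish.
    """
    return rng.choices(particles, weights=weights, k=len(particles))


new_particles = resample(["p0", "p1", "p2", "p3"], [0.1, 0.2, 0.3, 0.4])
```

The particle count stays constant across time steps, and the resampled set is treated as equally weighted in the next [prediction].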

As described above, in Embodiment 1, the target joint angle is sampled based on the hidden Markov model, which includes the joint angle and joint torque of the joint portion 22, and on the joint angle and joint torque detected by the angle sensor 24 and the force sensor 26. The target joint angular velocity and the target joint angular acceleration are calculated from the sampled target joint angle, and the target joint torque is calculated by an inverse dynamics calculation based on the target joint angle, the calculated target joint angular velocity and target joint angular acceleration, and the target reaction force. The probability of reaching that target joint angle and target joint torque is calculated from the angle sensor value and the torque sensor value at the current time t and the hidden Markov model. The expected values of the target joint angle and the target joint torque at time t+1 are calculated as an average weighted by the calculated probabilities, and the joint portion 22 of the robot arm 2 is controlled using these expected values.

This keeps the position command value and the force command value of the robot arm 2 consistent with each other. Therefore, even when conditions differ between construction of the hidden Markov model and control of the robot arm, for example when the size or shape of the operation target differs or when the positional relationship between the robot arm 2 and the operation target differs, it is possible to suppress situations in which an excessive force is generated between the robot arm 2 and the operation target or in which the robot arm cannot contact the operation target.

Embodiment 2
In Embodiment 2 of the present invention, the control device 1 may construct the hidden Markov model based on time-series data of the hand position of the end effector 23 (one specific example of position information) and the hand reaction force of the end effector 23 (one specific example of force information). In this case, the control device 1 performs the same processing as [prediction] (1) to (8) and [resampling] of Embodiment 1, with the joint angle and the joint torque replaced by the hand position and the hand reaction force, respectively.

(Learning process)
In the learning process, the control device 1 performs learning using, for example, time-series data of the hand position and hand reaction force of the end effector 23 prepared in advance, and constructs a left-right hidden Markov model by the Baum-Welch method.
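For illustration, the left-right structure can be imposed through the initial transition matrix passed to the training procedure. The sketch below builds such a matrix; the node count, the initial `stay` probability, and the function name are hypothetical, and the Baum-Welch iterations themselves are not shown.

```python
def left_right_transitions(num_nodes, stay=0.5):
    """Initial transition matrix for a left-right hidden Markov model.

    Each node either stays where it is (probability `stay`) or advances
    to the next node; the last node is absorbing. Baum-Welch
    re-estimation leaves entries that start at zero at zero, so this
    topology is preserved during training.
    """
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes - 1):
        a[i][i] = stay
        a[i][i + 1] = 1.0 - stay
    a[num_nodes - 1][num_nodes - 1] = 1.0
    return a


initial_transitions = left_right_transitions(4)
```

Each row sums to 1, and no transition leads back to an earlier node, which matches a motion that progresses in one direction through its phases.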

(Execution process)
The control device 1 repeats the following [prediction] (1) to (8) and [resampling] at predetermined time intervals.

[prediction]
(1) First, the control device 1 holds the node q_t of the hidden Markov model at the current time t, together with the hand position and hand reaction force at the current time t detected by the sensor (together, a particle).

(2) The control device 1 samples the node q_{t+1} at time t+1 (the next moment) using the transition probability a between the nodes of the hidden Markov model.

(3) The control device 1 samples the target hand position (one specific example of target position information) at time t+1 based on the distribution of hand positions held by the sampled node q_{t+1}.

(4) The control device 1 calculates the target hand speed (one specific example of target speed information) by first-order differentiation of the sampled target hand position, and calculates the target hand acceleration (one specific example of target acceleration information) by second-order differentiation of the target hand position.

(5) The control device 1 corrects the calculated target hand acceleration in order to improve control stability. In Embodiment 2, the control device 1 may omit the correction of the target hand acceleration in (5).

(6) The control device 1 calculates a target hand reaction force (one specific example of target force information) that realizes the calculated target hand position, the target hand speed, the corrected target hand acceleration, and the target reaction force.

(7) Based on the hand position and hand reaction force at the current time t detected by the sensor, and on the hidden Markov model, the control device 1 calculates the probability P that the target hand position and the target hand reaction force are generated from the hidden Markov model.
The control device 1 calculates the probability P for each particle. Here, the control device 1 performs the following correction so that the probabilities P of the particles sum to 1. For example, the control device 1 calculates the sum of the probabilities of all particles and divides the probability of each particle by that sum.
The control device 1 normalizes the calculated probability P in this way to obtain the probability density distribution p_k.

(8) Based on the target hand position, the target hand reaction force, and the corrected probability, the control device 1 calculates the estimated value of the target hand position and the estimated value of the target hand reaction force at time t+1 (the command values for the next moment) as the expected values of the hand position and hand reaction force of the individual particles.
The control device 1 performs compliance control of each joint portion 22 based on the estimated value of the target hand position and the estimated value of the target hand reaction force calculated in (8) above, and on the hand position detected by the sensor.
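Steps (2), (3), and (7) of [prediction] above can be sketched as follows. This is a minimal illustration and not the implementation of the control device 1: the transition matrix `A`, the per-node distributions `emissions`, the independent Gaussian form of each node's distribution, and the uniform fallback for a zero total are all assumptions introduced here.

```python
import random


def sample_next_node(current_node, transition, rng=random):
    """Step (2): draw q_{t+1} from row `current_node` of the transition matrix."""
    row = transition[current_node]
    return rng.choices(range(len(row)), weights=row, k=1)[0]


def sample_target_position(node_distribution, rng=random):
    """Step (3): draw a target hand position from the node's distribution.

    node_distribution: list of (mean, std) pairs, one per coordinate.
    Independent Gaussians are an assumption made for this sketch.
    """
    return [rng.gauss(mean, std) for mean, std in node_distribution]


def normalize(probabilities):
    """Step (7): divide each particle's probability by the total."""
    total = sum(probabilities)
    if total <= 0.0:
        # Degenerate case (assumption): fall back to uniform weights.
        return [1.0 / len(probabilities)] * len(probabilities)
    return [p / total for p in probabilities]


# Hypothetical 3-node left-right model and per-node hand-position distributions.
A = [[0.7, 0.3, 0.0],
     [0.0, 0.6, 0.4],
     [0.0, 0.0, 1.0]]
emissions = [[(0.40, 0.01), (0.10, 0.01)],
             [(0.45, 0.01), (0.12, 0.01)],
             [(0.50, 0.01), (0.15, 0.01)]]

q_next = sample_next_node(0, A)
target = sample_target_position(emissions[q_next])
weights = normalize([0.02, 0.06, 0.12])
```

In a full loop these functions would be applied once per particle, with the normalized weights then feeding the expected-value calculation in (8) and the subsequent [resampling].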

[resampling]
The control device 1 scatters the particles according to the probability density distribution p_k so that the probability P of each particle is reproduced, and updates the particles at time t+1.

Embodiment 3
In Embodiment 3 of the present invention, the control device 1 constructs the hidden Markov model based on time-series data of the joint angle and joint torque of each joint portion 22 of the robot arm 2 and the hand reaction force of the end effector 23.

In this case, in [prediction] (3) of Embodiment 1, the control device 1 samples the target joint angle θ_{t+1}^{ref} and the target reaction force at time t+1 based on the distribution of joint angles held by the sampled node q_{t+1}.

Then, in [prediction] (3) above, the control device 1 calculates the target joint torque τ_{t+1} using the equation of motion (inverse dynamics equation) (Equation 3) above, based on the calculated target joint angle θ_{t+1}^{ref}, the target joint angular velocity θ(dot)_{t+1}^{ref}, the corrected target joint angular acceleration θ(2 dots)_{t+1}^{ref}, and the sampled target reaction force F_{t+1}.
Thus, by including the hand reaction force in the hidden Markov model, the target reaction force can be set automatically in the course of [prediction].

In Embodiment 3, the processing other than [prediction] (3) and (6) above is the same as in Embodiment 1, and a detailed description is therefore omitted.

Embodiment 4
In Embodiment 4 of the present invention, the control device 1 constructs the hidden Markov model based on time-series data of the hand speed of the end effector 23 (one specific example of speed information) and the hand reaction force of the end effector 23.

(Learning process)
In the learning process, the control device 1 performs learning using, for example, time-series data of the hand speed and hand reaction force of the end effector 23 prepared in advance, and constructs a left-right hidden Markov model by the Baum-Welch method.

(Execution process)
[prediction]
(1) First, the control device 1 holds the node q_t of the hidden Markov model at the current time t, together with the hand speed and hand reaction force at the current time t detected by the sensor (together, a particle).

(3) The control device 1 samples the target hand speed at time t+1 based on the distribution of hand speeds held by the sampled node q_{t+1}.

(4) The control device 1 calculates the target hand position by first-order integration of the sampled target hand speed, and calculates the target hand acceleration by first-order differentiation of the target hand speed.
The processing other than (1) to (4) above is the same as in Embodiment 2, and a detailed description is therefore omitted.
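Step (4) of this embodiment, in which the sampled speed is integrated once and differentiated once, might be discretized as in the sketch below. The Euler integration, the backward difference, the constant period `dt`, and the function name are assumptions for illustration and are not prescribed by this description.

```python
def integrate_and_differentiate(v_next, v_curr, x_curr, dt):
    """Target position and acceleration from a sampled target speed.

    v_next: sampled target hand speed at time t+1
    v_curr: hand speed at time t
    x_curr: hand position at time t
    dt: control period, assumed constant (hypothetical parameter)
    """
    # First-order integration (Euler step) gives the target position.
    x_next = x_curr + v_next * dt
    # First-order differentiation (backward difference) gives the acceleration.
    a_next = (v_next - v_curr) / dt
    return x_next, a_next


target_position, target_acceleration = integrate_and_differentiate(
    v_next=0.5, v_curr=0.25, x_curr=1.0, dt=0.5)
```

Compared with starting from position, only one differentiation is needed to reach the acceleration, at the cost of position drift accumulating through the integration.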

Embodiment 5
In Embodiment 5 of the present invention, the control device 1 constructs the hidden Markov model based on time-series data of the joint angular velocity and joint torque of each joint portion 22 of the robot arm 2.

(Learning process)
In the learning process, the control device 1 performs learning using, for example, time-series data of the joint angular velocity and joint torque of each joint portion 22 of the robot arm 2 prepared in advance, and constructs a left-right hidden Markov model by the Baum-Welch method.

(Execution process)
[prediction]
(1) First, the control device 1 holds the node q_t of the hidden Markov model at the current time t, together with the joint angular velocity and joint torque at the current time t detected by the sensor (together, a particle).

(3) The control device 1 samples the target joint angular velocity at time t+1 based on the distribution of joint angular velocities held by the sampled node q_{t+1}.

(4) The control device 1 calculates the target joint angle by first-order integration of the sampled target joint angular velocity, and calculates the target joint angular acceleration by first-order differentiation of the target joint angular velocity.
The processing other than (1) to (4) above is the same as in Embodiment 1, and a detailed description is therefore omitted.

Note that the present invention is not limited to the above embodiments and can be modified as appropriate without departing from its spirit.
In the above embodiments, the control device 1 constructs a hidden Markov model and performs control using the constructed hidden Markov model, but the invention is not limited to this. The control device 1 may construct a graphical statistical model in which transition probabilities are set between nodes and each node has a distribution of position or speed information and force information, and may perform control using this statistical model. For example, the control device 1 may construct a graphical statistical model such as a Markov model or a CRF (Conditional Random Field).

In the above embodiments, the control device 1 controls the robot arm 2, but the invention is not limited to this. The control device 1 may control, for example, a robot leg having a plurality of joint portions or a walking support robot that is attached to a person's leg and assists walking, and can control any part of a robot that has a plurality of joint portions. Furthermore, the control device 1 may control not only robots but also machine tools (for example, imitating a person's machining technique in a processing machine that performs force control), automated driving of vehicles and the like (for example, imitating a person's steering-wheel operation), and devices that include motor-driven movable parts.

The present invention can also be realized, for example, by causing the CPU 1a to execute a computer program that performs the processing shown in FIG. 5.

The program can be stored on various types of non-transitory computer readable media and supplied to a computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (for example, flexible disks, magnetic tapes, and hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memories (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (random access memory)).

The program may also be supplied to a computer by various types of transitory computer readable media. Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to a computer via a wired communication path, such as an electric wire or an optical fiber, or via a wireless communication path.

DESCRIPTION OF SYMBOLS: 1 control device, 2 robot arm, 21 link, 22 joint portion, 23 end effector, 24 angle sensor, 25 actuator, 26 force sensor

Claims

A control method comprising:
sampling target position information or target speed information of a movable part at time t+1, based on a statistical model and on position or speed information and force information of the movable part at time t (t is a natural number), the statistical model being constructed based on position or speed information of the movable part in operation and force information on the movable part, having transition probabilities set between nodes, and having, at each node, a distribution of the position or speed information and the force information;
calculating the target position information or target speed information of the movable part, and target acceleration information, at time t+1 based on the sampled target position information or target speed information;
performing an inverse dynamics calculation based on the calculated target position information, target speed information, and target acceleration information at time t+1 to calculate target force information of the movable part at time t+1;
calculating, based on the position information and force information at time t and the statistical model, a probability that the calculated target position information and target force information at time t+1 are generated from the statistical model;
calculating expected values of the target position information and the target force information at time t+1 based on the calculated target position information and target force information at time t+1 and the calculated probability; and
controlling the movable part based on the calculated expected values of the target position information and the target force information.