KR101234289B1

KR101234289B1 - Cognitive system realizing contextual human robot interaction for service robot and method thereof

Info

Publication number: KR101234289B1
Application number: KR1020110017697A
Authority: KR
Inventors: 안현식
Original assignee: 동명대학교산학협력단
Priority date: 2011-02-28
Filing date: 2011-02-28
Publication date: 2013-02-18
Also published as: KR20120098030A

Abstract

A cognitive system capable of contextual interaction for a service robot is disclosed. The system includes a sensing unit for sensing information external to the robot, an acting unit for acting on the outside of the robot, a storage unit for storing the robot's memory, and an inference management unit that expresses the information detected by the sensing unit as sentences, stores them in the storage unit, reasons over the stored memory, and causes the acting unit to act externally according to the inference result. This enables contextual interaction by the service robot.

Description

Cognitive system realizing contextual human robot interaction for service robot and method

The present invention relates to robot technology, and more particularly to cognitive technology that allows a robot to communicate with humans.

As robot technology advances, interest is growing in service robots that share everyday life with humans in their living spaces. Grounding has been studied from the standpoint of human-robot interaction: Roy modeled the visual object information obtained by a robot through mental imagery and made it possible to converse about it, and Siskind described the movements of objects and hands in video as linguistic expressions with mechanical (dynamic) meaning. However, these studies do not offer a sentence-level grounding model that can handle temporal and spatial situations at the same time, and storing grounded information as memory and recalling it has not yet been studied.

Meanwhile, cognitive science has studied cognitive architectures, which model cognition in connection with the structure of the human mind. ACT-R at Carnegie Mellon University and SOAR at the University of Michigan are examples of research aimed at understanding and simulating human cognition. These studies aim primarily to imitate human cognitive functions as closely as possible and do not sufficiently consider efficiency when applied to robots. They are therefore difficult to use as a robot's cognitive system, and no invention has yet addressed context-based interaction between humans and robots.

To become a service robot that shares everyday life with humans, a robot must share experiences with humans and interact on the basis of spatiotemporal context. Human-robot interaction requires functions such as sensing from the outside, acting on the outside, memory of cognitive information, and context-based recall. This in turn requires describing and storing cognitive information in an appropriate form. To use language for dialogue, the robot must also understand human language and speak when needed. These functions necessarily require combining or linking language with cognitive information, from which information for contextual interaction can be derived. A cognitive system is therefore needed that expresses cognitive information such as sensing, action, and memory in language and combines language uttered by humans with that cognitive information.

An object of the present invention is to provide a technical solution that enables context-based interaction between humans and robots.

To achieve the above technical object, a cognitive system capable of contextual interaction for a service robot according to one aspect of the present invention includes: a sensing unit for sensing information external to the robot; an acting unit for acting on the outside of the robot; a storage unit for storing the robot's memory; and an inference management unit that expresses the information detected by the sensing unit as sentences and stores them in the storage unit, reasons over the memory stored in the storage unit, and causes the acting unit to act externally according to the inference result.

Meanwhile, a cognitive method for contextual interaction of a service robot according to one aspect of the present invention includes: interpreting a sentence into which speech has been converted by the robot's listening module; storing the interpreted sentence in the robot's storage unit; inferring the content of the interpreted sentence; and executing an action based on the inference result.

A cognitive method for contextual interaction of a service robot according to another aspect of the present invention includes: determining, for an object recognized in a camera image input to the robot, whether information about the same object is registered in the object descriptor of the robot's storage unit; if it is not registered, newly registering information about the recognized object in the object descriptor; and generating a sentence for the new object recognition event and storing it in the storage unit.

The present invention enables contextual interaction by a service robot through the ability to express every event as a sentence, remember it, and recall it when needed. In addition, by providing auxiliary cognitive information such as spatial information alongside the linguistic expression, the invention can further enhance the contextual interaction of the service robot.

FIG. 1 is a conceptual diagram of a cognitive system according to an embodiment of the present invention.
FIG. 2 is a block diagram of a cognitive system according to an embodiment of the present invention.
FIG. 3 is a block diagram of an interpreter according to an embodiment of the present invention.
FIG. 4 is an example of spatial reasoning by a spatial reasoner according to an embodiment of the present invention.
FIG. 5 is a flowchart of a cognitive method for events occurring in the visual module of the cognitive system according to an embodiment of the present invention.
FIG. 6 is a flowchart of a cognitive method for events occurring in the listening module of the cognitive system according to an embodiment of the present invention.
FIG. 7 is a flowchart of a cognitive method for events occurring in the action module of the cognitive system according to an embodiment of the present invention.
FIG. 8 is a flowchart of a cognitive method for events occurring in the speech module of the cognitive system according to an embodiment of the present invention.

The foregoing and additional aspects of the present invention will become more apparent through the preferred embodiments described with reference to the accompanying drawings. The invention is described in detail below through these embodiments so that those skilled in the art can readily understand and reproduce it.

FIG. 1 is a conceptual diagram of a cognitive system according to an embodiment of the present invention.

FIG. 1 is a conceptual diagram of the cognitive system a robot needs in order to interact with humans, and shows cognitive information linked to sentences. Information sensed or heard from outside is expressed as sentences and then stored or reasoned over, and outward actions and utterances are likewise executed on the basis of sentences. In other words, cognitive information is linked to language, sentences are stored in memory as the basic unit, and they are reasoned over and recalled when needed. Reasoning over stored sentences alone has limits, however: when memory consists only of a sequence of sentences, it is hard to reason about cognitive information that is difficult to put into sentences. The cognitive system according to the present invention is therefore sentence-based but uses additional cognitive information, such as visual information, as an auxiliary means so that reasoning becomes more effective. This requires a linguistic expression of cognitive information and a process for linking it to the auxiliary cognitive information.

The approach of FIG. 1 has the following advantages. First, because dialogue is itself linguistic, a linguistic representation of cognitive information can hold spoken or heard information as it is, with no need to encode it in another form. Second, not only the sensing and action information of the robot but also its internal reasoning can be expressed in language, in the form of a monologue. Third, by storing sentences in order together with their time tags, cognitive information can be stored chronologically and recalled when needed. Fourth, a sentence is the smallest unit that can express the motion of a robot or an object; it is semantically complete and can serve as the basic unit of an event.

FIG. 2 is a block diagram of a cognitive system according to an embodiment of the present invention.

As shown, the cognitive system includes a sensing unit 100, an acting unit 200, a storage unit 300, and an inference management unit 400. The sensing unit 100 senses information external to the robot. In one embodiment, the sensing unit 100 includes at least one of a visual module 110, a sensor module 130, and a listening module 120. The visual module 110 is connected to a camera and acquires the images input through the camera. The listening module 120 is connected to a microphone and acquires external speech input through the microphone. The listening module 120 may include a speech-to-text (STT) function for converting the speech input through the microphone into sentences. The sensor module 130 acquires data sensed by one or more sensors. Several sensors with different purposes may be connected to the sensor module 130, and there may be several sensors of the same kind. Examples of sensors include a tactile sensor, a temperature sensor, and a humidity sensor. Through the visual module 110, the listening module 120, and the sensor module 130, the sensing unit 100 can sense the situation outside the robot.

The acting unit 200 includes a motion module 210 and a speech module 220. The motion module 210 drives the actuators used for the robot's motion. Through the motion module 210, the robot can move its arms and legs and can also move its face to produce expressions that convey emotion. The speech module 220 produces the robot's utterances and is connected to a speaker. Preferably, the speech module 220 includes a text-to-speech (TTS) function for converting sentences into speech.

The storage unit 300 stores the robot's memory. The storage unit 300 includes one or more memories. The memory is preferably non-volatile and may be flash memory. The storage unit 300 includes a sentence storage unit 310, an object descriptor 320, and a motion descriptor 330. These three may each be implemented in physically separate memories, or they may be stored in one physical memory and separated logically. The sentence storage unit 310 stores events related to the robot, expressed as sentences. The object descriptor 320 stores information when the visual module 110 detects a new object or a change in an object's pose, and the motion descriptor 330 stores the motions the robot performs and their order.

The inference management unit 400 expresses the information detected by the sensing unit 100 as sentences and stores them in the storage unit 300, reasons over the memory stored in the storage unit 300, and acts externally through the acting unit 200 according to the inference result. It may be implemented with a computer processor. The inference management unit 400 includes an interpreter 410 for expressing all information delivered from the sensing unit 100 as sentences and for acting outside the robot, a knowledge reasoner 430 for inferring knowledge and information about the occurrence of events, and a spatial reasoner 420 for spatial reasoning and for representing the current spatial situation.

The interpreter 410 expresses every piece of information produced by the sensing unit 100 or the acting unit 200, that is, every event, as a sentence, attaches a time tag so that it is known when the event occurred, and stores the result in the sentence storage unit 310 of the storage unit 300. This is what makes the robot's contextual interaction with humans possible. Table 1 shows an example in which information input through the visual module 110 is expressed as a sentence and stored in the sentence storage unit 310.

No. | Time | Module | Sentence
1 | year-month-day hour:minute:second | Vision | (S (NP_AGENT Tom) (VP gave (NP_THEME an apple) (PP_BENEFICIARY to (NP Peter)) (PP_TIME at (NP 6))) .)
2 | ... | ... | ...
3 | ... | ... | ...
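The time-tagged storage illustrated by Table 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class names `Event` and `SentenceStore`, the `replay` method, and the second stored sentence are assumptions introduced here.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Event:
    number: int      # running record number, as in Table 1
    time: datetime   # time tag attached when the event is stored
    module: str      # module that produced the event, e.g. "vision"
    sentence: str    # the event expressed as a (parsed) sentence


@dataclass
class SentenceStore:
    """Hypothetical stand-in for the sentence storage unit 310."""
    events: List[Event] = field(default_factory=list)

    def add(self, time: datetime, module: str, sentence: str) -> Event:
        event = Event(len(self.events) + 1, time, module, sentence)
        self.events.append(event)
        return event

    def replay(self, start: datetime, end: datetime) -> List[Event]:
        """Return events whose time tag lies in [start, end], oldest first."""
        hits = [e for e in self.events if start <= e.time <= end]
        return sorted(hits, key=lambda e: e.time)


store = SentenceStore()
store.add(datetime(2011, 2, 28, 18, 0, 0), "vision",
          "(S (NP_AGENT Tom) (VP gave (NP_THEME an apple) "
          "(PP_BENEFICIARY to (NP Peter)) (PP_TIME at (NP 6))) .)")
# The next sentence is invented purely to have a second record to query.
store.add(datetime(2011, 2, 28, 18, 5, 0), "listening",
          "(S (NP_AGENT Peter) (VP ate (NP_THEME an apple)) .)")
```

Keeping the time tag as a separate field rather than inside the sentence is a design choice of this sketch; it lets a time-range query such as `store.replay(...)` run without parsing any sentence.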

To store sentences as in Table 1, the interpreter 410 includes a visual interpreter 411 that interprets the information input through the visual module 110 so that it can be expressed as a sentence. The interpreter 410 further includes a sensor interpreter 412 for interpreting information input through the sensor module 130, such as tactile, temperature, and humidity information. The interpreter 410 also includes a sentence generator 413 for expressing the content interpreted by the visual interpreter 411 and the sensor interpreter 412 as sentences.

When the visual interpreter 411 has interpreted the content of a camera image, the sentence generator 413 generates a sentence from the interpreted content. Visual information is thus expressed as a sentence by the visual interpreter 411 and the sentence generator 413. This sentence may be stored in the sentence storage unit 310 as it is, or the result of syntactic and semantic parsing may be stored instead. For this purpose, the sentence interpreter 414 includes a syntactic parser 414A and a semantic parser 414B. Software available from the field of natural language processing may be used as the syntactic parser 414A and the semantic parser 414B.

As an example, the syntactic parser 414A parses a sentence according to the Penn Treebank conventions; Table 2 shows the Penn Treebank tag set. The syntactic parser 414A automatically determines which tag in the tag set applies to each phrase of the sentence and labels the phrase with its type.

No. | Tag | Description
1 | ADJP | Adjective phrase
2 | ADVP | Adverb phrase
3 | NP | Noun phrase
4 | PP | Prepositional phrase
5 | S | Simple declarative clause
6 | SBAR | Clause introduced by subordinating conjunction or 0
7 | SBARQ | Direct question introduced by wh-word or wh-phrase
8 | SINV | Declarative sentence with subject-aux inversion
9 | SQ | Subconstituent of SBARQ excluding wh-word or wh-phrase
10 | VP | Verb phrase
11 | WHADVP | wh-adverb phrase
12 | WHNP | wh-noun phrase
13 | WHPP | wh-prepositional phrase
14 | X | Constituent of unknown or uncertain category
Null elements
1 | * | "Understood" subject of infinitive or imperative
2 | 0 | Zero variant of that in subordinate clauses
3 | T | Trace: marks position where moved wh-constituent is interpreted
4 | NIL | Marks position where preposition is interpreted in pied-piping contexts

The semantic parser 414B assigns a semantic role to each argument at the phrase level of the sentence. In natural language processing research, semantic roles are broadly divided into arguments and adjuncts. Table 3 below shows the types of argument and adjunct tags.

Argument tags
No. | Role | Description
1 | Agent | That which intentionally causes the action described by the predicate
2 | Theme | That which undergoes change due to the action described by the predicate
3 | Experiencer | That which experiences the mental or psychological state described by the predicate
4 | Instrument | The instrument used in the action described by the predicate
5 | Locative | Where the action described by the predicate is directed
6 | Goal | Where the action described by the predicate is directed
7 | Source | Where the entity has moved from in the action described by the predicate
8 | Benefactive | That which benefits from the action described by the predicate
9 | Causer | That which causes an event or action
10 | Causee | That which is caused to act by the causer
Adjunct tags
No. | Role | Description
1 | Time | When the action took place
2 | Manner | How the action is performed
3 | Reason | Reason for the action
4 | Purpose | Purpose of the action
5 | Location | Place where the action took place
6 | Direction | Direction of the action

The operation of the syntactic parser 414A and the semantic parser 414B is illustrated with the following sentence.

Tom gave an apple to Peter at 6

Passing the sentence through the syntactic parser 414A yields a phrase-level analysis centered on the verb of the sentence.

(S (NP Tom) (VP gave (NP an apple) (PP to (NP Peter)) (PP at (NP 6))) .)

Passing this result through the semantic parser 414B assigns a semantic role to each phrase of the sentence.

(S (NP_AGENT Tom) (VP gave (NP_THEME an apple) (PP_BENEFICIARY to (NP Peter)) (PP_TIME at (NP 6))) .)

Sentences are stored in the sentence storage unit 310 after the syntactic and semantic parsing described above. Alternatively, the plain sentence may be stored as it is and parsed syntactically and semantically only when needed.
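To make a stored, role-tagged parse usable for later reasoning, it has to be read back into a structure. The following is a minimal sketch of such a reader, assuming the bracketed notation shown above; the function names `parse_brackets`, `words`, and `roles` are hypothetical, and this is not a substitute for a real syntactic or semantic parser.

```python
import re
from typing import Dict, List, Optional, Union

Tree = List[Union[str, "Tree"]]


def parse_brackets(text: str) -> Tree:
    """Read a bracketed parse string into nested lists: [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read() -> Tree:
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        node: Tree = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(read())
            else:
                node.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ")"
        return node

    return read()


def words(node: Tree) -> List[str]:
    """Collect the leaf words under a node, skipping the node label."""
    out: List[str] = []
    for child in node[1:]:
        out.extend(words(child) if isinstance(child, list) else [child])
    return out


def roles(node: Tree, found: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Map each role suffix (AGENT, THEME, ...) to the words it covers."""
    found = {} if found is None else found
    label = node[0]
    if "_" in label:
        found[label.split("_", 1)[1]] = " ".join(words(node))
    for child in node[1:]:
        if isinstance(child, list):
            roles(child, found)
    return found


parsed = parse_brackets(
    "(S (NP_AGENT Tom) (VP gave (NP_THEME an apple) "
    "(PP_BENEFICIARY to (NP Peter)) (PP_TIME at (NP 6))) .)"
)
role_map = roles(parsed)
```

For the example sentence, `role_map` holds the agent, theme, beneficiary, and time phrases keyed by role name, which is the kind of lookup a memory query needs.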

Meanwhile, the spatial reasoner 420 represents, in the world coordinate system, the arrangement of objects in the space where the robot is located and, when a past event needs to be recalled, reasons about the spatial situation of that past memory. FIG. 4 shows an example of the spatial reasoner. To represent the current spatial situation, the visual module 110 recognizes the shapes of objects from the camera connected to it, and the result is displayed in the spatial reasoner 420. The spatial reasoner 420 then constructs the world-coordinate space for the current state on the basis of the visual recognition results stored so far in the storage unit 300. Objects are segmented in the camera image, and the shape and pose of each object are determined. Here the pose is the position and angle of the object in the world coordinate system: (x, y, θ) in two dimensions and, in three dimensions, (x, y, z, α, β, γ), consisting of position and angle information. When the spatial reasoner 420 reasons about a past memory, it retrieves the shape and pose information of objects from the object descriptor 320 to build the past memory space, and determines spatial relations such as front, back, left, and right relative to the center position of each object.
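The front, back, left, and right judgment can be sketched for the two-dimensional case as follows. The patent only states that relations are determined relative to an object's center position; treating θ as a heading in radians measured counter-clockwise from the world x-axis, and defining "front" as the direction of that heading, are assumptions of this sketch, as are the names `Pose2D` and `relation`.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose2D:
    x: float
    y: float
    theta: float  # assumed: heading in radians, counter-clockwise from world x-axis


def relation(reference: Pose2D, target: Pose2D) -> str:
    """Classify where `target` lies as seen from the center of `reference`."""
    dx, dy = target.x - reference.x, target.y - reference.y
    # Rotate the world-frame offset into the reference object's own frame.
    forward = dx * math.cos(reference.theta) + dy * math.sin(reference.theta)
    left = -dx * math.sin(reference.theta) + dy * math.cos(reference.theta)
    if abs(forward) >= abs(left):
        return "front" if forward >= 0 else "back"
    return "left" if left >= 0 else "right"


peter = Pose2D(0.0, 0.0, 0.0)   # illustrative poses, not taken from the patent
tom = Pose2D(-1.0, 0.0, 0.0)
```

With these illustrative poses, `relation(peter, tom)` classifies Tom as being at the back of Peter, which is the kind of spatial fact the query "When did Tom go to the back of Peter" relies on.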

The knowledge reasoner 430 performs inference on knowledge arising inside the cognitive system. Consider a request from outside to recall knowledge: when a human asks the robot about a past memory, the listening module 120 of the sensing unit 100 converts the human's speech into a sentence, and the syntactic parser 414A and semantic parser 414B of the interpreter 410 parse it syntactically and semantically. The query content produced by semantic parsing is looked up in the storage unit 300 by the knowledge reasoner 430 and, as needed, is spoken by the speech module 220 or turned into motion through the motion module 210. The knowledge reasoner 430 therefore takes the form of IF-THEN production rules, in which the IF part is the condition and the THEN part is the result of the inference. As an example, consider the following query.

When did Tom go to the back of Peter

In the knowledge reasoner 430, the condition part of a rule describes the situation, and the result part calls a function as the outcome for that condition.

IF (WHADVP TIME) (VP go) (TENSE PAST) (1PP1 to) (LOCATION the back of Peter);

THEN CALL search(TIME, AGENT, 1PP1, LOCATION);
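The IF-THEN rule above can be sketched as a small production-rule matcher. This is an illustration under stated assumptions: the query is represented as a flat dictionary of tag/value pairs, `MEMORY` holds invented records, and `search` here takes a simplified argument list that differs from the `search(TIME, AGENT, 1PP1, LOCATION)` signature named in the rule.

```python
from typing import Callable, Dict, List, Optional

Frame = Dict[str, str]

# Illustrative memory records; the fields and values are assumptions.
MEMORY: List[Frame] = [
    {"TIME": "at 5", "AGENT": "Tom", "VP": "go", "LOCATION": "the front of Peter"},
    {"TIME": "at 6", "AGENT": "Tom", "VP": "go", "LOCATION": "the back of Peter"},
]


def search(agent: str, verb: str, location: str) -> Optional[str]:
    """Return the time tag of the first stored event matching the query."""
    for record in MEMORY:
        if (record["AGENT"], record["VP"], record["LOCATION"]) == (agent, verb, location):
            return record["TIME"]
    return None


class Rule:
    def __init__(self, condition: Frame, action: Callable[[Frame], Optional[str]]):
        self.condition = condition  # IF part: tag/value pairs that must all match
        self.action = action        # THEN part: function called on a match

    def matches(self, query: Frame) -> bool:
        return all(query.get(tag) == value for tag, value in self.condition.items())


RULES = [
    Rule({"WHADVP": "TIME", "VP": "go", "TENSE": "PAST"},
         lambda q: search(q["AGENT"], q["VP"], q["LOCATION"])),
]


def infer(query: Frame) -> Optional[str]:
    """Fire the first rule whose condition matches the query."""
    for rule in RULES:
        if rule.matches(query):
            return rule.action(query)
    return None


query = {"WHADVP": "TIME", "AGENT": "Tom", "VP": "go", "TENSE": "PAST",
         "LOCATION": "the back of Peter"}
answer = infer(query)
```

The returned value would then be handed to a sentence generator for a spoken reply; that step is outside this sketch.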

The inference management unit 400 takes the result inferred by at least one of the knowledge reasoner 430 and the spatial reasoner 420 and either has the sentence generator 413 produce a sentence, which the speech module 220 converts to speech and outputs through the speaker, or combines the robot motions and motion order described in the motion descriptor 330 and outputs them to the action module, which drives the robot's actuators so that the robot moves. The inference management unit 400 may also perform both speech and motion according to the inference result.

The object descriptor 320 is described below with reference to Table 4.

Object | Name | Shape | Feature | Current pose
O1 | Tom | | Blue | x1, y1, θ1
O2 | Peter | | Red | x1, y2, θ2
H1 | Kim | | Male | x3, y3, θ3
H2 | Park | | Female | x4, y4, θ4

The object descriptor 320 is a data space that stores data about objects. It stores the shape and pose of each object recognized by the visual module 110, the features of the object, and the most recent pose known to the cognitive system.

The motion descriptor 330 is described below with reference to Table 5.

No. | Sentence | Function
S1 | I found at x,y,θ | find(X)
S2 | I approach X | approach(X)
S3 | I open hand | openHand()
S4 | I reach to grip position of X | reachGripPosition(X)
S5 | I grip X | B(grip X)
S6 | I bend arm | B(bend arm)

The motion descriptor 330 is the set of basic motions underlying the robot's behavior. Table 5 lists the motions the robot performs when it receives a command to pick up an object, that is, the command "hold X". Each sentence is a basic motion, and their combination forms one complete, meaningful action.

FIG. 5 is a flowchart of a method for recognizing an event occurring in the visual module of the cognitive system according to an embodiment of the present invention.

The visual module 110 continuously receives images from a camera and, when the incoming camera image changes, recognizes the objects in the image (S510)(S520). Recognizing an object here means recognizing its position, rotation, and shape. When recognition is complete, the inference management unit 400 compares the recognized object with the object data in the object descriptor 320 (S530) and determines whether the same object is already described there (S540). If the same object exists, the inference management unit 400 compares the posture of the recognized object with the stored posture to determine whether they are the same (S550). If the postures differ, the inference management unit 400 updates the posture information in the object descriptor 320 (S560). If the same object does not exist, the recognized object is one the robot has encountered for the first time, so the inference management unit 400 registers it in the object descriptor 320 (S570). Whenever the object descriptor 320 changes, a new event has occurred in the visual module, so the event is expressed as a sentence and registered in the sentence storage unit 310 (S580)(S590). The inference management unit 400 uses the visual interpreter 411 to express the event as a sentence.
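The flow of steps S530 to S590 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and function names, the shape-signature matching, and the tolerance `tol` are assumptions made for illustration, and the sentence for a posture change simply follows the "went to" wording used in the example log later in the description.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectEntry:
    shape: str              # hypothetical shape signature used for matching
    pose: tuple             # (x, y, theta)
    name: str = ""          # stays empty until a human supplies a label

@dataclass
class Memory:
    objects: list = field(default_factory=list)    # stands in for object descriptor 320
    sentences: list = field(default_factory=list)  # stands in for sentence storage unit 310

def fmt(pose):
    """Render a pose as 'x,y,theta' without trailing zeros."""
    return ",".join(f"{v:g}" for v in pose)

def on_recognized(mem, shape, pose, time_tag, tol=1.0):
    """Handle one recognized object; return the stored sentence, or None if nothing changed."""
    match = next((o for o in mem.objects if o.shape == shape), None)  # S530, S540
    if match is None:
        mem.objects.append(ObjectEntry(shape, pose))                  # S570: register new object
        sentence = f"A new object appeared at {fmt(pose)}"
    elif any(abs(a - b) > tol for a, b in zip(match.pose, pose)):     # S550: posture differs
        match.pose = pose                                             # S560: update posture
        sentence = f"{match.name or 'An object'} went to {fmt(pose)}"
    else:
        return None                                                   # same object, same posture
    mem.sentences.append((time_tag, "visual", sentence))              # S580, S590
    return sentence
```

The sketch compares angles numerically and ignores wrap-around at ±180 degrees, which a real system would need to handle.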

The method for registering a sentence in the sentence storage unit 310 is described in detail as follows. When the visual module 110 detects the shape and posture of an object, the visual interpreter 411 compares it with the objects stored in the object descriptor 320 to decide whether an event has occurred. If the object has changed, the sentence generator 413 turns the event into a sentence and stores it in the sentence storage unit 310. Events fall into two broad cases: a new object appears, or the posture of a stored object changes. The appearance of a new object can be expressed as

A new object appeared at x1,y1,θ1

Here, 'A new object' refers to any new object, 'appeared' is the verb indicating that a new object has entered the current scene in the spatial reasoner 420, and 'at' introduces the posture of the object. 'A new object', 'appeared', and 'at' are generated automatically by the sentence generator, and (x1,y1,θ1) is filled in with the values recognized by the visual module.

The sensor module covers every sensor attached to the robot other than recognition of human speech. Even for sound, natural language recognition is the role of the listening module, whereas external noise, other sounds, and the loudness or mood of a human voice all belong to the sensor module. Sensor information is recognized and interpreted in much the same way as in the visual module: the information sensed from the outside is interpreted, and the result is generated as a sentence. The sensed information consists mainly of the type of sensor, the intensity of the sensed signal, its temporal continuity, and its characteristics over time. This information is recognized by the sensor module 130, the sensor interpreter defines its meaning, and the sentence generator produces the sentence. For example, a light tap on the tactile sensor is expressed as

Someone touches me.

and a hard hit on the tactile sensor is expressed as

Someone beats me.

Here, 'Someone' denotes an agent that has not yet been identified; the agent may be a human, a robot, or an animal. 'me' indicates that the target is the robot itself rather than another agent. The choice between 'touches' and 'beats' is made by the sensor interpreter according to whether the intensity of the tactile sensor signal exceeds a predefined threshold.
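The threshold decision described above can be sketched in a few lines of Python. The two threshold values, the normalized intensity scale, and the function name are hypothetical; the description only states that a predefined criterion separates 'touches' from 'beats'.

```python
NOISE_FLOOR = 0.1      # hypothetical: readings below this are ignored as noise
BEAT_THRESHOLD = 0.6   # hypothetical predefined criterion separating a tap from a hit

def interpret_tactile(intensity):
    """Map a normalized tactile intensity (0.0 to 1.0) to an event sentence, or None."""
    if intensity < NOISE_FLOOR:
        return None
    verb = "beats" if intensity > BEAT_THRESHOLD else "touches"
    return f"Someone {verb} me."
```

For instance, `interpret_tactile(0.3)` yields the 'touches' sentence and `interpret_tactile(0.9)` yields the 'beats' sentence.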

FIG. 6 is a flowchart of a method for recognizing an event occurring in the listening module of the cognitive system according to an embodiment of the present invention.

The listening module 120 receives speech through a microphone (S610) and converts it into a sentence using its built-in speech recognition function (S620). The inference management unit 400 interprets the meaning of the converted sentence using the sentence interpreter 414, preferably by performing syntactic analysis followed by semantic analysis (S630). The inference management unit 400 stores the analyzed sentence in the sentence storage unit 310 (S640), attaching a time tag so that it can later be determined when the event behind the sentence occurred. The inference management unit 400 then performs inference on the meaning of the sentence using at least one of the spatial reasoner 420 and the knowledge reasoner 430, following the meaning of each argument (S650). When inference is complete, the inference management unit 400 outputs the result to the motion module 210 and/or the speech module 220 to carry out the corresponding action (S660).
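Steps S630 to S660 can be pictured as a small pipeline. The Python sketch below is only an illustration under stated assumptions: `parse` is a toy stand-in for the syntactic and semantic analysis and handles only simple imperatives, the role names follow the IF-THEN example given later in the description, and `reason` and `act` are hypothetical callbacks standing in for the reasoners and the acting unit.

```python
def parse(sentence):
    """Toy stand-in for S630: split a simple imperative into verb and object roles."""
    words = sentence.rstrip(".?!").split()
    return {"VP_IMPERATIVE": words[0].lower(), "OBJ": " ".join(words[1:]).lower()}

def handle_utterance(text, store, reason, act, time_tag):
    """Interpret, store with a time tag, infer, and act on one recognized sentence."""
    parsed = parse(text)                                   # S630: interpret the sentence
    store.append((time_tag, "listening", text, parsed))    # S640: store with time tag
    result = reason(parsed)                                # S650: inference
    if result is not None:
        act(result)                                        # S660: execute the action
    return result
```

Keeping the time tag next to each stored sentence is what later allows questions about when an event happened to be answered from memory.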

FIG. 7 is a flowchart of a method for recognizing an event occurring in the action module of the cognitive system according to an embodiment of the present invention.

The inference management unit 400 uses the knowledge reasoner 430 to issue an action command according to a production rule and, as a result, looks up the corresponding motion in the motion descriptor 330 (S710)(S720). Based on the lookup result, the inference management unit 400 drives the actuators through the motion module 210 to execute the motion (S730).

FIG. 8 is a flowchart of a method for recognizing an event occurring in the speech module of the cognitive system according to an embodiment of the present invention.

When a new object is found, the inference management unit 400 confirms that no label for the object exists in the object descriptor 320 (S810). It then has the knowledge reasoner 430 compose an utterance sentence, which the speech module 220 converts to speech and outputs (S820)(S830).

Concrete examples of the motion module and the speech module, which make up the acting unit, are given below, beginning with the motion module and FIG. 7. When the inference management unit requests a robot motion, the motion module executes it using the robot's actuators, such as motors. For example, suppose a human gives the following command through the listening module.

Hold the cup.

The listening module turns this into a sentence, and the sentence interpreted by the sentence interpreter 414 is stored in the sentence storage unit 310. At the same time, the sentence interpreter 414 passes its interpretation to the knowledge reasoner 430, which, as described above, performs IF-THEN inference on the situation:

IF (VP_IMPERATIVE hold) (OBJ the cup)

THEN CALL hold(the cup)

As a result of this inference, the motion function hold(the cup) is called. Calling 'hold X' from the motion descriptor, as in Table 5, in turn calls the functions of the basic motions that make up 'hold X', so the robot picks up the cup step by step.
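A minimal Python sketch of this rule firing and of expanding 'hold X' into the basic motions of Table 5 is shown below. The rule table format and the names `grip` and `bendArm` are assumptions made for illustration (Table 5 lists those two entries as B(grip X) and B(bend arm)), and the primitives are returned as strings rather than driving real actuators.

```python
# Basic motions of 'hold X' in the order of Table 5; the flag says whether the
# primitive takes the target object as its argument.
HOLD_X = [
    ("find", True),
    ("approach", True),
    ("openHand", False),
    ("reachGripPosition", True),
    ("grip", True),        # hypothetical name for B(grip X)
    ("bendArm", False),    # hypothetical name for B(bend arm)
]

def hold(obj):
    """Expand 'hold X' into its sequence of primitive calls, rendered as strings."""
    return [f"{name}({obj})" if takes_obj else f"{name}()" for name, takes_obj in HOLD_X]

# Each rule pairs an IF condition on the interpreted sentence with a THEN action.
RULES = [
    ({"VP_IMPERATIVE": "hold"}, hold),   # IF (VP_IMPERATIVE hold) (OBJ x) THEN CALL hold(x)
]

def infer(parsed):
    """Fire the first rule whose condition matches; return its result, or None."""
    for condition, action in RULES:
        if all(parsed.get(key) == value for key, value in condition.items()):
            return action(parsed["OBJ"])
    return None
```

With the interpreted command `{"VP_IMPERATIVE": "hold", "OBJ": "the cup"}`, `infer` returns the six primitive calls in order, starting with `find(the cup)`.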

Next, an example for the speech module is given with reference to FIG. 8. The speech module outputs speech to the outside at the request of the inference management unit. It is triggered in two main ways: by a request from inside the cognitive system or by a request from the outside world. As an example of an internal request, when a new object appears, information about it is stored in the object descriptor as described above. At that point the label slot of the stored entry in the object descriptor 320 is empty, so the robot cannot yet talk with a human about the object. The robot therefore asks a human for the object's name as soon as it finds a new object. To do this, the knowledge reasoner 430 checks the state of the object descriptor 320 at regular intervals and, if an entry without a name exists, has the speech module utter a sentence asking for the name.

IF label(Obj) == 0;

THEN CALL sentence(Obj);

For example, for a round orange object, the knowledge reasoner uses the function 'sentence(Obj)' through the sentence generator to produce the following sentence.

What is the new orange object
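The periodic label check can be sketched as follows. This is an illustration only: the dictionary layout of an object descriptor entry and the use of the color feature in the question are assumptions, chosen so that the output matches the example sentence above.

```python
def sentence(obj):
    """Build the naming question for an unlabeled object from its stored feature."""
    return f"What is the new {obj['feature']} object"

def check_labels(object_descriptor):
    """Apply 'IF label(Obj) == 0 THEN CALL sentence(Obj)' to every stored entry."""
    return [sentence(obj) for obj in object_descriptor if not obj.get("label")]
```

In a running system this check would be called on a timer, and each returned question would be handed to the speech module.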

The next example is speech triggered by a request from the outside world, typically an answer to a question from a human. Suppose the following sentence is input to the listening module from the outside world.

When did Ele go to the back of Pupy

This sentence asks the spatial reasoner to perform inference over past events and to output the result through the speech module. 'When' asks for the time at which the event occurred, 'go' denotes the event of the object 'Ele' moving, 'to' indicates the direction of the movement, and "the back of Pupy" is the target location. The spatial reasoner uses the sentence memory and the object descriptor to infer the relations between objects while moving the point of view from the present back into the past. Comparing the position of 'Pupy' recorded in the sentence memory with the past position information, it looks for the event in which 'Ele' moved behind the center position of 'Pupy' and for the time at which that event occurred. Once the event is found, the sentence generator produces the following sentence, which is spoken through the speech module.

Ele went to the to the back of Pupy at 10:51

An example of the contents of the sentence storage unit for this sequence of events follows. Only the raw sentences, before they pass through the interpreter, are shown.

| Number | Time | Module | Sentence |
|---|---|---|---|
| 1 | 9-10:40:15 | Visual | A new object appeared at 129,51,-8.1 |
| 2 | 9-10:40:18 | Speech | what is the new orange object |
| 3 | 9-10:40:29 | Listening | It is Pupy |
| 4 | 9-10:40:49 | Visual | A new object appeared at 77,72,-149.0 |
| 5 | 9-10:40:52 | Speech | What is the new green object. |
| 6 | 9-10:41:1 | Listening | It is Raby |
| 7 | 9-10:41:27 | Visual | A new object appeared at 146,141,-171.0 |
| 8 | 9-10:41:30 | Speech | what is the new blue object |
| 9 | 9-10:41:46 | Listening | It is Caty |
| 10 | 9-10:41:57 | Visual | A new object appeared 210,49,-118.6 |
| 11 | 9-10:42:01 | Speech | What is the new red object |
| 12 | 9-10:42:37 | Listening | It is Ele |
| 13 | 9-10:43:56 | Visual | Ele went to 170,-8,-63.4 |
| 14 | 9-10:46:29 | Visual | Caty disappeared |
| 15 | 9-10:51:55 | Visual | Ele went to 123,142,-131.8 |
| 16 | 9-10:52:8 | Visual | Caty reappeared at 201,30,-163.4 |
| 17 | 9-10:54:4 | Listening | When did Ele go to the back of Pupy |
| 18 | 9-10:54:9 | Speech | Ele went to the to the back of Pupy at 10:51 |
| 19 | 9-10:56:22 | Listening | What did appear last |
| 20 | 9-10:56:27 | Speech | Ele appeared last |
| 21 | 9-10:57:47 | Listening | What is foremost object |
| 22 | 9-10:52:51 | Speech | Caty is foremost object |
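The backward search over stored events can be sketched in Python using the positions from the log above. This is a hedged illustration, not the spatial reasoner itself: the event tuples are hand-extracted from the visual rows with names already resolved, and the axis convention (larger y means further back, smaller y means further forward) is an assumption that the text does not state but that is consistent with the answers in rows 18 and 22.

```python
# (time, object, x, y) taken from the visual rows of the log, names resolved.
EVENTS = [
    ("10:40:15", "Pupy", 129, 51),
    ("10:40:49", "Raby", 77, 72),
    ("10:41:27", "Caty", 146, 141),
    ("10:41:57", "Ele", 210, 49),
    ("10:43:56", "Ele", 170, -8),
    ("10:51:55", "Ele", 123, 142),
    ("10:52:08", "Caty", 201, 30),
]

def position_before(name, events, index):
    """Most recent (x, y) of `name` among events[0:index], or None."""
    for _, who, x, y in reversed(events[:index]):
        if who == name:
            return (x, y)
    return None

def is_behind(pos, ref_pos):
    """Assumed convention: a larger y than the reference center means 'behind'."""
    return pos[1] > ref_pos[1]

def when_went_behind(mover, ref, events):
    """Walk from the present into the past; return when `mover` last moved behind `ref`."""
    for i in range(len(events) - 1, -1, -1):
        time, who, x, y = events[i]
        if who != mover:
            continue
        ref_pos = position_before(ref, events, i + 1)
        prev = position_before(mover, events, i)
        if ref_pos is None or not is_behind((x, y), ref_pos):
            continue
        if prev is None or not is_behind(prev, ref_pos):
            return time           # the mover crossed from not-behind to behind here
    return None

def foremost(events):
    """Object whose latest known position has the smallest y (assumed 'front')."""
    latest = {}
    for _, who, x, y in events:
        latest[who] = (x, y)
    return min(latest, key=lambda who: latest[who][1])
```

Under these assumptions, `when_went_behind("Ele", "Pupy", EVENTS)` finds the 10:51:55 move and `foremost(EVENTS)` picks Caty, in line with rows 18 and 22 of the log.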

The present invention has been described above with reference to its preferred embodiments.

Those of ordinary skill in the art to which the present invention pertains will understand that the invention may be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in an illustrative rather than a restrictive sense. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within an equivalent scope should be construed as being included in the present invention.

100: sensing unit    110: visual module
120: listening module    130: sensor module
200: acting unit    210: motion module
220: speech module    300: storage unit
310: sentence storage unit    320: object descriptor
330: motion descriptor    400: inference management unit
410: interpreter    411: visual interpreter
412: sensor interpreter    413: sentence generator
414: sentence interpreter    414A: syntactic parser
414B: semantic interpreter    420: spatial reasoner
430: knowledge reasoner

Claims

delete

A cognitive system capable of contextual interaction for a service robot, comprising:
a sensing unit for sensing information external to the robot;
an acting unit for acting on the outside of the robot;
a storage unit for storing the memory of the robot; and
an inference management unit that expresses the information sensed by the sensing unit as a sentence and stores it in the storage unit, performs inference on the memory stored in the storage unit, and causes the acting unit to act externally according to the inference result,
wherein the inference management unit stores the sentence in the storage unit together with a time tag for the contextual memory of the robot,
wherein the sensing unit includes a visual module for obtaining a camera image,
a listening module for obtaining external speech, and
a sensor module for obtaining one or more items of sensing data,
wherein the acting unit includes a motion module for driving an actuator for the motion of the robot, and
a speech module for converting an utterance sentence of the robot into speech and outputting it through a speaker,
wherein the storage unit includes a sentence storage unit for storing the sentence,
an object descriptor for storing object information obtained by the visual module, and
a motion descriptor for storing information about the motions of the robot and their order,
wherein the inference management unit includes an interpreter that interprets the information sensed through the sensing unit to express it as a sentence and controls the acting unit to perform an action according to the inference result,
a knowledge reasoner that performs inference on the semantically interpreted sentence by means of the storage unit, and
a spatial reasoner that separates objects in the camera image obtained by the visual module, determines the shape and posture of the separated objects, and updates the object descriptor, and
wherein the interpreter includes a sentence interpreter comprising a syntactic parser that parses the sentence in phrase units and a semantic interpreter that interprets the meaning of the phrases of the parsed sentence.

The system of claim 11,
wherein, when inferring the past spatial memory of the robot, the spatial reasoner searches the shape information and posture information in the object descriptor to reconstruct the past memory space and infers front, rear, left, and right spatial information relative to the center position of an object.

The system of claim 12,
wherein the inference management unit generates a sentence based on a result inferred by at least one of the knowledge reasoner and the spatial reasoner and outputs the sentence to the speech module, or combines the robot motions and their order described in the motion descriptor and outputs them to the motion module.

delete

A cognitive method for contextual interaction of a service robot, comprising:
interpreting a sentence into which speech has been converted by a listening module of the robot;
storing the interpreted sentence in a storage unit of the robot;
inferring the content of the interpreted sentence; and
executing an action according to the inference result,
wherein the interpreting step includes parsing the converted sentence in phrase units, and
interpreting the meaning of the phrases of the parsed sentence,
wherein the storing step stores the sentence together with a time tag for the contextual memory of the robot, and
wherein the inferring step infers the content of the interpreted sentence by searching the storage unit of the robot, which holds the sentences stored with their times and an object descriptor storing information about objects obtained from a camera image.

18. The method of claim 17,
wherein, in the inferring step, when the past spatial memory of the robot is inferred according to the content of the interpreted sentence, the shape information and posture information of objects are searched in the object descriptor to reconstruct the past memory space, and front, rear, left, and right spatial information is inferred relative to the center position of an object.

19. The method of claim 18,
wherein the executing step generates a sentence according to the inference result and outputs it to the speech module of the robot, or combines the robot motions and their order described in the motion descriptor of the storage unit and outputs them to the action module of the robot.

delete

Determining, for an object recognized in a camera image input to the robot, whether information about the same object is registered in the object descriptor of the storage unit of the robot;
If the determination shows that it is not registered, newly registering information about the recognized object in the object descriptor;
Generating a sentence for the new-object recognition event and storing the sentence in the storage unit;
wherein the information about the object registered in the object descriptor includes information about the shape and posture of the object;
If the determination shows that it is registered, determining whether the posture of the recognized object is the same as the stored posture of the same object;
If the postures are not the same, updating the posture information of the same object in the object descriptor; and
Generating a sentence for the posture change event of the same object and storing it in the storage unit.