KR20210004803A - Electronic apparatus and controlling method thereof

Info

Publication number: KR20210004803A
Application number: KR1020200012154A
Authority: KR
Inventors: 최우제; 김민경
Original assignee: 삼성전자주식회사
Priority date: 2019-07-03
Filing date: 2020-01-31
Publication date: 2021-01-13

Abstract

Disclosed is an electronic device capable of quickly and accurately controlling devices through a user's voice in a multi-device environment. The electronic device comprises: a microphone; a communication unit; a memory storing a control command determination tool that is based on control commands determined by a voice recognition server, the voice recognition server performing voice recognition processing on user voices received from the electronic device; and a processor that obtains user intention information by recognizing a user voice, receives state information of an external device related to the user intention information from a device control server for controlling a plurality of external devices, applies the user intention information and the state information of the external device to the control command determination tool to determine a control command for controlling a control target device among the plurality of external devices, and transmits the determined control command to the device control server.

Description

Electronic device and its control method {ELECTRONIC APPARATUS AND CONTROLLING METHOD THEREOF}

The present disclosure relates to an electronic device and a method for controlling the same, and more particularly, to an electronic device that controls an external device through a user's voice, and a method for controlling the same.

With the recent development of network communication technology and voice recognition technology, users can control the operation of various electronic devices connected through a network by voice.

For example, in an Internet of Things (IoT) environment or a home network environment, a user can control the operation of various nearby electronic devices by uttering a voice command to a hub device (e.g., an AI speaker).

Conventionally, a user's voice command received through the hub device has been interpreted and processed in the cloud in order to control a device. In this case, a delay arises during interpretation and processing, which can lower the response speed perceived by the user.

The present disclosure has been conceived in view of the above-described problem, and an object of the present disclosure is to provide an electronic device capable of quickly and accurately controlling devices through a user's voice in a multi-device environment, and a control method thereof.

An electronic device according to an embodiment of the present disclosure for achieving the above object includes: a microphone; a communication unit; a memory storing a control command determination tool based on a control command determined by a voice recognition server that performs voice recognition processing on a user voice received from the electronic device; and a processor that, when a user voice is received through the microphone, performs voice recognition processing on the received user voice to obtain user intention information, receives, through the communication unit, state information of an external device related to the obtained user intention information from a device control server for controlling a plurality of external devices, applies the obtained user intention information and the received state information of the external device to the control command determination tool to determine a control command for controlling a control target device among the plurality of external devices, and transmits the determined control command to the device control server through the communication unit.

In addition, the control command determination tool may include a rule DB containing at least one rule in which user intention information, state information of an external device, and a control command determined by the voice recognition server are matched with one another. When a rule corresponding to the obtained user intention information and the received state information of the external device exists in the rule DB, the processor may determine the control command matched to that rule as the control command corresponding to the user voice.

In addition, when no rule corresponding to the obtained user intention information and the received state information of the external device exists in the rule DB, the processor may transmit the user voice to the voice recognition server through the communication unit. When a control command determined by the voice recognition server based on the user voice is received through the communication unit, the processor may generate a new rule by matching the received control command with the obtained user intention information and the received state information of the external device, and may update the rule DB with the generated new rule.
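The rule-based path described above (look up a rule first, fall back to the voice recognition server on a miss, then store the server's answer as a new rule) can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the names `RuleDB`, `make_key`, `determine_command` and `fake_server`, the string form of the command, and the use of an in-memory dictionary are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

# Hypothetical key: (entity, action, sorted snapshot of device states).
RuleKey = Tuple[str, str, Tuple[Tuple[str, str], ...]]


def make_key(entity: str, action: str, states: Dict[str, str]) -> RuleKey:
    """Build a hashable key from user intention info and device state info."""
    return (entity, action, tuple(sorted(states.items())))


@dataclass
class RuleDB:
    """Maps (intention, state) keys to commands previously decided by the server."""
    rules: Dict[RuleKey, str] = field(default_factory=dict)

    def lookup(self, key: RuleKey) -> Optional[str]:
        return self.rules.get(key)

    def update(self, key: RuleKey, command: str) -> None:
        self.rules[key] = command


def determine_command(db: RuleDB, entity: str, action: str,
                      states: Dict[str, str],
                      ask_server: Callable[[], str]) -> str:
    """Use a matching rule if one exists; otherwise ask the server and learn a rule."""
    key = make_key(entity, action, states)
    command = db.lookup(key)
    if command is None:
        command = ask_server()    # slow path: round trip to the voice recognition server
        db.update(key, command)   # new rule for the same intention and state next time
    return command


# Stand-in for the voice recognition server (assumption for this example).
server_calls = []


def fake_server() -> str:
    server_calls.append("call")
    return "TV 2: power_on"


db = RuleDB()
states = {"TV 1": "on", "TV 2": "off"}
first = determine_command(db, "TV", "power_on", states, fake_server)
second = determine_command(db, "TV", "power_on", states, fake_server)
```

In this sketch the second request with the same intention and the same device states is answered from the local rule, so the stand-in server is contacted only once.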

In addition, the control command determination tool may include an artificial intelligence model trained with user intention information and state information of an external device as input and a control command determined by the voice recognition server as output. The processor may input the obtained user intention information and the received state information of the external device into the trained artificial intelligence model, and may determine the control command output from the trained artificial intelligence model as the control command corresponding to the user voice.

In addition, when no control command is output from the trained artificial intelligence model to which the obtained user intention information and the received state information of the external device have been input, the processor may transmit the user voice to the voice recognition server through the communication unit. When a control command determined by the voice recognition server based on the user voice is received through the communication unit, the processor may retrain the trained artificial intelligence model based on the obtained user intention information, the received state information of the external device, and the received control command.
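The model-based path above can be illustrated with a deliberately simple stand-in. The text does not specify the model architecture, so the sketch below swaps in a frequency-count classifier: it outputs no command (`None`) for inputs it has not seen or is not confident about, and "retraining" is just adding a labelled sample. `ToyCommandModel`, `min_confidence`, `determine_with_model` and `fake_cloud` are hypothetical names for this example only.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Hashable, Optional, Tuple

Features = Tuple[Hashable, ...]  # flattened intention info + device state info


class ToyCommandModel:
    """Frequency-count stand-in for the trained AI model (an assumption)."""

    def __init__(self, min_confidence: float = 0.6) -> None:
        self.min_confidence = min_confidence
        self.counts: Dict[Features, Counter] = defaultdict(Counter)

    def train(self, features: Features, command: str) -> None:
        """Add one (input, server-decided command) sample."""
        self.counts[features][command] += 1

    def predict(self, features: Features) -> Optional[str]:
        """Return the most frequent command, or None if unseen or not confident."""
        seen = self.counts.get(features)
        if not seen:
            return None
        command, votes = seen.most_common(1)[0]
        if votes / sum(seen.values()) < self.min_confidence:
            return None
        return command


def determine_with_model(model: ToyCommandModel, features: Features,
                         ask_server: Callable[[], str]) -> str:
    """Use the model output if any; otherwise ask the server and retrain."""
    command = model.predict(features)
    if command is None:
        command = ask_server()
        model.train(features, command)
    return command


cloud_calls = []


def fake_cloud() -> str:
    cloud_calls.append("call")
    return "TV 2: power_on"


model = ToyCommandModel()
features = ("TV", "power_on", ("TV 1", "on"), ("TV 2", "off"))
first_result = determine_with_model(model, features, fake_cloud)
second_result = determine_with_model(model, features, fake_cloud)
unseen = model.predict(("air_conditioner", "power_off"))
```

A real model would generalize to inputs it has not seen exactly; this stand-in only shows the control flow of the fallback and retraining step.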

In addition, when, based on the received state information of the external device, the control target device and the operation of the control target device can be specified from the obtained user intention information alone, the processor may determine the control command based on the obtained user intention information without using the control command determination tool.

In addition, the obtained user intention information may include information on an entity. When state information of a plurality of external devices related to the entity is received from the device control server, the processor may determine that the control target device cannot be specified from the obtained user intention information alone, and may determine the control command using the control command determination tool.
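The decision of whether the control command determination tool is needed can be sketched as a count of the devices related to the entity. The dictionary layout of the state information and the `"type"` field are assumptions made for this example; the text only says that receiving state information for a plurality of related devices means the target cannot be specified from the intention alone.

```python
from typing import Dict, List

# Hypothetical state layout: device name -> state fields.
States = Dict[str, Dict[str, str]]


def related_devices(entity: str, states: States) -> List[str]:
    """Names of the devices whose (assumed) type field matches the entity."""
    return [name for name, state in states.items() if state.get("type") == entity]


def needs_determination_tool(entity: str, states: States) -> bool:
    """True when several devices relate to the entity, so the target is ambiguous."""
    return len(related_devices(entity, states)) > 1


example_states: States = {
    "TV 1": {"type": "TV", "power": "on"},
    "TV 2": {"type": "TV", "power": "off"},
    "Aircon": {"type": "air_conditioner", "power": "off"},
}

tv_is_ambiguous = needs_determination_tool("TV", example_states)
aircon_is_ambiguous = needs_determination_tool("air_conditioner", example_states)
```

With two TVs registered, an utterance that names only "TV" is ambiguous and goes to the tool, while the single air conditioner can be addressed directly.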

In addition, the communication unit may include an IR (infrared) communication module, and when the control target device is a device controllable by an IR method, the processor may transmit the determined control command to the control target device through the IR communication module.
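The choice between the two transmission paths can be sketched as a simple routing function. `TargetDevice`, its `ir_controllable` flag and the channel labels are hypothetical; how the device actually learns that a target is IR-controllable is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetDevice:
    name: str
    ir_controllable: bool  # assumed to be known from registration data


def choose_channel(target: TargetDevice) -> str:
    """Send directly over IR when possible, otherwise via the device control server."""
    return "ir_module" if target.ir_controllable else "device_control_server"


legacy_tv = TargetDevice(name="TV 1", ir_controllable=True)
smart_light = TargetDevice(name="Light", ir_controllable=False)
legacy_channel = choose_channel(legacy_tv)
light_channel = choose_channel(smart_light)
```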

Meanwhile, a method for controlling an electronic device according to an embodiment of the present disclosure includes: storing a control command determination tool based on a control command determined by a voice recognition server that performs voice recognition processing on a user voice received from the electronic device; when a user voice is received, performing voice recognition processing on the received user voice to obtain user intention information; receiving state information of an external device related to the obtained user intention information from a device control server for controlling a plurality of external devices; applying the obtained user intention information and the received state information of the external device to the control command determination tool to determine a control command for controlling a control target device among the plurality of external devices; and transmitting the determined control command to the device control server.

In addition, the control command determination tool may include a rule DB containing at least one rule in which user intention information, state information of an external device, and a control command determined by the voice recognition server are matched with one another. In the determining of the control command, when a rule corresponding to the obtained user intention information and the received state information of the external device exists in the rule DB, the control command matched to that rule may be determined as the control command corresponding to the user voice.

In addition, the method may further include: when no rule corresponding to the obtained user intention information and the received state information of the external device exists in the rule DB, transmitting the user voice to the voice recognition server; and when a control command determined by the voice recognition server based on the user voice is received, generating a new rule by matching the received control command with the obtained user intention information and the received state information of the external device, and updating the rule DB with the generated new rule.

In addition, the control command determination tool may include an artificial intelligence model trained with user intention information and state information of an external device as input and a control command determined by the voice recognition server as output. In the determining of the control command, the obtained user intention information and the received state information of the external device may be input into the trained artificial intelligence model, and the control command output from the trained artificial intelligence model may be determined as the control command corresponding to the user voice.

In addition, the method may further include: when no control command is output from the trained artificial intelligence model to which the obtained user intention information and the received state information of the external device have been input, transmitting the user voice to the voice recognition server; and when a control command determined by the voice recognition server based on the user voice is received, retraining the trained artificial intelligence model based on the obtained user intention information, the received state information of the external device, and the received control command.

In addition, the method may further include, when, based on the received state information of the external device, the control target device and the operation of the control target device can be specified from the obtained user intention information alone, determining the control command based on the obtained user intention information without using the control command determination tool.

In addition, the obtained user intention information may include information on an entity, and the method may further include, when state information of a plurality of external devices related to the entity is received from the device control server, determining that the control target device cannot be specified from the obtained user intention information alone, and determining the control command using the control command determination tool.

In addition, the method may further include, when the control target device is a device controllable by an IR method, transmitting the determined control command to the control target device.

Meanwhile, a voice recognition server according to an embodiment of the present disclosure includes: a communication unit; a memory storing a control command determination tool based on a control command that the voice recognition server has determined from a user voice received from an electronic device; and a processor that, when a user voice is received from the electronic device through the communication unit, performs voice recognition processing on the received user voice to obtain user intention information, receives, through the communication unit, state information of an external device related to the obtained user intention information from a device control server for controlling a plurality of external devices, applies the obtained user intention information and the received state information of the external device to the control command determination tool to determine a control command for controlling a control target device among the plurality of external devices, and transmits the determined control command to the device control server through the communication unit.

In addition, the control command determination tool may include at least one of: a rule DB containing at least one rule in which user intention information, state information of an external device, and a control command determined by the voice recognition server are matched with one another; and an artificial intelligence model trained with user intention information and state information of an external device as input and a control command determined by the voice recognition server as output.

According to the various embodiments of the present disclosure described above, devices can be controlled quickly and accurately through a user's voice in a multi-device environment.

FIG. 1 is a diagram illustrating a voice control system according to an embodiment of the present disclosure;
FIG. 2 is a block diagram of an electronic device according to an embodiment of the present disclosure;
FIG. 3 is a diagram illustrating a voice control system according to an embodiment of the present disclosure;
FIG. 4 is a block diagram of an electronic device according to an embodiment of the present disclosure;
FIG. 5 is a block diagram of a voice recognition server according to an embodiment of the present disclosure; and
FIG. 6 is a flowchart illustrating a method of controlling an electronic device according to an embodiment of the present disclosure.

In describing the present disclosure, a detailed description of related known technology is omitted where it is determined that it may unnecessarily obscure the subject matter of the present disclosure. Redundant descriptions of the same configuration are also omitted where possible.

The suffix "unit" attached to constituent elements in the following description is given or used interchangeably only for ease of drafting the specification, and does not by itself carry a distinct meaning or role.

The terms used in the present disclosure are used to describe the embodiments and are not intended to restrict and/or limit the present disclosure. Singular expressions include plural expressions unless the context clearly indicates otherwise.

In the present disclosure, terms such as "include" or "have" are intended to indicate the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Expressions such as "first" and "second" used in the present disclosure may modify various components regardless of order and/or importance. They are used only to distinguish one component from another and do not limit the components.

When a component (e.g., a first component) is referred to as being "(operatively or communicatively) coupled with/to" or "connected to" another component (e.g., a second component), it should be understood that the component may be connected to the other component directly or through yet another component (e.g., a third component). In contrast, when a component (e.g., a first component) is referred to as being "directly coupled" or "directly connected" to another component (e.g., a second component), it may be understood that no other component (e.g., a third component) exists between the two.

Unless otherwise defined, the terms used in the embodiments of the present disclosure may be interpreted as having the meanings commonly known to those of ordinary skill in the art.

Hereinafter, various embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.

FIG. 1 is an exemplary diagram for describing a voice control system according to an embodiment of the present disclosure. Referring to FIG. 1, the voice control system 1000 may include an electronic device 100, external devices 200-1 and 200-2, a voice recognition server 400, and a device control server 500.

The electronic device 100 and the external devices 200-1 and 200-2 may form a home network environment or an Internet of Things network environment within a premises 10 such as a home or an office.

The electronic device 100 may be implemented as any of various types of devices, such as a smart speaker, an AI speaker, a smart TV, a smart refrigerator, a smartphone, an access point, a notebook, a desktop PC, or a tablet.

Meanwhile, since the types of things in an Internet of Things environment are not limited, the types of the external devices 200-1 and 200-2 are likewise not limited. Although FIG. 1 shows TV 1 (200-1) and TV 2 (200-2), this is only an example. Any device, such as an air conditioner, smart lighting, a fan, a washing machine, a microwave oven, a door lock, a sound bar, a home theater, a smartphone, a TV, or a refrigerator, may serve as an external device as long as it is communicatively connected to the electronic device 100 and its operation can be controlled through the electronic device 100.

In this case, the electronic device 100 may control the operation of the external devices 200-1 and 200-2 according to a user voice.

For example, when a user utters a user voice for controlling the operation of the external devices 200-1 and 200-2, the electronic device 100 may receive the uttered user voice and transmit it to the voice recognition server 400, thereby controlling the operation of the external devices 200-1 and 200-2 through the voice recognition server 400 and the device control server 500.

Specifically, the voice recognition server 400 may determine a control command corresponding to the user voice received from the electronic device 100 and transmit the determined control command to the device control server 500.

Here, the control command is the result of the voice recognition server 400 interpreting user intention information, described below, based on the state information of the external devices 200-1 and 200-2 received from the device control server 500. It may include information on the control target device that the user intends to control by voice and on the operation of that control target device.

Specifically, when a user voice is received from the electronic device 100, the voice recognition server 400 may perform voice recognition processing on the received user voice to obtain user intention information corresponding to the user voice.

Here, the voice recognition processing performed by the voice recognition server 400 includes ASR (Automatic Speech Recognition) processing, which converts the user's voice into text, and NLU (Natural Language Understanding) processing, which converts the text produced by the ASR processing into a representation that a machine can understand.

Here, the user intention information is the result of voice recognition processing of the user voice, obtained before the voice recognition server 400 determines a control command using the state information of the external devices 200-1 and 200-2. It may include information on an entity and on an action of the entity.

The entity and the action of the entity resulting from the voice recognition processing may vary with the user voice, and in some cases the entity may be absent or the action of the entity may not be specified. For convenience of description, however, the following assumes that the user has uttered a user voice for controlling the operation of the external devices 200-1 and 200-2, so that the entity is specified as the external devices 200-1 and 200-2 and the action of the entity is specified as an operation of the external devices 200-1 and 200-2.
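The shape of the user intention information, including the cases where the entity or the action is missing, can be sketched with a small data type. The keyword tables and the `toy_nlu` function below are a toy stand-in for NLU processing that operates on already-transcribed text; they are assumptions for illustration and do not reflect how the voice recognition server actually performs ASR or NLU.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UserIntention:
    """Result of voice recognition before any device state is consulted."""
    entity: Optional[str]   # e.g. "TV"; None when no entity is recognized
    action: Optional[str]   # e.g. "power_on"; None when no action is specified


# Hypothetical keyword tables for the toy parser.
ENTITY_KEYWORDS = {"tv": "TV", "air conditioner": "air_conditioner"}
ACTION_KEYWORDS = {"turn on": "power_on", "turn off": "power_off"}


def toy_nlu(text: str) -> UserIntention:
    """Keyword matching over ASR output text; a stand-in for real NLU."""
    lowered = text.lower()
    entity = next((v for k, v in ENTITY_KEYWORDS.items() if k in lowered), None)
    action = next((v for k, v in ACTION_KEYWORDS.items() if k in lowered), None)
    return UserIntention(entity=entity, action=action)


tv_intention = toy_nlu("Turn on the TV")
unrelated = toy_nlu("What time is it?")
```

Note that `tv_intention` names an entity type, not a particular device: which of several TVs is meant is exactly what the state information is later used to resolve.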

When the user intention information has been obtained in this way, the voice recognition server 400 may request, from the device control server 500, the state information of the entity related to the obtained user intention information, that is, of the external devices 200-1 and 200-2.

Here, the state information is information related to the current operating state of the external devices 200-1 and 200-2. It may include, for example, information on the power on/off state of the external devices 200-1 and 200-2, setting information related to the functions of the external devices 200-1 and 200-2, and information on the location of the external devices 200-1 and 200-2 within the premises 10, but is not limited thereto.
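The three kinds of state information listed above (power state, function settings, location within the premises) map naturally onto a small record type. The field names, the example values, and the `powered_off` helper are assumptions for illustration; the actual format exchanged with the device control server is not specified in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DeviceState:
    """Hypothetical state record for one external device."""
    device_id: str
    power_on: bool
    settings: Dict[str, str] = field(default_factory=dict)  # function-related settings
    location: str = ""                                       # location within the premises


def powered_off(states: List[DeviceState]) -> List[str]:
    """IDs of the devices that are currently off."""
    return [s.device_id for s in states if not s.power_on]


tv_states = [
    DeviceState("TV 1", power_on=True, settings={"volume": "12"}, location="living room"),
    DeviceState("TV 2", power_on=False, location="bedroom"),
]
off_devices = powered_off(tv_states)
```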

When the requested state information is received from the device control server 500, the voice recognition server 400 may determine a control command by interpreting the user intention information based on the received state information of the external devices 200-1 and 200-2.

Specifically, the voice recognition server 400 determines, based on the state information of the external devices 200-1 and 200-2, whether the control target device and its operation can be specified from the user intention information alone. If they can, the voice recognition server 400 may determine, without further interpretation, a control command that includes information on the specified control target device and operation.

However, when the control target device and operation cannot be specified from the user intention information alone, the voice recognition server 400 may determine the control command by further interpreting the user intention information according to its own policy. This is described in more detail later.

The voice recognition server 400 transmits the control command determined as above to the device control server 500, and the device control server 500 can then control the operation of the control target device based on the control command received from the voice recognition server 400.
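The server-side flow just described (decide directly when the intention alone identifies the target, apply the server's own policy otherwise, then hand the command to the device control server) can be sketched as below. The server's policy is not detailed in this part of the text, so `prefer_powered_off` is purely a hypothetical policy chosen for the example, as are the tuple formats and the `"type"` and `"power"` fields.

```python
from typing import Callable, Dict, List, Tuple

Intention = Tuple[str, str]           # (entity, action)
States = Dict[str, Dict[str, str]]    # device name -> state fields (assumed layout)
Command = Tuple[str, str]             # (control target device, operation)
Policy = Callable[[Intention, States], Command]


def prefer_powered_off(intention: Intention, states: States) -> Command:
    """Hypothetical policy: pick the first matching device that is currently off."""
    entity, action = intention
    for name, state in states.items():
        if state.get("type") == entity and state.get("power") == "off":
            return (name, action)
    raise LookupError("policy could not resolve a control target device")


def determine_command(intention: Intention, states: States, policy: Policy) -> Command:
    """Decide directly when exactly one device matches; otherwise apply the policy."""
    entity, action = intention
    candidates = [n for n, s in states.items() if s.get("type") == entity]
    if len(candidates) == 1:
        return (candidates[0], action)
    return policy(intention, states)


def handle_intention(intention: Intention, states: States, policy: Policy,
                     send: Callable[[Command], None]) -> Command:
    """Determine the command and transmit it to the device control server."""
    command = determine_command(intention, states, policy)
    send(command)
    return command


sent: List[Command] = []  # stand-in for the device control server's inbox
home: States = {
    "TV 1": {"type": "TV", "power": "on"},
    "TV 2": {"type": "TV", "power": "off"},
    "Aircon": {"type": "air_conditioner", "power": "off"},
}
tv_command = handle_intention(("TV", "power_on"), home, prefer_powered_off, sent.append)
ac_command = handle_intention(("air_conditioner", "power_on"), home,
                              prefer_powered_off, sent.append)
```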

The device control server 500 may register, for each user account, the devices connected to it through a network, and may collect and manage the state information of the registered devices. In the example of FIG. 1, the device control server 500 may register the electronic device 100 and the external devices 200-1 and 200-2, and may collect and manage their state information.

Accordingly, when the voice recognition server 400 requests transmission of the state information of the external devices 200-1 and 200-2, the device control server 500 may transmit that state information to the voice recognition server 400 in response to the request.

Meanwhile, the device control server 500 may store and manage a control signal set for each of the registered devices. Here, a control signal set is a set of control codes capable of controlling the various operations of the corresponding device.

Even for a control command concerning the same operation, the control code may differ from device to device by manufacturer or by model. The device control server 500 may therefore store and manage a control signal set for each device in order to control the operation of the registered devices. The control signal set may be received directly from the device or obtained through a server managed by the device manufacturer.

Accordingly, when a control command is received from the voice recognition server 400, the device control server 500 may identify, in the control signal set, the control signal corresponding to the control target device and its operation, and may transmit the identified control signal to the control target device to control its operation.
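By way of illustration only, the lookup of a device-specific control signal described above might be sketched as follows. The table name `CONTROL_SIGNAL_SETS`, the function `resolve_control_signal`, and the control codes are hypothetical placeholders introduced for this example; the disclosure does not specify how control signal sets are stored.

```python
from typing import Dict, Tuple

# Hypothetical control signal sets: one set of control codes per registered device.
# The codes are made-up placeholders, not real manufacturer codes.
CONTROL_SIGNAL_SETS: Dict[str, Dict[str, str]] = {
    "TV 1": {"power-on": "0xA1", "power-off": "0xA2"},
    "TV 2": {"power-on": "0x10", "power-off": "0x11"},
}


def resolve_control_signal(command: Tuple[str, str]) -> str:
    """Map a (control target device, operation) command to a device-specific code."""
    device, operation = command
    try:
        return CONTROL_SIGNAL_SETS[device][operation]
    except KeyError as exc:
        # Either the device is not registered or it has no code for the operation.
        raise LookupError(f"no control signal for {command!r}") from exc
```

In this sketch, the same operation ("power-on") resolves to a different code for each device, which reflects why a separate control signal set is kept per device.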

Hereinafter, the operation of the voice recognition system 100 is described using a specific example. For example, as shown in FIG. 1, the electronic device 100, TV 1 (200-1), and TV 2 (200-2) form an IoT network in the home 10, and TV 1 (200-1) and TV 2 (200-2) are registered with the device control server 500 through a user account. In this situation, the user may utter a user voice such as "Turn on the TV" to the electronic device 100 without specifying which TV is meant.

The electronic device 100 transmits the user voice to the voice recognition server 400. The voice recognition server 400 may perform ASR (Automatic Speech Recognition) processing on the received user voice to obtain an ASR processing result such as "Turn on the TV", and may perform NLU (Natural Language Understanding) processing on the ASR processing result to obtain user intention information such as "TV power-on". In this case, the entity included in the user intention information is "TV", and the operation of the entity is "power-on".

When the user intention information is obtained in this way, the voice recognition server 400 may request, from the device control server 500, the state information of the external device related to the user intention information, that is, the TV.

Here, since the TVs of the user registered with the device control server 500 are TV 1 (200-1) and TV 2 (200-2), the device control server 500 may, in response to the request of the voice recognition server 400, transmit to the voice recognition server 400 the state information of TV 1 (200-1) and the state information of TV 2 (200-2) as of the time the transmission was requested.

Accordingly, the voice recognition server 400 may determine, based on the received state information of TV 1 (200-1) and TV 2 (200-2), whether the control command can be specified from the user intention information alone.

In the above example, the entity included in the user intention information is "TV", but state information for each of TV 1 (200-1) and TV 2 (200-2) has been received from the device control server 500. The user intention information alone therefore cannot specify whether the TV is TV 1 (200-1) or TV 2 (200-2).

That is, since the control target device is not specified by the user intention information alone, the voice recognition server 400 may determine that the control command cannot be specified from the user intention information alone, and may determine the control command according to its own policy.

Here, the own policy is a means by which the voice recognition server 400 determines a control command on its own. It may include, for example, a policy of determining the control command from the user's response to a query, but is not limited thereto.

In the above example, the voice recognition server 400 may transmit to the electronic device 100 a query asking which of TV 1 (200-1) and TV 2 (200-2) should be turned on, and the electronic device 100 may output the received query. When the user then utters a response such as "TV 1", the electronic device 100 receives it and transmits it to the voice recognition server 400, and the voice recognition server 400 may determine TV 1 (200-1) as the control target device based on the received voice.

Accordingly, the voice recognition server 400 may determine "TV 1 power-on" as the control command corresponding to "Turn on the TV", and may transmit the determined control command to the device control server 500. In this case, the control target device is "TV 1 (200-1)", and the operation of the control target device is "power-on".

Unlike the above example, if only one TV is registered in the user account, only the state information of that one TV will be received from the device control server 500, so the voice recognition server 400 can specify the control target device from the user intention information alone. The voice recognition server 400 can therefore determine the control command "TV power-on", which includes the specified control target device "TV" and the specified operation "power-on", without additional interpretation such as applying its own policy. The same holds when a plurality of TVs are registered but the user specifies the control target device from the outset, for example by uttering "Turn on TV 1 (200-1)".
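The determination flow of the voice recognition server 400 described above can be sketched, purely for illustration, as follows. The function `determine_control_command` and the callback `ask_user`, which stands in for the query-and-response policy, are assumptions made for this example and are not defined in the disclosure.

```python
from typing import Callable, Dict, List, Tuple

Intent = Tuple[str, str]          # (entity, operation), e.g. ("TV", "power-on")
ControlCommand = Tuple[str, str]  # (control target device, operation)


def determine_control_command(
    intent: Intent,
    device_states: Dict[str, str],
    ask_user: Callable[[List[str]], str],
) -> ControlCommand:
    """Specify the target from the intent alone if possible, else query the user.

    device_states holds the state information returned for the devices related
    to the intent's entity, keyed by a hypothetical device name.
    """
    entity, operation = intent
    candidates = sorted(device_states)
    if not candidates:
        raise LookupError(f"no registered device for entity {entity!r}")
    if len(candidates) == 1:
        # Only one device matches, so no further interpretation is needed.
        return candidates[0], operation
    # Several devices match: fall back to the own policy (here, a user query).
    answer = ask_user(candidates)
    if answer not in candidates:
        raise ValueError(f"unexpected response {answer!r}")
    return answer, operation
```

With state information for "TV 1" and "TV 2", the sketch calls `ask_user`; with state information for a single TV, it returns the command without any query.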

Meanwhile, when a control command such as "TV 1 power-on" is received, the device control server 500 may identify the control signal corresponding to "power-on" in the control signal set corresponding to TV 1 (200-1), and may transmit the identified control signal to TV 1 (200-1). TV 1 (200-1) is thereby turned on, and the user can watch a broadcast program through TV 1 (200-1).

As described above, according to an embodiment of the present disclosure, the electronic device 100 may control the operation of the external devices 200-1 and 200-2 through a control command determined by the voice recognition server 400, by transmitting the received user voice to the voice recognition server 400.

Meanwhile, according to an embodiment of the present disclosure, the electronic device 100 may control the operation of the external devices 200-1 and 200-2 by directly determining the control command for a received user voice.

Specifically, the electronic device 100 may generate a tool for determining a control command based on history information of having controlled the external devices 200-1 and 200-2 through the voice recognition server 400, and may determine a control command based on the generated control command determination tool.

Here, the control command determination tool may be 1) a rule in which user intention information, state information of an external device, and a control command are matched with one another, or 2) an artificial intelligence model trained with user intention information and state information of an external device as input and a control command as output.

To this end, the electronic device 100 has a voice recognition function and may obtain user intention information by performing voice recognition processing on the received user voice.

Here, the voice recognition function of the electronic device 100 may include at least one of the ASR and NLU processing functions described above.

For example, when the electronic device 100 includes both the ASR and NLU processing functions, the electronic device 100 may obtain the user intention information by performing voice recognition processing on the received user voice itself.

When the electronic device 100 includes only the ASR processing function, the electronic device 100 may transmit the ASR processing result for the user voice to the voice recognition server 400 and obtain the NLU processing result, that is, the user intention information, from the voice recognition server 400.

When the electronic device 100 includes only the NLU processing function, the electronic device 100 may transmit the received user voice to the voice recognition server 400 and obtain the user intention information by performing NLU processing on the ASR processing result received from the voice recognition server 400.

Here, if the electronic device 100 has processed the user voice using its own ASR processing function or NLU processing function but the reliability of the processing result is below a certain level, the electronic device 100 may request ASR processing and/or NLU processing of the user voice from the voice recognition server 400 and obtain the requested processing result from the voice recognition server 400. This is described in detail below.
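As a non-limiting sketch, the reliability-based fallback described above might look like the following. The function name `recognize_with_fallback`, the callables `local` and `server`, and the threshold value of 0.8 are all assumptions made for illustration; the disclosure states only that the reliability is compared with "a certain level".

```python
from typing import Callable, Tuple

# (processing result, reliability score assumed to lie in [0, 1])
LocalResult = Tuple[str, float]


def recognize_with_fallback(
    audio: bytes,
    local: Callable[[bytes], LocalResult],
    server: Callable[[bytes], str],
    threshold: float = 0.8,  # hypothetical value for "a certain level"
) -> str:
    """Use the on-device result when it is reliable enough, else ask the server."""
    result, reliability = local(audio)
    if reliability >= threshold:
        return result
    # Reliability is below the threshold: request processing from the server.
    return server(audio)
```

The same pattern would apply whether the on-device stage is ASR, NLU, or both; only the type of the processing result changes.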

Accordingly, the electronic device 100 may request and receive, from the device control server 500, the state information of the external devices 200-1 and 200-2 related to the obtained user intention information, and may determine the control command by applying the received state information of the external devices 200-1 and 200-2 and the obtained user intention information to the control command determination tool.

In addition, by transmitting the determined control command directly to the device control server 500, the electronic device 100 can control the operation of the external devices 200-1 and 200-2 without going through the voice recognition server 400.

In this way, when the electronic device 100 determines the control command by directly performing voice recognition processing on the received user voice, or transmits the determined control command directly to the device control server 500, the operation of the external devices 200-1 and 200-2 can be controlled more quickly than when the control command is determined and transmitted through the voice recognition server 400.

Meanwhile, when the control command is determined using the control command determination tool, it is determined through a relatively simple operation, such as comparing the user intention information and the state information of the external devices 200-1 and 200-2 with a rule or inputting them into an artificial intelligence model. The control command can therefore be determined faster than when it is determined using the own policy of the voice recognition server 400 described above. Accordingly, the delay that may occur when the voice recognition server 400 determines the control command through its own policy can be reduced.

Hereinafter, the operation of the electronic device 100 according to an embodiment of the present disclosure is described in detail with reference to FIG. 2.

FIG. 2 is a block diagram of an electronic device according to an embodiment of the present disclosure. Referring to FIG. 2, the electronic device 100 includes a microphone 110, a processor 120, a communication unit 130, and a memory 140.

The microphone 110 receives sound in the form of sound waves from the outside, converts it into an electrical signal, and provides the converted electrical signal to the processor 120. In particular, the microphone 110 may receive a user voice and provide the corresponding electrical signal to the processor 120.

The communication unit 130 is a component for communicating with various external servers or devices.

In particular, the communication unit 130 may communicate with the voice recognition server 400 or the device control server 500 to transmit and receive various information or data, such as the user voice, user intention information, state information of external devices, and control commands.

The communication unit 130 may also communicate with the external devices 200-1 and 200-2 around the electronic device 100 to form an IoT environment or a home network environment.

The memory 140 may store data or information in electrical or magnetic form so that the processor 120 or the like can access the stored data or information.

In particular, the memory 140 may store a control command determination tool 141 that serves as the basis for determining a control command, a tool management module 125 for managing the control command determination tool, a voice recognition module 123 for performing the voice recognition function and determining a control command, and a device control module 121 for monitoring the state information of the external devices 200-1 and 200-2 and transmitting and receiving control commands.

The processor 120 controls the overall operation of the electronic device 100. In particular, the processor 120 may load the control command determination tool 141, the tool management module 125, the voice recognition module 123, the device control module 121, and the like stored in the memory 140, and perform the function of each module.

Specifically, FIG. 2 illustrates a state in which the processor 120 has loaded the device control module 121, the voice recognition module 123, and the tool management module 125 stored in the memory 140 and is performing the corresponding functions.

The control command determination tool 141 is the means by which the voice recognition module 123 (specifically, the NLU module 123-2) determines a control command by applying user intention information and state information of an external device, as described below. It may include a rule generated based on history information of the electronic device 100 having previously controlled an external device, or an artificial intelligence model trained based on such history information.

Here, the history information may include the user intention information obtained at the time of the previous control, the state information of the external device at the time of the previous control, and the control command determined by the voice recognition server 400 at the time of the previous control (in particular, a control command determined by its own policy).
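One possible way to turn such history information into rules is sketched below, under stated assumptions. The `HistoryRecord` class, its field names, and the `build_rules` function are hypothetical; the disclosure does not prescribe a data layout or a rule generation procedure, and the "latest record wins" choice is only one option.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

# A rule key pairs the user intention with the set of (device, state) entries.
RuleKey = Tuple[str, FrozenSet[Tuple[str, str]]]


@dataclass
class HistoryRecord:
    """One previous control event (field names are illustrative assumptions)."""
    user_intention: str            # e.g. "TV power-on"
    device_states: Dict[str, str]  # e.g. {"TV 1": "off", "TV 2": "on"}
    control_command: str           # command the server determined at that time


def build_rules(history: List[HistoryRecord]) -> Dict[RuleKey, str]:
    """Generate one rule per distinct (intention, device states) combination."""
    rules: Dict[RuleKey, str] = {}
    for record in history:
        key = (record.user_intention, frozenset(record.device_states.items()))
        # A later record with the same key overwrites the earlier one.
        rules[key] = record.control_command
    return rules
```

Using a frozen set for the device states makes the key independent of the order in which the states were reported.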

Specifically, the rule DB 141-1 is a database containing a plurality of rules for determining a control command. The rule DB 141-1 may take the form of, for example, a lookup table in which user intention information, state information of an external device, and a control command are matched with one another, as shown in Table 1 below, but is not limited thereto.

Table 1

Rule   | User intention information | State information of external device         | Control command
Rule 1 | TV power-on                | TV 1 (200-1): off; TV 2 (200-2): on           | TV 1 (200-1) power-on
Rule 2 | Air conditioner power-on   | Air conditioner 1: off; Air conditioner 2: off | Air conditioner 2 power-on

Because each rule in the rule DB 141-1 matches user intention information, state information of an external device, and a control command with one another, the voice recognition module 123 can, as described below, obtain the user intention information corresponding to the user voice and the state information of the external device, identify in the rule DB 141-1 the rule corresponding to the obtained user intention information and state information, and thereby quickly determine the control command.
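For illustration only, a lookup against the two rules of Table 1 might be sketched as follows. The names `RULE_DB`, `state_key`, and `lookup_control_command` are assumptions introduced here, and representing a rule as a dictionary entry is merely one way to realize a lookup table.

```python
from typing import Dict, Optional, Tuple

StateKey = Tuple[Tuple[str, str], ...]


def state_key(states: Dict[str, str]) -> StateKey:
    """Normalize state information so that device order does not matter."""
    return tuple(sorted(states.items()))


# The two rules of Table 1, keyed by (user intention, device states).
RULE_DB: Dict[Tuple[str, StateKey], str] = {
    ("TV power-on", state_key({"TV 1": "off", "TV 2": "on"})): "TV 1 power-on",
    (
        "Air conditioner power-on",
        state_key({"Air conditioner 1": "off", "Air conditioner 2": "off"}),
    ): "Air conditioner 2 power-on",
}


def lookup_control_command(intention: str, states: Dict[str, str]) -> Optional[str]:
    """Return the matched control command, or None when no rule matches."""
    return RULE_DB.get((intention, state_key(states)))
```

A result of `None` in this sketch corresponds to the case in which no identical rule exists in the rule DB.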

The artificial intelligence model 141-2 is trained with user intention information and state information of an external device as input and a control command as output. Accordingly, as described below, the voice recognition module 123 can quickly obtain a control command by inputting the obtained user intention information and state information of the external device into the trained artificial intelligence model.

The tool management module 125 may manage the control command determination tool 141 stored in the memory 140.

Specifically, the tool management module 125 may access the memory 140 at an appropriate time and load at least part of the control command determination tool 141 into the processor 120, so that the voice recognition module 123 can use the control command determination tool 141 when determining a control command.

For example, the tool management module 125 may load at least some of the rules included in the rule DB 141-1 and/or the artificial intelligence model 141-2 into the processor 120 when requested by the voice recognition module 123. However, the loading time is not limited thereto. Depending on the embodiment, the tool management module 125 may load the control command determination tool 141 into the processor 120 at various times before the voice recognition module 123 determines a control command, such as when a user voice is received or when the electronic device 100 is powered on.

Meanwhile, the tool management module 125 may update the control command determination tool 141. This is described below.

The device control module 121 may monitor the state information of an external device through the device control server 500 and provide it to the voice recognition module 123.

Specifically, when the voice recognition module 123 requests the state information of an external device, the device control module 121 may request that state information from the device control server 500 through the communication unit 130 and provide the current state information of the external device received from the device control server 500 to the voice recognition module 123.

Meanwhile, when the tool management module 125 requests the state information of an external device, as in an embodiment described below, the device control module 121 may provide the current state information of the external device received from the device control server 500 to the tool management module 125.

In addition, the device control module 121 may transmit the control command determined by the voice recognition module 123 to the device control server 500 to control the operation of the control target device.

Specifically, when a control command is received from the voice recognition module 123, the device control module 121 may transmit the received control command to the device control server 500 through the communication unit 130. The device control server 500 then transmits the control signal corresponding to the control command to the control target device, whereby the operation of the control target device can be controlled.

The voice recognition module 123 may determine the control command corresponding to the user voice received through the microphone 110.

Specifically, the voice recognition module 123 may perform a voice recognition function. To this end, the voice recognition module 123 may include an ASR module 123-1 and an NLU module 123-2.

The ASR module 123-1 may recognize the user voice and output the recognized user voice as text. For example, when a user voice such as "Turn on the TV" is received through the microphone 110, the ASR module 123-1 may recognize it and output text such as "Turn on the TV" to the NLU module 123-2.

To this end, the ASR module 123-1 may include an acoustic model and a language model. The acoustic model may include information related to vocalization, and the language model may include unit phoneme information and information on combinations of unit phoneme information. The ASR module 123-1 may thus convert the user voice into text using the information related to vocalization and the unit phoneme information.

The NLU module 123-2 may determine the meaning of the text received from the ASR module 123-1, that is, the user intention corresponding to the user voice. For example, when text such as "Turn on the TV" is received from the ASR module 123-1, the NLU module 123-2 may analyze it and obtain user intention information such as "TV power-on".

To this end, the NLU module 123-2 may determine the user intention by performing operations such as keyword matching, syntactic analysis, and semantic analysis on the text received from the ASR module 123-1.
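As a minimal sketch of the keyword matching step alone, and assuming English text from the ASR stage, an intent such as "TV power-on" could be extracted as shown below. The keyword tables and the function `extract_intent` are hypothetical; syntactic and semantic analysis are not modeled here.

```python
from typing import Dict, Optional, Tuple

# Hypothetical keyword tables mapping surface phrases to entities and operations.
ENTITY_KEYWORDS: Dict[str, str] = {
    "tv": "TV",
    "air conditioner": "Air conditioner",
}
OPERATION_KEYWORDS: Dict[str, str] = {
    "turn on": "power-on",
    "turn off": "power-off",
}


def extract_intent(text: str) -> Optional[Tuple[str, str]]:
    """Return (entity, operation) found by keyword matching, or None."""
    lowered = text.lower()
    entity = next((e for k, e in ENTITY_KEYWORDS.items() if k in lowered), None)
    operation = next((o for k, o in OPERATION_KEYWORDS.items() if k in lowered), None)
    if entity is None or operation is None:
        return None
    return entity, operation
```

For the utterance "Turn on the TV", this sketch yields the entity "TV" and the operation "power-on", matching the example in the text.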

Although FIG. 2 illustrates the voice recognition module 123 as including both the ASR module 123-1 and the NLU module 123-2, the embodiment is not limited thereto. For example, the voice recognition module 123 may include only one of the ASR module 123-1 and the NLU module 123-2. In this case, the voice recognition module 123 may request the processing corresponding to the role of the other module from the voice recognition server 400 and receive the processing result.

In either case, as described above, when the reliability of the voice recognition processing result, that is, the ASR processing and/or NLU processing result, is below a certain level, the voice recognition module 123 may request ASR processing and/or NLU processing of the user voice from the voice recognition server 400 and obtain the processing result.

In this way, the voice recognition module 123 may perform voice recognition processing on the user voice received through the microphone 110 to obtain the user intention information corresponding to the user voice.

또한, 음성 인식 모듈(123)은 사용자 의도 정보와 관련된 외부 기기의 상태 정보를 획득할 수 있다. 구체적으로, 상술한 바와 같이 사용자 의도 정보가 획득되면, NLU 모듈(123-2)(이하에서, NLU 모듈(123-2)의 동작은, 음성 인식 모듈(123)이 ASR 모듈(123-1)만 포함하는 실시 예의 경우에는, 음성 인식 모듈(123)이 수행하는 것으로 볼 수 있다.)은 획득된 사용자 의도 정보와 관련된 외부 기기의 상태 정보를 기기 제어 모듈(121)을 통해 기기 제어 서버(500)로부터 수신할 수 있다. Also, the voice recognition module 123 may obtain state information of an external device related to user intention information. Specifically, as described above, when the user intention information is obtained, the NLU module 123-2 (hereinafter, the operation of the NLU module 123-2, the voice recognition module 123 is the ASR module 123-1) In the case of an embodiment including only, the voice recognition module 123 may be regarded as being performed.) The device control server 500 through the device control module 121 transmits state information of the external device related to the acquired user intention information. ).

이와 같이, 사용자 의도 정보 및 외부 기기의 상태 정보가 획득되면, NLU 모듈(123-2)은, 제어 명령 판단 툴(141) 중 룰 DB(141-1)의 로딩을 툴 관리 모듈(125)에 요청할 수 있다. When the user intention information and the state information of the external device are obtained in this way, the NLU module 123-2 may request the tool management module 125 to load the rule DB 141-1 of the control command determination tool 141.

이에 따라, 룰 DB(141-1)가 프로세서(120)에 로딩되면, NLU 모듈(123-2)은 로딩된 룰 DB(141-1) 중 상기 획득된 사용자 의도 정보 및 외부 기기의 상태 정보와 동일한 사용자 의도 정보 및 외부 기기의 상태 정보가 매칭되어 있는 룰이 룰 DB(141-1)에 존재하는지 확인하고, 존재하는 경우, 해당 룰에 매칭되어 있는 제어 명령을, 사용자 음성에 대응되는 제어 명령으로 판단할 수 있다. Accordingly, when the rule DB 141-1 is loaded into the processor 120, the NLU module 123-2 may check whether the loaded rule DB 141-1 contains a rule whose user intention information and external device state information are identical to the obtained user intention information and external device state information and, if such a rule exists, determine the control command matched to that rule as the control command corresponding to the user voice.

예를 들어, 이전에 사용자가 "TV 틀어줘"라는 사용자 음성을 발화함에 따라 음성 인식 서버(400)를 통해 TV 1(200-1)을 제어했던 히스토리에 기초하여, 상기 표 1의 룰 1과 같은 룰이 생성되어 룰 DB(141-1)에 저장되어 있고, 이후에 다시 사용자가 "TV 틀어줘"라는 음성을 발화한 경우를 가정하자. For example, assume that a rule such as Rule 1 of Table 1 has been generated and stored in the rule DB 141-1 based on a history in which the user previously uttered "Turn on the TV" and TV 1 (200-1) was controlled through the voice recognition server 400, and that the user later utters "Turn on the TV" again.

이 경우, 음성 인식 모듈(123)은 사용자 음성을 음성 인식 처리하여 "TV power-on"이라는 사용자 의도 정보를 획득하고, NLU 모듈(123-2)은 TV와 관련된 외부 기기의 상태 정보를 기기 제어 모듈(121)을 통해 기기 제어 서버(500)로부터 수신할 수 있다. In this case, the voice recognition module 123 may perform voice recognition on the user voice to obtain the user intention information "TV power-on", and the NLU module 123-2 may receive the state information of the external devices related to the TV from the device control server 500 through the device control module 121.

또한, NLU 모듈(123-2)은 툴 관리 모듈(125)에 룰 DB(141-1)의 로딩을 요청하고, 이에 따라, 룰 1을 포함하는 룰 DB(141-1)가 프로세서(120)에 로딩될 수 있다. In addition, the NLU module 123-2 may request the tool management module 125 to load the rule DB 141-1, and accordingly the rule DB 141-1 including Rule 1 may be loaded into the processor 120.

이때, NLU 모듈(123-2)로 수신된 외부 기기의 상태 정보가 "TV 1(200-1) off" 및 "TV 2(200-2) on"인 경우, 룰 DB(141-1)에는 사용자 의도 정보가 "TV power-on"이고, 외부 기기의 상태 정보가 "TV 1(200-1) off" 및 "TV 2(200-2) on"인 룰 1이 존재하므로, 음성 인식 모듈(123)은 룰 1에 매칭되어 있는 제어 명령인 "TV 1(200-1) power-on"을, 현재 수신된 "TV 틀어줘"에 대한 제어 명령으로 즉시 판단할 수 있다. At this time, when the state information of the external devices received by the NLU module 123-2 is "TV 1 (200-1) off" and "TV 2 (200-2) on", the rule DB 141-1 contains Rule 1, whose user intention information is "TV power-on" and whose external device state information is "TV 1 (200-1) off" and "TV 2 (200-2) on". The voice recognition module 123 may therefore immediately determine "TV 1 (200-1) power-on", the control command matched to Rule 1, as the control command for the currently received "Turn on the TV".
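
The rule lookup walked through above can be sketched as an exact-match search. The in-memory representation below (a list of tuples, the name `RULE_DB`, and the function `find_control_command`) is an assumption made for illustration; the disclosure does not specify how the rule DB 141-1 is stored.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Hypothetical rule shape: (user intention, external device states, control command).
Rule = Tuple[str, FrozenSet[Tuple[str, str]], str]

RULE_DB: List[Rule] = [
    # Rule 1 as described in the text.
    ("TV power-on",
     frozenset({("TV 1 (200-1)", "off"), ("TV 2 (200-2)", "on")}),
     "TV 1 (200-1) power-on"),
]

def find_control_command(intent: str,
                         states: Dict[str, str],
                         rules: List[Rule]) -> Optional[str]:
    """Return the command of the first rule whose intent and states both match exactly."""
    observed = frozenset(states.items())
    for rule_intent, rule_states, command in rules:
        if rule_intent == intent and rule_states == observed:
            return command
    return None  # no matching rule in the DB

command = find_control_command(
    "TV power-on",
    {"TV 1 (200-1)": "off", "TV 2 (200-2)": "on"},
    RULE_DB,
)
print(command)
```

Comparing the states as a frozenset makes the match independent of the order in which device states are reported.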

이상에서는, 음성 인식 모듈(123)의 요청에 따라 메모리(140)에 저장된 전체 룰 DB(141-1)가 프로세서(120)에 로딩되는 것을 예로 들어 설명하였으나, 실시 예가 이에 한정되는 것은 아니다. The above describes an example in which the entire rule DB 141-1 stored in the memory 140 is loaded into the processor 120 at the request of the voice recognition module 123, but the embodiment is not limited thereto.

가령, 사용자 의도 정보와 관련된 외부 기기를 포함하는 룰들만 프로세서(120)에 로딩될 수도 있다. 예를 들어, 상기 표 1과 같은 룰 DB(141-1)가 메모리(140)에 저장되어 있고, 상술한 예에서와 같이 "TV power-on"이라는 사용자 의도 정보가 획득되면, 음성 인식 모듈(123)은 사용자 의도 정보와 관련된 외부 기기 즉, TV를 포함하는 룰의 로딩을 툴 관리 모듈(125)로 요청할 수 있다. 이에 따라, 툴 관리 모듈(125)은 룰 DB(141-1) 중 TV를 포함하는 룰 1만 프로세서(120)에 로딩할 수 있다. 이 경우, 음성 인식 모듈(123)이 매칭되는 룰의 존부를 확인할 때 비교 대상이 줄어들게 되므로, 보다 신속하게 제어 명령의 존부가 판단될 수 있을 것이다. For example, only the rules that include an external device related to the user intention information may be loaded into the processor 120. If the rule DB 141-1 shown in Table 1 is stored in the memory 140 and the user intention information "TV power-on" is obtained as in the above example, the voice recognition module 123 may request the tool management module 125 to load the rules that include the external device related to the user intention information, that is, the TV. Accordingly, the tool management module 125 may load only Rule 1, which includes a TV, from the rule DB 141-1 into the processor 120. In this case, the number of comparison targets decreases when the voice recognition module 123 checks whether a matching rule exists, so whether a control command exists can be determined more quickly.
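
A minimal sketch of this partial loading, assuming rules are kept as simple records and that a device "includes a TV" when its name starts with the device type. The `Rule` dataclass, `load_rules_for`, and the air conditioner rule are illustrative assumptions, not contents of Table 1.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class Rule:
    intent: str
    states: Tuple[Tuple[str, str], ...]  # (device name, state) pairs
    command: str

STORED_RULES: List[Rule] = [
    Rule("TV power-on",
         (("TV 1 (200-1)", "off"), ("TV 2 (200-2)", "on")),
         "TV 1 (200-1) power-on"),
    # Illustrative second rule so that the filter has something to exclude.
    Rule("Air conditioner power-on",
         (("Air conditioner 1", "off"),),
         "Air conditioner 1 power-on"),
]

def load_rules_for(device_type: str, stored: List[Rule]) -> List[Rule]:
    """Return only the rules whose state list mentions the given device type."""
    return [
        rule for rule in stored
        if any(name.startswith(device_type) for name, _ in rule.states)
    ]

loaded = load_rules_for("TV", STORED_RULES)
print([rule.command for rule in loaded])
```

Only the TV rule is loaded, so a later match has fewer candidates to compare against.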

한편, 본 개시의 다른 일 실시 예에 따르면, 사용자 의도 정보 및 외부 기기의 상태 정보가 획득된 경우, NLU 모듈(123-2)은 제어 명령 판단 툴(141) 중 인공 지능 모델(141-2)의 로딩을 툴 관리 모듈(125)에 요청할 수도 있다. Meanwhile, according to another embodiment of the present disclosure, when the user intention information and the state information of the external device are obtained, the NLU module 123-2 may request the tool management module 125 to load the artificial intelligence model 141-2 of the control command determination tool 141.

이에 따라, 인공 지능 모델(141-2)이 프로세서(120)에 로딩되면, NLU 모듈(123-2)은, 획득된 사용자 의도 정보 및 외부 기기의 상태 정보를 인공 지능 모델(141-2)에 입력하고, 이에 따라 출력되는 제어 명령(예를 들어, 복수의 제어 명령들 중 가장 높은 확률 값을 갖는 제어 명령)을, 사용자 음성에 대응되는 제어 명령으로 판단할 수 있다. Accordingly, when the artificial intelligence model 141-2 is loaded into the processor 120, the NLU module 123-2 may input the obtained user intention information and external device state information into the artificial intelligence model 141-2 and determine the control command output as a result (for example, the control command having the highest probability value among a plurality of control commands) as the control command corresponding to the user voice.

예를 들어, 이전에 사용자가 "TV 틀어줘"라는 사용자 음성을 발화함에 따라 음성 인식 서버(400)를 통해 TV 1(200-1)을 제어 했던 히스토리에 기초하여, 사용자 의도 정보("TV power-on") 및 사용자 의도 정보와 관련된 외부 기기의 상태 정보("TV 1(200-1) off", "TV 2(200-2) on")가 입력되는 경우 제어 명령("TV 1(200-1) power-on")이 가장 높은 확률 값으로 출력되도록 하여 인공 지능 모델(141-2)이 학습되어 있는 경우를 가정하자. For example, assume that, based on a history in which the user previously uttered "Turn on the TV" and TV 1 (200-1) was controlled through the voice recognition server 400, the artificial intelligence model 141-2 has been trained so that the control command ("TV 1 (200-1) power-on") is output with the highest probability value when the user intention information ("TV power-on") and the state information of the external devices related to it ("TV 1 (200-1) off", "TV 2 (200-2) on") are input.

이 경우, 이후에 다시 사용자가 "TV 틀어줘"라는 음성을 발화한 경우를 가정하면, 음성 인식 모듈(123)은 사용자 음성을 음성 인식 처리하여 "TV power-on"이라는 사용자 의도 정보를 획득하고, NLU 모듈(123-2)는 TV와 관련된 외부 기기의 상태 정보를 기기 제어 모듈(121)을 통해 기기 제어 서버(500)로부터 수신할 수 있다. In this case, assuming that the user utters the voice "Turn on TV" again later, the voice recognition module 123 performs voice recognition processing on the user's voice to obtain user intention information of "TV power-on". , The NLU module 123-2 may receive status information of an external device related to the TV from the device control server 500 through the device control module 121.

이때, 수신된 외부 기기의 상태 정보가 "TV 1(200-1) off", "TV 2(200-2) on"인 경우, NLU 모듈(123-2)은, 현재 획득된 사용자 의도 정보인 "TV power-on" 및 현재 수신한 외부 기기의 상태 정보인 "TV 1(200-1) off", "TV 2(200-2) on"을 상기 학습된 인공 지능 모델(141-2)에 입력할 수 있다. 이에 따라, NLU 모듈(123-2)은, 인공 지능 모델(141-2)에서 출력되는 제어 명령 "TV 1(200-1) power-on"을 현재 수신된 "TV 틀어줘"에 대한 제어 명령으로 즉시 판단할 수 있다. At this time, when the received state information of the external devices is "TV 1 (200-1) off" and "TV 2 (200-2) on", the NLU module 123-2 may input the currently obtained user intention information "TV power-on" and the currently received external device state information "TV 1 (200-1) off" and "TV 2 (200-2) on" into the trained artificial intelligence model 141-2. Accordingly, the NLU module 123-2 may immediately determine the control command "TV 1 (200-1) power-on" output from the artificial intelligence model 141-2 as the control command for the currently received "Turn on the TV".
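
The selection of the highest-probability command can be sketched as follows. The "model" here is a hard-coded stand-in that returns fixed probabilities, not a trained network, and the names `pick_command`, `toy_model`, and the 0.5 threshold are assumptions for illustration only.

```python
from typing import Callable, Dict, Optional

Model = Callable[[str, Dict[str, str]], Dict[str, float]]

def pick_command(model: Model,
                 intent: str,
                 states: Dict[str, str],
                 threshold: float = 0.5) -> Optional[str]:
    """Return the most probable command, or None if it is below the threshold."""
    probabilities = model(intent, states)
    if not probabilities:
        return None
    best = max(probabilities, key=probabilities.get)
    return best if probabilities[best] >= threshold else None

def toy_model(intent: str, states: Dict[str, str]) -> Dict[str, float]:
    # Stand-in for the trained model: confident only for the case it has "seen".
    if intent == "TV power-on" and states.get("TV 1 (200-1)") == "off":
        return {"TV 1 (200-1) power-on": 0.9, "TV 2 (200-2) power-on": 0.1}
    return {"TV 1 (200-1) power-on": 0.4, "TV 2 (200-2) power-on": 0.3}

chosen = pick_command(
    toy_model,
    "TV power-on",
    {"TV 1 (200-1)": "off", "TV 2 (200-2)": "on"},
)
print(chosen)
```

Returning `None` when no command reaches the threshold models the "no control command is output" case, which would be handed off to the voice recognition server.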

한편, 본 개시의 일 실시 예에 따르면, NLU 모듈(123-2)은, 제어 명령 판단 툴(141)을 이용하기 전에, 도 1의 음성 인식 서버(400)의 동작과 같이, 획득된 사용자 의도 정보 및 외부 기기의 상태 정보에 기초하여 사용자 의도 정보만으로 제어 명령이 특정되는지를 먼저 판단하고, 사용자 의도 정보만으로 제어 명령이 특정되는 경우에는 제어 명령 판단 툴(141)을 이용하지 않고 상기 획득된 사용자 의도 정보에 기초하여 제어 명령을 바로 판단할 수도 있다. Meanwhile, according to an embodiment of the present disclosure, before using the control command determination tool 141, the NLU module 123-2 may first determine, like the voice recognition server 400 of FIG. 1, whether the control command is specified by the user intention information alone based on the obtained user intention information and external device state information. If so, it may determine the control command directly from the obtained user intention information without using the control command determination tool 141.

예를 들어, 위 예에서 기기 제어 서버(500)에 등록된 사용자 계정의 TV가 1대이고, "TV off"라는 TV의 상태 정보가 수신된 경우, "TV power-on"이라는 사용자 의도 정보만으로, 제어 대상 기기 및 제어 대상 기기의 동작이 특정되므로, 음성 인식 모듈(123)은 툴 관리 모듈(125)로 제어 명령 판단 툴(141)을 요청하지 않고, "TV power-on"을 "TV 틀어줘"라는 사용자 음성에 대한 제어 명령으로 바로 판단할 수도 있을 것이다. For example, if in the above example only one TV is registered to the user account on the device control server 500 and the TV state information "TV off" is received, the user intention information "TV power-on" alone specifies the control target device and its operation. The voice recognition module 123 may therefore determine "TV power-on" directly as the control command for the user voice "Turn on the TV", without requesting the control command determination tool 141 from the tool management module 125.
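
One possible reading of this shortcut is sketched below: if the account has exactly one device of the requested type, the intent alone identifies the target. The function name `resolve_directly` and the convention that an intent is a "device type, action" string are assumptions for the example; the disclosure does not define the exact criterion.

```python
from typing import Dict, Optional

def resolve_directly(intent: str, states: Dict[str, str]) -> Optional[str]:
    """Return a command without consulting the rule DB or the model when the
    intent can only refer to one registered device; otherwise return None."""
    device_type, action = intent.split(" ", 1)  # e.g. "TV", "power-on"
    matching = [name for name in states if name.startswith(device_type)]
    if len(matching) == 1:
        return f"{matching[0]} {action}"
    return None  # ambiguous (or no such device): fall back to the determination tool

single = resolve_directly("TV power-on", {"TV": "off"})
ambiguous = resolve_directly(
    "TV power-on",
    {"TV 1 (200-1)": "off", "TV 2 (200-2)": "on"},
)
print(single, ambiguous)
```

With one registered TV the command is produced immediately; with two TVs the function declines, and the rule DB or the model would be consulted instead.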

이와 같이, 사용자 음성에 대응되는 제어 명령이 판단되면, 음성 인식 모듈(123)은 판단된 제어 명령을 기기 제어 모듈(121)로 전송함으로써, 제어 대상 기기의 동작을 제어할 수 있다. When the control command corresponding to the user voice is determined in this way, the voice recognition module 123 may control the operation of the control target device by transmitting the determined control command to the device control module 121.

한편, 툴 관리 모듈(125)은 룰 DB(141-1) 또는 인공 지능 모델(141-2)을 업데이트 할 수 있다. 구체적으로, 음성 인식 모듈(123)은 획득된 사용자 의도 정보 및 외부 기기의 상태 정보에 대응되는 룰이 룰 DB(141-1)에 존재하지 않거나, 획득된 사용자 의도 정보 및 외부 기기의 상태 정보를 인공 지능 모델(141-2)에 입력하였으나 제어 명령이 출력되지 않는 경우(예를 들어, 기설정된 확률 값 이상을 갖는 제어 명령이 없는 경우), 획득된 사용자 의도 정보 및 외부 기기의 상태 정보를 툴 관리 모듈(125)로 전송하고, 통신부(130)를 통해 사용자 음성을 음성 인식 서버(400)로 전송할 수 있다. Meanwhile, the tool management module 125 may update the rule DB 141-1 or the artificial intelligence model 141-2. Specifically, when no rule corresponding to the obtained user intention information and external device state information exists in the rule DB 141-1, or when the obtained user intention information and external device state information are input into the artificial intelligence model 141-2 but no control command is output (for example, when no control command has at least a predetermined probability value), the voice recognition module 123 may transmit the obtained user intention information and external device state information to the tool management module 125 and transmit the user voice to the voice recognition server 400 through the communication unit 130.

이에 따라, 음성 인식 서버(400)는 도 1에서 전술한 바와 같이 사용자 음성에 대응되는 제어 명령을 판단하고, 판단된 제어 명령을 기기 제어 서버(500) 및 전자 장치(100)로 전송할 수 있다. 또는 음성 인식 서버(400)는 판단된 제어 명령을 기기 제어 서버(500)로만 전송하고, 기기 제어 서버(500)가 전자 장치(100)로 제어 명령을 전송할 수도 있다. Accordingly, as described above in FIG. 1, the voice recognition server 400 may determine a control command corresponding to the user's voice and transmit the determined control command to the device control server 500 and the electronic device 100. Alternatively, the voice recognition server 400 may transmit the determined control command only to the device control server 500, and the device control server 500 may transmit the control command to the electronic device 100.

이때, 기기 제어 서버(500)로 전송된 제어 명령은, 도 1에서 전술한 바와 같이, 제어 신호 형태로 제어 대상 기기로 전송되어 제어 대상 기기의 동작을 제어하는데 이용된다. At this time, the control command transmitted to the device control server 500 is transmitted to the control target device in the form of a control signal and is used to control the operation of the control target device, as described above in FIG. 1.

한편, 전자 장치(100)로 전송된 제어 명령은 룰 DB(141-1)나 인공 지능 모델(141-2)를 업데이트하는데 이용되게 된다. Meanwhile, the control command transmitted to the electronic device 100 is used to update the rule DB 141-1 or the artificial intelligence model 141-2.

구체적으로, 음성 인식 서버(400)에서 판단된 제어 명령이 통신부(130)를 통해 수신되면, 툴 관리 모듈(125)은, 음성 인식 모듈(123)로부터 수신한 사용자 의도 정보 및 외부 기기의 상태 정보를, 음성 인식 서버(400)로부터 수신한 제어 명령과 매칭시켜 신규 룰을 생성하고, 생성된 신규 룰을 룰 DB(141-1)에 업데이트 할 수 있다. Specifically, when the control command determined by the voice recognition server 400 is received through the communication unit 130, the tool management module 125 may generate a new rule by matching the user intention information and external device state information received from the voice recognition module 123 with the control command received from the voice recognition server 400, and may add the generated new rule to the rule DB 141-1.

또한, 툴 관리 모듈(125)은, 음성 인식 모듈(123)로부터 수신한 사용자 의도 정보 및 외부 기기의 상태 정보를 입력으로 하고, 음성 인식 서버(400)로부터 수신한 제어 명령을 출력으로 하여 인공 지능 모델을 학습시킴으로써, 인공 지능 모델(141-2)을 업데이트할 수 있다. In addition, the tool management module 125 inputs user intention information received from the voice recognition module 123 and status information of an external device, and outputs a control command received from the voice recognition server 400 to produce artificial intelligence. By training the model, the artificial intelligence model 141-2 can be updated.

이와 같이, 업데이트된 룰 DB(141-1) 및 인공 지능 모델(141-2)은 이후 마이크(110)를 통해 수신되는 사용자 음성에 대응되는 제어 명령 판단에 이용될 수 있다. In this way, the updated rule DB 141-1 and the artificial intelligence model 141-2 may be used to determine a control command corresponding to the user's voice received through the microphone 110 afterwards.
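
The rule-update flow described above can be sketched as a small class that remembers the inputs of a failed lookup and pairs them with the command later returned by the server. The class name `ToolManager`, its methods, and the dictionary-based rule DB are assumptions for illustration; model retraining is omitted from the sketch.

```python
from typing import Dict, FrozenSet, Optional, Tuple

RuleKey = Tuple[str, FrozenSet[Tuple[str, str]]]

class ToolManager:
    """Hypothetical stand-in for the tool management module (125)."""

    def __init__(self) -> None:
        self.rule_db: Dict[RuleKey, str] = {}
        self._pending: Optional[RuleKey] = None

    def remember_miss(self, intent: str, states: Dict[str, str]) -> None:
        # Keep the inputs for which no command could be determined locally.
        self._pending = (intent, frozenset(states.items()))

    def on_server_command(self, command: str) -> None:
        # Pair the remembered inputs with the server's command as a new rule.
        if self._pending is not None:
            self.rule_db[self._pending] = command
            self._pending = None

manager = ToolManager()
states = {"TV 1 (200-1)": "off", "TV 2 (200-2)": "on"}
manager.remember_miss("TV power-on", states)
manager.on_server_command("TV 1 (200-1) power-on")
print(manager.rule_db[("TV power-on", frozenset(states.items()))])
```

After the update, the same intent and device states resolve locally without another round trip to the server.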

한편, 이상에서는, 툴 관리 모듈(125)이 음성 인식 모듈(123)로부터 사용자 의도 정보 및 외부 기기의 상태 정보를 모두 수신하는 것을 예로 들었으나, 실시 예가 이에 한정되는 것은 아니다. Meanwhile, in the above, it has been exemplified that the tool management module 125 receives both user intention information and state information of an external device from the voice recognition module 123, but the embodiment is not limited thereto.

예를 들어, 음성 인식 모듈(123)은 획득된 사용자 의도 정보만을 툴 관리 모듈(125)로 전송할 수 있다. 이에 따라, 툴 관리 모듈(125)은 음성 인식 모듈(123)로부터 수신된 사용자 의도 정보와 관련된 외부 기기의 상태 정보를 기기 제어 모듈(121)로 요청하고, 기기 제어 모듈(121)로부터 외부 기기의 상태 정보를 직접 수신할 수도 있다. For example, the voice recognition module 123 may transmit only the acquired user intention information to the tool management module 125. Accordingly, the tool management module 125 requests the device control module 121 for status information of the external device related to the user intention information received from the voice recognition module 123, and the device control module 121 You can also receive status information directly.

한편, 이상에서는, NLU 모듈(123-2)이 제어 명령을 판단하기 위해 룰 DB(141-1) 또는 인공 지능 모델(141-2)을 이용하는 예를 각각 설명하였다. 그러나, 본 개시의 일 실시 예에 따르면, NLU 모듈(123-2)은, 먼저 룰 DB(141-1)를 이용하여 제어 명령 판단을 시도하고, 획득된 사용자 의도 정보 및 외부 기기의 상태 정보에 대응되는 룰이 룰 DB(141-1)에 존재하지 않는 경우, 인공 지능 모델(141-2)을 이용하여 제어 명령을 판단할 수 있다. Meanwhile, the above describes separate examples in which the NLU module 123-2 uses the rule DB 141-1 or the artificial intelligence model 141-2 to determine a control command. However, according to an embodiment of the present disclosure, the NLU module 123-2 may first attempt to determine the control command using the rule DB 141-1 and, when no rule corresponding to the obtained user intention information and external device state information exists in the rule DB 141-1, determine the control command using the artificial intelligence model 141-2.

또는 본 개시의 다른 실시 예에 따르면, NLU 모듈(123-2)은, 먼저 인공 지능 모델(141-2)을 이용하여 제어 명령 판단을 시도하고, 기설정된 확률 값 이상을 갖는 제어 명령이 인공 지능 모델(141-2)에서 출력되지 않는 경우, 룰 DB(141-1)에 기초하여 제어 명령을 판단할 수도 있다. Alternatively, according to another embodiment of the present disclosure, the NLU module 123-2 first attempts to determine a control command using the artificial intelligence model 141-2, and the control command having a predetermined probability value or more is artificial intelligence. When not output from the model 141-2, the control command may be determined based on the rule DB 141-1.

위 두 실시 예에서, 룰 DB(141-1) 및 인공 지능 모델(141-2) 어느 것을 이용하더라도 제어 명령이 판단되지 않는 경우, 툴 관리 모듈(125)은 전술한 바와 같이 룰 DB(141-1) 및 인공 지능 모델(141-2)을 업데이트할 수 있다. In the above two embodiments, when the control command cannot be determined using either the rule DB 141-1 or the artificial intelligence model 141-2, the tool management module 125 may update the rule DB 141-1 and the artificial intelligence model 141-2 as described above.
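
Both orderings described above (rule DB first, or model first) reduce to trying a sequence of lookups until one yields a command. The sketch below assumes each lookup is a function returning a command or `None`; `determine_command`, `rule_lookup`, and `model_lookup` are hypothetical names.

```python
from typing import Callable, Dict, Optional, Sequence

Lookup = Callable[[str, Dict[str, str]], Optional[str]]

def determine_command(intent: str,
                      states: Dict[str, str],
                      lookups: Sequence[Lookup]) -> Optional[str]:
    """Try each lookup in order and return the first command found."""
    for lookup in lookups:
        command = lookup(intent, states)
        if command is not None:
            return command
    return None  # neither tool produced a command: defer to the server

def rule_lookup(intent: str, states: Dict[str, str]) -> Optional[str]:
    return None  # pretend the rule DB has no matching rule

def model_lookup(intent: str, states: Dict[str, str]) -> Optional[str]:
    return "TV 1 (200-1) power-on"  # pretend the model is confident

states = {"TV 1 (200-1)": "off", "TV 2 (200-2)": "on"}
rule_first = determine_command("TV power-on", states, [rule_lookup, model_lookup])
print(rule_first)
```

Passing `[model_lookup, rule_lookup]` instead gives the other embodiment, and a `None` result corresponds to the case in which the server is consulted and the tools are updated.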

이하에서는, 도 3 내지 도 5를 참조하여 본 개시의 다양한 실시 예들을 설명한다. 도 3 내지 도 5를 설명함에 있어 전술한 것과 동일한 내용의 중복 설명은 생략한다. Hereinafter, various embodiments of the present disclosure will be described with reference to FIGS. 3 to 5. In describing FIGS. 3 to 5, descriptions that duplicate the foregoing are omitted.

도 3은 본 개시의 일 실시 예에 따른 음성 인식 시스템을 도시한 도면이다. 도 3에 따르면, 음성 인식 시스템(1000')은, 전자 장치(100), 스마트 TV(200-3), 스마트 에어컨 1(200-4), 스마트 에어컨 2(200-5), 레거시 TV(200-6), 액세스 포인트(300), 음성 인식 서버(400) 및 기기 제어 서버(500)을 포함할 수 있다. 3 is a diagram illustrating a speech recognition system according to an embodiment of the present disclosure. According to FIG. 3, the voice recognition system 1000' includes an electronic device 100, a smart TV 200-3, a smart air conditioner 1 (200-4), a smart air conditioner 2 (200-5), and a legacy TV 200. -6), an access point 300, a voice recognition server 400, and a device control server 500.

전자 장치(100)와 스마트 TV(200-3), 에어컨 1(200-4), 에어컨 2(200-5)는 인터넷 연결이 가능한 기기들로, 댁내(10)에서 IoT 네트워크를 구성하고 있으며, 액세스 포인트(300)를 통해 음성 인식 서버(400) 및 기기 제어 서버(500)와 연결될 수 있다. 한편, 레거시 TV(200-6)의 경우, 인터넷 연결이 불가능하며, IR 통신 방식으로만 동작의 제어가 가능하다. The electronic device 100, smart TV (200-3), air conditioner 1 (200-4), air conditioner 2 (200-5) are devices that can connect to the Internet, and constitute an IoT network at home (10). It may be connected to the voice recognition server 400 and the device control server 500 through the access point 300. Meanwhile, in the case of the legacy TV 200-6, internet connection is not possible, and operation can be controlled only by using an IR communication method.

한편, 음성 인식 서버(400) 및 기기 제어 서버(500)는 클라우드 서버일 수 있으나, 이에 한정되는 것은 아니다. Meanwhile, the voice recognition server 400 and the device control server 500 may be cloud servers, but are not limited thereto.

이러한 상황에서, 사용자는 기기 제어 서버(500)에 접속하여 스마트 TV(200-3), 에어컨 1(200-4), 에어컨 2(200-5), 레거시 TV(200-6)를 자신의 계정에 등록할 수 있다. In this situation, the user accesses the device control server 500 and stores the smart TV 200-3, air conditioner 1 (200-4), air conditioner 2 (200-5), and legacy TV 200-6 in his/her account. You can register at

이때, 기기 제어 서버(500)는 액세스 포인트(300)를 통해 연결된 스마트 TV(200-3), 에어컨 1(200-4), 에어컨 2(200-5)에 대하여만 상태 정보의 모니터링 및 동작의 제어가 가능하고, 레거시 TV(200-6)에 대하여는 상태 정보의 모니터링과 동작의 제어가 불가능하다. 다만, 레거시 TV(200-6)의 존재는 사용자 계정을 통해 알 수 있다. At this time, the device control server 500 can monitor the state information and control the operation of only the smart TV 200-3, air conditioner 1 (200-4), and air conditioner 2 (200-5), which are connected through the access point 300; it cannot monitor the state information or control the operation of the legacy TV 200-6. However, the device control server 500 can recognize that the legacy TV 200-6 exists through the user account.

한편, 이전에 외부 기기를 제어했던 히스토리 정보에 기초하여 아래 표 2와 같은 룰 DB가 전자 장치(100)의 메모리(140)에 저장되어 있을 수 있다. Meanwhile, a rule DB as shown in Table 2 below may be stored in the memory 140 of the electronic device 100 based on history information that previously controlled the external device.

룰 (Rule) | 사용자 의도 정보 (User intention information) | 외부 기기의 상태 정보 (External device state information) | 제어 명령 (Control command)
룰 1 (Rule 1) | TV power-on | 스마트 TV(200-3): on, 레거시 TV(200-6) / Smart TV (200-3): on, Legacy TV (200-6) | 레거시 TV(200-6) power-on / Legacy TV (200-6) power-on
룰 2 (Rule 2) | 에어컨 power-on / Air conditioner power-on | 에어컨 1(200-4): off, 에어컨 2(200-5): off / Air conditioner 1 (200-4): off, Air conditioner 2 (200-5): off | 에어컨 2(200-5) power-on / Air conditioner 2 (200-5) power-on

이와 같은 상황에서, "TV 틀어줘"와 같은 사용자 음성이 수신되면, 전자 장치(100)는 전술한 바와 같이, 사용자 음성을 음성 인식 처리하여 "TV power-on"과 같은 사용자 의도 정보를 획득하고, TV의 상태 정보를 기기 제어 서버(500)로 요청할 수 있다. 이에 따라, "스마트 TV(200-3) on", "레거시 TV(200-6)"와 같은 상태 정보가 수신되면, 전자 장치(100)는 표 2의 룰 1에 기초하여 "레거시 TV(200-6) power-on"을 제어 명령으로 판단할 수 있다. In this situation, when a user voice such as "Turn on the TV" is received, the electronic device 100 may, as described above, perform voice recognition on the user voice to obtain user intention information such as "TV power-on" and request the TV state information from the device control server 500. Accordingly, when state information such as "Smart TV (200-3) on" and "Legacy TV (200-6)" is received, the electronic device 100 may determine "Legacy TV (200-6) power-on" as the control command based on Rule 1 of Table 2.

다만, 이 경우에는 기기 제어 서버(500)를 통한 레거시 TV(200-6)의 동작 제어가 불가능하므로, 전자 장치(100)는 판단된 제어 명령을 직접 레거시 TV(200-6)로 전송하여, 레거시 TV(200-6)의 동작을 제어할 수 있다. 이를 위해, 전자 장치(100)는 IR 블라스터 등과 같은 IR 신호를 송, 수신하기 위한 구성을 포함할 수 있다. However, in this case, since it is impossible to control the operation of the legacy TV 200-6 through the device control server 500, the electronic device 100 directly transmits the determined control command to the legacy TV 200-6, The operation of the legacy TV 200-6 can be controlled. To this end, the electronic device 100 may include a configuration for transmitting and receiving an IR signal such as an IR blaster.

한편, 에어컨 1(200-4) 및 에어컨 2(200-5)가 모두 off인 상태에서 사용자가 "에어컨 틀어줘"와 같은 음성을 발화한 경우, 전자 장치(100)는 표 2의 룰 2에 따라 "에어컨 2(200-5) power-on"와 같은 제어 명령을 판단할 수 있다. 이 경우에는 기기 제어 서버(500)를 통한 에어컨 2(200-5)의 동작 제어가 가능하므로, 전자 장치(100)는 도 1 및 도 2에서 전술한 바와 같이, 기기 제어 서버(500)로 제어 명령을 전송하여 에어컨 2(200-5)의 동작을 제어할 수 있다. On the other hand, when the user utters a voice such as "Turn on the air conditioner" while the air conditioner 1 (200-4) and the air conditioner 2 (200-5) are both off, the electronic device 100 complies with Rule 2 of Table 2. Accordingly, a control command such as "air conditioner 2 (200-5) power-on" can be determined. In this case, since it is possible to control the operation of the air conditioner 2 (200-5) through the device control server 500, the electronic device 100 is controlled by the device control server 500 as described above in FIGS. 1 and 2 By sending a command, the operation of the air conditioner 2 (200-5) can be controlled.

그러나, 실시 예가 이에 한정되는 것은 아니다. 즉, 기기 제어 서버(500)를 통한 동작 제어가 가능한 경우에도, 실시 예에 따라 와이 파이 다이렉트 등과 같은 기기 간 통신 방식을 통해 전자 장치(100)가 제어 명령을 직접 에어컨 2(200-5)로 전송함으로써, 에어컨 2(200-5)의 동작을 제어할 수도 있을 것이다. However, the embodiment is not limited thereto. That is, even when operation control through the device control server 500 is possible, the electronic device 100 directly sends a control command to the air conditioner 2 (200-5) through a communication method between devices such as Wi-Fi Direct, according to an embodiment. By transmitting, it may be possible to control the operation of the air conditioner 2 (200-5).
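
The choice of transmission path for a determined control command can be sketched as below. The set `SERVER_CONTROLLABLE`, the route labels, and the `prefer_direct` flag are assumptions made for this example; the disclosure only states that the legacy TV is reachable by IR and that direct device-to-device transmission is optional for the other devices.

```python
from typing import Set

# Devices that the device control server can control in the FIG. 3 scenario.
SERVER_CONTROLLABLE: Set[str] = {
    "Smart TV (200-3)",
    "Air conditioner 1 (200-4)",
    "Air conditioner 2 (200-5)",
}

def choose_route(device: str, prefer_direct: bool = False) -> str:
    """Pick how the electronic device delivers a control command to `device`."""
    if device not in SERVER_CONTROLLABLE:
        # e.g. the legacy TV: no Internet connection, so send an IR signal directly.
        return "ir"
    # Server-controllable devices may still be addressed directly if desired.
    return "wifi_direct" if prefer_direct else "device_control_server"

legacy_route = choose_route("Legacy TV (200-6)")
aircon_route = choose_route("Air conditioner 2 (200-5)")
aircon_direct = choose_route("Air conditioner 2 (200-5)", prefer_direct=True)
print(legacy_route, aircon_route, aircon_direct)
```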

도 4는 본 개시의 일 실시 예에 따른 전자 장치의 블럭도이다. 도 4에 따르면, 전자 장치(100')는 마이크(110), 프로세서(120), 통신부(130), 메모리(140), 스피커(150) 및 디스플레이(160)를 포함할 수 있다. 4 is a block diagram of an electronic device according to an embodiment of the present disclosure. Referring to FIG. 4, the electronic device 100 ′ may include a microphone 110, a processor 120, a communication unit 130, a memory 140, a speaker 150, and a display 160.

마이크(110)는 하나 이상의 마이크로폰으로 구현될 수 있으며, 전자 장치(100')와 일체형으로 구현될 수도 있고, 분리형으로 구현될 수도 있다. 여기서, 분리형 마이크는 마이크로폰이 전자 장치(100')의 본체에 포함되지 않고 따로 떨어져서 유선 또는 무선으로 전자 장치(100')와 연결되는 형태를 의미한다. The microphone 110 may be implemented as one or more microphones, and may be implemented integrally with the electronic device 100' or separately from it. Here, a separate microphone refers to a form in which the microphone is not included in the main body of the electronic device 100' but is located apart from it and connected to the electronic device 100' by wire or wirelessly.

통신부(130)는 각종 서버나 기기들과 통신을 수행하여, 다양한 정보(또는 데이터)를 송수신할 수 있는 하드웨어를 지칭할 수 있다. The communication unit 130 may refer to hardware capable of transmitting and receiving various information (or data) by performing communication with various servers or devices.

통신부(130)는 TCP/IP(Transmission Control Protocol/Internet Protocol), UDP(User Datagram Protocol), HTTP(Hyper Text Transfer Protocol), HTTPS(Secure Hyper Text Transfer Protocol), FTP(File Transfer Protocol), SFTP(Secure File Transfer Protocol), MQTT(Message Queuing Telemetry Transport) 등의 통신 규약(프로토콜)을 이용하여 외부의 서버(400, 500)나 기기들(200-1 내지 200-6)과 다양한 정보를 송수신할 수 있다. The communication unit 130 may transmit and receive various information to and from the external servers 400 and 500 or the devices 200-1 to 200-6 using communication protocols such as Transmission Control Protocol/Internet Protocol (TCP/IP), User Datagram Protocol (UDP), Hyper Text Transfer Protocol (HTTP), Secure Hyper Text Transfer Protocol (HTTPS), File Transfer Protocol (FTP), Secure File Transfer Protocol (SFTP), and Message Queuing Telemetry Transport (MQTT).

통신부(130)는 각종 서버(400, 500) 및 복수의 외부 기기(200-1 내지 200-6)와 각종 네트워크를 통해 연결될 수 있다. 여기서, 네트워크는 영역 또는 규모에 따라 개인 통신망(PAN; Personal Area Network), 근거리 통신망(LAN; Local Area Network), 광역 통신망(WAN; Wide Area Network) 등을 포함하며, 네트워크의 개방성에 따라 인트라넷(Intranet), 엑스트라넷(Extranet), 또는 인터넷(Internet) 등을 포함할 수 있다.The communication unit 130 may be connected to various servers 400 and 500 and a plurality of external devices 200-1 to 200-6 through various networks. Here, the network includes a personal area network (PAN), a local area network (LAN), a wide area network (WAN), etc. according to an area or scale, and an intranet ( Intranet), an extranet, or the Internet.

통신부(130)는 근거리 무선 통신 모듈(미도시) 및 무선랜 통신 모듈(미도시) 중 적어도 하나의 통신 모듈을 포함할 수 있다. 근거리 무선 통신 모듈(미도시)은 근거리에 위치한 외부 기기와 무선으로 데이터 통신을 수행하는 통신 모듈로써, 예를 들어, 블루투스(Bluetooth) 모듈, 지그비(ZigBee) 모듈, NFC(Near Field Communication) 모듈, 적외선 통신 모듈, IR(Infrared) 통신 모듈, 와이파이 모듈(와이 파이 P2P 기능 사용 시) 등이 될 수 있다. 또한, 무선랜 통신 모듈(미도시)은 와이파이(WiFi), IEEE 등과 같은 무선 통신 프로토콜에 따라 외부 네트워크에 연결되어 외부 서버 또는 외부 기기와 통신을 수행하는 모듈이다.The communication unit 130 may include at least one of a short-range wireless communication module (not shown) and a wireless LAN communication module (not shown). A short-range wireless communication module (not shown) is a communication module that performs wireless data communication with an external device located in a short distance, for example, a Bluetooth module, a ZigBee module, a NFC (Near Field Communication) module, It can be an infrared communication module, an IR (Infrared) communication module, a Wi-Fi module (when using the Wi-Fi P2P function), and the like. In addition, the wireless LAN communication module (not shown) is a module that is connected to an external network according to a wireless communication protocol such as WiFi and IEEE, and communicates with an external server or an external device.

이 밖에 통신부(130)는 실시 예에 따라 3G(3rd Generation), 3GPP(3rd Generation Partnership Project), LTE(Long Term Evolution), 5G(5th Generation mobile communications) 등과 같은 다양한 이동 통신 규격에 따라 이동 통신망에 접속하여 통신을 수행하는 이동 통신 모듈을 더 포함할 수도 있으며, HDMI(High-Definition Multimedia Interface), USB(Universal Serial Bus), IEEE(Institute of Electrical and Electronics Engineers) 1394, RS-232, RS-422, RS-485, Ethernet 등과 같은 통신 규격에 따른 유선 통신 모듈(미도시)을 더 포함할 수도 있다. In addition, depending on the embodiment, the communication unit 130 may further include a mobile communication module that connects to a mobile communication network and performs communication according to various mobile communication standards such as 3G (3rd Generation), 3GPP (3rd Generation Partnership Project), LTE (Long Term Evolution), and 5G (5th Generation mobile communications), and may further include a wired communication module (not shown) according to communication standards such as HDMI (High-Definition Multimedia Interface), USB (Universal Serial Bus), IEEE (Institute of Electrical and Electronics Engineers) 1394, RS-232, RS-422, RS-485, and Ethernet.

한편, 통신부(130)는 상술한 유무선 통신 방식에 따른 네트워크 인터페이스(Network Interface) 또는 네트워크 칩을 포함할 수 있다. 또한, 통신 방식은 상술한 예에 한정되지 아니하고, 기술의 발전에 따라 새롭게 등장하는 통신 방식을 포함할 수 있다. Meanwhile, the communication unit 130 may include a network interface or a network chip according to the wired/wireless communication method described above. In addition, the communication method is not limited to the above-described example, and may include a communication method newly emerging according to the development of technology.

메모리(140)에는 전자 장치(100') 또는 프로세서(120)의 동작을 위한 운영체제(O/S), 각종 프로그램 또는 애플리케이션, 및 데이터가 저장될 수 있다. 구체적으로, 메모리(140)에는 전자 장치(100') 또는 프로세서(120)의 동작에 필요한 적어도 하나의 인스트럭션(instruction), 모듈 또는 데이터가 저장될 수 있다. The memory 140 may store an operating system (O/S) for the operation of the electronic device 100 ′ or the processor 120, various programs or applications, and data. Specifically, at least one instruction, module, or data necessary for the operation of the electronic device 100 ′ or the processor 120 may be stored in the memory 140.

여기서, 인스트럭션은 전자 장치(100') 또는 프로세서(120)의 동작을 지시하는 부호 단위로서, 컴퓨터가 이해할 수 있는 언어인 기계어로 작성된 것일 수 있다. 모듈은 작업 단위의 특정 작업을 수행하는 일련의 인스트럭션의 집합체(instruction set)일 수 있다. 데이터는 문자, 수, 영상 등을 나타낼 수 있는 비트(bit) 또는 바이트(byte) 단위의 정보일 수 있다.Here, the instruction is a unit of code indicating the operation of the electronic device 100' or the processor 120, and may be written in a machine language, which is a language understandable by a computer. A module may be a set of instructions that perform a specific task in a unit of work. The data may be information in bits or bytes that can represent characters, numbers, images, and the like.

메모리(140)는 프로세서(120)에 의해 액세스 되며, 프로세서(120)에 의해 인스트럭션, 모듈, 인공지능 모델 또는 데이터에 대한 독취/기록/수정/삭제/갱신 등이 수행될 수 있다. The memory 140 is accessed by the processor 120, and read/write/modify/delete/update instructions, modules, artificial intelligence models, or data by the processor 120 may be performed.

이를 위해, 메모리(140)는 내장 메모리 또는 외장 메모리를 포함할 수 있다. 내장 메모리는, 휘발성 메모리 또는 비휘발성 메모리(non-volatile Memory) 중 적어도 하나를 포함할 수 있다. 휘발성 메모리는, 예를 들어 DRAM(dynamic RAM), SRAM(static RAM), SDRAM(synchronous dynamic RAM) 등일 수 있다. 비휘발성 메모리는 예를 들어 OTPROM(one time programmable ROM), PROM(programmable ROM), EPROM(erasable and programmable ROM), EEPROM(electrically erasable and programmable ROM), mask ROM, flash ROM, NAND flash memory, NOR flash memory 등일 수 있다. 또한, 내장 메모리는 Solid State Drive(SSD)일 수 있다. 외장 메모리는 flash drive, CF(compact flash), SD(secure digital), Micro-SD(micro secure digital), Mini-SD(mini secure digital), xD(extreme digital) 또는 Memory Stick 등을 포함할 수 있다. 외장 메모리는 다양한 인터페이스를 통하여 전자 장치(100')와 기능적으로 연결될 수 있다. 또한, 전자 장치(100')는 하드 드라이브와 같은 저장 장치를 더 포함할 수도 있다. To this end, the memory 140 may include an internal memory or an external memory. The internal memory may include at least one of a volatile memory and a non-volatile memory. The volatile memory may be, for example, dynamic RAM (DRAM), static RAM (SRAM), or synchronous dynamic RAM (SDRAM). The non-volatile memory may be, for example, one time programmable ROM (OTPROM), programmable ROM (PROM), erasable and programmable ROM (EPROM), electrically erasable and programmable ROM (EEPROM), mask ROM, flash ROM, NAND flash memory, or NOR flash memory. The internal memory may also be a solid state drive (SSD). The external memory may include a flash drive, compact flash (CF), secure digital (SD), micro secure digital (Micro-SD), mini secure digital (Mini-SD), extreme digital (xD), or a Memory Stick. The external memory may be functionally connected to the electronic device 100' through various interfaces. The electronic device 100' may further include a storage device such as a hard drive.

The speaker 150 is a device that outputs an electrical signal in an audible form (e.g., voice). The speaker 150 may directly output, as sound, various notification sounds and voice messages as well as audio data on which an audio processing unit (not shown) has performed processing such as decoding, amplification, and noise filtering. For example, when a query according to the voice recognition server 400's own policy is received through the communication unit 130, the speaker 150 may output the received query as voice.

The display 160 is a device that outputs information in a visual form (e.g., graphics, text, images). The display 160 may display an image frame on all or part of the display area. For example, when a query according to the voice recognition server 400's own policy is received through the communication unit 130, the display 160 may output the received query as text.

The processor 120 may read various programs (e.g., at least one instruction, module, etc.) or data stored in the memory 140 and perform the operations of the electronic device 100' according to various embodiments of the present disclosure.

The rule DB, the artificial intelligence model, the voice recognition module, the device control module, and the like stored in the memory 140 may be loaded by the processor 120 so that each performs its function. To this end, the processor 120 may include an internal memory for loading at least part of the various programs and data stored in the memory 140.

Meanwhile, the processor 120 may include one or more of a central processing unit (CPU), a controller, an application processor (AP), a microprocessor unit (MPU), a communication processor (CP), a graphic processing unit (GPU), a vision processing unit (VPU), a neural processing unit (NPU), or an ARM processor.

Meanwhile, FIG. 2 describes, as an example, that the user voice is transmitted to the voice recognition server 400 when no rule corresponding to the obtained user intention information and the state information of the external device exists in the rule DB 141-1, or when the obtained user intention information and the state information of the external device are input to the artificial intelligence model 141-2 but no control command is output. However, the embodiment is not limited thereto.

That is, according to an embodiment, whenever a user voice is received, the processor 120 may always transmit the received user voice to the voice recognition server 400, regardless of whether a corresponding rule exists or whether a control command is output from the artificial intelligence model.

In such an embodiment, when a corresponding rule exists in the rule DB 141-1 or a control command is output from the artificial intelligence model, the control commands determined by the electronic device 100' and by the voice recognition server 400 are each transmitted to the device control server 500. The device control server 500 may therefore receive the same control command from both the electronic device 100' and the voice recognition server 400.

In this case, either the device control server 500, having received two control commands, ignores the control command received second, or the control target device, having received two control signals, ignores the control signal received second. This prevents the error in which the control target device performs the same operation twice.

Here, the control command that the device control server 500 receives second, or the control signal that the control target device receives second, will in most cases be the control command transmitted by the voice recognition server 400 and the control signal resulting from it. This is because the control command determination by the control command determination tool 141 is faster than the control command determination by the voice recognition server 400.
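As a non-limiting illustration, the duplicate handling described above may be sketched as follows. The disclosure does not specify how the second command is recognized as a duplicate; the sketch assumes a hypothetical time window (`window_seconds`) and treats two commands as identical when the device identifier and the command string both match. The class and parameter names are illustrative only.

```python
import time
from typing import Callable, Dict, Tuple


class DuplicateCommandFilter:
    """Ignores a control command that repeats an identical, recent one."""

    def __init__(self, window_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_seconds = window_seconds  # hypothetical value, not from the disclosure
        self._clock = clock
        self._last_accepted: Dict[Tuple[str, str], float] = {}

    def accept(self, device_id: str, command: str) -> bool:
        """Return True if the command should be executed, False if it is ignored."""
        now = self._clock()
        key = (device_id, command)
        previous = self._last_accepted.get(key)
        if previous is not None and now - previous <= self.window_seconds:
            return False  # second copy of the same command: ignore it
        self._last_accepted[key] = now
        return True


# A fake clock keeps the demonstration deterministic.
fake_now = [0.0]
command_filter = DuplicateCommandFilter(clock=lambda: fake_now[0])
from_device = command_filter.accept("TV 1", "Channel 11")   # arrives first
fake_now[0] = 1.0
from_server = command_filter.accept("TV 1", "Channel 11")   # arrives second
```

The same filter could sit either in the device control server 500 or in the control target device, matching the two alternatives described above.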

Meanwhile, FIGS. 1 and 2 describe, as an example of a case requiring additional interpretation by the voice recognition server 400, the case in which the control target device cannot be specified from the user intention information alone. However, the disclosure is not limited thereto. Additional interpretation by the voice recognition server 400 is also required when the control target device is specified but the operation of the control target device is not.

For example, when one TV 200-1 is registered in the user account and the user utters a user voice such as "Turn on MBC", the electronic device 100' may obtain user intention information such as "TV Channel MBC" through voice recognition processing and request the state information of "TV" from the device control server 500.

In this case, when TV 1 200-1 is turned on, the device control server 500 may transmit state information such as "TV 1(200-1) on" to the electronic device 100'. As described above with reference to FIG. 1, the state information of TV 1 200-1 is not limited to on/off information. For example, information on the broadcast channel currently being played on TV 1 200-1, information on the currently set volume of TV 1 200-1, and the like may of course also be included in the state information, depending on the embodiment.

However, since the electronic device 100' cannot know which channel number "MBC" refers to, it may transmit the user voice to the voice recognition server 400. (This assumes that no rule exists for "Turn on MBC" and that the artificial intelligence model has not been trained on it.)

Accordingly, the voice recognition server 400 may likewise obtain user intention information such as "TV Channel MBC" through voice recognition processing and receive state information such as "TV 1(200-1) on" from the device control server 500.

Since the voice recognition server 400 also cannot know which channel number "MBC" refers to, it determines that the operation of the control target device, that is, TV 1 200-1, is not specified by the user intention information alone. Through additional interpretation according to its own policy, such as a query to the user, it then confirms that MBC is channel 11.

Accordingly, the voice recognition server 400 determines a control command such as "TV 1(200-1) Channel 11" and transmits it to the device control server 500, and the channel of TV 1 200-1 may be changed as a result.

At this time, the voice recognition server 400 or the device control server 500 transmits the control command determined by the voice recognition server 400 to the electronic device 100'. Accordingly, the processor 120 of the electronic device 100' may generate a rule such as Rule 1 in Table 3 below and update the rule DB 141-1.

[Table 3]
Rule | User intention information | State information of external device | Control command
Rule 1 | TV Channel MBC | TV 1(200-1): on | TV 1(200-1) Channel 11
Rule 2 | Air conditioner power-on | Air conditioner 1: off; Air conditioner 2: off | Air conditioner 2 power-on

Thereafter, when the user again utters a user voice such as "Turn on MBC" while TV 1 200-1 is turned on, the processor 120 can quickly determine the control command using Rule 1 of Table 3 above.
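For illustration only, the rule lookup described above may be sketched as a dictionary keyed on the pair of user intention information and external device state information. The disclosure does not define a storage format for the rule DB 141-1; the exact-match key, the `RuleDB` class, and its method names are assumptions made for this sketch. The two rules are those of Table 3.

```python
from typing import Dict, FrozenSet, Mapping, Optional, Tuple

# A rule key pairs the user intention with an order-independent set of device states.
RuleKey = Tuple[str, FrozenSet[Tuple[str, str]]]


def make_key(intent: str, device_states: Mapping[str, str]) -> RuleKey:
    return intent, frozenset(device_states.items())


class RuleDB:
    """Hypothetical in-memory stand-in for the rule DB (141-1)."""

    def __init__(self) -> None:
        self._rules: Dict[RuleKey, str] = {}

    def add_rule(self, intent: str, device_states: Mapping[str, str],
                 command: str) -> None:
        self._rules[make_key(intent, device_states)] = command

    def lookup(self, intent: str,
               device_states: Mapping[str, str]) -> Optional[str]:
        """Return the matched control command, or None if no rule corresponds."""
        return self._rules.get(make_key(intent, device_states))


rule_db = RuleDB()
rule_db.add_rule("TV Channel MBC", {"TV 1": "on"}, "TV 1 Channel 11")
rule_db.add_rule("Air conditioner power-on",
                 {"Air conditioner 1": "off", "Air conditioner 2": "off"},
                 "Air conditioner 2 power-on")

# Second utterance of "Turn on MBC" while TV 1 is on: answered locally.
command = rule_db.lookup("TV Channel MBC", {"TV 1": "on"})
```

Because the key includes the device state, the same utterance made while TV 1 is off would not match Rule 1 under this assumed exact-match scheme.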

Meanwhile, various examples in which the electronic device 100' determines a control command corresponding to a user voice using the control command determination tool 141 have been described above. However, according to an embodiment of the present disclosure, the voice recognition server 400 may also determine a control command corresponding to a user voice using a control command determination tool 431.

FIG. 5 is a block diagram of a voice recognition server according to an embodiment of the present disclosure. Referring to FIG. 5, the voice recognition server 400 may include a communication unit 410, a processor 420, and a memory 430.

The communication unit 410 is a component for communicating with various external servers or devices. In particular, the communication unit 410 may communicate with the electronic device 100 and the device control server 500 to transmit and receive various information and data such as user voices, user intention information, state information of external devices, and control commands.

The memory 430 may store data or information in electrical or magnetic form so that the processor 420 or the like can access it. In particular, the memory 430 may store a control command determination tool 431 and the server's own policy, which serve as the basis for determining a control command; a tool management module 425 for managing the control command determination tool 431; a voice recognition module 423 for performing the voice recognition function and determining a control command; and a device control module 421 for monitoring the state information of the external devices 200-1 and 200-2 and transmitting and receiving control commands.

The processor 420 controls the overall operation of the voice recognition server 400. In particular, the processor 420 may load the control command determination tool 431, the tool management module 425, the voice recognition module 423, the device control module 421, and the like stored in the memory 430 and perform the function of each.

Specifically, FIG. 5 shows a state in which the processor 420 has loaded the device control module 421, the voice recognition module 423, and the tool management module 425 stored in the memory 430 and is performing the corresponding functions.

Among the components shown in FIG. 5, those with the same names as components shown in FIGS. 2 and 4 may have the same content or operate in the same manner as those components. Hereinafter, descriptions that duplicate those given above for FIGS. 2 and 4 are omitted, and the description focuses on the differences.

Referring to FIG. 5, the components are the same as those of the electronic device 100 of FIG. 2, except that there is no microphone. In the electronic device 100 the user voice is received through the microphone 110, whereas in the voice recognition server 400 it may be received from the electronic device 100 through the communication unit 410.

Meanwhile, even when no rule corresponding to the obtained user intention information and the state information of the external device exists in the rule DB 431-1, or when the obtained user intention information and the state information of the external device are input to the artificial intelligence model 431-2 but no control command is output, the voice recognition server 400 determines the control command directly through its own policy. The NLU module 423-2 therefore only needs to pass the control command it has determined directly to the tool management module 425, and the control command does not need to be received from outside through the communication unit 410.

For example, when a user voice is received from the electronic device 100 through the communication unit 410, the voice recognition module 423 may obtain user intention information by performing ASR and NLU processing on the received user voice, and may receive the state information of the external device related to the obtained user intention information from the device control server 500 through the device control module 421.

Accordingly, the NLU module 423-2 may request the tool management module 425 to load the control command determination tool 431 and, using the loaded control command determination tool 431, determine the control command corresponding to the user voice as described above with reference to FIG. 2.

Meanwhile, when the control command cannot be determined using the control command determination tool 431, the NLU module 423-2 passes the obtained user intention information and the state information of the external device to the tool management module 425. The NLU module 423-2 also determines the control command using its own policy.

When the control command is determined through its own policy, the NLU module 423-2 passes the determined control command to the tool management module 425. Based on the user intention information, the state information of the external device, and the control command received from the NLU module 423-2, the tool management module 425 may update the rule DB 431-1 or the artificial intelligence model 431-2 as described above with reference to FIG. 2.

The rule DB 431-1 and the artificial intelligence model 431-2 updated in this way may then be used to determine the control command corresponding to a user voice subsequently received from the electronic device 100 through the communication unit 410.

Meanwhile, a server device such as the voice recognition server 400 may generally have a larger storage capacity and a higher computation speed than a client device such as the electronic device 100.

Accordingly, the voice recognition server 400 may be equipped with a large-capacity, high-performance voice recognition module 423 (an ASR module 423-1, an NLU module 423-2, and the like) capable of processing the various voices input in various situations at the desired speed.

In the case of the electronic device 100, however, limits on storage space and computational performance restrict the voice recognition module 123 (the ASR module 123-1, the NLU module 123-2, and the like) that can be mounted on it.

Therefore, according to an embodiment of the present disclosure, when the electronic device 100 requests voice recognition processing (ASR processing and/or NLU processing) of a user voice, the voice recognition module 423 of the voice recognition server 400 may perform voice recognition processing on the received user voice as requested and transmit the result (text as the ASR processing result, or user intention information as the NLU processing result) to the electronic device 100 through the communication unit 410.

FIG. 6 is a flowchart illustrating a method of controlling an electronic device according to an embodiment of the present disclosure.

Referring to FIG. 6, the electronic device 100, 100' may store a control command determination tool 141 based on a control command determined by the voice recognition server 400 (S610).

Here, the control command determination tool 141 may include at least one of 1) a rule DB 141-1 including at least one rule in which user intention information, state information of an external device, and a control command determined by the voice recognition server 400 are matched with one another, and 2) an artificial intelligence model 141-2 trained with user intention information and state information of an external device as input and a control command determined by the voice recognition server 400 as output.

Meanwhile, when a user voice is received, the electronic device 100, 100' may obtain user intention information by performing voice recognition processing on the received user voice (S620), and may request and receive the state information of the external device related to the obtained user intention information from the device control server 500 (S630).

Accordingly, the electronic device 100, 100' may determine a control command for controlling the control target device by applying the obtained user intention information and the received state information of the external device to the control command determination tool 141 (S640), and may transmit the determined control command to the device control server 500 (S650).

For example, when a rule corresponding to the obtained user intention information and the received state information of the external device exists in the rule DB 141-1, the electronic device 100, 100' may determine the control command matched to the corresponding rule as the control command corresponding to the user voice.

If no corresponding rule exists in the rule DB 141-1, the electronic device 100, 100' may transmit the user voice to the voice recognition server 400. When the control command determined by the voice recognition server 400 is then received, the electronic device 100, 100' may generate a new rule by matching the received control command with the obtained user intention information and the received state information of the external device, and may update the rule DB 141-1 with the generated new rule.
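The rule hit and rule miss paths described above may be sketched as follows, purely as an illustration. The server round trip is represented by a caller-supplied function `ask_server`, which is a hypothetical placeholder for transmitting the user voice to the voice recognition server 400 and receiving its control command; the dictionary-based rule store is likewise an assumption of this sketch.

```python
from typing import Callable, Dict, FrozenSet, Mapping, Optional, Tuple

RuleKey = Tuple[str, FrozenSet[Tuple[str, str]]]


def determine_command(
    rules: Dict[RuleKey, str],
    intent: str,
    device_states: Mapping[str, str],
    user_voice: bytes,
    ask_server: Callable[[bytes], Optional[str]],
) -> Optional[str]:
    """Use a stored rule if one corresponds; otherwise ask the server and learn."""
    key = (intent, frozenset(device_states.items()))
    command = rules.get(key)
    if command is not None:
        return command  # rule hit: no server round trip is needed
    command = ask_server(user_voice)  # rule miss: forward the user voice
    if command is not None:
        rules[key] = command  # new rule: intent + device state -> command
    return command


# Stub server that records how often it is contacted (illustrative only).
server_calls = []


def stub_server(voice: bytes) -> Optional[str]:
    server_calls.append(voice)
    return "TV 1 Channel 11"


rules: Dict[RuleKey, str] = {}
first = determine_command(rules, "TV Channel MBC", {"TV 1": "on"}, b"...", stub_server)
second = determine_command(rules, "TV Channel MBC", {"TV 1": "on"}, b"...", stub_server)
```

In this sketch the first utterance reaches the server and creates the rule, while the second is answered from the rule store alone.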

Meanwhile, the electronic device 100, 100' may also input the obtained user intention information and the received state information of the external device to the artificial intelligence model 141-2 and determine the control command output from the artificial intelligence model 141-2 as the control command corresponding to the user voice.

In this case, if no control command is output from the artificial intelligence model 141-2, the electronic device 100, 100' may transmit the user voice to the voice recognition server 400. When the control command determined by the voice recognition server 400 is then received, the electronic device 100, 100' may retrain the artificial intelligence model based on the obtained user intention information, the received state information of the external device, and the received control command.
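The retraining loop described above may be illustrated with the following sketch. The disclosure does not specify the model architecture or training procedure, so `LookupModel` is a deliberately trivial stand-in that predicts the most frequent command seen for an input and returns `None` for unseen inputs; it is not the artificial intelligence model 141-2 itself. The function `ask_server` is a hypothetical placeholder for the round trip to the voice recognition server 400.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

# One training sample: (user intention, device state summary, control command).
Sample = Tuple[str, str, str]


class LookupModel:
    """Toy stand-in for a trained model: majority vote per (intent, state)."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[str, str], str] = {}

    def fit(self, samples: List[Sample]) -> None:
        votes: Dict[Tuple[str, str], Counter] = {}
        for intent, state, command in samples:
            votes.setdefault((intent, state), Counter())[command] += 1
        self._table = {k: c.most_common(1)[0][0] for k, c in votes.items()}

    def predict(self, intent: str, state: str) -> Optional[str]:
        return self._table.get((intent, state))  # None models "no output"


def handle_voice(model: LookupModel, samples: List[Sample], intent: str,
                 state: str, user_voice: bytes,
                 ask_server: Callable[[bytes], Optional[str]]) -> Optional[str]:
    command = model.predict(intent, state)
    if command is not None:
        return command
    command = ask_server(user_voice)  # model produced nothing: ask the server
    if command is not None:
        samples.append((intent, state, command))
        model.fit(samples)  # retrain with the server's answer included
    return command
```

A real implementation would likely retrain in batches rather than after every server response; the per-sample retraining here only keeps the sketch short.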

Meanwhile, according to an embodiment of the present disclosure, when the electronic device 100, 100' can specify, based on the state information of the external device, both the control target device and its operation from the user intention information alone, it may determine the control command based on the user intention information without using the control command determination tool 141.

In this case, when state information of a plurality of external devices related to an entity included in the user intention information is received from the device control server 500, the electronic device 100, 100' may determine that the control target device cannot be specified from the user intention information alone, and may determine the control command using the control command determination tool 141.
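As a simplified illustration of this branching, the sketch below chooses between the direct path and the control command determination tool based only on how many related devices the device control server reported. It assumes that the user intention can be split into an entity and an action and that a direct command is formed by joining the device name and the action; both are assumptions of this sketch, and the case where the operation itself is ambiguous (as in the "MBC" example) is not modeled.

```python
from typing import Callable, Mapping, Optional, Tuple


def choose_command(
    entity: str,
    action: str,
    device_states: Mapping[str, str],
    tool: Callable[[str, Mapping[str, str]], Optional[str]],
) -> Tuple[str, Optional[str]]:
    """Return (path, command), where path is "direct" or "tool"."""
    if len(device_states) == 1:
        # Only one related device was reported, so the target is unambiguous.
        (device,) = device_states
        return "direct", f"{device} {action}"
    # Several related devices: the intention alone cannot pick one of them.
    return "tool", tool(f"{entity} {action}", device_states)


def example_tool(intent: str, states: Mapping[str, str]) -> Optional[str]:
    # Hypothetical tool that knows only Rule 2 of Table 3.
    if intent == "Air conditioner power-on":
        return "Air conditioner 2 power-on"
    return None


single = choose_command("TV", "power-on", {"TV 1": "off"}, example_tool)
several = choose_command(
    "Air conditioner", "power-on",
    {"Air conditioner 1": "off", "Air conditioner 2": "off"},
    example_tool,
)
```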

Meanwhile, when the control target device is a device that can be controlled by infrared (IR), the electronic device 100, 100' may also transmit the control command directly to the control target device.

Meanwhile, various embodiments of the present disclosure may be implemented as software including instructions stored in a machine-readable storage medium, that is, a medium readable by a machine (e.g., a computer). Here, the machine is a device capable of calling stored instructions from the storage medium and operating according to the called instructions, and may include the electronic device 100, 100' or the voice recognition server 400 according to the disclosed embodiments.

When an instruction is executed by a processor, the processor may perform the function corresponding to the instruction directly or by using other components under its control. An instruction may include code generated or executed by a compiler or an interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" means only that the storage medium does not include a signal and is tangible; it does not distinguish between data being stored in the storage medium semi-permanently or temporarily.

According to an embodiment, a method according to the various embodiments disclosed herein may be provided as part of a computer program product. A computer program product may be traded between a seller and a buyer as a commodity. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)) or online through an application store (e.g., Play Store™). In the case of online distribution, at least part of the computer program product may be at least temporarily stored in, or temporarily generated in, a storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

Each component (e.g., a module or a program) according to the various embodiments may consist of a single entity or a plurality of entities, and some of the sub-components described above may be omitted or other sub-components may be further included in the various embodiments. Alternatively or additionally, some components (e.g., modules or programs) may be integrated into a single entity that performs, identically or similarly, the functions performed by each of those components before integration. Operations performed by a module, a program, or another component according to the various embodiments may be executed sequentially, in parallel, repetitively, or heuristically; at least some operations may be executed in a different order or omitted; or other operations may be added.

The above description merely illustrates the technical idea of the present disclosure, and a person of ordinary skill in the art to which the present disclosure pertains will be able to make various modifications and variations without departing from its essential characteristics. In addition, the embodiments according to the present disclosure are intended to describe, not to limit, the technical idea of the present disclosure, and the scope of that technical idea is not limited by these embodiments. Accordingly, the scope of protection of the present disclosure should be interpreted according to the claims below, and all technical ideas within a scope equivalent thereto should be interpreted as falling within the scope of rights of the present disclosure.

100: electronic device 110: microphone
120: processor 130: communication unit
140: memory
200-1 to 200-6: external devices 300: access point
400: voice recognition server 500: device control server

Claims

An electronic device comprising:
a microphone;
a communication unit;
a memory storing a control command determination tool based on a control command determined by a voice recognition server that performs voice recognition processing on a user voice received from the electronic device; and
a processor configured to, when a user voice is received through the microphone, obtain user intention information by performing voice recognition processing on the received user voice, receive, through the communication unit, state information of an external device related to the obtained user intention information from a device control server for controlling a plurality of external devices,
determine a control command for controlling a control target device among the plurality of external devices by applying the obtained user intention information and the received state information of the external device to the control command determination tool, and transmit the determined control command to the device control server through the communication unit.

The electronic device of claim 1,
wherein the control command determination tool comprises
a rule DB including at least one rule in which user intention information, state information of an external device, and a control command determined by the voice recognition server are matched with one another, and
wherein the processor is configured to,
when a rule corresponding to the obtained user intention information and the received state information of the external device exists in the rule DB, determine the control command matched to that rule as the control command corresponding to the user voice.

The electronic device of claim 2,
wherein the processor is configured to:
when no rule corresponding to the obtained user intention information and the received state information of the external device exists in the rule DB, transmit the user voice to the voice recognition server through the communication unit; and
when a control command determined by the voice recognition server based on the user voice is received through the communication unit, generate a new rule by matching the received control command with the obtained user intention information and the received state information of the external device, and update the rule DB with the generated new rule.
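
The rule DB behavior recited in the two preceding claims (look up a rule, fall back to the voice recognition server on a miss, then store the returned command as a new rule) can be illustrated with a minimal Python sketch. This is not the implementation of the disclosure: the names `RuleDB`, `make_key`, `determine_command`, and `ask_server`, and the representation of intention and state information as a string and a dictionary, are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical rule key: (intention, canonicalized device state).
RuleKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def make_key(intent: str, state: Dict[str, str]) -> RuleKey:
    # Sort the state items so that equal states always produce equal keys.
    return (intent, tuple(sorted(state.items())))


class RuleDB:
    """Maps (intention, device state) to a previously determined control command."""

    def __init__(self) -> None:
        self._rules: Dict[RuleKey, str] = {}

    def lookup(self, intent: str, state: Dict[str, str]) -> Optional[str]:
        return self._rules.get(make_key(intent, state))

    def update(self, intent: str, state: Dict[str, str], command: str) -> None:
        self._rules[make_key(intent, state)] = command


def determine_command(
    db: RuleDB,
    intent: str,
    state: Dict[str, str],
    ask_server: Callable[[], str],
) -> str:
    # Use the matched rule when one exists.
    command = db.lookup(intent, state)
    if command is None:
        # No rule: defer to the (assumed) voice recognition server callback,
        # then record its answer as a new rule for the next request.
        command = ask_server()
        db.update(intent, state, command)
    return command
```

In this sketch the server is contacted only on the first occurrence of a given intention and state combination; later identical requests are resolved locally from the stored rule.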

The electronic device of claim 1,
wherein the control command determination tool comprises
an artificial intelligence model trained to take user intention information and state information of an external device as input and to output a control command determined by the voice recognition server, and
wherein the processor is configured to
input the obtained user intention information and the received state information of the external device into the trained artificial intelligence model, and determine the control command output from the trained artificial intelligence model as the control command corresponding to the user voice.

The electronic device of claim 4,
wherein the processor is configured to:
when no control command is output from the trained artificial intelligence model to which the obtained user intention information and the received state information of the external device are input, transmit the user voice to the voice recognition server through the communication unit; and
when a control command determined by the voice recognition server based on the user voice is received through the communication unit, retrain the trained artificial intelligence model based on the obtained user intention information, the received state information of the external device, and the received control command.
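
The model-based variant in the two preceding claims can be sketched in the same way. The claims do not specify the model type, so the sketch below substitutes a deliberately simple majority-vote classifier that abstains (returns `None`) when it has no sufficiently confident answer; `MajorityVoteModel`, `min_confidence`, `retrain_with`, and `ask_server` are hypothetical names, and a real system would likely use a different learned model.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Optional, Tuple

Features = Tuple[str, Tuple[Tuple[str, str], ...]]


def features(intent: str, state: Dict[str, str]) -> Features:
    # Canonical, hashable form of (intention, device state).
    return (intent, tuple(sorted(state.items())))


class MajorityVoteModel:
    """Stand-in for the trained model; abstains when it is not confident."""

    def __init__(self, min_confidence: float = 0.6) -> None:
        self.min_confidence = min_confidence
        self._samples: List[Tuple[Features, str]] = []
        self._votes: Dict[Features, Counter] = defaultdict(Counter)

    def fit(self, samples: List[Tuple[Features, str]]) -> None:
        # Rebuild the vote table from scratch from all samples.
        self._samples = list(samples)
        self._votes = defaultdict(Counter)
        for feats, command in self._samples:
            self._votes[feats][command] += 1

    def predict(self, intent: str, state: Dict[str, str]) -> Optional[str]:
        votes = self._votes.get(features(intent, state))
        if not votes:
            return None  # never seen this input: no command is output
        command, count = votes.most_common(1)[0]
        if count / sum(votes.values()) < self.min_confidence:
            return None  # ambiguous history: no command is output
        return command

    def retrain_with(self, intent: str, state: Dict[str, str], command: str) -> None:
        # Retrain on the old samples plus the server-labelled example.
        self.fit(self._samples + [(features(intent, state), command)])


def determine_command(
    model: MajorityVoteModel,
    intent: str,
    state: Dict[str, str],
    ask_server: Callable[[], str],
) -> str:
    command = model.predict(intent, state)
    if command is None:
        # Fall back to the (assumed) voice recognition server and learn from it.
        command = ask_server()
        model.retrain_with(intent, state, command)
    return command
```

The design point carried over from the claims is only the control flow: the server's decision serves as the training label whenever the local model produces no command.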

The electronic device of claim 1,
wherein the processor is configured to,
when, based on the received state information of the external device, the control target device and an operation of the control target device can be specified with the obtained user intention information alone, determine the control command based on the obtained user intention information without using the control command determination tool.

The electronic device of claim 6,
wherein the obtained user intention information includes information on an entity, and
wherein the processor is configured to,
when state information of a plurality of external devices related to the entity is received from the device control server, determine that the control target device cannot be specified with the obtained user intention information alone, and determine the control command using the control command determination tool.
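
The branching in the two preceding claims (bypass the control command determination tool when the target is unambiguous, use it when several devices relate to the same entity) might be sketched as follows. The sketch assumes, beyond what the claims state, that exactly one related device is sufficient to specify the target; the dictionary fields `device_id` and `action` and the `tool` callback are hypothetical.

```python
from typing import Callable, Dict, List

Command = Dict[str, str]


def determine_command(
    intent: Dict[str, str],
    related_states: List[Dict[str, str]],
    tool: Callable[[Dict[str, str], List[Dict[str, str]]], Command],
) -> Command:
    # Assumption: a single related device means the target is fully specified
    # by the intention information, so the tool is skipped.
    if len(related_states) == 1:
        return {
            "device_id": related_states[0]["device_id"],
            "action": intent["action"],
        }
    # Several (or no) related devices: the target is ambiguous from the
    # intention alone, so delegate to the control command determination tool.
    return tool(intent, related_states)
```

For example, an utterance naming the entity "TV" in a home with two TVs would reach the `tool` branch, whereas the same utterance in a home with one TV would be resolved directly.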

The electronic device of claim 1,
wherein the communication unit
comprises an infrared (IR) communication module, and
wherein the processor is configured to,
when the control target device is a device controllable by an IR method, transmit the determined control command to the control target device through the IR communication module.
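
The transport selection in the preceding claim can be illustrated with a short Python sketch. It assumes an either/or choice between the IR communication module and the device control server, which the claim itself does not settle; `Device`, `ir_controllable`, `send_ir`, and `send_to_server` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Device:
    device_id: str
    ir_controllable: bool  # hypothetical flag taken from device metadata


def dispatch(
    command: str,
    target: Device,
    send_ir: Callable[[str, str], None],
    send_to_server: Callable[[str, str], None],
) -> str:
    """Send the command over IR when possible, otherwise via the server."""
    if target.ir_controllable:
        # Direct path: the command goes straight to the device over IR.
        send_ir(target.device_id, command)
        return "ir"
    # Default path: the command goes to the device control server.
    send_to_server(target.device_id, command)
    return "server"
```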

A control method of an electronic device, the method comprising:
storing a control command determination tool based on a control command determined by a voice recognition server that performs voice recognition processing on a user voice received from the electronic device;
when a user voice is received, performing voice recognition processing on the received user voice to obtain user intention information;
receiving state information of an external device related to the obtained user intention information from a device control server for controlling a plurality of external devices;
determining a control command for controlling a control target device among the plurality of external devices by applying the obtained user intention information and the received state information of the external device to the control command determination tool; and
transmitting the determined control command to the device control server.

The control method of claim 9,
wherein the control command determination tool comprises
a rule DB including at least one rule in which user intention information, state information of an external device, and a control command determined by the voice recognition server are matched with one another, and
wherein the determining of the control command comprises,
when a rule corresponding to the obtained user intention information and the received state information of the external device exists in the rule DB, determining the control command matched to that rule as the control command corresponding to the user voice.

The control method of claim 10, further comprising:
transmitting the user voice to the voice recognition server when no rule corresponding to the obtained user intention information and the received state information of the external device exists in the rule DB; and
when a control command determined by the voice recognition server based on the user voice is received, generating a new rule by matching the received control command with the obtained user intention information and the received state information of the external device, and updating the rule DB with the generated new rule.

The control method of claim 9,
wherein the control command determination tool comprises
an artificial intelligence model trained to take user intention information and state information of an external device as input and to output a control command determined by the voice recognition server, and
wherein the determining of the control command comprises
inputting the obtained user intention information and the received state information of the external device into the trained artificial intelligence model, and determining the control command output from the trained artificial intelligence model as the control command corresponding to the user voice.

The control method of claim 12, further comprising:
transmitting the user voice to the voice recognition server when no control command is output from the trained artificial intelligence model to which the obtained user intention information and the received state information of the external device are input; and
when a control command determined by the voice recognition server based on the user voice is received, retraining the trained artificial intelligence model based on the obtained user intention information, the received state information of the external device, and the received control command.

The control method of claim 9, further comprising,
when, based on the received state information of the external device, the control target device and an operation of the control target device can be specified with the obtained user intention information alone, determining the control command based on the obtained user intention information without using the control command determination tool.

The control method of claim 14,
wherein the obtained user intention information includes information on an entity, and
wherein the method further comprises, when state information of a plurality of external devices related to the entity is received from the device control server, determining that the control target device cannot be specified with the obtained user intention information alone, and determining the control command using the control command determination tool.

The method of claim 1,
further comprising, when the control target device is a device controllable by an IR method, transmitting the determined control command to the control target device.

A voice recognition server comprising:
a communication unit;
a memory storing a control command determination tool based on a control command determined by the voice recognition server based on a user voice received from an electronic device; and
a processor configured to, when a user voice is received from the electronic device through the communication unit, perform voice recognition on the received user voice to obtain user intention information, receive, through the communication unit, state information of an external device related to the obtained user intention information from a device control server for controlling a plurality of external devices,
apply the obtained user intention information and the received state information of the external device to the control command determination tool to determine a control command for controlling a control target device among the plurality of external devices, and transmit the determined control command to the device control server through the communication unit.

The voice recognition server of claim 17,
wherein the control command determination tool comprises at least one of:
a rule DB including at least one rule in which user intention information, state information of an external device, and a control command determined by the voice recognition server are matched with one another; and
an artificial intelligence model trained to take user intention information and state information of an external device as input and to output a control command determined by the voice recognition server.