CN110503947B - Dialogue system, vehicle including the same, and dialogue processing method
Dialogue system, vehicle including the same, and dialogue processing method
- Publication number
- CN110503947B (application CN201811496791.6A)
- Authority
- CN
- China
- Prior art keywords
- action
- dialog
- information
- vehicle
- utterance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS; G10—MUSICAL INSTRUMENTS; ACOUSTICS; G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
- G10L2015/225—Feedback of the input speech
Abstract
A dialog system for a vehicle may include: a storage device configured to store context information including at least one of vehicle state information indicating a state of a vehicle and running environment information related to a running environment of the vehicle; an input processor configured to obtain an utterance from a user and to extract an action corresponding to the utterance when recognizing that the utterance includes user state information; a dialog manager configured to obtain parameter values of a condition-determining parameter for determining whether an action corresponding to the utterance is executable from the storage device, to determine an action to be performed based on the parameter values of the condition-determining parameter, and to obtain parameter values of an action parameter for performing the determined action from the storage device; and a result processor configured to generate a response for performing the determined action using the acquired parameter values of the action parameters.
Description
Technical Field
The disclosed invention relates to a dialogue system that provides information or services required by a user by recognizing the intention of the user through a dialogue with the user, a vehicle including the dialogue system, and a dialogue processing method.
Background
Conventional audio-video-navigation (AVN) devices for vehicles can provide visual information to a user and receive user input. However, the relatively small screen and input controls of an AVN device may be inconvenient to use. In particular, dangerous driving situations may arise when a user removes a hand from the steering wheel or takes his or her eyes off the road to operate the AVN device while driving.
Accordingly, when a dialog system is implemented in a vehicle, it should recognize the user's intention through dialog with the user and provide the information or services the user requires in a more convenient and safer manner.
Disclosure of Invention
Technical problem to be solved by the invention
An aspect of the present invention provides a dialogue system that accurately recognizes a user's intention in the vehicle's travel environment based on various information, such as the dialogue with the user, vehicle state information, travel environment information, and user information, and can thereby provide a service conforming to the user's actual intention or the service the user needs most; a vehicle including the dialogue system; and a dialogue processing method.
Another aspect of the present invention provides a vehicle capable of performing vehicle control according to a user's intention in response to an indirect utterance rather than a direct control utterance, and a corresponding dialogue processing method.
Additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Technical scheme for solving technical problems
According to an embodiment of the present invention, a dialogue system for a vehicle may include: a storage device configured to store context information including at least one of vehicle state information indicating a state of a vehicle and running environment information related to a running environment of the vehicle; an input processor configured to obtain an utterance from a user and to extract an action corresponding to the utterance when recognizing that the utterance includes user state information; a dialog manager configured to obtain, from the storage device, parameter values of a condition-determining parameter for determining whether an action corresponding to the utterance is executable, to determine an action to be performed based on the parameter values of the condition-determining parameter, and to obtain parameter values of an action parameter for performing the determined action from the storage device; and a result processor configured to generate a response for performing the determined action using the acquired parameter values of the action parameters.
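As a concrete illustration only, the four claimed components might hand data to one another as in the following minimal sketch; none of this code is from the patent, and all class names, rules, and values are invented:

```python
# Minimal sketch of the claimed four-component pipeline. All names,
# rules, and values are hypothetical illustrations, not the patent's
# actual implementation.

class Storage:
    """Stores context information such as vehicle state values."""
    def __init__(self):
        self.context = {"interior_temperature": 28, "air_conditioner_on": False}

    def get(self, key):
        return self.context.get(key)

class InputProcessor:
    """Extracts an action when the utterance includes user state information."""
    ACTION_RULES = {"hot": "turn_on_air_conditioner"}  # invented rule

    def extract_action(self, utterance):
        for keyword, action in self.ACTION_RULES.items():
            if keyword in utterance:
                return action
        return None

class DialogManager:
    """Checks a condition-determining parameter, then sets action parameters."""
    def decide(self, action, storage):
        # Condition-determining parameter: is the air conditioner already on?
        if action == "turn_on_air_conditioner" and not storage.get("air_conditioner_on"):
            return action, {"target_temperature": 22}  # action parameter value
        return None, {}

class ResultProcessor:
    """Generates the response that performs the determined action."""
    def respond(self, action, params):
        if action is None:
            return "No action is required."
        return f"Turning on the air conditioner at {params['target_temperature']} degrees."

storage = Storage()
action = InputProcessor().extract_action("it is hot in here")
action, params = DialogManager().decide(action, storage)
print(ResultProcessor().respond(action, params))
```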
The storage device may store context information associated with each of the plurality of actions, and the input processor may obtain an information value of the context information associated with the action corresponding to the utterance from the storage device and send the information value to the dialog manager.
When the information value of the context information related to the action corresponding to the utterance is not stored in the storage device, the input processor may request the information value of the context information corresponding to the utterance from the vehicle.
The dialog manager can set the parameter value of the condition-determining parameter or the parameter value of the action parameter equal to the information value of the context information sent from the input processor.
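A minimal sketch of the lookup-with-fallback and parameter-setting behavior described in the preceding paragraphs, assuming a plain dictionary as the context store and a stand-in vehicle query (all names and values invented):

```python
# Hypothetical sketch: read the information value from storage, fall back
# to querying the vehicle when it is absent, then set the parameter value
# equal to the information value.

def query_vehicle(key):
    # Stand-in for a request to the vehicle over its internal network.
    simulated_vehicle_state = {"remaining_fuel_liters": 12.5}
    return simulated_vehicle_state.get(key)

def get_context_value(storage, key):
    value = storage.get(key)
    if value is None:              # not stored: ask the vehicle directly
        value = query_vehicle(key)
        if value is not None:
            storage[key] = value   # cache for later dialogue turns
    return value

storage = {}  # context information store, empty for this key
fuel = get_context_value(storage, "remaining_fuel_liters")
condition_parameter_value = fuel   # parameter set equal to the info value
print(condition_parameter_value)   # 12.5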
The storage device may store relationships between related actions.
The dialog manager can obtain at least one action related to an action corresponding to the utterance from a storage device.
The dialog manager can determine a priority between an action corresponding to the utterance and at least one related action.
The dialog manager may obtain parameter values of the condition-determining parameters for determining whether the related action is executable from the storage device and determine whether at least one related action is executable based on the obtained parameter values of the condition-determining parameters.
The dialog manager may determine, as the action to be performed, the action that is executable and has the highest priority among the action corresponding to the utterance and the at least one related action.
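The selection rule in the two preceding paragraphs can be illustrated with invented data as follows:

```python
# Illustrative-only selection among the utterance's action and its related
# actions: perform the one that is both executable and highest priority.

candidates = [
    # (action name, priority [lower = higher], executable?)
    ("route_guidance_to_gas_station", 1, False),  # e.g., no GPS fix yet
    ("show_nearby_gas_stations", 2, True),
    ("report_remaining_fuel", 3, True),
]

executable = [c for c in candidates if c[2]]
action_to_perform = min(executable, key=lambda c: c[1])[0] if executable else None
print(action_to_perform)  # show_nearby_gas_stations
```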
When the input processor cannot extract an action corresponding to the utterance, the dialog manager may estimate an action corresponding to the utterance based on at least one of vehicle state information and driving environment information.
The storage device may store relationships between related actions, and the dialog manager may obtain at least one action related to the estimated action from the storage device.
The dialog system may also include a communicator configured to communicate with an external server. When the dialog manager cannot obtain the parameter value of the condition determination parameter or the parameter value of the action parameter from the storage device, the dialog manager may request the parameter value from the external server.
The results processor may generate a dialog response for performing the determined action and a command for controlling operation of the vehicle.
The dialog system may further include: a communicator configured to receive information values of context information from at least one of the vehicle and a mobile device connected to the vehicle and to send a response to the vehicle or the mobile device.
The input processor may extract an action corresponding to the utterance based on user information of the user.
Further, according to an embodiment of the present invention, a dialogue processing method for a vehicle may include: storing, in a storage device, context information including at least one of vehicle state information indicating a state of the vehicle and running environment information related to a running environment of the vehicle; obtaining an utterance from a user; extracting an action corresponding to the utterance when the utterance is recognized to include user state information; obtaining, from the storage device, parameter values of a condition-determining parameter for determining whether the action corresponding to the utterance is executable; determining an action to be performed based on the parameter values of the condition-determining parameter; obtaining, from the storage device, parameter values of an action parameter for performing the determined action; and generating a response for performing the determined action using the obtained parameter values of the action parameters.
The dialogue processing method may further include: storing context information associated with each of a plurality of actions in the storage device; obtaining, from the storage device, an information value of context information related to the action corresponding to the utterance; and sending the information value to the dialog manager.
The dialogue processing method may further include: requesting the information value of the context information corresponding to the utterance from the vehicle when the information value of the context information related to the action corresponding to the utterance is not stored in the storage device; and setting the parameter value of the condition-determining parameter or the parameter value of the action parameter equal to the information value of the context information transmitted from the input processor.
The dialogue processing method may further include: storing relationships between related actions in the storage device; retrieving from the storage device at least one action related to the action corresponding to the utterance; and determining a priority between the action corresponding to the utterance and the at least one related action.
The dialogue processing method may further include: acquiring parameter values of condition determining parameters for determining whether the related actions are executable from a storage device; and determining whether at least one related action is executable based on the acquired parameter values of the condition determination parameters.
The dialog processing method may further include determining, as the action to be performed, the action that is executable and has the highest priority among the action corresponding to the utterance and the at least one related action.
The dialogue processing method may further include: when the input processor cannot extract an action corresponding to the utterance, the action corresponding to the utterance is estimated based on at least one of vehicle state information and travel environment information.
The dialogue processing method may further include: a dialog response for performing the determined action and a command for controlling operation of the vehicle are generated.
The dialogue processing method may further include: receiving an information value of context information from at least one of the vehicle and a mobile device connected to the vehicle; and sending the response to the vehicle or the mobile device.
The dialogue processing method may further include: receiving user information; and extracting an action corresponding to the user's utterance based on the user information.
Furthermore, according to an embodiment of the present invention, a vehicle having a dialogue system may include: a storage device configured to store context information including at least one of vehicle state information indicating a state of a vehicle and running environment information related to a running environment of the vehicle; an input processor configured to obtain an utterance from a user and to extract an action corresponding to the utterance when recognizing that the utterance includes user state information; a dialog manager configured to obtain, from the storage device, parameter values of a condition-determining parameter for determining whether an action corresponding to the utterance is executable, to determine an action to be performed based on the parameter values of the condition-determining parameter, and to obtain parameter values of an action parameter for performing the determined action from the storage device; and a result processor configured to generate a response for performing the determined action using the acquired parameter values of the action parameters.
The storage device may store context information associated with each of the plurality of actions, and the input processor may obtain an information value of the context information associated with the action corresponding to the utterance from the storage device and send the information value to the dialog manager.
The dialog manager can set the parameter value of the condition-determining parameter or the parameter value of the action parameter equal to the information value of the context information sent from the input processor.
The storage device may store relationships between related actions, and the dialog manager may obtain at least one action related to the action corresponding to the utterance from the storage device.
The dialog manager can determine a priority between an action corresponding to the utterance and at least one related action.
The dialog manager may obtain parameter values of the condition-determining parameters for determining whether the related action is executable from the storage device and determine whether at least one related action is executable based on the obtained parameter values of the condition-determining parameters.
The input processor may extract an action corresponding to the utterance based on user information of the user.
Advantageous Effects of Invention
According to the dialogue system, the vehicle including the same, and the dialogue processing method of one aspect, the user's intention is accurately recognized in the vehicle's running environment based on various information, such as the dialogue with the user, vehicle state information, running environment information, and user information, so that a service conforming to the user's actual intention, or the service the user needs most, can be provided.
According to the dialogue system, the vehicle including the same, and the dialogue processing method of another aspect, vehicle control according to the user's intention can be performed through indirect utterances rather than the user's direct control utterances.
Drawings
These and/or other aspects of the invention will be apparent from and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a control block diagram of a dialog system according to one embodiment.
Fig. 2 is a view showing a structure of the vehicle interior.
Fig. 3 to 5 are views showing examples of a dialogue that may be exchanged between the dialogue system and the driver.
Fig. 6 and 7 are control block diagrams schematically showing a connection relationship between a dialogue system and components of a vehicle.
Fig. 8 and 9 are control block diagrams schematically showing the connection relationship between the components of the dialogue system and the components of the vehicle.
Fig. 10 is a control block diagram showing a vehicle standalone manner in which the dialogue system is provided in the vehicle.
Fig. 11 and 12 are control block diagrams showing a vehicle gateway manner in which the dialogue system is provided in a remote server and the vehicle serves as a gateway connecting the user to the dialogue system.
Fig. 13 is a control block diagram showing a case where the vehicle can perform part of the input processing and output processing in the vehicle gateway manner.
Fig. 14 is a control block diagram showing a hybrid manner in which both the remote dialogue system server and the vehicle can perform dialogue processing.
Fig. 15 and 16 are control block diagrams showing a mobile gateway manner in which a mobile device connected to the vehicle connects the user to a remote dialogue system server.
Fig. 17 is a control block diagram showing a mobile standalone manner in which the dialogue system is provided in a mobile device.
Fig. 18, 19A, and 19B are control block diagrams showing in detail the configuration of the input processor in the configuration of the dialogue system.
Fig. 20A and 20B are views showing examples of information stored in the context understanding table.
Fig. 21 is a control block diagram showing the configuration of the dialog manager in detail.
Fig. 22 is a view showing an example of information stored in the relational action DB.
Fig. 23 is a view showing an example of information stored in the action execution condition DB.
Fig. 24 is a view showing an example of information stored in the action parameter DB.
Fig. 25 is a table showing an example of information stored in the ambiguity resolution information DB.
Fig. 26A and 26B are tables showing various examples of performing vehicle control as a result of the ambiguity resolver resolving ambiguity by referring to the ambiguity resolution information DB and extracting an action.
Fig. 27 is a control block diagram showing the configuration of the result processor in detail.
Fig. 28 to 40 are views showing specific examples in which the dialogue system processes input, manages a dialogue, and outputs a result when a user inputs an utterance related to route guidance.
Fig. 41 is a flowchart illustrating a method of processing user input in a dialog processing method according to an embodiment.
Fig. 42 is a flowchart illustrating a method of managing a dialog using an output of an input processor in a dialog processing method according to an embodiment.
Fig. 43 is a flowchart showing a result processing method for generating a response corresponding to a result of dialog management in the dialog processing method according to an embodiment.
It should be understood that the above-described drawings are not necessarily to scale, presenting a somewhat simplified representation of various preferred features illustrative of the basic principles of the invention. Specific design features of the present invention, including, for example, specific dimensions, orientations, locations, and shapes will be determined in part by the particular intended application and use environment.
Description of the reference numerals
100: Dialogue system
110: Input processor
120: Dialog manager
130: Result processor
200: Vehicle
210: Voice input device
220: Information input device other than voice
230: Dialogue output device
280: Communication device
Detailed Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. Those skilled in the art will recognize that the described embodiments may be modified in various different ways without departing from the spirit or scope of the present invention. In the following description, like reference numerals refer to like elements throughout the specification.
Well-known functions or constructions are not described in detail since they would obscure one or more exemplary embodiments with unnecessary detail. Terms such as "portion," "module," "component," and "block" may be implemented as hardware or software, and multiple "portions," "modules," "components," and "blocks" may be implemented as a single component, or a single "portion," "module," "component," and "block" may include multiple components, depending on the embodiment.
Throughout the specification, when it is stated that an element is "connected" to another element, it includes not only direct connection but also indirect connection, including connection via a wireless communication network.
Furthermore, when a portion is recited as "comprising" or "including" a component, the portion may include other components, unless a specific description to the contrary exists.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element.
As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
The identification code is used for convenience of description, but is not intended to illustrate the order of each step. Unless the context clearly indicates otherwise, each step may be implemented in a different order than shown.
It should be understood that the term "vehicle" or "vehicular" or other similar terms as used herein includes motor vehicles in general, such as sport utility vehicles (SUVs), buses, trucks, and various commercial vehicles; watercraft, including a variety of boats and ships; aircraft; and the like, and includes hybrid vehicles, electric vehicles, plug-in hybrid electric vehicles, hydrogen-powered vehicles, and other alternative-fuel vehicles (e.g., vehicles using fuels derived from resources other than petroleum). As referred to herein, a hybrid vehicle is a vehicle having two or more power sources, for example a vehicle that is both gasoline-powered and electric-powered.
Additionally, it should be understood that one or more of the following methods or aspects thereof may be performed by at least one controller. The term "controller" may refer to a hardware device that includes a memory and a processor. The memory is configured to store program instructions and the processor is specifically programmed to execute the program instructions to perform one or more processes described further below. As described herein, a control unit may control the operation of a unit, module, component, device, etc. Furthermore, it should be understood that the following methods may be performed by an apparatus comprising a controller in combination with one or more other components, as would be understood by one of ordinary skill in the art.
Furthermore, the controller of the present invention may be embodied as a non-transitory computer readable medium containing executable program instructions executed by a processor. Examples of computer readable media include, but are not limited to, ROM, RAM, compact disc (CD)-ROM, magnetic tape, floppy disks, flash memory drives, smart cards, and optical data storage devices. The computer readable recording medium can also be distributed over a network of computers so that the program instructions are stored and executed in a distributed fashion, such as through a telematics server or a Controller Area Network (CAN).
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings.
A dialog system according to one embodiment is a device that recognizes a user's intention from the user's voice and from inputs other than voice, and provides a service conforming to that intention or required by the user. The dialog system may carry on a dialog with the user by outputting system utterances, which serve as a means of providing the service or of clearly identifying the user's intention.
In this embodiment, the service provided to the user may include any type of operation performed according to the user's needs or intention, including providing information, controlling the vehicle, executing audio/video/navigation functions, and providing content from an external server.
In addition, the dialog system according to one embodiment provides a dialog processing technique specific to the vehicle environment in order to accurately recognize the user's intention in a specific environment, i.e., a vehicle.
The gateway connecting the dialog system to the user may be the vehicle or a mobile device connected to the vehicle. As described below, the dialog system may be provided in the vehicle or in a remote server outside the vehicle, and may send and receive data through communication with the vehicle or with a mobile device connected to the vehicle.
In addition, some components in the dialog system may be provided in the vehicle and still other components may be provided in the remote server so that the vehicle and the remote server may perform a portion of the operation of the dialog system.
FIG. 1 is a control block diagram of a dialog system according to one embodiment.
Referring to fig. 1, a dialog system 100 according to one embodiment includes: an input processor 110 that processes user input including user's voice and input other than user's voice or input including vehicle-related information or user-related information; a dialog manager 120 recognizing the user's intention using the processing result of the input processor 110 and determining an action corresponding to the user's intention or the vehicle state; a result processor 130 for providing a specific service or outputting a system utterance for continuing the dialog according to the output result of the dialog manager 120; and a storage device 140 that stores various information for operations described later.
The input processor 110 may receive two kinds of input: the user's speech and input other than speech. Input other than speech may include the user's gestures, operation of an input device, vehicle state information indicating the state of the vehicle, running environment information related to the running of the vehicle, and user information indicating the state of the user. In addition, beyond this information, any information related to the user and the vehicle may be input to the input processor 110, as long as it can be used to identify the user's intention or to provide a service to the user. The user may include the driver and passengers.
The input processor 110 converts the user's voice into a text-type utterance by recognizing the user's voice and recognizes the user's intent by applying natural language understanding techniques (Natural Language Understanding) to the user utterance.
In addition, the input processor 110 collects information related to a vehicle state or related to a driving environment of the vehicle other than the user's voice and then uses the collected information to understand the context.
The input processor 110 transmits the user intention obtained through the natural language understanding technique and the context-related information to the dialog manager 120.
The dialog manager 120 determines an action corresponding to the user's intention or current context based on the information related to the user's intention and context, etc., transmitted from the input processor 110, and manages parameters required to perform the corresponding action.
In this embodiment, actions represent the various operations performed to provide a specific service, and the kinds of actions may be predetermined. In some cases, providing a service corresponds to performing an action.
For example, actions such as route guidance, vehicle state check, and gas station recommendation may be predefined in the domain/action inference rule DB 141 (refer to fig. 19A), and actions corresponding to the user's utterance, that is, actions desired by the user, may be extracted according to the stored inference rules. In addition, an action related to an event occurring in the vehicle may be predefined and then stored in the relationship action DB 146b (refer to fig. 21).
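A hypothetical, much-simplified analogue of such an inference rule lookup is sketched below; the real domain/action inference rule DB 141 and its rule format are not specified here, and all rules and utterances are invented:

```python
# Hypothetical analogue of the domain/action inference rule DB 141:
# each rule maps utterance evidence to a predefined (domain, action) pair.

INFERENCE_RULES = [
    # (required keywords, domain, action)
    ({"guide", "to"}, "navigation", "route_guidance"),
    ({"check", "vehicle"}, "vehicle_management", "vehicle_state_check"),
    ({"gas", "station"}, "navigation", "gas_station_recommendation"),
]

def infer_action(utterance):
    words = set(utterance.lower().split())
    for keywords, domain, action in INFERENCE_RULES:
        if keywords <= words:  # all rule keywords appear in the utterance
            return domain, action
    return None, None

print(infer_action("find a gas station near me"))
# ('navigation', 'gas_station_recommendation')
```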
The kinds of actions are not limited. Any operation that the dialog system 100 can perform via the vehicle 200 or the mobile device 400 may serve as an action, provided that the action is predefined and that its inference rules or its relationships with other actions/events are stored.
The dialog manager 120 sends information about the determined action to the result processor 130.
The result processor 130 generates and outputs the dialog response and the commands required to perform the transmitted action. The dialog response may be output as text, an image, or audio; when a command is output, services corresponding to the command, such as vehicle control and provision of external content, may be performed.
The storage device 140 stores various information for dialogue processing and service provision. For example, the storage device 140 may pre-store information about domains, actions, speech acts, and entity names used for natural language understanding, a context understanding table for understanding contexts from input information, data detected by sensors provided in the vehicle, user information, and information required for actions. The information stored in the storage device 140 is described in more detail later.
As described above, the dialog system 100 provides dialog processing techniques specific to the vehicle environment. All or some of the components of dialog system 100 may be included in a vehicle. The dialog system 100 may be provided in a remote server and the vehicle may act merely as a gateway between the dialog system 100 and the user. In either case, the dialog system 100 may be connected to the user via the vehicle or a mobile device connected to the vehicle.
Fig. 2 is a view showing a structure of the vehicle interior.
Referring to fig. 2, a display 231 configured to display screens required for vehicle control, including the audio, video, navigation, and call functions, and an input button 221 configured to receive the user's control commands may be provided in the center fascia 203, the center portion of the instrument panel 201 inside the vehicle 200.
In addition, for the user's operating convenience, an input button may also be provided on the steering wheel 207, and a rotary shuttle (jog shuttle) 225 serving as an input button may be provided in the center console area 202 between the driver seat 254a and the passenger seat 254b.
A module including the display 231, the input button 221, and a processor for overall control of various functions may correspond to an audio-video-navigation (AVN) terminal or a head unit.
The display 231 may be implemented by any of various display devices, for example a liquid crystal display (LCD), a light-emitting diode (LED) display, a plasma display panel (PDP), an organic light-emitting diode (OLED) display, or a cathode ray tube (CRT).
As shown in fig. 2, the input button 221 may be provided as hard keys in an area adjacent to the display 231. Alternatively, when the display 231 is implemented as a touch screen, the display 231 may itself perform the function of the input button 221.
The vehicle 200 may receive a user command as voice via the voice input device 210. The voice input device 210 may include a microphone configured to receive sound and then convert the sound into an electrical signal.
As shown in fig. 2, the voice input device 210 may be mounted to the top plate 205 for effective voice input, but the embodiment of the vehicle 200 is not limited thereto, and the voice input device 210 may be mounted to the dashboard 201 or the steering wheel 207. In addition, the voice input device 210 may be mounted to any location as long as the location is suitable for receiving the voice of the user.
Inside the vehicle 200, a speaker 232 may be provided that is configured to output the sound needed to converse with the user or to provide the service the user desires. For example, speakers 232 may be provided inside the driver seat door 253a and the passenger seat door 253b.
The speaker 232 may output voice for guidance of a navigation route, sound or voice contained in audio/video content, voice for providing information or services desired by a user, a system utterance generated as a response to an utterance of the user, and the like.
The dialog system 100 according to one embodiment provides the service best suited to the user's lifestyle by using dialog processing technology suitable for the vehicle environment, and the dialog system 100 may implement new services using technologies such as connected cars, the Internet of Things (IoT), and artificial intelligence (AI).
When a dialogue processing technique suited to the vehicle environment, such as the dialogue system 100 according to one embodiment, is applied, key contexts can be easily recognized and responded to while the driver is actually driving. A service may be provided by applying weights to parameters that affect driving, such as fuel shortage and drowsy driving, and the information required for providing a service can easily be obtained based on the fact that the vehicle is, in most cases, moving toward a destination, on destination information, and the like.
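For instance, the weighting idea might look like the following sketch, where the factors, weights, and service names are all invented:

```python
# Invented illustration of "applying weights to parameters affecting
# driving": each context factor is scored, and the highest-weighted
# factor determines which service the system proactively offers.

context = {"fuel_level_pct": 8, "drowsiness_score": 0.2}

def pick_proactive_service(ctx):
    scores = {
        "recommend_gas_station": (100 - ctx["fuel_level_pct"]) * 0.9,  # fuel weighted heavily
        "suggest_rest_area": ctx["drowsiness_score"] * 100 * 0.7,
    }
    return max(scores, key=scores.get)

print(pick_proactive_service(context))  # recommend_gas_station
```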
In addition, intelligent services that recognize the driver's intention and provide a function can easily be implemented, because real-time information and immediate action take priority while the driver is driving. For example, when the driver searches for a gas station while driving, this may be interpreted as an intention to go to a gas station. When the driver searches for a gas station somewhere other than a vehicle, however, it may be interpreted as another intention, such as looking up its location, telephone number, or prices, rather than an intention to visit it.
Further, although a vehicle is a limited space, various situations may occur in it. For example, a driver may use the dialog system 100 in various situations: driving a vehicle with an unfamiliar interface such as a rental car, using a ride service, managing the vehicle (e.g., car washing), riding with a baby on board, visiting a particular destination, and so on.
In addition, various service and dialogue situations may arise in each stage of vehicle travel and in the stages before and after travel, such as the vehicle inspection stage, the departure preparation stage, the traveling stage, and the parking stage. Specifically, the driver can use the dialogue system 100 in many situations: not knowing how to deal with a vehicle problem, pairing the vehicle with various external devices, checking driving habits such as fuel economy, using safety support functions such as smart cruise control, operating the navigation system, driving while drowsy, driving repeatedly along the same route every day, and checking whether parking is possible at a given place.
Fig. 3 to 5 are views showing examples of a dialogue that may be exchanged between the dialogue system and the driver.
Referring to fig. 3, when the driver inputs an utterance asking how much gasoline currently remains (U1: How much gasoline is left now?), the dialogue system 100 may output an utterance providing information about the amount of remaining gasoline (S1).
The driver may input an utterance requesting route guidance to a nearby gas station (U2: Let me know the nearby gas stations), and the dialog system 100 may output an utterance providing information about the gas stations nearest to the current location (S2: the A-Oil Seong-rim gas station, the B-Oil Jang-dae gas station, and the C-Oil Pacific gas station).
The driver may additionally input an utterance asking about gasoline prices (U3: Where is it cheapest?), and the dialogue system 100 may output an utterance providing information about the gasoline prices (S3).
The driver may input an utterance requesting guidance to the B-Oil Jang-dae gas station (U4), and the dialogue system 100 may output an utterance indicating that guidance to the gas station selected by the driver is starting (S4: Starting the route to the B-Oil Jang-dae gas station).
That is, the driver may be guided to a nearby gas station selling the fuel type of the current vehicle at the lowest price through a dialogue with the dialogue system 100.
Meanwhile, when a gas station is being selected as in the example shown in fig. 3, the dialogue system 100 may omit some questions and provide information directly, thereby reducing the steps and time of the dialogue.
For example, the dialog system 100 may already know that the fuel type of the current vehicle is gasoline and that the driver's criterion for selecting a gas station is price. Information about the vehicle's fuel type may be acquired from the vehicle, and the driver's selection criterion may be input in advance by the driver or learned from the driver's dialogue history or gas-station selection history. This information may be pre-stored in the storage device 140 of the dialog system 100.
In this case, as shown in fig. 4, the dialogue system 100 may actively output an utterance providing fuel price information, specifically the prices for the current vehicle's fuel type (S2+S3=S3'), without the driver inputting an utterance requesting that information, i.e., with U3 omitted.
Since the driver may omit the utterance requesting fuel price information (U3), the response of the dialogue system 100 can be formed so that the utterance guiding to nearby gas stations (S2) and the utterance giving fuel prices (S3) are integrated into a single response, reducing the steps and time of the dialogue.
In addition, the dialogue system 100 may recognize on its own that the driver's intention is to search for a gas station, based on the fact that the driver asked about the amount of fuel remaining.
In this case, as shown in fig. 5, even though the driver does not input an utterance asking about nearby gas stations (U2), i.e., U2 is omitted, the dialogue system 100 may actively output an utterance providing fuel price information (S2+S3=S3'').
In addition, when the gas station nearest to the current location is also the gas station offering the lowest fuel price, the utterance providing the fuel price information (S3'') may include a question asking whether to guide to that gas station. The user can therefore request route guidance to the gas station simply by inputting an utterance agreeing with the dialogue system's question (U4': Yes), without inputting a specific utterance requesting guidance to a particular gas station.
As described above, based on information obtained in advance, the dialog system 100 can recognize the user's real intention, including what the user has not explicitly said, and actively provide information corresponding to that intention. The dialogue steps and time required to provide the service the user desires can thus be reduced.
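The gas-station example can be condensed into a small worked sketch. The prices, distances, and profile fields below are invented; the point is only that pre-stored information lets S2 and S3 merge into one utterance:

```python
# Worked example (invented data) of the merged response S2 + S3 = S3':
# because the fuel type and the driver's selection criterion ("price")
# are already stored, the question-and-answer steps U3/S3 are skipped.

stations = [
    {"name": "A-Oil Seong-rim", "gasoline_price": 1.65, "distance_km": 1.2},
    {"name": "B-Oil Jang-dae", "gasoline_price": 1.49, "distance_km": 2.0},
    {"name": "C-Oil Pacific", "gasoline_price": 1.58, "distance_km": 0.9},
]
stored_profile = {"fuel_type": "gasoline", "station_criterion": "price"}

if stored_profile["station_criterion"] == "price":
    best = min(stations, key=lambda s: s["gasoline_price"])
    # A single merged utterance replaces the separate S2 and S3 turns.
    print(f"{best['name']} gas station is cheapest for "
          f"{stored_profile['fuel_type']}. Shall I guide you there?")
```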
Fig. 6 and 7 are control block diagrams schematically showing a connection relationship between a dialogue system and components of a vehicle.
Referring to fig. 6, a user's voice input to the dialog system 100 may be input via a voice input device 210 provided in the vehicle 200. As shown in fig. 2, the voice input device 210 may include a microphone disposed inside the vehicle 200.
User input other than voice may be entered through the information input device 220 other than voice. The information input device 220 other than voice may include the input buttons 221 and 223 for receiving commands through the user's operation, and the jog shuttle 225.
In addition, the information input device 220 other than voice may include a camera that photographs the user. From the image captured by the camera, a gesture, expression, or gaze direction serving as a means of command input can be recognized. Alternatively, the user's state (such as drowsiness) can be identified from the captured image.
Information related to the vehicle may be input into the dialog system 100 via the vehicle controller 240. The information related to the vehicle may include vehicle state information or surrounding environment information acquired by various sensors provided in the vehicle 200, and information initially stored in the vehicle 200, such as a fuel type of the vehicle.
The dialog system 100 can recognize the user's intention and context using the user's voice input via the voice input device 210, the input other than the user's voice via the information input device 220 other than voice, and various inputs via the vehicle controller 240. The dialog system 100 outputs a response to perform an action corresponding to the user's intention.
The dialog output device 230 is a device configured to provide output to the user in a visual, audible, or tactile manner, and may include the display 231 and the speaker 232 provided in the vehicle 200. The display 231 and the speaker 232 may output responses to the user's utterances, answers to the user's questions, or requested information in a visual or audible manner. In addition, vibration may be output through a vibrator installed in the steering wheel 207.
Further, based on the response output from the dialog system 100, the vehicle controller 240 may control the vehicle 200 to perform an action corresponding to the user's intention or the current situation.
Meanwhile, in addition to information acquired by sensors provided in the vehicle 200, the vehicle 200 may collect, via the communication device 280, information obtained from the external content server 300 or external devices, such as traveling environment information and user information including traffic conditions, weather, temperature, passenger information, and driver personal information, and may then transmit this information to the dialogue system 100.
According to embodiments of the invention, the dialog system 100 may define actions by analyzing user utterances. In addition, the dialog system 100 may determine whether a user's utterance is a direct control utterance for controlling the vehicle or a status utterance. To recognize the utterance type, the dialog system 100 may use the morpheme analysis described elsewhere herein. For example, when the user says "my hands are frozen," there is no direct vehicle control operation corresponding to this utterance, so the dialog system 100 may recognize it as a status utterance and acquire an action corresponding to it.
Meanwhile, according to an embodiment of the present invention, the information input device 220 other than voice may include a bio-signal measuring device. In particular, the bio-signal measurement apparatus may comprise a medical sensor. The medical sensor may be implemented as a sensor that directly acquires biological information of the user, but may also be implemented as an indirect type, such as a pressure sensor provided on a steering wheel.
The dialog system 100 may obtain an action corresponding to the user's utterance based on the user information received from the information input device 220 other than voice. For example, even when the user says "my hands are frozen," if the user's body temperature, according to the received user information, is recognized to be lower than a predetermined temperature, the dialogue system 100 may operate the vehicle's air conditioner but with a weak air amount.
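Both behaviors above, recognizing a status utterance and adjusting an action parameter from a bio-signal, are sketched below with invented thresholds, commands, and action names:

```python
# Sketch only: "my hands are frozen" matches no direct control command,
# so it is treated as a status utterance mapped to an action, and the
# measured body temperature adjusts the action's parameter (air amount).

DIRECT_COMMANDS = {"turn on the heater", "open the window"}

def handle(utterance, body_temperature_c):
    if utterance in DIRECT_COMMANDS:
        return ("direct", utterance, None)
    if "frozen" in utterance:  # status utterance
        # Below a predetermined temperature, run the air conditioner
        # with a weak air amount, as in the example above.
        air_amount = "weak" if body_temperature_c < 36.0 else "normal"
        return ("status", "heat_cabin", air_amount)
    return ("unknown", None, None)

print(handle("my hands are frozen", 35.4))  # ('status', 'heat_cabin', 'weak')
```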
As shown in fig. 7, information acquired by sensors provided in the vehicle 200, such as the remaining fuel amount, rainfall speed, surrounding obstacle information, vehicle speed, engine temperature, tire pressure, and current position, may be input to the dialogue system 100 via the internal signal controller 241.
Running environment information acquired from outside the vehicle via vehicle-to-everything (V2X) communication may be input to the dialog system 100 via the external signal controller 242. V2X means that a traveling vehicle exchanges and shares various useful information, such as traffic conditions, by communicating with road infrastructure and other vehicles.
V2X communication may include vehicle-to-infrastructure (V2I), vehicle-to-vehicle (V2V), and vehicle-to-nomadic-devices (V2N) communication. Thus, through communication performed directly between vehicles or with infrastructure installed along the road, information such as traffic conditions ahead, the approach of another vehicle, or the risk of collision with another vehicle can be transmitted and received, and the driver can be notified of it.
Accordingly, the driving environment information input to the dialog system 100 via the external signal controller 242 may include traffic information about the road ahead, proximity information of nearby vehicles, collision warnings regarding other vehicles, real-time traffic conditions, unexpected situations, and traffic flow control states.
Although not shown in the drawings, a signal obtained via V2X may also be input to the vehicle 200 via the communication device 280.
The vehicle controller 240 may include: a memory in which a program for performing the above-described operations and operations described later is stored; and a processor for executing the stored program. At least one memory and one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on one chip or physically separated.
In addition, the internal signal controller 241 and the external signal controller 242 may be implemented by the same processor or by separate processors.
Fig. 8 and 9 are control block diagrams schematically showing the connection relationship between the components of the dialogue system and the components of the vehicle.
Referring to fig. 8, user voice transmitted from the voice input device 210 may be input to the voice input processor 111 provided in the input processor 110, and input other than user voice transmitted from the information input device 220 other than voice may be input to the context information processor 112 provided in the input processor 110.
The context information processor 112 recognizes a context based on the vehicle state information, the running environment information, the user information, and the like. The dialog system 100 can precisely recognize the user's intention or efficiently find out the service required by the user by recognizing the context.
The response output from the results processor 130 may be input to the dialog output device 230 or the vehicle controller 240 to allow the vehicle 200 to provide the service desired by the user. In addition, a response may be transmitted to the external content server 300 to request a desired service.
The vehicle state information, the running environment information, and the user information transmitted from the vehicle controller 240 may be stored in the storage device 140.
Referring to fig. 9, the storage device 140 may include a long-term memory 143 and a short-term memory 144. The data stored in the storage device 140 may be classified as short-term or long-term according to its importance and persistence and the designer's intention.
The short-term memory 144 may store previously performed dialogs. A previous dialog may be a dialog performed within a reference time before the present. Alternatively, dialogs may be stored continuously until the volume of utterance content between the user and the dialog system 100 reaches a reference value.
For example, when the user says "Let me know about restaurants near Gangnam Station," the dialogue system 100 may search for restaurants near Gangnam Station through the external content server 300 and then provide the user with information about the restaurants found. As an example of providing the information, the dialog system 100 may display a list of restaurants on the display 231, and when the user says "the first one," the dialog content from the restaurant request to the restaurant selection may be stored in the short-term memory 144.
Alternatively, not only the entire dialogue content but also specific information contained in the dialogue content may be stored. For example, the first restaurant of the list of restaurants may be stored as a user-selected restaurant in the short-term memory 144 or the long-term memory 143.
When, after the dialogue about restaurants near Gangnam Station, the user asks the dialogue system 100, "How is the weather?", the dialogue system 100 may assume from the dialogue stored in the short-term memory 144 that the user's place of interest is Gangnam Station, and then output the response "It is raining at Gangnam Station."
Next, when the user says "Recommend a menu for that restaurant," the dialogue system 100 may assume from the dialogue stored in the short-term memory that "that restaurant" means the restaurant near Gangnam Station, and acquire the recommended menu of that restaurant through a service provided by the external content server 300. Accordingly, the dialog system 100 may output the response "Noodles are the best menu item at that restaurant."
The long-term memory 143 may store data according to its persistence. For example, it may determine that the persistence of data such as the telephone numbers of relatives and friends, point of interest (POI) information such as home or company, and the user's preferences for certain parameters can be guaranteed, and store such data in the long-term memory 143. Conversely, when it is determined that the persistence of data cannot be guaranteed, the data may be stored in the short-term memory 144.
For example, the user's current location is temporary data and may thus be stored in the short-term memory 144, whereas the user's preference for restaurants is persistent data that can be used later and may thus be stored in the long-term memory 143.
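A minimal sketch of this persistence-based split follows, with an invented list of keys whose persistence is assumed to be guaranteed:

```python
# Hypothetical routing rule for the storage split described above:
# data whose persistence can be guaranteed goes to long-term memory,
# transient data to short-term memory.

PERSISTENT_KEYS = {"home_address", "favorite_cuisine", "family_phone_numbers"}

long_term, short_term = {}, {}

def store(key, value):
    target = long_term if key in PERSISTENT_KEYS else short_term
    target[key] = value

store("favorite_cuisine", "Chinese")         # persistent preference -> long-term
store("current_location", (37.49, 127.03))   # temporary data -> short-term
print(long_term, short_term)
```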
When the user asks "Are there any restaurants around here?", the dialog system 100 may identify the current location of the user from the short-term memory 144 and recognize from the long-term memory 143 that the user likes Chinese restaurants. Accordingly, the conversation system 100 can use external content to recommend a list of the user's favorite Chinese restaurants around the current location.
In addition, the dialog system 100 may actively provide services and information to the user using data stored in the long-term memory 143 and the short-term memory 144.
For example, information related to the user's residence may be stored in the long-term memory 143. The dialog system 100 may obtain information about the user's residence from the external content server 300 and then provide information such as "A water outage is expected this Friday due to cleaning of the apartment complex".
In addition, information regarding the vehicle battery state may be stored in the short-term memory 144. The dialog system 100 may analyze the vehicle battery state stored in the short-term memory 144 and then provide an indication such as "The battery is in a bad state. It is recommended to have it repaired before winter".
Fig. 10 is a control block diagram showing a stand-alone vehicle manner in which the dialogue system is provided in a vehicle.
As shown in fig. 10, according to the stand-alone vehicle manner, the dialog system 100 having the input processor 110, the dialog manager 120, the result processor 130, and the storage device 140 may be included in the vehicle 200.
When the dialog system 100 is included in the vehicle 200, the vehicle 200 may process the dialog with the user by itself and provide the services required by the user. However, the information required for dialogue processing and service provision may be obtained from the external content server 300.
Vehicle state information or running environment information detected by the vehicle detector 260, such as the remaining fuel amount, the rainfall amount and rain speed, surrounding obstacle information, the tire pressure, the current position, the engine temperature, and the vehicle speed, may be input to the dialogue system 100 via the vehicle controller 240.
In addition, the vehicle controller 240 may control an air conditioner 251, a window 252, a door 253, a seat 254, or an AVN 255 provided in the vehicle 200 according to a response output from the dialogue system 100.
For example, when the dialogue system 100 determines that the user's intention or the service required by the user is to lower the temperature in the vehicle 200 and then generates and outputs a corresponding command, the vehicle controller 240 may lower the temperature in the vehicle 200 by controlling the air conditioner 251.
For another example, when the dialog system 100 determines that the user's intention or the service desired by the user is to raise the driver's seat window 252a and generates and outputs a corresponding command, the vehicle controller 240 may raise the driver's seat window 252a by controlling the window 252.
For another example, when the dialog system 100 determines that the user's intention or the service required by the user is to guide a route to a certain destination and generates and outputs a corresponding command, the vehicle controller 240 may perform route guidance by controlling the AVN 255. The communication device 280 may acquire map data, POI information, and the like from the external content server 300 as needed, and then use the information for service provision.
Fig. 11 and 12 are control block diagrams showing a vehicle gateway manner in which the dialogue system is provided in a remote server and the vehicle serves as a gateway connecting the user to the dialogue system.
As shown in fig. 11, according to the vehicle gateway method, the remote dialogue system server 1 may be provided outside the vehicle 200, and the dialogue system client 270 connected with the remote dialogue system server 1 via the communication device 280 may be provided in the vehicle 200. The communication device 280 serves as a gateway for connecting the vehicle 200 and the remote dialogue system server 1.
The dialog system client 270 may serve as an interface to connect to an input/output device and perform collection and transmission and reception of data.
When the voice input device 210 and the information input device 220 other than voice provided in the vehicle 200 receive the user input and transmit the user input to the dialog system client 270, the dialog system client 270 may transmit the input data to the remote dialog system server 1 via the communication device 280.
The vehicle controller 240 may also transmit data detected by the vehicle detector 260 to the dialog system client 270, and the dialog system client 270 may transmit data detected by the vehicle detector 260 to the remote dialog system server 1 via the communication device 280.
Since the above-described dialogue system 100 is provided in the remote dialogue system server 1, input data processing, dialogue processing based on the result of the input data processing, and result processing based on the result of the dialogue processing can be performed by the remote dialogue system server 1.
In addition, the remote dialogue system server 1 may acquire information or content required for input data processing, dialogue management, or result processing from the external content server 300.
The vehicle 200 may also acquire information or content of a service required by the user from the external content server 300 according to the response transmitted from the remote dialogue system server 1.
Referring to fig. 12, the communication device 280 may include at least one communication module configured to communicate with an external device. For example, the communication device 280 may include a short-range communication module 281, a wired communication module 282, and a wireless communication module 283.
The short-range communication module 281 may include various short-range communication modules configured to transmit and receive signals over a wireless communication network at a short distance, e.g., a Bluetooth module, an infrared communication module, a radio frequency identification (RFID) communication module, a wireless local area network (WLAN) communication module, an NFC communication module, and a ZigBee communication module.
The wired communication module 282 may include various wired communication modules, such as a local area network (LAN) module, a wide area network (WAN) module, or a value added network (VAN) module, and various cable communication modules, such as universal serial bus (USB), high definition multimedia interface (HDMI), digital visual interface (DVI), recommended standard 232 (RS-232), power line communication, or plain old telephone service (POTS).
The wireless communication module 283 may include wireless communication modules supporting various wireless communication methods, e.g., a Wi-Fi module, a wireless broadband module, and modules for global system for mobile communication (GSM), code division multiple access (CDMA), wideband code division multiple access (WCDMA), universal mobile telecommunications system (UMTS), time division multiple access (TDMA), long term evolution (LTE), 4G, and 5G.
In addition, the communication device 280 may also include an internal communication module (not shown) for communication between the electronic devices in the vehicle 200. The internal communication protocol of the vehicle 200 may be a controller area network (CAN), a local interconnect network (LIN), FlexRay, Ethernet, or the like.
The conversation system 100 can transmit data to the external content server 300 or the remote conversation system server 1 and receive data from the external content server 300 or the remote conversation system server 1 via the wireless communication module 283. In addition, the conversation system 100 can perform V2X communication using the wireless communication module 283. In addition, the dialogue system 100 may transmit and receive data to and from a mobile device connected to the vehicle 200 by using the short-range communication module 281 or the wired communication module 282.
Fig. 13 is a control block diagram showing a case where the vehicle can perform a part of the input processing and the output processing in the vehicle gateway system.
As described above, the dialogue system client 270 of the vehicle 200 may only collect, transmit, and receive data. However, when an input processor 271, a result processor 273, and a storage 274 are included in the dialogue system client 270, as shown in fig. 13, the data input from the user or the vehicle may also be processed in the vehicle 200, and the processing related to determining the service required by the user may be performed there. That is, the operations of the input processor 110 and the result processor 130 may be performed not only by the remote dialogue system server 1 but also by the vehicle 200.
In this case, the dialog system client 270 may perform all or some of the operations of the input processor 110 described above. Dialog system client 270 may perform all or some of the operations of results processor 130 described above.
The allocation of tasks between the remote dialog system server 1 and the dialog system client 270 may be determined taking into account the capacity of the data to be processed, the data processing speed, etc.
Fig. 14 is a control block diagram showing a hybrid manner in which both the remote dialogue system server and the vehicle can perform dialogue processing.
As shown in fig. 14, according to the hybrid manner, the input processor 110, the dialogue manager 120, the result processor 130, and the storage device 140 are provided in the remote dialogue system server 1, so the remote dialogue system server 1 can perform dialogue processing; and a terminal dialogue system 290 having an input processor 291, a dialogue manager 292, a result processor 293, and a storage device 294 is provided in the vehicle 200, so the vehicle 200 can also perform dialogue processing.
However, there may be a difference in capacity or performance between the processor and memory provided in the vehicle 200 and the processor or memory provided in the remote dialogue system server 1. Thus, when the terminal dialogue system 290 is able to output a result by processing all input data and managing a dialogue, the terminal dialogue system 290 can perform the entire process. Otherwise, processing may be requested from the remote dialog system server 1.
Before executing the dialogue process, the terminal dialogue system 290 may determine whether the dialogue process can be executed based on the input data type, and the terminal dialogue system 290 may directly execute the process or request the process from the remote dialogue system server 1 based on the result of the determination.
In addition, when an event of a process which cannot be performed by the terminal dialogue system 290 occurs during the execution of the dialogue process by the terminal dialogue system 290, the terminal dialogue system 290 may request the process from the remote dialogue system server 1 while transmitting the result of its own process to the remote dialogue system server 1.
For example, when high-performance computing power or long-term data processing is required, the remote dialogue system server 1 may perform the dialogue processing, and when real-time processing is required, the terminal dialogue system 290 may perform it. For example, when a situation requiring immediate processing occurs, so that the data must be processed before synchronization, the terminal dialogue system 290 may be set to process the data first.
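A minimal sketch of this routing decision, under the assumption that it reduces to three boolean flags (real-time need, local capability, server reachability); the function name and return labels are hypothetical, not the patent's interface.

```python
# Hypothetical routing sketch for the hybrid manner of Fig. 14.
def route_dialog_processing(needs_realtime: bool,
                            local_capable: bool,
                            server_reachable: bool) -> str:
    if needs_realtime and local_capable:
        return "terminal"        # process before synchronization (see above)
    if server_reachable:
        return "remote"          # high-performance / long-term data work
    if local_capable:
        return "terminal"        # offline fallback
    return "notify_user"         # cannot process; inform via output device 230
```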
In addition, the remote dialogue system server 1 can process the dialogue when there is an unregistered speaker in the vehicle and thus user confirmation is required.
Further, when the terminal dialogue system 290 cannot complete the dialogue process itself in a state where it cannot connect with the remote dialogue system server 1 via the communication device 280, the user may be notified that the dialogue process cannot be performed via the dialogue output device 230.
The data stored in the terminal dialogue system 290 and the data stored in the remote dialogue system server 1 may be determined according to criteria such as data type or data capacity. For example, data that poses a privacy risk because it identifies an individual may be stored in the storage 294 of the terminal dialogue system 290. In addition, a large amount of data may be stored in the storage device 140 of the remote dialogue system server 1, and a small amount of data may be stored in the storage 294 of the terminal dialogue system 290. Alternatively, a small amount of data may be stored in both the storage device 140 of the remote dialogue system server 1 and the storage 294 of the terminal dialogue system 290.
Fig. 15 and 16 are control block diagrams showing a mobile gateway manner in which a mobile device connected to the vehicle connects the user to the remote dialogue system server.
As shown in fig. 15, according to the mobile gateway method, the mobile device 400 may receive vehicle state information and travel environment information from the vehicle 200 and transmit user input and the vehicle state information to the remote dialogue system server 1. That is, the mobile device 400 may act as a gateway connecting the user to the remote dialogue system server 1 or connecting the vehicle 200 to the remote dialogue system server 1.
The mobile device 400 may be an electronic device that is portable and capable of transmitting and receiving data to and from an external server and the vehicle, and may include a smart phone, a smart watch, smart glasses, a PDA, or a tablet computer.
The mobile device 400 may include: a voice input device 410 that receives the user's voice; an input device 420 that receives inputs other than the user's voice; an output device 430 that outputs a response in a visual, audible, or tactile manner; a communication device 480 that transmits and receives data to and from the remote dialogue system server 1 and the vehicle 200; and a dialogue system client 470 that collects input data from the vehicle 200 and the user and transmits the data to the remote dialogue system server 1 via the communication device 480.
The voice input device 410 may include a microphone that receives sound, converts the sound into an electrical signal, and outputs the electrical signal.
The input device 420 in addition to the voice input device 410 may include input buttons, a touch screen, or a camera provided in the mobile device 400.
The output device 430 may include a display, speaker, or vibrator provided in the mobile device 400.
The voice input device 410, the input device 420 other than the voice input device 410, and the output device 430 provided in the mobile device 400 may serve as input and output interfaces for the user. In addition, the voice input device 210, the information input device 220 other than voice, and the dialogue output device 230 provided in the vehicle 200 may serve as input and output interfaces for the user.
When the vehicle 200 transmits data and user input detected by the vehicle detector 260 to the mobile device 400, the dialog system client 470 of the mobile device 400 may transmit the data and user input to the remote dialog system server 1.
In addition, the dialog system client 470 may transmit a response or command transmitted from the remote dialog system server 1 to the vehicle 200. When the dialog system client 470 uses the dialog output device 230 provided in the vehicle 200 as an input and output interface for the user, a response to the utterance of the user may be output via the dialog output device 230. When the dialog system client 470 uses the output device 430 provided in the mobile device 400, a response to the user's utterance may be output via the output device 430.
Commands for vehicle control may be transmitted to the vehicle 200, and the vehicle controller 240 may perform control corresponding to the transmitted commands, thereby providing services required by the user.
On the other hand, the dialog system client 470 may collect input data and send the input data to the remote dialog system server 1. Dialog system client 470 may also perform all or some of the functions of input processor 110 and results processor 130 of dialog system 100.
Referring to fig. 16, the communication device 480 of the mobile device 400 may include at least one communication module configured to communicate with an external device. For example, the communication device 480 may include a short-range communication module 481, a wired communication module 482, and a wireless communication module 483.
The short-range communication module 481 may include at least one of various short-range communication modules configured to transmit and receive signals over a wireless communication network at a short distance, e.g., a Bluetooth module, an infrared communication module, a radio frequency identification (RFID) communication module, a wireless local area network (WLAN) communication module, an NFC communication module, and a ZigBee communication module.
The wired communication module 482 may include various wired communication modules, such as a local area network (LAN) module, a wide area network (WAN) module, or a value added network (VAN) module, and at least one of various cable communication modules, such as universal serial bus (USB), high definition multimedia interface (HDMI), digital visual interface (DVI), recommended standard 232 (RS-232), power line communication, or plain old telephone service (POTS).
The wireless communication module 483 may include at least one of various wireless communication modules supporting various wireless communication methods, e.g., a Wi-Fi module, a wireless broadband module, and modules for global system for mobile communication (GSM), code division multiple access (CDMA), wideband code division multiple access (WCDMA), universal mobile telecommunications system (UMTS), time division multiple access (TDMA), long term evolution (LTE), 4G, and 5G.
For example, the mobile device 400 may be connected to the vehicle 200 via the short-range communication module 481 or the wired communication module 482, and to the remote dialogue system server 1 or the external content server 300 via the wireless communication module 483.
Fig. 17 is a control block diagram showing a stand-alone mobile manner in which the dialogue system is provided in a mobile device.
As shown in fig. 17, according to the stand-alone mobile manner, the dialog system 100 may be provided in the mobile device 400.
Thus, the mobile device 400 can process a dialogue with the user by itself and provide the services required by the user without connecting to the remote dialogue system server 1 for dialogue processing. However, the mobile device 400 may acquire information for dialogue processing and service provision from the external content server 300.
Hereinafter, the detailed configuration and operation of each component of the dialog system 100 will be described. In the embodiments described below, for convenience of explanation, it is assumed that the dialogue system 100 is provided in the vehicle 200.
Fig. 18, 19A, and 19B are control block diagrams showing in detail the configuration of the input processor in the configuration of the dialogue system.
Referring to fig. 18, the input processor 110 may include a voice input processor 111 processing voice input and a context information processor 112 processing context information.
User voice input through the voice input means 210 may be transmitted to the voice input processor 111, and input other than user voice input through the information input means 220 other than voice may be transmitted to the context information processor 112.
The vehicle controller 240 may transmit the vehicle state information, the driving environment information, and the user information to the context information processor 112. The driving environment information and the user information may be provided from the external content server 300 or the mobile device 400 connected to the vehicle 200.
Inputs other than speech may be included in the context information. That is, the context information may include vehicle state information, travel environment information, and user information.
The vehicle state information may include information indicating a vehicle state and acquired by a sensor provided in the vehicle 200, and information related to the vehicle and stored in the vehicle, such as a fuel type of the vehicle.
The running environment information may be information acquired by sensors provided in the vehicle 200. The driving environment information may include image information acquired by a front camera, a rear camera, or a stereo camera, obstacle information acquired by sensors (e.g., radar, lidar, or an ultrasonic sensor), and rain detection and rainfall amount information acquired by a rain sensor.
In addition, the running environment information may also include traffic state information, traffic light information, and adjacent vehicle approach or adjacent vehicle collision risk information acquired via V2X.
The user information may include information related to a user's status measured by a camera or a biometric reader provided in the vehicle, information related to the user directly input by the user using an input device provided in the vehicle, information related to the user stored in the external content server 300, information stored in a mobile device 400 connected to the vehicle, and the like.
The voice input processor 111 may include: a speech recognizer 111a that outputs a text-type utterance by recognizing an input user's speech; a natural language understanding section 111b that recognizes an intention of a user contained in an utterance by applying a natural language understanding technique (Natural Language Understanding) to the utterance of the user; and a dialog input manager 111c that transmits the result of understanding the natural language and the context information to the dialog manager 120.
The speech recognizer 111a may include a speech recognition engine, and the speech recognition engine may recognize the voice uttered by the user and generate a recognition result by applying a speech recognition algorithm to the input voice.
At this time, so that the input voice is converted into a form more useful for voice recognition, the voice recognizer 111a may detect the actual voice section included in the voice by detecting its start point and end point from the voice signal. This is called end point detection (EPD).
Also, the speech recognizer 111a may acquire the feature vectors of the input speech from the detected section by applying feature vector extraction techniques such as cepstrum, linear predictive coefficients (LPC), mel-frequency cepstral coefficients (MFCC), or filter bank energies.
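As a rough illustration of EPD followed by feature extraction, the sketch below uses a naive energy threshold for end point detection and librosa for the MFCC step; the file name, frame length, and threshold are assumptions, and production recognizers use far more robust EPD.

```python
import numpy as np
import librosa  # assumed available; any MFCC implementation would do

def endpoint_detect(y: np.ndarray, sr: int, frame_ms: int = 25,
                    energy_ratio: float = 0.1) -> np.ndarray:
    """Crude energy-based EPD: keep the samples between the first and last
    frame whose energy exceeds a fraction of the peak frame energy."""
    frame_len = int(sr * frame_ms / 1000)
    frames = [y[i:i + frame_len] for i in range(0, len(y) - frame_len, frame_len)]
    if not frames:
        return y
    energy = np.array([np.sum(f ** 2) for f in frames])
    active = np.where(energy > energy_ratio * energy.max())[0]
    if active.size == 0:
        return y
    start, end = active[0] * frame_len, (active[-1] + 1) * frame_len
    return y[start:end]

y, sr = librosa.load("utterance.wav", sr=16000)       # hypothetical file
speech = endpoint_detect(y, sr)                       # detected voice section
mfcc = librosa.feature.mfcc(y=speech, sr=sr, n_mfcc=13)  # feature vectors
```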
Also, the voice recognizer 111a may obtain the recognition result by comparing the extracted feature vectors with trained reference patterns. For this purpose, the speech recognizer 111a may use an acoustic model that models and compares the signal features of speech, and a language model that models the linguistic ordering of words, syllables, and the like corresponding to the recognition vocabulary. For this, the storage 140 may store an acoustic model and language model DB.
The acoustic model may be classified into a direct comparison method of setting the recognition object as a feature vector model and comparing the feature vector model with feature vectors of the voice signal, and a statistical method of statistically processing the feature vectors of the recognition object.
The direct comparison method sets units such as words or phonemes as recognition objects in the feature vector model and compares the received speech with the feature vector model to determine the similarity between them. A representative example of the direct comparison method is vector quantization (VQ). Vector quantization maps the feature vectors of the received speech signal to a codebook serving as a reference model, encodes the mapped results as representative values, and compares these representative values with each other.
The statistical model method configures the units of the recognition object as state sequences and uses the relationships between the state sequences. Each state sequence may be configured with a plurality of nodes. Methods using the relationships between state sequences include dynamic time warping (DTW), hidden Markov models (HMM), methods using neural networks, and the like.
DTW is a method of compensating for differences in the time axis by comparison with a reference model, in consideration of the dynamic characteristic of speech that the length of the signal varies with time even when the same person speaks the same utterance. HMM is a recognition method that assumes speech to be a Markov process having state transition probabilities and observation probabilities of nodes (output symbols) in each state; it estimates the state transition probabilities and the node observation probabilities from training data and calculates the probability that the received speech was generated by the estimated model.
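The DTW comparison described above can be written compactly; this is the textbook dynamic-programming form over two frame sequences, shown only as an illustration and not as the patent's implementation.

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic DTW between two feature-vector sequences (rows = frames),
    compensating for time-axis differences between utterances."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])     # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],        # insertion
                                 cost[i, j - 1],        # deletion
                                 cost[i - 1, j - 1])    # match
    return float(cost[n, m])
```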
Meanwhile, by applying the sequential relationships between the units constituting a language to the units acquired through speech recognition, a language model that models the linguistic ordering of words, syllables, and the like can reduce acoustic ambiguity and recognition errors. The language model may include a statistical language model and a model based on finite state automata (FSA). The statistical language model uses chain probabilities of words, as in unigram, bigram, and trigram models.
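A toy bigram model makes the chain-probability idea concrete; the two-sentence corpus and the maximum-likelihood estimates (no smoothing) are purely illustrative.

```python
from collections import Counter

corpus = [["<s>", "guide", "me", "to", "seoul", "station", "</s>"],
          ["<s>", "guide", "me", "home", "</s>"]]
unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    unigrams.update(sent)
    bigrams.update(zip(sent, sent[1:]))

def bigram_prob(prev: str, word: str) -> float:
    """P(word | prev) by maximum likelihood (no smoothing)."""
    return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

def sentence_prob(sent: list) -> float:
    """Chain probability of a word sequence under the bigram model."""
    p = 1.0
    for prev, word in zip(sent, sent[1:]):
        p *= bigram_prob(prev, word)
    return p

print(sentence_prob(["<s>", "guide", "me", "to", "seoul", "station", "</s>"]))
# -> 0.5, because "me" is followed by "to" in only one of two cases
```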
The speech recognizer 111a may perform speech recognition using any of the methods described above. For example, the speech recognizer 111a may use an acoustic model to which an HMM is applied, or an N-best search method that combines an acoustic model with a language model. The N-best search method can improve recognition performance by selecting up to N recognition result candidates using an acoustic model and a language model and then re-evaluating the ranking of these candidates.
The voice recognizer 111a may calculate a confidence value to ensure the reliability of the recognition result. The confidence value is a measure of how reliable the speech recognition result is. For example, the confidence of a recognized phoneme or word may be defined as a relative value of the probability that the corresponding phoneme or word was uttered rather than some other phoneme or word. Accordingly, the confidence value may be expressed as a value between 0 and 1 or between 0 and 100.
When the confidence value is greater than a predetermined threshold, the voice recognizer 111a may output the recognition result to allow the operation corresponding to the recognition result to be performed. When the confidence value is equal to or less than the threshold, the voice recognizer 111a may reject the recognition result.
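Putting the last two paragraphs together, a hypothetical N-best rescoring step might look like the following; the tuple layout, the scores, the 0-to-1 confidence scale, and the 0.6 threshold are all assumptions for illustration.

```python
from typing import Optional

REJECT_THRESHOLD = 0.6  # assumed rejection threshold

def pick_result(nbest: list) -> Optional[str]:
    """nbest: [(hypothesis, acoustic_score, lm_score, confidence), ...]"""
    # Re-rank the N candidates by the combined acoustic + language score.
    ranked = sorted(nbest, key=lambda h: h[1] + h[2], reverse=True)
    hypothesis, _, _, confidence = ranked[0]
    # Reject the result when its confidence does not exceed the threshold.
    return hypothesis if confidence > REJECT_THRESHOLD else None

hyps = [("guide me to seoul station", -120.5, -8.1, 0.86),
        ("guide me to soul station", -121.0, -9.7, 0.41)]
print(pick_result(hyps))  # -> guide me to seoul station
```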
An utterance in a text form, which is a recognition result of the speech recognizer 111a, may be input to the natural language understanding section 111b.
The natural language understanding section 111b can recognize the intention of the user contained in the utterance by applying natural language understanding techniques. Accordingly, the user may input a control command through a natural dialog, and the dialog system 100 may likewise induce the input of control commands and provide the service required by the user via the dialog.
First, the natural language understanding section 111b may perform a morpheme analysis on an utterance in a text form. Morphemes are the smallest units of meaning and represent the smallest semantic elements that cannot be subdivided. Thus, morpheme analysis is the first step in natural language understanding and converts an input string into a morpheme string.
The natural language understanding section 111b may acquire a domain from the utterance based on the morpheme analysis result. The domain is used to identify the topic of the user's utterance, and domains indicating various topics (e.g., route guidance, weather search, traffic search, schedule management, fuel management, and air conditioning control) may be stored as a database.
The natural language understanding section 111b can recognize an entity name from an utterance. The entity names may be proper nouns, such as person names, place names, organization names, time, date, and currency, and the entity name identification may be configured to identify entity names in sentences and determine the type of identified entity names. The natural language understanding section 111b may use the entity name recognition to acquire an important keyword from a sentence and recognize the meaning of the sentence.
The natural language understanding section 111b may analyze a voice behavior contained in the utterance. The voice behavior analysis may be configured to recognize the intent of the user utterance, e.g., whether the user asked a question, whether the user requested, whether the user responded or whether the user simply expressed emotion.
The natural language understanding section 111b extracts the action corresponding to the intention of the user's utterance. The natural language understanding section 111b may recognize the intention of the user's utterance based on information such as the domain, entity name, and voice behavior corresponding to the utterance, and extract the action corresponding to the utterance. An action may be defined by an object and an operator.
In addition, the natural language understanding section 111b may acquire parameters related to the execution of the action. The parameters related to the execution of the action may be valid parameters directly required for the execution of the action or invalid parameters for extracting valid parameters.
For example, when the user's utterance is "Let's go to Seoul Station", the natural language understanding section 111b may acquire "navigation" as the domain corresponding to the utterance and "route guidance" as the action, where the voice behavior corresponds to a "request".
The entity name "Seoul Station" may correspond to [parameter related to the execution of the action: destination], but the specific exit number of the station or GPS coordinates may be required to actually guide the route via the navigation system. In this case, the [parameter: destination: Seoul Station] extracted by the natural language understanding section 111b may be a candidate parameter for searching, among the multiple "Seoul Station" POIs, for the Seoul Station that the user actually desires.
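The extraction result for this example can be pictured as a frame. The rule tables below are stand-ins for the domain/action inference rule DB 141, and every keyword and slot name is an illustrative assumption.

```python
UTTERANCE = "let's go to seoul station"

DOMAIN_RULES = {"go to": "navigation"}          # keyword -> domain
ACTION_RULES = {"navigation": "route guidance"} # domain -> default action
ENTITY_DICT  = {"seoul station": "destination"} # entity name -> slot

def understand(utterance: str) -> dict:
    """Assemble the domain/action/parameter frame described above."""
    domain = next((d for k, d in DOMAIN_RULES.items() if k in utterance), None)
    return {
        "domain": domain,
        "action": ACTION_RULES.get(domain),
        "voice_behavior": "request",            # from voice behavior analysis
        "parameters": {slot: name for name, slot in ENTITY_DICT.items()
                       if name in utterance},   # candidate, not yet a POI
    }

print(understand(UTTERANCE))
# {'domain': 'navigation', 'action': 'route guidance',
#  'voice_behavior': 'request', 'parameters': {'destination': 'seoul station'}}
```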
In addition, the natural language understanding section 111b may acquire a tool configured to express the relationships between words or between sentences, for example, a parse tree.
The morpheme analysis result, domain information, action information, voice behavior information, extracted parameter information, entity name information, and parse tree, which are the processing results of the natural language understanding section 111b, may be transmitted to the dialogue input manager 111c.
The context information processor 112 may include: a context information collector 112a collecting information from the information input device 220 other than voice and the vehicle controller 240; a context information collection manager 112b that manages the collection of context information; and a context understanding section 112c that understands a context based on the result of natural language understanding and the collected context information.
The input processor 110 may include: a memory in which a program for performing the above-described operations and operations described later is stored; and a processor for executing the stored program. At least one memory and one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on one chip or physically separated.
In addition, the voice input processor 111 and the context information processor 112 included in the input processor 110 may be implemented by the same processor or separate processors.
Hereinafter, a method of processing input data by components of the input processor 110 using information stored in the storage device 140 will be described in detail with reference to fig. 19A and 19B.
Referring to fig. 19A, the natural language understanding part 111b may perform domain extraction, entity recognition, voice behavior analysis, and action extraction using the domain/action inference rule DB 141.
In the domain/action inference rule DB 141, domain extraction rules, voice behavior analysis rules, entity name conversion rules, action extraction rules, and the like may be stored.
Other information such as user input other than voice, vehicle state information, driving environment information, and user information may be input to the context information collector 112a and then stored in the context information DB 142, the long-term memory 143, or the short-term memory 144.
For example, raw data detected by the vehicle detector 260 may be classified by sensor type and sensor value and then stored in the context information DB 142.
In the short-term memory 144 and the long-term memory 143, data meaningful to the user may be stored, which may include current user status, user preferences and orientations, or data for determining user preferences and orientations.
As described above, information whose persistence is guaranteed and which is thus usable in the long term may be stored in the long-term memory 143, and may include the user's phone book, schedule, preferences, education, personality, occupation, and family-related information.
Information that is not guaranteed to be persistent or has uncertainty and thus available for a short period of time may be stored in short-term memory 144, and may include current and previous locations, today's schedules, previous conversation content, conversation participants, environments, domains, and driver status, among others. The data may be repeatedly stored in at least two storage devices among the context information DB 142, the short-term memory 144, and the long-term memory 143 according to the data type.
In addition, among the information stored in the short-term memory 144, data whose persistence is determined to be guaranteed may be transmitted to the long-term memory 143.
In addition, information to be stored in the long-term memory 143 may be acquired using information stored in the short-term memory 144 or the context information DB 142. For example, the user's preference may be acquired by analyzing destination information or dialogue content stored for a specific duration, and the acquired user's preference may be stored in the long-term memory 143.
Obtaining the information to be stored in the long-term memory 143 by using the information stored in the short-term memory 144 or the context information DB 142 may be performed in the dialog system 100 or in a separate external system.
In the former case, this may be performed by the memory manager 135 of the result processor 130. In this case, among the data stored in the short-term memory 144 or the context information DB 142, the data used to obtain meaningful or persistent information (e.g., the user's preferences or orientation) may be stored in the long-term memory 143 in log file form.
The memory manager 135 may obtain persistent data by analyzing data stored for more than a certain duration and reenter the data into the long-term memory 143. In the long-term memory 143, the location where the persistent data is stored may be different from the location where the data stored in the log file type is stored.
In addition, the memory manager 135 may determine persistent data among the data stored in the short-term memory 144 and move and store the determined data into the long-term memory 143.
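As a sketch of the promotion step the memory manager 135 performs, the following derives a recurring preference from a destination log and re-enters it into long-term storage; the log format and the threshold of three occurrences are assumptions.

```python
from collections import Counter

# Illustrative short-term destination log accumulated over some duration.
destination_log = ["chinese restaurant", "gas station",
                   "chinese restaurant", "chinese restaurant"]

def mine_preference(log: list, min_count: int = 3):
    """Treat an item as persistent only if it recurs often enough."""
    if not log:
        return None
    item, count = Counter(log).most_common(1)[0]
    return item if count >= min_count else None

preference = mine_preference(destination_log)
if preference is not None:
    long_term_memory = {"restaurant_preference": preference}  # cf. memory 143
```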
According to an embodiment of the present invention, when the user says "My hands are freezing", the voice recognizer may recognize the voice, and the natural language understanding section may determine that the user utterance is not an utterance directly corresponding to vehicle control, but a status utterance.
The dialog input manager 111c may recognize the user's intent and context based on the user utterance and extract actions by comprehensively considering previous dialogs, events, domains, and dialog tasks.
Meanwhile, at the time of the utterance, the set temperature and airflow volume of the air conditioner may be identified, and weather information related to the vehicle may be acquired by using a temperature sensor provided in the vehicle. The context information collector 112a and the context information collection manager 112b provided in the dialog system may acquire these inputs other than the user's voice.
For example, when the user utterance "My hands are freezing" is recognized as a status utterance and control to reduce the airflow volume of the air conditioner is to be performed, it is recognized that the utterance was caused by the airflow volume being set too high or the target temperature being set too low. Accordingly, when a user utterance is recognized as a status utterance, an action corresponding to the utterance is extracted, and control may be performed to reduce the airflow volume and raise the target temperature of the air conditioner.
As shown in fig. 19B, when the information to be stored in the long-term memory 143 is obtained in a separate external system using the information stored in the short-term memory 144 or the context information DB 142, a data management system 800 having a communicator 810, a storage 820, and a controller 830 may be used.
The communicator 810 may receive data stored in the context information DB 142 or the short-term memory 144. All of the stored data may be sent to communicator 810 or data used in retrieving meaningful or persistent information (e.g., user preferences or orientations) may be selected and then sent. The received data may be stored in storage device 820.
The controller 830 may acquire the persistent data by analyzing the stored data and then transmit the acquired data to the dialog system 100 via the communicator 810. The transmitted data may be stored in the long-term memory 143 of the dialog system 100.
In addition, the dialog input manager 111c may acquire context information related to the execution of an action by transmitting the output result of the natural language understanding section 111b to the context understanding section 112 c.
The context understanding part 112c can determine which context information is related to the execution of the action corresponding to the intention of the user's utterance by referring to the context information stored per action in the context understanding table 145.
Fig. 20A and 20B are views showing examples of information stored in the context understanding table.
Referring to the example of fig. 20A, context information and types of context information related to the execution of the actions may be stored in the context understanding table 145 according to each action.
For example, when the action is route guidance, the current location may be required as context information, and the type of the context information may be GPS information. When the action is a vehicle state check, the travel distance may be required as context information, and the type of the context information may be an integer. When the action is a gas station recommendation, the remaining fuel amount and the distance to empty (DTE) may be required as context information, and the type of the context information may be an integer.
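The action side of the context understanding table 145 can be modeled as a simple mapping; the entries mirror the examples above, while the dictionary layout itself is an assumption.

```python
# Sketch of the per-action entries of the context understanding table 145
# (Fig. 20A): each action maps to its required context information and type.
CONTEXT_UNDERSTANDING_TABLE = {
    "route guidance":            {"info": "current location",
                                  "type": "GPS"},
    "vehicle state check":       {"info": "travel distance",
                                  "type": "integer"},
    "gas station recommendation": {"info": ["remaining fuel",
                                            "distance to empty"],
                                   "type": "integer"},
}

def required_context(action: str):
    """What the context understanding part 112c must fetch or request."""
    return CONTEXT_UNDERSTANDING_TABLE.get(action)

print(required_context("route guidance"))
# {'info': 'current location', 'type': 'GPS'}
```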
When context information related to the execution of an action corresponding to the intention of a user utterance is pre-stored in the context information DB 142, the long-term memory 143, or the short-term memory 144, the context understanding part 112c may acquire corresponding information from the context information DB 142, the long-term memory 143, or the short-term memory 144 and transmit the corresponding information to the dialog input manager 111c.
When the context information related to the execution of the action corresponding to the intention of the user utterance is not stored in the context information DB 142, the long-term memory 143, or the short-term memory 144, the context understanding part 112c may request the required information from the context information collection manager 112b. The context information collection manager 112b may then allow the context information collector 112a to collect the required information.
The context information collector 112a may collect data periodically or only when a particular event occurs. In addition, the context information collector 112a can periodically collect data and then additionally collect data when a particular event occurs. Further, the context information collector 112a may collect data when receiving a data collection request from the context information collection manager 112 b.
The context information collector 112a may collect desired information and then store the information in the context information DB 142 or the short-term memory 144. The context information collector 112a can send an acknowledgement signal to the context information collection manager 112b.
The context information collection manager 112b may send an acknowledgement signal to the context understanding part 112c, and the context understanding part 112c may acquire the required information from the context information DB 142, the long-term memory 143 or the short-term memory 144 and then send the required information to the dialog input manager 111c.
In particular, when the action corresponding to the intention of the utterance of the user is route guidance, the context understanding section 112c may search the context understanding table 145 and recognize the context information related to the route guidance as the current location.
When the current location is pre-stored in the short-term memory 144, the context understanding portion 112c can retrieve the current location and send the current location to the dialog input manager 111c.
When the current location is not stored in the short-term memory 144, the context understanding portion 112c may request the current location from the context information collection manager 112b, and the context information collection manager 112b may allow the context information collector 112a to obtain the current location from the vehicle controller 240.
The context information collector 112a can obtain the current location and then store the current location in the short term memory 144. The context information collector 112a can send an acknowledgement signal to the context information collection manager 112b. The context information collection manager 112b can send an acknowledgement signal to the context understanding portion 112c, and the context understanding portion 112c can retrieve the current location information from the short term memory 144 and then send that information to the dialog input manager 111c.
The dialog input manager 111c may send the output of the natural language understanding part 111b and the output of the context understanding part 112c to the dialog manager 120, while preventing duplicate inputs from being input to the dialog manager 120. At this time, the output of the natural language understanding part 111b and the output of the context understanding part 112c may be combined into one output and then transmitted to the dialog manager 120, or may each be transmitted independently.
On the other hand, when the context information collection manager 112b determines that a certain event occurs due to the data collected by the context information collector 112a satisfying a predetermined condition, the context information collection manager 112b may transmit an action trigger signal to the context understanding part 112c.
The context understanding part 112c may search the context understanding table 145 for the context information related to the corresponding event, and when the required context information is not yet stored, the context understanding part 112c may again send a context information request signal to the context information collection manager 112b.
As shown in fig. 20B, context information and the type of the context information related to the event may be stored in the context understanding table 145 according to each event.
For example, when the generated event is an engine temperature warning, the engine temperature in integer form may be stored as context information related to the event. When the generated event is driver drowsiness detection, the driver drowsiness state in integer form may be stored as context information related to the event. When the generated event is insufficient tire pressure, the tire pressure in integer form may be stored as context information related to the event. When the generated event is a fuel warning, the distance to empty (DTE) in integer form may be stored as context information related to the event. When the generated event is a sensor error, the sensor name in text form may be stored as context information related to the event. For another example, when the driver feels cold in the vehicle, context information relating to the interior and exterior of the vehicle may be stored.
The context information collection manager 112b can collect the required context information via the context information collector 112a and send a confirmation signal to the context understanding part 112c. The context understanding part 112c may acquire the required context information from the context information DB 142, the long-term memory 143, or the short-term memory 144, and then transmit the context information to the dialog input manager 111c together with the action information.
The dialog input manager 111c can input the output of the context understanding portion 112c to the dialog manager 120.
Fig. 21 is a control block diagram showing the configuration of the dialog manager in detail, fig. 22 is a view showing an example of information stored in the relational action DB, fig. 23 is a view showing an example of information stored in the action execution condition DB, and fig. 24 is a view showing an example of information stored in the action parameter DB.
Referring to fig. 21, the dialog manager 120 may include: a dialog flow manager 121 that requests the generation, deletion, and update of dialogs or actions; a dialog action manager 122 that generates, deletes, and updates dialogs or actions according to the requests of the dialog flow manager 121; an ambiguity resolver 123 that ultimately clarifies the user's intent by resolving context ambiguity and dialog ambiguity; a parameter manager 124 that manages the parameters required for the execution of actions; an action priority determiner 125 that determines, for a plurality of candidate actions, whether the actions are executable and determines their priorities; and an external information manager 126 that manages the external content list and related information, and manages the parameter information required for external content queries.
The dialog manager 120 may include: a memory in which a program for performing the above-described operations and operations described later is stored; and a processor for executing the stored program. At least one memory and one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on one chip or physically separated.
In addition, each component contained in the dialog manager 120 may be implemented by the same processor or separate processors.
In addition, the dialog manager 120 and the input processor 110 may be implemented by the same processor or separate processors.
The results of the natural language understanding (the output of the natural language understanding part) and the context information (the output of the context understanding part), as the outputs of the dialog input manager 111c, may be input to the dialog flow manager 121. The output of the natural language understanding part 111b may include information related to the content of the user's utterance itself, such as the morpheme analysis result, and information such as the domain and the action. The output of the context understanding part 112c may include the context information and the events determined by the context information collection manager 112b.
The dialog flow manager 121 may search whether a dialog task or an action task corresponding to the input from the dialog input manager 111c exists in the dialog and action state DB 147.
The dialogue and action state DB 147 may be a storage space for managing the dialogue states and action states, and thus it may store the dialogue states and action states related to the currently ongoing dialogue and actions, as well as preliminary actions to be processed. For example, the dialogue and action state DB 147 may store states related to completed dialogues and actions, stopped dialogues and actions, ongoing dialogues and actions, and dialogues and actions to be processed.
In addition, the dialogue and action state DB 147 may store the last output state, together with whether actions are switched or nested, the switched action index, the action change time, and the screen/voice/command state.
For example, when a domain and an action corresponding to the user utterance are extracted and there is a dialog or action corresponding to that domain and action among the recently stored dialogs, the dialog flow manager 121 may determine it to be the dialog task or action task corresponding to the input from the dialog input manager 111c.
When a domain and an action corresponding to the user utterance are not extracted, the dialog flow manager 121 may request the dialog action manager 122 to generate a random task or to refer to the most recently stored task.
When there is no dialog task or action task corresponding to the input of the input processor 110 in the dialog and action state DB 147, the dialog flow manager 121 may request the dialog action manager 122 to generate a new dialog task or action task.
When the dialog flow manager 121 manages the dialog flow, the dialog flow manager 121 may refer to the dialog policy DB 148. The dialogue policy DB 148 may store policies for continuing a dialogue, wherein the policies may represent policies for selecting, starting, suggesting, stopping, and terminating a dialogue.
In addition, the dialogue policy DB 148 may store policies regarding the point in time and method of system output response. The dialogue policy DB 148 may store a policy for generating a response by linking a plurality of services and a policy for deleting a previous action and replacing the action with another action.
For example, when there are multiple candidate actions or multiple actions corresponding to the user's intention or context (action A, action B), two policies may be allowed: a policy that generates a response to the two actions at once, e.g., "Does action B need to be performed after action A is performed?"; and a policy that generates a separate response to one action after generating a response to the other, e.g., "Action A will be performed." → "Do you want to perform action B?".
In addition, the dialogue policy DB 148 may store policies for determining priorities in candidate actions. The priority determination policy will be described later.
The dialogue action manager 122 may designate a memory space to the dialogue and action state DB147 and generate dialogue tasks and action tasks corresponding to the output of the input processor 110.
On the other hand, when domains and actions cannot be extracted from the user's utterance, the dialog action manager 122 may generate a random dialog state. In this case, as described later, the ambiguity resolver 123 may recognize the intention of the user based on the content of the utterance of the user, the environmental condition, the vehicle state, and the user information, and determine an action appropriate for the intention of the user.
When there is a dialog task or an action task corresponding to the output of the input processor 110 in the dialog and action state DB 147, the dialog flow manager 121 may request the dialog action manager 122 to refer to the corresponding dialog task or action task.
The action priority determiner 125 may search the relationship action DB 146b to search an action list related to actions or events contained in the output of the input processor 110, and then the action priority determiner 125 may acquire candidate actions.
According to an embodiment of the present invention, when the driver says "My hands are freezing", the temperature difference between the inside and outside of the vehicle is sufficiently large, and the temperature set through user input other than voice is lower than a predetermined temperature, the dialog flow manager 121 and the dialog action manager 122 may first determine whether the user utterance is a status utterance. That is, they may determine that the user utterance is not a control utterance intended to directly control a component of the vehicle, and extract an action stored in a table for obtaining a solution based on the utterance and the vehicle state.
As shown in fig. 22, the relationship action DB 146b may indicate actions related to each other, a relationship between the actions, actions related to an event, and a relationship between events. For example, route guidance, vehicle status checks, and gas station recommendations may be categorized as relational actions, and the relationships therein may correspond to associations.
Therefore, when route guidance is performed, actions such as vehicle status check and gas station recommendation can be performed together. In this case, "executing together" may include a case where the vehicle state check and the gas station recommendation are executed before or after the route guidance and a case where the vehicle state check and the gas station recommendation are executed during the route guidance (for example, added as a stop point).
A warning light output event may be stored as an event related to the repair shop guidance action, and the relationship between them may correspond to an association.
Thus, when a warning light output event occurs, repair shop guiding actions may be performed according to the type of warning light or whether repair is required.
When the input processor 110 transmits an action corresponding to the user's utterance together with an event determined by the context information collection manager 112b, the actions related to the action corresponding to the user's utterance and the actions related to the event may both become candidate actions.
The extracted candidate action list may be transmitted to the dialogue action manager 122, and the dialogue action manager 122 may update the action state of the dialogue and action state DB 147 by adding the candidate action list.
The action priority determiner 125 searches the action execution condition DB 146c for a condition for executing each candidate action.
As shown in fig. 23, the action execution condition DB 146c may store a condition required to execute an action according to each action, and a parameter for determining whether the corresponding condition is satisfied.
For example, the execution condition for the vehicle state check may be that the destination distance is equal to or greater than 100 km, in which case the parameter for determining the condition corresponds to the destination distance. The execution condition for the gas station recommendation may be that the destination distance is greater than the distance to empty (DTE), in which case the parameters for determining the condition correspond to the destination distance and the DTE.
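An execution-condition entry of this kind can be pictured as a predicate over named parameters. A minimal sketch, assuming the two example conditions above and hypothetical parameter names:

```python
# Hypothetical encoding of the action execution condition DB: each entry names
# the parameters it needs and a predicate over them.
EXECUTION_CONDITION_DB = {
    "vehicle_status_check": {
        "params": ["destination_distance_km"],
        "check": lambda p: p["destination_distance_km"] >= 100,
    },
    "gas_station_recommendation": {
        "params": ["destination_distance_km", "dte_km"],
        "check": lambda p: p["destination_distance_km"] > p["dte_km"],
    },
}

def is_executable(action, params):
    """Return True/False, or None when a required parameter is still missing."""
    entry = EXECUTION_CONDITION_DB[action]
    if any(name not in params for name in entry["params"]):
        return None  # defer the decision until the parameter is acquired
    return entry["check"](params)

print(is_executable("gas_station_recommendation",
                    {"destination_distance_km": 80, "dte_km": 60}))  # True
```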
The action priority determiner 125 transmits the execution conditions of the candidate actions to the dialogue action manager 122, and the dialogue action manager 122 may add the execution conditions for each candidate action and update the action state of the dialogue and action state DB 147.
The action priority determiner 125 may search for the parameters required to determine the action execution conditions (hereinafter referred to as condition determination parameters) in the context information DB 142, the long-term memory 143, the short-term memory 144, or the dialogue and action state DB 147, and determine whether a candidate action can be executed using the retrieved parameters.
When the parameters for determining the action execution conditions are not stored in the context information DB 142, the long-term memory 143, the short-term memory 144, or the dialogue and action state DB 147, the action priority determiner 125 may acquire the required parameters from the external content server 300 via the external information manager 126.
The action priority determiner 125 may use parameters for determining action execution conditions to determine whether a candidate action may be executed. In addition, the action priority determiner 125 may determine the priority of the candidate action based on whether the candidate action can be executed and a priority determination rule stored in the dialog policy DB 148.
A score for each candidate action may be calculated based on the current situation, and candidate actions with higher calculated scores may be given higher priority. For example, whether the action corresponds to the user utterance, a safety score, a convenience score, a processing time point (whether immediate processing is required), a user preference (the user's degree of acceptance when the service is suggested, or a preference predetermined by the user), an administrator score, a score related to the vehicle state, and an action success rate (dialogue success rate) may be used as parameters for calculating the score, as shown in Equation 1 below.
[Equation 1]
priority score = (w1 · user utterance action + w2 · safety score + w3 · convenience score + w4 · processing time point + w5 · user preference + w6 · administrator score + w7 · score related to vehicle state + w8 · action success rate) × action execution possibility (1: possible, 0: impossible) × action completion status (1: complete, 0: incomplete)
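Read as a procedure, Equation 1 forms a weighted sum of the sub-scores and then gates it with the two binary factors. A minimal sketch of that computation, assuming normalized sub-scores in [0, 1] and illustrative weight values:

```python
# A minimal sketch of the priority-score computation in Equation 1. The weight
# values and sub-score names are illustrative assumptions; sub-scores are
# assumed to be normalized to [0, 1].
WEIGHTS = {
    "user_utterance_action": 0.25, "safety": 0.20, "convenience": 0.10,
    "processing_time_point": 0.10, "user_preference": 0.10,
    "administrator": 0.05, "vehicle_state": 0.10, "success_rate": 0.10,
}

def priority_score(sub_scores, executable, completed):
    """Weighted sum of sub-scores, gated by the two binary factors of Equation 1."""
    weighted = sum(w * sub_scores.get(name, 0.0) for name, w in WEIGHTS.items())
    execution_possibility = 1 if executable else 0  # 1: possible, 0: impossible
    completion_status = 1 if completed else 0       # mapping follows the equation text
    return weighted * execution_possibility * completion_status

print(priority_score({"user_utterance_action": 1.0, "safety": 0.8,
                      "success_rate": 0.9}, executable=True, completed=True))
# 0.25*1.0 + 0.20*0.8 + 0.10*0.9 = 0.5
```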
As described above, the action priority determiner 125 may provide the service the user needs most by searching for the actions directly related to the user's utterance and context information, together with the related action list, and by determining the priorities among them.
The action priority determiner 125 may transmit the executability and priority of the candidate actions to the dialog action manager 122, and the dialog action manager 122 may update the action state of the dialogue and action state DB 147 by adding the transmitted information.
The parameter manager 124 may search the action parameter DB 146a for parameters (hereinafter referred to as action parameters) for performing each candidate action.
As shown in fig. 24, the action parameter DB 146a may store, for each action, necessary parameters, selective parameters, initial values of the parameters, and reference positions for acquiring the parameters. When an initial value of a parameter is stored, and there is no corresponding parameter value in the user's utterance or the context information output from the input processor 110, and no parameter value in the context information DB 142, the action may be performed according to the stored initial value, or the user may be asked to confirm whether to perform the action according to the stored initial value.
For example, the necessary parameters for route guidance may include the current location and the destination, and the selective parameters may include the route type. The quick route may be stored as the initial value of the selective parameter. The current location and destination may be acquired by sequentially searching the dialogue and action state DB 147, the context information DB 142, the short-term memory 144, or the long-term memory 143.
The necessary parameters for the vehicle state check may include vehicle state information, and the selective parameters may include the part to be checked (hereinafter referred to as the "check part"). The whole part may be stored as the initial value of the selective parameter. The vehicle state information may be acquired from the context information DB 142.
The selective parameters for the gas station recommendation may include a favorite gas station, and "A gas station" may be stored as the initial value of the selective parameter. The favorite gas station may be retrieved from the long-term memory 143. The selective parameters may also include the fuel type of the vehicle, fuel prices, and the like.
As described above, the parameter manager 124 may acquire the parameter values of the parameters found in the action parameter DB 146a from the corresponding reference positions. The reference positions from which parameter values are available may be at least one of the context information DB 142, the short-term memory 144, the long-term memory 143, the dialogue and action state DB 147, and the external content server 300.
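The lookup described above can be sketched as a per-action record of necessary and selective parameters, initial values, and an ordered list of reference positions to try; everything below is an illustrative assumption, not the actual schema.

```python
# Hypothetical layout of an action parameter DB entry and the lookup order
# described above; the store names stand in for the dialog and action state DB,
# context information DB, and short-term/long-term memory.
ACTION_PARAMETER_DB = {
    "route_guidance": {
        "necessary": ["current_location", "destination"],
        "selective": {"route_type": "quick_route"},  # parameter -> initial value
        "reference": ["dialog_action_state", "context_info",
                      "short_term_memory", "long_term_memory"],
    },
}

def fill_parameters(action, stores):
    """Resolve each parameter from the first reference store that has it,
    falling back to the stored initial value for selective parameters."""
    entry = ACTION_PARAMETER_DB[action]
    values = {}
    for name in entry["necessary"] + list(entry["selective"]):
        for store_name in entry["reference"]:
            store = stores.get(store_name, {})
            if name in store:
                values[name] = store[name]
                break
        else:
            # None for a necessary parameter means it must still be acquired
            values[name] = entry["selective"].get(name)
    return values

stores = {"context_info": {"current_location": "Uiwang station (GPS)"},
          "short_term_memory": {"destination": "head station exit 4"}}
print(fill_parameters("route_guidance", stores))
```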
Parameter manager 124 may obtain parameter values from external content server 300 via external information manager 126. The external information manager 126 may determine where to acquire information from by referring to the external service set DB146 d.
The external service set DB 146d stores information about the external content servers connected to the dialog system 100. For example, the external service set DB 146d may store the external service name, a description of the external service, the type of information provided by the external service, the external service usage method, and the provider of the external service.
The parameter values acquired by the parameter manager 124 may be transmitted to the dialogue action manager 122, and the dialogue action manager 122 may update the dialogue and action state DB 147 by adding the parameter values to the action state of each candidate action.
The parameter manager 124 may acquire parameter values of all candidate actions, or the parameter manager 124 may acquire only parameter values of candidate actions determined to be executable by the action priority determiner 125.
In addition, the parameter manager 124 may selectively use one among different types of parameter values indicating the same information. For example, by using the destination search service of the navigation system, the text-form "head station" indicating the destination may be converted into a POI-form "head station".
When there is no ambiguity in the dialog and context, the required information can be acquired and the dialog and actions can be managed according to the above-described operations of the action priority determiner 125, the parameter manager 124, and the external information manager 126. When there is ambiguity in the dialog and context, it may be difficult to provide a service required by the user using only the operations of the action priority determiner 125, the parameter manager 124, and the external information manager 126.
In this case, the ambiguity resolver 123 can handle the ambiguity in the dialog or in the context. For example, when an anaphoric expression (e.g., that person, yesterday's place, father, mother, grandmother, or daughter-in-law) is included in the dialog, there may be ambiguity about whom or what it refers to. In this case, the ambiguity resolver 123 may resolve the ambiguity by referring to the context information DB 142, the long-term memory 143, or the short-term memory 144, or provide guidance to resolve the ambiguity.
For example, ambiguous words contained in "yesterday's place," "the A market near the house," and "the head station we went to yesterday" may correspond to parameter values of the action parameters or of the condition determination parameters. However, in this case, because the words are ambiguous, an actual action cannot be performed, and an action execution condition cannot be determined, by using the corresponding words as they are.
The ambiguity resolver 123 can resolve the ambiguity of the parameter value by referring to information stored in the context information DB 142, the long-term memory 143, or the short-term memory 144. In addition, the ambiguity resolver 123 may acquire the required information from the external content server 300 by using the external information manager 126, as needed.
For example, the ambiguity resolver 123 may search for where the user visited yesterday by referring to the short-term memory 144, and thereby convert "yesterday's place" into information that can be used as the destination of the route guidance action. In addition, the ambiguity resolver 123 may find the user's home address by referring to the long-term memory 143, and acquire location information for the A market near the user's home address from the external content server 300. Thus, the ambiguity resolver 123 can convert "the A market near the house" into information that can be used as the destination of the route guidance action.
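One way to picture this resolution step is as a mapping from a referring expression to a lookup over the memories, falling back to an external query; the sketch below uses invented store contents and a stubbed external server, so all names are assumptions.

```python
# Schematic resolution of ambiguous destination phrases by consulting the
# short-term memory, long-term memory, and (stubbed) external content server.
SHORT_TERM_MEMORY = {"visited_yesterday": ["head station exit 4"]}
LONG_TERM_MEMORY = {"home_address": "123 Example-ro"}

def query_external_poi(name, near):
    # Stand-in for a request to the external content server via the
    # external information manager.
    return f"POI({name} near {near})"

def resolve_destination(phrase):
    if "yesterday" in phrase:
        return SHORT_TERM_MEMORY["visited_yesterday"][0]
    if "near the house" in phrase:
        return query_external_poi("A market", LONG_TERM_MEMORY["home_address"])
    return None  # still ambiguous: ask the user or consult more sources

print(resolve_destination("the place we went yesterday"))
print(resolve_destination("the A market near the house"))
```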
In addition, when the input processor 110 does not clearly extract an action (object and operator) or when the intention of the user is unclear, the ambiguity resolver 123 may identify the intention of the user by referring to the ambiguity resolution information DB 146e and determine an action corresponding to the identified intention.
Fig. 25 is a table showing an example of information stored in the ambiguity resolution information DB.
The ambiguity resolution information DB 146e may match utterances with the actions corresponding to them, based on vehicle state information and surrounding environment information, and store the utterances together with the actions. An utterance stored in the ambiguity resolution information DB 146e may be one from which no action can be extracted by natural language understanding. Fig. 25 shows the case where the utterance content according to the morpheme analysis result is that the hands are frozen or the hands are cold.
The surrounding environment information may include the outdoor temperature of the vehicle and whether it is raining, and the vehicle state information may include the on/off state of the air conditioner and the heater, the air volume and wind direction of the air conditioner, and the on/off state of the steering wheel heating wire.
Specifically, when it is raining, the outdoor temperature exceeds 20 degrees, and the air conditioner is ON, it can be recognized that the hands are frozen because the air conditioner temperature is set low; therefore, "increase the air conditioner temperature by 3 degrees" may be stored as the corresponding vehicle control action.
When it is raining, the outdoor temperature exceeds 20 degrees, and the air conditioner is OFF, it can be recognized that the user feels cold because of the rain; therefore, "turn on the heater" may be stored as the corresponding vehicle control action.
When it is not raining, the outdoor temperature exceeds 20 degrees, the air conditioner is ON, and the wind direction of the air conditioner is the upper side, it can be recognized that the hands are frozen because the air conditioner wind blows directly on the hands; therefore, "change the wind direction of the air conditioner to the lower side" may be stored as the corresponding vehicle control action.
When it is not raining, the outdoor temperature exceeds 20 degrees, the air conditioner is ON, the wind direction is the lower side, and the air volume is set above the intermediate level, it can be recognized that the user feels cold because the air volume is too strong; therefore, "reduce the air volume of the air conditioner" may be stored as the corresponding vehicle control action.
When it is not raining, the outdoor temperature exceeds 20 degrees, the air conditioner is ON, the wind direction is the lower side, and the air volume is set to weak, "increase the air conditioner temperature by 3 degrees" may be stored as the corresponding vehicle control action.
When the outdoor temperature is lower than 20 degrees and the heater is OFF, it can be recognized that the hands are frozen due to the cold weather; therefore, "turn on the heater" may be stored as the corresponding vehicle control action.
When the outdoor temperature is lower than 20 degrees, the heater is ON, and the steering wheel heating wire is off, it can be recognized that the hands are frozen because hot air does not reach them; therefore, "turn on the steering wheel heating wire" may be stored as the corresponding vehicle control action.
When the outdoor temperature is lower than 20 degrees, the heater and the steering wheel heating wire are ON, and the wind direction of the heater is the lower side, it can be recognized that the hands are frozen because the heater wind does not reach them; therefore, "change the wind direction of the heater to bidirectional" may be stored as the corresponding vehicle control action.
When the outdoor temperature is lower than 20 degrees, the heater and the steering wheel heating wire are ON, the wind direction of the heater is the upper side, and the heater temperature is set below the maximum, "increase the heater temperature" may be stored as the corresponding vehicle control action.
When the outdoor temperature is lower than 20 degrees, the heater and the steering wheel heating wire are ON, the wind direction of the heater is the upper side, the heater temperature is set to the maximum, and the air volume of the heater is not set to the maximum, "increase the air volume of the heater" may be stored as the corresponding vehicle control action.
When the outdoor temperature is lower than 20 degrees, the heater and the steering wheel heating wire are ON, the wind direction of the heater is the upper side, the heater temperature and air volume are set to the maximum, and the seat heating wire is off, "turn on the seat heating wire" may be stored as the corresponding vehicle control action.
When the outdoor temperature is lower than 20 degrees, the heater and the steering wheel heating wire are ON, the wind direction of the heater is the upper side, the heater temperature and air volume are set to the maximum, and the seat heating wire is already on, "notify the user to wait a moment because the heater is now at full capacity" may be stored as the corresponding vehicle control action.
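Taken together, these rows behave like an ordered decision table over surrounding-environment and vehicle-state fields. The sketch below encodes a few of the "hands are frozen" rows above as predicate/action pairs; the field names and layout are hypothetical.

```python
# A few of the 'hands are frozen' rows above, encoded as (predicate, action)
# pairs evaluated in order; field names are illustrative assumptions.
RULES = [
    (lambda s: s["out_temp"] > 20 and s["raining"] and s["ac_on"],
     "increase the air conditioner temperature by 3 degrees"),
    (lambda s: s["out_temp"] > 20 and s["raining"] and not s["ac_on"],
     "turn on the heater"),
    (lambda s: s["out_temp"] < 20 and not s["heater_on"],
     "turn on the heater"),
    (lambda s: s["out_temp"] < 20 and s["heater_on"] and not s["wheel_heat_on"],
     "turn on the steering wheel heating wire"),
]

def resolve(state):
    for predicate, action in RULES:
        if predicate(state):
            return action
    return None  # no matching rule: fall back to a clarification dialog

print(resolve({"out_temp": 5, "raining": False,
               "ac_on": False, "heater_on": True, "wheel_heat_on": False}))
# turn on the steering wheel heating wire
```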
Fig. 26A and 26B are tables showing various examples in which vehicle control is performed as a result of the ambiguity resolver resolving the ambiguity and extracting an action by referring to the ambiguity resolution information DB.
For example, as shown in fig. 26A and 26B, when the utterance content according to the morpheme analysis result is that the hands are frozen or cold, the surrounding environment is summer, and the vehicle state is that the wind direction of the air conditioner is toward the passenger's head (upper side), the air conditioner set temperature is 19 degrees, and the air volume is at a high level, it can be recognized that the hands are frozen because the air conditioner wind is directed at the hands. An air conditioning control action that reduces the air volume while changing the wind direction to the foot side (lower side) may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
In addition, for an utterance with the same content, when the surrounding environment is winter and the vehicle state is that the wind direction of the air conditioner is toward the passenger's feet, the set temperature is 25 degrees, and the air volume is at a high level, it can be recognized that the hands are frozen because hot air does not reach them. The "turn on the steering wheel heating wire" action may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
When the utterance content according to the morpheme analysis result is "dysphoria," the vehicle speed is 30 km/h or less, and the front-rear gap is less than 30 cm, it can be recognized that the dysphoria is caused by heavy traffic. Accordingly, "change the route option in the route guidance action (quick route guidance)," "play multimedia content, such as music," or "open the chat function" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
When the utterance content according to the morpheme analysis result is "drowsiness" and the vehicle state is the inside air mode, it can be recognized that the drowsiness is caused by lack of ventilation. Accordingly, "change to the outside air mode" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the vehicle state is the outside air mode and the heater is ON, it can be recognized that the drowsiness is caused by the hot air emitted from the heater. "Open the window" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
When the utterance content according to the morpheme analysis result is "sweating" or "hot," the surrounding environment is winter, and the heater is ON, it can be recognized that the heat is caused by the hot air emitted from the heater. Accordingly, "lower the heater temperature" or "reduce the air volume" may be stored as the action corresponding to the utterance.
In addition, for an utterance with the same content, when the surrounding environment is winter and the heater is OFF, it can be recognized that the heat is caused by the user's body heat. Accordingly, "open the window" or "suggest opening the window" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the surrounding environment is summer and the air conditioner is OFF, it can be recognized that the heat is caused by a rise in the interior temperature of the vehicle. Accordingly, "turn on the air conditioner" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the surrounding environment is summer and the air conditioner is ON, it can be recognized that the heat is caused by the air conditioner temperature being set high. Accordingly, "lower the air conditioner temperature" or "increase the air volume of the air conditioner" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
When the utterance content according to the morpheme analysis result is "cold," the surrounding environment is summer, and the air conditioner is ON, it can be recognized that the coldness is caused by the air conditioner temperature being set too low or the air conditioner wind being too strong. Accordingly, "increase the air conditioner temperature" or "reduce the air volume" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the surrounding environment is summer and the air conditioner is OFF, it can be recognized that the coldness is caused by the user's physical condition. "Operate the heater" or "check the user's biorhythm" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the surrounding environment is winter and the heater is ON, it can be recognized that the coldness is caused by the heater temperature being set low or the air volume being weak. Accordingly, "increase the heater temperature" or "increase the air volume" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the surrounding environment is winter and the heater is OFF, it can be recognized that the coldness is caused by the heater not operating. "Operate the heater" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
When the utterance content according to the morpheme analysis result is "headache," the surrounding environment is winter, and the heater is ON, it can be recognized that the headache is caused by lack of ventilation. Accordingly, "change to the outside air mode" or "open the window" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the surrounding environment is winter and the heater is OFF, it can be recognized that the headache is caused by the cold. "Operate the heater" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the surrounding environment is summer and the air conditioner is OFF, it can be recognized that the headache is caused by the heat. "Operate the air conditioner" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the surrounding environment is summer and the air conditioner is ON, it can be recognized that the headache is caused by air conditioning sickness. "Change the wind direction or air volume of the air conditioner" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
When the utterance content according to the morpheme analysis result is "uncomfortable," the surrounding environment is winter, and it is raining, it can be recognized that the discomfort is caused by high humidity. Accordingly, "operate the defogging function" or "operate the dehumidifying function" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the surrounding environment is summer and it is not raining, it can be recognized that the discomfort is caused by seasonal characteristics and heat. Accordingly, "operate the air conditioner at the lowest temperature" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
In addition, for an utterance with the same content, when the surrounding environment is summer and it is raining, it can be recognized that the discomfort is caused by heat and high humidity. Accordingly, "operate the air conditioner in the dehumidification mode" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled accordingly.
Through the operation of the ambiguity resolver 123 described above, even when there is ambiguity in the user's utterance or situation, the ambiguity resolver 123 can accurately recognize the action the user actually wants or needs by considering the surrounding environment information and the vehicle state information together with the user's utterance as a whole, and provide that action.
Information about the action determined by the ambiguity resolver 123 may be transmitted to the dialogue action manager 122, and the dialogue action manager 122 may update the dialogue and action state DB 147 based on the transmitted information.
As described above, the action priority determiner 125 and the parameter manager 124 may determine the execution conditions for the action determined by the ambiguity resolver 123, determine its priority, and acquire the parameter values.
When all parameter values for performing each action that can be acquired from the current context and dialog have been acquired, the dialog action manager 122 may send a signal to the dialog flow manager 121.
On the other hand, when a parameter value necessary for action execution or condition determination does not exist in the dialogue and action state DB 147, the external content server 300, the long-term memory 143, the short-term memory 144, or the context information DB 142, and can only be acquired from the user, the result processor 130 may generate a dialogue response that asks the user for the parameter value.
The dialog flow manager 121 may send information about the first-priority action and the dialog state to the result processor 130. The dialog flow manager 121 may also transmit information about a plurality of candidate actions according to the dialog policy.
Fig. 27 is a control block diagram showing the configuration of the result processor in detail.
Referring to fig. 27, the result processor 130 includes: a response generation manager 131 that manages the generation of responses required to perform the actions input from the dialog manager 120; a dialog response generator 132 that generates a text, image, or audio type response according to the request of the response generation manager 131; a command generator 136 that generates a command for vehicle control or for using an external content providing service according to the request of the response generation manager 131; a service editor 134 that executes a plurality of services sequentially or simultaneously and collects their result values to provide the service desired by the user; an output manager 133 that outputs the generated text, image, or audio type response, outputs the command generated by the command generator 136, and determines the output order when there are plural outputs; and a memory manager 135 that manages the long-term memory 143 and the short-term memory 144 based on the outputs of the response generation manager 131 and the output manager 133.
The result processor 130 may include: a memory in which a program for performing the above-described operations and the operations described later is stored; and a processor for executing the stored program. At least one memory and at least one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on a single chip or physically separated.
In addition, each of the components included in the result processor 130 may be implemented by the same processor or separate processors.
In addition, the result processor 130, the dialog manager 120, and the input processor 110 may be implemented by the same processor or separate processors.
Responses output in reply to the user's utterance or context may include dialog responses, vehicle control, and external content provision. A dialog response may include an initial dialog, a query, an answer including information, and the like. Dialog responses may be stored as a database in the response templates 149.
The response generation manager 131 may request the dialog response generator 132 and the command generator 136 to generate responses required to perform the actions determined by the dialog manager 120. To this end, the response generation manager 131 may transmit information about the action to be performed, which may include an action name, a parameter value, and the like, to the dialog response generator 132 and the command generator 136. When generating the response, the dialog response generator 132 and the command generator 136 may refer to the current dialog state and action state.
The dialog response generator 132 may obtain a dialog response template by searching the response templates 149 and generate a dialog response by populating the extracted template with parameter values. The generated dialog response may be sent to the response generation manager 131. When the parameter values required to generate the dialog response are not transmitted from the dialog manager 120, or when an instruction to use external content is transmitted, the dialog response generator 132 may receive the parameter values from the external content server 300 or search the long-term memory 143, the short-term memory 144, or the context information DB 142.
For example, when the action determined by the dialog manager 120 corresponds to route guidance, the dialog response generator 132 may search the response templates 149 and extract the dialog response template "It takes [duration: -] from [current location: -] to [destination: -]. Do you want to start guidance?".
Among the parameters to be filled in the dialog response template, the parameter values of [current location] and [destination] may be transmitted from the dialog manager 120, while the parameter value of [duration] may not. In this case, the dialog response generator 132 may request the duration from [current location] to [destination] from the external content server 300.
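Filling such a template can be pictured as a substitution pass that fetches any slot not supplied by the dialog manager from the external content server. A minimal sketch with hypothetical helper names:

```python
import re

TEMPLATE = ("It takes [duration] from [current location] to [destination]. "
            "Do you want to start guidance?")

def fetch_from_external_server(slot):
    # Stand-in for a query to the external content server; returns a canned
    # value here purely for illustration.
    return "30 minutes" if slot == "duration" else "?"

def fill_template(template, values):
    def replace(match):
        slot = match.group(1)
        if slot not in values:                    # e.g. [duration] was not sent
            values[slot] = fetch_from_external_server(slot)
        return values[slot]
    return re.sub(r"\[([^\]]+)\]", replace, template)

print(fill_template(TEMPLATE, {"current location": "Uiwang station",
                               "destination": "head station exit 4"}))
# It takes 30 minutes from Uiwang station to head station exit 4. ...
```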
When the response to the user utterance or context includes vehicle control or external content provision, the command generator 136 may generate commands for performing the vehicle control or external content provision. For example, when the action determined by the dialog manager 120 is control of an air conditioner, a window, a seat, AVN, and the like, the command generator 136 may generate a command to perform the control and then transmit the command to the response generation manager 131.
In addition, when the action determined by the dialog manager 120 requires external content provision, the command generator 136 may generate a command for receiving the corresponding content from the external content server 300 and then transmit the command to the response generation manager 131.
When a plurality of commands are generated by the command generator 136, the service editor 134 may determine a method and order of executing the plurality of commands and send the method and order to the response generation manager 131.
The response generation manager 131 may transmit the response transmitted from the dialog response generator 132, the command generator 136, or the service editor 134 to the output manager 133.
The output manager 133 may determine the dialog response generated by the dialog response generator 132 and the output timing, output order, and output position of the command generated by the command generator 136.
The output manager 133 may output the response by transmitting the dialog response generated by the dialog response generator 132 and the command generated by the command generator 136 to the appropriate output location, in the appropriate order, and at the appropriate timing. The output manager 133 may output text-to-speech (TTS) responses via the speaker 232 and text responses via the display 231. When outputting a TTS type dialog response, the output manager 133 may use a TTS module provided in the vehicle 200, or alternatively, the output manager 133 may include its own TTS module.
Depending on the control target, a command may be transmitted to the vehicle controller 240 or the communication device 280 for communicating with the external content server 300.
The response generation manager 131 may also transmit the response transmitted from the dialog response generator 132, the command generator 136, or the service editor 134 to the memory manager 135.
In addition, the output manager 133 may transmit a response of its own output to the memory manager 135.
The memory manager 135 may manage the long-term memory 143 or the short-term memory 144 based on the contents transmitted from the response generation manager 131 and the output manager 133. For example, the memory manager 135 may update the short-term memory 144 by storing dialog content between the user and the system based on the generated and output dialog responses. The memory manager 135 may update the long-term memory 143 by storing information related to the user acquired through a dialogue with the user.
In addition, among the information stored in the short-term memory 144, meaningful persistent information (e.g., preference or orientation of the user) or information for acquiring the meaningful persistent information may be stored in the long-term memory 143.
In addition, based on the vehicle control and the external content request corresponding to the generated and output command, the user preference or the vehicle control history stored in the long-term memory 143 may be updated.
According to the above-described embodiment, the dialogue system 100 can provide the service most suitable for the user by considering the various situations occurring inside the vehicle. In particular, the dialogue system 100 may automatically determine a service the user requires based on context information or driver information collected by itself, and proactively provide that service without the user inputting an utterance.
For example, the evaluation criteria for the vehicle state may vary depending on the situation at the time the vehicle is started, and thus feedback may be provided proactively. The travel start time may be defined as the vehicle start time, the time point at which the electronic parking brake (EPB) is released, or the time point at which the navigation destination is set. The vehicle condition evaluation system that calculates the travel availability score may assign weights to the respective devices and change the variable weights applied to each device according to situation factors. When it is determined that there is a problem with the vehicle state, a solution may be provided for each device, such as repair shop guidance.
In addition, by considering the destination at the time the vehicle is started, it can be determined whether the vehicle lacks fuel. When fuel is lacking, as feedback, the user's favorite gas station may be automatically added as a stop point on the route to the destination, and the user may be notified of the change. The gas station added as the automatic stop point may also be changed according to the user's response.
In addition, although the current vehicle state does not indicate a fuel shortage, a gas station or a refueling time may be actively provided by comprehensively considering the user's next schedule, a main movement record, and the remaining fuel amount.
In addition, by acquiring information related to the physical condition and sleep record of the driver, the vehicle can be conditionally allowed to start based on the acquired information. For example, when the risk of drowsy driving is identified by identifying a physical condition and a sleep record from outside the vehicle, it may be recommended that the user not drive the vehicle. Alternatively, information about the recommended driving time may be provided according to a physical condition or sleep record.
In addition, when a trigger indicating the risk of drowsy driving occurs repeatedly, the risk of drowsy driving may be detected, and depending on the degree of risk, a warning may be output or feedback such as automatically changing the route (for example, rerouting to a rest area) may be provided. The trigger indicating the risk of drowsy driving may be obtained by passive measurement of the driver state and the vehicle state (for example, when the heart rate decreases, when the front-rear gap is at or above a reference distance, or when the vehicle speed is at or below a reference speed) or by active measurement through dialogue (for example, asking the driver a question and measuring the driver's response speed).
In addition, when the user inputs an utterance indicating emotion, the dialog system 100 may not extract a certain domain or action from the user's utterance. However, the dialog system 100 may recognize the user's intention by using the surrounding environment information, the vehicle state information, and the user state information, and then continue the dialog. As described above, this embodiment may be performed by the ambiguity resolver 123 resolving the ambiguity of the user utterance.
Hereinafter, an example of a specific dialog process using the dialog system 100 according to one embodiment will be described in detail.
Fig. 28 to 40 are views showing specific examples in which the dialog system 100 processes input, manages dialogs, and outputs a result when a user inputs an utterance related to route guidance.
As shown in fig. 28, when the user inputs the utterance "Let's go to the head station we went to yesterday," the speech recognizer 111a may output the user's speech as a text-form utterance (Let's go to the head station we went to yesterday).
The natural language understanding section 111b may perform morpheme analysis by referring to the domain/action inference rule DB 141 and, from the morpheme analysis result (yesterday/NNG, went/VV, head station/NNP, go/VV), output [domain: navigation], [action: route guidance], [speech act: request], and [parameter: NLU: destination: head station], and then input them to the dialog input manager 111c.
Referring to fig. 29, when the natural language understanding result of the natural language understanding section 111b is transmitted to the context understanding section 112c, the dialog input manager 111c may request the context understanding section 112c to transmit additional information, if any exists.
The context understanding section 112c may search the context understanding table 145 and extract the fact that the context information related to [domain: navigation] and [action: route guidance] is the current location, and that the type of that context information is a GPS value.
The context understanding part 112c may acquire a GPS value of the current location by searching the context information DB 142. When the GPS value of the current location is not stored in the context information DB 142, the context understanding part 112c may request the GPS value of the current location from the context information collection manager 112 b.
The context information collection manager 112b can send a signal to the context information collector 112a to cause the context information collector 112a to collect GPS values for the current location. The context information collector 112a may collect GPS values of the current location from the vehicle controller 240 and then store the GPS values of the current location in the context information DB 142 while transmitting a GPS value collection confirm signal to the context information collection manager 112b. When the context information collection manager 112b transmits the GPS value collection confirm signal to the context understanding part 112c, the context understanding part 112c may acquire the GPS value of the current location from the context information DB 142 and then transmit the GPS value of the current location to the dialogue input manager 111 c.
The dialog input manager 111c may combine the natural language understanding results [domain: navigation], [action: route guidance], [speech act: request], and [parameter: NLU: destination: head station] with [context information: current position: Uiwang station (GPS value)], and then send the combined information to the dialog manager 120.
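The combined record handed to the dialog manager can be pictured as a single structured object. A sketch whose field names follow the example above but whose layout is only an assumption:

```python
# Hypothetical shape of the combined result the dialog input manager sends to
# the dialog manager 120; the GPS value is left as a placeholder.
nlu_and_context = {
    "domain": "navigation",
    "action": "route_guidance",
    "speech_act": "request",
    "parameters": {"NLU": {"destination": "head station"}},
    "context_information": {
        "current_position": {"name": "Uiwang station", "gps": "(GPS value)"},
    },
}
# dialog_manager.receive(nlu_and_context)  # forwarded to the dialog manager 120
```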
Referring to fig. 30, the dialog flow manager 121 may search the dialog and action state DB 147 and determine whether there is a dialog task or action task currently in progress. At this time, the dialog flow manager 121 may refer to the dialog policy DB 148. According to this embodiment, it is assumed that there is no dialog task or action task currently in progress.
The dialog flow manager 121 may request the dialog action manager 122 to generate an action task and a dialog task corresponding to the output of the input processor 110. The generation of the action tasks and the dialog tasks may represent specifying a storage space for storing and managing information related to the action states and the dialog states.
Accordingly, the dialogue action manager 122 may designate a storage space in the dialogue and action state DB 147 to store information about the action state and the dialogue state.
The dialog action manager 122 may send the action state and dialog state to the action priority determiner 125.
The action priority determiner 125 may search the relationship action DB 146b for the vehicle status check and the gas station recommendation related to route guidance. The route guidance action and these relationship actions may become the candidate actions.
The action priority determiner 125 may determine the priority of each candidate action according to pre-stored rules. The priority may be determined before the execution condition of each candidate action is determined, or alternatively, only the priority regarding the candidate action satisfying the execution condition may be determined after the execution condition of the candidate action is determined.
The candidate action list may be sent again to the dialog action manager 122, and the dialog action manager 122 may update the action state by adding the searched relationship actions.
Referring to fig. 31, the action priority determiner 125 may search the action execution condition DB 146c for an execution condition for each candidate action and parameters required to determine the execution condition. In addition, the action priority determiner 125 may also determine priorities between candidate actions.
For example, the condition for vehicle state check may be a case where the destination distance is equal to or greater than 100km, wherein the parameter for determining the condition may be the destination distance.
The execution condition for the gas station recommendation may be that the destination distance is greater than the distance to empty (DTE), and the parameters for determining the execution condition may be the destination distance and the DTE.
The dialogue action manager 122 may update the action state by adding a condition for performing each candidate action and parameters required to determine the condition to the dialogue and action state DB 147.
The action priority determiner 125 may search the dialogue and action state DB 147, the context information DB 142, the long-term memory 143, or the short-term memory 144 for parameter values required to determine whether the candidate action satisfies the execution condition, and acquire the parameter values from the dialogue and action state DB 147, the context information DB 142, the long-term memory 143, or the short-term memory 144.
The action priority determiner 125 may acquire the parameter values from the dialog and action state DB 147 when the parameter values are included in the previous dialog content, in context information related to the dialog content, or in context information related to the generated event.
When the action priority determiner 125 fails to acquire the parameter values from the dialogue and action state DB 147, the context information DB 142, the long-term memory 143, or the short-term memory 144, the action priority determiner 125 may request the parameter values from the external information manager 126.
For example, the destination distance may be acquired from the external content server 300 providing the navigation service via the external information manager 126, and the DTE may be acquired from the context information DB 142. Meanwhile, in order to search for the destination distance, correct destination information for the navigation service may be required. In this embodiment, the destination extracted from the user's utterance corresponds to "head station," which may refer to various places whose names begin with "head station," as well as the specific "head station" the user has in mind. Thus, it may be difficult to search for the correct destination distance using "head station" alone.
In addition, the parameter values may be acquired from the mobile device 400 connected to the vehicle 200, as needed. For example, when user information (e.g., contacts and calendars) that is not stored in the long-term memory 143 is required as parameter values, the external information manager 126 may request the mobile device 400 for the required information and then acquire the required parameter values.
In addition, when the parameter values cannot be acquired through the storage device 140, the external content server 300, and the mobile device 400, the user can be queried to acquire the required parameter values.
The action priority determiner 125 may determine the execution conditions of the candidate actions by using the parameter values. Since the destination distance has not yet been obtained, determination of the execution conditions for the vehicle state check action and the gas station recommendation may be deferred.
As shown in fig. 32, the dialogue action manager 122 may update the action state by adding the acquired parameter values and whether the action execution condition is satisfied, which is determined by using the corresponding parameter values, to the dialogue and action state DB 147.
The dialog action manager 122 may request a list of parameters for performing each candidate action from the parameter manager 124.
The parameter manager 124 may acquire the current location and destination from the action parameter DB 146a as necessary parameters for performing the route guidance action, and extract a route type (initial value: quick route) as a selective parameter.
The parameter manager 124 may acquire a check part (initial value: whole part) as a selective parameter for performing a vehicle state check action, and extract a favorite gas station (initial value: a-gas station) as a selective parameter for performing a gas station recommendation action.
The extracted parameter list may be sent to the dialog action manager 122 and used to update the action state.
The parameter manager 124 may search for corresponding parameter values in the reference locations of each parameter in the dialog and action state DB 147, the context information DB 142, the long-term memory 143, and the short-term memory 144 to obtain parameter values of the necessary parameters and selective parameters corresponding to the respective candidate actions. When it is required to provide parameter values via an external service, the parameter manager 124 may request the required parameter values from the external content server 300 via the external information manager 126.
The parameters for determining the execution conditions of the candidate actions and the parameters for executing the candidate actions may overlap. Among the parameter values acquired by the action priority determiner 125 and then stored in the dialogue and action status DB 147, when there are parameters corresponding to parameters (necessary parameters and selective parameters) for performing candidate actions, the corresponding parameters may be used.
Referring to fig. 33, the dialog action manager 122 may update the action state by adding the parameter values acquired by the parameter manager 124.
As described above, when the destination (head station) extracted from the user's utterance is used as a parameter of the route guidance action, there may be ambiguity. Therefore, the parameter of the route guidance action (destination), the parameter of the vehicle state check action (destination distance), and the parameter of the gas station recommendation (destination distance) may not have been acquired yet.
When [parameter: NLU: destination: head station] is converted into a destination parameter suitable for the route guidance action, the ambiguity resolver 123 may check whether there is ambiguity. As described above, "head station" may refer to different kinds of places whose names begin with "head station," as well as the specific "head station" the user has in mind.
The ambiguity resolver 123 can confirm, by referring to the morpheme analysis result, that "head station" is modified in the user utterance as "the head station we went to yesterday." The ambiguity resolver 123 may search the long-term memory 143 or the short-term memory 144 for schedules, movement records, and contacts to identify the location of "the head station we went to yesterday."
For example, the ambiguity resolver 123 may confirm from the user's movement record of yesterday that "the head station we went to yesterday" is "head station exit 4." After confirming the existence of the POI "head station exit 4," the ambiguity resolver 123 may acquire the corresponding value.
Destination information obtained by the ambiguity resolver 123 may be sent to the dialogue action manager 122, and the dialogue action manager 122 may update the action state by adding "head station exit 4" to the destination parameters of the candidate actions.
The parameter manager 124 may acquire destination information (head station outlet 4) from the dialogue and action status DB 147 and request a destination distance value from the external content server 300 providing the navigation service via the external information manager 126.
Referring to fig. 34, when the external information manager 126 acquires a destination distance value (80 km) from the external content server 300 and then transmits the destination distance value to the parameter manager 124, the parameter manager 124 may transmit the destination distance value to the dialogue action manager 122 to allow updating of the action state.
The action priority determiner 125 may determine whether each candidate action is executable by referring to the action state, and adjust the priorities of the candidate actions. Since the parameter values of the current position and the destination, the necessary parameters, have been acquired, it can be determined that the route guidance action is executable. Since the destination distance (80 km) is less than 100 km, it can be determined that the vehicle state check action is not executable. Since the destination distance (80 km) is greater than the DTE, it can be determined that the gas station recommendation action is executable.
Since the vehicle state check action is not executable, it is excluded from the priority determination. Thus, the route guidance action may be ranked first and the gas station recommendation action second.
The dialog action manager 122 may update the action state based on whether the candidate action is executable and the modified priority.
The dialog flow manager 121 may check the dialog states and the action states stored in the dialog and action state DB 147 and may develop a dialog policy by referring to the dialog policy DB 148 to continue the dialog. For example, the dialog flow manager 121 may select the highest priority action among the executable actions, and the dialog flow manager 121 may request the response generation manager 131 to generate a response for conducting a dialog according to the dialog policy DB 148.
The dialogue state and the action state stored in the dialogue and action state DB 147 may be updated to [ state: confirm route guidance start ].
Referring to fig. 35, the response generation manager 131 may request generation of a response of the dialog response generator 132 in response to a request of the dialog flow manager 121.
The dialog response generator 132 may generate TTS and text responses by searching the response templates 149. For example, the dialog response generator 132 may generate a dialog response configured to output, in TTS and text form, "It is expected to take 30 minutes from Uiwang station to head station exit 4. Do you want to start guidance?".
The response generation manager 131 may transmit the TTS response and the text response generated by the dialog response generator 132 to the output manager 133 and the memory manager 135, and the output manager 133 may transmit the TTS response to the speaker 232 and the text response to the display 231. At this point, the output manager 133 may send the TTS response to the speaker 232 after passing it through a TTS module that synthesizes the text into speech.
The memory manager 135 may store the fact that the user requested route guidance in the short-term memory 144 or the long-term memory 143.
The dialog response asking "It is expected to take 30 minutes from Uiwang station to head station exit 4. Do you want to start guidance?" may be output via the display 231 and the speaker 232. As shown in fig. 36, when the user says "yes," the user's utterance may be input to the speech recognizer 111a and then output as [text: yes], and the natural language understanding section 111b may output [domain: -], [action: -], [speech act: -], and [morpheme analysis result: yes/IC].
The natural language understanding result is transmitted to the dialog input manager 111c, and the dialog input manager 111c transmits the natural language understanding result to the dialog manager 120.
Referring to fig. 37, the dialog flow manager 121 may search the dialog and action state DB 147 and analyze the previous dialog state. The dialog flow manager 121 may request the dialog action manager 122 to update dialog/actions related to the currently performed route guidance.
The dialog action manager 122 may update dialog states and action states to [ state: route guidance start ].
The dialog flow manager 121 may request the result processor 130 to generate a response for starting route guidance.
Referring to fig. 38, the dialog action manager 122 may update the dialog state to [state: proceed to next dialog] and update the action state to [state: execute].
The dialog flow manager 121 may request the response generation manager 131 to generate a response to the route guidance.
The dialog response generator 132 may generate a dialog response configured to output, in TTS and text form, "Starting route guidance", and then transmit the dialog response to the response generation manager 131.
The command generator 136 may generate a command for performing route guidance, [ target: navigation, command: route guidance, destination: Seoul Station Exit 4, start: Uiwang Station ], and then send the command to the response generation manager 131.
The response generation manager 131 may send the generated dialog response and command to the output manager 133. The output manager 133 may output the dialog response via the display 231 and the speaker 232. The output manager 133 may transmit the route guidance command to the AVN 230 of the vehicle 200 via the vehicle controller 240 or to the external content server 300 providing the navigation service.
Referring to fig. 39, the dialog flow manager 121 may select the gas station recommendation as the next executable action and request the response generation manager 131 to generate a response configured to ask the user whether to recommend a gas station.
The dialog state and action state may be updated to [ state: check related service recommendations ].
The response generation manager 131 may request the dialog response generator 132 to generate a TTS response and a text response, and the dialog response generator 132 may generate a dialog response configured to output, in TTS and text form, "There is not enough fuel to reach the destination. Do you want to add A gas station as a stopover?". The dialog response generator 132 may send the TTS response and the text response to the response generation manager 131.
The response generation manager 131 may transmit the TTS response and the text response generated by the dialog response generator 132 to the output manager 133 and the memory manager 135, and the output manager 133 may transmit the TTS response to the speaker 232 and the text response to the display 231.
A dialog response configured to ask "There is not enough fuel to reach the destination. Do you want to add A gas station as a stopover?" may be output via the display 231 and the speaker 232. As shown in fig. 40, when the user says "no", the user's utterance may be input to the speech recognizer 111a and then output as [ text: no ], and the natural language understanding section 111b may output [ domain: - ], [ action: - ], [ speech act: - ], and [ morpheme analysis result: no/IC ].
The dialog flow manager 121 may request the dialog action manager 122 to update the dialog state and the action state.
The dialog action manager 122 may update the dialog state to [ state: next dialog ] and update the action state to [ state: cancel ].
The dialog flow manager 121 may request the response generation manager 131 to generate a response indicating that the gas station recommendation service has been canceled, and may check whether there is a dialog to be continued. When no dialog remains to be continued, the dialog flow manager 121 may update the dialog state to [ state: idle ] and await the user's input.
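The idle fallback described above amounts to a small state transition table. The following toy sketch illustrates it; the state names follow the bracketed states in this description, while the transition entries and function are assumptions made for illustration:

```python
# A toy sketch of the idle fallback; the transition table is an assumption.
TRANSITIONS = {
    ("check related service recommendations", "yes"): "next dialog",
    ("check related service recommendations", "no"): "next dialog",
}

def advance(dialog_state, user_reply, has_pending_dialog):
    dialog_state = TRANSITIONS.get((dialog_state, user_reply), dialog_state)
    # When no dialog remains to be continued, fall back to [ state: idle ]
    # and wait for the next user input.
    return dialog_state if has_pending_dialog else "idle"

print(advance("check related service recommendations", "no", has_pending_dialog=False))
# -> 'idle'
```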
The flow of data processing described above is merely one example applied to the dialog system 100. Accordingly, the order in which each component of the dialog system 100 processes data is not limited to the above example; a plurality of components may process data simultaneously, or in an order different from the one described above.
Hereinafter, a dialog processing method according to an embodiment will be described. The dialog processing method may be applied to the above-described dialog system 100 or to the vehicle 200 provided with the dialog system 100. Accordingly, the description of figs. 1 to 40 applies equally to the dialog processing method.
Fig. 41 is a flowchart illustrating a method of processing user input in a dialog processing method according to an embodiment. The method of processing user input may be performed in the input processor 110 of the dialog system 100.
Referring to fig. 41, when an utterance of a user is input (yes in 500), the speech recognizer 111a may recognize the input utterance of the user (510). The user's utterance may be input to a voice input device 210 provided in the vehicle 200 or a voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of a user and output the utterance in text form.
The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form (520) and output a result of the natural language understanding.
Specifically, the natural language understanding process (520) may include performing morpheme analysis on the utterance in text form (521), extracting a domain from the utterance based on the morpheme analysis result (522), recognizing an entity name (523), analyzing a speech act (524), determining the type of the utterance based on the analyzed speech act (525), and extracting an action (526).
The extraction of the domain, the identification of the entity name, and the extraction of the action may be performed by referring to the domain/action inference rule DB 141.
The output of the natural language understanding section 111b, that is, the result of natural language understanding, may include the domain corresponding to the user's utterance, the action, the speech act, the morpheme analysis result, and the like.
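The steps 521 to 526 can be pictured as a small pipeline. The sketch below is a deliberately simplified stand-in, assuming a toy tokenizer, an invented rule table in place of the domain/action inference rule DB 141, and a toy entity lexicon:

```python
# Illustrative sketch of the NLU steps (521-526); all rules are invented.
from dataclasses import dataclass, field

@dataclass
class NLUResult:
    text: str
    morphemes: list = field(default_factory=list)
    domain: str = ""
    entities: dict = field(default_factory=dict)
    speech_act: str = ""
    action: str = ""

# Stands in for the domain/action inference rule DB 141; entries are invented.
DOMAIN_ACTION_RULES = {
    "guide": ("navigation", "route_guidance"),
    "fuel": ("vehicle", "gas_station_recommendation"),
}
KNOWN_ENTITIES = {"seoul station": "destination"}  # toy entity lexicon

def understand(text: str) -> NLUResult:
    result = NLUResult(text=text)
    result.morphemes = text.lower().split()              # 521: toy morpheme analysis
    for token in result.morphemes:                       # 522 + 526: domain and action
        if token in DOMAIN_ACTION_RULES:
            result.domain, result.action = DOMAIN_ACTION_RULES[token]
    for name, role in KNOWN_ENTITIES.items():            # 523: entity name recognition
        if name in text.lower():
            result.entities[role] = name
    result.speech_act = "question" if text.endswith("?") else "request"  # 524
    return result

print(understand("Guide me to Seoul Station"))
```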
Context information related to the extracted action is searched for (530). The context information related to the extracted action may be stored in the context understanding table 145. The context understanding part 112c searches the context understanding table 145 for the context information related to the extracted action and acquires the information values of the retrieved context information from the context information DB 142, the long-term memory 143, or the short-term memory 144.
When additional context information is required (yes in 540), i.e., there is context information that cannot be acquired from the context information DB 142, the long-term memory 143, or the short-term memory 144, the context understanding part 112c may request collection of the corresponding context information (550). Inputs other than speech, such as vehicle state information, ambient information, and driver information, may be input via the context information collector 112a separately from the input of the user's utterance.
The information may be entered periodically or only upon the occurrence of a particular event. In addition, information may be periodically input and then additionally input upon the occurrence of a specific event. In any case, when information collection is requested, the corresponding information may be actively collected.
Accordingly, when the context information related to the action has been collected, the corresponding information may be acquired from the context information DB 142, the long-term memory 143, or the short-term memory 144, otherwise, the corresponding information may be collected via the context information collector 112 a.
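A hedged sketch of this lookup-then-collect fallback follows; the store interfaces and the on-demand sensor read are hypothetical, not the patent's implementation:

```python
# Sketch of the flow above: try the stored context sources first, then fall
# back to live collection (step 550). All names are hypothetical.
class ContextUnderstanding:
    def __init__(self, context_db, long_term, short_term, collector):
        # Context info DB 142, long-term memory 143, short-term memory 144.
        self.stores = [context_db, long_term, short_term]
        self.collector = collector

    def get_context_value(self, key):
        for store in self.stores:
            if key in store:
                return store[key]
        # Not stored yet: request collection and read the freshly stored value.
        self.collector.collect(key)
        return self.stores[0].get(key)

class Collector:
    def __init__(self, context_db):
        self.context_db = context_db
    def collect(self, key):
        # Stand-in for reading a vehicle sensor or external source on demand.
        self.context_db[key] = {"fuel_level": 12.5, "dte_km": 40.0}.get(key)

context_db = {}
cu = ContextUnderstanding(context_db, {}, {"destination": "Seoul Station"},
                          Collector(context_db))
print(cu.get_context_value("destination"))  # found in short-term memory
print(cu.get_context_value("dte_km"))       # triggers on-demand collection
```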
When the context information collector 112a, which receives a request for collecting context information, collects corresponding context information and stores the information in the context information DB 142, the context understanding part 112c may acquire the corresponding context information from the context information DB 142.
On the other hand, when the context information collection manager 112b determines that a certain event occurs due to the data collected by the context information collector 112a satisfying a predetermined condition, the context information collection manager 112b may transmit an action trigger signal to the context understanding part 112c.
The context understanding part 112c searches the context understanding table 145 for the context information related to the corresponding event, and when the retrieved context information has not yet been stored, the context understanding part 112c may again send a context information request signal to the context information collection manager 112b.
When collection of the desired context information is completed, the results of the natural language understanding and the context information may be sent to the dialog manager 120 (560). When an event occurs, information about the event (which event occurred) and context information about the occurred event may also be transmitted.
Fig. 42 is a flowchart illustrating a method of managing a dialog using an output of an input processor in a dialog processing method according to an embodiment. The dialog processing method may be performed by the dialog manager 120 of the dialog system 100.
Referring to fig. 42, the dialog flow manager 121 searches the dialog and action status DB 147 for the relevant dialog history (600).
In this embodiment, the case of extracting a domain and an action from the user's utterance is described as an example, but there may be cases in which a domain and an action cannot be extracted from the utterance because of ambiguity in the utterance content or the context. In such a case, the dialog action manager 122 may generate an arbitrary dialog state, and the ambiguity resolver 123 may identify the user's intention based on the content of the user's utterance, the environmental conditions, the vehicle state, the user information, and the like, and determine an action appropriate to that intention.
When a relevant dialog history exists (yes in 600), the relevant dialog history may be referenced (690). When there is no relevant dialog history (no in 600), new dialog tasks and action tasks may be generated (610).
A list of actions related to the action extracted from the user's utterance (hereinafter referred to as the input action) may be searched for in the relational action DB 146b, and a candidate action list may be generated (620). The input action and the actions related to it constitute the candidate action list.
The execution condition for each candidate action may be searched for in the action execution condition DB 146c (630). The execution condition represents a requirement that must be satisfied for the action to be executed: an action is determined to be executable when the corresponding condition is satisfied and not executable when it is not. The action execution condition DB 146c also stores information about the type of parameter used to determine each action execution condition.
Parameter values for determining the action execution conditions may be acquired (640). The parameters used to determine the action execution conditions may be referred to as condition determination parameters. The parameter values of the condition determination parameters may be acquired by searching the context information DB 142, the long-term memory 143, the short-term memory 144, or the dialog and action state DB 147. When the parameter values of the condition determination parameters need to be provided via an external service, the required parameter values may be provided from the external content server 300 via the external information manager 126.
In addition, when a desired parameter value cannot be obtained because of ambiguity in the context or the utterance, the desired parameter value may be obtained by resolving the ambiguity using the ambiguity resolver 123.
In addition, even when an acquired parameter is an ineffective parameter from which it is difficult to determine the action execution condition, the ambiguity resolver 123 may obtain an effective parameter from the ineffective parameter.
Based on the acquired condition determination parameters, it is determined whether each candidate action is executable (650), and the priority of the candidate actions may be determined (660). Rules for determining the priority of the candidate actions may be stored in advance. The action priority determiner 125 may first determine whether each candidate action is executable and then determine the priority by considering only the executable candidate actions. Alternatively, the priority of the candidate actions may be determined regardless of whether each candidate action is executable, and then modified based on whether each candidate action is executable.
The action parameter DB 146a may be searched for a list of parameters for performing the candidate action (670). The parameters for performing the candidate action may correspond to action parameters. The action parameters may include necessary parameters and optional parameters.
Parameter values for performing candidate actions are obtained 680. The parameter values of the action parameters may be acquired by searching the context information DB 142, the long-term memory 143, the short-term memory 144, or the dialogue and action state DB 147. When it is required to provide the parameter values of the action parameters via the external service, the required parameter values may be provided from the external content server 300 via the external information manager 126.
In addition, when a desired parameter value cannot be obtained due to ambiguity in the context and the utterance, the desired parameter value can be obtained by solving the ambiguity using the ambiguity resolver 123.
In addition, even when an acquired parameter is an ineffective parameter with which it is difficult to perform the action, the ambiguity resolver 123 may obtain an effective parameter from the ineffective parameter.
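The parameter resolution of steps 670 and 680 can be sketched as follows, assuming invented parameter specifications and a stubbed external content lookup; the function and class names are illustrative only:

```python
# Sketch of action parameter resolution with necessary and optional parameters
# and an external-service fallback; names are illustrative assumptions.
def resolve_action_parameters(action, param_specs, stores, external):
    """param_specs: list of (name, required) pairs, standing in for the
    action parameter DB 146a entry for `action`."""
    values, missing = {}, []
    for name, required in param_specs:
        value = next((s[name] for s in stores if name in s), None)
        if value is None:
            value = external.fetch(name)  # external content server 300 via manager 126
        if value is None and required:
            missing.append(name)          # must ultimately be asked of the user
        values[name] = value
    return values, missing

class ExternalContent:
    def fetch(self, name):
        return {"poi_address": "1 Sejong-daero"}.get(name)  # stubbed lookup

stores = [{"current_position": "Uiwang"}, {"destination": "Seoul Station Exit 4"}]
specs = [("current_position", True), ("destination", True), ("waypoint", False)]
values, missing = resolve_action_parameters("route_guidance", specs, stores,
                                            ExternalContent())
print(values, missing)  # the optional waypoint stays None; nothing is missing
```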
Through the above-described steps, the dialog action manager 122 may manage the dialog state and the action state, and the dialog state and the action state may be updated whenever a state changes.
When all the obtainable parameter values have been acquired, the dialog flow manager 121 may send information about the candidate actions and the dialog state to the result processor 130. According to the dialog policy, the dialog flow manager 121 may transmit information related to the action with the highest priority or information related to a plurality of candidate actions.
On the other hand, when a desired parameter value can be obtained only from the user because it is not present in the external content server 300, the long-term memory 143, the short-term memory 144, or the context information DB 142, a dialog response asking the user for the parameter value may be output.
Fig. 43 is a flowchart showing a result processing method for generating a response corresponding to a result of dialog management in the dialog processing method according to an embodiment. The result processing method may be performed by the result processor 130 of the dialog system 100.
Referring to fig. 43, when a dialog response needs to be generated (yes in 700), the dialog response generator 132 searches the response templates 149 (710). The dialog response generator 132 obtains a dialog response template corresponding to the current dialog state and action state from the response templates 149 and fills the template with the required parameter values to generate a dialog response (720).
When the parameter values required to generate the dialog response are not transmitted from the dialog manager 120, or when an instruction to use external content is transmitted, the required parameter values may be provided from the external content server 300 or searched for in the long-term memory 143, the short-term memory 144, or the context information DB 142. When the desired parameter values can be obtained only from the user because they are not present in the external content server 300, the long-term memory 143, the short-term memory 144, or the context information DB 142, a dialog response asking the user for the parameter values may be generated.
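A minimal sketch of this template search-and-fill (steps 710 and 720) follows, assuming a dict-based template store and Python format-style placeholders; the actual placeholder syntax of the response templates 149 is not specified in the description:

```python
# Hypothetical template store keyed by (action, state); entries are invented.
RESPONSE_TEMPLATES = {
    ("route_guidance", "confirm"): (
        "It is expected to take {duration} minutes from {origin} to {destination}. "
        "Do you want to start route guidance?"
    ),
    ("route_guidance", "execute"): "Starting route guidance.",
}

def generate_dialog_response(action, state, **params):
    template = RESPONSE_TEMPLATES[(action, state)]   # 710: template lookup
    return template.format(**params)                 # 720: fill in parameter values

tts_text = generate_dialog_response(
    "route_guidance", "confirm",
    duration=30, origin="Uiwang Station", destination="Seoul Station Exit 4",
)
print(tts_text)
```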
When a command needs to be generated (yes in 760), the command generator 136 generates a command for vehicle control or for using external content (770).
The generated dialog response or command may be input to the output manager 133, and the output manager 133 may determine an output order of the dialog response and the command or an output order between the plurality of commands (730).
The memory is updated based on the generated dialog response or command (740). The memory manager 135 may update the short-term memory 144 by storing the dialog content between the user and the system based on the generated dialog response or command, and may update the long-term memory 143 by storing information about the user acquired through the dialog. In addition, the memory manager 135 may update the user preferences and the vehicle control history stored in the long-term memory 143 based on the vehicle control commands and external content requests that were generated and output.
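The memory update of step 740 might look like the following sketch; the persistence test used to decide what is promoted to long-term memory is an assumption for illustration:

```python
# Sketch of step 740: dialog content goes to short-term memory 144, while
# durable user facts are promoted to long-term memory 143. The persistence
# rule below is an invented stand-in.
PERSISTENT_KEYS = {"preferred_gas_station", "home_address", "favorite_route"}

def update_memory(short_term, long_term, dialog_turn, learned_facts):
    short_term.append(dialog_turn)          # keep the latest dialog exchange
    for key, value in learned_facts.items():
        if key in PERSISTENT_KEYS:          # only durable user info persists
            long_term[key] = value

short_term, long_term = [], {}
update_memory(short_term, long_term,
              {"user": "no", "system": "Gas station recommendation canceled."},
              {"home_address": "Uiwang"})
print(short_term, long_term)
```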
The output manager 133 may output the response by sending the dialog response and the command to the appropriate output positions (750). The TTS response may be output via the speaker 232, and the text response may be output on the display 231. The command may be transmitted to the vehicle controller 240 or to the external content server 300 according to its control target. In addition, the command may be transmitted to the communication device 280 configured to communicate with the external content server 300.
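A hedged sketch of this output dispatch (step 750) follows; the device interfaces are stand-ins for the speaker 232, the display 231, the vehicle controller 240, and the external content server 300:

```python
# Illustrative routing of responses and commands to output destinations.
def dispatch(outputs, speaker, display, vehicle_controller, external_server):
    for item in outputs:
        kind = item["kind"]
        if kind == "tts":
            speaker.play(item["text"])
        elif kind == "text":
            display.show(item["text"])
        elif kind == "command":
            # Route by control target: in-vehicle devices vs. external services.
            target = vehicle_controller if item["target"] == "vehicle" else external_server
            target.send(item)

class Stub:
    def __init__(self, name): self.name = name
    def play(self, text): print(f"[{self.name}] {text}")
    show = play  # the display stub reuses the same printing behavior
    def send(self, command): print(f"[{self.name}] {command['command']}")

dispatch(
    [{"kind": "tts", "text": "Starting route guidance."},
     {"kind": "command", "target": "vehicle", "command": "route_guidance"}],
    Stub("speaker"), Stub("display"), Stub("vehicle controller"), Stub("content server"),
)
```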
The dialog processing method according to the embodiment is not limited to the order in the flowcharts described above. The flow according to the flowcharts of fig. 41 to 43 may be merely an example applied to the dialog processing method. Thus, a plurality of steps may be performed simultaneously, and the order of each step may also be changed.
As is apparent from the above description, according to the proposed dialog system, the vehicle including the same, and the dialog processing method, a service suited to the user's intention or the user's needs can be provided by using a dialog processing method specialized for the vehicle.
In addition, by considering various contexts occurring in the vehicle, services required by the user can be provided. In particular, regardless of the user's utterance, services required by the user can be determined and actively provided based on context information or driver information collected by the dialog system 100.
By accurately recognizing the user's intention based on various information, such as dialogs with the user while the vehicle is driving, vehicle state information, driving environment information, and user information, the service matching the user's actual intention, or the service the user most needs, may be provided.
Further, vehicle control may be performed according to the user's intention even through indirect utterances, as distinct from direct control utterances.
Although a few embodiments of the present disclosure have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the disclosure, the scope of which is defined in the claims and their equivalents.
Claims (31)
1. A dialog system for a vehicle, comprising:
a storage device configured to store context information including at least one of vehicle state information indicating a state of a vehicle or running environment information related to a running environment of the vehicle;
An input processor configured to obtain an utterance from a user and to extract an action corresponding to the utterance when recognizing that the utterance includes user state information;
A dialog manager configured to obtain, from the storage device, parameter values of a condition-determining parameter for determining whether an action corresponding to the utterance is executable, to determine an action to be performed based on the parameter values of the condition-determining parameter, and to obtain, from the storage device, parameter values of an action parameter for performing the determined action; and
A result processor configured to generate a response for performing the determined action using the acquired parameter values of the action parameters;
Wherein the storage device is coupled to the input processor, the dialog manager, and the results processor.
2. The dialog system of claim 1, wherein:
the storage means is configured to store contextual information relating to each of a plurality of actions, and
The input processor is configured to obtain information values of context information related to the action corresponding to the utterance from the storage device and to send the information values to the dialog manager.
3. The dialog system of claim 2, wherein:
the input processor is configured to request, from the vehicle, an information value of the context information related to the action corresponding to the utterance when the information value is not stored in the storage device.
4. The dialog system of claim 2, wherein:
the dialog manager is configured to set a parameter value of the condition-determining parameter or a parameter value of the action parameter equal to an information value of context information sent from the input processor.
5. The dialog system of claim 1, wherein:
The storage device is configured to store relationships between related actions, and
The dialog manager is configured to obtain, from the storage device, at least one action related to an action corresponding to the utterance.
6. The dialog system of claim 5 wherein:
the dialog manager is configured to determine a priority between an action corresponding to the utterance and at least one related action.
7. The dialog system of claim 6, wherein:
The dialog manager is configured to obtain parameter values of a condition-determining parameter for determining whether the related action is executable from the storage device and to determine whether the at least one related action is executable based on the obtained parameter values of the condition-determining parameter.
8. The dialog system of claim 7, wherein:
The dialog manager is configured to determine, as the action to be performed, an action that is executable and has the highest priority between the action corresponding to the utterance and the at least one related action.
9. The dialog system of claim 1, wherein,
When the input processor is unable to extract an action corresponding to the utterance, the dialog manager estimates an action corresponding to the utterance based on at least one of the vehicle state information or the driving environment information.
10. The dialog system of claim 1, further comprising:
A communicator configured to communicate with an external server,
Wherein the dialog manager is configured to request the parameter value from the external server when the dialog manager is unable to retrieve the parameter value of the condition determination parameter or the parameter value of the action parameter from the storage device.
11. The dialog system of claim 1, wherein:
The result processor is configured to generate a dialog response for performing the determined action and a command for controlling operation of the vehicle.
12. The dialog system of claim 1, further comprising:
a communicator configured to receive an information value of the context information from at least one of the vehicle or a mobile device connected to the vehicle and to send the response to the vehicle or the mobile device.
13. The dialog system of claim 1, wherein:
The input processor is configured to extract an action corresponding to the utterance based on user information of the user.
14. A dialogue processing method for a vehicle, comprising:
Storing context information in a storage device, the context information including at least one of vehicle state information indicating the vehicle state or running environment information related to a running environment of the vehicle;
Obtaining an utterance from a user;
Extracting an action corresponding to the utterance when the utterance is recognized to include user state information;
Obtaining parameter values of a condition-determining parameter from the storage device, the condition-determining parameter for determining whether an action corresponding to the utterance is executable;
Determining an action to be performed based on the parameter value of the condition determination parameter;
acquiring a parameter value of an action parameter for executing the determined action from the storage device; and
A response for performing the determined action is generated using the acquired parameter values of the action parameters.
15. The dialog processing method of claim 14, further comprising:
Storing context information associated with each of a plurality of actions in the storage device;
Obtaining, from the storage device, information values of context information related to the action corresponding to the utterance; and
The information value is transmitted to a dialog manager, which is connected to the storage device.
16. The dialog processing method of claim 15, further comprising:
requesting, from the vehicle, an information value of the context information related to the action corresponding to the utterance when the information value is not stored in the storage device; and
setting the parameter value of the condition determination parameter or the parameter value of the action parameter equal to the information value of the context information received from the vehicle.
17. The dialog processing method of claim 15, further comprising:
Storing relationships between related actions in the storage device;
Retrieving at least one action from the storage device that is related to an action corresponding to the utterance; and
A priority between the action corresponding to the utterance and the at least one related action is determined.
18. The dialog processing method of claim 17, further comprising:
Obtaining parameter values of condition determining parameters from the storage means, the condition determining parameters being used to determine whether the relevant action is executable; and
A determination is made as to whether the at least one related action is executable based on the acquired parameter values of the condition determination parameters.
19. The dialog processing method of claim 18, further comprising:
The action to be performed is determined to be an action that is executable and has a highest priority between the action corresponding to the utterance and the at least one related action.
20. The dialog processing method of claim 14, further comprising:
When the action corresponding to the utterance is not extracted, the action corresponding to the utterance is estimated based on at least one of the vehicle state information or the running environment information.
21. The dialog processing method of claim 15, further comprising:
When the dialog manager cannot acquire the parameter value of the condition determination parameter or the parameter value of the action parameter from the storage device, the parameter value is requested from an external server.
22. The dialog processing method of claim 14, further comprising:
A dialog response for performing the determined action and a command for controlling operation of the vehicle are generated.
23. The dialog processing method of claim 14, further comprising:
Receiving an information value of the context information from at least one of the vehicle or a mobile device connected to the vehicle; and
The response is sent to the vehicle or the mobile device.
24. The dialog processing method of claim 14, further comprising:
receiving user information of the user; and
An action corresponding to the user utterance is extracted based on the user information.
25. A vehicle having a dialog system, comprising:
A storage device configured to store context information including at least one of vehicle state information indicating the vehicle state or running environment information related to a running environment of the vehicle;
An input processor configured to obtain an utterance from a user and to extract an action corresponding to the utterance when recognizing that the utterance includes user state information;
A dialog manager configured to obtain, from the storage device, parameter values of a condition-determining parameter for determining whether an action corresponding to the utterance is executable, to determine an action to be performed based on the parameter values of the condition-determining parameter, and to obtain, from the storage device, parameter values of an action parameter for performing the determined action; and
A result processor configured to generate a response for performing the determined action using the acquired parameter values of the action parameters;
Wherein the storage device is coupled to the input processor, the dialog manager, and the results processor.
26. The vehicle of claim 25, wherein:
the storage means is configured to store contextual information relating to each of a plurality of actions, and
The input processor is configured to obtain information values of context information related to the action corresponding to the utterance from the storage device and to send the information values to the dialog manager.
27. The vehicle of claim 26, wherein:
the dialog manager is configured to set a parameter value of the condition-determining parameter or a parameter value of the action parameter equal to an information value of context information sent from the input processor.
28. The vehicle of claim 25, wherein:
the storage means stores relationships between related actions, and
The dialog manager is configured to obtain, from the storage device, at least one action related to an action corresponding to the utterance.
29. The vehicle according to claim 28, wherein:
the dialog manager is configured to determine a priority between an action corresponding to the utterance and the at least one related action.
30. The vehicle of claim 29, wherein:
The dialog manager is configured to obtain parameter values of a condition-determining parameter for determining whether the related action is executable from the storage device and to determine whether the at least one related action is executable based on the obtained parameter values of the condition-determining parameter.
31. The vehicle of claim 25, wherein:
The input processor is configured to extract an action corresponding to the utterance based on user information of the user.
Applications Claiming Priority (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020180056497A KR20190131741A (en) | 2018-05-17 | 2018-05-17 | Dialogue system, and dialogue processing method |
KR10-2018-0056497 | 2018-05-17 | ||
KR10-2018-0067127 | 2018-06-12 | ||
KR1020180067127A KR102562227B1 (en) | 2018-06-12 | 2018-06-12 | Dialogue system, Vehicle and method for controlling the vehicle |
KR1020180073824A KR102695306B1 (en) | 2018-06-27 | 2018-06-27 | Dialogue system, Vehicle and method for controlling the vehicle |
KR10-2018-0073824 | 2018-06-27 | ||
KR1020180122295A KR20200042127A (en) | 2018-10-15 | 2018-10-15 | Dialogue processing apparatus, vehicle having the same and dialogue processing method |
KR10-2018-0122295 | 2018-10-15 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110503947A CN110503947A (en) | 2019-11-26 |
CN110503947B true CN110503947B (en) | 2024-06-18 |
Family
ID=68584955
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811496791.6A Active CN110503947B (en) | 2018-05-17 | 2018-12-07 | Dialogue system, vehicle including the same, and dialogue processing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110503947B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111523460A (en) * | 2020-04-23 | 2020-08-11 | 上海铠盾科技有限公司 | Standard operation behavior detection system |
CN111883125A (en) * | 2020-07-24 | 2020-11-03 | 北京蓦然认知科技有限公司 | Vehicle voice control method, device and system |
CN113614713A (en) * | 2021-06-29 | 2021-11-05 | 华为技术有限公司 | Human-computer interaction method, device, equipment and vehicle |
CN114999490A (en) * | 2022-08-03 | 2022-09-02 | 成都智暄科技有限责任公司 | Intelligent cabin audio control system |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104123936A (en) * | 2013-04-25 | 2014-10-29 | 伊莱比特汽车公司 | Method for automatic training of a dialogue system, dialogue system, and control device for vehicle |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9493130B2 (en) * | 2011-04-22 | 2016-11-15 | Angel A. Penilla | Methods and systems for communicating content to connected vehicle users based detected tone/mood in voice input |
JP6411017B2 (en) * | 2013-09-27 | 2018-10-24 | クラリオン株式会社 | Server and information processing method |
KR101770187B1 (en) * | 2014-03-27 | 2017-09-06 | 한국전자통신연구원 | Method and apparatus for controlling navigation using voice conversation |
US10726831B2 (en) * | 2014-05-20 | 2020-07-28 | Amazon Technologies, Inc. | Context interpretation in natural language processing using previous dialog acts |
KR102249392B1 (en) * | 2014-09-02 | 2021-05-07 | 현대모비스 주식회사 | Apparatus and method for controlling device of vehicle for user customized service |
US20170221480A1 (en) * | 2016-01-29 | 2017-08-03 | GM Global Technology Operations LLC | Speech recognition systems and methods for automated driving |
US20180114528A1 (en) * | 2016-10-26 | 2018-04-26 | IPsoft Incorporated | Systems and methods for generic flexible dialogue management |
2018-12-07: Application CN201811496791.6A filed (patent CN110503947B, status: Active)
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104123936A (en) * | 2013-04-25 | 2014-10-29 | 伊莱比特汽车公司 | Method for automatic training of a dialogue system, dialogue system, and control device for vehicle |
Non-Patent Citations (1)
Title |
---|
Speech Intention Understanding Method Based on Multimodal Information Fusion; Zheng Binbin et al.; Sciencepaper Online (Issue 07); full text *
Also Published As
Publication number | Publication date |
---|---|
CN110503947A (en) | 2019-11-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108346430B (en) | Dialogue system, vehicle having dialogue system, and dialogue processing method | |
CN110648661B (en) | Dialogue system, vehicle and method for controlling a vehicle | |
CN110660397B (en) | Dialogue system, vehicle and method for controlling a vehicle | |
CN110503948B (en) | Dialogue system and dialogue processing method | |
US10847150B2 (en) | Dialogue system, vehicle having the same and dialogue service processing method | |
US10991368B2 (en) | Dialogue system and dialogue processing method | |
US10937424B2 (en) | Dialogue system and vehicle using the same | |
US10861460B2 (en) | Dialogue system, vehicle having the same and dialogue processing method | |
CN110503947B (en) | Dialogue system, vehicle including the same, and dialogue processing method | |
US10950233B2 (en) | Dialogue system, vehicle having the same and dialogue processing method | |
CN110503949B (en) | Dialogue system, vehicle having dialogue system, and dialogue processing method | |
US11004450B2 (en) | Dialogue system and dialogue processing method | |
KR102403355B1 (en) | Vehicle, mobile for communicate with the vehicle and method for controlling the vehicle | |
KR102487669B1 (en) | Dialogue processing apparatus, vehicle having the same and dialogue processing method | |
KR102448719B1 (en) | Dialogue processing apparatus, vehicle and mobile device having the same, and dialogue processing method | |
CN110562260A (en) | Dialogue system and dialogue processing method | |
KR20190036018A (en) | Dialogue processing apparatus, vehicle having the same and dialogue processing method | |
KR20190135676A (en) | Dialogue system, vehicle having the same and dialogue processing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||