
CN112463106A - Voice interaction method, device and equipment based on intelligent screen and storage medium - Google Patents


Info

Publication number
CN112463106A
Authority
CN
China
Prior art keywords
voice
information
scene
interaction
component
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011274799.5A
Other languages
Chinese (zh)
Inventor
王云华 (Wang Yunhua)
罗新宇 (Luo Xinyu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL New Technology Co Ltd
Original Assignee
Shenzhen TCL New Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TCL New Technology Co Ltd filed Critical Shenzhen TCL New Technology Co Ltd
Priority to CN202011274799.5A priority Critical patent/CN112463106A/en
Publication of CN112463106A publication Critical patent/CN112463106A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 - Sound input; Sound output
    • G06F 3/167 - Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223 - Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to the technical field of smart home, and discloses a voice interaction method, device, equipment and storage medium based on a smart screen. The method comprises the following steps: when the voice monitoring component is detected to be started, acquiring the operating scene information of the smart screen; determining a corresponding scene interaction mode according to the operating scene information; and collecting voice instruction information through the voice monitoring component, and calling a corresponding target function component according to the scene interaction mode to respond to the voice instruction information. Because the scene interaction mode is matched according to the operating scene information of the smart screen, and the target function component is called according to the scene interaction mode and the voice instruction information, the voice instruction information is responded to in a manner adaptively matched to the smart screen's operating scene. This reduces invalid interactions, realizes efficient voice interaction with the smart screen, and improves the user's interaction experience.

Description

Voice interaction method, device and equipment based on intelligent screen and storage medium
Technical Field
The invention relates to the technical field of smart home, in particular to a voice interaction method, device, equipment and storage medium based on a smart screen.
Background
As voice interaction technology matures, more and more household appliances are equipped with voice monitoring components to enable voice interaction with the user. In the prior art, however, the voice monitoring components carried by household appliances mostly have a single function and cannot adapt to the interaction scene, so interaction efficiency is low and the interaction experience is poor. On this basis, realizing full-process voice interaction with the smart screen in household appliances becomes even more difficult. Therefore, how to realize efficient voice interaction with the smart screen and improve the user's interaction experience has become an urgent problem to be solved.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The invention mainly aims to provide a voice interaction method, device, equipment and storage medium based on a smart screen, so as to realize efficient voice interaction with the smart screen and improve the user's interaction experience.
In order to achieve the above object, the present invention provides a voice interaction method based on an intelligent screen, comprising the following steps:
when the voice monitoring component is detected to be started, acquiring the running scene information of the intelligent screen;
determining a corresponding scene interaction mode according to the operation scene information;
collecting voice instruction information through the voice monitoring component, and calling a corresponding target function component according to the scene interaction mode to respond to the voice instruction information.
Preferably, the step of acquiring the operating scenario information of the smart screen when detecting that the voice monitoring component is started includes:
when the voice monitoring component is detected to be started, the operating scene information of the intelligent screen is obtained through the scene monitoring interface corresponding to the voice monitoring component.
Preferably, the step of determining the corresponding scene interaction mode according to the operation scene information specifically includes:
matching the operation scene information with a scene template stored in a preset scene database to obtain a template matching degree;
sorting the template matching degree to obtain a template sorting result;
and determining a corresponding scene interaction mode according to the template sequencing result.
Preferably, the step of acquiring voice instruction information by the voice monitoring component and calling a corresponding target function component according to the scene interaction mode to respond to the voice instruction information specifically includes:
collecting voice instruction information through the voice monitoring component;
performing voice recognition on the voice instruction information to obtain instruction key information and user attribute information;
and calling a corresponding target function component to respond to the voice instruction information according to the instruction key information, the user attribute information and the scene interaction mode.
Preferably, the step of calling a corresponding target function component according to the instruction key information, the user attribute information, and the scene interaction mode to respond to the voice instruction information specifically includes:
determining a corresponding functional component to be called according to the instruction key information and the scene mode;
acquiring a historical use record of a user according to the user attribute information, and determining the preference setting of the user according to the historical use record;
and performing adaptive matching on the functional component to be called according to the preference setting to obtain a corresponding target functional component, and calling the target functional component to respond to the voice instruction information.
Preferably, the step of acquiring voice instruction information by the voice monitoring component and calling a corresponding target function component according to the scene interaction mode to respond to the voice instruction information specifically includes:
acquiring voice instruction information through the voice monitoring component, and determining a corresponding target function component according to the voice instruction information and the scene interaction mode;
and displaying screen display prompt information corresponding to the target function component through the intelligent screen, and calling the corresponding target function component to respond to the voice instruction information.
Preferably, the step of displaying the screen display prompt information corresponding to the target function component through the smart screen specifically includes:
acquiring category information of the target function component, and determining a target display position of screen display prompt information corresponding to the target function component in the intelligent screen according to the category information;
carrying out adaptive adjustment on the target display position according to the current display page of the intelligent screen to obtain an adjusted target display position;
and displaying the screen display prompt information at the adjusted target display position.
In addition, in order to achieve the above object, the present invention further provides a voice interaction device based on a smart screen, wherein the device includes:
the information acquisition module is used for acquiring the running scene information of the intelligent screen when the voice monitoring component is detected to be started;
the mode determining module is used for determining a corresponding scene interaction mode according to the running scene information;
and the function calling module is used for acquiring voice instruction information through the voice monitoring component and calling a corresponding target function component to respond to the voice instruction information according to the scene interaction mode.
In addition, to achieve the above object, the present invention further provides a voice interaction device, including: the system comprises a memory, a processor and a smart screen-based voice interaction program stored on the memory and operable on the processor, wherein the smart screen-based voice interaction program is configured to implement the steps of the smart screen-based voice interaction method as described above.
In addition, to achieve the above object, the present invention further provides a computer-readable storage medium storing a smart screen-based voice interaction program, wherein the smart screen-based voice interaction program, when executed by a processor, implements the steps of the smart screen-based voice interaction method described above.
In the invention, when the voice monitoring component is detected to be started, the operating scene information of the smart screen is obtained, the corresponding scene interaction mode is determined according to the operating scene information, voice instruction information is collected through the voice monitoring component, and the corresponding target function component is called according to the scene interaction mode to respond to the voice instruction information. Because the scene interaction mode is matched according to the operating scene information of the smart screen, and the target function component is called according to the scene interaction mode and the voice instruction information, the response is adaptively matched to the smart screen's operating scene; this reduces invalid interactions, realizes efficient voice interaction with the smart screen, and improves the user's interaction experience.
Drawings
FIG. 1 is a schematic structural diagram of a smart screen-based voice interaction device in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of a method for smart screen based voice interaction according to the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of a method for smart screen based voice interaction according to the present invention;
FIG. 4 is a block diagram of a first embodiment of a smart screen-based voice interaction apparatus according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a voice interaction device in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the voice interaction apparatus may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM), or may be a Non-Volatile Memory (NVM) such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in FIG. 1 does not constitute a limitation of the voice interaction device, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer-readable storage medium, may include therein an operating system, a data storage module, a network communication module, a user interface module, and a smart screen-based voice interaction program.
In the voice interaction apparatus shown in fig. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The voice interaction device of the present invention calls, through the processor 1001, the smart screen-based voice interaction program stored in the memory 1005 and executes the smart screen-based voice interaction method provided by the embodiments of the present invention.
An embodiment of the present invention provides a voice interaction method based on an intelligent screen, and referring to fig. 2, fig. 2 is a schematic flow diagram of a first embodiment of the voice interaction method based on the intelligent screen.
In this embodiment, the voice interaction method based on the smart screen includes the following steps:
step S10: when the voice monitoring component is detected to be started, acquiring the running scene information of the intelligent screen;
it is easy to understand that, the execution main body of the embodiment is the voice interaction device, and the smart screen may be installed on the voice interaction device as an integrated structure, or may be connected to the voice interaction device through a communication bus, which is not limited in this embodiment. The voice monitoring component is loaded in the voice interaction device, in specific implementation, the voice monitoring component can be a voice assistant (such as a message flying talk point), before whether the voice monitoring component is started or not is detected, whether the voice assistant exists or not can be detected, if the voice assistant is not found, a corresponding downloading request can be sent to a cloud, and downloading and installation of the voice assistant are further achieved. The voice monitoring component is mainly used for monitoring voice instruction information input by a user in real time, when the voice monitoring component is detected to be started, the operation scene information of the intelligent screen can be acquired through a scene monitoring (ISceneListenner) interface corresponding to the voice monitoring component, in the specific implementation, the intelligent screen can be notified to submit the operation scene information through a notification (onReminder ()) function of the scene monitoring (ISceneListenner) interface, so that the intelligent screen returns (return) a corresponding character string to represent the operation scene information, and the operation scene information includes but is not limited to a current display interface, a current time, a current position and a current operation application of the intelligent screen.
Step S20: determining a corresponding scene interaction mode according to the operation scene information;
it should be noted that, after the operation scene information is obtained, in order to implement adaptive matching according to the operation scene information of the smart screen, the operation scene information may be matched with a scene template stored in a preset scene database to obtain a template matching degree, then, the template matching degrees are sorted to obtain a template sorting result, and then, a corresponding scene interaction mode is determined according to the template sorting result. In a specific implementation, a first-order scene template (i.e., a scene template with the highest matching degree) in a template sorting result may be selected as a template to be interacted, a scene interaction object is determined based on operating scene information, then, the scene interaction object and the template to be interacted are subjected to adaptive matching to generate a scene interaction mode, where a preset scene database is a database for storing the scene template, the scene template may be a built-in interaction template updated with a version, or a template set by a user according to actual requirements, so that the corresponding scene interaction object is determined according to the operating scene information, and then, interaction is performed based on the scene interaction object, which is not limited in this embodiment, for example, when it is detected that an application is currently operating on an intelligent screen, and the corresponding currently-operating application is a movie playing application, a corresponding current display interface is a movie list display interface, when the received voice instruction information is 'A', the option with the highest matching degree with the film 'A' can be selected; for another example, when it is detected that the current display interface of the smart screen is in the home page interface, when the received voice instruction information is "a", movies, music, and the like related to "a" may be searched.
Step S30: collecting voice instruction information through the voice monitoring component, and calling a corresponding target function component according to the scene interaction mode to respond to the voice instruction information.
It is easy to understand that, after the corresponding scene interaction mode is determined, in order to improve the user's interaction experience, voice instruction information may be collected by the voice monitoring component and the corresponding target function component determined according to the voice instruction information and the scene interaction mode; the on-screen display prompt information corresponding to the target function component is then displayed on the smart screen, and the corresponding target function component is called to respond to the voice instruction information. The target function component is the function component corresponding to the interaction mode, such as a music playing control, a brightness adjustment control, a video playing control, a volume adjustment control or a progress adjustment control. The on-screen display prompt information is the prompt information, corresponding to the target function component, shown on the smart screen's display page. For example, when the voice instruction "turn up the volume" is received, the volume adjustment bar is displayed and the volume is increased once by a preset gradient, where the preset gradient may be set according to the adjustment gradient in the historical volume adjustment records, or may be an adjustment gradient corresponding to the scene interaction mode, which is not limited in this embodiment.
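To make the dispatch step concrete, the sketch below shows one way the "turn up the volume" case could be wired to a volume-adjustment component; the registry keyed by interaction mode and instruction text, the preset gradient value and the VolumeControl class are assumptions made for illustration.

```java
import java.util.*;

/** Hypothetical target function component for volume adjustment. */
class VolumeControl {
    private int volume = 30;            // current volume, 0..100
    private final int presetGradient;   // step size; could come from history or the scene mode

    VolumeControl(int presetGradient) { this.presetGradient = presetGradient; }

    void turnUp() {
        volume = Math.min(100, volume + presetGradient);
        System.out.println("OSD: show volume bar, volume = " + volume);
    }
}

class FunctionDispatcher {
    // Maps (interaction mode, instruction text) to the target function component action to invoke.
    private final Map<String, Runnable> registry = new HashMap<>();

    FunctionDispatcher(VolumeControl volumeControl) {
        registry.put("moviePlayback:turn up the volume", volumeControl::turnUp);
    }

    void respond(String interactionMode, String instruction) {
        Runnable action = registry.get(interactionMode + ":" + instruction);
        if (action != null) {
            action.run();               // call the matched target function component
        } else {
            System.out.println("No matching target function component; ignore invalid interaction");
        }
    }
}
```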
It should be noted that, when the target function component is determined, the category information of the target function component may also be obtained, and the target display position of the corresponding on-screen display prompt information on the smart screen is determined according to that category information. The target display position is then adaptively adjusted according to the current display page of the smart screen to obtain the adjusted target display position, and the on-screen display prompt information is displayed at the adjusted target display position. The target display position can be understood as a pending display position of the target function component on the smart screen, determined according to the category information of the target function component, which still needs to be adaptively adjusted according to the current display page before being displayed. In a specific implementation, in order to improve the user's interaction experience, when the category of the obtained target interaction control is a volume adjustment control, the adaptive adjustment may be performed according to the display content of the current display page. For example, when it is detected that interactive options are displayed in the preset areas on both sides of the central axis of the current display page, and only the border areas on both sides of the screen are idle areas without corresponding interactive options, the volume adjustment bar can be displayed vertically in such a border area of the current display page of the smart screen, thereby improving the interaction accuracy.
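A rough sketch of this position-adaptation step is given below, assuming a simple page model that records whether the border areas currently hold interactive options; the PageLayout record, the Position values and the fallback rule are illustrative assumptions, not the patent's definition.

```java
class OsdPlacement {
    /** Hypothetical description of the current display page of the smart screen. */
    record PageLayout(boolean leftBorderFree, boolean rightBorderFree) {}

    enum Position { LEFT_BORDER_VERTICAL, RIGHT_BORDER_VERTICAL, BOTTOM_CENTER }

    /**
     * Pending position for a volume-adjustment prompt, then adapted to the page:
     * use a vertical bar on an idle border area, otherwise fall back to the bottom.
     */
    static Position placeVolumeBar(PageLayout page) {
        if (page.rightBorderFree()) return Position.RIGHT_BORDER_VERTICAL;
        if (page.leftBorderFree())  return Position.LEFT_BORDER_VERTICAL;
        return Position.BOTTOM_CENTER;   // assumed fallback when no border area is idle
    }
}
```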
In another implementation, in order to further improve the user's interaction experience, when the target function component is determined, the category information and the usage frequency of the target function component may also be obtained, and the target display position of the corresponding on-screen display prompt information on the smart screen is determined according to both the category information and the usage frequency. That is, when multiple target function components are called according to the scene interaction mode, the target display positions of their on-screen display prompt information may be determined according to the usage frequency of each target function component, and further adjusted in combination with the current display page of the smart screen. For example, when the on-screen display prompt information is displayed as a list, the usage frequency of each target control may be obtained and the frequencies sorted to obtain a frequency sorting result, so that target function components with high usage frequency are displayed at the top of the list. As another example, when the on-screen display prompt information is displayed in prompt boxes, the display font of the first-ranked target function component (i.e., the one with the highest usage frequency) is shown one font size larger than those of the other target function components, and its prompt box is displayed at the center line of the current display page of the smart screen (i.e., the most frequently used function component is shown in the middle), so that the user can obtain the feedback result of the voice interaction more intuitively, further improving the user's interaction experience.
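The frequency-based ordering described above could be sketched as follows, under the assumption that usage frequencies are simple per-component counters; the Prompt record and the example values are illustrative only.

```java
import java.util.*;

class OsdPromptOrdering {
    /** Hypothetical prompt entry: a component name and how often the user has invoked it. */
    record Prompt(String componentName, int usageFrequency) {}

    /** Sorts prompts so the most frequently used component is listed (or rendered) first. */
    static List<Prompt> orderByFrequency(List<Prompt> prompts) {
        return prompts.stream()
                .sorted(Comparator.comparingInt(Prompt::usageFrequency).reversed())
                .toList();
    }

    public static void main(String[] args) {
        List<Prompt> ordered = orderByFrequency(List.of(
                new Prompt("volumeAdjust", 42),
                new Prompt("brightnessAdjust", 7),
                new Prompt("progressAdjust", 19)));
        // The first entry would be shown at the top of the list, or centered on the page.
        System.out.println(ordered);
    }
}
```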
It should be understood that the above is only an example, and the technical solution of the present invention is not limited in any way, and in a specific application, a person skilled in the art may set the technical solution as needed, and the present invention is not limited thereto.
In this embodiment, when it is detected that the voice monitoring component is started, the operating scene information of the smart screen is acquired, the corresponding scene interaction mode is determined according to the operating scene information, voice instruction information is collected through the voice monitoring component, and the corresponding target function component is called according to the scene interaction mode to respond to the voice instruction information. Because the scene interaction mode is matched according to the operating scene information of the smart screen, and the target function component is called according to the scene interaction mode and the voice instruction information, the voice instruction information is responded to in a manner adaptively matched to the smart screen's operating scene, which reduces invalid interactions, realizes efficient voice interaction with the smart screen, and improves the user's interaction experience.
Referring to fig. 3, fig. 3 is a flowchart illustrating a voice interaction method based on a smart screen according to a second embodiment of the present invention.
Based on the first embodiment described above, in the present embodiment, the step S30 includes:
step S301: collecting voice instruction information through the voice monitoring component;
step S302: performing voice recognition on the voice instruction information to obtain instruction key information and user attribute information;
step S303: and calling a corresponding target function component to respond to the voice instruction information according to the instruction key information, the user attribute information and the scene interaction mode.
It is easy to understand that, when voice instruction information is collected by the voice monitoring component, voice recognition can be performed on the voice instruction information to obtain instruction key information and user attribute information. The instruction key information can be understood as words with verb, noun or adjective attributes, such as "sound", "big", "small", "brightness", "movie A", "music B" and the like; the user attribute information is information used to characterize the user's identity, such as voiceprint and timbre. Then, the corresponding functional component to be called is determined according to the instruction key information and the scene mode, the user's historical usage record is obtained according to the user attribute information, and the usage frequency of each functional component by the user is obtained from the historical usage record. Cluster analysis is performed on the instruction key information of the voice interaction instructions previously input by the user to obtain a cluster analysis result, and the user's preference setting can then be extracted based on the cluster analysis result and the usage frequency of each functional component. The functional component to be called is adaptively matched according to the preference setting to obtain the corresponding target function component, which is called to respond to the voice instruction information. For example, if the collected instruction key information input by the user is "play" and "A", and it is detected that the current display interface of the smart screen is the home page, then after the user's identity is determined from the user's voiceprint, the user's historical usage record can be obtained according to that identity. If the historical usage record shows that the user prefers to watch movies at a brightness lower than a preset brightness threshold, the movie "A" is searched for, and while playing the movie "A" the screen brightness is adjusted to the brightness used most often in the user's historical usage record. The preset brightness threshold can be set according to actual needs, which is not limited in this embodiment.
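The sketch below illustrates one way the recognized key information and a history-derived preference could be combined to choose and configure the target component; the RecognitionResult and UserPreference structures, the voiceprint identifier and the brightness value are assumptions made for illustration, not the patent's data model.

```java
import java.util.*;

class PreferenceAwareDispatch {
    /** Hypothetical result of voice recognition on one utterance. */
    record RecognitionResult(List<String> keyInfo, String voiceprintId) {}

    /** Hypothetical per-user preference derived from the historical usage record. */
    record UserPreference(int preferredBrightness) {}

    private final Map<String, UserPreference> preferenceStore = new HashMap<>();

    PreferenceAwareDispatch() {
        // In practice this would be built from the historical usage records / cluster analysis.
        preferenceStore.put("voiceprint-001", new UserPreference(35));
    }

    void respond(RecognitionResult result) {
        if (result.keyInfo().contains("play")) {
            String title = result.keyInfo().stream()
                    .filter(k -> !k.equals("play"))
                    .findFirst().orElse("");
            System.out.println("Searching and playing movie: " + title);

            // Apply the user's preferred brightness if a history-based preference exists.
            UserPreference pref = preferenceStore.get(result.voiceprintId());
            if (pref != null) {
                System.out.println("Adjusting screen brightness to " + pref.preferredBrightness());
            }
        }
    }
}
```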
It should be noted that, in order to improve the user's interaction experience and the universality of the interaction, when the user's historical usage record cannot be obtained according to the user attribute information (i.e., when the user's identity cannot be identified from the voiceprint, or no historical usage record corresponding to that identity can be found), attribute information characterizing the user's identity, such as the user's gender and age, can be determined from the user's voiceprint, timbre and the like. The user's preference setting for each functional component is then analyzed or read from the cloud database and/or the local storage database based on the user's gender, age and the like, for example from the high-frequency search terms and the usage frequency of each functional component of users with similar attributes, and the corresponding target function component is called to respond to the user's voice instruction information according to the user's instruction key information, the user's preference setting for each functional component, and the scene interaction mode.
It should be understood that the above is only an example, and the technical solution of the present invention is not limited in any way, and in a specific application, a person skilled in the art may set the technical solution as needed, and the present invention is not limited thereto.
In this embodiment, voice instruction information is collected through the voice monitoring component, voice recognition is performed on the voice instruction information to obtain instruction key information and user attribute information, and the corresponding target function component is called to respond to the voice instruction information according to the instruction key information, the user attribute information and the scene interaction mode. Because the voice instruction information is recognized to obtain the instruction key information and the user attribute information, and the corresponding target function component is then called based on the user attribute information, the instruction key information and the scene interaction mode, the response to the voice instruction information is adaptively matched to the user's attribute information, the instruction key information and the scene interaction mode. The interaction behavior thus better fits the user's usage habits, the voice interaction becomes more intelligent, and the user's interaction experience is further improved.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where a smart screen-based voice interaction program is stored on the computer-readable storage medium, and when executed by a processor, the smart screen-based voice interaction program implements the steps of the smart screen-based voice interaction method described above.
Referring to fig. 4, fig. 4 is a block diagram illustrating a first embodiment of a voice interaction apparatus based on a smart screen according to the present invention.
As shown in fig. 4, the speech interaction device based on smart screen according to the embodiment of the present invention includes:
the information acquisition module 10 is configured to acquire operating scene information of the smart screen when detecting that the voice monitoring component is started;
a mode determining module 20, configured to determine a corresponding scene interaction mode according to the operating scene information;
and the function calling module 30 is configured to collect voice instruction information through the voice monitoring component, and call a corresponding target function component to respond to the voice instruction information according to the scene interaction mode.
In this embodiment, when it is detected that the voice monitoring component is started, the operating scene information of the smart screen is acquired, the corresponding scene interaction mode is determined according to the operating scene information, voice instruction information is acquired through the voice monitoring component, and the corresponding target function component is called according to the scene interaction mode to respond to the voice instruction information, thereby reducing the occurrence of invalid interactions.
Based on the first embodiment of the voice interaction device based on the smart screen, a second embodiment of the voice interaction device based on the smart screen is provided.
In this embodiment, the information obtaining module 10 is further configured to obtain the operating scene information of the smart screen through a scene monitoring interface corresponding to the voice monitoring component when detecting that the voice monitoring component is started.
The mode determining module 20 is further configured to match the operating scene information with a scene template stored in a preset scene database to obtain a template matching degree;
the mode determining module 20 is further configured to rank the template matching degrees to obtain a template ranking result;
the mode determining module 20 is further configured to determine a corresponding scene interaction mode according to the template sorting result.
The function calling module 30 is further configured to collect voice instruction information through the voice monitoring component;
the function calling module 30 is further configured to perform voice recognition on the voice instruction information to obtain instruction key information and user attribute information;
the function calling module 30 is further configured to call a corresponding target function component to respond to the voice instruction information according to the instruction key information, the user attribute information, and the scene interaction mode.
The function calling module 30 is further configured to determine a corresponding functional component to be called according to the instruction key information and the scene mode;
the function calling module 30 is further configured to obtain a historical usage record of the user according to the user attribute information, and determine a preference setting of the user according to the historical usage record;
the function calling module 30 is further configured to perform adaptive matching on the to-be-called functional component according to the preference setting to obtain a corresponding target functional component, and call the target functional component to respond to the voice instruction information.
The function calling module 30 is further configured to collect voice instruction information through the voice monitoring component, and determine a corresponding target function component according to the voice instruction information and the scene interaction mode;
the function calling module 30 is further configured to display, through the smart screen, on-screen display prompt information corresponding to the target function component, and call the corresponding target function component to respond to the voice instruction information.
The function calling module 30 is further configured to obtain category information of the target function component, and determine a target display position of the screen display prompt information corresponding to the target function component in the smart screen according to the category information;
the function calling module 30 is further configured to perform adaptive adjustment on the target display position according to the current display page of the smart screen to obtain an adjusted target display position;
the function calling module 30 is further configured to display the screen display prompt information at the adjusted target display position.
Other embodiments or specific implementation manners of the voice interaction device based on the smart screen may refer to the above method embodiments, and are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a computer-readable storage medium (such as a rom/ram, a magnetic disk, or an optical disk), and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A voice interaction method based on a smart screen is characterized by comprising the following steps:
when the voice monitoring component is detected to be started, acquiring the running scene information of the intelligent screen;
determining a corresponding scene interaction mode according to the operation scene information;
collecting voice instruction information through the voice monitoring component, and calling a corresponding target function component according to the scene interaction mode to respond to the voice instruction information.
2. The method according to claim 1, wherein the step of acquiring the operating scenario information of the smart screen when detecting that the voice monitoring component is started includes:
when the voice monitoring component is detected to be started, the operating scene information of the intelligent screen is obtained through the scene monitoring interface corresponding to the voice monitoring component.
3. The method according to claim 1, wherein the step of determining the corresponding scene interaction mode according to the operating scene information specifically includes:
matching the operation scene information with a scene template stored in a preset scene database to obtain a template matching degree;
sorting the template matching degree to obtain a template sorting result;
and determining a corresponding scene interaction mode according to the template sequencing result.
4. The method of claim 1, wherein the step of acquiring voice instruction information by the voice monitoring component and invoking a corresponding target function component to respond to the voice instruction information according to the scene interaction mode specifically comprises:
collecting voice instruction information through the voice monitoring component;
performing voice recognition on the voice instruction information to obtain instruction key information and user attribute information;
and calling a corresponding target function component to respond to the voice instruction information according to the instruction key information, the user attribute information and the scene interaction mode.
5. The method according to claim 4, wherein the step of calling the corresponding target function component to respond to the voice instruction information according to the instruction key information, the user attribute information, and the scene interaction mode specifically includes:
determining a corresponding functional component to be called according to the instruction key information and the scene mode;
acquiring a historical use record of a user according to the user attribute information, and determining the preference setting of the user according to the historical use record;
and performing adaptive matching on the functional component to be called according to the preference setting to obtain a corresponding target functional component, and calling the target functional component to respond to the voice instruction information.
6. The method of claim 1, wherein the step of acquiring voice instruction information by the voice monitoring component and invoking a corresponding target function component to respond to the voice instruction information according to the scene interaction mode specifically comprises:
acquiring voice instruction information through the voice monitoring component, and determining a corresponding target function component according to the voice instruction information and the scene interaction mode;
and displaying screen display prompt information corresponding to the target function component through the intelligent screen, and calling the corresponding target function component to respond to the voice instruction information.
7. The method according to claim 6, wherein the step of displaying the on-screen display prompt information corresponding to the target function component through the smart screen specifically includes:
acquiring category information of the target function component, and determining a target display position of screen display prompt information corresponding to the target function component in the intelligent screen according to the category information;
carrying out adaptive adjustment on the target display position according to the current display page of the intelligent screen to obtain an adjusted target display position;
and displaying the screen display prompt information at the adjusted target display position.
8. A voice interaction device based on a smart screen, the device comprising:
the information acquisition module is used for acquiring the running scene information of the intelligent screen when the voice monitoring component is detected to be started;
the mode determining module is used for determining a corresponding scene interaction mode according to the running scene information;
and the function calling module is used for acquiring voice instruction information through the voice monitoring component and calling a corresponding target function component to respond to the voice instruction information according to the scene interaction mode.
9. A voice interaction device, the device comprising: a memory, a processor and a smart screen based voice interaction program stored on the memory and executable on the processor, the smart screen based voice interaction program configured to implement the steps of the smart screen based voice interaction method of any one of claims 1 to 7.
10. A computer-readable storage medium, wherein a smart screen-based voice interaction program is stored on the computer-readable storage medium, and when executed by a processor, the smart screen-based voice interaction program implements the steps of the smart screen-based voice interaction method according to any one of claims 1 to 7.
CN202011274799.5A 2020-11-12 2020-11-12 Voice interaction method, device and equipment based on intelligent screen and storage medium Pending CN112463106A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011274799.5A CN112463106A (en) 2020-11-12 2020-11-12 Voice interaction method, device and equipment based on intelligent screen and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011274799.5A CN112463106A (en) 2020-11-12 2020-11-12 Voice interaction method, device and equipment based on intelligent screen and storage medium

Publications (1)

Publication Number Publication Date
CN112463106A (en) 2021-03-09

Family

ID=74836199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011274799.5A Pending CN112463106A (en) 2020-11-12 2020-11-12 Voice interaction method, device and equipment based on intelligent screen and storage medium

Country Status (1)

Country Link
CN (1) CN112463106A (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106155471A (en) * 2015-04-22 2016-11-23 阿里巴巴集团控股有限公司 A kind of operate the display packing at interface, device and electronic equipment
CN106303655A (en) * 2016-08-17 2017-01-04 珠海市魅族科技有限公司 A kind of media content play cuing method and device
CN108663942A (en) * 2017-04-01 2018-10-16 青岛有屋科技有限公司 A kind of speech recognition apparatus control method, speech recognition apparatus and control server
CN107861665A (en) * 2017-11-16 2018-03-30 珠海市魅族科技有限公司 Reminding method and device, terminal, the readable storage medium storing program for executing of a kind of volume adjusting
CN111312235A (en) * 2018-12-11 2020-06-19 阿里巴巴集团控股有限公司 Voice interaction method, device and system
CN110910872A (en) * 2019-09-30 2020-03-24 华为终端有限公司 Voice interaction method and device
CN111161734A (en) * 2019-12-31 2020-05-15 苏州思必驰信息科技有限公司 Voice interaction method and device based on designated scene
CN111372236A (en) * 2020-03-02 2020-07-03 上海传英信息技术有限公司 Communication method, terminal device and storage medium
CN111638846A (en) * 2020-05-26 2020-09-08 维沃移动通信有限公司 Image recognition method and device and electronic equipment
CN111722826A (en) * 2020-06-28 2020-09-29 广州小鹏车联网科技有限公司 Construction method of voice interaction information, vehicle and storage medium

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113253971A (en) * 2021-07-09 2021-08-13 广州小鹏汽车科技有限公司 Voice interaction method and device, voice interaction system, vehicle and medium
CN113253971B (en) * 2021-07-09 2021-10-12 广州小鹏汽车科技有限公司 Voice interaction method and device, voice interaction system, vehicle and medium
CN113548062A (en) * 2021-08-03 2021-10-26 奇瑞汽车股份有限公司 Interactive control method and device for automobile and computer storage medium
CN113548062B (en) * 2021-08-03 2022-12-30 奇瑞汽车股份有限公司 Interactive control method and device for automobile and computer storage medium
WO2023040109A1 (en) * 2021-09-15 2023-03-23 深圳创维-Rgb电子有限公司 Intelligent speech prompt method and device, and storage medium
CN113838464A (en) * 2021-09-24 2021-12-24 浪潮金融信息技术有限公司 Intelligent voice interaction system, method and medium
CN114356275A (en) * 2021-12-06 2022-04-15 上海小度技术有限公司 Interaction control method and device, intelligent voice equipment and storage medium
CN114356275B (en) * 2021-12-06 2023-12-29 上海小度技术有限公司 Interactive control method and device, intelligent voice equipment and storage medium
WO2023124957A1 (en) * 2021-12-28 2023-07-06 广州小鹏汽车科技有限公司 Voice interaction method and apparatus, and server and readable storage medium
WO2023230902A1 (en) * 2022-05-31 2023-12-07 西门子股份公司 Human-machine interaction method and apparatus, electronic device, and storage medium
CN117238322A (en) * 2023-11-10 2023-12-15 深圳市齐奥通信技术有限公司 Self-adaptive voice regulation and control method and system based on intelligent perception
CN117238322B (en) * 2023-11-10 2024-01-30 深圳市齐奥通信技术有限公司 Self-adaptive voice regulation and control method and system based on intelligent perception

Similar Documents

Publication Publication Date Title
CN112463106A (en) Voice interaction method, device and equipment based on intelligent screen and storage medium
CN107844586B (en) News recommendation method and device
US20220405986A1 (en) Virtual image generation method, device, terminal and storage medium
CN109145204B (en) Portrait label generation and use method and system
CN106098063B (en) Voice control method, terminal device and server
CN112307240B (en) Page display method and device, storage medium and electronic equipment
CN107040452B (en) Information processing method and device and computer readable storage medium
CN109086276B (en) Data translation method, device, terminal and storage medium
CN108305057B (en) Device and method for issuing electronic red packet and computer readable storage medium
CN110992937B (en) Language off-line identification method, terminal and readable storage medium
US20170169062A1 (en) Method and electronic device for recommending video
CN110321559B (en) Answer generation method, device and storage medium for natural language questions
CN109326284A (en) The method, apparatus and storage medium of phonetic search
CN111142993A (en) Information acquisition method, terminal and computer storage medium
CN113641767A (en) Entity relationship extraction method, device, equipment and storage medium
US20170171266A1 (en) Method and electronic device based on android platform for multimedia resource play
CN109727597A (en) The interaction householder method and device of voice messaging
US20170161322A1 (en) Method and electronic device for searching resource
CN116501948A (en) Searching method, searching device, electronic equipment and storage medium
US20170171330A1 (en) Method for pushing information and electronic device
CN115484040A (en) Voiceprint registration method of household appliance, terminal device, household appliance and medium
CN113413590A (en) Information verification method and device, computer equipment and storage medium
CN111580766A (en) Information display method and device and information display system
CN113707179A (en) Audio identification method, device, equipment and medium
CN115103237B (en) Video processing method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination