CN107943914A - Voice information processing method and device

Info

Publication number: CN107943914A
Application number: CN201711157169.8A
Authority: CN
Inventors: 张翔; 吴瑞红; 邓晗; 张刚; 石磊
Original assignee: Science And Technology (beijing) Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2017-11-20
Filing date: 2017-11-20
Publication date: 2018-04-20

Abstract

Embodiments of the present application disclose a voice information processing method and device. One embodiment of the method includes: parsing received first voice information input by a user to obtain a first instruction corresponding to the first voice information; performing a first operation corresponding to the first instruction; obtaining the result returned by performing the first operation, and generating and sending second voice information to the user, wherein the second voice information includes voice information corresponding to the returned result and voice prompt information for prompting the user to input whether to perform at least one second operation associated with the first operation; and, in response to receiving a response instruction input by the user that corresponds to the voice prompt information, performing the second operation. This embodiment can improve the efficiency of voice services and the user experience.

Description

Voice information processing method and device

Technical field

The present application relates to the field of computer technology, specifically to the field of artificial intelligence, and more particularly to a voice information processing method and device.

Background

With the development of artificial intelligence, intelligent voice assistants cover an ever wider range of uses. They bring people more and more convenience and offer a growing number of vertical functions, such as reminders, alarm clocks, and sports event queries.

Summary of the invention

Embodiments of the present application propose a voice information processing method and device.

In a first aspect, an embodiment of the present application provides a voice information processing method. The method includes: parsing received first voice information input by a user to obtain a first instruction corresponding to the first voice information; performing a first operation corresponding to the first instruction; obtaining the result returned by performing the first operation, and generating and sending second voice information to the user, wherein the second voice information includes voice information corresponding to the returned result and voice prompt information for prompting the user to input whether to perform at least one second operation associated with the first operation; and, in response to receiving a response instruction input by the user that corresponds to the voice prompt information, performing the second operation.

In a second aspect, an embodiment of the present application provides a voice information processing device. The device includes: a parsing unit, configured to parse received first voice information input by a user to obtain a first instruction corresponding to the first voice information; a first execution unit, configured to perform a first operation corresponding to the first instruction; a generation unit, configured to obtain the result returned by performing the first operation and to generate and send second voice information to the user, wherein the second voice information includes voice information corresponding to the returned result and voice prompt information for prompting the user to input whether to perform at least one second operation associated with the first operation; and a second execution unit, configured to perform the second operation in response to receiving a response instruction input by the user that corresponds to the voice prompt information.

In a third aspect, an embodiment of the present application provides a server. The server includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method described in the first aspect.

The voice information processing method and device provided by the embodiments of the present application parse the received first voice information input by the user to obtain the first instruction corresponding to the first voice information; then perform the first operation corresponding to the first instruction; then obtain the result returned by performing the first operation and generate and send second voice information to the user; and finally, in response to receiving a response instruction input by the user that corresponds to the voice prompt information, perform the second operation. In this way, the second operation corresponding to the user's demand is performed according to that demand and the result returned by the performed first operation. This can mitigate the isolation of information between the sub-function services of existing voice services, which helps improve voice service efficiency and the user experience.

Brief description of the drawings

Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:

Fig. 1 is a diagram of an exemplary system architecture to which the present application can be applied;

Fig. 2 is a schematic flowchart of one embodiment of the voice information processing method according to the present application;

Fig. 3 is a schematic flowchart of another embodiment of the voice information processing method according to the present application;

Fig. 4 is a schematic flowchart of yet another embodiment of the voice information processing method according to the present application;

Fig. 5 is a schematic structural diagram of one embodiment of the voice information processing device according to the present application;

Fig. 6 is a schematic structural diagram of a computer system suitable for implementing the server of the embodiments of the present application.

Detailed description of the embodiments

The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the related invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.

It should be noted that, where no conflict arises, the embodiments of the present application and the features in those embodiments may be combined with one another. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

Fig. 1 shows an exemplary system architecture 100 to which embodiments of the voice information processing method or voice information processing device of the present application can be applied.

As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber-optic cables.

A user may use the terminal devices 101, 102, 103 to interact with the server 105 over the network 104, for example to receive or send messages. Various client applications may be installed on the terminal devices 101, 102, 103, such as intelligent voice assistant applications, search applications, and instant messaging tools.

The terminal devices 101, 102, 103 may be various electronic devices, including but not limited to smartphones, tablet computers, laptop computers, desktop computers, smart speakers, and wearable devices.

The server 105 may be a server that provides various services, for example a background information processing server that supports the information output by the terminal devices 101, 102, 103. The background information processing server may analyze and otherwise process the received information and feed the processing result (for example, information obtained by running a related search based on the received information) back to the terminal devices 101, 102, 103.

It should be noted that the voice information processing method provided by the embodiments of the present application is generally performed by the server 105; accordingly, the voice information processing device is generally located in the server 105.

It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires.

With continued reference to Fig. 2, a flow 200 of one embodiment of the voice information processing method according to the present application is shown. The voice information processing method includes the following steps:

Step 201: parse the received first voice information input by the user to obtain a first instruction corresponding to the first voice information.

In this embodiment, the electronic device on which the voice information processing method runs (for example, the server shown in Fig. 1) may detect the first voice information input by the user and sent by the terminal device (for example, terminal devices 101, 102, 103 shown in Fig. 1).

In general, the terminal device may include an audio input/output module, and the user may input the first voice information through the audio input interface. The terminal device encodes the first voice information input by the user and then sends it to the electronic device.

After receiving the first voice information input by the user, the electronic device may parse it to extract the user's first instruction. Specifically, the electronic device may first decode the first voice information and then perform speech recognition on the decoded first voice information, converting it into the text information corresponding to the first voice information. After obtaining this text information, the electronic device performs semantic understanding on it to obtain the user's first semantic information, and determines the first instruction corresponding to the first voice information from the first semantic information.
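
The parsing chain of step 201 (decode, recognize, understand, determine the instruction) can be sketched as follows. This is only an illustrative sketch: the names `Instruction`, `decode_audio`, `recognize_speech`, `understand`, and `parse_first_voice_information` are hypothetical, and the decoding, speech recognition, and semantic understanding stages are replaced by trivial placeholders, since the application does not specify how they are implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Instruction:
    """A parsed user instruction: an intent name plus execution parameters."""
    intent: str
    params: dict = field(default_factory=dict)


def decode_audio(encoded: bytes) -> bytes:
    # Placeholder: a real system would undo the codec applied by the terminal.
    return encoded


def recognize_speech(audio: bytes) -> str:
    # Placeholder for speech recognition: the "audio" here is just UTF-8 text.
    return audio.decode("utf-8")


def understand(text: str) -> dict:
    # Placeholder for semantic understanding: a toy rule covering one intent.
    prefix = "query the songs of "
    if text.lower().startswith(prefix):
        return {"intent": "query_songs", "singer": text[len(prefix):]}
    return {"intent": "unknown"}


def parse_first_voice_information(encoded: bytes) -> Instruction:
    """Decode, recognize, understand, then build the first instruction."""
    text = recognize_speech(decode_audio(encoded))
    semantics = understand(text)
    intent = semantics.pop("intent")
    return Instruction(intent=intent, params=semantics)
```

Each placeholder stage can be swapped for a real component without changing the order of the chain.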

In some application scenarios, semantic understanding of the text information may be performed by inputting the text information into a pre-trained semantic recognition model, which generates the first semantic information. The electronic device may then determine the user's first instruction from the first semantic information. Here, the first instruction reflects the user's intention, that is, the user's demand.

In some optional implementations of this embodiment, after the text information corresponding to the first voice information is obtained, keywords may be extracted from the text information, and the first instruction corresponding to the first voice information may be determined from those keywords. For example, the keyword corresponding to the user demand "remind me of the time of a certain activity" may be "remind".
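
The keyword-based variant can be illustrated with a simple lookup. The keyword table and the instruction names below are assumptions made for the example and do not come from the application.

```python
from typing import Optional

# Hypothetical keyword table: keyword -> instruction name.
KEYWORD_TO_INSTRUCTION = {
    "remind": "create_reminder",
    "alarm": "set_alarm",
    "concert": "query_concert",
}


def instruction_from_keywords(text: str) -> Optional[str]:
    """Return the instruction of the first table keyword found in the text."""
    lowered = text.lower()
    for keyword, instruction in KEYWORD_TO_INSTRUCTION.items():
        if keyword in lowered:
            return instruction
    return None  # No known keyword: the instruction cannot be determined.
```

For instance, `instruction_from_keywords("Remind me of the activity time")` yields `"create_reminder"` under this table.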

Step 202: perform the first operation corresponding to the first instruction.

Multiple operations may be preset in the electronic device. Each operation may correspond to one functional service and may be triggered by its corresponding instruction; that is, each operation may correspond to one instruction.

After the first instruction corresponding to the user's first voice information has been obtained by parsing in step 201, the electronic device may be triggered to perform the first operation corresponding to the first instruction.

In some application scenarios, the first instruction may be, for example, "query the recent concerts of a certain singer". The electronic device may then perform the first operation according to the first instruction: querying the Internet or specialized websites for all of that singer's recent concerts.

In some optional implementations of this embodiment, the first instruction may include at least one first execution parameter. An execution parameter here may be, for example, a person, a time, or a place. For example, if the semantic information corresponding to the first voice information input by the user is "query the song BB by singer AA", the execution parameters may include singer AA and song BB.
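
One way to picture the one-to-one correspondence between instructions and preset operations, together with the execution parameters, is a dispatch table. The sketch below is an assumption for illustration: `SONGS`, `query_song`, `OPERATIONS`, and `execute` are hypothetical names, and the in-memory dictionary merely stands in for whatever data source a real service would query.

```python
from typing import Callable

# Hypothetical in-memory catalogue standing in for a song database.
SONGS = {("AA", "BB"): "BB by AA"}


def query_song(singer: str, song: str) -> list[str]:
    """First operation: look up a song by singer and title."""
    hit = SONGS.get((singer, song))
    return [hit] if hit else []


# Each instruction name maps to exactly one preset operation.
OPERATIONS: dict[str, Callable[..., list[str]]] = {"query_song": query_song}


def execute(instruction: str, **params) -> list[str]:
    """Trigger the operation that corresponds to the instruction,
    passing the execution parameters as keyword arguments."""
    operation = OPERATIONS[instruction]
    return operation(**params)
```

Here `execute("query_song", singer="AA", song="BB")` passes the two execution parameters from the example above to the first operation.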

Step 203: obtain the result returned by performing the first operation, and generate and send second voice information to the user.

In this embodiment, the electronic device on which the voice information processing method runs may obtain the result returned by performing the first operation. Specifically, from among the multiple pieces of information returned by the first operation, the piece that best matches the text information of the first voice information may be selected as the result returned by performing the first operation.
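
The application does not say how the matching degree is computed. As a hedged sketch, the example below assumes a Jaccard overlap between word sets as the matching degree; `matching_degree` and `select_result` are hypothetical names, and any other similarity measure could be substituted.

```python
from typing import Optional


def matching_degree(query: str, candidate: str) -> float:
    """Jaccard overlap between the word sets of the query and a candidate
    (an assumed stand-in for the unspecified matching degree)."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)


def select_result(query_text: str, returned: list[str]) -> Optional[str]:
    """Pick the returned item that best matches the recognized text."""
    if not returned:
        return None
    return max(returned, key=lambda item: matching_degree(query_text, item))
```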

In this embodiment, the electronic device may generate the second voice information from the result returned by the first operation; that is, the second voice information includes the result returned by performing the first operation. The electronic device encodes the second voice information and sends it to the terminal device used by the user. After receiving the encoded second voice information, the terminal device may decode it and play it to the user through the audio input/output module. The second voice information may include voice information corresponding to the returned result.

Further, the second voice information that the electronic device sends to the user may also include voice prompt information that prompts the user to input whether to perform at least one second operation associated with the first operation.
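
The content of the second voice information (the returned result followed by a prompt for each associated second operation) can be sketched as plain text before speech synthesis. The function name and the prompt wording are assumptions; speech synthesis and encoding are outside the scope of the sketch.

```python
def build_second_voice_text(result: str, second_operations: list[str]) -> str:
    """Compose the text to be spoken: the result, then one prompt per
    associated second operation (wording is illustrative only)."""
    prompts = " ".join(f"Would you like me to {op}?" for op in second_operations)
    return f"{result} {prompts}".strip()
```

With no associated second operation, the text reduces to the returned result alone.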

In this embodiment, the association relationships between the operations may be preset in the electronic device before the received first voice information input by the user is parsed to obtain the corresponding first instruction. After the electronic device determines the first operation from the first instruction, it may determine the at least one second operation associated with the first operation according to these association relationships.

In some optional implementations of this embodiment, before parsing the received first voice information input by the user to obtain the corresponding first instruction, the electronic device may set the association relationship between the first operation and at least one second operation among the multiple operations based on statistical analysis of historical data on the adjacent operation behavior (for example, two consecutive operations) of a large number of users. For example, if the historical operation behavior data of a large number of users shows that more than 70% of users, after performing the first operation "query the concert of a certain singer on a certain day", performed the second operation "create an alarm reminder to watch that singer's concert on that day", the electronic device may set an association relationship between that first operation and that second operation. In this way, the electronic device may associate the first operation with at least one second operation.
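
The statistical rule in this example can be sketched as counting, per user, which operations were performed immediately after which others, and keeping the pairs whose share of users exceeds the threshold. The function name, the input layout (user identifier mapped to an ordered operation list), and the choice of denominator (all users who performed the first operation at least once) are assumptions for illustration.

```python
from collections import Counter, defaultdict


def mine_associations(
    histories: dict[str, list[str]], threshold: float = 0.7
) -> dict[str, set[str]]:
    """Associate operation A with operation B when more than `threshold`
    of the users who performed A performed B immediately afterwards."""
    users_with_op = Counter()
    users_with_pair = Counter()
    for ops in histories.values():
        users_with_op.update(set(ops))                   # count each user once per op
        users_with_pair.update(set(zip(ops, ops[1:])))   # adjacent (A, B) pairs
    associations = defaultdict(set)
    for (first, second), n_users in users_with_pair.items():
        if first != second and n_users / users_with_op[first] > threshold:
            associations[first].add(second)
    return dict(associations)
```

In a history where three of four users set an alarm right after querying a concert, the share is 75%, which exceeds the 70% threshold, so the two operations are associated.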

Step 204: in response to receiving a response instruction input by the user that corresponds to the voice prompt information, perform the second operation.

In this embodiment, after receiving the response instruction input by the user that corresponds to the voice prompt information in the second voice information, the electronic device may perform the at least one second operation associated with the first operation. Here, the response instruction refers to voice information indicating confirmation that the second operation should be performed.

In some optional implementations of this embodiment, after receiving the response instruction input by the user that corresponds to the voice prompt information, the electronic device may first generate at least one second execution parameter from the result returned by performing the first operation; a second execution parameter here may be, for example, a "time", a "place", or a "song title". The electronic device then performs the second operation according to the at least one second execution parameter.
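
A minimal sketch of this step follows, assuming the first operation returns a dictionary of fields and the response instruction has already been converted to text. The set of affirmative words and the names `is_affirmative`, `second_params_from_result`, and `handle_response` are hypothetical.

```python
from typing import Callable, Optional

# Assumed affirmative responses (already recognized as text).
AFFIRMATIVE = {"yes", "right", "ok"}


def is_affirmative(response_text: str) -> bool:
    """True if the response confirms that the second operation should run."""
    return response_text.strip().lower().rstrip(".!") in AFFIRMATIVE


def second_params_from_result(result: dict, wanted: tuple) -> dict:
    """Derive the second execution parameters from the first result."""
    return {key: result[key] for key in wanted if key in result}


def handle_response(
    response_text: str, first_result: dict, second_operation: Callable, wanted: tuple
) -> Optional[object]:
    """Run the second operation only after an affirmative response."""
    if not is_affirmative(response_text):
        return None
    return second_operation(**second_params_from_result(first_result, wanted))
```

The `wanted` tuple names the fields of the first result that the second operation needs, such as `("time", "place")` for an alarm reminder.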

In some application scenarios, a user wishes to watch the CC sports event but does not know when or where it will be held. The user may input the first voice information "query the time and place of the CC sports event" through the terminal device being used. The terminal device may encode the first voice information and send it to the electronic device, which decodes the encoded first voice information after receiving it. The electronic device then parses the first voice information to obtain the first instruction: query the time and place of the CC sports event. Here, "CC sports event" can be understood as an execution parameter. The electronic device may perform the first operation according to the first instruction: querying a local database for the time and place of the CC sports event, or searching the Internet for information about its time and place. The electronic device may determine the at least one second operation associated with the first operation according to the association relationships preset between the multiple operations configured in the electronic device. The second operation here may be, for example, "set an alarm reminder for watching the CC sports event" and/or "buy a train ticket to the city hosting the CC sports event". From among the multiple pieces of information returned by the first operation, the electronic device may select the information that contains the time and venue of the CC sports event as the returned result, and may generate the second voice information from that result. Here, the second voice information may include the specific time of the CC sports event ("year XX, month XX, day XX, XX o'clock"), the venue ("XX Stadium, XX City, XX Province"), and voice prompt information asking whether to perform the at least one second operation associated with the first operation. After receiving the second voice information and hearing the voice prompt information, the user may input a response instruction as prompted. The response instruction here may be, for example, "yes", "right", or "OK", indicating confirmation that the second operation should be performed. After receiving the user's response instruction, the electronic device performs the at least one second operation, for example "set an alarm reminder for watching the CC sports event" and/or "buy a train ticket to the city hosting the CC sports event".

In some application scenarios, the user may input the first voice information "query the songs of singer EE" through the terminal device being used. The terminal device may encode the first voice information and send it to the electronic device, which decodes the received first voice information. The electronic device then parses the first voice information to obtain the first instruction: query the songs of singer EE. Here, singer EE may be an execution parameter. The electronic device may perform the first operation according to the first instruction: querying a local database for information about singer EE's songs, or searching the Internet for that information. The electronic device may determine the at least one second operation associated with the first operation according to the preset association relationships between the operations; the second operation here may be, for example, a "play song" operation. From among the multiple pieces of information returned by the first operation, the electronic device may select singer EE's songs whose play count exceeds a certain threshold as the returned result, and generate the second voice information from that result. Here, the second voice information may include a list of singer EE's songs and prompt information asking whether to perform the second operation associated with the first operation (playing the songs in the list). After receiving the second voice information and hearing the voice prompt information, the user may input a response instruction as prompted; the response instruction here may be, for example, "yes", "right", or "OK", indicating confirmation that the second operation should be performed. After receiving the user's response instruction, the electronic device performs the second operation.

The method provided by the above embodiment of the present application performs the second operation corresponding to the user's demand according to that demand and the result returned by the performed first operation. This can mitigate the isolation of information between the sub-function services of existing voice services, which helps improve voice service efficiency and the user experience.

With continued reference to Fig. 3, a schematic flowchart 300 of another embodiment of the voice information processing method according to the present application is shown. The voice information processing method includes the following steps:

Step 301: in response to receiving an association instruction from the user, associate the first operation with at least one second operation.

In this embodiment, the user may associate the first operation with at least one second operation by sending an association instruction to the electronic device through the user terminal. The association instruction may include, for example, identification information of the first operation and identification information of the at least one second operation to be associated with the first operation.

Here, according to his or her own needs, the user may send the electronic device an association instruction that associates a first operation relevant to those needs with at least one second operation relevant to those needs.

After receiving the user's association instruction, the electronic device may associate the first operation with the at least one second operation through configuration. The configuration here may be, for example, enabling information transfer between the carrier of the first operation (for example, the computer program that implements the first operation) and the carrier of the second operation (for example, the computer program that implements the second operation).

It is worth noting that step 301 is not limited to being performed before step 302 of this embodiment. In some application scenarios, step 301 may instead be performed between step 302 and step 303 of this embodiment; in other application scenarios, step 301 may be performed between step 303 and step 304 of this embodiment.

For example, after inputting the first voice information that queries the CC sports event, the user may input by voice an association instruction that associates the sports event query operation with the alarm setting operation; or an association instruction that associates the sports event query operation with the train ticket purchase operation; or association instructions that associate the sports event query operation with both the alarm setting operation and the train ticket purchase operation.
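
The user-driven association of step 301 can be sketched as a table keyed by the identifier of the first operation. The class name `AssociationTable`, its methods, and the operation identifiers are assumptions for illustration; the sketch records only the association itself, not the information transfer between the programs that implement the operations.

```python
from collections import defaultdict


class AssociationTable:
    """Stores, per first operation, the second operations associated with it."""

    def __init__(self) -> None:
        self._table = defaultdict(list)

    def apply_association_instruction(
        self, first_op_id: str, second_op_ids: list[str]
    ) -> None:
        """Record the pairs carried by one association instruction,
        skipping second operations that are already associated."""
        for op_id in second_op_ids:
            if op_id not in self._table[first_op_id]:
                self._table[first_op_id].append(op_id)

    def second_operations(self, first_op_id: str) -> list[str]:
        """Return the second operations associated with a first operation."""
        return list(self._table.get(first_op_id, []))
```

Applying one instruction for the alarm operation and a later one for the train ticket operation leaves both associated with the sports event query, matching the third case in the example above.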

Step 302: parse the received first voice information input by the user to obtain a first instruction corresponding to the first voice information.

In this embodiment, the electronic device on which the voice information processing method runs (for example, the server shown in Fig. 1) may detect the user's first voice information sent by the terminal device (for example, terminal devices 101, 102, 103 shown in Fig. 1).

After receiving the first voice information input by the user, the electronic device may parse it to extract the user's first instruction.

Step 303: perform the first operation corresponding to the first instruction.

After the first instruction corresponding to the user's first voice information has been obtained by parsing in step 302, the electronic device may be triggered to perform the first operation corresponding to the first instruction.

Step 304: obtain the result returned by performing the first operation, and generate and send second voice information to the user.

In this embodiment, the electronic device may generate the second voice information from the result returned by the first operation, encode the second voice information, and send it to the terminal device used by the user. The second voice information includes the result returned by performing the first operation and voice prompt information that prompts the user to input whether to perform at least one second operation associated with the first operation.

Step 305: in response to receiving a response instruction input by the user that corresponds to the voice prompt information, perform the second operation.

In this embodiment, after receiving the response instruction input by the user that corresponds to the voice prompt information in the second voice information, the electronic device may perform the at least one second operation associated with the first operation. Here, the response instruction may refer to voice information indicating confirmation that the second operation should be performed.

As can be seen from Fig. 3, compared with the embodiment corresponding to Fig. 2, the flow 300 of the voice information processing method in this embodiment highlights the step of associating the first operation with at least one second operation according to the user's instruction. The scheme described in this embodiment can therefore associate the first operation with the second operation as the user directs, providing the user with a more efficient voice service.

With continued reference to Fig. 4, a schematic flowchart 400 of yet another embodiment of the voice information processing method according to the present application is shown. The voice information processing method includes the following steps:

Step 401: parse the received first voice information input by the user to obtain a first instruction corresponding to the first voice information.

In this embodiment, the electronic device on which the voice information processing method runs may detect the user's first voice information sent by the terminal device, and parse the first voice information to extract the user's first instruction.

Step 402: perform the first operation corresponding to the first instruction.

After the first instruction corresponding to the user's first voice information has been obtained by parsing in step 401, the electronic device may be triggered to perform the first operation corresponding to the first instruction.

Step 403: obtain at least one second operation associated with the first operation based on a pre-trained neural network model.

In this embodiment, a pre-trained neural network model may be provided on the electronic device. The electronic device may obtain the at least one second operation associated with the first operation according to this neural network model.

The neural network model may be a model that is put into use after being trained on a large number of samples. The samples may be manually labeled, for example samples labeled with a third operation known to have been performed by multiple users and the fourth operation that those users performed after completing the third operation. The third operation here is not one specific operation and may be any one of the multiple operations; likewise, the fourth operation is not one specific operation and may be any one of the multiple operations. During training, the parameters of the neural network model are adjusted continually so that its predicted values converge toward the labeled values.
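
The application does not describe the network architecture. Purely as an illustration of training on labeled (third operation, fourth operation) pairs, the sketch below substitutes a single-layer softmax model for the neural network and trains it with stochastic gradient descent on the cross-entropy loss. All names, the learning rate, and the number of epochs are assumptions.

```python
import math
import random


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(samples, ops, epochs=200, lr=0.1, seed=0):
    """Fit weights[i][j], the score of operation j following operation i,
    from labeled (third_op, fourth_op) pairs."""
    rng = random.Random(seed)
    index = {op: i for i, op in enumerate(ops)}
    n = len(ops)
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(n)] for _ in range(n)]
    for _ in range(epochs):
        for third_op, fourth_op in samples:
            i, target = index[third_op], index[fourth_op]
            probs = softmax(weights[i])
            for j in range(n):
                # Cross-entropy gradient with respect to the score of op j.
                grad = probs[j] - (1.0 if j == target else 0.0)
                weights[i][j] -= lr * grad
    return weights, index


def predict_second_operations(weights, index, first_op, top_k=1):
    """Return the top_k operations most likely to follow first_op."""
    probs = softmax(weights[index[first_op]])
    ranked = sorted(index, key=lambda op: probs[index[op]], reverse=True)
    return ranked[:top_k]
```

Because the input is a one-hot encoding of the preceding operation, each row of `weights` acts as the logits for one preceding operation; a deeper network would replace this row lookup with learned hidden layers.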

Step 404, the first operation and at least one second operation are associated.

Above-mentioned electronics can be in step 401 arrive with first operation it is associated it is at least one second operation after, it is right Above-mentioned first operation is associated configuration with least one second operation.Above-mentioned associated configuration for example can be to set to realize first Into row information transmission between the function module (software module or hardware module) of operation and the function module for realizing the second operation.

Step 405, obtain and perform that the first operation returns as a result, generating and sending the second voice messaging to user.

Step 406, instructed in response to receiving response corresponding with information of voice prompt input by user, perform the second behaviour Make.

In the present embodiment, above-mentioned electronic equipment can be in input by user and above-mentioned second voice messaging be received After the corresponding response instruction of information of voice prompt, associated at least one second operation of above-mentioned and the first operation can be performed. Herein, response instruction refers to for characterizing the voice messaging for determining to perform the second instruction.

As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the voice information processing method in the present embodiment highlights the steps of obtaining, based on the neural network model, at least one second operation associated with the first operation, and of associating the first operation with the at least one second operation. The scheme described in this embodiment can therefore automatically associate the first operation with the predicted at least one second operation and perform the second operation according to the user's response, thereby providing the user with a richer voice service.

With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a voice information processing device. This device embodiment corresponds to the method embodiment shown in Fig. 2, and the device may be applied in various electronic devices.

As shown in Fig. 5, the voice information processing device 500 of the present embodiment includes a parsing unit 501, a first execution unit 502, a generation unit 503 and a second execution unit 504. The parsing unit 501 is configured to parse the received first voice information input by the user to obtain a first instruction corresponding to the first voice information. The first execution unit 502 is configured to perform a first operation corresponding to the first instruction. The generation unit 503 is configured to obtain the result returned by performing the first operation, and to generate and send second voice information to the user, where the second voice information includes voice information corresponding to the returned result and voice prompt information for prompting the user to input whether to perform at least one second operation associated with the first operation. The second execution unit 504 is configured to perform the second operation in response to receiving a response instruction, input by the user, corresponding to the voice prompt information.

In the present embodiment, for the specific processing of the parsing unit 501, the first execution unit 502, the generation unit 503 and the second execution unit 504 of the voice information processing device 500, and for the technical effects they produce, reference may be made to the descriptions of step 201, step 202, step 203 and step 204 in the embodiment corresponding to Fig. 2, respectively; details are not repeated here.
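The division of device 500 into four units can be sketched as one class with one method per unit. This is an illustrative toy under stated assumptions, not the claimed device: speech recognition and synthesis are replaced by plain text, and the class name, the lookup tables and the sample strings are all hypothetical.

```python
class VoiceInformationProcessingDevice:
    """Toy counterpart of device 500; every name here is hypothetical."""

    def __init__(self, instruction_table, operations, associations):
        self.instruction_table = instruction_table  # recognized text -> instruction
        self.operations = operations                # operation name -> callable
        self.associations = associations            # first operation -> second operations

    def parse(self, first_voice_text):
        """Unit 501: map the recognized text to a first instruction."""
        return self.instruction_table[first_voice_text]

    def execute_first(self, instruction):
        """Unit 502: perform the first operation."""
        return self.operations[instruction]()

    def generate(self, instruction, result):
        """Unit 503: build the second voice information as text."""
        follow_ups = self.associations.get(instruction, [])
        prompt = ""
        if follow_ups:
            prompt = " Do you also want to {}?".format(" and ".join(follow_ups))
        return str(result) + "." + prompt, follow_ups

    def execute_second(self, confirmed, follow_ups):
        """Unit 504: perform the second operations after confirmation."""
        if not confirmed:
            return []
        return [self.operations[name]() for name in follow_ups]


device = VoiceInformationProcessingDevice(
    instruction_table={"when is the next match": "query_match"},
    operations={
        "query_match": lambda: "The next match starts at 20:00",
        "set_reminder": lambda: "Reminder set",
    },
    associations={"query_match": ["set_reminder"]},
)
instruction = device.parse("when is the next match")
result = device.execute_first(instruction)
speech, follow_ups = device.generate(instruction, result)
outcome = device.execute_second(True, follow_ups)
```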

In some optional implementations of the present embodiment, the voice information processing device 500 further includes a configuration unit (not shown). The configuration unit may determine the at least one second operation associated with the first operation as follows: before the parsing unit parses the received first voice information input by the user, in response to receiving an association instruction from the user, associate the first operation with at least one second operation, where the association instruction includes identification information of the first operation and identification information of the at least one second operation to be associated with the first operation.
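A user-supplied association instruction of this kind can be sketched as a small record carrying the two sets of identifiers, applied to a registry. The names `AssociationInstruction` and `AssociationRegistry` and the identifier strings are assumptions made for illustration; the source only states which identification information the instruction contains.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AssociationInstruction:
    """Hypothetical shape of a user's association instruction."""
    first_operation_id: str
    second_operation_ids: List[str]


@dataclass
class AssociationRegistry:
    """Stores which second operations are associated with each first operation."""
    links: Dict[str, List[str]] = field(default_factory=dict)

    def apply(self, instruction: AssociationInstruction) -> None:
        """Associate the first operation with each listed second operation."""
        linked = self.links.setdefault(instruction.first_operation_id, [])
        for operation_id in instruction.second_operation_ids:
            if operation_id not in linked:  # ignore repeated associations
                linked.append(operation_id)

    def second_operations(self, first_operation_id: str) -> List[str]:
        """Return a copy of the second operations linked to a first operation."""
        return list(self.links.get(first_operation_id, []))
```

Because the registry is filled before any first voice information is parsed, the generation step can later look up the associated second operations by the identifier of the first operation.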

In some optional implementations of the present embodiment, the voice information processing device 500 further includes a configuration unit (not shown). The configuration unit may associate the first operation with at least one second operation as follows: determine, based on a pre-trained neural network model, at least one second operation associated with the first operation; and associate the first operation with the at least one second operation.

In some optional implementations of the present embodiment, the voice information processing device 500 further includes a configuration unit (not shown). The configuration unit may associate the first operation with at least one second operation as follows: determine, based on neighboring operation behavior data of a large number of users, at least one second operation associated with the first operation; and associate the first operation with the at least one second operation.
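The source does not say how neighboring operation behavior data is evaluated. One plausible reading, sketched below as an assumption, is to count how often one operation directly follows another across many users' operation logs and to keep the followers that occur often enough. The function names, the `min_count` threshold and the sample logs are hypothetical.

```python
from collections import Counter, defaultdict


def count_neighboring_operations(user_logs):
    """Count, for each operation, which operation directly follows it."""
    follow_counts = defaultdict(Counter)
    for operations in user_logs:
        for current, following in zip(operations, operations[1:]):
            follow_counts[current][following] += 1
    return follow_counts


def associated_second_operations(follow_counts, first_operation, min_count=2):
    """Return followers seen at least min_count times, most frequent first."""
    ranked = follow_counts.get(first_operation, Counter()).most_common()
    return [name for name, count in ranked if count >= min_count]


# Hypothetical per-user operation sequences.
logs = [
    ["query_match", "set_reminder"],
    ["query_match", "set_reminder", "query_weather"],
    ["query_match", "query_weather"],
]
counts = count_neighboring_operations(logs)
```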

In some optional implementations of the present embodiment, the first instruction obtained by the parsing unit 501 may include at least one first execution parameter, and the first execution unit 502 may be further configured to perform the first operation corresponding to the first instruction according to the at least one first execution parameter.

In some optional implementations of the present embodiment, the second execution unit 504 is further configured to: in response to receiving a response instruction, input by the user, corresponding to the voice prompt information, generate at least one second execution parameter according to the returned result; and perform the second operation according to the at least one second execution parameter.
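As an illustrative sketch of deriving second execution parameters from the returned result, suppose the first operation is a match query and the second operation sets a reminder. The result fields (`home`, `away`, `kickoff`) and both function names are hypothetical; the source does not define the structure of the returned result.

```python
def build_second_parameters(first_result):
    """Derive second execution parameters from the first operation's result."""
    return {
        "title": "{} vs {}".format(first_result["home"], first_result["away"]),
        "remind_at": first_result["kickoff"],
    }


def set_reminder(title, remind_at):
    # Hypothetical second operation; a real one would call a reminder service.
    return "Reminder '{}' set for {}".format(title, remind_at)


def on_response_instruction(first_result):
    """Generate the parameters, then perform the second operation with them."""
    parameters = build_second_parameters(first_result)
    return set_reminder(**parameters)
```

The user therefore does not have to repeat details such as the time of the match; they are carried over from the result of the first operation.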

In some optional implementations of the present embodiment, the parsing unit 501 is further configured to: recognize the first voice information to obtain text information corresponding to the first voice information; extract keywords from the text information; and determine, according to the keywords, the first instruction corresponding to the first voice information.
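The three parsing steps above (recognition, keyword extraction, instruction lookup) can be sketched as follows. Speech recognition is replaced by a placeholder that assumes the audio has already been transcribed, and the keyword table, instruction names and matching rule are hypothetical, since the source does not specify how keywords map to instructions.

```python
import re

# Hypothetical keyword sets for each instruction.
INSTRUCTION_KEYWORDS = {
    "query_match": {"match", "game", "fixture"},
    "set_alarm": {"alarm", "wake"},
}


def recognize(first_voice_information):
    """Placeholder recognizer: assumes the input is already a transcript."""
    return first_voice_information


def extract_keywords(text):
    """Keep only the words that appear in some instruction's keyword set."""
    words = re.findall(r"[a-z]+", text.lower())
    vocabulary = set().union(*INSTRUCTION_KEYWORDS.values())
    return [word for word in words if word in vocabulary]


def determine_instruction(keywords):
    """Pick the instruction whose keyword set matches the most keywords."""
    best, best_hits = None, 0
    for instruction, vocabulary in INSTRUCTION_KEYWORDS.items():
        hits = sum(1 for word in keywords if word in vocabulary)
        if hits > best_hits:
            best, best_hits = instruction, hits
    return best


def parse_first_voice_information(first_voice_information):
    """Run recognition, keyword extraction and instruction lookup in order."""
    text = recognize(first_voice_information)
    return determine_instruction(extract_keywords(text))
```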

Referring now to Fig. 6, it shows a schematic structural diagram of a computer system 600 of a server suitable for implementing the embodiments of the present application. The server shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.

As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 606 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602 and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

The following components are connected to the I/O interface 605: a storage portion 606 including a hard disk and the like; and a communication portion 607 including a network interface card such as a LAN (Local Area Network) card or a modem. The communication portion 607 performs communication processing via a network such as the Internet. A driver 608 is also connected to the I/O interface 605 as needed. A removable medium 609, such as a magnetic disk, an optical disc, a magneto-optical disk or a semiconductor memory, is mounted on the driver 608 as needed, so that a computer program read from it can be installed into the storage portion 606 as needed.

In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 607, and/or installed from the removable medium 609. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above.
More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF and the like, or any suitable combination of the above.

The flowcharts and block diagrams in the drawings illustrate the architecture, functions and operations of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and any combination of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, this may be described as: a processor including a parsing unit, a first execution unit, a generation unit and a second execution unit. The names of these units do not, in some cases, limit the units themselves; for example, the parsing unit may also be described as "a unit that parses the received first voice information input by the user to obtain a first instruction corresponding to the first voice information".

In another aspect, the present application also provides a computer-readable medium. The computer-readable medium may be included in the device described in the above embodiments, or it may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: parse the received first voice information input by the user to obtain a first instruction corresponding to the first voice information; perform a first operation corresponding to the first instruction; obtain the result returned by performing the first operation, and generate and send second voice information to the user, where the second voice information includes voice information corresponding to the returned result and voice prompt information for prompting the user to input whether to perform at least one second operation associated with the first operation; and, in response to receiving a response instruction, input by the user, corresponding to the voice prompt information, perform the second operation.

The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features. It should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present application.

Claims

1. A voice information processing method, comprising:

parsing received first voice information input by a user to obtain a first instruction corresponding to the first voice information;

performing a first operation corresponding to the first instruction;

obtaining a result returned by performing the first operation, and generating and sending second voice information to the user, wherein the second voice information includes voice information corresponding to the returned result and voice prompt information for prompting the user to input whether to perform at least one second operation associated with the first operation;

in response to receiving a response instruction, input by the user, corresponding to the voice prompt information, performing the second operation.

2. The method according to claim 1, wherein the method further comprises:

in response to receiving an association instruction from the user, associating the first operation with at least one second operation, wherein the association instruction includes identification information of the first operation and identification information of the at least one second operation to be associated with the first operation.

3. The method according to claim 1, wherein the method further comprises:

determining, based on a pre-trained neural network model, at least one second operation associated with the first operation;

associating the first operation with the at least one second operation.

4. The method according to claim 1, wherein the method further comprises:

determining, based on neighboring operation behavior data of a large number of users, at least one second operation associated with the first operation;

associating the first operation with the at least one second operation.

5. The method according to any one of claims 1-4, wherein the first instruction includes at least one first execution parameter; and

the performing a first operation corresponding to the first instruction comprises:

performing the first operation corresponding to the first instruction according to the at least one first execution parameter.

6. The method according to any one of claims 1-4, wherein the performing the second operation in response to receiving a response instruction, input by the user, corresponding to the voice prompt information comprises:

in response to receiving the response instruction, input by the user, corresponding to the voice prompt information, generating at least one second execution parameter according to the returned result;

performing the second operation according to the at least one second execution parameter.

7. The method according to any one of claims 1-4, wherein the parsing received first voice information input by a user to obtain a first instruction corresponding to the first voice information comprises:

recognizing the first voice information to obtain text information corresponding to the first voice information;

extracting keywords from the text information;

determining, according to the keywords, the first instruction corresponding to the first voice information.

8. A voice information processing device, comprising:

a parsing unit, configured to parse received first voice information input by a user to obtain a first instruction corresponding to the first voice information;

a first execution unit, configured to perform a first operation corresponding to the first instruction;

a generation unit, configured to obtain a result returned by performing the first operation, and to generate and send second voice information to the user, wherein the second voice information includes voice information corresponding to the returned result and voice prompt information for prompting the user to input whether to perform at least one second operation associated with the first operation;

a second execution unit, configured to perform the second operation in response to receiving a response instruction, input by the user, corresponding to the voice prompt information.

9. The device according to claim 8, wherein the device further comprises a configuration unit, the configuration unit being configured to:

10. The device according to claim 8, wherein the device further comprises a configuration unit, the configuration unit being configured to:

determine, based on a pre-trained neural network model, at least one second operation associated with the first operation; and associate the first operation with the at least one second operation.

11. The device according to claim 8, wherein the device further comprises a configuration unit, the configuration unit being configured to:

associate the first operation with the at least one second operation.

12. The device according to any one of claims 8-11, wherein the first instruction includes at least one first execution parameter; and

the first execution unit is further configured to:

13. The device according to any one of claims 8-11, wherein the second execution unit is further configured to:

14. The device according to any one of claims 8-11, wherein the parsing unit is further configured to:

extract keywords from the text information;

15. A server, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-7.

16. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-7.