CN112802454A - Method and device for recommending awakening words, terminal equipment and storage medium - Google Patents

Method and device for recommending awakening words, terminal equipment and storage medium

Info

Publication number
CN112802454A
Authority
CN
China
Prior art keywords
score
alternative
keyword
behavior
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011633865.3A
Other languages
Chinese (zh)
Other versions
CN112802454B (en)
Inventor
曹金磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Volkswagen Mobvoi Beijing Information Technology Co Ltd
Original Assignee
Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority to CN202011633865.3A
Publication of CN112802454A
Application granted
Publication of CN112802454B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention disclose a method and device for recommending a wake-up word, a terminal device, and a storage medium. The method comprises: acquiring network data of a target user within a preset time period; acquiring candidate keywords in each content display page, and obtaining a text importance score for each candidate keyword from its term frequency and inverse document frequency; determining a behavior type score for each candidate keyword in at least one content display page according to the user behavior type corresponding to that page; determining a recommendation score for each candidate keyword from its text importance score and behavior type score; and determining, according to the recommendation scores, a recommended wake-up word matched with the target user and presenting it to the target user. The technical solution disclosed in the embodiments pushes personalized wake-up words to different users according to each user's actual points of attention and interest, improving the user's human-computer interaction experience.

Description

Method and device for recommending awakening words, terminal equipment and storage medium
Technical Field
Embodiments of the invention relate to the field of voice interaction, and in particular to a method and device for recommending a wake-up word, a terminal device, and a storage medium.
Background
With the continuous progress of science and technology, speech recognition technology has developed rapidly, providing technical support for voice interaction between users and intelligent terminal devices.
Human-computer interaction between a user and an intelligent terminal device closely resembles interaction between people and comprises several stages, such as wake-up, response, input, understanding, and feedback. Wake-up is the first point of contact in every interaction between the user and the terminal device, so the wake-up experience matters throughout the voice interaction process, and its quality directly shapes the user's first impression of the product.
In the prior art, a fixed phrase is usually used as the wake-up word, for example "question and answer start", or several fixed phrases, for example "question and answer start", "answer please", and "ask me", are recommended to the user, who then selects one as the wake-up word. Such a recommendation mode cannot recommend a matching wake-up word according to the user's actual needs and cannot provide personalized push for different users, so the user's human-computer interaction experience is poor.
Disclosure of Invention
Embodiments of the invention provide a method and device for recommending a wake-up word, a terminal device, and a storage medium, so as to recommend personalized voice wake-up words to a user.
In a first aspect, an embodiment of the present invention provides a method for recommending a wake-up word, including:
acquiring network data of a target user within a preset time period, wherein the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page;
acquiring candidate keywords in each content display page, and obtaining a text importance score of each candidate keyword according to term frequency and inverse document frequency;
determining a behavior type score of each candidate keyword in the at least one content display page according to the user behavior type corresponding to the at least one content display page;
determining a recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword;
and determining a recommended wake-up word matched with the target user according to the recommendation score of each candidate keyword, and presenting the recommended wake-up word to the target user.
In a second aspect, an embodiment of the present invention provides a device for recommending a wake-up word, including:
a network data acquisition module, configured to acquire network data of a target user within a preset time period, wherein the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page;
a text importance score acquisition module, configured to acquire candidate keywords in each content display page and obtain a text importance score of each candidate keyword according to term frequency and inverse document frequency;
a behavior type score acquisition module, configured to determine a behavior type score of each candidate keyword in the at least one content display page according to the user behavior type corresponding to the at least one content display page;
a recommendation score acquisition module, configured to determine a recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword;
and a wake-up word presentation module, configured to determine a recommended wake-up word matched with the target user according to the recommendation score of each candidate keyword and present the recommended wake-up word to the target user.
In a third aspect, an embodiment of the present invention provides a terminal device, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for recommending a wake-up word according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method for recommending a wake-up word according to any embodiment of the present invention.
According to the technical solution disclosed in the embodiments of the invention, candidate keywords are obtained from each content display page in the user's network data over a past period of time, a text importance score is calculated for each candidate keyword, a recommendation score is obtained by combining it with the behavior type score of the candidate keyword, and the recommended wake-up word presented to the user is then determined from the recommendation scores. Personalized push for different users is thereby achieved according to each user's actual points of attention and interest, and the user's human-computer interaction experience is improved.
Drawings
Fig. 1 is a flowchart of a method for recommending a wake-up word according to a first embodiment of the present invention;
Fig. 2 is a block diagram of a device for recommending a wake-up word according to a second embodiment of the present invention;
Fig. 3 is a block diagram of a terminal device according to a third embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a method for recommending a wake-up word according to a first embodiment of the present invention. This embodiment is applicable to recommending voice wake-up words relevant to a user according to the user's network data. The method may be executed by the device for recommending a wake-up word provided in the embodiments of the present invention; the device may be implemented in software and/or hardware and integrated in a terminal device or a server. The method specifically includes the following steps:
S110, acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page.
Network data includes page information acquired by the user through the terminal device, for example a display page of nearby road conditions searched for through a vehicle-mounted terminal device. It also includes page information acquired through a mobile phone application (APP) or a PC (Personal Computer) software program associated with the terminal device, and page information acquired through a mobile phone browser or a PC browser, for example page information acquired in web page form through a browser on a mobile phone or PC after logging in with the account registered for the terminal device. The user's identity can be distinguished by the account registered for the terminal device. In the embodiments of the present invention, the terminal device is an electronic device with a voice interaction function, and its type is not specifically limited.
Different user behaviors, such as browsing, searching, commenting, liking, collecting, adding to a shopping cart, and purchasing, each produce corresponding content display pages. For example, after a purchase the user obtains the corresponding order page and payment page, and after a search the user obtains the corresponding search result page. Therefore, for each acquired content display page, the user behavior type corresponding to it, that is, the user behavior type that triggered the page, can be obtained; for example, content display page A is obtained through the user's browsing behavior, and content display page B is obtained through the user's searching behavior. In particular, browsing behavior includes audio playing behavior. Audio playing behavior covers audio obtained by the user over the internet as well as local audio played through the vehicle-mounted terminal device; the audio information includes a content display page for the audio, which shows information such as the type of the audio (for example music, crosstalk, or commentary), the song title, the dialog, and/or the performer. Key information about the audio content may be extracted as candidate keywords. For example, when the audio is music, the name of the music, the noun that appears most frequently in it, the name of its singer, or the name of the singer's favorite may be extracted; when the audio is a commentary, its title, the name of its host, the name of its author, or the name of its most popular character may be extracted.
The network data within the preset time period reflects what the user has paid attention to in the recent past; it is strongly related to the user and likely to arouse the user's interest. The specific length can be set as needed; for example, with a preset time period of 5 days, the target user's network data from the past 5 days is acquired. In particular, the preset time period can depend on how active the user is. For a highly active user, the preset time period can be set to a small value, because rich network data can be collected in a short time and already reflects the user's points of attention and interest. For a less active user, the preset time period needs to be set to a larger value, because a longer time is needed to collect enough network data to reflect the user's points of attention and interest accurately. The user's activity level can be determined from the user's average daily network access time.
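As a minimal sketch of the activity-dependent look-back window described above: the 5-day value comes from the example in the text, while the function name, the 120-minute activity threshold, and the 15-day window for less active users are hypothetical values chosen only for illustration.

```python
def preset_period_days(avg_daily_minutes, active_threshold=120.0,
                       short_days=5, long_days=15):
    """Choose the look-back window from the user's average daily access time.

    active_threshold and long_days are hypothetical; the text only states
    that active users get a shorter window and inactive users a longer one.
    """
    if avg_daily_minutes < 0:
        raise ValueError("average daily access time cannot be negative")
    # Active users generate enough data quickly, so a short window suffices.
    return short_days if avg_daily_minutes >= active_threshold else long_days


window_for_active_user = preset_period_days(180.0)
window_for_inactive_user = preset_period_days(30.0)
```

In practice the threshold and window lengths would be tuned to the observed distribution of user activity.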
S120, obtaining the alternative keywords in each content display page, and obtaining the text importance score of each alternative keyword according to the word frequency and the inverse text frequency index.
Term frequency and inverse document frequency together reflect how important a word is across all content display pages. Term frequency (TF) is the frequency with which a word occurs in a given content display page; the larger the value, the more important the word is in that page. For example, if the word "puppet cat" appears 20 times in a content display page that contains 100 words in total, its TF value is 20/100 = 0.2. Inverse document frequency (IDF) measures the general importance of a word: the fewer content display pages contain the word, the larger its IDF and the more strongly the word characterizes the pages in which it appears. The IDF value is obtained by dividing the total number of content display pages by the number of content display pages that contain the word and taking the base-10 logarithm of the result. For example, if there are 100 content display pages in total and 10 of them contain "puppet cat", the IDF value is
$$\mathrm{IDF} = \log_{10}\left(\frac{100}{10}\right) = 1$$
Finally, TF is multiplied by IDF to obtain the TF-IDF value, which is the text importance score of the word; in the example above, the text importance score of "puppet cat" is 0.2 × 1 = 0.2.
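The TF-IDF calculation can be sketched as follows. This is an illustrative sketch, not the patent's implementation: it assumes each content display page has already been split into a list of words, and the function names and the zero score for a word found in no page are assumptions. The sample data reproduces the worked "puppet cat" example (20 occurrences in a 100-word page, 10 of 100 pages containing the word).

```python
import math
from collections import Counter


def term_frequency(word, page_words):
    """Occurrences of `word` divided by the number of words in the page."""
    if not page_words:
        return 0.0
    return Counter(page_words)[word] / len(page_words)


def inverse_document_frequency(word, pages):
    """Base-10 log of (total pages / pages that contain `word`)."""
    containing = sum(1 for page in pages if word in page)
    if containing == 0:
        return 0.0  # assumption: a word found in no page gets no weight
    return math.log10(len(pages) / containing)


def text_importance(word, page_words, pages):
    """TF-IDF value used as the text importance score."""
    return term_frequency(word, page_words) * inverse_document_frequency(word, pages)


# One 100-word page with 20 occurrences, 9 further pages containing the
# word, and 90 pages without it: 100 pages in total, 10 containing the word.
page = ["puppet cat"] * 20 + ["filler"] * 80
pages = [page] + [["puppet cat", "other"]] * 9 + [["other"]] * 90
score = text_importance("puppet cat", page, pages)
```

With this data the term frequency is 0.2 and the inverse document frequency is 1, so `score` matches the 0.2 obtained in the text.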
A content display page usually contains a large number of words, and not every word needs to be treated as a candidate keyword; to reduce the computational load on the terminal device or server, only some of the more frequent words in each page may be selected as candidate keywords. Specifically, acquiring the candidate keywords in each content display page includes: acquiring, in each content display page, the words whose number of occurrences is greater than or equal to a preset minimum number of occurrences as candidate keywords; or sorting the words in each content display page in descending order of number of occurrences and taking a first preset number of them, in that order, as candidate keywords. That is, words whose occurrences in a page reach a given threshold (for example, 5 times) can be used as candidate keywords; alternatively, a required number of candidate keywords per page, namely the first preset number (for example, 3), can be set, so that the 3 most frequent words in each page are used as candidate keywords. Either way, the number of candidate keywords is reduced, which lowers the computational load on the terminal device or server.
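The two screening options can be sketched as below, assuming a page is available as a list of words. The function names are hypothetical; the defaults of 5 occurrences and 3 keywords follow the examples in the text.

```python
from collections import Counter


def candidates_by_min_count(page_words, min_count=5):
    """Keep words that occur at least `min_count` times in the page."""
    counts = Counter(page_words)
    return [word for word, count in counts.items() if count >= min_count]


def candidates_top_n(page_words, n=3):
    """Keep the `n` most frequent words, in descending order of count."""
    return [word for word, _ in Counter(page_words).most_common(n)]


words = ["cat"] * 6 + ["dog"] * 5 + ["fish"] * 2 + ["bird"]
frequent = candidates_by_min_count(words)
top_three = candidates_top_n(words)
```

Here `frequent` contains only "cat" and "dog", while `top_three` also admits "fish" because it ranks third by count even though it falls below the threshold.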
S130, determining the behavior type score of each candidate keyword in the at least one content display page according to the user behavior type corresponding to the at least one content display page.
Different user behaviors reflect different degrees of user attention. For example, a browsing behavior reflects only general attention, whereas a purchasing behavior clearly marks something the user cares about a great deal. Different behavior type scores are therefore assigned to different user behavior types. For the browsing, searching, commenting, liking, collecting, shopping-cart-adding, and purchasing behaviors described above, the behavior type scores increase in that order and are set to 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1 respectively.
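The score assignment amounts to a lookup table keyed by behavior type. In the sketch below the numeric scores are those given in the text, while the string identifiers for the behavior types and the helper name are hypothetical.

```python
# Behavior type scores from the text; the string keys are illustrative names.
BEHAVIOR_TYPE_SCORES = {
    "browse": 0.4,
    "search": 0.5,
    "comment": 0.6,
    "like": 0.7,
    "collect": 0.8,
    "add_to_cart": 0.9,
    "purchase": 1.0,
}


def behavior_type_score(behavior_type):
    """Return the score of the behavior that triggered a content display page."""
    try:
        return BEHAVIOR_TYPE_SCORES[behavior_type]
    except KeyError:
        raise ValueError(f"unknown behavior type: {behavior_type!r}") from None


# Page A was reached by browsing and page B by searching, as in the text.
page_behaviors = {"A": "browse", "B": "search"}
page_scores = {page: behavior_type_score(b) for page, b in page_behaviors.items()}
```

Every candidate keyword taken from a page then inherits that page's behavior type score.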
S140, determining the recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword.
The text importance score of a candidate keyword is multiplied by its behavior type score, and the product is the recommendation score of the candidate keyword. The higher the recommendation score, the more important the word is in the text, the more closely it is associated with the user, and the better it matches the user's points of attention and interest.
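A small sketch of this multiplication and the resulting ranking follows. The function names and the sample scores for "skateboard" are hypothetical; the "puppet cat" values reuse the text importance score of 0.2 from the earlier example together with the purchasing score of 1.

```python
def recommendation_score(text_importance, behavior_type_score):
    """Recommendation score as the product of the two component scores."""
    return text_importance * behavior_type_score


def rank_keywords(candidates):
    """Sort keywords by recommendation score, highest first.

    `candidates` maps keyword -> (text importance, behavior type score).
    """
    scored = {kw: recommendation_score(ti, bs) for kw, (ti, bs) in candidates.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


ranked = rank_keywords({
    "skateboard": (0.3, 0.4),   # higher TF-IDF, but only browsed
    "puppet cat": (0.2, 1.0),   # lower TF-IDF, but purchased
})
```

Although "skateboard" has the higher text importance score, "puppet cat" ranks first because the purchasing behavior carries more weight than browsing.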
Optionally, in this embodiment of the present invention, after the behavior type score of each candidate keyword in the at least one content display page is determined according to the user behavior type corresponding to the at least one content display page, the method further includes: determining a behavior frequency score of each candidate keyword according to the user behavior frequency per unit time corresponding to that candidate keyword. In this case, determining the recommendation score of each candidate keyword according to its text importance score and behavior type score includes: determining the recommendation score of each candidate keyword according to its text importance score, behavior type score, and behavior frequency score.
The unit time is related to the preset time period described above: if the preset time period is the past several days (that is, "day" is the time unit), the unit time may be set to one day; if the preset time period is the past several months (that is, "month" is the time unit), the unit time may be set to one month. The user behavior frequency per unit time corresponding to a candidate keyword reflects how strongly the candidate keyword influences the user: the higher the frequency, the greater the influence. For example, if within one day candidate keyword A appears 2 times in the content display pages obtained through the user's network behavior and candidate keyword B appears 50 times, candidate keyword B clearly influences the user more than candidate keyword A. A behavior frequency score can be assigned according to the band into which the user behavior frequency per unit time falls. For example, when the frequency is 0 to 10 times, the behavior frequency score is 1; when it is 11 to 50 times, the score is 1.5; when it is 51 to 100 times, the score is 1.8; and when it is more than 100 times, the score is 2. The text importance score, the behavior type score, and the behavior frequency score of the candidate keyword are multiplied together, and the product is the recommendation score of the candidate keyword.
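The frequency bands can be sketched as a simple threshold function. The band scores are those given in the text; treating a count of exactly 50 as part of the 1.5 band and the function name are assumptions of this sketch.

```python
def behavior_frequency_score(times_per_unit):
    """Map the user behavior frequency per unit time to a band score."""
    if times_per_unit < 0:
        raise ValueError("frequency cannot be negative")
    if times_per_unit <= 10:
        return 1.0
    if times_per_unit <= 50:   # assumption: exactly 50 stays in this band
        return 1.5
    if times_per_unit <= 100:
        return 1.8
    return 2.0


# Keyword A appears 2 times in one day and keyword B appears 50 times.
score_a = behavior_frequency_score(2)
score_b = behavior_frequency_score(50)

# Three-factor recommendation score for B, using illustrative inputs:
# text importance 0.2 and a purchasing behavior type score of 1.0.
recommendation_b = 0.2 * 1.0 * score_b
```

Keyword B therefore receives a 1.5 multiplier, while keyword A's score is left unchanged.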
Optionally, in this embodiment of the present invention, after the behavior type score of each candidate keyword in the at least one content display page is determined according to the user behavior type corresponding to the at least one content display page, the method further includes: obtaining an interest decay score of each candidate keyword, where the interest decay score is related to the interval between the acquisition time of the candidate keyword and the current time. In this case, determining the recommendation score of each candidate keyword according to its text importance score and behavior type score includes: determining the recommendation score of each candidate keyword according to its text importance score, behavior type score, and interest decay score; or determining the recommendation score of each candidate keyword according to its text importance score, behavior type score, behavior frequency score, and interest decay score.
A user's interest in something usually declines over time. For example, suppose the user browsed a web page five days ago from which the candidate keyword "puppet cat" was obtained, and browsed another web page one day ago from which the candidate keyword "Persian cat" was obtained. The content browsed one day ago clearly predicts the user's current interest better than the content browsed five days ago. An interest decay score can therefore be determined for each candidate keyword according to the interval between the time the candidate keyword was acquired and the current time. For example, if the preset time period is 5 days, the interval can be 5 days, 4 days, 3 days, 2 days, or 1 day, and the corresponding interest decay scores can be set to 0.6, 0.7, 0.8, 0.9, and 1 respectively; that is, the interest decay score increases as the interval becomes shorter.
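A sketch of the table-based interest decay score and of the optional score combinations is given below. The table values come from the 5-day example in the text. The text states multiplication explicitly for the text importance, behavior type, and behavior frequency scores; folding the interest decay score in by multiplication as well is an assumption of this sketch, as are the function and parameter names.

```python
# Interest decay score by interval in days, from the 5-day example.
INTEREST_DECAY_BY_DAYS = {1: 1.0, 2: 0.9, 3: 0.8, 4: 0.7, 5: 0.6}


def combined_recommendation_score(text_importance, type_score,
                                  frequency_score=None, days_since=None):
    """Multiply the mandatory scores with whichever optional scores are given."""
    score = text_importance * type_score
    if frequency_score is not None:
        score *= frequency_score
    if days_since is not None:
        if days_since not in INTEREST_DECAY_BY_DAYS:
            raise ValueError("interval is outside the preset time period")
        # Assumption: the decay score is applied as a further multiplier.
        score *= INTEREST_DECAY_BY_DAYS[days_since]
    return score


basic = combined_recommendation_score(0.2, 1.0)
with_decay = combined_recommendation_score(0.2, 1.0, days_since=5)
with_both = combined_recommendation_score(0.2, 1.0, frequency_score=1.5, days_since=2)
```

Under these assumptions, a keyword acquired five days ago keeps 60% of its basic score, whereas one acquired a day ago keeps all of it.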
Optionally, in this embodiment of the present invention, obtaining the interest decay score of each candidate keyword includes: obtaining the interest decay score of each candidate keyword based on the following formula:
$$n_i = k_i \times \exp(-m_i \times t_i) \qquad \text{(Formula 1-1)}$$
where $m_i$ is the attenuation coefficient, $t_i$ is the interval between the acquisition time of the candidate keyword and the current time, $k_i$ is the initial interest score of the candidate keyword, $n_i$ is the interest decay score of the candidate keyword, exp is the exponential function with base e, and $i$ is the number of the user behavior type corresponding to the candidate keyword.
Continuing the example above, the browsing, searching, commenting, liking, collecting, shopping-cart-adding, and purchasing behaviors are numbered 1 to 7, that is, i takes the values 1 to 7. Different user behavior types have different initial interest scores: the higher the behavior type score, the higher the corresponding initial interest score, and the higher the interest decay score at the same interval. For example, browsing reflects only general attention, so the initial interest score of a candidate keyword under this behavior type is low, and after decaying over time its interest decay score is low. Purchasing clearly marks something the user cares about a great deal, so the initial interest score of a candidate keyword under this behavior type is high, and its interest decay score may remain high even after a period of decay. Accordingly, for each user behavior type, an initial interest score and the interest decay score at an interval equal to the preset time period are set in advance; the attenuation coefficient matching each user behavior type is then calculated from them, which yields the interest decay score formula for each user behavior type.
For example, for the browsing behavior (i = 1), the initial interest score $k_1$ is set to 100, the preset time period is 5 days, and the interest decay score after 5 days is expected to be 1; that is, $n_1 = 1$ when $t_1 = 5$. The corresponding attenuation coefficient is then $m_1 = \ln(100)/5 \approx 0.921$, and Formula 1-1 becomes $n_1 = 100 \times \exp(-0.921 \times t_1)$. The interest decay score $n_1$ of each candidate keyword can then be obtained from the interval $t_1$ between its acquisition time and the current time.
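The calibration step in this example can be checked numerically. The sketch below solves Formula 1-1 for the attenuation coefficient given an initial interest score and a target score at the end of the preset time period, then evaluates the decay curve; the function names are hypothetical.

```python
import math


def attenuation_coefficient(k, n_target, t_target):
    """Solve n_target = k * exp(-m * t_target) for m."""
    if k <= 0 or n_target <= 0 or t_target <= 0:
        raise ValueError("k, n_target and t_target must be positive")
    return math.log(k / n_target) / t_target


def interest_decay_score(k, m, t):
    """Formula 1-1: n = k * exp(-m * t)."""
    return k * math.exp(-m * t)


# Browsing behavior: initial score 100, target score 1 after 5 days.
m1 = attenuation_coefficient(100, 1, 5)
score_today = interest_decay_score(100, m1, 0)
score_after_5_days = interest_decay_score(100, m1, 5)
```

Rounded to three decimals, `m1` is 0.921, which agrees with the coefficient stated in the text; the score starts at 100 and falls to 1 at the end of the 5-day period.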
S150, determining a recommended wake-up word matched with the target user according to the recommendation score of each candidate keyword, and presenting the recommended wake-up word to the target user.
When the user wakes up the terminal device or modifies its wake-up word, the one or more candidate keywords with the highest recommendation scores are pushed to the user as recommended wake-up words by TTS (Text To Speech) broadcast and/or on-screen display.
Optionally, in this embodiment of the present invention, determining the recommended wake-up word matched with the target user according to the recommendation score of each candidate keyword and presenting it to the target user includes: classifying the candidate keywords, and taking the candidate keyword with the highest recommendation score in each category as a candidate recommended word; and selecting, from the candidate recommended words, a second preset number of words with the highest recommendation scores as recommended wake-up words, and presenting them to the target user. Classifying the candidate keywords includes clustering the acquired candidate keywords, for example clustering them with a word2vec (word-to-vector) model to form different categories; it also includes presetting several categories and, once the candidate keywords have been acquired, assigning each to a category according to its meaning. By classifying the candidate keywords and selecting recommended wake-up words from each category, strongly relevant wake-up words are recommended to the user from several different angles, which broadens the range of the recommendation and avoids the poor diversity that results when all recommended wake-up words come from a single category.
For example, suppose most of the network data acquired by the user in the last 5 days relates to cats, and according to the recommendation scores the highest-scoring candidate keywords are, in order, "puppet cat", "Persian cat", "Bengal cat", "skateboard", and "lipstick". If the number of recommended wake-up words (that is, the second preset number) is 3, ranking by score alone would yield "puppet cat", "Persian cat", and "Bengal cat". However, these three candidate keywords all belong to the same category ("cat"); even if only one of them is recommended (that is, only "puppet cat", which has the highest recommendation score), the user can think of the other two and can still set a wake-up word that meets his or her needs. Therefore, by selecting across categories, the three recommended wake-up words finally determined are "puppet cat" (category "cat"), "skateboard" (category "sports equipment"), and "lipstick" (category "cosmetics"), which achieves multi-angle push of wake-up words.
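The category-aware selection in this example can be sketched as follows. The sketch assumes the category of each candidate keyword is already known (for instance from a clustering step or a preset category list, neither of which is shown), and the numeric recommendation scores are hypothetical values chosen only to reproduce the ordering in the example.

```python
def pick_wake_words(scores, categories, limit=3):
    """Keep the best keyword per category, then take the top `limit` of those."""
    best_per_category = {}
    for word, score in scores.items():
        category = categories[word]
        current = best_per_category.get(category)
        if current is None or score > current[1]:
            best_per_category[category] = (word, score)
    ranked = sorted(best_per_category.values(), key=lambda ws: ws[1], reverse=True)
    return [word for word, _ in ranked[:limit]]


# Hypothetical recommendation scores, ordered as in the example.
scores = {
    "puppet cat": 0.9, "Persian cat": 0.8, "Bengal cat": 0.7,
    "skateboard": 0.6, "lipstick": 0.5,
}
categories = {
    "puppet cat": "cat", "Persian cat": "cat", "Bengal cat": "cat",
    "skateboard": "sports equipment", "lipstick": "cosmetics",
}
wake_words = pick_wake_words(scores, categories)
```

Selecting purely by score would return the three cat keywords; taking one winner per category first yields "puppet cat", "skateboard", and "lipstick" instead.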
According to the technical solution disclosed in this embodiment of the invention, candidate keywords are obtained from each content display page based on the user's network data over a past period of time, and a text importance score is calculated for each of them; the recommendation score of each candidate keyword is then obtained from the text importance score and the behavior type score corresponding to that keyword, and the recommended wake-up words displayed to the user are determined according to the recommendation scores. Personalized pushing for different users is thus achieved according to each user's actual points of interest and concern, which improves the user's human-computer interaction experience.
Example two
Fig. 2 is a block diagram of a structure of a device for recommending a wakeup word according to a second embodiment of the present invention, where the device specifically includes: a network data acquisition module 210, a text importance score acquisition module 220, a behavior type score acquisition module 230, a recommendation score acquisition module 240 and a wakeup word presentation module 250;
a network data obtaining module 210, configured to obtain network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page;
a text importance score obtaining module 220, configured to obtain alternative keywords in each content presentation page, and obtain a text importance score of each alternative keyword according to a word frequency and an inverse text frequency index;
a behavior type score obtaining module 230, configured to determine, according to a user behavior type corresponding to the at least one content presentation page, a behavior type score of each candidate keyword in the at least one content presentation page;
a recommendation score obtaining module 240, configured to determine a recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword;
and the awakening word display module 250 is configured to determine, according to the recommendation score of each candidate keyword, a recommended awakening word matched with the target user, and display the recommended awakening word to the target user.
According to the technical solution disclosed in this embodiment of the invention, candidate keywords are obtained from each content display page based on the user's network data over a past period of time, and a text importance score is calculated for each of them; the recommendation score of each candidate keyword is then obtained from the text importance score and the behavior type score corresponding to that keyword, and the recommended wake-up words displayed to the user are determined according to the recommendation scores. Personalized pushing for different users is thus achieved according to each user's actual points of interest and concern, which improves the user's human-computer interaction experience.
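The flow through the text importance, behavior type, and recommendation score modules might look roughly like the sketch below. Several points are assumptions made only for illustration: the behavior weights in `BEHAVIOR_TYPE_SCORE`, the use of the user's own pages as the corpus for the inverse document frequency, the smoothed IDF variant, and the choice to combine the two scores by multiplication.

```python
import math
from collections import Counter

# Hypothetical behavior weights; the text above does not fix these values.
BEHAVIOR_TYPE_SCORE = {"browse": 1.0, "search": 2.0, "purchase": 5.0}


def recommend_scores(pages):
    """pages: list of (tokens, behavior_type) pairs for one user.

    Returns dict keyword -> recommendation score, keeping the best score a
    keyword reaches on any page. Multiplying the text importance score by
    the behavior type score is an assumption of this sketch.
    """
    n_pages = len(pages)
    # Document frequency: in how many pages each word occurs.
    doc_freq = Counter()
    for tokens, _ in pages:
        doc_freq.update(set(tokens))

    result = {}
    for tokens, behavior in pages:
        counts = Counter(tokens)
        for word, count in counts.items():
            tf = count / len(tokens)
            # Smoothed inverse document frequency (one common variant).
            idf = math.log((1 + n_pages) / (1 + doc_freq[word])) + 1
            score = tf * idf * BEHAVIOR_TYPE_SCORE[behavior]
            result[word] = max(result.get(word, 0.0), score)
    return result


pages = [
    (["cat", "food", "cat"], "purchase"),
    (["skateboard", "wheel"], "browse"),
]
scores = recommend_scores(pages)
best = max(scores, key=scores.get)
```

In this toy input the keyword from the purchased page ranks first, because both its term frequency and its behavior weight are higher.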
Optionally, on the basis of the above technical solution, the user behavior type includes a browsing behavior, a searching behavior, a commenting behavior, a like behavior, a collecting behavior, an adding shopping cart behavior, and/or a purchasing behavior.
Optionally, on the basis of the above technical solution, the text importance score obtaining module 220 is specifically configured to obtain, in each content display page, the candidate keywords whose number of occurrences is greater than or equal to a preset minimum number of occurrences; or to sort the words in each content display page in descending order of number of occurrences and obtain, in each content display page, a first preset number of candidate keywords according to that order.
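The two selection strategies can be illustrated with a short sketch. The function names and the sample tokens are hypothetical, and tokenization of the page is assumed to have happened already.

```python
from collections import Counter


def keywords_by_min_count(tokens, min_count):
    """Words that occur at least `min_count` times in one page."""
    counts = Counter(tokens)
    return [word for word, count in counts.items() if count >= min_count]


def keywords_by_top_n(tokens, first_preset_number):
    """The `first_preset_number` most frequent words in one page."""
    return [word for word, _ in Counter(tokens).most_common(first_preset_number)]


page_tokens = ["cat", "toy", "cat", "food", "cat", "toy"]
frequent = keywords_by_min_count(page_tokens, 2)
top_one = keywords_by_top_n(page_tokens, 1)
```

The threshold variant returns a variable number of keywords per page, while the top-N variant always returns at most the first preset number.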
Optionally, on the basis of the above technical solution, the apparatus for recommending a wakeup word further includes:
and the behavior frequency score acquisition module is used for determining the behavior frequency score of each alternative keyword according to the user behavior frequency in unit time corresponding to each alternative keyword.
Optionally, on the basis of the foregoing technical solution, the recommendation score obtaining module 240 is specifically configured to determine the recommendation score of each candidate keyword according to the text importance score, the behavior type score, and the behavior frequency score of each candidate keyword.
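As a rough illustration of a behavior frequency score, the sketch below counts the recorded behaviors per keyword and divides by the length of the observation window. Using the raw per-day rate as the score is an assumption; the text only states that the score depends on the user behavior frequency per unit time. The function name and the sample events are hypothetical.

```python
from collections import Counter


def behavior_frequency_scores(events, window_days):
    """events: list of keywords, one entry per recorded user behavior.

    Returns dict keyword -> behaviors per day within the window.
    """
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    counts = Counter(events)
    return {keyword: count / window_days for keyword, count in counts.items()}


events = ["cat", "cat", "skateboard", "cat", "cat", "cat"]
freq_scores = behavior_frequency_scores(events, window_days=5)
```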
Optionally, on the basis of the above technical solution, the apparatus for recommending a wakeup word further includes:
an interest attenuation score obtaining module, configured to obtain an interest attenuation score of each candidate keyword; wherein the interest decay score is related to the interval time between the acquisition time of the candidate keyword and the current time.
Optionally, on the basis of the foregoing technical solution, the recommendation score obtaining module 240 is specifically configured to determine the recommendation score of each candidate keyword according to the text importance score, the behavior type score, and the interest decay score of each candidate keyword; or determining the recommendation score of each alternative keyword according to the text importance score, the behavior type score, the behavior frequency score and the interest attenuation score of each alternative keyword.
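The alternatives above differ only in which optional scores take part. A minimal sketch, assuming the scores are combined by multiplication (the text does not state the combining operation here) and using a hypothetical function name:

```python
def recommendation_score(text_importance, behavior_type,
                         behavior_frequency=None, interest_decay=None):
    """Combine the mandatory scores with whichever optional scores are given.

    Multiplication is an assumption of this sketch; the text only states
    which scores the recommendation score depends on.
    """
    score = text_importance * behavior_type
    for optional in (behavior_frequency, interest_decay):
        if optional is not None:
            score *= optional
    return score


base = recommendation_score(0.5, 2.0)
with_decay = recommendation_score(0.5, 2.0, interest_decay=0.5)
with_all = recommendation_score(0.5, 2.0, behavior_frequency=3.0,
                                interest_decay=0.5)
```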
Optionally, on the basis of the above technical solution, the interest decay score obtaining module is specifically configured to obtain the interest decay score of each candidate keyword based on the following formula:
$$n_i = k_i \times \exp(-m_i \times t_i)$$
where $m_i$ is the attenuation coefficient, $t_i$ is the interval time between the acquisition time of the alternative keyword and the current time, $k_i$ is the initial interest score of the alternative keyword, $n_i$ is the interest decay score of the alternative keyword, $\exp$ is the exponential function with base $e$, and $i$ is the number of the user behavior type corresponding to the alternative keyword.
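The decay formula, in which the interest decay score equals the initial interest score multiplied by the exponential of the negative product of attenuation coefficient and interval time, translates directly into code. The function name, the time unit (days), and the numeric values below are hypothetical.

```python
import math


def interest_decay_score(initial_score, decay_coefficient, interval):
    """Initial interest score times exp(-coefficient * interval)."""
    if interval < 0:
        raise ValueError("interval must be non-negative")
    return initial_score * math.exp(-decay_coefficient * interval)


# Hypothetical values: initial score 1.0, coefficient 0.1 per day.
fresh = interest_decay_score(1.0, 0.1, 0)
five_days_old = interest_decay_score(1.0, 0.1, 5)
```

A keyword acquired just now keeps its full initial score, and the score shrinks monotonically as the interval grows.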
Optionally, on the basis of the above technical solution, the awakening word display module 250 specifically includes:
the classification execution unit is used for classifying the alternative keywords and respectively acquiring the alternative keywords with the highest recommendation scores under the classification categories as alternative recommended words;
and the awakening word display unit is used for acquiring the awakening recommended words with the highest recommendation scores and the second preset number in the alternative recommended words and displaying the awakening recommended words to the target user.
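How the classification execution unit forms its categories is left open here. One possibility, shown below purely as an assumption, is a greedy grouping of keywords by the cosine similarity of their word vectors. The toy two-dimensional vectors, the similarity threshold, and the function names are hypothetical; a real system would take the vectors from a trained word-vector model and would likely use a more robust clustering method.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_keywords(vectors, threshold=0.8):
    """Greedy clustering: a keyword joins the first cluster whose seed
    vector is similar enough, otherwise it starts a new cluster.

    vectors: dict keyword -> embedding (toy values in this sketch).
    """
    clusters = []  # list of (seed_vector, [keywords])
    for keyword, vector in vectors.items():
        for seed, members in clusters:
            if cosine(seed, vector) >= threshold:
                members.append(keyword)
                break
        else:
            clusters.append((vector, [keyword]))
    return [members for _, members in clusters]


toy_vectors = {
    "cat breed A": (1.0, 0.1),
    "cat breed B": (0.9, 0.2),
    "skateboard": (0.1, 1.0),
}
groups = cluster_keywords(toy_vectors)
```

Each resulting group plays the role of one classification category, from which the highest-scoring keyword would then be taken.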
Optionally, on the basis of the above technical solution, the browsing behavior includes an audio playing behavior.
The device can execute the recommendation method of the awakening words provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details not described in detail in this embodiment, reference may be made to the method provided in any embodiment of the present invention.
Example three
Fig. 3 is a schematic structural diagram of a terminal device according to a third embodiment of the present invention. Fig. 3 illustrates a block diagram of an exemplary terminal device 12 suitable for use in implementing embodiments of the present invention. The terminal device 12 shown in fig. 3 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 3, terminal device 12 is in the form of a general purpose computing device. The components of terminal device 12 may include, but are not limited to: one or more processors or processing units 16, a memory 28, and a bus 18 that couples various system components including the memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Terminal device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by terminal device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Terminal device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 3, and commonly referred to as a "hard drive"). Although not shown in FIG. 3, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Terminal device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with terminal device 12, and/or with any devices (e.g., network card, modem, etc.) that enable terminal device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, terminal device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via network adapter 20. As shown, the network adapter 20 communicates with the other modules of the terminal device 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with terminal device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the memory 28, for example, implementing a recommendation method of a wake word provided by any embodiment of the present invention. Namely: acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page; acquiring alternative keywords in each content display page, and acquiring a text importance score of each alternative keyword according to a word frequency and an inverse text frequency index; determining a behavior type score of each alternative keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page; determining a recommendation score of each alternative keyword according to the text importance score and the behavior type score of each alternative keyword; and determining a recommended awakening word matched with the target user according to the recommendation score of each alternative keyword, and displaying the recommended awakening word to the target user.
Example four
The fourth embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a method for recommending a wake-up word according to any embodiment of the present invention; the method comprises the following steps:
acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page;
acquiring alternative keywords in each content display page, and acquiring a text importance score of each alternative keyword according to a word frequency and an inverse text frequency index;
determining a behavior type score of each alternative keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page;
determining a recommendation score of each alternative keyword according to the text importance score and the behavior type score of each alternative keyword;
and determining a recommended awakening word matched with the target user according to the recommendation score of each alternative keyword, and displaying the recommended awakening word to the target user.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for recommending a wake-up word, comprising:
acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page;
acquiring alternative keywords in each content display page, and acquiring a text importance score of each alternative keyword according to a word frequency and an inverse text frequency index;
determining a behavior type score of each alternative keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page;
determining a recommendation score of each alternative keyword according to the text importance score and the behavior type score of each alternative keyword;
and determining a recommended awakening word matched with the target user according to the recommendation score of each alternative keyword, and displaying the recommended awakening word to the target user.
2. The method of claim 1, wherein the user behavior types include browsing behavior, searching behavior, commenting behavior, like behavior, collecting behavior, adding shopping cart behavior, and/or purchasing behavior.
3. The method according to claim 1, wherein the obtaining of the alternative keywords in each content presentation page comprises:
acquiring, in each content display page, alternative keywords whose number of occurrences is greater than or equal to a preset minimum number of occurrences;
or, in each content display page, sorting the words in descending order of number of occurrences and acquiring a first preset number of alternative keywords according to that order.
4. The method of claim 1, further comprising, after determining a behavior type score for each of the candidate keywords in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page:
determining the behavior frequency score of each alternative keyword according to the user behavior frequency in unit time corresponding to each alternative keyword;
wherein determining the recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword comprises:
and determining the recommendation score of each alternative keyword according to the text importance score, the behavior type score and the behavior frequency score of each alternative keyword.
5. The method according to claim 1 or 4, wherein after determining the behavior type score of each candidate keyword in the at least one content presentation page according to the user behavior type corresponding to the at least one content presentation page, the method further comprises:
obtaining interest attenuation scores of each alternative keyword; wherein the interest decay score is related to the interval time between the acquisition time of the candidate keyword and the current time;
wherein determining the recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword comprises:
determining a recommendation score of each alternative keyword according to the text importance score, the behavior type score and the interest attenuation score of each alternative keyword;
or determining the recommendation score of each alternative keyword according to the text importance score, the behavior type score, the behavior frequency score and the interest attenuation score of each alternative keyword.
6. The method of claim 5, wherein obtaining an interest decay score for each of the candidate keywords comprises:
obtaining the interest attenuation score of each candidate keyword based on the following formula:
$$n_i = k_i \times \exp(-m_i \times t_i)$$
where $m_i$ is the attenuation coefficient, $t_i$ is the interval time between the acquisition time of the alternative keyword and the current time, $k_i$ is the initial interest score of the alternative keyword, $n_i$ is the interest decay score of the alternative keyword, $\exp$ is the exponential function with base $e$, and $i$ is the number of the user behavior type corresponding to the alternative keyword.
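One way to read the attenuation coefficient is through the time after which the interest score has halved. The short derivation below follows directly from the exponential decay formula of this claim, writing $k_i$ for the initial interest score, $m_i$ for the attenuation coefficient, and $t_{1/2}$ for the halving time; the numeric value of $m_i$ and the unit of days are hypothetical and serve only as a worked example.

```latex
\begin{aligned}
\frac{k_i}{2} &= k_i \exp(-m_i\, t_{1/2}) \\
t_{1/2} &= \frac{\ln 2}{m_i} \\
m_i = 0.1\ \text{day}^{-1} &\;\Rightarrow\; t_{1/2} \approx 6.93\ \text{days}
\end{aligned}
```

A larger attenuation coefficient therefore means that older behaviors lose their influence on the recommendation more quickly.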
7. The method according to claim 1, wherein the determining and presenting the recommended awakening word matched with the target user according to the recommendation score of each candidate keyword comprises:
classifying the alternative keywords, and respectively acquiring the alternative keywords with the highest recommendation scores under the classification categories as alternative recommended words;
and acquiring awakening recommended words with highest recommendation scores and a second preset number from the alternative recommended words, and displaying the awakening recommended words to the target user.
8. An apparatus for recommending a wake-up word, comprising:
the network data acquisition module is used for acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page;
the text importance score acquisition module is used for acquiring the alternative keywords in each content display page and acquiring the text importance score of each alternative keyword according to the word frequency and the inverse text frequency index;
a behavior type score obtaining module, configured to determine a behavior type score of each candidate keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page;
the recommendation score acquisition module is used for determining the recommendation score of each alternative keyword according to the text importance score and the behavior type score of each alternative keyword;
and the awakening word display module is used for determining the recommended awakening words matched with the target user according to the recommendation scores of the alternative keywords and displaying the recommended awakening words to the target user.
9. A terminal device, characterized in that the terminal device comprises:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for recommending a wake-up word as recited in any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method of recommending a wake-up word as claimed in any one of claims 1 to 7.
CN202011633865.3A 2020-12-31 2020-12-31 Method and device for recommending awakening words, terminal equipment and storage medium Active CN112802454B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011633865.3A CN112802454B (en) 2020-12-31 2020-12-31 Method and device for recommending awakening words, terminal equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112802454A true CN112802454A (en) 2021-05-14
CN112802454B CN112802454B (en) 2023-02-21

Family

ID=75808578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011633865.3A Active CN112802454B (en) 2020-12-31 2020-12-31 Method and device for recommending awakening words, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112802454B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120036144A1 (en) * 2009-02-27 2012-02-09 Kabushiki Kaisha Toshiba Information and recommendation device, method, and program
CN108664513A (en) * 2017-03-31 2018-10-16 北京京东尚科信息技术有限公司 Method, apparatus and equipment for pushing keyword
CN109615487A (en) * 2019-01-04 2019-04-12 平安科技(深圳)有限公司 Products Show method, apparatus, equipment and storage medium based on user behavior
CN111414498A (en) * 2020-04-29 2020-07-14 北京字节跳动网络技术有限公司 Multimedia information recommendation method and device and electronic equipment
CN111723260A (en) * 2019-03-19 2020-09-29 百度在线网络技术(北京)有限公司 Method and device for acquiring recommended content, electronic equipment and readable storage medium
CN111949887A (en) * 2020-08-31 2020-11-17 华东理工大学 Item recommendation method and device and computer-readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343084A (en) * 2021-05-25 2021-09-03 北京字节跳动网络技术有限公司 Method and device for pushing key field of text sending, storage medium and computer equipment
CN113343084B (en) * 2021-05-25 2024-09-13 北京字节跳动网络技术有限公司 Text key field pushing method and device, storage medium and computer equipment

Also Published As

Publication number Publication date
CN112802454B (en) 2023-02-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant