
CN109151642B - Intelligent earphone, intelligent earphone processing method, electronic device and storage medium - Google Patents


Info

Publication number
CN109151642B
CN109151642B (application CN201811033439.9A)
Authority
CN
China
Prior art keywords
scene
type
recording
sound
intelligent earphone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811033439.9A
Other languages
Chinese (zh)
Other versions
CN109151642A (en)
Inventor
邓迪 (Deng Di)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qinhai Technology Co.,Ltd.
Original Assignee
Beijing Jinchain Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jinchain Technology Co Ltd
Priority to CN201811033439.9A
Publication of CN109151642A
Application granted
Publication of CN109151642B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00: Details of transducers, loudspeakers or microphones
    • H04R1/10: Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing
    • G06T1/0007: Image acquisition

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention provides an intelligent earphone, an intelligent earphone processing method, an electronic device and a storage medium. The intelligent earphone comprises: a scene identification module for identifying the type of scene in which the intelligent earphone is currently located; and a processing module for performing processing adapted to a first type of scene when the identified scene is of the first type, and processing adapted to a second type of scene when the identified scene is of the second type, wherein the first type of scene is a conversation scene without visual image content and the second type of scene is a scene with visual image content. The intelligent earphone can automatically recognize the external scene and adapt its processing as that scene changes, improving the user's experience and strengthening the user's reliance on the device.

Description

Intelligent earphone, intelligent earphone processing method, electronic device and storage medium
Technical Field
The invention relates to the technical field of intelligent equipment, in particular to an intelligent earphone, an intelligent earphone processing method, electronic equipment and a storage medium.
Background
The intelligent earphone is a new type of wearable smart device. Like other smart devices, it can run an independent operating system, let the user install software such as applications and games, and access wireless networks through a mobile communication network.
However, a major shortcoming of current intelligent earphones is that they remain limited to the primary function of playing sound and offer few other automatic, intelligent functions. In particular, they rarely monitor changes in external conditions and cannot perform adaptive, automatic interaction processing in response to such changes. This causes inconvenience, prevents users from making full use of the device, and reduces the degree to which users rely on it.
Disclosure of Invention
To address the problems in the prior art, the invention provides an intelligent earphone, an intelligent earphone processing method, an electronic device and a storage medium.
Specifically, the invention provides the following technical scheme:
In a first aspect, the present invention provides an intelligent earphone, comprising:
the scene identification module is used for identifying the current scene type of the intelligent earphone;
the processing module is used for performing processing adapted to a first type of scene when the identified scene is of the first type, and performing processing adapted to a second type of scene when the identified scene is of the second type;
wherein the first type of scene is a conversation scene without visual image content, and the second type of scene is a scene with visual image content.
Further, the scene identification module is specifically configured to:
identify the type of scene in which the intelligent earphone is currently located according to environment image information acquired by an image acquisition device on the intelligent earphone and sound information acquired by a sound acquisition device on the intelligent earphone;
identify the type of the current scene as the first type of scene when it is determined from the environment image information acquired by the image acquisition device that no visual image content exists in the current scene, and it is determined from the sound information acquired by the sound acquisition device that conversation between two or more people exists;
identify the type of the current scene as the second type of scene when it is determined from the environment image information acquired by the image acquisition device that visual image content exists in the current scene.
Further, the processing module is specifically configured to:
when the identified scene is of the first type, automatically record the sound in the current scene, mark each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously convert the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
when the identified scene is of the second type, automatically record video of the visual image content in the current scene or automatically capture a continuous series of single images; if sound content is also determined to exist in the current scene, automatically record that sound as well, mark each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously convert the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred.
Further, the intelligent earphone further comprises a first trigger module, a second trigger module and a third trigger module;
the first trigger module is configured to, after receiving a first trigger signal from the user, automatically record the sound of the current scene, mark each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously convert the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
the second trigger module is configured to, after receiving a second trigger signal from the user, automatically record video of the visual image content in the current scene; if sound content also exists in the current scene, it simultaneously records that sound, marks each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously converts the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
the third trigger module is configured to automatically capture a single image of the visual content in the current scene each time a third trigger signal is received from the user.
Further, the scene identification module is further configured to identify whether the type of scene in which the intelligent earphone is currently located is a third type of scene, the third type being a sleep and rest scene;
correspondingly, the processing module is further configured to detect whether the user is snoring when the identified scene is of the third type and, if so, to issue a snore reminder through a vibration module or a music reminder module installed on the intelligent earphone so that the user can adjust his or her sleeping posture.
Furthermore, the processing module is further configured to detect whether an earthquake or fire hazard exists in the current scene when the identified scene is of the third type and, if so, to issue a hazard warning through a vibration module or an alarm module installed on the intelligent earphone so that the user can escape from the scene as soon as possible.
In a second aspect, the present invention further provides an intelligent earphone processing method, comprising:
identifying the type of a scene where the intelligent earphone is currently located;
when the identified scene is a first type of scene, performing processing adapted to the first type of scene;
when the identified scene is a second type of scene, performing processing adapted to the second type of scene;
wherein the first type of scene is a conversation scene without visual image content, and the second type of scene is a scene with visual image content.
Further, performing processing adapted to the first type of scene when the identified scene is of the first type, and performing processing adapted to the second type of scene when the identified scene is of the second type, specifically includes:
when the identified scene is of the first type, automatically recording the sound in the current scene, marking each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously converting the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
when the identified scene is of the second type, automatically recording video of the visual image content in the current scene or automatically capturing a continuous series of single images; if sound content is also determined to exist in the current scene, automatically recording that sound as well, marking each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously converting the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred.
In a third aspect, the present invention further provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the intelligent earphone processing method according to the second aspect when executing the program.
In a fourth aspect, the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the intelligent earphone processing method according to the second aspect.
According to the technical solutions above, the intelligent earphone provided by the invention comprises a scene identification module and a processing module. The scene identification module identifies the type of scene in which the intelligent earphone is currently located; the processing module performs processing adapted to the first type of scene when the identified scene is of the first type, and processing adapted to the second type of scene when the identified scene is of the second type, where the first type of scene is a conversation scene without visual image content and the second type is a scene with visual image content. The intelligent earphone can therefore automatically adapt its processing to the current scene: in a conversation scene without visual image content it automatically records audio, and in a scene with visual image content it automatically records video or takes photographs while recording audio. This makes the earphone genuinely intelligent; for example, it spares a user who forgot to start a recording from missing important information during a conference. The intelligent earphone is particularly suitable for occasions such as conferences, seminars, training sessions, lectures, film screenings and concerts, and its adaptive response to changes in the external scene improves the user experience and strengthens the user's reliance on the device.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic structural diagram of an intelligent headset according to an embodiment of the present invention;
fig. 2 is another schematic structural diagram of an intelligent headset according to an embodiment of the present invention;
fig. 3 is a flowchart of a processing method of an intelligent headset according to another embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to yet another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the present invention provides an intelligent headset, referring to fig. 1, including: a scene recognition module 11 and a processing module 12; wherein:
the scene identification module 11 is configured to identify a current scene type of the smart headset;
the processing module 12 is configured to perform processing adapted to the first type of scene when the identified scene is of the first type, and processing adapted to the second type of scene when the identified scene is of the second type;
wherein the first type of scene is a conversation scene without visual image content, and the second type of scene is a scene with visual image content.
It should be noted that the scene recognition module 11 in this embodiment can automatically recognize the type of scene in which the intelligent earphone is currently located. The recognizable scene types include at least two kinds. The first is a conversation scene without visual image content, such as a meeting, a talk or a discussion; such scenes are generally characterized by the conversation of two or more people and the absence of visual image content. The second is a scene with visual image content, such as a training session, a lecture, a film screening or a concert; such scenes are characterized by visual image content, which may be a slide show, a video clip, or a stage, screen or lighting display. A scene of the second type may or may not also have sound features: a film, a slide show with automatic speech narration or a slide show with live commentary has sound features, whereas a silently played slide show or a page-by-page display of a text file does not.
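Purely as an illustration of this taxonomy (the original specification contains no code), the scene types can be modeled as a small enumeration; the member names and the UNKNOWN fallback are assumptions, not terms from the patent:

```python
from enum import Enum, auto

class SceneType(Enum):
    """Scene taxonomy of this embodiment (names are illustrative)."""
    FIRST = auto()    # conversation scene without visual image content
    SECOND = auto()   # scene with visual image content (slides, video, stage)
    THIRD = auto()    # sleep and rest scene, introduced in a later embodiment
    UNKNOWN = auto()  # no confident classification
```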
Accordingly, the processing module 12 performs automatic adaptive processing according to the scene recognition result: when the current scene is identified as a conversation scene without visual image content, audio recording is performed automatically, and when it is identified as a scene with visual image content, video recording, or photographing together with audio recording, is performed automatically.
For example, suppose a user convenes a small meeting in a conference room but forgets to turn on the room's recording system or the recording function on a mobile phone; after the meeting, nobody can accurately reconstruct what each participant said. If the user wears the intelligent earphone provided by this embodiment, the problem is avoided: the earphone automatically recognizes that the current scene is of the first type and records the audio without any human intervention, which greatly helps the user and prevents important information from being missed because nobody remembered to record the meeting.
Similarly, when a user attends a training session with a slide presentation, it is often inconvenient to photograph the slides, and important slide content frequently goes uncaptured before the session ends. If the user wears the intelligent earphone provided by this embodiment, the earphone automatically recognizes that the current scene is of the second type and records video, or takes photographs and records audio, without human intervention, which is a great convenience.
It should be noted that, to support automatic video recording and photographing, the intelligent earphone is generally equipped with a video/photo module such as a miniature camera. The earphone can also be connected to other wearable devices and trigger them to record or photograph; for example, when connected to camera glasses, it triggers the glasses to record video or take photographs.
It should be noted that all the application scenarios mentioned in this embodiment are ones in which recording the audio and video content has been consented to; scenarios involving privacy or security concerns are outside the scope of this discussion.
As can be seen from the above, this embodiment provides an intelligent earphone comprising a scene recognition module and a processing module. The scene recognition module identifies the type of scene in which the earphone is currently located, and the processing module performs processing adapted to the identified type: automatic audio recording in a conversation scene without visual image content (the first type), and automatic video recording, or photographing plus audio recording, in a scene with visual image content (the second type). The earphone thereby becomes more intelligent, for example sparing a user who forgot to record a conference from missing important information. It is particularly suitable for conferences, training sessions, lectures, film screenings, concerts and similar occasions, and its adaptive response to scene changes improves the user experience and strengthens the user's reliance on the device.
In an optional implementation manner, the scene recognition module 11 is specifically configured to:
identify the type of scene in which the intelligent earphone is currently located according to the environment image information acquired by the image acquisition device on the earphone and the sound information acquired by the sound acquisition device on the earphone;
identify the current scene as the first type of scene when it is determined from the environment image information that no visual image content exists in the current scene and it is determined from the sound information that conversation between two or more people exists;
identify the current scene as the second type of scene when it is determined from the environment image information that visual image content exists in the current scene.
In this embodiment, whether visual image content exists in the current scene can be determined by checking whether the 360° environment image contains features such as a slide show, video playback or a lit stage or screen; if it does, visual image content is judged to exist, otherwise not. The check itself may be performed by feature matching or by analyzing image pixel brightness values.
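As one concrete reading of the pixel-brightness approach, the following sketch flags a frame as containing visual image content when a large bright region stands out from a darker surround; the thresholds and the pure-numpy formulation are illustrative assumptions, not values from the patent:

```python
import numpy as np

def has_visual_image_content(gray_frame: np.ndarray,
                             bright_thresh: float = 180.0,
                             min_area_ratio: float = 0.05,
                             min_contrast: float = 60.0) -> bool:
    """Pixel-brightness heuristic: a sufficiently large bright region that
    stands out from its surroundings (projected slide, video screen, lit
    stage) is taken as evidence of visual image content.

    gray_frame: 2-D uint8 grayscale environment image.
    All thresholds are illustrative assumptions.
    """
    frame = gray_frame.astype(np.float32)
    bright_mask = frame > bright_thresh
    if bright_mask.mean() < min_area_ratio:  # bright region too small
        return False
    if bright_mask.all():                    # uniformly bright: no contrast cue
        return False
    foreground = frame[bright_mask].mean()
    background = frame[~bright_mask].mean()
    return (foreground - background) > min_contrast
```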
In the present embodiment, to avoid misrecognition of the first type of scene and improve recognition accuracy, a further judgment step is added: when the environment image information indicates that no visual image content exists and the sound information indicates conversation between two or more people, the earphone additionally determines from the environment image whether the current scene is indoors, or specifically a conference room, and only then identifies the scene as the first type. Since formal discussions generally take place in a room or conference room, this step prevents casual chats among friends from being identified as first-type scenes.
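The full decision flow, including the extra indoor check, could then look like the sketch below, which reuses the SceneType enumeration sketched earlier; the boolean and count inputs are assumed to come from upstream image and audio analysis, and this is an illustration rather than the patented implementation:

```python
def classify_scene(visual_content: bool,
                   speaker_count: int,
                   is_indoor: bool) -> SceneType:
    """Image evidence wins first; a conversation counts as a first-type
    scene only with two or more speakers AND an indoor/conference-room
    setting (the extra check that filters out casual chats)."""
    if visual_content:
        return SceneType.SECOND
    if speaker_count >= 2 and is_indoor:
        return SceneType.FIRST
    return SceneType.UNKNOWN
```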
Based on the above, in an optional implementation, the processing module 12 is specifically configured to:
when the identified scene is of the first type, automatically record the sound in the current scene, mark each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously convert the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
when the identified scene is of the second type, automatically record video of the visual image content in the current scene or automatically capture a continuous series of single images; if sound content is also determined to exist in the current scene, automatically record that sound as well, mark each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously convert the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred.
As can be seen, in this embodiment, when the identified scene is of the first type, the sound in the current scene is automatically recorded, each utterance is marked with the speaker ID identified from the speaker's audio features, and the recording is synchronously converted into a text file in which speaker IDs and their speech content are stored in the order in which they actually occurred. The result is a recording file plus a text file of the following form (the names are placeholders):
Zhang San: speech content (optionally with speaking duration);
Li Si: speech content;
Wang Wu: speech content;
Zhao Liu: speech content;
Zhang San: speech content;
Wang Wu: speech content;
Sun Jiu: speech content.
This processing faithfully reconstructs each person's contributions: after the meeting ends, the user has a text file that records the entire proceedings in the manner of meeting minutes, attributing each remark to its speaker and showing each speaker's contributions and the interactions among speakers clearly in text form. This helps whoever summarizes the meeting to organize the material, or the file can be used directly as the meeting record, saving a great deal of transcription work. Speaking-duration information can also be added to the text file.
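A minimal sketch of assembling such a text file, assuming diarization (speaker IDs from audio features) and speech-to-text are supplied by other components; the Utterance structure and the names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    start: float       # seconds from the start of the recording
    speaker_id: str    # label derived from the speaker's audio features
    text: str          # speech-to-text result for this utterance
    duration: float = 0.0

def build_transcript(utterances, with_duration: bool = False) -> str:
    """Sort utterances by their actual time of occurrence and emit one
    line per utterance pairing the speaker ID with the speech content."""
    lines = []
    for u in sorted(utterances, key=lambda u: u.start):
        suffix = f" ({u.duration:.0f}s)" if with_duration and u.duration else ""
        lines.append(f"{u.speaker_id}: {u.text}{suffix}")
    return "\n".join(lines)

# Mirrors the sample transcript in the description:
print(build_transcript([
    Utterance(0.0, "Zhang San", "opening remarks", 12.0),
    Utterance(12.5, "Li Si", "follow-up question", 6.0),
    Utterance(19.0, "Wang Wu", "reply", 8.0),
], with_duration=True))
```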
In addition, when the identified scene is of the second type, the earphone can choose, according to need, between automatic video recording of the visual image content and automatic continuous capture of single images. For example, when only still pictures of the slides being shown are needed, single images are captured automatically at a preset photographing interval; when the complete slide presentation needs to be captured, the visual image content of the scene is recorded as video.
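The choice between interval photographing and video recording might be wrapped as follows; the `camera` object and its methods are hypothetical, since the patent names no camera API:

```python
import time

def capture_visual_content(camera, mode: str = "photos",
                           interval_s: float = 5.0, shots: int = 10):
    """Either continuous single-image capture at a preset photographing
    interval, or full video recording of the scene's visual content."""
    if mode == "photos":
        for _ in range(shots):  # capture at the preset interval
            camera.take_photo()
            time.sleep(interval_s)
    elif mode == "video":
        camera.start_video()    # stopped with camera.stop_video() when the scene ends
    else:
        raise ValueError(f"unknown capture mode: {mode!r}")
```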
In an alternative embodiment, referring to fig. 2, the smart headset further comprises: a first trigger module 13, a second trigger module 14 and a third trigger module 15;
the first trigger module 13 is configured to, after receiving a first trigger signal of a user, automatically perform automatic recording processing on sound of a current scene, mark a speaker ID to which the corresponding sound belongs according to an audio feature of the speaker while recording, and synchronously convert a recording file into a text file while recording, where the text file sequentially and correspondingly stores the speaker ID and corresponding speech content according to an actual occurrence time sequence;
the second trigger module 14 is configured to, after receiving a second trigger signal of a user, automatically perform automatic video recording processing on visual image content in a current scene, and if it is determined that sound content exists in the current scene at the same time, automatically record sound in the current scene at the same time, and simultaneously record the sound according to a speaker ID to which the sound belongs according to an audio feature tag of the speaker, and synchronously convert a recording file into a text file during recording, where the text file sequentially and correspondingly stores the speaker ID and the corresponding speech content according to an actually occurring time sequence;
the third triggering module 15 is configured to automatically perform single-image acquisition on the visual image in the current scene after receiving a third triggering signal every time the third triggering signal is received by the user.
As can be seen, in addition to the automatic scene recognition and automatic processing described above, the intelligent earphone can also perform processing on demand when triggered by the user.
For example, during a training session the user may want to choose freely which images to capture and which to skip. In that case, whenever the user sees content worth saving, he or she can trigger the third trigger module 15, and the earphone automatically captures a single image of the current scene. The third trigger module 15 is typically a key on the earphone, say key 3: when the user wants to capture the slide currently being shown, pressing key 3 triggers the earphone to take a single image of the scene.
In this embodiment, the first, second and third trigger modules 13, 14 and 15 may be used when the earphone's automatic scene recognition function has been turned off. They can also serve as a simple, convenient fallback when automatic scene recognition fails or produces an incorrect result.
It should also be noted that the processing of the third trigger module 15 can run in parallel with that of the first trigger module 13 or the second trigger module 14. For example, key slide pages can be photographed while the presentation is being video-recorded (if necessary, two independent cameras can be provided, one for video and one for photographs). This yields both the complete video and high-value still images of the most important slides, which the user can consult as needed.
In an optional embodiment, the scene recognition module 11 is further configured to recognize whether the type of scene in which the intelligent earphone is currently located is a third type of scene, the third type being a sleep and rest scene;
correspondingly, the processing module 12 is further configured to detect whether the user is snoring when the identified scene is of the third type and, if so, to issue a snore reminder through a vibration module or a music reminder module installed on the earphone so that the user can adjust his or her sleeping posture.
It can be seen that in this embodiment the scene recognition module 11 can also recognize a sleep and rest scene: when snoring is detected, the user is reminded through the vibration module or music reminder module so that he or she can adjust the sleeping posture, protecting the user's health.
It should be noted that a sleep and rest scene may be recognized from information such as the user's heartbeat and blood pressure, or by capturing an image and identifying, for example, whether the user is in a sleeping posture.
It should be noted that snoring may be detected with a dedicated snore sensor or with any other sensor capable of detecting snoring; the invention does not limit the choice.
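For illustration only, a purely audio-based snore check could count repeated low-frequency, high-energy bursts, as sketched below; all thresholds are assumptions, and the embodiment explicitly leaves the sensor choice open:

```python
import numpy as np

def looks_like_snoring(audio: np.ndarray, sample_rate: int,
                       rms_thresh: float = 0.02,
                       low_band_hz: float = 300.0,
                       min_bursts: int = 3) -> bool:
    """Count half-second frames whose energy sits mostly below ~300 Hz;
    several such bursts in a window suggest snoring.

    audio: mono samples as floats in [-1, 1].
    """
    frame_len = int(0.5 * sample_rate)  # half-second frames
    bursts = 0
    for i in range(len(audio) // frame_len):
        chunk = audio[i * frame_len:(i + 1) * frame_len]
        rms = float(np.sqrt(np.mean(chunk ** 2)))
        if rms < rms_thresh:            # too quiet to be a snore burst
            continue
        spectrum = np.abs(np.fft.rfft(chunk))
        freqs = np.fft.rfftfreq(len(chunk), d=1.0 / sample_rate)
        low_ratio = spectrum[freqs < low_band_hz].sum() / (spectrum.sum() + 1e-9)
        if low_ratio > 0.7:             # energy concentrated in the low band
            bursts += 1
    return bursts >= min_bursts
```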
In an optional implementation, the processing module 12 is further configured to detect whether an earthquake or fire hazard exists in the current scene when the identified scene is of the third type and, if so, to issue a hazard warning through a vibration module or an alarm module installed on the earphone so that the user can escape from the scene as soon as possible.
Thus, when the current scene is a sleep and rest scene, the earphone also monitors for earthquake and fire hazards and, when one is detected, warns the user through the vibration module or alarm module so that the user can escape as soon as possible, protecting the user's life.
It should be noted that earthquake or fire hazards may be detected with a vibration sensor, a temperature sensor, or any other sensor capable of detecting such hazards; the invention does not limit the choice. It should also be noted that the optional embodiments provided herein may be freely combined provided their logic or structure does not conflict; the invention is not limited in this respect.
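A sketch of the sleep-scene hazard check under stated assumptions (the threshold values and the `alert` object with vibrate() and alarm() methods are invented for illustration):

```python
def check_sleep_scene_hazards(vibration_g: float, temperature_c: float,
                              alert) -> None:
    """A vibration-sensor reading suggesting an earthquake, or a
    temperature reading suggesting fire, triggers the vibration/alarm
    reminder described in this embodiment."""
    if vibration_g > 0.3:     # sustained strong shaking (illustrative)
        alert.vibrate()
        alert.alarm("Possible earthquake - leave the building immediately")
    if temperature_c > 60.0:  # abnormal ambient heat (illustrative)
        alert.vibrate()
        alert.alarm("Possible fire - evacuate now")
```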
Based on the same inventive concept, another embodiment of the present invention further provides an intelligent headset processing method, referring to fig. 3, the method including the following steps:
step 101: and identifying the type of the scene where the intelligent earphone is currently located.
Step 102: when the identified scene is a first type of scene, processing adaptive to the first type of scene is carried out; when the identified scene is a second type scene, processing adaptive to the second type scene is carried out; wherein, the first kind of scenes are conversation scenes without visual image contents; the second type of scenes are scenes with visual image content.
In an alternative embodiment, the step 101 may be implemented as follows:
identifying the type of scene in which the intelligent earphone is currently located according to the environment image information acquired by the image acquisition device on the earphone and the sound information acquired by the sound acquisition device on the earphone;
identifying the current scene as the first type of scene when it is determined from the environment image information that no visual image content exists in the current scene and it is determined from the sound information that conversation between two or more people exists;
identifying the current scene as the second type of scene when it is determined from the environment image information that visual image content exists in the current scene.
In an alternative embodiment, the step 102 may be implemented as follows:
performing processing adapted to the first type of scene when the identified scene is of the first type, and performing processing adapted to the second type of scene when the identified scene is of the second type, specifically includes:
when the identified scene is of the first type, automatically recording the sound in the current scene, marking each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously converting the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
when the identified scene is of the second type, automatically recording video of the visual image content in the current scene or automatically capturing a continuous series of single images; if sound content is also determined to exist in the current scene, automatically recording that sound as well, marking each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously converting the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred.
In an optional implementation manner, the intelligent headphone processing method provided in this embodiment further includes:
after receiving a first trigger signal from the user, automatically recording the sound of the current scene, marking each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously converting the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
after receiving a second trigger signal from the user, automatically recording video of the visual image content in the current scene and, if sound content is determined to exist in the current scene at the same time, automatically recording that sound as well, marking each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously converting the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
automatically capturing a single image of the visual content in the current scene each time a third trigger signal is received from the user.
In an optional implementation manner, the intelligent headphone processing method provided in this embodiment further includes:
identifying whether the type of scene in which the intelligent earphone is currently located is a third type of scene, the third type being a sleep and rest scene; detecting whether the user is snoring when the identified scene is of the third type and, if so, issuing a snore reminder through a vibration module or a music reminder module installed on the earphone so that the user can adjust his or her sleeping posture.
In an optional implementation manner, the intelligent headphone processing method provided in this embodiment further includes:
detecting whether an earthquake or fire hazard exists in the current scene when the identified scene is of the third type and, if so, issuing a hazard warning through a vibration module or an alarm module installed on the intelligent earphone so that the user can escape from the scene as soon as possible.
The intelligent earphone processing method provided by this embodiment can be implemented with the intelligent earphone described above; the working principle and beneficial effects are similar, and details can be found in the description of the earlier embodiments, so they are not repeated here.
It should be noted that the optional embodiments provided herein may be freely combined provided their logic or structure does not conflict; the invention is not limited in this respect.
Based on the same inventive concept, another embodiment of the present invention provides an electronic device, which specifically includes the following components, with reference to fig. 4: a processor 301, a memory 302, a communication interface 303, and a bus 304;
the processor 301, the memory 302 and the communication interface 303 communicate with one another through the bus 304; the communication interface 303 is used for information transmission between the electronic device and related external devices;
the processor 301 is configured to call the computer program in the memory 302; when executing the program, the processor implements all the steps of the intelligent earphone processing method described above, for example:
step 101: and identifying the type of the scene where the intelligent earphone is currently located.
Step 102: when the identified scene is a first type of scene, processing adaptive to the first type of scene is carried out; when the identified scene is a second type scene, processing adaptive to the second type scene is carried out; wherein, the first kind of scenes are conversation scenes without visual image contents; the second type of scenes are scenes with visual image content.
Based on the same inventive concept, another embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements all the steps of the intelligent earphone processing method described above, for example:
step 101: and identifying the type of the scene where the intelligent earphone is currently located.
Step 102: when the identified scene is a first type of scene, processing adaptive to the first type of scene is carried out; when the identified scene is a second type scene, processing adaptive to the second type scene is carried out; wherein, the first kind of scenes are conversation scenes without visual image contents; the second type of scenes are scenes with visual image content.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above examples are only for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (7)

1. An intelligent earphone, comprising:
the scene identification module is used for identifying the current scene type of the intelligent earphone;
the processing module is used for performing processing adapted to a first type of scene when the identified scene is of the first type, and performing processing adapted to a second type of scene when the identified scene is of the second type;
wherein the first type of scene is a conversation scene without visual image content, and the second type of scene is a scene with visual image content, the visual image content comprising a slide show, a video clip, or a stage, screen or lighting display;
the scene recognition module is specifically configured to:
identify the type of scene in which the intelligent earphone is currently located according to environment image information acquired by an image acquisition device on the intelligent earphone and sound information acquired by a sound acquisition device on the intelligent earphone;
identify the type of the current scene as the first type of scene when it is determined from the environment image information acquired by the image acquisition device that no visual image content exists in the current scene, and it is determined from the sound information acquired by the sound acquisition device that conversation between two or more people exists;
identify the type of the current scene as the second type of scene when it is determined from the environment image information acquired by the image acquisition device that visual image content exists in the current scene;
wherein the processing module is specifically configured to:
when the identified scene is of the first type, automatically record the sound in the current scene, mark each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously convert the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
when the identified scene is of the second type, automatically record video of the visual image content in the current scene or automatically capture a continuous series of single images; if sound content is also determined to exist in the current scene, automatically record that sound as well, mark each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously convert the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred.
2. The intelligent earphone of claim 1, further comprising a first trigger module, a second trigger module and a third trigger module;
wherein the first trigger module is configured to, after receiving a first trigger signal from the user, automatically record the sound of the current scene, mark each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously convert the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
the second trigger module is configured to, after receiving a second trigger signal from the user, automatically record video of the visual image content in the current scene and, if sound content also exists in the current scene, simultaneously record that sound, mark each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously convert the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
and the third trigger module is configured to automatically capture a single image of the visual content in the current scene each time a third trigger signal is received from the user.
3. The intelligent earphone of claim 1, wherein the scene identification module is further configured to identify whether the type of scene in which the intelligent earphone is currently located is a third type of scene, the third type being a sleep and rest scene;
correspondingly, the processing module is further configured to detect whether the user is snoring when the identified scene is of the third type and, if so, to issue a snore reminder through a vibration module or a music reminder module installed on the intelligent earphone so that the user can adjust his or her sleeping posture.
4. The intelligent earphone of claim 3, wherein the processing module is further configured to detect whether an earthquake or fire hazard exists in the current scene when the identified scene is of the third type and, if so, to issue a hazard warning through a vibration module or an alarm module installed on the intelligent earphone so that the user can escape from the scene as soon as possible.
5. An intelligent earphone processing method, comprising:
identifying the type of a scene where the intelligent earphone is currently located;
when the identified scene is a first type of scene, performing processing adapted to the first type of scene;
when the identified scene is a second type of scene, performing processing adapted to the second type of scene;
wherein the first type of scene is a conversation scene without visual image content, and the second type of scene is a scene with visual image content, the visual image content comprising a slide show, a video clip, or a stage, screen or lighting display;
wherein identifying the type of scene in which the intelligent earphone is currently located specifically includes:
identifying the type of scene in which the intelligent earphone is currently located according to environment image information acquired by an image acquisition device on the intelligent earphone and sound information acquired by a sound acquisition device on the intelligent earphone;
identifying the type of the current scene as the first type of scene when it is determined from the environment image information acquired by the image acquisition device that no visual image content exists in the current scene, and it is determined from the sound information acquired by the sound acquisition device that conversation between two or more people exists;
identifying the type of the current scene as the second type of scene when it is determined from the environment image information acquired by the image acquisition device that visual image content exists in the current scene;
wherein performing processing adapted to the first type of scene when the identified scene is of the first type, and performing processing adapted to the second type of scene when the identified scene is of the second type, specifically includes:
when the identified scene is of the first type, automatically recording the sound in the current scene, marking each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously converting the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred;
when the identified scene is of the second type, automatically recording video of the visual image content in the current scene or automatically capturing a continuous series of single images; if sound content is also determined to exist in the current scene, automatically recording that sound as well, marking each utterance with the speaker ID identified from the speaker's audio features while recording, and synchronously converting the recording into a text file in which speaker IDs and their corresponding speech content are stored in the order in which they actually occurred.
6. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the intelligent earphone processing method of claim 5 when executing the program.
7. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the intelligent earphone processing method of claim 5.
CN201811033439.9A 2018-09-05 2018-09-05 Intelligent earphone, intelligent earphone processing method, electronic device and storage medium Active CN109151642B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811033439.9A CN109151642B (en) 2018-09-05 2018-09-05 Intelligent earphone, intelligent earphone processing method, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN109151642A (en) 2019-01-04
CN109151642B (en) 2019-12-24

Family

ID=64827097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811033439.9A Active CN109151642B (en) 2018-09-05 2018-09-05 Intelligent earphone, intelligent earphone processing method, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN109151642B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110866432B (en) * 2019-04-10 2021-04-13 信创未来(天津)科技有限公司 Multifunctional user mobile terminal control system
CN109887533B (en) * 2019-04-10 2020-04-24 郑州轻工业大学 Multifunctional user mobile terminal control system and method
CN112019960A (en) * 2019-05-28 2020-12-01 深圳市冠旭电子股份有限公司 Method for monitoring scenes by utilizing earphone, device and readable storage medium
CN110248265B (en) * 2019-05-31 2021-04-02 湖北工业大学 Headset with danger early warning function
EP4150922A1 (en) 2020-08-10 2023-03-22 Google LLC Systems and methods for control of an acoustic environment
CN111935581A (en) * 2020-08-13 2020-11-13 长春市长光芯忆科技有限公司 Electronic memory earphone

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009147752A (en) * 2007-12-14 2009-07-02 Sharp Corp Data broadcast compatible mobile view terminal
CN101552826B (en) * 2009-05-04 2012-01-11 中兴通讯股份有限公司 Videophone service automatic answering machine and device
CN201414159Y (en) * 2009-05-06 2010-02-24 珠海市东耀企业有限公司 Multimedia terminal machine
CN102169642B (en) * 2011-04-06 2013-04-03 沈阳航空航天大学 Interactive virtual teacher system having intelligent error correction function
CN102411833A (en) * 2011-08-02 2012-04-11 杭州威威网络科技有限公司 Networking alarm apparatus based on audio identification
CN102982800A (en) * 2012-11-08 2013-03-20 鸿富锦精密工业(深圳)有限公司 Electronic device with audio video file video processing function and audio video file processing method
CN103309855A (en) * 2013-06-18 2013-09-18 江苏华音信息科技有限公司 Audio-video recording and broadcasting device capable of translating speeches and marking subtitles automatically in real time for Chinese and foreign languages
CN103956014A (en) * 2014-05-04 2014-07-30 福建创高安防技术股份有限公司 Remote image recognition antitheft method and system
CN104038717B (en) * 2014-06-26 2017-11-24 北京小鱼在家科技有限公司 A kind of intelligent recording system
CN204069102U (en) * 2014-08-07 2014-12-31 深圳市微思客技术有限公司 Interactive bluetooth earphone and mobile terminal
CN105407379A (en) * 2014-08-26 2016-03-16 天脉聚源(北京)教育科技有限公司 Synchronous recording method for multiple media
US9736580B2 (en) * 2015-03-19 2017-08-15 Intel Corporation Acoustic camera based audio visual scene analysis
CN107527623B (en) * 2017-08-07 2021-02-09 广州视源电子科技股份有限公司 Screen transmission method and device, electronic equipment and computer readable storage medium

Legal Events

PB01 Publication

SE01 Entry into force of request for substantive examination

GR01 Patent grant

CB03 Change of inventor or designer information
Inventor after: Deng Di; Cheng Fang
Inventor before: Deng Di

CP01 Change in the name or title of a patent holder
First change: Yuntai Jinke (Beijing) Technology Co.,Ltd. to Taiyi Yunjia (Beijing) Technology Co.,Ltd.
Second change: Taiyi Yunjia (Beijing) Technology Co.,Ltd. to Taiyi Yunjing (Beijing) Technology Co.,Ltd.
Address (unchanged): 100000 C5-05, F1, Building 19, No. 10, Langjiayuan, Jianguomenwai, Chaoyang District, Beijing

CP03 Change of name, title or address
Patentee after: Yuntai Jinke (Beijing) Technology Co.,Ltd.
Address after: 100000 C5-05, F1, Building 19, No. 10, Langjiayuan, Jianguomenwai, Chaoyang District, Beijing
Patentee before: BEIJING JINLIAN TECHNOLOGY Co.,Ltd.
Address before: Room A-5524, Building 3, No. 20 Yong'an Road, Shilong Economic Development Zone, Mentougou District, Beijing 102300

TR01 Transfer of patent right
Effective date of registration: 20221229
Patentee after: Beijing Taiyi Digital Technology Co.,Ltd.
Address after: 101100 3586, Floor 1, Building 3, No. 6, Guoxing Second Street, Tongzhou District, Beijing
Patentee before: Taiyi Yunjing (Beijing) Technology Co.,Ltd.
Address before: 100000 C5-05, F1, Building 19, No. 10, Langjiayuan, Jianguomenwai, Chaoyang District, Beijing

TR01 Transfer of patent right
Effective date of registration: 20231107
Patentee after: Beijing Qinhai Technology Co.,Ltd.
Address after: 903-76, 9th Floor, Building 17, Yard 30, Shixing Street, Shijingshan District, Beijing, 100000
Patentee before: Beijing Taiyi Digital Technology Co.,Ltd.
Address before: 101100 3586, Floor 1, Building 3, No. 6, Guoxing Second Street, Tongzhou District, Beijing