WO2021077840A1 - Gesture control method and apparatus - Google Patents

Gesture control method and apparatus

Publication number
WO2021077840A1
Authority
WO
WIPO (PCT)
Prior art keywords
gesture recognition
recognition result
gesture
target
sequence
Prior art date
Application number
PCT/CN2020/105593
Other languages
French (fr)
Chinese (zh)
Inventor
曾彬
肖琴
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Priority to JP2021544350A (patent JP7479388B2)
Priority to KR1020217034498A (patent KR20210141688A)
Publication of WO2021077840A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Definitions

  • the present disclosure relates to computer vision technology, and in particular to a gesture control method and device.
  • the touch screen of a smart phone is a human-computer interaction system realized by touch.
  • products that are controlled through voice interaction. For example, users input related instructions through voice, and products perform related operations according to the voice input instructions.
  • the embodiments of the present disclosure provide at least one gesture control method and device.
  • a gesture control method includes: performing gesture recognition processing on each of N consecutive frames of images in a video stream collected by a camera to obtain a gesture recognition result sequence, where the gesture recognition result sequence includes the recognition results of multiple gestures included in the N frames of images; in response to the number of identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, determining that the identical gesture recognition result is a target gesture recognition result, where N and M are integers greater than 1 and N is greater than or equal to M; and sending a control instruction corresponding to the target gesture recognition result to a target device, or controlling the target device to perform the operation corresponding to the target gesture recognition result.
  • a gesture control device comprising: a recognition processing module, configured to perform gesture recognition processing on each of N consecutive frames of images in a video stream collected by a camera to obtain a gesture recognition result sequence, where the gesture recognition result sequence includes the recognition results of multiple gestures included in the N frames of images; a gesture determination module, configured to determine, in response to the number of identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, that the identical gesture recognition result is the target gesture recognition result, where N and M are both integers greater than 1 and N is greater than or equal to M; and an operation control module, configured to send the control instruction corresponding to the target gesture recognition result to the target device, or to control the target device to perform an operation corresponding to the target gesture recognition result.
  • an electronic device that includes a processor and a memory on which a computer program is stored.
  • the computer program can be executed by the processor to implement the gesture control method according to the first aspect of the present disclosure.
  • a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the gesture control method according to the first aspect of the present disclosure.
  • With the gesture control method and device, a gesture is confirmed to be valid only when it is determined that a preset number of identical gesture recognition results are obtained, and the valid gesture is determined as the target gesture recognition result, which can avoid false triggering of gestures to a certain extent and improve the accuracy of gesture recognition.
  • Fig. 1 shows a flow chart of a gesture control method according to at least one embodiment of the present disclosure
  • Fig. 1a shows a schematic diagram of a static gesture according to at least one embodiment of the present disclosure
  • Fig. 1b shows a schematic diagram of a dynamic gesture according to at least one embodiment of the present disclosure
  • Fig. 2 shows a flowchart of another gesture control method according to at least one embodiment of the present disclosure
  • Fig. 3 shows a flowchart of another gesture control method according to at least one embodiment of the present disclosure
  • Fig. 4 shows a schematic diagram of a functional interface of a music player according to at least one embodiment of the present disclosure
  • Fig. 5 shows a block diagram of a gesture control device according to at least one embodiment of the present disclosure
  • Fig. 6 shows a block diagram of another gesture control device according to at least one embodiment of the present disclosure
  • Fig. 7 shows a block diagram of an electronic device according to at least one embodiment of the present disclosure.
  • the embodiments of the present disclosure provide a gesture control method to control a device through gesture interaction.
  • FIG. 1 shows a flow chart of a gesture control method according to at least one embodiment of the present disclosure.
  • the method may be executed by a gesture control device, and the method may include step 100 to step 104.
  • In step 100, gesture recognition processing is performed on each of N consecutive frames of images in the video stream collected by the camera to obtain a gesture recognition result sequence.
  • the device may be referred to as a target device, and controlling the target device may be controlling a functional component in the target device, and the functional component may be a hardware or software module.
  • The target device includes but is not limited to a vehicle, and controlling the target device may include, but is not limited to, controlling one or more functional components provided in the vehicle, such as a media player, an air conditioner controller, and a window controller. It is understandable that the target device may also be another device such as a mobile phone, a TV, an air conditioner, a stereo, or a smart home device.
  • the camera can be used to collect the video stream of the user's gestures.
  • the camera on the target device can be used to collect it.
  • The video stream includes N consecutive frames of images collected by the camera, and the gestures in the images are gestures made when the user wants to control the operation of the functional components in the target device. N is an integer greater than 1.
  • By performing gesture recognition processing on each of the N frames of images in the aforementioned video stream, a gesture recognition result sequence can be obtained, and the gesture recognition result sequence includes recognition results of multiple gestures.
  • the gestures made by the user can be static gestures or dynamic gestures.
  • Figures 1a and 1b illustrate some gestures, but it is understandable that the actual implementation is not limited to these gestures.
  • Fig. 1a shows a series of static gestures: OK gesture, V gesture, like gesture, palm gesture, index finger gesture, and fist gesture.
  • Fig. 1b shows a series of dynamic gestures: fist-palm change (fist-to-palm, palm-to-fist), palm translation (up, down, left, and right), index finger rotation (clockwise, counterclockwise).
  • the recognition result of the multiple gestures included in the gesture recognition result sequence may be a static gesture: for example, the gesture recognized in the image is a V gesture, or the gesture recognized in the image is an OK gesture.
  • the obtained gesture recognition result sequence may also include multiple dynamic gestures, for example, multiple “palm translation” gestures are recognized.
  • the gesture recognition result sequence may also be a combination of static gestures and dynamic gestures.
  • the gesture recognition result sequence includes an OK gesture and a palm translation gesture.
  • the gesture recognition in this step can be performed by a pre-trained gesture recognition neural network, and the image collected by the camera is input into the neural network, and the gesture recognition result corresponding to the image can be obtained.
  • In step 102, in response to the number of identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, the identical gesture recognition result is determined to be the target gesture recognition result.
  • To determine the target gesture recognition result, it can be set that a gesture is confirmed to be valid only when a preset number of identical gesture recognition results are obtained; the valid gesture is called the target gesture recognition result.
  • the preset number can be set to M, M is also an integer greater than 1, and N is greater than or equal to M.
  • the recognized "V gesture” is confirmed as the target gesture recognition result.
  • this "palm translation gesture" is the target gesture recognition result, where each palm translation gesture can be represented by multiple frames of images.
  • If the number of gestures recognized does not reach the preset number, these images are discarded and recognition is performed again. For example, if three V gestures are recognized in N consecutive frames of images and the preset number "five" is not reached, the three V gestures are discarded and gesture recognition is performed again on N consecutive frames of images.
  • If the target gesture recognition result is determined, step 104 is performed. Otherwise, step 100 to step 102 are performed again.
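  • The check in step 102 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosure's implementation: the function name `find_target_gesture`, the use of string labels for recognition results, and the tie-breaking behavior (the label encountered first wins) are all hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def find_target_gesture(results: Sequence[str], m: int) -> Optional[str]:
    """Return the label that occurs at least m times in the sequence, or None.

    `results` stands for the gesture recognition result sequence of one
    N-frame window; `m` is the preset number M (an integer greater than 1).
    """
    if m < 2 or len(results) < m:
        return None
    # most_common(1) yields the most frequent label and its count;
    # ties are resolved in favor of the label seen first (an assumption here).
    label, count = Counter(results).most_common(1)[0]
    return label if count >= m else None
```

  • With M = 5, a window containing five V results and two fist results yields "V", while a window with only three V results yields no target and recognition continues.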
  • In step 104, a control instruction corresponding to the target gesture recognition result is sent to the target device, or the target device is controlled to perform an operation corresponding to the target gesture recognition result.
  • The corresponding target device can be controlled according to the target gesture recognition result determined above. Specifically, a functional component in the target device can be controlled.
  • For example, if the functional component is a media player, such as a volume control module for playing music in a vehicle, the volume can be increased or decreased according to the target gesture recognition result.
  • A control instruction corresponding to the target gesture recognition result can be sent to the target device, and the target device operates according to the instruction; alternatively, the gesture control apparatus of this embodiment can directly control the target device to perform the operation corresponding to the target gesture recognition result.
  • With the gesture control method of this embodiment, a gesture is confirmed to be valid only when it is determined that a preset number of identical gesture recognition results are obtained, and the valid gesture is determined as the target gesture recognition result, which can avoid false triggering of gestures to a certain extent and improve the accuracy of gesture recognition. For example, if the user accidentally makes a certain gesture, as long as the number of identical gesture recognition results for that gesture does not reach the preset number, the gesture will not be recognized as a valid target gesture recognition result, so the target device will not respond to the gesture, which reduces false triggers.
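  • The overall flow of steps 100 to 104 (recognize N frames, check against M, send an instruction or discard and retry) might look like the following Python sketch. The name `run_gesture_control`, the `send` callback standing in for the target device, and the choice of non-overlapping windows are assumptions for illustration; the disclosure does not prescribe them.

```python
from collections import Counter
from typing import Callable, Iterable, List


def run_gesture_control(
    results: Iterable[str], n: int, m: int, send: Callable[[str], None]
) -> None:
    """Group per-frame results into windows of n frames and dispatch valid gestures.

    `results` is a stream of per-frame recognition labels (hypothetical
    stand-in for the output of a gesture recognition network).
    """
    if not (1 < m <= n):
        raise ValueError("require 1 < M <= N")
    window: List[str] = []
    for result in results:
        window.append(result)
        if len(window) < n:
            continue
        label, count = Counter(window).most_common(1)[0]
        if count >= m:
            send(label)  # step 104: send the control instruction
        # Whether or not a target was found, start over with the next N frames.
        window = []
```

  • An accidental gesture that appears in fewer than M frames of a window is dropped without any instruction being sent, which is the false-trigger protection described above.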
  • FIG. 2 shows another gesture control method according to at least one embodiment of the present disclosure.
  • the method may include step 200 to step 208, wherein the same steps as those in FIG. 1 will not be described in detail.
  • In step 200, multiple frames of images collected by a camera are received, and the gesture in the images is a gesture made when the user wants to control a functional component in the target device.
  • The multiple frames of images may be N consecutive frames of images included in the video stream collected by the camera.
  • In step 202, gesture recognition processing is performed on the multiple frames of images to obtain a gesture recognition result sequence.
  • the image collected by the camera has multiple frames, and multiple gestures can be recognized according to the multiple frames of images, and these multiple gestures can form a sequence of gesture recognition results.
  • the gesture recognition result sequence may include "V, V, V, V, V, V, fist, V, V".
  • Here the "V" results can be referred to as multiple identical gesture recognition results, and the "fist" result can be referred to as a difference gesture recognition result, that is, a gesture recognition result that differs from the identical gesture recognition results.
  • the number of different gesture recognition results may also be multiple.
  • In step 204, in response to a difference gesture recognition result being present among the multiple identical gesture recognition results included in the gesture recognition result sequence, and the proportion of difference gesture recognition results in the gesture recognition result sequence being lower than a preset value, the difference gesture recognition result is smoothed, where the difference gesture recognition result is different from the identical gesture recognition results.
  • Determining that the number of consecutive identical gesture recognition results included in the gesture recognition result sequence is greater than or equal to M includes: determining that the number of consecutive identical gesture recognition results included in the gesture recognition result sequence after the smoothing process is greater than or equal to M.
  • The smoothing process includes, but is not limited to, any of the following:
  • The difference gesture recognition result may be corrected to the identical gesture recognition result; for example, a fist gesture may be corrected to a V gesture.
  • For example, the aforementioned gesture recognition result sequence "V, V, V, V, V, V, fist, V, V" is modified to "V, V, V, V, V, V, V, V, V".
  • The difference gesture recognition result can also be removed from the gesture recognition result sequence.
  • For example, the above sequence "V, V, V, V, V, V, fist, V, V" can be modified to "V, V, V, V, V, V, V, V".
  • the gesture recognition result whose time sequence is before the difference gesture recognition result and the gesture recognition result whose time sequence is after the difference gesture recognition result can also be used as consecutive multiple gesture recognition results. That is, the gesture recognition result sequence "V, V, V, V, V, V, fist, V, V" is considered to recognize eight consecutive V gestures, and fist gestures are ignored.
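  • The smoothing options above can be sketched in Python as follows. This is an illustrative sketch only: the function name `smooth_results`, the default `max_ratio` of 0.25, the `mode` strings, and the use of the most frequent label as "the identical gesture recognition result" are assumptions not specified in the disclosure. The third option (treating the results before and after the difference result as consecutive) gives the same run length as removal, so it is not modeled separately.

```python
from collections import Counter
from typing import List, Sequence


def smooth_results(
    results: Sequence[str], max_ratio: float = 0.25, mode: str = "correct"
) -> List[str]:
    """Smooth difference results that sit between identical results.

    mode="correct" rewrites each difference result to the majority label;
    mode="remove" deletes it from the sequence.
    """
    if mode not in ("correct", "remove"):
        raise ValueError("mode must be 'correct' or 'remove'")
    if not results:
        return []
    majority, _ = Counter(results).most_common(1)[0]
    first = results.index(majority)
    last = len(results) - 1 - list(reversed(results)).index(majority)
    # Difference results located between the first and last majority result.
    inner = {i for i in range(first, last + 1) if results[i] != majority}
    # Smooth only when the share of difference results is below the preset value.
    if not inner or len(inner) / len(results) >= max_ratio:
        return list(results)
    if mode == "correct":
        return [majority if i in inner else r for i, r in enumerate(results)]
    return [r for i, r in enumerate(results) if i not in inner]
```

  • For the sequence "V, V, V, V, V, V, fist, V, V", the "correct" mode produces nine V results and the "remove" mode produces eight.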
  • In step 206, for the smoothed gesture recognition result sequence, if the sequence includes a preset number of consecutive identical gesture recognition results, it is confirmed that the target gesture recognition result is recognized.
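  • The consecutive-run check of step 206 can be illustrated with a short Python sketch. The function name `target_from_consecutive` is hypothetical, and returning the first label whose run reaches M is an assumption about how ties between gestures would be handled.

```python
from itertools import groupby
from typing import Optional, Sequence


def target_from_consecutive(results: Sequence[str], m: int) -> Optional[str]:
    """Return the first label that appears at least m times in a row, or None."""
    for label, run in groupby(results):
        # groupby yields one group per maximal run of equal adjacent labels.
        if sum(1 for _ in run) >= m:
            return label
    return None
```

  • Applied to an unsmoothed sequence such as "V, V, V, V, V, V, fist, V, V" with M = 8, no run is long enough; after the fist result is corrected or removed, the run of V results reaches eight and the V gesture is confirmed.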
  • In step 208, a control instruction corresponding to the target gesture recognition result is sent to the target device, or the target device is controlled to perform an operation corresponding to the target gesture recognition result.
  • With the gesture control method of this embodiment, a gesture is confirmed to be valid only when a preset number of identical gestures are recognized, which improves the accuracy of gesture recognition; and by smoothing the difference gesture recognition results, the sensitivity of gesture recognition can also be increased and the response speed of gesture recognition improved.
  • For example, suppose a preset number of V gestures, such as ten V gestures, must be recognized, but nine V gestures and two fist gestures are recognized.
  • Without smoothing, the nine V gestures would have to be discarded and recognition restarted, so the user's gesture could not be responded to in time; with the method of this embodiment, the recognition results of the two fist gestures can be corrected to V gesture recognition results, so that a valid V gesture is identified quickly and the user's gesture is responded to quickly.
  • FIG. 3 shows a flowchart of another gesture control method according to at least one embodiment of the present disclosure.
  • the method includes step 300 to step 306.
  • In step 300, a single frame of camera-collected image in the video stream is acquired; the camera-collected image is an image corresponding to the camera's shooting field of view space, and the shooting field of view space includes an effective space area for gesture control.
  • the camera is fixed at a certain position of the vehicle, the camera has a corresponding camera shooting field of view space when collecting images, and the image collected by the camera is also an image in this space.
  • The field of view space includes the effective space area for gesture control. For example, control based on a gesture is triggered only when the driver makes the gesture in a certain space area in front of the vehicle's central control panel; gestures made outside the effective space area do not trigger gesture control.
  • the image collected by the camera includes the image corresponding to the effective space area of the aforementioned gesture control.
  • In step 302, a partial image area corresponding to the effective space area for gesture control is selected from the image collected by the camera.
  • The image collected by the camera can be cropped to obtain the partial image area, and the shooting field of view space corresponding to the partial image area is the effective space area for gesture control.
  • the camera may capture a large area of space and capture the entire interior scene in the vehicle.
  • The partial image area selected in this step is the part of the image collected by the camera that shows the area in front of the vehicle's central control panel. This area is the effective space area for gesture control; the gesture control response is triggered only when the driver makes gestures in the effective space area.
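  • The cropping in step 302 can be sketched as below. This is a minimal Python illustration that represents a frame as a nested list of pixel values; the function name `crop_region` and the `(top, left, height, width)` region format are assumptions, and a real system would more likely crop an array from an image library.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixel values; a stand-in for a camera frame


def crop_region(frame: Image, region: Tuple[int, int, int, int]) -> Image:
    """Return the partial image area given as (top, left, height, width)."""
    top, left, height, width = region
    if not frame or not frame[0]:
        raise ValueError("empty frame")
    if (
        top < 0
        or left < 0
        or height <= 0
        or width <= 0
        or top + height > len(frame)
        or left + width > len(frame[0])
    ):
        raise ValueError("region lies outside the frame")
    # Keep only the rows and columns that cover the effective area.
    return [row[left:left + width] for row in frame[top:top + height]]
```

  • Gesture recognition would then run on the returned area only, so hands visible elsewhere in the cabin image never reach the recognizer.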
  • In step 304, gesture recognition processing is performed on the partial image area to obtain a gesture recognition result.
  • When performing gesture recognition on N frames of images in a video stream, a partial image area may be selected from each frame of image, and gesture recognition processing may be performed on the partial image area.
  • the above-mentioned image is the image collected by the camera.
  • In step 306, the target device is controlled according to the gesture recognition result.
  • a gesture recognition result sequence is obtained by recognition. If there are a preset number of M identical gesture recognition results in the gesture recognition result sequence, or there are consecutive M identical gesture recognition results, it is confirmed that the same gesture recognition result is a target gesture recognition result.
  • the target device is controlled according to the control instruction corresponding to the target gesture recognition result.
  • With the gesture control method of this embodiment, device control is performed according to a gesture only when it is determined that a preset number of identical gesture recognition results are obtained, which can prevent false triggering.
  • In addition, by recognizing gestures only in the partial image area, interference from image areas outside the partial image area can be avoided to a certain extent, making gesture recognition more accurate; and performing gesture recognition processing only on the partial image area is faster than processing the entire image.
  • some parameters in the gesture control function can be adjusted in a visual manner.
  • the gesture recognition parameters used for gesture recognition can be visually displayed on the visual interface, and the user adjusts the gesture recognition parameters in the parameter adjustment visual interface in the manner of a progress bar.
  • The gesture recognition parameter may include M in the "M identical gesture recognition results recognized" condition mentioned above. For example, it can be set that the V gesture is confirmed as recognized when 10 identical V gestures are recognized, or alternatively when 8 identical V gestures are recognized.
  • the system can perform gesture recognition processing according to the gesture recognition parameter. It is more convenient to adjust the gesture recognition parameters through a visual interface.
  • different gesture recognition parameters can be set for different gestures. Taking the above M as an example, the M corresponding to different gestures can be different. For example, if 10 identical V gestures are recognized, it is confirmed that the V gesture is recognized; if 6 OK gestures are recognized, it is confirmed that the OK gesture is recognized. That is, the M corresponding to the V gesture is 10, and the M corresponding to the OK gesture is 6.
  • the gesture recognition parameters may also include, for example, the number of different gestures in the sequence, the number of the same gestures before the different gestures, and so on. These parameters can also be adjusted and set in the form of a progress bar through the above-mentioned visual interface. In addition, each parameter can be adjusted independently. For example, taking M corresponding to different gestures as an example, in the above example, the M corresponding to the V gesture is 10, and the M corresponding to the OK gesture is 6. The M corresponding to different gestures such as OK gesture can be adjusted separately.
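  • Per-gesture values of M, each adjustable on its own, could be held in a small parameter object such as the Python sketch below. The names `GestureParams`, `threshold`, `set_threshold`, and `find_target` are hypothetical, as is the default M of 10; the visual interface itself is not modeled.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence


@dataclass
class GestureParams:
    """Gesture recognition parameters: a default M plus per-gesture overrides."""

    default_m: int = 10
    per_gesture_m: Dict[str, int] = field(default_factory=dict)

    def threshold(self, gesture: str) -> int:
        return self.per_gesture_m.get(gesture, self.default_m)

    def set_threshold(self, gesture: str, m: int) -> None:
        # Called, for example, when the user moves a slider for one gesture.
        if m < 2:
            raise ValueError("M must be an integer greater than 1")
        self.per_gesture_m[gesture] = m


def find_target(results: Sequence[str], params: GestureParams) -> Optional[str]:
    """Return the first gesture whose count reaches its own M, or None."""
    for gesture, count in Counter(results).most_common():
        if count >= params.threshold(gesture):
            return gesture
    return None
```

  • With M set to 10 for the V gesture and 6 for the OK gesture, six OK results are enough to confirm OK, while nine V results are not enough to confirm V until its M is lowered.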
  • The above description uses the application of gesture control in a vehicle as an example to describe the gesture control method of the present disclosure, but it is understandable that the gesture control method is not limited to vehicles and can also be applied to other devices, such as mobile phones and smart home systems.
  • FIG. 4 shows a schematic diagram of a functional interface of a music player according to at least one embodiment of the present disclosure. The user can click to turn on the music player.
  • When the user taps the gesture control area 41 (i.e., the area at the bottom of the player), gesture control for music playback-related functions is enabled; if the user taps the gesture control area 41 again, gesture control for music playback-related functions is cancelled.
  • the interface shown in Figure 4 is the functional interface of the music player.
  • the user can use the camera to collect images by making a variety of gestures, and the gesture control device controls the music playing function of the music player according to the received images.
  • the music player can be controlled in response to the gesture recognition result of the image.
  • the volume of music playback can be increased in response to the gesture recognition result of the image; for another example, the window controller can also be controlled to move the window glass in response to the gesture recognition result of the image.
  • Not only can the volume of music playback be increased in response to the gesture recognition result of the image, but the change in state of the related control functions of the music player caused by the change in the image can also be displayed synchronously.
  • the gesture control area 41 lights up the icon of multiple gestures, indicating that multiple gestures are supported in the music playback scene.
  • the gestures include:
  • gesture: control function
    • OK: play
    • Thumbs up: like
    • Index finger rotates clockwise: increase volume
    • Index finger rotates counterclockwise: lower the volume
    • Pan to the right: next song
    • Pan to the left: previous song
    • Fist: pause
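  • The gesture-to-function correspondence listed above lends itself to a lookup table. The Python sketch below is illustrative only: the label strings, the instruction names, and the function `control_instruction` are assumptions, and the fist gesture is mapped to pausing because the description states that a fist gesture pauses playback.

```python
from typing import Dict, Optional

# Hypothetical labels for recognized gestures mapped to player instructions.
GESTURE_ACTIONS: Dict[str, str] = {
    "ok": "play",
    "thumbs_up": "like",
    "index_cw": "volume_up",
    "index_ccw": "volume_down",
    "pan_right": "next_song",
    "pan_left": "previous_song",
    "fist": "pause",
}


def control_instruction(target_gesture: str) -> Optional[str]:
    """Look up the control instruction for a target gesture recognition result."""
    return GESTURE_ACTIONS.get(target_gesture)
```

  • A gesture that is not in the table yields no instruction, so unsupported gestures are ignored in the music playback scene.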
  • The target gesture recognition result can be determined according to the following predetermined rule: if there are a preset number of identical gesture recognition results in the gesture recognition result sequence, the identical gesture recognition result is confirmed as the target gesture recognition result.
  • the user can make an OK gesture, and the music player starts to play music.
  • The start of the music playing function can be displayed synchronously in the function status interface of Figure 4; similarly, when the user makes a fist gesture, music playback is paused, and the stopping of the music playing function can also be displayed synchronously in the function status interface.
  • the gesture control device may first determine whether the "OK” gesture has been recognized. If “OK” has not been recognized before, no response will be made; if "OK” has been recognized before, the volume of the music player can be adjusted according to the component control information corresponding to the index finger rotation gesture. For example, if the gesture is "index finger rotate clockwise", you can control the music player to increase the volume of music playback.
  • the volume adjustment display module 42 can also be used to synchronously display the volume increase signal as the index finger rotates clockwise.
  • the gesture control device may first determine whether the "OK" gesture has been recognized. If "OK" has not been recognized before, no response will be made; if "OK" has been recognized before, the music player can be controlled to switch to the next song according to the gesture of panning the palm to the right.
  • the song display module 43 can also be used to synchronously display the song switching effect as the palm moves to the right.
  • users can also use gestures to like songs. For example, if the user gives a thumbs up, in response to the gesture, the gesture control device can control the music player to display the like mark of a certain song in the function status interface shown in FIG. 4; for example, the like mark 44 in FIG. 4 is illuminated. Whether the "OK" gesture has been recognized can also be checked before the like is applied.
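  • The rule that volume, song switching, and likes respond only after an "OK" gesture has been recognized can be sketched as a small state holder in Python. The class name `MusicGestureController`, the gesture labels, the volume range, and the choice to leave the fist (pause) gesture ungated are assumptions made for illustration.

```python
from typing import Set


class MusicGestureController:
    """Toy music player state driven by target gesture recognition results."""

    GATED = {"index_cw", "index_ccw", "pan_right", "pan_left", "thumbs_up"}

    def __init__(self, volume: int = 5, max_volume: int = 10) -> None:
        self.ok_seen = False
        self.playing = False
        self.volume = volume
        self.max_volume = max_volume
        self.track = 0
        self.liked: Set[int] = set()

    def handle(self, gesture: str) -> bool:
        """Apply a gesture; return False when no response is made."""
        if gesture == "ok":
            self.ok_seen = True
            self.playing = True
            return True
        if gesture == "fist":
            self.playing = False
            return True
        # Other gestures are ignored until "OK" has been recognized.
        if gesture not in self.GATED or not self.ok_seen:
            return False
        if gesture == "index_cw":
            self.volume = min(self.max_volume, self.volume + 1)
        elif gesture == "index_ccw":
            self.volume = max(0, self.volume - 1)
        elif gesture == "pan_right":
            self.track += 1
        elif gesture == "pan_left":
            self.track = max(0, self.track - 1)
        else:  # thumbs_up
            self.liked.add(self.track)
        return True
```

  • Requiring "OK" first acts as a second guard against accidental control: a stray clockwise finger motion before playback has been started changes nothing.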
  • FIG. 5 shows a block diagram of a gesture control device according to at least one embodiment of the present disclosure.
  • the device may include: a recognition processing module 500, a gesture determination module 502, and an operation control module 504.
  • The recognition processing module 500 may perform gesture recognition processing on each of the N consecutive frames of images in the video stream collected by the camera to obtain a gesture recognition result sequence.
  • the gesture recognition result sequence includes recognition results of multiple gestures included in the N frames of images.
  • The gesture determination module 502 may determine that the identical gesture recognition result is a target gesture recognition result in response to the number of identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, where N and M are both integers greater than 1, and N is greater than or equal to M.
  • the operation control module 504 may send a control instruction corresponding to the target gesture recognition result to the target device, or control the target device to perform an operation corresponding to the target gesture recognition result.
  • The recognition processing module and the gesture determination module confirm that a gesture is valid only when the preset number of identical gesture recognition results are obtained, and the valid gesture is determined as the target gesture recognition result, which can avoid false triggering of gestures to a certain extent and improve the accuracy of gesture recognition. For example, if the user accidentally makes a certain gesture, as long as the number of identical gesture recognition results for that gesture does not reach the preset number, the gesture will not be recognized as a valid target gesture recognition result, so the target device will not respond to the gesture, which reduces false triggers.
  • the gesture determination module 502 may determine that the same gesture recognition result is a target gesture recognition result in response to the number of consecutive identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M.
  • the gesture determination module 502 may also respond to that there is a difference gesture recognition result among multiple identical gesture recognition results included in the gesture recognition result sequence, and the difference gesture recognition result is in the The proportion of the number in the gesture recognition result sequence is lower than the preset value, and the difference gesture recognition result is smoothed; wherein the difference gesture recognition result is different from the same gesture recognition result.
  • the gesture recognition result sequence including at least one frame of image, a difference gesture recognition result that is different from the same gesture recognition result, and in the gesture recognition result sequence, the gesture before the difference gesture recognition result
  • the recognition result and the gesture recognition result after the difference gesture recognition result are both the same gesture recognition result, and the proportion of the difference gesture recognition result in the gesture recognition result sequence is lower than the preset value.
  • Determining that the number of consecutive same gesture recognition results included in the gesture recognition result sequence is greater than or equal to M includes: determining that the number of consecutive same gesture recognition results included in the smoothed gesture recognition result sequence is greater than or equal to M.
  • When the gesture determination module 502 smooths the difference gesture recognition result, it may correct the difference gesture recognition result to the same gesture recognition result, or remove the difference gesture recognition result from the gesture recognition result sequence.
  • When the gesture determination module 502 smooths the difference gesture recognition result, it may treat the gesture recognition results that precede the difference gesture recognition result in time and the gesture recognition results that follow it in time as multiple consecutive gesture recognition results.
  • When performing gesture recognition processing on the N chronologically consecutive frames of images in the video stream collected by the camera, the recognition processing module 500 may perform the following operations: acquiring a single frame of camera-collected image in the video stream, where the camera-collected image corresponds to the field-of-view space captured by the camera, and that field-of-view space includes an effective space area for gesture control; selecting, from the camera-collected image, the partial image area corresponding to the effective space area for gesture control; and performing the gesture recognition processing on the partial image area.
  • the device may further include a parameter receiving module 600.
  • the parameter receiving module 600 may receive the gesture recognition parameters configured by the user through the visual interface for parameter adjustment, so that the recognition processing module 500 executes the gesture recognition processing according to the gesture recognition parameters.
  • When the target device includes a vehicle, the operation control module 504 may send a control instruction corresponding to the target gesture recognition result to a functional component in the vehicle, or control the functional component in the vehicle to execute the operation corresponding to the target gesture recognition result.
  • When the functional component includes a media player and/or a window controller, and the operation control module 504 controls the functional component in the vehicle to execute the operation corresponding to the target gesture recognition result, the media player may be controlled to change the media playing state in response to the target gesture recognition result, or the window controller may be controlled to move the window glass in response to the target gesture recognition result.
  • The operation control module 504 may, in response to the target gesture recognition result, display at least one of the following items on the function status interface corresponding to the functional component to be controlled by the gesture in the image: a running-start or running-stop state; a volume change; a like mark on a target object.
  • FIG. 7 shows a block diagram of an electronic device according to at least one embodiment of the present disclosure.
  • the electronic device includes a memory 71 and a processor 72.
  • the memory 71 stores a computer program, and when the computer program is executed by the processor 72, the gesture control method according to any embodiment of the present disclosure is implemented.
  • At least one embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the gesture control method according to any embodiment of the present disclosure is implemented.
  • One or more embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Therefore, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • The embodiments of the present disclosure also provide a computer-readable storage medium, and the storage medium may store a computer program.
  • When the program is executed by a processor, the steps of the method for training the neural network for gesture recognition described in any embodiment of the present disclosure, and/or the steps of the gesture recognition method described in any embodiment of the present disclosure, are implemented.
  • the "and/or" means having at least one of the two, for example, "A and/or B" includes three schemes: A, B, and "A and B".
  • Embodiments of the subject matter described in the present disclosure can be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, a data processing device.
  • The program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode information for transmission to a suitable receiver device for execution by a data processing device.
  • the computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
  • the processing and logic flow described in the present disclosure can be executed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating according to input data and generating output.
  • the processing and logic flow can also be executed by a dedicated logic circuit, such as FPGA (Field Programmable Gate Array) or ASIC (Application Specific Integrated Circuit), and the device can also be implemented as a dedicated logic circuit.
  • Computers suitable for executing computer programs include, for example, general-purpose and/or special-purpose microprocessors, or any other type of central processing unit.
  • the central processing unit will receive instructions and data from a read-only memory and/or a random access memory.
  • the basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
  • The computer will also include one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, or the computer will be operatively coupled to such mass storage devices to receive data from them, send data to them, or both.
  • However, the computer does not have to have such devices.
  • The computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name just a few.
  • Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM, and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
  • the processor and the memory can be supplemented by or incorporated into a dedicated logic circuit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present disclosure provide a gesture control method and apparatus. Said method comprises: performing gesture recognition processing on N chronologically consecutive images in a video stream acquired by a camera, to obtain a gesture recognition result sequence, the gesture recognition result sequence comprising recognition results of a plurality of gestures included in the N images; in response to the number of identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, determining the identical gesture recognition results to be target gesture recognition results, both N and M being integers greater than 1, and N being greater than or equal to M; and sending to a target device a control instruction corresponding to the target gesture recognition results, or controlling a target device to execute an operation corresponding to the target gesture recognition results.

Description

Gesture control method and device
Cross-reference to related applications
This application is based on and claims priority to Chinese patent application No. 201911008049.0, filed on October 22, 2019, the entire content of which is hereby incorporated into this application by reference.
Technical field
The present disclosure relates to computer vision technology, and in particular to a gesture control method and device.
Background
With the continuous development of intelligent, electronic, and interconnected products, increasingly intelligent human-computer interaction methods have emerged to meet people's demand for personalization and fashion. For example, the touch screen of a smartphone is a human-computer interaction system realized by touch. There are also products controlled through voice interaction: for example, the user inputs instructions by voice, and the product performs the corresponding operations according to those instructions.
Summary
Embodiments of the present disclosure provide at least a gesture control method and device.
According to a first aspect of the present disclosure, a gesture control method is provided. The method includes: performing gesture recognition processing on each of N chronologically consecutive frames of images in a video stream collected by a camera to obtain a gesture recognition result sequence, where the gesture recognition result sequence includes recognition results of multiple gestures included in the N frames of images; in response to the number of same gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, determining the same gesture recognition result as a target gesture recognition result, where N and M are both integers greater than 1 and N is greater than or equal to M; and sending a control instruction corresponding to the target gesture recognition result to a target device, or controlling the target device to perform an operation corresponding to the target gesture recognition result.
According to a second aspect of the present disclosure, a gesture control device is provided. The device includes: a recognition processing module, configured to perform gesture recognition processing on each of N chronologically consecutive frames of images in a video stream collected by a camera to obtain a gesture recognition result sequence, where the gesture recognition result sequence includes recognition results of multiple gestures included in the N frames of images; a gesture determination module, configured to, in response to the number of same gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, determine the same gesture recognition result as a target gesture recognition result, where N and M are both integers greater than 1 and N is greater than or equal to M; and an operation control module, configured to send a control instruction corresponding to the target gesture recognition result to a target device, or to control the target device to perform an operation corresponding to the target gesture recognition result.
According to a third aspect of the present disclosure, an electronic device is provided. The device includes a processor and a memory storing a computer program, and the computer program is executable by the processor to implement the gesture control method according to the first aspect of the present disclosure.
According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the gesture control method according to the first aspect of the present disclosure.
According to the gesture control method and device provided by the embodiments of the present disclosure, a gesture is confirmed as valid only when a preset number of same gesture recognition results are obtained, and the valid gesture is determined as the target gesture recognition result. This avoids false triggering of gestures to a certain extent and improves the accuracy of gesture recognition. For example, if the user accidentally makes a certain gesture, as long as the number of same gesture recognition results corresponding to that gesture does not reach the preset number, the gesture will not be recognized as a valid target gesture recognition result, so the target device will not respond to the gesture, which reduces false triggers.
Description of the drawings
Fig. 1 shows a flowchart of a gesture control method according to at least one embodiment of the present disclosure;
Fig. 1a shows a schematic diagram of static gestures according to at least one embodiment of the present disclosure;
Fig. 1b shows a schematic diagram of dynamic gestures according to at least one embodiment of the present disclosure;
Fig. 2 shows a flowchart of another gesture control method according to at least one embodiment of the present disclosure;
Fig. 3 shows a flowchart of yet another gesture control method according to at least one embodiment of the present disclosure;
Fig. 4 shows a schematic diagram of a functional interface of a music player according to at least one embodiment of the present disclosure;
Fig. 5 shows a block diagram of a gesture control device according to at least one embodiment of the present disclosure;
Fig. 6 shows a block diagram of another gesture control device according to at least one embodiment of the present disclosure;
Fig. 7 shows a block diagram of an electronic device according to at least one embodiment of the present disclosure.
Detailed description
The embodiments of the present disclosure provide a gesture control method that controls a device through gesture interaction.
Fig. 1 shows a flowchart of a gesture control method according to at least one embodiment of the present disclosure. The method may be executed by a gesture control device and may include steps 100 to 104.
In step 100, gesture recognition processing is performed on each of N chronologically consecutive frames of images in the video stream collected by the camera to obtain a gesture recognition result sequence.
When a user wants to control a device, for example to enable a certain function of the device, the user can make a certain gesture. The device may be referred to as the target device, and controlling the target device may mean controlling a functional component in the target device, where the functional component may be a hardware or software module. In one example, the target device includes but is not limited to a vehicle, and controlling the target device may include, but is not limited to, controlling one or more functional components provided in the vehicle, such as a media player, an air conditioner controller, or a window controller. It can be understood that the target device may also include other devices such as mobile phones, TVs, air conditioners, audio systems, and smart home devices.
In this step, the camera may be used to collect a video stream of the user making gestures; for example, a camera built into the target device may be used. The video stream includes N chronologically consecutive frames of images collected by the camera, and the gestures in the images are gestures the user makes to control the operation of functional components in the target device. N is an integer greater than 1.
By performing gesture recognition processing on each of the N frames of images in the video stream, a gesture recognition result sequence can be obtained, and the gesture recognition result sequence includes the recognition results of multiple gestures.
The gestures made by the user may be static gestures or dynamic gestures. Figs. 1a and 1b illustrate some gestures, but it can be understood that actual implementations are not limited to these gestures. As an example, Fig. 1a shows a series of static gestures: OK gesture, V gesture, like gesture, palm gesture, index finger gesture, and fist gesture. As an example, Fig. 1b shows a series of dynamic gestures: fist-palm change (fist to palm, palm to fist), palm translation (up, down, left, right), and index finger rotation (clockwise, counterclockwise).
For example, the recognition results of the multiple gestures included in the gesture recognition result sequence may be static gestures: the gesture in an image is recognized as a V gesture, or the gesture in an image is recognized as an OK gesture.
As another example, the gesture recognition result sequence obtained by performing gesture recognition processing on the N frames of images may also include multiple dynamic gestures; for instance, multiple "palm translation" gestures are recognized.
As yet another example, the gesture recognition result sequence may also contain a combination of static and dynamic gestures; for instance, the gesture recognition result sequence includes an OK gesture and a palm translation gesture.
The gesture recognition in this step may be performed, for example, by a pre-trained gesture recognition neural network: an image collected by the camera is input into the neural network, and the gesture recognition result corresponding to the image is obtained.
In step 102, in response to the number of same gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, the same gesture recognition result is determined as the target gesture recognition result.
In this step, it may be specified that a gesture is confirmed as valid only when a preset number of same gesture recognition results are obtained; the valid gesture is called the target gesture recognition result. The preset number may be set to M, where M is also an integer greater than 1 and N is greater than or equal to M.
For example, if five consecutive V gestures are recognized in the N chronologically consecutive frames of images, the recognized "V gesture" is confirmed as the target gesture recognition result. As another example, if five consecutive "palm translation gestures" are recognized in the N chronologically consecutive frames of images, the "palm translation gesture" is the target gesture recognition result, where each palm translation gesture may be determined from multiple frames of images.
If the number of consecutive gestures recognized does not reach the preset number, these images are discarded and recognition is performed again. For example, if three V gestures are recognized in the N chronologically consecutive frames of images, which does not reach the preset number of five, the three V gestures are discarded and gesture recognition is performed again on N chronologically consecutive frames of images.
When the target gesture recognition result is determined, the method proceeds to step 104. Otherwise, if no target gesture recognition result is determined, steps 100 to 102 are performed again.
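The decision in step 102 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names `find_target_gesture` and `first_target_in_stream`, the use of string labels for recognition results, and the choice to process the stream in non-overlapping windows of N results are all hypothetical.

```python
from itertools import groupby
from typing import Iterable, List, Optional, Sequence


def find_target_gesture(results: Sequence[str], m: int) -> Optional[str]:
    """Return the first label that occurs at least m times in a row, else None."""
    for label, run in groupby(results):
        if sum(1 for _ in run) >= m:
            return label
    return None


def first_target_in_stream(labels: Iterable[str], n: int, m: int) -> Optional[str]:
    """Examine per-frame labels in windows of n results (assumed non-overlapping).

    A window without m consecutive identical results is discarded and
    recognition restarts on the next n results, mirroring the return from
    step 102 to step 100.
    """
    window: List[str] = []
    for label in labels:
        window.append(label)
        if len(window) == n:
            target = find_target_gesture(window, m)
            if target is not None:
                return target
            window.clear()  # discard and recognize again
    return None
```

With N = 5 and M = 5, a window that contains only three V results yields no target, so the next window of five results is examined.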
In step 104, a control instruction corresponding to the target gesture recognition result is sent to the target device, or the target device is controlled to perform an operation corresponding to the target gesture recognition result.
In this step, the corresponding target device can be controlled according to the target gesture recognition result determined above. Specifically, a functional component in the target device may be controlled. For example, if the functional component is a media player, such as the volume control module for music playback in a vehicle, the volume can be raised or lowered according to the target gesture recognition result. In actual implementation, a control instruction corresponding to the target gesture recognition result may be sent to the target device, and the target device operates according to the instruction; alternatively, the gesture control device of this embodiment may itself control the target device, according to the instruction, to perform the operation corresponding to the target gesture recognition result.
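One way to picture step 104 is a lookup from the target gesture recognition result to an action on a functional component. The sketch below is illustrative only: the `MediaPlayer` class, its volume range, and the labels `"palm_up"` and `"palm_down"` are hypothetical, since the disclosure does not state which gesture maps to which operation.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class MediaPlayer:
    """Hypothetical functional component with a volume in the range 0..10."""
    volume: int = 5

    def volume_up(self) -> None:
        self.volume = min(10, self.volume + 1)

    def volume_down(self) -> None:
        self.volume = max(0, self.volume - 1)


def build_dispatch(player: MediaPlayer) -> Dict[str, Callable[[], None]]:
    # Assumed mapping from gesture labels to operations, for illustration only.
    return {"palm_up": player.volume_up, "palm_down": player.volume_down}


def handle_target_gesture(label: str, dispatch: Dict[str, Callable[[], None]]) -> bool:
    """Run the operation mapped to the target gesture; return False if unmapped."""
    action = dispatch.get(label)
    if action is None:
        return False
    action()
    return True
```

Keeping the mapping in a table separates gesture determination from device control, so the same determination logic could drive either a control instruction sent to the device or a direct call on the component.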
According to the gesture control method of this embodiment, a gesture is confirmed as valid only when a preset number of same gesture recognition results are obtained, and the valid gesture is determined as the target gesture recognition result. This avoids false triggering of gestures to a certain extent and improves the accuracy of gesture recognition. For example, if the user accidentally makes a certain gesture, as long as the number of same gesture recognition results corresponding to that gesture does not reach the preset number, the gesture will not be recognized as a valid target gesture recognition result, so the target device will not respond to the gesture, which reduces false triggers.
Fig. 2 shows another gesture control method according to at least one embodiment of the present disclosure. The method may include steps 200 to 208; steps that are the same as those in Fig. 1 are not described in detail again.
In step 200, multiple frames of images collected by a camera are received, and the gestures in the images are gestures the user makes to control the operation of a functional component in the target device.
The multiple frames of images may be the N chronologically consecutive frames of images included in the video stream collected by the camera.
In step 202, gesture recognition processing is performed on the multiple frames of images to obtain a gesture recognition result sequence.
For example, the camera collects multiple frames of images, multiple gestures can be recognized from those frames, and these gestures form a gesture recognition result sequence. For instance, the gesture recognition result sequence may be "V, V, V, V, V, V, fist, V, V".
In the above gesture recognition result sequence, the multiple "V" results may be referred to as multiple same gesture recognition results, and "fist" may be referred to as a difference gesture recognition result, that is, a gesture recognition result different from the same gesture recognition result. In other examples, there may be more than one difference gesture recognition result.
In step 204, in response to a difference gesture recognition result being present among multiple same gesture recognition results included in the gesture recognition result sequence, and the proportion of difference gesture recognition results in the gesture recognition result sequence being lower than a preset value, the difference gesture recognition result is smoothed, where the difference gesture recognition result is different from the same gesture recognition result. In other words, the difference gesture recognition result is smoothed in response to the following: the gesture recognition result sequence includes a difference gesture recognition result, corresponding to at least one frame of image, that is different from the same gesture recognition result; in the gesture recognition result sequence, the gesture recognition results before and after the difference gesture recognition result are both the same gesture recognition result; and the proportion of difference gesture recognition results in the gesture recognition result sequence is lower than the preset value. Here, determining that the number of consecutive same gesture recognition results included in the gesture recognition result sequence is greater than or equal to M includes: determining that the number of consecutive same gesture recognition results included in the smoothed gesture recognition result sequence is greater than or equal to M.
For example, in the gesture recognition result sequence "V, V, V, V, V, V, fist, V, V" above, "fist" is the difference gesture recognition result. Six V gestures are recognized before the fist gesture and two V gestures after it; that is, the gesture recognition results before and after the difference gesture recognition result are all the same gesture recognition result, namely the V gesture. If, in addition, the proportion of difference gesture recognition results in the sequence is lower than the preset value, for example the ratio of the number of difference gestures to the total number of results in the sequence is lower than a preset value such as 15%, then the difference gesture recognition result is smoothed. Actual implementations are not limited to this criterion, which is given here only as an example.
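The smoothing condition of step 204 can be checked with a short Python sketch. It is an illustration under assumptions rather than the disclosed implementation: the function name `find_smoothable` is hypothetical, the "same" result is assumed to be the most frequent label in the sequence, and the default threshold of 0.15 is taken from the 15% example above.

```python
from collections import Counter
from typing import List, Sequence


def find_smoothable(results: Sequence[str], max_ratio: float = 0.15) -> List[int]:
    """Return indices of difference results eligible for smoothing, or [].

    Conditions checked (all must hold):
      * at least one result differs from the majority ("same") label;
      * every difference result has same results both before and after it,
        which reduces to: no difference result at either end of the sequence;
      * the share of difference results in the sequence is below max_ratio.
    """
    if not results:
        return []
    same, _ = Counter(results).most_common(1)[0]  # assumed "same" label
    diff = [i for i, r in enumerate(results) if r != same]
    if not diff:
        return []
    if diff[0] == 0 or diff[-1] == len(results) - 1:
        return []  # not flanked by same results on both sides
    if len(diff) / len(results) >= max_ratio:
        return []
    return diff
```

For the sequence "V, V, V, V, V, V, fist, V, V", the single fist result makes up 1 of 9 results (about 11%), which is below 15%, and it is flanked by V results, so index 6 is reported as eligible.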
Once it has been determined that the difference gesture recognition result is to be smoothed, the smoothing may be, but is not limited to, any of the following:
For example, the difference gesture recognition result may be corrected to the identical gesture recognition result, such as correcting the fist gesture to a V gesture. The sequence "V, V, V, V, V, V, fist, V, V" above then becomes "V, V, V, V, V, V, V, V, V".
As another example, the difference gesture recognition result may be removed from the gesture recognition result sequence. The sequence "V, V, V, V, V, V, fist, V, V" above then becomes "V, V, V, V, V, V, V, V".
As a further example, the gesture recognition results that precede the difference gesture recognition result in time and those that follow it may be treated as multiple consecutive gesture recognition results. That is, the sequence "V, V, V, V, V, V, fist, V, V" is regarded as eight consecutive recognized V gestures, and the fist gesture is ignored.
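The smoothing options above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the embodiment: the function names, the list-of-labels representation of the sequence, and the default threshold of 0.15 (taken from the 15% example) are all hypothetical. A run of results counts as "difference" results here only if it is bounded on both sides by the same gesture label.

```python
from typing import List


def find_difference_indices(seq: List[str], max_ratio: float = 0.15) -> List[int]:
    """Return indices of difference results bounded on both sides by the same label.

    Returns an empty list when the share of such results in the sequence is
    not below max_ratio (the hypothetical "preset value").
    """
    indices: List[int] = []
    n = len(seq)
    i = 1
    while i < n - 1:
        same = seq[i - 1]
        if seq[i] != same:
            j = i
            while j < n and seq[j] != same:
                j += 1
            if j < n:  # the run seq[i:j] has `same` before and after it
                indices.extend(range(i, j))
            i = j + 1
        else:
            i += 1
    if not seq or len(indices) / len(seq) >= max_ratio:
        return []
    return indices


def smooth_by_correction(seq: List[str], max_ratio: float = 0.15) -> List[str]:
    """Option 1: overwrite each difference result with the surrounding gesture."""
    out = list(seq)
    for k in find_difference_indices(seq, max_ratio):
        out[k] = out[k - 1]
    return out


def smooth_by_removal(seq: List[str], max_ratio: float = 0.15) -> List[str]:
    """Option 2: drop the difference results from the sequence."""
    drop = set(find_difference_indices(seq, max_ratio))
    return [g for k, g in enumerate(seq) if k not in drop]
```

For counting purposes, the third option (treating the results before and after the difference result as consecutive) yields the same run length as removal in this sketch, so it is not shown separately.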
In step 206, for the smoothed gesture recognition result sequence, if the sequence is found to include a preset number of consecutive identical gesture recognition results, it is confirmed that the target gesture recognition result has been recognized.
For example, this embodiment may be configured so that, if M consecutive identical gesture recognition results are recognized, that identical gesture recognition result is confirmed to be the target gesture recognition result, and the target gesture recognition result is valid. For instance, if eight consecutive V gestures are recognized, the V gesture is confirmed to be the target gesture recognition result.
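A minimal Python sketch of this check is given below, assuming the sequence is a list of gesture labels. The function name `confirm_target` and the choice to return the first gesture whose run reaches M are assumptions made for illustration.

```python
from itertools import groupby
from typing import List, Optional


def confirm_target(seq: List[str], m: int) -> Optional[str]:
    """Return the first gesture that appears at least m times in a row, else None."""
    for gesture, run in groupby(seq):
        if sum(1 for _ in run) >= m:
            return gesture
    return None
```

With M set to 8, a sequence of eight V results confirms the V gesture, while the unsmoothed sequence "V, V, V, V, V, V, fist, V, V" does not, because its longest run of V is six.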
In step 208, a control instruction corresponding to the target gesture recognition result is sent to the target device, or the target device is controlled to perform an operation corresponding to the target gesture recognition result.
According to the gesture control method of this embodiment, a gesture is confirmed as valid only when a preset number of identical gestures have been recognized, which improves the accuracy of gesture recognition. In addition, smoothing the difference gesture recognition results can increase the sensitivity of gesture recognition and improve its response speed.
For example, suppose the gestures actually made by the user have already reached the preset number of V gestures, say ten V gestures, but because of misrecognition nine V gestures and two fist gestures are recognized. Without the smoothing of this embodiment, the nine V gestures would have to be discarded and recognition restarted, so the user's gesture could not be responded to in time. With the method of this embodiment, the two fist gesture recognition results can be corrected to the correct V gesture recognition result, so that a valid V gesture is recognized quickly and the user's gesture receives a quick response.
FIG. 3 shows a flowchart of yet another gesture control method according to at least one embodiment of the present disclosure. The method includes steps 300 to 306.
In step 300, a single-frame camera-captured image is acquired from the video stream. The camera-captured image corresponds to the camera's field-of-view space, which includes the effective spatial region for gesture control.
In this embodiment, the camera is fixed at some position in the vehicle. When capturing images the camera has a corresponding field-of-view space, and the camera-captured image is an image of that space. The field-of-view space includes the effective spatial region for gesture control. For example, gesture-based control is triggered only when the driver gestures within a certain spatial region in front of the vehicle's center console panel; if the driver gestures outside the effective spatial region, gesture control is not triggered. The camera-captured image includes the image corresponding to this effective spatial region.
In step 302, a partial image region corresponding to the effective spatial region for gesture control is selected from the camera-captured image.
In this step, the camera-captured image may be cropped to obtain a partial image region whose corresponding field-of-view space is the effective spatial region for gesture control. For example, the camera may capture a large spatial region covering the entire interior of the vehicle, whereas the partial image region selected in this step is the part of the camera-captured image corresponding to the area in front of the center console panel. That area is the effective spatial region for gesture control: a gesture-control response is triggered only when the driver gestures within it.
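The cropping described in this step might look like the sketch below. It assumes, purely for illustration, that a frame is a list of pixel rows and that the effective region is given as a fixed rectangle in image coordinates; the function name and parameters are hypothetical, and the disclosure does not specify how the region is located.

```python
from typing import List, Sequence, TypeVar

Pixel = TypeVar("Pixel")


def crop_region(frame: Sequence[Sequence[Pixel]], top: int, left: int,
                height: int, width: int) -> List[List[Pixel]]:
    """Return the sub-image covering rows top..top+height-1 and columns left..left+width-1."""
    return [list(row[left:left + width]) for row in frame[top:top + height]]
```

Only the cropped region would then be passed to gesture recognition, so pixels outside the effective region never reach the recognizer.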
In step 304, gesture recognition processing is performed on the partial image region to obtain a gesture recognition result.
In some embodiments, when gesture recognition is performed on N frames of images in the video stream, a partial image region may be selected from each frame and gesture recognition processing performed on that region. The images referred to here are the camera-captured images.
In step 306, the target device is controlled according to the gesture recognition result.
For example, a gesture recognition result sequence is obtained by recognition from the N frames of images captured by the camera. If the sequence contains a preset number M of identical gesture recognition results, or contains M consecutive identical gesture recognition results, that identical gesture recognition result is confirmed to be the target gesture recognition result. The target device is then controlled according to the control instruction corresponding to the target gesture recognition result.
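The first criterion above, M identical results anywhere in the sequence rather than M in a row, can be sketched as follows. The function name is hypothetical, and picking the most frequent gesture is an assumption made for illustration.

```python
from collections import Counter
from typing import List, Optional


def target_by_count(seq: List[str], m: int) -> Optional[str]:
    """Return the most frequent gesture if it occurs at least m times, else None."""
    if not seq:
        return None
    gesture, count = Counter(seq).most_common(1)[0]
    return gesture if count >= m else None
```

Under this criterion the sequence "V, V, V, V, V, V, fist, V, V" already yields V for M = 8 without any smoothing, because order is ignored.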
According to the gesture control method of this embodiment, device control is performed according to a gesture only when a preset number of identical gesture recognition results have been obtained, which can prevent false triggering. In addition, recognizing gestures only within the partial image region avoids, to some extent, interference from image regions outside it, making gesture recognition more accurate. Performing gesture recognition processing on only the partial image region is also faster than processing the whole image.
In yet another embodiment, some parameters of the gesture control function can be adjusted visually. For example, the gesture recognition parameters used for gesture recognition can be displayed in a visual interface for parameter adjustment, where the user adjusts them by means of progress bars. The gesture recognition parameters may include, for example, the M in "M identical gesture recognition results are recognized" mentioned above. For instance, the parameter can be set so that the V gesture is confirmed when 10 identical V gestures are recognized, or so that it is confirmed when 8 identical V gestures are recognized. After the user adjusts a gesture recognition parameter, the system can perform gesture recognition processing according to it. Adjusting gesture recognition parameters through a visual interface is convenient.
In addition, different gesture recognition parameters can be set for different gestures. Taking M as an example, the M corresponding to different gestures can differ. For instance, the V gesture is confirmed when 10 identical V gestures are recognized, and the OK gesture is confirmed when 6 OK gestures are recognized; that is, M is 10 for the V gesture and 6 for the OK gesture.
The gesture recognition parameters may also include, for example, the number of difference gestures appearing in the sequence and the number of identical gestures appearing before a difference gesture. These parameters can likewise be adjusted through the visual interface by means of progress bars. Each parameter can be adjusted independently. Taking the per-gesture M as an example, in the case above M is 10 for the V gesture and 6 for the OK gesture, and the M values of the V gesture, the OK gesture, and other gestures can each be adjusted separately.
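One possible way to hold such per-gesture parameters is sketched below. The class, field names, default values, and the clamping range that stands in for a progress bar's limits are all assumptions for illustration; only the values M = 10 for V and M = 6 for OK come from the text.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class GestureParams:
    required_count: int                 # the M for this gesture
    max_difference_ratio: float = 0.15  # hypothetical smoothing threshold


DEFAULT_PARAMS = GestureParams(required_count=8)  # assumed fallback
PARAMS: Dict[str, GestureParams] = {
    "V": GestureParams(required_count=10),
    "OK": GestureParams(required_count=6),
}


def params_for(gesture: str) -> GestureParams:
    """Look up the parameters of one gesture, falling back to the default."""
    return PARAMS.get(gesture, DEFAULT_PARAMS)


def set_required_count(gesture: str, value: int, lo: int = 1, hi: int = 30) -> int:
    """Set M for one gesture, clamped to the range the progress bar allows."""
    clamped = max(lo, min(hi, value))
    PARAMS[gesture] = replace(params_for(gesture), required_count=clamped)
    return clamped
```

Because each gesture has its own entry, changing M for the V gesture leaves the OK gesture's value untouched.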
The gesture control method of the present disclosure is described below using gesture control in a vehicle as an example. It should be understood, however, that the method is not limited to vehicles and can also be applied to other devices, such as mobile phones and smart home systems.
In a vehicle, the driver can use gestures to adjust vehicle accessories such as the windows, light brightness, and air-conditioning temperature. Gestures can also control in-vehicle entertainment components, for example music playback (switching songs, adjusting the volume), and can be used for game control, among other things. For example, FIG. 4 shows a schematic diagram of the functional interface of a music player according to at least one embodiment of the present disclosure. The user can click to open the music player. In one illustrative example, when the user clicks the gesture control area 41 in the player interface (that is, the area at the bottom of the player), gesture control of the music playback functions is turned on; if the user clicks the gesture control area 41 again, gesture control of those functions is turned off.
The interface shown in FIG. 4 is the functional interface of the music player. The user can make various gestures, images of which are captured by the camera, and the gesture control device controls the music playback function of the music player according to the received images. The music player can also be controlled, in the interface shown in FIG. 4, in response to the gesture recognition result of the images. For example, the playback volume can be increased in response to the gesture recognition result; as another example, the window controller can be controlled to move the window glass in response to the gesture recognition result. As a further example, in addition to increasing the playback volume in response to the gesture recognition result, the changing state of the related music player control function can be displayed synchronously as the images change.
Referring again to FIG. 4, icons of several gestures are lit in the gesture control area 41, indicating that several gestures are supported for control in this music playback scenario. The gestures and the music playback functions they control are listed in Table 1:
Table 1: Gestures and the functions they control
Gesture | Control function
OK | Play
Thumbs up | Like
Index finger rotates clockwise | Increase volume
Index finger rotates counterclockwise | Decrease volume
Palm moves right | Next song
Palm moves left | Previous song
Fist | Pause
For recognition of each gesture in Table 1 above, the target gesture recognition result can be determined according to the following predetermined rule: if the gesture recognition result sequence contains a preset number of identical gesture recognition results, that identical gesture recognition result is confirmed to be the target gesture recognition result.
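Table 1 lends itself to a simple lookup structure. The sketch below assumes hypothetical string identifiers for the gestures and the control functions; the disclosure does not define such identifiers.

```python
from typing import Dict, Optional

# Gesture label -> control function, following Table 1 (identifiers are illustrative).
GESTURE_ACTIONS: Dict[str, str] = {
    "ok": "play",
    "thumbs_up": "like",
    "index_clockwise": "volume_up",
    "index_counterclockwise": "volume_down",
    "palm_right": "next_song",
    "palm_left": "previous_song",
    "fist": "pause",
}


def action_for(target_gesture: str) -> Optional[str]:
    """Return the control function for a confirmed target gesture, or None if unsupported."""
    return GESTURE_ACTIONS.get(target_gesture)
```

A gesture that is not in the table simply maps to no action, so an unsupported target gesture produces no control instruction.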
By way of example, after gesture control of the music playback functions has been turned on, the user can make the OK gesture and the music player starts playing music. The start of the music playback function can be displayed synchronously in the function status interface of FIG. 4. Similarly, when the user makes the fist gesture, music playback pauses, and the stopping of the music playback function can also be displayed synchronously in the function status interface.
For example, suppose the user makes an index-finger rotation gesture. After recognizing this gesture, the gesture control device may first determine whether the "OK" gesture has already been recognized. If "OK" has not been recognized before, no response is made; if it has, the volume of the music player can be adjusted according to the component control information corresponding to the index-finger rotation gesture. For instance, if the gesture is "index finger rotates clockwise", the music player can be controlled to increase the playback volume. At the same time, in the function status interface of FIG. 4, the volume adjustment display module 42 can synchronously show the volume increasing as the index finger rotates clockwise.
As another example, suppose the user makes a gesture of moving the palm to the right. After recognizing this gesture, the gesture control device may first determine whether the "OK" gesture has already been recognized. If "OK" has not been recognized before, no response is made; if it has, the music player can be switched to the next song according to the palm-moves-right gesture. At the same time, in the function status interface of FIG. 4, the song display module 43 can synchronously show the song-switching effect as the palm moves to the right.
In addition, the user can like a song by gesture. For example, the user can give a thumbs up, and in response to this gesture the gesture control device can control the music player to display a like mark for a song in the function status interface shown in FIG. 4; for example, the like mark 44 in FIG. 4 is lit. Here too, whether the "OK" gesture has already been recognized can be checked before the like is applied.
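The "OK first" check described above can be sketched as a small state machine. Everything in this sketch is hypothetical: the class and attribute names, the volume range of 0 to 10, and the choice to let the fist (pause) gesture act without the check, which the text does not state either way.

```python
class MusicGestureController:
    """Illustrative controller that ignores most gestures until OK has been seen."""

    def __init__(self) -> None:
        self.ok_seen = False
        self.playing = False
        self.volume = 5   # assumed range 0..10
        self.song = 0
        self.liked = False

    def handle(self, gesture: str) -> bool:
        """Apply a confirmed target gesture; return True if it caused a response."""
        if gesture == "ok":
            self.ok_seen = True
            self.playing = True
            return True
        if gesture == "fist":  # assumption: pause is not gated on OK
            self.playing = False
            return True
        if not self.ok_seen:
            return False       # no response before OK has been recognized
        if gesture == "index_clockwise":
            self.volume = min(10, self.volume + 1)
        elif gesture == "index_counterclockwise":
            self.volume = max(0, self.volume - 1)
        elif gesture == "palm_right":
            self.song += 1
        elif gesture == "palm_left":
            self.song -= 1
        elif gesture == "thumbs_up":
            self.liked = True
        else:
            return False
        return True
```

A status interface such as the one in FIG. 4 could read `playing`, `volume`, `song`, and `liked` after each call to refresh its display.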
Gesture control of other functions is not described in detail here.
FIG. 5 shows a block diagram of a gesture control apparatus according to at least one embodiment of the present disclosure. As shown in FIG. 5, the apparatus may include a recognition processing module 500, a gesture determination module 502, and an operation control module 504.
The recognition processing module 500 may perform gesture recognition processing separately on N temporally consecutive frames of images in the video stream captured by the camera, to obtain a gesture recognition result sequence. The gesture recognition result sequence includes the recognition results of the multiple gestures contained in the N frames of images.
The gesture determination module 502 may, in response to the number of identical gesture recognition results in the gesture recognition result sequence being greater than or equal to M, determine that identical gesture recognition result to be the target gesture recognition result, where N and M are both integers greater than 1 and N is greater than or equal to M.
The operation control module 504 may send a control instruction corresponding to the target gesture recognition result to the target device, or control the target device to perform an operation corresponding to the target gesture recognition result.
According to the gesture control apparatus of this embodiment, the recognition processing module and the gesture determination module confirm a gesture as valid only when a preset number of identical gesture recognition results have been obtained, and determine the valid gesture to be the target gesture recognition result. This avoids false triggering of gestures to some extent and improves the accuracy of gesture recognition. For example, if the user accidentally makes a gesture, as long as the identical gesture recognition results for that gesture do not reach the preset number, the gesture is not regarded as a valid target gesture recognition result, so the target device does not respond to it and false triggers are reduced.
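The division of work among the three modules can be sketched as below. This is a structural illustration only: the class names mirror the module names, but the per-frame classifier, the command table, and the `send` callback are hypothetical stand-ins for parts the disclosure leaves unspecified.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Sequence


class RecognitionModule:
    """Runs a per-frame classifier over N frames to build the result sequence."""

    def __init__(self, classify: Callable[[object], str]) -> None:
        self.classify = classify  # hypothetical single-frame recognizer

    def recognize(self, frames: Sequence[object]) -> List[str]:
        return [self.classify(frame) for frame in frames]


class GestureDeterminationModule:
    """Confirms a target gesture once it has at least m identical results."""

    def __init__(self, m: int) -> None:
        self.m = m

    def determine(self, results: List[str]) -> Optional[str]:
        if not results:
            return None
        gesture, count = Counter(results).most_common(1)[0]
        return gesture if count >= self.m else None


class OperationControlModule:
    """Maps a target gesture to a control instruction and sends it."""

    def __init__(self, commands: Dict[str, str], send: Callable[[str], None]) -> None:
        self.commands = commands
        self.send = send  # hypothetical transport to the target device

    def control(self, target: str) -> bool:
        command = self.commands.get(target)
        if command is None:
            return False
        self.send(command)
        return True


def run_pipeline(frames: Sequence[object], recognition: RecognitionModule,
                 determination: GestureDeterminationModule,
                 control: OperationControlModule) -> bool:
    """Return True if a target gesture was confirmed and an instruction was sent."""
    target = determination.determine(recognition.recognize(frames))
    return target is not None and control.control(target)
```

Keeping the classifier and the transport as injected callables lets the same pipeline be exercised with recorded labels instead of a camera.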
In one embodiment, the gesture determination module 502 may, in response to the number of consecutive identical gesture recognition results in the gesture recognition result sequence being greater than or equal to M, determine that identical gesture recognition result to be the target gesture recognition result.
In one embodiment, the gesture determination module 502 may also smooth a difference gesture recognition result if the gesture recognition result sequence contains the difference gesture recognition result between multiple identical gesture recognition results and the proportion of difference gesture recognition results in the sequence is below a preset value. A difference gesture recognition result is one that differs from the identical gesture recognition results. In other words, smoothing is applied when the sequence includes, for at least one frame of image, a difference gesture recognition result that differs from the identical gesture recognition results, the results before and after the difference gesture recognition result in the sequence are both the identical gesture recognition result, and the proportion of difference gesture recognition results in the sequence is below the preset value. In this case, determining that the number of consecutive identical gesture recognition results in the sequence is greater than or equal to M includes: determining that the number of consecutive identical gesture recognition results in the smoothed sequence is greater than or equal to M.
In one embodiment, when smoothing the difference gesture recognition result, the gesture determination module 502 may correct the difference gesture recognition result to the identical gesture recognition result, or remove the difference gesture recognition result from the gesture recognition result sequence.
In one embodiment, when smoothing the difference gesture recognition result, the gesture determination module 502 may treat the gesture recognition results that precede the difference gesture recognition result in time and those that follow it as multiple consecutive gesture recognition results.
In one embodiment, when performing gesture recognition processing separately on the N temporally consecutive frames of images in the video stream captured by the camera, the recognition processing module 500 may perform the following operations: acquiring a single-frame camera-captured image from the video stream, where the camera-captured image corresponds to the camera's field-of-view space and the field-of-view space includes the effective spatial region for gesture control; selecting, from the camera-captured image, the partial image region corresponding to the effective spatial region for gesture control; and performing the gesture recognition processing on the partial image region.
In one embodiment, as shown in FIG. 6, the apparatus may further include a parameter receiving module 600.
The parameter receiving module 600 may receive gesture recognition parameters configured by the user through the visual interface for parameter adjustment, so that the recognition processing module 500 performs the gesture recognition processing according to those parameters.
In one embodiment, the target device includes a vehicle, and the operation control module 504 may send a control instruction corresponding to the target gesture recognition result to a functional component in the vehicle, or control a functional component in the vehicle to perform an operation corresponding to the target gesture recognition result.
In one embodiment, the functional component includes a media player and/or a window controller. When controlling the functional component in the vehicle to perform the operation corresponding to the target gesture recognition result, the operation control module 504 may, in response to the target gesture recognition result, control the media player to change the media playback state, or control the window controller to move the window glass.
In one embodiment, in response to the target gesture recognition result, the operation control module 504 may display at least one of the following on the function status interface corresponding to the functional component to be controlled through images: the started or stopped running state of the functional component; a change in volume; a like mark for a target object.
FIG. 7 shows a block diagram of an electronic device according to at least one embodiment of the present disclosure. The electronic device includes a memory 71 and a processor 72. The memory 71 stores a computer program which, when executed by the processor 72, implements the gesture control method of any embodiment of the present disclosure.
At least one embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the gesture control method of any embodiment of the present disclosure.
Those skilled in the art will appreciate that one or more embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) containing computer-usable program code.
Embodiments of the present disclosure further provide a computer-readable storage medium that may store a computer program which, when executed by a processor, implements the steps of the method for training a neural network for gesture recognition described in any embodiment of the present disclosure, and/or the steps of the gesture recognition method described in any embodiment of the present disclosure. Here "and/or" means at least one of the two; for example, "A and/or B" covers three cases: A, B, and "A and B".
The embodiments of the present disclosure are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the data processing device embodiment is described relatively briefly because it is substantially similar to the method embodiment; for the relevant details, refer to the corresponding description of the method embodiment.
Specific embodiments of the present disclosure have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
本公开中描述的主题及功能操作的实施例可以在以下中实现:数字电子电路、有形体现的计算机软件或固件、包括本公开中公开的结构及其结构性等同物的计算机硬件、或者它们中的一个或多个的组合。本公开中描述的主题的实施例可以实现为一个或多个计算机程序,即编码在有形非暂时性程序载体上以被数据处理装置执行或控制数据处理装置的操作的计算机程序指令中的一个或多个模块。可替代地或附加地,程序指令可以被编码在人工生成的传播信号上,例如机器生成的电、光或电磁信号,该信号被生成以将信息编码并传输到合适的接收机装置以由数据处理装置执行。计算机存储介质可以是机器可读存储设备、机器可读存储基板、随机或串行存取存储器设备、或它们中的一个或多个的组合。The embodiments of the subject and functional operations described in the present disclosure can be implemented in the following: digital electronic circuits, tangible computer software or firmware, computer hardware including the structures disclosed in the present disclosure and structural equivalents thereof, or among them A combination of one or more. Embodiments of the subject matter described in the present disclosure may be implemented as one or more computer programs, that is, one or one of the computer program instructions encoded on a tangible non-transitory program carrier to be executed by a data processing device or to control the operation of the data processing device Multiple modules. Alternatively or in addition, the program instructions may be encoded on artificially generated propagated signals, such as machine-generated electrical, optical or electromagnetic signals, which are generated to encode information and transmit it to a suitable receiver device for data transmission. The processing device executes. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in the present disclosure may be performed by one or more programmable computers executing one or more computer programs to perform the corresponding functions by operating on input data and generating output. The processes and logic flows may also be performed by special-purpose logic circuitry, such as an FPGA (field-programmable gate array) or an ASIC (application-specific integrated circuit), and the apparatus may also be implemented as special-purpose logic circuitry.
Computers suitable for executing a computer program include, for example, general-purpose and/or special-purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit receives instructions and data from a read-only memory and/or a random access memory. The basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer also includes one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, or is operatively coupled to such mass storage devices to receive data from them, to transfer data to them, or both. However, a computer need not have such devices. Moreover, a computer may be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name just a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM, and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory may be supplemented by, or incorporated in, special-purpose logic circuitry.
Although the present disclosure contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what is claimed; rather, they mainly describe features of specific embodiments of a particular disclosure. Certain features described in the present disclosure in the context of multiple embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented in multiple embodiments separately or in any suitable sub-combination. Furthermore, although features may be described above as acting in certain combinations, and may even be initially claimed as such, one or more features from a claimed combination may in some cases be removed from that combination, and the claimed combination may be directed to a sub-combination or a variation of a sub-combination.
Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequence, or that all illustrated operations be performed, to achieve the desired results. In some cases, multitasking and parallel processing may be advantageous. In addition, the separation of the various system modules and components in the above embodiments should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
The above are merely some embodiments of the present disclosure and are not intended to limit it. Any modification, equivalent replacement, or transformation made within the spirit and principles of the present disclosure shall fall within the scope of the present disclosure.

Claims (25)

  1. A gesture control method, comprising:
    performing gesture recognition processing separately on N temporally consecutive frames of images in a video stream captured by a camera to obtain a gesture recognition result sequence, wherein the gesture recognition result sequence comprises recognition results of a plurality of gestures included in the N frames of images;
    in response to the number of identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, determining the identical gesture recognition result as a target gesture recognition result, wherein N and M are both integers greater than 1, and N is greater than or equal to M; and
    sending, to a target device, a control instruction corresponding to the target gesture recognition result, or controlling the target device to perform an operation corresponding to the target gesture recognition result.
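As an illustration only, the counting step of claim 1 can be sketched in Python. The function name `pick_target_gesture`, the string gesture labels, and the use of `None` for a frame in which no gesture was recognized are assumptions made for this sketch and do not come from the publication. When more than one label reaches the threshold, the sketch simply returns the most frequent one, which is also an assumption.

```python
from collections import Counter
from typing import Optional, Sequence


def pick_target_gesture(results: Sequence[Optional[str]], m: int) -> Optional[str]:
    """Return a label that occurs at least m times among the N results, else None.

    `results` holds one recognition result per frame (N = len(results)).
    None marks a frame with no recognized gesture (an assumption of this sketch).
    """
    n = len(results)
    if not (n >= m > 1):
        raise ValueError("expected integers with N >= M > 1")
    counts = Counter(r for r in results if r is not None)
    if not counts:
        return None
    label, count = counts.most_common(1)[0]
    return label if count >= m else None
```

With N = 5 and M = 4, a sequence containing four `"ok"` results and one `"fist"` result yields `"ok"`; raising M to 5 yields no target gesture.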
  2. The method according to claim 1, wherein determining, in response to the number of identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, the identical gesture recognition result as the target gesture recognition result comprises:
    in response to the number of consecutive identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, determining the identical gesture recognition result as the target gesture recognition result.
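The consecutive variant of claim 2 differs from plain counting only in that the identical results must form an unbroken run. A minimal sketch, again with a hypothetical function name and with `None` assumed to mean "no gesture recognized":

```python
from itertools import groupby
from typing import Optional, Sequence


def pick_consecutive_gesture(results: Sequence[Optional[str]], m: int) -> Optional[str]:
    """Return the first label that appears in a run of at least m consecutive results."""
    for label, run in groupby(results):
        # groupby yields one group per maximal run of equal adjacent items.
        if label is not None and sum(1 for _ in run) >= m:
            return label
    return None
```

For M = 3, the sequence `ok, fist, ok, ok, ok` qualifies, whereas `ok, ok, fist, ok, ok` does not, even though both contain four `ok` results.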
  3. The method according to claim 1 or 2, wherein before determining, in response to the number of identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, the identical gesture recognition result as the target gesture recognition result, the method further comprises:
    in response to a differing gesture recognition result being present between a plurality of the identical gesture recognition results included in the gesture recognition result sequence, and the proportion of the differing gesture recognition result in the gesture recognition result sequence being lower than a preset value, smoothing the differing gesture recognition result; wherein the differing gesture recognition result is different from the identical gesture recognition result.
  4. The method according to claim 3, wherein smoothing the differing gesture recognition result comprises:
    correcting the differing gesture recognition result to the identical gesture recognition result; or
    removing the differing gesture recognition result from the gesture recognition result sequence.
  5. The method according to claim 3, wherein smoothing the differing gesture recognition result comprises:
    treating the gesture recognition results that precede the differing gesture recognition result in time and the gesture recognition results that follow it in time as a plurality of consecutive gesture recognition results.
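The smoothing described in claims 3 to 5 can be sketched as follows. This is one possible reading, not the applicant's implementation: the function name `smooth_results`, the `mode` switch, and the choice to measure the proportion against the full sequence length are assumptions. Removing a differing result makes its neighbours adjacent, which has the same effect on a consecutive count as treating the results before and after it as consecutive.

```python
from typing import List, Optional, Sequence


def smooth_results(
    results: Sequence[Optional[str]],
    same: str,
    max_ratio: float,
    mode: str = "correct",
) -> List[Optional[str]]:
    """Smooth results that differ from `same` and lie between occurrences of `same`.

    Smoothing is applied only if the differing results make up a proportion of
    the whole sequence strictly below `max_ratio` (the "preset value").
    """
    positions = [i for i, r in enumerate(results) if r == same]
    if len(positions) < 2:
        return list(results)
    first, last = positions[0], positions[-1]
    outliers = [i for i in range(first, last + 1) if results[i] != same]
    if not outliers or len(outliers) / len(results) >= max_ratio:
        return list(results)
    if mode == "correct":
        # Rewrite each differing result as the surrounding identical result.
        smoothed = list(results)
        for i in outliers:
            smoothed[i] = same
        return smoothed
    if mode == "remove":
        # Drop the differing results so their neighbours become adjacent.
        drop = set(outliers)
        return [r for i, r in enumerate(results) if i not in drop]
    raise ValueError("mode must be 'correct' or 'remove'")
```

For the sequence `ok, ok, fist, ok, ok` with a preset value of 0.3, the single `fist` (proportion 0.2) is corrected to `ok` or removed, depending on `mode`.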
  6. The method according to any one of claims 1 to 5, wherein performing gesture recognition processing separately on the N temporally consecutive frames of images in the video stream captured by the camera comprises:
    acquiring a single-frame camera-captured image from the video stream, wherein the camera-captured image corresponds to the field-of-view space of the camera, and the field-of-view space of the camera includes an effective spatial region for gesture control;
    selecting, from the camera-captured image, a local image region corresponding to the effective spatial region for gesture control; and
    performing the gesture recognition processing on the local image region.
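Selecting the local image region of claim 6 amounts to cropping each frame before recognition. The sketch below assumes, purely for illustration, that a frame is a list of pixel rows and that the effective region has already been mapped to a pixel rectangle `(x, y, width, height)`; how the spatial region is mapped to pixels is not specified here.

```python
from typing import List, Sequence, Tuple


def crop_region(frame: Sequence[Sequence[int]], region: Tuple[int, int, int, int]) -> List[List[int]]:
    """Return the sub-image of `frame` covered by region = (x, y, width, height).

    The rectangle is clipped to the frame bounds, so a region that partly
    leaves the image yields a smaller crop instead of raising an error.
    """
    x, y, w, h = region
    rows = len(frame)
    cols = len(frame[0]) if rows else 0
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(cols, x + w), min(rows, y + h)
    return [list(row[x0:x1]) for row in frame[y0:y1]]
```

The gesture recognizer would then be run on the returned crop rather than on the full frame.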
  7. The method according to any one of claims 1 to 6, further comprising:
    receiving gesture recognition parameters configured by a user through a visual interface for parameter adjustment,
    wherein the gesture recognition processing is performed according to the gesture recognition parameters.
  8. The method according to claim 7, wherein
    the gesture recognition parameters include M.
  9. The method according to any one of claims 1 to 8, wherein the target device comprises a vehicle, and sending, to the target device, the control instruction corresponding to the target gesture recognition result, or controlling the target device to perform the operation corresponding to the target gesture recognition result, comprises:
    sending, to a functional component in the vehicle, a control instruction corresponding to the target gesture recognition result; or controlling the functional component in the vehicle to perform an operation corresponding to the target gesture recognition result.
  10. The method according to claim 9, wherein
    the functional component comprises a media player; and
    controlling the functional component in the vehicle to perform the operation corresponding to the target gesture recognition result comprises: in response to the target gesture recognition result, controlling the media player to change a media playback state.
  11. The method according to claim 9, wherein
    the functional component comprises a window controller; and
    controlling the functional component in the vehicle to perform the operation corresponding to the target gesture recognition result comprises: in response to the target gesture recognition result, controlling the window controller to move the window glass.
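The mapping from a target gesture recognition result to an operation on a vehicle component, as in claims 9 to 11, can be pictured as a dispatch table. Everything concrete below is hypothetical: the gesture labels (`"palm"`, `"swipe_down"`, `"swipe_up"`), the `MediaPlayer` and `WindowController` classes, and the 25-point window step are illustrative stand-ins, since the claims do not name specific gestures or component interfaces.

```python
from dataclasses import dataclass


@dataclass
class MediaPlayer:
    playing: bool = False

    def toggle(self) -> None:
        """Change the media playback state (play <-> pause)."""
        self.playing = not self.playing


@dataclass
class WindowController:
    position: int = 0  # 0 = fully closed, 100 = fully open (assumed scale)

    def move(self, delta: int) -> None:
        """Move the window glass, clamped to the 0..100 range."""
        self.position = max(0, min(100, self.position + delta))


def dispatch(gesture: str, player: MediaPlayer, window: WindowController) -> bool:
    """Run the operation mapped to `gesture`; return False for unmapped gestures."""
    actions = {
        "palm": player.toggle,
        "swipe_down": lambda: window.move(25),
        "swipe_up": lambda: window.move(-25),
    }
    action = actions.get(gesture)
    if action is None:
        return False
    action()
    return True
```

Keeping the mapping in a table means new gestures or components can be added without changing the recognition logic.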
  12. The method according to claim 9, wherein
    controlling the functional component in the vehicle to perform the operation corresponding to the target gesture recognition result comprises:
    in response to the target gesture recognition result, displaying, on a function status interface corresponding to the functional component to be controlled through the image, at least one of the following items: a running-started or running-stopped state of the functional component, a change in volume, or a like mark for a target object.
  13. A gesture control apparatus, comprising:
    a recognition processing module, configured to perform gesture recognition processing separately on N temporally consecutive frames of images in a video stream captured by a camera to obtain a gesture recognition result sequence, wherein the gesture recognition result sequence comprises recognition results of a plurality of gestures included in the N frames of images;
    a gesture determination module, configured to determine, in response to the number of identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, the identical gesture recognition result as a target gesture recognition result, wherein N and M are both integers greater than 1, and N is greater than or equal to M; and
    an operation control module, configured to send, to a target device, a control instruction corresponding to the target gesture recognition result, or to control the target device to perform an operation corresponding to the target gesture recognition result.
  14. The apparatus according to claim 13, wherein
    the gesture determination module is configured to: in response to the number of consecutive identical gesture recognition results included in the gesture recognition result sequence being greater than or equal to M, determine the identical gesture recognition result as the target gesture recognition result.
  15. The apparatus according to claim 13 or 14, wherein
    the gesture determination module is configured to: in response to a differing gesture recognition result being present between a plurality of the identical gesture recognition results included in the gesture recognition result sequence, and the proportion of the differing gesture recognition result in the gesture recognition result sequence being lower than a preset value, smooth the differing gesture recognition result; wherein the differing gesture recognition result is different from the target gesture recognition result.
  16. The apparatus according to claim 15, wherein the gesture determination module is configured to smooth the differing gesture recognition result by:
    correcting the differing gesture recognition result to the identical gesture recognition result; or
    removing the differing gesture recognition result from the gesture recognition result sequence.
  17. The apparatus according to claim 15, wherein the gesture determination module is configured to smooth the differing gesture recognition result by:
    treating the gesture recognition results that precede the differing gesture recognition result in time and the gesture recognition results that follow it in time as a plurality of consecutive gesture recognition results.
  18. The apparatus according to any one of claims 13 to 17, wherein the recognition processing module is configured to:
    when performing gesture recognition processing separately on the N temporally consecutive frames of images in the video stream captured by the camera, acquire a single-frame camera-captured image from the video stream, wherein the camera-captured image corresponds to the field-of-view space of the camera, and the field-of-view space of the camera includes an effective spatial region for gesture control;
    select, from the camera-captured image, a local image region corresponding to the effective spatial region for gesture control; and
    perform the gesture recognition processing on the local image region.
  19. The apparatus according to any one of claims 13 to 18, further comprising:
    a parameter receiving module, configured to receive gesture recognition parameters configured by a user through a visual interface for parameter adjustment, so that the recognition processing module performs the gesture recognition processing according to the gesture recognition parameters.
  20. The apparatus according to any one of claims 13 to 19, wherein
    the target device comprises a vehicle, and the operation control module is configured to: send, to a functional component in the vehicle, a control instruction corresponding to the target gesture recognition result; or control the functional component in the vehicle to perform an operation corresponding to the target gesture recognition result.
  21. The apparatus according to claim 20, wherein
    the functional component comprises a media player; and
    the operation control module is configured to: when controlling the functional component in the vehicle to perform the operation corresponding to the target gesture recognition result, control, in response to the target gesture recognition result, the media player to change a media playback state.
  22. The apparatus according to claim 20, wherein
    the functional component comprises a window controller; and
    the operation control module is configured to: when controlling the functional component in the vehicle to perform the operation corresponding to the target gesture recognition result, control, in response to the target gesture recognition result, the window controller to move the window glass.
  23. The apparatus according to claim 20, wherein
    the operation control module is further configured to: when controlling the functional component in the vehicle to perform the operation corresponding to the target gesture recognition result, display, in response to the target gesture recognition result and on a function status interface corresponding to the functional component to be controlled through the image, at least one of the following items: a running-started or running-stopped state of the functional component, a change in volume, or a like mark for a target object.
  24. An electronic device, comprising:
    a processor; and
    a memory storing a computer program that is executable by the processor to implement the gesture control method according to any one of claims 1 to 12.
  25. A computer-readable storage medium storing a computer program that is executable by a processor to implement the gesture control method according to any one of claims 1 to 12.
PCT/CN2020/105593 2019-10-22 2020-07-29 Gesture control method and apparatus WO2021077840A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2021544350A JP7479388B2 (en) 2019-10-22 2020-07-29 Gesture control method and device
KR1020217034498A KR20210141688A (en) 2019-10-22 2020-07-29 Gesture control method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911008049.0 2019-10-22
CN201911008049.0A CN110716648B (en) 2019-10-22 2019-10-22 Gesture control method and device

Publications (1)

Publication Number Publication Date
WO2021077840A1 (en) 2021-04-29

Family

ID=69214071

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105593 WO2021077840A1 (en) 2019-10-22 2020-07-29 Gesture control method and apparatus

Country Status (4)

Country Link
JP (1) JP7479388B2 (en)
KR (1) KR20210141688A (en)
CN (1) CN110716648B (en)
WO (1) WO2021077840A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114461068A (en) * 2022-02-07 2022-05-10 中国第一汽车股份有限公司 Vehicle use guidance interaction method, device, equipment and medium
CN114708577A (en) * 2022-03-29 2022-07-05 上海商汤临港智能科技有限公司 Vehicle window control method and device, electronic equipment and storage medium
CN116761040A (en) * 2023-08-22 2023-09-15 超级芯(江苏)智能科技有限公司 VR cloud platform interaction method and interaction system

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110716648B (en) * 2019-10-22 2021-08-24 上海商汤智能科技有限公司 Gesture control method and device
CN113473198B (en) * 2020-06-23 2023-09-05 青岛海信电子产业控股股份有限公司 Control method of intelligent equipment and intelligent equipment
WO2022021432A1 (en) 2020-07-31 2022-02-03 Oppo广东移动通信有限公司 Gesture control method and related device
CN111880660B (en) * 2020-07-31 2022-10-21 Oppo广东移动通信有限公司 Display screen control method and device, computer equipment and storage medium
CN112069914A (en) * 2020-08-14 2020-12-11 杭州鸿泉物联网技术股份有限公司 Vehicle control method and system
WO2022166338A1 (en) * 2021-02-08 2022-08-11 海信视像科技股份有限公司 Display device
CN113253847B (en) * 2021-06-08 2024-04-30 北京字节跳动网络技术有限公司 Terminal control method, device, terminal and storage medium
CN113934307B (en) * 2021-12-16 2022-03-18 佛山市霖云艾思科技有限公司 Method for starting electronic equipment according to gestures and scenes
CN114463781A (en) * 2022-01-18 2022-05-10 影石创新科技股份有限公司 Method, device and equipment for determining trigger gesture
CN114889543B (en) * 2022-06-08 2024-10-11 中国第一汽车股份有限公司 Anti-fatigue method and device for vehicle driving and vehicle
CN115421591B (en) * 2022-08-15 2024-03-15 珠海视熙科技有限公司 Gesture control device and image pickup apparatus

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110221974A1 (en) * 2010-03-11 2011-09-15 Deutsche Telekom Ag System and method for hand gesture recognition for remote control of an internet protocol tv
CN103376890A (en) * 2012-04-16 2013-10-30 富士通株式会社 Gesture remote control system based on vision
CN108596092A (en) * 2018-04-24 2018-09-28 亮风台(上海)信息科技有限公司 Gesture identification method, device, equipment and storage medium
CN109409277A (en) * 2018-10-18 2019-03-01 北京旷视科技有限公司 Gesture identification method, device, intelligent terminal and computer storage medium
US20190087009A1 (en) * 2017-09-19 2019-03-21 Texas Instruments Incorporated System and method for radar gesture recognition
CN109598198A (en) * 2018-10-31 2019-04-09 深圳市商汤科技有限公司 The method, apparatus of gesture moving direction, medium, program and equipment for identification
CN110716648A (en) * 2019-10-22 2020-01-21 上海商汤智能科技有限公司 Gesture control method and device

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009093291A (en) * 2007-10-04 2009-04-30 Toshiba Corp Gesture determination apparatus and method
WO2014030442A1 (en) 2012-08-22 2014-02-27 日本電気株式会社 Input device, input method, program, and electronic sign
JP6188468B2 (en) 2013-07-23 2017-08-30 アルパイン株式会社 Image recognition device, gesture input device, and computer program
WO2015104919A1 (en) * 2014-01-10 2015-07-16 コニカミノルタ株式会社 Gesture recognition device, operation input device, and gesture recognition method
CN104216514A (en) * 2014-07-08 2014-12-17 深圳市华宝电子科技有限公司 Method and device for controlling vehicle-mounted device, and vehicle
CN106372564A (en) * 2015-07-23 2017-02-01 株式会社理光 Gesture identification method and apparatus
JP6790396B2 (en) 2016-03-18 2020-11-25 株式会社リコー Information processing equipment, information processing system, service processing execution control method and program
JP2018036902A (en) 2016-08-31 2018-03-08 島根県 Equipment operation system, equipment operation method, and equipment operation program
JP2018055614A (en) * 2016-09-30 2018-04-05 島根県 Gesture operation system, and gesture operation method and program
CN110308786B (en) * 2018-03-20 2023-12-26 厦门歌乐电子企业有限公司 Vehicle-mounted equipment and gesture recognition method thereof
CN109492577B (en) * 2018-11-08 2020-09-18 北京奇艺世纪科技有限公司 Gesture recognition method and device and electronic equipment
CN110322760B (en) * 2019-07-08 2020-11-03 北京达佳互联信息技术有限公司 Voice data generation method, device, terminal and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114461068A (en) * 2022-02-07 2022-05-10 中国第一汽车股份有限公司 Vehicle use guidance interaction method, device, equipment and medium
CN114708577A (en) * 2022-03-29 2022-07-05 上海商汤临港智能科技有限公司 Vehicle window control method and device, electronic equipment and storage medium
CN116761040A (en) * 2023-08-22 2023-09-15 超级芯(江苏)智能科技有限公司 VR cloud platform interaction method and interaction system
CN116761040B (en) * 2023-08-22 2023-10-27 超级芯(江苏)智能科技有限公司 VR cloud platform interaction method and interaction system

Also Published As

Publication number Publication date
CN110716648A (en) 2020-01-21
CN110716648B (en) 2021-08-24
KR20210141688A (en) 2021-11-23
JP2022520030A (en) 2022-03-28
JP7479388B2 (en) 2024-05-08

Similar Documents

Publication Publication Date Title
WO2021077840A1 (en) Gesture control method and apparatus
CN103353935B (en) A kind of 3D dynamic gesture identification method for intelligent domestic system
CN106951871B (en) Motion trajectory identification method and device of operation body and electronic equipment
EP1573498B1 (en) User interface system based on pointing device
CN110764616A (en) Gesture control method and device
WO2022226736A1 (en) Multi-screen interaction method and apparatus, and terminal device and vehicle
WO2020038108A1 (en) Dynamic motion detection method and dynamic motion control method and device
EP3130969A1 (en) Method and device for showing work state of a device
CN103139627A (en) Intelligent television and gesture control method thereof
CN115427920A (en) Method and apparatus for adjusting control display gain of gesture-controlled electronic device
US20140152549A1 (en) System and method for providing user interface using hand shape trace recognition in vehicle
KR101438615B1 (en) System and method for providing a user interface using 2 dimension camera in a vehicle
KR20160106691A (en) System and method for controlling playback of media using gestures
US20140168068A1 (en) System and method for manipulating user interface using wrist angle in vehicle
CN110275611A (en) A kind of parameter adjusting method, device and electronic equipment
KR20160133305A (en) Gesture recognition method, a computing device and a control device
CN106293064A (en) A kind of information processing method and equipment
CN110750159B (en) Gesture control method and device
CN116968501A (en) Intelligent control method and system for vehicle-mounted air conditioner based on infrared technology
WO2023123473A1 (en) Man-machine interaction method and system, and processing device
CN118220019B (en) Vehicle control system, vehicle and vehicle control method
KR101517932B1 (en) The apparatus and method of stereo camera system with wide angle lens applicable for hand gesture recognition
CN116176432A (en) Vehicle-mounted device control method and device, vehicle and storage medium
Liu et al. Gesture recognition using novel efficient and robust 3d image processing
CN113568502A (en) Interaction method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20878767

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021544350

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20217034498

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20878767

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24/10/2022)
