
CN111258408B - Object boundary determining method and device for man-machine interaction - Google Patents


Info

Publication number
CN111258408B
CN111258408B (granted publication of application CN202010369965.3A)
Authority
CN
China
Prior art keywords
computing board
scene image
content
boundary
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010369965.3A
Other languages
Chinese (zh)
Other versions
CN111258408A (en)
Inventor
冯翀
罗观洲
郭嘉伟
马宇航
王宇轩
杜佳诺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shenguang Technology Co ltd
Original Assignee
Beijing Shenguang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenguang Technology Co ltd filed Critical Beijing Shenguang Technology Co ltd
Priority to CN202010369965.3A priority Critical patent/CN111258408B/en
Publication of CN111258408A publication Critical patent/CN111258408A/en
Application granted granted Critical
Publication of CN111258408B publication Critical patent/CN111258408B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/002 Specific input/output arrangements not covered by G06F 3/01 - G06F 3/16
    • G06F 3/005 Input arrangements through a video camera
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V 10/235 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on user input or interaction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/60 Type of objects
    • G06V 20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/14 Image acquisition
    • G06V 30/148 Segmentation of character regions
    • G06V 30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides an object boundary determining method and device for man-machine interaction. The method comprises the following steps: shooting scene images in real time with a wide-angle camera, and sending each captured frame to a computing board; the computing board then judges, based on each acquired scene image, whether a boundary can be determined for the current scene and, if so, determines the boundary range of the object. The main advantages of the invention are: after the scene image is analyzed, the position of the blank area is obtained by removing the detected objects from the scene image, which ensures delimiting accuracy; the invention supports free switching between delimiting the user operation interface and delimiting the displayed content; it can further extract the content of the user operation interface, accompanied by text and frame reminders; and with real-time projection tracking added, the user operation interface can be projected onto a moving object with a good visual effect, whether the projection lands on a blank area or on a panel, and the delimitation is updated in real time, greatly improving the user experience.

Description

Object boundary determining method and device for man-machine interaction
Technical Field
The invention relates to the technical field of human-computer interaction, in particular to a method and equipment for determining an object boundary for human-computer interaction.
Background
With the development of information technology, many reality technologies have emerged. In particular, augmented reality is now widely applied in entertainment, engineering and other fields, allowing people in the real world to engage closely with virtual things; the related technologies include multimedia, three-dimensional modeling, real-time tracking, intelligent interaction, sensing and other technical means.
Human-computer interaction is the study of the interactive relationship between a system and its users. The system may be any of various machines, or a computerized system and its software. The human-computer interaction interface generally refers to the portion visible to the user; the user communicates with the system and performs operations through this interface.
In the prior art, the position and size of a user operation interface projected by a projector are generally adjusted manually, which is troublesome and laborious. It is also difficult to project a user operation interface onto a non-blank surface, for example onto a textbook. The prior art cannot automatically identify the displayed content in the operation interface for preliminary delimitation, and existing delimiting approaches cannot adapt the delimitation to the projected content, so human-computer interaction efficiency is low. In addition, the prior art cannot adaptively adjust the distance between the projector and the projection surface according to the size of the delimited object, so the projection is blurry and the user experience suffers.
Disclosure of Invention
The present invention provides the following technical solutions to overcome the above-mentioned drawbacks in the prior art.
An object boundary determination method for human-computer interaction, the method comprising:
a scene information acquisition step, namely shooting a scene image in real time by using a wide-angle camera and sending each frame of shot scene image to a computing board;
and a determining step, wherein the computing board judges whether the boundary of the current scene can be determined or not based on each acquired scene image, and if so, the boundary range of the object is determined.
Further, the object is a user operation interface or displayed content.
Further, when the object is a user operation interface, the determining step includes:
the computing board receives each frame of scene image transmitted by the wide-angle camera in real time, and processes the object distribution of the current scene with a MobileNet-SSD detection network to determine the shape of each object and its corresponding category;
the computing board calculates the position of each object in the space of the scene image based on the scene image and the determined object shape, and generates an object data set after combining the position of each object and the corresponding category;
the computing board reads the position information from the object data set, subtracts all object regions from the scene image based on that position information to obtain the blank area information, and then determines, according to the user settings, whether a boundary can be determined; if so, it computes the position information of the delimitable area to determine the boundary range, and stores the boundary range;
the computing board transmits the stored boundary range to the projection unit together with a delimitation-success signal; after receiving this signal, the projection unit acquires the current user's stored setting information from the computing board;
the computing board determines whether to project in the blank area; if so, it determines a projection area according to the boundary range and the setting information, and the user's operation interface is projected in that area; if not, the user selects an object to be projected onto, the computing board reads that object's position information from the object data set, and the user's operation interface is projected based on that position information.
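The steps above can be sketched in code. This is a minimal illustration, assuming the detector's output is already available as (box, category) pairs; a real system would obtain these from a MobileNet-SSD pass over each camera frame, and all names here are illustrative, not taken from the patent.

```python
def build_object_dataset(detections):
    """Combine each object's position and category into the object data set."""
    return [{"position": box, "category": cat} for box, cat in detections]

def blank_cells(frame_w, frame_h, dataset, cell=10):
    """Subtract all object regions from the scene on a coarse grid; the
    grid cells left uncovered approximate the blank-area information."""
    covered = set()
    for obj in dataset:
        x, y, w, h = obj["position"]
        for cy in range(y // cell, (y + h - 1) // cell + 1):
            for cx in range(x // cell, (x + w - 1) // cell + 1):
                covered.add((cx, cy))
    every = {(cx, cy)
             for cy in range(frame_h // cell)
             for cx in range(frame_w // cell)}
    return every - covered
```

For example, with a 100x50 frame and a single detected book occupying the top-left 20x20 region, the remaining cells form the delimitable blank area from which a projection region could be chosen.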
Furthermore, when the object is a user operation interface, the method further includes:
and a first updating step, wherein the computing board processes the scene image transmitted by the wide-angle camera at a first time interval, then compares the processed scene image with the previous scene image with the determined boundary, and if the comparison result is inconsistent, the boundary is determined again.
Further, when the object is a user operation interface, the first updating step includes:
the computing board acquires a new frame of scene image from the wide-angle camera every second, and obtains the distribution state of all objects in that frame using the MobileNet-SSD detection network;
the computing board acquires the current setting information of the projection unit and compares it with the object distribution state; if the comparison error is larger than a first threshold, the boundary cannot be determined, the projection unit is updated to a cannot-delimit warning state, and the computing board is adjusted to a state in which it judges in real time whether delimitation is possible; if the comparison error is smaller than the first threshold, a new boundary range is determined and compared with the previously stored boundary range; if the difference is smaller than a second threshold, no update is made; otherwise the new boundary range is stored and transmitted to the projection unit, which adjusts the projection area accordingly.
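The per-second update logic can be sketched as follows. This is an illustrative interpretation only: the patent does not fix the comparison metric or the threshold values, so intersection-over-union and the numeric thresholds below are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix, iy = max(ax, bx), max(ay, by)
    ix2, iy2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0, ix2 - ix) * max(0, iy2 - iy)
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def update_boundary(stored, new, first_threshold=0.5, second_threshold=0.05):
    """One tick of the update: decide whether the boundary is lost
    (warn), essentially unchanged (keep), or should be replaced and
    pushed to the projection unit (update)."""
    if new is None or iou(stored, new) < first_threshold:
        return ("warn", stored)    # cannot delimit: warn and re-check in real time
    if 1.0 - iou(stored, new) < second_threshold:
        return ("keep", stored)    # change below second threshold: no update
    return ("update", new)         # store the new range and transmit it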
Further, when the object is a displayed content, the content is a content displayed on a user operation interface, and the determining step includes:
the wide-angle camera shoots a scene image in real time and transmits the scene image to the computing board at a second time interval, and the computing board transmits the scene image to a cloud server; the cloud server predicts the positions of the characters in the content by using a deep learning network, and cuts the picture containing the characters to obtain a first sub-picture and stores the first sub-picture;
the cloud server recognizes the text content of the first sub-picture using a CTC algorithm and, after recognition, generates a content data set from the text and its corresponding position;
the server transmits the content data set to the computing board, and the computing board transmits the position information in the content data set to the projection unit;
and the projection unit projects the content data set obtained by the computing board, and the user selects the content needing to be delimited.
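The content data set and the user's selection from it can be sketched as below. This assumes the cloud server's CTC recogniser has already returned each text fragment with the box it was cropped from; the field names and selection-by-point mechanism are assumptions for illustration.

```python
def make_content_dataset(ocr_results):
    """Pair each recognised text fragment with its corresponding position,
    as the cloud server does before replying to the computing board."""
    return [{"text": text, "position": box} for text, box in ocr_results]

def pick_content(dataset, point):
    """User selection: return the content item whose box contains the
    indicated point; the projection unit would then outline that item."""
    px, py = point
    for item in dataset:
        x, y, w, h = item["position"]
        if x <= px < x + w and y <= py < y + h:
            return item
    return None
```

The projection unit would first render every item in a light color, then add the outer frame to whichever item `pick_content` returns.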
Further, the projecting unit projecting the content data set obtained by the computing board and selecting the content to be delimited by the user includes:
the projection unit monitors the computing board in real time, and displays the content in a light color in a projection area after receiving the recognized characters and positions sent by the computing board;
the user makes a selection according to the displayed recognized content; after selection, the boundary at the position of the corresponding content is highlighted, indicating that that content is currently selected.
Further, the highlighting is shown by adding an outer frame.
Further, when the object is displayed content, the method further includes:
and a second updating step, wherein the computing board deeply identifies the content selected by the user, and updates the content to the projection unit for projection display after specific information of the content is obtained.
Still further, the second updating step includes:
when the user selects one identified area, the computing board acquires the selected area of the user and records the position of the selected area;
the computing board cuts the area selected by the user into a second sub-picture based on the position of the selected area, and analyzes the text or picture information in the second sub-picture using an intelligent recognition API;
the computing board combines the analyzed specific information of the text or picture with the position information to obtain the detailed information of the selected area, extracts the effective part of the detailed information, normalizes it to obtain normalized data, and transmits the normalized data to the projection unit;
and after the projection unit receives the normalized data from the computing board, it updates the corresponding display in the user operation area within the projection area.
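The second updating step can be sketched as follows. This is a sketch under stated assumptions: the scene image is stood in for by a nested list of rows, and the intelligent recognition API's reply is modelled as a plain dict, since the patent does not specify either format.

```python
def crop_second_subpicture(image, box):
    """Cut the user-selected region out of the scene image to form the
    second sub-picture (image is a list of rows standing in for pixels)."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]

def normalize_detail(detail):
    """Extract the effective part of the recognised detailed information
    and normalize it into the data transmitted to the projection unit."""
    return {"text": detail.get("text", "").strip(),
            "position": detail["position"]}
```

A real pipeline would pass the cropped sub-picture to the recognition API and forward `normalize_detail`'s result to the projection unit for display.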
The invention also proposes an object boundary determining device for human-computer interaction, the device comprising: the system comprises a projection unit, a wide-angle camera and a computing board;
the wide-angle camera shoots scene images in real time and sends each frame of shot scene images to the computing board; and after receiving each frame of scene image, the computing board judges whether the boundary of the current scene can be determined based on each acquired frame of scene image, and if so, determines the boundary range of the object.
Further, the object is a user operation interface or displayed content.
Further, when the object is a user operation interface, the computing board, after receiving each frame of scene image, judges based on each acquired frame whether a boundary can be determined for the current scene, and if so, determining the boundary range of the object includes:
the computing board receives each frame of scene image transmitted by the wide-angle camera in real time, and processes the object distribution of the current scene with a MobileNet-SSD detection network to determine the shape of each object and its corresponding category;
the computing board calculates the position of each object in the space of the scene image based on the scene image and the determined object shape, and generates an object data set after combining the position of each object and the corresponding category;
the computing board reads the position information from the object data set, subtracts all object regions from the scene image based on that position information to obtain the blank area information, and then determines, according to the user settings, whether a boundary can be determined; if so, it computes the position information of the delimitable area to determine the boundary range, and stores the boundary range;
the computing board transmits the stored boundary range to the projection unit together with a delimitation-success signal; after receiving this signal, the projection unit acquires the current user's stored setting information from the computing board;
the computing board determines whether to project in the blank area; if so, it determines a projection area according to the boundary range and the setting information, and the user's operation interface is projected in that area; if not, the user selects an object to be projected onto, the computing board reads that object's position information from the object data set, and the user's operation interface is projected based on that position information.
Furthermore, when the object is a user operation interface, the computing board processes the scene image transmitted by the wide-angle camera at a first time interval and then compares the processed scene image with the determined boundary, and if the comparison result is inconsistent, the boundary determination is carried out again.
Further, when the object is a user operation interface, the computing board processes the scene image transmitted by the wide-angle camera at the first time interval and compares it with the previously determined boundary; if the comparison result is inconsistent, re-performing the boundary determination includes:
the computing board acquires a new frame of scene image from the wide-angle camera every second, and obtains the distribution state of all objects in that frame using the MobileNet-SSD detection network;
the computing board acquires the current setting information of the projection unit and compares it with the object distribution state; if the comparison error is larger than a first threshold, the boundary cannot be determined, the projection unit is updated to a cannot-delimit warning state, and the computing board is adjusted to a state in which it judges in real time whether delimitation is possible; if the comparison error is smaller than the first threshold, a new boundary range is determined and compared with the previously stored boundary range; if the difference is smaller than a second threshold, no update is made; otherwise the new boundary range is stored and transmitted to the projection unit, which adjusts the projection area accordingly.
Further, when the object is displayed content, the content being content displayed on a user operation interface, the computing board receives each frame of scene image, judges based on each acquired frame whether a boundary can be determined for the current scene, and if so, determining the boundary range of the object includes:
the wide-angle camera shoots a scene image in real time and transmits the scene image to the computing board at a second time interval, and the computing board transmits the scene image to a cloud server; the cloud server predicts the positions of the characters in the content by using a deep learning network, and cuts the picture containing the characters to obtain a first sub-picture and stores the first sub-picture;
the cloud server recognizes the text content of the first sub-picture using a CTC algorithm and, after recognition, generates a content data set from the text and its corresponding position;
the server transmits the content data set to the computing board, and the computing board transmits the position information in the content data set to the projection unit;
and the projection unit projects the content data set obtained by the computing board, and the user selects the content needing to be delimited.
Further, the projecting unit projecting the content data set obtained by the computing board and selecting the content to be delimited by the user includes:
the projection unit monitors the computing board in real time, and displays the content in a light color in a projection area after receiving the recognized characters and positions sent by the computing board;
the user makes a selection according to the displayed recognized content; after selection, the boundary at the position of the corresponding content is highlighted, indicating that that content is currently selected.
Further, the highlighting is shown by adding an outer frame.
Furthermore, when the object is displayed content, the computing board deeply identifies the content selected by the user, obtains specific information of the content, updates the information to the projection unit and performs projection display.
Furthermore, the step of the computing board deeply identifying the content selected by the user, obtaining the specific information of the content and then updating the specific information to the projection unit for projection display includes:
when the user selects one identified area, the computing board acquires the selected area of the user and records the position of the selected area;
the computing board cuts the area selected by the user into a second sub-picture based on the position of the selected area, and analyzes the text or picture information in the second sub-picture using an intelligent recognition API;
the computing board combines the analyzed specific information of the text or picture with the position information to obtain the detailed information of the selected area, extracts the effective part of the detailed information, normalizes it to obtain normalized data, and transmits the normalized data to the projection unit;
and after the projection unit receives the normalized data from the computing board, it updates the corresponding display in the user operation area within the projection area.
The invention has the following technical effects. The invention discloses an object boundary determining method for man-machine interaction, comprising: a scene information acquisition step, in which a wide-angle camera shoots a scene image in real time and sends each captured frame to a computing board; and a determining step, in which the computing board judges from each acquired scene image whether a boundary can be determined for the current scene and, if so, determines the boundary range of the object. The main advantages of the invention are: after the scene image is analyzed, the position of the blank area is obtained by removing the objects from the scene image, which ensures delimiting accuracy and makes the projected user interface very clear; and the invention supports free switching between delimiting the user operation interface and delimiting the displayed content, which allows other operations to be added during projection delimitation. For example, the invention can further extract the content, such as extracting specific text or retrieving deep information about pictures, and this information can be displayed directly by projection.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.
Fig. 1 is a flowchart of an object boundary determination method for human-computer interaction according to one embodiment of the present invention.
Fig. 2 is a schematic diagram of an object boundary determining apparatus for human-computer interaction according to one embodiment of the present invention.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
FIG. 1 illustrates an object boundary determination method for human-computer interaction according to the present invention, the method comprising:
a scene information acquisition step S101, using a wide-angle camera to shoot a scene image in real time, and sending each frame of shot scene image to a computing board;
and a determining step S102, the computer board judges whether the boundary of the current scene can be determined based on each acquired scene image, and if so, the boundary range of the object is determined.
The method of the invention can be applied to an intelligent desk lamp equipped with a projection unit (i.e., a projector), a wide-angle camera, a depth camera, an infrared camera and the like. A computing board is arranged in the intelligent desk lamp; it has at least a processor and a memory and is used to complete data processing and the like. The intelligent desk lamp is of course also provided with a power supply, a power supply controller and so on. The boundary of an operation interface projected by the projection unit onto a desktop can be determined by the method. When the user operates on the operation interface, the boundary of the displayed content can be further determined. As for which object is delimited, the computing board judges according to the content currently projected by the projection unit, determines from the result whether the user operation interface or the displayed content is to be delimited, and then performs the corresponding delimiting operation: for example, at initialization the projected user operation interface is delimited, and once the user operation interface exists and the user operates on it, the displayed content on the operation interface is delimited. That is, the invention supports free switching between delimiting the user operation interface and delimiting the displayed content.
In one embodiment, when the computing board determines, according to the judgment result, that the user operation interface is delimited, that is, the object is the user operation interface, according to the judgment result, the determining step S102 includes:
the computing board receives each frame of scene image transmitted by the wide-angle camera in real time, and processes the object distribution of the current scene with a MobileNet-SSD detection network to determine the shape of each object and its corresponding category;
the computing board calculates the position of each object in the space of the scene image based on the scene image and the determined object shape, and generates an object data set after combining the position of each object and the corresponding category;
the computing board reads the position information from the object data set, subtracts all object regions from the scene image based on that position information to obtain the blank area information, and then determines, according to the user settings, whether a boundary can be determined; if so, it computes the position information of the delimitable area to determine the boundary range, and stores the boundary range;
the computing board transmits the stored boundary range to the projection unit together with a delimitation-success signal; after receiving this signal, the projection unit acquires the current user's stored setting information from the computing board;
the computing board determines whether to project in the blank area; if so, it determines a projection area according to the boundary range and the setting information, and the user's operation interface is projected in that area; if not, the user selects an object to be projected onto, the computing board reads that object's position information from the object data set, and the user's operation interface is projected based on that position information.
Two main delimiting modes are provided. One is projection onto a blank area: if the size of the area meets the projection-range size set by the user, delimitation can be achieved; this is the more traditional mode. The other is projection onto a recognized object, that is, onto a specific book or sheet of paper: if such a projectable object exists, delimitation can also be achieved. Through the specific operations described above, the invention obtains the position of the blank area by removing the objects from the scene image and then delimits it, which ensures delimiting accuracy and makes the projected user interface very clear; this is an important point of the invention.
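The choice between the two delimiting modes can be sketched as below, assuming the blank area and the object data set have already been computed; the size check and the category names are illustrative assumptions, not the patent's.

```python
def choose_projection_target(blank_rect, required_size, dataset):
    """Prefer the blank area if it satisfies the user's configured
    projection-range size; otherwise fall back to a projectable
    recognized object such as a book or a sheet of paper."""
    _, _, bw, bh = blank_rect      # (x, y, w, h) of the blank area
    rw, rh = required_size         # user-set projection-range size
    if bw >= rw and bh >= rh:
        return ("blank", blank_rect)
    for obj in dataset:
        if obj["category"] in ("book", "paper"):
            return ("object", obj["position"])
    return (None, None)            # no delimitable target in this scene
```

Returning `(None, None)` corresponds to the cannot-delimit warning state, in which the board keeps re-checking whether delimitation becomes possible.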
In one embodiment, when the object is a user operation interface, the method further includes:
the first updating step S103 is that the computing board processes the scene image transmitted by the wide-angle camera at a first time interval, compares the processed scene image with the previous scene image with the determined boundary, and determines the boundary again if the comparison result is inconsistent. When the object is a user operation interface, the first updating step S103 includes:
the computing board acquires a new frame of scene image from the wide-angle camera every second, and obtains the distribution state of all objects in that frame using the MobileNet-SSD detection network;
the computing board acquires the current setting information of the projection unit and compares it with the object distribution state; if the comparison error is larger than a first threshold, the boundary cannot be determined, the projection unit is updated to a cannot-delimit warning state, and the computing board is adjusted to a state in which it judges in real time whether delimitation is possible; if the comparison error is smaller than the first threshold, a new boundary range is determined and compared with the previously stored boundary range; if the difference is smaller than a second threshold, no update is made; otherwise the new boundary range is stored and transmitted to the projection unit, which adjusts the projection area accordingly.
Through this updating operation, the delimitation is refreshed in real time, so that the user operation interface can be projected onto a moving object; that is, the projection tracks the movement of the object, which is convenient for the user. This tracking effect, achieved through real-time refreshing, further improves the delimitation capability: when the user moves the device within a moderate range, the projection area follows automatically, greatly improving the user experience. This is another important inventive point of the invention.
In one embodiment, when the computing board determines, according to the content projected by the current projection unit, that the displayed content is to be delimited, i.e., when the object is displayed content (content shown on the user operation interface), the determining step S102 includes:
the wide-angle camera shoots a scene image in real time and transmits the scene image to the computing board at a second time interval, and the computing board transmits the scene image to a cloud server; the cloud server predicts the positions of the characters in the content by using a deep learning network, and cuts the picture containing the characters to obtain a first sub-picture and stores the first sub-picture;
the cloud server recognizes the text content of the first sub-picture using a CTC algorithm, and generates a content data set from the recognized text and its corresponding positions;
the server transmits the content data set to the computing board, and the computing board transmits information of a position in the content data set to the projection unit;
and the projection unit projects the content data set obtained by the computing board, and the user selects the content needing to be delimited.
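The cloud-side steps above (predict text positions, crop first sub-pictures, recognize, pair text with position) can be sketched as below. The `recognize` callable stands in for the CTC-based recognizer, and the list-of-rows image format and dictionary layout of the content data set are assumptions for illustration only.

```python
def build_content_dataset(scene_image, text_boxes, recognize):
    """Crop each predicted text region (a 'first sub-picture') and pair
    the recognized text with its position, as the cloud server does.

    scene_image: image as a list of pixel rows.
    text_boxes: (x1, y1, x2, y2) regions predicted by the detection network.
    recognize: callable standing in for the CTC recognition step.
    """
    dataset = []
    for (x1, y1, x2, y2) in text_boxes:
        # First sub-picture: the cropped region containing text.
        sub_picture = [row[x1:x2] for row in scene_image[y1:y2]]
        text = recognize(sub_picture)
        dataset.append({"text": text, "position": (x1, y1, x2, y2)})
    return dataset
```

The resulting data set is what the server transmits back to the computing board; the position entries are forwarded to the projection unit so the recognized regions can be displayed for selection.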
In one embodiment, the projecting unit projecting the content data set obtained by the computing board and selecting the content to be delimited by the user includes:
the projection unit monitors the computing board in real time, and displays the content in a light color in a projection area after receiving the recognized characters and positions sent by the computing board;
the user makes a selection from the displayed recognized content; after selection, the boundary at the position of the corresponding content is made prominent by adding an outer frame, indicating that the content is currently selected.
Through the delimitation of content, the displayed content can be further mined, for example by extracting specific text or retrieving deep information about a picture, and this information can be displayed directly by projection. For content delimitation, the projection automatically presents the recognized areas together with text prompts and frames; combined with the real-time tracking effect of the projection, the user can make a new selection at any time, giving a brand-new experience when obtaining information. This is another important inventive point of the invention.
In one embodiment, when the object is displayed content, the method further comprises:
and a second updating step S104, in which the computing board deeply identifies the content selected by the user, obtains the specific information of the content, updates the content to the projection unit and performs projection display. In one embodiment, the second updating step S104 includes:
when the user selects one identified area, the computing board acquires the selected area of the user and records the position of the selected area;
the computing board cuts the area selected by the user into a second sub-picture based on the position of the selected area, and analyzes the text or picture information in the second sub-picture using an intelligent recognition API;
the computing board combines the analyzed specific information of the text or picture with the position information to obtain detailed information of the selected area, extracts the effective part of the detailed information, normalizes it to obtain normalized data, and transmits the normalized data to the projection unit; here normalization refers to extracting the significant part of the information, such as the explanation of a phrase;
and after the projection unit receives the normalized data from the computing board, it updates the corresponding display in the user operation area of the projection area.
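The second updating step can be sketched as follows. The `analyze` and `normalize` callables stand in for the intelligent recognition API and the normalization step respectively; the function name, image format, and returned dictionary are illustrative assumptions, not the patent's implementation.

```python
def deep_identify_selection(scene_image, selected_box, analyze, normalize):
    """Second updating step, sketched: cut the user's selection into a
    'second sub-picture', analyze it, and normalize the useful part.

    scene_image: image as a list of pixel rows.
    selected_box: (x1, y1, x2, y2) of the user's selected area.
    analyze: stand-in for the intelligent recognition API.
    normalize: extracts the significant part (e.g. a phrase explanation).
    """
    x1, y1, x2, y2 = selected_box
    # Second sub-picture: the user's selected area cut from the frame.
    second_sub_picture = [row[x1:x2] for row in scene_image[y1:y2]]
    details = analyze(second_sub_picture)
    return {
        "position": selected_box,       # position info is kept alongside
        "data": normalize(details),     # normalized data for the projector
    }
```

The returned payload corresponds to what the computing board transmits to the projection unit for display in the user operation area.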
The delimitation of displayed content provides a better way to record and mark positions related to text and pictures, and also facilitates information acquisition and subsequent applications after delimitation.
In addition, in one embodiment, to ensure a good projection effect on objects of different sizes and at different distances, the invention adjusts the projector focal length based on the distance of the center point, specifically as follows. The computing board has determined the object to be delimited (e.g., a book) based on the user's selection and has calculated the position of its boundary in the scene to be projected. Based on the four boundaries of the delimited object, the computing board calculates the intersection of the diagonals of the delimited area, i.e., the position of the center point, and stores it. The computing board then starts the depth camera to capture the scene and temporarily stores the resulting RGB-D information. From the RGB-D information it extracts the depth channel, combines it with the position of the center point of the delimited area to obtain the distance between the center point and the camera, and then, after a fine adjustment for the relative positions of the camera and the projector, obtains the distance between the projector and the center of the delimited area. Finally, the computing board calls the projector's initialization method with this distance as the initial focal length; together with the trapezoidal (keystone) correction performed during initialization, this achieves a clear projection at the corresponding position. Through these operations, when the boundary is delimited, the focus between the projection unit and the center of the boundary area is automatically adjusted according to the size of the delimited object, making the projected interface clearer, which is another important inventive point of the invention.
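The center-point computation above can be sketched as follows. For an axis-aligned boundary rectangle, the intersection of the diagonals is simply the midpoint; the function name and the row-major depth-map indexing are assumptions, and the camera-to-projector offset correction mentioned in the text is left out as a separate step.

```python
def focal_distance_from_depth(boundary, depth_map):
    """Estimate the distance used as the projector's initial focal length,
    per the center-point method: diagonal intersection of the delimited
    area, looked up in the depth channel of the RGB-D frame.

    boundary: (x1, y1, x2, y2) of the delimited object.
    depth_map: depth channel indexed as depth_map[row][col].
    Returns ((cx, cy), distance); the camera-projector position offset
    would be applied to the distance afterwards.
    """
    x1, y1, x2, y2 = boundary
    # For an axis-aligned rectangle the diagonal intersection is the midpoint.
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    distance = float(depth_map[cy][cx])   # depth at the center point
    return (cx, cy), distance
```

The returned distance would then be passed to the projector's initialization routine, where keystone correction is also applied.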
Fig. 2 shows an object boundary determining apparatus for human-computer interaction of the present invention, the apparatus comprising: projection unit, wide-angle camera, and computing board, etc.;
the wide-angle camera shoots scene images in real time and sends each frame of shot scene images to the computing board; and after receiving each frame of scene image, the computing board judges whether the boundary of the current scene can be determined based on each acquired frame of scene image, and if so, determines the boundary range of the object.
The device of the invention may be an intelligent desk lamp having a projection unit (i.e., a projector), a wide-angle camera, a depth camera, an infrared camera, and the like. A computing board is provided inside the lamp; it has at least a processor and a memory for data processing, and naturally also a power supply, a power-supply controller, and so on. The projection unit can be a projector, and the boundary of the operation interface it projects on a desktop can be determined by the method described above. When a user operates on the operation interface, the boundary of displayed content can further be determined. As for which object is delimited, the computing board judges according to the content projected by the current projection unit, decides from the result whether the user operation interface or the displayed content is to be delimited, and then performs the corresponding delimiting operation: for example, the projected user operation interface is delimited at initialization, and once the operation interface exists and the user operates on it, the displayed content on the interface is delimited. In other words, the invention supports free switching between delimitation of the user operation interface and delimitation of displayed content. The intelligent desk lamp can interact with a server, as shown in fig. 2; the server does not belong to the intelligent desk lamp itself and may be a cloud server or the like.
In one embodiment, when the computing board determines, according to the content projected by the current projection unit, that the user operation interface is to be delimited, i.e., when the object is the user operation interface, the computing board, after receiving each frame of scene image, judges whether a boundary can be determined for the current scene based on that frame; if so, determining the boundary range of the object includes:
the computing board receives each frame of scene image transmitted by the wide-angle camera in real time, and processes the object distribution of the current scene using a MobileNet-SSD detection network to determine the shape of each object and its corresponding category;
the computing board calculates the position of each object in the space of the scene image based on the scene image and the determined object shape, and generates an object data set after combining the position of each object and the corresponding category;
the computing board reads position information from the object data set, all object distribution information is subtracted from the scene image based on the position information to obtain blank area information, then the computing board determines whether the boundary can be determined according to user setting, if so, the position information of the delimitable area is computed to determine the boundary range, and the boundary range is stored;
the computing board transmits the stored boundary range to the projection unit, transmits a successful delimitation signal at the same time, and acquires the stored setting information of the current user from the computing board after the projection unit receives the successful delimitation signal;
the computing board determines whether to project in the blank area, if so, determines a projection area according to the boundary range and the setting information, and projects an operation interface of a user in the projection area; if not, selecting an object to be projected by the user, reading the position information of the object in the object data set by the computing board, and projecting an operation interface of the user based on the position information of the object.
Two main delimiting modes are provided. The first is projection onto a blank area: if the area is at least as large as the projection range set by the user, delimitation can be performed; this is the more conventional mode. The second is projection onto a recognized object, i.e., onto a specific book or sheet of paper: if such a projectable object exists, delimitation can be performed there. The invention obtains the position of the blank area by removing the detected objects from the scene image through the operations described above and then performs delimitation, so that delimitation is accurate and the projected user interface is very clear, which is an important inventive point of the invention.
In one embodiment, when the object is a user operation interface, the computing board processes a scene image transmitted by the wide-angle camera at a first time interval, compares the processed scene image with a previous scene image with a determined boundary, and if the comparison result is inconsistent, performs boundary determination again.
In one embodiment, when the object is a user operation interface, the computing board processes a scene image transmitted by the wide-angle camera at a first time interval and then compares the processed scene image with a previous scene image with a determined boundary, and if the comparison result is inconsistent, the re-performing the boundary determination includes:
the computing board acquires a new frame of scene image from the wide-angle camera every second, and obtains the distribution states of all objects in that frame using the MobileNet-SSD detection network;
the computing board acquires the setting information of the current projection unit and compares it with the object distribution state. If the error of the comparison is larger than a first threshold, no boundary can be determined: the projection unit is updated to a warning state indicating that delimitation is impossible, and the computing board is switched to a state in which it judges in real time whether delimitation has become possible. If the error of the comparison is smaller than the first threshold, a new boundary range is determined and compared with the previously stored boundary range; if the difference is smaller than a second threshold, no update is performed, otherwise the new boundary range is stored and transmitted to the projection unit, which adjusts the projection area accordingly.
Through this updating operation, the delimitation is refreshed in real time, so that the user operation interface can be projected onto a moving object; that is, the projection tracks the movement of the object, which is convenient for the user. This tracking effect, achieved through real-time refreshing, further improves the delimitation capability: when the user moves the device within a moderate range, the projection area follows automatically, greatly improving the user experience. This is another important inventive point of the invention.
In one embodiment, when the computing board determines, according to the content projected by the current projection unit, that the displayed content is to be delimited, i.e., when the object is displayed content (content shown on the user operation interface), the computing board, after receiving each frame of scene image, judges whether a boundary can be determined for the current scene based on that frame; if so, determining the boundary range of the object includes:
the wide-angle camera shoots a scene image in real time and transmits the scene image to the computing board at a second time interval, and the computing board transmits the scene image to a cloud server; the cloud server predicts the positions of the characters in the content by using a deep learning network, and cuts the picture containing the characters to obtain a first sub-picture and stores the first sub-picture;
the cloud server recognizes the text content of the first sub-picture using a CTC algorithm, and generates a content data set from the recognized text and its corresponding positions;
the server transmits the content data set to the computing board, and the computing board transmits information of a position in the content data set to the projection unit;
and the projection unit projects the content data set obtained by the computing board, and the user selects the content needing to be delimited. Preferably, the projecting unit projecting the content data set obtained by the computing board and selecting the content to be delimited by the user includes:
the projection unit monitors the computing board in real time, and displays the content in a light color in a projection area after receiving the recognized characters and positions sent by the computing board;
the user makes a selection from the displayed recognized content; after selection, the boundary at the position of the corresponding content is made prominent by adding an outer frame, indicating that the content is currently selected.
Through the delimitation of content, the displayed content can be further mined, for example by extracting specific text or retrieving deep information about a picture, and this information can be displayed directly by projection. For content delimitation, the projection automatically presents the recognized areas together with text prompts and frames; combined with the real-time tracking effect of the projection, the user can make a new selection at any time, giving a brand-new experience when obtaining information. This is another important inventive point of the invention.
In one embodiment, when the object is displayed content, the computing board deeply identifies the content selected by the user, obtains the specific information of that content, updates it to the projection unit, and performs projection display. This deep identification and update by the computing board comprises the following steps:
when the user selects one identified area, the computing board acquires the selected area of the user and records the position of the selected area;
the computing board cuts the area selected by the user into a second sub-picture based on the position of the selected area, and analyzes the text or picture information in the second sub-picture using an intelligent recognition API;
the computing board combines the analyzed specific information of the text or picture with the position information to obtain detailed information of the selected area, extracts the effective part of the detailed information, normalizes it to obtain normalized data, and transmits the normalized data to the projection unit; here normalization refers to extracting the significant part of the information, such as the explanation of a phrase;
and after the projection unit receives the normalized data from the computing board, it updates the corresponding display in the user operation area of the projection area.
The delimitation of displayed content provides a better way to record and mark positions related to text and pictures, and also facilitates information acquisition and subsequent applications after delimitation.
In addition, in one embodiment, to ensure a good projection effect on objects of different sizes and at different distances, the invention adjusts the projector focal length based on the distance of the center point, specifically as follows. The computing board has determined the object to be delimited (e.g., a book) based on the user's selection and has calculated the position of its boundary in the scene to be projected. Based on the four boundaries of the delimited object, the computing board calculates the intersection of the diagonals of the delimited area, i.e., the position of the center point, and stores it. The computing board then starts the depth camera to capture the scene and temporarily stores the resulting RGB-D information. From the RGB-D information it extracts the depth channel, combines it with the position of the center point of the delimited area to obtain the distance between the center point and the camera, and then, after a fine adjustment for the relative positions of the camera and the projector, obtains the distance between the projector and the center of the delimited area. Finally, the computing board calls the projector's initialization method with this distance as the initial focal length; together with the trapezoidal (keystone) correction performed during initialization, this achieves a clear projection at the corresponding position. Through these operations, when the boundary is delimited, the focus between the projection unit and the center of the boundary area is automatically adjusted according to the size of the delimited object, making the projected interface clearer, which is another important inventive point of the invention.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications and equivalents may be made thereto without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (6)

1. An object boundary determination method for human-computer interaction, the method comprising:
a scene information acquisition step, namely shooting a scene image in real time by using a wide-angle camera and sending each frame of shot scene image to a computing board;
determining, namely judging whether the current scene can determine a boundary or not by the computing board based on each acquired scene image, and if so, determining the boundary range of the object;
the object is a user operation interface or displayed content;
when the object is a user operation interface, the determining step includes:
the computing board receives each frame of scene image transmitted by the wide-angle camera in real time, and processes the object distribution of the current scene by using a MobileNet-SSD detection network to determine the shape of each object and the corresponding category of each object;
the computing board calculates the position of each object in the space of the scene image based on the scene image and the determined object shape, and generates an object data set after combining the position of each object and the corresponding category;
the computing board reads position information from the object data set, all object distribution information is subtracted from the scene image based on the position information to obtain blank area information, then the computing board determines whether the boundary can be determined according to user setting, if so, the position information of the delimitable area is computed to determine the boundary range, and the boundary range is stored;
the computing board transmits the stored boundary range to the projection unit, transmits a successful delimitation signal at the same time, and acquires the stored setting information of the current user from the computing board after the projection unit receives the successful delimitation signal;
the computing board determines whether to project in the blank area, if so, determines a projection area according to the boundary range and the setting information, and projects an operation interface of a user in the projection area; if not, selecting an object to be projected by the user, reading the position information of the object in the object data set by the computing board, and projecting an operation interface of the user based on the position information of the object;
when the object is displayed content, the content is content displayed on a user operation interface, and the determining step includes:
the wide-angle camera shoots a scene image in real time and transmits the scene image to the computing board at a second time interval, and the computing board transmits the scene image to a cloud server; the cloud server predicts the positions of the characters in the content by using a deep learning network, and cuts the picture containing the characters to obtain a first sub-picture and stores the first sub-picture;
the cloud server identifies the text content of the first sub-picture by using a CTC algorithm, and generates a content data set from the text and the corresponding position after identification;
the server transmits the content data set to the computing board, and the computing board transmits information of a position in the content data set to the projection unit;
the projection unit projects a content data set obtained by the computing board, and a user selects content needing delimitation;
the projecting unit projecting the content data set obtained by the computing board and selecting the content needing to be delimited by a user comprises the following steps:
the projection unit monitors the computing board in real time, and displays the content in a light color in a projection area after receiving the recognized characters and positions sent by the computing board;
the user selects the content according to the displayed identified content, and after selection, the boundary at the position of the corresponding content is made obvious, which indicates that the current content is selected.
2. The method of claim 1, wherein when the object is a user operation interface,
the method further comprises the following steps:
and a first updating step, wherein the computing board processes the scene image transmitted by the wide-angle camera at a first time interval, then compares the processed scene image with the previous scene image with the determined boundary, and if the comparison result is inconsistent, the boundary is determined again.
3. The method according to claim 2, wherein when the object is a user operation interface, the updating step comprises:
the computing board acquires a frame of scene image from the wide-angle camera again every second, and acquires the distribution states of all objects in the frame of scene image by using the MobileNet-SSD detection network;
the computing board acquires the setting information of the current projection unit, compares the setting information with the distribution state of the object, if the error of the comparison result is larger than a first threshold value, the boundary can not be determined, the projection unit is updated to a warning state which can not be delimited, and the state of the computing board is adjusted to a state which can judge whether the delimitation can be realized in real time; if the error of the comparison result is smaller than the first threshold value, determining a new boundary range, comparing the new boundary range with the previously stored boundary range, if the comparison result is smaller than a second threshold value, not updating, otherwise, storing the new boundary range and transmitting the new boundary range to the projection unit, and the projection unit correspondingly adjusts the projection area according to the new boundary range.
4. An object boundary determination device for human-computer interaction, the device comprising: the system comprises a projection unit, a wide-angle camera and a computing board;
the wide-angle camera shoots scene images in real time and sends each frame of shot scene images to the computing board; after receiving each frame of scene image, the computing board judges whether the current scene can determine a boundary or not based on each acquired frame of scene image, and if so, determines the boundary range of the object;
the object is a user operation interface or displayed content, when the object is the user operation interface, the computing board receives each frame of scene image and then judges whether the current scene can determine a boundary based on each acquired frame of scene image, and if so, the boundary range of the object is determined to include:
the computing board receives each frame of scene image transmitted by the wide-angle camera in real time, and processes the object distribution of the current scene by using a MobileNet-SSD detection network to determine the shape of each object and the corresponding category of the object;
the computing board calculates the position of each object in the space of the scene image based on the scene image and the determined object shape, and generates an object data set after combining the position of each object and the corresponding category;
the computing board reads position information from the object data set, all object distribution information is subtracted from the scene image based on the position information to obtain blank area information, then the computing board determines whether the boundary can be determined according to user setting, if so, the position information of the delimitable area is computed to determine the boundary range, and the boundary range is stored;
the computing board transmits the stored boundary range to the projection unit, transmits a successful delimitation signal at the same time, and acquires the stored setting information of the current user from the computing board after the projection unit receives the successful delimitation signal;
the computing board determines whether to project in the blank area, if so, determines a projection area according to the boundary range and the setting information, and projects an operation interface of a user in the projection area; if not, selecting an object to be projected by the user, reading the position information of the object in the object data set by the computing board, and projecting an operation interface of the user based on the position information of the object;
when the object is displayed content, the content is content displayed on a user operation interface, the computing board receives each frame of scene image and judges whether the current scene can determine a boundary based on each acquired frame of scene image, if so, the boundary range of the object is determined to include:
the wide-angle camera shoots a scene image in real time and transmits the scene image to the computing board at a second time interval, and the computing board transmits the scene image to a cloud server; the cloud server predicts the positions of the characters in the content by using a deep learning network, and cuts the picture containing the characters to obtain a first sub-picture and stores the first sub-picture;
the cloud server identifies the text content of the first sub-picture by using a CTC algorithm, and generates a content data set from the text and the corresponding position after identification;
the server transmits the content data set to the computing board, and the computing board transmits information of a position in the content data set to the projection unit;
the projection unit projects a content data set obtained by the computing board, and a user selects content needing delimitation;
the projecting unit projecting the content data set obtained by the computing board and selecting the content needing to be delimited by a user comprises the following steps:
the projection unit monitors the computing board in real time, and displays the content in a light color in a projection area after receiving the recognized characters and positions sent by the computing board;
the user selects the content according to the displayed identified content, and after selection, the boundary at the position of the corresponding content is made obvious, which indicates that the current content is selected.
5. The device of claim 4, wherein, when the object is a user operation interface, the computing board processes the scene image transmitted by the wide-angle camera at a first time interval, compares the processed scene image with the previously determined boundary scene image, and, if the comparison results are inconsistent, performs the boundary determination again.
6. The device of claim 5, wherein, when the object is a user operation interface, the computing board processing the scene image transmitted by the wide-angle camera at a first time interval, comparing the processed scene image with the previously determined boundary scene image, and, if the comparison results are inconsistent, performing the boundary determination again comprises:
the computing board acquires a new frame of the scene image from the wide-angle camera every second, and obtains the distribution states of all objects in the frame by using a MobileNet-SSD detection network;
the computing board acquires the setting information of the current projection unit and compares it with the distribution state of the objects; if the error of the comparison result is larger than a first threshold, the boundary cannot be determined, the projection unit is updated to a warning state indicating that delimitation is impossible, and the computing board is adjusted to a state of judging in real time whether delimitation is possible; if the error of the comparison result is smaller than the first threshold, a new boundary range is determined and compared with the previously stored boundary range; if the difference is smaller than a second threshold, no update is performed; otherwise, the new boundary range is stored and transmitted to the projection unit, and the projection unit adjusts the projection area according to the new boundary range.
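The two-threshold decision flow of claim 6 can be sketched as below. This is one illustrative reading, not the patented implementation: the scalar error metric, the tuple boundary format, and the returned state names are assumptions made for the example.

```python
# Sketch of the claimed two-threshold update: a large layout error means the
# boundary cannot be determined; a small boundary change is ignored as jitter;
# otherwise the new range is stored and sent to the projection unit.
def update_boundary(layout_error, stored, candidate, t1, t2):
    if layout_error > t1:
        return "cannot_delimit", stored   # warn; keep judging in real time
    change = sum(abs(a - b) for a, b in zip(stored, candidate))
    if change < t2:
        return "unchanged", stored        # below second threshold: no update
    return "updated", candidate           # store and re-project

print(update_boundary(0.1, (0, 0, 100, 100), (0, 0, 102, 100), t1=0.5, t2=5))
# ('unchanged', (0, 0, 100, 100))
```

Separating the two thresholds keeps the projection area stable: detection noise below `t2` never triggers a re-projection, while a genuinely changed scene (error above `t1`) falls back to the full delimitation check.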
CN202010369965.3A 2020-05-06 2020-05-06 Object boundary determining method and device for man-machine interaction Active CN111258408B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010369965.3A CN111258408B (en) 2020-05-06 2020-05-06 Object boundary determining method and device for man-machine interaction

Publications (2)

Publication Number Publication Date
CN111258408A CN111258408A (en) 2020-06-09
CN111258408B true CN111258408B (en) 2020-09-01

Family

ID=70955199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010369965.3A Active CN111258408B (en) 2020-05-06 2020-05-06 Object boundary determining method and device for man-machine interaction

Country Status (1)

Country Link
CN (1) CN111258408B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112558818B (en) * 2021-02-19 2021-06-08 北京深光科技有限公司 Projection-based remote live broadcast interaction method and system

Citations (1)

Publication number Priority date Publication date Assignee Title
CN107689082A (en) * 2016-08-03 2018-02-13 腾讯科技(深圳)有限公司 A kind of data projection method and device

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US8567958B2 (en) * 2011-03-17 2013-10-29 International Business Machines Corporation Organizing projections on a surface
CN104052976B (en) * 2014-06-12 2016-04-27 海信集团有限公司 Projecting method and device
CN106254847B (en) * 2016-06-12 2017-08-25 深圳超多维光电子有限公司 A kind of methods, devices and systems for the display limit for determining stereoscopic display screen
CN106060310A (en) * 2016-06-17 2016-10-26 联想(北京)有限公司 Display control method and display control apparatus
CN106507077B (en) * 2016-11-28 2018-07-24 江苏鸿信系统集成有限公司 Preventing collision method is corrected and blocked to projecting apparatus picture based on image analysis
CN110769224B (en) * 2018-12-27 2021-06-29 成都极米科技股份有限公司 Projection area acquisition method and projection method

Also Published As

Publication number Publication date
CN111258408A (en) 2020-06-09

Similar Documents

Publication Publication Date Title
CN110650368B (en) Video processing method and device and electronic equipment
US11176355B2 (en) Facial image processing method and apparatus, electronic device and computer readable storage medium
US20200193577A1 (en) Method and apparatus for implementing image enhancement, and electronic device
US20210334998A1 (en) Image processing method, apparatus, device and medium for locating center of target object region
US11159717B2 (en) Systems and methods for real time screen display coordinate and shape detection
CN112135041B (en) Method and device for processing special effect of human face and storage medium
US12034996B2 (en) Video playing method, apparatus and device, storage medium, and program product
US11854238B2 (en) Information insertion method, apparatus, and device, and computer storage medium
US20200410737A1 (en) Image display method and device applied to electronic device, medium, and electronic device
CN114374882A (en) Barrage information processing method and device, terminal and computer-readable storage medium
CN111258408B (en) Object boundary determining method and device for man-machine interaction
CN117459661A (en) Video processing method, device, equipment and machine-readable storage medium
CN112380940B (en) Processing method and device of high-altitude parabolic monitoring image, electronic equipment and storage medium
CN110958463A (en) Method, device and equipment for detecting and synthesizing virtual gift display position
CN113706504A (en) Ghost processing method and device, storage medium and electronic equipment
CN109799905B (en) Hand tracking method and advertising machine
CN113963355B (en) OCR character recognition method, device, electronic equipment and storage medium
CN113778233B (en) Method and device for controlling display equipment and readable medium
CN111784847A (en) Method and device for displaying object in three-dimensional scene
CN109885172A (en) A kind of object interaction display method and system based on augmented reality AR
CN115309113A (en) Guiding method for part assembly and related equipment
CN111507139A (en) Image effect generation method and device and electronic equipment
CN116363725A (en) Portrait tracking method and system for display device, display device and storage medium
CN115086738B (en) Information adding method, information adding device, computer equipment and storage medium
CN115348438B (en) Control method and related device for three-dimensional display equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant