
WO2019242454A1 - Object modeling and movement method, apparatus, and device - Google Patents

Object modeling and movement method, apparatus, and device

Info

Publication number
WO2019242454A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
model
movement mode
module
bone
Prior art date
Application number
PCT/CN2019/088480
Other languages
English (en)
French (fr)
Inventor
岩本尚也
王提政
雷财华
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to KR1020217001341A priority Critical patent/KR102524422B1/ko
Priority to BR112020025903-9A priority patent/BR112020025903A2/pt
Priority to SG11202012802RA priority patent/SG11202012802RA/en
Priority to CA3104558A priority patent/CA3104558A1/en
Priority to AU2019291441A priority patent/AU2019291441B2/en
Priority to JP2020570722A priority patent/JP7176012B2/ja
Priority to EP19821647.5A priority patent/EP3726476A4/en
Publication of WO2019242454A1 publication Critical patent/WO2019242454A1/zh
Priority to US16/931,024 priority patent/US11436802B2/en
Priority to US17/879,164 priority patent/US20220383579A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/04 Texture mapping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/10 Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20 Finite element generation, e.g. wire-frame surface description, tesselation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20 Finite element generation, e.g. wire-frame surface description, tesselation
    • G06T17/205 Re-meshing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/507 Depth or shape recovery from shading
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/579 Depth or shape recovery from multiple images from motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects
    • G06V20/653 Three-dimensional objects by matching three-dimensional models, e.g. conformal mapping of Riemann surfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/165 Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/24 Indexing scheme for image data processing or generation, in general involving graphical user interfaces [GUIs]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2219/00 Indexing scheme for manipulating 3D models or images for computer graphics
    • G06T2219/20 Indexing scheme for editing of 3D models
    • G06T2219/2016 Rotation, translation, scaling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2219/00 Indexing scheme for manipulating 3D models or images for computer graphics
    • G06T2219/20 Indexing scheme for editing of 3D models
    • G06T2219/2021 Shape modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2219/00 Indexing scheme for manipulating 3D models or images for computer graphics
    • G06T2219/20 Indexing scheme for editing of 3D models
    • G06T2219/2024 Style variation

Definitions

  • the present invention relates to the technical field of terminals, and in particular, to a method, an apparatus, and a device for modeling movement of an object.
  • the methods for acquiring images include the use of various still cameras, video cameras, scanners, etc. These methods usually yield only a planar image of the object, that is, two-dimensional information about the object. In many fields, such as machine vision, face shape detection, physical profiling, automated processing, product quality control, and biomedicine, the three-dimensional information of an object is essential. 3D scanning technology emerged to meet this need.
  • the commonly used equipment is a 3D scanner, a scientific instrument used to detect and analyze the shape (geometric structure) and appearance data (such as color and surface albedo) of objects or environments in the real world.
  • the purpose of the 3D scanner is to create a point cloud of the geometric surface of the object. These points can be interpolated to form the surface shape of the object; the denser the point cloud, the more accurate the resulting model (this process is also called 3D reconstruction). If the scanner can obtain the surface color, a texture map can additionally be pasted onto the reconstructed surface, which is called texture mapping.
  • the 3D scanners in the prior art are complicated to use, require users with professional skills to operate them, and have relatively limited application scenarios; how to make 3D scanning technology accessible to ordinary users is therefore an urgent problem to be solved.
  • Embodiments of the present invention provide a method, an apparatus, and a device for object modeling and movement, which can scan an object of interest anytime and anywhere and give it dynamic effects, making the terminal more fun and playable, improving user stickiness, and keeping pace with the trend of the times.
  • an embodiment of the present invention provides an object modeling and movement method, which is applied to a mobile terminal; the mobile terminal includes a color camera and a depth camera, and the color camera and the depth camera are located on the same side of the mobile terminal, either the front or the back;
  • the method specifically includes: using the color camera and the depth camera to perform a panoramic scan of the target object to obtain a 3D model of the target object; obtaining a target bone model; fusing the target bone model with the 3D model of the target object; obtaining a target movement mode; and controlling the skeletal model according to the target movement mode, so that the 3D model of the target object moves according to the target movement mode.
  • an embodiment of the present invention provides an object modeling and movement apparatus, which is applied to a mobile terminal; the mobile terminal includes a color camera and a depth camera, and the color camera and the depth camera are located on the same side of the mobile terminal, either the front or the back;
  • the apparatus comprises: a scanning module, configured to obtain a 3D model of the target object when the color camera and the depth camera perform a panoramic scan of the target object; a first acquisition module, configured to acquire the target bone model; a fusion module, configured to fuse the target bone model with the 3D model of the target object; a second acquisition module, configured to acquire the target movement mode; and a motion module, configured to control the skeletal model according to the target movement mode, so that the 3D model of the target object moves according to the target movement mode.
  • the mobile terminal can implement an integrated design that covers scanning, 3D reconstruction, bone assembly, and preset animation display of an object. Users do not need professional, cumbersome, and complicated scanning equipment, and do not need to move to a PC for complex modeling and animation processing. These functions are integrated and provided to the user, who can carry out the whole series of operations with ease on a single mobile terminal, so that any "static object (or near-static object)" around the user can become more lively and vital. This increases the user's interest in using the terminal and improves the user experience.
  • the depth camera may use a TOF module.
  • the depth camera may use a structured light module.
  • the field angle of the depth camera ranges from 40 degrees to 80 degrees.
  • the infrared transmitting power of the depth camera can be selected between 50 mW and 400 mW; for special applications involving very strong light, a higher power may be used.
  • the scanning distance when scanning the object is between 20 cm and 80 cm, and the scanning distance can be understood as the distance from the depth camera to the target object.
  • the shooting frame rate of the depth camera during the scanning process can be selected to be not less than 25 fps.
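The parameter ranges listed above can be collected into a simple range check. The following is a minimal sketch, not part of the patent: the `ScanSettings` class and `check_scan_settings` function are hypothetical names, and the ranges are treated as defaults (the text notes that higher infrared power may be used in special applications).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScanSettings:
    """Hypothetical container for the scan parameters quoted in the text."""
    fov_deg: float         # depth camera field angle, in degrees
    ir_power_mw: float     # infrared transmitting power, in milliwatts
    distance_cm: float     # distance from depth camera to target object, in cm
    frame_rate_fps: float  # depth camera shooting frame rate


def check_scan_settings(s: ScanSettings) -> List[str]:
    """Return a list of problems; the list is empty if every value is in range."""
    problems = []
    if not 40 <= s.fov_deg <= 80:
        problems.append("field angle outside 40-80 degrees")
    if not 50 <= s.ir_power_mw <= 400:
        problems.append("infrared power outside 50-400 mW")
    if not 20 <= s.distance_cm <= 80:
        problems.append("scanning distance outside 20-80 cm")
    if s.frame_rate_fps < 25:
        problems.append("frame rate below 25 fps")
    return problems
```

A caller could run this check before starting a scan and prompt the user to move closer or farther when the distance is reported as out of range.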
  • the skeletal model can be calculated by a series of algorithms according to the 3D model.
  • a bone model making library may be provided to the user, such as some line segments and points, where the line segments represent bones and the points represent joint nodes.
  • the skeletal model is uploaded to the cloud or stored locally.
  • the method may be completed by a first acquisition module; in hardware, it may be implemented by a processor calling program instructions in a memory.
  • a more open production library can be provided to the user, and the line segments and points are completely freely designed by the user, where the line segments represent bones and the points represent joint nodes.
  • the skeletal model is uploaded to the cloud or stored locally.
  • the method may be completed by a first acquisition module; in hardware, it may be implemented by a processor calling program instructions in a memory.
  • a bone model having the highest degree of matching with the shape of the target object may be selected from at least one preset bone model as the target bone model.
  • Preset bone models can be stored online, in the cloud, or locally.
  • a chicken bone model, a dog bone model, and a fish bone model are stored locally.
  • the system recognizes the chicken bone model as the target bone model through shape recognition. Similar determination criteria include, but are not limited to, bone shape, bone length, bone thickness, number of bones, and bone composition.
  • the method may be completed by a first acquisition module; in hardware, it may be implemented by a processor calling program instructions in a memory.
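Selecting the preset bone model with the highest degree of matching can be pictured as scoring each preset against a descriptor of the scanned object and taking the best score. The sketch below is illustrative only: the descriptor fields (`limbs`, `aspect`), their values, and the scoring rule are assumptions made for this example, and the patent does not specify how the matching degree is computed.

```python
from typing import Dict

# Hypothetical descriptors: number of limbs and body aspect ratio (length / height).
PRESETS: Dict[str, Dict[str, float]] = {
    "chicken": {"limbs": 2, "aspect": 0.9},
    "dog": {"limbs": 4, "aspect": 1.6},
    "fish": {"limbs": 0, "aspect": 3.0},
}


def match_score(obj: Dict[str, float], preset: Dict[str, float]) -> float:
    """Higher is better; 0.0 means the descriptors are identical."""
    return -sum(abs(obj[key] - preset[key]) for key in preset)


def select_bone_model(obj: Dict[str, float],
                      presets: Dict[str, Dict[str, float]] = PRESETS) -> str:
    """Return the name of the preset whose descriptor is closest to the object."""
    return max(presets, key=lambda name: match_score(obj, presets[name]))
```

For an object described as `{"limbs": 2, "aspect": 1.0}`, this rule picks the chicken preset, because its descriptor differs least from the object's.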
  • a selection instruction of a user may be received, the selection instruction being used to select a target bone model from at least one preset bone model; these preset models may be stored locally or obtained from the cloud or the web.
  • the method may be completed by a first acquisition module; in hardware, it may be implemented by a processor calling program instructions in a memory.
  • the movement mode of the first object may be obtained, and the movement mode of the first object may be used as the target movement mode.
  • the first object may be an object that is currently moving in real time; the movement mode may also be that of an object photographed and saved in the past, or a preset movement mode of a specific object.
  • This method may be completed by a second acquisition module; on hardware, it may be implemented by a processor calling a program instruction in a memory.
  • the preset target movement mode can be a complete set of movement modes or a movement mode corresponding to a user operation. For example, if the user beckons to the "resurrected object" displayed on the terminal, the object can move according to a preset animation that responds to beckoning. More generally, the user can input a preset interactive action to the terminal, and the terminal obtains a corresponding response movement mode according to the interactive action and controls the 3D model of the object to move according to the response movement mode.
  • the movement mode can be created by the user with animation production software; this software can be a tool set embedded in the system or included in the APP used for scanning, reconstruction, and movement.
  • This method may be completed by a second acquisition module; on hardware, it may be implemented by a processor calling a program instruction in a memory.
  • the movement mode may be to select a movement mode with the highest degree of attribute matching as a target movement mode among a plurality of pre-stored movement modes according to physical attributes.
  • This method may be completed by a second acquisition module; on hardware, it may be implemented by a processor calling a program instruction in a memory.
  • the target movement mode may also be obtained by the system or the user independently designing a movement based on the skeletal model of the target object (which can be obtained by any method in the previous step).
  • this approach yields the animation best suited to the 3D model of the object in subsequent operations.
  • This method may be completed by a second acquisition module; on hardware, it may be implemented by a processor calling a program instruction in a memory.
  • the 3D model can be stored locally or in the cloud, so that it can be called directly at a later time and the bone assembly can be freely selected.
  • the animation can be automatically played by the mobile terminal, and it can also be controlled by the user inputting operation instructions.
  • the skinning operation is used to determine a change in the position of a point on the surface of the 3D model according to the movement of the skeletal model; and cause the 3D model of the target object to follow the skeletal model for movement.
  • This method can be completed by a motion module; on hardware, it can be implemented by a processor calling program instructions in a memory.
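One widely used way to make surface points follow a skeleton is linear blend skinning, in which each surface point is moved by a weighted average of the bone transforms that influence it. The following 2D sketch is an assumption offered for illustration: the patent text does not name a specific skinning algorithm, and the types and function names here are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
# A bone transform: (rotation angle in radians, pivot point, translation).
BoneTransform = Tuple[float, Point, Point]


def apply_bone(p: Point, t: BoneTransform) -> Point:
    """Rotate p about the bone's pivot, then translate it."""
    angle, (cx, cy), (tx, ty) = t
    c, s = math.cos(angle), math.sin(angle)
    dx, dy = p[0] - cx, p[1] - cy
    return (cx + c * dx - s * dy + tx, cy + s * dx + c * dy + ty)


def skin_vertex(p: Point, weights: Sequence[float],
                bones: Sequence[BoneTransform]) -> Point:
    """Linear blend skinning: weighted average of p as moved by each bone.

    The weights are assumed to be non-negative and to sum to 1.
    """
    x = y = 0.0
    for w, t in zip(weights, bones):
        bx, by = apply_bone(p, t)
        x += w * bx
        y += w * by
    return (x, y)
```

A surface point near a joint typically carries weights for both adjacent bones, so it moves partway with each and the surface bends smoothly instead of tearing.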
  • the degree of human-computer interaction is enhanced, giving users more room for free play, allowing them to participate deeply in the process of resurrecting objects, and letting them use their imagination, which adds to the fun.
  • the processor can call the programs and instructions in the memory for corresponding processing, such as enabling the camera, collecting images, generating 3D models, obtaining bone models or animations, storing bone models or animations, adding special effects, and interacting with users.
  • an embodiment of the present invention provides a terminal device, where the terminal device includes a memory, a processor, a bus, a depth camera, and a color camera; the color camera and the depth camera are located on the same side of the mobile terminal; the memory, the depth camera, the color camera, and the processor are connected via the bus; the depth camera and the color camera are used for panoramic scanning of the target object under the control of the processor; the memory is used to store computer programs and instructions; the processor is used to call the computer programs and instructions to cause the terminal device to execute any one of the possible design methods described above.
  • the terminal device further includes an antenna system, and the antenna system sends and receives wireless communication signals under the control of the processor to realize wireless communication with a mobile communication network; the mobile communication network includes one or more of the following: GSM network, CDMA network, 3G network, 4G network, 5G network, FDMA, TDMA, PDC, TACS, AMPS, WCDMA, TDSCDMA, WIFI and LTE networks.
  • the invention realizes that objects from scanning, 3D reconstruction, bone assembly, and preset animation display can be completed in one terminal, realizes the resurrection of static objects, and improves the user's interest in using mobile terminals.
  • FIG. 1 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of an object modeling movement method according to an embodiment of the present invention.
  • FIG. 3 is a main process of scanning an object to implement animation in an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of structured light according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a TOF according to an embodiment of the present invention.
  • FIG. 6 is a flowchart of a method of meshing + texture mapping according to an embodiment of the present invention.
  • FIG. 7 is a flowchart of a specific meshing implementation scheme in an embodiment of the present invention.
  • FIG. 8 is a flowchart of a specific texture mapping implementation scheme according to an embodiment of the present invention.
  • FIG. 9 is a specific example of meshing + texture mapping in an embodiment of the present invention.
  • FIG. 10 is a flowchart of a specific skeletal assembly scheme according to an embodiment of the present invention.
  • FIG. 11 is a specific animation flowchart in an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of an object modeling motion device according to an embodiment of the present invention.
  • the mobile terminal may be a device that provides users with photographing and/or data connectivity, a handheld device with a wireless connection function, or another processing device connected to a wireless modem, such as a digital camera, an SLR camera, or a smartphone; it can also be another smart device with camera and display functions, such as a wearable device, a tablet computer, a PDA (Personal Digital Assistant), a drone, or an aerial camera.
  • FIG. 1 shows a schematic diagram of an optional hardware structure of the terminal 100.
  • the terminal 100 may include components such as a radio frequency unit 110, a memory 120, an input unit 130, a display unit 140, a photographing unit 150, an audio circuit 160, a speaker 161, a microphone 162, a processor 170, an external interface 180, and a power supply 190.
  • the radio frequency unit 110 may be used to receive and transmit information or to receive and send signals during a call.
  • the downlink information of the base station is received and then passed to the processor 170 for processing; in addition, the related uplink data is transmitted to the base station.
  • the RF circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
  • the radio frequency unit 110 can also communicate with network devices and other devices through wireless communication.
  • the wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), etc.
  • the memory 120 may be configured to store instructions and data.
  • the memory 120 may mainly include a storage instruction area and a storage data area; the storage data area may store a relationship between a joint touch gesture and an application function;
  • the storage instruction area may store software units such as an operating system, an application, and instructions required for at least one function, or subsets or extensions thereof. The memory may also include a non-volatile random access memory;
  • the memory provides the processor 170 with functions including managing the hardware, software, and data resources of the computing processing device, and supports control software and applications. It is also used for the storage of multimedia files, as well as the storage of running programs and applications.
  • the input unit 130 may be configured to receive inputted numeric or character information, and generate key signal inputs related to user settings and function control of the portable multifunction device.
  • the input unit 130 may include a touch screen 131 and other input devices 132.
  • the touch screen 131 may collect a touch operation performed by the user on or near the touch screen (for example, an operation performed by the user on or near the touch screen with a finger, a joint, a stylus, or any other suitable object), and drive the corresponding connection apparatus according to a preset program.
  • the touch screen can detect a user's touch action on the touch screen, convert the touch action into a touch signal and send it to the processor 170, and can receive and execute a command sent by the processor 170; the touch signal includes at least touch point coordinate information.
  • the touch screen 131 may provide an input interface and an output interface between the terminal 100 and a user.
  • various types such as resistive, capacitive, infrared, and surface acoustic wave can be used to implement the touch screen.
  • the input unit 130 may include other input devices.
  • the other input devices 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as a volume control button 132, a switch button 133, etc.), a trackball, a mouse, a joystick, and the like.
  • the touch screen 131 may cover the display panel 141.
  • after the touch screen 131 detects a touch operation on or near it, the operation is transmitted to the processor 170 to determine the type of the touch event, and the processor 170 then provides corresponding visual output on the display panel 141 according to the type of the touch event.
  • the touch screen and the display unit may be integrated into one component to implement the input, output, and display functions of the terminal 100.
  • the embodiment of the present invention uses the term touch display screen to represent the function set of the touch screen and the display unit; in some embodiments, the touch screen and the display unit may also be two separate components.
  • the display unit 140 may be configured to display information input by the user or information provided to the user and various menus of the terminal 100.
  • the display unit is further configured to display an image obtained by the device using the photographing unit 150, which may include a preview image in some shooting modes, a captured initial image, and a target image processed by a certain algorithm after shooting.
  • the photographing unit 150 is configured to collect an image or a video, and may be triggered to be turned on by an application program instruction to implement a photographing or camera function.
  • the shooting unit may include components such as an imaging lens, a filter, and an image sensor. The light emitted or reflected by the object enters the imaging lens, passes through the filter, and finally converges on the image sensor.
  • the imaging lens is mainly used for focusing and imaging the light emitted or reflected by the object (also known as the object to be photographed or the target object) in the angle of view of the photograph;
  • the filter is mainly used to filter out excess light waves in the light (for example, light waves other than visible light, such as infrared light);
  • the image sensor is mainly used to perform photoelectric conversion on the received light signal, convert it into an electrical signal, and input it into the processor 170 for subsequent processing.
  • the photographing unit 150 may further include a color camera 151 and a depth camera 152; the color camera is used to collect a color image of the target object, and may be a color camera of the kind commonly used in currently popular terminal products.
  • the depth camera is used to obtain the depth information of the target object.
  • the depth camera can be implemented by TOF technology and structured light technology.
  • TOF is the abbreviation of Time of Flight: the sensor emits modulated near-infrared light, which is reflected after encountering an object.
  • the sensor calculates the distance to the photographed object from the time difference or phase difference between light emission and reflection, so as to generate depth information.
  • the three-dimensional contours of objects can be presented in topographic maps with different colors representing different distances.
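The two TOF measurements mentioned above (time difference and phase difference) map to distance through standard relations: d = c·Δt/2 for a round-trip time Δt, and d = c·Δφ/(4π·f) for a phase shift Δφ of a signal modulated at frequency f. The sketch below illustrates these relations; the function names and the example modulation frequency are assumptions, not values from the patent.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def distance_from_time(dt_s: float) -> float:
    """Distance from round-trip time; light travels out and back, hence the / 2."""
    return C * dt_s / 2.0


def distance_from_phase(phase_rad: float, mod_freq_hz: float) -> float:
    """Distance from the phase shift of a signal modulated at mod_freq_hz.

    The result is unambiguous only up to C / (2 * mod_freq_hz),
    because the phase wraps around every 2 * pi.
    """
    return C * phase_rad / (4.0 * math.pi * mod_freq_hz)


if __name__ == "__main__":
    # A 4 ns round trip corresponds to roughly 0.6 m.
    print(distance_from_time(4e-9))
    # A phase shift of pi at an assumed 100 MHz modulation is roughly 0.75 m.
    print(distance_from_phase(math.pi, 100e6))
```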
  • structured light is a group of system structures composed of a projection element and a camera.
  • the projection element projects specific light information (such as a pattern produced by grating diffraction) onto the surface of the object and the background, and the camera collects it. The position and depth of the object are calculated from the changes in the light signal caused by the object (such as changes in the thickness and displacement of the light stripes), and the entire three-dimensional space is then restored.
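One common way to turn the displacement of a projected pattern into depth is triangulation: for a rectified projector-camera pair, depth is focal length times baseline divided by displacement. This is a general technique given as a sketch under that assumption; the patent does not state which formula its structured light module uses, and the function and parameter names are hypothetical.

```python
def depth_from_displacement(focal_px: float, baseline_m: float,
                            displacement_px: float) -> float:
    """Triangulated depth in metres for a rectified projector-camera pair.

    focal_px:        camera focal length, in pixels
    baseline_m:      distance between projector and camera, in metres
    displacement_px: observed shift of the pattern feature, in pixels
    """
    if displacement_px <= 0:
        raise ValueError("displacement must be positive")
    return focal_px * baseline_m / displacement_px
```

Because depth is inversely proportional to displacement, nearby surfaces produce large shifts and are resolved more finely than distant ones.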
  • the audio circuit 160, the speaker 161, and the microphone 162 may provide an audio interface between the user and the terminal 100.
  • the audio circuit 160 can transmit the electrical signal converted from the received audio data to the speaker 161, and the speaker 161 converts it into a sound signal for output.
  • the microphone 162 is used to collect sound signals and convert them into electrical signals, which are converted into audio data after being received by the audio circuit 160; the audio data is then output to the processor 170 for processing and sent, for example, to another terminal via the radio frequency unit 110, or output to the memory 120 for further processing.
  • the audio circuit may also include a headphone jack 163 for providing a connection interface between the audio circuit and the headphones.
  • the processor 170 is the control center of the terminal 100, and uses various interfaces and lines to connect the various parts of the entire mobile phone. By running or executing instructions stored in the memory 120 and calling data stored in the memory 120, it performs the various functions of the terminal 100 and processes data, thereby monitoring the phone as a whole.
  • the processor 170 may include one or more processing units; preferably, the processor 170 may integrate an application processor and a modem processor, wherein the application processor mainly processes an operating system, a user interface, and an application program, etc.
  • the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may not be integrated into the processor 170.
  • the processor and the memory may be implemented on a single chip.
  • the processor 170 may also be used to generate corresponding operation control signals and send them to the corresponding components of the computing processing device, and to read and process data in software, in particular the data and programs in the memory 120, so that each function module executes its corresponding function, thereby controlling the corresponding components to act as required by the instructions.
  • the terminal 100 also includes an external interface 180.
  • The external interface can be a standard Micro USB interface or a multi-pin connector, which can be used to connect the terminal 100 to other devices for communication, and can also be used to connect a charger to charge the terminal 100.
  • the terminal 100 further includes a power source 190 (such as a battery) for supplying power to various components.
  • the power source can be logically connected to the processor 170 through a power management system, so as to implement functions such as management of charging, discharging, and power consumption management through the power management system.
  • the terminal 100 may further include a flash, a wireless fidelity (WiFi) module, a Bluetooth module, sensors with different functions, and the like, and details are not described herein again. All the methods described below can be applied to the terminal shown in FIG. 1.
  • FIG. 1 is only an example of a portable multifunctional device and does not constitute a limitation on it. The device may include more or fewer components than shown in the figure, combine some components, or use different components.
  • An embodiment of the present invention provides an object modeling movement method, which is applied to a mobile terminal. The mobile terminal includes a color camera and a depth camera, and the color camera and the depth camera are located on the same side of the mobile terminal.
  • the method includes the following steps:
  • Step 21: Use the color camera and the depth camera to perform a panoramic scan of the target object (that is, the scanned object, referred to simply as the object in some paragraphs) to obtain a 3D model of the target object.
  • Step 22: Obtain a target bone model.
  • Step 23: Fuse the target bone model with the 3D model of the target object.
  • Step 24: Obtain the target movement mode.
  • Step 25: Control the bone model according to the target movement mode, so that the 3D model of the target object moves according to the target movement mode.
  • the color cameras and depth cameras mentioned above can be located on the front of the terminal device or on the back of the terminal device. Their specific arrangement and quantity can be flexibly determined according to the needs of the designer, which is not limited in this application.
  • FIG. 3 shows the main process from scanning the object to implementing the animation. First, the object is scanned: a depth map is obtained by scanning with the depth camera, and a color map is obtained by scanning with the color camera. The depth map and the color map are fused to obtain a textured mesh model, that is, a 3D model of the object. The 3D model is then embedded into the bone model,
  • and the bone model is made to move according to the skeletal animation (it should be understood that the movement of the skeleton is usually not visible to the user, although it can be made visible under some special scene requirements), so that the animation effect of the object is presented visually. A detailed description is given below in conjunction with examples.
  • Step 21 involves depth camera scanning, color camera scanning, and 3D reconstruction; specific examples are as follows.
  • A depth camera can include a 3D/depth sensor or a 3D/depth sensor module to obtain the depth information of static objects. It should be understood that the scanned object should in theory be static; in practice, slight movement is acceptable to some extent.
  • The depth information can be obtained using structured light technology or TOF. As new depth information acquisition methods emerge, the depth module may also adopt other implementations, which are not limited in the present invention.
  • A structured light schematic is shown in Figure 4, where 301 is an invisible infrared light source, 302 is a grating that generates a certain light pattern, 303 is the scanned object, and 304 is an infrared camera. The light pattern reflected by 303 is captured and compared with the expected light pattern, and the depth information of the scanned part of the target object is obtained through calculation.
  • A TOF depth camera is shown in Figure 5, where 311 is the target object, 312 is the infrared transmitting end of the TOF camera, and 313 is the infrared receiving end.
  • 312 emits infrared light (for example, but not limited to, 850 nm to 1200 nm) toward the target object, and the target object reflects the infrared light.
  • The reflected infrared light is received by 313.
  • The sensor of 313 (such as, but not limited to, a CMOS array or CCD array with a resolution of more than 240 * 180) generates a series of voltage difference signals from the reflected infrared light.
  • The calculation unit 314 performs calculation based on the series of voltage difference signals and finally obtains the depth information 315 of the scanned part of the target object.
  • the depth camera and the color camera are synchronously called, and a certain correction algorithm is adopted so that the images corresponding to the scanned target object are consistent.
  • the way the color image is acquired during scanning is basically the same as the way that ordinary cameras take pictures. This section does not repeat them.
  • the distance between the object and the depth camera (or mobile terminal) is between 20cm and 80cm.
  • A specific scanning method can be to keep the terminal still, hold the target object 30 cm to 70 cm in front of the depth camera, and slowly rotate the object in all directions, so that the union of all scans can build a complete object. The hand holding the object should cover the surface of the object as little as possible.
  • Another specific scanning method can be to keep the object still, hold the terminal so that the object is 30 cm to 70 cm in front of the depth camera, and move around the object to scan it panoramically, so that the union of all scanned images can build a complete object. The hand holding the terminal should cover the surface of the object as little as possible.
  • A further specific scanning method can be to keep the object still, hold the terminal so that the object is 30 cm to 70 cm in front of the depth camera, and scan the object at preset angular intervals until the union of all scanned images can build a complete object. The hand holding the terminal should cover the surface of the object as little as possible.
  • The captured scene information should cover the full view of the object without leaving blind spots. Therefore, the panoramic scan produces multiple frames of depth maps (a depth map sequence), each corresponding to the scene within the scanning range during one scan, and multiple frames of color maps (a color map sequence), each likewise corresponding to the scene within the scanning range during one scan.
  • the shooting frame rate of the depth camera during the scanning process may be greater than or equal to 25 fps, for example, 30 fps, 60 fps, or 120 fps.
  • the terminal may present the scanning progress of the target object, so that the user can observe whether the panorama of the target object has been covered, and the user can autonomously choose to continue scanning or terminate scanning.
  • The depth camera and the color camera can be located on the front or the rear of the terminal.
  • For example, when the depth camera is located at the top of the front of the phone, it can be used together with the front color camera, and front scanning can be used for selfie scanning.
  • When the depth camera is located at the top of the back of the phone, it can be used together with the rear color camera for rear scanning.
  • the front and rear positions in the traditional sense should not constitute any physical position limitation.
  • the depth camera and the color camera can be located on the same side.
  • The terminal can also call a third-party shooting device, such as an externally connected shooting pole, scanner, or external camera.
  • Optionally, an external color camera can be used, an external depth camera can be used, or both can be external.
  • The color camera scanning and depth camera scanning described above can be turned on when the user triggers the scanning function. Triggering operations include a timer, a shutter press, a gesture, touchless in-air sensing, device control, and the like.
  • the system can prompt which objects are suitable for scanning or 3D modeling in the preview image; for example, a box or the like can be used to identify the objects in the preview image to prompt the user.
  • the specific device parameters involved in the aforementioned depth camera and color camera are related to the manufacturing process and user requirements and the design constraints of the terminal, and are not specifically limited in the present invention.
  • After scanning, a depth map sequence 321 and a color map sequence 322 are obtained, where each frame obtained by the depth camera is a depth map of the scanned scene (for example, a Depth image) and each frame obtained by the color camera is a color image of the scanned scene (for example, an RGB image). The depth map sequence 321 is meshed to obtain a mesh model of the target object.
  • Texture mapping is then performed on the mesh model according to the color map sequence 322 to obtain the texture-mapped mesh model 323, that is, a 3D model of the object.
  • Texture mapping may also be performed according to all frames or only certain frames of the color map sequence.
  • Step 331: Obtain a color image (including but not limited to RGB) and a depth image (Depth) in each scanning scene of the target object.
  • The depth map (DepthMap) contains the distances from the depth camera to multiple points on the surface of the target object.
  • A depth map is similar to a grayscale image, except that each pixel value represents the actual distance from the depth camera to a point on the surface of the target object.
  • The color and depth images are usually registered.
  • Step 332: Preprocessing, including but not limited to: bilateral filtering to denoise the depth map, down-sampling the depth map to generate image pyramids of different resolutions, converting the depth map into a point cloud, estimating the normal vector of each vertex, and cutting away points outside the range of the scanned object.
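  • The depth-map-to-point-cloud conversion and range cut mentioned in step 332 can be illustrated with a minimal sketch. It assumes a pinhole camera model; the intrinsics `fx`, `fy`, `cx`, `cy`, the `max_range` cut-off, and the function name are hypothetical and are not specified in this document.

```python
def depth_to_points(depth, fx, fy, cx, cy, max_range=None):
    """Back-project a depth map (rows of distances) into camera-space 3D points.

    Assumes a pinhole model: pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid measurement at this pixel
            if max_range is not None and z > max_range:
                continue  # cut points outside the scanned object's range
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Tiny 2x2 depth map in metres; the 2.0 m pixel lies beyond the 0.8 m cut-off.
depth = [[0.0, 0.5],
         [0.5, 2.0]]
cloud = depth_to_points(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5, max_range=0.8)
```

  • The 0.8 m cut-off mirrors the 20 cm to 80 cm working distance given for scanning; a real pipeline would also apply the filtering and normal estimation listed in step 332.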
  • Step 333: Depth map and color map sequences of the target object at different scanning positions have been collected.
  • To generate an object model, the single-frame 3D point clouds obtained from the acquired image sequence need to be converted into a unified coordinate system. That is, the pose transformation relationship of the object between different scanning positions must be obtained, which is pose estimation.
  • Pose estimation estimates the 3D pose of an object based on an image sequence.
  • There are two approaches to pose estimation: feature point based registration and point cloud based registration.
  • Here, point cloud based fine registration is used,
  • for example the iterative closest point (ICP) algorithm.
  • Coarse registration between two poses can also be performed first to provide an initial value for fine registration. This approach can support faster scanning.
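  • The pose is estimated by minimizing a point-to-plane objective of the form $M_{opt} = \arg\min_{M} \sum_{i} \left( (M \cdot s_i - d_i) \cdot n_i \right)^2$, where $M$ is the camera pose transformation matrix,
  • $s_i$ is a point of the 3D point cloud of the current frame whose pose is to be calculated,
  • $d_i$ is the corresponding point of the model point cloud in the coordinates of the observation model, and $n_i$ is the normal corresponding to that model point. The objective function minimizes the sum of squared distances from the points of the current frame to the planes defined by the voxel model point cloud.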
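  • As an illustration of the point-to-plane objective described here, the following sketch evaluates the sum of squared distances from transformed current-frame points to the planes through the model points. It only evaluates the error for a given pose and does not perform the minimization; the 4x4 row-major matrix layout and the function names are assumptions.

```python
def transform(M, p):
    """Apply a 4x4 rigid transform M (nested lists, row-major) to a 3D point p."""
    x, y, z = p
    return tuple(M[r][0] * x + M[r][1] * y + M[r][2] * z + M[r][3] for r in range(3))


def point_to_plane_error(M, src, dst, normals):
    """Sum over i of ((M * s_i - d_i) . n_i)^2 for already matched point pairs."""
    total = 0.0
    for s, d, n in zip(src, dst, normals):
        q = transform(M, s)
        residual = sum((q[k] - d[k]) * n[k] for k in range(3))
        total += residual * residual
    return total


identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
# Shift by -1 along z (onto the model plane) and by 5 along x (within the plane).
shift = [[1, 0, 0, 5], [0, 1, 0, 0], [0, 0, 1, -1], [0, 0, 0, 1]]

src = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]      # current-frame points s_i
dst = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]      # model points d_i
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]  # model normals n_i

error_identity = point_to_plane_error(identity, src, dst, normals)
error_shift = point_to_plane_error(shift, src, dst, normals)
```

  • The second pose gives zero error even though the points slide along the plane, which shows that this measure penalizes only the offset along the normal. An ICP loop would alternate between re-matching points and updating M to reduce this error.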
  • Step 334: The 2D depth map is converted into 3D information and fused into a unified 3D voxel model.
  • The TSDF (Truncated Signed Distance Function) algorithm is used.
  • After fusion, each voxel stores an SDF (Signed Distance Function) value, a weight value, and an optional color value.
  • The TSDF algorithm is currently the mainstream processing algorithm for 3D point cloud fusion. The weight is calculated by averaging: at each fusion the old weight value is increased by one, and the new observation has a weight of 1. The old and new SDF values are multiplied by their respective weights, added, and then divided by the number of fusions (the new weight value) to obtain the new normalized SDF value.
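  • The running weighted average described for TSDF fusion can be written as a short per-voxel update. This is a minimal sketch of the averaging rule stated above; the function names and the truncation distance `mu` are illustrative assumptions.

```python
def truncate(sdf, mu):
    """Clamp a signed distance to the truncation band [-mu, mu]."""
    return max(-mu, min(mu, sdf))


def fuse_tsdf(old_sdf, old_weight, new_sdf, new_weight=1.0):
    """Fuse one new observation into a voxel using a weighted running average."""
    fused_weight = old_weight + new_weight  # old weight increased by one
    fused_sdf = (old_sdf * old_weight + new_sdf * new_weight) / fused_weight
    return fused_sdf, fused_weight


# One voxel observed in three frames; mu = 1.0 is a hypothetical truncation distance.
sdf, weight = 0.0, 0.0
for observed in (0.5, 0.25, 0.75):
    sdf, weight = fuse_tsdf(sdf, weight, truncate(observed, mu=1.0))
```

  • After the three fusions the voxel holds the mean of the observations and a weight equal to the number of fusions, as described in the text.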
  • Step 335: Determine whether a preset number of key frames have been saved at certain angular intervals (such as, but not limited to, 30, 45, 60, or 90 degrees) in the three directions Roll / Yaw / Pitch.
  • If the number of saved key frames is less than the preset number (which depends on whether the panorama of the target object is covered),
  • the terminal instructs the user to perform more scanning.
  • If the number of key frames is sufficient to cover the panorama of the target object, the user is prompted that scanning is complete, and scanning can be ended and the subsequent steps performed.
  • Step 336: During real-time fusion, the input key frame information required for texture mapping is selected and cached, including color images, poses (the position and orientation differences between different images), and other information.
  • A preset number (N) of key frames are selected in each of the Roll / Yaw / Pitch directions, so that the 360-degree texture of the object can be completely restored.
  • For example, the angle (Yaw / Pitch / Roll) of each frame in the input image stream is determined, the sharpness of each frame is calculated, and a selection strategy based on angle and sharpness is constructed to select the key frames.
  • The angle strategy divides the 360 degrees in each direction into N regions of 360 / N degrees each, and each region must have a sharp color image.
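  • Sharpness can be evaluated from image gradients computed as forward differences, where $a(i, j)$ is the pixel value at position $(i, j)$:
  • $\det_x = a(i+1, j) - a(i, j)$ and $\det_y = a(i, j+1) - a(i, j)$.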
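  • The angle-and-sharpness key frame strategy can be sketched as follows. The forward differences match the two expressions above, but the way they are combined into one score (mean of squared gradient magnitude) is an assumption, since the combining formula is not given here. The function names and the `(angle, image)` frame representation are also hypothetical.

```python
def sharpness(img):
    """Mean of det_x^2 + det_y^2 over a grayscale image a(i, j) given as nested lists."""
    rows, cols = len(img), len(img[0])
    total, count = 0.0, 0
    for i in range(rows - 1):
        for j in range(cols - 1):
            det_x = img[i + 1][j] - img[i][j]
            det_y = img[i][j + 1] - img[i][j]
            total += det_x * det_x + det_y * det_y
            count += 1
    return total / count if count else 0.0


def select_key_frames(frames, n_regions):
    """Keep the sharpest frame in each of n_regions angular bins of 360 / n_regions degrees.

    frames is a list of (angle_in_degrees, image); returns {region: frame_index}.
    """
    width = 360.0 / n_regions
    best = {}
    for index, (angle, img) in enumerate(frames):
        region = int((angle % 360.0) // width)
        score = sharpness(img)
        if region not in best or score > best[region][0]:
            best[region] = (score, index)
    return {region: index for region, (score, index) in best.items()}


flat = [[5, 5], [5, 5]]    # blurred frame, no gradients
edge = [[0, 10], [10, 0]]  # frame with strong gradients
frames = [(10.0, flat), (20.0, edge), (100.0, flat)]
selected = select_key_frames(frames, n_regions=4)
```

  • With four 90-degree regions, the two frames at 10 and 20 degrees compete for region 0 and the sharper one is kept. The same selection would be run separately for each of the Roll / Yaw / Pitch directions.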
  • Step 337: Use the Marching Cubes algorithm to mesh the 3D point cloud and generate triangular patches.
  • The main idea of the Marching Cubes algorithm is to find, cell by cell, the boundary between the content part and the background part of the 3D point cloud, and to extract triangles from each cell to fit this boundary.
  • Voxel points that contain content are called real points,
  • and background voxel points are called imaginary points.
  • Such a three-dimensional point cloud is a lattice composed of real and imaginary points.
  • Each of the 8 voxel points at the corners of a cell may be a real point or an imaginary point, so a cell has 2 to the 8th power, that is 256, possible cases.
  • The core idea of the Marching Cubes algorithm is to use these 256 enumerable cases to extract the isosurface triangle patches in each cell.
  • A cell is a cube formed by eight adjacent voxel points in a three-dimensional image.
  • The "cube" in the name Marching Cubes refers to this cell. Note the difference between a cell and a voxel point:
  • a cell is a cube spanned by 8 voxel points, and each voxel point (except those on the boundary) is shared by 8 cells.
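  • The 256 cases can be enumerated by packing the real/imaginary state of the 8 corners into one 8-bit index. The sketch below shows only this classification step, not the triangle lookup table. Treating a corner as a real point when its value is below `iso_level` (for example, a negative signed distance) is an assumed convention, as is the corner ordering.

```python
def cube_index(corner_values, iso_level=0.0):
    """Pack the real/imaginary state of 8 cube corners into an index in 0..255.

    Bit k is set when corner k is a real point (value below iso_level).
    """
    if len(corner_values) != 8:
        raise ValueError("a cube has exactly 8 corners")
    index = 0
    for bit, value in enumerate(corner_values):
        if value < iso_level:
            index |= 1 << bit
    return index


all_background = cube_index([1.0] * 8)       # no real points
all_content = cube_index([-1.0] * 8)         # all real points
one_corner = cube_index([-1.0] + [1.0] * 7)  # only corner 0 is a real point
```

  • Indices 0 and 255 mean the boundary does not pass through the cube, so no triangles are produced; every other index selects one of the enumerated triangle configurations.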
  • a specific texture mapping implementation scheme is as follows:
  • Step 341: According to the mesh model (triangular patch information) and the pose information of the key frames, determine whether each patch is visible under the pose of each key frame. The input is all the triangle patch information of the mesh model and the spatial coordinates of the key frames; the output is whether each triangle patch is visible in the pose of each key frame.
  • The collision detection process involves calculating triangle normal vectors in space, judging whether rays and triangles intersect, judging whether rays intersect AABB (axis-aligned bounding box) bounding boxes, and constructing hierarchical binary trees.
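  • The ray-triangle test at the heart of this visibility check can be sketched with the Möller-Trumbore algorithm. That algorithm is named here as one well-known option and is not stated in this document as the method used. The helper names are hypothetical, and the bounding-box hierarchy that would prune candidate triangles is omitted.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_hits_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance t along the ray to the triangle, or None if it misses."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle plane
    inv = 1.0 / det
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(t_vec, e1)
    v = dot(direction, q) * inv  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None


triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_hits_triangle((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle)
miss = ray_hits_triangle((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *triangle)
```

  • A patch could then be treated as visible from a key frame when the ray from the camera position to the patch hits no other triangle at a smaller distance.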
  • Step 342: According to the result of step 341 and the mesh model, region division and graph cuts are used to label each face of the mesh model with the key frame image (view) that is chosen to generate its texture.
  • The patch labeling results can be used as input to the affine mapping (warping) module to generate preliminary texture maps.
  • Step 343: Map the texture of the corresponding area in each key frame image onto the texture map, and smooth the boundaries between patches that come from different key frames (the seams).
  • Each vertex at a seam can be regarded as two vertices: Vleft, belonging to the left patch, and Vright, belonging to the right patch.
  • The color of each vertex V before adjustment is denoted fv, and the color correction value gv of each vertex V is obtained by minimizing an energy over the seam vertices, whose terms are described as follows.
  • v is a vertex at a key frame seam, that is, it belongs both to the patch on the left of the seam and to the patch on the right.
  • fv is the color value before adjustment.
  • gv is the color correction value, that is, the increment (Δ).
  • To keep the seams smooth, the difference between the corrected colors of a shared vertex in the images from different frames should be as small as possible.
  • Vi and Vj are any two adjacent vertices on the same texture seam. Their increments should differ as little as possible, so that one is not increased much while the other is increased little, which would cause unevenness.
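  • A minimization of the kind described here can be written as follows. This is a hedged reconstruction from the term descriptions in the text and is not the exact formula of this document. The symbols $f$, $g$, $v_{left}$, $v_{right}$, $v_i$, and $v_j$ follow the text, while the weighting factor $\lambda$ between the two terms and the exact summation sets are assumptions.

```latex
\min_{g} \; \sum_{v \,\in\, \text{seam}}
  \Big( \big(f_{v_{left}} + g_{v_{left}}\big) - \big(f_{v_{right}} + g_{v_{right}}\big) \Big)^{2}
  \; + \; \frac{1}{\lambda} \sum_{(v_i, v_j)\ \text{adjacent}}
  \big( g_{v_i} - g_{v_j} \big)^{2}
```

  • The first sum pulls the corrected colors on both sides of a seam vertex together, and the second sum keeps the corrections of neighbouring vertices close to each other.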
  • Adjacent faces with the same label in the patch labeling result are stored as one patch. Boundary smoothing is performed on all vertices of each patch, and the pixel value of each vertex is adjusted.
  • An affine transformation based on position and pixel values is then applied to the triangles enclosed by the final vertices to form the final texture map.
  • The texture atlas of the object is drawn onto the surface of the mesh model to obtain a 3D model of the object, which is generally saved in the .obj format. As shown in Figure 9, for example, the texture atlas of the lion is mapped onto the mesh model of the lion to obtain the textured 3D model of the lion.
  • A 3D model of the target object after 3D reconstruction is thus obtained, that is, a textured mesh model.
  • Next, bones need to be added to the textured mesh, which involves how to obtain the bone model, that is, the target bone model.
  • a user bone model making library may be provided, such as some line segments and points, where the line segments represent bones and the points represent joint nodes.
  • users can be provided with a more open production library.
  • Line segments and points are completely freely designed by users, where line segments represent bones and points represent joint nodes.
  • a bone model with the highest degree of matching with the shape of the target object may be selected from at least one preset bone model as the target bone model.
  • Preset bone models can be stored online, in the cloud, or locally.
  • a chicken bone model, a dog bone model, and a fish bone model are stored locally.
  • the system recognizes the chicken bone model as the target bone model through shape recognition. Similar determination criteria include, but are not limited to, bone shape, bone length, bone thickness, number of bones, and bone composition.
  • a user's selection instruction may be received, and the selection instruction is used to select a target bone model from at least one preset bone model.
  • a specific skeletal assembly scheme is as follows:
  • Step 351: To approximate the medial axis surface and support other calculations, compute an adaptively sampled distance field with trilinear interpolation. The signed distance from any point to the surface of the object can be evaluated by constructing a k-dimensional tree (kd-tree).
  • A kd-tree is a data structure that partitions a k-dimensional data space. It is mainly used for searching key data in multidimensional space.
  • Step 352: Calculate a set of sample points located approximately on the medial axis surface of the object, find the points where the bone joints may be located, and filter out the points close to the surface of the object.
  • Step 353: To select the vertices of the skeleton graph from the medial axis surface, the object can be packed with spheres. All points on the medial axis surface are sorted by their distance from the surface of the 3D model, and starting from the farthest point, the largest inscribed sphere inside the 3D model (not exceeding its surface) is drawn to obtain the sphere radius. Each remaining point is then traversed, and an inscribed sphere is constructed at a point only if the point is not contained in any previously added sphere.
  • Step 354 A skeleton graph can be constructed by connecting some sphere centers, and connecting the sphere centers of any two spheres as an edge.
  • the above steps 351 to 354 may be referred to as bone recognition.
  • In the skeleton graph, V represents a vertex and E represents an edge.
  • Step 356: Identify the bone hierarchy and reduce simple hierarchy levels to approximate the bone shape.
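  • The sphere packing of step 353 and the graph construction of step 354 can be sketched as a greedy loop. The input is assumed to be a list of medial-surface sample points, each paired with its distance to the model surface (which serves as the inscribed sphere radius). Connecting two spheres when they touch or overlap is an assumed rule, since the text only says that some sphere centers are connected.

```python
import math


def pack_spheres(candidates):
    """Greedily keep inscribed spheres, largest first.

    candidates is a list of (center, distance_to_surface); a point is skipped
    when it already lies inside a previously kept sphere.
    """
    spheres = []
    for center, radius in sorted(candidates, key=lambda c: c[1], reverse=True):
        inside = any(math.dist(center, kept_c) < kept_r for kept_c, kept_r in spheres)
        if not inside:
            spheres.append((center, radius))
    return spheres


def connect(spheres):
    """Build skeleton-graph edges between spheres that touch or overlap."""
    edges = []
    for a in range(len(spheres)):
        for b in range(a + 1, len(spheres)):
            (ca, ra), (cb, rb) = spheres[a], spheres[b]
            if math.dist(ca, cb) <= ra + rb:
                edges.append((a, b))
    return edges


candidates = [((0.0, 0.0, 0.0), 1.0),
              ((0.5, 0.0, 0.0), 0.8),   # lies inside the first sphere, skipped
              ((1.5, 0.0, 0.0), 0.7),
              ((4.0, 0.0, 0.0), 0.5)]
spheres = pack_spheres(candidates)
edges = connect(spheres)
```

  • The kept sphere centers play the role of the vertices V and the returned index pairs the role of the edges E of the skeleton graph.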
  • a 3D model of the object with the assembled bone model can be obtained.
  • a 3D model obtained by embedding the bone model is obtained in step 23.
  • the movement mode of the first object may be obtained, and the movement mode of the first object may be used as the target movement mode.
  • the first object may be an object that is currently moving in real time (for example, shooting a person who is running and extracting a person's skeleton through a neural network); or it may be a movement of an object that has been photographed and saved in the past (for example, A set of cute moves of a dog have been photographed before, and the movement mode of the movement is stored locally or in the cloud by an algorithm); it can also be a preset movement mode of a specific object. (For example, choose only the movements related to humans)
  • One of the preset movement modes may be selected as the target movement mode. (For example, the actions of humans, dogs, cats, horses, etc. are stored locally, and users can choose a specific category according to their preference or how well it matches the object type.)
  • The movement mode can be created by the user using animation software.
  • This software can be a tool set embedded in the system, a tool set loaded in the app used for scanning, reconstruction, and motion, or a third-party animation design tool. The movement mode or animation may be produced now or may have been produced in the past.
  • The target movement mode may be selected from a plurality of pre-stored movement modes according to physical attributes. For example, an animation of a fish swimming, an animation of a frog jumping, and an animation of a horse running are stored locally in advance; if the target object scanned by the user is a deer, the animation of the horse running is used as the target movement mode of the deer. (A deer is more similar to a horse than to a fish or a frog in appearance, biological species, bone structure, and other attributes.)
  • The movement mode may also be based on the skeletal model of the target object (which can be obtained by any method in the previous step).
  • The system or the user may independently design movements for the skeletal model to obtain the target movement mode. This approach yields the animation best suited to the 3D model of the object in subsequent operations.
  • the movement mode may be a preset skeletal animation, which is generally produced by a professional animation designer.
  • A skeletal animation describes how the transformation of each node in the skeleton changes over time, and is usually stored and expressed with key frames. It usually has the concept of FPS (frames per second), that is, how many frames are contained in one second. A skeletal animation cannot exist apart from a skeleton, otherwise it cannot drive the 3D model, so a skeletal animation usually depends on a specific skeleton.
  • The skeleton is usually called a rig. It describes which bones the skeleton has, the connection relationship between the bones,
  • the default transformation (that is, the pose) of each bone, and some additional information.
  • A pose describes the static state of the transformation of each node in a skeleton, such as a frame of standing or running.
  • Each skeleton stores a binding pose, which is the default pose when the skeleton is made.
  • A pose generally does not store the hierarchical relationship of the skeleton; instead it uses an array to store the transformation of each node in turn. Because each node belongs to a specific bone, a pose cannot be used independently of the skeleton.
  • Poses are part of the sampling result of a skeletal animation. The skeleton, pose, and skeletal animation are thus interrelated and together support subsequent animation operations.
  • Bone animation essentially records the changes in position, rotation, and scaling of a series of objects stored in a tree structure over time. Each of these objects is a bone.
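  • The relationship between a skeleton (tree of bones), key frames, and a sampled pose can be illustrated with a deliberately reduced sketch. It works in 2D and animates rotation only, whereas the text also covers position and scaling. The `Bone` class, the linear key frame interpolation, and all names are assumptions for illustration.

```python
import math


class Bone:
    """A bone with a parent link, a fixed length, and (time, angle) key frames."""

    def __init__(self, name, parent, length, keys):
        self.name, self.parent, self.length, self.keys = name, parent, length, keys

    def angle_at(self, t):
        """Linearly interpolate the local rotation angle (radians) at time t."""
        keys = self.keys
        if t <= keys[0][0]:
            return keys[0][1]
        for (t0, a0), (t1, a1) in zip(keys, keys[1:]):
            if t <= t1:
                return a0 + (a1 - a0) * (t - t0) / (t1 - t0)
        return keys[-1][1]


def sample_pose(bones, t):
    """Return {bone name: (x, y) tip position} at time t.

    Parents must appear before their children; each bone's rotation is
    composed with its parent's, which is what the tree structure provides.
    """
    world = {}  # name -> (tip x, tip y, absolute angle)
    for bone in bones:
        px, py, pa = world[bone.parent] if bone.parent else (0.0, 0.0, 0.0)
        angle = pa + bone.angle_at(t)
        world[bone.name] = (px + bone.length * math.cos(angle),
                            py + bone.length * math.sin(angle),
                            angle)
    return {name: (x, y) for name, (x, y, _) in world.items()}


rig = [Bone("upper", None, 1.0, [(0.0, 0.0), (1.0, math.pi / 2)]),
       Bone("lower", "upper", 1.0, [(0.0, 0.0), (1.0, 0.0)])]
pose_start = sample_pose(rig, 0.0)
pose_end = sample_pose(rig, 1.0)
```

  • Rotating only the parent bone also moves the child, which is why a sampled pose cannot be interpreted without the skeleton that defines the hierarchy.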
  • The animation is realized by mapping the animated bone transformations of a set of animations onto the 3D model with the bones assembled in the previous "automatic skeletal assembly" step.
  • This action mapping can be implemented with, but is not limited to, game engines and animation engines; the 3D model changes pose according to the bone transformations,
  • and the successive pose changes form a series of animation actions.
  • To the user, the scanned object appears to be "revived" and the static thing "moves".
  • skinning technology is the basis for ensuring that the 3D model of the object follows the movement of the skeletal model.
  • The animation of the object's 3D model is produced by mapping the bone-equipped 3D model to a set of changing bone poses.
  • In each frame, the surface of the 3D model (that is, the skin of the object's 3D model) must be deformed according to the changes of the bones. This process is called skinning.
  • A linear blend skinning (LBS) scheme can be adopted.
  • For any point on the surface of the 3D model, the following formula can be used to obtain its position in the next state from its position in the previous state: $v'_i = \sum_{j} w_{i,j} \, T_j \, v_i$, where
  • $v_i$ is the previous position,
  • $v'_i$ is the next position,
  • $w_{i,j}$ is the weight of the j-th bone at point i, and
  • $T_j$ is the transformation matrix of the j-th bone.
  • Once the next positions of a sufficient number of points on the surface of the 3D model have been calculated, the pose of the 3D model in the next state can be determined, thereby realizing the animation.
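  • Linear blend skinning, in which each surface point is moved by the weighted sum of its bones' transformations, can be sketched for a single vertex as follows. The 4x4 row-major matrix layout and the function names are assumptions; the weights are taken as given and are expected to sum to 1.

```python
def apply(T, v):
    """Apply a 4x4 transform T (nested lists, row-major) to a 3D point v."""
    x, y, z = v
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3] for r in range(3))


def skin_vertex(v, weights, transforms):
    """Blend the bone-transformed positions of v: sum over j of w_j * (T_j v)."""
    blended = [0.0, 0.0, 0.0]
    for w, T in zip(weights, transforms):
        moved = apply(T, v)
        for k in range(3):
            blended[k] += w * moved[k]
    return tuple(blended)


bone_still = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
bone_lifted = [[1, 0, 0, 0], [0, 1, 0, 2], [0, 0, 1, 0], [0, 0, 0, 1]]  # moves +2 in y

# A vertex influenced equally by both bones follows the lift halfway.
next_position = skin_vertex((1.0, 0.0, 0.0), [0.5, 0.5], [bone_still, bone_lifted])
```

  • Running this for every surface vertex in every frame produces the deformed skin, so the quality of the result depends mainly on the weights.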
  • The core of the skinning technique is to find the weight $w_{i,j}$ of each bone for each vertex.
  • A method similar to thermal equilibrium can be used to calculate the weights.
  • The 3D model is regarded as an insulated heat-conducting body; the temperature of the i-th bone is set to 1° and the temperature of all remaining bones is set to 0°. According to the principle of thermal equilibrium, the temperature of each surface vertex after equilibrium is reached can be taken as the weight of that vertex for this bone, with weight values in the interval [0, 1].
  • the weight calculation method based on thermal balance makes the calculation result of the weight have smooth characteristics, and the presented motion effect will be more real and natural.
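  • The thermal equilibrium idea can be illustrated with a one-dimensional toy case. The surface is reduced to a chain of vertices, the vertex next to the heated bone is held at 1, the vertex next to another bone is held at 0, and the interior temperatures are relaxed until each equals the average of its neighbours. This is only a sketch of the principle on a hypothetical chain, not a weight solver for a real mesh.

```python
def equilibrium_weights(n_vertices, iterations=2000):
    """Relax temperatures on a chain of vertices to a steady state.

    Vertex 0 touches the heated bone (held at 1.0) and the last vertex touches
    another bone (held at 0.0). The steady-state temperatures are used as the
    skinning weights of the heated bone.
    """
    if n_vertices < 2:
        raise ValueError("need at least the two boundary vertices")
    temperature = [0.0] * n_vertices
    temperature[0] = 1.0
    for _ in range(iterations):
        for k in range(1, n_vertices - 1):
            # At equilibrium each interior vertex equals the mean of its neighbours.
            temperature[k] = 0.5 * (temperature[k - 1] + temperature[k + 1])
    return temperature


weights = equilibrium_weights(5)
```

  • The resulting weights fall off smoothly from 1 to 0 along the chain, which is the smoothness property described above.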
  • a specific animation process can be realized as shown in FIG. 11.
  • motion mapping is performed on a 3D model equipped with a target skeleton model according to a motion model or an animation model, and the target skeleton model is controlled.
  • the target skeletal model moves according to a preset motion model, and the skin data is calculated and updated in real time during the movement, so that the 3D model can follow the target skeletal model to achieve smooth movement, and then realize the animation of the 3D model.
  • The skeletal structure of the target object and the skeletal structure of the animation may not be exactly the same, so a mapping between the two can be established. For example, the key nodes must match,
  • while bone lengths can be scaled proportionally. Alternatively, the skeletal model of the object and the skeletal structure of the animation can be adapted to each other, for example by proportional trimming and extension, at least so that the skeletal structure of the animation does not extend beyond the outer surface of the object's 3D model. Further, some physical operations can be performed to trim and adjust the animation skeleton so that it fills the object's 3D model as fully as possible, making the animated bone model and the 3D model of the object more harmonious.
  • The above steps 21-25 can be completed in one go, or at certain time intervals.
  • For example, after scanning, the 3D model can be stored locally or in the cloud.
  • The 3D model can then be called up directly after some time, at which point the user can freely choose the bone assembly and the animation method, and can also select the animation background, including but not limited to images captured in real time, other images already stored locally, and cloud data images.
  • When the terminal displays the object animation, it can also display the shadow of the object or add sound effects, special effects, and so on.
  • the animation can be automatically played by the mobile terminal, and it can also be controlled by the user inputting operation instructions.
  • A series of operations such as scanning, 3D reconstruction, bone assembly, and preset animation display can be realized on the mobile terminal as a whole, so that users can easily enjoy 3D scanning.
  • The presentation of two-dimensional images can transition to the presentation of 3D animation, allowing users to give virtual animated actions to objects scanned and modeled from reality. This greatly increases users' interest in and attachment to mobile terminals and leads photography applications into a new trend.
  • an embodiment of the present invention provides an object modeling and movement device 700.
  • The device 700 can be applied to various types of photographing equipment. As shown in FIG. 12, the device 700 includes a scanning module 701, a first acquisition module 702, a fusion module 703, a second acquisition module 704, and a motion module 705. The device is applied to a mobile terminal that includes a color camera and a depth camera located on the same side of the mobile terminal. For related characteristics, reference may be made to the description in the foregoing method embodiment.
  • the scanning module 701 is configured to obtain a 3D model of a target object when the color camera and the depth camera perform panoramic scanning on the target object.
  • the scanning module 701 can be called by a processor to enable and control the color camera and the depth camera by using program instructions in the memory. Further, pictures acquired during scanning can be selectively stored in the memory.
  • the first obtaining module 702 is configured to obtain a target bone model.
  • the first obtaining module 702 may be implemented by a processor invoking a corresponding program instruction. Further, the first obtaining module 702 may be implemented by invoking data and algorithms in a local memory or a cloud server and performing corresponding calculations.
  • a fusion module 703 is used to fuse the target bone model with the 3D model of the target object.
  • the fusion module 703 can be implemented by the processor calling corresponding program instructions. Further, the fusion module 703 can call data and algorithms in the local memory or cloud server. Perform the calculations accordingly.
  • the second acquisition module 704 is configured to acquire a target movement mode.
  • the second obtaining module 704 may be implemented by a processor calling a corresponding program instruction. Further, it may be implemented by calling data and algorithms in a local memory or a cloud server and performing corresponding calculations.
  • the movement module 705 is configured to control the skeletal model according to the target movement mode, so that the 3D model of the target object moves according to the target movement mode.
  • the motion module 705 may be implemented by a processor calling a corresponding program instruction. Further, it may also be implemented by calling data and algorithms in a local memory or a cloud server.
  • the scanning module 701 is specifically configured to perform the method mentioned in step 21 and any method that can equivalently replace it;
  • the first acquisition module 702 is specifically configured to perform the method mentioned in step 22 and any method that can equivalently replace it;
  • the fusion module 703 is specifically configured to perform the method mentioned in step 23 and any method that can equivalently replace it;
  • the second acquisition module 704 is specifically configured to perform the method mentioned in step 24 and any method that can equivalently replace it;
  • the motion module 705 is specifically configured to perform the method mentioned in step 25 and any method that can equivalently replace it.
  • the scanning module 701 may perform the methods of steps 331-337 and 341-343; the fusion module 703 may perform the methods of steps 351-356.
  • the apparatus 700 provided in this embodiment of the present invention implements an integrated design covering scanning, 3D reconstruction, bone assembly, and preset animation display of an object. The user does not need professional, bulky, and complicated equipment for scanning, nor a PC for complex modeling and animation processing. These functions are integrated and provided to the user, so that the user can easily perform this whole series of operations on a single mobile terminal and make any "static object (or nearly static object)" around them more lively and vital. This makes the terminal more fun to use and improves the user experience.
  • each of the above modules may be a separately disposed processing element, or may be integrated into a chip of the terminal; alternatively, the modules may be stored as program code in a storage element of the controller and invoked by a processing element of the processor to perform the functions of the modules.
  • each module can be integrated together or can be implemented independently.
  • the processing element described herein may be an integrated circuit chip with signal processing capabilities.
  • each step of the above method or each of the above modules may be completed by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.
  • the processing element may be a general-purpose processor, such as a central processing unit (CPU), or one or more integrated circuits configured to implement the above methods, such as one or more application-specific integrated circuits (ASICs), one or more microprocessors (digital signal processors, DSPs), or one or more field-programmable gate arrays (FPGAs).
  • the embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operation steps is performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Hardware Design (AREA)
  • Architecture (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Processing Or Creating Images (AREA)
  • Image Generation (AREA)
  • Auxiliary Devices For And Details Of Packaging Control (AREA)

Abstract

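The present invention discloses an object modeling and movement method. The method is applied to a mobile terminal that includes a color camera and a depth camera located on the same side of the mobile terminal. The method includes: performing panoramic scanning on a target object by using the color camera and the depth camera, to obtain a 3D model of the target object; obtaining a target bone model; fusing the target bone model with the 3D model of the target object; obtaining a target movement mode; and controlling the bone model according to the target movement mode, so that the 3D model of the target object moves according to the target movement mode. In this way, everything from scanning and 3D reconstruction to bone assembly and preset animation display can be completed on a single terminal, which makes static objects dynamic and makes the mobile terminal more fun to use.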

Description

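Object Modeling and Movement Method, Apparatus, and Device
Technical Field
The present invention relates to the field of terminal technologies, and in particular, to an object modeling and movement method, apparatus, and device.
Background
With the development of information and communications technologies, people encounter more and more graphics and images in life and work. Images are obtained with various video cameras, still cameras, scanners, and the like, which usually yield only a planar image of an object, that is, its two-dimensional information. In many fields, such as machine vision, surface shape inspection, physical replication, automated machining, product quality control, and biomedicine, three-dimensional information about an object is indispensable. 3D scanning technology emerged to meet this need. The device typically used is a 3D scanner, a scientific instrument that detects and analyzes the shape (geometric structure) and appearance data (such as color and surface albedo) of an object or environment in the real world. A 3D scanner creates a point cloud of the geometric surface of an object; these points can be interpolated into the surface shape of the object, and a denser point cloud yields a more accurate model (this process is also called 3D reconstruction). If the scanner can also capture surface color, a texture map can be applied to the reconstructed surface, which is known as texture mapping.
However, 3D scanners in the prior art are complicated to use and require professionally skilled users, and their application scenarios are limited. How to make 3D scanning technology accessible to ordinary users is therefore a problem that urgently needs to be solved.
Summary
Embodiments of the present invention provide an object modeling and movement method, apparatus, and device, so that an object of interest can be scanned anytime and anywhere and given a dynamic effect, which makes the terminal more engaging and playable, increases user stickiness, and leads the trend of the times.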
本发明实施例提供的具体技术方案如下:
第一方面,本发明实施例提供一种物体建模运动方法,该方法应用于移动终端,该移动终端包括彩色相机和深度相机;且彩色相机和深度相机位于移动终端的同一侧,正面或者反面;该方法具体包括:利用彩色相机和深度相机对目标物体进行全景扫描,得到目标物体的3D模型;获取目标骨骼模型;将目标骨骼模型与目标物体的3D模型融合;获取目标运动方式;根据目标运动方式控制上述骨骼模型,使目标物体的3D模型根据目标运动方式进行运动。
第二方面,本发明实施例提供一种物体建模运动装置,该装置应用于移动终端,该移动终端包括彩色相机和深度相机;且彩色相机和深度相机位于移动终端的同一侧,正面或者反面;该装置包括:扫描模块,用于当彩色相机和深度相机对目标物体进行全景扫描时,得到目标物体的3D模型;第一获取模块,用于获取目标骨骼模型;融合模块;用于将目标骨骼模型与目标物体的3D模型融合;第二获取模块,用于获取目标运动方式;运动模 块,用于根据目标运动方式控制骨骼模型,使目标物体的3D模型根据目标运动方式进行运动。
根据本发明实施例提供的上述方法和装置的技术方案,移动终端可以实现物体从扫描、3D重建、到骨骼装配,再到预设的动画展示的一体化设计。无需用户采用专业的、笨重的、复杂的设备进行专业扫描,也无需再跑到PC端做复杂的建模处理和动画处理,将这些功能集成在一起,提供给用户,使得用户在一个移动终端上能够轻松玩转这一系列的操作方法,使得用户身边任何一个“静态的物体(或近似静态的物体)”能够更鲜活,更具生命力。增加用户使用终端的趣味性,提升用户的体验度。
应理解,在媒体领域,“相机”与“摄像头”可以同义。
根据第一方面或者第二方面,在一种可能的设计中:深度相机可以采用TOF模组。
根据第一方面或者第二方面,在一种可能的设计中:深度相机可以采用结构光模组。
根据第一方面或者第二方面,在一种可能的设计中:深度相机的视场角范围为40度到80度。
根据第一方面或者第二方面,在一种可能的设计中:深度相机中红外线的发射功率的范围可以选择现在50-400mw之间;特殊应用下的超强光线,发生功率可以更高。
根据第一方面或者第二方面,在一种可能的设计中:扫描物体时的扫描距离为介于20cm到80cm,扫描距离可以理解为深度相机到目标物体的距离。
根据第一方面或者第二方面,在一种可能的设计中:扫描过程中深度相机的拍摄帧率可以选择不低于25fps。
根据第一方面或者第二方面,在一种可能的设计中:骨骼模型可以根据3D模型,通过一系列的算法计算得到。
根据第一方面或者第二方面,在一种可能的设计中:可以提供给用户骨骼模型制作库,如一些线段和点,其中,线段表示骨骼,点表示关节节点。接收用户的操作指令,如手势、滑动、快捷键等,将将至少两个线段和至少一个点组合成骨骼模型,从而得到骨骼模型,进一步地,将该骨骼模型上传到云端或存储在本地。该方法可以由第一获取模块完成;硬件上可以由处理器调用存储器中的程序指令实现。
根据第一方面或者第二方面,在一种可能的设计中:可以提供给用户更加开放的制作库,线段和点完全由用户自由设计,其中,线段表示骨骼,点表示关节节点。接收用户的操作指令,如手势、滑动、快捷键等,将将至少两个线段和至少一个点组合成骨骼模型,从而得到骨骼模型,进一步地,将该骨骼模型上传到云端或存储在本地。该方法可以由第一获取模块完成;硬件上可以由处理器调用存储器中的程序指令实现。
根据第一方面或者第二方面,在一种可能的设计中:在一种具体实现过程中,可以从至少一个预设骨骼模型中选择与所述目标物体的外形匹配度最高的骨骼模型作为目标骨骼模型。预设骨骼模型可以存储在网络、云端或者本地。例如本地存储了鸡的骨骼模型、狗的骨骼模型、鱼的骨骼模型,当目标物体为鸭子的时候,系统通过外形识别,将鸡的骨骼模型作为目标骨骼模型。相似的判定标准包括但不限于骨骼形态、骨骼的长度、骨骼的粗细、骨骼的数量、骨骼的组成方式等。该方法可以由第一获取模块完成;硬件上可以由处理器调用存储器中的程序指令实现。
根据第一方面或者第二方面,在一种可能的设计中:可以接收用户的选择指令,该选择指令用于从至少一个预设骨骼模型中选择出目标骨骼模型,这些预设模型是本地存储或者是从云端或网络上调用的。该方法可以由第一获取模块完成;硬件上可以由处理器调用存储器中的程序指令实现。
根据第一方面或者第二方面,在一种可能的设计中:可以获取第一物体的运动方式,将第一物体的运动方式作为目标运动方式。其中,第一物体可以是当前实时运动的物体;也可以是过去拍过且保存下来物体的运动方式;也可以是预设的某一特定物体的运动方式。该方法可以由第二获取模块完成;硬件上可以由处理器调用存储器中的程序指令实现。
根据第一方面或者第二方面,在一种可能的设计中:可以从预设的目标运动方式选出一种。该方法可以由第二获取模块完成;硬件上可以由处理器调用存储器中的程序指令实现。其中,预设的目标运动方式可以是一套完整的运动方式,也可以是与用户操作相对应的运动方式,如用户对着终端中显示的“复活的物体”招手,则物体可以按照预设的招手回应动画方式进行运动。更为通用的,用户可以对终端输入预设的交互动作,终端根据该交互动作获得对应的回应运动方式,并控制物体的3D模型按照回应运动方式进行运动。
根据第一方面或者第二方面,在一种可能的设计中:运动方式可以是用户自己使用动画制作软件制作的,当然这个软件可以是系统内嵌的工具集或者在扫描重建运动的APP中载入的工具集,或者来源于第三方动画设计工具;可以是当前制作也可以是历史制作的运动方式或动画。该方法可以由第二获取模块完成;硬件上可以由处理器调用存储器中的程序指令实现。
根据第一方面或者第二方面,在一种可能的设计中:运动方式可以是根据物理属性在多个预先存储的运动方式中选择出属性匹配度最高的运动方式作为目标运动方式。该方法可以由第二获取模块完成;硬件上可以由处理器调用存储器中的程序指令实现。
根据第一方面或者第二方面,在一种可能的设计中:运动方式还可以是基于目标物体的骨骼模型(可以采用上一步骤中的任一方法去获取)由系统或者用户对该骨骼模型进行自主设计,得到目标运动方式。这一种方式是对后续物体的3D模型实现动画的最契合的动画操作。该方法可以由第二获取模块完成;硬件上可以由处理器调用存储器中的程序指令实现。
根据第一方面或者第二方面,在一种可能的设计中:用户扫描完物体的3D模型之后,可以将3D模型存储在本地或者云端,若干时间以后可以直接调用该3D模型,自由选择骨骼装配或自由选择动画方式,还可以选择动画背景,包括但不限于实时拍摄的图像、本地已经存储的其它图像、云端数据图像等。此外,在实现物体动画的同时,还可以显示出物体的影子,或者增加其它的音效、特效等等。动画可以由移动终端自动播放,还可以在通过用户输入操作指令进行播放的控制。
根据第一方面或者第二方面,在一种可能的设计中:在根据所述目标运动方式控制所述骨骼模型运动的过程中,对所述骨骼模型和所述目标物体的3D模型进行蒙皮操作;所述蒙皮操作用于根据骨骼模型的运动,确定所述3D模型表面上点的位置变化;使所述目标物体的3D模型跟随所述骨骼模型进行运动。该方法可以由运动模块完成;硬件上可以由处理器调用存储器中的程序指令实现。
通过上述可能的设计,加强了人机交互的程度,给用户更加自由的发挥空间,可以让用户在物体复活的过程中深度参与其中,开发想象增加乐趣。
更具体地,上述操作中涉及到的其它技术实现可以由处理器调用存储器中的程序与指令进行相应的处理,如使能相机,采集图像,生成3D模型,获取骨骼模型或动画、存储骨骼模型或动画、增加特效以及与用户之间的交互操作等。
第三方面,本发明实施例提供一种终端设备,所述终端设备包含存储器、处理器、总线、深度相机和所述彩色相机;彩色相机和深度相机位于移动终端的同一侧;存储器、深度相机、彩色相机以及处理器通过总线相连;深度相机和彩色相机用于在处理器的控制下对目标物体进行全景扫描;存储器用于存储计算机程序和指令;处理器用于调用存储器中存储的计算机程序和指令,使终端设备执行如上述任何一种可能的设计方法。
根据第三方面,在一种可能的设计中,终端设备还包括天线系统、天线系统在处理器的控制下,收发无线通信信号实现与移动通信网络的无线通信;移动通信网络包括以下的一种或多种:GSM网络、CDMA网络、3G网络、4G网络、5G网络、FDMA、TDMA、PDC、TACS、AMPS、WCDMA、TDSCDMA、WIFI以及LTE网络。
应理解,发明内容可以包含权利要求中所能实现的所有方法,此处不予以列举。
对于上述任意一种可能的实现方法和步骤,在不违背自然规律的前提下,可以进行不同方法和步骤之间的自由组合,不同的方法和步骤还可以增加或减少一些可能的步骤。本发明不予以穷举和赘述。
本发明实现了物体从扫描、3D重建、到骨骼装配,再到预设的动画展示都可以在一个终端内完成,实现了静态物体的复活,提升用户使用移动终端的趣味性。
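Brief Description of Drawings
FIG. 1 is a schematic structural diagram of a terminal according to an embodiment of the present invention;
FIG. 2 is a flowchart of an object modeling and movement method according to an embodiment of the present invention;
FIG. 3 shows the main process from scanning an object to animating it according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of structured light according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of TOF according to an embodiment of the present invention;
FIG. 6 is a flowchart of a meshing + texture mapping method according to an embodiment of the present invention;
FIG. 7 is a flowchart of a specific meshing implementation according to an embodiment of the present invention;
FIG. 8 is a flowchart of a specific texture mapping implementation according to an embodiment of the present invention;
FIG. 9 shows a specific meshing + texture mapping example according to an embodiment of the present invention;
FIG. 10 is a flowchart of a specific bone assembly solution according to an embodiment of the present invention;
FIG. 11 is a specific animation flowchart according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of an object modeling and movement apparatus according to an embodiment of the present invention.
Description of Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.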
本发明实施例中,移动终端,可以是向用户提供拍照和/或数据连通性的设备,具有无线连接功能的手持式设备、或连接到无线调制解调器的其他处理设备,比如:数码相机、单反相机、智能手机,还可以是其它带有拍照功能以及显示功能的智能设备,如可穿戴设备、平板电脑、PDA(Personal Digital Assistant,个人数字助理)、无人机、航拍器等。
图1示出了终端100的一种可选的硬件结构示意图。
参考图1所示,终端100可以包括射频单元110、存储器120、输入单元130、显示单元140、拍摄单元150、音频电路160、扬声器161、麦克风162、处理器170、外部接口180、电源190等部件。
射频单元110可用于收发信息或通话过程中信号的接收和发送,特别地,将基站的下行信息接收后,给处理器170处理;另外,将设计上行的数据发送给基站。通常,RF电路包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器(Low Noise Amplifier,LNA)、双工器等。此外,射频单元110还可以通过无线通信与网络设备和其他设备通信。所述无线通信可以使用任一通信标准或协议,包括但不限于全球移动通讯系统(Global System of Mobile communication,GSM)、通用分组无线服务(General Packet Radio Service,GPRS)、码分多址(Code Division Multiple Access,CDMA)、宽带码分多址(Wideband Code Division Multiple Access,WCDMA)、长期演进(Long Term Evolution,LTE)、电子邮件、短消息服务(Short Messaging Service,SMS)等。
存储器120可用于存储指令和数据,存储器120可主要包括存储指令区和存储数据区,存储数据区可存储关节触摸手势与应用程序功能的关联关系;存储指令区可存储操作系统、应用、至少一个功能所需的指令等软件单元,或者他们的子集、扩展集。还可以包括非易失性随机存储器;向处理器170提供包括管理计算处理设备中的硬件、软件以及数据资源,支持控制软件和应用。还用于多媒体文件的存储,以及运行程序和应用的存储。
输入单元130可用于接收输入的数字或字符信息,以及产生与所述便携式多功能装置的用户设置以及功能控制有关的键信号输入。具体地,输入单元130可包括触摸屏131以及其他输入设备132。所述触摸屏131可收集用户在其上或附近的触摸操作(比如用户使用手指、关节、触笔等任何适合的物体在触摸屏上或在触摸屏的附近操作),并根据预先设定的程序驱动相应的连接装置。触摸屏可以检测用户对触摸屏的触摸动作,将所述触摸动作转换为触摸信号发送给所述处理器170,并能接收所述处理器170发来的命令并加以执行;所述触摸信号至少包括触点坐标信息。所述触摸屏131可以提供所述终端100和用户之间的输入界面和输出界面。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触摸屏。除了触摸屏131,输入单元130还可以包括其他输入设备。具体地,其他输入设备132可以包括但不限于物理键盘、功能键(比如音量控制按键132、开关按键133等)、轨迹球、鼠标、操作杆等中的一种或多种。
进一步的,触摸屏131可覆盖显示面板141,当触摸屏131检测到在其上或附近的触摸操作后,传送给处理器170以确定触摸事件的类型,随后处理器170根据触摸事件的类 型在显示面板141上提供相应的视觉输出。在本实施例中,触摸屏与显示单元可以集成为一个部件而实现终端100的输入、输出、显示功能;为便于描述,本发明实施例以触摸显示屏代表触摸屏和显示单元的功能集合;在某些实施例中,触摸屏与显示单元也可以作为两个独立的部件。
显示单元140可用于显示由用户输入的信息或提供给用户的信息以及终端100的各种菜单。在本发明实施例中,显示单元还用于显示设备利用摄像头150获取到的图像,可以包括某些拍摄模式下的预览图像、拍摄的初始图像以及拍摄后经过一定算法处理后的目标图像。
拍摄单元150,用于采集图像或视频,可以通过应用程序指令触发开启,实现拍照或者摄像功能。拍摄单元可以包括成像镜头,滤光片,图像传感器等部件。物体发出或反射的光线进入成像镜头,通过滤光片,最终汇聚在图像传感器上。成像镜头主要是用于对拍照视角中的物体(也可称为待拍摄对象或目标物体)发出或反射的光汇聚成像;滤光片主要是用于将光线中的多余光波(例如除可见光外的光波,如红外)滤去;图像传感器主要是用于对接收到的光信号进行光电转换,转换成电信号,并输入到处理170进行后续处理。
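Specifically, the photographing unit 150 may further include a color camera 151 and a depth camera 152. The color camera is used to capture a color image of the target object, and includes the color cameras commonly found in current popular terminal products. The depth camera is used to obtain depth information of the target object; for example, the depth camera may be implemented using TOF technology or structured light technology.
TOF is short for time of flight. The sensor emits modulated near-infrared light, which is reflected when it hits an object; the sensor converts the time difference or phase difference between emission and reflection into the distance of the photographed scene, thereby producing depth information. In addition, combined with images from a conventional color camera, the three-dimensional contour of the object can be presented as a topographic map in which different colors represent different distances.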
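The time-difference and phase-difference conversions mentioned above can be sketched as follows. This is only an illustrative sketch: the function names and the modulation frequency parameter are assumptions, not part of the original text, and a real TOF module performs this per pixel with calibration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def distance_from_time(delta_t_s):
    """Distance from the round-trip time: the light travels to the object and back."""
    return C * delta_t_s / 2.0


def distance_from_phase(phase_rad, mod_freq_hz):
    """Distance from the phase shift of continuously modulated light.

    mod_freq_hz is a hypothetical modulation frequency; the result is
    unambiguous only up to C / (2 * mod_freq_hz).
    """
    return C * phase_rad / (4.0 * math.pi * mod_freq_hz)
```

For example, a round trip of 4 ns corresponds to roughly 0.6 m, which falls inside the 20 cm to 80 cm scanning range the document recommends.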
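Structured light refers to a system consisting of a projection element and a camera. The projection element projects specific light information (for example, light diffracted by a grating) onto the surface of the object and the background, and the camera captures it. Information such as the position and depth of the object is calculated from the changes in the light signal caused by the object (for example, changes in line thickness and displacement), and the entire three-dimensional space is then reconstructed.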
音频电路160、扬声器161、麦克风162可提供用户与终端100之间的音频接口。音频电路160可将接收到的音频数据转换后的电信号,传输到扬声器161,由扬声器161转换为声音信号输出;另一方面,麦克风162用于收集声音信号,还可以将收集的声音信号转换为电信号,由音频电路160接收后转换为音频数据,再将音频数据输出处理器170处理后,经射频单元110以发送给比如另一终端,或者将音频数据输出至存储器120以便进一步处理,音频电路也可以包括耳机插孔163,用于提供音频电路和耳机之间的连接接口。
处理器170是终端100的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行存储在存储器120内的指令以及调用存储在存储器120内的数据,执行终端100的各种功能和处理数据,从而对手机进行整体监控。可选的,处理器170可包括一个或多个处理单元;优选的,处理器170可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器170中。在一些实施例中,处理器、存储器、可以在单一芯片上实现,在一些实施例中,他们也可以在独立的芯片上分 别实现。处理器170还可以用于产生相应的操作控制信号,发给计算处理设备相应的部件,读取以及处理软件中的数据,尤其是读取和处理存储器120中的数据和程序,以使其中的各个功能模块执行相应的功能,从而控制相应的部件按指令的要求进行动作。
终端100还包括外部接口180,所述外部接口可以是标准的Micro USB接口,也可以使多针连接器,可以用于连接终端100与其他装置进行通信,也可以用于连接充电器为终端100充电。
终端100还包括给各个部件供电的电源190(比如电池),优选的,电源可以通过电源管理系统与处理器170逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。
尽管未示出,终端100还可以包括闪光灯、无线保真(wireless fidelity,WiFi)模块、蓝牙模块、不同功能的传感器等,在此不再赘述。下文中描述的全部方法均可以应用在图1所示的终端中。此外,本领域技术人员可以理解,图1仅仅是便携式多功能装置的举例,并不构成对便携式多功能装置的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件。
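Referring to FIG. 2, an embodiment of the present invention provides an object modeling and movement method. The method is applied to a mobile terminal that includes a color camera and a depth camera located on the same side of the mobile terminal. The method includes the following steps:
Step 21: Perform panoramic scanning on a target object (that is, the scanned object, referred to simply as the object in some paragraphs) by using the color camera and the depth camera, to obtain a 3D model of the target object.
Step 22: Obtain a target bone model.
Step 23: Fuse the target bone model with the 3D model of the target object.
Step 24: Obtain a target movement mode.
Step 25: Control the bone model according to the target movement mode, so that the 3D model of the target object moves according to the target movement mode.
The color camera and the depth camera may be located on the front of the terminal device or on its back. Their specific arrangement and number may be determined flexibly according to the designer's requirements and are not limited in this application.
Refer to FIG. 3, which shows the main process from scanning an object to animating it. The object is first scanned: the depth camera produces depth maps and the color camera produces color maps. The depth maps and color maps are fused into a textured mesh model, that is, the 3D model of the object. A bone model is embedded in the 3D model and is driven by a skeletal animation (it should be understood that the movement of the bones is usually invisible, although it may be made visible to the user in some special scenarios), so that the object visually appears animated. Details are described below with examples.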
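Step 21
Step 21 involves depth camera scanning, color camera scanning, and 3D reconstruction. Specific examples are as follows.
Depth camera scanning
The depth camera may include a 3D/depth sensor or a 3D/depth sensor module, and is used to obtain depth information of a static object. It should be understood that the scanned object should in theory be stationary, but slight motion is acceptable to some extent in practice. Depth information may be obtained using structured light technology or TOF. As new methods for obtaining depth information emerge, the depth module may be implemented in more ways, which is not limited in the present invention.
A schematic diagram of structured light is shown in FIG. 4, where 301 is an invisible infrared light source, 302 is a grating that generates a specific light pattern, 303 is the scanned object, and 304 is an infrared camera. The infrared camera captures the light pattern reflected by 303 and compares it with the expected light pattern, and the depth information of the scanned part of the target object is obtained through calculation.
A TOF depth camera is shown in FIG. 5, where 311 is the target object, 312 is the infrared transmitter of the TOF camera, and 313 is the infrared receiver. When 312 emits infrared light (for example, but not limited to, 850 nm to 1200 nm) toward the target object, the target object reflects the infrared light, and the reflected light is received by 313. The sensor of 313 (for example, but not limited to, a CMOS array or CCD array with a resolution of 240×180 or higher) generates a series of voltage difference signals from the reflected infrared light. The depth calculation unit 314 performs calculation on these voltage difference signals and finally obtains the depth information 315 of the scanned part of the target object.
Color camera scanning
While the target object is scanned, the depth camera and the color camera are invoked synchronously, and a correction algorithm is used so that the images of the target object scanned by the two are consistent. The color camera obtains images during scanning in basically the same way as an ordinary camera takes photos, which is not described further here.
In a specific implementation, the target object needs to be scanned within a certain angle range (usually limited by the smaller of the fields of view of the depth camera and the color camera) and within a certain distance. Limited by the quality of the depth information (for example, the depth map), the distance between the object and the depth camera (or mobile terminal) is usually 20 cm to 80 cm.
In one specific scanning mode, the terminal stays still, and the user holds the target object 30 cm to 70 cm in front of the depth camera and slowly rotates it in all directions, so that the union of all scanned images can be used to construct the complete object. Note that the hand holding the object should block the object's surface as little as possible.
In another specific scanning mode, the object stays still, and the user holds the terminal so that the object is 30 cm to 70 cm in front of the depth camera and moves around the object to scan it panoramically, so that the union of all scanned images can be used to construct the complete object. Note that the hand holding the terminal should block the object's surface as little as possible.
In yet another specific scanning mode, the object stays still, and the user holds the terminal so that the object is 30 cm to 70 cm in front of the depth camera and scans the object at preset angular intervals until the union of all scanned images can be used to construct the complete object. Note that the hand holding the terminal should block the object's surface as little as possible.
Specifically, the object may be captured multiple times to ensure that the capture is complete and that the scene information covers the whole object without blind spots. Panoramic scanning therefore produces multiple depth map frames (a depth map sequence), each corresponding to the scene within the scanning range of one scan, and multiple color map frames (a color map sequence), each likewise corresponding to the scene within the scanning range of one scan. Other objects may also be captured when the target object is scanned, but at the moderate distances described above, noise other than the target object can be removed in the subsequent 3D reconstruction.
In a possible implementation, the frame rate of the depth camera during scanning may be greater than or equal to 25 fps, for example, 30 fps, 60 fps, or 120 fps.
In a possible implementation, the terminal may present the scanning progress of the target object during scanning, so that the user can see whether the whole target object has been covered and can choose to continue or stop scanning.
Because the depth camera and the color camera may be front-facing or rear-facing, there are correspondingly two scanning modes: front-facing scanning and rear-facing scanning. If the depth camera is located at the top of the front of the phone, it can be used together with the front-facing color camera, and front-facing scanning enables selfie scanning. If the depth camera is located at the top of the back of the phone, it can be used together with the rear-facing color camera; rear-facing scanning offers a wider choice of target objects and allows fine, stable scanning of the target object. It should be understood that with the emergence of foldable-screen terminals, the physical positions of the depth camera and the color camera may change, so front-facing and rear-facing in the traditional sense should not constitute any limitation on physical position. When an object is scanned for 3D modeling, the depth camera and the color camera may be located on the same side to keep the images consistent; their position and orientation relative to the terminal are not limited, and any combination of camera positions that enables 3D reconstruction is acceptable. In a possible implementation, the terminal may also invoke a third-party photographing device, such as an externally connected shooting pole, scanner, or external camera. Optionally, an external color camera, an external depth camera, or both may be used.
The color camera scanning and depth camera scanning may be started when the user triggers the scanning function. Trigger operations include a timer, pressing the shutter, a gesture, mid-air sensing, device control, and the like. In addition, when the user opens the camera for preview, the system may indicate in the preview image which objects are suitable for scanning or 3D modeling; for example, objects in the preview image may be marked with boxes to prompt the user.
In addition, the specific device parameters of the depth camera and the color camera depend on the manufacturing process, user requirements, and design constraints of the terminal, and are not specifically limited in the present invention.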
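3D reconstruction (meshing + texture mapping)
As shown in FIG. 6, after multi-frame 360-degree panoramic scanning of the object, a depth map sequence 321 and a color map sequence 322 are obtained. Each frame from the depth camera is a depth map of one scanned scene, and each frame from the color camera is a color map (for example, an RGB image) of one scanned scene. The depth map sequence 321 is meshed to obtain a mesh model of the target object, and texture mapping is performed on the mesh model according to the color map sequence 322 to obtain the texture-mapped mesh model 323, that is, the 3D model of the object. In a possible implementation, texture mapping may be performed using all frames or only some frames of the color map sequence.
1) Meshing
In a specific implementation, referring to FIG. 7, a specific meshing solution is as follows:
Step 331: Obtain a color map (including but not limited to RGB) and a depth map of the target object in each scanned scene. A depth map contains information about the distances between a plurality of points on the surface of the target object and the depth camera. It is similar to a grayscale image, except that each pixel value represents the actual distance between the depth camera and a point on the surface of the target object. The color image and the depth image are usually registered.
Step 332: Preprocessing, which includes but is not limited to: bilateral filtering of the depth map for denoising; downsampling the depth map to generate an image pyramid at different resolutions; converting the depth map into a point cloud; estimating the normal vector of each vertex; and cropping points outside the range of the scanned object.
Step 333: In step 332, depth map and color map sequences of the target object at different scanning positions are collected. To generate an object model, the single-frame 3D point clouds obtained from the collected image sequence need to be transformed into a unified coordinate system, that is, the pose transformation relationship between the object at different scanning positions needs to be obtained. This is pose estimation: estimating the 3D pose of the object from the image sequence. There are two approaches to pose estimation: registration based on feature points and registration based on point clouds. When the object's angle changes little between images in the sequence, fine registration based on point clouds, such as the iterative closest point (ICP) algorithm, can be used to estimate the object pose. When the angle changes more between images, coarse registration between two poses may first be performed based on 3D features of the object, and its result used as the initial value for fine registration; this supports faster scanning.
If there were no measurement error, all 3D points of the current frame would lie on the surface of the 3D volumetric model of the target object. Solving for the camera pose (the pose transformation relationship) therefore becomes minimizing the distance between the 3D point cloud of the current frame and the surface point cloud of the volumetric model. The objective function is:
$$M_{opt} = \arg\min_{M} \sum_{i} \left( (M \cdot s_i - d_i) \cdot n_i \right)^2$$
where $M$ is the pose transformation matrix of the camera; $s_i$ is the 3D point cloud of the frame whose pose is being calculated, which is transformed into the observation coordinate system of the volumetric model; $d_i$ is the point cloud of the model in the observation coordinate system; and $n_i$ is the normal corresponding to the model point cloud. The objective function is the minimum sum of squared distances from the current-frame point cloud to the planes of the volumetric-model point cloud.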
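The point-to-plane objective described above can be evaluated for a candidate pose as in the following minimal sketch. It assumes that point correspondences are already known and that the pose is given as a 4x4 homogeneous matrix in nested lists; the function names are illustrative, and a full ICP loop (correspondence search plus a solver for M) is deliberately left out.

```python
def transform(M, p):
    """Apply a 4x4 rigid transform M (nested lists) to a 3D point p."""
    x, y, z = p
    return tuple(M[r][0] * x + M[r][1] * y + M[r][2] * z + M[r][3] for r in range(3))


def point_to_plane_error(M, src_pts, dst_pts, dst_normals):
    """Sum over i of ((M * s_i - d_i) . n_i)^2 for matched points."""
    total = 0.0
    for s, d, n in zip(src_pts, dst_pts, dst_normals):
        ms = transform(M, s)
        residual = sum((ms[k] - d[k]) * n[k] for k in range(3))
        total += residual * residual
    return total
```

Because only the component along the normal is penalized, a point that slides within the tangent plane of its match contributes no error, which is what distinguishes this objective from a plain point-to-point distance.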
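Step 334: Convert the 2D depth maps into 3D information and fuse them into a unified 3D volumetric model. The TSDF (truncated signed distance function) algorithm is used. After fusion, each voxel stores an SDF (signed distance function) value and a weight value, and optionally a color value. TSDF is currently the mainstream algorithm for 3D point cloud fusion. The weight is computed by averaging: each fusion increases the old weight by one, and the new observation has a weight of 1. The old and new SDF values are multiplied by their respective weights and added, and the sum is divided by the number of fusions (the new weight) to obtain the new normalized SDF value.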
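The running-average update of step 334 can be sketched for a single voxel as follows. The function name and the truncation parameter are assumptions made for illustration; the source only specifies that the new observation has weight 1 and that the fused value is the weighted mean.

```python
def fuse_tsdf(old_sdf, old_weight, new_sdf, truncation=1.0):
    """Fuse one new SDF observation (weight 1) into a voxel by weighted averaging."""
    # Truncate the new observation to [-truncation, truncation].
    new_sdf = max(-truncation, min(truncation, new_sdf))
    new_weight = old_weight + 1
    fused = (old_sdf * old_weight + new_sdf * 1) / new_weight
    return fused, new_weight
```

A voxel that has never been observed starts with weight 0, so its first fusion simply stores the (truncated) observation.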
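Step 335: Determine whether a preset number of key frames has been saved at regular angular intervals (for example, but not limited to, preset angles such as 30, 45, 60, or 90 degrees) in the roll, yaw, and pitch directions. If the number of saved key frames is less than the preset number (the criterion being whether the whole target object is covered), more scenes (color maps and depth maps) need to be captured, and the terminal instructs the user to scan more. When the key frames are sufficient to cover the whole target object, the user is notified that scanning is complete and may end scanning and proceed to the subsequent steps.
Step 336: During real-time fusion, select and cache the input key frame information needed for texture mapping, including color images and poses (position and orientation differences between images). Because object modeling requires 360-degree scanning, a preset number (N) of key frames is selected in each of the roll, yaw, and pitch directions, which is enough to fully recover the 360-degree texture of the object. For example, the angle (yaw/pitch/roll) of each frame in the input image stream is determined from the ICP result, the sharpness of each frame is then calculated, and a selection strategy built on angle and sharpness is used to select the key frames.
The angle strategy divides the 360 degrees in each direction into N regions of 360/N degrees each, and each region must contain one sharp color image.
Sharpness detection principle: image sharpness is generally evaluated with the gradient method or the Sobel operator. The gradient method may be used here. For a pixel in the image, together with the pixel to its right and the pixel below it, the calculation is as follows:
det x = a(i+1, j) - a(i, j)
det y = a(i, j+1) - a(i, j)
Figure PCTCN2019088480-appb-000002
blur = sum / (width * height)
A larger blur value indicates a sharper image.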
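The gradient-based sharpness score can be sketched as below. The formula image defining `sum` is not reproduced in this text, so the sketch assumes that `sum` accumulates the gradient magnitude sqrt(det x^2 + det y^2) over all pixels; that choice, the function name, and the row-major image layout are assumptions rather than the patent's stated definition.

```python
import math


def sharpness(img):
    """Gradient-based sharpness of a grayscale image.

    img is a 2D list with img[j][i], where i is the column and j is the row.
    Assumption: `sum` is the accumulated gradient magnitude over all pixels.
    """
    height = len(img)
    width = len(img[0])
    total = 0.0
    for j in range(height - 1):
        for i in range(width - 1):
            det_x = img[j][i + 1] - img[j][i]  # difference to the right neighbor
            det_y = img[j + 1][i] - img[j][i]  # difference to the neighbor below
            total += math.sqrt(det_x ** 2 + det_y ** 2)
    return total / (width * height)
```

Under this assumption, a key frame selector could keep, for each 360/N-degree angular region, the frame with the largest score.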
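Step 337: Use the Marching Cubes algorithm to mesh the 3D point cloud and generate triangular faces. The main idea of Marching Cubes is to search, cell by cell, for the boundary between the content part and the background part of the 3D point cloud, and to extract triangles within each cell to fit this boundary. Put simply, voxels that contain volume data content are called real points, and the background voxels outside them are called virtual points, so a 3D point cloud is a lattice of real and virtual points. From the perspective of a single cell, each of its 8 voxel corners may be real or virtual, so a cell has 2^8 = 256 possible configurations. The core idea of Marching Cubes is to use these 256 enumerable configurations to extract isosurface triangles within each cell. A cell is the cube formed by eight adjacent voxels in the 3D image, and the "cube" in Marching Cubes refers to this cell. Note the difference between cells and voxels: a cell is the cube formed by 8 voxels, whereas each voxel (except those on the boundary) is shared by 8 cells.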
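The 256-case enumeration can be illustrated by packing the real/virtual state of a cell's 8 corners into one byte. This is only a sketch of the case-index step: the corner ordering is an arbitrary assumption, and the triangle lookup table that a full Marching Cubes implementation indexes with this value is omitted.

```python
def cube_index(corner_is_real):
    """Pack 8 corner flags (True = real point) into a case index in 0..255."""
    assert len(corner_is_real) == 8
    index = 0
    for bit, real in enumerate(corner_is_real):
        if real:
            index |= 1 << bit
    return index


def crosses_surface(corner_is_real):
    """A cell contains part of the boundary unless all corners agree."""
    return cube_index(corner_is_real) not in (0, 255)
```

Cells whose index is 0 or 255 lie entirely outside or entirely inside the content and produce no triangles.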
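2) Texture mapping
In a specific implementation, referring to FIG. 8, a specific texture mapping solution is as follows:
Step 341: Based on the mesh model (triangular face information) and the pose information of the key frames, determine whether each face is visible in the pose of each key frame. The input is all triangular face information of the mesh model and the spatial coordinates of the key frames; the output is whether each triangular face is visible in the pose of each key frame.
Principle: To determine whether a triangular face is visible in a given pose, check whether the ray connecting that pose to a vertex of the face intersects any other triangular face of the model. If it does, the face is occluded by another face and is invisible; otherwise, the face is visible in that pose.
The collision detection process involves calculating triangle normal vectors in space, determining whether a ray intersects a triangle, determining whether a ray intersects an AABB (axis-aligned bounding box), and constructing a hierarchical binary tree.
A specific example processing flow is as follows:
1) Take a vertex of the face and connect it to the camera viewpoint of the current key frame image to obtain a ray;
2) Test the ray for occlusion against the binary hierarchy tree, starting from the root;
3) Determine whether the BV (bounding volume) node is a leaf node; if it is a leaf node, go to 6);
4) Determine whether the ray intersects the AABB of the BV; if not, return to 1);
5) If the ray intersects the BV, take the two child nodes of the BV and return to 3);
6) If the node is a leaf node, first determine whether the ray intersects the AABB; if so, then determine whether it intersects the triangle; if it does, the vertex is occluded;
7) If one or more vertices of the face are occluded, the face is invisible in the current key frame image.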
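The core occlusion test in the flow above, whether the ray from the camera viewpoint to a vertex hits another triangle first, can be sketched as follows. The sketch uses the well-known Möller–Trumbore ray/triangle test, which the source does not name, and it checks every triangle by brute force instead of traversing the AABB hierarchy; both are simplifications chosen for illustration, and the function names are assumptions.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_hits_triangle(origin, target, tri, eps=1e-9):
    """True if the open segment origin -> target passes through triangle tri."""
    v0, v1, v2 = tri
    direction = sub(target, origin)      # not normalized: t in (0, 1) spans the segment
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return False                     # segment is parallel to the triangle plane
    inv = 1.0 / det
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv
    if u < 0.0 or u > 1.0:
        return False
    q = cross(t_vec, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return False
    t = dot(e2, q) * inv
    return eps < t < 1.0 - eps           # hit lies strictly between camera and vertex


def vertex_visible(camera, vertex, other_triangles):
    """A vertex is visible if no other triangle blocks the camera-to-vertex segment."""
    return not any(segment_hits_triangle(camera, vertex, tri) for tri in other_triangles)
```

In a real pipeline the bounding volume hierarchy exists precisely to avoid this linear scan over all triangles.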
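Step 342: Based on the result of step 341 and the mesh model, label each face of the mesh model using region partitioning and graph cuts to determine which key frame image (view) is used to generate its texture. The face labeling result may be used as input to the affine mapping (warping) module to generate a preliminary texture map.
Step 343: Map the texture of the corresponding region in the key frame image onto the texture map, and smooth the boundaries between patches (seam patches) of different key frames.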
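Because one texture map of the 3D object model is generated from multiple key frame images, many color discontinuities appear at the seams between regions generated from different key frame images after key frame selection. At a texture discontinuity, each vertex can be regarded as two vertices: Vleft, belonging to the face on the left, and Vright, belonging to the face on the right. The color of each vertex V before adjustment is denoted fv, and the color correction value gv of each vertex V is obtained from the following minimization:
Figure PCTCN2019088480-appb-000003
Here argmin denotes finding the values that minimize the difference. The formula above contains two parts:
Figure PCTCN2019088480-appb-000004
In the first part, v denotes a vertex at a key frame seam, that is, a vertex that belongs to both the left seam patch and the right seam patch.
Figure PCTCN2019088480-appb-000005
In the expression above, fv is the color value before adjustment and gv is the color correction value, that is, the increment (Δ). The formula means that, to keep the seam smooth, the difference between the corrected colors of a point shared by images from different frames should be as small as possible. In the second part, Vi and Vj denote any two adjacent vertices on the same texture seam patch; the difference between their increments should be as small as possible, so that the result does not become unsmooth because one vertex is increased too much while the other is adjusted too little.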
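The formula images are not reproduced in this text. As a hedged reconstruction from the verbal description only, a minimization with the two described parts could take the following form; the exact notation and any relative weighting between the two terms in the original are unknown and are assumptions here.

```latex
\[
\operatorname*{arg\,min}_{g}\;
\sum_{v \,\in\, \text{seams}}
\Bigl( \bigl(f_{v_{\mathrm{left}}} + g_{v_{\mathrm{left}}}\bigr)
     - \bigl(f_{v_{\mathrm{right}}} + g_{v_{\mathrm{right}}}\bigr) \Bigr)^{2}
\;+\;
\sum_{\substack{v_i,\, v_j \text{ adjacent} \\ \text{in the same patch}}}
\bigl( g_{v_i} - g_{v_j} \bigr)^{2}
\]
```

The first sum pulls the corrected colors on the two sides of a seam together, and the second keeps the corrections of neighboring vertices within one patch close to each other, which matches the two parts explained in the text.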
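In a specific implementation, adjacent regions with the same label in the face labeling result are stored as a patch, boundary smoothing is performed on the vertices of all patches, the pixel value of each vertex is adjusted, and a position- and pixel-based affine transformation is applied to the triangles enclosed by the final vertices to form the final texture map.
Drawing the texture atlas of the object onto the surface of its mesh model yields the 3D model of the object, which is generally saved in .obj format. In the example shown in FIG. 9, the texture atlas of a lion is mapped onto the mesh model of the lion to obtain the texture-mapped 3D model of the lion.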
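Step 22
Step 21 yields the 3D model of the target object after 3D reconstruction, that is, a textured mesh model. Next, bones need to be added to the textured mesh, which raises the question of how to obtain the bone model, that is, the target bone model.
In a specific implementation, a bone model creation library may be provided to the user, for example, some line segments and points, where line segments represent bones and points represent joint nodes. An operation instruction of the user, such as a gesture, a slide, or a shortcut key, is received to combine at least two line segments and at least one point into a bone model. Further, the bone model may be uploaded to the cloud or stored locally.
In a specific implementation, a more open creation library may be provided to the user, in which the line segments and points are designed entirely freely by the user, where line segments represent bones and points represent joint nodes. An operation instruction of the user, such as a gesture, a slide, or a shortcut key, is received to combine at least two line segments and at least one point into a bone model. Further, the bone model may be uploaded to the cloud or stored locally.
In a specific implementation, the bone model that best matches the shape of the target object may be selected from at least one preset bone model as the target bone model. Preset bone models may be stored on a network, in the cloud, or locally. For example, if bone models of a chicken, a dog, and a fish are stored locally and the target object is a duck, the system uses shape recognition to select the chicken bone model as the target bone model. Similarity criteria include but are not limited to bone shape, bone length, bone thickness, number of bones, and how the bones are connected.
In a specific implementation, a selection instruction of the user may be received, where the selection instruction is used to select the target bone model from at least one preset bone model.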
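Step 23
Fuse the target bone model with the 3D model of the target object, in other words, embed the target bone model into the 3D model of the target object. The positions of the bone joint nodes in the object/character need to be calculated so that the final skeleton fits the internal structure of the target object as closely as possible and looks as similar as possible to the preset (given) skeleton.
In a specific implementation, referring to FIG. 10, a specific bone assembly solution is as follows:
Step 351: To approximate the medial surface and for use in other calculations, compute trilinearly interpolated adaptively sampled distance fields. The signed distance from any point to the object surface can be evaluated by building a k-d tree (k-dimensional tree), a data structure that partitions k-dimensional data space and is mainly used for searching key data in multidimensional space.
Step 352: Compute a set of sample points that lie approximately on the medial surface of the object to find points where bone joints may be located, and filter out points close to the object surface.
Step 353: To select the vertices of the skeleton graph from the medial surface, the object may be filled with spheres. All points on the medial axis are sorted by their distance from the surface of the 3D model. Starting from the farthest point, the largest inscribed sphere inside the 3D model (not exceeding the surface of the 3D model) is drawn, which gives the sphere radius. Each point on the medial axis is then traversed, and an inscribed sphere is constructed for a point only if the point is not contained in any previously added sphere.
Step 354: The skeleton graph may be constructed by connecting sphere centers: the centers of any two intersecting spheres are connected to form an edge.
Steps 351 to 354 above may be referred to as bone identification.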
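Steps 353 and 354 can be sketched as a greedy sphere packing followed by an intersection test. The sketch assumes that the medial-axis sample points and their distances to the surface (steps 351 and 352) are already available as input; the function names and the input format are illustrative assumptions.

```python
import math


def pack_spheres(medial_points):
    """Greedy sphere packing.

    medial_points is a list of (center, distance_to_surface) pairs. Points are
    visited from farthest to nearest to the surface; a point gets a sphere only
    if it is not inside any sphere added before it.
    """
    spheres = []
    for center, radius in sorted(medial_points, key=lambda p: p[1], reverse=True):
        if all(math.dist(center, c) >= r for c, r in spheres):
            spheres.append((center, radius))
    return spheres


def skeleton_edges(spheres):
    """Connect the centers of every pair of intersecting spheres."""
    edges = []
    for a in range(len(spheres)):
        for b in range(a + 1, len(spheres)):
            (ca, ra), (cb, rb) = spheres[a], spheres[b]
            if math.dist(ca, cb) < ra + rb:
                edges.append((a, b))
    return edges
```

The sphere centers become the vertices V and the returned index pairs become the edges E of the skeleton graph G = (V, E) used in the later fitting steps.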
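Step 355: Extract the preset bone model, which needs to be embedded in an optimized way into the geometric skeleton graph G = (V, E) constructed in step 354 (V denotes the vertices and E denotes the edges). Usually the number of nodes needs to be reduced and the skeleton needs to be optimized.
Step 356: Identify the bone hierarchy, and reduce simple hierarchy levels to approximate the bone shape.
After bone fitting (steps 355 and 356), the 3D model of the object with the bone model assembled is obtained.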
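Step 24
Step 23 yields the 3D model with the bone model embedded. Next, feasible movement modes need to be found for the bone model so that the 3D model of the target object can move. This raises the question of how to obtain the movement mode of the bones (also vividly called the animation), that is, the target movement mode.
In a specific implementation, the movement mode of a first object may be obtained and used as the target movement mode. The first object may be an object currently moving in real time (for example, a person running is filmed and the movement mode of the person's bones is extracted by a neural network); it may be an object whose movement mode was captured and saved in the past (for example, a set of cute actions of a dog was filmed earlier, and the movement mode of those actions was stored locally or in the cloud by an algorithm); or it may be a preset movement mode of a specific object (for example, only human-related movement modes are selected).
In a specific implementation, one of the preset target movement modes may be selected (for example, actions of people, dogs, cats, horses, and so on are stored locally, and the user may select a specific category according to personal preference or how well it fits the object type).
In a specific implementation, the movement mode may be created by the user with animation software. The software may be a toolset built into the system, a toolset loaded in the app used for scanning, reconstruction, and movement, or a third-party animation design tool. The movement mode or animation may be created now or may have been created earlier.
In a specific implementation, the movement mode with the highest attribute match may be selected, according to physical attributes, from a plurality of prestored movement modes as the target movement mode. For example, animations of a fish swimming, a frog jumping, and a horse running are prestored locally; if the target object scanned by the user is a deer, the horse-running animation is used as the target movement mode of the deer (a deer is more similar to a horse than to a fish or a frog in attributes such as shape, species, and bone structure).
In a specific implementation, the movement mode may also be designed independently by the system or the user based on the bone model of the target object (which may be obtained by any of the methods in the previous step), to obtain the target movement mode. This approach gives the animation that best fits the subsequent animation of the object's 3D model.
In a specific implementation, the movement mode may be a preset skeletal animation, generally produced by a professional animation designer.
It should be understood that a skeletal animation describes how the transform of each node in the skeleton changes over time, and is usually stored and expressed as key frames. It usually has a notion of FPS (frames per second), that is, how many frames one second contains. A skeletal animation cannot exist apart from a skeleton, because otherwise it cannot drive the 3D model; it therefore usually depends on a specific skeleton. The skeleton is usually called a rig, and it describes which bones the skeleton has, how the bones are connected, the default transform (that is, pose) of each bone, and some additional information. A pose describes a static state of the transforms of all nodes in a skeleton, such as one frame of standing or running. Each skeleton stores a bind pose, which is the default pose when the skeleton was created. A pose generally does not store the hierarchy of the skeleton; instead it uses an array to store the transform of each node in order, and each node belongs to a specific bone, so a pose cannot be used apart from its skeleton. In addition, a pose is part of the result of sampling a skeletal animation. The skeleton, the pose, and the skeletal animation are thus interrelated and together implement the subsequent animation operations.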
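Sampling a key-framed skeletal animation into a pose can be sketched as follows. The sketch assumes a deliberately simplified representation in which each pose is an array with one rotation angle per joint and key frames are evenly spaced at the given FPS; the function name and this representation are assumptions, and production engines typically interpolate full transforms (for example, with quaternions) instead.

```python
def sample_pose(keyframes, fps, t):
    """Linearly interpolate a pose at time t (seconds).

    keyframes is a list of poses; each pose is a list with one angle per joint,
    and key frame k is assumed to sit at time k / fps. Times outside the clip
    are clamped to the first or last key frame.
    """
    last = len(keyframes) - 1
    pos = min(max(t * fps, 0.0), float(last))   # position measured in frames
    k = min(int(pos), last - 1) if last > 0 else 0
    alpha = pos - k                              # blend factor between frame k and k + 1
    nxt = keyframes[min(k + 1, last)]
    return [a + (b - a) * alpha for a, b in zip(keyframes[k], nxt)]
```

The returned array is a pose in the sense described above: it only makes sense together with the skeleton whose joints it indexes.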
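Step 25
A skeletal animation essentially records how the position, rotation, and scale of a series of objects stored in a tree structure change over time, where each object is a bone. The animation is implemented by mapping the skeletal animation transforms of a set of animations onto the 3D model whose bones were assembled in the previous "automatic bone assembly" step. This motion mapping may be implemented by, but is not limited to, a game engine or an animation engine. The 3D model changes pose according to the bone transforms, and played in sequence these pose changes form a series of animated actions. To the user, the scanned object appears to "come to life" and the static thing starts to "move". In the animation process, skinning is the basis for ensuring that the 3D model of the object follows the movement of the bone model.
The animation of the object's 3D model is the 3D model with bones assembled being mapped onto a set of changing actions of the bone model. In each frame, the surface of the 3D model (that is, the skin of the object's 3D model) must be deformed according to the changes of the bones. This process is called skinning. It maps the 3D model to the actions and thereby produces the animation effect.
In a specific implementation, a linear blend skinning (LBS) solution may be used. For any point on the surface of the 3D model, the position in the next state can be obtained from the position in the previous state using the following formula, where $v_i$ is the previous position, $v'_i$ is the next position, $w_{i,j}$ is the weight of the j-th bone at point i, and $T_j$ is the transformation matrix. Once the next positions of a sufficient number of points on the surface of the 3D model have been calculated, the pose of the 3D model in the next position can be determined, and the animation is thereby achieved. The core of skinning is finding the weight $w_{i,j}$ of each bone for each vertex.
$$v'_i = \sum_{j} w_{i,j} \, T_j \, v_i$$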
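Linear blend skinning computes each new surface position as the weighted sum of the positions produced by every bone's transform. The following minimal sketch works in 2D with 3x3 homogeneous matrices to stay short; the function names and the 2D simplification are assumptions made for illustration, and the weights are taken as given.

```python
def transform_point(T, v):
    """Apply a 3x3 homogeneous 2D transform T (nested lists) to point v = (x, y)."""
    x, y = v
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])


def skin_vertex(v, weights, bone_transforms):
    """New position of v as the sum over bones j of w_j * (T_j applied to v)."""
    x = y = 0.0
    for w, T in zip(weights, bone_transforms):
        tx, ty = transform_point(T, v)
        x += w * tx
        y += w * ty
    return (x, y)
```

A vertex weighted equally between a bone that stays still and a bone that moves 2 units ends up moving 1 unit, which is the smooth blending near joints that skinning is meant to provide.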
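In a specific implementation, the weights may be calculated in a manner similar to heat equilibrium. The 3D model is regarded as an insulated heat-conducting body, the temperature of a given bone is set to 1° and the temperature of all other bones is set to 0°. According to the heat equilibrium principle, the temperature of a surface vertex after equilibrium can be used as the weight of that bone at the vertex, with weight values in the range [0, 1]. Computing weights from heat equilibrium makes them smooth, so the resulting motion looks more realistic and natural.
It should be understood that the action of the object's 3D model is changed by changing the positions of the embedded bones (that is, through animation), and what the user finally sees is the skinned result. If specially configured, the user may also be shown an animation of only the bones without the object's 3D model.
A specific animation flow is shown in FIG. 11. Using a preset skeleton graph and multiple frames of actions, that is, according to a movement model or animation model, motion mapping is performed on the 3D model with the target bone model assembled: the target bone model is controlled to move according to the preset movement model, and the skinning data is calculated and updated in real time during the movement, so that the 3D model follows the target bone model smoothly and the animation of the 3D model is achieved. It should be understood that the bone model of the target object and the bone structure of the animation may not be exactly the same. Positions may be mapped between the object's bone model and the animation's bone structure; for example, key nodes must be consistent, while bone lengths may be set proportionally. The object's bone model and the animation's bone structure may also be adapted to each other, for example by proportional trimming and extension, so that at least the animation's bone structure does not extend beyond the outer surface of the object's 3D model. Further, physical calculations may be used to adjust the animation's bones so that they support the object's 3D model as fully as possible and the animation's bone model fits the object's 3D model more harmoniously.
Steps 21 to 25 may be completed in one go, or with time intervals between them. For example, after scanning the 3D model of an object, the user may store the 3D model locally or in the cloud and later invoke it directly, freely choosing the bone assembly or the animation mode. The user may also choose an animation background, including but not limited to an image captured in real time, another image already stored locally, or an image from cloud data. In addition, while the terminal displays the animation of the object, it may also display the object's shadow or add sound effects, special effects, and so on. The animation may be played automatically by the mobile terminal, or playback may be controlled by operation instructions entered by the user.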
应理解,上述实施例仅是本发明中一些可选的实施方式;并且由于相机参数设计、算法实现方式、用户设置、终端操作系统、终端所处环境、以及用户使用习惯的不同,前文所提及的器件参数、用户使用方法、实施例中涉及的算法均存在多种变形。无法通过穷举而一一列出,本领域技术人员应当理解,基于上述理论进行适应性地调整,包括一些常规方式的替代而产生的技术方案都应属于本发明保护范围内。
通过本发明,可以在移动终端上就能一体实现目标物体从扫描、3D重建、到骨骼装配,再到预设的动画展示等一系列的操作,对于用户来说进行3D扫描能够轻松玩转;并且随着移动终端的拍摄技术的广泛应用,呈现二维图像可向呈现3D动画过渡,允许用户将现实扫描建模的物体最终实现虚拟的动画动作;大大提升了用户使用移动终端的趣味性和粘性,引领拍摄应用进入一个新的潮流。
基于上述实施例提供的物体建模运动方法,本发明实施例提供一种物体建模运动装置700,所述装置700可以应用于各类拍照设备,如图12所示,该装置700包括扫描模块701、第一获取模块702、融合模块703、第二获取模块704以及运动模块705;该装置应用于移动终端,移动终端包括彩色相机和深度相机,彩色相机和深度相机位于移动终端的同一侧;相关特性可以参照前述方法实施例中的描述。
扫描模块701,用于当彩色相机和深度相机对目标物体进行全景扫描时,得到目标物体的3D模型。该扫描模块701可以由处理器调用存储器中的程序指令对上述彩色相机和深度相机使能控制,进一步地,扫描时的采集的图片还可以选择性地存储在存储器中。
第一获取模块702,用于获取目标骨骼模型。该第一获取模块702可以由处理器调用相应的程序指令实现,进一步地,可以通过调用本地存储器或云端服务器中的数据以及算法,进行相应计算实现。
融合模块703,用于将目标骨骼模型与目标物体的3D模型融合;该融合模块703可以由处理器调用相应的程序指令实现,进一步地,可以通过调用本地存储器或云端服务器中的数据以及算法,进行相应计算实现。
第二获取模块704,用于获取目标运动方式。该第二获取模块704可以由处理器调用相应的程序指令实现,进一步地,可以通过调用本地存储器或云端服务器中的数据以及算法,进行相应计算实现。
运动模块705,用于根据目标运动方式控制骨骼模型,使目标物体的3D模型根据目标运动方式进行运动。该运动模块705可以由处理器调用相应的程序指令实现,进一步地,也可以通过调用本地存储器或云端服务器中的数据以及算法实现。
在具体实现过程中,扫描模块701具体用于执行步骤21中所提到的方法以及可以等同替换的方法;第一获取模块702具体用于执行步骤22中所提到的方法以及可以等同替换的方法;融合模块703具体用于执行步骤23中所提到的方法以及可以等同替换的方法;第二获取模块704具体用于执行步骤24中所提到的方法以及可以等同替换的方法;运动模块705具体用于执行步骤25中所提到的方法以及可以等同替换的方法。
更为具体地;
扫描模块701可以执行上述步骤331-337,步骤341-343的方法;融合模块703可以执行上述步骤351-356的方法。
其中,上述具体的方法实施例以及实施例中技术特征的解释、表述、以及多种实现形式的扩展也适用于装置中的方法执行,装置实施例中不予以赘述。
本发明实施例提供的装置700,可以实现了物体从扫描、3D重建、到骨骼装配,再到预设的动画展示的一体化设计。无需用户采用专业的、笨重的、复杂的设备进行专业扫描,也无需再跑到PC端做复杂的建模处理和动画处理,将这些功能集成在一起,提供给用户,使得用户在一个移动终端上能够轻松玩转这一系列的操作方法,使得用户身边任何一个“静态的物体(或近似静态的物体)”能够更鲜活,更具生命力。增加用户使用终端的趣味性,提升用户的体验度。
应理解以上装置700中的各个模块的划分仅仅是一种逻辑功能的划分,实际实现时可以全部或部分集成到一个物理实体上,也可以物理上分开。例如,以上各个模块可以为单独设立的处理元件,也可以集成在终端的某一个芯片中实现,此外,也可以以程序代码的形式存储于控制器的存储元件中,由处理器的某一个处理元件调用并执行以上各个模块的功能。此外各个模块可以集成在一起,也可以独立实现。这里所述的处理元件可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤或以上各个模块可以通过处理器元件中的硬件的集成逻辑电路或者软件形式的指令完成。该处理元件可以是通用处理器,例如中央处理器(英文:central processing unit,简称:CPU),还可以是被配置成实施以上方法的一个或多个集成电路,例如:一个或多个特定集成电路(英文:application-specific integrated circuit,简称:ASIC),或,一个或多个微处理器(英文:digital signal processor,简称:DSP),或,一个或者多个现场可编程门阵列(英文:field-programmable gate array,简称:FPGA)等。
本领域内的技术人员应明白,本发明的实施例可提供为方法、系统、或计算机程序产品。因此,本发明可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实 施例的形式。而且,本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。
本发明是参照根据本发明实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
尽管已描述了本发明的部分实施例,但本领域内的技术人员一旦得知了基本创造性概念,则可对这些实施例作出另外的变更和修改。所以,所附权利要求意欲解释为包括已列举实施例以及落入本发明范围的所有变更和修改。显然,本领域的技术人员可以对本发明实施例进行各种改动和变型而不脱离本发明实施例的精神和范围。倘若本发明实施例的这些修改和变型属于本发明权利要求及其等同技术的范围之内,则本发明也包含这些改动和变型在内。

Claims (20)

  1. 一种物体建模运动方法,其特征在于,所述方法应用于移动终端,所述移动终端包括彩色摄像头和深度传感器模组;所述方法包括:
    利用所述彩色摄像头和所述深度传感器模组对目标物体进行全景扫描,得到所述目标物体的3D模型;
    获取目标运动方式;
    使所述目标物体的3D模型根据所述目标运动方式进行运动。
  2. 如权利要求1所述方法,其特征在于,所述深度传感器模组包括:TOF模组,或者结构光模组;其中,所述彩色摄像头和所述深度传感器模组位于所述移动终端的同一侧。
  3. 如权利要求1或2所述方法,其特征在于,在所述得到所述目标物体的3D模型之后,所述方法还包括:
    获取目标骨骼模型;
    将所述目标骨骼模型与所述目标物体的3D模型融合;
    在所述使所述目标物体的3D模型根据所述目标运动方式进行运动之前,所述方法还包括:
    根据所述目标运动方式控制所述骨骼模型。
  4. 如权利要求1-3任意一项所述方法,其特征在于,所述获取目标骨骼模型包括:
    接收用户的操作指令,所述操作指令用于将至少两个线段和至少一个点组合成骨骼模型;其中,线段表示所述骨骼模型中的骨骼,点表示所述骨骼模型中的关节节点。
  5. 如权利要求1-3任意一项所述方法,其特征在于,所述获取目标骨骼模型包括:
    从至少一个预设骨骼模型中选择与所述目标物体的外形匹配度最高的骨骼模型作为目标骨骼模型。
  6. 如权利要求1-5任一项所述方法,其特征在于,所述获取目标运动方式包括:
    获取第一物体的运动方式,将所述第一物体的运动方式作为目标运动方式。
  7. 如权利要求1-5任一项所述方法,其特征在于,所述获取目标运动方式包括:
    呈现至少两个运动方式给用户,接收用户的选择指令,在所述至少两个运动方式中确定出目标运动方式;或,
    根据目标物体的属性在多个预先存储的运动方式中选择出属性匹配度最高的运动方式作为目标运动方式。
  8. 如权利要求1-5任一项所述方法,其特征在于,所述获取目标运动方式包括:
    接收用户针对所述骨骼模型制作的动画,将所述动画确定为目标运动方式。
  9. 一种物体建模运动装置,其特征在于,所述装置应用于移动终端,所述移动终端包括彩色摄像头和深度传感器模组;所述装置包括:
    扫描模块,用于当所述彩色摄像头和所述深度传感器对目标物体进行全景扫描时,得到所述目标物体的3D模型;
    第二获取模块,用于获取目标运动方式;
    运动模块,用于使所述目标物体的3D模型根据所述目标运动方式进行运动。
  10. 如权利要求9所述装置,其特征在于,所述深度传感器模组包括:TOF模组,或者结构光模组;其中,所述彩色摄像头和所述深度传感器模组位于所述移动终端的同一侧
  11. 如权利要求9或10所述装置,其特征在于,所述装置还包括:
    第一获取模块,用于获取目标骨骼模型;以及,
    融合模块;用于将所述目标骨骼模型与所述目标物体的3D模型融合;
    其中,所述运动模块还具体用于根据所述目标运动方式控制所述骨骼模型。
  12. 如权利要求9-11任一项所述装置,其特征在于,所述第一获取模块具体用于:
    接收用户的操作指令,所述操作指令用于将至少两个线段和至少一个点组合成骨骼模型;其中,线段表示所述骨骼模型中的骨骼,点表示所述骨骼模型中的关节节点;或,
    从至少一个预设骨骼模型中选择与所述目标物体的外形匹配度最高的骨骼模型作为目标骨骼模型;或,
    获取第一物体的运动方式,将所述第一物体的运动方式作为目标运动方式。
  13. 如权利要求9-12任一项所述装置,其特征在于,所述第二获取模块具体用于:
    呈现至少两个运动方式给用户;接收用户的选择指令,在所述至少两个运动方式中确定出目标运动方式;或,
    接收用户针对所述骨骼模型制作的动画,将所述动画确定为目标运动方式;或,
    根据所述物理属性在多个预先存储的运动方式中选择出属性匹配度最高的运动方式作为目标运动方式。
  14. 一种终端设备,其特征在于,所述终端设备包含存储器、处理器、总线、深度传感器模组和彩色摄像头;所述彩色摄像头和所述深度传感器模组位于所述移动终端的同一侧;所述存储器、所述深度传感器模组、彩色摄像头以及所述处理器通过所述总线相连;所述深度传感器模组和所述彩色摄像头用于在所述处理器的控制下对目标物体进行全景扫描;所述存储器用于存储计算机程序和指令;所述处理器用于调用所述存储器中存储的所述计算机程序和指令,使所述终端设备执行如权利要求1~9任一项所述方法。
  15. 如权利要求14所述的终端设备,所述终端设备还包括天线系统、所述天线系统在处理器的控制下,收发无线通信信号实现与移动通信网络的无线通信;所述移动通信网络包括以下的一种或多种:GSM网络、CDMA网络、3G网络、4G网络、5G网络、FDMA、TDMA、PDC、TACS、AMPS、WCDMA、TDSCDMA、WIFI以及LTE网络。
  16. 一种物体建模运动方法,其特征在于,所述方法应用于移动终端,所述移动终端包括彩色摄像头和深度传感器模组;且所述彩色摄像头和所述深度传感器模组位于所述移动终端的同一侧;所述深度传感器模组包括TOF模组或者结构光模组;所述方法包括:
    利用所述彩色摄像头和所述深度传感器模组对目标物体进行全景扫描,得到所述目标物体的3D模型;
    获取目标骨骼模型;
    将所述目标骨骼模型与所述目标物体的3D模型融合;
    获取目标运动方式;
    根据所述目标运动方式控制所述骨骼模型,使所述目标物体的3D模型根据所述目标运动方式进行运动。
  17. 如权利要求16所述方法,其特征在于,所述获取目标骨骼模型包括:
    接收用户的操作指令,所述操作指令用于将至少两个线段和至少一个点组合成骨骼模型;其中,线段表示所述骨骼模型中的骨骼,点表示所述骨骼模型中的关节节点;或,
    从至少一个预设骨骼模型中选择与所述目标物体的外形匹配度最高的骨骼模型作为目标骨骼模型。
  18. 如权利要求16或17所述方法,其特征在于,所述获取目标运动方式包括:
    获取第一物体的运动方式,将所述第一物体的运动方式作为目标运动方式;或,
    呈现至少两个运动方式给用户,接收用户的选择指令,在所述至少两个运动方式中确定出目标运动方式;或,
    接收用户针对所述骨骼模型制作的动画,将所述动画确定为目标运动方式;或,
    根据目标物体的属性在多个预先存储的运动方式中选择出属性匹配度最高的运动方式作为目标运动方式。
  19. 一种物体建模运动装置,其特征在于,所述装置应用于移动终端,所述移动终端包括彩色摄像头和深度传感器模组,所述彩色摄像头和所述深度传感器模组位于所述移动终端的同一侧;所述深度传感器模组包括TOF模组或结构光模组;所述装置包括:
    扫描模块,用于当所述彩色摄像头和所述深度传感器对目标物体进行全景扫描时,得到所述目标物体的3D模型;
    第一获取模块,用于获取目标骨骼模型;
    融合模块;用于将所述目标骨骼模型与所述目标物体的3D模型融合;
    第二获取模块,用于获取目标运动方式;
    运动模块,用于根据所述目标运动方式控制所述骨骼模型,使所述目标物体的3D模型根据所述目标运动方式进行运动。
  20. 如权利要求19所述装置,其特征在于,
    所述第一获取模块具体用于:
    接收用户的操作指令,所述操作指令用于将至少两个线段和至少一个点组合成骨骼模型;其中,线段表示所述骨骼模型中的骨骼,点表示所述骨骼模型中的关节节点;或,
    从至少一个预设骨骼模型中选择与所述目标物体的外形匹配度最高的骨骼模型作为目标骨骼模型;或,
    获取第一物体的运动方式,将所述第一物体的运动方式作为目标运动方式。
    所述第二获取模块具体用于:
    呈现至少两个运动方式给用户,接收用户的选择指令,在所述至少两个运动方式中确定出目标运动方式;或,
    接收用户针对所述骨骼模型制作的动画,将所述动画确定为目标运动方式;或,
    根据所述物理属性在多个预先存储的运动方式中选择出属性匹配度最高的运动方式作为目标运动方式。
PCT/CN2019/088480 2018-06-21 2019-05-27 一种物体建模运动方法、装置与设备 WO2019242454A1 (zh)

Priority Applications (9)

Application Number Priority Date Filing Date Title
KR1020217001341A KR102524422B1 (ko) 2018-06-21 2019-05-27 Object modeling and movement method and apparatus, and device
BR112020025903-9A BR112020025903A2 (pt) 2018-06-21 2019-05-27 Object modeling and movement method and apparatus, and device
SG11202012802RA SG11202012802RA (en) 2018-06-21 2019-05-27 Object modeling and movement method and apparatus, and device
CA3104558A CA3104558A1 (en) 2018-06-21 2019-05-27 Object modeling and movement method and apparatus, and device
AU2019291441A AU2019291441B2 (en) 2018-06-21 2019-05-27 Object modeling and movement method and apparatus, and device
JP2020570722A JP7176012B2 (ja) 2018-06-21 2019-05-27 Object modeling and movement method and apparatus, and device
EP19821647.5A EP3726476A4 (en) 2018-06-21 2019-05-27 OBJECT MODELING PROCESS, APPARATUS AND DEVICE
US16/931,024 US11436802B2 (en) 2018-06-21 2020-07-16 Object modeling and movement method and apparatus, and device
US17/879,164 US20220383579A1 (en) 2018-06-21 2022-08-02 Object modeling and movement method and apparatus, and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810646701.0 2018-06-21
CN201810646701.0A CN110634177A (zh) 2018-06-21 2018-06-21 Object modeling and movement method and apparatus, and device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/931,024 Continuation US11436802B2 (en) 2018-06-21 2020-07-16 Object modeling and movement method and apparatus, and device

Publications (1)

Publication Number Publication Date
WO2019242454A1 (zh) 2019-12-26

Family

ID=68967803

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088480 WO2019242454A1 (zh) 2018-06-21 2019-05-27 Object modeling and movement method and apparatus, and device

Country Status (10)

Country Link
US (2) US11436802B2 (zh)
EP (1) EP3726476A4 (zh)
JP (1) JP7176012B2 (zh)
KR (1) KR102524422B1 (zh)
CN (3) CN111640176A (zh)
AU (1) AU2019291441B2 (zh)
BR (1) BR112020025903A2 (zh)
CA (1) CA3104558A1 (zh)
SG (1) SG11202012802RA (zh)
WO (1) WO2019242454A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102448565A (zh) * 2009-05-29 2012-05-09 微软公司 Real-time retargeting of skeletal data to a game avatar
CN102915112A (zh) * 2011-06-23 2013-02-06 奥美可互动有限责任公司 System and method for close-range movement tracking
US8542252B2 (en) * 2009-05-29 2013-09-24 Microsoft Corporation Target digitization, extraction, and tracking
CN103597516A (zh) * 2011-06-06 2014-02-19 微软公司 Controlling objects in a virtual environment
CN103703489A (zh) * 2011-06-06 2014-04-02 微软公司 Object digitization

Family Cites Families (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4229316B2 (ja) * 2003-05-09 2009-02-25 株式会社バンダイナムコゲームス Image generation system, program, and information storage medium
JP5845830B2 (ja) * 2011-11-09 2016-01-20 ソニー株式会社 Information processing apparatus, display control method, and program
JP6018707B2 (ja) * 2012-06-21 2016-11-02 マイクロソフト コーポレーション Building an avatar using a depth camera
CN102800126A (zh) * 2012-07-04 2012-11-28 浙江大学 Method for real-time recovery of three-dimensional human pose based on multimodal fusion
WO2014042121A1 (ja) * 2012-09-12 2014-03-20 独立行政法人産業技術総合研究所 Movement evaluation device and program therefor
CN203240221U (zh) * 2013-01-17 2013-10-16 佛山科学技术学院 Electric turntable
KR102058857B1 (ko) * 2013-04-08 2019-12-26 삼성전자주식회사 Photographing apparatus and photographing control method
RU2668408C2 (ru) * 2013-08-04 2018-09-28 Айсмэтч Лтд Devices, systems and methods of virtualizing a mirror
CN104021584B (zh) * 2014-06-25 2017-06-06 无锡梵天信息技术股份有限公司 Method for implementing skeletal skinning animation
US9626803B2 (en) * 2014-12-12 2017-04-18 Qualcomm Incorporated Method and apparatus for image processing in augmented reality systems
CN105137973B (zh) * 2015-08-21 2017-12-01 华南理工大学 Method for a robot to intelligently avoid humans in human-robot collaboration scenarios
US20170054897A1 (en) * 2015-08-21 2017-02-23 Samsung Electronics Co., Ltd. Method of automatically focusing on region of interest by an electronic device
CN105225269B (zh) * 2015-09-22 2018-08-17 浙江大学 Three-dimensional object modeling system based on a motion mechanism
JP2017080203A (ja) * 2015-10-29 2017-05-18 キヤノンマーケティングジャパン株式会社 Information processing apparatus, information processing method, and program
CN105590096B (zh) * 2015-12-18 2019-05-28 运城学院 Feature representation method for human activity recognition based on depth mapping
JP6733267B2 (ja) * 2016-03-31 2020-07-29 富士通株式会社 Information processing program, information processing method, and information processing apparatus
KR101819730B1 (ko) * 2016-04-19 2018-01-17 광주과학기술원 Method for three-dimensional object detection and pose estimation
CN107577334A (zh) * 2016-07-04 2018-01-12 中兴通讯股份有限公司 Motion-sensing operation method and apparatus for a mobile terminal
CN106251389B (zh) * 2016-08-01 2019-12-24 北京小小牛创意科技有限公司 Method and apparatus for producing animation
ZA201701187B (en) * 2016-08-10 2019-07-31 Tata Consultancy Services Ltd Systems and methods for identifying body joint locations based on sensor data analysis
US20180225858A1 (en) * 2017-02-03 2018-08-09 Sony Corporation Apparatus and method to generate realistic rigged three dimensional (3d) model animation for view-point transform
CN107248195A (zh) * 2017-05-31 2017-10-13 珠海金山网络游戏科技有限公司 Augmented reality live-streaming host method, apparatus, and system
CN107274465A (zh) * 2017-05-31 2017-10-20 珠海金山网络游戏科技有限公司 Virtual reality live-streaming host method, apparatus, and system
CN108053435A (zh) * 2017-11-29 2018-05-18 深圳奥比中光科技有限公司 Dynamic real-time three-dimensional reconstruction method and system based on a handheld mobile device
CN108154551B (zh) * 2017-11-29 2021-04-30 奥比中光科技集团股份有限公司 Method and system for real-time dynamic reconstruction of a three-dimensional human body model
US9959671B1 (en) * 2018-01-18 2018-05-01 Scandy, LLC System and method for capturing, processing and rendering data through a template-driven processing pipeline

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3726476A4 *

Also Published As

Publication number Publication date
JP2021527895A (ja) 2021-10-14
US11436802B2 (en) 2022-09-06
CN110634177A (zh) 2019-12-31
CA3104558A1 (en) 2019-12-26
KR20210019552A (ko) 2021-02-22
AU2019291441B2 (en) 2023-07-06
SG11202012802RA (en) 2021-01-28
EP3726476A4 (en) 2021-04-07
US20220383579A1 (en) 2022-12-01
AU2019291441A1 (en) 2021-01-21
JP7176012B2 (ja) 2022-11-21
CN111640176A (zh) 2020-09-08
KR102524422B1 (ko) 2023-04-20
EP3726476A1 (en) 2020-10-21
BR112020025903A2 (pt) 2021-03-16
US20200349765A1 (en) 2020-11-05
CN111640175A (zh) 2020-09-08

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19821647; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase. Ref document number: 2019821647; Country of ref document: EP; Effective date: 20200713
ENP Entry into the national phase. Ref document number: 2020570722; Country of ref document: JP; Kind code of ref document: A
ENP Entry into the national phase. Ref document number: 3104558; Country of ref document: CA
NENP Non-entry into the national phase. Ref country code: DE
REG Reference to national code. Ref country code: BR; Ref legal event code: B01A; Ref document number: 112020025903; Country of ref document: BR
ENP Entry into the national phase. Ref document number: 20217001341; Country of ref document: KR; Kind code of ref document: A
ENP Entry into the national phase. Ref document number: 2019291441; Country of ref document: AU; Date of ref document: 20190527; Kind code of ref document: A
ENP Entry into the national phase. Ref document number: 112020025903; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20201217
Effective date: 20201217