
CN110168608A - System for obtaining a 3-dimensional digital representation of a physical object

System for obtaining a 3-dimensional digital representation of a physical object

Info

Publication number
CN110168608A
Authority
CN
China
Prior art keywords
surface normal
normal
hole
intermediate representation
object surface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201780082775.XA
Other languages
Chinese (zh)
Other versions
CN110168608B (en)
Inventor
R. Toldo
S. Fantoni
J. Bollum
F. Galatoni
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lego A/S
Original Assignee
Lego A/S
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lego A/S
Publication of CN110168608A
Application granted
Publication of CN110168608B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/521 Depth or shape recovery from laser ranging, e.g. using interferometry; from the projection of structured light
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/593 Depth or shape recovery from multiple images from stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20 Finite element generation, e.g. wire-frame surface description, tesselation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20004 Adaptive image processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G06T2207/20028 Bilateral filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20036 Morphological image processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20076 Probabilistic image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Optics & Photonics (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A method of creating a digital 3D representation of a physical object, the physical object comprising an object surface. The method comprises: obtaining input data, the input data comprising a plurality of captured images of the physical object and surface normal information of the object, the captured images being captured by an image capture device, and the surface normal information being indicative of object surface normals associated with respective parts of the object surface; and creating a digital 3D representation of the object surface; wherein creating the digital 3D representation is at least based on the obtained captured images and the obtained surface normal information.

Description

System for obtaining a 3-dimensional digital representation of a physical object
Technical field
The present invention relates to methods and apparatus for obtaining a digital 3D representation of a physical object. In particular, the invention relates to a toy-enhanced game system comprising such a method and apparatus, e.g. a system comprising toy construction elements with coupling members for detachably interconnecting the toy construction elements.
Background
Toy construction sets have been known for decades. Over the years, the simple box-shaped building blocks have been supplemented with other construction elements having a specific appearance or a mechanical or electrical function, so as to enhance the play value. Such functions include e.g. motors, switches and lamps, as well as programmable processors that accept input from sensors and can activate functional elements in response to received sensor inputs.
Various attempts have been made to control a virtual game by means of physical toys. Many such systems require the toy to be communicatively coupled to a computer by a wired or wireless connection. These prior art systems thus require a communication interface between the toy and the computer system. Moreover, such prior art toys are relatively complicated, as they include electronic components and even memory and a communication interface. Furthermore, the degrees of freedom when constructing a toy from such elements may be limited.
Other systems use vision technology in the context of toy-enhanced games. For example, US 2011/298922 discloses a system for extraction of an image of a physical object. The extracted image can be digitally represented on a display device as part of a virtual world or video game, wherein the objects of the virtual world and/or video game are designed and constructed from construction sets in the real world. However, in many video games or other virtual environments it is desirable to provide three-dimensional virtual objects that accurately resemble a physical object.
The process of creating a three-dimensional (3D) model from a set of multiple images is commonly referred to as 3D reconstruction from multiple images. In the following, a 3D model of a physical object is also referred to as a digital representation of the 3D shape of the physical object.
According to at least one aspect, it is thus desirable to provide a process for creating a digital representation of the three-dimensional (3D) shape of a physical object, e.g. of a physical toy construction model, in a user-friendly manner. In particular, it is desirable to provide a method that is easy to use and that provides a digital representation accurately representing the 3D shape of the physical object. It is generally desirable that such a method be robust with respect to factors such as ambient lighting conditions, mechanical inaccuracies of the devices used and/or other factors.
It is generally desirable to provide a toy system, such as a toy-enhanced game system, with educational and/or play value. It is also desirable to provide a toy construction set wherein a set of construction elements can easily be used in different toy construction models and/or in combination with existing toy construction elements. It is further desirable to provide a toy construction set that allows users, in particular children, to construct multiple toy models in a user-friendly, efficient, flexible and reliable manner. In particular, it is desirable to provide a toy construction set that allows a user-friendly and flexible way of creating virtual objects in a virtual environment, such as a game system.
Summary of the invention
According to a first aspect, disclosed herein is a method of creating a digital representation of at least an object surface of a physical object. Generally, the object surface is a surface in 3D space, comprising surface portions having respective orientations in 3D space.
The method comprises:
obtaining input data, the input data comprising a plurality of captured images of the physical object and surface normal information of the object, the surface normal information being indicative of object surface normals associated with respective parts of the object surface;
creating a digital representation of at least the object surface;
wherein creating the digital representation of the object surface is at least based on the obtained captured images and the obtained surface normal information, and comprises:
obtaining an intermediate representation of the object surface, the intermediate representation comprising a first part representing a first part of the object surface;
modifying the first part of the intermediate representation to obtain a modified representation;
wherein modifying the first part of the intermediate representation comprises:
determining a second part of the object surface in a proximity of the first part of the object surface;
determining, from the obtained surface normal information, one or more object surface normals associated with the determined second part;
modifying the first part of the intermediate representation based at least in part on the determined one or more object surface normals.
Embodiments of this process will also be referred to as a 3D object reconstruction process or a 3D reconstruction pipeline.
Accordingly, embodiments of the method described herein use not only the information from the multiple captured images but also information about the object surface normals of the object, thereby improving the quality of the reconstructed digital representation of the 3D shape of the physical object. In particular, the object surface normals represent surface normals of the physical object rather than virtual surface normals derived from a created digital 3D representation of the object surface, i.e. normals representing the orientation of a virtual surface of a created digital 3D representation. The obtained object surface normals may also be referred to as "external" surface normals, as they are derived from a source different from the created digital 3D surface representation, e.g. different from a mesh representation of a virtual 3D surface. The external object surface normals may e.g. be obtained as a normal map representing the surface normals of the surface as seen from a given viewpoint, such as a camera viewpoint. A normal map may thus be a 2D array of normal vectors.
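Concretely, such a normal map can be held as an H x W x 3 array of unit vectors. The following is a minimal Python sketch of decoding such a map from the common 8-bit RGB encoding; the encoding convention n = 2*rgb/255 - 1 is an assumption of the sketch, not something specified by the patent, and a given capture device may use a different convention.

```python
import numpy as np

def decode_normal_map(rgb):
    """Decode an 8-bit RGB normal map into unit normal vectors.

    Assumes the common encoding n = 2*rgb/255 - 1 (an assumption
    of this sketch; devices may differ).
    """
    n = rgb.astype(np.float32) / 255.0 * 2.0 - 1.0
    # re-normalize to guard against quantization error
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.clip(norm, 1e-8, None)
```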
When the process selectively uses information about object surface normals in a proximity of a part of the surface in order to modify the digital representation of that part, a particularly high-quality reconstruction can be achieved, e.g. for objects having many flat, smooth surfaces and sharp edges.
The modification of the first part of the intermediate representation is selectively based on object surface normals associated with the second part of the object surface, i.e. based only on surface normals associated with the second part. The modification of the intermediate representation is thus based only on local normal information rather than on global normal information associated with all parts of the object surface. It will be appreciated, however, that the process may comprise additional steps that do depend on global normal information. The modification is based at least in part on the surface normal information, i.e. the modification may also be based on information other than the surface normal information.
The proximity may be based on a suitable distance measure, e.g. a distance measure applied to the intermediate representation; examples of distance measures include the Euclidean distance. The proximity may also be defined as a neighborhood of vertices and/or surface elements of a mesh representation, e.g. a 1-ring or k-ring of vertices, where k is a positive integer, e.g. k=1. The first part of the intermediate representation may represent a point or a region of the object surface and/or of the virtual surface defined by the intermediate representation. The second part may include some or all of the first part; alternatively, the second part may be disjoint from the first part. For example, the first part may be a point or a region, and the second part may surround the first part, e.g. surround the periphery of the region defining the first part. In some embodiments, the first part may be a first vertex, and the second part may be the part of the object surface represented by the surface elements defined by the vertices surrounding the first vertex, e.g. by the 1-ring (or a higher-order ring) of vertices around the first vertex.
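As an illustration of the k-ring neighborhood for k=1, the following sketch collects, for each vertex of a triangle mesh, the set of vertices sharing an edge with it; the function name and the face layout (index triples) are assumptions made for illustration.

```python
from collections import defaultdict

def one_ring_neighbors(faces):
    """Map each vertex index to its 1-ring: the set of vertices
    sharing an edge with it, given triangles as index triples."""
    ring = defaultdict(set)
    for a, b, c in faces:
        ring[a].update((b, c))
        ring[b].update((a, c))
        ring[c].update((a, b))
    return ring

# e.g. one_ring_neighbors([(0, 1, 2), (0, 2, 3)])[0] == {1, 2, 3}
```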
The digital representation may be any suitable representation of at least the surface of the object, in particular of the shape of the object surface in 3D space, suitable for providing a digital 3D model of the physical object. In some embodiments, the digital representation comprises a mesh of surface elements, e.g. flat surface elements, that together define a virtual surface in a virtual 3D space. The surface elements may e.g. be triangles or other types of polygons, each defined by a set of vertices. Other examples of digital representations include voxel representations.
A captured image typically represents a view of a scene from a given viewpoint; an image may thus be regarded as a projection of a 3D scene onto a 2D image plane. The plurality of captured images may comprise images captured from respective viewpoints relative to the physical object. Preferably, the plurality of images comprises more than two images. The images may represent light intensity and/or color information, e.g. respective intensities of different colors/wavelengths at each image position. The images may be captured by an image capture device, e.g. an image capture device comprising one or more digital cameras and/or one or more depth cameras, e.g. as described below. In some embodiments, the image capture device provides additional information, such as depth information, polarization information or other types of information. In some embodiments, this information may be provided by the image capture device as a separate data structure or signal in addition to the images; alternatively, such additional information may be provided as part of a single data structure comprising both the captured images and the additional information.
Each object surface normal may be indicative of the orientation of the object surface at a position, in particular at a point, of the object surface associated with that object surface normal; i.e. a surface normal may be represented as a vector pointing outwards from the surface point in a direction perpendicular to the tangential plane at that point. The surface normal information may comprise a plurality of normal maps, each normal map defining a 2D array wherein each element of the array represents a surface normal. In some embodiments, some or all of the captured images have a corresponding normal map associated with them. In particular, a normal map associated with a captured image may represent surface normals associated with respective pixels or groups of pixels of the captured image. The creation of the normal maps may be performed by the image capture device or by a processing device, e.g. as a pre-processing step prior to the reconstruction pipeline. To this end, various methods for extracting normal maps may be used to obtain the surface normal information. As a non-limiting example, the normal maps may be generated by a photometric stereo algorithm such as those of (Woodham, 1979) and (Barsky & Petrou, 2003).
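For illustration, the classic Lambertian photometric stereo of (Woodham, 1979) recovers a normal map from images taken under known distant lights by solving, per pixel, the linear system I = L (albedo * n) in least squares. A minimal sketch, assuming calibrated grayscale images and known unit light directions (the data layout is an assumption of the sketch):

```python
import numpy as np

def photometric_stereo(images, light_dirs):
    """Per-pixel normals from k images under known distant lights
    (Lambertian model): I = L @ (albedo * n), solved per pixel.

    images:     (k, h, w) grayscale intensities
    light_dirs: (k, 3) unit light directions
    """
    k, h, w = images.shape
    I = images.reshape(k, -1)                            # (k, h*w)
    G, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)   # (3, h*w)
    albedo = np.linalg.norm(G, axis=0)
    n = G / np.clip(albedo, 1e-8, None)
    return n.T.reshape(h, w, 3), albedo.reshape(h, w)
```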
The surface normal information may be used at different stages of the reconstruction process, i.e. it may be used to modify different types of intermediate representations.
Moreover, modifying the intermediate representation to obtain a modified representation may be an iterative process, wherein the intermediate representation may be the modified representation obtained by a previous iteration of the iterative process and/or wherein the modified representation serves as the input intermediate representation of a subsequent iteration of the iterative process.
A process for object reconstruction typically comprises multiple sub-processes, in particular a pipeline of sub-processes, wherein a subsequent sub-process uses the results/outputs of earlier sub-processes of the pipeline. One or more processes of the pipeline may thus create one or more intermediate representations that are used as input by one or more subsequent sub-processes of the pipeline; some processes may even create multiple intermediate representations. The term "intermediate representation" is thus intended to refer to an output of a sub-process of the overall process, which output is used by one or more subsequent sub-processes of the overall process for creating the final representation of the object surface, i.e. the output of the overall process. Depending on the stage along the pipeline, the intermediate representation may take various forms. For example, the intermediate representation may be a depth map. Other examples of intermediate representations include a preliminary surface mesh that is to be refined in a subsequent sub-process of the reconstruction pipeline. Similarly, the modified representation may be a modified depth map, a modified surface mesh, etc.
The intermediate representation may thus be the output of a previous step of the reconstruction pipeline, or it may be a modified representation resulting from a previous iteration of an iterative process. The modified representation may be the final representation created by the process, or it may be another intermediate representation that is further processed to obtain the final representation. The modified representation may thus serve as input to a subsequent iteration of an iterative process, or as input to a subsequent step of the reconstruction pipeline.
In some embodiments, the intermediate representation comprises a depth map indicative of distances from a reference position to respective positions on the object surface. Such a depth map may e.g. be created at an initial stage of the reconstruction process, e.g. by a structure-from-motion process, by multi-view stereo processing or the like. In other embodiments, the depth map may at least in part be obtained based on depth information received from a depth camera or a similar device. In any event, depth maps often include holes, i.e. areas with no or almost no reliable depth information. This may e.g. be the case when an object has many flat, smooth surfaces without many features useful for multi-view stereo methods.
The first part of the object surface may thus comprise a hole, or a part of a hole, in the depth map, and modifying the intermediate representation may comprise filling the hole using the surface normal information, thereby improving the quality of the depth map. This, in turn, may facilitate subsequent stages of the reconstruction process and improve the quality of the final digital representation produced by the reconstruction process.
In some embodiments, determining the second part in a proximity of the first part comprises identifying the hole as a hole to be filled and determining the periphery of the hole; i.e. the second part may be determined as the periphery of the hole, or as a part of the periphery, or at least as including the periphery or a part thereof.
In some embodiments, creating the intermediate representation comprises creating the depth map from the plurality of images, e.g. by performing a structure-from-motion process and/or a multi-view stereo correspondence analysis. Alternatively, the depth map may be obtained from a depth camera.
In some embodiments, identifying the hole as a hole to be filled comprises:
identifying a hole in the depth map;
determining, based on the obtained surface normal information, whether the identified hole is a hole to be filled.
A hole in the depth map may be determined as an area of the depth map with missing data or with sparse and/or corrupt data.
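A minimal sketch of such a detection step, treating non-finite or non-positive depths inside the object mask as hole pixels and labeling their connected components; the validity criterion used here is an assumption of the sketch:

```python
import numpy as np
from scipy import ndimage

def find_depth_holes(depth, object_mask):
    """Label connected regions inside the object mask where the depth
    map carries no valid data (here: NaN, inf or non-positive)."""
    with np.errstate(invalid="ignore"):
        invalid = object_mask & (~np.isfinite(depth) | (depth <= 0))
    labels, num_holes = ndimage.label(invalid)
    return labels, num_holes
```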
In some embodiments, determining whether the identified hole is a hole to be filled comprises: determining a first set of object surface normals associated with the periphery of the identified hole; computing a first similarity measure of the determined first set of object surface normals; and comparing the computed first similarity measure with a first target similarity.
Additionally or alternatively, determining whether the identified hole is a hole to be filled comprises: determining a second set of object surface normals associated with the identified hole, e.g. associated with points within the hole; computing a second similarity measure of the determined second set of object surface normals; and comparing the computed second similarity measure with a second target similarity. Hence, when the hole is only identified as a hole to be filled when the second similarity measure is larger than the second target similarity, only relatively uniform holes are determined as holes to be filled.
Additionally or alternatively, determining whether the identified hole is a hole to be filled comprises: determining a first set of object surface normals associated with the periphery of the identified hole and a second set of object surface normals associated with the identified hole; computing a compatibility measure of the first and second sets of object surface normals; and comparing the computed compatibility measure with a target compatibility value. Hence, when the hole is only identified as a hole to be filled when the compatibility measure is larger than the target compatibility value, only holes likely caused by unreliable depth information are filled, while holes in the depth map that likely represent actual holes in the object are retained.
Filling the hole may comprise computing depth values at one or more positions within the hole, e.g. using depth values and/or object surface normals associated with the determined periphery. The modified intermediate representation may thus be a modified depth map in which one or more, or even all, holes have been filled.
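A minimal sketch of one such decision-and-fill step: the periphery normals are tested for mutual consistency via their mean cosine similarity, and a consistent hole is filled from the rim depths. The threshold value and the constant-depth fill are illustrative assumptions, not the patent's scheme.

```python
import numpy as np
from scipy import ndimage

def maybe_fill_hole(depth, hole_mask, normals, cos_thresh=0.9):
    """Fill a depth-map hole only if the normals on its periphery
    are mutually consistent; otherwise keep it as a real hole."""
    periphery = ndimage.binary_dilation(hole_mask) & ~hole_mask
    n = normals[periphery]                      # (m, 3) unit normals
    mean_n = n.mean(axis=0)
    mean_n /= np.linalg.norm(mean_n) + 1e-8
    similarity = float((n @ mean_n).mean())     # mean cosine to mean normal
    if similarity < cos_thresh:
        return depth                            # likely an actual hole: keep
    filled = depth.copy()
    filled[hole_mask] = np.nanmean(depth[periphery])  # crude constant fill
    return filled
```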
In some embodiments, the method comprises an optimization step for increasing a photo-consistency measure between the digital representation and the captured input data and/or normal maps derived from said surface normal information. Generally, a photo-consistency measure measures a degree of consistency/similarity between a set of input images and the 3D geometry of a model of the scene captured in the input images. The optimization step for increasing the photo-consistency measure may thus iteratively modify the digital representation (i.e. the currently reconstructed surface), e.g. modify a surface mesh, such as by modifying the vertex positions of the surface mesh, so as to increase the photo-consistency measure. The optimization step may thus receive an intermediate representation in the form of a surface mesh and create a modified representation in the form of a modified surface mesh, resulting in an increased photo-consistency measure with respect to the captured images and/or surface normals.
When the photo-consistency measure includes a consistency measure between the obtained surface normal information and surface normal information derived from the modified representation, the quality of the reconstructed object surface can be further increased.
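A minimal sketch of such a normal-consistency term: the normals implied by the current reconstruction, rendered into one view, are compared with the captured normal map for that view (1 = perfect agreement). The rendering of model normals into the view is assumed done elsewhere.

```python
import numpy as np

def normal_consistency(rendered_normals, captured_normals, mask):
    """Mean dot product between unit normals rendered from the model
    and the captured normal map, over the valid-pixel mask."""
    dots = np.einsum('ijk,ijk->ij', rendered_normals, captured_normals)
    return float(dots[mask].mean())
```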
Generally, according to another aspect, disclosed herein are embodiments of a method of creating a digital representation of at least an object surface of a physical object; wherein the method comprises:
obtaining input data, the input data comprising a plurality of captured images of the physical object and surface normal information of the object, the surface normal information being indicative of object surface normals associated with respective parts of the object surface;
creating a digital representation of the object surface;
wherein creating the digital representation is at least based on the obtained captured images and the obtained surface normal information, and comprises:
obtaining an intermediate representation of the object surface;
modifying a first part of the intermediate representation, based at least in part on the obtained surface normal information, to obtain a modified representation, the first part of the intermediate representation representing a first part of the object surface;
wherein modifying the first part of the intermediate representation comprises an optimization step for increasing a photo-consistency measure between the intermediate representation and normal maps derived from said surface normal information.
In some embodiments, modifying the intermediate representation to obtain a modified representation comprises performing a bilateral filtering step, optionally followed by an optimization step for increasing a photo-consistency measure between the modified representation and the captured input data and/or normal maps derived from said surface normal information. The bilateral filtering step thus provides a suitable starting point for the optimization step, reducing the risk that the optimization step results in an undesired local optimum and thereby improving the quality of the reconstructed object surface.
In some embodiments, the intermediate representation defines a virtual surface in a virtual 3D space and comprises a mesh of surface elements; each surface element defines a virtual surface normal; each surface element comprises a plurality of vertices, each vertex defining a position on said virtual surface. The bilateral filtering step comprises modifying the position of at least a first vertex of the plurality of vertices by a computed vertex displacement so as to reduce a difference measure between one or more object surface normals, determined from the obtained surface normal information, and corresponding one or more virtual surface normals. The one or more virtual surface normals represent the orientations of respective surface elements in a proximity of the first vertex, and the one or more object surface normals are indicative of the orientation of the object surface at respective 3D positions corresponding to the positions of the surface elements in said proximity. Hence, as described above, in some embodiments the first part of the intermediate representation may be a first vertex of a mesh of surface elements.
In particular, the mesh defines a surface topology of the object and, in some embodiments, the vertex displacement is constrained by the mesh topology. For example, the displacement of the first vertex may be computed based only on information associated with the surface elements in a proximity of the first vertex. In some embodiments, the first vertex is associated with one or more surface elements, and the vertex displacement is scaled by the size of the one or more surface elements associated with the first vertex.
In some embodiments, each object surface normal determined from the obtained surface normal information is an object surface normal indicative of an orientation of the object surface at a position corresponding to one of the surface elements in the proximity of the first vertex, and is selected from the surface normals represented by the obtained surface normal information, e.g. as the surface normal closest to an average of all the surface normals of the obtained surface normal information that are associated with the surface elements in the proximity of the first vertex. For example, for each surface element in the proximity of the first vertex, a surface normal may be selected among candidate surface normals of the respective normal maps, where the candidate surface normals represent the part of the object surface associated with that surface element. Hence, in some embodiments, the bilateral filtering step comprises selecting one of the actual object surface normals represented by the obtained surface normal information and associating the selected surface normal with a surface element associated with the first vertex, thus providing improved edge preservation.
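For illustration, one standard way to realize such a vertex update is the classic scheme used in mesh normal filtering: each vertex moves so that its incident faces better agree with per-face target normals (here, the normals selected from the captured normal maps). The following is a sketch under that assumption, not the patent's exact update rule:

```python
import numpy as np

def vertex_update(vertices, faces, target_normals, iters=10):
    """Move vertices to fit per-face target normals via the classic
    update v += mean over incident faces of n_f * (n_f . (c_f - v)),
    where c_f is the face centroid and n_f the target normal."""
    v = np.asarray(vertices, dtype=np.float64).copy()
    faces = np.asarray(faces)
    for _ in range(iters):
        disp = np.zeros_like(v)
        count = np.zeros(len(v))
        centroids = v[faces].mean(axis=1)        # (F, 3) face centroids
        for fi, face in enumerate(faces):
            n = target_normals[fi]
            for vi in face:
                disp[vi] += n * np.dot(n, centroids[fi] - v[vi])
                count[vi] += 1
        v += disp / np.maximum(count, 1)[:, None]
    return v
```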
The present disclosure relates to different aspects, including the methods described above and in the following, corresponding apparatus, systems, methods and/or products, each yielding one or more of the benefits and advantages described in connection with one or more of the first-mentioned aspects, and each having one or more embodiments corresponding to the embodiments described in connection with one or more of the first-mentioned aspects and/or disclosed in the appended claims.
In particular, disclosed herein are embodiments of a system for creating a digital representation of a physical object; the system comprises a data processing system configured to perform the steps of an embodiment of one or more of the methods disclosed herein.
To this end, the data processing system may comprise or be connectable to a computer-readable medium from which a computer program can be loaded into a processor, such as a CPU, for execution. The computer-readable medium may thus have stored thereon program code means adapted to cause, when executed on the data processing system, the data processing system to perform the steps of the methods described herein. The data processing system may comprise a suitably programmed computer, such as a portable computer, a tablet computer, a smartphone, a PDA or another programmable computing device having a graphical user interface. In some embodiments, the data processing system may include a client system, e.g. including a camera and a user interface, and a host system, which may create and control the virtual environment. The client and host systems may be connected via a suitable communications network, such as the internet.
Here and in the following, the term processor is intended to comprise any circuit and/or device suitably adapted to perform the functions described herein. In particular, the above term comprises general- or special-purpose programmable microprocessors, such as a central processing unit (CPU) of a computer or other data processing system, digital signal processors (DSP), application-specific integrated circuits (ASIC), programmable logic arrays (PLA), field-programmable gate arrays (FPGA), special-purpose electronic circuits, etc., or a combination thereof.
In some embodiments, the system comprises a scanning station comprising an object support for receiving the physical object. The object support may be a static object support or a movable object support. For example, the object support may be a turntable configured to rotate about a rotational axis, so as to allow the image capture device to capture images of a physical object placed on the turntable from different viewpoints relative to the object. The turntable may comprise markings, e.g. along the circumference of the turntable, and the data processing system may be configured to determine, based on one or more captured images, an angular position of the turntable associated with a captured image. The data processing system may further be configured to detect a tilt or other displacement of the turntable relative to the image capture device, so as to allow computing, for two or more images, the respective viewpoints relative to the physical object from which the respective images were captured. For example, this determination may be performed by a structure-from-motion technique.
In some embodiments, the system further comprises an image capture device operable to capture two or more images of the physical object when the physical object is placed on the object support, where the two or more images are obtained from different viewpoints relative to the physical object.
The image capture device may comprise one or more sensors that detect light or other forms of electromagnetic radiation, e.g. light or other electromagnetic radiation reflected by the surface of a physical object within a field of view of the image capture device. The image capture device may comprise an array of sensors, such as a CCD chip, or a single sensor operable to scan across the field of view, or a combination of multiple scanned sensors. Hence, the physical object may be passive, in that it does not need to actively emit any sound, light, radio signals, electrical signals or the like. Moreover, the images may be captured in a contactless fashion, without establishing any electrical contact, communications interface or the like.
The image capture device may comprise a radiation source, e.g. a light source, operable to direct radiation towards the physical object. For example, the image capture device may comprise a flash, one or more LEDs, a laser and/or the like. Alternatively, the image capture device may be operable to detect ambient radiation reflected by the object. Here, the term reflection is intended to refer to any type of passive emission responsive to received radiation or waves, including diffuse reflection, refraction, etc.
An image may be a picture or another form of two-dimensional representation of the field of view of the image capture device which allows the determination of a shape and/or a color and/or a size of an object within the field of view. For example, the image capture device may comprise a digital camera responsive to visible light, infrared light and/or the like. In some embodiments, the camera may be a 3D camera operable to also detect distance information of respective points within the field of view relative to the camera position. Another example of an image capture device may comprise a digital camera adapted to obtain data of the local polarization of light, e.g. for respective pixels or groups of pixels of a sensor array. Such a camera may be used to obtain 2D maps of respective polarizations and/or surface normals of respective surface points within the field of view of the image capture device. The captured image may thus be represented as a 2D array of pixels or other array elements, each array element representing sensed information associated with a point or a direction within the field of view. The sensed information may include an intensity of a received radiation or wave and/or a frequency/wavelength of the received radiation or wave. In some embodiments, the 2D array may include additional information, such as a distance map, a polarization map, a map of surface normals and/or other suitable sensed quantities. The 2D array may thus include image data and, optionally, additional information.
In some embodiments, the image capture device comprises one or more digital cameras, e.g. two digital cameras adapted to be directed at the physical object from respective viewpoints, e.g. positioned at respective heights relative to the physical object. In some embodiments, the digital camera is configured to capture depth information in addition to light intensity data (such as RGB data). In some embodiments, the digital camera is configured to capture information indicative of the surface normals of one or more surfaces within the field of view of the digital camera. For example, the digital camera may be configured to obtain polarization data of the received light. The camera and/or the data processing system may be configured to determine local surface normals from the obtained polarization data. The captured surface normals may further be transformed into a world coordinate system, based on a detected tilt or other displacement of the turntable relative to the camera. Examples of camera sensors capable of detecting surface normals include the system disclosed in US patent No. 8,023,724. Other examples of techniques for determining surface normals include the techniques described in "Rapid Acquisition of Specular and Diffuse Normal Maps from Polarized Spherical Gradient Illumination" by Wan-Chun Ma et al., Eurographics Symposium on Rendering (2007), Jan Kautz and Sumanta Pattanaik (editors).
The data processing system may then be adapted to create a digital 3D model from the light intensity data and from the polarization data and/or surface normal data and/or depth information, as described herein.
Hence, for example, a number of captured images of a physical object, such as a physical toy construction model, and optionally additional information, may serve as a basis for generating a virtual object having a three-dimensional graphical representation that accurately corresponds to the 3D shape of the physical object. Based on the captured images, the process may then automatically create a virtual object including its three-dimensional graphical representation.
In some embodiments, the system further comprises a plurality of toy construction elements configured to be detachably interconnected so as to form a physical object in the form of a toy construction model. The toy construction elements may each comprise one or more coupling members configured for detachably interconnecting the toy construction elements with each other.
Hence, one or more simple captured images of a physical toy construction model may serve as a basis for creating a virtual object having a user-defined appearance in a virtual environment. The user may create a physical toy construction model resembling the object that is to be used as a virtual object in the computer-generated virtual environment. As the user may construct these objects from toy construction elements, the user has a large degree of freedom as to how the object is constructed. Moreover, the system provides the user with a flexible, easy-to-understand and easy-to-use mechanism for influencing the desired appearance of a virtual object in a virtual environment.
The process may even include assigning virtual attributes, e.g. capabilities, needs, preferences or other attributes of the virtual object, such as behavioral attributes or other game-related attributes of the virtual object, e.g. based on detected visual attributes of the physical object, for example by using the mechanism disclosed in co-pending international patent application PCT/EP2015/062381.
The construction elements of the system may each have a color, shape and/or size selected from a predetermined set of colors, shapes and/or sizes, i.e. the toy construction set may only comprise toy construction elements of a limited, predetermined range of colors, shapes and/or sizes. The determined visual attributes may thus be at least partly (if not completely) defined by the colors, shapes and sizes of the toy construction elements and by their relative positions and orientations within the constructed toy construction model. Hence, even though a toy construction set may provide a large number of construction options and allow the construction of a large variety of toy construction models, the degrees of freedom in constructing toy construction models are limited by the characteristics of the individual toy construction elements and by the construction rules defined by the toy construction system. For example, the colors of a toy construction model are limited to the set of colors of the individual toy construction elements. The shape and size of each toy construction model are at least partly defined by the shapes and sizes of the individual toy construction elements and the ways in which they can be interconnected.
Hence, the visual attributes of a toy construction model determinable by the processor may be determined by a predetermined set of visual attributes. In some embodiments, the behavioral attributes of the created virtual object may thus only be created from a predetermined set of behavioral attributes corresponding to the predetermined set of visual attributes consistent with the toy construction system.
The various aspects described herein may be implemented with a variety of game systems, e.g. computer-generated virtual environments in which a virtual object is controlled by the data processing system to exhibit a behavior within the virtual environment, and/or in which the virtual object has attributes that influence the game play of a video game or another evolution of the virtual environment.
Generally, a virtual object may represent a virtual character, such as a human-like character, an animal-like character, a fantasy creature, etc. Alternatively, a virtual object may be an inanimate object, such as a building, a vehicle, a plant, a weapon, etc. In some embodiments, a virtual object whose counterpart in the physical world is inanimate, e.g. a car, may be used as an animate virtual character in the virtual environment. Hence, in some embodiments the virtual object is a virtual character, and in some embodiments the virtual object is an inanimate object.
A virtual character may exhibit behavior by moving around within the virtual environment, by interacting with or generally engaging other virtual characters and/or with inanimate virtual objects present in the virtual environment and/or with the virtual environment itself, and/or by otherwise evolving within the virtual environment, e.g. growing, aging, developing or losing capabilities, attributes or the like. Generally, virtual objects may have attributes, e.g. capabilities, that influence the game play or other evolution of the virtual environment. For example, a car may have a certain maximum speed, or an object may have an attribute determining whether and how a virtual character may interact with the virtual object, and/or the like.
Hence, a computer-generated virtual environment may be implemented by a computer program executed on a data processing system, causing the data processing system to generate the virtual environment and to simulate the evolution of the virtual environment over time, including the behavior of one or more virtual objects and/or the attributes of one or more virtual features in the virtual environment. For the purpose of the present description, a computer-generated virtual environment may be persistent, i.e. it may continue to evolve and exist even when no user interacts with it, e.g. between user sessions. In alternative embodiments, the virtual environment may only evolve as long as a user interacts with it, e.g. only during an active user session. A virtual object may be at least partly controlled by a user, i.e. the data processing system may control the behavior of the virtual object at least partly based on received user inputs. A computer-generated virtual environment may be a single-user environment or a multi-user environment. In a multi-user environment, more than one user may interact with the virtual environment concurrently, e.g. by controlling respective virtual characters or other virtual objects in the virtual environment. Computer-generated virtual environments, in particular persistent multi-user environments, are sometimes also referred to as virtual worlds. Computer-generated virtual environments are frequently used in game systems, where a user may control one or more virtual characters within the virtual environment. A virtual character controlled by a user is sometimes also referred to as a "player". It will be appreciated that at least some embodiments of the aspects described herein may also be used in contexts other than games. Examples of computer-generated virtual environments may include, but are not limited to, video games, e.g. games of skill, adventure games, action games, real-time strategy games, role-playing games, simulation games and the like, or combinations thereof.
The data processing system may present a representation of the virtual environment, including representations of one or more virtual objects, such as virtual characters within the virtual environment, and including the evolution of the environment and/or the virtual objects over time.
The present disclosure further relates to a computer program product comprising program code means adapted to cause, when executed on a data processing system, the data processing system to perform the steps of one or more of the methods described herein.
The computer program product may be provided as a computer-readable medium, e.g. a CD-ROM, DVD, optical disc, memory card, flash memory, magnetic storage device, floppy disk, hard disk, etc. In other embodiments, the computer program product may be provided as a downloadable software package, e.g. downloadable from a web server over the internet or another computer or communications network, or as an application downloadable to a mobile device from an app store.
The present disclosure further relates to a data processing system configured to perform the steps of an embodiment of one or more of the methods disclosed herein.
The present disclosure further relates to a toy construction set comprising a plurality of toy construction elements and instructions for obtaining computer program code that, when executed by a data processing system, causes the data processing system to perform the steps of an embodiment of one or more of the methods described herein. The instructions may e.g. be provided in the form of an internet address, a reference to an app store, or the like. The toy construction set may even include a computer-readable medium having stored thereon such computer program code. Such a toy construction set may even include a camera or another image capture device connectable to the data processing system.
Brief description of the drawings
Fig. 1 schematically shows an embodiment of the system disclosed herein.
Fig. 2 shows an exemplary flow chart of a method of creating a digital representation in 3D space of the surface of a physical object. In particular, Fig. 2 shows the 3D reconstruction pipeline from the input 2D photos to the creation of a 3D textured model.
Fig. 3 shows the steps of an automatic silhouette generation sub-process.
Fig. 4 shows a depth-map hole-filling sub-process.
Fig. 5 shows a bilateral filtering sub-process.
Figs. 6A-C show a photo-consistent mesh optimization process using normal data.
Fig. 7 shows an exemplary flow chart of a dual shape and silhouette optimization sub-process.
Detailed description
Various aspects and embodiments of a process and a system for reconstructing a 3D object from 2D image data will now be described, partly with reference to toy construction elements in the form of bricks. However, the invention may be applied to other forms of physical objects, such as other forms of construction elements for toy construction sets.
Fig. 1 schematically shows an embodiment of a system for creating a digital 3D representation of a physical object. The system comprises a computer 401, an input device 402, a display 403, a camera 404, a turntable 405 and a toy construction model 406 constructed from at least one toy construction element.
The computer 401 may be a personal computer, a desktop computer, a laptop computer, a handheld computer such as a tablet computer, a smartphone or the like, a game console, a handheld entertainment device, or any other suitably programmable computer. The computer 401 comprises a processor 409, such as a central processing unit (CPU), and one or more storage devices, such as memory, a hard disk, etc.
The display 403 is operatively coupled to the computer 401, and the computer 401 is configured to present a graphical representation of a virtual environment 411 on the display 403. Although shown as a separate functional block in Fig. 1, it will be appreciated that the display may be integrated into the housing of the computer.
The input device 402 is operatively coupled to the computer 401 and is configured to receive user inputs. For example, the input device may comprise a keyboard, a mouse or another pointing device, or the like. In some embodiments, the system comprises more than one input device. In some embodiments, the input device may be integrated into the computer and/or the display, e.g. in the form of a touch screen. It will be appreciated that the system may comprise further peripheral computing devices operatively coupled to, or even integrated into, the computer.
The camera 404 is operable to capture images of the toy construction model 406 and to forward the captured images to the computer 401. To this end, the user may position the toy construction model 406 on the turntable 405. In some embodiments, the user may build the toy construction model on a base plate. The camera may be a digital camera operable to take digital images, e.g. in the form of a two-dimensional array of pixels. In particular, the camera may be configured to capture light intensities for each pixel and, optionally, additional information such as polarization information and/or the directions of surface normals for respective pixels or groups of pixels. Alternatively, other types of image capture devices may be used. In other embodiments, the user may position the object on an object support, such as a desk or table, and move the camera so as to capture images of the object from different viewpoints.
The display 403, the camera 404 and the input device 402 may be operatively coupled to the computer in a variety of ways. For example, one or more of the above devices may be coupled to the computer via a suitable wired or wireless input interface of the computer 401, e.g. via a serial or parallel port of the computer such as a USB port, or via Bluetooth, Wifi or another suitable wireless communications interface. Alternatively, one or all of the devices may be integrated into the computer. For example, the computer may comprise an integrated display and/or input device and/or an integrated camera. In particular, many tablet computers and smartphones comprise an integrated camera and an integrated touch screen usable as both display and input device.
The computer 401 has stored thereon a program, e.g. an app or another software application, adapted to process the captured images and to create a virtual 3D object as described herein. Generally, in an initial step, the process receives a plurality of digital images of a physical object, such as a toy construction model, captured at respective angular positions of the turntable or from respective viewpoints.
In a subsequent step, the process constructs a 3D digital representation of the toy construction model from the digital images. To this end, the process may perform a number of image processing steps known as such in the field of digital image processing. For example, the processing may include one or more of the following steps: background detection, edge detection, color calibration, color detection. Examples of such a method are described in more detail below. The software application may further simulate a virtual environment and control the created virtual 3D object within the virtual environment.
It will be appreciated that, in some embodiments, the computer 401 may be communicatively connected to a host system, e.g. via the internet or another suitable computer network. The host system may then perform at least a part of the processing described herein. For example, in some embodiments, the host system may generate and simulate a virtual environment, such as a virtual world accessible by multiple users from respective client computers. A user may capture images using a client computer executing a suitable program. The captured images may be processed by the client computer, or uploaded to the host system for processing and for creation of a corresponding virtual object. The host system may then add the virtual object to the virtual world and control the virtual object within the virtual world, as described herein.
In the example of Fig. 1, the virtual environment 411 is an underwater environment, such as a virtual aquarium or another underwater environment. The virtual objects 407, 408 resemble fish or other underwater animals or creatures. In particular, the computer has created one virtual object 407 based on the captured images of the toy construction model 406. The computer has created the virtual object 407 so as to resemble the toy construction model, e.g. by creating a 3D mesh or another suitable form of representation. In the example of Fig. 1, the virtual object 407 resembles the shape and colors of the toy construction model 406. In this example, the virtual object even resembles the individual toy construction elements from which the toy construction model 406 is constructed. It will be appreciated, however, that different levels of resemblance may be implemented. For example, in some embodiments the virtual object may be created so as to resemble only the overall shape of the construction model, without simulating the internal structure of its individual toy construction elements. The virtual object may also be created with a size corresponding to the size of the construction elements, e.g. by providing a reference length scale on the turntable 405 so as to allow the computer to determine the actual size of the toy construction model. Alternatively, the computer may use the size of the toy construction elements as a reference length scale. In yet other embodiments, the user may manually scale the size of the virtual object. In other embodiments, the reconstructed virtual object may be used in software applications other than applications involving virtual environments.
Figs. 2-7 show examples of a process for creating a representation in 3D space of the surface of a physical object, and of individual sub-processes of such a process. The embodiments described herein may then be used to create a virtual object or character. For example, the process and the various examples of its sub-processes may be performed by the system of Fig. 1.
In particular, Fig. 2 shows a flow chart illustrating an example of the entire reconstruction pipeline. Starting from a plurality of images, the process recovers the 3D shape of the object in the form of a 3D textured model.
The volume of the object can be regarded as the locus of all points that project into the object in all projections simultaneously. The surface of the object consists of all object points that are adjacent to non-object points (i.e. to empty space).
Obtaining the surface of the object (converted into points or triangles) is preferable for applications involving computer graphics (games or CGI), as it is a more compact representation of the object (in terms of storage space) and matches the traditional 3D pipeline.
The process is performed by a computing device, e.g. as depicted in Fig. 1. Generally, the computing device may be a suitably programmed computer. In some embodiments, the processing may be performed on a server, with the results returned to a client device after the computation.
In particular, in an initial step 101, the process receives multiple digital images of a physical object, such as a toy construction model, captured from respective viewpoints. For example, the process may receive as input a set of 2D images (e.g. opacity maps of the object obtained by segmentation, as opposed to the background of the original RGB images) together with, for each camera used to acquire the images, the associated model-view-projection matrix. While the input images may be fed into the pipeline from any kind of camera, for the purpose of the present description it is assumed that each image is accompanied by a corresponding normal map. The creation of the normal maps may be performed by the camera(s) having captured the images, or by a processing device, e.g. as a pre-processing step preceding the reconstruction pipeline. To this end, various methods for extracting normal maps may be used; the reconstruction pipeline described herein is independent of the particular method used for creating the normal maps that are preferably associated with each input image. As a non-limiting example, the normal maps may be generated by a photometric stereo algorithm such as in (Woodham, 1979) and (Barsky & Petrou, 2003), where LEDs at different light viewpoints on a dedicated rig of the acquisition hardware are utilised to obtain multiple illumination settings. The lamps and the camera sensor may be fitted with polarising filters.
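By way of illustration only, the following Python sketch shows classical Lambertian photometric stereo in the spirit of (Woodham, 1979); it is not the specific method used by the pipeline, and the function name, data layout and least-squares formulation are illustrative assumptions.

```python
import numpy as np

def photometric_stereo_normals(images, light_dirs):
    """Estimate per-pixel surface normals from K >= 3 grayscale images
    captured under known, distant light directions (Lambertian model).

    images:     list of K grayscale images, each of shape (H, W)
    light_dirs: (K, 3) array of unit light direction vectors
    returns:    (H, W, 3) normal map in camera space
    """
    H, W = images[0].shape
    I = np.stack([im.reshape(-1) for im in images], axis=0)  # (K, H*W)
    L = np.asarray(light_dirs, dtype=np.float64)             # (K, 3)
    # Solve I = L @ (rho * n) per pixel in the least-squares sense.
    G, *_ = np.linalg.lstsq(L, I, rcond=None)                # (3, H*W)
    rho = np.linalg.norm(G, axis=0) + 1e-12                  # albedo per pixel
    return (G / rho).T.reshape(H, W, 3)                      # unit normals
```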
In a subsequent step 102, the process computes the object silhouettes. The object silhouettes, in combination with the images, will be used in several stages of the reconstruction pipeline. The silhouettes may be extracted using an automatic process, or using the method shown in Fig. 3 and described below.
In a subsequent step 103, the process extracts feature points from the images, preferably using a suitable feature point extractor. Preferably, the feature point extractor is scale-invariant and has a high repeatability. Several feature point extractors may be used, such as those proposed in US6711293 and (Lindeberg, 1998). A score response may be associated with each extracted keypoint.
Next, in step 104, the process performs pairwise image matching and track generation. The previously extracted keypoint descriptors are matched in a pairwise image-matching framework. Corresponding keypoints across images are associated with tracks. To this end, different methods may be used, such as, but not limited to, the method described in (Toldo, Gherardi, Farenzena, & Fusiello, 2015).
In a subsequent step 105, starting from a known initial estimate, the internal and external camera parameters are adjusted based on the previously generated tracks. For example, the process may use bundle adjustment to jointly recover the 3D positions of the tracks and the internal and external parameters of the cameras, e.g. as described in (Triggs, McLauchlan, Hartley, & Fitzgibbon, 1999).
Optionally, in a subsequent step 106, the process may expose a first intermediate output, namely the internal and external parameters of the cameras and a sparse point cloud with visibility information.
Next, in step 107, the process performs voxel carving. This shape-from-silhouette step allows a voxel grid to be generated in 3D space by projecting the previously computed silhouettes, e.g. as described in (Laurentini, 1994). For convenience, the voxels may be converted into a dense 3D point cloud.
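A minimal sketch of the shape-from-silhouette idea used in this step: a voxel survives only if it projects inside the silhouette in every view. The function name, the projection-matrix layout and the data structures are assumptions for illustration.

```python
import numpy as np

def carve_voxels(voxel_centers, silhouettes, proj_matrices):
    """Keep the voxels whose projections fall inside every silhouette.

    voxel_centers: (V, 3) array of voxel centre coordinates
    silhouettes:   list of binary masks, one (H, W) per view
    proj_matrices: list of 3x4 camera projection matrices
    returns:       boolean mask (V,) of surviving voxels
    """
    V = voxel_centers.shape[0]
    keep = np.ones(V, dtype=bool)
    homog = np.hstack([voxel_centers, np.ones((V, 1))])  # homogeneous coords
    for mask, P in zip(silhouettes, proj_matrices):
        H, W = mask.shape
        p = homog @ P.T                                  # (V, 3) image points
        u = np.round(p[:, 0] / p[:, 2]).astype(int)
        v = np.round(p[:, 1] / p[:, 2]).astype(int)
        inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
        keep &= inside                                   # outside the image: carve
        keep[inside] &= mask[v[inside], u[inside]] > 0   # outside the mask: carve
    return keep
```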
In a subsequent step 108, the process performs a dual shape and silhouette iterative optimisation. During this step, the previously computed silhouettes are optimised by re-projecting the space-carved volume onto the images and by embedding pixel/superpixel matching information into a global optimisation framework. An example of this sub-process is described in more detail below with reference to Fig. 7.
Next, in step 109, the process performs corner detection. Corners are extracted and used to improve the overall geometry of the reconstructed object. To this end, several corner extractors may be used, e.g. (Canny, 1986) or (Harris & Stephens, 1988). The extracted corners are matched across the different images, and the resulting 3D points are used to integrate the original point cloud and to carve out inconsistent parts, e.g. concave parts of the mesh produced by the voxel carving process.
In step 110, the process performs a per-pixel depth range computation using the results from steps 108 and 109. During this step, the initial volume from the voxel carving step is used to limit the depth range search for each pixel during the depth map initialisation.
In step 111, the process performs the actual depth map computation. To this end, initial depth map candidates are defined using the corners and cross-correlation. For each pixel, zero or one "candidate depth" may be defined. Depth maps may be generated according to several algorithms and correlation methods, see e.g. (Seitz, Curless, Diebel, Scharstein, & Szeliski, 2006). Hence, the process creates an intermediate representation in the form of depth maps.
In step 112, the previously computed normals are processed and fed to the next step (step 113). The normal maps may be computed by any suitable method for computing normal maps known as such in the art, including, but not limited to, photometric stereo algorithms. As illustrated in Fig. 2, the computed normals from step 112 are used at different stages of the pipeline.
Next, in step 113, the process performs depth map hole filling using the normal maps. The depth maps may contain several holes due to bad matches, especially in completely untextured regions. To overcome this problem, a hole in a depth map can be filled if its closed boundary pixels indicate a uniform, flat surface, and if the normal data confirms these two findings. Hence, in this step, the intermediate representation in the form of depth maps is modified so as to create a modified intermediate representation in the form of modified depth maps, in which some or all of the holes of the previous depth maps have been filled. An example of this process is illustrated in Fig. 4 and will be described below.
In step 114, the process performs depth map fusion into 3D space with outlier rejection based on global visibility. Outlier points can be rejected by checking visibility constraints, i.e. they must not occlude other visible points. Several processes for enforcing the visibility constraints may be used, see e.g. (Seitz, Curless, Diebel, Scharstein, & Szeliski, 2006) and (Furukawa, Curless, Seitz, & Szeliski, 2010).
In step 115, the process may optionally expose a second intermediate output of the pipeline, namely a multi-view stereo output consisting of a dense point cloud and visibility information.
Next, in step 116, the process performs mesh extraction. The mesh may be extracted using any known method, e.g. by solving a Poisson equation (Kazhdan, Bolitho, & Hoppe, 2006) or based on a Delaunay algorithm (Seitz, Curless, Diebel, Scharstein, & Szeliski, 2006). The normals previously computed at block 112 may be used as an additional input, e.g. directly in the Poisson equation. The mesh may be a triangle mesh or another type of polygon mesh. A mesh comprises a set of triangles (or other types of polygons) connected by their common edges or corners. The corners are referred to as the vertices of the mesh and are defined in 3D space. Hence, the process creates an intermediate representation in the form of a preliminary/intermediate surface mesh.
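As an illustration of this step, the following sketch extracts a mesh via Poisson surface reconstruction; it assumes the open-source Open3D library as a stand-in for whatever implementation the pipeline actually uses, and the depth parameter value is illustrative.

```python
import numpy as np
import open3d as o3d

def extract_mesh(points, normals, depth=9):
    """Extract a triangle mesh from a fused point cloud via screened
    Poisson reconstruction (Kazhdan, Bolitho, & Hoppe, 2006).

    points, normals: (N, 3) arrays; the normals computed at block 112
    are fed in directly rather than re-estimated from the points.
    """
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.asarray(points))
    pcd.normals = o3d.utility.Vector3dVector(np.asarray(normals))
    # `depth` controls the octree resolution and hence the detail level.
    mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=depth)
    return mesh
```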
Next, in step 117, the process performs bilateral filtering using the normals. Similarly to step 116, the normal maps computed at block 112 are used as an additional input. The bilateral filtering step moves the positions of the mesh vertices so as to maximise the consistency of the normals, resulting in less noise and sharper edges. This step brings the mesh closer to the global minimum before the subsequent photo-consistency-based mesh optimisation is run. An example of a bilateral filtering method is illustrated in Fig. 5 and will be described below.
Next, in step 118, the process performs a photo-consistency-based mesh optimisation using the normals. Similarly to the previous two steps, the same normals computed at block 112 are used as an additional input. An example of the photo-consistency-based mesh optimisation is illustrated in Fig. 6 and will be described below. Hence, the process creates a modified intermediate representation (in the form of a modified mesh) from the previous intermediate representation (in the form of the previous mesh).
In step 119, the process performs robust plane fitting: planar regions are detected from the 3D model. The planar regions can be used to improve the subsequent extraction, object recognition and the final quality of the mesh. Several planar region algorithms may be used, including, but not limited to, (Toldo & Fusiello, Robust multiple structures estimation with j-linkage, 2008) and (Toldo & Fusiello, Photo-consistent planar patches from unstructured cloud of points, 2010).
In step 120, the process performs mesh simplification. This can be based on the extracted planes, or it can be a simple geometric simplification (Garland & Heckbert, 1997) after projecting the points onto the respective planes.
In step 121, the process performs texturing using multi-band blending, colour balance and uniformity constraints. The texture may be generated using a multi-band approach (Allene, Pons, & Keriven, Seamless image-based texture atlases using multi-band blending, 2008); by separating low and high frequencies, illumination changes can be handled more robustly (multi-band) and global colour variations can be colour-balanced. Moreover, given the nature of the reconstructed objects, certain uniformity constraints can be imposed.
Finally, in step 122, the process provides the final output of the reconstruction pipeline, namely a simplified 3D mesh with normals and texture.
It will be appreciated that alternative embodiments of the reconstruction pipeline may modify or even omit one or more of the above steps, change the order of some of the steps, and/or replace one or more of the above steps by other steps.
Optionally, once the 3D representation of the physical object has been created, e.g. by the above pipeline, the process may determine one or more perceptual properties of the detected toy construction model, e.g. an aspect ratio of the detected shape, a dominant colour, and the like.
In a subsequent step, the process may create a virtual object based on the reconstructed 3D digital representation. If the movements of the virtual object are to be animated in a virtual environment, the process may further create a skeleton matching the created 3D representation.
Optionally, the process sets the values of one or more virtual attributes associated with the virtual object. The process sets the values based on the detected perceptual properties. For example:
the process may set a maximum speed parameter based on the aspect ratio: max_speed = F(aspect ratio);
the process may set the food type of the virtual object based on the detected colour, for example:
switch (colour)
case (red): food type = meat;
case (green): food type = plant;
case (otherwise): food type = all.
The process may set the daily calorie intake required by the virtual character based on the detected size of the toy construction model.
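A compact sketch of how such attribute rules might be expressed in code; the function name, the concrete scaling factors and the attribute names are illustrative assumptions, not part of the described process.

```python
def derive_virtual_attributes(aspect_ratio, dominant_color, model_size):
    """Map detected perceptual properties of the toy construction model
    to gameplay attributes of the virtual object (illustrative rules)."""
    attributes = {}
    # Maximum speed as some function F of the detected aspect ratio.
    attributes["max_speed"] = 10.0 * aspect_ratio
    # Food type selected from the dominant colour, as in the text above.
    if dominant_color == "red":
        attributes["food_type"] = "meat"
    elif dominant_color == "green":
        attributes["food_type"] = "plant"
    else:
        attributes["food_type"] = "all"
    # Daily calorie requirement scaled by the detected model size.
    attributes["daily_calories"] = 2.0 * model_size
    return attributes
```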
In a subsequent step, the process may add the virtual object to the virtual environment and control the evolution of the virtual environment, including the behaviour of the virtual object. To this end, the process may execute a control process implementing a control system for controlling the virtual objects within the virtual environment.
Fig. 3 shows the steps of the automatic silhouette generation sub-process. In one embodiment, the silhouette of the object to be reconstructed is extracted automatically for each image by a background/foreground segmentation algorithm. To this end, it is assumed that some pre-existing knowledge of the physical setup is available, e.g. in the form of a background image and of information about the coarse localisation of the object in image space. From the rough knowledge of the object localisation, a probability map P can be extracted for each pixel. The probability map function P(x, y) outputs the probability that a pixel belongs to the object, ranging from 0 to 1, where a value of 0 indicates that the pixel certainly belongs to the background and a value of 1 indicates that the pixel certainly belongs to the object. The segmentation is accomplished by first generating superpixels. Several methods may be used to extract the superpixels, such as the one described in (Achanta, et al., 2012). Each superpixel is associated with a set of pixels. Each superpixel value may be associated with the values of the pixels belonging to that set; the mean or the median value may be used. The segmentation can be accomplished at the superpixel level and then transferred to the pixel level.
The original RGB images may be converted to the LAB colour space to improve the correlation functions and the superpixel extraction.
In the first part of the algorithm, a set of superpixel seeds is detected. The seeds are labelled as foreground or background, and they represent the superpixels having a high probability of belonging to the foreground or to the background. In more detail, for a superpixel i, a score S may be computed as follows:
S(i) = P(i) * dist(i, back(i))
where dist is a distance function between two superpixels (e.g. the Euclidean distance between the superpixel medians in the LAB colour space), and back is a function associating superpixel i with the corresponding superpixel in the background image. If S is lower than a fixed threshold T1, superpixel i is associated with the background seeds; otherwise, if S is higher than a fixed threshold T2, superpixel i is associated with the foreground seeds. Alternatively, adaptive thresholds may be used, e.g. computed from the scene illumination.
The seed superpixels are then grown using a region-growing method. Specifically, for each superpixel j close to a foreground or background superpixel s, the distance d is computed with the function dist(j, s). Among all superpixels, the superpixel with the smallest distance is associated with the corresponding foreground or background superpixel set, and the process is iterated until all superpixels belong to either the foreground or the background set.
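The seed-classification rule above can be sketched as follows; the vectorised layout and the way the background correspondence is passed in are assumptions for illustration.

```python
import numpy as np

def classify_seeds(P, medians, bg_medians, T1, T2):
    """Label superpixels as background or foreground seeds.

    P:          (S,) probability that each superpixel belongs to the object
    medians:    (S, 3) median LAB colour of each superpixel
    bg_medians: (S, 3) median LAB colour of the corresponding superpixel
                in the background image, i.e. back(i)
    returns:    (S,) array of labels: 'back', 'fore' or 'unknown'
    """
    dist = np.linalg.norm(medians - bg_medians, axis=1)  # dist(i, back(i))
    score = P * dist                                     # S(i) = P(i) * dist(i, back(i))
    labels = np.full(P.shape[0], "unknown", dtype=object)
    labels[score < T1] = "back"   # certainly background
    labels[score > T2] = "fore"   # certainly foreground
    return labels                 # remaining 'unknown' labels are region-grown
```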
Image 201 shows an example of a generic background image, i.e. an image of the scene into which the physical object will be placed in order to capture images of the physical object. Image 202 shows an example of the segmentation of the background image into superpixels, e.g. using an ordinary superpixel method.
Image 203 shows a picture of a physical object 210 placed in front of the background. Image 203 may represent one of the multiple views of the physical object that are used as input to the pipeline. Image 204 shows an example of the segmentation of image 203 into superpixels, e.g. using an ordinary superpixel method.
Image 205 shows the initial seeds, computed with the method described above. The foreground (i.e. object) seeds are shown in black, while the background seeds are shown in grey. Image 206 shows the background (grey) and foreground (black) grown into the final mask, which may be further refined by the process described with reference to step 108 of Fig. 2.
Fig. 4 illustrates the depth map hole filling sub-process. In particular, image 301 schematically shows an example of a depth map, e.g. as created by step 111 of the process of Fig. 2. Image 301 shows regions with valid depth data as grey-scale regions and regions with missing (or too sparse or unreliable) depth data as white areas. Since the initial depth map contains both valid values and missing data, the process initially identifies the connected regions of the depth map that have missing data. Each candidate connected region is shown with a different shade of grey in image 302. For each candidate connected region, the process computes the area of the region. The process then discards the regions having an area larger than a predetermined threshold. The remaining candidate regions are shown in image 303, and image 304 schematically shows the remaining candidate regions.
For each remaining connected candidate region, and based on the normal maps, the process computes the following quantities and uses them to select whether a candidate region represents a hole to be filled, e.g. by comparing each quantity with a respective threshold and determining that a region is a hole to be filled only when the quantities fulfil a predetermined selection criterion (e.g. only when all quantities are above their respective thresholds):
a first similarity value of the normals falling inside the connected region (S1); a high similarity indicates a region resembling a planar surface. Selecting regions based on the first similarity value allows the process to control the type of holes that will be filled: filling only regions having a high first similarity value results in only highly uniform regions being filled, while also filling regions having a small first similarity value results in complex regions being filled (though this may introduce some approximation errors in the filling stage);
a second similarity value of the normals falling on the periphery (S2) of the connected region. Considering the second similarity value allows the process to discriminate between the following possible scenarios: a connected region with missing depth data may represent a hole that is actually present in the physical object, or it may be indicative of unreliable, lost data in the depth map that does not represent a real hole in the object.
In order to correctly discriminate between these scenarios, the process further determines the compatibility between the normals falling inside the region (S1) and the normals falling near or on its boundary (S2). A low compatibility value indicates that the hole and its boundary belong to two different surfaces (of the same object, or of the object relative to the background), which in turn means that the region does not need to be filled. A high compatibility value indicates that the candidate region represents a region of missing data that needs to be filled.
If the first and second similarities are higher than their respective thresholds, and if the compatibility value is higher than a certain threshold, it is determined that the connected region represents a hole in the depth map that is to be filled. Each similarity can be defined in a number of ways. For example, in many situations it may reasonably be assumed that a hole to be filled is part of a region that is, at least to some degree, planar. In that case, the normals inside the hole and along its boundary will point in the same direction. Hence, the similarities can be computed by using the dot product or the angular distance between normals. For example, a similarity may be computed by determining the average normal within the region under consideration (S1 and S2, respectively) and by computing the distance of each normal with respect to the computed average normal. If the distance d between the majority of the normals and the average normal is below some threshold (or -d is above some threshold), the region is determined to be planar. The compatibility value may be determined from the distance between the average normal of region S1 and the average normal of region S2, where a large distance corresponds to a low compatibility value.
The process then fills the remaining candidate regions which have not been discarded and which have been determined to represent holes to be filled. For example, the filling of the depth map may be done by interpolating between the depth values on or near the boundary of the connected region to be filled. Other, more complex filling methods may be based on a global analysis of the boundary, allowing a more uniform filling (e.g. using plane fitting). A global approach may also be based on a priority function, such that, at each iteration, the process re-evaluates which pixels to fill and how to fill them.
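The hole-selection test described above can be sketched as follows, using dot products against the average normals; the function name, the threshold parameters and the choice of the mean dot product as the similarity are illustrative assumptions.

```python
import numpy as np

def hole_should_be_filled(region_normals, boundary_normals,
                          t_plane, t_compat):
    """Decide whether a connected region of missing depth data is a hole
    to be filled, using the normal-map tests described above.

    region_normals:   (R, 3) unit normals falling inside the region (S1)
    boundary_normals: (B, 3) unit normals on/near its periphery (S2)
    t_plane, t_compat: selection thresholds in [0, 1]
    """
    mean_in = region_normals.mean(axis=0)
    mean_in /= np.linalg.norm(mean_in)
    mean_bd = boundary_normals.mean(axis=0)
    mean_bd /= np.linalg.norm(mean_bd)
    # First/second similarity: how tightly the normals cluster around
    # their average (mean dot product close to 1 => near-planar region).
    s1 = float((region_normals @ mean_in).mean())
    s2 = float((boundary_normals @ mean_bd).mean())
    # Compatibility: agreement between interior and boundary orientation.
    compat = float(mean_in @ mean_bd)
    return s1 > t_plane and s2 > t_plane and compat > t_compat
```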
Fig. 5 illustrates the bilateral filtering sub-process. The bilateral filtering process receives as input a preliminary mesh 531 representing the object to be reconstructed, e.g. the mesh computed in step 116 of the process of Fig. 2. The bilateral filtering step also receives information about a number n (n > 1) of cameras and the normal map associated with each camera, schematically represented by the triangles 532 in Fig. 5. The camera information includes the camera parameters, including the camera views, as provided by step 106 of the process of Fig. 2. The normal maps may be the normal maps provided by step 112 of Fig. 2, optionally further refined by the hole filling process of step 113 of Fig. 2. The n pairs of cameras and normal maps will be denoted (Camera1/Normals1 (C1/N1), Camera2/Normals2 (C2/N2), ..., Camera i/Normals i (Ci/Ni)). The bilateral filtering step is an iterative algorithm comprising the following four steps, which are repeated for a certain number of iterations (e.g. a predetermined number of iterations, or based on a suitable stopping criterion):
Step 1: Initially, the process computes the area and the barycentre of each triangle of the mesh.
Step 2: Subsequently, for each triangle, the process determines whether the triangle is visible. To this end, the process may use any suitable visibility computation method known as such, e.g. in the field of computer graphics, for example the method described in (Katz, Tal, & Basri, 2007).
Step 3: Then, using the normal maps associated with the cameras, the process computes an estimated surface normal for each triangle, e.g. by performing the following sub-steps (see the code sketch further below):
the triangle barycentre is projected into the normal maps of all the cameras that see the triangle currently under consideration; the output of this sub-step is a list of normals (L);
the average of the normals in (L) is computed, optionally taking into account a weight associated with each normal (e.g. a confidence value representing the confidence of a normal resulting from a photometric stereo method);
the normal in the list (L) closest to the computed average becomes the new normal of the triangle currently under consideration.
Hence, the process results in each triangle having two normals associated with it: i) the normal defined by the triangle itself, i.e. perpendicular to the plane of the triangle, and ii) the estimated surface normal determined from the normal maps by the above process. While the first normal defines the orientation of the triangle, the second normal represents the estimated surface orientation of the physical object at the position of the triangle.
When the process selects, from the normal maps, the normal closest to the average normal, an individual normal value is assigned to the 3D point, rather than an average over all the normals associated with the 3D point under consideration. It has been found that this prevents over-smoothing and better preserves sharp edges, while the selection remains robust.
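The normal-selection sub-steps of Step 3 can be sketched as follows; the per-view `project` helper and the data layout are assumptions for illustration.

```python
import numpy as np

def estimate_triangle_normal(centroid, visible_views, weights=None):
    """Estimate the surface normal of one triangle from the normal maps
    of all cameras that see it (Step 3 above).

    visible_views: list of (normal_map, project) pairs, where `project`
                   maps a 3D point to (row, col) pixel coordinates in
                   that camera's normal map (assumed helper).
    weights:       optional per-view confidence values for the averaging.
    """
    samples = []
    for normal_map, project in visible_views:
        r, c = project(centroid)
        samples.append(normal_map[r, c])
    L = np.asarray(samples)                        # the list (L) of normals
    w = np.ones(len(L)) if weights is None else np.asarray(weights)
    avg = (L * w[:, None]).sum(axis=0)
    avg /= np.linalg.norm(avg) + 1e-12
    # Return the *actual* sampled normal closest to the average, rather
    # than the average itself, to preserve sharp edges.
    return L[np.argmax(L @ avg)]
```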
Step 4: Then, the process computes new estimated vertex positions.
The goal of this sub-step is to move the vertices of the mesh such that the new vertex positions at least approximately minimise the difference between the normals associated with the triangles of the 3D model and the normals estimated based on the normal maps.
This sub-step is iterative, and each iteration applies the following formula:

v'_i = v_i + ( Σ_{j ∈ N(v_i)} A_j n_j ( n_j · (c_j − v_i) ) ) / ( Σ_{j ∈ N(v_i)} A_j )

where:
v_i is the old vertex position;
v'_i is the new vertex position;
c_j is the barycentre of the j-th triangle;
n_j is the normal derived from the normal maps at the barycentre of the j-th triangle;
N(v_i) is the 1-ring neighbourhood of v_i, i.e. the set of all triangles of the mesh having the vertex v_i as a corner (alternatively, a larger neighbourhood may be used);
A_j is the area of the j-th triangle.
Hence, the vertex v_i is moved such that the normals associated with the triangles to which the vertex belongs correspond more accurately to the corresponding normals derived from the normal maps. The contribution of each triangle is weighted by its surface area, i.e. large triangles are weighted more than small triangles. The movement of v_i is thus constrained by the mesh topology, i.e. it is determined by the local properties of the mesh in the vicinity of the vertex. In particular, the movement of v_i is constrained by the areas of the triangles surrounding the vertex.
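A minimal sketch of this vertex update, assuming the per-triangle quantities have already been gathered; the function name and data layout are illustrative.

```python
import numpy as np

def update_vertex(v_i, tri_ids, centroids, tri_normals, areas):
    """One bilateral-filtering update of a single vertex position,
    following the area-weighted formula above.

    tri_ids:     indices of triangles in the 1-ring neighbourhood N(v_i)
    centroids:   (T, 3) triangle barycentres c_j
    tri_normals: (T, 3) normals n_j sampled from the normal maps
    areas:       (T,) triangle areas A_j
    """
    num = np.zeros(3)
    den = 0.0
    for j in tri_ids:
        # Move v_i along n_j by the point-to-plane offset n_j . (c_j - v_i),
        # weighted by the triangle area A_j.
        num += areas[j] * tri_normals[j] * np.dot(tri_normals[j],
                                                  centroids[j] - v_i)
        den += areas[j]
    return v_i + num / den
```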
Figs. 6A-C illustrate the photo-consistent mesh optimisation process using normal data. Since the initial surface reconstruction method is interpolative, and since the point cloud may contain a considerable amount of noise, the initial mesh obtained (denoted S0) is typically noisy and may fail to capture fine details. Using all the image data, the mesh is refined with a variational multi-view stereo approach: S0 serves as the initial condition of a gradient descent of a suitable energy function. Since the mesh S0 is already close to the desired solution (in particular when S0 is the result of the bilateral filtering step described above), the local optimisation is less likely to get trapped in an irrelevant local minimum. Prior-art mesh optimisation techniques based on photo-consistency have been presented in several works, including, but not limited to, (Faugeras & Keriven, 2002) and (Vu, Labatut, Pons, & Keriven, 2012). In the following, an improved photo-consistency process will be described which further incorporates the data representing the normal maps, e.g. as provided by step 112 of Fig. 2 and, optionally, further refined by step 113 of Fig. 2.
Fig. 6A shows a simple example of a particular surface S and a point x on S. For each camera position, the projection of x into the image 633 captured by the camera is computed. In Fig. 6A, two camera positions 532 and the corresponding images 633 are shown, but it will be appreciated that embodiments of the process described herein will typically involve more than two camera positions.
C_i is camera i and C_j is camera j. Each camera has a corresponding image 633; in this example, I_i denotes the image captured by C_i and I_j the image captured by C_j. Similarly, if Π denotes the projection of a point into an image, then x_i = Π_i(x) is the projection of x into I_i, and x_j = Π_j(x) is the projection of x into I_j.
Fig. 6B shows an example of image re-projection: each valid pixel x_j of the image I_j can be expressed as x_j = Π_j(x) for some point x on the surface S.
Moreover, if x is also visible from the camera position C_i, the following also holds: I_ij^S(x_i) = I_j( Π_j( Π_i^{-1}(x_i) ) ), where Π_i^{-1} denotes the back-projection from I_i onto the surface S.
As a result, I_ij^S is the re-projection of I_j into C_i induced by the surface S.
As shown in Fig. 6C, this means that if a pixel of I_j is not correctly re-projected into I_i (because the re-projection is not induced by S), the pixel is simply discarded for the purpose of defining the re-projection.
In order to proceed with the algorithm, some similarity measure has to be defined: as an example, the cross-correlation of the normals, mutual information or another similarity may be used as a suitable similarity measure. Regardless of the similarity measure selected, in many cases the neighbouring cameras have to be selected in a suitable manner, e.g. in case adjacent cameras cannot view similar regions of the object.
The local measure of similarity between I_i and I_ij^S at x_i is defined as a similarity score h(x_i), computed between the two images in a neighbourhood of x_i (e.g. by cross-correlation). The global measure of similarity between I_i and I_ij^S is then defined as

M(I_i, I_ij^S) = ∫ h(x_i) dx_i

where the domain of I_ij^S, over which the integral is taken, is Π_ij^S.
Based on this similarity measure, the following energy function can be defined:

E(S) = − Σ_{(C_i, C_j) ∈ P} M(I_i, I_ij^S)

where P is the set of camera pairs (C_i, C_j) selected among neighbouring camera positions.
For the purpose of the present description, we use the derivative of E to define the rate of change of the energy E when the surface S undergoes a deformation along a vector field v:

dE(S)(v) = d/dt E(S + t v) |_{t=0}

The shape evolution minimising the functional can be performed by evolving the surface S in the direction of the negative derivative, so as to best decrease the energy E.
In the following, it is described how the computation of the similarity measure is modified by incorporating the normal maps.
Given two normal maps, a measure has to be defined that allows a similarity to be established between a point belonging to the first normal map and a point belonging to the second normal map. Various functions can be used for this purpose. In the following, as an example, the use of the cross-correlation applied to the normals is described.
In the following, it is assumed for convenience of description that a normal map is available for all images, i.e. for each camera position, and that the normal maps have been transformed into world-space coordinates. If the normal acquisition method outputs normal maps in camera space, a conversion may therefore need to be performed before processing.
We now consider the normal maps N_i and N_ij^S at x_i rather than the images I_i and I_ij^S at x_i.
For convenience, we denote N_1 = N_i and N_2 = N_ij^S, and we drop the index i of x_i.
The similarity measure is defined as

S_{1,2}(x) = υ_{1,2}(x) / sqrt( υ_1(x) υ_2(x) )

where the covariance, the means and the variances are respectively given by:

υ_{1,2}(x) = K * ( N_1(x) · N_2(x) ) − μ_1(x) · μ_2(x)
μ_r(x) = K * N_r(x), with r = 1, 2
υ_r(x) = K * ( N_r(x) · N_r(x) ) − μ_r(x) · μ_r(x), with r = 1, 2

where K is a suitable convolution kernel (e.g. a Gaussian kernel, an averaging kernel, etc.).
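By way of illustration, the following sketch computes this per-pixel cross-correlation of two normal maps, using a Gaussian kernel for K; the function name, the use of SciPy's gaussian_filter and the clamping constant are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def normal_ncc(N1, N2, sigma=2.0, eps=1e-6):
    """Per-pixel cross-correlation of two normal maps, with a Gaussian
    convolution kernel K as in the formulas above.

    N1, N2: (H, W, 3) normal maps in world space
    returns: (H, W) similarity map S_{1,2}
    """
    K = lambda img: gaussian_filter(img, sigma)
    # Means mu_r = K * N_r, computed per channel.
    mu1 = np.stack([K(N1[..., c]) for c in range(3)], axis=-1)
    mu2 = np.stack([K(N2[..., c]) for c in range(3)], axis=-1)
    # Covariance and variances, with the dot products taken per pixel.
    cov = K((N1 * N2).sum(-1)) - (mu1 * mu2).sum(-1)    # upsilon_{1,2}
    var1 = K((N1 * N1).sum(-1)) - (mu1 * mu1).sum(-1)   # upsilon_1
    var2 = K((N2 * N2).sum(-1)) - (mu2 * mu2).sum(-1)   # upsilon_2
    return cov / np.sqrt(np.maximum(var1 * var2, eps))
```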
The derivative of E requires the derivative of the similarity measure with respect to the normal map N_2. In this case, the derivative can be computed as follows, where

D_2[ υ_{1,2}(x) ] = N_1(x) − μ_1(x)

and, analogously, D_2[ υ_2(x) ] = 2 ( N_2(x) − μ_2(x) ). We finally obtain

D_2[ S_{1,2}(x) ] = ( N_1(x) − μ_1(x) − ( υ_{1,2}(x) / υ_2(x) ) ( N_2(x) − μ_2(x) ) ) / sqrt( υ_1(x) υ_2(x) )

Hence, the process can perform the gradient-descent minimisation of the energy function (with respect to the modelled/reconstructed surface), computed from the normal maps using the above definition of the derivative together with the above similarity measure, or another suitable similarity measure.
Fig. 7 shows a flow diagram of an example of the dual shape and silhouette optimisation sub-process.
In an initial step 601, the process receives the silhouettes extracted during a previous step of the reconstruction pipeline, e.g. during step 102 of the process of Fig. 2.
The process further receives the required number of iterations (h) as input, e.g. as a user-defined input. This value may not need to be very high, since as few as 2 or 3 iterations are sufficient for the process to converge when sufficiently good silhouettes are fed into the system.
In step 602, the process is initialised. In particular, the iteration count i is set to its initial value, in this case i = 0. Moreover, the superpixels of each image are extracted, e.g. as described above with reference to the silhouette extraction. The iteration count is incremented at step 603, and a dense point cloud is computed by voxel carving at step 604, e.g. as described in connection with step 107 of Fig. 2. Steps 603 and 604 are repeated while the current iteration count i is larger than 0 and smaller than h; this condition is checked at step 605, and while it is true, the process performs steps 606 to 613 before returning to step 603.
Specifically, in step 606, the visibility information of the dense point cloud is computed. Any method for computing the visibility of point clouds may be used, e.g. (Katz, Tal, & Basri, 2007).
Then, for each superpixel of each image, step 607 computes the related 3D points of the dense point cloud (if any).
Next, in step 608, for each superpixel, a list of the corresponding superpixels in the other images is constructed by checking whether they are associated with the same 3D points.
In a subsequent step 609, the process labels as background each superpixel that corresponds only to background superpixels in the other images and that does not have any associated 3D points.
In step 610, a correlation score between each foreground superpixel and the related superpixels of the other images is computed. Any correlation function may be used as the correlation score; for example, a simple Euclidean difference between the median colours of two superpixels may be considered.
In step 611, if the correlation computed at step 610 is high (e.g. higher than a predetermined threshold), the process labels the superpixel of the current image as foreground; otherwise, the superpixel is labelled as unknown.
In step 612, a probability map is generated for each unknown superpixel of each image, based on a distance map with respect to the superpixels of the image under consideration that have already been labelled as foreground.
In step 613, before returning to block 603, all remaining superpixels of each image are associated with the foreground or the background using a region-growing method.
When the condition at block 605 is false, i.e. after the desired number of iterations, the process proceeds at block 614, where the process provides a dense point cloud and refined silhouettes as output.
Embodiments of the method described herein can be implemented by means of hardware comprising several distinct elements, and/or at least in part by means of a suitably programmed microprocessor.
In the claims enumerating several means, several of these means can be embodied by one and the same element, component or item of hardware. The mere fact that certain measures are recited in mutually different dependent claims or described in different embodiments does not indicate that a combination of these measures cannot be used to advantage.
It should be emphasised that the term "comprises/comprising", when used in this specification, is taken to specify the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps, components or groups thereof.
Bibliography:
Achanta,R.,Shaji,A.,Smith,K.,Lucchi,A.,Fua,P.,&Susstrunk,S.(2012) .SLIC superpixels compared to state-of-the-art superpixel methods.IEEE transactions on pattern analysis and machine intelligence,34(11),2274-2282.
Allene,C.,Pons,J.,&Keriven,R.(2008).Seamless image-based texture atlases using multi-band blending.Pattern Recognition,2008.ICPR 2008.19th International Conference on,(pp.1--4).
Barsky,S.,&Petrou,M.(2003).The 4-source photometric stereo technique for three-dimensional surfaces in the presence of highlights and shadows.IEEE Transactions on Pattern Analysis and Machine Intelligence,25(10),1239-1252.
Canny,J.(1986).A computational approach to edge detection.IEEE Transactions on pattern analysis and machine intelligence(6),679-698.
Crandall,D.,Owens,A.,Snavely,N.,&Huttenlocher,D.(2011).Discrete- continuous optimization for large-scale structure from motion.Computer Vision and Pattern Recognition(CVPR),2011 IEEE Conference on,(pp.3001--3008).
Farenzena,M.,Fusiello,A.,&Gherardi,R.(2009).Structure-and-motion pipeline on a hierarchical cluster tree.Computer Vision Workshops(ICCV Workshops),2009 IEEE 12th International Conference on,(pp.1489--1496).
Funayama,R.,Yanagihara,H.,Van Gool,L.,Tuytelaars,T.,&Bay,H.(2010).US Patent No.EP1850270 B1.
Furukawa,Y.,Curless,B.,Seitz,S.M.,&Szeliski,R.(2010).Towards internet-scale multi-view stereo.Computer Vision and Pattern Recognition (CVPR),2010 IEEE Conference on,(pp.1434-1441).
Garland,M.,&Heckbert,P.S.(1997).Surface simplification using quadric error metrics.Proceedings of the 24th annual conference on Computer graphics and interactive techniques,(pp.209-216).
Gherardi,R.,&Fusiello,A.(2010).Practical autocalibration.Computer Vision--ECCV 2010,(pp.790--801).
Harris,C.,&Stephens,M.(1988).A combined corner and edge detector.Alvey vision conference,15,p.50.
Hernandez Esteban,C.,&Schmitt,F.(2004).Silhouette and stereo fusion for 3D object modeling.Computer Vision and Image Understanding,367--392.
Hiep,V.H.,Keriven,R.,Labatut,P.,&Pons,J.-P.(2009).Towards high- resolution large-scale multi-view stereo.Computer Vision and Pattern Recognition,2009.CVPR 2009.IEEE Conference on,(pp.1430-1437).
Katz,S.,Tal,A.,&Basri,R.(2007).Direct visibility of point sets.ACM Transactions on Graphics(TOG),26,p.24.
Kazhdan,M.,Bolitho,M.,&Hoppe,H.(2006).Poisson surface reconstruction.Proceedings of the fourth Eurographics symposium on Geometry processing,7.
Laurentini,A.(1994).The visual hull concept for silhouette-based image understanding.IEEE Transactions on pattern analysis and machine intelligence,16(2),150-162.
Lindeberg,T.(1998).Feature detection with automatic scale selection.International journal of computer vision,79--116.
Lowe,D.G.(1999).Object recognition from local scale-invariant features.Computer vision,1999.The proceedings of the seventh IEEE international conference on,2,pp.1150-1157.
Lowe,D.G.(2004).US Patent No.US6711293 B1.
Popa,T.,Germann,M.,Keiser,R.,Gross,M.,&Ziegler,R.(2014).US Patent No.US20140219550 A1.
Seitz,S.M.,Curless,B.,Diebel,J.,Scharstein,D.,&Szeliski,R.(2006).A comparison and evaluation of multi-view stereo reconstruction algorithms.Computer vision and pattern recognition,2006 IEEE Computer Society Conference on,(pp.519--528).
Sinha,S.N.,Steedly,D.E.,&Szeliski,R.S.(2014).US Patent No.US8837811.
Tola,E.,Lepetit,V.,&Fua,P.(2010).Daisy:An efficient dense descriptor applied to wide-baseline stereo.Pattern Analysis and Machine Intelligence, IEEE Transactions on,815--830.
Toldo,R.,&Fusiello,A.(2008).Robust multiple structures estimation with j-linkage.European conference on computer vision,(pp.537-547).
Toldo,R.,&Fusiello,A.(2010).Photo-consistent planar patches from unstructured cloud of points.European Conference on Computer Vision,(pp.589- 602).
Toldo,R.,Gherardi,R.,Farenzena,M.,&Fusiello,A.(2015).Hierarchical structure-and-motion recovery from uncalibrated images.Computer Vision and Image Understanding,140,127-143.
Tong,X.,Li,J.,Hu,W.,Du,Y.,&Zhang,Y.(2013).US Patent No.US20130201187.
Triggs,B.(1997).Autocalibration and the absolute quadric.Computer Vision and Pattern Recognition,1997.Proceedings.,1997 IEEE Computer Society Conference on,(pp.609--614).
Triggs,B.,McLauchlan,P.F.,Hartley,R.I.,&Fitzgibbon,A.W.(1999).Bundle adjustment—a modern synthesis.International workshop on vision algorithms, (pp.298-372).
Woodham,R.J.(1979).Photometric stereo:A reflectance map technique for determining surface orientation from image intensity.22nd Annual Technical Symposium,(pp.136-143).
Wu,C.(2013).Towards linear-time incremental structure from motion.3DTV-Conference,2013 International Conference on,(pp.127--134).

Claims (23)

1. A computer-implemented method for creating a digital representation of at least an object surface of a physical object; wherein the method comprises:
obtaining input data comprising a plurality of captured images of the physical object and surface normal information of the object, the surface normal information being indicative of object surface normals associated with respective parts of the object surface;
creating a digital 3D representation of at least the object surface;
wherein creating the digital representation of the object surface is based at least on the obtained plurality of captured images and the obtained surface normal information, and comprises:
obtaining an intermediate representation of the object surface, the intermediate representation comprising a first part representing a first part of the object surface;
modifying the first part of the intermediate representation to obtain a modified representation;
wherein modifying the first part of the intermediate representation comprises:
determining a second part of the object surface in the vicinity of the first part of the object surface;
determining, from the obtained surface normal information, one or more object surface normals associated with the determined second part;
modifying the first part of the intermediate representation based at least in part on the determined one or more object surface normals.
2. The method according to claim 1; wherein the plurality of captured images comprises images, preferably more than two images, captured from respective viewpoints relative to the physical object.
3. The method according to any one of the preceding claims; wherein each object surface normal is indicative of a direction of the object surface at a location of the object surface associated with said object surface normal.
4. The method according to any one of the preceding claims; wherein the intermediate representation comprises a depth map indicative of distances from a reference position to respective locations on the object surface.
5. The method according to claim 4; wherein obtaining the intermediate representation comprises creating the depth map from the plurality of images.
6. The method according to claim 4 or 5; wherein the first part of the object surface comprises a hole in the depth map, and wherein modifying the first part of the intermediate representation comprises filling the hole.
7. The method according to claim 6; wherein determining the second part of the object surface comprises identifying the hole as a hole to be filled and determining a periphery of the identified hole.
8. The method according to claim 7; wherein identifying the hole as a hole to be filled comprises:
identifying a hole in the depth map; and
determining, based on the obtained surface normal information, whether the identified hole is a hole to be filled.
9. The method according to claim 8; wherein determining whether the identified hole is a hole to be filled comprises:
determining a first set of object surface normals associated with the periphery of the identified hole;
computing a first similarity measure of the determined first set of surface normals; and
comparing the computed first similarity measure with a first target similarity.
10. The method according to claim 8 or 9; wherein determining whether the identified hole is a hole to be filled comprises:
determining a second set of object surface normals associated with the identified hole;
computing a second similarity measure of the determined second set of object surface normals; and
comparing the computed second similarity measure with a second target similarity.
11. The method according to any one of the preceding claims; comprising an optimisation step for increasing a photo-consistency measure between the intermediate representation and the captured input data and/or normal maps derived from the surface normal information.
12. A method for creating a digital representation of at least an object surface of a physical object; wherein the method comprises:
obtaining input data comprising a plurality of captured images of the physical object and surface normal information of the object, the surface normal information being indicative of object surface normals associated with respective parts of the object surface;
creating a digital representation of the object surface;
wherein creating the digital representation is based at least on the obtained plurality of captured images and the obtained surface normal information, and comprises:
obtaining an intermediate representation of the object surface;
modifying a first part of the intermediate representation, based at least in part on the obtained surface normal information, to obtain a modified representation, the first part of the intermediate representation representing a first part of the object surface;
wherein modifying the first part of the intermediate representation comprises an optimisation step for increasing a photo-consistency measure between the intermediate representation and normal maps derived from the surface normal information.
13. The method according to claim 11 or 12; wherein the photo-consistency measure comprises a consistency measure between the obtained surface normal information and surface normal information obtained from the intermediate representation.
14. The method according to any one of the preceding claims; wherein modifying the first part of the intermediate representation comprises performing a bilateral filtering step.
15. The method according to claim 14; wherein the bilateral filtering step is followed by an optimisation step for increasing a photo-consistency measure between the modified representation and the captured input data and/or normal maps derived from the surface normal information.
16. The method according to claim 14 or 15; wherein the intermediate representation defines a virtual surface and comprises a mesh of surface elements, the mesh defining a mesh topology, each surface element defining a virtual surface normal, each surface element comprising a plurality of vertices, each vertex defining a position on the virtual surface; wherein the bilateral filtering step comprises: modifying a position of at least a first vertex of the plurality of vertices by computing a vertex displacement so as to decrease a difference measure between the object surface normals determined from the obtained surface normal information and the virtual surface normals; wherein the vertex displacement is constrained by the mesh topology.
17. The method according to claim 16; wherein the first vertex is associated with one or more surface elements, and wherein the vertex displacement is scaled by a size of the one or more surface elements associated with the first vertex.
18. The method according to any one of claims 14 to 17; wherein the bilateral filtering step comprises selecting one of the object surface normals represented by the obtained surface normal information, and associating the selected surface normal with a surface element associated with the first vertex.
19. A system for creating a digital representation of a physical object; the system comprising a data processing system configured to perform the steps of the method according to any one of claims 1 to 18.
20. The system according to claim 19, further comprising a scanning station comprising an object support for receiving the physical object.
21. The system according to any one of claims 19 to 20; further comprising an image capture device operable to capture two or more images of the physical object, wherein the two or more images are captured from different viewpoints relative to the physical object.
22. The system according to any one of claims 19 to 21; further comprising a plurality of toy construction elements configured to be detachably interconnected with each other so as to form a physical object in the form of a toy construction model.
23. A computer program product comprising program code means adapted to cause, when executed on a data processing system, said data processing system to perform the steps of the method according to any one of claims 1 to 18.
CN201780082775.XA 2016-11-22 2017-11-16 System for acquiring 3-dimensional digital representations of physical objects Active CN110168608B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
DKPA201670928 2016-11-22
DKPA201670928 2016-11-22
PCT/EP2017/079368 WO2018095789A1 (en) 2016-11-22 2017-11-16 System for acquiring a 3d digital representation of a physical object

Publications (2)

Publication Number Publication Date
CN110168608A true CN110168608A (en) 2019-08-23
CN110168608B CN110168608B (en) 2023-08-29

Family

ID=60421770

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780082775.XA Active CN110168608B (en) 2016-11-22 2017-11-16 System for acquiring 3-dimensional digital representations of physical objects

Country Status (5)

Country Link
US (1) US11049274B2 (en)
EP (1) EP3545497B1 (en)
CN (1) CN110168608B (en)
DK (1) DK3545497T3 (en)
WO (1) WO2018095789A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111627092A (en) * 2020-05-07 2020-09-04 江苏原力数字科技股份有限公司 Method for constructing high-strength bending constraint from topological relation
CN113570634A (en) * 2020-04-28 2021-10-29 北京达佳互联信息技术有限公司 Object three-dimensional reconstruction method and device, electronic equipment and storage medium

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3616403A1 (en) * 2017-04-27 2020-03-04 Google LLC Synthetic stereoscopic content capture
JP2018195241A (en) * 2017-05-22 2018-12-06 ソニー株式会社 Information processing apparatus, information processing method, and program
JP2019067323A (en) * 2017-10-05 2019-04-25 ソニー株式会社 Information processing apparatus, information processing method, and recording medium
JP6511681B1 (en) * 2018-10-15 2019-05-15 株式会社Mujin Shape information generation device, control device, unloading device, distribution system, program, and control method
CN109544677B (en) * 2018-10-30 2020-12-25 山东大学 Indoor scene main structure reconstruction method and system based on depth image key frame
US10839560B1 (en) * 2019-02-26 2020-11-17 Facebook Technologies, Llc Mirror reconstruction
US11109010B2 (en) * 2019-06-28 2021-08-31 The United States of America As Represented By The Director Of The National Geospatial-Intelligence Agency Automatic system for production-grade stereo image enhancements
US10977855B1 (en) * 2019-09-30 2021-04-13 Verizon Patent And Licensing Inc. Systems and methods for processing volumetric data using a modular network architecture
US11011071B1 (en) 2020-03-30 2021-05-18 Mobilizar Technologies Pvt Ltd Interactive learning system and a method
US20230390641A1 (en) 2020-11-16 2023-12-07 Lego A/S Toy system for multiplayer social play
CN116342785A (en) * 2023-02-24 2023-06-27 北京字跳网络技术有限公司 Image processing method, device, equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150062120A1 (en) * 2013-08-30 2015-03-05 Qualcomm Incorporated Method and apparatus for representing a physical scene
CN105027562A (en) * 2012-12-28 2015-11-04 Metaio有限公司 Method of and system for projecting digital information on a real object in a real environment
US20160253807A1 (en) * 2015-02-26 2016-09-01 Mitsubishi Electric Research Laboratories, Inc. Method and System for Determining 3D Object Poses and Landmark Points using Surface Patches
CN106133796A (en) * 2014-03-25 2016-11-16 Metaio有限公司 For representing the method and system of virtual objects in the view of true environment

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6711293B1 (en) 1999-03-08 2004-03-23 The University Of British Columbia Method and apparatus for identifying scale invariant features in an image and use of same for locating an object in an image
US8023724B2 (en) 1999-07-22 2011-09-20 Photon-X, Inc. Apparatus and method of information extraction from electromagnetic energy based upon multi-characteristic spatial geometry processing
US6750873B1 (en) 2000-06-27 2004-06-15 International Business Machines Corporation High quality texture reconstruction from multiple scans
EP1182744B1 (en) 2000-08-19 2004-07-14 Spinner GmbH Elektrotechnische Fabrik Phase balancing means for a coaxial cable and connector therefore
GB0023681D0 (en) 2000-09-27 2000-11-08 Canon Kk Image processing apparatus
GB2370737B (en) * 2000-10-06 2005-03-16 Canon Kk Image processing apparatus
GB2378337B (en) * 2001-06-11 2005-04-13 Canon Kk 3D Computer modelling apparatus
DE10344922B4 (en) 2003-09-25 2008-06-26 Siemens Audiologische Technik Gmbh All-scanner
ATE470912T1 (en) 2006-04-28 2010-06-15 Toyota Motor Europ Nv ROBUST DETECTOR AND DESCRIPTOR FOR A POINT OF INTEREST
US7602234B2 (en) * 2007-07-24 2009-10-13 Ati Technologies Ulc Substantially zero temperature coefficient bias generator
KR20120089452A (en) 2009-08-04 2012-08-10 아이큐 비젼 테크놀로지즈 리미티드 System and method for object extraction
US8837811B2 (en) 2010-06-17 2014-09-16 Microsoft Corporation Multi-stage linear structure from motion
ES2812578T3 (en) 2011-05-13 2021-03-17 Vizrt Ag Estimating a posture based on silhouette
EP2754130A4 (en) 2011-08-09 2016-01-06 Intel Corp Image-based multi-view 3d face generation
US20150178988A1 (en) * 2012-05-22 2015-06-25 Telefonica, S.A. Method and a system for generating a realistic 3d reconstruction model for an object or being
US9715761B2 (en) 2013-07-08 2017-07-25 Vangogh Imaging, Inc. Real-time 3D computer vision processing engine for object recognition, reconstruction, and analysis
TWI573433B (en) 2014-04-30 2017-03-01 聚晶半導體股份有限公司 Method and apparatus for optimizing depth information
WO2015185629A2 (en) 2014-06-06 2015-12-10 Lego A/S Interactive game apparatus and toy construction system
JP6514089B2 (en) * 2015-11-02 2019-05-15 株式会社ソニー・インタラクティブエンタテインメント INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, AND INFORMATION PROCESSING METHOD
US10074160B2 (en) * 2016-09-30 2018-09-11 Disney Enterprises, Inc. Point cloud noise and outlier removal for image-based 3D reconstruction

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105027562A (en) * 2012-12-28 2015-11-04 Metaio有限公司 Method of and system for projecting digital information on a real object in a real environment
US20150062120A1 (en) * 2013-08-30 2015-03-05 Qualcomm Incorporated Method and apparatus for representing a physical scene
CN106133796A (en) * 2014-03-25 2016-11-16 Metaio有限公司 For representing the method and system of virtual objects in the view of true environment
US20160253807A1 (en) * 2015-02-26 2016-09-01 Mitsubishi Electric Research Laboratories, Inc. Method and System for Determining 3D Object Poses and Landmark Points using Surface Patches

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113570634A (en) * 2020-04-28 2021-10-29 北京达佳互联信息技术有限公司 Object three-dimensional reconstruction method and device, electronic equipment and storage medium
CN111627092A (en) * 2020-05-07 2020-09-04 江苏原力数字科技股份有限公司 Method for constructing high-strength bending constraint from topological relation

Also Published As

Publication number Publication date
WO2018095789A1 (en) 2018-05-31
EP3545497B1 (en) 2021-04-21
US20200143554A1 (en) 2020-05-07
CN110168608B (en) 2023-08-29
DK3545497T3 (en) 2021-07-05
US11049274B2 (en) 2021-06-29
EP3545497A1 (en) 2019-10-02

Similar Documents

Publication Publication Date Title
CN110168608A (en) The system that 3 dimension words for obtaining physical object indicate
Munkberg et al. Extracting triangular 3d models, materials, and lighting from images
Hartmann et al. Learned multi-patch similarity
Gomes et al. 3D reconstruction methods for digital preservation of cultural heritage: A survey
Bernardini et al. The 3D model acquisition pipeline
Yu et al. Extracting objects from range and radiance images
CN109636831A (en) A method of estimation 3 D human body posture and hand information
Albarelli et al. Fast and accurate surface alignment through an isometry-enforcing game
CN109242961A (en) A kind of face modeling method, apparatus, electronic equipment and computer-readable medium
Lombardi et al. Radiometric scene decomposition: Scene reflectance, illumination, and geometry from rgb-d images
Kordelas et al. State-of-the-art algorithms for complete 3d model reconstruction
Pound et al. A patch-based approach to 3D plant shoot phenotyping
EP3997670A1 (en) Methods of estimating a bare body shape from a concealed scan of the body
CN109641150A (en) Method for creating virtual objects
JP2021026759A (en) System and method for performing 3d imaging of objects
Yan et al. Flower reconstruction from a single photo
Boom et al. Interactive light source position estimation for augmented reality with an RGB‐D camera
Goel et al. Shape from tracing: Towards reconstructing 3d object geometry and svbrdf material from images via differentiable path tracing
Nguyen et al. A robust hybrid image-based modeling system
Takimoto et al. Shape reconstruction from multiple RGB-D point cloud registration
Deng et al. Shading‐Based Surface Recovery Using Subdivision‐Based Representation
Nguyen et al. A hybrid image-based modelling algorithm
Han et al. 3D human model reconstruction from sparse uncalibrated views
Zeng et al. 3D plants reconstruction based on point cloud
Asthana Neural representations for object capture and rendering

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant