
CN109377530B - Binocular depth estimation method based on a deep neural network

Binocular depth estimation method based on a deep neural network

Info

Publication number
CN109377530B
CN109377530B (application number CN201811453789.0A)
Authority
CN
China
Prior art keywords
image
network
depth
layer
scale
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811453789.0A
Other languages
Chinese (zh)
Other versions
CN109377530A (en)
Inventor
侯永宏
吕晓冬
许贤哲
陈艳芳
赵健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Kaitong Information Technology Service Co ltd
Zhejiang Qiqiao Lianyun Biosensor Technology Co ltd
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201811453789.0A priority Critical patent/CN109377530B/en
Publication of CN109377530A publication Critical patent/CN109377530A/en
Application granted granted Critical
Publication of CN109377530B publication Critical patent/CN109377530B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a binocular depth estimation method based on a deep neural network, which comprises the following steps: 1) preprocessing the input left and right viewpoint images for data enhancement; 2) constructing a multi-scale network model for binocular depth estimation, the model comprising convolutional layers, activation layers, residual connections, multi-scale pooling connections and linear upsampling layers; 3) designing a loss function that is minimized during training to obtain the optimal network weights; 4) inputting the images to be processed into the network model to obtain the corresponding depth map, and repeating these steps until the network converges or the set number of training iterations is reached. The invention adopts the idea of unsupervised learning and uses only the left and right viewpoint images acquired by a binocular camera as network input. The adaptive design of the network sets the internal and external camera parameters as independent model parameters, so the method can be applied to multiple camera systems without modifying the network.

Description

Binocular depth estimation method based on a deep neural network
Technical Field
The invention belongs to the field of multimedia image processing, relates to computer vision and deep learning technology, and discloses a binocular depth estimation method based on a deep neural network.
Background Art
Depth estimation has long been a popular research direction in computer vision. The three-dimensional information provided by a depth map supports applications such as three-dimensional reconstruction, augmented reality (AR) and intelligent navigation. At the same time, the positional relationships expressed by a depth map are important in many image tasks and can further simplify image processing algorithms. Current depth estimation methods fall into two main categories: monocular depth estimation and binocular depth estimation.
Monocular depth estimation uses only one camera. In traditional algorithms the camera captures consecutive image frames and the image depth is estimated by projective transformation through an inter-frame motion model. Monocular depth estimation based on deep learning trains a deep neural network on a dataset with ground-truth depth information and regresses depth with the learned network. Such methods need only simple, low-cost equipment and can handle dynamic scenes, but because scale information is missing the estimated depth is usually not accurate enough, and performance often degrades severely in unknown scenes. Binocular estimation uses two calibrated cameras that view the same object from two different perspectives: the same spatial point is located in both views, the disparity between corresponding pixels is computed, and the disparity is converted to depth by triangulation. Traditional binocular estimation relies on stereo matching, which is computationally expensive and performs poorly in low-texture scenes. Binocular depth estimation based on deep learning mostly adopts supervised learning; thanks to the strong learning capacity of neural networks, its accuracy and speed are greatly improved over traditional methods.
However, supervised learning usually depends heavily on ground-truth values, which may suffer from error and noise, sparse depth information, and hardware that is difficult to calibrate, so the estimated depth values are not accurate enough. Unsupervised learning has been regarded as a direction in which artificial intelligence can truly learn by itself in the real world, and in recent years image depth estimation based on unsupervised learning has therefore become a research hotspot.
Disclosure of Invention
The invention aims to provide a binocular depth estimation method based on a deep neural network, which adopts the idea of unsupervised learning, uses only the left and right viewpoint images acquired by a binocular camera as network input, and does not require the depth information of the input images to be acquired in advance as training labels. Meanwhile, the adaptive design of the network sets the internal and external camera parameters as independent model parameters, so the method is suitable for multiple camera systems without modifying the network. In addition, the neural network is largely unaffected by illumination, noise and the like, and has high robustness.
The technical scheme for realizing the purpose of the invention is as follows:
a binocular depth estimation method based on a depth neural network comprises the following steps:
1) performing corresponding image preprocessing such as cutting and transformation on the input left and right viewpoint images to perform data enhancement, wherein the image preprocessing comprises mild affine deformation, random horizontal rotation, random scale jitter, random contrast, brightness, saturation, sharpness and the like, so that the number of samples is further increased, the training optimization of network parameters is facilitated, and the generalization capability of a network is enhanced;
2) and constructing a multi-scale network model of binocular depth estimation, wherein the model comprises a plurality of convolution layers, an activation layer, a residual connection, a multi-scale pooling connection, a linear up-sampling layer and the like.
(a) The network uses three residual network structures to perform multi-scale convolution on the input, and each residual module contains two convolutional layers and an identity mapping. Except for the first convolutional layer, whose kernel is 7 × 7, all convolution kernels are 3 × 3.
(b) The second, sixth and fourteenth layers of the network are multi-scale pooling modules: average pooling is applied to the outputs of the second and sixth layers with a stride of 4 and kernel size 4 × 4 and with a stride of 2 and kernel size 2 × 2, respectively, and the pooled features are fused with the output of the fourteenth layer through a 1 × 1 convolution.
(c) The left and right views are processed by the front-end network, and after the multi-scale pooling module the feature information of the two views is associated through a feature correlation operation, which computes the feature correlation between the two views:
c(x1, x2) = Σ_{o ∈ [-k,k]×[-k,k]} <f_l(x1 + o), f_r(x2 + o)>
where c is the correlation between the image block centered at x1 in the left feature map and the image block centered at x2 in the right feature map, f_l is the left feature map, f_r is the right feature map, and the image block size is k × k.
(d) The network then restores the original image resolution from the correlation features, obtaining depth maps at different scales through deconvolution and upsampling. In the linear upsampling operation, the output of the previous layer is enlarged by bilinear interpolation and connected to the corresponding upsampling layer through residual (skip-layer) learning, finally restoring the image to its original size.
3) Setting initialization parameters for the designed network model, and designing a loss function that is minimized during training to obtain the optimal network weights.
The pixel values of the left and right views input to the network are denoted I_l and I_r, respectively. When the network obtains the predicted depth map of the left image D_l, the camera intrinsic matrix K⁻¹ is used to convert I_r from the image coordinate system to the camera coordinate system, the extrinsic matrix T converts it to the camera coordinate system of the left image, and the intrinsic matrix K converts it again to the image coordinate system of the left image, giving the transition image I_l'. The specific formula is:
p_l' ~ K T D_l(p_r) K⁻¹ p_r
where p_l' denotes the pixel coordinates in the transition image and p_r is the corresponding image pixel. The projection transformation makes the pixel coordinates in the transition image continuous values, so the pixel value at each coordinate is determined with a 4-neighborhood interpolation method, finally giving the target image I_l*:
I_l*(p) = Σ_{a,b} w_ab · I_l'(p_ab)
where w_ab is proportional to the spatial distance between the target point and the adjacent point p_ab, and Σ_{a,b} w_ab = 1.
The reconstruction loss function is constructed with the Huber loss:
L_rec = (1/N) Σ Huber(x), with Huber(x) = 0.5 x² for |x| ≤ c and Huber(x) = c(|x| − 0.5c) for |x| > c
where x is the difference between corresponding pixels of the target image and the input image, N is the number of image pixels, and c is an empirically set threshold.
4) Inputting the image to be processed into the network model to obtain the corresponding depth map, and repeating the above steps until the network converges or the set number of training iterations is reached.
The invention provides a deep neural network based on unsupervised learning, which trains a network model on left and right images without ground-truth depth information and outputs a monocular depth map. The invention exploits the multiple viewpoints of a binocular camera and uses a representation-learning method with multi-layer representations, namely a convolutional neural network, to realize the mapping from a binocular image input to a monocular depth-map output. The network model obtains receptive fields of different scales through multiple downsampling operations, extracts features of the input image with a residual structure, and uses a multi-scale pooling module to strengthen the local texture details of the image, thereby improving the accuracy and robustness of the network model. The upsampling layers use bilinear interpolation, and a residual structure is reused to learn the information of multiple upsampling layers, which reduces the information lost while restoring the image size and further ensures the accuracy of the depth estimation.
The invention has the advantages and beneficial effects that:
1. the binocular depth estimation method based on the depth neural network is based on an unsupervised learning method, and the accuracy of the predicted depth value is ensured by utilizing the strong learning capacity of the depth convolution network.
2. The invention uses residual error connection for feature extraction for multiple times, completes multi-scale information fusion by utilizing skip layer connection in up-sampling, reduces the loss and loss of the traditional convolution in information transmission to a certain extent, ensures the integrity of information and greatly improves the network convergence speed.
3. According to the method, images with different scales are obtained through multiple downsampling, and different receptive fields of the images are obtained through a multi-scale pooling module to strengthen local texture details.
4. The characteristic correlation operation in the network carries out the characteristic correlation of the left view and the right view, is not easily influenced by noise, and improves the robustness of the network model.
5. The input image of the network does not have real depth information, the network calculates a target image by predicting a depth image, camera parameters and original input, and constructs a loss function by constructing a difference value between the target image and the original input so as to realize network parameter optimization, so that the whole network can finish training in an unsupervised learning mode.
6. The parameter information of the camera is set outside the network as a part of network parameters, so that the model is suitable for various camera systems with different configurations and has strong self-adaptive capability.
Drawings
Fig. 1 is a diagram of a neural network model for binocular depth estimation.
Detailed Description
The present invention will be described in further detail with reference to the following embodiments, which are illustrative only and not limiting, and the scope of the present invention is not limited thereby.
1) Performing image preprocessing such as cropping and transformation on the input left and right viewpoint images for data enhancement.
The invention uses the left- and right-view images acquired by a binocular camera as network input and can output a monocular depth map in either the left or the right camera coordinate system. For convenience of description, the output monocular depth maps mentioned herein are all depth maps of the left image. The input images must be RGB images of the left and right views, so the artificially synthesized SceneFlow dataset and part of the real-environment KITTI 2015 dataset are adopted as training data. The large SceneFlow dataset contains 39,000 binocular images at 960 × 540 resolution with corresponding depth maps, and this large amount of training data guarantees the learning capacity of the convolutional neural network. However, the SceneFlow images are synthetic, so they differ to some extent from images acquired in the real world. To strengthen the model in everyday scenes, this example fine-tunes the model on the KITTI 2015 dataset to adapt it to real scenes. The KITTI 2015 dataset contains 200 binocular images and corresponding sparse depth maps. Because the method is unsupervised, the ground-truth depth maps in the SceneFlow and KITTI 2015 datasets are not used. The high resolution of the images slows network training, so the images are randomly cropped to 320 × 180 to improve training speed. In addition, the images are preprocessed with mild affine deformation, random horizontal rotation, random scale jitter, and random contrast, brightness, saturation and sharpness changes, which further increases the number of samples, facilitates the training optimization of the network parameters, and enhances the generalization capability of the network.
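As a concrete illustration of this preprocessing, the following is a minimal sketch in PyTorch/torchvision; the crop size, jitter ranges and choice of transforms are assumptions for illustration rather than the patent's exact parameters, and the random horizontal rotation step is omitted here.

```python
import random

import torchvision.transforms as T
from torchvision.transforms import functional as TF


def augment_stereo_pair(left, right, crop_size=(180, 320)):
    """Apply the same random crop and photometric jitter to a left/right PIL image pair."""
    # Random crop: use one window for both views so the rectified geometry is preserved.
    i, j, h, w = T.RandomCrop.get_params(left, output_size=crop_size)
    left, right = TF.crop(left, i, j, h, w), TF.crop(right, i, j, h, w)

    # Mild affine deformation and scale jitter, identical for both views.
    angle = random.uniform(-2.0, 2.0)
    scale = random.uniform(0.95, 1.05)
    left = TF.affine(left, angle=angle, translate=(0, 0), scale=scale, shear=0)
    right = TF.affine(right, angle=angle, translate=(0, 0), scale=scale, shear=0)

    # Random brightness / contrast / saturation / sharpness, identical for both views.
    for adjust in (TF.adjust_brightness, TF.adjust_contrast,
                   TF.adjust_saturation, TF.adjust_sharpness):
        factor = random.uniform(0.8, 1.2)
        left, right = adjust(left, factor), adjust(right, factor)

    return left, right
```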
2) Constructing a multi-scale network model for binocular depth estimation, the model comprising convolutional layers, activation layers, residual connections, multi-scale pooling connections and linear upsampling layers.
a) To reduce the number of network parameters on a large scale so that the model converges more easily and has stronger feature expression capability, the network uses 3 residual modules to extract features from the input image. Except for the first layer, the remaining convolutional layers all use a small 3 × 3 convolution kernel to better retain edge information. Batch normalization is applied after each convolutional layer to keep the data distribution stable, and a ReLU activation function follows each convolutional layer to prevent vanishing gradients during training. The output of each residual block is downsampled, and a multi-scale pooling module applies average pooling of different sizes to the residual-block inputs and reduces the dimensionality through a 1 × 1 convolutional layer, so that different feature information is perceived at different scales while the training parameters of the network are greatly reduced.
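A minimal PyTorch sketch of one residual block and of the multi-scale average-pooling fusion described above; channel widths and the module interface are illustrative assumptions rather than the patent's exact layer configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResBlock(nn.Module):
    """Two 3x3 convolutions with batch normalization, ReLU and an identity shortcut."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # identity mapping (residual connection)


class MultiScalePool(nn.Module):
    """Average-pool shallow features to the deepest resolution and fuse with a 1x1 conv."""

    def __init__(self, c_shallow, c_mid, c_deep, c_out):
        super().__init__()
        self.fuse = nn.Conv2d(c_shallow + c_mid + c_deep, c_out, kernel_size=1)

    def forward(self, f_shallow, f_mid, f_deep):
        # The shallow map is 4x larger and the mid map 2x larger than the deep map,
        # matching the 4x4/stride-4 and 2x2/stride-2 pooling described above.
        p1 = F.avg_pool2d(f_shallow, kernel_size=4, stride=4)
        p2 = F.avg_pool2d(f_mid, kernel_size=2, stride=2)
        return self.fuse(torch.cat([p1, p2, f_deep], dim=1))
```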
b) After the input image passes through the three residual modules and the multi-scale pooling module and its dimensionality is reduced, a feature map at one eighth of the original resolution is obtained. The left and right branches of the network share weights, and the correlation operation computes the feature correlation of the two maps with the formula:
c(x1, x2) = Σ_{o ∈ [-k,k]×[-k,k]} <f_l(x1 + o), f_r(x2 + o)>
in the formula, x1The feature block of the left image which is taken as the center can carry out correlation operation with all the feature blocks of the right image, and matching features from one point in the left image to all points in the right image are calculated in a traversing mode. The matrix can be regarded as matching cost of the feature blocks at different depths, and then depth regression is selected to be regarded as a classification problem. In the deep regression, firstly, the softmax function is utilized
Figure BDA0001887247680000052
j is 1, …, K, and K matching costs at the depth are converted into a probability distribution of the depth and then passed
Figure BDA0001887247680000053
Weighted summation mode obtains more stable depth estimation
Figure BDA0001887247680000054
Wherein
Figure BDA0001887247680000055
Indicating the depth of the predicted pixel, DmaxRepresenting the maximum disparity to be estimated, d being the respective depth values corresponding to the depth probability distribution, CdThe matching cost is expressed, and the final output is the weighted sum of all possible depths of the pixel point and the possibility of the depth.
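A sketch of the correlation cost volume and the softmax-weighted depth regression described above, assuming a rectified stereo pair so that candidate matches lie along the same row; the negation inside the softmax follows the usual soft-argmin convention and, like max_disp, is an assumption for illustration.

```python
import torch
import torch.nn.functional as F


def correlation_volume(feat_l, feat_r, max_disp):
    """Inner-product correlation between left features and right features shifted
    over candidate disparities; returns a (B, max_disp, H, W) volume."""
    b, c, h, w = feat_l.shape
    volume = feat_l.new_zeros(b, max_disp, h, w)
    for d in range(max_disp):
        if d == 0:
            volume[:, d] = (feat_l * feat_r).mean(dim=1)
        else:
            volume[:, d, :, d:] = (feat_l[:, :, :, d:] * feat_r[:, :, :, :-d]).mean(dim=1)
    return volume


def soft_argmin_depth(cost, disp_values):
    """Turn per-disparity matching costs into a probability distribution with softmax
    and take the expectation over the candidate values (weighted sum)."""
    prob = F.softmax(-cost, dim=1)               # lower cost -> higher probability
    disp = disp_values.view(1, -1, 1, 1)
    return (prob * disp).sum(dim=1, keepdim=True)


# Usage: the correlation is a similarity, so negate it to obtain a matching cost.
# corr = correlation_volume(feat_l, feat_r, max_disp=48)
# depth = soft_argmin_depth(-corr, torch.arange(48, dtype=torch.float32))
```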
c) The small-scale matching cost is bilinearly interpolated, the upsampled cost is added to the next larger scale, and the information of the multiple upsampling layers is connected with skip-layer (residual) connections. Residual learning during upsampling makes full use of multi-scale information, so the network further refines the depth estimate on the basis of the estimate at the previous scale, and the network also becomes easier to train.
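A minimal sketch of one such coarse-to-fine refinement step: the coarse estimate is upsampled with bilinear interpolation, and the finer scale only predicts a residual correction. The module name and channel layout are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class UpRefine(nn.Module):
    """Upsample a coarse depth/cost estimate and add a learned residual at the finer scale."""

    def __init__(self, feat_channels):
        super().__init__()
        self.residual = nn.Conv2d(feat_channels + 1, 1, kernel_size=3, padding=1)

    def forward(self, coarse, fine_feat):
        up = F.interpolate(coarse, size=fine_feat.shape[-2:],
                           mode='bilinear', align_corners=False)
        # Skip-layer (residual) connection: the finer scale only learns a correction
        # to the upsampled coarse estimate, which eases training.
        return up + self.residual(torch.cat([up, fine_feat], dim=1))
```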
3) Setting initialization parameters for the designed network model, and designing a loss function that is minimized during training to obtain the optimal network weights.
A key point of the invention is how to realize unsupervised learning: the network needs a reasonable loss function to train, optimize and adjust its parameters. Assuming the prediction target is the depth map of the left image, the network predicts the left depth map D_l. To obtain the target image I_l*, the input right image I_r in the image coordinate system is first converted to the camera coordinate system of the right image with the intrinsic matrix K⁻¹; following the stereo matching principle, the predicted left depth map D_l and the extrinsic matrix T are used to perform the corresponding projection transformation, giving an image in the left camera coordinate system, and the matrix K transforms the coordinate system again to obtain the transition image I_l' in the left image coordinate system. From the binocular camera projection conversion formula:
p_l' ~ K T D_l(p_r) K⁻¹ p_r
where p_l' denotes the pixel coordinates in the transition image and p_r is the corresponding image pixel. Because of the nature of the projective transformation, the coordinates in the transition image I_l' become continuous values, so linear interpolation over the 4 adjacent pixels is used. The 4 adjacent pixels are the top-left, bottom-left, top-right and bottom-right neighbors, and the interpolation formula is:
I_l*(p) = Σ_{a,b} w_ab · I_l'(p_ab)
where I_l*(p) is the corresponding pixel value of the target image, w_ab is proportional to the spatial distance between the target point and the adjacent point p_ab, and Σ_{a,b} w_ab = 1.
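The projection and 4-neighborhood interpolation above can be implemented as an inverse warp, where grid_sample performs the bilinear (4-neighbor) sampling. This is a hedged sketch assuming a pinhole model and a T that maps left-camera coordinates to right-camera coordinates; tensor shapes and sign conventions are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def synthesize_left_view(img_right, depth_left, K, T):
    """Warp the right image into the left view using the predicted left depth map.
    img_right: (B,3,H,W), depth_left: (B,1,H,W), K: (B,3,3), T: (B,4,4)."""
    b, _, h, w = depth_left.shape
    device = depth_left.device

    # Pixel grid of the left image in homogeneous coordinates, shape (B, 3, H*W).
    ys, xs = torch.meshgrid(torch.arange(h, device=device),
                            torch.arange(w, device=device), indexing='ij')
    ones = torch.ones_like(xs)
    pix = torch.stack([xs, ys, ones], dim=0).float().view(1, 3, -1).expand(b, -1, -1)

    # Back-project with K^-1 and the predicted depth, move to the other camera with T,
    # then re-project with K (the p' ~ K T D(p) K^-1 p formula above).
    cam = torch.inverse(K) @ pix * depth_left.view(b, 1, -1)
    cam_h = torch.cat([cam, torch.ones(b, 1, h * w, device=device)], dim=1)
    proj = K @ (T @ cam_h)[:, :3, :]
    uv = proj[:, :2, :] / proj[:, 2:3, :].clamp(min=1e-6)

    # Normalize to [-1, 1] and sample the right image; grid_sample's bilinear mode
    # weights the 4 neighbors of each continuous coordinate.
    u = 2.0 * uv[:, 0] / (w - 1) - 1.0
    v = 2.0 * uv[:, 1] / (h - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).view(b, h, w, 2)
    return F.grid_sample(img_right, grid, mode='bilinear',
                         padding_mode='border', align_corners=True)
```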
Therefore, the reconstruction loss function is given by:
L_rec = (1/N) Σ Huber(x)
where
Huber(x) = 0.5 x² for |x| ≤ c
Huber(x) = c (|x| − 0.5 c) for |x| > c
In the formula, x represents the difference between corresponding pixels of the target image and the input image, N is the number of pixels in the image, and c is an empirically set threshold, set to 1 in this embodiment.
The behaviour of the Huber loss changes at the threshold c: within the range |x| ≤ c the quadratic term gives well-behaved gradients for small residuals, while beyond c the linear term handles large residuals more robustly, so the two kinds of loss are effectively balanced.
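A minimal sketch of this reconstruction loss with the threshold c = 1 used in this embodiment; averaging over all pixels (and channels) is an assumption.

```python
import torch


def reconstruction_loss(target, synthesized, c=1.0):
    """Mean Huber penalty on the per-pixel difference between the synthesized view
    and the original input view."""
    x = target - synthesized
    abs_x = x.abs()
    quadratic = 0.5 * x ** 2         # small residuals: smooth, L2-like gradients
    linear = c * (abs_x - 0.5 * c)   # large residuals: robust, L1-like behaviour
    return torch.where(abs_x <= c, quadratic, linear).mean()
```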
The input images of the network carry no ground-truth depth information; instead, the original input image is reconstructed from the predicted depth map and the camera parameter matrices and used as the supervisory signal to optimize the training parameters, realizing unsupervised learning of the network. Meanwhile, the camera parameters can be modified externally during the optimization of network training, so the model is suitable for multiple camera systems and is adaptive.
4) Inputting the image to be processed into the network model to obtain the corresponding depth map, and repeating the above steps until the network converges or the set number of training iterations is reached.
In this example, the synthetic large-scale SceneFlow dataset is used for pre-training and the KITTI 2015 dataset is then used for fine-tuning, so the network achieves high accuracy in everyday real scenes and the method has good generality.
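Putting the pieces together, the following is a hedged sketch of the training loop for step 4), reusing the synthesize_left_view and reconstruction_loss sketches above; the model interface, optimizer choice and hyper-parameters are illustrative assumptions, and only left/right image pairs (no ground-truth depth) are consumed.

```python
import torch


def train(model, loader, K, T, epochs=20, lr=1e-4, device='cuda'):
    """Unsupervised training: the left view itself is the only supervisory signal."""
    model.to(device).train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for left, right in loader:                   # no depth labels required
            left, right = left.to(device), right.to(device)
            depth_left = model(left, right)          # predicted left depth map
            target = synthesize_left_view(right, depth_left, K, T)
            loss = reconstruction_loss(left, target)
            opt.zero_grad()
            loss.backward()
            opt.step()


# e.g. pre-train on SceneFlow, then fine-tune the same weights on KITTI 2015
# (K and T are the intrinsic/extrinsic matrices of the respective camera setup):
# train(model, sceneflow_loader, K_sf, T_sf)
# train(model, kitti_loader, K_kitti, T_kitti)
```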
The above description covers only preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto; any substitution or modification of the technical solution and inventive concept of the present invention that a person skilled in the art can conceive within the scope disclosed by the present invention, and any equivalent thereof, falls within the protection scope of the present invention.

Claims (4)

1. A binocular depth estimation method based on a deep neural network, comprising the following steps:
1) the input left and right viewpoint images are preprocessed to enhance the data;
2) constructing a multi-scale network model of binocular depth estimation, wherein the model comprises a plurality of convolution layers, an activation layer, a residual connection, a multi-scale pooling connection and a linear up-sampling layer;
3) setting initialization parameters for the designed multi-scale network model, and designing a loss function that is minimized during training to obtain the optimal network weights;
4) inputting the image to be processed into the network model to obtain the corresponding depth map, and repeating the above steps until the network converges or the set number of training iterations is reached;
the step 3) is specifically as follows: the pixel values of the left and right views input to the network are denoted I_l and I_r, respectively; when the network obtains the predicted depth map of the left image D_l, the camera intrinsic matrix K⁻¹ is used to convert I_r from the image coordinate system to the camera coordinate system, the extrinsic matrix T converts it to the camera coordinate system of the left image, and the intrinsic matrix K converts it again to the image coordinate system of the left image, giving the transition image I_l'; the specific formula is:
p_l' ~ K T D_l(p_r) K⁻¹ p_r
where p_l' denotes the pixel coordinates in the transition image and p_r is the corresponding image pixel; the projection transformation makes the pixel coordinates in the transition image continuous values, so the pixel value at each coordinate is determined with a 4-neighborhood interpolation method, finally giving the target image I_l*:
I_l*(p) = Σ_{a,b} w_ab · I_l'(p_ab)
in the formula, I_l*(p) is the corresponding pixel value of the target image, a and b are the coordinate indices of each adjacent point, and w_ab is the weight of the pixel value at the corresponding coordinate, which is proportional to the spatial distance between the target point p_l' and the adjacent point p_ab, with Σ_{a,b} w_ab = 1;
the reconstruction loss function is constructed with the Huber loss function:
L_rec = (1/N) Σ Huber(x)
Huber(x) = 0.5 x² for |x| ≤ c
Huber(x) = c (|x| − 0.5 c) for |x| > c
in the above formula, x represents the difference between corresponding pixels of the target image and the input image, N is the number of pixels in the image, and c is an empirically set threshold.
2. The binocular depth estimation method based on a deep neural network according to claim 1, wherein: the multi-scale network model uses three residual network structures to perform multi-scale convolution on the input, each residual module comprises two convolutional layers and an identity mapping, the second, sixth and fourteenth layers of the network are multi-scale pooling modules, average pooling is applied to the outputs of the second and sixth layers, and a 1 × 1 convolution is applied together with the output of the fourteenth layer.
3. The binocular depth estimation method based on a deep neural network according to claim 2, wherein: the left and right views are processed by the front-end network, and after the multi-scale pooling module the feature information of the two views is associated through a feature correlation operation, which computes the feature correlation between the two views:
c(x1, x2) = Σ_{o ∈ [-k,k]×[-k,k]} <f_l(x1 + o), f_r(x2 + o)>
where c is the correlation between the image block centered at x1 in the left feature map and the image block centered at x2 in the right feature map, f_l is the left feature map, f_r is the right feature map, and the image block size is k × k.
4. The binocular depth estimation method based on a deep neural network according to claim 3, wherein: the network restores the original resolution of the image from the correlation features, obtains depth maps of different scales through deconvolution and upsampling, generates an image by bilinear interpolation from the output of the previous layer in the linear upsampling operation, performs skip-layer connection with the upper upsampling layer through residual learning, and finally restores the image to its original size.
CN201811453789.0A 2018-11-30 2018-11-30 Binocular depth estimation method based on depth neural network Active CN109377530B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811453789.0A CN109377530B (en) 2018-11-30 2018-11-30 Binocular depth estimation method based on depth neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811453789.0A CN109377530B (en) 2018-11-30 2018-11-30 Binocular depth estimation method based on depth neural network

Publications (2)

Publication Number Publication Date
CN109377530A CN109377530A (en) 2019-02-22
CN109377530B true CN109377530B (en) 2021-07-27

Family

ID=65376554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811453789.0A Active CN109377530B (en) 2018-11-30 2018-11-30 Binocular depth estimation method based on depth neural network

Country Status (1)

Country Link
CN (1) CN109377530B (en)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948689B (en) * 2019-03-13 2022-06-03 北京达佳互联信息技术有限公司 Video generation method and device, electronic equipment and storage medium
CN110009674B (en) * 2019-04-01 2021-04-13 厦门大学 Monocular image depth of field real-time calculation method based on unsupervised depth learning
CN110490919B (en) * 2019-07-05 2023-04-18 天津大学 Monocular vision depth estimation method based on deep neural network
CN110298791B (en) * 2019-07-08 2022-10-28 西安邮电大学 Super-resolution reconstruction method and device for license plate image
CN110322499B (en) * 2019-07-09 2021-04-09 浙江科技学院 Monocular image depth estimation method based on multilayer characteristics
CN110414674B (en) * 2019-07-31 2021-09-10 浙江科技学院 Monocular depth estimation method based on residual error network and local refinement
CN111062900B (en) * 2019-11-21 2021-02-12 西北工业大学 Binocular disparity map enhancement method based on confidence fusion
CN111179330A (en) * 2019-12-27 2020-05-19 福建(泉州)哈工大工程技术研究院 Binocular vision scene depth estimation method based on convolutional neural network
CN113076966B (en) * 2020-01-06 2023-06-13 字节跳动有限公司 Image processing method and device, training method of neural network and storage medium
CN113496521B (en) * 2020-04-08 2022-10-18 复旦大学 Method and device for generating depth image and camera external parameter by using multiple color pictures
CN111753961B (en) * 2020-06-26 2023-07-28 北京百度网讯科技有限公司 Model training method and device, prediction method and device
CN112288788B (en) * 2020-10-12 2023-04-28 南京邮电大学 Monocular image depth estimation method
CN112543317B (en) * 2020-12-03 2022-07-12 东南大学 Method for converting high-resolution monocular 2D video into binocular 3D video
CN112261399B (en) * 2020-12-18 2021-03-16 安翰科技(武汉)股份有限公司 Capsule endoscope image three-dimensional reconstruction method, electronic device and readable storage medium
CN112652058B (en) * 2020-12-31 2024-05-31 广州华多网络科技有限公司 Face image replay method and device, computer equipment and storage medium
CN112767467B (en) * 2021-01-25 2022-11-11 郑健青 Double-image depth estimation method based on self-supervision deep learning
CN112785636B (en) * 2021-02-18 2023-04-28 上海理工大学 Multi-scale enhanced monocular depth estimation method
CN112837361B (en) * 2021-03-05 2024-07-16 浙江商汤科技开发有限公司 Depth estimation method and device, electronic equipment and storage medium
CN113239958A (en) * 2021-04-09 2021-08-10 Oppo广东移动通信有限公司 Image depth estimation method and device, electronic equipment and storage medium
US11803939B2 (en) * 2021-04-28 2023-10-31 Shanghai United Imaging Intelligence Co., Ltd. Unsupervised interslice super-resolution for medical images
CN113762358B (en) * 2021-08-18 2024-05-14 江苏大学 Semi-supervised learning three-dimensional reconstruction method based on relative depth training
CN113706599B (en) * 2021-10-29 2022-01-21 纽劢科技(上海)有限公司 Binocular depth estimation method based on pseudo label fusion
CN114170286B (en) * 2021-11-04 2023-04-28 西安理工大学 Monocular depth estimation method based on unsupervised deep learning
CN114782911B (en) * 2022-06-20 2022-09-16 小米汽车科技有限公司 Image processing method, device, equipment, medium, chip and vehicle
CN115966102B (en) * 2022-12-30 2024-10-08 中国科学院长春光学精密机械与物理研究所 Early warning braking method based on deep learning
CN117788843B (en) * 2024-02-27 2024-04-30 青岛超瑞纳米新材料科技有限公司 Carbon nanotube image processing method based on neural network algorithm

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102523464A (en) * 2011-12-12 2012-06-27 上海大学 Depth image estimating method of binocular stereo video
CN107204010A (en) * 2017-04-28 2017-09-26 中国科学院计算技术研究所 A kind of monocular image depth estimation method and system
CN107578436A (en) * 2017-08-02 2018-01-12 南京邮电大学 A kind of monocular image depth estimation method based on full convolutional neural networks FCN
CN107767413A (en) * 2017-09-20 2018-03-06 华南理工大学 A kind of image depth estimation method based on convolutional neural networks
CN108389226A (en) * 2018-02-12 2018-08-10 北京工业大学 A kind of unsupervised depth prediction approach based on convolutional neural networks and binocular parallax

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106157307B (en) * 2016-06-27 2018-09-11 浙江工商大学 A kind of monocular image depth estimation method based on multiple dimensioned CNN and continuous CRF

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102523464A (en) * 2011-12-12 2012-06-27 上海大学 Depth image estimating method of binocular stereo video
CN107204010A (en) * 2017-04-28 2017-09-26 中国科学院计算技术研究所 A kind of monocular image depth estimation method and system
CN107578436A (en) * 2017-08-02 2018-01-12 南京邮电大学 A kind of monocular image depth estimation method based on full convolutional neural networks FCN
CN107767413A (en) * 2017-09-20 2018-03-06 华南理工大学 A kind of image depth estimation method based on convolutional neural networks
CN108389226A (en) * 2018-02-12 2018-08-10 北京工业大学 A kind of unsupervised depth prediction approach based on convolutional neural networks and binocular parallax

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Unsupervised Monocular Depth Estimation with Left-Right Consistency";Clement Godard 等;《2017 IEEE Conference on Computer Vision and Pattern Recognition》;20170726;全文 *

Also Published As

Publication number Publication date
CN109377530A (en) 2019-02-22

Similar Documents

Publication Publication Date Title
CN109377530B (en) Binocular depth estimation method based on depth neural network
CN111325794B (en) Visual simultaneous localization and map construction method based on depth convolution self-encoder
CN113160375B (en) Three-dimensional reconstruction and camera pose estimation method based on multi-task learning algorithm
CN111259945B (en) Binocular parallax estimation method introducing attention map
CN110378838B (en) Variable-view-angle image generation method and device, storage medium and electronic equipment
WO2018000752A1 (en) Monocular image depth estimation method based on multi-scale cnn and continuous crf
CN109598754B (en) Binocular depth estimation method based on depth convolution network
CN111508013B (en) Stereo matching method
CN108062769B (en) Rapid depth recovery method for three-dimensional reconstruction
CN111783582A (en) Unsupervised monocular depth estimation algorithm based on deep learning
CN113283525B (en) Image matching method based on deep learning
CN112634341A (en) Method for constructing depth estimation model of multi-vision task cooperation
CN113313732A (en) Forward-looking scene depth estimation method based on self-supervision learning
CN113870422B (en) Point cloud reconstruction method, device, equipment and medium
CN112560865B (en) Semantic segmentation method for point cloud under outdoor large scene
CN110381268A (en) method, device, storage medium and electronic equipment for generating video
CN111950477A (en) Single-image three-dimensional face reconstruction method based on video surveillance
CN115035171B (en) Self-supervision monocular depth estimation method based on self-attention guide feature fusion
CN111583340A (en) Method for reducing monocular camera pose estimation error rate based on convolutional neural network
CN115330935A (en) Three-dimensional reconstruction method and system based on deep learning
CN114677479A (en) Natural landscape multi-view three-dimensional reconstruction method based on deep learning
CN115631223A (en) Multi-view stereo reconstruction method based on self-adaptive learning and aggregation
CN110889868A (en) Monocular image depth estimation method combining gradient and texture features
CN112115786B (en) Monocular vision odometer method based on attention U-net
CN110766732A (en) Robust single-camera depth map estimation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220714

Address after: 313009 floor 3, No. 777, Chengzhong Avenue, Lianshi Town, Nanxun District, Huzhou City, Zhejiang Province

Patentee after: Zhejiang Qiqiao Lianyun biosensor technology Co.,Ltd.

Address before: 073000 West 200m northbound at the intersection of Dingzhou commercial street and Xingding Road, Baoding City, Hebei Province (No. 1910, 19th floor, building 3, jueshishan community)

Patentee before: Hebei Kaitong Information Technology Service Co.,Ltd.

Effective date of registration: 20220714

Address after: 073000 West 200m northbound at the intersection of Dingzhou commercial street and Xingding Road, Baoding City, Hebei Province (No. 1910, 19th floor, building 3, jueshishan community)

Patentee after: Hebei Kaitong Information Technology Service Co.,Ltd.

Address before: 300072 Tianjin City, Nankai District Wei Jin Road No. 92

Patentee before: Tianjin University
