PGDENet: Progressive Guided Fusion and Depth Enhancement Network for RGB-D Indoor Scene Parsing

Published: 01 January 2023

Abstract

Scene parsing is a fundamental task in computer vision. Various RGB-D (color and depth) scene parsing methods based on fully convolutional networks have achieved excellent performance. However, color and depth information are different in nature, and existing methods fail to coordinate high-level and low-level information when aggregating the two modalities, which introduces noise or loses key information in the aggregated features and produces inaccurate segmentation maps. In addition, the features extracted from the depth branch are weak because of the low quality of the depth maps, which results in unsatisfactory feature representations. To address these drawbacks, we propose a progressive guided fusion and depth enhancement network (PGDENet) for RGB-D indoor scene parsing. First, high-quality RGB images are used to improve the depth data through a depth enhancement module, in which the depth maps are strengthened in terms of channel and spatial correlations. Then, information from the RGB and enhanced depth modalities is integrated using a progressive complementary fusion module, which starts from high-level semantic information and moves down layer by layer to guide the fusion of adjacent layers while reducing hierarchy-based differences. Extensive experiments are conducted on two public indoor scene datasets, and the results show that the proposed PGDENet outperforms state-of-the-art RGB-D scene parsing methods.
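
The abstract describes two components: a depth enhancement module in which RGB features reweight the depth features along channel and spatial dimensions, and a progressive complementary fusion module that propagates high-level semantics downward to guide the fusion of adjacent layers. The sketch below is a minimal, hypothetical PyTorch illustration of how such RGB-guided channel/spatial enhancement and top-down guided fusion could be wired; the module names, layer sizes, and exact attention formulation are assumptions for illustration, not the paper's released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DepthEnhancement(nn.Module):
    """RGB-guided depth enhancement (assumed form): channel and spatial reweighting."""

    def __init__(self, channels):
        super().__init__()
        # Channel gate: squeeze RGB features into a per-channel weight.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial gate: collapse RGB features into a per-pixel weight.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat, depth_feat):
        # Reweight depth channels using channel statistics of the RGB features.
        enhanced = depth_feat * self.channel_gate(rgb_feat)
        # Reweight depth locations using avg/max-pooled RGB maps.
        pooled = torch.cat(
            [rgb_feat.mean(dim=1, keepdim=True),
             rgb_feat.max(dim=1, keepdim=True)[0]], dim=1)
        enhanced = enhanced * self.spatial_gate(pooled)
        # Residual connection preserves the original depth signal.
        return depth_feat + enhanced


class TopDownGuidedFusion(nn.Module):
    """Progressive fusion (assumed form): deep semantics guide shallower levels."""

    def __init__(self, channels, num_levels=4):
        super().__init__()
        # One merge convolution per adjacent pair of levels (assumes equal channel counts).
        self.merge = nn.ModuleList(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)
            for _ in range(num_levels - 1)
        )

    def forward(self, fused_levels):
        # fused_levels: per-level RGB+depth features, index 0 = deepest (most semantic) level.
        out = fused_levels[0]
        for i, feat in enumerate(fused_levels[1:]):
            # Upsample the higher-level feature to guide the adjacent shallower one.
            out = F.interpolate(out, size=feat.shape[-2:], mode="bilinear",
                                align_corners=False)
            out = self.merge[i](torch.cat([out, feat], dim=1))
        return out
```

In this reading, each encoder stage would first pass its RGB and depth features through DepthEnhancement, and the resulting per-level fused features would then be merged from the deepest level downward by TopDownGuidedFusion before a decoder produces the segmentation map.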

Published In

IEEE Transactions on Multimedia, Volume 25, 2023, 8932 pages

Publisher

IEEE Press

Publication History

Published: 01 January 2023

Qualifiers

  • Research-article

Cited By

  • MaskMentor: Unlocking the Potential of Masked Self-Teaching for Missing Modality RGB-D Semantic Segmentation. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 1915–1923, Oct. 2024. DOI: 10.1145/3664647.3681698
  • PrimKD: Primary Modality Guided Multimodal Fusion for RGB-D Semantic Segmentation. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 1943–1951, Oct. 2024. DOI: 10.1145/3664647.3681253
  • EISNet: A Multi-Modal Fusion Network for Semantic Segmentation With Events and Images. IEEE Transactions on Multimedia, vol. 26, pp. 8639–8650, 2024. DOI: 10.1109/TMM.2024.3380255
  • EGFNet: Edge-Aware Guidance Fusion Network for RGB–Thermal Urban Scene Parsing. IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 1, pp. 657–669, Jan. 2024. DOI: 10.1109/TITS.2023.3306368
  • DGPINet-KD: Deep Guided and Progressive Integration Network With Knowledge Distillation for RGB-D Indoor Scene Analysis. IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 9, pp. 7844–7855, Sep. 2024. DOI: 10.1109/TCSVT.2024.3382354
  • Pixel Difference Convolutional Network for RGB-D Semantic Segmentation. IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 3, pp. 1481–1492, Mar. 2024. DOI: 10.1109/TCSVT.2023.3296162
  • MMSMCNet: Modal Memory Sharing and Morphological Complementary Networks for RGB-T Urban Scene Semantic Segmentation. IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 12, pp. 7096–7108, Dec. 2023. DOI: 10.1109/TCSVT.2023.3275314
  • Dual-Space Graph-Based Interaction Network for RGB-Thermal Semantic Segmentation in Electric Power Scene. IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 4, pp. 1577–1592, Apr. 2023. DOI: 10.1109/TCSVT.2022.3216313
