Abstract
Computer vision systems often have to perform object detection in hazy environments, where the performance of object detection methods degrades. To overcome this issue, we propose a Bi-stream feature fusion (BFF) network for object detection in hazy environments. The BFF network consists of three modules: hybrid input, a Bi-stream feature extractor (BFE), and multi-level feature fusion (MFF). We introduce the notion of hybrid input to extract features from hazy images effectively; using the hybrid input for feature extraction avoids the need for an image enhancement step in hazy object detection. The proposed BFE network extracts multi-level features from the hazy image and the hybrid input. The MFF network processes the extracted features and performs convolution-based adaptive feature fusion. The proposed BFF model outperforms other state-of-the-art methods in hazy environments while achieving competitive performance in normal conditions. Another challenge in hazy object detection is the lack of a dataset with sufficient samples and classes. In this work, we develop a synthetic object detection dataset for hazy environments (DHOD), which contains twenty object classes and more than twenty thousand samples.
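To make the idea of convolution-based adaptive feature fusion concrete, the following is a minimal sketch in plain Python. It is not the MFF module from the paper, whose layer configuration is not given in this abstract. It assumes one simple form of adaptive fusion: a 1x1 convolution over the concatenated channels of two feature maps produces a per-pixel gate, and the gate blends the two streams. The function name `adaptive_fuse`, the single-gate design, and the `weights` and `bias` parameters are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_fuse(feat_a, feat_b, weights, bias=0.0):
    """Blend two feature maps of shape [C][H][W] with a learned per-pixel gate.

    Assumption (not from the paper): the gate is a 1x1 convolution over the
    2*C concatenated channels with a single output channel, followed by a
    sigmoid. The output is gate * feat_a + (1 - gate) * feat_b.
    """
    channels = len(feat_a)
    height, width = len(feat_a[0]), len(feat_a[0][0])
    if len(weights) != 2 * channels:
        raise ValueError("expected one weight per concatenated channel")

    fused = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for y in range(height):
        for x in range(width):
            # Concatenate both streams along the channel axis at this pixel.
            concat = [feat_a[c][y][x] for c in range(channels)]
            concat += [feat_b[c][y][x] for c in range(channels)]
            # A 1x1 convolution is a weighted sum over channels at one pixel.
            gate = sigmoid(sum(w * v for w, v in zip(weights, concat)) + bias)
            for c in range(channels):
                fused[c][y][x] = (
                    gate * feat_a[c][y][x] + (1.0 - gate) * feat_b[c][y][x]
                )
    return fused


# Two single-channel 1x2 feature maps standing in for the two streams.
stream_a = [[[1.0, 2.0]]]
stream_b = [[[3.0, 6.0]]]

# With zero weights and zero bias the gate is 0.5, so the result is the mean.
average = adaptive_fuse(stream_a, stream_b, weights=[0.0, 0.0])
```

In a trained network the gate weights would be learned, so the blend could favor whichever stream is more informative at each location. A real implementation would use a deep learning framework and would likely fuse features at several scales.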
Data Availability
Data will be made available on reasonable request.
Funding
This research received no funding from any agency.
Author information
Contributions
KS and ASP contributed equally to the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Ethical Approval
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Singh, K., Parihar, A.S. Bff: Bi-stream feature fusion for object detection in hazy environment. SIViP 18, 3097–3107 (2024). https://doi.org/10.1007/s11760-023-02973-6
DOI: https://doi.org/10.1007/s11760-023-02973-6