Abstract
Generating a decision map with accurate boundaries is the key to fusing multi-focus images. In this paper, we introduce edge-preservation (EP) techniques into neural networks to improve the quality of decision maps, motivated by an interesting phenomenon we observed: the maps generated by traditional EP techniques closely resemble the feature maps of a trained network with excellent performance. Based on the manifold theory underlying edge preservation, we propose a novel edge-aware layer derived from an isometric domain transformation and a recursive filter, which effectively eliminates burrs and pseudo-edges in the decision map by highlighting the edge discrepancy between focused and defocused regions. This edge-aware layer is incorporated into a Siamese-style encoder and a decoder to form a complete segmentation architecture, termed Y-Net, which contrastively learns and captures the feature differences of the source images with a relatively small amount of training data (i.e., 10,000 image pairs). In addition, a new randomization-based strategy is devised to generate masks and simulate multi-focus images from natural images, which alleviates the absence of ground truth and the scarcity of training sets in the multi-focus image fusion (MFIF) task. Experimental results on four publicly available datasets demonstrate that Y-Net with the edge-aware layers is superior to other state-of-the-art fusion networks in both qualitative and quantitative comparisons.
Acknowledgements
This work was supported by the National Natural Science Foundation of China [grant number 61801190]; the National Key Research and Development Project of China [grant number 2019YFC0409105]; the "Thirteenth Five-Year Plan" Scientific Research Planning Project of the Education Department of Jilin Province [grant numbers JJKH20200997KJ, JJKH20200678KJ]; the Fundamental Research Funds for the Central Universities, JLU; the Young and Middle-aged Science and Technology Innovation and Entrepreneurship Outstanding Talents (Team) Project (Innovation Category) of Jilin Province [No. 20230508052RC]; the Natural Science Foundation of Jilin Province [No. 20220101108JC]; and the Key R&D Project of Jilin Province [No. 20220203035SF].
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Communicated by Adrien Bartoli.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
To simplify the nearest-neighbor strategy, we only consider the distance between \(x_i\) and \(x_{i+1}\). Since the sampling interval is equal to h, \(x_{i+1}\) and \(x_{i+2}\) are represented by \(x_i+h\) and \(x_i+2h\), respectively. Because only one interval is calculated, the coefficients of the slope-change term and the distance-change term are reset to 1. The desired 1D domain is expressed as \({\mathscr {D}}(x_i)\), where \({\mathscr {D}}(x_i)={\mathscr {T}}(x_i,I(x_i))\). Considering Eq. (3), the distance between the nearest neighbors \(x_i\) and \(x_{i+1}\) in the 1D domain can be computed by:
$$\begin{aligned} \left| {\mathscr {D}}(x_i+h)-{\mathscr {D}}(x_i)\right| =h+\left| I(x_i+h)-I(x_i)\right| . \end{aligned}$$
(12)
To eliminate the absolute-value sign on the left side of the equation, the constraint \({\mathscr {D}}(x_{i}+h)\ge {\mathscr {D}}(x_i)\) is imposed. Dividing both sides of the equation by h and then taking the limit \(h \rightarrow 0\) yields:
$$\begin{aligned} {\mathscr {D}}^{\prime }(x_i)=\lim _{h \rightarrow 0}\frac{{\mathscr {D}}(x_i+h)-{\mathscr {D}}(x_i)}{h}=1+\lim _{h \rightarrow 0}\frac{\left| I(x_i+h)-I(x_i)\right| }{h}. \end{aligned}$$
(13)
Considering that the minimum sampling interval h is 1 for digital images, Eq. (13) is expressed by Eq. (14):
$$\begin{aligned} {\mathscr {D}}^{\prime }(x_i)=1+\left| I(x_i+1)-I(x_i)\right| =1+\left| I^{\prime }(x_i)\right| , \end{aligned}$$
(14)
where \({{\mathscr {D}}}^{\prime }(x_i)\) denotes the derivative of \({{\mathscr {D}}}(x_i)\) with respect to \(x_i\). By integrating \({{\mathscr {D}}}^{\prime }(x_i)\) and setting \({\mathscr {D}}(0)=0\), \({\mathscr {D}}(x_i)\) is expressed as:
$$\begin{aligned} {\mathscr {D}}(x_i)=\int _{0}^{x_i}\left( 1+\left| I^{\prime }(x)\right| \right) \,dx. \end{aligned}$$
(15)
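As a toy illustration of Eq. (15) (the values are chosen purely for this example), take h = 1, so the integral becomes the running sum of \(1+|I(j)-I(j-1)|\). For the single-channel step signal \(I=(0,0,0,10,10,10)\) we obtain \({\mathscr {D}}=(0,1,2,13,14,15)\): the two samples straddling the edge are only 1 apart in the spatial domain but 11 apart in the transformed domain, so a filter with limited support in \({\mathscr {D}}\) barely mixes values across the edge, which is exactly the edge-preserving behavior exploited here.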
\({\mathscr {D}}\) shows how \({\mathscr {T}}\) transforms the curve C from \({\mathbb {R}}^2\) to \({\mathbb {R}}\) while preserving the edges. If the signal \(\xi \) has c channels, we obtain the \({\mathscr {D}}\) shown in Eq. (16), which transforms the curve in \({\mathbb {R}}^{c+1}\) to \({\mathbb {R}}\).
For a 1D signal \(\xi \) with c channels, a warping \({\mathscr {D}}\) of the 1D spatial domain is introduced through \({\mathscr {T}}: {\mathbb {R}}^{c+1} \rightarrow {\mathbb {R}}\), where \({\mathscr {D}}(x)={\mathscr {T}}(x, \xi _1(x), \xi _2(x),\ldots ,\xi _c(x))\). \({\mathscr {D}}\) is named the domain transformation in this study. Benefiting from the design of processing multiple channels simultaneously, \({\mathscr {D}}\) protects the edges from artifacts.
In addition, two parameters, \(\sigma _s\) and \(\sigma _r\), are constructed; they can be determined from the input feature maps to make \({\mathscr {D}}\) self-adaptive, so that it does not hinder network training. These two parameters control the spatial extent and the range (intensity) extent of the signal, respectively. The complete form of \({\mathscr {D}}\) is then represented as:
$$\begin{aligned} {\mathscr {D}}(x_i)=\int _{0}^{x_i}\left( 1+\frac{\sigma _s}{\sigma _r}\sum _{k=1}^{c}\left| \xi _k^{\prime }(x)\right| \right) \,dx. \end{aligned}$$
(16)
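For readers who wish to experiment with this transformation outside the network, the sketch below gives a minimal NumPy implementation of the 1D transform in Eq. (16) together with a single left-to-right pass of a recursive edge-preserving filter operating in the transformed domain. It follows the standard domain-transform recursion \(J[i]=(1-a^{d_i})\,I[i]+a^{d_i}\,J[i-1]\) with feedback coefficient \(a=\exp (-\sqrt{2}/\sigma _s)\), where \(d_i\) is the distance between adjacent samples in \({\mathscr {D}}\); the function names, default parameter values, and the toy signal are illustrative and are not taken from the authors' released code.

```python
import numpy as np

def domain_transform_1d(signal, sigma_s=60.0, sigma_r=0.4):
    """Cumulative 1D domain transform D (Eq. 16) for a (length, channels) signal.

    Returns the transformed coordinates D(x_i); adjacent samples that straddle
    a strong edge end up far apart in the new domain.
    """
    signal = np.asarray(signal, dtype=float)
    signal = np.atleast_2d(signal.T).T                    # ensure shape (length, channels)
    diff = np.abs(np.diff(signal, axis=0)).sum(axis=1)    # sum_k |xi_k'(x)|, h = 1
    dDdx = 1.0 + (sigma_s / sigma_r) * diff               # D'(x_i) from Eq. (16)
    return np.concatenate(([0.0], np.cumsum(dDdx)))       # D(0) = 0

def recursive_edge_preserving_filter(signal, sigma_s=60.0, sigma_r=0.4):
    """One left-to-right pass of the recursive filter in the transformed domain.

    Uses the classic domain-transform recursion
        J[i] = (1 - a**d_i) * I[i] + a**d_i * J[i - 1],
    with a = exp(-sqrt(2) / sigma_s) and d_i = D(x_i) - D(x_{i-1}).
    """
    D = domain_transform_1d(signal, sigma_s, sigma_r)
    d = np.diff(D)                                        # distances between neighbors in D
    a = np.exp(-np.sqrt(2.0) / sigma_s)
    out = np.array(signal, dtype=float, copy=True)
    for i in range(1, len(out)):
        w = a ** d[i - 1]                                 # weight shrinks across strong edges
        out[i] = (1.0 - w) * out[i] + w * out[i - 1]
    return out

if __name__ == "__main__":
    # Noisy step edge: smoothing removes the noise but keeps the jump at index 64.
    x = np.concatenate([np.zeros(64), np.ones(64)]) + 0.05 * np.random.randn(128)
    smoothed = recursive_edge_preserving_filter(x[:, None], sigma_s=20.0, sigma_r=0.3)
```

Within the proposed edge-aware layer, the same two ingredients act on network feature maps rather than image rows, with \(\sigma _s\) and \(\sigma _r\) determined from the input features as described above.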
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Z., Li, X., Zhao, L. et al. When Multi-Focus Image Fusion Networks Meet Traditional Edge-Preservation Technology. Int J Comput Vis 131, 2529–2552 (2023). https://doi.org/10.1007/s11263-023-01806-w