Quality Assessment of SAR-to-Optical Image Translation
Figures
Figure 1. The framework of our algorithm.
Figure 2. A single-channel remote sensing image of a mountain. (a) Original image; (b) the same view distorted by affine warping; (c) the same view distorted by Gaussian blur. Image quality assessment (IQA) scores between (a) and (b): mean square error (MSE) = 0.091, structural similarity (SSIM) = 0.056, learned perceptual image patch similarity (LPIPS) = 0.628. IQA scores between (a) and (c): MSE = 0.009, SSIM = 0.537, LPIPS = 0.661. (Note that, unlike MSE and LPIPS, larger SSIM values indicate better quality.)
Figure 3. Architecture of the proposed synthetic aperture radar (SAR)-to-optical image translation model.
Figure 4. Architecture of the distorted image restoration model, which contains 16 residual blocks. "3 × 3 Conv." denotes a convolutional kernel of size 3 × 3, and the number after the dot is the number of output channels of each convolutional layer.
Figure 5. Image translation results for multiple scenes obtained by pixel-to-pixel (pix2pix), cycle-consistent adversarial networks (CycleGAN), high-definition pixel-to-pixel (pix2pixHD) and feature-guided SAR-to-optical image translation (FGGAN). (a) Farmland; (b) Forest; (c) Gorge; (d) River; (e) Residential.
Figure 6. Distorted image restoration results obtained with SSIM, feature similarity (FSIM), MSE, LPIPS and deep image structure and texture similarity (DISTS). (a) Industrial area with pix2pix distortion; (b) River with CycleGAN distortion; (c) Gorge with pix2pixHD distortion; (d) Mountain with contrast shift; (e) Residential area with Gaussian blur; (f) Farmland with speckle noise.
Figure 7. Recovery of a distorted Gorge image obtained by different models at 500, 2000, 10,000 and 20,000 iterations.
Figure 8. Convergence curves over all iterations of the Gorge restoration based on different IQA models. (a) SSIM; (b) FSIM; (c) MSE; (d) LPIPS; (e) DISTS.
Figure 9. Objective ranking of the restoration results optimized using IQA metrics. The horizontal axis lists the metrics used to train the image restoration network, and the vertical axis lists the metrics used to evaluate the restoration performance. The numbers 1 to 5 indicate the rank order from best to worst. GAN and TRA denote GAN and traditional distortions, respectively. (a) GAN-pix2pixHD; (b) GAN-pix2pix; (c) GAN-CycleGAN; (d) TRA-contrast shift; (e) TRA-Gaussian blur; (f) TRA-speckle noise.
Figure 10. Relationships between feature extraction and image classification. The convolutional neural network (CNN) variants include 18-layer ResNet, Inception, SqueezeNet and the 19-layer visual geometry group network (VGG).
Figure 11. Loss and accuracy curves of the classification experiments using 18-layer ResNet. (a) Loss curve on the training sets; (b) loss curve on the test sets; (c) accuracy curve on the training sets; (d) accuracy curve on the test sets.
Figure A1. More detailed results of SAR-to-optical image translation.
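The IQA scores quoted in the Figure 2 caption (MSE, SSIM and LPIPS between a reference patch and a distorted version of it) can be reproduced with off-the-shelf implementations. The sketch below is a minimal example, assuming scikit-image and the lpips PyTorch package, single-channel images normalized to [0, 1], and placeholder file names; the LPIPS backbone ('alex' here) is an assumption, since the paper also reports VGG- and SqueezeNet-based LPIPS.

```python
import torch
import lpips                                   # pip install lpips
from skimage import io, img_as_float
from skimage.metrics import structural_similarity, mean_squared_error

# Placeholder file names for the reference image and a distorted version of it.
ref = img_as_float(io.imread('mountain_reference.png', as_gray=True))
dst = img_as_float(io.imread('mountain_distorted.png', as_gray=True))

mse = mean_squared_error(ref, dst)                        # lower is better
ssim = structural_similarity(ref, dst, data_range=1.0)    # higher is better

# LPIPS operates on 3-channel tensors scaled to [-1, 1], so replicate the gray channel.
to_tensor = lambda a: torch.from_numpy(a).float()[None, None].repeat(1, 3, 1, 1) * 2 - 1
lpips_fn = lpips.LPIPS(net='alex')
lpips_score = lpips_fn(to_tensor(ref), to_tensor(dst)).item()   # lower is better

print(f'MSE={mse:.3f}  SSIM={ssim:.3f}  LPIPS={lpips_score:.3f}')
```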
Abstract
1. Introduction
- In view of the difficulties in interpreting SAR images, a SAR-to-optical image translation model is designed; it performs the translation by building on baselines that span both supervised and unsupervised architectures.
- Considering the lack of quality assessment tools for SAR-to-optical image translation, a large-scale comparison of perceptual IQA models is performed to select suitable ones and to explore how well objective metrics transfer to stylization tasks.
- Beyond visual perception and IQA measures, the properties of the translation results are examined from the perspective of follow-up applications. Scene classification is discussed and evaluated in this paper to ensure the diversity of the features involved.
2. Related Works
2.1. Image-to-Image Translation
2.2. Image Quality Assessment (IQA)
2.3. Image Feature Extraction
3. Methods
3.1. Synthetic Aperture Radar (SAR)-to-Optical Image Translation Model
3.1.1. Network Architectures
3.1.2. Loss Functions
- Pix2pix
- CycleGAN
- Pix2pixHD
- FGGAN
3.2. Image Restoration Model
3.2.1. Network Architectures
3.2.2. Loss Functions
4. Experiments
4.1. Datasets
4.2. Implementation Details
4.3. Visual Inspection of SAR-to-Optical Translation
- Farmland
- Forest
- Gorge
- River
- Residential
4.4. IQA Model Selection
4.5. Objective Evaluation of Translation Results
4.6. Impact on Feature Extraction
5. Conclusions and Outlook
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Appendix A. Investigations on Remote-Sensing Datasets
- University of California (UC) Merced Land Use Dataset [34]: contains 21 scene classes with 100 samples of size 256 × 256 in each class.
- Northwestern Polytechnical University Remote Sensing Image Scene Classification (NWPU-RESISC45) Dataset [35]: a publicly available benchmark for remote sensing image scene classification created by Northwestern Polytechnical University (NWPU); it contains 45 scene categories.
- Aerial Image Dataset (AID) [36]: contains 30 different scene classes with about 200 to 400 samples of size 600 × 600 in each class.
- Gaofen Image Dataset (GID) [37]: contains 150 high-quality GaoFen-2 (GF-2) images of more than 60 different cities in China, covering more than 50,000 square kilometers of geographic area.
- Large-Scale Dataset for Object Detection in Aerial Images (DOTA)-v1.0 [38]: contains 15 categories of different scenes; the size of each image ranges from 800 × 800 to 4000 × 4000.
- DOTA-v1.5 Dataset [38]: an updated version of DOTA-v1.0 that contains 16 different scene classes.
- Large-Scale Dataset for Instance Segmentation in Aerial Images (iSAID) [39]: provides 655,451 object instances for 15 categories across 2806 high-resolution images.
- Wuhan University Remote Sensing (WHU-RS19) Dataset [40]: contains 19 different scene classes with 50 samples of size 600 × 600 in each class.
- Scene Image Dataset designed by the RS_IDEA Group at Wuhan University (SIRI-WHU) [41]: contains 12 different scene classes with 200 samples of size 200 × 200 in each class.
- Remote Sensing Classification (RSC11) Dataset [42]: contains 11 different scene classes with more than 100 samples of size 512 × 512 in each class.
| Category | Scene | Number of Times |
|---|---|---|
| Natural | Grass | 8 |
| | Forest | 7 |
| | River | 6 |
| | Lake | 6 |
| | Farmland | 6 |
| | Beach | 5 |
| Artificial | Harbor | 8 |
| | Residential | 7 |
| | Airplane | 6 |
| | Storage tank | 6 |
| | Baseball court | 5 |
| | Industrial | 5 |
| | Ground track field | 5 |
Appendix B. Supplementary Experimental Results
| Evaluation | Method | GAN—Pix2pixHD | GAN—Pix2pix | GAN—CycleGAN | TRA—Contrast Shift | TRA—Gaussian Blur | TRA—Speckle Noise |
|---|---|---|---|---|---|---|---|
| SSIM | SSIM | 0.9999 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| | FSIM | 0.7245 | 0.9428 | 0.2370 | 0.7347 | 0.7123 | 0.4209 |
| | MSE | 0.9588 | 1.0000 | 1.0000 | 0.8626 | 1.0000 | 0.9999 |
| | LPIPS | 0.9997 | 0.9957 | 0.9985 | 0.9896 | 0.9994 | 0.9984 |
| | DISTS | 0.9917 | 0.9404 | 0.9805 | 0.8732 | 0.9947 | 0.9276 |
| FSIM | SSIM | 0.9944 | 0.9999 | 0.9994 | 0.9999 | 0.9997 | 1.0000 |
| | FSIM | 0.9999 | 0.9616 | 0.9999 | 0.9964 | 0.9999 | 0.9995 |
| | MSE | 0.9981 | 1.0000 | 1.0000 | 0.9575 | 0.9999 | 1.0000 |
| | LPIPS | 0.9998 | 0.9985 | 0.9576 | 0.9869 | 0.9970 | 0.9571 |
| | DISTS | 0.9840 | 0.9585 | 0.9884 | 0.9597 | 0.9851 | 0.9614 |
| MSE | SSIM | 0.0022 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | FSIM | 0.0109 | 0.0832 | 0.0255 | 0.0046 | 0.0115 | 0.0025 |
| | MSE | 0.0000 | 0.0000 | 0.0000 | 0.0033 | 0.0000 | 0.0000 |
| | LPIPS | 0.0000 | 0.0011 | 0.0003 | 0.0018 | 0.0000 | 0.0001 |
| | DISTS | 0.0007 | 0.0039 | 0.0002 | 0.0045 | 0.0000 | 0.0002 |
| LPIPS | SSIM | 0.0002 | 0.0023 | 0.0010 | 0.0000 | 0.0004 | 0.0006 |
| | FSIM | 0.2923 | 0.5398 | 0.6180 | 0.1801 | 0.2371 | 0.2447 |
| | MSE | 0.0092 | 0.0001 | 0.0000 | 0.0501 | 0.0000 | 0.0002 |
| | LPIPS | 0.0000 | 0.0000 | 0.0000 | 0.0048 | 0.0000 | 0.0001 |
| | DISTS | 0.0053 | 0.0345 | 0.0127 | 0.0459 | 0.0026 | 0.0352 |
| DISTS | SSIM | 0.0031 | 0.0000 | 0.0000 | 0.0000 | 0.0013 | 0.0037 |
| | FSIM | 0.2361 | 0.5437 | 0.4756 | 0.1877 | 0.2514 | 0.2447 |
| | MSE | 0.0299 | 0.0002 | 0.0000 | 0.1033 | 0.0001 | 0.0016 |
| | LPIPS | 0.0198 | 0.0815 | 0.0324 | 0.0935 | 0.0101 | 0.0504 |
| | DISTS | 0.0003 | 0.0250 | 0.0153 | 0.0409 | 0.0002 | 0.0002 |
References
- Fuentes Reyes, M.; Auer, S.; Merkle, N.; Henry, C.; Schmitt, M. SAR-to-optical image translation based on conditional generative adversarial networks—Optimization, opportunities and limits. Remote Sens. 2019, 11, 2067.
- Argenti, F.; Lapini, A.; Bianchi, T.; Alparone, L. A tutorial on speckle reduction in synthetic aperture radar images. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–35.
- Fu, S.; Xu, F.; Jin, Y.Q. Translating SAR to optical images for assisted interpretation. arXiv 2019, arXiv:1901.03749.
- Toriya, H.; Dewan, A.; Kitahara, I. SAR2OPT: Image alignment between multi-modal images using generative adversarial networks. In Proceedings of the IGARSS 2019-2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 923–926.
- Enomoto, K.; Sakurada, K.; Wang, W.; Kawaguchi, N.; Matsuoka, M.; Nakamura, R. Image translation between SAR and optical imagery with generative adversarial nets. In Proceedings of the IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 1752–1755.
- Ley, A.; Dhondt, O.; Valade, S.; Haensch, R.; Hellwich, O. Exploiting GAN-based SAR to optical image transcoding for improved classification via deep learning. In Proceedings of the EUSAR 2018, 12th European Conference on Synthetic Aperture Radar, Aachen, Germany, 4–7 June 2018; pp. 1–6.
- Isola, P.; Zhu, J.Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134.
- Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision 2017, Venice, Italy, 22–29 October 2017; pp. 2223–2232.
- Wang, T.C.; Liu, M.Y.; Zhu, J.Y.; Tao, A.; Kautz, J.; Catanzaro, B. High-resolution image synthesis and semantic manipulation with conditional GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2018, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8798–8807.
- Zhang, J.; Zhou, J.; Lu, X. Feature-Guided SAR-to-Optical Image Translation. IEEE Access 2020, 8, 70925–70937.
- Wang, L.; Xu, X.; Yu, Y.; Yang, R.; Gui, R.; Xu, Z.; Pu, F. SAR-to-optical image translation using supervised cycle-consistent adversarial networks. IEEE Access 2019, 7, 129136–129149.
- Wang, Z. Applications of objective image quality assessment methods [applications corner]. IEEE Signal Process. Mag. 2011, 28, 137–142.
- Channappayya, S.S.; Bovik, A.C.; Caramanis, C.; Heath, R.W. SSIM-optimal linear image restoration. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 765–768.
- Wang, S.; Rehman, A.; Wang, Z.; Ma, S.; Gao, W. SSIM-motivated rate-distortion optimization for video coding. IEEE Trans. Circuits Syst. Video Technol. 2011, 22, 516–529.
- Snell, J.; Ridgeway, K.; Liao, R.; Roads, B.D.; Mozer, M.C.; Zemel, R.S. Learning to generate images with perceptual similarity metrics. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 4277–4281.
- Ding, K.; Ma, K.; Wang, S.; Simoncelli, E.P. Comparison of Image Quality Models for Optimization of Image Processing Systems. arXiv 2020, arXiv:2005.01338.
- Shen, Z.; Huang, M.; Shi, J.; Xue, X.; Huang, T.S. Towards instance-level image-to-image translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019, Long Beach, CA, USA, 16–20 June 2019; pp. 3683–3692.
- Bishop, C.M. Pattern Recognition and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2006.
- Wang, Z.; Bovik, A.C. Mean squared error: Love it or leave it? A new look at signal fidelity measures. IEEE Signal Process. Mag. 2009, 26, 98–117.
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
- Zhang, L.; Zhang, L.; Mou, X.; Zhang, D. FSIM: A feature similarity index for image quality assessment. IEEE Trans. Image Process. 2011, 20, 2378–2386.
- Ding, K.; Ma, K.; Wang, S.; Simoncelli, E.P. Image quality assessment: Unifying structure and texture similarity. arXiv 2020, arXiv:2004.07728.
- Osberger, W.; Bergmann, N.; Maeder, A. An automatic image quality assessment technique incorporating high level perceptual factors. In Proceedings of the IEEE International Conference on Image Processing 1998, Chicago, IL, USA, 4–7 October 1998; pp. 414–418.
- Markman, A.B.; Gentner, D. Nonintentional similarity processing. New Unconscious 2005, 2, 107–137.
- Zhang, R.; Isola, P.; Efros, A.A.; Shechtman, E.; Wang, O. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2018, Salt Lake City, UT, USA, 18–22 June 2018; pp. 586–595.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- Iandola, F.N.; Han, S.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv 2016, arXiv:1602.07360.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2015, Boston, MA, USA, 8–12 June 2015; pp. 1–9.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, New York, NY, USA, 3–8 December 2012; pp. 1097–1105.
- Lim, B.; Son, S.; Kim, H.; Nah, S.; Mu Lee, K. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 136–144.
- Schmitt, M.; Hughes, L.H.; Zhu, X.X. The SEN1-2 dataset for deep learning in SAR-optical data fusion. arXiv 2018, arXiv:1807.01569.
- Zhang, R.; Isola, P.; Efros, A.A. Colorful image colorization. In Proceedings of the European Conference on Computer Vision 2016, Amsterdam, The Netherlands, 8–16 October 2016; pp. 649–666.
- UC Merced Land Use Dataset. Available online: http://weegee.vision.ucmerced.edu/datasets/landuse.html (accessed on 28 October 2010).
- Cheng, G.; Han, J.; Lu, X. Remote sensing image scene classification: Benchmark and state of the art. Proc. IEEE 2017, 105, 1865–1883.
- Xia, G.S.; Hu, J.; Hu, F.; Shi, B.; Bai, X.; Zhong, Y.; Zhang, L. AID: A benchmark data set for performance evaluation of aerial scene classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3965–3981.
- Demir, I.; Koperski, K.; Lindenbaum, D.; Pang, G.; Huang, J.; Basu, S.; Hughes, F.; Tuia, D.; Raskar, R. A challenge to parse the earth through satellite images. arXiv 2018, arXiv:1805.06561.
- Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A large-scale dataset for object detection in aerial images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2018, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3974–3983.
- Waqas Zamir, S.; Arora, A.; Gupta, A.; Khan, S.; Sun, G.; Shahbaz Khan, F.; Zhu, F.; Shao, L.; Xia, G.-S.; Bai, X. iSAID: A large-scale dataset for instance segmentation in aerial images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 2019, Long Beach, CA, USA, 16–20 June 2019; pp. 28–37.
- Dai, D.; Yang, W. Satellite image classification via two-layer sparse coding with biased image representation. IEEE Geosci. Remote Sens. Lett. 2010, 8, 173–176.
- Zhong, Y. Available online: http://www.lmars.whu.edu.cn/prof_web/zhongyanfei/e-code.html (accessed on 19 April 2016).
- RSC11 Datasets. Available online: https://www.researchgate.net/publication/271647282_RS_C11_Database (accessed on 31 January 2015).
| Dataset | Split | Classes (Number of Images) |
|---|---|---|
| Image Translation Dataset and Image Classification Dataset | Train | Farmland (413), Forest (449), Gorge (428), River (304), Residential (206) |
| | Test | Farmland (32), Forest (34), Gorge (33), River (31), Residential (28) |
| Image Restoration Dataset | Train | Pix2pix (110), CycleGAN (110), Pix2pixHD (110), FGGAN (110), Contrast Shift (150), Gaussian Blur (150), Speckle Noise (150) |
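The traditional distortions listed for the restoration dataset (contrast shift, Gaussian blur and speckle noise) can be synthesized with standard image operations. The sketch below is an illustrative example assuming NumPy and scikit-image; the contrast factor, blur sigma, speckle variance and file name are assumed values, not the settings used to build the dataset.

```python
import numpy as np
from skimage import io, img_as_float
from skimage.filters import gaussian

rng = np.random.default_rng(0)
img = img_as_float(io.imread('optical_patch.png', as_gray=True))   # placeholder file name

# Contrast shift: rescale intensities around the mean by an assumed factor.
contrast_shifted = np.clip((img - img.mean()) * 0.5 + img.mean(), 0.0, 1.0)

# Gaussian blur: low-pass filtering with an assumed standard deviation.
blurred = gaussian(img, sigma=2.0)

# Speckle noise: multiplicative noise, the dominant degradation in SAR imagery.
speckled = np.clip(img * (1.0 + 0.2 * rng.standard_normal(img.shape)), 0.0, 1.0)
```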
| Task | Image Translation | Image Restoration | Image Classification |
|---|---|---|---|
| Input | 256 × 256 × 1 | 256 × 256 × 1 | 256 × 256 × 1 |
| Batch Size | 1 | 16 | 16 |
| Epoch | 200 | - | 50 |
| Iteration | - | 20,000 | - |
| Initial Learning Rate | 0.0002 | 0.005 | 0.001 |
| Optimizer | Adam (0.5, 0.999) | Adam (0.9, 0.999) | SGD (0.9) |
| Weight | - | - | |
| Output | 256 × 256 × 1 | 256 × 256 × 1 | - |
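The optimizer settings in the table above map directly onto the standard PyTorch constructors. The snippet below is a minimal illustration of that mapping; the three one-layer networks are trivial placeholders standing in for the translation, restoration and classification models.

```python
import torch
import torch.nn as nn

# Placeholder networks standing in for the translation, restoration and classification models.
translation_net = nn.Conv2d(1, 1, 3, padding=1)
restoration_net = nn.Conv2d(1, 1, 3, padding=1)
classification_net = nn.Linear(256 * 256, 5)

# Image translation: Adam, betas (0.5, 0.999), initial learning rate 0.0002 (batch size 1, 200 epochs).
translation_opt = torch.optim.Adam(translation_net.parameters(), lr=0.0002, betas=(0.5, 0.999))

# Image restoration: Adam, betas (0.9, 0.999), initial learning rate 0.005 (batch size 16, 20,000 iterations).
restoration_opt = torch.optim.Adam(restoration_net.parameters(), lr=0.005, betas=(0.9, 0.999))

# Image classification: SGD with momentum 0.9, initial learning rate 0.001 (batch size 16, 50 epochs).
classification_opt = torch.optim.SGD(classification_net.parameters(), lr=0.001, momentum=0.9)
```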
| Evaluation | Method | SSIM | FSIM | MSE | LPIPS | DISTS |
|---|---|---|---|---|---|---|
| SSIM | Before | 0.6217 | | | | |
| | After | 1.0000 | 0.6287 | 0.9702 | 0.9969 | 0.9514 |
| FSIM | Before | 0.7502 | | | | |
| | After | 0.9989 | 0.9929 | 0.9926 | 0.9828 | 0.9729 |
| MSE | Before | 0.0241 | | | | |
| | After | 0.0004 | 0.0230 | 0.0006 | 0.0006 | 0.0016 |
| LPIPS | Before | 0.3805 | | | | |
| | After | 0.0008 | 0.3520 | 0.0099 | 0.0008 | 0.0227 |
| DISTS | Before | 0.3726 | | | | |
| | After | 0.0014 | 0.2839 | 0.0225 | 0.0480 | 0.0137 |
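The before/after scores above come from training the restoration network with each IQA metric as its optimization objective. The following is a minimal sketch of one such training step, assuming PyTorch together with the pytorch-msssim and lpips packages; the simplified residual network, the loss constructions (1 - SSIM for similarity metrics, the raw distance for MSE and LPIPS) and the channel handling are illustrative assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn
import lpips                      # pip install lpips; differentiable LPIPS distance
from pytorch_msssim import ssim   # pip install pytorch-msssim; differentiable SSIM

class ResBlock(nn.Module):
    """3x3 conv -> ReLU -> 3x3 conv with an identity skip connection."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))
    def forward(self, x):
        return x + self.body(x)

class RestorationNet(nn.Module):
    """Simplified stand-in for the restoration model in Figure 4 (16 residual blocks)."""
    def __init__(self, channels=64, n_blocks=16):
        super().__init__()
        self.head = nn.Conv2d(1, channels, 3, padding=1)
        self.blocks = nn.Sequential(*[ResBlock(channels) for _ in range(n_blocks)])
        self.tail = nn.Conv2d(channels, 1, 3, padding=1)
    def forward(self, x):
        # Sigmoid keeps the restored image in [0, 1] so the IQA losses stay well defined.
        return torch.sigmoid(self.tail(self.blocks(self.head(x))))

net = RestorationNet()
optimizer = torch.optim.Adam(net.parameters(), lr=0.005, betas=(0.9, 0.999))
lpips_fn = lpips.LPIPS(net='vgg')    # expects 3-channel inputs scaled to [-1, 1]

def iqa_loss(restored, reference, metric='ssim'):
    """Turn an IQA metric into a minimization objective."""
    if metric == 'ssim':          # SSIM is a similarity, so minimize 1 - SSIM
        return 1.0 - ssim(restored, reference, data_range=1.0)
    if metric == 'mse':
        return torch.mean((restored - reference) ** 2)
    if metric == 'lpips':         # replicate the single channel and rescale for LPIPS
        to3 = lambda t: t.repeat(1, 3, 1, 1) * 2 - 1
        return lpips_fn(to3(restored), to3(reference)).mean()
    raise ValueError(metric)

def train_step(distorted, reference, metric='ssim'):
    optimizer.zero_grad()
    loss = iqa_loss(net(distorted), reference, metric)
    loss.backward()
    optimizer.step()
    return loss.item()
```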
| Scene | Model | SSIM | MSE | PSNR | LPIPS—VGG | LPIPS—Squeeze |
|---|---|---|---|---|---|---|
| Farmland | CycleGAN | 0.2560 | 0.0302 | 15.68 | 0.5568 | 0.2977 |
| | Pix2pix | 0.2385 | 0.0300 | 15.63 | 0.5598 | 0.3012 |
| | Pix2pixHD | 0.6371 | 0.0158 | 19.72 | 0.3558 | 0.1909 |
| | FGGAN | 0.7189 | 0.0109 | 21.43 | 0.2821 | 0.1348 |
| Forest | CycleGAN | 0.4012 | 0.0102 | 21.20 | 0.4946 | 0.2113 |
| | Pix2pix | 0.3780 | 0.0108 | 20.94 | 0.5093 | 0.2194 |
| | Pix2pixHD | 0.5031 | 0.0112 | 22.20 | 0.4378 | 0.2213 |
| | FGGAN | 0.6129 | 0.0087 | 24.06 | 0.3592 | 0.1640 |
| Gorge | CycleGAN | 0.3083 | 0.0342 | 15.78 | 0.5081 | 0.2540 |
| | Pix2pix | 0.3025 | 0.0356 | 15.72 | 0.5081 | 0.2341 |
| | Pix2pixHD | 0.4835 | 0.0337 | 17.50 | 0.4306 | 0.2280 |
| | FGGAN | 0.5720 | 0.0240 | 18.56 | 0.3698 | 0.1822 |
| River | CycleGAN | 0.2901 | 0.0261 | 16.63 | 0.5042 | 0.2297 |
| | Pix2pix | 0.2726 | 0.0303 | 16.19 | 0.5109 | 0.2348 |
| | Pix2pixHD | 0.5286 | 0.0209 | 18.90 | 0.3959 | 0.2002 |
| | FGGAN | 0.5913 | 0.0234 | 18.99 | 0.3337 | 0.1592 |
| Residential | CycleGAN | 0.1259 | 0.0638 | 12.22 | 0.5875 | 0.3298 |
| | Pix2pix | 0.1154 | 0.0644 | 12.14 | 0.5886 | 0.3171 |
| | Pix2pixHD | 0.2118 | 0.0657 | 12.51 | 0.5486 | 0.2921 |
| | FGGAN | 0.2704 | 0.0557 | 13.66 | 0.5207 | 0.2600 |
| Average | CycleGAN | 0.2763 | 0.0329 | 16.30 | 0.5302 | 0.2645 |
| | Pix2pix | 0.2614 | 0.0342 | 16.12 | 0.5353 | 0.2613 |
| | Pix2pixHD | 0.4728 | 0.0295 | 18.17 | 0.4337 | 0.2265 |
| | FGGAN | 0.5531 | 0.0245 | 19.34 | 0.3731 | 0.1800 |
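PSNR in the table above is tied to MSE through PSNR = 10 · log10(L² / MSE), where L is the maximum pixel value (L = 1 for images normalized to [0, 1]). A small helper illustrating the conversion is given below; since the tabulated PSNR values are averaged per image, applying the formula to the averaged MSE reproduces them only approximately.

```python
import math

def psnr_from_mse(mse, max_val=1.0):
    """Peak signal-to-noise ratio (in dB) for a given mean squared error."""
    return 10.0 * math.log10(max_val ** 2 / mse)

# Example: an MSE of 0.0302 on [0, 1] images corresponds to roughly 15.2 dB.
print(psnr_from_mse(0.0302))
```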
| Model | Scene | SSIM | PSNR/MSE | LPIPS |
|---|---|---|---|---|
| Pix2pix | Farmland | | | |
| | Forest | 🗸 | 🗸 | 🗸 |
| | Gorge | 🗸 | 🗸 | |
| | River | 🗸 | | |
| | Residential | | | |
| CycleGAN | Farmland | | | |
| | Forest | 🗸 | 🗸 | 🗸 |
| | Gorge | 🗸 | | |
| | River | 🗸 | 🗸 | |
| | Residential | | | |
| Pix2pixHD | Farmland | 🗸 | 🗸 | 🗸 |
| | Forest | 🗸 | | |
| | Gorge | | | |
| | River | 🗸 | 🗸 | |
| | Residential | | | |
| FGGAN | Farmland | 🗸 | 🗸 | 🗸 |
| | Forest | 🗸 | 🗸 | |
| | Gorge | | | |
| | River | 🗸 | | |
| | Residential | | | |
| Model | ResNet | Inception | SqueezeNet | VGG |
|---|---|---|---|---|
| FGGAN | 0.9051 | 0.9177 | 0.9304 | 0.8544 |
| CycleGAN | 0.8797 | 0.8671 | 0.8418 | 0.8608 |
| Pix2pix | 0.8861 | 0.8734 | 0.9241 | 0.8544 |
| Pix2pixHD | 0.8987 | 0.8734 | 0.8544 | 0.8418 |
| SAR | 0.7532 | 0.6899 | 0.7025 | 0.7531 |
| Model | ResNet | Inception | SqueezeNet | VGG |
|---|---|---|---|---|
| w/o IQA | 0.5570 | 0.5044 | 0.6056 | 0.6519 |
| w/ SSIM | 0.9177 | 0.9208 | 0.9342 | 0.9215 |
| w/ FSIM | 0.6273 | 0.6056 | 0.6519 | 0.7468 |
| w/ MSE | 0.9051 | 0.9215 | 0.9177 | 0.9084 |
| w/ LPIPS | 0.9084 | 0.8987 | 0.9051 | 0.9208 |
| w/ DISTS | 0.8905 | 0.8720 | 0.8987 | 0.8861 |
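The classification accuracies in the two tables above come from fine-tuning standard CNN backbones (ResNet-18, Inception, SqueezeNet, VGG-19) on the translated optical images. Below is a minimal sketch of the ResNet-18 setup, assuming torchvision; the single-channel 256 × 256 inputs are replicated to three channels before the backbone, and the five scene classes follow the dataset table above. The dataset path and channel handling are assumptions, not the authors' exact pipeline.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Replicate the single channel to 3 channels so the ImageNet-pretrained backbone can be reused.
tf = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])

# Placeholder folder layout: one subdirectory per scene class (Farmland, Forest, Gorge, River, Residential).
train_set = datasets.ImageFolder('translated_images/train', transform=tf)
loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 5)        # 5 scene classes

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

Swapping in Inception, SqueezeNet or VGG-19 follows the same pattern, with the corresponding final classification layer replaced to output five classes.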
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, J.; Zhou, J.; Li, M.; Zhou, H.; Yu, T. Quality Assessment of SAR-to-Optical Image Translation. Remote Sens. 2020, 12, 3472. https://doi.org/10.3390/rs12213472