DOI: 10.1145/3664647.3681666
research-article

Generalizing ISP Model by Unsupervised Raw-to-raw Mapping

Published: 28 October 2024

Abstract

An ISP (Image Signal Processor) is a pipeline that converts unprocessed raw images to sRGB images and sits in front of nearly all visual tasks. Because cameras differ in spectral sensitivity, raw images captured by different cameras lie in different color spaces, which makes it challenging to deploy an ISP across cameras with consistent performance. An intuitive way to address this challenge is to incorporate a raw-to-raw mapping module (one that maps raw images across camera color spaces) into the ISP. However, the lack of paired data (i.e., images of the same scene captured by different cameras) makes it difficult to train a raw-to-raw model with supervised learning. In this paper, we aim to achieve ISP generalization by proposing the first unsupervised raw-to-raw model. Specifically, we propose a CSTPP (Color Space Transformation Parameters Predictor) module that predicts the color space transformation parameters in a patch-wise manner, which allows it to perform the color space transformation accurately and to handle complex lighting conditions flexibly. Additionally, we design a CycleGAN-style training framework to realize unsupervised learning, overcoming the lack of paired data. Our proposed unsupervised model achieves performance comparable to that of the state-of-the-art semi-supervised method on the raw-to-raw task. Furthermore, to assess its ability to generalize an ISP model across cameras, we formulate the cross-camera ISP task for the first time and demonstrate the performance of our method through extensive experiments. The code is released at https://github.com/ydxxxx/Unsupervised-Raw-to-raw-Mapping.
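To make the two ideas in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes, purely for illustration, that each patch's transformation parameters form a 3x3 color matrix and that the image has three channels per pixel. The abstract does not specify the parameter form or the raw layout. The function names `apply_patchwise_transform`, `cycle_l1`, and `diag` are hypothetical, and the matrices are written by hand here, whereas in the paper they are predicted by the CSTPP module. The second function shows the round-trip (cycle-consistency) error that CycleGAN-style training typically penalizes; the adversarial part is omitted.

```python
def apply_patchwise_transform(image, matrices, patch):
    """Apply one 3x3 color matrix per patch (illustrative assumption).

    image:    H x W nested list of [c0, c1, c2] pixels.
    matrices: (H // patch) x (W // patch) nested list of 3x3 matrices.
    """
    out = []
    for y, row in enumerate(image):
        out_row = []
        for x, pixel in enumerate(row):
            # Pick the matrix that belongs to the patch containing (y, x).
            m = matrices[y // patch][x // patch]
            out_row.append(
                [sum(m[i][j] * pixel[j] for j in range(3)) for i in range(3)]
            )
        out.append(out_row)
    return out


def cycle_l1(image, a_to_b, b_to_a, patch):
    """Mean absolute error between an image and its A -> B -> A round trip."""
    mapped = apply_patchwise_transform(image, a_to_b, patch)
    restored = apply_patchwise_transform(mapped, b_to_a, patch)
    diffs = [
        abs(r - o)
        for row_r, row_o in zip(restored, image)
        for px_r, px_o in zip(row_r, row_o)
        for r, o in zip(px_r, px_o)
    ]
    return sum(diffs) / len(diffs)


def diag(r, g, b):
    """Hypothetical helper: a diagonal 3x3 matrix of per-channel gains."""
    return [[r, 0.0, 0.0], [0.0, g, 0.0], [0.0, 0.0, b]]


# A 2 x 4 image with patch size 2 gives a 1 x 2 grid of patches.
image = [[[0.25, 0.5, 0.75] for _ in range(4)] for _ in range(2)]
a_to_b = [[diag(2.0, 1.0, 0.5), diag(1.0, 1.0, 1.0)]]
b_to_a = [[diag(0.5, 1.0, 2.0), diag(1.0, 1.0, 1.0)]]

mapped = apply_patchwise_transform(image, a_to_b, 2)
print(mapped[0][0], mapped[0][2])
print(cycle_l1(image, a_to_b, b_to_a, 2))
```

Because each patch has its own matrix, regions of one image lit by different light sources can be mapped differently, which is one plausible reading of how a patch-wise predictor could help under complex lighting.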



Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
October 2024
11719 pages
ISBN:9798400706868
DOI:10.1145/3664647
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. image signal processor
  2. parameterized model
  3. unsupervised raw-to-raw mapping

Qualifiers

  • Research-article

Conference

MM '24: The 32nd ACM International Conference on Multimedia
October 28 - November 1, 2024
Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
