Generalizing ISP Model by Unsupervised Raw-to-raw Mapping

Authors:

Yang Yang

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 3809 - 3817

https://doi.org/10.1145/3664647.3681666

Published: 28 October 2024

Abstract

An ISP (Image Signal Processor) is a pipeline that converts unprocessed raw images to sRGB images and sits upstream of nearly all visual tasks. Because cameras differ in spectral sensitivity, raw images captured by different cameras lie in different color spaces, which makes it difficult to deploy an ISP across cameras with consistent performance. An intuitive remedy is to incorporate a raw-to-raw mapping module (one that maps raw images across camera color spaces) into the ISP. However, the lack of paired data (i.e., images of the same scene captured by different cameras) makes it difficult to train a raw-to-raw model with supervised learning. In this paper, we aim to achieve ISP generalization by proposing the first unsupervised raw-to-raw model. Specifically, we propose a CSTPP (Color Space Transformation Parameters Predictor) module that predicts the color space transformation parameters in a patch-wise manner, which allows it to perform color space transformation accurately and handle complex lighting conditions flexibly. Additionally, we design a CycleGAN-style training framework to realize unsupervised learning, overcoming the lack of paired data. Our unsupervised model achieves performance comparable to that of the state-of-the-art semi-supervised method on the raw-to-raw task. Furthermore, to assess its ability to generalize an ISP model across cameras, we formulate the cross-camera ISP task for the first time and demonstrate the performance of our method through extensive experiments. The code is released at https://github.com/ydxxxx/Unsupervised-Raw-to-raw-Mapping.
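To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch of a patch-wise color space transformation and a cycle-consistency error. It is not the authors' implementation. It assumes the transformation parameters are one 3x3 matrix per non-overlapping patch, that the raw image is already a 3-channel array, and that the matrices are given directly instead of being predicted by a network. The names `apply_patchwise_cst` and `cycle_l1` are hypothetical.

```python
import numpy as np


def apply_patchwise_cst(raw, matrices, patch):
    """Apply one 3x3 color matrix to every non-overlapping patch.

    raw:      (H, W, 3) array; H and W must be divisible by `patch`
    matrices: (H // patch, W // patch, 3, 3) array, one matrix per patch
              (assumed form of the per-patch transformation parameters)
    """
    h, w, c = raw.shape
    gh, gw = h // patch, w // patch
    # View the image as a grid of patches: (gh, patch, gw, patch, 3).
    blocks = raw.reshape(gh, patch, gw, patch, c)
    # For each grid cell (g, h), multiply every pixel's color vector
    # by that cell's matrix.
    out = np.einsum("ghij,gphqj->gphqi", matrices, blocks)
    return out.reshape(h, w, c)


def cycle_l1(raw_a, a_to_b, b_to_a, patch):
    """Mean absolute error after mapping camera A -> B -> A."""
    fake_b = apply_patchwise_cst(raw_a, a_to_b, patch)
    rec_a = apply_patchwise_cst(fake_b, b_to_a, patch)
    return float(np.mean(np.abs(rec_a - raw_a)))
```

In a CycleGAN-style setup, a term like `cycle_l1` would typically be combined with adversarial losses from discriminators on each camera's color space; those parts are omitted here.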

Index Terms

Generalizing ISP Model by Unsupervised Raw-to-raw Mapping
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
  2. Computer graphics
    1. Image manipulation
      1. Computational photography
      2. Image processing

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN: 9798400706868

DOI: 10.1145/3664647

General Chairs:
Jianfei Cai (Monash University, Australia)
Mohan Kankanhalli (NUS, Singapore)
Balakrishnan Prabhakaran (UT Dallas, USA)
Susanne Boll (University of Oldenburg, Germany)

Program Chairs:
Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia)
Liang Zheng (Australian National University, Australia)
Vivek K. Singh (Rutgers University, USA)
Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands)
Lexing Xie (Australian National University, Australia)
Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

MM '24

Sponsor:

SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
