Structure- and texture-aware learning for low-light image enhancement
Proceedings of the 30th ACM International Conference on Multimedia, 2022 • dl.acm.org
Structure and texture information is critically important for low-light image enhancement, in terms of both stable global adjustment and fine-detail recovery. However, most existing methods tend to learn the structure and texture of low-light images in a coupled manner, without adequately accounting for the heterogeneity between them, which makes it difficult for a model to learn both well. In this paper, we tackle this problem with a divide-and-conquer strategy, based on the observation that structure and texture representations are highly separated in the frequency spectrum. Specifically, we propose a Structure and Texture Aware Network (STAN) for low-light image enhancement, which consists of a structure sub-network and a texture sub-network. The former exploits the low-pass characteristic of the transformer to capture low-frequency structural representations, while the latter builds upon central difference convolution to capture high-frequency texture representations. We establish a Multi-Spectrum Interaction (MSI) module between the two sub-networks so that each provides complementary information to the other. In addition, to further improve the capability of the model, we introduce a dual distillation scheme that assists the learning of the two sub-networks via their counterparts' normal-light structure and texture representations. Comprehensive experiments show that the proposed STAN outperforms state-of-the-art methods both qualitatively and quantitatively.
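
To make the high-frequency behaviour of central difference convolution concrete, the following is a minimal sketch in plain Python. It is not the STAN texture sub-network. It assumes the commonly used formulation in which the output is an ordinary convolution minus theta times the centre pixel times the sum of the kernel weights. The function name, the blend parameter theta, and the toy inputs are hypothetical and chosen only for illustration.

```python
def central_difference_conv2d(image, kernel, theta=1.0):
    """Valid-mode 2D central difference convolution on nested lists.

    Assumed formulation (not taken from the abstract):
        out(p0) = sum_n w(pn) * x(p0 + pn) - theta * x(p0) * sum_n w(pn)
    theta = 0 reduces to an ordinary convolution (cross-correlation);
    theta = 1 aggregates only differences to the centre pixel.
    """
    kh, kw = len(kernel), len(kernel[0])
    if kh % 2 == 0 or kw % 2 == 0:
        raise ValueError("kernel must have odd height and width")
    h, w = len(image), len(image[0])
    ch, cw = kh // 2, kw // 2
    kernel_sum = sum(sum(row) for row in kernel)
    out = []
    for i in range(h - kh + 1):
        row_out = []
        for j in range(w - kw + 1):
            # Ordinary weighted sum over the local window.
            vanilla = sum(
                kernel[a][b] * image[i + a][j + b]
                for a in range(kh)
                for b in range(kw)
            )
            # Subtract the centre pixel's contribution, scaled by theta.
            centre = image[i + ch][j + cw]
            row_out.append(vanilla - theta * centre * kernel_sum)
        out.append(row_out)
    return out


# Toy inputs: a constant patch and a patch with a vertical step edge.
box = [[1.0] * 3 for _ in range(3)]
flat = [[5.0] * 4 for _ in range(4)]
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]

flat_response = central_difference_conv2d(flat, box, theta=1.0)
edge_response = central_difference_conv2d(edge, box, theta=1.0)
print(flat_response)  # [[0.0, 0.0], [0.0, 0.0]]
print(edge_response)  # [[3.0, -3.0], [3.0, -3.0]]
```

With theta = 1 the constant patch produces a zero response while the step edge does not, which is the sense in which the operator favours high-frequency content such as texture. With theta = 0 the same call behaves like an ordinary convolution and responds to the constant patch as well.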