Low Rank Multi-Dictionary Selection at Scale

Authors: Boya Ma, Maxwell McNeil, Abram Magner, Petko Bogdanov

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Pages 2106 - 2117

https://doi.org/10.1145/3637528.3671723

Published: 24 August 2024


Abstract

The sparse dictionary coding framework represents signals as a linear combination of a few predefined dictionary atoms. It has been employed for images, time series, graph signals and, more recently, for 2-way (or 2D) spatio-temporal data by jointly employing temporal and spatial dictionaries. Large and over-complete dictionaries enable high-quality models, but they also pose scalability challenges, which are exacerbated in multi-dictionary settings. Hence, an important problem that we address in this paper is: how can multi-dictionary coding be scaled to large dictionaries and datasets?
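For readers who prefer notation, the setting described above can be sketched as follows. The symbols (x, D, z, X, Ψ, Φ, Y, W, the rank r and the weights λ) are our own assumptions for illustration and do not appear on this page; the objective shown is one common form of low-rank two-dictionary coding, not necessarily the exact one used in the paper.

```latex
% Single-dictionary sparse coding: a signal x is a combination of a few atoms (columns of D).
x \approx D z, \qquad \|z\|_0 \le s

% Two-way (2D) coding: a left dictionary \Psi and a right dictionary \Phi
% with a low-rank, sparse encoding Z = Y W of rank at most r.
X \approx \Psi \, Y W \, \Phi, \qquad
Y \in \mathbb{R}^{p \times r}, \; W \in \mathbb{R}^{r \times q}

% A convex-relaxation style objective (l1 in place of l0):
\min_{Y, W} \; \|X - \Psi Y W \Phi\|_F^2 + \lambda_1 \|Y\|_1 + \lambda_2 \|W\|_1
```

The cost of solving such a problem grows with the number of atoms p and q in both dictionaries, which is the scalability concern raised in the abstract.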

We propose a multi-dictionary atom selection technique for low-rank sparse coding named LRMDS. To enable scalability to large dictionaries and datasets, it progressively selects groups of row-column atom pairs based on their alignment with the data and performs convex relaxation coding via the corresponding sub-dictionaries. We demonstrate both theoretically and experimentally that when the data has a low-rank encoding with a sparse subset of the atoms, LRMDS is able to select them with strong guarantees under mild assumptions. Furthermore, we demonstrate the scalability and quality of LRMDS on both synthetic and real-world datasets and for a range of coding dictionaries. It achieves a 3x to 10x speed-up compared to baselines, while obtaining up to two orders of magnitude improvement in representation quality on some of the real-world datasets given a fixed target number of atoms.
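The selection idea in the paragraph above can be illustrated with a minimal NumPy sketch. This is not the authors' LRMDS implementation: the function names, the parameters `k`, `rounds` and `tol`, and the rule of taking the top-k aligned pairs per round are all assumptions made for illustration, and a plain least-squares fit stands in for the convex relaxation coding step mentioned in the abstract.

```python
import numpy as np

def select_atom_pairs(R, Psi, Phi, k):
    """Pick the k (left atom, right atom) pairs best aligned with R.

    Psi holds left atoms as columns (n x p); Phi holds right atoms as rows (q x m).
    A[i, j] is the inner product between R and the rank-one atom psi_i phi_j^T.
    """
    A = Psi.T @ R @ Phi.T
    top = np.argsort(np.abs(A), axis=None)[::-1][:k]
    rows, cols = np.unravel_index(top, A.shape)
    return np.unique(rows), np.unique(cols)

def greedy_select_and_code(X, Psi, Phi, k=2, rounds=3, tol=1e-8):
    """Progressively grow sub-dictionaries and refit the encoding on them.

    Hypothetical helper: least squares replaces the paper's convex coding step.
    """
    left = np.array([], dtype=int)
    right = np.array([], dtype=int)
    Z = np.zeros((0, 0))
    R = X.copy()
    for _ in range(rounds):
        # Stop once the residual is negligible relative to the data.
        if np.linalg.norm(R) <= tol * np.linalg.norm(X):
            break
        r, c = select_atom_pairs(R, Psi, Phi, k)
        left = np.union1d(left, r)
        right = np.union1d(right, c)
        Ps, Ph = Psi[:, left], Phi[right, :]
        # Least-squares encoding on the selected sub-dictionaries only.
        Z = np.linalg.pinv(Ps) @ X @ np.linalg.pinv(Ph)
        R = X - Ps @ Z @ Ph
    return left, right, Z, R

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Psi = np.linalg.qr(rng.standard_normal((8, 8)))[0]    # orthonormal left atoms
    Phi = np.linalg.qr(rng.standard_normal((10, 10)))[0]  # orthonormal right atoms
    X = Psi[:, [1, 4]] @ np.array([[3.0, 1.0], [1.0, 2.0]]) @ Phi[[0, 3], :]
    left, right, Z, R = greedy_select_and_code(X, Psi, Phi)
    print("left atoms:", left, "right atoms:", right, "residual:", np.linalg.norm(R))
```

The point of the design is that the expensive fit only ever touches the selected sub-dictionaries `Psi[:, left]` and `Phi[right, :]`, whose sizes depend on the number of selected atoms rather than on the full dictionary sizes.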

Supplemental Material

MP4 File - LRMDS-video

In this video we introduce our low-rank multi-dictionary selection model. We first discuss the basic low-rank multi-dictionary coding model (TGSD), its wide range of applications and its main drawback: limited scalability with large over-complete dictionaries. We introduce the general idea of how we address this limitation by sub-selecting dictionary atoms informed by the input data. We preview some experimental results demonstrating the ability of our approach to accurately discard a large fraction of unnecessary atoms and enable significant reduction in both the representation error and running time.


