DOI: 10.1145/3078971.3079001
research-article

TEX-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition

Published: 06 June 2017

Abstract

Recognizing materials and textures in realistic imaging conditions is a challenging computer vision problem. For many years, orderless representations based on local features were the dominant approach to texture recognition. Recently, deep local features, extracted from the intermediate layers of a Convolutional Neural Network (CNN), have been used as filter banks. These dense local descriptors from a deep model, when encoded with Fisher Vectors, have been shown to provide excellent results for texture recognition. The CNN models employed in such approaches take RGB patches as input and are trained on a large number of labeled images. We show that CNN models, which we call TEX-Nets, trained using mapped coded images with explicit texture information provide complementary information to the standard deep models trained on RGB patches. We further investigate two deep architectures, namely early and late fusion, to combine the texture and color information. Experiments on benchmark texture datasets clearly demonstrate that TEX-Nets provide complementary information to the standard RGB deep network. Our approach provides large accuracy gains of 4.8%, 3.5%, 2.6% and 4.1% on the DTD, KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets, respectively, compared to the standard RGB network of the same architecture. Further, our final combination leads to consistent improvements over the state of the art on all four datasets.
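
The abstract does not spell out how the texture-coded input is built or how the fusion layers are wired, so the following is only a rough sketch under stated assumptions. It assumes the basic 8-neighbour local binary pattern (LBP) operator as the texture code, early fusion as stacking the code image with the RGB channels at the input, and late fusion as concatenating feature vectors from two separately trained networks. The function names (lbp_codes, early_fusion_input, late_fusion_features) are hypothetical and are not taken from the paper.

```python
import numpy as np

# Offsets of the 8 neighbours, clockwise starting at the top-left pixel.
# Assumption: plain 3x3 LBP; the paper's "mapped coded images" may differ.
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(gray):
    """Return the 8-bit LBP code of every interior pixel of a 2-D image."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:h - 1, 1:w - 1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(_OFFSETS):
        # Shifted view holding, for each interior pixel, one of its neighbours.
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def early_fusion_input(rgb):
    """Illustrative early fusion: stack RGB and the LBP code image as 4 channels."""
    rgb = np.asarray(rgb, dtype=np.float64)
    codes = lbp_codes(rgb.mean(axis=2))   # grey level taken as the channel mean
    cropped = rgb[1:-1, 1:-1, :]          # drop the border, which has no LBP code
    return np.concatenate([cropped, codes[..., None].astype(np.float64)], axis=2)


def late_fusion_features(rgb_features, texture_features):
    """Illustrative late fusion: concatenate features from two separate networks."""
    return np.concatenate([rgb_features, texture_features], axis=-1)
```

In this sketch, early fusion changes only the network input (four channels instead of three), whereas late fusion leaves both networks untouched and combines their outputs before the classifier.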

    Published In

    ICMR '17: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval
    June 2017
    524 pages
    ISBN: 9781450347013
    DOI: 10.1145/3078971
    General Chairs: Bogdan Ionescu, Nicu Sebe
    Program Chairs: Jiashi Feng, Martha Larson, Rainer Lienhart, Cees Snoek
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. convolutional neural networks
    2. local binary patterns
    3. texture recognition

    Qualifiers

    • Research-article

    Conference

    ICMR '17
    Acceptance Rates

    ICMR '17 Paper Acceptance Rate 33 of 95 submissions, 35%;
    Overall Acceptance Rate 254 of 830 submissions, 31%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 11
    • Downloads (Last 6 weeks): 1
    Reflects downloads up to 05 Feb 2025

    Citations

    Cited By

    • (2024) Deep and shallow feature fusion framework for remote sensing open pit coal mine scene recognition. Scientific Reports 14:1. DOI: 10.1038/s41598-024-72855-5. Online publication date: 15-Oct-2024
    • (2024) Tactile texture recognition of multi-modal bionic finger based on multi-modal CBAM-CNN interpretable method. Displays 83 (102732). DOI: 10.1016/j.displa.2024.102732. Online publication date: Jul-2024
    • (2020) Empirical Remarks on the Translational Equivariance of Convolutional Layers. Applied Sciences 10:9 (3161). DOI: 10.3390/app10093161. Online publication date: 1-May-2020
    • (2019) A Novel Attention-based Neural Network for Video Scene Classification in Complex Background. Proceedings of the 32nd International Conference on Computer Animation and Social Agents (85-88). DOI: 10.1145/3328756.3328768. Online publication date: 1-Jul-2019
