DOI: 10.1145/3095713.3095730
research-article

Connoisseur: classification of styles of Mexican architectural heritage with deep learning and visual attention prediction

Published: 19 June 2017

Abstract

Automatic description of multimedia content has been developed mainly for classification tasks, retrieval systems, and large-scale ordering of data. The preservation of cultural heritage is an important application area for these methods. Our problem is the classification of architectural styles of buildings in digital photographs of Mexican cultural heritage. Selecting the relevant content in a scene when training classification models makes them more precise in the classification task. Here we use a saliency-driven approach to predict visual attention in images, and we use the predicted attention to train a Convolutional Neural Network (CNN) that identifies the architectural style of Mexican buildings. We also analyze the behavior of models trained on traditionally cropped images and on saliency (prominence) maps. We show that the saliency-based CNNs outperform traditional training, reaching a classification rate of 97% on the validation dataset. We expect style identification with this technique to contribute widely to video description tasks, specifically the automatic documentation of Mexican cultural heritage.
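As a rough illustration of the idea of selecting salient content before training, the following sketch crops an image around the region a saliency map highlights. It is not the authors' pipeline: the gradient-magnitude map in `toy_saliency` is only a stand-in for a real visual attention predictor, and the function names and crop size are assumptions made for this example.

```python
import numpy as np


def toy_saliency(gray: np.ndarray) -> np.ndarray:
    """Stand-in saliency map (assumption): gradient magnitude scaled to [0, 1]."""
    gy, gx = np.gradient(gray.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    return magnitude / peak if peak > 0 else np.zeros_like(magnitude)


def salient_crop(image: np.ndarray, saliency: np.ndarray, size: int) -> np.ndarray:
    """Crop a size x size window centered on the saliency-weighted centroid.

    The window is clamped so that it stays inside the image.
    """
    h, w = saliency.shape
    if size > h or size > w:
        raise ValueError("crop size must not exceed the image dimensions")
    total = saliency.sum()
    if total > 0:
        ys, xs = np.mgrid[0:h, 0:w]
        cy = int(round(float((ys * saliency).sum() / total)))
        cx = int(round(float((xs * saliency).sum() / total)))
    else:
        # No salient content found: fall back to a center crop.
        cy, cx = h // 2, w // 2
    top = min(max(cy - size // 2, 0), h - size)
    left = min(max(cx - size // 2, 0), w - size)
    return image[top:top + size, left:left + size]


# Synthetic example: a bright square placed away from the image center.
image = np.zeros((64, 64))
image[40:50, 10:20] = 1.0
saliency = toy_saliency(image)
crop = salient_crop(image, saliency, size=32)
```

In a training setup, crops like this (or the image weighted by the saliency map) would replace plain center crops as CNN input; which variant the paper uses is not stated in the abstract.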



        Published In

        CBMI '17: Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing
        June 2017
        237 pages
        ISBN:9781450353335
        DOI:10.1145/3095713
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. CNN
        2. cultural heritage
        3. deep learning
        4. image classification
        5. visual attention prediction

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        CBMI '17
