Developing and Aligning a Detailed Controlled Vocabulary for Artwork

  • Conference paper
New Trends in Database and Information Systems (ADBIS 2022)

Abstract

Controlled vocabularies have proved to be critical for data interoperability and accessibility. In the cultural heritage (CH) domain, descriptions of artworks are often given as free text, which makes filtering and searching burdensome (e.g. listing all artworks of a specific type). Despite being multilingual and quite detailed, the Getty’s Art & Architecture Thesaurus, a de facto standard for describing artworks, has low coverage for languages other than English and sometimes does not reach the degree of granularity required to describe specific niche artworks. We build upon the Italian Vocabulary of Artworks, developed by the Italian Ministry of Cultural Heritage (MIC), and a set of free-text descriptions from ArCo, the knowledge graph of Italian CH, to propose an extension of the Vocabulary of Artworks and align it to the Getty’s thesaurus. Our framework relies on text matching and natural language processing tools to suggest candidate alignments between free text and terms and between cross-vocabulary terms, with a human in the loop for validation and refinement. We produce 1,166 new terms (31% more than in the original vocabulary) and 1,330 links to the Getty’s thesaurus, with an estimated coverage of 21%.
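
The candidate-suggestion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain fuzzy string comparison (Python's standard `difflib.SequenceMatcher`) as a stand-in for the text matching and NLP tools the paper mentions, and the function names, the sample vocabulary, and the 0.85 threshold are hypothetical choices made for illustration. Each vocabulary term is compared against word windows of the same length in a free-text description, and terms scoring above the threshold are returned as candidates for a human validator to accept or reject.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so comparisons ignore casing and spacing."""
    return " ".join(text.lower().split())


def ngrams(tokens: list[str], n: int) -> list[str]:
    """Return all contiguous word windows of length n, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def suggest_candidates(description: str, vocabulary: list[str],
                       threshold: float = 0.85, top_k: int = 3):
    """Suggest vocabulary terms that approximately occur in a free-text description.

    Hypothetical helper: the threshold and top_k defaults are illustrative,
    not values taken from the paper.
    """
    tokens = normalize(description).split()
    scored = []
    for term in vocabulary:
        term_norm = normalize(term)
        if not term_norm:
            continue  # skip empty vocabulary entries
        n = len(term_norm.split())
        # Compare the term against every window with the same number of words;
        # fall back to the whole description when it is shorter than the term.
        windows = ngrams(tokens, n) or [" ".join(tokens)]
        best = max(SequenceMatcher(None, term_norm, w).ratio() for w in windows)
        if best >= threshold:
            scored.append((term, round(best, 3)))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical vocabulary and description, for illustration only.
    vocab = ["dipinto", "scultura", "calice", "pala d'altare"]
    text = "Dipinto a olio su tela raffigurante la Madonna"
    for term, score in suggest_candidates(text, vocab):
        print(f"candidate: {term} (similarity {score})")
```

In a workflow like the one the abstract outlines, the returned list would only be a set of suggestions: a curator would confirm or discard each candidate, and a description with no candidate above the threshold would be a natural place to consider proposing a new term.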

This work was supported by the project POR FESR Lazio 2014–2020: “ReAD - Representation of Architectural Data”.

Notes

  1. http://www.iccd.beniculturali.it/it/normative
  2. http://www.getty.edu/research/tools/vocabularies/aat/aat_faq.html#number
  3. http://www.iccd.beniculturali.it/it/ricercanormative/139/thesaurus-per-la-definizione-dei-beni-storici-artistici
  4. https://www.wikidata.org/
  5. http://compling.hss.ntu.edu.sg/omw/
  6. http://www.getty.edu/research/tools/vocabularies/aat/aat_faq.html#number
  7. https://www.wikidata.org/wiki/Property:P1014
  8. https://pro.europeana.eu/project/evaluation-and-enrichments
  9. http://www.iccd.beniculturali.it/it/ricercanormative/139/thesaurus-per-la-definizione-dei-beni-storici-artistici
  10. http://www.iccd.beniculturali.it/it/ricercanormative/108/thesaurus-per-la-definizione-dei-reperti-archeologici
  11. http://www.catalogo.beniculturali.it/sigecSSU_FE/Home.action?timestamp=1521647516354
  12. http://www.iccd.beniculturali.it/it/sigecweb
  13. https://github.com/LuanaBulla/Controlled-Vocabularies-for-Cultural-Heritage/tree/main/code
  14. https://www.wikidata.org/
  15. http://compling.hss.ntu.edu.sg/omw/
  16. https://github.com/LuanaBulla/Controlled-Vocabularies-for-Cultural-Heritage

Author information

Corresponding author

Correspondence to Luana Bulla.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Bulla, L. et al. (2022). Developing and Aligning a Detailed Controlled Vocabulary for Artwork. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_48

  • DOI: https://doi.org/10.1007/978-3-031-15743-1_48

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15742-4

  • Online ISBN: 978-3-031-15743-1

  • eBook Packages: Computer Science, Computer Science (R0)
