SemEval-2012 task 6: a pilot on semantic textual similarity

Authors: Eneko Agirre, Mona Diab, Daniel Cer, Aitor Gonzalez-Agirre

SemEval '12: Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation

Pages 385 - 393

Published: 07 June 2012

Abstract

Semantic Textual Similarity (STS) measures the degree of semantic equivalence between two texts. This paper presents the results of the STS pilot task in SemEval. The training data contained 2000 sentence pairs from previously existing paraphrase datasets and machine translation evaluation resources. The test data also comprised 2000 sentence pairs from those datasets, plus two surprise datasets with 400 pairs from a different machine translation evaluation corpus and 750 pairs from a lexical resource mapping exercise. The similarity of each sentence pair was rated on a 0-5 scale (low to high similarity) by human judges using Amazon Mechanical Turk, with high Pearson correlation scores of around 90%. Thirty-five teams participated in the task, submitting 88 runs. The best results scored a Pearson correlation above 80%, well above a simple lexical baseline that scored only a 31% correlation. This pilot task opens an exciting way forward, although open issues remain, especially the evaluation metric.
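As an illustration of the evaluation described in the abstract, the following is a minimal sketch of how a Pearson correlation between system scores and human ratings on the 0-5 scale could be computed. The function name `pearson` and the score lists are hypothetical, made-up values chosen only for illustration; they are not taken from the task data, and this is not the organizers' official scoring script.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical gold ratings (0-5 scale) and system outputs for five pairs.
gold_scores = [0.0, 1.0, 2.5, 4.0, 5.0]
system_scores = [0.5, 1.5, 2.0, 3.5, 4.8]

r = pearson(gold_scores, system_scores)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the system ranks and spaces the pairs much like the human judges did; the percentages quoted in the abstract correspond to this coefficient multiplied by 100.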

Published In

June 2012, 758 pages

General Chair: Eneko Agirre, University of the Basque Country

Publisher

Association for Computational Linguistics, United States

Qualifiers

Research-article

Acceptance Rates

Overall Acceptance Rate 8 of 31 submissions, 26%
