Towards a Better Learning of Near-Synonyms: Automatically Suggesting Example Sentences via Fill in the Blank

Published: 03 April 2017

Abstract

Language learners are often confused by near-synonyms and look to the Web for answers, but little exists to help them sort through the overwhelming amount of information on offer. In this paper, we propose a new research problem: suggesting example sentences for learning word distinctions. We focus on near-synonyms as the first step. Two kinds of one-class classifiers, GMM and BiLSTM models, are used to solve fill-in-the-blank (FITB) questions and, further, to select the example sentences that best differentiate groups of near-synonyms. Experiments on an open benchmark and a private dataset for the FITB task show that the proposed approach achieves accuracies of 73.05% and 83.59%, respectively, comparable to state-of-the-art multi-class classifiers. A learner study further evaluates the suggested example sentences in terms of learning effectiveness, demonstrating that the proposed model is indeed more effective for learning near-synonyms than resource-based models.
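
The abstract's one-class setup can be pictured concretely. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes pre-trained word embeddings (e.g., GloVe vectors loaded into a dict), represents each FITB blank by an averaged window of context vectors, fits one Gaussian mixture per near-synonym, answers FITB items by maximum log-likelihood, and ranks candidate example sentences by the likelihood margin over rival synonyms as a stand-in for the paper's selection criterion. All function and variable names here are hypothetical.

```python
# Hypothetical sketch: one-class GMMs for near-synonym fill-in-the-blank (FITB).
# Assumes `embeddings` maps tokens to NumPy vectors (e.g., loaded GloVe).
import numpy as np
from sklearn.mixture import GaussianMixture

def context_vector(tokens, blank_idx, embeddings, window=2):
    """Average the embeddings of up to `window` words on each side of the blank."""
    ctx = [t for i, t in enumerate(tokens)
           if i != blank_idx and abs(i - blank_idx) <= window and t in embeddings]
    dim = len(next(iter(embeddings.values())))
    return np.mean([embeddings[t] for t in ctx], axis=0) if ctx else np.zeros(dim)

def train_gmms(contexts_by_word, n_components=3):
    """Fit one GMM per near-synonym on context vectors from sentences known
    to use that word -- the one-class training signal."""
    models = {}
    for word, vecs in contexts_by_word.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag", random_state=0)
        gmm.fit(np.vstack(vecs))
        models[word] = gmm
    return models

def answer_fitb(models, ctx_vec):
    """Pick the near-synonym whose GMM assigns the blank's context the
    highest average log-likelihood."""
    x = ctx_vec.reshape(1, -1)
    return max(models, key=lambda w: models[w].score(x))

def margin(models, word, ctx_vec):
    """Gap between the target word's log-likelihood and the best rival's;
    a large gap suggests the sentence differentiates the synonym group well."""
    x = ctx_vec.reshape(1, -1)
    rival = max(models[w].score(x) for w in models if w != word)
    return models[word].score(x) - rival

def suggest_examples(models, word, candidates, k=5):
    """Rank (tokens, blank_idx, ctx_vec) candidates for `word` by margin."""
    return sorted(candidates,
                  key=lambda c: margin(models, word, c[2]), reverse=True)[:k]
```

In the paper's framing, the BiLSTM variant plays the same one-class role with a learned sequence representation in place of the averaged window; the margin-style ranking above is only one plausible reading of "sentences which best differentiate groups of near-synonyms."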

Cited By

  • (2023) Learning to Paraphrase Sentences to Different Complexity Levels. Transactions of the Association for Computational Linguistics, 11, 1332-1354. DOI: 10.1162/tacl_a_00606. Online publication date: 13-Nov-2023.

Information

Published In

WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion
April 2017, 1738 pages
ISBN: 9781450349147

Sponsors

  • IW3C2: International World Wide Web Conference Committee

Publisher

International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland

Author Tags

  1. bilstm
  2. computer-assisted language learning
  3. data-driven language learning
  4. gmm
  5. natural language processing

Qualifiers

  • Research-article

Funding Sources

  • Ministry of Science and Technology, Taiwan

Conference

WWW '17
Sponsor: IW3C2

Acceptance Rates

WWW '17 Companion paper acceptance rate: 164 of 966 submissions, 17%. Overall acceptance rate: 1,899 of 8,196 submissions, 23%.
