Towards a Better Learning of Near-Synonyms: Automatically Suggesting Example Sentences via Fill in the Blank

Published: 03 April 2017

Abstract

Language learners are often confused by near-synonyms and look to the Web for answers, but little exists to help them sort through the overwhelming amount of information on offer. In this paper, we propose a new research problem: suggesting example sentences for learning word distinctions. We focus on near-synonyms as the first step. Two kinds of one-class classifiers, GMM and BiLSTM models, are used to solve fill-in-the-blank (FITB) questions and, further, to select the example sentences that best differentiate groups of near-synonyms. Experiments on an open benchmark and a private dataset for the FITB task show that the proposed approach achieves accuracies of 73.05% and 83.59%, respectively, comparable to state-of-the-art multi-class classifiers. A learner study further evaluates the suggested example sentences in terms of learning effectiveness, demonstrating that the proposed model is indeed more effective for learning near-synonyms than resource-based models.
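
The abstract's one-class setup can be pictured concretely. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes pre-trained word embeddings (e.g., GloVe vectors loaded into a dict), represents each FITB blank by an averaged window of context vectors, fits one Gaussian mixture per near-synonym, answers FITB items by maximum log-likelihood, and ranks candidate example sentences by the likelihood margin over rival synonyms as a stand-in for the paper's selection criterion. All function and variable names here are hypothetical.

```python
# Hypothetical sketch: one-class GMMs for near-synonym fill-in-the-blank (FITB).
# Assumes `embeddings` maps tokens to NumPy vectors (e.g., loaded GloVe).
import numpy as np
from sklearn.mixture import GaussianMixture

def context_vector(tokens, blank_idx, embeddings, window=2):
    """Average the embeddings of up to `window` words on each side of the blank."""
    ctx = [t for i, t in enumerate(tokens)
           if i != blank_idx and abs(i - blank_idx) <= window and t in embeddings]
    dim = len(next(iter(embeddings.values())))
    return np.mean([embeddings[t] for t in ctx], axis=0) if ctx else np.zeros(dim)

def train_gmms(contexts_by_word, n_components=3):
    """Fit one GMM per near-synonym on context vectors from sentences known
    to use that word -- the one-class training signal."""
    models = {}
    for word, vecs in contexts_by_word.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag", random_state=0)
        gmm.fit(np.vstack(vecs))
        models[word] = gmm
    return models

def answer_fitb(models, ctx_vec):
    """Pick the near-synonym whose GMM assigns the blank's context the
    highest average log-likelihood."""
    x = ctx_vec.reshape(1, -1)
    return max(models, key=lambda w: models[w].score(x))

def margin(models, word, ctx_vec):
    """Gap between the target word's log-likelihood and the best rival's;
    a large gap suggests the sentence differentiates the synonym group well."""
    x = ctx_vec.reshape(1, -1)
    rival = max(models[w].score(x) for w in models if w != word)
    return models[word].score(x) - rival

def suggest_examples(models, word, candidates, k=5):
    """Rank (tokens, blank_idx, ctx_vec) candidates for `word` by margin."""
    return sorted(candidates,
                  key=lambda c: margin(models, word, c[2]), reverse=True)[:k]
```

In the paper's framing, the BiLSTM variant plays the same one-class role with a learned sequence representation in place of the averaged window; the margin-style ranking above is only one plausible reading of "sentences which best differentiate groups of near-synonyms."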

Cited By

  • (2023) Learning to Paraphrase Sentences to Different Complexity Levels. Transactions of the Association for Computational Linguistics, 11, 1332-1354. DOI: 10.1162/tacl_a_00606. Online publication date: 13-Nov-2023.

Information

Published In

WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion
April 2017, 1738 pages
ISBN: 9781450349147

Sponsors

  • IW3C2: International World Wide Web Conference Committee

Publisher

International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland

Author Tags

  1. bilstm
  2. computer-assisted language learning
  3. data-driven language learning
  4. gmm
  5. natural language processing

Qualifiers

  • Research-article

Funding Sources

  • Ministry of Science and Technology, Taiwan

Conference

WWW '17
Sponsor: IW3C2

Acceptance Rates

WWW '17 Companion paper acceptance rate: 164 of 966 submissions, 17%. Overall acceptance rate: 1,899 of 8,196 submissions, 23%.
