DOI: 10.3115/980691.980751

Noun-phrase co-occurrence statistics for semiautomatic semantic lexicon construction

Published: 10 August 1998

Abstract

Generating semantic lexicons semi-automatically could be a great time saver, relative to creating them by hand. In this paper, we present an algorithm for extracting potential entries for a category from an on-line corpus, based upon a small set of exemplars. Our algorithm finds more correct terms and fewer incorrect ones than previous work in this area. Additionally, the entries that are generated potentially provide broader coverage of the category than would occur to an individual coding them by hand. Our algorithm finds many terms not included within WordNet (many more than previous algorithms), and could be viewed as an "enhancer" of existing broad-coverage resources.

      Published In

      ACL '98/COLING '98: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 2
      August 1998
      768 pages

      Sponsors

      • Government of Canada
      • Université de Montréal

      Publisher

      Association for Computational Linguistics

      United States

      Acceptance Rates

      Overall Acceptance Rate 85 of 443 submissions, 19%

      Article Metrics

      • Downloads (Last 12 months): 63
      • Downloads (Last 6 weeks): 7
      Reflects downloads up to 21 Nov 2024.

      Cited By

      • (2019) Corpus-based Set Expansion with Lexical Features and Distributed Representations. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1153-1156. DOI: 10.1145/3331184.3331359. Online publication date: 18-Jul-2019
      • (2017) Using contexts and constraints for improved geotagging of human trafficking webpages. Proceedings of the Fourth International ACM Workshop on Managing and Mining Enriched Geo-Spatial Data, pp. 1-6. DOI: 10.1145/3080546.3080547. Online publication date: 14-May-2017
      • (2016) A Novel Approach to Managing the Dynamic Nature of Semantic Relatedness. Journal of Database Management 27:2, pp. 1-26. DOI: 10.4018/JDM.2016040101. Online publication date: 1-Apr-2016
      • (2013) Concept-based analysis of scientific literature. Proceedings of the 22nd ACM international conference on Information & Knowledge Management, pp. 1733-1738. DOI: 10.1145/2505515.2505613. Online publication date: 27-Oct-2013
      • (2013) Measuring Conceptual Entanglement in Collections of Documents. Selected Papers of the 7th International Conference on Quantum Interaction - Volume 8369, pp. 134-146. DOI: 10.1007/978-3-642-54943-4_12. Online publication date: 25-Jul-2013
      • (2012) Ensemble-based semantic lexicon induction for semantic tagging. Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation, pp. 199-208. DOI: 10.5555/2387636.2387669. Online publication date: 7-Jun-2012
      • (2012) Taxonomy induction using hierarchical random graphs. Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 466-476. DOI: 10.5555/2382029.2382094. Online publication date: 3-Jun-2012
      • (2012) Corpus-Driven hyponym acquisition for turkish language. Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I, pp. 29-41. DOI: 10.1007/978-3-642-28604-9_3. Online publication date: 11-Mar-2012
      • (2010) Conceptual modeling of online entertainment programming guide for natural language interface. Proceedings of the Natural language processing and information systems, and 15th international conference on Applications of natural language to information systems, pp. 188-195. DOI: 10.5555/1894525.1894551. Online publication date: 23-Jun-2010
      • (2010) Paraphrase alignment for synonym evidence discovery. Proceedings of the 23rd International Conference on Computational Linguistics, pp. 403-411. DOI: 10.5555/1873781.1873827. Online publication date: 23-Aug-2010