DOI: 10.5555/315149.315355
Article

Towards multidocument summarization by reformulation: progress and prospects

Published: 18 July 1999

Abstract

By synthesizing information common to retrieved documents, multi-document summarization can help users of information retrieval systems to find relevant documents with a minimal amount of reading. We are developing a multidocument summarization system to automatically generate a concise summary by identifying and synthesizing similarities across a set of related documents. Our approach is unique in its integration of machine learning and statistical techniques to identify similar paragraphs, intersection of similar phrases within paragraphs, and language generation to reformulate the wording of the summary. Our evaluation of system components shows that learning over multiple extracted linguistic features is more effective than information retrieval approaches at identifying similar text units for summarization and that it is possible to generate a fluent summary that conveys similarities among documents even when full semantic interpretations of the input text are not available.
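To make the comparison in the abstract concrete, the following is a minimal sketch of the kind of information retrieval baseline it mentions: scoring paragraph pairs by cosine similarity over TF-IDF word weights. It is not the authors' system, which learns over multiple linguistic features. The function names, the smoothed IDF formula, and the 0.3 threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(paragraphs):
    """Build one sparse TF-IDF vector (a dict) per paragraph."""
    token_lists = [tokenize(p) for p in paragraphs]
    n = len(token_lists)
    # Document frequency: in how many paragraphs each word occurs.
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption) keeps every weight positive,
        # even for words that occur in all paragraphs.
        vectors.append(
            {w: c * (math.log((1 + n) / (1 + df[w])) + 1.0) for w, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v[w] for w, weight in u.items() if w in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_pairs(paragraphs, threshold=0.3):
    """Return (i, j, score) for paragraph pairs scoring at or above threshold."""
    vectors = tfidf_vectors(paragraphs)
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            score = cosine(vectors[i], vectors[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


if __name__ == "__main__":
    docs = [
        "The plane crashed near the airport on Monday.",
        "On Monday a plane crashed close to the airport.",
        "Stock markets rallied after the earnings report.",
    ]
    for i, j, score in similar_pairs(docs):
        print(f"paragraphs {i} and {j} look similar (score {score:.2f})")
```

A word-overlap score like this cannot tell that "near" and "close to" express the same relation, which is the kind of gap the abstract's feature-based learning approach is meant to address.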



      Published In

      AAAI '99/IAAI '99: Proceedings of the Sixteenth National Conference on Artificial Intelligence and the Eleventh Innovative Applications of Artificial Intelligence Conference
      July 1999
      998 pages
      ISBN: 0262511061

      Sponsors

      • AAAI: American Association for Artificial Intelligence

      Publisher

      American Association for Artificial Intelligence

      United States


      Qualifiers

      • Article


      Cited By

      • (2018) A joint model of conversational discourse and latent topics on microblogs. Computational Linguistics 44:4 (719-754). DOI: 10.1162/coli_a_00335. Online publication date: 1-Dec-2018.
      • (2017) Discovering Typical Histories of Entities by Multi-Timeline Summarization. Proceedings of the 28th ACM Conference on Hypertext and Social Media (105-114). DOI: 10.1145/3078714.3078725. Online publication date: 4-Jul-2017.
      • (2017) Multi-factors based sentence ordering for cross-document fusion from multimodal content. Neurocomputing 253:C (6-14). DOI: 10.1016/j.neucom.2016.12.084. Online publication date: 30-Aug-2017.
      • (2016) Counting Clusters in Twitter Posts. Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies (1-9). DOI: 10.1145/2905055.2905295. Online publication date: 4-Mar-2016.
      • (2016) The Knowledge Accelerator. Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (2258-2270). DOI: 10.1145/2858036.2858364. Online publication date: 7-May-2016.
      • (2015) Profile-Based Summarisation for Web Site Navigation. ACM Transactions on Information Systems 33:1 (1-39). DOI: 10.1145/2699661. Online publication date: 17-Feb-2015.
      • (2015) An EDU-Based Approach for Thai Multi-Document Summarization and Its Application. ACM Transactions on Asian and Low-Resource Language Information Processing 14:1 (1-26). DOI: 10.1145/2641567. Online publication date: 30-Jan-2015.
      • (2014) SRRank. IEEE/ACM Transactions on Audio, Speech and Language Processing 22:12 (2048-2058). DOI: 10.1109/TASLP.2014.2360461. Online publication date: 1-Dec-2014.
      • (2011) Automatic summarization. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts of ACL 2011 (1-86). DOI: 10.5555/2002465.2002468. Online publication date: 19-Jun-2011.
      • (2011) Multi-document summarization of scientific corpora. Proceedings of the 2011 ACM Symposium on Applied Computing (252-258). DOI: 10.1145/1982185.1982243. Online publication date: 21-Mar-2011.
