A New Approach for Multi-Document Update Summarization

Chong Long¹, Min-Lie Huang¹, Xiao-Yan Zhu¹ & Ming Li²

Abstract

Fast-changing knowledge on the Internet can be acquired more efficiently with the help of automatic document summarization and updating techniques. This paper describes a novel approach to multi-document update summarization. The best summary is defined as the one with the minimum information distance to the entire document set. The best update summary is the one with the minimum conditional information distance to a document cluster, given that a prior document cluster has already been read. Experiments on the DUC/TAC 2007 to 2009 datasets (http://duc.nist.gov/, http://www.nist.gov/tac/) show that our method correlates closely with human summaries and outperforms other programs such as LexRank in many categories under the ROUGE evaluation criterion.
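
The selection criterion stated in the abstract can be illustrated with a rough sketch. Information distance is defined through Kolmogorov complexity, which is not computable, so the sketch below assumes a compression-based stand-in: the zlib-compressed length plays the role of K, and K(x | y) is approximated as C(yx) - C(y). The abstract does not say how the authors approximate these quantities, so the compressor, the function names (`compressed_len`, `cond_complexity`, `info_distance`, `best_summary`), and the exhaustive search are all illustrative assumptions rather than the paper's method.

```python
import zlib
from itertools import combinations


def compressed_len(text: str) -> int:
    """Compressed size in bytes: a crude, assumed stand-in for K(text)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def cond_complexity(x: str, y: str, given: str = "") -> int:
    """Approximate K(x | y, given) as C(given + y + x) - C(given + y)."""
    base = f"{given} {y}"
    return max(compressed_len(f"{base} {x}") - compressed_len(base), 0)


def info_distance(x: str, y: str, given: str = "") -> int:
    """Approximate max{K(x | y, given), K(y | x, given)}."""
    return max(cond_complexity(x, y, given), cond_complexity(y, x, given))


def best_summary(sentences, k, prior=""):
    """Brute-force the k-sentence extract with minimum distance to the cluster.

    With a non-empty `prior`, the distance is conditioned on text the
    reader has already seen, which corresponds to the update setting.
    """
    cluster = " ".join(sentences)
    candidates = [" ".join(combo) for combo in combinations(sentences, k)]
    return min(candidates, key=lambda s: info_distance(s, cluster, prior))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the DUC/TAC datasets.
    earlier = "The storm reached the coast on Monday."
    later = [
        "The storm reached the coast on Monday.",
        "Thousands of homes lost power overnight.",
        "Relief crews arrived on Tuesday morning.",
    ]
    print(best_summary(later, 1))
    print(best_summary(later, 1, prior=earlier))
```

Exhaustive search over sentence subsets grows combinatorially, so this is only workable for toy inputs, and a general-purpose compressor is a noisy estimator on short texts.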

Author information

Authors and Affiliations

State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
Chong Long, Min-Lie Huang & Xiao-Yan Zhu (Member, CCF)
School of Computer Science, University of Waterloo, Waterloo, N2L 3G1, Canada
Ming Li (Fellow, ACM, IEEE)

Corresponding author

Correspondence to Xiao-Yan Zhu.

Additional information

The work was supported by the National Natural Science Foundation of China under Grant No. 60973104, the National Basic Research 973 Program of China under Grant No. 2007CB311003, and the IRCI Project from IDRC, Canada.

About this article

Cite this article

Long, C., Huang, ML., Zhu, XY. et al. A New Approach for Multi-Document Update Summarization. J. Comput. Sci. Technol. 25, 739–749 (2010). https://doi.org/10.1007/s11390-010-9361-x

Received: 22 October 2009
Revised: 08 April 2010
Published: 11 July 2010
Issue Date: July 2010
DOI: https://doi.org/10.1007/s11390-010-9361-x
