Abstract
As a result of the rapid growth in Internet access, significantly more information has become available online in real time. However, users do not have sufficient time to read large volumes of information and make decisions accordingly. The problem of information overload can be resolved through the application of automatic summarization. Many summarization systems for documents in different languages have been implemented. However, the performance of a summarization system on documents in different languages has not yet been investigated. In this paper, we compare the results of the fractal summarization technique on parallel documents in Chinese and English. The grammatical and lexical differences between Chinese and English have a significant effect on the summarization process. We compare their impact on summarization performance for the Chinese and English parallel documents.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, F.L., Yang, C.C. (2003). Automatic Summarization of Chinese and English Parallel Documents. In: Sembok, T.M.T., Zaman, H.B., Chen, H., Urs, S.R., Myaeng, SH. (eds) Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access. ICADL 2003. Lecture Notes in Computer Science, vol 2911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24594-0_5
DOI: https://doi.org/10.1007/978-3-540-24594-0_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20608-8
Online ISBN: 978-3-540-24594-0
eBook Packages: Springer Book Archive