Abstract
Social network analysis has emerged as a key technique in countering crime and terrorism. The Enron e-mail dataset, originally made public and posted to the web by the Federal Energy Regulatory Commission during its investigation, consists of around half a million e-mails among several thousand individuals. It is valuable because it is perhaps the only real e-mail dataset accessible to the research community. This paper presents preliminary results of an analysis of the Enron e-mail dataset based on a variation of the Author-Recipient-Topic (ART) model [1]. The GR-ART model described here uses grammatical relations, rather than bags of words, as features. We hypothesize that grammatical relations will provide a more useful model of authors, topics, and recipients than words alone. This research complements earlier work by one of the authors on applying information extraction techniques to cross-document named entity co-reference [2].
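To make the contrast between the two feature types concrete, the following is a minimal sketch and not the GR-ART implementation. It assumes hand-annotated relation triples in place of parser output; the sentences, the `PARSED` table, the relation labels `subj` and `dobj`, and both function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical, hand-annotated grammatical relations for two toy sentences.
# A real pipeline would obtain these from a parser; none is invoked here.
PARSED = {
    "the trader approved the deal": [
        ("subj", "approved", "trader"),
        ("dobj", "approved", "deal"),
    ],
    "the deal approved the trader": [
        ("subj", "approved", "deal"),
        ("dobj", "approved", "trader"),
    ],
}


def bag_of_words(sentence):
    """Count word tokens, ignoring word order and syntax."""
    return Counter(sentence.split())


def relation_features(sentence):
    """Count (relation, head, dependent) triples as features."""
    return Counter(PARSED[sentence])


# Dict unpacking yields the two sentences in insertion order.
s1, s2 = PARSED
same_bow = bag_of_words(s1) == bag_of_words(s2)
same_gr = relation_features(s1) == relation_features(s2)
print(same_bow, same_gr)  # prints: True False
```

The two sentences contain exactly the same words, so their bag-of-words counts are identical, while the relation triples record who did what to whom and therefore differ.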
References
[1] McCallum, A., Corrada-Emmanuel, A., Wang, X.: The author-recipient-topic model for topic and role discovery in social networks: Experiments with Enron and academic email. Technical Report UM-CS-2004-096, UMass Amherst (2004)
[2] Patman, F., Thompson, P.: Names: a new frontier in text mining, [6], pp. 27–38.
[3] Blei, D., Ng, A., Jordan, M.: Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022 (2003)
[4] Steyvers, M., Smyth, P., Rosen-Ziv, M., Griffiths, T.: Probabilistic author-topic models for information discovery. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, Washington (2004)
[5] Carroll, J., Briscoe, E., Sanfilippo, A.: Parser evaluation: a survey and a new proposal. In: Proceedings of the 1st International Conference on Language Resources and Evaluation, Granada, Spain, pp. 447–454 (1998)
[6] Lin, L., Geng, X., Whinston, A.B.: Intelligence and security informatics: An information economics perspective. In: Chen, H., Miranda, R., Zeng, D.D., Demchak, C.C., Schroeder, J., Madhusudan, T. (eds.) ISI 2003. LNCS, vol. 2665, pp. 375–378. Springer, Heidelberg (2003)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Thompson, P., Zhang, W. (2006). Analyzing Social Networks in E-Mail with Rich Syntactic Features. In: Mehrotra, S., Zeng, D.D., Chen, H., Thuraisingham, B., Wang, FY. (eds) Intelligence and Security Informatics. ISI 2006. Lecture Notes in Computer Science, vol 3975. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11760146_86
DOI: https://doi.org/10.1007/11760146_86
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34478-0
Online ISBN: 978-3-540-34479-7
eBook Packages: Computer Science, Computer Science (R0)