DOI: 10.1145/1111449.1111471
Article

Automatically classifying emails into activities

Published: 29 January 2006

Abstract

Email-based activity management systems promise to give users better tools for managing increasing volumes of email by organizing email according to a user's activities. Current activity management systems do not automatically classify incoming messages by the activity to which they belong; instead, they rely on simple heuristics (such as message threads) or ask the user to classify each incoming message manually. This paper presents several algorithms for automatically recognizing emails as part of an ongoing activity. Our baseline methods are the use of message reply-to threads to determine activity membership and a naïve Bayes classifier. Our SimSubset and SimOverlap algorithms compare the people involved in an activity against the recipients of each incoming message. Our SimContent algorithm uses iterative residual rescaling (IRR, a variant of latent semantic indexing) to classify emails into activities using similarity based on message contents. An empirical evaluation shows that each of these methods provides a significant improvement over the baseline methods. In addition, we show that a combined approach that votes on the predictions of the individual methods performs better than any individual method alone.
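To make the people-based comparison and the voting step concrete, here is a minimal Python sketch. It is not the paper's implementation: the abstract does not define SimSubset or SimOverlap precisely, so the scoring rules below (a strict subset test and a Jaccard-style overlap) are assumptions, and the function names `subset_score`, `overlap_score`, `classify`, and `vote` are hypothetical.

```python
from collections import Counter


def subset_score(activity_people, message_people):
    """Assumed subset rule: 1.0 if everyone on the message already
    belongs to the activity, otherwise 0.0."""
    message_people = set(message_people)
    if not message_people:
        return 0.0
    return 1.0 if message_people <= set(activity_people) else 0.0


def overlap_score(activity_people, message_people):
    """Assumed overlap rule: shared people divided by all people
    (Jaccard similarity)."""
    a, m = set(activity_people), set(message_people)
    union = a | m
    return len(a & m) / len(union) if union else 0.0


def classify(activities, message_people, score_fn):
    """Return the activity name with the highest positive score,
    or None when no activity scores above zero."""
    best, best_score = None, 0.0
    for name in sorted(activities):  # sorted for deterministic ties
        score = score_fn(activities[name], message_people)
        if score > best_score:
            best, best_score = name, score
    return best


def vote(predictions):
    """Combine per-method predictions by plurality vote,
    ignoring methods that abstained (None)."""
    votes = Counter(p for p in predictions if p is not None)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical data: two activities and one incoming message.
activities = {
    "budget": {"ann", "bob", "cy"},
    "hiring": {"ann", "dee"},
}
message_people = {"ann", "bob"}

by_subset = classify(activities, message_people, subset_score)
by_overlap = classify(activities, message_people, overlap_score)
combined = vote([by_subset, by_overlap])
```

In this toy setup both people-based scorers pick the same activity, so the vote is trivial; the voting step only matters once methods disagree, for example when a content-based classifier is added to the list of predictions.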



    Published In

    IUI '06: Proceedings of the 11th international conference on Intelligent user interfaces
    January 2006
    392 pages
    ISBN: 1595932879
    DOI: 10.1145/1111449

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. activity management
    2. email
    3. machine learning
    4. text classification

    Qualifiers

    • Article

    Conference

    IUI06
    IUI06: 11th International Conference on Intelligent User Interfaces
    January 29 - February 1, 2006
    Sydney, Australia

    Acceptance Rates

    Overall Acceptance Rate 746 of 2,811 submissions, 27%

    Cited By

    • (2023) Phish and Chips: Language-agnostic classification of unsolicited emails. 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), (1385-1391). DOI: 10.1109/TrustCom60117.2023.00189. Online publication date: 1-Nov-2023
    • (2023) How to Discover Competences from Help Interactions. 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (4885-4888). DOI: 10.1109/SMC53992.2023.10394440. Online publication date: 1-Oct-2023
    • (2023) Process fragments discovery from emails: Functional, data and behavioral perspectives discovery. Information Systems, 118 (102229). DOI: 10.1016/j.is.2023.102229. Online publication date: Sep-2023
    • (2022) Competency Manifestation Clues within Interactions in Computer Mediated Communication. Journal of Engineering Research and Sciences, 1:5 (167-178). DOI: 10.55708/js0105018. Online publication date: May-2022
    • (2022) A Reproducible Approach for Mining Business Activities from Emails for Process Analytics. Service-Oriented Computing – ICSOC 2021 Workshops, (77-91). DOI: 10.1007/978-3-031-14135-5_6. Online publication date: 24-Aug-2022
    • (2021) Competency Detection from Interactions Within Communities of Practice. SN Computer Science, 3:1. DOI: 10.1007/s42979-021-00861-9. Online publication date: 30-Oct-2021
    • (2021) Automated Business Process Discovery from Unstructured Natural-Language Documents. Business Process Management Workshops, (232-243). DOI: 10.1007/978-3-030-66498-5_18. Online publication date: 19-Jan-2021
    • (2021) Multi‐perspective business process discovery from messaging systems: State‐of‐the art. Concurrency and Computation: Practice and Experience, 35:11. DOI: 10.1002/cpe.6642. Online publication date: 30-Sep-2021
    • (2020) How Impactful Is Presentation in Email? The Effect of Avatars and Signatures. ACM Transactions on Interactive Intelligent Systems, 10:3 (1-26). DOI: 10.1145/3345641. Online publication date: 13-Nov-2020
    • (2020) Toward Activity Discovery in the Personal Web. Proceedings of the 13th International Conference on Web Search and Data Mining, (492-500). DOI: 10.1145/3336191.3371828. Online publication date: 20-Jan-2020
