DOI: 10.1145/290941.290970

Distributional clustering of words for text classification

Published: 01 August 1998

References

[1]
P. F. Brown, P. V. deSouza, R. L. Mercer, V. J. Della Pietra, and J. C. Lai. Class-based n-gram models of natural language. Computational Linguistics, 18(4):467-479, 1992.
[2]
Thomas Cover and Peter Hart. Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1):21-27, 1967.
[3]
Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. John Wiley, 1991.
[4]
Mark Craven, Daniel DiPasquo, Dayne Freitag, Andrew McCallum, Tom Mitchell, Kamal Nigam, and Sean Slattery. Learning to extract symbolic knowledge from the World Wide Web. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98), 1998.
[5]
Ido Dagan, Fernando Pereira, and Lillian Lee. Similarity-based estimation of word cooccurrence probabilities. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, 1994.
[6]
S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6):391-407, 1990.
[7]
P. Domingos and M. Pazzani. Beyond independence: Conditions for the optimality of the simple Bayesian classifier. Machine Learning, 29:103-130, 1997.
[8]
Susan T. Dumais. Using LSI for information filtering: TREC-3 experiments. Technical Report 500-225, National Institute of Standards and Technology, 1995.
[9]
Jerome H. Friedman. On bias, variance, 0/1-loss, and the curse-of-dimensionality. Data Mining and Knowledge Discovery, 1:55-77, 1997.
[10]
Thorsten Joachims. A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization. In International Conference on Machine Learning (ICML), 1997.
[11]
J. D. Jobson. Applied Multivariate Data Analysis - Volume II: Categorical and Multivariate Methods. Springer Verlag, 1992.
[12]
R. Kerber. Chimerge: Discretization of numeric attributes. In Proceedings of Tenth National Conference on Artificial Intelligence (AAAI-92), 1992.
[13]
D. Koller and M. Sahami. Toward optimal feature selection. In Proceedings of Thirteenth International Conference on Machine Learning (ICML-96), 1996.
[14]
Ken Lang. Newsweeder: Learning to filter netnews. In International Conference on Machine Learning (ICML), pages 331-339, 1995.
[15]
Lillian Lee. Similarity-Based Approaches to Natural Language Processing. PhD thesis, Harvard University, 1997. (also Technical Report TR-11-97).
[16]
David Lewis and Marc Ringuette. A comparison of two learning algorithms for text categorization. In Third Annual Symposium on Document Analysis and Information Retrieval, pages 81-93, 1994.
[17]
David D. Lewis and Kimberly A. Knowles. Threading electronic mail: A preliminary study. Information Processing and Management, 33(2):209-217, 1997.
[18]
H. Liu and R. Setiono. Chi2: Feature selection and discretization of numeric attributes. In Proceedings of 7th IEEE Int'l Conference on Tools with Artificial Intelligence, 1995.
[19]
Andrew McCallum and Kamal Nigam. A comparison of event models for naive Bayes text classification. In AAAI-98 Workshop on Learning for Text Categorization, 1998. http://www.cs.cmu.edu/~mccallum.
[20]
Fernando Pereira, Naftali Tishby, and Lillian Lee. Distributional clustering of English words. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pages 183-90, 1993.
[21]
WiseWire. http://www.wisewire.com.
[22]
Yiming Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'95), pages 256-263, 1995.
[23]
Yiming Yang and Jan Pedersen. Feature selection in statistical learning of text categorization. In ICML-97, pages 412-420, 1997.


Information & Contributors

Information

Published In

SIGIR '98: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
August 1998
394 pages
ISBN:1581130155
DOI:10.1145/290941
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Article

Conference

SIGIR98
Sponsor:
  • University of Melbourne
  • SIGIR

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%


Cited By

  • (2024) RT-APT: A Real-time APT Anomaly Detection Method for Large-scale Provenance Graph. Journal of Network and Computer Applications (104036). DOI: 10.1016/j.jnca.2024.104036. Online publication date: Oct-2024
  • (2024) A novel text clustering model based on topic modelling and social network analysis. Chaos, Solitons & Fractals, 181 (114633). DOI: 10.1016/j.chaos.2024.114633. Online publication date: Apr-2024
  • (2023) Twenty Years of Machine-Learning-Based Text Classification: A Systematic Review. Algorithms, 16:5 (236). DOI: 10.3390/a16050236. Online publication date: 29-Apr-2023
  • (2023) Neural Text Classification by Jointly Learning to Cluster and Align. 2023 International Joint Conference on Neural Networks (IJCNN) (1-8). DOI: 10.1109/IJCNN54540.2023.10191269. Online publication date: 18-Jun-2023
  • (2023) Effective Contact Tracing of Covid Patients Using Machine Learning Technique. 2023 1st International Conference on Optimization Techniques for Learning (ICOTL) (1-6). DOI: 10.1109/ICOTL59758.2023.10435063. Online publication date: 7-Dec-2023
  • (2022) Algorithms for the classification of text documents, taking into account proximity in the attribute space. Modeling of systems and processes, 15:1 (36-43). DOI: 10.12737/2219-0767-2022-15-1-36-43. Online publication date: 8-Apr-2022
  • (2022) Kullback–Leibler Divergence Metric Learning. IEEE Transactions on Cybernetics, 52:4 (2047-2058). DOI: 10.1109/TCYB.2020.3008248. Online publication date: Apr-2022
  • (2022) Identifying new innovative services using M&A data: An integrated approach of data-driven morphological analysis. Technological Forecasting and Social Change, 174 (121197). DOI: 10.1016/j.techfore.2021.121197. Online publication date: Jan-2022
  • (2022) Feature Selection. Advances in Big Data Analytics (249-304). DOI: 10.1007/978-981-16-3607-3_4. Online publication date: 1-Jan-2022
  • (2022) Text Classification: Basic Models. Machine Learning for Text (115-158). DOI: 10.1007/978-3-030-96623-2_5. Online publication date: 10-Feb-2022
