Automated learning of decision rules for text categorization

Article Properties
  • Language
    English
  • Publication Date
    1994/07/01
  • Indian UGC (Journal)
  • References
    24
  • Citations
    211
  • Chidanand Apté IBM T. J. Watson Research Center, Yorktown Heights, NY
  • Fred Damerau IBM T. J. Watson Research Center, Yorktown Heights, NY
  • Sholom M. Weiss Rutgers Univ., New Brunswick, NJ
Cite
Apté, Chidanand, et al. “Automated Learning of Decision Rules for Text Categorization”. ACM Transactions on Information Systems, vol. 12, no. 3, 1994, pp. 233-51, https://doi.org/10.1145/183422.183423.
Apté, C., Damerau, F., & Weiss, S. M. (1994). Automated learning of decision rules for text categorization. ACM Transactions on Information Systems, 12(3), 233-251. https://doi.org/10.1145/183422.183423
Apté C, Damerau F, Weiss SM. Automated learning of decision rules for text categorization. ACM Transactions on Information Systems. 1994;12(3):233-51.
Journal Categories
  • Science > Mathematics > Instruments and machines > Electronic computers > Computer science
  • Science > Science (General) > Cybernetics > Information theory
  • Technology > Electrical engineering > Electronics > Nuclear engineering > Telecommunication
  • Technology > Technology (General) > Industrial engineering > Management engineering > Information technology
Description

Can machines learn to categorize text as well as humans? This study presents extensive experiments on automated rule-based induction methods for large document collections, aiming to discover classification patterns for document categorization and personalized filtering. The research demonstrates that machine-generated decision rules can achieve performance comparable to human-engineered systems, while using the same rule-based representation. Results on the Reuters collection benchmark reveal a significant performance gain compared to other machine-learning techniques, achieving an 80.5% recall/precision breakeven point, a substantial improvement over the previously reported 67%. The study also explores methodological alternatives, including universal versus local dictionaries and binary versus frequency-related features, in the context of high-dimensional feature spaces. This work highlights the potential of machine learning to automate text categorization tasks, reducing the need for extensive human involvement. These findings have implications for information retrieval, document management, and the development of intelligent systems.
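The recall/precision breakeven point quoted above is the value at which recall and precision are equal as a classifier's decision threshold is varied. The following is a minimal sketch of that general definition, not the procedure used in the study: it assumes a classifier that outputs a numeric score per document, and the function names and toy data are hypothetical.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when documents scoring >= threshold are assigned the category."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def breakeven_point(scores, labels):
    """Sweep every observed score as a threshold and return the average of
    precision and recall at the threshold where the two are closest."""
    best_gap, best_value = None, None
    for threshold in sorted(set(scores)):
        p, r = precision_recall(scores, labels, threshold)
        gap = abs(p - r)
        if best_gap is None or gap < best_gap:
            best_gap, best_value = gap, (p + r) / 2
    return best_value


# Hypothetical toy data: six documents, three of which truly belong to the category.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
print(breakeven_point(scores, labels))
```

On this toy data, precision and recall coincide at a threshold of 0.7, where both equal 2/3. When no threshold makes the two exactly equal, the sketch falls back to averaging them at the closest point, which is one common convention among several.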

Published in ACM Transactions on Information Systems, this research aligns with the journal's focus on information retrieval, text processing, and intelligent systems. By presenting an automated approach to text categorization, the study contributes to the advancement of information systems technologies and their applications, which is central to the journal's scope.

Citations Analysis
The first work to cite this article, titled “Optimized rule induction”, was published in 1993. The most recent citation comes from a 2024 study titled “Optimized rule induction”. Citations of this article peaked in 2022, with 13 citations. It has been cited in 127 different journals, 7% of which are open access. Among related journals, Expert Systems with Applications cited this research the most, with 12 citations.