Machine learning in automated text categorization

Article Properties

Language	English
DOI (url)	10.1145/505282.505283
Publication Date	2002/03/01
Journal	ACM Computing Surveys
Indian UGC (Journal)
References	147
Citations	2,025
Fabrizio Sebastiani, Consiglio Nazionale delle Ricerche, Pisa, Italy

Cite

Sebastiani, Fabrizio. “Machine Learning in Automated Text Categorization”. ACM Computing Surveys, vol. 34, no. 1, 2002, pp. 1-47, https://doi.org/10.1145/505282.505283.

Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34(1), 1-47. https://doi.org/10.1145/505282.505283

Sebastiani F. Machine learning in automated text categorization. ACM Computing Surveys. 2002;34(1):1-47.

Journal Categories

Science: Mathematics: Instruments and machines: Electronic computers. Computer science

Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software

Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware

Description

Can machines truly understand text? This survey covers automated text categorization, the classification of documents into predefined categories, using **machine learning** techniques. Interest in the problem has grown with the rapid increase in documents available in digital form. The survey discusses the main approaches within the **machine learning paradigm**, in which a classifier is constructed automatically by learning from pre-classified examples rather than being defined by hand. Compared with manual classifier definition, this approach offers good effectiveness, reduced expert labor costs, and portability across domains. The author examines three problems in detail: **document representation**, **classifier construction**, and **classifier evaluation**. The result is a reference on **automated text categorization** that supports more efficient information management and knowledge discovery.
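The three problems named above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not the survey's own method: it uses a bag-of-words representation, a multinomial Naive Bayes classifier with add-one smoothing (one of many learners that could be used here), and precision, recall, and F1 for evaluation. The function names, the `TRAINING_DOCS` and `TEST_DOCS` data, and the `grain` and `finance` categories are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Document representation: a lowercase bag of words."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(labeled_docs):
    """Classifier construction from pre-classified (text, category) pairs."""
    doc_counts = Counter()             # documents seen per category
    term_counts = defaultdict(Counter) # term frequencies per category
    vocabulary = set()
    for text, category in labeled_docs:
        doc_counts[category] += 1
        for term in tokenize(text):
            term_counts[category][term] += 1
            vocabulary.add(term)
    return doc_counts, term_counts, vocabulary


def classify(model, text):
    """Return the category with the highest log-probability score."""
    doc_counts, term_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in sorted(doc_counts):
        score = math.log(doc_counts[category] / total_docs)
        total_terms = sum(term_counts[category].values())
        for term in tokenize(text):
            if term not in vocabulary:
                continue  # ignore terms never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (term_counts[category][term] + 1)
                / (total_terms + len(vocabulary))
            )
        if score > best_score:
            best_category, best_score = category, score
    return best_category


def precision_recall_f1(true_labels, predicted_labels, category):
    """Classifier evaluation for a single category."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == category and p == category)
    fp = sum(1 for t, p in pairs if t != category and p == category)
    fn = sum(1 for t, p in pairs if t == category and p != category)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy data, for illustration only.
TRAINING_DOCS = [
    ("wheat harvest prices rise", "grain"),
    ("corn and wheat exports fall", "grain"),
    ("central bank raises interest rates", "finance"),
    ("bank profits and interest income grow", "finance"),
]
TEST_DOCS = [
    ("wheat exports rise", "grain"),
    ("interest rates fall", "finance"),
]

model = train_naive_bayes(TRAINING_DOCS)
true_labels = [category for _, category in TEST_DOCS]
predictions = [classify(model, text) for text, _ in TEST_DOCS]
print(predictions)
print(precision_recall_f1(true_labels, predictions, "grain"))
```

The toy corpus is far too small to say anything about real effectiveness. It only shows how the representation, construction, and evaluation steps fit together; a realistic setup would add term weighting, feature selection, and a held-out test collection.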

Published in ACM Computing Surveys, a journal focused on significant advancements in computer science, this survey of machine learning in automated text categorization falls directly within the journal's scope. It provides a comprehensive overview of techniques and serves as a reference for computer scientists and researchers.

Citations Analysis

Category	Category Repetition
Science: Mathematics: Instruments and machines: Electronic computers. Computer science	924
Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics	456
Science: Science (General): Cybernetics: Information theory	434
Technology: Mechanical engineering and machinery	409
Technology: Engineering (General). Civil engineering (General)	313

The first work to cite this article is listed under the DOI 10.1162/153244303322753625, with a publication year of 2000. The most recent citation, from a 2024 study, is listed under the same DOI, 10.1162/153244303322753625. This article reached its citation peak in 2020, with 143 citations. It has been cited in 739 different journals, 13% of which are open access. Among related journals, Expert Systems with Applications cited this article the most, with 89 citations. The chart below illustrates the annual citation trends for this article.