Unsupervised Learning of the Morphology of a Natural Language

Article Properties
  • Language
    English
  • Publication Date
    2001/06/01
  • Indian UGC (Journal)
  • References
    9
  • Citations
    120
  • John Goldsmith, Department of Linguistics, University of Chicago, 1010 E. 59th Street, Chicago, IL 60637.
Cite
Goldsmith, John. “Unsupervised Learning of the Morphology of a Natural Language”. Computational Linguistics, vol. 27, no. 2, 2001, pp. 153-98, https://doi.org/10.1162/089120101750300490.
Goldsmith, J. (2001). Unsupervised Learning of the Morphology of a Natural Language. Computational Linguistics, 27(2), 153-198. https://doi.org/10.1162/089120101750300490
Goldsmith J. Unsupervised Learning of the Morphology of a Natural Language. Computational Linguistics. 2001;27(2):153-98.
Journal Categories
Language and Literature
Philology
Linguistics
Language and Literature
Philology
Linguistics
Computational linguistics
Natural language processing
Science
Mathematics
Instruments and machines
Electronic computers
Computer science
Technology
Electrical engineering
Electronics
Nuclear engineering
Electronics
Technology
Mechanical engineering and machinery
Description

This study uses Minimum Description Length (MDL) analysis to model the unsupervised learning of morphological segmentation in natural language. Working with corpora of European languages in a range of sizes, it develops a set of heuristics that rapidly build a probabilistic morphological grammar. Each modification the heuristics propose is evaluated with MDL, which determines whether it is adopted. The resulting grammar closely matches the analyses a human morphologist would produce. The study also examines how this style of grammatical analysis relates to the evaluation metrics of early generative grammar, linking computational and theoretical linguistics.

The results indicate that MDL can guide a learner toward morphological structure without explicit instruction, with implications for natural language processing, computational linguistics, and the study of language acquisition.
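The idea of letting MDL decide whether a proposed segmentation is adopted can be illustrated with a toy example. The sketch below is not the paper's actual encoding scheme or heuristics. It assumes a deliberately simple cost model (a flat per-letter cost for listing each distinct stem and suffix, plus the negative log-probability of each token's stem and suffix), and the names `description_length`, `split_suffix`, and `adopt` are hypothetical helpers introduced only for illustration.

```python
import math
from collections import Counter

# Assumption: every letter costs the same number of bits (26-letter alphabet).
BITS_PER_LETTER = math.log2(26)


def description_length(analysis):
    """Total cost in bits of a corpus analysed as (stem, suffix) pairs."""
    stems = Counter(stem for stem, _ in analysis)
    suffixes = Counter(suffix for _, suffix in analysis)
    n = len(analysis)
    # Model cost: spell out each distinct stem and suffix once,
    # with one extra symbol per entry as an end marker.
    model_bits = sum(
        (len(morph) + 1) * BITS_PER_LETTER
        for morph in list(stems) + list(suffixes)
    )
    # Data cost: encode each token as a stem choice plus a suffix choice,
    # each costing -log2 of its relative frequency.
    data_bits = -sum(
        math.log2(stems[stem] / n) + math.log2(suffixes[suffix] / n)
        for stem, suffix in analysis
    )
    return model_bits + data_bits


def split_suffix(word, candidates=("ing", "ed", "s")):
    """Hypothetical heuristic: strip the first matching candidate suffix."""
    for suffix in candidates:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)], suffix
    return word, ""


def adopt(proposed, current):
    """Accept a proposed analysis only if it shortens the description."""
    return description_length(proposed) < description_length(current)


words = [
    "jump", "jumps", "jumped", "jumping",
    "walk", "walks", "walked", "walking",
    "talk", "talks", "talked", "talking",
]
unsplit = [(word, "") for word in words]        # every word is its own stem
split = [split_suffix(word) for word in words]  # stems shared across suffixes

print(f"unsplit: {description_length(unsplit):.1f} bits")
print(f"split:   {description_length(split):.1f} bits")
print("adopt split analysis:", adopt(split, unsplit))
```

On this small word list the segmented analysis is cheaper because three stems and four suffixes are listed once each instead of twelve whole words, which is the kind of trade-off between model size and data cost that an MDL criterion measures.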

The paper was published in Computational Linguistics, a journal devoted to computational approaches to language. Its use of unsupervised learning techniques builds on existing work in the field and offers new insight into morphological grammar development and its relationship to early generative grammar.

Citations Analysis
The first research to cite this article was titled "The generalized universal law of generalization" and was published in 2003. The most recent citation comes from a 2024 study titled "The generalized universal law of generalization". This article reached its peak citation count in 2010, with 11 citations. It has been cited in 75 different journals, 12% of which are open access. Among related journals, the ACM Transactions on Asian and Low-Resource Language Information Processing cited this research the most, with 5 citations. The chart below illustrates the annual citation trends for this article.
Citations of this article by year