Separating Style and Content with Bilinear Models

Article Properties
  • Language
    English
  • Publication Date
    2000/06/01
  • Indian UGC (Journal)
  • References
    25
  • Citations
    286
  • Joshua B. Tenenbaum Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, U.S.A.
  • William T. Freeman MERL, a Mitsubishi Electric Research Lab, 201 Broadway, Cambridge, MA 02139, U.S.A.
Cite
Tenenbaum, Joshua B., and William T. Freeman. “Separating Style and Content With Bilinear Models”. Neural Computation, vol. 12, no. 6, 2000, pp. 1247-83, https://doi.org/10.1162/089976600300015349.
Tenenbaum, J. B., & Freeman, W. T. (2000). Separating Style and Content with Bilinear Models. Neural Computation, 12(6), 1247-1283. https://doi.org/10.1162/089976600300015349
Tenenbaum JB, Freeman WT. Separating Style and Content with Bilinear Models. Neural Computation. 2000;12(6):1247-83.
Journal Categories
Medicine
Internal medicine
Neurosciences
Biological psychiatry
Neuropsychiatry
Science
Mathematics
Instruments and machines
Electronic computers
Computer science
Technology
Electrical engineering
Electronics
Nuclear engineering
Electronics
Technology
Mechanical engineering and machinery
Description

Can computers learn to distinguish style from content the way humans do? Perceptual systems routinely separate "content" from "style": classifying familiar words spoken in an unfamiliar accent, identifying a font or handwriting style across letters, or recognizing a familiar face or object seen under unfamiliar viewing conditions. This research presents a general computational framework for such two-factor tasks based on bilinear models, which can be fit to data with efficient algorithms based on the singular value decomposition (SVD) and expectation-maximization (EM). Bilinear models provide expressive representations of factor interactions while remaining computationally tractable. Existing factor models (Mardia, Kent, & Bibby, 1979; Hinton & Zemel, 1994; Ghahramani, 1995; Bell & Sejnowski, 1995; Hinton, Dayan, Frey, & Neal, 1995; Dayan, Hinton, Neal, & Zemel, 1995; Hinton & Ghahramani, 1997) are either insufficiently rich to capture the complex interactions of perceptually meaningful factors, such as phoneme and speaker accent or letter and font, or do not admit efficient learning algorithms. The model is tested across three perceptual domains: spoken vowel classification, font extrapolation, and face illumination translation. It offers a powerful tool for machine learning and artificial intelligence, with potential applications ranging from speech recognition to image processing.
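
One of the models treated in the paper, the asymmetric bilinear model y^{sc} ≈ A^s b^c (a style-specific linear map A^s applied to a content vector b^c), admits a closed-form fit: stack the style-blocked observations into one matrix and truncate its singular value decomposition. The sketch below is a minimal NumPy illustration of that idea, not the authors' implementation; the function name, toy data, and chosen dimensionalities are assumptions for demonstration, and the EM procedures for the symmetric model and missing-data settings are not shown.

```python
import numpy as np

def fit_asymmetric_bilinear(Y, n_styles, dim_obs, J):
    """Closed-form SVD fit of the asymmetric bilinear model y^{sc} ~= A^s b^c.

    Y : array of shape (n_styles * dim_obs, n_contents); column c stacks the
        mean observation y^{sc} for every style s.
    J : number of basis components (model dimensionality) to keep.

    Returns A, the vertically stacked style matrices A^s, and B, whose
    columns are the content vectors b^c.
    """
    assert Y.shape[0] == n_styles * dim_obs
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :J] * s[:J]   # (n_styles * dim_obs, J): stacked style maps, scaled
    B = Vt[:J, :]          # (J, n_contents): one content vector per class
    return A, B

# Toy usage (hypothetical data): 3 styles, 5 content classes, 4-dim observations.
rng = np.random.default_rng(0)
S, C, K, J = 3, 5, 4, 2
Y = rng.normal(size=(S * K, J)) @ rng.normal(size=(J, C))  # noiseless bilinear data
A, B = fit_asymmetric_bilinear(Y, S, K, J)
A_s0 = A[:K, :]            # style matrix for the first style
y_hat = A_s0 @ B[:, 0]     # reconstruct y^{s=0, c=0}
```

The truncated SVD gives the best rank-J approximation of the stacked observation matrix, which is why this one-shot factorization is both efficient and optimal in the least-squares sense for the asymmetric model.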

Published in Neural Computation, this paper aligns with the journal's focus on theoretical and computational approaches to understanding neural and cognitive processes. The development and application of bilinear models for separating style and content contribute to the understanding of perceptual learning and representation.

References
Citations
Citations Analysis
The first research to cite this article, titled Parametric hidden Markov models for gesture recognition, was published in 1999. The most recent citation comes from a 2024 study titled Parametric hidden Markov models for gesture recognition. The article reached its peak citation count in 2021, with 29 citations. It has been cited in 120 different journals, 18% of which are open access. Among related journals, IEEE Transactions on Pattern Analysis and Machine Intelligence cited this research the most, with 18 citations. The chart below illustrates the annual citation trend for this article.
Citations of this article by year