Query-based sampling of text databases

Article Properties
Cite
Callan, Jamie, and Margaret Connell. “Query-Based Sampling of Text Databases”. ACM Transactions on Information Systems, vol. 19, no. 2, 2001, pp. 97-130, https://doi.org/10.1145/382979.383040.
Callan, J., & Connell, M. (2001). Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2), 97-130. https://doi.org/10.1145/382979.383040
Callan J, Connell M. Query-based sampling of text databases. ACM Transactions on Information Systems. 2001;19(2):97-130.
Journal Categories
Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Science: Science (General): Cybernetics: Information theory
Technology: Electrical engineering. Electronics. Nuclear engineering: Telecommunication
Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Description

Faced with too much information, how do we find the most relevant databases? This paper introduces query-based sampling, a novel technique for acquiring the resource descriptions used in text database selection. Unlike existing methods, query-based sampling does not require cooperation from resource providers, which makes it suitable for wide-area networks. The study demonstrates that the technique creates accurate resource descriptions efficiently, enabling automatic database selection and improving information retrieval performance, and so overcomes the dependence on provider cooperation that limits existing techniques.
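The description above can be illustrated with a minimal sketch, assuming the only access to a database is a search interface that returns documents for a one-term query. The names `query_based_sample`, `make_toy_search`, `search`, `seed_term`, `docs_per_query`, and `max_docs` are hypothetical, and choosing the next query term at random from the vocabulary learned so far is an assumption of this sketch; the page does not state the paper's exact term-selection or stopping rules.

```python
import random
from collections import Counter


def query_based_sample(search, seed_term, docs_per_query=4,
                       max_docs=300, max_queries=1000, rng=None):
    """Build a term-frequency description using only a search interface.

    `search(term)` is assumed to return a ranked list of (doc_id, text) pairs.
    """
    rng = rng or random.Random(0)
    description = Counter()  # term -> number of sampled documents containing it
    seen = set()             # ids of documents already sampled
    used = set()             # terms already sent as queries
    term = seed_term
    for _ in range(max_queries):
        if len(seen) >= max_docs:
            break
        used.add(term)
        # Run a one-term query and examine only the top few results.
        for doc_id, text in search(term)[:docs_per_query]:
            if doc_id in seen:
                continue
            seen.add(doc_id)
            description.update(set(text.lower().split()))
        # Assumption: draw the next query term from the learned vocabulary.
        candidates = sorted(t for t in description if t not in used)
        if not candidates:
            break
        term = rng.choice(candidates)
    return description, seen


def make_toy_search(docs):
    """Hypothetical stand-in for a remote database's search interface."""
    def search(term):
        return [(i, d) for i, d in enumerate(docs) if term in d.lower().split()]
    return search
```

The sampler never sees the database's index, only query results, which is why no cooperation from the provider is needed. Documents that no learned term reaches stay outside the sample, so the resulting description is an estimate rather than a full inventory.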

Published in ACM Transactions on Information Systems, this paper aligns with the journal's focus on information retrieval, database management, and information systems architecture. The proposed query-based sampling technique directly addresses the problem of resource discovery in large-scale information systems, a core area of interest for the journal.

References
Citations
Citations Analysis
The first research to cite this article was titled "A semisupervised learning method to merge search engine results" and was published in 2003. The most recent citation comes from a 2023 study titled "A semisupervised learning method to merge search engine results". This article reached its peak citation count in 2007, with 8 citations. It has been cited in 39 different journals, 2% of which are open access. Among related journals, ACM Transactions on Information Systems cited this research the most, with 10 citations. The chart below illustrates the annual citation trends for this article.
Citations of this article by year