Abstract
Query-by-example and query-by-keyword both suffer from “aliasing”: an example image or a keyword can have variable interpretations or multiple semantics. We have established that combining active learning with kernel methods is a very effective way to discern which semantic is appropriate for a given query. In this work, we first examine active-learning strategies and then address two scalability challenges: scalability in concept complexity and scalability in dataset size. We present remedies, explain limitations, and discuss directions that future research might take.
Cite this article
Panda, N., Goh, KS. & Chang, E.Y. Active learning in very large databases. Multimed Tools Appl 31, 249–267 (2006). https://doi.org/10.1007/s11042-006-0043-1
DOI: https://doi.org/10.1007/s11042-006-0043-1