Abstract
Learning ranking scores is critical for multimedia database retrieval. In this paper, we propose a novel ranking score learning algorithm that explores the sparse structure and uses it to regularize the ranking scores. To explore the sparse structure, we assume that each multimedia object can be represented as a sparse linear combination of all other objects; the combination coefficients are regarded as a similarity measure between objects and are used to regularize their ranking scores. Moreover, we propose to learn the sparse combination coefficients and the ranking scores simultaneously. A unified objective function is constructed with regard to both the combination coefficients and the ranking scores, and is optimized by an iterative algorithm. Experiments on two multimedia database retrieval data sets demonstrate significant improvements of the proposed algorithm over state-of-the-art ranking score learning algorithms.
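The idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the objective function, so everything specific here is an assumption made for illustration: each object is coded over the other objects with an l1-penalized least-squares problem, the scores are fitted by penalizing the difference between each score and its sparse reconstruction from the other scores plus a query-fitting term, and the two steps alternate. The function names, the parameters `lam`, `gamma` and `mu`, and the use of a plain ISTA solver are hypothetical choices, not the authors' formulation.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_codes(X, f, lam=0.1, gamma=1.0, n_iter=200):
    """Code each object as a sparse combination of all other objects.

    X: (n, d) feature matrix, f: (n,) current ranking scores.
    Assumed per-object problem (not taken from the paper):
        0.5 * ||x_i - sum_j w_ij x_j||^2
      + 0.5 * gamma * (f_i - sum_j w_ij f_j)^2 + lam * ||w_i||_1
    """
    n = X.shape[0]
    # Append the scaled score as an extra feature so that a single
    # l1-penalized least-squares problem covers both quadratic terms.
    Z = np.hstack([X, np.sqrt(gamma) * f[:, None]])
    W = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)   # exclude the object itself
        D = Z[others].T                       # (d + 1, n - 1) dictionary
        # Step size 1/L, where L is the squared spectral norm of D.
        step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)
        w = np.zeros(n - 1)
        for _ in range(n_iter):               # plain ISTA iterations
            grad = D.T @ (D @ w - Z[i])
            w = soft_threshold(w - step * grad, step * lam)
        W[i, others] = w
    return W

def ranking_scores(W, y, mu=1.0):
    """Minimize ||f - W f||^2 + mu * ||f - y||^2 in closed form."""
    n = W.shape[0]
    M = np.eye(n) - W
    return np.linalg.solve(M.T @ M + mu * np.eye(n), mu * y)

def sparse_structure_ranking(X, y, lam=0.1, gamma=1.0, mu=1.0, n_outer=5):
    """Alternate between updating the coefficients W and the scores f."""
    f = np.asarray(y, dtype=float).copy()
    for _ in range(n_outer):
        W = sparse_codes(X, f, lam=lam, gamma=gamma)
        f = ranking_scores(W, y, mu=mu)
    return W, f
```

In this sketch `y` is a query indicator vector (1 for the query object, 0 elsewhere), and objects would be returned in descending order of `f`, for example with `np.argsort(-f)`. Coupling the score term into the coding step is one simple way to make the two sets of variables share a single objective; other couplings are equally plausible given only the abstract.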
Acknowledgements
Jim Jing-Yan Wang and Yijun Sun are supported in part by the US National Science Foundation under grant No. DBI-1062362. The study is supported by grants from the Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, China, and King Abdullah University of Science and Technology (KAUST), Saudi Arabia.
Cite this article
Wang, J.JY., Sun, Y. & Gao, X. Sparse structure regularized ranking. Multimed Tools Appl 74, 635–654 (2015). https://doi.org/10.1007/s11042-014-1939-9
DOI: https://doi.org/10.1007/s11042-014-1939-9