Abstract
In this paper, we propose a new framework, named View-wised Discriminative Ranking (VDR), that captures the latent relative information among the multiple views of a 3D model. Unlike existing view-based methods, which treat the views as independent sources of information, we model the relationships among them. By placing the views of a model in a certain order, we learn the parameters of a ranking function and use them as a new, robust model representation. We evaluate the proposed method on several challenging 3D retrieval datasets, and comparison experiments demonstrate its superiority in both retrieval accuracy and efficiency.
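The idea of using ranking-function parameters as a descriptor can be illustrated with a minimal sketch. The abstract does not specify the ranking function, the view ordering, or the optimizer, so everything below is an assumption made for illustration: a linear ranking function trained with a pairwise hinge loss by plain gradient descent, and a hypothetical helper name `rank_representation`. It is not the authors' implementation.

```python
import numpy as np

def rank_representation(views, epochs=200, lr=0.01, reg=1e-3):
    """Fit a linear ranking function to ordered view features.

    views: array of shape (n_views, dim); rows are assumed to be
    already arranged in the chosen view order (an assumption, the
    abstract does not say how the order is obtained).
    Returns the weight vector w, used here as the model descriptor.
    """
    n_views, dim = views.shape
    # Every pair (i, j) with i before j in the ordering.
    i_idx, j_idx = np.triu_indices(n_views, k=1)
    # A later view should score higher than an earlier one.
    diffs = views[j_idx] - views[i_idx]

    w = np.zeros(dim)
    for _ in range(epochs):
        margins = diffs @ w
        violated = margins < 1.0  # pairs inside the hinge margin
        # Gradient of: reg/2 * ||w||^2 + mean(max(0, 1 - margin))
        grad = reg * w - diffs[violated].sum(axis=0) / len(diffs)
        w -= lr * grad
    return w

# Toy data: 8 views whose features drift along one direction.
rng = np.random.default_rng(0)
direction = rng.standard_normal(16)
views = np.outer(np.arange(8), direction)
views = views + 0.01 * rng.standard_normal(views.shape)

descriptor = rank_representation(views)
scores = views @ descriptor  # should increase with view order
```

Under these assumptions, two 3D models would be compared by a distance between their `descriptor` vectors rather than by matching individual views, which is one way a single ranking-based representation could reduce retrieval cost.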
Cite this article
Li, W., An, Y. View-wised discriminative ranking for 3D object retrieval. Multimed Tools Appl 77, 22035–22049 (2018). https://doi.org/10.1007/s11042-017-5208-6
DOI: https://doi.org/10.1007/s11042-017-5208-6