Abstract
We present an approach to improving the performance of multi-class image segmentation systems based on a multilevel description of segments. The multi-class image segmentation system used in this paper delineates the segments in an image, describes each segment via a multilevel feature vector and passes the vectors to a multi-class object classifier. The focus of this paper is on the segment description stage. We first propose a robust, scale-invariant texture feature set, named directional differences (DDs). This feature set is designed by investigating the flaws of conventional texture features. The advantages of DDs are justified both analytically and experimentally. We have conducted several experiments on the performance of our multi-class image segmentation system to compare DDs with some well-known texture features. Experimental results show that DDs yield about 8 % higher classification accuracy. Feature reduction experiments also show that, in a combined feature space, DDs remain among the most effective features even for small feature vector sizes. To describe a segment fully, we introduce a multilevel strategy called different levels of feature extraction (DLFE) that enables the system to include semantic relations and contextual information in the features. This information is particularly effective for highly occluded objects. DLFE concatenates the features related to different views of every segment. Experimental results show that more than a 4 % improvement in multi-class image segmentation accuracy is achieved. Using the semantic information in the classifier stage adds another 2 % improvement to the accuracy of the system.
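To make the described pipeline concrete, the following is a minimal sketch, not the authors' implementation: the exact definition of the directional-difference features and of the DLFE levels is given in the paper body, not in the abstract, so the descriptor below (absolute grey-level differences pooled along four directions, and segment/context/whole-image levels) is an illustrative assumption.

```python
import numpy as np

def directional_differences(patch, n_bins=16):
    """Illustrative directional-difference texture descriptor (hypothetical form).
    Pools absolute grey-level differences along four directions into histograms."""
    patch = patch.astype(np.float32)
    diffs = [
        np.abs(patch[:, 1:] - patch[:, :-1]),      # horizontal neighbours
        np.abs(patch[1:, :] - patch[:-1, :]),      # vertical neighbours
        np.abs(patch[1:, 1:] - patch[:-1, :-1]),   # main diagonal
        np.abs(patch[1:, :-1] - patch[:-1, 1:]),   # anti-diagonal
    ]
    hists = [np.histogram(d, bins=n_bins, range=(0, 255), density=True)[0]
             for d in diffs]
    return np.concatenate(hists)

def dlfe_descriptor(image, segment_mask, context_scale=2.0):
    """Illustrative DLFE-style multilevel description: concatenate descriptors of the
    segment itself, of an enlarged context window around it, and of the whole image.
    Assumes a single-channel (grey-level) image."""
    ys, xs = np.nonzero(segment_mask)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    segment_view = image[y0:y1, x0:x1]

    # Enlarged bounding box to capture the surrounding context.
    h, w = y1 - y0, x1 - x0
    cy0 = max(0, int(y0 - (context_scale - 1) * h / 2))
    cy1 = min(image.shape[0], int(y1 + (context_scale - 1) * h / 2))
    cx0 = max(0, int(x0 - (context_scale - 1) * w / 2))
    cx1 = min(image.shape[1], int(x1 + (context_scale - 1) * w / 2))
    context_view = image[cy0:cy1, cx0:cx1]

    return np.concatenate([
        directional_differences(segment_view),
        directional_differences(context_view),
        directional_differences(image),
    ])
```

In the system described by the abstract, the concatenated multilevel vector produced for each segment would then be passed to the multi-class object classifier.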
Cite this article
Mostajabi, M., Gholampour, I. A robust multilevel segment description for multi-class object recognition. Machine Vision and Applications 26, 15–30 (2015). https://doi.org/10.1007/s00138-014-0642-1