Accurate Natural Trail Detection Using a Combination of a Deep Neural Network and Dynamic Programming
<p><b>Figure 1.</b> Overview of the proposed natural trail detection system, which combines a deep neural network (DNN) and dynamic programming (DP). A DNN trained with supervised data maps the input image into trail and non-trail areas. The starting and goal points of the local segment are computed from the DNN output, and DP is then applied to the trail map to obtain the local segment of the visible trail.</p>
<p><b>Figure 2.</b> (<b>a</b>) Examples of trail images from the IDSIA dataset used in our experiments. The dataset used to train the deep neural network consists of 100 × 100 RGB patches from trail and non-trail areas; (<b>b</b>) “Trail” patches are extracted from the regions where hikers would walk, even though these regions have no distinct boundary or markings; (<b>c</b>) “Non-trail” patches are extracted from the other surrounding areas in the image.</p>
<p><b>Figure 3.</b> Deep neural network (DNN) architecture. The network is composed of five convolutional layers and three fully connected (FC) layers, with a Softmax classifier on top. The input to the network is an RGB color image patch of size 80 × 80 pixels. The network outputs two numbers corresponding to the probabilities of the input patch belonging to the trail and non-trail areas, respectively.</p>
<p><b>Figure 4.</b> Fully convolutional neural network (FCN) corresponding to the DNN shown in <a href="#sensors-18-00178-f003" class="html-fig">Figure 3</a>. The FCN is obtained by reshaping the last three fully connected (FC) layers of the DNN into convolutional layers. The network can process input images of arbitrary size and outputs two score maps corresponding to the trail and non-trail categories, respectively. Given an RGB input of size 480 × 752, the network outputs two feature maps of size 26 × 43 pixels each. Each point in an output map represents the normalized probability that the corresponding image patch belongs to one of the considered categories.</p>
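The reported map sizes are consistent with a sliding-window view of the FCN: an 80 × 80 receptive field swept over the image with an effective stride of 16 pixels. The stride is our inference from the 480 × 752 → 26 × 43 figures, not a value stated in the caption; a minimal sketch under that assumption:

```python
def fcn_output_size(in_h, in_w, patch=80, stride=16):
    """Score-map size for a sliding-window FCN.

    The 80x80 patch size is given in the paper; the effective stride
    of 16 is an assumption that reproduces the reported 26x43 output
    for a 480x752 input.
    """
    return (in_h - patch) // stride + 1, (in_w - patch) // stride + 1

print(fcn_output_size(480, 752))  # (26, 43)
```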
<p><b>Figure 5.</b> Trail segmentation using the DNN. (Top row) Representative images (resized to 240 × 376) from the test set. (Bottom row) The trail maps obtained using the proposed pipeline (26 × 43 maps up-sampled to 240 × 376) overlaid on the corresponding test images. The probability that a point belongs to the trail is coded as the intensity of the red component, which is blended with the image pixels by a weighted sum.</p>
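The overlay described above (trail probability coded in the red channel and blended with the image by a weighted sum) can be sketched as follows; the 50/50 blending weight is an illustrative choice, not the paper's exact value:

```python
import numpy as np

def overlay_trail_map(image, trail_prob, alpha=0.5):
    """Blend a trail-probability map into the red channel of an RGB image.

    image: HxWx3 uint8; trail_prob: HxW floats in [0, 1], already
    up-sampled to the image size. alpha is an assumed blending weight.
    """
    out = image.astype(np.float32)
    red = trail_prob * 255.0
    # Weighted sum on the red channel only; green and blue are kept.
    out[..., 0] = (1 - alpha) * out[..., 0] + alpha * red
    return out.clip(0, 255).astype(np.uint8)
```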
<p><b>Figure 6.</b> Transition to the node (i,j) is allowed only from its five nearest predecessors (k,l). The transition costs <span class="html-italic">d<sub>kl</sub></span><span class="html-italic"><sub>→ij</sub></span> from these predecessors are assigned empirically as [0.2, 0.1, 0, 0.1, 0.2].</p>
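The dynamic-programming step with the five-predecessor transition scheme and the empirical costs [0.2, 0.1, 0, 0.1, 0.2] can be sketched as below. The accumulated objective (trail probability reward minus transition penalty) and the backtracking from the terminal row are a plausible reading of the method, not the paper's exact formulation:

```python
import numpy as np

# Penalty for reaching column j from column l = j + off in the
# previous row, for offsets off in {-2, -1, 0, 1, 2}.
PENALTY = [0.2, 0.1, 0.0, 0.1, 0.2]

def dp_trail(score, start_col):
    """Trace a trail line through a score map with dynamic programming.

    score: HxW trail-probability map; start_col: starting column in row 0.
    Returns the trail's column index for every row.
    """
    H, W = score.shape
    acc = np.full((H, W), -np.inf)      # accumulated objective
    back = np.zeros((H, W), dtype=int)  # best predecessor column
    acc[0, start_col] = score[0, start_col]
    for i in range(1, H):
        for j in range(W):
            for off, pen in zip(range(-2, 3), PENALTY):
                l = j + off
                if 0 <= l < W and acc[i - 1, l] - pen + score[i, j] > acc[i, j]:
                    acc[i, j] = acc[i - 1, l] - pen + score[i, j]
                    back[i, j] = l
    # Backtrack from the best node in the terminal row.
    path = [int(np.argmax(acc[-1]))]
    for i in range(H - 1, 0, -1):
        path.append(back[i, path[-1]])
    return path[::-1]
```

A smoother trail line, as in Figure 7, could then be obtained by fitting a second-order polynomial to the returned path (e.g., with `np.polyfit`).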
<p><b>Figure 7.</b> Trail detection using the proposed method. The trail detected by dynamic programming (blue), its smoother version produced by fitting a second-order polynomial (green), and the local trail segment annotated by a human observer (red) are superimposed on the test image.</p>
<p><b>Figure 8.</b> Receiver operating characteristic (ROC) curve of the DNN.</p>
<p><b>Figure 9.</b> Performance of the proposed method. (<b>a</b>) Histogram of errors in determining the starting point; (<b>b</b>) Distribution of errors (Δx, Δy) in determining the endpoint; (<b>c</b>) Histogram of errors in the x component (Δx) and (<b>d</b>) the y component (Δy) of the endpoint of the local trail segment.</p>
<p><b>Figure 10.</b> Receiver operating characteristic (ROC) curves of the network trained on the IDSIA dataset, the network trained on the new data only, and the network trained on the IDSIA dataset and fine-tuned on the new data.</p>
<p><b>Figure 11.</b> Results of trail segmentation in a new environment. (<b>a</b>) Sample images from the new trail; (<b>b</b>) Trail maps generated by the DNN trained on the IDSIA dataset; (<b>c</b>) by the DNN trained on the small dataset from the new environment; (<b>d</b>) by the DNN trained on the IDSIA dataset and fine-tuned with data from the new environment.</p>
Abstract
1. Introduction
2. Patch-Based Deep Neural Network for Trail Segmentation
2.1. Dataset
2.2. Deep Neural Network for Image Patch Classification
2.2.1. Deep Neural Network Architecture
2.2.2. Deep Neural Network Training
2.3. Fully Convolutional Neural Network for Trail Map Generation
2.4. Starting Point and Terminal Row of the Trail
3. Dynamic Programming for Trail Line Detection
4. Experiments and Results
4.1. Performance of the Patch-Based Trail Classifier
4.2. Performance of the Trail Detection System
4.3. Detecting Trail in New Environment
5. Conclusions
Acknowledgments
Author Contributions
Conflicts of Interest
References
- Chiu, K.Y.; Lin, S.-F. Lane detection using color-based segmentation. In Proceedings of the IEEE Intelligent Vehicles Symposium, Las Vegas, NV, USA, 6–8 June 2005. [Google Scholar]
- Kong, H.; Audibert, J.Y.; Ponce, J. General road detection from a single image. IEEE Trans. Image Process. 2010, 19, 2211–2220. [Google Scholar] [CrossRef] [PubMed]
- Dahlkamp, H.; Kaehler, A.; Stavens, D.; Thrun, S.; Bradski, G.R. Self-supervised Monocular Road Detection in Desert Terrain. In Robotics: Science and Systems; The MIT Press: Cambridge, MA, USA, 2006; Volume 38. [Google Scholar]
- Yenikaya, S.; Yenikaya, G.; Düven, E. Keeping the vehicle on the road: A survey on on-road lane detection systems. ACM Comput. Surv. 2013, 46, 2. [Google Scholar] [CrossRef]
- McCall, J.C.; Trivedi, M.M. Video-based lane estimation and tracking for driver assistance: Survey, system, and evaluation. IEEE Trans. Intell. Transp. Syst. 2006, 7, 20–37. [Google Scholar] [CrossRef]
- Hillel, A.B.; Lerner, R.; Levi, D.; Raz, G. Recent progress in road and lane detection: A survey. Mach. Vis. Appl. 2014, 25, 727–745. [Google Scholar] [CrossRef]
- Rasmussen, C.; Scott, D. Shape-guided superpixel grouping for trail detection and tracking. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, 22–26 September 2008. [Google Scholar]
- Rasmussen, C.; Lu, Y.; Kocamaz, M. Appearance contrast for fast, robust trail-following. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, MO, USA, 10–15 October 2009. [Google Scholar]
- Santana, P.; Correia, L.; Mendonça, R.; Alves, N.; Barata, J. Tracking natural trails with swarm-based visual saliency. J. Field Robot. 2013, 30, 64–86. [Google Scholar] [CrossRef]
- LeCun, Y.; Bengio, Y.; Hinton, G.E. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2012; pp. 1097–1105. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2015; pp. 91–99. [Google Scholar]
- Noh, H.; Hong, S.; Han, B. Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1520–1528. [Google Scholar]
- Garg, R.; Gustavo, C.; Reid, I. Unsupervised CNN for single view depth estimation: Geometry to the rescue. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 740–756. [Google Scholar]
- Flynn, J.; Neulander, I.; Philbin, J.; Snavely, N. DeepStereo: Learning to predict new views from the world’s imagery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 5515–5524. [Google Scholar]
- Huval, B.; Wang, T.; Tandon, S.; Kiske, J.; Song, W.; Pazhayampallil, J.; Andriluka, M.; Rajpurkar, P.; Migimatsu, T.; Cheng-Yue, R.; et al. An empirical evaluation of deep learning on highway driving. arXiv, 2015; arXiv:1504.01716. [Google Scholar]
- Chen, C.; Seff, A.; Kornhauser, A.; Xiao, J. Deepdriving: Learning affordance for direct perception in autonomous driving. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 2722–2730. [Google Scholar]
- Bojarski, M.; Davide, D.D.; Dworakowski, D.; Firner, B.; Flepp, B.; Goyal, P.; Jackel, L.D.; Monfort, M.; Muller, U.; Zhang, J.; et al. End to End Learning for Self-Driving Cars. arXiv, 2016; arXiv:1604.07316. [Google Scholar]
- Yuan, Y.; Jiang, Z.; Wang, Q. Video-based road detection via online structural learning. Neurocomputing 2015, 168, 336–347. [Google Scholar] [CrossRef]
- Wang, Q.; Gao, J.; Yuan, Y. Embedding Structured Contour and Location Prior in Siamesed Fully Convolutional Networks for Road Detection. IEEE Trans. Intell. Transp. Syst. 2017. [Google Scholar] [CrossRef]
- Hadsell, R.; Sermanet, P.; Ben, J.; Erkan, A.; Scoffier, M.; Kavukcuoglu, K.; Muller, U.; LeCun, Y. Learning long-range vision for autonomous off-road driving. J. Field Robot. 2009, 26, 120–144. [Google Scholar] [CrossRef]
- Giusti, A.; Guzzi, J.; Cireşan, D.C.; He, F.L.; Rodríguez, J.P.; Fontana, F.; Faessler, M.; Forster, C.; Schmidhuber, J.; Di Caro, G.; et al. A machine learning approach to visual perception of forest trails for mobile robots. IEEE Robot. Autom. Lett. 2016, 1, 661–667. [Google Scholar] [CrossRef]
- Smolyanskiy, N.; Kamenev, A.; Smith, J.; Birchfield, S. Toward Low-Flying Autonomous MAV Trail Navigation using Deep Neural Networks for Environmental Awareness. arXiv, 2017; arXiv:1705.02550. [Google Scholar]
- A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots. Available online: http://people.idsia.ch/~guzzi/DataSet.html (accessed on 5 January 2018).
- Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 818–833. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, 13–15 May 2010; pp. 249–256. [Google Scholar]
- Bergstra, J.; Breuleux, O.; Bastien, F.; Lamblin, P.; Pascanu, R.; Desjardins, G.; Turian, J.; Warde-Farley, D.; Bengio, Y. Theano: A CPU and GPU Math Expression Compiler. In Proceedings of the Python for Scientific Computing Conference (SciPy), Austin, TX, USA, 28 June–3 July 2010. [Google Scholar]
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv, 2014; arXiv:1412.6980. [Google Scholar]
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar]
- Bellman, R. Dynamic Programming; Courier Corporation: North Chelmsford, MA, USA, 2013. [Google Scholar]
| Actual (↓) \ Predicted (→) | Trail | Non-Trail |
|---|---|---|
| Trail | 11,357 | 6082 |
| Non-Trail | 4562 | 66,059 |
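Summary metrics can be read off the confusion matrix above (rows: actual class, columns: predicted class); the choice of which metrics to report here is ours:

```python
# Patch-classification confusion matrix from the table above.
tp, fn = 11357, 6082   # actual trail predicted as trail / non-trail
fp, tn = 4562, 66059   # actual non-trail predicted as trail / non-trail

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)       # true-positive rate for the trail class
precision = tp / (tp + fp)

print(f"accuracy={accuracy:.3f} recall={recall:.3f} precision={precision:.3f}")
# accuracy=0.879 recall=0.651 precision=0.713
```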
| Comparison | Mean Pixel Deviation |
|---|---|
| Human1-Human2 | 9.45 |
| Human1-proposed method | 22.7 |
| Human2-proposed method | 25.28 |
| Human1-shape_guided [7] | 25.68 |
| Human2-shape_guided [7] | 27.85 |
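The mean pixel deviation rows above can be reproduced with a helper like the one below; the assumption (not spelled out in the table) is that two trail lines are sampled at the same image rows and their absolute column differences are averaged:

```python
import numpy as np

def mean_pixel_deviation(cols_a, cols_b):
    """Mean absolute column difference between two trail lines
    sampled at the same image rows (assumed reading of the metric)."""
    a, b = np.asarray(cols_a, float), np.asarray(cols_b, float)
    return float(np.mean(np.abs(a - b)))
```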
| Actual (↓) \ Predicted (→) | Trail | Non-Trail |
|---|---|---|
| Trail | 3776 | 1972 |
| Non-Trail | 1827 | 7325 |
| Actual (↓) \ Predicted (→) | Trail | Non-Trail |
|---|---|---|
| Trail | 4872 | 876 |
| Non-Trail | 514 | 8638 |
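Overall accuracy from each of the two new-environment confusion matrices above can be computed as a quick check; the pairing of each matrix with a specific training regime follows Section 4.3 and is not restated here:

```python
def accuracy(tp, fn, fp, tn):
    """Overall accuracy from a 2x2 confusion matrix (rows: actual)."""
    return (tp + tn) / (tp + fn + fp + tn)

# The two confusion matrices above, in the order they are listed.
acc_first = accuracy(3776, 1972, 1827, 7325)
acc_second = accuracy(4872, 876, 514, 8638)
print(f"{acc_first:.3f} vs {acc_second:.3f}")  # 0.745 vs 0.907
```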
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Adhikari, S.P.; Yang, C.; Slot, K.; Kim, H. Accurate Natural Trail Detection Using a Combination of a Deep Neural Network and Dynamic Programming. Sensors 2018, 18, 178. https://doi.org/10.3390/s18010178