Research on Distance Transform and Neural Network Lidar Information Sampling Classification-Based Semantic Segmentation of 2D Indoor Room Maps
Figure 1. (a) Binary map matrix; (b) distance transform matrix.
Figure 2. The segmentation process of the watershed algorithm.
Figure 3. The overall framework of the proposed method.
Figure 4. (a) The four labels and their grayscale values: 255 for rooms, 195 for doorways, 127 for corridors, and 0 for walls; (b) the simulated laser scanner measurement within the manually labelled map.
Figure 5. The laser map information collected from different kinds of areas. Rooms are labelled 1, corridors 2, and doorways 3.
Figure 6. The framework of the proposed LCNet block.
Figure 7. (a) The binary pre-processed map (PPM); (b) the result of the distance transform; (c) the pre-segmented map (PSM) produced by the distance transform and the watershed algorithm; (d) the map with room and corridor labels from automatic classification; (e) the sampling diagram in the optimized sampling areas; (f) the resulting semantically segmented map.
Figure 8. The designed mobile robot with a 2D lidar.
Figure 9. (a) The binary pre-processed map (PPM); (b) the result of the distance transform; (c) the pre-segmented map (PSM) produced by the distance transform and the watershed algorithm; (d) the map with room and corridor labels; (e) the resulting semantically segmented map.
Figure 10. The accuracy of ResNet-18 and LCNet during iterative training.
Figure 11. (a) The accuracy of LCNet; (b) the accuracy of ResNet-18.
Figure 12. (a) freiburg_building52 map, 6986 points sampled; (b) lab_d map, 12,630 points sampled; (c) lab_c map, 9852 points sampled; (d) lab_intel map, 15,368 points sampled.
Figure 13. The experimental results of the proposed method on three different maps.
Figure 14. Exemplary segmentation results: the first column shows the ground-truth room segmentation from human labelling, the second column the proposed method's segmentation, the third column the Voronoi graph-based segmentation, and the fourth column the morphological segmentation.
Abstract
1. Introduction
2. Related Works
2.1. Semantic Labels
2.2. Deep Learning for Classification
2.3. The Distance Transform Watershed Based Pre-Segmentation
3. Proposed Method
3.1. Laser Data from Simulated Lidar
3.2. The Optimized LCNet Network
3.3. Classification Based on Pre-Segmentation and Optimized Sampling Areas
3.3.1. Pre-Segmentation
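The pre-segmentation stage first computes a distance transform of the binary pre-processed map (PPM) and then applies the watershed algorithm to it. As a minimal, self-contained sketch of the first step, the pure-Python two-pass city-block (chamfer-style) distance transform below propagates distances forward from the top-left and backward from the bottom-right; the list-of-lists grid layout and the 0-means-wall convention are assumptions for illustration, not the paper's implementation.

```python
def distance_transform(grid):
    """Two-pass city-block distance transform on a binary occupancy grid.

    grid[y][x] == 0 marks a wall/obstacle pixel; every free pixel receives
    its city-block distance to the nearest obstacle.
    """
    INF = 10 ** 9
    h, w = len(grid), len(grid[0])
    d = [[0 if grid[y][x] == 0 else INF for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                d[y][x] = min(d[y][x], d[y - 1][x] + 1)
            if x > 0:
                d[y][x] = min(d[y][x], d[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                d[y][x] = min(d[y][x], d[y + 1][x] + 1)
            if x < w - 1:
                d[y][x] = min(d[y][x], d[y][x + 1] + 1)
    return d
```

In the paper's pipeline, the watershed algorithm then segments the map using this distance field as its topography.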
Algorithm 1 Labeling room areas and corridor areas with the "winner-take-all" principle
Input: Pre-processed map (PPM), pre-segmented map (PSM), classification results of sampling points
Output: The map with room labels and corridor labels
1: Count the classification results of the sampling points in each area; let m be the number of sampling points identified as rooms in the area and n the number identified as corridors;
2: for each area of the binarized map do
3:   if m >= n then
4:     Classify this area as a room;
5:   else
6:     Classify this area as a corridor;
7:   end if
8: end for
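The winner-take-all vote above can be sketched in a few lines of Python; the dict-of-votes data layout and the label strings are assumptions for illustration:

```python
def label_areas(area_votes):
    """Winner-take-all labeling of pre-segmented areas (Algorithm 1).

    area_votes maps an area id to the classifier's per-sampling-point
    predictions, each either "room" or "corridor".
    """
    labels = {}
    for area, votes in area_votes.items():
        m = sum(1 for v in votes if v == "room")      # points classified as room
        n = sum(1 for v in votes if v == "corridor")  # points classified as corridor
        labels[area] = "room" if m >= n else "corridor"  # m >= n: ties go to "room"
    return labels
```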
3.3.2. Optimized Sampling Areas and the Extraction of Doorway Labels
Algorithm 2 Labeling doorway areas
Input: Pre-segmented map (PSM), pre-processed map (PPM), classification results of sampling points, the map with room labels and corridor labels
Output: Result of semantic segmentation
1: Extract the size of the pre-processed map; define rows as the number of rows and cols as the number of columns; define a two-dimensional vector dl_type to store the grayscale values on both sides of each dividing line, a vector dl_n to store the number of pixels of each dividing line, and a vector dl_d to store the number of pixels with the "doorway" label on each dividing line;
2: for x in [0, cols - 1] do
3:   for y in [0, rows - 1] do
4:     if PPM(x, y) != 0 && PSM(x, y) == 0 then
5:       if size(dl_type) == 0 then
6:         Push {g1, g2} into dl_type, where g1 and g2 are the gray values on the two sides of pixel (x, y);
7:         Push 1 into dl_n, push 0 into dl_d;
8:         if pixel (x, y) is classified as "doorway" then
9:           dl_d[0]++;
10:        end if
11:      else
12:        if {g1, g2} exists in dl_type then
13:          dl_n[i]++, where i is the subscript for which dl_type[i] == {g1, g2};
14:          if pixel (x, y) is classified as "doorway" then
15:            dl_d[i]++;
16:          end if
17:        else
18:          Push {g1, g2} into dl_type, push 1 into dl_n, push 0 into dl_d;
19:          if pixel (x, y) is classified as "doorway" then
20:            dl_d[size(dl_d) - 1]++;
21:          end if
22:        end if
23:      end if
24:    end if
25:  end for
26: end for
27: Define the threshold thr_d;
28: for j in [0, size(dl_type) - 1] do
29:   if dl_d[j] / dl_n[j] >= thr_d then
30:     Mark the jth dividing line as a doorway;
31:   else
32:     Erase the jth dividing line;
33:   end if
34: end for
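The same bookkeeping can be condensed into a Python sketch that groups dividing-line pixels by the pair of area gray values on their two sides and thresholds the doorway fraction; the horizontal-only side lookup, the is_doorway callback, and the default thr_d value are illustrative assumptions:

```python
def label_doorways(ppm, psm, is_doorway, thr_d=0.5):
    """Sketch of Algorithm 2: keep a dividing line as a doorway when at
    least a fraction thr_d of its pixels are classified as "doorway".

    ppm/psm are lists of rows; ppm[y][x] != 0 means free space, and
    psm[y][x] == 0 marks a watershed dividing line between two areas.
    """
    counts = {}  # frozenset({g1, g2}) -> (total pixels, doorway pixels)
    h, w = len(ppm), len(ppm[0])
    for y in range(h):
        for x in range(w):
            # free in the pre-processed map but on a dividing line
            if ppm[y][x] != 0 and psm[y][x] == 0:
                # gray values of the areas on the two horizontal sides
                sides = [psm[y][x + dx] if 0 <= x + dx < w else 0 for dx in (-1, 1)]
                key = frozenset(sides)
                total, door = counts.get(key, (0, 0))
                counts[key] = (total + 1, door + (1 if is_doorway(x, y) else 0))
    # dividing lines whose doorway fraction reaches the threshold survive
    return {key for key, (total, door) in counts.items() if door / total >= thr_d}
```

For example, a single dividing column between areas with gray values 100 and 200 is retained as a doorway only when enough of its pixels are doorway-classified.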
4. Experiments and Analysis
4.1. Results of the LCNet and the ResNet-18
4.2. Results of the Proposed Classification Method
4.3. Comparison with Other Algorithms
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
| | ResNet-18 | LCNet |
| --- | --- | --- |
| Convolutional block | 3 × 3 conv, 3 × 3 conv | 1 × 1 conv, 3 × 3 OctConv, 1 × 1 conv |
| Layers | LCNet | Output Size |
| --- | --- | --- |
| Input | | 48 × 48 |
| Block 1 | 1 × 1 conv, 3 × 3 OctConv, 1 × 1 conv | 48 × 48 |
| Transition layer | 2 × 2 average pooling | 24 × 24 |
| Block 2 | 1 × 1 conv, 3 × 3 OctConv, 1 × 1 conv | 24 × 24 |
| Transition layer | 2 × 2 average pooling | 12 × 12 |
| Block 3 | 1 × 1 conv, 3 × 3 OctConv, 1 × 1 conv | 12 × 12 |
| Transition layer | 2 × 2 average pooling | 6 × 6 |
| Block 4 | 1 × 1 conv, 3 × 3 OctConv, 1 × 1 conv | 6 × 6 |
| Transition layer | 2 × 2 average pooling | 3 × 3 |
| FC layer | 256D FC layer | 1 × 1 |
| FC layer | 512D FC layer | 1 × 1 |
| Classification layer | 512D FC layer, softmax | 1 × 1 |
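As a sanity check on the Output Size column, each block's 1 × 1 / 3 × 3 OctConv / 1 × 1 convolutions preserve the spatial size (assuming stride 1 with same-padding, which the table implies), while every 2 × 2 average-pooling transition layer halves it. A small sketch tracing the side length:

```python
def lcnet_spatial_sizes(input_size=48, num_blocks=4):
    """Trace the feature-map side length through the LCNet backbone.

    Each convolutional block keeps the spatial size; each 2x2
    average-pooling transition layer (stride 2) halves it.
    """
    sizes = [input_size]
    for _ in range(num_blocks):
        # conv block: side length unchanged; transition layer: halved
        sizes.append(sizes[-1] // 2)
    return sizes
```

Calling `lcnet_spatial_sizes()` reproduces the 48, 24, 12, 6, 3 progression of the table before the fully connected layers.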
| Network | Model Size | Sample Number of Test Set | Running Time on PC | Running Time on Raspberry Pi |
| --- | --- | --- | --- | --- |
| LCNet | 3.4 M | 8190 | 2.47 s | 18.08 s |
| ResNet-18 | 44.7 M | 8190 | 8.32 s | - |
| | Correctly Classified Points/All Sampling Points | Accuracy Rate | Running Time |
| --- | --- | --- | --- |
| Map 1 | 578/589 | 98.13% | 1.77 s |
| Map 2 | 662/682 | 97.06% | 2.41 s |
| Map 3 | 774/786 | 98.47% | 3.06 s |
| | Proposed Method | Voronoi | Morphological |
| --- | --- | --- | --- |
| Recall | 96.5% ± 1.6% | 93.5% ± 1.4% | 94.7% ± 2.8% |
| Precision | 94.3% ± 3.9% | 86.6% ± 8.7% | 91.3% ± 5.7% |
| Average runtime (s) | 2.83 ± 1.21 | 2.07 ± 1.06 | 1.25 ± 0.42 |
| Segment area (m²) | 58.2 ± 10.4 | 42.6 ± 8.5 | 39.1 ± 18.6 |
| Segmented labels | Yes | No | No |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zheng, T.; Duan, Z.; Wang, J.; Lu, G.; Li, S.; Yu, Z. Research on Distance Transform and Neural Network Lidar Information Sampling Classification-Based Semantic Segmentation of 2D Indoor Room Maps. Sensors 2021, 21, 1365. https://doi.org/10.3390/s21041365