Paris-CARLA-3D: A Real and Synthetic Outdoor Point Cloud Dataset for Challenging Tasks in 3D Mapping
Figure 1. Prototype acquisition system used to create the PC3D dataset in the city of Paris. Sensors: Velodyne HDL32 LiDAR, Ladybug5 360° camera, and Photonfocus MV1 16-band VIR and 25-band NIR hyperspectral cameras (hyperspectral data are not available in this dataset; they cannot be used in mobile mapping due to the limited exposure time).

Figure 2. Paris-CARLA-3D dataset: (**left**) Paris point clouds with color information on the LiDAR points; (**right**) manual semantic annotation of the LiDAR points (using the same tags as the CARLA simulator). Note the large number of details in the manual annotation.

Figure 3. **Left**, prediction on the $S_0$ test set of the Paris data using the KPConv model. **Right**, ground truth.

Figure 4. **Left**, prediction on the $S_3$ test set of the Paris data using the KPConv model. **Right**, ground truth.

Figure 5. **Left**, prediction on the $T_1$ test set of the CARLA data using the KPConv model. **Right**, ground truth.

Figure 6. **Left**, prediction on the $T_7$ test set of the CARLA data using the KPConv model. **Right**, ground truth.

Figure 7. Instances of vehicles in the $S_3$ test set (Paris data).

Figure 8. Pedestrians in the Paris data: the red circle highlights the difficulty of differentiating pedestrian instances.

Figure 9. **Top**, vehicle instances from our proposed baseline using BEV projections and geometric features on the $S_3$ Paris data. **Bottom**, ground truth.

Figure 10. Paris data after removal of vehicles and pedestrians. The red circles show the interest of scene completion for 3D mapping: filling the holes left by removed pedestrians and parked cars and by the occlusion of other objects, and improving the sampling of points in areas far from the LiDAR.

Figure 11. Scene completion for one chunk point cloud in Town1 ($T_1$) of the CARLA test data (training on CARLA data).

Figure 12. Scene completion for one chunk point cloud in Soufflot0 ($S_0$) of the Paris test data (training on Paris data).

Figure A1. Paris training set. From **top** to **bottom**: $S_1$, $S_2$ (real data).

Figure A2. Paris validation set. From **top** to **bottom**: $S_4$, $S_5$ (real data).

Figure A3. Paris test set. From **top** to **bottom**: $S_0$, $S_3$ (real data).

Figure A4. CARLA training set. From **top** to **bottom**: $T_2$, $T_3$, $T_4$, $T_5$ (synthetic data).

Figure A5. CARLA validation set: $T_6$ (synthetic data).

Figure A6. CARLA test set. From **top** to **bottom**: $T_1$, $T_7$ (synthetic data).
Abstract
1. Introduction
- the publication of a new dataset, called Paris-CARLA-3D (PC3D for short), consisting of synthetic and real point clouds of outdoor environments; the dataset is available at https://npm3d.fr/paris-carla-3d (accessed on 15 October 2021);
- the protocol and experiments with baselines on three tasks defined on this dataset: semantic segmentation, instance segmentation, and scene completion.
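As an illustrative starting point, a tile of the dataset can be loaded with Open3D. This is a minimal sketch: the file name is hypothetical, and we assume the tiles are distributed as colored PLY point clouds (per-point semantic labels stored as extra PLY properties would need a dedicated reader such as plyfile).

```python
import numpy as np
import open3d as o3d

# Hypothetical tile name; see https://npm3d.fr/paris-carla-3d for the actual files.
pcd = o3d.io.read_point_cloud("Soufflot0.ply")
points = np.asarray(pcd.points)  # (N, 3) XYZ coordinates
colors = np.asarray(pcd.colors)  # (N, 3) RGB values in [0, 1]
print(f"{points.shape[0]} points loaded")
```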
2. Related Datasets
3. Dataset Construction
3.1. Paris (Real Data)
3.2. CARLA (Synthetic Data)
3.3. Interest in Having Both Synthetic and Real Data
4. Dataset Properties
4.1. Statistics of Classes
4.2. Color
4.3. Split for Training
- Training data: $S_1$, $S_2$ (Paris); $T_2$, $T_3$, $T_4$, $T_5$ (CARLA);
- Validation data: $S_4$, $S_5$ (Paris); $T_6$ (CARLA);
- Test data: $S_0$, $S_3$ (Paris); $T_1$, $T_7$ (CARLA).
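This split can also be written down as a small configuration mapping (a sketch; the tile identifiers follow the Soufflot/Town naming used in Appendix B):

```python
# Train/validation/test split of the Paris-CARLA-3D tiles:
# S* = Soufflot tiles (Paris, real), T* = Town tiles (CARLA, synthetic).
SPLITS = {
    "train": {"paris": ["S1", "S2"], "carla": ["T2", "T3", "T4", "T5"]},
    "val":   {"paris": ["S4", "S5"], "carla": ["T6"]},
    "test":  {"paris": ["S0", "S3"], "carla": ["T1", "T7"]},
}
```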
4.4. Transfer Learning
5. Semantic Segmentation (SS) Task
5.1. Task Protocol
5.2. Experiments: Setting a Baseline
5.2.1. Baseline Parameters
5.2.2. Implementation Details
5.2.3. Quantitative Results
5.2.4. Qualitative Results
5.2.5. Influence of Color
5.2.6. Transfer Learning
6. Instance Segmentation (IS) Task
6.1. Task Protocol
6.2. Experiments: Setting a Baseline
- Occupancy image: binary image recording the presence or absence of points from the things classes;
- Elevation image: stores the maximal elevation among all points projected onto the same pixel;
- Accumulation image: stores the number of points projected onto the same pixel.
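A minimal NumPy sketch of these three projections (the 10 cm pixel size and the function name are illustrative assumptions; the exact grid parameters of the baseline are not specified here):

```python
import numpy as np

def bev_images(points, pixel_size=0.1):
    """Project an (N, 3) point cloud of 'things' points onto a horizontal
    grid and return the occupancy, elevation, and accumulation images."""
    xy = points[:, :2]
    origin = xy.min(axis=0)
    cols, rows = np.floor((xy - origin) / pixel_size).astype(int).T

    h, w = rows.max() + 1, cols.max() + 1
    occupancy = np.zeros((h, w), dtype=bool)
    elevation = np.full((h, w), -np.inf)
    accumulation = np.zeros((h, w), dtype=np.int64)

    occupancy[rows, cols] = True                          # presence/absence
    np.maximum.at(elevation, (rows, cols), points[:, 2])  # max z per pixel
    np.add.at(accumulation, (rows, cols), 1)              # points per pixel
    return occupancy, elevation, accumulation
```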
6.2.1. Vehicles in Paris and CARLA Data
1. Discard the predicted vehicle points whose z coordinate is greater than 4 m;
2. Connect close components with two consecutive morphological dilations of the occupancy image by a square of 3-pixel size;
3. Fill holes smaller than ten pixels inside each connected component, using a morphological area closing;
4. Discard instances with fewer than 500 points;
5. Discard instances not surrounded by ground-like classes.
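A sketch of steps 2-4 with scipy/skimage primitives, operating on the occupancy and accumulation images defined above (the function name is ours, `remove_small_holes` stands in for the area closing, and the height and ground-neighborhood filters of steps 1 and 5 are left out for brevity):

```python
import numpy as np
from scipy import ndimage
from skimage.morphology import binary_dilation, remove_small_holes, square

def vehicle_instances(occupancy, accumulation, min_points=500):
    """Steps 2-4: group BEV pixels into candidate vehicle instances."""
    # Step 2: two consecutive dilations with a 3x3 square structuring element.
    mask = binary_dilation(occupancy, square(3))
    mask = binary_dilation(mask, square(3))
    # Step 3: fill holes smaller than ten pixels inside each component.
    mask = remove_small_holes(mask, area_threshold=10)
    # Connected components become candidate instances.
    labels, n_labels = ndimage.label(mask)
    # Step 4: discard instances with fewer than 500 projected points.
    for i in range(1, n_labels + 1):
        if accumulation[labels == i].sum() < min_points:
            labels[labels == i] = 0
    return labels
```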
6.2.2. Pedestrians in CARLA Data
6.2.3. Quantitative Results
6.2.4. Qualitative Results
7. Scene Completion (SC) Task
7.1. Task Protocol
7.2. Experiments: Setting a Baseline
7.2.1. Quantitative Results
7.2.2. Qualitative Results
7.2.3. Transfer Learning with Scene Completion
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Complementary Material on the Paris-CARLA-3D Dataset
Appendix A.1. Class Statistics
Distribution of classes per section, as a percentage of points (Paris sections $S_0$–$S_5$; CARLA towns $T_1$–$T_7$):

| Class | $S_0$ | $S_1$ | $S_2$ | $S_3$ | $S_4$ | $S_5$ | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ | $T_7$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| unlabeled | 0.9 | 1.5 | 3.9 | 3.2 | 1.9 | 0.9 | 5.8 | 2.9 | - | 7.6 | 0.0 | 6.4 | 1.8 |
| building | 14.9 | 18.9 | 34.2 | 36.6 | 33.1 | 32.9 | 6.8 | 22.6 | 15.3 | 4.5 | 16.1 | 2.6 | 3.3 |
| fence | 2.3 | 0.6 | 0.7 | 0.8 | - | 0.4 | 1.0 | 0.6 | 0.0 | 0.5 | 3.8 | 1.5 | 0.6 |
| other | 2.1 | 3.4 | 6.7 | 2.2 | 2.5 | 0.4 | - | - | - | - | 0.1 | 0.1 | 0.1 |
| pedestrian | 0.2 | 1.0 | 0.6 | 1.0 | 0.7 | 0.7 | 0.1 | 0.2 | 0.1 | 0.0 | - | 0.1 | 0.0 |
| pole | 0.6 | 0.9 | 0.6 | 0.8 | 0.7 | 1.1 | 0.6 | 0.6 | 4.2 | 0.8 | 0.8 | 0.4 | 0.3 |
| road-line | 3.8 | 3.7 | 2.4 | 4.1 | 3.5 | 3.4 | 0.2 | 0.2 | 2.9 | 1.6 | 2.2 | 1.3 | 1.7 |
| road | 41.0 | 49.7 | 35.0 | 37.6 | 40.6 | 27.5 | 47.8 | 37.2 | 53.1 | 52.8 | 44.7 | 58.0 | 42.8 |
| sidewalk | 10.1 | 4.2 | 7.3 | 6.7 | 11.9 | 29.4 | 22.5 | 17.5 | 10.3 | 1.7 | 10.5 | 3.1 | 0.4 |
| vegetation | 18.5 | 9.0 | 0.1 | 0.3 | 0.1 | - | 8.7 | 10.8 | 2.7 | 12.8 | 4.6 | 8.1 | 23.1 |
| vehicles | 1.3 | 1.8 | 6.5 | 6.5 | 3.3 | 1.6 | 1.7 | 3.1 | 0.9 | 0.5 | 3.1 | 4.2 | 0.9 |
| wall | - | - | - | - | - | - | 1.9 | 3.6 | 1.4 | 5.4 | 5.3 | 3.4 | - |
| traffic sign | 0.1 | 0.4 | 0.1 | 0.1 | 0.3 | 0.1 | - | 0.0 | - | 0.1 | 0.0 | 0.0 | 0.1 |
| sky | - | - | - | - | - | - | - | - | - | - | - | - | - |
| ground | - | - | - | - | - | - | - | 0.0 | 0.2 | 1.4 | 0.3 | 0.1 | - |
| bridge | - | - | - | - | - | - | 1.7 | - | - | 0.7 | 6.6 | - | - |
| rail-track | - | - | - | - | - | - | - | - | 7.6 | - | 0.5 | - | - |
| guard-rail | - | - | - | - | - | - | 0.0 | - | - | 4.3 | - | 1.2 | 0.5 |
| static | 2.6 | 2.3 | 0.3 | 0.1 | 0.7 | 1.5 | 0.1 | 0.1 | - | - | - | - | - |
| traffic light | 0.1 | 0.2 | 0.1 | 0.1 | 0.1 | - | 0.8 | 0.5 | 0.3 | 0.3 | 0.3 | - | - |
| dynamic | 0.3 | 1.6 | 1.5 | 0.2 | 0.7 | 0.0 | 0.1 | 0.1 | 0.1 | 0.3 | 0.1 | 0.1 | 0.1 |
| water | - | - | - | - | - | - | 0.4 | - | 0.0 | - | - | - | 0.6 |
| terrain | 1.4 | 0.8 | - | - | - | - | - | - | 0.9 | 4.8 | 1.1 | 9.6 | 23.8 |

Total number of points: 60 M for Paris and 700 M for CARLA.
Appendix A.2. Instances
Appendix B. Images of the Dataset
References
1. Silberman, N.; Hoiem, D.; Kohli, P.; Fergus, R. Indoor Segmentation and Support Inference from RGBD Images. In Proceedings of Computer Vision—ECCV 2012, Florence, Italy, 7–13 October 2012; Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 746–760.
2. Varney, N.; Asari, V.K.; Graehling, Q. DALES: A Large-Scale Aerial LiDAR Data Set for Semantic Segmentation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020; pp. 717–726.
3. Li, X.; Li, C.; Tong, Z.; Lim, A.; Yuan, J.; Wu, Y.; Tang, J.; Huang, R. Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 238–246.
4. Hu, Q.; Yang, B.; Khalid, S.; Xiao, W.; Trigoni, N.; Markham, A. Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 19–25 June 2021; pp. 4977–4987.
5. Behley, J.; Garbade, M.; Milioto, A.; Quenzel, J.; Behnke, S.; Stachniss, C.; Gall, J. SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 9296–9306.
6. Tan, W.; Qin, N.; Ma, L.; Li, Y.; Du, J.; Cai, G.; Yang, K.; Li, J. Toronto-3D: A Large-Scale Mobile LiDAR Dataset for Semantic Segmentation of Urban Roadways. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Seattle, WA, USA, 14–19 June 2020.
7. Hackel, T.; Savinov, N.; Ladicky, L.; Wegner, J.D.; Schindler, K.; Pollefeys, M. SEMANTIC3D.NET: A New Large-Scale Point Cloud Classification Benchmark. arXiv 2017, arXiv:1704.03847.
8. Xiao, J.; Owens, A.; Torralba, A. SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels. In Proceedings of the 2013 IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, 1–8 December 2013; pp. 1625–1632.
9. McCormac, J.; Handa, A.; Leutenegger, S.; Davison, A.J. SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation? In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2697–2706.
10. Armeni, I.; Sener, O.; Zamir, A.R.; Jiang, H.; Brilakis, I.; Fischer, M.; Savarese, S. 3D Semantic Parsing of Large-Scale Indoor Spaces. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1534–1543.
11. Dai, A.; Chang, A.X.; Savva, M.; Halber, M.; Funkhouser, T.; Nießner, M. ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2432–2443.
12. Chang, A.; Dai, A.; Funkhouser, T.; Halber, M.; Nießner, M.; Savva, M.; Song, S.; Zeng, A.; Zhang, Y. Matterport3D: Learning from RGB-D Data in Indoor Environments. In Proceedings of the 2017 International Conference on 3D Vision (3DV), Qingdao, China, 10–12 October 2017; pp. 667–676.
13. Hurl, B.; Czarnecki, K.; Waslander, S. Precise Synthetic Image and LiDAR (PreSIL) Dataset for Autonomous Vehicle Perception. In Proceedings of the 2019 IEEE Intelligent Vehicles Symposium (IV), Paris, France, 9–12 June 2019; pp. 2522–2529.
14. Caesar, H.; Bankiti, V.; Lang, A.H.; Vora, S.; Liong, V.E.; Xu, Q.; Krishnan, A.; Pan, Y.; Baldan, G.; Beijbom, O. nuScenes: A Multimodal Dataset for Autonomous Driving. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 11618–11628.
15. Geyer, J.; Kassahun, Y.; Mahmudi, M.; Ricou, X.; Durgesh, R.; Chung, A.S.; Hauswald, L.; Pham, V.H.; Mühlegg, M.; Dorn, S.; et al. A2D2: Audi Autonomous Driving Dataset. arXiv 2020, arXiv:2004.06320.
16. Pan, Y.; Gao, B.; Mei, J.; Geng, S.; Li, C.; Zhao, H. SemanticPOSS: A Point Cloud Dataset with Large Quantity of Dynamic Instances. In Proceedings of the 2020 IEEE Intelligent Vehicles Symposium (IV), Las Vegas, NV, USA, 23 June 2020; pp. 687–693.
17. Xiao, A.; Huang, J.; Guan, D.; Zhan, F.; Lu, S. SynLiDAR: Learning From Synthetic LiDAR Sequential Point Cloud for Semantic Segmentation. arXiv 2021, arXiv:2107.05399.
18. Deschaud, J.E. KITTI-CARLA: A KITTI-like Dataset Generated by CARLA Simulator. arXiv 2021, arXiv:2109.00892.
19. Munoz, D.; Bagnell, J.A.; Vandapel, N.; Hebert, M. Contextual Classification with Functional Max-Margin Markov Networks. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 975–982.
20. Serna, A.; Marcotegui, B.; Goulette, F.; Deschaud, J.E. Paris-rue-Madame Database: A 3D Mobile Laser Scanner Dataset for Benchmarking Urban Detection, Segmentation and Classification Methods. In Proceedings of the 4th International Conference on Pattern Recognition, Applications and Methods (ICPRAM 2014), Loire Valley, France, 6–8 March 2014.
21. Vallet, B.; Brédif, M.; Serna, A.; Marcotegui, B.; Paparoditis, N. TerraMobilita/iQmulus Urban Point Cloud Analysis Benchmark. Comput. Graph. 2015, 49, 126–133.
22. Roynard, X.; Deschaud, J.E.; Goulette, F. Paris-Lille-3D: A Large and High-Quality Ground-Truth Urban Point Cloud Dataset for Automatic Segmentation and Classification. Int. J. Robot. Res. 2018, 37, 545–557.
23. Griffiths, D.; Boehm, J. SynthCity: A Large Scale Synthetic Point Cloud. arXiv 2019, arXiv:1907.04758.
24. Zhu, J.; Gehrung, J.; Huang, R.; Borgmann, B.; Sun, Z.; Hoegner, L.; Hebel, M.; Xu, Y.; Stilla, U. TUM-MLS-2016: An Annotated Mobile LiDAR Dataset of the TUM City Campus for Semantic Point Cloud Interpretation in Urban Areas. Remote Sens. 2020, 12, 1875.
25. Geiger, A.; Lenz, P.; Urtasun, R. Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012.
26. Deschaud, J.E. IMLS-SLAM: Scan-to-Model Matching Based on 3D Data. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 2480–2485.
27. Dosovitskiy, A.; Ros, G.; Codevilla, F.; Lopez, A.; Koltun, V. CARLA: An Open Urban Driving Simulator. In Proceedings of the 1st Annual Conference on Robot Learning, Mountain View, CA, USA, 13–15 November 2017; pp. 1–16.
28. Bello, S.A.; Yu, S.; Wang, C.; Adam, J.M.; Li, J. Review: Deep Learning on 3D Point Clouds. Remote Sens. 2020, 12, 1729.
29. Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 5105–5114.
30. Thomas, H.; Qi, C.R.; Deschaud, J.E.; Marcotegui, B.; Goulette, F.; Guibas, L.J. KPConv: Flexible and Deformable Convolution for Point Clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 6411–6420.
31. Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep Learning for 3D Point Clouds: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 4338–4364.
32. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
33. Duque-Arias, D.; Velasco-Forero, S.; Deschaud, J.E.; Goulette, F.; Serna, A.; Decencière, E.; Marcotegui, B. On Power Jaccard Losses for Semantic Segmentation. In Proceedings of the VISAPP 2021: 16th International Conference on Computer Vision Theory and Applications, Vienna, Austria, 8–10 March 2021.
34. Chaton, T.; Chaulet, N.; Horache, S.; Landrieu, L. Torch-Points3D: A Modular Multi-Task Framework for Reproducible Deep Learning on 3D Point Clouds. In Proceedings of the 2020 International Conference on 3D Vision (3DV), Fukuoka, Japan, 25–28 November 2020; pp. 1–10.
35. Kirillov, A.; He, K.; Girshick, R.; Rother, C.; Dollár, P. Panoptic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 9404–9413.
36. Serna, A.; Marcotegui, B. Detection, Segmentation and Classification of 3D Urban Objects Using Mathematical Morphology and Supervised Learning. ISPRS J. Photogramm. Remote Sens. 2014, 93, 243–255.
37. Gomes, L.; Regina Pereira Bellon, O.; Silva, L. 3D Reconstruction Methods for Digital Preservation of Cultural Heritage: A Survey. Pattern Recognit. Lett. 2014, 50, 3–14.
38. Xu, Y.; Zhu, X.; Shi, J.; Zhang, G.; Bao, H.; Li, H. Depth Completion From Sparse LiDAR Data With Depth-Normal Constraints. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019.
39. Guo, X.; Xiao, J.; Wang, Y. A Survey on Algorithms of Hole Filling in 3D Surface Reconstruction. Vis. Comput. 2018, 34, 93–103.
40. Roldao, L.; de Charette, R.; Verroust-Blondet, A. 3D Semantic Scene Completion: A Survey. arXiv 2021, arXiv:2103.07466.
41. Dai, A.; Siddiqui, Y.; Thies, J.; Valentin, J.; Nießner, M. SPSG: Self-Supervised Photometric Scene Generation From RGB-D Scans. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 21–24 June 2021; pp. 1747–1756.
42. Hoppe, H.; DeRose, T.; Duchamp, T.; McDonald, J.; Stuetzle, W. Surface Reconstruction from Unorganized Points. SIGGRAPH Comput. Graph. 1992, 26, 71–78.
43. Dai, A.; Diller, C.; Nießner, M. SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 846–855.
44. Curless, B.; Levoy, M. A Volumetric Method for Building Complex Models from Range Images. In Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, New Orleans, LA, USA, 4–9 August 1996; Association for Computing Machinery: New York, NY, USA, 1996; pp. 303–312.
45. Lorensen, W.E.; Cline, H.E. Marching Cubes: A High Resolution 3D Surface Construction Algorithm. In Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques, New York, NY, USA, July 1987; Association for Computing Machinery: New York, NY, USA, 1987; pp. 163–169.
Overview of related 3D point cloud datasets. SS = semantic segmentation (number of classes in parentheses); IS = instance segmentation; SC = scene completion.

| Scene | Type | Dataset (Year) | World | # Points | RGB | SS | IS | SC |
|---|---|---|---|---|---|---|---|---|
| Indoor | Mapping | SUN3D [8] (2013) | Real | 8 M | Yes | 🗸 (11) | 🗸 | |
| | | SceneNet [9] (2015) | Synthetic | - | Yes | 🗸 (11) | 🗸 | 🗸 |
| | | S3DIS [10] (2016) | Real | 696 M | Yes | 🗸 (13) | 🗸 | 🗸 |
| | | ScanNet [11] (2017) | Real | 5581 M | Yes | 🗸 (11) | 🗸 | 🗸 |
| | | Matterport3D [12] (2017) | Real | 24 M | Yes | 🗸 (11) | 🗸 | 🗸 |
| Outdoor | Perception | PreSIL [13] (2019) | Synthetic | 3135 M | Yes | 🗸 (12) | 🗸 | |
| | | SemanticKITTI [5] (2019) | Real | 4549 M | No | 🗸 (25) | 🗸 | 🗸 |
| | | nuScenes-Lidarseg [14] (2019) | Real | 1400 M | Yes | 🗸 (32) | 🗸 | |
| | | A2D2 [15] (2020) | Real | 1238 M | Yes | 🗸 (38) | 🗸 | |
| | | SemanticPOSS [16] (2020) | Real | 216 M | No | 🗸 (14) | 🗸 | |
| | | SynLiDAR [17] (2021) | Synthetic | 19,482 M | No | 🗸 (32) | | |
| | | KITTI-CARLA [18] (2021) | Synthetic | 4500 M | Yes | 🗸 (23) | 🗸 | |
| Outdoor | Mapping | Oakland [19] (2009) | Real | 2 M | No | 🗸 (5) | | |
| | | Paris-rue-Madame [20] (2014) | Real | 20 M | No | 🗸 (17) | 🗸 | |
| | | iQmulus [21] (2015) | Real | 12 M | No | 🗸 (8) | 🗸 | 🗸 |
| | | Semantic3D [7] (2017) | Real | 4009 M | Yes | 🗸 (8) | | |
| | | Paris-Lille-3D [22] (2018) | Real | 143 M | No | 🗸 (9) | 🗸 | |
| | | SynthCity [23] (2019) | Synthetic | 368 M | Yes | 🗸 (9) | | |
| | | Toronto-3D [6] (2020) | Real | 78 M | Yes | 🗸 (8) | | |
| | | TUM-MLS-2016 [24] (2020) | Real | 41 M | No | 🗸 (8) | | |
| | | Paris-CARLA-3D (2021) | Synthetic + Real | 700 + 60 M | Yes | 🗸 (23) | 🗸 | 🗸 |
mIoU (%) of the baseline models on the four test sets; Overall is the mean over the sets.

| Model | $S_0$ (Paris) | $S_3$ (Paris) | $T_1$ (CARLA) | $T_7$ (CARLA) | Overall mIoU |
|---|---|---|---|---|---|
| PointNet++ [29] | 13.9 | 25.8 | 4.0 | 12.0 | 13.9 |
| KPConv [30] | 45.2 | 62.9 | 16.7 | 25.3 | 37.5 |
Influence of color: mIoU (%) of KPConv with and without RGB input on the four test sets.

| Model | $S_0$ (Paris) | $S_3$ (Paris) | $T_1$ (CARLA) | $T_7$ (CARLA) | Overall mIoU |
|---|---|---|---|---|---|
| KPConv w/o color | 39.4 | 41.5 | 35.3 | 17.0 | 33.3 |
| KPConv with color | 45.2 | 62.9 | 16.7 | 25.3 | 37.5 |
Transfer learning scenarios (pre-training on CARLA): mIoU (%) on the Paris test sets.

| Transfer Learning Scenario | $S_0$ | $S_3$ | Overall mIoU |
|---|---|---|---|
| No fine-tuning | 20.6 | 17.7 | 19.2 |
| Freeze except last layer | 24.1 | 31.0 | 27.6 |
| Freeze feature extractor | 29.0 | 41.3 | 35.2 |
| No frozen parameters | 42.8 | 50.0 | 46.4 |
| No transfer | 45.2 | 62.9 | 51.7 |
| Test Set | # Instances | SM | PQ | mIoU |
|---|---|---|---|---|
| $S_0$ (Vehicles) | 10 | 90.0 | 70.9 | 81.6 |
| $S_3$ (Vehicles) | 86 | 32.6 | 40.5 | 28.0 |
| $T_1$ (Vehicles) | 41 | 17.1 | 20.4 | 14.2 |
| $T_7$ (Vehicles) | 27 | 74.1 | 72.6 | 61.2 |
| $T_1$ (Pedestrians) | 49 | 18.4 | 17.0 | 13.9 |
| $T_7$ (Pedestrians) | 3 | 100.0 | 9.0 | 66.0 |
| Mean | 216 | 55.3 | 38.4 | 44.2 |
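As a reminder, PQ is the panoptic quality metric defined by Kirillov et al. [35]: predicted and ground-truth segments are matched as true positives when their IoU exceeds 0.5, and

$$\mathrm{PQ} = \frac{\sum_{(p,g)\in \mathit{TP}} \mathrm{IoU}(p,g)}{|\mathit{TP}| + \tfrac{1}{2}|\mathit{FP}| + \tfrac{1}{2}|\mathit{FN}|}.$$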
| Test Set | | | | |
|---|---|---|---|---|
| $S_0$ and $S_3$ (Paris) | 16.6 cm | 10.7 cm | 0.40 | 85.3% |
| $T_1$ and $T_7$ (CARLA) | - | - | 0.49 | 80.3% |
| Test Set: $S_0$ and $S_3$ (Paris Data) | | | | |
|---|---|---|---|---|
| Trained only on Paris | 16.6 cm | 10.7 cm | 0.40 | 85.3% |
| Trained only on CARLA | 16.6 cm | 8.0 cm | 0.48 | 84.0% |
| Pre-trained on CARLA, fine-tuned on Paris | 16.6 cm | 7.5 cm | 0.35 | 88.7% |