Article

PolyRoom: Room-Aware Transformer for Floorplan Reconstruction

Authors: Yuzhou Liu, Lingjie Zhu, Xiaodong Ma, Hanqiao Ye, Xiang Gao, Xianwei Zheng, Shuhan Shen

Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part L

Pages 322–339

https://doi.org/10.1007/978-3-031-72973-7_19

Published: 01 November 2024

Abstract

Reconstructing geometric and topological structures from raw unstructured data has long been an important topic in indoor mapping research. In this paper, we aim to reconstruct floorplans in a vectorized representation from point clouds. Despite significant advances in recent years, current methods still face several challenges, such as missing corners or edges, inaccurate corner positions or angles, self-intersecting or overlapping polygons, and potentially implausible topology. To tackle these challenges, we present PolyRoom, a room-aware Transformer that leverages a uniform sampling representation, room-aware query initialization, and room-aware self-attention for floorplan reconstruction. Specifically, we adopt a uniform sampling floorplan representation to enable dense supervision during training and effective use of angle information. Additionally, we propose a room-aware query initialization scheme to prevent non-polygonal sequences, and we introduce room-aware self-attention to improve memory efficiency and model performance. Experimental results on two widely used datasets demonstrate that PolyRoom surpasses current state-of-the-art methods both quantitatively and qualitatively. Our code is available at: https://github.com/3dv-casia/PolyRoom/.
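
To make the idea of a uniform sampling representation concrete, the following minimal sketch resamples a room polygon into a fixed number of points spaced evenly along its boundary. It is only an illustration under stated assumptions, not the authors' implementation: the function name `sample_polygon_uniform`, the choice of the first vertex as the starting point, and the equal arc-length spacing are all hypothetical choices made for this example. The official code is in the repository linked above.

```python
import math


def sample_polygon_uniform(vertices, num_points):
    """Return num_points points spaced evenly (by arc length) along a closed polygon.

    Hypothetical helper for illustration only; sampling starts at vertices[0]
    and follows the given vertex order.
    """
    vertices = list(vertices)
    if len(vertices) < 3 or num_points < 1:
        raise ValueError("need at least 3 vertices and 1 sample point")

    # Edges of the closed polygon, including the closing edge back to the start.
    edges = list(zip(vertices, vertices[1:] + vertices[:1]))
    lengths = [math.dist(a, b) for a, b in edges]
    perimeter = sum(lengths)
    if perimeter == 0:
        raise ValueError("degenerate polygon with zero perimeter")

    step = perimeter / num_points
    samples = []
    edge_idx = 0
    edge_start = 0.0  # arc length at which the current edge begins

    for k in range(num_points):
        target = k * step
        # Advance to the edge that contains the target arc length.
        while edge_idx < len(edges) - 1 and target >= edge_start + lengths[edge_idx]:
            edge_start += lengths[edge_idx]
            edge_idx += 1
        (ax, ay), (bx, by) = edges[edge_idx]
        length = lengths[edge_idx]
        t = (target - edge_start) / length if length > 0 else 0.0
        # Linear interpolation along the current edge.
        samples.append((ax + t * (bx - ax), ay + t * (by - ay)))
    return samples


if __name__ == "__main__":
    square = [(0, 0), (4, 0), (4, 4), (0, 4)]
    for point in sample_polygon_uniform(square, 8):
        print(point)
```

Because every room is described by the same number of points regardless of how many corners it has, each sampled point can in principle receive its own supervision signal, which is one plausible reading of the "dense supervision" mentioned in the abstract.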

Published In

Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part L

Sep 2024, 568 pages

ISBN: 978-3-031-72972-0

DOI: 10.1007/978-3-031-72973-7

Editors: Aleš Leonardis (University of Birmingham, Birmingham, UK), Elisa Ricci (University of Trento, Trento, Italy), Stefan Roth (Technical University of Darmstadt, Darmstadt, Germany), Olga Russakovsky (Princeton University, Princeton, NJ, USA), Torsten Sattler (Czech Technical University in Prague, Prague, Czech Republic), Gül Varol (École des Ponts ParisTech, Marne-la-Vallée, France)

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Publisher: Springer-Verlag, Berlin, Heidelberg
