Abstract
The generation of large-scale urban layouts has garnered substantial interest across various disciplines. Prior methods rely either on procedural generation, which requires manually coded rules, or on deep learning, which demands abundant training data; neither considers the context-sensitive nature of urban layout generation. Our approach addresses this gap by leveraging a canonical graph representation for the entire city, which facilitates scalability and captures the multi-layer semantics inherent in urban layouts. We introduce a novel graph-based masked autoencoder (GMAE) for city-scale urban layout generation. The method encodes attributed buildings, city blocks, communities, and cities into a unified graph structure, enabling self-supervised masked training of the graph autoencoder. Additionally, we employ scheduled iterative sampling for 2.5D layout generation, prioritizing the generation of important city blocks and buildings. Our approach achieves good realism, semantic consistency, and correctness across the heterogeneous urban styles of 330 US cities. Code and datasets are released at https://github.com/Arking1995/COHO.
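To make the masked graph-autoencoder idea above concrete, the following is a minimal, illustrative sketch: it masks a fraction of node attributes in a small graph and trains an encoder-decoder to reconstruct them from graph context. It uses plain PyTorch with an assumed node-feature layout, a simplified mean-aggregation GCN layer, and toy data; it is not the released COHO implementation (see the repository above for that).

```python
# Minimal sketch of masked self-supervised training on a city graph.
# The feature dimension, the simple GCN layer, the ~30% mask ratio, and
# the toy data below are assumptions made purely for illustration.
import torch
import torch.nn as nn


class SimpleGCNLayer(nn.Module):
    """One message-passing step: average neighbor features via a
    degree-normalized adjacency matrix, then project linearly."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        return self.lin((adj @ x) / deg)


class MaskedGraphAutoencoder(nn.Module):
    """Replace masked node features with a learned [MASK] token, encode the
    graph, and decode per-node attributes back from graph context."""

    def __init__(self, feat_dim=8, hidden=64):
        super().__init__()
        self.mask_token = nn.Parameter(torch.zeros(feat_dim))
        self.encoder = SimpleGCNLayer(feat_dim, hidden)
        self.decoder = SimpleGCNLayer(hidden, feat_dim)

    def forward(self, x, adj, mask):
        x_in = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
        z = torch.relu(self.encoder(x_in, adj))
        return self.decoder(z, adj)


# Toy data: 10 "building" nodes with 8 attributes each (e.g. position, size,
# height), a random symmetric adjacency with self-loops, ~30% of nodes masked.
x = torch.rand(10, 8)
adj = (torch.rand(10, 10) < 0.3).float()
adj = ((adj + adj.t() + torch.eye(10)) > 0).float()
mask = torch.rand(10) < 0.3
mask[0] = True  # ensure at least one node is masked

model = MaskedGraphAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

opt.zero_grad()
recon = model(x, adj, mask)
loss = ((recon - x)[mask] ** 2).mean()  # reconstruct only the masked nodes
loss.backward()
opt.step()
```

At generation time, the paper's scheduled iterative sampling repeatedly fills in masked nodes over several rounds, prioritizing important blocks and buildings; the single reconstruction step above only illustrates the training objective.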
Acknowledgements
This project was funded in part by NSF Grant #2107096 and NSF Grant #1835739.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
He, L., Aliaga, D. (2025). COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15071. Springer, Cham. https://doi.org/10.1007/978-3-031-72624-8_1
DOI: https://doi.org/10.1007/978-3-031-72624-8_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72623-1
Online ISBN: 978-3-031-72624-8
eBook Packages: Computer Science (R0)