High-Performance Terrain Rendering Using Hardware Tessellation
ABSTRACT
In this paper, we present a new terrain rendering approach in which adaptive triangulation is performed entirely on the
GPU via the tessellation unit available on DX11-class graphics hardware. The proposed approach avoids
encoding the triangulation topology, which significantly reduces the CPU burden. It also minimizes the data
transfer overhead between host and GPU memory, which further improves rendering performance. During
preprocessing, we construct a multiresolution terrain height map representation that is encoded by a robust
compression technique enabling direct error control. The technique is efficiently accelerated by the GPU and
allows a trade-off between speed and compression performance. At run time, an adaptive triangulation is
constructed in two stages: a coarse one and a fine-grain one. At the first stage, the rendering algorithm selects the
coarsest-level patches that satisfy the given error threshold. At the second stage, each patch is subdivided into
smaller blocks, which are then tessellated on the GPU in a way that guarantees seamless triangulation.
Keywords
Terrain rendering, DX11, GPU, adaptive tessellation, compression, level of detail.
1. INTRODUCTION
Despite rapid advances in graphics hardware, real-time visualization of large-scale terrain with high geometric fidelity is still an active research area. The primary reason is that the size and resolution of digital terrain models grow at a significantly higher rate than graphics hardware can manage. Even a modest height map can easily exceed the available memory of today's highest-end graphics platforms. It is therefore still important to dynamically control the triangulation complexity and to reduce the height map size to fit the hardware limitations and meet real-time constraints.
To effectively render large terrains, a number of dynamic multiresolution approaches as well as data compression techniques have been developed in recent years. These algorithms typically adapt the terrain tessellation using local surface roughness criteria together with the view parameters. This allows a dramatic reduction of the model complexity without significant loss of visual accuracy. A brief overview of different terrain rendering approaches is given in the following section. In previous methods, the adaptive triangulation was usually constructed by the CPU and then transferred to the GPU for rendering. New capabilities of DX11-class graphics hardware enable a new approach in which the adaptive terrain tessellation is built entirely on the GPU. This reduces the memory storage requirements together with the CPU load. It also reduces the amount of data to be transferred from main memory to the GPU, which again results in higher rendering performance.

2. RELATED WORK
Many research papers on adaptive view-dependent triangulation construction methods have been published in recent years; see the survey by R. Pajarola and E. Gobbetti [PG07].
Early approaches construct triangulated irregular networks (TINs). Exploiting progressive meshes for terrain simplification [Hop98] is one specific example. Though TIN-based methods do minimize the number of triangles to be rendered for a given error bound, they are too computationally and storage
To refine samples from R, we exploit the following observation: the refined sample $\hat{h}^{(l)}_{2i,2j}$ (from $\hat{H}_C^{(l)}$) corresponding to the sample $\hat{h}^{(l-1)}_{i,j}$ (from $\hat{H}_P^{(l-1)}$) can only take one of the following 3 values (see fig. 3):

Figure 3. Quantizing two successive levels.

This also means that if $\hat{h}^{(l-1)}_{i,j}$ is encoded by the quantized value $q^{(l-1)}_{i,j}$, then the corresponding $q^{(l)}_{2i,2j}$ can only take one of the following 3 values:

Magnitudes and signs of the resulting prediction errors $d^{(l)}_{i,j}$ are then separately encoded using adaptive arithmetic coding.
As already discussed, the symbols being used are encoded with an adaptive approach that learns the statistical properties of the input symbol stream on the fly. This is implemented as a histogram that counts the corresponding symbol frequencies (see [WNC87] for details). Note that simple context modeling can improve the compression performance with a minimal increase in algorithmic complexity.
During preprocessing, the whole hierarchy is recursively traversed starting from the root (level 0), and the proposed encoding process is repeated for each patch.
The proposed compression scheme enables direct control of the reconstruction precision in the $L_\infty$ error metric: it assures that the maximum reconstruction error of a terrain block at level $l$ of the hierarchy is no more than $\delta_l$. For comparison, the compression method [Mal00] used in geometry clipmaps [LH04] does not provide a guaranteed error bound in the $L_\infty$ metric.
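The guaranteed maximum-error property can be illustrated with a minimal sketch. It assumes a plain uniform quantizer whose reconstruction levels are multiples of twice the tolerance; the names `quantize`, `dequantize`, `max_reconstruction_error` and the parameter `delta` are hypothetical and are not taken from the paper, whose actual encoder also involves prediction and arithmetic coding.

```python
import math


def quantize(h: float, delta: float) -> int:
    """Return the index of the multiple of 2*delta nearest to h (assumed scheme)."""
    return math.floor(h / (2.0 * delta) + 0.5)


def dequantize(q: int, delta: float) -> float:
    """Reconstruct the height from its quantization index."""
    return 2.0 * delta * q


def max_reconstruction_error(heights, delta: float) -> float:
    """Largest absolute difference between a height and its reconstruction."""
    return max(abs(h - dequantize(quantize(h, delta), delta)) for h in heights)
```

Because each height is snapped to the nearest level and neighboring levels are `2 * delta` apart, no sample can move by more than `delta`, which is the kind of bound a maximum-error metric requires.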
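$$\mathrm{Err}_{\mathrm{Appr}}(P^{(l)}_{m,n}) = \mathrm{Err}_{\mathrm{Int}}(P^{(l)}_{m,n}) + \max_{s,t=\pm 1}\{\mathrm{Err}_{\mathrm{Appr}}(P^{(l+1)}_{2m+s,2n+t})\}, \quad l = l_0 - 1, \ldots, 0,$$
where $\mathrm{Err}_{\mathrm{Int}}(P^{(l)}_{m,n})$ is the maximum geometric

$$\mathrm{Err}_{\mathrm{Scr}}(P^{(l)}_{m,n}) = \gamma \, \frac{\mathrm{Err}(P^{(l)}_{m,n})}{\rho(c, V^{(l)}_{m,n})},$$
where $\gamma = \tfrac{1}{2} \max(R_h \operatorname{ctg}(\varphi_h / 2), R_v \operatorname{ctg}(\varphi_v / 2))$, $R_h$ and $R_v$ are the horizontal and vertical resolutions of the viewport, $\varphi_h$ and $\varphi_v$ are the horizontal and vertical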
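The screen-space error estimate can be sketched numerically as follows. This is only an illustration under stated assumptions: the denominator is taken to be the distance from the camera to the patch's bounding volume, angles are in radians, and the function names `screen_space_scale` and `screen_space_error` are hypothetical.

```python
import math


def screen_space_scale(res_h: float, res_v: float, fov_h: float, fov_v: float) -> float:
    """0.5 * max(Rh * cot(fov_h / 2), Rv * cot(fov_v / 2)), with angles in radians."""
    return 0.5 * max(res_h / math.tan(fov_h / 2.0), res_v / math.tan(fov_v / 2.0))


def screen_space_error(world_error: float, distance: float, scale: float) -> float:
    """Project a world-space error bound to pixels at the given camera distance."""
    return scale * world_error / distance


# Example: a 1920x1080 viewport with a 90 degree horizontal field of view.
scale_example = screen_space_scale(
    1920, 1080, math.radians(90.0), 2.0 * math.atan(9.0 / 16.0)
)
pixels_example = screen_space_error(1.0, 500.0, scale_example)
```

The estimate falls off inversely with distance, so distant patches tolerate a larger world-space error for the same pixel threshold.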
5. CONSTRUCTING VIEW-DEPENDENT ADAPTIVE MODEL
The proposed level-of-detail selection process consists of two stages. The first stage is the coarse LOD selection, which is done at the per-patch level: an unbalanced patch quad tree is constructed with the leaf patches satisfying the given error tolerance. At the second stage, the fine-grain LOD selection is performed, in which each patch is precisely triangulated using the hardware tessellation unit.

Coarse Level of Detail Selection
The coarse LOD selection is performed similarly to other quad tree-based terrain rendering methods. For this purpose, an unbalanced patch quad tree is maintained. It defines the block-based adaptive model, which approximates the terrain with the specified screen-space error.

Figure 6. Triangulations of a 9×9 tessellation block (d=0, d=1, d=2).

To determine the degree of simplification for each block, we calculate a series of block errors. These errors represent the deviation of the block's simplified triangulation from the patch's height map samples that are covered by the block but not included in the simplified triangulation (dotted circles in fig. 6).
Let us denote by $\Lambda^{(d)}_{r,s}$ the error of the tessellation block located at the $(r, s)$ position in the patch, whose triangulation is simplified by a factor of $2^d$.
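The coarse LOD selection stage described in this section can be sketched as a top-down quad tree traversal. This is a minimal illustration, not the paper's implementation: the `Patch` class, its fields, and `select_patches` are hypothetical, and the screen-space error is assumed to be a precomputed world-space bound scaled by a projection factor and divided by the camera distance.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Patch:
    level: int            # depth in the quad tree (0 = root)
    world_error: float    # world-space error bound of the patch (assumed precomputed)
    distance: float       # distance from the camera to the patch (assumed given)
    children: List["Patch"] = field(default_factory=list)


def select_patches(patch: Patch, scale: float, threshold: float,
                   out: Optional[List[Patch]] = None) -> List[Patch]:
    """Collect the coarsest patches whose screen-space error is within the threshold."""
    if out is None:
        out = []
    screen_error = scale * patch.world_error / max(patch.distance, 1e-6)
    if screen_error <= threshold or not patch.children:
        # Coarse enough, or no finer data exists: render this patch.
        out.append(patch)
    else:
        # Too coarse: refine into the child patches.
        for child in patch.children:
            select_patches(child, scale, threshold, out)
    return out
```

The selected patches form the leaves of an unbalanced quad tree: regions near the camera are refined further than distant ones.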
well as these samples (dotted circles) of the patch's height map that are used to calculate $\Lambda^{(1)}_{r,s}$ and $\Lambda^{(2)}_{r,s}$ are shown in fig. 6 (center and right images, respectively).
To get the final error bound for the tessellation block, it is necessary to take into account the patch's error bound. This final error bound is hereinafter referred to as $\hat{\Lambda}^{(d)}_{r,s}$ and is calculated as follows:

$$e_c^{(d),\pm} = e_c \pm \frac{\Lambda^{(d)}_{e_c}}{2} \, e_z,$$
where $e_z$ is the world space z (up) axis unit vector.

using the world-view-projection matrix, we will get the edge screen space error estimation (fig. 7), given that the neighboring tessellation blocks are simplified by a factor of $2^d$. We can then select the maximum simplification level $d$ for the edge that does not lead to unacceptable error as follows:
$$d = \arg\max_{d} \; \operatorname{proj}(e_c^{(d),+}, e_c^{(d),-}) < \varepsilon$$
Note that, in contrast to all previous terrain simplification methods, all operations required to triangulate the patch are performed entirely on the GPU and do not involve any CPU computations.
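The per-edge selection of the simplification level can be sketched on the CPU for clarity, although in the paper this work runs on the GPU. The sketch makes several assumptions that are not from the source: a simplified pinhole camera at the origin looking along +x with z up stands in for the world-view-projection matrix, the per-level error bounds are non-decreasing, and all function names are hypothetical.

```python
import math


def project(point, scale: float):
    """Hypothetical pinhole projection: camera at the origin, looking along +x, z up."""
    x, y, z = point
    return (scale * y / x, scale * z / x)


def edge_screen_error(edge_center, world_error: float, scale: float) -> float:
    """Offset the edge center by half the error along z in both directions,
    project both points, and measure their distance in pixels."""
    cx, cy, cz = edge_center
    upper = project((cx, cy, cz + 0.5 * world_error), scale)
    lower = project((cx, cy, cz - 0.5 * world_error), scale)
    return math.dist(upper, lower)


def select_edge_level(edge_center, errors_by_level, scale: float, threshold: float) -> int:
    """Return the largest level d whose projected error stays below the threshold.
    errors_by_level[d] is the world-space error bound at level d (assumed non-decreasing)."""
    best = 0
    for d, err in enumerate(errors_by_level):
        if edge_screen_error(edge_center, err, scale) >= threshold:
            break
        best = d
    return best
```

Because the level depends only on the shared edge and not on either adjacent block, two neighboring blocks evaluating the same edge would arrive at the same level, which is what a seamless triangulation needs.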
rendering using a single draw call with instancing. The
[Chart legend: Chunked LOD; H/W Tessellation; H/W tess + tex array & instancing]