High-Performance Terrain Rendering Using Hardware Tessellation

Egor Yusov, Maxim Shevtsov
Intel Corporation
30 Turgeneva street, 603024, Nizhny Novgorod, Russia
egor.a.yusov@intel.com, maxim.y.shevtsov@intel.com
ABSTRACT
In this paper, we present a new terrain rendering approach in which adaptive triangulation is performed entirely on
the GPU via the tessellation unit available on DX11-class graphics hardware. The proposed approach avoids
encoding the triangulation topology, thus significantly reducing the CPU burden. It also minimizes the data
transfer overhead between host and GPU memory, which further improves rendering performance. During
preprocessing, we construct a multiresolution terrain height map representation that is encoded by a robust
compression technique enabling direct error control. The technique is efficiently accelerated by the GPU and
allows a trade-off between speed and compression performance. At run time, an adaptive triangulation is
constructed in two stages: a coarse one and a fine-grain one. At the first stage, the rendering algorithm selects the
coarsest-level patches that satisfy the given error threshold. At the second stage, each patch is subdivided into
smaller blocks, which are then tessellated on the GPU in a way that guarantees seamless triangulation.
Keywords
Terrain rendering, DX11, GPU, adaptive tessellation, compression, level of detail.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Journal of WSCG, ISSN 1213-6972

1. INTRODUCTION
Despite the rapid advances in graphics hardware, real-time visualization of large-scale terrain with high geometric fidelity is still an active research area. The primary reason is that the size and resolution of digital terrain models grow at a significantly higher rate than the graphics hardware can manage. Even a modest height map can easily exceed the available memory of today's highest-end graphics platforms. So it is still important to dynamically control the triangulation complexity and reduce the height map size to fit the hardware limitations and meet real-time constraints.

To effectively render large terrains, a number of dynamic multiresolution approaches as well as data compression techniques have been developed in recent years. These algorithms typically adapt the terrain tessellation using local surface roughness criteria together with the view parameters. This allows a dramatic reduction of the model complexity without significant loss of visual accuracy. A brief overview of different terrain rendering approaches is given in the following section. In the previous methods, the adaptive triangulation was usually constructed by the CPU and then transferred to the GPU for rendering. New capabilities of DX11-class graphics hardware enable a new approach, in which the adaptive terrain tessellation is built entirely on the GPU. This reduces the memory storage requirements together with the CPU load. It also reduces the amount of data to be transferred from the main memory to the GPU, which again results in higher rendering performance.

2. RELATED WORK
Many research papers about adaptive view-dependent triangulation construction methods have been published in recent years. Refer to the survey by R. Pajarola and E. Gobbetti [PG07].

Early approaches construct triangulated irregular networks (TINs). Exploiting progressive meshes for terrain simplification [Hop98] is one specific example. Though TIN-based methods do minimize the number of triangles to be rendered for a given error bound, they are too demanding in terms of computation and storage. More regular triangulations such as bintree hierarchies [LKR+96, DWS+97] or restricted quad trees [Paj98] are faster and easier to implement at the price of a slightly more redundant triangulation.

Recent approaches are based on techniques that fully exploit the power of modern graphics hardware. The CABTT algorithm [Lev02] by J. Levenberg as well as the BDAM [CGG+03a] and P-BDAM [CGG+03b] methods by Cignoni et al. exploit bintree hierarchies of pre-computed triangulations or batches instead of individual triangles. The geometry clipmaps approach [LH04] renders the terrain as a set of nested regular grids centered about the viewer, allowing efficient GPU utilization. The method exploits a regular grid pyramid data structure in conjunction with the lossy image compression technique [Mal00] to dramatically reduce the storage requirements. However, the algorithm completely ignores local surface features of the terrain and provides no guarantees for the error bound, which becomes especially apparent on high-variation terrains.

Next, the C-BDAM method, an extension of the BDAM and P-BDAM algorithms, was presented by Gobbetti et al. in [GMC+06]. The method exploits a wavelet-based two-stage near-lossless compression technique to efficiently encode the height map data. In C-BDAM, uniform batch triangulations are used which do not adapt to local surface features. Regular triangulations typically generate significantly more triangles and unreasonably increase the GPU load.

The terrain rendering method presented by Schneider and Westermann [SW06] partitions the terrain into square tiles and builds for each tile a discrete set of LODs using a nested mesh hierarchy. Following this approach, Dick et al. proposed a method for encoding tile triangulations that enables efficient GPU-based decoding [DSW09].

All these methods either completely ignore local terrain surface features (like [LC03, LH04, GMC+06]) for the sake of efficient GPU utilization, or pre-compute the triangulations off-line and then just load them during rendering [CGG+03a, CGG+03b]. For the case of compressed data, the GPU can also be used for geometry decompression [SW06, DSW09].

To the best of our knowledge, none of the previous methods takes advantage of the tessellation unit exposed by the latest DX11-class graphics hardware for precise yet adaptive (view-dependent) terrain tessellation.

3. CONTRIBUTION
The main contribution is a novel terrain rendering approach, which combines the efficiency of chunk-based terrain rendering with the accuracy of fine-grain triangulation construction methods. In contrast to the previous approaches, our adaptive view-dependent triangulation is constructed entirely on the GPU using hardware-supported tessellation. This offloads computations from the CPU while also reducing expensive CPU-GPU data transfers. We also propose a fast and simple GPU-accelerated compression technique for progressively encoding the multiresolution hierarchy that enables direct control of the reconstruction precision.

Algorithm Overview
To achieve real-time rendering and meet the hardware limitations, we exploit the LOD technique. To create various levels of detail, during the preprocessing, a multiresolution hierarchy is constructed by recursively downsampling the initial data and subdividing it into overlapping patches. In order to reduce the memory requirements, the resulting hierarchy is then encoded using the simple and efficient compression algorithm described in section 4.

Constructing the adaptive terrain model to be rendered is a two-stage process. The first stage is the coarse per-patch LOD selection: the rendering algorithm selects the coarsest-level patches that tolerate the given screen-space error. They are cached in GPU memory and, due to frame-to-frame coherence, are re-used for a number of successive frames. At the second stage, a fine-grain LOD selection is performed: each patch is seamlessly triangulated using hardware. For this purpose, each patch is subdivided into equal-sized smaller blocks that are independently triangulated by the GPU-supported tessellation unit, as described in section 5. Experimental results are given in section 6. Section 7 concludes the paper.

4. BUILDING COMPRESSED MULTIRESOLUTION TERRAIN REPRESENTATION

Patch Quad Tree
The core structure of the proposed multiresolution model is a quad tree of square blocks (hereinafter referred to as patches). This structure is commonly used in real-time terrain rendering systems [Ulr00, DSW09].

The patch quad tree is constructed at the preprocessing stage. At the first step, a series of coarser height maps is built. Each height map is the downsampled version of the previous one (fig. 1). At the next step, the patch quad tree itself is constructed by subdividing each level into $(2^n+1) \times (2^n+1)$ square blocks (65×65, 129×129, 257×257 etc.), refer to fig. 2.
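The two preprocessing steps described for the patch quad tree can be sketched as follows. This is only an illustration under stated assumptions, not the authors' preprocessing code: the height map is assumed to be square with $2^k+1$ samples per side, downsampling is assumed to keep every second sample (the actual filter is not given in this text), and adjacent patches are assumed to share one border row and column. The function names `build_pyramid` and `split_into_patches` are hypothetical.

```python
def build_pyramid(height_map, patch_n):
    """Return height map levels ordered from coarse (level 0, root) to fine."""
    patch_size = 2 ** patch_n + 1
    levels = [height_map]
    while len(levels[0]) > patch_size:
        fine = levels[0]
        # Assumed downsampling: keep every second row and every second column.
        coarse = [row[::2] for row in fine[::2]]
        levels.insert(0, coarse)
    return levels


def split_into_patches(level, patch_n):
    """Cut one level into (2^n+1) x (2^n+1) patches that share border samples."""
    step = 2 ** patch_n
    count = (len(level) - 1) // step
    patches = {}
    for m in range(count):
        for n in range(count):
            rows = level[m * step:(m + 1) * step + 1]
            patches[(m, n)] = [row[n * step:(n + 1) * step + 1] for row in rows]
    return patches
```

With these assumptions, a 17×17 height map and 5×5 patches ($n = 2$) give three levels holding 1, 4 and 16 patches, so every patch has four children on the next finer level.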

$$I = \{(i,j) : \hat{h}^{(l)}_{i,j} \in \hat{H}_C^{(l)} \wedge (i,j) \notin R\}$$

To refine samples from $R$, we exploit the following observation: the refined sample $\hat{h}^{(l)}_{2i,2j}$ (from $\hat{H}_C^{(l)}$) corresponding to the sample $\hat{h}^{(l-1)}_{i,j}$ (from $\hat{H}_P^{(l-1)}$) can only take one of the following 3 values (see fig. 3):

$$\hat{h}^{(l)}_{2i,2j} \in \{\hat{h}^{(l-1)}_{i,j} - 2\delta_l,\ \hat{h}^{(l-1)}_{i,j},\ \hat{h}^{(l-1)}_{i,j} + 2\delta_l\}.$$

Figure 3. Quantizing two successive levels.

This also means that if $\hat{h}^{(l-1)}_{i,j}$ is encoded by the quantized value $q^{(l-1)}_{i,j}$, then the corresponding $q^{(l)}_{2i,2j}$ can only take one of the following 3 values:

$$q^{(l)}_{2i,2j} \in \{2q^{(l-1)}_{i,j} - 1,\ 2q^{(l-1)}_{i,j},\ 2q^{(l-1)}_{i,j} + 1\}$$

Since $q^{(l-1)}_{i,j}$ is known, encoding $q^{(l)}_{2i,2j}$ requires only 3 symbols: $-1$, $0$ or $+1$. These symbols are encoded using adaptive arithmetic coding [WNC87].

At the second step, we encode the remaining samples located at positions from $I$ in $\hat{H}_C^{(l)}$ (dotted circles in fig. 4). This is done by predicting the sample's value from the refined samples and by encoding the prediction error.

Figure 4. Refined samples (R) and interpolated samples (I) of the child patches' joined height map $\hat{H}_C^{(l)}$.

For the sake of GPU acceleration, we exploit the bilinear predictor $P_R(\hat{h}^{(l)}_{i,j})$ that calculates the predicted value of $\hat{h}^{(l)}_{i,j}$ as a weighted sum of 4 refined samples located at the neighboring positions in $R$. We then calculate the prediction error as follows:

$$d^{(l)}_{i,j} = Q_l\big(P_R(\hat{h}^{(l)}_{i,j})\big) - q^{(l)}_{i,j}, \quad (i,j) \in I$$

Magnitudes and signs of the resulting prediction errors $d^{(l)}_{i,j}$ are then separately encoded using adaptive arithmetic coding.

As already discussed, the symbols used during the described compression process are encoded with the technique described in [WNC87]. We exploit an adaptive approach that learns the statistical properties of the input symbol stream on the fly. This is implemented as a histogram which counts the corresponding symbol frequencies (see [WNC87] for details). Note that simple context modeling can improve the compression performance with a minimal increase in algorithmic complexity.

During the preprocessing, the whole hierarchy is recursively traversed starting from the root (level 0) and the proposed encoding process is repeated for each patch.

The proposed compression scheme enables direct control of the reconstruction precision in the $L_\infty$ error metric: it assures that the maximum reconstruction error of a terrain block at level $l$ of the hierarchy is no more than $\delta_l$. For comparison, the compression method [Mal00] used in geometry clipmaps [LH04] does not provide a guaranteed error bound in the $L_\infty$ metric. C-BDAM [GMC+06] exploits a sophisticated two-stage compression scheme to assure the given error tolerance. This provides higher compression ratios but is more computationally expensive than the presented scheme. Moreover, as we will show in the next section, our technique can be efficiently accelerated using the GPU.

Calculating Guaranteed Patch Error Bound
During the quad tree construction, each patch in the hierarchy is assigned a world space approximation error. It conservatively estimates the maximum geometric deviation of the patch's reconstructed height map from the underlying original full-detail height map. This value is required at run time to estimate the screen-space error and to construct the patch-based adaptive model, which approximates the terrain with the specified screen-space error.

Let's denote the patch located at level $l$ of the quad tree at the $(m,n)$ position by $P^{(l)}_{m,n}$ and its upper error bound by $Err(P^{(l)}_{m,n})$. To calculate $Err(P^{(l)}_{m,n})$, we first calculate the approximation error $Err_{Appr}(P^{(l)}_{m,n})$, which is the upper bound of the maximum distance from the patch's precise height map to the samples of the underlying full-detail (level $l_0$) height map. It is recursively calculated using the same method as used in ROAM [DWS+97] to calculate the nested wedgie thickness:

$$Err_{Appr}(P^{(l_0)}_{m,n}) = 0$$
$$Err_{Appr}(P^{(l)}_{m,n}) = Err_{Int}(P^{(l)}_{m,n}) + \max_{s,t=\pm 1}\{Err_{Appr}(P^{(l+1)}_{2m+s,2n+t})\}, \quad l = l_0-1, \ldots, 0$$

where $Err_{Int}(P^{(l)}_{m,n})$ is the maximum geometric deviation of the linearly interpolated patch's height map from its children's height maps. A two-dimensional illustration for determining $Err_{Int}(P^{(l)}_{m,n})$ is given in fig. 5.

Figure 5. Patch's height map interpolation error $Err_{Int}(P^{(l)}_{m,n})$, measured between the child patches' (level $l$) height map samples and the parent patch's (level $l-1$) height map samples.

Since for the patch $P^{(l)}_{m,n}$ the reconstructed height map deviates from the exact height map by at most $\delta_l$, the final patch's upper error bound is given by:

$$Err(P^{(l)}_{m,n}) = Err_{Appr}(P^{(l)}_{m,n}) + \delta_l$$

5. CONSTRUCTING VIEW-DEPENDENT ADAPTIVE MODEL
The proposed level-of-detail selection process consists of two stages. The first stage is the coarse LOD selection which is done on a per-patch level: an unbalanced patch quad tree is constructed with the leaf patches satisfying the given error tolerance. At the second stage, the fine-grain LOD selection is performed, at which each patch is precisely triangulated using the hardware tessellation unit.

Coarse Level of Detail Selection
The coarse LOD selection is performed similarly to other quad tree-based terrain rendering methods. For this purpose, an unbalanced patch quad tree is maintained. It defines the block-based adaptive model, which approximates the terrain with the specified screen-space error.

The unbalanced quad tree is cached in GPU memory and is updated according to the results of comparing the patch's screen-space error estimation $Err_{Scr}(P^{(l)}_{m,n})$ with the user-defined error threshold $\varepsilon$.

Since we already have the maximum geometric error for the vertices within a patch, $Err_{Scr}(P^{(l)}_{m,n})$ can be calculated using the standard LOD formula for conservatively determining the maximum screen-space vertex error (see [Ulr00, Lev02]):

$$Err_{Scr}(P^{(l)}_{m,n}) = \gamma \frac{Err(P^{(l)}_{m,n})}{\rho(c, V^{(l)}_{m,n})}$$

where $\gamma = \frac{1}{2}\max(R_h \cdot \cot(\varphi_h/2),\ R_v \cdot \cot(\varphi_v/2))$, $R_h$ and $R_v$ are the horizontal and vertical resolutions of the view port, $\varphi_h$ and $\varphi_v$ are the horizontal and vertical camera fields of view, and $\rho(c, V^{(l)}_{m,n})$ is the distance from the camera position $c$ to the closest point on the patch's bounding box $V^{(l)}_{m,n}$.

Tessellation Blocks
During the fine-grain LOD selection, each patch in the unbalanced patch quad tree is adaptively triangulated using the GPU. For this purpose, each patch is subdivided into small equal-sized blocks that we call tessellation blocks. For instance, a 65×65 patch can be subdivided into a 4×4 grid of 17×17 tessellation blocks or into an 8×8 grid of 9×9 blocks etc. The detail level for each tessellation block is determined independently by the hull shader: the block can be rendered in full resolution (fig. 6, left) or in a resolution reduced by a factor of $2^d$, $d = 1, 2, \ldots$ (fig. 6, center, right).

Figure 6. Triangulations of a 9×9 tessellation block for $d = 0$ ($\Lambda^{(0)}_{r,s} = 0$), $d = 1$ ($\Lambda^{(1)}_{r,s}$) and $d = 2$ ($\Lambda^{(2)}_{r,s}$).

To determine the degree of simplification for each block, we calculate a series of block errors. These errors represent the deviation of the block's simplified triangulation from the patch's height map samples, covered by the block but not included into the simplified triangulation (dotted circles in fig. 6).

Let's denote the error of the tessellation block located at the $(r,s)$ position in the patch, whose triangulation is simplified by a factor of $2^d$, by $\Lambda^{(d)}_{r,s}$. The tessellation block errors $\Lambda^{(d)}_{r,s}$ are computed as follows:

$$\Lambda^{(d)}_{r,s} = \max_{v \notin T^{(d)}_{r,s}} \rho(v, T^{(d)}_{r,s}), \quad d = 1, 2, \ldots$$

where $T^{(d)}_{r,s}$ is the tessellation block $(r,s)$ triangulation simplified by a factor of $2^d$ and $\rho(v, T^{(d)}_{r,s})$ is the vertical distance from the vertex $v$ to the triangulation $T^{(d)}_{r,s}$. The two and four times simplified triangulations, as well as the samples (dotted circles) of the patch's height map that are used to calculate $\Lambda^{(1)}_{r,s}$ and $\Lambda^{(2)}_{r,s}$, are shown in fig. 6 (center and right images correspondingly).

To get the final error bound for the tessellation block, it is necessary to take into account the patch's error bound. This final error bound is hereinafter referred to as $\hat{\Lambda}^{(d)}_{r,s}$ and is calculated as follows:

$$\hat{\Lambda}^{(d)}_{r,s} = \Lambda^{(d)}_{r,s} + Err(P^{(l)}_{m,n})$$

In our particular implementation, we calculate errors for 4 simplification levels such that the tessellation block triangulation can be simplified by a maximum factor of $(2^4)^2 = 256$. This enables us to store the tessellation block errors as a 4-component vector.

Fine-Grain Level of Detail Selection
When the patch is to be rendered, it is necessary to estimate how much its tessellation blocks' triangulations can be simplified without introducing unacceptable error. This is done using the current frame's world-view-projection matrix. Each tessellation block is processed independently and for each block's edge, a tessellation factor is determined. To eliminate cracks, tessellation factors for shared edges of neighboring blocks must be computed in the same way. The tessellation factors are then passed to the tessellation stage of the graphics pipeline, which generates the final triangulation.

Tessellation factors for all edges are determined identically. Let's consider some edge and denote its center by $e_c$. Let's define the edge errors $\Lambda^{(d)}_{e_c}$ as the maximum error of the tessellation blocks sharing this edge. For example, the errors of the left edge of block $(r,s)$ are calculated as follows:

$$\Lambda^{(d)}_{e_c} = \max(\hat{\Lambda}^{(d)}_{r-1,s},\ \hat{\Lambda}^{(d)}_{r,s}), \quad d = 1, 2, \ldots$$

Next let's define a series of segments in world space specified by their end points $e_c^{(d),-}$ and $e_c^{(d),+}$ determined as follows:

$$e_c^{(d),-} = e_c - \Lambda^{(d)}_{e_c}/2 \cdot e_z$$
$$e_c^{(d),+} = e_c + \Lambda^{(d)}_{e_c}/2 \cdot e_z$$

where $e_z$ is the world space z (up) axis unit vector. Thus $e_c^{(d),-}$ and $e_c^{(d),+}$ define a segment of length $\Lambda^{(d)}_{e_c}$ directed along the z axis such that the edge centre $e_c$ is located in the segment's middle.

If we project this segment onto the viewing plane using the world-view-projection matrix, we will get the edge screen space error estimation (fig. 7) given that the neighboring tessellation blocks are simplified by a factor of $2^d$. We can then select the maximum simplification level $d$ for the edge that does not lead to unacceptable error as follows:

$$d = \arg\max_d\ \mathrm{proj}(e_c^{(d),-}, e_c^{(d),+}) < \varepsilon$$

Figure 7. Calculating edge screen space error: the segment from $e_c^{(d),-}$ to $e_c^{(d),+}$ through $e_c$ and its projection $\mathrm{proj}(e_c^{(d),-}, e_c^{(d),+})$.

The same selection process is done for each edge. The tessellation factor for the block interior is then defined as the minimum of its edge tessellation factors. This method assures that tessellation factors for shared edges of neighboring blocks are computed equally and guarantees seamless patch triangulation. An example of a patch triangulation is given in fig. 8.

Figure 8. Seamlessly triangulated patch's tessellation blocks.

To hide gaps between neighboring patches, we exploit "vertical skirts" around the perimeter of each patch as proposed by T. Ulrich [Ulr00]. The top of the skirt matches the patch's edge and the skirt height is selected such that it hides all possible cracks.

Note that in contrast to all previous terrain simplification methods, all operations required to triangulate the patch are performed entirely on the GPU and do not involve any CPU computations.
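The per-edge selection described in this section can be sketched on the CPU as follows. This is a minimal illustration, not the authors' hull shader: it assumes a row-vector convention (clip = v · M) for the world-view-projection matrix, a point in front of the camera (positive clip-space w), and that the projected segment length is measured in pixels. The names `project_to_pixels` and `select_edge_level` are hypothetical.

```python
import math


def project_to_pixels(point, wvp, viewport):
    """Project a world-space point to pixel coordinates (assumes clip w > 0)."""
    v = (point[0], point[1], point[2], 1.0)
    clip = [sum(v[k] * wvp[k][c] for k in range(4)) for c in range(4)]
    ndc_x, ndc_y = clip[0] / clip[3], clip[1] / clip[3]
    width, height = viewport
    return (0.5 * (ndc_x + 1.0) * width, 0.5 * (1.0 - ndc_y) * height)


def select_edge_level(edge_center, edge_errors, wvp, viewport, threshold):
    """Return the largest level d whose projected error segment is below threshold.

    edge_errors[d - 1] is the edge error for simplification level d = 1, 2, ...
    The result is 0 (full resolution) when no level is acceptable.
    """
    cx, cy, cz = edge_center
    best = 0
    for d, err in enumerate(edge_errors, start=1):
        # Segment of length err along the world z (up) axis, centered at the edge center.
        lo = project_to_pixels((cx, cy, cz - 0.5 * err), wvp, viewport)
        hi = project_to_pixels((cx, cy, cz + 0.5 * err), wvp, viewport)
        if math.hypot(hi[0] - lo[0], hi[1] - lo[1]) < threshold:
            best = d
    return best
```

Because the result depends only on the edge center and the shared edge errors, both blocks adjacent to an edge obtain the same level, which is the property the text relies on to avoid cracks.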

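The coarse per-patch test of section 5 combines the recursive patch error bound with the screen-space error formula; the sketch below illustrates both. It is not the authors' implementation: the dictionary-based node layout, the function names, and the use of the Euclidean distance to an axis-aligned bounding box are assumptions made for illustration.

```python
import math


def approximation_error(node):
    """Err_Appr: zero at the finest level, else Err_Int plus the worst child."""
    if not node["children"]:
        return 0.0
    return node["err_int"] + max(approximation_error(c) for c in node["children"])


def patch_error(node, delta):
    """Err(P) = Err_Appr(P) + delta_l, where delta maps a level to its tolerance."""
    return approximation_error(node) + delta(node["level"])


def gamma(res_h, res_v, fov_h, fov_v):
    """Scale factor from view port resolution and fields of view (radians)."""
    return 0.5 * max(res_h / math.tan(fov_h / 2), res_v / math.tan(fov_v / 2))


def distance_to_box(camera, box_min, box_max):
    """Distance from the camera to the closest point of an axis-aligned box."""
    sq = 0.0
    for c, lo, hi in zip(camera, box_min, box_max):
        d = max(lo - c, 0.0, c - hi)
        sq += d * d
    return math.sqrt(sq)


def screen_space_error(err, camera, box_min, box_max, g):
    """Err_Scr = gamma * Err / rho; infinite when the camera is inside the box."""
    rho = distance_to_box(camera, box_min, box_max)
    return math.inf if rho == 0.0 else g * err / rho
```

A patch would then be refined into its children whenever `screen_space_error(...)` exceeds the user-defined threshold, and kept as a leaf of the unbalanced quad tree otherwise.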
Implementation Details
The presented algorithm was implemented in C++ in an MS Visual Studio .NET environment.

In our system, the CPU decodes the bit stream in parallel to the rendering thread and all other tasks are done on the GPU. To facilitate GPU-accelerated decompression, we support several temporary textures. The first one is a $(2^n+1) \times (2^n+1)$ 8-bit texture $T_R$ that is populated with the parent patch's refinement labels ($-1$, $0$ or $+1$) from $R$. The second one is a $(2 \cdot 2^n+1) \times (2 \cdot 2^n+1)$ 8-bit texture $T_I$ storing prediction errors $d^{(l)}_{i,j}$ for samples from $I$.

The GPU part of the decompression is done in two steps:
• At the first step, the parent patch height map is refined by rendering to the temporary texture $T_P$.
• At the second step, the child patch height maps are rendered.

During the second step, $T_P$ is filtered using hardware-supported bilinear filtering, interpolation errors are loaded from $T_I$ and added to the interpolated samples from $T_P$.

The patch's height and normal maps as well as the tessellation block errors are stored as texture arrays. A list of unused subresources is supported. When a patch is created, we find an unused subresource in the list and release it when the patch is destroyed.

Tessellation block errors as well as normal maps are computed on the GPU when the patch is created by rendering to the appropriate texture array element.

Exploiting texture arrays enables rendering the whole terrain using a single draw call with instancing. The per-instance data contains the patch location, the level in the hierarchy and the texture index. The patch rendering hull shader calculates the tessellation factor for each edge and passes the data to the tessellator. The tessellator generates topology and domain coordinates that are passed to the domain shader. The domain shader calculates the world space position for each vertex and fetches the height map value from the appropriate texture array element. The resulting triangles then pass in a conventional way via the rasterizer.

6. EXPERIMENTAL RESULTS AND DISCUSSION
To test our system, we used a 16385×16385 height map of the Puget Sound sampled at 10 meter spacing, which is used as a common benchmark and is available at [PS]. The raw data size (16 bps) is 512 MB. The compression and run-time experiments were done on a workstation with the following hardware configuration: a single Intel Core i7 @ 2.67 GHz; 6.0 GB RAM; NVidia GTX480.

The data set was compressed to 46.8 MB (11:1) with 1 meter error tolerance. For comparison, the C-BDAM method, which exploits a much more sophisticated approach, compressed the same data set to 19.2 MB (26:1) [GMC+06]. Note that in contrast to C-BDAM, our method enables hardware-based decompression. Note also that in practice we compress an extended $(2^n+3) \times (2^n+3)$ height map for each patch for the sake of seamless normal map generation. As opposed to compressing conventional diffuse textures, height maps usually require less space. That is why we believe that the provided 11x compression rate is a good justification for the quality of our algorithm.

During our run-time experiments, the Puget Sound data set was rendered with 2 pixels screen space error tolerance at 1920×1200 resolution (fig. 10). We compared the rendering performance of our method with our implementation of the chunked LOD approach [Ulr00]. As fig. 10 shows, the data set was rendered at 607 fps on average with a minimum of 465 fps with the proposed method. When the same terrain was rendered with our method but without exploiting the instancing and texture arrays described previously, the frame rate was almost 2x lower. As fig. 10 shows, our method is more than 3.5x faster than the chunked LOD approach.

Figure 10. Rendering performance (FPS) at 1920×1200 resolution for three configurations: Chunked LOD, H/W Tessellation, and H/W tess + tex array & instancing.

Our experiments showed that the optimal tessellation block size that provides the best performance is 8×8. Another interesting statistic for this rendering experiment is that approximately 1024 patches of 128×128 samples were kept in GPU memory (only ~200 of them were rendered per frame on average). Each height map was stored with 16 bit precision. All patches demanded just 32 MB of GPU memory. We also exploited normal maps compressed using BC5, which required an additional 16 MB of data. Diffuse maps are not taken into account because special algorithms that are beyond the scope of this work are designed to compress them. However, the same quad tree-based subdivision scheme can be integrated with our method to handle diffuse textures.

Since our method enables using a small screen space error threshold (2 pixels or less), we did not observe any popping artifacts during our experiments even though there is no morph between successive LODs in our current implementation.

In all our experiments, the whole compressed hierarchy easily fitted into the main memory. However, our approach can be easily extended for out-of-core rendering of arbitrarily large terrains. In this case, the whole compressed multiresolution representation would be kept in a repository on the disk or a network server, as for example in geometry clipmaps. This would allow on-demand extraction from the repository rather than accessing the data directly in memory.

7. CONCLUSION AND FUTURE WORK
We presented a new real-time large-scale terrain rendering technique, which is based on the exploitation of the hardware-supported tessellation available in modern GPUs. Since triangulation is performed entirely on the GPU, there is no need to encode the triangulation topology. Moreover, the triangulation is precisely adapted to each camera position. To reduce the data storage requirements, we use a robust compression technique that enables direct control over the reconstruction precision and is also accelerated by the GPU.

We consider support for dynamic terrain modifications as a future work topic. Since the triangulation topology is constructed entirely on the GPU, it would require only updating the tessellation block errors, and the triangulation will be updated accordingly. Another possible direction is to extend the presented algorithm for rendering arbitrary highly detailed 2D-parameterized surfaces.

8. REFERENCES
[CGG+03a] Cignoni, P., Ganovelli, F., Gobbetti, E., Marton, F., Ponchio, F., and Scopigno, R. BDAM – batched dynamic adaptive meshes for high performance terrain visualization. Computer Graphics Forum, Vol. 22, No. 3, pp. 505–514, 2003.
[CGG+03b] Cignoni, P., Ganovelli, F., Gobbetti, E., Marton, F., Ponchio, F., and Scopigno, R. Planet-sized batched dynamic adaptive meshes (P-BDAM). In Proc. IEEE Visualization, pp. 147–154, 2003.
[DSW09] Dick, C., Schneider, J., and Westermann, R. Efficient Geometry Compression for GPU-based Decoding in Realtime Terrain Rendering. Computer Graphics Forum, Vol. 28, No. 1, pp. 67–83, 2009.
[DWS+97] Duchaineau, M., Wolinsky, M., Sigeti, D.E., Miller, M.C., Aldrich, C., and Mineev-Weinstein, M.B. ROAMing terrain: Real-time optimally adapting meshes. In Proc. IEEE Visualization, pp. 81–88, 1997.
[GMC+06] Gobbetti, E., Marton, F., Cignoni, P., Di Benedetto, M., and Ganovelli, F. C-BDAM – compressed batched dynamic adaptive meshes for terrain rendering. Computer Graphics Forum, Vol. 25, No. 3, pp. 333–342, 2006.
[Hop98] Hoppe, H. Smooth view-dependent level-of-detail control and its application to terrain rendering. In Proc. IEEE Visualization, pp. 35–42, 1998.
[LC03] Larsen, B.D., and Christensen, N.J. Real-time Terrain Rendering using Smooth Hardware Optimized Level of Detail. Journal of WSCG, Vol. 11, No. 2, pp. 282–289, 2003.
[Lev02] Levenberg, J. Fast view-dependent level-of-detail rendering using cached geometry. In Proc. IEEE Visualization, pp. 259–265, 2002.
[LKR+96] Lindstrom, P., Koller, D., Ribarsky, W., Hodges, L.F., Faust, N., and Turner, G.A. Real-time, continuous level of detail rendering of height fields. In Proc. ACM SIGGRAPH, pp. 109–118, 1996.
[LH04] Losasso, F., and Hoppe, H. Geometry clipmaps: Terrain rendering using nested regular grids. In Proc. ACM SIGGRAPH, pp. 769–776, 2004.
[Mal00] Malvar, H. Fast Progressive Image Coding without Wavelets. In Proceedings of Data Compression Conference (DCC '00), Snowbird, UT, USA, pp. 243–252, 28-30 March 2000.
[Paj98] Pajarola, R. Large scale terrain visualization using the restricted quadtree triangulation. In Proc. IEEE Visualization, pp. 19–26, 1998.
[PG07] Pajarola, R., and Gobbetti, E. Survey on semi-regular multiresolution models for interactive terrain rendering. The Visual Computer, Vol. 23, No. 8, pp. 583–605, 2007.
[PS] Puget Sound elevation data set is available at http://www.cc.gatech.edu/projects/large_models/ps.html
[SW06] Schneider, J., and Westermann, R. GPU-Friendly High-Quality Terrain Rendering. Journal of WSCG, Vol. 14, pp. 49–56, 2006.
[Ulr00] Ulrich, T. Rendering massive terrains using chunked level of detail. ACM SIGGraph Course "Super-size it! Scaling up to Massive Virtual Worlds", 2000.
[WNC87] Witten, I.H., Neal, R.M., and Cleary, J.G. Arithmetic coding for data compression. Comm. ACM, Vol. 30, No. 6, pp. 520–540, June 1987.
