1. Introduction
For online geographical research platforms, the visualization of raster data is an essential function [1,2]. Raster data exhibit simple data structures and facilitate easy input/output, but suffer from low information density, high redundancy, and large volumes [3]. Remote sensing images are one of the most significant sources of raster data [4]. With the advancement of satellite technology and improvements in sensor accuracy, the volumes of remote sensing images are expanding geometrically, posing significant challenges in storage, calculation, and visualization [5,6,7]. Thanks to developments in cloud storage and distributed computing, there are increasingly mature and reliable solutions for offline remote sensing big data processing [8,9,10]. However, online visualization of large-scale remote sensing data as a raster layer map service, particularly real-time calculation and rendering, has been less explored [11].
Currently, online raster layer map services using remote sensing data as sources primarily utilize GIS servers or static tiles, each with its own advantages and disadvantages for data sharing and visualization.
GIS servers like GeoServer or self-developed servers typically interface directly with raster data sources in formats like GeoTIFF, publishing them as map services. These services support various Open Geospatial Consortium (OGC) standards, including the Web Map Service (WMS) and Web Map Tile Service (WMTS) [12]. The specific interaction involves the client requesting data from the GIS server according to the service standard, with parameters such as geographic bounding boxes, tile coordinates, and rendering methods. The server then reads, processes, and computes the data, returning a resulting image that satisfies the user-defined rendering [13]. However, handling large datasets and numerous requests can strain servers, leading to slow responses or failures. Therefore, a GIS server is more suitable for smaller, less computationally intensive raster layers [14]. Resource augmentation and caching strategies can alleviate network request stress [15].
Static tiles represent another manifestation of cached tiles, forgoing the server's real-time calculation and rendering of requested regional data [16]. Instead, through data preprocessing, large volumes of remote sensing data from the same dataset are sliced into small tiles (in PNG or JPG format), organized by zoom level (z), column (x), and row (y), and structured into a pyramid shape [17,18]. The primary role of the server is simply to distribute these tiles [19]. Although this reduces the flexibility of map services compared to real-time rendering, it decreases server load and enhances scalability [20]. The noticeable improvement in access efficiency explains why today's mainstream online basemaps are predominantly served in this manner [21]. Hosting the directory of static tiles with reverse proxy servers such as Nginx allows the file set to form the raster layer map service, where requests can be made directly to the corresponding images using the {z}/{x}/{y} path. Trading space for time characterizes the essence of static tiles, which also highlights their inherent drawbacks. Image files require additional storage space and grow exponentially with each increase in zoom level [22]. The large number of small files poses significant challenges in management and storage. Additionally, static tiles do not support user-defined rendering because the tiling process causes remote sensing images to lose their band information, typically rendering them in RGB. The color palettes for raster data derived from remote sensing are determined by the data processor. As a result, static tiles are primarily used for a wide range of remote sensing image basemaps. They offer distinct advantages for handling data that require time-intensive dynamic mosaicking, are frequently re-accessed, and do not necessitate customized rendering.
However, for online maps featuring wide coverage and high-resolution raster data, there is still no unified and satisfactory solution that can provide efficient data access while enabling custom calculation and rendering. Essentially, this challenge arises from the inherent conflict between the time-consuming calculation required by large data volumes and the need for rapid response from online map services.
The advent of Cloud-Optimized GeoTIFF (COG) brings a possible solution to this problem. COG remains a type of GeoTIFF, but it supports partial reading of data by standardizing metadata and organization. With the explosive growth of remote sensing images, more data are being migrated to cloud storage. However, the random order of common GeoTIFF tags and its striped data organization do not support partial reading by area of interest. Consequently, when accessing data on the cloud, the entire file must be downloaded locally and then loaded, which is both inefficient and costly, as cloud storage typically incurs charges based on data traffic. To address these issues, COG has been developed as a new data standard that is compatible with traditional GIS software. It enhances partial access to data and is particularly well suited for cloud storage services such as object storage, thereby facilitating efficient workflows on the cloud. The implementation of COG relies on two complementary technologies: specific data organization and HTTP GET Range requests.
In terms of data structure, COG primarily employs “Tiling” and “Overviews”. Tiling in COG involves creating numerous internal tiles within the actual image, as opposed to the simple strips used in regular GeoTIFF. While striped data require reading the entire file to access key parts, tiling allows for quicker access to specific areas by reading only the corresponding tiles of the file. Overviews involve creating multiple downsampled versions of the original image, which reduces size and pixel count, thereby lowering detail content. Typically, a COG contains multiple overviews at different resolutions to match zoom levels, similar to a tile pyramid (Figure 1). Although these overviews increase the overall file size, they facilitate faster data access since renderers can work directly with overviews instead of having to resample the full-resolution data [23].
Another pivotal technology is the HTTP GET Range request. HTTP Version 1.1 introduced a feature called the range request: when requesting data from a server using a GET request, an optional Range field can be added to the request header, bytes={start}-{end}, which indicates a partial request within the specified byte range [24]. However, the server must support this functionality. Most object storage services, such as Amazon S3, currently support range requests, provided the requester knows the target data range. To this end, the COG specification also remedies the disorder of ordinary TIFF tags: it systematically places the tag directory and tag values as metadata in the file header, with the data portion following thereafter [25]. When accessing a COG on the cloud, an initial HTTP Range request retrieves the COG file header information, typically about 16 KB, which contains the metadata [26]. These metadata include the offsets of each overview and each underlying data tile. A subsequent request, based on these offsets, accesses the overview or tile data of the area of interest, thus achieving partial reading.
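As an illustration, this two-step access pattern can be sketched with plain HTTP range requests in Python. The URL, the 16 KB header size, and the tile offsets are illustrative assumptions; a real client parses the TIFF tag directory returned by the first request.

import requests

# A COG on object storage (hypothetical URL); any server honoring Range requests works.
COG_URL = "https://example-bucket.s3.amazonaws.com/scene/B04.tif"

# Step 1: fetch the file header, typically about the first 16 KB of a COG.
header = requests.get(COG_URL, headers={"Range": "bytes=0-16383"})
assert header.status_code == 206  # 206 Partial Content: the range was honored
# header.content holds the TIFF tag directory, which records the byte offsets
# and lengths of every overview and internal tile (parsing omitted here).

# Step 2: request only the tile of interest using offsets parsed from the header.
tile_offset, tile_length = 1_048_576, 65_536  # illustrative values
tile = requests.get(
    COG_URL,
    headers={"Range": f"bytes={tile_offset}-{tile_offset + tile_length - 1}"},
)
tile_bytes = tile.content  # compressed tile data, ready for decoding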
These features enable COG to support efficient data flow, facilitating real-time calculation, rendering, and analysis of raster data online. Furthermore, as COG's benefits become more widely recognized, an increasing number of GIS platforms and open-source projects are starting to support COG, and more remote sensing image products are being released in COG format by default, providing convenience for users [27]. Handling the online loading of a single COG file has become relatively mature [28]. With the support of object storage, clients can directly access and load COGs on the cloud, obtaining band information for custom calculation, rendering, and analysis, which can be considered a new raster layer serving method. However, for map services consisting of a COG set, specifically those with extensive area segmentation and tiling, the current solution continues to adhere to the concept of the GIS server. Its essence is real-time calculation, bolstered by advancements in data reading and computing power [29]. Typically, this approach partially reads COGs based on the zxy coordinate of the user-requested tile, dynamically mosaics them, and returns the resulting image [30].
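As an illustration of this server-side pattern, the open-source rio-tiler library (not used in this paper; named here as one concrete option) performs exactly such partial reads and returns a rendered tile for a requested zxy coordinate. A minimal sketch, with illustrative tile coordinates and a public Sentinel-2 COG address on AWS; mosaicking several COGs is omitted for brevity:

from rio_tiler.io import Reader

# zxy tile requested by the client (illustrative values).
z, x, y = 10, 363, 56

# Read only the COG blocks intersecting the requested tile; rio-tiler issues
# the HTTP Range requests internally.
with Reader("https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/43/U/ER/2023/7/S2B_43UER_20230702_0_L2A/B04.tif") as cog:
    img = cog.tile(x, y, z, tilesize=256)

png_bytes = img.render(img_format="PNG")  # encode the result for the client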
Table 1 summarizes the characteristics of the above-mentioned raster layer serving methods.
With advantages stemming from its characteristics, COG is increasingly participating in map services built with raster data, holding promising prospects. However, there is still room to refine its application. While a GIS server hosting COG benefits from rapid data reading, which speeds up dynamic mosaicking, the computational load remains on the server. For instance, GeoServer uses image processing libraries such as GDAL to concatenate data from multiple COG files into one continuous raster image. This process can lead to performance issues when handling large-scale datasets [31]. Although cloud computing vendors can offer elastic computing resources, these come at non-negligible cost.
Is there a better method to organize source data from a COG set into a large-scale raster layer map service? The preprocessing concepts and pyramid structures used in static tiles may provide valuable insights. In this paper, we propose a novel method named DCPMS, integrating the advantages of COG and pyramid tiles to provide a better technical solution for serving large-scale raster data layers across the open Internet. With DCPMS, tiles in COG format are generated offline at each level of the pyramid. Subsequently, the corresponding COGs are loaded according to the specified window range, and the target image is presented more quickly and accurately through partial reading. This approach enables the autonomous scheduling of large-scale raster data within a single map service while preserving COG band information for customized computation and rendering. Furthermore, the decentralized service architecture shifts the computational load to the client, thereby significantly enhancing the ability to cope with high concurrency. In addition, the pyramid structure is optimized to reduce the time and additional storage space required for data preprocessing.
2. Methods
DCPMS is designed to realize the construction of map services and the optimization of applications for large-scale raster data. The technology of DCPMS comprises three primary components (Figure 2): (1) the conceptual design of the COG-pyramid model; (2) the decentralized service architecture and loading process based on the index file; and (3) the data preprocessing flow and optimization strategy. These are discussed in detail in the following sections.
2.1. COG-Pyramid Model of DCPMS
The organization of COG offers a flexible and convenient approach for loading a single image online. The subsequent issue to be addressed is how to organize the COG set of segmented images into a raster layer map service. The pyramid structure is the simplest and most straightforward method to implement.
The specific design of the pyramid structure can refer to the zxy structure commonly used for static tiles, with enhancements based on it. The design ideas of mainstream tile map service standards such as WMTS and TMS (Tile Map Service) are similar, but they are proposed by different organizations and vary in details, such as the origin position and the increasing direction of y. Taking the zxy organization of WMTS as an example, a tile can be uniquely identified by three elements (z, x, y): z represents the zoom level; as z increases with the enlargement of the map, the number of corresponding tiles increases and the content becomes more detailed. x represents the position of the tile in the horizontal direction (longitude) of the map, increasing from left to right, and y represents the position of the tile in the vertical direction (latitude) of the map, increasing from top to bottom. Usually, x and y range from 0 to 2^z − 1, and the number of tiles in each direction doubles as z increases [32].
In the visualization of online map services, the principle generally followed is to read the corresponding data according to the actual window range, which helps reduce the amount of data transmitted and improves access rates. The pyramid structure is a good embodiment of this principle: when previewing the map, only the lowest-resolution tiles at the top of the pyramid are loaded; as the window zooms, the map focuses on a specific area, and only a few high-resolution tiles that cover the window are loaded, minimizing the data downloaded. In essence, the multiple overviews of COG are equivalent to a pyramid structure. If file size is not a constraint, the COG set can be merged into a single huge COG through the gdal_merge.py script in GDAL (Geospatial Data Abstraction Library) or a custom algorithm. This merged file would contain multiple overviews to accommodate different zoom levels, making it feasible to publish a raster layer map service in this manner. However, a single huge file presents many drawbacks, such as difficulties in maintenance and updates. Once the file becomes corrupted, the entire map service could crash. Additionally, updates to some underlying data might require rewriting the entire file, further complicating maintenance.
Therefore, organizing segmented COGs through a pyramid structure is a feasible solution. First, tiles in the tile pyramid are typically PNG or JPG files, which should be replaced with raster data in COG format to retain the corresponding loading characteristics. Second, the tile pyramid structure cannot be fully replicated because tile resolution is generally 256 × 256 pixels; such tiles are small in size and large in quantity. In contrast, remote sensing images typically have much higher resolutions, often thousands of pixels in both width and height; they contain a wealth of information while occupying a large amount of space, requiring fewer files for storage. A simple improvement is to reduce the number of zoom levels in the COG-pyramid and increase the magnification ratio between levels. Overviews within the COGs can effectively replace the omitted levels, embedding certain levels directly into the COGs. Finally, in terms of data preprocessing, the bottom of the pyramid can directly use the original segmented data in COG format, while the upper levels require downsampling according to the actual pyramid structure. The purpose of downsampling is to enable smooth scaling of the raster layer map service, similar to the pyramid of static tiles. The resulting COG-pyramid model, as shown in Figure 3, leverages the dual characteristics of the pyramid structure and COG. This model enables the segmented COG set to rapidly locate specific images while retaining the benefits of on-demand reading and access to band information. The essence of the COG-pyramid is a condensed version of the tile pyramid: it organizes the entire tile pyramid into many small pyramids packaged into COGs, thus greatly reducing the number of levels and tiles.
2.2. Index File of DCPMS and the Loading Process
The organizational characteristics of COG permit its independent publication in the absence of the GIS server, with the sole reliance on object storage. It is desirable for this capability to be maintained in DCPMS. To facilitate this, an additional index file can be introduced as metadata to describe, manage, and access the COG-pyramid. One straightforward approach involves the generation of an XML or JSON file that records the overall structure of the COG-pyramid and the corresponding COG address for each tile. Subsequently, clients are able to obtain data by means of a request, following the parsing of the index file.
In this study, DCPMS.json is designed as the index file holding the metadata of the raster layer map service. The advantage of JSON is its use of key-value pairs, which provide a concise, clear structure that is easy to read and write and can be parsed directly by the browser. The main content of the JSON file is the COG address corresponding to each tile, which can be an HTTP or S3 protocol URL. This creates a mapping that allows the client to calculate zxy coordinates based on the current window to locate the corresponding COGs. Another benefit of JSON is that it can store additional pyramid structural information, such as the raster layer's bounding box and the number of rows and columns per level, enabling customization of the pyramid structure. The fields in JSON are relatively flexible and can be extended as agreed upon with the client. Below is an example of the DCPMS.json file; the specific definition of its structure is given in Table A1.
DCPMS.json:

{
  "extent": [-180, -90, 180, 90],   // bounding box
  "level": 9,                       // maximum zoom level
  "tileSize": [4096, 4096],         // resolution of each COG
  "structure": {
    "0": [1, 1],                    // number of rows and columns per level [x, y]
    "9": [512, 256]
  },
  "pyramid": {
    "0": {
      "0": {
        "0": "http://localhost:9000/test/s2/0/0/0/B04.tif"
      }
    },
    "9": {
      "363": {
        "56": "https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/43/U/ER/2023/7/S2B_43UER_20230702_0_L2A/B04.tif" *
      }
    }
  }
}

* accessed on 27 January 2024.
The entire JSON file functions as a raster layer map service, stored in object storage and accessible at any time, eliminating the need for GIS server maintenance. This approach achieves decentralization and reduces operational and maintenance pressures. Additionally, the COG addresses within the JSON file support various object storage sources, facilitating easy expansion. Data interactions are illustrated in Figure 4. Moreover, the process of updating the map service is straightforward: all that is required is to modify the corresponding COG address in the JSON file, offering considerable flexibility.
With the COG-pyramid and index file obtained, the client also needs to make corresponding adaptations to present raster layer map services. The process of loading the DCPMS mainly comprises two parts: querying the corresponding COGs based on the window and loading the specific COGs.
Querying the corresponding COGs involves conversion algorithms from parameters such as the window's bounding box and zoom level to zxy coordinates. The bounding box (bbox) represents a rectangular spatial range; taking latitude and longitude as an example, it is composed of an array of four numbers representing, in order, the minimum longitude, minimum latitude, maximum longitude, and maximum latitude. The zoom level indicates the magnification of the current window and requires conversion to z in the COG-pyramid. The specific method involves converting these parameters to the actual spatial resolution, which is the geographical distance represented by each pixel. From all levels of the pyramid, the closest smaller spatial resolution is selected and converted to the corresponding z. A smaller value is chosen to obtain the closest larger z, which makes full use of the overviews. Spatial resolution is calculated as follows:

spatial_resolution = map_width/(tile_count × tile_width)  (1)

The map_width refers to the total width of the map. For example, in the WGS84 geographic coordinate system, the total longitude range of 360 degrees is taken as the map width. The tile_width typically measures 256 pixels for raster layer map services. The tile_count represents the number of tiles across the width of the current level, usually satisfying tile_count = 2^z. For instance, at zoom level 0, there is only one tile of 256 × 256 pixels, and the spatial resolution of this level is 360/(1 × 256) = 1.40625 degrees per pixel. The spatial resolution of each level can be calculated by Equation (1). The client typically provides a function to acquire the spatial resolution of the current window. With the spatial resolution, converting the zoom level of the map window to the corresponding z in the COG-pyramid becomes straightforward.
Next, the number of rows and columns for this level is taken from the index file, and the span of each tile in the x and y directions is calculated based on the pyramid's extent. The maximum and minimum numbers of the tiles intersecting the window are obtained by dividing the window's coordinates by the span. Finally, all tiles are looped through, and the tile coordinates are returned in array form for subsequent COG data requests. The specific algorithm is implemented in Algorithm 1.
Algorithm 1 Conversion of Window Parameters to COG-pyramid Tile Coordinates

Input: window's bounding box (bbox), current zoom level (zoom), COG-pyramid structure (pyramid)

● Obtain the bounding box of the pyramid (extent), the resolution of COG (tileSize), and the maximum number of columns of each level in the x direction (z_i_column) from pyramid. Calculate the spatial resolution of each level (z_i_resolution):
  z_i_resolution = (extent[2] − extent[0])/(z_i_column * tileSize[0])
● Calculate the spatial resolution of the current window from zoom. Select the closest smaller value and convert to the corresponding pyramid level (z).
● Obtain the maximum number of columns in the x direction (column) and the maximum number of rows in the y direction (row) under the z level from pyramid.
● Calculate the span in the x direction (x_tile_range) and in the y direction (y_tile_range) for each tile:
  x_tile_range = (extent[2] − extent[0])/column
  y_tile_range = (extent[3] − extent[1])/row
● Calculate the minimum tile number (x_min) and maximum tile number (x_max) in the x direction and the minimum tile number (y_min) and maximum tile number (y_max) in the y direction of bbox:
  x_min = Math.floor((bbox[0] − extent[0])/x_tile_range)
  x_max = Math.ceil((bbox[2] − extent[0])/x_tile_range) − 1
  y_min = Math.floor((bbox[1] − extent[1])/y_tile_range)
  y_max = Math.ceil((bbox[3] − extent[1])/y_tile_range) − 1
● Traverse the above ranges of x and y through a double loop; all these tiles intersect the window. Store the tile coordinates in the array (tiles):
  for (var x = x_min; x <= x_max; x++) {
    for (var y = y_min; y <= y_max; y++) {
      tiles.push([z, x, y]);
    }
  }
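For reference, below is a compact, self-contained Python transcription of Algorithm 1. This is a sketch: the function name is illustrative, and the window's spatial resolution is taken directly as input, since clients such as OpenLayers expose it for the current view.

import math

def window_to_tiles(bbox, window_resolution, pyramid):
    """Map a window's bounding box and spatial resolution to COG-pyramid
    tile coordinates, following Algorithm 1.

    bbox: (min_lon, min_lat, max_lon, max_lat) of the current window
    window_resolution: map units per pixel of the current window
    pyramid: parsed DCPMS.json with "extent", "tileSize", and "structure"
    """
    extent = pyramid["extent"]
    tile_w = pyramid["tileSize"][0]

    # Spatial resolution of each pyramid level in the x direction (Equation (1)).
    level_res = {
        int(z): (extent[2] - extent[0]) / (cols * tile_w)
        for z, (cols, rows) in pyramid["structure"].items()
    }

    # Pick the level whose resolution is the closest value finer than or equal
    # to the window's, so the built-in overviews are fully exploited.
    finer = [z for z, res in level_res.items() if res <= window_resolution]
    z = min(finer) if finer else max(level_res)  # fall back to the bottom level

    cols, rows = pyramid["structure"][str(z)]
    x_span = (extent[2] - extent[0]) / cols  # tile span in the x direction
    y_span = (extent[3] - extent[1]) / rows  # tile span in the y direction

    x_min = math.floor((bbox[0] - extent[0]) / x_span)
    x_max = math.ceil((bbox[2] - extent[0]) / x_span) - 1
    y_min = math.floor((bbox[1] - extent[1]) / y_span)
    y_max = math.ceil((bbox[3] - extent[1]) / y_span) - 1

    return [(z, x, y) for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]

For example, window_to_tiles([70, 40, 80, 45], 0.01, dcpms) returns the (z, x, y) triplets whose COG addresses are then looked up in DCPMS.json.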
After obtaining all COG tile coordinates within the current window range, specific image loading begins. In the COG-pyramid index file, z, x, and y are used successively as keys to find the corresponding COG address. Two HTTP Range requests are then sent according to the COG loading method previously described, and the images are uniformly presented in the window after processing by the client. The choice between loading an overview or the original data likewise depends on the spatial resolution.
It is worth noting that remote sensing images require accurate geographic location and scale information for geographical analysis and interpretation. Therefore, projection coordinate systems are usually employed, such as the UTM projection of Sentinel-2 images. However, to optimize visualization and loading performance, online map services typically use the Web Mercator projection or the WGS84 geographic coordinate system. This may lead to discrepancies between the coordinate system of the remote sensing images and that used by the map service. Therefore, before loading remote sensing images, it is necessary to use a geospatial coordinate transformation library (e.g., Proj4) to convert the remote sensing images from their original coordinate system to the coordinate system used by the map service. Then, the image pixels are mapped to geographic coordinates by affine transformation to pinpoint the correct position on the map. This allows images with different projections to be seamlessly integrated into online raster layer map services.
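For instance, placing a pixel of a UTM-projected image on a WGS84 map can be sketched as follows with rasterio and pyproj (the Python counterpart of Proj4); the file path and pixel indices are illustrative:

import rasterio
from pyproj import Transformer

# A UTM-projected Sentinel-2 band (illustrative path) and a pixel of interest.
with rasterio.open("B04.tif") as src:
    row, col = 2048, 2048
    x, y = src.transform * (col, row)  # affine transform: pixel -> projected coordinates

    # Reproject from the image CRS (e.g., a UTM zone) to the map CRS (WGS84).
    to_map = Transformer.from_crs(src.crs.to_wkt(), "EPSG:4326", always_xy=True)
    lon, lat = to_map.transform(x, y)

print(lon, lat)  # geographic position used to place the pixel on the map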
2.3. Data Processing Flow of DCPMS
The primary task of building DCPMS is to determine the pyramid structure, which requires customization according to the specific characteristics of the underlying COG data source. While it is possible to directly mosaic and slice the underlying data to fit the tile range, these data are usually voluminous, making the process cumbersome and time-consuming. The COG-pyramid provides a simple mapping relationship that allows the client to locate the corresponding COG according to the window range. The actual loading process does not involve tiling the data strictly to the tile range as static tiles do; instead, the geographical position of each COG is calculated through affine transformation. Consequently, there is no need to strictly align with the tile bounding box, nor is additional processing of the original data necessary.
Before designing the COG-pyramid structure, it is essential to determine a standard pyramid with a scaling factor of 2 between adjacent levels (i.e., the number of tiles in the x and y directions doubles for each increase in z) as a reference pyramid, so that the subsequent COG-pyramid can achieve a comparably smooth scaling effect. The key attribute of the reference pyramid is its maximum level z_max, which can be calculated from the original image's spatial resolution and represents the maximum effective level of the original image tiles. Based on the circumference of the Earth, l meters, and the spatial resolution of the original image data, m meters per pixel, the following equation can be derived from Equation (1):

z_max = log2(l/(256 × m)), rounded up to the nearest integer  (2)

The maximum level z_max of the reference pyramid corresponds to the equivalent level of the COG-pyramid and its embedded overviews, helping determine the levels of the COG-pyramid.
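As a quick check against the values used later in Section 3.2, consider Sentinel-2 imagery with m = 10 m and the Earth's circumference taken as roughly l = 40,075,000 m:

z_max = ceil(log2(40,075,000/(256 × 10))) = ceil(log2(15,654)) = ceil(13.93) = 14

which matches the maximum reference-pyramid level used in the experiments.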
The design of the COG-pyramid structure primarily consists of two aspects: levels and tiles. The first step is to design the pyramid levels, assisted by the z_max of the reference pyramid and the number of overviews n of the original data. Since the original data serve as the bottom of the COG-pyramid, and the resolution ratio between overviews is typically 2, the same as the reference pyramid, the original data can be equated to levels z_max − n to z_max of the reference pyramid. Therefore, the upper levels of the COG-pyramid only need to be equivalent to levels 0 to z_max − n − 1 of the reference pyramid. The number of overviews in the upper COGs can be preset to be consistent with the original data, namely n. The number of levels in the upper COG-pyramid can then be calculated as (z_max − n)/(n + 1), rounded up, since each COG-pyramid level corresponds to n + 1 levels of the reference pyramid; overlap is allowed to ensure complete coverage up to the maximum reference level z_max. The overall design, as depicted in the COG-pyramid model schematic (Figure 3), makes full use of overviews as hidden levels. It is worth noting that the above level design strategy yields the minimum number of levels under ideal conditions, and the specific implementation can be adjusted according to the actual situation.
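For instance, with the Sentinel-2 parameters of Section 3.2 (z_max = 14, n = 4), the ideal minimum is ceil((14 − 4)/(4 + 1)) = 2 upper levels; the experiment actually adopts 3 overlapping upper levels (equivalent to reference levels 0–4, 3–7, and 6–10), an example of the adjustment just described.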
The second part is the design of the number and size of tiles within each COG-pyramid level, which can be divided into two categories based on common remote sensing product data. The first category is satellite remote sensing images, such as Sentinel-2 and Landsat-8, which usually have their own grids and overlaps. A solution for this data source is to directly utilize the columns and rows of the reference pyramid. At the bottom of the COG-pyramid, the z_max − n level of the reference pyramid is selected as the target of image mapping. If some images are difficult to match, the corresponding reference pyramid level can be appropriately increased, for example to z_max − n + 1, to increase the number of tiles and ease subsequent mapping. The number of rows and columns in the upper COG-pyramid is likewise determined by the reference pyramid. At the same time, the resolution of the subsequent upper-resampled COG tiles must be determined. It can be calculated from the reference pyramid tile size of 256 × 256 pixels and the preset number of overviews n as follows:

tile_size = 256 × 2^n  (3)
The second category is remote sensing image-derived product data, which have usually been converted to the common WGS84 geographic coordinate system and segmented by latitude and longitude. For instance, in the WorldCover 2021 dataset provided by the European Space Agency (ESA), each image spans an area of 3° × 3° [33]. For such data, it is feasible to customize the numbers of rows and columns at each pyramid level without strictly adhering to powers of two like the reference pyramid. The pyramid bottom can directly utilize the existing segmented COGs, such as 120 × 60 tiles. To ensure smooth scaling, the upper levels can appropriately adjust the number of rows and columns per level, but the column-to-row ratio must remain consistent with that of the bottom level. Additionally, the resolution of the upper-level resampled COGs can take the same value as the original data. Once the structure is designed, the data extent and the numbers of rows and columns per level are recorded in the index file DCPMS.json for subsequent client parsing and loading.
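To make this concrete, the following Python sketch generates the DCPMS.json skeleton for such a latitude/longitude-segmented dataset. The three-level layout (2 × 1, 16 × 8, 120 × 60) follows the WorldCover experiment in Section 3.3, while the URL template and tile size are illustrative assumptions:

import json

# Bottom level 2 reuses the original 120 x 60 segmentation; the two upper
# levels keep the same 2:1 column-to-row ratio.
structure = {"0": [2, 1], "1": [16, 8], "2": [120, 60]}

dcpms = {
    "extent": [-180, -90, 180, 90],
    "level": 2,                       # maximum zoom level of this COG-pyramid
    "tileSize": [36000, 36000],       # illustrative resolution of each COG
    "structure": structure,
    "pyramid": {z: {} for z in structure},
}

# Map each bottom tile (x, y) to its COG address (hypothetical URL template).
z = "2"
cols, rows = structure[z]
for x in range(cols):
    for y in range(rows):
        dcpms["pyramid"][z].setdefault(str(x), {})[str(y)] = (
            f"http://localhost:9000/worldcover/{z}/{x}/{y}.tif"
        )

with open("DCPMS.json", "w") as f:
    json.dump(dcpms, f, indent=2)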
Hierarchical data processing mirrors the structure determination and is divided into mapping the bottom-level data and resampling the upper-level data, leveraging the original COG-format data to reduce the processing time of big data.
Before processing the underlying data, it is essential to ensure that the original raster data are in COG format; they can be converted with a GDAL script using the parameter -co TILED=YES. If the original data are not COGs, mapping loses its meaning, and it is better to resample them directly into COG format as the bottom of the pyramid. In the mapping of the underlying data, if the product data are segmented by latitude and longitude, an algorithm can directly calculate the corresponding custom pyramid tile coordinates from the latitude and longitude range in the COG file name or metadata. Remote sensing image data are more complex. Taking the construction of DCPMS from Sentinel-2 images as an example, it is first necessary to traverse the grid numbers and retrieve the best-quality image for each grid cell. Screening conditions, such as minimum cloud cover, can be applied by invoking the SpatioTemporal Asset Catalog (STAC) [34] interface. After the best-quality image set is obtained, each xy tile at the bottom of the pyramid is traversed to calculate its degree of intersection with the bounding box of each image in the dataset, that is, the ratio of the overlapping area to the area of the tile's bounding box, ranging from 0 to 1. A threshold such as 0.8 can be set to indicate a rough match, meaning that the image can be loaded when scaled to the tile range. Finally, the tile coordinates (z, x, y) and corresponding COG file paths are written into DCPMS.json to complete the mapping.
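A sketch of this screening and matching step, using the open-source pystac-client and shapely libraries; the STAC endpoint, collection name, and asset key refer to the public Earth Search catalog for the Sentinel-2 COGs on AWS and should be treated as illustrative assumptions:

from pystac_client import Client
from shapely.geometry import box, shape

# A public STAC API indexing the Sentinel-2 COGs on AWS (assumed endpoint).
catalog = Client.open("https://earth-search.aws.element84.com/v1")

def best_image(grid_bbox, max_cloud=10):
    """Return the least-cloudy Sentinel-2 L2A item intersecting a grid cell."""
    search = catalog.search(
        collections=["sentinel-2-l2a"],
        bbox=grid_bbox,
        query={"eo:cloud_cover": {"lt": max_cloud}},
    )
    items = list(search.items())
    return min(items, key=lambda it: it.properties["eo:cloud_cover"]) if items else None

def rough_match(item, tile_bbox, threshold=0.8):
    """True if the image covers at least `threshold` of the tile's bounding box."""
    tile = box(*tile_bbox)
    return shape(item.geometry).intersection(tile).area / tile.area >= threshold

tile_bbox = (76.0, 51.0, 77.0, 52.0)  # illustrative bottom-tile bounding box
item = best_image(tile_bbox)
if item and rough_match(item, tile_bbox):
    cog_url = item.assets["red"].href  # B04 address to record in DCPMS.json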
After the bottom-level mapping, the upper data are resampled using the input file paths recorded in DCPMS.json. Due to the close relationship between pyramid levels, only n × n lower COGs are needed in one resampling operation, where n here denotes the zoom ratio between adjacent levels. During code implementation, a COG used as input only requires reading its overviews, not the entire image. These details vastly reduce data usage and speed up resampling. With the help of open-source raster data processing libraries such as Rasterio, the downsampled COGs are generated by code. Each new file path is written back into DCPMS.json so that it can be read when the next upper level is resampled. Finally, the COG set of the upper pyramid and a DCPMS.json file recording the information of the COG-pyramid are obtained; these form the additional storage for building the raster layer map service. At this point, the data part of DCPMS has been built. A flowchart is shown in Figure 5.
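A minimal Rasterio sketch of one such resampling step follows. For brevity it resamples a single lower COG rather than mosaicking the full n × n set, and the paths, sizes, and block layout are illustrative; a decimated read with out_shape lets GDAL serve the request from the COG's built-in overviews:

import rasterio
from rasterio import Affine
from rasterio.enums import Resampling

SRC = "http://localhost:9000/test/s2/9/363/56/B04.tif"  # a lower-level COG (illustrative)
DST = "B04_upper.tif"                                   # output tile of the upper level
DST_SIZE = 4096                                         # per Equation (3) with n = 4
N_OVERVIEWS = 4

with rasterio.open(SRC) as src:
    # Because out_shape is much smaller than the raster, only overview data
    # are read, so just a fraction of the remote file is transferred.
    data = src.read(1, out_shape=(DST_SIZE, DST_SIZE), resampling=Resampling.average)
    profile = src.profile
    profile.update(
        width=DST_SIZE, height=DST_SIZE,
        transform=src.transform * Affine.scale(src.width / DST_SIZE, src.height / DST_SIZE),
        tiled=True, blockxsize=512, blockysize=512,
    )

# Write a tiled GeoTIFF and build internal overviews; a tool such as GDAL's
# COG driver or rio-cogeo can then finalize a fully compliant COG layout.
with rasterio.open(DST, "w", **profile) as dst:
    dst.write(data, 1)
with rasterio.open(DST, "r+") as dst:
    dst.build_overviews([2 ** i for i in range(1, N_OVERVIEWS + 1)], Resampling.average)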
3. Experiments and Analysis
3.1. Experimental Preparation
The experimental data used in this paper consist of two datasets: the Sentinel-2 image dataset [35] and the WorldCover 2021 dataset [33]. Due to the large amount of data, only high-quality partial Sentinel-2 images within a specific range are selected. The two datasets are detailed in Table 2.
Original Sentinel-2 images are stored in Amazon Cloud S3 object storage. Simultaneously, MinIO, a local private cloud storage service, is deployed to store the original WorldCover 2021 dataset and the COGs generated during DCPMS construction. MinIO is a lightweight, high-performance object storage service compatible with the Amazon S3 interface, highly suitable for storing large-capacity unstructured data.
Data processing is primarily implemented in Python 3.2.2. The COGs in the upper pyramid are downsampled with Rasterio. Online visualization in browser pages is realized with the open-source JavaScript library OpenLayers, whose advantage is its mature adaptation to COG: it can easily load and render multiple COGs with WebGL. The open-source library ol-cesium is used to switch between the 2D map (OpenLayers) and the 3D Earth (Cesium).
3.2. Experiments on the Performance of Raster Layer Serving Methods
The Sentinel-2 image dataset, which contains multi-band information, is employed to evaluate the performance of various raster layer serving methods.
According to Equation (2), the maximum zoom level of the reference pyramid is 14, which is the equivalent level to be realized by the COG-pyramid. Sentinel-2 images contain 4 overviews with doubling relationships between levels, acting as levels 10–13. Thus, the original Sentinel-2 images can be approximately equivalent to reference pyramid levels 10–14 as the COG-pyramid’s bottom. The upper pyramid is designed with 3 levels, equivalent to static tiles levels 0–4, 3–7, and 6–10. With the original images, the scale effect of 14-level static tiles can be achieved. Due to the complexity of Sentinel-2’s special grid system for raster layer map services, the tiles from the reference pyramid are used directly. The original images are mapped onto the 10-level tiles of the reference pyramid. The resolution of upper resampling COGs can be calculated by Equation (3), which is 4096 × 4096 pixels.
In terms of data preprocessing, the image retrieval process is initially conducted through the STAC interface, recording the address of the highest-quality image and its bounding box by grid number. Subsequently, the images are mapped to the bottom tiles of the pyramid. The original images are not downloaded due to the significant volume of data; their S3 object storage addresses provided by Amazon Cloud Open Data are used directly. The overviews of COGs on the cloud are read via HTTP during the upper resampling process to reduce data transmission. Finally, the calculation and rendering of RGB and NDVI are developed on the website. The latitude and longitude and the band reflectance can be obtained in real time according to the mouse position. The visualization effects are depicted in Figure 6 and Figure 7.
The comparative test of a single COG's reading performance versus an ordinary GeoTIFF is detailed on the GDAL Wiki [26]. Consequently, this study focuses on the performance advantages of DCPMS in real-time band calculation and visual rendering, compared to a GIS server hosting a COG set. The focus is the difference between high-concurrency real-time computing decentralized to clients versus concentrated on the server. The GIS server used is self-developed with Java's Spring Boot framework, supporting HTTP access to COG via the Apache SIS library. The service can directly read DCPMS.json to obtain the target tile, reducing dynamic mosaicking calculation; it focuses solely on performing band calculations, rendering the image, and returning the output. In addition, this experiment supplements the comparison with the loading time of a static tiles map service.
The performance experiment is divided into two main components: data reading and high concurrency, which represent the performance of the client and server, respectively.
3.2.1. Experiment on Data Loading
On the client side, user experience centers on the loading time of the map service. The browser's debugging tool is used to test and record the overall time consumed by data reading and rendering once the map window is determined.
Figure 8 presents the test results on the client side. First, for RGB rendering, static tiles demonstrate the fastest loading, taking about 0.37 s to fully display the map of the current window range. This efficiency stems from directly reading the existing image files, which are relatively small. Typically, about 30 tiles cover the browser window, with a total data volume of around 3 MB. In contrast, DCPMS loads somewhat more slowly, with a load time of approximately 1.1 s and a data volume of about 13 MB across four bands. The GIS server is a special case because the server itself reads the COG, computes, renders, and returns the image. The reading time is longer when the data are first accessed, so the total time for the first load of the current window map is about 1.5 s. When the same window is loaded a second time, the time drops to about 0.41 s because of the server-side cache. Its data volume is also about 3 MB, close to that of static tiles.
Band calculation performance is also evaluated. Taking NDVI as an example, DCPMS needs no additional data requests because it has already read the near-infrared band, and it can immediately display the NDVI rendering by switching rendering modes. According to the debugging tool, the execution of scripts, rendering, and other steps takes about 0.02 s. The GIS server, on the other hand, needs to request the server again and takes more time: inside the server, the previously unused near-infrared band must be read for calculation and rendering, which takes 0.45 s. The above loading times are averaged over 20 tests involving different windows after scaling or panning, with the browser cache disabled on the client.
3.2.2. Experiment on High Concurrency
As for the server side, the emphasis is on measuring the server’s response time under various levels of concurrent load. Apache JMeter is utilized to assess the performance of the GIS server and object storage. The object storage server used is a locally built MinIO, which stores and publishes COGs and static tiles.
The test results on the server side are shown in Figure 9, which summarizes the response times of the object storage server and the GIS server under different concurrency levels. Each concurrency level is stress-tested for 5 min. The test of object storage is divided into two cases: the full reading of static tiles and the partial reading of COG. The average static tile read is about 100 KB, while the partial COG read uses a random byte range of about 300 KB, close to an actual map service request. As the number of concurrent requests increases, the response times of both the object storage and the GIS server rise, owing to the bottleneck of computing resources. However, the response speed of object storage, whether for reading static tiles or partial COG data, consistently outperforms the GIS server across all levels of concurrency. Notably, at a thousand concurrent requests, the response time for the COG-pyramid is only 26% of that of the GIS server. According to performance monitoring, the object storage is mainly limited by CPU resources when distributing data, while the GIS server is additionally limited by memory resources, mainly used for reading and caching data. This indicates that, compared to direct data delivery from object storage to the client, the GIS server, acting as an intermediary, incurs extra time losses in data input and output, leading to poorer performance. At the same time, under high concurrency, the calculation and rendering load is concentrated on the server side; due to limitations in computing resources, the server cannot process each request promptly, causing some requests to be queued. In addition, the cache is difficult to hit because of the high randomness of tile access. The considerable data-reading demands and time-intensive file I/O further exacerbate the server's slow response. It can be concluded that the decentralized service architecture can greatly reduce the service load caused by high concurrency. The roughly fivefold response time gap between static tiles and partial COG reads is speculated to stem primarily from the difference in data volume and reading method.
In the performance comparison experiment, it is evident that static tiles excel in loading speed and high-concurrency performance. As a common map service for ordinary users, they are undoubtedly the best choice. The main reason for this superior performance is that static tiles use image files as the data carrier: the data volume is small, and they are easy to load and distribute. However, static tiles lack the band information necessary for real-time analysis. Thanks to the mature technology of object storage products, the partial reading of the COG-pyramid is also fast and stable under high concurrency. Although its response time is slower than that of static tiles due to the larger amount of data, it still outperforms the GIS server. At the same time, with the progress in hardware, the improvement of client performance, and the development of software technologies such as WebGL, the simple calculation and rendering of one window's pixels impose little load on the user's browser. Map services can be built with simple object storage, and the resources required to maintain such services are small. Conversely, the GIS server shows poor performance in the experiment, which may be partly attributed to limited resources, but the essence lies in the overall design: the load is concentrated on the server side, with extensive file input and output, modifications in rendering style generating additional requests, and limited caching capabilities. These issues severely impact performance. Therefore, the GIS server is not suitable for map services that handle large volumes of data. A better solution is to offload real-time calculation and rendering to the client side, as demonstrated by the COG-pyramid approach.
3.3. Experiment on Data Preprocessing
Raster layer map services that utilize a large number of remote sensing images or derivative products as data sources require preprocessing. Additionally, extra storage space is needed to build overviews or indexes that facilitate smooth loading of the map service, essentially trading space for time. This experiment supplements the data preprocessing comparison that could not be carried out on the Sentinel-2 image dataset due to its large volume.
According to the spatial resolution of the WorldCover 2021 dataset, the maximum level of the reference pyramid is 14. Each original COG contains 6 overviews; as the bottom level of the pyramid, it corresponds to levels 8–14 of the reference pyramid. Since the original data are segmented by latitude and longitude, the bottom of the customized COG-pyramid can directly align with the original data, with 120 × 60 tiles. Keeping the numbers of columns and rows integral and their ratio at 2:1, the upper pyramid includes two levels with 2 × 1 and 16 × 8 tiles, respectively. The resampled data maintain the same resolution and overviews as the original data. In terms of data processing, the initial step is to extract the latitude and longitude coordinates of the origin point at the lower-left corner of the image from the file name, such as ESA_WorldCover_10m_2021_V200_N51E000_Map.tif. File paths are mapped to the bottom tiles of the custom pyramid and recorded in DCPMS.json to facilitate the reading of COGs during downsampling. When downsampling the upper COGs, the corresponding lower COGs are traversed and the appropriate overviews are read.
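The origin extraction can be as simple as a regular expression over the file name. A sketch follows, where the 3° tile span comes from the dataset description and the y origin is assumed at the northern edge of the extent:

import re

def tile_origin(filename):
    # Lower-left corner from names like 'ESA_WorldCover_10m_2021_V200_N51E000_Map.tif'.
    m = re.search(r"([NS])(\d{2})([EW])(\d{3})", filename)
    lat = int(m.group(2)) * (1 if m.group(1) == "N" else -1)
    lon = int(m.group(4)) * (1 if m.group(3) == "E" else -1)
    return lat, lon

def tile_xy(filename, extent=(-180, -90, 180, 90), span=3):
    # Map a 3-degree tile to custom-pyramid (x, y), with x increasing eastward
    # and y increasing from the top (northern) row downward.
    lat, lon = tile_origin(filename)
    x = (lon - extent[0]) // span
    y = (extent[3] - (lat + span)) // span
    return int(x), int(y)

print(tile_xy("ESA_WorldCover_10m_2021_V200_N51E000_Map.tif"))  # -> (60, 12)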
Unlike Sentinel-2 images, the WorldCover dataset is single-band product data and is not suited to band overlay calculations; it is oriented more toward data sharing and visualization. In this instance, unique-value rendering is performed according to the pixel value, presented with the color map of the product data. The visualization effect is displayed in Figure 10.
According to the reference pyramid, the static tiles pyramid needs to be sliced up to level 14. This pyramid uses WGS84 coordinates, with two tiles (512 × 256 pixels) at level 1; each level spans 256 × 2^z pixels in the x direction and half that in the y direction. Before slicing, the gdal_merge.py script is used to merge all data into one large COG. Subsequently, the gdal2tiles.py script performs the slicing; the total tile data amount to 1.1 TB and took 112 h to process. Distributed computing frameworks such as Spark could also be used for slicing, avoiding the merge step and speeding up the process. Either way, in the face of large raster datasets, building a tile pyramid takes a great deal of time or computing resources, because the underlying data are essentially reconstructed. The more detailed the data, the higher the pyramid and the greater the number of image files, exponentially increasing processing time and additional storage space.
The COG-pyramid cleverly avoids this problem. The original data are used as the bottom of the pyramid, with the built-in overviews acting as hidden levels; during resampling for the upper pyramid, only the overviews of the lower COGs are required. These measures significantly reduce the resampling time and storage space needed. The construction of the COG-pyramid takes only 14 h, excluding the underlying original data, with the additional two levels of COG occupying only about 5.4 GB. Furthermore, the number of files is considerably smaller than for static tiles, making them easier to manage and maintain. The comparison of processing time and storage space between static tiles and DCPMS is displayed in Figure 11.
In addition, Amazon Cloud Open Data provides S3 object storage access to a large amount of remote sensing product data, which can be partially read to build the upper COG-pyramid directly without occupying local storage space. The loading of underlying data in the map service can also access public cloud files through the HTTP addresses recorded in DCPMS.json, further alleviating the storage load.
In terms of data preprocessing and storage, the COG-pyramid demonstrates significant advantages over static tiles. In this experiment, the GIS server also utilizes the processed COG-pyramid to expedite tile calculations. If a fully dynamic mosaicking GIS server were used, the preprocessing work would be greatly reduced, but judging from the offline data preprocessing efficiency of the test machine, the response time would be difficult to accept.