Nothing Special   »   [go: up one dir, main page]

Next Article in Journal
Application of UAV Photogrammetry in Displacement Measurement of the Soil Nail Walls Using Local Features and CPDA Method
Next Article in Special Issue
Intercropping Classification From GF-1 and GF-2 Satellite Imagery Using a Rotation Forest Based on an SVM
Previous Article in Journal
Spatiotemporal Influence of Urban Environment on Taxi Ridership Using Geographically and Temporally Weighted Regression
Previous Article in Special Issue
Single-Tree Detection in High-Resolution Remote-Sensing Images Based on a Cascade Neural Network
You seem to have javascript disabled. Please note that many of the page functionalities won't work as expected without javascript enabled.
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

A Comparison of Standard Modeling Techniques Using Digital Aerial Imagery with National Elevation Datasets and Airborne LiDAR to Predict Size and Density Forest Metrics in the Sapphire Mountains MT, USA

1
RedCastle Resources, Inc. Onsite contractor with Northern Region Geospatial Group, USDA Forest Service, 26 Fort Missoula Road, Missoula, MT 59808, USA
2
Rocky Mountain Research Station, USDA Forest Service, 800 E. Beckwith, Missoula, MT 59801, USA
3
Northern Region Geospatial Group, USDA Forest Service, 26 Fort Missoula Road, Missoula, MT 59808, USA
*
Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2019, 8(1), 24; https://doi.org/10.3390/ijgi8010024
Submission received: 21 November 2018 / Revised: 26 December 2018 / Accepted: 7 January 2019 / Published: 11 January 2019
(This article belongs to the Special Issue Geographic Information Science in Forestry)
Figure 1
<p>Daly-Gold Study Area in the Sapphire Mountains of the Bitterroot National Forest in western Montana, USA is shown. The yellow linear features describe the National Forest Systems administrative boundary and the land within it, and the yellow point features illustrate the location of plots and associated field measurements used in this study.</p> ">
Figure 2
<p>Composition and structure of common stand exam plot design, with outer circle (orange) representing a 0.04 hectare plot for trees greater than or equal to 12.7 cm diameter at breast height (radius = 11.4 m) and inner circle (yellow) representing a 0.0013 ha plot for trees less than 12.7 cm DBH (radius = 2 m). The outer square represents the area summarized for the NAIP imagery and LiDAR data.</p> ">
Figure 3
<p>Flow diagram of the data sources, processing steps, modeling, and comparisons made within the study. NAIP, LiDAR, and TOPO data were converted to texture, height, and elevation metrics and transformed to spatial surfaces using principal component analysis (PCA). PCA surface values were then related to field data based on plot locations (Overlay) and were used to model basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH) and trees per hectare (TPH) relationships. Modeled relationships were then used to create raster surfaces that were compared at the spatial resolution of the plot and stand.</p> ">
Figure 4
<p>Daly-Gold Study Area lifeform classification with randomly selected forest polygons highlighted in black. Summarized metric predictions were compared in these polygons.</p> ">
Figure 5
<p>Distribution of basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH) collected on 60 plots within the study area.</p> ">
Figure 6
<p>Principal component graph, where LiDAR is solid black, NAIP is dashed, and TOPO is dotted. The solid horizontal line indicates the minimum 95% threshold of the cumulative variation that is explained by the number of principal components.</p> ">
Figure 7
<p>Illustration of TOPO, NAIP, and LiDAR principal component raster datasets. Note that only the first three components are displayed by the red-green-blue-color composite. The proportion of correlation explained (λ) by those components is display above each raster dataset.</p> ">
Figure 8
<p>Linear relationship between basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH) estimates derived from NAIP and LiDAR models (<a href="#ijgi-08-00024-t001" class="html-table">Table 1</a>). Observed versus predicted values for NAIP and LiDAR based BAWD, QMD, BAH, and TPH are given in <a href="#ijgi-08-00024-f0A2" class="html-fig">Figure A2</a>.</p> ">
Figure 9
<p>Variation in out of bag root mean squared error (RMSE) for 10 basal area weighted diameter Random Forest models. Results for quadratic mean diameter, basal area per hectare, and trees per hectare are given in <a href="#ijgi-08-00024-f0A3" class="html-fig">Figure A3</a>.</p> ">
Figure A1
<p>Example of minor differences in basal area weighted diameter (BAWD with field measured min: 0.00 cm, and max: 58.62 cm), quadratic mean diameter (QMD, with field measured min: 1.27 cm, and max: 53.34 cm), basal area per hectare (BAH with field measured min: 0.00 cm, and max: 45.22 cm), and trees per hectare (TPH with field measured min: 0.00, and max: 41,439.54) estimated values for NAIP, LiDAR and TOPO based models. Additional summary statistics are given in <a href="#ijgi-08-00024-t0A1" class="html-table">Table A1</a>.</p> ">
Figure A2
<p>Observed verse predicted values for NAIP (left) and LiDAR (right) based, basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), trees per hectare (TPH), best fitting Random Forest models.</p> ">
Figure A3
<p>Variation in out of bag root mean squared error (RMSE) for 10 quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH) Random Forest models.</p> ">
Versions Notes

Abstract

:
In recent years airborne Light Detection and Ranging (LiDAR) technology has received a great deal of attention. Using airborne LiDAR, analysts have successfully related height measurements to forest characteristics such as tree size, basal area, and number of trees. Similarly, National Agricultural Imagery Program (NAIP) digital aerial imagery in combination with elevation datasets such as the National Elevation Dataset (NED) have been used to estimate similar forest characteristics. Few comparisons, however, have been made between using airborne LiDAR, NAIP, and NED to estimate forest characteristics. In this study we compare airborne LiDAR, NAIP, and NAIP assisted NED based models of forest characteristics commonly used within forest management at the spatial scale of field plots and forest stands. Our findings suggest that there is a high degree of similarity in model fit and estimated values when using LiDAR, NAIP, and NAIP assisted NED predictor variables.

1. Introduction

Airborne Light Detection and Ranging (LiDAR) is an active sensor technology that has recently gained attention in forestry as a predictor variable for commonly quantified forest characteristics. Because airborne LiDAR is an effective tool for measuring heights, some have investigated the relationships between those heights and common forest metrics used to inform land management and planning activities [1,2]. While studies have shown promise in relating LiDAR obtained heights to metrics such as basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per ha (BAH), and trees per ha (TPH) [3,4], it is important to recognize that the precision and accuracy of those estimates are only based on the strength of the relationship between remotely sensed and plot data, and not the technology itself.
Similarly, fine resolution passive airborne imagery produced by organizations such as the National Agricultural Imagery Program (NAIP) [5] have also been used to estimate forest metrics [6,7,8,9]. In particular, strong relationships have been developed between BAWD, QMD, BAH, and TPH using spectral values, measures of image texture [10], and topographic variables derived from imagery such as the National Elevation Dataset (NED) [11]. Despite these similarities, few direct comparisons have been made between the quality and costs of forest metric estimates produced from LiDAR datasets and those derived from high resolution imagery and topographic data such as NAIP and NED.
In isolation, the quality and cost of using various remotely sensed sources of data as model predictor variables for forest characteristics is fairly straight forward to calculate and compare. Metrics such as root mean squared error (RMSE) and dollars per hectare can be calculated, attributed to each of the different remotely sensed data sources, and evaluated to determine which source of data produces the least error and cost the least to acquire. However, when both quality and cost are simultaneously assessed within the context of informing forest management, tradeoffs in model precision and data acquisition costs can be made to justify the use of a given dataset. This assumes there is a difference between quality (model error) and cost (currency per hectare) in using different remotely sensed data.
Assuming that costs associated with collecting field data used to train predictive models of forest characteristics are the same and that data processing costs are similar for the different sources of remotely sensed data, the vast majority of cost difference among different remotely sensed data can be attributed to data acquisition. Aerial imagery such as NAIP are acquired across the continental USA every two years, and are readily available, free of charge, to the public through United States Department of Agriculture Aerial Photography Field Office [12]. Similarly, seamless 10 m elevation data for the continental USA have been collected by USGS and is readily available, free of charge to the public [13]. Airborne LiDAR on the other hand can have variable cost per unit area to acquire and is typically contracted on a project basis, costing both time and money for data acquisition and contract management across a relatively limited spatial extent.
Given that NAIP and NED products are readily available across broad spatial extents and free to the public, these data represent low cost potential predictor variables for modeling forest characteristics within the continental USA. Adding additional remotely sensed data, especially at an increased cost, is really only justifiable from a practical standpoint when the quality (less error) of models are substantially improved. It follows then that acquiring airborne LiDAR for the sole purpose of estimating forest characteristics within the continental USA is really only justifiable when the quality of model predictions substantially improve, and that the added expense of acquisition is less than the gains associated with improved model prediction. Similarly, in the event that this situation is reversed (airborne LiDAR data freely available and there is an additional cost to acquire imagery) the same line of reasoning would apply. However, the improvement in predictive model quality of forest characteristics such as BAWD, QMD, BAH, and TPH given different sources of remotely sensed data is seldom tested. While these different sources of data have been successfully related to forest metrics and measure different aspects of the forest (spectral, elevation, and height) it is important to recognize they are only correlated with BAWD, QMD, BAH, and TPH, and are more than likely highly correlated with one another for a given geographic area. Moreover, given the similarity in reported model errors in studies that solely use imagery [6,7,8,9] or LiDAR [3,4] based predictor variables to estimate forest characteristics, it may be that both sources of data provide similar predictive capability.
This has lead us to hypothesize that there is little difference between modeled estimates of BAWD, QMD, BAH, and TPH derived from NAIP, NAIP assisted NED, and LiDAR based remotely sensed data. To test this hypothesis we compare BAWD, QMD, BAH, and TPH estimates from Random Forest regression models that relate plot measurements to LiDAR, NAIP, and NED derivatives in the Bitterroot National Forest, Montana, USA. Our comparisons evaluate model fit and complexity using RMSE and Akaike’s information criterion (AIC) [14,15] at the spatial resolution of the plot and the forest stand.

2. Materials and Methods

2.1. Study Area

This project is located at the headwaters of Daly and Gold creeks, in the Sapphire Mountains, on the east side of the Bitterroot Valley, approximately forty miles south of Missoula, MT (Figure 1). The Daly–Gold Study Area encompasses 39,354 hectares, and approximately 70 percent of this areas is within the Bitterroot National Forest. The lowest and highest elevations are 1152 m and 2578 m, with a wide range of slope and aspect distributions [11]. This setting is generally representative of moderate elevation mountains of western Montana, USA. Mean annual elevation-weighted precipitation is 71 cm, ranging between 43 cm to 97 cm. Temperature ranges from an average minimum of −1.78 °C to an average maximum of 13.89 °C [16]. The forest types in this area are representative of patterns found elsewhere in montane environments of the Rocky Mountains, where Ponderosa pine (Pinus ponderosa), Douglas-fir (Pseudotsuga menziesii), and western larch (Larix occidentalis) occupy lower to mid elevations, transitioning to lodgepole pine (Pinus contorta) and subalpine fir (Abies lasiocarpa) at higher elevations. On wet sites and in riparian areas, quaking aspen (Populus tremuloides), black cottonwood (Populus balsamifera), and Engelmann spruce (Picea engelmannii) are commonly found [17].

2.2. Data Sources

Field data used in this study include plot coordinates derived from a Trimble RTX global positioning system (GPS; horizontal accuracy < 3 m), tree counts, and measurements based on common stand exam (CSE) procedures [18,19]. The allocation of field plots across the study was based on a clustered sampling design of a previous study [20]. Inputs for the clustering process included a normalized difference vegetation index (NDVI) derived from Landsat 5 imagery and elevation and solar insolation derived from the NED 30 m digital elevation model (DEM). In all, 27 clusters were created and three plots were randomly located within each of the 27 clusters to insure samples were drawn across a wide range of topographic and spectral conditions. All 81 plot locations were visited and tree data were collected by Bitterroot National Forest silviculture staff from June to October 2011. At each location a 0.04 ha circular plot (radius 11.4 m) was installed and all trees greater than or equal to 12.7 cm diameter at breast height (DBH) were counted, described, and measured. Additionally, for trees less than 12.7 cm DBH a second 0.0013 ha circular sub-plot was installed (radius 2 m). At 21 of the plot locations, GPS readings did not meet minimum spatial accuracy requirements for our study and were removed from the comparisons. In total, 60 GPS locations and corresponding plot data constitute our field sample dataset. An illustration of the dimensions and layout of plots is given in Figure 2.
Spectral data acquired for our study came from the National Agricultural Imagery Program (NAIP). NAIP collects fine resolution, four-band digital aerial imagery during the agricultural growing seasons across the continental United States, repeated every two years [21]. NAIP imagery is acquired at a 1 m spatial resolution with a horizontal accuracy that locates 95% of photo-identifiable ground control points within 6 m of their actual location. The spectral bands of NAIP include red, green, blue, and near-infrared [22]. NAIP imagery is made publicly available to download through the aerial photography field office (APFO) website [12] as a series of quarter quadrants image tiles that have been mosaicked and color balanced. Using USGS file transfer protocol image service [20], we downloaded 19 NAIP tiles that covered our study area. From those tiles we created one continuous image mosaic, and this is the source of our 2009 NAIP data.
Elevation data acquired for our study consisted of 1/3 arc-second (~10 m) digital elevation models [13] (DEM) and were acquired using the USGS file transfer protocol image service [23]. These data are the source for our topographic dataset (TOPO). Light Detection and Ranging (LiDAR) data used in our study were collected in two separate acquisitions. The first data collection flights occurred between June 24th and 28th, and the second acquisition took place between August 8th and 9th, 2010. In both cases, the Leica ALS50 Phase II and Leica ALS60 laser systems were mounted on a fixed wing Cessna Caravan aircraft, and used with a scan angle of ±15° from nadir and pulse rate designed to yield an average native density of ≥6 points per m2 over terrestrial surfaces. The entire study area was surveyed with an opposing flight line side-lap of 50% to reduce laser shadowing and increase surface laser painting. The sensor recorded up to four returns per pulse, and all discernable laser returns were processed for the output dataset [22]. Table 1 briefly describes the datasets used for our comparisons.

2.3. Data Processing

Tree data collected at each field plot were processed to yield summarized values of size and density. Using the location and extent of 0.04 ha field plots (~23 m by 23 m), summary statistics were computed from NAIP, TOPO, and LiDAR data sources. Summary statistics from NAIP data included metrics related to central tendency, dispersion, and texture for each image band. From the TOPO dataset, mean elevation, slope, and trigonometrically transformed aspect values were computed. From the LiDAR dataset, a bare earth model was generated and used to filter the point cloud, and all standard FUSION-based height metrics [25] were computed for the extent of each plot (Table A1). Following the generation of plot-based cloud metrics, associated grid metrics were computed and assigned to a cell size that matched the extent of the plot.
Many of the spectral (NAIP), topographical (TOPO), and height metrics (LiDAR) derived from each data source were highly correlated. To remove redundancy in the data, normalize values, and produce independent (orthogonal) predictor variables, we performed three singular vector decomposition principal component analyses (PCA) [26]. Relevant component scores from the PCAs were used to develop a suite of Random Forest regression models [27,28,29] predicting BAWD, QMD, BAH, and TPH.
Candidate models for BAWD, QMD, BAH, and TPH were compared using RMSE and AIC [14,15]. From those models and relevant principal scores for each data source, raster surfaces were generated for the extent of the study area. To evaluate mean stand level estimates of BAWD, QMD, BAH, and TPH, we summarized the raster cells within the boundary of 100 randomly chosen forested polygons extracted from the most current, vector-based Bitterroot National Forest existing vegetation database [30,31,32] and compared mean polygon estimates using paired t-tests (Figure 3). The remainder of this section describes the detailed procedures used to process the different sources of data, build predictive models, and compare models and estimated stand values.

2.3.1. Processing Plot Data

The forest tree size and density characteristics we focused on in this study include: BAWD, QMD, BAH, and TPH. Each of these provide meaningful information related to the health and condition of a given forested stand and are used within the discipline of forestry to inform when, where, and how treatments occur across the landscape [33,34]. The metrics BAWD and QMD describe tree diameter, while BAH and TPH relate to the cross-sectional stem area and tree density. In the field, tree status (live or dead), species, counts, DBH, and height within the boundary of a fixed 0.04 ha plot are recorded (Figure 2). Once collected, plot summaries of BAWD, QMD, BAH, and TPH were calculated using only information from live trees [35,36]. Calculation of size and density metrics are based on the following equations:
BA = D B H 2 0.0000785
BAWD =   B A D B H B A
TPH = E F t r e e   c o u n t
BAH =   E F B A
QMD =   B A H T P H 0.0000785
where DBH is measured in cm, BA is calculated in m2 ha−1, and EF is the expansion factor used to bring plot summaries to per hectare values for 0.04 and 0.0013 ha plots.
Some plot locations contained no trees (no tally) while others contained only small trees (<1.37 m tall). For metrics such as BAWD and QMD these two conditions do not represent the same meaning on the ground, especially with regards to the spectral signature captured at those locations. To account for this discrepancy, plots with no tallies were given a BAWD and QMD value of 0 cm while plots that only contained trees less than 1.37 m tall were given BAWD and QMD values of 1.27 cm. The values of 0 cm and 1.27 cm were also translated to the BAH and TPH records to ensure consistent representation.

2.3.2. Processing NAIP Data

To remove areas outside of our study, we clipped our continuous NAIP mosaic to a 90 m buffer around the Daly-Gold study boundary. Using that clipped mosaic we then passed a moving window of 23 by 23 cells (approximate extent of each field plot) across each band to calculate mean, standard deviation, and gray level co-occurrence matrix (GLCM) [10] horizontal contrast values [8]. Combined, these 12 surfaces (4 mean, 4 standard deviations, and 4 GLCM) make up the base NAIP predictor variables.

2.3.3. Processing TOPO Data

Similar to the base NAIP variables we clipped our DEM to a 90 m buffer around the Daly-Gold study boundary. From the DEM, slope (degrees), northing, and easting surfaces were created. To convert terrain orientation to northing and easting values, we transformed degrees of aspect, derived from the clipped DEM, with cosine and sin functions, respectively [8]. We then passed a moving window of 3 by 3 cells (approximate extent of a plot) across the DEM, slope, northing, and easting raster datasets to calculate mean values for the approximate extent of a plot. Combined, these four surfaces were then resampled to the same spatial resolution as the NAIP imagery and constitute the base TOPO predictor variables.

2.3.4. Processing LiDAR Data

LiDAR data was processed using FUSION software [1,2,25]. To automate the procedures within FUSION, we developed a specialized set of tools that handled file and data management. We also constructed a data handling architecture to process the original LiDAR (*.las) files with the algorithms (*.exe files) contained within FUSION. These algorithms produce a bare earth model, point cloud metrics for all plots, and corresponding grid metric surfaces for the study area.
As a first step, we used the default GroundFilter algorithm within FUSION to identify laser returns that lie on the probable ground surface, specifying a 1 m cell resolution. Then the GridSurfaceCreate algorithm was used to produce an output raster representing the probable ground surface, also with 1 m cell resolution. Point cloud metrics were produced for the extent of each plot by calling the CloudMetrics algorithm. Prior to the computation of cloud metrics, ground surface values were subtracted from point clouds to yield surface heights. In the process of computing cloud metrics, we specified a plot radius of 11.4 m and chose to remove all points lower than 0 m and above 45.7 m. Rectangles as opposed to circles were used for convenience and to represent the extent of each plot in a manner that relates to grid cells. This process yielded a suite of cloud metrics (38) for each plot location in tabular form.
In addition to cloud metrics, grid metrics were calculated using the GridMetrics algorithm. To calculate grid metrics, ground surface values were subtracted from the original *.las files, points with values less than 0 or greater than 45.7 m removed from the analysis, and output cell resolution was specified as 23 m, to correspond to the plot dimension. This yielded the same suite of 38 grid metrics, also in tabular form. Tabular grid metric outputs were converted to ERDAS Imagine *.img raster surface files using our custom processing routines. Combined these images make up the base LiDAR predictor variables.

2.3.5. Computing Principal Components

A total of 12 NAIP, 4 TOPO, and 38 LiDAR metrics were associated with all plot locations using the center coordinates of the plot. Within each data source, many metrics have substantially different units of measure and describe similar aspects of the data. Because of these issues, we performed three singular vector decomposition PCAs to normalize the values within each source of data (i.e., we used the correlation matrix) and remove the correlation among variables [8,26]. To standardize the number of predictor variables while maintaining the vast majority of information within those variables, we selected the minimum number of principle components that account for at least 95% of the correlation in the data.
Using the Eigen vectors of the selected components and the base predictor surfaces, we created three multiband principal component score raster surfaces (i.e., one multi-band raster for each data source). Finally, we resampled our principal component score surfaces to a common resolution of 10 m to standardize the spatial resolution of our predictor variables and help visualize the heterogeneous nature of the study area. These 3 multiband principle component score surfaces represent the spatial configuration of the predictor variables used in candidate BAWD, QMD, BAH, and TPH models.

2.3.6. Algorithm Development and Evaluation

Multiple nested candidate Random Forest regression models [27,28] of plot BAWD, QMD, BAH, and TPH given NAIP, TOPO, or LiDAR principal component scores were compared using RMSE and AIC. The Random Forest methodology uses randomization and bagging to estimate multiple regression trees for subsets of the data and predictor variables and uses the out of bag (OOB) observations to estimate error. Specifically, models were built to predict tree size and density characteristics based on (1) NAIP principal component scores only, (2) LiDAR principal component scores only, (3) NAIP and TOPO principal component scores, (4) NAIP and LiDAR principal component scores, and lastly the combination of (5) NAIP, TOPO, and LiDAR principal component scores. All candidate models were generated with 100 decision trees, a random selection of 0.66 of the training data, and 4 randomly selected independent variables for each tree. To evaluate the similarity in predicted mean values of BAWD, QMD, BAH, and TPH using the different sources of data, linear regression was performed on a pairwise basis using predictions from each candidate model. In addition, for each candidate model Random Forest was performed ten times and OOB RMSEs were recorded to describe and account for the variability in the Random Forest modeling technique.
Random Forest regression results were summarized across the ten iterations and the mean OOB RMSE was reported for each candidate model. Mean OOB RMSEs were used in the calculation of the AIC score, which was the basis for model selection. In this context, AIC favors models that achieve similar results with less complexity, and thus, models with the lowest AIC score and fewest predictor variables were deemed the most parsimonious. Top fitting models with differences of less than 2 AIC units between models (ΔAIC) indicated similar performance [15].

2.3.7. Spatial Summary Comparison

Random Forest models were used in conjunction with the appropriate principal component score surfaces to create spatially explicit predictions of plot BAWD, QMD, BAH, and TPH. The resulting raster surfaces were summarized to all forest stands across the study area. Polygon-based stands were independently created for the project area as part of the existing vegetation database used by the Bitterroot National Forest [28,29,30]. Polygonal mean values of BAWD, QMD, BAH, and TPH were calculated from estimated mean surfaces and attributed to the corresponding stands using zonal statistics [8].
To assess the stand-level differences in predicted forest metrics, between the various models based on NAIP, TOPO, and LiDAR principal component scores, we randomly selected 100 forest polygons from the vegetation database and compared their summarized mean values using paired T-tests. Polygons representing forest stands were identified by generating a categorical lifeform map that includes forest, non-forest vegetation, water, sparse vegetation, and disturbed (e.g., fire, insect, or disease affected vegetation) classes. Within the forest vegetation class, 100 polygons greater than 0.04 ha in size were randomly selected. An illustration of the lifeform classification and spatial distribution of the randomly selected forest polygons is given in Figure 4.

3. Results

Tree data were collected on 60 plots across the study area, and tabulated to produce plot values for BAWD, QMD, BAH, and TPH. The distribution of BAWD, QMD and BAH plot values was generally considered normal while TPH values were strongly right skewed (Figure 5). Principle components used as predictor variables accounting for 96%, 97%, and 96% of the information within NAIP (5 components), TOPO (3 components) and LiDAR (5 components) data sources, respectively (Figure 6). Often Eigen vectors of a PCA are interpreted with respect to original values of the data. In this case we simply wanted to remove the correlation among each source of remotely sensed data and use the orthogonal scores of the components as predictor variables in subsequent analysis. Our PCAs effectively performed this objective. Scores calculated from the Eigen vectors of the corresponding principal components were used with plot summaries of BAWD, QMD, BAH, and TPH to build a suite of models describing the relationship between field and remotely sensed data sources. Figure 7 illustrates the spatial distribution of the multiband raster datasets generated from the principal component scores and used to produce the spatial predictions of the response variables.
Model fit and complexity were evaluated using RMSE and AIC (Table 2). Generally, models derived from NAIP and LiDAR predictor variables alone performed as well as more complex models. The best fitting, most parsimonious models for BAWD, QMD, and TPH contained NAIP principal component scores. The best fitting, most parsimonious model for BAH contained LiDAR principal component scores. When estimates of BAWD, QMD, BAH, and TPH derived from NAIP and LiDAR based models were regressed against one another, they followed a positive one to one relationship explaining approximately 90% of the variation between predicted estimates (Figure 8). When paired with NAIP or LiDAR variables, TOPO variables provided little additional information.
Random Forest regression model iterations yielded similar trends as shown in Table 2 but illustrate the variability of using the Random Forest ensemble modeling techniques (Figure 9). While improvements in predictive capability (smaller RMSE) can be achieved by using more complex models (Table 2 and Figure 9), those improvements are relatively minor with regard to the additional complexity (AIC scores in Table 2, Figure A1). Moreover, given the stochastic nature of ensemble modeling (Figure 9), some perceived differences in model results may be due to chance alone and was in large part the rationale for using AIC to identify the top fitting models for each characteristic.
At the stand level, differences between summarized estimates of models derived from NAIP and LiDAR are generally indistinguishable. In all comparisons except BAH, paired differences between NAIP and LiDAR derived models were statistically insignificant (Table 3). For BAH, the average paired difference was less than 2.2 m2 ha−1 which in many surveys is roughly equivalent to one tree. Combined, these comparisons suggest that models based on NAIP and LiDAR have very similar predictive capability, error structure, and spatial distribution. This is particularly evident at the spatial scale associated with forest stands where individual pixel predictions are aggregated to produce a mean value for each stand.

4. Discussion

Forest resource managers need timely and accurate information about the size and density of trees to make decisions about what, where, and how to apply treatments across landscapes. Metrics like BAWD, QMD, BAH, and TPH are commonly used to describe structural elements of the forest. LiDAR is a relatively new technology that is capable of providing highly precise measurements of height, and has been shown as an effective tool for making spatial predictions of forest metrics by relating observed heights to ground-based plot measurements of trees [3,4].
While LiDAR data can be useful, it is not available in all locations and must generally be acquired for specific project areas [37]. Many other digital spatial datasets like topography and high resolution imagery are available as continuous data across the continental United States, and can be obtained free of charge. Like LiDAR data, recent research has also shown that fine resolution imagery such as NAIP can be used to effectively predict forest metrics with relationships based on field data.
Alone, neither NAIP nor LiDAR data can model environmental phenomenon without high quality, reliable reference information. In this study, 60 field plots following the CSE protocol were used to describe a range of forest conditions. While this field protocol is commonly used for forest inventories, it may not be optimal for relating ground measurements to fine resolution spatial data such as NAIP imagery or LiDAR [38].
Without adequate ground representation of the characteristics to be estimated, good model fit cannot be achieved. This was evident in the relatively large RMSE between measured and estimated TPH in our study. To enhance our ability to estimate and map forest characteristics, a sufficient number of plots should be collected to cover the range of conditions across the study area. Those plots should also be designed to fully describe the entire area of the plot. Future research on appropriate plot layout and configuration with regard to predictor variables is needed to better understand these relationships, and improve the predictive ability of forest metric modeling.
For many natural resource management questions, collection of training data can be one of the expensive and important aspects of a project. When data collection saving can be made by using readily available, and free, remotely sensed data in lieu of purchasing new data, it may be beneficial to use those savings on collecting robust training datasets. With more time and resources available for training data collection, the spatial distribution of trainings sites may be enhanced, or more thorough measurements may be taken at each site, depending on project specific needs. Whether more information is collected at each site, or more sites are added to a set, better training data generally lead to better fitting models and ultimately, reduced uncertainly, enhanced predictive ability, and more confident decision making.
Using the relationships derived from our field plots and remotely sensed data, we demonstrated that there was a great deal of similarity between NAIP and LiDAR based model estimates at the spatial scale of a plot and forest stand. In most of our comparisons models that used NAIP based predictors out performed those that used LiDAR or TOPO. In some cases, using both NAIP and LiDAR based predictors showed minor improvements in model fit. Often those improvements though were small and from a model complexity stand point there was no justification for adding additional sources of data, especially sources of data that have additional acquisition costs. Assuming that acquisition cost of remotely sensed data is a key factor to quantifying forest characteristics for the sake of forest management decision making, in our study the added cost of acquiring LiDAR data to estimate BAWD, QMD, BAH, or TPH was not worth the marginal improvement in estimating those characteristics at the spatial scale of the plot and stand.
Cost, quality, continuity, and time are all essential components to consider when quantifying forest characteristics. Our analyses suggest that outputs from models based on NAIP or LIDAR at the spatial scale of the plot and stand compare favorably. Thus, using imagery or elevation-based data to generate estimates of the forest metrics described in this study is somewhat irrelevant. If one source of data is available, there is not much to be gained by spending time or money to acquire an additional source. However, full coverage image data tends to be more readily available than LiDAR, which is currently only collected in isolated patches. Furthermore, when time is of the essence, it must be understood that custom acquisition of remotely sensed data, whether imagery or LIDAR, can be time consuming and may significantly delay a project.
For other forest characteristics related to topography and understory vegetation we presume that airborne LiDAR based models would have a substantial advantage over passive based imagery in accurately quantifying those characteristics. However, our study focused on quantifying common overstory components of the forest used in forest management decision making. In this instance, with the exception of BAH, LiDAR based models did not substantially improve estimates of BAWD, QMD, or TPH. Moreover, when comparing modeled estimates of BAH, especially at the spatial scale of the forest stand, it is highly questionable if the added cost of LiDAR acquisition is worth the minor reduction in modeled error. This has leads us to suggest that when existing remotely sensed data is available such as NAIP and additional resources are available for quantifying overstory forest characteristics, more emphasis on field data collection and design may prove to be a better use of those financial resources than purchasing additional remotely sensed data.

5. Conclusions

Both NAIP and LiDAR were used independently and in combination to predict the spatial distribution of forest characteristics such as BAWD, QMD, BAH, TPH. While recent attention has been focused on using LiDAR data to estimate these characteristics, we have shown that, when using common modeling techniques, similar results can be obtained by using high resolution imagery (NAIP). Therefore, either type of data could be used to perform these analyses and produce equivalent results when estimates of size and density are of interest. This is particularly apparent when fine grain, spatially explicit raster estimates are summarized to larger units such as forest stands.
Interestingly, NAIP and LiDAR measure different aspects of the environment, and when trees and forest characteristics are considered they both provide disparate but useful information for describing BAWD, QMD, BAH, and TPH. Our modeling results show that although different predictor layers (NAIP versus LiDAR) are being used to estimate forest characteristics, each output yields relatively similar interpretations of the forest. We show that regardless of whether NAIP or LiDAR is used, they both have equivalent predictive ability, error structure, and spatial distribution. Given the similarity in predicting forest metrics from these sources of data, using NAIP is a low-cost alternative to using LiDAR.

Author Contributions

R.A. and J.H. designed the study and were lead authors of the manuscript. R.A. performed the majority of the analyses within the study. J.H. built the tools described within the study and helped implement the analyses. S.B. provided administrative oversight and contributed to writing portions of the manuscript.

Funding

This research was funded by the U.S. Forest Service and the Rocky Mountain Research Station. Additional support for the development of the RMRS Raster Utility was provided by the Agriculture and Food Research Initiative Competitive Grant 2013-68005-21298 from the USDA National Institute of Food and Agriculture through the Bioenergy Alliance Network of the Rockies (BANR).

Acknowledgments

We thank the three independent reviewers for their helpful comments and suggestions. Their advice helped to significantly improve this article.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Figure A1. Example of minor differences in basal area weighted diameter (BAWD with field measured min: 0.00 cm, and max: 58.62 cm), quadratic mean diameter (QMD, with field measured min: 1.27 cm, and max: 53.34 cm), basal area per hectare (BAH with field measured min: 0.00 cm, and max: 45.22 cm), and trees per hectare (TPH with field measured min: 0.00, and max: 41,439.54) estimated values for NAIP, LiDAR and TOPO based models. Additional summary statistics are given in Table A1.
Figure A1. Example of minor differences in basal area weighted diameter (BAWD with field measured min: 0.00 cm, and max: 58.62 cm), quadratic mean diameter (QMD, with field measured min: 1.27 cm, and max: 53.34 cm), basal area per hectare (BAH with field measured min: 0.00 cm, and max: 45.22 cm), and trees per hectare (TPH with field measured min: 0.00, and max: 41,439.54) estimated values for NAIP, LiDAR and TOPO based models. Additional summary statistics are given in Table A1.
Ijgi 08 00024 g0a1
Figure A2. Observed verse predicted values for NAIP (left) and LiDAR (right) based, basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), trees per hectare (TPH), best fitting Random Forest models.
Figure A2. Observed verse predicted values for NAIP (left) and LiDAR (right) based, basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), trees per hectare (TPH), best fitting Random Forest models.
Ijgi 08 00024 g0a2
Figure A3. Variation in out of bag root mean squared error (RMSE) for 10 quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH) Random Forest models.
Figure A3. Variation in out of bag root mean squared error (RMSE) for 10 quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH) Random Forest models.
Ijgi 08 00024 g0a3
Table A1. Description and summary statics of field and remote sample observations.
Table A1. Description and summary statics of field and remote sample observations.
ATTRIBUTEDESCRIPTIONMeanSDMinMax
BAWDBasal area weighted diameter at breast height (cm)27.9914.610.0058.62
QMDQuadratic mean diameter at breast height (cm)23.8111.171.2753.34
BAHBasal area (m2 per ha)18.2812.090.0045.22
TPHTrees per ha3407.586726.090.0041,439.54
Elev_minimumMinimum elevation recorded0.010.000.010.04
Elev_maximumMaximum elevation recorded44.1915.9911.4177.10
Elev_meanMean elevation recorded9.895.440.3820.14
Elev_modeMode elevation recorded1.494.410.0117.79
Elev_stddevStandard deviation of elevation values recorded8.533.800.8015.74
Elev_varianceVariance of elevation values recorded87.2564.140.65247.70
Elev_CVCoefficient of variation of elevation values recorded1.030.410.492.29
Elev_IQInterquartile distance of elevation values recorded12.347.140.2328.52
Elev_skewnessHistogram skewness of elevation values recorded1.161.00(0.13)5.37
Elev_kurtosisHistogram kurtosis of elevation values recorded5.507.261.8445.88
Elev_AADAverage absolute deviation of elevation values recorded6.993.350.4013.81
Elev_MAD_medianMedian of the absolute deviations from the overall median of AAD5.213.710.0614.01
Elev_MAD_modeMedian of the absolute deviations from the overall mode of AAD6.995.630.0820.71
Elev_L1L-moment 19.895.440.3820.14
Elev_L2L-moment 24.592.190.268.97
Elev_L3L-moment 30.970.67(0.18)3.35
Elev_L4L-moment 40.290.30(0.62)1.28
Elev_L_CVL-moment coefficient of variation0.520.130.270.82
Elev_L_skewnessL-moment skewness0.260.18(0.04)0.72
Elev_L_kurtosisL-moment kurtosis0.100.11(0.08)0.49
Elev_P01Elevation values of the 1st percentile0.060.070.010.35
Elev_P05Elevation values of the 5th percentile0.350.620.013.17
Elev_P10Elevation values of the 10th percentile0.781.290.026.20
Elev_P20Elevation values of the 20th percentile2.062.430.049.98
Elev_P25Elevation values of the 25th percentile2.873.050.0411.36
Elev_P30Elevation values of the 30th percentile3.733.670.0512.47
Elev_P40Elevation values of the 40th percentile5.624.900.0715.85
Elev_P50Elevation values of the 50th percentile7.926.020.0920.72
Elev_P60Elevation values of the 60th percentile10.517.220.1225.26
Elev_P70Elevation values of the 70th percentile13.538.170.1931.01
Elev_P75Elevation values of the 75th percentile15.228.650.2833.75
Elev_P80Elevation values of the 80th percentile17.139.160.3536.39
Elev_P90Elevation values of the 90th percentile22.0510.410.7741.91
Elev_P95Elevation values of the 95th percentile25.9611.491.7546.58
Elev_P99Elevation values of the 99th percentile33.0013.383.9856.37
Canopy_relief_ratio((mean elevation − min elevation)/(max elevation − min elevation))0.210.090.030.44
Elev_SQRT_mean_SQGeneralized means for the 2nd power (elevation quadratic mean)13.206.360.8925.56
Elev_CURT_mean_CUBEGeneralized means for the 3rd power (elevation cubic mean)15.606.991.5329.05
NAIP_Band1NAIP1 focal mean 23 × 2351.5422.7422.33107.58
NAIP_Band2NAIP2 focal mean 23 × 2371.5120.5837.15121.52
NAIP_Band3NAIP3 focal mean 23 × 2361.1913.4742.2993.70
NAIP_Band4NAIP4 focal mean 23 × 2375.3132.5212.96140.43
NAIP_Band5NAIP_SD1 focal standard deviation 23 × 2318.937.856.0745.33
NAIP_Band6NAIP_SD2 focal standard deviation 23 × 2319.786.275.3339.10
NAIP_Band7NAIP_SD3 focal standard deviation 23 × 237.544.241.4925.89
NAIP_Band8NAIP_SD4 focal standard deviation 23 × 2330.189.996.3952.53
NAIP_Band9GLCM1 horizontal 23 × 2393.1860.6010.15308.79
NAIP_Band10GLCM2 horizontal 23 × 2396.8243.489.71199.63
NAIP_Band11GLCM3 horizontal 23 × 238.619.830.3363.97
NAIP_Band12GLCM4 horizontal 23 × 23180.40111.016.35483.20
TOPO_Band1ELEVATION focal mean 23 × 232,020.00253.781597.452422.87
TOPO_Band2EASTING focal mean 23 × 23(0.65)0.32(1.00)0.00
TOPO_Band3NORTHING focal mean 23 × 230.580.340.001.00
TOPO_Band4SLOPE focal mean 23 × 2324.668.683.4443.75
Table A2. BAWD paired stand comparisons between models developed from combinations of NAIP, LiDAR, and Topo data sources.
Table A2. BAWD paired stand comparisons between models developed from combinations of NAIP, LiDAR, and Topo data sources.
Compare V1–V2NDifferenceT-Statp-Value
NAIP-LIDAR100−0.119−0.2340.8150
NAIP-NAIP_TOPO1000.6452.0570.0423
NAIP-NAIP_LIDAR100−0.643−1.8420.0684
NAIP-NAIP_TOPO_LIDAR1000.0840.2200.8260
LIDAR-NAIP_TOPO1000.7671.5040.1359
LIDAR-NAIP_LIDAR100−0.521−2.5200.0133
LIDAR-NAIP_TOPO_LIDAR1000.2060.9240.3580
NAIP_TOPO-NAIP_LIDAR100−1.288−3.3150.0013
Table A3. QMD paired comparisons between models developed from combinations of NAIP, LiDAR, and Topo data sources.
Table A3. QMD paired comparisons between models developed from combinations of NAIP, LiDAR, and Topo data sources.
Compare V1–V2NDifferenceT-Statp-Value
NAIP-LIDAR100−0.173−0.4940.6219
NAIP-NAIP_TOPO1000.1550.53180.5960
NAIP-NAIP_LIDAR100−0.249−0.9340.3524
NAIP-NAIP_TOPO_LIDAR1000.2920.98570.3267
LIDAR-NAIP_TOPO1000.3300.85040.3972
LIDAR-NAIP_LIDAR100−0.074−0.4870.6267
LIDAR-NAIP_TOPO_LIDAR1000.4673.13120.0023
NAIP_TOPO-NAIP_LIDAR100−0.404−1.1440.2551
Table A4. BAH paired comparisons between models developed from combinations of NAIP, LiDAR and Topo data sources.
Table A4. BAH paired comparisons between models developed from combinations of NAIP, LiDAR and Topo data sources.
Compare V1–V2NDifferenceT-Statp-Value
NAIP-LIDAR100−2.211−3.420.0009
NAIP-NAIP_TOPO1000.2181.250.2125
NAIP-NAIP_LIDAR100−1.666−3.900.0002
NAIP-NAIP_TOPO_LIDAR100−1.498−3.480.0007
LIDAR-NAIP_TOPO1002.4304.11<0.0001
LIDAR-NAIP_LIDAR1000.5462.240.0275
LIDAR-NAIP_TOPO_LIDAR1000.7142.980.0036
NAIP_TOPO-NAIP_LIDAR100−1.884−4.87<0.0001
Table A5. TPH paired comparisons between models developed from combinations of NAIP, LiDAR and Topo data sources.
Table A5. TPH paired comparisons between models developed from combinations of NAIP, LiDAR and Topo data sources.
Compare V1–V2NDifferenceT-Statp-Value
NAIP-LIDAR1001280.4640.6434
NAIP-NAIP_TOPO1005253.3050.0013
NAIP-NAIP_LIDAR1003842.6210.0102
NAIP-NAIP_TOPO_LIDAR1007173.8930.0002
LIDAR-NAIP_TOPO1003972.3700.0197
LIDAR-NAIP_LIDAR1002561.7870.0770
LIDAR-NAIP_TOPO_LIDAR1005884.532<0.0001
NAIP_TOPO-NAIP_LIDAR100−141−1.4270.1567

References

  1. Sheridan, R.D.; Popescu, S.C.; Gatziolis, D.; Morgan, C.L.S.; Ku, N.W. Modeling forest aboveground biomass and volume using airborne LiDAR metrics and forest inventory and analysis data in the Pacific Northwest. Remote Sens. 2015, 7, 229–255. [Google Scholar] [CrossRef]
  2. Mitchell, B.; Jacokes-Mancini, R.; Fisk, H.; Evans, D. Considerations for Using Lidar Data—A Project Implementation Guide; RSAC-10032-MAN1; USDA Forest Service, Remote Sensing Applications Center: Salt Lake City, UT, USA, 2012; 13p.
  3. McRoberts, R.E.; Naesset, E.; Gobakken, T. Inference for lidar-assisted estimation of forest growing stock volume. Remote Sens. Environ. 2013, 128, 268–275. [Google Scholar] [CrossRef]
  4. Moser, P.; Vibrans, A.C.; McRoberts, R.E.; Naesset, E.; Gobakken, T.; Chirici, G.; Mura, M.; Marchetti, M. Methods for variable selection in LiDAR-assisted forest inventories. Forestry 2017, 90, 112–124. [Google Scholar] [CrossRef]
  5. USDA Farm Service Agency. Four Band Digital Imagery Information Sheet. 2016. Available online: https://www.fsa.usda.gov/Assets/USDA-FSA-Public/usdafiles/APFO/support-documents/pdfs/fourband_infosheet_2016.pdf (accessed on 14 February 2017).
  6. Hogland, J.; Anderson, N.; Wells, L.; Chung, W. Estimating forest characteristics using NAIP imagery and ArcObjects. In Proceedings of the 2014 ESRI International Users Conference, San Diego, CA, USA, 9–13 August 2004. [Google Scholar]
  7. Hogland, J.; Anderson, N. Estimating FIA plot characteristics using NAIP imagery, function modeling, and the RMRS Raster Utility coding library. In Pushing Boundaries: New Directions in Inventory Techniques and Applications: Forest Inventory and Analysis (FIA) Symposium 2015, Portland, Oregon, 8–10 December 2015; Stanton, S.M., Christensen, G.A., Eds.; General Technical Report PNW-GTR-931; U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station: Portland, OR, USA, 2015; pp. 340–344. [Google Scholar]
  8. Hogland, J.; Anderson, N. Function modeling improves the efficiency of spatial modeling using big data from remote sensing. Big Data Cogn. Comput. 2017, 1, 3. [Google Scholar] [CrossRef]
  9. Hogland, J.; Anderson, N.; St. Peters, J.; Drake, J.; Medley, P. Mapping Forest Characteristics at Fine Resolution Across Large Landscapes of the Southeastern United States Using NAIP Imagery and FIA Field Plot Data. Int. J. Geo-Inf. 2018, 7, 140. [Google Scholar] [CrossRef]
  10. Haralick, R.; Shanmugam, K.; Dinstein, J. Texture features for image classification. IEEE Trans. Syst. Man Cybern. 1973, 3, 610–621. [Google Scholar] [CrossRef]
  11. Gesch, D.; Oimoen, M.; Geenlee, S.; Nelson, C.; Steuck, M.; Tyler, D. The National Elevation Dataset. Photogramm. Eng. Remote Sens. 2002, 68, 5–11. [Google Scholar]
  12. Aerial Photography Field Office (APFO). Imagery Programs. 2018. Available online: https://www.fsa.usda.gov/programs-and-services/aerial-photography/imagery-programs/index (accessed on 17 December 2018).
  13. Sugarbaker, L.; Eldridge, D.; Jason, A.; Lukas, V.; Saghy, D.; Stoker, J.; Thunen, D. Status of the 3D Elevation Program; U.S. Geological Survey Open-File Report 2016–1196; U.S. Geological Survey: Reston, VA, USA, 2015; 13p.
  14. Akaike, H. Information theory and an extension of the maximum likelihood principle. In Proceedings of the 2nd International Symposium on Information Theory; Petrov, B.N., Csaki, F., Eds.; Akadémiai Kiadó: Budapest, Hungary, 1973; pp. 267–281. [Google Scholar]
  15. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723. [Google Scholar] [CrossRef]
  16. Thornton, P.E.; Thornton, M.M.; Mayer, B.W.; Wei, Y.; Devarakonda, R.; Vose, R.S.; Cook, R.B. Daymet: Daily Surface Weather Data on a 1-km Grid for North America, Version 3; ORNL DAAC: Oak Ridge, TN, USA, 2013. Available online: http://dx.doi.org/10.3334/ORNLDAAC/1328 (accessed on 25 January 2017).
  17. Burns, R.; Honkala, B. Silvics of North America: 1. Conifers; 2. Hardwoods; Agriculture Handbook 654, Tech. Coords; U.S. Department of Agriculture, Forest Service: Washington, DC, USA, 1990; Volume 2, 877p.
  18. Freese, F. Elementary Forest Sampling; Agriculture Handbook No. 232; USDA Southern Forest Experiment Station, Forest Service: Knoxville, TN, USA, 1973. Available online: https://www.fs.fed.us/fmsc/ftp/measure/cruising/other/docs/AgHbk232.pdf (accessed on 25 January 2017).
  19. USDA Forest Service. Region One Vegetation Classification, Mapping, Inventory and Analysis Report 16-5 v6.0. Region 1 Common Stand Exam and Inventory and Monitoring Field Guide. 2016. Available online: http://fsweb.r1.fs.fed.us/forest/inv/cse_exams/R1_CSE_IM_FG_v2016.zip (accessed on 25 January 2017).
  20. Holden, Z.A.; USDA Forest Service. Region 1 Regional Office, Forest and Rangeland Management, 26 Fort Missoula Rd. Missoula, MT 59804, USA. Personal communications, 2016.
  21. USDA Farm Service Agency. National Agriculture Imagery Program (NAIP) Information Sheet. 2015. Available online: https://www.fsa.usda.gov/Assets/USDA-FSA-Public/usdafiles/APFO/ naip_info_sheet_2015.pdf (accessed on 25 January 2017).
  22. USDI Geological Survey. File Transfer Protocol. Staged NAIP, 2018. Available online: ftp://rockyftp.cr.usgs.gov/vdelivery/Datasets/Staged/NAIP/ (accessed on 9 March 2018).
  23. USDI Geological Survey. File Transfer Protocol. Staged NED, 2018. Available online: ftp://rockyftp.cr.usgs.gov/vdelivery/Datasets/Staged/NED/ (accessed on 9 March 2018).
  24. Watershed Sciences. LiDAR Remote Sensing Collection: USFS Bitterroot National Forest, Montana; Delivery 2 October 29, 2010; Watershed Science: Corvallis, OR, USA, 2010; Available online: https://quantumspatial.com/ (accessed on 8 January 2018).
  25. McGaughey, R.J. FUSION/LDV: Software for LIDAR Data Analysis and Visualization; USDA Forest Service, Pacific Northwest Research Station: Washington, DC, USA, 2016. Available online: http://forsys.cfr.washington.edu/fusion/ FUSION_manual.pdf (accessed on 25 January 2017).
  26. Souza, C. A Tutorial on Principal Component Analysis with Accord.NET Framework; Technical Report; Department of Computing, Federal University of Sao Carlos: São Carlos, Brazil, 2012; Available online: http://arxiv.org/ftp/arxiv/papers/1210/1210.7463.pdf (accessed on 27 September 2013).
  27. Breiman, L. 2001. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  28. Liaw, A.; Wiener, M. Classification and regression by randomForest. R News 2002, 2, 18–22. Available online: http://CRAN.R-project.org/doc/Rnews/ (accessed on 25 January 2017).
  29. Liaw, A. Documentation for R Package randomForest. 2015. Available online: https://cran.r-project.org/web/packages/randomForest/randomForest.pdf (accessed on 25 January 2017).
  30. USDA Forest Service. Geospatial Data: Northern Region Existing Vegetation Mapping Program. 2016. Available online: https://www.fs.usda.gov/detailfull/r1/landmanagement/gis/?cid=stelprdb5331054& width=full (accessed on 25 January 2017).
  31. Barber, J.; Brown, S.R.; Ahl, R. Mid-Level and Base-Level Databases of the R1 Existing Vegetation (VMap) Products; Region One Vegetation Classification, Mapping, Inventory and Analysis Report; USDA Forest Service: Washington, DC, USA, 2016; pp. 12–38. Available online: https://www.fs.usda.gov/Internet/FSE_DOCUMENTS/ stelprdb5364531.pdf (accessed on 25 January 2017).
  32. Ahl, R.; Brown, S. The Bitterroot and Lolo National Forests Region 1 Existing Vegetation Database (VMap) Revision of 2016; Northern Region Geospatial Group Project Report NRGG-PR-BITLO-VMAP2016; USDA Forest Service: Washington, DC, USA, 2017. Available online: https://www.fs.usda.gov/Internet/FSE_DOCUMENTS/ fseprd573432.pdf (accessed on 5 March 2017).
  33. Avery, T.; Burkhart, H. Forest Measurements, 4th ed.; McGraw Hill: Boston, MA, USA, 1994; p. 408. [Google Scholar]
  34. Smith, D.; Larson, B.; Kelty, M.; Ashton, P. The Practice of Silviculture Applied Forest Ecology, 9th ed.; John Wily & Sons, Inc.: New York, NY, USA, 1997; p. 537. [Google Scholar]
  35. Bush, R. Overview R1 Summary Database; Region One Vegetation Classification, Mapping, Inventory and Analysis Report 14-16 v2.0; USDA Forest Service: Washington, DC, USA, 2014. Available online: http://fsweb.r1.fs.fed.us/forest/inv/r1_tools/Overview_R1_SDB.pdf (accessed on 25 January 2017).
  36. Bush, R.; Reyes, R.; Zeiler, J.D. R1 Summary Database Reports and Utilities User’s Guide; Region One Vegetation Classification, Mapping, Inventory and Analysis Report 16-2 v3.2; USDA Forest Service: Washington, DC, USA, 2016. Available online: http://fsweb.r1.fs.fed.us/forest/inv/r1_tools/R1_SDB_Rep_Util.pdf (accessed on 25 January 2017).
  37. Mitchell, B.; Walterman, M.; Mellin, T.; Wilcox, C.; Lynch, A.M.; Anhold, J.; Falk, D.A.; Koprowski, J.; Laes, D.; Evans, D.; et al. Mapping Vegetation Structure in the Pinaleno Mountains Using Lidar-Phase 3: Forest Inventory Modeling; RSAC-10007-RPT; USDA Forest Service, Remote Sensing Applications Center: Salt Lake City, UT, USA, 2012; 17p.
  38. Mundt, J.T.; Streutker, D.R.; Glenn, N.F. Mapping sagebrush distribution using fusion of hyperspectral and lidar classifications. Photogramm. Eng. Remote Sens. 2006, 72, 47–54. [Google Scholar] [CrossRef]
Figure 1. Daly-Gold Study Area in the Sapphire Mountains of the Bitterroot National Forest in western Montana, USA is shown. The yellow linear features describe the National Forest Systems administrative boundary and the land within it, and the yellow point features illustrate the location of plots and associated field measurements used in this study.
Figure 1. Daly-Gold Study Area in the Sapphire Mountains of the Bitterroot National Forest in western Montana, USA is shown. The yellow linear features describe the National Forest Systems administrative boundary and the land within it, and the yellow point features illustrate the location of plots and associated field measurements used in this study.
Ijgi 08 00024 g001
Figure 2. Composition and structure of common stand exam plot design, with outer circle (orange) representing a 0.04 hectare plot for trees greater than or equal to 12.7 cm diameter at breast height (radius = 11.4 m) and inner circle (yellow) representing a 0.0013 ha plot for trees less than 12.7 cm DBH (radius = 2 m). The outer square represents the area summarized for the NAIP imagery and LiDAR data.
Figure 2. Composition and structure of common stand exam plot design, with outer circle (orange) representing a 0.04 hectare plot for trees greater than or equal to 12.7 cm diameter at breast height (radius = 11.4 m) and inner circle (yellow) representing a 0.0013 ha plot for trees less than 12.7 cm DBH (radius = 2 m). The outer square represents the area summarized for the NAIP imagery and LiDAR data.
Ijgi 08 00024 g002
Figure 3. Flow diagram of the data sources, processing steps, modeling, and comparisons made within the study. NAIP, LiDAR, and TOPO data were converted to texture, height, and elevation metrics and transformed to spatial surfaces using principal component analysis (PCA). PCA surface values were then related to field data based on plot locations (Overlay) and were used to model basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH) and trees per hectare (TPH) relationships. Modeled relationships were then used to create raster surfaces that were compared at the spatial resolution of the plot and stand.
Figure 3. Flow diagram of the data sources, processing steps, modeling, and comparisons made within the study. NAIP, LiDAR, and TOPO data were converted to texture, height, and elevation metrics and transformed to spatial surfaces using principal component analysis (PCA). PCA surface values were then related to field data based on plot locations (Overlay) and were used to model basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH) and trees per hectare (TPH) relationships. Modeled relationships were then used to create raster surfaces that were compared at the spatial resolution of the plot and stand.
Ijgi 08 00024 g003
Figure 4. Daly-Gold Study Area lifeform classification with randomly selected forest polygons highlighted in black. Summarized metric predictions were compared in these polygons.
Figure 4. Daly-Gold Study Area lifeform classification with randomly selected forest polygons highlighted in black. Summarized metric predictions were compared in these polygons.
Ijgi 08 00024 g004
Figure 5. Distribution of basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH) collected on 60 plots within the study area.
Figure 5. Distribution of basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH) collected on 60 plots within the study area.
Ijgi 08 00024 g005
Figure 6. Principal component graph, where LiDAR is solid black, NAIP is dashed, and TOPO is dotted. The solid horizontal line indicates the minimum 95% threshold of the cumulative variation that is explained by the number of principal components.
Figure 6. Principal component graph, where LiDAR is solid black, NAIP is dashed, and TOPO is dotted. The solid horizontal line indicates the minimum 95% threshold of the cumulative variation that is explained by the number of principal components.
Ijgi 08 00024 g006
Figure 7. Illustration of TOPO, NAIP, and LiDAR principal component raster datasets. Note that only the first three components are displayed by the red-green-blue-color composite. The proportion of correlation explained (λ) by those components is display above each raster dataset.
Figure 7. Illustration of TOPO, NAIP, and LiDAR principal component raster datasets. Note that only the first three components are displayed by the red-green-blue-color composite. The proportion of correlation explained (λ) by those components is display above each raster dataset.
Ijgi 08 00024 g007
Figure 8. Linear relationship between basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH) estimates derived from NAIP and LiDAR models (Table 1). Observed versus predicted values for NAIP and LiDAR based BAWD, QMD, BAH, and TPH are given in Figure A2.
Figure 8. Linear relationship between basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH) estimates derived from NAIP and LiDAR models (Table 1). Observed versus predicted values for NAIP and LiDAR based BAWD, QMD, BAH, and TPH are given in Figure A2.
Ijgi 08 00024 g008
Figure 9. Variation in out of bag root mean squared error (RMSE) for 10 basal area weighted diameter Random Forest models. Results for quadratic mean diameter, basal area per hectare, and trees per hectare are given in Figure A3.
Figure 9. Variation in out of bag root mean squared error (RMSE) for 10 basal area weighted diameter Random Forest models. Results for quadratic mean diameter, basal area per hectare, and trees per hectare are given in Figure A3.
Ijgi 08 00024 g009
Table 1. Description for baseline datasets.
Table 1. Description for baseline datasets.
SourceDescriptionTiles/Plots
NAIPFour band aerial imagery: 1 m, color infrared [12,21]19
LiDARPoint cloud elevation data ≥6 points per m2 [24]752
TOPOUSGS 10 m national elevation dataset [23]8
PlotsField measured height, DBH, and count [18,19]60
Table 2. Out of bag model fit, comparison, and rank for basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH), where n = the number of observations, k = the number of randomly selected variables, RMSE = root mean squared error, MSE = mean squared error, AIC = Akaike Information Criterion, and ∆AIC = departure from minimum AIC score. * denotes best fitting model with the fewest number of predictor variables.
Table 2. Out of bag model fit, comparison, and rank for basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH), where n = the number of observations, k = the number of randomly selected variables, RMSE = root mean squared error, MSE = mean squared error, AIC = Akaike Information Criterion, and ∆AIC = departure from minimum AIC score. * denotes best fitting model with the fewest number of predictor variables.
ResponseModelVariablesnkRMSEMSEAICΔ AIC
1NAIP *60512.71613150
2LiDAR60513.21743205
BAWD3NAIP TOPO60812.21493161
4NAIP LiDAR601011.71373150
5NAIP TOPO LiDAR601311.71373216
1NAIP60510.21032883
2LiDAR60511.713730520
QMD3NAIP TOPO *6089.4882850
4NAIP LiDAR60109.99829510
5NAIP TOPO LiDAR60139.79329813
1NAIP60511.212730027
2LiDAR *6059.0812730
BAH3NAIP TOPO60811.513130935
4NAIP LiDAR60108.6752795
5NAIP TOPO LiDAR60139.18329118
1NAIP *6056936.248,111,45210710
2LiDAR6057015.349,214,65310731
TPH3NAIP TOPO6086891.847,496,39710775
4NAIP LiDAR60106990.648,868,559108211
5NAIP TOPO LiDAR60136815.246,446,411108514
Table 3. NAIP vs LiDAR Paired t-test results for 100 summarized stand level estimates (n) of basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH). Additional paired comparisons between models developed from combinations of NAIP, LiDAR and Topo data sources can be found in Table A2, Table A3, Table A4 and Table A5.
Table 3. NAIP vs LiDAR Paired t-test results for 100 summarized stand level estimates (n) of basal area weighted diameter (BAWD), quadratic mean diameter (QMD), basal area per hectare (BAH), and trees per hectare (TPH). Additional paired comparisons between models developed from combinations of NAIP, LiDAR and Topo data sources can be found in Table A2, Table A3, Table A4 and Table A5.
CharacteristicnDifferenceT-Statp-Value
BAWD100−0.119−0.2340.8150
QMD100−0.173−0.4940.6219
BAH100−2.211−3.420.0009
TPH1001280.4640.6434

Share and Cite

MDPI and ACS Style

Ahl, R.; Hogland, J.; Brown, S. A Comparison of Standard Modeling Techniques Using Digital Aerial Imagery with National Elevation Datasets and Airborne LiDAR to Predict Size and Density Forest Metrics in the Sapphire Mountains MT, USA. ISPRS Int. J. Geo-Inf. 2019, 8, 24. https://doi.org/10.3390/ijgi8010024

AMA Style

Ahl R, Hogland J, Brown S. A Comparison of Standard Modeling Techniques Using Digital Aerial Imagery with National Elevation Datasets and Airborne LiDAR to Predict Size and Density Forest Metrics in the Sapphire Mountains MT, USA. ISPRS International Journal of Geo-Information. 2019; 8(1):24. https://doi.org/10.3390/ijgi8010024

Chicago/Turabian Style

Ahl, Robert, John Hogland, and Steve Brown. 2019. "A Comparison of Standard Modeling Techniques Using Digital Aerial Imagery with National Elevation Datasets and Airborne LiDAR to Predict Size and Density Forest Metrics in the Sapphire Mountains MT, USA" ISPRS International Journal of Geo-Information 8, no. 1: 24. https://doi.org/10.3390/ijgi8010024

APA Style

Ahl, R., Hogland, J., & Brown, S. (2019). A Comparison of Standard Modeling Techniques Using Digital Aerial Imagery with National Elevation Datasets and Airborne LiDAR to Predict Size and Density Forest Metrics in the Sapphire Mountains MT, USA. ISPRS International Journal of Geo-Information, 8(1), 24. https://doi.org/10.3390/ijgi8010024

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop