The technique is based on the hypothesis that the distribution and scale of errors within a DEM are at least partly related to characteristics of the terrain. It involves the collection of high-accuracy elevation measurements to compute DEM error, the generation of a set of terrain parameters to characterise the terrain, and the development of regression models to define the relationship between DEM error and terrain character. The regression models form the basis for creating an RMSE surface to portray DEM error. These error surfaces provide more detailed information about DEM error than a single global estimate of RMSE, and an initial assessment indicates that they are of sufficient quality for use in stochastic simulations of the impact of DEM error on spatial modelling applications. Error surfaces also have the potential to open the door to a more deterministic approach towards the incorporation of uncertainty into spatial modelling by means of probabilistic modelling techniques.
Assessment of DEM quality is commonly restricted to reporting a Root Mean Square Error (RMSE) value. For example, in the UK, the Ordnance Survey’s digital contour data has a quoted accuracy of +/- 1.0m to 1.8m RMSE (REF). The United States Geological Survey (USGS) describes the accuracy of its 7.5 minute DEMs with one RMSE value for each quadrangle or tile. This RMSE value is based on the difference between DEM elevation and the elevation of test points measured by field survey or aerotriangulation, or from a spot height or point on a contour line from an existing source map (USGS, 1997). The RMSE for each quadrangle is calculated from 28 test points. Such error estimates have three limitations:
This paper describes the development of a spatially distributed model of DEM error (an error surface), which could be used in probabilistic modelling techniques. The technique is based on the hypothesis that the errors in a DEM are at least partly related to the nature of the terrain. Identifying the relationship between error and terrain parameters, such as slope, curvature or relative relief, allows one to create an error surface.
The primary study area for this research is a 2km x 1.3km region of Snowdonia, North Wales, UK. Ordnance Survey 1:10,000 scale Land-Form PROFILE digital contours with a vertical interval of 10m have been used to generate DEMs with a 1m horizontal resolution. The key stages of the Snowdonia research have been reapplied to a 23.5km x 18.1km region of Mestersvig, northeast Greenland, where 50m contours manually digitised from 1:15,000 scale Mylar contour maps derived from aerial photography have been used to generate DEMs with a 10m horizontal resolution. The Greenland study area is used to test whether the findings in Snowdonia apply to other mountain regions and to different scales of source data and DEMs.
GPS survey techniques have been used to measure true elevation to an accuracy of +/- 0.9m RMSE. A total of 106 points have been surveyed in the Snowdonia study area and 103 points in the Greenland study area.
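The error measures at the survey points can be sketched as follows. This is a minimal illustration rather than the procedure actually used: it assumes that signed error is DEM elevation minus GPS elevation (the sign convention is an assumption), and the function names and sample elevations are hypothetical.

```python
import math

def signed_errors(dem_elevations, gps_elevations):
    """Signed error at each survey point: DEM elevation minus GPS elevation.

    The sign convention is an assumption made for this sketch.
    """
    return [dem - gps for dem, gps in zip(dem_elevations, gps_elevations)]

def rmse(errors):
    """Root mean square error of a list of signed errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical elevations (metres) at three survey points.
dem = [101.0, 98.0, 105.0]
gps = [100.0, 100.0, 100.0]
errors = signed_errors(dem, gps)
print(errors, rmse(errors))
```

A single RMSE of this kind summarises the whole set of points and carries no information about where in the DEM the larger errors occur.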
1. Elevation.
2. Slope angle (or gradient).
3. Plan curvature.
4. Profile curvature.
5. Relative relief: The range of elevation values of all grid cells within a 10-cell radius of the grid cell concerned.
6. Texture: A measure of the ruggedness of the terrain calculated as the range of slope values of all grid cells within a 10-cell radius of the grid cell concerned.
7. Mean extremity: The elevation of a grid cell minus the mean elevation of all grid cells within a 10-cell radius of that grid cell. Indicates the elevation of the grid cell relative to its neighbours.
8. Minimum extremity: The elevation of a grid cell minus the lowest elevation of all grid cells within a 10-cell radius of that grid cell. A value of near zero would indicate that the grid cell is in a pit.
9. Maximum extremity: The elevation of a grid cell minus the highest elevation of all grid cells within a 10-cell radius of that grid cell. A value of near zero would indicate that the grid cell is on a peak.
10. Standard deviation: The standard deviation of elevation values of all grid cells within a 10-cell radius of the grid cell concerned.
11. Point distance: The distance between a grid cell and the nearest of the contour vertices from which the DEM was interpolated.
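The neighbourhood-based parameters in the list above can be sketched in a few lines. The sketch assumes that the DEM is a list of rows of elevations, that the circular neighbourhood includes the target cell itself, that cells falling outside the grid are ignored, and that the population standard deviation is intended; none of these details is stated in the definitions, and the function names are hypothetical.

```python
import statistics

def window_values(dem, row, col, radius):
    """Elevations of all cells within `radius` cells of (row, col).

    The target cell is included and the window is clipped at the grid edge
    (both are assumptions of this sketch).
    """
    rows, cols = len(dem), len(dem[0])
    values = []
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            if (r - row) ** 2 + (c - col) ** 2 <= radius ** 2:
                values.append(dem[r][c])
    return values

def neighbourhood_parameters(dem, row, col, radius=10):
    """Relative relief, extremities and standard deviation for one cell."""
    values = window_values(dem, row, col, radius)
    z = dem[row][col]
    return {
        "relative_relief": max(values) - min(values),
        "mean_extremity": z - statistics.mean(values),
        "minimum_extremity": z - min(values),
        "maximum_extremity": z - max(values),
        "standard_deviation": statistics.pstdev(values),
    }

# Hypothetical 3 x 3 DEM (metres), evaluated at the centre cell with a 1-cell radius.
dem = [[10.0, 12.0, 14.0],
       [11.0, 15.0, 13.0],
       [9.0, 16.0, 20.0]]
params = neighbourhood_parameters(dem, 1, 1, radius=1)
print(params)
```

Texture could be obtained in the same way by passing a slope grid instead of the elevation grid and taking the range of the window values.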
Table 4.1.1. Summary of Snowdon correlations between terrain parameters and DEM error.
The correlations indicate, first, that the relationship is stronger for mean error difference than signed error and, second, that the relationship is strongest for the spline with tension DEM. However, although 5 out of 11 correlation coefficients for the mean error difference of SpTen12 are significant, the correlations are only weak or moderately strong.
Table 4.2.1. Summary of further correlations between terrain parameters and DEM error.
Table 4.2.1 illustrates again that the SpTen12 DEM gives the strongest correlations and the greatest number of significant correlations. Also there are a far greater number of significant correlations and stronger correlations between the terrain parameters and the mean error difference. Although stronger correlations have been found than for the initial 11 terrain parameters, they are still only moderately strong. However, there is clear evidence of a relationship between DEM error and the characteristics of the terrain, which could be defined through regression modelling.
Table 4.3.1. Coefficients for multiple linear regression using all 96 terrain parameters.
The R² value is the proportion of the variance in the dependent variable Y, in this case elevation error, that is accounted for by the regression equation, and so indicates how well the equation can be expected to predict elevation error at unsampled locations. The more variables that are used in the multiple regression, the more likely it is that the independent variables (the terrain parameters) can be made to fit the dependent variable (elevation error) within the scope of the sample. When fitting 96 independent variables to 100+ observations it is highly likely that the R² value is overly optimistic. The adjusted R² value takes account of the number of variables used and gives a more realistic indication of how well the calculated regression equation will predict unknown elevation errors. It is this adjusted R² value that should be used to judge the effectiveness of the regressions performed.
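The effect of the number of variables on the adjusted value can be illustrated with the standard adjustment formula, 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of independent variables. That this is the exact formula applied by the statistical software used here is an assumption, and the R² value of 0.95 below is hypothetical.

```python
def adjusted_r2(r2, n_observations, n_predictors):
    """Standard adjusted R-squared; assumes n_observations > n_predictors + 1."""
    penalty = (n_observations - 1) / (n_observations - n_predictors - 1)
    return 1.0 - (1.0 - r2) * penalty

# Hypothetical R-squared of 0.95 fitted to 106 survey points.
for n_predictors in (20, 96):
    print(n_predictors, round(adjusted_r2(0.95, 106, n_predictors), 3))
```

With 106 observations the same raw R² is penalised far more heavily for 96 variables (adjusted value of roughly 0.42) than for 20 variables (roughly 0.94), which is why the raw value is a poor guide when the number of variables approaches the number of survey points.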
The adjusted R² values for Snowdon’s SpTen12 DEM indicate that attempting to predict the elevation error throughout the DEM is potentially viable. This is also true of Greenland’s SpTen12 DEM. The values for the other two DEMs are not so promising.
In order to predict elevation error for an entire DEM, a raster image is required for each of the terrain parameters used in the multiple regression modelling. Creating 96 images and performing the map algebra required to replicate the regression equation as an error surface image is not practical. It is also inefficient, as many of the terrain parameters have little influence on the regression model. The next step in modelling elevation error is therefore to find the best regression equation using only a limited number of terrain parameters. This has been achieved using a stepwise regression procedure, adding or removing one variable to the regression at each step. Figure 4.3.1 plots the number of variables used against the corresponding adjusted R² values for Snowdon’s SpTen12 DEM. The graph shows that there is little increase in the adjusted R² values beyond 20 variables. Generating an error surface from 20 images is computationally feasible. Therefore the regression equations, and subsequently the error surfaces, have been generated from the best 20 variables.
Figure 4.3.1. Graph of adjusted R² plotted against number of variables.
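The idea of adding variables one at a time and tracking the adjusted R² can be sketched as below. This is a simplified forward-only selection, not the stepwise procedure of the statistical package used in this research (which can also remove variables); the least-squares solver, function names, and the three candidate variables with their values are all hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(columns, y):
    """R-squared of an ordinary least-squares fit with an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(design[r][i] * y[r] for r in range(n)) for i in range(k)]
    beta = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def adjusted_r2(r2, n, p):
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def forward_stepwise(candidates, y, max_vars):
    """Add, at each step, the variable giving the highest R-squared.

    Returns a list of (variable name, adjusted R-squared after adding it).
    """
    chosen, history = [], []
    remaining = list(candidates)
    while remaining and len(chosen) < max_vars:
        scores = {name: r_squared([candidates[v] for v in chosen + [name]], y)
                  for name in remaining}
        best = max(scores, key=scores.get)
        chosen.append(best)
        remaining.remove(best)
        history.append((best, adjusted_r2(scores[best], len(y), len(chosen))))
    return history

# Hypothetical data: the error depends on a and b; c is unrelated.
a = [float(i) for i in range(1, 13)]
b = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]
c = [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0, 2.0, 8.0, 4.0, 5.0]
y = [2 * ai - 3 * bi + 0.1 * (-1) ** i for i, (ai, bi) in enumerate(zip(a, b))]
history = forward_stepwise({"a": a, "b": b, "c": c}, y, max_vars=3)
print(history)
```

Plotting the second element of each tuple against the step number gives a curve of the same kind as the figure above, from which a cut-off on the number of variables can be read.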
Table 4.3.2 gives the regression coefficients for the 20 variables selected by stepwise regression modelling. As with the multiple regression using all 96 terrain parameters, the terrain parameters derived from SpTen12 show a markedly better goodness-of-fit with the error values. Further work therefore concentrates solely on the SpTen12 DEMs.
Table 4.3.2. Coefficients for stepwise regression using 20 variables.
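Replicating a regression equation as an error surface amounts to map algebra over the terrain parameter rasters: each output cell is the intercept plus the sum of coefficient times parameter value. The sketch below assumes that rasters are lists of rows with identical dimensions; the parameter names, coefficients and cell values are hypothetical and are not taken from the tables.

```python
def apply_regression(intercept, coefficients, rasters):
    """Evaluate a linear regression equation cell by cell.

    coefficients: mapping of parameter name -> regression coefficient.
    rasters: mapping of parameter name -> grid (list of rows) of that parameter.
    """
    first = rasters[next(iter(coefficients))]
    rows, cols = len(first), len(first[0])
    surface = [[intercept] * cols for _ in range(rows)]
    for name, coefficient in coefficients.items():
        grid = rasters[name]
        for r in range(rows):
            for c in range(cols):
                surface[r][c] += coefficient * grid[r][c]
    return surface

# Hypothetical two-parameter equation on a 2 x 2 grid.
rasters = {
    "slope": [[10.0, 20.0], [30.0, 40.0]],
    "relative_relief": [[5.0, 10.0], [15.0, 20.0]],
}
coefficients = {"slope": 0.5, "relative_relief": -0.25}
error_surface = apply_regression(1.0, coefficients, rasters)
print(error_surface)
```

Because the equation is applied to every cell, including terrain unlike any surveyed location, nothing in this step prevents extreme or physically invalid predictions.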
Table 4.4.1. Error surfaces’ summary statistics.
The error surfaces’ summary statistics would seem to indicate that the regression equations perform poorly when extrapolated to a whole study area. Greenland’s signed error surface has a maximum error more than 20 times as high as Mount Everest! Both mean error difference surfaces contain negative values, which are invalid. All four error surfaces are characterised by a high degree of short-range variability.
To reduce the short-range variability of the error surfaces and suppress the extreme values, a smoothing mean filter has been applied, using a circular window of 20-cell radius. Additionally, the intention of this research is not to produce an exact cell-by-cell estimate of elevation error, but to produce a spatially distributed model of RMSE that improves on a single global RMSE value. RMSE has therefore been calculated for each cell and its 20-cell radius circular neighbourhood. Summary statistics for the resulting RMSE surfaces are given in Table 4.4.2.
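The two focal operations described above can be sketched as follows. The sketch assumes that the RMSE is computed from the mean-filtered surface, that the window includes the target cell, and that windows are clipped at the grid edge; the text does not specify these details, and the function names are hypothetical.

```python
import math

def circular_offsets(radius):
    """Row/column offsets of all cells within `radius` cells of a target cell."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if dr * dr + dc * dc <= radius * radius]

def focal(grid, radius, reducer):
    """Apply `reducer` to the circular neighbourhood of every cell."""
    rows, cols = len(grid), len(grid[0])
    offsets = circular_offsets(radius)
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [grid[r + dr][c + dc] for dr, dc in offsets
                      if 0 <= r + dr < rows and 0 <= c + dc < cols]
            out_row.append(reducer(values))
        result.append(out_row)
    return result

def mean(values):
    return sum(values) / len(values)

def root_mean_square(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

def rmse_surface(error_surface, radius=20):
    """Mean-filter the predicted errors, then take the local RMSE (assumed order)."""
    smoothed = focal(error_surface, radius, mean)
    return focal(smoothed, radius, root_mean_square)

# Hypothetical 3 x 3 error surface with a single extreme value, 1-cell radius.
spike = [[0.0, 0.0, 0.0],
         [0.0, 5.0, 0.0],
         [0.0, 0.0, 0.0]]
smoothed = focal(spike, 1, mean)
print(smoothed)
print(rmse_surface(spike, radius=1))
```

Squaring inside the RMSE step also removes the sign of the predicted error, so negative predictions no longer appear as negative values in the final surface.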
Table 4.4.2. RMSE surfaces’ summary statistics.
The process of filtering and creating an RMSE surface removes the most extreme cell values, although high maximum values remain. However, in Snowdon the proportion of cells with an RMSE greater than 25m is 1.93% for the RMSE surface derived from the signed error values and 1.15% for the RMSE surface derived from mean error difference values. For Greenland the proportion of cells with an RMSE greater than 250m is 0.22% for the RMSE surface derived from the signed error values and 2.08% for the RMSE surface derived from mean error difference values. Histograms for the RMSE surface values are given in Figure 4.4.1 to Figure 4.4.4.
Figure 4.4.1. Histogram of RMSE values derived from signed error for Snowdon.
Figure 4.4.2. Histogram of RMSE values derived from mean error difference for Snowdon.
Figure 4.4.3. Histogram of RMSE values derived from signed error for Greenland.
Figure 4.4.4. Histogram of RMSE values derived from mean error difference for Greenland.
Figure 4.4.5 and Figure 4.4.6 show orthographic perspective views of Snowdon’s RMSE surface derived from mean error difference and Greenland’s RMSE surface derived from signed error draped over the corresponding DEMs.
Figure 4.4.5. Orthographic perspective view of Snowdon’s RMSE surface derived from mean error difference draped over the SpTen12 DEM.
Figure 4.4.6. Orthographic perspective view of Greenland’s RMSE surface derived from signed error draped over the SpTen12 DEM.
There are also differences in terms of the specific terrain parameters used in the stepwise regression modelling. Table 5.0.1 shows the number of occurrences of each type of terrain parameter used as variables in the regression equations for the SpTen12 DEMs.
Table 5.0.1. Number of occurrences of terrain parameters in stepwise regressions.
The terrain parameters used in the regression equations vary between Snowdon and Greenland and between signed error and mean error difference. Point distance is used least frequently and only in the Snowdon equations. Profile curvature, slope and mean extremity are used most frequently. However, it is evident that the most “useful” types of terrain parameter vary depending both on the type of error and the location being modelled. No universal regression equation is apparent. It would seem that the factors influencing the distribution and scale of DEM errors are specific both to the nature of the terrain being modelled and to the way in which the original elevation data have been captured.
Table 4.4.1 indicates that the initial error surfaces derived from the regression equations are of limited quality, and this is to be expected. First, the regression equations are based on only a limited number of GPS survey points. Although these points represent a variety of terrain characteristics, only accessible locations can be surveyed, so the steepest and most rugged terrain is not represented. Consequently, the extreme and invalid error values are found in the steepest and most rugged areas. Second, the adjusted R² values of about 0.8 to 0.89 indicate a reasonable, but not great, goodness-of-fit, so some poor predictions of error and extreme values within the error surfaces are to be expected.
Table 4.4.2 shows that mean filtering and calculation of RMSE values suppress the problems of the initial error surfaces. However, the resulting RMSE surfaces are only approximations of the variation in error over the study areas, rather than the exact predictions attempted by the initial error surfaces. An approximation of the error is realistically the best that can be expected, for three reasons. First, DEM error has been measured at only a limited number of survey points. Although the survey points were selected to give the best possible representation of the variety of terrain characteristics present in each study area, the most inaccessible areas could not be included. Second, the terrain parameters are derived from DEMs containing error and are therefore subject to error themselves, so it is not possible to define an exact relationship between DEM error and terrain parameters. Third, while a large number of terrain parameters have been derived from the DEMs, these parameters may not be the optimum for characterising the terrain and its relationship with DEM error. In particular, radii of 5, 10 and 20 cells may not be the best choice. It may be beneficial to employ a more computer-intensive approach, in which terrain parameters are calculated over a greater number of radii.
The quality of the RMSE surfaces in terms of the summary statistics in Table 4.4.2 and the histograms in Figure 4.4.1 to Figure 4.4.4 reflects the adjusted R² values of the corresponding regression equations. For Snowdon, the regression with mean error difference has the higher adjusted R² value (0.886 compared to 0.834 for signed error) and the resulting RMSE surface has lower minimum, maximum and mean values and a lower percentage of cells greater than 25m. For Greenland, the regression with signed error has the higher adjusted R² value (0.801 compared to 0.797 for mean error difference) and the resulting RMSE surface has lower minimum and mean values and a lower percentage of cells greater than 250m.
The orthographic views of Figure 4.4.5 and Figure 4.4.6 indicate that the RMSE surfaces appear “sensible”. Although the distribution of error values differs between Snowdon and Greenland, in both locations the relationship between DEM error and terrain characteristics can be discerned. In Snowdon the highest values coincide with the steepest slopes, which are near-vertical rock outcrops or cliff faces. In the case of Greenland, the highest values tend to lie along the ridges. It is probable that the differences in distribution are due to both differences in the terrain characteristics of the two areas and the differences in the scales of source data and DEMs.
The set of terrain parameters that provides the optimum regression with elevation error is location-specific. In this study 20 terrain parameters have been chosen from a total of 96, based on terrain characteristics within radii of 5, 10 and 20 cells of a target cell. Terrain parameters based on other radii may be more effective. A more truly GeoComputation-style approach may be beneficial, in which terrain parameters for a whole range of radii are computed and assessed.
When applied to stochastic simulations, RMSE surfaces have the potential to give a better account of the influence of DEM error on modelling outcomes than the use of a single RMSE value. Work is continuing to develop probabilistic hydrological models that work with a DEM and an accompanying RMSE surface. The outcomes of this further work will demonstrate the usefulness of the RMSE surfaces developed here, and the additional knowledge of DEM error that they provide.
Clark, K.J., 1993. Data constraints on GIS application development. In: Kovar, K. & H. P. Nachtnebel (Eds.), Application of Geographic Information Systems in Hydrology and Water Resources Management. IAHS Publication 211, 451-463.
Fisher, P.F., 1994. Probable and fuzzy methods of the viewshed operation. In: Worboys, M.F. (Ed.), Innovations in GIS 1. Taylor and Francis, 167-176.
Goodchild, M.F. and X. Han, 1995. The Effects of Topographic Error in GIS. International Journal of GIS, 9/2.
Heywood, I., G. Smith, B. Carlisle & G. Jordan, 1999. Global Positioning Systems as a Practical Field Work Tool: Applications in Mountain Environments. In: M. Pacione (Ed.), Applied Geography: Principles and Practice. Routledge, London. Ch. 43.
Li, Z., 1991. Effects of checkpoints on the reliability of DTM accuracy estimates obtained from experimental tests. Photogrammetric Engineering & Remote Sensing, 57(10), 1333-1340.
Magellan, 1994. Magellan GPS ProMark X User Guide. Magellan Systems Corporation, California.
Miller, D.R. and J.G. Morrice, 1996. Assessing Uncertainty in Catchment Boundary Delimitation. Proceedings of 3rd International conference on GIS and Environmental Modelling, Jan 1996, Santa Fe, New Mexico. URL: http://ncgia.ucsb.edu/conf/SANTA_FE_CD/papers/miller1_david/miller_paper1.html Last accessed 16.7.97
Monckton, C., 1994. An Investigation into the Spatial Structure of Error in Digital Elevation Data. In: Worboys, M.F. (Ed.), Innovations in GIS 1. Taylor and Francis, 201-211.
Ordnance Survey, 1999. Land-Form PROFILE User Guide, Version 3.0 Data. URL: http://www.ordsvy.gov.uk/downloads/height/profile/profil_w.pdf Last accessed 10.8.00
USGS, 1998. National Mapping Program Technical Instructions - Standards for Digital Elevation Models Part 2: Specifications. URL: http://rockyweb.cr.usgs.gov/nmpstds/acrodocs/dem/PDEM0198.PDF Last accessed 10.8.00
Weibel, R. & M. Brändli, 1995. Adaptive Methods for the Refinement of Digital Terrain Models for Geomorphometric Applications. Zeitschrift für Geomorphologie, Dec. 1995, 13-30.