1Baskent University
Department of Computer Engineering
Baglica Kampusu
Eskisehir Yolu
Baglica 06530
Ankara
Turkey
Thomson@baskent.edu.tr
2National Atlas of Canada
Canada Centre for Remote Sensing
615 Booth Street
Ottawa, Ontario K1A 0E9
Canada
Brooks@nrcan.gc.ca
http://www.nrcan.gc.ca/~rbrooks/
By means of the perceptual grouping principle of ‘good continuation’ the network can be analysed (i.e. broken down) into linear elements, which are non-branching chains of arcs. We term these linear elements ‘strokes’; this term is prompted by the idea of a curvilinear segment that can be drawn in one smooth movement and without a dramatic change in style. Unlike arcs, strokes may in theory contain any positive number of nodes, and may intersect each other or themselves. They are also computationally simple to derive, and can be applied in both road and river network analysis. Furthermore, the stroke-building algorithm is both robust enough to operate on incomplete or imperfect data, and can be easily constrained using additional information.
Further analysis allows the derived strokes to be ordered, to reflect their relative importance in the network. The deletion of the elements according to this sequence then provides a simple and effective method of generalising (attenuating) the network. In the case of hydrologic networks, the established method of generalisation on the basis of stream orders can be viewed as a special instance of this more general approach. We also note that when two sub-strokes are combined into a new stroke the attributes of the resulting stroke can be easily obtained from those of the component sub-strokes. This allows the analysis of a network to be derived from the analysis of component pieces - with concomitant advantages in efficiency and storage.
This perception-based approach to network generalisation has been implemented and evaluated at CCRS. The implementation is outlined, results presented, and the effectiveness of the techniques demonstrated on both road and river networks. The technique has been used to do the hydrology selection for a map covering Canada's three northern territories.
An implementation of this approach has been extensively tested at the CCRS. These tests have shown that the approach is applicable to both road and river networks. The technique has been used in practice with good results to create a selection of rivers from the 1:1M scale VMAP0 dataset to be shown on a forthcoming printed map of Canada’s Northern Territories to be published in the fall of 2000.
Procedures for distribution refinement have obvious applications to data volume reduction, for saving storage space or improving data processing and transmission times. A further important reason for such a procedure is the homogenisation of different datasets in the process of data integration. A primary objective of database abstraction is the derivation of databases at multiple levels of accuracy or resolution, and this form of generalisation serves as a preprocessing operation for cartographic generalisation and the visual presentation of the data (map generation) [Weibel 1995].
Guidelines as to the amount of data reduction that is appropriate for a given coarsening of scale have been around for a long time: for example, the "radical law" of Töpfer and Pillewizer [1966]. The problem addressed here is how to decide the particular network elements to be removed – further entailing the issue of how exactly the network elements should be defined.
The stream orders developed by physical geographers and hydrologists – e.g. Strahler [1957] and Horton [1945] orders – when taken together with segment length give a relatively simple but effective ordering of the segments. The stream order calculations require, of course, knowledge of the stream directions. Complications are introduced into the ordering scheme by stream braiding and delta structures, and the method is complicated further when lakes within the network are also subject to removal by generalization. But these difficulties can be overcome and a reliable generalisation scheme for hydrologic networks can be established on this foundation [Richardson 1993].
Road networks have proved more difficult to generalise, however. This is because road networks, unlike hydrologic networks, contain abundant loop structures (hamstringing attempts to derive equivalents to stream orders, which assume an underlying tree structure), there is no equivalent to a flow direction, and road segments may have associated a range of semantic data, such as road class, width, surface type, and seasonality.
Many perceptual grouping principles have been identified, such as proximity, similarity, symmetry, closure, parallelism, collinearity, co-termination and continuity. Some illustrative examples are given in figure 1.
Figures 1(a) to 1(c) show grouping according to different forms of similarity of the elements – similarity of shape, colour (or grayscale) and orientation. Thus, for example, figure 1(b) is perceived as two groups of white circles and two groups of black circles, even though the vertical separation of the circles is slightly less than the horizontal separation. Figure 1(d) shows the effect of collinearity: the three collinear elements are perceived as the figure, with the other elements as ground. Figure 1(e) shows the importance of good continuation: this figure is perceived as one smooth curve with two smaller curves incident on it, rather than as, for example, five connected linear elements. Figure 1(f) shows the effect of uniform density: the collection of dots is perceived as two groups, where the dot spacing within each group is relatively uniform.
The perceptual grouping principles are now widely recognized as the basis for parsing the visual world into surfaces and objects according to relatively simple visual characteristics, in a process that operates independently of the domain being represented [Palmer, 1983]. Consequently, perceptual organization principles play a vital role in the understanding of two-dimensional images of three-dimensional scenes [Witkin & Tenenbaum 1983; Biederman et al. 1991; Lowe 1985]. These principles are also crucial in the interpretation of other forms of images, such as sectional images [Thomson & Claridge 1989; Thomson 1999], and in the understanding of maps [Bertin 1983; MacEachren 1995].
The important role of these principles in map generalization has also long been recognized. Notably, DeLucia and Black [1987, p.175] stated that "… there are direct analogies between what is required for successful map generalization and … the principles of perceptual organization enunciated long ago by the Gestalt psychologists".
The arc-node topological data model is commonly used in GIS systems. In this model, a dataset of linear features is maintained as a planar graph. Many of the graph theoretic properties are explicitly represented in the data structure. In a network in such a system, there are two types of object: the arc, or linear feature, and the nodes, which occur at the ends of the arcs.
In our data model, a stroke will comprise a set of arcs and nodes. This set represents a path through the network from one terminal node to another (possibly the same) which will utilize all the arcs in the set, without repetition – although the path may self-intersect and so nodes may repeat. This path is the stroke.
Figure 2: Strokes constructed on the road network for downtown Ottawa
In the implementation used at the National Atlas of Canada, stroke building is a local process, considering each node in turn and dependent completely on the properties of the network at that node neighbourhood. This implies that strokes can be locally adjusted, with the update requiring only small processing time. (One exception to this may occur in the hydrologic network where the stroke building may be made dependent on supporting information such as the largest upstream length or largest drained area. In this case, the stroke-building process is still local, but the generation of the supporting information may require processing the entire basin.)
It is interesting to compare this approach to network generalization with that described by Mackaness [1995] for urban road generalization. The latter scheme is similar in that ‘axial lines’ are derived for a network, quantitative indices are derived for these elements, and these then provide the basis for the generalization. The derivation of axial lines, however, is based on lines of sight, following a concept drawn from urban space pattern analysis [Penn, 1993]. Compared with axial lines, strokes are far simpler computationally to derive and, being based on more generally applicable principles, they can justifiably be used for a wider range of networks.
The stroke construction process is robust, in that there are no degenerate cases that cause the algorithm to fail. However, if faulty data is used as input, strokes will still be defined, although they may not accurately correspond to any meaningful real world objects. To achieve meaningful results it is essential that the dataset have correct connectivity, and in the case of hydrologic networks, correct flow direction. In both cases, better results can be achieved by using additional attribute information to guide the stroke building process. For example, in both road and river networks, constraining the process so that arcs with the same name attribute (e.g. Mackenzie River; Bank Street) are connected into a stroke yields good results. Where this information is incomplete, the strokes will be built according to geometric considerations.
Once the strokes are constructed the networks must still be generalised. First an order of importance must be assigned to every stroke in the network. Then some proportion of them must be removed to reduce the data density. Finally any cleanup processes, if necessary, should be applied to create the final selection.
Each stroke acts as a connector between all possible pairs of its neighbours. Therefore, connectivity is maintained so long as all pairs of the neighbours of a stroke still have a path connecting them after the stroke is removed. The following procedure creates an ordered list of strokes such that strokes can be removed in sequence without disconnecting the network.
Let there be a stack, S, of strokes ordered by salience so that the least salient stroke will be next off the stack. Let H be a heap, currently empty, and let L be an output list.
Figure 3: Raw road data for Ottawa and surrounding area
Figure 4: Generalised road network for Ottawa and surrounding area
Figure 5: Horton stream ordering method
By carefully selecting the stroke building rules, the strokes can be constructed so that they can be ordered using Horton’s method. The strokes are used to define the main path, needed for the second stage of processing. At each node, a choice of the main path is made which is then connected to form a stroke. The maximum first stage (Strahler) order occurring on any element of a stroke is assigned as the Horton ordering of that stroke.
The definition of what constitutes the "main stream" is a key component of this process, and has a large effect on the quality of the results. Horton’s original criteria were that the longest and straightest path should be used to define the main stream. Other determinations of the main stream have included the straightest path [Richardson 1993] or the largest drained area [Pfafstetter 1989]. In many cases, quirks of human geography have influenced the selection of what is considered to be the main stream in defiance of any geometric characteristics. For example, the Missouri river system is a much longer headwater of the Mississippi river than what is named the Mississippi head water. However, the historical development of the United States has assured that this river system is named in a manner that contradicts its geometry. The use of the stroke data model does not enforce any particular choice of main stream; the rules used for building the strokes should simply be the same rules used for selecting the main stream.
Braided streams add an additional level of complexity to the ordering process. Prior to assigning any orders, braids were separated from main channels using the rule that the main channel was the shortest and straightest path to the ocean. Braid channels were assigned a stream order in the first processing step, but the intersection of two braids of the same order, or of a braid and main stream of the same order did not increment the order of the outflowing stream. When the strokes were constructed, they were not allowed to include both braided and main channels. The result was that the main channel in a braided stream would be surrounded by braids of equal or lesser order, while preventing a runaway increase in stream order due to the many intersections within the braid. When removing elements from the network according to their length and Horton order, this scheme prevented any components of the braid from becoming disconnected.
Figure 6: Ungeneralised river network (Attawapiskat River)
Figure 7 : Generalised river network (Attawapiskat River)
The stroke-building algorithm itself is very efficient. Measured computational efficiency in practice appears to be approximately linear in the number of nodes, as shown in Figure 8. While the times listed here appear to be rather large, recent experiments indicate that at least a ten-fold increase in speed can be achieved by recoding outside of the ArcInfo GIS system. Furthermore, the stroke building algorithm will operate on erroneous datasets. The results will be progressively degraded as the source data degrades, but this allows operation in a production environment on large datasets, where there may not always be the option of detecting and correcting all the errors in a dataset.
Figure 8: Processing time for stroke building is linear
Finally, the selection which resulted from the application of the generalisation technique described here was processed using cartographic generalisation techniques, including line simplification, displacement and exaggeration.
The hydrologic network for Canada, at the 1:1M scale contains approximately 500 000 arc features. As a complete dataset, it is difficult to handle this dataset, so for practical reasons it had to be partitioned for processing. Drainage basins partition naturally along the ridge lines of the topography. Road datasets can also be inconveniently large, and their partitioning into sub-networks may similarly be warranted. In this regard it may be noted that when two sub-strokes are combined into a new stroke during stroke-building the attributes of the resulting stroke can usually be obtained from those of the component sub-strokes. This should allow the analysis of a network to be derived from the analysis of component pieces - with concomitant advantages in efficiency and storage.
Once constructed, the perceptual stroke model was used for other applications beyond the generalisation of the data. In particular, it assisted in semi-automatically matching name attributes to river tributaries, greatly increasing the efficiency of that process.
This method is robust enough to be used on faulty data with reasonable results. Furthermore, arbitrary local constraints could be added to the stroke building method without affecting its overall performance. This allowed additional information, such as names, where available to be used to adjust the systems performance.
Although no extensive testing has been done, it is reasonable to expect that the stroke-based generalisation method will apply to other types of networks, including pipeline and electrical networks. Current work in the National Atlas of Canada is exploring this possibility.
Statistics Canada generously provided road network data from their database which was used for the road network experiments.
Biederman, I., Hilton, H. J., and Hummel, J. E. 1991. Pattern goodness and pattern recognition. In: Pomerantz, J. R. and Lockhead, G. R. (Eds.), The Perception of Structure, Washington DC: American Psychological Association.
Brooks, R. 1999. The new and improved base framework for National Atlas data. Proceedings of the ICA 19th International Cartographic Conference (Ottawa, 14-21 August). Also available at http://atlas.gc.ca/english/about_us/index_pres_e.html#framework
Cohen, S. and Ward, L. M. 1989. Sensation and Perception, Orlando, FL: Harcourt, Brace and Jovanovich.
DeLucia, A. and Black, T. 1987. A comprehensive approach to automatic feature generalization, Proceedings of the 13th International Cartographic Conference (Morelia, Mexico), 168-191.
Goudie, A., Ed. 1985. The Encyclopedic Dictionary of Physical Geography. Basil Blackwell, Ltd. Oxford, UK.
Horton, H. 1945. Erosional development of streams and their drainage basins, Bulletin of the Geological Society of America, 56, 275-370.
Lowe, D. G. 1985. Perceptual Organisation and Visual Recognition, Boston: Kluwer Academic.
MacEachren, A. M. 1995. How Maps Work: Representation, Visualization, and Design, New York: Guilford Press.
Mackaness, W. A. 1995. Analysis of urban road networks to support cartographic generalization, Cartography and Geographic Information Systems, 22(4), 306-316.
McMaster, R. B. and Shea, K. S. 1992. Generalization in Digital Cartography, Washington DC: Association of American Geographers.
Palmer, S. E. 1983. The psychology of perceptual organisation: a transformational approach. In: J. Beck et al. (Eds.), Human and Machine Vision, New York: Academic Press, 269-339.
Penn, A. 1993). Intelligent analysis of urban space patterns: graphical interfaces to precedent databases for urban design, Proceedings of Auto Carto 11, 53-62.
Pfafstetter, O. 1989. Classification of hydrographic basins: coding methodology, Rio de Janeiro: DNOS (translated by Verdin, J. P., 1991. U.S. Bureau of Reclamation, Brasilia).
Richardson, D. E. 1993. Automatic Spatial and Thematic Generalization using a Context Transformation Model, Doctoral Dissertation, Wageningen Agricultural University, Netherlands.
Richardson, D. E. 1994. Generalization of spatial and thematic data using inheritance and classification and aggregation hierarchies. In: Waugh, T. C. and Healey, R. G. (Eds.), Advances in GIS Research 2, London: Taylor and Francis, 957-972.
Richardson, D. E., and Thomson, R. C. 1996. Integrating thematic, geometric and topological information in the generalization of road networks. Cartographica, 33(1), 75–83.
Rusak Mazur, E., and Castner, H. W. 1990. Horton’s ordering scheme and the generalisation of river networks, The Cartographic Journal, 27, 104-112.
Shreve, R. L. 1966. Statistical law of stream numbers, Journal of Geology 74: 17-37.
Strahler, A. N. 1957. Quantitative analysis of watershed geomorphology, Trans. of the American Geophysical Union 38: 913-920.
Thomson, R. C. 1999. Interpreting temporal relations in rock micrograph and geological section images, Proceedings of the 1st International Symposium on Imaging Applications in Geology (Liège, Belgium), 229-232.
Thomson, R. C., and Claridge, E. 1989. A 'computer vision' approach to the analysis of crystal profiles in rock sections, Proceedings of the 6th Scandinavian Conference on Image Analysis (Oulu, Finland), 1208-1215.
Thomson, R. C., and Richardson, D. E. 1999. The ‘Good Continuation’ Principle of Perceptual Organization applied to the Generalization of Road Networks, Proceedings of the ICA 19th International Cartographic Conference (Ottawa, 14-21 August), 1215–1223.
Thomson, R. C., and Richardson, D. E. 1995. A graph theory approach to road network generalization, Proceedings of the 17th International Cartographic Conference (Barcelona, Spain), 1871-1880.
Töpfer, F., and Pillewizer, W. 1966. The principles of selection: a means of cartographic generalization, The Cartographic Journal, 3(1), 10-16.
Weibel, R. 1995. Three essential building blocks for automated generalization. In Müller, J. C., Lagrange, J. P and Weibel, R. (Eds.) GIS and Generalization: Methodology and Practice, London: Taylor and Francis, 56-69.
Witkin, A. P., and Tenenbaum, J. M. 1983). What is perceptual organization for?, Proceedings of the 8th International Joint Conference on Artificial Intelligence (Karlsruhe, Germany), 1023-1026.