Summary: The starting point for most crime prediction is determining the patterns in spatial data. In many cases patterns are enough to identify the future risk of crime, but they are also one of our chief sources of information on the behaviour of offenders and victims, information that will ultimately allow us to tackle offending more effectively. In this section we look at some techniques for standard statistical analysis, specifically introducing the popular CrimeStat toolkit.
Often the starting data for spatial crime analysis is a set of points on a map. These are often confusing, not least because in some areas the points will be very sparse, while in others they will be stacked on top of each other. One option is to shade a map of administrative areas by the number of crimes, or by the number of crimes per unit area. This is often useful when dividing administrative responsibility between organisations. However, although such thematic maps give us a clearer picture of crime patterns than the points themselves, there are a number of drawbacks to aggregating crimes to areas such as wards. Not least of these is the Modifiable Areal Unit Problem, which is particularly acute for absolute crime numbers: the apparent pattern depends on how large the areas are and where their boundaries are drawn.
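The dependence of aggregated counts on the choice of areal units can be seen in a minimal Python sketch. It uses square grid cells as a stand-in for administrative areas; the coordinates, the cell sizes, and the function name count_by_cell are all invented for illustration and are not part of CrimeStat.

```python
from collections import Counter


def count_by_cell(points, cell_size):
    """Count the points that fall in each square grid cell of the given size."""
    counts = Counter()
    for x, y in points:
        # Floor division maps a coordinate to the index of its cell.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


# Hypothetical incident coordinates (made up; units are arbitrary).
incidents = [(10, 10), (20, 15), (90, 90), (110, 95), (105, 110), (180, 20)]

coarse = count_by_cell(incidents, cell_size=100)
fine = count_by_cell(incidents, cell_size=50)

# Counts per unit area remove the effect of cell size on the raw totals,
# but not the effect of where the cell boundaries happen to fall.
coarse_density = {cell: n / 100 ** 2 for cell, n in coarse.items()}

print(sorted(coarse.items()))
print(sorted(fine.items()))
```

In this made-up data the three incidents close to (100, 100) fall into three different cells under both grids, so neither map shows them as a single concentration, while the coarse grid reports its highest count in a cell that mixes one of them with two unrelated incidents.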
A better approach is to use the points to derive new statistics that summarise the spatial and/or temporal distributions of crime. To this end, the free CrimeStat tool provides a number of useful functions. This document discusses some of the functions that CrimeStat provides. There is no associated practical work, as the CrimeStat documentation serves as an adequate tutorial, but assistance can be provided for anyone interested in using the software.
CrimeStat is free software developed by Ned Levine & Associates of Houston, Texas, with funding from the National Institute of Justice. It provides tools to compute statistical summaries of crime data and to run models. The procedures and statistics are extremely well documented; the documentation is a good book about the statistics in its own right. The following sections briefly summarise the most popular methods. For further information, refer to the main CrimeStat documentation.
These methods can be used to describe the distribution of crimes in space. They are somewhat coarse, providing summary statistics rather than isolating particular areas of interest. However, they can often be useful as part of preliminary analyses, or as a means of broadly comparing two or more distributions. The methods provided in CrimeStat include:
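As an illustration of what such a summary statistic looks like, the sketch below computes two widely used descriptive measures, the mean centre and the standard distance, in plain Python. It is not CrimeStat's own code: the function names and sample coordinates are invented, and the standard distance here divides by n, whereas a given package may use a different divisor.

```python
import math


def mean_centre(points):
    """Return the average x and average y of a set of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def standard_distance(points):
    """Root-mean-square distance of the points from their mean centre.

    Assumption: population form (divide by n), chosen for simplicity.
    """
    cx, cy = mean_centre(points)
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(squared / len(points))


# Hypothetical incident locations at the corners of a 2 x 2 square.
incidents = [(0, 0), (2, 0), (2, 2), (0, 2)]

centre = mean_centre(incidents)
spread = standard_distance(incidents)
print(centre, spread)
```

Comparing the centre and spread of two sets of incidents (for example, two crime types or two time periods) is the kind of broad comparison these statistics support; they say nothing about where within the area incidents concentrate.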
Clustering refers to a range of techniques that can be used to identify areas that contain a significantly greater number of incidents than would be expected if the spatial distribution of the points were random, i.e. the identification of 'hot spots'. Commonly used clustering techniques that are available in CrimeStat include:
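The idea of "more incidents than expected under randomness" can be made concrete with a deliberately simple sketch. It is not one of CrimeStat's routines: it assumes that, under a random pattern, the count in each grid cell follows a Poisson distribution with the same mean, and it flags cells whose count would be unlikely under that assumption. The function names, the grid, and the data are hypothetical, and no correction is made for testing many cells at once.

```python
import math
from collections import Counter


def poisson_upper_tail(k, lam):
    """Probability of observing k or more events when the mean is lam."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


def hot_cells(points, cell_size, n_cells, alpha=0.05):
    """Return cells whose counts are improbably high under a uniform random pattern.

    n_cells is the total number of cells in the study area, including empty ones.
    """
    counts = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    expected = len(points) / n_cells  # mean count per cell if points were random
    return {cell: c for cell, c in counts.items()
            if poisson_upper_tail(c, expected) < alpha}


# Hypothetical incidents in a 40 x 40 study area divided into 16 cells of size 10.
incidents = [(1, 1), (2, 3), (4, 4), (5, 2), (3, 8),
             (15, 5), (25, 35), (35, 15)]

flagged = hot_cells(incidents, cell_size=10, n_cells=16)
print(flagged)
```

The result of such a test still depends on the grid chosen, which is one reason dedicated clustering techniques that work from the points themselves are generally preferred.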
Density methods are similar to clustering techniques in that the aim is to identify areas that have high or low crime prevalence. However, they are a descriptive tool and do not provide any evidence as to whether apparent clusters are significant. Density methods are often used as a preliminary method to understand the spatial distribution of incidents, before performing more statistically rigorous analyses. They are particularly useful for large data sets, in which it can be difficult to interpret a pattern from the point data directly (see introduction, above). The methods that are built into CrimeStat include:
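A common density method is kernel density estimation, in which each incident contributes a smooth "bump" to the surrounding surface. The sketch below is a minimal plain-Python version using a Gaussian kernel; the function name, bandwidth, and coordinates are assumptions made for illustration, and it does not reproduce any particular CrimeStat option.

```python
import math


def kernel_density(points, location, bandwidth):
    """Gaussian kernel estimate of incident intensity at one location.

    Each point contributes a two-dimensional Gaussian with standard
    deviation `bandwidth`; the result is in incidents per unit area.
    """
    lx, ly = location
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for x, y in points:
        d2 = (x - lx) ** 2 + (y - ly) ** 2
        total += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total


# Hypothetical incidents: a tight group near (10, 10) and one outlier.
incidents = [(10, 10), (11, 10), (10, 11), (9, 9), (40, 40)]

near_group = kernel_density(incidents, (10, 10), bandwidth=2.0)
near_outlier = kernel_density(incidents, (40, 40), bandwidth=2.0)
print(near_group, near_outlier)
```

Evaluating the function on a regular grid of locations gives a surface that can be shaded as a map. The bandwidth controls how smooth that surface is, and, as noted above, a high value on the surface is descriptive only and is not evidence of a statistically significant cluster.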
CrimeStat also provides a number of techniques for building traditional statistical models, including regression and discrete choice models.
Levine, N. et al. (2013) CrimeStat IV: A Spatial Statistics Program for the Analysis of Crime Incident Locations. The National Institute of Justice, Washington DC (note that this is most easily downloaded with the software from the CrimeStat website).