Measuring Confidentiality Risks in Aggregate Census Data

Table of Contents

Measuring Confidentiality Risks in Aggregate Census Data

Objectives of the Research


Some Background

Data Confidentiality!

Data Protection is a related issue


Little evidence of commonsense!

Unfortunately Data Protection is increasingly being conflated with Data Privacy

All DATA modification is inherently harmful because it DAMAGES the data and affects all subsequent uses of it

Spatial Data for ZONES are NOT really personal information


The whole problem is rapidly spiraling out of control!

This fear of litigation is converting into a fear of creating geographic information that may disclose personal information!

One result could be:

Yet... the geographical aggregation of personal data is a seemingly good data-encryption and confidentiality-preserving device!

The problems now are:

The danger is that if we do not answer these questions in a convincing statistical-scientific way, then we will lose access to the best-quality small-area data about people

It's VERY important because

The Census is in the Front Line in the political and public debate about the confidentiality of personal data

Census Data Confidentiality

Fear of Confidentiality Leaks

The problem is ...

There is no scientific statistical basis for any of these thresholds


No knowledge of whether they were necessary and, if they were, whether they work!

The rules and procedures being proposed for 2001 are trying to prevent disclosure and enhance confidentiality BUT without any explicit knowledge of what the risks really are!

Also ... there is a pervasive NEGATIVE and pessimistic attitude regarding confidentiality

Differencing Problems are an opportunity for more flexible output areas. Discuss.
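
The differencing problem named in this slide title can be illustrated with a minimal sketch. It assumes two published tables for overlapping zones, one nested inside the other; subtracting them cell by cell yields counts for the small unpublished sliver between them. The zone names, categories, counts, and the function `difference_tables` are all hypothetical and are not taken from the slides.

```python
def difference_tables(larger, smaller):
    """Subtract the counts of a nested zone from those of the zone containing it."""
    cells = set(larger) | set(smaller)
    diff = {}
    for cell in cells:
        value = larger.get(cell, 0) - smaller.get(cell, 0)
        if value < 0:
            # A negative residual means the smaller zone is not nested in the larger one.
            raise ValueError(f"zone is not nested: negative count for {cell}")
        if value:
            diff[cell] = value
    return diff


# Hypothetical counts by (age band, tenure) for two overlapping output zones.
zone_a = {("16-29", "rented"): 12, ("30-59", "owned"): 40, ("60+", "owned"): 9}
zone_b = {("16-29", "rented"): 12, ("30-59", "owned"): 39, ("60+", "owned"): 7}

# Counts for the area covered by zone_a but not by zone_b.
sliver = difference_tables(zone_a, zone_b)
print(sliver)
```

In this made-up case the sliver contains only three people, so a pair of tables that each look safe on their own can expose very small counts once differenced.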

Yes.. because it violates the rules!

What are the RISKS in releasing aggregate census data for subthreshold areas, or for tables with too many zeros, or for tables with a few big numbers?

What are the risks? What are the risks? What are the risks? What are the risks? What are the risks? What are the risks? What are the risks?

You have to be able to measure them empirically

A new approach

Measuring disclosure risks

Microcensus Data Disclosure Risks

Microcensus data risk assessment

Risk amelioration factors

The answer is ...

Modifying the risk assessment for geographically aggregate data

Spatial aggregation does horrendous damage to microdata, and much detail is lost in the data recodes and transformations that occur. The hope is that these data modifications will greatly reduce the dangers of disclosure.

Two TYPES of aggregate census data

Derived Statistics

Count Data

How do you measure the confidentiality risks in aggregate census data?

Microdata, when expressed as spatial data, can be represented as a series of table coordinates

Table coordinate representation of microcensus data
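
One way to picture a table coordinate representation is the sketch below. It assumes that each microdata record carries a zone identifier plus categorical variables, and that a record's "coordinate" is the (row, column) cell it falls into in a two-way table. The category schemes, field names, and functions here are hypothetical illustrations, not the authors' definitions.

```python
from collections import Counter

# Hypothetical category schemes for a two-way table (assumptions, not census definitions).
AGE_BANDS = ["0-15", "16-29", "30-59", "60+"]
TENURES = ["owned", "rented"]


def table_coordinate(record):
    """Map one microdata record to the (row, column) cell it occupies."""
    return (AGE_BANDS.index(record["age_band"]), TENURES.index(record["tenure"]))


def aggregate(records):
    """Build zone-level counts keyed by (zone, row, column)."""
    counts = Counter()
    for record in records:
        row, col = table_coordinate(record)
        counts[(record["zone"], row, col)] += 1
    return counts


# Made-up records for two enumeration districts.
records = [
    {"zone": "ED1", "age_band": "16-29", "tenure": "rented"},
    {"zone": "ED1", "age_band": "16-29", "tenure": "rented"},
    {"zone": "ED1", "age_band": "60+", "tenure": "owned"},
    {"zone": "ED2", "age_band": "30-59", "tenure": "owned"},
]

table = aggregate(records)
print(table)
```

Under this assumption, the published aggregate table for a zone is simply the tally of the coordinates of the records assigned to it.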

Some algebra

The confidentiality risk depends on there being a perfect match between the ED counts and the microdata profile of the individuals assigned to it
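
The perfect-match condition can be sketched as follows. The algebra on the slides is not reproduced in this outline, so this is only an assumed reading: a set of microdata records assigned to an ED "matches" when tabulating those records reproduces every published cell count exactly. The function names and example data are hypothetical.

```python
from collections import Counter


def profile(records):
    """Tabulate microdata records (given as table cells) into counts per cell."""
    return Counter(records)


def is_perfect_match(ed_counts, assigned_records):
    """True when the assigned records reproduce every published ED cell count exactly."""
    tabulated = profile(assigned_records)
    cells = set(ed_counts) | set(tabulated)
    return all(ed_counts.get(cell, 0) == tabulated.get(cell, 0) for cell in cells)


# Hypothetical published counts for one ED, keyed by (age band, tenure).
ed_counts = {("16-29", "rented"): 2, ("60+", "owned"): 1}

# Two candidate assignments of microdata records to that ED.
matching = [("16-29", "rented"), ("16-29", "rented"), ("60+", "owned")]
non_matching = [("16-29", "rented"), ("60+", "owned"), ("60+", "owned")]

print(is_perfect_match(ed_counts, matching))
print(is_perfect_match(ed_counts, non_matching))
```

The check treats a cell that is absent from either side as a zero count, so an assignment that places a record in a cell the published table reports as empty does not count as a match.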

Some more algebra

Note the subtleties

ONS is the only organization that can actually APPLY and TEST our methodology

The BEST we can do is to create portable software, test our method on the microcensus data available to us, and hope that ONS may subsequently wish to try our method and develop it further for real UK census data

Two Data Sets are used here:

Modes of Application

As a Measure of Confidentiality Risks

Some Results for Disclosure Risks

Modifying Aspects

What's next

Possible 2001 Census Applications

Other Applications

This methodology has the potential to unleash the full power of GIS-related, flexible, user-orientated zone design relevant to many personal data applications, whilst ensuring confidentiality safeguards are met

Authors: Stan Openshaw, Phil Rees, Andrew Evans, and Oliver Duke-Williams, University of Leeds

