- Events:
- DevCSI Developer Days (Dev8D 2011)
- DevCSI Linked Data Hackday
- Events on the horizon:
- Geocomputation 2011
- Workshop on Dynamic Distributed Data-Intensive Applications, Programming Abstractions, and Systems (3DAPAS)
- https://sites.google.com/site/3dapas/
- To be held in conjunction with the 20th International ACM Symposium on High-Performance Parallel and Distributed Computing
- 2011-06-08, San Jose, California, USA.
- The Future of Computational Social Science (JITP2011)
- Spatial Statistics 2011: Mapping Global Change
- International Symposium on Grids and Clouds (ISGC 2011)
- JISC Conference 2011
- Funding Opportunities
- Google Research Awards
- JISC
- Roadmap of future grant funding 2010-08 to 2011-07
-
- Browsing
- Miscellanea
- Teaching
- e-Science
-
- Browsing
- Miscellanea
- ...
- Review Meeting with Gill Valentine and Joe Holden
- Teaching
- e-Science
- Developing Generic Math library
-
- Browsing
- Publication
- Developing Riyadh RTA first draft
- Teaching
- e-Science
- Developing test suite for Generic library
-
- Browsing
- Miscellanea
- Publication
- Developing Riyadh RTA first draft
- Teaching
- e-Science
- Developing test suite for Generic library
-
- Browsing
- Publication
- Developing Riyadh RTA first draft
- Teaching
- GEOG3600
- Meeting with Tom
- Meeting with Charlie
- e-Science
- Developing test suite for Generic library
-
- Browsing
- Publication
- Developing Riyadh RTA first draft
- Teaching
- e-Science
- Developing test suite for Generic library
-
- Browsing
- e-Science
- e-ISS
- NeISS face to face meeting Day 2
- ...
- http://portal.ncess.ac.uk/access/wiki/site/e-iss/f2fagendajan2011.html
- Notes
- Workpackage review
- How do we describe our simulation outputs and the simulation models as Research Objects?
- I mentioned the cluster tool refactoring work that I started with Ian Turton in December 2010
- Chat with Neil about Research Excellence Framework (REF) and impact
- Chat with Neil about the Math package I am developing, which I intend to propose for further development as part of Jakarta Commons Math at the Apache Software Foundation.
- Chat with Rob about Random and exponent methods in my Math package.
- Archive and distributed data store discussion
- NeISS wants such a thing, but it is known to be a reasonably big job...
- Someone enthusiastic is wanted to take this forward probably in collaboration with other organisations.
- Management and dissemination issues arising from JISC programme meeting and steering group meeting 2010-11-04
- Suggested enhancements:
- NeISS web site
- YouTube channel with videos posted that describe our work
- Simon Hettrick from SSI can advise us about technology for creating videos.
- Each video should have an introduction, voice-over and subtitles/annotations.
- For sustainability, a plan is wanted to demonstrate our wares and generate further funding interest to sustain our efforts.
- Need to identify what of our services, software and data we want to have used and further developed by ourselves and others...
- Make more approaches to policy makers, perhaps starting by considering how the NeISS e-Infrastructure for Social Simulation could have been used to avoid problems of the last 20 years had it existed 20 years ago.
- Census Atlas
- Browsing
-
- Browsing
- Publication
- Developing Riyadh RTA first draft
- e-Science
- e-ISS
- NeISS face to face meeting Day 1
- ...
- http://portal.ncess.ac.uk/access/wiki/site/e-iss/f2fagendajan2011.html
- Notes on a day working with Tom Doherty
- The aim was to develop an understanding of what each of us was doing and to learn how things currently hang together and work.
- By the end of the day, we believed we had appreciated where we wanted to go and had a plan of the next steps...
- For the GENESIS Simulator, timing of runs is going to be useful
- It helps appreciation of how much resource is needed to create results.
- When the models can be run in parallel, they can also be used to benchmark performance.
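Timing a run can be sketched as follows. This is only a minimal illustration in Python, not GENESIS code: `timed_run` and `toy_simulation` are hypothetical names, and the toy function merely stands in for a real simulation run.

```python
import time


def timed_run(simulate, *args, **kwargs):
    """Run simulate(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = simulate(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def toy_simulation(iterations):
    """Hypothetical stand-in for a simulation run: sums the iteration indices."""
    total = 0
    for i in range(iterations):
        total += i
    return total


result, seconds = timed_run(toy_simulation, 100_000)
print(f"result={result}, elapsed={seconds:.4f}s")
```

Recording the elapsed time alongside each result would give a rough measure of the resource needed to regenerate it, and repeating the same run on different machines would give a crude benchmark.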
- The MoSeS collaboration was with colleagues from the University of Westminster, who helped the MoSeS Population Initialisation Software to be run via the Westminster P-Grade portal.
- The MoSeS Java code developed by Andy essentially generated individual and household level population data for the UK by integrating published Census output data.
- MPJ Express was used for parallelisation.
- Andy enjoyed the experience of P-Grade and the MoSeS collaboration with grid experts :-)
- This experience is very relevant for our NeISS collaboration.
- It was intended that the MoSeS results would be used to seed Social Simulation models, but this is not really happening...
- Andy aims to continue work with Alex Voss and colleagues at Academia Sinica in Taiwan on Social Simulation in the year ahead.
- User Interfaces for the GENESIS simulator
- Input
- Need for users to select inputs from archive or from local files and set some parameters for multiple runs and for the number of iterations the simulation is run for.
- Processing
- Archive
- Archive
- An archive is wanted on NGS resources for sharing and reuse of simulation results.
- Also results can be very large and used as inputs to subsequent runs and so rather than download and upload these, it would be better to store them near where the computation is done.
- Ideally we would have a federated data store for simulation outputs and metadata.
- Important or heavily used results can be replicated to numerous sites and this can be done anyway as a way to back up results.
- Inevitably we cannot store all our results on a data element near every compute element we might want to use. In any case an archive will fill up, so we should consider how to manage it and how to delete archived results in order to make room for more useful ones.
- A key for us is to be able to replicate any result from scratch.
- As the output of one run can be used as an input to another, a full chain of metadata is required to maintain provenance and to recreate a result from scratch.
- For the GENESIS demographic simulations, the metadata is small. Anyway, hashes or unique checksums of the metadata can be stored in a metadata map. This can be used to test for the existence of a result and to count the number of times a result is retrieved from the archive.
- Sensibly, less used results might persist for less time (i.e. be deleted more readily than frequently used results).
- Expert users might also know whether a result is important or not. This evaluation might be based on how long a result took to generate, whether it is likely to be a commonly used start point for further simulation, or whether it is or will in due course become an unnecessarily archived intermediate result.
- Tom mentioned that he would talk to Sam Skipsey (Data Grid expert colleague) about our data issues and see if he is interested in helping us...
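The metadata map idea above can be sketched roughly as follows. This is a minimal illustration and not part of GENESIS: the class `MetadataMap`, the metadata fields and the archive location string are all hypothetical, and SHA-256 over a canonical JSON encoding is just one assumed way of producing a checksum.

```python
import hashlib
import json


class MetadataMap:
    """Maps a checksum of run metadata to an archive location and a retrieval count."""

    def __init__(self):
        self._entries = {}  # digest -> {"location": str, "retrievals": int}

    @staticmethod
    def digest(metadata):
        # Canonical JSON (sorted keys) so equal metadata always hashes the same.
        canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def register(self, metadata, location):
        key = self.digest(metadata)
        self._entries.setdefault(key, {"location": location, "retrievals": 0})
        return key

    def exists(self, metadata):
        return self.digest(metadata) in self._entries

    def retrieve(self, metadata):
        """Return the archive location, counting the retrieval, or None if absent."""
        entry = self._entries.get(self.digest(metadata))
        if entry is None:
            return None
        entry["retrievals"] += 1
        return entry["location"]

    def retrievals(self, metadata):
        entry = self._entries.get(self.digest(metadata))
        return 0 if entry is None else entry["retrievals"]

    def least_used(self):
        """Digests ordered from least to most retrieved (candidates for deletion)."""
        return sorted(self._entries, key=lambda k: self._entries[k]["retrievals"])


archive = MetadataMap()
first_run = {"model": "demographic", "seed": 1, "iterations": 10, "input": None}
archive.register(first_run, "archive/result-0")

# A follow-on run refers to its input by the digest of the input's metadata,
# which keeps the chain needed to recreate the result from scratch.
follow_on = {"model": "demographic", "seed": 1, "iterations": 10,
             "input": MetadataMap.digest(first_run)}

print(archive.exists(first_run), archive.exists(follow_on))
print(archive.retrieve(first_run), archive.retrievals(first_run))
```

Because each run's metadata names its input by digest, walking back through the digests reconstructs the provenance chain, and the retrieval counts give one possible basis for deciding which results to delete first.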
-
- Browsing
- Publication
- Developing Riyadh RTA first draft
- Teaching
- Browsing
- Publication
- Developing Riyadh RTA first draft
- e-Science
- GENESIS
- Demographic Modelling
- Rethinking the storage of data about dead people
- Having a directory for each dead person is convenient in that it is easy to retrieve the data and save extra information.
- However, operating systems struggle to hold the enormous file directory structures that result from populations of millions.
- One way to scale things a little better might be to store dead people in collections of say 1000 individuals.
- There are options for how to organise and address the collections
- The collections of the dead can be organised by person ID
- Effectively this is a date of birth ordering.
- For simulations that run for several lifetimes this would work: when the last member of a cohort (1000 individuals) dies, the collection can be serialised and written out and might never be reloaded during the simulation, and the data would be neatly archived using 1000 fewer directories and 999 fewer files.
- The tricky thing with this is that multiple dead collections might be wanted at roughly the same stage of processing and this would be computationally demanding.
- The collections of the dead can be organised by order of death.
- This is efficient in terms of writing out the results and freeing computational resources for running the simulation, but finding the data for a specific dead person becomes tricky.
- One way to tackle that difficulty is to also provide ordered, indexed collections indicating which collection each dead agent is stored in. These could be much larger mapped collections of PersonID and CollectionID.
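The two ways of organising collections of the dead can be sketched as below. This is a minimal in-memory illustration under stated assumptions, not the GENESIS implementation: the names `collection_id_by_person_id` and `DeathOrderStore` are hypothetical, dictionaries stand in for serialised files, and person IDs are assumed to be non-negative integers allocated in birth order.

```python
COLLECTION_SIZE = 1000  # individuals per collection, as suggested in the notes


def collection_id_by_person_id(person_id, size=COLLECTION_SIZE):
    """Option 1: the collection is implied by the person ID (birth-order cohorts)."""
    return person_id // size


class DeathOrderStore:
    """Option 2: fill collections in order of death and keep a PersonID -> CollectionID index."""

    def __init__(self, size=COLLECTION_SIZE):
        self.size = size
        self.current_id = 0
        self.current = {}  # the open collection: person_id -> record
        self.closed = {}   # collection_id -> full collection (stand-in for a file)
        self.index = {}    # person_id -> collection_id

    def bury(self, person_id, record):
        self.current[person_id] = record
        self.index[person_id] = self.current_id
        if len(self.current) >= self.size:
            # A full collection would be serialised and written out at this point.
            self.closed[self.current_id] = self.current
            self.current = {}
            self.current_id += 1

    def lookup(self, person_id):
        collection_id = self.index[person_id]
        if collection_id == self.current_id:
            return self.current[person_id]
        return self.closed[collection_id][person_id]


store = DeathOrderStore(size=3)  # tiny collections, just for illustration
for pid in (42, 7, 1999, 5):
    store.bury(pid, {"id": pid})

print(collection_id_by_person_id(1999))  # cohort collection under option 1
print(store.index[42], store.index[5])   # collections under option 2
print(store.lookup(1999))
```

With option 1 no index is needed, but a collection stays open until the last of its cohort dies. With option 2 only one collection is open at a time, at the cost of maintaining the index.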
- Browsing
- Publication
- Developing Riyadh RTA first draft
- Browsing
- Miscellanea
- Thoughts on road safety in light of the Geovation challenge
- http://www.geovation.org.uk/
- http://www.geovation.org.uk/geovationchallenge/
- I would like to show that a reduction of the default national speed limit to 20 mph would:
- Make roads safer for children, the elderly, pedestrians, cyclists, riders and indeed all road users
- Not significantly impact journey times, and may indeed improve them
- Be more cost effective and reduce the need for traffic calming obstacles and signage, both in current 20 mph zones and in general
- Any 40 mph road should have cycle lanes
- I would like black box accident recorders to be mandatory in all motorised road vehicles
- It may be that lower insurance premiums can be awarded to safe drivers that have their data disseminated back to the insurance companies and that this is the more feasible path to technology adoption.
- Already expensive vehicles are fitted with tracking systems in case of theft, but this could also be used for regulation.
- There is worry about a big brother society, but the place is unsafe as uninsured drivers still kill pedestrians and that is a systems failure as well as a personal failure.
- Publication
- Developing Riyadh RTA first draft
- e-Science
- e-ISS
- Collaboration with Ian Turton on Geographical Analysis Technology Development
- Formulation of a GAM-like, rate-weighted k-th nearest neighbour cluster detection method
- Convert incidence points to incidence rate points
- Calculate the distance weighted rates from each point in a raster to the k-th nearest incidence rate points
- Map the raster, filtering the result if desired
- I suggest Inverse Distance Weighting using the Modified Shepard's Method as outlined on Wikipedia.
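The three steps above might be sketched as follows. This is only a rough illustration and not the Spatial-Cluster-Detection code: the function names, the `(x, y, cases, population)` input layout and the choice of the search radius R as the distance to the k-th nearest point are all assumptions. The weights follow the Modified Shepard form w = ((R - d) / (R d))^2 as I understand it.

```python
import math


def incidence_rates(points):
    """Convert (x, y, cases, population) points to (x, y, rate) points."""
    return [(x, y, cases / population)
            for x, y, cases, population in points if population > 0]


def modified_shepard(x, y, rate_points, k):
    """Distance-weighted rate at (x, y) from the k nearest rate points.

    Assumes rate_points is non-empty. R is taken to be the distance to the
    k-th nearest point, so that point itself receives zero weight.
    """
    nearest = sorted((math.hypot(px - x, py - y), rate)
                     for px, py, rate in rate_points)[:k]
    if nearest[0][0] == 0.0:
        return nearest[0][1]  # exactly on a data point
    radius = nearest[-1][0]
    weights = [((radius - d) / (radius * d)) ** 2 for d, _ in nearest]
    total = sum(weights)
    if total == 0.0:
        # All neighbours lie at distance R (e.g. k == 1): fall back to their mean.
        return sum(rate for _, rate in nearest) / len(nearest)
    return sum(w * rate for w, (_, rate) in zip(weights, nearest)) / total


def rate_surface(rate_points, nrows, ncols, cell_size, k):
    """Evaluate the weighted rate at the centre of every raster cell."""
    return [[modified_shepard((col + 0.5) * cell_size, (row + 0.5) * cell_size,
                              rate_points, k)
             for col in range(ncols)]
            for row in range(nrows)]


points = [(0.0, 0.0, 1, 100), (10.0, 0.0, 5, 100), (0.0, 10.0, 2, 100)]
rates = incidence_rates(points)
surface = rate_surface(rates, nrows=2, ncols=2, cell_size=5.0, k=3)
for row in surface:
    print(["%.4f" % value for value in row])
```

The resulting grid could then be mapped directly, or filtered first (for example by keeping only cells above some threshold rate) if desired.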
- Emailed Nick Malleson to encourage his involvement
- Preparing for project meeting on 2011-01-12 and 2011-01-13
- ...
- Getting Liferay and a DB set up for submitting Grid jobs...
- Browsing
- Miscellanea
- Collaboration with Ian Turton on Geographical Analysis Technology Development
- The latest code submitted by Ian now compiles with no errors
- I emailed Ian the draft abstract I cobbled together yesterday for GeoComputation 2011 as an OpenOffice document.
- Publication
- Developing Riyadh RTA first draft
2010-12-28
2010-12-26
2010-12-24
2010-12-23
2010-12-22
Miscellanea
- ...
- Collaboration with Ian Turton on Geographical Analysis Technology Development
- Compiled latest version of Spatial-Cluster-Detection on my CentOS PC
- Preparing an abstract for Geocomputation 2011
- Developing a GAM/K Spatial Clustering Web Processing Service with GeoTools
- Ian Turton (GeoVISTA Center, Department of Geography, The Pennsylvania State University), Andy Turner (The Centre for Computational Geography, School of Geography, University of Leeds)
- Abstract
- Introduction
-
On the 16th of December 2010 a Google code repository was set up to refactor some Java code for spatial cluster detection called Cluster.
The repository is located at the following URL:
-
As far as we know, although the Java code and the Cluster tool have been used, they have not been (openly) developed since being made openly available online as an output of the SPIN! project.
Hopefully this story, the availability of an open access code repository and the use of the GeoTools process API to control things will attract further interest in developing this implementation collaboratively.
- Background
-
The SPIN! project was funded for 3 years from 2000-01-01 until 2002-12-31 by the European Commission (Turner 2011).
It built on many years of work developing what were called geographical analysis machines (Openshaw 1995).
-
The first online version of the geographical analysis machine GAM/K was made available for others to use as part of a project entitled "A smart spatial pattern explorer for the geographical analysis of GIS data" funded by the UK Economic and Social Research Council (Turton I 1998, Openshaw et al. 1999).
The user of the system would configure it by uploading input files and setting parameters via some web forms.
The processing was performed on hardware based at the University of Leeds and the results when ready were available for the user to download.
This online version of GAM/K (and a similar tool called GEM) was available for about a year until catastrophic hardware failures occurred and the system was not recovered.
The system failure followed an even bigger catastrophe - Stan Openshaw who had pioneered the development of GAM/K suffered a disabling stroke in 1999 and retired.
-
The first efforts to make a web service provided a useful learning experience and at the same time, a first translation of GAM/K Fortran code into Java was done.
At the time Java was in its infancy and a re-implementation was made for the SPIN! project, where GAM/K and various other spatial cluster detection methods were implemented in a single software component for a spatial data mining system for data of public interest.
The component was called Cluster.
A web based SPIN! system (which incorporated Cluster) was made available at a SPIN! project partner institution, but this system became unavailable just over a year after the end of the project.
The built Cluster component was stand alone and it and the source code were made available online at the end of the SPIN! project.
They are still available and work, but the code was not being actively developed as it did not reside in an open source code repository and a community of users was not being developed.
Motivation for putting this right was sparked by user interest in December 2010.
It is also timely as the NeISS project aims to provide a version of GAM/K as a hosted service as part of the e-Infrastructure it is developing (Turner 2010).
-
In this latest phase of spatial cluster detection development, the GeoTools process API is being used to control things (Davis 2008).
A Web Processing Service (Open Geospatial Consortium, 2007) is being developed to test with GeoServer (GeoTools 2010, GeoServer 2010).
It is hoped that by the time of the conference a hosted service will be available as part of the NeISS e-Infrastructure and that we can present some development details and highlights.
- Progress
- To date the refactoring has been the work of Ian Turton who set up the code repository and did an initial refactoring of GAM/K which replicated the Gateshead Cancer Cluster result as shown in Figure 1 (Openshaw et al. 1987).
-
- Figure 1
- References
-
Davis G., 2008, GeoTools Users Guide: Implementing a new Process. [online] http://docs.codehaus.org/display/GEOTDOC/Implementing+a+new+Process [Accessed on 2011-01-04].
-
GeoServer, 2010, GeoServer Web Pages. [online] http://geoserver.org [Accessed on 2011-01-04].
-
GeoTools, 2010, GeoTools Web Pages. [online] http://www.geotools.org [Accessed on 2011-01-04].
-
Open Geospatial Consortium, 2007, OGC Web Processing Service Standard. [online] http://www.opengeospatial.org/standards/wps [Accessed on 2011-01-04].
-
Openshaw S, 1995, Developing Automated and Smart Spatial Pattern Exploration Tools for Geographical Information Systems Applications. In the Journal of the Royal Statistical Society. Series D (The Statistician) Vol. 44, No. 1 (1995), pages 3-16. [online] http://www.jstor.org/stable/2348611 [Accessed on 2011-01-04].
-
Openshaw S, Charlton M. E., Wymer C., Craft A., 1987, A Mark 1 Geographical Analysis Machine for the Automated Analysis of Point Data Sets. In the International Journal of Geographical Information Systems, Vol. 1, No. 4, pages 335-358. [online] http://www.informaworld.com/openurl?genre=article&issn=1365-8816&volume=1&issue=4&spage=335 [Accessed on 2011-01-04].
-
Openshaw S., Turton I., MacGill J., Davy J., 1999, Putting the Geographical Analysis Machine on the Internet.
Chapter 10, pages 121-132 in Gittings, B. (ed) Innovations in GIS 6, Volume 1, Part 4.
[online] http://www.informaworld.com/openurl?genre=article&isbn=978-0-7484-0886-3&volume=1&issue=4&spage=121 [Accessed on 2011-01-04].
-
Turner A., 2010, NeISS Project Web Page. [online] http://www.geog.leeds.ac.uk/personal/a.turner/projects/e-ISS/ [Accessed on 2011-01-04].
-
Turner A., 2011, SPIN! Project Web Page. [online] http://www.ccg.leeds.ac.uk/projects/spin/ [Accessed on 2011-01-04].
-
Turton I, 1998, [online] http://www.ccg.leeds.ac.uk/projects/smart/ [Accessed on 2011-01-04].
- http://standard.cege.ucl.ac.uk/workshops/Geocomputation
- http://www.geocomputation.org/
- http://www.geocomputation.org/2011/
- (Re)creating SPIN! Project Web Pages
Holiday work
- 2011-01-03
- GEOG1300
- Started marking 2nd essays...
- Set up quora and friendfeed accounts:
- Posted on Cameron Neylon's blog:
- http://cameronneylon.net/blog/finding-the-time/
- Post
- Thank you Cameron et al.
- I've just read about ostatus.org which status.net implements.
- For a few months, status.net based services like identi.ca have provided reasonable conversation context.
- A recent change to Twitter also allowed users to get some context, but I don't find it easy to see entire conversations. Perhaps there is another service that does that for Twitter...
- 2010-12-24
- NeISS/GENESIS
- Released new version to address xsd file not found issues and emailed Tom...
- 2010-12-23
- Compiled latest version of Spatial-Cluster-Detection on my laptop
Teaching and Administration