Introduction
- This web page is a progress report for MoSeS for the period 2006-08-31 to 2006-10-02.
- Contents:
Summary
- During the reporting period the School of Geography Beowulf machine has been used intensively.
- This use produced numerous sets of results for analysis for Leeds.
- The results were distinguished in two ways:
- One distinction is that results were produced either for Output Area (OA) or Middle Level Super Output Area (MSOA) control constraints and optimisations.
- Another distinction is that one type of result used ISARDataRecords for both the Household Population (HP) and the Communal Establishment Population (CEP), while the other used HouseholdSARDataRecords for HP and ISARDataRecords for CEP.
- For each type of result, a number of outputs were produced for a range of optimisation criteria:
- Some results were highly optimised, others were simply constrained.
- In producing results for MSOA using the HSARDataRecords and constraining by CAS003, a difficulty was encountered using Sampling With No Replacement (SNR).
- An analysis of the results has revealed some logic errors in the procedure. Various refinements have been made and more are necessary.
- Analysis of Results:
- The production of results has gone hand in hand with the development of methods and further output to analyse performance.
- This has led to the identification of logic errors in the initialisation process.
- Next Steps:
- Allow an option for Sampling With Replacement (SWR).
- Improve analysis of results.
- Produce data.
- Report.
Details
- Beowulf processing:
- Major problems have been experienced in producing results.
- This is partly due to the way the parallel processing with MPJExpress was implemented.
- It is also down to a communication problem that seems to be either a bug or a feature of MPJExpress.
- Cooking the data is not as automated as one would like.
- The use of Barrier() raised new issues.
- The parallel processing methods are being re-written to be more robust and flexible.
- A first attempt is to use a queue to write out results.
- This should be much more efficient than using Barrier().
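The queue idea above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the MoSeS implementation: threads in a single JVM stand in for the parallel processes, no MPJExpress calls are used, and the names `ResultQueueSketch`, `writeViaQueue` and `END_OF_RESULTS` are hypothetical. Workers put finished results on a shared queue and carry on, while a single writer drains the queue, so no worker waits for the others at a barrier before output is written.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only: class and method names are hypothetical,
// and threads stand in for the parallel processes.
class ResultQueueSketch {

    // Unique sentinel object, compared by identity so it cannot clash with a real result line.
    private static final String END_OF_RESULTS = new String("END");

    /** Runs the workers, writes every queued result line to out, and returns the number of lines written. */
    static int writeViaQueue(int workers, int resultsPerWorker, Writer out) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        int[] written = {0};

        // Single writer: the only thread that touches the output.
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    String line = queue.take(); // blocks until a result is available
                    if (line == END_OF_RESULTS) {
                        break;
                    }
                    out.write(line);
                    out.write('\n');
                    written[0]++;
                }
                out.flush();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        // Workers: each queues its results as soon as they are ready.
        Thread[] producers = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int id = w;
            producers[w] = new Thread(() -> {
                for (int i = 0; i < resultsPerWorker; i++) {
                    queue.add("worker" + id + ",result" + i); // placeholder result line
                }
            });
            producers[w].start();
        }

        try {
            for (Thread p : producers) {
                p.join();
            }
            queue.add(END_OF_RESULTS); // all workers done: tell the writer to stop
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        }
        return written[0];
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        int n = writeViaQueue(4, 3, out);
        System.out.println(n + " result lines written");
    }
}
```

In a distributed run the queue would presumably live on one process that receives results from the others; how that maps onto MPJExpress is not shown here.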
- MSOA CAS003 control constraints and SNR:
- There is an MSOA in Leeds containing over 800 Household Reference Persons (HRPs) aged 16 to 19.
- There are only just over 600 such HRPs in the Household SAR (HSAR).
- There is no easy way around this problem.
- One solution that was implemented controls as much as possible, then allows the generation of a further set of HRPs, not bound by the control constraint, from another age group.
- It seems more desirable either to boost the sample by creating new synthetic HSARDataRecords, or to relax the SNR criterion.
- I do not understand the requirement for an SNR criterion.
- The SNR criterion is the main reason that the optimisation is so difficult: it requires many additional checks to be made and results in many inefficiencies.
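The shortfall described above can be illustrated with a small sketch. It assumes round figures of 800 HRPs required and 600 available records (the report gives only "over 800" and "just over 600"), and the class and method names are hypothetical. Sampling without replacement cannot satisfy the demand from the pool, whereas sampling with replacement can, at the cost of reusing records.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: names and figures are assumptions, not MoSeS code.
class SamplingSketch {

    /** SNR: picks `demand` distinct record indices from a pool of `poolSize`; fails if the pool is too small. */
    static List<Integer> sampleWithoutReplacement(int poolSize, int demand, Random rng) {
        if (demand > poolSize) {
            throw new IllegalArgumentException(
                    "cannot draw " + demand + " distinct records from a pool of " + poolSize);
        }
        List<Integer> indices = new ArrayList<>(poolSize);
        for (int i = 0; i < poolSize; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, rng);
        return new ArrayList<>(indices.subList(0, demand));
    }

    /** SWR: picks `demand` record indices from the pool, allowing the same record to be chosen repeatedly. */
    static List<Integer> sampleWithReplacement(int poolSize, int demand, Random rng) {
        List<Integer> picks = new ArrayList<>(demand);
        for (int i = 0; i < demand; i++) {
            picks.add(rng.nextInt(poolSize));
        }
        return picks;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int available = 600; // assumed pool size
        int required = 800;  // assumed demand
        System.out.println("SWR drew " + sampleWithReplacement(available, required, rng).size() + " records");
        try {
            sampleWithoutReplacement(available, required, rng);
        } catch (IllegalArgumentException e) {
            System.out.println("SNR failed: " + e.getMessage());
        }
    }
}
```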
- Logic errors, a matter of age:
- During initialisation HRPs are selected from within an age range, but during optimisation only specific ages are swapped.
- This needs to be changed so that a specific age can be swapped with any age from the range.
- This logic error has not entirely invalidated previous results, but they are likely to be much less optimal than desired.
- How to fix this is known, and the fix has been made in the new methods developed for Sampling With Replacement (SWR).
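The difference between the two selection rules can be sketched as below. This is an assumption-laden illustration: the `Hrp` class, its fields and both method names are hypothetical, and the actual record classes are not shown in this report. Matching on one exact age gives the optimiser far fewer swap candidates than matching on the age range used at initialisation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: the Hrp type and method names are hypothetical.
class AgeSwapSketch {

    /** Minimal stand-in for a Household Reference Person record. */
    static final class Hrp {
        final int id;
        final int age;

        Hrp(int id, int age) {
            this.id = id;
            this.age = age;
        }
    }

    /** Rule described as the logic error: only records of exactly the same age are swap candidates. */
    static List<Hrp> candidatesExactAge(List<Hrp> pool, int age) {
        List<Hrp> result = new ArrayList<>();
        for (Hrp h : pool) {
            if (h.age == age) {
                result.add(h);
            }
        }
        return result;
    }

    /** Intended rule: any record whose age lies in [minAge, maxAge] is a swap candidate. */
    static List<Hrp> candidatesInRange(List<Hrp> pool, int minAge, int maxAge) {
        List<Hrp> result = new ArrayList<>();
        for (Hrp h : pool) {
            if (h.age >= minAge && h.age <= maxAge) {
                result.add(h);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Hrp> pool = new ArrayList<>();
        int[] ages = {16, 17, 18, 19, 20};
        for (int i = 0; i < ages.length; i++) {
            pool.add(new Hrp(i, ages[i]));
        }
        System.out.println("exact age 17: " + candidatesExactAge(pool, 17).size() + " candidate(s)");
        System.out.println("range 16 to 19: " + candidatesInRange(pool, 16, 19).size() + " candidate(s)");
    }
}
```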
- Next Steps:
- Allow an option for Sampling With Replacement (SWR).
- This has been implemented for the case of using ISARDataRecords for both HP and CEP:
- This is the easiest case to deal with as the population size is easier to fix.
- This implementation is being adapted for using HSARDataRecords for HP.
- Improve the analysis of results:
- Extend from looking at Age/Gender profiles to comparing health, employment and non-optimised variables, e.g. Ethnicity.
- Produce a full set of results for comparison and report on an analysis of these.
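One simple way such a comparison could be made is sketched below. This is an assumption, not the measure used in MoSeS: it sums the absolute differences between observed and synthetic counts per category, and the class name, method name and category labels are hypothetical. The same function would apply to any categorical variable, optimised or not, for example Age/Gender bands or Ethnicity.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: the error measure and all names are assumptions.
class ProfileComparisonSketch {

    /** Sums |observed - synthetic| over every category present in either profile. */
    static long totalAbsoluteError(Map<String, Integer> observed, Map<String, Integer> synthetic) {
        Set<String> categories = new HashSet<>(observed.keySet());
        categories.addAll(synthetic.keySet());
        long total = 0;
        for (String category : categories) {
            long o = observed.getOrDefault(category, 0);  // a missing category counts as zero
            long s = synthetic.getOrDefault(category, 0);
            total += Math.abs(o - s);
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up counts, for illustration only.
        Map<String, Integer> observed = Map.of("M16-19", 10, "F16-19", 12);
        Map<String, Integer> synthetic = Map.of("M16-19", 8, "F16-19", 15, "M20-24", 1);
        System.out.println("total absolute error: " + totalAbsoluteError(observed, synthetic));
    }
}
```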
Links