Modelling: Parallel Programming
[Part 11 of 12]
When trying to model the real world we frequently run up against the limits of computing. In this part we'll look at solutions to various issues associated with running models or analyses on a single computer.
Firstly, we'll look at some of the potential problems you may come across, most notably running out of memory and/or processing time.
Introduction to parallelisation (powerpoint)
Further info:
Sadly Flynn's Taxonomy isn’t named after the TRON character, but rather Michael Flynn.
Papers:
Evans, A.J. (2012) 'Uncertainty and Error'. In Heppenstall, A.J., Crooks, A.T., See, L.M. and Batty, M. (eds) Agent-Based Models of Geographical Systems. Springer.
Malleson, N., See, L.M., Evans, A.J. and Heppenstall, A.J. (2010) 'Implementing comprehensive offender behaviour in a realistic agent-based model of burglary'. Simulation, 0(0): 1-22.
Parry, H., Evans, A.J. and Morgan, D. (2006) 'Aphid population dynamics in agricultural landscapes: an agent-based simulation model'. Ecological Modelling, Special Issue on Pattern and Processes of Dynamic Landscapes, 199(4): 451-463.
Parry, H. and Evans, A.J. (2008) 'A comparative analysis of parallel processing and super-individual methods for improving the computational performance of a large individual-based model'. Ecological Modelling, 214(2-4): 141-152.
Next, we'll look at how to make programs that run across multiple processors.
Further info:
MPJ Express and docs (a minimal usage sketch appears below).
P2P-MPI (well set up for Peer-to-Peer development).
Some platforms (like mpiJava) require an underlying C implementation to wrap around, like LAM.
Parallel programming (powerpoint)
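To give a flavour of message passing, here's a minimal sketch using MPJ Express's mpiJava-style API. The class name and the exact launch command are illustrative assumptions; check the MPJ Express docs for the launcher on your system. Process 0 sends a single int to process 1:

    import mpi.*;  // MPJ Express supplies the mpiJava-style API

    public class HelloMPJ {  // hypothetical example class
        public static void main(String[] args) throws Exception {
            MPI.Init(args);                        // start up the MPI environment
            int rank = MPI.COMM_WORLD.Rank();      // this process's id (0..size-1)
            int size = MPI.COMM_WORLD.Size();      // total number of processes

            int[] buffer = new int[1];
            if (rank == 0) {
                buffer[0] = 42;
                // Blocking send of one int to process 1, with message tag 0.
                MPI.COMM_WORLD.Send(buffer, 0, 1, MPI.INT, 1, 0);
                System.out.println("0 of " + size + ": sent " + buffer[0]);
            } else if (rank == 1) {
                // Blocking receive of one int from process 0, tag 0.
                MPI.COMM_WORLD.Recv(buffer, 0, 1, MPI.INT, 0, 0);
                System.out.println("1 of " + size + ": received " + buffer[0]);
            }
            MPI.Finalize();                        // shut down cleanly
        }
    }

Launched with something like mpjrun.sh -np 2 HelloMPJ (two processes); the same send/receive pattern scales up to scattering model data out to workers and gathering results back.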
Finally, we'll look at some different computer architectures for tying together multiple machines.
Further info:
Peer-to-peer framework BOINC (Wikipedia).
Cloud option: Amazon Elastic Compute Cloud (Amazon EC2). This can include virtual clusters for HPC (nice YouTube video; instance definitions; pricing).
Parallelisation across cores using Java concurrency (Introduction and additions in 1.7); see the sketch after this list.
Alternatives to parallel programming include treating all the machines as one giant virtual machine (e.g. using MOSIX), or using remote method invocation, utilising something like CORBA.
Running *nix: options include VirtualBox, and running from a USB stick (How-to; LinuxLive, PenDrive).
Tutorial on learning how to use *nix.
More on Secure Shell.
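As a taste of the concurrency approach mentioned above, here's a minimal sketch using the java.util.concurrent executor framework (available since Java 5; 1.7 added fork/join on top). The class name and the dummy workload are illustrative assumptions. It splits eight independent "model runs" across a thread pool sized to the machine's cores and sums the results:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.*;

    public class ConcurrencyDemo {  // hypothetical example class
        public static void main(String[] args) throws Exception {
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);

            List<Future<Long>> results = new ArrayList<>();
            for (int run = 0; run < 8; run++) {
                final int seed = run;
                // Each Callable stands in for one independent model run.
                results.add(pool.submit(() -> {
                    long total = 0;
                    for (long i = 0; i < 10_000_000L; i++) total += (i ^ seed) % 7;
                    return total;
                }));
            }

            long grandTotal = 0;
            for (Future<Long> f : results) grandTotal += f.get(); // blocks until each run finishes
            pool.shutdown();

            System.out.println(cores + " cores used; grand total = " + grandTotal);
        }
    }

Note that this only works so neatly because the runs are independent: the moment runs need to talk to each other mid-calculation, we're back to the communication issues discussed next.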
Communication levels are key: tasks that need little communication (as in batch distribution of independent runs) allow much greater flexibility in how we tie machines together. To understand the impact of communication on our architectural choices we have to think about the speed at which processing is done.
If we look at the processing and memory retrieval itself, we can see that RAM is much faster than hard-drive access, and CPUs can generally run faster than RAM can supply data (MIps is millions of instructions per second, an instruction potentially covering several bytes; Mbps is millions of bits per second). However, the real bottlenecks are the communications between components. This varies even on a single motherboard (the board to which most computer components are attached). The connections between components are called "buses". The System Bus connects the CPU and memory; already we can see that this limits data transfer from the RAM to the CPU. IO Buses sit between the CPU and devices like hard-drives, and the typical hard-drive IO Bus runs much slower than the hard-drive itself can be read. This is why we don't want to write to the hard-drive if we can help it: writing here is almost two orders of magnitude slower than the CPU can process data.
Central Processing Units can now process >7000 million instructions per second.
Typical RAM read speeds are ~3000 million bits per second.
Typical System Bus transfer rates are ~1000 million bits per second.
Typical hard-drive read speeds are ~700 million bits per second.
Typical IO buses for hard-drives run at ~133 million bits per second.
Typical fast ethernet runs at 100Mbps.
Typical ethernet connection runs at 10Mbps.
Typical home network runs at 1.6Mbps.
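To see why this hierarchy matters, here's a rough back-of-the-envelope example using the figures above (illustrative numbers only): moving 1GB of model data (about 8000 million bits) takes roughly 2.7 seconds from RAM at ~3000Mbps, 8 seconds across the System Bus at ~1000Mbps, around a minute across a hard-drive IO bus at ~133Mbps, and well over an hour across a 1.6Mbps home network. The further the data sits from the CPU, the more communication, rather than computation, dominates the runtime.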