Parallelisation and Java
Having built our model, we now want to run it. We're going to use MPJ Express to compile and run it.
First, go to the MPJ Express download site, download it, and expand it into a directory on your M: drive. Here we assume M:/MPJ.
Now download the model files: Model.java, Landscape.java, Agent.java. Here we'll assume they are saved in M:/java/model.
We'll now compile and run the files. MPJ needs a couple of environment variables to be set. If you run it in Cluster Configuration, these have to be set up in the System Environment Variables. However, we're going to run in Multicore Configuration, so we can get away with setting them at the command line, as MPJ is going to run from a single instance of the command prompt. Rather than repeatedly typing these commands, we'll run them from a bat ("batch") file (a Windows command prompt script). In the same directory as your model java files, make an empty text file called run.bat (make sure it isn't run.bat.txt). Open it in Notepad++ and add the following:
@echo off
set MPJ_HOME=m:\MPJ
set PATH=%PATH%;"%MPJ_HOME%\bin"
This will set up the MPJ_HOME variable and add the MPJ bin directory to the computer's search path. The first line just stops the commands being repeated back to you when they run. You can usually run a bat file just by double-clicking it. However, we're going to want to see some output at the command line, so open up a command prompt, navigate to the directory containing your batch file, and type the name of the file to run it (you can miss off the ".bat" if you like). If it runs properly, you shouldn't see anything at the moment. (You can find out more about bat files in this Introduction and Detailed tutorial, plus info on Setting variables.)
Next we need to add some lines to compile our code. Add the following:
"C:\Program Files\Java\jdk1.7.0_25\bin\javac" -cp .;%MPJ_HOME%/lib/mpj.jar *.java
Note that we use the -cp (classpath) parameter to tell the compiler where the MPJ files are; the "." keeps the current directory on the classpath as well. Run the bat file again to check it works.
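If the compiler complains that it can't find the MPJ classes, the usual suspect is that MPJ_HOME doesn't point where you think it does. As a quick check, a small stand-alone class along the following lines reports whether mpj.jar is where the -cp parameter expects it. This is only a sketch: the class name CheckMpj and its methods are made up for this example and are not part of MPJ Express.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CheckMpj {

    // Build the path that "-cp .;%MPJ_HOME%/lib/mpj.jar" refers to.
    static Path mpjJar(String mpjHome) {
        return Paths.get(mpjHome, "lib", "mpj.jar");
    }

    // Describe whether the jar can be found for a given MPJ_HOME value.
    static String report(String mpjHome) {
        if (mpjHome == null || mpjHome.isEmpty()) {
            return "MPJ_HOME is not set";
        }
        Path jar = mpjJar(mpjHome);
        return Files.exists(jar) ? "Found " + jar : "Missing " + jar;
    }

    public static void main(String[] args) {
        // Reads the variable set by "set MPJ_HOME=..." in run.bat.
        System.out.println(report(System.getenv("MPJ_HOME")));
    }
}
```

Because the variable is only set inside the command prompt that ran run.bat, the check needs to be run from that same prompt (or from run.bat itself) to see it.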
Finally, we need to run the files on multiple nodes (in this case multiple cores, but it is pretty much the same process for multiple machines). Add the following to the bat file (Don't run it yet!):
%MPJ_HOME%\bin\mpjrun.bat -np 4 Model
This invokes a second bat file, supplied with MPJ, which runs the java classes across the local cores or a set of identified nodes. Here we've asked for the Model.class file to be run using 4 nodes.
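The 4 in "-np 4" doesn't have to match the number of cores in your machine, but it can be useful to know how the two compare. The following stand-alone sketch prints the number of logical processors Java can see and flags when the requested number is larger. The class name CoreCount and the default of 4 are assumptions for this example; nothing here is part of MPJ Express.

```java
public class CoreCount {

    // True when more processes are requested than there are cores to run them.
    static boolean oversubscribed(int requested, int cores) {
        return requested > cores;
    }

    public static void main(String[] args) {
        // Number of processes requested, e.g. the 4 in "-np 4"; defaults to 4.
        int requested = args.length > 0 ? Integer.parseInt(args[0]) : 4;
        // Logical processors the JVM can see on this machine.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Requested " + requested + ", available " + cores);
        if (oversubscribed(requested, cores)) {
            System.out.println("Some cores will run more than one process.");
        }
    }
}
```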
You can try running it; however, it may come back and say it doesn't recognise java. If it does, you need to find the mpjrun.bat file (in the MPJ bin directory) and tell it where java is. Alter it from this:
@echo off
java -jar %MPJ_HOME%/lib/starter.jar %*
to this...
@echo off
"C:\Program Files\Java\jdk1.7.0_25\bin\java" -jar %MPJ_HOME%/lib/starter.jar %*
Note that %* just passes the starter.jar the parameters passed into mpjrun.bat by run.bat.
Run the run.bat file again. You should now see some results from the model, generated by running on 4 nodes. If you ask for more nodes than the machine has cores, it will just split the program into the number of threads you've asked for, and allocate them to the cores that are available. This makes MPJ really useful for parallel development. The other good thing about developing on one machine using parallel cores is that if you add System.out.println calls, their output appears on the command line at your machine, so debugging is easy. Try to get the program to list the number of agents on each node, for example. There is more information on debugging across multiple PCs and using Eclipse in MPJ development on the MPJ Express website under "user guides".
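As a starting point for the "agents on each node" exercise: each copy of the program knows its own rank and the total number of processes, which in MPJ Express come from MPI.COMM_WORLD.Rank() and MPI.COMM_WORLD.Size(). The sketch below leaves MPJ out so that it runs on its own, and simply loops over the ranks instead. The class name AgentShare, the even-split rule, and the total of 10 agents are all assumptions for illustration; the supplied Model.java may well divide its agents differently.

```java
public class AgentShare {

    // Agents given to one rank when 'total' agents are dealt out as evenly
    // as possible: the first (total % size) ranks get one extra agent.
    static int agentsOnRank(int total, int rank, int size) {
        int base = total / size;
        int extra = rank < (total % size) ? 1 : 0;
        return base + extra;
    }

    public static void main(String[] args) {
        int total = 10; // hypothetical number of agents
        int size = 4;   // matches "-np 4"
        for (int rank = 0; rank < size; rank++) {
            // In the real model this line would run once in each process,
            // with rank and size supplied by MPJ rather than by a loop.
            System.out.println("Node " + rank + " of " + size + " has "
                    + agentsOnRank(total, rank, size) + " agents");
        }
    }
}
```

In the model itself you would drop the loop and print a single line per process, so that each node reports its own count.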
Finally, although we don't have a Beowulf cluster any more, they are still useful to know about, not least because a lot of them are built into grid and cloud systems. With this in mind, we thought you might be interested in comparing running MPJ like this with running things on a Beowulf cluster, so I've saved our old Beowulf instructions for you. Although you can't use them, reading through them will give you a flavour of how this works on Linux, using a different MPI implementation that needs a C-based MPI implementation running underneath it: Beowulf instructions.
If, for your dissertation, you want to make a model that requires HPC, we can get you access to a Grid system that runs MPJ in cluster configuration; just talk to Andy or Nick.