Running
[Agent practical 9 of 9]
Again, fortunately our library contains some code to help with this.
Make yourself a new directory \java\src\unpackaged\framework9par\ and download these model files into it:
Unzip GA.zip again, but this time don't delete the \java\src\unpackaged\framework9\genetic_algorithm\grid\ directory and code.
The model files above are exactly the same as before, but with one change. As the parallel GA code does a certain amount of passing objects around as chunks of binary data, we need to be able to serialize our objects -- that is, turn them into binary suitable for transferring and writing to files. Because of this, each of our classes (but not the interfaces, which can't implement anything) now states that it implements the
java.io.Serializable
interface, and imports it. In actual fact, you don't have to do anything more than that to implement this interface, as it declares no methods -- you just need to say you're doing it.
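For example, a class from the model might now start something like this (a minimal sketch; the class and field names here are just for illustration):

import java.io.Serializable;

public class Individual implements Serializable { // marker interface: nothing else to add
    private double fitness;                       // fields are serialized automatically
    // ... rest of the class unchanged ...
}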
The new grid
package turns the standard GA code into code that runs on a Grid Computing system, that is, a system of heterogeneous computers (including data servers, visualisation suites, etc.) connected together to form a supercomputing facility. However, it will run just as well on a cluster of PCs, or even a single PC with multiple cores, which is how we're going to use it. We're going to use
MPJ Express to do this. MPJ Express allows us to run Java code in parallel, either across a network or on a single multi-core PC, with no distinction between the two. This makes it an excellent testing environment.
To run the model in parallel, first go to the MPJ Express download site, download it, and expand it into a directory on your M: drive. Here we assume M:\MPJ.
Next, we need a slightly different GAWrapper. Here it is:
If you open it up, you'll see that the change is in the main() method, which now uses the parallel-enabled
genetic_algorithm.grid.GeneticGrid
class. This package is now imported, along with mpi.
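In other words, near the top of the file you should now see import lines along these lines (the exact form in the supplied file may differ slightly):

import genetic_algorithm.grid.GeneticGrid; // the parallel-enabled GA
import mpi.*;                              // the MPJ Express classes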
In addition, the class uses a little basic MPI to determine whether it should run the final prediction. As the GAWrapper will run on each core, we need to pick one core, and one only, to run the prediction. With a network run, we'd pick node zero, as this would be the one we started the program on, and often the only one with a screen. With a multi-core run everything actually writes to the same place and displays on the same console, but for clarity and flexibility we'll do as we would on a network, and run the final model on node zero, which will be the starting core.
To do this, we initialise MPI, and then find the rank (node number) of the current core. If the code is running on node zero, we'll finish the program by running the prediction:
mpi.MPI.Init(args);
if (mpi.MPI.COMM_WORLD.Rank() == 0) { // etc.
Perhaps more obscurely, because of the way the code runs and distributes its components, it isn't possible to call a standard constructor within this main method once MPI has initialised. Instead, we have to run the final model as an external process, which we do by asking the Runtime to start it for us.
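Put together, the structure of main() is roughly this (a sketch only -- "Model" stands in for whatever class runs the final prediction, and the GA calls and exception handling are omitted):

mpi.MPI.Init(args);                          // start MPI on every core
// ... set up and run the parallel GA here ...
if (mpi.MPI.COMM_WORLD.Rank() == 0) {        // node zero only
    // Run the final prediction as an external process, as we can't
    // construct the model directly once MPI is running.
    Process p = Runtime.getRuntime().exec("java Model");
    p.waitFor();                             // wait for the prediction to finish
}
mpi.MPI.Finalize();                          // shut MPI down cleanly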
Save the GAWrapper to the directory with the code in it and switch to the directory on the command prompt.
We'll now compile and run the files. MPJ requires a couple of environment variables to be set up. If you run it in Cluster Configuration (across the network), these have to be set in the System Environment Variables. However, we're going to run in Multicore Configuration, so we can get away with setting them up at the command line, as MPJ is going to run from a single instance of the command prompt. Rather than repeatedly typing these commands, we'll run them from a bat file: Grid.bat. Open it up and check it out. If you've saved your MPJ anywhere else, you'll need to edit the
set MPJ_HOME=m:\MPJ
line.
This script sets up the MPJ_HOME variable and adds the MPJ bin directory to the computer's search path. The first line just stops the commands being echoed back to you as they run. The file then compiles the model files, ensuring that the MPJ classes are on the classpath (using the -cp classpath parameter), and then runs the model.
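If you want a picture of its shape, the script is along these lines (a sketch, assuming MPJ is installed in M:\MPJ and its classes live in %MPJ_HOME%\lib\mpj.jar; the exact source filenames compiled will differ):

@echo off
set MPJ_HOME=m:\MPJ
set PATH=%MPJ_HOME%\bin;%PATH%
javac -cp .;%MPJ_HOME%\lib\mpj.jar *.java
%MPJ_HOME%\bin\mpjrun.bat -np 4 GAWrapper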
The line:
%MPJ_HOME%\bin\mpjrun.bat -np 4 GAWrapper
invokes a second bat file, supplied by MPJ, that runs the Java classes across the local cores or a set of identified nodes. Here we've asked for the GAWrapper.class file to be run on 4 nodes.
You can try running it; however, it may come back and say it doesn't recognise java. If it does this, you need to find the mpjrun.bat file and tell it where java is. Alter it from this:
@echo off
java -jar %MPJ_HOME%/lib/starter.jar %*
to this...
@echo off
"C:\Program Files\Java\jdk1.8.0\bin\javac" -jar %MPJ_HOME%/lib/starter.jar %*
Note that %* just passes the starter.jar the parameters passed into mpjrun.bat by Grid.bat.
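If %* is new to you, a two-line bat file (hypothetical, just for illustration) shows what it does:

@echo off
echo The parameters were: %*

Calling this as test.bat -np 4 GAWrapper prints "The parameters were: -np 4 GAWrapper".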
Run the Grid.bat file. You should now see some results from the model, generated running on 4 nodes. You might also notice that it creates a number of log directories in your code directory.
You'll probably find that, although we're doing more runs here (800 vs. 400), the result will be worse than the non-parallel result. This is mainly because Nick's parallel code uses the number of nodes other than node zero as the number of Chromosomes; for us, that's just three, whereas the non-parallel version uses 30. Obviously, on a grid system with hundreds of nodes this would work better.
However, despite this, we do see a significant speed-up. Some of this will be down to the low number of Chromosomes; but even if we run with the same number of Chromosomes and the same Chromosome keep-rate, we still see a significant speed-up. On an OK PC, the non-parallel model does 800 runs in 132.53 seconds, while the parallel version does this in just 3.37 seconds -- roughly forty times faster, far more than the four-fold gain you'd expect from four cores. This results in part from improved memory allocation, in part because node zero is usually doing other stuff as well as running our model, and in part because the algorithm Nick has implemented is slightly simpler in the parallel version and actually requires a larger number of iterations to produce results equivalent to the non-parallel version.
Note that if you ask for more nodes than the machine has cores, it will just split the program into the number of threads you've asked for, and allocate them to the cores that are available. This makes MPJ really useful for parallel development. The other good thing about developing on one machine but using parallel cores is that if you add System.out.println calls, their output appears on the command line on your machine, so debugging is easy. There is more information on debugging across multiple PCs, and on using Eclipse for MPJ development, on the MPJ Express website under "user guides".
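If the output from several cores gets confusing, one trick (our suggestion, not something the supplied code does) is to tag each message with the rank of the node that printed it:

System.out.println("Node " + mpi.MPI.COMM_WORLD.Rank() + ": generation complete");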
Finally, let's go on to a brief summary.