Running
[Agent practical 9 of 9]
The simplest way is to use a GA library.
Fortunately for us, the ever-excellent Nick Malleson has written some great GA code, which you can download here: GA.zip. Download the code and unzip it into your code directory for this practical, such that the GA java files are in \java\src\unpackaged\framework9\genetic_algorithm\. When you unzip it, as well as this genetic_algorithm package, you should also find a genetic_algorithm.grid package in its own sub-directory. We'll come back to this. For the moment, delete the grid directory and its java files, as it will otherwise cause issues with compiling.
Nick's code is well put together, so it's worth a look (it comes with a simple built-in example), but the detail need not worry us too much at the moment. The key thing is that we need to do two things:
1) Subclass his genetic_algorithm.Chromosome class, and override its createGenes() and calcFitness() methods. The former allows you to determine the range of values in the genes and the total number of genes (each gene will be one variable we want to calibrate), while the latter allows you to fix how you assess the fitness of the set of genes in the Chromosome. In our case this will involve running the model with the gene values the code makes for us (within the ranges requested in createGenes()), and then comparing the results with the 'real' data.
2) Set up a class that will start the GA running.
Once we start the GA running, it will breed sets of Chromosome objects (using our subclass), each filled with genes based on createGenes() and each reporting its fitness to the core GA when it calls their calcFitness() method. The central GA will stop when it has reached the end of a number of iterations, or a stopping criterion (an error rate it would be nice to reach).
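To make the breeding loop and its two stopping conditions concrete, here is a minimal, self-contained toy sketch. It is not Nick's code and does not use the genetic_algorithm package: the class ToyGA, its method names, the keep-the-best-half selection, the crossover, and the step-of-one mutation are all assumptions made for illustration, and running the model is replaced by a simple distance from two known target values.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

// Hypothetical toy GA for illustration only; NOT the genetic_algorithm package's API.
class ToyGA {
    static final double MIN = 1.0;   // lowest value a gene may take
    static final double MAX = 200.0; // highest value a gene may take

    // Stand-in for calcFitness(): lower is better. A real version would run the model
    // and compare its output with the 'real' data.
    static double error(double[] genes) {
        return Math.abs(genes[0] - 50.0) + Math.abs(genes[1] - 80.0);
    }

    // Keep a gene value inside the permitted range.
    static double clamp(double v) {
        return Math.max(MIN, Math.min(MAX, v));
    }

    // Breed until maxIterations is used up or the best error drops to targetError.
    static double[] run(int populationSize, int maxIterations, double targetError, long seed) {
        if (populationSize < 2) {
            throw new IllegalArgumentException("populationSize must be at least 2");
        }
        Random rnd = new Random(seed);
        Comparator<double[]> byError = Comparator.comparingDouble(ToyGA::error);

        // Start with random two-gene sets within the range.
        double[][] pop = new double[populationSize][];
        for (int i = 0; i < populationSize; i++) {
            pop[i] = new double[] {
                MIN + rnd.nextDouble() * (MAX - MIN),
                MIN + rnd.nextDouble() * (MAX - MIN)
            };
        }
        Arrays.sort(pop, byError); // fittest (lowest error) first

        for (int iter = 0; iter < maxIterations; iter++) {
            if (error(pop[0]) <= targetError) {
                break; // stopping criterion reached before the iterations ran out
            }
            int half = populationSize / 2;
            // Keep the best half; replace the rest with children of the best half.
            for (int i = half; i < populationSize; i++) {
                double[] mum = pop[rnd.nextInt(half)];
                double[] dad = pop[rnd.nextInt(half)];
                double[] child = { mum[0], dad[1] };          // crossover
                int g = rnd.nextInt(2);
                child[g] = clamp(child[g] + (rnd.nextBoolean() ? 1.0 : -1.0)); // mutation
                pop[i] = child;
            }
            Arrays.sort(pop, byError);
        }
        return pop[0]; // best gene set found
    }

    public static void main(String[] args) {
        // Population size, iteration cap, target error and seed are arbitrary values.
        double[] best = run(20, 200, 5.0, 42L);
        System.out.println(best[0] + ", " + best[1] + " (error " + error(best) + ")");
    }
}
```

Because the best half of each generation is carried over unchanged, the best error in this sketch can never get worse from one iteration to the next; whether the real library makes the same choice is not something this page tells us.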
As it happens, we're going to make a single class that both subclasses genetic_algorithm.Chromosome and runs the GA. It's going to wrap around our model to add the GA, so we'll call it GAWrapper:
Download this class and save it into the directory the rest of our model code is in (not the genetic_algorithm directory). Open it up and have a look at it. In combination with the above description, you should be able to get the idea of how it works from the comments.
Note that createGenes() makes two genes, whose values will range from 1.0 to 200.0, with changes of one if a value is unsuitable (on top of any initial randomisation):
public Gene[] createGenes() {
    Gene[] thegenes = new Gene[2];
    for (int i = 0; i < 2; i++) {
        thegenes[i] = new Gene("Gene" + i, 1.0, 200.0, 1.0);
    }
    return thegenes;
}
Note that our calcFitness() gets hold of the values allotted to the current Chromosome object we're inside (genes is a protected variable in the superclass). It uses these to construct a String array, which it then passes directly to our Model constructor (bypassing the main method, which would usually pass these on from the command line). Neat, huh?
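As a small illustration of turning gene values into the kind of String array the Model constructor takes, here is a hedged sketch. The class ModelArgsSketch, the method buildArgs and its parameter names are hypothetical. The argument order (two integer settings, two gene values, an input file, an output file) is assumed to match the array that main() builds for the final prediction run, and the values in the usage example are taken from that run, since the file names used inside calcFitness() aren't given here.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of GAWrapper or the GA package.
class ModelArgsSketch {
    // Builds the argument array in the same order the command line would supply it.
    // What the two integer settings mean is not stated in this practical.
    static String[] buildArgs(int firstSetting, int secondSetting,
                              double geneOne, double geneTwo,
                              String inputFile, String outputFile) {
        return new String[] {
            Integer.toString(firstSetting),
            Integer.toString(secondSetting),
            Double.toString(geneOne),
            Double.toString(geneTwo),
            inputFile,
            outputFile
        };
    }

    public static void main(String[] args) {
        String[] modelArgs = buildArgs(10, 3360, 50.0, 80.0, "newroom.txt", "prediction.txt");
        System.out.println(Arrays.toString(modelArgs));
        // In the real wrapper the array would then be handed on: new Model(modelArgs);
    }
}
```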
Having run the model, calcFitness() then reads in the results and calculates a cell-by-cell total error between the two datasets, which it reports as the fitness of this Chromosome.
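A cell-by-cell total error can be sketched as follows. This is only one plausible reading: it assumes both datasets have already been read into equally sized 2D arrays and that the error for each cell is the absolute difference, neither of which is confirmed by the text above. The class and method names are hypothetical.

```java
// Hypothetical sketch of a cell-by-cell total error; not the code in GAWrapper.
class ErrorSketch {
    // Sums the absolute difference between matching cells of two grids.
    // Assumes lower totals mean a better fit.
    static double totalError(double[][] predicted, double[][] observed) {
        if (predicted.length != observed.length) {
            throw new IllegalArgumentException("Grids have different numbers of rows");
        }
        double total = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i].length != observed[i].length) {
                throw new IllegalArgumentException("Row " + i + " has different lengths");
            }
            for (int j = 0; j < predicted[i].length; j++) {
                total += Math.abs(predicted[i][j] - observed[i][j]);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] predicted = { {1.0, 2.0}, {3.0, 4.0} };
        double[][] observed  = { {1.0, 0.0}, {5.0, 4.0} };
        System.out.println(totalError(predicted, observed)); // prints 4.0
    }
}
```

Using the absolute difference stops over- and under-predictions in different cells from cancelling each other out; a squared difference would be an equally reasonable choice if large errors should be penalised more heavily.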
Finally, note that GAWrapper contains a main() method to run everything. This runs the GA, telling it what our Chromosome class is:
Genetic gen = new Genetic(GAWrapper.class, f, 400, 30, 10);
gen.run();
before getting the best set of genes from the GA and running the model one last time to make a prediction using a new initial data file newroom.txt (this one only has 24 computers):
Chromosome[] chromes = gen.getChromes();
String[] newArgs = {
    Integer.toString(10),
    Integer.toString(3360),
    Double.toString(chromes[0].getGenes()[0].getValue()),
    Double.toString(chromes[0].getGenes()[1].getValue()),
    "newroom.txt",
    "prediction.txt"
};
new Model(newArgs);
We could then validate this prediction against new data using the same kind of code that's in calcFitness() if we wanted, but we'll leave it at the prediction stage for now.
Anyhow, compile all the code (you should just be able to use javac *.java, as javac will look in the genetic_algorithm sub-directory for the package source files it needs), and then run java GAWrapper ga.csv. The command line argument is the name of a file that will contain the details of the fitness after each iteration of the GA. You can open it up to see how the GA progressed, though I wouldn't expect any great results for this particular model: in truth, the two behaviours are similar in action (both act to control the time at a location), so the GA will struggle to distinguish between them, and you'll probably end up with an average of both for each. For reference, the real figures (the data was made by running the GUI model) were an eatingRate of 50 and a fullUp of 80.
So, that takes a short while for the runs we've built in, but a more serious calibration effort would take some time. Finally, then, let's look at whether we can use parallel programming to speed it up.