We are delighted to present a guest blog post by Dr Colin Wren. Colin is currently a post-doc at Arizona State University and will be joining the Anthropology faculty at University of Colorado – Colorado Springs in Fall 2015. He uses GIS and agent-based models to study mobility patterns in the past at small and large scales, in particular how those patterns are influenced by agents' perception of the environment. Follow him on Twitter @cdwren.
This is the first blog post in a series of tutorials dedicated to how to time your code and make it go faster. So watch this space, Python users!
I think it’s fair to say that all modelers want their models to run faster. A faster run time means we can do more tests during development, run the model longer, run a broader suite of parameters during sensitivity analyses, and get results from the final complete set of runs with NetLogo’s BehaviorSpace faster. My modeling mentor always told us that archaeological models were “over modeled and under run”, so making your model run faster is a good way to run it more and make sure you understand what it is doing.
However, our models are limited by the processing speed of the computer. This is affected by the processor itself, but also the RAM and the hard disk speed depending on what you’re asking the model to do. For the sake of this tutorial we’ll keep it simple and just think about this as the total time it takes to accomplish the tasks we set for it. A simple count of all the things that are asked of the model gives a good estimate. Most of the tasks and little math operations can be counted as roughly equivalent (for exceptions see the end of the post).
Before we get to NetLogo’s Profiler extension, think about how many patches and how many agents there are in your model. You probably have more patches than agents. So any time you ask all the patches to do something, it’s probably going to take longer to process than asking all the agents to do something. For this reason I tend to try to get most things done by the agents themselves, even if it is updating a patch variable.
For example, this:
ask patches [if count agents-here > 0 [set resources resources - ( count agents-here * harvest-rate )]]
will take much longer than this:
ask agents [ask patch-here [set resources resources - harvest-rate]]
even though both are accomplishing the same task.
To quantify the model’s processing speed, we need to find out which parts of the model are asked to be performed and how many times. This is where NetLogo’s Profiler extension comes in. To enable it, add the following to the very top of your model’s code:
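extensions [ profiler ]  ;; load the Profiler extension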
Next add a button to the interface of the model, call it Profiler, and then in the code box enter:
setup                 ;; set up the model
profiler:start        ;; start profiling
repeat 30 [ go ]      ;; run something you want to measure
profiler:stop         ;; stop profiling
print profiler:report ;; view the results
profiler:reset        ;; clear the data
This assumes you have set up your favorite model in the standard way, with the initialization procedure called “setup” and the iterating part of the model called “go”. Now when we push the button, it will run the model for 30 ticks and then drop the processing-time results into NetLogo’s Command Center.
BEGIN PROFILING DUMP
Sorted by Exclusive Time
Name  Calls  Incl T(ms)  Excl T(ms)  Excl/calls
GO    30     2339.607    2339.607    77.987
Sorted by Inclusive Time
GO    30     2339.607    2339.607    77.987
Sorted by Number of Calls
GO    30     2339.607    2339.607    77.987
END PROFILING DUMP
The results are structured as three tables, which are really just one table re-sorted by each of its columns. Each row represents a model procedure that was called at least once. The first column after the procedure name is the number of times that procedure was “called”. The second is the “inclusive time” in milliseconds for that procedure, or the total time the model spent within that procedure AND the procedures it called (e.g. the inclusive time for “go” is the total run time). The third is the “exclusive time”, which is the total time spent just on that procedure WITHOUT any procedures it called. The last is the exclusive time per call (rather than summed over all the calls). I find the exclusive-time column the most useful in terms of finding out why my model is taking so long.
Now, if you have designed your model such that everything happens within one big “go” procedure, you’re not going to get very interesting results here. To make the most of Profiler, you’re going to have to break up the model into separate procedures. This is good practice for coding anyway since it helps to be able to see the overall structure of the model in go (almost like a table of contents), and have the different parts of the model divided up into different segments of code.
As an example of how to do this, see the before and after code for this simple foraging model below:
to go
  ask agents [
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; consume resources
    set backpack backpack + harvest-rate
    ask patch-here [ set resources resources - harvest-rate ]
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; move to a new cell
    let p one-of neighbors
    if [resources] of p > [resources] of patch-here [ move-to p ]
  ]
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; update the display to reflect the change in resources
  ask patches [
    set pcolor scale-color green resources min [resources] of patches max [resources] of patches
  ]
  tick
end
to go
  ask agents [
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; consume resources
    consume  ;this is our new consume procedure, which is defined below
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; move to a new cell
    move
  ]
  update-patches ;update-patches-faster
  tick
end

to consume
  set backpack backpack + harvest-rate
  ask patch-here [ set resources resources - harvest-rate ]
end

to move
  let p one-of neighbors
  if [resources] of p > [resources] of patch-here [ move-to p ]
end

to update-patches
  ask patches [
    set pcolor scale-color green resources min [resources] of patches max [resources] of patches
  ]
end
Now when we click our Profiler button it will divide the processing time into the time spent on “go”, “move”, “consume”, and “update-patches”. This is useful since now we can see which is taking the most processing time and we can re-code our model to make it faster.
BEGIN PROFILING DUMP
Sorted by Exclusive Time
Name            Calls  Incl T(ms)  Excl T(ms)  Excl/calls
UPDATE-PATCHES  30     2221.975    2221.975    74.066
CONSUME         300    1.804       1.804       0.006
GO-AFTER        30     2226.679    1.603       0.053
MOVE            300    1.297       1.297       0.004
You can easily see from this table that update-patches is the part taking the most time even though it is being called much less often than consume and move. So why is this? Well if you look at the code you can see that within the procedure I’ve asked every patch to look at every patch twice (once for min, once for max)! This is a ridiculously computationally intensive way to do something as simple as changing the patch colour. A better way is like this:
to update-patches-faster
  let min-val min [resources] of patches
  let max-val max [resources] of patches
  ask patches [
    set pcolor scale-color green resources min-val max-val
  ]
end
On my machine this took 2000 ms the first way and only 12 ms the second way. An even faster method would be to have your agents update the pcolor of only the patch they’ve just harvested (bonus points to the first person to figure out how to do this and tweet the solution to @cdwren). This has the advantage that most patches aren’t asked to do anything on most ticks, and as expected it reduced the processing time to 0.4 ms.
If you’re unsure which of your different coding methods is going to run faster, then make a new profiler button to check it out! On the interface, make a new button, call it profile-modules, and do the same as before but with go replaced by the name of your modules:
setup                 ;; set up the model
profiler:start        ;; start profiling
repeat 30 [
  update-patches
  update-patches-faster
]                     ;; run something you want to measure
profiler:stop         ;; stop profiling
print profiler:report ;; view the results
profiler:reset        ;; clear the data
Now when you click the button, it will quickly give you a specific break-down of those options without the rest of your model getting in the way.
Last thing for my inaugural (and probably far too long) post: my top 5 list of computational time sucks to avoid in NetLogo:
1) ask patches
Get the agents to update patches whenever possible
2) Interface plots
Computational time for plots is included in the “go” section of the profiler report. While often necessary, plots take a lot of time. Add a plots-on? switch to the interface and an if plots-on? condition to the plot code, and turn plots off when possible
3) Writing to Command Center
Useful for debugging, but make sure to comment out those print and show statements once you’re done
4) Think simpler. Writing a simple model is hard, but reducing the complexity of agent behavior will speed everything up. Keep only what’s necessary and sufficient.
5) Mean, min, max [variable] of patches
Already covered above, but use temporary variables to calculate these once instead of having every patch or agent repeat the calculation.
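To make tip 2 concrete, here is a minimal sketch of the plots-on? idea, assuming a hypothetical plot named “Population” on the interface and a plots-on? interface switch (both names are just placeholders for whatever your model uses):

to do-plots
  if plots-on? [                   ;; plots-on? is a switch on the interface
    set-current-plot "Population"  ;; hypothetical plot name
    plot count agents              ;; only plot when the switch is on
  ]
end

Then call do-plots from “go” instead of plotting directly, and flip the switch off during long runs.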
Download my sample model here.
Ok, that’s it for now, happy coding everyone!