In the first blogpost dealing with the modelling tools of the trade, Ben Davies argued in favour of NetLogo. I join the debate to argue for the simplest of commonly used programming languages – Python.
Like pretty much everyone coming from archaeology into modelling, I started with NetLogo. And I do appreciate its beauty, its simplicity, its user-friendliness and the underlying philosophy that the simpler the simulation the better.
The unbelievable speed with which you can teach a complete newbie (or yourself) to create a full-blown simulation from scratch makes it the perfect tool to introduce non-coding archaeologists to modelling. On top of that, it gives you a feeling of familiarity with what you code (turtles ‘hatch’ or get ‘sprouted’, they ‘move-to’, they ‘die’ etc.) and immediate feedback on what’s happening as the simulation unfolds on the screen. It is great for prototyping, playing with ideas and creating simple models.
In a nutshell, I love the turtles!
Sadly, our paths split at some point. I learnt the second best thing in the computing world (after this blog) – Python, and I have never looked back. Here are the four main reasons why:
A complete change of perspective
NetLogo’s simplicity is seductive, yet it may be deceiving. The simplicity of use comes at the price of obscuring what exactly happens under the hood of your simulation. You told your turtles to ‘move-to’ but what exactly does it mean? Do the agents have a special variable ‘location’ or, perhaps, is the grid on which they live a dynamically updated matrix? It may not take much time to develop a model but it will take time to fully understand it. And it is just too tempting to simply trust the language and hope that it does what you think it does. Until, as happened to several of my colleagues, you reimplement your model in another language and the results are completely different. Why? Because they thought the model was doing x and, in fact, it was doing y. Anyone who has ever used any stats package, GIS software or a database knows how surprisingly often this happens in the digital world.
Programming is, in fact, a series of very simple maths operations and the better you understand what combination of these calculations constitutes your model, the better. Sometimes, it may even become obvious that you don’t need a simulation at all – you can use nothing more than a calculator and still solve the problem. As boring as it may sound, sooner or later we will have to switch the focus of modelling in archaeology from ‘making turtles do x’ to thinking in terms of mathematical operations.
This touches on the issue of testing. ‘Testing’ means you take a series of numbers, calculate outside the model how these numbers should change with every step of the simulation and compare them to the numbers that your model spat out. One should not underestimate the testing stage – it often takes as long as (or longer than) building the code. NetLogo lacks effective testing tools that would automate this process and it is difficult to ‘tease out’ the underlying calculations and redo them in another environment (by hand, Excel, R etc). In practice, it is tempting to look at the screen and conclude ‘looks like it is doing what I think it should be doing’. If we want to avoid faulty modelling results, that’s not good enough.
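To make this concrete, here is a minimal sketch of such a test in Python. The population model, the function names `step` and `run`, and the rates are all made up for illustration; the point is only that the expected numbers are worked out by hand, outside the model, and then compared with what the code produces.

```python
import math


def step(population, birth_rate, death_rate):
    """Advance a toy (hypothetical) population model by one time step."""
    births = population * birth_rate
    deaths = population * death_rate
    return population + births - deaths


def run(population, birth_rate, death_rate, n_steps):
    """Return the population after every step, starting value included."""
    history = [population]
    for _ in range(n_steps):
        population = step(population, birth_rate, death_rate)
        history.append(population)
    return history


# Expected values calculated by hand, outside the model:
# step 1: 100 + 10 - 5 = 105
# step 2: 105 + 10.5 - 5.25 = 110.25
expected = [100.0, 105.0, 110.25]
result = run(100.0, birth_rate=0.10, death_rate=0.05, n_steps=2)

# Compare the model's numbers with the hand calculation, step by step.
for step_number, (got, want) in enumerate(zip(result, expected)):
    assert math.isclose(got, want), f"step {step_number}: got {got}, expected {want}"
```

If the model and the hand calculation disagree, the assertion fails and tells you at which step they diverged, which is a lot more reliable than watching the screen.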
Computational modelling is not mainstream archaeology (yet, we are working on it 😉) and the chances are that many of the models will not be replicated any time soon or, indeed, at all. Therefore, taking responsibility for ensuring there are no bugs in your code is extremely important, even more so than in other disciplines. Using standard programming languages makes it much easier.
However, compare NetLogo to a general-purpose programming language (Python, Java, C etc.) and you’ll soon see the chasm in how much you can achieve in a given time. As my Python teacher said in the first lecture, ‘Life is short and the PhD even shorter’. You can optimise NetLogo code, you can use lists, you can dump some of the heavy calculations onto its R extension to speed up your simulation, etc. Or you can code it in Python. Even a poor implementation in Python is likely to be faster than well-designed NetLogo code.
This looks like a mostly logistical issue but it cuts deeper into the philosophy of model building. Why is the speed important? Three reasons. 1. With more speed you can simply do more – you can try out a wider parameter space, test more scenarios, test different implementations etc. 2. You can run the cycle of model development several times, which is the best way to improve the quality of any model. 3. You will avoid the horror of realising there is a tiny bug shortly before a paper deadline/an important conference/thesis submission and not being able to do anything about it. I would argue that the initial time you spend learning a programming language other than NetLogo will be given back to you later down the line when you need to run your simulations.
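As a rough illustration of the first point, this is what ‘trying out a wider parameter space’ can look like in plain Python. The function `run_model` is a toy stand-in rather than a real simulation, and the parameter names and values are invented for the example.

```python
import itertools
import random


def run_model(n_agents, move_probability, seed, n_steps=100):
    """Toy stand-in for a simulation: count agents that moved at least once."""
    rng = random.Random(seed)  # a seeded generator keeps each run repeatable
    moved = 0
    for _ in range(n_agents):
        if any(rng.random() < move_probability for _ in range(n_steps)):
            moved += 1
    return moved


# Hypothetical parameter values to explore.
n_agents_values = [10, 50]
move_probabilities = [0.001, 0.01]
seeds = range(3)  # repeat every combination with three different seeds

# Run the model once for every combination of parameters and seed.
results = {}
for n_agents, p, seed in itertools.product(n_agents_values, move_probabilities, seeds):
    results[(n_agents, p, seed)] = run_model(n_agents, p, seed)
```

Adding another value to any of the lists widens the sweep without touching the rest of the code, and the faster a single run is, the more combinations you can afford.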
Universality – one language to rule them all
A number of non-archaeologists blogged recently about how Python seems to be taking over the world of scientific computing with its simplicity, its vast libraries of extensions and the ever growing documentation and support base (see here and here). They pointed out that scientists slowly shift from “I need software x to do x’, software y to do y’ and software z to do z'” to “how can I do it all in Python?”. The same applies to archaeologists.
Hardly any simulation can go forward without some form of GIS manipulation and some sort of data analysis. Anyone who has had the pleasure of dealing with the GIS software that is currently the industry standard (you know which one I mean) knows that it involves long bouts of pulling one’s hair, exclaiming ‘are you kidding me?’ and swearing to all known forces in the universe that you will never, never ever use it again. Achtung, achtung my friends, let me put a stop to everyone’s misery as there are fantastic alternatives! You can use one of the open source GIS packages (for example Grass GIS) or, even simpler, use ArcMap model builder, export the code in 3 clicks and use the script. Both solutions, however, involve at least some rudimentary knowledge of Python.
The same applies to data analysis (read: the stats) you will, most probably, run on your simulation’s results. The easy way is to use Excel, SPSS, MiniTab or another interface-based software. Most people quickly realise that this is either a) a frustrating exercise and/or b) too limited to do what you actually need and/or c) ridiculously time consuming (remember those days you spent copy-pasting columns from one spreadsheet to another?). R solves most of these problems. Python solves them even better. Even if you’re a coding fanatic and love learning new programming languages (which raises the question of why you use NetLogo in the first place) there is what Yarkoni calls the “cognitive switch cost of reminding yourself say, (…) that you need to call len(array) instead of array.length to get the size of an array…” etc. Now that you can do pretty much everything in Python instead of using the triad of ArcMap/Grass – NetLogo – R, you cut down significantly on the overhead of switching between them every three months and relearning the same commands over and over again. Not to mention you don’t need to convert the files from one format to another. Remember the ‘life is short and the PhD…’?
Less than half of all PhD graduates will continue in Academia (if you want to get depressed in a matter of minutes, check the Nature report or the Economist’s analysis here and a fantastic article on ‘how academia resembles a drug gang’ here). I’m not saying you’re not going to be one of the lucky ones but stats are stats and nothing indicates that anthro-/archaeology has a particularly higher retention rate of PhDs compared to other disciplines.
The circle of academic-based simulation projects, non-humanities departments or commercial companies that are likely to be looking for someone with NetLogo coding skills is small. But what if you can code in Python (or Java, or C++)? The dream job (or ‘any job’) is much more likely to materialise.
In the current academic climate (around 1% or less of research funding spent on humanities AND social science in both US and Europe) it’s good to have a plan B (even though I hope none of us will need to execute it). A widely used programming language, approved by both industry and academia, gives you a solid transferable skill, which NetLogo, no matter how well it actually works, doesn’t.
My final verdict: start off with the turtles, use them for prototyping, playing with ideas and teaching modelling newbies but when you feel comfortable with ‘if loops’, ‘lists’ and ‘scheduling’, move on to Python. You won’t regret it.