Sooner or later everyone comes to the realisation that their code is slow, too slow. Thankfully there were a lot of ‘everyone’ before us and they actually did something about it. This means that the tempting option of leaving your laptop churning through the simulation and going on a month-long holiday is no longer defensible. This blog post will sketch out what you can do instead. It draws from my own experience, so it’s very Mac-centric (Windows users can just skip all the paragraphs on how to deal with the hell that Apple sends your way) and applies to scripts written in Python only. For a tutorial on how to profile your code in NetLogo, check out Colin Wren’s post here.
Also, if you’re a proper software developer, stop reading now or you may get a cardiac arrest at my frivolous attitude to computer science nomenclature. Having wasted hours on deciphering the cryptic jargon of online tutorials, Stack Overflow and various bits of documentation, I make no apologies.
Finally, a word of warning: it may take up to a full day to set everything up. There will be cursing, there will be plenty of ‘are you kidding me’ and ‘oh c’mon, why can’t you just work’ moments, so make sure you take frequent breaks, switch the computer off and on from time to time, and know that perseverance is key. Good luck!
The key to speeding up your code is to a) figure out which bits are actually slow and b) make them go faster. The first task is called ‘profiling’, the second ‘optimisation’ (so that you know what to google for). This post is about profiling; another one on optimisation will follow soon.
A good place to start is to check how long your code takes to run overall. You can time it by typing into the terminal on a Mac / the command line in Windows (don’t type the ‘$’, it’s just a marker meaning ‘new line’):
$ time python script_name.py
script_name.py is the name of your file. Remember to either navigate to the folder containing the file by typing cd (meaning ‘change directory’) followed by the path to the folder, e.g. cd user/phd/python_scripts, or provide the full path to the file, e.g.
$ time python user/phd/p_scripts/script_name.py
If you cannot figure out what the full path is (thank you Apple, you really couldn’t have made it any more complicated), drag and drop the file onto the terminal as if you were moving it from one folder to another and the full path will appear in a new line.
Once it works, the time function produces a pretty straightforward output with three numbers: real (the wall-clock time, i.e. how long you actually waited), user (CPU time spent running your code) and sys (CPU time spent in system calls on your code’s behalf).
Now, add up the sys and user times – if their sum is much less than the real time then the main problem is that your computer is busy with other stuff and the code had to wait until other tasks were completed. Yep, switching Facebook off may actually speed up your code. Sad times.
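You can see the same real-versus-CPU distinction from inside Python: time.perf_counter() measures wall-clock time, while time.process_time() only counts the CPU time your process actually used. A minimal sketch (the busy function is just a made-up workload):

```python
import time

def busy(n):
    # pure CPU work, so it shows up in both clocks
    total = 0
    for i in range(n):
        total += i * i
    return total

wall_start = time.perf_counter()
cpu_start = time.process_time()

busy(1_000_000)
time.sleep(0.5)  # waiting consumes wall-clock time but no CPU time

wall = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start

# wall will be roughly 0.5s larger than cpu, just like
# 'real' exceeding 'user' + 'sys' in the time output
print(f"wall-clock: {wall:.2f}s, CPU: {cpu:.2f}s")
```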
Profiling the functions
So far so good, but the overall time tells you little about which bit of the code is slowing things down. To the rescue comes the armada of profiling tools. The first step into the world of profiling is to watch this highly enjoyable talk: Python profiling. The first half is really useful; the second half deals with some hardcore business application so you can skip it.
To sum it up, Python has a standard built-in tool called cProfile. You call it from the terminal with:
$ python -m cProfile script_name.py
And usually it produces pages upon pages of not particularly useful output, along the lines of:
163884 function calls (161237 primitive calls) in 5.938 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 <string>:1(ArgInfo)
1 0.000 0.000 0.000 0.000 <string>:1(ArgSpec)
1 0.000 0.000 0.000 0.000 <string>:1(Arguments)
1 0.000 0.000 0.000 0.000 <string>:1(Attribute)
1 0.000 0.000 0.000 0.000 <string>:1(DecimalTuple)
1 0.000 0.000 0.000 0.000 <string>:1(Match)
1 0.000 0.000 0.000 0.000 <string>:1(Mismatch)
1 0.000 0.000 0.000 0.000 <string>:1(ModuleInfo)
1 0.000 0.000 0.000 0.000 <string>:1(ParseResult)
1 0.000 0.000 0.000 0.000 <string>:1(SplitResult)
One needs some tools to trawl through all this. To start with, it makes sense to stop viewing it in the terminal window. If you run this command:
$ python -m cProfile -o script_output script_name.py
it will create a file with the data, where ‘script_output’ is the name you want to assign to that file. To then see the data in a browser window, install the cprofilev tool. As usual, they tell you that one line of code:
$ pip install cprofilev
in your terminal is enough to install it. Yeah, right, as if that was ever to happen. To start with, make sure you’ve got the words sudo pip at the beginning of the line. Using sudo means that you’re pretending to be the root administrator, i.e. the almighty ‘this is my bloody computer and if I want to I will break it’ – you can also give yourself root admin rights by following these instructions, but Apple will drag you through pages and pages of warnings that will leave only the bravest of us actually daring to continue. So whenever I get an ‘access denied’ message I stick the sudo command in front and usually it does the trick:
$ sudo pip install cprofilev
If the terminal spits out an error message about pip, it is likely that you don’t have it installed, so type in the terminal:
$ sudo easy_install pip
and try again. This should be enough sweat to make it work, but if it keeps on producing error messages, go to the ultimate authority of Google. If it did work (i.e. you ran $ sudo pip install cprofilev and it didn’t show any errors), type in the terminal:
$ cprofilev /path/to/script_output
script_output is the name you assigned to the file created with cProfile four code snippets ago (scroll up). The terminal will spit out this fantastically cryptic message:
cprofilev server listening on port 4000
This just means that you need to point your browser (Safari, Firefox, IE) at localhost:4000
and the data will appear as a nice webpage where you can click on the headings and sort it by the number of calls, total time they took etc. You can find a comprehensive description of how to use it here.
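If the cprofilev installation battle proves too much, there is a fallback: the standard pstats module, which ships with Python, can sort and trim the same cProfile data from inside your script, no extra installs needed. A minimal sketch (slow_sum is just a made-up function to profile):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # deliberately naive, so it shows up in the profile
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# sort by cumulative time and keep only the top 5 entries
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

You can also point pstats at the script_output file directly with pstats.Stats("script_output").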
Run, snake, run!
Another fantastic way of making sense of the cProfile output is the RunSnakeRun visualisation tool. To install it (together with its SquareMap dependency), type in the terminal:
$ sudo easy_install SquareMap RunSnakeRun
You need wxPython to run the tool; thankfully this one actually has a ‘click’ installation type (I almost forgot these exist!): you can get the file from here. You may get an error saying that the disc is damaged; it’s not. It’s (yet again…) Apple, who does not want you to install anything that is not ‘certified’ by them. Here are instructions on how to bypass it in the Mac internal settings.
If you’re a happy Windows or Linux user you’re good to go; if you have a Mac there is one more bug – you’ll probably get this error message:
OSError( """Unable to determine user's application-data directory""" )
This is because the software is not fully compatible with Mac OS X but you can repair it by typing in the terminal:
Now, run this in your terminal:
$ runsnake script_output
where script_output is that file you created with cProfile, remember? The one you got with this line:
$ python -m cProfile -o script_output script_name.py
and you should now be able to get a nice visualisation of how much time each function consumes. It looks like this:
In the left-hand panel you can sort the functions by execution time, number of calls, the combined time they took, etc., while the right-hand panel shows the same information in a more human-friendly format, i.e. in colour.
Run, snake, run is truly a fantastic tool: it gives you an idea of what eats up the time, pointing to the functions that could be optimised for better performance. But it also floods you with loads of the ‘inner guts of Python’ – functions inside functions inside functions – hence finding out which bit of your code, i.e. the exact line, is the root of the problem is far from obvious. The line_profiler tool is great for that. There are some nice tutorials on how to use it here and here. To install it, try typing in the terminal:
$ sudo pip install line_profiler
This should do the trick; if not, download the files from here and try all the possible installation routes described here. Once it’s installed, add the @profile decorator above the function you want to test, for example:
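Something along these lines (count_numbers is a made-up example, not my actual script; note that @profile only exists while the script runs under kernprof, so the little try/except shim stops plain python from crashing on the decorator):

```python
try:
    profile  # injected into builtins by kernprof
except NameError:
    # fallback so the script still runs under plain python
    def profile(func):
        return func

@profile
def count_numbers(numbers):
    # count how often each number appears in the list
    counts = {}
    for n in numbers:
        counts[n] = counts.get(n, 0) + 1
    return counts

print(count_numbers([1, 2, 2, 3, 3, 3]))
# prints {1: 1, 2: 2, 3: 3}
```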
and run from the terminal:
$ kernprof.py -l -v script_name.py
The -l flag makes the @profile decorator work and the -v flag displays the timings once the script has finished.
If it doesn’t work, make sure that you have kernprof.py in the same folder as the script you want to run (it’s in the line_profiler folder you downloaded earlier), or provide the full path to where it lives on your computer, for example:
$ /line_profiler-1.0/kernprof.py -l -v script_name.py
The output is pure joy and simplicity, and looks something like this:
Now we’re talking. It clearly shows that in the case of my script almost 60% of the time is spent on line 20, where I count how often each number appears in my list. If you need someone to spell out how to read the output, head to this tutorial.
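The sort of line that gets flagged here is counting occurrences the naive way – something like this (a hypothetical reconstruction of the pattern, not my actual script): calling list.count once per value rescans the whole list every time, so the work grows quadratically with the list size, and line_profiler duly pins most of the runtime on that one line.

```python
numbers = [1, 2, 2, 3, 3, 3] * 1000

# the slow pattern: .count() rescans the whole list for every value,
# so this single line dominates the runtime in the profiler output
counts = {n: numbers.count(n) for n in set(numbers)}
```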
If you want to go further, check out this tutorial on profiling how much memory the program uses and checking if none of it is leaking. Or get on with it and switch to working on speeding up the slow bits of your code. The next tutorial will give you a few hints on how to achieve it.
Top image source: Alan Cleaver on Flickr flickr.com/photos/alancleaver/2661425133/in/album-72157606825074174/