The Segregation

Published almost half a century ago, the Segregation Model is the most commonly invoked example of how simple, abstract models can yield big and very real knowledge (and a Nobel Prize).

The idea is so simple that it is sometimes used as a beginner’s tutorial in NetLogo, but recently it got a beautifully crafted new interactive visualisation by Vi Hart and Nicky Case, which you can find here.

Imagine a happy society of yellows and blues. The yellows quite like the blues and vice versa, but they also like to live close to other yellows (and the blues close to other blues). Now the key element of the story is that even if that preference for living among members of your own group is very slight (we are talking 30%), it leads to the creation of segregated neighbourhoods. Yes, actual segregated neighbourhoods where yellows live with yellows and blues live with other blues. One would struggle to call anyone racist because they wanted to live in an area where one third of their neighbours are of the same sort, yet these harmless preferences may create a harmful environment for everyone.

This is very counterintuitive (which is probably why nobody figured it out earlier), but the Hart and Case implementation of the model allows everyone to test it for themselves. The playable guides you through the process and lets you try different scenarios. They also include a nice extension: it turns out that even a slight preference for living in a diverse neighbourhood will reverse the segregation pattern.
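If you would rather poke at the mechanism in code, here is a minimal sketch in plain Python. It is not the Hart and Case implementation: the grid size, the share of empty cells and the rule that unhappy agents jump to a random empty cell are all assumptions made for illustration. An agent is unhappy when fewer than 30% of its occupied neighbouring cells hold its own colour.

```python
import random

EMPTY = None


def same_share(grid, r, c):
    """Share of occupied neighbouring cells with the same colour as cell (r, c)."""
    size = len(grid)
    same = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < size and 0 <= cc < size \
                    and grid[rr][cc] is not EMPTY:
                total += 1
                same += grid[rr][cc] == grid[r][c]
    return same / total if total else None  # None: no neighbours at all


def is_unhappy(grid, r, c, threshold):
    share = same_share(grid, r, c)
    return share is not None and share < threshold


def mean_share(grid):
    """Average same-colour share over all agents that have neighbours."""
    size = len(grid)
    shares = [same_share(grid, r, c)
              for r in range(size) for c in range(size)
              if grid[r][c] is not EMPTY]
    shares = [s for s in shares if s is not None]
    return sum(shares) / len(shares)


def run(size=20, empty_fraction=0.2, threshold=0.3, sweeps=50, seed=1):
    rng = random.Random(seed)
    n_empty = int(size * size * empty_fraction)
    n_agents = size * size - n_empty
    cells = (["yellow"] * (n_agents // 2)
             + ["blue"] * (n_agents - n_agents // 2)
             + [EMPTY] * n_empty)
    rng.shuffle(cells)
    grid = [cells[i * size:(i + 1) * size] for i in range(size)]
    before = mean_share(grid)
    for _ in range(sweeps):
        movers = [(r, c) for r in range(size) for c in range(size)
                  if grid[r][c] is not EMPTY and is_unhappy(grid, r, c, threshold)]
        if not movers:
            break  # everyone is content
        rng.shuffle(movers)
        for r, c in movers:
            if not is_unhappy(grid, r, c, threshold):
                continue  # earlier moves already fixed this neighbourhood
            empties = [(i, j) for i in range(size) for j in range(size)
                       if grid[i][j] is EMPTY]
            i, j = rng.choice(empties)
            grid[i][j], grid[r][c] = grid[r][c], EMPTY
    return before, mean_share(grid)


before, after = run()
print(f"mean same-colour share: {before:.2f} at the start, {after:.2f} at the end")
```

Try changing the threshold argument of run() and watch how the final share responds.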

And on that cheerful note: happy winter break everyone!

Call for Papers: 2015 Congress on Evolutionary Computation, Sendai, Japan, May 25-28

The annual Congress on Evolutionary Computation is being held next year in Sendai, Japan. This conference looks to be a mix of academic and industry folks, with an eye on applications of evolutionary programming and distributed intelligence approaches. From the website:

IEEE CEC 2015 is a world-class conference that brings together researchers and practitioners in the field of evolutionary computation and computational intelligence from around the globe. Technical exchanges within the research community will encompass keynote lectures, regular and special sessions, tutorials, and competitions, as well as poster presentations. In addition, participants will be treated to a series of social functions, receptions, and networking events to establish new connections and foster everlasting friendship among fellow counterparts.

While we all go to conferences for the everlasting friendships, there appears to be some good focus on particle swarm and agent-based systems, so lots to be learned. Some special sessions which might be of interest include:

SS40 – Evolutionary Computation with Human Factors
SS42 – Cultural Algorithms: Real World Applications of Collective Intelligence (chaired by Robert Reynolds)
SS45 – Complex Adaptive Systems
SS50 – Social Simulation Using Evolutionary Computation and Meta Heuristics

Abstract submissions close on December 19th. Click here for more information.

Featured image: Zuihōden mausoleum in Sendai (image by 663Highlands at Wikimedia Commons)

Creating Gorgeous Graphics in R

You’ve made an amazing simulation. You’ve double and triple checked to ensure that the output you’re seeing is not something you’ve entrained in the simulation. You have emergence, and you want to show it.

Instead of doing tons of ugly screenshots, you want to make flashy graphics that will quickly and easily display your data. And suddenly you’re hit with the reality: you have to learn another computer language.

This tutorial is meant to show you how to make gorgeous graphics in R. This is the first in a multipart series I’ll be writing about graphing in R.

Please note: I recommend typing the code below directly into R rather than copying and pasting it from this page. Blog fonts tend to turn straight quotes and apostrophes into curly ones, which R does not recognise, so pasted code may throw errors. Besides, it’s good practice to type it all, right?

First, if you haven’t done it already, download R. Once it’s downloaded you will see in your applications you have two versions of R: regular, and R64. R64 is a bigger, more powerful beast, and is good for scads of data. Thus, I always make sure that R64 is used (set it as your default).

(For info, I work on a Mac, so there may be tiny differences in how things run. For example, to execute the code in Mac I use command+enter. It’s slightly different for PC.)

Once R64 is open, click File -> New Document. Here you are going to create your R script. Save this (currently blank) file and give it a name that makes sense to you.

Good data tracking

I usually create a new folder for each of my projects, that way all of the data and graphs that are created (junk graphs and good graphs) are in a central location. However, I have one folder where I keep all of my R scripts, which I name descriptive things with dates (Mongolia_Graphics_Oct_2014.R).

Alright, once you create your file, let’s start coding!

The “R Console” is where your code executes. You can type things there, and if you hit command+enter it will run. However, if you want to save your script you want to write it in the new file you created.
But, to start, let’s write something in the console. Type this:


getwd()


Then hit execute (command+enter). That tells you which working directory you are working from, i.e. where your files will be read from and where you will write files.

Open your file and type your first line of code:


setwd("/Users/stefani")


You can see that this looks similar to the above, but between the parentheses you are telling the computer what to use. Here you’re telling it to set the working directory to “/Users/stefani”, which happens to be my home folder. I suggest you use your own. If you retype getwd(), that will show you what you set your working directory to.

Libraries
R is basically a shell within which you can dock different programs. These programs are called packages (often loosely called “libraries”, after the library() command that loads them) and are downloaded from a central repository, CRAN.

Let’s install the library “lattice”. It’s simple.

Type this in your window:


install.packages("lattice")


You’ll see it crunching about, and finally it will finish. Now if we want to make sure it’s loaded, type this:


library(lattice)


Okay, now please install (with install.packages, as above) and load the following:


library(lattice)

library(latticeExtra)

library(Hmisc)

library(RColorBrewer)

library(plotrix)


For this tutorial we are going to work with some dummy-data that I output from a simple simulation so we can make some population graphs. First, please download the dummy data here (upper right hand corner, click “download csv”): https://www.academia.edu/9049652/Dummy_Data_for_Tutorial

Okay, let’s make some graphs!

First, put the downloaded file in your working directory (the code below assumes it is called data.csv).

Then we need to load the data. It’s a simple command.

To read data, do this:


read.csv("data.csv")


But of course we want to assign that a name, so:


data <- read.csv("data.csv", header=TRUE)


(We are reading the csv file and assigning it to the variable “data”. header=TRUE means the first row of the file holds the variable names, so R uses it to name the columns rather than treating it as data points.)

Now run this next command:


data[1:5,]


And that will show you the first five lines of your data. That shows you all the variables: ticks, agentsA, agentsB, agentsC, agentsD, numberAgents, seed, patchVariability, energyFromFood, energyRegrowthTime, energyLossFromDeadPatches

Okay, first, we note that everything to the right of “numberAgents” is static, so those are parameters that were not changed for this tutorial. But we notice that the first six variables do change. Ticks is the time, so we know we’ll want that on the x axis. The other five are the counts of different types of agents. numberAgents appears to be a sum of agentsA-D, so first let’s make a graph of numberAgents versus ticks.

Try this:


plot(numberAgents~ticks, data=data)


That reads as: plot the number of agents (on the y axis) against the number of ticks (on the x axis). The data we’re using is the variable we assigned above, which is “data”.

Well, my goodness, isn’t that ugly? But it shows a basic trajectory of what we’ve got. There’s a circle for each of the data points (numberAgents), and it’s versus the number of ticks. We can add something very minor:


plot(numberAgents~ticks, data=data, type="l")


That takes the circles and makes them into a line. “type” here means “how should the data be drawn?”: "l" = lines, "p" = points. If you want more information, type:


?plot


And that opens R’s help page for plot, which lists the other options for type.

If we add more arguments to our command, we can change how thick the line is and what color.


plot(numberAgents~ticks, data=data, type="l", lwd=2, col='blue', ylim=c(0,30))


Here we changed the color of the line (col), the thickness (lwd), and the scale (ylim means the y axis goes between 0 and 30).

Let’s say we want to save this graph? What do we do? Take a screenshot? NO! We do this:


pdf(file="PopulationsAllAgents.pdf", width=5.44, height=3.5, bg="white", paper="special", family="Helvetica", pointsize=8)

par(mar=c(4.1,4.5,0.5,1.2))

par(oma=c(0,0,0,0))

plot(numberAgents~ticks, data=data, type="l", lwd=1, col='blue', ylim=c(0,30))

box()

dev.off()


We are creating a pdf file and giving it a size, a font, etc. You can see that our command to plot the line is the same as before (only with a thinner line, lwd=1). box() puts a border around our graph. dev.off() is the command to say “we are done creating our pdf graph”; the file is not complete until you run it.

If you run this you can check your working directory (as you set above) and you should see a file named PopulationsAllAgents.pdf.

Now let’s add the other agents to this graph!


pdf(file="PopulationsAllAgents.pdf", width=5.44, height=3.5, bg="white", paper="special", family="Helvetica", pointsize=8)

par(mar=c(4.1,4.5,0.5,1.2))

par(oma=c(0,0,0,0))

plot(numberAgents~ticks, data=data, type="l", lwd=1, col='blue', ylim=c(0,30))

lines(agentsA~ticks, data=data, type="l", lwd=1, col='red')

lines(agentsB~ticks, data=data, type="l", lwd=1, col='violet')

lines(agentsC~ticks, data=data, type="l", lwd=1, col='orange')

lines(agentsD~ticks, data=data, type="l", lwd=1, col='green')

box()

dev.off()


You should see five lines in blue, red, violet, orange and green. Note that when you start a plot you use the command “plot”, and to add more to it you use other commands. Here we used “lines”, but you can also use others such as “points”. If you wanted to show only the individual agent types, you would start with a plot command for agentsA instead of numberAgents and add the remaining three with “lines”.

Okay, now let’s add a legend to our code. Let’s put it in the upper right hand corner, and use two columns. (topright can be exchanged for other placements. See here: http://stat.ethz.ch/R-manual/R-patched/library/graphics/html/legend.html )


legend("topright", legend=c("All Agents", "A Agents", "B Agents", "C Agents", "D Agents"), bty="n", fill=c('blue', 'red', 'violet', 'orange', 'green'), cex=1.1, ncol=2)


Here’s the full code:


pdf(file="PopulationsAllAgents.pdf", width=5.44, height=3.5, bg="white", paper="special", family="Helvetica", pointsize=8)

par(mar=c(4.1,4.5,0.5,1.2))

par(oma=c(0,0,0,0))

plot(numberAgents~ticks, data=data, type="l", lwd=1, col='blue', ylim=c(0,30))

lines(agentsA~ticks, data=data, type="l", lwd=1, col='red')

lines(agentsB~ticks, data=data, type="l", lwd=1, col='violet')

lines(agentsC~ticks, data=data, type="l", lwd=1, col='orange')

lines(agentsD~ticks, data=data, type="l", lwd=1, col='green')

box()

legend("topright", legend=c("All Agents", "A Agents", "B Agents", "C Agents", "D Agents"), bty="n", fill=c('blue', 'red', 'violet', 'orange', 'green'), cex=1.1, ncol=2)

dev.off()


Let’s inspect the graph. If you look, you can see that the red and blue lines go on top of each other. Hmmm. And the red line ends up being on top. That’s because things happen in the order you write them. Since we put the red line second, it draws on top. Maybe we want to be able to show different types of lines. The command for line types is lty. You can see what they look like here:

http://www.statmethods.net/advgraphs/parameters.html

Play around with changing lty and the lwd (line width) until you get the graph you want.

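If you don’t like the axis labels, set ylab="" and xlab="" on the first plot command (or give them text of your own, e.g. xlab="Ticks"). Also, if you want a different type of output, swap pdf() for png() or another graphics device; note that png() measures width and height in pixels by default, not inches.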

You’ve made a spanking new population graph!

Want more R code? I have a bunch in the appendixes of my M.A. thesis, which is archived here:

https://www.academia.edu/1526170/Why_Can_t_We_Be_Friends_Exchange_Alliances_and_Aggregation_on_the_Colorado_Plateau_Crabtree_M.A._Thesis_2012

Extra special thanks to Kyle Bocinsky, who taught me all I know about making graphics in R, and whose scripts inspired this post.

The next post in this series, “How to handle big data”, will examine what to do when you have data from thousands of runs of simulations.

Keep the MODELLING revolution going! CAA2015, Siena

The CAA (Computer Applications and Quantitative Methods in Archaeology) conference has always been the number one destination for archaeological modellers of all sorts. The motto of the next meeting (to be held in lovely Siena, Italy! 30/03-5/04 2015) is ‘Keep the revolution going’, and given the outstanding presence of simulation, complexity and modelling last year in Paris, I thought it would be a tall order.
Fear no more! The revolution keeps on going with a number of hands-on workshops and sessions on modelling scheduled for Siena. From modelling dispersals to network applications, complexity science is well represented. What is, perhaps, worth particular attention is the roundtable: Simulating the Past: Complex Systems Simulation in Archaeology, which aims to sketch out the current place within the discipline and the future direction of simulation in archaeology. It’s also a call for the formation of a CAA Special Interest Group in Complex Systems Simulation (more about it soon). Follow the links to the abstracts for more details.

The call for papers is now open (deadline: 20 November). Follow this link to submit: http://caaconference.org/program/ .

Sessions:

5L Modelling large-scale human dispersals: data, pattern and process

Michael Maerker, Christine Hertler, Iza Romanowska

5A Modelling approaches to analyse the socio-economic context in archaeology

Monica De Cet, Philip Verhagen

5H Geographical and temporal network science in archaeology

Tom Brughmans, Daniel Weidele

Roundtable

RT5 Simulating the Past: Complex Systems Simulation in Archaeology

Iza Romanowska, Joan Anton Barceló

Workshops

WS8 First steps in agent-based modelling with Netlogo

Iza Romanowska, Tom Brughmans, Benjamin Davies

WS5 Introduction to exploratory network analysis for archaeologists using Visone

Daniel Weidele, Tom Brughmans

 

Image: http://commons.wikimedia.org/wiki/File:Siena5.jpg

MODSIM 2015 Gold Coast call for sessions is now open

MODSIM (alternatively, the International Congress on Modelling and Simulation) is the biennial conference for the Modelling and Simulation Society of Australia and New Zealand, an organisation which promotes the use of simulation and spans the divide between industry and academia. The conference will be held next November at the Gold Coast Convention and Exhibition Centre in Queensland, Australia. The call for sessions is now open; paper abstracts will need to be submitted in April.

For more information, see the MODSIM website.

Starting with Python

No matter if you are an experienced NetLogo coder or have just started with modelling, learning how to code in a scripting language is likely to help you at some point when building a simulation. We have already discussed at length the pros and cons of using NetLogo versus other programming languages, but this is not a marriage: you can actually use both! There are certain aspects in which NetLogo beats any other platform (simplicity, fast development of models and many more), while in some situations it is just so much easier to use simple scripts (dealing with GIS, batch editing of files etc). Therefore, we’ve put together a quick guide on how to start with Python, pointing out all the useful resources out there.

How to install Python

It will take time, a lot of effort and nerves to install Python from scratch. Instead, go for one of the scientific distributions:

Anaconda is a free distribution containing all the useful packages. You can get it from here, and then simply follow the installation instructions here.

Enthought Canopy comes free for academic users. You can get it from here and there is a large training suite attached to it (https://training.enthought.com/courses).

The very first steps

There’s a beginner Python Coursera module starting very soon: https://www.coursera.org/course/pythonlearn but if you missed it, don’t worry, they repeat regularly.

Source: http://www.greenteapress.com/thinkpython/think_python_comp2.medium.png

If you prefer to work with written text, go for ‘Think Python’ – probably the best programming textbook ever created. You can get the pdf for free, here. It is likely to take a week of full-time work to get through the book and do all the exercises, but it is worth doing it in one go. It’s unbelievable how quickly one forgets stuff and then gets lost in later chapters. There are loads of other books that can help you to learn Python: with Head First Python you could teach a monkey to code, but I found it so slow it was actually frustrating.

Source: http://www.cengage.com/covers/imageServlet?epi=1200064128513708027464865671616351030

Alternatively, Python Programming for the absolute beginner is a fun one, as you learn to code by building computer games (the downside is I spent way too much time playing them). Finally, if you need some more practice, especially in more heavy-weight scientific computing I recommend doing some of the exercises from this course, and checking out Hans Fangohr’s textbook. There are many more beginner’s resources, you will find a comprehensive list of them here.

It is common to get stuck with a piece of code that just does not want to work. For a quick reference, the Python documentation is actually pretty clearly written and has examples. Finally, StackOverflow is where you find help in more difficult situations. It’s a question-and-answer forum, but before you actually ask a question, check first whether someone has already asked it (in about 99% of cases they have). There is no shame in googling ‘how to index an array in python’; everyone does it and it saves a lot of time.

How to get from the for-loop into agents

There is a gigantic conceptual chasm every modeller needs to jump over: going from the simple to the complex. Once you have learnt the basics, such as for-loops, list comprehensions, reading and writing files etc., it is hard to imagine how a complex simulation can be built from such simple blocks.
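To make the jump a little less scary, here is a minimal sketch of how those basic blocks combine into a toy agent-based model. Everything in it is hypothetical and chosen only for illustration: the Agent class, the foraging rule (a 50% chance of finding 2 units of food, a fixed cost of 1 unit per tick) and the parameter values.

```python
import random


class Agent:
    """A hypothetical forager that lives as long as it has energy."""

    def __init__(self, energy):
        self.energy = energy

    def step(self, rng):
        # Forage: find food half of the time, always pay the cost of living.
        if rng.random() < 0.5:
            self.energy += 2
        self.energy -= 1

    @property
    def alive(self):
        return self.energy > 0


def simulate(n_agents=20, ticks=30, seed=42):
    rng = random.Random(seed)
    agents = [Agent(energy=5) for _ in range(n_agents)]  # list comprehension
    history = []
    for tick in range(ticks):                 # the outer for-loop is time
        for agent in agents:                  # the inner for-loop is the population
            agent.step(rng)
        agents = [a for a in agents if a.alive]  # remove the dead
        history.append(len(agents))
    return history


history = simulate()
print(history)  # number of living agents after each tick
```

A real model adds space, interaction between agents and output to a file, but the skeleton stays the same: a list of objects, a loop over time and a loop over agents.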

Source: http://www.greenteapress.com/compmod/think_complexity_cover.png

If you’re feeling suicidal you could try to build one from scratch. However,  there are easier ways. To start with, the fantastic ‘Think Python’ has a sequel! It’s called ‘Think Complexity’ and, again, you can get the pdf for free, here. This is a great resource, giving you a thorough tour of complexity science applications and the exercises will give you enough coding experience to build your own models.

The second way is to build on already existing models (btw, this is not cheating, this is how most of computer science works). There is a fantastic library of simulations written in Python called PyCX (Python-based CompleX systems simulations). It contains sample code for complex systems simulations written in plain Python. Similarly, the OpenABM repository has at least some models written in Python.

And once you see it can be done, there’s nothing there to stop you!

Going further – Python productivity tools

There are several tools which make working in Python much easier (they all come with the  Anaconda and Enthought distributions).

Source: http://matplotlib.org/mpl_examples/api/logo2.hires.png

Visualisations: Matplotlib is the standard Python library for graphs. I have yet to find its limits and it is surprisingly easy to use. Another useful resource is ColorBrewer. It’s a simple website that gives you different colour palettes to represent different types of data (in hex, rgb and cmyk, so ready to be plugged into your visualisations straight away). It can save you a lot of time otherwise wasted on trying to decide if the orange goes ok with the blue…

The Debugger: The generic python debugger is called ‘pdb’ and you can find a great tutorial on how to use it here. I personally prefer the ipdb debugger, if only because it actually prints stuff in colour (you appreciate it after a few hours of staring at the code); it works exactly the same as the pdb debugger.
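As a quick taste, here is a minimal sketch of dropping into pdb in the middle of a function. The function and the DEBUG switch are made up for illustration; pdb.set_trace() and the prompt commands in the comments are the standard ones.

```python
import pdb

DEBUG = False  # flip to True to pause inside the function


def mean_energy(energies):
    total = 0
    for energy in energies:
        total += energy
    if DEBUG:
        # Execution pauses here. At the (Pdb) prompt try:
        #   p total   print a variable
        #   n         run the next line
        #   c         continue to the end
        #   q         quit the debugger
        pdb.set_trace()
    return total / len(energies)


result = mean_energy([4, 6, 8])
print(result)
```

If ipdb is installed, replacing pdb with ipdb in the import and in the set_trace() call should give you the same prompt in colour.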

The IPython Notebook: The IPython notebook is a fantastic platform for developing code interactively and then sharing it with other people (you can find the official tutorial here). Its merits may not be immediately obvious but after a while there’s almost no coming back.

Sumatra: Sooner or later everyone experiences the problem of ‘which version of the code produced the results???‘. Thankfully there is a solution to it and it’s called Sumatra. It automatically tracks different versions of the code and links the output files to them.

Data Analysis: You can do data analysis straight in Python but there are tools that make it easier.

Source: http://pandas.pydata.org/_static/pandas_logo.png

I haven’t actually used Pandas myself yet, but it’s the talk of the town at the moment so probably worth checking out.
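For small jobs the standard library is enough. The sketch below summarises a simulation output with nothing but csv and statistics. The column names follow the dummy data from the R tutorial (ticks, numberAgents), but the rows are invented for illustration; with a real file you would pass open("data.csv") to csv.DictReader instead of the in-memory text.

```python
import csv
import io
import statistics

# Invented sample rows standing in for a real output file.
SAMPLE = """ticks,numberAgents
0,10
1,12
2,15
3,11
"""


def summarise(csv_text):
    """Return a few summary numbers for the numberAgents column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = [int(row["numberAgents"]) for row in rows]
    return {
        "ticks": len(rows),
        "mean": statistics.mean(counts),
        "max": max(counts),
    }


summary = summarise(SAMPLE)
print(summary)
```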

GIS: Python is the language of choice for GIS, hence the simple integration. If you are using ESRI ArcMap, create your algorithm in the model builder, go to -> export -> to python (see a simple tutorial here, and a more comprehensive one on how to use model builder and python here). You can do pretty much the same thing in QGIS (check out their tutorial, here). On top of that, you can use Python straight from the console (see a tutorial here).

Call for Papers: Spatial analysis and modelling of human settlement systems

Call for papers for a conference in beautiful Besançon, France the 27th to 30th of April 2015.

Since work began during the Archaeomedes project in the 90s and the ArchaeDyn project in the late 2000s, many new research tools have been developed to analyze settlement patterns and, in particular, to observe changes in settlement systems. This session concerns changes or transitions in settlement systems and is an opportunity to take stock of current research from different viewpoints, including conceptual models and methodological and technical models such as (but not limited to) statistical models, GIS, agent-based simulation and network approaches.

Proposals for oral and poster presentations should be sent to archeometrie2015@univ-fcomte.fr before the 30th October 2014.
We hope to see you there.

This conference will be in both French and English.

Visit the webpage of the GMPCA congress – Archeometry 2015
http://chrono-environnement.univ-fcomte.fr/spip.php?article1967

TransMonDyn in Pullman, WA

A week of exciting simulation work will be presented in Pullman, WA by the TransMonDyn team November 10th to the 14th! TransMonDyn is an ambitious French project that aims to model human settlement dynamics and major transitions worldwide. These transitions include the spread of different languages in Africa, game theoretic approaches to Romanization, network analysis, and hosts of other fantastic research! Find out more about TransMonDyn here! http://www.transmondyn.parisgeo.cnrs.fr

Visualizing Worldwide Births and Deaths

Some folks in cyberspace have taken to visualizing data on births and deaths worldwide. This simulation shows the spot on a world map where a birth or a death has been recorded, and flashes it before your eyes. Green for birth, red for death. While numbers are thrown out there in the media (4.1 births per second), it’s hard to imagine what that looks like. This map does just that.
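To get a feel for the scale of that number, here is a back-of-the-envelope conversion of the quoted rate into births per day and per year. It assumes the rate is constant and uses a 365-day year; only the 4.1 figure comes from the text above.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365  # assumption: ignore leap years


def per_day(rate_per_second):
    """Convert an events-per-second rate into events per day."""
    return rate_per_second * SECONDS_PER_DAY


births_per_second = 4.1  # the figure quoted in the media
births_per_day = per_day(births_per_second)
births_per_year = births_per_day * DAYS_PER_YEAR

print(f"{births_per_day:,.0f} births per day")
print(f"{births_per_year:,.0f} births per year")
```

That works out to roughly 354,000 births a day, or about 129 million a year.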

One colleague has pointed out that this map skews toward countries that do very good census keeping, so maybe this doesn’t show all of them. But in the meantime this simulation both shows you where these demographic events are happening, and how big of a discrepancy there is between the rates. This could be a place for great data mining and future publications, assuming one can get at the data that is running behind this sim.

For example, can we see areas that are being disproportionately hit by diseases (Ebola?) and do those deaths really seem to be a large percentage of deaths worldwide? Can we see where programs for abstinence versus family planning are in effect? How about trends in births or deaths: can we see where one country has many births in one streak, and then few for a while, and can this tell us about events that may have marked conception (e.g. can we see February 14th popping up in the U.S.A. if we look around Nov 14th?).

In the meantime, enjoy the simulation. It’s quite hypnotizing.

Here’s the link: http://worldbirthsanddeaths.com