Tag Archives: simulation

EAA goes digital!

January 24, 2018 izaromanowska

This year the EAA (European Association of Archaeologists) Annual Meeting is taking place between 5-8 September 2018 in the loveliest of cities – Barcelona. We have prepared an exciting set of simulation-complexity-data related events, so if you use a computer in your research this EAA will be the most exciting yet!

During the conference we will be running a standard paper session: CAA@EAA: Computational Models in Archaeology (abstract below) focusing on formal, computational models in archaeology (not exclusively simulation, but we do like our ABMs ;). The abstract deadline is 15 February. You can submit your abstract via the EAA system.

On top of that, throughout the conference we will offer the Data Clinic – a personalised one-to-one consultation with data and modelling specialists (summary below). To give us a head start with matching archaeologists to data experts, we ask participants to submit a short summary outlining their data, research questions and any ideas they may already have via the standard route of the EAA system (please note that, as an alternative format, it will not count towards the paper limit imposed by the EAA).

Finally, we are very excited to announce the Summer School in Digital Archaeology, which will take place immediately after the EAA, between 10-14 September 2018. A week of hands-on tutorials, seminars, team challenges and intensive learning, the Summer School will provide in-depth training in formal computational models focusing on data modelling, network science, semantic web and agent-based modelling. Thanks to the generous support of the Complex Systems Society we are able to offer a number of bursaries for the participants. For more details please see the School website; we recommend pre-registering as soon as possible (pre-registration form).

Please feel free to pass this info onto your colleagues and students who might be interested.

We hope to see many of you in sunny Barcelona!

————————————————————————————————————-

Session: #672 CAA @ EAA: Computational Models in Archaeology

Theme:

Theories and methods in archaeological sciences

Session format:

Session, made up of a combination of papers, max. 15 minutes each

Models are pervasive in archaeology. In addition to the high volume of empirical archaeological research, there is a strong and constant interest among archaeologists and historians in questions regarding the nature, mechanisms and particularities of social and socio-natural processes and interactions in the past. However, for the most part these models are constructed using non-formal verbal arguments and conceptual hypothesis building, which makes it difficult to test them against available data or to understand the behaviour of more complex models of past phenomena.

The aim of this session is to discuss the role of formal computational modelling in archaeological theory-building and to showcase applications of the approach. This session will showcase the slowly changing trend in our discipline towards more common use of formal methods.

We invite contributions applying computational and quantitative methods such as GIS, data analysis and management, simulation, network science, ontologies, and others to study past phenomena concerned with societal change, human-environment interactions and various aspects of past systems such as economy, cultural evolution or migration. Methodological and theoretical papers on the benefits and challenges of quantification, the epistemology of formal methods and the use of archaeological material as a proxy for social processes are also welcome.

Main organisers:

Dr Iza Romanowska (Spain), Dr Luce Prignano (Spain), María Coto-Sarmiento (Spain), Dr Tom Brughmans (United Kingdom), Ignacio Morer (Spain)

Session: #663 Archaeological Data Clinic. Personalised consulting to get the best of your data.

Theme:

Theories and methods in archaeological sciences

Session format:

Discussion session: Personalised consulting to get the best of archaeological data. We will set up meetings with an expert in data analysis / network science / agent-based modelling.

In the ideal world we would all have enough time to learn statistics, data analysis, R, several foreign and ancient languages and to read the complete works by Foucault. In reality, most researchers artfully walk the thin line between knowing enough and bluffing. The aim of this workshop is to streamline the process by pairing archaeologists with data and computer science specialists.

If you have a dataset and no idea what to do with it…
If you think PCA / least cost paths / network analysis / agent-based modelling is the way forward for your project but you don’t know how to get started…
If you need a second opinion to ensure that what you’ve already done makes sense…

…then this drop-in clinic is for you.

Let us know about your case by submitting an abstract with the following information:

A project outline of a few sentences;
Type and amount of data;
Research question(s);
The type of analysis you’d like to perform (if known).

We will set up a meeting with an expert in data analysis / network science / agent-based modelling. They will help you to query and wrangle your data, to analyse and visualise it and to guide you on the next steps. They may help you choose the right software or point you towards a study where similar problems have been solved. In a nutshell, they will save you a lot of time and frustration and make your research go further!

Main Organisers:

Dr Luce Prignano (Spain), Dr Iza Romanowska (Spain), Dr Sergi Lozano (Spain), Dr Francesca Fulminante (United Kingdom), Dr Rob Witcher (United Kingdom), Dr Tom Brughmans (United Kingdom).

Noteworthy Publications

A full, and growing, bibliography of ABM in archaeology

October 23, 2017 izaromanowska

With more and more case studies, methodological papers and other musings on ABM being published every year, it is often difficult to stay on top of the literature. Equally, since most ABMers in archaeology are self-taught, the initial ‘reading process’ may be quite haphazard. But not any more! Introducing: bit.ly/ABMbiblio

Now, whenever needed, you can consult a comprehensive list of all publications dealing with ABM in archaeology hosted on GitHub. More importantly, the list will be continuously updated, both by the authors and by everyone else. So if you know of a publication that has not been listed yet, or, our most sincere apologies, we missed your paper, simply put up a pull request and we’ll merge your suggestions. (Please note that if there is more than one paper for a project we feature only the main publication.) Follow this link to explore the all-you-can-eat paper buffet of ABM in archaeology.

Case Studies, Tutorials

The Powers and Pitfalls of Power-Law Analyses

December 8, 2016 stefanicrabtree

People love power-laws. In the 90s and early 2000s it seemed like they were found everywhere. Yet early power-law studies did not subject the data distributions to rigorous tests. This decreased the potential value of some of these studies. And since an influential study by Aaron Clauset of CU Boulder, Cosma Shalizi of Carnegie Mellon, and Mark Newman of the University of Michigan, researchers have become aware that not all distributions that look power-law-like are actually power-laws.

But power-law analyses can be incredibly useful. In this post I show you first what a power-law is, second demonstrate an appropriate case-study to use these analyses in, and third walk you through how to use these analyses to understand distributions in your data.

What is a power-law?

A power-law describes a distribution of something (wealth, connections in a network, sizes of cities) in which the frequency of a value falls off as a power of its size. Such distributions are often attributed to what is known as the law of preferential attachment. In power-laws there will be many of the smallest objects, with increasingly fewer of the larger objects. However, the largest objects disproportionately get the highest quantities of stuff.

The world wide web follows a power-law. Many sites (like Simulating Complexity) get small amounts of traffic, but some sites (like Google, for example) get high amounts of traffic. Then, because they get more traffic, they attract even more visits to their sites. Cities also tend to follow power-law distributions, with many small towns, and few very large cities. But those large cities seem to keep getting larger. Austin, TX for example, has 157.2 new citizens per day, making this city the fastest growing city in the United States. People are attracted to it because people keep moving there, which perpetuates the growth. Theoretically there should be a limit, though maybe the limit will be turning our planet into a Texas-themed Coruscant.

This is in direct contrast to log-normal distributions. Log-normal distributions follow the law of proportional effect. This means that as something increases in size, it is predictably larger than what came before it. Larger things in log-normal distributions do not attract exponentially more things… they have a proportional amount of what came before. For example, experience and income should follow a log-normal distribution. As someone works in a job longer they should get promotions that reflect their experience. When we look at incomes of all people in a region we see that when incomes are more log-normally distributed these reflect greater equality, whereas when incomes are more power-law-like, inequality increases. Modern incomes seem to follow log-normality up to a point, after which they follow a power-law, showing that the richest attract that much more wealth, but under a certain threshold wealth is predictable.

If we analyze the distribution of modern incomes in a developing nation and see that they follow a power-law distribution, we will understand that there is a ‘rich get richer’ dynamic in that country, whereas if we see the incomes follow a log-normal distribution we would understand that that country had greater internal equality. We might want to know this to help influence policy.
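To make the contrast concrete, here is a minimal sketch in base R (it is not part of the original analysis). It draws one sample from a continuous power-law and one from a log-normal, then compares how much of the total the top 1% of observations hold. The exponent 2.2, the log-normal parameters and the helper names rpowerlaw and top_share are all illustrative assumptions.

```r
set.seed(123)

# Inverse-transform sampler for a continuous power-law whose density is
# proportional to x^(-alpha) for x >= xmin (hypothetical helper).
rpowerlaw <- function(n, xmin, alpha) {
  xmin * runif(n)^(-1 / (alpha - 1))
}

# Share of the total held by the largest `top` fraction of observations.
top_share <- function(x, top = 0.01) {
  x <- sort(x, decreasing = TRUE)
  k <- ceiling(length(x) * top)
  sum(x[1:k]) / sum(x)
}

n <- 10000
power_law_sample  <- rpowerlaw(n, xmin = 1, alpha = 2.2)
log_normal_sample <- rlnorm(n, meanlog = 0, sdlog = 1)

share_pl <- top_share(power_law_sample)
share_ln <- top_share(log_normal_sample)

cat("Top 1% share, power-law: ", round(share_pl, 3), "\n")
cat("Top 1% share, log-normal:", round(share_ln, 3), "\n")
```

With these settings the power-law sample should concentrate far more of the total in its largest observations, which is the ‘rich get richer’ signature described above.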

When we analyze power-laws, however, we don’t want to just look at the graph that is created and say “Yeah, I think that looks like a power-law.” Early studies seemed to do just that. Thankfully Clauset et al. came up with rigorous methods to examine a distribution of data and see if it’s a power-law, or if it follows another distribution (such as log-normal). Below I show how to use these tools in R.

Power-law analyses and archaeology

So, if modern analyses of these distributions can tell us something about the equality (log-normal) or inequality (power-law) of a population, then these tools can be useful for examining the lifeways of past people. Questions we might be interested in asking are whether prehistoric cities also follow a power-law distribution, suggesting that the largest cities offered more social (and potentially economic) benefits similar to modern cities. Or we might want to understand whether societies in prehistory were more egalitarian or more hierarchical, thus looking at distributions of income and wealth (as archaeologists define them) to examine these. Power-law analyses of distributions of artifacts or settlement sizes would enable us to understand the development of inequality in the past.

Clifford Brown et al. talked about these very issues in their chapter Poor Mayapan from the book The Ancient Maya of Mexico edited by Braswell. While they don’t use the statistical tools I present below, they do present good arguments for why and when power-law versus other types of distributions would occur, and I would recommend tracking down this book and reading it if you’re interested in using power-law analyses in archaeology. Specifically they suggest that power-law distributions would not occur randomly, so there is intentionality behind those power-law-like distributions.

I recently used power-law and log-normal analyses to try to understand the development of hierarchy in the American Southwest. The results of this study will be published in 2017 in American Antiquity. Briefly, I wanted to look at multiple types of evidence, including ceremonial structures, settlements, and simulation data to understand the mechanisms that could have led to hierarchy and whether or not (and when) Ancestral Pueblo groups were more egalitarian or more hierarchical. Since I was comparing multiple different datasets, a method to quantitatively compare them was needed. Thus I turned to Clauset’s methods.

These had been updated by Gillespie in the R package poweRlaw.

Below I will go over the poweRlaw package with a built-in dataset, the Moby Dick words dataset. This dataset counts the frequency of different words. For example, there are many instances of the word “the” (19815, to be exact) but very few instances of other words, like “lamp” (34 occurrences) or “choice” (5 occurrences), or “exquisite” (1 occurrence). (Side note, I randomly guessed at each of these words, assuming each would have fewer occurrences. My friend Simon DeDeo tells me that ‘exquisite’ in this case is a hapax legomenon, or a term that only has one recorded use. Thanks Simon.) To see more go to http://roadtolarissa.com/whalewords/.

In my research I used other datasets that measured physical things (the size of roomblocks, kivas, and territories) so there’s a small mental leap for using a new dataset, but this should allow you to follow along.
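If you want to build a similar frequency vector from your own text, a minimal base-R sketch might look like the following. The sentence is made up for illustration; the moby dataset that ships with poweRlaw already comes as a vector of counts, so this step is not needed for the tutorial itself.

```r
text <- "The whale and the sea and the ship"

# Split on anything that is not a letter and drop empty strings.
words <- strsplit(tolower(text), "[^a-z]+")[[1]]
words <- words[words != ""]

# table() counts each distinct word; keep only the counts, largest first.
counts <- sort(as.integer(table(words)), decreasing = TRUE)
print(counts)
```

A vector like counts is the kind of input the discrete functions below expect.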

The Tutorial

Open R.

Load the poweRlaw package

library("poweRlaw")

Add in the data

data("moby", package = "poweRlaw")

This will load the data into your R session.

Side note:

If you are loading in your own data, you first load it in like you normally would, e.g.:

data <- read.csv("data.csv")

Then if you were subsetting your data you’d do something like this:

a <- subset(data, Temporal_Assignment != 'Pueblo III (A.D. 1140-1300)')

Next you have to decide if your data is discrete or continuous. What do I mean by this?

Discrete data can only take on particular values. In the case of the Moby Dick dataset, since we are counting physical words, this data is discrete. You can have 1 occurrence of exquisite and 34 occurrences of lamp. You can’t have 34.79 occurrences of it—it either exists or it doesn’t.

Continuous data is something that doesn’t fit into simple entities, but whose measurement can exist on a long spectrum. Height, for example, is continuous. Even if we bin people’s heights into neat categories (e.g., 6 feet tall, or 1.83 meters) the person’s height probably has some trailing digits, so they aren’t exactly 6 feet, but maybe 6.000127 feet tall. If we are being precise in our measurements, that would be continuous data.

The data I used in my article on kiva, settlement, and territory sizes was continuous. This Moby Dick data is discrete.
The reason this matters is the poweRlaw package has two separate functions for continuous versus discrete data. These are:

conpl for continuous data, and

displ for discrete data

You can technically use either function and you won’t get an error from R, but the results will differ slightly, so it’s important to know which type of data you are using.

In the tutorial written here I will be using the displ function since the Moby dataset is discrete. Substitute in conpl for any continuous data.
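If you are unsure which kind of data you have, a quick check along these lines can help. This is a sketch in base R; looks_discrete is a hypothetical helper and the two vectors are made-up examples. Treat the result as a hint only, since measurements that were rounded to whole numbers are still conceptually continuous.

```r
# Hypothetical helper: TRUE when every value is a positive whole number.
looks_discrete <- function(x) {
  all(is.finite(x)) && all(x > 0) && all(x == round(x))
}

word_counts <- c(1, 1, 2, 5, 34)          # counts, like the Moby Dick data
room_areas  <- c(4.2, 7.9, 12.35, 30.1)   # made-up measurements

looks_discrete(word_counts)   # whole-number counts: a candidate for displ
looks_discrete(room_areas)    # fractional values: a candidate for conpl
```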

To create the power-law object, we first pass the data to displ:

pl_a <- displ$new(moby)

We then want to estimate the x-min value. Powerlaws are usually only power-law-like in their tails… the early part of the distribution is much more variable, so we find a minimum value below which we say “computer, just ignore that stuff.”

However, first I like to look at what the x_min values are, just to see that the code is working. So:

pl_a$getXmin()

Then we estimate the x-min. This is the code that does that:

est <- estimate_xmin(pl_a)

We then update the power-law object with the new x-min value:

pl_a$setXmin(est)
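For intuition about what an x-min estimate involves, here is a minimal base-R sketch of the approach described by Clauset et al.: try a range of candidate x-min values, fit the exponent to the tail above each one, and keep the candidate whose fitted power law is closest to the data (smallest Kolmogorov-Smirnov distance). It uses the continuous formulas and synthetic data, so it is an illustration under those assumptions rather than what poweRlaw runs internally; alpha_mle and ks_distance are hypothetical helper names.

```r
set.seed(1)

# Maximum-likelihood exponent for a continuous power-law tail.
alpha_mle <- function(x, xmin) {
  tail_x <- x[x >= xmin]
  1 + length(tail_x) / sum(log(tail_x / xmin))
}

# Kolmogorov-Smirnov distance between the tail data and the fitted power law.
ks_distance <- function(x, xmin) {
  tail_x <- sort(x[x >= xmin])
  n <- length(tail_x)
  alpha <- alpha_mle(x, xmin)
  fitted <- 1 - (tail_x / xmin)^(1 - alpha)
  max(abs(fitted - seq_len(n) / n), abs(fitted - (seq_len(n) - 1) / n))
}

# Synthetic data: a messy body between 1 and 5, then a power-law tail
# starting at 5 with exponent 2.5 (all values are illustrative).
body_part <- runif(1000, min = 1, max = 5)
tail_part <- 5 * runif(2000)^(-1 / (2.5 - 1))
x <- c(body_part, tail_part)

candidates <- seq(1, 20, by = 0.5)
distances <- vapply(candidates, function(xm) ks_distance(x, xm), numeric(1))

best_xmin <- candidates[which.min(distances)]
best_alpha <- alpha_mle(x, best_xmin)
cat("Estimated x-min:", best_xmin, " alpha:", round(best_alpha, 2), "\n")
```

Candidates below the true cut-off include the messy body of the distribution, so their fitted curves sit far from the data and they are rejected.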

We do a similar thing to estimate the exponent α of the power law. The relevant functions are getPars and estimate_pars, so:

pl_a$getPars()

estimate_pars(pl_a)

Then we also want to know how plausible it is that our data follow a power law. For this we estimate a p-value (explained in Clauset et al). Here is the code to do that (and output those data):

booty <- bootstrap_p(pl_a)

This will take a little while, so sit back and drink a cup of coffee while R chunks for you.
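While you wait, here is a rough sketch of the idea behind a bootstrap p-value of this kind, written in base R. It simulates many datasets from the fitted power law, refits each one, and reports the share of simulated fits that are at least as far from their own data as the real fit is. It is deliberately simplified: it assumes continuous data and holds x-min fixed, whereas the published procedure (as I understand it) also re-estimates x-min for every simulated dataset. All function names here are hypothetical.

```r
set.seed(42)

# Continuous power-law helpers (illustrative, not the poweRlaw internals).
alpha_mle <- function(x, xmin) 1 + length(x) / sum(log(x / xmin))

ks_stat <- function(x, xmin, alpha) {
  x <- sort(x)
  n <- length(x)
  fitted <- 1 - (x / xmin)^(1 - alpha)
  max(abs(fitted - seq_len(n) / n), abs(fitted - (seq_len(n) - 1) / n))
}

rpl <- function(n, xmin, alpha) xmin * runif(n)^(-1 / (alpha - 1))

bootstrap_p_sketch <- function(x, xmin, n_sims = 200) {
  x <- x[x >= xmin]
  alpha_hat <- alpha_mle(x, xmin)
  observed <- ks_stat(x, xmin, alpha_hat)
  simulated <- replicate(n_sims, {
    synth <- rpl(length(x), xmin, alpha_hat)
    ks_stat(synth, xmin, alpha_mle(synth, xmin))
  })
  # Share of simulated datasets that fit worse than the real data.
  mean(simulated >= observed)
}

power_law_data  <- rpl(1000, xmin = 1, alpha = 2.5)
log_normal_data <- rlnorm(5000, meanlog = 0, sdlog = 1)

p_power_law  <- bootstrap_p_sketch(power_law_data, xmin = 1)
p_log_normal <- bootstrap_p_sketch(log_normal_data, xmin = 1)
cat("p (power-law data): ", p_power_law, "\n")
cat("p (log-normal data):", p_log_normal, "\n")
```

A small p-value means the real data sit further from the fitted power law than nearly all of the simulated power-law datasets do, which is the situation for the log-normal sample here.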

Then look at the output:

booty

Alright, we don’t need the whole sim, but it’s good to have the goodness of fit (gof: 0.00825) and p value (p: 0.75), so this code below records those for you.

variables <- c("p", "gof")

bootyout <- booty[variables]

write.table(bootyout, file = "/Volumes/file.csv", sep = ',', append = FALSE, row.names = FALSE, col.names = TRUE)

Next, we need to see if our data better fits a log-normal distribution. Here we compare our dataset to a log-normal distribution, and then compare the p-values and perform a goodness-of-fit test. If you have continuous data you’d use conlnorm for a continuous log normal distribution. Since we are using discrete data with the Moby dataset we use the function dislnorm. Again, just make sure you know which type of data you’re using.

### Estimating a log normal fit

aa <- dislnorm$new(moby)

We then set the xmin in the log-normal dataset so that the two distributions are comparable.

aa$setXmin(pl_a$getXmin())

Then we estimate the parameters as above

est2 <-estimate_pars(aa)

aa$setPars(est2$pars)

Now we compare our two distributions. Please note that it matters which order you put these in. Here I have the power-law value first with the log-normal value second. I discuss what ramifications this has below.

comp <- compare_distributions(pl_a, aa)

Then we actually print out the stats:

comp

And then I create a printable dataset that we can then look at later.

myvars <- c("test_statistic", "p_one_sided", "p_two_sided")

compout <- comp[myvars]

write.table(compout, file = "/Volumes/file2.csv", sep = ',', append = FALSE, row.names = FALSE, col.names = TRUE)

And now all we have left to do is graph it!

pdf(file = '/Volumes/Power_Law.pdf', width = 5.44, height = 3.5, bg = "white", paper = "special", family = "Helvetica", pointsize = 8)

par(mar=c(4.1,4.5,0.5,1.2))

par(oma=c(0,0,0,0))

plot(pl_a, col = 'black', log = 'xy', xlab = '', ylab = '', xlim = c(1, 400), ylim = c(0.01, 1))

lines(pl_a, col = 2, lty = 3, lwd = 2)

lines(aa, col=3, lty=2, lwd=1)

legend("bottomleft", cex = 1, xpd = TRUE, ncol = 1, lty = c(3, 2), col = c(2, 3), legend = c("powerlaw fit", "log normal fit"), lwd = 1, yjust = 0.5, xjust = 0.5, bty = "n")

text(x = 70, y = 1, cex = 1, pos = 4, labels = paste("Power law p-value: ", bootyout$p))

mtext("All regions, Size", side = 1, line = 3, cex = 1.2)

mtext("Relative frequencies", side = 2, line = 3.2, cex = 1.2)

box()

dev.off()
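Log-log plots of this kind usually show, for each value, the fraction of observations that are at least that large. As a sketch of what such a curve is made of, here is that quantity computed by hand in base R on a tiny made-up vector; empirical_ccdf is a hypothetical helper, and this is an assumption about the general technique rather than a description of the package's plotting code.

```r
# Fraction of observations greater than or equal to each distinct value.
empirical_ccdf <- function(x) {
  values <- sort(unique(x))
  data.frame(
    value = values,
    ccdf = vapply(values, function(v) mean(x >= v), numeric(1))
  )
}

counts <- c(1, 1, 1, 2, 2, 5, 10)   # made-up data
cc <- empirical_ccdf(counts)
print(cc)
```

Passing the two columns to plot with log = 'xy' gives a curve that is roughly a straight line when the tail is power-law-like.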

Now, how do you actually tell which is better, the log normal or power-law? Here is how I describe it in my upcoming article:

The alpha parameter reports the slope of the best-fit power-law line. The power-law probability reports the probability that the empirical data could have been generated by a power law; the closer that statistic is to 1, the more likely that is. We consider values below 0.1 as rejecting the hypothesis that the distribution was generated by a power law (Clauset et al. 2009:16). The test statistic indicates how closely the empirical data match the log normal. Negative values indicate log-normal distributions, and the higher the absolute value, the more confident the interpretation. However, it is possible to have a test statistic that indicates a log-normal distribution in addition to a power-law probability that indicates a power-law, so we employ the compare distributions test to compare the fit of the distribution to a power-law and to the log-normal distribution. Values below 0.4 indicate a better fit to the log-normal; those above 0.6 favor a power-law; intermediate values are ambiguous. Please note, though, that it depends on what order you put the two distributions in the R code: if you put log-normal in first in the above compare distributions code, then the above would be reversed—those below 0.4 would favor power-laws, while above 0.6 would favor log normality. I may be wrong, but as far as I can tell it doesn’t actually matter which order you put the two distributions in, as long as you know which one went first and interpret it accordingly.

So, there you have it! Now you can run a power-law analysis on many types of data distributions to examine if you have a rich-get-richer dynamic occurring! Special thanks to Aaron Clauset for answering my questions when I originally began pursuing this research.

Full code at the end:

library("poweRlaw")

data("moby", package = "poweRlaw")

### Fitting the power law

pl_a <- displ$new(moby)

pl_a$getXmin()

est <- estimate_xmin(pl_a)

pl_a$setXmin(est)

pl_a$getPars()

estimate_pars(pl_a)

booty <- bootstrap_p(pl_a)

variables <- c("p", "gof")

bootyout <- booty[variables]

#write.table(bootyout, file = "/Volumes/file.csv", sep = ',', append = FALSE, row.names = FALSE, col.names = TRUE)

### Estimating a log normal fit

aa <- dislnorm$new(moby)

aa$setXmin(pl_a$getXmin())

est2 <- estimate_pars(aa)

aa$setPars(est2$pars)

comp <- compare_distributions(pl_a, aa)

comp

myvars <- c("test_statistic", "p_one_sided", "p_two_sided")

compout <- comp[myvars]

write.table(compout, file = "/Volumes/file2.csv", sep = ',', append = FALSE, row.names = FALSE, col.names = TRUE)

### Plotting both fits

pdf(file = '/Volumes/Power_Law.pdf', width = 5.44, height = 3.5, bg = "white", paper = "special", family = "Helvetica", pointsize = 8)

par(mar = c(4.1, 4.5, 0.5, 1.2))

par(oma = c(0, 0, 0, 0))

plot(pl_a, col = 'black', log = 'xy', xlab = '', ylab = '', xlim = c(1, 400), ylim = c(0.01, 1))

lines(pl_a, col = 2, lty = 3, lwd = 2)

lines(aa, col = 3, lty = 2, lwd = 1)

legend("bottomleft", cex = 1, xpd = TRUE, ncol = 1, lty = c(3, 2), col = c(2, 3), legend = c("powerlaw fit", "log normal fit"), lwd = 1, yjust = 0.5, xjust = 0.5, bty = "n")

text(x = 70, y = 1, cex = 1, pos = 4, labels = paste("Power law p-value: ", bootyout$p))

mtext("All regions, Size", side = 1, line = 3, cex = 1.2)

mtext("Relative frequencies", side = 2, line = 3.2, cex = 1.2)

box()

dev.off()

Tutorials

Complex social dynamics in a few lines of code

July 13, 2016 izaromanowska

To prove that there is a world beyond agents, turtles and all things ABM, we have created a neat little tutorial in system dynamics implemented in Python.

Delivered by Xavier Rubio-Campillo and Jonas Alcaina just a few days ago at the annual Digital Humanities conference (this year held in the most wonderful of all cities – Krakow), it is tailored to humanities students so it does not require any previous experience in coding.

System dynamics is a type of mathematical or equation-based modelling. Archaeologists (with a few noble exceptions) have so far shied away from what is often perceived as ‘pure math’, mostly citing the ‘too simplistic’ argument when awful mathematics teacher trauma was probably the real reason. However, in many cases an ABM is complete overkill when a simple system dynamics model would be well within one’s abilities. So give it a go if only to ‘dewizardify’* the equations.
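To give a flavour of how small such a model can be, here is a minimal sketch of one classic equation-based model, logistic population growth, stepped forward in time with Euler's method. It is written in R to match the other code on this page rather than the Python used in the tutorial, it is not taken from the tutorial, and all parameter values are illustrative assumptions.

```r
# Logistic growth: dN/dt = r * N * (1 - N / K), integrated with Euler steps.
r  <- 0.5    # growth rate per time unit (illustrative)
K  <- 100    # carrying capacity (illustrative)
dt <- 0.1    # size of each time step
steps <- 400

population <- numeric(steps + 1)
population[1] <- 10

for (i in seq_len(steps)) {
  growth <- r * population[i] * (1 - population[i] / K)
  population[i + 1] <- population[i] + growth * dt
}

cat("Population after", steps * dt, "time units:",
    round(population[steps + 1], 2), "\n")
```

The whole model is one equation and a loop: the population grows quickly while it is small and levels off as it approaches the carrying capacity.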

Follow this link for the zip file with the tutorial: https://zenodo.org/record/57660#.V4YIIKu7Ldk

*the term ‘dewizardify’ courtesy of SSI fellow Robert Davey (@froggleston)

Events

Social Simulation Conference 2016

April 11, 2016 izaromanowska

You may remember the SPUHH (Simulating the Past to Understand Human History) event in Barcelona in 2014. It was a satellite to the Social Simulation Conference – an annual gathering of researchers whose models are concerned with complex social systems. This year’s edition of the SSC is taking place in Rome, September 19-23. The paper abstract deadline is 18th April.

Although there are no sessions dedicated specifically to archaeology this year, this is a great opportunity to get a bit of an insight into what our colleagues [working on humans you can actually talk to because they are not so very dead] are up to. Often you can find many commonalities between their and our problems, models and interpretations. Not to mention, an occasional check on what is going on in the world of social theory is always useful.

As every year, the conference is followed by a week-long summer school in social simulation using NetLogo. For more details see here: http://www.ssc2016.cnr.it/essa-s4/

Noteworthy Publications

Everything you ever wanted to know about building a simulation, but without the jargon

January 25, 2016 izaromanowska

I think everyone who had anything to do with modelling came across an innocent colleague/supervisor/another academic enthusiastically exclaiming:

“Well, isn’t this a great topic for a simulation? Why don’t we put it together – you do the coding and I’ll take care of the rest. It will be done and dusted in two weeks!”

“Sure! I routinely build well-informed and properly tested simulations in less than two weeks.” – answered no one, ever.

Building a simulation can be a long and frustrating process with unwelcome surprises popping out at every corner. Recently I summarised the 9 phases of developing a model and the most common pitfalls in a paper published in Human Biology: ‘So You Think You Can Model? A Guide to Building and Evaluating Archaeological Simulation Models of Dispersals‘. It is an entirely jargon-free overview of the simulation pipeline, predominantly aimed at anyone who wants to start building their own archaeological simulation but does not know what the process entails. It will be equally useful to non-modellers who want to learn more about the technique before they start trusting the results we throw at them. And, I hope, it may inspire more realistic time management for simulation projects 🙂

You can access the preprint of it here. It is not as nicely typeset as the published version but, hey!, it is open access.

Events

Free two-day workshop on agent-based modelling for archaeologists @CAA2016 Oslo

January 11, 2016 izaromanowska

Even if you are an ABM expert please help us spread the word!

Agent-based modelling (ABM) has taken by storm disciplines from all corners of the scientific spectrum, from ecology to transport and social sciences and it is becoming increasingly popular in archaeology.

Now it is your turn to give it a go!

Learn how to use the simulation software and explore how this popular complexity science technique can complement your research. This two-day workshop will provide an introduction to ABM using NetLogo – an open-source platform for building agent-based models, which combines user-friendly interface, simple coding language and a vast library of model examples, making it an ideal starting point for entry-level agent-based modellers, as well as a useful prototyping tool for more experienced programmers.

For more details see the workshop leaflet: Workshop_leaflet-3

To secure a place please send an email to i.romanowska at soton.ac.uk expressing your interest and briefly describing your background and the reasons why you want to attend. The event is free of charge, but you need to register for the CAA conference. Please note that places are limited and early applications will be given preference.

General

The hypes and downs of simulation

September 1, 2015 izaromanowska

Have you ever wondered when exactly simulation and agent-based modelling started being widely used in science? Did it pick up straight away or was there a long lag with researchers sticking to older, more familiar methods? Did it go hand in hand with the rise of chaos theory or perhaps together with complexity science?

Since (let’s face it) googling is the primary research method nowadays, I resorted to one of google’s tools to tackle some of these questions: the Ngram viewer. If you have not come across it before, it searches for all instances of a particular word in the billions of books that google has been kindly scanning for us. It is a handy tool for investigating long-term trends in language, science, popular culture or politics. And although some issues have been raised about its accuracy (e.g., not ALL the books ever written are in the database and there have been some issues with how well it transcribes from scans to text), biases (e.g., it is very much focused on English publications) and misuses (mostly by linguists), it is nevertheless a much better method than drawing together some anecdotal evidence or following other people’s opinions. It is also much quicker.

So taking it with a healthy handful of salt, here are the results.

Simulation shot up in the 1960s as if there was no tomorrow. Eyeballing it, it looks like its growth was pretty much exponential. There seems to be a correction in the 1980s and it looks like it has reached a plateau in the last two decades.

This to many looks strikingly similar to a Gartner hype cycle. The cycle plots a common pattern in life-histories of different technologies (or you can just call it a simple adaptation of Hegel/Fichte’s Thesis-Antithesis-Synthesis triad).

Gartner Hype Cycle. Source: http://www.gartner.com/technology/research/methodologies/hype-cycle.jsp

It shows how the initial ‘hype’ quickly transforms into a phase of disillusionment and negative reactions when the new technique fails to solve all of humanity’s grand problems. This is then followed by a rebound (‘slope of enlightenment’…) fuelled by an increase in more critical applications and a correction in the level of expectations. Finally, the technique becomes a standard tool, leading to a plateau in its popularity.

It looks like simulation reached this plateau in the mid-1990s. However, I have some vague recollections that there is some underlying data problem in the Ngram Viewer for the last few years – either more recent books have been added to the google database in disproportionately higher numbers, or there has been a sudden increase in online publications, or something similar skews the patterns compared to previous decades [if anyone knows more about it, please comment below and I’ll amend my conclusions]. Thus, let’s call the plateau a ‘tentative plateau’ for now.

2. I wondered if simulation might have reached the ceiling of how popular any particular scientific method can be so I compared it with other prominent tools and it looks like we are, indeed, in the right ballpark.

Let’s add archaeology to the equation. Just to see how important we are and to boost our egos a bit. Or not.

3. I was also interested to see if the rise of ‘simulation’ corresponds with the birth of chaos theory, cybernetics or complexity science. However, this time the picture is far from clear.

Although ‘complexity’ and ‘simulation’ follow a similar trajectory, it is not particularly evident whether the trend for ‘complexity’ is not just a general increase in the use of the word in contexts other than science. This is nicely exemplified by ‘chaos’, which does not seem to gain much during the golden years of chaos theory, most likely because its general use as a common English word would have drowned out any scientific trends.

4. Finally, let’s have a closer look at our favourite technique: Agent-based Modelling.

There is a considerable delay in its adoption compared to simulation, as it is only in the mid-1990s that ABM really starts to be visible. It also looks like Americans have been leading the way (despite their funny spelling of the word ‘modelling’). Most worryingly though, the ‘disillusionment’ correction phase does not seem to have been reached yet, which indicates that there are some ~~turbulent~~ interesting times ahead of us.

Case Studies, Noteworthy Publications, Tutorials, Uncategorized

Writing a model, from start to finish: A step-by-step tutorial of Ger Grouper

June 2, 2015 stefanicrabtree 2 Comments

My colleague Dr. Julia K. Clark and I recently published a model in the multidisciplinary journal Land based on household fission/fusion dynamics in Mongolia. It’s entitled “Examining Social Adaptations in a Volatile Landscape in Northern Mongolia via the Agent-Based Model Ger Grouper” and you can read it here. We present a fairly streamlined model with a clear goal.

The question is, though, how do you get from point A to point X?

The secret is that most models do not spring forth from a modeler’s head fully formed. They are created in increments, and along the way things change, plans evolve, and workflow shifts. In this post I aim to show you how we took a model from a sketchy start to a published article.

The beginning:

Dr. Clark is an archaeologist working on Bronze Age habitations in Northern Mongolia. You can read her dissertation here. I have had the privilege of working with her for many years, recently going to Mongolia with her to help survey and excavate archaeological sites, and do ethnoarchaeological work with modern nomads. From these experiences we developed an agent-based model.

One thing we saw happen in Mongolia was the combining of households to reduce risk. When I was interviewing a family I learned that the daughter’s house (a ger, the Mongolian name for yurt) had burnt down, so she had come to live with her parents. This risk mitigation strategy meant that the two households combined: the flocks of sheep and goats combined, and so did the owned resources. When the daughter, her kids and her husband were eventually able to move, the resources would have been divided between the parent and child households.

Events like this are virtually impossible to see in the archaeological record, but we realized that these strategies have lasted for at least 100 years, and likely for many centuries previously. People in Mongolia have been dealing with unpredictable weather events for as long as they’ve lived there. There are large winter events known as dzuds: imagine the worst blizzard you’ve been in, and make it worse. The Mongolian people have come up with a complex system of exchange, kinship, and pasture management to deal with the harsh climate. We wanted to know what rules evolved over time to ensure survival. We decided to model it.

We created all of our models in NetLogo. All of the code for these models is at the end of this tutorial so you can see our thought process.

Model 1: Static Population Random Walk Model

This first model looks almost nothing like the finished model. It’s really a messy toy model that was used to begin to explore the problem. Our first question was:

Do we see households aggregated near each other, and can this aggregation tell us anything about household sharing?

To create a simplified model we allowed population to remain static. We gave the households very simplified rules, allowing for movement in spring and winter, reproduction, and resource acquisition. Each household ate the grass it landed on (simulating herds eating grass) and moved again in the next season. The amount of energy a Ger received from consuming grass was static and set within the model.

We ran this model multiple times to look at aggregation. We realized that we had essentially just created a model of random walks, but this random walk model provided the basis for where we went from there.
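To make the "model of random walks" concrete, here is a rough Python sketch of the movement rule alone. It is not the NetLogo listing from the end of this post: the names `WORLD_SIZE`, `random_walk_step` and `simulate` are made up for illustration, and the wrapping square world is an assumption.

```python
import math
import random

WORLD_SIZE = 50  # assumed side length of a square world that wraps at the edges


def random_walk_step(x, y, distance, rng):
    """Turn to a uniformly random heading and move `distance` units, wrapping at the edges."""
    heading = rng.uniform(0.0, 2.0 * math.pi)
    new_x = (x + distance * math.cos(heading)) % WORLD_SIZE
    new_y = (y + distance * math.sin(heading)) % WORLD_SIZE
    return new_x, new_y


def simulate(n_households, n_seasons, distance, seed=0):
    """Return the final (x, y) positions of households that do nothing but random-walk."""
    rng = random.Random(seed)
    positions = [(rng.uniform(0, WORLD_SIZE), rng.uniform(0, WORLD_SIZE))
                 for _ in range(n_households)]
    for _ in range(n_seasons):
        positions = [random_walk_step(x, y, distance, rng) for x, y in positions]
    return positions
```

Calling `simulate(20, 10, 10.0)` returns 20 coordinate pairs. Any clustering seen in them is due to chance alone, which is what makes this kind of model useful as a baseline.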

Model 2: Changing population and the emergence of a useful model

This model looks very similar to the above model, but instead of having population replacement we allow for ‘sexual’ reproduction. When a household’s storage rises above a certain threshold, it probabilistically reproduces, dividing its resources between itself and its offspring.
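The reproduction rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the model code: `maybe_reproduce` and its parameter names are hypothetical, and the probability is expressed as a percentage, as a slider might be.

```python
import random


def maybe_reproduce(energy, threshold, reproduce_percent, rng):
    """Return (parent_energy, child_energy); child_energy is None when no fission occurs."""
    if energy < threshold:
        return energy, None  # not enough stored resources to split
    if rng.random() * 100 >= reproduce_percent:
        return energy, None  # the dice roll failed this season
    half = energy / 2  # resources are divided evenly between parent and offspring
    return half, half
```

For example, `maybe_reproduce(30, 20, 100, random.Random(0))` always splits the household into two with 15 energy each, while a `reproduce_percent` of 0 never does.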

We also allowed a new parameter which controlled how much of the landscape would be productive. It would be set at the beginning of the model and would stay the same throughout the run. While this was an improvement over Model 1, we realized that with this model the landscape wasn’t really reacting to household usage: it was fairly static from start to finish. There were patches of productivity and patches of no productivity. We wanted to see what would happen with unpredictable events, like the gigantic Mongolian winter storms known as dzuds.

We did some quick analyses with the “Index of Dispersion” statistic (for more info look here) and the following graphic shows a scientific poster we created for the presentation of this simplified model.

We realized we had essentially created the nullest of null models. Nothing in reality looked like what we produced in this model, but it made a great stepping-stone for adding complexity. Moreover, by doing multiple tests on the model we were able to ensure we truly understood what we had coded.
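For readers who have not met the Index of Dispersion mentioned above: it is the variance-to-mean ratio of counts per quadrat. Values near 1 suggest a random (Poisson-like) pattern, values above 1 suggest clustering, and values below 1 suggest even spacing. The Python sketch below is one common way to compute it and is not necessarily how we calculated it for the poster. The function name and the square quadrat grid are assumptions, and points are assumed to satisfy 0 <= x, y < `world_size`.

```python
from collections import Counter
from statistics import mean, variance


def index_of_dispersion(points, world_size, cell_size):
    """Variance-to-mean ratio of household counts per square quadrat."""
    cells_per_side = int(world_size // cell_size)
    # Count how many points fall in each quadrat.
    counts = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    # Include empty quadrats, which matter for both the mean and the variance.
    per_cell = [counts.get((i, j), 0)
                for i in range(cells_per_side)
                for j in range(cells_per_side)]
    return variance(per_cell) / mean(per_cell)
```

Eight households piled into one quadrat of a 2 x 2 grid give a ratio well above 1, while two households in every quadrat give 0.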

Model 3: The Final Model

There were several things we were unsatisfied with in Model 2.

First, we wanted to know if people would group together, but in the previous model grouping was not intentional. It was completely random. While this is useful for letting us know if households being close together is random or a product of social choice, it didn’t help us get at risk management. For that we needed a way for households to intentionally choose to cooperate with another household.

We realized we needed to explore creating different strategies for different groups of agents. Thus we used the “breeds” option native in NetLogo to create four strategies related to sharing. These strategies corresponded to the probability of a household asking for help from another household when their storage got below a certain level. We created four strategies:

Lineage A: 100% cooperation

Lineage B: 50% cooperation

Lineage C: 25% cooperation

Lineage D: 0% cooperation

This way we could directly examine the survival of different sharing strategies in different environments. Would always sharing ever be a good option? Likewise, would never sharing be a good option? You can read the paper to see our results, but by creating different lineages we could examine this.
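The four strategies differ only in how likely a struggling household is to ask for help. A minimal Python sketch of that decision follows. The dictionary and function names are hypothetical, and the storage threshold is passed in as a parameter rather than taken from the paper.

```python
import random

# Probability (in percent) that a household below the threshold asks for help.
COOPERATION_PERCENT = {"A": 100, "B": 50, "C": 25, "D": 0}


def asks_for_help(lineage, energy, threshold, rng):
    """Return True when a household of this lineage asks a neighbour for help."""
    if energy > threshold:
        return False  # storage is fine, no need to ask
    return rng.random() * 100 < COOPERATION_PERCENT[lineage]
```

With a threshold of 10, a Lineage A household at 8 energy always asks, a Lineage D household never does, and Lineage B asks about half the time.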

Second, in our research we realized that people could live pretty much anywhere during the summer. In the winter, however, people couldn’t live just anywhere; during ethnographic interviews families talked about areas they would go to when winters weren’t harsh, and areas they’d go to when winters were harsh. For our model we wanted to create distinct areas for people to live in. Thus we divided the landscape into two halves, summer and winter, and allowed the Gers to ‘teleport’ between the two halves, and then use their usual move function once in the summer or winter patches. This, then, mimics the long-distance movement we noted in ethnographic interviews. In the winter patches we allowed for ‘green’ patches (healthy grass), ‘brown’ patches (patches that grass could grow on but were currently denuded), and ‘grey’ patches (to symbolize areas where grass couldn’t grow: rocky outcrops, frozen lakes, etc.). By changing the parameter that corresponds to the proportion of grey patches we can simulate more or less productive winter environments.
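The split landscape can be sketched as a simple grid in Python. This is an illustration only: `build_landscape` and `grey_fraction` are hypothetical names, and drawing green and brown with equal odds is an assumption made for the sketch.

```python
import random


def build_landscape(half_width, height, grey_fraction, rng):
    """Return a dict mapping (x, y) to (season, colour) for a two-halved world."""
    patches = {}
    for x in range(-half_width, half_width):
        for y in range(height):
            if x < 0:
                # Left half: summer pasture, people can live anywhere.
                patches[(x, y)] = ("summer", "green")
            elif rng.random() < grey_fraction:
                # Right half: winter patch where grass can never grow.
                patches[(x, y)] = ("winter", "grey")
            else:
                # Right half: winter patch that is currently grazeable or denuded.
                patches[(x, y)] = ("winter", rng.choice(["green", "brown"]))
    return patches
```

Raising `grey_fraction` towards 1 produces a harsher winter half, while 0 leaves every winter patch able to grow grass.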

We also allowed the households to use memory to return to previously good winter patches. This meant that a household that previously teleported onto a green patch would remember it. This reduces the probability of landing on a grey patch. While it’s true that they could return to that patch and it would be dead, a green or brown patch is likely to abut another green patch, reducing the odds of landing in a truly bad patch.
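One way to picture the memory mechanism is the Python sketch below. The function names are hypothetical and the cap of four remembered patches is an assumption made for the example.

```python
import random


def choose_winter_camp(remembered, winter_patches, rng):
    """Return a remembered patch when there is one, otherwise a random winter patch."""
    if remembered:
        return rng.choice(remembered)
    return rng.choice(winter_patches)


def update_memory(remembered, patch, colour, max_length=4):
    """Append green patches to memory, keeping only the most recent `max_length`."""
    if colour == "green":
        remembered.append(patch)
        del remembered[:-max_length]  # forget the oldest camps beyond the cap
    return remembered
```

A household with an empty memory falls back on a random winter patch, so memory only starts to matter after the first good winter.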

We added multiple variables for the final version of Ger Grouper:

Ger-reproduce: the probability of reproducing at any timestep

N: the number of each lineage to be seeded at the start of the simulation

Patch-variability: the amount of winter patches we allow to be dead

Ger-gain from food: the amount of energy that each household gets from consuming grass

Grass-regrowth time: how long it takes for grass to regrow once eaten

Energy-loss from dead patches: how much each household is charged for landing on unproductive patches

Results:

Sharing strategies are stable and useful for different environments, and ‘restricted sharing’ practices generally perform best. However, in some environments all-share and no-share are better than restricted sharing.

For the full results and for what that might mean for Mongolia read the article here.

Now here are the massive amounts of code. Be gentle; some of this is poorly documented or not exactly what we were hoping for when we began. But hopefully it shows you my thought process, and you should be able to cut and paste into NetLogo and get it to run.

Model 1:

globals [
run-count
]

breed [
  gers ;; the plural name comes first in a NetLogo breed declaration
  ger ;; singular name
]

turtles-own [
trait
energy
]

patches-own [ ]

to setup

let saved-run-count run-count

ca ;; shorthand for “clear all”

set run-count saved-run-count + 1

crt N ;; creating N turtles; this will be controlled by a slider at the GUI
[
  set shape "ger" ;; needs a custom shape named "ger" in the Turtle Shapes Editor
  set size 3
  setxy random-xcor random-ycor
  set trait random 1000
  set label trait
]

;; this section randomizes the patches so that at initialization ‘patchiness’ is set randomly
;; The “Z” slider at the getgo makes it so the user can make the land more lush (a lower number for Z) or more sparse (a higher number)
;; this is based on “patch clusters” in the netlogo code library
;; this identifies one patch, colors it green, repeats this 50 times randomly in the landscape and asks
;; random patches to color as their neighbors are

ask patches
[ set pcolor green + green * random Z ]
repeat 50
[ ask patches
[set pcolor [pcolor] of one-of neighbors4 ] ]

ask patches
[
if pcolor != green [set pcolor brown]
]

reset-ticks

end

to go

if not any? turtles [ stop ] ;; this is so that if all our turtles die the sim doesn’t keep moving

move_spring
move_autumn

if ticks >= 10
[ stop ] ;; since population is held constant, just having a few ticks is fine to test fission/fusion

end

;; set heading randomly away from current patch
;; this is done because in the ethnography, Mongolians are much less constrained to where they can live in spring
;; to simulate moving to a completely different area, they randomly move in a direction
;; and are not constrained as to what productivity the patches has
;; instead of making the world flip between green in spring and a patchy green/brown they can simply live anywhere in spring

to move_spring
ask turtles
[
left random 360
forward 10
]

;; because movement is divided into two seasons, spring and winter, file writing has to happen in each “move” space
;; this file writes both the spring patterns
;; and the autumn patterns of the turtles
;; by recording the tick
;; the individual Ger
;; and the x and y coordinates

;; if ticks = 8 [
;; ask turtles
;; [
;; file-open “GerGrouper1.txt”
;; file-type (word ticks “,”)
;; file-type (word who “,”)
;; file-type (word xcor “,”)
;; file-print ycor
;; file-close
;; ]
;; ]

tick

end

to move_autumn

;; Here, the turtles move in a random direction away from the patch they were on during the spring.
;; They are constrained during autumn to try and live on a green patch.
;; If the patch is brown, a.k.a dead
;; The turtle looks at its Von Neuman neighborhood.
;; If they find a patch in the neighborhood that’s green, then they move there and are rewarded 1 energy.
;; If they don’t, they are deducted 1 energy.
;; Turtles die by having their energy dip below 0 (defined below under “death”).

ask turtles
[
left random 360
forward 10
]
ask turtles [
;; if previous tick
;; pcolor = green [
;;move to that patch
;; ]

rt random 360
forward 30

]

ask turtles [
  if pcolor = brown [
    ifelse any? neighbors4 with [ pcolor = green ]
    [
      move-to one-of neighbors4 with [ pcolor = green ]
      set energy energy + 1 ;; add energy if they land on a green patch
    ]
    [
      set energy energy - 1 ;; deduct energy if they don't end up on a green patch
    ]
  ]
  death
]

;; this file writes the autumn patterns of the turtles
;; by recording the run count, the seed, number of turtles, tick
;; the individual Ger ID
;; and the x and y coordinates
;; (seed and N are not declared in this listing, so they are presumably interface widgets)

ask turtles [
  if ticks = 9 [
    file-open "GerGrouper_Final.txt"
    file-type (word run-count ",")
    file-type (word seed ",")
    file-type (word N ",")
    file-type (word ticks ",")
    file-type (word who ",")
    file-type (word xcor ",")
    file-print ycor
    file-close
  ]
]

tick

end

;; this is how turtles die at the end of autumn and how they reproduce
;; since reproduction is just replacement

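to death ;; turtle procedure
  ;; when energy dips below zero, hatch a replacement at a random location, then die
  if energy < 0 [
    hatch 1 [
      setxy random-xcor random-ycor
      set trait random 1000
      set label trait
      set energy 0 ;; otherwise the replacement inherits the parent's negative energy
    ]
    die
  ]

end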

Model 2:

globals [ ]

breed [
  gers ;; the plural name comes first in a NetLogo breed declaration
  ger ;; singular name
]

turtles-own [
trait
energy
]

patches-own [ ]

to setup

ca ;; shorthand for “clear all”

crt N ;; creating N turtles; this will be controlled by a slider at the GUI
[
  set shape "ger"
  set size 3
  setxy random-xcor random-ycor
  set trait random 1000
  set label trait
]

;; this section randomizes the patches so that at initialization ‘patchiness’ is set randomly
;; The “Z” slider at the getgo makes it so the user can make the land more lush (a lower number for Z) or more sparse (a higher number)
ask patches
[ set pcolor green + green * random Z ]
repeat 50
[ ask patches
[set pcolor [pcolor] of one-of neighbors4 ] ]

ask patches
[
if pcolor != green [set pcolor brown]
]

reset-ticks

end

to go

if not any? turtles [ stop ] ;; this is so that if all our turtles die the sim doesn’t keep moving

move_spring
move_autumn

if ticks >= 100
[ stop ] ;; 100 ticks = 50 years, a suitable amount of time to see fission/fusion

end

to move_spring
ask turtles
[
left random 360
forward 20
]

ask turtles
[
  file-open "GerGrouper.txt"
  file-type (word ticks ",")
  file-type (word who ",")
  file-type (word xcor ",")
  file-print ycor
  file-close
]

tick

end

to move_autumn

  ask turtles
  [
    rt random 360
    forward 10
  ]

  ;; death and reproduce-gers are turtle procedures, so they have to run inside an ask
  ask turtles
  [
    death
    reproduce-gers
  ]

  ;; this is the same file writing procedure as above, but instantiated in autumn after death and reproduction have happened

  ask turtles
  [
    file-open "GerGrouper.txt"
    file-type (word ticks ",")
    file-type (word who ",")
    file-type (word xcor ",")
    file-print ycor
    file-close
  ]

  tick
end

;; this procedure is how a turtle reproduces. The “gers-reproduce” slider is on the GUI and shows the percent likelihood of reproduction

to reproduce-gers ;; procedure for turtles
  if random-float 100 < gers-reproduce [ ;; throw "dice" to see if you will reproduce
    set energy (energy / 2) ;; divide energy between parent and offspring
    hatch 1 [ ;; hatch an offspring and move it forward 1 step
      rt random-float 360
      fd 1
      set trait random 1000 ;; give it a random label so it isn't labeled like its parent
      set label trait
    ]
  ]
end

;; this is how turtles die at the end of autumn

to death ;; turtle procedure
; when energy dips below zero, die
if energy < 0 [ die ]

end

Model 3:

globals [
grass
summer-patches
winter-patches
]

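;; NetLogo 6 and later require a singular name after the plural one;
;; the a-lineage names below are placeholders and are not used elsewhere
breed [
  lineageA a-lineageA ;; 100% cooperation
]
breed [
  lineageB a-lineageB ;; 50% cooperation
]
breed [
  lineageC a-lineageC ;; 25% cooperation
]
breed [
  lineageD a-lineageD ;; 0% cooperation
]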

turtles-own [
visited-patches
energy
trait
]

patches-own [countdown]

to setup

ca ;; shorthand for “clear all”
create-lineageA N
[
  set shape "ger"
  set size 3
  setxy random-xcor random-ycor
  set trait random 1000
  set label lineageA
  set energy 20
  set color blue
  set visited-patches (list patch-here)
]
create-lineageB N
[
  set shape "ger"
  set size 3
  setxy random-xcor random-ycor
  set trait random 1000
  set label lineageB
  set energy 20
  set color pink
  set visited-patches (list patch-here)
]
create-lineageC N
[
  set shape "ger"
  set size 3
  setxy random-xcor random-ycor
  set trait random 1000
  set label lineageC
  set energy 20
  set color red
  set visited-patches (list patch-here)
]
create-lineageD N
[
  set shape "ger"
  set size 3
  setxy random-xcor random-ycor
  set trait random 1000
  set energy 20
  set label lineageD
  set color yellow
  set visited-patches (list patch-here)
]

set summer-patches patches with [ pxcor < 0 ]
ask summer-patches [ set pcolor green ] ;;one-of [ green brown ]]

set winter-patches patches with [ pxcor >= 0 ]
ask winter-patches [ set countdown random grass-regrowth-time ;; initialize grass grow clocks randomly
set pcolor one-of [green brown brown grey grey grey] ;;equal amount dead patches as patches that can be living
]

ask patches [
set countdown random grass-regrowth-time ;; initialize grass grow clocks randomly
]
;; ]

file-open "ger_grouper2.csv"
file-close

reset-ticks

;; tick

end
to go
if not any? turtles [ stop ]
if ticks >= 500 [ stop ]

move_spring
ask turtles [
  set energy energy - energy-loss-from-dead-patches ;; deduct energy if they land on a bad patch (as written, this is charged to every turtle)
  eat-grass
]
write-spring

move_autumn
ask turtles [
  set energy energy - energy-loss-from-dead-patches ;; deduct energy if they land on a bad patch (as written, this is charged to every turtle)
  eat-grass

  update-history
]
ask lineageA [
  if energy <= 10 [
    ask one-of lineageA in-radius 5 [
      set energy energy + 10
    ]
    set energy energy - 10
  ]
]
ask lineageB [
  if energy <= 10 [
    if random-float 100 < 50 [
      ask one-of lineageB in-radius 5 [
        set energy energy + 10
      ]
      set energy energy - 10
    ]
  ]
]

ask lineageC [
  if energy <= 10 [
    if random-float 100 < 25 [
      ask one-of lineageC in-radius 5 [
        set energy energy + 10
      ]
      set energy energy - 10
    ]
  ]
]
write-autumn

ask patches
[ grow-grass ]
set grass count patches with [pcolor = green]

if count turtles >= 500
[ stop ]

ask patches
[ variability ]

end

to move_spring
ask turtles
[
  move-to-empty-one-of summer-patches
  if pcolor != green [
    ifelse any? neighbors with [ pcolor = green ]
    [
      move-to one-of neighbors with [ pcolor = green ]
      set energy energy - 1 + ger-gain-from-food
    ]
    [
      set energy energy - energy-loss-from-dead-patches ;; deduct energy if they don't end up on a green patch
    ]
  ]
]

tick

end

to move_autumn

ask turtles [
if ticks >= 2 [
move-to one-of visited-patches
move-to-empty-one-of winter-patches
]
]

; ask patch
; if the patch is brown, a.k.a dead
; look at your neighborhood
; if you find a patch in the neighborhood that’s green move there

ask turtles [
  if pcolor != green [
    ifelse any? neighbors with [ pcolor = green ]
    [
      move-to one-of neighbors with [ pcolor = green ]
      set energy energy - 1 + ger-gain-from-food
    ]
    [
      set energy energy - energy-loss-from-dead-patches ;; deduct energy if they don't end up on a green patch
    ]
    set label round energy

    death
    reproduce-gers
  ]
]
tick
end

to eat-grass ;; ger procedure
  ;; gers eat grass, turn the patch brown
  if pcolor = green [
    set pcolor brown
    set energy energy + ger-gain-from-food ;; ger gain energy by eating
  ]
end

to variability
if random-float 100 < patch-variability
[
if pcolor = green [
set pcolor brown ]
]
;; ]
end

to reproduce-gers ;; ger procedure
if energy >= 20 [
if random-float 100 < gerreproduce [ ;; throw “dice” to see if you will reproduce
set energy (energy / 2) ;; divide energy between parent and offspring
hatch 1 [ rt random-float 360 fd 1 ]
]
]
end

to death ;; turtle procedure
  ;; when energy dips below 5, die
  if energy < 5 [ die ]

end

to grow-grass ;; patch procedure
  ;; countdown on brown patches: if reach 0, grow some grass
  if pcolor = brown [
    ifelse countdown <= 0
    [ set pcolor green
      set countdown grass-regrowth-time ]
    [ set countdown countdown - 1 ]
  ]
end

to move-to-empty-one-of [locations] ;; turtle procedure
move-to one-of locations
while [any? other turtles-here] [
move-to one-of locations
]

end

;; here the gers remember the patches they went to that were green
;; (the list is never trimmed in this listing, so it can grow beyond the last 4)
;; since update-history is only called in the winter
;; the gers only remember winter camps that were productive

to update-history
if pcolor = green
[
set visited-patches (lput patch-here visited-patches)
]

end

to write-autumn
  file-open "ger_grouper2.csv"
  file-type (word behaviorspace-run-number ",")
  file-type (word ticks ",")
  file-type (word seed ",")
  file-type (word count lineageA ",")
  file-type (word count lineageB ",")
  file-type (word count lineageC ",")
  file-type (word count lineageD ",")
  file-type (word N ",")
  file-type (word gerreproduce ",")
  file-type (word patch-variability ",")
  file-type (word ger-gain-from-food ",")
  file-type (word grass-regrowth-time ",")
  file-print (word energy-loss-from-dead-patches ",")

  file-close
end

to write-spring
  file-open "ger_grouper2.csv"
  file-type (word behaviorspace-run-number ",")
  file-type (word ticks ",")
  file-type (word seed ",")
  file-type (word count lineageA ",")
  file-type (word count lineageB ",")
  file-type (word count lineageC ",")
  file-type (word count lineageD ",")
  file-type (word N ",")
  file-type (word gerreproduce ",")
  file-type (word patch-variability ",")
  file-type (word ger-gain-from-food ",")
  file-type (word grass-regrowth-time ",")
  file-print (word energy-loss-from-dead-patches ",")
  file-close
end
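The write-spring and write-autumn procedures append one header-less, comma-separated row per season, so the output needs column names before it can be analysed. The Python sketch below is one possible way to read it. The column labels are made up to follow the order of the file-type calls, the sample row contains invented numbers, and every field is assumed to be numeric.

```python
import csv
import io

# Labels are illustrative; they follow the order in which the values are written.
COLUMNS = [
    "run", "tick", "seed",
    "lineageA", "lineageB", "lineageC", "lineageD",
    "N", "gerreproduce", "patch_variability",
    "ger_gain_from_food", "grass_regrowth_time",
    "energy_loss_from_dead_patches",
]


def read_rows(handle):
    """Parse header-less output rows into dictionaries of floats."""
    rows = []
    for fields in csv.reader(handle):
        # Drop empty trailing fields left by a comma at the end of the line.
        while fields and fields[-1].strip() == "":
            fields.pop()
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        rows.append({name: float(value) for name, value in zip(COLUMNS, fields)})
    return rows


# Invented sample row, only to show the expected shape of the data.
sample = io.StringIO("1,2,42,10,9,8,7,10,5,20,4,30,1,\n")
print(read_rows(sample)[0]["lineageA"])
```

With a real file, `read_rows(open("ger_grouper2.csv", newline=""))` would return one dictionary per season, ready for plotting lineage counts over time.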

Noteworthy Publications

Review: Agent-based Modeling and Simulation in Archaeology

February 23, 2015 stefanicrabtree Leave a comment

A brand-new book from Springer is sure to become a staple on our bookshelves. Agent-based Modeling and Simulation in Archaeology provides a much-needed update, in one solid volume, on the methods and practice of using agent-based modeling to understand the past.

The two introductory chapters serve to highlight the utility of ABM, in general terms with Lake’s “Explaining the Past with ABM” chapter, and more specifically with Swedlund et al’s “Modeling Archaeology” chapter, which delves into the case study of Artificial Anasazi. Lake specifically tells us that there is a large

“advantage of adding computer simulation to the archaeologists toolkit: not only [does] it force us to codify and make explicit our assumptions, but… it also allows us to explore the outcome of behaviors which can no longer be observed and for which there is no reliable recent historical record. In addition, it allows us to explore the outcome of behavior aggregated at the often coarse grained spatial and temporal resolution of the archaeological record.” (Lake, p. 9, this volume).

So say we all.

The most useful portion of this book to those new to agent-based modeling is probably the Methods section. While Railsback and Grimm have written the seminal text on learning agent-based modeling, their ecological approach can sometimes leave archaeologists scratching their heads. The four chapters on methods in this volume, however, concretely link ABM approaches with archaeology, discussing the unique sets of challenges we face in archaeology and how simulation methods can address those questions. Those concerned with questions that are tied intrinsically to the landscape will especially enjoy Koch’s “Geosimulation” chapter, while each of us should probably memorize Popper and Pichler’s “Reproducibility” chapter, and attempt to keep our work transparent, as they so rightly suggest.

The Applications section of this book provides four unique case studies that use ABM in varied situations. Each of these is well-researched and provides a different viewpoint on how to use ABM effectively in archaeology. From Crema’s analysis of fission-fusion dynamics, to Kowarik et al’s and Danielisová et al’s chapters on prehistoric economies, to Barceló et al’s look at territoriality and social networks, these chapters are sure to provide good fodder for learning about different archaeological systems and how ABM can bring light to muddy portions of our understanding.

Although they are heavily cited in it, a few pioneers of agent-based modeling in archaeology (Kohler, Premo) are missing from this book as authors, which may reflect the mostly European authorship of the chapters, or is likely due to the fact that the book is based on a meeting held in Vienna. If the book gets an updated edition it would be good to include other voices from across the pond.

The heavy price tag of the book ($175 on Amazon, currently $120 on Springer) might put it beyond the reach of the students at whom this book seems aimed. Hopefully a less costly paperback or e-reader version will come out to increase its accessibility.

All in all, this book will be a worthwhile addition to our bookshelves, and I can already imagine incorporating it into courses in agent-based modeling.

Find the book, Agent-based Modeling and Simulation in Archaeology, editors Gabriel Wurzer, Kerstin Kowarik and Hans Reschreiter on Amazon, or more info on Springer.
