This year the Simulating Complexity team is yet again teaching a two-day workshop on agent-based modelling in archaeology as a satellite to the CAA conference. The workshop will take place on Sunday and Monday, 12-13 March 2017. The workshop is free of charge; however, you have to register for the conference (which has some good modelling sessions as well).
Last year we had an absolute blast with over 30 participants, 10 instructors and a 96% satisfaction rate (among the students; the instructors were 100% happy!).
The workshop will follow along similar lines to last year although we have a few new and exciting instructors and a few new topics. For more details check here and here or simply get in touch!
The Simulating Complexity team is involved in two sessions at the CAA. Please consider putting together an abstract for submission. See them both below. The submission system can be accessed here: http://caaconference.org/
Session: Data, Theory, Methods, and Models. Approaching Anthropology and Archaeology through Computational Modeling
Abstract: Quantitative model-based approaches to archaeology have been rapidly gaining popularity. Their utility in providing an experimental test-bed for examining how individual actions and decisions could influence the emergence of complex social and socio-environmental systems has fueled a spectacular increase in adoption of computational modeling techniques to traditional archaeological studies. However, computational models are restricted by the limitations of the technique used, and are not a “silver bullet” solution for understanding the archaeological and anthropological record. Rather, simulation and other types of formal modeling methods provide a way to interdigitate between archaeology/anthropology and computational approaches and between the data and theory, with each providing a feedback to the other. In this session we seek well-developed models that use data and theory from the anthropological and archaeological records to demonstrate the utility of computational modeling for understanding various aspects of human behavior. Equally, we invite case studies showcasing innovative new approaches to archaeological models and new techniques expanding the use of computational modeling techniques.
Session: Everything wrong with…
Abstract: This is a different kind of session. Instead of the normal celebration of our success, this session will be looking at our challenges, but without degrading into self-pity and negativity, as it will be about critical reflection and possible solutions. The goal of this session is to raise the issues we should be tackling: to break the mold of the typical conference session, in which we review what we have solved, and instead explore what needs to be solved. Each participant will give a short presentation (max 10 minutes, but preference will be for 5 minutes) in which they take one topic and critically analyse the problems surrounding it, both new and old. Ideally, by the end each participant will have laid out a map of the challenges facing their topic. The floor will then be opened up to the audience to add more issues, refute the problems raised, or propose solutions. This is open to any topic: GIS, 3D modelling, public engagement, databases, linked data, simulations, networks, etc. It can be about a very narrow topic or broad ranging, e.g. everything that is wrong with C14 dating, everything wrong with least cost path analysis in ArcGIS, everything wrong with post-processualism, etc. However, this is an evaluation of our methods and theories and is not meant to be as high level as past CAA sessions that have looked at grand challenges, e.g. the beginning of agriculture. Anyone interested in presenting is asked to submit a topic (1-2 sentences) and an estimated time to summarize it (5 or 10 minutes). Full abstracts are not necessary.
An older version of this tutorial used the now-deprecated ncdf package for R. This updated version makes use of the ncdf4 package, and fixes a few broken links while we’re at it.
You found it: the holy grail of palaeoenvironmental datasets. Some government agency or environmental science department put together some brilliant time series GIS package and you want to find a way to import it into your model. But oftentimes the data may be in a format which isn’t readable by your modeling software, or takes some finagling to get the data in there. NetCDF is one of the more notorious of these. A NetCDF file (which stands for Network Common Data Form) is a multidimensional array, where each layer represents the spatial gridded distribution of a different variable or set of variables, and sets of grids can be stacked into time slices. To make this a little more clear, here’s a diagram:
In this diagram, each table represents a gridded spatial coverage for a single variable. Three variables are represented this way, and these are stored together in a single time step. The actual structure of the file might be simpler (that is, it might consist of a single variable and/or single time step) or more complex (with many more variables or where each variable is actually a set of coverages representing a range of values for that variable; imagine water temperature readings taken at a series of depths). These chunks of data can then be accessed as combined spatial coverages over time. Folks who work with climate and earth systems tend to store their data this way. It’s also a convenient way to keep track of data obtained from satellite measurements over time. They’re great for managing lots of spatial data, but if you’ve never dealt with them before, they can be a bit of a bear to work with. ArcGIS and QGIS support them, but it can be difficult to work them into simulations without converting to a more benign data type like an ASCII file. In a previous post, we’ve discussed importing GIS data into a NetLogo model, but of course this depends on our ability to get the data into a model-readable format. The following tutorial is going to walk through the process of getting a NetCDF file, manipulating it in R, and then getting it into NetLogo.
Step #1 – Locate the data
First let’s locate a useful NetCDF dataset and import it to R. As an example, we’ll use the Global Potential Vegetation Dataset from the UW-Madison Nelson Institute Sage Center for Sustainability and the Global Environment. As you can see, the data is also available as an ASCII file; this is useful because you can use this later to check that you’ve got the NetCDF working. Click on the appropriate link to download the Global Potential Veg Data NetCDF. The file is a tarball (extension .tar.gz), so you’ll need something to unzip it. If you’re not partial to a particular file compressor, try 7-Zip. Keep track of where the file is located on your local drive after downloading and unzipping.
Step #2- Bring the data into R
R won’t read NetCDF files as is, so you’ll need to download a package that works with this kind of data. The ncdf4 package is one of a few different packages that work with these files, and we’ll use it for this tutorial. First, open the R console, go to Packages->Install Packages, and download the ncdf4 package from your preferred mirror site. Then load the package by entering the following:
library(ncdf4)
Now, remembering where you saved your NetCDF file, you can bring it into R with the following command:
data <- nc_open(filename)
If you didn’t save the data file in your R working directory and want to navigate to the file, just replace filename with file.choose(). For now, we’ll use the 0.5 degree resolution vegetation data (vegtype_0.5.nc). Now if you type in data and press enter, you can check to see what the data variable holds. You should get something like this:
This is telling you what your file is composed of. The first line tells you the name of the file. Beneath this are your variables. In this case, there is only one, vegtype, which according to the above uses a number just shy of nine hundred quintillion as a missing value (the computer will interpret any occurrences of this number as no data).
Next come your dimensions, giving the intervals of measurement. In this case, there are four dimensions: longitude, latitude, level, and time. Our file only has one time slice, meaning that it represents a single snapshot of data; if this number is larger, there will be more coverages included in your file over time. The coverage spans from 89.75 S to 89.75 N latitude in 0.5 degree increments, and 180 W to 180 E longitude by the same increments.
To access the vegtype data, we need to assign it to a local variable, which we will call veg:
ncvar_get(data,"vegtype") -> veg
The ncvar_get command extracts the named variable (“vegtype”) from the NetCDF file (data) as a matrix. Then we assign it to the local variable veg. There are a number of other commands within the ncdf4 package which are useful for reading and writing NetCDF files, but these go beyond the scope of this blog entry. You can read more about them here.
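To get a feel for what nc_open and ncvar_get are doing without downloading anything, here is a minimal sketch that writes a tiny NetCDF file and reads it back. The file name, the 4 x 2 grid, the units, and the -9999 missing value are all made up for illustration and are not taken from the SAGE data; only the ncdf4 functions themselves are real.

```r
library(ncdf4)

# Hypothetical toy grid: 4 longitudes x 2 latitudes x 1 time slice
lon  <- ncdim_def("longitude", "degrees_east",  c(-135, -45, 45, 135))
lat  <- ncdim_def("latitude",  "degrees_north", c(-45, 45))
time <- ncdim_def("time", "years", 0)

# One variable on that grid, with an assumed missing-value flag
vegvar <- ncvar_def("vegtype", "class", list(lon, lat, time),
                    missval = -9999, prec = "integer")

# Write the toy file to a temporary folder
path <- file.path(tempdir(), "toy_vegtype.nc")
nc <- nc_create(path, vegvar)
toy <- array(c(1, 2, NA, 4, 9, 10, 11, NA), dim = c(4, 2, 1))
ncvar_put(nc, vegvar, toy)      # NA cells are stored as the missing value
nc_close(nc)

# Read it back the same way as in the tutorial
data <- nc_open(path)
names(data$var)                 # variables held in the file
data$dim$longitude$vals         # coordinate values along one dimension
veg <- ncvar_get(data, "vegtype")
dim(veg)                        # the length-1 time dimension is dropped
nc_close(data)
```

Because the time dimension has only one slice, ncvar_get hands back a plain two-dimensional matrix, and cells that held the missing value come back as NA.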
Step #3 – Checking out the data
Now our data is available to us as a matrix. We can view it by entering the following:
Oops! Our output reads from bottom to top instead of top to bottom. No problem, we can just invert the latitude of the matrix like so:
However, this only changes the view; when we get the data into NetLogo later on, we’ll need to transpose it. But for now, let’s add some terrain colors. According to the readme file associated with the data, there are 15 different landcover types used here:
Tropical Evergreen Forest/Woodland
Tropical Deciduous Forest/Woodland
Temperate Broadleaf Evergreen Forest/Woodland
Temperate Needleleaf Evergreen Forest/Woodland
Temperate Deciduous Forest/Woodland
Boreal Evergreen Forest/Woodland
Boreal Deciduous Forest/Woodland
Evergreen/Deciduous Mixed Forest/Woodland
We could choose individual colors for each of these, but for the moment we’ll just use the in-built terrain color ramp:
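One way to do the viewing, flipping, and coloring described in this step is with base R's image function. The sketch below uses a small made-up matrix in place of veg; the ylim = c(1, 0) trick for reversing the y-axis and the fixed zlim are assumptions on my part, offered as one workable approach and not as the only one.

```r
# Synthetic stand-in for veg: 6 longitudes x 4 latitudes, classes 1-15, NA = no data
set.seed(1)
veg <- matrix(sample(c(1:15, NA), 24, replace = TRUE), nrow = 6, ncol = 4)

# Default view: the first latitude column is drawn at the bottom of the plot
image(veg)

# Reverse the y-axis so the plot is flipped vertically (the matrix is unchanged)
image(veg, ylim = c(1, 0))

# One color per land cover class, using the built-in terrain ramp
veg_cols <- terrain.colors(15)
image(veg, ylim = c(1, 0), col = veg_cols, zlim = c(1, 15))
```

Fixing zlim to the full 1-15 range keeps each class tied to the same color even if a given map happens not to contain every class.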
Step #4 – Exporting the data to NetLogo
Finally, we want to read our data into a modeling platform, in this case NetLogo, so let’s export it as a raster coverage we can work with. Before we do any file writing, we’ll need to coerce the matrix into a data frame and make sure we transpose it so that it doesn’t come out upside down again. To do this, we’ll use the following code:
The as.data.frame command does the coercing, while the t command does the transposing. Now we have to open up the file we’re going to write to:
This establishes a connection to an open file which we’ve named vegcover.asc. Next, we’ll write the header data for an ASCII coverage. We can do this by adding lines to the file:
This may look like a bunch of nonsense, but each \t is a tab, and each \n is a new line. The result is a header on our file which looks like this:
ncols 720
nrows 360
xllcorner -179.75
yllcorner -89.75
cellsize 0.5
NODATA_value 8.99999982852418e+20
Any program (whether a NetLogo model, GIS, or otherwise) that reads this file will look for this header first. The terms ncols and nrows define the number of columns and rows in the grid. The xllcorner and yllcorner define the lower left corner of the grid. The cellsize term describes how large each cell should be, and the NODATA_value is the same value from the original dataset which we used to define places where data is not available. Now we just need to enter in our transposed data.
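Pulling the export steps together, a minimal sketch might look like the following. It uses a made-up 4 x 2 matrix in place of veg and writes to a temporary folder, so the header values (corner coordinates and a 90 degree cell size) belong to that toy grid and not to the 0.5 degree data. Writing NA cells out as the original missing-value flag via the na argument is an assumption about how the no-data cells should be handled.

```r
# Synthetic stand-in for veg: 4 longitudes x 2 latitudes (the real grid is 720 x 360)
veg <- matrix(c(1, 2, NA, 4,
                9, 10, 11, NA), nrow = 4, ncol = 2)
nodata <- "8.99999982852418e+20"   # missing-value flag from the original file

# Coerce to a data frame, transposing so each row is one line of latitude
veg2 <- as.data.frame(t(veg))

# Open a connection to the output file (a temp folder in this sketch)
out_path <- file.path(tempdir(), "vegcover.asc")
out <- file(out_path, open = "w")

# Header lines: \t is a tab, \n is a new line
cat("ncols\t", ncol(veg2), "\n",
    "nrows\t", nrow(veg2), "\n",
    "xllcorner\t-180\n",
    "yllcorner\t-90\n",
    "cellsize\t90\n",
    "NODATA_value\t", nodata, "\n",
    file = out, sep = "")

# Grid values: space separated, no row or column names, NA written as the flag
write.table(veg2, file = out, sep = " ", quote = FALSE,
            row.names = FALSE, col.names = FALSE, na = nodata)
close(out)

# Inspect the finished file
asc_lines <- readLines(out_path)
asc_lines
```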
This will take our data frame and write it to the file we just created, appending it after the header. It’s important that your separator be a space (sep=" ") to ensure that it is in a format NetLogo can read. Also make sure to get rid of any row and column names. Now we can read our file into NetLogo using the GIS extension (for an explanation of this, see here). Open a new NetLogo file, set the world window settings with the origin at the bottom left, a max-pxcor of 719 and a max-pycor of 359, and a patch size of 1. Save your NetLogo model in the same directory as the vegcover.asc file, and the following NetLogo code should do the trick:
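extensions [ gis ]
globals [ vegcover ]
patches-own [ vegtype ]
to setup
  set vegcover gis:load-dataset "vegcover.asc"
  gis:set-world-envelope-ds gis:envelope-of vegcover
  ask patches [
    set pcolor white
    set vegtype gis:raster-sample vegcover self  ; value of the raster cell under this patch
  ]
  ask patches with [ vegtype <= 8 ] [  ; forest and woodland classes
    set pcolor scale-color green vegtype -5 10
  ]
  ask patches with [ vegtype > 8 ] [  ; non-forest classes
    set pcolor scale-color pink vegtype 9 15
  ]
end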
This should produce a world in which patches have a variable called vegtype with values that correspond to the original dataset. Furthermore, patches are colored according to a set scheme where forested areas are on a scale of green, while non-forested areas are on a scale of pink. The result:
If you’re truly curious as to whether this has worked as it should, you might download the ASCII version of the 0.5 degree data from the SAGE website, save it to the same directory, and replace vegcover.asc with the name of the ASCII file in the above NetLogo code to see if there is any difference.
So far, this has been meant to provide a simple tutorial of how to get data from a NetCDF file into an ABM platform. If you’re only dealing with a single coverage, you might be more at home converting your file using QGIS or another standalone GIS. If you’re dealing with multiple time steps or variables from a large dataset, it might make sense to write an R script that will extract the data systematically using combinations of the commands above. However, you might also make use of the R NetLogo extension to query a NetCDF file on the fly. To proceed with this part of the tutorial, you’ll need to download the R extension and have it installed correctly.
We’ll start a new NetLogo model, implement the R extension, and create two global variables and a patch variable:
extensions [ R ]
globals [ snowcover s ]
patches-own [ snow ]
The snowcover variable will be our dataset, while s will be a placeholder for monthly coverages. The patch variable snow will be the individual grid cell values from our data which will be updated monthly. Next, we’ll run a setup command which clears the model, loads the ncdf4 library, opens our NetCDF snowcover file, extracts our snowcover data, and resets our ticks counter. You may need to edit the code below so that it reflects the location of your NetCDF file.
r:eval "ncvar_get(data, \"snowcover\") -> snow"
Now, we could automate the process of converting to ASCII and importing the GIS data here, but that’s likely to be a slow solution and generate a lot of file bloat. Alternatively, if our world window is scaled to the same size as the NetCDF grid (or to some easily computed fraction of it), we can simply import the raw data and transmit the values directly to patches (not unlike the File Input example here). To do this, right click on the world window and edit it so that the location of the origin is the bottom left, and that the max-pxcor is 359 and the max-pycor is 89 (this is 360 x 90, the same size as our Northern Hemisphere snowcover data). We’ll also make sure the world doesn’t wrap, and set the patch size to 3 to make sure it fits on our screen.
Next, we’ll generate the transposed dataframe as in the above example, but this time for a single monthly coverage. Then we’ll import this data from R into the NetLogo placeholder variable s:
to go
  tick  ; advance first so the month index starts at 1 (R arrays are 1-based)
  r:eval (word "snow2<-as.data.frame(t(snow[,," ticks "]))")
  set s r:get "snow2"
  ask patches [ get-snow ]
  if ticks >= 297 [ stop ]
end
Because our snowcover data has a time component, we need to tell it which month we want to use by inserting a value for the third axis. For example, if we wanted the value for row 1, column 1 in month 3, we would send R the phrase snow[1,1,3]. In this case, we want the entire coverage but for a single month, so we leave out the values for row and column and only feed R a value for the month. We use the word command here to concatenate the string which will serve as our R command, but which incorporates the current value from the NetLogo ticks counter to substitute for the month value. As the ticks counter increases, this will shift the data from one month to the next. The if ticks >= 297 [ stop ] command will ensure that the model only runs for as long as we have data (which is 297 months). When we import this data frame from R into our NetLogo model, it will be imported as a set of nested lists, where each sublist represents a column from the data frame (from 1 to 360). If we enter s into the command line, it will look something like this:
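As a rough illustration of the slicing and of the nested-list shape, here is a sketch in plain R using a small made-up array in place of the snow data. The month variable plays the role of the NetLogo ticks counter, and converting the data frame with as.list is only an approximation of what r:get hands back to NetLogo.

```r
# Synthetic stand-in for the snow array: 4 longitudes x 2 latitudes x 3 months
snow <- array(1:24, dim = c(4, 2, 3))

# A single cell: row 1, column 1, month 3
snow[1, 1, 3]

# Build the same command string that NetLogo's word primitive would assemble
month <- 3
cmd <- paste0("snow2<-as.data.frame(t(snow[,,", month, "]))")
cmd

# Evaluate the string, as r:eval would
eval(parse(text = cmd))
dim(snow2)                    # 2 latitude rows x 4 longitude columns

# One sublist per data frame column, roughly the shape NetLogo receives
s <- unname(as.list(snow2))
str(s)
```

Leaving the first two indices empty is what selects the whole grid for that month; the transpose then turns each longitude into a column, which is why each sublist holds the values for one column of the map.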
What we’ll want to do is pull values from these lists which correspond with the patch coordinates. However, remember that our world originates in the bottom left and increases toward the top right, while our data originates in the top left and increases toward the bottom right. What we’ll need to do is flip the y-axis values we use to reflect this (note: originating the model in the top left would give our NetLogo world negative Y-values, which would likewise need to be converted). We can do this with the following:
to get-snow
  let x pxcor
  let y 89 - pycor  ; flip the y-axis: data rows run top to bottom
  set snow item y (item x s)
  set pcolor scale-color grey snow 0 100
end
What this does is create temporary x and y values from the patch coordinates, but inverts the y-axis value of the patch (so top left is now bottom left). Then the patch sets its snow value by pulling out the value that corresponds with the appropriate row (item y) from the list that corresponds with the appropriate column (item x s). Finally, it sets its color along a scale from 0 to 100. When we run this code, the result is a lovely visualization of the monthly changes in snow cover from the Northern Hemisphere.
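The same lookup can be mimicked in R to check that the flip puts the right value on the right patch. This is only a sketch: the 2 x 4 grid and the get_snow function are made up for illustration, and the + 1 offsets are there because R indexes from 1 while NetLogo's item indexes from 0.

```r
# Synthetic monthly coverage: 2 latitude rows (top row first) x 4 longitude columns
snow2 <- as.data.frame(matrix(c(17, 18, 19, 20,
                                21, 22, 23, 24), nrow = 2, byrow = TRUE))
s <- unname(as.list(snow2))     # one sublist per column
max_pycor <- nrow(snow2) - 1    # 1 for this toy grid, 89 in the tutorial

# Hypothetical helper: value for the patch at 0-based (pxcor, pycor)
get_snow <- function(pxcor, pycor) {
  y <- max_pycor - pycor        # flip: the top data row belongs to the top patch row
  s[[pxcor + 1]][y + 1]         # + 1 converts 0-based NetLogo indices to R's 1-based ones
}

get_snow(0, 1)   # top-left patch takes the first data row
get_snow(0, 0)   # bottom-left patch takes the last data row
get_snow(3, 1)   # top-right patch
```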
So there you have it; a couple of different ways to get NetCDF data into a model using R and NetLogo. Of course, if you’re going to all of this trouble to work with such extensive datasets, it may be worth your while to explore alternative platforms which can build in native NetCDF support. Or you might build a model in R entirely. But I reckon the language is largely inconsequential as long as the model is well thought out, and part of that is figuring out what kind of input data you need and how to get it into your model. With a bit of imagination, there are many, many ways to skin this cat.
Ramankutty, N., and J.A. Foley (1999). Estimating historical changes in global land cover: croplands from 1700 to 1992, Global Biogeochemical Cycles 13(4), 997-1027.
Cavalieri, D. J., J. Crawford, M. Drinkwater, W. J. Emery, D. T. Eppler, L. D. Farmer, M. Goodberlet, R. Jentz, A. Milman, C. Morris, R. Onstott, A. Schweiger, R. Shuchman, K. Steffen, C. T. Swift, C. Wackerman, and R. L. Weaver. 1992. NASA sea ice validation program for the DMSP SSM/I: final report. NASA Technical Memorandum 104559. 126 pp.
However, if you weren’t able to make it up to Oslo, Doug Rocks-Macqueen, author of the excellent blog Doug’s Archaeology, has you covered: his session recordings have been making their way out on to the interwebs via his YouTube channel, Recording Archaeology. Now you can relive all of the action of CAA Oslo right in your own home!
Here are a few of the sessions, helpfully organized as playlists of individual talks:
The 45th CAA conference will bring together scholars from across the globe to share their cutting edge research from a diverse range of fields in a focused, but informal, setting. One thing that the CAA prides itself on is a strong sense of community, and we hope to continue to grow that community by welcoming new participants this year. This is only the 3rd time the conference has been held in the United States, and we are excited to have old and new members join us in Atlanta this coming spring.
There are a TON of sessions to choose from this year, showcasing the diversity of computational approaches in archaeology as well as interest in theory and ways of knowing. The full list of sessions is here.
The authors of this blog will be co-chairing a few different sessions at the conference, including:
Quantitative model-based approaches to archaeology have been rapidly gaining popularity. Their utility in providing an experimental test-bed for examining how individual actions and decisions could influence the emergence of complex social and socio-environmental systems has fueled a spectacular increase in adoption of computational modeling techniques to traditional archaeological studies. However, computational models are restricted by the limitations of the technique used, and are not a “silver bullet” solution for understanding the archaeological and anthropological record. Rather, simulation and other types of formal modeling methods provide a way to interdigitate between archaeology/anthropology and computational approaches and between the data and theory, with each providing a feedback to the other. In this session we seek well-developed models that use data and theory from the anthropological and archaeological records to demonstrate the utility of computational modeling for understanding various aspects of human behavior. Equally, we invite case studies showcasing innovative new approaches to archaeological models and new techniques expanding the use of computational modeling techniques.
This is a different kind of session. Instead of the normal celebration of our success, this session will be looking at our challenges, but without degrading into self-pity and negativity, as it will be about critical reflection and possible solutions. The goal of this session is to raise the issues we should be tackling: to break the mold of the typical conference session, in which we review what we have solved, and instead explore what needs to be solved. Each participant will give a short presentation (max 10 minutes, but preference will be for 5 minutes) in which they take one topic and critically analyse the problems surrounding it, both new and old. Ideally, by the end each participant will have laid out a map of the challenges facing their topic. The floor will then be opened up to the audience to add more issues, refute the problems raised, or propose solutions. This is open to any topic: GIS, 3D modelling, public engagement, databases, linked data, simulations, networks, etc. It can be about a very narrow topic or broad ranging, e.g. everything that is wrong with C14 dating, everything wrong with least cost path analysis in ArcGIS, everything wrong with post-processualism, etc. However, this is an evaluation of our methods and theories and is not meant to be as high level as past CAA sessions that have looked at grand challenges, e.g. the beginning of agriculture. Anyone interested in presenting is asked to submit a topic (1-2 sentences) and an estimated time to summarize it (5 or 10 minutes). Full abstracts are not necessary.
The continuing rise of computational modelling applications, in particular simulation approaches, resembles the ‘hype’ cycles our discipline experienced in the past. The introduction of statistics, data management or GIS all started with inflated expectations and an explosion in applications, followed by a ‘correction’ phase seeing the early optimism dwindling and a heavy critique towards exaggerated claims and examples of misapplication. The next phase, ‘maturity’, is reached when the use of a particular technique is not questioned any more (although particular applications of it may still be) as it becomes part of the standard research toolkit. The verdict is still out whether the use of simulation techniques in archaeology is reaching the peak of the ‘optimism’ phase or is perhaps still in the midst of the ‘correction’ phase. However, lessons learned from other, now commonly used, computational methods or coming from other disciplines could accelerate the process of establishing simulation in the mainstream of archaeological practice. The Special Interest Group in Complex System Simulation would like to open the discussion to a wide audience of archaeologists and therefore invites all CAA2017 participants to take an active part in the roundtable. During the meeting we will consider the current place of simulation in archaeological practice, the main challenges facing modellers and the road map for the future.
This video, brought to you by our friends over at the Barcelona Supercomputing Center, does a great job of explaining in easy-to-understand terms what agent-based modeling is, and how it can be useful for both understanding the past and making the past relevant to the present. No small feat to accomplish in about 3 minutes. Have a look!
The University of Kiel, Germany, will be hosting a workshop, “Socio-Environmental Dynamics over the Last 12,000 Years: The Creation of Landscapes IV”, on 20-24 March 2017. It includes several sessions on simulation, modelling and ABM with a special emphasis on socio-natural systems. The abstract submission deadline is still quite some time away (30th November), but it may be worth putting the event in your calendars if you are not planning on crossing the ocean for the CAA in Atlanta or the SAAs in Vancouver.
To prove that there is a world beyond agents, turtles and all things ABM, we have created a neat little tutorial in system dynamics implemented in Python.
Delivered by Xavier Rubio-Campillo and Jonas Alcaina just a few days ago at the annual Digital Humanities conference (this year held in the most wonderful of all cities – Krakow), it is tailored to humanities students so it does not require any previous experience in coding.
System dynamics is a type of mathematical or equation-based modelling. Archaeologists (with a few noble exceptions) have so far shied away from what is often perceived as ‘pure math’, mostly citing the ‘too simplistic’ argument when awful-mathematics-teacher trauma was probably the real reason. However, in many cases an ABM is complete overkill when a simple system dynamics model would be well within one’s abilities. So give it a go, if only to ‘dewizardify’* the equations.
The CSSSA will be hosting its annual conference in November, bringing researchers from all stripes of computational social science together in beautiful Santa Fe, New Mexico. According to the website, some of the topics to be discussed at the meeting include (but are not limited to):
Social network analysis
Agent-based models / modeling
Economic models / resource allocation
Biological systems / metabolism / bioenergetics
Efficiencies / fitness functions
Competition / cooperation
Networks / information flow
Vision / knowledge acquisition
Adaptation / evolution
Local knowledge / global patterns
Game theoretic models
Applications close August 15th, 2016. For more information, check out the CSSSA website.
Possibly the longest running meeting on agent-based modeling, SwarmFest, is being held this July at the University of Vermont campus in Burlington. Now in its 20th(!) year, SwarmFest brings together people from a range of backgrounds in ABM and simulation. From the website:
SwarmFest is the annual meeting of the Swarm Development Group (SDG), and one of the oldest communities involved in the development and propagation of agent-based modeling. SwarmFest has traditionally involved a mix of both tool-users and tool-developers, drawn from many domains of expertise. These have included, in the past, computer scientists, software engineers, biomedical researchers, ecologists, economists, political scientists, social scientists, resource management specialists and evolutionary biologists. SwarmFest represents a low-key environment for researchers to explore new ideas and approaches, and benefit from a multi-disciplinary environment.
Given the concentration of computational and complexity labs at UVM, this promises to be a very exciting meeting. And summertime is a fantastic time to be on Lake Champlain, or really any lake in New England, so I wholeheartedly recommend the trek to Burlington.
The call for abstracts closes June 15th, so get in quickly. For more info, see the website.