The annual Conference on Complex Systems is one of the scientific gatherings where researchers present, discuss and debunk all things complex. This year it would be a double shame to miss it, since it takes place in Cancun, Mexico, on 17-22 September. If anyone needs any more encouragement, we are organising an exciting session focused on the evolution of broadly defined cultural complexity. Please send your abstracts by the 26th of May here. Any questions? Drop us an email: ccs17-at-bsc-dot-es
Details below and on the website: https://ccs17.bsc.es/
Human sociocultural evolution has been documented throughout the history of humans and earlier hominins. This evolution manifests itself in development from tools as simple as a rock used to break nuts to something as complex as a spacecraft able to carry people to other worlds. Equally, we have witnessed the evolution of human populations towards complex, multilevel social organisation.
Although cases of decrease and loss of this type of complexity have been reported, in global terms it tends to increase with time. Despite its significance, the conditions and factors driving this increase are still poorly understood and subject to debate. Different hypotheses attempting to explain the rise of sociocultural complexity in human societies have been proposed (a demographic factor, a cognitive component, historical contingency…) but so far no consensus has been reached.
Here we raise a number of questions:
Can we better define sociocultural complexity and confirm its general tendency to increase over the course of human history?
What are the main factors enabling an increase of cultural complexity?
Are there reliable ways to measure complexity in human constructs, that is, in material culture and social organisation?
How can we quantify and compare the impact of different factors?
What causes a loss of cultural complexity in a society? And how often did such losses occur in the past?
Goals of the session
In this satellite meeting we want to bring together a community of researchers coming from different scientific domains and interested in different aspects of the evolution of social and cultural complexity. From archaeologists to linguists, social scientists, historians and artificial intelligence specialists – the topic of sociocultural complexity transgresses traditional discipline boundaries. We want to establish and promote a constructive dialogue incorporating different perspectives: theoretical as well as empirical approaches, research based on historical and archaeological sources, and current evidence and contemporary theory. We are particularly interested in formal approaches which enable more constructive theory building and hypothesis testing. However, even establishing a common vocabulary of terms and concepts and discussing the main methodological challenges in studying sociocultural complexity is an important step towards a more cohesive framework for the understanding of cultural evolution in general and for individual research case studies in particular. Our approach is informed by the convergence between simulation and formal methods in archaeological studies and by recent developments in complex systems science and complex network analysis.
The session will focus on, but is not limited to:
Social dynamics of innovation
Cumulative culture and social learning
Evolution of technology and technological change
Cognitive processes, creativity, cooperation and innovation
Population dynamics and demographic studies
Computational tools for understanding cultural evolutionary change
This year the Simulating Complexity team is yet again teaching a two-day workshop on agent-based modelling in archaeology as a satellite to the CAA conference. The workshop will take place on Sunday and Monday, 12-13 March 2017. The workshop is free of charge; however, you have to register for the conference (which has some good modelling sessions as well).
Last year we had an absolute blast, with over 30 participants, 10 instructors and a 96% satisfaction rate among the students (the instructors were 100% happy!).
The workshop will follow along similar lines to last year although we have a few new and exciting instructors and a few new topics. For more details check here and here or simply get in touch!
The Simulating Complexity team is involved in two sessions at the CAA. Please consider putting together an abstract for submission; see them both below. The submission system can be accessed here: http://caaconference.org/
Session: Data, Theory, Methods, and Models. Approaching Anthropology and Archaeology through Computational Modeling
Abstract: Quantitative model-based approaches to archaeology have been rapidly gaining popularity. Their utility in providing an experimental test-bed for examining how individual actions and decisions could influence the emergence of complex social and socio-environmental systems has fueled a spectacular increase in the adoption of computational modeling techniques in traditional archaeological studies. However, computational models are restricted by the limitations of the technique used, and are not a “silver bullet” solution for understanding the archaeological and anthropological record. Rather, simulation and other types of formal modeling methods provide a way to interdigitate between archaeology/anthropology and computational approaches and between data and theory, with each providing feedback to the other. In this session we seek well-developed models that use data and theory from the anthropological and archaeological records to demonstrate the utility of computational modeling for understanding various aspects of human behavior. Equally, we invite case studies showcasing innovative new approaches to archaeological models and new techniques expanding the use of computational modeling techniques.
Everything wrong with…
Abstract: This is a different kind of session. Instead of the usual celebration of our successes, this session will be looking at our challenges. It will not, however, descend into self-pity and negativity: it will be about critical reflection and possible solutions. The goal of this session is to raise the issues we should be tackling: to break the mold of the typical conference session, in which we review what we have solved, and instead explore what needs to be solved. Each participant will give a short presentation (max 10 minutes, but preference will be given to 5 minutes) in which they take one topic and critically analyse the problems surrounding it, both new and old. Ideally, at the end each participant would have laid out a map of the challenges facing their topic. The floor will then be opened up to the audience to add more issues, refute the problems raised, or propose solutions. This is open to any topic: GIS, 3D modelling, public engagement, databases, linked data, simulations, networks, etc. It can be about a very narrow topic or a broad-ranging one, e.g. everything that is wrong with C14 dating, everything wrong with least cost path analysis in ArcGIS, everything wrong with post-processualism, etc. However, this is an evaluation of our methods and theories and is not meant to be as high-level as past CAA sessions that have looked at grand challenges, e.g. the beginnings of agriculture. Anyone interested in presenting is asked to submit a topic (1-2 sentences) and the estimated time needed to summarize it (5 or 10 minutes). Full abstracts are not necessary.
An older version of this tutorial used the now-deprecated ncdf package for R. This updated version makes use of the ncdf4 package, and fixes a few broken links while we’re at it.
You found it: the holy grail of palaeoenvironmental datasets. Some government agency or environmental science department put together some brilliant time series GIS package and you want to find a way to import it into your model. But oftentimes the data may be in a format which isn’t readable by your modeling software, or takes some finagling to get the data in there. NetCDF is one of the more notorious of these. A NetCDF file (which stands for Network Common Data Form) is a multidimensional array, where each layer represents the spatial gridded distribution of a different variable or set of variables, and sets of grids can be stacked into time slices. To make this a little more clear, here’s a diagram:
The basic structure of a NetCDF file
In this diagram, each table represents a gridded spatial coverage for a single variable. Three variables are represented this way, and these are stored together in a single time step. The actual structure of the file might be simpler (that is, it might consist of a single variable and/or single time step) or more complex (with many more variables or where each variable is actually a set of coverages representing a range of values for that variable; imagine water temperature readings taken at a series of depths). These chunks of data can then be accessed as combined spatial coverages over time. Folks who work with climate and earth systems tend to store their data this way. It’s also a convenient way to keep track of data obtained from satellite measurements over time. They’re great for managing lots of spatial data, but if you’ve never dealt with them before, they can be a bit of a bear to work with. ArcGIS and QGIS support them, but it can be difficult to work them into simulations without converting to a more benign data type like an ASCII file. In a previous post, we’ve discussed importing GIS data into a NetLogo model, but of course this depends on our ability to get the data into a model-readable format. The following tutorial is going to walk through the process of getting a NetCDF file, manipulating it in R, and then getting it into NetLogo.
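If you find it easier to think about this in code rather than diagrams, the same idea maps onto a plain multidimensional array. Here is a toy sketch in R (the sizes and month labels are made up purely for illustration and have nothing to do with the dataset used below):
# A toy stand-in for a single NetCDF variable: longitude x latitude x time
nlon <- 720; nlat <- 360; ntime <- 12
toy <- array(NA_real_, dim = c(nlon, nlat, ntime),
             dimnames = list(NULL, NULL, month.abb))
toy[ , , "Jan"]   # one time slice: a single gridded spatial coverage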
Step #1 – Locate the data
First let’s locate a useful NetCDF dataset and import it to R. As an example, we’ll use the Global Potential Vegetation Dataset from the UW-Madison Nelson Institute Sage Center for Sustainability and the Global Environment. As you can see, the data is also available as an ASCII file; this is useful because you can use this later to check that you’ve got the NetCDF working. Click on the appropriate link to download the Global Potential Veg Data NetCDF. The file is a tarball (extension .tar.gz), so you’ll need something to unzip it. If you’re not partial to a particular file compressor, try 7-Zip. Keep track of where the file is located on your local drive after downloading and unzipping.
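Incidentally, if you'd rather not install a separate archiver, R can unpack the tarball itself. A quick sketch, where the archive name is a placeholder for whatever your download is actually called:
untar("vegtype_0.5.tar.gz")      # hypothetical file name; adjust to match your download
list.files(pattern = "\\.nc$")   # the unpacked .nc files should now be listed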
Step #2 – Bring the data into R
R won’t read NetCDF files as is, so you’ll need to download a package that works with this kind of data. The ncdf4 package is one of a few different packages that work with these files, and it is the one we’ll use for this tutorial. First, open the R console, go to Packages->Install Packages and download the ncdf4 package from your preferred mirror site. Then load the package by entering the following:
library(ncdf4)
Now, remembering where you saved your NetCDF file, you can bring it into R with the following command:
data <- nc_open(filename)
If you didn’t save the data file in your R working directory and want to navigate to the file, just replace filename with file.choose(). For now, we’ll use the 0.5 degree resolution vegetation data (vegtype_0.5.nc). Now if you type in data and press enter, you can check to see what the data variable holds. You should get something like this:
This is telling you what your file is composed of. The first line tells you the name of the file. Beneath this are your variables. In this case, there is only one, vegtype, which according to the above uses a number just shy of nine hundred quintillion as a missing value (the computer will interpret any occurrences of this number as no data).
Next come your dimensions, giving the intervals of measurement. In this case, there are four dimensions: longitude, latitude, level, and time. Our file only has one time slice, meaning that it represents a single snapshot of data; if this number is larger, there will be more coverages included in your file over time. The coverage spans from 89.75 S to 89.75 N latitude in 0.5 degree increments, and 180 W to 180 E longitude by the same increments.
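You can also pull these details out of the object directly rather than reading them off the printed summary. A small sketch (the variable is named as in the summary above, but the dimension names may be spelled differently in your file, so adjust accordingly):
names(data$var)              # variable names; here just "vegtype"
data$var$vegtype$missval     # the missing-data value (roughly 9e+20)
names(data$dim)              # dimension names
data$dim$latitude$vals[1:5]  # the first few latitude values
data$dim$time$len            # the number of time slices (1 for this file)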
To access the vegtype data, we need to assign it to a local variable, which we will call veg:
ncvar_get(data,"vegtype") -> veg
The ncvar_get command takes the named variable (“vegtype”) and extracts it from the NetCDF file (data) as a matrix. Then we assign it to the local variable veg. There are a number of other commands within the ncdf4 package which are useful for reading and writing NetCDF files, but these go beyond the scope of this blog entry. You can read more about them here.
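That said, a couple of them are worth a quick mention; a sketch of their basic use (the "units" attribute is only an example and may not exist in this particular file):
ncatt_get(data, "vegtype", "units")   # read an attribute of a variable, if present
nc_close(data)                        # close the file once you have extracted what you need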
Step #3 – Checking out the data
Now our data is available to us as a matrix. We can view it by entering the following:
image(veg)
R visualization of potential vegetation dataset (upside down)
Oops! Our output reads from bottom to top instead of top to bottom. No problem, we can just invert the latitude of the matrix like so:
image(veg, ylim=c(1,0))
R visualization of potential vegetation dataset (right side up)
However, this only changes the view; when we get the data into NetLogo later on, we’ll need to transpose it. But for now, let’s add some terrain colors. According to the readme file associated with the data, there are 15 different landcover types used here:
Tropical Evergreen Forest/Woodland
Tropical Deciduous Forest/Woodland
Temperate Broadleaf Evergreen Forest/Woodland
Temperate Needleleaf Evergreen Forest/Woodland
Temperate Deciduous Forest/Woodland
Boreal Evergreen Forest/Woodland
Boreal Deciduous Forest/Woodland
Evergreen/Deciduous Mixed Forest/Woodland
Savanna
Grassland/Steppe
Dense Shrubland
Open Shrubland
Tundra
Desert
Polar Desert/Rock/Ice
We could choose individual colors for each of these, but for the moment we’ll just use the in-built terrain color ramp:
image(veg,ylim=c(1,0),col=terrain.colors(15))
R visualization of potential vegetation dataset using terrain colors
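If you did want to assign your own colour to each of the 15 classes instead, a rough sketch would look like this (the colour choices below are entirely arbitrary):
veg_cols <- c("darkgreen", "olivedrab", "seagreen", "darkseagreen", "springgreen4",
              "darkcyan", "cadetblue", "darkolivegreen3", "khaki", "gold",
              "tan3", "wheat", "lightsteelblue", "lightyellow", "white")
image(veg, ylim = c(1, 0), col = veg_cols, breaks = seq(0.5, 15.5, by = 1))
The breaks argument makes sure each integer class gets exactly one colour.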
Step #4 – Exporting the data to NetLogo
Finally, we want to read our data into a modeling platform, in this case NetLogo, so let’s export it as a raster coverage we can work with. Before we do any file writing, we’ll need to coerce the matrix into a data frame and make sure we transpose it so that it doesn’t come out upside down again. To do this, we’ll use the following code:
veg2<-as.data.frame(t(veg))
The as.data.frame command does the coercing, while the t command does the transposing. Now we have to open up the file we’re going to write to:
fileCon<-file('vegcover.asc')
This creates a file connection for a file we’ve named vegcover.asc. Next, we’ll write the header data for an ASCII coverage. We can do this by adding lines to the file:
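One way to write the header is in a single call; a minimal sketch, using the grid dimensions and missing value of the 0.5 degree dataset:
writeLines(paste("ncols\t720", "nrows\t360", "xllcorner\t-179.75",
                 "yllcorner\t-89.75", "cellsize\t0.5",
                 "NODATA_value\t8.99999982852418e+20", sep = "\n"), fileCon)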
This may look like a bunch of nonsense, but each \t is a tab and each \n is a new line. The result is a header on our file which looks like this:
ncols 720
nrows 360
xllcorner -179.75
yllcorner -89.75
cellsize 0.5
NODATA_value 8.99999982852418e+20
Any program (whether a NetLogo model, GIS, or otherwise) that reads this file will look for this header first. The terms ncols and nrows define the number of columns and rows in the grid. The xllcorner and yllcorner define the lower left corner of the grid. The cellsize term describes how large each cell should be, and the NODATA_value is the same value from the original dataset which we used to mark places where data is not available. Now we just need to write out our transposed data.
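A sketch of that write (the na argument is my own addition here, so that missing cells are written out using the same NODATA value declared in the header):
write.table(veg2, "vegcover.asc", append = TRUE, sep = " ",
            row.names = FALSE, col.names = FALSE,
            na = "8.99999982852418e+20")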
This will take our data frame and write it to the file we just created, appending it after the header. It’s important that your separator be a space (sep = " ") in order to ensure that it is in a format NetLogo can read. Also make sure to get rid of any row and column names. Now we can read our file into NetLogo using the GIS extension (for an explanation of this, see here). Open a new NetLogo model, set the world window settings with the origin at the bottom left, a max-pxcor of 719, a max-pycor of 359, and a patch size of 1. Save your NetLogo model in the same directory as the vegcover.asc file, and the following NetLogo code should do the trick:
extensions [ gis ]      ; the GIS extension is needed to load the raster
globals [ vegcover ]    ; the raster dataset itself
patches-own [ vegtype ] ; each patch's vegetation class

to setup
  clear-all
  set vegcover gis:load-dataset "vegcover.asc"        ; read the ASCII grid we wrote in R
  gis:set-world-envelope-ds gis:envelope-of vegcover  ; match the world to the raster's extent
  ask patches [
    set pcolor white
    set vegtype gis:raster-sample vegcover self       ; pull the cell value underlying each patch
  ]
  ask patches with [ vegtype <= 8 ] [                 ; classes 1-8 are forest/woodland
    set pcolor scale-color green vegtype -5 10
  ]
  ask patches with [ vegtype > 8 ] [                  ; classes 9-15 are non-forest
    set pcolor scale-color pink vegtype 9 15
  ]
end
This should produce a world in which patches have a variable called vegtype with values that correspond to the original dataset. Furthermore, patches are colored according to a set scheme where forested areas are on a scale of green, while non-forested areas are on a scale of pink. The result:
NetLogo visualization of potential vegetation dataset
If you’re truly curious as to whether this has worked as it should, you might download the ASCII version of the 0.5 degree data from the SAGE website, save it to the same directory, and replace vegcover.asc with the name of the ASCII file in the above NetLogo code to see if there is any difference.
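You can also run that check in R itself. A quick sketch, assuming the downloaded ASCII file is called vegtype_0.5.asc (adjust the name and the number of header lines to match what SAGE actually provides):
official <- as.matrix(read.table("vegtype_0.5.asc", skip = 6))  # skip the 6 header lines
ours     <- as.matrix(read.table("vegcover.asc",    skip = 6))
table(official == ours)   # TRUE everywhere means the grids match; differing NODATA conventions would show up as FALSEs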
Going further
So far, this has been meant to provide a simple tutorial of how to get data from a NetCDF file into an ABM platform. If you’re only dealing with a single coverage, you might be more at home converting your file using QGIS or another standalone GIS. If you’re dealing with multiple time steps or variables from a large dataset, it might make sense to write an R script that will extract the data systematically using combinations of the commands above. However, you might also make use of the R NetLogo extension to query a NetCDF file on the fly. To proceed with this part of the tutorial, you’ll need to download the R extension and have it installed correctly.
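To give a flavour of the batch-export option, here is a rough sketch of such a script, reusing the vegetation grid's header from above; the file name, variable name and output names are placeholders you would swap for your own dataset (along with its own grid dimensions):
library(ncdf4)
nc   <- nc_open("some_timeseries.nc")   # placeholder file name
vals <- ncvar_get(nc, "somevar")        # placeholder variable; dimensions lon x lat x time
hdr  <- paste("ncols\t720", "nrows\t360", "xllcorner\t-179.75",
              "yllcorner\t-89.75", "cellsize\t0.5",
              "NODATA_value\t8.99999982852418e+20", sep = "\n")
for (i in seq_len(dim(vals)[3])) {
  out <- sprintf("somevar_%03d.asc", i)   # one ASCII file per time step
  writeLines(hdr, out)
  write.table(as.data.frame(t(vals[ , , i])), out, append = TRUE, sep = " ",
              row.names = FALSE, col.names = FALSE, na = "8.99999982852418e+20")
}
nc_close(nc)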
First, let’s find a NetCDF file with a temporal component. In honor of the impending winter my Northern Hemisphere colleagues are about to endure, I’m going to use the Northern Hemisphere EASE-Grid Snow Cover and Sea Ice Extent dataset from NOAA, which gives monthly (derived from weekly) snow cover data from 1971 to 1995. Go to the website and download the Monthly Mean dataset and save the file ‘snowcover.mon.mean.nc’ to your local drive, keeping track of its location.
We’ll start a new NetLogo model, load the R extension, and create two global variables and a patch variable:
extensions [ R ]
globals [ snowcover s ]
patches-own [ snow ]
The snowcover variable will be our dataset, while s will be a placeholder for monthly coverages. The patch variable snow will hold the individual grid cell values from our data, which will be updated monthly. Next, we’ll run a setup command which clears the model, loads the ncdf4 library, opens our NetCDF snowcover file, extracts our snowcover data, and resets our ticks counter. You may need to edit the code below so that it reflects the location of your NetCDF file.
to setup
  clear-all
  r:clear                                          ; wipe the R workspace
  r:eval "library(ncdf4)"                          ; load the ncdf4 package within R
  r:eval "data<-nc_open(\"C:/Users/me/Downloads/snowcover.mon.mean.nc\")"
  r:eval "ncvar_get(data, \"snowcover\") -> snow"  ; pull the snowcover array into R's workspace
  reset-ticks
end
Now, we could automate the process of converting to ASCII and importing the GIS data here, but that’s likely to be a slow solution and generate a lot of file bloat. Alternatively, if our world window is scaled to the same size as the NetCDF grid (or to some easily computed fraction of it), we can simply import the raw data and transmit the values directly to patches (not unlike the File Input example here). To do this, right click on the world window and edit it so that the location of the origin is the bottom left, and that the max-pxcor is 359 and the max-pycor is 89 (this is 360 x 90, the same size as our Northern Hemisphere snowcover data). We’ll also make sure the world doesn’t wrap, and set the patch size to 3 to make sure it fits on our screen.
NetLogo world window settings
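If you want to double-check those grid dimensions before fiddling with the world window, you can peek at the file from the R console first. A quick sketch (the names of the dimensions inside the file may differ):
library(ncdf4)
snc <- nc_open("snowcover.mon.mean.nc")
sapply(snc$dim, function(d) d$len)   # expecting something like lon = 360, lat = 90, time = 297
nc_close(snc)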
Next, we’ll generate the transposed dataframe as in the above example, but this time for a single monthly coverage. Then we’ll import this data from R into the NetLogo placeholder variable s:
to go
  tick
  r:eval (word "snow2<-as.data.frame(t(snow[,," ticks "]))")  ; transpose this month's grid in R
  set s r:get "snow2"                                         ; bring it into NetLogo as nested lists
  ask patches [ get-snow ]
  if ticks >= 297 [ stop ]                                    ; we have 297 months of data
end
Because our snowcover data has a time component, we need to tell it which month we want to use by inserting a value for the third axis. For example, if we wanted the value for row 1, column 1 in month 3, we would send R the phrase snow[1,1,3]. In this case, we want the entire coverage but for a single month, so we leave out the values for row and column and only feed R a value for the month. We use the word command here to concatenate the string which will serve as our R command, but which incorporates the current value from the NetLogo ticks counter to substitute for the month value. As the ticks counter increases, this will shift the data from one month to the next. The if ticks >= 297 [ stop ] command will ensure that the model only runs for as long as we have data (which is 297 months). When we import this data frame from R into our NetLogo model, it will be imported as a set of nested lists, where each sublist represents a column from the data frame (from 1 to 360). If we enter s into the command line, it will look something like this:
What we’ll want to do is pull values from these lists which correspond with the patch coordinates. However, remember that our world originates in the bottom left and increases toward the top right, while our data originates in the top left and increases toward the bottom right. What we’ll need to do is flip the y-axis values we use to reflect this (note: originating the model in the top left would give our NetLogo world negative Y-values, which would likewise need to be converted). We can do this with the following:
to get-snow
  let x pxcor
  let y 89 - pycor                       ; flip the y-axis: row 0 of the data is the top of the map
  set snow item y (item x s)             ; column x of the data frame, then row y within that column
  set pcolor scale-color grey snow 0 100
end
What this does is create temporary x and y values from the patch coordinates, but invert the y-axis value of the patch (so top left is now bottom left). Then the patch sets its snow value by pulling out the value that corresponds with the appropriate row (item y) from the list that corresponds with the appropriate column (item x s). Finally, it sets its color along a scale from 0 to 100. When we run this code, the result is a lovely visualization of the monthly changes in snow cover in the Northern Hemisphere, like so:
NetLogo visualization of first three years of snowcover data
So there you have it: a couple of different ways to get NetCDF data into a model using R and NetLogo. Of course, if you’re going to all of this trouble to work with such extensive datasets, it may be worth your while to explore alternative platforms which offer native NetCDF support. Or you might build a model in R entirely. But I reckon the language is largely inconsequential as long as the model is well thought out, and part of that is figuring out what kind of input data you need and how to get it into your model. With a bit of imagination, there are many, many ways to skin this cat.
Data references:
Ramankutty, N., and J.A. Foley (1999). Estimating historical changes in global land cover: croplands from 1700 to 1992. Global Biogeochemical Cycles 13(4), 997-1027.
Cavalieri, D. J., J. Crawford, M. Drinkwater, W. J. Emery, D. T. Eppler, L. D. Farmer, M. Goodberlet, R. Jentz, A. Milman, C. Morris, R. Onstott, A. Schweiger, R. Shuchman, K. Steffen, C. T. Swift, C. Wackerman, and R. L. Weaver (1992). NASA sea ice validation program for the DMSP SSM/I: final report. NASA Technical Memorandum 104559. 126 pp.
However, if you weren’t able to make it up to Oslo, Doug Rocks-Macqueen, author of the excellent blog Doug’s Archaeology, has you covered: his session recordings have been making their way out on to the interwebs via his YouTube channel, Recording Archaeology. Now you can relive all of the action of CAA Oslo right in your own home!
Here are a few of the sessions, helpfully organized as playlists of individual talks:
The folks at CAA have recently announced a call for papers for the 2017 conference, to be held at Georgia State University in Atlanta. From the conference website:
The 45th CAA conference will bring together scholars from across the globe to share their cutting edge research from a diverse range of fields in a focused, but informal, setting. One thing that the CAA prides itself on is a strong sense of community, and we hope to continue to grow that community by welcoming new participants this year. This is only the 3rd time the conference has been held in the United States, and we are excited to have old and new members join us in Atlanta this coming spring.
There are a TON of sessions to choose from this year, showcasing the diversity of computational approaches in archaeology as well as interest in theory and ways of knowing. The full list of sessions is here.
The authors of this blog will be co-chairing a few different sessions at the conference, including:
Quantitative model-based approaches to archaeology have been rapidly gaining popularity. Their utility in providing an experimental test-bed for examining how individual actions and decisions could influence the emergence of complex social and socio-environmental systems has fueled a spectacular increase in the adoption of computational modeling techniques in traditional archaeological studies. However, computational models are restricted by the limitations of the technique used, and are not a “silver bullet” solution for understanding the archaeological and anthropological record. Rather, simulation and other types of formal modeling methods provide a way to interdigitate between archaeology/anthropology and computational approaches and between data and theory, with each providing feedback to the other. In this session we seek well-developed models that use data and theory from the anthropological and archaeological records to demonstrate the utility of computational modeling for understanding various aspects of human behavior. Equally, we invite case studies showcasing innovative new approaches to archaeological models and new techniques expanding the use of computational modeling techniques.
This is a different kind of session. Instead of the usual celebration of our successes, this session will be looking at our challenges. It will not, however, descend into self-pity and negativity: it will be about critical reflection and possible solutions. The goal of this session is to raise the issues we should be tackling: to break the mold of the typical conference session, in which we review what we have solved, and instead explore what needs to be solved. Each participant will give a short presentation (max 10 minutes, but preference will be given to 5 minutes) in which they take one topic and critically analyse the problems surrounding it, both new and old. Ideally, at the end each participant would have laid out a map of the challenges facing their topic. The floor will then be opened up to the audience to add more issues, refute the problems raised, or propose solutions. This is open to any topic: GIS, 3D modelling, public engagement, databases, linked data, simulations, networks, etc. It can be about a very narrow topic or a broad-ranging one, e.g. everything that is wrong with C14 dating, everything wrong with least cost path analysis in ArcGIS, everything wrong with post-processualism, etc. However, this is an evaluation of our methods and theories and is not meant to be as high-level as past CAA sessions that have looked at grand challenges, e.g. the beginnings of agriculture. Anyone interested in presenting is asked to submit a topic (1-2 sentences) and the estimated time needed to summarize it (5 or 10 minutes). Full abstracts are not necessary.
The continuing rise of computational modelling applications, in particular simulation approaches, resembles the ‘hype’ cycles our discipline has experienced in the past. The introduction of statistics, data management or GIS all started with inflated expectations and an explosion in applications, followed by a ‘correction’ phase in which the early optimism dwindled amid heavy critique of exaggerated claims and examples of misapplication. The next phase, ‘maturity’, is reached when the use of a particular technique is no longer questioned (although particular applications of it may still be) as it becomes part of the standard research toolkit. The jury is still out on whether the use of simulation techniques in archaeology is reaching the peak of the ‘optimism’ phase or is perhaps still in the midst of the ‘correction’ phase. However, lessons learned from other, now commonly used, computational methods, or coming from other disciplines, could accelerate the process of establishing simulation in the mainstream of archaeological practice. The Special Interest Group in Complex System Simulation would like to open the discussion to a wide audience of archaeologists and therefore invites all CAA2017 participants to take an active part in the roundtable. During the meeting we will consider the current place of simulation in archaeological practice, the main challenges facing modellers and the road map for the future.
This video, brought to you by our friends over at the Barcelona Supercomputing Center, does a great job of explaining in easy-to-understand terms what agent-based modeling is, and how it can be useful for both understanding the past and making the past relevant to the present. No small feat to accomplish in about 3 minutes. Have a look!
The University of Kiel, Germany, will be hosting the workshop “Socio-Environmental Dynamics over the Last 12,000 Years: The Creation of Landscapes IV” on 20-24 March 2017. It includes several sessions on simulation, modelling and ABM with a special emphasis on socio-natural systems. The abstract submission deadline is still quite some time away (30th November), but it may be worth putting the event into your calendars if you are not planning on crossing the ocean for the CAA in Atlanta or the SAAs in Vancouver.
To prove that there is a world beyond agents, turtles and all things ABM, we have created a neat little tutorial in system dynamics implemented in Python.
Delivered by Xavier Rubio-Campillo and Jonas Alcaina just a few days ago at the annual Digital Humanities conference (this year held in the most wonderful of all cities – Krakow), it is tailored to humanities students so it does not require any previous experience in coding.
System dynamics is a type of mathematical or equation-based modelling. Archaeologists (with a few noble exceptions) have so far shied away from what is often perceived as ‘pure math’, mostly citing the ‘too simplistic’ argument, when awful mathematics-teacher trauma was probably the real reason. However, in many cases an ABM is complete overkill when a simple system dynamics model would be well within one’s abilities. So give it a go, if only to ‘dewizardify’* the equations.
The CSSSA will be hosting its annual conference in November, bringing researchers from all stripes of computational social science together in beautiful Santa Fe, New Mexico. According to the website, some of the topics to be discussed at the meeting include (but are not limited to):
Social network analysis
Agent-based models / modeling
Emergence
Economic models / resource allocation
Population dynamics
Ecosystems
Political/social systems
Biological systems / metabolism / bioenergetics
Efficiencies / fitness functions
Competition / cooperation
Networks / information flow
Social contagion
Vision / knowledge acquisition
Influence
Swarm intelligence
Adaptation / evolution
Decision making
Local knowledge / global patterns
Game theoretic models
Strategy
Learning
Applications close August 15th, 2016. For more information, check out the CSSSA website.