Tag Archives: Tutorials

The Powers and Pitfalls of Power-Law Analyses

December 8, 2016 stefanicrabtree

People love power-laws. In the 90s and early 2000s it seemed like they were found everywhere. Yet early power-law studies did not subject the data distributions to rigorous tests, which decreased the potential value of some of these studies. Since an influential study by Aaron Clauset of CU Boulder, Cosma Shalizi of Carnegie Mellon, and Mark Newman of the University of Michigan, researchers have become aware that not all distributions that look power-law-like are actually power-laws.

But power-law analyses can be incredibly useful. In this post I first show you what a power-law is, then present a case study where these analyses are appropriate, and finally walk you through how to use them to understand distributions in your data.

What is a power-law?

A power-law describes a distribution of something (wealth, connections in a network, sizes of cities) that follows what is known as the law of preferential attachment. In power-laws there will be many of the smallest objects, with increasingly fewer of the larger objects. However, the largest objects disproportionately get the highest quantities of stuff.

The world wide web follows a power-law. Many sites (like Simulating Complexity) get small amounts of traffic, but some sites (like Google, for example) get high amounts of traffic. Then, because they get more traffic, they attract even more visits to their sites. Cities also tend to follow power-law distributions, with many small towns, and few very large cities. But those large cities seem to keep getting larger. Austin, TX for example, has 157.2 new citizens per day, making this city the fastest growing city in the United States. People are attracted to it because people keep moving there, which perpetuates the growth. Theoretically there should be a limit, though maybe the limit will be turning our planet into a Texas-themed Coruscant.

This is in direct contrast to log-normal distributions. Log-normal distributions follow the law of proportional effect. This means that as something increases in size, it is predictably larger than what came before it. Larger things in log-normal distributions do not attract exponentially more things… they have a proportional amount of what came before. For example, experience and income should follow a log-normal distribution. As someone works in a job longer they should get promotions that reflect their experience. When we look at incomes of all people in a region we see that when incomes are more log-normally distributed these reflect greater equality, whereas when incomes are more power-law-like, inequality increases. Modern incomes seem to follow log-normality up to a point, after which they follow a power-law, showing that the richest attract that much more wealth, but under a certain threshold wealth is predictable.

If we analyze the distribution of modern incomes in a developing nation and see that they follow a power-law distribution, we will understand that there is a ‘rich get richer’ dynamic in that country, whereas if we see the incomes follow a log-normal distribution we would understand that that country had greater internal equality. We might want to know this to help influence policy.
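To make the contrast concrete, here is a minimal sketch in base R that draws one log-normal sample and one power-law sample and compares how much of the total the largest 1% of observations hold. The sample size, the log-normal parameters, the exponent of 2.5, and the helper name top_share are all assumptions chosen for illustration, not values from any dataset discussed here. Other parameter choices would shift the numbers.

```r
set.seed(1)
n <- 100000

# Log-normal sample (proportional effect); parameters are illustrative.
x_lnorm <- rlnorm(n, meanlog = 0, sdlog = 1)

# Power-law sample with density exponent alpha = 2.5 and minimum value 1,
# drawn by inverse-transform sampling.
alpha <- 2.5
x_pl <- (1 - runif(n))^(-1 / (alpha - 1))

# Share of the total held by the largest 1% of observations
# (top_share is a hypothetical helper, not part of any package).
top_share <- function(x, top = 0.01) {
  k <- ceiling(length(x) * top)
  sum(sort(x, decreasing = TRUE)[1:k]) / sum(x)
}

share_lnorm <- top_share(x_lnorm)
share_pl <- top_share(x_pl)
print(c(lognormal = share_lnorm, power_law = share_pl))
```

With these particular settings the power-law sample concentrates noticeably more of the total in its top 1%, which is the ‘rich get richer’ signature described above.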

When we analyze power-laws, however, we don’t want to just look at the graph that is created and say “Yeah, I think that looks like a power-law.” Early studies seemed to do just that. Thankfully Clauset et al. came up with rigorous methods to examine a distribution of data and see if it’s a power-law, or if it follows another distribution (such as log-normal). Below I show how to use these tools in R.

Power-law analyses and archaeology

So, if modern analyses of these distributions can tell us something about the equality (log-normal) or inequality (power-law) of a population, then these tools can be useful for examining the lifeways of past people. Questions we might be interested in asking are whether prehistoric cities also follow a power-law distribution, suggesting that the largest cities offered more social (and potentially economic) benefits similar to modern cities. Or we might want to understand whether societies in prehistory were more egalitarian or more hierarchical, thus looking at distributions of income and wealth (as archaeologists define them) to examine these. Power-law analyses of distributions of artifacts or settlement sizes would enable us to understand the development of inequality in the past.

Clifford Brown et al. talked about these very issues in their chapter Poor Mayapan from the book The Ancient Maya of Mexico edited by Braswell. While they don’t use the statistical tools I present below, they do present good arguments for why and when power-law versus other types of distributions would occur, and I would recommend tracking down this book and reading it if you’re interested in using power-law analyses in archaeology. Specifically they suggest that power-law distributions would not occur randomly, so there is intentionality behind those power-law-like distributions.

I recently used power-law and log-normal analyses to try to understand the development of hierarchy in the American Southwest. The results of this study will be published in 2017 in American Antiquity. Briefly, I wanted to look at multiple types of evidence, including ceremonial structures, settlements, and simulation data to understand the mechanisms that could have led to hierarchy and whether or not (and when) Ancestral Pueblo groups were more egalitarian or more hierarchical. Since I was comparing multiple different datasets, a method to quantitatively compare them was needed. Thus I turned to Clauset’s methods.

These had been updated by Gillespie in the R package poweRlaw.

Below I will go over the poweRlaw package with a built-in dataset, the Moby Dick words dataset. This dataset counts the frequency of different words. For example, there are many instances of the word “the” (19815, to be exact) but very few instances of other words, like “lamp” (34 occurrences) or “choice” (5 occurrences), or “exquisite” (1 occurrence). (Side note, I randomly guessed at each of these words, assuming each would have fewer occurrences. My friend Simon DeDeo tells me that ‘exquisite’ in this case is a hapax legomenon, or a term that only has one recorded use. Thanks Simon.) To see more go to http://roadtolarissa.com/whalewords/.

In my research I used other datasets that measured physical things (the size of roomblocks, kivas, and territories) so there’s a small mental leap for using a new dataset, but this should allow you to follow along.

The Tutorial

Open R.

Load the poweRlaw package

library("poweRlaw")

Add in the data

data("moby", package="poweRlaw")

This will load the data into your R session.

Side note:

If you are loading in your own data, you first load it in like you normally would, e.g.:

data <- read.csv("data.csv")

Then if you were subsetting your data you’d do something like this:

a <- subset(data, Temporal_Assignment != 'Pueblo III (A.D. 1140-1300)')

Next you have to decide if your data is discrete or continuous. What do I mean by this?

Discrete data can only take on particular values. In the case of the Moby Dick dataset, since we are counting physical words, this data is discrete. You can have 1 occurrence of exquisite and 34 occurrences of lamp. You can’t have 34.79 occurrences of it—it either exists or it doesn’t.

Continuous data is something that doesn’t fit into simple entities, but whose measurement can exist on a long spectrum. Height, for example, is continuous. Even if we bin peoples’ heights into neat categories (e.g., 6 feet tall, or 1.83 meters) the person’s height probably has some tailing digit, so they aren’t exactly 6 feet, but maybe 6.000127 feet tall. If we are being precise in our measurements, that would be continuous data.

The data I used in my article on kiva, settlement, and territory sizes was continuous. This Moby Dick data is discrete.

The reason this matters is that the poweRlaw package has two separate functions for continuous versus discrete data. These are:

conpl for continuous data, and

displ for discrete data

You can technically use either function and you won’t get an error from R, but the results will differ slightly, so it’s important to know which type of data you are using.

In the tutorial written here I will be using the displ function since the Moby dataset is discrete. Substitute in conpl for any continuous data.

To create the power-law object, we first fit displ to the data:

pl_a <- displ$new(moby)

We then want to estimate the x-min value. Power laws are usually only power-law-like in their tails… the early part of the distribution is much more variable, so we find a minimum value below which we say “computer, just ignore that stuff.”

However, first I like to look at what the x_min values are, just to see that the code is working. So:

pl_a$getXmin()

Then we estimate the x-min. This is the code that does that:

est <- estimate_xmin(pl_a)

We then update the power-law object with the new x-min value:

pl_a$setXmin(est)

We do a similar thing to estimate the exponent α of the power law. The relevant functions here are getPars and estimate_pars, so:

pl_a$getPars()

estimate_pars(pl_a)
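For intuition about what is being estimated, here is a minimal base-R sketch of the maximum-likelihood estimate of α described by Clauset et al. for the continuous case, α = 1 + n / Σ ln(x / x-min). It is not the poweRlaw implementation, and it does not cover the discrete case used for the Moby data. The function names rpl_sketch and alpha_mle and the simulated sample are assumptions made for illustration.

```r
# Draw n values from a continuous power law p(x) ~ x^(-alpha) for x >= xmin,
# using inverse-transform sampling (rpl_sketch is a hypothetical helper).
rpl_sketch <- function(n, xmin, alpha) {
  xmin * (1 - runif(n))^(-1 / (alpha - 1))
}

# Maximum-likelihood estimate of alpha for the tail x >= xmin (continuous case).
alpha_mle <- function(x, xmin) {
  tail_x <- x[x >= xmin]
  1 + length(tail_x) / sum(log(tail_x / xmin))
}

set.seed(42)
x <- rpl_sketch(10000, xmin = 1, alpha = 2.5)
alpha_hat <- alpha_mle(x, xmin = 1)
print(alpha_hat)  # should land close to the true value of 2.5
```

Only the values at or above x-min enter the estimate, which is why the choice of x-min matters so much for the fitted exponent.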

Then we also want to know how plausible it is that our data fit a power law. For this we estimate a p-value (explained in Clauset et al.). Here is the code to do that (and output those data):

booty <- bootstrap_p(pl_a)

This will take a little while, so sit back and drink a cup of coffee while R chunks for you.

Then look at the output:

booty

Alright, we don’t need the whole sim, but it’s good to have the goodness of fit (gof: 0.00825) and p value (p: 0.75), so this code below records those for you.
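Clauset et al. measure goodness of fit with the Kolmogorov-Smirnov (KS) distance: the largest gap between the empirical cumulative distribution of the tail and the fitted one. Assuming the gof value above is that kind of distance, the following base-R sketch shows how it can be computed for a continuous power-law fit. The function name ks_power_law and both simulated samples are assumptions for illustration, and this is not the poweRlaw code itself.

```r
set.seed(7)

# KS distance between the empirical tail and a fitted continuous power law
# (ks_power_law is a hypothetical helper).
ks_power_law <- function(x, xmin) {
  tail_x <- sort(x[x >= xmin])
  n <- length(tail_x)
  alpha <- 1 + n / sum(log(tail_x / xmin))          # MLE of the exponent
  fitted_cdf <- 1 - (tail_x / xmin)^(-(alpha - 1))  # power-law CDF at each point
  # Compare with the empirical CDF just before and at each observation.
  max(abs(fitted_cdf - (0:(n - 1)) / n), abs(fitted_cdf - (1:n) / n))
}

x_pl <- (1 - runif(5000))^(-1 / 1.5)  # true power law: alpha = 2.5, xmin = 1
x_ln <- rlnorm(5000)                  # log-normal sample, for contrast

ks_pl <- ks_power_law(x_pl, xmin = 1)
ks_ln <- ks_power_law(x_ln, xmin = min(x_ln))
print(c(power_law = ks_pl, lognormal = ks_ln))
```

A small distance means the fitted curve tracks the data closely. The bootstrap then asks how often synthetic power-law samples produce a distance at least as large as the one observed, which is where the p-value comes from.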

variables <- c("p", "gof")

bootyout <- booty[variables]

write.table(bootyout, file="/Volumes/file.csv", sep=',', append=F, row.names=FALSE, col.names=TRUE)

Next, we need to see if our data better fits a log-normal distribution. Here we compare our dataset to a log-normal distribution, and then compare the p-values and perform a goodness-of-fit test. If you have continuous data you’d use conlnorm for a continuous log normal distribution. Since we are using discrete data with the Moby dataset we use the function dislnorm. Again, just make sure you know which type of data you’re using.

### Estimating a log normal fit

aa <- dislnorm$new(moby)

We then set the xmin in the log-normal dataset so that the two distributions are comparable.

aa$setXmin(pl_a$getXmin())

Then we estimate the parameters, as above:

est2 <- estimate_pars(aa)

aa$setPars(est2$pars)

Now we compare our two distributions. Please note that it matters which order you put these in. Here I have the power-law value first with the log-normal value second. I discuss what ramifications this has below.

comp <- compare_distributions(pl_a, aa)

Then we actually print out the stats:

comp

And then I create a printable dataset that we can then look at later.

myvars <- c("test_statistic", "p_one_sided", "p_two_sided")

compout <- comp[myvars]

write.table(compout, file="/Volumes/file2.csv", sep=',', append=F, row.names=FALSE, col.names=TRUE)

And now all we have left to do is graph it!

pdf(file=paste('/Volumes/Power_Law.pdf', sep=''), width=5.44, height=3.5, bg="white", paper="special", family="Helvetica", pointsize=8)

par(mar=c(4.1,4.5,0.5,1.2))

par(oma=c(0,0,0,0))

plot(pl_a, col='black', log='xy', xlab='', ylab='', xlim=c(1,400), ylim=c(0.01,1))

lines(pl_a, col=2, lty=3, lwd=2, xlab='', ylab='')

lines(aa, col=3, lty=2, lwd=1)

legend("bottomleft", cex=1, xpd=T, ncol=1, lty=c(3,2), col=c(2,3), legend=c("powerlaw fit", "log normal fit"), lwd=1, yjust=0.5, xjust=0.5, bty="n")

text(x=70, y=1, cex=1, pos=4, labels=paste("Power law p-value: ", bootyout$p))

mtext("All regions, Size", side=1, line=3, cex=1.2)

mtext("Relative frequencies", side=2, line=3.2, cex=1.2)

box()

dev.off()

Now, how do you actually tell which is better, the log normal or power-law? Here is how I describe it in my upcoming article:

The alpha parameter reports the slope of the best-fit power-law line. The power-law probability reports the probability that the empirical data could have been generated by a power law; the closer that statistic is to 1, the more likely that is. We consider values below 0.1 as rejecting the hypothesis that the distribution was generated by a power law (Clauset et al. 2009:16). The test statistic indicates how closely the empirical data match the log normal. Negative values indicate log-normal distributions, and the higher the absolute value, the more confident the interpretation. However, it is possible to have a test statistic that indicates a log-normal distribution in addition to a power-law probability that indicates a power-law, so we employ the compare distributions test to compare the fit of the distribution to a power-law and to the log-normal distribution. Values below 0.4 indicate a better fit to the log-normal; those above 0.6 favor a power-law; intermediate values are ambiguous. Please note, though, that it depends on what order you put the two distributions in the R code: if you put log-normal in first in the above compare distributions code, then the above would be reversed—those below 0.4 would favor power-laws, while above 0.6 would favor log normality. I may be wrong, but as far as I can tell it doesn’t actually matter which order you put the two distributions in, as long as you know which one went first and interpret it accordingly.

So, there you have it! Now you can run a power-law analysis on many types of data distributions to examine if you have a rich-get-richer dynamic occurring! Special thanks to Aaron Clauset for answering my questions when I originally began pursuing this research.

Full code at the end:

library("poweRlaw")

data("moby", package="poweRlaw")

pl_a <- displ$new(moby)

pl_a$getXmin()

est <- estimate_xmin(pl_a)

pl_a$setXmin(est)

pl_a$getPars()

estimate_pars(pl_a)

booty <- bootstrap_p(pl_a)

variables <- c("p", "gof")

bootyout <- booty[variables]

#write.table(bootyout, file="/Volumes/file.csv", sep=',', append=F, row.names=FALSE, col.names=TRUE)

### Estimating a log normal fit

aa <- dislnorm$new(moby)

aa$setXmin(pl_a$getXmin())

est2 <- estimate_pars(aa)

aa$setPars(est2$pars)

comp <- compare_distributions(pl_a, aa)

comp

myvars <- c("test_statistic", "p_one_sided", "p_two_sided")

compout <- comp[myvars]

write.table(compout, file="/Volumes/file2.csv", sep=',', append=F, row.names=FALSE, col.names=TRUE)

pdf(file=paste('/Volumes/Power_Law.pdf', sep=''), width=5.44, height=3.5, bg="white", paper="special", family="Helvetica", pointsize=8)

par(mar=c(4.1,4.5,0.5,1.2))

par(oma=c(0,0,0,0))

plot(pl_a, col='black', log='xy', xlab='', ylab='', xlim=c(1,400), ylim=c(0.01,1))

lines(pl_a, col=2, lty=3, lwd=2, xlab='', ylab='')

lines(aa, col=3, lty=2, lwd=1)

legend("bottomleft", cex=1, xpd=T, ncol=1, lty=c(3,2), col=c(2,3), legend=c("powerlaw fit", "log normal fit"), lwd=1, yjust=0.5, xjust=0.5, bty="n")

text(x=70, y=1, cex=1, pos=4, labels=paste("Power law p-value: ", bootyout$p))

mtext("All regions, Size", side=1, line=3, cex=1.2)

mtext("Relative frequencies", side=2, line=3.2, cex=1.2)

box()

dev.off()

General

SSI to the rescue

February 29, 2016 izaromanowska

Ever heard of the Software Sustainability Institute? It is an EPSRC (UK’s engineering and physical science research council) funded organisation championing best practices in research software development (they are quite keen on best practice in data management as well). They have some really useful resources such as tutorials, guides to best practice and listings of the software and data carpentry training events. I wanted to draw your attention to them, because I feel that the times when archaeological simulations will need to start conforming to the painful (yet necessary) software development standards are looming upon us. The institute’s website is a great place to start.

More to the point, the Institute has just released a call for projects (see below for details). In a nutshell, the idea is that a team of research software developers (read: MacGyver meets Big-Bang-Theory) comes over and makes your code better, speeds up your simulation (e.g., by parallelising it), improves your data storage strategy, stabilises the simulation, helps with developing unit testing or version control, packs the model into an ‘out-of-the-box’ format (e.g., by developing a user-friendly interface) or whatever else you ask for that will make your code better, more sustainable, more reusable/replicable or useful for a wider community. All of that free of charge.

The open call below mentions BBSRC and ESRC, but projects funded through any UK research council (incl. AHRC and NERC), other funding bodies as well as projects based abroad are eligible to apply. The only condition is that applications “are judged on the positive potential impact on the UK research community”. The application is pretty straightforward and the call comes up two to three times a year. The next deadline is 29th April. See below for the official call and follow the links for more details.

————————————————————————–

Get help to improve your research software

If you write code as part of your research, then you can get help to improve it – free of charge – through the Software Sustainability Institute’s Open Call for Projects. The call closes on April 29 2016.

Apply at http://bit.ly/ssi-open-call-projects

You can ask for our help to improve your research software, your development practices, or your community of users and contributors (or all three!). You may want to improve the sustainability or reproducibility of your software, and need an assessment to see what to do next. Perhaps you need guidance or development effort to help improve specific aspects or make better use of infrastructure.

We accept submissions from any discipline, in relation to research software at any level of maturity, and are particularly keen to attract applications from BBSRC and ESRC funding areas.

The Software Sustainability Institute is a national facility funded by the EPSRC. Since 2010, the Institute’s Research Software Group[1] has assisted over 50 projects across all the UK Research Councils. In an ongoing survey, 93% of our previous collaborators indicated they were “very satisfied” with the results of the work. To see how we’ve helped others, you can check out our portfolio of past and current projects[2].

A typical Open Call project runs between one and six months, during which time we work with successful applicants to create and implement a tailored work plan. You can submit an application to the Open Call at any time, which only takes a few minutes, at http://bit.ly/ssi-open-call-projects.

We’re also interested in partnering on proposals. If you would like to know more about the Open Call, or explore options for partnership, please get in touch with us at info (at) software (dot) ac (dot) uk.

Tutorials

Fun with Markov Chains: A Tutorial Using NetLogo

February 10, 2015 benjdavies

When I started working on this How-To on building a simple Markov chain (a useful component of model-building), I came across this great visualization at Setosa Blog. Clearly, after this and the really incredible Segregation visualization that Iza posted about, I’m starting to feel the need to step up my data viz game. So before you read on, have a look at the visualization because it does a really good job of explaining the fundamentals and gives you a chance to play with a Markov chain yourself. I’ll just briefly cover what Markov chains are, and then we’ll get into using Markov chains in an ABM using NetLogo.

What is a Markov chain?

A Markov chain is a way to model how a system changes from one state to another over time. Imagine a system with two states: A and B. If the system is in state A, there is a probability that over the next time step, the system will transition to state B (with the complementary probability that the system will remain in state A). There are also probabilities that a system in state B will transition to state A or remain in state B. Since the system we’re modeling has to exist in one of the two states, the transition probabilities out of each state should add up to 1.00. These are sometimes visualized using a simple network graph, like so:

Markov chains are useful for dealing with phenomena that are auto-correlated: that is, the future state of the phenomenon depends on its current state. When this is true, we may not be able to accurately model the phenomenon as a random probability draw. To use this method, we need to know what the current state of the phenomenon is, and what the likelihood is of the system transitioning to another state. The weather is often used as an example. In ecological modeling, we often want to model rainfall because pretty much every living thing on earth depends to some degree on the frequency and regularity of rainfall. Let’s look at the rainfall data for Boston, Massachusetts in April of last year (courtesy of Weather Underground).

It rained 16 out of 30 days in April, or 53.33% of the time. We could simulate April rain frequency in Boston by simply performing 30 checks against a probability of 53.33%. If we wanted to do that in R, we could do the following:

prob<-0.5333

checks<-(runif(30, 0, 1)) <= prob

Here, we have two variables: one expressing the probability of rain on any given day, and the other running the routine of generating thirty random numbers between 0 and 1 and checking whether they are less than or equal to that probability. If we enter checks into the command prompt, it would give us the following output:

TRUE FALSE TRUE TRUE TRUE FALSE TRUE TRUE TRUE FALSE FALSE TRUE FALSE TRUE FALSE TRUE FALSE FALSE TRUE TRUE TRUE TRUE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE

In this simulated dataset, TRUE is a rainy day and FALSE is a clear day. That looks OK, but does rain really work that way? Do we really get long spans where it rains on alternating days? One thing we notice from the Boston rainfall data is that rainy days don’t seem to occur sporadically; they occur in groups in between spans of good weather. This probably makes intuitive sense; if it starts raining today, there’s a better-than-average chance it might rain tomorrow. That’s because the process of rain clouds passing over our heads is auto-correlated. However, we can use this data to estimate how likely it is for the weather to go from dry to wet (DTW), or vice versa (WTD). We calculate these values using the following equations:

DTW = number of dry days followed by a wet day / total number of dry days = 5 / 14 = 35.71%

WTD = number of wet days followed by a dry day / total number of wet days = 5 / 16 = 31.25%

More data would be better, and Weather Underground has Boston weather going back about 95 years. But with these transition probabilities, we could build a Markov chain that simulates the sequence of rainy versus clear days in April in Boston in a way that captures some of that auto-correlation.
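Before moving to NetLogo, the same idea can be sketched in R for comparison with the independent draws shown earlier. This is a minimal sketch, not the model built below: the function name simulate_rain, its arguments, and the choice to start on a dry day are assumptions made for illustration. The transition probabilities are the DTW and WTD values calculated above.

```r
# Simulate a two-state (wet/dry) Markov chain; TRUE is a rainy day.
# dtw: probability a dry day is followed by a wet day
# wtd: probability a wet day is followed by a dry day
# (simulate_rain is a hypothetical helper; days is assumed to be at least 1)
simulate_rain <- function(days, dtw, wtd, start_wet = FALSE) {
  wet <- logical(days)
  wet[1] <- start_wet
  if (days > 1) {
    for (d in 2:days) {
      # Pick the transition probability that matches yesterday's state.
      change <- runif(1) < (if (wet[d - 1]) wtd else dtw)
      wet[d] <- xor(wet[d - 1], change)
    }
  }
  wet
}

set.seed(2015)
april <- simulate_rain(30, dtw = 5 / 14, wtd = 5 / 16)
print(april)
```

Because each day depends on the day before, runs of TRUE and FALSE tend to be longer here than in the independent version, which is closer to how the rainy spells appear in the Boston data.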

Making it rain in NetLogo

Let’s say we wanted to model grass growth in NetLogo, using a Markov chain like the one above to simulate daily rainfall. First, we’ll open a new model, head over to the Code tab, and create a global variable for the weather. We can also consider grass as a component of the patches in the model, and the grass could be in various growth stages, which we’ll call growth.

Next, we’ll set the model up so that we start off in one of two weather conditions: “WET” or “DRY”. We’ll also set up the grass so that growth is randomly distributed between 0 and 10 among the patches (we code this as random 11 to account for 0), and we’ll color the patches according to their growth using the scale-color command (darker = more growth, lighter = less growth).

Then, in the go procedure, we’ll have two functions: markov-rain and grow-grass, that will repeat during the model at each time step, or tick, until a limit of 500 ticks is reached.

The markov-rain procedure will determine the transitions between “WET” and “DRY” weather, and we’ll use variable placeholders WTD for the wet-to-dry transition, and DTW for the dry-to-wet transition. This might throw up an error about these variables not existing, but we’ll get to that later.

What’s happening here? To use a Markov chain in NetLogo, we need to know what the current state of the process is, in this case the current weather, and then apply our transition probabilities to determine whether the state has changed. Here, we use the ifelse command to determine whether it is currently “WET” or “DRY”, and then determine the next state using the appropriate transition probability (WTD if conditions are currently “WET”, DTW if conditions are currently “DRY”). If a random number draw between 0 and 1.000 is less than whatever the value for the transition is, then the weather will change to its opposite state; otherwise, it will remain the same.

The grow-grass procedure will again use ifelse to determine the weather. If it is “WET”, any grass which has not reached the maximum growth level of 10 will grow by one unit. If it is “DRY”, grass which has not reached the minimum growth level of 0 will shrink by one unit.
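For readers who want to trace the logic outside NetLogo, here is a rough R translation of the two procedures. It is a sketch under stated assumptions rather than the model's actual code: the function name run_grass, the number of patches, and the dry starting weather are all hypothetical, and it assumes that dry weather reduces growth by one unit down to a floor of 0.

```r
# Rough R analogue of the markov-rain and grow-grass procedures.
# Returns the mean growth across patches at each tick.
run_grass <- function(ticks = 500, dtw = 0.3, wtd = 0.3, n_patches = 100) {
  growth <- sample(0:10, n_patches, replace = TRUE)  # like `random 11` per patch
  wet <- FALSE                                       # assumed starting weather
  mean_growth <- numeric(ticks)
  for (t in seq_len(ticks)) {
    # markov-rain: switch state with the probability for the current state
    if (runif(1) < (if (wet) wtd else dtw)) wet <- !wet
    # grow-grass: wet days add a unit (max 10), dry days remove one (min 0)
    if (wet) {
      growth <- pmin(growth + 1, 10)
    } else {
      growth <- pmax(growth - 1, 0)
    }
    mean_growth[t] <- mean(growth)
  }
  mean_growth
}

set.seed(10)
both_30 <- run_grass(dtw = 0.3, wtd = 0.3)
both_80 <- run_grass(dtw = 0.8, wtd = 0.8)
print(range(both_30))
print(range(both_80))
```

Plotting either vector against the tick number (for example with plot(both_30, type = "l")) gives a rough counterpart to the mean growth plot on the NetLogo interface.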

The last step is to switch over to the Interface tab and create sliders for the WTD and DTW variables. I set these up from 0 to 1.0 at an interval of 0.1. I also added the setup and go buttons as well, making sure to tick the Forever? option for the go button so that it will run repeatedly until the time limit is reached. Finally, I added a plot which records the mean [ growth ] of patches. The interface should look something like this:

Running the model

When both the WTD and DTW are set to 50%, the rainfall effectively follows a random walk between dry and wet, as we might expect. This produces grass growth that varies randomly over time.

If we use the sliders to change the conditions so that it is likelier to transition from dry to wet (DTW = 80%) than wet to dry (WTD= 20%), the outcome is fairly predictable: wetter conditions on the whole, meaning greater vegetation growth (top image). When the transition is the other way around (DTW = 20%, WTD = 80%), conditions are drier which means less grass growth (bottom image).

But what about when both transitions are less likely to occur? Let’s set them both at 30%, not far off from our Boston scenario:

Under this setting we get dramatic sways between dry and wet conditions, causing rapid growth and decline of vegetation. What about if both transitions are higher? Let’s set both transitions to 80%:

When both transitions are set to 80%, we get fewer dramatic sways in vegetation growth. Why does this happen? The answer lies in the likelihood of transitioning and the buffer provided by the growth factor. In the 30%-30% scenario, the likelihood of transitioning overall is low, but when it does happen, it’s more likely to stay in whatever state it transitions to, swinging the vegetation to one extreme or the other fairly rapidly. Under the 80%-80% scenario, there is a greater likelihood that a wet day will be quickly offset by a dry day (and vice versa), which has a balancing effect over the short-term. This is an interesting behavior that may not be intuitive without building and running the model first.

Going further

Of course, this is a very simple model, and I sincerely hope no one would actually attempt to predict the weather or vegetation growth with it (DISCLAIMER: No, really, don’t do it). How could it be improved? Well, for starters, while we model rain/not rain in binary terms here, we know that the intensity of rain can vary tremendously. Since there is no limit to the number of states we can use, we could add further numbers of states and transitions to model the likelihood of going from drizzle to deluge and all points in between. Furthermore, the transition probabilities are likely to change over the course of the year, relative to seasonal changes. There are also likely to be changes from year to year or decade to decade relative to global climatic cycles. Adding these different levels of change could be accomplished by treating each smaller scale time period as nodes within a larger Markov chain with its own transition probabilities.

But we might also use Markov chains for modeling adaptive agent behaviors as well. Oftentimes, we want agents that don’t follow the same script by rote each time step. For instance, an agent might eat at its favorite restaurant x% of the time, but y% of the time it might try something new. How do different settings of x and y affect the ranging patterns of the agent? What if the agent changes the likelihood of eating at particular restaurants based on dining experiences? What determines how likely a restaurant is to be visited? Each agent could develop an entire decision-making schema and subsequent evolving ranging patterns through the process of exploring and adapting a simple Markov chain like the one presented here.

More on Markov chains

One of the interesting features of a Markov process with a finite number of states and fixed transition probabilities (like our rainfall model) is that, over time, it converges on a distribution of outcomes. This can be really useful for predicting the future state of the system, but also for determining where changing the transition probabilities is likely to have the greatest effect. There’s a lot more to learn about Markov chains and what kinds of processes are best represented by them. Here are a few things to get you started:

The historical background on Markov chains gets explained like a pro in this video from Khan Academy.
This set of short videos from Scott Page’s Model Thinking course does a good job of providing an overview, and explains the Markov Convergence Theorem in more detail.
This freely available book has a good overview of how Markov chains work mathematically in Chapter 11.
This paper by Izquierdo and colleagues gives a very thorough treatment of computer simulation and Markov chain analysis.
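The convergence mentioned above can be checked directly for the two-state rainfall chain. For a chain like this one, the long-run share of wet days is DTW / (DTW + WTD). The sketch below computes that value and then confirms it by repeatedly applying the transition matrix to a starting distribution. The names stationary_wet, P, and state are assumptions for illustration, and the probabilities are the Boston values from earlier in the post.

```r
# Long-run share of wet days for a two-state chain
# (stationary_wet is a hypothetical helper).
stationary_wet <- function(dtw, wtd) dtw / (dtw + wtd)

dtw <- 5 / 14
wtd <- 5 / 16

# Transition matrix: rows are today's state, columns are tomorrow's.
P <- matrix(c(1 - dtw, dtw,
              wtd,     1 - wtd),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("dry", "wet"), c("dry", "wet")))

# Start from a certainly dry day and step the distribution forward.
state <- matrix(c(1, 0), nrow = 1, dimnames = list(NULL, c("dry", "wet")))
for (i in 1:200) state <- state %*% P

print(stationary_wet(dtw, wtd))  # 0.5333..., i.e. 16/30
print(state[1, "wet"])           # converges to the same value
```

The long-run share works out to 16/30, the same proportion of rainy days observed in the April data, so the chain reproduces the overall frequency of rain while also capturing the clustering of wet and dry days.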

Featured image: A Markov chain of how my PhD thesis is coming along…

UPDATE 18 Feb 2015: Per request, the code for the model can be downloaded here: markov model
UPDATE 10 Aug 2015: Full model available here

simulatingcomplexity


From the world of Complex Systems Simulation in Humanities