Paul Matthews, 2012, University of the West of England http://www.cems.uwe.ac.uk/~pmatthew/
Using data from StackOverflow: http://www.stackoverflow.com
Data released under Creative Commons Attribution-ShareAlike: http://creativecommons.org/licenses/by-sa/3.0/
See here for StackOverflow-specific attribution requirements: http://blog.stackoverflow.com/2009/06/attribution-required/
The R code uses the [ProjectTemplate](http://projecttemplate.net/index.html) format, with odfWeave to create the document. When you load the project (src/eda.R), the data sets are created. The data used in the paper is in the cache folder.
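As a rough illustration of the loading step, the sketch below wraps ProjectTemplate's `load.project()` in a small helper. The helper name `load_so_project` and its `project_dir` argument are hypothetical and not part of this repository. It assumes the ProjectTemplate package is installed and that the folder follows the standard ProjectTemplate layout. Running `source("src/eda.R")` from the project root should be the equivalent route described above.

```r
# Hypothetical helper (not part of this repository): load the project
# from its root folder using ProjectTemplate.
load_so_project <- function(project_dir = ".") {
  if (!requireNamespace("ProjectTemplate", quietly = TRUE)) {
    stop("Install ProjectTemplate first: install.packages('ProjectTemplate')")
  }
  # load.project() works relative to the current working directory,
  # so switch to the project root and restore the old directory on exit.
  old_wd <- setwd(project_dir)
  on.exit(setwd(old_wd), add = TRUE)
  ProjectTemplate::load.project()
  # Return (invisibly) the names of the objects now in the global environment.
  invisible(ls(envir = .GlobalEnv))
}

# Usage, from an R session (the path is a placeholder):
# load_so_project("path/to/this/project")
```

Calling `ls()` afterwards lists the data sets and functions that were loaded.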
The full data from the dumps (raw and preprocessed) was stored in MySQL and is not included here. Contact the author if interested.
See the lib folder for graph and data functions. See the doc folder for the document template.