Best way to organize and analyze Behavior Space data


Pradeesh Kumar K V

Oct 28, 2021, 4:46:00 AM
to netlogo-users
Hello all,

I have 100 BehaviorSpace run outputs for 3 model variables. The outputs for each run are grouped in adjacent columns in an Excel sheet. For example, the run 1 outputs for variables A, B and C are in columns A, B and C; the run 2 outputs are in columns D, E and F, and so on.

Manually arranging this data to find, say, the mean of one variable over time is cumbersome, as it requires shifting the data into adjacent columns by hand. What is the best way, in general, to arrange BehaviorSpace data? Is it through coding?

Thanks,

Pradeesh

Wade Schuette

Oct 28, 2021, 11:11:35 AM
to Pradeesh Kumar K V, netlogo-users
I'm not sure there is an easier way, in general, than reading the BehaviorSpace output file into a tool such as the statistical package R, which has nice tools for reading and then "wrangling" the data: slicing and dicing, rearranging, computing values, and generating informative graphics, or even interactive graphics using something like Shiny. Once you get the scripts written and debugged in a tool like RStudio, one click is all it takes to read in entire new runs and convert them to helpful statistics and graphics.

The other advantage is that you can publish the R scripts along with the dataset (and your NetLogo model code, of course) so other people can verify and build on your work. In many fields there are now standards that strongly suggest or require that you document reproducibly how you came up with your "Figure 2.7", for example! :)   https://www.dcc.ac.uk/guidance/standards/metadata/list

Manually wrangling Excel spreadsheets carries a very high risk of making a mistake you don't notice, turning your subsequent analysis and possibly publications into something you regret or have to retract. It's also time-consuming, and if you do it very often it's faster to resign yourself to learning enough R to be helpful. R and RStudio are free, and there's lots of help and tutorials available online. There are probably under 20 commands in R that you need to get sorted out, once, that can then be tweaked for most future needs, and RStudio will give you syntax, hints, and debugging information as you master those. (Or, hopefully, find someone down the corridor who can write them for you.) RStudio is available for Windows, macOS, and Linux.

Example: the RStudio menu command File > Import Dataset > From Text (readr)
will read in the BehaviorSpace output CSV file (the "table" one, not the "spreadsheet" one). Once you tell it to skip the first 6 lines (put a 6 in the skip box), it locates the column headers, and as you point at each one and specify its type it literally writes the R code statement for you. You only need to do that once; then you can save the script and reuse it from then on.

My latest script begins:

 
# This reads the 54,000-row BehaviorSpace output (30 variables)
# generated yesterday by model 1.17.
# It creates possible row labels and removes the transaction-cost column.
# Clear existing variables with:
rm(list=ls())

# you need to edit the following line to suit your computer.
# This line is automatically written if you use the menu choice
# Session > set working directory

setwd("~/netlogo/whatever")

# these are the most useful extensions I've found to add.
# if they're not "installed" they will need to be installed.

library(readr)
library(tidyverse)

# Then the RStudio menu command File > Import Dataset > From Text (readr) will
# write something like the following for you:

df <- read_csv("my-behavior-space-output-table.csv", skip = 6)

# followed by about 20 uses of the wrangling commands
#  "rename", "mutate" and "select" such as

# make shorter names without dashes in them
# (backticks are needed around original names that contain dashes):
df2 <- rename(df,
              costpct  = `Cost-percent`,
              gdppct   = `fund-gdp-percent-scaled`,
              translog = `transaction-cost-log`)
# ...
# make a temporary copy for safety, removing a column you don't want
temp <- select(df, -denumb)

# scale variables if you want to have values between 1 and 100 or whatever
df <- temp %>% mutate(emitred = emitred / 10E6)

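# Make the data into a "data table" format and possibly filter down to just some subset
# (data.table() comes from the data.table package, which must be installed and loaded)
library(data.table)
myDT <- data.table(df)
myDT <- filter(myDT, variable7 > 15)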

# and see what it looks like now, including hovering your mouse over the
# column headers to see min and max values or sort on that column

view(df)

And at that point you can run statistics on the data,  grouped however you want
and generate nice graphics.
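
As a minimal sketch of that grouping step, assuming the table has already been read in and renamed: the small data frame below is a made-up stand-in for a BehaviorSpace table export, and the column names run, step, A, B and C are hypothetical (in a real export the first two would come from renaming "[run number]" and "[step]"). It computes the mean of each variable at each time step across runs, which is what the original question asked for.

```r
library(dplyr)

# Made-up stand-in for a BehaviorSpace "table" export: one row per run and step.
# Column names (run, step, A, B, C) are assumptions for illustration only.
df <- data.frame(
  run  = rep(1:2, each = 3),
  step = rep(0:2, times = 2),
  A    = c(1, 2, 3, 3, 4, 5),
  B    = c(10, 20, 30, 30, 40, 50),
  C    = c(0, 0, 1, 1, 1, 2)
)

# Mean of each variable at each time step, averaged over all runs,
# plus the number of runs that contributed to each mean.
by_step <- df %>%
  group_by(step) %>%
  summarise(across(c(A, B, C), mean), n_runs = n(), .groups = "drop")

print(by_step)
```

With real output, only the data frame construction changes; the group_by/summarise part stays the same, and other summary functions such as sd can be swapped in for mean.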

I'll yield to the thousand online YouTube video tutorials on what to do next.
Here's a ten-minute one that tells you more than you need but demonstrates
select, mutate, filter, summarize, grouping, etc.


Wade




wade.s...@gmail.com

Oct 28, 2021, 11:25:54 AM
to netlogo-users
Meh, that tutorial video is way too busy and confusing. Sorry about that. There are nicer ones.
These look way better for novices:

R data manipulation with Rstudio and dplyr:

and an introduction to ggplot as a plotting package

Wade

Stephen Guerin

Oct 28, 2021, 11:47:23 AM
to Pradeesh Kumar K V, netlogo-users
Hi Pradeesh,

Try outputting your BehaviorSpace results as a Table instead of a Spreadsheet (unfortunate names for the options).

Open the output in Excel or Google Sheets and create a pivot table to get your means, standard deviations, etc. per category.

A Google search for "pivot table netlogo behaviorspace" brought up this nice tutorial
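
For results that exist only in the side-by-side layout described in the original question, a rough R sketch of getting to the same kind of pivot summary is shown below. It assumes the columns have been given names of the form run1_A, run1_B and so on, plus a step column; those names and the sample numbers are hypothetical, not something BehaviorSpace produces by itself.

```r
library(tidyr)
library(dplyr)

# Made-up wide layout: each run's A, B, C sit in adjacent columns.
# The column naming scheme (run<N>_<variable>) is an assumption.
wide <- data.frame(
  step   = 0:2,
  run1_A = c(1, 2, 3), run1_B = c(10, 20, 30), run1_C = c(0, 0, 1),
  run2_A = c(3, 4, 5), run2_B = c(30, 40, 50), run2_C = c(1, 1, 2)
)

# Reshape to long form: one row per step, run and variable.
long <- wide %>%
  pivot_longer(
    cols      = -step,
    names_to  = c("run", "variable"),
    names_sep = "_",
    values_to = "value"
  )

# Pivot-table-style summary: mean and standard deviation per step and variable.
pivot <- long %>%
  group_by(step, variable) %>%
  summarise(mean = mean(value), sd = sd(value), .groups = "drop")

print(pivot)
```

Once the data is in long form, the grouping columns play the role of the pivot table's row and column fields.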



wade.s...@gmail.com

Oct 28, 2021, 11:59:27 AM
to netlogo-users
If someone new to R and RStudio wants to install them (and there are versions for macOS and Ubuntu as well as Windows),
here's a link on how to do that, step by step

Again, this is overkill if you are just playing around, and I see a nice Excel pivot table video was just posted.
But if you're going to be doing this a lot, I recommend learning enough R to get by. The two videos I posted earlier
teach all you need for most purposes.

Wade

Pradeesh Kumar K V

Oct 29, 2021, 12:40:56 AM
to wade.s...@gmail.com, netlogo-users
Hello Wade,

Thanks a lot for your suggestions on R and RStudio, for the sample code, and for the references to the videos. R looks like the right tool for my purposes. I started learning R a few years back but had to drop it after completing the first few lessons. I will pick it up from there.

Much appreciated.

Best,

Pradeesh

Pradeesh Kumar K V

Oct 29, 2021, 12:43:56 AM
to stephen...@redfish.com, netlogo-users
Hello Stephen,

Thanks for the valuable tip. I exported the data as a table and was able to create a pivot table very easily.

Very helpful.

Best,

Pradeesh 