Adding new column to dataframe


Jason Solack

May 15, 2014, 10:59:54 PM
to julia...@googlegroups.com
So I feel like this is a simple question, but I can't find a reference to it.

Let's say I have a DataFrame with columns A and B, and I want to add a new column C that is A+B. How would I do that?

Sorry if I'm overlooking an easy answer!

jason

Sam L

May 16, 2014, 2:23:35 AM
to julia...@googlegroups.com
> df = DataFrame(A=rand(10), B=rand(10))
> df[:C] = df[:A] .+ df[:B]

does the trick in DataFrames v0.5.4.
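
On current DataFrames.jl releases the `df[:C]` indexing form is no longer accepted. A minimal sketch of the same idea, assuming DataFrames.jl 1.x (the extra column name `D` is only there to show the second spelling):

```julia
using DataFrames

df = DataFrame(A = rand(10), B = rand(10))

# Elementwise sum of two existing columns, stored as a new column C.
df.C = df.A .+ df.B

# Equivalent indexing form; `!` selects the column itself rather than a copy.
df[!, :D] = df[!, :A] .+ df[!, :B]
```

Both spellings add a column in place; which one to use is mostly a matter of taste.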

Jason Solack

May 16, 2014, 7:10:06 AM
to julia...@googlegroups.com
Thank you!

Westley Hennigh

Oct 12, 2014, 3:05:20 PM
to julia...@googlegroups.com
Suppose that I want to create a new column of integers, default them all to "not set" (in other words, NA), and then loop and initialize some of them later.

I can't just `df[:C] = NA` because then I'll have a column that's an Array{NA,1}...

So maybe I've got to do something like:
df[:C] = fill!(Array(Any, size(df,1)), NA)

But then I'm sort of breaking the DataFrame structure (as I understand it). Underneath, the DataFrame is supposed to be a nicely typed set of column arrays, with a separate set of columns containing flags that indicate when a value is missing. What I just produced is a column with a very generic type, where every value is set and some just happen to be the special NA value.

Is there a better way to do this?


John Myles White

Oct 12, 2014, 3:10:56 PM
to julia...@googlegroups.com
Use something like this:

julia> using DataFrames

julia> DataArray(Int, 10)
10-element DataArray{Int64,1}:
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA

— John
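
In later DataFrames.jl versions, `NA` and `DataArray` were replaced by Julia's built-in `missing` and ordinary arrays with a `Union{Missing, T}` element type. A rough equivalent under that assumption (DataFrames.jl 1.x; the column `A` and the fill rule in the loop are made up for illustration):

```julia
using DataFrames

df = DataFrame(A = 1:5)

# A typed integer column in which every entry starts out missing.
df.C = Vector{Union{Missing, Int}}(missing, nrow(df))

# Initialize some entries later; the others stay missing.
for i in 1:nrow(df)
    if isodd(df.A[i])
        df.C[i] = 2 * df.A[i]
    end
end
```

The column keeps an integer element type throughout, so it avoids the `Any`-typed array from the earlier attempt.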

Westley Hennigh

Oct 12, 2014, 3:14:53 PM
to julia...@googlegroups.com
Hahaha, right, thanks!