Working with DataFrame column types that are Nullable


John Best

Jul 21, 2016, 9:05:42 PM
to julia-users

I've got ODBC.jl set up to run a couple of queries. This works, but it returns a DataFrame whose columns have eltypes of Nullable{Int64}, Nullable{Dec64}, etc. I'd like to convert the numeric element types to Float64 for use in my analysis (which was written against CSV files of the same data). I would present an example, but I can't seem to construct a basic DataFrame with NullableArray columns without getting:

    ERROR: MethodError: `upgrade_vector` has no method matching upgrade_vector(::NullableArrays.NullableArray{Int64,1})
    WARNING: Error showing method candidates, aborted

     in setindex! at /home/jkbest/.julia/v0.4/DataFrames/src/dataframe/dataframe.jl:368
     in DataFrame at /home/jkbest/.julia/v0.4/DataFrames/src/dataframe/dataframe.jl:104

This is also the error I get when I try to manually convert a column, e.g.

    df[:colA] = NullableArrays.NullableArray{Float64}(df[:colA])

This is the first time I've tried working with NullableArrays. Is there any way to convert a DataFrame of NullableArrays to a DataFrame of DataArrays, and if so, how would I do that? At least I know that my existing code works with DataArrays.

Thanks,
John

Evan Fields

Jul 22, 2016, 10:13:25 AM
to julia-users
As far as I know, DataFrames are only backed by DataArrays. (I believe there's current work being done to upgrade the speed and type stability of DataFrames, in part by using Nullables.)

It might be helpful to write a little convenience function like

    f(n) = isnull(n) ? NA : get(n)

I will say I've found DataArrays to be super finicky when inferring types. YMMV.
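
For anyone landing here later, one way the helper above might be applied to a whole column is sketched below. This is only a sketch, assuming the 2016-era (Julia v0.4) versions of DataFrames, DataArrays and NullableArrays; `to_dataarray` is a hypothetical name, not part of any of those packages, and it has not been tried against an actual ODBC.jl result.

```julia
using DataFrames       # assumed 2016-era version, which re-exports DataArrays (NA, DataArray, isna)
using NullableArrays

# Helper from the post: unwrap a Nullable, mapping a null value to NA.
f(n) = isnull(n) ? NA : get(n)

# Hypothetical helper (not a package function): copy a column whose
# elements are Nullable into a DataArray{Float64}.
function to_dataarray(col)
    out = DataArray(Float64, length(col))   # starts out with every entry NA
    for (i, n) in enumerate(col)
        out[i] = f(n)                        # NA stays NA; values are converted to Float64
    end
    return out
end
```

If that works for your columns, something like `df[:colA] = to_dataarray(df[:colA])` would then replace a column in place. Whether the conversion succeeds for Dec64 values depends on a Float64 conversion being defined for that type, which is an assumption here.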

John Best

Jul 22, 2016, 3:10:57 PM
to julia-users
Yeah, that's why I was surprised to get the Nullables. It's probably ODBC.jl anticipating the DataFrames.jl switch.

Jacob Quinn

Jul 22, 2016, 3:13:31 PM
to julia...@googlegroups.com
You are correct. There are properties of NullableArrays required for proper data transfer/handling, but I still wanted users to get a familiar type back. I'm definitely helping out with https://github.com/JuliaStats/DataFrames.jl/pull/1008 to ensure DataFrames gets ported over as quickly as possible.

-Jacob
