Custom array backed dataframe

Lee Bates

Oct 25, 2016, 10:45:27 PM
to julia-stats
Hi,

I've created a wrapper around an mmap-backed array that allows me to grow and shrink the array efficiently.
I would like to use this custom array as the data for the DataArrays backing a DataFrame. My issue is that the DataFrames and DataArrays packages declare most of their methods and types in terms of Array rather than AbstractArray.

I don't believe there is any performance loss from using abstract types in method signatures. Would it be possible to migrate to abstract types?

Also, if this is OK, would a find-and-replace of all instances of Array{T, N} with AbstractArray{T, N} be enough to update the packages?
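
To make the question concrete, here is a minimal sketch in current Julia (1.x) syntax. It is not code from DataFrames or DataArrays; `GrowableVector`, `total`, `LooseColumn` and `TightColumn` are hypothetical names, and a plain `Vector` stands in for the mmapped storage. It illustrates why the answer may differ between method signatures and type declarations: a method written against `AbstractVector` is still specialized for each concrete argument type, whereas a field declared with an abstract type is not concretely typed unless the container is parameterized.

```julia
# Hypothetical wrapper; a plain Vector stands in for the mmap-backed storage.
struct GrowableVector{T} <: AbstractVector{T}
    data::Vector{T}
end

# Minimal AbstractArray interface so generic code can index the wrapper.
Base.size(v::GrowableVector) = size(v.data)
Base.getindex(v::GrowableVector, i::Int) = v.data[i]
Base.setindex!(v::GrowableVector, x, i::Int) = (v.data[i] = x)
Base.IndexStyle(::Type{<:GrowableVector}) = IndexLinear()

# Abstract type in a signature: accepts Vector, GrowableVector, etc.
# Julia compiles a specialized version for each concrete argument type.
function total(xs::AbstractVector{T}) where {T<:Number}
    s = zero(T)
    for x in xs
        s += x
    end
    return s
end

# Abstract type in a field: the field's type is not concrete.
struct LooseColumn
    data::AbstractVector{Float64}
end

# Parameterized container: the field's type is concrete for each instance.
struct TightColumn{A<:AbstractVector{Float64}}
    data::A
end

v = GrowableVector([1.0, 2.0, 3.0])
total(v)                 # works on the wrapper
total([1.0, 2.0, 3.0])   # and on a plain Vector
```

Under these assumptions, a textual replacement would probably be fine for method arguments, but type declarations with array fields would likely need a type parameter instead of a bare `AbstractArray{T, N}`.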

Thanks,
Lee

Milan Bouchet-Valat

Nov 1, 2016, 11:11:24 AM
to julia...@googlegroups.com
DataFrames can already have columns of arbitrary array types, but
currently conversion to DataArrays is done automatically in most cases.
We're still discussing whether this behavior should be kept or not:
https://github.com/JuliaStats/DataFrames.jl/issues/1091

Anyway, DataArrays is going to be deprecated in favor of NullableArrays.
You can have a look at how Feather.jl handles the creation of
NullableArrays based on mmapped data:
https://github.com/JuliaStats/Feather.jl/blob/245dea9cffd264e25415ae29237ba44e245edaf7/src/Feather.jl#L154
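
As background on mmapped data in general, the following sketch uses the `Mmap` standard library from current Julia (1.x). It does not reproduce what Feather.jl does at the linked line, and the temporary file and variable names are made up for illustration. It only shows that `Mmap.mmap` hands back an ordinary `Array` whose storage is the file, so such data can be passed to code written against `Array`.

```julia
using Mmap

# Write four Float64 values to a temporary file (hypothetical data).
path, io = mktemp()
write(io, Float64[1.0, 2.0, 3.0, 4.0])
close(io)

# Map the file contents as a Vector{Float64}; the mapping outlives the handle.
col = open(path, "r+") do f
    Mmap.mmap(f, Vector{Float64}, (4,))
end

col isa Array   # the result is a plain Array backed by the file
sum(col)        # usable like any other Vector{Float64}
```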
Regards