drRead functions using a multiple core single drive desktop

ross.da...@gmail.com

Oct 20, 2015, 8:41:52 PM
to Tessera-Users
An embarrassing question.

How can drRead be quicker than read.table or similar basic R read functions if the HDD or SSD is accessed sequentially?

It appears that drRead splits the data into chunks, but if the computer is limited to a single input channel then there seems to be no benefit, as the chunks can only be read one after another.

Maybe there is a document somewhere that explains this?

I would appreciate any advice.

Ross

jeremiah rounds

Oct 21, 2015, 1:14:05 AM
to Tessera-Users, ross.da...@gmail.com
drRead exists to help get data into a ddo/ddf in a truly distributed system.  The fact that it also works on a system that isn't distributed is just part of the agnostic philosophy of datadr's design.  Being fast in the scenario you present is not really a goal.

It is not faster, and if speed of loading data into R is your goal, it is hard to believe there is any function significantly faster than data.table's fread.  That thing is a speed demon.
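For anyone who wants to check this on their own machine, here is a rough timing sketch. The helper name `compare_readers` is made up for illustration, it assumes the data.table package is installed, and the numbers will vary with the file, the disk, and OS file caching.

```r
# Hypothetical helper (not part of datadr or data.table):
# times base R's read.csv against data.table's fread on the same file.
compare_readers <- function(path) {
  # Elapsed wall-clock seconds for the base R reader
  base_time <- system.time(base_df <- utils::read.csv(path))[["elapsed"]]
  # Elapsed wall-clock seconds for fread (requires data.table)
  fread_time <- system.time(dt <- data.table::fread(path))[["elapsed"]]
  c(read.csv = base_time, fread = fread_time)
}
```

Calling `compare_readers("mydata.csv")` (a placeholder file name) returns the two elapsed times in seconds. The second read may benefit from the file already sitting in the OS cache, so it is worth running the comparison in both orders.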

Ryan Hafen

Oct 24, 2015, 2:02:13 PM
to Jeremiah Rounds, Tessera-Users, ross.da...@gmail.com
Hi Ross,

Not an embarrassing question at all!  If your data is small enough to read in one go, then it is much faster to use fread or similar to read it into memory, cast it as a ddf, and, if you like, convert it to a local disk ddf.  We should probably document that better.  The reason drRead.csv for local disk works the way it does is that it anticipates that your input file might be too large to read in all at once, so it reads the file in blocks.
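A minimal sketch of the two routes described above, assuming the datadr and data.table packages are installed. The function names `small_csv_to_ddf` and `large_csv_to_ddf` and all paths are hypothetical; argument names such as `autoYes` and `rowsPerBlock` follow the datadr documentation and should be checked against `?localDiskConn` and `?drRead.csv` for your installed version.

```r
# Hypothetical wrapper: the file fits in memory, so read it in one go.
small_csv_to_ddf <- function(csv_path, out_dir) {
  # fread returns a data.table by default; ask for a plain data frame
  df <- data.table::fread(csv_path, data.table = FALSE)
  # Cast the in-memory data frame as a distributed data frame (ddf)
  d <- datadr::ddf(df)
  # Optionally persist it as a local disk ddf
  datadr::convert(d, datadr::localDiskConn(out_dir, autoYes = TRUE))
}

# Hypothetical wrapper: the file is too large to read at once,
# so let drRead.csv read it in blocks straight to a local disk connection.
large_csv_to_ddf <- function(csv_path, out_dir, rows_per_block = 50000) {
  datadr::drRead.csv(
    csv_path,
    output = datadr::localDiskConn(out_dir, autoYes = TRUE),
    rowsPerBlock = rows_per_block
  )
}
```

On a single drive both routes read the file sequentially. The block-wise route does not make the disk faster; it only keeps the amount of data held in memory at any one time bounded by the block size.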

Ryan


