
dfs.tempdir doesn't work?


Michael Smith

Jul 19, 2013, 10:23:12 PM
to rha...@googlegroups.com
Hi, 

There seems to be a problem with rmr.options(dfs.tempdir). When I run the following code, the data gets saved to hdfs://tmp instead of hdfs://user/me/tmp-RHadoop, although the latter directory exists and has the necessary permissions. Furthermore, after closing the R session, the temporary directory (e.g. hdfs://tmp/RtmpbhGuGa) doesn't get deleted although it is no longer needed, so repeating this many times will clutter hdfs://tmp.

library("rmr2")
rmr.options(dfs.tempdir = "/user/me/tmp-RHadoop")
big.data.object <- to.dfs(1:12)

Thanks!

Antonio Piccolboni

Jul 19, 2013, 11:18:55 PM
to RHadoop Google Group
Hi Michael,
Thanks for the report, I just pushed a fix to master. Too bad we have just released 2.2.2. It looks like something odd happened during a merge, because a whole function appears to have been reverted to its state before this feature was added.


Antonio  






Michael Smith

Jul 20, 2013, 12:05:56 AM
to rha...@googlegroups.com, ant...@piccolboni.info
Hi Antonio, 

Thanks for responding so quickly (as usual). It works for me in master.

M

Antonio Piccolboni

Jul 20, 2013, 10:12:12 AM
to RHadoop Google Group
Does the deletion on exit also work?


Antonio

Antonio Piccolboni

Jul 20, 2013, 8:36:08 PM
to RHadoop Google Group
The default doesn't work for me anymore. It appears that the directory returned by tempdir() during package init doesn't exist later on, and tempdir() then returns something else. Does that happen to you too? Thanks


Antonio

Michael Smith

Jul 22, 2013, 10:31:11 AM
to rha...@googlegroups.com, ant...@piccolboni.info
You're right. When I use the default, it does clean up the temporary file, but it does not clean up the temporary directory after closing R. For example, after closing R, the directory hdfs://tmp/Rtmp7JDOsb still exists on HDFS (although the file that was contained in it has been cleaned up, i.e. deleted).

Furthermore, I have noticed that the default always uses the same directory name (e.g. hdfs://tmp/Rtmp7JDOsb in my case), which should not happen. This is inconsistent with tempdir(), which should give a randomly named directory that differs from one R session to the next.

If the above issues could be fixed, then my custom example above should probably be modified like this to make it more consistent with the default behavior:

library("rmr2")
rmr.options(dfs.tempdir = paste0("/user/me/tmp-RHadoop/", basename(tempdir())))
big.data.object <- to.dfs(1:12)
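
As a small check of the tempdir() behavior discussed above, the following base R sketch (no rmr2 or Hadoop needed) suggests what to expect: tempdir() returns the same per-session directory on every call within one session, while tempfile() generates a fresh name each time. The variable names are illustrative only.

```r
# tempdir() is fixed for the lifetime of an R session.
first.call  <- tempdir()
second.call <- tempdir()
same.within.session <- identical(first.call, second.call)

# tempfile() generates a new candidate name on each call
# (it only returns a name, it does not create the file).
name.a <- tempfile(pattern = "Rtmp")
name.b <- tempfile(pattern = "Rtmp")
names.differ <- !identical(name.a, name.b)
```

So basename(tempdir()) changes between sessions but not between calls, which is why it works as a per-session subdirectory name in the example above.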

Antonio Piccolboni

Jul 22, 2013, 1:15:17 PM
to rha...@googlegroups.com, ant...@piccolboni.info
I tried a different fix: leaving the default as NULL and calling tempdir() later in the process. It's in master now if you want to try it out.
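
The idea of a NULL default that is resolved late can be sketched roughly as follows. This is not rmr2's actual source: the names opts, set.dfs.tempdir and get.dfs.tempdir are hypothetical, and the "/tmp" base is only assumed from the paths reported earlier in the thread.

```r
# Hypothetical options store; dfs.tempdir starts out unset (NULL).
opts <- new.env()
assign("dfs.tempdir", NULL, envir = opts)

# Hypothetical setter, standing in for rmr.options(dfs.tempdir = ...).
set.dfs.tempdir <- function(path) {
  assign("dfs.tempdir", path, envir = opts)
  invisible(path)
}

# Hypothetical getter: the default is computed when it is needed, so it
# reflects the current session's tempdir() rather than a value captured
# when the package was built or initialized.
get.dfs.tempdir <- function() {
  current <- get("dfs.tempdir", envir = opts)
  if (is.null(current)) {
    file.path("/tmp", basename(tempdir()))  # "/tmp" base is an assumption
  } else {
    current
  }
}
```

With this shape, a user-supplied value always wins, and the fallback can never point at a directory name left over from an earlier session.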

Michael Smith

Jul 22, 2013, 10:19:13 PM
to rha...@googlegroups.com, ant...@piccolboni.info
Thanks Antonio. 

For the default, it now really creates a randomly named subdirectory. But it still does not remove this directory after closing R (although it removes the files in this directory).

Also, I was wondering whether in line 31 of rmr2/pkg/R/mapreduce.R it would also make sense to omit the default argument (or set it to NULL). 

Antonio Piccolboni

Jul 26, 2013, 6:33:28 PM
to rha...@googlegroups.com, ant...@piccolboni.info


On Monday, July 22, 2013 10:19:13 PM UTC-4, Michael Smith wrote:
Thanks Antonio. 

For the default, it now really creates a randomly named subdirectory. But it still does not remove this directory after closing R (although it removes the files in this directory).

This is now issue #61. 


Also, I was wondering whether in line 31 of rmr2/pkg/R/mapreduce.R it would also make sense to omit the default argument (or set it to NULL). 

Absolutely, fixed in master.

Antonio

Michael Smith

Jul 31, 2013, 9:37:26 AM
to rha...@googlegroups.com, ant...@piccolboni.info
Thanks, I'll follow issue #61 on github. 
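
For readers wondering what removing the directory on exit could look like, here is a minimal base R sketch. It is not the fix for issue #61 and does not use rmr2: a local directory stands in for the HDFS temporary directory, and make.temp.area is a hypothetical helper. The only mechanism it relies on is reg.finalizer() with onexit = TRUE, which asks R to run the finalizer when the session ends.

```r
# Hypothetical helper: creates a randomly named directory and registers
# a finalizer that removes the directory itself, not just its contents.
make.temp.area <- function(base = tempdir()) {
  dir <- tempfile(pattern = "Rtmp", tmpdir = base)
  dir.create(dir)

  handle <- new.env()
  handle$dir <- dir
  # Local stand-in for deleting an HDFS directory.
  handle$cleanup <- function() unlink(dir, recursive = TRUE)

  # onexit = TRUE: also run the finalizer at the end of the R session,
  # not only when the handle is garbage collected.
  reg.finalizer(handle, function(e) e$cleanup(), onexit = TRUE)
  handle
}

area <- make.temp.area()
writeLines("x", file.path(area$dir, "part-00000"))
```

In the HDFS case the unlink() call would need to be replaced by whatever removes an HDFS directory in your setup; that part is left open here.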