iryoung
Nov 20, 2009, 4:07:15 AM
to cascading-user
TempHfs input paths are ignored by Hadoop when the path name starts with
a non-word character
Hello there,
I ran into an issue that I hope can be fixed.
Flows fail when a pipe's name starts with a non-word character, because
the path name of the TempHfs is built from the pipe name.
I tried to figure out what is going on inside Cascading and Hadoop.
When the path name for a TempHfs is built, every run of non-word
characters is replaced with an underscore. A path name that starts with
an underscore is filtered out by Hadoop's
FileInputFormat.hiddenFileFilter, so the job fails with an
InvalidInputException and a message such as "Input path does not exist:
file/home/simple/build/test/_hello_7549_1057656".
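To make the failure easier to see, here is a minimal sketch of the behaviour described above. It assumes the current replacement character is an underscore, as described, and `isHidden` is only a local stand-in for the rule Hadoop's hidden-file filter applies (names starting with "_" or "."). The class and method names are made up for this illustration and are not part of Cascading or Hadoop.

```java
public class TempPathDemo {
    // Assumed current behaviour, per the description above: each run of
    // non-word characters is replaced with a single underscore.
    static String sanitize(String name) {
        return name.replaceAll("[\\W\\s]+", "_");
    }

    // Local stand-in for Hadoop's hidden-file rule: names that start
    // with "_" or "." are skipped when input paths are listed.
    static boolean isHidden(String pathName) {
        return pathName.startsWith("_") || pathName.startsWith(".");
    }

    public static void main(String[] args) {
        // The last name starts with two Hangul characters, written as
        // Unicode escapes. Java's \W treats them as non-word characters
        // by default.
        String[] names = { "hello", "!hello", "\uD55C\uAE00 pipe" };
        for (String name : names) {
            String dir = sanitize(name);
            System.out.println(dir + " hidden=" + isHidden(dir));
        }
    }
}
```

Only the first name survives the filter. "!hello" becomes "_hello" and the Korean name becomes "_pipe", and both are treated as hidden.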
For people like me who live in South Korea, using Korean for pipe names
is really useful, but right now it does not work without a dummy
prefix. :(
So I hope this issue can be fixed. There are many ways to solve it,
but I think changing Hfs.makeTemporaryPathDir like this is the
simplest one:
    protected String makeTemporaryPathDir( String name )
      {
      return name.replaceAll( "[\\W\\s]+", "W" ) + Integer.toString( (int) ( 10000000 * Math.random() ) );
      }
Other options are to add a new function for making the temporary path
for TempHfs, or to remove hiddenFileFilter, and so on.
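As a quick check of the first option, the sketch below runs the proposed replacement on a few names and tests the result against the same "_" or "." prefix rule. `TempPathFixDemo` and `isHidden` are hypothetical names used only here, and `isHidden` is a local approximation of Hadoop's filter, not the real one.

```java
public class TempPathFixDemo {
    // The replacement proposed above: runs of non-word characters become
    // "W" instead of an underscore, followed by a random numeric suffix.
    static String makeTemporaryPathDir(String name) {
        return name.replaceAll("[\\W\\s]+", "W")
            + Integer.toString((int) (10000000 * Math.random()));
    }

    // Local approximation of the hidden-file rule ("_" or "." prefix).
    static boolean isHidden(String pathName) {
        return pathName.startsWith("_") || pathName.startsWith(".");
    }

    public static void main(String[] args) {
        String[] names = { "!hello", "\uD55C\uAE00 pipe", "_already" };
        for (String name : names) {
            String dir = makeTemporaryPathDir(name);
            System.out.println(dir + " hidden=" + isHidden(dir));
        }
    }
}
```

With this change the first two names start with "W" and are no longer hidden. A name that already begins with an underscore is still hidden, because "_" counts as a word character and is left alone by the pattern.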
Thanks.
P.S. I think Cascading is really useful. Thank you all very much,
especially ckw :)