Hadoop In China
Re: How to handle input made up of a large number of small files
leon...@gmail.com
Jul 5, 2012, 7:40:47 PM
to hadoo...@googlegroups.com
The simple approach is probably to enable JVM reuse.
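For context, here is a rough sketch of why JVM reuse helps with many short map tasks. It assumes a Hadoop 1.x-era cluster, where the per-job setting is `mapred.job.reuse.jvm.num.tasks` (1 means no reuse, -1 means no limit). The task and slot counts are made-up illustrative numbers, the class and method names are hypothetical, and the model ignores scheduling details.

```java
public class JvmReuseEstimate {
    // Hadoop 1.x-era property name; assumed to match the cluster discussed here.
    static final String REUSE_KEY = "mapred.job.reuse.jvm.num.tasks";

    /**
     * Rough number of task JVMs launched for one job.
     * tasksPerJvm: 1 = no reuse, -1 = unlimited reuse, n > 1 = at most n tasks per JVM.
     * slots: hypothetical count of map slots that run concurrently.
     */
    static long estimateJvmLaunches(long tasks, int tasksPerJvm, int slots) {
        if (tasksPerJvm == 0 || tasksPerJvm < -1 || slots < 1) {
            throw new IllegalArgumentException("invalid tasksPerJvm or slots");
        }
        if (tasks <= 0) {
            return 0;
        }
        long busySlots = Math.min(tasks, slots); // at least one JVM per busy slot
        if (tasksPerJvm == -1) {
            return busySlots;
        }
        long launches = (tasks + tasksPerJvm - 1) / tasksPerJvm; // ceiling division
        return Math.max(launches, busySlots);
    }

    public static void main(String[] args) {
        long tasks = 10_000; // illustrative: one map task per small file
        int slots = 40;      // illustrative cluster capacity
        for (int perJvm : new int[] {1, 10, -1}) {
            System.out.println(REUSE_KEY + "=" + perJvm + " -> about "
                    + estimateJvmLaunches(tasks, perJvm, slots) + " JVM launches");
        }
    }
}
```

Reuse lowers the startup cost per task, but each small file still becomes its own map task, so it does not reduce scheduling overhead.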
-- Sent from my HP Veer
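zhang辉张 <zhang...@gmail.com> wrote on 2012-7-4 09:49:
Hi,
I've run into a problem: my input is a large batch of small files, and each map decodes one small file. The decoding itself is fast, but starting each map takes a significant amount of time.
How do you all handle input that consists of a large number of small files?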
--
You received this message because you are subscribed to the Google Groups "Hadoop In China" group.
To post to this group, send email to hadoo...@googlegroups.com.
To unsubscribe from this group, send email to hadooper_cn...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/hadooper_cn?hl=en.
zhang辉张
Jul 10, 2012, 7:24:30 AM
to hadoo...@googlegroups.com
Has anyone seen the error "ENOENT: No such file or directory" after turning on JVM reuse?
zhang辉张
Jul 13, 2012, 2:29:47 AM
to hadoo...@googlegroups.com
Has anyone used CombineFileInputFormat, and did you get an out-of-memory error when using it?
One directory holds roughly 60,000+ small files, each no larger than 10 MB.
feng lu
Jul 13, 2012, 10:06:47 PM
to hadoo...@googlegroups.com
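60,000 files at no more than 10 MB each still adds up to as much as 600 GB. Have you set maxSplitSize on CombineFileInputFormat? How many InputSplits does its getSplits return? Could the problem be that a single map task is processing too much data?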
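The arithmetic in this reply can be sketched as follows. It is only an estimate: it treats every file as the full 10 MB, ignores node and rack locality, and the candidate maxSplitSize values are hypothetical rather than anything recommended in the thread. The class and method names are made up for illustration.

```java
public class SplitEstimate {
    static final long MB = 1024L * 1024L;

    /** Total input size in bytes if every file is at the stated upper bound. */
    static long totalBytes(long fileCount, long maxFileBytes) {
        return fileCount * maxFileBytes;
    }

    /** Rough split count, assuming each combined split is filled to about maxSplitBytes. */
    static long estimateSplits(long totalBytes, long maxSplitBytes) {
        if (maxSplitBytes <= 0) {
            throw new IllegalArgumentException("maxSplitBytes must be positive");
        }
        return (totalBytes + maxSplitBytes - 1) / maxSplitBytes; // ceiling division
    }

    public static void main(String[] args) {
        long total = totalBytes(60_000, 10 * MB); // figures quoted in the thread
        System.out.println("Upper bound on input: " + (total / MB) + " MB");
        for (long splitMb : new long[] {64, 128, 256}) { // hypothetical settings
            System.out.println("maxSplitSize=" + splitMb + " MB -> about "
                    + estimateSplits(total, splitMb * MB) + " splits");
        }
    }
}
```

If no maximum is configured at all, the combined splits are not capped by size, which would fit the concern about one map task receiving too much data.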
2012/7/13 zhang辉张 <zhang...@gmail.com>
--
Don't Grow Old, Grow Up... :-)
zhang辉张
Jul 19, 2012, 8:04:57 AM
to hadoo...@googlegroups.com
How do I set maxSplitSize on CombineFileInputFormat?
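One possible answer, offered as an assumption rather than something confirmed in the thread: in Hadoop 1.x-era releases, CombineFileInputFormat falls back to the job property `mapred.max.split.size` (a value in bytes) when no maximum has been set in code, and a subclass can also call the protected `setMaxSplitSize(long)`. Later releases renamed the property, so check the documentation for your version. The helper below only builds the `-D` argument for jobs that accept generic options; its class and method names are hypothetical.

```java
public class MaxSplitSizeOption {
    // Hadoop 1.x-era property name (assumption; newer releases use a different key).
    static final String MAX_SPLIT_KEY = "mapred.max.split.size";

    /** Converts megabytes (1024 * 1024 bytes each) to bytes, the unit the property expects. */
    static long megabytesToBytes(long megabytes) {
        if (megabytes <= 0) {
            throw new IllegalArgumentException("megabytes must be positive");
        }
        return megabytes * 1024L * 1024L;
    }

    /** Builds a generic option of the form -Dkey=value for the job command line. */
    static String toGenericOption(String key, long value) {
        return "-D" + key + "=" + value;
    }

    public static void main(String[] args) {
        // 128 MB is an illustrative choice, not a value taken from the thread.
        System.out.println(toGenericOption(MAX_SPLIT_KEY, megabytesToBytes(128)));
    }
}
```

Passing a value in megabytes instead of bytes is an easy mistake to make and would produce extremely small splits.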
2012/7/14 feng lu <amuse...@gmail.com>
zhang辉张
Jul 19, 2012, 9:14:58 AM
to hadoo...@googlegroups.com
After setting maxSplitSize again, I get an I/O error.
The error log contains the following:
R/W/S=150/147/0 in:6=150/24 [rec/s] out:6=147/24 [rec/s]
What does this mean?
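The line looks like the record-count summary that Hadoop Streaming writes to task logs, though the thread does not confirm which tool produced it. Under that assumption, R/W/S would be records read, written, and skipped, and `in:6=150/24` would mean 150 records read over 24 seconds, about 6 per second. On that reading the line would be a progress report, not the cause of the I/O error itself. The parser below is a hypothetical helper that extracts those numbers.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StreamingStatusLine {
    // Matches lines shaped like: R/W/S=150/147/0 in:6=150/24 [rec/s] out:6=147/24 [rec/s]
    private static final Pattern LINE = Pattern.compile(
            "R/W/S=(\\d+)/(\\d+)/(\\d+) in:\\d+=\\d+/(\\d+) \\[rec/s\\] out:\\d+=\\d+/\\d+ \\[rec/s\\]");

    final long read;           // assumed meaning: records read
    final long written;        // assumed meaning: records written
    final long skipped;        // assumed meaning: records skipped
    final long elapsedSeconds; // denominator of the in/out rates

    StreamingStatusLine(long read, long written, long skipped, long elapsedSeconds) {
        this.read = read;
        this.written = written;
        this.skipped = skipped;
        this.elapsedSeconds = elapsedSeconds;
    }

    static StreamingStatusLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("unrecognized status line: " + line);
        }
        return new StreamingStatusLine(
                Long.parseLong(m.group(1)), Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)), Long.parseLong(m.group(4)));
    }

    /** Integer records per second, matching the truncated rate shown in the log. */
    long inputRate() {
        return elapsedSeconds == 0 ? 0 : read / elapsedSeconds;
    }

    long outputRate() {
        return elapsedSeconds == 0 ? 0 : written / elapsedSeconds;
    }

    public static void main(String[] args) {
        StreamingStatusLine s =
                parse("R/W/S=150/147/0 in:6=150/24 [rec/s] out:6=147/24 [rec/s]");
        System.out.println("read=" + s.read + " written=" + s.written
                + " skipped=" + s.skipped + " seconds=" + s.elapsedSeconds
                + " in=" + s.inputRate() + "/s out=" + s.outputRate() + "/s");
    }
}
```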
2012/7/19 zhang辉张 <zhang...@gmail.com>