cs402pku
[Question] How can I get the offset of a word?
张华祥
Jul 20, 2014, 4:17:32 AM
to cs40...@googlegroups.com
As the title says.
郭行健
Jul 20, 2014, 4:56:49 AM
to cs40...@googlegroups.com
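With the default TextInputFormat, the Mapper's key (of type LongWritable) holds the byte offset of the current line within the file. A word's offset can then be found by counting byte by byte from that starting point... although I think the line offset alone is enough.
On Sunday, July 20, 2014 at 4:17:32 PM UTC+8, 张华祥 wrote:
As the title says.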
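The byte-by-byte counting described above could look roughly like the following. This is only a sketch under assumptions that are not in the thread: the input is UTF-8, words are separated by whitespace, and `lineOffset` is the value carried by the mapper key. The class and method names (`WordByteOffsets`, `wordOffsets`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class WordByteOffsets {
    // Number of bytes a single code point occupies in UTF-8 (assumed encoding).
    private static int utf8Length(int codePoint) {
        if (codePoint < 0x80) return 1;
        if (codePoint < 0x800) return 2;
        if (codePoint < 0x10000) return 3;
        return 4;
    }

    // Returns the byte offset, relative to the start of the file, of every
    // whitespace-separated word in the line. lineOffset is the byte offset
    // of the line itself (hypothetically, the mapper key).
    static List<Long> wordOffsets(long lineOffset, String line) {
        List<Long> offsets = new ArrayList<>();
        long bytePos = lineOffset;   // byte position of the current character
        boolean inWord = false;
        int i = 0;
        while (i < line.length()) {
            int cp = line.codePointAt(i);
            boolean space = Character.isWhitespace(cp);
            if (!space && !inWord) {
                offsets.add(bytePos); // first character of a new word
            }
            inWord = !space;
            bytePos += utf8Length(cp);
            i += Character.charCount(cp);
        }
        return offsets;
    }
}
```

Counting in bytes rather than in Java `char` positions matters as soon as the line contains non-ASCII text, because the line offset itself is measured in bytes.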
杨博文
Jul 20, 2014, 7:53:56 AM
to cs40...@googlegroups.com
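The default InputFormat is TextInputFormat, which can be viewed as a FileInputFormat<LongWritable, Text>.
By default, the LongWritable is the offset of the line relative to the start of the file, and the Text is the content of that line.
In map(Object key, Text value, Context context), key.toString() gives you that offset.
On Sunday, July 20, 2014 at 4:17:32 PM UTC+8, zinc wrote:
As the title says.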
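To make the meaning of that key concrete, here is a plain-Java sketch with no Hadoop dependency that pairs each line of a small input with the byte offset at which it starts. It only imitates the (offset, line) pairs described above and is not Hadoop's actual record reader. It assumes UTF-8 text with `\n` line endings, and `LineOffsetDemo` and `linesByOffset` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class LineOffsetDemo {
    // Maps the starting byte offset of each line to the line's text,
    // in file order. Only '\n' is treated as a line terminator (assumption).
    static Map<Long, String> linesByOffset(byte[] data) {
        Map<Long, String> out = new LinkedHashMap<>();
        int start = 0; // byte index where the current line begins
        for (int i = 0; i <= data.length; i++) {
            boolean atEnd = (i == data.length);
            if (atEnd || data[i] == '\n') {
                // Skip the empty "line" that follows a trailing newline.
                if (!(atEnd && start == i)) {
                    String line = new String(data, start, i - start, StandardCharsets.UTF_8);
                    out.put((long) start, line);
                }
                start = i + 1;
            }
        }
        return out;
    }
}
```

For the input `"foo bar\nbaz\n"`, the first line starts at byte 0 and the second at byte 8, since the first line takes 7 bytes plus 1 for the newline.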
Haoyan Huo
Jul 20, 2014, 9:26:49 PM
to cs40...@googlegroups.com
java.util.regex.Matcher.start() does exactly this.
On Sunday, July 20, 2014 4:17:32 PM UTC+8, zinc wrote:
As the title says.
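A minimal sketch of the `Matcher.start()` approach follows. The pattern `\S+` (runs of non-whitespace) and the names `RegexWordOffsets` and `wordsWithOffsets` are assumptions chosen for illustration. Note that `Matcher.start()` returns an index in Java `char` units, so adding it to a byte-based line offset is only exact when the line is plain ASCII.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexWordOffsets {
    // A "word" here is any run of non-whitespace characters (assumption).
    private static final Pattern WORD = Pattern.compile("\\S+");

    // Returns entries of the form "word@offset", where offset is
    // lineOffset plus the char index at which the word starts in the line.
    static List<String> wordsWithOffsets(long lineOffset, String line) {
        List<String> out = new ArrayList<>();
        Matcher m = WORD.matcher(line);
        while (m.find()) {
            out.add(m.group() + "@" + (lineOffset + m.start()));
        }
        return out;
    }
}
```

For example, `wordsWithOffsets(8, "to be or")` yields `to@8`, `be@11`, and `or@14`.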
zinc
Jul 21, 2014, 5:23:07 AM
to cs40...@googlegroups.com
Really useful, thanks.
On Sunday, July 20, 2014 at 6:26:49 PM UTC-7, Haoyan Huo wrote: