How to use FileSplitter to read a huge file

chiranjeevi vasupilli

Sep 29, 2015, 4:15:07 AM
to apex-dev
I have a question about block size: how should I decide the block size?

As per my understanding, the number of blocks is:

noofBlocks = filesize / blocksize

With this scheme, some records may be split across two blocks, so when converting a block into records we don't have the complete data for every record in one block.
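
To make the arithmetic concrete, here is a minimal Python sketch of fixed-size splitting. It is not FileSplitter code: `block_ranges` is a hypothetical helper, the sample records are illustrative, and rounding the division up is an assumption so that a final partial block is counted.

```python
def block_ranges(file_size, block_size):
    """Return (start, end) byte offsets of fixed-size blocks; end is exclusive."""
    # Ceiling division: a final partial block still counts as a block.
    num_blocks = (file_size + block_size - 1) // block_size
    return [(i * block_size, min((i + 1) * block_size, file_size))
            for i in range(num_blocks)]

# Illustrative data: three 14-byte records (13 characters plus a newline).
data = b"1,bbb,ffdd222\n2,ccc,ggttt12\n3,ddd,dfdfdfd\n"

# A 20-byte block size does not line up with the 14-byte records.
blocks = [data[start:end] for start, end in block_ranges(len(data), 20)]
for block in blocks:
    print(block)
```

With these numbers the 42-byte input yields three blocks, and the second and third records each straddle a block boundary, which is exactly the situation described above.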

How should this scenario be handled?

Thanks in advance.

Thanks -Chiru 

Sandeep Deshmukh

Sep 29, 2015, 5:59:25 AM
to chiranjeevi vasupilli, apex-dev
If you need to respect record boundaries, you will need to handle them the way Hadoop's TextInputFormat does.
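
As a rough illustration of that idea, the Python sketch below follows the convention used by Hadoop's line reader: a block that does not start at offset 0 discards its first (possibly partial) line, and every block keeps reading until it has finished the line that begins at or before its end offset. The function name `read_block_records` and the in-memory `bytes` input are assumptions made for this example; this is not Apex or Hadoop code.

```python
import io
from typing import List

def read_block_records(data: bytes, start: int, end: int) -> List[bytes]:
    """Return the complete lines owned by the block [start, end)."""
    stream = io.BytesIO(data)
    stream.seek(start)
    if start != 0:
        # Discard the first line; the previous block reads it in full.
        stream.readline()
    records = []
    # A line belongs to this block if it begins at or before `end`,
    # so the reader may run past `end` to finish its last record.
    while stream.tell() <= end:
        line = stream.readline()
        if not line:  # end of data
            break
        records.append(line.rstrip(b"\n"))
    return records

data = b"1,bbb,ffdd222\n2,ccc,ggttt12\n3,ddd,dfdfdfd\n"
for start, end in [(0, 20), (20, 40), (40, 42)]:
    print((start, end), read_block_records(data, start, end))
```

The blocks stay fixed-size on disk; only the reader shifts ownership of a straddling record to the block in which it starts, so each record is emitted exactly once.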

Regards,
Sandeep

Sandeep Deshmukh

Sep 29, 2015, 6:02:19 AM
to chiranjeevi vasupilli, dev, apex-dev
Moving to dev@apex.

Regards,
Sandeep

chiranjeevi vasupilli

Sep 29, 2015, 8:34:30 AM
to apex-dev
Hi,

Can we split the file so that each block ends at the end of a line? For example, given this file:

1,bbb,ffdd222
2,ccc,ggttt12
3,ddd,dfdfdfd

Can we split the file into blocks like this?
block-1
1,bbb,ffdd222
2,ccc,ggttt12

block-2
3,ddd,dfdfdfd
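
One way to picture the layout above is the Python sketch below, which cuts each block at the first newline at or after a target size. It only illustrates the idea and says nothing about what FileSplitter itself supports; `line_aligned_blocks` and `target_size` are hypothetical names, and the input is assumed to fit in memory.

```python
from typing import List

def line_aligned_blocks(data: bytes, target_size: int) -> List[bytes]:
    """Split data into blocks of at least target_size bytes that end on a newline."""
    blocks = []
    start = 0
    while start < len(data):
        tentative_end = min(start + target_size, len(data))
        # Extend the block to the end of the line holding its last byte.
        newline = data.find(b"\n", tentative_end - 1)
        end = len(data) if newline == -1 else newline + 1
        blocks.append(data[start:end])
        start = end
    return blocks

data = b"1,bbb,ffdd222\n2,ccc,ggttt12\n3,ddd,dfdfdfd\n"
for number, block in enumerate(line_aligned_blocks(data, 20), start=1):
    print(f"block-{number}")
    print(block.decode(), end="")
```

With a 20-byte target this puts the first two records in block-1 and the third in block-2. The trade-off is that block sizes become uneven and the splitter has to look at the file contents to find the newlines, not just the file length.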

Thanks -Chiru