Read Word document from Java


Harsh Chhabra

Jun 4, 2012, 5:26:04 AM
to iit...@googlegroups.com
Hey everyone, can you tell me how to read a Word document file in Java while keeping its formatting?
Please tell me the steps I need to follow to do so.

s|s

Jun 4, 2012, 6:48:55 AM
to iit...@googlegroups.com
Harsh,
You could explore the Apache POI project, available at http://poi.apache.org/
> --
> Mailing list guidelines and other related articles:
> http://lug-iitd.org/Footer
>


--
Supreet Sethi
Ph IN: +919811143517
Ph Skype: d_j_i_n_n
Profile: http://www.google.com/profiles/supreet.sethi
Twt: http://twitter.com/djinn

Harsh Chhabra

Jun 4, 2012, 7:29:21 AM
to iit...@googlegroups.com
I did that, but when I compile I get an error saying something like "you should use XSSF instead of HSSF". I tried a lot of things, including adding the jar files, but I am still not able to read a doc file.
Can you tell me the exact procedure to read a Word doc, if you have ever done it?

Biju Balakrishnan

Jun 4, 2012, 8:31:09 AM
to iit...@googlegroups.com

> I did that, but when I compile I get an error saying something like "you should use XSSF instead of HSSF".

You might be getting this error because you are trying to read a docx or xlsx file.

Try it with a doc or xls file; it might work.

Also, post what you have done so far.
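
A quick way to check which kind of file you actually have, before picking the HSSF/HWPF or the XSSF/XWPF side of POI, is to look at its first few bytes. The following is only a minimal sketch using the plain JDK; `WordFormatSniffer` and its `Format` enum are hypothetical names, not part of POI. Binary doc/xls files are OLE2 compound documents and docx/xlsx files are ZIP packages, so the check only narrows a file down to its container type.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper (not a POI class): guesses the container format from the leading bytes. */
final class WordFormatSniffer {

    /** OLE2 = binary doc/xls container, ZIP_OOXML = ZIP package such as docx/xlsx. */
    enum Format { OLE2, ZIP_OOXML, UNKNOWN }

    // Signature of an OLE2 compound document (used by binary .doc and .xls files).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature: 'P', 'K', 03, 04 (.docx and .xlsx are ZIP packages).
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Classifies a buffer holding the first bytes of a file. */
    static Format detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Format.OLE2;
        if (startsWith(header, ZIP_MAGIC)) return Format.ZIP_OOXML;
        return Format.UNKNOWN;
    }

    /** Reads up to eight bytes from the file and classifies them. */
    static Format detect(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (filled < header.length) {
                int read = in.read(header, filled, header.length - filled);
                if (read < 0) break; // file is shorter than the signature
                filled += read;
            }
        }
        return detect(Arrays.copyOf(header, filled));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

If `detect` reports `ZIP_OOXML` for a file you are opening with the HSSF or HWPF classes, that would be consistent with the error described above. Any ZIP archive matches the second signature, so treat the result as a hint rather than proof that the file is a docx.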

Biju Balakrishnan

Jun 4, 2012, 5:33:09 AM
to iit...@googlegroups.com
Harsh,
Could you be clearer about what you want?

On Mon, Jun 4, 2012 at 2:56 PM, Harsh Chhabra <harsh...@gmail.com> wrote:
> Hey everyone, can you tell me how to read a Word document file in Java while keeping its formatting?
> Please tell me the steps I need to follow to do so.

There are two APIs available for reading Word doc files in Java, and between them they do pretty much everything needed:
  • Apache POI
  • Tika
Did you try these?
If you did, what more do you need?


--
Biju

Harsh Chhabra

Jun 6, 2012, 9:05:44 PM
to iit...@googlegroups.com
Thanks, @biju.
Everything is working well now. I used two different POI components to read tables in doc and docx files:
HWPF for doc files and XWPF for docx files. Is there any way for one piece of code to work for both kinds of file?

Biju Balakrishnan

Jun 7, 2012, 1:31:46 AM
to iit...@googlegroups.com
Harsh,

On Thu, Jun 7, 2012 at 6:35 AM, Harsh Chhabra <harsh...@gmail.com> wrote:
> Thanks, @biju.
> Everything is working well now. I used two different POI components to read tables in doc and docx files:
> HWPF for doc files and XWPF for docx files. Is there any way for one piece of code to work for both kinds of file?
 
POI has no single API that supports both HWPF and XWPF.

Tika does, though. I tried getting the text from both doc and docx files, and it works. I used Tika's parseToString method, which reads the contents of both doc and docx files.
Try playing around with Tika and see if it serves your purpose.
 
--
Biju
