convert .doc to .rst

Renato Pontefice

Mar 1, 2016, 7:43:14 AM
to sphinx-users
Hi, I'm a very new Sphinx user.
I need to convert a bunch of .doc files (created in MS Word) to .rst.
The goal is to start using Sphinx (and a wiki) to write the user documentation for our procedure software.

I can't find a way to do that. Is it possible?

TIA

ChrisD

Mar 1, 2016, 8:40:48 AM
to sphinx...@googlegroups.com

 
> I need to convert a bunch of .doc (MS word) made by MS word, to .rst.

Pandoc (http://pandoc.org/) can convert docx to rst. I do not know of a tool to convert the older doc files to rst, but there are a number of tools for converting doc to docx, and then you could use Pandoc to convert to rst.

I haven't done this myself so don't know how good the conversion would be...
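
A minimal sketch of that two-step route (doc to docx, then docx to rst), driven from Python. It assumes LibreOffice's "soffice" binary is the doc-to-docx tool (any other converter would do) and that both "soffice" and "pandoc" are on the PATH. The function names and folder layout are hypothetical, chosen only for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def doc_to_docx_command(doc_path: Path, out_dir: Path) -> List[str]:
    """Build the LibreOffice command that converts a legacy .doc file to .docx."""
    # Assumption: LibreOffice is installed and "soffice" is on the PATH.
    return ["soffice", "--headless", "--convert-to", "docx",
            "--outdir", str(out_dir), str(doc_path)]


def docx_to_rst_command(docx_path: Path, rst_path: Path) -> List[str]:
    """Build the Pandoc command that converts a .docx file to reStructuredText."""
    return ["pandoc", "-f", "docx", "-t", "rst",
            "-o", str(rst_path), str(docx_path)]


def convert_folder(src_dir: Path, out_dir: Path) -> None:
    """Convert every .doc file in src_dir to an .rst file in out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for doc_path in sorted(src_dir.glob("*.doc")):
        # Step 1: .doc -> .docx, written into out_dir by LibreOffice.
        subprocess.run(doc_to_docx_command(doc_path, out_dir), check=True)
        docx_path = out_dir / (doc_path.stem + ".docx")
        rst_path = out_dir / (doc_path.stem + ".rst")
        # Step 2: .docx -> .rst with Pandoc.
        subprocess.run(docx_to_rst_command(docx_path, rst_path), check=True)
```

For example, convert_folder(Path("word_docs"), Path("source")) would process a hypothetical "word_docs" folder. The resulting .rst files will most likely still need a manual check, especially tables and images.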

Chris

Renato Pontefice

Mar 1, 2016, 9:46:52 AM
to sphinx...@googlegroups.com
With pandoc I can't find a way to convert docx to rst (only rst to docx) :-(

Martin Bless

Mar 1, 2016, 11:00:56 AM
to sphinx-users
Hello Renato,

In the TYPO3 community we had, and still have, a lot of OpenOffice documents. I've written ("hacked"!) a set of tools to convert those documents. That's the good news: yes, it is possible. The bad news: I haven't found the time yet to document how to use them.

Here is a toolchain and how it works:

Maybe this works as well (and would be much easier):

  • Save your Word document as a VALID XHTML file.
  • Run: "python ooxhtml2rst.py"

The basic idea of my toolset is that we take the xhtml format that OpenOffice writes as input for my converter.


Good luck, and leave a note at https://github.com/marble/T3PythonDocBuilderPackage/issues if you can make it work!


Martin


--

http://mbless.de

renato

Mar 1, 2016, 7:24:42 PM
to sphinx...@googlegroups.com
Hi Martin,
how can I save as XHTML?
From MS Word or from LibreOffice?

I exported to XHTML with LO 5.x (on a Linux PC), but what I got is a messy text file full of XML tags...
Did I do something wrong?

Renato

Michael F. Peintinger

Mar 1, 2016, 7:24:42 PM
to sphinx-users
Hi Renato,
here's a short blog entry on how I do it:
I hope this helps!
Cheers
Michael

ChrisD

Mar 1, 2016, 7:24:46 PM
to sphinx...@googlegroups.com
It works for me. :-)

I have no problems using pandoc to convert docx to html or rst. I'm using the latest pandoc version, 1.16.0.2.

Chris

ChrisD

Mar 1, 2016, 7:24:49 PM
to sphinx...@googlegroups.com
I posted this previously but it doesn't appear to have made it out to the group. I'll try again...

Renato Pontefice

Mar 3, 2016, 5:54:35 AM
to sphinx...@googlegroups.com
Great! I've gone through the whole chain that Michael describes in his tutorial. I have just two problems:
one image inside a table is not shown, and neither is another one that is inline with the text. Do I have to check the .rst? I mean: could the problem be in the conversion, or somewhere else?

Thanks

Renato
