I have an OpenOffice.org doc that contains a load of text and tables,
all styled up (little hand formatting) from my tutorial at PyCon this
year. I'm interested in turning it into documentation for the packages I
covered and I was wondering if anyone knew of a way of automating this?
I'm interested in the options for doing this as a one-off export but
also interested if anyone has found a way to do this as an ongoing thing?
Hmmm, I guess what I'm really interested in is a wysiwig ReST editor. Is
one of those in existence anywhere?
cheers,
Chris
--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk
I've had some luck converting XHTML to reST with well... xhtml2rest[1]
in the past. You probably would have to push OOo's output through tidy
first to make it work, unless its output has become much better since
I last used it. I think it is all going to depend on how much styling
you wish to keep from the OOo file how successful this is going to be.
Also, don't forget that ODT is only a bunch of XML files in a zip, you
could probably hack a quick generator with XSLT at a push.
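To make the "XML files in a zip" point concrete, here is a minimal sketch using only Python's standard library. The helper names `odt_members` and `read_content_xml` are made up for illustration; the only thing assumed about the file is that it is a normal ODT, in which `content.xml` holds the document body.

```python
import zipfile


def odt_members(odt_file):
    """Return the names of the files stored inside an ODT archive.

    odt_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(odt_file) as archive:
        return archive.namelist()


def read_content_xml(odt_file):
    """Return the raw bytes of content.xml, which holds the document body."""
    with zipfile.ZipFile(odt_file) as archive:
        return archive.read("content.xml")
```

Calling `odt_members("tutorial.odt")` on a real document should list entries such as `mimetype`, `content.xml`, `styles.xml` and `META-INF/manifest.xml`; the bytes from `read_content_xml` are what you would feed to XSLT or an XML parser.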
And I've just found odt2rst[2] in The Google by guessing the search
term, but I've never used it.
Thanks,
James
1. http://docutils.sourceforge.net/sandbox/xhtml2rest/
2. http://code.google.com/p/odt2rst/
I only know of converters that may get you started. It would be a
terrific addition to docutils/sphinx:
* PyODConverter: http://www.artofsolving.com/opensource/pyodconverter
=> requires openoffice install
=> I used it here: Nautilus-Skript zur Konvertierung von OpenOffice
Dokumenten (Nautilus script for converting OpenOffice documents):
http://forum.ubuntuusers.de/post/850846/
(in German, but the bash script is readable on its own)
* on the net I also found:
http://opendocumentfellowship.com/development/projects/odfpy
> Hmmm, I guess what I'm really interested in is a wysiwig ReST editor. Is
> one of those in existence anywhere?
* Gedit has a good reST scheme:
http://textmethod.com/wiki/ReStructuredTextToolsForGedit
Unfortunately, PyDev doesn't support it yet.
At PyCon, I've spoken with people who'd also like to have Sphinx output ODT.
It seems that someone has to do something about that situation ;)
> Hmmm, I guess what I'm really interested in is a wysiwig ReST editor. Is
> one of those in existence anywhere?
I don't know of one. It seems to me that while it is hard to get right, it
is also not as useful for reST as for many other heavy markup languages.
Georg
Well, it's kind of the opposite... The one thing that does my head in
with Sphinx is that it feels pretty insane to me, when we have good GUI
tools like <insert favourite word processor here>, that we're going back
to manually hacking plain text and including the nastiness that ReST
requires by hand...
>> Hmmm, I guess what I'm really interested in is a wysiwig ReST editor. Is
>> one of those in existence anywhere?
>
> I don't know of one. It seems to me that while it is hard to get right, it
> is also not as useful for reST as for many other heavy markup languages.
ReST *is* a heavy markup language ;-)
Hmm, take 2 on a response, after having a bit more of a think...
So, wouldn't it be cool if you could
- check out some ReST documentation
- run rest2odt to turn it into a .odt
- edit in OOo
- run odt2rest to turn it back into ReST
- check in the docs
...where "rest" might be Sphinx-specific?
That way we get ReST goodness and all the tools Sphinx supplies and a
nice wysiwig editing environment...
I guess I'd need to get to know Sphinx much better before I attempted
implementing any of this, but do people feel this would be a "good
thing" and/or possible/easy to implement?
cheers,
Chris
PS: The following libraries look relevant:
http://pypi.python.org/pypi/appy.pod/0.2.1
http://pypi.python.org/pypi/relatorio/0.5.0
http://opendocumentfellowship.com/development/projects/odfpy
http://www.rexx.com/~dkuhlman/odtwriter.html
Has anyone used any of these?
The other side seems harder... Has anyone found anything in python for
parsing a .odt back into python? Once it's back in "python", are there
libraries for writing sphinx-ish ReST?
One thing to consider, Chris, is that once the document has been edited it needs
to be processed by Sphinx. I'm sure this can be handled by a simple script
that either polls a directory at a set interval or is called as a subprocess,
being able to tell whether the output is one of three.
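A rough sketch of the polling variant, in case it helps. Everything here is an assumption for illustration: the function names, the `.rst` suffix, the five second interval, and that `sphinx-build` is on the PATH. It only rebuilds when a source file is added, removed or modified.

```python
import os
import subprocess
import time


def snapshot(source_dir, suffix=".rst"):
    """Map each matching file under source_dir to its modification time."""
    mtimes = {}
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            if name.endswith(suffix):
                path = os.path.join(root, name)
                mtimes[path] = os.stat(path).st_mtime_ns
    return mtimes


def build_command(source_dir, output_dir):
    """Command line for an HTML build; assumes sphinx-build is on PATH."""
    return ["sphinx-build", "-b", "html", source_dir, output_dir]


def watch(source_dir, output_dir, interval=5.0):
    """Poll source_dir forever and rebuild whenever a source file changes."""
    seen = snapshot(source_dir)
    while True:
        time.sleep(interval)
        current = snapshot(source_dir)
        if current != seen:
            seen = current
            # The exit status tells the caller whether the build succeeded.
            status = subprocess.call(build_command(source_dir, output_dir))
            print("sphinx-build exited with status", status)
```

Running `watch("docs", "/var/www/docs")` from an account that can write to the document root would cover the permissions issue as well; both paths are placeholders.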
Secondly, since the Sphinx docs in my setup are served from an Apache document
root directory, there are permissions to consider as well.
Thanks for your work on squishdot. Enjoyed that while it lasted. No longer
play with zope/plone and friends as they moved way past my requirements.
Sphinx is nice and light, relatively simple and very easy to learn, a few
nigglies to sort out but with time I'm sure it will get as close to
perfection as needed. Hope it stays light however.
OOo is a very, very heavy editor, not to discourage you. Check out
http://code.google.com/p/ulipad/ as it's a very nice, fully functional GUI for
reST documents. All that is missing is a post-editing processor, i.e. a few
scripts to compile the docs and move them to a predetermined directory.
Best regards and best of luck
--
/ch
Personally I fail to see how it would add anything to editing other
than confusion, some examples follow.
What should happen when users have selected styling options? Say
a user has selected Arial 24 to use in a heading instead of using the
semantic option and selecting headline2(or whatever your word processor
uses) how would you cope with that during export? If a user has chosen
to make half a line green how should that be treated? How should
typeface changes be treated? What happens when a word is double
underlined in a paragraph?
Should non-semantic styling just be dropped on the floor during
conversion, or would you shove stylistic attributes in to a reST comment
so the transform could be two-way? I'm not trying to pile on the stop
motion with my comments here, just thinking about what you suggested.
My editor shows graphically bold, underline, headings, and such. It
allows me to jump back and forth between link text and link definitions
with a keystroke, etc. It tells me when I've made a reST formatting
error, and takes me to it. That is definitely good enough for me, but
yeah I can imagine some people would like more mouse oriented options.
I'd hazard a guess that many of those people who aren't satisfied with
their current tools could be satisfied with a web based editor as has
been discussed here before, and it wouldn't need all the hassle of
training users not to use half the functionality of their word processor
that can't be expressed in reST(and I'd argue thankfully so).
> PS: The following libraries look relevant:
> Has anyone used any of these?
> The other side seems harder... Has anyone found anything in python for
> parsing a .odt back into python? Once it's back in "python", are there
> libraries for writing sphinx-ish ReST?
It's only XML, so basically any XML tools will do: ElementTree if you're
going to munge it with Python, XSLT if you just want to push it through
a filter into some other format.
Thanks,
James
I've kinda swung round to thinking about OOo again...
Yes, it's a heavy editor, that's why I want to use it. I *want* spell
checking, I want a UI that helps me rather than having to do everything
by hand.
I *don't* expect any conversion script I wrote would work with
everything-you-can-stick-in-an-odt. It would handle what ReST is capable
of handling, at best, and barf on other stuff.
Maybe one day ;-)
Chris
Ignore it, maybe issuing a warning.
> If a user has chosen
> to make half a line green how should that be treated?
Ignore it, issue a warning.
> How should
> typeface changes be treated?
Ignore it, issue a warning.
> What happens when a word is double
> underlined in a paragraph?
Ignore it, issue a warning.
> Should non-semantic styling just be dropped on the floor during
> conversion, or would you shove stylistic attributes in to a reST comment
> so the transform could be two-way?
I would ignore it and issue a warning when going from ODT->ReST. I suspect
ReST is a small subset of what an ODT can handle, so I doubt going ReST->ODT
would be a problem, although special consideration would likely be
needed for ReST- and Sphinx-specific stuff like auto-indexes, etc.
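As a sketch of what "ignore it and issue a warning" could look like for inline formatting: the function `span_to_rest` and the `REST_MARKUP` table below are hypothetical, and the property names only mirror the ODF attributes `fo:font-weight` and `fo:font-style`. Anything without a reST equivalent is dropped with a warning.

```python
import warnings

# Formatting that has a direct inline reST equivalent. The property names
# mirror ODF attributes (fo:font-weight, fo:font-style); the table itself
# is an illustrative assumption, not a complete mapping.
REST_MARKUP = {
    ("font-weight", "bold"): "**",
    ("font-style", "italic"): "*",
}


def span_to_rest(text, properties):
    """Render one run of text, warning about formatting reST cannot express.

    properties is a dict such as {"font-weight": "bold", "color": "#00ff00"}.
    """
    marker = ""
    for key, value in sorted(properties.items()):
        if (key, value) in REST_MARKUP:
            if marker:
                # reST inline markup cannot be nested, so keep the first match.
                warnings.warn("dropping %s=%s on %r: cannot nest inline markup"
                              % (key, value, text))
            else:
                marker = REST_MARKUP[(key, value)]
        else:
            warnings.warn("ignoring %s=%s on %r" % (key, value, text))
    return marker + text + marker
```

So bold text comes back wrapped in `**`, while a half-green line comes back as plain text plus one warning. Whether a given property should warn or barf outright is a policy decision this sketch does not try to settle.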
> My editor shows graphically bold, underline, headings, and such. It
> allows me to jump back and forth between link text and link definitions
> with a keystroke, etc. It tells me when I've made a reST formatting
> error, and takes me to it. That is definitely good enough for me, but
> yeah I can imagine some people would like more mouse oriented options.
Which editor do you use?
> I'd hazard a guess that many of those people who aren't satisfied with
> their current tools could be satisfied with a web based editor as has
> been discussed here before, and it wouldn't need all the hassle of
> training users not to use half the functionality of their word processor
> that can't be expressed in reST(and I'd argue thankfully so).
Training is pretty simple when the ODT->ReST script ignores what it
safely can while issuing warnings and plain barfing on the rest.
> It's only XML so basically any XML tools.
"only XML" - you're a funny guy ;-)
Chris
* Chris Withers (ch...@simplistix.co.uk) wrote:
> Ignore it, maybe issuing a warning.
> Ignore it, issue a warning.
> Ignore it, issue a warning.
> Ignore it, issue a warning.
That is why I replied initially: the only way I could see to do this
conversion was to ignore many of the user's settings and issue tonnes
of warnings. The outcome is that the user either:
a) has to look at Sphinx output with no headings, and therefore no TOCs,
broken intra-document links, no text styling, and all the
various other ignored properties, or
b) has to fire up the word processor and switch all the WYSIWYG options
they've set to the required WYDefineIWYG options.
> > My editor shows graphically bold, underline, headings, and such. It
> > allows me to jump back and forth between link text and link definitions
> > with a keystroke, etc. It tells me when I've made a reST formatting
> > error, and takes me to it. That is definitely good enough for me, but
> > yeah I can imagine some people would like more mouse oriented options.
>
> Which editor do you use?
vim most of the time, but I'd expect similar functionality in any
other editor really. To your other email, using vim as the example
again, spell checking is standard and a toolbar button for bold would be
added with:
    :imenu icon=<some.png> Toolbar.bold <command>
and ":vmenu" if you want to support mouse highlighted text.
> > It's only XML so basically any XML tools.
>
> "only XML" - you're a funny guy ;-)
I'm not sure I see the problem really, in the case of ODT it is well
defined and nicely namespaced XML. A glance at a local ODT here shows
you can parse headings and paragraphs out with a f-ugly 2 minute script:
    import sys
    import textwrap
    import zipfile
    from xml.etree import ElementTree as ET

    def ns_elem(name, prefix="text"):
        # Build a namespace-qualified ODF name, e.g. {urn:...:text:1.0}p
        return "{urn:oasis:names:tc:opendocument:xmlns:%s:1.0}%s" % (prefix, name)

    # reST underline characters, indexed by the digit in the style name (P1..P3)
    HEADERS = (None, "=", "-", "'")

    with zipfile.ZipFile(sys.argv[1]) as odt:
        doc = ET.parse(odt.open("content.xml"))
    body = doc.find(".//" + ns_elem("text", "office"))
    for p in body.findall(ns_elem("p")):
        style = p.get(ns_elem("style-name"))
        if p.text:
            if style in ("P1", "P2", "P3"):
                # Heading: the text, then an underline of the same length
                print(p.text)
                print(HEADERS[int(style[1])] * len(p.text))
                print()
            else:
                print(textwrap.fill(p.text))
        else:
            print()
Of course, it breaks down on other files because their authors have set
different styling options(like using P2 for paragraphs) but I expected
that. My point is the parsing[1] is simple, especially because of the
XML, but you can't trust the styling. I will add the caveat that you're
going to need a real XML parser, because of the extensive namespace
usage.
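For comparison, here is a sketch that keys off structure rather than style names. It assumes the author applied real heading styles, which OOo stores as `text:h` elements carrying a `text:outline-level` attribute; hand-formatted "Arial 24" headings would still come out as plain paragraphs. The function name and the underline characters are arbitrary choices.

```python
from xml.etree import ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
NS = {"text": TEXT_NS, "office": OFFICE_NS}

# Underline characters per outline level; the choice of characters is arbitrary.
UNDERLINES = {1: "=", 2: "-", 3: "'"}


def content_to_rest(content_xml):
    """Convert the headings and paragraphs of a content.xml string to reST."""
    root = ET.fromstring(content_xml)
    body = root.find(".//office:text", NS)
    blocks = []
    for elem in body:
        # itertext() also collects text nested inside child elements such as
        # text:span, which elem.text alone would miss.
        text = "".join(elem.itertext()).strip()
        if not text:
            continue
        if elem.tag == "{%s}h" % TEXT_NS:
            level = int(elem.get("{%s}outline-level" % TEXT_NS, "1"))
            underline = UNDERLINES.get(level, "~")
            blocks.append(text + "\n" + underline * len(text))
        elif elem.tag == "{%s}p" % TEXT_NS:
            blocks.append(text)
    return "\n\n".join(blocks) + "\n"
```

Lists, tables and anything else that is not a `text:h` or `text:p` directly under the body are silently skipped here, so this is a starting point rather than a converter.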
Thanks,
James
1. And if I was doing this properly I'd write a parser not a hack like
above, but it worked in IPython and that's what matters :)
I don't agree. If you're roundtripping ReST, then whatever tool produces the
ODT will give you an ODT that will re-import.
If you're starting from scratch, I'm imagining a ReST template in ODT
format that helps you "do the right thing". Suitable styles, etc,
should make it easy.
> a) has to now look at Sphinx output with no headings therefore no TOCs
> and broken intra-document links, no text styling, or all the
> various other ignored properties
Who said anything about this? The above all sound like things that could
be roundtripped...
> b) fire up the word processor and switch all the WYSIWYG options they've
> set to the required WYDefineIWYG options.
I don't understand what you're saying here...
> vim most of the time, but I'd expect similar functionality in any
> other editor really. To your other email, using vim as the example
> again, spell checking is standard and a toolbar button for bold would be
> added with:
Is there a ReST mode for emacs?
>>> It's only XML so basically any XML tools.
>> "only XML" - you're a funny guy ;-)
>
> I'm not sure I see the problem really, in the case of ODT it is well
> defined and nicely namespaced XML. A glance at a local ODT here shows
> you can parse headings and paragraphs out with a f-ugly 2 minute script:
...yes, fugly. "Only" when applied to anything as complex as ReST or ODT
is a bit of a joke...
Yes, rst.el; it comes with emacs.
Thanks,
James
> Is there a ReST mode for emacs?
Not only for emacs...
Docutils maintains a list of ReST supporting editors (and other relevant
links) at the `Docutils Link List`__.
__ http://docutils.sourceforge.net/docs/user/links.html
Günter