This script will convert files to PDF with OpenOffice.
Other formats (html, doc ...) should be possible, too.
Have a nice day,
thomas
# OpenOffice 1.1 comes with its own python interpreter.
# This script needs to be run with the python from OpenOffice.org:
# /opt/OpenOffice.org/program/python
# Start the Office before connecting:
# soffice "-accept=socket,host=localhost,port=2002;urp;"
#
# pyUNO Imports
import uno
from com.sun.star.beans import PropertyValue
# Python Imports
import os
import sys
# For a list of possible export formats see
# http://www.openoffice.org/files/documents/25/111/filter_description.html
# or
# /opt/OpenOffice.org/share/registry/data/org/openoffice/Office/TypeDetection.xcu
export_format="writer_pdf_Export"
export_extension="pdf"
def usage():
    print """Usage: %s in_dir out_dir
All files in in_dir will be opened with OpenOffice.org and
saved to out_dir
You must start the office with this line before starting
this script:
soffice "-accept=socket,host=localhost,port=2002;urp;"
""" % (os.path.basename(sys.argv[0]))
def do_file(file, desktop, out_url):
    # Load File
    file=os.path.abspath(file)
    url="file:///%s" % file
    properties=[]
    p=PropertyValue()
    p.Name="Hidden"
    p.Value=True
    properties.append(p)
    doc=desktop.loadComponentFromURL(
        url, "_blank", 0, tuple(properties))
    if not doc:
        print "Failed to open '%s'" % file
        return
    # Save File
    properties=[]
    p=PropertyValue()
    p.Name="Overwrite"
    p.Value=True
    properties.append(p)
    p=PropertyValue()
    p.Name="FilterName"
    p.Value=export_format
    properties.append(p)
    # Strip the extension; a file without one keeps its full name
    basename=os.path.splitext(os.path.basename(file))[0]
    url_save="%s/%s.%s" % (out_url, basename, export_extension)
    try:
        doc.storeToURL(
            url_save, tuple(properties))
    except:
        print "Failed while writing: '%s'" % file
    doc.dispose()
def main():
    if len(sys.argv)!=3:
        usage()
        sys.exit(1)
    in_dir=sys.argv[1]
    out_dir=sys.argv[2]
    out_url="file://%s" % os.path.abspath(out_dir)
    print out_url
    # Init: Connect to running soffice process
    context = uno.getComponentContext()
    resolver=context.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", context)
    try:
        ctx = resolver.resolve(
            "uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
    except:
        print "Could not connect to running openoffice."
        usage()
        # Exit with a non-zero status so callers can detect the failure
        sys.exit(1)
    smgr=ctx.ServiceManager
    desktop = smgr.createInstanceWithContext("com.sun.star.frame.Desktop",ctx)
    files=os.listdir(in_dir)
    files.sort()
    for file in files:
        print "Processing %s" % file
        file=os.path.join(in_dir, file)
        do_file(file, desktop, out_url)
if __name__=="__main__":
    main()
> This script will convert files to PDF with OpenOffice. Other formats
> (html, doc ...) should be possible, too.
Nice.
Is there any way to start soffice without a display?
Could be used to offer a webservice-documentconverter.
Murple
No, but there is a special X server (Xvfb, for example) which does not
need a real display. But I don't think that it is thread safe.
Untested:
You could start e.g. 5 office instances listening on 5
different ports and synchronize the access yourself.
Fortunately I only need it for batch processing.
thomas
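
The "several instances, synchronize yourself" idea could look roughly like the sketch below. It is untested against a real office and only shows the bookkeeping: a queue hands out ports so that no two workers talk to the same instance at once. OfficePool and uno_url are made-up names, and the port range is just an example; each soffice would have to be started with its own "-accept=...port=N..." option.

```python
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

# Same connection string as in the script, with the port left open.
UNO_URL = "uno:socket,host=localhost,port=%d;urp;StarOffice.ComponentContext"

class OfficePool(object):
    """Hands out the ports of already running soffice instances,
    one worker per instance at a time (hypothetical helper)."""

    def __init__(self, ports):
        self._free = queue.Queue()
        for port in ports:
            self._free.put(port)

    def acquire(self):
        # Blocks until some instance is free, then returns its port.
        return self._free.get()

    def release(self, port):
        # Give the instance back so another worker can use it.
        self._free.put(port)

def uno_url(port):
    """Build the string to pass to resolver.resolve() for this port."""
    return UNO_URL % port
```

A worker would call port = pool.acquire(), connect with resolver.resolve(uno_url(port)), convert its file, and call pool.release(port) in a finally block.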
Nice work!
One question though, doesn't this work only for OOo writer files (not
Draw or Calc files)?
I tried hacking it, and in the end, simply tried all output filters
that put out PDF in succession, until one worked.
I _know_ this is a hack, but I couldn't find any information on how to
use pyUNO to query OO and return what _type_ of file it had
opened.....
How _does_ one do that, anyway?
Thanks,
Duane
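
One way to find out what kind of document was opened is to ask the document which services it supports: loaded documents implement com.sun.star.lang.XServiceInfo, so doc.supportsService(...) is available on the object returned by loadComponentFromURL. A minimal sketch, assuming the usual PDF filter names for each application; pick_pdf_filter is a made-up helper name, and it only needs an object with a supportsService method:

```python
# Service name -> PDF export filter. The order matters: a presentation
# also reports the drawing service, so it has to be checked first.
PDF_FILTERS = (
    ("com.sun.star.text.TextDocument", "writer_pdf_Export"),
    ("com.sun.star.sheet.SpreadsheetDocument", "calc_pdf_Export"),
    ("com.sun.star.presentation.PresentationDocument", "impress_pdf_Export"),
    ("com.sun.star.drawing.DrawingDocument", "draw_pdf_Export"),
)

def pick_pdf_filter(doc):
    """Return the PDF filter name matching the document type,
    or None if the document is none of the types listed above."""
    for service, filter_name in PDF_FILTERS:
        if doc.supportsService(service):
            return filter_name
    return None
```

In do_file() the result could replace the fixed export_format when filling in the "FilterName" property, skipping the file if None comes back.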
properties = [PropertyValue() for i in range(3)]
And then fill it in:
properties[0].Name = ...
properties[0].Value = ...
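
The same idea can be wrapped in a small helper so the Name/Value pairs are written only once. This is just a sketch: make_properties is a made-up name, and the property class is passed in as an argument (in the script it would be PropertyValue from com.sun.star.beans) so the helper itself does not depend on pyUNO.

```python
def make_properties(property_class, **values):
    """Build the tuple expected by loadComponentFromURL / storeToURL
    from keyword arguments, one property object per keyword."""
    props = []
    for name in sorted(values):      # sorted only to get a stable order
        p = property_class()
        p.Name = name
        p.Value = values[name]
        props.append(p)
    return tuple(props)
```

In the script this would be used as make_properties(PropertyValue, Overwrite=True, FilterName=export_format).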