Using single instance of LibreOffice to convert documents - is this safe?

Cliff

Mar 26, 2012, 10:02:58 PM
to web2py-users
I'm working on a form creation project. Because formatting is
critical and the users are accustomed to office suites, I am having
them use LibreOffice to create their form templates. Also,
LibreOffice has a Python interface which is quite easy to use.

Right now I'm using a single instance of LibreOffice running headless
and using it to process documents. I create a bridge to it within the
controller and talk to it across the bridge. I am a little concerned
about multiple users instantiating connections to LibreOffice at the
same time. I don't know if there will be unexpected results.

Does anyone have an opinion? Or maybe even experience in using
LibreOffice/OpenOffice in this way?

Thanks,
Cliff Kachinske

Wikus van de Merwe

Mar 27, 2012, 6:33:22 AM
to web...@googlegroups.com
If this processing is done after the form submission, you can delegate it to a background task.
Queuing the tasks and executing them one by one should solve the concurrent access
problem. Check out the book section on the scheduler:
http://web2py.com/books/default/chapter/29/4#Scheduler-%28experimental%29
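
The one-at-a-time idea can be tried without web2py at all. What follows is only a rough sketch using the Python 3 standard library, not the web2py scheduler API: a single background thread pulls jobs off a queue, so no two conversions ever overlap. The names SerialWorker and convert_document are made up for illustration, and convert_document is a stand-in for whatever actually talks to LibreOffice.

```python
import queue
import threading


def convert_document(name):
    # Hypothetical stand-in for the real LibreOffice call.
    return name + ".pdf"


class SerialWorker:
    """Runs submitted jobs one at a time on a single background thread."""

    def __init__(self, func):
        self._func = func
        self._jobs = queue.Queue()
        self.results = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, arg):
        self._jobs.put(arg)

    def _run(self):
        while True:
            arg = self._jobs.get()
            try:
                self.results.append(self._func(arg))
            except Exception as exc:
                # Record the failure and keep the worker alive for later jobs.
                self.results.append(exc)
            finally:
                self._jobs.task_done()

    def wait(self):
        # Block until every queued job has been processed.
        self._jobs.join()


worker = SerialWorker(convert_document)
for doc in ["a.odt", "b.odt", "c.odt"]:
    worker.submit(doc)
worker.wait()
print(worker.results)
```

Because there is exactly one consumer, jobs finish in the order they were submitted. A persistent task table, which is what the scheduler gives you, would also survive a server restart; this in-memory queue would not.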

Cliff

Mar 27, 2012, 7:47:15 AM
to web2py-users
Thanks, Wikus.

Further research this morning suggests that LibreOffice/OpenOffice does
not cope well with multi-threaded access.

The scheduler is just what I need.

DenesL

Mar 28, 2012, 11:17:20 AM
to web...@googlegroups.com
Hi Cliff,

could you post more details on your interface to LibreOffice?

Last time I looked at this it did not work properly (UNO bridge with OpenOffice) but from your initial post it sounds like a viable alternative now.

Thanks,
Denes

Cliff

Mar 28, 2012, 12:53:07 PM
to web...@googlegroups.com
Most of what I know comes from this:

http://lucasmanual.com/mywiki/OpenOffice

Other points
1.  You can start LibreOffice from a script, but you can't connect to it in that same script.  That one cost me half a day.
2.  LibreOffice is gonna crash.  You'll need a cron job to check if LibreOffice is still running and restart it if it's died.
3.  It's slow.  If LibreOffice is going to do much work, use the scheduler as Wikus suggests.
4.  Get a version of Python with uno baked in.
5.  ZipFile can unpack an odt document.  Beware on upload, though; don't mess with the odt doc until after it is saved.
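
On point 5, an .odt file is a zip archive whose main text lives in a member called content.xml. Here is a small sketch (Python 3, standard library only) of pulling that member out with ZipFile. The helper name read_odt_content is made up, and the archive built in memory is a toy stand-in, not a valid ODT document.

```python
import io
import zipfile


def read_odt_content(odt_file):
    """Return the raw bytes of content.xml from an .odt path or file-like object."""
    with zipfile.ZipFile(odt_file) as archive:
        return archive.read("content.xml")


# Build a tiny stand-in archive in memory so the sketch runs without a real document.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("content.xml", "<text>{{=customer}}</text>")
buf.seek(0)

xml = read_odt_content(buf)
print(b"{{=customer}}" in xml)
```

With a saved upload you would pass the file path instead of the BytesIO object.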

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Here's the code I use to start LibreOffice:

#! /usr/bin/python2.7
import shlex
import subprocess

# Fire up a headless LibreOffice Writer that accepts UNO connections on port 2002.
# Newer LibreOffice releases document these options with two dashes (--accept, --headless, ...).
cmd = '/usr/lib/libreoffice/program/swriter -accept="socket,host=localhost,port=2002;urp;StarOffice.ServiceManager" -norestore -nofirststartwizard -nologo -headless'
args = shlex.split(cmd)
lo = subprocess.Popen(args)
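
For the "is it still running" check in point 2, one cheap probe is to see whether anything accepts TCP connections on the port LibreOffice was told to listen on (2002 in the command above). This is a sketch in Python 3 with a made-up helper name, is_listening. It only proves that something is listening, not that LibreOffice is healthy, and the demo probes a throwaway local socket rather than a real office process.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    conn.close()
    return True


# Demonstrate against a throwaway local listener instead of a real LibreOffice.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

up = is_listening("127.0.0.1", port)    # listener is open
server.close()
down = is_listening("127.0.0.1", port)  # listener is gone
print(up, down)
```

A cron job could call is_listening("localhost", 2002) and rerun the start script when it returns False.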

++++++++++++++++++++++++++++++++++++++++++++++++++++
Here's a controller.  
def do_documents(form):
    import uno, os, socket, string  ## prune this list
    from com.sun.star.beans import PropertyValue
    try:
        os.mkdir(request.folder + 'temp_pdf')
    except OSError:
        pass

    package_name = db.document_packages[request.args(0)].name
    items = {}
    # processing a hand made form
    # get the doc id, make a list of fields for each
    for k,v in form.vars.iteritems():
        k_split = k.split('_')
        if len(k_split) < 2 or k_split[0][:3] != 'id=':
            continue
        doc_id = k_split[0][3:]
        if doc_id not in items:
            items[doc_id] = []
        items[doc_id].append((k_split[1], v))
    # now attach to the running LibreOffice instance
    # still need to implement a check that it is running, and recovery if it is not
    local = uno.getComponentContext()
    resolver = local.ServiceManager.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", local)
    context = resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
    desktop = context.ServiceManager.createInstanceWithContext("com.sun.star.frame.Desktop", context)

    for k, v in items.iteritems():
        rcrd = db.document_templates[k]
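        path = os.path.join(request.folder, 'uploads', rcrd.body)
        # systemPathToFileUrl avoids the extra slash that "file:///" + an absolute path produces
        tmplt = desktop.loadComponentFromURL(uno.systemPathToFileUrl(path), "_blank", 0, ())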
        print type(tmplt) # diagnostic
        
        search = tmplt.createSearchDescriptor()
        for val in v:
            search.SearchString = '{{='+val[0]+'}}'
            found = tmplt.findFirst(search)
            while found:
                # swap the {{=field}} placeholder for the submitted value
                placeholder = unicode('{{=' + val[0] + '}}', 'utf-8')
                found.String = found.String.replace(placeholder, unicode(val[1], 'utf-8'))
                found = tmplt.findNext(found.End, search)
## next step is to implement the pdf conversion.  I THINK this code will do it
##        property = (PropertyValue("FilterName" , 0, "writer_pdf_Export" , 0 ),) 
##        newpath = request.folder + 'temp_pdf/' + os.path.split(path)[1]
##        tmplt.storeToURL("file:///" + newpath,property)
##        tmplt.dispose()
        tmplt.storeAsURL("file:///home/cjk/wtf.odt",()) # not final code
        tmplt.dispose()

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
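
The field-grouping and placeholder steps in the controller can be exercised without LibreOffice. Below is a rough Python 3 sketch of the same idea on plain strings. The names group_fields and fill and the sample form values are invented for illustration. One deliberate difference from the controller: it splits each key on the first underscore only, so a field name such as due_date is kept whole.

```python
from collections import defaultdict

PREFIX = "id="


def group_fields(form_vars):
    """Group 'id=<doc>_<field>' keys into {doc_id: [(field, value), ...]}."""
    items = defaultdict(list)
    for key, value in form_vars.items():
        head, sep, field = key.partition("_")
        if not sep or not head.startswith(PREFIX):
            continue  # key does not follow the id=<doc>_<field> pattern
        items[head[len(PREFIX):]].append((field, value))
    return dict(items)


def fill(text, fields):
    """Replace each {{=field}} placeholder in text with its value."""
    for field, value in fields:
        text = text.replace("{{=" + field + "}}", value)
    return text


# Hypothetical form values; the last key is ignored because it lacks the id= prefix.
form_vars = {
    "id=7_customer": "Acme",
    "id=7_due_date": "2012-04-01",
    "id=9_customer": "Bolt",
    "_formkey": "abc",
}
grouped = group_fields(form_vars)
filled = fill("Invoice for {{=customer}}, due {{=due_date}}", grouped["7"])
print(filled)
```

In the real controller the replacement happens through the document's search descriptor rather than on a plain string, so this only checks the bookkeeping around it.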

Cliff

Mar 28, 2012, 1:14:47 PM
to web...@googlegroups.com
Oh yeah, I almost forgot.

I've seen a lot of posts saying that headless LibreOffice needs an X server running.  I'm setting up an Ubuntu server now for test purposes, and I'll report back here.