Integration speed improvements + LibreOffice plugin problem


Frank Bennett

May 30, 2012, 10:32:33 AM
to zotero-dev
Simon,

I've been posting versions of a patch to integration.js to the Zotero
tracker on GitHub. Working with a user's large document as a test
case, we have brought the time required to refresh citations down from
about 8 minutes to about 25 seconds. For the test document, that is
nearly a 20-fold speed increase. The system also scales better, as the time
required for document processing is now directly proportional to the
number of citations, rather than growing geometrically.

https://github.com/zotero/zotero/issues/134

The approach is to add citations to the processor registry in one go,
using updateItems(), and then to rely on the processor's persistent
registry and tainting logic to identify citations that require an
update, and write only those fields back into the document. The patch
linked from the tracker thread is now in pretty good shape, as far as
I can tell. Unfortunately, we're getting some weird behavior that I
can't explain from the code in integration.js. I think it might be a
problem with the plugin, but I'm not sure.
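
For illustration, here is a minimal sketch of the batch-and-taint idea described above, using a toy in-memory registry. Only the name updateItems() comes from this post; ToyRegistry, setCitation(), takeTainted(), refresh() and the rendering rule are hypothetical stand-ins, not the citeproc-js or Zotero API.

```javascript
// Toy stand-in for a citation processor with a persistent registry.
// Hypothetical API: only the name updateItems() appears in the thread.
class ToyRegistry {
  constructor() {
    this.items = new Set();    // item IDs known to the processor
    this.rendered = new Map(); // field index -> last rendered text
    this.tainted = new Set();  // field indexes whose text changed
  }

  // Register every cited item in one call instead of one call per citation.
  updateItems(itemIDs) {
    this.items = new Set(itemIDs);
  }

  // Render one citation; taint the field only if its text actually changed.
  setCitation(fieldIndex, itemIDs) {
    const text = "(" + itemIDs.join("; ") + ")";
    if (this.rendered.get(fieldIndex) !== text) {
      this.rendered.set(fieldIndex, text);
      this.tainted.add(fieldIndex);
    }
  }

  // Return the tainted field indexes in order and reset the taint set.
  takeTainted() {
    const out = [...this.tainted].sort((a, b) => a - b);
    this.tainted.clear();
    return out;
  }
}

// Refresh a document: one batch registration, then write back tainted fields only.
function refresh(registry, citations, writeField) {
  registry.updateItems(citations.flatMap((c) => c.itemIDs));
  citations.forEach((c, i) => registry.setCitation(i, c.itemIDs));
  for (const i of registry.takeTainted()) {
    writeField(i, registry.rendered.get(i));
  }
}
```

Because the registry persists between calls, a second refresh() after a single citation changes calls writeField once, so the number of document writes tracks the number of changed fields rather than the size of the document.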

I've tested in MLZ, using code from which the patch posted to the
thread is derived. Attempting to insert a citation in the document
after opening hangs the processor, with the final message:

zotero(3): ZoteroOpenOfficeIntegration: Performing asynchronous read
zotero(3): Reading from stream
zotero(3): ZoteroOpenOfficeIntegration: Reading 1853189228 bytes from stream

I can avoid the error by setting dom.max_chrome_script_run_time to 0
to disable timeouts, or by setting it to a large value. I have seen
hangs very occasionally with this adjustment, but they do become
uncommon.

The value of dom.max_chrome_script_run_time must be set to 0 (or a
large value) at Firefox startup. As you can see in the patch, I'm
attempting to disable timeouts on the fly, but that has no effect; the
hang on a faulty attempt to read 1.8 GB during the first citation
insert still occurs consistently.
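
One observation about the byte count in the log above: read as four big-endian bytes, 1853189228 is 0x6E756C6C, which is the ASCII text "null". The sketch below only demonstrates that arithmetic. Whether the stream really carried that text at that point, and whether the length prefix is read in big-endian order, are assumptions; lengthPrefixAsAscii() is a hypothetical helper, not part of Zotero.

```javascript
// Interpret a 32-bit length prefix as four big-endian bytes of ASCII text.
// Hypothetical diagnostic helper for inspecting a suspicious length value.
function lengthPrefixAsAscii(n) {
  const bytes = [24, 16, 8, 0].map((shift) => (n >>> shift) & 0xff);
  return String.fromCharCode(...bytes);
}

console.log(lengthPrefixAsAscii(1853189228)); // prints: null
```

If that reading is right, the reader may have been out of step with the message framing and consumed payload text as a length header, which would be consistent with the impossible 1.8 GB read.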

I have tested with two configurations (and combinations of the two) in
an attempt to isolate the problem. The only thing I haven't varied is
the LibreOffice Integration plugin. The failure is consistent in all
configurations, which leads me to wonder whether the greater speed of
transactions (?) might be causing the plugin to fall over. Either
that, or I've missed something in my adjustments to the code in
integration.js (?). Here are the configurations that I have tested
with, on an Ubuntu 12.04 system:

Firefox 12.0 (Canonical build)
java-1.6.0-openjdk-i386 (Ubuntu)
LibreOffice 3.5.3.2 (Ubuntu)

LibreOffice 3.5.4.2
java-7-oracle
Firefox 13.0 (Firefox build)

I hope you can help with this. I'd really like to land these changes
in official Zotero if possible. The speed improvement is dramatic
enough to be noticeable to anyone working on a project that grows to
thesis length (unless Word is blistering fast in ways that LibreOffice
is not).

Frank

Simon Kornblith

May 30, 2012, 3:11:52 PM
to zotero-dev
Taking a look at this patch, I'm about 95% sure it's going to blow
things up under certain circumstances, and I'm not sure I'd be
comfortable shipping it without a beta first. However, I doubt the
issues I can see in the code are the cause of the issues you're
seeing. If you can send me both the document and a full debug log,
I'll see what I can do.

Simon

Frank Bennett

May 30, 2012, 9:59:50 PM
to zotero-dev
Thanks, Simon. Documents are on the way.

I agree that if this works, it should go through beta testing. Speed
isn't beneficial if the car goes off the track.

Frank

Frank Bennett

Jun 6, 2012, 10:12:13 PM
to zotero-dev
Thanks to optimizations in the LibreOffice plugin introduced by Simon,
the latest iteration of the revised integration code seems to be very
solid and fast. The user who has been testing the code (Rudolf Ammann)
reports that it's working well, and we're enthusiastic about landing
it in the official client after further testing by others.

There is one problem that remains to be solved before the code can be
turned loose for trials, though: a document update in progress does
not seem to properly block overlapping operations, at least with the
LibreOffice plugin. I've described the behaviour under the issue on
GitHub:

https://github.com/zotero/zotero/issues/134#issuecomment-6167368

Apart from this one glitch, we're very happy with the way this has
worked out.

Frank

Frank Bennett

Jun 8, 2012, 10:30:07 AM
to zotero-dev
I think I may have identified the spot where the overlapping executions
can be fixed in the LibreOffice plugin. It's in the Java code, but
I've been unable to recompile it (I get an error in
RegistrationHandler). Does this look like it might do the trick?

diff --git a/build/source/org/zotero/integration/ooo/comp/Comm.java b/build/source/org/zotero/integration/ooo/comp/Comm.java
index e67f8b7..2dd7277 100644
--- a/build/source/org/zotero/integration/ooo/comp/Comm.java
+++ b/build/source/org/zotero/integration/ooo/comp/Comm.java
@@ -49,13 +49,16 @@ class Comm implements Runnable {
      */
     void sendCommand(String command) {
         // Execute command
-        nextMessage = command;
-        if(mThread.isAlive()) {
-            mThread.interrupt();
-        } else {
-            mThread = new Thread(this);
-            mThread.start();
-        }
+        ZoteroOpenOfficeIntegrationImpl.debugPrint("XXX sendCommand() thing");
+        if (mActiveDocument == null) {
+            nextMessage = command;
+            if(mThread.isAlive()) {
+                mThread.interrupt();
+            } else {
+                mThread = new Thread(this);
+                mThread.start();
+            }
+        }
     }

     void showError(String errString, Exception exception) {
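
The intent of the diff above, dropping a command that arrives while a document operation is still active, can be sketched in JavaScript as follows. This is an analogue for illustration only: CommandGate and its members are hypothetical, and it is not the plugin's Java code nor a fix that was adopted.

```javascript
// Hypothetical command gate: drops commands that arrive while another is running.
class CommandGate {
  constructor() {
    this.active = null; // name of the command currently running, or null
  }

  // Run handler for `command` unless another command is still active.
  // Resolves to true if the command was accepted, false if it was dropped.
  async send(command, handler) {
    if (this.active !== null) {
      return false;
    }
    this.active = command;
    try {
      await handler(command);
    } finally {
      this.active = null; // always release, even if the handler throws
    }
    return true;
  }
}
```

With this gate, an addCitation sent while a refresh is still running resolves to false and never reaches its handler. Releasing the flag in a finally block matters here: a guard that is never cleared after an error would block every later command.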


Frank

Simon Kornblith

Jun 10, 2012, 1:18:35 PM
to zotero-dev
Can you try the latest version from GitHub? I changed the plugin
protocol to handle multiple documents, which we eventually would have
wanted anyway, and which should take care of this issue.

Simon

Frank Bennett

Jun 10, 2012, 5:08:25 PM
to zotero-dev
On Jun 11, 2:18 am, Simon Kornblith <si...@simonster.com> wrote:
> Can you try the latest version from GitHub? I changed the plugin
> protocol to handle multiple documents, which we eventually would have
> wanted anyway, and should take care of this issue.
>
> Simon

The error persists. A trace generated by clicking Refresh, waiting for
the document write phase to begin, and then clicking Add Citation is
here:

https://gist.github.com/2907320

The second command is apparently coming through on the stream being
read by the first. Grepping for a couple of comments I added to the
code:

bennett@bennett-dynabook-R631-28E:~/src/zotero-libreoffice-integration$ grep 'XXX.*refresh' ~/ERRORS.txt
zotero(3)(+0000000): XXX receiveCommand() [l.202]: refresh
bennett@bennett-dynabook-R631-28E:~/src/zotero-libreoffice-integration$ grep 'XXX.*addCitation' ~/ERRORS.txt
zotero(3)(+0000000): XXX receiveCommand() [l.286]: addCitation

Frank

Frank Bennett

Jun 14, 2012, 3:39:02 PM
to zoter...@googlegroups.com
Ping.

Simon Kornblith

Jun 14, 2012, 11:04:02 PM
to zotero-dev
I'm working on this.

Frank Bennett

Jun 15, 2012, 8:38:55 AM
to zoter...@googlegroups.com
Sorry for the traffic.

Simon Kornblith

Jun 21, 2012, 8:32:40 PM
to zotero-dev
Try the latest code on GitHub. You will have to manually reinstall the
extension if you were running the previous code.

Simon

Frank Bennett

Jun 22, 2012, 3:26:21 AM
to zoter...@googlegroups.com
On Fri, Jun 22, 2012 at 9:32 AM, Simon Kornblith <si...@simonster.com> wrote:
> Try the latest code on GitHub. You will have to manually reinstall the
> extension if you were running the previous code.
>
> Simon

I've been able to produce hangs, and to trigger a document update that
filled all fields with the same citation. I think it's due to timeouts
again, but I'll need to spend more time to nail down steps to
reproduce and capture useful traces. I'll be tied down for the next
couple of days, but should be able to get to it on Sunday.

Frank

Frank Bennett

Jun 23, 2012, 11:23:38 AM
to zoter...@googlegroups.com
I've spent some time with the fbennett/zotero/integration branch and
the "Implement transaction IDs for protocol" commit. I am able to
trigger intermittent hangs, but although I've tried everything I
could think of, I haven't been able to come up with reliable steps to
reproduce or to identify a pattern. The hang is pretty frequent, though,
hitting about 30-40% of setDocPrefs() transactions in my testing
with a document containing 450 citations.

I can't say for sure, but I think timeouts may be irrelevant. I've
seen the hang strike during setDocPrefs(), and during addCitation(),
and the point at which it strikes (always during plugin execution) is
erratic. Yesterday I saw an instance in which the same citation text
replaced every field in the document after a point of insert. Although
I have been unable to reproduce that a second time, it's certainly
worrying.

The attached trace will probably not be terribly informative. If there
is anything further I can do in testing, let me know.

Frank
integration-trace.zip

Frank Bennett

Jun 23, 2012, 7:21:04 PM
to zoter...@googlegroups.com
Not sure how encouraging this will be, but here goes.

As the trace suggests, the hang occurs inside onInputStreamReady, at the first this.iStream.read32(), after this.iStream.available() returns a non-empty value. There is nothing odd about the return value of available() on iterations that produce a hang: when present it is always type number, value 1.

It's a bit mysterious, but when this.iStream.available() is captured to a variable _twice_ before testing the condition, the hang does not seem to occur.
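
One possible reading of the above, offered as an assumption rather than a diagnosis: if available() reports a single byte and read32() needs four, the read has to wait for three more bytes. A reader that buffers incoming bytes and only hands over complete frames never has to wait inside a read. The sketch below assumes a framing of a 4-byte big-endian length followed by the payload; FrameBuffer is a hypothetical helper, not the code in integration.js or the plugin.

```javascript
// Accumulates bytes and releases only complete [4-byte length][payload] frames.
// Hypothetical helper; the framing layout is an assumption for illustration.
class FrameBuffer {
  constructor() {
    this.buf = new Uint8Array(0); // bytes received but not yet consumed
  }

  // Append a chunk and return every complete payload that is now available.
  push(chunk) {
    const merged = new Uint8Array(this.buf.length + chunk.length);
    merged.set(this.buf, 0);
    merged.set(chunk, this.buf.length);
    this.buf = merged;

    const frames = [];
    for (;;) {
      if (this.buf.length < 4) break; // header incomplete: wait for more data
      const view = new DataView(this.buf.buffer, this.buf.byteOffset, 4);
      const len = view.getUint32(0); // big-endian by default
      if (this.buf.length < 4 + len) break; // payload incomplete: wait
      frames.push(this.buf.slice(4, 4 + len));
      this.buf = this.buf.slice(4 + len);
    }
    return frames;
  }
}
```

Pushing a single byte returns an empty array instead of stalling, and a chunk that ends one frame and starts the next yields the finished frame while keeping the remainder buffered.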

Frank

Frank Bennett

Jun 23, 2012, 7:37:28 PM
to zoter...@googlegroups.com
Scratch that. Hangs can still occur.

For the record, this is with Linux (Ubuntu), Firefox 12.0.

Frank

Frank Bennett

Jun 23, 2012, 8:46:49 PM
to zoter...@googlegroups.com
I am happy to report steps to reproduce, using the test document (420
references), and the latest plugin code in GitHub:

Prep:

(1) If logged in on http://zotero.org, log out
(2) In about:config, set extensions.zotero.debug.log to "true"
(3) If Firefox or LibreOffice are running, stop them (just for a clean slate)

Test:

(4) Start Firefox with firefox -ProfileManager >~/ERRORS.txt
(5) Start LibreOffice against test document
(6) Click setDocPrefs() button in LibreOffice
(7) Use tail -f ~/ERRORS.txt in a terminal to follow Zotero activity
(8) Wait until DATE operations are complete and field updates begin
(9) In Firefox, visit http://zotero.org
(10) Click "login" button
(11) Firefox will hang at "ZoteroOpenOfficeIntegration: Performing
asynchronous read" before login page appears

This has produced a hang in 6 successive trials.

Frank

Frank Bennett

Jun 24, 2012, 6:54:16 PM
to zoter...@googlegroups.com
The initial tests were run with the bundled Ubuntu build of Firefox 12.0.

Just to be sure, I've run the same test with a Mozilla build of Firefox 13.0, and I get the same failure.

Frank

Simon Kornblith

Jun 24, 2012, 7:57:12 PM
to zotero-dev
I was able to reproduce this, and it seems to be fixed in the latest
version from GitHub. This new version uses asynchronous sockets the
right way, which fixes the hang but may result in a speed hit :(

I'm now trying to figure out why communication between LibreOffice and
Zotero seems to be about 25X faster on OS X than on Linux. Since none
of the code uses OS-specific interfaces, I'm not particularly
optimistic, but it seems that there's some inefficiency in the way
asynchronous sockets are implemented in Firefox on Linux.

Simon

Simon Kornblith

Jun 24, 2012, 9:42:48 PM
to zotero-dev
Speed issues should be fixed now.

Frank Bennett

Jun 24, 2012, 10:37:43 PM
to zoter...@googlegroups.com
This looks great. I saw your coalesce patch while setting up for
testing, and noticed a very considerable speed improvement in updates.
I've just taken a shot at breaking things again after applying the
flush patch as well, and it all seems very solid. Operations clear
very fast, and hammering incessantly on the Zotero buttons while the
plugin is active in the document has no effect until the current
operation completes.

There was a bug in my changes to integration.js that caused
addCitation() to fail (leaving the {citation} marker behind) if
invoked immediately after an initial setDocPrefs() run. I've fixed
that in the latest checkins to the integration and multi branches of
the fbennett fork.

In the modified integration.js, there is some complicated business
with .refresh and .reload toggles on the session object. This is
probably trickier than it needs to be (it was the cause of the hanging
{citation} marker), but the logic seems to be delivering complete
updates with minimal overhead.

Definitely feels like this is a step closer to rotten tomato testing.

Frank