Question about Word Processor Integration Infrastructure

Kieren Diment

Feb 13, 2012, 5:10:55 AM
to zoter...@googlegroups.com
It's a long-standing fantasy of mine to be able to write in Markdown in emacs using as seamless a process as the current word processor integration package. I got some of the way there over the xmas holidays, but then the new plugin was released and I really like the new user interface. Unfortunately this meant that I abandoned my work and moved my writing back to Word for the time being :(

Is there some sort of minimal example that shows how to hook into this interface? Ideally I'd like it to do something like this from a shell:

#!/bin/sh
echo "bringing up Zotero integration dialog now ..."
# some command so that the integration dialog comes up, and the script waits while the user selects a citation, suppress-author option, prefix, etc.
# script outputs some information, maybe JSON-encoded, about what was selected in the dialog

Is there some minimally simple way of making this happen? I really wouldn't know where to start looking. If there's not too much overhead in getting a development environment running, I'm happy to look slowly - I already have a development instance of Zotero under a separate profile (running the GitHub code), so all I really need is some pointers for getting started with the WP integration code.

Bruce D'Arcus

Feb 13, 2012, 3:42:06 PM
to zoter...@googlegroups.com
On Mon, Feb 13, 2012 at 5:10 AM, Kieren Diment <dim...@gmail.com> wrote:

> It's a long standing fantasy of mine to be able to write in Markdown in emacs using as seamless a process as the current word processor integration package.  I got some of the way there over the xmas holidays ...

Can you elaborate more on this?

Also, just a reminder: pandoc supports citations and CSL too, and
therefore allows you to write in markdown, and get output to all manner
of formats: docx, html, latex, epub, etc.

Bruce

Kieren Diment

Feb 13, 2012, 4:13:55 PM
to zoter...@googlegroups.com

On 14/02/2012, at 7:42 AM, Bruce D'Arcus wrote:

> On Mon, Feb 13, 2012 at 5:10 AM, Kieren Diment <dim...@gmail.com> wrote:
>
>> It's a long standing fantasy of mine to be able to write in Markdown in emacs using as seamless a process as the current word processor integration package. I got some of the way there over the xmas holidays ...
>
> Can you elaborate more on this?

Well, what I did over the holidays was rig something together to make a request to the integration server (custom code that I wrote) that would return some text/plain containing information about the currently selected Zotero items. The idea was that post-processing could extract the identifiers from that text and dump some formatted references at the end of the document.

However, then the new Word integration dialog came out, where the workflow is fast and seamless (it's autocomplete driven rather than menu/multiselect driven). It would be really nice to be able to make use of this interface for plain-text-based workflows.

>
> Also, just a reminder: pandoc supports citations and CSL too, and
> therefore allows you to write in markdown, and get output to all manner
> of formats: docx, html, latex, epub, etc.
>

Yes, I am aware of that. My understanding is that it uses an intermediary file containing the required citations, and I need to type in my citations in a particular way by hand, and otherwise I haven't found documentation describing a seamless workflow to achieve this either. I'm also unsure that tightly coupling the citations into the parser is the best idea. It is probably better to stick to a format, and process the citations at the end of the document preparation then use pandoc to shuttle between document formats once the citations are no longer changing.

Bruce D'Arcus

Feb 13, 2012, 4:33:38 PM
to zoter...@googlegroups.com
On Mon, Feb 13, 2012 at 4:13 PM, Kieren Diment <dim...@gmail.com> wrote:
>
>
> On 14/02/2012, at 7:42 AM, Bruce D'Arcus wrote:
>
>> On Mon, Feb 13, 2012 at 5:10 AM, Kieren Diment <dim...@gmail.com> wrote:
>>
>>> It's a long standing fantasy of mine to be able to write in Markdown in emacs using as seamless a process as the current word processor integration package.  I got some of the way there over the xmas holidays ...
>>
>> Can you elaborate more on this?
>
> Well what I did in the holidays was rig something together to make a request to the integration server (custom code that I wrote)  that would return some text/plain containing information about the currently selected zotero items, with the view that post-processing the text file could extract the identifiers in the text file and dump some formatted references into the end of the document.
>
> However then the new Word integration dialog came out, where the workflow is fast and seamless (rather than menu/multiselect driven it's autocomplete driven).  It would be really nice to be able to make use of this interface for plain text based workflows.
>
>>
>> Also, just a reminder: pandoc supports citations and CSL too, and
>> therefore allows you to write in markdown, and get output to all manner
>> of formats: docx, html, latex, epub, etc.
>>
>
> Yes, I am aware of that.  My understanding is that it uses an intermediary file containing the required citations, and I need to type in my citations in a particular way by hand, and otherwise I haven't found documentation describing a seamless workflow to achieve this either.

Yeah, there'd need to be some sort of glue code to create and maintain
that input file to be ideal.
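
As a rough illustration of what such glue code might start from: pandoc citations are written as [@key] in the Markdown source, so a first step could be to list the keys a document actually uses and compare them against the bibliography file. The sketch below is only that first step. The function name list_citation_keys is hypothetical, the pattern covers only simple keys, it would also pick up things like e-mail domains, and grep -o is a common extension rather than strict POSIX.

```shell
#!/bin/sh
# Hypothetical helper: print the unique citation keys used in a
# pandoc-style Markdown file, one per line, sorted.
# Assumes keys look like @doe2010 or @smith:2012 (a simplification
# of pandoc's full key syntax).
list_citation_keys() {
    grep -o '@[A-Za-z0-9_][A-Za-z0-9_:.-]*' "$1" |
        sed -e 's/^@//' -e 's/[:.-]*$//' |   # drop "@" and trailing punctuation
        sort -u
}
```

Calling list_citation_keys paper.md would then give a list that a separate script could use to create or refresh the bibliography file pandoc reads.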

> I'm also unsure that tightly coupling the citations into the parser is the best idea.  It is probably better to stick to a format, and process the citations at the end of the document preparation then use pandoc to shuttle between document formats once the citations are no longer changing.

Not following here. Pandoc does allow you to output the results back
to markdown if you want a kind of "cooked" document (if that's what
you mean).

Bruce

Kieren Diment

Feb 13, 2012, 5:44:19 PM
to zoter...@googlegroups.com

On 14/02/2012, at 8:33 AM, Bruce D'Arcus wrote:

>
>> I'm also unsure that tightly coupling the citations into the parser is the best idea. It is probably better to stick to a format, and process the citations at the end of the document preparation then use pandoc to shuttle between document formats once the citations are no longer changing.
>
> Not following here. Pandoc does allow you to output the results back
> to markdown if you want a kind of "cooked" document (if that's what
> you mean).

You're right, these are the half-formed, incoherent thoughts about a partly considered workflow :)

Fundamentally, I think replacing the mozrepl requirement in zotero-plain (mozrepl is fragile in my experience, although I've used it for long-running operations) with something Firefox/dialog-box based, plus a bit of glue scripting to handle some of the pandoc fun, might work.

Frank Bennett

Feb 13, 2012, 5:51:34 PM
to zotero-dev
In the zot4rst bridge, I took a shot at implementing an "rst" mode in
the processor, but quickly abandoned the idea, because inline markup
in reStructuredText can't be nested (for constructs with in-field
markup like "<i><i>In re</i> Smith</i>" -- "**In re* Smith*" won't
parse). To get a full range of formatting functionality, the zot4rst
bridge requests HTML markup from the processor, and an html2rst()
utility function (written by Erik) converts it to the internal
reStructuredText representation for insertion into the document during
processing.

As a result, the citations cannot be formatted in the source code of
the document itself; to separate the document from the Zotero instance
(for portability), a local copy of the structured citation data is
needed (so that it can be reprocessed). I believe that the tool has an
option to dump citations to a json file for that purpose. In zot4rst
(and I assume in pandoc as well) the main role of a plain-text
"integration plugin" would be to focus on maintaining that companion
data, either as a separate file or an embedded document segment.

Frank

Simon

Feb 13, 2012, 6:42:50 PM
to zotero-dev
One option would be to try to make use of the integration APIs.
Bringing up the integration dialog is easy. On Linux:

echo 'OpenOffice addCitation' > "$HOME/.zoteroIntegrationPipe"

On OS X:

echo 'OpenOffice addCitation' > "/Users/Shared/.zoteroIntegrationPipe_$USER"

On Windows:

"C:\Program Files (x86)\Mozilla Firefox\firefox" -ZoteroIntegrationAgent OpenOffice -ZoteroIntegrationCommand addCitation

This creates a new instance of the XPCOM component with contract ID
@zotero.org/Zotero/integration/application?agent=OpenOffice;1, and
calls the getActiveDocument() method on that XPCOM component to get a
new component that implements the zoteroIntegrationDocument interface.
The calls that follow depend on the operation requested. The code that
does this is in chrome/content/zotero/xpcom/integration.js; the XPCOM
components are implemented in the individual integration plugins. Full
documentation of all interfaces that an XPCOM component would need to
implement is at:

https://raw.github.com/zotero/zotero-libreoffice-integration/master/build/zoteroIntegration.idl

If you implement these interfaces in an XPCOM component with contract
ID @zotero.org/Zotero/integration/application?agent=X;1, then writing
'X addCitation' to the pipe will instantiate your component and set
things in motion.
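
A minimal sketch of wrapping the Linux and OS X commands above in one script, assuming the pipe locations given in this message and that the pipe only exists while Zotero is running. The function names zotero_pipe_path and zotero_send are hypothetical, and Windows is not covered because it uses command-line flags instead of a pipe.

```shell
#!/bin/sh
# Print the integration pipe path for the given platform name
# (defaults to the output of uname -s). Paths are the ones quoted
# in this thread; anything else is reported as unsupported.
zotero_pipe_path() {
    case "${1:-$(uname -s)}" in
        Linux)  printf '%s\n' "$HOME/.zoteroIntegrationPipe" ;;
        Darwin) printf '%s\n' "/Users/Shared/.zoteroIntegrationPipe_$USER" ;;
        *)      return 1 ;;
    esac
}

# Send "<agent> <command>" to the pipe, e.g. zotero_send OpenOffice addCitation.
zotero_send() {
    agent=$1
    cmd=$2
    pipe=$(zotero_pipe_path) || { echo "unsupported platform" >&2; return 1; }
    # Assumption: a missing pipe means Zotero is not running.
    [ -p "$pipe" ] || { echo "no pipe at $pipe" >&2; return 1; }
    printf '%s %s\n' "$agent" "$cmd" > "$pipe"
}
```

With a custom component registered for agent X, zotero_send X addCitation would be the equivalent of writing 'X addCitation' to the pipe by hand.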

The big caveat here is that the integration functionality is intended
to work with a word processor, and so it's designed with the idea that
you will store large amounts of text in fields and dynamically update
them later. I _think_ you could mimic the whole interface well enough
to get it to do the formatting if you store shorter codes in the
Markdown file that would then link to the field codes returned by the
setCode interface on zoteroIntegrationField in a separate file (which
would also need to contain the document data). This would offload the
process of actually formatting the document to Zotero, rather than a
separate citeproc-js instance. If you want to go this route, I can
tell you a little more about what I have in mind.

Alternatively, you could implement just enough of the interface to get
a field code when you send an addCitation command. This field code
contains a CSL JSON representation of the items specified that you
could then format independent of Zotero using citeproc-js.

Finally, you could implement this without directly involving the
integration code at all as follows:

- Create a new pipe with

Zotero.IPC.Pipe.initPipeListener(pipe, callback);

where pipe is an nsILocalFile and callback is a function that will
receive the string sent on a pipe. You can then trigger code running
within Zotero by writing to the pipe. We use a different trick on
Windows that I can tell you more about if you're interested.

- Use the add citation dialog to select a reference. To start with,
you want to reimplement the interfaces supported by
Zotero.Integration.CitationEditInterface in integration.js to do as
little as possible (right now it does things like read items cited out
of the document asynchronously, which you may not want to do). Then
open chrome://zotero/content/integration/quickFormat.xul, passing your
CitationEditInterface implementation as an argument to the window.

- When the add citation dialog calls CitationEditInterface.accept(),
close the window, serialize whatever information you want out of
CitationEditInterface.citation (which will be populated by the
QuickFormat window) and write it to a pipe or to a file.
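
On the shell side, the round trip for this last approach might look roughly like the sketch below. Everything specific in it is an assumption for illustration: the function name request_citation, the two pipe paths passed as arguments, the 'addCitation' trigger string, and the idea that the Zotero-side callback answers with a single line (for example JSON) on a second named pipe.

```shell
#!/bin/sh
# Hypothetical round trip: write a trigger to the pipe Zotero listens on,
# then block until one line comes back on a reply FIFO.
# usage: request_citation TRIGGER_PIPE REPLY_FIFO
request_citation() {
    trigger_pipe=$1
    reply_fifo=$2
    [ -p "$trigger_pipe" ] || { echo "no trigger pipe at $trigger_pipe" >&2; return 1; }
    # Create the reply FIFO if it does not exist yet.
    [ -p "$reply_fifo" ] || mkfifo "$reply_fifo" || return 1
    # Ask the listener to open the citation dialog.
    echo 'addCitation' > "$trigger_pipe"
    # Wait here until the other side writes the serialized citation.
    IFS= read -r reply < "$reply_fifo" || return 1
    printf '%s\n' "$reply"
}
```

An editor command could then call request_citation with the two paths and insert whatever comes back at point.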

Simon

Erik Hetzner

Feb 15, 2012, 1:55:09 AM
to zoter...@googlegroups.com, Kieren Diment
At Tue, 14 Feb 2012 09:44:19 +1100, Kieren Diment wrote:

Hi Kieren,

This would be great. Coming up with a better solution for
vi/emacs/lyx/etc. users would be great. I won’t be able to work on
this at the moment, but I’d love to see it!

best, Erik
