SQL and translators

Frank Bennett

Apr 26, 2012, 6:39:29 PM
to zotero-dev
This is a question that runs well beyond my knowledge. Here's the use
case.

I want to build a site translator for OpenCongress that captures the
metadata for individual sections of a Bill (that part is easy, of
course), and creates related items for any target legislation that the
Bill amends. The data is not linked, so the translator will need to
rely on screen scraping.

Target legislation of an amending act can be expressed two ways: a
cite to US Code; or a popular name. The US Code cites are easy: the
metadata can be parsed directly from the cite, and the US Code hosted
by Cornell LII provides an orderly link structure that allows us to
generate a link to the text blind, by building the URL from the cite
metadata, without visiting the Cornell site itself.
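
A minimal sketch of building the link "blind" from a cite, as described above. The helper names (parseUscCite, buildLiiUrl) are hypothetical, and the URL pattern in LII_BASE is an assumption about how Cornell LII lays out its US Code pages; it should be checked against the live site before being relied on.

```javascript
// Assumed URL layout for Cornell LII's US Code pages (verify before use).
var LII_BASE = "https://www.law.cornell.edu/uscode/text/";

// Hypothetical helper: pull the title and section out of a cite such as
// "18 U.S.C. § 842" or "42 USC 1983". Returns null if nothing matches.
function parseUscCite(cite) {
    var m = /(\d+)\s+U\.?\s*S\.?\s*C\.?\s*(?:§+\s*)?(\d+[A-Za-z]*(?:-\d+)?)/.exec(cite);
    if (!m) {
        return null;
    }
    return { title: m[1], section: m[2] };
}

// Hypothetical helper: build the link from the parsed cite alone,
// without making any request to the Cornell site.
function buildLiiUrl(cite) {
    var parsed = parseUscCite(cite);
    return parsed ? LII_BASE + parsed.title + "/" + parsed.section : null;
}
```

A popular name such as "Safe Explosives Act" carries no title or section, so buildLiiUrl returns null for it, which is exactly the harder case discussed next.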

Legislation referred to by popular name is harder. I'd like to provide
a translator with the ability to do a search for (say) "Safe
Explosives Act", and get back a known set of metadata for that
resource. I'm not sure whether that's possible, or how to go about it
if it is.

Cornell LII offers a (very big) table of popular names, which can be
ripped, stored in SQL and made available via a plugin. From there,
things get hazy for me. As I understand it, two things prevent use of
this external data from within a plugin: SQL is only accessible in the
main thread; and the translators have no access to anything outside
their sandbox.

Is there a way to do this (maybe with callbacks, which I also confess
to not yet understand), or should I just let it rest?

Avram Lyon

Apr 26, 2012, 8:18:42 PM
to zoter...@googlegroups.com
On Thu, Apr 26, 2012 at 3:39 PM, Frank Bennett <bierc...@gmail.com> wrote:
> This is a question that runs well beyond my knowledge. Here's the use
> case.

Your knowledge runs beyond mine. But...

[..]
> Cornell LII offers a (very big) table of popular names, which can be
> ripped, stored in SQL and made available via a plugin. From there,
> things get hazy for me. As I understand it, two things prevent use of
> this external data from within a plugin: SQL is only accessible in the
> main thread; and the translators have no access to anything outside
> their sandbox.
>
> Is there a way to do this (maybe with callbacks, which I also confess
> to not yet understand), or should I just let it rest?

I think a plugin should still be able to insert itself into the
sandbox, likely by extending something in the code that exposes
functions to the sandbox. It might also be possible to write a fake
translator that simply contains the popular names index in the form
of, say, associative arrays, and which exposes name lookup through
functions exported to be used with getTranslatorObject(). Such an
approach could actually be used to wrap interaction with an external
server that could actually house the index, if the list proves to be
too much to include in the translator code itself.
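
A rough sketch of the associative-array idea, assuming the popular-names table has already been converted into a list of entries. The function names and the shape of each entry are hypothetical, and the sample entry in the usage note is a made-up placeholder rather than a row from the LII table. How the lookup function would be exported for use with getTranslatorObject() is not shown here.

```javascript
// Normalize a popular name so lookups ignore case, punctuation and
// extra whitespace.
function normalizeName(name) {
    return name.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Build a plain-object index keyed by normalized name.
// Each entry is assumed to carry a "name" plus whatever metadata is wanted.
function buildIndex(entries) {
    var index = {};
    entries.forEach(function (entry) {
        index[normalizeName(entry.name)] = entry;
    });
    return index;
}

// Return the stored entry for a popular name, or null if it is unknown.
// hasOwnProperty guards against inherited keys such as "constructor".
function lookupPopularName(index, name) {
    var key = normalizeName(name);
    return Object.prototype.hasOwnProperty.call(index, key) ? index[key] : null;
}
```

With a placeholder entry such as { name: "Example Widgets Act", title: "99", section: "1" } in the index, lookupPopularName(index, "example WIDGETS act") would return that entry, and an unknown name would return null.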

Just some brainstorming, but this doesn't look like an impossible task.

Avram

Simon Kornblith

Apr 26, 2012, 11:44:02 PM
to zotero-dev
If the list is relatively small (in the hundreds of kilobytes) then it
might be suitable for inclusion in a translator. If not, it might be
worth considering whether it's possible to create or convince Cornell
to create a simple web-facing lookup API that could be used without a
plugin. However, if neither of these is feasible, note that
translators do run on the main thread, so a plugin could run SQL
either synchronously on the main thread (which is not recommended
because it will hang the Firefox UI, although Zotero does it in plenty
of places) or asynchronously with a callback. Exposing something to
the translator sandbox is easy:

var Zotero = Components.classes["@zotero.org/Zotero;1"]
    .getService(Components.interfaces.nsISupports).wrappedJSObject;
Zotero.Utilities.Translate.prototype.myFunction =
    function (some, args, here) {
        // Do something
    };

Creating a function that takes a callback is a little more
complicated, because Zotero.Translate needs to know that translation
isn't done until the callback completes, but it's not difficult. If
you'd like some example code,
Zotero.Utilities.Translate.prototype.doPost is a thin wrapper around
Zotero.HTTP.doPost that does this and not much else.
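
To illustrate the general idea of "translation isn't done until the callback completes", here is a generic sketch that counts outstanding asynchronous calls. FakeTranslate and wrapAsync are hypothetical stand-ins written for this example; they are not Zotero's actual objects or method names, and the real doPost wrapper should be read for the genuine mechanism.

```javascript
// Hypothetical stand-in for the translation runner (not Zotero's API).
function FakeTranslate() {
    this.pending = 0;       // number of async calls still in flight
    this.mainDone = false;  // set once the translator's own code has returned
    this.finished = false;  // true only when nothing is left to wait for
}
FakeTranslate.prototype.begin = function () {
    this.pending++;
};
FakeTranslate.prototype.end = function () {
    this.pending--;
    this.maybeFinish();
};
FakeTranslate.prototype.maybeFinish = function () {
    if (this.mainDone && this.pending === 0) {
        this.finished = true;
    }
};

// Wrap an async function whose last argument is a callback, so that the
// runner is told when the call starts and when its callback has completed.
function wrapAsync(translate, asyncFn) {
    return function () {
        var args = Array.prototype.slice.call(arguments);
        var callback = args.pop();
        translate.begin();
        args.push(function () {
            try {
                callback.apply(null, arguments);
            } finally {
                // Always release the runner, even if the callback throws.
                translate.end();
            }
        });
        asyncFn.apply(null, args);
    };
}
```

A plugin-side asynchronous SQL lookup could be passed through a wrapper of this kind before being exposed to the sandbox, so that translation is not marked complete while a query is still pending.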

Simon