I'm fairly technically adept, and can even do some basic programming
(though not much JavaScript), but every time I think about writing
a translator, I give up; I don't have enough time to learn how
to write one. I can't believe I'm alone.
Question: is there a way to make writing translators easier such that
users don't have to write code? Or where code writing could at least
be significantly limited?
I'm thinking of maybe a simple JSON-based configuration where you're
just itemizing the key components (title, URL pattern, variable
mapping).
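
To make the idea concrete, a site description along these lines might look like the sketch below. Everything in it is an assumption for illustration: the key names (label, urlPattern, itemType, fields), the example.org address, and the matchesSite helper are hypothetical and are not part of any existing Zotero API.

```javascript
// Hypothetical declarative description of one site.
// None of these keys are defined by Zotero; they only itemize the
// components mentioned above (title, URL pattern, variable mapping).
const pressReleaseConfig = {
  label: "Example Press Releases",
  urlPattern: "^https?://www\\.example\\.org/news/",
  itemType: "webpage",
  fields: {
    title: "h1.pagename" // CSS selector whose text becomes the title
  }
};

// Hypothetical helper: true when the config claims to handle the URL.
function matchesSite(config, url) {
  return new RegExp(config.urlPattern).test(url);
}
```

Because the object holds only strings, it survives a round trip through JSON.stringify and JSON.parse unchanged, so it could live in a plain .json file.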
Bruce
1) Translators use JavaScript, not Java (though the two use a similar
syntax).
2) http://www.zotero.org/support/dev/creating_translators_for_sites
3) You can look at existing translators here:
https://www.zotero.org/svn/extension/trunk/translators/
As I said originally, I don't think this is helpful at all. The
barrier to entry is too high. Is there really not a way to (greatly)
simplify this?
Bruce
...
>> As I said originally, I don't think this is helpful at all. The
>> barrier to entry is too high. Is there really not a way to (greatly)
>> simplify this?
>
> This is a very tall order. But a starting point might be a tool to do
> triage on a page. A lot of time can be consumed just figuring out
> where the data in a page is coming from, so that you can lay some kind
> of plan for digging it out. If you could do something like
> 'wgroke --tellabout="John H. Smith" --ris http://www.thatsite.com/', and
> get back a report telling what URL and XPath will access the name
> "John H. Smith" that you see on the screen at that address, and what URLs
> _might_ yield RIS data, that would cut out a lot of the early
> uncertainty that dissuades people from the first attempt.
That's one issue, but for me, that's actually a trivial problem. I can
go really far, really quickly, with just view source, or Firebug.
The bigger issue is the amount of redundant code that gets written for
every translator. I have neither the time nor the JS skills to figure
this out.
What I have in mind is either some standard functions that allow me to
do stuff like:
mapData('title', doc.title)
... or even, potentially, to have the mappings defined in simple JSON
maps/dictionaries. In either case, I'm imagining much shorter,
simpler, translator files.
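
A minimal sketch of what such helpers could look like follows. It is not how Zotero works today: mapData is given an extra item argument here, applyMappings is a hypothetical driver, and a plain object stands in for the parsed page. The extractors are functions, so this is the "standard functions" variant; a pure JSON variant would use selector strings instead.

```javascript
// Hypothetical helper: store a value on the item under a field name,
// trimming whitespace and skipping missing or empty values.
function mapData(item, field, value) {
  if (value === undefined || value === null) return item;
  const text = String(value).trim();
  if (text !== "") item[field] = text;
  return item;
}

// Hypothetical driver: run every field -> extractor pair against a page.
function applyMappings(mappings, page) {
  const item = {};
  for (const [field, extract] of Object.entries(mappings)) {
    mapData(item, field, extract(page));
  }
  return item;
}

// A plain object stands in for the parsed page (an assumption).
const samplePage = {
  title: "  New Grant Announced | Example Institute ",
  url: "http://www.example.org/news/1"
};

const sampleItem = applyMappings(
  {
    title: (p) => p.title.split("|")[0], // keep the part before '|'
    url: (p) => p.url,
    date: (p) => p.date                  // missing on this page, so skipped
  },
  samplePage
);
```

With helpers like these, the per-site file shrinks to the mapping object itself.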
Bruce
I tend to agree. There is a tutorial here: http://www.zotero.org/support/dev/scaffold_tutorial
which could do with a little more love, especially on attaching a PDF
and/or a web page, adding items from a list of multiple items, and
finding and obtaining data from proper structured data sources
(RIS etc.).
It would be nice to have a translator "Wizard". I suppose the first
step to developing that would be for an expert in creating translators
to take some existing translators and turn them into pseudocode, to
expose the repetitive bits of the translator code and abstract them
away.
...
> I hope that I have just had a bad run of luck, but it does kind of
> invite the question of how much bang one could get for one's
> automation buck. Maybe a good starting point would be to collect a
> good batch of links to sites "in the wild" that you feel should be
> easy to translate, as a pool of test data.
I come across a lot of examples like this ...
I want to create a translator for press releases, say.
I can derive the institution and type from the base URI.
I can derive the title from the selector 'h1.pagename' (or 'head
title' and then split the string on the '|' and take the first part).
The only tricky thing is the date, since it's not wrapped in any specific node.
That took me 60 seconds to figure out. Writing a translator would take
a whole lot longer; too much hassle for me to bother.
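
For what it's worth, the three derivations above can be sketched as small string functions. The lookup table, the example.org host, and the "Month D, YYYY" date format are assumptions made for illustration; a real site would need its own values.

```javascript
// Hypothetical lookup from host name to institution.
const INSTITUTIONS = { "www.example.org": "Example Institute" };

// Institution derived from the base URI; null when the host is unknown.
function institutionFromUrl(url) {
  return INSTITUTIONS[new URL(url).hostname] || null;
}

// Title from the 'head title' text: split on '|' and take the first part.
function titleFromHeadTitle(headTitle) {
  return headTitle.split("|")[0].trim();
}

// Date that is not wrapped in its own node: scan the loose page text for
// the first "Month D, YYYY" string (assumed format), else return null.
function dateFromText(text) {
  const months =
    "January|February|March|April|May|June|July|August|" +
    "September|October|November|December";
  const pattern = new RegExp("\\b(?:" + months + ")\\s+\\d{1,2},\\s+\\d{4}\\b");
  const match = text.match(pattern);
  return match ? match[0] : null;
}
```

The date function is the fragile part, which fits the observation above that the date is the only tricky piece.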
Bruce
> I just finished a tutorial on writing Zotero translators. You can take
> a look here: http://niche.uwo.ca/zotero-guide .
>
> Hope that helps.
You know, I saw that yesterday. Mucho kudos on that; a great piece of work!
OTOH, it's 17 chapters/pages, which underlines my point; it's too hard
to write translators!
Bruce