Zotero unAPI not recognising DOM changes?

Adam Retter

Apr 15, 2011, 9:03:16 AM
to zotero-dev
I implemented unAPI, and the Zotero plugin now finds our MODS data and
you can click the 'Folder Icon' 'Save to Zotero… (unAPI)' in the URL
bar. However, our results are split into pages of 10 results, and the
paging is done with Ajax. When you change page, Zotero still only sees
results 1-10 and not the ones on the current page, i.e. 11-20. Why is
this?

Can Zotero recognise updates to the DOM, or only updates when the page
is fully reloaded and the URI changes?

Avram Lyon

Apr 16, 2011, 1:16:00 AM
to zoter...@googlegroups.com

You've correctly identified the problem here. Zotero's detection
apparatus is triggered by one of Firefox's page loading events, and it
only runs once (that is, subsequent loading won't change the icon).

This is still a little odd, however, since I thought that the DOM in
its current state was being sent to doWeb at the moment you click the
"Save to Zotero" icon. From what you're saying, it sounds like the DOM
is being passed in the state it had when detectWeb was first run.

As for the general question of Zotero's translation and DOM updates,
there has been discussion of this in the past, and the hard part was
deciding what DOM modification events should trigger re-detection for
Zotero. I believe there's no technical reason why Zotero couldn't
register for other types of DOM modification, but we haven't managed
to identify which DOM events would be appropriate. Perhaps you have a
recommendation?

Avram

Adam Retter

Apr 19, 2011, 4:37:28 AM
to zoter...@googlegroups.com, ajl...@gmail.com
> On Fri, Apr 15, 2011 at 8:03 AM, Adam Retter <adam....@googlemail.com> wrote:
>> I implemented unAPI, and the Zotero plugin now finds our mods data an
>> you can click the 'Folder Icon' 'Save to Zotero… (unAPI)' in the URL
>> bar. However our results are split into pages of 10 results, and the
>> paging is done with Ajax, when you change Page, Zotero still only sees
>> the results 1-10 and not the ones in the page i.e. 11-20 - why is
>> this?
>>
>> Can Zotero recognise updates to the DOM, or only updates when the page
>> is fully reloaded and the URI changes?
>
> You've correctly identified the problem here. Zotero's detection
> apparatus is triggered by one of Firefox's page loading events, and it
> only runs once (that is, subsequent loading won't change the icon).

1) Okay, so that's a problem in this Web 2.0 world!

> This is still a little odd, however, since I thought that the DOM in
> its current state was being sent to doWeb at the moment you click the
> "Save to Zotero" icon. From what you're saying, it sounds like the DOM
> is being passed in the state it had when detectWeb was first run.

2) Indeed. Can this be fixed? How do I get someone to fix this?

In the larger picture we have 3 things we need to do with Zotero, and
we are looking for a Zotero developer or JavaScript developer to pick
up these small paid projects and contribute the fixes back into Zotero.

i) Fix this issue.
ii) Improve the quality of the MODS translation in Zotero.
iii) Create a small plugin for Zotero that can export (in MODS format)
back into our REST based system.

Do you know how I would contact such a person? Or perhaps you are such a person?

> As for the general question of Zotero's translation and DOM updates,
> there has been discussion of this in the past, and the hard part was
> deciding what DOM modification events should trigger re-detection for
> Zotero. I believe there's no technical reason why Zotero couldn't
> register for other types of DOM modification, but we haven't managed
> to identify which DOM events would be appropriate. Perhaps you have a
> recommendation?

3) I am afraid that I am not really a DOM/JavaScript guy, so I can't
really help answer this.
If, as in (2), the DOM were sent in its current state when you click the
'Save to Zotero' icon, then this would fix our problem.

> Avram

--
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk

Richard Karnesky

Apr 19, 2011, 12:13:44 PM
to zotero-dev
> ii) Improve the quality of the MODS translation in Zotero.

What specific improvements need to be made? The Aquifer Metadata
Working Group made a great list & I think I addressed much of the
low-hanging fruit. I'm reasonably familiar with the MODS translator in
Zotero & know more about MODS XML in general (I made the refbase
implementation, which has been re-used in other FLOSS projects).

--Rick

Jens Østergaard Petersen

Apr 19, 2011, 1:23:36 PM
to zoter...@googlegroups.com
Hi Rick,

I hope by tomorrow afternoon (UTC) to have gathered together a list of things that I think should be changed in Zotero <> MODS exports/imports.

Some call for discussion, some are pretty straightforward, I think.

I work with Adam on this project.

Best,

Jens

simonsterdotcom

Apr 19, 2011, 4:05:47 PM
to zotero-dev
On Apr 19, 4:37 am, Adam Retter <adam.ret...@googlemail.com> wrote:
The problem wasn't the DOM sent to the translator (which is always up
to date), but that the unAPI translator only looked for IDs during
detection. I've committed an updated unAPI translator, which is
available at

https://www.zotero.org/svn/extension/trunk/translators/unAPI.js

This translator is almost a complete rewrite of the old unAPI
translator, so it would be great to get some testing, not only here,
but on other sites as well.
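
To make the distinction above concrete, here is a minimal sketch of the difference between collecting unAPI identifiers once at detection time and reading them again at save time. It is not the code of the committed translator. The `page` object, the record identifiers and the `findUnapiIds` helper are all invented for illustration; the only thing taken from the unAPI convention is that records are marked up as `<abbr class="unapi-id" title="IDENTIFIER">`.

```javascript
// Collect unAPI identifiers from an HTML snapshot (hypothetical helper).
// unAPI marks each record as <abbr class="unapi-id" title="IDENTIFIER"></abbr>.
function findUnapiIds(html) {
  const ids = [];
  for (const [tag] of html.matchAll(/<abbr\b[^>]*>/gi)) {
    const cls = /\bclass="([^"]*)"/i.exec(tag);
    const title = /\btitle="([^"]*)"/i.exec(tag);
    if (cls && title && cls[1].split(/\s+/).includes("unapi-id")) {
      ids.push(title[1]);
    }
  }
  return ids;
}

// Stand-in for a results page whose list is swapped by Ajax paging.
const page = {
  html:
    '<abbr class="unapi-id" title="rec-1"></abbr>' +
    '<abbr class="unapi-id" title="rec-2"></abbr>',
};

// Behaviour described as the problem: identifiers gathered once, at detection.
const idsAtDetection = findUnapiIds(page.html);

// Ajax paging replaces the result list without a page load.
page.html =
  '<abbr class="unapi-id" title="rec-11"></abbr>' +
  '<abbr class="unapi-id" title="rec-12"></abbr>';

// Reading the identifiers again at save time picks up the new page.
const idsAtSave = findUnapiIds(page.html);

console.log("cached at detection:", idsAtDetection);
console.log("read at save time:  ", idsAtSave);
```

The cached list still names the first page of results, while the list read at save time matches what the user currently sees.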

We may also want to consider allowing translators to send a DOM event
(say, zoteroItemMetadataChanged, or something like that) to indicate
that we should re-run the detection process. I think that this would
be fairly easy to do.
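
A rough sketch of how such an event could work, under stated assumptions: the event name is the one floated above, `runDetection` is a placeholder rather than a Zotero function, and a plain `EventTarget` stands in for the page's `document` so the snippet also runs outside a browser. Nothing here is an existing Zotero API.

```javascript
// Event name taken from the suggestion above; purely illustrative.
const EVENT_NAME = "zoteroItemMetadataChanged";

// Stand-in for the page's document. In a browser this would be `document`.
const pageDocument = new EventTarget();

// Placeholder for the detection step; here it only counts invocations.
let detectionRuns = 0;
function runDetection() {
  detectionRuns += 1;
}

// Tool side: detect once on page load, then again whenever the page asks.
runDetection();
pageDocument.addEventListener(EVENT_NAME, runDetection);

// Page side: after Ajax paging has replaced the results, announce the change.
function onResultsReplaced() {
  pageDocument.dispatchEvent(new Event(EVENT_NAME));
}

onResultsReplaced();
console.log("detection runs:", detectionRuns);
```

The page only has to dispatch one event after it swaps its content, so the listener cost is paid once per update rather than once per DOM mutation.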

Simon

Avram Lyon

Apr 19, 2011, 4:20:33 PM
to Adam Retter, zoter...@googlegroups.com
On Tue, Apr 19, 2011 at 12:37 PM, Adam Retter
<adam....@googlemail.com> wrote:
[..]

>> This is still a little odd, however, since I thought that the DOM in
>> its current state was being sent to doWeb at the moment you click the
>> "Save to Zotero" icon. From what you're saying, it sounds like the DOM
>> is being passed in the state it had when detectWeb was first run.
>
> 2) Indeed. Can this be fixed, how do I get someone to fix this?

[My comment here is no longer relevant given Simon's more
knowledgeable comments and fix.]

> In the larger picture we have 3 things we need to do with Zotero and
> looking for a Zotero developer or JavaScript developer with knowledge
> of JavaScript to pick up these small paid projects and contribute the
> fixes back into Zotero.
>
> i) Fix this issue.
> ii) Improve the quality of the MODS translation in Zotero.
> iii) Create a small plugin for Zotero that can export (in MODS format)
> back into our REST based system.
>
> Do you know how I would contact such a person? Or perhaps you are such a person?

Along with Richard, I'm eager to hear your feedback on (ii). My sense
is that MODS is relatively rarely-used in the Zotero world (few
translators build off of it), so there may be some uncovered edge
cases. Your detailed feedback is welcome.

As for (iii), can you elaborate on the workflow you're looking for? I
encourage you to try using Quick Copy (set up in the Zotero prefs)
with an export translator instead of a style, which can be set up as a
site-specific preference. Then you can have a drag-and-drop receiver
on the webpage and let users just drag their items into your system.
If you want an actual push solution that lets people push items to
your service in MODS format, without requiring drag-and-drop, you will
need a Zotero plugin. I suppose I could do it myself for a reasonable
fee, but you might be better off modifying an existing plugin like
Zotfile to add capability like this.

Avram

Avram Lyon

Apr 19, 2011, 4:25:03 PM
to zoter...@googlegroups.com
On Wed, Apr 20, 2011 at 12:05 AM, simonsterdotcom
<simonst...@gmail.com> wrote:
>> > As for the general question of Zotero's translation and DOM updates,
>> > there has been discussion of this in the past, and the hard part was
>> > deciding what DOM modification events should trigger re-detection for
>> > Zotero. I believe there's no technical reason why Zotero couldn't
>> > register for other types of DOM modification, but we haven't managed
>> > to identify which DOM events would be appropriate. Perhaps you have a
>> > recommendation?
[..]

> We may also want to consider allowing translators to send a DOM event
> (say, zoteroItemMetadataChanged, or something like that) to indicate
> that we should re-run the detection process. I think that this would
> be fairly easy to do.

I was under the impression that we were trying to stay away from
Zotero-specific signals like this (similarly, meta tags). Is there
perhaps a neutral way to phrase the event so we can call this an
informal standard event that means "re-scrape me"?

I'll try to play with the unAPI translator ASAP (and review the month
of reviews that have backed up...)

Avram

simonsterdotcom

Apr 19, 2011, 7:26:37 PM
to zotero-dev
On Apr 19, 4:25 pm, Avram Lyon <ajl...@gmail.com> wrote:
> On Wed, Apr 20, 2011 at 12:05 AM, simonsterdotcom
>
> <simonsterdot...@gmail.com> wrote:
> >> > As for the general question of Zotero's translation and DOM updates,
> >> > there has been discussion of this in the past, and the hard part was
> >> > deciding what DOM modification events should trigger re-detection for
> >> > Zotero. I believe there's no technical reason why Zotero couldn't
> >> > register for other types of DOM modification, but we haven't managed
> >> > to identify which DOM events would be appropriate. Perhaps you have a
> >> > recommendation?
> [..]
> > We may also want to consider allowing translators to send a DOM event
> > (say, zoteroItemMetadataChanged, or something like that) to indicate
> > that we should re-run the detection process. I think that this would
> > be fairly easy to do.
>
> I was under the impression that we were trying to stay away from
> Zotero-specific signals like this (similarly, meta tags). Is there
> perhaps a neutral way to phrase the event so we can call this an
> informal standard event that means "re-scrape me"?

The event would be specific to browser-based reference management
tools, and I guess at this point I'd be surprised if there is another
such tool, so I'm not sure if there is a problem with including "Zotero"
in the name, but I guess we could name it something like
"CitationInfoChanged". FWIW, Mozilla has created these kinds of
non-standard events for implementation-specific functionality (e.g.,
MozAfterPaint).

Alternatives include convincing developers to fire the "load" event a
second time, or to register for DOMLinkAdded and get implementers to
add and remove a link element instead of firing a non-standard event.
Both of these would be hacks. In theory, we could also register for
DOM mutation events, but this apparently has a severe performance
impact, so it's probably not the best idea.

Simon

Avram Lyon

Apr 20, 2011, 12:26:13 AM
to zoter...@googlegroups.com
On Wed, Apr 20, 2011 at 3:26 AM, simonsterdotcom
<simonst...@gmail.com> wrote:
[..]

>> I was under the impression that we were trying to stay away from
>> Zotero-specific signals like this (similarly, meta tags). Is there
>> perhaps a neutral way to phrase the event so we can call this an
>> informal standard event that means "re-scrape me"?
>
> The event would be specific to browser-based reference management
> tools, and I guess at this point I'd be surprised if there is another
> such tool, so I'm not sure if there is problem with including "Zotero"
> in the name, but I guess we could name it something like
> "CitationInfoChanged". FWIW, Mozilla has created these kinds of non-
> standard events for implementation-specific functionality (e.g.,
> MozAfterPaint).

I don't know of any other software that would be looking for this
event either, so I suppose it's just a philosophical position. Maybe
we don't need to be afraid of including our name from time to time.
And if this turns out to be useful to others, they can listen for it
as well, even if it does have our name in it.

So ZoteroInfoChanged? To be fired when a page has new cite-able content?

Avram

Jens Østergaard Petersen

Apr 20, 2011, 4:25:49 AM
to zoter...@googlegroups.com, Eric Decker, Adam Retter
Hi Rick,

I was not aware of the DLF/Aquifer work in connection with Zotero. I use their work in other contexts, so that is a bad oversight on my part. But how they can even entertain the notion that they can make Zotero "import all MODS fields to Zotero and be able to export from Zotero back to MODS without losing any information" is beyond me, given the difference in expressivity of Zotero and MODS (and the looseness of the MODS schema, even following DLF/Aquifer Guidelines). If Zotero was to become able to do this, it would probably be out of the reach of most users.

What I wish to achieve is less than this - first of all, I wish to allow data input in Zotero to be exported to the right MODS elements; second, to be able to import again what has been exported; and, third, to be able to import some of the most common MODS structures in Zotero. I suggest that we tackle the MODS issue in these three steps and concentrate on step one for now. Our use case is connected with step one, for our staff uses Zotero to input records for our institute bibliography.

I am sure that some of the step two problems will be solved while solving the step one problems, and - for now - the only step three problem I would like to see solved is for Zotero to import all children of MODS titleInfo (nonSort, subTitle, partNumber, partName, in addition to title).

We are also interested in seeing these changes implemented in Multilingual Zotero, but we will take that up later.

The MODS.js file I have been testing against is version 2.1b3. The method I employ is naive, in the sense that I only examine input and output, and do not try to relate what I see to the SQLite fields or to the JavaScript of MODS.js - I am challenged enough as it is!

What I have done is to create a "full" entry for the entry types we most commonly use: Book, Book Section, Conference Paper, Journal Article, Thesis, Encyclopedia Article, and Newspaper Article. I then export these entries to MODS and note what I think should be done differently. I have some background in MARC cataloguing, but no formal training, so I am sure there are things one should discuss, and some of the "mistakes" I point out may be my own. Also, I am sure that there are aspects of this wonderful tool that is Zotero that I simply am not aware of, for I have not yet used it actively myself ....

Best,

Jens

##

Book:

1 Series Editor gets exported to mods/name with role/roleTerm "cbt".
It should be exported to mods/relatedItem type="series"/name with role/roleTerm "edt" - sorry for the improvised notation!

2 Translator gets exported to mods/name with roleTerm "cbt".
It should be exported with roleTerm "trl".

3 Date gets exported to originInfo/copyrightDate.
It should be exported to originInfo/dateIssued.
If anyone believes that the copyright date is when something is published, I will gladly give a lecture.

4 Accessed gets exported as originInfo/dateCaptured.
It should be exported to location/url/@dateLastAccessed.
Accessed obviously relates to URL, under which it is displayed in Zotero, not to the record as a whole (it is not the book that is accessed, it is the URL). In MODS, dateCaptured is the "date on which the resource was digitized or a subsequent snapshot was taken", so if I scan a book or archive a web site and catalogue the scan or archive, this is where the date I made the scan or did the archiving goes. If I access someone else's resource on the Internet, the only thing I can know is when I accessed it - and the only interesting thing is usually when I last accessed it.

5 Library Catalog does not get exported.
It should probably be exported as location/physicalLocation.
I am not sure about this, but this sounds most reasonable, given the fact that Library Catalog is coupled with Call Number.

6 Call Number gets exported as classification.
It should probably be exported to location/holdingSimple/copyInformation/shelfLocator.
The MODS element classification is a "designation applied to a resource that indicates the subject by applying a formal system of coding and organizing resources according to subject areas." Typical type values are "lcc" and "udc." A Call Number is what you write on a requisition to have the book fetched from the shelves. A call number would typically be some sort of shelf locator, i.e. a reference to the physical whereabouts of the book.

7 Archive does not get exported.
It should probably be exported as location/physicalLocation.
This is the same destination that I have argued Library Catalog should have. I am not quite sure what the difference in Zotero is between Library and Archive? In my terminology, books don't usually belong in archives (whereas documents typically do).

8 Loc. in Archive gets exported to physicalLocation.
It should probably be exported to location/holdingSimple/copyInformation/shelfLocator.
See 7.

9 Series Number gets exported to relatedItem type="series"/titleInfo/partNumber
It should be exported to relatedItem type="series"/part/detail type="volume"/number.
A partNumber is part of a title, not part of a publication. A part of a publication is noted in - part! The position within a series of books is represented in the same way as the position of a volume within a periodical. In Zotero, the position of a volume within a periodical is correctly represented.

10 # of Pages does not get exported.
It should be exported to physicalDescription/extent.
I had to check that again, but this is true!

11 Language does not get exported.
It should be exported to language/languageTerm type="text".

12 Short Title does not get exported.
It should be exported to titleInfo type="abbreviated"/title.
With other record types, Short Title _is_ exported.

13 Extra does not get exported.
It should probably be exported to note. Of course, the question here is why it is then not called "Note", but a Zotero Note is outside the cataloguing record itself (which becomes a problem when importing). Is that why the field is labelled "Extra"?

14 Additionally, originInfo/issuance with the value "monographic" should be exported for books.
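
As an illustration of where the proposals in the Book list would put things, here is a small sketch that serialises a made-up book record along those lines. It covers items 1-4, 9-12 and 14 only, leaving out the "probably" cases. It is not taken from MODS.js: the helper functions, the item property names and the sample values are all assumptions invented for this example, and "edt"/"trl" are the MARC relator codes named above.

```javascript
// Tiny XML helpers, for illustration only (not part of Zotero or MODS.js).
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one element; `children` is an array of already-serialised strings.
function el(name, attrs, children) {
  const attrText = Object.entries(attrs)
    .map(([key, value]) => ` ${key}="${escapeXml(value)}"`)
    .join("");
  return `<${name}${attrText}>${children.join("")}</${name}>`;
}

// A personal name carrying a MARC relator code such as "trl" or "edt".
function personalName(person, relatorCode) {
  return el("name", { type: "personal" }, [
    el("namePart", { type: "family" }, [escapeXml(person.family)]),
    el("namePart", { type: "given" }, [escapeXml(person.given)]),
    el("role", {}, [
      el("roleTerm", { type: "code", authority: "marcrelator" }, [relatorCode]),
    ]),
  ]);
}

// Hypothetical item shape; the property names are invented for this sketch.
function bookToMods(item) {
  return el("mods", { xmlns: "http://www.loc.gov/mods/v3" }, [
    el("titleInfo", {}, [el("title", {}, [escapeXml(item.title)])]),
    // Item 12: Short Title as an abbreviated title.
    el("titleInfo", { type: "abbreviated" }, [
      el("title", {}, [escapeXml(item.shortTitle)]),
    ]),
    // Item 2: Translator with roleTerm "trl".
    personalName(item.translator, "trl"),
    // Items 3 and 14: Date as dateIssued, plus issuance "monographic".
    el("originInfo", {}, [
      el("dateIssued", {}, [escapeXml(item.date)]),
      el("issuance", {}, ["monographic"]),
    ]),
    // Item 10: # of Pages as physicalDescription/extent.
    el("physicalDescription", {}, [el("extent", {}, [escapeXml(item.numPages)])]),
    // Item 11: Language as a textual languageTerm.
    el("language", {}, [
      el("languageTerm", { type: "text" }, [escapeXml(item.language)]),
    ]),
    // Item 4: Accessed as an attribute of the URL.
    el("location", {}, [
      el("url", { dateLastAccessed: item.accessDate }, [escapeXml(item.url)]),
    ]),
    // Items 1 and 9: Series Editor and Series Number inside the series.
    el("relatedItem", { type: "series" }, [
      el("titleInfo", {}, [el("title", {}, [escapeXml(item.series)])]),
      personalName(item.seriesEditor, "edt"),
      el("part", {}, [
        el("detail", { type: "volume" }, [
          el("number", {}, [escapeXml(item.seriesNumber)]),
        ]),
      ]),
    ]),
  ]);
}

const exampleBook = {
  title: "An Example Book",
  shortTitle: "Example",
  translator: { family: "Doe", given: "Jane" },
  seriesEditor: { family: "Roe", given: "Richard" },
  date: "2011",
  numPages: "250",
  language: "English",
  url: "http://example.org/book",
  accessDate: "2011-04-20",
  series: "Example Series",
  seriesNumber: "7",
};

console.log(bookToMods(exampleBook));
```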

##

Journal Article:

1 Editor gets exported to name.
It should be exported to relatedItem type="host"/name.
I think there is some equivocation in Zotero as regards "Editor". Obviously, a book can have an editor and the editor is then noted on the same level as the author. Theoretically speaking, a journal article can have an editor as well (e.g. a posthumous article edited for publication). Normally, however, when we talk about an editor in connection with a journal article or a book section (what we term an edited volume), we refer to the editor of the periodical (containing a collection of articles) or of the book (containing a collection of contributions on the theme of the book). That is at least how our users interpret this. - In connection with periodicals we run into the problem that editors are mostly corporate, and Zotero only knows personal names, but let's leave that for later.

2 Reviewed Author gets exported to name with the roleTerm "cbt".
Well, "cbt" is plain wrong. What I really think Zotero should be doing here is to establish a Review record type. Instead of this, you could add a Reviewed Title and push both in relatedItem type="references".

3 Translator gets exported to name with the roleTerm "cbt".
I am sure you will agree that the roleTerm should be "trl".

4 Accessed, Library Catalog, Call Number, Archive, Loc. in Archive, Language, Extra
Same problem as Book.

5 Series, Series Title, Series Text
I am probably bibliographically challenged, but I cannot make out what it could mean for a journal or journal article to belong to a series. I have no way of evaluating the export to relatedItem type="series" here; however, Series Title gets exported to partTitle, which is plainly not in MODS, and Series Text gets exported to subTitle, which is hard to make sense of.

##

Book Section

NB: I suppose this might apply to two different types of record: one in which one records a contribution to what we call "an edited volume" and one in which one records only part of a book (say a particular chapter of a monograph). Since the latter use case is rare, I only consider the first use case here.

1 Book Author
I don't know what this could mean - except if it falls under the second use case above. I would suggest that Zotero removes this, to prevent confusion. An edited volume cannot have an author. One could then establish Book Part as a record type.

2 Editor gets exported to name.
Since the editor of the edited volume is what is probably meant here, it should be exported to relatedItem type="host"/name. I note that this is what the Zotero Chicago export does (I have not checked this export elsewhere).

3 Series Editor, Translator, Accessed, Series Number, Library Catalog, Call Number, Archive, Loc. in Archive, Language, Extra
Same as Book.

4 Series gets exported to relatedItem type="series"
Since the contribution is not what occurs in a series, but the edited volume is, it should be exported to relatedItem type="host"/relatedItem type="series".
Probably not all MODS processors can handle nestings this deep, but ours can! Anyway, this is the only correct output.

##

Thesis

1 Type gets exported to genre (untyped) twice.
It should only be exported once.

2 Accessed, Library Catalog, Call Number, Archive, Loc. in Archive, Language, Extra
Same as Book.

##

Conference Paper

1 typeOfResource with the value "undefined" gets output.
This should be "text". It is invalid to have "undefined".

2 Editor
Since the editor of the conference volume is probably meant, it should be exported to relatedItem type="host"/name.

3 Series Editor, Translator, Date, Accessed, Language, Extra
Same as Book.

4 Volume, Pages gets exported to part.
They should be exported to relatedItem type="host"/part.
They are not part of the conference paper, but the conference paper is part of the conference volume.

5 Place, Publisher gets exported to originInfo
They should be exported to relatedItem type="host"/originInfo.
They characterize the conference volume, not the conference paper.

6 ISBN gets exported to identifier type="isbn"
It should be exported to relatedItem type="host"/identifier type="isbn".
It characterizes the conference volume, not the conference paper. The DOI is (potentially) different.

7 Proceedings Title gets exported to relatedItem (untyped)
It should be exported to relatedItem type="host"
It is the host that the conference paper occurs in.

8 Series gets exported to relatedItem type="series"
It should be exported to relatedItem type="host"/relatedItem type="series".
It is the series that the conference volume occurs in.

9 Conference Name is not output.
It should go into name type="conference".

10 Short title does not get output.
It should be output to titleInfo type="abbreviated"
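
The Conference Paper proposals mostly come down to moving volume-level data under relatedItem type="host". A sketch of that nesting follows, under the same caveats as any illustration: the helper functions, the item property names and the sample values are invented here, and this is not how MODS.js is written.

```javascript
// Tiny XML helpers, for illustration only (not part of Zotero or MODS.js).
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function el(name, attrs, children) {
  const attrText = Object.entries(attrs)
    .map(([key, value]) => ` ${key}="${escapeXml(value)}"`)
    .join("");
  return `<${name}${attrText}>${children.join("")}</${name}>`;
}

// Hypothetical item shape; the property names are invented for this sketch.
function conferencePaperToMods(item) {
  // Everything that describes the conference volume goes into the host.
  const host = el("relatedItem", { type: "host" }, [
    // Item 7: Proceedings Title.
    el("titleInfo", {}, [el("title", {}, [escapeXml(item.proceedingsTitle)])]),
    // Item 2: Editor of the volume.
    el("name", { type: "personal" }, [
      el("namePart", {}, [escapeXml(item.editor)]),
      el("role", {}, [
        el("roleTerm", { type: "code", authority: "marcrelator" }, ["edt"]),
      ]),
    ]),
    // Item 5: Place and Publisher.
    el("originInfo", {}, [
      el("place", {}, [
        el("placeTerm", { type: "text" }, [escapeXml(item.place)]),
      ]),
      el("publisher", {}, [escapeXml(item.publisher)]),
    ]),
    // Item 6: ISBN.
    el("identifier", { type: "isbn" }, [escapeXml(item.isbn)]),
    // Item 4: Volume and Pages.
    el("part", {}, [
      el("detail", { type: "volume" }, [
        el("number", {}, [escapeXml(item.volume)]),
      ]),
      el("extent", { unit: "pages" }, [
        el("start", {}, [escapeXml(item.firstPage)]),
        el("end", {}, [escapeXml(item.lastPage)]),
      ]),
    ]),
    // Item 8: the series belongs to the volume, so it nests inside the host.
    el("relatedItem", { type: "series" }, [
      el("titleInfo", {}, [el("title", {}, [escapeXml(item.series)])]),
    ]),
  ]);

  return el("mods", { xmlns: "http://www.loc.gov/mods/v3" }, [
    el("titleInfo", {}, [el("title", {}, [escapeXml(item.title)])]),
    // Item 1: "text" rather than "undefined".
    el("typeOfResource", {}, ["text"]),
    // Item 9: Conference Name.
    el("name", { type: "conference" }, [
      el("namePart", {}, [escapeXml(item.conferenceName)]),
    ]),
    host,
  ]);
}

const examplePaper = {
  title: "An Example Paper",
  conferenceName: "Example Conference 2011",
  proceedingsTitle: "Proceedings of the Example Conference",
  editor: "Roe, Richard",
  place: "Heidelberg",
  publisher: "Example Press",
  isbn: "978-0-00-000000-2",
  volume: "3",
  firstPage: "11",
  lastPage: "20",
  series: "Example Series",
};

console.log(conferencePaperToMods(examplePaper));
```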

##

Encyclopedia Article

1 typeOfResource with the value "undefined" gets output.
This should be "text". It is invalid to have "undefined".

2 Editor
Same as Book Section

3 Series Editor, Translator, Series Number, Accessed, Library Catalog, Call Number, Archive, Loc. in Archive, Language, Extra
Same as Book

4 Volume, Pages gets exported to part.
Same as Conference Paper

5 Place, Publisher gets exported to originInfo
Same as Conference Paper

6 ISBN gets exported to identifier type="isbn"
Same as Conference Paper

7 Encyclopedia Title gets exported to relatedItem (untyped)
It should be exported to relatedItem type="host"

8 Short title does not get output.
It should be exported to titleInfo type="abbreviated"

##

Newspaper Article

1 Reviewed Author gets exported to name with the roleTerm "cbt".
Same as Journal Article.

2 Translator, Accessed, Library Catalog, Call Number, Archive, Loc. in Archive, Language, Extra, Short Title
Same problem as Book.

##

I enclose the files I have worked with.

article.xml
book.xml
conference paper.xml
contribution.xml
encyclopedia article.xml
newspaper article.xml
thesis.xml

Richard Karnesky

Apr 20, 2011, 11:53:00 AM
to zotero-dev
> My sense
> is that MODS is relatively rarely-used in the Zotero world (few
> translators build off of it)

My impression is a little bit different. Most sites that use MODS
don't need a site-specific translator because the MODS data is rich
enough that it doesn't need further customization & adoption also
often correlates with a preference for agnosticism to clients that
want the data. There is still more MODS in the wild than BIBO RDF
(though this may not always be the case) & many sites and server
software that support unAPI (which, admittedly, isn't a HUGE number),
also support MODS.


> As for (iii), can you elaborate on the workflow you're looking for? I
> encourage you to try using Quick Copy (set up in the Zotero prefs)
> with an export translator instead of a style, which can be set up as a
> site-specific preference. Then you can have a drag-and-drop receiver
> on the webpage and let users just drag their items into your system.
> If you want an actual push solution that lets people push items to
> your service in MODS format, without requiring drag-and-drop, you will
> need a Zotero plugin. I suppose I could do it myself for a reasonable
> fee, but you might be better off modifying an existing plugin like
> Zotfile to add capability like this.

I agree with this. refbase had a private testing zotero plugin that
pushed selected records to the server. I never updated this for
Zotero 2 and now I personally just use quick copy. It isn't an
uncommon feature request & a more generic solution that worked with
many web-based programs could be useful. The old Zotero To GROK Codex
plugin would be another example of the need & I think there are
probably a few others.

--Rick

Avram Lyon

Apr 20, 2011, 12:45:43 PM
to zoter...@googlegroups.com
On Wed, Apr 20, 2011 at 7:53 PM, Richard Karnesky <karn...@gmail.com> wrote:
>> My sense
>> is that MODS is relatively rarely-used in the Zotero world (few
>> translators build off of it)
>
> My impression is a little bit different.  Most sites that use MODS
> don't need a site-specific translator because the MODS data is rich
> enough that it doesn't need further customization & adoption also
> often correlates with a preference for agnosticism to clients that
> want the data.  There is still more MODS in the wild than BIBO RDF
> (though this may not always be the case) & many sites and server
> software that support unAPI (which, admittedly, isn't a HUGE number),
> also support MODS.

This is purely an academic point, so there's little need to argue it
either way, I suppose. I'm mainly comparing it to MARC, RIS and BibTeX
in terms of frequency of use. As with BIBO RDF, I fully expect to see
more people concerned about varied details of our imports and exports
as more people try to use MODS with Zotero. Many such points will come
down to varying interpretations of the respective models and their
relationship to Zotero's model-- it's telling that we've had only
minimal revisions to the underlying MODS translator in the past year
or so. The core is solid-- the mappings might at times be debatable.
So let's debate them, as it becomes necessary.

Avram

Adam Retter

Apr 21, 2011, 6:28:09 AM
to zotero-dev, karn...@gmail.com, ajl...@gmail.com
Richard, Avram,

Have you had a chance to study the issues in the MODS translation that
Jens has raised?

How can we move forward with getting these issues resolved?

Thanks Adam.


Richard Karnesky

Apr 21, 2011, 12:22:02 PM
to zotero-dev
I have a day job, so will not always be fast at responding to this
thread & may be slow in implementing these changes if I do them gratis
(and I certainly don't mind doing this, as it improves functionality
of zotero that I care about)...

I'll comment on some suggestions, below. Most suggestions that I do
not comment on seem like good ideas to me. I have not actually tested
the export, so am taking your analysis at face value (though note that
you use 'cbt' when you mean 'ctb').
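
A transposition like that is easy to catch mechanically. As a hedged sketch (the table and the `relatorLabel` function are made up for this message and cover only the codes mentioned in this thread, not the full MARC relator list), an exporter could validate codes against a lookup before writing them:

```javascript
// A handful of MARC relator codes mentioned in this thread (not the full list).
const MARC_RELATORS = {
  aut: "Author",
  ctb: "Contributor",
  edt: "Editor",
  pbd: "Publishing director",
  trl: "Translator",
};

// Hypothetical helper: return the label for a code, or fail loudly.
function relatorLabel(code) {
  if (!Object.prototype.hasOwnProperty.call(MARC_RELATORS, code)) {
    throw new Error(`Unknown MARC relator code: ${code}`);
  }
  return MARC_RELATORS[code];
}

console.log(relatorLabel("ctb"));

try {
  relatorLabel("cbt"); // transposed letters
} catch (error) {
  console.log(error.message);
}
```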


> 1 Series Editor gets exported to mods/name with role/roleTerm "cbt".
> It should be exported to mods/relatedItem type="series"/name with role/roleTerm "edt" - sorry for the improvised notation!

Mmmmaybe. But also note the text at http://www.loc.gov/marc/relators/relaterm.html:
Publishing director [pbd]
Use for a person or organization who presides over the elaboration
of a collective work to ensure its coherence or continuity. This
includes editors-in-chief, literary editors, editors of series, etc.


> 3 Date gets exported to originInfo/copyrightDate.
> It should be exported to originInfo/dateIssued.
> If anyone believes that the copyright date is when something is published, I will gladly give a lecture.

Lectures aside, it is typically the copyright date that is stored in
online databases. It is also usually the copyright date that gets
entered for references that are added manually: it is on the copyright
page in the front matter of most modern books.


> 5 Library Catalog does not get exported.
> It should probably be exported as location/physicalLocation.
> I am not sure about this, but this sounds most reasonable, given the fact that Library Catalog is coupled with Call Number.

Yeah, I don't think I like it there: Zotero uses the field to record
the source of the citation. More often than not, this is not a
physical location. It is a website (SpringerLink or ScienceDirect or
whatever).


> 6 Call Number gets exported as classification.
> It should probably be exported to location/holdingSimple/copyInformation/shelfLocator.
> The MODS element classification is a "designation applied to a resource that indicates the subject by applying a formal system of coding and organizing resources according to subject areas." Typical type values are "lcc" and "udc." A Call Number is what you write on a requisition to have the book fetched from the shelves. A call number would typically be some sort of shelf locator, i.e. a reference to the physical whereabouts of the book.

I think current usage is fine, as is stated explicitly in:
http://www.loc.gov/standards/mods/userguide/classification.html
[also: it is difficult (but perhaps not impossible) to retro-actively
provide an authority.]


> 7 Archive does not get exported.
> It should probably be exported as location/physicalLocation.
> This is the same destination that I have argued Library Catalog should have. I am not quite sure what the difference in Zotero is between Library and Archive? In my terminology, books don't usually belong in archives (whereas documents typically do).
>
> 8 Loc. in Archive gets exported to physicalLocation.
> It should probably be exported to location/holdingSimple/copyInformation/shelfLocator.
> See 7.

Haven't formed opinions of these yet.


> 1 Book Author
> I don't know what this could mean - except if it falls under the second use case above. I would suggest that Zotero removes this, to prevent confusion. An edited volume cannot have an author. One could then establish Book Part as a record type.

This is commonly used, for example, when a foreword or afterword that
is written by someone other than the author is cited. People use
this, so I don't think it should be removed.


> Thesis
>
> 1 Type gets exported to genre (untyped) twice.
> It should only be exported once.

Not sure I follow. Most exports have two genres exported: local and
marcgt.


> Conference Paper
....
> 4 Volume, Pages gets exported to part.
> They should be exported to relatedItem type="host"/part.
> They are not part of the conference paper, but the conference paper ...
>
> read more »

Argh. Other comments will have to follow in a new message.

--Rick

Richard Karnesky

Apr 21, 2011, 12:27:07 PM4/21/11
to zotero-dev
> > read more »
>
> Argh.  Other comments will have to follow in a new message.

Actually, perusing the comments that had been truncated: I either
agree with them or have commented on similar issues in the last
message.

I have, myself, commented that a lot of information needs to be
re-nested in relatedItem type="host". I haven't implemented it yet because it
was not trivial & it doesn't really hurt import/export (it just isn't
aesthetically pleasing/technically right). It will be possible,
though.
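
A minimal sketch of the re-nesting, in Python with ElementTree rather than
the actual MODS.js code. The helper name and sample values are hypothetical,
and the MODS namespace is omitted for brevity:

```python
import xml.etree.ElementTree as ET

def build_conference_paper(title, host_title, volume, start_page, end_page):
    """Hypothetical helper: put volume/pages under relatedItem type="host"."""
    mods = ET.Element("mods")
    ET.SubElement(ET.SubElement(mods, "titleInfo"), "title").text = title

    # The proceedings (the host) carry the volume and page information.
    host = ET.SubElement(mods, "relatedItem", type="host")
    ET.SubElement(ET.SubElement(host, "titleInfo"), "title").text = host_title

    part = ET.SubElement(host, "part")
    detail = ET.SubElement(part, "detail", type="volume")
    ET.SubElement(detail, "number").text = volume
    extent = ET.SubElement(part, "extent", unit="pages")
    ET.SubElement(extent, "start").text = start_page
    ET.SubElement(extent, "end").text = end_page
    return mods

record = build_conference_paper("A Paper", "Proceedings of X", "12", "101", "110")
print(ET.tostring(record, encoding="unicode"))
```

With this layout, part no longer appears as a direct child of mods; it
only appears inside the host item.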
>
> --Rick

Avram Lyon

Apr 21, 2011, 12:34:31 PM4/21/11
to zotero-dev
On Thu, Apr 21, 2011 at 2:28 PM, Adam Retter <adam....@googlemail.com> wrote:
> Richard, Avram,
>
> Have you had a chance to study the issues in the MODS translation that
> Jens has raised?
>
> How can we move forward with getting these issues resolved?

I really haven't worked with MODS enough to respond to them. I'll
defer to the judgment of the more MODS-proficient here.

The way to get these resolved is to come to consensus here at
zotero-dev on how we want the translation to go and editing MODS.js to
match. Then we'll commit MODS.js and we're done.

Avram

Jens Østergaard Petersen

Apr 21, 2011, 2:26:54 PM4/21/11
to zoter...@googlegroups.com
Hi Rick,

Thank you for going through my over-long list. And sorry for any impatience on our part ....

Some quick notes below:

On Apr 21, 2011, at 6:22 PM, Richard Karnesky wrote:

> I have a day job, so will not always be fast at responding to this
> thread & may be slow in implementing these changes if I do them gratis
> (and I certainly don't mind doing this, as it improves functionality
> of zotero that I care about)...
>
> I'll comment on some suggestions, below. Most comments that do not
> have comments seem like good ideas to me. I have not actually tested
> the export, so am taking your analysis at face value (though note that
> you use 'cbt' when you mean 'ctb').
>
>
>> 1 Series Editor gets exported to mods/name with role/roleTerm "cbt".
>> It should be exported to mods/relatedItem type="series"/name with role/roleTerm "edt" - sorry for the improvised notation!
>
> Mmmmaybe. But also note the text at http://www.loc.gov/marc/relators/relaterm.html:
> Publishing director [pbd]
> Use for a person or organization who presides over the elaboration
> of a collective work to ensure its coherence or continuity. This
> includes editors-in-chief, literary editors, editors of series, etc.

Using 'pbd' is fine with me, but the name still belongs inside relatedItem. The LC sample file for book chapter uses the text value "editor", so using "publishing director" could be overkill.
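
To make the improvised notation concrete, here is a rough sketch in Python
with ElementTree (not the MODS.js code). The function name, the sample name
and the series title are made up for illustration, the MODS namespace is
omitted, and the role code is left as a parameter since "edt" versus "pbd"
is still open:

```python
import xml.etree.ElementTree as ET

def add_series_editor(mods, series_title, family, given, role_code="edt"):
    """Hypothetical helper: place the series editor inside relatedItem type="series"."""
    series = ET.SubElement(mods, "relatedItem", type="series")
    ET.SubElement(ET.SubElement(series, "titleInfo"), "title").text = series_title

    # The name lives under the series, not directly under mods.
    name = ET.SubElement(series, "name", type="personal")
    ET.SubElement(name, "namePart", type="family").text = family
    ET.SubElement(name, "namePart", type="given").text = given
    role = ET.SubElement(name, "role")
    role_term = ET.SubElement(role, "roleTerm", type="code", authority="marcrelator")
    role_term.text = role_code
    return series

mods = ET.Element("mods")
add_series_editor(mods, "An Example Series", "Doe", "Jane")
print(ET.tostring(mods, encoding="unicode"))
```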

> 3 Date gets exported to originInfo/copyrightDate.
>> It should be exported to originInfo/dateIssued.
>> If anyone believes that the copyright date is when something is published, I will gladly give a lecture.
>
> Lectures aside, it is typically the copyright date that is stored in
> online databases. It is also usually the copyright date that gets

> entered for references that are added manually: it is on the copyright
> page in the front matter of most modern books.

Are we talking about the same thing? Exceptions aside, the year noted at the bottom of the title page would be rendered as dateIssued in MODS. When I learnt my MARC, if this was missing (which was rare), we were to default to the less reliable copyright date found in the copyright declaration, usually on the back of the title page. Do users really use the copyright date when they enter manually?

In the sample files included when I first tried out Zotero, there was an old book - about the Americas, I seem to recall. This 16th (?) century book had its year of issue output with copyrightDate - which struck me as rather odd. But if, as you say, most online databases use the default copyright date without caring about the year of issue, I guess the situation is different, given Zotero's user base. No big deal: we can just substitute one for the other locally.

> 5 Library Catalog does not get exported.
>> It should probably be exported as location/physicalLocation.
>> I am not sure about this, but this sounds most reasonable, given the fact that Library Catalog is coupled with Call Number.
>
> Yeah, I don't think I like it there: Zotero uses the field to record
> the source of the citation. More often than not, this is not a
> physical location. It is a website (SpringerLink or ScienceDirect or
> whatever).

Fine with me: then it is location/url. However, this contains one url only, whereas Zotero pairs Library Catalog with Call Number (I think), so some role should be given the latter. Do SpringerLink or ScienceDirect have call numbers? Are (were) call numbers not some code you wrote on a requisition, telling the staff where the book was shelved?

>> 6 Call Number gets exported as classification.
>> It should probably be exported to location/holdingSimple/copyInformation/shelfLocator.
>> The MODS element classification is a "designation applied to a resource that indicates the subject by applying a formal system of coding and organizing resources according to subject areas." Typical type values are "lcc" and "udc." A Call Number is what you write on a requisition to have the book fetched from the shelves. A call number would typically be some sort of shelf locator, i.e. a reference to the physical whereabouts of the book.
>
> I think current usage is fine, as is stated explicitly in:
> http://www.loc.gov/standards/mods/userguide/classification.html
> [also: it is difficult (but perhaps not impossible) to retro-actively
> provide an authority.]

Well, LC does not admit just any old call number here - "call numbers whose authorities are referenced in Classification Scheme Source Codes maintained by the Library of Congress," i.e. call numbers which can function as a classification. Here, the "authority attribute is required."

> 7 Archive does not get exported.
>> It should probably be exported as location/physicalLocation.
>> This is the same destination that I have argued Library Catalog should have. I am not quite sure what the difference in Zotero is between Library and Archive? In my terminology, books don't usually belong in archives (whereas documents typically do).
>>
>> 8 Loc. in Archive gets exported to physicalLocation.
>> It should probably be exported to location/holdingSimple/copyInformation/shelfLocator.
>> See 7.
>
> Haven't formed opinions of these yet.
>
>
>> 1 Book Author
>> I don't know what this could mean - except if it falls under the second use case above. I would suggest that Zotero removes this, to prevent confusion. An edited volume cannot have an author. One could then establish Book Part as a record type.
>
> This is commonly used, for example, when a foreword or afterword that
> is written by someone other than the author is cited. People use
> this, so I don't think it should be removed.

I don't think I get you: is the author of a foreword or afterword supposed to be entered in the Book Author field? In roleTerm, this would be "aui" (Use for a person or organization responsible for an introduction, preface, foreword, or other critical introductory matter, but who is not the chief author.)

Also, how do I know how a field is commonly used in Zotero?

> Thesis
>>
>> 1 Type gets exported to genre (untyped) twice.
>> It should only be exported once.
>
> Not sure I follow. Most exports have two genres exported: local and
> marcgt.

Sure, there is a <genre authority="local">thesis</genre> and a <genre authority="marcgt">thesis</genre>, but after that the exact string "<genre>Ph.D.</genre>" is output twice. Doesn't hurt, but looks a little silly.
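
One way the duplicate could be suppressed, sketched in Python with
ElementTree. This is only an illustration of the idea, not how MODS.js is
written; the helper name is hypothetical and the MODS namespace is omitted:

```python
import xml.etree.ElementTree as ET

def dedupe_children(parent, tag):
    """Remove later children of `parent` that repeat an earlier one
    (same tag, same attributes, same text)."""
    seen = set()
    for child in list(parent.findall(tag)):
        key = (tuple(sorted(child.attrib.items())), (child.text or "").strip())
        if key in seen:
            parent.remove(child)
        else:
            seen.add(key)

mods = ET.fromstring(
    "<mods>"
    '<genre authority="local">thesis</genre>'
    '<genre authority="marcgt">thesis</genre>'
    "<genre>Ph.D.</genre>"
    "<genre>Ph.D.</genre>"
    "</mods>"
)
dedupe_children(mods, "genre")
print(ET.tostring(mods, encoding="unicode"))
```

The two typed genres differ in their authority attribute, so both are
kept; only the second untyped "Ph.D." is dropped.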

>> Conference Paper
> ....
>> 4 Volume, Pages gets exported to part.
>> They should be exported to relatedItem type="host"/part.
>> They are not part of the conference paper, but the conference paper ...
>>
>> read more »
>
> Argh. Other comments will have to follow in a new message.

Hope you have some nice time off in the days to come!

Jens

Bruce D'Arcus

Apr 21, 2011, 2:31:04 PM4/21/11
to zoter...@googlegroups.com
2011/4/21 Jens Østergaard Petersen <oes...@gmail.com>:

...

>> 3 Date gets exported to originInfo/copyrightDate.
>>> It should be exported to originInfo/dateIssued.
>>> If anyone believes that the copyright date is when something is published, I will gladly give a lecture.
>>
>> Lectures aside, it is typically the copyright date that is stored in
>> online databases. It is also usually the copyright date that gets
>
>> entered for references that are added manually: it is on the copyright
>> page in the front matter of most modern books.
>
> Are we talking about the same thing? Exceptions aside, the year noted at the bottom of the title page would be rendered as dateIssued in MODS. When I learnt my MARC, if this was missing (which was rare), we were to default to the less reliable copyright date found in the copyright declaration, usually on the back of the title page. Do users really use the copyright date when they enter manually?
>
> In the sample files included when I first tried out Zotero, there was an old book - about the Americas, I seem to recall. This 16th (?) century book had its year of issue output with copyrightDate - which struck me as rather odd. But if, as you say, most online databases use the default copyright date without caring about the year of issue, I guess the situation is different, given Zotero's user base. No big deal: we can just substitute one for the other locally.

I'll give you an example close to home:

<http://openlibrary.org/books/OL3399524M/Boundaries_of_Dissent>

The book was actually issued late 2005, and this is the date Amazon lists.

But the only date one actually sees anywhere is the copyright date,
which is 2006.

MODS is maybe a little too clever on dates for its own good.

Bruce

Jens Østergaard Petersen

Apr 22, 2011, 5:50:29 AM4/22/11
to zoter...@googlegroups.com
Hi Bruce,

Thanks for the example - I like the way only the first sentence of the book is given for free: it _is_ a very nice sentence!

On Apr 21, 2011, at 8:31 PM, Bruce D'Arcus wrote:

> 2011/4/21 Jens Østergaard Petersen <oes...@gmail.com>:
>

<snip/>


>
>> In the sample files included when I first tried out Zotero, there was an old book - about the Americas, I seem to recall. This 16th (?) century book had its year of issue output with copyrightDate - which struck me as rather odd. But if, as you say, most online databases use the default copyright date without caring about the year of issue, I guess the situation is different, given Zotero's user base. No big deal: we can just substitute one for the other locally.
>
> I'll give you an example close to home:
>
> <http://openlibrary.org/books/OL3399524M/Boundaries_of_Dissent>
>
> The book was actually issued late 2005, and this is the date Amazon lists.
>
> But the only date one actually sees anywhere is the copyright date,
> which is 2006.

All of the four internet bookstores linked to on the page give the year as 2005 and some narrow this down to the day (November 21, 2005) - Powell's Books for some reason has October.

Library of Congress, Hollis, and Google Books all give 2006, that's for sure.

However, this is not the point.

If a MARC cataloguer uses a copyright date in the absence of a date of publication, the year becomes prefixed with "c". For an example of this, try searching for Sources of Chinese tradition / compiled by Wm. Theodore de Bary and Irene Bloom in LC or Hollis. You will see that the relevant field reads 260 |a New York : |b Columbia University Press, |c c1999-c2000, with a "c" before the year. This is a book I have on my shelf and it does not have a year of publication on its title page, only a copyright date.

In the case of your book, the same field reads 260 |a New York : |b Routledge, |c 2006. The subfield is named "c", but there is no "c" before the year.

From this I reason that the year of publication is actually on the title page of your book and that the cataloguers are therefore using what in MODS would be dateIssued. That this date is wrong is something that you know (as the author of the book) and the internet bookstores know (presumably because they are notified by the publisher about the date of publication, very relevant to their business), but the cataloguer is supposed to base the record on primary evidence (the book on the desk), so it may not be known to the cataloguer that the date _claimed_ for publication by the publisher on the title page is not the _actual_ date. If the cataloguer does, by some indirect means, know the actual date of publication, the emended date is added, enclosed in brackets.

The copyright date is the date that copyright was claimed for the publication. This declaration is usually printed on the page following the title page. If there is no date of publication, the cataloguer defaults to the copyright date (and notes this downgrade in precision by use of the "c"). In the cases that I (vaguely) recall having encountered, I think it was usually the case that the copyright date was before the date of publication. Usually the mismatch happens because the book is ready to be printed in one year, but - for economic reasons - printing has to be postponed till the next year. Obviously, this can also work the other way 'round: publication was planned for one year, but - for various reasons (e.g., the importance of the book) - the date of publication can be advanced without changing the print plates.
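
The "c" convention can be sketched as a small Python function that decides
which MODS date element a MARC 260 $c value would map to. The function name
is hypothetical and the rule is deliberately simplified: it only looks at a
leading "c" and does not handle bracketed emendations or mixed values.

```python
import re

def classify_260c(value):
    """Map a MARC 260 $c string to (MODS element name, cleaned date text).

    Simplified assumption: a value starting with "c" plus four digits is a
    copyright date; anything else is treated as a date of issue.
    """
    text = value.strip().rstrip(".,").strip()
    if re.match(r"c\d{4}", text):
        # Drop the "c" prefix from every year in the value.
        return "copyrightDate", re.sub(r"\bc(?=\d{4})", "", text)
    return "dateIssued", text

print(classify_260c("c1999-c2000,"))
print(classify_260c("2006."))
```

The two sample values are the ones from the catalogue records quoted above.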

Therefore, Library of Congress, Hollis, and Google Books do not use the copyright date; also, it is probably not the case that "the only date one actually sees anywhere is the copyright date." What one sees is the actual date of publication (in the internet bookstores), the date of publication claimed by the publisher (in the library catalogues) - and the full story (in your mail).

> MODS is maybe a little too clever on dates for its own good.

Well, I don't know .... At least MODS does not make any assumptions, true or false, whereas Zotero does. Don't get me wrong: I think MODS is deficient in many ways, but probably not regarding dates.

Cheers,

Jens

>
> Bruce

Richard Karnesky

Apr 22, 2011, 11:50:56 AM4/22/11
to zotero-dev
> Hope you have some nice time off in the days to come!

Not likely, but I'm happy to fire off a quicker reply...


> Using 'pbd' is fine with me, but the name still belongs inside relatedItem. The LC sample file for book chapter uses the text value "editor", so using "publishing director" could be overkill.

Not necessarily directed at this particular point, but more of a broad
statement: the LoC has given many, many examples. Some of those
examples don't conform to what I would consider best practice (I
suspect many were automated translations from MARC, and so some missed
more-nuanced expressions available in MODS & some of that MARC data
may not have been meticulous). That is to say: I think we should
usually err on the side of doing as they say, not as they do. (But
maybe or maybe not here...)


> > 3 Date gets exported to originInfo/copyrightDate.
> >> It should be exported to originInfo/dateIssued.

Book design often differs, but none of the dozen books I checked have
the issued date shown.

I think the title page has (by definition) the full title of the
book. It will often have the editors (if applicable) and/or authors
(esp. in the case where a single author or group of authors were
responsible for most content (e.g. not a collection)). It will often
have the publisher & perhaps some geographic location about the
publisher and/or people. It will usually NOT have copyright or
catalog information.

The copyright page appears later in the front matter (not always
following the title page). This has the date that the publisher
asserts copyright on the material, *not* the date issued (hence
Bruce's example). This copyright date may differ from the date the
copyright was registered.

Yes, MARC 260$c is often used to store this copyright date. But I
will claim that was one reason the copyright date type was added to
MODS. Bruce may correct me (he had participated in this discussion on
the MODS list many years ago, but I only lurked). See
http://listserv.loc.gov/cgi-bin/wa?A2=ind0304&L=MODS&P=R2476&I=-3 and
others there.


> > 5 Library Catalog does not get exported.
>
> > Yeah, I don't think I like it there: Zotero uses the field to record
> > the source of the citation.  More often than not, this is not a
> > physical location.  It is a website (SpringerLink or ScienceDirect or
> > whatever).
>
> Fine with me: then it is location/url.

No, it isn't. The field will usually contain information about the
translator zotero used to get the data (and could even be 'unAPI'),
and not a resolvable URL. This is a tricky field to export. It can
sometimes carry information about a physical library where resources
have call numbers, but more often will not.


> Do SpringerLink or ScienceDirect have call numbers? Are (were) call numbers not some code you wrote on a requisition, telling the staff where the book was shelved?

So...Perhaps we only export catalog if a call number is included?

> >> 1 Book Author
> >> I don't know what this could mean - except if it falls under the second use case above. I would suggest that Zotero removes this, to prevent confusion. An edited volume cannot have an author. One could then establish Book Part as a record type.
>
> > This is commonly used, for example, when a foreword or afterword that
> > is written by someone other than the author is cited.  People use
> > this, so I don't think it should be removed.
>
> I don't think I get you: is the author of a foreword or afterword supposed to be entered in the Book Author field? In roleTerm, this would be "aui" (Use for a person or organization responsible for an introduction, preface, foreword, or other critical introductory matter, but who is not the chief author.)

No. The author of the book section (which may not appear in the
introductory matter, so may not be an 'aui' (which Zotero doesn't have
a concept of)) would be listed as 'author' in zotero. The chief author
would be listed in 'book author'. e.g.:

Item Type: Book Section
Title: "Introduction: Life of Biringuccio"
Author: Cyril Stanley Smith
Book Author: Vannoccio Biringuccio
Book Title: The Pirotechnia
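
As a rough illustration of how such an item might look in MODS, here is a
Python/ElementTree sketch. The mapping is an assumption and has not been
agreed in this thread: the section author is put at the top level and the
book author inside relatedItem type="host". Helper names and dictionary keys
are hypothetical, and the MODS namespace is omitted:

```python
import xml.etree.ElementTree as ET

def add_author(parent, full_name):
    """Append a personal name with the marcrelator role code 'aut'."""
    name = ET.SubElement(parent, "name", type="personal")
    ET.SubElement(name, "namePart").text = full_name
    role = ET.SubElement(name, "role")
    term = ET.SubElement(role, "roleTerm", type="code", authority="marcrelator")
    term.text = "aut"

def build_book_section(item):
    """Assumed mapping: section author at top level, book author on the host."""
    mods = ET.Element("mods")
    ET.SubElement(ET.SubElement(mods, "titleInfo"), "title").text = item["title"]
    add_author(mods, item["author"])

    host = ET.SubElement(mods, "relatedItem", type="host")
    ET.SubElement(ET.SubElement(host, "titleInfo"), "title").text = item["bookTitle"]
    add_author(host, item["bookAuthor"])
    return mods

section = build_book_section({
    "title": "Introduction: Life of Biringuccio",
    "author": "Cyril Stanley Smith",
    "bookAuthor": "Vannoccio Biringuccio",
    "bookTitle": "The Pirotechnia",
})
print(ET.tostring(section, encoding="unicode"))
```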


> Also, how do I know how a field is commonly used in Zotero?

Right now, you probably have to peruse the translator source code or
trac and/or the forums (the latter is probably essential for
uncommonly used fields).


> > Thesis
>
> >> 1 Type gets exported to genre (untyped) twice.
> >> It should only be exported once.
> Sure, there is a <genre authority="local">thesis</genre> and a <genre authority="marcgt">thesis</genre>, but after that the exact string "<genre>Ph.D.</genre>" is output twice. Doesn't hurt, but looks a little silly.

Ah: thanks.


--Rick

Bruce D'Arcus

Apr 22, 2011, 11:55:09 AM4/22/11
to zoter...@googlegroups.com
Jens, just to clarify a point of mine ....
2011/4/22 Jens Østergaard Petersen <oes...@gmail.com>:

...

> Therefore, Library of Congress, Hollis, and Google Books do not use the copyright date; also, it is probably not the case that "the only date one actually sees anywhere is the copyright date." What one sees is the actual date of publication (in the internet bookstores), the date of publication claimed by the publisher (in the library catalogues) - and the full story (in your mail).

By "see" I mean two things:

a) anywhere on the physical book, and ...
b) in a (correct) citation

Zotero is a tool for scholars (not librarians ;-)), so the second is
particularly key.

Bruce

Jens Østergaard Petersen

Apr 23, 2011, 8:19:11 AM4/23/11
to zoter...@googlegroups.com
Hi Rick,

MODS is still primarily used as an interchange format (American History Online being the flagship here), and, as far as I know, very few institutions use MODS as their primary input format, at least for resources that are not electronic in origin. I think you are right about the examples on the MODS web site being automatically generated (the MARC markup debris left behind is telling), so what we really need is a set of carefully crafted examples of MODS markup, in addition to the snippets in the User Guidelines. Best practice is not something we think up, but something that evolves in a user community. I think it would be of great help for Zotero to have such a set as well - we were not able to figure out what you explain below about Book Author in Book Section, for instance. Well, I guess that with fast-moving open source projects like this, documentation always tends to be a little behind ....

The incidence of books having year of publication on their title page is closely connected to - their year of publication. Books from before the '90s typically have this, books published from the '90s on typically do not - which I neglected, being mostly occupied with old books. This makes it a little difficult to make a safe time-independent mapping from Zotero to MODS, but actually this is a fault of MODS, for MODS typically encodes variations of a general data type with attributes, but here they have created this long list of different date elements - it would have been much easier to just have a general <date> element which one could specify for type - or leave untyped for Zotero's "any-old-date." Since MODS is an interchange format, it should be possible to interchange it with other formats than MARC, including of course Zotero.

Now, I do not have access to Bruce's book, so I am reduced to reasoning backwards from the library catalogues, which is not ideal (Amazon does not give the title page). What we say about these matters also of course mainly applies to "Western" books, but let us leave it at that. What you say about the copyright date does, as I read it, agree with what I said, but what I don't understand is that you write that there is a copyright date and a date on which the copyright was registered. I take it that you are referring to the US system (which is a little special in world context). Does such a registration date ever figure in a book?

I now see what you mean with Book Author in Book Section, but doesn't this overload Book Section, since presumably Book Section is used primarily to register contributions to edited volumes? I guess you might also want to use it to catalogue a part of a book (say, Chapter Three of Bruce's book, if I happen to have a scan of that), and now you say it can be used to catalogue an introduction to a monograph authored by someone else than the monograph author. All these are valid use cases, but perhaps Zotero gets a little too complicated here by trying to be simple?

Some comments below as well, but I think you have answered all the questions I had, thanks!

Have a nice weekend!

Jens

On Apr 22, 2011, at 5:50 PM, Richard Karnesky wrote:

>> Hope you have some nice time off in the days to come!
>
> Not likely, but I'm happy to fire off a quicker reply...
>
>
>> Using 'pbd' is fine with me, but the name still belongs inside relatedItem. The LC sample file for book chapter uses the text value "editor", so using "publishing director" could be overkill.

Come to think of it, I don't think I have ever seen 'pbd' live.

> Not necessarily directed at this particular point, but more of a broad
> statement: the LoC has given many, many examples. Some of those
> examples don't conform to be what I would think was a best practice (I
> suspect many were automated translations from MARC, and so some missed
> more-nuanced expressions available in MODS & some of that MARC data
> may not have been meticulous). That is to say: I think we should
> usually err on doing as they say, not as they do. (But maybe or maybe
> not here...)
>
>
>>> 3 Date gets exported to originInfo/copyrightDate.
>>>> It should be exported to originInfo/dateIssued.
>
> Book design often differs, but none of the dozen books I checked have
> the issued date shown.
>
> I think the title page has (by definition) the full title of the
> book. It will often have the editors (if applicable) and/or authors
> (esp. in the case where a single author or group of authors were
> responsible for most content (e.g. not a collection)). It will often
> have the publisher & perhaps some geographic location about the
> publisher and/or people. It will usually NOT have copyright or
> catalog information.

I agree.

> The copyright page appears later in the front matter (not always
> following the title page). This has the date that the publishers
> asserts copyright on the material, *not* the date issued (hence
> Bruce's example). This copyright date may differ from the date the
> copyright was registered.
>
> Yes, MARC 260$c is often used to store this copyright date. But I
> will claim that was one reason the copyright date type was added to
> MODS. Bruce may correct me (he had participated in this discussion on
> the MODS list many years ago, but I only lurked). See
> http://listserv.loc.gov/cgi-bin/wa?A2=ind0304&L=MODS&P=R2476&I=-3 and
> others there.
>
>
>>> 5 Library Catalog does not get exported.
>>
>>> Yeah, I don't think I like it there: Zotero uses the field to record
>>> the source of the citation. More often than not, this is not a
>>> physical location. It is a website (SpringerLink or ScienceDirect or
>>> whatever).
>>
>> Fine with me: then it is location/url.
>
> No, it isn't. The field will usually contain information about the
> translator zotero used to get the data (and could even be 'unAPI'),
> and not a resolvable URL. This is a tricky field to export. It can
> sometimes carry information about a physical library where resources
> have call numbers, but more often will not.

Well, you must constantly bump into problems of this kind: should Zotero maintain simplicity or provide more precision and complexity? Here, we are just trying to grasp the semantics of these fields, since we have not found any direct explanations. The field would be populated automatically when people siphon off entries from library catalogues, but I think they will have trouble using it if they enter data manually.

>> Do SpringerLink or ScienceDirect have call numbers? Are (were) call numbers not some code you wrote on a requisition, telling the staff where the book was shelved?
>
> So...Perhaps we only export catalog if a call number is included?

That seems a viable option.
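
A minimal sketch of that rule in Python with ElementTree, assuming the
call number goes to shelfLocator as I proposed above (this placement is not
settled, and the current export uses classification). The function name and
sample values are hypothetical; the MODS namespace is omitted:

```python
import xml.etree.ElementTree as ET

def export_location(mods, library_catalog, call_number):
    """Export Library Catalog only when a Call Number is present."""
    if not call_number:
        # No call number: the catalog is probably a website or translator
        # name, so nothing is exported.
        return None
    location = ET.SubElement(mods, "location")
    if library_catalog:
        ET.SubElement(location, "physicalLocation").text = library_catalog
    holding = ET.SubElement(location, "holdingSimple")
    copy_info = ET.SubElement(holding, "copyInformation")
    ET.SubElement(copy_info, "shelfLocator").text = call_number
    return location

mods = ET.Element("mods")
export_location(mods, "ScienceDirect", "")           # skipped, no call number
export_location(mods, "Some Library", "QA76 .X1")    # exported
print(ET.tostring(mods, encoding="unicode"))
```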

>>>> 1 Book Author
>>>> I don't know what this could mean - except if it falls under the second use case above. I would suggest that Zotero removes this, to prevent confusion. An edited volume cannot have an author. One could then establish Book Part as a record type.
>>
>>> This is commonly used, for example, when a foreword or afterword that
>>> is written by someone other than the author is cited. People use
>>> this, so I don't think it should be removed.
>>
>> I don't think I get you: is the author of a foreword or afterword supposed to be entered in the Book Author field? In roleTerm, this would be "aui" (Use for a person or organization responsible for an introduction, preface, foreword, or other critical introductory matter, but who is not the chief author.)
>
> No. The author of the book section (which may not appear in the
> introductory matter, so may not be an 'aui' (which Zotero doesn't have
> a concept of))

and which I have never seen in the flesh either.

> would be listed as 'author' in zotero. The chief author
> would be listed in 'book author'. e.g.:
>
> Item Type: Book Section
> Title: "Introduction: Life of Biringuccio"
> Author: Cyril Stanley Smith
> Book Author: Vannoccio Biringuccio
> Book Title: The Pirotechnia

Got it.

> Also, how do I know how a field is commonly used in Zotero?
>
> Right now, you probably have to peruse the translator source code or
> trac and/or the forums (the latter is probably essential for
> uncommonly used fields).

Thanks. I realise that my question was wrong: I meant to ask, how is a field commonly used when doing manual input? I guess there is no answer to that - and there are, as far as I can see, no recommendations either. I see that I would have been able to retrieve the answer to my question from <http://forums.zotero.org/discussion/15636/changes-to-fields-and-item-types-for-zotero-22/>, but these things need to be assembled in some kind of manual to be really useful.

>
>
>>> Thesis
>>
>>>> 1 Type gets exported to genre (untyped) twice.
>>>> It should only be exported once.
>> Sure, there is a <genre authority="local">thesis</genre> and a <genre authority="marcgt">thesis</genre>, but after that the exact string "<genre>Ph.D.</genre>" is output twice. Doesn't hurt, but looks a little silly.
>
> Ah: thanks.
>
>
> --Rick
>

Richard Karnesky

Apr 26, 2011, 12:26:08 PM4/26/11
to zotero-dev
> The incidence of books having year of publication on their title page is closely connected to - their year of publication. Books from before the '90s typically have this, books published from the '90s on typically do not - which I neglected, being mostly occupied with old books.

I don't know what standard practice is, but in most of the books I
have that do have a date on the title page, the date is the same as on
the copyright page. Of course books are often distributed in the same
year that copyright is asserted, so this may not be a surprise. But I
don't see that date claimed as the distribution date.


> Since MODS is an interchange format, it should be possible to interchange it with other formats than MARC, including of course Zotero.

I'm actually surprised to see that bibutils handles dateIssued but
does not use copyrightDate (especially in cases where dateIssued is
not present). I think that Zotero's current behavior is technically
the most reasonable choice by the specification, but now am more
swayed that we should probably use dateIssued for now, as that is what
others are (ab)using.
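
On the import side, a tolerant reader could prefer dateIssued and fall back
to copyrightDate when it is missing. A sketch in Python with ElementTree;
the function name is hypothetical, only the first originInfo is inspected,
and the MODS namespace is omitted. This is not what bibutils or MODS.js
currently do:

```python
import xml.etree.ElementTree as ET

def read_date(mods):
    """Return dateIssued if present, else copyrightDate, else None."""
    origin = mods.find("originInfo")
    if origin is None:
        return None
    for tag in ("dateIssued", "copyrightDate"):
        value = origin.findtext(tag)
        if value and value.strip():
            return value.strip()
    return None

only_copyright = ET.fromstring(
    "<mods><originInfo><copyrightDate>2006</copyrightDate></originInfo></mods>"
)
print(read_date(only_copyright))
```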


> what I don't understand is that you write that there is a copyright date and a date on which the copyright was registered. I take it that you are referring to the US system (which is a little special in world context). Does such a registration date ever figure in a book?

In Berne Convention countries, copyright is automatic & does not
require registration. I'd argue that it is this date that is usually
in the printed book (following the (C) on the copyright page). Due to
nuances in publishing/distribution, this is not always the release
date of the book.

Denmark doesn't have a voluntary registration system, but the US is
hardly alone in offering one, so I don't think of it as
"specialized". Not only does the US publish more books per year than
any other country, but many other top publishing countries have some
sort of registration. In the UK, there are private registrars & the
British Library. China and Russia have national registries. It isn't
until you get down to Germany where there is (as far as I know) no
state-run or widely-used private registration system. But I digress.


> I now see what you mean with Book Author in Book Section, but doesn't this overload Book Section, since presumably Book Section is used primarily to register contributions to edited volumes?


While that is what most of my "Book Section" entries are used for, no:
that is not the only thing it is used for. It makes it easy to repeatedly
cite the same excerpt from a book.


> I guess you might also want to use it to catalogue a part of a book (say, Chapter Three of Bruce's book, if I happen to have a scan of that), and now you say it can be used to catalogue an introduction to a monograph authored by someone else than the monograph author. All these are valid use cases, but perhaps Zotero gets a little too complicated here by trying to be simple?

I would argue that it is very simple for those using Zotero as a
citation tool: there's a single item type to keep track of all three
use cases & all cases are thus easy. It *might* be slightly more
complicated for catalogers/librarians, but that complication seems to
be only a conceptual difference between Zotero and other cataloging
tools.


> > Also, how do I know how a field is commonly used in Zotero?
>
> > Right now, you probably have to peruse the translator source code or
> > trac and/or the forums (the latter is probably essential for
> > uncommonly used fields).
>
> Thanks. I realise that my question was wrong: I meant to ask, how is a field commonly used when doing manual input? I guess there is no answer to that - and there are, as far as I can see, no recommendations either. I see that I would have been able to retrieve the answer to my question from <http://forums.zotero.org/discussion/15636/changes-to-fields-and-item-...>, but these things need to be assembled in some kind of manual to be really useful.

There was a recent solicitation for feedback on all aspects of the
Zotero website, which would include both user and developer
documentation & this request may fit well there. I agree it might
be useful & a motivated individual could probably dig through publicly
shared libraries to see how the fields actually end up getting used.
Not something I'd really like to do (if I had the time), but you might
start to think about specific questions to raise or other
documentation details that you'd like to see.


Best,

Rick

Richard Lehane

May 25, 2011, 9:06:18 AM
to zotero-dev
Last month there was a discussion on this list about changes to the
MODS export/import. Two points raised in that discussion were:
> 7 Archive does not get exported.
> It should probably be exported as location/physicalLocation.

> 8 Loc. in Archive gets exported to physicalLocation.
> It should probably be exported to location/holdingSimple/copyInformation/shelfLocator.

I'd just like to second those two points: I'm working on an unAPI
service for an archival catalogue using MODS (will switch to
rdf_zotero when it is available) and for manuscript type material it
is really vital to include the name of the Archive where records are
held. In the MODS guidelines, physicalLocation is described as "The
institution or repository that holds the resource or where it is
available." It does seem to make more sense to map the
location.physicalLocation element to the 'Archive' field. The MODS
element location.shelfLocator (described in the guidelines as
"Shelfmark or other shelving designation that indicates the location
identifier for a copy.") seems more appropriate for the 'Loc. in
Archive' field.

The specific changes to the MODS.js translator are:
- change "newItem.archiveLocation =
mods.m::location.m::physicalLocation.text().toString();" to
"newItem.archiveLocation =
mods.m::location.m::shelfLocator.text().toString();"
- and add "newItem.archive =
mods.m::location.m::physicalLocation.text().toString();"
...and make matching changes for the export section of the translator.

I hope this makes sense. Thanks very much!


Avram Lyon

May 26, 2011, 3:41:49 PM
to zoter...@googlegroups.com
On Wed, May 25, 2011 at 5:06 PM, Richard Lehane
<richard...@records.nsw.gov.au> wrote:
> Last month there was a discussion on this list about changes to the
> MODS export/import. Two points raised in that discussion were:
>> 7 Archive does not get exported.
>> It should probably be exported as location/physicalLocation.
>
>> 8 Loc. in Archive gets exported to physicalLocation.
>> It should probably be exported to location/holdingSimple/copyInformation/shelfLocator.
>
> I'd just like to second those two points: I'm working on an unapi
> [..]

So is there agreement that these changes should be made? This sounds
good to me, but I'm not a MODS expert.

Richard (Lehane, not Karnesky)-- Can you post the full proposed code
somewhere like gist.github.com and post a link here?

Avram

Richard Karnesky

May 26, 2011, 5:55:08 PM
to zotero-dev
I've been working on getting many of the MODS changes suggested in
this thread implemented for MODS export (import will come later). I
will post these this weekend. These two hadn't made the list of
changes I would make & I had withheld opinion upthread.

> The specific changes to the MODS.js translator are:
> - change "newItem.archiveLocation =
> mods.m::location.m::physicalLocation.text().toString();" to
> "newItem.archiveLocation =
> mods.m::location.m::shelfLocator.text().toString();"
> - and add "newItem.archive =
> mods.m::location.m::physicalLocation.text().toString();"

I'm not too sure about changing the importer this naively when it will
break import of legacy data. I think we can be smarter about this.
Further, it differs a bit from the original recommendation (to use
holdingSimple). For what it is worth, I agree that we should probably
just throw it up in location/shelfLocator for now.


> ...and make matching changes for the export section of the translator.

I can add the export changes if there is still reasonable demand and
consensus. I suggest holding off on import for now...


--Rick

Lehane, Richard

May 26, 2011, 9:55:53 PM
to zoter...@googlegroups.com
Thanks Rick and Avram for taking this into consideration. From my (admittedly selfish!) perspective of trying to get our data into Zotero, my real concern is the importer but I do understand you will have to tread carefully to avoid breaking existing MODS implementations.

If location/shelfLocator is currently ignored by the importer, perhaps you could use the presence of that field as a trigger for switching behaviour (on the basis that any legacy MODS files containing this element aren't being fully imported anyway & the new behaviour would probably be an improvement)?

Something along the lines of:
https://gist.github.com/994440
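
As a rough illustration of that trigger idea (a sketch only, not the contents of the gist and not the actual MODS.js code, which works on the XML with E4X), the switch could look something like this. The function name `mapLocation` and its plain-object argument are hypothetical stand-ins for the parsed MODS location element:

```javascript
// Hypothetical helper: `location` is a plain object standing in for a parsed
// MODS <location> element. This is NOT the MODS.js translator API.
function mapLocation(location) {
    var item = {};
    var physical = (location.physicalLocation || "").trim();
    var shelf = (location.shelfLocator || "").trim();

    if (shelf) {
        // New behaviour: shelfLocator is present, so physicalLocation is
        // read as the holding institution ('Archive').
        item.archiveLocation = shelf;
        if (physical) {
            item.archive = physical;
        }
    } else if (physical) {
        // Legacy behaviour: no shelfLocator, so keep the old mapping of
        // physicalLocation to 'Loc. in Archive'.
        item.archiveLocation = physical;
    }
    return item;
}
```

With this shape, files that never used shelfLocator import exactly as before, and only files that do carry it get the new Archive mapping.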

On the question of shelfLocator vs holdingSimple - from my reading of MODS, I see use of shelfLocator as a better fit for Zotero's simple location fields. Location/holdingSimple seems to be designed for more detailed & granular location information (for example to record the location of multiple forms of the same item).

Cheers
Richard




Richard Karnesky

Jun 8, 2011, 12:45:33 PM
to zotero-dev
Jens and I corresponded a bit off list.

Presently, the translator uses:
copyrightDate: books, book sections
dateIssued: journal articles, magazine articles, newspaper articles
dateCreated: everything else
This is presumably just from the very limited view of types the
translator originally had (and the translator does pre-date some
of the other types). There is definite room for improvement here!
However, there is room for debate over where to shove stuff in, I
think (though Jens and I are more-or-less on the same page).

I think dateIssued should probably be the fall-back for published
works.

This means enumerating what uses dateCreated, which is mostly for all
types that are possibly non-published. From the hip, this would be:
audiorecording, document, e-mail, instant message, interview,
letter, manuscript, presentation, video recording

Assuming our understanding of copyrightDate is correct, it should be
used more often (in all of the published sub-sections of a monograph;
not just book sections):
book, book section, conference paper, dictionary entry, encyclopedia
article

The following are likely to have copyright dates that are shown in a
way that is similar to the date on the copyright page of a book:
film, radio broadcast, tv broadcast

The web-distributed content can probably be considered "issued":
blog post, forum post, podcast, web page
I don't know what to do with the legal stuff, but I'd be inclined to
also use dateIssued for:
bill, case, hearing, statute, patent

So where do we put these:
report, thesis, art, computer program, map
I'd be inclined to say the fallback of "issued" is good enough? (I am
a little tempted to treat a thesis like any other published monograph,
so would lean to using copyright, but this is mostly just a personal
bias, based on my own thesis that required a copyright page.)

I was pleasantly surprised that Jens and I seemed to come to a
reasonable agreement on all of these ugly type-dependent nits, but
thought I'd give others a chance to comment before I actually (and
somewhat arbitrarily) "painted the bike shed."
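
To make the proposal concrete, here is a minimal sketch of the lookup it implies. It is an illustration under assumptions, not translator code: the function name `modsDateElement` and the two lists are hypothetical, the item-type spellings are assumed Zotero type names, and the film/broadcast types are placed under copyrightDate only because of the remark above about their copyright dates.

```javascript
// Hypothetical tables built from the lists in this message; the type names
// are assumed spellings, and none of this is part of the MODS translator.
var DATE_CREATED_TYPES = [
    "audioRecording", "document", "email", "instantMessage", "interview",
    "letter", "manuscript", "presentation", "videoRecording"
];
var COPYRIGHT_DATE_TYPES = [
    "book", "bookSection", "conferencePaper", "dictionaryEntry",
    "encyclopediaArticle", "film", "radioBroadcast", "tvBroadcast"
];

// Returns the MODS date element proposed for a given item type.
function modsDateElement(itemType) {
    if (DATE_CREATED_TYPES.indexOf(itemType) !== -1) {
        return "dateCreated";    // possibly unpublished material
    }
    if (COPYRIGHT_DATE_TYPES.indexOf(itemType) !== -1) {
        return "copyrightDate";  // monographs and their published sections
    }
    // Everything else (articles, web content, legal types, and the
    // undecided report/thesis/art/program/map) falls back to dateIssued.
    return "dateIssued";
}
```

Keeping dateIssued as the default means any type that is not explicitly listed still gets a date on export.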

--Rick

Adam Retter

Jul 10, 2011, 1:26:23 PM
to zoter...@googlegroups.com, simonst...@gmail.com, Petersen, Jens Østergaard
Okay,

So I have tried using both the development XPIs for the Zotero 2.1 branch
and trunk, available from here
http://www.zotero.org/support/dev/svn_and_trac_access (as of today).

Neither seems to recognise our unAPI; in fact I would say the
behaviour has regressed, as previously it would at least recognise the
unAPI for our first page of search results, and now it doesn't even see
that!
I have enabled the Zotero log and I attach the details below for
visiting the following URI -
http://kjc-ws2.kjc.uni-heidelberg.de:8600/exist/apps/library/modules/search/index.html

Could someone else visit this URI and tell me why Zotero does not see our unAPI?


The Zotero Log (from Zotero 2.1 developer XPI) -

(3)(+0012322): Translate: Running regular expressions

(3)(+0000004): Translate: Searching for translators for
http://kjc-ws2.kjc.uni-heidelberg.de:8600/exist/apps/library/modules/search/index.html

(4)(+0000000): Translate: Binding sandbox to
http://kjc-ws2.kjc.uni-heidelberg.de:8600/exist/apps/library/modules/search/index.html

(4)(+0000001): Translate: Parsing code for unAPI

(4)(+0000002): Translate: Parsing code for COinS

(4)(+0000001): Translate: Parsing code for DOI

(4)(+0000002): Translate: Parsing code for Embedded RDF

(5)(+0000002): Translate: running handler 0 for translators

(5)(+0008867): SELECT COUNT(*) FROM fulltextItems WHERE (indexedPages
IS NOT NULL AND indexedPages=totalPages) OR (indexedChars IS NOT NULL
AND indexedChars=totalChars)

(5)(+0000000): SELECT COUNT(*) FROM fulltextItems WHERE (indexedPages
IS NOT NULL AND indexedPages<totalPages) OR (indexedChars IS NOT NULL
AND indexedChars<totalChars)

(5)(+0000001): SELECT COUNT(*) FROM itemAttachments WHERE itemID NOT
IN (SELECT itemID FROM fulltextItems WHERE indexedPages IS NOT NULL OR
indexedChars IS NOT NULL)

(5)(+0000000): SELECT COUNT(*) FROM fulltextWords

(3)(+0000066): DATE: retrieved with algorithms: ({year:2010, month:0, day:27})

(3)(+0000000): DATE: retrieved with algorithms: ({year:2010, month:6, day:26})

(3)(+0000000): DATE: retrieved with algorithms: ({year:2010, month:6, day:26})

(3)(+0000000): DATE: retrieved with algorithms: ({year:2010, month:2, day:15})

(3)(+0000000): DATE: retrieved with algorithms: ({year:2009, month:11, day:4})

(3)(+0000001): DATE: could not apply algorithms

(3)(+0000001): DATE: retrieved with algorithms: ({year:2010, month:7, day:25})

(3)(+0000000): DATE: retrieved with algorithms: ({year:2009, month:11, day:15})

(3)(+0000000): DATE: retrieved with algorithms: ({year:2008, month:4, day:13})

(3)(+0000000): DATE: retrieved with algorithms: ({year:2009, month:10, day:10})

(3)(+0000001): DATE: retrieved with algorithms: ({year:2009, month:8, day:13})

(3)(+0000000): DATE: retrieved with algorithms: ({year:2008, month:8, day:16})

(3)(+0000007): Translate: WARNING: new Zotero.Translate() is
deprecated; please don't use this if you don't have to

(3)(+0000000): Translate: Searching for translators for an undisclosed location

(5)(+0000002): SELECT key AS domainPath, value AS format FROM settings
WHERE setting='quickCopySite' ORDER BY domainPath COLLATE NOCASE

(4)(+0006634): Registering observer for
[collection,search,share,group,bucket] in notifier with hash lo'

(5)(+0000002): SELECT itemTypeID AS id, typeName AS name, custom FROM
itemTypesCombined WHERE display=2

(5)(+0000001): SELECT itemTypeID AS id, typeName AS name, custom FROM
itemTypesCombined WHERE display=1

Thanks Adam.


--
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk

skornblith

Jul 10, 2011, 6:29:56 PM
to zotero-dev
This looks like the same detect issue, although the new translator may
be stricter about what it believes it should display the detect icon
on. As of just now, the trunk supports a ZoteroItemUpdated event that
you can dispatch when Zotero should re-run detection. I am not yet
committing to putting this into the next release, but you can give it
a try and let us know if it works. (My testing indicates it should.)
You can dispatch the event as:

var ev = document.createEvent('HTMLEvents');
ev.initEvent('ZoteroItemUpdated', true, true);
document.dispatchEvent(ev);
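
For a page that swaps results in with Ajax, one way to wire this up is to fire the event right after the DOM has been updated. The following is only a sketch under assumptions: `notifyZotero`, `showPage` and the `renderResults` callback are hypothetical names, and only the three calls from the snippet above come from this message.

```javascript
// Hypothetical wrapper around the three lines above; `doc` is the page's
// document object (passed in so the function is easy to exercise).
function notifyZotero(doc) {
    var ev = doc.createEvent('HTMLEvents');
    ev.initEvent('ZoteroItemUpdated', true, true); // bubbles, cancelable
    doc.dispatchEvent(ev);
}

// Hypothetical paging hook: `renderResults` stands in for whatever code
// writes the new result rows (and their unAPI markup) into the page once
// the Ajax response has arrived.
function showPage(doc, renderResults) {
    renderResults();    // update the DOM first...
    notifyZotero(doc);  // ...then ask Zotero to re-run detection
}
```

The ordering matters: if the event is dispatched before the new results are in the DOM, detection would presumably still see the previous page.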

Simon


Adam Retter

Jul 12, 2011, 5:02:37 PM
to zoter...@googlegroups.com
> This looks like the same detect issue, although the new translator may
> be stricter about what it believes it should display the detect icon
> on. As of just now, the trunk supports a ZoteroItemUpdated event that
> you can dispatch when Zotero should re-run detection. I am not yet
> committing to putting this into the next release, but you can give it
> a try and let us know if it works. (My testing indicates it should.)
> You can dispatch the event as:
>
> var ev = document.createEvent('HTMLEvents');
> ev.initEvent('ZoteroItemUpdated', true, true);
> document.dispatchEvent(ev);


After adding this code, Zotero now does recognise our first page of
results and the folder icon appears in the url bar.
When you page to the next page of results, this code is fired again,
yet when you click the folder icon in the url bar, Zotero still seems
to only show the first page of results in the pop-up dialog.

What to try next?


Simon

Jul 13, 2011, 8:05:12 PM
to zoter...@googlegroups.com
Should be fixed now.



Adam Retter

Jul 14, 2011, 8:36:33 AM
to zoter...@googlegroups.com
Okay, I only have the trunk XPI that is published on the Zotero
website, how do I test this, do I need to do an update to a checkout
of trunk and build it or similar?


Simon

Jul 14, 2011, 9:04:56 AM
to zoter...@googlegroups.com
The XPI updates automatically. Use "Check for Updates" under the cog menu in the Firefox Add-ons Manager, or reinstall the XPI from the website.
