On 5/15/14, 10:21 AM, Katie wrote:
> The overall goal is to recreate the Zotero library in a WP database.
> The collections import is regularly timing out for some users, and the
> number of requests made is the major difference between importing
> items and collections.
It sounds like you're trying to import an entire Zotero library within the
lifespan of a single request? If that's the case, I'd say that's
probably not a reasonable expectation. For smaller libraries it'd be no
problem, but you don't really have any guarantee going in of how long it
will take. I'm not sure if WP gives you any ability to schedule
background jobs, but that might be something worth looking into.
> Here's a very common case: A user wishes to display all items for a
> collection. Here it makes sense for collections to have a list of
> their items (easy, quick db call). But to import, say, even 50
> collections to get this relationship information would mean 1 + 50
> requests.
Right, but that's where looking at this as an import might not be the
best approach. Would it not be possible to simply load the
collection-item membership from the API on-demand (and cache it) when
the user requests data for that collection?
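
To make the on-demand idea concrete, here's a rough sketch in Python (not WP code) of loading a collection's item keys on first request and caching them. The class name, the TTL, and the `fetch_item_keys` callable are all hypothetical; the callable stands in for whatever code makes the actual API request.

```python
import time
from typing import Callable, Dict, List, Tuple


class CollectionItemCache:
    """Loads a collection's item keys on first request and caches them."""

    def __init__(self,
                 fetch_item_keys: Callable[[str], List[str]],
                 ttl_seconds: float = 300.0) -> None:
        # fetch_item_keys is a hypothetical callable that performs the
        # API request for one collection and returns its item keys.
        self._fetch = fetch_item_keys
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, collection_key: str) -> List[str]:
        now = time.monotonic()
        entry = self._entries.get(collection_key)
        # Serve from the cache while the entry is younger than the TTL.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Otherwise make one request for this collection and remember it.
        item_keys = self._fetch(collection_key)
        self._entries[collection_key] = (now, item_keys)
        return item_keys


if __name__ == "__main__":
    def fake_fetch(collection_key: str) -> List[str]:
        # Placeholder data; a real fetcher would call the API here.
        return ["ITEM0001", "ITEM0002"]

    cache = CollectionItemCache(fake_fetch)
    print(cache.get("COLLAAAA"))  # triggers one fetch
    print(cache.get("COLLAAAA"))  # served from the cache
```

That way each page view costs at most one request, and only for the collection actually being displayed.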
> If the collections are included as a list for each item, that cuts
> down the number of requests to Zotero (which is great) but it's not
> ideal for using that information. For this case, the query would need
> to search every item in the database and use something like
> FIND_IN_SET or IN.
Well, there's no need to store it as it's served. The next major version
of the Zotero client will use this API for syncing, but it will still
have a collectionItems table with just collectionID and itemID for easy
local querying. So if loading collection-items on-demand isn't possible
for some reason and you do need to treat this as a full-library import,
then you can just pull the collections data out of the item JSON and
store and query it separately.
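
As a rough illustration of storing it separately, here's a Python sketch that uses sqlite3 as a stand-in for the WP database. It assumes each item's JSON has been parsed into a dict with a "key" field and a "collections" list of collection keys; those field names, the sample keys, and the function names are assumptions for illustration, so check them against the JSON you actually receive.

```python
import sqlite3
from typing import Iterable, List, Mapping


def store_memberships(conn: sqlite3.Connection,
                      items: Iterable[Mapping]) -> None:
    """Flatten each item's collection list into (collectionID, itemID) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS collectionItems ("
        " collectionID TEXT NOT NULL,"
        " itemID TEXT NOT NULL,"
        " PRIMARY KEY (collectionID, itemID))"
    )
    # Assumed item shape: {"key": ..., "collections": [...]}.
    rows = [
        (collection_key, item["key"])
        for item in items
        for collection_key in item.get("collections", [])
    ]
    # INSERT OR IGNORE keeps a repeated import from duplicating rows.
    conn.executemany(
        "INSERT OR IGNORE INTO collectionItems (collectionID, itemID)"
        " VALUES (?, ?)",
        rows,
    )
    conn.commit()


def items_in_collection(conn: sqlite3.Connection,
                        collection_id: str) -> List[str]:
    """Return the item IDs in one collection with a single indexed query."""
    cur = conn.execute(
        "SELECT itemID FROM collectionItems"
        " WHERE collectionID = ? ORDER BY itemID",
        (collection_id,),
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    sample_items = [
        {"key": "ITEM0001", "collections": ["COLLAAAA", "COLLBBBB"]},
        {"key": "ITEM0002", "collections": ["COLLAAAA"]},
        {"key": "ITEM0003", "collections": []},
    ]
    store_memberships(conn, sample_items)
    print(items_in_collection(conn, "COLLAAAA"))
```

With a table like this, "all items in a collection" is a plain lookup on collectionID, with no need for FIND_IN_SET over every item.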
> That said, this would be trickier but doable. If nothing can be done,
> I will work with the item's JSON. But ideally each collection's JSON
> would come with a list of their items.
There are a couple issues with doing what you suggest:
1) Collection-item membership doesn't affect the modification time of
collections, which means the collection JSON can't be modified on
membership changes, since the cached data is keyed by the version. And
if the mod time did change, it would result in unnecessary work for sync
clients, which would have to download both the item and the collection
on a membership change.
2) If the collection did include item keys, what would you actually do
with that? You'd still need all the items in the library in order to
display them, and if you have all the items in the library, you already
have the collection-item membership as discussed above. The same
argument can be made in reverse, of course, but the difference is that
most libraries have orders of magnitude more items than collections, so
it makes sense to treat collections as item properties, since getting
the collection data for those keys is usually an extra one or two requests.
Anyhow, I hope that's helpful. On a general note, if you haven't yet, you
should take a look at the API syncing page [1], which lays out a basic
syncing method and a few possible variations depending on use case. Let me
know if you have any questions on that.
- Dan
[1] https://www.zotero.org/support/dev/server_api/v2/syncing