item keys uniqueness

Emiliano Heyns

Nov 11, 2021, 4:49:49 AM
to zotero-dev
I know item keys are unique within a library/group, but I seem to recall that they are not necessarily unique across all libraries/groups, which is why I always see them used as a libraryID/itemKey pair. Correct? But attachments are laid out on disk by item key alone. If, e.g., attachment item key 2YPFQMML exists both in my personal library and in a shared group, what path does each get on disk? I don't see the library encoded in the path.

Dan Stillman

Nov 11, 2021, 9:35:19 PM
to zoter...@googlegroups.com
On 11/11/21 4:49 AM, Emiliano Heyns wrote:
> I know item keys are unique within a library/group, but ISTR that they
> are not necessarily unique across all libraries/groups, which is why I
> always see them used in a libraryID/itemKey pair. Correct?

Correct.

> But attachments are also laid out on disk just by their item key. If
> eg attachment item key 2YPFQMML is both in my personal library and in
> a shared group, what path do they each get on disk?

It's best not to think about it.

(The storage layout predates groups, and we've never fixed this. The
good news is that you'd have to be extraordinarily unlikely to end up
with the same key, both for attachments, in two different libraries.)

Dan Stillman

Nov 11, 2021, 9:37:25 PM
to zoter...@googlegroups.com
On 11/11/21 9:35 PM, Dan Stillman wrote:
> you'd have to be extraordinarily unlikely to end up with the same key,
> both for attachments, in two different libraries

Extraordinarily unlucky, that is.
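
To make the layout issue concrete, here is a minimal sketch in Python. It assumes, as described in the question, that an attachment's directory is derived from the item key alone; the `storage` root, the numeric library IDs, and both helper functions are hypothetical names used for illustration, not Zotero's actual code.

```python
from pathlib import PurePosixPath

# Assumption: attachment files live in a per-item directory named after the
# item key only, with no library component in the path.
STORAGE_ROOT = PurePosixPath("storage")  # hypothetical relative root


def item_identity(library_id: int, item_key: str) -> tuple[int, str]:
    """An item is identified by the (libraryID, itemKey) pair."""
    return (library_id, item_key)


def attachment_dir(library_id: int, item_key: str) -> PurePosixPath:
    """The directory depends only on item_key; library_id is ignored."""
    return STORAGE_ROOT / item_key


# Library IDs 1 and 2 are made up for the example.
personal = item_identity(1, "2YPFQMML")
group = item_identity(2, "2YPFQMML")

print(personal != group)                                     # two distinct items
print(attachment_dir(*personal) == attachment_dir(*group))   # one shared directory
```

Under these assumptions the two items stay distinct as database identities but map to the same directory, which is exactly the case that requires the same key to be generated for attachments in two libraries.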

Avana Vana

Nov 24, 2021, 5:47:56 AM
to zotero-dev
Sorry, my mistake: I used the wrong set of characters for Zotero IDs. Here's the corrected collision calculation (again, assuming the keys were generated like nanoID):
[Screenshot: Screen Shot 2021-11-24 at 4.59.30 AM.png]
So with this algorithm, if you created 1 Zotero ID every millisecond, there would be a 1% probability of at least one collision after only ~3 minutes. In other words, after around 180,000 Zotero IDs there is a 1% chance of a collision (if the algorithm were similar in 'randomness' to nanoID).
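
The figure above can be roughly reproduced with the standard birthday approximation. This is only a sketch: it assumes keys are 8 characters drawn uniformly at random from a 33-symbol alphabet, neither of which is confirmed in this thread, and the function names are made up for the example.

```python
import math

# Assumptions (not confirmed in the thread): 8-character keys, each character
# chosen uniformly at random from a 33-symbol alphabet.
ALPHABET_SIZE = 33
KEY_LENGTH = 8


def keys_for_collision_probability(
    p: float, alphabet_size: int = ALPHABET_SIZE, key_length: int = KEY_LENGTH
) -> float:
    """Birthday approximation: n ~ sqrt(2 * N * ln(1 / (1 - p)))."""
    space = alphabet_size ** key_length
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


def collision_probability(
    n: int, alphabet_size: int = ALPHABET_SIZE, key_length: int = KEY_LENGTH
) -> float:
    """Birthday approximation: p ~ 1 - exp(-n * (n - 1) / (2 * N))."""
    space = alphabet_size ** key_length
    return 1 - math.exp(-n * (n - 1) / (2 * space))


n_one_percent = keys_for_collision_probability(0.01)
print(f"keys needed for a 1% collision chance: {n_one_percent:,.0f}")
print(f"minutes at one key per millisecond: {n_one_percent / 1000 / 60:.1f}")
print(f"collision chance after 180,000 keys: {collision_probability(180_000):.2%}")
```

With these assumed parameters the 1% threshold lands at about 168,000 keys, a little under 3 minutes at one key per millisecond, which is in line with the calculator's estimate. Note that this is the chance of any two keys colliding within one pool; the on-disk case discussed earlier additionally requires the two keys to be attachments in different libraries.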

Avana Vana

Nov 24, 2021, 5:47:59 AM
to zotero-dev
I'm not sure what the method for generating is internally in Zotero, but just for fun, I plugged the Zotero key specs into the nanoID collision calculator and got these results:

[Screenshot: Screen Shot 2021-11-24 at 4.31.14 AM.png]

Again, this is for a different algorithm (nanoID), but it does give a general sense of the level of scarcity of Zotero IDs.  I'd be interested to know what the actual algorithm used is and if there is a collision calculator for it. 