Bookmarking in the New York Times

stml

Jan 18, 2011, 5:09:52 AM
to Open Bookmarks
This is interesting: http://open.blogs.nytimes.com/2011/01/11/emphasis-update-and-source/

It uses paragraph hashes and calculates the Levenshtein distance across
pieces of text for deep-linking and highlighting in the New York
Times. This is quite close to what I've been thinking about, with
adjustments for much longer pieces of text, such as a rough progress /
percentage-of-work indicator.

Hadrien Gardeur

Jan 18, 2011, 5:26:42 AM
to openbo...@googlegroups.com
A few quick notes:
  • Like the fact that it fits into a URI
  • They're using fragments (#). I haven't checked exactly how it works on their end, but I expect the highlighting to be done on the client side (JS)
  • I'm not sure that this would easily scale to longer texts without separating into two pointers (start and end)
  • It looks weak (using the first letters) even with Levenshtein (I'd rather use a text excerpt + Levenshtein)
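
To make the "text excerpt + Levenshtein" idea concrete, here is a minimal sketch in Python. It is not the NYT code: the function names, the fixed-width sliding window, and the `max_distance` threshold are all assumptions chosen for illustration, and the brute-force scan is only practical for short texts.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def find_excerpt(text: str, excerpt: str, max_distance: int = 3):
    """Return (offset, distance) of the window closest to the excerpt.

    Slides a window of len(excerpt) characters over the text and keeps the
    first window with the smallest distance. Returns None when nothing is
    within max_distance (a hypothetical tuning knob).
    """
    width = len(excerpt)
    if width == 0 or len(text) < width:
        return None
    best = None
    for offset in range(len(text) - width + 1):
        distance = levenshtein(text[offset:offset + width], excerpt)
        if best is None or distance < best[1]:
            best = (offset, distance)
    return best if best[1] <= max_distance else None
```

An exact excerpt comes back with distance 0, and an excerpt saved before a small edit to the text can still be located as long as the change stays under the threshold.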

stml

Jan 18, 2011, 9:24:39 AM
to Open Bookmarks
It is two pointers: the initial paragraph key points to the first and
last sentences in a paragraph, and the highlight key points to
specific sentences in the selected paragraphs. It also satisfies, via
the Levenshtein distance code, OB's requirements to be quite fuzzy.

The keying may need strengthening for longer texts, either using text
excerpts or just more first letters, but I'm going to run some tests
to see how reliable it is.
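
For anyone who wants to experiment, this is roughly what such a test could look like in Python. It is a sketch under assumptions, not the NYT implementation: the key here is the first letters of the first three words of the first and last sentences, the sentence splitter is deliberately naive, and `paragraph_key`, `find_paragraph` and `max_distance` are made-up names.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def paragraph_key(paragraph: str, words: int = 3) -> str:
    """Build a short key from the first and last sentences of a block.

    Assumed layout: initials of the first `words` words of each sentence.
    A one-sentence block (a heading, say) simply repeats its initials.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    if not sentences:
        return ""

    def initials(sentence: str) -> str:
        return "".join(word[0] for word in sentence.split()[:words])

    return initials(sentences[0]) + initials(sentences[-1])


def find_paragraph(paragraphs, key: str, max_distance: int = 2):
    """Return the index of the paragraph whose key is closest to `key`.

    Returns None if even the best candidate is more than max_distance
    edits away. Ties go to the earliest paragraph.
    """
    best_index, best_distance = None, None
    for index, paragraph in enumerate(paragraphs):
        distance = levenshtein(paragraph_key(paragraph), key)
        if best_distance is None or distance < best_distance:
            best_index, best_distance = index, distance
    if best_distance is None or best_distance > max_distance:
        return None
    return best_index
```

With six-character keys, collisions become likely as the number of paragraphs grows, which is where a larger `words` value or a text excerpt would come in.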




Hadrien Gardeur

Jan 18, 2011, 9:25:58 AM
to openbo...@googlegroups.com
> It is two pointers: the initial paragraph key points to the first and
> last sentences in a paragraph, and the highlight key points to
> specific sentences in the selected paragraphs. It also satisfies, via
> the Levenshtein distance code, OB's requirements to be quite fuzzy.

"In a paragraph", that's the whole problem.

Marc Köhlbrugge (*openmargin)

Jan 18, 2011, 11:09:15 AM
to openbo...@googlegroups.com
We used a similar approach for one of the earlier prototypes of our e-reader app *openmargin, but as Hadrien points out, referring only to paragraphs won't work in a lot of cases (e.g. headings, lists, etc.).

Instead, we'll be using two Xpoint references (begin and end point) -and- saving the cited text. We're still working on this, but I can't see a reason it wouldn't work. I'll give you guys an update when we're there.
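
As a rough illustration of the begin/end-plus-cited-text idea (not *openmargin's actual code), here is a Python sketch. It assumes a simplified model in which a document is a flat list of text blocks and a pointer is a (block index, character offset) pair instead of a real Xpoint reference; `Pointer`, `Annotation` and `resolve` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    block: int   # index of the block (paragraph, heading, list item...)
    offset: int  # character offset inside that block


@dataclass(frozen=True)
class Annotation:
    start: Pointer
    end: Pointer
    quote: str   # the cited text, saved alongside the two pointers


def extract(blocks, start: Pointer, end: Pointer) -> str:
    """Text between two pointers; blocks are joined with a newline."""
    if start.block == end.block:
        return blocks[start.block][start.offset:end.offset]
    parts = [blocks[start.block][start.offset:]]
    parts.extend(blocks[start.block + 1:end.block])
    parts.append(blocks[end.block][:end.offset])
    return "\n".join(parts)


def to_pointer(blocks, global_offset: int) -> Pointer:
    """Map an offset in the newline-joined text back to a Pointer."""
    for index, block in enumerate(blocks):
        if global_offset <= len(block):
            return Pointer(index, global_offset)
        global_offset -= len(block) + 1  # +1 for the joining newline
    raise ValueError("offset beyond end of document")


def resolve(blocks, annotation: Annotation):
    """Return (start, end) pointers for the annotation, or None.

    First trust the stored pointers if they still yield the cited text.
    Otherwise fall back to an exact search for the cited text.
    """
    try:
        if extract(blocks, annotation.start, annotation.end) == annotation.quote:
            return annotation.start, annotation.end
    except IndexError:
        pass  # the stored block no longer exists
    position = "\n".join(blocks).find(annotation.quote)
    if position == -1:
        return None
    return (to_pointer(blocks, position),
            to_pointer(blocks, position + len(annotation.quote)))
```

The fallback here is an exact search, so it survives blocks being inserted or removed but not edits inside the cited text itself; a fuzzy search would be needed for that.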

Marc 

stml

Jan 19, 2011, 6:13:26 AM
to Open Bookmarks
Why won't it work for headings and lists, in principle?

A heading can be treated as a one-sentence paragraph, a list or list
items can be treated as paragraphs. A paragraph is just a block of
text.

Marc K - very interested to hear what xpoint references you're using!


Marc Köhlbrugge (*openmargin)

Jan 19, 2011, 7:32:19 AM
to openbo...@googlegroups.com
> A heading can be treated as a one-sentence paragraph, a list or list items can be treated as paragraphs. A paragraph is just a block of text.

That's true. Instead of paragraphs, you would be referring to 'nodes'. But what happens when those nodes are nested, for example a blockquote containing a list? How would you refer to a certain list item?
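
One possible way to address a nested node, sketched in Python, is a path of child indices from the root. This is only an illustration under assumptions: the sample markup, `path_to` and `resolve` are invented here, and a real reader would more likely use an established pointer syntax than bare index lists.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a blockquote that contains a paragraph and a list.
DOC = """<body>
  <p>Intro.</p>
  <blockquote>
    <p>Quoted paragraph.</p>
    <ul><li>first item</li><li>second item</li></ul>
  </blockquote>
</body>"""


def path_to(root, target):
    """Return the list of child indices leading from root to target."""
    if root is target:
        return []
    for index, child in enumerate(root):
        sub_path = path_to(child, target)
        if sub_path is not None:
            return [index] + sub_path
    return None  # target is not inside this subtree


def resolve(root, path):
    """Follow a list of child indices down from root."""
    node = root
    for index in path:
        node = node[index]
    return node


root = ET.fromstring(DOC)
second_item = root[1][1][1]           # body > blockquote > ul > second li
path = path_to(root, second_item)     # [1, 1, 1]
assert resolve(root, path) is second_item
```

Such a path is exact but brittle: inserting a node earlier in the tree shifts the indices, which is an argument for storing the cited text as well.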

> Marc K - very interested to hear what xpoint references you're using!

I will get back to you when we have done some tests. But really, you could just take a look at one of the web-based annotation services (Zotero comes to mind). They are basically doing the same thing, if you treat an ePub book as a web page.