Extract PDF annotations to Hypothesis

Jochem Holscher

Nov 6, 2023, 2:16:07 PM
to Hypothes.is Forum
Hi,
Hypothesis is so amazing that I want to extract my non-Hypothesis PDF annotations and upload them to Hypothesis.

Question
Has anyone automated the process of extracting annotations from the PDF annotation layer and uploading them to Hypothesis?

Context
I occasionally annotate on my Onyx Boox, which has its own app that I don't want to leave or alter. The annotations go into the PDF annotation layer, which I know can be extracted programmatically.
Before going through that hassle myself, I figured I'd first check whether someone has already done this.
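
For reference, a rough Python sketch of what such automation could look like. It rests on several assumptions: it uses the third-party pypdf library to read each page's /Annots entries, it only picks up annotations that carry a /Contents comment (highlights store coordinates rather than the highlighted text, and handwriting is not covered), and it posts to the public Hypothesis API at https://api.hypothes.is/api/annotations with a personal API token. The function names, the "pdf-import" tag, and the example URL are made up for illustration, and the annotations are created without a text selector, so they would most likely show up as page-level notes rather than anchored highlights.

```python
import json
import urllib.request

HYPOTHESIS_API = "https://api.hypothes.is/api/annotations"


def extract_annotations(pdf_path):
    """Yield (page_number, comment) for PDF annotations that have a /Contents entry."""
    from pypdf import PdfReader  # third-party dependency: pip install pypdf

    reader = PdfReader(pdf_path)
    for page_number, page in enumerate(reader.pages, start=1):
        if "/Annots" not in page:
            continue
        for ref in page["/Annots"]:
            annot = ref.get_object()
            if "/Contents" in annot and annot["/Contents"]:
                yield page_number, str(annot["/Contents"])


def build_payload(uri, page_number, text, tags=("pdf-import",)):
    """Build the JSON body for one annotation (no selector, so not anchored to text)."""
    return {
        "uri": uri,
        "text": text,
        "tags": list(tags) + [f"page-{page_number}"],  # hypothetical tagging scheme
        "target": [{"source": uri}],
    }


def upload(payload, api_token):
    """POST one annotation to Hypothesis and return the parsed JSON response."""
    request = urllib.request.Request(
        HYPOTHESIS_API,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    import sys

    # Usage: python pdf_to_hypothesis.py <pdf_path> <uri_of_pdf> <api_token>
    pdf_path, uri, token = sys.argv[1:4]
    for page_number, comment in extract_annotations(pdf_path):
        upload(build_payload(uri, page_number, comment), token)
```

Anchoring the uploaded notes to the original passages would need a selector (for example a quoted text snippet) in each target, which means recovering the highlighted text from the page separately.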

Cheers,

Hakarlfresser

Nov 7, 2023, 9:28:40 AM
to Hypothes.is Forum
Jochem,

Text annotations in PDF files may be "buried" in the PDF format and not available to copy or extract as plain text. Try uploading a sample of your PDF to DocDrop (https://docdrop.org/ocr/) to "free" the text, then copy the freed text from DocDrop into Hypothesis. I do not know of a routine that does this for multiple pages, sorry.

Jack in CO/US