downloading archived bookmarks

Gerald Oskoboiny

Jan 25, 2012, 2:52:35 PM
to Pinboard
Hi,

Any progress on allowing automated downloads of archived bookmarks?
cf. http://pinboard.in/faq/#download_archived

It seems like it would be really simple to implement by publishing a
page that has a list of links to all of a user's archived bookmarks,
then people could just run "wget -np -m" to grab a copy. (and
incremental updates would be fast + cheap for both sides)

Alternatively, has anyone set up automated crawling/scraping of
their archived bookmarks?

--
Gerald Oskoboiny <ger...@impressive.net>
http://impressive.net/people/gerald/

maciej

Jan 25, 2012, 4:43:41 PM
to Pinboard
My plan is to let people queue downloads and get emailed a link when
those are ready. You'll be able to request a full download, or just a
tarball of stuff crawled since a certain date. Eventually it will
also be possible to have Pinboard regularly sync stuff to a Dropbox or
S3 account of your choice.

The problem with the approach you outline is that the archive servers
are not set up for a high volume of read traffic, and will bog down
quickly if people try to crawl their stuff. I'm happy to work with
anyone who wants to regularly grab their data, but please do contact
me about it before unleashing your bot armies.
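
For illustration only, the "tarball of stuff crawled since a certain date" idea could be sketched roughly as follows. This is not Pinboard's implementation: the function name `build_incremental_tarball` is hypothetical, and the sketch assumes each archived page is a file on disk whose modification time stands in for its crawl date.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def build_incremental_tarball(archive_dir, since, out_path):
    """Pack files under archive_dir modified at or after `since` into a .tar.gz.

    Assumption: file mtime approximates the crawl date.
    out_path should live outside archive_dir so the tarball does not pick itself up.
    Returns the relative paths that were added, in sorted order.
    """
    archive_dir = Path(archive_dir)
    cutoff = since.timestamp()
    added = []
    with tarfile.open(out_path, "w:gz") as tar:
        for path in sorted(archive_dir.rglob("*")):
            if path.is_file() and path.stat().st_mtime >= cutoff:
                rel = path.relative_to(archive_dir)
                # Store paths relative to the archive root, not absolute paths.
                tar.add(path, arcname=rel.as_posix())
                added.append(rel.as_posix())
    return added


if __name__ == "__main__":
    # Hypothetical directory and cutoff date, for demonstration only.
    cutoff = datetime(2012, 1, 1, tzinfo=timezone.utc)
    print(build_incremental_tarball("archive", cutoff, "since-2012-01-01.tar.gz"))
```

A queued job could run something like this in the background and email the resulting link, so the archive servers are read once per request rather than crawled page by page.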


On Jan 25, 11:52 am, Gerald Oskoboiny <ger...@impressive.net> wrote:
> Hi,
>
> Any progress on allowing automated downloads of archived bookmarks?
> cf. http://pinboard.in/faq/#download_archived

Gerald Oskoboiny

Jan 30, 2012, 2:29:51 PM
to pinboa...@googlegroups.com
* maciej <mcegl...@gmail.com> [2012-01-25 13:43-0800]

> On Jan 25, 11:52 am, Gerald Oskoboiny <ger...@impressive.net> wrote:

> > Any progress on allowing automated downloads of archived bookmarks?
> > cf. http://pinboard.in/faq/#download_archived

> My plan is to let people queue downloads and get emailed a link when
> those are ready. You'll be able to request a full download, or just a
> tarball of stuff crawled since a certain date.

I would really like an implementation that makes it easy to
automate downloads, so I can just set it up and forget about it.
(no manual intervention needed)

> Eventually it will also be possible to have Pinboard regularly
> sync stuff to a Dropbox or S3 account of your choice.

That sounds better, though it would obligate me to pay for
storage elsewhere.

> > It seems like it would be really simple to implement by publishing a
> > page that has a list of links to all of a user's archived bookmarks,
> > then people could just run "wget -np -m" to grab a copy. (and
> > incremental updates would be fast + cheap for both sides)

> The problem with the approach you outline is that the archive servers
> are not set up for a high volume of read traffic, and will bog down
> quickly if people try to crawl their stuff.

I think you could handle those issues easily by adding a cache
and/or rate-limiting requests to archived content. I'd be happy
to chat about implementation details if that would help.
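
As a rough sketch of the rate-limiting idea (an assumption about how it might be done, not anything Pinboard is known to run), a token bucket lets each user make short bursts of requests while capping the sustained rate. The class name `TokenBucket` and its parameters are made up for this example.

```python
import time


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if a request may proceed now, consuming one token."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=1.0, capacity=2)
    # The first two calls fit the burst allowance; the third is refused until tokens refill.
    print([bucket.allow() for _ in range(3)])
```

A server would keep one bucket per account and answer refused requests with an HTTP 429 or similar, which a mirroring client can treat as a signal to back off.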

Drew

Feb 27, 2012, 6:16:52 PM
to Pinboard
Dropbox integration would be perfect for me. I could almost replace
Evernote.


On Jan 30, 11:29 am, Gerald Oskoboiny <ger...@impressive.net> wrote:
> * maciej <mceglow...@gmail.com> [2012-01-25 13:43-0800]
>
> > On Jan 25, 11:52 am, Gerald Oskoboiny <ger...@impressive.net> wrote:
> > > Any progress on allowing automated downloads of archived bookmarks?
> > > cf. http://pinboard.in/faq/#download_archived
> > My plan is to let people queue downloads and get emailed a link when
> > those are ready. You'll be able to request a full download, or just a
> > tarball of stuff crawled since a certain date.
>
> I would really like an implementation that makes it easy to
> automate downloads, so I can just set it up and forget about it.
> (no manual intervention needed)
>
> > Eventually it will also be possible to have Pinboard regularly
> > sync stuff to a Dropbox or S3 account of your choice.

Johannes

Feb 29, 2012, 7:11:34 PM
to pinboa...@googlegroups.com
There could be an opportunity here: implement "download archived bookmarks" as an action for IFTTT. Dropbox integration would be easy this way!

Gerald Oskoboiny

Mar 13, 2012, 9:59:24 PM
to pinboa...@googlegroups.com
* Gerald Oskoboiny <ger...@impressive.net> [2012-01-30 11:29-0800]

> * maciej <mcegl...@gmail.com> [2012-01-25 13:43-0800]
> > On Jan 25, 11:52 am, Gerald Oskoboiny <ger...@impressive.net> wrote:
> > > Any progress on allowing automated downloads of archived bookmarks?
> > > cf. http://pinboard.in/faq/#download_archived
>
> > My plan is to let people queue downloads and get emailed a link when
> > those are ready. You'll be able to request a full download, or just a
> > tarball of stuff crawled since a certain date.
>
> I would really like an implementation that makes it easy to
> automate downloads, so I can just set it up and forget about it.
> (no manual intervention needed)

> > > It seems like it would be really simple to implement by publishing a
> > > page that has a list of links to all of a user's archived bookmarks,
> > > then people could just run "wget -np -m" to grab a copy. (and
> > > incremental updates would be fast + cheap for both sides)
>
> > The problem with the approach you outline is that the archive servers
> > are not set up for a high volume of read traffic, and will bog down
> > quickly if people try to crawl their stuff.

Another idea:

Allow archival account holders to request a one-time download, as
you do now, then publish a page for each user that has a list of
links to the most recent 24 hours of crawled content. Then people
could do a full download once, and run wget -m daily to keep
their copy up to date with minimal server load.
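
To make the daily-update idea concrete, here is a minimal sketch of the client side, assuming the per-user "last 24 hours" page is plain HTML containing ordinary links. No such page exists yet, so the URL layout, the names `LinkCollector` and `links_to_fetch`, and the local mirror layout are all hypothetical. Unlike wget -m, which compares timestamps, this sketch only checks whether a local copy already exists.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def links_to_fetch(index_html, base_url, mirror_dir):
    """Return absolute URLs from the index page that have no local copy yet.

    Assumption: the mirror stores each page at mirror_dir/<URL path>.
    """
    parser = LinkCollector()
    parser.feed(index_html)
    mirror_dir = Path(mirror_dir)
    todo = []
    for href in parser.links:
        url = urljoin(base_url, href)
        local = mirror_dir / urlparse(url).path.lstrip("/")
        if not local.exists():
            todo.append(url)
    return todo


if __name__ == "__main__":
    # Hypothetical index page and host, for demonstration only.
    page = '<a href="/archive/a.html">a</a> <a href="/archive/b.html">b</a>'
    print(links_to_fetch(page, "https://example.invalid/recent/", "mirror"))
```

Run once a day, a script like this would request only the index page plus whatever is new, which keeps the load on the archive servers small.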
