A Solution! -- Serving Personally Archived Web Pages in TiddlyWiki


Steven Schneider

Nov 17, 2016, 9:59:05 AM
to TiddlyWiki
Folks, with the help of several in this group, I've developed a workflow and macro for serving links to archived Web pages in TiddlyWiki. Note that this is a different approach from bookmarking, which serves links to current pages on the Web; this approach makes a local copy of a Web page and generates a permalink that can be invoked in TW.

In its current version, it relies on a Firefox extension (Scrapbook) to do the archiving; I've yet to find a similar extension for Chrome or any other browser. And, of course, it requires a repository on the web -- I've tested so far with GitHub, and would be interested to see tests done using Dropbox (via UpDog) and others. Thoughts and ideas for enhancement are welcome here.


//steve.

Tristan Kohl

Nov 19, 2016, 2:26:49 PM
to TiddlyWiki
Hi Steve,

As I understand your project, you create a local copy of a webpage and serve it through a link in TW which opens the page in a new tab?

You said your approach relies on Scrapbook, but if I got your idea right, you do not need that at all. For Firefox there is an extension called Mozilla Archive Format, which saves a whole webpage in a single MAFF file (actually a zip archive following some rules) or MHT file. For Chrome there is e.g. SingleFile (note: I do not use Chrome, this was just the first result I got looking for MAFF and MHT). Afterwards you can link to the local file the same way you did in your example.
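Since MAFF is zip-based, a saved archive can be inspected with ordinary zip tools. Here is a minimal Python sketch, assuming each saved page sits in its own folder inside the archive with an `index.html` (or `index.htm`) entry point; the function name `list_maff_pages` is made up for illustration and is not part of any extension.

```python
import zipfile


def list_maff_pages(archive):
    """Return the entry-point HTML files found in a zip-based web archive.

    `archive` may be a file path or a binary file-like object.
    Assumption: each saved page has an index.html / index.htm entry point.
    """
    entry_names = {"index.html", "index.htm"}
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name
            for name in zf.namelist()
            # Compare only the last path component of each archive member.
            if name.rsplit("/", 1)[-1] in entry_names
        )
```

Calling `list_maff_pages("saved-page.maff")` (a hypothetical file name) would give the paths inside the archive that a browser would open first.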

Synchronization is up to you, and I think your solution works quite well.

Cheers
Tristan

Mark S.

Nov 19, 2016, 2:57:52 PM
to TiddlyWiki
What's great about Scrapbook is that it allows you to save just the parts you want (which is usually just the main article). You can even edit the saved copy to cut out the extras. MAFF and MHT save the entire page: TOCs, advertisements, floating link bars, and sometimes even pop-up announcements.

Mark

Tristan Kohl

Nov 19, 2016, 3:27:20 PM
to TiddlyWiki
Well, you are right. I did not think about that, since it has been some years since I used MAFF.

Today I use PrintFriendly, which generates great-looking PDFs from websites via a bookmarklet and lets you cut out parts of the website (to save just a small part). The big advantage for me is that I do not depend on my browser to see the PDF, plus I can embed it in TiddlyWiki via _canonical_uri.
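For anyone who wants to try the embedding: TiddlyWiki's field for external content is `_canonical_uri`, and a tiddler with that field and type `application/pdf` displays the PDF without storing it in the wiki. A rough Python sketch that writes such a tiddler in `.tid` format follows; the function name and the example URL are placeholders, not anything from PrintFriendly or TiddlyWiki itself.

```python
def external_pdf_tiddler(title, url):
    """Build the text of a .tid file that points at an externally hosted PDF.

    A .tid file is a block of "field: value" lines, a blank line, then the
    tiddler text. The text stays empty here because the content is external.
    """
    if "\n" in title or "\n" in url:
        raise ValueError("field values must be single-line")
    fields = [
        ("title", title),
        ("type", "application/pdf"),
        ("_canonical_uri", url),
    ]
    header = "\n".join(f"{name}: {value}" for name, value in fields)
    return header + "\n\n"


# Placeholder title and URL, for illustration only.
print(external_pdf_tiddler("Archived article", "https://example.com/article.pdf"))
```

Transcluding the resulting tiddler with `{{Archived article}}` should then show the PDF in the browser's viewer.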

Cheers
Tristan

Mark S.

Nov 19, 2016, 4:01:01 PM
to TiddlyWiki
That does look nice. The problem with PDF solutions for web pages is that they frequently paginate right in the middle of an image. Do you happen to know if PrintFriendly is smart enough to avoid that?

One workaround I've found is to copy the selected text and images into Word 2013. Word is smart enough not to split images across a page break, and you can tweak font size, image size and borders. Then export to PDF.

Have fun,
Mark

Steven Schneider

Nov 21, 2016, 9:37:58 PM
to TiddlyWiki
Good comments, and thanks for the suggestions. I'll check out MAFF and MHT. PDF is not in hypertext context, i.e. links are not clickable and page requisites are not viewable as individual files, so for my purposes it doesn't do the trick.

And, yes, Scrapbook allows editing of archived pages, something I'm generally not interested in (being an archivist :), but something very powerful indeed.

I am pretty much stuck on Scrapbook, because it saves the page in hypertext context, with all links etc. What I don't like about it is that it modifies the original HTML code, so the page loses its "purity" from an archiving perspective. Better from this perspective: WARCreate: http://warcreate.com/. Much trickier to get running, however.

Most importantly for these purposes, Scrapbook generates a reliable permalink based on the timestamp (i.e. timestamp/index.html), which makes the whole "save a link to an archived web page" macro work pretty much flawlessly. Three clicks: one to scrapbook the page, a second to open the scrapbooked page (which doubles as a check on the archiving), and a third to highlight the timestamp. Then copy, and paste into the macro.

Future: Someone more clever than I could probably manipulate Scrapbook to automatically push the timestamp of the last scrapbooked page into the buffer, and generate macro text -- much the same way that Dropbox will push a link to a screencapture into the clipboard buffer for immediate pasting. Similarly, Jing will allow customization of the text pushed into the clipboard (I've used it to generate text that is a TW macro), but that is far beyond my abilities to code.
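To make the idea concrete, here is a rough Python sketch of the "timestamp in, macro text out" step. It does not hook into Scrapbook or the clipboard. It assumes the timestamp is a 14-digit folder name (YYYYMMDDHHMMSS); the macro name `archived`, its parameter order, and the base URL are all hypothetical stand-ins, so substitute whatever your own macro and repository use.

```python
import re


def _check_timestamp(timestamp):
    # Assumption: capture folders are named with 14 digits (YYYYMMDDHHMMSS).
    if not re.fullmatch(r"\d{14}", timestamp):
        raise ValueError(f"not a 14-digit timestamp: {timestamp!r}")


def archive_permalink(base_url, timestamp):
    """Return the permalink <base_url>/<timestamp>/index.html."""
    _check_timestamp(timestamp)
    return f"{base_url.rstrip('/')}/{timestamp}/index.html"


def macro_call(timestamp, label, macro_name="archived"):
    """Return TiddlyWiki macro-call text for one archived page.

    The macro name and its two positional parameters are hypothetical.
    """
    _check_timestamp(timestamp)
    if '"' in label:
        raise ValueError("label must not contain double quotes")
    return f'<<{macro_name} "{timestamp}" "{label}">>'


# Hypothetical repository URL and page label, for illustration only.
print(archive_permalink("https://example.github.io/archive/", "20161117095905"))
print(macro_call("20161117095905", "Some archived page"))
```

The text returned by `macro_call` is what would be pushed to the clipboard for pasting into a tiddler.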

//steve.