Re: Memento: Timemap for git commits history (and other ideas)

Michael Nelson

Oct 7, 2018, 4:43:39 PM
to Karim Gemayel, memen...@googlegroups.com

Hi Karim,

I'm cc'ing this reply to the memento-dev email list -- you'll want to
join that group and look through the email archives:

https://groups.google.com/d/forum/memento-dev

My responses are inline below:

On Sun, 7 Oct 2018, Karim Gemayel wrote:

> Hello,
>
> I'm a French developer discovering the Memento protocol, and I was
> wondering if someone had the idea of providing a Memento interface for
> git commits history.
>
> At first glance, it seems it'd fit the protocol. I'm wondering if having
> a Memento interface on GitHub would make sense. What do you think?

A GitHub gateway has been built, and it's available at
timetravel.mementoweb.org. See, for example:

http://timetravel.mementoweb.org/list/20130315161942/http://github.com/oduwsdl/CarbonDate

That page shows mementos for http://github.com/oduwsdl/CarbonDate in both
the Internet Archive and GitHub, the latter of which is presumably the
"best" archive for this URL.
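
The list URL above appears to follow the pattern
/list/<14-digit datetime>/<original URI>. Here is a minimal sketch that
builds such a URL, assuming that pattern holds in general. The pattern is
inferred from the single example in this message, and build_list_url is a
hypothetical helper name, not part of any Memento tool.

```python
from datetime import datetime

# Base path taken from the example URL in this message.
LIST_BASE = "http://timetravel.mementoweb.org/list"

def build_list_url(when, original_uri):
    """Build a Time Travel list URL (assumed pattern: /list/YYYYMMDDhhmmss/URI)."""
    return f"{LIST_BASE}/{when:%Y%m%d%H%M%S}/{original_uri}"

if __name__ == "__main__":
    print(build_list_url(datetime(2013, 3, 15, 16, 19, 42),
                         "http://github.com/oduwsdl/CarbonDate"))
```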

I believe this is skeleton code that you can adapt to build a TimeGate
for most content management systems (like the GitHub example above):

https://github.com/mementoweb/timegate
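
To give a feel for what a TimeGate has to do for a versioned system such
as a git history, here is a rough sketch of the selection step only. It
does not use the API of the timegate project linked above; select_memento
and the version list are hypothetical, and "return the version that was
current at the requested time" is just one reasonable policy, since the
choice of memento is left to the server.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def select_memento(versions, accept_datetime):
    """Pick the version that was current at the requested datetime.

    versions: list of (timezone-aware datetime, uri) tuples, oldest first
              (hypothetical data, e.g. commit times and commit URLs).
    accept_datetime: value of the Accept-Datetime request header,
                     e.g. "Fri, 15 Mar 2013 16:19:42 GMT".
    """
    wanted = parsedate_to_datetime(accept_datetime)
    times = [t for t, _ in versions]
    # Index of the first version strictly newer than the requested time.
    i = bisect_right(times, wanted)
    # If the request predates every version, fall back to the first one.
    return versions[max(i - 1, 0)][1]

if __name__ == "__main__":
    history = [
        (datetime(2012, 1, 1, tzinfo=timezone.utc), "https://example.org/v1"),
        (datetime(2013, 1, 1, tzinfo=timezone.utc), "https://example.org/v2"),
        (datetime(2014, 1, 1, tzinfo=timezone.utc), "https://example.org/v3"),
    ]
    print(select_memento(history, "Fri, 15 Mar 2013 16:19:42 GMT"))
```

A real TimeGate would then redirect the client to the selected URI and
advertise the original resource and TimeMap in a Link header.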

>
> Also, I was wondering if there was any plan to use some <link
> rel="timegate|timemap"> in HTML headers, similarly to OpenSearch
> autodiscovery
> https://developer.mozilla.org/en-US/docs/Web/OpenSearch#Autodiscovery_of_search_plugins
> .

That's possible, though off the top of my head I'm not aware of any sites
doing that. Sites like w3.org instead send the links in the HTTP response
headers:

$ curl -I https://www.w3.org/wiki/Main_Page
HTTP/1.1 200 OK
Date: Sun, 07 Oct 2018 20:07:53 GMT
X-Content-Type-Options: nosniff
Link: <https://www.w3.org/wiki/Main_Page>; rel="original
latest-version",<https://www.w3.org/wiki/Special:TimeGate/Main_Page>;
rel="timegate",<https://www.w3.org/wiki/Special:TimeMap/Main_Page>;
rel="timemap"; type="application/link-format"; from="Thu, 01 Jan 1970
00:00:00 GMT"; until="Wed, 22 Nov 2017 16:54:55
GMT",<https://www.w3.org/wiki/index.php?title=Main_Page&oldid=30366>;
rel="first memento"; datetime="Thu, 01 Jan 1970 00:00:00
GMT",<https://www.w3.org/wiki/index.php?title=Main_Page&oldid=105259>;
rel="last memento"; datetime="Wed, 22 Nov 2017 16:54:55 GMT"
[deletia]
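
For a client, discovery from a response like the one above comes down to
parsing the Link header and looking up the "timegate" and "timemap"
relations. Below is a minimal sketch using only the standard library. The
function names are hypothetical, and the parser only covers the simple
<URI>; name="value" form seen in this example rather than the full Link
header grammar. Note that it cannot simply split on commas, because the
quoted datetimes contain commas too.

```python
import re

# One link-value: <URI> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^",;]+))*)')
# One parameter: ;name="quoted value" or ;name=bare-token
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^",;]+))')

def parse_link_header(value):
    """Return a list of (uri, params) tuples from a Link header value."""
    links = []
    for uri, raw_params in LINK_RE.findall(value):
        params = {}
        for name, quoted, bare in PARAM_RE.findall(raw_params):
            params[name.lower()] = quoted if quoted else bare.strip()
        links.append((uri, params))
    return links

def find_rel(links, rel):
    """Return the first URI whose rel value includes the given relation type."""
    for uri, params in links:
        if rel in params.get("rel", "").split():
            return uri
    return None

if __name__ == "__main__":
    # Shortened copy of the w3.org header shown above, joined onto one line.
    header = (
        '<https://www.w3.org/wiki/Main_Page>; rel="original latest-version",'
        '<https://www.w3.org/wiki/Special:TimeGate/Main_Page>; rel="timegate",'
        '<https://www.w3.org/wiki/Special:TimeMap/Main_Page>; rel="timemap"; '
        'type="application/link-format"; from="Thu, 01 Jan 1970 00:00:00 GMT"; '
        'until="Wed, 22 Nov 2017 16:54:55 GMT"'
    )
    links = parse_link_header(header)
    print(find_rel(links, "timegate"))
    print(find_rel(links, "timemap"))
```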


>
> I believe that such an autodiscovery mechanism could increase the use of
> the Memento protocol, not only for information science people, but
> perhaps for the press and journalists (if I were a marketer, I'd say
> it'd be a new standard and modern manner to browse newspapers websites).
>
> So, perhaps having the New York Times or GitHub implementing the Memento
> protocol would help to "spread the word". (I for one would really like
> to be able to browse the news that way.)

I agree that this would be welcome. It could be done right now, if the
sites adopted the protocol.

>
> Also, I'm impatiently waiting for Memento support in Wikipedia which
> could give great exposure to the protocol, but it's not yet in their
> plan apparently https://phabricator.wikimedia.org/T36778 whereas it'd be
> the perfect use case.

We've tried several times to get them to adopt Memento. The tools are
written, but the effort has been frustrated by a few vocal opponents who
prefer to keep the history of pages difficult to access.

>
> Lastly, is there any plan to add some interaction with the WARC format?
>
> Concretely, I'm wondering if there's a standard way to retrieve a WARC
> archive based on the Memento timemap (been looking into WARC headers but
> haven't found any working example). And also perhaps a standard way,
> either to POST a link to Internet Archive & Archive.Today, or even
> directly a WARC file.
>

Not really. First, not all Memento-accessible archives/CMSs support WARC
(e.g., MediaWiki, archive.is, GitHub). For archives that do use WARCs, a
replayed page will typically have resources from several different WARC
files, so getting "the" WARC file isn't necessarily a well-defined
operation. You could write a new WARC file from the replayed contents, if
that's what you'd like to do. See:

https://warcreate.com/
https://webrecorder.io/

See also:

https://github.com/WASAPI-Community
https://archive.org/details/wasapi

As for submitting to multiple archives at once, there is no standard, but
there are libraries that handle it, e.g.:

https://github.com/oduwsdl/archivenow
http://ws-dl.blogspot.com/2017/02/2017-02-22-archive-now-archivenow.html

regards,

Michael

> I'm new to Memento and WARC, so I might be missing some things.
>
> Cheers.
>
> --
> Karim Gemayel
>

----
Michael L. Nelson m...@cs.odu.edu http://www.cs.odu.edu/~mln/
Dept of Computer Science, Old Dominion University, Norfolk VA 23529
+1 757 683 6393 +1 757 683 4900 (f)
