TimeTravel Service fetching timemaps at the wrong URL


Daniel Bicho

Nov 18, 2019, 6:59:10 AM
to Memento Development

Hey!


We at Arquivo.pt updated our Pywb installation, and as a consequence the URL we announce for a resource's TimeMap changed.


We see that the TimeTravel Service agent is requesting timemaps from our servers at a URL like this:

https://arquivo.pt/wayback/timemap/*/https://fccn.pt


But instead of using that URL it should be getting the timemaps from https://arquivo.pt/wayback/timemap/link/https://fccn.pt.


We added a redirect so that the TimeTravel requests don't fail.


We would like to better understand the implications of this change, and we would like the TimeTravel service to stop going through the redirect.


Shouldn't the TimeTravel service follow the announced TimeMap?
Have we done something wrong on our side?
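
To illustrate what "following the announced TimeMap" could look like on the client side, here is a minimal Python sketch that pulls the rel="timemap" target out of a Link header. The header value is the one our server sends for https://fccn.pt; the helper name `find_link_target` and the regex-based parsing (which only handles quoted attribute values) are assumptions made for this example, not a description of how the TimeTravel service works.

```python
import re

# Matches one link-value: <target> followed by ;name="value" attributes.
LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
ATTRIBUTE = re.compile(r'([\w-]+)="([^"]*)"')


def find_link_target(link_header, rel):
    """Return the first target whose rel attribute contains `rel`, else None."""
    for match in LINK_VALUE.finditer(link_header):
        target, raw_attrs = match.groups()
        attrs = dict(ATTRIBUTE.findall(raw_attrs))
        # rel may hold several space-separated relation types.
        if rel in attrs.get("rel", "").split():
            return target
    return None


link_header = (
    '<https://fccn.pt>; rel="original", '
    '<https://arquivo.pt/wayback/mp_/https://fccn.pt>; rel="timegate", '
    '<https://arquivo.pt/wayback/timemap/link/https://fccn.pt>; rel="timemap"; '
    'type="application/link-format"'
)

print(find_link_target(link_header, "timemap"))
```

A client that discovers the URI-T this way on each visit would pick up a change like ours without needing a configuration update.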


Best regards!

Martin Klein

Nov 18, 2019, 3:33:14 PM
to memen...@googlegroups.com
Hi Daniel,

Thanks for the info! We have changed the URI-T in our TimeTravel service configurations and will not hit the old URI anymore.
There is nothing you did wrong. We currently have no comprehensive plan in place for learning about, or being informed of, such changes. That is a situation that can and should be improved.

cheers
M

--

---
You received this message because you are subscribed to the Google Groups "Memento Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to memento-dev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/memento-dev/c9d84d7a-beb2-4ceb-8c84-8c60c11dd7fb%40googlegroups.com.

Martin Klein

Nov 18, 2019, 3:54:33 PM
to memen...@googlegroups.com
On this topic:
It seems your timemaps are not well formatted - see example below.
The URIs are in fact URI-Rs and not URI-Ms and so a client would have to somehow construct URI-Ms given the datetime and an assumed or learned base URI.
Is this something you intended? We have seen various implementations of Pywb but have never come across a timemap serialization like that. It breaks all sorts of clients.

cheers
M




HTTP/1.1 200 OK
Date: Mon, 18 Nov 2019 20:49:33 GMT
Server: Apache
Content-Type: application/link-format
Content-Length: 118754
Link: <https://fccn.pt>; rel="original", <https://arquivo.pt/wayback/mp_/https://fccn.pt>; rel="timegate", <https://arquivo.pt/wayback/timemap/link/https://fccn.pt>; rel="timemap"; type="application/link-format"
Vary: accept-datetime
Cache-Control: max-age=300, public, must-revalidate
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: X-Requested-With

<https://arquivo.pt/wayback/timemap/link/https://fccn.pt>; rel="self"; type="application/link-format"; from="Sun, 13 Oct 1996 14:56:50 GMT",
<https://arquivo.pt/wayback/mp_/https://fccn.pt>; rel="timegate",
<https://fccn.pt>; rel="original",
<http://www.fccn.pt/>; rel="memento"; datetime="Sun, 13 Oct 1996 14:56:50 GMT"; collection="$root",
<http://www.fccn.pt/>; rel="memento"; datetime="Wed, 10 Dec 1997 20:21:37 GMT"; collection="$root",
<http://www.fccn.pt/>; rel="memento"; datetime="Wed, 10 Dec 1997 20:21:37 GMT"; collection="$root",
<http://www.fccn.pt/>; rel="memento"; datetime="Sun, 15 Feb 1998 10:16:45 GMT"; collection="$root",
<http://www.fccn.pt/>; rel="memento"; datetime="Sun, 15 Feb 1998 10:16:45 GMT"; collection="$root",
<http://www.fccn.pt/>; rel="memento"; datetime="Thu, 03 Dec 1998 04:12:05 GMT"; collection="$root",
<http://www.fccn.pt/>; rel="memento"; datetime="Thu, 03 Dec 1998 04:12:05 GMT"; collection="$root",
<http://www.fccn.pt/>; rel="memento"; datetime="Sat, 12 Dec 1998 02:49:05 GMT"; collection="$root",
...
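
To make concrete what such a client-side reconstruction would involve, here is a rough Python sketch. It assumes a replay prefix of `https://arquivo.pt/wayback/` and a wayback-style `<14-digit timestamp>/<URI-R>` layout for URI-Ms; both are guesses a client would have to make, since the TimeMap above states neither. The function name `guess_urim` is made up for this example.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime

# Assumed replay prefix; the TimeMap above does not announce it.
ASSUMED_PREFIX = "https://arquivo.pt/wayback/"


def guess_urim(urir, memento_datetime, prefix=ASSUMED_PREFIX):
    """Guess a URI-M from a URI-R and an RFC 1123 datetime string."""
    moment = parsedate_to_datetime(memento_datetime).astimezone(timezone.utc)
    # Wayback-style 14-digit timestamp: YYYYMMDDhhmmss (assumed layout).
    timestamp = moment.strftime("%Y%m%d%H%M%S")
    return f"{prefix}{timestamp}/{urir}"


print(guess_urim("http://www.fccn.pt/", "Sun, 13 Oct 1996 14:56:50 GMT"))
```

Any of these guesses (prefix, timestamp layout, replay modifiers such as `mp_`) can be wrong for a given archive, which is why clients rely on the TimeMap listing URI-Ms directly.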

Daniel Bicho

Nov 19, 2019, 7:39:44 AM
to Memento Development
We didn't intend it at all! It was a regression that we hadn't detected.
Thanks for warning us about it.

It seems that this behavior occurs in all 2.0+ releases.
The good news is that in version 2.4 rc2 this problem seems to be solved!

Not sure yet how to proceed. We have some replay issues that pywb2 solves.

So, at the moment, the TimeTravel client cannot handle this, right?


Ilya Kreymer

Nov 19, 2019, 10:34:07 AM
to memen...@googlegroups.com
Hi,

Yes, I can confirm that this has been fixed in the upcoming 2.4.0 release, and was originally brought up as part of https://github.com/ukwa/ukwa-pywb/issues/37
and fixed in this commit.

Daniel, can you update to the latest 2.4.rc2 branch? Happy to discuss off-list.

If needed, I could also make a patch for the existing 2.3.x release, if that'd be easier.

Best,
Ilya




Martin Klein

Nov 19, 2019, 10:37:57 AM
to memen...@googlegroups.com
Thanks for the update and the fix, Ilya!

cheers
M

Sawood Alam

Nov 19, 2019, 10:39:09 AM
to memento-dev
What is the preferred TimeGate entrypoint? I need to update MemGator as well.

Best,

--
Sawood Alam
PhD Candidate
Old Dominion University
Norfolk, Virginia - 23529



Daniel Bicho

Nov 20, 2019, 9:50:37 AM
to Memento Development
Hey,

A patch for the existing 2.3.x release would be very welcome. It would be a much cleaner solution for us.
I am not sure yet how many changes we would need to make to start using 2.4!

Martin Klein

Nov 20, 2019, 11:00:58 AM
to memen...@googlegroups.com
Hi Daniel,

I just noticed that I never answered your question:
No, no TimeMap-consuming client (not only the TimeTravel service) can handle these timemaps as they are non-compliant with the Memento specification. IMHO, building a temporary workaround is not a good solution as it opens the door to having to accommodate many more bad implementations and that means the end of interoperability. So the bottom line is this needs to be addressed in pywb (as I think Ilya acknowledged), there is no other solution.
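
Rather than working around such a TimeMap, a client could detect it and report it. The Python sketch below flags rel="memento" targets that are listed with more than one datetime, which is what happens when URI-Rs are emitted in place of URI-Ms. This check is only a heuristic assumed for this example, not a full conformance test, and `targets_with_many_datetimes` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches one link-value: <target> followed by ;name="value" attributes.
LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
ATTRIBUTE = re.compile(r'([\w-]+)="([^"]*)"')


def targets_with_many_datetimes(timemap_text):
    """Return memento targets that are listed with more than one datetime."""
    datetimes = defaultdict(set)
    for match in LINK_VALUE.finditer(timemap_text):
        target, raw_attrs = match.groups()
        attrs = dict(ATTRIBUTE.findall(raw_attrs))
        if "memento" in attrs.get("rel", "").split():
            datetimes[target].add(attrs.get("datetime"))
    return sorted(t for t, seen in datetimes.items() if len(seen) > 1)


# Two entries taken from the TimeMap quoted earlier in this thread.
broken = (
    '<https://fccn.pt>; rel="original",\n'
    '<http://www.fccn.pt/>; rel="memento"; '
    'datetime="Sun, 13 Oct 1996 14:56:50 GMT"; collection="$root",\n'
    '<http://www.fccn.pt/>; rel="memento"; '
    'datetime="Wed, 10 Dec 1997 20:21:37 GMT"; collection="$root",\n'
)

print(targets_with_many_datetimes(broken))
```

An empty result does not prove a TimeMap is compliant; it only means this particular symptom is absent.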

cheers
M


Daniel Bicho

Dec 9, 2019, 9:21:24 AM
to Memento Development
Hi Martin, Sawood, Ilya,

Arquivo.pt now has pywb 2.4 online.

The TimeMaps should be okay now!

Can you please check on your side whether everything is back to normal and you are able to fetch information through the Arquivo.pt Memento API?



Sawood Alam

Dec 9, 2019, 10:01:46 AM
to memento-dev
Thanks Daniel for the update. I will update MemGator to make use of these new TimeMap and TimeGate endpoints.

Things look good, except for a non-standard link attribute called "collection", which I know was added to PyWB recently. Its value is set to "$root". Perhaps PyWB needs to be fixed to not include this attribute when there are no named collections.

$ curl -is "https://arquivo.pt/wayback/timemap/link/https://fccn.pt" | head -15
HTTP/1.1 200 OK
Date: Mon, 09 Dec 2019 14:55:57 GMT
Server: Apache
Content-Type: application/link-format
Content-Length: 171940
Link: <https://fccn.pt>; rel="original", <https://arquivo.pt/wayback/https://fccn.pt>; rel="timegate", <https://arquivo.pt/wayback/timemap/link/https://fccn.pt>; rel="timemap"; type="application/link-format"
Vary: accept-datetime
Cache-Control: max-age=300, public, must-revalidate
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: X-Requested-With

<https://arquivo.pt/wayback/timemap/link/https://fccn.pt>; rel="self"; type="application/link-format"; from="Sun, 13 Oct 1996 14:56:50 GMT",

<https://fccn.pt>; rel="original",
<https://arquivo.pt/wayback/19961013145650mp_/http://www.fccn.pt/>; rel="memento"; datetime="Sun, 13 Oct 1996 14:56:50 GMT"; collection="$root",
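
A quick way to spot such extras is to compare each link's attributes against the ones a client expects. The Python sketch below uses an allow-list (`rel`, `datetime`, `type`, `from`, `until`, `license`) that is an assumption for this example rather than an authoritative registry, and `unexpected_attributes` is a hypothetical helper name.

```python
import re

# Matches one link-value: <target> followed by ;name="value" attributes.
LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
ATTRIBUTE = re.compile(r'([\w-]+)="([^"]*)"')

# Assumed allow-list for this sketch, not an authoritative registry.
EXPECTED = {"rel", "datetime", "type", "from", "until", "license"}


def unexpected_attributes(timemap_text):
    """Map each link target to the attribute names outside EXPECTED."""
    report = {}
    for match in LINK_VALUE.finditer(timemap_text):
        target, raw_attrs = match.groups()
        names = {name for name, _ in ATTRIBUTE.findall(raw_attrs)}
        extras = sorted(names - EXPECTED)
        if extras:
            report[target] = extras
    return report


sample = (
    '<https://fccn.pt>; rel="original",\n'
    '<https://arquivo.pt/wayback/19961013145650mp_/http://www.fccn.pt/>; '
    'rel="memento"; datetime="Sun, 13 Oct 1996 14:56:50 GMT"; collection="$root",\n'
)

print(unexpected_attributes(sample))
```

Clients that simply ignore unknown attributes are unaffected, so this is more of a cleanliness check than an interoperability one.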


Best,

--
Sawood Alam
PhD Candidate
Old Dominion University
Norfolk, Virginia - 23529

