Robustlinks API tries to contact localhost


Patrick Hochstenbach

Sep 30, 2021, 9:40:03 AM
to Memento Development
Hi

When I send the RobustLinks API a URL like http://localhost/, it correctly responds with an HTTP 400 BAD REQUEST. But other variations of illegal local URLs produce a Python traceback and an HTTP 503 SERVICE UNAVAILABLE from the server.

For example, try:

https://localhost (https version)
http://localhost:2000 (other ports)
...

I don't mind getting an error code back (that is the correct response). But a 503 causes my automated agents to retry the faulty URLs over and over. A 400 would be better. The RobustLinks server should never try to contact localhost in any form.
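
To illustrate why the status code matters to an automated client, here is a minimal sketch of a retry policy. The function name `should_retry` and the set `RETRYABLE_4XX` are hypothetical and only stand in for whatever logic an agent actually uses; the point is that a 503 looks like a transient failure, while a 400 says the request itself is wrong.

```python
# Hypothetical retry policy for an HTTP client; not taken from any real agent.
# 408 (Request Timeout) and 429 (Too Many Requests) are the usual 4xx
# exceptions that are still worth retrying.
RETRYABLE_4XX = {408, 429}


def should_retry(status: int) -> bool:
    """Return True when a response status suggests a transient failure."""
    # 5xx: the server failed and may recover, so agents typically retry.
    if 500 <= status <= 599:
        return True
    # 4xx: the request itself is at fault; resending it unchanged will not help,
    # apart from the few codes listed above.
    return status in RETRYABLE_4XX


print(should_retry(503))  # a 503 is treated as retryable
print(should_retry(400))  # a 400 is not
```

With a policy like this, a 503 for https://localhost gets retried indefinitely, while a 400 would stop the agent after the first attempt.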

Cheers
Patrick

Shawn Jones

Sep 30, 2021, 5:50:05 PM
to Memento Development
Patrick,

Thank you so much for identifying this behavior. We have diagnosed the problem and updated the code to address this issue. Now, whenever the Robust Links API cannot resolve the domain of the submitted URI (e.g., https://www.bbc.co.uk/) to a global registered domain name (e.g., bbc.co.uk), it will return a 403. This change helps us address cases for localhost and 127.0.0.1 as well as other "private" or "local" host names (e.g., http://nightshade:8080/index.html or http://192.168.1.1:5000/dir/index.php) that likely cannot be preserved in a public web archive.
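
A rough sketch of this kind of check follows. It is not the actual Robust Links code: the function names, the `LOCAL_SUFFIXES` list, and the "host name must contain a dot" heuristic are all assumptions made for illustration, using only the Python standard library. A real service would more likely consult the Public Suffix List to find the registered domain.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical suffixes treated as local; not taken from Robust Links.
LOCAL_SUFFIXES = (".localhost", ".local", ".internal")


def is_publicly_addressable(uri: str) -> bool:
    """Heuristically decide whether a URI points at a public host."""
    try:
        parts = urlsplit(uri)
    except ValueError:
        return False  # malformed URI, e.g. an unbalanced IPv6 bracket
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname  # lowercased, with port and brackets stripped
    if not host:
        return False
    try:
        # IP literals: accept only globally routable addresses.
        return ipaddress.ip_address(host).is_global
    except ValueError:
        pass  # not an IP literal, so treat it as a host name
    if host == "localhost" or host.endswith(LOCAL_SUFFIXES):
        return False
    # A bare name such as "nightshade" cannot have a registered domain.
    return "." in host.rstrip(".")


def status_for(uri: str) -> int:
    """Return 403 for non-public hosts (as described above), else 200."""
    return 200 if is_publicly_addressable(uri) else 403


for uri in ("https://www.bbc.co.uk/", "https://localhost",
            "http://localhost:2000", "http://nightshade:8080/index.html",
            "http://192.168.1.1:5000/dir/index.php"):
    print(status_for(uri), uri)
```

Because the check looks only at the host part of the URI, the scheme and port no longer matter: https://localhost and http://localhost:2000 are rejected the same way as http://localhost/.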

Again, I really appreciate you bringing this to our attention.

Cheers,

Shawn
