curl shell scripting for error on invalid URL


Collection PhotoGraphex

Jul 1, 2020, 11:44:42 PM7/1/20
to superca...@googlegroups.com
Hello, 

Maybe someone with more experience with shell() scripting and SuperCard can help, as I am having difficulty with this basic command:

shell(merge("curl -fo `[[hfsToPosix(filePath)]]` `[[imageURL]]`"))

It works great in the right circumstances, but creates a dummy empty document if the URL is invalid.

My questions: 

1- Why does it create a document with the intended name, when in reality it is just a text file containing an HTTP response, for example:

HTTP/1.1 302 Found
Date: Thu, 02 Jul 2020 03:27:53 GMT
Server: Apache
Content-Type: text/html; charset=iso-8859-1

Well, for this particular Apache server, when there is an HTTP 404 Not Found, the server redirects to an error page!
Normally, it should return a 500 error at </500.html>.

2- How can I intercept the result so I can acknowledge the error?

Help is appreciated!

Regards

André

codegreen

Jul 2, 2020, 3:53:29 AM7/2/20
to SuperCard Discussion
Adding the -f option to curl is supposed to catch 404 errors, but redirects and authentication errors may still slip through...

On Linux you could use wget --spider (which usually detects these) to pre-flight URLs, but on OS X (unless you want to DL and install wget) we're stuck with curl.

This might do it, but no promises:

  shell(merge("curl -o /dev/null -sI -r0-0 `[[imageURL]]` && curl -o `[[hfsToPosix(filePath)]]` `[[imageURL]]`"))  


-Mark

Collection PhotoGraphex

Jul 2, 2020, 7:23:08 PM7/2/20
to SuperCard Discussion
Hello Mark, thanks for your omnipresent helpfulness!
 

  shell(merge("curl -o /dev/null -sI -r0-0 `[[imageURL]]` && curl -o `[[hfsToPosix(filePath)]]` `[[imageURL]]`"))  


So far, I can't notice a difference in the tests I am doing with this new script.

I've tried to reconfigure the web server to avoid the specific redirection, and I also tried to mimic errors on other web sites, with the same result. I guess whatever server I address, I am always going to get some sort of HTML reply anyway, and this reply is going to be saved under the original document name from the invalid URL!

I wonder if there is a way to get some results or feedback from the shell? 

At least to be able to get the real name of the downloaded document?

I went through the curl manual without much success either!

Otherwise, I'll look for a temporary workaround for this issue by trying to detect and avoid these invalid URLs beforehand, or by getting info on the content (or reading it) after saving!

Regards

André

codegreen

Jul 2, 2020, 9:27:03 PM7/2/20
to SuperCard Discussion
What do you get back from your server with each of these?

shell(merge("curl -s -o /dev/null -w `%{http_code}` `[[aGoodURL]]`"))

shell(merge("curl -s -o /dev/null -w `%{http_code}` `[[aBogusURL]]`"))


-Mark

Collection PhotoGraphex

Jul 2, 2020, 9:48:15 PM7/2/20
to superca...@googlegroups.com
Hello, 


shell(merge("curl -s -o /dev/null -w `%{http_code}` `[[aGoodURL]]`"))

shell(merge("curl -s -o /dev/null -w `%{http_code}` `[[aBogusURL]]`"))


You are getting warmer, it does return the HTTP status code! Bravo!
But no downloaded file!

I suppose you are suggesting I should check the HTTP code first and then download the document only if I get a 200.

That's great!

I am trying it right away!

André

codegreen

Jul 2, 2020, 10:15:59 PM7/2/20
to SuperCard Discussion
I suppose you are suggesting I should check the HTTP code first and then download the document only if I get a 200.

Exactly. 
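
Something along these lines might work as a starting point -- completely untested, and assuming imageURL and filePath are already set up the way you have them:

  -- untested sketch: grab the status code first, download only on 200
  put shell(merge("curl -s -o /dev/null -w `%{http_code}` `[[imageURL]]`")) into theCode
  if theCode is 200 then
    get shell(merge("curl -fo `[[hfsToPosix(filePath)]]` `[[imageURL]]`"))
  else
    answer "Skipping this URL, server returned" && theCode
  end if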

Collection PhotoGraphex

Jul 2, 2020, 11:10:35 PM7/2/20
to SuperCard Discussion
I'm trying a slight variation suggested on StackExchange, introducing the -I option to improve the response time:

shell(merge("curl -s -o /dev/null -I -w `%{http_code}` `[[aGoodURL]]`"))

For now, I am just looking for the code to get the size of the document in advance. While testing I noticed I had JPEG previews that are over 100 MB, way too big for slower computers; I will just skip them until I find an efficient way to downsize the download.

André


codegreen

Jul 2, 2020, 11:22:25 PM7/2/20
to SuperCard Discussion
I included that in a previous example for the same reason, but didn't want to risk it messing with the http response code for this test.

One mystery at a time... 

;-)
-Mark

Collection PhotoGraphex

Jul 2, 2020, 11:38:42 PM7/2/20
to SuperCard Discussion

I included that in a previous example for the same reason, but didn't want to risk it messing with the http response code for this test.

Yes! In your June 28 message you propose a script for scaling pictures. 

One mystery at a time... 

That's why, for the time being, I just want to skip those overly large JPEG documents.

But the resizing principle is giving me a lot of ideas for future uses!

I found this example in a post on StackExchange:
$ URL="http://api.twitter.com/1/statuses/public_timeline.json"
$ curl -sI $URL | grep -i Content-Length
Content-Length: 134
I just need to figure out how to transpose it for a SuperCard shell() function.  

André

Collection PhotoGraphex

Jul 3, 2020, 10:14:56 PM7/3/20
to superca...@googlegroups.com
To the shell scripters,

It seems this script will return the byte size of the document at the provided URL:

shell(merge("curl -sI `[[TheURL]]` | grep -i Content-Length | awk '{print $2}'"))


Or, for convenience, the numerical megabyte size directly:


shell(merge("curl -sI `[[TheURL]]` | grep -i Content-Length | awk '{print $2/1024/1024}'"))


This is my amateurish take, someone more competent may improve it!
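
For example, I imagine wrapping it like this (untested, and assuming TheURL and filePath are already set, with a 100 MB cutoff):

  -- untested sketch: skip anything the server reports as larger than 100 MB
  put shell(merge("curl -sI `[[TheURL]]` | grep -i Content-Length | awk '{print $2/1024/1024}'")) into sizeMB
  if sizeMB is a number and sizeMB < 100 then
    get shell(merge("curl -fo `[[hfsToPosix(filePath)]]` `[[TheURL]]`"))
  else
    answer "Skipping this URL, reported size is" && sizeMB && "MB"
  end if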


