Hello,
You can use wget --spider :
$ wget --spider URI && command_if_success
>On 2008-08-04, Dave Farrance <DaveFa...@OMiTTHiSyahooANDTHiS.co.uk> wrote:
>> How do I test if a file exists on the web, where the file can be
>> expressed as either an ftp or http url?
> You can use wget --spider :
>
>$ wget --spider URI && command_if_success
Thanks. That'll do fine.
--
Dave Farrance
Ah, it seems to work fine for http but gives a false positive for ftp:
$ wget -q --spider http://www.mirrorservice.org/invalid && echo YES
$
$ wget -q --spider ftp://ftp.mirrorservice.org/invalid && echo YES
YES
$
$ wget --version
GNU Wget 1.11
--
Dave Farrance
>>>> How do I test if a file exists on the web, where the file can be
>>>> expressed as either an ftp or http url?
>>
>>> You can use wget --spider :
>>>
>>> $ wget --spider URI && command_if_success
>
> Ah, it seems to work fine for http but gives a false positive for ftp:
>
> $ wget -q --spider http://www.mirrorservice.org/invalid && echo YES
> $
> $ wget -q --spider ftp://ftp.mirrorservice.org/invalid && echo YES
> YES
Don't use the --spider option with ftp.
>In news:5gne94l6s61vujp6c...@4ax.com,
>Dave Farrance <DaveFa...@OMiTTHiSyahooANDTHiS.co.uk> typed:
>
>> Ah, it seems to work fine for http but gives a false positive for ftp:
>>
>> $ wget -q --spider http://www.mirrorservice.org/invalid && echo YES
>> $
>> $ wget -q --spider ftp://ftp.mirrorservice.org/invalid && echo YES
>> YES
>
>Don't use the --spider option with ftp.
That'd download the file if the URL is valid, which is not what I want.
I've got a script that takes the URL of a very large file as a parameter,
sleeps until midnight, and then downloads the file. But I want to put in
a test to check that the URL is valid when the script is first run.
--
Dave Farrance
>>> Ah, it seems to work fine for http but gives a false positive for
>>> ftp:
>>>
>>> $ wget -q --spider http://www.mirrorservice.org/invalid && echo YES
>>> $
>>> $ wget -q --spider ftp://ftp.mirrorservice.org/invalid && echo YES
>>> YES
>>
>> Don't use the --spider option with ftp.
>
> That'd download the file if the URL is valid, which is not what I
> want.
curl -sl ftp://ftp.mirrorservice.org/ | grep invalid && echo YES
>curl -sl ftp://ftp.mirrorservice.org/ | grep invalid && echo YES
The script would have to parse and split the URL, and handle http
differently, so I hope there's a simpler way.
--
Dave Farrance
It is simple:
url=ftp://ftp.mirrorservice.org/invalid
file=${url##*/}
host=${url%"$file"}
curl -sl "$host" | grep "$file" && echo YES
--
Chris F.A. Johnson, author <http://cfaj.freeshell.org/shell/>
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
===== My code in this post, if any, assumes the POSIX locale
===== and is released under the GNU General Public Licence
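The two parameter expansions above split any URL at its last slash without calling any external command; a minimal sketch (the path /pub/invalid is just an illustrative URL):

```shell
url=ftp://ftp.mirrorservice.org/pub/invalid
file=${url##*/}      # strip longest match of "*/" from the front -> "invalid"
host=${url%"$file"}  # strip the filename from the end -> "ftp://ftp.mirrorservice.org/pub/"
echo "$host $file"   # -> ftp://ftp.mirrorservice.org/pub/ invalid
```

Quoting `"$file"` inside the `%` expansion keeps any glob characters in the filename from being treated as a pattern.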
>> curl -sl ftp://ftp.mirrorservice.org/ | grep invalid && echo YES
>
> The script would have to parse and split the URL, and handle http
> differently, so I hope there's a simpler way.
if [ `echo "$URL" | grep '^ftp'` ]; then
    filename=`basename "$URL"`
    URL=`dirname "$URL"`
    curl -sl "$URL/" | grep -x "$filename" && echo YES ftp
else
    wget -q --spider "$URL" && echo YES http
fi
> curl -sl "$host" | grep "$file" && echo YES
> How do I test if a file exists on the web, where the file can be
> expressed as either an ftp or http url?
>
Functions, test and output:
prompt$ cat t
#-----------------
#FTP
f(){
wget --server-response "$1" 2>&1|while read;do
[ "${REPLY:0:4}" = "213 " ]&&killall wget&&return
done
}
URL=ftp://ftp.gnupg.org/gcrypt/gnupg/gnupg-2.0.9.tar.bz
t=$(date +%s)
for e in 1 2 3;do
F=$URL$e
f $F
echo \$?=$? $(($(date +%s)-$t))s $F
done
#---------------------------
#HTTP bash
f(){
H=${1#*/*/}
P=${H#*/}
exec 3<>/dev/tcp/${H%%/*}/80
printf "GET /$P HTTP/1.0\r\n\r\n">&3
read<&3;REPLY=${REPLY%?}
exec 3<&-
C=${REPLY#* }
[[ ' 200 302 ' =~ "${C% *}" ]]
}
URL=http://www.google.com/intl/en_ALL/images/logo.gi
for e in e f g;do
F=$URL$e
f $F
echo \$?=$? $(($(date +%s)-$t))s $C $REPLY: $P
done
#-------------------------------
prompt$ . ./t
$?=1 4s ftp://ftp.gnupg.org/gcrypt/gnupg/gnupg-2.0.9.tar.bz1
$?=0 7s ftp://ftp.gnupg.org/gcrypt/gnupg/gnupg-2.0.9.tar.bz2
$?=1 11s ftp://ftp.gnupg.org/gcrypt/gnupg/gnupg-2.0.9.tar.bz3
$?=1 11s 404 Not Found HTTP/1.0 404 Not Found: intl/en_ALL/images/logo.gie
$?=0 11s 200 OK HTTP/1.0 200 OK: intl/en_ALL/images/logo.gif
$?=1 12s 404 Not Found HTTP/1.0 404 Not Found: intl/en_ALL/images/logo.gig
prompt$
>How do I test if a file exists on the web, where the file can be
>expressed as either an ftp or http url?
Thanks to everybody who replied.
I think the best check for my purpose is to see whether a file length is
reported -- and to display it as an extra check, since I'll normally know
the magnitude of the file I intend to download, such as a DVD iso.
"wget -S --spider $url" outputs the file size prefixed with "213" for ftp
urls or "Content-Length:" for http urls.
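That scraping step can be exercised offline on sample header lines (the two formats below are the assumed shapes of wget's http and ftp responses):

```shell
# Pull the length out of either header style:
#   "  Content-Length: N"  (http)   or   "213 N"  (ftp)
extract() {
  grep -E '^ *Content-Length:|^213 ' | tail -n1 |
    sed 's/^ *Content-Length: *//;s/^213 *//'
}

printf '%s\n' '  Content-Length: 4700372992' | extract   # -> 4700372992
printf '%s\n' '213 4700372992'               | extract   # -> 4700372992
```

Anchoring both sed substitutions with `^` matters: an unanchored `s/213//` would mangle an http length that happens to contain the digits 213.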
The script below schedules a valid url to be downloaded from midnight:
#!/bin/bash
url=$1
len=$(wget -S --spider "$url" 2>&1 | \
  grep -E '^ *Content-Length:|^213 ' | tail -n1 | \
  sed 's/^ *Content-Length: *//;s/^213 *//')
[[ -z "$len" ]] && echo "Invalid URL" && exit 1
echo "File length = $len"
secs=$(($(date "+86400-%-H*3600-%-M*60-%-S")))
{ sleep $secs; wget "$url"; }&
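A note on the date(1) trick in the sleep line: the format string makes date print an arithmetic expression, which $(( )) then evaluates to the seconds left before local midnight. The `-` (no zero padding) modifier is a GNU date extension, and it is essential here -- a padded "08" or "09" would be rejected by the shell as an invalid octal constant. A minimal sketch:

```shell
# GNU date emits e.g. "86400-23*3600-59*60-30"; the shell evaluates it.
expr=$(date "+86400-%-H*3600-%-M*60-%-S")
secs=$((expr))
echo "seconds until midnight: $secs"
```

At 23:59:59 this yields 1; at exactly 00:00:00 it yields 86400, so the result is always in the range 1..86400.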