I'm trying to get a whole bunch of HTML source from a website (which I
do not own, but am merely trying to read):
https://www.blahblah.com/cgi-bin/Num=xxxx
where I have a list of numbers xxxx in a file.
I would love to run lynx and simply redirect the html dump to a file,
but alas, I do not have access to lynx either. So, perhaps I can
write a PHP script to do this?
readfile( "https://www.blahblah.com/cgi-bin/Num=xxxx" ) does not work:
Warning: readfile("https://www.blahblah.com/cgi-bin/Num=xxxx") - No
such file or directory
What am I doing wrong?
Thanks,
Mike Darrett
hey mike - maybe this will give you a clue:
http://www.php.net/manual/en/function.file.php
all you do is stick the html source in an array - each element holds a
line - then you just run through the array and print it.
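To make the file() suggestion concrete, here is a minimal sketch. It assumes the URL wrappers are enabled so that file() accepts a URL, and it is written for a current PHP version. The helper name fetch_lines is made up for illustration, and the URL is just the placeholder from the question.

```php
<?php
// Hypothetical helper: read a file or URL into an array, one element per line.
function fetch_lines(string $source): array
{
    // file() returns false on failure; @ hides the warning so the caller
    // can simply check for an empty array.
    $lines = @file($source);
    return $lines === false ? [] : $lines;
}

// Usage (placeholder URL from the question, not a real endpoint):
// foreach (fetch_lines('https://www.blahblah.com/cgi-bin/Num=xxxx') as $line) {
//     echo $line;
// }
```

The same helper works on a local path, which makes it easy to try out before pointing it at the remote site.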
How about wget?
> readfile( "https://www.blahblah.com/cgi-bin/Num=xxxx" ) does not work:
>
> Warning: readfile("https://www.blahblah.com/cgi-bin/Num=xxxx") - No
> such file or directory
>
> What am I doing wrong?
Can you open those URLs in your browser? Is authorization of any kind
required there? (The reason I am asking about authorization is that
you have a secure URL...) If no authorization is needed, try using
sockets:
$host = 'www.blahblah.com';
$path = '/cgi-bin/Num=xxxx';
// Port 443 is the HTTPS port; the ssl:// prefix (needs OpenSSL support in PHP)
// is what actually encrypts the connection.
$remote = fsockopen('ssl://' . $host, 443, $errno, $errstr, 30);
if ($remote) {
    $local = fopen('xxxx.htm', 'w');
    fputs($remote, "GET " . $path . " HTTP/1.0\r\nHost: " . $host . "\r\n\r\n");
    // Copy the response (headers included) to the local file.
    while (!feof($remote)) {
        fputs($local, fgets($remote, 10240));
    }
    fclose($local);
    fclose($remote);
} else {
    echo "Oops... Can't read it anyway... ($errno: $errstr)";
}
Cheers,
NC
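One thing to watch with the socket approach: the loop copies everything the server sends, so the status line and response headers end up in xxxx.htm ahead of the HTML. A rough sketch of stripping them, assuming the whole response has been read into a string first; split_http_response is a hypothetical helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper: separate the HTTP header block from the body.
function split_http_response(string $raw): array
{
    // The headers end at the first blank line (CRLF CRLF); limit the split
    // to 2 parts so blank lines inside the body are left alone.
    $parts = explode("\r\n\r\n", $raw, 2);
    return [
        'headers' => $parts[0],
        'body'    => $parts[1] ?? '',
    ];
}

// Usage: $page = split_http_response($everythingReadFromSocket)['body'];
```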
I'll ignore the rights or wrongs of what you are doing for a moment and point
out the problems you're likely to face:
- I notice that you are trying to read a secure page (the s in https://
tells me this)... this might be part of your problem...
- I notice that the files you want to read reside under a /cgi-bin/
directory - it's possible that the author has some tests to check
which browser you are using to visit - your script is unlikely to
deliver those header variables and so might fail. In addition, there might be
cookies, which your script is unlikely to process, and thus the author of the
server programs could have fixed it so that what you see in your browser is
different from what a non-browser connection gets...
Technically... what you are doing should work so I would first try what you
want to do on an insecure server... make sure that works first before you
start working on a secure server...
anyway... laters
randelld
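If the browser check or cookies do turn out to matter, PHP's stream contexts let a script send those headers itself. A minimal sketch, assuming the URL wrappers are available; the helper name browser_like_context and the User-Agent and cookie values in the usage lines are invented for illustration.

```php
<?php
// Hypothetical helper: build a stream context that sends browser-like headers.
function browser_like_context(string $userAgent, string $cookie = '')
{
    $http = ['method' => 'GET', 'user_agent' => $userAgent];
    if ($cookie !== '') {
        // Extra request headers go in 'header', each terminated by CRLF.
        $http['header'] = "Cookie: $cookie\r\n";
    }
    // HTTP context options live under the 'http' key, even for https:// URLs.
    return stream_context_create(['http' => $http]);
}

// Usage (made-up values):
// $ctx  = browser_like_context('Mozilla/5.0 (compatible)', 'session=abc');
// $html = file_get_contents('https://www.blahblah.com/cgi-bin/Num=xxxx', false, $ctx);
```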
From the PHP manual:
*Tip:* You can use a URL as a filename with this function if the _fopen
wrappers_ have been enabled. See _fopen()_ for more details on how to
specify the filename and _Appendix I_ for a list of supported URL protocols.
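So the wrappers could be disabled (the allow_url_fopen setting in php.ini),
or PHP could have been built without the OpenSSL support that the https://
wrapper needs.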
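A quick way to check this on the machine in question; a sketch only, with url_wrapper_status being a made-up function name.

```php
<?php
// Report whether this PHP build can open https:// URLs with fopen()/readfile().
function url_wrapper_status(): array
{
    return [
        // allow_url_fopen is the php.ini switch for URL-aware fopen wrappers.
        'allow_url_fopen' => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
        // The https wrapper is only registered when OpenSSL support is available.
        'https_wrapper'   => in_array('https', stream_get_wrappers(), true),
    ];
}

// Usage: var_dump(url_wrapper_status());
```

If either value comes back false, readfile() on an https:// URL cannot work no matter how the call is written.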
I did try this; I get the "Oops" error message.
Since it's not exactly a file, I do wonder whether I can open the
website as a "file"...???
Don't worry too much about the legals of what I'm doing. This is
actually the website of one of my distributors; they're being
difficult when I ask them for a price list for their software. (You
would think they'd want to help me as much as possible, since I'm a
reseller... ah well.) Turns out I can grab the prices directly from
the https://...
Mike
Most systems have cURL installed on them now. If yours does not, you can
get it from curl.haxx.se
Write a script that calls curl through shell_exec()...
/////////////////////////////////////////////////
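$aNums = array('123', '125', '1234', '1456');
foreach ($aNums as $sNum)
{
    // Keep the command on one line; escapeshellarg() quotes the URL and
    // the output file name safely for the shell.
    $sUrl = 'https://www.blahblah.com/cgi-bin/Num=' . $sNum;
    $sCmd = 'curl ' . escapeshellarg($sUrl) . ' > ' . escapeshellarg($sNum . '.html');
    shell_exec($sCmd);
}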
////////////////////////////////////////////////
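Since the original question says the numbers live in a file rather than a hard-coded array, here is a sketch of loading them. It assumes one number per line; read_numbers and the file name nums.txt are hypothetical.

```php
<?php
// Hypothetical helper: read a list of numbers, one per line, from a text file.
function read_numbers(string $listFile): array
{
    if (!is_readable($listFile)) {
        return [];
    }
    $lines = file($listFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    $nums = [];
    foreach ($lines ?: [] as $line) {
        $line = trim($line);
        // Keep only all-digit entries so nothing unexpected reaches the URL
        // or the output file name.
        if (ctype_digit($line)) {
            $nums[] = $line;
        }
    }
    return $nums;
}

// Usage (nums.txt is a made-up file name):
// foreach (read_numbers('nums.txt') as $sNum) {
//     $sUrl = 'https://www.blahblah.com/cgi-bin/Num=' . $sNum;
//     // fetch $sUrl and save it as "$sNum.html"
// }
```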
mike-...@darrettenterprises.com (Mike Darrett) wrote in message news:<d945119c.03050...@posting.google.com>...