
Reading and saving URL content


ki...@yahoo.com

Jan 3, 2006, 6:40:18 AM

In my current PHP project, I have to read pages from a website and
parse the data. In a web browser, I would go through the
following steps:

Step 1: specify the state to display by using:
www.foo.com/selectState.asp?state=ca
and it will redirect to
www.foo.com/list.asp?city=sanjose&page=1

Step 2: www.foo.com/list.asp?city=sanjose&page=2
Step 3: www.foo.com/list.asp?city=sanjose&page=3
...
and so on.

I can use fopen($url, 'r') and fgets() to get the contents and save
all the data to a file. That's the easy part. But the data returned
at Step 1 is not the correct page. It looks like the site's custom
page-not-found error. I guess the redirection is causing that
problem.
If I go on and read the content at Step 2, I get an empty page: it seems
the 'state' session variable was never set in Step 1.

Could someone help me with this problem? Thanks,

J.O. Aho

Jan 3, 2006, 8:37:36 AM

You could use an external tool to fetch the pages in question. wget has
been around for many years and has proven to be an excellent tool
for this. It lets you fake headers in case the site you are getting the pages
from requires a "reference page" (the Referer header) to serve some of them.
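
A rough sketch of how wget could be driven from PHP, assuming wget is installed and on the PATH. The function name wget_command() and the file names in the usage note are made up for illustration. The flags --referer, --load-cookies, --save-cookies, --keep-session-cookies and -O are standard wget options; wget follows redirects on its own.

```php
<?php
// Hypothetical helper: builds a wget command line that sends a Referer
// header and keeps session cookies in a file between calls.
function wget_command(string $url, string $outFile, string $cookieFile, string $referer): string
{
    return sprintf(
        'wget -q --referer=%s --load-cookies %s --save-cookies %s --keep-session-cookies -O %s %s',
        escapeshellarg($referer),    // faked "reference page"
        escapeshellarg($cookieFile), // cookies from earlier requests
        escapeshellarg($cookieFile), // write updated cookies back
        escapeshellarg($outFile),    // where the page body is saved
        escapeshellarg($url)
    );
}
```

Something like exec(wget_command('http://www.foo.com/selectState.asp?state=ca', 'page1.html', 'cookies.txt', 'http://www.foo.com/')) would then fetch Step 1, and the same cookie file can be reused for the following pages. The http:// scheme is an assumption here.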


//Aho

johan

Jan 9, 2006, 9:02:55 AM
On Tue, 03 Jan 2006 03:40:18 -0800, kinh wrote:

>
> I can use fopen($url, 'r') and fgets() to get the contents and read all
> the data back into a file. That's the easy part. But the data returned is
> not the correct one at Step 1. It returns something like the custom
> page-not-found error. I guess the redirection causing that problem.
> If I continue to read the content as Step 2, I get an empty page: seems
> like the 'state' session variable has not been defined in Step 1.
>
> Could someone help me with this problem ? Thanks,

See the cURL functions:
http://php.belnet.be/manual/en/ref.curl.php
or on your favorite mirror, of course.
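
A minimal sketch of how the cURL approach might look for the original question, assuming the site keeps the chosen state in a session cookie and that plain http:// URLs work (both are guesses; foo.com is the poster's placeholder). The helper names list_url(), session_curl_options(), fetch_page() and fetch_city_pages() are made up for illustration. CURLOPT_FOLLOWLOCATION makes cURL follow the redirect from Step 1, and CURLOPT_COOKIEFILE / CURLOPT_COOKIEJAR keep the cookies between requests.

```php
<?php
// Hypothetical helper: URL of one result page (Step 2, Step 3, ...).
function list_url(string $city, int $page): string
{
    $query = http_build_query(['city' => $city, 'page' => $page], '', '&');
    return 'http://www.foo.com/list.asp?' . $query;
}

// Options for a fetch that follows redirects and remembers cookies.
function session_curl_options(string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,        // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,        // follow the Step 1 redirect
        CURLOPT_MAXREDIRS      => 5,           // guard against redirect loops
        CURLOPT_COOKIEFILE     => $cookieFile, // enable cookies, load saved ones
        CURLOPT_COOKIEJAR      => $cookieFile, // save cookies when the handle is freed
    ];
}

// Fetch one URL on an existing handle so the session cookie is reused.
function fetch_page($ch, string $url): string
{
    curl_setopt($ch, CURLOPT_URL, $url);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Fetch failed: ' . curl_error($ch));
    }
    return $body;
}

// Step 1 selects the state (and returns page 1 after the redirect),
// then pages 2..$lastPage are read with the same cookies.
function fetch_city_pages(string $city, int $lastPage, string $cookieFile): array
{
    $ch = curl_init();
    curl_setopt_array($ch, session_curl_options($cookieFile));
    $pages = [1 => fetch_page($ch, 'http://www.foo.com/selectState.asp?state=ca')];
    for ($page = 2; $page <= $lastPage; $page++) {
        $pages[$page] = fetch_page($ch, list_url($city, $page));
    }
    unset($ch); // freeing the handle writes the cookie jar
    return $pages;
}
```

Calling fetch_city_pages('sanjose', 3, tempnam(sys_get_temp_dir(), 'ck')) would then return the three page bodies, ready to be parsed or written to a file. Whether this fixes the empty page depends on the site really using cookies for the session, which the thread does not confirm.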
