Grab data from webpage

Roger Lin

Dec 24, 2002, 1:06:55 AM
I am trying to grab data from one web page automatically.

I can simplify the form as follows:

<form METHOD="POST" ACTION="http://www.theweb.com/getdata.htm">
<input type="hidden" name="cate1" value="65309">
<input type="hidden" name="cate2" value="76171">
<input type="hidden" name="vendor" value="190">
<input type="hidden" name="type" value="AXRE">
<input type="hidden" name="dat_begin" value="0">
<input type="hidden" name="dat_end" value="9">
<input type="hidden" name="dat_count" value="421">
<input type="hidden" name="product" value="FFSA">
<input type="Image" name="Find Next" src="http://www.theweb.com/images/findnext.gif" WIDTH="125" HEIGHT="20" border="0">
</form>

Submitting this form returns the data in HTML format. However, I get
nothing if I combine the fields into a URL like this:

http://www.theweb.com/getdata.htm?cate1=65309&cate2=76171&vendor=190&type=AXRE&dat_begin=0&dat_end=9&dat_count=421&product=FFSA

Is there any way I can grab the data automatically?

TIA

Roger

Roger Lin

Dec 24, 2002, 12:03:16 PM
After searching the web, I solved the problem using tcpTrace.

On Tue, 24 Dec 2002 00:06:55 -0600, Roger Lin <wei...@dynapost.com>
wrote:

Nikolai Chuvakhin

Dec 24, 2002, 1:46:44 PM
Roger Lin <wei...@dynapost.com> wrote in message
news:<83A65D8D4ED483CF.A3B22324...@lp.airnews.net>...

>
> I can simplify the form action as the following:
>
> <form METHOD="POST" ACTION="http://www.theweb.com/getdata.htm">
[inputs skipped]

>
> I could get the data in html format. However, I would get nothing ,
> if I put it together as the following:
>
> http://www.theweb.com/getdata.htm?cate1=65309&cate2=76171&vendor=190&type=AXRE&dat_begin=0&dat_end=9&dat_count=421&product=FFSA

Most likely, this is because the target page expects input via
POST method used by your form, not via the GET method you are
using by including name-value pairs in the URL.
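
To make the difference concrete, here is a minimal sketch of what each
method puts on the wire. The host, path, and field names come from your
form; the shortened field list and the variable names are only for
illustration.

```php
<?php
// Illustration only: a shortened field list from the form in question.
$query = "cate1=65309&cate2=76171&vendor=190";
$host  = "www.theweb.com";
$path  = "/getdata.htm";

// GET: the name-value pairs travel in the request line, after "?".
// There is no request body.
$get_request = "GET " . $path . "?" . $query . " HTTP/1.0\r\n"
             . "Host: " . $host . "\r\n\r\n";

// POST: the request line carries only the path. The pairs travel in
// the body, after the blank line that ends the headers.
$post_request = "POST " . $path . " HTTP/1.0\r\n"
              . "Host: " . $host . "\r\n"
              . "Content-Type: application/x-www-form-urlencoded\r\n"
              . "Content-Length: " . strlen($query) . "\r\n\r\n"
              . $query;

echo $get_request, "\n", $post_request, "\n";
```

A script that reads only the POST body will see no fields at all in the
first request, which would explain the empty result.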

> Is there anyway I can grab the data automatically?

Something like this might help:

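$content = "";
// Name-value pairs from the form. These values contain no reserved
// characters; if yours do, urlencode() each value separately, not the
// whole string (that would also escape the "=" and "&" separators).
$post_query = "cate1=65309&cate2=76171&vendor=190&type=AXRE&dat_begin=0&dat_end=9&dat_count=421&product=FFSA";
$host = "www.theweb.com";
$path = "/getdata.htm";
$fp = fsockopen($host, 80);
if ($fp) {
    // Request line and headers; a blank line ends the header block.
    fputs($fp, "POST ".$path." HTTP/1.0\r\nHost: ".$host."\r\n");
    fputs($fp, "Content-type: application/x-www-form-urlencoded\r\n");
    fputs($fp, "Content-length: ". strlen($post_query) ."\r\n\r\n");
    fputs($fp, $post_query);
    // Read the full response (HTTP headers followed by the HTML).
    while (!feof($fp)) {
        $content .= fgets($fp, 4096);
    }
    fclose($fp);
}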
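
If the field values might contain spaces or other reserved characters,
one option is to let PHP encode each pair for you. This is only a
sketch and has not been tried against the real site: http_build_query()
with an explicit separator needs PHP 5.1.2 or later, and the helper
names encode_fields() and split_response() are made up for this
example. The second helper separates the HTTP headers from the HTML,
since $content above holds both.

```php
<?php
// Hypothetical helper: encode an array of form fields as a POST body.
function encode_fields(array $fields)
{
    // http_build_query() urlencodes each name and value and joins
    // the pairs with the given separator.
    return http_build_query($fields, '', '&');
}

// Hypothetical helper: split a raw HTTP response into headers and body.
function split_response($raw)
{
    $pos = strpos($raw, "\r\n\r\n");
    if ($pos === false) {
        return array($raw, "");   // no blank line found: headers only
    }
    return array(substr($raw, 0, $pos), substr($raw, $pos + 4));
}

$post_query = encode_fields(array(
    "cate1"  => "65309",
    "vendor" => "190",
    "type"   => "AX RE",   // made-up value with a space, to show encoding
));
echo $post_query, "\n";

// Canned response text, so the example runs without a network connection.
list($headers, $body) = split_response(
    "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
);
echo $body, "\n";
```

The encoded string can be sent as $post_query in the same way as above,
with Content-length computed from the encoded string.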

Cheers,
NC