
ASP help


Rizwan

Jun 18, 2006, 6:12:00 AM
Hi all,

Is it possible to fetch data records from another website and store them in a
database dynamically?

I want to fetch data from a yellow pages website that shows records for
different companies, e.g.
http://www.website.com/data.aspx?CompanyID=1&DirectoryID=2

I want a script that goes to this URL, copies the data, then moves on to the
next page (CompanyID=3&DirectoryID=2) and repeats the procedure.

That's it. I would be very grateful to anyone who can do this or help me do it.

Thanks.

Bob Barrows [MVP]

Jun 18, 2006, 8:43:58 AM
Rizwan wrote:
> hi there to all......
>
> Is it possible to fetch data records from other website and store in a
> database dynamically....
>
>
> I mean to say i want to fetch data from a yellow pages website .......
> showing records of different companies.
> i.e, http://www.website.com/data.aspx?CompanyID=1&DirectoryID=2
>
>
> So, now I want a script which goes on this path and copy data after
> that go on next page which is CompanyID=3&DirectoryID=2 and follow the
> procedure.........
>
>
http://www.aspfaq.com/show.asp?id=2173
--
Microsoft MVP - ASP/ASP.NET
Please reply to the newsgroup. This email account is my spam trap so I
don't check it very often. If you must reply off-line, then remove the
"NO SPAM"


Jeff

Jun 18, 2006, 7:30:27 PM
Great question. I went to the link provided above, but I have a follow-up.

Is there any way to break the page down into the actual records instead of
showing everything that is in the HTML?

Or is this a stupid question, since everything on the client side IS just
HTML? I think I answered my own question, didn't I?


"Rizwan" <rizwa...@gmail.com> wrote in message
news:1150625519.9...@i40g2000cwc.googlegroups.com...

Mike Brind

Jun 19, 2006, 2:30:47 AM
Actually, as far as the ServerXMLHTTP object is concerned, HTML is
just a string of text. You can break down the individual parts of a
record and put them into a database. You will need to use regular
expression patterns to identify the company name, address, telephone
number etc, to parse the .responseText.
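
As a minimal sketch of the fetching step, assuming the URL pattern from the original question: BuildUrl and FetchHtml are hypothetical helper names, not part of any library, and the ID range in the usage comment is a placeholder. The object and its open, send, status and responseText members are the standard MSXML ones. http.send raises a run-time error if the server cannot be reached, so real code would want error handling around it.

```vbscript
' Hypothetical helper: builds the listing URL from the original question.
Function BuildUrl(companyId, directoryId)
    BuildUrl = "http://www.website.com/data.aspx?CompanyID=" & CLng(companyId) & _
               "&DirectoryID=" & CLng(directoryId)
End Function

' Hypothetical helper: returns the page body as text, or "" on a non-200 status.
Function FetchHtml(url)
    Dim http
    Set http = CreateObject("MSXML2.ServerXMLHTTP")
    http.open "GET", url, False   ' False = synchronous request
    http.send
    If http.status = 200 Then
        FetchHtml = http.responseText
    Else
        FetchHtml = ""
    End If
    Set http = Nothing
End Function

' Usage (IDs 1 to 3 are placeholders; the real ID range is not known):
' Dim id, html
' For id = 1 To 3
'     html = FetchHtml(BuildUrl(id, 2))
'     ' ... parse html with a RegExp here ...
' Next
```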

I've used this technique a few times to grab a copy of an online
directory, but you must be aware of copyright issues.

--
Mike Brind

Jeff

Jun 19, 2006, 7:35:28 AM
Thanks Mike, but can you go into a bit more detail?
How would one go about doing this? How would you isolate, say, a name field
out of a complete page of HTML code?

"Mike Brind" <paxt...@hotmail.com> wrote in message
news:1150698647.7...@r2g2000cwb.googlegroups.com...

Mike Brind

Jun 19, 2006, 4:20:28 PM
You would have to look for boundaries. What, in the html source,
identifies the beginning and the end of the piece of text you want to
grab? It might be some html tags, eg <h2>Some Company</h2>, or <span
class="companyname">Some Company</span>, or it might just be straight
text eg Company: Some Company followed by a new line. Then you would
have to create a regexp pattern to match it.

In the first case, my pattern would be: "<h2>(.*)<\/h2>". In the last
example, a pattern could be "Company:(.*)".
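
A rough sketch of how those patterns could be applied, using the sample strings above. ExtractFirst is a hypothetical helper name; it assumes the pattern contains at least one bracketed group. The span pattern is an added illustration and uses the lazy form (.*?) so it stops at the first closing tag; the greedy (.*) would run on to the last one if two such tags shared a line.

```vbscript
' Hypothetical helper: returns the first bracketed group of the first match,
' trimmed, or "" when the pattern does not match at all.
' Assumes the pattern has at least one group in brackets.
Function ExtractFirst(text, pattern)
    Dim rgx, matches
    Set rgx = New RegExp
    rgx.Pattern = pattern
    rgx.IgnoreCase = True
    rgx.Global = False          ' only the first match is needed
    Set matches = rgx.Execute(text)
    If matches.Count > 0 Then
        ExtractFirst = Trim(matches(0).SubMatches(0))
    Else
        ExtractFirst = ""
    End If
End Function

Dim fromTag, fromSpan, fromText

' Boundary is a pair of heading tags
fromTag = ExtractFirst("<h2>Some Company</h2>", "<h2>(.*)<\/h2>")

' Boundary is a span with a known class (lazy match)
fromSpan = ExtractFirst("<span class=""companyname"">Some Company</span>", _
                        "<span class=""companyname"">(.*?)<\/span>")

' Boundary is a plain-text label; Trim removes the space after the colon
fromText = ExtractFirst("Company: Some Company", "Company:(.*)")
```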

Jeff

Jun 19, 2006, 5:19:35 PM
Understood. So say I am trying to get data from this page:

http://clanwars.gig-gamers.com/testhtml.asp

which is taken from pgatour.com, for example.

I don't see any boundaries offhand, with the exception of a URL before
each player name.
So if I wanted to get this data, along with the data presented for each
user, what would my boundaries be?


"Mike Brind" <paxt...@hotmail.com> wrote in message

news:1150748428.5...@y41g2000cwy.googlegroups.com...

Mike Brind

Jun 19, 2006, 5:34:11 PM
Dunno. You would have to sort this out first:

Active Server Pages error 'ASP 0138'

Nested Script Block

/testhtml.asp, line 94

A script block cannot be placed inside another script block.

:-)

Bob Barrows [MVP]

Jun 19, 2006, 5:33:48 PM
Jeff wrote:
> understood. so if i am trying to get data from this page say:
>
> http://clanwars.gig-gamers.com/testhtml.asp

Active Server Pages error 'ASP 0138'

Nested Script Block

/testhtml.asp, line 94

A script block cannot be placed inside another script block.

Rather than pointing us at a url, post some of the page's source (NOT
THE WHOLE PAGE!)


>
> which is from pgatour.com for example
>
> i don't see any boundries right off hand. with the exception of a url
> before each player name.
> so if i wanted to get this data, along with the data presented for
> each user, what would my boundries be?

Sounds to me as if you need to visit those urls ...


--
Microsoft MVP -- ASP/ASP.NET
Please reply to the newsgroup. The email account listed in my From
header is my spam trap, so I don't check it very often. You will get a
quicker response by posting to the newsgroup.


Jeff

Jun 19, 2006, 7:25:54 PM
Sorry guys. Here is some of the HTML, just a few players' worth:

<tr>
<th class="c1" rowSpan="2">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/rank/2006">
Pos</a></th>
<th class="c1" vAlign="center" rowSpan="2">Player Name<br>
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/first/2006">
First</a>/<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/last/2006">Last</a></th>
<th class="c1" colSpan="3">Scoring</th>
<th class="c1" colSpan="4">Rounds</th>
<th class="c1" vAlign="center" rowSpan="2">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/rank/2006">
Total<br>
Score</a></th>
</tr>
<tr>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/today/2006">
Today</a></th>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/hole/2006">
Thru</a></th>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/rank/2006">
To Par</a></th>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/r1/2006">
1</a></th>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/r2/2006">
2</a></th>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/r3/2006">
3</a></th>
<th bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/r4/2006">
4</a></th>
</tr>
<tr>
<th bgColor="#ffffff">1</th>
<td bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/scorecards/2006/238088">
Geoff Ogilvy</a>&nbsp;<img
src="http://images.golfweb.com/u/photos/misc/titleist_sm.gif"
border="0"></td>
<th bgColor="#ffffff">+2</th>
<th bgColor="#ffffff">F</th>
<th bgColor="#ffffff">+5</th>
<th bgColor="#ffffff">71</th>
<th bgColor="#ffffff">70</th>
<th bgColor="#ffffff">72</th>
<th bgColor="#ffffff">72</th>
<th bgColor="#ffffff">285</th>
</tr>
<tr>
<th bgColor="#ffffff">2</th>
<td bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/scorecards/2006/132075">
Phil Mickelson</a>&nbsp;</td>
<th bgColor="#ffffff">+4</th>
<th bgColor="#ffffff">F</th>
<th bgColor="#ffffff">+6</th>
<th bgColor="#ffffff">70</th>
<th bgColor="#ffffff">73</th>
<th bgColor="#ffffff">69</th>
<th bgColor="#ffffff">74</th>
<th bgColor="#ffffff">286</th>
</tr>
<tr>
<th bgColor="#ffffff">2</th>
<td bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/scorecards/2006/399352">
Colin Montgomerie</a>&nbsp;</td>
<th bgColor="#ffffff">+1</th>
<th bgColor="#ffffff">F</th>
<th bgColor="#ffffff">+6</th>
<th bgColor="#ffffff">69</th>
<th bgColor="#ffffff">71</th>
<th bgColor="#ffffff">75</th>
<th bgColor="#ffffff">71</th>
<th bgColor="#ffffff">286</th>
</tr>
<tr>
<th bgColor="#ffffff">2</th>
<td bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/scorecards/2006/132018">
Jim Furyk</a>&nbsp;</td>
<th bgColor="#ffffff">E</th>
<th bgColor="#ffffff">F</th>
<th bgColor="#ffffff">+6</th>
<th bgColor="#ffffff">70</th>
<th bgColor="#ffffff">72</th>
<th bgColor="#ffffff">74</th>
<th bgColor="#ffffff">70</th>
<th bgColor="#ffffff">286</th>
</tr>
<tr>
<th bgColor="#ffffff">5</th>
<td bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/scorecards/2006/399340">
Padraig Harrington</a>&nbsp;<img
src="http://images.golfweb.com/u/photos/misc/titleist_sm.gif"
border="0"></td>
<th bgColor="#ffffff">+1</th>
<th bgColor="#ffffff">F</th>
<th bgColor="#ffffff">+7</th>
<th bgColor="#ffffff">73</th>
<th bgColor="#ffffff">69</th>
<th bgColor="#ffffff">74</th>
<th bgColor="#ffffff">71</th>
<th bgColor="#ffffff">287</th>
</tr>


"Mike Brind" <paxt...@hotmail.com> wrote in message

news:1150752851.9...@y41g2000cwy.googlegroups.com...

Anthony Jones

Jun 20, 2006, 4:16:09 AM

"Jeff" <gig...@adelphia.net> wrote in message
news:5rCdnfcOyJTYrArZ...@adelphia.com...

Use something like this:-


Dim rgx
Dim oMatches
Dim oMatch

Set rgx = New RegExp
' The pattern is one string; it is split with "& _" only to fit the line width.
rgx.Pattern = "<a\W+href=""http://www.golfweb.com/tournaments/usopen/scorecards/2006/(\d+)" & _
    """>\W*(.*?)\W*</a>"

rgx.Global = True
rgx.IgnoreCase = True

' sInputHTML holds the HTML text to search
Set oMatches = rgx.Execute(sInputHTML)

For Each oMatch In oMatches
    ' oMatch.SubMatches(0) contains the player ID
    ' oMatch.SubMatches(1) contains the player name
Next


Anthony.


Mike Brind

Jun 20, 2006, 5:21:43 AM

Anthony Jones wrote:
> "Jeff" <gig...@adelphia.net> wrote in message
> news:5rCdnfcOyJTYrArZ...@adelphia.com...
> > sorry guys.... here is some of the html. just a few players worth
> >

<snipped>

<snipped>

> >
>
> Use something like this:-
>
>
> Dim rgx
> Dim oMatches
> Dim oMatch
>
> Set rgx = new RegExp
> ' Remove below is one line.
> rgx.Pattern =
> "<a\W+href=""http://www.golfweb.com/tournaments/usopen/scorecards/2006/(\d+)
> "">\W*(.*?)\W*</a>"
>
> rgx.Global = True
> rgx.IgnoreCase = true
>
> Set oMatches = rgx.Execute(sInputHTML)
>
> For each oMatch in oMatches
> 'oMatch.SubMatches(0) contains the player ID
> 'oMatch.SubMatches(1) contains the player name
> Next
>

Mine was like this:

Set objRegExpr = New RegExp
objRegExpr.Pattern = "2006\/\d{6}" & Chr(34) & ">\W*(.*?)\W*<"
objRegExpr.Global = True
objRegExpr.IgnoreCase = True
Set colMatches = objRegExpr.Execute(strSearchOn)
'Response.Write colMatches.Count
For Each objMatch In colMatches
    Response.Write objMatch.SubMatches(0) & "<br>"
Next

I didn't bother with the ID, which I assumed from the example was
always 6 digits.

:-)

--
Mike Brind

Jeff

Jun 20, 2006, 7:09:24 AM
OK, this is all getting interesting.
So if I am reading this correctly, this is picking up the column values
based on the pattern with the name or ID?


"Mike Brind" <paxt...@hotmail.com> wrote in message

news:1150795303....@y41g2000cwy.googlegroups.com...

Mike Brind

Jun 21, 2006, 4:08:01 AM
No. It's picking out values from a text string (the html code in the
page) that match the pattern provided to the RegExp object. It has no
concept of column values or names or ids. Anthony's pattern has two
parts in brackets. Those parts match the position of an id number that
he identified from looking at the html, and the position of the name of
the player. While the whole hyperlink is matched, and any others that
have the same pattern in the html, those two bracketed parts are stored
in the SubMatches collection for later use, and are referenced through
their ordinal position.

So you create your own database, with your own table and column names
and within the For ... Next loop, you run an ADO insert into your
database:

For Each oMatch In oMatches
    ' oMatch.SubMatches(0) contains the player ID
    ' oMatch.SubMatches(1) contains the player name
    ' Double any apostrophes (e.g. O'Hair) so the name cannot break the SQL string.
    conn.Execute "INSERT INTO <table> (PlayerID, PlayerName) VALUES (" & _
        CLng(oMatch.SubMatches(0)) & ", '" & _
        Replace(oMatch.SubMatches(1), "'", "''") & "')"
Next
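
The same idea could be stretched to the score columns that follow each name in the posted HTML. This is only a sketch under the assumption that every value sits in its own <th> cell, as in the sample rows; GetThCells is a hypothetical helper name, and the sample row is a shortened copy of the first player's row with the link address reduced to "#".

```vbscript
' Hypothetical helper: returns the text of every <th>...</th> cell in one
' table row as a zero-based array (an empty array when there are none).
Function GetThCells(rowHtml)
    Dim rgx, matches, cells(), i
    Set rgx = New RegExp
    rgx.Pattern = "<th[^>]*>([^<]*)</th>"   ' cell text must contain no tags
    rgx.Global = True
    rgx.IgnoreCase = True
    Set matches = rgx.Execute(rowHtml)
    If matches.Count = 0 Then
        GetThCells = Array()
        Exit Function
    End If
    ReDim cells(matches.Count - 1)
    For i = 0 To matches.Count - 1
        cells(i) = Trim(matches(i).SubMatches(0))
    Next
    GetThCells = cells
End Function

' Sample row modelled on the first player in the posted HTML.
Dim sampleRow, scores
sampleRow = "<tr><th bgColor=""#ffffff"">1</th>" & _
    "<td bgColor=""#ffffff""><a href=""#"">Geoff Ogilvy</a>&nbsp;</td>" & _
    "<th bgColor=""#ffffff"">+2</th><th bgColor=""#ffffff"">F</th>" & _
    "<th bgColor=""#ffffff"">+5</th><th bgColor=""#ffffff"">71</th>" & _
    "<th bgColor=""#ffffff"">70</th><th bgColor=""#ffffff"">72</th>" & _
    "<th bgColor=""#ffffff"">72</th><th bgColor=""#ffffff"">285</th></tr>"

scores = GetThCells(sampleRow)
' scores(0) = position, scores(1) = today, scores(2) = thru,
' scores(3) = to par, scores(4)..scores(7) = rounds 1-4, scores(8) = total
```

To process a whole page this way, the HTML would first have to be cut into rows, for example with Split(html, "<tr>", -1, vbTextCompare), and each piece passed to the helper together with the name pattern shown earlier.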

Scott Mitchell has put together a very good set of articles on regular
expressions:
http://www.4guysfromrolla.com/webtech/regularexpressions.shtml

--
Mike Brind

Jeff

Jun 21, 2006, 6:56:52 AM
ok.. thanks for all the help guys... most appreciated


"Mike Brind" <paxt...@hotmail.com> wrote in message

news:1150877281....@b68g2000cwa.googlegroups.com...
