Is it possible to fetch data records from another website and store
them in a database dynamically?
I want to fetch data from a yellow pages website that shows records of
different companies, e.g.
http://www.website.com/data.aspx?CompanyID=1&DirectoryID=2
I want a script that requests this URL, copies the data, then moves on
to the next page (CompanyID=3&DirectoryID=2) and repeats the
procedure.
That's it.
I would be very thankful to anyone who can do this or help me do it.
Thanks.
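The page-by-page fetch described above could be sketched roughly as follows in classic ASP VBScript (the code goes inside a server-side script block). This is only a sketch under assumptions: the URL is the placeholder from the question, the ID range 1 to 3 and the names FetchPage, nCompanyID and sInputHTML are made up for illustration, and the real site may need different handling (and its terms of use should be checked first).

```vbscript
' Sketch only: fetch a page over HTTP and return its HTML as a string.
' FetchPage is a hypothetical helper name, not something from the thread.
Function FetchPage(sUrl)
    Dim oHttp
    Set oHttp = Server.CreateObject("MSXML2.ServerXMLHTTP")
    oHttp.open "GET", sUrl, False   ' False = wait for the response
    oHttp.send
    If oHttp.status = 200 Then
        FetchPage = oHttp.responseText
    Else
        FetchPage = ""              ' treat any other status as "no data"
    End If
    Set oHttp = Nothing
End Function

Dim nCompanyID, sInputHTML
' Assumed range of IDs; the real range is not known from the question.
For nCompanyID = 1 To 3
    sInputHTML = FetchPage("http://www.website.com/data.aspx?CompanyID=" _
        & nCompanyID & "&DirectoryID=2")
    If Len(sInputHTML) > 0 Then
        ' Pull the records out of sInputHTML here and store them.
    End If
Next
```

Returning an empty string for failed requests keeps the loop going when an ID does not exist.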
Expanding on that, is there any way to break down the actual records instead
of showing everything that is in the HTML?
Or is this a stupid question, since everything on the client side IS just
HTML?
I think I answered my own question, didn't I?
"Rizwan" <rizwa...@gmail.com> wrote in message
news:1150625519.9...@i40g2000cwc.googlegroups.com...
I've used this technique a few times to grab a copy of an online
directory, but you must be aware of copyright issues.
--
Mike Brind
"Mike Brind" <paxt...@hotmail.com> wrote in message
news:1150698647.7...@r2g2000cwb.googlegroups.com...
In the first case, my pattern would be: "<h2>(.*)<\/h2>". In the last
example, a pattern could be "Company:(.*)".
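As a rough illustration of how patterns like these pull values out of a page, here is a minimal sketch in classic ASP VBScript. The sample HTML and the helper name FirstGroups are invented for the example. It uses the lazy form (.*?) for the heading pattern, because with the greedy (.*) two headings on the same line would be captured as one match ("Acme Ltd</h2><h2>Bolt Co").

```vbscript
' Returns the first capture group of every match of sPattern in sText,
' one per line. FirstGroups is a hypothetical helper for this sketch.
Function FirstGroups(sPattern, sText)
    Dim rgx, oMatch, sOut
    Set rgx = New RegExp
    rgx.Pattern = sPattern
    rgx.Global = True       ' find every match, not just the first
    rgx.IgnoreCase = True
    sOut = ""
    For Each oMatch In rgx.Execute(sText)
        sOut = sOut & Trim(oMatch.SubMatches(0)) & vbCrLf
    Next
    FirstGroups = sOut
End Function

Dim sSample
' Made-up sample standing in for the fetched page source.
sSample = "<h2>Acme Ltd</h2><h2>Bolt Co</h2>" & vbCrLf & "Company: Acme Ltd"

Response.Write "<pre>" & Server.HTMLEncode(FirstGroups("<h2>(.*?)<\/h2>", sSample)) & "</pre>"
Response.Write "<pre>" & Server.HTMLEncode(FirstGroups("Company:(.*)", sSample)) & "</pre>"
```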
http://clanwars.gig-gamers.com/testhtml.asp
which is from pgatour.com, for example.
I don't see any boundaries right offhand, with the exception of a URL before
each player name.
So if I wanted to get this data, along with the data presented for each
user, what would my boundaries be?
"Mike Brind" <paxt...@hotmail.com> wrote in message
news:1150748428.5...@y41g2000cwy.googlegroups.com...
Active Server Pages error 'ASP 0138'
Nested Script Block
/testhtml.asp, line 94
A script block cannot be placed inside another script block.
:-)
Rather than pointing us at a url, post some of the page's source (NOT
THE WHOLE PAGE!)
>
> which is from pgatour.com, for example.
>
> I don't see any boundaries right offhand, with the exception of a URL
> before each player name.
> So if I wanted to get this data, along with the data presented for
> each user, what would my boundaries be?
Sounds to me as if you need to visit those urls ...
--
Microsoft MVP -- ASP/ASP.NET
<tr>
<th class="c1" rowSpan="2">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/rank/2006">
Pos</a></th>
<th class="c1" vAlign="center" rowSpan="2">Player Name<br>
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/first/2006">
First</a>/<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/last/2006">Last</a></th>
<th class="c1" colSpan="3">Scoring</th>
<th class="c1" colSpan="4">Rounds</th>
<th class="c1" vAlign="center" rowSpan="2">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/rank/2006">
Total<br>
Score</a></th>
</tr>
<tr>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/today/2006">
Today</a></th>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/hole/2006">
Thru</a></th>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/rank/2006">
To Par</a></th>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/r1/2006">
1</a></th>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/r2/2006">
2</a></th>
<th class="cc">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/r3/2006">
3</a></th>
<th bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/leaderboard/r4/2006">
4</a></th>
</tr>
<tr>
<th bgColor="#ffffff">1</th>
<td bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/scorecards/2006/238088">
Geoff Ogilvy</a> <img
src="http://images.golfweb.com/u/photos/misc/titleist_sm.gif"
border="0"></td>
<th bgColor="#ffffff">+2</th>
<th bgColor="#ffffff">F</th>
<th bgColor="#ffffff">+5</th>
<th bgColor="#ffffff">71</th>
<th bgColor="#ffffff">70</th>
<th bgColor="#ffffff">72</th>
<th bgColor="#ffffff">72</th>
<th bgColor="#ffffff">285</th>
</tr>
<tr>
<th bgColor="#ffffff">2</th>
<td bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/scorecards/2006/132075">
Phil Mickelson</a> </td>
<th bgColor="#ffffff">+4</th>
<th bgColor="#ffffff">F</th>
<th bgColor="#ffffff">+6</th>
<th bgColor="#ffffff">70</th>
<th bgColor="#ffffff">73</th>
<th bgColor="#ffffff">69</th>
<th bgColor="#ffffff">74</th>
<th bgColor="#ffffff">286</th>
</tr>
<tr>
<th bgColor="#ffffff">2</th>
<td bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/scorecards/2006/399352">
Colin Montgomerie</a> </td>
<th bgColor="#ffffff">+1</th>
<th bgColor="#ffffff">F</th>
<th bgColor="#ffffff">+6</th>
<th bgColor="#ffffff">69</th>
<th bgColor="#ffffff">71</th>
<th bgColor="#ffffff">75</th>
<th bgColor="#ffffff">71</th>
<th bgColor="#ffffff">286</th>
</tr>
<tr>
<th bgColor="#ffffff">2</th>
<td bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/scorecards/2006/132018">
Jim Furyk</a> </td>
<th bgColor="#ffffff">E</th>
<th bgColor="#ffffff">F</th>
<th bgColor="#ffffff">+6</th>
<th bgColor="#ffffff">70</th>
<th bgColor="#ffffff">72</th>
<th bgColor="#ffffff">74</th>
<th bgColor="#ffffff">70</th>
<th bgColor="#ffffff">286</th>
</tr>
<tr>
<th bgColor="#ffffff">5</th>
<td bgColor="#ffffff">
<a
href="http://www.golfweb.com/tournaments/usopen/scorecards/2006/399340">
Padraig Harrington</a> <img
src="http://images.golfweb.com/u/photos/misc/titleist_sm.gif"
border="0"></td>
<th bgColor="#ffffff">+1</th>
<th bgColor="#ffffff">F</th>
<th bgColor="#ffffff">+7</th>
<th bgColor="#ffffff">73</th>
<th bgColor="#ffffff">69</th>
<th bgColor="#ffffff">74</th>
<th bgColor="#ffffff">71</th>
<th bgColor="#ffffff">287</th>
</tr>
"Mike Brind" <paxt...@hotmail.com> wrote in message
news:1150752851.9...@y41g2000cwy.googlegroups.com...
Use something like this:-

Dim rgx
Dim oMatches
Dim oMatch

Set rgx = New RegExp
' The pattern is a single string; it is split with "& _" only to fit the line width.
rgx.Pattern = "<a\W+href=""http://www.golfweb.com/tournaments/usopen/scorecards/2006/(\d+)""" & _
    ">\W*(.*?)\W*</a>"
rgx.Global = True
rgx.IgnoreCase = True

' sInputHTML holds the HTML source of the page
Set oMatches = rgx.Execute(sInputHTML)
For Each oMatch In oMatches
    'oMatch.SubMatches(0) contains the player ID
    'oMatch.SubMatches(1) contains the player name
Next
Anthony.
Mine was like this:
Set objRegExpr = New RegExp
objRegExpr.Pattern = "2006\/\d{6}" & Chr(34) & ">\W*(.*?)\W*<"
objRegExpr.Global = True
objRegExpr.IgnoreCase = True
Set colMatches = objRegExpr.Execute(strSearchOn)
'Response.Write colMatches.Count
For Each objMatch In colMatches
    Response.Write objMatch.SubMatches(0) & "<br>"
Next
I didn't bother with the ID, which I assumed from the example was
always 6 digits.
:-)
--
Mike Brind
"Mike Brind" <paxt...@hotmail.com> wrote in message
news:1150795303....@y41g2000cwy.googlegroups.com...
So you create your own database, with your own table and column names
and within the For ... Next loop, you run an ADO insert into your
database:
For Each oMatch In oMatches
    'oMatch.SubMatches(0) contains the player ID
    'oMatch.SubMatches(1) contains the player name
    'Single quotes are doubled so that a name such as O'Hair does not break the SQL
    conn.Execute "INSERT INTO <table> (PlayerID, PlayerName) VALUES (" _
        & oMatch.SubMatches(0) & ",'" _
        & Replace(oMatch.SubMatches(1), "'", "''") & "')"
Next
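An alternative to building the SQL string by concatenation is a parameterised ADO command, which passes the values separately from the SQL text. The following is a minimal sketch, assuming conn is an already open ADODB.Connection and oMatches comes from the regular expression above. The table name Players and the 100-character name length are assumptions for illustration only.

```vbscript
' ADO constants, declared here in case adovbs.inc is not included.
Const adInteger = 3
Const adVarChar = 200
Const adParamInput = 1
Const adCmdText = 1
Const adExecuteNoRecords = 128

Dim cmd
Set cmd = Server.CreateObject("ADODB.Command")
Set cmd.ActiveConnection = conn   ' conn: an open ADODB.Connection (assumed)
cmd.CommandType = adCmdText
' "Players" is a hypothetical table name; use your own.
cmd.CommandText = "INSERT INTO Players (PlayerID, PlayerName) VALUES (?, ?)"
cmd.Parameters.Append cmd.CreateParameter("PlayerID", adInteger, adParamInput)
cmd.Parameters.Append cmd.CreateParameter("PlayerName", adVarChar, adParamInput, 100)

For Each oMatch In oMatches
    cmd.Parameters(0).Value = CLng(oMatch.SubMatches(0))   ' player ID
    cmd.Parameters(1).Value = oMatch.SubMatches(1)         ' player name
    cmd.Execute , , adExecuteNoRecords   ' an INSERT returns no recordset
Next
Set cmd = Nothing
```

Creating the command once and only changing the parameter values inside the loop avoids rebuilding the statement for every row.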
Scott Mitchell has put together a very good set of articles on regular
expressions:
http://www.4guysfromrolla.com/webtech/regularexpressions.shtml
--
Mike Brind
"Mike Brind" <paxt...@hotmail.com> wrote in message
news:1150877281....@b68g2000cwa.googlegroups.com...