I was scraping 4 completely different sites simultaneously with the
same code, so I took a fairly object-oriented approach (read:
polymorphic inheritance) and ran my objects in multiple threads. But
the basic principle remains the same. There may be many ways to do
it... Here's how I did it, and I would suggest you take the following
steps:
1. Understand how the request-response system works for the web and
how your browser works internally.
2. Study the HttpWebRequest and HttpWebResponse classes in the .NET
Framework and understand how they can be used to mimic any web
traffic.
3. Use Fiddler (v2 can parse HTTPS!) to analyze the HTTP POST that
takes place when you submit data on the target site. That will give
you a list of form parameters that are passed to the server with each
request. Since this is an ASP.NET site, the ViewState would be one of
the required parameters and must be passed back untampered.
4. Create a (preferably configurable) list of the form parameters that
are submitted with the request. An XML file that is dynamically loaded
by the application was how I did it.
5. Load this list and build a querystring-style string (name=value
pairs joined by &) containing all those parameter names and the values
they expect. The placeholder values would be replaced with the values
obtained from users of your intranet site. You will probably need to
URL-encode the ViewState value, since it is Base64 text and can
contain +, / and = characters.
6. Convert the string to a byte array and write this byte array to the
request stream returned by your HttpWebRequest object's
GetRequestStream() method.
7. Set as many properties of the HttpWebRequest object as you can,
such as ProtocolVersion, Method, ContentType, Timeout, and any other
headers you need. This information can usually be found by studying
the request that Fiddler captured.
8. Use the HttpWebRequest.GetResponse() method to obtain an
HttpWebResponse object that is the response returned by the server. If
all works well, this would be the next page that you see when manually
submitting data.
You can then analyze this page for any data you want.
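
To make steps 4 to 7 a bit more concrete, here is a minimal sketch of
the same idea. It is written in Python rather than .NET purely for
brevity; in .NET the corresponding pieces would be
HttpUtility.UrlEncode, Encoding.GetBytes and the stream from
GetRequestStream(). Everything specific in it is an assumption made up
for illustration: the XML layout, the URL, the field names, the
ViewState value and the helper name build_request. It only builds the
request and does not send it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical config file (step 4). A param with a "value" attribute is
# sent as-is; a param with a "from" attribute is filled in from user input.
CONFIG_XML = """
<form action="https://example.com/Search.aspx">
  <param name="__VIEWSTATE" value="/wEPDwUKLTI2+abc=" />
  <param name="txtQuery" from="query" />
  <param name="btnSubmit" value="Search" />
</form>
"""


def build_request(xml_text, user_values):
    """Build (but do not send) a form POST from the XML parameter list."""
    root = ET.fromstring(xml_text)

    # Step 5: collect name/value pairs, substituting the user's values.
    fields = []
    for param in root.findall("param"):
        key = param.get("from")
        if key is not None:
            value = user_values[key]
        else:
            value = param.get("value", "")
        fields.append((param.get("name"), value))

    # urlencode() percent-encodes each value, so the +, / and = in the
    # Base64 ViewState reach the server unchanged after decoding.
    # Step 6: the encoded string becomes the bytes of the request body.
    body = urlencode(fields).encode("ascii")

    # Step 7: set the method and headers seen in the captured request.
    return Request(
        root.get("action"),
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_request(CONFIG_XML, {"query": "web scraping"})
print(req.data.decode("ascii"))
# Step 8 would be urllib.request.urlopen(req, timeout=30), omitted here
# because example.com is only a placeholder.
```

In a real run the ViewState would not be hard-coded like this; it
would be the value the server most recently sent for that page.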
Hope that helps !