I've searched and searched for an answer with no luck. I'm trying to
automate interaction with a secure web site to retrieve some data.
Normal interaction starts at a login page where you enter a username
and password. If the login succeeds, you are presented with a form to
fill out. I have extracted everything I need from the form and built
a single long URL from it. In a browser, I can enter one long URL to
log into the site and then another, longer URL to retrieve my data,
so I've used these URLs in the script. Unfortunately, the script
doesn't work. The entire script (somewhat messy) is shown below with
all of the personal data removed. When I run it, everything looks
like it's running okay (I've turned on LWP debugging so I can watch
the script run), except that the final output is the "signon" web
page again. It's as if the site thinks I never logged in. Any ideas
what is going wrong here? If necessary, I can get the debug output
and post it as well (I just don't have it on this machine right now).
Thanks in advance.
use LWP;
use LWP::UserAgent;
use LWP::ConnCache;
use LWP::Debug qw(+);
use HTTP::Cookies;
my $browser = LWP::UserAgent->new(keep_alive=>1);
my $cookie_jar = HTTP::Cookies->new();
my $cache = $browser->conn_cache(LWP::ConnCache->new());
my $loginurl = "https://something-here";
my $profileurl = "https://something-more-here";
my $probeurl = "https://something-even-more-here";
$browser->cookie_jar($cookie_jar);
$browser->credentials("something.somewhere.acme.com:443","something.somewhere.acme.com","XXXXXX\@acme.com"
=> "XXXXXXX");
my @netscape_like_headers = (
'User-Agent' => 'Mozilla/4.76 [en] (Win98; U)',
'Accept-Language' => 'en-US',
'Accept-Charset' => 'iso-8859-1,*,utf-8',
'Accept-Encoding' => 'gzip',
'Accept' => 'image/gif, image/x-bitmap, image/jpeg, image/pjpeg, image/png, */*',
);
sub TryProbedata {
#Retrieves probe data from web site
my $response = $browser->post($probeurl, @netscape_like_headers);
if($response->is_error){
die "Can't get probe data: $probeurl -- ", $response->status_line;
}
else{
print "Probe data retrieve was successful. Writing to logfile\n";
# decoded_content undoes the gzip encoding requested via Accept-Encoding
my $returndata = $response->decoded_content;
print LOGFILE "$returndata\n";
}
}
sub loadProfile {
#Loads profile page
my $response = $browser->post($profileurl, @netscape_like_headers); #,
if($response->is_error){
die "Can't load profile: $profileurl -- ", $response->status_line;
}
else{
print "Load profile: ", $response->status_line, "\n";
print "Load profile success calling probe data retrieve procedure\n";
TryProbedata();
}
}
sub siteLogin {
#Logs into a given web site using username and password
my $response = $browser->post($loginurl, @netscape_like_headers); #,
if($response->is_error){
die "Can't sign in: $loginurl -- ", $response->status_line;
}
else{
print "Login: ", $response->status_line, "\n";
print "Login success calling load profile procedure\n";
loadProfile();
}
}
#########################################################################################################
#
# Main Program Section
#
#########################################################################################################
print "Probe Collect is starting.\n";
open(LOGFILE, ">output.txt") or die "Can't open output.txt: $!";
siteLogin();
close(LOGFILE);
print "Program complete.\n";
: I've searched and searched for an answer with no luck. I'm trying to
: automate interaction with a secure web site to retrieve some data.
Perhaps additional things are being set in the headers, especially the
Referer header or cookies, and your script isn't sending them all correctly.
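A minimal sketch of what checking those two items might look like with
LWP's own classes. The URLs, the "action" field, and the "SESSIONID"
cookie are hypothetical placeholders, not values from the original
script; the request is only built and printed, never sent.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);
use HTTP::Cookies;

# Hypothetical URLs; substitute the real login and profile pages.
my $login_url   = 'https://www.example.com/login';
my $profile_url = 'https://www.example.com/profile';

# Build the follow-up request with an explicit Referer header,
# as a browser coming from the login page would send it.
my $request = POST $profile_url,
    [ action => 'load' ],          # hypothetical form field
    Referer => $login_url;

# Stand-in for the session cookie the server would normally set
# in its login response (version 0, path "/", not secure).
my $cookie_jar = HTTP::Cookies->new;
$cookie_jar->set_cookie(0, 'SESSIONID', 'abc123', '/',
                        'www.example.com', undef, 1, 0, 3600, 0);

# Copy matching cookies from the jar into the request, which is
# what LWP::UserAgent does when it has a cookie_jar attached.
$cookie_jar->add_cookie_header($request);

# Print the full request so it can be compared with the browser's.
print $request->as_string;
```

If the printed request lacks a cookie or header that the browser sends
at the same step, that difference is a likely place to start looking.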
If you run a local proxy, you can capture the complete exchange made by a
regular browser, then send your script's requests through the same proxy
and compare the two.
I have used Proxomitron to examine my own browser sessions. IE, for
example, has an option to select a proxy, and Proxomitron comes with
complete instructions on how to set this up.
I have not used it to examine output from LWP scripts, but I am pretty
sure you can do the same thing by telling the LWP script about the proxy.
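Telling an LWP script about a proxy might look roughly like this. The
proxy address (localhost, port 8080) is an assumption for illustration.
Whether HTTPS requests pass through cleanly depends on the libwww-perl
version and on the proxy, and a proxy only sees encrypted traffic unless
it is set up to intercept HTTPS. The request_send handler is a separate
way to see what the script sends, available in reasonably recent
libwww-perl releases.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Assumption: a logging proxy listens on localhost port 8080.
my $proxy_url = 'http://localhost:8080/';

my $browser = LWP::UserAgent->new(keep_alive => 1);

# Route both plain and secure requests through the proxy.
$browser->proxy([ 'http', 'https' ], $proxy_url);

# Independently of the proxy, print each outgoing request just
# before it is sent, so it can be compared with the browser's.
$browser->add_handler(
    request_send => sub {
        my ($request) = @_;
        print $request->as_string, "\n";
        return;    # returning nothing lets the request proceed
    }
);

# Confirm the settings; no request is made in this sketch.
print "Proxy for http:  ", $browser->proxy('http'),  "\n";
print "Proxy for https: ", $browser->proxy('https'), "\n";
```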
--
Use WWW::Mechanize. Your program will become a lot shorter and
easier for you and others to understand. I have encountered
exactly the symptoms you describe with bank sites, and there are
several possible causes, such as hidden form fields or inputs set
by JavaScript. You may have to inspect the page source quite
carefully.
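To illustrate the hidden-field case, here is a small sketch using
HTML::Form, the form parser that WWW::Mechanize builds on. The HTML,
the field names, and the URL are invented for the example and do not
come from the site in question; the request is built and printed but
not sent.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical login page; a real one would come from the
# response to a GET of the login URL.
my $html = <<'HTML';
<form method="post" action="/login">
  <input type="text"     name="user">
  <input type="password" name="pass">
  <input type="hidden"   name="token" value="abc123">
  <input type="submit"   value="Sign on">
</form>
HTML

# Parse the first form, resolving its action against the page URL.
my $form = HTML::Form->parse($html, 'https://www.example.com/signon');

# List every input, including hidden ones a hand-built URL may miss.
for my $input ($form->inputs) {
    printf "%-8s %-6s %s\n", $input->type, $input->name // '-',
        $input->value // '';
}

# Fill in the visible fields; the hidden token is carried along.
$form->value(user => 'someone@example.com');
$form->value(pass => 'secret');

# Build (but do not send) the request a browser would submit.
my $request = $form->click;
print $request->as_string;
```

With WWW::Mechanize, the equivalent step is usually a get() of the
login page followed by submit_form() with a fields hash, which keeps
any hidden inputs and the cookies from the login response in play.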
--
Peter Scott
http://www.perldebugged.com/
*** NEW *** http://www.perlmedic.com/