WWW::Mechanize submit_form does not return expected
Newsgroups: comp.lang.perl.misc
From: Justin C <justin.1...@purestblue.com>
Date: Fri, 10 Feb 2012 10:12:45 +0000
Local: Fri, Feb 10 2012 5:12 am
Subject: WWW::Mechanize submit_form does not return expected
The web page I'm automating with mech is not returning the same data
that it returns when accessed with a browser. The JavaScript on the
form page, as far as I can see, does no more than sanitize what the user
enters; I can't see how it could possibly be telling the server that the
request is not from a real person. I've set $mech->agent_alias('Mac Safari')
so the site should think the request is from a real browser. I'm returning
input for all form fields (including the default values for the hidden
fields). I don't know where to go from here.

Any suggestions will be gratefully received. What follows is only for
anyone who would like to know, in detail, what's going on. My code is at
the bottom of this message.

Here's an overview of why I'm doing what I am: The site I'm trying to
automate is an EU governmental one. My employer has recently been hit
with a VAT bill because one of our customers de-registered for VAT early
last year, so we shouldn't have been sending them goods VAT-free. The
customer didn't tell us, and we're liable for the unpaid VAT. Our
government's site suggests regularly checking customer VAT numbers
against the EU database to avoid this. There is a module on CPAN that can
give a valid/invalid result for VAT details, but I want to capture the
certificate the site issues because it has a consultation number as
proof of checking the VAT status. Without that number the certificate is
useless, since it could easily have been fraudulently made. The only part
of the data that's missing when I use mech is the consultation number,
the vital part that proves you did carry out the check.

The web page that contains the form I'm mechanising is at
<URL:http://ec.europa.eu/taxation_customs/vies/vieshome.do?selectedLanguag...>
(substitute that last EN for your preferred language code: ES, FR, DE,
DK, etc.). You can find the form in the source by searching for
"viesquer.do".

The CPAN module that gets halfway there is
Business::Tax::VAT::Validation. I didn't discover this until I'd almost
completed my program, but it doesn't provide the consultation number
which can be used as a defence if challenged by the VAT authorities.

Here's the code I'm using, with VAT numbers omitted for privacy. If
anyone wants to run this and can't find any valid data to use, please
let me know by email and I'll send some data that works; I just don't
want to put other people's data 'out there':

my $cust = shift;
#   An array ref:
#       $cust->[0] = our customer identifier (not submitted, it just
#                    happens to be in the array)
#       $cust->[1] = 2-letter EU country code
#       $cust->[2] = customer VAT number

my %requester = (
    state  => 'GB',     # 2-letter EU country code
    vat_no => '',       # a valid VAT number for the country
                        #   (removed for privacy reasons)
);

my $mech = WWW::Mechanize->new();

#   Set a sensible UA - don't want to be thought of as a bot
$mech->agent_alias('Mac Safari');

#   Load the page containing the form
$mech->get($site_root);

#   Fill in and submit the form
$mech->submit_form(
    form_name => 'frmVat',
    fields    => {
        ms           => $cust->[1],
        iso          => $cust->[1],
        vat          => $cust->[2],
        reqeusterMs  => $requester{state},
        requesterIso => $requester{state},
        requesterVat => $requester{vat_no},
        BtnSubmitVat => 'Verify',
        name         => '',
        companyType  => '',
        street1      => '',
        postcode     => '',
        city         => '',
    }
);

return \$mech->content;
__END__

Thank you for any help you can give.

   Justin.

--
Justin C, by the sea.


 
Newsgroups: comp.lang.perl.misc
From: "J. Gleixner" <glex_no-s...@qwest-spam-no.invalid>
Date: Fri, 10 Feb 2012 10:20:37 -0600
Local: Fri, Feb 10 2012 11:20 am
Subject: Re: WWW::Mechanize submit_form does not return expected
On 02/10/12 04:12, Justin C wrote:
> The web page I'm automating with mech is not returning the same data
> that it returns when accessed with a browser. The JavaScript on the
> form page, as far as I can see, does no more than sanitize what the user
> enters; I can't see how it could possibly be telling the server that the
> request is not from a real person. I've set $mech->agent_alias('Mac Safari')
> so the site should think the request is from a real browser. I'm returning
> input for all form fields (including the default values for the hidden
> fields). I don't know where to go from here.

[...]

Firefox with the Firebug plug-in might help you find out whether you're
sending values differently. You can easily see what's sent, and the
response, in the Console tab.

If that matches what you're sending, then possibly what's returned
is processed by JavaScript before being displayed in the browser, and
that would be the next thing to examine.
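
For anyone who would rather inspect the outgoing request from the Perl side, here is a minimal sketch. It builds the request offline with HTML::Form (the class WWW::Mechanize uses for forms) and prints it, so it can be compared with what a browser sends. The HTML, the example.invalid base URL, the form action and the VAT number are all hypothetical placeholders, not taken from the real page.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for the real form page; only the structure matters.
my $html = <<'HTML';
<form name="frmVat" action="/viesquer.do" method="post">
  <input type="text" name="ms">
  <input type="text" name="vat">
  <input type="submit" name="BtnSubmitVat" value="Verify">
</form>
HTML

# parse() returns one HTML::Form object per <form> in the page.
my ($form) = HTML::Form->parse($html, 'http://example.invalid/');

$form->value(ms  => 'GB');
$form->value(vat => '123456789');    # placeholder, not a real VAT number

# click() builds the HTTP::Request that pressing the button would send.
# Nothing goes over the network here.
my $request = $form->click('BtnSubmitVat');
print $request->as_string;
```

With a live WWW::Mechanize object, something like `$mech->add_handler(request_send => sub { shift->dump; return });` should print each real request just before it is sent, since Mechanize inherits add_handler from LWP::UserAgent.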


 
Newsgroups: comp.lang.perl.misc
From: Tad McClellan <ta...@seesig.invalid>
Date: Fri, 10 Feb 2012 13:24:58 -0600
Local: Fri, Feb 10 2012 2:24 pm
Subject: Re: WWW::Mechanize submit_form does not return expected

I'd use the "web scraping proxy" from AT&T:

    http://www2.research.att.com/sw/tools/wsp/

It logs HTTP requests/responses in the form of Perl code (UserAgent).

Then we don't need to know what all the JS does, we just need to know
how to construct the request we want...

--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.


 
Newsgroups: comp.lang.perl.misc
From: Justin C <justin.1...@purestblue.com>
Date: Tue, 14 Feb 2012 10:26:15 +0000
Local: Tues, Feb 14 2012 5:26 am
Subject: Re: WWW::Mechanize submit_form does not return expected
On 2012-02-10, J. Gleixner <glex_no-s...@qwest-spam-no.invalid> wrote:

Thank you for the suggestion, J, but I've been bitten by Firefox plugins
before and avoid them now. The WSP suggestion is working for me.

   Justin.

--
Justin C, by the sea.


 
Newsgroups: comp.lang.perl.misc
From: Justin C <justin.1...@purestblue.com>
Date: Tue, 14 Feb 2012 10:36:57 +0000
Local: Tues, Feb 14 2012 5:36 am
Subject: Re: WWW::Mechanize submit_form does not return expected
On 2012-02-10, Tad McClellan <ta...@seesig.invalid> wrote:

> I'd use the "web scraping proxy" from AT&T:

>     http://www2.research.att.com/sw/tools/wsp/

> It logs HTTP requests/responses in the form of Perl code (UserAgent).

> Then we don't need to know what all the JS does, we just need to know
> how to construct the request we want...

Thank you, Tad, that's very useful. I think I've found a cookie. I'm
testing new code now.

   Justin.

--
Justin C, by the sea.


 
Newsgroups: comp.lang.perl.misc
From: Justin C <justin.1...@purestblue.com>
Date: Wed, 15 Feb 2012 13:17:48 +0000
Local: Wed, Feb 15 2012 8:17 am
Subject: Re: WWW::Mechanize submit_form does not return expected
On 2012-02-14, Justin C <justin.1...@purestblue.com> wrote:

> On 2012-02-10, Tad McClellan <ta...@seesig.invalid> wrote:

>> I'd use the "web scraping proxy" from AT&T:

>>     http://www2.research.att.com/sw/tools/wsp/

>> It logs HTTP requests/responses in the form of Perl code (UserAgent).

>> Then we don't need to know what all the JS does, we just need to know
>> how to construct the request we want...

> Thank you, Tad, that's very useful. I think I've found a cookie. I'm
> testing new code now.

Update: After much *much* time, too much time, trying to debug this I
finally found the problem. I had a typo in the name of a field in the
$mech->submit_form. A small transpose of two letters. The site still
functioned as I expected, but my error caused the site not to give me a
confirmation number.
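
A minimal sketch of one way to catch this kind of typo before submitting: check each field name against the inputs the form actually defines. The HTML and base URL below are hypothetical stand-ins, and the assumption that `reqeusterMs` in the first post was meant to be `requesterMs` is an inference from the neighbouring `requesterIso` and `requesterVat`, not something stated in the thread.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for the real page; only the field names matter.
my $html = <<'HTML';
<form name="frmVat" action="/viesquer.do" method="post">
  <input type="text" name="requesterMs">
  <input type="text" name="requesterIso">
  <input type="text" name="requesterVat">
</form>
HTML

my ($form) = HTML::Form->parse($html, 'http://example.invalid/');

my %fields = (
    reqeusterMs  => 'GB',    # transposed letters, as in the first post
    requesterIso => 'GB',
    requesterVat => '',
);

# find_input() returns undef in scalar context when no input has that name.
my @unknown = sort grep { !defined scalar $form->find_input($_) } keys %fields;

print "Unknown field: $_\n" for @unknown;
```

With WWW::Mechanize the form object could come from `$mech->form_name('frmVat')` instead of HTML::Form->parse. As far as I know, HTML::Form quietly adds a new text input when a value is set for an unknown name unless the form's strict attribute is on, which would explain why the submission went through without any error.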

I hate how debugging takes 3 times (or more) the time it takes to code!

Anyway, thanks to Tad and J for their suggestions.

   Justin.

--
Justin C, by the sea.


 