Why does signing in with Mechanize


Olivier Morel

Oct 17, 2014, 10:02:35 PM
to rubyfr...@googlegroups.com, ruby...@ruby-lang.org

I'm trying to log in to a website automatically using Mechanize.

I have searched the internet but can't find a solution to my problem: when I leave the password empty (value = ""), I get "Incorrect Login or Password", but when I enter my real password, I get nothing back. Why?

agent = Mechanize.new { |a| a.log = Logger.new("mech.log") }
@agent.user_agent_alias = 'Linux Firefox'
login_page = agent.get('http://xxxx.org/?op=my_files')
login_form = login_page.forms.first
login_field = login_form.field_with(name: "login").value = "MyLogin"
password_field = login_form.field_with(name: "password").value = "MyPassword"
home_page = login_form.submit
puts home_page.content

The HTML of the login form:

<div id="signincontainer">
        <form method="POST" action="http://xxx.org/" name="FL" id="signin">
          <input name="op" value="login" type="hidden">
          <input name="redirect" value="" type="hidden">
          <span class="signinq">
          <input style="background: url('images/username.png') no-repeat scroll 5px 50% rgb(255, 255, 255);" id="username" name="login" title="username" tabindex="4" type="text">
          <a class="donthaveaccount" href="http://xxxx.org/signup.html"><span>
          Sign Up
          </span></a> </span> <span class="signinq">
          <input style="background: url('images/password.png') no-repeat scroll 5px 50% rgb(255, 255, 255);" id="password" name="password" value="" title="password" tabindex="5" type="password">
          <a class="forgotpassword" href="http://xxxx.org/forgot-pass.html" id="resend_password_link"><span>
          Forgot your password?
          </span></a> </span>
          <input id="signin_submit" value="Enviar" tabindex="6" src="images2/signin.png" type="image">
        </form>
      </div>



The log file of the connection: https://gist.github.com/anonymous/b24486ab1822c178b190




Olivier Gosse-Gardet

Oct 18, 2014, 5:57:43 AM
to rubyfr...@googlegroups.com, ruby...@ruby-lang.org
I tried your code on another site, and everything works fine.

What is strange in your log is that the cookie does not appear in the request headers.

Your log contains the following:
DEBUG -- : response-header: set-cookie => lang=english; domain=.xxxx.org; path=/, xfsts=z8uzbu3ic7x0eenv; domain=.xxxx.org; path=/; expires=Mon, 17-Nov-2014 01:55:02 GMT, login=MyLogin; domain=.xxxx.org; path=/; expires=Thu, 16-Apr-2015 01:55:02 GMT
DEBUG -- : saved cookie: lang=english
DEBUG -- : saved cookie: xfsts=z8uzbu3ic7x0eenv
DEBUG -- : saved cookie: login=MyLogin
 INFO -- : follow redirect to: http://xxxx.org/?op=my_files
 INFO -- : Net::HTTP::Get: /?op=my_files
DEBUG -- : request-header: accept => */*
DEBUG -- : request-header: user-agent => Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624
DEBUG -- : request-header: accept-encoding => gzip,deflate,identity
DEBUG -- : request-header: accept-charset => ISO-8859-1,utf-8;q=0.7,*;q=0.7
DEBUG -- : request-header: accept-language => en-us,en;q=0.5
DEBUG -- : request-header: host => xxxx.org
DEBUG -- : request-header: referer => http://xxxx.org/
 INFO -- : status: Net::HTTPOK 1.1 200 OK
DEBUG -- : response-header: server => nginx/1.6.0

The following is missing from the request headers:
request-header: cookie => lang=english; domain=.xxxx.org; path=/, xfsts=z8uzbu3ic7x0eenv
I don't know the site, but I assume xfsts is the session cookie.

You should log in to the site manually, in a browser, with your login and password.
Then check the cookies from the JavaScript console with
document.cookie.
The cookies should be the same on the site and in the log.
- If they are, there is a problem with cookie handling in your program. You can try setting the cookie manually yourself (http://stackoverflow.com/questions/7046535/maintaining-cookies-between-mechanize-requests)
- If they are not, there is a problem with the login form.
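
The log check described above can also be done programmatically instead of by eye. The sketch below is only an illustration under assumptions: it relies on the log line shapes quoted in this thread (a `Net::HTTP::Get: ...` line followed by `request-header:` lines), the `Net::HTTP::Post` line in the sample is a guess at what the login POST looks like in the log, and `requests_with_cookie_status` is a hypothetical helper name, not part of Mechanize.

```ruby
# Hypothetical helper (not a Mechanize API): walks through debug-log lines and
# records, for every request, whether a "cookie" request-header was logged.
def requests_with_cookie_status(log_lines)
  requests = []
  log_lines.each do |line|
    if (m = line.match(/Net::HTTP::(Get|Post): (\S+)/))
      # A new request starts; assume no cookie until a header line says otherwise.
      requests << { method: m[1].upcase, path: m[2], cookie_sent: false }
    elsif line =~ /request-header: cookie =>/i && requests.any?
      requests.last[:cookie_sent] = true
    end
  end
  requests
end

# Sample lines modeled on the log in this thread; the Post line is an assumption.
sample_log = <<~LOG.lines
  INFO -- : Net::HTTP::Post: /
  DEBUG -- : request-header: host => xxxx.org
  DEBUG -- : saved cookie: xfsts=z8uzbu3ic7x0eenv
  INFO -- : follow redirect to: http://xxxx.org/?op=my_files
  INFO -- : Net::HTTP::Get: /?op=my_files
  DEBUG -- : request-header: host => xxxx.org
LOG

requests_with_cookie_status(sample_log).each do |req|
  puts "#{req[:method]} #{req[:path]} cookie sent: #{req[:cookie_sent]}"
end
```

If the GET that follows the redirect is reported without a cookie, that would match the symptom described above. To run it on the real file, something like `File.readlines("mech.log")` could be passed in instead of `sample_log`.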

I hope this helps!

--
Regards

tamouse pontiki

Oct 18, 2014, 5:57:43 AM
to Ruby users, rubyfr...@googlegroups.com
On Fri, Oct 17, 2014 at 9:02 PM, Olivier Morel <olivi...@gmail.com> wrote:

> I'm trying to log in to a website automatically using Mechanize.
>
> I have searched the internet but can't find a solution to my problem: when I leave the password empty (value = ""), I get "Incorrect Login or Password", but when I enter my real password, I get nothing back. Why?


I can't say for sure what's going on here.

I do notice this: 

> agent = Mechanize.new{ |a| a.log = Logger.new("mech.log") }
> @agent.user_agent_alias = 'Linux Firefox'


Is agent an attr_accessor or something? You've used a bare word 'agent' above and below, but set the user agent alias on an instance variable '@agent'. 
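
To illustrate why this matters, here is a small sketch of how a local variable and an instance variable with the same name behave in plain Ruby. `FakeAgent` is a hypothetical stand-in so the snippet runs without Mechanize; it assumes, as in the posted snippet, that `@agent` was never assigned anywhere.

```ruby
# FakeAgent is a made-up stand-in for Mechanize, used only for illustration.
FakeAgent = Struct.new(:user_agent_alias)

agent = FakeAgent.new # local variable, like `agent = Mechanize.new` in the post

# @agent is a different variable. It was never assigned, so it evaluates to nil,
# and calling a setter on nil raises NoMethodError.
begin
  @agent.user_agent_alias = 'Linux Firefox'
  outcome = :assigned
rescue NoMethodError => e
  outcome = :no_method_error
  puts "#{e.class} raised when using @agent"
end

# Using the local variable consistently works as intended.
agent.user_agent_alias = 'Linux Firefox'
puts agent.user_agent_alias
```

If the original script runs without raising, `@agent` must be defined somewhere else, which is why it is worth checking which object actually receives the user agent alias.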

> login_page = agent.get('http://xxxx.org/?op=my_files')
> login_form = login_page.forms.first
> login_field = login_form.field_with(name: "login").value = "MyLogin"
> password_field = login_form.field_with(name: "password").value = "MyPassword"
> home_page = login_form.submit
> puts home_page.content

Maybe check agent.current_page at this point, as well? It *should* be the same as home_page, but...
 


> the log file of the connection : https://gist.github.com/anonymous/b24486ab1822c178b190


From the log, it certainly looks like it succeeded.


I have a similar sort of thing in one of my scrapers at: https://github.com/tamouse/scrapers/blob/master/lib/scrapers/manning_books.rb#L13-L22

Olivier Morel

Oct 20, 2014, 10:43:00 AM
to Ruby users, rubyfr...@googlegroups.com
How can I check that I'm logged in to the website?
Whatever I put in my login variable, I get this in the log:

LOG:

D, [2014-10-20T16:35:22.507210 #8913] DEBUG -- : request-header: content-length => 52
I, [2014-10-20T16:35:24.660472 #8913]  INFO -- : status: Net::HTTPOK 1.1 200 OK
D, [2014-10-20T16:35:24.660576 #8913] DEBUG -- : response-header: server => nginx/1.6.0
D, [2014-10-20T16:35:24.660617 #8913] DEBUG -- : response-header: date => Mon, 20 Oct 2014 14:35:24 GMT
D, [2014-10-20T16:35:24.660653 #8913] DEBUG -- : response-header: content-type => text/html; charset=UTF-8
D, [2014-10-20T16:35:24.660688 #8913] DEBUG -- : response-header: transfer-encoding => chunked
D, [2014-10-20T16:35:24.660722 #8913] DEBUG -- : response-header: connection => keep-alive


CODE:

require 'mechanize'
require 'logger'

mechanize = Mechanize.new { |a| a.log = Logger.new("mech.log") }
mechanize.user_agent_alias = "Linux Firefox"
login_page = mechanize.get('http://xxxxxx.org/?op=my_files')
# forms.first returns a form, not a page, so name it accordingly
login_form = login_page.forms.first
login_form.field_with(name: "login").value = "XXXX"
login_form.field_with(name: "password").value = "XXXXX@1064"
home_page = login_form.submit
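
One possible way to answer the "am I logged in?" question is to inspect the HTML returned by the submit rather than the log. The sketch below is a guess based only on what appears in this thread: it assumes a failed login shows the text "Incorrect Login or Password" and that the login form (id="signin") is no longer rendered once logged in. Both markers and the `login_state` helper are hypothetical and would need to be checked against the real site.

```ruby
# Markers taken from this thread; whether the site really behaves this way
# is an assumption.
LOGIN_FORM_MARKER = /<form[^>]*\bid=["']signin["']/i
ERROR_MARKER = /Incorrect Login or Password/i

# Hypothetical helper: classifies a response body by looking for the markers.
def login_state(html)
  return :rejected if html =~ ERROR_MARKER
  return :login_form_still_shown if html =~ LOGIN_FORM_MARKER
  :probably_logged_in
end

# Made-up response bodies, for demonstration only.
rejected_html = '<p>Incorrect Login or Password</p>'
form_html     = '<form method="POST" action="http://xxx.org/" name="FL" id="signin"></form>'
files_html    = '<h1>My Files</h1>'

puts login_state(rejected_html)
puts login_state(form_html)
puts login_state(files_html)
```

With the code above, the string to pass in would presumably be `home_page.body`.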


--
Regards

Olivier Morel
