On Sat, 27 Dec 2014 03:44:50 -0800 (PST)
brunetto <brunett...@gmail.com> wrote:
> it works, but not always (not all the sites).
>
> Why are the post form names different from those in the page source?
> Is there a comprehensive reading material?
> A standard approach to login from a (go) code into a site?
[...]
There's no standard approach to "login ... into a site" because this
concept does not really exist. I mean, there's no standardized login
protocol for web applications (something like, say, SASL), which could
be implemented as a library and used in each and every case. Instead,
each web application implements something of its own, in most cases
geared toward interactive use. Think of it: when the user's browser
renders a login page, the user is not at all concerned with how the
variables on this form are named, and what happens when they click/tap
the "login" button on that page. Moreover, certain kinds of sites
-- with public image hosting services being a good example -- try to
actively combat automated logins to prevent the creation of automated
utilities -- image uploaders in our example -- which would allow the
user to use the web application without ever seeing its web UI in their
browser. The reason is simple: these sites make money by showing
advertisements to their users, so if the web UI is never downloaded,
no banners are shown. Such sites will intentionally generate a
login form containing randomly named variables, and supply you with a
cookie which allows the site to decipher/remap these names back into
sensible values when the browser sends the filled-in form back.
These days, web applications which want to be friendly to automated
clients usually provide some explicit means for that. Typically,
these are REST APIs.
So what can you do about this?
Unfortunately, in complicated cases you have to play web browser and do
whatever it would do: render the login form, let the user fill it in,
and send it back. This requires paying attention to cookies and the
Referer HTTP header. Worse, if the web site uses JavaScript to
operate the form (pretty much a de facto standard these days), you're
out of luck. If the web site mangles the form field names, you might try
to guess their meaning from their relative position and types.
As to comprehensive reading material...
I do not know of any book which covers everything needed, but originally
all this stuff (web forms) was handled through a server-side
interface called CGI, so you might start by googling for this term.
Any book on server-side web programming (no matter which language it
targets) should cover the most basic topics. Books on PHP are in
abundance, for example.