How do I get the page status (200, 404, 500, etc)


Paul Denize

Apr 21, 2009, 5:25:25 PM
to Watir General
I must be missing something, but I spent some time yesterday trying to
get the page status code, and I could not find it anywhere.

I want to know whether the page I requested failed in some way (404, 500),
was redirected (302), or was successfully fetched (200).

How do I get the page status? Or even the page headers (which carry
this information)?

Tony

Apr 22, 2009, 9:29:49 PM
to Watir General
Hi Paul,

I would suggest you use Fiddler, which is a free tool
(http://www.fiddler2.com/fiddler2/).
You could run Fiddler while your scripts are executing and capture the
status codes, URLs hit, etc. from a separate process.

You could also try HttpWatch (not free): http://www.httpwatch.com/
It supports Ruby programming and could be used along with Watir to
get the results you need.

If anyone has a better solution using COM, I am waiting to hear it.

-Tony

Paul Rogers

Apr 22, 2009, 10:17:45 PM
to watir-...@googlegroups.com
I could never find a way of getting the status code easily. If you look through the Watir source, it does something like run a regular expression against the page title to try to figure it out. At one point Watir would raise an exception when the page finished loading if it was an error; I think that is now off by default.
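The title-scan approach can be sketched in plain Ruby. The error-title patterns below are illustrative guesses based on common IE error pages, not the actual expression from the Watir source:

```ruby
# Match a browser page title against titles typical of IE error pages.
# These patterns are assumptions for illustration, not Watir's real regexp.
ERROR_TITLE_RE = /cannot find server|HTTP 404|HTTP 500|service unavailable/i

def error_page?(title)
  !!(title =~ ERROR_TITLE_RE)
end

puts error_page?("HTTP 404 Not Found")  # true
puts error_page?("Welcome - My Site")   # false
```

The obvious weakness, as Paul notes, is that this only works when the server's error page actually puts the status in the title.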

Paul

Jarmo Pertman

Apr 23, 2009, 6:41:38 AM
to Watir General
I also wondered why this regular-expression script was in Watir's
source, since it is actually really easy to get the status with Net::HTTP.

Anyway, here are two examples I just wrote that could be used. The first
one returns the response status code and response class. It will throw an
exception if, for example, the web server isn't running. Here's the
script itself:

require 'net/http'
require 'uri'

def page_status(url)
  url = URI.parse(url)

  http = Net::HTTP.new(url.host, url.port)

  http.start do
    http.request_get(url.path.empty? ? "/" : url.path) do |res|
      return {:name => res.class, :code => res.code}
    end
  end
end

status = page_status("http://localhost")

puts status[:name]
puts status[:code]

-----
Output would be something similar to:
Net::HTTPOK
200

But I doubt that you actually want to know the exact response code.
You probably just want to know whether the page loaded correctly with
status 200, right? In that case you could use something like this
instead:

def page_ok?(url)
  url = URI.parse(url)

  http = Net::HTTP.new(url.host, url.port)

  begin
    http.start do
      http.request_get(url.path.empty? ? "/" : url.path) do |res|
        # or Net::HTTPOK if you want to exclude 201, 202, 203, 204, 205 and 206
        return false unless res.kind_of?(Net::HTTPSuccess)
      end
    end
  rescue => e
    puts "Got error: #{e.inspect}"
    return false
  end

  true
end

puts page_ok?("http://localhost")

-----
Output would be true
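To see why kind_of?(Net::HTTPSuccess) also accepts 2xx codes other than 200, you can inspect Net::HTTP's response class hierarchy directly:

```ruby
require 'net/http'

# Each status code has its own response class; every 2xx class inherits
# from Net::HTTPSuccess, so kind_of?(Net::HTTPSuccess) matches 200 as
# well as 201-206, while error classes do not.
puts Net::HTTPOK.ancestors.include?(Net::HTTPSuccess)        # true  (200)
puts Net::HTTPCreated.ancestors.include?(Net::HTTPSuccess)   # true  (201)
puts Net::HTTPNotFound.ancestors.include?(Net::HTTPSuccess)  # false (404)
```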

It seems pretty easy and straightforward to me, like most things in
Ruby :) Anyway, maybe Watir's own script should also be changed to use
a similar solution? It would of course have to be modified a little to
work with https as well, so that Watir could check itself whether the
page loaded and throw an error or something.

Anyway, hope that one of these solutions works for you.

You can of course read detailed information about Net::HTTP in the RDoc
at http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/index.html

Regards,
Jarmo



Jarmo Pertman

Apr 23, 2009, 6:48:32 AM
to Watir General
OK, I played a little more so that it would also work with https.

I just needed to add a few extra lines to the existing script:

def page_status(url)
  url = URI.parse(url)

  http = Net::HTTP.new(url.host, url.port)

  if url.scheme.downcase == "https"
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE
    http.use_ssl = true
  end

  http.start do
    http.request_get(url.path.empty? ? "/" : url.path) do |res|
      return {:name => res.class, :code => res.code}
    end
  end
end

status = page_status("https://localhost")

puts status[:name]
puts status[:code]

Jarmo

Jarmo Pertman

Apr 23, 2009, 6:50:01 AM
to Watir General
Sorry, you also need to replace require 'net/http' with require 'net/https'.

Jarmo

Paul Rogers

Apr 23, 2009, 11:22:07 AM
to watir-...@googlegroups.com
Can you get OpenSSL to work on Windows? I don't remember if this is easy or hard.
Also, Watir is different from net/http. If you do a Watir request and then a net/http request, that's two requests. And you need to get all the cookies etc. from the browser into net/http. And you won't get any JavaScript or Ajax stuff happening.
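Paul's cookie point can be sketched as follows. Here `cookie_string` stands in for whatever the browser exposes (for example IE's document.cookie as seen through Watir); it and the helper's name are assumptions for illustration:

```ruby
require 'net/http'
require 'uri'

# A second, separate request made with Net::HTTP: to resemble the
# browser's request it must carry the browser's cookies explicitly,
# and even then no JavaScript or Ajax will run.
def status_with_cookies(url, cookie_string)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.start do
    path = uri.path.empty? ? "/" : uri.path
    # Pass the browser's cookies along as a Cookie header
    res = http.request_get(path, 'Cookie' => cookie_string)
    res.code
  end
end
```

Even with the cookies copied over, this is still a second hit on the server, which matters behind a load balancer or with one-time tokens.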

Paul

Jarmo Pertman

Apr 24, 2009, 7:26:56 AM
to Watir General
Oh, you're right! I forgot to think about sessions and
cookies :/ In that case it would indeed be easier just to scan the page
HTML...

But anyway, I don't see why such things are needed. Why not just
create normal tests, which assert multiple things and would fail
for sure if the page is a 404 or something similar?

Jarmo


marekj

Apr 24, 2009, 11:28:15 AM
to watir-...@googlegroups.com
Maybe we need to make some distinctions here.
I view Watir as a 'browser driver' in the sense that it talks to the
Document Object Model of a browser and doesn't care how the browser
deals with the HTTP protocol (well, we can maybe peek a bit). The Watir
API talks to the final composition of HTML, JS, CSS, etc. as a DOM
entity; it does not separate out the parts it took to build it. The
browser builds the DOM; Watir can query the DOM, modify it, and
initiate a submit request to invoke a behaviour the way a human user
would.

What you seem to be asking for is more like HtmlUnit, which has a
'web client as a substitute for a browser' where you can access the
'protocol' the web client is using. There you can talk the HTTP protocol
directly. In that paradigm HtmlUnit IS the browser, while in our
paradigm we let the browser be the browser and let Watir manipulate its
DOM.

The advantage of Watir is that it can test the final behaviour as
provided by the DOM to the user, without managing the structural state
of sessions, cookies, and HTTP requests. We instruct the browser to
fetch a page, but through an API that simulates the way a user would
do it, using the browser's own mechanisms.
This may be a trivial explanation, but an important distinction, I think.

marekj

Watirloo: Semantic Page Objects in UseCases
http://github.com/marekj/watirloo/

Paul Denize

May 21, 2009, 12:44:07 AM
to Watir General
Thanks for all the replies.

At first I was quite excited, but then realized they would make additional
calls; the cookies, session, and conversation elements would all be a
nightmare. Not to mention that the load balancer may give valid
results one time and invalid results the next (say, if one server is bad).

The reason I do this is that I do not treat Watir as a tool to test
with but as a framework in which to build test tools.

When someone in my tool says goto, I get Watir to go to the page, time
the response, take a screenshot, spellcheck, send the page for HTML
validation, store a summary of the elements and HTML, and, if requested,
simulate a printer view of the page, etc.
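The kind of goto wrapper Paul describes can be sketched in plain Ruby. The `browser` object and the hook mechanism below are hypothetical stand-ins, not real Watir or workbench APIs:

```ruby
# A hypothetical goto wrapper: navigate, time the response, then run
# whatever extra checks the workbench has registered (screenshot,
# spellcheck, HTML validation, ...). All names here are stand-ins.
def instrumented_goto(browser, url, hooks = [])
  started = Time.now
  browser.goto(url)                              # the actual navigation
  elapsed = Time.now - started
  hooks.each { |hook| hook.call(browser, url) }  # post-navigation checks
  { :url => url, :seconds => elapsed }
end
```

A wrapper like this is exactly where an early status check would pay off: if the page is a 500, none of the registered hooks need to run.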

All of which is somewhat a waste of time if the page is a 500 error. And I do not
know what the user is going to verify next; all they said was goto....

I do think I have OpenSSL in the set of tools my workbench
installs... could this be elaborated on as a solution?

basu

Jul 1, 2009, 11:27:31 PM
to Watir General
Hi Jarmo,

If we are collecting all the links of a web page and navigating to
each link via click, don't you think we are sending the HTTP request
twice (via click and http.request_get)?
Instead, can't we figure out whether a page is broken without sending the
extra request? (I don't want to verify the "page not found" browser title.)