
the return of urllib.request.urlopen("http://www.example.com/", params)


tunpishuang

Apr 1, 2009, 12:17:29 AM
hey guys, i'm new to python... here i've got a little problem that has
me confused...
i want to do authentication on a login page. here is the example
from the python lib ref.:

>>> import urllib.request
>>> import urllib.parse
>>> params = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
>>> f = urllib.request.urlopen("http://www.musi-cal.com/cgi-bin/query?%s" % params)
>>> print(f.read())

i wanted to know what the return value f is, whether it's an instance
of http.client.HTTPResponse,
and why the returned source of the web page is quoted like this:

b'<html></html>'

if i want to read the first 10 bytes of f and compare them with the web
source,
what will the first 10 bytes be? is it "b'<html><h" or "<html></ht"
or something else?
i've debugged this script, but i still get the error that they don't
match~

any suggestions?

pardon me for the poor english~

Steven D'Aprano

Apr 1, 2009, 12:49:30 AM
On Tue, 31 Mar 2009 21:17:29 -0700, tunpishuang wrote:

> hey guys, i'm new to python... here i've got a little problem that has me
> confused...
> i want to do authentication on a login page. here is the example from
> the python lib ref.:
>
> >>> import urllib.request
> >>> import urllib.parse
> >>> params = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
> >>> f = urllib.request.urlopen("http://www.musi-cal.com/cgi-bin/query?%s" % params)
> >>> print(f.read())
>
> i wanted to know what the return value f is, whether it's an instance of
> http.client.HTTPResponse,
> and why the returned source of the web page is quoted like this:
>
> b'<html></html>'


Looks like you are using Python 3.0.

In Python 2.x, the native string type is a byte string (often loosely
called "ASCII"). So a string of bytes is displayed like this:

'abcdef...'

and a string of Unicode characters is displayed like this:

u'abcdef...'


In Python 3.0, the native string type is unicode, but HTTP responses are
bytes. Bytes are displayed like this:

b'abcdef...'

and strings of characters like this:

'abcdef...'

Notice that the b outside the quotes is not part of the data; it is
part of the display format, just like the quotes themselves.
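
To make that concrete for the "first 10 bytes" question, here is a small sketch. It uses a literal bytes value as a stand-in for what f.read() returned in the original post, so nothing is downloaded; the variable names are only for illustration.

```python
data = b'<html></html>'      # stand-in for the bytes returned by f.read()

shown = repr(data)           # the display form the interpreter prints
first_ten = data[:10]        # slicing operates on the data, not the display form

print(shown)                          # b'<html></html>'
print(len(data))                      # 13: the b and the quotes are not counted
print(first_ten)                      # b'<html></ht'
print(first_ten == b'<html></ht')     # True: bytes compared with bytes
print(first_ten == '<html></ht')      # False: bytes never equal a str in Python 3
```

So the first 10 bytes are <html></ht, and a comparison only matches when both sides are bytes (or both have been decoded to str).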


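You can convert the bytes into a string by calling decode() on the
output. Note that calling str() on bytes with no encoding does not do
this in Python 3: it returns the display form, b prefix included.

>>> print(f.read().decode('utf-8'))
<html></html>

UTF-8 works for the simple case here, since plain ASCII is also valid
UTF-8. Once you start getting more complicated strings, with
international (non-English) characters, you will need to supply the
encoding the page actually uses.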
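
As a rough sketch of why the encoding matters, the snippet below uses io.BytesIO as a stand-in for the response object (an assumption, so that it runs without a network connection) and assumes the page is UTF-8. The page content is made up for the example.

```python
import io

# Stand-in for the object returned by urllib.request.urlopen(); like a real
# response, its read() method returns bytes. The content is invented.
f = io.BytesIO('<html>caf\u00e9</html>'.encode('utf-8'))

raw = f.read()                 # bytes, as received from the server
text = raw.decode('utf-8')     # str; 'utf-8' is an assumption about the page

print(type(raw).__name__, type(text).__name__)   # bytes str
print(len(raw), len(text))                       # 18 17
```

The two lengths differ because the accented character takes two bytes in UTF-8 but counts as one character once decoded. With a real response, f.headers.get_content_charset() may report the charset the server declared, if it sent one.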


Does this help?

--
Steven

tunpishuang

Apr 1, 2009, 12:59:00 AM
On Apr 1, 12:49 pm, Steven D'Aprano


thanks so much Steven,
if one day you come to travel in China, i'll be your guide~ :)
