How to apply the user's HTML environment in a Python programme?

BobAalsma

Sep 21, 2012, 8:57:57 AM
I'd like to write a programme that will be offered as a web service (Django), in which the user will point to a specific URL and the programme will be used to read the text of that URL.

This text can be behind a username/password, but for several reasons, I don't want to know those.

So I would like to set up a situation where the user logs in (if/when appropriate), points out the URL to my programme and my programme would then be able to read that particular text.

I'm aware this may sound fishy. It should not be: I want the user to be fully aware and in control of this process.

Any thoughts on how to approach this?

Best regards,
Bob

Joel Goldstick

Sep 21, 2012, 9:22:42 AM
to BobAalsma, pytho...@python.org
There are several Python modules for fetching web pages: urllib, urllib2,
and another called requests
(http://kennethreitz.com/requests-python-http-module.html). Check
those out.

--
Joel Goldstick
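
A minimal sketch of the fetching step for an unprotected page, using only the standard library. It assumes Python 3, where urllib2 has been merged into urllib.request; the name fetch_text and the 10-second timeout are illustrative choices, not anything from this thread.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=10):
    """Return the body of `url` decoded to a string (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset declared in the response headers;
        # fall back to UTF-8 when none is given.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_text with the URL the user points to returns the raw HTML of that page; a password-protected page would typically return a login page or an error instead.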

BobAalsma

Sep 21, 2012, 9:31:37 AM
to BobAalsma, pytho...@python.org
On Friday, 21 September 2012 at 15:23:14 UTC+2, Joel Goldstick wrote:
Thanks, Joel, yes, but as far as I'm aware these would all require the Python programme to have the user's username and password (or "credentials"), which I wanted to avoid.

Jerry Hill

Sep 21, 2012, 9:36:08 AM
to pytho...@python.org
On Fri, Sep 21, 2012 at 9:31 AM, BobAalsma <overhaalsg...@me.com> wrote:
> Thanks, Joel, yes, but as far as I'm aware these would all require the Python programme to have the user's username and password (or "credentials"), which I wanted to avoid.

No matter what you do, your web service is going to have to
authenticate with the remote web site. The details of that
authentication are going to vary with each remote web site you want to
connect to.

--
Jerry

BobAalsma

Sep 21, 2012, 9:58:42 AM
to pytho...@python.org
On Friday, 21 September 2012 at 15:36:11 UTC+2, Jerry Hill wrote:
> On Fri, Sep 21, 2012 at 9:31 AM, BobAalsma wrote:
> > Thanks, Joel, yes, but as far as I'm aware these would all require the Python programme to have the user's username and password (or "credentials"), which I wanted to avoid.
>
> No matter what you do, your web service is going to have to
> authenticate with the remote web site. The details of that
> authentication are going to vary with each remote web site you want to
> connect to.
>
> --
> Jerry

Hmm, from the previous posts I get the impression that I could best solve this by asking the user for the specific combination of username, password and URL + promising not to keep any of that...

OK, that does sound doable - thank you all

Bob
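
One way to honour the "not keeping any of that" promise is to use the credentials for a single request and never write them to a database, session or log. The sketch below assumes the remote site accepts HTTP Basic authentication, which many sites do not (form logins with cookies are common), so it covers only one of the per-site variations Jerry describes. The names build_authenticated_request and fetch_protected are hypothetical.

```python
import base64
from urllib.request import Request, urlopen


def build_authenticated_request(url, username, password):
    """Build a request carrying an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})


def fetch_protected(url, username, password, timeout=10):
    """Fetch `url` once; credentials exist only as this call's arguments."""
    request = build_authenticated_request(url, username, password)
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Basic authentication only base64-encodes the credentials, so this should be used over HTTPS, both between the user and the web service and between the web service and the remote site.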

Joel Goldstick

Sep 21, 2012, 10:15:26 AM
to BobAalsma, pytho...@python.org, comp.lan...@googlegroups.com
I recommend that you first write your program to read pages that are not
protected. Once you get that working, you can go back and figure out
how you want to get the username/password from your 'friends' and add
that in. Also look up Beautiful Soup (version 4), a great library
for parsing the pages that you retrieve.
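
As a rough illustration of the parsing step, the sketch below pulls the visible text out of an HTML string with Beautiful Soup 4. It assumes the third-party beautifulsoup4 package is installed; extract_text and the sample HTML are made up for the example.

```python
from bs4 import BeautifulSoup


def extract_text(html):
    """Return the text content of an HTML document as one string."""
    # "html.parser" selects the parser that ships with Python itself.
    soup = BeautifulSoup(html, "html.parser")
    # Join the individual text fragments with single spaces,
    # stripping whitespace and skipping empty fragments.
    return soup.get_text(separator=" ", strip=True)


sample = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
print(extract_text(sample))
```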

Peter Otten

Sep 21, 2012, 10:33:32 AM
to pytho...@python.org
BobAalsma wrote:

> Hmm, from the previous posts I get the impression that I could best solve
> this by asking the user for the specific combination of username, password
> and URL + promising not to keep any of that...
>
> OK, that does sound doable - thank you all

Hmm, promising seems doable, but keeping?


David Smith

Sep 21, 2012, 11:28:07 AM
to pytho...@python.org
On 2012-09-21 08:57, BobAalsma wrote:
> This text can be behind a username/password, but for several reasons, I don't want to know those.
>
> So I would like to set up a situation where the user logs in (if/when appropriate), points out the URL to my programme and my programme would then be able to read that particular text.
I do this from a .bat file that I will later translate to Python.
I tell my work wiki which file I want. I use Chrome, so for every new
session I'm asked for my credentials. However, that is all transparent
to my .bat file.

For that matter, when I download a new build from part of another bat
file, I use Firefox and never see the credential exchange.

I wouldn't expect any different behavior using Python.

BobAalsma

Sep 22, 2012, 7:34:02 AM
to pytho...@python.org
On Friday, 21 September 2012 at 17:28:02 UTC+2, David Smith wrote:
Umm, David, sorry, you've lost me but I think this could be a good solution - at least the division in client side/server side sounds like what I'm looking for. Could you please elaborate?

Bob

BobAalsma

Sep 22, 2012, 7:38:08 AM
to pytho...@python.org
On Friday, 21 September 2012 at 22:10:04 UTC+2, Dennis Lee Bieber wrote:
> On Fri, 21 Sep 2012 09:36:08 -0400, Jerry Hill
> declaimed the following in gmane.comp.python.general:
>
> > On Fri, Sep 21, 2012 at 9:31 AM, BobAalsma wrote:
> > > Thanks, Joel, yes, but as far as I'm aware these would all require the Python programme to have the user's username and password (or "credentials"), which I wanted to avoid.
> >
> > No matter what you do, your web service is going to have to
> > authenticate with the remote web site. The details of that
> > authentication are going to vary with each remote web site you want to
> > connect to.
>
> Hmmm, convoluted, but presuming the "login" third-party site uses
> cookies... Would it be possible to use JavaScript on the client to "copy"
> the HTML from the third party and then transmit it to the application,
> rather than having the application trying to do a direct fetch given
> just the URL?
>
> This should keep the authentication local to the client machine.
>
> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlfraed@....com HTTP://wlfraed.home.netcom.com/

Wulfraed, yes, as with David's proposal: this sounds good, but I wouldn't know the first thing about JavaScript...
I'm also concerned that both solutions would seem to imply distributing software (or "software") to the clients' systems.
Hmm.

Bob

Thomas Jollans

Sep 22, 2012, 8:01:29 AM
to pytho...@python.org
What services are you planning to interface with? Many services (Twitter
being a notable pioneer) have systems, such as OAuth, for external (web)
applications to log in without being given a user's username and password.

I think it's possible to load a page in an iframe and access it using
JavaScript/DOM from the parent page. This is probably what you'll want
to do.

You say you don't know the first thing about JavaScript. Well, my
friend, if you're developing for the web, learn JavaScript, or,
depending on your situation, hire a front end developer who knows
JavaScript. You can only do so much on the web without using JavaScript.
I recently discovered this guide to learning JS; it sounds reasonable:
http://javascriptissexy.com/how-to-learn-javascript-properly/

http://pyjs.org/ may be worth a look too.


-- Thomas

PS: Most of your messages appear to be both To: and Cc: this list.
Please stop sending each message twice, it's rather distracting.

BobAalsma

Oct 1, 2012, 10:19:27 AM
On Friday, 21 September 2012 at 16:15:30 UTC+2, Joel Goldstick wrote:
Joel,

I've spent some time with this but don't really understand my results - some help would be appreciated.
I've built a tester that will read my LinkedIn home page, which is password protected.
When I use that method for reading other people's pages, the program is redirected to the LinkedIn login page.
When I paste the URLs for the other people's pages in any browser, the requested pages are shown.

Bob

Ramchandra Apte

Oct 2, 2012, 12:17:20 PM
Not all the authentication information is in the URL.
Some of it is in cookies in the browser.
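
This is why the same URL can behave differently in a browser and in a script: the browser sends the session cookies it received at login, while a bare fetch sends none. A sketch of keeping cookies across requests with the standard library follows. make_session_opener is a hypothetical helper, and what the login request itself must look like depends entirely on the site.

```python
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, build_opener


def make_session_opener():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar


opener, jar = make_session_opener()
# After a site-specific login request made with opener.open(...), any cookies
# the server set are held in `jar` and sent on later opener.open(...) calls.
print(len(jar))  # no requests made yet, so the jar is empty
```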