
Good programming style


Astley Le Jasper

Sep 12, 2008, 6:08:33 AM
I'm still learning Python and would like to know what's a good way of
organizing code.

I am writing some scripts to scrape a number of different websites that
hold similar information and then collate it all together. Obviously
each site needs to be handled differently, but once the information is
collected then more generic functions can be used.

Is it best to have it all in one script or split it into per-site
scripts that can then be called by a manager script? If everything is
in one script, would you have per-site functions to extract the data or
a generic function that varies slightly depending on the site? For
example:

import firstSiteScript
import secondSiteScript

firstsitedata = firstSiteScript.getData('search_str')
secondsitedata = secondSiteScript.getData('search_str')
etc etc

OR

def getFirstSiteData(search_str):
    ...  # etc.
def getSecondSiteData(search_str):
    ...  # etc.

OR

def getdata(search_str, website):
    if website == 'firstsite':
        ...
    elif website == 'secondsite':
        ...
    # etc.

Bruno Desthuilliers

Sep 12, 2008, 6:44:36 AM
Astley Le Jasper wrote:

> I'm still learning python and would like to know what's a good way of
> organizing code.
>
> I am writing some scripts to scrape a number of different websites that
> hold similar information and then collate it all together. Obviously
> each site needs to be handled differently, but once the information is
> collected then more generic functions can be used.
>
> Is it best to have it all in one script or split it into per site
> scripts that can then be called by a manager script?
> If everything is
> in one script, would you have per-site functions to extract the data or
> a generic function that varies slightly depending on the site,

As far as I'm concerned, I'd choose the first solution. Decoupling
what's varying (here, site-specific stuff) from "invariants" is so far
the best way I know to keep complexity manageable.

> for
> example
>
> import firstSiteScript
> import secondSiteScript
>
> firstsitedata = firstSiteScript.getData('search_str')
> secondsitedata = secondSiteScript.getData('search_str')
> etc etc

Even better:

- put generic functions in a 'generic' module
- put all site-specific stuff, each in its own module, in a specific
  'site_scripts' directory
- in your 'main' script, scan the site_scripts directory to loop over
  site-specific modules, import them and run them (look for the __import__
  function).

This is a kind of quick-and-dirty lightweight plugin system that avoids
having to hard-code imports and calls in the main script, so you just have
to add or remove site-specific scripts in the site_scripts directory.
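
A minimal sketch of the plugin approach described above. It assumes each site module exposes a `get_data(search_str)` function (a hypothetical name, not from the original post), and it uses `importlib` rather than calling `__import__` directly; `load_site_modules` and `collect` are illustrative names as well.

```python
import importlib.util
from pathlib import Path


def load_site_modules(directory):
    """Import every public .py file in `directory`; return {module_name: module}."""
    modules = {}
    for path in sorted(Path(directory).glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helper files
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # actually runs the module's code
        modules[path.stem] = module
    return modules


def collect(directory, search_str):
    """Call get_data(search_str) in each site module and gather the results."""
    results = {}
    for name, module in load_site_modules(directory).items():
        # Assumption: every site module defines get_data(search_str).
        results[name] = module.get_data(search_str)
    return results
```

With this layout, the main script would call something like `collect("site_scripts", "search_str")`, and adding a new site means dropping one more file into that directory.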

Also, imported modules are not recompiled on each import - only when
they change - while the 'main' script gets recompiled on each invocation.

(snip)

> OR
>
> def getdata(search_str, website):
> if website == 'firstsite':
> ....
> elif website =='secondsite':

This one is IMHO the very worst thing to do.
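
For contrast, a sketch (not from the thread) of one common alternative to the if/elif chain when a single entry point is still wanted: a dictionary mapping site names to per-site functions. The handler names and their placeholder return values are assumptions for illustration only.

```python
def get_first_site_data(search_str):
    # Placeholder for real site-specific scraping.
    return ["first:" + search_str]


def get_second_site_data(search_str):
    # Placeholder for real site-specific scraping.
    return ["second:" + search_str]


# Adding a site means adding one entry here, not another elif branch.
SITE_HANDLERS = {
    "firstsite": get_first_site_data,
    "secondsite": get_second_site_data,
}


def get_data(search_str, website):
    """Look up the handler for `website` and delegate to it."""
    try:
        handler = SITE_HANDLERS[website]
    except KeyError:
        raise ValueError("unknown website: %r" % website)
    return handler(search_str)
```

The site-specific code stays in separate functions, so the dispatching function never grows as sites are added.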

My 2 cents...

Astley Le Jasper

Sep 12, 2008, 6:54:52 AM
On 12 Sep, 12:44, Bruno Desthuilliers <bruno.

Excellent, thanks for that.

Ben Finney

Sep 14, 2008, 6:41:22 PM
Astley Le Jasper <Astley....@gmail.com> writes:

> Is it best to have it all in one script or split it into per site
> scripts that can then be called by a manager script? If everything
> is in one script, would you have per-site functions to extract the
> data or a generic function that varies slightly depending on the
> site, for example
>
> import firstSiteScript
> import secondSiteScript

First: each of these things you're importing is a "module" in Python.
A script is what I prefer, for clarity, to call a "program": it's
intended to be executed independently as the top level of execution.

Second: please do yourself a favour and drop the camelCaseNames.
Follow PEP 8 <URL:http://www.python.org/dev/peps/pep-0008> for style
and naming in your Python code.
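
As a rough illustration of the PEP 8 naming conventions applied to the names in the question (the class, constant, and return value below are hypothetical, added only to show each naming style):

```python
# The module file would be named first_site.py rather than firstSiteScript.py.

MAX_RESULTS = 50  # constants: UPPER_CASE_WITH_UNDERSCORES


class SiteScraper:  # classes: CapWords
    def __init__(self, site_name):
        self.site_name = site_name  # attributes: lowercase_with_underscores

    def get_data(self, search_str):  # functions and methods: lowercase_with_underscores
        # Placeholder result; real scraping logic is out of scope here.
        return {"site": self.site_name, "query": search_str, "limit": MAX_RESULTS}
```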

> firstsitedata = firstSiteScript.getData('search_str')
> secondsitedata = secondSiteScript.getData('search_str')
> etc etc

I'm presuming that there will be large areas of common functionality
between these different sites. On that basis, it's probably best to
treat the differences as differences of *configuration* where
possible, instead of having separate modules for the entire site.
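
A small sketch of what "differences as configuration" might look like. The site names, the regular expressions, and the `extract_items` helper are all assumptions made up for illustration; real sites would likely need a proper HTML parser rather than regular expressions.

```python
import re

# Per-site differences live in data, not in separate modules.
SITE_CONFIGS = {
    "first_site": {"item_pattern": r'<li class="result">(.*?)</li>'},
    "second_site": {"item_pattern": r"<td>(.*?)</td>"},
}


def extract_items(site_name, html):
    """Generic extraction: only the pattern varies from site to site."""
    pattern = SITE_CONFIGS[site_name]["item_pattern"]
    return re.findall(pattern, html)
```

For example, `extract_items("second_site", "<tr><td>a</td><td>b</td></tr>")` returns the two cell contents. This only works while the sites differ in details such as patterns or URLs; once a site needs genuinely different logic, a separate module is probably the cleaner choice.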

You might like to look at a web framework which gathers much of this
functionality together for you, and provides flexible ways to define
different sites in terms of those common elements
<URL:http://wiki.python.org/moin/WebFrameworks>.

--
\ “Following fashion and the status quo is easy. Thinking about |
`\ your users' lives and creating something practical is much |
_o__) harder.” —Ryan Singer, 2008-07-09 |
Ben Finney

Grant Edwards

Sep 14, 2008, 8:29:18 PM
On 2008-09-14, Ben Finney <bignose+h...@benfinney.id.au> wrote:
> Astley Le Jasper <Astley....@gmail.com> writes:
>
>> Is it best to have it all in one script or split it into per
>> site scripts that can then be called by a manager script? If
>> everything is in one script would you have per site functions
>> to extract the data or a generic function that varies
>> slightly depending on the site, for example
>>
>> import firstSiteScript
>> import secondSiteScript
>
> First: each of these things you're importing is a "module" in
> Python. A script is what I prefer, for clarity, to call a
> "program": it's intended to be executed independently as the
> top level of execution.
>
> Second: please do yourself a favour and drop the camelCaseNames.
> Follow PEP 8 <URL:http://www.python.org/dev/peps/pep-0008> for style
> and naming in your Python code.

If he finds camelcase more readable and easier to type (as do
I), how is switching to underscores "doing himself a favor"?

I'm generally in favor of using a consistent naming style
throughout a project, but I don't see why the naming style used
in my source code should be subject to somebody else's
arbitrary standard.

When it comes to writing code intended for the standard library
in the main Python distribution, I would certainly defer to the
existing standard as defined in PEP 8. However, I don't see
any reason that style should be imposed on everybody else.

--
Grant

Ben Finney

Sep 14, 2008, 9:01:23 PM
Grant Edwards <gra...@visi.com> writes:

> On 2008-09-14, Ben Finney <bignose+h...@benfinney.id.au> wrote:
> > Second: please do yourself a favour and drop the camelCaseNames.
> > Follow PEP 8 <URL:http://www.python.org/dev/peps/pep-0008> for style
> > and naming in your Python code.
>
> If he finds camelcase more readable and easier to type (as do
> I), how is switching to underscores "doing himself a favor"?
>
> I'm generally in favor of using a consistent naming style
> throughout a project, but I don't see why the naming style used
> in my source code should be subject to somebody else's
> arbitrary standard.

Because the code we write rarely stays isolated from other code. There
is an existing convention, and it's better to pick a (sufficiently
sane) style convention and stick to it than argue about what the
convention should be.

> When it comes to writing code intended for the standard library
> in the main Python distribution, I would certainly defer to the
> existing standard as defined in PEP 8. However, I don't see
> any reason that style should be imposed on everybody else.

Who's imposing? I'm saying it's a good idea for everyone to do it, and
going so far as to say that one is doing oneself a favour by following
the convention. I have no more power than you to "impose" convention
on anyone.

--
\ “‘Did you sleep well?’ ‘No, I made a couple of mistakes.’” |
`\ —Steven Wright |
_o__) |
Ben Finney

Grant Edwards

Sep 14, 2008, 10:10:14 PM
On 2008-09-15, Ben Finney <bignose+h...@benfinney.id.au> wrote:
> Grant Edwards <gra...@visi.com> writes:
>> On 2008-09-14, Ben Finney <bignose+h...@benfinney.id.au> wrote:
>>
>>> Second: please do yourself a favour and drop the
>>> camelCaseNames. Follow PEP 8
>>> <URL:http://www.python.org/dev/peps/pep-0008> for style and
>>> naming in your Python code.
>>
>> If he finds camelcase more readable and easier to type (as do
>> I), how is switching to underscores "doing himself a favor"?
>>
>> I'm generally in favor of using a consistent naming style
>> throughout a project, but I don't see why the naming style
>> used in my source code should be subject to somebody else's
>> arbitrary standard.
>
> Because the code we write rarely stays isolated from other
> code. There is an existing convention,

There are many existing conventions.

> and it's better to pick a (sufficiently sane) style convention
> and stick to it than argue about what the convention should
> be.

I suppose if everybody agreed to pick one, and all the source
code in the world was changed to meet it, that would be "a good
thing". It just seems like a goal so unrealistic as to make it
a bit of an overstatement to tell people they're better off
following convention X than following convention Y.

When packages as significant as wxPython use naming conventions
other than PEP 8, I find it hard to make a case that the PEP 8
naming convention is any better than any other.

>> When it comes to writing code intended for the standard
>> library in the main Python distribution, I would certainly
>> defer to the existing standard as defined in PEP 8. However,
>> I don't see any reason that style should be imposed on
>> everybody else.
>
> Who's imposing? I'm saying it's a good idea for everyone to do
> it, and going so far as to say that one is doing oneself a
> favour by following the convention. I have no more power than
> you to "impose" convention on anyone.

My apologies -- "impose" was too strong a word to use.

If we were starting from scratch and there was no extant source
code in the world, then it would make sense to encourage
everybody to pick one convention. [I still think it would be
rather quixotic.] But, there are so many projects out there
with naming conventions other than PEP 8, that I don't see how
there's an advantage to picking one over another (except for
the obvious also-rans like "all upper case, no vowels, and a
maximum length of 6 characters").

I'll agree that sticking with a single convention within a
project is definitely a good thing.

I'm personally aware of mixed/camel-case projects from 25+
years ago, so I'm afraid PEP 8 came along a bit too late...

--
Grant

Adelle Hartley

Sep 14, 2008, 11:25:27 PM
to pytho...@python.org
Grant Edwards wrote:
> When packages as significant as wxPython use naming conventions
> other than PEP 8, I find it hard to make a case that the PEP 8
> naming convention is any better than any other.

This relates to a question I was thinking about...

I'm looking at porting a library that was written for COM and .Net to
work as a Python module, and was wondering whether it would be better to
stick to the library's current naming convention so that the API is as
similar as possible on each platform, or to adopt a "when in Rome..."
policy and follow the "most mainstream" naming pattern for each
platform/language.

Adelle.

Sean DiZazzo

Sep 14, 2008, 11:35:38 PM
On Sep 14, 7:10 pm, Grant Edwards <gra...@visi.com> wrote:

> On 2008-09-15, Ben Finney <bignose+hates-s...@benfinney.id.au> wrote:
> > Grant Edwards <gra...@visi.com> writes:

+1

CamelCase FTW!

~Sean

Ben Finney

Sep 15, 2008, 12:04:35 AM
Adelle Hartley <ade...@akemi.com.au> writes:

> I'm looking at porting a library that was written for COM and .Net
> to work as a Python module, and was wondering whether it would be
> better to stick to the library's current naming convention so that
> the API is as similar as possible on each platform, or to adopt a
> "when in Rome..." policy and follow the "most mainstream" naming
> pattern for each platform/language.

I think it's more important for Python library APIs to comply with the
Python coding guidelines (as specified in PEP 8) than to comply with
standards in other languages.

The Python library you're implementing isn't being used in those other
languages, so the conventions of other languages have little
relevance. It's being used in Python code, so it should mesh well with
PEP 8 compliant code — by having the API itself comply with PEP 8.
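
One way this could look in practice is a thin PEP 8 style facade over the ported code. This is only a sketch: `LegacyWidget`, `SetTitle`, `GetTitle`, and `Widget` are hypothetical stand-ins, not names from the library being discussed.

```python
class LegacyWidget:
    """Stand-in for a ported class that keeps its original COM/.NET style names."""

    def __init__(self):
        self._title = ""

    def SetTitle(self, title):
        self._title = title

    def GetTitle(self):
        return self._title


class Widget:
    """PEP 8 style facade: same behaviour, Python naming and a property."""

    def __init__(self):
        self._impl = LegacyWidget()

    @property
    def title(self):
        return self._impl.GetTitle()

    @title.setter
    def title(self, value):
        self._impl.SetTitle(value)
```

Python callers then write `widget.title = "Report"` instead of `widget.SetTitle("Report")`, at the cost of maintaining the extra layer.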

--
\ “When cryptography is outlawed, bayl bhgynjf jvyy unir |
`\ cevinpl.” —Anonymous |
_o__) |
Ben Finney

Grant Edwards

Sep 15, 2008, 12:07:08 AM

If all that would change is naming, then my advice would be
to keep the existing naming. That way it matches existing
documentation and examples. But, it does violate PEP 8.

If the API itself is going to be changed significantly so that
it's unique to the Python port (different to the point where
existing documentation and examples are no longer useful), then
using standard PEP 8 naming conventions is probably a good choice.

--
Grant


Grant Edwards

Sep 15, 2008, 12:30:32 AM
On 2008-09-15, Ben Finney <bignose+h...@benfinney.id.au> wrote:
> Adelle Hartley <ade...@akemi.com.au> writes:
>
>> I'm looking at porting a library that was written for COM and .Net
>> to work as a Python module, and was wondering whether it would be
>> better to stick to the library's current naming convention so that
>> the API is as similar as possible on each platform, or to adopt a
>> "when in Rome..." policy and follow the "most mainstream" naming
>> pattern for each platform/language.
>
> I think it's more important for Python library APIs to comply
> with the Python coding guidelines (as specified in PEP 8) than
> to comply with standards in other languages.

I think the practical matter of being able to use existing
documentation and examples might be more important than
maintaining the purity of PEP 8 naming styles.

> The Python library you're implementing isn't being used in
> those other languages, so the conventions of other languages
> have little relevance.
>
> It's being used in Python code, so it should mesh well with
> PEP 8 compliant code -- by having the API itself comply with
> PEP 8.

I think that battle was lost long ago, but maybe that's just
because I use a lot of libraries written in C, C++, and Fortran
and then wrapped with things like SWIG.

--
Grant


Ben Finney

Sep 15, 2008, 12:46:38 AM
Grant Edwards <gra...@visi.com> writes:

> On 2008-09-15, Ben Finney <bignose+h...@benfinney.id.au> wrote:
> > I think it's more important for Python library APIs to comply with
> > the Python coding guidelines (as specified in PEP 8) than to
> > comply with standards in other languages.
>
> I think the practical matter of being able to use existing
> documentation and examples

How are examples written in another language a "practical matter" for
using a Python library? Surely less "practical" than having one's
Python code base use a consistent style.

--
\ “Dyslexia means never having to say that you're ysror.” |
`\ —anonymous |
_o__) |
Ben Finney

Grant Edwards

Sep 15, 2008, 1:13:48 AM
On 2008-09-15, Ben Finney <bignose+h...@benfinney.id.au> wrote:
> Grant Edwards <gra...@visi.com> writes:
>
>> On 2008-09-15, Ben Finney <bignose+h...@benfinney.id.au> wrote:
>> > I think it's more important for Python library APIs to comply with
>> > the Python coding guidelines (as specified in PEP 8) than to
>> > comply with standards in other languages.
>>
>> I think the practical matter of being able to use existing
>> documentation and examples
>
> How are examples written in another language a "practical matter" for
> using a Python library? Surely less "practical" than having one's
> Python code base use a consistent style.

If there is already a set of documentation and usage examples
for the library, then changing the names just for the sake of
"purity" means that you've now got documentation that's wrong.

For example, the vast majority of wxPython consists of wrapped
C++ library routines. There is a large body of existing
documentation and sample code for those library routines, and
they're all CamelCase. IMO, following that documentation is
more important and useful than having all the names changed to
agree with other Python libraries. (It's also a lot less work.)

--
Grant

George Sakkis

Sep 15, 2008, 10:41:35 AM

+1. Another factor is whether the original library's API is reasonably
stable or it's expected to change significantly in the future. In the
former case you may want to provide a more pythonic API; otherwise
you'll have to do more work to keep two separate APIs in sync with
every new version.

George
