Five ;)
> I have heard that function invocation in Python is expensive, but making
> lots of functions is a good design habit in many other languages, so
> is there any principle for writing Python functions?
It's hard to discuss in the abstract. A function should perform a
recognizable step in solving the program's problem. If you prepared to
write your program by describing each of several operations the program
would have to perform, then you might go on to plan a function for each of
the described operations. The high-level functions can then be analyzed,
and will probably lead to functions of their own.
Test-driven development encourages smaller functions that give you a better
granularity of testing. Even so, the testable functions should each perform
one meaningful step of a more general problem.
> for example, how many lines should form a function?
Maybe as few as one.
def increase(x, a):
    return x + a
is kind of stupid, but a more complicated line
def expand_template(bitwidth, defs):
    '''Turn a run-length-encoded list into bits.'''
    return np.array(sum(([bit] * (count * bitwidth)
                         for count, bit in defs), []), np.int8)
is the epitome of intelligence. I wrote it myself. Even increase might be
useful:
def increase(x, a):
    return x + a * application_dependent_quantity
`increase` has become a meaningful operation in the imaginary application
we're discussing.
For an upper bound, it's harder to say. If you read to the end of a
function and can't remember how it started, or what it did in between, it's
too big. If you're reading on your favourite screen, and the end and the
beginning are more than one page-scroll apart, it might be too big. If it's
too big, factoring it into sub-steps and making functions of some of those
sub-steps is the fix.
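To make that concrete, here is a hypothetical sketch (all names invented for illustration) of a longer routine factored into named sub-steps, each small enough to read at a glance:

```python
# All names here are invented for illustration.
def load_records(path):
    """Read one record per line, skipping blanks."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def summarize(records):
    """Count records by their first comma-separated field."""
    counts = {}
    for rec in records:
        key = rec.split(",")[0]
        counts[key] = counts.get(key, 0) + 1
    return counts

def format_report(counts):
    """One 'key: count' line per key, in sorted order."""
    return "\n".join(f"{k}: {v}" for k, v in sorted(counts.items()))

def report(path):
    # The top-level function now reads as the plan of the computation.
    return format_report(summarize(load_records(path)))
```

Each helper performs one recognizable step, and the top-level function is the description of the problem.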
Mel.
Enough lines to do what the function needs to do, but no more.
Seriously, break up your program into functions based on logical
groupings, and whatever makes your code easiest to understand. When
you're all done, if your program is too slow, run it under the profiler.
Use the profiling results to indicate which parts need improvement.
It's very unlikely that function call overhead will be a significant
issue. Don't worry about stuff like that unless the profiler shows it's
a bottleneck. Don't try to guess what's slow. My guesses are almost
always wrong. Yours will be too.
If your program runs fast enough as it is, don't even bother with the
profiler. Be happy that you've got something useful and move on to the
next thing you've got to do.
Don't compromise the design and clarity of your code just because you heard
some rumors about performance. Also, for any performance question, please
consult a profiler.
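For instance, a minimal profiling sketch using the standard library's cProfile (the workload and names are invented; the point is that the profiler, not intuition, says where the time goes):

```python
# Toy workload with invented names; profile it and print the hottest
# functions by cumulative time.
import cProfile
import io
import pstats

def slow_part():
    return sum(i * i for i in range(100_000))

def fast_part():
    return 1 + 1

def work():
    slow_part()
    fast_part()

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```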
Uli
> I have heard that function invocation in Python is expensive,
It's expensive, but not *that* expensive. Compare:
[steve@sylar ~]$ python3.2 -m timeit 'x = "abc".upper()'
1000000 loops, best of 3: 0.31 usec per loop
[steve@sylar ~]$ python3.2 -m timeit -s 'def f(): return "abc".upper()' 'f()'
1000000 loops, best of 3: 0.53 usec per loop
So the function call is nearly as expensive as this (very simple!) sample
code. But in absolute terms, that's not very expensive at all. If we make
the code more expensive:
[steve@sylar ~]$ python3.2 -m timeit '("abc"*1000)[2:995].upper().lower()'
10000 loops, best of 3: 32.3 usec per loop
[steve@sylar ~]$ python3.2 -m timeit -s 'def f(): return ("abc"*1000)[2:995].upper().lower()' 'f()'
10000 loops, best of 3: 33.9 usec per loop
the function call overhead becomes trivial.
Cases where function call overhead is significant are rare. Not vanishingly
rare, but rare enough that you shouldn't worry about them.
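You can reproduce the comparison above from inside Python with the stdlib `timeit` module (absolute numbers will vary by machine):

```python
# The same expression timed inline and behind a function call.
import timeit

def f():
    return "abc".upper()

inline = timeit.timeit('"abc".upper()', number=100_000)
called = timeit.timeit("f()", globals={"f": f}, number=100_000)

# The call adds overhead, but both totals are tiny in absolute terms.
print(f"inline: {inline:.4f}s  via function call: {called:.4f}s")
```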
> but making
> lots of functions is a good design habit in many other languages, so
> is there any principle for writing Python functions?
> for example, how many lines should form a function?
About as long as a piece of string.
A more serious answer: it should be exactly as long as needed to do the
smallest amount of work that makes up one action, and no longer or shorter.
If you want to maximise the programmer's efficiency, a single function
should be short enough to keep the whole thing in your short-term memory at
once. This means it should consist of no more than seven, plus or minus
two, chunks of code. A chunk may be a single line, or a few lines that
together make up a unit, or if the lines are particularly complex, *less*
than a line.
http://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two
http://www.codinghorror.com/blog/2006/08/the-magical-number-seven-plus-or-minus-two.html
(Don't be put off by the use of the term "magical" -- there's nothing
literally magical about this. It's just a side-effect of the way human
cognition works.)
Anything longer than 7±2 chunks, and you will find yourself having to scroll
backwards and forwards through the function, swapping information into your
short-term memory, in order to understand it.
Even 7±2 is probably excessive: I find that I'm most comfortable with
functions that perform 4±1 chunks of work. An example from one of my
classes:
def find(self, prefix):
    """Find the item that matches prefix."""
    prefix = prefix.lower()                   # Chunk #1
    menu = self._cleaned_menu                 # Chunk #2
    for i, s in enumerate(menu, 1):           # Chunk #3
        if s.lower().startswith(prefix):
            return i
    return None                               # Chunk #4
So that's three one-line chunks and one three-line chunk.
--
Steven
Lots of them. None of them have to do with performance.
> for example, how many lines should form a function?
Between zero (which has to be written "pass") and a few hundred. Usually
closer to the lower end of that range. Occasionally outside it.
Which is to say: This is the wrong question.
Let us give you the two laws of software optimization.
Law #1: Don't do it.
If you try to optimize stuff, you will waste a ton of time doing things that,
it turns out, are unimportant.
Law #2: (Experts only.) Don't do it yet.
You don't know enough to "optimize" this yet.
Write something that does what it is supposed to do and which you understand
clearly. See how it looks. If it looks like it is running well enough,
STOP. You are done.
Now, if it is too slow, and you are running it on real data, NOW it is time
to think about why it is slow. And the solution there is not to read abstract
theories about your language, but to profile it -- actually time execution and
find out where the time goes.
I've been writing code, and making it faster, for some longish period of time.
I have not yet ever in any language found cause to worry about function call
overhead.
-s
--
Peter Seebach
That comes into play when choosing between
    list2 = map(lambda x: 2*x, list1)    # versus
    list2 = [2*x for x in list1]
It also comes into play when choosing between looping with recursion
(function calls) versus looping with iteration (while/for). In Python,
the iteration is faster, while some functional languages omit looping
syntax constructs and perhaps auto-translate some recursion to iteration.
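A contrived sketch of that point: the same sum written with iteration and with recursion. The recursive form pays one function call per element, and in CPython it also runs into the recursion limit (about 1000 frames by default) for large inputs:

```python
# Same computation, two shapes. Iteration is the idiomatic, faster
# choice in Python; the recursive version is shown for comparison.
def sum_iter(n):
    total = 0
    for i in range(n + 1):
        total += i
    return total

def sum_rec(n):
    if n == 0:
        return 0
    return n + sum_rec(n - 1)   # one call frame per element

print(sum_iter(500) == sum_rec(500))  # -> True
```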
> but making lots of functions is a good design habit in many other languages,
Same for Python, with the exceptions noted above of avoiding trivial
one-use functions when there is an alternative.
> is there any principle for writing Python functions?
Same as usual. Functions define new words and create new abstractions
that encapsulate a unit of computation.
> for example, how many lines should form a function?
1 to many, as long as the 1 is more complex than 2*x, unless the trivial
function is required for a callback. I doubt the stdlib has many defs
longer than 100 lines.
Try the following: complex enough that the function call overhead does
not matter; simple enough to be understood as a unit.
I just came up with the following hypothesis: the complexity of a
function is related to the number of *different* functions used to
define it:
x = a*b + c/d - e**f
is more complex (harder to understand) than
x = a + b + c + d + e + f
For this purpose, different statements count as functions (and indeed,
they translate to bytecode functions). So:
for i in iterable:
    if f(i):
        print(i)
is more complex than
a = 1
b = 2
c = 3
d = 4
People can retain at most about 10 different things in short term
memory. So perhaps 10 different 'functions' within a function, or at
least a commented block, is enough.
--
Terry Jan Reedy
> Even 7±2 is probably excessive: I find that I'm most comfortable with
> functions that perform 4±1 chunks of work. An example from one of my
> classes:
>
> def find(self, prefix):
>     """Find the item that matches prefix."""
>     prefix = prefix.lower()                   # Chunk #1
>     menu = self._cleaned_menu                 # Chunk #2
>     for i, s in enumerate(menu, 1):           # Chunk #3
>         if s.lower().startswith(prefix):
>             return i
>     return None                               # Chunk #4
>
> So that's three one-line chunks and one three-line chunk.
In terms of different functions performed (see my previous post), I see
attribute lookup
assignment
enumerate
sequence unpacking
for-looping
if-conditioning
lower
startswith
return
That is 9, which is enough.
--
Terry Jan Reedy
I think we have broad agreement, but we're counting different things.
Analogy: you're counting atoms, I'm grouping atoms into molecules and
counting them.
It's a little like phone numbers: it's not an accident that we normally
group phone numbers into groups of 2-4 digits:
011 23 4567 8901
In general, people can more easily memorise four chunks of four digits (give
or take) than one chunk of 13 digits: 0112345678901.
--
Steven
Or, more likely, is the sort of coder who has worked with other coders
in the past and understands the value of readable code.
> Don't worry if it too small or too big. It's
> not the size that matters, it's the motion of the sources ocean!
If only you spent as much time actually thinking about what you're
saying as trying to find 'clever' ways to say it...
> Always use
> comments to clarify code and NEVER EVER create more functions only for
> the sake of short function bodies
This is quite likely the worst advice you've ever given. I can only
assume you've never had to refactor the sort of code you're advocating
here.
"Very soon I will be hashing out a specification for python 4000."
AHAHAHAHAhahahahahahahAHAHAHAHahahahahaaaaaaa. So rich. Anyone willing
to bet serious money we won't see this before 4000AD?
"Heck even our leader seems as a captain too drunk with vanity to
care; and our members like a ship lost at sea left to sport of every
troll-ish wind!"
Quite frankly, you're a condescending, arrogant blow-hard that this
community would be better off without.
"We must constantly strive to remove multiplicity from our systems;
lest it consumes us!"
s/multiplicity/rantingrick/ and I'm in full agreement.
QFT
My suggestion is to think how you would test the function, in order to
get 100% code coverage. The parts of the function that are difficult
to test, those are the parts that you want to pull out into their own
separate function.
For example, a block of code within a conditional statement, where the
test condition cannot be passed in, is a prime example of a block of
code that should be pulled out into a separate function.
Obviously, there are times where this is not practical - exception
handling comes to mind - but that should be your rule of thumb. If a
block of code is hard to test, pull it out into its own function, so
that it's easier to test.
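A hypothetical before/after of that rule of thumb. The discount rule below is invented, but it shows the shape of the refactoring: a rule that was buried inside a branch of `process()` becomes its own function that can be tested directly.

```python
# Invented example. Before the refactor, the discount condition lived
# inside process() and could only be exercised by driving the whole
# function. After, the rule is separately callable and testable.
def qualifies_for_discount(order):
    return order["total"] > 100 and order["country"] == "US"

def apply_discount(order):
    order["discount"] = round(order["total"] * 0.1, 2)
    return order

def process(order):
    if qualifies_for_discount(order):
        apply_discount(order)
    return order

# The rule is now coverable with one-line tests:
assert qualifies_for_discount({"total": 150, "country": "US"})
assert not qualifies_for_discount({"total": 50, "country": "US"})
```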
--
// T.Hsu
> On Aug 23, 7:59 am, smith jack <thinke...@gmail.com> wrote:
> > I have heard that function invocation in Python is expensive, but making
> > lots of functions is a good design habit in many other languages, so
> > is there any principle for writing Python functions?
> > for example, how many lines should form a function?
>
> My suggestion is to think how you would test the function, in order to
> get 100% code coverage.
I'm not convinced 100% code coverage is an achievable goal for any major
project. I was once involved in a serious code coverage program. We
had a large body of code (hundreds of KLOC of C++) which we were licensing
to somebody else. The customer was insisting that we do code coverage
testing and set a standard of something like 80% coverage.
There was a dedicated team of about 4 people working on this for the
better part of a year. They never came close to 80%. More like 60%,
and that was after radical surgery to eliminate dead code and branches
that couldn't be reached. The hard parts are testing the code that
deals with unusual error conditions caused by interfaces to the external
world.
The problem is, it's just damn hard to simulate all the different kinds
of errors that can occur. This was network intensive code. Every call
that touches the network can fail in all sorts of ways that are near
impossible to simulate. We also had lots of code that tried to deal
with memory exhaustion. Again, that's hard to simulate.
I'm not saying code coverage testing is a bad thing. Many of the issues
I mention above could have been solved with additional abstraction
layers, but that adds complexity of its own. Certainly, designing a
body of code to be testable from the get-go is far superior to trying
to retrofit tests to an existing code base (which is what we were doing).
> The parts of the function that are difficult
> to test, those are the parts that you want to pull out into their own
> separate function.
>
> For example, a block of code within a conditional statement, where the
> test condition cannot be passed in, is a prime example of a block of
> code that should be pulled out into a separate function.
Maybe. In general, it's certainly true that a bunch of smallish
functions, each of which performs exactly one job, is easier to work
with than a huge ball of spaghetti code. On the other hand, interfaces
are a common cause of bugs. When you pull a hunk of code out into its
own function, you create a new interface. Sometimes that adds
complexity (and bugs) of its own.
> Obviously, there are times where this is not practical - exception
> handling comes to mind - but that should be your rule of thumb. If a
> block of code is hard to test, pull it out into its own function, so
> that it's easier to test.
In general, that's good advice. You'll also usually find that code
which is easy to test is also easy to understand and easy to modify.
> Furthermore: If you are moving code out of one function to ONLY be
> called by that ONE function then you are a bad programmer and should
> have your editor taken away for six months. You should ONLY create
> more func/methods if those func/methods will be called from two or
> more places in the code. The very essence of func/meths is the fact
> that they are reusable.
That's one very important aspect of functions, yes. But there's another:
abstraction.
If I'm writing a module that needs to fetch user details from an LDAP
server, it might be worthwhile to put all of the LDAP-specific code in
its own method, even if it's only used once. That way the main module
can just contain a line like this:
user_info = get_ldap_results("cn=john gordon,ou=people,dc=company,dc=com")
The main module keeps a high level of abstraction instead of descending
into dozens or even hundreds of lines of LDAP-specific code.
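A sketch of what that wrapper might look like. To keep the example self-contained and runnable, the directory below is faked with a dict; a real `get_ldap_results` would hold the LDAP-specific code (connect, bind, search, decode attributes) behind the same one-line interface:

```python
# Stand-in for a real directory server, purely for illustration.
_FAKE_DIRECTORY = {
    "cn=john gordon,ou=people,dc=company,dc=com": {"mail": "jgordon@company.com"},
}

def get_ldap_results(dn):
    # Real version: open a connection, run the search, parse attributes.
    # The caller never sees any of that.
    return _FAKE_DIRECTORY.get(dn)

user_info = get_ldap_results("cn=john gordon,ou=people,dc=company,dc=com")
print(user_info)
```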
--
John Gordon
While I understand and agree with that basic tenet, I think
that the capitalized 'ONLY' is too strong. I do split out
code into function for readability, even when the function
will only be called from the place from which I split it out.
I don't think that this adds to the 'spaghetti' factor. It
can make my life much easier when I go to debug my own code
years later.
In python, I use a small function to block out an idea
as a sort of pseudo code, although it's valid python. Then
I just define the supporting functions, and the task is done:
def validate_registrants():
    for dude in get_registrants():
        id = get_id(dude)
        amount_paid = get_amount_paid(dude)
        amount_owed = get_amount_owed(dude)
        if amount_paid != amount_owed:
            flag(dude)
I get that this cries out for a 'dude' object, but
I'm just making a point. When I go back to this code,
I can very quickly see what the overall flow is, and
jump to the problem area by function name. The above
block might expand to a couple of hundred lines if I
didn't split it out like this.
This can be good and can be bad. It's good when it aids readability;
it's bad when you need to pass practically the entire locals() as
function arguments and/or return values. I would split the function
only when both halves (caller and callee) can be given short and
useful names - if you can't explain what a block of code does in a few
words, it's probably a poor choice for splitting out into a function.
ChrisA
>
>> Furthermore: If you are moving code out of one function to ONLY be
>> called by that ONE function then you are a bad programmer and should
>> have your editor taken away for six months. You should ONLY create
>> more func/methods if those func/methods will be called from two or
>> more places in the code. The very essence of func/meths is the fact
>> that they are reusable.
>
> While I understand and agree with that basic tenet, I think
> that the capitalized 'ONLY' is too strong. I do split out
> code into function for readability, even when the function
> will only be called from the place from which I split it out.
In other words, you disagree. Which is good, because the text you quote is
terrible advice, and it is ironic that the person you quote judges others
as bad programmers when his advice is so bad.
I can think of at least five reasons apart from re-use why it might be
appropriate to pull out code into its own function or method even if it is
used in one place only:
(1) Extensibility. Just earlier today I turned one method into three:
def select(self):
    response = input(self)
    if response:
        index = self.find(response)
    else:
        index = self.default
    return self.menuitems[index - 1]
turned into:
def choose(self, response):
    if response:
        index = self.find(response)
    else:
        index = self.default
    return self.menuitems[index - 1]

def raw_select(self):
    return input(self)

def select(self):
    return self.choose(self.raw_select())
I did this so that subclasses could override the behaviour of each component
individually, even though the caller is not expected to call raw_select or
choose directly. (I may even consider making them private.)
(2) Testing. It is very difficult to reach into the middle of a function and
test part of it. It is very difficult to get full test coverage of big
monolithic blocks of code: to ensure you test each path through a big
function, the number of test cases rises exponentially. By splitting it
into functions, you can test each part in isolation, which requires much
less work.
(3) Fault isolation. If you have a 100 line function that fails on line 73,
that failure may have been introduced way back in line 16. By splitting the
function up into smaller functions, you can more easily isolate where the
failure comes from, by checking for violated pre- and post-conditions.
(4) Maintainability. It's just easier to document and reason about a
function that does one thing, than one that tries to do everything. Which
would you rather work with, individual functions for:
buy_ingredients
clean_kitchen_work_area
wash_vegetables
prepare_ingredients
cook_main_course
fold_serviettes
make_dessert
serve_meal
do_washing_up
etc., or one massive function:
prepare_and_serve_five_course_meal
Even if each function is only called once, maintenance is simpler if the
code is broken up into more easily understood pieces.
(5) Machine efficiency. This can go either way. Code takes up memory too,
and it may be easier for the compiler to work with 1000 small functions
than 1 big function. I've actually seen somebody write a single function so
big that Python couldn't import the module, because it ran out of memory
trying to compile it! (This function was *huge* -- the source code was many
megabytes in size.) I don't remember the details, but refactoring the
source code into smaller functions fixed it.
On the other hand, if you are tight for memory, 1 big function may have less
overhead than 1000 small functions; and these days, with even entry level
PCs often having a GB or more of memory, it is rare to come across a
function so big that the size of code matters. Even a 10,000 line function
is likely to be only a couple of hundred KB in size:
>>> text = '\n'.join('print x+i' for i in range(1, 10001))
>>> code = compile(text, '', 'exec')
>>> sys.getsizeof(code.co_code) # size in bytes
90028
So that's four really good reasons for splitting code into functions, and
one borderline one, other than code re-use. There may be others.
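As a footnote to point (3), one low-tech way to check pre- and post-conditions at the new function boundaries is plain `assert` statements (invented example), so a violated invariant fails right where it is introduced rather than fifty lines downstream:

```python
# Invented example: assert the invariants at the function boundary.
def normalize(values):
    total = sum(values)
    assert total > 0, "precondition: values must have a positive sum"
    result = [v / total for v in values]
    assert abs(sum(result) - 1.0) < 1e-9, "postcondition: result sums to 1"
    return result

print(normalize([1, 1, 2]))  # -> [0.25, 0.25, 0.5]
```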
--
Steven
I disagree - create_widgets() is completely unnecessary in the
presence of show(), unless it's possible to show the dialog, hide it,
and then re-show it without recreating the widgets.
On Sat, Aug 27, 2011 at 4:16 AM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> I can think of at least five reasons apart from re-use why it might be
> appropriate to pull out code into its own function or method even if it is
> used in one place only:
I'm glad you say "might be", because your five reasons aren't always
reasons for refactoring. I'll play devil's advocate for a moment,
because discussion is both fun and informative: :)
> (1) Extensibility. Just earlier today I turned one method into three:
> I did this so that subclasses could override the behaviour of each component
> individually, even though the caller is not expected to call raw_select or
> choose directly. (I may even consider making them private.)
Definitely, but it's no value if you make every tiny thing into your
own function. Sometimes the best way to code is to use lower-level
functionality directly (not wrapping input() inside raw_select() for
instance), and letting someone monkey-patch if they want to change
your code. A judgment call.
> (2) Testing. It is very difficult to reach into the middle of a function and
> test part of it. ... By splitting it
> into functions, you can test each part in isolation, which requires much
> less work.
Yes, but 100% coverage isn't that big a deal. If the function does
precisely one logical thing, then you don't _need_ to test parts in
isolation - you can treat it as a black box and just ensure that it's
doing the right thing under various circumstances. However, this ties
in nicely with your next point...
> (3) Fault isolation. If you have a 100 line function that fails on line 73,
> that failure may have been introduced way back in line 16. By splitting the
> function up into smaller functions, you can more easily isolate where the
> failure comes from, by checking for violated pre- and post-conditions.
... and here's where #2 really shines. If you break your function in
two, the natural thing to do is to test each half separately, with the
correct preconditions, and examine its output. If your fault was on
line 16, your test for that half of the function has a chance of
detecting it. I don't have a Devil's Advocate put-down for this one,
save the rather weak comment that it's possible to check pre- and
post-conditions without refactoring. :)
> (4) Maintainability. It's just easier to document and reason about a
> function that does one thing, than one that tries to do everything. Which
> would you rather work with, individual functions for:
> ... omnomnom ...
> Even if each function is only called once, maintenance is simpler if the
> code is broken up into more easily understood pieces.
Yes, as long as you do the job intelligently. Goes back to what I said
about naming functions - in your kitchen example, every function has a
self-documenting name, which means you've broken it out more-or-less
correctly. (I'd still want to have
prepare_and_serve_five_course_meal() of course, but it would be
calling on all the others.) Breaking something out illogically doesn't
help maintainability at all - in fact, it'll make it worse. "So this
function does what, exactly? And if I need to add a line of code,
ought I to do it here, or over there? Does anyone else actually call
this function? MIGHT someone be reaching into my module and calling
this function directly? I'd better keep it... ugh."
> (5) Machine efficiency. This can go either way.
And that's the very best thing to say about efficiency. Ever. In C, I
can write static functions and let the compiler inline them; in Java,
I tried to do the same thing, and found ridiculous overheads. Ended up
making a monolith rather than go through Java's overhead. But if I'd
changed what VM I was running it on, that might well have changed.
Profile, profile, profile.
> So that's four really good reasons for splitting code into functions, and
> one borderline one, other than code re-use. There may be others.
I'm sure there are. But let's face it: We're programming in PYTHON.
Not C, not Erlang, not Pike, not PHP. Python. If this has been the
right choice, then we should assume that efficiency isn't king, but
readability and maintainability probably are; so the important
considerations are not "will it take two extra nanoseconds to execute"
but "can my successor understand what the code's doing" and "will he,
if he edits my code, have a reasonable expectation that he's not
breaking stuff". These are always important.
ChrisA
> On Sat, Aug 27, 2011 at 4:16 AM, Steven D'Aprano
> <steve+comp....@pearwood.info> wrote:
>> I can think of at least five reasons apart from re-use why it might be
>> appropriate to pull out code into its own function or method even if it
>> is used in one place only:
>
> I'm glad you say "might be", because your five reasons aren't always
> reasons for refactoring. I'll play devil's advocate for a moment,
> because discussion is both fun and informative: :)
Naturally :)
I say "might be" because I mean it: these arguments have to be weighed up
against the argument against breaking code out of functions. It's easy to
imagine an extreme case where there are a billion *tiny* functions, each of
which does one micro-operation:
def f1(x): return x + 1
def f2(x): return 3*x
def f3(x): return f2(f1(x)) # instead of 3*(x+1)
...
If spaghetti code (GOTOs tangled all through the code with no structure) is
bad, so is ravioli code (code bundled up into tiny parcels and then thrown
together higgledy-piggledy). Both cases can lead to an unmaintainable mess.
Nobody is arguing that "More Functions Is Always Good". Sensible coders
understand that you should seek a happy medium and not introduce more
functions just for the sake of having More! Functions!.
But I'm not arguing with you, we're in agreement.
One last comment though:
[...]
> Definitely, but it's no value if you make every tiny thing into your
> own function. Sometimes the best way to code is to use lower-level
> functionality directly (not wrapping input() inside raw_select() for
> instance), and letting someone monkey-patch if they want to change
> your code. A judgment call.
I agree on the first part (don't split *everything* into functions) but I
think that the monkey-patch idea is tricky and dangerous in practice. The
first problem is, how do you know what needs to be monkey-patched? You may
not have access to the source code to read, and it may not be as obvious
as "oh, it gets input from the user, so it must be calling input()".
Second, even if you know what to monkey-patch, it's really hard to isolate
the modification to just the method you want. By their nature, monkey-
patches apply globally to the module. And if you patch the builtins module,
they apply *everywhere*.
So while monkey-patching can work, it's tricky to get it right and it should
be left as a last resort.
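A small demonstration of why a monkey-patch applies globally (the standard library's `random` module is used purely as a stand-in):

```python
# Patching random.randint is visible to *every* caller of the module,
# not just the one function you meant to change.
import random

def roll():
    return random.randint(1, 6)

def unrelated_code():
    return random.randint(1, 6)

original = random.randint
random.randint = lambda a, b: 6                # patch aimed "just at roll()"
patched_results = (roll(), unrelated_code())   # ...but both see it
random.randint = original                      # restoring is entirely on you
print(patched_results)  # -> (6, 6)
```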
--
Steven
This fails the "give it a decent name" test. Can you name these
functions according to what they do, as opposed to how they do it? For
instance:
def add_flagfall(x): return x + 1 # add a $1 flagfall to the price
def add_tax(x): return 3*x # this is seriously nasty tax
def real_price(x): return add_tax(add_flagfall(x)) # instead of 3*(x+1)
This would be acceptable, because each micro-operation has real
meaning. I'd prefer to do it as constants rather than functions, but
at least they're justifying their names.
And you're absolutely right about monkey-patching.
ChrisA
> the important
> considerations are not "will it take two extra nanoseconds to execute"
> but "can my successor understand what the code's doing" and "will he,
> if he edits my code, have a reasonable expectation that he's not
> breaking stuff". These are always important.
Forget about your successor. Will *you* be able to figure out what you
did 6 months from now? I can't tell you how many times I've looked at
some piece of code, muttered, "Who wrote this crap?" and called up the
checkin history only to discover that *I* wrote it :-)
Heh. In that case, you were your own successor :) I always word it as
a different person to dodge the "But I'll remember!" excuse, but you
are absolutely right, and I've had that exact same experience myself.
Fred comes up to me and says, "How do I use FooMatic?" Me: "I dunno,
ask Joe." Fred: "But didn't you write it?" Me: "Yeah, that was years
ago, I've forgotten. Ask Joe, he still uses the program."
ChrisA
When you consider that you're looking at the code six months later it's
likely for one of three reasons: you have to fix a bug; you need to add
features; or the code's only now getting used.
So you then take the extra 20-30 minutes, tease the code apart, refactor
as needed and end up with better more readable debugged code.
I consider that the right time to do this type of cleanup.
For all the crap I write that works well for six months before needing
to be cleaned up, there's a whole lot more crap that never gets looked
at again that I didn't clean up and never spent the extra 20-30 minutes
considering how my future self might view what I wrote.
I'm not suggesting that you shouldn't develop good coding habits that
adhere to established standards and result in well structured readable
code, only that if that ugly piece of code works that you move on. You
can bullet proof it after you uncover the vulnerabilities.
Code is first and foremost written to be executed.
Emile
+1 QOTW. Yes, it'll be read, and most likely read several times, by
humans, but ultimately its purpose is to be executed.
And in the case of some code, the programmer needs the same treatment,
but that's a different issue...
ChrisA
> On Sun, Aug 28, 2011 at 3:27 AM, Emile van Sebille <em...@fenx.com> wrote:
>> Code is first and foremost written to be executed.
>>
>
> +1 QOTW. Yes, it'll be read, and most likely read several times, by
> humans, but ultimately its purpose is to be executed.
You've never noticed the masses of code written in text books, blogs, web
pages, discussion forums like this one, etc.?
Real world code for production is usually messy and complicated and filled
with data validation and error checking code. There's a lot of code without
that, because it was written explicitly to be read by humans, and the fact
that it may be executed as well is incidental. Some code is even written in
pseudo-code that *cannot* be executed. It's clear to me that a non-trivial
amount of code is specifically written to be consumed by other humans, not
by machines.
It seems to me that, broadly speaking, there are languages designed with
execution of code as the primary purpose:
Fortran, C, Lisp, Java, PL/I, APL, Forth, ...
and there are languages designed with *writing* of code as the primary
purpose:
Perl, AWK, sed, bash, ...
and then there are languages where *reading* is the primary purpose:
Python, Ruby, Hypertalk, Inform 7, Pascal, AppleScript, ...
and then there are languages where the torment of the damned is the primary
purpose:
INTERCAL, Oook, Brainf*ck, Whitespace, Malbolge, ...
and then there are languages with few, or no, design principles to speak
of, or that serve as compromise languages which (deliberately or
accidentally) straddle the other categories. It all depends on the
motivation and values of the
language designer, and the trade-offs the language makes. Which category
any specific language may fall into may be a matter of degree, or a matter
of opinion, or both.
--
Steven
Yes, I'm aware of the quantities of code that are primarily for human
consumption. But in the original context, which was of editing code
six months down the track, I still believe that such code is primarily
for the machine. In that situation, there are times when it's not
worth the hassle of writing beautiful code; you'd do better to just
get that code generated and in operation.
Same goes for lint tools and debuggers - sometimes, it's easier to
just put the code into a live situation (or a perfect copy of) and see
where it breaks, than to use a simulation/test harness.
ChrisA
> and then there are languages with few, or no, design principles to speak of
Oh, like PHP?
> Code is first and foremost written to be executed.
−1 QotW. I disagree, and have a counter-aphorism:
“Programs must be written for people to read, and only incidentally for
machines to execute.”
—Abelson & Sussman, _Structure and Interpretation of Computer Programs_
Yes, the primary *function* of the code you write is for it to
eventually execute. But the primary *audience* of the text you type into
your buffer is not the computer, but the humans who will read it. That's
what must be foremost in your mind while writing that text.
--
\ “If you can't beat them, arrange to have them beaten.” —George |
`\ Carlin |
_o__) |
Ben Finney
> “Programs must be written for people to read, and only incidentally for
> machines to execute.”
> —Abelson & Sussman, _Structure and Interpretation of Computer Programs_
>
That's certainly self-fulfilling -- code that doesn't execute will need
to be read to be understood, and to be fixed so that it does run.
Nobody cares about code not intended to be executed. Pretty it up as
much as you have free time to do so to enlighten your intended audience.
Code that runs from the outset may not ever again need to be read, so
the only audience will ever be the processor.
I find it much too easy to waste enormous amounts of time prettying up
code that works. Pretty it up when it doesn't -- that's the code that
needs the attention.
Emile
> code that doesn't execute will need to be read to be understood, and
> to be fixed so that it does run.
That is certainly true, but it's not the whole story. Even code that
works perfectly today will need to be modified in the future. Business
requirements change. Your code will need to be ported to a new OS.
You'll need to make it work for 64-bit. Or i18n. Or y2k (well, don't
need to worry about that one any more). Or with a different run-time
library. A new compiler. A different database. Regulatory changes
will impose new requirements. Or, your company will get bought and
you'll need to interface with a whole new system.
Code is never done. At least not until the project is dead.
Er, you're interpreting the quote... way overboard. No one's talking
about code that isn't intended to be executed, I don't think; the quote
includes, "and only incidentally for machines to execute." That's still
there, and it's still important. It should just not be the prime
concern while actually writing the code.
The code has to actually do something. If not, obviously you'll have to
change it.
The Pythonic emphasis on writing readable, pretty code isn't JUST about
making code that looks good; it's not merely an aesthetic that the
community endorses.
And although people often tout the very valid reason why readability
counts -- that code is read more often than written, and that coming back
to a chunk of code 6 months later and being able to understand fully
what it's doing is very important -- that's not the only reason
readability counts.
Readable, pretty, elegantly crafted code is also far more likely to be
*correct* code.
However, this:
> Code that runs from the outset may not ever again need to be read, so
> the only audience will ever be the processor.
>
> I find it much too easy to waste enormous amounts of time prettying up
> code that works. Pretty it up when it doesn't -- that's the code that
> needs the attention.
... seems to me to be a rather significant self-fulfilling prophecy in
its own right. The chances that the code does what it's supposed to do,
accurately, and without any bugs, goes down in my experience quite
significantly the farther away from "pretty" it is.
If you code some crazy, overly clever, poorly organized, messy chunk of
something that /works/ -- that's fine and dandy. But unless you have
some /seriously/ comprehensive test coverage then the chances that you
can eyeball it and be sure it doesn't have some subtle bugs that will
call you back to fix it later, is pretty low. In my experience.
It's not that pretty code is bug-free, but code which is easily read and
understood is vastly more likely to be functioning correctly and reliably.
Also... it just does not take that much time to make "pretty code". It
really doesn't.
The entire idea that it's hard, time-consuming, effort-draining or
difficult to make code clean and "pretty" from the get-go is just wrong.
You don't need to do a major "prettying up" stage after the fact. Sure,
sometimes refactoring would greatly help a body of code as it evolves,
but you can do that as it becomes beneficial for maintenance reasons and
not just for pretty's sake.
--
Stephen Hansen
... Also: Ixokai
... Mail: me+list/python (AT) ixokai (DOT) io
... Blog: http://meh.ixokai.io/
Once Abraham Lincoln was asked how long a man's legs should be. (Well,
he was a tall man and had exceptionally long legs... his bed had to be
specially made.)
Old Abe said, "A man's legs ought to be long enough to reach from his
body to the floor".
One time the Austrian Emperor decided that one of Wolfgang Amadeus
Mozart's masterpieces contained too many notes... when asked how many
notes a masterpiece ought to contain it is reported that Mozart
retorted, "I use precisely as many notes as the piece requires, not one
note more, and not one note less".
After starting the Python interpreter, import this:
import this
... study carefully. If you're not Dutch, don't worry if some of it
confuses you. ... apply liberally to your function praxis.
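For anyone who hasn't tried it, the Zen is an ordinary standard-library
module. As an in-joke its text is stored ROT13-encoded in the module, so
besides the printout you can recover it programmatically:

```python
# Importing the module prints the Zen of Python as a side effect.
import this

# Easter egg: the text also lives, ROT13-encoded, in this.s, so it
# can be decoded and inspected like any other string.
import codecs

zen = codecs.decode(this.s, "rot13")
print(zen.splitlines()[0])  # "The Zen of Python, by Tim Peters"
```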
kind regards,
--
m harris
FSF ...free as in freedom/
http://webpages.charter.net/harrismh777/gnulinux/gnulinux.htm
Even when lots of context is needed, defining the context with
function calls is a big improvement over directly using names in
a module's global namespace.
Sometimes repeatedly reused context suggests that creating new
classes of objects might be a good idea.
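A minimal sketch of both points, with every name hypothetical: the same
computation written against a global, written against a parameter, and
wrapped in a class once the context keeps travelling together:

```python
# Fragile: depends on a module-level global being set up elsewhere.
tax_rate = 0.08

def total_with_tax_global(price):
    return price * (1 + tax_rate)

# Clearer: the context arrives explicitly, through the call.
def total_with_tax(price, rate):
    return price * (1 + rate)

# When the same context is passed around repeatedly, a class may be
# the better home for it.
class Register:
    def __init__(self, rate):
        self.rate = rate

    def total(self, price):
        return price * (1 + self.rate)

print(total_with_tax(100, 0.08))
print(Register(0.08).total(100))
```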
> I would split the function only when both halves (caller and
> callee) can be given short and useful names - if you can't
> explain what a block of code does in a few words, it's probably
> a poor choice for splitting out into a function.
I agree, except for the implied unconditional preference for
short names. I believe the length of a name should usually be
proportional to the scope of the object it represents.
In my house, I'm dad. In my chorus, I'm Neil. In town I'm Neil
Cerutti, and in the global scope I have to use a meaningless
unique identifier. Hopefully no Python namespace ever gets that
big.
--
Neil Cerutti
Oh, I definitely prefer short names to this:
http://thedailywtf.com/Articles/Double-Line.aspx
"Short" is a relative term. If the function's name is 20 characters
long and meaningful, that's fine.
> In my house, I'm dad. In my chorus, I'm Neil. In town I'm Neil
> Cerutti, and in the global scope I have to use a meaningless
> unique identifier. Hopefully no Python namespace ever gets that
> big.
Chorus? Does that imply that you sing? Neat :)
What you have, I think, is a module named Cerutti, in which you have a
class of which Neil is an instance. Inside method functions, you can
be referenced by "self" (which is to code what pronouns are to
English); outside of them, you are referred to as Neil; and outside
the module, Cerutti.Neil is the cleanest way to reference you. But
your name is still Neil, no matter how you're referenced.
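The analogy might be sketched, tongue firmly in cheek and with every
name hypothetical, as:

```python
class Person:
    def __init__(self, name):
        self.name = name

    def introduce(self):
        # Inside a method, "self" plays the role of a pronoun.
        return f"Hi, I'm {self.name}"

# Pretend this module is named Cerutti:
Neil = Person("Neil")

print(Neil.introduce())  # within the module, plain "Neil" suffices;
                         # from outside you'd write Cerutti.Neil
```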
Chris Angelico
whose name is sometimes Chris, sometimes Rosuav, and sometimes "Chris
or Michael" by people who can't distinguish him from his brother
Wait... not all Python programmers sing?
> What you have, I think, is a module named Cerutti, in which you
> have a class of which Neil is an instance. Inside method
> functions, you can be referenced by "self" (which is to code
> what pronouns are to English); outside of them, you are
> referred to as Neil; and outside the module, Cerutti.Neil is
> the cleanest way to reference you. But your name is still Neil,
> no matter how you're referenced.
The problem with that scenario is that, in real life, there's
more than one Cerutti.Neil, and they like to move around. ;)
--
Neil Cerutti
I do, and there seems to be more than coincidental overlap between
musos and coders.
> The problem with that scenario is that, in real life, there's
> more than one Cerutti.Neil, and they like to move around. ;)
Yes indeed; which means that your Cerutti module is in a package:
from norwich import Cerutti
It's always possible to make a locally-unique identifier into a more
globally unique one by prepending another tag to it. Alternatively,
you need to be duck-typed: you're the Neil Cerutti who writes code,
and if some other Neil Cerutti is asked to write code, he will throw
an exception. That's probably the easiest way to deal with it - but I
don't know of a way to implement it in a coded way. Maybe all names
actually point to lists of objects, and whenever you try to do
something with a name, the system goes through the elements of the
list until one doesn't fail?
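That last idea -- one name holding several candidates, each tried in
turn until one doesn't fail -- can at least be sketched by hand (all
names hypothetical):

```python
class AmbiguousName:
    """One name bound to several candidates; a call is tried against
    each in turn until one succeeds."""

    def __init__(self, *candidates):
        self.candidates = candidates

    def __call__(self, *args, **kwargs):
        for candidate in self.candidates:
            try:
                return candidate(*args, **kwargs)
            except Exception:
                continue  # this one "threw an exception"; try the next
        raise TypeError("no candidate could handle the call")

def neil_the_coder(task):
    if task != "write code":
        raise ValueError("not my department")
    return "done"

def neil_the_singer(task):
    if task != "sing":
        raise ValueError("not my department")
    return "la la la"

neil = AmbiguousName(neil_the_coder, neil_the_singer)
print(neil("write code"))  # duck-typed dispatch picks the coder
print(neil("sing"))        # ...and here, the singer
```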
Going back to the original question, the length of function name
required for it to be "meaningful" is, obviously, a variable quantity.
But I think it's still reasonable to use that as a rule of thumb for
dividing functions - if you can sanely name both halves, without
putting the entire code into the function name, then you have a case
for refactoring.
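As a hypothetical illustration of that rule of thumb, here's a split
where both halves carry short, honest names:

```python
def load_scores(path):
    """Read one integer score per line from a file."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]

def average(scores):
    return sum(scores) / len(scores)

# A split that *fails* the test would need a name like
# read_file_strip_blanks_convert_ints_and_average -- a sign the
# pieces don't stand alone as meaningful steps.
```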
ChrisA
> On 2011-08-29, Chris Angelico <ros...@gmail.com> wrote:
> > Chorus? Does that imply that you sing? Neat :)
>
> Wait... not all Python programmers sing?
All Python programmers sing. Some of them should not.
--
\ “To be is to do” —Plato |
`\ “To do is to be” —Aristotle |
_o__) “Do be do be do” —Sinatra |
Ben Finney