vs. Python


Dan Fichter

Aug 30, 2009, 3:24:50 AM
to clo...@googlegroups.com
I'd like to convince some Python developers that Clojure is not foreign and does many things better than Python.  I'd appreciate whatever suggestions you have about what I've written.

Though I am crazy about STM and Clojure agents, I don't deal with them here.  This has to be a gentle introduction rather than a thorough survey of Clojure, so I've left out other language features, too.

Hello, world.

[Clojure]

(println "Hello, world")

[Python]

print "Hello, world"

In a function.

[Clojure]

(defn foo []
  (println "Hello, world"))

[Python]

def foo():
  print "Hello, world"

A list (in Clojure, a vector) of three integers.

[Clojure]

[1 2 3]

[Python]

[1, 2, 3]

A hash-set of three integers.

[Clojure]

#{1 2 3}

[Python]

set([1, 2, 3])

A hash-map.

[Clojure]

{0 false 1 true}

[Python]

{0: False, 1: True}

Test for the equality of two collections.

[Clojure]

(= [1 2 3] [1 2 3])

[Python]

[1, 2, 3] == [1, 2, 3]

Clojure tests for the value-equality of collections, just like Python does.

Test for the equality of two distinct objects.

[Clojure]

(not= (new Exception) (new Exception))

[Python]

Exception() != Exception()

Again, Clojure works just like Python.  Two distinct objects are not equal.

Print the integers 1-10.

[Clojure]

(doseq [i (range 1 11)]
  (println i))

[Python]

for i in range(1, 11):
  print i

Try calling foo with x; call bar with an AssertionError if one is thrown; and finally call baz.

[Clojure]

(try
  (foo x)
  (catch AssertionError e
    (bar e))
  (finally
    (baz)))

[Python]

try:
  foo(x)
except AssertionError, e:
  bar(e)
finally:
  baz()

Note that Clojure doesn't have its own exception type system.  It simply lets you catch, throw, and reflect on Java's exception objects, directly.

If x is true, call foo with y.  Otherwise, call bar with y.

[Clojure]

(if x
  (foo y)
  (bar y))

[Python]

if x:
  foo(y)
else:
  bar(y)

Same thing in a function.

[Clojure]

(defn baz [x y]
  (if x
    (foo y)
    (bar y)))

[Python]

def baz(x, y):
  if x:
    foo(y)
  else:
    bar(y)

Same thing, but baz returns whatever foo or bar returned.

[Clojure]

(defn baz [x y]
  (if x
    (foo y)
    (bar y)))

[Python]

def baz(x, y):
  if x:
    return foo(y)
  else:
    return bar(y)

The Clojure didn't change, because Clojure doesn't have return statements.  A function simply returns the last form it evaluates.

This makes it tricky to write a function that has a dozen different return points, which could be something you miss or something you're glad not to be tempted by, depending on who you are.  The absence of return statements makes your Clojure functions feel referentially transparent, which, ideally, they will be.

If x is true, call foo and then foo2 with y.  Otherwise, call bar with y.

[Clojure]

(if x
  (do (foo y)
      (foo2 y))
  (bar y))

[Python]

if x:
  foo(y)
  foo2(y)
else:
  bar(y)

The price of Clojure's not requiring else statements is that an if statement only takes three forms: the test, the form to evaluate if the test is true, and the form to evaluate otherwise.  (It's fine to omit the last of these.)  If you want any of these forms to do more than one thing, you need to wrap it in do, so it's recognized as a single form.

Whitespace conventions make it still very readable.  There are just two forms in the column inside if: the do block and the bar block.

If you don't need an else block, you can replace if with when and drop the do, because when supplies one implicitly:

(when x
  (foo y)
  (foo2 y))

Find the first match of "a." in "abac".

[Clojure]

(re-find #"a." "abac")

[Python]

import re
re.search("a.", "abac").group()

Clojure has regexp functions in the language core and gives you convenient #"<pattern>" syntax for regexp patterns.  Python's re makes you call group() on a regexp match, which Clojure doesn't.

Find the first match of "a." in the string x, which may or may not contain the pattern "a.".

[Clojure]

(re-find #"a." x)

[Python]

import re
match = re.search("a.", x)
if match:
  match.group()

re.search() returns None if there is no match, which of course you wouldn't be allowed to call group() on.  The Clojure is just a lot nicer.  re-find simply returns the string you want or nil if there isn't a match.
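
For comparison, the Python side can get close to the shape of re-find with a small wrapper. This is only a sketch in Python 3 syntax; re_find is a hypothetical helper name, not something in Python's re module, and it only mirrors re-find for patterns without capture groups.

```python
import re

def re_find(pattern, text):
    """Return the first match of pattern in text, or None if there is none.

    Hypothetical helper; it hides the "check for None, then call group()"
    step that re.search() otherwise leaves to the caller.
    """
    match = re.search(pattern, text)
    return match.group() if match else None

print(re_find("a.", "abac"))  # ab
print(re_find("a.", "xyz"))   # None
```

The wrapper has to be written and carried around by hand, which is the point of the comparison: Clojure ships this behavior in the core.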

Take a dictionary of defaults and a user-supplied dictionary and return a new dictionary that combines them, with the user's entries winning when both dictionaries have the same key.

[Clojure]

(defn foo [defaults user]
  (merge defaults user))

[Python]

def foo(defaults, user):
  d = defaults.copy()
  d.update(user)
  return d

def foo(defaults, user):
  d = {}
  for k, v in defaults.items() + user.items():
    d[k] = v
  return d

It's not the end of the world, but Python doesn't have a nice way of merging dictionaries, like Clojure has.

In Python, I could accidentally modify an input to foo, and I would, if I forgot to make a copy of defaults before calling update().

That mistake is impossible in Clojure, because all Clojure data structures are immutable.  I don't have to consult the Clojure API documentation to find out whether merge does anything destructive to its inputs.  It can't, because the inputs are Clojure data structures, so there's no risk of accidentally modifying them.
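
As a point of reference, later Python versions make the non-destructive merge shorter. The sketch below assumes Python 3.5 or newer (which postdates this thread) and uses a hypothetical function name, merge; it also checks that the inputs are left untouched.

```python
def merge(defaults, user):
    """Return a new dict; entries in user win when a key is in both."""
    # Dict unpacking builds a fresh dict, so neither argument is modified.
    return {**defaults, **user}

defaults = {"color": "blue", "size": 10}
user = {"size": 12}
merged = merge(defaults, user)

print(merged)    # {'color': 'blue', 'size': 12}
print(defaults)  # {'color': 'blue', 'size': 10}
```

The inputs stay intact here only because this particular expression happens to copy. Nothing in the language prevents a different implementation from calling defaults.update(user), which is the risk described above.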

Check whether every object in the collection x can be called like a function. 

[Clojure]

(defn foo [x]
  (every? ifn? x))

[Python]

def foo(x):
  for y in x:
    if not hasattr(y, "__call__"):
      return False
  return True

def foo(x):
  return x == [y for y in x if hasattr(y, "__call__")]

def foo(x):
  return x == filter(lambda y: hasattr(y, "__call__"), x)

Both the Python foo and Clojure foo will work on any type of collection x.  Clojure has two things Python doesn't: a higher-order function every? that returns whether a predicate (true/false-returning) function returns true for every element in a collection, and a predicate function ifn? that tells you whether its argument implements the function interface. 

Clojure has many more higher-order functions (like not-any?, some) and many more predicate functions (like nil?, number?, even?).

A lot of what we do is manipulate collections, and functional programming just makes it better.  There's room for error in Python.  In the first (and best-performing) Python example, you have an opportunity to mess up and put return True inside the for block rather than after it.  In the second Python example, you have an opportunity to mess up and call some useful but destructive method on y in your list comprehension, leaving the caller of foo with a modified x.

Clojure is safer because its data structures are immutable, and its functional style makes your code more readable and far more direct.

Think of the bugs you've seen that wouldn't have been there if you hadn't needed fancy nested loops with multiple return points or copy()s of collection objects.  Functional programming makes it impossible to have these kinds of bugs.

Read the contents of a file.

[Clojure]

(slurp filename)

[Python]

open(filename, 'r').read() # who cares about closing files opened in read-mode?

with open(filename, 'r') as f: # i do
  f.read()

Slurp is a reliable convenience.  It uses a java.io.BufferedReader and java.lang.StringBuilder and closes your file even if an exception is thrown while reading, just like the Python context manager does.

The point is that Clojure can be trusted to do file, network, and database I/O.  It simply uses Java's I/O libraries.  You can use them directly in Clojure, without the convenience of wrappers like slurp, if you want to.

Call foo and bar, redirecting any standard output to a different output stream, x.

[Clojure]

(binding [*out* x]
  (foo)
  (bar))

[Python]

import sys
try:
  sys.stdout = x
  foo()
  bar()
finally:
  sys.stdout = sys.__stdout__

Binding takes an already-defined global var, like *out*, and temporarily rebinds it to something new.  The rebinding isn't just lexically scoped; it follows the call stack and lasts until the block of code wrapped in binding is completed.  So if foo and bar do some printing and also call baz, which does some more printing, all of this output goes to x.  The rebinding is done inside a try/finally block, so it's guaranteed to be lifted when the binding block is completed.
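
For readers on newer Pythons, the standard library has a context manager with a similar shape. The sketch below assumes Python 3.4 or newer, where contextlib.redirect_stdout exists; foo and bar are stand-in functions defined only for the illustration.

```python
import contextlib
import io

def foo():
    print("from foo")

def bar():
    print("from bar")

x = io.StringIO()  # any object with a write() method would do

# sys.stdout is swapped on entry and restored on exit,
# even if foo() or bar() raises.
with contextlib.redirect_stdout(x):
    foo()
    bar()

print(x.getvalue(), end="")
```

One difference worth keeping in mind: Clojure's binding is per-thread, while redirect_stdout replaces the process-wide sys.stdout, so other threads that print during the block are redirected too.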

Define z as the lazy Cartesian product of two collections, x and y.

[Clojure]

(def z (for [i x j y] [i j]))

[Python]

def foo(x, y):
  for i in x:
    for j in y:
      yield i, j
z = foo(x, y)

import itertools
z = itertools.product(x, y) # new in Python 2.6

To get a lazy sequence in Python, you need to write a generator-function and then call it to get a generator, or use itertools.  Either way, you get an object that can be iterated over lazily but can only be iterated over once.  It's too easy to go wrong.  Loop over z once, and you're great.  Loop over it again, and nothing will happen: no action, and no exception. 

Clojure's for loops automatically result in lazy sequences.  Loop over the Clojure z as many times as you want, in different contexts, and you'll always get the entire sequence.  Clojure's lazy evaluation is simply much safer than Python's.
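
The one-pass behavior is easy to see in a few lines. The sketch below uses Python 3 syntax; Product is a hypothetical class, shown as one possible way to get a lazy product that can be looped over repeatedly, assuming x and y are themselves re-iterable collections.

```python
class Product:
    """Lazy Cartesian product that can be iterated more than once.

    Hypothetical helper: each call to __iter__ starts a fresh generator.
    """
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __iter__(self):
        return ((i, j) for i in self.x for j in self.y)

gen = ((i, j) for i in [1, 2] for j in "ab")
first = list(gen)
second = list(gen)  # the generator is already exhausted: no error, no items

z = Product([1, 2], "ab")

print(first)               # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
print(second)              # []
print(list(z) == list(z))  # True
```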

Define a capturer object that can do two things: capture something, and return everything it's captured so far.

[Clojure]

(defn create-capturer []
  (let [captured (atom #{})]
    {:capture          (fn [x] (swap! captured conj x))
     :get-all-captured (fn []  (deref captured))}))

[Python]

class Capturer(object):
  def __init__(self):
    self._captured = set([])
  def capture(self, x):
    self._captured.add(x)
  def get_all_captured(self):
    return self._captured.copy()

To get a Clojure capturer, you call (create-capturer).  To get a Python capturer, you call Capturer().

create-capturer returns a map containing two functions.  Look up the keyword :capture and you'll get a function that captures.  Look up the keyword :get-all-captured and you'll get a function that shows you your captured objects.

In this implementation, a mutable storage location called an atom points to your captured objects.  This is one of the ways Clojure does mutable state.  Your atom, captured, points to a hash-set.

Hash-sets, like all Clojure data structures, are immutable.  But you can repoint captured, your atom, to a different hash-set.  When you call (swap! captured conj x), you create a new hash-set, containing x along with your already-captured objects, and you repoint captured to this new hash-set.  When you call (deref captured), you get to peek at the hash-set that captured currently points to.

No one using create-capturer is going to be repointing your atom directly.  She can't, because create-capturer doesn't return captured directly.  It returns functions that close over captured, and the only way anyone can deal with captured is by calling your functions. 

Users don't need to know that you're using an atom, nor do you need to worry that a user could try to repoint captured except through your :capture function.  It's bullet-proof encapsulation, and it was easy to get.

I haven't used atoms until now.  I didn't need them for manipulating dictionaries, parsing strings, inspecting collections of callables, reading files, or redirecting standard output.  Mutable state is a special design pattern in Clojure, and you normally don't need it.  But when you do, you get a very simple system of mutable pointers to immutable data structures.

Is the Python better?  There's no notion of atoms, so it's simpler.  You can just modify a capturer's _captured directly.  But this means you have to hope that no one will.  And in get_all_captured(), you have to remember to return a copy of _captured, because you can assume people will want to pop() or update() or clear() or otherwise mess around with what get_all_captured() returns.  Why wouldn't they?
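
For what it's worth, Python can imitate the closure-based design too. The sketch below is Python 3 (it relies on nonlocal) and all names in it are hypothetical; a frozenset stands in for Clojure's immutable hash-set, so no defensive copy is needed.

```python
def create_capturer():
    captured = frozenset()  # immutable; the closure variable plays the atom's role

    def capture(x):
        nonlocal captured
        # Build a new frozenset and repoint the variable at it.
        captured = captured | {x}

    def get_all_captured():
        return captured  # safe to hand out: callers cannot mutate it

    return {"capture": capture, "get_all_captured": get_all_captured}

c = create_capturer()
c["capture"](1)
c["capture"](2)
print(c["get_all_captured"]())
```

Two caveats: the rebinding in capture is not atomic the way swap! is, and Python closures can still be reached through introspection, so the encapsulation here is by convention rather than guaranteed.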

The Clojure version is more concise and radically safer but a little more conceptually packed.  Is it worth your trouble?

Chas Emerick

Aug 30, 2009, 9:13:38 AM
to clo...@googlegroups.com

On Aug 30, 2009, at 3:24 AM, Dan Fichter wrote:

> I'd like to convince some Python developers that Clojure is not
> foreign and does many things better than Python. I'd appreciate
> whatever suggestions you have about what I've written.
>
> Though I am crazy about STM and Clojure agents, I don't deal with
> them here. This has to be a gentle introduction rather than a
> thorough survey of Clojure, so I've left out other language
> features, too.

It's certainly a comprehensive survey of parallel impls of various
patterns. I've no idea how effective it'll be, not being a Python
programmer (any more!), but it's worth a shot.

One thing did trouble me, though:

> Define a capturer object that can do two things: capture something,
> and return everything it's captured so far.
>
> [Clojure]
>
> (defn create-capturer []
>   (let [captured (atom #{})]
>     {:capture          (fn [x] (swap! captured conj x))
>      :get-all-captured (fn []  (deref captured))}))


I know you're trying to draw parallels, but I think this goes too far
in trying to write Python (or Java) in Clojure, especially given
immutable data structures. All you need are #{x} to create a
'capturer', (conj c x) to add a new item to it, and c to get the set
of things captured so far. I know you know that, but presenting the
above doesn't seem to do anyone any favors since it's absolutely not
idiomatic.

Alternatively, if you feel like you must approach a direct analogue,
this is a little better IMO (I cringe a little at poor-man's
closure-methods :-):

(defn capture
  ([x] (atom #{x}))
  ([c x] (do
           (swap! c conj x)
           c)))

And of course, @c will get you the set of captured objects.
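
For Python readers, the plain-value approach (no atom at all) can be approximated with frozenset. This is a rough Python 3 analogy, not a translation of the code above; the variable names are made up for the illustration.

```python
c0 = frozenset({"a"})  # roughly #{x}: a one-element immutable set
c1 = c0 | {"b"}        # roughly (conj c x): returns a new set

print(sorted(c0))  # ['a']
print(sorted(c1))  # ['a', 'b']
```

The original set is unchanged after the "add", which is why no capturer object is needed: the set itself is the record of what has been captured.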

- Chas

Jason Baker

Aug 30, 2009, 9:31:05 AM
to Clojure
On Aug 30, 2:24 am, Dan Fichter <daniel.fich...@gmail.com> wrote:
> The Clojure version is more concise and radically safer but a little more
> conceptually packed.  Is it worth your trouble?

Being primarily a Python programmer, I can say that the first thing my
co-workers would say is that Clojure isn't as readable as Python is.
And I think that there's some validity to this. Python is (in my
opinion) one of the most readable languages ever invented.

For example, consider these two snippets:

[Clojure]
(if x
  (foo y)
  (bar y))

[Python]
if x:
  foo(y)
else:
  bar(y)

The Python version reads like English whereas the Clojure version
doesn't. Now, I'm sure that there are plenty of people here that will
find the Clojure version more readable, but that's not really the
point. Your co-workers are likely to lean more towards my way of
thinking than they are to theirs.

One thing that was really eye-opening to me was Paul Graham's
statement that in writing on Lisp, you're literally writing the
language before you write the program. The implication to me was that
if Lisp isn't readable, it's your fault. :-)

So here's what I would do. I would say something more along the lines
of "Here's the Python version and here's the Clojure version. The
Clojure version has the following advantages... BUT if you're still
attached to the Python version, here's how you can write the Python
version in Lisp." Granted, I don't know if this is possible most of
the time, but surely there are a few cases where this would be easy to
do. To me, this is a much better approach than trying to say
"Clojure's going to force you to think in a totally different way."

Rob Wolfe

Aug 30, 2009, 10:34:31 AM
to Clojure


Dan Fichter wrote:
> I'd like to convince some Python developers that Clojure is not foreign and
> does many things better than Python. I'd appreciate whatever suggestions
> you have about what I've written.

[...]

> Check whether every object in the collection x can be called like a
> function.
>
> [Clojure]
>
> (defn foo [x]
>   (every? ifn? x))
>
> [Python]
>
> def foo(x):
>   for y in x:
>     if not hasattr(y, "__call__"):
>       return False
>   return True
>
> def foo(x):
>   return x == [y for y in x if hasattr(y, "__call__")]
>
> def foo(x):
>   return x == filter(lambda y: hasattr(y, "__call__"), x)
>
> Both the Python foo and Clojure foo will work on any type of collection x.
> Clojure has two things Python doesn't: a higher-order function every? that
> returns whether a predicate (true/false-returning) function returns true for
> every element in a collection, and a predicate function ifn? that tells you
> whether its argument implements the function interface.

Small correction here. I would write this function like this:

def foo(x):
  return all(callable(f) for f in x)

or like this:

def foo(x):
  return all(map(callable, x))

Python does have higher order functions "all" and "any".
They don't take predicates, but using lazy generator expressions
is pretty easy here.
Besides, using "callable" instead of "hasattr" is imho more Pythonic.

Your comparison does a great job of showing that Lisp does not
necessarily have to be unreadable and incomprehensible for Pythonistas.
In fact Clojure is very clean and readable, and my Pythonic
background didn't stand in the way of understanding it.
The problem begins when it comes to understanding algorithms
written in a functional style. For example, using "reduce" seems to be
very straightforward and common for Clojurians. For people coming from
the imperative world it is not so easy to use and understand.
It would be nice to have something like "Clojure for Python
programmers" with many examples of algorithms written in both Python
and Clojure style.
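
To make the "reduce" point concrete, here is the same small task written both ways in Python 3. This is only an illustrative sketch; the word list and the helper name step are made up for the example.

```python
from functools import reduce

words = ["a", "b", "a", "c", "a"]

# Imperative style: mutate a dict inside a loop.
counts_loop = {}
for w in words:
    counts_loop[w] = counts_loop.get(w, 0) + 1

# Functional style: each step takes the accumulated dict and one word,
# and returns a new dict instead of modifying the old one.
def step(acc, w):
    return {**acc, w: acc.get(w, 0) + 1}

counts_reduce = reduce(step, words, {})

print(counts_loop == counts_reduce)  # True
```

Reading the reduce version means holding "accumulator in, accumulator out" in your head, which is the habit that takes time to form when coming from loops.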

Br,
Rob

Dragan Djuric

Aug 30, 2009, 11:06:09 AM
to Clojure


> For example, consider these two snippets:
>
> [Clojure]
> (if x
>   (foo y)
>   (bar y))
>
> [Python]
> if x:
>   foo(y)
> else:
>   bar(y)
>

Yeah, but then, in plain old Java:
if (x) {
  foo(y);
} else {
  bar(y);
}

which is not that different at all. The point is that these trivial
micro-snippets don't tell you much about how the language fits into
real-world problems. They may indicate how approachable the language
will be to beginners, but nothing more...

Michael Wood

Aug 30, 2009, 12:11:41 PM
to clo...@googlegroups.com
2009/8/30 Dan Fichter <daniel....@gmail.com>:
[...]

> [Clojure]
>
> (println "Hello, world")
>
> [Python]
>
> print "Hello, world"

Or Python 3.x:

print("Hello, world")

> A list (in Clojure, a vector) of three integers.
>
> [Clojure]
>
> [1 2 3]

This can also be written in Clojure as [1, 2, 3]. Commas are
considered whitespace.

> A hash-map.
>
> [Clojure]
>
> {0 false 1 true}

This can, as above, be written with commas if preferred:

{0 false, 1 true}

--
Michael Wood <esio...@gmail.com>

CuppoJava

Aug 30, 2009, 12:32:02 PM
to Clojure
Your examples are very good I think. It always helps to have a
straight-forward conversion from one language to another for
beginners. They will eventually pick up idioms and methodology by
playing around.

One comparison that bothers me though is this:

> (not= (new Exception) (new Exception))
> Again, Clojure works just like Python. Two distinct objects are not
> equal.

This is not strictly true. (new Exception) is not equal to (new
Exception) because Exception inherits the default equals(), which is an
identity comparison. Other classes define equals() differently.

e.g. (not= (new Integer 2) (new Integer 2)) returns false, because
Integer compares by value.
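
The same distinction exists on the Python side, which may help when explaining it. In the Python 3 sketch below, Plain and Point are hypothetical classes: one keeps the default identity-based equality, the other defines value-based equality.

```python
class Plain:
    """No __eq__ defined, so == falls back to identity."""
    pass

class Point:
    """Defines __eq__, so two distinct objects can compare equal."""
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x

    def __hash__(self):
        return hash(self.x)

p1, p2 = Plain(), Plain()
q1, q2 = Point(2), Point(2)

print(p1 == p2)  # False: distinct objects, default equality
print(q1 == q2)  # True: distinct objects, equal by value
print(q1 is q2)  # False: still two separate objects
```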

-Patrick

James Cunningham

Aug 30, 2009, 11:19:22 AM
to Clojure
On Aug 30, 3:24 am, Dan Fichter <daniel.fich...@gmail.com> wrote:
> [...]
> Take a dictionary of defaults and a user-supplied dictionary and return a
> new dictionary that combines them, with the user's entries winning when both
> dictionaries have the same key.
>
> [Clojure]
>
> (defn foo [defaults user]
>   (merge defaults user))
>
> [Python]
> [...]
>
> def foo(defaults, user):
>   d = {}
>   for k, v in defaults.items() + user.items():
>     d[k] = v
>   return d

You don't need to explicitly build up a new list; using the dictionary
constructor will result in a fairer comparison. Clojure is still a
little nicer for having a built-in function.

def foo(defaults, user):
  return dict(defaults.items() + user.items())

> [...]
> Define z as the lazy Cartesian product of two collections, x and y.
>
> [Clojure]
>
> (def z (for [i x j y] [i j]))
>
> [Python]
>
> def foo(x, y):
>   for i in x:
>     for j in y:
>       yield i, j
> z = foo(x, y)
>
> import itertools
> z = itertools.product(x, y) # new in Python 2.6
>
> To get a lazy sequence in Python, you need to write a generator-function and
> then call it to get a generator, or use itertools.  

That's not quite right; you could also use a generator expression. The
Clojure idiom that you made use of is directly available in Python:

((i, j) for i in x for j in y)

gives you a lazy sequence of entries in the Cartesian product of x and
y.
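
One way to check that the generator expression really is lazy is to feed it an infinite input and take only a few items. The sketch below uses Python 3 syntax; the specific inputs are made up for the illustration.

```python
import itertools

x = itertools.count(1)  # infinite: 1, 2, 3, ...
y = "ab"

# Nothing is computed yet; pairs are produced only on demand.
z = ((i, j) for i in x for j in y)

print(list(itertools.islice(z, 5)))
# [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a')]
```

Note that y has to be re-iterable here, since it is walked once for every element of x.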

Best,
James

Jonathan

Aug 30, 2009, 11:48:33 AM
to Clojure
There are two styles of expression in higher level languages
(including Python and Clojure). Functional programming (map, filter,
reduce, fold) on one side and set (and list) comprehensions on the
other. This is somewhat a matter of culture, not capability. Although
slightly less convenient, functional programming is certainly possible
in Python, and likewise Clojure includes comprehensions. Python
programmers are more likely to reach for comprehensions and generator
expressions (lazy comprehensions) and Clojure programmers are more
likely to reach for functional operators and lazy sequences.

My experience, having used both comprehensions and functional languages,
is that they are about equally expressive, equally powerful, and
generally used for the same purposes. I would much rather program in a
language that has good tools for functional programming or
comprehensions than in a language that has neither. Other than that I
have no strong preference.

Some comprehension (style) languages: SETL (see http://en.wikipedia.org/wiki/SETL;
SETL is a long-lost, but very interesting family of very high level
languages based on set theory), Haskell, Python. It's interesting that
Python, which started out inspired by SETL, only adopted
comprehensions much later via Haskell. Python has become very
SETL-like over time.

Some functional (style) languages: Lisp, Scheme, ML, OCaml, Haskell,
Smalltalk, Ruby, Scala, and of course Clojure. Smalltalk began the
process of integrating the functional and object-oriented traditions
by basing all of its control constructs on "blocks" which are just
anonymous functions. Ruby's use of blocks is almost identical to
Smalltalk's.

I much prefer Python's more set-theory like syntax for comprehensions
to Scala and Clojure's versions, but as I mentioned, comprehensions
have a smaller role in languages that emphasize functional
programming. I am just as happy using map, filter, reduce, and
iterate. I suspect that, since there are many more languages that
emphasize the functional style, that style will continue to grow in
popularity. But Python also has a fully valid and equally expressive
way of saying some of the same things. I do think set-theory style
comprehensions (as in Python) are often more readable than
equivalent functional expressions. It's worth paying attention.

Clojure and Scala have other strong benefits, so I continue to do much
of my (exploratory) programming in those.

John Harrop

Aug 30, 2009, 3:16:36 PM
to clo...@googlegroups.com
There's an identical? predicate. It will return false for the above. Interestingly, it will return true for two of Integer/valueOf 2, and likewise for any other integer between -128 and 127 inclusive, otherwise false. Integer values that fit in a signed byte are apparently cached. But "new" must always generate a new object, so (identical? (new ...) (new ...)) always returns false no matter what replaces each "...".

Laurent PETIT

Aug 31, 2009, 8:08:43 AM
to clo...@googlegroups.com
Hi,

Just one point:

2009/8/30 Dan Fichter <daniel....@gmail.com>

Read the contents of a file.

[Clojure]

(slurp filename)

[Python]

open(filename, 'r').read() # who cares about closing files opened in read-mode?

"who cares about closing files opened in read-mode" ?

I would say anybody concerned about blowing up the underlying OS by not releasing file handles (especially if you open files in a tight loop), or am I missing something?

B Smith-Mannschott

Aug 31, 2009, 8:49:09 AM
to clo...@googlegroups.com

CPython will close the underlying file when its handle is garbage
collected. The collection itself will be prompt and deterministic due
to CPython's use of reference counting for GC. You shouldn't have a
problem, even in a tight loop.

Jython, on the other hand, ... uses Java's GC, which has many
advantages over Python's 70's style reference counter, but being
deterministic isn't one of them.

Incidentally, Python 2.6 provides something akin to Clojure's with-open macro:

with open( "somefile", "rb" ) as aFile:
  do_something_with_contents_of(aFile)
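
A small self-contained check of what the with block guarantees, written in Python 3 syntax. The temporary file and the simulated failure are assumptions made purely for the demonstration.

```python
import os
import tempfile

# Create a throwaway file to read back.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as f:
    f.write("hello")

try:
    with open(path, "r") as f:
        contents = f.read()
        raise ValueError("simulated failure after reading")
except ValueError:
    pass

print(contents)  # hello
print(f.closed)  # True: closed on the way out, despite the exception
os.remove(path)
```

The close happens when the block exits, on any Python implementation, rather than whenever the garbage collector gets around to it.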


// BEn

Konrad Hinsen

Aug 31, 2009, 9:03:31 AM
to clo...@googlegroups.com
On 31 Aug 2009, at 14:08, Laurent PETIT wrote:

> [Python]
>
> open(filename, 'r').read() # who cares about closing files opened in
> read-mode?
>
> "who cares about closing files opened in read-mode" ?
>
> I would say anybody concerned about blowing up the underlying OS by
> not releasing file handles (especially if you open files in a tight
> loop), or am I missing something?

In this particular case, there is no reason to worry: open() returns a
file object that is fed to the method read(), but after that method
returns, there is no more reference to the object, so it is garbage
collected. Upon destruction of the file object, Python closes the
file. All that is documented behaviour in Python, so it is safe to
rely on it.

It is another question if relying on a list of reasonable but not
evident behaviours is good style. Personally, I don't use such
constructs in library code, but I do in scripts for personal use.

In Clojure, I would be much more careful because I know the Java
libraries less well than the Python libraries. Which illustrates that
"good style" also depends on someone's experience.

Konrad.

Laurent PETIT

Aug 31, 2009, 9:24:15 AM
to clo...@googlegroups.com
Usually the Java libraries explicitly warn against relying on the finalize() method called by the GC to release OS resource handles; that's why I had the reflex of thinking this applied generally to all languages with a GC.

--
Laurent


Brian Hurt

Aug 31, 2009, 5:04:06 PM
to clo...@googlegroups.com
On Sun, Aug 30, 2009 at 9:31 AM, Jason Baker <amno...@gmail.com> wrote:

On Aug 30, 2:24 am, Dan Fichter <daniel.fich...@gmail.com> wrote:
> The Clojure version is more concise and radically safer but a little more
> conceptually packed.  Is it worth your trouble?

Being primarily a Python programmer, I can say that the first thing my
co-workers would say is that Clojure isn't as readable as Python is.



Any language you are familiar and comfortable with is going to seem much more readable and much more intuitive than a language you are unfamiliar with.  Even similarity to English presupposes a familiarity with and comfort with English- something most people on this planet don't have.  A native English speaker would find a programming language whose syntax was based on, say, Mandarin or Swahili, very "unintuitive".

The point here is that arguing in favor of a new language on the basis of intuitiveness and readability is a losing argument.

Instead, I'd concentrate on the advantages Clojure has- things like incredibly good parallelism capabilities, tight integration with Java (for example, can you extend Lucene's HitCollector abstract base class to implement your own hit collector in Jython?  This is an honest question- I really don't know), etc.

Brian

Brian Hurt

Aug 31, 2009, 5:15:11 PM
to clo...@googlegroups.com
On Mon, Aug 31, 2009 at 9:03 AM, Konrad Hinsen <konrad...@fastmail.net> wrote:


In this particular case, there is no reason to worry: open() returns a
file object that is fed to the method read(), but after that method
returns, there is no more reference to the object, so it is garbage
collected. Upon destruction of the file object, Python closes the
file. All that is documented behaviour in Python, so it is safe to
rely on it.


If I recall correctly (and correct me if I'm wrong), Python uses a reference counting garbage collector.  Which means as soon as the reference to the object goes away, the object gets collected and the handle closed.  Most JVMs use some form of mark & sweep algorithm, which means it may be some time before the object gets collected (and the resource freed).  This is especially the case in a generational GC system, where long-lived objects get collected much less frequently.  So, for long-running programs, it's possible to pile up uncollected resources to the point where you run out of the resource, simply because unused objects haven't been collected yet.

Generally, when you open a file descriptor, you should always make sure it gets closed when you're done with it.

Brian

John Harrop

Aug 31, 2009, 5:41:31 PM
to clo...@googlegroups.com
On Mon, Aug 31, 2009 at 5:15 PM, Brian Hurt <bhu...@gmail.com> wrote:
If I recall correctly (and correct me if I'm wrong), Python uses a reference counting garbage collector.  Which means as soon as the reference to the object goes away, the object gets collected and the handle closed.  Most JVMs use some form of mark & sweep algorithm, which means it may be some time before the object gets collected (and the resource freed).  This is especially the case in a generational GC system, where long-lived objects get collected much less frequently.  So, for long-running programs, it's possible to pile up uncollected resources to the point where you run out of the resource, simply because unused objects haven't been collected yet.

This suggests that when low-level JVM functions that try to get a file handle, socket, or what-not from the OS fail they should invoke the garbage collector in a full stop-the-world collection and then retry, just as the memory allocator already does, and throw the IOException only if they still fail afterward. (These resources tend to have finalizers, so the GC should be run twice back-to-back to collect them, or even repeatedly until no garbage was collected.)
 
Then most cases of this would cause the occasional very slow file handle acquisition instead of a crash or other error.

Generally, when you open a file descriptor, you should always make sure it gets closed when you're done with it.

But I do agree with this. Finalizers and gc of objects holding native resources are a safety net; it's better not to fall into it even when it's there. You might not die but the judges will be holding up placards reading 0.0, 0.1, 0.0, 1.2, 0.3 or some such after your performance. :)

John Harrop

Aug 31, 2009, 5:38:05 PM
to clo...@googlegroups.com
On Mon, Aug 31, 2009 at 5:04 PM, Brian Hurt <bhu...@gmail.com> wrote:
On Sun, Aug 30, 2009 at 9:31 AM, Jason Baker <amno...@gmail.com> wrote:
On Aug 30, 2:24 am, Dan Fichter <daniel.fich...@gmail.com> wrote:
> The Clojure version is more concise and radically safer but a little more
> conceptually packed.  Is it worth your trouble?

Being primarily a Python programmer, I can say that the first thing my
co-workers would say is that Clojure isn't as readable as Python is.



Any language you are familiar and comfortable with is going to seem much more readable and much more intuitive than a language you are unfamiliar with.  Even similarity to English presupposes a familiarity with and comfort with English- something most people on this planet don't have.  A native English speaker would find a programming language whose syntax was based on, say, Mandarin or Swahili, very "unintuitive".

The point here is that arguing in favor of a new language on the basis of intuitiveness and readability is a losing argument.

That may depend on the audience. If the audience is a bunch of Python programmers, similarities to Python may be quite relevant and not comprise a losing argument.
