suggestion for os.path.commonprefix

Cristian Barbarosie

May 21, 2002, 5:45:49 AM
I noticed the following peculiar behaviour of os.path.commonprefix:

>>> os.path.commonprefix(['/home/someuser/modulef/essay.tex',
... '/home/someuser/mollifiers/fig01.ps'])
'/home/someuser/mo'

I think the answer should be '/home/someuser/'. At least, this is
what I expect when I do pathname manipulations in Python.
So I propose the following replacement for os.path.commonprefix:

-------------------------------------------------------------------
def commonprefix (list_of_paths):
    """Like in os.path, but more in the spirit of path manipulation.
    Items must end in os.sep if they represent directories."""
    number_of_paths = len(list_of_paths)
    return ''
    first_path = list_of_paths[0]
    largest_i = -1
    for i in range(len(first_path)):
        stop = 0
        character = first_path[i]
        for n in range(1,number_of_paths):
            item = list_of_paths[n]
            if (i >= len(item)) or (character != item[i]):
                # here we took advantage of the way Python evaluates
                # logical expressions: if the first one is true,
                # then the second one is not evaluated
                stop = 1
                break
        if stop: break
        if character == os.sep: largest_i = i
    prefix = first_path[:largest_i+1]
    return prefix
--------------------------------------------------------------------

Notes:

1. I know that "if number_of_paths == 0" is the same as "if not
number_of_paths", but I prefer the first version because it is more readable.

2. It is important that directory names end in os.sep. This means that
you cannot use os.path.dirname, as it returns something not ending in
os.sep:
>>> os.path.dirname('/home/someuser/modulef/essay.tex')
'/home/someuser/modulef'
So I built my own dirname:

def dirname (path):
    """The drawback of os.path.dirname is that it does not give
    a name ending in / (that is, ending in os.sep),
    except when it returns the root directory."""
    dir = os.path.dirname (path)
    l = len(dir)
    if l==0: return ''
    if dir[l-1] != os.sep:
        dir = dir + os.sep
    return dir

3. The problem with my version of commonprefix is that, if you give
just the name of a _file_ as an argument, it returns the parent
directory instead of the file itself:
>>> commonprefix(['/home/someuser/modulef/essay.tex'])
'/home/someuser/modulef/'
For me this behaviour is acceptable; for others it may not be.
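
On note 2, a shorter route to a directory name that ends in a separator may be to join the result of dirname with an empty string, since os.path.join appends a separator only when one is needed. The sketch below is one possible way to do it, not part of the original proposal. The name dirname_with_sep is hypothetical, and posixpath (the POSIX flavour of os.path) is used only so that the results do not depend on the host platform.

```python
import posixpath  # POSIX flavour of os.path; results are the same on any host


def dirname_with_sep(path):
    """Like posixpath.dirname, but a non-empty result always ends in '/'.

    Hypothetical helper, shown as an alternative to a hand-written check.
    """
    # Joining with an empty last component adds a trailing separator only
    # when the directory does not already end in one; '' joined with '' is ''.
    return posixpath.join(posixpath.dirname(path), '')


print(dirname_with_sep('/home/someuser/modulef/essay.tex'))
```

The empty result for a bare file name and the single '/' for a file in the root directory follow the same cases that the hand-written dirname handles.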

Thank you for your attention,
Cristian Barbarosie
barb...@lmc.fc.ul.pt
http://www.lmc.fc.ul.pt/~barbaros

Cristian Barbarosie

May 21, 2002, 11:51:22 AM
One line disappeared mysteriously from the code in my previous posting
(if number_of_paths == 0). Below is the full version. Sorry for the typo.

One more comment: yes, I suppose dirname could be defined as
def dirname(path): return commonprefix([path])

Cristian Barbarosie
http://www.lmc.fc.ul.pt/~barbaros
-----------------------------------------------------------------

def commonprefix (list_of_paths):
    """Like in os.path, but more in the spirit of path manipulation.
    Items must end in os.sep if they represent directories."""
    number_of_paths = len(list_of_paths)

    if number_of_paths == 0: return ''
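
Putting the two postings together gives roughly the sketch below. It assumes the rest of the function is unchanged from the first posting; the indentation is reconstructed, `!=` stands in for `<>`, and the `sep` parameter is a hypothetical addition (not in the original) so that the examples behave the same on every platform.

```python
import os


def commonprefix(list_of_paths, sep=os.sep):
    """Longest common prefix of the paths that ends at a separator.

    Items must end in `sep` if they represent directories.
    `sep` is an illustrative parameter; the posted code used os.sep directly.
    """
    number_of_paths = len(list_of_paths)
    if number_of_paths == 0:
        return ''
    first_path = list_of_paths[0]
    largest_i = -1  # index of the last separator seen inside the common prefix
    for i in range(len(first_path)):
        character = first_path[i]
        stop = 0
        for n in range(1, number_of_paths):
            item = list_of_paths[n]
            # `or` short-circuits, so item[i] is only read when i is in range
            if (i >= len(item)) or (character != item[i]):
                stop = 1
                break
        if stop:
            break
        if character == sep:
            largest_i = i
    return first_path[:largest_i + 1]


print(commonprefix(['/home/someuser/modulef/essay.tex',
                    '/home/someuser/mollifiers/fig01.ps'], sep='/'))
```

With a single file path the result is the parent directory, as described in note 3 of the first posting.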

Tim Peters

May 21, 2002, 11:07:21 PM
[Cristian Barbarosie]

> I noticed the following peculiar behaviour of os.path.commonprefix:
>
> >>> os.path.commonprefix(['/home/someuser/modulef/essay.tex',
> ... '/home/someuser/mollifiers/fig01.ps'])
> '/home/someuser/mo'

As the docs say,

Return the longest path prefix (taken character-by-character) that is
a prefix of all paths in list. If list is empty, return the empty
string (''). Note that this may return invalid paths because it works
a character at a time.

> I think the answer should be '/home/someuser/'. At least, this is
> what I expect when I do pathname manipulations in Python.
> So I propose the following replacement for os.path.commonprefix:

That was tried before, and it broke too much code. You could push for a new
function with a new name, but commonprefix won't change again.

If you want to write one, note that os.sep isn't enough: os.altsep also
needs to be considered, and at least on Windows both forward and backward
slashes are legit path separators but os.altsep is None. Then some people
will complain that it's platform-dependent (commonprefix as-is is not).
Others will insist that, in your example, you leave the trailing slash in,
while others will insist that you don't. Have fun <wink>.
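
For readers on a later Python: a separately named function, os.path.commonpath, was added in Python 3.5, and it compares whole path components. The sketch below contrasts it with commonprefix. It uses posixpath and ntpath directly (the POSIX and Windows flavours of os.path) so that the output does not depend on the host platform; the variable names are illustrative only.

```python
import ntpath      # Windows flavour of os.path
import posixpath   # POSIX flavour of os.path

paths = ['/home/someuser/modulef/essay.tex',
         '/home/someuser/mollifiers/fig01.ps']

# Character by character: the result can end in the middle of a component.
by_char = posixpath.commonprefix(paths)

# Component by component (Python 3.5+): the result has no trailing slash.
by_component = posixpath.commonpath(paths)

# On Windows both slash styles separate components; the result is
# returned with backslashes.
win_common = ntpath.commonpath(['C:/home/a/x', 'C:\\home\\b\\y'])

print(by_char)
print(by_component)
print(win_common)
```

Note that commonpath settles the trailing-slash question by leaving the slash out, and that its result depends on which platform's rules are applied.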

Greg Ewing

May 22, 2002, 11:12:05 PM
Tim Peters wrote:
>
> If you want to write one, note that os.sep isn't enough: os.altsep also
> needs to be considered,

I wrote some path utilities a while ago, including
a common prefix finder, and I didn't use os.sep or
os.altsep. Instead, I picked the paths apart using
os.path.split and compared the elements.

Hang on, I should have them around somewhere...

...ah, here they are:

-----------------------------------------------------
#
# Some pathname utilities.
# By Greg Ewing, 2001
# License: Share and Enjoy
#

import os

def relative_path(base, target):
    """Given two absolute pathnames base and target,
    returns a relative pathname to target from the
    directory specified by base. If there is no common
    prefix, returns the target path unchanged.
    """
    common, base_tail, target_tail = split_common(base, target)
    #print "common:", common
    #print "base_tail:", base_tail
    #print "target_tail:", target_tail
    r = len(base_tail) * [os.pardir] + target_tail
    if r:
        return os.path.join(*r)
    else:
        return os.curdir

def split_common(path1, path2):
    """Return a tuple of three lists of pathname components:
    the common directory prefix of path1 and path2, the remaining
    part of path1, and the remaining part of path2.
    """
    p1 = split_all(path1)
    p2 = split_all(path2)
    c = []
    i = 0
    imax = min(len(p1), len(p2))
    while i < imax:
        if os.path.normcase(p1[i]) == os.path.normcase(p2[i]):
            c.append(p1[i])
        else:
            break
        i = i + 1
    return c, p1[i:], p2[i:]

def split_all(path):
    """Return a list of the pathname components of the given path.
    """
    result = []
    head = path
    while head:
        head2, tail = os.path.split(head)
        if head2 == head:
            break # reached root on Unix or drive specification on Windows
        head = head2
        result.insert(0, tail)
    if head:
        result.insert(0, head)
    return result

-----------------------------------------------------
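
The same component-splitting idea extends from two paths to a whole list, which is the case the thread started with. The sketch below is one possible adaptation, not part of the posted utilities: split_all is repeated so the block is self-contained, common_components is a hypothetical name, and posixpath is used in place of os.path only so that the output is the same on every platform.

```python
import posixpath  # swap in os.path to follow the host platform's rules


def split_all(path):
    """Return the list of components of path, root first."""
    result = []
    head = path
    while head:
        head2, tail = posixpath.split(head)
        if head2 == head:  # reached the root
            break
        head = head2
        result.insert(0, tail)
    if head:
        result.insert(0, head)
    return result


def common_components(paths):
    """Components shared, position by position, by every path in paths."""
    if not paths:
        return []
    common = []
    # zip stops at the shortest component list, so no index checks are needed
    for parts in zip(*[split_all(p) for p in paths]):
        first = posixpath.normcase(parts[0])
        if any(posixpath.normcase(p) != first for p in parts):
            break
        common.append(parts[0])
    return common


paths = ['/home/someuser/modulef/essay.tex',
         '/home/someuser/mollifiers/fig01.ps']
print(common_components(paths))
print(posixpath.join(*common_components(paths)))
```

Returning a list of components leaves the trailing-separator decision to the caller, who can join the list with or without a final empty component.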

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg

Cristian Barbarosie

May 23, 2002, 11:45:50 AM
Greg Ewing <lo...@replyto.address.invalid> wrote in message news:<3CEC5E05...@replyto.address.invalid>...

> I wrote some path utilities a while ago, including
> a common prefix finder, and I didn't use os.sep or
> os.altsep. Instead, I picked the paths apart using
> os.path.split and compared the elements.

You should not have changed the subject line, I almost missed
your post. Thank you for the code.
Cristian Barbarosie
http://www.lmc.fc.ul.pt/~barbaros

Greg Ewing

May 23, 2002, 8:11:02 PM
Cristian Barbarosie wrote:
>
> You should not have changed the subject line, I almost missed
> your post.

Sorry -- I forget sometimes that there are newsreaders
that don't understand threads properly. I changed it
to try to make it more obvious that there was something
attached, lest people miss it! Looks like I can't
win...

> Thank you for the code.

You're welcome,
