path module

Ian Bicking

Jul 8, 2003, 3:57:49 AM

I think Jason Orendorff's path module is really nice:
http://www.jorendorff.com/articles/python/path/

Beats the hell out of os.path, which is an ugly thing indeed. The OO
interface means you could use the interface nicely to implement other
things, like URLs. The problem? It's just some module. The various os
functions (of which path replaces quite a few) have become idiomatic to
me, and I'm sure others as well. I find myself reluctant to use it in
code that's not essentially private, because it's changing something
small and seemingly trivial, and people won't be familiar with it.

The solution? It should be a builtin! Or, if not a builtin, included
in the os module. But I actually like the idea of it being a builtin --
if open is a builtin, path stands right up there too. It would get rid
of 90% of the use of the os module.

Thoughts? Reactions?

Ian

Just

Jul 8, 2003, 4:17:16 AM

In article <mailman.1057651032...@python.org>,
Ian Bicking <ia...@colorstudy.com> wrote:

I would greatly appreciate such a module in the std library, but while
Jason's module has some very cool features, to my taste it goes a bit
too far with overloading operators. I really don't like overloading a/b
to mean os.path.join(a, b) and I don't think the default iterator should
do a listdir (although list(mypath) is indeed cute). If it were toned
down a bit in this area I think we may be able to make a good case for
including it in the std library.

Just

Van Gale

Jul 8, 2003, 5:59:03 AM

Ian Bicking wrote:
> I think Jason Orendorff's path module is really nice:
> http://www.jorendorff.com/articles/python/path/

I love it and have been using it in a few personal projects. My only
gripe is its monolithic nature :)

Van

holger krekel

Jul 8, 2003, 5:32:19 AM

Ian Bicking wrote:
> I think Jason Orendorff's path module is really nice:
> http://www.jorendorff.com/articles/python/path/

Yes, looks nice.

> Beats the hell out of os.path, which is an ugly thing indeed. The OO
> interface means you could use the interface nicely to implement other
> things, like URLs. The problem? It's just some module. The various os
> functions (of which path replaces quite a few) have become idiomatic to
> me, and I'm sure others as well. I find myself reluctant to use it in
> code that's not essentially private, because it's changing something
> small and seemingly trivial, and people won't be familiar with it.
>
> The solution? It should be a builtin! Or, if not a builtin, included
> in the os module. But I actually like the idea of it being a builtin --
> if open is a builtin, path stands right up there too. It would get rid
> of 90% of the use of the os module.
>
> Thoughts? Reactions?

I agree that something like Jason Orendorff's path module should go into
the standard library. I've coded a similar module and i think that
a discussion about certain design decisions would probably improve our
approaches.

For example Jason lets the "path" object inherit from "str" (or unicode)
but i think it's better to provide a "__str__" method so that you can say

str(pathinstance).endswith('.py')

and *not* base the path object on str/unicode.

unicode(pathinstance)

would just fail if your platform doesn't support this. First, i tried
the inheritance approach, btw, but it is ambiguous (e.g. for the
join method: str.join vs. os.path.join).
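
A minimal sketch of that non-inheriting design, assuming a hypothetical `WrappedPath` class (the name and methods are illustrative and come from neither module), written for present-day Python 3. The object keeps the string internally and exposes it through `__str__`, so `join` can mean path joining without colliding with `str.join`:

```python
import posixpath  # fixed "/" separator so the example behaves the same everywhere


class WrappedPath:
    """Hypothetical path object that wraps a string instead of subclassing str."""

    def __init__(self, value):
        self._value = str(value)

    def __str__(self):
        return self._value

    def __repr__(self):
        return "WrappedPath(%r)" % self._value

    def join(self, *parts):
        # Unambiguous here: there is no inherited str.join to collide with.
        return WrappedPath(posixpath.join(self._value, *map(str, parts)))


p = WrappedPath("/src").join("pkg", "mod.py")
print(str(p).endswith(".py"))  # string methods are reached through str(...)
```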

Also, my module provides most of the os.path.* methods as "filters" so
you can say

dirs = filter(isdir, list_obj_pathobjects)
fnames = filter(AND(nolink, isfile), list_obj_pathobjects)

in addition to

pathobject.isfile()
etc.
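
One way such combinable filters might be written. `AND`, `isdir`, `isfile` and `nolink` are the names used above, but the bodies here are guesses built on `os.path` and operate on plain strings; they are not holger's actual code:

```python
import os
import tempfile


def isdir(p):
    return os.path.isdir(p)


def isfile(p):
    return os.path.isfile(p)


def nolink(p):
    return not os.path.islink(p)


def AND(*predicates):
    """Return a predicate that is true only if every given predicate is true."""
    def combined(p):
        return all(pred(p) for pred in predicates)
    return combined


# Demonstrate on a scratch directory holding one file and one subdirectory.
root = tempfile.mkdtemp()
sub = os.path.join(root, "sub")
os.mkdir(sub)
fname = os.path.join(root, "a.txt")
open(fname, "w").close()

entries = [os.path.join(root, name) for name in sorted(os.listdir(root))]
dirs = list(filter(isdir, entries))
fnames = list(filter(AND(nolink, isfile), entries))
```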

Recently, i also did some experimentation with "virtual-fs" features so
that you can transparently access http/ftp/svn files/directories. I even
got that to work with "<tab>-completion" but that was quite a hack :-)

I am pretty sure that virtual-fs-like-extensibility would be a big
"selling" point and would motivate the use of such a module and
finally the inclusion into the stdlib. Of course, the local-fs should
be the convenient case but it shouldn't be hard to use the same methods
for accessing remote "repositories".

Anyway, i am all for going in this direction and would probably
like to participate in such a development and design effort.

cheers,

holger

Ian Bicking

Jul 8, 2003, 5:15:34 AM

On Tue, 2003-07-08 at 03:17, Just wrote:
> I would greatly appreciate such a module in the std library, but while
> Jason's module has some very cool features, to my taste it goes a bit
> too far with overloading operators. I really don't like overloading a/b
> to mean os.path.join(a, b) and I don't think the default iterator should
> do a listdir (although list(mypath) is indeed cute). If it were toned
> down a bit in this area I think we may be able to make a good case for
> including it in the std library.

I've never wanted to iterate over a string, and find that an annoying
feature in Python (I'd *much* rather get an exception), so covering up
the previous string behavior doesn't seem a big deal to me. But I can
see why iterating over a path may be a little too magic. But paths are
also containers (at least if they point to a directory), so iterating
over them seems only natural. I could see something like mypath.dir()
being reasonable alternative (mypath.list() also, but that looks funny
to me because "list" is special to my eye).

I do like the /, though. mypath.joinpath(filename) is rather
long-winded (since he wisely avoids reusing the join method). / has no
meaning for strings, so it's not really overloading the operator, merely
adding it in a specific context. Several operators are reused for
different meanings, % in particular comes to mind, this doesn't seem
that bad. I like the way it looks and feels. It feels better to me
than using + for string concatenation ;) -- maybe because division is
not all that common an operation anyway.
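
For comparison, a small sketch of the two spellings side by side. The `Path` class is hypothetical and is not Jason Orendorff's implementation; it is written for present-day Python 3, where `/` dispatches to `__truediv__` (the Python 2 hook was `__div__`):

```python
import posixpath  # fixed "/" separator so the result is the same on any platform


class Path:
    """Hypothetical path wrapper used only to illustrate the operator."""

    def __init__(self, value):
        self.value = str(value)

    def joinpath(self, *parts):
        return Path(posixpath.join(self.value, *map(str, parts)))

    def __truediv__(self, other):
        # a / b is only sugar for a.joinpath(b)
        return self.joinpath(other)

    def __str__(self):
        return self.value


base = Path("/home/ian")
via_method = base.joinpath("src", "path.py")
via_operator = base / "src" / "path.py"
```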

Ian

Just

Jul 8, 2003, 6:11:52 AM

In article <mailman.105765679...@python.org>,
holger krekel <py...@devel.trillke.net> wrote:

> I am pretty sure that virtual-fs-like-extensibility would be a big
> "selling" point and would motivate the use of such a module and
> finally the inclusion into the stdlib. Of course, the local-fs should
> be the convenient case but it shouldn't be hard to use the same methods
> for accessing remote "repositories".

Excellent point!

Just

Just van Rossum

Jul 8, 2003, 6:07:48 AM

Ian Bicking wrote:

> On Tue, 2003-07-08 at 03:17, Just wrote:
> > I would greatly appreciate such a module in the std library, but
> > while Jason's module has some very cool features, to my taste it
> > goes a bit too far with overloading operators. I really don't like
> > overloading a/b to mean os.path.join(a, b) and I don't think the
> > default iterator should do a listdir (although list(mypath) is
> > indeed cute). If it were toned down a bit in this area I think we
> > may be able to make a good case for including it in the std library.
>
> I've never wanted to iterate over a string, and find that an annoying
> feature in Python (I'd *much* rather get an exception),

It's basically a side effect of having a __getitem__ that takes
integers.

> so covering
> up the previous string behavior doesn't seem a big deal to me.

It's not just that, but iterating over a path could _also_ mean to
iterate over the path elements, so it's not obvious it should iterate
over the directory contents. Also, what if path points to a file?

> But I
> can see why iterating over a path may be a little too magic. But
> paths are also containers (at least if they point to a directory), so
> iterating over them seems only natural. I could see something like
> mypath.dir() being reasonable alternative (mypath.list() also, but
> that looks funny to me because "list" is special to my eye).

I'd say path.listdir() is most natural.
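
A sketch of what keeping both readings explicit could look like. The `Path` class and its `parts()` method are hypothetical names for illustration; only `listdir` is taken from the discussion. With no `__iter__` defined, neither meaning is picked silently:

```python
import os
import tempfile


class Path:
    """Hypothetical wrapper with no __iter__, so neither reading is implicit."""

    def __init__(self, value):
        self.value = str(value)

    def listdir(self):
        # Directory contents; os.listdir raises OSError if this is a plain file.
        names = sorted(os.listdir(self.value))
        return [Path(os.path.join(self.value, n)) for n in names]

    def parts(self):
        # The path's own components, independent of what is on disk.
        return [part for part in self.value.split(os.sep) if part]

    def __str__(self):
        return self.value


root = tempfile.mkdtemp()
open(os.path.join(root, "a.txt"), "w").close()
open(os.path.join(root, "b.txt"), "w").close()

d = Path(root)
names = [os.path.basename(str(child)) for child in d.listdir()]
components = Path(os.path.join("usr", "local", "lib")).parts()
```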

> I do like the /, though. mypath.joinpath(filename) is rather
> long-winded (since he wisely avoids reusing the join method). / has
> no meaning for strings, so it's not really overloading the operator,
> merely adding it in a specific context.

It'll confuse the hell out of people who expect / to mean "divide" ;-)

> Several operators are reused
> for different meanings, % in particular comes to mind, this doesn't
> seem that bad. I like the way it looks and feels. It feels better
> to me than using + for string concatenation ;) -- maybe because
> division is not all that common an operation anyway.

I tend to think that this whole discussion shows that _maybe_ it's not
such a good idea to subclass str/unicode after all. (Aside: path.py
taught me about the existence of os.path.supports_unicode_filenames in
2.3, but at least on my platform (OSX) it has the wrong value. I opened
a bug, #767645.)

Some string methods are handy on paths, such as .endswith(), while
others are not, like .upper() or .splitlines(). Other string method
names or operators would make sense for paths, but -- as you say -- are
out of the question since their meaning would be quite different, eg.
.join() and +.

I find overloading / unnecessary, since it's not like you'll have to
write

a.join(b).join(c).join(d)

but rather

a.join(b, c, d)

which I don't think is all that bad.

Btw. long ago (2001, before Jason's path module) I posted a balloon to
python-dev about this subject:
http://mail.python.org/pipermail/python-dev/2001-August/016663.html

There was hardly any response, and Guido said he wasn't too enthusiastic
about the idea, so we might have to work hard to pull this off ;-).

Also, with unicode, Guido's suggestion that any object that implements
__str__ can be used with open() seems to be no longer true. At least I
can't get it to work.

Just

Skip Montanaro

Jul 8, 2003, 9:33:58 AM

Just> I find overloading / unnecessary, since it's not like you'll have
Just> to write

Just> a.join(b).join(c).join(d)

Just> but rather

Just> a.join(b, c, d)

Just> which I don't think is all that bad.

Yes, but

a/b/c/d

is nicely analogous to Unix pathname syntax.

Skip

Just van Rossum

Jul 8, 2003, 9:51:03 AM

Skip Montanaro wrote:

> Just> a.join(b, c, d)
>
> Just> which I don't think is all that bad.
>
> Yes, but
>
> a/b/c/d
>
> is nicely analogous to Unix pathname syntax.

But when the items are variables, what you read is not what you get.
Often you'll want (some) literals, and then you get

path = basePath/"a"/"b"/"c"

...and _that_ I find quite horrible...

(Did I mention that / usually means divide in Python? ;-)

Just

Hallvard B Furuseth

Jul 8, 2003, 10:52:33 AM

If there is going to be a New Wonderful Path module, I suggest you give
some thought to paths on systems that don't fit the simple Unix/Windows
path model. A path to a filename may e.g. look different from a path
to a directory. It's a long time since I used anything like that, but I
think Tops-20 filename paths looked like this:

device:<dir.dir.dir>file.ext;version

where most components were optional. The device: part could also be a
'logical name' (basically an alias) for a directory or device, I don't
remember if it could alias a file name too.

The Common Lisp pathname type might be worth looking into,
<http://www.iti.informatik.tu-darmstadt.de/cl-hyperspec/Body/sec_19-2.html>
They have done a lot of work to try to get it right, and from what
I hear they did a good job.

--
Hallvard

Skip Montanaro

Jul 8, 2003, 10:51:26 AM

>> path = basePath/"a/b/c"

Just> Ooh, it _can_ get worse ;-/

Just> Also: this would not be portable on platforms not using / as
Just> os.sep, ...

Not necessarily. My guess (again, without trying it) is that it does the
right thing. Right near the top of

http://www.jorendorff.com/articles/python/path/

Jason writes:

I like for my code to be cross-platform, but I tired of typing
os.path.join in about 1994.

>> Sure, just like '%' means modulo in Python, but it seems to have
>> found a home in printf-style string expansion.

Just> True, but string expansion is quite old (possibly even Python 0.9
Just> or 1.0?), so most people are used to it. (Although, newbies
Just> without a C background are usually baffled by it. I know I was,
Just> back then...).

Just because a better alternative to os.path turns up now is no reason to
discount it.

Nonetheless, before anything like Jason's path module is incorporated into
the standard distribution, a PEP is almost certainly required. I imagine
there are some things which could be done better (or at least differently)
to make the overall module more acceptable.

Skip

Just van Rossum

Jul 8, 2003, 10:16:11 AM

Skip Montanaro wrote:

> Just> But when the items are variables, what you read is not what you
> Just> get. Often you'll want (some) literals, and then you get
>
> Just> path = basePath/"a"/"b"/"c"
>
> Just> ...and _that_ I find quite horrible...
>
> I don't know for sure, but I suspect the above could also be
>
> path = basePath/"a/b/c"

Ooh, it _can_ get worse ;-/

Also: this would not be portable on platforms not using / as os.sep, so
is almost equivalent to not using os.path at all and doing

path = basePath + "/a/b/c"
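
The portability concern can be checked on any machine, because the standard library ships both sets of path rules as `ntpath` and `posixpath`. This sketch assumes, as stated earlier in the thread, that the `/` operator is a plain `os.path.join`; the `C:\base` directory is made up for the example:

```python
import ntpath      # Windows path rules, importable on any platform
import posixpath   # Unix path rules

base_win = "C:\\base"

# Joining a single "a/b/c" component leaves the forward slashes untouched.
joined_literal = ntpath.join(base_win, "a/b/c")

# Passing the components separately lets the join insert the native separator.
joined_parts = ntpath.join(base_win, "a", "b", "c")

# Under Unix rules the two spellings happen to coincide.
same_on_posix = (
    posixpath.join("/base", "a/b/c") == posixpath.join("/base", "a", "b", "c")
)

print(repr(joined_literal))
print(repr(joined_parts))
```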

> Still not perfect, but in any case, the '/' is meant to be
> suggestive, not literal. Perhaps you would have preferred he use
> ':'? ;-)

Heh...

> Just> (Did I mention that / usually means divide in Python? ;-)
>
> Sure, just like '%' means modulo in Python, but it seems to have
> found a home in printf-style string expansion.

True, but string expansion is quite old (possibly even Python 0.9 or
1.0?), so most people are used to it. (Although, newbies without a C
background are usually baffled by it. I know I was, back then...).

Just

Skip Montanaro

Jul 8, 2003, 10:03:15 AM

>> a/b/c/d
>>
>> is nicely analogous to Unix pathname syntax.

Just> But when the items are variables, what you read is not what you
Just> get. Often you'll want (some) literals, and then you get

Just> path = basePath/"a"/"b"/"c"

Just> ...and _that_ I find quite horrible...

I don't know for sure, but I suspect the above could also be

path = basePath/"a/b/c"

Still not perfect, but in any case, the '/' is meant to be suggestive, not
literal. Perhaps you would have preferred he use ':'? ;-)

Just> (Did I mention that / usually means divide in Python? ;-)

Sure, just like '%' means modulo in Python, but it seems to have found a
home in printf-style string expansion.

Skip

Nick Vargish

Jul 8, 2003, 11:14:37 AM

Just van Rossum <ju...@xs4all.nl> writes:

> (Did I mention that / usually means divide in Python? ;-)

I believe you did... But doesn't % usually mean modulo in Python?

:^)

Nick

--
# sigmask | 0.2 | 2003-01-07 | public domain | feed this to a python
print reduce(lambda x,y:x+chr(ord(y)-1),'Ojdl!Wbshjti!=obwAqbusjpu/ofu?','')

John J. Lee

Jul 8, 2003, 11:08:46 AM

holger krekel <py...@devel.trillke.net> writes:
[...]

> Recently, i also did some experimentation with "virtual-fs" features so
> that you can transparently access http/ftp/svn files/directories. I even
> got that to work with "<tab>-completion" but that was quite a hack :-)

[...]

Note that this overlaps a bit with urllib and urllib2. Just something
that would need thinking about.

John

Peter Hansen

Jul 8, 2003, 12:24:48 PM

Nick Vargish wrote:
>
> Just van Rossum <ju...@xs4all.nl> writes:
>
> > (Did I mention that / usually means divide in Python? ;-)
>
> I believe you did... But doesn't % usually mean modulo in Python?

I believe you'll find it's more common for it to mean "format",
but the point is that it is well understood that it means *either*,
depending on context.

Using / for this new concatenation-like behaviour is tantamount
to adding new syntax to Python again... :-(

-Peter

holger krekel

Jul 8, 2003, 12:03:59 PM

sure, i used *urllib* under the hood :-)

holger

Skip Montanaro

Jul 8, 2003, 12:02:02 PM

Hallvard> It's a long time since I used anything like that, but I think
Hallvard> Tops-20 filename paths looked like this:

Hallvard> device:<dir.dir.dir>file.ext;version

Hallvard> where most components were optional.

which is (not too surprisingly) very similar to VMS:

device:[dir.dir.dir]file.ext;version

Skip

Just van Rossum

Jul 8, 2003, 11:02:29 AM

Skip Montanaro wrote:

> Just> Also: this would not be portable on platforms not using / as
> Just> os.sep, ...
>
> Not necessarily.

True, but it takes guessing: "did the author really mean to specify
several path components here, or is he/she intentionally using the unix
path separator in file names on a platform that _doesn't_ use / as the
path separator?". In the face of ambiguity etc.

> My guess (again, without trying it) is that it does
> the right thing.

It doesn't...

> Nonetheless, before anything like Jason's path module is incorporated
> into the standard distribution, a PEP is almost certainly required.
> I imagine there are some things which could be done better (or at
> least differently) to make the overall module more acceptable.

Absolutely. I really like Holger's observation that having path objects
levels the road to virtual file systems (eg. use a zip file as a file
system). Anyone interested in volunteering to write that PEP? I'd like to
contribute in some way, but I'm not going to write more than 1 PEP per
year :-).

Just

Ian Bicking

Jul 8, 2003, 1:42:45 PM

On Tue, 2003-07-08 at 05:07, Just van Rossum wrote:
> > Several operators are reused
> > for different meanings, % in particular comes to mind, this doesn't
> > seem that bad. I like the way it looks and feels. It feels better
> > to me than using + for string concatenation ;) -- maybe because
> > division is not all that common an operation anyway.
>
> I tend to think that this whole discussion shows that _maybe_ it's not
> such a good idea to subclass str/unicode after all. (Aside: path.py
> taught me about the existence of os.path.supports_unicode_filenames in
> 2.3, but at least on my platform (OSX) it has the wrong value. I opened
> a bug, #767645.)
>
> Some string methods are handy on paths, such as .endswith(), while
> others are not, like .upper() or .splitlines(). Other string method
> names or operators would make sense for paths, but -- as you say -- are
> out of the question since their meaning would be quite different, eg.
> .join() and +.

True... and as I think about it, a lot of the actually interesting
string methods wouldn't be performed on the entire path anyway. Things
like path.name.startswith('img'), or path.ext == 'jpg'. You could also
do things like override equals, so that two Windows paths would match
case-insensitively, and other things that would be bad to change in a
string subclass.
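
A rough sketch of the case-insensitive comparison, assuming a hypothetical `WindowsPath` wrapper (not part of any module discussed here) and using `ntpath.normcase` for the folding:

```python
import ntpath


class WindowsPath:
    """Hypothetical wrapper that compares the way Windows filenames usually do."""

    def __init__(self, value):
        self.value = str(value)

    def _key(self):
        # ntpath.normcase lowercases the string and turns "/" into "\\".
        return ntpath.normcase(self.value)

    def __eq__(self, other):
        if not isinstance(other, WindowsPath):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # Keep hashing consistent with __eq__ so sets and dicts behave.
        return hash(self._key())


a = WindowsPath("C:/Photos/IMG_01.JPG")
b = WindowsPath("c:\\photos\\img_01.jpg")
print(a == b)
```

Doing the same in a str subclass would change how the object behaves everywhere a string is expected, which is the objection raised above.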

> I find overloading / unnecessary, since it's not like you'll have to
> write
>
> a.join(b).join(c).join(d)
>
> but rather
>
> a.join(b, c, d)
>
> which I don't think is all that bad.

And you could reuse the join method, which is better than joinpath (but
joinpath is better than overriding string's join, if you are subclassing
string).

> Btw. long ago (2001, before Jason's path module) I posted a balloon to
> python-dev about this subject:
> http://mail.python.org/pipermail/python-dev/2001-August/016663.html
>
> There was hardly any response, and Guido said he wasn't too enthusiastic
> about the idea, so we might have to work hard to pull this off ;-).
>
> Also, with unicode, Guido's suggestion that any object that implements
> __str__ can be used with open() seems to be no longer true. At least I
> can't get it to work.

You don't want to use open() anyway, that breaks the possibility of
alternate filesystems. There should be an open method, like
path.open('w'). Then a URL object (called maybe url?) would also have
an open method, that obviously would do a much different thing. (And
just as I'm thinking of a url class, things like .exists() would be
surprisingly useful, even though urllib doesn't expose these file-like
operations very directly)
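
A sketch of the shared `open` method, under several assumptions: the `path` and `url` classes and the `first_bytes` helper are made up for illustration, the `url` class is read-only, and it simply delegates to `urllib.request.urlopen` from today's standard library. A `file://` URL stands in for a remote one so the example runs without a network:

```python
import os
import tempfile
import urllib.request
from pathlib import Path as _StdPath  # used only to build a file:// URL for the demo


class path:
    """Hypothetical local-filesystem path object."""

    def __init__(self, value):
        self.value = str(value)

    def open(self, mode="r"):
        return open(self.value, mode)


class url:
    """Hypothetical URL object exposing the same open() method (read-only)."""

    def __init__(self, value):
        self.value = str(value)

    def open(self, mode="r"):
        if mode != "r":
            raise ValueError("this sketch only supports reading URLs")
        return urllib.request.urlopen(self.value)


def first_bytes(location, n=5):
    # Works for anything with an open() method; the builtin open() is never
    # called directly, so other "filesystems" can plug in.
    with location.open("r") as f:
        data = f.read(n)
    return data if isinstance(data, bytes) else data.encode()


fd, fname = tempfile.mkstemp()
os.close(fd)
with path(fname).open("w") as f:
    f.write("hello world")

local_result = first_bytes(path(fname))
url_result = first_bytes(url(_StdPath(fname).as_uri()))
```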

The only other way is if a new magic method -- __open__ -- came into
being. That would be interesting (where "interesting" can be read
several ways ;).

Ian

Ian Bicking

Jul 8, 2003, 1:50:23 PM

On Tue, 2003-07-08 at 04:32, holger krekel wrote:
> I agree that something like Jason Orendorff's path module should go into
> the standard library. I've coded a similar module and i think that
> a discussion about certain design decisions would probably improve our
> approaches.
>
> For example Jason lets the "path" object inherit from "str" (or unicode)
> but i think it's better to provide a "__str__" method so that you can say
>
> str(pathinstance).endswith('.py')
>
> and *not* base the path object on str/unicode.
>
> unicode(pathinstance)
>
> would just fail if your platform doesn't support this. First, i tried
> the inheritance approach, btw, but it is ambiguous (e.g. for the
> join method: str.join vs. os.path.join).

I'm starting to think the same thing. Not so much because of join, but
because it doesn't actually offer many advantages. Many methods that
look for a filename will be using "type(arg) is type('')", so you'd have
to pass a real string object in anyway -- and people who say "but you
should use isinstance(arg, str)" are obviously forgetting that you
couldn't do this not very long ago, and lots of code uses type
comparison at this point.
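
To make the difference concrete, here is a small sketch with a hypothetical `path` subclass of `str` (standing in for the style of Jason's module) and the two kinds of check:

```python
class path(str):
    """Hypothetical str subclass, in the style under discussion."""


def old_style_check(arg):
    # The exact-type test common in older code.
    return type(arg) is type("")


def isinstance_check(arg):
    return isinstance(arg, str)


p = path("/tmp/example.txt")

print(old_style_check(p))       # the subclass fails the exact-type test
print(isinstance_check(p))      # but passes the isinstance test
print(old_style_check(str(p)))  # converting back to a real string passes both
```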

> Also, my module provides most of the os.path.* methods as "filters" so
> you can say
>
> dirs = filter(isdir, list_obj_pathobjects)
> fnames = filter(AND(nolink, isfile), list_obj_pathobjects)
>
> in addition to
>
> pathobject.isfile()
> etc.

That's not necessary with list comprehension, since you can just do:

[p for p in list_obj_pathobjects if p.isdir()]

> Recently, i also did some experimentation with "virtual-fs" features so
> that you can transparently access http/ftp/svn files/directories. I even
> got that to work with "<tab>-completion" but that was quite a hack :-)
>
> I am pretty sure that virtual-fs-like-extensibility would be a big
> "selling" point and would motivate the use of such a module and
> finally the inclusion into the stdlib. Of course, the local-fs should
> be the convenient case but it shouldn't be hard to use the same methods
> for accessing remote "repositories".

Yes, virtual filesystems are certainly an important idea here. Almost
makes me wonder if path() should also recognize URLs by default...
probably not, as that isn't always desired, and a URL is going to create
a significantly different object than a mere filesystem path, even
though its interface will be very similar.

Ian

Irmen de Jong

Jul 8, 2003, 2:08:44 PM

Skip Montanaro wrote:

Which the Amiga's version borrows (borrowed) heavily from:

device:dir/dir/dir/file (no filename extensions)

The nifty part was that the <device:> could be the name
of a physical device (such as "DF0:" meaning "floppy drive 0"),
or a user-defined logical device (such as "LIBS:").
The latter was usually called an "assign", and you could let
it point to *multiple* locations in your filesystem *at once*.

Sorry--way off topic here, but the above path syntax
brings back some good memories ;-)

--Irmen de Jong

Ian Bicking

Jul 8, 2003, 2:01:31 PM

Interesting, but I think a bad idea. I don't believe Python has been
ported to Tops-20, and I'm not sure if there's a viable VMS port
either. Most filesystems don't have the complexity that the Lisp
pathname encapsulates. If someone was using VMS paths, I would assume
they would subclass path for that OS, adding the portions that applied.
I think it's unreasonable to expect people programming on normal
platforms to pay attention to components like version, so even including
it in a structured manner is asking for trouble.

On some level such filesystems would probably be supportable, though.
You just wouldn't adapt the filesystem's native structure, though
presumably your os module would know how to parse such a path and emit
such a path. But like you can use / instead of \ for filenames on
Windows, I would expect / to work on most other filesystems as well.

Ian

Cliff Wells

Jul 8, 2003, 2:53:11 PM

On Tue, 2003-07-08 at 11:01, Ian Bicking wrote:

> Interesting, but I think a bad idea. I don't believe Python has been
> ported to Tops-20, and I'm not sure if there's a viable VMS port
> either. Most filesystems don't have the complexity that the Lisp
> pathname encapsulates. If someone was using VMS paths, I would assume
> they would subclass path for that OS, adding the portions that applied.
> I think it's unreasonable to expect people programming on normal
> platforms to pay attention to components like version, so even including
> it in a structured manner is asking for trouble.

There is talk that Windows will have versioning in its next filesystem
(WinFS). It would surprise me if there weren't similar plans on the
Linux side.

--
Cliff Wells, Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 (800) 735-0555

holger krekel

Jul 8, 2003, 2:28:52 PM

Ian Bicking wrote:
> On Tue, 2003-07-08 at 04:32, holger krekel wrote:
> > I agree that something like Jason Orendorff's path module should go into
> > the standard library. I've coded a similar module and i think that
> > a discussion about certain design decisions would probably improve our
> > approaches.
> >
> > For example Jason lets the "path" object inherit from "str" (or unicode)
> > but i think it's better to provide a "__str__" method so that you can say
> >
> > str(pathinstance).endswith('.py')
> >
> > and *not* base the path object on str/unicode.
> >
> > unicode(pathinstance)
> >
> > would just fail if your platform doesn't support this. First, i tried
> > the inheritance approach, btw, but it is ambiguous (e.g. for the
> > join method: str.join vs. os.path.join).
>
> I'm starting to think the same thing. Not so much because of join, but
> because it doesn't actually offer many advantages. Many methods that
> look for a filename will be using "type(arg) is type('')", so you'd have
> to pass a real string object in anyway -- and people who say "but you
> should use isinstance(arg, str)" are obviously forgetting that you
> couldn't do this not very long ago, and lots of code uses type
> comparison at this point.

right. Also i prefer my objects to not have a "polluted" namespace.

> > Also, my module provides most of the os.path.* methods as "filters" so
> > you can say
> >
> > dirs = filter(isdir, list_obj_pathobjects)
> > fnames = filter(AND(nolink, isfile), list_obj_pathobjects)
> >
> > in addition to
> >
> > pathobject.isfile()
> > etc.
>
> That's not necessary with list comprehension, since you can just do:
>
> [p for p in list_obj_pathobjects if p.isdir()]

but i use the same idea (filter-functions) for more advanced walkers:

p = path('/music')
for i in p.filterwalk(AND(nolink, isfile, isplayable, match(repattern))):
    play_mp3(i)

where filterwalk is a generator because i don't want the playscript to
first try to gather *all* files for obvious reasons (as would happen with
list comprehension). This has proven to be incredibly useful and easy to
read (if you don't engage in list-comprehension <-> functional-style
wars). Just because Guido somewhat dislikes "functional support" like
lambda, map, filter and friends to be in the __builtin__ module
doesn't mean it's bad :-)
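
A rough idea of how such a lazy walker could be built on `os.walk`. The `filterwalk` name comes from the example above, but this function-style version working on plain strings is an assumption and not holger's implementation; `isplayable` and `match` are left out:

```python
import os
import tempfile


def isfile(p):
    return os.path.isfile(p)


def nolink(p):
    return not os.path.islink(p)


def AND(*predicates):
    return lambda p: all(pred(p) for pred in predicates)


def filterwalk(root, predicate):
    """Yield matching paths one at a time instead of building a full list."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            candidate = os.path.join(dirpath, name)
            if predicate(candidate):
                yield candidate


# Scratch tree: one file at the top level and one inside a subdirectory.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "album"))
for name in ("one.mp3", os.path.join("album", "two.mp3")):
    open(os.path.join(root, name), "w").close()

walker = filterwalk(root, AND(nolink, isfile))
first = next(walker)  # the subdirectory has not been scanned yet at this point
```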

cheers,

holger

holger krekel

Jul 8, 2003, 3:16:31 PM

Cliff Wells wrote:
> On Tue, 2003-07-08 at 11:01, Ian Bicking wrote:
>
> > Interesting, but I think a bad idea. I don't believe Python has been
> > ported to Tops-20, and I'm not sure if there's a viable VMS port
> > either. Most filesystems don't have the complexity that the Lisp
> > pathname encapsulates. If someone was using VMS paths, I would assume
> > they would subclass path for that OS, adding the portions that applied.
> > I think it's unreasonable to expect people programming on normal
> > platforms to pay attention to components like version, so even including
> > it in a structured manner is asking for trouble.
>
> There is talk that Windows will have versioning in its next filesystem
> (WinFS). It would surprise me if there weren't similar plans on the
> Linux side.

Right. Reiserfs plans this and Subversion has it (speaking about URLs here, not
only local paths). But i think the way you specify versions will be
vastly different so the best bet probably is to pass an additional argument to
a path-like object, e.g. for subversion

fn = svnpath('py.py', rev=1050)

or remotely

fn = svnpath('http://.../py.py', rev=7362)

where 'rev' specifies a revision number (which identifies exactly
one state of a subversion-repository). Other than that, a 'svnpath'
could probably behave like a regular local 'path' object, i guess.
Hmmm, the above wouldn't be hard to do because svn has python bindings
on all levels ... but enough advertisement :-)

Either way, i believe that path/file versioning deserves some thoughts
as it might be a "next big thing" (besides java and .NET, of course :-)

cheers,

holger

Bengt Richter

Jul 9, 2003, 4:57:36 AM

On Tue, 8 Jul 2003 20:28:52 +0200, holger krekel <py...@devel.trillke.net> wrote:
[...]

>
>but i use the same idea (filter-functions) for more advanced walkers:
>
> p = path('/music')
> for i in p.filterwalk(AND(nolink, isfile, isplayable, match(repattern))):
>     play_mp3(i)
>
>where filterwalk is a generator because i don't want the playscript to
>first try to gather *all* files for obvious reasons (as would happen with
>list comprehension). This has proven to be incredibly useful and easy to
>read (if you don't engage in list-comprehension <-> functional-style
>wars). Just because Guido somewhat dislikes "functional support" like
>lambda, map, filter and friends to be in the __builtin__ module
>doesn't mean it's bad :-)
>

What if we had a file systems package from which to import support
for various file system access, e.g.,

from filesystems import fs_win32share # might e.g., use or clone samba infrastructure?
puterfs = fs_win32share.mount(r'\\puter\sharename') #(cf. virtual block device steps below)
p = puterfs.path(r'\music')

for i in p.filterwalk(AND(nolink, isfile, isplayable, match(repattern))):
    play_mp3(i)

... i.e., play mp3's stored as windows shared files via LAN.

Taking a cue from os.path, which is posixpath for slackware linux, and ntpath for NT4, perhaps
they could be callable as os.path('/some/path/to/a/dir') to create a path object suitable for
the default file system, referring to the specified directory. A path object could then
have a file method, and the builtin file function might really be the bound method os.path('').file.
It isn't right now, so I write it out below to be clear. A useful binding might be os.file also.
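
The aliasing this builds on can be checked directly; a small sketch, which shows only the existing behaviour of the `os` module and not the proposed callable `os.path`:

```python
import os
import ntpath
import posixpath

# os.path is whichever platform module the os module selected at import time.
native = ntpath if os.name == "nt" else posixpath
print(os.path is native)
print(os.path.__name__)
```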

from filesystems import f_as_blockdev
a_drive_vbd = f_as_blockdev.mount(os.path('').file(r'\\.\A:')) # physical NT floppy A
from filesystems import fs_apple
applefdfs = fs_apple.mount(a_drive_vbd)
srcp = applefdfs.path(r'\music')
dstp = os.path('.')

for i in srcp.filterwalk(AND(nolink, isfile, isplayable, match(repattern))):
    dstp.copy(i)

or maybe looking at a CDROM file as an apple floppy image, and copying some files to the
local file system, e.g.,

from filesystems import f_as_blockdev
a_drive_img_vbd = f_as_blockdev.mount(os.path('').file(r'X:\apple\floppies\fdimg.1')) # CDROM X:
from filesystems import fs_apple
applefdfs = fs_apple.mount(a_drive_img_vbd)
srcp = applefdfs.path(r'\music')
dstp = os.path('.') # assumes callable instantiates path object for default local file system

for i in srcp.filterwalk(AND(nolink, isfile, isplayable, match(repattern))):
    dstp.copy(i) # assumes copy method copying to './filename' for filter-surviving filenames.

Etc., etc., (ok, very short songs on floppy ;-)

I.e., path magic would be file-system-appropriate, yet provide a uniform generic interface
with reasonable (but presumably configurable in some file-system-specific ways) defaults.

File systems would be mounted using virtual block devices, unless the mount method for the
file system can reasonably bypass that to synthesize a file system object using non-block/char
device access, e.g., as a local proxy object for a remote file system (or virtual file-system
view, e.g., of an html href/imgref tree or news thread, or database, etc. etc.).
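
A minimal sketch of the idea above, assuming a toy in-memory "file system" rather than any real mount mechanism: the file-system object hands out path objects, and the walking code only sees the generic interface. DictFS and DictPath are hypothetical names invented for this illustration; filterwalk, AND, isfile and match merely echo the names used in the examples above and are not taken from any existing package.

```python
import re


class DictFS:
    """Toy file system backed by a nested dict (dict = directory, bytes = file)."""

    def __init__(self, tree):
        self.tree = tree

    @classmethod
    def mount(cls, tree):
        # A real implementation would take a block device or a remote handle here.
        return cls(tree)

    def path(self, text):
        return DictPath(self, [part for part in text.split("/") if part])


class DictPath:
    def __init__(self, fs, parts):
        self.fs = fs
        self.parts = list(parts)

    def __str__(self):
        return "/" + "/".join(self.parts)

    def _node(self):
        node = self.fs.tree
        for part in self.parts:
            node = node[part]
        return node

    def isfile(self):
        return isinstance(self._node(), bytes)

    def filterwalk(self, keep):
        """Yield every path below this one (depth-first) for which keep(path) is true."""
        node = self._node()
        if not isinstance(node, dict):
            return
        for name in sorted(node):
            child = DictPath(self.fs, self.parts + [name])
            if keep(child):
                yield child
            yield from child.filterwalk(keep)


def AND(*predicates):
    # Combine several one-argument predicates into one.
    return lambda p: all(pred(p) for pred in predicates)


def isfile(p):
    return p.isfile()


def match(pattern):
    regex = re.compile(pattern)
    return lambda p: regex.search(str(p)) is not None


fs = DictFS.mount({"music": {"a.mp3": b"", "notes.txt": b"", "live": {"b.mp3": b""}}})
found = [str(p) for p in fs.path("/music").filterwalk(AND(isfile, match(r"\.mp3$")))]
```

The loop that consumes filterwalk never touches the dict directly, so a second file-system class with the same path and filterwalk methods could be swapped in without changing it.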

<ramble>
Note that a virtual file system could well have GUI side effects when written to. The frame buffer
device would be interesting to capture access to via a virtual file system module along the lines of
the above. Note also that GUI windows have a hierarchy that could map to a virtual file object hierarchy.
Imagine a virtual svg file system mounted to an x-window instance, so that when you wrote svg source
to it you would get the visual effects in that window. Alternatively, mount on a full-screen virtual
device, etc. Creating virtual sub-"directories" could create child windows...

I see various plotting packages factored into this form as well. And binary mode writes for fast
stuff. For graphic vfs's, I would think their mount methods should accept a virtual frame buffer device, so
that something transparent and fast can ultimately talk almost directly to hardware. Maybe it
could be prototyped using tkinter infrastructure, though. Or pygame/sdl.

I've been thinking of prototyping a simple plotting graphic vfs along the above lines. Maybe it's
Calcomp-driver-writing nostalgia (I wrote a rasterizing driver to plot Calcomp command streams on
a Versatec way back). Of course, a random-access canvas will make it easier (there wasn't space
to brute-force a full image and then feed the raster plotter ;-)
</ramble>

Just a couple of thoughts to throw in the idea hopper (this variation HOTTOMH, so very alpha ;-)
Too many irons in the coals (not to say fire ;-/)

Regards,
Bengt Richter

Hallvard B Furuseth

Jul 14, 2003, 7:44:42 AM

Ian Bicking wrote:
>On Tue, 2003-07-08 at 09:52, Hallvard B Furuseth wrote:
>> If there is going to be a New Wonderful Path module, I suggest you give
>> some thought to paths on systems that don't fit the simple Unix/Windows
>> path model. A path to a filename may e.g. look different from a path
>> to a directory. (...)
>
> Interesting, but I think a bad idea. (...) If someone was using VMS
> paths, I would assume they would subclass path for that OS, adding the
> portions that applied.

It would be pointless to include _data structures_ for components that
are not supported on any system Python is ported to, but for subclassing
to make sense, some of the _interface_ would have to be in place. Like
the possibility of using path.joindir() vs. path.joinfile() or something
depending on whether the result should be a file or directory path. And
just path.join() for people who don't care. Assuming there will be a
join method, of course.
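
As a rough sketch of what "interface in place, data structures left to subclasses" could look like: the base class below gives joindir() and joinfile() the same behaviour as join(), and a subclass for a system that spells directories differently overrides them. BasePath and BracketDirPath are hypothetical names, and the bracket syntax is only loosely modelled on VMS, not a faithful implementation of it.

```python
class BasePath:
    """Unix/Windows-like model: files and directories are joined the same way."""

    sep = "/"

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

    def join(self, name):
        return type(self)(self.text.rstrip(self.sep) + self.sep + name)

    def joindir(self, name):
        # Default: no distinction between directory and file paths.
        return self.join(name)

    def joinfile(self, name):
        return self.join(name)


class BracketDirPath(BasePath):
    """Made-up syntax: '[A.B]' is a directory, '[A.B]X.TXT' is a file in it."""

    def joindir(self, name):
        # '[A.B]' + 'C' -> '[A.B.C]'
        return type(self)(self.text[:-1] + "." + name + "]")

    def joinfile(self, name):
        # '[A.B]' + 'X.TXT' -> '[A.B]X.TXT'
        return type(self)(self.text + name)

    # Assumption for this sketch: an unqualified join means "file".
    join = joinfile


plain_result = str(BasePath("/usr").joindir("lib").joinfile("libc.so"))
bracket_result = str(BracketDirPath("[A.B]").joindir("C").joinfile("X.TXT"))
```

Code written against joindir()/joinfile() runs unchanged on both classes, while code that only ever calls join() keeps working on the common systems.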

Also, you may need some special handling of 'device:' on Windows.

> I think it's unreasonable to expect people programming on normal
> platforms to pay attention to components like version, so even
> including it in a structured manner is asking for trouble.

I dunno. People have already mentioned coming systems where versions
will be available.

--
Hallvard

Ian Bicking

Jul 14, 2003, 8:36:17 AM

On Mon, 2003-07-14 at 06:44, Hallvard B Furuseth wrote:
> Ian Bicking wrote:
> >On Tue, 2003-07-08 at 09:52, Hallvard B Furuseth wrote:
> >> If there is going to be a New Wonderful Path module, I suggest you give
> >> some thought to paths on systems that don't fit the simple Unix/Windows
> >> path model. A path to a filename may e.g. look different from a path
> >> to a directory. (...)
>
> > Interesting, but I think a bad idea. (...) If someone was using VMS
> > paths, I would assume they would subclass path for that OS, adding the
> > portions that applied.
>
> It would be pointless to include _data structures_ for components that
> are not supported on any system Python is ported to, but for subclassing
> to make sense, some of the _interface_ would have to be in place. Like
> the possibility of using path.joindir() vs. path.joinfile() or something
> depending on whether the result should be a file or directory path. And
> just path.join() for people who don't care. Assuming there will be a
> join method, of course.

Since *no one* will ever use joindir or joinfile, why would it be
helpful? Modern systems just don't make that distinction, and people
aren't going to make that distinction in their code.

> Also, you may need some special handling of 'device:' on Windows.

Yes, and the network portion as well (\\server\...). However, it would
still be handled textually. I.e., path(r'\\server') would happen to
create this network path, and path(r'\something_else') wouldn't. The
Windows implementation of path would presumably have an attribute to get
at "server" (.unc or something), while you'd get an AttributeError on
Posix systems.
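
A small sketch of that behaviour, assuming two hypothetical classes (neither is Jason's actual API): the Windows flavour parses the server name out of the text on demand, and the Posix flavour simply has no such attribute.

```python
class PosixStylePath(str):
    """No network component, so there is no 'unc' attribute at all."""


class WindowsStylePath(str):
    @property
    def unc(self):
        """Server name if the text starts with exactly two backslashes, else None."""
        if self.startswith("\\\\") and not self.startswith("\\\\\\"):
            rest = self[2:]
            return rest.split("\\", 1)[0] or None
        return None


network = WindowsStylePath(r"\\server")
local = WindowsStylePath(r"\something_else")
posix = PosixStylePath("/tmp")
```

Here network.unc gives the server name, local.unc is None, and asking posix for .unc raises AttributeError, because the path is still just text and the attribute only exists where it means something.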

> > I think it's unreasonable to expect people programming on normal
> > platforms to pay attention to components like version, so even
> > including it in a structured manner is asking for trouble.
>
> I dunno. People have already mentioned coming systems where versions
> will be available.

But we have no idea what it will look like, or how it may be represented
in a filename (if at all!) -- so implementing something based on that
would be a little optimistic. You're likely to create an interface that
won't make sense. Better to leave it unspecified until there's an
actual system you want to support, at which point the interface will
seem much clearer. Predictive design is a very bad idea.

Ian

Paul Moore

Jul 14, 2003, 5:19:37 PM

holger krekel <py...@devel.trillke.net> wrote in message news:<mailman.105765679...@python.org>...

> I agree that something like Jason Orendorff's path module should go into
> the standard library. I've coded a similar module and i think that
> a discussion about certain design decisions would probably improve our
> approaches.

Is it available anywhere? It would be nice to be able to try both, for comparison.

Paul.

Paolo Invernizzi

Jul 15, 2003, 4:45:45 AM

holger krekel wrote:

> sorry, not right now. I'll try to make a release soonish.
>
> holger
>

I cannot access the path.py module from
http://www.jorendorff.com/articles/python/path/

Could someone be so kind as to email it to me?

Thanks in advance!

---
Paolo Invernizzi
paoloinvernizzi at dmsware.com

Florian Schulze

Jul 15, 2003, 8:48:55 AM

Try Google with "cache:http://..."; this worked for me.

Florian

--
Florian Schulze

Paolo Invernizzi

Jul 15, 2003, 9:05:59 AM

Florian Schulze wrote:

> Try Google with "cache:http://..."; this worked for me.

Done.

Thanks for the advice.

---
Paolo

Hallvard B Furuseth

Jul 18, 2003, 9:21:31 AM

Ian Bicking wrote:
>On Mon, 2003-07-14 at 06:44, Hallvard B Furuseth wrote:
>>Ian Bicking wrote:
>>> Interesting, but I think a bad idea. (...) If someone was using VMS
>>> paths, I would assume they would subclass path for that OS, adding the
>>> portions that applied.
>>
>> It would be pointless to include _data structures_ for components that
>> are not supported on any system Python is ported to, but for subclassing
>> to make sense, some of the _interface_ would have to be in place. Like
>> the possibility of using path.joindir() vs. path.joinfile() or something
>> depending on whether the result should be a file or directory path. And
>> just path.join() for people who don't care. Assuming there will be a
>> join method, of course.
>
> Since *no one* will ever use joindir or joinfile, why would it be
> helpful?

Because we disagree about whether or not anyone will use it:-)

>> Also, you may need some special handling of 'device:' on Windows.
>
> Yes, and the network portion as well (\\server\...). However, it would
> still be handled textually.

Fine by me. I wasn't thinking of what the internals would look like at
all.

>>> I think it's unreasonable to expect people programming on normal
>>> platforms to pay attention to components like version, so even
>>> including it in a structured manner is asking for trouble.
>>
>> I dunno. People have already mentioned coming systems where versions
>> will be available.
>
> But we have no idea what it will look like, or how it may be represented
> in a filename (if at all!) -- so implementing something based on that
> would be a little optimistic. You're likely to create an interface that
> won't make sense. Better to leave it unspecified until there's an
> actual system you want to support, at which point the interface will
> seem much clearer. Predictive design is a very bad idea.

Actually I disagree here. The danger of designing to an existing system
is that another system may come along where the versioning doesn't fit
our design. I think it's a good idea to design it - but not necessarily
implement it - before we see how it works on a real system. Then the
real system, when it comes, will either prove that we did well enough, or
that we didn't. In the latter case, it may be better to leave it out anyway.

--
Hallvard

Jason Orendorff

Jul 21, 2003, 2:16:57 PM

I wrote the 'path' module at:
http://www.jorendorff.com/articles/python/path

There was some discussion on it here:
http://groups.google.com/groups?th=42ab4db337b60ce3

Just a few comments:

Ian and Holger wondered why 'path' should subclass 'str'. It's because
a path is a string. Benefit: you can pass 'path' objects to functions
that expect strings (like functions in 'win32file'). I find this
really useful in practice.

I agree with Just that 'path' shouldn't override '__iter__()'. I'll
change this eventually.

I think Just is the first to argue that 'path / filename' is confusing.
I find it intuitive. Other people have chosen / for this purpose,
independently: see the O'Reilly book _Python Cookbook_ [1], recipe 4.17,
and the Boost path object [2].
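
To make the two points above concrete, here is a minimal sketch and not the actual path.py source: a str subclass whose '/' operator joins components. It uses posixpath so the result is the same on every platform (a real module would presumably use os.path), and __truediv__ is the Python 3 spelling of the operator; Python 2 code of the time would have defined __div__.

```python
import posixpath


class path(str):
    """A str subclass, so it can go anywhere a plain string is accepted."""

    def __truediv__(self, other):
        # 'path / name' means "join these two components".
        return path(posixpath.join(self, other))


song = path("/music") / "album" / "song.mp3"

# Because 'song' is a str, string-taking functions accept it unchanged.
leaf = posixpath.basename(song)
```

The result of '/' is again a path, so joins can be chained, and no str() call is needed when handing the object to code that only knows about strings.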

I do believe 'path' should be in the standard library (if not builtin).
I enjoy it and I use it all the time. My perception is that the Python
core dev team doesn't see any particular need for it. If anyone wants me
to, I'll write the PEP.

for f in path('/music').walkfiles('*.mp3'):
    play_mp3(f)

Cheers,
Jason

[1] http://safari.oreilly.com/?xmlid=0-596-00167-3
[2] http://www.boost.org/libs/filesystem/doc/path.htm#operator_slash

holger krekel

Jul 25, 2003, 10:37:57 AM

Jason Orendorff wrote:
> I wrote the 'path' module at:
> http://www.jorendorff.com/articles/python/path
>
> There was some discussion on it here:
> http://groups.google.com/groups?th=42ab4db337b60ce3
>
> Just a few comments:
>
> Ian and Holger wondered why 'path' should subclass 'str'. It's because
> a path is a string. Benefit: you can pass 'path' objects to functions
> that expect strings (like functions in 'win32file'). I find this
> really useful in practice.

IMO you'll almost never use the following string-methods on a 'Path' object:

capitalize center count decode encode
expandtabs find index isalnum isalpha isdigit
islower isspace istitle isupper
ljust lstrip rjust splitlines startswith
swapcase title translate zfill

and so these methods pollute a Path object's name-space quite a bit.
Also 'join', '__contains__', startswith etc. produce some ambiguity.

I think it's convenient enough to use "str(path)" if passing a 'path'
instance as a string somewhere.

cheers,

holger

Just

Jul 25, 2003, 11:22:17 AM

In article <mailman.105914400...@python.org>,
holger krekel <py...@devel.trillke.net> wrote:

If the path object has a __str__ method, apparently it should work
without explicit conversion. However, this seems to fail for me on OSX,
where an attempt is made to convert to unicode. Providing a __unicode__
method doesn't help. But then again, I think we'd be fine if we add the
most used path-taking functions to the path object as methods. I can
even see adding some win-specific methods to it.

Just

holger krekel

Jul 25, 2003, 12:41:58 PM

Hello Just,

Just wrote:

> holger krekel <py...@devel.trillke.net> wrote:
> > I think it's convenient enough to use "str(path)" if passing a 'path'
> > instance as a string somewhere.
>
> If the path object has a __str__ method, apparently it should work
> without explicit conversion. However, this seems to fail for me on OSX,
> where an attempt is made to convert to unicode. Providing a __unicode__
> method doesn't help. But then again, I think we'd be fine if we add the
> most used path-taking functions to the path object as methods. I can
> even see adding some win-specific methods to it.

Yes, i think adding platform specific methods to a Path object makes sense.
A friend and me started working on (local and subversion) Path
implementations last week. Currently a Path instance provides
these "path-taking" methods

open
read
write
visit (a recursive walker)
listdir
stat
load/save (unpickle/pickle object)
setmtime (set modification time, uses os.utime)

apart from all the os.path.* stuff like 'exists', 'dirname' etc.
Providing these "path-taking" methods on the Path object is very important
because otherwise you'll have to convert back and forth for using those
os.* and os.path.* or builtin methods (which is evil).
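
A rough sketch of what such "path-taking" methods might look like on a local path object. LocalPath is a hypothetical class written for this illustration, not the implementation described above; the method names follow the list, and the choice of binary mode for read/write is an assumption.

```python
import os
import pickle


class LocalPath:
    """Wraps a local file name and offers the operations as methods."""

    def __init__(self, text):
        self.text = str(text)

    def __str__(self):
        return self.text

    def open(self, mode="r"):
        return open(self.text, mode)

    def read(self):
        # Assumption: read/write work on bytes; a text variant would take an encoding.
        with self.open("rb") as f:
            return f.read()

    def write(self, data):
        with self.open("wb") as f:
            f.write(data)

    def save(self, obj):
        # Pickle an object into the file.
        with self.open("wb") as f:
            pickle.dump(obj, f)

    def load(self):
        # Unpickle the object stored in the file.
        with self.open("rb") as f:
            return pickle.load(f)

    def stat(self):
        return os.stat(self.text)

    def setmtime(self, mtime):
        # os.utime wants (atime, mtime); keep the current access time.
        os.utime(self.text, (self.stat().st_atime, mtime))

    def listdir(self):
        return [LocalPath(os.path.join(self.text, name))
                for name in sorted(os.listdir(self.text))]
```

With methods like these the caller never has to turn the object back into a string just to call open(), os.stat() or os.utime().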

cheers,

holger

Skip Montanaro

Jul 25, 2003, 12:46:54 PM

holger> IMO you'll almost never use the following string-methods on a
holger> 'Path' object:

holger> capitalize center count decode encode
holger> expandtabs find index isalnum isalpha isdigit
holger> islower isspace istitle isupper
holger> ljust lstrip rjust splitlines startswith
holger> swapcase title translate zfill

In practice, I rarely use the above methods on string objects. The only
exception is startswith. ;-)

Skip

Ian Bicking

Jul 25, 2003, 1:48:19 PM

On Mon, 2003-07-21 at 13:16, Jason Orendorff wrote:
> Ian and Holger wondered why 'path' should subclass 'str'. It's because
> a path is a string. Benefit: you can pass 'path' objects to functions
> that expect strings (like functions in 'win32file'). I find this
> really useful in practice.

I feel like this would lead to some annoying behavior in some
circumstances. Most particularly, I'm thinking of:

def dosomething(file):
    if type(file) is type(""):
        file = open(file)
    ...

This isn't uncommon in functions that take pathnames or file objects.
While isinstance(path, str) works, it was not an option until 2.2. So
you'd be forced to do str(pathname) sometimes anyway, to deal with this.
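
The difference between the two checks can be seen with a tiny str subclass (the class name path is just a stand-in for this sketch):

```python
class path(str):
    """Stand-in for a path type that subclasses str."""


def is_string_exact(obj):
    # The older idiom: compares the exact type, so subclasses do not match.
    return type(obj) is type("")


def is_string_instance(obj):
    # Accepts str and any subclass of str.
    return isinstance(obj, str)


p = path("/etc/hosts")
exact = is_string_exact(p)        # False: a function like dosomething() would not open it
instance = is_string_instance(p)  # True
```

So code using the exact-type test would treat the path object as if it were already a file object, which is the annoying case described above.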

Ideally, interfaces would be changed to use a .open() method on the path
instead of opening the string representation (as Holger's implementation
does), so in the long term it would be nice to abandon direct string
representations entirely. It would also make it more clear when you had
a real path object and when you just had a string.

Ian

Ian Bicking

Jul 25, 2003, 1:45:33 PM

On Fri, 2003-07-25 at 11:41, holger krekel wrote:
> Yes, i think adding platform specific methods to a Path object makes sense.
> A friend and me started working on (local and subversion) Path
> implementations last week. Currently a Path instance provides
> these "path-taking" methods
>
> open
> read
> write
> visit (a recursive walker)
> listdir
> stat
> load/save (unpickle/pickle object)
> setmtime (set modification time, uses os.utime)

I like read and write too -- I do:

f = open(filename)
contents = f.read()
f.close()

All the time (when I'm uninterested in streaming or performance, which
is most of the time I deal with files). Or just open(filename).read()
and let garbage collection fix it up, even if it seems a little messy.
A single method to encapsulate that would be nice, and of course write
gives symmetry. Hmmm... Jason's distinguishes bytes (binary) and text
(which is potentially encoded). I kind of like that distinction.

Jason had walkers both for all files, just non-directory files, and
directory files. This seems useful to me, and by making it explicit I
might just start distinguishing text from binary (which I don't now
because I am forgetful). And a globbing walker, though I don't know how
much of an advantage that would be over list comprehension. Actually,
all his walkers have a globbing option.
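
For comparison, a sketch of a globbing file walker next to the list-comprehension alternative. This is a plain function written for illustration with os.walk and fnmatch, not Jason's walkfiles method; the sorting is only there to make the order predictable.

```python
import fnmatch
import os


def walkfiles(root, pattern="*"):
    """Yield non-directory files below root whose name matches the glob pattern."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)


def walkfiles_listcomp(root, pattern):
    """Same result, filtering an unfiltered walk with a list comprehension."""
    return [f for f in walkfiles(root)
            if fnmatch.fnmatch(os.path.basename(f), pattern)]
```

Both give the same files; the globbing argument mostly saves spelling out the basename and fnmatch calls at every call site.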

> apart from all the os.path.* stuff like 'exists', 'dirname' etc.
> Providing these "path-taking" methods on the Path object is very important
> because otherwise you'll have to convert back and forth for using those
> os.* and os.path.* or builtin methods (which is evil).

dirname isn't a good name, since it should return a path object, not a
"name" (which to me implies a string). I think Jason's module uses a
parent attribute, though it also supports dirname(), and a name
attribute instead of basename() (though that does not return a path
object). And things like dirname make less sense in some non-path
situations, like a URL. Probably not too much renaming should occur,
but at least a little may be appropriate.

Ian

holger krekel

Jul 25, 2003, 2:33:22 PM

Ian Bicking wrote:
> Jason had walkers both for all files, just non-directory files, and
> directory files. This seems useful to me, and by making it explicit I
> might just start distinguishing text from binary (which I don't now
> because I am forgetful). And a globbing walker, though I don't know how
> much of an advantage that would be over list comprehension. Actually,
> all his walkers have a globbing option.

We currently only have one 'visit' method that accepts a filter for returning
results and a filter for recursing into the tree. You can use and
combine multiple filters like so:

root = Path('...')
for path in root.visit(AND(isdir, nolink)):
    # iterates over all non-link dirs in the tree (breadth-first)

or

for path in root.visit(AND(isfile, endswith('.txt')), nodotfile):
    # iterates over all '*.txt' files but not recursing into ".*"

and so on. This proved to be flexible and convenient and mostly avoids
the need for multiple walk-methods.
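
One way such a two-filter, breadth-first visit could be written, as a sketch over plain path strings rather than Path objects. The function and filter names mirror the examples above, but the bodies are assumptions made for illustration (including the choice never to descend into symlinked directories), not the actual implementation.

```python
import os
from collections import deque


def AND(*filters):
    # Combine several one-argument filters into one.
    return lambda p: all(f(p) for f in filters)


def isdir(p):
    return os.path.isdir(p)


def isfile(p):
    return os.path.isfile(p)


def nolink(p):
    return not os.path.islink(p)


def nodotfile(p):
    return not os.path.basename(p).startswith(".")


def endswith(suffix):
    return lambda p: p.endswith(suffix)


def visit(root, fil=None, rec=None):
    """Breadth-first walk: yield entries accepted by fil, descend into dirs accepted by rec."""
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for name in sorted(os.listdir(current)):
            entry = os.path.join(current, name)
            if fil is None or fil(entry):
                yield entry
            # Assumption: symlinked directories are never followed.
            if isdir(entry) and nolink(entry) and (rec is None or rec(entry)):
                queue.append(entry)
```

The first filter decides what is reported and the second decides where the walk goes, which is why one method can stand in for several specialised walkers.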

cheers,

holger

Ian Bicking

Jul 25, 2003, 3:17:48 PM

Yeah... but we know that's not going to get into the standard library.
It requires a big namespace, logic functions (AND, OR, etc.), and it
confuses functions with these filter objects, which are named the same
(and even if the filter objects can be used as functions, it's still
confusing). It's a style that doesn't exist in the standard library,
and it seems unlikely that it would get in here.

The multiple walk methods would only be a shortcut anyway. Again, they
might be difficult in a situation like a URL where directory and file
are intermingled (and maybe ReiserFS 4...?) -- which maybe is okay, a
urlpath object simply wouldn't implement that walker.

Ian

holger krekel

Jul 25, 2003, 3:56:02 PM

Ian Bicking wrote:
> On Fri, 2003-07-25 at 13:33, holger krekel wrote:
> > We currently only have one 'visit' method that accepts a filter for returning
> > results and a filter for recursing into the tree.
> > ...
> > This proved to be flexible and convenient and mostly avoids
> > the need for multiple walk-methods.
>
> Yeah... but we know that's not going to get into the standard library.
> It requires a big namespace, logic functions (AND, OR, etc.), and it
> confuses functions with these filter objects, which are named the same
> (and even if the filter objects can be used as functions, it's still
> confusing). It's a style that doesn't exist in the standard library,
> and it seems unlikely that it would get in here.

Maybe right. This is not my first priority, anyway, but i also thought that
functional style is just not liked among the builtins.

Anyway, the "filter functions" are indeed just callables which accept
Path objects. You could as well take the unbound method Path.isdir
but this feels ugly and isn't flexible enough.

I don't exactly know what you mean by "big namespace". The filters are
all contained in a 'filter' submodule because they can apply to
multiple Path implementations anyway.

> The multiple walk methods would only be a shortcut anyway. Again, they
> might be difficult in a situation like a URL where directory and file
> are intermingled (and maybe ReiserFS 4...?) -- which maybe is okay, a
> urlpath object simply wouldn't implement that walker.

Yep, URL paths have no notion of directories and files. Thus a general
URL path can't have a 'listdir' method and thus we can't recurse.
You can easily special case it for Apache's "Indexes" view, though :-)

holger

Jason Orendorff

Jul 25, 2003, 4:19:49 PM

holger krekel wrote:
> IMO you'll almost never use the following string-methods on a 'Path' object:
>
> capitalize center count decode encode [...]
>
> and so these methods pollute a Path object's name-space quite a bit.
> Also 'join', '__contains__', startswith etc. produce some ambiguity.

I'm not worried about "namespace pollution", but you're right that
strings and paths are generally used for different things. I also
agree 'join()' is a wart.

> I think it's convenient enough to use "str(path)" if passing a 'path'
> instance as a string somewhere.

Hmmm. If the plan were to convert the whole standard library to accept
path objects for pathnames, I would likely agree. But when you say
"str(p)" is "convenient enough", you're saying I need this rule in my head:

Don't pass path objects to functions that take path arguments.
Pass string objects instead.

This is a type rule. Such a thing has no place in Python.

Furthermore, this rule is counterlogical! I would have to change
"mimetypes.guess_type(mypath)" to "mimetypes.guess_type(str(mypath))".

-- j

holger krekel

Jul 25, 2003, 4:46:38 PM

Jason Orendorff wrote:

> holger krekel wrote:
> > I think it's convenient enough to use "str(path)" if passing a 'path'
> > instance as a string somewhere.
>
> Hmmm. If the plan were to convert the whole standard library to accept
> path objects for pathnames, I would likely agree. But when you say
> "str(p)" is "convenient enough", you're saying I need this rule in my head:
>
> Don't pass path objects to functions that take path arguments.
> Pass string objects instead.

Or even better, call the appropriate Path method :-)

> This is a type rule. Such a thing has no place in Python.

Oh, the stdlib has lots of places where it expects certain types in
certain places. Look for e.g. 'isinstance'.

> Furthermore, this rule is counterlogical! I would have to change
> "mimetypes.guess_type(mypath)" to "mimetypes.guess_type(str(mypath))".

I'd just call this a little inconvenient. And i wouldn't mind adding
a guess_type method (which would work even better for URL's or
subversion-urls).

cheers,

holger

Andrew Dalke

Jul 25, 2003, 7:21:15 PM

holger krekel wrote:
> We currently only have one 'visit' method that accepts a filter for returning
> results and a filter for recursing into the tree.
>
> for path in root.visit(AND(isdir, nolink)):
>
> for path in root.visit(AND(isfile, endswith('.txt')), nodotfile):

I've used the AND trick before, as well as tricks to support "isdir && nolink".
Still, as these things get more complicated, it's easier to just do

for path in root.visit(lambda name: isfile(name) and name.endswith(".txt")):
-or-
def myfilter(name):
    return isfile(name) and name.endswith(".txt")
for path in root.visit(myfilter):

rather than use a prefix-style function interface.

This doesn't introduce any new programming styles, which makes it
easier to understand.

The exception is if the result builds up some sort of parse tree which
can be further analyzed for performance, which is not the case here.

Andrew
da...@dalkescientific.com

Van Gale

Jul 25, 2003, 9:10:11 PM

holger krekel wrote:
> Yes, i think adding platform specific methods to a Path object makes sense.
> A friend and me started working on (local and subversion) Path
> implementations last week. Currently a Path instance provides
> these "path-taking" methods
>
> open
> read
> write
> visit (a recursive walker)
> listdir
> stat
> load/save (unpickle/pickle object)
> setmtime (set modification time, uses os.utime)

Interesting, I started a project modifying Jason's Path module to work
on subversion trees as well. I didn't get too far before putting the
project on a back-burner so I'm glad to hear someone else is thinking
the same way :)

My extensions to Path included an additional argument to "open" that
included a version number, and a mechanism for retrieving some kind of
"metadata" associated with the file.

I also made another Path module that implements a "poor man's cms" if
subversion/rcs/cvs are not available. It uses hidden files with version
numbers in the filename to emulate a real version control system.
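
The post does not say how those hidden files are named, so the following is only a guess at the general technique: each save writes the visible file plus a hidden copy called ".<name>.<N>". The naming scheme and the function names are assumptions made for this sketch.

```python
import os
import re


def _version_files(path):
    """Map version number -> hidden file name for all stored versions of path."""
    directory, name = os.path.split(path)
    pattern = re.compile(r"^\." + re.escape(name) + r"\.(\d+)$")
    versions = {}
    for entry in os.listdir(directory or "."):
        found = pattern.match(entry)
        if found:
            versions[int(found.group(1))] = os.path.join(directory, entry)
    return versions


def save_version(path, data):
    """Write data to path, keep a hidden numbered copy, return the new version number."""
    number = max(_version_files(path), default=0) + 1
    directory, name = os.path.split(path)
    hidden = os.path.join(directory, ".%s.%d" % (name, number))
    for target in (hidden, path):
        with open(target, "wb") as f:
            f.write(data)
    return number


def read_version(path, number):
    """Return the contents stored as the given version of path."""
    with open(_version_files(path)[number], "rb") as f:
        return f.read()
```

The visible file always holds the latest contents, and older versions stay readable through read_version as long as the hidden copies are left alone.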

Van Gale

Ian Bicking

Jul 25, 2003, 9:48:07 PM

On Fri, 2003-07-25 at 20:10, Van Gale wrote:
> Interesting, I started a project modifying Jason's Path module to work
> on subversion trees as well. I didn't get too far before putting the
> project on a back-burner so I'm glad to hear someone else is thinking
> the same way :)
>
> My extensions to Path included an additional argument to "open" that
> included a version number, and a mechanism for retrieving some kind of
> "metadata" associated with the file.

It's interesting that different kinds of filesystems (or
filesystem-like-things) have very different kinds of metadata
available. Like last-modified, last-accessed, inode (identity),
version, title, branch, mimetype, log message, etc. And then there's
information that's not quite metadata... like <link ref> data, or the
volume name, the host, etc.

I feel like a common interface for these different filesystems should
somehow degrade well in terms of metadata, or expedite introspection in
some fashion.

The differences on the client side are probably easier to handle, as
they can be handled by the constructor, which might look different for
different filesystems. Like url('http://whatever', user='bob',
password='secret', proxy='http://myproxy'), or
cvs(pserver='cvs.sourceforge.net', repository='python'). Or should
there be a string-based representation (i.e., URIs)? Of course for
symmetry then __str__ would always return a URI, but for many
circumstances we'd prefer a more concise notation, like a filesystem
path (though most other cases would be acceptable squeezed into URIs).

I'd have placed the version in the object itself, not as an argument to
open. Then you'd want to query for alternate versions, most recent
version -- maybe some version identifier that meant most recent... a
similar situation might be language negotiation with an HTTP file.

Ian

Ian Bicking

Jul 25, 2003, 10:03:40 PM

On Fri, 2003-07-25 at 14:56, holger krekel wrote:
> > The multiple walk methods would only be a shortcut anyway. Again, they
> > might be difficult in a situation like a URL where directory and file
> > are intermingled (and maybe ReiserFS 4...?) -- which maybe is okay, a
> > urlpath object simply wouldn't implement that walker.
>
> Yep, URL paths have no notion of directories and files. Thus a general
> URL path can't have a 'listdir' method and thus we can't recurse.
> You can easily special case it for Apache's "Indexes" view, though :-)

WebDAV does, though, doesn't it? But you can still edit the directory
resource, so it gets overloaded. WebDAV's use of GET is messed up.

And we should specify HTTP, of course, since FTP does have a notion of
directories, and possibly other URL methods would as well.

But this is a digression...

Ian

Bengt Richter

Jul 25, 2003, 11:15:57 PM

On 25 Jul 2003 20:48:07 -0500, Ian Bicking <ia...@colorstudy.com> wrote:

>On Fri, 2003-07-25 at 20:10, Van Gale wrote:
>> Interesting, I started a project modifying Jason's Path module to work
>> on subversion trees as well. I didn't get too far before putting the
>> project on a back-burner so I'm glad to hear someone else is thinking
>> the same way :)
>>
>> My extensions to Path included an additional argument to "open" that
>> included a version number, and a mechanism for retrieving some kind of
>> "metadata" associated with the file.
>
>It's interesting that different kinds of filesystems (or
>filesystem-like-things) have very different kinds of metadata
>available. Like last-modified, last-accessed, inode (identity),
>version, title, branch, mimetype, log message, etc. And then there's
>information that's not quite metadata... like <link ref> data, or the
>volume name, the host, etc.
>
>I feel like a common interface for these different filesystems should
>somehow degrade well in terms of metadata, or expedite introspection in
>some fashion.
>

IMO a mounted file system per se should be represented by an object, and then
that object should have the methods to deliver generic or file-system-specific
file and path and walking objects etc.

After all, even NT can see DOS partitions, vs NTFS vs raw floppy
and HD images of potentially foreign formats. And my Slackware Linux sees one DOS partition
that can alternatively be booted, but that can also be read from Slackware via a mount.

Cf. another post in this thread (which didn't get any response ;-)

Regards,
Bengt Richter

holger krekel

Jul 26, 2003, 3:49:09 AM

Van Gale wrote:
> holger krekel wrote:
> > Yes, i think adding platform specific methods to a Path object makes sense.
> > A friend and me started working on (local and subversion) Path
> > implementations last week.
> > ...
>
> Interesting, I started a project modifying Jason's Path module to work
> on subversion trees as well. I didn't get too far before putting the
> project on a back-burner so I'm glad to hear someone else is thinking
> the same way :)

It's even working, although i am not sure we'll stay with the
subversion-python bindings as they are fragile and incomplete in places.
We might switch to using the commandline "svn" utility for the time being.

> My extensions to Path included an additional argument to "open" that
> included a version number, and a mechanism for retrieving some kind of
> "metadata" associated with the file.

We instantiate the Path like so

path = SvnPath('http://codespeak.net/svn/vpath/trunk/dist', rev=X)

where X is either -1 (default) meaning it should grab the latest
revision or some positive revision number. When you 'visit' or
'listdir' or 'open' on that 'path' you stay in the same revision
and thus get a consistent view. This is obviously a nice property.
Btw, via the above URL you'll get our current implementation with
lots of unittests. You currently need subversion-python-bindings
which are not exactly easy to get going unless you already have a
server-side install.
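
The "stay in the same revision" property can be sketched without any subversion bindings. RevPath below is a hypothetical stand-in, not the SvnPath class mentioned above: every path derived from it inherits the revision it was created with, and -1 stands for "latest" as described.

```python
class RevPath:
    """URL plus a pinned revision; derived paths keep the same revision."""

    LATEST = -1

    def __init__(self, url, rev=LATEST):
        self.url = url.rstrip("/")
        self.rev = rev

    def join(self, name):
        # The child sees the repository at the same revision as its parent.
        return RevPath(self.url + "/" + name, rev=self.rev)

    def __repr__(self):
        return "RevPath(%r, rev=%r)" % (self.url, self.rev)


# example.invalid is a placeholder host, not a real repository.
root = RevPath("http://example.invalid/repo/trunk", rev=7)
child = root.join("dist").join("setup.py")
```

Because the revision travels with the object rather than with each open or listdir call, a whole traversal sees one consistent snapshot.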

> I also made another Path module that implements a "poor man's cms" if
> subversion/rcs/cvs are not available. It uses hidden files with version
> numbers in the filename to emulate a real version control system.

I thought about this too. Right now we just want to make it easy and
complete enough.

cheers,

holger

holger krekel

Jul 26, 2003, 3:50:45 AM

Ian Bicking wrote:
> On Fri, 2003-07-25 at 14:56, holger krekel wrote:
> > > The multiple walk methods would only be a shortcut anyway. Again, they
> > > might be difficult in a situation like a URL where directory and file
> > > are intermingled (and maybe ReiserFS 4...?) -- which maybe is okay, a
> > > urlpath object simply wouldn't implement that walker.
> >
> > Yep, URL paths have no notion of directories and files. Thus a general
> > URL path can't have a 'listdir' method and thus we can't recurse.
> > You can easily special case it for Apache's "Indexes" view, though :-)
>
> WebDAV does, though, doesn't it? But you can still edit the directory
> resource, so it gets overloaded. WebDAV's use of GET is messed up.

I am not very familiar with the low-level details of WebDAV but i think
determining if something is a directory is done by a PROPFIND request.

cheers,

holger

holger krekel

Jul 26, 2003, 4:10:14 AM

Hello Ian,

Ian Bicking wrote:
> ...

> I feel like a common interface for these different filesystems should
> somehow degrade well in terms of metadata, or expedite introspection in
> some fashion.

I wouldn't try to play too many tricks with meta-data.

> The differences on the client side are probably easier to handle, as
> they can be handled by the constructor, which might look different for
> different filesystems. Like url('http://whatever', user='bob',
> password='secret', proxy='http://myproxy'), or
> cvs(pserver='cvs.sourceforge.net', repository='python'). Or should
> there be a string-based representation (i.e., URIs)?

String-based representations are often not specified. E.g.,
subversion/webdav/deltav don't define a URL format to get
to a certain revision. The other approach (keyword-args) is
a more generic and better way IMO.

cheers,

holger

John J. Lee

Jul 26, 2003, 7:34:07 PM

holger krekel <py...@devel.trillke.net> writes:

> Jason Orendorff wrote:
[...about passing path objects to library methods that expect a string...]

> > This is a type rule. Such a thing has no place in Python.
>
> Oh, the stdlib has lots of places where it expects certain types in
> certain places. Look for e.g. 'isinstance'.

It's not even a strict type rule. It's just that a path object
wouldn't implement the string interface. I don't know why that would
have 'no place in Python', or be 'counterlogical'.

John