Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

PEP for fp.type()

1 view
Skip to first unread message

Tim Cera

unread,
Jan 15, 2002, 12:43:29 PM1/15/02
to
I have struggled for some time to get a function similar to the Unix
'file' command. I found http://www.demonseed.net/~jp/code/magic.py
which works pretty well, but I think that a better place for this
would be as a file method.

I am _timidly_ proposing the following...

>>> fp = open('/home/tcera/.cshrc')
>>> fp.type()
('ascii text', 'text/plain', 'ascii')
>>> fp.close()

I realize that the ascii or binary test is problematic, but if you
take the Unix 'file' approach it would be strictly defined by the
'magic' type.

Also would be valuable to be able to test if the filename extension
and the contents match.

I think this would deprecate the imghdr, sndhdr, and mimetypes
modules.

To replace imghdr and sndhdr instead of...

>>> import imghdr
>>> fp = open('/home/tcera/test.gif')
>>> test = imghdr.what(fp)
>>> test
1
>>>

would be something like...

>>> fp = open('/home/tcera/test.gif')
>>> test = fp.type()[1][:5] == 'image'
>>> test
1
>>>

Not as clear as I would like, but should directly replicate imghdr and
sndhdr.

It would be valuable to me to include the capability to only look at
the filename extension (faster, but could be incorrect). This
capability would replace the mimetypes module. Not sure how that
command might look.

Maybe...

>>> fp = open('/home/tcera/test.gif')
>>> fp.type('ext')
('GIF file, v89', 'image/gif', 'binary')
>>> fp.close()

OR

>>> fp = open('/home/tcera/test.gif')
>>> fp.type_ext()
('GIF file, v89', 'image/gif', 'binary')
>>> fp.close()

Dumb idea? Cool idea? Comments?

If this idea is thought of kindly, I might be able to spare the time
to write a PEP, but really hope that someone else gloms onto it.

take care
tim

Skip Montanaro

unread,
Jan 15, 2002, 1:19:45 PM1/15/02
to

Tim> I have struggled for some time to get a function similar to the
Tim> Unix 'file' command. I found
Tim> http://www.demonseed.net/~jp/code/magic.py which works pretty well,
Tim> but I think that a better place for this would be as a file method.

I see nothing wrong with a module like magic.py. I don't understand why
this is best as a file method. Your message didn't provide any motivation.

--
Skip Montanaro (sk...@pobox.com - http://www.mojam.com/)

Tim Cera

unread,
Jan 15, 2002, 10:13:50 PM1/15/02
to
"Skip Montanaro" <sk...@pobox.com> wrote
> I see nothing wrong with a module like magic.py. I don't understand why
> this is best as a file method. Your message didn't provide any motivation.

Your right, I didn't give a rational, just described the idea.

I can think of four reasons...

1) Duplication of function in imghdr, sndhdr, and mimetypes. Both
imghdr and sndhdr do a similar thing to Unix 'file' and magic.py where
mimetypes looks at the filename extension. All try to determine what
kind of file is out there. Also, what if it is a text file? How do
you test with the standard modules? Remember that the mimetypes
module does not sample the file.

2) The imghdr module is limited to testing for only 11 graphic
formats. And in both cases when I should have been able to use it,
the format wasn't supported (ps and eps). The sndhdr module seems a
little better off, covering 12 sound formats.

3) IANALL but it seems to make sense as a file method because it would
return a characteristic of the file. Whether it makes sense as a file
method or an os method I don't think is the most important thing.
Actually it would return information about a file similar to
os.stat(). Maybe os.filetype()? Heck, maybe just extend magic.py to
support the mimetypes approach, but replace imghdr, sndhdr, and
mimetypes.

4) Batteries Included!

take care
tim cera

Andrew Dalke

unread,
Jan 15, 2002, 11:43:28 PM1/15/02
to
Tim Cera wrote:
>"Skip Montanaro" <sk...@pobox.com> wrote
>> I see nothing wrong with a module like magic.py. I don't understand why
>> this is best as a file method. Your message didn't provide any
motivation.
>
>Your right, I didn't give a rational, just described the idea.
>
>I can think of four reasons...

The first two are reasons to have it as a standard module. The aren't
reasons for file objects to grow a "magic" method.

>Heck, maybe just extend magic.py to
>support the mimetypes approach, but replace imghdr, sndhdr, and
>mimetypes.

In other words, that. The more functionality that's put into
a standard object, the harder it is to emulate (as in, make a file-like
object). For example, should that method be usable on a socket,
which acts like a file handle? Or on a StringIO?

It's easier to have a module then pass it the filehandle or filename.

BTW, I've never needed /etc/magic for anything I've done. The file
typing I need can't be done with that limited of a language.

Andrew
da...@dalkescientific.com

Martin von Loewis

unread,
Jan 16, 2002, 4:04:54 AM1/16/02
to
"Andrew Dalke" <da...@dalkescientific.com> writes:

> It's easier to have a module then pass it the filehandle or filename.

Actually, I'd only allow to pass it file names. Finding the type of an
open file is inherently hard (e.g. if it was opened in text mode,
looking for the Python bytecode magic will fail), and the application
should be encouraged to pass the file name if it can.

Regards,
Martin

Alex Martelli

unread,
Jan 16, 2002, 3:49:15 AM1/16/02
to
"Tim Cera" <tim...@earthlink.net> wrote in message
news:9c3f2e9c.02011...@posting.google.com...
...

> 3) IANALL but it seems to make sense as a file method because it would
> return a characteristic of the file. Whether it makes sense as a file

Note that this is not an optimal rationale to decide what's best
architected as an object's method, and what's best architected
as a function accepting an object as a parameter. Consider
Python's built-in functions. Doesn't id(), for example, "return
a characteristic" of the object that's passed to it? Sure it
does, but, it would be wrong to make it a method of each object.

Rather, a designer should consider: am I willing to put the
burden of implementing this method on ANY object that wants
to be an "X-like" object? How important is it for this method
to be overridable by specific object types/classes, and does
it have an obvious implementation in terms of already existing
primitives? See the xreadlines module AND method of file
objects for one possible compromise design.


> method or an os method I don't think is the most important thing.

Probably not, but it IS one important architectural decision -- and
I think this is one more case in which an external function is best.

Haskell typeclasses let methods be added post-facto with default
implementations in terms of other methods, but still overridable
(indeed in a typical case, a typeclass's methods are by default
defined in term of each other and at least one must be overridden
to "break the dependency cycle"). Python doesn't have such a
smooth mechanism for this, although a metaclass and/or mix-in could
be used to emulate it reasonably closely ("protocol adaptation" as
proposed in an open PEP would IMHO be the Pythonic way to make
this sufficiently general).

Thus, the method-vs-function choice is important in designing
for Python (particularly when thinking of functionality meant
to get into core Python). MHO is that a PEP for a new method
of builtin file objects would stand no chance, while one for a
new separate module to be added to the standard library might.


Alex

0 new messages