Any help is appreciated.
Best regards
Holger Brunck
> I use Python 2.5 and I am looking for a way to determine a
> file type. In particular, I need the endianness of a file. Is
> there a way to detect this easily in Python?
Only if you already know what's going to be in the file.
> Something like the "file" utility for Linux would be very helpful.
>
> Any help is appreciated.
You're going to have to describe in detail what's in the file before
anybody can help.
--
Grant Edwards
There is a Python module called "magic" that uses the same engine as
file to determine a file type. It's part of the "file" source code:
http://www.darwinsys.com/file/
On Fedora I can just yum install python-magic to get it.
>You're going to have to describe in detail what's in the file before
>anybody can help.
As part of the build system for our embedded system, we create a cramfs
image. Later in the build process we have to check its endianness,
because it could be little endian or big endian (ARM or PPC).
The output of the "file" tool is for a little endian cramfs image:
<ourImage>: Linux Compressed ROM File System data, little endian size 1875968
version #2 sorted_dirs CRC 0x8721dfc0, edition 0, 462 blocks, 10 files
It would be possible to execute
ret = os.system('file <ourImage> | grep "little endian"')
and evaluate the return code.
But I would rather not evaluate a piped system command. If there is a way to
do this without os.system, that would be great.
Best regards
Holger
Files don't, as such, have a detectable endianness. 0x23 0x41 could mean
either 0x4123 or 0x2341 - there's no way of knowing.
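To make the ambiguity concrete, here is a small sketch with the struct module (written for Python 3, which is my assumption; the original poster is on 2.5). The same two bytes decode to different integers depending on which byte order you assume:

```python
import struct

raw = b"\x23\x41"  # the two bytes from the example above

# "<H" decodes an unsigned 16-bit value as little-endian, ">H" as big-endian.
as_little = struct.unpack("<H", raw)[0]
as_big = struct.unpack(">H", raw)[0]

print(hex(as_little))  # 0x4123
print(hex(as_big))     # 0x2341
```

Nothing in the bytes themselves says which reading is the intended one.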
The "file" utility also doesn't really know about endianness (well,
maybe it does swap bytes here and there, but that's an implementation
detail) - it just knows about file types. It knows what a little-endian
cramfs image looks like, and what a big-endian cramfs image looks like.
And as they're different, it can tell them apart.
If you're only interested in a couple of file types, it shouldn't be too
difficult to read the first few bytes/words with the struct module and
apply your own heuristics. Open the files in question in a hex editor
and try to figure out how to tell them apart!
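For the cramfs case, such a heuristic could look roughly like the sketch below. It assumes that a cramfs image starts with the 32-bit magic number 0x28CD3D45 stored in the byte order of the target, which is my understanding of what "file" checks; please confirm it against your own images in a hex editor. The function names are made up for illustration, and images that carry padding before the superblock are not handled.

```python
import struct

# Assumption: cramfs superblock magic, stored in the image's own byte order.
CRAMFS_MAGIC = 0x28CD3D45


def cramfs_endianness(header):
    """Return 'little', 'big', or None given the first bytes of an image."""
    if len(header) < 4:
        return None
    first_word = header[:4]
    if struct.unpack("<I", first_word)[0] == CRAMFS_MAGIC:
        return "little"
    if struct.unpack(">I", first_word)[0] == CRAMFS_MAGIC:
        return "big"
    return None  # not recognised as a cramfs image at offset 0


def cramfs_endianness_of_file(path):
    """Read the first four bytes of the file at `path` and classify them."""
    with open(path, "rb") as image:
        return cramfs_endianness(image.read(4))
```

In a build script you would call cramfs_endianness_of_file() on the image and fail the build if the result is not the one you expect for the target.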
If you have control over the file format then you could ensure that
there's a double-byte value such as 0xFF00 at a certain offset. That
will tell you the endianness of the file.
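A minimal sketch of that marker check, assuming the 0xFF00 value from above and a hypothetical offset of 0 (use whatever offset your format actually reserves):

```python
import struct

MARKER = 0xFF00      # example value from the text
MARKER_OFFSET = 0    # hypothetical offset, chosen only for illustration


def endianness_from_marker(data, offset=MARKER_OFFSET):
    """Return 'big', 'little', or None depending on how MARKER is stored."""
    field = data[offset:offset + 2]
    if len(field) < 2:
        return None
    if struct.unpack(">H", field)[0] == MARKER:
        return "big"      # bytes on disk are FF 00
    if struct.unpack("<H", field)[0] == MARKER:
        return "little"   # bytes on disk are 00 FF
    return None           # marker missing or corrupted
```

The marker has to read differently when its bytes are swapped; a value like 0xFFFF would tell you nothing.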
>> It would be possible to execute ret = os.system('file <ourImage> |
>> grep "little endian"') and evaluate the return code. But I don't like
>> to evaluate a piped system command. If there is a way without using
>> the os.system command this would be great.
>
> Files don't, as such, have a detectable endianness. 0x23 0x41 could mean
> either 0x4123 or 0x2341 - there's no way of knowing.
>
> The "file" utility also doesn't really know about endianness (well,
> maybe it does swap bytes here and there, but that's an implementation
> detail) - it just knows about file types. It knows what a little-endian
> cramfs image looks like, and what a big-endian cramfs image looks like.
> And as they're different, it can tell them apart.
>
> If you're only interested in a couple of file types, it shouldn't be too
> difficult to read the first few bytes/words with the struct module and
> apply your own heuristics. Open the files in question in a hex editor
> and try to figure out how to tell them apart!
And by looking at the rules that "file" uses for the two file types
that matter, one should be able to figure out how to implement
something in Python. Or one can use the Python "magic" module as
previously suggested: http://pypi.python.org/pypi/python-magic/
--
Grant
Please see http://pypi.python.org/pypi/python-magic
HTH,
Daniel
I wouldn't use os.system with grep and evaluate the return code. Instead
I'd use subprocess.Popen("file <ourImage>") and read the text output of the
command directly. By parsing that string, I can extract all kinds of
interesting information.
That is an entirely Unix-like way of doing things. Don't reinvent the
wheel when there's a tool that already does what you want.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.
Small correction: subprocess.Popen(["file", our_image_filename])
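Putting the list form together with the output parsing suggested above, a rough sketch might look like this. It assumes Python 2.7 or later (subprocess.check_output is not available in 2.5) and that "file" is on the PATH; the function names are made up for illustration.

```python
import subprocess


def describe_file(path):
    """Run the external `file` utility and return its description of `path`."""
    # A list of arguments bypasses the shell, so there is nothing to quote.
    # "-b" (brief) keeps the filename out of the output.
    return subprocess.check_output(["file", "-b", path],
                                   universal_newlines=True)


def endianness_from_description(text):
    """Pick 'little' or 'big' out of a description line, else None."""
    lowered = text.lower()
    if "little endian" in lowered:
        return "little"
    if "big endian" in lowered:
        return "big"
    return None
```

Usage would be something like endianness_from_description(describe_file(our_image_filename)). Keeping the parsing in its own function means it can be checked against saved output without running "file" at all.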
--
Robert Kern