I have a binary file with data in it. The file comes from an old MS-DOS application (Multilog, ~1980). In this application, a field is declared as a 'decimal' (999 999 999.99). I put 0.00 in the field and saved the record to the file. When I look at the binary file (with Python or a hex editor), the field is stored in 8 bytes: 00-00-00-00-00-00-7F-00. I tried unpack from the struct module, but the result isn't right.
>I've a binary file with data in it. >This file come from an old ms dos application (multilog ~ 1980). >In this application, a field is declared as a 'decimal' (999 999 >999.99). >I put 0.00 in the field and save the record to the file. >When I look in the binary file (with python or an hex editor), the >field is stored on 8 bytes: 00-00-00-00-00-00-7F-00. >I try unpack from struct module but the result isn't good.
>Can someone help me?
Most likely it is a BCD field. Please look for the number in the file with a simple text viewer. If the digits are visible there, you can read the number as a string (its length is len('999999999.99') or similar) and convert it with int() or long().
> Most likely it is a BCD field.
I believe the hex the OP gave is a double 0.0e0 (exponents are biased). OTOH, BCD for 11 digits would be 6 bytes, all 0, which the data also has. Putting -1.0 and 1.0 in the field and storing them would clarify the storage format if it really is not known.
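One quick way to test the IEEE-double reading is to let struct decode the eight bytes in both byte orders. This is only a sketch in modern Python 3, and the variable names are made up for illustration. Neither byte order yields exactly 0.0 (both give tiny positive numbers), which hints that the field is probably not a plain IEEE double.

```python
import struct

# The eight bytes the OP reported for 0.00, in file order.
raw = bytes.fromhex("0000000000007F00")

# Interpret them as an IEEE 754 double, little-endian and big-endian.
as_le, = struct.unpack("<d", raw)
as_be, = struct.unpack(">d", raw)

print(as_le)  # a tiny positive number, not 0.0
print(as_be)  # a tiny positive (subnormal) number, not 0.0
```

Repeating the same check on stored values of 1.0 and -1.0 would show whether any standard struct format code matches.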
pascal.par...@free.fr (Pascal) writes: > I've a binary file with data in it. > This file come from an old ms dos application (multilog ~ 1980). > In this application, a field is declared as a 'decimal' (999 999 > 999.99). > I put 0.00 in the field and save the record to the file. > When I look in the binary file (with python or an hex editor), the > field is stored on 8 bytes: 00-00-00-00-00-00-7F-00. > I try unpack from struct module but the result isn't good.
> Can someone help me?
You're sort of vague here, but I don't think struct is going to help you regardless. "decimal" in this case is almost certainly some sort of BCD, which isn't a standard C struct (and therefore unknown to the struct module).
You really need to figure out how the data is stored. Based on your one example it looks like it's stored as a series of 7 bit values representing the decimal digits with 0x7f indicating the decimal point. If this is correct you could use something like
tstr = ''
for c in instr:
    if c == chr(0x7f):
        tstr += '.'
    else:
        tstr += str(ord(c))
fl = float(tstr)
With two major caveats:
1) this is going to return a float, not a decimal;
2) there's no way for me to even guess how negative numbers are represented.
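For readers on Python 3, the same one-digit-per-byte hypothesis can be sketched as follows. The function name digits_to_decimal is hypothetical, and the layout is still a guess based on a single sample. Returning decimal.Decimal instead of float addresses caveat 1; negative numbers remain unhandled, and any byte that is neither a digit 0-9 nor 0x7F is rejected rather than silently turned into text.

```python
from decimal import Decimal

def digits_to_decimal(raw: bytes) -> Decimal:
    """Decode one-digit-per-byte data, with 0x7F marking the decimal point.

    This follows the hypothesis above, which rests on a single sample.
    """
    chars = []
    for b in raw:  # iterating over bytes yields ints in Python 3
        if b == 0x7F:
            chars.append(".")
        elif b <= 9:
            chars.append(str(b))
        else:
            raise ValueError(f"byte {b:#04x} is not a decimal digit")
    return Decimal("".join(chars))

print(digits_to_decimal(bytes.fromhex("0000000000007F00")))
```

If other stored values produce a ValueError here, that is evidence against this reading of the format.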
-- Christopher A. Craig <list-pyt...@ccraig.org> "By rights we shouldn't be here." -- Sam in Peter Jackson's "The Two Towers" while standing in Osgiliath, where he shouldn't be.
If the number is saved in a floating point representation (IEEE?), typically [sign][exponent][fraction] then you really need to know what the type is. For example, I had to make cross-platform real numbers at one stage and fabricated them as below.
Colin Brown PyNZ
import math

def vmsR4(real):
    '''vmsR4(real): returns an integer that is equivalent to a VMS real*4
    '''
    (m, e) = math.frexp(real)
    if m == 0.0:
        return 0
    else:
        sign = m < 0
        exp = e + 128
        mant = int((16777216L * abs(m)) + 0.5) - 8388608
        return (sign << 15) + (exp << 7) + (mant >> 16) + (mant << 16)
> I've a binary file with data in it. > This file come from an old ms dos application (multilog ~ 1980). > In this application, a field is declared as a 'decimal' (999 999 > 999.99). > I put 0.00 in the field and save the record to the file. > When I look in the binary file (with python or an hex editor), the > field is stored on 8 bytes: 00-00-00-00-00-00-7F-00. > I try unpack from struct module but the result isn't good.
The only obvious pattern I see is that 2**0 -> 81, 2**1 -> 82, ... 2**9 -> 8A (where A == 10), i.e., for non-zero values the last byte is 0x81 plus the exponent of the largest power of two, which looks like some type of float, and the first 6 bytes are 0 if the value is integral. It may be a proprietary format.
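The observed pattern can be written down with math.frexp. This is a small Python 3 sketch; expected_last_byte is a hypothetical helper that only predicts the last byte from the rule stated above. It does not read any file.

```python
import math

def expected_last_byte(x: float) -> int:
    """Exponent byte predicted by the observed pattern, for non-zero x."""
    # frexp returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1, so the
    # exponent of the largest power of two not exceeding |x| is e - 1.
    _, e = math.frexp(x)
    return 0x81 + (e - 1)

for k in range(10):
    print(f"2**{k} -> {expected_last_byte(2.0 ** k):02X}")
```

Comparing these predictions against the last byte of more stored values (including non-powers of two) is a cheap way to confirm or refute the rule.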
That is a bizarre format, and of course I had to implement it. (Even C is more pleasant in Python!).
It works for the cases given, but do find out where the sign bit is for the mantissa. (This code assumes it's the MSB of the mantissa.)
Also tease out the NaN and +-Infinity cases.
--- Code ---
#! /usr/bin/env python
# by Francis Avila
#
# Decode a peculiar binary floating point encoding
# used by 'multilog', an old dos spreadsheet.

import struct

def _test():
    tests = [(str(float(i)), str(dectofloat(j))) for i, j in _known]
    results = [expect == got for expect, got in tests]

    failed = [tests[i] for i, passed in enumerate(results) if not passed]

    if failed:
        return failed
    else:
        return 'Passed'

def bin(I):
    """Return list of bits of int I, most significant bit first."""
    if I < 0:
        raise ValueError, "I must be >= 0"
    bits = []
    if I == 0:
        bits = [0]
    while I > 0:
        r = (I & 0x1)
        if r:
            r = 1
        bits.append(r)
        I >>= 1
    bits.reverse()
    return bits

def binaryE(n, exp):
    """Return result of a binary n*10**exp.

    As n*10**exp is to decimal, so binaryE(n, exp) is to binary.
    """
    return sum([2**(exp-i) for i, bit in enumerate(bin(n)) if bit])

# Add special cases here:
SPECIAL = {'\x00\x00\x00\x00\x00\x00\x7f\x00': 0.0}

def dectofloat(S):
    """Return float value of 8-byte 'decimal' string."""
    if S in SPECIAL:
        return SPECIAL[S]

    # Convert to byteswapped long.
    N, = struct.unpack('<Q', S)

    # Grab exponent and mantissa parts using bitmasks.
    # The eight MSBs are exponent; rest mantissa.
    exp, mant = (N & (0xffL << 56)) >> 56, N & ~(0xffL << 56)

    exp -= 0x81  # Exponential part is excess 0x81 (e.g., 0x82 is 1).

    msign = mant & (0x80L << 48)  # MSB of mantissa is sign bit. 0==positive.
    if not msign:
        msign = 1
    else:
        msign = -1

    mant |= (0x80L << 48)  # Add implied 1 to the MSB of mantissa.
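For readers on Python 3, the decoding steps used in the listing above (exponent in the last byte with excess 0x81, sign in the top bit of the second-to-last byte, implied leading 1) can be sketched compactly. The name multilog_to_float is hypothetical. Treating an exponent byte of 0 as zero is an assumption that matches the single zero sample in this thread, and NaN or infinity encodings, if any exist, are not handled.

```python
import math

def multilog_to_float(raw: bytes) -> float:
    """Decode an 8-byte little-endian field laid out as described above."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    exp_byte = raw[7]               # highest byte holds the biased exponent
    if exp_byte == 0:               # assumption: exponent byte 0 means 0.0
        return 0.0
    negative = bool(raw[6] & 0x80)  # top bit of the next byte is the sign
    mant = int.from_bytes(raw[:7], "little")
    mant |= 1 << 55                 # implied leading 1 goes where the sign bit sits
    # mant now reads as binary 1.xxxx scaled up by 2**55; undo that scaling
    # and apply the excess-0x81 exponent in one step.
    value = math.ldexp(float(mant), exp_byte - 0x81 - 55)
    return -value if negative else value

print(multilog_to_float(bytes.fromhex("0000000000000081")))  # exponent byte 0x81
print(multilog_to_float(bytes.fromhex("0000000000007F00")))  # the OP's 0.00 record
```

The mantissa has 56 significant bits, slightly more than a Python float can hold, so the lowest bits may be rounded during the conversion.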
>Francis Avila fed this fish to the penguins on Friday 28 November 2003 >01:45 am:
>> That is a bizarre format, and of course I had to implement it. (Even C >> is more pleasant in Python!).
> And I thought /I/ was the masochist...
> I do have to confess that short tests with struct did reveal that, on >my system, regular doubles do have the same byte order as the original >data. I'm just more comfortable with seeing hex representations of >numbers with the MSB on the left.
>> It works for the cases given, but do find out where the sign bit is >> for the mantissa. (This code assumes it's the MSB of the mantissa.)
> I suspect most would consider the format my college computer used to >be weird... Xerox Sigma 6... Excess 64 (decimal) (as I recall) exponent >powers of sixteen! A "normalized" mantissa could have up to three >leading 0 bits, and there were no "hidden" bit.
> S eeeeeee mmmmmmmmm mmmmmmmm mmmmmmmm ...
Does the OP have the ability to generate example values at will, or is it a matter of scrounging through some old recorded data with no way of making more?
If he can make more, I'd suggest a number with most of the nybbles of data individually numbered, e.g., 0xfedcba987654321. If it's a 64-bit format, converting a 64-bit integer like this would probably tell us a lot about where the bits go (how many get shifted out or hidden, and how they're ordered). And then the same number negative. E.g.,
>>> 0xfedcba987654321
1147797409030816545L
>>> -0xfedcba987654321
-1147797409030816545L
>>> hex(0xfedcba987654321)
'0xFEDCBA987654321L'
>>> hex(-0xfedcba987654321)
'-0xFEDCBA987654321L'
<grr>useless hex representation for looking at bits ...</grr>
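For looking at the actual bits of a stored field, a small Python 3 helper can print the raw bytes in binary with the highest byte on the left. The name bits_msb_left is hypothetical, and the sketch assumes the little-endian storage order used elsewhere in this thread.

```python
def bits_msb_left(raw: bytes) -> str:
    """Format bytes as binary, last-stored (most significant) byte first."""
    return " ".join(f"{b:08b}" for b in reversed(raw))

# The OP's 0.00 record: only the second-highest byte has any bits set.
print(bits_msb_left(bytes.fromhex("0000000000007F00")))
```

Dumping a positive test value and its negative side by side this way should make the sign bit and any hidden bit easy to spot.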
A very big thanks to you. The function runs perfectly (after installing Python 2.3 for the enumerate function). If you can, please give me more details on the method or the number's representation.
On 2 Dec 2003 09:40:50 -0800, pascal.par...@free.fr (Pascal) wrote:
>A very big thanks to you. >The function run perfectly (after python 2.3! installed for enumerate function) >If you can, give me more details on the methode or the number's representation.
>Thanks a lot!
According to an old MASM 5.0 programmer's guide, there was a Microsoft Binary format for encoding real numbers, both short (32 bits) and long (64 bits).
There were 3 parts:
1. Biased 8-bit exponent in the highest byte (last in the little-endian view we've been using). The guide says the bias is 0x81 for short numbers and 0x401 for long, but I'm not sure where that lines up. I just got there by experimentation.
2. Sign bit (0 for +, 1 for -) in upper bit of second highest byte.
3. All except the first set bit of the mantissa in the remaining 7 bits of the second highest byte, and the rest of the bytes. Since the most significant bit for non-zero numbers is always 1, it is not stored. But if it were, it would occupy the same bit position as the sign (that's why I or-ed it in there to complete the actual mantissa).
MASM also supported a 10-byte format similar to IEEE. I didn't see anything in that section on NaNs and INFs.
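Going the other way, the three parts listed above can be assembled into an 8-byte field. This Python 3 sketch uses the hypothetical name float_to_field and the 0x81 bias found by experimentation in this thread, not the 0x401 figure quoted from the guide. It writes zero as eight zero bytes, which is an assumption (the OP's file shows 00-00-00-00-00-00-7F-00 for 0.00), and NaN and infinity are not handled.

```python
import math

def float_to_field(x: float) -> bytes:
    """Encode x using the three-part layout listed above (8-byte variant)."""
    if x == 0.0:
        return bytes(8)          # assumption: a zero exponent byte means 0.0
    m, e = math.frexp(abs(x))    # abs(x) == m * 2**e with 0.5 <= m < 1
    exp_byte = e + 0x80          # same as 0x81 + (e - 1)
    if not 1 <= exp_byte <= 0xFF:
        raise OverflowError("exponent does not fit in one byte")
    mant = int(m * 2**56)        # 56-bit mantissa, top bit always set
    mant &= ~(1 << 55)           # drop the implied leading 1
    if x < 0:
        mant |= 1 << 55          # the freed bit position carries the sign
    return mant.to_bytes(7, "little") + bytes([exp_byte])

print(float_to_field(1.0).hex("-"))
print(float_to_field(-1.0).hex("-"))
```

Comparing this output with what the application actually writes for a few test values is a direct check on the layout.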