
Size of records (packed vs default)


Paul E. Schoen

Mar 20, 2010, 3:38:05 AM
I am writing a test program that reads a file in a proprietary format. I
have determined that the pattern repeats such that each record is 1142
bytes, so I used the following:

type Ttc = record
  time: double; // 8
  curr: double; // 8
  decay: array[1..5] of char;
  yesno: char;
end; // Length = 22

type Ttccdata = record
  recltype: array[1..7] of char; // 7
  id: array[1..20] of char; // 27
  amps: array[1..4] of char; // 31
  sequence: array[1..7] of char; // 38
  gndct: array[1..4] of char; // 42
  gndseq: array[1..7] of char; // 49
  conn: array[1..5] of char; // 54
  mpu: array[1..4] of double; // 86
  tc: array[1..48] of Ttc; // 86 + 48*22 = 1142
  // Debugger shows SizeOf(Ttc)=24 and SizeOf(Ttccdata)=1240
end;

I am reading the file in chunks that I thought would match the record
size, but when I inspect the records in the debugger, SizeOf seems to add
two bytes to each record. I am using essentially the same definitions as
in an old Turbo C program that does the same thing, except that it uses a
"struct" rather than a "record".

OK, I found that I need to use "packed record". By default Delphi aligns
fields on word or double-word boundaries, so my 22-byte record was padded
to 24 bytes, and my Ttccdata record was padded to 1240.

This may be common knowledge, but I've gone this far so I'll post it anyway.
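
A cheap way to catch this kind of mismatch early is to assert on the record
size before reading. Just a sketch (CheckRecordSize is a made-up helper, and
Format needs SysUtils in the uses clause); 1142 is the record size I measured
in the file:

procedure CheckRecordSize;
begin
  // Fail fast if the declared record stops matching the 1142-byte records
  // in the file (e.g. if "packed" is accidentally dropped again).
  Assert(SizeOf(Ttccdata) = 1142,
         Format('Ttccdata is %d bytes, expected 1142', [SizeOf(Ttccdata)]));
end;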

Paul

Jamie

Mar 20, 2010, 10:58:49 AM
Paul E. Schoen wrote:

Yes, it's common knowledge! :)

But don't feel left out; this is standard practice in C/C++ compilers too,
and it was done to improve processing speed on older generations of CPUs.
I don't think it really matters much these days!

I always use PACKED on all records.

In C/C++ you can use an inline compiler switch to set the STRUCT
alignment on the fly.
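
Delphi has a comparable knob: the $A alignment directive can be toggled
around a group of declarations instead of writing "packed" on each one.
A minimal sketch (TFileRec is an invented name):

{$A-}  // byte-align record fields, roughly equivalent to "packed"
type TFileRec = record
  time: double;
  decay: array[1..5] of char;
  yesno: char;
end;
{$A+}  // restore the default field alignment for everything else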


Hans-Peter Diettrich

Mar 20, 2010, 12:44:22 PM
Jamie wrote:

> I always use PACKED on all records.

I don't. Packed is reserved for "external" records, i.e. exchanged with
other applications, libraries, or in binary data files.

> In C/C++ they can use a inline compiler switch to set the STRUCT
> alignments on the fly..

When it's required to rebuild such records from the C declaration, it
may be necessary to add the padding explicitly to a packed record.
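
Something like this sketch, for a C struct where the compiler aligns a
double to an 8-byte boundary (field names invented for illustration):

type TFromC = packed record
  flag: byte;                // offset 0 in the C struct
  _pad: array[1..7] of byte; // padding the C compiler inserted before the double
  value: double;             // offset 8, matching the aligned C layout
end;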

DoDi

Paul E. Schoen

Mar 21, 2010, 6:41:25 AM

"Paul E. Schoen" <pa...@pstech-inc.com> wrote in message
news:AX_on.117198$0N3.1...@newsfe09.iad...

>I am writing a test program that reads a file in a proprietary format. I
>have determined that the pattern repeats such that each record is 1142
>bytes, so I used the following:

The structure of the data records is more like this:

type Ttc = packed record
  curr_a: int64; // Sometimes double
  time_a: int64; // Sometimes double
  decay_a: array[1..5] of char;
  yesno_a: char;
  curr_b: int64; // Sometimes double
  time_b: int64; // Sometimes double
  decay_b: array[1..5] of char;
  yesno_b: char;
  curr_c: int64; // Always Int64
  time_c: int64; // 8
  decay_c: array[1..5] of char;
  yesno_c: char;
  curr_g: int64; // Always Int64
  time_g: int64; // 8
  decay_g: array[1..5] of char;
  yesno_g: char;
end; // Length = 88

type Ttccdata = packed record
  recltype: array[1..7] of char; // 7
  id: array[1..20] of char; // 27
  amps: array[1..4] of char; // 31
  sequence: array[1..7] of char; // 38
  gndct: array[1..4] of char; // 42
  gndseq: array[1..7] of char; // 49
  conn: array[1..1] of char; // 50
  mpu: array[1..4] of Int64; // 82
  flags: array[1..4] of char; // 86
  tc: array[1..12] of Ttc; // 86 + 12*88 = 1142
end;

The records consist of a real current and time, plus the decay and yesno
strings, for three test conditions (Low, Med and High), each of which may be
on one of four phases (A, B, C, and G), and there may be as many as four
current and time values. So there are really 3 * 4 * 4 = 48 similar records.
I had assumed the real data was type double, as it occupied 8 bytes, but
that produced impossible numbers for all but zero data. When I read the data
as Integer or Int64, I got a number that was 10,000 times the correct value.
But the data for A and B phases of Med and High test types is type double.

I don't have access to the source code for this program and I believe it was
written in an early form of Visual Basic for console-type MS-DOS
applications, around 1994. It was written by a shop guy who is not a
professional programmer. But I can't imagine how this database structure
could have been created. I am trying to write a module that can translate
these data files for customers of my new Ortmaster program, which uses dBase
files and a more logical database structure. And I doubt that it will be of
too much use, as the tests do not have a date/time stamp, and the various
tests for any device with a unique ID may have been done at any time,
possibly even years apart.

Now it's going to be a bit of a challenge to read the data properly. I will
probably use a variant for the data and then interpret it as Double or Int64
depending on the test type and phase. But variants are 16 bytes, and they
cannot be used for Int64, although the data reads just as well if it is type
Integer. Maybe I will need to use a variant part in the record. It's ugly
but it should work...

Paul

Hans-Peter Diettrich

Mar 21, 2010, 7:18:21 AM
Paul E. Schoen wrote:

> I had assumed the real data was type double, as it
> occupied 8 bytes, but that produced impossible numbers for all but zero
> data. When I read the data as Integer or Int64, I got a number that was
> 10,000 times the correct value.

Then it's of type Currency.

> But the data for A and B phases of Med
> and High test types is type double.
>
> I don't have access to the source code for this program

Too bad :-(

DoDi

Maarten Wiltink

Mar 21, 2010, 7:52:08 AM
"Paul E. Schoen" <pa...@pstech-inc.com> wrote in message
news:sJmpn.76571$jt1....@newsfe01.iad...
[...]

> I had assumed the real data was type double, as it occupied 8 bytes,
> but that produced impossible numbers for all but zero data. When I
> read the data as Integer or Int64, I got a number that was 10,000 times
> the correct value.

That reeks of Currency. It's a scaled 64-bit signed integer. The scaling
is maintained by compiler magic, so for example after multiplying two of
these, or before assigning it to a Real, the value is divided by 10,000.
Although it is a fixed-point format, it is processed by the FPU which has
a special bit in the format for this.

This type is a leftover from the days when FPUs were at least commonly
emulated, if not always physically present, while 64-bit computation was
still beyond the CPU itself.

It's also, as far as I know, a Turbo-Pascal specific type.
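
As a sketch of what that means for reading the file (TRawValue is an
invented name), the same eight bytes can be overlaid as either type:

type TRawValue = packed record
  case boolean of
    True:  (AsCurrency: Currency); // scaled Int64: raw value / 10,000
    False: (AsDouble: Double);     // IEEE double, for the fields stored that way
end;

// For the fixed-point fields, AsCurrency already carries the decimal
// point, so no manual division by 10,000 is needed.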


> [...] I can't imagine how this database structure could have been
> created.

Don't think of it as a database structure. It's just a file format.
It may have been built by repeated Write calls.


> Now it's going to be a bit of a challenge to read the data properly.
> I will probably use a variant for the data and then interpret it as
> Double or Int64 depending on the test type and phase. But variants are
> 16 bytes, and they cannot be used for Int64, although the data reads
> just as well if it is type Integer. Maybe I will need to use a variant
> part in the record. It's ugly but it should work...

Variants will not help you; they solve a different problem in a different
domain. A variant part in the record comes closer but in the end you
have to get the bits in the input into correctly typed variables. Only
if the bits may end up in different types will a variant record help you,
and if you get the type right, you don't need Variants.

Greetings,
Maarten Wiltink

Paul E. Schoen

Mar 21, 2010, 4:48:26 PM

"Maarten Wiltink" <maa...@kittensandcats.net> wrote in message
news:4ba6086a$0$22918$e4fe...@news.xs4all.nl...

>
> Variants will not help you; they solve a different problem in a different
> domain. A variant part in the record comes closer but in the end you
> have to get the bits in the input into correctly typed variables. Only
> if the bits may end up in different types will a variant record help you,
> and if you get the type right, you don't need Variants.

I got it to work with the following:

type
  Ttesttype = (Low, Med, High);
  Tphase = (A, B, C, G);
  Topnum = (op1, op2, op3, op4);
  Tcttype = (I, R);
  Tcurrtime = record
    case Tcttype of
      I: (AsInt: Int64);
      R: (AsReal: Double);
  end;

type Ttc = packed record
  curr: Tcurrtime; // 8
  time: Tcurrtime; // 16
  decay: array[1..5] of char; // 21
  yesno: char; // 22
end; // Length = 22

type Ttccdata = packed record
  recltype: array[1..7] of char; // 7
  id: array[1..20] of char; // 27
  amps: array[1..4] of char; // 31
  sequence: array[1..7] of char; // 38
  gndct: array[1..4] of char; // 42
  gndseq: array[1..7] of char; // 49
  conn: array[1..1] of char; // 50
  mpu: array[1..4] of Int64; // 82
  flags: array[1..4] of char; // 86
  tc: array[Low..High] of array[A..G] of array[op1..op4] of Ttc;
  // 86 + 3*4*4*22 = 1142
end;

while ( fInput.Position < fInput.Size ) do begin
  inc(RecNum);
  if RecNum > 100 then break;
  fInput.ReadBuffer( TCCdata, SizeOf(Ttccdata) );
  Memo1.Lines.Add( Format( char($0d)+'Record %d'+char($0d), [RecNum] ) );
  with TCCdata do begin
    Memo1.Lines.Add( 'Type: '+String(recltype) );
    Memo1.Lines.Add( 'ID: '+String(id) );
    Memo1.Lines.Add( 'Amps: '+String(amps) );
    Memo1.Lines.Add( 'Sequence: '+String(sequence) );
    Memo1.Lines.Add( 'GND CT: '+String(gndct) );
    Memo1.Lines.Add( 'GND Sequence: '+String(gndseq) );
    Memo1.Lines.Add( 'Connection: '+String(conn) );
    Memo1.Lines.Add( Format( 'Minimum Pickup: %6.2f, %6.2f, %6.2f, %6.2f',
      [ mpu[1]/10000, mpu[2]/10000, mpu[3]/10000, mpu[4]/10000 ] ) );
    Memo1.Lines.Add( Format( 'Flags: %s, %s, %s, %s',
      [ flags[1], flags[2], flags[3], flags[4] ] ) );
    for TestType := Low to High do begin
      if TestType = Low then
        Memo1.Lines.Add( 'Low Current Test:' )
      else if TestType = Med then
        Memo1.Lines.Add( 'Medium Current Test:' )
      else
        Memo1.Lines.Add( 'High Current Test:' );
      for Phase := A to G do begin
        for OpNum := op1 to op4 do begin
          if ( (TestType = Med) or (TestType = High) ) and
             ( (Phase = A) or (Phase = B) ) then
            Memo1.Lines.Add( Format(
              '%d: Time: %6.3f Curr: %8.2f Decay: %s Drop? %s',
              [ RecNum, tc[TestType][Phase][OpNum].time.AsReal,
                tc[TestType][Phase][OpNum].curr.AsReal,
                String(tc[TestType][Phase][OpNum].decay),
                tc[TestType][Phase][OpNum].yesno ] ) )
          else
            Memo1.Lines.Add( Format(
              '%d: Time: %6.3f Curr: %8.2f Decay: %s Drop? %s',
              [ RecNum, tc[TestType][Phase][OpNum].time.AsInt/10000,
                tc[TestType][Phase][OpNum].curr.AsInt/10000,
                String(tc[TestType][Phase][OpNum].decay),
                tc[TestType][Phase][OpNum].yesno ] ) );
        end; //next OpNum
      end; //next Phase
    end; //next TestType
  end; //with TCCdata
  Memo1.Lines.Add(
    '======================================================' );
end;

Now all I need to do is convert the data into fields of the new Results
database. It still seems incredible that even an inexperienced programmer
using Visual Basic could come up with such a strange file structure. It
seems to me that it would require the data for the Med and High tests for A
and B to be written to the file in a different way and also have it read
back in the same way. I know that BASIC does not require variables to be
declared, and numbers are represented according to the way they are being
used, so they are essentially variants. I suppose one could use 48 WRITE
statements and have READ statements that follow the same format. And maybe
some variables were set up as four place precision fixed decimal while
others were not, and hence treated as Real.

I don't think my solution is too ugly, so I'll go with what seems to work.
Thanks for the ideas.

Paul

Hans-Peter Diettrich

Mar 21, 2010, 11:42:21 PM
Paul E. Schoen wrote:

> Now all I need to do is convert the data into fields of the new Results
> database. It still seems incredible that even an inexperienced
> programmer using Visual Basic could come up with such a strange file
> structure.

AFAIR that's no problem. VB includes record I/O, so it's enough to have
several different record declarations around. For some reason part of
the code uses one declaration and other parts use the other.

DoDi

alang...@aol.com

Mar 22, 2010, 3:36:06 AM
On 21 Mar, 20:48, "Paul E. Schoen" <p...@pstech-inc.com> wrote:
<snip>
> type
>   Ttesttype = (Low, Med, High);
<snip>

I would beware of using Low & High as enumerated values; they are the
names of standard Delphi functions. Try Lo, Me, Hi instead.

Also note that enumerated values can be used as indices to arrays, so
you can use . . .

const
  TestTypeStr: array[TTestType] of string =
    ('Low Current Test', 'Medium Current Test', 'High Current Test');

. . . and then extract the string with . . .

Memo1.Lines.Add(TestTypeStr[TestType])

. . . instead of . . .

if TestType = Low then
Memo1.Lines.Add( 'Low Current Test:' )
else if TestType = Med then
Memo1.Lines.Add( 'Medium Current Test:' )
else
Memo1.Lines.Add( 'High Current Test:' );

If you are not constrained by the data, I'd also use Delphi-standard
zero-based arrays instead of one-based.

If you have such a complex structure it's also worth laying it out
carefully to aid your understanding in six months' time <g>.

Alan Lloyd

Maarten Wiltink

Mar 22, 2010, 4:31:18 AM
"Paul E. Schoen" <pa...@pstech-inc.com> wrote in message
news:xCvpn.36036$NH1...@newsfe14.iad...
[...]

> I got it to work with the following:

Good. That's the first priority.


[...]


> mpu: array[1..4] of Int64; // 82

[...]
> Memo1.Lines.Add( Format


> ( 'Minimum Pickup: %6.2f, %6.2f, %6.2f, %6.2f',
> [ mpu[1]/10000, mpu[2]/10000, mpu[3]/10000, mpu[4]/10000 ]) );

Did you try it with Currency?

Greetings,
Maarten Wiltink


Hans-Peter Diettrich

Mar 22, 2010, 7:01:47 AM
alang...@aol.com wrote:

> I would beware of using Low & High as enumerated values; they are the
> names of standard Delphi functions. Try Lo, Me, Hi instead.

I'd discourage the use of Lo and Hi, too, for the same reason.
Enumerated types typically should have a common prefix on all member
names. Not bullet proof, but avoids confusion with standard names.
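
For example, just a sketch of that naming convention:

type
  TTestType = (ttLow, ttMed, ttHigh); // prefixed: no clash with Low()/High()/Lo()/Hi()
  TPhase    = (phA, phB, phC, phG);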

DoDi

Paul E. Schoen

Mar 22, 2010, 4:33:52 PM

"Maarten Wiltink" <maa...@kittensandcats.net> wrote in message
news:4ba72ad8$0$22903$e4fe...@news.xs4all.nl...

No. It probably would work for those values that are Int64 with four decimal
places, and it would save having to divide by 10000, but I really doubt that
it would interpret a type Double correctly. Apparently in VB there was a
four DP Currency type but now in VB.NET it has been replaced with a Decimal
type that can have more precision. Here's what I found:
http://visualbasic.about.com/od/usingvbnet/a/decdatatype.htm

This may wind up being much ado about nothing. The data records do not have
a date/time stamp, so there is no way to know when the tests were done. I
think the customers usually just printed out the test reports as they went
along and didn't worry about accessing old data. They are very much "old
school", and probably are still hard-wired for paper reports.

Paul

Paul E. Schoen

Mar 22, 2010, 4:49:04 PM

"Paul E. Schoen" <pa...@pstech-inc.com> wrote in message
news:SuQpn.105592$Ye4....@newsfe11.iad...

>
> "Maarten Wiltink" <maa...@kittensandcats.net> wrote in message
> news:4ba72ad8$0$22903$e4fe...@news.xs4all.nl...
>>
>> Did you try it with Currency?
>
> No. It probably would work for those values that are Int64 with four
> decimal places, and it would save having to divide by 10000, but I really
> doubt that it would interpret a type Double correctly. Apparently in VB
> there was a four DP Currency type but now in VB.NET it has been replaced
> with a Decimal type that can have more precision. Here's what I found:
> http://visualbasic.about.com/od/usingvbnet/a/decdatatype.htm

I found some interesting facts about early Visual BASIC for MSDOS in the
Wiki:
http://en.wikipedia.org/wiki/Visual_Basic

And I see that a variable can be defined as type Currency with the @
suffix, and as Double with the # suffix. They are adjacent on the keyboard,
so maybe it was a slip of the fingers.

Paul
