type Buffer is array(Natural range <>) of Unsigned_8;
procedure Read( b: out Buffer ) is
begin
Buffer'Read(Stream(f_in), b);
exception
when Ada.Streams.Stream_IO.End_Error =>
null;
-- Nothing bad, just some garbage in the buffer
-- after end of compressed code
end Read;
procedure Write( b: in Buffer ) is
begin
Buffer'Write(Stream(f_out), b);
end Write;
Bad luck: it is as slow as doing I/O with single bytes and
Sequential_IO! But if it is slow even when receiving/sending a whole
buffer, how can it be made faster? Someone (in a slightly different
context) came up with this (call it variant 2):
procedure Read( b: out Buffer ) is
use Ada.Streams;
First : constant Stream_Element_Offset:= Stream_Element_Offset
(b'First);
Last : Stream_Element_Offset:= Stream_Element_Offset
(b'Last);
SE_Buffer : Stream_Element_Array (First..Last);
begin
Read(Stream(f_in).all, SE_Buffer, Last);
for i in First..Last loop
b(Natural(i)):= Unsigned_8(SE_Buffer(i));
end loop;
end Read;
procedure Write( b: in Buffer ) is
use Ada.Streams;
First : constant Stream_Element_Offset:= Stream_Element_Offset
(b'First);
Last : constant Stream_Element_Offset:= Stream_Element_Offset
(b'Last);
SE_Buffer : Stream_Element_Array (First..Last);
begin
for i in SE_Buffer'Range loop
SE_Buffer(i):= Stream_Element(b(Natural(i)));
end loop;
Write(Stream(f_out).all, SE_Buffer);
end Write;
Naively, you would say it is even slower: you do even more by copying
a buffer into another one, right?
In fact, not at all: it is *lots* faster (on GNAT and ObjectAda)!
To give an idea, variant 1 applied to a bzip2 decompressor makes
it 4x slower than the C version, whereas variant 2 makes it only 7%
slower! With pure I/O (like copying a file) you would get an even
larger difference.
Now, it raises some questions:
Is there maybe a reason in the RM why the 'Read and 'Write have to be
that slow ?
Or are these two compilers lazy when compiling these attributes ?
Should I bug Adacore about that, then ?
Do some other compilers do it better ?
_________________________________________________________
Gautier's Ada programming -- http://sf.net/users/gdemont/
NB: For a direct answer, e-mail address on the Web site!
And if you overlay the Stream_Element_Array onto the Buffer, thus eliminating
the copying and the stream attribute operations?
--
Jeff Carter
"C's solution to this [variable-sized array parameters] has real
problems, and people who are complaining about safety definitely
have a point."
Dennis Ritchie
> procedure Write( b: in Buffer ) is
> use Ada.Streams;
> First : constant Stream_Element_Offset:= Stream_Element_Offset
> (b'First);
> Last : constant Stream_Element_Offset:= Stream_Element_Offset
> (b'Last);
> SE_Buffer : Stream_Element_Array (First..Last);
> begin
> for i in SE_Buffer'Range loop
> SE_Buffer(i):= Stream_Element(b(Natural(i)));
> end loop;
> Write(Stream(f_out).all, SE_Buffer);
> end Write;
>
> Naively, you would say it is even slower: you do even more by copying
> a buffer into another one, right?
> In fact, not at all: it is *lots* faster (on GNAT and ObjectAda)!
I finally thought that the above procedures are faster than 'Read
or 'Write because the latter are defined in terms of stream elements:
When there is a composite object like b : Buffer and you
'Write it, then for each component of b the corresponding 'Write
is called. This then writes stream elements, probably
calling Stream_IO.Write or some such in the end.
So Write from above appears closer to writing bulk loads
of stream elements than a bulk load of 'Writes can be.
Copying buffers does not matter in comparison to the needs
of I/O (on PCs).
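To make the element-by-element behaviour described above concrete, here is a minimal sketch of what Buffer'Write amounts to when the compiler takes no shortcut. Write_Like_Attribute and the file name are made-up names for illustration; this is not what any particular compiler actually generates.

```ada
with Ada.Streams;           use Ada.Streams;
with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;
with Interfaces;            use Interfaces;

procedure Per_Element_Sketch is

   type Buffer is array (Natural range <>) of Unsigned_8;

   --  Hypothetical stand-in for the default Buffer'Write:
   --  one Unsigned_8'Write call per component of the array.
   procedure Write_Like_Attribute
     (S : not null access Root_Stream_Type'Class; b : in Buffer) is
   begin
      for i in b'Range loop
         Unsigned_8'Write (S, b (i));  --  each call ends up in a stream Write
      end loop;
   end Write_Like_Attribute;

   f_out : File_Type;
   data  : constant Buffer (1 .. 4) := (1, 2, 3, 4);

begin
   Create (f_out, Out_File, "per_element_sketch.tmp");
   Write_Like_Attribute (Stream (f_out), data);
   Close (f_out);
end Per_Element_Sketch;
```

With a 1024-byte buffer this means 1024 calls into the stream per buffer, against a single call in the Stream_Element_Array version.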
> And if you overlay the Stream_Element_Array onto the Buffer, thus eliminating
> the copying and the stream attribute operations?
With an Unchecked_conversion ?... Sure, but you cannot guarantee that
the Buffer will be packed the same way as Stream_Element_Array on a
compiler X that you don't know. On that type of project (open-source
compression, no compiler/system dependency) I want to avoid any "dirty
trick" like that.
Anyway, the copying takes almost no time. Note that I do not copy
_and_ use the stream attribute at the same time:
Variant 1 uses the stream attribute on Buffer.
Variant 2 copies into a Stream_Element_Array and uses the Write
procedure in Ada.Streams.
Gautier
> I finally thought that the above procedures are faster than 'Read
> or 'Write because the latter are defined in terms of stream elements:
> When there is a composite object like b : Buffer and you
> 'Write it, then for each component of b the corresponding 'Write
> is called. This then writes stream elements, probably
> calling Stream_IO.Write or some such in the end.
> So Write from above appears closer to writing bulk loads
> of stream elements than a bulk load of 'Writes can be.
Sure, it is the safe way: write records field by field, arrays element
by element (and so on, recursively). The compiler thereby avoids
problems with non-packed data. Nothing against that. The general case
is well done, fine. But the compiler could have a look at the type to
the left of the attribute and in such a case (an array of Unsigned_8,
or a String) say: "Gee! That type Buffer is coincidentally the same as
Stream_Element_Array, so I take the shortcut and generate the code to
write the whole buffer and, this time, not the code to write it
element by element".
> Copying buffers does not matter in comparison to the needs of I/O (on PCs).
Right. Variant 2 works fine, but it is a heavy workaround in terms of
source code. Especially for mixed-type I/O with plenty of String'Write
and others, you would not want to put Variant 2 style code all
over the place. It would be a lot better if compilers were able to
selectively take the shortcut form for the attributes.
Gautier
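One way to keep the Variant 2 code in a single place, not discussed in the thread, might be to make it the implementation of the attribute itself with an attribute definition clause. The following is only a sketch under assumptions: Bulk_Write and the file name are made-up names, and it only helps for types you declare yourself (it does nothing for String'Write).

```ada
with Ada.Streams;           use Ada.Streams;
with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;
with Interfaces;            use Interfaces;

procedure Attribute_Override_Sketch is

   type Buffer is array (Natural range <>) of Unsigned_8;

   --  Hypothetical helper: same copy-then-Write idea as Variant 2.
   procedure Bulk_Write
     (S : not null access Root_Stream_Type'Class; b : in Buffer);

   --  From here on, Buffer'Write calls Bulk_Write.
   for Buffer'Write use Bulk_Write;

   procedure Bulk_Write
     (S : not null access Root_Stream_Type'Class; b : in Buffer)
   is
      SE_Buffer : Stream_Element_Array
        (Stream_Element_Offset (b'First) .. Stream_Element_Offset (b'Last));
   begin
      for i in SE_Buffer'Range loop
         SE_Buffer (i) := Stream_Element (b (Natural (i)));
      end loop;
      Write (S.all, SE_Buffer);  --  one bulk call to the stream
   end Bulk_Write;

   f_out : File_Type;
   data  : constant Buffer (1 .. 4) := (16#41#, 16#64#, 16#61#, 16#0A#);

begin
   Create (f_out, Out_File, "attribute_override_sketch.tmp");
   Buffer'Write (Stream (f_out), data);  --  goes through Bulk_Write
   Close (f_out);
end Attribute_Override_Sketch;
```

The call sites keep the plain Buffer'Write form, so the workaround stays in one procedure per type. Buffer'Read could be treated the same way with a matching Bulk_Read.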
-- Usage: test_stream_performance <file>
-- Produces two .tmp files that are copies of <file>.
--
-- Example of output with file m.wmv, 2.59 MB, GNAT GPL 2008 / Win32:
--
-- xxx'Write / xxx'Read (Attribute).. 1.925210886 seconds
-- Workaround with SE buffer......... 0.049318559 seconds
-- Factor 39.036235547
-- Buffer size in bits..... 8192
-- SE Buffer size in bits.. 8192
with Ada.Calendar; use Ada.Calendar;
with Ada.Text_IO;
with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;
with Ada.Command_Line; use Ada.Command_Line;
with Interfaces; use Interfaces;
procedure Test_Stream_Performance is
f_in, f_out: Ada.Streams.Stream_IO.File_Type;
buffer_size, SE_buffer_size: Natural:= 0;
-- To check if buffers could be overlapped (packing)
type Buffer is array(Natural range <>) of Unsigned_8;
------------------------------------------------
-- 1) Stream attributes - xxx'Read, xxx'Write --
------------------------------------------------
-- Usually we would just have: Buffer'Read(Stream(f_in), b);
-- Here we care about end of file.
procedure Read_Attribute( b: out Buffer; last_read: out Natural ) is
idx: constant Positive_Count:= Index(f_in);
siz: constant Positive_Count:= Size(f_in);
begin
if End_Of_File(f_in) then
last_read:= b'First-1;
else
last_read:= Natural'Min(b'First+Natural(siz-idx),b'Last);
Buffer'Read(Stream(f_in), b(b'First .. last_read));
end if;
end Read_Attribute;
procedure Write_Attribute( b: in Buffer ) is
begin
if buffer_size = 0 then
buffer_size:= b'size;
end if;
Buffer'Write(Stream(f_out), b);
end Write_Attribute;
--------------------------------------------
-- 2) The Stream_Element_Array workaround --
--------------------------------------------
procedure Read_SE( b: out Buffer; last_read: out Natural ) is
use Ada.Streams;
First : constant Stream_Element_Offset:= Stream_Element_Offset
(b'First);
Last : Stream_Element_Offset:= Stream_Element_Offset
(b'Last);
SE_Buffer : Stream_Element_Array (First..Last);
begin
Read(Stream(f_in).all, SE_Buffer, Last);
for i in First..Last loop
b(Natural(i)):= Unsigned_8(SE_Buffer(i));
end loop;
last_read:= Natural(last);
end Read_SE;
procedure Write_SE( b: in Buffer ) is
use Ada.Streams;
First : constant Stream_Element_Offset:= Stream_Element_Offset
(b'First);
Last : constant Stream_Element_Offset:= Stream_Element_Offset
(b'Last);
SE_Buffer : Stream_Element_Array (First..Last);
begin
if SE_buffer_size = 0 then
SE_buffer_size:= SE_Buffer'size;
end if;
for i in SE_Buffer'Range loop
SE_Buffer(i):= Stream_Element(b(Natural(i)));
end loop;
Write(Stream(f_out).all, SE_Buffer);
end Write_SE;
name : constant String:= Argument(1);
generic
label: String;
with procedure Read( b: out Buffer; last_read: out Natural );
with procedure Write( b: in Buffer );
procedure Test;
procedure Test is
b: Buffer(1..1024);
l: Natural;
begin
Open(f_in, In_File, name);
Create(f_out, Out_File, name & "_$$$_" & label & ".tmp");
while not End_of_File(f_in) loop
Read(b,l);
Write(b(1..l));
end loop;
Close(f_out);
Close(f_in);
end;
procedure Test_Attribute is new Test("Attribute", Read_Attribute,
Write_Attribute);
procedure Test_SE is new Test("SE", Read_SE, Write_SE);
T0, T1, T2: Time;
use Ada.Text_IO;
begin
T0:= Clock;
Test_Attribute;
T1:= Clock;
Test_SE;
T2:= Clock;
Put_Line("xxx'Write / xxx'Read (Attribute).." & Duration'Image(T1-
T0) & " seconds");
Put_Line("Workaround with SE buffer........." & Duration'Image(T2-
T1) & " seconds");
Put_Line("Factor" & Duration'Image((T1-T0)/(T2-T1)));
New_Line;
Put_Line("Buffer size in bits....." & Integer'Image(buffer_size));
Put_Line("SE Buffer size in bits.." & Integer'Image
(SE_buffer_size));
end;
No, that still does a copy.
Type Buffer, as a simple array of bytes, should have Buffer'Component_Size =
Unsigned_8'Size by default; but you can specify it if you're paranoid. If you're
really paranoid, you can add a test that Unsigned_8'Size = Stream_Element'Size,
which you seem to be assuming. Then
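procedure Put (B : in Buffer) is -- Terrible naming scheme.
use Ada.Streams;
subtype Buffer_Stream is Stream_Element_Array (1 .. B'Length);
S : Buffer_Stream;
for S'Address use B'Address; -- S overlays B, no copy is made
pragma Import (Ada, S); -- suppress any initialization of S
begin -- Put
Write (Stream (f_out).all, S); -- f_out: the output file of the test program
end Put;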
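The two "paranoid" checks mentioned above could be written down as follows. This is a minimal sketch: Overlay_Check_Sketch and overlay_ok are made-up names, and the result depends on the compiler and target.

```ada
with Ada.Streams; use Ada.Streams;
with Ada.Text_IO;
with Interfaces;  use Interfaces;

procedure Overlay_Check_Sketch is

   type Buffer is array (Natural range <>) of Unsigned_8;
   --  Specify the component size instead of relying on the default.
   for Buffer'Component_Size use Unsigned_8'Size;

   --  True only if both array types store their components with the
   --  same size and the element types themselves have the same size.
   overlay_ok : constant Boolean :=
     Buffer'Component_Size = Stream_Element_Array'Component_Size
     and Unsigned_8'Size = Stream_Element'Size;

begin
   Ada.Text_IO.Put_Line
     ("Overlay possible: " & Boolean'Image (overlay_ok));
end Overlay_Check_Sketch;
```

If overlay_ok is False, a program can fall back to the copying version.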
--
Jeff Carter
"I spun around, and there I was, face to face with a
six-year-old kid. Well, I just threw my guns down and
walked away. Little bastard shot me in the ass."
Blazing Saddles
> >> And if you overlay the Stream_Element_Array onto the Buffer, thus eliminating
> >> the copying and the stream attribute operations?
> for S'Address use B'Address;
Bingo!
xxx'Write / xxx'Read (Stream attributes)................ 14.717895000 seconds
Workaround with Stream_Element_Array buffer and copy.... 0.435756000 seconds
Workaround with Stream_Element_Array buffer and overlay. 0.211830000 seconds
Factor (Copy) 33.775541816
Factor (Overlay) 69.479747911
This was on a 32-bit Linux netbook, with a 32 MB file; GNAT GPL 2009,
-gnatp -O2.
So my "almost no time" assumption about the buffer copy was definitely
to be understood as "compared to the attribute version"...
To be a bit more paranoid still: checking the size of the arrays!
workaround_possible: Boolean;
procedure Check_workaround is
test_a: constant Byte_Buffer(1..10):= (others => 0);
test_b: constant Ada.Streams.Stream_Element_Array(1..10):= (others
=> 0);
begin
workaround_possible:= test_a'Size = test_b'Size;
end Check_workaround;
It's the code I've put into Zip-Ada - big success!
:-)
It doesn't have to, there is a permission to avoid copying in 13.9(12). So
it depends on what the compiler is able to do optimization-wise.
Randy.
IMHO, Ada compilers should do that. (There's specifically a permission to do
this optimization in Ada 2005: 13.13.2(56/2).) That's an integral part of
the stream attribute implementation on Janus/Ada. (Disclaimer: the entire
stream attribute implementation on Janus/Ada doesn't work right, quite
probably because it is too complicated. So perhaps there is a reason that
other Ada compilers don't do that. :-) Note, however, that it is pretty rare
that you could actually do that (only about 15% of the composite types I've
seen in Janus/Ada would qualify). So I'm not surprised that implementers
have left that capability out in favor of things that happen more often.
Randy.
> IMHO, Ada compilers should do that. (There's specifically a permission to do
> this optimization in Ada 2005: 13.13.2(56/2).)
Excellent news!
> That's an integral part of
> the stream attribute implementation on Janus/Ada. (Disclaimer: the entire
> stream attribute implementation on Janus/Ada doesn't work right, quite
> probably because it is too complicated. So perhaps there is a reason that
> other Ada compilers don't do that. :-) Note, however, that it is pretty rare
> that you could actually do that (only about 15% of the composite types I've
> seen in Janus/Ada would qualify).
Sure - but imagine that these 15% might transport 95% of the
information. It could happen, couldn't it ?
And if type T qualifies, a record type R with fields of types T,U,V (U
and V not qualifying) will be also transmitted faster, an array of R
will also go faster, and so on...
> So I'm not surprised that implementers
> have left that capability out in favor of things that happen more often.
I am not surprised either...
Gautier
-- Usage: test_stream_performance <big_file>
-- Produces .tmp files that are copies of <big_file>.
--
-- Example of output with GNAT GPL 2008 / Win32:
--
-- xxx'Write / xxx'Read (Stream attributes)......... 9.282530042 seconds
-- Workarounds with Stream_Element_Array buffer:
-- copy........................................... 0.444120412 seconds
-- overlay (read), unchecked_conversion (write)... 0.156874407 seconds
-- overlay........................................ 0.150155676 seconds
-- Factor (Copy) 20.900930898
-- Factor (Overlay) 61.819374993
-- Buffer size in bits..... 8192
-- SE Buffer size in bits.. 8192
-- File size in megabytes..... 2.46367E+01
with Ada.Calendar; use Ada.Calendar;
with Ada.Text_IO;
with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;
with Ada.Command_Line; use Ada.Command_Line;
with Ada.Unchecked_Conversion;
with Interfaces; use Interfaces;
procedure Test_Stream_Performance is
f_in, f_out: Ada.Streams.Stream_IO.File_Type;
buffer_size, SE_buffer_size: Natural:= 0;
-- To check if buffers are binary compatible (same packing)
type Buffer is array(Natural range <>) of Unsigned_8;
------------------------------------------------
-- 1) Stream attributes - xxx'Read, xxx'Write --
------------------------------------------------
-- NB: usually we would just have: Buffer'Read(Stream(f_in), b);
-- Here we care about end of file.
--
procedure Read_Attribute( b: out Buffer; last_read: out Natural ) is
idx: constant Positive_Count:= Index(f_in);
siz: constant Positive_Count:= Size(f_in);
begin
if End_Of_File(f_in) then
last_read:= b'First-1;
else
last_read:= Natural'Min(b'First+Natural(siz-idx),b'Last);
Buffer'Read(Stream(f_in), b(b'First .. last_read));
end if;
end Read_Attribute;
procedure Write_Attribute( b: in Buffer ) is
begin
if buffer_size = 0 then
buffer_size:= b'size; -- just for stats
end if;
Buffer'Write(Stream(f_out), b);
end Write_Attribute;
---------------------------------------------
-- 2) The Stream_Element_Array workarounds --
---------------------------------------------
procedure Read_SE_Copy( b: out Buffer; last_read: out Natural ) is
use Ada.Streams;
First : constant Stream_Element_Offset:= Stream_Element_Offset
(b'First);
Last : Stream_Element_Offset:= Stream_Element_Offset
(b'Last);
SE_Buffer : Stream_Element_Array (First..Last);
begin
Read(Stream(f_in).all, SE_Buffer, Last);
for i in First..Last loop
b(Natural(i)):= Unsigned_8(SE_Buffer(i));
end loop;
last_read:= Natural(last);
end Read_SE_Copy;
procedure Write_SE_Copy( b: in Buffer ) is
use Ada.Streams;
First : constant Stream_Element_Offset:= Stream_Element_Offset
(b'First);
Last : constant Stream_Element_Offset:= Stream_Element_Offset
(b'Last);
SE_Buffer : Stream_Element_Array (First..Last);
begin
if SE_buffer_size = 0 then
SE_buffer_size:= SE_Buffer'size; -- just for stats
end if;
for i in SE_Buffer'Range loop
SE_Buffer(i):= Stream_Element(b(Natural(i)));
end loop;
Write(Stream(f_out).all, SE_Buffer);
end Write_SE_Copy;
-- Overlay idea by Jeff Carter
procedure Read_SE_Overlay( b: out Buffer; last_read: out Natural )
is
use Ada.Streams;
Last: Stream_Element_Offset;
SE_Buffer : Stream_Element_Array (1..b'Length);
for SE_Buffer'Address use b'Address;
begin
Read(Stream(f_in).all, SE_Buffer, Last);
last_read:= b'First + Natural(Last) - 1;
end Read_SE_Overlay;
procedure Write_SE_Overlay( b: in Buffer ) is
use Ada.Streams;
SE_Buffer : Stream_Element_Array (1..b'Length);
for SE_Buffer'Address use b'Address;
begin
Write(Stream(f_out).all, SE_Buffer);
end Write_SE_Overlay;
-- Using Unchecked_Conversion
procedure Write_SE_UC( b: in Buffer ) is
subtype My_SEA is Ada.Streams.Stream_Element_Array(1..b'Length);
function To_SEA is new Ada.Unchecked_Conversion(Buffer, My_SEA);
use Ada.Streams;
begin
Write(Stream(f_out).all, To_SEA(b));
end Write_SE_UC;
----------
-- Test --
----------
function name return String is
begin
return Argument(1);
end;
generic
label: String;
with procedure Read( b: out Buffer; last_read: out Natural );
with procedure Write( b: in Buffer );
procedure Test;
procedure Test is
b: Buffer(1..1024);
l: Natural;
begin
Open(f_in, In_File, name);
Create(f_out, Out_File, name & "_$$$_" & label & ".tmp");
while not End_of_File(f_in) loop
Read(b,l);
Write(b(1..l));
end loop;
Close(f_out);
Close(f_in);
end;
procedure Test_Attribute is new Test("Attribute", Read_Attribute,
Write_Attribute);
procedure Test_SE_Copy is new Test("SE_Copy", Read_SE_Copy,
Write_SE_Copy);
procedure Test_SE_Overlay is new Test("SE_Overlay", Read_SE_Overlay,
Write_SE_Overlay);
procedure Test_SE_UC is new Test("SE_UC", Read_SE_Overlay,
Write_SE_UC);
T0, T1, T2, T3, T4: Time;
use Ada.Text_IO;
begin
if Argument_Count=0 then
Put_Line(" Usage: test_stream_performance <big_file>");
Put_Line(" Produces .tmp files that are copies of <big_file>.");
return;
end if;
T0:= Clock;
Test_Attribute;
T1:= Clock;
Test_SE_Copy;
T2:= Clock;
Test_SE_Overlay;
T3:= Clock;
Test_SE_UC;
T4:= Clock;
Put_Line("xxx'Write / xxx'Read (Stream attributes)........." &
Duration'Image(T1-T0) & " seconds");
Put_Line("Workarounds with Stream_Element_Array buffer:");
Put_Line(" copy..........................................." &
Duration'Image(T2-T1) & " seconds");
Put_Line(" overlay (read), unchecked_conversion (write)..." &
Duration'Image(T4-T3) & " seconds");
Put_Line(" overlay........................................" &
Duration'Image(T3-T2) & " seconds");
Put_Line("Factor (Copy) " & Duration'Image((T1-T0)/(T2-T1)));
Put_Line("Factor (Overlay)" & Duration'Image((T1-T0)/(T3-T2)));
New_Line;
Put_Line("Buffer size in bits....." & Integer'Image(buffer_size));
Put_Line("SE Buffer size in bits.." & Integer'Image
(SE_buffer_size));
New_Line;
Open(f_in, In_File, name);
Put_Line("File size in megabytes....." & Float'Image(Float(Size
(f_in))/(1024.0*1024.0)));
Close(f_in);
end;
if Is_Array_Type(Typ) and then
Is_Bit_Packed_Array (Typ) and then
Component_Size (Typ) = 8
then
Comp_Typ:= Component_Type (Typ);
if Is_Modular_Integer_Type(Comp_Typ) and then
[modulus ??] (Comp_Typ) = 2**8
then
[ go ahead with some unchecked conversion to String ]
return [the right thing];
end if;
end if;
Now, I have no experience with building GNAT from the GCC tree.
Would someone like to help?
Or is it easy to do on Windows (currently, no Linux available on my
side)?
TIA
http://goodbye-microsoft.com/ :)
*ducks and hides*
--
Ludovic Brenta.
Building GNAT on Windows should work as long as you have a binary version
of GNAT already installed.
Search for cygwin or mingw. Also, look at avr-ada and gnuada projects
at sourceforge.net for examples.
Some links:
http://sourceforge.net/apps/mediawiki/avr-ada/index.php?title=BuildScript
http://gnuada.sourceforge.net/
http://en.wikibooks.org/wiki/Ada_Programming/Installing
http://ada.krischik.com/index.php/Articles/CompileGNATGPL?from=Articles.CompileGNAT
--
Tero Koskinen - http://iki.fi/tero.koskinen/
Someone might be quicker than me in doing that...
Just in case, another good place (or even a better one) for inserting
the shortcut would be:
Build_Array_Read_Write_Procedure in exp_strm.adb .
Multi-dimensional arrays could be considered there as well, if they
qualify of course.