Hello,
(It is now 29 September 2015.)
As promised, here is Skybuck's Parallel Universal Code demonstration program.
This posting contains a Delphi and a C/C++ version for you to learn from.
(I was kinda thinking of adding some kind of case statement/array and
out-of-order execution/randomization to prove that the lines can be executed
in any order, though I didn't get around to that (yet), been playing World of
Warships =D at 1 fps to 30 fps lol)
(If anybody doubts that these lines can be executed out of order I may
eventually add such a feature as well... maybe I will even do it for the fun
of it ! ;))
(I was also thinking of maybe throwing it all into a nice
TProcessor class with an actual thread to show it off as well... didn't get
around to that yet either ;) :))
(However, if you carefully examine the indexes used you will discover that
there is no overlap: every index is read exactly once, and there is no
write-read-write-read sequential stuff going on... it can all be read in
parallel... except for building the RowOffset array, which is a running sum.)
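The "any order" claim can be tried out with a small sketch. This is a minimal C sketch and not part of the two programs in this posting: decode_in_order, constrain01 and the enum constants are names made up here. It assumes the sorted stream layout from the design document further down (row count, then row lengths, then the bits row by row, largest field first, one value per array entry, plus 3 zero padding entries at the end). Each of the 24 (processor, row) terms reads one stream entry and ORs it into its own processor's result; the caller decides the order of the terms.

```c
#include <assert.h>

enum { ROWS = 6, PROCS = 4, TERMS = ROWS * PROCS,
       STREAM_LEN = 21 + (PROCS - 1) };

/* Hypothetical helper: 0 for zero or negative input, 1 for positive input. */
static int constrain01(int v) { return v >= 1 ? 1 : 0; }

/* Decodes the stream while evaluating the (processor, row) terms in the
   order given by order[], which must be a permutation of 0..TERMS-1. */
void decode_in_order(const int stream[STREAM_LEN],
                     const int order[TERMS], int out[PROCS])
{
    int row_count = stream[0];
    int row_offset[ROWS];
    int i;

    assert(row_count == ROWS);

    /* The only serial part: a running sum over the row lengths. */
    row_offset[0] = 1 + row_count;
    for (i = 1; i < ROWS; i++)
        row_offset[i] = row_offset[i - 1] + stream[i]; /* stream[i] = length of row i-1 */

    for (i = 0; i < PROCS; i++)
        out[i] = 0;

    for (i = 0; i < TERMS; i++) {
        int p = order[i] / ROWS;                   /* processor number */
        int r = order[i] % ROWS;                   /* row number       */
        int keep = constrain01(stream[1 + r] - p); /* 1 if row r holds a bit for p */
        out[p] |= (stream[row_offset[r] + p] * keep) << r;
    }
}
```

Feeding it the same stream with order[] ascending, descending and scrambled should give the same four values every time, since each term only ORs its own bit into its own processor.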
// *** Begin of Delphi Program ***
program TestProgram;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils;
{
Skybuck's Parallel Universal Code, Demonstration program
version 0.01 created on 19 september 2015
This demonstration program illustrates how to decode fields of data in
parallel.
For this demonstration program bit fiddling is avoided by storing 1
bit per array index to illustrate the general idea.
Lengths are likewise stored as 1 integer per array index. In a real-world
scenario they would be Skybuck Universally Encoded.
Skybuck's Parallel Universal Code, Design Document:
version 1 created on 16 september 2015 (after seeing another mention of
IBM's processor MIL which makes me sick, as if decoding fields in parallel
is hard ! LOL ;) :))
(I also wrote a little introductory question for it, see the other file for
it, and I even learned something from it: Parallel decoding lesson for you.txt)
All bits of the fields are split up in such a way that the first bit of each
field is right next to each other, the second bit of each field is also next
to each other, and so forth.
Conceptual view:
First, the current situation:
Field A consists of a1a2a3a4 (4 bits)
Field B consists of b1b2b3 (3 bits)
Field C consists of c1 (1 bit)
Field D consists of d1d2d3d4d5d6 (6 bits)
These bits are stored as follows:
"First row":
a1b1c1d1
"Second row":
a2b2d2
"Third row":
a3b3d3
"Fourth row":
a4d4
"Fifth row":
d5
"Sixth row":
d6
These rows will be stored sequentially as follows:
a1b1c1d1a2b2d2a3b3d3a4d4d5d6
Now the question is: how does a processor know where each row begins, and
how many bits of each field there are ? The answers are given below:
The bit length of each row is stored up front as a prefix:
"First row": 4
"Second row": 3
"Third row": 3
"Fourth row": 2
"Fifth row": 1
"Sixth row": 1
Now for each row their offset can be computed:
First row starts at 0,
Second row starts at 0 + 4 = 4
Third row starts at 0 + 4 + 3 = 7
Fourth row starts at 0 + 4 + 3 + 3 = 10
Fifth row starts at 0 + 4 + 3 + 3 + 2 = 12
Sixth row starts at 0 + 4 + 3 + 3 + 2 + 1 = 13
Let's check if this is true:
 0  1  2  3  4  5  6  7  8  9 10 11 12 13
a1 b1 c1 d1 a2 b2 d2 a3 b3 d3 a4 d4 d5 d6
Bingo, all match.
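The processor can compute the offset of each row.
And thus the processor can reach every row in parallel as follows:
Read the first row (first bit of every field) at offset 0
Read the second row (second bit of every field that has one) at offset 4
Read the third row at offset 7
Read the fourth row at offset 10
(and likewise the fifth and sixth row at offsets 12 and 13)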
Now the question is: how can the processor know when to stop reading ?
A marker/meta bit could be used as described in Skybuck's Universal Code
version 1, which indicates whether the field continues or stops.
These bits can be stored in the same way as the data bits above.
And thus for each data bit, a meta bit can be read as well.
This way the processor knows that c2 does not exist and can stop reading
field C.
For Huffman codes that would not even be required, since the processor can
see in the Huffman tree when it reaches a leaf/end node and then it will
know the end of C was reached.
So marker/meta bits can be avoided by using Huffman codes ! Pretty neato !
;) =D
However Huffman has a drawback: some fields might get large, and it may not
be suited for rare information and modifications and so forth ! ;)
The last thing to do to make this usable is to include another prefix field
which indicates how many row length prefixes there are, that is the number
of rows.
So the final stream will look something like:
[Number of Rows][Set of Row Lengths][Set of Field Data Bits
Intermixed/Parallel]
The first field can be universally coded.
The second set of fields can be universally coded.
The third set of fields contains the data bits intermixed/in parallel.
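Additional:
One problem is left: once a short field has ended, the fields after it
shift to an earlier position in the later rows (in the second row above d2
sits at position 2, not 3), so a processor cannot simply find its bit at
row offset + its own number.
The problem can be solved by sorting the fields from largest to smallest
field, so the new stream looks like:
(sorting from smallest to largest would be possible too... but the code below
assumes the first field is the largest, so it uses the
largest to smallest sorting solution, which is also applied to stream A,
so stream A below is sorted that way)
Stream A: 6 4 3 3 2 1 1 d1 a1 b1 c1 d2 a2 b2 d3 a3 b3 d4 a4 d5 d6
(row count 6, then the six row lengths, then the data bits row by row)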
Bye,
Skybuck.
}
function Constrain( Para : integer ) : integer;
begin
result := Para;
if Para < 0 then Result := 0;
if Para >= 1 then Result := 1;
end;
procedure Main;
// must put variables here otherwise won't show up in debugger.
const
MaxProcessorCount = 4;
var
// information stream, input
// MaxProcessorCount-1 extra entries act as safe "padding" for reading, so
// there are no out of bounds/range check errors with the array.
Stream : array[0..20+(MaxProcessorCount-1)] of integer;
// bits representing fields of data
a1,a2,a3,a4 : integer;
b1,b2,b3 : integer;
c1 : integer;
d1,d2,d3,d4,d5,d6 : integer;
// output
RowIndex : integer;
RowCount : integer;
RowLength : array[0..5] of integer;
RowOffset : array[0..5] of integer;
FieldRowMultiplier : array[0..3,0..5] of integer;
DataOffset : integer;
FieldCount : integer;
FieldLength : array[0..3] of integer;
Processor : array[0..3] of integer;
// debug fields
FieldA : integer;
FieldB : integer;
FieldC : integer;
FieldD : integer;
begin
a1 := 1; a2 := 1; a3:= 1; a4 := 1;
b1 := 1; b2 := 1; b3 := 1;
c1 := 1;
d1 := 1; d2 := 1; d3 := 1; d4 := 1; d5 := 1; d6 := 1;
// compute input fields to compare it later with output fields
FieldA := (a1) or (a2 shl 1) or (a3 shl 2) or (a4 shl 3);
FieldB := (b1) or (b2 shl 1) or (b3 shl 2);
FieldC := (c1);
FieldD := (d1) or (d2 shl 1) or (d3 shl 2) or (d4 shl 3) or (d5 shl 4)
or (d6 shl 5);
// print field values
writeln( 'FieldA: ', FieldA );
writeln( 'FieldB: ', FieldB );
writeln( 'FieldC: ', FieldC );
writeln( 'FieldD: ', FieldD );
writeln;
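// build the input stream:
// [row count][row lengths][data bits, row by row, largest field first]
Stream[0] := 6;
Stream[1] := 4; Stream[2] := 3; Stream[3] := 3;
Stream[4] := 2; Stream[5] := 1; Stream[6] := 1;
Stream[7] := d1; Stream[8] := a1; Stream[9] := b1; Stream[10] := c1;
Stream[11] := d2; Stream[12] := a2; Stream[13] := b2;
Stream[14] := d3; Stream[15] := a3; Stream[16] := b3;
Stream[17] := d4; Stream[18] := a4;
Stream[19] := d5;
Stream[20] := d6;
// zero the padding so that reads past the last row are harmless
Stream[21] := 0; Stream[22] := 0; Stream[23] := 0;
// read the prefix: the row count first, then one length per row
RowCount := Stream[0];
for RowIndex := 0 to RowCount-1 do
  RowLength[RowIndex] := Stream[1+RowIndex];
// let's assume the first field is the largest, so it always consumes a bit
// from every row.
// a multiplier should be 0 if the difference is negative, 0 if it is zero
// and 1 if it is positive; Constrain does the trick.
// the -0, -1, -2, -3 represents subtracting the processor number/identity
// from it... to allow parallel processing ! ;)
FieldRowMultiplier[0,0] := Constrain(RowLength[0]-0); // set to one if larger.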
FieldRowMultiplier[0,1] := Constrain(RowLength[1]-0);
FieldRowMultiplier[0,2] := Constrain(RowLength[2]-0);
FieldRowMultiplier[0,3] := Constrain(RowLength[3]-0);
FieldRowMultiplier[0,4] := Constrain(RowLength[4]-0);
FieldRowMultiplier[0,5] := Constrain(RowLength[5]-0);
// now second field may consume less if there are not enough bits.
FieldRowMultiplier[1,0] := Constrain(RowLength[0]-1);
FieldRowMultiplier[1,1] := Constrain(RowLength[1]-1);
FieldRowMultiplier[1,2] := Constrain(RowLength[2]-1);
FieldRowMultiplier[1,3] := Constrain(RowLength[3]-1);
FieldRowMultiplier[1,4] := Constrain(RowLength[4]-1);
FieldRowMultiplier[1,5] := Constrain(RowLength[5]-1);
// now third field may consume less if there are not enough bits.
FieldRowMultiplier[2,0] := Constrain(RowLength[0]-2);
FieldRowMultiplier[2,1] := Constrain(RowLength[1]-2);
FieldRowMultiplier[2,2] := Constrain(RowLength[2]-2);
FieldRowMultiplier[2,3] := Constrain(RowLength[3]-2);
FieldRowMultiplier[2,4] := Constrain(RowLength[4]-2);
FieldRowMultiplier[2,5] := Constrain(RowLength[5]-2);
// now fourth field may consume less if there are not enough bits.
FieldRowMultiplier[3,0] := Constrain(RowLength[0]-3);
FieldRowMultiplier[3,1] := Constrain(RowLength[1]-3);
FieldRowMultiplier[3,2] := Constrain(RowLength[2]-3);
FieldRowMultiplier[3,3] := Constrain(RowLength[3]-3);
FieldRowMultiplier[3,4] := Constrain(RowLength[4]-3);
FieldRowMultiplier[3,5] := Constrain(RowLength[5]-3);
// now compute field lengths
// not necessary to multiply anything just add them up ! ;) =D
FieldLength[0] :=
FieldRowMultiplier[0,0] +
FieldRowMultiplier[0,1] +
FieldRowMultiplier[0,2] +
FieldRowMultiplier[0,3] +
FieldRowMultiplier[0,4] +
FieldRowMultiplier[0,5];
FieldLength[1] :=
FieldRowMultiplier[1,0] +
FieldRowMultiplier[1,1] +
FieldRowMultiplier[1,2] +
FieldRowMultiplier[1,3] +
FieldRowMultiplier[1,4] +
FieldRowMultiplier[1,5];
FieldLength[2] :=
FieldRowMultiplier[2,0] +
FieldRowMultiplier[2,1] +
FieldRowMultiplier[2,2] +
FieldRowMultiplier[2,3] +
FieldRowMultiplier[2,4] +
FieldRowMultiplier[2,5];
FieldLength[3] :=
FieldRowMultiplier[3,0] +
FieldRowMultiplier[3,1] +
FieldRowMultiplier[3,2] +
FieldRowMultiplier[3,3] +
FieldRowMultiplier[3,4] +
FieldRowMultiplier[3,5];
// though the field multipliers could come in handy later to read the
// bits ! nice ! ;) =D
// for each row the offset must be calculated; this can be done serially
// or by all processors for themselves at the same time:
// row zero starts after the row count (the first stream value) and the
// row lengths =D
// first determine the data offset properly ! ;) :) 1 for the row count +
// RowCount to skip over the row lengths.
DataOffset := 1 + RowCount;
RowOffset[0] := DataOffset;
RowOffset[1] := RowOffset[0] + RowLength[0];
RowOffset[2] := RowOffset[1] + RowLength[1];
RowOffset[3] := RowOffset[2] + RowLength[2];
RowOffset[4] := RowOffset[3] + RowLength[3];
RowOffset[5] := RowOffset[4] + RowLength[4];
// (the length of each field was already determined above; the number of
// fields equals RowLength[0], since row 0 holds one bit of every field.)
// now that all row offsets are calculated it's possible to decode the
// stream into each processor, by each processor.
// each processor knows its own number/identity, represented by the +0
// +1 +2 +3 down below:
// the general idea is:
// row[] here is RowOffset[]
{
Processor[0] :=
Stream[Row[0]+0]+Stream[Row[1]+0]+Stream[Row[2]+0]+Stream[Row[3]+0]+Stream[Row[4]+0]+Stream[Row[5]+0];
Processor[1] :=
Stream[Row[0]+1]+Stream[Row[1]+1]+Stream[Row[2]+1]+Stream[Row[3]+1]+Stream[Row[4]+1]+Stream[Row[5]+1];
Processor[2] :=
Stream[Row[0]+2]+Stream[Row[1]+2]+Stream[Row[2]+2]+Stream[Row[3]+2]+Stream[Row[4]+2]+Stream[Row[5]+2];
Processor[3] :=
Stream[Row[0]+3]+Stream[Row[1]+3]+Stream[Row[2]+3]+Stream[Row[3]+3]+Stream[Row[4]+3]+Stream[Row[5]+3];
}
// however a processor should only include a bit of the stream if it's
// within its field's length. for that we need each field's length, and
// here we have an oops ! ;) :) can we compute a field length if there are
// multiple fields per row ? perhaps we can... row 0 tells us how many
// fields have at least 1 bit, row 1 how many have at least 2 bits, and so
// forth... and since all fields are distributed in parallel and the rows
// are packed, we don't actually need to know more than that lol...
// so how do we make sure that a field ends up where it needs to be ?
// since the fields are stored largest first, field 0 ends up in
// processor 0, field 1 in processor 1, field 2 in 2, field 3 in 3
// (for Stream A that is D, A, B, C).
// but how to know which row length belongs to whose field ?
// is the length of row 5 for A or B or C or D ? thanks to the sorting we
// know the bits of a row always belong to the first few fields... cool ! ;) =D
// so the problem is solved easily... by only including a bit
// from the stream if the multiplier is set to 1 ;) :)
// and now the only thing left to do is shifting the bits into their proper
// position and or-ing them together ! ;)
Processor[0] :=
((Stream[RowOffset[0]+0]*FieldRowMultiplier[0,0]) shl 0) or
((Stream[RowOffset[1]+0]*FieldRowMultiplier[0,1]) shl 1) or
((Stream[RowOffset[2]+0]*FieldRowMultiplier[0,2]) shl 2) or
((Stream[RowOffset[3]+0]*FieldRowMultiplier[0,3]) shl 3) or
((Stream[RowOffset[4]+0]*FieldRowMultiplier[0,4]) shl 4) or
((Stream[RowOffset[5]+0]*FieldRowMultiplier[0,5]) shl 5);
Processor[1] :=
((Stream[RowOffset[0]+1]*FieldRowMultiplier[1,0]) shl 0) or
((Stream[RowOffset[1]+1]*FieldRowMultiplier[1,1]) shl 1) or
((Stream[RowOffset[2]+1]*FieldRowMultiplier[1,2]) shl 2) or
((Stream[RowOffset[3]+1]*FieldRowMultiplier[1,3]) shl 3) or
((Stream[RowOffset[4]+1]*FieldRowMultiplier[1,4]) shl 4) or
((Stream[RowOffset[5]+1]*FieldRowMultiplier[1,5]) shl 5);
Processor[2] :=
((Stream[RowOffset[0]+2]*FieldRowMultiplier[2,0]) shl 0) or
((Stream[RowOffset[1]+2]*FieldRowMultiplier[2,1]) shl 1) or
((Stream[RowOffset[2]+2]*FieldRowMultiplier[2,2]) shl 2) or
((Stream[RowOffset[3]+2]*FieldRowMultiplier[2,3]) shl 3) or
((Stream[RowOffset[4]+2]*FieldRowMultiplier[2,4]) shl 4) or
((Stream[RowOffset[5]+2]*FieldRowMultiplier[2,5]) shl 5);
Processor[3] :=
((Stream[RowOffset[0]+3]*FieldRowMultiplier[3,0]) shl 0) or
((Stream[RowOffset[1]+3]*FieldRowMultiplier[3,1]) shl 1) or
((Stream[RowOffset[2]+3]*FieldRowMultiplier[3,2]) shl 2) or
((Stream[RowOffset[3]+3]*FieldRowMultiplier[3,3]) shl 3) or
((Stream[RowOffset[4]+3]*FieldRowMultiplier[3,4]) shl 4) or
((Stream[RowOffset[5]+3]*FieldRowMultiplier[3,5]) shl 5);
// *** STILL TO DO (solved by extending array): ************************
// *** ^^^ may have to look into potential out of range problem ^^^ ****
// *********************************************************************
// for now it seems ok, also as long as the stream array has a
// +NumberOfProcessors scratch pad at the end it may be ok ! ;) :)
// one last problem might remain... the row offset may go out of
// range...
// we could either use a scratch pad or solve this in another way.
// I think it's best to leave it as is and perhaps make the stream a bit
// larger or something, let's see what happens.
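// print processor values. with the stream sorted largest field first:
// Processor[0] = FieldD, Processor[1] = FieldA,
// Processor[2] = FieldB, Processor[3] = FieldC.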
writeln( 'Processor[0]: ', Processor[0] );
writeln( 'Processor[1]: ', Processor[1] );
writeln( 'Processor[2]: ', Processor[2] );
writeln( 'Processor[3]: ', Processor[3] );
writeln;
end;
begin
try
Main;
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
ReadLn;
end.
// *** End of Delphi Program ***
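The two derived tables of the program above (where each row starts, and how long each field is) can also be written as loops. A minimal sketch in C, with made-up names (row_offsets, field_length) and with offsets counted from the first data bit as in the design document, so 1 + row count still has to be added for the prefixed stream. The field numbering assumes the largest-first sorting.

```c
#include <assert.h>

/* Hypothetical helper: running sum over the row lengths. row_offset[r] is
   where row r starts, counted from the first data bit. */
void row_offsets(const int *row_length, int row_count, int *row_offset)
{
    int r, sum = 0;

    assert(row_count >= 0);
    for (r = 0; r < row_count; r++) {
        row_offset[r] = sum;
        sum += row_length[r];
    }
}

/* Hypothetical helper: field p (sorted order, largest first) owns one bit
   in every row that holds more than p bits, so counting those rows gives
   its length. This is the same test as Constrain(RowLength[r] - p). */
int field_length(const int *row_length, int row_count, int p)
{
    int r, len = 0;

    for (r = 0; r < row_count; r++)
        if (row_length[r] > p)
            len++;
    return len;
}
```

For the row lengths 4, 3, 3, 2, 1, 1 this should reproduce the offsets 0, 4, 7, 10, 12, 13 from the design document and the field lengths 6, 4, 3, 1 (D, A, B, C).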
// *** Begin of C/C++ Program ***
// ParallelDecodingCVersion.cpp : Defines the entry point for the console
// application.
//
// Full C version 0.01 created on 23 september 2015 by Skybuck Flying =D
#include "stdafx.h"
#include <stdio.h> // printf, in case stdafx.h does not already pull it in
// Begin of Dummy Decoder Example
const int MaxProcessorCount = 4;
int Constrain( int Para )
{
if (Para < 0) Para = 0;
if (Para >= 1) Para = 1;
return Para;
}
int _tmain(int argc, _TCHAR* argv[])
{
// information stream, input
// MaxProcessorCount-1 extra entries act as safe "padding" for reading, so
// there are no out of bounds/range check errors with the array.
int Stream[21+(MaxProcessorCount-1)];
// bits representing fields of data
int a1,a2,a3,a4;
int b1,b2,b3;
int c1;
int d1,d2,d3,d4,d5,d6;
// output
int RowIndex;
int RowCount;
int RowLength[6];
int RowOffset[6];
int FieldRowMultiplier[4][6];
int DataOffset;
int FieldCount;
int FieldLength[4];
int Processor[4];
// debug fields
int FieldA;
int FieldB;
int FieldC;
int FieldD;
// input test 1
/*
a1 = 1; a2 = 1; a3 = 1; a4 = 1;
b1 = 1; b2 = 1; b3 = 1;
c1 = 1;
d1 = 1; d2 = 1; d3 = 1; d4 = 1; d5 = 1; d6 = 1;
*/
// input test 2
a1 = 1; a2 = 0; a3 = 0; a4 = 1;
b1 = 1; b2 = 1; b3 = 0;
c1 = 1;
d1 = 1; d2 = 1; d3 = 0; d4 = 0; d5 = 1; d6 = 0;
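// build the input stream:
// [row count][row lengths][data bits, row by row, largest field first]
Stream[0] = 6;
Stream[1] = 4; Stream[2] = 3; Stream[3] = 3;
Stream[4] = 2; Stream[5] = 1; Stream[6] = 1;
Stream[7] = d1; Stream[8] = a1; Stream[9] = b1; Stream[10] = c1;
Stream[11] = d2; Stream[12] = a2; Stream[13] = b2;
Stream[14] = d3; Stream[15] = a3; Stream[16] = b3;
Stream[17] = d4; Stream[18] = a4;
Stream[19] = d5;
Stream[20] = d6;
// zero the padding so that reads past the last row are harmless
Stream[21] = 0; Stream[22] = 0; Stream[23] = 0;
// read the prefix: the row count first, then one length per row
RowCount = Stream[0];
for (RowIndex = 0; RowIndex < RowCount; RowIndex++)
  RowLength[RowIndex] = Stream[1+RowIndex];
// let's assume the first field is the largest, so it always consumes a bit
// from every row.
// a multiplier should be 0 if the difference is negative, 0 if it is zero
// and 1 if it is positive; Constrain does the trick.
// the -0, -1, -2, -3 represents subtracting the processor number/identity
// from it... to allow parallel processing ! ;)
FieldRowMultiplier[0][0] = Constrain(RowLength[0]-0); // set to one if larger.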
FieldRowMultiplier[0][1] = Constrain(RowLength[1]-0);
FieldRowMultiplier[0][2] = Constrain(RowLength[2]-0);
FieldRowMultiplier[0][3] = Constrain(RowLength[3]-0);
FieldRowMultiplier[0][4] = Constrain(RowLength[4]-0);
FieldRowMultiplier[0][5] = Constrain(RowLength[5]-0);
// now second field may consume less if there are not enough bits.
FieldRowMultiplier[1][0] = Constrain(RowLength[0]-1);
FieldRowMultiplier[1][1] = Constrain(RowLength[1]-1);
FieldRowMultiplier[1][2] = Constrain(RowLength[2]-1);
FieldRowMultiplier[1][3] = Constrain(RowLength[3]-1);
FieldRowMultiplier[1][4] = Constrain(RowLength[4]-1);
FieldRowMultiplier[1][5] = Constrain(RowLength[5]-1);
// now third field may consume less if there are not enough bits.
FieldRowMultiplier[2][0] = Constrain(RowLength[0]-2);
FieldRowMultiplier[2][1] = Constrain(RowLength[1]-2);
FieldRowMultiplier[2][2] = Constrain(RowLength[2]-2);
FieldRowMultiplier[2][3] = Constrain(RowLength[3]-2);
FieldRowMultiplier[2][4] = Constrain(RowLength[4]-2);
FieldRowMultiplier[2][5] = Constrain(RowLength[5]-2);
// now fourth field may consume less if there are not enough bits.
FieldRowMultiplier[3][0] = Constrain(RowLength[0]-3);
FieldRowMultiplier[3][1] = Constrain(RowLength[1]-3);
FieldRowMultiplier[3][2] = Constrain(RowLength[2]-3);
FieldRowMultiplier[3][3] = Constrain(RowLength[3]-3);
FieldRowMultiplier[3][4] = Constrain(RowLength[4]-3);
FieldRowMultiplier[3][5] = Constrain(RowLength[5]-3);
// now compute field lengths
// not necessary to multiply anything just add them up ! ;) =D
FieldLength[0] =
FieldRowMultiplier[0][0] +
FieldRowMultiplier[0][1] +
FieldRowMultiplier[0][2] +
FieldRowMultiplier[0][3] +
FieldRowMultiplier[0][4] +
FieldRowMultiplier[0][5];
FieldLength[1] =
FieldRowMultiplier[1][0] +
FieldRowMultiplier[1][1] +
FieldRowMultiplier[1][2] +
FieldRowMultiplier[1][3] +
FieldRowMultiplier[1][4] +
FieldRowMultiplier[1][5];
FieldLength[2] =
FieldRowMultiplier[2][0] +
FieldRowMultiplier[2][1] +
FieldRowMultiplier[2][2] +
FieldRowMultiplier[2][3] +
FieldRowMultiplier[2][4] +
FieldRowMultiplier[2][5];
FieldLength[3] =
FieldRowMultiplier[3][0] +
FieldRowMultiplier[3][1] +
FieldRowMultiplier[3][2] +
FieldRowMultiplier[3][3] +
FieldRowMultiplier[3][4] +
FieldRowMultiplier[3][5];
// though the field multipliers could come in handy later to read the
// bits ! nice ! ;) =D
// for each row the offset must be calculated; this can be done serially
// or by all processors for themselves at the same time:
// row zero starts after the row count (the first stream value) and the
// row lengths =D
// first determine the data offset properly ! ;) :) 1 for the row count +
// RowCount to skip over the row lengths.
DataOffset = 1 + RowCount;
RowOffset[0] = DataOffset;
RowOffset[1] = RowOffset[0] + RowLength[0];
RowOffset[2] = RowOffset[1] + RowLength[1];
RowOffset[3] = RowOffset[2] + RowLength[2];
RowOffset[4] = RowOffset[3] + RowLength[3];
RowOffset[5] = RowOffset[4] + RowLength[4];
// (the length of each field was already determined above; the number of
// fields equals RowLength[0], since row 0 holds one bit of every field.)
// now that all row offsets are calculated it's possible to decode the
// stream into each processor, by each processor.
// each processor knows its own number/identity, represented by the +0
// +1 +2 +3 down below:
// the general idea is:
// row[] here is RowOffset[]
/*
Processor[0] :=
Stream[Row[0]+0]+Stream[Row[1]+0]+Stream[Row[2]+0]+Stream[Row[3]+0]+Stream[Row[4]+0]+Stream[Row[5]+0];
Processor[1] :=
Stream[Row[0]+1]+Stream[Row[1]+1]+Stream[Row[2]+1]+Stream[Row[3]+1]+Stream[Row[4]+1]+Stream[Row[5]+1];
Processor[2] :=
Stream[Row[0]+2]+Stream[Row[1]+2]+Stream[Row[2]+2]+Stream[Row[3]+2]+Stream[Row[4]+2]+Stream[Row[5]+2];
Processor[3] :=
Stream[Row[0]+3]+Stream[Row[1]+3]+Stream[Row[2]+3]+Stream[Row[3]+3]+Stream[Row[4]+3]+Stream[Row[5]+3];
*/
// however a processor should only include a bit of the stream if it's
// within its field's length. for that we need each field's length, and
// here we have an oops ! ;) :) can we compute a field length if there are
// multiple fields per row ? perhaps we can... row 0 tells us how many
// fields have at least 1 bit, row 1 how many have at least 2 bits, and so
// forth... and since all fields are distributed in parallel and the rows
// are packed, we don't actually need to know more than that lol...
// so how do we make sure that a field ends up where it needs to be ?
// since the fields are stored largest first, field 0 ends up in
// processor 0, field 1 in processor 1, field 2 in 2, field 3 in 3
// (for Stream A that is D, A, B, C).
// but how to know which row length belongs to whose field ?
// is the length of row 5 for A or B or C or D ? thanks to the sorting we
// know the bits of a row always belong to the first few fields... cool ! ;) =D
// so the problem is solved easily... by only including a bit
// from the stream if the multiplier is set to 1 ;) :)
// and now the only thing left to do is shifting the bits into their proper
// position and or-ing them together ! ;)
Processor[0] =
((Stream[RowOffset[0]+0]*FieldRowMultiplier[0][0]) << 0) |
((Stream[RowOffset[1]+0]*FieldRowMultiplier[0][1]) << 1) |
((Stream[RowOffset[2]+0]*FieldRowMultiplier[0][2]) << 2) |
((Stream[RowOffset[3]+0]*FieldRowMultiplier[0][3]) << 3) |
((Stream[RowOffset[4]+0]*FieldRowMultiplier[0][4]) << 4) |
((Stream[RowOffset[5]+0]*FieldRowMultiplier[0][5]) << 5);
Processor[1] =
((Stream[RowOffset[0]+1]*FieldRowMultiplier[1][0]) << 0) |
((Stream[RowOffset[1]+1]*FieldRowMultiplier[1][1]) << 1) |
((Stream[RowOffset[2]+1]*FieldRowMultiplier[1][2]) << 2) |
((Stream[RowOffset[3]+1]*FieldRowMultiplier[1][3]) << 3) |
((Stream[RowOffset[4]+1]*FieldRowMultiplier[1][4]) << 4) |
((Stream[RowOffset[5]+1]*FieldRowMultiplier[1][5]) << 5);
Processor[2] =
((Stream[RowOffset[0]+2]*FieldRowMultiplier[2][0]) << 0) |
((Stream[RowOffset[1]+2]*FieldRowMultiplier[2][1]) << 1) |
((Stream[RowOffset[2]+2]*FieldRowMultiplier[2][2]) << 2) |
((Stream[RowOffset[3]+2]*FieldRowMultiplier[2][3]) << 3) |
((Stream[RowOffset[4]+2]*FieldRowMultiplier[2][4]) << 4) |
((Stream[RowOffset[5]+2]*FieldRowMultiplier[2][5]) << 5);
Processor[3] =
((Stream[RowOffset[0]+3]*FieldRowMultiplier[3][0]) << 0) |
((Stream[RowOffset[1]+3]*FieldRowMultiplier[3][1]) << 1) |
((Stream[RowOffset[2]+3]*FieldRowMultiplier[3][2]) << 2) |
((Stream[RowOffset[3]+3]*FieldRowMultiplier[3][3]) << 3) |
((Stream[RowOffset[4]+3]*FieldRowMultiplier[3][4]) << 4) |
((Stream[RowOffset[5]+3]*FieldRowMultiplier[3][5]) << 5);
// *** STILL TO DO (solved by extending array): ************************
// *** ^^^ may have to look into potential out of range problem ^^^ ****
// solved by adding padding for reading ;)
// *********************************************************************
// for now it seems ok, also as long as the stream array has a
// +NumberOfProcessors scratch pad at the end it may be ok ! ;) :)
// one last problem might remain... the row offset may go out of
// range...
// we could either use a scratch pad or solve this in another way.
// I think it's best to leave it as is and perhaps make the stream a bit
// larger or something, let's see what happens.
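// print processor values. with the stream sorted largest field first:
// Processor[0] = field D, Processor[1] = field A,
// Processor[2] = field B, Processor[3] = field C.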
printf( "Processor[0]: %d \n", Processor[0] );
printf( "Processor[1]: %d \n", Processor[1] );
printf( "Processor[2]: %d \n", Processor[2] );
printf( "Processor[3]: %d \n\n", Processor[3] );
return 0;
}
// *** End of C/C++ Program ***
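Both programs only decode. For completeness, here is a minimal sketch of the other direction in C: packing already sorted fields into the [row count][row lengths][data bits] stream. The name encode_rows and its parameters are made up for this sketch, and like the demo it stores one bit (or one count) per array entry, so it illustrates the layout and is not a real bit packer.

```c
#include <assert.h>

/* Hypothetical encoder sketch. value[f] holds field f and length[f] its
   number of bits; the fields must already be sorted largest first.
   stream[] needs room for 1 + length[0] + (sum of all lengths) entries.
   Returns the number of stream entries written. */
int encode_rows(const int *value, const int *length, int field_count,
                int *stream)
{
    int row_count, r, f, pos;

    assert(field_count > 0);
    for (f = 1; f < field_count; f++)
        assert(length[f - 1] >= length[f]);    /* sorted, largest first */

    row_count = length[0];      /* the largest field sets the row count */
    stream[0] = row_count;
    pos = 1 + row_count;        /* data starts after the prefix         */

    for (r = 0; r < row_count; r++) {
        int bits_in_row = 0;

        /* Row r holds bit r of every field that is long enough. */
        for (f = 0; f < field_count && length[f] > r; f++) {
            stream[pos++] = (value[f] >> r) & 1;
            bits_in_row++;
        }
        stream[1 + r] = bits_in_row;           /* row length prefix */
    }
    return pos;
}
```

Called with the "input test 2" fields in sorted order (D = 19, A = 9, B = 3, C = 1 with lengths 6, 4, 3, 1), it should write 21 entries starting with 6 4 3 3 2 1 1, which is the prefix of Stream A.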
Bye,
Skybuck.