While I'm here, does anybody have Verilog model for a FP ALU?
Emails, URLs, whatever, gratefully received.
Ed Chester
Microelectronic Systems Design Group
Electrical and Electronic Engineering
University of Newcastle, UK.
> Hi. Can anybody help me out with a verilog model (for synthesis) for a
> Priority Encoder? Anything along the lines of the functionality of the
> rather ancient 74147/148s? Any width, just so I can see how to do it,
> but I'm _actually_ after
> a 24b encoder to use as part of a floating-point ALU. For research
> purposes only, I'm only after a set of encoders to compare speed/area
> trade-offs; and the capability of my synthesis tool.
Off the top of my head:
function [3:0] prienc;
  input [7:0] select;
  reg [3:0] out;
  begin
    casex (select)
      8'b1xxx_xxxx: out = 4'h7;
      8'b01xx_xxxx: out = 4'h6;
      8'b001x_xxxx: out = 4'h5;
      8'b0001_xxxx: out = 4'h4;
      8'b0000_1xxx: out = 4'h3;
      8'b0000_01xx: out = 4'h2;
      8'b0000_001x: out = 4'h1;
      8'b0000_0001: out = 4'h0;
      8'b0000_0000: out = 4'h8;
    endcase
    prienc = out;
  end
endfunction
It'll return 0-7 if any of the inputs are asserted, or 8, if none are
asserted.
If you're using Synopsys, you'll want to add //synopsys parallel_case
full_case to the case().
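If your tool does not honour the Synopsys pragmas, one possible alternative is to make the case complete with a default branch, so that simulation and synthesis see the same thing. A minimal sketch, assuming a made-up module name (prienc8_default) and casez in place of casex, so that x values on the input are not treated as wildcards:

```verilog
// Sketch only: prienc8_default and its port names are hypothetical.
module prienc8_default (select, index);
  input  [7:0] select;
  output [3:0] index;   // 0-7 = highest asserted bit, 8 = none asserted
  reg    [3:0] index;

  always @(select) begin
    casez (select)
      8'b1???_????: index = 4'h7;
      8'b01??_????: index = 4'h6;
      8'b001?_????: index = 4'h5;
      8'b0001_????: index = 4'h4;
      8'b0000_1???: index = 4'h3;
      8'b0000_01??: index = 4'h2;
      8'b0000_001?: index = 4'h1;
      8'b0000_0001: index = 4'h0;
      // Covers 8'b0000_0000 (and any x/z input in simulation),
      // so no full_case pragma is needed.
      default:      index = 4'h8;
    endcase
  end
endmodule
```

The patterns are mutually exclusive, so the result does not depend on the order of the branches.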
--Kai
>Hi. Can anybody help me out with a verilog model (for synthesis) for a
>Priority Encoder? Anything along the lines of the functionality of the
>rather ancient 74147/148s? Any width, just so I can see how to do it,
(snip)
I remember, many years ago, reading through the TI TTL Databook and
looking at the descriptions of the different ICs. I believe
that the logic of the priority encoder was there. In any case, it
is a combinatorial circuit with 2^n inputs and n outputs, and shouldn't
take too long to design for small n.
For small n you can do it all in one or two logic levels. For larger n
there should be a cascade method.
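One way to read "cascade" is to build a wide encoder from narrow blocks, each reporting an index plus an "any input set" flag, and let the upper block win. The sketch below is an illustration only, not a gate-level copy of the TTL parts; enc8, enc16 and all port names are hypothetical.

```verilog
// Sketch only: enc8, enc16 and all port names are made up for illustration.
module enc8 (in, idx, any);
  input  [7:0] in;
  output [2:0] idx;   // index of the highest asserted bit (0 when none)
  output       any;   // 1 when at least one input is asserted
  reg    [2:0] idx;
  integer i;

  always @(in) begin
    idx = 3'd0;
    for (i = 0; i <= 7; i = i + 1)
      if (in[i]) idx = i;   // higher bits overwrite lower ones (i truncated to 3 bits)
  end

  assign any = |in;
endmodule

module enc16 (in, idx, any);
  input  [15:0] in;
  output [3:0]  idx;
  output        any;
  wire   [2:0]  idx_hi, idx_lo;
  wire          any_hi, any_lo;

  enc8 hi (.in(in[15:8]), .idx(idx_hi), .any(any_hi));
  enc8 lo (.in(in[7:0]),  .idx(idx_lo), .any(any_lo));

  // The upper block wins whenever it sees a one; its flag becomes the MSB.
  assign idx = any_hi ? {1'b1, idx_hi} : {1'b0, idx_lo};
  assign any = any_hi | any_lo;
endmodule
```

A 24-bit version along the same lines would use three enc8 blocks and a 2-bit block select in place of the single MSB.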
-- glen
always @(select) begin
  if (select[7]) out = 3'h7;
  else if (select[6]) out = 3'h6;
  else if (select[5]) out = 3'h5;
  ..etc.. for all bits
Mindless application of tabs, copy and paste can be very good for the
soul....
>Hi. Can anybody help me out with a verilog model (for synthesis) for a
>Priority Encoder? Anything along the lines of the functionality of the
>rather ancient 74147/148s? Any width, just so I can see how to do it,
>but I'm _actually_ after
>a 24b encoder to use as part of a floating-point ALU. For research
>purposes only, I'm only after a set of encoders to compare speed/area
>trade-offs; and the capability of my synthesis tool.
>
/* Priority encoder.
*
*/
always @ (fl)
begin
locsel = 0;
begin : priority
integer i;
for (i = 0; i <= 7; i = i + 1) begin
if (fl[i] == 1) begin
locsel[i] = fl[i];
disable priority;
end // if
end // for
end // priority
end // always
assign sel_idx[2] = | ({locsel[7:4]});
assign sel_idx[1] = | ({locsel[7:6],locsel[3:2]});
assign sel_idx[0] = | ({locsel[7],locsel[5],locsel[3],locsel[1]});
Dave
We tried synthesizing both this design and the following casex-based
design:
/* Priority encoder.
*
*/
always @ (/*AUTOSENSE*/fl)
begin
casex (fl) // synopsys full_case parallel_case
8'bxxxx_xxx1 : sel_idx = 3'd0;
8'bxxxx_xx10 : sel_idx = 3'd1;
8'bxxxx_x100 : sel_idx = 3'd2;
8'bxxxx_1000 : sel_idx = 3'd3;
8'bxxx1_0000 : sel_idx = 3'd4;
8'bxx10_0000 : sel_idx = 3'd5;
8'bx100_0000 : sel_idx = 3'd6;
8'b1000_0000 : sel_idx = 3'd7;
endcase
end // always
The casex design used 10 gates versus 18 for the for-loop design
(counting inverters as gates). Both designs had approx 4 levels of
logic (Synopsys threw in some complex gates).
--
Matthew Lovell voice: (970) 898-6264
Hewlett-Packard FSL fax: (970) 898-2510
3404 E. Harmony Rd. MS A0 location: 3UR4
Fort Collins, CO 80528-9599 mailto:lov...@fc.hp.com
The opinions expressed above are mine alone and do not represent
those of the Hewlett-Packard Company.
--
Koenraad SCHELFHOUT
ALCATEL
Switching Systems Division    http://www.alcatel.com/
Microelectronics Department - VH14
Phone : (32/3) 240 89 93
Fax : (32/3) 240 99 47
mailto:ks...@sh.bel.alcatel.be
Francis Wellesplein, 1
B-2018 Antwerpen
Belgium
There are many ways of describing a priority encoder in VHDL. Perhaps
the simplest is:
output <= drive1 WHEN case1 ELSE
drive2 WHEN case2 ELSE
...
driveN-1 WHEN caseN-1 ELSE
driveN;
Where the <caseN>s are boolean expressions, with caseI having higher
priority than case(I+M) for any M > 0.
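For readers following the Verilog side of the thread, the same priority chain can be written with nested conditional operators. A minimal sketch, assuming a 4-bit request vector; the module and signal names are hypothetical:

```verilog
// Sketch only: when_else_chain, req and grant are made-up names.
module when_else_chain (req, grant);
  input  [3:0] req;
  output [1:0] grant;

  // Conditions are tested in order, so req[3] has the highest priority.
  assign grant = req[3] ? 2'd3 :
                 req[2] ? 2'd2 :
                 req[1] ? 2'd1 :
                          2'd0;   // also the result when nothing is asserted
endmodule
```

As in the VHDL form, the final branch acts as the default, so a separate "found" flag (for example |req) is needed to tell "bit 0 set" apart from "nothing set".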
Hope this helps,
Paul
--
Paul Menchini | me...@mench.com | "Every damn thing is your
Menchini & Associates | www.mench.com | own fault if you're any
P.O. Box 71767 | 919-479-1670[v] | good."
Durham, NC 27722-1767 | 919-479-1671[f] | -- Ernest Hemingway
We have tested the following methods for coding priority encoders:
if-then-else
for loops
for loops with breaks
trees (using recursion and loops)
These methods were run for encoders of length 2**N (N from 2 to 8). All
of the results were graphed. The tree method was consistently the best
solution from a performance standpoint (we measured area, timing and compile
time). However, for smaller encoders (N < 5) the results tended to be very
close (timing and area wise) after optimization. The choice of structure
or architecture mattered more as the priority encoder became larger. All
structures were inspected using RTL Analyzer.
I will not publish ALL of the coding styles here but I have presented the
material several times at various sites during 1996 & early 1997. I will
reprint a post in ESNUG which was based on material I submitted to John in
Nov 1996. I will also add the snippet of code which implements the same
structure in Verilog. Since I can not find the specific priority encoder
coded in verilog using the "tree method" I have described, I am substituting
a tree_xor written in verilog. This is a substitute for a reduction operator
which can be used to build a tree structure for a parity function. The
concept is the same and can be extended for a priority encoder. Note that in
using a loop structure to implement a tree, you iterate through the loop
once for each level of logic. The number of loop iteration is then log2N
where N is the number of inputs.
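As a rough illustration of how that loop-per-level idea might carry over to a priority encoder, here is a sketch. It is not the code used in the experiments; it is written from the description above. The module name, ports and parameters are hypothetical, it needs a power-of-two input count, and it uses the Verilog-2001 indexed part-select (+:).

```verilog
// Sketch only: prienc_tree and its ports/parameters are hypothetical.
// The highest-numbered asserted input wins.
module prienc_tree (a, p, f);
  parameter LOG = 3;            // number of tree levels
  parameter N   = 1 << LOG;     // number of inputs, must equal 2**LOG
  input  [N-1:0]   a;
  output [LOG-1:0] p;           // index of the highest asserted input
  output           f;           // 1 when any input is asserted
  reg    [LOG-1:0] p;
  reg              f;

  reg [N-1:0]     flag;         // flag[j]: node j of the current level saw a one
  reg [N*LOG-1:0] idx;          // idx[j*LOG +: LOG]: partial index held by node j
  reg [LOG-1:0]   sel;
  integer level, j;

  always @(a) begin
    flag = a;                   // level 0: every input is its own node
    idx  = 0;
    // One outer iteration per level of logic.
    for (level = 0; level < LOG; level = level + 1)
      for (j = 0; j < N/2; j = j + 1)
        if (j < (N >> (level + 1))) begin
          // Merge children 2j (low half) and 2j+1 (high half) into node j.
          sel = flag[2*j+1] ? idx[(2*j+1)*LOG +: LOG]
                            : idx[(2*j)*LOG   +: LOG];
          sel[level] = flag[2*j+1];        // new index bit: did the high half win?
          idx[j*LOG +: LOG] = sel;
          flag[j] = flag[2*j+1] | flag[2*j];
        end
    p = idx[LOG-1:0];
    f = flag[0];
  end
endmodule
```

Updating flag and idx in place is safe here because node j only reads nodes 2j and 2j+1, which are never written earlier in the same level. For a 24-bit input, one option is LOG = 5 with the top eight inputs tied to zero.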
For further information you can also view the tutorial presentation at SNUG
1998, on the web at:
http://www.synopsys.com/news/pubs/snug/nasnug_papers.html
You need a solvit-id to access the papers.
Prasad Paranjpe
Technical Marketing Manager - Synopsys
pra...@synopsys.com
( ESNUG 254 Item 2 ) -------------------------------------------- [11/6/96]
Subject: (ESNUG 253 #10) Does Synopsys Synthesis Support VHDL Recursion ?
> I recently heard that Synopsys supports Recursion in VHDL, and will
> synthesize recursive VHDL code! Recursion is a very powerful tool, and
> for instantiating highly repetitive blocks it would be very useful. ...
> ... Peter Ashenden of University of Cincinnati has an example of a
> recursive call upon a VHDL entity to generate a buffer tree, the paper is
> called "Recursive and Repetitive Hardware Models in VHDL". ... If at all
> possible I would appreciate a simple example, something like a clock tree
> or priority encoder would be nice.
From: jcooley@world.std.com (John Cooley)
The quick & dirty answer is "Yes, Synopsys does support VHDL recursion in
synthesis." (It could even support Verilog recursion if Verilog had it;
the basic software to do it isn't terribly Verilog/VHDL dependent.) Of
course this is just newly supported by Synopsys so a lot of this is still
very experimental. There are some complex restrictions you have to follow
but the big one is that *everything* must be static at elaboration time.
(i.e. You may design an N-bit priority encoder, but N *must* be given an
exact value at synthesis time.) The N-bit priority encoder below will
consistently build a better circuit with better area and timing -- plus
it scales nicely.
package pack is
constant N: integer := 5; -- Note: N is statically defined here!
function log2(A: integer) return integer;
function max(A,B: integer) return integer;
end;
package body pack is
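function max(A,B: integer) return integer is
begin
if(A < B) then return(B); else return(A); end if;
end;
function log2(A: integer) return integer is
begin
for I in 1 to 30 loop -- Works for up to 32 bit integers
if(2**I > A) then return(I-1);
end if;
end loop;
return(30);
end;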
end;
use work.pack.all;
entity priority_tree is
port (A: in bit_vector(2**N - 1 downto 0);
P: out bit_vector(N-1 downto 0);
F: out bit);
end;
architecture a of priority_tree is
procedure priority ( A: in bit_vector; -- Input Vector
P: out bit_vector; -- High Priority Index
F: out bit) is -- Found a one?
constant WIDTH: INTEGER := A'length;
constant LOG_WIDTH: INTEGER := log2(WIDTH);
variable AT: bit_vector(WIDTH-1 downto 0);
variable F1, F0: bit;
variable PRET: bit_vector(LOG_WIDTH-1 downto 0);
variable P1, P0, PT: bit_vector(max(LOG_WIDTH-2,0) downto 0);
begin
AT := A; -- Normalize array indexes
if(WIDTH = 1) then -- Handle Degenerate case of single input
F := AT(0);
elsif(WIDTH = 2) then -- Bottom of the recursion: a two-bit encoder
PRET(0) := AT(1); -- Index is 1 only when the upper bit is set
F := AT(1) or AT(0);
else -- Recurse on the two halves, and compute combined result
priority ( AT(WIDTH-1 downto WIDTH/2), P1, F1);
priority ( AT(WIDTH/2-1 downto 0), P0, F0);
F := F1 or F0; -- We found a one if either half had a one
if(F1 = '1') then -- If the first half had a one, use its index
PT := P1;
else
PT := P0; -- Otherwise, use the second half's index
end if;
PRET := F1 & PT; -- The result MSB is one if the first half had a 1
end if;
P := PRET;
end;
begin
process(A)
variable PV: bit_vector(N-1 downto 0);
variable FV: bit;
begin
priority (A, PV, FV);
P <= PV;
F <= FV;
end process;
end;
Another more concise way to do this is with a loop (see below). The
loop architecture is a serial, cascaded circuit which can be optimized
effectively up to N=16. At N=32 you begin to see a small variation in
the results. The tree is always better but in small examples the
optimizer will produce the same result.
package pack is
constant WIDTH: integer := 32;
function log2(A: integer) return integer;
end;
package body pack is
function log2(A: integer) return integer is
begin
for I in 1 to 30 loop -- Works for up to 32 bit integers
if(2**I > A) then
return(I-1);
end if;
end loop;
return(30);
end;
end;
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use work.pack.all;
entity priority_long is
port (A: in std_logic_vector(WIDTH - 1 downto 0);
P: out std_logic_vector(log2(WIDTH)-1 downto 0));
end;
architecture a of priority_long is
signal PT : std_logic_vector (log2(WIDTH) downto 0);
signal IT : integer;
begin
process(A,PT,IT)
begin
IT <= 0;
for I in 0 to WIDTH-1 loop
if ( A(I) = '1' ) then
IT <= I;
end if;
end loop;
PT <= CONV_STD_LOGIC_VECTOR(IT,log2(WIDTH)+1);
P <= PT(log2(WIDTH)-1 downto 0);
end process;
end;
The final function call shows how to implement the reduction operator
(function call) using recursion to produce a tree. In many cases a
simple loop can produce the same results but the tree is guaranteed to
give you the best synthesized result. (Many times Synopsys can clean up
bad code but this gets harder as the circuit gets bigger or more complex.)
function recurse_XOR (data: std_logic_vector) return std_logic is
variable UPPER_TREE, LOWER_TREE : std_logic;
variable L_BOUND, LEN : integer;
variable i_data : std_logic_vector(data'LENGTH-1 downto 0);
variable result : std_logic;
begin
i_data := data;
if i_data'length = 1 then
result := i_data(i_data'LEFT);
elsif i_data'length = 2 then
result := i_data(i_data'LEFT) XOR i_data(i_data'RIGHT);
else
LEN := i_data'LENGTH;
L_BOUND := (LEN + 1)/2 + i_data'RIGHT;
UPPER_TREE := recurse_XOR (i_data(i_data'LEFT downto L_BOUND));
LOWER_TREE := recurse_XOR (i_data(L_BOUND - 1 downto i_data'RIGHT));
result := UPPER_TREE XOR LOWER_TREE;
end if;
return result;
end;
For an interesting discussion of coding priority encoders, I suggest you
check out the paper by Mike Parkin of SUN Microsystems in your Synopsys
Online Documentation titled "Writing Successful RTL Descriptions in
Verilog" or the SolvIt note 019403.
Of course, all this recursion work is cutting edge, so don't be surprised
to see more details (and restrictions) on how to use recursion in Synopsys
synthesis in future ESNUGs.
- John Cooley
the ESNUG guy
AND NOW for the verilog
module xor_tree(A, Z);
parameter N = 7;
input [N-1:0] A;
output Z;
reg Z;
function [N-1:0] log2N;
input [31:0] num;
integer I;
reg [31:0] M;
begin : search
M = 1;
log2N = 0;
for (I=0; I<=31; I=I+1)
begin
if (M >= num)
begin
log2N = I;
disable search; // leave the whole search, not just this iteration
end
else
M = M << 1;
end
end
endfunction
`define logN log2N(N)
function even;
input [31:0] num;
begin
even = ~num[0];
end
endfunction
integer I, J, K, NUM;
reg [N-1:0] temp, result;
// and now the real fun begins
// sorry I did not have time to comment this in detail
//
// the basic idea is that you loop through the inputs one pair at a time in
// the inner loop
// the results are stored in a temp array which is 1/2 the length
// the outer loop goes through "temp", which shrinks by 1/2 on each iteration
//
// that is the basic idea and I do not have time for more details at this
// time (sorry!)
//
always @(A)
begin
temp[N-1:0] = A[N-1:0];
NUM = N;
for (K=`logN-1; K>=0; K=K-1)
begin
J = (NUM+1)/2;
J = J-1;
if (even(NUM))
for (I=NUM-1; I>=0; I=I-2)
begin
result[J] = temp[I] ^ temp[I-1];
J = J-1;
end
else
begin
for (I=NUM-1; I>=1; I=I-2)
begin
result[J] = temp[I] ^ temp[I-1];
J = J-1;
end
result[0] = temp[0];
end
temp[N-1:0] = result[N-1:0];
NUM = (NUM+1)/2;
end
Z = result[0];
end
endmodule
Thanks for all the replies - it appears I opened a small can of worms, or a can
of small worms, and also the Verilog/VHDL debate. Frankly I like the casex
Verilog form the best; it synthesised nicely even without having Synopsys!! And
yes - 4 gate delays seems to be as fast as it gets.
Ed Chester
Univ. of Newcastle (UK) Elec. & Elec. Eng.