Verilog Code For Serial Adder With Accumulator

Facunda Ganesh

Jun 14, 2024, 1:47:42 PM
to iminbager

I am writing VHDL code to implement an 8-bit serial adder with an accumulator. When I simulate it, the output is always zero, and sometimes it gives me the same number but shifted. I don't know what the problem is. I tried declaring A and B as inout, but that didn't work either. Can anybody help, please?
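A bit-accurate behavioral model can help when checking a simulation like this one. The sketch below is not the poster's design and is written in Python rather than an HDL. It assumes an LSB-first serial adder built from two shift registers, one full adder, and a carry flip-flop; the name `serial_add` is hypothetical.

```python
def serial_add(a: int, b: int, width: int = 8):
    """Bit-serial addition model: returns (sum, carry_out) after `width` clocks."""
    mask = (1 << width) - 1
    acc = a & mask       # accumulator shift register: holds A, ends up holding the sum
    addend = b & mask    # addend shift register: holds B
    carry = 0            # carry flip-flop, cleared before the first bit
    for _ in range(width):
        x = acc & 1                      # LSB of the accumulator
        y = addend & 1                   # LSB of the addend
        s = x ^ y ^ carry                # full-adder sum bit
        carry = (x & y) | (carry & (x ^ y))  # full-adder carry, stored for the next clock
        acc = (acc >> 1) | (s << (width - 1))        # shift right, sum bit enters at the MSB
        addend = (addend >> 1) | (y << (width - 1))  # rotate B so it is preserved
    return acc, carry


print(serial_add(5, 3))      # (8, 0)
print(serial_add(200, 100))  # (44, 1)
```

In this model the sum is only aligned after exactly `width` shifts, so a shift count that is off by one produces a correct-looking but shifted result, and a carry flip-flop that is never loaded or is cleared on every clock corrupts the sum. Both are worth checking against the symptoms described above.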

I am implementing TxRx on a Zynq chip. My design works, but I would like to optimize it. According to the utilization report, my DSP slices are not being used, and I would like the multiplication operations to run on DSP slices. I am just starting with FPGAs. Are there any guidelines on how to target FPGA DSP slices for multiplication from my Verilog code? How should I write the functions that contain multiplications?


DOWNLOAD https://t.co/xk5Mdcjb5t



The best way to learn how to use DSPs in Verilog with Xilinx's FPGAs is to read the synthesis guide. You will find both guidelines and examples in VHDL and Verilog that will map to DSPs, including use of the pre-adder and accumulator should you require them.

This paper presents a high-speed direct digital frequency synthesizer (DDFS) with a pipelined phase accumulator (PA) based on a modified parallel Brent-Kung (BK) adder and a gated clock technique. The ROM was reduced by exploiting quarter-wave symmetry, so that only one quarter of the sine wave is stored, and an angular decomposition technique based on a trigonometric identity was applied to compress the quarter-wave ROM LUT. Based on these techniques, the quarter-wave ROM LUT was partitioned into three sub-ROMs. The proposed architecture improves the speed of the DDFS and reduces the size of the ROMs.
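The quarter-wave symmetry idea can be sketched as follows. This is a minimal Python model, not the paper's implementation: it stores one quarter of a sine wave and rebuilds the full cycle from the top two phase bits. The function names, the half-step sample offset, and the table sizes are assumptions, and the paper's further split into three sub-ROMs by angular decomposition is not modeled.

```python
import math


def build_quarter_rom(addr_bits: int, amp_bits: int) -> list:
    """Store sin() for the first quadrant only, sampled at half-step offsets."""
    n = 1 << addr_bits
    scale = (1 << (amp_bits - 1)) - 1
    return [round(scale * math.sin(math.pi / 2 * (i + 0.5) / n)) for i in range(n)]


def sine_from_quarter(phase: int, rom: list, addr_bits: int) -> int:
    """Look up a full-cycle sine sample; `phase` is addr_bits + 2 bits wide."""
    n = 1 << addr_bits
    quadrant = (phase >> addr_bits) & 0b11   # top two bits select the quadrant
    index = phase & (n - 1)                  # remaining bits address the ROM
    if quadrant & 1:                         # quadrants 1 and 3: read the table backwards
        index = n - 1 - index
    value = rom[index]
    return -value if quadrant & 2 else value  # quadrants 2 and 3: negate the sample


rom = build_quarter_rom(addr_bits=4, amp_bits=8)
cycle = [sine_from_quarter(p, rom, 4) for p in range(64)]
print(len(rom), len(cycle))  # 16 64
```

The table holds 16 entries but serves 64 phase values, which is the fourfold ROM reduction that quarter-wave symmetry provides.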

The modified parallel BK adder, based on the progression-of-states technique combined with the gated clock technique, was used in the proposed design of the PA. The frequency resolution ($\Delta f$) of the DDFS is determined by the clock frequency ($f_{clk}$) and the number of input bits $N$ of the PA, as given by $\Delta f = f_{clk} / 2^{N}$.
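As a quick numeric check of the standard DDFS relations, the sketch below computes the frequency resolution and the output frequency for a given FCW. The function names and the 100 MHz / 32-bit figures are illustrative assumptions, not values from the paper.

```python
def ddfs_resolution(f_clk_hz: float, acc_bits: int) -> float:
    """Smallest frequency step: clock frequency divided by 2**acc_bits."""
    return f_clk_hz / (1 << acc_bits)


def ddfs_output_frequency(fcw: int, f_clk_hz: float, acc_bits: int) -> float:
    """Output frequency for a given frequency control word."""
    return fcw * f_clk_hz / (1 << acc_bits)


# Assumed example: 100 MHz clock, 32-bit phase accumulator.
print(ddfs_resolution(100e6, 32))                 # roughly 0.0233 Hz per FCW step
print(ddfs_output_frequency(1 << 30, 100e6, 32))  # 25000000.0
```

Each extra accumulator bit halves the step size, which is why a wide PA is preferred when fine resolution matters.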

For high frequency resolution, it is preferable to design a PA with a wide FCW input. However, a large ROM would be required to use all the bits of the phase accumulator output. For this reason, only the most significant bits of the phase output are used to address the phase-to-amplitude converter (the ROM lookup table), while the full accumulator width maintains the high frequency resolution. The pipeline technique was used to increase the throughput of the accumulator, and this throughput will double with the number of pipeline stages, as shown in Figure 1.
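Phase truncation amounts to keeping only the top bits of the accumulator value. Here is a minimal sketch, assuming a 32-bit accumulator and a 14-bit ROM address; the 32-bit width and the name `rom_address` are illustrative assumptions.

```python
def rom_address(phase: int, acc_bits: int, addr_bits: int) -> int:
    """Keep the addr_bits most significant bits of an acc_bits-wide phase value."""
    phase &= (1 << acc_bits) - 1            # confine the phase to the accumulator width
    return phase >> (acc_bits - addr_bits)  # drop the low-order bits


print(rom_address(0xFFFFFFFF, 32, 14))  # 16383
print(rom_address(1 << 18, 32, 14))     # 1
```

The discarded low-order bits still accumulate, so the frequency resolution is unchanged; only the phase value presented to the ROM is coarser.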

The number of registers increases with the number of pipeline stages, which leads to high power consumption. Therefore, in this design, a gated clock technique was used to reduce the number of preskewing registers while preserving high-speed operation. In this technique, D flip-flops (DFFs) were used to connect each row of the pipeline stages with the FCW input. These registers are clocked by the pipelined pulses with one clock cycle based on the shifted clock pulses, as shown in Figure 2(a). Considering that the phase accumulator input bits are [value missing in source], the PA was partitioned into stages with B DFFs in each stage. The number of DFFs for the preskewing registers is given by [equation missing in source]. By applying the gated clock technique to the proposed design, the number of DFFs is given by [equation missing in source]. As a result, with the gated clock technique, the number of preskewing registers has been reduced from 80 to 36, corresponding to a 53.7% reduction.

This operation holds the frequency tuning word constant for four clock cycles without causing any imperfections in the PA output. The partitioned clock cycles make the multiplexers choose one of the results at the output of the PA to overcome the holding time on the parallel adders, as illustrated in Figure 2(c).

The proposed phase accumulator architecture based on the modified parallel BK adder and the gated clock technique with pipelining stages is shown in Figure 3. The output of the PA is a truncated 14-bit value that is achieved from the 8 and 6 bits of the top and second pipelining stages, respectively.

The general prefix addition algorithm is explained by Zimmermann in [25]. By adding the carry input to the prefix structure with some modifications, the prefix structure can be used in pipelining-based adder design. This approach is used in BK adder fast carry computation. However, in this paper, a modification to the BK adder is proposed so that it can be used in a pipelined architecture. The proposed modification removes the operation of the [symbol missing in source], and the carry out of the first bit is obtained from a 2-1 multiplexer. The inputs to this multiplexer are $c_{in}$ and $a_0$, while $p$ is the select input and the output is $c_{out}$. The operation of the multiplexer is given by $c_{out} = p \cdot c_{in} + \bar{p} \cdot a_0$, where $c_{out}$ is the carry out, $p$ is the propagate function, $c_{in}$ is the carry input, and $a_0$ is the first bit input. The proposed modification of the 8-bit BK adder is shown in Figure 4(b). The sum and carry out of the modified 8-bit BK adder are shown in [equation missing in source].

An adder is a key element of the pipelining PA design, and a fast adder improves PA performance. Parallel-prefix adder tree structures such as Sklansky [26], Kogge-Stone adder [27], BK [23], and Beaumont-Smith [28] have been used in pipelining accumulator design for high-speed operation.
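To illustrate how a parallel-prefix adder computes its carries, here is a minimal Python model of a Kogge-Stone network, chosen because its regular structure is the easiest to write down. It is not the paper's modified BK adder, and the function name and the way the carry input is folded into bit 0 are assumptions made for this sketch.

```python
def kogge_stone_add(a: int, b: int, width: int = 8, carry_in: int = 0):
    """Return (sum, carry_out) using generate/propagate prefix computation."""
    # Bit-level generate and propagate signals, LSB first.
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]

    # Group signals; the carry input is folded into bit 0's generate.
    G, P = g[:], p[:]
    G[0] = g[0] | (p[0] & carry_in)

    # Prefix stages: each stage doubles the span covered by every group.
    dist = 1
    while dist < width:
        new_g, new_p = G[:], P[:]
        for i in range(dist, width):
            new_g[i] = G[i] | (P[i] & G[i - dist])
            new_p[i] = P[i] & P[i - dist]
        G, P = new_g, new_p
        dist *= 2

    # After the last stage, G[i] is the carry out of bit i.
    carries = [carry_in] + G[:-1]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i   # sum bit = propagate XOR incoming carry
    return total, G[width - 1]


print(kogge_stone_add(200, 100))       # (44, 1)
print(kogge_stone_add(15, 1, 8, 1))    # (17, 0)
```

An 8-bit adder needs three prefix stages here (distances 1, 2, 4) instead of an 8-bit ripple chain. BK and Sklansky networks compute the same carries with different trade-offs between depth, fan-out, and wiring.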

A comparison has been made between a conventional adder and several parallel-prefix adders for 12-bit, 18-bit, 24-bit, and 32-bit operations. The PA designs were coded in Verilog HDL and verified on a Cyclone III FPGA development board. Prior to that, all the designs were simulated using Altera Quartus II. The comparison result is shown in Figure 5. From the figure, it can be seen that the BK adder is relatively faster, especially for large bit widths.

Accumulator: The accumulator is an 8-bit buffer register that stores intermediate answers during a computer run. The accumulator has two outputs. The two-state output goes directly to the adder-subtractor and the three-state output goes to the bus. This implies that the 8-bit accumulator word continuously drives the adder-subtractor but only appears on the W bus when Ea is high.

B-Register: The B-register is a buffer register used in performing arithmetic operations. It supplies the number to be added to or subtracted from the contents of the accumulator to the adder-subtractor. When data is available on the bus and Lb is low, the B-register loads and stores the data at the positive clock edge.
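The register behavior described above can be modeled in a few lines. This Python sketch is a simplified assumption rather than a full design: it treats Lb as active low and Ea as active high as stated in the text, and additionally assumes an active-low accumulator load signal (called `la_n` here), which the text does not mention. The class and method names are hypothetical.

```python
from typing import Optional


class SapRegisters:
    """Accumulator and B-register feeding an 8-bit adder-subtractor."""

    def __init__(self) -> None:
        self.acc = 0
        self.b = 0

    def clock_edge(self, bus: int, la_n: int = 1, lb_n: int = 1) -> None:
        """Positive clock edge: a register loads the bus when its load line is low."""
        if la_n == 0:            # assumed active-low accumulator load
            self.acc = bus & 0xFF
        if lb_n == 0:            # Lb low: B-register loads the bus
            self.b = bus & 0xFF

    def adder_out(self, subtract: bool = False) -> int:
        """Two-state outputs drive the adder-subtractor continuously."""
        operand = -self.b if subtract else self.b
        return (self.acc + operand) & 0xFF

    def bus_drive(self, ea: int) -> Optional[int]:
        """Three-state output: the accumulator reaches the W bus only when Ea is high."""
        return self.acc if ea else None   # None models high impedance


regs = SapRegisters()
regs.clock_edge(0x2A, la_n=0)   # load accumulator with 42
regs.clock_edge(0x05, lb_n=0)   # load B-register with 5
print(regs.adder_out(), regs.adder_out(subtract=True))  # 47 37
print(regs.bus_drive(0), regs.bus_drive(1))             # None 42
```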

Basically, it is implemented with an accumulator that adds a constant value, the FCW (Frequency Control Word), on every clock. The accumulator wraps around every time it exceeds its maximum value. For example, if the maximum value is 15 (4-bit accumulator) and the FCW = 4, the accumulator will count: 0, 4, 8, 12, 0, 4, ...
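The wrap-around counting can be reproduced with a short Python model. The function name and the optional start phase argument are assumptions for illustration.

```python
def phase_accumulator(fcw: int, bits: int, steps: int, start: int = 0) -> list:
    """Return the first `steps` phase values of a `bits`-wide accumulator."""
    modulus = 1 << bits          # a 4-bit accumulator wraps at 16
    phase = start % modulus
    values = []
    for _ in range(steps):
        values.append(phase)
        phase = (phase + fcw) % modulus   # add the FCW and discard the overflow
    return values


print(phase_accumulator(fcw=4, bits=4, steps=6))  # [0, 4, 8, 12, 0, 4]
print(phase_accumulator(fcw=5, bits=4, steps=5))  # [0, 5, 10, 15, 4]
```

When the FCW does not divide the modulus evenly, as in the second call, the overflow remainder carries into the next cycle, which is what lets an NCO produce frequencies that are not simple divisions of the clock.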

Figure 8 shows the simulation of the VHDL code of the NCO with start phase. The test bench instantiates two identical 8-bit NCOs with the same Frequency Control Word (FCW) and different start phases, as in Figure 7.

I have Verilog-A code for an ideal ADC. I am using it in Virtuoso Spectre and I am not familiar with Verilog-A at all.
But this ADC works on the rising edge of the clock, and I want my ADC to work on the falling edge.

Or alternatively, is there a clean way to reset sumOfVoltages to zero on each clock edge, without creating a race condition between the sequential and combinational logic? The target is a Xilinx Zynq FPGA, so I need synthesizable code, not just a test bench.

I updated your code. I believe this is what you want. Please note that adding serially this way can be quite slow, and you should consider putting the adder in a synchronous block. Also, if you do not really need a pure average, you could use a running average, as Dave Tweed indicated.

The Complex Multiplier template shows how to model a complex multiplier-accumulator and manually pipeline the intermediate stages. The hardware implementation of complex multiplication uses four multipliers and two adders.
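The four-multiplier, two-adder structure follows directly from expanding (ar + j·ai)(br + j·bi). The Python sketch below models that arithmetic and an accumulate step; it is not the template's HDL, the function names are hypothetical, and pipelining is not modeled.

```python
def complex_multiply(ar: int, ai: int, br: int, bi: int):
    """Multiply (ar + j*ai) by (br + j*bi) with four multipliers and two adders."""
    m1 = ar * br   # multiplier 1
    m2 = ai * bi   # multiplier 2
    m3 = ar * bi   # multiplier 3
    m4 = ai * br   # multiplier 4
    real = m1 - m2   # adder 1 (used as a subtractor)
    imag = m3 + m4   # adder 2
    return real, imag


def complex_mac(acc, a, b):
    """Accumulate one complex product: acc + a * b, all as (real, imag) pairs."""
    pr, pi = complex_multiply(a[0], a[1], b[0], b[1])
    return acc[0] + pr, acc[1] + pi


print(complex_multiply(1, 2, 3, 4))         # (-5, 10)
print(complex_mac((1, 1), (1, 2), (3, 4)))  # (-4, 11)
```

In hardware, each of the four products and the two sums would typically sit behind its own register stage, which is the manual pipelining the template refers to.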
