Field-programmable Gate Arrays

0 views

Skip to first unread message

Gaetan Horton

unread,

Aug 4, 2024, 12:11:11 PM8/4/24

to ningniledest

A field-programmable gate array (FPGA) is a type of configurable integrated circuit that can be repeatedly programmed after manufacturing. FPGAs are a subset of logic devices referred to as programmable logic devices (PLDs). They consist of an array of programmable logic blocks with a connecting grid, that can be configured "in the field" to interconnect with other logic blocks to perform various digital functions. FPGAs are often used in limited (low) quantity production of custom-made products, and in research and development, where the higher cost of individual FPGAs is not as important, and where creating and manufacturing a custom circuit wouldn't be feasible. Other applications for FPGAs include the telecommunications, automotive, aerospace, and industrial sectors, which benefit from their flexibility, high signal processing speed, and parallel processing abilities.

A FPGA configuration is generally written using a hardware description language (HDL) e.g. VHDL, similar to the ones used for application-specific integrated circuits (ASICs). Circuit diagrams were formerly used to write the configuration.

The logic blocks of an FPGA can be configured to perform complex combinational functions, or act as simple logic gates like AND and XOR. In most FPGAs, logic blocks also include memory elements, which may be simple flip-flops or more sophisticated blocks of memory.[1] Many FPGAs can be reprogrammed to implement different logic functions, allowing flexible reconfigurable computing as performed in computer software.

FPGAs also have a role in embedded system development due to their capability to start system software development simultaneously with hardware, enable system performance simulations at a very early phase of the development, and allow various system trials and design iterations before finalizing the system architecture.[2]

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable).[3]

In 1987, the Naval Surface Warfare Center funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful and a patent related to the system was issued in 1992.[3]

Altera and Xilinx continued unchallenged and quickly grew from 1985 to the mid-1990s when competitors sprouted up, eroding a significant portion of their market share. By 1993, Actel (later Microsemi, now Microchip) was serving about 18 percent of the market.[6]

The 1990s were a period of rapid growth for FPGAs, both in circuit sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.[8]

Companies like Microsoft have started to use FPGAs to accelerate high-performance, computationally intensive systems (like the data centers that operate their Bing search engine), due to the performance per watt advantage FPGAs deliver.[10] Microsoft began using FPGAs to accelerate Bing in 2014, and in 2018 began deploying FPGAs across other data center workloads for their Azure cloud computing platform.[11]

Contemporary FPGAs have ample logic gates and RAM blocks to implement complex digital computations. FPGAs can be used to implement any logical function that an ASIC can perform. The ability to update the functionality after shipping, partial re-configuration of a portion of the design[18] and the low non-recurring engineering costs relative to an ASIC design (notwithstanding the generally higher unit cost), offer advantages for many applications.[1]

As FPGA designs employ very fast I/O rates and bidirectional data buses, it becomes a challenge to verify correct timing of valid data within setup time and hold time.[19] Floor planning helps resource allocation within FPGAs to meet these timing constraints.

Some FPGAs have analog features in addition to digital functions. The most common analog feature is a programmable slew rate on each output pin, allowing the engineer to set low rates on lightly loaded pins that would otherwise ring or couple unacceptably, and to set higher rates on heavily loaded high-speed channels that would otherwise run too slowly.[20][21] Also common are quartz-crystal oscillator driver circuitry, on-chip RC oscillators, and phase-locked loops with embedded voltage-controlled oscillators used for clock generation and management as well as for high-speed serializer-deserializer (SERDES) transmit clocks and receiver clock recovery. Fairly common are differential comparators on input pins designed to be connected to differential signaling channels. A few mixed signal FPGAs have integrated peripheral analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) with analog signal conditioning blocks, allowing them to operate as a system-on-a-chip (SoC).[22] Such devices blur the line between an FPGA, which carries digital ones and zeros on its internal programmable interconnect fabric, and field-programmable analog array (FPAA), which carries analog values on its internal programmable interconnect fabric.

The most common FPGA architecture consists of an array of logic blocks called configurable logic blocks (CLBs) or logic array blocks (LABs) (depending on vendor), I/O pads, and routing channels.[1] Generally, all the routing channels have the same width (number of signals). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

"An application circuit must be mapped into an FPGA with adequate resources. While the number of logic blocks and I/Os required is easily determined from the design, the number of routing channels needed may vary considerably even among designs with the same amount of logic. For example, a crossbar switch requires much more routing than a systolic array with the same gate count. Since unused routing channels increase the cost (and decrease the performance) of the FPGA without providing any benefit, FPGA manufacturers try to provide just enough channels so that most designs that will fit in terms of lookup tables (LUTs) and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs."[23]

In general, a logic block consists of a few logical cells. A typical cell consists of a 4-input LUT, a full adder (FA) and a D-type flip-flop. The LUT might be split into two 3-input LUTs. In normal mode those are combined into a 4-input LUT through the first multiplexer (mux). In arithmetic mode, their outputs are fed to the adder. The selection of mode is programmed into the second mux. The output can be either synchronous or asynchronous, depending on the programming of the third mux. In practice, the entire adder or parts of it are stored as functions into the LUTs in order to save space.[24][25][26]

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed in silicon. Having these common functions embedded in the circuit reduces the area required and gives those functions increased performance compared to building them from logical primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

Higher-end FPGAs can contain high-speed multi-gigabit transceivers and hard IP cores such as processor cores, Ethernet medium access control units, PCI or PCI Express controllers, and external memory controllers. These cores exist alongside the programmable fabric, but they are built out of transistors instead of LUTs so they have ASIC-level performance and power consumption without consuming a significant amount of fabric resources, leaving more of the fabric free for the application-specific logic. The multi-gigabit transceivers also contain high-performance signal conditioning circuitry along with high-speed serializers and deserializers, components that cannot be built out of LUTs. Higher-level physical layer (PHY) functionality such as line coding may or may not be implemented alongside the serializers and deserializers in hard logic, depending on the FPGA.

In 2012 the coarse-grained architectural approach was taken a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete system on a programmable chip. Examples of such hybrid technologies can be found in the Xilinx Zynq-7000 all Programmable SoC,[27] which includes a 1.0 GHz dual-core ARM Cortex-A9 MPCore processor embedded within the FPGA's logic fabric,[28] or in the Altera Arria V FPGA, which includes an 800 MHz dual-core ARM Cortex-A9 MPCore. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture. The Microsemi SmartFusion devices incorporate an ARM Cortex-M3 hard processor core (with up to 512 kB of flash and 64 kB of RAM) and analog peripherals such as a multi-channel analog-to-digital converters and digital-to-analog converters in their flash memory-based FPGA fabric.[citation needed]

Most of the logic inside of an FPGA is synchronous circuitry that requires a clock signal. FPGAs contain dedicated global and regional routing networks for clock and reset, typically implemented as an H tree, so they can be delivered with minimal skew. FPGAs may contain analog phase-locked loop or delay-locked loop components to synthesize new clock frequencies and manage jitter. Complex designs can use multiple clocks with different frequency and phase relationships, each forming separate clock domains. These clock signals can be generated locally by an oscillator or they can be recovered from a data stream. Care must be taken when building clock domain crossing circuitry to avoid metastability. Some FPGAs contain dual port RAM blocks that are capable of working with different clocks, aiding in the construction of building FIFOs and dual port buffers that bridge clock domains.