Performance driven FPGA design with an ASIC perspective
FPGA devices are an important component in many modern devices. This means that it is important that VLSI designers have a thorough knowledge of how to optimize designs for FPGAs. While the design flows for ASICs and FPGAs are similar, there are many differences as well due to the limitations inherent in FPGA devices. To be able to use an FPGA efficiently it is important to be aware of both the strengths and oweaknesses of FPGAs. If an FPGA design should be ported to an ASIC at a later stage it is also important to take this into account early in the design cycle so that the ASIC port will be efficient. This thesis investigates how to optimize a design for an FPGA through a number of case studies of important SoC components. One of these case studies discusses high speed processors and the tradeoffs that are necessary when constructing very high speed processors in FPGAs. The processor has a maximum clock frequency of 357~MHz in a Xilinx Virtex-4 devices of the fastest speedgrade, which is significantly higher than Xilinx' own processor in the same FPGA. Another case study investigates floating point datapaths and describes how a floating point adder and multiplier can be efficiently implemented in an FPGA. The final case study investigates Network-on-Chip architectures and how these can be optimized for FPGAs. The main focus is on packet switched architectures, but a circuit switched architecture optimized for FPGAs is also investigated. All of these case studies also contain information about potential pitfalls when porting designs optimized for an FPGA to an ASIC. The focus in this case is on systems where initial low volume production will be using FPGAs while still keeping the option open to port the design to an ASIC if the demand is high. This information will also be useful for designers who want to create IP cores that can be efficiently mapped to both FPGAs and ASICs. Finally, a framework is also presented which allows for the creation of custom backend tools for the Xilinx design flow. The framework is already useful for some tasks, but the main reason for including it is to inspire researchers and developers to use this powerful ability in their own design tools.
FPGA-based reconfigurable on-board computing systems for space applications
The purpose of the thesis is to conceptualize an application method of ground-based reconfigurable FPGA (Field Programmable Gate Array) technologies for space systems and to apply the method to the on-board computer of the small satellite Flying Laptop for the on-orbit demonstration. The Flying Laptop satellite is the first small satellite within the Stuttgart small satellite program'' in which several small satellites are developed by the Institute of Space Systems at the Universität Stuttgart. The main mission of the Flying Laptop is to demonstrate the space use of reconfigurable FPGAs for the reconfigurable computing'' on an central on-board computer aboard a spacecraft. Due to their radiation vulnerabilities reconfigurable FPGAs have not yet been employed in practical space applications with high reliability requirements. The Flying Laptop project aims to achieve the world's first orbit demonstration of a purely FPGA-based central on-board computer. Within this research firstly, application methods of reconfigurable FPGAs for space systems were investigated, which are not limited to small satellites but for general space systems. The investigation is based on thorough experimental data survey and analysis of radiation effects on existing FPGA devices. Main radiation effects of single event effects and total ionizing dose effects were extensively investigated. Based on the data obtained, a combinational use of SRAM-FPGAs (multi-chip redundant) and Flash-FPGAs (voting element) for mitigating radiation effects was conceptualized. A mathematical system reliability analysis of repairable multi-redundant systems has been. The analysis illustrates that a multi-redundant system based on SRAM-FPGAs together with a Flash-FPGA based voter provides a sufficiently high reliability for Low Earth Orbit (LEO) missions against radiation effects. After the conceptualization of application methods of reconfigurable FPGAs for the space environment, it is applied to the on-board computer of the small satellite Flying Laptop. Flying Laptop is a cubic, 3-axis stabilized satellite with the edge lengths of about 600mm x 700mm x 800mm and a mass of about 120kg, which shall be launched into sun-synchronous LEO in an altitude of around 600km. A system architecture with four SRAM-FPGA based central processing nodes and one Flash-FPGA based voter was applied for the on-board computer of the Flying Laptop. This on-board computer is the central computing system aboard the satellite and shall be capable of controlling all satellite peripheral electronics. First of all, the system design of the whole satellite has been conducted within the scope of the thesis in order to allow the design of the on-board computer. Based on the established system requirements, the on-board computer of the Flying Laptop was designed and the breadboard model and partly the engineering model of its components are developed. The hardware logic (control algorithm) which shall be implemented into FPGAs can be designed by means of hardware description languages. However, it is no longer software engineering but hardware engineering for generating real hardware logics inside FPGAs which are executed in parallel in real-time. The satellite main functions are designed, developed, and implemented in FPGAs by means of the hardware description languages Handel-C and VHDL. The thesis provides development methods of the control algorithms. In addition to this, a control algorithm development facility has been established for the further design activities. Finally, the developed control algorithms are verified in a simulation and verification environment in order to prove the validities of the above described developments. First of all, an FPGA hardware-in-the-loop real-time simulation environment has been established based on the Model-based Development and Verification Environment (MDVE). MDVE was established at the Institute of Space Systems supported by EADS Astrium. The communication interface between the MDVE and FPGAs are developed, including the required hardware components and the serialization algorithms of communication lines inside an FPGA. Using this simulation and verification environment, extensive simulations have been conducted and the design of the on-board computer, as well as the system design of the whole satellite are validated. At the end, an extended investigation has been conducted on formal verification methods of the hardware-logic in order to provide the way of strict design verifications. This thesis establishes the basis of principle application methods of reconfigurable FPGA technologies for reconfigurable computing'' on space systems which provides innovative solutions for high computational demands of future space applications.
Security for volatile FPGAs
With recongurable devices fast becoming complete systems in their own right, interest in their security properties has increased. While research on "FPGA security" has been active since the early 2000s, few have treated the field as a whole, or framed its challenges in the context of the unique FPGA usage model and application space. This dissertation sets out to examine the role of FPGAs within a security system and how solutions to security challenges can be provided. I offer the following contributions. I motivate authenticating configurations as an additional capability to FPGA configuration logic, and then describe a exible security protocol for remote reconfiguration of FPGA-based systems over insecure networks. Non-volatile memory devices are used for persistent storage when required, and complement the lack of features in some FPGAs with tamper proong in order to maintain specified security properties. A unique advantage of the protocol is that it can be implemented on some existing FPGAs (i.e., it does not require FPGA vendors to add functionality to their devices). Also proposed is a solution to the "IP distribution problem" where designs from multiple sources are integrated into a single bitstream, yet must maintain their condentiality. I discuss the diculty of reproducing and comparing FPGA implementation results reported in the academic literature. Concentrating on cryptographic implementations, problems are demonstrated through designing three architecture-optimized variants of the AES block cipher and analyzing the results to show that single figures of merit, namely "throughput" or "throughput per slice", are often meaningless without the context of an application. To set a precedent for reproducibility in our field, the HDL source code, simulation testbenches and compilation instructions are made publicly available for scrutiny and reuse. Finally, I examine payment systems as ubiquitous embedded devices, and evaluate their security vulnerabilities as they interact in a multi-chip environment. Using FPGAs as an adversarial tool, a man-in-the-middle attack against these devices is demonstrated. An FPGA-based defense is also demonstrated: the first secure wired "distance bounding" protocol implementation. This is then put in the context of securing recongurable systems.
Numerical Solutions of Differential Equations on FPGA-Enhanced Computers
Conventionally, to speed up scientific or engineering (S&E) computation programs on general-purpose computers, one may elect to use faster CPUs, more memory, systems with more efficient (though complicated) architecture, better software compilers, or even coding with assembly languages. With the emergence of Field Programmable Gate Array (FPGA) based Reconfigurable Computing (RC) technology, numerical scientists and engineers now have another option using FPGA devices as core components to address their computational problems. The hardware-programmable, low-cost, but powerful “FPGA-enhanced computer” has now become an attractive approach for many S&E applications. A new computer architecture model for FPGA-enhanced computer systems and its detailed hardware implementation are proposed for accelerating the solutions of computationally demanding and data intensive numerical PDE problems. New FPGAoptimized algorithms/methods for rapid executions of representative numerical methods such as Finite Difference Methods (FDM) and Finite Element Methods (FEM) are designed, analyzed, and implemented on it. Linear wave equations based on seismic data processing applications are adopted as the targeting PDE problems to demonstrate the effectiveness of this new computer model. Their sustained computational performances are compared with pure software programs operating on commodity CPUbased general-purpose computers. Quantitative analysis is performed from a hierarchical set of aspects as customized/extraordinary computer arithmetic or function units, compact but flexible system architecture and memory hierarchy, and hardwareoptimized numerical algorithms or methods that may be inappropriate for conventional general-purpose computers. The preferable property of in-system hardware reconfigurability of the new system is emphasized aiming at effectively accelerating the execution of complex multi-stage numerical applications. Methodologies for accelerating the targeting PDE problems as well as other numerical PDE problems, such as heat equations and Laplace equations utilizing programmable hardware resources are concluded, which imply the broad usage of the proposed FPGA-enhanced computers.
Physical Synthesis Toolkit for Area and Power Optimization on FPGAs
A Field-Programmable Gate Array (FPGA) is a configurable platform for implementing a variety of logic circuits. It implements a circuit by the means of logic elements, usually Lookup Tables, connected by a programmable routing network. To utilize an FPGA effectively Computer Aided Design (CAD) tools have been developed. These tools implement circuits by using a traditional CAD flow, where the circuit is analyzed, synthesized, technology mapped, and finally placed and routed on the FPGA fabric. This flow, while generally effective, can produce sub-optimal results because once a stage of the flow is completed it is not revisited. This problem is addressed by an enhanced flow known Physical Synthesis, which consists of a set of iterations of the traditional flow with one key difference: the result of each iteration directly affects the result of the following iteration. An optimization can therefore be evaluated and then adjusted as needed in the following iterations, resulting in an overall better implementation. This CAD flow is challenging to work with because for a given FPGA researchers require access to each stage of the flow in an iterative fashion. This is particularly challenging when targeting modern commercial FPGAs, which are far more complex than a simple Lookup Table and Flip-Flop model generally used by the academic community. This dissertation describes a unified framework, called the Physical Synthesis Toolkit (PST), for research and development of optimizations for modern FPGA devices. PST provides access to modern FPGA devices and CAD tool flow to facilitate research. At the same time the amount of effort required to adapt the framework to a new FPGA device is kept to a minimum. To demonstrate that PST is an effective research platform, this dissertation describes optimization and modeling techniques that were implemented inside of it. The optimizations include: an area reduction technique for XOR-based logic circuits implemented on a 4-LUT based FPGA (25.3% area reduction), and a dynamic power reduction technique that reduces glitches in a circuit implemented on an Altera Stratix II FPGA (7% dynamic power reduction). The modeling technique is a novel toggle rate estimation approach based on the XOR-based decomposition, which reduces the estimate error by 37% as compared to the latest release of the Altera Quartus II CAD tool.
Performance driven FPGA design with an ASIC perspective
FPGA devices are an important component in many modern devices. This means that it is important that VLSI designers have a thorough knowledge of how to optimize designs for FPGAs. While the design flows for ASICs and FPGAs are similar, there are many differences as well due to the limitations inherent in FPGA devices. To be able to use an FPGA efficiently it is important to be aware of both the strengths and oweaknesses of FPGAs. If an FPGA design should be ported to an ASIC at a later stage it is also important to take this into account early in the design cycle so that the ASIC port will be efficient. This thesis investigates how to optimize a design for an FPGA through a number of case studies of important SoC components. One of these case studies discusses high speed processors and the tradeoffs that are necessary when constructing very high speed processors in FPGAs. The processor has a maximum clock frequency of 357~MHz in a Xilinx Virtex-4 devices of the fastest speedgrade, which is significantly higher than Xilinx' own processor in the same FPGA. Another case study investigates floating point datapaths and describes how a floating point adder and multiplier can be efficiently implemented in an FPGA. The final case study investigates Network-on-Chip architectures and how these can be optimized for FPGAs. The main focus is on packet switched architectures, but a circuit switched architecture optimized for FPGAs is also investigated. All of these case studies also contain information about potential pitfalls when porting designs optimized for an FPGA to an ASIC. The focus in this case is on systems where initial low volume production will be using FPGAs while still keeping the option open to port the design to an ASIC if the demand is high. This information will also be useful for designers who want to create IP cores that can be efficiently mapped to both FPGAs and ASICs. Finally, a framework is also presented which allows for the creation of custom backend tools for the Xilinx design flow. The framework is already useful for some tasks, but the main reason for including it is to inspire researchers and developers to use this powerful ability in their own design tools.
FPGA-based reconfigurable on-board computing systems for space applications
The purpose of the thesis is to conceptualize an application method of ground-based reconfigurable FPGA (Field Programmable Gate Array) technologies for space systems and to apply the method to the on-board computer of the small satellite Flying Laptop for the on-orbit demonstration. The Flying Laptop satellite is the first small satellite within the Stuttgart small satellite program'' in which several small satellites are developed by the Institute of Space Systems at the Universität Stuttgart. The main mission of the Flying Laptop is to demonstrate the space use of reconfigurable FPGAs for the reconfigurable computing'' on an central on-board computer aboard a spacecraft. Due to their radiation vulnerabilities reconfigurable FPGAs have not yet been employed in practical space applications with high reliability requirements. The Flying Laptop project aims to achieve the world's first orbit demonstration of a purely FPGA-based central on-board computer. Within this research firstly, application methods of reconfigurable FPGAs for space systems were investigated, which are not limited to small satellites but for general space systems. The investigation is based on thorough experimental data survey and analysis of radiation effects on existing FPGA devices. Main radiation effects of single event effects and total ionizing dose effects were extensively investigated. Based on the data obtained, a combinational use of SRAM-FPGAs (multi-chip redundant) and Flash-FPGAs (voting element) for mitigating radiation effects was conceptualized. A mathematical system reliability analysis of repairable multi-redundant systems has been. The analysis illustrates that a multi-redundant system based on SRAM-FPGAs together with a Flash-FPGA based voter provides a sufficiently high reliability for Low Earth Orbit (LEO) missions against radiation effects. After the conceptualization of application methods of reconfigurable FPGAs for the space environment, it is applied to the on-board computer of the small satellite Flying Laptop. Flying Laptop is a cubic, 3-axis stabilized satellite with the edge lengths of about 600mm x 700mm x 800mm and a mass of about 120kg, which shall be launched into sun-synchronous LEO in an altitude of around 600km. A system architecture with four SRAM-FPGA based central processing nodes and one Flash-FPGA based voter was applied for the on-board computer of the Flying Laptop. This on-board computer is the central computing system aboard the satellite and shall be capable of controlling all satellite peripheral electronics. First of all, the system design of the whole satellite has been conducted within the scope of the thesis in order to allow the design of the on-board computer. Based on the established system requirements, the on-board computer of the Flying Laptop was designed and the breadboard model and partly the engineering model of its components are developed. The hardware logic (control algorithm) which shall be implemented into FPGAs can be designed by means of hardware description languages. However, it is no longer software engineering but hardware engineering for generating real hardware logics inside FPGAs which are executed in parallel in real-time. The satellite main functions are designed, developed, and implemented in FPGAs by means of the hardware description languages Handel-C and VHDL. The thesis provides development methods of the control algorithms. In addition to this, a control algorithm development facility has been established for the further design activities. Finally, the developed control algorithms are verified in a simulation and verification environment in order to prove the validities of the above described developments. First of all, an FPGA hardware-in-the-loop real-time simulation environment has been established based on the Model-based Development and Verification Environment (MDVE). MDVE was established at the Institute of Space Systems supported by EADS Astrium. The communication interface between the MDVE and FPGAs are developed, including the required hardware components and the serialization algorithms of communication lines inside an FPGA. Using this simulation and verification environment, extensive simulations have been conducted and the design of the on-board computer, as well as the system design of the whole satellite are validated. At the end, an extended investigation has been conducted on formal verification methods of the hardware-logic in order to provide the way of strict design verifications. This thesis establishes the basis of principle application methods of reconfigurable FPGA technologies for reconfigurable computing'' on space systems which provides innovative solutions for high computational demands of future space applications.
Security for volatile FPGAs
With recongurable devices fast becoming complete systems in their own right, interest in their security properties has increased. While research on "FPGA security" has been active since the early 2000s, few have treated the field as a whole, or framed its challenges in the context of the unique FPGA usage model and application space. This dissertation sets out to examine the role of FPGAs within a security system and how solutions to security challenges can be provided. I offer the following contributions. I motivate authenticating configurations as an additional capability to FPGA configuration logic, and then describe a exible security protocol for remote reconfiguration of FPGA-based systems over insecure networks. Non-volatile memory devices are used for persistent storage when required, and complement the lack of features in some FPGAs with tamper proong in order to maintain specified security properties. A unique advantage of the protocol is that it can be implemented on some existing FPGAs (i.e., it does not require FPGA vendors to add functionality to their devices). Also proposed is a solution to the "IP distribution problem" where designs from multiple sources are integrated into a single bitstream, yet must maintain their condentiality. I discuss the diculty of reproducing and comparing FPGA implementation results reported in the academic literature. Concentrating on cryptographic implementations, problems are demonstrated through designing three architecture-optimized variants of the AES block cipher and analyzing the results to show that single figures of merit, namely "throughput" or "throughput per slice", are often meaningless without the context of an application. To set a precedent for reproducibility in our field, the HDL source code, simulation testbenches and compilation instructions are made publicly available for scrutiny and reuse. Finally, I examine payment systems as ubiquitous embedded devices, and evaluate their security vulnerabilities as they interact in a multi-chip environment. Using FPGAs as an adversarial tool, a man-in-the-middle attack against these devices is demonstrated. An FPGA-based defense is also demonstrated: the first secure wired "distance bounding" protocol implementation. This is then put in the context of securing recongurable systems.
Numerical Solutions of Differential Equations on FPGA-Enhanced Computers
Conventionally, to speed up scientific or engineering (S&E) computation programs on general-purpose computers, one may elect to use faster CPUs, more memory, systems with more efficient (though complicated) architecture, better software compilers, or even coding with assembly languages. With the emergence of Field Programmable Gate Array (FPGA) based Reconfigurable Computing (RC) technology, numerical scientists and engineers now have another option using FPGA devices as core components to address their computational problems. The hardware-programmable, low-cost, but powerful “FPGA-enhanced computer” has now become an attractive approach for many S&E applications. A new computer architecture model for FPGA-enhanced computer systems and its detailed hardware implementation are proposed for accelerating the solutions of computationally demanding and data intensive numerical PDE problems. New FPGAoptimized algorithms/methods for rapid executions of representative numerical methods such as Finite Difference Methods (FDM) and Finite Element Methods (FEM) are designed, analyzed, and implemented on it. Linear wave equations based on seismic data processing applications are adopted as the targeting PDE problems to demonstrate the effectiveness of this new computer model. Their sustained computational performances are compared with pure software programs operating on commodity CPUbased general-purpose computers. Quantitative analysis is performed from a hierarchical set of aspects as customized/extraordinary computer arithmetic or function units, compact but flexible system architecture and memory hierarchy, and hardwareoptimized numerical algorithms or methods that may be inappropriate for conventional general-purpose computers. The preferable property of in-system hardware reconfigurability of the new system is emphasized aiming at effectively accelerating the execution of complex multi-stage numerical applications. Methodologies for accelerating the targeting PDE problems as well as other numerical PDE problems, such as heat equations and Laplace equations utilizing programmable hardware resources are concluded, which imply the broad usage of the proposed FPGA-enhanced computers.
Physical Synthesis Toolkit for Area and Power Optimization on FPGAs
A Field-Programmable Gate Array (FPGA) is a configurable platform for implementing a variety of logic circuits. It implements a circuit by the means of logic elements, usually Lookup Tables, connected by a programmable routing network. To utilize an FPGA effectively Computer Aided Design (CAD) tools have been developed. These tools implement circuits by using a traditional CAD flow, where the circuit is analyzed, synthesized, technology mapped, and finally placed and routed on the FPGA fabric. This flow, while generally effective, can produce sub-optimal results because once a stage of the flow is completed it is not revisited. This problem is addressed by an enhanced flow known Physical Synthesis, which consists of a set of iterations of the traditional flow with one key difference: the result of each iteration directly affects the result of the following iteration. An optimization can therefore be evaluated and then adjusted as needed in the following iterations, resulting in an overall better implementation. This CAD flow is challenging to work with because for a given FPGA researchers require access to each stage of the flow in an iterative fashion. This is particularly challenging when targeting modern commercial FPGAs, which are far more complex than a simple Lookup Table and Flip-Flop model generally used by the academic community. This dissertation describes a unified framework, called the Physical Synthesis Toolkit (PST), for research and development of optimizations for modern FPGA devices. PST provides access to modern FPGA devices and CAD tool flow to facilitate research. At the same time the amount of effort required to adapt the framework to a new FPGA device is kept to a minimum. To demonstrate that PST is an effective research platform, this dissertation describes optimization and modeling techniques that were implemented inside of it. The optimizations include: an area reduction technique for XOR-based logic circuits implemented on a 4-LUT based FPGA (25.3% area reduction), and a dynamic power reduction technique that reduces glitches in a circuit implemented on an Altera Stratix II FPGA (7% dynamic power reduction). The modeling technique is a novel toggle rate estimation approach based on the XOR-based decomposition, which reduces the estimate error by 37% as compared to the latest release of the Altera Quartus II CAD tool.