dc.contributor.author: Abhishek Kumar Jain
dc.date.accessioned: 2017-02-02T07:40:58Z
dc.date.available: 2017-02-02T07:40:58Z
dc.date.issued: 2017
dc.identifier.citation: Abhishek Kumar Jain. (2017). Architecture centric coarse-grained FPGA overlays. Doctoral thesis, Nanyang Technological University, Singapore.
dc.identifier.uri: http://hdl.handle.net/10356/69532
dc.description.abstract: Coarse-grained FPGA overlays have emerged as one possible solution to make FPGAs more accessible to application developers who are accustomed to software API abstractions and fast development cycles. Existing overlay architectures offer a number of advantages for general purpose hardware acceleration because of software-like programmability, fast compilation, application portability, and improved design productivity, but at the cost of area and performance overheads due to limited consideration for the underlying FPGA architecture. This thesis explores coarse-grained overlays designed using the flexible DSP48E1 primitive on Xilinx FPGAs, allowing pipelined execution of compute kernels at significantly higher throughput. We first evaluate an open source overlay architecture, DySER, mapped on the Xilinx Zynq device and show that DySER suffers from a significant area and performance overhead due to limited consideration for the underlying FPGA architecture. Next, we design and implement a more FPGA-targeted overlay architecture that maximizes the peak performance and reduces the interconnect area overhead through the use of an array of DSP block based fully pipelined functional units and an island-style coarse-grained routing network. As the interconnect of the island-style overlay is still excessive, we next explore novel interconnect architectures to further reduce the interconnect area. We then develop DeCO, a cone-shaped cluster of FUs, which shows 87% savings in LUT requirements compared to our island-style overlay, for a set of compute kernels. Our experimental evaluation shows that the proposed overlays exhibit frequencies close to the DSP theoretical limit and achieve high performance with significantly reduced area overheads. We also present a methodology for compiling high-level language (C/OpenCL) descriptions of compute kernels onto DSP block based coarse-grained overlays. Our mapping flow provides a rapid, vendor-independent mapping to the overlay, raising the abstraction level while also reducing compilation time significantly, hence addressing the design productivity issue. [en_US]
dc.format.extent: 184 p. [en_US]
dc.language.iso: en [en_US]
dc.subject: DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems [en_US]
dc.title: Architecture centric coarse-grained FPGA overlays [en_US]
dc.type: Thesis
dc.contributor.supervisor: Douglas Leslie Maskell [en_US]
dc.contributor.school: School of Computer Science and Engineering [en_US]
dc.description.degree: Doctor of Philosophy (SCE) [en_US]

