FPGA implementation of a neural network
Published
Author
Type
Bachelor's thesis
Program
Model builders
Journal title
ISSN
Volume title
Publisher
Abstract
Image recognition is a quickly growing field in which convolutional neural networks
(CNNs) are at the cutting edge. Today, fast GPUs are typically used, and they consume
a lot of power. Field programmable gate arrays (FPGAs) are more energy efficient per
calculation. This report describes an architecture for a convolutional neural network
implemented in a field programmable gate array. The main purpose is to design
an architecture and demonstrate its functionality with regard to power, speed and
resource usage. To arrive at the architecture, the project followed general
guidelines for convolutional neural networks, with filters that extend over the entire
depth of the image.
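To illustrate what "filters that extend over the entire depth of the image" means, the following is a minimal software sketch, not the hardware design from the report. The function name `conv2d_full_depth`, the nested-list data layout (height x width x depth), and the stride of 1 without padding are all assumptions made for this example.

```python
def conv2d_full_depth(image, kernel):
    """Slide a K x K x D filter over an H x W x D image (stride 1, no padding).

    The filter covers every channel at once, so one filter yields one
    two-dimensional output map. All names and the data layout here are
    illustrative assumptions, not taken from the report.
    """
    height, width, depth = len(image), len(image[0]), len(image[0][0])
    k = len(kernel)
    # The filter must be square and span the full image depth.
    assert len(kernel[0]) == k and len(kernel[0][0]) == depth

    output = []
    for y in range(height - k + 1):
        row = []
        for x in range(width - k + 1):
            acc = 0
            # Multiply-accumulate over the window and over all channels.
            for dy in range(k):
                for dx in range(k):
                    for c in range(depth):
                        acc += image[y + dy][x + dx][c] * kernel[dy][dx][c]
            row.append(acc)
        output.append(row)
    return output


# A 3 x 3 image with 2 channels and a 2 x 2 x 2 filter, all values 1.
image = [[[1, 1] for _ in range(3)] for _ in range(3)]
kernel = [[[1, 1] for _ in range(2)] for _ in range(2)]
result = conv2d_full_depth(image, kernel)
```

Each output value in this example is the sum of 2 x 2 x 2 = 8 products, which is the kind of multiply-accumulate workload an FPGA architecture would parallelize.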
The parameters of the design were adapted for the FPGA used in the project.
The dimensions of the memory were adjusted to reduce the number of times each
data element has to be loaded for each calculation, due to max-pooling. The final
architecture is nevertheless flexible enough to be adapted to other
FPGAs. In the implementation, the calculations used both data and filters from a
limited read-only memory (ROM), although the design could also use data from the main processor.
The computing capacity of the architecture is far below the theoretical capacity
of the FPGA. However, there are multiple possible improvements
that would increase the computing capacity dramatically. The summing tree used in
the architecture can be modified, which could potentially double the number of
calculations per clock cycle, and the critical data path can be optimized to further
increase the clock speed. Despite these limitations, the current
architecture has a higher performance-to-power ratio than a GTX 1060.
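The abstract does not describe the structure of the summing tree, so the following is only a generic sketch of how a pairwise adder tree reduces many products to one sum. The function name `adder_tree_sum` and the rule that an unpaired value passes through to the next stage are assumptions for illustration.

```python
def adder_tree_sum(values):
    """Sum values by pairwise reduction, the way a hardware adder tree would.

    Returns (total, stages), where stages is the number of adder levels.
    This is a generic illustration, not the tree used in the report.
    """
    level = list(values)
    if not level:
        return 0, 0
    stages = 0
    while len(level) > 1:
        # Add neighbouring pairs; in hardware these additions run in parallel.
        next_level = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An unpaired value passes through to the next stage unchanged.
            next_level.append(level[-1])
        level = next_level
        stages += 1
    return level[0], stages


total, stages = adder_tree_sum([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

The number of stages grows with the logarithm of the number of inputs, which is why the depth of such a tree affects the critical path and therefore the achievable clock speed.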
Description
Subject/keywords
Convolutional Neural Network, CNN, Field Programmable Gate Array, FPGA, Image Recognition