Input Based Dynamic Reconfiguration in Matlab

Abstract:

In this work, we propose a novel framework that enables dynamic reconfiguration of an already-trained Convolutional Neural Network (CNN) in hardware during inference. The reconfiguration enables input-dependent approximation of the CNN at run-time to save power without much degradation in classification accuracy. For each input, our framework conducts inference using only a fraction of the CNN's edge weights, selected based on that input, with the remaining weights treated as 0. Consequently, power is saved through fewer fetches from off-chip memory and fewer multiplications for the majority of the inputs. To achieve per-input approximation, we use a clustering algorithm that groups similar weights in the CNN based on their importance, and we design an iterative framework that decides how many clusters of weights should be fetched from off-chip memory for each individual input. We also propose new hardware structures to implement our framework on top of a recently proposed FPGA-based CNN accelerator. In our experiments with popular CNNs, inference with only a fraction of the edge weights for the majority of the inputs yields significant power savings with almost no degradation in classification accuracy.
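
The per-input iteration described above can be illustrated with a minimal software sketch. It is not the hardware design or the clustering algorithm of this work: it assumes a single fully connected layer, uses weight magnitude as a stand-in for importance, forms clusters as equal-size rank buckets, and stops fetching once the softmax confidence reaches a hypothetical threshold. The function names and the threshold value are illustrative assumptions.

```python
import math

def cluster_by_importance(weights, num_clusters):
    """Group weight positions into clusters, most important first.

    Assumption: importance is |w| and clusters are equal-size rank
    buckets; the abstract does not specify the actual clustering method.
    """
    flat = [(abs(w), r, c)
            for r, row in enumerate(weights)
            for c, w in enumerate(row)]
    flat.sort(reverse=True)
    size = math.ceil(len(flat) / num_clusters)
    return [[(r, c) for _, r, c in flat[i:i + size]]
            for i in range(0, len(flat), size)]

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adaptive_inference(x, weights, clusters, threshold=0.9):
    """Fetch weight clusters one at a time until the output is confident.

    Weights that have not been fetched contribute 0, so adding each
    cluster's products to the running scores equals inference with only
    the fetched subset. The confidence test is an assumed stopping rule.
    Returns (predicted class, number of clusters fetched).
    """
    scores = [0.0] * len(weights)
    used = 0
    for cluster in clusters:
        for r, c in cluster:          # "fetch" this cluster's weights
            scores[r] += weights[r][c] * x[c]
        used += 1
        if max(softmax(scores)) >= threshold:
            break                     # confident enough: stop fetching
    probs = softmax(scores)
    return probs.index(max(probs)), used

# Toy 2-class layer: one dominant weight, three small ones.
W = [[5.0, 0.1],
     [0.1, 0.2]]
clusters = cluster_by_importance(W, num_clusters=2)
print(adaptive_inference([1.0, 0.0], W, clusters))  # easy input
print(adaptive_inference([0.0, 1.0], W, clusters))  # harder input
```

In this toy setting, an input that excites the dominant weight is classified after a single cluster, while an ambiguous input falls back to fetching every cluster, which mirrors the idea that most inputs need only a fraction of the weights.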