Ultra-low power signal processor for biomedical applications
Seyed Mohammad Ali Zeinolabedin
Date of Issue: 2017-04-12
School of Electrical and Electronic Engineering
Recent developments in integrated circuit (IC) design technology enable us to realize complicated applications and algorithms that were previously implemented mainly on software-based platforms. In addition, rapid advances in digital signal and image processing have brought newer and more efficient algorithms into existence. For example, frequency-based image processing, which first relied on the Fourier transform, has gradually turned to other, more powerful transforms. These transforms, such as Laplacian and Gaussian Pyramids (LP and GP) and the wavelet transform, not only reveal the frequency components but also provide their positions. Moving beyond one-dimensional signal processing calls for transforms that provide more information about the signal, such as directionality. For instance, the curvelet (for continuous signals) and Contourlet (for two-dimensional (2D) discrete signals) transforms have been developed to address this need. Over the almost 130-year journey from the Fourier transform to the Contourlet transform, we have gained access to more information, although the complexity has increased considerably. Today, various mathematically complex algorithms have been developed to achieve the desired outcome in areas such as biomedical and mobile applications, coding, robotics, and three-dimensional (3D) graphics. Many of these applications must be implemented in hardware to achieve real-time performance. Moreover, some need to be ultra-low power in order to operate for longer, especially biomedical applications, in which both battery life and the thermal energy transferred to body tissue are of great concern. For instance, an online spike sorting DSP sorts the signals (spikes) recorded from a brain in real time. Two requirements that must always be met are real-time performance and power density.
The latter is essential because hardware implanted in a brain must avoid damaging brain tissue. The main bottlenecks in achieving ultra-low power signal and image processors for biomedical and Internet of Things (IoT) applications are memory and computational complexity. In the first part of this thesis, I propose architecture-level contributions to achieve an ultra-low power digital processor for biomedical and IoT applications. The proposed techniques reduce power and area by addressing the area- and power-hungry components of such applications, namely FIFOs and memory. The proposed techniques are (1) FIFO with error-reduced data compression (FERDC) and (2) FIFO with adaptive error-reduced data compression (FAERDC). In FERDC, the internal parameters are set in advance, which keeps the design complexity low. In FAERDC, an adaptive mechanism updates the internal parameters according to the input data pattern to reduce the output error and to save more power and area. FERDC and FAERDC can be applied generally to various digital signal and image processors and hardware accelerators because neither technique relies on application-specific assumptions. FERDC has been extensively investigated through simulations to verify its functionality as well as its power and area reduction. FAERDC is then combined with a proposed extension method for filtering and with near-threshold operation to design an ultra-low power Laplacian Pyramid (LP) engine, a popular hardware accelerator. The LP engine has been fabricated in 180 nm CMOS technology and its functionality is verified at 0.5 V. In the next part of this thesis, the Contourlet transform (CT), a multiresolution image representation, is studied. CT is one of the latest directional image transforms, and its hardware implementation has not yet been investigated. It consists of an LP and a directional filter bank (DFB). The DFB extracts the frequency components according to their directions.
We propose the first hardware architecture of CT and study the DFB in detail. The proposed architecture has been functionally verified through extensive simulations. The DFB is also memory-intensive: the whole input must be iteratively read, processed, and written back to memory. The last part of this thesis is devoted to a real-time multi-channel spike sorting chip. Various techniques are proposed to design a 128-channel real-time spike sorting DSP operating at near-threshold (NT) voltage. The main contributions fall into architecture-level and circuit-level categories. We propose a new spike detector and new feature extraction and training algorithms to improve clustering accuracy and to reduce the memory required for training and clustering. To further reduce leakage power and the area needed to accommodate 128 channels, a full-custom 8T-cell SRAM is designed to operate at NT voltage. An auto-biased bit-line keeper provides a reliable sensing margin in near- and sub-threshold operation, so that a single power supply can be used for both the digital core and the SRAM. Measurement results verify the chip functionality at 0.54 V with a power consumption of 0.175 μW per channel.
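As a rough illustration of what the reported per-channel figure implies, the short Python sketch below scales 0.175 μW per channel to the full 128-channel design. It assumes that the per-channel figure is an average that holds when all 128 channels are active at once; the abstract does not state the total chip power, so the result is only an estimate. The function name total_power_uw is hypothetical and used here for illustration only.

```python
def total_power_uw(per_channel_uw: float, channels: int) -> float:
    """Return aggregate power in microwatts.

    Assumption (not stated in the abstract): every channel draws the
    same average power, so the total scales linearly with channel count.
    """
    if per_channel_uw < 0 or channels < 0:
        raise ValueError("power and channel count must be non-negative")
    return per_channel_uw * channels


# Figures taken from the abstract: 0.175 uW per channel, 128 channels.
PER_CHANNEL_UW = 0.175
CHANNELS = 128

total = total_power_uw(PER_CHANNEL_UW, CHANNELS)
print(f"Estimated total: {total:.1f} uW")  # prints "Estimated total: 22.4 uW"
```

Under this assumption the whole 128-channel DSP would draw roughly 22.4 μW at 0.54 V.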