## **UC Berkeley Electronic Theses and Dissertations**

### **Title**

Neuromodulation IC for System Integration and Self-Interference Cancellation

**Permalink** <https://escholarship.org/uc/item/9k46j5hq>

**Author** Jung, Seobin

**Publication Date** 2018

Peer reviewed|Thesis/dissertation

#### Neuromodulation IC for System Integration and Self-Interference Cancellation

by

Seobin Jung

A dissertation submitted in partial satisfaction of the

requirements for the degree of

Doctor of Philosophy

in

Engineering - Electrical Engineering and Computer Science

in the

Graduate Division

of the

University of California, Berkeley

Committee in charge:

Professor Elad Alon, Chair Professor Jan Rabaey Professor Bruno Olshausen

Summer 2018

### Neuromodulation IC for System Integration and Self-Interference Cancellation

Copyright 2018 by Seobin Jung

#### Abstract

Neuromodulation IC for System Integration and Self-Interference Cancellation

by

Seobin Jung

Doctor of Philosophy in Engineering - Electrical Engineering and Computer Science

University of California, Berkeley

Professor Elad Alon, Chair

Recent advances in brain-machine interfaces (BMI) have demonstrated their clinical efficacy for various applications such as prosthetic control for motor-disabled patients and neural disease treatments. Among the many signal modalities, electrophysiology is one of the key areas for understanding and engineering neural systems. While impressive technical breakthroughs have been made in electrodes, signal acquisition, and microstimulation, no electrode-based instrumentation reported so far achieves both the coverage and the resolution required for a closed-loop BMI with a high degree of freedom and a clinical lifespan.

This dissertation presents a minimally invasive neural interface system that offers scalability (starting from thousands of neural sites and scaling up to millions), fine resolution ($<10\mu$m, $<1$ms), broad coverage (a year, $>10$cm), automated electrode insertion, and low-energy neuromodulation ($<500\mu$W for 64-channel recording). This system became feasible by integrating state-of-the-art sub-components developed across UC Berkeley and UCSF labs. Each sub-component is reviewed along with discussions on current progress and challenges. Prototype *in vitro* and *ex vivo* results are also shown.

Another challenge for a bidirectional neural interface is the existence of self-interference. While simultaneous stimulation and recording are required for neuromodulation chips to support closed-loop BMI applications, such ICs suffer from large stimulus artifacts. The stimulus artifact is essentially a form of self-interference that originates from a stimulator pulse and couples into front-end recorders. Because the ICs typically have front-ends with limited input ranges, they saturate and lose desired neural signals.

This dissertation presents an active cancellation IC that expands the effective dynamic range (uncancelled artifact/cancelled artifact) of the front-end to 8kHz bandwidth and for up to $200 \text{mV}_{\text{pp}}$ differential-mode (DM) artifact signals with only a modest ($\sim 10\%$) noise penalty. The analog canceller uses an LMS loop to cancel the majority of the artifact signal at the input of the LNA, while the digital canceller uses another LMS loop to further cancel the residual error. The chip was validated with *in vivo* cancellation measurement results.

#### Acknowledgments

Berkeley is a great place full of interesting people, technical challenges, breakthroughs, diversity, and inclusion. I was intimidated initially but gradually started enjoying it. To be honest, it was not until my third year that I understood what I was supposed to do as a graduate student in BWRC (after the November retreat as I remember). 'Oh, I have to explore state-of-the-art works and implement one.'

I want to thank the friends and colleagues I got to know in BWRC. My years could have been worse without them. Kristel Deems and Amy Whitcombe, we SPICE girls went through our rough first year together. I will not forget how tall our onion grew in Cory 550 and how fun the Chairling game we played in Cory lab 105 was while working. My first thanks go to students in Elad's group – Kosta Trotskovsky, Antonio Puglielli, Greg Lacaille, Nathan Narevsky, Jaeduk Han, Bonjern Yang, DJ Seo, Nick Sutardja, Eric Chang, Zhongkai Wang, Pengpeng Lu, Emily Naviasky, and Paul Kwon. These awesome people were down for having discussions, sharing know-how, doing trial and error, and going through dark times together. I was also fortunate to interact with people in Jan's group – Ali Moin, Andy Zhou, George Alexandrov, Miles Rusch, and Matthew Anderson. I hope their dedicated research efforts bring them fruitful results. Although I did not research radio circuits, I was somehow surrounded by radio people and was able to pick their brains from time to time. I thank Luke Calderin, Sameet Ramakrishnan, Nai-chung Kuo, Bo Zhao, Lorenzo Iotti, and Andrew Townley. Lastly, because BWRC is a large research center, I could interact with students in silicon photonics, processors, bionics, and beyond. I wish good luck to Krishna Settaluri, Sidney Buchbinder, Taehwan Kim, John Wright, Angie Wang, Pi-Feng Chiu, Luya Zhang, Keertana Settaluri, and Kyoungtae Lee.

Switching gears to outside of BWRC, I want to thank Tim Hanson and Camilo Diaz-Botia for our work on the dare-to-fail neural interface project. They taught me how to collaborate outside of my lab and eventually got me excited by the adventure they have been going through. I thank Joseph O'Doherty for his consulting and David Piech for doing *in vivo* experiments. I also want to express gratitude to my research ancestors - John Crossley, Dan Yeager, Will Biederman, and Jackie Leverett. Their foundational work enabled my research. Special thanks to inspirational seniors who took care of me during my baby Ph.D. year - Alberto Puggelli, Matt Spencer, Yue Lu, Lingkai Kong, Wonyoung Kim, Jiye Lee, Insoon Yang, Jinwook Shin, Jaeyeon Baek, Jaehwa Kwak, and Hogeun Kim. I wish them the best of luck in the future.

I want to thank my research advisors and professors. Elad Alon is one of the greatest people I have ever met. While working on the BMI project, I often wished that I could tap into his brain and run a thread to use some of his brain power. I hope I have inherited some of his knowledge on ICs and beyond, his capability to work on multiple projects and do them well, and his patience. Jan Rabaey has been a great visionary and shown me how to enjoy research work and discussions. Rikky Muller has shown keen interest in my research and given me advice. I was also fortunate to learn from great teachers. The RF circuit class from Ali Niknejad, the statistical digital signal processing class from Laura Waller, the magnetic resonance imaging class from Miki Lustig, the numerical simulation and modeling class from Jaijeet Roychowdhury, the neural computation class from Bruno Olshausen, and the ADC class from Johan Vanderhaegen taught me not only knowledge but also passion for teaching and research. I also want to thank the industry people I met during my internship. Jared Zerbe, Brian Leibowitz, Emerson Fang, Sunghyuk Lee, Ovidiu Bajdechi, and Albert Jerng gave me ideas on how talents can be leveraged outside of academia. I also appreciate the many BWRC Friday seminar speakers who broadened my horizons. Deep thanks to Ajith Amerasekera, Brian Richards, James Dunn, Fred Burghardt, Olivia Nolan, Candy Corpus, Yessica Bravo, and Shirley Salanio. My graduate school life could have been in the tank without their help.

Lastly, I'd like to give my heart to my family who has supported me and encouraged me during my over-20-years in school. I remember my dad saying to his intimidated daughter "No matter whether you do your Ph.D. or not, you will get old anyway. So go for it if you can." I am glad I did it and dedicate my dissertation to him. Although we are staying apart, my heart always stays with them.

# Chapter 1

# Introduction

### 1.1 Brain Machine Interface

Recent advances in brain-machine interfaces (BMI) have demonstrated their clinical efficacy for various applications such as prosthetic control for motor-disabled patients and treatments for neural diseases including Parkinson's disease and epilepsy [1]. In a BMI, neural cells directly interface with artificial devices such as computers and robots and relay signals inside a real-time feedback loop. Although the concept of brain-computer interfaces was suggested almost a century ago, only in the last two decades has the BMI come to be regarded as a practical means to enhance people's lives. Both neuroscience and engineering have been the driving force: neuroscience contributed fundamental knowledge and the discovery of biomarkers, and engineering made the implementation possible via technical breakthroughs in signal acquisition, imaging, computational power, telemetry, robotics, and beyond.

Figure 1.1: Block diagram of a brain-machine interface (BMI) for prosthetic arm control. Multielectrode implants record neural signals from a cortical sensorimotor area. Signal processing algorithms decode the recorded neural spikes and encode motor commands to a robotic arm. The brain receives visual and somatosensory feedback from the actuation. Microstimulation can possibly actuate the cortical sensory area directly. Adapted from [1].

While these advances are impressive, further technical breakthroughs are required for the BMI to progress from a lab-oriented prototype to a clinical and commercial system. Different signal modalities such as electrical, optical, magnetic, ultrasonic, and molecular have been explored to push the implementation frontiers close to their physical limits and to achieve the scalability, spatiotemporal resolution, energy dissipation, and volume displacement required for a practical BMI system (Fig.1.2). Extracellular electrical recording probes the voltage due to nearby neurons. Optical microscopy, especially two-photon laser scanning microscopy, excites only one focus at a time, avoids scattering problems, captures the emitted fluorescence, and repeats this while scanning across a sample. Magnetic resonance imaging detects magnetically induced signals from water protons by applying a strong static magnetic field to align spins and exciting them with radio-frequency pulses and gradients. Activity-dependent contrast agents are necessary to transduce neural activity into an MRI readout. Table 1.1 shows advancements and limits of electrical recording and optical recording.

Figure 1.2: Neural recording modalities. (A) Extracellular electrical recording (B) Optical microscopy (C) Magnetic resonance imaging (D) Molecular recording. Adapted from [2].

The scope of this dissertation is the electrical signal domain, especially microelectrode-based electrophysiology. Specifically, this thesis explores two directions. The first is a system integration effort in which multiple research labs collaborated to build a microelectrode-based neural interface that has wide coverage, longevity, and fine spatiotemporal resolution by integrating state-of-the-art sub-components such as a neuromodulation chip, microelectrodes, and an electrode inserter robot. The second direction advances the neuromodulation chip further by adding the capability to simultaneously stimulate and record neural signals. This work addresses the problem of front-end saturation due to large stimulus artifacts that couple into limited-input-range front-ends, while remaining energy efficient. Lastly, the thesis finishes with a discussion of challenges and opportunities in multiple-electrode stimulus artifact cancellation.

## 1.2 System Integration for a Minimally Invasive Neural Interface

This work was done jointly with Dr. Timothy Hanson from the UCSF Sabes lab and Dr. Camilo Diaz-Botia from the UC Berkeley Maharbiz lab. The Sabes lab and the Maharbiz lab were in charge of developing the electrodes and the inserter machine.

One of the primary obstacles to understanding and treating the human brain is gaining fine-grain access to neural circuits. The challenges come from the fact that the human brain has a large number (10 billion) of tiny neurons ($10\mu$m) and that the neurons form a delicate network along with vessels and immune-privileged tissues. Fig.1.3 shows a scanning electron micrograph of a network of vessels. In spite of advances in brain access instrumentation, as far as we know, no solution can successfully measure brain activity at both fine (micron, millisecond) and broad (centimeter, year) scales.

We worked on a system that can pinpoint and neuromodulate thousands of independent sites using very fine and flexible microfabricated electrodes throughout the brain. Fig.1.4 shows the concept focusing on insertion. The main idea is an integration of electrode arrays, neuromodulation chips, and an inserter that can implant electrodes to any point in the brain quickly and accurately. Low-area low-power high-bandwidth neuromodulation chips are combined with the array for a complete and scalable electrical neural interface.

The proposed system aims to create a state-of-the-art brain-wide extracellular electrophysiological instrument. By comparison, large arrays of conventional microwires and silicon probes have limited coverage, cannot avoid blood vessels, and cannot be massively targeted to deep structures such as the thalamus or the basal ganglia. Optical methods also have a limited depth (a few mm). Acoustic and magnetic methods are still in preliminary development. Table 1.1 compares this approach with rigid microelectronics and optical recording in further technical detail.

## 1.3 Self-Interference Challenge in a Neuromodulation IC

While simultaneous stimulation and recording are required for neuromodulation chips to support closed-loop BMI applications, such ICs suffer from large stimulus artifacts. The stimulus artifact is essentially a form of self-interference that originates from the stimulator pulse that couples into front-end recorders. Because the ICs typically have front-ends with limited input ranges, they saturate and lose desired neural signals.

This dissertation presents an active cancellation IC that expands the effective dynamic range (uncancelled artifact/noise) of the front-end to 8kHz bandwidth and for up to $200 \mathrm{mV_{pp}}$ differential-mode (DM) artifact signals with only a modest ($\sim 10\%$) noise penalty. The analog canceller uses an LMS loop to cancel the majority of the artifact signal at the input of the LNA, while the digital canceller uses another LMS loop to further cancel the residual error. The chip was validated with *in vivo* cancellation measurement results.
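The LMS idea behind both cancellers can be illustrated with a minimal sketch. It assumes the stimulus waveform is available as a reference and that the artifact reaches the recorder through a short FIR coupling path. The function name `lms_cancel`, the tap count, the step size `mu`, and all synthetic signal values are illustrative assumptions, not the chip's implementation. The sketch models one floating-point loop, whereas the IC described above splits the work between an analog loop at the LNA input and a digital loop on the residual.

```python
import math


def lms_cancel(reference, observed, num_taps=4, mu=0.05):
    """Adaptively filter `reference` and subtract it from `observed` (LMS)."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps          # recent reference samples, newest first
    residual = []
    for x, d in zip(reference, observed):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        err = d - estimate              # signal left after cancellation
        # LMS update: move each weight along the error-times-input gradient.
        weights = [w + mu * err * h for w, h in zip(weights, history)]
        residual.append(err)
    return residual, weights


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


# Synthetic signals; every number here is made up for illustration.
N, PERIOD = 10_000, 50
stim = [0.0] * N
for start in range(0, N, PERIOD):
    for k in range(5):
        stim[start + k] = 1.0           # first phase of a biphasic pulse
        stim[start + 5 + k] = -1.0      # second phase

coupling = [0.6, 0.3]                   # assumed artifact coupling path
artifact = [
    coupling[0] * stim[n] + (coupling[1] * stim[n - 1] if n > 0 else 0.0)
    for n in range(N)
]
neural = [0.01 * math.sin(2 * math.pi * n / 37) for n in range(N)]
observed = [a + s for a, s in zip(artifact, neural)]

residual, weights = lms_cancel(stim, observed)
print("artifact RMS (last 1000 samples):", rms(artifact[-1000:]))
print("residual RMS (last 1000 samples):", rms(residual[-1000:]))
```

Once the weights settle, the residual is dominated by the small synthetic neural signal rather than by the artifact, which is the behavior the cancellation loops aim for.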



Figure 1.3: Scanning electron micrographs showing corrosion casts from the human cerebral cortex. (A) Arterial and venous distributions are shown. Scale bar = $375\mu$m. (B) Arterial distribution and capillary networks are shown. Scale bar = $430\mu$m. Adapted from [7].



Figure 1.4: A schematic of the electrode insertion machine (electrodes and needle are drawn enlarged for visibility). Fine flexible electrodes (marked as B) are placed in a replaceable and sterilizable cartridge (marked as C). Neuromodulation chips are integrated with the electrodes in advance. The inserter head (marked as A) moves in three dimensions to individually pick the electrodes from the cartridge. The insertion needle (marked as D) shoots towards a desired point inside the brain, leaves the electrode in place, and retracts. The inserter head then moves to pick a new electrode.

Table 1.1: Comparison of the suggested approach with two common methods for interfacing with nervous tissues

| Criteria                      | This approach       | Rigid microelectronics                    | Optical recording        |
|-------------------------------|---------------------|-------------------------------------------|--------------------------|
| Probe size                    | $6-80 \mu m^2$      | $40-2000 \mu m^2$ <sup>a</sup>            | 0                        |
| Accessible depth              | 0-32mm              | 0-10mm <sup>b</sup>                       | 0-3mm                    |
| Channel density <sup>c</sup>  | 12,000 - 800,000    | 625 <sup>d</sup> - 40,000 <sup>e</sup>    | ${\sim}1000$             |
| Vasculature avoidance         | Yes                 | Limited                                   | Yes                      |
| Power consumption             | Low                 | Low                                       | High                     |
| Suitable for implantation     | Yes                 | Yes                                       | Limited                  |
| Targeting                     | Unlimited           | Limited                                   | Superficial, local       |
| Deflection force <sup>f</sup> | 2.2-4.3nN           | 33-88mN <sup>g</sup>                      | 0                        |
| Virus/dye required            | No                  | No                                        | Yes <sup>h</sup>         |
| Tissue heating                | No                  | No                                        | Yes                      |
| Temporal resolution           | High ($\leq 1$ms)   | High ($\leq 1$ms)                         | Moderate ($\geq 10$ms)   |

<sup>a</sup> $50\mu$m metallic microelectrode and $7\mu$m carbon fiber electrode.

<sup>b</sup> Limited by the column buckling force of a $2000 \mu m^2$ wire and tissue dimpling; no limit for larger electrodes.

<sup>c</sup> Number of channels per cm<sup>2</sup> targeting $<1\%$ tissue displacement.

<sup>d</sup> Utah array.

<sup>e</sup> Reference [6], ignoring fan-out shank and assuming $400\mu$m centers; actual tissue displacement for recording non-superficial structures is higher.

<sup>f</sup> Representative force calculated for a 1cm shank deflected 1mm.

<sup>g</sup> Forces for a $23\mu$m x $60\mu$m silicon shank and a $35\mu$m tungsten microwire.

<sup>h</sup> Virus/dye is required for cellular or spike-level resolution.

# Chapter 2

# A Minimally Invasive Neural Interface System

### 2.1 Problem

Electrophysiology with wide coverage and fine resolution is key to understanding neural systems. The instrument must offer wide coverage and fine resolution in both time and space: a broad area (centimeter), long-term reliability (year), fine spatial resolution (micrometer), and fine temporal resolution (millisecond) are all needed to establish a high-degree-of-freedom BMI that operates reliably. Extracellular recording and microstimulation using electrodes have demonstrated clinical successes for epilepsy and Parkinson's disease and remain a promising solution in spite of their invasive nature. However, no electrode-based instrumentation reported so far has achieved these temporal and spatial goals. This chapter investigates the current status of the sub-components of electrophysiology instrumentation, discusses challenges, proposes solutions, and reports results.

### 2.2 Current Solutions

### Electrode

Conventional electrodes such as Utah arrays or Michigan shanks offer a modest channel density and an easy implantation method, but they damage the neural targets of interest and their signal-to-noise ratios degrade over time. As a basis for overcoming the longevity problem, the current understanding of electrode failure mechanisms is briefly reviewed here [3].

Electrodes implanted in the brain evoke a foreign body response (FBR) that causes a growth of astroglial and fibrous scar tissue. The tissue ultimately insulates the electrode and pushes neurons outside the recording volume. To address this issue, the electrode materials need to be biocompatible and the electrode geometry needs to impose minimal mechanical stress.



Figure 2.1: Failure modes of multielectrode arrays. (a) Ideal placement in cortical tissue. (b) Biological failures: bleeding, cell death, hardware infection, and gliosis. (c) Material failures: broken electrode tips, insulation leakage, parylene cracks, and delamination. (d) Mechanical failures: wire bundle damage, connector damage, and mechanical removal. Adapted from [3].

Damage to the blood-brain barrier is another reason for recording-site failure. Since the array has a group of electrodes with fixed locations, it has a fundamental limit in targeting. A blind implant of electrodes comes with a high risk of damaging blood vessels since the capillary bed in primate cortex is dense. For example, the spacing of micro-capillaries is $40\mu$m [7]. Puncturing these capillaries can result in critical damage such as hemorrhagic necrosis and edema. With a fixed Utah array, one day after implantation, 60% of needle tracts showed evidence of hemorrhage and 25% showed edema [8].

### Electronics

Electronics are required for recording and actuation: neural signals are sensed, and stimulation is delivered, through the electrodes. Commercially available solutions include rack-mounted equipment such as the TDT system and the Plexon system. Table 2.1 shows a set of electrophysiology instruments that TDT provides.

Although the TDT and Plexon systems serve as golden references in many neuroscience labs, they have a large form factor and are power hungry, so they cannot be employed as long-term implantable electronics. Commercially available IC solutions such as the Intan chip exist (shown in Fig.2.2), but they lack channel density and functionality [14].



Figure 2.2: (A) Intan neuromodulation controller with four headstages. (B) Intan headstage with two RHS2116 chips providing 32 channels in total.



Table 2.1: TDT Neurophysiology System [9]

Table 2.2: Survey of recent neuromodulation ICs from academia

| Category | Specification            | JSSC 2015 [10]          | ISSCC 2016 [11]  | ISSCC 2017 [12]              | VLSI 2017 [13]                                 |
|----------|--------------------------|-------------------------|------------------|------------------------------|------------------------------------------------|
| General  | Tech [nm]                | 65                      | 180 HV           | 130                          | 180 HV                                         |
|          | Area [mm<sup>2</sup>]    | 4.8                     | 25               | 6                            | 11.5                                           |
| Record   | Vdd [V]                  | 1.0                     | $\pm 1.8$        | 1.2                          | 1.0                                            |
|          | Power/ch [$\mu$W]        | 1.8                     | 5.4              | 0.63                         | 8.0                                            |
|          | Area/ch [mm<sup>2</sup>] | 0.0258                  | 0.075            | 0.013                        | -                                              |
|          | Channel                  | 64                      | 16               | 64                           | 64                                             |
|          | Sample rate [kHz]        | 20                      | 10               | 1.0                          | 1.0                                            |
|          | NEF / PEF                | 3.6 / 12.9              | 6.2 / 138        | 2.86 / 9.82                  | 7.8 / 60.8                                     |
|          | ENOB [bit]               | 8.2                     | 8.5              | 11.7                         | 10.2                                           |
|          | Input range [mV$_{pp}$]  | 0.52-6.0                | -                | -                            | 100                                            |
|          | DR [dB]                  | -                       | -                | -                            | 90                                             |
| Stim     | Vdd [V]                  | Up to 7                 | Up to $\pm 12$   | 3.3                          | Up to 12                                       |
|          | Area/ch [mm<sup>2</sup>] | 0.0169                  | 0.049            | -                            | -                                              |
|          | Channel <sup>a</sup>     | 8 (2)                   | 160              | 64                           | 64 (4)                                         |
|          | Imax [mA]                | 0.9                     | 0.5              | 1.35                         | 5.04                                           |
|          | Pulse width [$\mu$s]     | 0.8-409 (9bit)          | 10-8000          | -                            | 15-500 (6bit)                                  |
|          | Charge cancellation      | Yes                     | Yes              | -                            | Yes                                            |
| Power    | Type                     | Wired (1.5V)            | Wireless (2MHz)  | Wireless                     | Wired (3V) / Wireless (3V$_{ac}$, 20MHz)       |
|          | DCDC efficiency          | 68%                     | -                | -                            | 80%                                            |
| Comm     | Type                     | Wired (14Mbps)          | Wireless (2Mbps) | Wireless                     | Wired (2Mbps)                                  |
| Digital  | Signal processing        | Streaming, compression  | Streaming        | Streaming, seizure detection | Streaming                                      |

<sup>a</sup> The number of stimulators is given in parentheses if it differs from the number of channels.
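The NEF and PEF entries of the survey table can be cross-checked against the commonly used relation PEF = NEF$^2 \cdot V_{dd}$. The short sketch below recomputes PEF from the tabulated NEF and recording supply. Treating the $\pm 1.8$V supply of [11] as a 3.6V span is an assumption made here because it reproduces the reported value; the dictionary layout and the helper name `power_efficiency_factor` are illustrative only.

```python
def power_efficiency_factor(nef, vdd):
    """PEF = NEF^2 * Vdd (supply-aware extension of the noise efficiency factor)."""
    return nef ** 2 * vdd


# (NEF, recording Vdd in volts, reported PEF), read from the survey table.
designs = {
    "JSSC 2015 [10]": (3.6, 1.0, 12.9),
    "ISSCC 2016 [11]": (6.2, 3.6, 138.0),   # +/-1.8 V taken as a 3.6 V span (assumption)
    "ISSCC 2017 [12]": (2.86, 1.2, 9.82),
    "VLSI 2017 [13]": (7.8, 1.0, 60.8),
}

relative_errors = {}
for name, (nef, vdd, reported_pef) in designs.items():
    computed = power_efficiency_factor(nef, vdd)
    relative_errors[name] = abs(computed - reported_pef) / reported_pef
    print(f"{name}: computed PEF = {computed:.2f}, reported PEF = {reported_pef}")
```

Under these assumptions all four recomputed values agree with the tabulated PEF to within one percent, which suggests the table entries are mutually consistent.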

### 2.3 Suggested Approach

To create a scalable neural interface that has both fine resolution and broad coverage, we suggest integrating low-power, low-area neuromodulation SoCs with fine and flexible electrodes that can be individually controlled. With image guidance, one could avoid destroying critical parts of the tissue during the insertion process, enabling long-term operation of the neural interface.

### Electronics

Recently, academia has made progress in developing low-power, low-area, high-density integrated circuits for neuromodulation. Table 2.2 shows a survey of recent state-of-the-art works. Although the performance of these chips does not reach that of the TDT system (e.g. max sample rate, input range, ENOB, Imax), it is good enough to enable closed-loop neuromodulation applications. Since integrated circuits provide unbeatable compactness and power efficiency, an implantable neural interface becomes possible.

In particular, the neuromodulation chip developed in BWRC achieved state-of-the-art performance for measuring cell-level action potentials [10]. Fig.2.3 shows a system block diagram of the chip. It integrates 64 analog front-ends for recording, two spatially multiplexed current DACs for stimulation, a charge pump for supplying high voltages to the current DACs, and digital back-ends for control and wired serial communication. The chip consumes less than $500\mu$W of power for recording all channels, occupies 4.78mm<sup>2</sup> of area, and requires only a handful of decaps.

Figure 2.3: Block diagram of the neuromodulation IC previously developed in BWRC [10].

For system integration, the chip was improved and re-verified. The microarchitecture of the digital blocks was refactored to reduce the latency from the ADC output to the chip serial output in the streaming mode (from 2ms to $50\mu$s). Short latency is an important feature for closed-loop neuromodulation applications. The main design change was to merge the history buffer and the serializer into one circular buffer that mimics the data output packet structure. On the analog side, reported measurement results of some of the blocks were reproduced in simulation. For instance, LNA noise was verified with corner simulations, and charge pump efficiency was verified with parasitic capacitance included at the outputs.
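To put the latency figures in context, the sketch below expresses them in sample periods, assuming the 20kS/s per-channel rate listed for this chip in Table 2.2. It then shows, in a deliberately simplified form, what a circular buffer laid out in output-packet order could look like. The class `PacketRing`, its one-slot-per-channel layout, and its method names are hypothetical; the chip's actual packet format is not reproduced here.

```python
SAMPLE_RATE_HZ = 20_000                      # per-channel rate from Table 2.2
sample_period_us = 1e6 / SAMPLE_RATE_HZ      # 50 us per sample
old_latency_periods = 2_000 / sample_period_us   # 2 ms expressed in sample periods
new_latency_periods = 50 / sample_period_us      # 50 us expressed in sample periods


class PacketRing:
    """Circular buffer whose slot order matches the packet order (hypothetical)."""

    def __init__(self, num_channels):
        self.slots = [0] * num_channels
        self.read_idx = 0

    def write(self, channel, sample):
        # The ADC side overwrites the channel's slot in place.
        self.slots[channel] = sample

    def read_next(self):
        # The serializer side walks the slots in packet order and wraps around.
        word = self.slots[self.read_idx]
        self.read_idx = (self.read_idx + 1) % len(self.slots)
        return word


demo_ring = PacketRing(num_channels=4)
for ch in range(4):
    demo_ring.write(ch, 100 + ch)
demo_words = [demo_ring.read_next() for _ in range(5)]   # fifth read wraps to slot 0
print(old_latency_periods, new_latency_periods, demo_words)
```

Under the stated assumption, the original 2ms latency corresponds to 40 sample periods and the improved $50\mu$s latency to a single sample period. Sharing one buffer between writer and reader removes the copy from a history buffer into a separate serializer, which is one plausible reading of why the merge shortens the path.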

### Electrode

In order to prevent a foreign body response, the electrodes must be biocompatible and impose minimal mechanical stress on the brain. The suggested electrode satisfies the first condition by having lithographically-patterned polyimide, one of the most stable and biocompatible polymers [15], with a 250nm titanium-gold conduction layer exposed only at the tip.

The second condition, minimizing mechanical stress, is achieved by making the electrode flexible and thin. The suggested electrodes are 20,000 times more flexible than an equivalent commonly-used $35\mu$m stainless steel microwire or $25\times50\mu$m silicon shank. Beyond displacing less tissue, smaller sizes have several advantages. A smaller surface area applies less strain on the brain during acceleration of the electrode for insertion [16]. The device cross-sectional area has been shown to be proportional to the volume of the FBR; $7\mu$m carbon fiber electrodes reduce chronic inflammation, and polymer fibers smaller than $6\mu$m show almost no FBR [17].

Figure 2.4: Drawings of the insertion needle and the electrode. (A) Side view of the insertion needle and the electrode oriented for implantation. Polyimide is amber, metal is blue, and the tungsten needle is gray. (B) Front view of the needle and the electrode, showing multiple recording sites to the right. Recording sites are electroplated to reduce impedance. (C) Photo of the insertion needle. (D) Micrograph of several electrodes and geometries tested. The remainder of the electrode (not shown) includes 18mm of lead per electrode, fan-in, and wirebond sites. Polyimide electrodes are visible as amber, and the parylene backing sheet is blue.

Figure 2.5: (A) Rendering of a cartridge with electrodes (amber) and adapter board (green) attached. (B) Photograph of a loaded cartridge. (C) Close-up rendering of mounted electrodes, showing polyimide electrodes (amber), loops, and parylene backing (clear). Electrodes and backing extend out on a small lip which permits needle access to the loops. (D) Photograph of the cartridge and inserter head showing the brake (2), needle cannula (1), and single-lens microscope (3).

While the fine, flexible electrodes bring longevity, they are hard to insert. To provide the stiffness required for insertion, a rigid reusable needle was developed. As shown in Fig.2.4, each electrode is fabricated with a 12*µ*m diameter loop at the end. For implantation, the 12-25*µ*m diameter tungsten needle of the inserter machine hooks the loop, moves the electrode over the target area, inserts it into the brain, and releases it once done. The electrodes are initially weakly bonded to a thick parylene backing sheet so that they can be handled despite being extremely fine ($5\times16\mu$m). The backing sheet is attached to a reusable and sterilizable cartridge that magnetically snaps into the inserter machine (Fig.2.5). The cartridge also holds external connectors and/or headstage chips.

#### Insertion Robot

Automation of electrode insertion is required to handle a large number of fine and flexible electrodes in a practical surgical time window (e.g. 1,000 electrodes over four hours). A robotic arm capable of reliable and efficient targeting can automate this procedure and reduce variability in the surgical procedure. In one study, for example, rapid and automated insertion of microwire arrays resulted in $60\%$ yield, compared to $0\%$ for manual control, after six weeks of recording in rats, even though initial recordings were roughly identical [18].

### 2.4 Measurement Results

#### Electronics: Benchtop measurements

The analog blocks within each recording channel consume an average of 1.8*µ*W for 45dB-65dB variable gain and variable BW (high pass  $= 10$ Hz-1kHz, low pass  $= 3$ kHz-8kHz) with  $20kS/s$  and 10bit quantization. AFEs achieve  $7.5\mu V_{rms}$  IRN in the highest gain setting (Fig.2.6 (left)). Measured SNDR/SFDR is 36.1dB/47.4dB. Measured common-mode rejection is 30dB across the whole bandwidth up to  $10 \text{mV}_{\text{pp}}$  input, but drops to  $12 \text{dB}$  for  $100 \text{mV}_{\text{pp}}$ inputs.



Figure 2.6: Measured input-referred noise PSD (left) and SNDR/SFDR for the entire AFE chain (right).



Figure 2.7: The oscilloscope-captured voltage waveform of the load electrode. The yellow line shows stim electrode  $(+)$ , the green line shows stim electrode  $(-)$ , and the pink line shows a differential of the two. Load electrode impedance was assumed to be mostly capacitive.

Fig.2.7 shows voltage waveforms of loaded stimulator outputs captured by an oscilloscope. The biphasic current pulse had a pulse width of 43.75*µ*s, an interphase delay of 6.25*µ*s, and a stimulation current of *±*250*µ*A. Load electrode impedance was assumed to be mostly capacitive ($R_s = 10\Omega$, $C = 10$nF, $R_p = 1$G$\Omega$). As a result, the differential voltage waveform has the shape of a ramp.
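The ramp shape follows directly from driving a constant current into a mostly capacitive load. The sketch below is a minimal illustration of that reasoning, not a model of the actual stimulator: it uses the load values quoted above, neglects $R_p$ (an assumption, since 1G$\Omega$ carries negligible current over tens of microseconds), and uses a hypothetical 0.25*µ*s time step.

```python
# Sketch: voltage across a series R-C electrode model driven by a biphasic
# constant-current pulse. Rp is neglected (assumption). Names are illustrative.

I_STIM = 250e-6      # stimulation current amplitude [A]
R_S = 10.0           # series resistance [ohm]
C_EL = 10e-9         # electrode capacitance [F]
DT = 0.25e-6         # simulation time step [s] (hypothetical choice)
N_PULSE = round(43.75e-6 / DT)   # samples per phase (175)
N_GAP = round(6.25e-6 / DT)      # samples of interphase delay (25)


def stim_current(n):
    """Biphasic current at step n: +I, interphase gap, -I, then zero."""
    if n < N_PULSE:
        return I_STIM
    if n < N_PULSE + N_GAP:
        return 0.0
    if n < 2 * N_PULSE + N_GAP:
        return -I_STIM
    return 0.0


def electrode_voltage():
    """v = i*Rs + q/C, with the charge q accumulated step by step."""
    q = 0.0
    trace = []
    for n in range(2 * N_PULSE + N_GAP + 1):
        i = stim_current(n)
        trace.append(i * R_S + q / C_EL)
        q += i * DT
    return trace


trace = electrode_voltage()
peak_v = max(trace)      # reached at the end of the first phase
final_v = trace[-1]      # charge-balanced pulse returns to about 0 V
print(f"peak = {peak_v:.3f} V, final = {final_v:.2e} V")
```

Under these assumptions the capacitor term $I \cdot t / C$ dominates the resistive step $I \cdot R_s$ (about 1.09V against 2.5mV), which is why the waveform looks like a ramp up, a plateau during the interphase delay, and a ramp back down.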

### Electronics: *In vivo* measurements

The recording capability of the AFE was validated by *in vivo* measurements. A 16-channel tungsten-coated microwire electrode array (200$\mu$m spacing, Innovative Neurophysiology, Durham, NC) was implanted in the motor cortex of an adult Long-Evans rat. Extracellular recordings were obtained while the rat was freely moving (Fig.2.8). All experiments were performed in compliance with the regulations of the Animal Care and Use Committee at the University of California, Berkeley. Input-referred time-aligned epochs extracted by a nonlinear energy operator are shown in Fig.2.9, demonstrating the AFE's capability to record spikes. Under the same conditions, but with the chip replaced by the Plexon data acquisition system, neural spikes with the same parameters, such as pulse duration and peak-to-peak amplitude, were also obtained.
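For readers unfamiliar with the nonlinear energy operator (NEO), the following sketch shows the textbook form $\psi[n] = x[n]^2 - x[n-1]\,x[n+1]$ applied to a synthetic trace. The threshold rule (a scaled mean of $\psi$), the refractory window, and the spike template are assumptions chosen for illustration; the detection settings actually used for Fig.2.9 are not specified here.

```python
import math


def neo(x):
    """Nonlinear energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]."""
    psi = [0.0] * len(x)
    for n in range(1, len(x) - 1):
        psi[n] = x[n] * x[n] - x[n - 1] * x[n + 1]
    return psi


def detect_spikes(x, scale=8.0, refractory=20):
    """Indices where psi first exceeds scale * mean(psi).

    `scale` and `refractory` are hypothetical tuning parameters.
    """
    psi = neo(x)
    threshold = scale * sum(psi) / len(psi)
    spikes = []
    last = -refractory
    for n, value in enumerate(psi):
        if value > threshold and n - last >= refractory:
            spikes.append(n)
            last = n
    return spikes


# Synthetic trace (arbitrary units): slow background plus two spikes.
background = [5.0 * math.sin(2.0 * math.pi * n / 50.0) for n in range(2000)]
template = [20.0, 60.0, 100.0, 40.0, -40.0, -20.0]   # hypothetical spike shape
signal = list(background)
for start in (500, 1500):
    for k, v in enumerate(template):
        signal[start + k] += v

spike_times = detect_spikes(signal)
print(spike_times)
```

The operator emphasizes samples that are both large and fast-changing, so the slow background contributes almost nothing to $\psi$ while the spikes stand out. Epochs can then be cut around each detected index and time-aligned.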



Figure 2.8: Photo taken while conducting an *in vivo* experiment on a Long-Evans rat. All electronics (chip, FPGA, laptop) were battery-powered to minimize 60Hz interference.



Figure 2.9: (A) Input-referred time-aligned spikes measured using the neuromodulation chip from *in vivo* measurements. (B) Input-referred spikes from the same subject using the same electrode but measured using the Plexon data acquisition system. Note that the amplitudes and spike duration match.

#### Integration: Electronics and Electrodes

A prototype integration of the chip and the electrodes was made on a flexible PCB as shown in Fig.2.10. The form factors were not optimized because the focus was primarily on verifying electrical properties. Using this board, *in vitro* recording was successfully conducted as shown in Fig.2.12.

In order to achieve a thousand-channel recording, 16 copies of the 64-channel chip were employed (Fig.2.10(D)). An aggregator module that time-multiplexes the digital I/Os of the chips was designed and tested. Fig.2.11 shows an HDL diagram of the aggregator module. A commercially available FPGA board (OpalKelly XEM6310-LX150) with a USB3.0 link was used. During recording, no bit was missed as long as the RAM of the PC was not filled up. C++ scripts received and post-processed the high-throughput data.
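The raw payload rate that the aggregator must sustain can be estimated from figures quoted earlier in this chapter (64 channels per chip, 20kS/s, 10-bit samples) and the 16 chips mentioned above. This is a back-of-the-envelope sketch only: packet headers and framing overhead are not specified here and are ignored.

```python
# Raw (payload-only) data rate for the multi-chip recording setup.
N_CHIPS = 16
CHANNELS_PER_CHIP = 64
SAMPLE_RATE_HZ = 20_000
BITS_PER_SAMPLE = 10

total_channels = N_CHIPS * CHANNELS_PER_CHIP
rate_bps = total_channels * SAMPLE_RATE_HZ * BITS_PER_SAMPLE
rate_mbytes = rate_bps / 8 / 1e6

print(f"{total_channels} channels -> {rate_bps / 1e6:.1f} Mb/s "
      f"({rate_mbytes:.1f} MB/s) before any packet overhead")
```

A sustained rate of this order explains why the host RAM, rather than the link, was the practical limit during long recordings.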

#### Integration: Electrodes and Inserter Machine

*In vitro* testing of the inserter and electrodes was conducted. Fig.2.13 shows successful insertions into an agarose ($0.6$-$1.0\%$ w/v) tissue proxy as well as an ex-vivo brain.

### 2.5 Summary

This chapter discussed an effort for developing an electrophysiology instrument for extracellular recording and microstimulation to enable long-term clinical BMI applications. The goal was to achieve coverage (centimeter, year) and resolution (micrometer, millisecond) that no neural interface solution had achieved yet.

To realize such an interface, we worked on integrating state-of-the-art sub-components. One of the key elements was the neuromodulation IC with 64-channel recording analog front-ends, two spatially-multiplexing stimulators, power train, and digital blocks for control and communication. It was one of the densest neuromodulation ICs in recent years ($4.78\text{mm}^2$ of area for the whole chip). Thanks to the low-power operation ($<500\mu$W for recording mode), the chip is an implantable solution for closed-loop neuromodulation applications. Another key element was the electrode, which was made of a biocompatible material and had a thin and flexible form factor to minimize foreign body responses. The inserter with robotic arms for electrode insertion was also a main module. With an imaging guide, blood vessel damage during insertion could be avoided by targeting the electrodes.

Development of each sub-component was mostly achieved. The IC had complete *in vivo* recording results and its stimulation capability was demonstrated. The FPGA aggregator module along with control/postprocessing was designed and tested. The electrode design achieved a reliable fabrication process and retained a high yield in the aging test. The inserter was partially complete. The insertion speed and target controllability were achieved with a PID controller. Non-recurring time for electrode alignment was still long (10min) but recurring time for each electrode insertion was short (*<*10sec). An imaging guide (optical or MR) for insertion has not yet been included.

Integration of the sub-components was partially done. The IC and electrodes were placed on a flexible PCB and tested with *in vitro* measurements. The electrodes were mounted on the inserter and tested with *in vitro* measurements. The engineering work required to integrate all the modules (IC, electrodes, inserter) is one of the remaining tasks. Other future work involves *in vivo* measurements, a long-term validation, and a clinically-relevant neuromodulation demonstration.



Figure 2.10: (A) A flexible PCB with the electrodes (64 for recording and 16 for stimulation) and the neuromodulation chip assembled is shown. The electrodes are lightly attached in the parylene back sheet and run towards the left side of the photo. (B) Tip parts of the polyimide electrodes are zoomed in. (C) Photo of the un-encapsulated wire-bonded chip in the flex PCB. (D) The assembled flex PCBs are plugged into the aggregator board. The aggregator board accepts 14 flex PCBs for a total of 1120 electrodes (896 for recording and 224 for stimulation). The adaptor board connects to the FPGA board, shown at top of the image, for streaming data to the PC via USB3.



Figure 2.11: HDL dataflow for an aggregator module



Figure 2.12: Recording test result of the flexible PCB using electro-gel. Differential sinusoidal signals of various amplitudes are fed between REF pad and recording pads. For channels 5 through 8, inputs are shorted to REF using conductive gel. For channels 61 through 63, a differential input is fed using conductive gel. Time-domain ADC outputs with various amplitudes are shown: (A)  $0\mu V_{\rm rms}$ , (B)  $40\mu V_{\rm rms}$ , and (C)  $80\mu V_{\rm rms}$ . SNR obtained was 16dB, and was limited by 60Hz noise. Although the chip was battery-powered, FPGA and laptop were wall-plugged.



Figure 2.13: (A) Twenty electrodes inserted at 500*µ*m spacing in two rows. The front row was inserted 2mm deeper than the back row. Variance in the insertion depth is due to differing head and hole geometry of each electrode, as this was an experiment to see what works best. Bubbles are caused by the halogen light used to see the electrode during the experiment. (B) Three electrodes inserted into an ex-vivo zebra finch brain. (C) Seven electrodes inserted in the opposite hemisphere. Electrode leads adhere to the pial surface after wetting the brain with saline.

# Chapter 3

# Self-Interference Challenge in a Neuromodulation IC

### 3.1 Introduction

Closed-loop BMI systems call for neuromodulation ICs that record and stimulate simultaneously. However, such ICs are susceptible to self-interference; large artifact signals originating from the stimulation pulses couple into the front-end amplifiers. Because these front-ends are typically designed with high  $g_m/I_d$  for power savings, their input voltage range is limited, and hence the large stimulus artifact saturates the front-ends.

Recent works have developed new techniques to address various aspects of this problem [19]. These techniques include 1) back-end techniques such as blanking and interpolation [24], matched filter [26], component analysis [27], and 2) front-end techniques such as reset [13], bandpass filter [20], subtraction [21], [22], [23], and adaptive input ranging [25].

This work extends these previous approaches by presenting an active cancellation module that expands the effective dynamic range (defined as the ratio of the uncancelled artifact to the cancelled artifact) of the front-end to a state-of-the-art value of 78dB for an 8kHz bandwidth and for up to $200\text{mV}_{\text{pp}}$ differential-mode (DM) artifact signals with only a modest ($\sim 10\%$) noise penalty and a competitive power consumption.

### 3.2 Architecture Survey and Design

#### Survey of Artifact Removal Filters

With a limited input range, the self-interference removal problem boils down to a filter design problem. In the presence of a microstimulation ($s[n]$), a recorder picks up a natural biological signal ($v[n]$), a neural signal that reacts to the stimulation, and an undesired artifact. While the neural signal has a nonlinear time-varying nature in response to the stimulation ($h_N(s, n)$), the artifact channel can be modeled as a linear time-invariant system



Figure 3.1: Block diagrams of stimulus artifact removal filters: (a) no filter (b) a filter at the recorder (c) a filter at the stimulator (d) a filter injecting artifact replica at the recorder.

during the stimulus time span ($h_A[n]$). The recorder picks up a superposed signal $r[n]$ as expressed below (Fig.3.1(a)):

$$
r[n] = h_N(s, n) + h_A[n] * s[n] + v[n]. \tag{3.1}
$$

In order to remove the undesired artifact, digital methods record all the signals and apply filter techniques at digital back-ends such as blanking and interpolation [24], matched filter [26], and component analysis [27]. However, the matched filter or component analysis requires high-enough input range to maintain signal linearity and DSPs that are power hungry.

Low-power artifact removal methods employ filters with analog front-end knobs. For a case where the desired neural signal is band-separated from the undesired interference, a band-pass filter or a band-notch filter ($g_r[n]$) can be placed at the receiver side to suppress the interference (Fig.3.1(b) and equation 3.2).

$$
r[n] = g_r[n] * (h_N(s, n) + h_A[n] * s[n] + v[n]) \tag{3.2}
$$

$$
\text{minimize } ||g_r[n] * h_A[n] * s[n]|| \tag{3.3}
$$

While the band-limiting filter at the recorder was shown to be effective for certain biomarkers [20], the band separation assumption limits its usage.

Another approach is to have a filter at the transmitter side  $(g_t[n])$  that predistorts the stimulus to reduce the duration of the stimulus artifact (Fig. 3.1(c) and equation 3.4).

$$
r[n] = h_N(g_t * s, n) + h_A[n] * g_t[n] * s[n] + v[n] \tag{3.4}
$$

$$
h_A[n] * g_t[n] = \delta[n] \tag{3.5}
$$

While this method demonstrated a reduction of the artifact duration [28], it makes the effectiveness of the distorted stimulation questionable (i.e. would the predistorted biphasic stimulation sequence spark neural responses?). In addition, the channel cannot be completely inverted due to a stability issue; zeros outside of the unit circle become poles outside of the unit circle.
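The stability argument can be made concrete with a toy two-tap channel. The sketch below computes the impulse response of the exact inverse $1/H_A(z)$ by solving $h_A[n] * g_t[n] = \delta[n]$ sample by sample; the channel coefficients are hypothetical and chosen only so that the zero location is obvious.

```python
def inverse_impulse_response(h, n_samples):
    """Impulse response g of 1/H(z), obtained from (h * g)[n] = delta[n]."""
    g = []
    for n in range(n_samples):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(n, len(h) - 1) + 1):
            acc -= h[k] * g[n - k]
        g.append(acc / h[0])
    return g


# H(z) = 1 - 0.5 z^-1 has its zero at z = 0.5 (inside the unit circle).
stable_inverse = inverse_impulse_response([1.0, -0.5], 12)

# H(z) = 1 - 2 z^-1 has its zero at z = 2 (outside the unit circle).
unstable_inverse = inverse_impulse_response([1.0, -2.0], 12)

print(stable_inverse[10], unstable_inverse[10])
```

In the first case the inverse response decays as $0.5^n$, so a causal predistortion filter exists. In the second case it grows as $2^n$: the zero at $z = 2$ has become a pole at $z = 2$, and the predistorted stimulus would be unbounded.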

A subtraction filter injecting a replica of the artifact is a better solution in a sense that it allows band-overlapping between desired neural signals and stimulus artifacts and does not predistort the stimulus (Fig.3.1(d) and equation 3.6).

$$
r[n] = h_N(s, n) + h_A[n] * s[n] - g_c[n] * s[n] + v[n] \tag{3.6}
$$

$$
\text{minimize } \|h_A[n] - g_c[n]\| \tag{3.7}
$$

Some recent works took the subtraction approach and demonstrated its functionality for both common-mode artifacts  $[21]$  and differential-mode artifacts  $[22]$ ,  $[23]$ . This work focuses on the differential-mode artifact removal and demonstrates an energy-efficient scheme to achieve precise cancellations.

One thing to note about the subtraction method is that, besides the artifacts, active cancellation could cancel out desired neural signals. Fast excitatory neural responses occur with 1ms latency [24]. As an extreme example, if there is only a desired neural signal and no stimulus artifact, the cancellation filter can adapt to cancel out the neural signal completely. To prevent that, one can use a scaling-up trick. It is well known that neural responses have thresholds: unless the total stimulation charge exceeds a certain value (e.g. 1.6nC), neural responses are not evoked. Assuming that the artifact channel is linear, we can start with low-charge stimulation, adapt the cancellation filter, and then scale up the stimulation charge to evoke the desired neural responses.

#### Architecture Design

As shown in Fig.3.2, our neuromodulation SoC is based on a modified version of the design described in [10]. The AFE input range is the ADC full scale ($V_{FS,adc}=1.0$V for our design) divided by the AFE gain ($a_{lna}a_{vga}a_{buf}=45$dB-65dB), corresponding to $0.5\text{mV}_{\text{pp}}$ in the highest gain setting, and $7\text{mV}_{\text{pp}}$ in the lowest gain setting.

$$
(\text{AFE input range})_{\text{pp}} = \frac{V_{FS,adc}}{a_{lna} a_{vga} a_{buf}} \tag{3.8}
$$

Although this range is sufficient to record neural activities, stimulus artifacts that can go up to  $\sim 100 \text{mV}_{\text{pp}}$  easily saturate the front-end.

To address this issue, 8 out of the 64 total recording channels are modified to include stimulus artifact cancellers. An analog canceller suppresses the artifact at the front end of the LNA to prevent AFE saturation, and a digital canceller further cancels out residual artifacts.



Figure 3.2: Neuromodulation IC with stimulus artifact cancellers (top) and system diagram (bottom) detailing a single AFE with the analog canceller path followed by the digital canceller.

Each analog canceller consists of a switched capacitor (SC) DAC, a  $\Delta\Sigma$  modulator, and an FIR filter (Fig.3.2). The virtual ground nodes of the LNA are chosen as the analog summing junction since cancellation there alleviates the dynamic range requirements of the rest of the chain, and since the summing nodes can tolerate common-mode (CM) fluctuations up to the overdrive voltage of the input devices ( $\sim 60 \text{mV}_{\text{pp}}$  for our design).

An on-chip 32-tap FIR filter is applied to the stimulation sequence ($d_{stim}$) to generate the analog cancellation sequence ($d_{canc}$). Each filter tap coefficient has 10 bits. An LMS engine driven by the ADC output ($d_{adc}$) and $d_{stim}$ updates the FIR tap coefficients ($w$) to minimize the RMS error of $d_{adc}$. Sign-sign LMS is employed to reduce hardware overhead, resulting in tap coefficient updates of $+\mu$ or $-\mu$. The learning parameter ($\mu$) is adjustable and can be as small as the LSB of $w$. To prevent the LMS engine from adapting to the ADC's DC offset, we installed an optional second-order IIR filter that high-pass filters the ADC output before it is used by the LMS engine. When computing correlations in the LMS engine, $d_{stim}$ is delayed to match the delay of $d_{adc}$ (i.e. AFE delay and HPF).
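The sign-sign update described above can be sketched in a few lines. This is a behavioral illustration under strong simplifications, not the on-chip implementation: the artifact channel `H_A` is a hypothetical 4-tap response (the chip uses 32 taps), the stimulation sequence is a random $\pm 1$ sequence chosen for persistent excitation, the AFE filters, delay matching, and the optional HPF are omitted, and `MU` merely stands in for the coefficient LSB.

```python
import random


def sign(v):
    """Return -1, 0, or +1."""
    return (v > 0) - (v < 0)


def fir(taps, x, n):
    """Output at sample n of an FIR filter with coefficients `taps`."""
    return sum(t * x[n - k] for k, t in enumerate(taps) if n - k >= 0)


def sign_sign_lms(d_stim, artifact, n_taps, mu):
    """Adapt FIR taps w so that (w * d_stim) tracks the artifact."""
    w = [0.0] * n_taps
    residual = []
    for n in range(len(d_stim)):
        err = artifact[n] - fir(w, d_stim, n)   # plays the role of d_adc
        residual.append(err)
        for k in range(n_taps):
            if n - k >= 0:
                # Each tap moves by exactly +mu, -mu, or 0 per sample.
                w[k] += mu * sign(err) * sign(d_stim[n - k])
    return w, residual


random.seed(0)
H_A = [0.5, 0.3, -0.2, 0.1]    # hypothetical artifact channel
MU = 2.0 ** -10                # small fixed step, standing in for the LSB of w
d_stim = [random.choice((-1.0, 1.0)) for _ in range(20000)]
artifact = [fir(H_A, d_stim, n) for n in range(len(d_stim))]

w, residual = sign_sign_lms(d_stim, artifact, len(H_A), MU)
tail = residual[-1000:]
rms_tail = (sum(e * e for e in tail) / len(tail)) ** 0.5
print([round(c, 3) for c in w], rms_tail)
```

Because every update has magnitude $\mu$, the taps never settle exactly; they dither around the channel response, leaving a residual on the order of a few $\mu$. That steady-state residual is the quantity analyzed later as the FIR quantization error.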

As shown in Fig.3.3, the residual error of the cancelled artifacts ($\sigma_{residual}^2$) measured with the AFE consists of noise from the AFE ($\sigma_{afe}^2$), noise from the artifact path ($\sigma_{ch}^2$), and noise/distortion from the canceller ($\sigma_{canc}^2$).

$$
\sigma_{residual}^2 = \sigma_{afe}^2 + \sigma_{ch}^2 + \sigma_{canc}^2 \tag{3.9}
$$

The AFE noise mostly comes from the LNA in the high gain mode, but also comes from the VGA/BUF/ADC in the low gain mode.

$$
\sigma_{afe}^{2} = \sigma_{n,lna}^{2} + \frac{\sigma_{n,vga}^{2}}{a_{lna}^{2}} + \frac{\sigma_{n,buf}^{2}}{a_{lna}^{2}a_{vga}^{2}} + \frac{\sigma_{n,adc}^{2} + \sigma_{q,adc}^{2}}{a_{lna}^{2}a_{vga}^{2}a_{buf}^{2}} \tag{3.10}
$$

The artifact path adds either biological noise or electronic noise. The canceller adds quantization noise from the SC DAC  $(\sigma_{q,dac}^2)$ , distortion from the SC DAC  $(\sigma_{disto}^2)$ , and quantization



Figure 3.3: Error sources in the AFE ($n_{lna}, n_{vga}, n_{buf}, n_{adc}$), the artifact path ($n_{ch}$), and the canceller ($q_{dac}, q_{fir|lms}, g_2, g_3$).

noise from the FIR|LMS  $(\sigma_{q,fir|lms}^2)$ . Section 3.3 and Section 3.4 discuss these errors in detail.

$$
\sigma_{canc}^2 = \sigma_{q,dac}^2 + \sigma_{disto}^2 + k_{sup}^2 g_1^2 \sigma_{q,fir|lms}^2 \tag{3.11}
$$

Since stimulus artifact signals can take arbitrary shapes, we propose an effective dynamic range as a metric quantifying the cancellation performance. Specifically, we define the effective dynamic range as the ratio of the peak-to-peak value of the uncancelled artifact relative to the RMS value after cancellation (both input-referred). Assuming that the cancelled artifact is within the input range, the input-referred cancelled artifact RMS value is measured by taking the ADC output and input-referring it.

$$
DR_{eff}[dB] = 20 \log_{10} \frac{\text{(Uncancelled artifact)}_{\text{peak-to-peak}}}{\text{(Cancelled artifact)}_{\text{rms}}} \tag{3.12}
$$
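Equation 3.12 is straightforward to evaluate numerically. The sketch below implements the definition and its inverse; the example numbers combine the 78dB and $200\text{mV}_{\text{pp}}$ figures quoted in the introduction of this chapter, on the assumption (made here for illustration only) that both refer to the same operating point.

```python
import math


def effective_dynamic_range_db(uncancelled_vpp, cancelled_vrms):
    """DR_eff in dB per Eq. 3.12 (both quantities input-referred, in volts)."""
    return 20.0 * math.log10(uncancelled_vpp / cancelled_vrms)


def residual_rms_for_dr(uncancelled_vpp, dr_db):
    """Residual RMS that corresponds to a given DR_eff (inverse of Eq. 3.12)."""
    return uncancelled_vpp / 10.0 ** (dr_db / 20.0)


residual = residual_rms_for_dr(200e-3, 78.0)
print(f"residual = {residual * 1e6:.1f} uVrms")
print(f"DR_eff   = {effective_dynamic_range_db(200e-3, residual):.1f} dB")
```

Under that assumption, a 78dB effective dynamic range on a $200\text{mV}_{\text{pp}}$ artifact corresponds to an input-referred residual of roughly 25*µ*V rms.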

### 3.3 Circuit Design

### LNA Noise Penalty

A schematic of the LNA OTA is shown in Fig.3.4(a). Supply voltage ($V_{dd}$) is 1.0V and reference current ($I_{ref}$) is 67nA. The OTA uses a folded cascode topology and its common mode is adjusted by a CMFB circuit shown in Fig.3.4(b). The input device of the OTA is chosen to be a thick-oxide device to prevent any device damage in the presence of a large stimulus artifact. This OTA has resistors working as current sources. Compared to a CMOS current source, a resistor has the merit of having negligible flicker noise [29]. In this OTA, the flicker noise and shot noise of the input devices (M1, M2) and the thermal noise of the source devices (R1, R2) are the dominant sources of noise.

Because the neural amplifiers are AC-coupled, the analog canceller uses an SC DAC to inject the analog cancellation signal. Increasing the total capacitance of the SC DAC ($C_b$) degrades the noise performance, but also increases the maximum cancellable artifact voltage ($V_a$). Specifically, $V_a = V_{ref} \cdot C_b/C_s$ where $V_{ref}$ is the DAC reference voltage and $C_s$ is the sampling capacitance. A small-signal $\pi$-model of the LNA is shown in Fig.3.5 to illustrate the noise penalty of having non-zero $C_b$. $\text{IRN}_{rms}$ is linearly proportional to $(1+\frac{C_b}{C_s})$ assuming $C_b \ll C_s$. This is because $C_b$ works as a parasitic capacitance and degrades the LNA loop gain. In order to keep $\text{IRN}_{rms}$ the same with non-zero $C_b$, the bias current of the OTA needs to be increased by $(1 + \frac{C_b}{C_s})^2$ times. However, small $\frac{C_b}{C_s}$ is acceptable because the analog supply voltage is typically much larger than the required DM cancellation range. In this design, we set $C_b = 0.1 \cdot C_s = 1$pF to achieve a $200\text{mV}_{\text{pp}}$ DM cancellation range with a 1V $V_{ref}$ and a noise penalty of only 10%.
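The tradeoff set by $C_b/C_s$ can be tabulated with a short calculation. The sketch below uses the relations stated above; $C_s = 10$pF is implied by $C_b = 0.1 \cdot C_s = 1$pF, and reading the $200\text{mV}_{\text{pp}}$ range as a swing of $\pm V_a$ is an interpretation made here rather than something spelled out in the text.

```python
C_S = 10e-12    # sampling capacitance [F], implied by C_b = 0.1 * C_s = 1 pF
C_B = 1e-12     # total SC DAC capacitance [F]
V_REF = 1.0     # DAC reference voltage [V]


def noise_penalty(c_b, c_s):
    """Factor by which IRN_rms grows, valid for c_b << c_s."""
    return 1.0 + c_b / c_s


def bias_current_scale(c_b, c_s):
    """Bias-current increase needed to restore the original IRN_rms."""
    return noise_penalty(c_b, c_s) ** 2


def max_cancellable_amplitude(v_ref, c_b, c_s):
    """V_a = V_ref * C_b / C_s."""
    return v_ref * c_b / c_s


penalty = noise_penalty(C_B, C_S)
current_scale = bias_current_scale(C_B, C_S)
v_a = max_cancellable_amplitude(V_REF, C_B, C_S)
print(f"noise penalty {100 * (penalty - 1):.0f}%, "
      f"bias current x{current_scale:.2f}, "
      f"V_a = {v_a * 1e3:.0f} mV (+/-V_a -> {2 * v_a * 1e3:.0f} mVpp)")
```

With these values, accepting the 10% noise penalty avoids a 21% increase in OTA bias current while still covering the targeted cancellation range.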

### SC DAC Switch Noise

The SC DAC adds its own flicker, thermal, and quantization noise to the LNA (marked as  $v_{n,dac}$  in Fig.3.5). We designed the SC DAC such that its noise contribution becomes negligible compared to the OTA noise.

Fig.3.6(a) shows a single-ended version of the SC DAC. $N$ copies of switch-controlled unit capacitors ($C_{b,unit} = C_b/N$) are shunt-connected ($N = 32$ for our design). By switching them between $V_{ref}$ and $V_{gnd}$, charges are drawn from $V_{ref}$ and injected into the LNA virtual ground, or drained from the LNA virtual ground to $V_{gnd}$. Fig.3.6(b) shows a first-order equivalent circuit of the SC DAC. Out of $N$ units, $N_1$ units are hooked up to $V_{ref}$ while the rest of the units are hooked up to $V_{gnd}$. Assuming that each unit switch transistor operates in the triode region and has an on-resistance $R_{on}$, the equivalent noise bandwidth (ENBW) is:

$$
ENBW = \frac{kT/C_b}{4kTR_{on}/N} = \frac{N}{4R_{on}C_b}. \tag{3.13}
$$

An important note here is that the ENBW is wider than the LNA/VGA bandwidth. Low-pass filters embedded in the LNA and VGA, whose bandwidth is less than 10kHz ($f_{lp} < 10$kHz), filter out



Figure 3.4: (a) Schematic of the LNA OTA (b) Common-mode feedback of the LNA OTA.



Figure 3.5:  $\pi$ -model of the LNA for noise analysis.



Figure 3.6: (a) Schematic of the SC DAC (b) First-order equivalent model of the SC DAC with a noise source (c) Noise power spectrum of the SC DAC.

most of the SC DAC's switch noise.

$$
v_{n,DAC}^2 = \frac{kT}{C_b} \frac{f_{lp}}{ENBW} \tag{3.14}
$$

With the 8GHz ENBW (switch resistance is estimated as $R_{on} = 1\text{k}\Omega$), the effective SC DAC output thermal noise is $v_{n,DAC}^2 = (0.07 \mu V_{rms})^2$. The RMS value is attenuated by $C_b/C_s = 0.1$ when referred to the LNA input, and it is clearly a non-dominant factor compared to the LNA's original $\text{IRN}_{rms}$.
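The numbers in this subsection can be reproduced from Eq. 3.13 and Eq. 3.14. In the sketch below, $N$, $R_{on}$, $C_b$, and $C_b/C_s$ are taken from the text; the temperature of 300K and the use of the 10kHz upper bound for $f_{lp}$ are assumptions made for the calculation.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant [J/K]


def sc_dac_enbw(n_units, r_on, c_b):
    """Equivalent noise bandwidth of the SC DAC, Eq. 3.13 [Hz]."""
    return n_units / (4.0 * r_on * c_b)


def sc_dac_inband_noise_vrms(c_b, f_lp, enbw, temp_k=300.0):
    """In-band switch noise at the SC DAC output, Eq. 3.14 [V rms]."""
    return math.sqrt(K_B * temp_k / c_b * f_lp / enbw)


enbw = sc_dac_enbw(n_units=32, r_on=1e3, c_b=1e-12)
v_dac = sc_dac_inband_noise_vrms(c_b=1e-12, f_lp=10e3, enbw=enbw)
v_input_referred = v_dac * 0.1    # attenuated by C_b / C_s

print(f"ENBW = {enbw / 1e9:.1f} GHz")
print(f"SC DAC noise = {v_dac * 1e6:.3f} uVrms, "
      f"input-referred = {v_input_referred * 1e9:.1f} nVrms")
```

The full $kT/C_b$ noise is about 64*µ*V rms, but only the fraction $f_{lp}/ENBW$ of its power falls inside the AFE band, which is what brings it down to the 0.07*µ*V rms level.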

### $\Delta\Sigma$  Modulator

We chose an oversampling and noise-shaping DAC to achieve the high required precision. In order not to add extra noise to the AFE, the quantization noise of the SC DAC is set to be 6dB lower than the LNA input-referred noise. The DAC requires 87dB of SNR, which corresponds to about 14 bits for a Nyquist converter. Given 1pF of total capacitance, each unit capacitance becomes tiny ($1\text{pF}/2^{14} \approx 60$aF). While it is possible to achieve this small unit capacitance [34], digital calibrations and placement-and-routing overhead remain. Oversampling becomes a better option because the sampling rate is slow and higher frequency clocks are available on-chip.
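The Nyquist-converter comparison can be checked with the standard ideal-quantizer relation $SNR = 6.02N + 1.76$dB. The sketch below applies it to the 87dB requirement and the 1pF total capacitance quoted above; treating the DAC as an ideal binary array is an assumption of this estimate.

```python
def nyquist_bits_for_snr(snr_db):
    """Resolution of an ideal Nyquist converter achieving snr_db."""
    return (snr_db - 1.76) / 6.02


def unit_capacitance(c_total, n_bits):
    """Unit capacitor if c_total is split into 2**n_bits equal units."""
    return c_total / 2 ** n_bits


bits = nyquist_bits_for_snr(87.0)
c_unit = unit_capacitance(1e-12, 14)
print(f"{bits:.2f} bits -> unit capacitor of {c_unit * 1e18:.0f} aF at 14 bits")
```

A unit capacitor of roughly 60aF is far below the 36fF MOM unit used elsewhere in this design, which illustrates why a Nyquist-rate array is unattractive here.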

When designing a $\Delta\Sigma$ modulator, for a given topology, SQNR predictions based on linear models and behavioral simulations suggest several combinations of the modulator parameters such as modulator order ($m$), clock oversampling ratio ($osr$), and the number of physical bits ($N_{bit}$) [30]. The equation below gives the estimated SQNR; Appendix A contains a detailed derivation.

$$
SQNR[dB] = 1.76 + 6.02N_{bit} + 10 \log_{10} (2\pi (2m + 1)) + (2m + 1)10 \log_{10} (osr/\pi) \tag{3.15}
$$

With prototype capacitor array layouts, VLSI synthesis, and PrimeTime PX power estimation, we chose a first-order, 128x-oversampling, 5-bit modulator, which achieves the best area-power tradeoff.
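Equation 3.15 can be evaluated directly to compare candidate modulators against the 87dB requirement. The sketch below is a plain transcription of the linear-model estimate; the alternative parameter sets in the loop are hypothetical examples, and the estimate ignores non-idealities such as idle tones and element mismatch.

```python
import math


def sqnr_db(n_bit, m, osr):
    """Linear-model SQNR estimate of Eq. 3.15 for an m-th order modulator."""
    return (1.76
            + 6.02 * n_bit
            + 10.0 * math.log10(2.0 * math.pi * (2 * m + 1))
            + (2 * m + 1) * 10.0 * math.log10(osr / math.pi))


chosen = sqnr_db(n_bit=5, m=1, osr=128)
print(f"1st order, 128x, 5 bit: {chosen:.1f} dB")

# Hypothetical alternatives, listed only to show the trend of Eq. 3.15.
for n_bit, m, osr in [(1, 1, 128), (5, 1, 64), (5, 2, 32)]:
    print(f"order {m}, {osr}x, {n_bit} bit: {sqnr_db(n_bit, m, osr):.1f} dB")
```

For the chosen configuration the estimate is about 93dB, which leaves a margin of several dB over the 87dB target.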

### SC DAC Nonlinearity

Nonlinearities in the SC DAC could degrade the overall performance of the analog canceller. The DNL of the SC DAC should be less than 1LSB for the LMS loop to be stable. As long as the DNL remains in a reasonable range (e.g. worst case is less than 0.5LSB), it adds minor quantization noise ($\sigma_{q,dac}^2$) to the residual cancellation error.

The INL plays an important role because it generates nonlinear error that cannot be addressed by linear cancellers. The INL effect of the SC DAC is marked as high-order terms ($g_2$ and $g_3$ in Fig.3.3) in a polynomial equation that describes the digital-to-analog conversion of the SC DAC. Error contribution from the distortion is:

$$
\sigma_{disto}^2 = E\{ (g_2(w * d_{stim})^2 + g_3(w * d_{stim})^3)^2 \} \tag{3.16}
$$

In order to keep the INL low, calibration techniques such as digital correction on the DAC [35] or introduction of weakly nonlinear taps in the FIR/LMS [36] are required.

To minimize systematic mismatch, three capacitor networks ($C_s$, $C_f$, $C_b$) are implemented as a common-centroid array where each unit cell is made with two unit capacitors as a differential pair. The unit capacitor has a MOM structure from thin metal layers and its nominal capacitance is 36fF. Unit element mismatch is estimated to be 0.1% based on Monte-Carlo simulations.

### Common-mode Artifacts

For small common-mode artifacts that do not disturb bias conditions in the OTA, mismatches in  $C_s$  or  $C_f$  translate common-mode artifacts into differential-mode artifacts. We analyze this effect by using coupled common-mode half circuit and differential-mode half circuit. In a common-mode half circuit of the LNA,  $v_{ic}/a_{cmfb}$  is applied across  $C_s$  and  $v_{ic}$  is applied across



Figure 3.7: Differential-mode half circuit of the LNA with coupled common-mode disturbances  $i_s$  and  $i_f$  due to mismatches in  $C_s$  and  $C_f$ .

 $C_f$ .  $a_{cmfb}$  is the gain of the CMFB. Fig.3.7 shows a differential-mode half circuit of the LNA with the capacitance mismatches and differential-mode cancellation signal  $v_{dac}$ .  $C_s$  mismatch translates  $v_{ic}/a_{cmfb}$  into a voltage-controlled current source  $i_s(s) = \Delta C_s s \cdot v_{ic}/(2a_{cmfb})$ .  $C_f$ mismatch converts  $v_{ic}$  into  $i_c(s) = \Delta C_f s \cdot v_{ic}$ . Note that as long as total sum of the differentialmode artifacts (due to differential-mode input  $v_{id}$  and common-mode input  $v_{ic}$ ) are within the cancellation range, common-mode artifacts can be also handled by differential-mode artifact cancellers.

$$
|A_{dm-dm}v_{id} + A_{cm-dm}v_{ic}| < |A_{canc}v_{dac}| \tag{3.17}
$$

$$
A_{dm-dm} = \frac{C_s}{C_f} \qquad A_{canc} = \frac{C_b}{C_f} \tag{3.18}
$$

$$
A_{cm-dm} = \frac{\Delta C_s}{C_f} \frac{1}{a_{cmfb}} + \frac{\Delta C_f}{C_f} \tag{3.19}
$$

Large common-mode artifacts disturb the input common modes of the OTA, and derail transistors in the OTA from their intended bias conditions. Common-mode cancellation circuits such as [21], [22], [23] can be employed as complementary techniques to avoid this issue.

### 3.4 LMS Loop

#### **Convergence**

We note that the convergence of the LMS loop in the analog canceller needs to be investigated since the AFE chain embeds filters. Fig.3.8 shows a z-domain model of the loop with an underlying sampling frequency of 20kHz. The stimulus sequence is marked as data $x$, the input-referred stimulus artifact is marked as $d$, the output of the SC DAC is marked as $y$, the input-referred residual error is marked as $e = d - y$, and the DC-removed ADC sample is marked as $e^f$. $g_1$ is the digital-to-analog scaling factor of the SC DAC (0.2V/32 for our design). Quantization effects from the SC DAC and the ADC are not considered for convergence analysis but are considered for precision analysis in the following subsection.

The LMS loop correlates filtered error  $e^f$  and filtered data  $x^f$ . The error filter  $H_e(z)$  is:

$$
H_e(z) = H_{lna}(z)H_{vga}(z)H_{buf}(z)z^{-1}H_{hpf}(z) \tag{3.20}
$$

 $H_{lna}(z)$  and  $H_{vga}(z)$  each model LNA and VGA as bandpass filters (high pass = 10Hz, low pass = 8.3kHz) whose gains are  $20[V/V]$  and  $1[V/V]$ . A flip flop at the output of the ADC introduces a delay.  $H_{hpf}(z)$  is the digital HPF removing offset of the ADC sample. The data



Figure 3.8: Block diagram of the LMS loop in the analog canceller. The stimulus sequence is marked as data *x*, the input-referred stimulus artifact is marked as *d*, the output of the SC DAC is marked as *y*, the input-referred residual error is marked as  $e = d - y$ , and the DC-removed ADC sample is marked as  $e^f$ . z-domain models of the error filter  $H_e(z)$  and the data filter  $H_x(z)$  are shown.



Figure 3.9: Root locus of the magnitude-magnitude LMS loop. All roots are located inside the unit circle for  $w_{LSB} \leq \mu \leq 0.5w_{MSB}$ .

filter $H_x(z)$ is a delay, which is introduced to compensate for the delay in the cancellation signal generation path and the error sensing path.

Assuming that 1) the LMS loop correlates the magnitude of the error and the magnitude of the data instead of the sign of the error and the sign of the data, and 2) there is no nonlinear effect such as AFE saturation, we show that the loop is stable for small  $\mu$ . By looking at natural modes of the loop, the tap coefficient update boils down to the characteristic equation below, and we only need to check whether all the roots remain in the unit circle.

$$
z - 1 + \mu g_1 \lambda_{max} \sum_{j=0}^{M_x} h_j^x \sum_{i=0}^{M_e} h_i^e z^{-i} = 0 \tag{3.21}
$$

 $\lambda_{max}$  is the maximum eigenvalue of the autocorrelation matrix of data *x*,  $h^x$  is impulse response of the data filter, and  $h^e$  is impulse response of the error filter. Fig.3.9 shows that roots of the characteristic equation remain in the unit circle for  $w_{LSB} \leq \mu \leq 0.5w_{MSB}$ . Appendix B contains a detailed derivation of the characteristic equation based on [31].

Convergence analysis becomes trickier once non-linearities such as sign-sign LMS and AFE saturation are considered. We checked the loop convergence with behavioral simulations for various artifact channel types. For unfiltered sign-sign LMS, [33] provides a convergence analysis.

#### FIR Quantization Error

The FIR filter and the LMS loop in the analog canceller leave a residual artifact due to quantization error in *w*. As shown in [33], the steady-state variance of the residual artifact of the LMS loop  $(\sigma_{q,fir|lms}^2)$  can be expressed as:

$$
\sigma_{q,fir|lms}^2 = \mu \sigma_x \sigma_{w,tot} \frac{\pi}{4} N_{tap}. \tag{3.22}
$$

$\mu$ is set to $w_{LSB}$. $N_{tap}$ is the number of FIR taps (32 for our design). $\sigma_x^2$ is the variance of the stimulation sequence. Larger $\sigma_x^2$ increases $\sigma_{q,fir|lms}^2$ since $|d_{stim}|$ amplifies the tap coefficient round-off error. $\sigma_w^2$ is the FIR tap coefficient quantization error, with each tap contributing independent quantization error such that the total variance is $\sigma_{w,tot}^2 = N_{tap} w_{LSB}^2 / 12$. The last term $(\pi/4)N_{tap}$ comes from propagation of the tap coefficient error over the sign-sign LMS convergence. As will be shown in the measurement results section, this residual error is larger than the AFE's thermal noise.

The residual quantization error from the analog canceller can be further eliminated by the digital canceller, which has another FIR/LMS with a higher precision (Fig.3.2). The digital canceller uses the same number of taps, but the LSBs for the tap coefficient  $(w_d)$  are 16x smaller than  $w$ . The output of the FIR in the digital canceller  $(d_{canc,d})$  is rounded to the ADC LSB. However, in order to be canceled in the digital backend, the residual quantization error from the analog canceller should remain linear over the AFE chain. This condition is expressed as:

$$
\sigma_{q,fir|lms,eff}^2 = k_{sup}^2 \sigma_{q,fir|lms}^2 \tag{3.23}
$$

$$
k_{sup} = \begin{cases} \frac{w_{d, LSB}}{w_{LSB}} & \text{if } g_1 \sigma_{q, fir|lms} < (\text{AFE input range})_{\text{pp}}\\ 1 & \text{otherwise.} \end{cases} \tag{3.24}
$$
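The piecewise suppression factor in (3.23) and (3.24) can be written as a small helper. This is only a transcription of the two equations as printed; the argument names are chosen here for readability and the numeric values are hypothetical.

```python
def k_sup(w_d_lsb, w_lsb, g1, sigma_q, afe_input_range_pp):
    """Suppression factor of (3.24): the digital canceller helps only
    while the amplified residual stays within the AFE input range."""
    if g1 * sigma_q < afe_input_range_pp:
        return w_d_lsb / w_lsb
    return 1.0

def effective_residual_variance(sigma_q_sq, w_d_lsb, w_lsb, g1, afe_input_range_pp):
    """Effective residual variance of (3.23)."""
    k = k_sup(w_d_lsb, w_lsb, g1, sigma_q_sq ** 0.5, afe_input_range_pp)
    return k ** 2 * sigma_q_sq

# Hypothetical normalized values: digital LSB is 16x smaller than analog LSB.
linear_case = effective_residual_variance(1e-4, 1.0 / 16.0, 1.0, 10.0, 1.0)
saturated_case = effective_residual_variance(1.0, 1.0 / 16.0, 1.0, 10.0, 1.0)
print(linear_case, saturated_case)
```

In the linear case the variance drops by $16^2 = 256$, while in the saturated case the digital canceller provides no reduction.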

With 65LP HVT devices, the synthesized digital circuits within the cancellers ( $\Delta\Sigma$  modulator, FIR filters, and LMS blocks) occupy  $0.13$ mm<sup>2</sup> of area and consume  $0.91\mu$ W of power with  $V_{DD}$ =1.0V for a typical 10Hz stimulation sequence.

### 3.5 Measurement Results

#### Benchtop measurements

Fig.3.10 shows measured IRN for AFEs without the canceller and for AFEs with the canceller. AFEs with the canceller achieve  $8.2 \mu V_{\rm rms}$ . As shown by the measurements in Fig.3.11, the SC DAC maintains a static  $|\text{DNL}| < 0.3$ LSB and  $|\text{INL}| < 0.75$ LSB. Since the AFE has limited input range and high gain, the SC DAC's analog performance could not be characterized directly (i.e., it was not feasible to ground the LNA input and read the ADC output after applying a specific code to the SC DAC). Instead, the SC DAC's analog output voltage was measured by inserting a slow square wave (using a Stanford Research Systems DS360 low-distortion low-noise signal generator) as a step function at the LNA input. The square wave's amplitude was adjusted such that the ADC output was zero when a digital square wave of the SC DAC was estimated by looking for rising/falling edges in the ADC output.



Figure 3.10: Measured input-referred noise PSD.

Figure 3.11: Measured SC DAC DNL/INL (two samples).

Figure 3.12: Schematic of benchtop stimulus artifact measurement setup in absence of an added sinusoidal signal (a) and in presence of an added sinusoidal signal (b).

In order to measure the canceller's performance in a controlled environment, a PCB including the test-chip and an explicit artifact path (from the stimulator to the AFE) was used (Fig.3.12 (a)). The stimulator was loaded with a model of the electrode impedance  $(Z_{dm} = 50\Omega, Z_{cm} = 1M\Omega||1nF)$ , and the artifact path was created by a differential-to-single-ended instrumentation amplifier (IA) with tunable gain, a passive filter, and an audio transformer (Hammond Mfg 140QEX). Three types of passive filters were tried: APF (an SMA cable), RC (R=50 $\Omega$ , C=1.5 $\mu$ F), and LC (L=1.2mH, C=0.15 $\mu$ F).
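To get a feel for where these passive filters act relative to the recording band, the sketch below computes nominal characteristic frequencies from the listed component values. It assumes a simple first-order RC section and an ideal LC resonance; the exact filter topology and loading are not specified in the text, so these figures are rough estimates only.

```python
import math

def rc_corner_hz(r_ohm, c_farad):
    """Corner frequency of an ideal first-order RC section."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

def lc_resonance_hz(l_henry, c_farad):
    """Resonant frequency of an ideal LC section."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

f_rc = rc_corner_hz(50.0, 1.5e-6)        # R = 50 ohm, C = 1.5 uF
f_lc = lc_resonance_hz(1.2e-3, 0.15e-6)  # L = 1.2 mH, C = 0.15 uF
print(f"RC corner ~ {f_rc / 1e3:.2f} kHz, LC resonance ~ {f_lc / 1e3:.2f} kHz")
```

Under these assumptions the RC corner lands at roughly 2 kHz and the LC resonance at roughly 12 kHz.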

Fig.3.13 shows time-domain waveforms of stimulus artifacts without any cancellation and with the suggested cancellation for the RC filter. The stimulation sequence was a 10Hz periodic biphasic pulse with  $400\mu s$  per phase, and the stimulation current was  $216\mu A$  (differential). The input-referred stimulus artifact was  $86.4 \text{mV}_{\text{pp}}$ , causing the AFE to saturate (Fig.3.13 (left)). The ADC output went through the optional HPF, eliminating both the ADC offset and the DC component of the artifact. After running the on-chip LMS loops, the on-chip FIR tap coefficients converged as shown in Fig.3.14. The artifact signals were subtracted with analog/digital replicas generated based on the filters, leaving minimal residue (Fig.3.13 (middle)).

Figure 3.13: Stimulus artifact waveform without cancellations (left), with cancellations (middle), and zoom-in of the with cancellations (right) for the RC filter. The optional HPF was applied.

Figure 3.14: Converged on-chip FIR coefficients in the analog canceller (top) and converged on-chip FIR coefficients in the digital canceller (bottom) for the RC artifact channel.

Figure 3.15: LMS learning curve for the benchtop artifact channel (RC filter).

Figure 3.16: Input range of the AFE and input-referred noise from the AFE and the stimulus artifact path with a variable AFE gain (left). Input-referred noise power spectral density for the minimum AFE gain and the maximum AFE gain (right).

The LMS adaptation learning curve associated with the RC filter example is shown in Fig.3.15. In the first 1500 iterations, the analog LMS loop adapts the FIR coefficients in the analog canceller using the smallest learning parameter  $(\mu = w_{LSB})$  to decrease the RMS error at the ADC output. After the tap coefficients settle, the digital LMS loop is turned on to track the quantization error of the FIR in the analog canceller. Because the LSB of the digital LMS loop is 16x smaller than the LSB of the analog LMS loop, the analog loop requires a narrower bandwidth. We accomplish this by simply turning off the analog LMS loop when the digital LMS loop starts operating.
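The two-phase adaptation described above (coarse loop first, then a frozen coarse filter plus a 16x finer loop) can be illustrated with a toy behavioral model. This is a simplified sketch, not the on-chip implementation: it uses an unfiltered sign-sign LMS, ignores the AFE, the SC DAC, and saturation, drives the loop with a random ternary sequence rather than periodic biphasic pulses, and uses a hypothetical 8-tap decaying channel `h`. The tap count, LSB, and iteration counts are all assumptions.

```python
import numpy as np

def run_sign_sign_lms(x, d, taps, step, start, stop, fixed=None):
    """Adapt `taps` in place over samples [start, stop); return the residual."""
    n_tap = len(taps)
    fixed = np.zeros(n_tap) if fixed is None else fixed
    err = np.empty(stop - start)
    for n in range(start, stop):
        x_vec = x[n - n_tap + 1:n + 1][::-1]        # x[n], x[n-1], ...
        e = d[n] - (fixed + taps) @ x_vec           # residual after cancellation
        taps += step * np.sign(e) * np.sign(x_vec)  # sign-sign update
        err[n - start] = e
    return err

def rms(v):
    return float(np.sqrt(np.mean(v ** 2)))

rng = np.random.default_rng(0)
N_TAP, W_LSB, N_ITER = 8, 2.0 ** -6, 3000            # hypothetical settings
h = 0.5 * 0.7 ** np.arange(N_TAP)                    # hypothetical artifact channel
x = rng.choice([-1.0, 0.0, 1.0], size=2 * N_ITER + N_TAP)
d = np.convolve(x, h)[:len(x)]                       # artifact seen by the canceller

w = np.zeros(N_TAP)      # coarse taps (analog canceller)
w_d = np.zeros(N_TAP)    # fine taps (digital canceller), LSB 16x smaller

# Phase 1: coarse loop adapts with mu = w_LSB.
e1 = run_sign_sign_lms(x, d, w, W_LSB, N_TAP, N_ITER + N_TAP)
# Phase 2: coarse taps frozen, fine loop tracks the remaining error.
e2 = run_sign_sign_lms(x, d, w_d, W_LSB / 16.0, N_ITER + N_TAP,
                       2 * N_ITER + N_TAP, fixed=w)

rms_raw, rms_analog, rms_digital = rms(d), rms(e1[-500:]), rms(e2[-500:])
print(f"raw {rms_raw:.4f}, after coarse {rms_analog:.4f}, after fine {rms_digital:.4f}")
```

In this toy model the residual after the coarse phase is limited by dithering of the coarse taps, and the finer loop reduces it further once the coarse taps are frozen, which mirrors the qualitative shape of the learning curve.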

As Fig.3.15 also shows, the residual RMS error converges to  $20\mu V_{\rm rms}$  when referred to the AFE input after the digital cancellation. The residual error of the canceled artifacts measured with the AFE is the sum of noise from the AFE, noise from the artifact path, and noise/distortion from the canceller. Fig.3.16 shows the effect of adding the stimulus artifact path noise to the AFE noise. Note that the added stimulus artifact noise is mostly from tones falling into the kHz band. The variable AFE gain presents a tradeoff between the input range and the AFE noise, as shown in Fig.3.16. In this figure, the total AFE noise and coupled stimulator noise was  $17.5\mu V_{\rm rms}$  in the lowest gain setting.

The precision of the canceller was further measured by employing a set of  $d_{stim}$  sequences, IA gains, passive filters, and AFE gains. The stimulation sequence was a 10Hz periodic biphasic pulse with  $200\mu s$  per phase, and the stimulation magnitude was varied for  $|d_{stim}|=1,4,8$  (i.e.,  $27\mu$A,  $108\mu$A, and  $216\mu$A differential stimulus current). Fig.3.17 shows measured RMS values of canceled artifacts for the LC filter and the RC filter in the lowest AFE gain setting with the parameter sweep set. Fig.3.18 shows measured RMS values of canceled artifacts for the all-pass filter in the lowest AFE gain setting and in the highest AFE gain setting.
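The stimulation sequence described above can be sketched as a sampled waveform. The 27µA-per-code scaling follows from the listed code/current pairs; the sample rate `FS_HZ`, the positive-phase-first polarity, and the absence of an inter-phase gap are assumptions made here for illustration.

```python
import numpy as np

I_LSB_UA = 27.0  # differential current per |d_stim| code, from the listed pairs

def biphasic_sequence(d_stim, fs_hz, duration_s, rate_hz=10.0, phase_s=200e-6):
    """Periodic charge-balanced biphasic pulse train, in microamperes."""
    n = int(round(duration_s * fs_hz))
    period = int(round(fs_hz / rate_hz))
    phase = int(round(phase_s * fs_hz))
    i_ua = np.zeros(n)
    for start in range(0, n - 2 * phase + 1, period):
        i_ua[start:start + phase] = d_stim * I_LSB_UA               # first phase
        i_ua[start + phase:start + 2 * phase] = -d_stim * I_LSB_UA  # second phase
    return i_ua

FS_HZ = 20e3  # hypothetical sample rate
seq = biphasic_sequence(d_stim=8, fs_hz=FS_HZ, duration_s=0.5)
print(seq.max(), seq.min(), np.count_nonzero(seq))
```

With `d_stim=8` the peak current is 216µA, and each pulse is charge balanced because both phases have equal duration and magnitude.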

For stimulus artifacts less than  $50 \text{mV}_{\text{pp}}$ , the canceled artifacts are below the noise level of the AFE and the artifact path. For stimulus artifacts larger than  $50 \text{mV}_{\text{pp}}$ , different error sources become dominant depending on the AFE gain. In the lowest AFE gain setting, the nonlinearity of the SC DAC becomes the dominant source of the residual error. In the highest AFE gain setting, the quantization error of the FIR in the analog canceller becomes the dominant noise source because it saturates the AFE and therefore cannot be subtracted by the digital canceller. Fig.3.19 shows a scatter plot of all measured benchtop results. The RMS error is converted into the effective dynamic range. The maximum measured effective dynamic range is 78dB.

The canceller's performance was also evaluated in the presence of the desired signal. As shown in Fig.3.12 (b), the sum of a stimulus artifact signal and a sinusoidal signal, obtained using two transformers, was fed into the AFE. Fig.3.20 shows time-domain waveforms of the original artifact and the cancelled artifact, both in the presence of a  $1kHz \, 1mV_{rms}$  sine.



Figure 3.17: Measured input-referred RMS values of the cancelled stimulus artifacts in the lowest AFE gain setting for the RC filter (top) and for the LC filter (bottom). The magnitude of the uncancelled artifacts was varied by adjusting the IA gain and the stimulus code. Solid lines with symbols mark measurement results and dashed lines without symbols mark calculated error either from the SC DAC INL or the FIR/LMS quantization.



Figure 3.18: Measured input-referred RMS values of the canceled stimulus artifacts for the all-pass filter in the lowest AFE gain setting (top) and the highest AFE gain setting (bottom).



Figure 3.19: Scatter plot of measured effective dynamic range in the benchtop setup.



Figure 3.20: Stimulation artifact waveform without cancellation (left) and with cancellation (right) for the LC filter in presence of an added sinusoid.

#### *In vivo* measurements

An *in vivo* experiment was conducted to verify the artifact cancellation technique. A 40Hz biphasic pulse stimulation was applied to a rat with a  $31\mu$A differential current for 1ms per phase. Fig.3.21 shows a spectrogram recorded with the chip AFE during stimulation and cancellation progression. The first two seconds recorded the baseline without any stimulation. While 60 Hz noise and its second harmonic were present, this did not degrade the input range. After the stimulation was activated, the analog canceller LMS loop ran for 3.5 seconds. Once the analog canceller settled, its LMS loop was stopped while the canceller was still active, and the digital canceller LMS loop was activated for the next 12 seconds. The uncancelled artifact was  $50 \text{mV}_{\text{pp}}$  (the peak-to-peak value was measured with a small stimulus current and scaled up linearly with the desired stimulus current), the cancelled artifact was  $277 \mu V_{\rm rms}$  (background noise included), and the effective dynamic range was 45dB. Because this *in vivo* experiment was performed on a rat with electrodes implanted 10 months previously, no meaningful action potentials were found.
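The effective dynamic range figure can be reproduced from the quoted numbers. The sketch below assumes the ratio is taken between the uncancelled peak-to-peak artifact and the residual RMS value, following the definition in footnote b of Table 3.1; the function name is chosen here for illustration.

```python
import math

def effective_dynamic_range_db(artifact_vpp, residual_vrms):
    """DR_eff = 20*log10(peak-to-peak artifact / RMS residual or noise)."""
    return 20.0 * math.log10(artifact_vpp / residual_vrms)

# In vivo: 50 mVpp uncancelled artifact, 277 uVrms cancelled artifact.
in_vivo_dr = effective_dynamic_range_db(50e-3, 277e-6)
# Table 3.1, this work: 200 mVpp tolerable artifact, 8.2 uVrms IRN.
max_dr = effective_dynamic_range_db(200e-3, 8.2e-6)
print(f"in vivo: {in_vivo_dr:.1f} dB, maximum achievable: {max_dr:.1f} dB")
```

Both values land within 1dB of the reported 45dB and 87dB figures.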



Figure 3.21: Spectrogram of *in vivo* artifact cancellations. Power spectral density is plotted on a logarithmic scale.

### 3.6 Conclusion

The SoC was fabricated in TSMC 65nm LP CMOS and occupies 5.14mm<sup>2</sup> including pads (Fig.3.22). The key performance metrics of the design are summarized in Table 3.1; in comparison with the state of the art, this design extends cancellation to wider bandwidths while retaining competitive effective dynamic range (max tolerable artifact/noise) and NEF/PEF.

|                                                 | [13]                        | [21]                                   | [23]                        | [22]                | This work                   |
|-------------------------------------------------|-----------------------------|----------------------------------------|-----------------------------|---------------------|-----------------------------|
| Technology                                      | 180nm                       | 40nm                                   | 65nm                        | 180nm               | 65nm                        |
| Analog $V_{DD}$                                 | 1.0V                        | 1.2V                                   | 2.5V                        | 1.0V                | 1.0V                        |
| BW                                              | 0.5kHz                      | 5kHz                                   | 1kHz                        | 2kHz                | 8.3kHz                      |
| Input range (DM)                                | $100 \text{mV}_{\text{pp}}$ | $80 \text{mV}_{\text{pp}}$             | $110 \text{mV}_{\text{pp}}$ |                     | $0.53 - 1.2 \text{mV}_{\text{pp}}$ |
| Maximum tolerable artifact                      | $100 \text{mV}_{\text{pp}}$ | $650 \text{mV}_{\text{pp}}^{\text{a}}$ | $110 \text{mV}_{\text{pp}}$ |                     | $200 \text{mV}_{\text{pp}}$ |
| IRN (rms)                                       | $1.6\mu\mathrm{V}$          | $5.3\mu\mathrm{V}$                     | $2.78\mu\mathrm{V}$         | $3.05\mu\mathrm{V}$ | $8.2\mu\mathrm{V}$          |
| Maximum achievable $DR_{\text{eff}}^{\text{b}}$ | 96dB                        | 84dB<sup>c</sup>                       | 92dB                        |                     | 87dB                        |
| Measured $DR_{\text{eff}}$                      |                             | 65dB<sup>d</sup>                       | 25dB<sup>e</sup>            | 24dB<sup>f</sup>    | 77dB                        |
| Power/ch                                        | $8\mu$W                     | $2.8\mu$W                              | $2.98\mu$W                  | $0.33\mu$W          | $2.7\mu$W<sup>g</sup>       |
| Area/ch (mm<sup>2</sup>)                        |                             | 0.069                                  | 0.0023                      | 0.17                | 0.18<sup>h</sup>            |
| NEF                                             | 7.8                         | 4.4                                    | 2.4                         |                     | 4.7                         |
| PEF                                             | 60.8                        | 23.2                                   | 13.8                        |                     | 22.0                        |
| Cancellation mode                               |                             | CM                                     | CM, DM                      | CM, DM              | DM                          |

Table 3.1: Comparison of the stimulus artifact removal ICs

<sup>a</sup> Common mode.

<sup>b</sup> Artifacts are not included. Calculated as  $20\log_{10}(\text{maximum tolerable DM artifact}/\text{IRN}_{\text{rms}})$ .

<sup>c</sup> Maximum achievable DR<sub>eff</sub> is 102dB when calculated for the maximum tolerable CM artifact.

<sup>d</sup> Residual cancelled artifact is calculated as  $0.36 \text{mV}_{\text{rms}}$  based on reported SIR. Uncancelled artifact (CM) is  $650 \text{mV}_{\text{pp}}$ .

<sup>e</sup> *In vivo* measurement result. Residual cancelled artifact is estimated to be  $5.7 \mu \text{V}_{\text{rms}}$  and uncancelled artifact is estimated to be  $0.1 \text{mV}_{\text{pp}}$  based on reported ADC codes and IRN.

<sup>f</sup> *In vivo* measurement result.

<sup>g</sup> AFE power = 1.8 $\mu$W, canceller power = 0.91 $\mu$W.

<sup>h</sup> AFE area =  $0.051$mm<sup>2</sup>, canceller area =  $0.13$mm<sup>2</sup>.



Figure 3.22: Die photo.

# Chapter 4

# Conclusion

### 4.1 Summary

This dissertation contains two works in the domain of electrical neural interfaces. The first work emphasizes system integration, combining state-of-the-art components such as a low-power, low-energy neuromodulation IC, fine and flexible long-lasting electrodes, and a fast and accurate insertion robot. Prototype integration results were presented along with *in vitro* data. Several tasks remain in this work: an imaging guide needs to be installed on the inserter machine, integration of all the sub-modules is not yet complete, and *in vivo* validation is still required.

The second work advanced the neuromodulation chip further by adding a stimulus artifact cancellation feature. This work achieved a state-of-the-art effective dynamic range (max tolerable artifact/noise) with competitive noise efficiency and a wide bandwidth. CM cancellation can be integrated as a complementary feature.

### 4.2 Future Work

Recently there have been a number of works on single-stimulator single-recorder active cancellation in ICs. Both common-mode and differential-mode artifacts have been addressed using adaptive learning methods. While solutions to this problem are fairly mature, a more interesting problem arises when multiple recorders and multiple stimulators coexist, since BMIs call for high-density closed-loop neuromodulation. This MIMO (multiple-input multiple-output) canceller network design problem could be solvable assuming that all the stim/record channels reside on the same chip, so that timing/magnitude references of the stimulation artifacts are available.

One of the challenges in designing the MIMO filter is that there is little parametric study of multi-channel stimulus artifacts. Although there is a general qualitative opinion that the *in vivo* environment is nonlinear and time-varying due to electrochemical reactions, numerical figures for channel responses have not yet been reported. A thorough investigation is required.

Once channel responses become available, one can try various signal processing algorithms to find the most effective scheme, keeping in mind the complexity and energy efficiency of an implementation. If coupling signals have sufficient linearity and spatial correlation, linear adaptive filters become good candidates. If they lack linearity or spatial correlation, neural networks could be worth investigating.

Once there is a consensus on what signal processing algorithms are appropriate, there is a great opportunity for hardware designers. Although in a different application, a MIMO duplex radio work demonstrates how a canceller network can be efficiently designed  $[37]$ . A key observation is that the cross-talk from antenna i to antenna j is mostly an attenuated and delayed version of the self-talk of antenna i (or j), because antenna i and antenna j neighbor each other (e.g.  $\lambda/2$  spacing) and have similar path losses. By partitioning filters and jointly training them, for an M-antenna MIMO, the authors were able to demonstrate a linearly scaling canceller network  $(\sim M)$  instead of a quadratically scaling canceller network  $(M^2)$  that achieves a full-duplex MIMO radio. A similar work in the context of neuromodulation could enable an implantable high-density simultaneous microstimulation and recording interface for closed-loop applications.

# Appendix A

# Estimation of SQNR for a CIFB  $\Delta\Sigma$  Modulator

Assuming a non-delayed integrator, the signal transfer function is  $STF(z) = 1$ , and the quantization noise transfer function is  $NTF(z) = (1 - z^{-1})^m$ . Assuming that the incoming signal has a half amplitude of a normalized full scale, the signal power is  $P_s = \frac{1}{2}(0.5)^2$ . Without any noise shaping, the quantization noise power is  $P_n = \frac{1}{12}\Delta^2$  where  $\Delta = 2^{-N_{bit}}$ . With noise shaping, the in-band quantization noise power is:

$$
\begin{aligned}
P_n &= \int_0^{\frac{0.5}{osr}} |NTF(f)|^2 \frac{\Delta^2}{12} df = \int_0^{\frac{0.5}{osr}} |2\sin(\pi f)|^{2m} \frac{\Delta^2}{12} df \\
&\approx \frac{\Delta^2}{12} \int_0^{\frac{0.5}{osr}} |2\pi f|^{2m} df = \frac{\Delta^2}{12} \frac{(\pi/osr)^{2m+1}}{2\pi (2m+1)}.
\end{aligned} \tag{A.1}
$$

The second term of  $P_n$  indicates SQNR boosting from  $\Delta\Sigma$  modulation.
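The closed form in (A.1) can be checked against a direct numerical integration of $|2\sin(\pi f)|^{2m}$. The sketch below does this and converts the result into an SQNR estimate using $P_s = \frac{1}{2}(0.5)^2$. The modulator order, oversampling ratio, and quantizer resolution used in the example are hypothetical values, not the parameters of the fabricated design.

```python
import math
import numpy as np

def inband_noise_closed_form(m, osr, n_bit):
    """In-band quantization noise power from the approximation in (A.1)."""
    delta = 2.0 ** -n_bit
    return delta ** 2 / 12.0 * (math.pi / osr) ** (2 * m + 1) / (2.0 * math.pi * (2 * m + 1))

def inband_noise_numeric(m, osr, n_bit, n_points=20001):
    """Same quantity, integrating |2 sin(pi f)|^(2m) with the trapezoidal rule."""
    delta = 2.0 ** -n_bit
    f = np.linspace(0.0, 0.5 / osr, n_points)
    ntf_sq = np.abs(2.0 * np.sin(np.pi * f)) ** (2 * m)
    integral = float(np.sum((ntf_sq[1:] + ntf_sq[:-1]) * np.diff(f)) / 2.0)
    return delta ** 2 / 12.0 * integral

def sqnr_db(m, osr, n_bit):
    """SQNR for a half-full-scale sine, P_s = 0.5 * 0.5^2."""
    p_s = 0.5 * 0.5 ** 2
    return 10.0 * math.log10(p_s / inband_noise_closed_form(m, osr, n_bit))

# Hypothetical example: 2nd-order modulator, OSR of 64, 4-bit quantizer.
M_ORDER, OSR, N_BIT = 2, 64, 4
p_closed = inband_noise_closed_form(M_ORDER, OSR, N_BIT)
p_numeric = inband_noise_numeric(M_ORDER, OSR, N_BIT)
print(f"closed form {p_closed:.3e}, numeric {p_numeric:.3e}, "
      f"SQNR {sqnr_db(M_ORDER, OSR, N_BIT):.1f} dB")
```

For large oversampling ratios the small-angle approximation $2\sin(\pi f) \approx 2\pi f$ is tight, so the two noise estimates agree closely.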

# Appendix B

# LMS Loop Convergence for Filtered Error

The FIR tap coefficient  $w_n$  is updated based on the filtered error  $e_n^f$  and the filtered data  $x_n^f$ . The error filter has an impulse response of  $h_i^e$  ( $i = 0, ..., M_e$ ). The data filter has an impulse response of  $h_j^x$   $(j = 0, ..., M_x)$ .

$$
\boldsymbol{w}_{n+1} = \boldsymbol{w}_n + \mu \boldsymbol{x}_n^f e_n^f \tag{B.1}
$$

$$
\boldsymbol{x}_n^f = \sum_{j=0}^{M_x} h_j^x \boldsymbol{x}_{n-j} \tag{B.2}
$$

$$
e_n^f = \sum_{i=0}^{M_e} h_i^e e_{n-i} \tag{B.3}
$$

$$
e_n = d_n - y_n = d_n - g_1 \boldsymbol{w}_n^T \boldsymbol{x}_n \tag{B.4}
$$

Minimum RMSE solution of  $w$  is marked as  $w_0$ .

$$
d_{n-i}\boldsymbol{x}_{n-j} = g_1\boldsymbol{x}_{n-j}\boldsymbol{x}_{n-i}^T\boldsymbol{w}_0 \tag{B.5}
$$

We introduce  $c_n$  to track convergence of  $w_n$  over iterations toward the optimum solution  $w_0$  $(c_n = \boldsymbol{w}_n - \boldsymbol{w}_0).$ 

$$
\boldsymbol{c}_{n+1} = \boldsymbol{c}_n - \mu g_1 \sum_{j=0}^{M_x} h_j^x \sum_{i=0}^{M_e} h_i^e \boldsymbol{x}_{n-j} \boldsymbol{x}_{n-i}^T \boldsymbol{c}_{n-i} \tag{B.6}
$$

In order to check convergence of  $c_n$  in a mean sense, we apply expectation operators to the above equation. Assume that  $x_n$  and  $c_n$  are independent of each other.

$$
E\{\boldsymbol{c}_{n+1}\} = E\{\boldsymbol{c}_n\} - \mu g_1 R_x \sum_{j=0}^{M_x} h_j^x \sum_{i=0}^{M_e} h_i^e E\{\boldsymbol{c}_{n-i}\} \tag{B.7}
$$

Diagonalize the autocorrelation matrix  $R_x$  to find its natural modes, and rotate  $E\{\boldsymbol{c}_n\}$  with the eigenvectors of  $R_x$ .

$$
R_x = Q\Lambda Q^T \text{ where } QQ^T = Q^T Q = I \tag{B.8}
$$

$$
\Lambda = diag(\lambda_1, \lambda_2, ..., \lambda_{N_{tap}}) \tag{B.9}
$$

$$
\boldsymbol{v}_n = Q^T E\{\boldsymbol{c}_n\} \tag{B.10}
$$

$$
\boldsymbol{v}_{n+1} = \boldsymbol{v}_n - \mu g_1 \Lambda \sum_{j=0}^{M_x} h_j^x \sum_{i=0}^{M_e} h_i^e \boldsymbol{v}_{n-i} \tag{B.11}
$$

For each element of  $v_n$ , a characteristic equation exists  $(k = 1, 2, ..., N_{tap})$ .

$$
z - 1 + \mu g_1 \lambda_k \sum_{j=0}^{M_x} h_j^x \sum_{i=0}^{M_e} h_i^e z^{-i} = 0 \tag{B.12}
$$

To ensure the stability of the error-filtered LMS loop in a mean sense, roots of the characteristic equations should remain inside the unit circle. The largest eigenvalue  $\lambda_{max}$  puts the most stringent requirement on how small  $\mu$  should be.
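As a quick sanity check on (B.12), consider the special case with no error or data filtering, i.e.  $M_x = M_e = 0$  and  $h_0^x = h_0^e = 1$ . This case is assumed here purely for illustration, with  $g_1$  the gain from (B.4) and  $g_1 \lambda_k > 0$ . The characteristic equation then has a single real root per mode:

$$
z - 1 + \mu g_1 \lambda_k = 0 \;\Rightarrow\; z = 1 - \mu g_1 \lambda_k, \qquad |z| < 1 \iff 0 < \mu < \frac{2}{g_1 \lambda_k}.
$$

Requiring this for every  $k$  gives  $0 < \mu < 2/(g_1 \lambda_{max})$ , which is the familiar mean-convergence bound of the unfiltered LMS algorithm with the loop gain  $g_1$  folded in.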

# Bibliography

- [1] M. Nicolelis and M. Lebedev, "Principles of Neural Ensemble Physiology Underlying the Operation of Brain-Machine Interfaces," Nature Reviews Neuroscience, Jul. 2009.
- [2] A. Marblestone, B. Zamft, Y. Maguire, M. Shapiro, T. Cybulski, J. Glaser, D. Amodei, P. Stranges, R. Kalhor, D. Dalrymple, D. Seo, E. Alon, M. Maharbiz, J. Carmena, J. Rabaey, E. Boyden, G. Church, and K. Kording, "Physical Principles for Scalable Neural Recording," Frontiers in Computational Neuroscience, 2013.
- [3] J. Barrese, N. Rao, K. Paroo, C. Triebwasser, C. Vargas-Irwin, L. Franquemont, and J. Donoghue, "Failure mode analysis of silicon-based intracortical microelectrode arrays in non-human primates," J. Neural Eng., 2013.
- [4] C. Nordhausen, E. Maynard, and R. Normann, "Single unit recording capabilities of a 100 microelectrode array," Brain Research, vol. 726, Jul. 1996, pp. 129-140.
- [5] C. Diaz-Botia, L. Luna, R. Neely, M. Chamanzar, C. Carraro, J. Carmena, P. Sabes, R. Maboudian, and M. Maharbiz, "A Silicon Carbide Array for Electrocorticography and Peripheral Nerve Recording," J. Neural Eng., vol. 14, no. 5, Aug. 2017.
- [6] J. Du, T. Blanche, R. Harrison, H. Lester, and S. Masmanidis, "Multiplexed, high density electrophysiology with nanofabricated neural probes," *PLoS One*, vol. 6, 2011.
- [7] F. Torre, A. Rodriguez-Baeza, and J. Sahuquillo-Barris, "Morphological characteristics and distribution pattern of the arterial vessels in human cerebral cortex: A scanning electron microscope study," *The Anatomical Record*, 1998.
- [8] S. Schmidt, K. Horch, and R. Normann, "Biocompatibility of silicon-based electrode arrays implanted in feline cortical tissue," *Journal of Biomedical Materials Research*, vol. 27, 1993.
- [9] Tucker-Davis Technologies 2018, *The IZ2 Stimulator and LZ48 Battery Pack*, accessed 8 Jul 2018, http://tdt.com/files/specs/IZ2.pdf.
- [10] W. Biederman, D. Yeager, N. Narevsky, J. Leverett, R. Neely, J. Carmena, E. Alon, and J. Rabaey, "A 4.78 mm<sup>2</sup> Fully-Integrated Neuromodulation SoC Combining 64 Acquisition Channels With Digital Compression and Simultaneous Dual Stimulation," *IEEE J. Solid-State Circuits*, vol. 50, no. 4, pp. 1038-1047, Apr. 2015.
- [11] Y. Lo, C. Chang, Y. Kuan, S. Culaclii, B. Kim, K. Chen, P. Gad, V. Edgerton, and W. Liu, "A 176-Channel 0.5cm<sup>3</sup> 0.7g Wireless Implant for Motor Function Recovery after Spinal Cord Injury," *IEEE Int Solid-State Circuits Conf. Dig. Tech. Papers*, 2016, pp. 382-383.
- [12] H. Kassiri, M. Salam, M. Pazhouhandeh, N. Soltani, J. Velazquez, P. Carlen, and R. Genov, "Rail-to-Rail-Input Dual-Radio 64-Channel Closed-Loop Neurostimulator," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 11, 2017.
- [13] B. Johnson, S. Gambini, I. Izyumin, A. Moin, A. Zhou, G. Alexandrov, S. Santacruz, J. Rabaey, J. Carmena, and R. Muller, "An Implantable 700*µ*W 64-Channel Neuromodulation IC for Simultaneous Recording and Stimulation with Rapid Artifact Recovery," in *Symp. VLSI Circuits Dig. Tech. Papers*, Jun. 2017, pp. 48-49.
- [14] Intan Technologies 2018, *RHS2116 Digital Electrophysiology Stimulator/Amplifier Chip*, accessed 18 May 2018, http://intantech.com/files/Intan_RHS2116_datasheet.pdf.
- [15] K. Cheung, P. Renaud, H. Tanila, and K. Djupsund, "Flexible polyimide microelectrode array for in vivo recordings and current source density analysis," Biosensors and Bioelectronics 22, 2007, pp. 1783-1790.
- [16] H. Lee, R. Bellamkonda, W. Sun, and M. Levenston, "Biomechanical analysis of silicon microelectrode-induced strain in the brain," *J. Neural Eng.*, 2005.
- [17] J. Sanders, C. Stiles, and C. Hayes, "Tissue response to single-polymer fibers of varying diameters: evaluation of fibrous encapsulation and macrophage density," *Journal of Biomedical Materials Research*, 2000.
- [18] R. Rennaker, S. Street, A. Ruyle, and A. Sloan, "A comparison of chronic multi-channel cortical implantation techniques: manual versus mechanical insertion," *Journal of Neuroscience Methods*, vol. 142, 2005.
- [19] A. Zhou, B. Johnson, and R. Muller, "Toward true closed-loop neuromodulation: artifact-free recording during stimulation," Current Opinion in Neurobiology, vol. 50, Jun. 2018, pp. 119-127.
- [20] S. Stanslaski, P. Afshar, P. Cong, J. Giftakis, P. Stypulkowski, D. Carlson, D. Linde, D. Ullestad, A. Avestruz, and T. Denison, "Design and validation of a fully implantable, chronic, closed-loop neuromodulation device with concurrent sensing and stimulation," *IEEE Transactions on Neural Systems and Rehabilitation Engineering*, vol. 20, no. 4, pp. 410-421, Jul. 2012.
- [21] H. Chandrakumar and D. Markovic, "An  $80 \text{-mV}_{\text{pp}}$  Linear-Input Range, 1.6-G $\Omega$  Input Impedance, Low-Power Chopper Amplifier for Closed-Loop Neural Recording That Is Tolerant to 650-mV<sub>pp</sub> Common-Mode Interference," *IEEE J. Solid-State Circuits*, vol. 52, no. 11, Nov. 2017.
- [22] A. Mendrela, J. Cho, J. Fredenburg, V. Nagaraj, T. Netoff, M. Flynn, and E. Yoon, "A Bidirectional Neural Interface Circuit With Active Stimulation Artifact Cancellation and Cross-Channel Common-Mode Noise Suppression," *IEEE J. Solid-State Circuits*, vol. 51, no. 4, Apr. 2016.
- [23] W. Smith, J. Uehlin, S. Perlmutter, J. Rudell, and V. Sathe, "A Scalable, Highly-Multiplexed Delta-Encoded Digital Feedback ECoG Recording Amplifier with Common and Differential-Mode Artifact Suppression," in *Symp. VLSI Circuits Dig. Tech. Papers*, Jun. 2017.
- [24] S. Butovas and C. Schwarz, "Spatiotemporal Effects of Microstimulation in Rat Neocortex: A Parametric Study Using Multielectrode Recordings," *Journal of Neurophysiology*, 90(5), pp. 3024-3039, 2003.
- [25] C. Kim, S. Joshi, H. Courellis, J. Wang, C. Miller, and G. Cauwenberghs, "A 92dB Dynamic Range Sub- $\mu$ V<sub>rms</sub>-Noise  $0.8\mu$ W/ch Neural-Recording ADC Array with Predictive Digital Autoranging," *IEEE International Solid-State Circuits Conference*, pp. 470-471, Feb. 2018.
- [26] K. Limnuson, H. Lu, H. Chiel, and P. Mohseni, "A Bidirectional Neural Interface SoC with an Integrated Spike Recorder, Microstimulator, and Low-Power Processor for Real-Time Stimulus Artifact Rejection," *Analog Integrated Circuits and Signal Processing*, pp. 457-470, Feb. 2015.
- [27] K. Zeng, D. Chen, G. Ouyang, L. Wang, X. Liu, and X. Li, "An EEMD-ICA Approach to Enhancing Artifact Rejection for Noisy Multivariate Neural Data," *IEEE Transactions on Neural Systems and Rehabilitation Engineering*, pp. 630-638, Nov. 2015.
- [28] P. Chu, R. Muller, A. Koralek, J. Carmena, J. Rabaey, and S. Gambini, "Equalization for Intracortical Microstimulation Artifact Removal," *IEEE Engineering in Medicine and Biology Society, EMBS*, pp. 245-248, Jul. 2013.
- [29] W. Wattanapanitch, M. Fee, and R. Sarpeshkar, "An Energy Efficient Micropower Neural Recording Amplifier," *IEEE Trans. Biomed. Circuits Syst.*, vol. 1, no. 2, pp. 136-147, 2007.
- [30] R. Schreier and G. Temes, *Understanding Delta Sigma Data Converters*. Piscataway, NJ: IEEE Press/Wiley, 2005.
- [31] E. Bjarnason, "Analysis for the Filtered-X LMS Algorithm," *IEEE Trans. Speech and Audio Processing*, vol. 3, no. 6, pp. 504-514, 1995.
- [32] C. Caraiscos and B. Liu, "A Roundoff Error Analysis of the LMS Adaptive Algorithm," *IEEE Transactions on Acoustics, Speech, and Signal Processing*, 1984.
- [33] B. Jun, D. Park, and Y. Kim, "Convergence Analysis of Sign-Sign LMS Algorithm for Adaptive Filters with Correlated Gaussian Data," in *Proc. IEEE Int. Conf. Acoust., Speech, Signal Process.*, pp. 1380-1383, May. 1995.
- [34] D. Stephanovic, *Calibration Techniques for Time-Interleaved SAR A/D Converters*, Diss. University of California, Berkeley, 2012.
- [35] M. Sarhang-Nejad and G. Temes, "A High-Resolution Multibit  $\Sigma\Delta$  ADC with Digital Correction and Relaxed Amplifier Requirements," in *IEEE J. Solid-State Circuits*, vol. 28, no. 6, pp. 648-660, Jun. 1993.
- [36] O. Agazzi, D. Messerschmitt, and D. Hodges, "Nonlinear Echo Cancellation of Data Signals," in *IEEE Transactions on Communications*, vol. 30, no. 11, pp. 2421-2433, Nov. 1982.
- [37] D. Bharadia and S. Katti, "Full Duplex MIMO Radios," in *Proc. USENIX Symposium on Networked Systems Design and Implementation*, 2014.