# UCLA Electronic Theses and Dissertations

# Title

Digitally-Calibrated Reconfigurable Analog-to-Digital Converters

**Permalink** https://escholarship.org/uc/item/4pk9j721

Author

Awad, Ramy Mohamed Yousry Ahmed

Publication Date 2013

Peer reviewed|Thesis/dissertation

## UNIVERSITY OF CALIFORNIA

Los Angeles

### Digitally-Calibrated Reconfigurable Analog-to-Digital Converters

A dissertation submitted in partial satisfaction

of the requirements for the degree

Doctor of Philosophy in Electrical Engineering

By

### Ramy Mohamed Yousry Ahmed Awad

2013

© Copyright by

Ramy Mohamed Yousry Ahmed Awad

2013

#### ABSTRACT OF THE DISSERTATION

Digitally-Calibrated Reconfigurable Analog-to-Digital Converters

by

### Ramy Mohamed Yousry Ahmed Awad

Doctor of Philosophy in Electrical Engineering

University of California, Los Angeles, 2013

Professor Chih-Kong Ken Yang, Chair

Modern digital communication systems aim to satisfy multiple standards and different operating scenarios. Applications include read channels of data storage systems, PCIe links, FPGA I/Os, and multi-standard radios. This stimulates research on reconfigurable analog-to-digital converters (ADCs) to serve as a key building block at the front-end of such systems. Conventional reconfigurable designs suffer from poor figure-of-merit (FoM) scaling with different resolutions, which reduces their flexibility. The limited efficiency of these techniques is attributed to the fact that they fix the ADC architecture for all configurations, whereas the optimum architecture depends on the target resolution. This dissertation introduces an architecture-reconfigurable ADC that efficiently covers a wide range of resolutions by configuring the ADC to the proper architecture for each resolution. This leads to a reconfigurable ADC nearly as efficient as dedicated designs in both area and power.

Device matching is the heart of precision analog design. Well-matched devices come at the expense of larger die area, parasitic capacitance, and power consumption. Instead of relying only on device sizing to achieve the desired accuracy, an area-efficient converter is designed with an intrinsic accuracy worse than its resolution. Before or during chip usage, self-calibration automatically detects and corrects element mismatches, which reduces silicon area and improves yield. This dissertation investigates the efficiency of using body-voltage trimming calibration for data converters. The tradeoffs of this technique are studied in detail. Methods are presented to extend the use of bulk-voltage trimming beyond technology limitations with minimal area and power overhead and no special technology requirements. The study shows that the best results are achieved when bulk trimming is mixed with other calibration techniques.

Two prototype chips are implemented in 65-nm CMOS to verify the results of this study. The first chip is a 2.5-10GS/s reconfigurable flash ADC. The ADC can be configured to work as a 3-bit, a 4-bit, or a 5-bit ADC with worst case integral nonlinearity (INL) and differential nonlinearity (DNL) of 0.48LSB and 0.35LSB respectively. The ADC achieves a figure-of-merit of 0.46pJ/conv-step and the active area is 0.13 mm<sup>2</sup>. The second chip is a 1.5-4GS/s "architecture" reconfigurable ADC. The ADC covers resolution range from 3b to 7b, and achieves a figure-of-merit of 0.46pJ/conv-step at 7-bit and the active area is 0.15mm<sup>2</sup>.

The dissertation of Ramy Mohamed Yousry Ahmed Awad is approved.

Milos Ercegovac

M.-C. Frank Chang

William Kaiser

Chih-Kong Ken Yang, Committee Chair

University of California, Los Angeles

2013

To my dear parents and my lovely wife Hanaa

# **Table of Contents**

- ABSTRACT OF THE DISSERTATION
- Table of Contents
- Acknowledgments
- VITA
- CHAPTER 1 Introduction
  - 1.1. Examples of Reconfigurable ADCs Applications
  - 1.2. Motivation
  - 1.3. Thesis Organization
- CHAPTER 2 Background
  - 2.1. Performance Specifications
    - 2.1.1. Static Error Specifications
      - 2.1.1.1. Offset and Gain Errors
      - 2.1.1.2. Differential Non-linearity (DNL)
      - 2.1.1.3. Integral Non-linearity (INL)
    - 2.1.2. Dynamic Error Specifications
      - 2.1.2.1. Signal-to-Noise Ratio (SNR)
      - 2.1.2.2. Effective Number of Bits (*ENOB*)
      - 2.1.2.3. Total Harmonic Distortion (*THD*)
      - 2.1.2.4. Spurious Free Dynamic Range (*SFDR*)
      - 2.1.2.5. Effective Resolution Bandwidth (ERBW)
    - 2.1.3. ADC Figure of Merit
  - 2.2. Literature Review
    - 2.2.1. Reconfigurability of ADC
      - 2.2.1.1. Reconfigurability on Conversion Rate [21]
      - 2.2.1.2. Reconfigurability on Conversion Resolution [21]
    - 2.2.2. Design Examples in Previous Publications [21]
      - 2.2.2.1. Gulati's Design
      - 2.2.2.2. Anderson's Design
      - 2.2.2.3. Cheng-Chung's Design
      - 2.2.2.4. Ahmed's Design
      - 2.2.2.5. Seyed's Design
    - 2.2.3. Summary of Previous Publications
    - 2.2.4. Problems in Reconfigurable ADC
  - 2.3. Conclusion
- CHAPTER 3 Body-voltage-based Digital Calibration in ADCs
  - 3.1. Matching Considerations
    - 3.1.1. Matching Trends
    - 3.1.2. Matching-Critical Blocks
      - 3.1.2.1. Flash Architecture
      - 3.1.2.2. Pipelined Architecture
  - 3.2. Body Voltage Calibration
    - 3.2.1. Advantages
    - 3.2.2. Limitations
  - 3.3. Hybrid-Calibration Solutions
    - 3.3.1. Flash ADC Offset Trimming
    - 3.3.2. Improving the ADC Dynamic Range
  - 3.4. Design Example: MDAC Calibration
    - 3.4.1. Background
    - 3.4.2. Comparison between Trimming and Redundancy
    - 3.4.3. Body-Voltage Trimming
    - 3.4.4. Circuit Implementation
      - 3.4.4.1. ADC Self-Calibration Architecture
      - 3.4.4.2. Design of Self-Calibrating Current Cell
      - 3.4.4.3. Measurement Results
  - 3.5. Conclusion
- CHAPTER 4 An Architecture-Reconfigurable 3b-to-7b 4GS/s-to-1.5GS/s Digitally-Calibrated ADC in 65-nm CMOS
  - 4.1. Introduction
  - 4.2. Reconfigurable ADC Architecture
    - 4.2.1. Flash ADC Architecture
    - 4.2.2. Two-Step ADC Architecture
  - 4.3. Circuit Implementation
    - 4.3.1. Track-and-Hold
    - 4.3.2. Flash Sub-ADCs
    - 4.3.3. MDAC
  - 4.4. ADC Calibration
    - 4.4.1. Flash ADC Offset Trimming
    - 4.4.2. DAC Mismatch Calibration
    - 4.4.3. Channel Mismatch Calibration
  - 4.5. Experimental Results
    - 4.5.1. Testing Setup
    - 4.5.2. Measurement Results of the Flash ADC Configuration
    - 4.5.3. Measurement Results of the Two-Step ADC Configuration
    - 4.5.4. Reconfigurable ADC Performance
  - 4.6. Conclusion
- CHAPTER 5 A Digitally-Calibrated 3b-to-5b 10GS/s-to-2.5GS/s Reconfigurable Flash ADC in 65nm CMOS
  - 5.1. Introduction
  - 5.2. Reconfigurable ADC Architecture
  - 5.3. Circuit Implementation
    - 5.3.1. Track-and-Hold Amplifier
    - 5.3.2. Comparator Design
  - 5.4. Comparator Offset Calibration
    - 5.4.1. Comparators Offset Trimming
  - 5.5. Measurement Results
  - 5.6. Conclusion
- CHAPTER 6 Body Trimming beyond Calibration: Digitally Calibrated Current-Steering Segmented-DAC Design
  - 6.1. Coarse-Segments Mismatch Trimming
  - 6.2. Fine-Segments Design
  - 6.3. Measurement Results
  - 6.4. Conclusion
- CHAPTER 7 Conclusion
- Appendix Analysis of Body-Voltage Trimming: Thermal Stability and Short-Channel Effects
  - A.1. Thermal Stability
    - A.1.1. MOS transistor Mismatch Model
    - A.1.2. Body-Effect Model
    - A.1.3. Process, Voltage, and Temperature Tracking
  - A.2. Short-Channel Effects
    - A.2.1. Reduced Threshold Sensitivity to Body-Voltage Variation
- References

# Acknowledgments

I would first like to thank my advisor, Professor Ken Yang, for the continuous help, support, and guidance that he gave to me throughout this journey. I find it really hard to express my gratitude for all the time he spent with me discussing my ideas, reviewing my papers, listening to my design problems, and always being ready with advice and the right feedback.

I would also like to thank my dear friend and colleague, Henry Park, without whose help and great effort I would never have been able to get my chip out in time and to get it back working.

During my stay at UCLA, I was blessed to be among a group of wonderful people who made my stay in Los Angeles a wonderful memory and who were always there for me, helping me get over every technical and non-technical obstacle I have met. Thanks go to my dear friends, Amr, Sameh, Tamer, Ismail, Omar, Aboudina, Henry, Wonho, Michael, Karam, Said, Yousr, and Yasser.

I would finally like to express my greatest gratitude toward my family; my parents, my sisters, and my lovely wife for the unlimited love and support they gave to me throughout my entire life without which I would have never been even close to where I am today.

# VITA

| 2003           | B.Sc., Electrical Engineering, Ain Shams University, Egypt                                                                 |
|----------------|----------------------------------------------------------------------------------------------------------------------------|
| 2003-2007      | Teaching Assistant, Electronics and Electrical Communications Department, Ain Shams University, Egypt                      |
| 2004-2007      | Mixed-Signal Design Consultant, SysDSoft, Cairo, Egypt                                                                     |
| 2007           | IC Design Consultant, Silicon Vision, Cairo, Egypt                                                                         |
| 2007           | M.Sc., Electrical Engineering, Ain Shams University, Egypt                                                                 |
| 2008-2013      | Research and Teaching Assistant, Electrical Engineering Department, University of California, Los Angeles, California, USA |
| 2008, and 2010 | Mixed-signal Design Intern, Broadcom, Irvine, California, USA                                                              |

# **PUBLICATIONS**

Ramy Yousry, H. Park, E. Chen, and K. Yang, "A Digitally-calibrated 10GS/s reconfigurable flash ADC in 65nm CMOS," in *Proc. IEEE Int. Symp. Circuits and Systems (ISCAS)*, May 2013, accepted for publication.

E-Hung Chen, Ramy Yousry, and Chih-Kong Ken Yang, "Power optimized ADC-based serial link receiver," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 4, April 2012.

E-Hung Chen, Ramy Yousry, Tamer Ali, and Chih-Kong Ken Yang, "10Gb/s Serial I/O receiver based on variable reference ADC," *IEEE International Symposium on VLSI Circuits*, pp. 12-13, June 2011.

Ramy Yousry and Chih-Kong Ken Yang, "Digital calibration of current-steering DAC mismatch in multi-GS/s pipelined ADCs," *IEEE Trans. Circuits Syst. II*, submitted for publication.

Ramy Yousry, Ming-Shuan Chen, and Chih-Kong Ken Yang, "An architecture-reconfigurable 3b-to-7b 4GS/s-to-1.5GS/s ADC using subtractor interleaving," *IEEE Custom Integrated Circuits Conference (CICC)*, submitted for publication.

Ramy Yousry, and Chih-Kong Ken Yang, "A Gain-enhanced source-follower amplifier with degeneration-resistance modulation technique," *IEEE Trans. Circuits Syst. II*, submitted for publication.

## **CHAPTER 1**

# Introduction

Data converters are core building blocks in digital communication systems and data storage systems to enable backend digital signal processing. A reconfigurable ADC is one where the resolution and/or sampling rate are dynamically adjustable to suit the needs of an application. The reconfigurability allows a system designer to relax the performance–power trade-off by optimizing the system performance based on different applications. Numerous examples of such applications exist [1-4] and are discussed in Section 1.1. In such applications, the benefit of performance adaptation is largely based on the assumption that reconfigurability can be achieved efficiently and with low overhead.

Generally, two main techniques are used to achieve reconfigurability. The first directly trades speed for resolution, as in sigma-delta and counter ADCs [2, 3]; the second turns some blocks of a flash or a pipelined ADC on and off [1, 4]. Despite its simplicity and low overhead, the first technique suffers from significant speed degradation at high resolutions, which limits its use to applications that require inverse bandwidth–resolution proportionality, such as software-defined radio. The second approach does not suffer from this problem, but its poor figure-of-merit (FoM) scaling with different resolutions reduces its applicability. The limited efficiency of both approaches, as reported in the literature, is attributed to the fact that the ADC architecture is fixed, whereas the optimum architecture actually depends on the target resolution.

This dissertation introduces an architecture-reconfigurable ADC that efficiently covers a wide range of resolutions by configuring the ADC to the proper architecture for each resolution. This leads to a reconfigurable ADC nearly as efficient as dedicated designs in both area and power. Key enabling circuit techniques for high-speed, low-power conversion are also introduced in this dissertation, such as multi-layer interleaved flash architectures, subtractor-interleaved pipelined ADCs, and body-voltage-based digital calibration. Section 1.2 of this chapter motivates this study by describing the design challenges of this type of ADC. The organization of the dissertation is then presented in Section 1.3.

### **1.1. Examples of Reconfigurable ADCs Applications**

Nowadays, very high-speed analog to digital converters (ADCs) are necessary for many applications. Many of these applications need moderate – albeit variable – resolution such as read channels of magnetic and optical data storage systems, and high data rate digital communications (both wireless and wireline). These applications require 4-to-7 bits of resolution at conversion rates of 1GS/s and beyond.

For example, ADC-based serial-link receivers (Fig. 1.1(a)) need low-resolution ADCs at up to 10GS/s rates. To obtain a single system that can efficiently fit different channels with different characteristics, the ADC should be configurable to a wide range of resolutions. Depending on the channel length and characteristics, typical binary receivers use ADCs with 3-to-5 bits resolutions. Recently emerging PAM transceivers would need higher resolutions of 6-to-7 bits.

In another example, read channels of data storage systems (Fig. 1.1(b)) require ADCs with resolutions in the range of 5-6 bits at 1-2GS/s rates. Even higher resolution is needed for servo signal processing, which is essential for determining the position of the read head over the rotating disk. This higher resolution should be provided at rates as close as possible to that of the normal mode.

In a third example, the multi-band OFDM Ultra-Wideband (UWB) radio (Fig. 1.1(c)) offers data rates up to 480Mb/s within a short distance range. Due to the overlapping of the signal bandwidth with other standards, the receiver requires a wide dynamic range (7-bit and 1.056GS/s) ADC to partially remove the adjacent band interferers. To reduce the power consumption, a lower dynamic range (4~5b) is also needed when the receiver is used in a less crowded environment.



Fig. 1.1. Reconfigurable ADCs Applications: (a) serial-link transceivers, (b) Read-channel of hard-disk drives, and (c) multi-band OFDM UWB radio receivers.

### **1.2.** Motivation

There are numerous approaches to satisfying a range of ADC performance specifications. The most straightforward solution is to use a dedicated ADC per resolution or sampling rate. This solution eases the ADC design optimization for each application. However, it suffers from both the long design cycle and the high fabrication cost of what are effectively multiple designs. Another possible solution is to design a single ADC with performance at the highest common denominator, i.e., an ADC with the highest resolution and conversion rate required in all possible scenarios. Obviously, this ADC would consume much more power than necessary for most of the applications. In some cases, such an ADC may not be technologically feasible.

The power/area/performance tradeoff motivates a reconfigurable design that can be more cost effective and closer to optimum in these metrics within the design space of each application. By correctly partitioning the ADC into smaller blocks in conjunction with smart architecture selection, we can maximize the block usage and area efficiency (in other words, minimize the amount of redundant elements for each application). Furthermore, reconfigurable ADCs reduce the amount of new intellectual property (IP) that needs to be created, maximize the reuse of existing IP, and enable hardware/software co-design for reduced time to market.

The benefits of a reconfigurable design are not without challenges. First, the ADC accuracy needs to be satisfactory for the highest resolution of all possible configurations. Resolution, speed, and power consumption are the three key parameters for an analog-to-digital converter (ADC). These parameters are not typically changeable once an ADC is designed. While one can use 6-bit precision from an 8-bit ADC, doing so is non-optimal: it results in slower speed and extra power consumption due to internal blocks that are designed to satisfy the full 8-bit resolution. This stimulated research on inexpensive means of enhancing the ADC precision in order to minimize the overhead at lower resolutions.

The second challenge is that the reconfigurable ADC should provide good FoM scaling with different resolutions. Fig. 1.2 shows the work published at ISSCC and VLSI in the last 15 years. These recent papers show that the published high-speed ADCs vary by >20 dB in resolution. The limited efficiency of existing reconfigurable techniques is attributed to the fact that they fix the ADC architecture for all configurations, whereas the optimum architecture depends on the target resolution. So, despite the vast amount of work done on ADCs so far, there is still room for improvement to cover a wide operating range without sacrificing the area and power efficiency.



Fig. 1.2. Trends in Gigasamples/s ADCs Performance: FoM versus ADC resolution for both flash and multi-step architectures in [5].

### **1.3.** Thesis Organization

The dissertation consists of seven chapters. Chapter 2 gives the necessary background on ADCs and reconfigurability. It begins with a brief discussion of the important performance metrics of an ADC, followed by a review of the different forms of reconfigurable architectures that have been proposed in the literature. The basic categories are discussed with several examples for each category. General guidelines for efficient reconfigurability are then extracted from previously published work.

Chapter 3 reviews the matching requirements for precise conversion, and gives a quick overview of calibration techniques and the factors that limit their operation range. We then present our analysis of body-voltage-based calibration and describe the advantages and limitations of this technique. We show that for the same operating conditions, if body-voltage trimming is designed correctly, the converter precision due to mismatch can be enhanced significantly compared to other calibration techniques, without adding significant area and power overhead or degrading the high-speed performance. We also explain how the addition of supplemental calibration methods helps extend the use of this technique beyond technology limitations. One of the most critical blocks in pipelined ADC designs is the internal DAC. This chapter compares different DAC architectures and proposes a novel method for building a current-steering DAC that achieves high precision and high speed at low area and power overhead.

Two prototype chips are implemented in 65-nm CMOS to verify the results of this study. The first chip is a 1.5-4GS/s "architecture" reconfigurable ADC. The ADC covers a resolution range from 3-bit to 7-bit, and achieves an FoM of 0.46pJ/conv-step at 7-bit. The second chip is a 2.5-10GS/s reconfigurable flash ADC. The ADC can be configured to work as a 3-bit, a 4-bit, or a 5-bit ADC and achieves an FoM of 0.46pJ/conv-step. We explain the design details of the two chips in Chapter 4 and Chapter 5 respectively. A segmented-DAC architecture that extends the use of body-voltage tuning beyond calibration is proposed in Chapter 6 with measurement results. Finally, conclusions and future work are discussed in Chapter 7.

# **CHAPTER 2**

## Background

This chapter begins by discussing the design criteria of the ADC as a black box. The critical performance specifications are defined and formulated in Section 2.1. The definitions in this chapter are used in the following chapters, where the analysis is explained and performance is reported. ADC reconfigurability is described in Section 2.2. Several notable papers on reconfigurable ADCs are selected and their features are explained. The performance parameters of these ADCs are also summarized.

### **2.1.** Performance Specifications

The quality of an ADC is measured by both its static or DC performance and its dynamic or AC performance. Static properties are easily measured and are typically used as an indication of the dynamic performance. Communications applications place emphasis on dynamic performance. The common definitions of ADC specifications are presented in this section [6].

#### 2.1.1. Static Error Specifications

The key to understanding the static performance of ADCs is to compare the ideal and nonideal transfer characteristics for DC signals. Static errors usually arise due to mismatch in circuit components and non-symmetry in the IC layout. Measurement results must show performance at high and low temperatures and supply voltages. The different forms of static error to be expected in an ADC realization are explained briefly in the following.



Fig. 2.1. ADC gain and offset characteristics.

#### 2.1.1.1. Offset and Gain Errors

Gain and offset errors should include all possible sources including, for instance, the reference generation. These errors are shown in Fig. 2.1 for a unipolar input range; they have a similar effect to the errors that occur in an analog amplifier. The transfer characteristic is of the form $D = O + G \times A$, where $D$ is the output digital code, $A$ the analog input, $O$ the offset, and $G$ the gain. For a unipolar ADC, $O$ is ideally 0, while for a bipolar ADC, $O$ is ideally -1 MSB. Thus, the offset error is given by how much $O$ deviates from its ideal value in LSBs. The gain error for ADCs is defined as the error at full scale minus the offset error, as shown in Fig. 2.1. Gain and offset errors can be considered linear effects that do not hurt the system linearity. However, the dynamic range, i.e. the effective analog input range before over-ranging the ADC, is reduced as a consequence. Moreover, these errors do affect the ADC performance significantly if multiple interleaved channels are incorporated to achieve high speed.
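As a rough illustration of these definitions, the sketch below evaluates the offset error and the gain error for the linear model $D = O + G \times A$. The function name `offset_and_gain_error` and the numeric values are hypothetical and chosen only for illustration; all quantities are assumed to be expressed in LSBs, with the analog input normalized to the full-scale range.

```python
def offset_and_gain_error(offset, gain, ideal_offset, ideal_gain, full_scale):
    """Return (offset_error, gain_error) in LSB for the model D = O + G * A.

    offset_error: deviation of O from its ideal value.
    gain_error:   error at full scale minus the offset error.
    """
    offset_error = offset - ideal_offset
    measured_fs = offset + gain * full_scale
    ideal_fs = ideal_offset + ideal_gain * full_scale
    gain_error = (measured_fs - ideal_fs) - offset_error
    return offset_error, gain_error


# Hypothetical 3-bit unipolar ADC, input normalized to 1.0: ideal gain is 8 codes
# per unit input and ideal offset is 0. The "measured" O and G are made up.
off_err, gain_err = offset_and_gain_error(
    offset=0.5, gain=7.6, ideal_offset=0.0, ideal_gain=8.0, full_scale=1.0
)
print(f"offset error = {off_err:+.2f} LSB, gain error = {gain_err:+.2f} LSB")
```

Under this model the gain error reduces to $(G - G_{ideal}) \times A_{FS}$, which is why it is independent of the offset.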



Fig. 2.2. Differential nonlinearity: (a) ideal and non-ideal ADC transfer functions and (b) quantization error.

#### 2.1.1.2. Differential Non-linearity (DNL)

The DNL error refers to how much a code width deviates from the ideal value of 1 LSB. The definition of the instantaneous DNL is [6]:

$$DNL_{i} = \frac{x_{a}(Q_{i+1}) - x_{a}(Q_{i})}{\Delta} - 1, \qquad i = 0, \dots, 2^{N} - 2 \tag{2.1}$$

with $Q_i$ and $Q_{i+1}$ adjacent transition levels for analog input $x_a$, and $\Delta$ the ideal code width of 1 LSB. When one refers to the DNL of an ADC, the maximum DNL is usually implied, be it positive or negative. Examples of DNL error are shown in Fig. 2.2(a), where the non-ideal transfer characteristic is set against the ideal staircase ADC transfer function. The quantization error produced by the non-ideal ADC transfer is shown in Fig. 2.2(b), where it is set against the ideal sawtooth quantization error expected of an ideal ADC with zero DNL error. The ideal sawtooth varies between $\pm 1/2$ LSB. It is important that the ADC produces no missing codes in order to guarantee monotonic behavior. This implies that the magnitude of the maximum DNL error must always be less than 1 LSB.
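The DNL definition in (2.1) can be sketched in a few lines. The transition levels below are hypothetical values for a 3-bit ADC, not measurements from this work, and the function name `dnl` is an assumption made for the example.

```python
def dnl(transitions, lsb):
    """DNL per code, following (2.1): width between adjacent transition levels,
    normalized to the ideal step, minus 1."""
    return [(hi - lo) / lsb - 1.0 for lo, hi in zip(transitions, transitions[1:])]


# Hypothetical transition levels (in volts) of a 3-bit ADC with an ideal step of 0.125 V.
transitions = [0.125, 0.250, 0.400, 0.500, 0.625, 0.750, 0.875]
errors = dnl(transitions, lsb=0.125)

worst = max(errors, key=abs)                      # the DNL usually quoted for the ADC
no_missing_codes = all(e > -1.0 for e in errors)  # a code of zero width gives DNL = -1
print(f"worst-case DNL = {worst:+.2f} LSB, no missing codes: {no_missing_codes}")
```

In this example the third transition is shifted upward, so one code is 0.2 LSB too wide and the next is 0.2 LSB too narrow.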

#### 2.1.1.3. Integral Non-linearity (INL)

The INL error refers to the maximum deviation of the actual ADC transfer function from a straight line drawn through the first and last code transitions after correction for offset and gain errors. This is often referred to as end-point INL and gives a more pessimistic but useful estimation of the non-linearity than referring the linearity of the ADC characteristic to an arbitrary best fit curve drawn through the output codes (best-fit INL). Generally, the best-fit INL is only half that of the end-point INL and is not so widely used anymore to specify professional data converters. The INL is defined by the accumulation of DNL errors over the complete ADC characteristic [6]:

$$INL_i = \sum_{j=0}^{i-1} DNL_j \tag{2.2}$$
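A minimal sketch of (2.2), accumulating DNL values into an INL profile. The DNL values are hypothetical, and the function name `inl_from_dnl` is an assumption made for the example.

```python
from itertools import accumulate


def inl_from_dnl(dnl_values):
    """INL_i as the running sum of DNL_0 .. DNL_{i-1}; INL_0 is the empty sum (0)."""
    return [0.0] + list(accumulate(dnl_values))


# Hypothetical DNL profile in LSB. It sums to zero, as expected once gain and
# offset errors have been removed (end-point INL is zero at both ends).
dnl_values = [0.0, 0.25, -0.25, 0.5, -0.5, 0.0]
inl = inl_from_dnl(dnl_values)
worst_inl = max(inl, key=abs)
print(inl, worst_inl)
```

Because the INL is a running sum, DNL errors of the same sign on neighboring codes accumulate, whereas alternating errors (as here) partly cancel.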

#### 2.1.2. Dynamic Error Specifications

The dynamic performance of an ADC is obtained by examining its AC characteristics when a spectrally pure sine wave is applied to the input. This is best done by performing an FFT on the output data and examining the spectrum for noise and distortion. The most important dynamic specifications are explained briefly in the following.

#### 2.1.2.1. Signal-to-Noise Ratio (SNR)

The *SNR* is specified for full scale input amplitude and should include all noise contributions in the band of interest - usually up to the Nyquist frequency ( $f_s/2$ ). This is summarized as [6]:

$$SNR = 10 \times \log_{10} \left( \frac{\text{Signal Power}}{\text{Total Noise Power in the Band of Interest}} \right) \tag{2.3}$$

A variant of this definition is to include the power of the distortion components with the noise power, and this is called the signal-to-noise and distortion ratio, or *SNDR*. Typically the SNR is dominated by quantization noise and circuit thermal noise but also includes other noise sources such as noise emanating from the references and power supplies, glitches, measurement setup noise, etc.
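The following sketch estimates SNR and SNDR from the spectrum of an ideally quantized, coherently sampled sine wave. It is only an illustration under stated assumptions: the quantizer is an ideal mid-rise model, harmonics 2 to 5 are counted as distortion (the text does not fix this number), and a direct DFT is used to keep the example dependency-free. All function names are hypothetical.

```python
import cmath
import math


def quantize(x, bits):
    """Ideal mid-rise quantizer for inputs in [-1, 1)."""
    levels = 2 ** bits
    step = 2.0 / levels
    code = min(levels - 1, max(0, math.floor((x + 1.0) / step)))
    return (code + 0.5) * step - 1.0


def power_spectrum(samples):
    """Single-sided power per DFT bin (0..N/2), computed with a direct O(N^2) DFT."""
    n = len(samples)
    power = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        weight = 1.0 if k in (0, n // 2) else 2.0  # interior bins appear twice
        power.append(weight * abs(acc) ** 2)
    return power


def snr_and_sndr(samples, signal_bin, num_harmonics=5):
    """Return (SNR, SNDR) in dB over the Nyquist band, DC excluded."""
    n = len(samples)
    power = power_spectrum(samples)
    signal = power[signal_bin]
    harmonic_bins = set()
    for h in range(2, num_harmonics + 1):
        b = (h * signal_bin) % n                        # harmonics alias back
        harmonic_bins.add(b if b <= n // 2 else n - b)  # into the Nyquist band
    harmonic_bins -= {0, signal_bin}
    distortion = sum(power[b] for b in harmonic_bins)
    noise_and_distortion = sum(power[1:]) - signal
    noise = noise_and_distortion - distortion
    snr = 10 * math.log10(signal / noise)
    sndr = 10 * math.log10(signal / noise_and_distortion)
    return snr, sndr


# Coherent sampling: 13 input cycles in 256 samples (13 and 256 are coprime).
N, CYCLES, BITS = 256, 13, 6
samples = [quantize(math.sin(2 * math.pi * CYCLES * i / N), BITS) for i in range(N)]
snr_db, sndr_db = snr_and_sndr(samples, signal_bin=CYCLES)
print(f"SNR = {snr_db:.1f} dB, SNDR = {sndr_db:.1f} dB")
```

For an ideal quantizer the SNDR should land near the familiar $6.02N + 1.76$ dB, and the SNR can never be lower than the SNDR because its denominator omits the harmonic power.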

#### 2.1.2.2. Effective Number of Bits (ENOB)

For real ADCs, the *ENOB* is often used instead of *SNR* or *SNDR*, since it gives a better indication of ADC accuracy: it is defined at a specific input frequency and sampling rate. The *ENOB* is defined to include all measured sources of noise and distortion in an ADC [6]:

$$ENOB = \frac{SNDR - 1.76}{6.02} \tag{2.4}$$
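To make (2.3) and (2.4) concrete, the following sketch converts signal, noise, and distortion powers into an SNDR in dB and then into an ENOB. The function names and the power values are assumptions for illustration, and SNDR is taken to be expressed in dB.

```python
import math

def sndr_db(p_signal, p_noise, p_distortion):
    """SNDR in dB: signal power over noise-plus-distortion power."""
    return 10 * math.log10(p_signal / (p_noise + p_distortion))

def enob(sndr):
    """Effective number of bits from an SNDR given in dB, per (2.4)."""
    return (sndr - 1.76) / 6.02

# Hypothetical powers in arbitrary units: total error power is 1e-5 of the signal.
sndr = sndr_db(1.0, 8e-6, 2e-6)  # about 50 dB
bits = enob(sndr)                # about 8.0 effective bits

print(round(sndr, 2), round(bits, 2))
```

Inverting (2.4) gives the familiar ideal value of 6.02N + 1.76 dB for an N-bit quantizer, which is a quick sanity check for measured data.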

#### **2.1.2.3.** Total Harmonic Distortion (*THD*)

The *THD* of an ADC gives the ratio of the power of all the harmonics of the input signal to the power of the fundamental. It is usually specified up to a certain number of harmonics, k. Furthermore, it is assumed that the input signal is close to full scale. The  $k^{th}$  order *THD* is defined as [6]:

$$THD_{k} = 10 \log_{10}\left(\sum_{i=2}^{k} \frac{A_{i}^{2}}{A_{1}^{2}}\right) \tag{2.5}$$

which is expressed in negative dBs with respect to the fundamental,  $A_1$ .

#### 2.1.2.4. Spurious Free Dynamic Range (SFDR)

The *SFDR* is widely used as a measure of the quality of high-speed ADCs for communications applications. It is defined as the ratio of the power of the signal fundamental tone to the power of the largest spurious component in a certain frequency range [6]:

$$SFDR = 10 \log_{10}\left(\frac{A_1^2}{A_{spur}^2}\right) \tag{2.6}$$

with  $A_1$  the RMS value of the fundamental and  $A_{spur}$  the RMS value of the largest spurious component. The *SFDR* is expressed either relative to the signal fundamental amplitude (dBc) or relative to the ADC full scale (dBFS). The frequency range is almost always the Nyquist band from 0 to  $f_s/2$  in Nyquist ADCs. The *SFDR* is a function of the amplitude and frequency of the input tone as well as the sampling frequency.

In a well designed ADC system, the spurious component will be a harmonic of the fundamental and is usually well below the level of the noise floor. The *SFDR* is a very important measure for ADCs in IF bandpass applications or in sub-sampling applications, since the spurious tone can be interpreted as an adjacent channel.
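The THD and SFDR definitions in (2.5) and (2.6) can be sketched numerically as follows. The function names and the amplitude list are hypothetical, and the example assumes that the only spurs present are the harmonics themselves.

```python
import math

def thd_db(amplitudes, k):
    """k-th order THD per (2.5); amplitudes[i - 1] is the RMS value A_i."""
    a1 = amplitudes[0]
    # amplitudes[1:k] holds A_2 .. A_k.
    ratio = sum((a / a1) ** 2 for a in amplitudes[1:k])
    return 10 * math.log10(ratio)

def sfdr_dbc(a1, spurs):
    """SFDR per (2.6), relative to the fundamental (dBc)."""
    return 10 * math.log10(a1 ** 2 / max(spurs) ** 2)

# Hypothetical RMS amplitudes A_1 .. A_4 (illustration only).
amps = [1.0, 0.001, 0.002, 0.0005]

thd4 = thd_db(amps, 4)            # about -52.8 dB
sfdr = sfdr_dbc(amps[0], amps[1:])  # about 54.0 dBc, set by the third harmonic

print(round(thd4, 2), round(sfdr, 2))
```

Because THD sums several harmonics while SFDR considers only the largest spur, the magnitude of the THD is never larger than the SFDR when the harmonics are the only spurs.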

#### 2.1.2.5. Effective Resolution Bandwidth (*ERBW*)

The quality of a real ADC is best quantified by estimating its resolution in effective bits. This is done by replacing the ideal SNR over the complete Nyquist bandwidth with the actual measured SNDR and calculating the effective number of bits, or ENOB. The SNR is usually measured as a function of frequency, either  $f_s$  for a fixed  $f_{in}$ , or  $f_{in}$  for a fixed  $f_s$ . The typical variation of ENOB with  $f_{in}$ , as it is swept from 0Hz up to  $f_s/2$ , is illustrated in Fig. 2.3. A related measured parameter of interest is the full scale effective resolution bandwidth (ERBW), sometimes called the full scale analog bandwidth. It is defined as the input frequency at which the SNDR of the ADC response to a full scale sinusoidal input drops by 3dB (equivalently, the ENOB drops by about 0.5 bit) with respect to its value at very low frequencies.

Fig. 2.3. Reduction of ENOB with input frequency and definition of ERBW.
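A rough sketch of how ERBW might be extracted from a frequency sweep is given below. It interprets the 3 dB criterion as a 3 dB drop in SNDR (about 0.5 bit of ENOB) and interpolates linearly between sweep points. The function name `erbw` and the sweep data are hypothetical.

```python
def erbw(freqs, sndr_db, drop_db=3.0):
    """Input frequency where SNDR falls drop_db below its low-frequency value.

    freqs must be ascending and sndr_db[0] is taken as the low-frequency SNDR.
    Returns None if the SNDR never drops that far within the sweep.
    """
    target = sndr_db[0] - drop_db
    for i in range(1, len(freqs)):
        if sndr_db[i] <= target:
            f0, f1 = freqs[i - 1], freqs[i]
            s0, s1 = sndr_db[i - 1], sndr_db[i]
            # Linear interpolation between the two points that bracket the target.
            return f0 + (s0 - target) * (f1 - f0) / (s0 - s1)
    return None

# Hypothetical sweep: input frequency in MHz and measured SNDR in dB.
f_in = [10, 100, 200, 300, 400, 500]
sndr = [44.0, 43.5, 43.0, 42.0, 40.0, 38.0]

bandwidth = erbw(f_in, sndr)
print(bandwidth)
```

A finer sweep near the crossing point reduces the interpolation error if the roll-off is steep.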

#### 2.1.3. ADC Figure of Merit

The figure of merit (*FoM*) is a useful measure of the relative performance of ADCs, as it allows an objective comparison of the efficiency of different design solutions. It is defined as [6]:

$$FoM = \frac{P}{2^{ENOB} \cdot f_s} \tag{2.7}$$

where ENOB is the effective number of bits,  $f_s$  is the sample frequency in GHz and P is the power consumption measured in mW. This *FoM* normalizes power into the energy used per ADC conversion step (in pJ, given these units). It can be used to compare all types of converters irrespective of architecture, frequency of operation or type and generation of process used. There are three primary reasons for the continual improvement in efficiency of ADCs, namely shrinking technology size, improving design techniques, and novel design solutions.
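The FoM in (2.7) is easy to evaluate directly. In the sketch below, the function name and the two operating points are hypothetical; with P in mW and the sample rate in GHz, the result is in pJ per conversion step.

```python
def fom_pj_per_step(power_mw, enob, fs_ghz):
    """FoM per (2.7): mW / (2^ENOB * GHz) gives pJ per conversion step."""
    return power_mw / (2 ** enob * fs_ghz)

# Two hypothetical operating points (not taken from any published design).
fom_a = fom_pj_per_step(power_mw=32.0, enob=5, fs_ghz=1.0)
fom_b = fom_pj_per_step(power_mw=10.0, enob=8, fs_ghz=0.5)

print(fom_a, fom_b)
```

A lower value means less energy is spent per resolved level, which is why the metric is convenient for comparing configurations of one reconfigurable ADC against each other.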

### 2.2. Literature Review

As a key component of modern electronic products, the ADC is in high demand in communication systems, consumer electronics, biomedical instruments, and various measurement instruments. Different requirements for different applications keep pushing ADCs to higher speed and higher resolution. A large body of literature on ADCs has been published in the past 30 years, but very little of it relates to reconfigurable ADCs.

#### 2.2.1. Reconfigurability of ADC

#### 2.2.1.1. Reconfigurability on Conversion Rate [21]

The best-known example of an ADC with a reconfigurable conversion rate is the time-interleaved ADC [2-3, 22]. In an interleaved ADC, identical channels are combined in a time-interleaved manner, and every channel is a complete ADC. Conversion can take place several times in one clock cycle, depending on how many channels there are in the system. The conversion rate can be varied by altering the number of parallel channels, but this approach suffers from distortion caused by mismatch among the parallel channels [23]. Furthermore, sampling clock skew among the channels ultimately limits the performance at high speeds [24]. One solution to this problem is to use a single sample-and-hold circuit for all the channels, but this makes the sample-and-hold design extremely hard since it must operate at a much higher frequency.

Some other ADC types offer simpler conversion rate configuration, such as sigma-delta and counter ADCs [2, 3]. By keeping the oversampling ratio constant and changing the sampling frequency, the signal bandwidth can be altered [25, 26]. An ADC combining a pipeline mode and a sigma-delta mode was presented in [27]. The dynamic range is very high but the signal bandwidth is small due to oversampling. The design is very complex, with a considerable area overhead due to the configuration between pipeline mode and sigma-delta mode.

A more complicated and less efficient example is the pipelined ADC. By scaling the OTA bias current, the conversion rate can be scaled accordingly [28, 29]. If the conversion rate reconfiguration range is very large, the OTA tends to work in weak inversion, hurting the transit frequency. To alleviate this problem, the ADC can be designed to work for one clock period and then rest for the following several clock periods. The speed decreases, and the average power consumption decreases too. The combination of power scaling and the period-skipping technique is used in [30]. The penalty of this method is that a complex clocking scheme must be generated. If the OTA can be reconfigured according to the bias current, and hence to the conversion rate, the OTA can still work in strong or moderate inversion, so a high transit frequency can still be achieved. In this work, bias scaling and reconfigurable OTA techniques are used when altering the conversion rate.

#### 2.2.1.2. Reconfigurability on Conversion Resolution [21]

One approach to reconfiguring the conversion resolution is to use a sigma-delta modulator. The resolution of a sigma-delta ADC is determined by the decimation filtering, and different decimation filters can be implemented for different resolutions. It is possible to trade signal bandwidth for accuracy by adjusting the oversampling ratio [31-35]. However, implementing the decimation filter is challenging because of the high oversampling ratio. Therefore, this approach is usually used for low signal bandwidths.

For the pipeline ADC, there are two approaches to configuring the resolution. The first is to turn off the latter stages [36]. The second is to turn off the first several stages and reroute the input signal to the configuration stages [37]. Each approach has its own pros and cons. The first method is very easy and needs minimal hardware to configure, but it yields no power benefit. The second needs some extra hardware, such as wires and switches, but it saves considerable power because the first several stages are the power-hungry blocks. However, the extra routing wires and switches introduce interference into the configuration stages and degrade the overall performance.

#### 2.2.2. Design Examples in Previous Publications [21]

#### 2.2.2.1. Gulati's Design

Gulati's design was published in 2001 [27]. The design can change its architecture between pipeline and sigma-delta modes. It can also vary its circuit parameters, such as size of capacitors, length of pipelines, and oversampling ratio. Moreover, the bias currents are varied in proportion to the sampling frequency. Opamp scaling and opamp sharing between two consecutive stages are used in pipeline mode.

The ADC can be configured in the range of 6-16 bits. The ADC architecture is illustrated in Fig. 2.4. The main concept of this design is that, since the basic building blocks (OTAs, comparators) of the pipeline ADC and the sigma-delta ADC are almost the same, they can be regrouped to perform different functions. The method has several disadvantages. First, since the OTAs and the comparators are optimized for particular situations, they may not achieve maximum efficiency in the other operation mode. Second, reconfiguration between two completely different modes (pipeline and sigma-delta) needs more switches to switch back and forth, leading to a large area and more interference.

Fig. 2.4. Block diagram of Gulati's design.

Fig. 2.5. Block diagram of Anderson's design.

#### 2.2.2.2. Anderson's Design

Anderson's design was published in 2005 [38]. The design has 8 configurations with top performance of 10-bit 80MS/s. This design has reconfigurability in both conversion rate and resolution. The resolution reconfiguration was realized by turning off the latter stages and the conversion rate reconfiguration was implemented by changing stage 2 and stage 4 for cyclic ADC. The block diagram of this design is shown in Fig. 2.5. One drawback of this design is that no OTA power scaling is involved, so the power efficiency is low.

#### 2.2.2.3. Cheng-Chung's Design

Cheng-Chung's design was published in 2007 in VLSI [4]. The design has three configurations with top performance of 7-bit 1.1GS/s. This design has reconfigurability in both conversion rate and resolution. The resolution reconfiguration was realized by turning off the latter stages and the conversion rate reconfiguration was implemented by changing the interleaving ratios. The block diagram of this design is shown in Fig. 2.6. One drawback of this design is that no OTA power scaling is involved, so the power efficiency is low.

#### 2.2.2.4. Ahmed's Design

Ahmed's design was published in JSSC in Dec. 2005 [30]. This ADC can be configured only in conversion rate. However, it has the largest conversion rate range: it can work from 50 MSPS down to 1 KSPS, a conversion rate ratio of up to 50K. It is very hard to scale the power to match each conversion rate by bias scaling alone over such a large ratio.

The author uses two methods to solve this problem. First, for low conversion rates, the ADC works for only one clock period and rests for several clock periods, depending on the configuration. The average power decreases since the ADC is off most of the time. Second, since the OTAs in the ADC work periodically, they must power on rapidly when needed. The timing and block diagram of the ADC are shown in Fig. 2.7.

#### 2.2.2.5. Seyed's Design

Seyed's design was published in 2011 in VLSI [3]. This ADC trades resolution for conversion rate. The design has three configurations with top performance of 9-bit 250MS/s. The timing and block diagram of the ADC are shown in Fig. 2.8. The converter is reconfigured by generating the sampling and ring-ladder clocks from divided versions of a 1GHz counter clock. When all clocks operate at 1GHz, a 1GS/s 7-bit converter is realized, while dividing the front-end clock by 2 and by 4 yields a 500MS/s 8-bit and a 250MS/s 9-bit converter, respectively.

#### 2.2.3. Summary of Previous Publications

Fig. 2.6. Block diagram of Cheng-Chung's design.

Fig. 2.7. Timing and block diagram of Ahmed's design.

Fig. 2.8. Timing and block diagram of Seyed's design.

Besides the designs described in the previous sections, there are other innovative designs, but they are not treated in detail here. The performance of the best designs is listed in Table 2.1. Since each design has different configurations, processes, and supply voltages, it is very hard to judge which one is the best. For traditional ADCs, the figure of merit (FoM) is used to evaluate similar designs. However, a reconfigurable ADC may have several FoM values for its different configurations. Fair comparison among these ADCs remains a formidable task, but one may define a metric of ADC flexibility, i.e., how well the ADC covers a wide resolution range with minimum speed degradation. We define a reconfigurability factor ( $R_F$ ) as

$$R_F = \frac{2^{ENOB_{\max}}/2^{ENOB_{\min}}}{f_{s,\max}/f_{s,\min}} \tag{2.8}$$
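As a quick numerical check of (2.8), the sketch below evaluates the reconfigurability factor using nominal resolutions in place of measured ENOB values, which is an assumption made here for illustration. The function name is hypothetical. The two calls use the resolution and sample-rate ranges listed in Table 2.1 for the 6/8/10-bit 80 MHz design and the 8/10-bit 2.5-40 MHz design.

```python
def reconfigurability_factor(bits_max, bits_min, fs_max, fs_min):
    """R_F per (2.8), with nominal bits standing in for ENOB (an assumption)."""
    return (2 ** bits_max / 2 ** bits_min) / (fs_max / fs_min)

# Resolution and sample-rate ranges as listed in Table 2.1 (rates in MHz).
rf_fixed_rate = reconfigurability_factor(10, 6, 80, 80)   # 6/8/10 b at 80 MHz
rf_wide_rate = reconfigurability_factor(10, 8, 40, 2.5)   # 8/10 b, 2.5-40 MHz

print(rf_fixed_rate, rf_wide_rate)
```

These two results agree with the tabulated values of 16 and 0.25. For the other rows the nominal-bit estimate differs from the tabulated value, presumably because the table is based on measured ENOB.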

Some rough observations based on Table 2.1 are: (1) Design [27] has the largest area. Part of the reason is that this design uses a  $0.6\mu$ m process; another is the complex configuration between pipeline and sigma-delta modes. (2) Even with the relatively modern technologies in [3] and [4], the area is still larger than that of dedicated designs with the same specifications. In [3], this is due to the massive interleaving applied to achieve high conversion rates using a slow ADC architecture; in [4], it is due to both interleaving and the significant redundancy associated with lower-resolution configurations. (3) Among the reported reconfigurable ADCs, only [27] and [38] took into consideration that the optimum ADC architecture depends on the target resolution. This is why they exhibit the highest reconfigurability factors (i.e., they are the most flexible ADCs). Yet, the architecture reconfigurability in [27] leads to excessive overhead, which limits the high-speed performance. In [38], the two selected architectures are not very different, which results in poor power efficiency at lower resolutions. In other words, the performance of [38] is not much different from that of "fixed-architecture" reconfigurable ADCs.

| Ref. | Bits (b) | f<sub>S</sub> (MHz) | Power (mW) | V<sub>DD</sub> (V) | R<sub>F</sub> | Area (mm<sup>2</sup>) | Process (µm) | Configuration | Comments |
|------|----------|---------------------|------------|--------------------|---------------|-----------------------|--------------|-------------------|------------------|
| [3]  | 7-9      | 250-1000            | 26         | 1.2                | 0.9           | 0.55                  | 0.13         | Rates, resolution | Calibration      |
| [4]  | 5-7      | 550-1100            | 46         | 1.3                | 2.69          | 0.86                  | 0.09         | Rates             |                  |
| [15] | 10       | 0.001-50            | 35         | 1.8                |               | 1.2                   | 0.18         | Rates             |                  |
| [12] | 6-16     | 2.62<br>10          | 24.6<br>17.7 | 4.6<br>4.6       |               | 79.8                  | 0.6          | Rates, resolution | Pipeline<br>$\Sigma \Delta$ |
| [23] | 6,8,10   | 80                  | 94         | 1.8                | 16            | 1.9                   | 0.13         | Rates, resolution | Cyclic, pipeline |
| [6]  | 8,10     | 2.5-40              | 35.4       | 2.5                | 0.25          | 1.9                   | 0.35         | Rates, resolution | Pipeline         |

Table 2.1. Reconfigurable ADC Performance Summary

#### 2.2.4. Problems in Reconfigurable ADC

High performance ADCs are very sensitive to noise and interference. Any noise source could dominate the overall performance. Reconfigurable ADCs are more vulnerable than traditional ones because of the additional wires and switches needed for reconfiguration. For example, if the isolation of an off-state switch is -60dB and there is a 10dBm signal at one terminal, a -50dBm signal appears at the other terminal. This interference may limit the resolution one can get from an ADC. For high speed and high resolution data converters, the isolation is even worse because of the capacitive coupling between the source and drain terminals [6]. The ADC accuracy also needs to be satisfactory for the highest resolution of all possible configurations. This stimulated research on inexpensive means of enhancing the ADC precision in order to minimize the overhead at lower resolutions.
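The switch-isolation example above can be reproduced with a few lines of arithmetic. The sketch below also gives a rough upper bound on resolution under two loudly stated assumptions that are not from the text: the aggressor is as strong as a full scale input, and the leaked tone is the only error source, so the SNDR equals the isolation. Function names are hypothetical.

```python
def leaked_power_dbm(signal_dbm, isolation_db):
    """Power coupled through an off switch; isolation is given as a negative dB value."""
    return signal_dbm + isolation_db

def interference_limited_enob(isolation_db):
    """Rough ENOB bound if the leaked tone alone sets the SNDR (assumption)."""
    sndr_db = -isolation_db
    return (sndr_db - 1.76) / 6.02

leak = leaked_power_dbm(10.0, -60.0)             # -50 dBm, as in the text
enob_bound = interference_limited_enob(-60.0)    # about 9.7 bits

print(leak, round(enob_bound, 2))
```

Under these assumptions, a single -60dB leakage path would already cap the converter below 10 effective bits, which illustrates why extra reconfiguration switches are a concern at higher resolutions.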

Another important issue in conversion rate reconfiguration is how to scale the bias in OTAs for a large conversion rate ratio. Since the OTAs are usually optimized to operate at the highest conversion rate, they may not work properly when the bias is scaled down for lower speeds because the device sizes are not scaled accordingly. The proposed solution tackles this problem by removing the OTAs altogether, since the target medium-resolution ADCs can be implemented with open-loop amplifiers and a digitally-assisted backend.

Finally, the reconfigurable ADC should provide good FoM scaling with different resolutions. The limited efficiency of existing reconfigurable techniques, with very few exceptions, is attributed to the fact that they fix the ADC architecture for all configurations, whereas the optimum architecture depends on the target resolution. So, despite the huge amount of work done on ADCs so far, there is still considerable room for improvement in covering this wide range of applications without sacrificing area and power efficiency.

## 2.3. Conclusion

Key ADC specifications were first presented here to form a reference for the analysis in this dissertation. The definitions in this chapter are used in the following chapters, where the analysis is explained and performance is reported. A literature review of reconfigurable ADC architectures was then presented: key papers on reconfigurable ADCs were selected, their features were explained, and their performance parameters were summarized.

# **CHAPTER 3**

# **Body-voltage-based Digital Calibration in ADCs**

Data converter performance has improved significantly over the years, leveraging advancements in the scaling, density and speed of modern process technology. Many data converter architectures rely on matched components to perform data conversion. In practice, perfectly matched components are impossible to fabricate, and mismatch errors, defined as the difference between the designed and actual component values, are inevitable. In VLSI circuits, static mismatch errors are caused by process variations such as mask misalignment, nonuniform oxide thickness, and nonuniform doping densities. Additional mismatch errors can be generated by temperature gradients across the circuit, component aging and component noise. Mismatch errors cause nonuniform code widths, which distort the converter's transfer function and degrade both its static (INL and DNL) and dynamic (SNDR and ENOB) performance.

The chapter is organized as follows. Section 3.1 presents the matching trends with technology scaling, followed by the matching requirements for different ADC architectures. An overview of the body-voltage calibration technique is given in Section 3.2, where the tradeoffs of this technique are stated. To extend the use of body-voltage trimming beyond technology limitations, hybrid calibration is proposed in Section 3.3. Suggested applications of body trimming are introduced in Section 3.4. Finally, Section 3.5 draws the conclusions.

## **3.1.** Matching Considerations

## **3.1.1.** Matching Trends

Device matching is the heart of precision analog design. Basically, mismatch variance is inversely proportional to device area. This means that well matched devices come at the expense of larger die area, parasitic capacitance and power consumption. For modern compact analog designs, where the active area occupies a few hundred micrometers, the threshold mismatch standard deviation ( $\sigma_{VT}$ ) of an MOS transistor is governed by (3.1):

$$\sigma_{VT} = \frac{A_{VT}}{\sqrt{W \cdot L}} \tag{3.1}$$

where  $A_{VT}$  is an empirical mismatch parameter that depends on the process technology, and W and L are the transistor width and length, respectively. Although increased fabrication accuracy through technology advances has continuously improved the matching parameters, the fast progress of technology scaling has led to an increasingly challenging performance-accuracy tradeoff for analog circuit designers. Fig. 3.1 compares the mismatch parameter across different technology nodes, as well as the relative offset mismatch for a feature-sized transistor with a fixed aspect ratio (W/L). Although the mismatch parameter has decreased over technology nodes, the total offset mismatch has increased [7].

Fig. 3.1. Threshold mismatch versus feature size for different CMOS technology nodes.

Fig. 3.2. Calibration techniques used in prior art (ISSCC 2008-2012, VLSI Circuit Symposium 2008-2012) at different speeds and resolutions.
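The area cost implied by (3.1) can be illustrated with a short sketch. The mismatch parameter used below (4 mV·µm) and the device dimensions are hypothetical values chosen for round numbers, not data from Fig. 3.1, and the function names are assumptions.

```python
import math

def sigma_vt_mv(a_vt_mv_um, w_um, l_um):
    """Threshold mismatch standard deviation in mV, per (3.1)."""
    return a_vt_mv_um / math.sqrt(w_um * l_um)

def area_for_sigma_um2(a_vt_mv_um, sigma_target_mv):
    """Gate area W*L in um^2 needed to reach a target sigma_VT, from (3.1)."""
    return (a_vt_mv_um / sigma_target_mv) ** 2

A_VT = 4.0  # mV*um, hypothetical mismatch parameter

sigma = sigma_vt_mv(A_VT, w_um=4.0, l_um=1.0)  # 2 mV for a 4 um^2 device
area_2mv = area_for_sigma_um2(A_VT, 2.0)       # 4 um^2
area_1mv = area_for_sigma_um2(A_VT, 1.0)       # 16 um^2

print(sigma, area_2mv, area_1mv)
```

Halving the mismatch quadruples the gate area, which is the scaling that makes sizing alone an expensive way to gain each extra bit of accuracy.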

Instead of only sizing the devices to achieve the desired accuracy, the intrinsic accuracy of an area-efficient converter is designed to be worse than its resolution. Before or during chip usage, self-calibration automatically detects and corrects element mismatches, leading to reduced silicon area and improved yield. Regardless of the calibration type (foreground or background), there are two possible places at which calibration can be performed. The first choice is to apply calibration at the converter analog front-end, where analog correction elements are added to handle the converter's random mismatches. The alternative is to correct the processed data digitally at the converter backend [8]. Fig. 3.2 shows the calibration choice for digitally calibrated ADCs published over the last 15 years [5]. The figure shows that most multi-GS/s ADCs with medium resolutions use front-end calibration because the digital hardware of backend calibration can be very power hungry. This work thus focuses on front-end calibration.

Over the past decade, body-biasing techniques have been introduced to raise and lower the threshold voltage to tune leakage current for digital CMOS devices [9]. Recent efforts extend the use of body-biasing to mismatch trimming [1, 10, 11]. Body-voltage trimming has both advantages and limitations. This chapter provides a thorough study of the efficiency of body-voltage trimming calibration for data converters, its limitations, and suggested solutions that extend the usability and enhance the efficiency of this technique.

## 3.1.2. Matching-Critical Blocks

#### 3.1.2.1. Flash Architecture

In a flash architecture, the preamplifier offset mismatches are the most critical parameters in defining the ADC linearity. The comparator offsets are less critical since the comparator inputs are already amplified by the preamplifiers. Yet, very high speed applications require wideband preamplifiers, which reduces their available gain. The backend SR latches receive CMOS input levels and thus do not suffer from their internal mismatches.

Usually, the track-and-hold performance does not suffer from component mismatches since these result in linear errors (e.g. offset errors in the transfer function). The recently emerging interleaved architectures, however, impose very stringent matching requirements to avoid channel-to-channel mismatches. The matching requirements for different blocks in a flash architecture are shown in Fig. 3.3.



Fig. 3.3. Matching requirements for different blocks in an interleaved flash ADC architecture.

#### **3.1.2.2.** Pipelined Architecture

The pipelined analog-to-digital converter (ADC) is a popular architecture for high-speed data conversion in digital communication systems, data-storage systems, and many other applications. These applications often require very high speeds and medium resolutions (6-8 bits) [4, 39]. Among the key building blocks in pipelined ADCs are the multiplying digital-to-analog converters (MDACs) that connect successive converter stages. Unlike other types of non-idealities in a typical pipelined ADC, the non-linearity introduced by the first-stage MDAC is not attenuated or cancelled along the pipeline, so it tends to be the dominant contributor of overall ADC error [8]. The MDAC's precision requirement, especially for the first pipeline stages, is defined by the overall ADC accuracy, which is much higher than the MDAC resolution.

The internal sub-ADCs are less critical since they need to match only to their internal resolutions. The track-and-hold performance does not suffer from component mismatches, since these result in linear errors (e.g. offset errors in the transfer function), unless interleaving is incorporated. The matching requirements for different blocks in a pipelined architecture are shown in Fig. 3.4.

Fig. 3.4. Matching requirements for different blocks in a pipelined ADC architecture.

## **3.2.** Body Voltage Calibration

The term "body-voltage-trimming" combines three design choices: trimming, voltage, and body. The tradeoffs of these choices are discussed in this section.

#### 3.2.1. Advantages

Usually, trimming is performed by shunting each cell of the converter (whether a DAC or an ADC) with a small auxiliary calibration current DAC [12-14], or by modifying the gate voltage of each cell [15-17]. Mixing the main converter array with auxiliary DAC arrays in the first method results in a non-homogeneous layout that may be restricted by nanometer-scale design rules. Moreover, applying calibration circuitry at the converter cell output adds extra capacitance to the critical signal path. This may seriously limit the converter performance at high speeds.

On the other hand, the latter method, which trims the gate bias, does not load the normal signal path of the converter elements. The reason is that voltage trimming is applied by modifying the DC input of each converter cell (e.g., the comparator reference levels for a flash ADC or the current cell bias voltage for a current-steering DAC).

Another advantage of this method, compared to current trimming, is the inherent process tracking<sup>1</sup>. This is because voltage trimming corrects the cause of converter error (threshold voltage mismatches), while current trimming handles the resulting current error at the output. This makes voltage trimming more robust to PVT variations than current trimming<sup>2</sup>.

Moreover, MOS transistors provide a free control knob that does not interfere with the critical signal path: the body terminal. Body voltage trimming corrects mismatches by modifying the body bias of each cell [1, 10, 11], hence changing its threshold voltage. The proposed method leverages the inherent body-to-threshold voltage attenuation, where large voltage steps at the transistor body result in very fine changes in the corresponding threshold voltage. This relaxes the matching, noise, and linearity constraints. The continuously decreasing sensitivity to body voltage variations with technology advances makes the proposed technique more appealing<sup>3</sup>.

#### 3.2.2. Limitations

Despite the attractive features enabled by body voltage trimming, three main challenges that may degrade the efficiency of this technique have been observed. The first issue is the limited trimming range. The trimming range is bounded by the turn-on voltage of the N-well/P-substrate junction, which limits the body voltage range to a few hundred millivolts if forward body biasing is used. To extend the available body trimming range, reverse biasing may be considered. Yet, a higher supply voltage is needed for the trimming DAC to provide reverse biasing [11], and this is not available in most designs. Moreover, the threshold voltage sensitivity to body voltage variations drops dramatically when moving from forward to reverse biasing. Fig. 3.5 shows that the threshold voltage sensitivity to body voltage variations at 0.4V of reverse bias drops by 50% compared to the sensitivity at 0.4V of forward bias. Also, since reverse bias requires voltages higher than the supply rails, reliability issues limit the maximum available range to values not much higher than the range offered by forward bias.

Fig. 3.5. Simulated PMOS threshold voltage sensitivity to body biasing.

Fig. 3.6. Threshold voltage sensitivity to body biasing for different technology nodes.

<sup>1</sup> A detailed analysis of this process tracking is presented in the appendix.

<sup>2</sup> Simulations in a 65-nm CMOS process show that the current trimming range must be 30% larger than the voltage trimming range in order to tolerate transconductance variations across process corners.

<sup>3</sup> A 50% reduction in sensitivity is observed when migrating from a 90-nm to a 40-nm CMOS process.

The second challenge is that the decreasing sensitivity to body voltage variations is a double-edged sword. On the one hand, it enables very fine trimming accuracy. On the other hand, the trimming range shrinks dramatically with technology advances. For example, the threshold trimming range shrinks by a factor of 2 when migrating from a 90nm to a 40nm CMOS process, assuming the same cell area<sup>4</sup>. The factor increases to 4.3 if minimum feature size transistors are used [11], as shown in Fig. 3.6. This trend poses design tradeoffs between device area/power and the tuning range of body voltage for offset cancellation.

Finally, as with any other circuit technique, linearity becomes increasingly challenging with process and supply scaling. Although trimming the body voltage inherently tracks threshold variations, the nonlinear relation described by (3.2) limits the trimming accuracy.

$$|V_{T}| = |V_{T0}| + \gamma \left( \sqrt{2\varphi_{F} + |V_{SB}|} - \sqrt{2\varphi_{F}} \right) \tag{3.2}$$

where  $V_{T0}$  is the threshold voltage at zero source-to-body bias,  $V_{SB}$  is the source-to-body voltage,  $\gamma$  and  $\varphi_F$  are MOSFET device parameters.
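The nonlinearity of (3.2) can be seen by evaluating the threshold shift and its slope at two bias points. The sketch below uses hypothetical device parameters ( $\gamma = 0.4\,V^{1/2}$ ,  $\varphi_F = 0.32\,V$ ) chosen for round numbers; they are not taken from the 65-nm process used in this work, and this first-order model is not expected to reproduce the simulated curves in Fig. 3.5.

```python
import math

def vt_shift(v_sb, gamma, phi_f):
    """|V_T| - |V_T0| from (3.2) for a source-to-body magnitude v_sb >= 0 (volts)."""
    return gamma * (math.sqrt(2 * phi_f + v_sb) - math.sqrt(2 * phi_f))

def vt_sensitivity(v_sb, gamma, phi_f):
    """Slope d|V_T|/d|V_SB| of (3.2): gamma / (2 * sqrt(2*phi_F + |V_SB|))."""
    return gamma / (2 * math.sqrt(2 * phi_f + v_sb))

GAMMA = 0.4   # V^0.5, hypothetical body-effect coefficient
PHI_F = 0.32  # V, hypothetical Fermi potential

shift = vt_shift(0.36, GAMMA, PHI_F)           # about 0.08 V for a 0.36 V body step
slope_0 = vt_sensitivity(0.0, GAMMA, PHI_F)    # about 0.25 V/V at zero bias
slope_1 = vt_sensitivity(0.36, GAMMA, PHI_F)   # about 0.20 V/V at 0.36 V

print(round(shift, 3), round(slope_0, 3), round(slope_1, 3))
```

The body step is attenuated by roughly a factor of four to five on its way to the threshold voltage, which is what permits fine trimming, while the changing slope means that equal body-voltage steps do not produce equal threshold steps.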

This nonlinear relation may result in an available trimming range even lower than the junction turn-on limitation, especially if very high trimming accuracy is targeted. Fig. 3.7 shows the maximum achievable trimming accuracy versus the offset standard deviation ( $\sigma_{off}$ ) in 65-nm CMOS. In this simulation, the goal is to tolerate offsets in the range of  $\pm 3.5\sigma_{off}$  for a current cell, and the trimming accuracy is calculated relative to the cell nominal current. Simulations show that even if the offset range is within the body trimming capabilities, trimming accuracy requirements may limit the use of this method.

Fig. 3.7. Maximum "trimmed" accuracy versus initial offset for a current cell.

<sup>4</sup> A detailed analysis of the short-channel effects is presented in the appendix.

## **3.3.** Hybrid-Calibration Solutions

Interestingly, all the limitations mentioned are related to the maximum available trimming range rather than the minimum achievable trimming step. This motivates adding a "supplemental" method to bring mismatches down to the available body voltage trimming range, so that body trimming can efficiently handle the residual mismatches and calibrate the converter elements to very fine accuracies. To reduce the intrinsic mismatches, three methods are considered: sizing, trimming, and redundancy. Although previous analysis shows that trimming- and redundancy-based calibration are more efficient than raw device sizing [7], the choice of the supplemental method strongly depends on the application. The next subsections present two case studies. For each case, the optimum combination is selected to achieve the target performance with minimal overhead.

#### **3.3.1.** Flash ADC Offset Trimming

In this work, offset cancellation is achieved by modifying the comparator reference voltages, which does not affect the critical signal path [17]. This way calibration does not degrade high-speed performance. To achieve high calibration accuracy with minimal overhead, the proposed differential reference adjustment of the comparator combines two methods as shown in Fig. 3.8. The first approach is to select the tap-point from the resistor ladder by a 3-bit coarse control signal. The selected voltage is connected to a PMOS reference buffer, where the bodies of PMOS devices are connected to another resistor ladder with 3-bit digital control to perform reference fine-tuning [1].



Fig. 3.8. The comparator differential-reference adjustment combines coarse and fine-tuning to tolerate a large offset range with very fine accuracy.

Although this approach is already presented in [1, 18], the proposed circuit implementation offers a wider tuning range. Inspired by the self-biased amplifier technique presented in [19], where the input signal is applied to both the buffer input transistor and the current source, the body trimming is applied to both transistors in order to double the effective tuning range. Although it only needs to cover one coarse step, the fine-tuning range covers 2 coarse steps to tolerate PVT variations. By combining both methods, the tuning range of each comparator is more than 25% of the ADC dynamic range (*DR*) with 0.25 LSB steps at the 5-bit resolution.
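The quoted numbers can be cross-checked with simple arithmetic. The sketch below assumes that each range equals the number of control codes times the step size, and it infers the coarse step from the statement that the fine range covers two coarse steps. Both are inferences from the paragraph above, not stated design values.

```python
ADC_BITS = 5          # resolution at which the LSB is defined (from the text)
COARSE_BITS = 3       # coarse tap-select control (from the text)
FINE_BITS = 3         # fine body-trim control (from the text)
FINE_STEP_LSB = 0.25  # fine step in LSBs (from the text)

# Assumption: the fine range spans all fine codes.
fine_range_lsb = (2 ** FINE_BITS) * FINE_STEP_LSB
# From the text: the fine range covers 2 coarse steps.
coarse_step_lsb = fine_range_lsb / 2
# Assumption: the coarse range spans all coarse codes.
coarse_range_lsb = (2 ** COARSE_BITS) * coarse_step_lsb

full_scale_lsb = 2 ** ADC_BITS
coarse_fraction = coarse_range_lsb / full_scale_lsb

print(f"fine range   : {fine_range_lsb} LSB")
print(f"coarse step  : {coarse_step_lsb} LSB")
print(f"coarse range : {coarse_range_lsb} LSB ({100 * coarse_fraction:.0f}% of DR)")
```

Under these assumptions the coarse ladder alone spans 25% of the dynamic range and the fine range adds to it, which is consistent with the "more than 25%" figure.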

Despite the fact that body trimming in bulk CMOS process necessitates the use of the "slow" PMOS devices (in order to independently control the body voltage of each device), these PMOS transistors are essentially needed only for the reference buffers that do not require wide bandwidth. The high-speed core, preamplifiers and dynamic-latches, are built using NMOS input transistors to enable high-speed. This favors applying body trimming to the reference buffer rather than directly to the preamplifier as in [10, 11].

#### **3.3.2.** Improving the ADC Dynamic Range

Body voltage trimming offers one more advantage over conventional voltage (and current) trimming, which is the DR saving. The problem is that even distribution of trip-points across the full input range is necessary for optimal ADC performance, as shown in Fig. 3.9(a). In the actual case shown in Fig. 3.9(b), however, some trip points nominally assigned to codes near the input range edges may occur outside the input range edges (due to random offsets) [20]. Although adding extra trimming levels may solve this problem, the required reference range may be significantly larger than the usable input range. The problem becomes more severe with technology advance, where offsets increase dramatically resulting in significant DR overhead.



Fig. 3.9. Trip-point distribution over the input range. (a) Ideal case. (b) Actual case (extended ladder).

This problem does not exist if body trimming is used. Since body trimming controls a different port of the converter cells (i.e. the body terminal), the body trimming range does not affect the ADC input range. Unfortunately, this is valid only if pure body trimming is sufficient for calibration. This motivates adopting redundancy as a "supplemental" calibration method other than conventional coarse trimming, even if pure trimming is the most efficient in terms of hardware overhead. In this case, redundancy is introduced just to bring comparators offsets down to the body voltage trimming range. Hence body voltage trimming successfully calibrates mismatches to the required fine accuracy with zero *DR* overhead.

## **3.4.** Design Example: MDAC Calibration

#### 3.4.1. Background

Nonlinearity of a multi-bit MDAC within a pipelined ADC is typically calibrated in one of three ways: (1) adjusting or trimming the currents that generate the DAC output signal [12, 15-16, 43], (2) introducing redundant elements and selecting the best set to improve linearity [44, 45], and (3) using digital code mapping, either by pre-correcting the MDAC digital input [46] or by post-correcting the ADC output code [8, 47-48]. The first two methods are applied at the converter front-end, while the last method is applied to the back-end digital hardware. Fig. 3.10 shows the



Fig. 3.10. Calibration techniques used in prior art (ISSCC 2008-2012, VLSI Circuit Symposium 2008-2012) at different speeds and resolutions.



Fig. 3.11. Area vs. accuracy for a 4-bit thermometer-coded DAC calibrated with sizing, trimming, or redundancy.

calibration choice for digitally calibrated ADCs published over the past 15 years [5]. The figure shows that most multi-GS/s ADCs with medium resolutions use calibration at the front-end because the digital hardware of back-end calibration can be very power hungry at these speeds. This work thus focuses on front-end calibration. The next section compares the overhead associated with two techniques: redundancy and trimming.

#### **3.4.2.** Comparison between Trimming and Redundancy

Digital calibration incorporating redundancy of data converters is becoming attractive as feature size and supply voltage shrink [49]. Redundancy calibration effectively decouples analog performance from component matching. Consider *R* elements assigned to each DAC level. The calibration is performed by sweeping the DAC from code "0" to the maximum code, and for each code only *one-of-R* cells is selected so as to best match the target level. During the sweep, the output at each code is compared to that of an accurate, albeit slow, auxiliary DAC. For a fixed yield, a relation between the overhead due to redundancy and the target improvement of accuracy can be given by (3.3):

$$OV_{Redundancy} = K_{Red} \times 2^{N_{improvement}} \tag{3.3}$$

where  $K_{Red}$  is a constant that depends on implementation (nominally equal to one), and  $N_{improvement}$  is the accuracy improvement (in bits) due to calibration.

Even with this simple approach to providing redundancy, (3.3) shows outstanding scaling behavior of this technique. The amount of redundancy doubles for each extra bit of target accuracy, which is much better than raw scaling of device size to achieve the target resolution (4x scaling per bit).
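The one-of-*R* selection and the scaling in (3.3) can be sketched as follows. This is a minimal illustration, not the implemented calibration engine: it assumes a thermometer-coded DAC whose cumulative output is matched level by level against an ideal auxiliary reference, and the function names and candidate values are hypothetical.

```python
def calibrate_by_redundancy(candidates_per_level, unit=1.0):
    """Pick one of R candidate cells per level so the cumulative output
    best matches the reference level (assumed ideal auxiliary DAC)."""
    selected = []
    errors = []
    output = 0.0
    for level, candidates in enumerate(candidates_per_level, start=1):
        target = level * unit
        best = min(range(len(candidates)),
                   key=lambda i: abs(output + candidates[i] - target))
        output += candidates[best]
        selected.append(best)
        errors.append(output - target)
    return selected, errors


def overhead_redundancy(n_improvement, k_red=1.0):
    """Overhead from (3.3): doubles for each extra bit of accuracy."""
    return k_red * 2 ** n_improvement


def overhead_sizing(n_improvement):
    """Raw device sizing: 4x area per extra bit of accuracy."""
    return 4 ** n_improvement


if __name__ == "__main__":
    cells = [[0.90, 1.02, 1.10], [1.05, 0.97, 1.20]]  # made-up cell currents
    print(calibrate_by_redundancy(cells))
    for n in range(1, 5):
        print(n, overhead_redundancy(n), overhead_sizing(n))
```

For a 3-bit improvement, for example, the two functions give an overhead of 8 for redundancy against 64 for sizing.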

Although trimming overhead depends strongly on implementation details (e.g., on the trimming circuit design and the layout design rules for the used technology), a general trend is observable for a wide range of implementations [12]. The overhead can be split into two terms: fixed area overhead which is the added circuitry and layout extra spacing needed to enable trimming,  $OV_{Trim,Fixed}$  (regardless of the amount of trimming), and area overhead that is directly proportional to the trimming dynamic range,  $OV_{Trim,Var}$ . A designer can generally expect a doubling in the number of trimming elements for each extra bit of accuracy improvement in order to cover the same offset range with twice the accuracy. The total overhead,  $OV_{Trim}$ , is given by (3.4):

$$OV_{Trim} = OV_{Trim, Fixed} + K_{Trim, Var} \times 2^{N_{improvement}} \tag{3.4}$$

where  $K_{Trim, Var}$  is a constant that depends on implementation.

Interestingly, the area overhead of trimming scales with accuracy in the same way as that of redundancy: both are proportional to  $2^{N_{improvement}}$ . However, implementation details make the absolute area overheads considerably different. To illustrate, we provide a quantitative comparison based on the circuit implementation described in the next section. While different implementation details may change the specific area overhead, the trends and general conclusions hold. Using the design in the next section, we find  $OV_{Trim,Fixed}$  to be 4 and  $K_{Trim,Var}$  to be 1/8. With these values, Fig. 3.11 compares the scaling behaviors of sizing, trimming, and redundancy-based calibration. In this figure, the area for each technique (relative to the 4-bit case) is plotted versus target accuracy. For high target accuracies, trimming and redundancy have similar scaling. For target accuracy below 8 bits, the area overhead of

trimming is lower and relatively insensitive to target accuracy leading to an offset between the two curves and favoring trimming.
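As a quick check on (3.4), substituting the quoted constants gives the expressions below. This is only arithmetic on the stated values, and it assumes that both overheads are expressed in the same normalized units with $K_{Red} = 1$ in (3.3).

$$OV_{Trim} = 4 + \tfrac{1}{8} \times 2^{N_{improvement}} = 4 + 2^{N_{improvement}-3}, \qquad \frac{OV_{Trim}}{OV_{Redundancy}} = \frac{4}{2^{N_{improvement}}} + \frac{1}{8}$$

The variable term stays below the fixed term until $N_{improvement} = 5$, which is why the trimming overhead is relatively insensitive to accuracy at the low end. At high accuracies the ratio approaches 1/8, so the two techniques scale alike but differ by a roughly constant factor.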

Another fundamental difference that favors trimming over redundancy is that redundancy overhead essentially interferes with the converter high-speed core, whereas trimming overhead can be placed away from the converter core and thus minimizes the speed penalty. For this reason, the overheads of trimming and redundancy are often referred to as extrinsic and intrinsic redundancy, respectively [50].

#### **3.4.3.** Body-Voltage Trimming

Usually, trimming is performed by shunting each cell of the converter with a small auxiliary calibration current DAC [12, 43], or by modifying the gate voltage of each cell [16, 44]. Mixing the main converter array with auxiliary DAC arrays in the first method results in a non-homogeneous layout that may be restricted in nanometer-scale rules. Moreover, applying calibration circuitry at the converter cell output adds extra capacitance to the critical signal path. This may seriously limit the converter performance at high-speeds. Fig. 3.12(a) shows the cell capacitance, and the overhead capacitance due to trimming, versus MDAC intrinsic accuracy for a 4-bit MDAC with 7-bit target accuracy. Analysis assumes same overdrive voltage for both the core cell and the auxiliary cells. As depicted by [12], the total capacitance (or area) can be minimized by optimizing the "intrinsic accuracy"–"calibration depth" combination. Even with this optimized solution, capacitance overhead is almost 100%. Fig. 3.12(b) shows that the problem becomes even worse for higher target accuracies.

The latter method, which trims the gate bias, does not load the critical signal path of the converter elements. The reason is that voltage trimming is applied by modifying the DC input



Fig. 3.12. Output-calibration overhead for a 4-bit DAC: (a) capacitance, of the core cell and the auxiliary DAC, versus MDAC intrinsic accuracy for 7-bit target precision, and (b) normalized total capacitance versus target accuracy assuming optimal intrinsic-accuracy selection.



Fig. 3.13. Trimming calibration: (a) conventional and (b) body-voltage trimming.

of each converter cell (e.g., the current cell gate-bias voltage for a CS DAC). However, this technique suffers from either the increased gate leakage of nanometer-scale CMOS technology, or the need for special floating-gate devices. Moreover, noise constraints necessitate increasing the power consumption of the trimming circuit for both methods.

Alternatively, we propose using a body-voltage based trimming scheme. MOS transistors provide a free control knob that does not interfere with the critical signal path, namely the body terminal. Body voltage trimming corrects mismatches by modifying the body bias of each cell [10, 11], hence changing its threshold voltage. The proposed method leverages the inherent body-to-threshold-voltage attenuation, where large voltage steps at the transistor body result in very fine changes in the corresponding threshold voltage. This relaxes the matching, noise, and linearity constraints. The continuously decreasing sensitivity to body-voltage variations with technology advance makes this technique more appealing (a 50% reduction in sensitivity is observed when migrating from a 90-nm to a 40-nm CMOS process).

Another advantage of this method, compared to current trimming, is the inherent process tracking. This is because voltage trimming corrects the cause of converter error (i.e., threshold voltage mismatches), while current trimming handles the resulting current error at the output. This makes voltage trimming more robust to PVT variations than current trimming. Consider two different pairs of current cells with initial mismatch, where one pair is calibrated using current trimming (Fig. 3.13(a)) and the other pair is calibrated with body-voltage trimming (Fig. 3.13(b)). Assuming that both pairs are perfectly calibrated (zero error) at room temperature, Fig. 3.14 compares the residual mismatch of both pairs versus temperature. The figure shows that body-voltage trimming stability over temperature variations is more than 10 times better than current trimming<sup>5</sup>.

<sup>5</sup> A detailed analysis of this process tracking is presented in the appendix.



Fig. 3.14. Temperature stability: residual mismatch, in percent, versus temperature for body voltage and output-current calibration.

#### **3.4.4.** Circuit Implementation

As discussed in the previous section, MDACs have two common characteristics that impact calibration: (1) MDACs are thermometer coded and calibration is applied similarly to each level, and (2) accuracy is much finer than the intrinsic resolution. This section describes the implementation of a simple and modular calibration scheme that can be symmetrically applied to all DAC cells. The trimming hardware is shared between redundant elements to minimize overhead.

#### 3.4.4.1. ADC Self-Calibration Architecture

At startup, the static non-linearity of the MDAC circuit is measured on-chip using the setup depicted in Fig. 3.15(a) [51]. After the measurement phase, the Ref-DAC is disconnected and the digital correction settings are stored in a memory in order to correct for the non-linearities as determined by the measurement procedure. During self-measurement, the C-ADC outputs are disabled and the DAC digital inputs are controlled by the calibration engine. The front-end T/H





Fig. 3.15. Proposed self-calibration: (a) high-level scheme, (b) residue plot before and after calibration, and (c) current cell with the calibration circuitry.

is also bypassed with an accurate (>7-bit accurate) Ref-DAC that stimulates the ADC with several DC voltage levels. The Ref-DAC has the same resolution as the MDAC, but with much higher accuracy.

At the beginning, the MDAC digital input is set to the lowest value (all zeros), and the Ref-DAC stimulates the ADC with the ideal corresponding value. In this case, the residue signal is



Fig. 3.16. Die photograph of the prototype chip.

|                                    | [4]       | [39]       | This Work |
|------------------------------------|-----------|------------|-----------|
| Supply Voltage (V)                 | 1.3       | 1.2        | 1.2       |
| Process (nm CMOS)                  | 90        | 90         | 65        |
| ADC Architecture                   | Pipelined | Subranging | Pipelined |
| MDAC Architecture                  | SC        | R-ladder   | CS        |
| MDAC Calibration                   | No        | No         | Yes       |
| Active Die Area (mm <sup>2</sup> ) | 0.19      | 0.6        | 0.15      |
| Resolution (Bits)                  | 7         | 8          | 7         |
| Sampling Rate (GS/s)               | 1.1       | 0.77       | 1.5       |
| Total Power (mW)                   | 46        | 70         | 41        |
| Figure of Merit (fJ/conv)*         | 1180      | 940        | 460       |
| ENOB (Bits)                        | 6.5       | 6.97       | 6.27      |

 TABLE 3.1
 ADC PERFORMANCE COMPARISON

\*  $FoM = Power / [2^{ENOB} \times \min(f_s, 2 \times ERBW)]$ 

the amplified difference between the Ref-DAC and the MDAC outputs. The F-ADC digitizes the residue signal, and an algorithm controls the Cal-DAC in order to minimize the residue signal. The MDAC input is then set to the next code and the process is repeated until all the MDAC codes are corrected. Fig. 3.15(b) shows the residue curve before and after calibration, where the points at which the residue is minimized are marked. In summary, the MDAC INL is corrected by fitting to an accurate Cal-DAC transfer function. It is noteworthy that despite the high precision requirements for this Ref-DAC (better than 7 bits), only a few taps are required in the calibration process (16 levels). These few taps minimize the nonlinearity caused by contact resistances, which allows for an on-chip resistor ladder with at least 11-bit equivalent linearity [42, 52].
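The code-by-code procedure described above can be sketched in a few lines. The source does not specify the search algorithm, so the exhaustive search over trim codes, the signed trim coding, the residue gain, and the function name `calibrate_mdac` are all assumptions made for illustration.

```python
import random


def calibrate_mdac(cell_errors, trim_step, trim_bits, residue_gain=4.0):
    """Trim each thermometer cell in turn so the amplified residue is minimized.

    cell_errors  : per-cell current error, in units of the nominal cell current
    trim_step    : correction per trim code, same units (hypothetical step)
    trim_bits    : number of bits of the trim code
    residue_gain : assumed gain of the residue amplifier
    """
    n_codes = 2 ** trim_bits
    # Signed corrections centered on zero (an assumed Cal-DAC coding).
    corrections = [(code - n_codes // 2) * trim_step for code in range(n_codes)]

    trim_codes = []
    residual_inl = []
    accumulated = 0.0  # error left on the MDAC output after the previous code
    for error in cell_errors:
        # The Ref-DAC supplies the ideal level, so the residue seen by the
        # F-ADC is the amplified MDAC error at this code.
        def residue(code):
            return residue_gain * (accumulated + error + corrections[code])

        best = min(range(n_codes), key=lambda code: abs(residue(code)))
        accumulated += error + corrections[best]
        trim_codes.append(best)
        residual_inl.append(accumulated)
    return trim_codes, residual_inl


if __name__ == "__main__":
    random.seed(0)
    errors = [random.uniform(-0.1, 0.1) for _ in range(15)]  # 15 cells assumed
    codes, inl = calibrate_mdac(errors, trim_step=0.01, trim_bits=5)
    print("trim codes:", codes)
    print("worst residual error:", max(abs(x) for x in inl))
```

As long as the initial errors fall inside the trim range, the residual error at every code is bounded by half a trim step. This is the reason the trim range, not the step size, is the limiting factor.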

#### 3.4.4.2. Design of Self-Calibrating Current Cell

The current cell for body-voltage calibration is shown in Fig. 3.15(c). The cell comprises a PMOS current source,  $M_1$ , differential switches,  $M_{2a,b}$ , and a resistive trimming DAC connected to the body of each current cell. To enable low-voltage operation, no cascode device is added to the current source. Instead, the switching pair is biased in saturation when turned on to effectively increase the output impedance of the current output. This arrangement may result in current glitches during switching, but the glitches do not impact the ADC performance because the MDAC outputs are sampled by the F-ADC. For further power and area savings, the trimming resistor ladder is shared between all cells.

A digital decoder is embedded in the buffer chain between the C-ADC and MDAC. The proposed decoder, shown in Fig. 3.15(c), enables independent control for each DAC cell. In the calibration mode at start-up, the decoder overwrites the C-ADC output. In this mode, the input of each DAC cell can be externally set to high, low, or completely deactivated.

#### 3.4.4.3. Measurement Results

A prototype 7-bit ADC is fabricated in 65-nm CMOS (Fig. 3.16). The internal DAC has an active area of  $0.01 \text{mm}^2$ . Calibration hardware occupies ~50% of the total DAC area (i.e., the optimal ratio depicted by [12]). Fig. 3.17 shows the measured differential nonlinearity (DNL) and integral nonlinearity (INL) of the MDAC. The measured DNL range was reduced from -0.482/+0.6 LSB before calibration to -0.173/+0.181 LSB after calibration. The measured INL range was reduced from -0.019/+0.872 LSB before calibration to -0.086/+0.173 LSB after calibration. The SFDR, SNDR and ENOB versus sampling rate and input signal frequency are shown in Fig. 3.18. The ADC achieves a calibrated effective number of bits (ENOB) of 6.2b, with >600MHz effective resolution bandwidth (ERBW). A summary of the measured performance and a comparison are presented in Table 3.1. The table compares the performance of this 7-bit ADC, which uses a CS MDAC, to other ADCs that use different DAC architectures (i.e., SC and R-ladder DAC). The comparison shows superior performance in terms of area, power, and efficiency.
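The figure of merit in Table 3.1 can be reproduced approximately from the numbers quoted in this section. The sketch below uses the footnote definition of the FoM. It takes the ERBW as exactly 600 MHz, which is an assumption because the text only gives a lower bound.

```python
def figure_of_merit(power_w, enob_bits, f_s_hz, erbw_hz):
    """FoM = Power / (2^ENOB * min(f_s, 2 * ERBW)), in joules per conversion step."""
    return power_w / (2 ** enob_bits * min(f_s_hz, 2 * erbw_hz))


# Values quoted in the text: 41 mW, ENOB of 6.2 b, 1.5 GS/s, ERBW > 600 MHz.
fom = figure_of_merit(power_w=41e-3, enob_bits=6.2, f_s_hz=1.5e9, erbw_hz=600e6)
print(f"FoM = {fom * 1e15:.0f} fJ/conv-step")
```

The result lands close to the 460 fJ/conv entry reported for this work. A larger ERBW would not change it until 2 x ERBW exceeds the 1.5 GS/s sampling rate, since the minimum is then set by the sampling rate.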



Fig. 3.17. Measured MDAC nonlinearity before and after calibration: (a) DNL and (b) INL. The nonlinearity is shown in LSBs of the 7-bit ADC accuracy.



Fig. 3.18. ADC dynamic performance: (a) SNDR and ENOB (before and after calibration) versus  $f_{in}$  for  $f_s = 1.5$ GHz, and (b) calibrated SNDR and SFDR versus  $f_s$  for  $f_{in} = 10$ MHz.

#### **3.5.** Conclusion

The efficiency of using body-voltage trimming for offset calibration of different data-converter blocks has been investigated. Both the advantages and the challenges of this technique are studied in detail in the context of two critical data-converter blocks. Methods have been presented to extend the use of bulk voltage trimming beyond technology limitations. As a case study, a foreground calibration method for a current-steering MDAC based on body trimming has been presented. The proposed current cell allows calibration of the current sources to very high accuracies. Measurement results from a prototype chip, implemented in 65-nm CMOS, show significant performance improvement when applying body-voltage trimming. The area and power overhead due to trimming have been minimized using the proposed methods, while neither special technology nor a special operating environment was required. The approach manages device mismatch with low penalty. A 4-bit thermometer-coded DAC with this calibration technique is embedded in a 7-bit pipelined ADC and implemented in 65-nm CMOS. The ADC has achieved a calibrated SFDR of 47dB at speeds up to 2GHz. The MDAC core occupies  $0.01 \text{mm}^2$  of active area and the entire ADC consumes 41mW from a 1.2V supply at 1.5GS/s.

# **CHAPTER 4**

# An Architecture-Reconfigurable 3b-to-7b 4GS/s-to-1.5GS/s Digitally-Calibrated ADC in 65-nm CMOS

This chapter introduces the design of a high-speed reconfigurable analog-to-digital converter in 65-nm CMOS. Accuracy requirements are met without compromising performance by means of digital calibration and smart architecture selection. Partial interleaving architecture and the introduction of a current-steering DAC and an open-loop amplifier are proposed to relax the residue amplifier settling at minimum area and power overhead. Dynamic thresholds adjustment for the sub-ADCs is employed both to calibrate the ADC offset mismatches and to correct for the amplifier gain and nonlinearity errors. The ADC covers a resolution range from 3-bit to 7-bit at sampling rates from 4GS/s to 1.5GS/s. The worst case DNL and INL are  $\pm 0.45$ LSB and  $\pm 0.66$ LSB respectively. The ADC achieves a figure-of-merit of 0.46pJ/conv-step at 7-bit and occupies an active area of 0.15mm<sup>2</sup>.

## 4.1. Introduction

Modern digital communication systems target satisfying multiple standards. Applications include read channels of magnetic and optical data storage systems, PCIe links, FPGA I/Os, and multi-standard radios. This stimulates the research on reconfigurable analog-to-digital converters (ADCs) to serve as a key building block at the front-end of such systems [1-4]. Generally, two

main techniques are used to achieve reconfigurability. The first technique directly trades speed for resolution, as in sigma-delta and counter ADCs [2, 3], and the second technique turns on and off some blocks of a flash or a pipelined ADC [1, 4]. Despite its simplicity and low overhead, the first technique suffers from significant speed degradation at higher resolutions, which limits its use to applications that require inverse bandwidth–resolution proportionality, like software-defined radio. The second approach does not suffer from this problem, but its poor figure-of-merit (FoM) scaling with different resolutions reduces its flexibility. The limited efficiency of both techniques is attributed to the fact that they fix the ADC architecture for all configurations, whereas the optimum architecture depends on the target resolution.

This chapter introduces an architecture reconfigurable ADC that efficiently covers a wide range of resolutions by configuring the ADC to the proper architecture for each resolution. This leads to a reconfigurable ADC nearly as efficient as dedicated designs in both area and power.

Although interleaving multiple ADCs increases the converter's sampling rate, two main challenges limit the efficiency of this technique. The first challenge is that the ADC occupies massive area, and the second one is that sampling mismatches limit the time-interleaved ADC (TI-ADC) performance. The latter problem can be eliminated with the partial interleaved architecture that interleaves only at next stages [53], but the excessive hardware increases the converter area and power consumption.

This work proposes a two-step architecture that minimizes the application of interleaving to only residue generation—the speed bottleneck. This architecture is considered mostly "single-channel" because it has a single track-and-hold amplifier (THA), single sub-ADCs and a single

MDAC. The design eliminates the sampling mismatches from TI-ADCs and achieves high speed at minimal power and area overhead. Despite being applied to a two-step architecture, the approach is extendable to any pipelined ADC.

The chapter is organized as follows. Section 4.2 introduces the reconfigurable ADC architecture. Section 4.3 describes the implementation of some key building blocks. Calibration circuits and algorithms are explained in Section 4.4. Finally, Section 4.5 draws the conclusions.

## 4.2. Reconfigurable ADC Architecture

Fig. 4.1 shows the FoM versus resolution for gigasamples/s flash and multi-step ADCs reported in the ISSCC and VLSI over the past 15 years [5]. The FoM scaling trends are extracted for both architectures. The figure shows a performance crossover around 5-bit resolution. This means that an efficient reconfigurable ADC should change the architecture from flash to multi-step at this resolution. The proposed reconfigurable ADC architecture is shown in Fig. 4.2. It consists of a track-and-hold (T/H), two 3-bit coarse ADCs (C-ADCs), and a multiplying digital-to-analog converter (MDAC) with two back-end 3-bit fine ADCs (F-ADCs).

#### 4.2.1. Flash ADC Architecture

For the 3-bit configuration, the ADC is configured to a pure flash architecture and only one 3-bit C-ADC is activated. The other blocks, with their associated clocking, are powered down. The proposed flash architecture is shown in Fig. 4.3. It contains eight comparator slices, with accompanying offset-canceling buffers, resistor ladder, CMOS thermometer-to-binary encoder



Fig. 4.1. Trends in Gigasamples/s ADCs Performance: FoM versus ADC resolution for both flash and multi-step architectures in [6].



Fig. 4.2. Reconfigurable ADC architecture.



Fig. 4.3. 3-bit Flash ADC architecture.

with bubble correction, and clock buffers. Each comparator slice is built of a differential preamplifier and dynamic comparator stage followed by an SR-latch. The clock buffers provide the sampling clock for the track-and-hold, the comparators, and the following encoder.

The ADC is configured to 4-bit flash architecture by merging two 3-bit blocks with interleaved threshold levels as in [1]. In this case, the threshold levels of one ADC are shifted <sup>1</sup>/<sub>4</sub> LSB up, and the thresholds of the other ADC are shifted <sup>1</sup>/<sub>4</sub> LSB down. Then a 3-bit adder is added at the back-end to generate the 4-bit binary output.
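The threshold interleaving can be illustrated with a small behavioral model. It is a simplified sketch: it assumes seven nominal thresholds per 3-bit block placed at integer multiples of the 3-bit LSB, and it models the back-end adder as a plain sum of the two thermometer counts. The actual comparator count and encoder details are not modeled.

```python
LSB3 = 1.0  # one LSB of the 3-bit flash, normalized

# Assumption: nominal 3-bit thresholds sit at integer multiples of LSB3.
base = [k * LSB3 for k in range(1, 8)]
adc_up = [t + LSB3 / 4 for t in base]    # thresholds shifted 1/4 LSB up
adc_down = [t - LSB3 / 4 for t in base]  # thresholds shifted 1/4 LSB down


def flash_count(v_in, thresholds):
    """Thermometer count: number of thresholds that lie below the input."""
    return sum(1 for t in thresholds if v_in > t)


def merged_code(v_in):
    """Sum of the two 3-bit counts, as produced by the back-end adder."""
    return flash_count(v_in, adc_up) + flash_count(v_in, adc_down)


if __name__ == "__main__":
    merged = sorted(adc_up + adc_down)
    print("merged thresholds:", merged)
    print("codes:", [merged_code(v / 4) for v in range(0, 33)])
```

The merged thresholds are uniformly spaced by half of the 3-bit LSB, i.e., one LSB at 4-bit resolution, and the summed output steps by one at each of them.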

#### 4.2.2. Two-Step ADC Architecture

For higher resolutions, the ADC is configured to the two-step architecture shown in Fig. 4.4(a). The 5-bit, 6-bit, and 7-bit resolutions are achieved by configuring the C-ADC and F-ADC as 3b-3b, 3b-4b, and 4b-4b respectively. The residue signal is generated through a two-way interleaved subtraction, with the timing diagram shown in Fig. 4.4(b). The interleaved residue signals are multiplexed back right after the subtraction, amplified, and quantized by the F-ADC. This technique enables three key features: (1) it offers a full cycle for the THA settling, thus saving significant power consumption for the THA buffer, (2) the C-ADC comparator regeneration takes a complete half cycle instead of sharing it with the MDAC settling, and (3) another half cycle is provided for the MDAC settling. This makes the settling time of these three critical blocks equivalent to that of a two-way interleaved conventional two-step ADC, although none of these blocks is actually interleaved. Offset mismatches between the interleaved subtractor buffers are digitally calibrated with body voltage trimming as described in Section 4.4.
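The resolution-to-architecture mapping described in Sections 4.2.1 and 4.2.2 can be summarized as a small lookup table. The table entries come from the text, while the names `CONFIGS` and `overlap_bits` are hypothetical. The one-bit overlap it reports is simply what the stage resolutions imply arithmetically. How that overlap is used is not stated in this section.

```python
# resolution -> (architecture, C-ADC bits, F-ADC bits or None for flash modes)
CONFIGS = {
    3: ("flash", 3, None),     # one 3-bit C-ADC, other blocks powered down
    4: ("flash", 4, None),     # two 3-bit blocks with interleaved thresholds
    5: ("two-step", 3, 3),
    6: ("two-step", 3, 4),
    7: ("two-step", 4, 4),
}


def configure(resolution):
    """Return the architecture and stage resolutions for a target resolution."""
    if resolution not in CONFIGS:
        raise ValueError(f"unsupported resolution: {resolution}")
    return CONFIGS[resolution]


def overlap_bits(resolution):
    """Bits shared between the coarse and fine stages in two-step modes."""
    arch, coarse, fine = configure(resolution)
    if arch != "two-step":
        return 0
    return coarse + fine - resolution


if __name__ == "__main__":
    for bits in sorted(CONFIGS):
        print(bits, configure(bits), "overlap:", overlap_bits(bits))
```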


Fig. 4.4. Two-step configuration: (a) partially-interleaved architecture, and (b) timing diagram.

# 4.3. Circuit Implementation

#### 4.3.1. Track-and-Hold

Although the two-step architecture, with multi-bit C-ADC, relaxes the accuracy and noise requirements of the back-end stage, the increased comparators kickback noise of the C-ADC limits the ADC performance. Such effect becomes worse when increasing the C-ADC resolution since more comparators contribute to the kickback noise. This problem may be solved by using two separate T/H circuits for the C-ADC and the MDAC, but this raises the clock matching constraints between C-ADC and the MDAC sampling instances. The analysis in [24], for the

timing mismatch in interleaved ADC, can be modified to calculate the maximum tolerable timing mismatch between the two T/Hs as

$$\frac{\Delta T}{T_s} = \sqrt{\frac{6}{\pi^2 \cdot SNR}} \tag{4.1}$$

where  $\Delta T$  is the sampling time mismatch,  $T_s$  is the sampling period, and *SNR* is the signal-to-noise ratio due to timing mismatch.

To guarantee less than 1dB degradation of the overall ADC performance, this *SNR* should be 6dB higher than signal-to-quantization-noise-ratio (SQNR). This leads to an *SNR* defined as

$$SNR \ge 6 \times \left[ \text{Resolution}(\text{C-ADC}) + 1 \right] + 1.76 \quad \text{dB} \tag{4.2}$$

Fig. 4.5 shows the maximum tolerable percentage timing mismatch versus C-ADC resolution. For 4-bit C-ADC resolution, the clocks for the two T/H circuits should match to better than 2% of the sampling clock period. Such constraint is very hard to achieve for gigasamples/s applications unless accurate digitally-controlled delay lines are used.
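The 2% figure follows directly from (4.1) and (4.2). The sketch below evaluates both, assuming that the *SNR* in (4.1) is a linear power ratio and therefore converting the decibel value from (4.2) before use. The function names are illustrative.

```python
import math


def required_snr_db(cadc_bits):
    """SNR bound from (4.2): SQNR of the C-ADC plus a 6 dB margin."""
    return 6 * (cadc_bits + 1) + 1.76


def max_timing_mismatch(cadc_bits):
    """Maximum tolerable dT/Ts from (4.1), with the SNR taken as a power ratio."""
    snr_linear = 10 ** (required_snr_db(cadc_bits) / 10)
    return math.sqrt(6 / (math.pi ** 2 * snr_linear))


if __name__ == "__main__":
    for bits in (2, 3, 4, 5):
        print(f"{bits}-bit C-ADC: dT/Ts <= {100 * max_timing_mismatch(bits):.2f} %")
```

For a 4-bit C-ADC this evaluates to about 2% of the sampling period, and the tolerance roughly halves for every additional bit.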



Fig. 4.5. Maximum tolerable timing mismatch versus C-ADC resolution for 1dB SNR penalty.



Fig. 4.6. Proposed track-and-hold circuit with source-follower amplifier.

The proposed T/H circuit, shown in Fig. 4.6, uses a single clock bootstrapping circuitry and two switched capacitors that share the gate and source nodes. This technique reduces the T/H sensitivity to the (15 comparators) C-ADC kick-back noise, without raising the clock matching constraints between C-ADC and the MDAC sampling instances since it guarantees simultaneous switching for both circuits.

#### 4.3.2. Flash Sub-ADCs

The C-ADC and the F-ADC use the flash architecture shown in Fig. 4.3. The reference voltage of each comparator can be adjusted with a 6-bit segmented reference DAC to calibrate offset mismatches. Each comparator slice is built of a simple differential preamplifier followed by a dynamic strong-arm latch stage as shown in Fig. 4.7(a) and Fig. 4.7(b), respectively. The comparator output is then fed to the proposed SR-latch (Fig. 4.7(c)) to generate an NRZ output. The proposed SR-latch consumes no static current and does not suffer from the fight condition, due to back-to-back inverters, of conventional static SR-latches. In other words, no special sizing ratio is required between the input transistors and the back-to-back inverters in order to



Fig. 4.7. Comparator schematic: (a) Preamplifier, (b) strong-ARM latch, and (c) SR latch.

successfully overwrite the SR-latch outputs. In the comparator regeneration phase, as long as the SR-latch input swing is high enough, one branch is turned off and the inverters do not fight the input transistors during the writing phase. In the comparator reset phase, the SR-latch back-to-back inverters are both active and preserve the SR-latch output value.

A duty cycle control (DCC) circuit is used for the comparators, as shown in Fig. 4.8. This circuit enables independent clock pulse-width optimization for the full flash and the multi-step modes. For the full flash mode, wider pulses are used to extend the comparators regeneration phase hence increasing the ADC available sampling rate. For the two-step mode, narrower pulses are used for comparators clocking in order to give more time for the MDAC settling.

#### 4.3.3. MDAC

For multi-step ADCs, the MDAC block performs three operations: (1) reconstructing the digitized signal from the C-ADC output, (2) generating residue signal by subtracting the input from the digitized signal, and (3) amplifying the residue signal. Conventionally, the three operations are combined and performed by a single switched-capacitor op-amp based circuit.



Fig. 4.8. Comparators clock generation with nonoverlap control.

However, the high gain and bandwidth requirements of the closed-loop amplifier limit the high-speed performance [8].

A good approach to increasing the speed of multi-step ADCs is to separate residue generation from the amplification process. This allows the use of high-speed open-loop amplifiers and significantly reduces the MDAC power consumption. The amplifier gain and nonlinearity errors are corrected either at the front-end [54] or at the digital back-end [55]. The signal construction and residue generation are combined by using a capacitive DAC, similar to successive-approximation (SAR) and subranging architectures. Although capacitive DACs offer outstanding power efficiency and systematic linearity, matching requirements lead to capacitor sizes well above noise constraints, as shown in Fig. 4.9. The minimum required capacitor size, both matching limited and noise limited, is plotted versus target accuracy for a 4-bit capacitive DAC. Analysis assumes 99.7% yield, peak to peak differential swing of ½ the supply rails, and less



Fig. 4.9. MDAC unit capacitor size versus target accuracy for both thermal noise and matching limitations.

than 1 dB SNR degradation. The figure shows that matching constraints result in capacitor sizes 9 times larger than the thermal-noise limit.

For this reason, we propose an MDAC architecture that separates the three MDAC functions, as shown in Fig. 4.10. The subtractor is realized with a single capacitor that only performs subtraction. A current DAC is used to reconstruct the C-ADC digital output, as shown in Fig. 4.11. The residue signal is buffered with a source-follower amplifier and then amplified with an open-loop differential amplifier. This solution leverages the linearity of capacitive subtraction without the need to oversize the DAC elements. The current DAC mismatches are corrected by trimming the body voltage of each current cell, as described in the next section.

The source-follower residue buffer design is optimized to satisfy the speed and noise specifications at minimum power consumption. Starting from any arbitrary size for the source-follower input device, the drain current is swept and the transconductance is plotted. Fig. 4.12 shows the transconductance sensitivity to bias current, as defined by (4.3).

$$\text{Sensitivity}_{g_m} = \frac{dg_m/g_m}{dI_D/I_D} \tag{4.3}$$

where  $g_m$  is the transconductance and  $I_D$  is the drain bias current.

For small bias currents, the device is biased in weak inversion and the transconductance shows high sensitivity. This means that increasing the bias current is an efficient way to



Fig. 4.10. Proposed MDAC schematic with capacitive subtractor.



Fig. 4.11. Current-steering DAC schematic.



Fig. 4.12. Simulated transconductance sensitivity (top) and linearity (bottom) versus bias current.

increase the device transconductance, and so the bandwidth of the source-follower circuit. Yet, the efficiency drops as the bias current increases, i.e., a further increase in the bias current may result in only a marginal transconductance increase. For a power-efficient design, the optimum bias is selected as the point at which the sensitivity drops to 50%, i.e., the ideal case for a transistor biased in the saturation region. Fig. 4.12 also shows that the buffer linearity at this point is good for resolutions up to 10 bits.
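As a rough numerical illustration of (4.3), the sketch below evaluates the sensitivity for two idealized device models: a weak-inversion law, where $g_m \propto I_D$, and a square-law saturation model, where $g_m \propto \sqrt{I_D}$. Both models and all parameter values (`n`, `v_t`, `k`) are textbook assumptions chosen for illustration; they are not the simulated device behind Fig. 4.12.

```python
import math


def gm_weak_inversion(i_d, n=1.3, v_t=0.026):
    """Idealized weak-inversion transconductance: gm = I_D / (n * V_T)."""
    return i_d / (n * v_t)


def gm_square_law(i_d, k=5e-3):
    """Idealized square-law (saturation) transconductance: gm = sqrt(2 * k * I_D)."""
    return math.sqrt(2.0 * k * i_d)


def gm_sensitivity(gm_of_id, i_d, rel_step=1e-6):
    """Finite-difference estimate of (dgm/gm) / (dI_D/I_D), as in (4.3)."""
    d_i = i_d * rel_step
    g0 = gm_of_id(i_d)
    g1 = gm_of_id(i_d + d_i)
    return ((g1 - g0) / g0) / (d_i / i_d)


if __name__ == "__main__":
    for name, model in (("weak inversion", gm_weak_inversion),
                        ("square law", gm_square_law)):
        print(f"{name}: sensitivity = {gm_sensitivity(model, 100e-6):.3f}")
```

Under these assumptions the weak-inversion model gives a sensitivity close to 1 and the square-law model gives a sensitivity close to 0.5, which corresponds to the 50% point used as the bias target.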

Next, the buffer load capacitance is designed in order to meet the thermal noise requirement for a given target resolution. Then the optimally biased design is scaled in order to achieve the buffer required bandwidth, based on the required settling accuracy for an N-bit ADC given by (4.4).

$$BW_{\min} = \frac{\ln(2^N)}{\pi} \times f_s \tag{4.4}$$
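A brief derivation sketch of (4.4) may help. It assumes a single-pole buffer response with time constant $\tau = 1/(2\pi \cdot BW)$ and half of the sampling period $T_s = 1/f_s$ available for settling; both assumptions are made here for illustration and are not stated explicitly in the text. Requiring the settling error to fall below one part in $2^N$ gives

$$e^{-T_s/(2\tau)} \le 2^{-N} \;\Rightarrow\; \pi \cdot BW \cdot T_s \ge \ln(2^N) \;\Rightarrow\; BW \ge \frac{\ln(2^N)}{\pi} \times f_s$$

For example, $N = 7$ and $f_s = 1.5$ GHz give $BW_{\min} \approx 2.3$ GHz.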

The open-loop residue amplifier in this architecture should provide: (1) small gain error over PVT variations, (2) different gains for different resolutions (4 for the 5-bit and the 6-bit configurations, and 8 for the 7-bit configuration), and (3) high linearity. This demands a wide range of gain control with very high linearity. The required performance may be achieved by adding a voltage-controlled degeneration resistance [54], which is used for both gain control and linearization. However, this solution reduces the amplifier gain and voltage headroom, thus demanding more current and a higher supply voltage to achieve the required voltage gain and output swing. Alternatively, we propose using a simple differential pair as shown in Fig. 4.13(a). The amplifier is designed with a fixed gain and uses constant-g<sub>m</sub> bias to minimize the sensitivity to process variations. The amplifier nonlinearity error is corrected by means of reference predistortion at the back-end F-ADC (i.e., the F-ADC thresholds are adjusted to map the amplifier nonlinearity). Instead of controlling the amplifier gain, the dynamic range (DR) of the back-end F-ADC is controlled to match the amplifier output swing for different configurations. For example, to double the amplifier effective gain, the DR of the F-ADC is scaled down by a factor of two.

Since the F-ADC architecture enables dynamic threshold adjustment (as explained in the next section), the residue amplifier gain and linearity errors are less critical. This allows for a speed-optimized solution, where the amplifier (Fig. 4.13(b)) is split into two stages with a fan-out of 2 (FO-2), like tapered high-speed digital buffers. Fig. 4.14 shows the required current for both the residue amplifier and the F-ADC versus residue amplifier gain. The figure shows that a conventional design, with a nominal gain of 8 (for the 7-bit configuration), consumes unnecessarily high current. For more power saving, the amplifier nominal gain is set to 4 and the DR of the F-ADC is reduced to half of its ideal value as a trade-off between accuracy and speed.







Fig. 4.13. Amplifier schematic: (a) reference predistortion, (b) tapered amplifier with FO-2.



Fig. 4.14. Residue amplifier optimization: amplifier and F-ADC current versus gain for the 7-bit configuration.

#### 4.4. ADC Calibration

#### 4.4.1. Flash ADC Offset Trimming

In this work, offset cancellation is achieved by modifying the comparator reference voltages, which does not affect the critical signal path [1, 17, 18, 56]. The proposed differential reference adjustment of the comparators combines two methods to achieve a larger tuning range with very fine accuracy, as shown in Fig. 4.15. The first approach is to select the tap point from the resistor ladder with a 3-bit coarse control signal. The selected voltage is connected to a PMOS reference buffer, where the bodies of the PMOS devices are connected to another resistor ladder with 3-bit digital control to perform reference fine-tuning. In this case, the same accuracy is achieved with a 75% saving in the required hardware compared to using a conventional 6-bit calibration DAC. The DR of the back-end F-ADC is controlled with a current DAC (I<sub>DR,control</sub>) to match the residue amplifier output swing for different configurations. Another DAC (I<sub>CM,control</sub>) is used to fix the common-mode of the reference voltages.



Fig. 4.15. The comparator differential-reference adjustment combines coarse and fine-tuning to tolerate a large offset range with very fine accuracy.

Inspired by the self-biased amplifier technique presented in [19], where the input signal is applied to both the buffer input transistor and the current source, the body trimming is applied to both transistors in order to double the effective tuning range. Although it needs to cover only one coarse step, the fine-tuning range covers two coarse steps to tolerate PVT variations. By combining both methods, the reference tuning range of each comparator is more than 25% of the ADC dynamic range with <sup>1</sup>/<sub>4</sub> LSB steps of the 5-bit resolution.
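The quoted figures can be cross-checked with simple arithmetic. The sketch below assumes that the eight fine codes span exactly two coarse steps and that hardware cost scales with the number of selectable levels; both are simplifying assumptions made for this example, and the function name `tuning_summary` is hypothetical.

```python
from fractions import Fraction

COARSE_BITS = 3   # coarse tap selection from the resistor ladder
FINE_BITS = 3     # body-voltage fine trimming
TARGET_BITS = 5   # resolution at which the fine step equals 1/4 LSB


def tuning_summary():
    """Return (coarse step, total tuning span, hardware saving), spans as fractions of the DR."""
    lsb = Fraction(1, 2 ** TARGET_BITS)      # 5-bit LSB as a fraction of the dynamic range
    fine_step = lsb / 4                      # 1/4 LSB fine step (from the text)
    # Assumption: the 2**FINE_BITS fine codes span exactly two coarse steps.
    coarse_step = fine_step * 2 ** FINE_BITS / 2
    coarse_span = coarse_step * (2 ** COARSE_BITS - 1)
    fine_span = fine_step * (2 ** FINE_BITS - 1)
    total_span = coarse_span + fine_span
    # Hardware proxy (assumption): number of selectable levels in each scheme.
    split_levels = 2 ** COARSE_BITS + 2 ** FINE_BITS
    flat_levels = 2 ** (COARSE_BITS + FINE_BITS)
    saving = 1 - Fraction(split_levels, flat_levels)
    return coarse_step, total_span, saving


if __name__ == "__main__":
    coarse_step, total_span, saving = tuning_summary()
    print(f"coarse step = {coarse_step} of DR")
    print(f"total span  = {float(total_span):.1%} of DR")
    print(f"hardware saving vs. flat 6-bit DAC = {float(saving):.0%}")
```

With these assumptions the total span comes out slightly above 25% of the dynamic range and the level count drops from 64 to 16, consistent with the 75% saving quoted earlier in this section.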

Although body trimming in a bulk CMOS process necessitates the use of the "slow" PMOS devices (in order to independently control the body voltage of each device), these PMOS transistors are needed only for the reference buffers, which do not require wide bandwidth. The high-speed core (i.e., the preamplifiers and the dynamic latches) is built using NMOS input transistors. This favors applying body trimming to the reference buffer rather than directly to the preamplifier as in [17, 56]. After calibration is done, the input-referred offset (which includes both static and dynamic offsets) is cancelled to within the calibration accuracy.

#### 4.4.2. DAC Mismatch Calibration

The current DAC cell is shown in Fig. 4.16(a). The cell comprises a PMOS current source, $M_1$, differential switches, $M_{2a,b}$, and a resistive trimming DAC connected to the body of each current cell. To enable low-voltage operation, no cascode device is added to the current source. Instead, the switching pair is biased in saturation when turned on to effectively increase the output impedance of the current output. This arrangement may result in current glitches during switching, but the glitches do not impact the ADC performance because the MDAC outputs are sampled. For more power and area saving, the trimming resistor ladder is shared between all cells.

The trimming DAC corrects current mismatches by modifying the body bias of each cell, hence changing its threshold voltage. This trimming approach leverages the inherently attenuated body-to-threshold voltage sensitivity, where large voltage steps at the transistor body result in fine changes in the corresponding threshold voltage. This attenuation relaxes the matching, noise, and linearity constraints of the trimming DAC. Since the sensitivity to body voltage variations decreases with technology scaling<sup>6</sup>, this body-bias trimming technique can achieve even finer resolution in more advanced processes. This technique provides very fine calibration accuracy without adding extra capacitance to the DAC critical signal path, so very accurate matching is achieved without compromising the DAC speed. The proposed scheme eliminates the need for auxiliary current units to correct for mismatch, which reduces the current source capacitance and improves its linearity.
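To give a feel for the attenuation, the sketch below uses the textbook long-channel body-effect expression $\Delta V_{th} = \gamma(\sqrt{2\phi_F + V_{SB}} - \sqrt{2\phi_F})$. The model itself and the values of `gamma` and `two_phi_f` are assumptions chosen for illustration and are not taken from the 65-nm process used here; signs are given for magnitudes only.

```python
import math


def threshold_shift(v_sb, gamma=0.2, two_phi_f=0.8):
    """Threshold shift (V) for a source-body voltage v_sb, long-channel model.

    gamma (V^0.5) and two_phi_f (V) are hypothetical, illustrative values.
    """
    return gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def body_sensitivity(v_sb, gamma=0.2, two_phi_f=0.8):
    """Small-signal sensitivity dVth/dV_SB = gamma / (2 * sqrt(2phiF + V_SB))."""
    return gamma / (2.0 * math.sqrt(two_phi_f + v_sb))


if __name__ == "__main__":
    step = 0.010  # 10 mV step at the body terminal
    print(f"dVth/dVsb at Vsb = 0: {body_sensitivity(0.0):.3f}")
    print(f"threshold shift for a 10 mV body step: {threshold_shift(step) * 1e3:.2f} mV")
```

With these example values, a body step is attenuated by roughly a factor of nine at the threshold voltage, which is why a relatively coarse trimming DAC can still provide fine current correction.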



Fig. 4.16. Current-steering DAC calibration: (a) body-voltage trimming based calibration scheme, and (b) decoder embedded in the buffer chain.

<sup>6</sup> A 50% reduction in sensitivity is observed when migrating from a 90-nm to a 40-nm CMOS process.

Another important advantage of this method is the inherent process tracking. Since threshold voltage mismatches are the dominant source of current-steering (CS) DAC errors, a robust correction method is to directly correct the transistors' threshold voltages by tuning their bodies. Simulations show body-to-threshold voltage gain variations of only $\pm 10\%$ across process corners.

A digital decoder is embedded in the buffer chain between the C-ADC and MDAC to configure the DAC for different resolutions. The proposed decoder, shown in Fig. 4.16(b), enables independent control for each DAC cell. For the pure flash mode, all DAC cells are disabled. For the two-step 5-bit and 6-bit configurations, the C-ADC is configured to 3-bit. So, only 7 DAC cells are activated. For the 7-bit configuration, 15 DAC cells are activated since the C-ADC is configured to 4-bit. The decoder also enables a calibration mode that overwrites the C-ADC output. In this mode, the input of each DAC cell can be externally set to high or low.
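The configuration rules above can be summarized as a small behavioral model. This is only a sketch: the names `ACTIVE_CELLS` and `decode_dac_inputs` are hypothetical, and it assumes that in calibration mode the set of enabled cells still follows the selected resolution, which the text does not specify.

```python
TOTAL_CELLS = 15

# Number of active DAC cells per ADC resolution (from the text):
# flash modes use none, a 3-bit C-ADC drives 7 cells, a 4-bit C-ADC drives 15.
ACTIVE_CELLS = {3: 0, 4: 0, 5: 7, 6: 7, 7: 15}


def decode_dac_inputs(resolution, cadc_thermometer, cal_pattern=None):
    """Return (enable, value) lists, one entry per DAC cell.

    cadc_thermometer: C-ADC thermometer output, lowest threshold first.
    cal_pattern: optional externally forced cell inputs (calibration mode),
                 which override the C-ADC output.
    """
    n_active = ACTIVE_CELLS[resolution]
    source = cal_pattern if cal_pattern is not None else cadc_thermometer
    padded = list(source) + [0] * (TOTAL_CELLS - len(source))
    enable = [i < n_active for i in range(TOTAL_CELLS)]
    value = [int(padded[i]) if enable[i] else 0 for i in range(TOTAL_CELLS)]
    return enable, value


if __name__ == "__main__":
    enable, value = decode_dac_inputs(6, [1, 1, 1, 0, 0, 0, 0])
    print("enabled cells:", sum(enable), "cells switched high:", sum(value))
```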

#### 4.4.3. Channel Mismatch Calibration

Although channel mismatches such as offset, gain mismatches, and timing skew of the distributed clocks may not affect the performance of individual channels, they severely degrade the interleaved ADC performance [24, 57]. Timing skew does not affect the performance of the proposed architecture since it has a single front-end T/H. For medium-resolution ADCs, gain mismatches do not seriously affect the ADC performance if carefully matched source-follower buffers are used. Yet, offset mismatches remain a limiting factor for the overall ADC performance.

Suppose that M channels are interleaved to build a higher-speed ADC and that the offset of each channel is different. This mismatch causes fixed-pattern noise in the ADC system. For a zero input signal, each channel may produce a different output code, and the period of this error signal is $M/f_s$, where $f_s$ is the sampling frequency of the interleaved ADC. The pattern noise is almost independent of the input signal. In the frequency domain, this effect causes noise peaks at

$$f_{noise} = k \times f_s / M, \quad k = 1, 2, 3, \ldots \tag{4.5}$$
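A minimal numerical sketch of this effect is given below. It builds the periodic offset pattern for a zero input and locates the nonzero DFT bins. The offset values, the record length, and the function names are arbitrary choices for illustration; a plain DFT is used so that the example needs only the standard library.

```python
import cmath


def offset_pattern(channel_offsets, n_samples):
    """Output of an M-way interleaved ADC for a zero input: offsets repeat with period M."""
    m = len(channel_offsets)
    return [channel_offsets[t % m] for t in range(n_samples)]


def dft_magnitude(samples):
    """Single-sided DFT magnitude (bins 0 .. N/2), normalized by N."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


def spur_bins(channel_offsets, n_samples, floor=1e-9):
    """Indices of DFT bins whose magnitude exceeds a small numerical floor."""
    mags = dft_magnitude(offset_pattern(channel_offsets, n_samples))
    return [k for k, a in enumerate(mags) if a > floor]


if __name__ == "__main__":
    offsets = [0.3, -0.1, 0.2, -0.2]   # hypothetical per-channel offsets (LSB), M = 4
    n = 64                             # bin k corresponds to frequency k * f_s / n
    for k in spur_bins(offsets, n):
        print(f"tone at {k}/{n} of f_s")
```

For four channels and a 64-point record, the nonzero bins fall at multiples of 16, i.e., at DC and at multiples of $f_s/4$, in line with (4.5) for $M = 4$.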

In order to alleviate this effect, offset mismatches should be calibrated to a very fine accuracy, usually a fraction of an LSB. Channel offset mismatches can be tolerated by adding redundant channels to the interleaved ADC [58]. Yet, the input capacitance overhead as well as the need for phase control with a very wide dynamic range limit the use of this technique to medium speeds and resolutions. Although comparator offset does not affect the performance of SAR ADCs, body voltage trimming of the comparator has been successfully used in [10], mainly to fix the channel offset mismatch problem. Yet, no analysis has been provided to show the limitations of this technique. In addition, applying calibration at the comparator limits the use of this method to single-comparator ADCs (e.g., SAR and integrating ADCs). Extending the approach to multiple-comparator ADCs (e.g., flash and pipelined ADCs) adds a large overhead and is therefore inefficient. Moreover, the proposed ADC architecture shares the residue amplifier and the back-end F-ADC between the interleaved paths, which makes it impossible to fix channel mismatches at these blocks. This leaves the source-follower residue buffers, placed right after the subtractor (Fig. 4.10), as the only possible choice for channel mismatch calibration.

In order to investigate the efficiency of applying body trimming calibration, the offset is calculated for the optimized design described in the previous section, for different speeds and resolutions. Fig. 4.17 shows that pure body voltage trimming is sufficient to calibrate the channel offset mismatch for the target application (7-bit resolution at a 1.5 GS/s sampling rate). It is worth emphasizing that no oversizing is required to bring the offsets down to the body voltage



Fig. 4.17. Amplifier offset versus ADC sampling rate for different target resolutions.

trimming range, since the amplifier device sizes are optimized independently of the offset requirements, as described in the previous section.

#### 4.5. Experimental Results

This reconfigurable ADC has been fabricated in a six-level metal single-poly 65-nm digital CMOS process and occupies 0.15 mm<sup>2</sup>. The chip photomicrograph is shown in Fig. 4.18. The ADC operates at 4 GS/s for the full flash mode, and at 1.5 GS/s for the two-step mode. The ADC thermometer outputs are converted to a straight binary format using Wallace encoders and are decimated by a factor of 625. In this section, we explain the details of the testing setup and discuss the measurement results.

#### 4.5.1. Testing Setup

Fig. 4.19 shows a generic block diagram of the testing setup used for all prototype chips described in this dissertation. The ADC decimated outputs are read out using a logic analyzer. Two external signal generators are used for the analog input and the high-speed clocks.



Fig. 4.18. Photomicrograph of the ADC chip.



Fig. 4.19. Testing setup



Fig. 4.20. Measured calibrated ADC dynamic performance for 3-bit and 4-bit configurations: (a) SNDR and ENOB versus  $f_{in}$  for  $f_s = 4$ GHz, (b) SFDR versus  $f_{in}$  for  $f_s = 4$ GHz.

Wideband baluns are used to generate the differential signals. Different control bits are written into an on-chip scan chain using PC-based MATLAB code. The PC USB port is connected to a National Instruments data acquisition (NI-DAQ) interface. The interface between the noisy NI-DAQ and logic analyzer on one side and the chip on the other is buffered through an optocoupler I/O board.

#### 4.5.2. Measurement Results of the Flash ADC Configuration

In the 3-bit and 4-bit modes, at 4 GS/s, with a 1.2 V supply, the ADC uses the full flash architecture and consumes 12 mW and 20mW respectively. Fig. 4.20 shows the SNDR, ENOB and SFDR of the ADC output versus input signal frequency. For both resolutions, the ADC achieves flat SNDR over a bandwidth well beyond the Nyquist frequency. Fig. 4.21 shows the measured DNL and INL for the 4-bit mode before and after calibration. Calibration reduces the DNL range from  $\pm 1.01$ LSB to  $\pm 0.18$ LSB, and the INL range from  $\pm 1.06$ LSB to  $\pm 0.11$ LSB.



Fig. 4.21. Measured ADC static linearity for 4-bit configuration (before and after calibration): (a) DNL, and (b) INL.

#### 4.5.3. Measurement Results of the Two-Step ADC Configuration

In the 5-bit, 6-bit, and 7-bit modes, at 1.5 GS/s, with a 1.2 V supply, the ADC uses the two-step architecture and consumes 29 mW, 35 mW, and 41 mW, respectively. Fig. 4.22(a) shows that offset mismatch results in a measured DC offset as well as a spur of -33 dB<sub>FS</sub> at $f_{s}/2$, as predicted by (4.5). This spur limits the maximum achievable accuracy of the ADC to 5.2 bits. Applying the body voltage trimming to the residue buffer brings the spur down by 20 dB, as shown in Fig. 4.22(b). This is enough to push the channel-mismatch spur well below other spurs. The SNDR versus input signal frequency is shown in Fig. 4.23(a). Fig. 4.24 shows the measured DNL and INL for the 7-bit mode, before and after calibration. Calibration reduces the DNL range from ±11.5 LSB to ±0.34 LSB, and the INL range from ±13.8 LSB to ±0.66 LSB.
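The 5.2-bit limit follows from the standard relation between SNDR and effective number of bits, ENOB = (SNDR - 1.76 dB) / 6.02 dB. The short sketch below applies it under the simplifying assumption that the channel-mismatch spur alone sets the SNDR for a full-scale input; the function name is hypothetical.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits for a given SNDR in dB (ideal-quantizer relation)."""
    return (sndr_db - 1.76) / 6.02


if __name__ == "__main__":
    spur_before_dbfs = -33.0                    # measured spur before calibration
    spur_after_dbfs = spur_before_dbfs - 20.0   # spur after a 20 dB reduction
    print(f"limit before calibration: {enob_from_sndr(-spur_before_dbfs):.2f} bits")
    print(f"limit after calibration:  {enob_from_sndr(-spur_after_dbfs):.2f} bits")
```

Under this assumption, the limit set by the spur moves from about 5.2 bits to above 8 bits, so the spur no longer bounds a 7-bit configuration.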



Fig. 4.22. Measured spectrum of the decimated output ($f_{in} = 500$ kHz and $f_s = 1.5$ GHz) for a 7-bit two-step partially-interleaved ADC with channel mismatches: (a) before channel-mismatch calibration, and (b) after channel-mismatch calibration.



Fig. 4.23. (a) Calibrated SNDR of the two-step configuration (5-to-7 bits) versus  $f_{in}$  for  $f_s = 1.5$ GHz. (b) ENOB (before and after calibration) versus ADC resolution.



Fig. 4.24. Measured DNL and INL for the 7-bit configuration (before and after calibration).



Fig. 4.25. Measured maximum (a) DNL and (b) INL before and after calibration for different configurations.

#### 4.5.4. Reconfigurable ADC Performance

The ADC effective number of bits (ENOB), before and after calibration, versus ADC resolution is shown in Fig. 4.23(b). The measured ranges for both DNL and INL versus ADC resolution, before and after calibration, are shown in Fig. 4.25(a) and (b), respectively. Fig. 4.26(a) compares the ADC FoM to previously published work. The discontinuity at 5-bit implies that the architecture switch should have been shifted to the 6-bit resolution. However, among reconfigurable architectures, the FoM scaling shows the best tracking of the performance trend of optimized ADCs. Fig. 4.26(b) shows that, despite being single-channel, this work reports the fastest reconfigurable ADC that covers the broadest resolution range. The measured performance summary, presented in Table 4.1, shows the flexibility (best $R_F$) and the area efficiency of this ADC.



Fig. 4.26. Reconfigurable ADCs performance comparison: (a) FoM versus ENOB, (b) effective sampling frequency versus ENOB.

| Technology                         | 65-nm CMOS                      |       |       |       |       |
|------------------------------------|---------------------------------|-------|-------|-------|-------|
| Supply Voltage (V)                 | 1.2                             |       |       |       |       |
| Active Die Area (mm <sup>2</sup> ) | 0.15 (0.55 in [3], 0.38 in [4]) |       |       |       |       |
| <b>Reconfigurability Factor</b>    | 3.65 (0.9 in [3], 2.69 in [4])  |       |       |       |       |
| <b>Resolution (Bits)</b>           | 3                               | 4     | 5     | 6     | 7     |
| Sampling Rate (GS/s)               | 4                               | 4     | 1.5   | 1.5   | 1.5   |
| ERBW (GHz)                         | >2                              | >2    | >0.75 | 0.6   | 0.6   |
| Total Power (mW)                   | 12                              | 20    | 29    | 35    | 41    |
| Figure of Merit (fJ/conv)*         | 382                             | 378   | 877   | 712   | 461   |
| DNL (LSB)                          | ±0.09                           | ±0.18 | ±0.45 | ±0.27 | ±0.34 |
| INL (LSB)                          | ±0.06                           | ±0.11 | ±0.53 | ±0.31 | ±0.66 |
| ENOB (Bits)                        | 2.99                            | 3.71  | 4.46  | 5.34  | 6.27  |

Table 4.1. ADC Performance Summary

\* $\text{FoM} = \text{Power} / [2^{\text{ENOB}} \times \min(f_s, 2 \times \text{ERBW})]$
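The FoM definition in the footnote can be evaluated directly from the table entries. The sketch below does so for the 5-bit configuration, taking the ERBW as exactly 0.75 GHz (the table only gives a lower bound, so this is an assumption). Values recomputed from the rounded table entries may differ slightly from the tabulated FoM.

```python
def figure_of_merit(power_w, enob, f_s_hz, erbw_hz):
    """FoM = Power / (2**ENOB * min(f_s, 2 * ERBW)), in joules per conversion step."""
    return power_w / (2.0 ** enob * min(f_s_hz, 2.0 * erbw_hz))


if __name__ == "__main__":
    # 5-bit configuration from Table 4.1; ERBW assumed to be exactly 0.75 GHz.
    fom = figure_of_merit(power_w=29e-3, enob=4.46, f_s_hz=1.5e9, erbw_hz=0.75e9)
    print(f"5-bit FoM = {fom * 1e15:.0f} fJ/conv")
```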

#### 4.6. Conclusion

A high-speed architecture-reconfigurable ADC has been fabricated in 65-nm CMOS. The proposed dynamic reconfiguration technique allows a wide range of performance to be covered by splitting the converter into smaller building blocks. A partially interleaved architecture, a current-steering DAC, and an open-loop amplifier are proposed to relax the residue amplifier settling at minimum area and power overhead. ADC component mismatches are corrected using body-voltage-trimming-based digital calibration to decouple analog performance from component matching. Reference predistortion for the back-end F-ADC is employed to correct for both the gain and the nonlinearity errors of the residue amplifier. The ADC covers a resolution range from 3-bit to 7-bit at sampling rates from 4 GS/s to 1.5 GS/s.

# **CHAPTER 5**

# A Digitally-Calibrated 3b-to-5b 10GS/s-to-2.5GS/s Reconfigurable Flash ADC in 65nm CMOS

The design of a high-speed reconfigurable analog-to-digital converter in 65-nm CMOS is described. Accuracy requirements are met without compromising the high-speed performance by using trimming-based offset cancellation. The ADC can be configured to work as a 3-bit, a 4-bit, or a 5-bit ADC with maximum integral nonlinearity (INL) and differential nonlinearity (DNL) of 0.48LSB and 0.35LSB respectively. The ADC achieves a figure-of-merit of 0.46pJ/conv-step and the active area is 0.13 mm<sup>2</sup>.

#### 5.1. Introduction

Many recent wireless and wireline high-speed digital communication systems target satisfying multiple standards [18, 25, 59-61]. Applications include read channels of magnetic and optical data storage systems, PCIe links, FPGA I/Os, and multi-standard radios. An area and cost efficient means of signaling across multiple standards is the use of a single reconfigurable transceiver front end. This work focusses on one of the key building blocks of such a transceiver, the analog-to-digital converter (ADC). The majority of the prior efforts on reconfigurable ADCs have targeted low-to-medium sampling rate applications [25, 59]. Reconfiguration is particularly difficult for high-speed applications [60, 61] because the ADCs are tuned for a specific frequency and a change in performance is either intractable or the performance suffers with different sampling speeds or resolution.

The application of the high-speed ADC presented in this work is the front end of wireline transceivers with digital equalization [18, 62-64]. These ADCs require high sampling rates in the range of 2 to 10 GS/s, and need resolution in the range of 3-5 bits. The number of conversion levels depends on the link characteristics; this number directly impacts the power and area of the ADC. Analysis in [18] shows that the required ADC threshold number can be reduced by 50% for similar performance if "dynamically" optimized depending on channel characteristics.

This chapter discusses the design of a reconfigurable 3-5 bit ADC that covers a wide range of specifications with near-to-optimum power and area. Correctly partitioning the ADC into smaller blocks, in conjunction with proper choice of architecture, can maximize the block usage. In other words, we can minimize the amount of redundant elements for each application in order to make the design nearly as efficient as dedicated designs concerning both area and power.

The proposed ADC is capable of real-time reconfiguration in order to achieve near optimal cost-performance operation. We also propose a new offset cancellation scheme, based on mixing resistive reference DACs with programmable body-controlled voltage techniques. This scheme accurately adjusts the reference voltages of the comparators, imposes no performance penalty on the high speed signal path, and adds minimal circuit overhead. This chapter is organized as follows. Section 5.2 introduces the reconfigurable ADC architecture. Section 5.3 describes the proposed calibration circuits and algorithms. Section 5.4 presents measurement results, followed by conclusion in Section 5.5.

## 5.2. Reconfigurable ADC Architecture

The building block for the reconfigurable architecture is selected to be a 3-bit flash ADC (i.e., the minimum resolution). To build a higher resolution converter, the outputs from multiple blocks are combined. For example, to build a 4-bit ADC, two 3-bit blocks are merged. The threshold levels of one ADC are shifted <sup>1</sup>/<sub>4</sub> LSB up, and the thresholds of the other ADC are shifted <sup>1</sup>/<sub>4</sub> LSB down. Then a 3-bit adder is added at the back-end to generate the 4-bit binary output. Similarly, two 4-bit ADCs are merged to build a 5-bit one. In addition, speed can be traded for resolution. Assuming four 3-bit ADC blocks on a chip, the ADC can be configured to work as: (1) a 1-to-4X interleaved 3-bit ADC at a sampling rate up to 10 GS/s, (2) a 1-to-2X interleaved 4-bit ADC at a sampling rate up to 5 GS/s, or (3) a single-channel 5-bit ADC at a sampling rate of 2.5 GS/s.
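The merging step can be illustrated with an idealized behavioral model. In the sketch below, the input and the thresholds are expressed in units of the 3-bit LSB over a 0-to-8 range, the shift of ¼ LSB is taken relative to the 3-bit LSB, and the comparators are ideal; these are assumptions for the example, and the function names are hypothetical.

```python
def flash_code(vin, thresholds):
    """Ideal flash ADC: output is the number of thresholds below the input."""
    return sum(vin > t for t in thresholds)


def thresholds_3bit(shift_lsb):
    """Seven thresholds of a 3-bit flash block, shifted by shift_lsb (in 3-bit LSBs)."""
    return [k + shift_lsb for k in range(1, 8)]


def merged_4bit_code(vin):
    """Sum of two 3-bit blocks with thresholds shifted by +1/4 and -1/4 LSB."""
    up = flash_code(vin, thresholds_3bit(+0.25))
    down = flash_code(vin, thresholds_3bit(-0.25))
    return up + down


if __name__ == "__main__":
    for vin in (0.5, 0.8, 1.3, 4.0, 7.5):
        print(f"vin = {vin:.1f} LSB -> merged code {merged_4bit_code(vin)}")
```

The two threshold sets interleave into 14 levels spaced by half of the 3-bit LSB, so the summed output steps uniformly with the finer LSB of the merged converter.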

The advantage of this approach is modularity and scalability. For a target resolution, only the needed blocks are turned on, the unselected blocks are turned off, and comparator clocks are deactivated by the means of clock gating. This proposed technique is similar to parallelism in digital design, except that it interleaves in both voltage and time. The main challenge for this approach is that the 3-bit ADC unit block differs from conventional 3-bit ones in three main aspects: first, the track-and-hold (T/H) should satisfy both the accuracy of the maximum resolution (5-bit accurate), and the bandwidth of the maximum speed (four times faster than a single channel). Second, the ADC accuracy needs to be satisfactory for the highest resolution of all possible configurations. Finally, the thresholds of the ADC need to be dynamically adjusted to adapt to any possible configuration.

We take advantage of the fact that the T/H does not need to meet both the bandwidth and accuracy requirements simultaneously. For example, for a four-way interleaved 3-bit ADC the T/H needs to settle only to within 3-bit accuracy. For the 5-bit configuration, however, the input signal is four times slower and signal tracking becomes much easier. Fig. 5.1 shows that the most stringent bandwidth requirement is for the 4-way interleaved, 3-bit configuration.
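A rough comparison of the three configurations is sketched below, using a single-pole settling model in which the required bandwidth scales as $\ln(2^N) \times f_s / \pi$. Applying this model to the T/H is an assumption made for illustration; the curves in Fig. 5.1 come from the author's own analysis and may use a different criterion.

```python
import math

# (resolution in bits, aggregate sampling rate in Hz) for each configuration
CONFIGS = {
    "3-bit, 4x interleaved": (3, 10e9),
    "4-bit, 2x interleaved": (4, 5e9),
    "5-bit, single channel": (5, 2.5e9),
}


def min_bandwidth(n_bits, f_s_hz):
    """Minimum single-pole bandwidth for N-bit settling within half a sampling period."""
    return math.log(2 ** n_bits) / math.pi * f_s_hz


def most_stringent(configs):
    """Name of the configuration with the highest bandwidth requirement."""
    return max(configs, key=lambda name: min_bandwidth(*configs[name]))


if __name__ == "__main__":
    for name, (bits, rate) in CONFIGS.items():
        print(f"{name}: BW_min = {min_bandwidth(bits, rate) / 1e9:.2f} GHz")
    print("most stringent:", most_stringent(CONFIGS))
```

Under this model the fourfold increase in sampling rate outweighs the relaxed settling accuracy, so the 3-bit, four-way interleaved configuration sets the T/H bandwidth.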

Generally, the ADC accuracy is limited by two parameters: noise and comparators offsets. Since the thermal noise is usually not critical for low-resolution flash ADCs, comparators offsets are the limiting factor for this reconfigurable architecture. If there is no explicit offset



Fig. 5.1. Bandwidth requirements of the T/H for different configurations.



Fig. 5.2. Proposed system architecture.

cancellation and only device sizing is adopted, the 3-bit ADC needs to be oversized by a factor of 16 in order to meet the 5-bit accuracy requirement. For better power efficiency, a novel calibration method is introduced in the next section that enhances the ADC accuracy and dynamically adapts the ADC thresholds to configure the ADC to the target resolution.
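The factor of 16 is consistent with the usual area scaling of random offset, where the offset standard deviation is inversely proportional to the square root of the device area (a Pelgrom-type model, assumed here for illustration). Going from 3-bit to 5-bit accuracy shrinks the LSB by 4, so the area must grow by 4 squared. The function name below is hypothetical.

```python
def area_scale_for_accuracy(base_bits, target_bits):
    """Area multiplier needed when offset sigma scales as 1/sqrt(area).

    The allowed offset shrinks with the LSB, i.e. by 2**(target_bits - base_bits),
    and the area must grow by the square of that ratio.
    """
    sigma_ratio = 2 ** (target_bits - base_bits)
    return sigma_ratio ** 2


if __name__ == "__main__":
    print("3-bit block sized for 5-bit accuracy: area x",
          area_scale_for_accuracy(3, 5))
```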

The proposed architecture is shown in Fig. 5.2. The energy is optimized for the first comparator stage since it consumes most of the analog power. The energy versus regeneration time constant for the first comparator stage is shown in Fig. 5.3 for a 65-nm CMOS technology targeting 10 GS/s. The simulation results show that 5-way interleaving can achieve minimal power for the given speed and technology. However, a four-way interleaved architecture is selected due to its simplicity in both layout and timing control. Each channel is built using a 3-bit ADC block. Speed is traded for accuracy to provide 3, 4, and 5 bits of resolution at 10, 5, and 2.5 GS/s, respectively. Digital calibration is also used to compensate for channel mismatches.

The ADC channel in each interleaving path is shown in Fig. 5.4. It contains eight comparator slices with accompanying offset-canceling buffers, a resistor ladder, a CMOS thermometer-to-binary encoder with bubble correction, and clock buffers. Each multi-stage comparator slice is built of two dynamic stages followed by an SR-latch. The clock buffer provides the full-rate clock for the track-and-hold and the first-stage comparators, and the half-rate clocks for the second stage and the following encoder.

To allow for independent energy optimization of the digital back-end, a second level of interleaving is done at the first comparator stage output. With this two-level interleaving, analog energy optimization is decoupled from digital energy optimization. A high level of interleaving for the digital back-end is possible because the requirements for timing and component mismatches are more relaxed compared to the analog front-end.



Fig. 5.3. Comparator energy versus regeneration time-constant.



Fig. 5.4. Three-bit flash-ADC Architecture.

### 5.3. Circuit Implementation

#### 5.3.1. Track-and-Hold Amplifier

Open-loop track-and-hold amplifiers (THA) are widely used at the front-end of high-speed medium-resolution analog-to-digital converters (ADCs). Modern low-power multi-step ADC architectures extend the use of these amplifiers for inter-stage isolation [65]. Source-follower buffers have been widely used because they feature superior bandwidth, linearity, and robustness to PVT variations as compared to other topologies. Yet, the basic source follower suffers from channel-length modulation of the driver device limiting its output swing and hence leads to an attenuated full-scale voltage. To achieve the intended accuracy, an ADC would then dissipate more power.

The use of cascode bias currents [66], long channel devices, or dynamic biasing with feedback has been shown to improve the voltage gain at the cost of needing larger voltage headroom and penalizing the bandwidth. These techniques often also require a special device or layout such as isolated well, which may not be available for nanometer-scale CMOS technology. Although self-biased source follower offers a good combination of high voltage gain and a wide bandwidth, at low supply voltages and power consumption [19], two main drawbacks with the architecture are (1) the pseudo-differential pair limits the linearity of the buffer and (2) the absence of a well-defined current source makes the bias current of this buffer, consequently the bandwidth, strongly dependent on the input common-mode and PVT variations. Addressing the second drawback involves designing to the worst case input conditions and adding a large amount of overhead.

This section presents a gain-enhanced low-voltage source-follower amplifier by modulating the degeneration resistance of the follower. The circuit architecture has no power penalty in comparison to conventional followers, shows superior stability over common-mode voltage and PVT variation and can be used with standard CMOS technology.

The proposed amplifier, shown in Fig. 5.5, consists of a pseudo-differential source follower with a resistively-degenerated current source. The degeneration resistance is an MOS transistor  $(M_2)$  biased in deep-triode region. The source degeneration slightly increases the output impedance of the current source with a negligible amount of overhead (few tens of millivolts), and so the amplifier gain increases slightly.

To further enhance the voltage gain, the degeneration resistance is modulated by the complementary input signal. In this case, more (or less) current is injected when the input signal increases (or decreases). This counteracts the effects of the devices finite output resistances and



Fig. 5.5. Proposed source-follower amplifier with modulated degeneration resistance and CMFF biasing.

transistors' body effect, thus enhancing the amplifier voltage gain. To provide a fixed bias current across PVT variations and over a wide range of input common-mode voltages, the gate of the cascode transistor ($M_3$) is biased through a current mirror with a similar degeneration resistance. A common-mode feedforward (CMFF) path is applied to the degeneration resistance of the current mirror in order to track the input common-mode variation. To allow a fair comparison with a typical source follower, transistors $M_5$ are added to act as bypass switches for the degeneration resistance. When these switches are enabled, the circuit behaves like a conventional source-follower amplifier. The voltage gain ($A_V$) of this circuit can be derived as

$$A_{V} = \frac{g_{m1}(g_{m3} + g_{o3} + g_{o2}) + g_{m2}(g_{o3} + g_{o2})}{(g_{m1} + g_{o1})(g_{m3} + g_{o3} + g_{o2}) + g_{o3} \cdot g_{o2}} \tag{5.1}$$

Since transistor $M_2$ is in the deep-triode region, its intrinsic output conductance $g_{o2}$ is comparable to the transistors' transconductances ($g_{m1-3}$), and the second term of both the numerator and the denominator cannot be neglected. Unlike conventional source-follower amplifiers, where increasing the bias current reduces the voltage gain, the gain of the proposed architecture increases with the bias current. This is because $g_{m2}$ (in the triode region) is directly proportional to the bias current (instead of increasing sublinearly like $g_{m1}$). As a result, the second term of the denominator increases faster than the first (the term that exists in the conventional source-follower gain expression). Also, $g_{o2}$ is relatively insensitive to the bias current, which reduces the sensitivity of the denominator to current variations compared to conventional source followers.
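As a rough numerical illustration of (5.1), the sketch below evaluates the gain expression with and without the modulation term $g_{m2}$. All conductance values are hypothetical placeholders chosen only to have plausible relative magnitudes; they are not taken from the fabricated design, and the function name `follower_gain` is an assumption of this sketch.

```python
def follower_gain(gm1, gm2, gm3, go1, go2, go3):
    """Voltage gain of the modulated-degeneration follower, per Eq. (5.1)."""
    num = gm1 * (gm3 + go3 + go2) + gm2 * (go3 + go2)
    den = (gm1 + go1) * (gm3 + go3 + go2) + go3 * go2
    return num / den


# Hypothetical small-signal values in siemens (illustrative only).
params = dict(gm1=10e-3, gm3=10e-3, go1=0.5e-3, go2=10e-3, go3=0.5e-3)

gain_modulated = follower_gain(gm2=1e-3, **params)   # complementary input drives M2
gain_unmodulated = follower_gain(gm2=0.0, **params)  # M2 acts as a fixed resistance

if __name__ == "__main__":
    print(f"A_V with modulation:    {gain_modulated:.4f}")
    print(f"A_V without modulation: {gain_unmodulated:.4f}")
```

With $g_{m2} = 0$ the second numerator term vanishes, so in this model any positive $g_{m2}$ raises the gain toward unity.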

This structure outperforms self-biased followers [19] when comparing the stability of the output bandwidth over supply and input common-mode variations. This stability is a direct result



Fig. 5.6. Simulated bandwidth versus (a) supply voltage and (b) input common-mode voltage.

of fixing the bias current by using a current mirror with CMFF. Fig. 5.6 shows the simulated bandwidth of both architectures versus supply and input common-mode voltages. The bandwidth variation is 6 and 47 times smaller over a $\pm 10\%$ change in V<sub>DD</sub> and in the input common-mode voltage, respectively.

#### 5.3.2. Comparator Design

Each comparator slice is built of a simple differential preamplifier followed by a dynamic strong-arm latch stage, as shown in Fig. 5.7(a) and Fig. 5.7(b), respectively. The preamplifier uses a voltage-controlled resistance in order to enable an adaptive bandwidth-power tradeoff. To configure the ADC for higher speeds, more bias current (i.e., more power consumption) is injected and the feedback circuit reduces the load resistance (i.e., wider bandwidth) in order to sustain the same common-mode output. For lower speeds, less bias current (i.e., less power consumption) is injected and the feedback circuit automatically adjusts the load resistance to a higher value (i.e., narrower bandwidth).
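The bandwidth-power tradeoff described above can be sketched with a first-order model. The sketch assumes a single-pole output node and a feedback loop that holds the voltage drop across the load resistance constant; the function name, the 0.4 V drop, and the 20 fF load capacitance are hypothetical values, not parameters of the fabricated preamplifier.

```python
import math


def preamp_operating_point(i_bias, v_drop=0.4, c_load=20e-15):
    """Return (load resistance, -3 dB bandwidth) of a single-pole preamplifier.

    Assumes the feedback loop keeps the drop across the load at v_drop, so the
    load resistance scales inversely with the bias current.
    """
    r_load = v_drop / i_bias
    bandwidth = 1.0 / (2.0 * math.pi * r_load * c_load)
    return r_load, bandwidth


if __name__ == "__main__":
    for i_bias in (100e-6, 200e-6, 400e-6):  # hypothetical branch currents
        r, bw = preamp_operating_point(i_bias)
        print(f"I = {i_bias * 1e6:.0f} uA  R = {r:.0f} ohm  BW = {bw / 1e9:.2f} GHz")
```

Under these assumptions the bandwidth scales linearly with the bias current, which is the behavior the reconfiguration relies on.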

The comparator output is then fed to the proposed SR-latch (Fig. 5.7(c)) to generate an NRZ output. The proposed SR-latch consumes no static current and does not suffer from the fight condition, due to back-to-back inverters, of conventional static SR-latches. In other words, no special sizing ratio is required between the input transistors and the back-to-back inverters in order to successfully overwrite the SR-latch outputs. In the comparator regeneration phase, as






Fig. 5.7. Comparator schematic: (a) preamplifier, (b) strong-arm latch, and (c) SR latch.

long as the SR-latch input swing is high enough, one branch is turned off and the inverters do not fight the input transistors during the writing phase. In the comparator reset phase, the SR-latch back-to-back inverters are both active and preserve the SR-latch output value.

## 5.4. Comparator Offset Calibration

Despite the attractive features enabled by the reconfigurable architecture, component mismatches put the designer between a rock and a hard place. If small devices are used to minimize power consumption, comparator and channel-to-channel mismatches become a showstopper (especially when the ADC is configured for high resolutions); if devices are oversized to meet the matching specifications of the highest resolution, the ADC becomes quite inefficient when configured for lower resolutions. This necessitates a calibration scheme that tolerates large offsets yet achieves very fine accuracy. In addition, the area constraints call for a calibration scheme with minimal hardware requirements.

#### 5.4.1. Comparators Offset Trimming

In this work, offset cancellation is achieved by modifying the comparator reference voltages, which does not affect the critical signal path [17]. This way, calibration does not degrade high-speed performance. To achieve high calibration accuracy, however, a very complex switching network as well as a very accurate resistor ladder are required. For example, to cover comparator offsets in the range of ±8 LSBs with 1/4-LSB accuracy, a 5-bit DAC is required for each comparator. For a 5-bit ADC, this translates to a total transistor count of 1024 just for the calibration switches (ignoring the digital-control circuits and the fact that a 7-bit resistor ladder is required in this case). To tackle this problem, the process is split into two stages, coarse and



Fig. 5.8. Offset cancelling reference buffer and calibration circuit.

fine calibration. The coarse calibration is responsible for covering a wide range of offsets, and the fine calibration provides very accurate steps. For the previous example, the 5-bit calibration DAC can be split into a 3-bit coarse and a 2-bit fine stage. Thus, the same accuracy is achieved with a 66% saving in the required hardware.
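The hardware argument above can be checked with a simple count. The sketch below assumes one selection switch per ladder tap and $2^5$ calibrated reference levels for the 5-bit ADC; both are assumptions of this sketch (the function names are hypothetical), and the decoders and ladders themselves are not counted.

```python
def tap_switches(bits):
    """Number of tap-selection switches for a binary-addressed resistor ladder."""
    return 2 ** bits


def calibration_switch_count(adc_bits, cal_bits_per_stage):
    """Total tap switches, assuming 2**adc_bits calibrated reference levels."""
    per_comparator = sum(tap_switches(b) for b in cal_bits_per_stage)
    return (2 ** adc_bits) * per_comparator


single_stage = calibration_switch_count(5, [5])  # one 5-bit DAC per comparator
two_stage = calibration_switch_count(5, [3, 2])  # 3-bit coarse + 2-bit fine
saving = 1.0 - two_stage / single_stage

if __name__ == "__main__":
    print(single_stage, two_stage, f"{saving:.1%}")
```

This count reproduces the 1024-switch figure for the single-stage case and gives a saving of the same order as the one quoted in the text; the exact percentage depends on what is counted as calibration hardware.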

The proposed reference adjustment for each ADC comparator combines two methods to achieve a larger tuning range with very fine accuracy, as shown in Fig. 5.8. The first method selects the tap point from the resistor ladder using a 3-bit coarse control signal. The nominal size of each coarse step is 20 mV. The selected voltage is connected to a PMOS reference buffer, where the bodies of the PMOS devices are connected to another resistor ladder with 3-bit digital control to perform reference fine-tuning [10, 11]. The MOS threshold voltage changes with body bias, and the size of this change varies with PVT. The fine-tuning range therefore covers 2 coarse steps to tolerate PVT variations. By combining both methods, the reference tuning range of each comparator is more than 25% of the ADC dynamic range with <sup>1</sup>/<sub>4</sub> LSB steps of the 5-bit resolution. This is sufficient to provide the offset cancellation of a 5-bit ADC, and also enables the variable reference tuning for different configurations.
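The coarse/fine reference adjustment can be sketched as a simple code-to-voltage map. The 20 mV coarse step, the 3-bit controls, and the two-coarse-step fine range come from the text; treating the fine steps as equal and linear is an idealization of this sketch (the actual body-effect steps are nonlinear and vary with PVT), and the function names are hypothetical.

```python
COARSE_BITS = 3
FINE_BITS = 3
COARSE_STEP_MV = 20.0        # nominal coarse step quoted in the text
FINE_RANGE_COARSE_STEPS = 2  # the fine range spans two coarse steps

# Idealized fine step: the fine range divided equally among the fine codes.
FINE_STEP_MV = FINE_RANGE_COARSE_STEPS * COARSE_STEP_MV / 2 ** FINE_BITS


def reference_shift_mv(coarse_code, fine_code):
    """Nominal reference shift (mV) for a coarse/fine code pair."""
    if not 0 <= coarse_code < 2 ** COARSE_BITS:
        raise ValueError("coarse_code out of range")
    if not 0 <= fine_code < 2 ** FINE_BITS:
        raise ValueError("fine_code out of range")
    return coarse_code * COARSE_STEP_MV + fine_code * FINE_STEP_MV


def codes_for_offset(target_mv):
    """Brute-force search for the code pair whose shift is closest to target_mv."""
    candidates = [(c, f) for c in range(2 ** COARSE_BITS) for f in range(2 ** FINE_BITS)]
    return min(candidates, key=lambda cf: abs(reference_shift_mv(*cf) - target_mv))


if __name__ == "__main__":
    coarse, fine = codes_for_offset(63.0)
    print(coarse, fine, reference_shift_mv(coarse, fine))
```

Because the fine range overlaps two coarse steps, several code pairs map to the same nominal shift; this redundancy is what lets the fine stage absorb PVT-induced changes in its own step size.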

This architecture leverages the high accuracy provided by body trimming without suffering from its limited tuning range. This becomes increasingly important as technology advances, because the threshold sensitivity to body voltage decreases and the threshold mismatches increase when using smaller devices. For example, the required tuning range increases by a factor of 2 when migrating from a 90nm to a 40nm CMOS process, assuming the same transistor area. The factor increases to 4.3 if minimum-feature-size transistors are used. This trend poses design tradeoffs between device area/power and the tuning range of the body voltage for offset cancellation. With the proposed technique, body trimming needs to cover only two coarse steps from the resistor ladder (including extra margin to account for process variations).

Moreover, PMOS transistors are needed (for body trimming) only in the reference buffers, which do not require wide bandwidth. The high-speed core, comprising the preamplifiers and dynamic latches, is built using NMOS input transistors to enable high-speed operation. This favors applying body trimming to the reference buffer rather than directly to the preamplifier as in [10, 11]. After calibration finishes, the input-referred offset (which includes both static and dynamic offsets) is cancelled to within the calibration accuracy.

## 5.5. Measurement Results

The prototype ADC, fabricated in a 65-nm digital CMOS (Fig. 5.9), has an active area of 0.13 mm<sup>2</sup>. The analog blocks operate from a 1.2-V supply, and the digital blocks are powered from a



Figure 5.9. Die photograph of the prototype chip.

1-V supply. An on-chip memory collects the full-rate data from each channel and passes it at low speed to an external computer. Figure 5.10 shows the measured worst-case nonlinearity, $INL_{max}$ and $DNL_{max}$, for different ADC configurations, both before and after calibration. Each ADC block operates at 2.5 GS/s to achieve a total of 10GS/s in a 4-way interleaved architecture. The total power of the ADC is 30.6mW (24mW analog + 6.6mW digital), which corresponds to a figure-of-merit (FoM) of 0.46pJ/conv-step at 10GS/s.

The ADC has been successfully integrated into a baud-rate ADC-based serial link receiver [6] and demonstrated 10Gb/s performance in a 23dB loss channel. Measurements prove that the receiver achieves 10<sup>-8</sup> bit-error-rate (BER) at 30mV voltage margin, with 50% power saving compared to receivers using conventional ADCs. This feature is enabled with minimum overhead by using the proposed calibration scheme. Also, simulation results shown in Fig. 5.11 confirm that the ADC achieves satisfactory SNDR for all configurations. Figure 5.12 compares this work with previously reported ADCs over the last 15 years as well as prior reconfigurable

| Parameter                               | Unit            | Value                       |      |      |
|-----------------------------------------|-----------------|-----------------------------|------|------|
| Technology                              | nm              | 65                          |      |      |
| Supply Voltage                          | V               | 1.2 (analog), 1 (digital)   |      |      |
| Active Area                             | mm<sup>2</sup>  | 0.13                        |      |      |
| Resolution                              | bits            | 3                           | 4    | 5    |
| ENOB @ Nyquist *                        | bits            | 2.74                        | 3.54 | 4.62 |
| FoM **                                  | pJ/conv-step    | 0.46                        | 0.53 | 0.5  |
| Max. Conversion Rate                    | GS/s            | 10                          | 5    | 2.5  |
| Total Power                             | mW              | 30.6                        |      |      |
| INL<sub>max</sub> / DNL<sub>max</sub>   | LSB             | 0.48 / 0.35                 |      |      |

Table 5.1. ADC Performance Summary

\* Simulated at Nyquist for each configuration

\*\*  $FoM = Power / [2^{ENOB} \times min(f_s, 2 \times ERBW)]$ 

ADCs [12]. The comparison shows performance that is competitive with state-of-the-art "dedicated" designs and superior to prior "reconfigurable" architectures. Finally, Table 5.1 summarizes the measured ADC performance.

The ADC achieves an FoM of 0.46pJ/conv-step at 10GS/s and occupies an active area of 0.13 mm<sup>2</sup>. To the best of the author's knowledge, this is the best FoM reported among reconfigurable ADC architectures.
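The FoM entries of Table 5.1 can be reproduced from the other rows using the definition in the table footnote. The sketch below assumes that the single total power listed (30.6 mW) applies to every configuration and that $\min(f_s, 2 \times ERBW) = f_s$; both are assumptions of this sketch rather than statements from the measurements, and the function name is hypothetical.

```python
def figure_of_merit_pj(power_w, enob_bits, f_s_hz, erbw_hz=None):
    """FoM = P / (2**ENOB * min(f_s, 2*ERBW)), in pJ per conversion step."""
    f_eff = f_s_hz if erbw_hz is None else min(f_s_hz, 2.0 * erbw_hz)
    return power_w / (2.0 ** enob_bits * f_eff) * 1e12


TOTAL_POWER_W = 30.6e-3  # single value in Table 5.1; assumed for all modes

# (resolution in bits, ENOB at Nyquist, max conversion rate in S/s) from Table 5.1
CONFIGS = [(3, 2.74, 10e9), (4, 3.54, 5e9), (5, 4.62, 2.5e9)]

fom_by_resolution = {
    bits: figure_of_merit_pj(TOTAL_POWER_W, enob, f_s) for bits, enob, f_s in CONFIGS
}

if __name__ == "__main__":
    for bits, fom in fom_by_resolution.items():
        print(f"{bits}-bit: {fom:.2f} pJ/conv-step")
```

Under these assumptions the computed values round to the tabulated 0.46, 0.53, and 0.5 pJ/conv-step, which illustrates how halving the sampling rate while gaining roughly one effective bit keeps the FoM nearly constant across configurations.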

## 5.6. Conclusion

A high-speed reconfigurable ADC has been fabricated in 65-nm CMOS. The proposed dynamic reconfiguration technique covers a wide range of performance by splitting the converter into smaller building blocks. Offset cancellation is achieved by reference-voltage trimming, thus decoupling analog performance from component matching. The ADC can be configured to work as a 3-bit, a 4-bit, or a 5-bit ADC with a maximum INL and DNL of 0.48LSB and 0.35LSB, respectively.



Figure 5.10. Maximum (a) DNL and (b) INL for different ADC configurations.



Figure 5.11. Simulated SNDR & ENOB versus input frequency at maximum sampling speed for different configurations.



Figure 5.12. FoM vs. Sampling frequency: comparison to prior work.

# **CHAPTER 6**

# Body Trimming beyond Calibration: Digitally Calibrated Current-Steering Segmented-DAC Design

So far, body-voltage trimming has been introduced only as an offset calibration technique. In this chapter, we propose a segmented DAC architecture that extends the use of body trimming to the converter core itself. Segmented architectures are commonly used in the design of high-performance converters [67-70], in order to achieve satisfactory DNL and INL performance at reasonable area and circuit complexity. The DAC segments are built either of the same type [67-69], or of different types [70].

The proposed *N*-bit DAC architecture, with *M* coarse most significant bits (MSBs) and *N*-*M* fine least significant bits (LSBs), is shown in Fig. 6.1. The coarse DAC is implemented as a thermometer-coded current-steering DAC, and the fine DAC is implemented by modifying the body voltage of a current cell with the same size as (or a multiple of) the MSB cells. Because the cell current has a low sensitivity to body-voltage variations, the fine DAC produces very fine steps in the total differential current of the segmented DAC. While the matching requirements for the coarse DAC are as high as the entire DAC resolution, the fine DAC matching is relaxed due to the inherent attenuation. The real challenge is to accurately adjust the reference range of the fine DAC such that its total swing is precisely equal to one step of an MSB cell. This can be done by controlling the fine ladder range with an accurate, albeit slow, calibration DAC. The accuracy requirements for the calibration circuitry as well as other challenges are discussed in the following sections.
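The importance of calibrating the fine ladder range can be illustrated with an idealized behavioral model of the segmented DAC. The sketch below assumes perfectly matched MSB cells and a perfectly linear fine DAC, so that the only error is the ratio `fine_range` between the fine full-scale swing and one MSB cell; the function names and the 10% range error are hypothetical, and the 8-bit, 4+4 split is used only as a convenient default.

```python
def segmented_dac_output(code, n_bits=8, m_bits=4, fine_range=1.0):
    """Idealized segmented-DAC output in units of one MSB-cell current.

    fine_range is the full-scale swing of the fine DAC relative to one MSB
    cell; 1.0 means the fine ladder range is perfectly calibrated.
    """
    fine_bits = n_bits - m_bits
    coarse = code >> fine_bits            # number of MSB cells switched on
    fine = code & ((1 << fine_bits) - 1)  # fine (body-voltage) code
    return coarse + fine_range * fine / (1 << fine_bits)


def dnl_lsb(n_bits=8, m_bits=4, fine_range=1.0):
    """DNL of every code transition, normalized to the ideal LSB."""
    lsb = 1.0 / (1 << (n_bits - m_bits))
    outputs = [segmented_dac_output(c, n_bits, m_bits, fine_range)
               for c in range(1 << n_bits)]
    return [(b - a) / lsb - 1.0 for a, b in zip(outputs, outputs[1:])]


if __name__ == "__main__":
    dnl = dnl_lsb(fine_range=1.1)  # fine range 10% too large (hypothetical)
    print(f"worst DNL: {min(dnl):.2f} LSB, best DNL: {max(dnl):.2f} LSB")
```

In this model a range error of a fraction $e$ produces a DNL of $+e$ LSB inside each segment and $-(2^{N-M}-1)\,e$ LSB at every MSB boundary, so even a modest range error dominates the DNL at the segment transitions.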



Fig. 6.1. Segmented-DAC architecture.

## 6.1. Coarse-Segments Mismatch Trimming

For a segmented DAC, calibration is usually applied to the MSB segments since they are the most sensitive to mismatches. The target trimming accuracy for the MSB segments is much finer than the cell value. Usually, trimming is performed by shunting each cell with a small auxiliary calibration current DAC [12, 13], or by modifying the gate voltage of each cell [15, 16]. Mixing the main DAC array with auxiliary DAC arrays in the first method results in a non-homogeneous layout that may be restricted by nanometer-scale design rules. The latter method, which involves trimming the gate bias, suffers from higher gate leakage or requires special floating-gate devices.

The proposed current cell, shown in Fig. 6.2, is built using a PMOS current source ($M_1$), differential switches ($M_{2a,b}$), and a resistive trimming DAC connected to the body of each current cell. The trimming DAC corrects current mismatches by modifying the body bias of each cell, hence changing its threshold voltage. The proposed method leverages the inherent body-to-threshold voltage attenuation, where large voltage steps at the transistor body result in very fine



Fig. 6.2. Proposed self-calibrated current cell with calibration circuitry.

changes in the corresponding threshold voltage. This results in relaxed matching, noise, and linearity constraints by the same attenuation factor.

## 6.2. Fine-Segments Design

As mentioned before, the fine DAC segments are implemented by modifying the body voltage of a current cell with the same size as (or a multiple of) the MSB cells. While the fine DAC matching is relaxed due to the inherent attenuation, the nonlinear behavior described by (2) limits the maximum achievable accuracy, and consequently the resolution, of the fine segments. The reference range of the fine DAC is designed to make the total current swing of the fine segments precisely equal to one step of an MSB cell. In order to achieve the required swing with sufficient linearity, the fine ladder voltage is applied to the bodies of multiple current cells. In other words, sizing is adopted as a supplemental method to increase the linearity of the overall DAC structure. This technique leverages the simplicity of the sizing technique without



Fig. 6.3. Maximum achieved accuracy of fine segments versus number of fine cells.

suffering from its poor efficiency. The reason is that body trimming (used to implement the fine bits) is applied to only one (or a few) current cells, which makes the overhead insignificant regardless of the technique used. In this case, the required fine ladder range is reduced and higher linearity is achieved. Fig. 6.3 shows the maximum achieved accuracy (and resolution) versus the number of cells to which body tuning is applied. If the fine ladder voltage is applied to a single cell, the tuning range is not sufficient to achieve a swing of one step of the coarse bits. If the ladder voltage is applied differentially to two cells, the fine segments can provide up to 7 bits of resolution (in addition to the coarse MSBs). If more bits are needed for the fine segments, more cells should be used, as shown in the figure.

## 6.3. Measurement Results

A prototype 8-bit segmented DAC is fabricated in a 65-nm digital CMOS (Fig. 6.4) and has an active area of 0.01mm<sup>2</sup> which includes the analog switches, reference ladder, and digital



Fig. 6.4. Die photograph of the prototype segmented-DAC chip.

decoders. The DAC is split into two 4-bit segments, where the four MSBs are implemented as a thermometer-coded current-steering DAC, and the four LSBs are implemented by differentially controlling the body voltage of two current cells through a resistor ladder. The DAC consumes 0.48 mW from a 1.2V supply. Fig. 6.5 shows the measured INL and DNL. By calibrating the dynamic range of the fine segments, the measured INL range is reduced from  $-2.2 \sim 2.75$  LSB to  $-0.58 \sim 2.04$  LSB. The measured DNL range is reduced from  $-3 \sim 1.3$  LSB to  $-1.16 \sim 0.94$  LSB. In order to further enhance the DAC linearity, body-voltage trimming is applied to the coarse elements to reduce their random mismatches. The calibrated DAC INL range is  $-0.75 \sim 0.77$  LSB, and the DNL range is  $-0.64 \sim 0.66$  LSB.

## 6.4. Conclusion

The efficiency of using body-voltage trimming for both the core realization and the mismatch calibration of a segmented digital-to-analog converter has been investigated. Both the advantages and challenges of this technique are studied in detail. Methods have been presented to extend the use of bulk-voltage trimming beyond technology limitations. A case study has been introduced to investigate the efficiency of different hybrid techniques. The area and power overhead due to trimming have been minimized using the proposed methods, while neither special technology nor a special operating environment is required.



Fig. 6.5. Measured (a) DNL and (b) INL of the 8-bit segmented-DAC before and after calibration.

# **CHAPTER 7**

# Conclusion

Reconfigurable ADC architectures have been studied in this dissertation. The study focuses on high-speed, low-to-medium resolution applications. Novel techniques have been introduced at both the architectural and circuit-design levels.

On the architectural level, a broad range of reconfigurability has been achieved with high efficiency in both area and power consumption through the introduction of "architectural-reconfigurable" data converters. This solution allows independent optimization of the ADC performance for different configurations by selecting the optimal architecture for each resolution.

Two key blocks that are widely used in high-speed data converters, namely the comparator and the MDAC, are carefully studied in this research. This study proposes multi-level interleaving as a circuit-enabling technique for low-power, high-speed comparator regeneration. In addition, a partially interleaved architecture, a current-steering DAC, and an open-loop amplifier are proposed to relax the MDAC settling with minimal area and power overhead.

ADC component mismatches are corrected using digital calibration based on body-voltage trimming, which decouples analog performance from component matching. The efficiency of using body-voltage trimming for both the core realization and the mismatch calibration of data converters has been investigated. Both the advantages and challenges of this technique are studied in detail. Methods have been presented to extend the use of body-voltage trimming beyond technology limitations. A foreground calibration method for current-steering DACs has been presented as a case study. The proposed current cell allows calibration of the current sources to very high accuracy, and the approach manages device mismatch with a low penalty. A 4-bit thermometer-coded DAC with this calibration technique is embedded in a 7-bit pipelined ADC and implemented in 65-nm CMOS. The ADC has achieved a calibrated SFDR of 47dB at speeds up to 2GHz. The MDAC core occupies 0.01mm<sup>2</sup> of active area, and the entire ADC consumes 41mW from a 1.2V supply at 1.5GS/s.

Two prototype chips are implemented in 65-nm CMOS to verify the results of this study. The first chip is a 2.5-10GS/s reconfigurable flash ADC. The ADC can be configured to work as a 3-bit, a 4-bit, or a 5-bit ADC with worst case integral nonlinearity (INL) and differential nonlinearity (DNL) of 0.48LSB and 0.35LSB respectively. The ADC achieves a figure-of-merit of 0.46pJ/conv-step and the active area is 0.13 mm<sup>2</sup>. The second chip is a 1.5-4GS/s "architecture" reconfigurable ADC. The ADC covers resolution range from 3-bit to 7-bit, and achieves a figure-of-merit of 0.46pJ/conv-step at 7-bit and the active area is 0.15mm<sup>2</sup>.

# Appendix

# Analysis of Body-Voltage Trimming: Thermal Stability and Short-Channel Effects

## A.1. Thermal Stability

#### A.1.1. MOS Transistor Mismatch Model

Generally, MOS transistors will be operating in the saturation region in analog circuits. Therefore we should relate the measured mismatches in  $V_T$  and K to the saturation region, where the drain current is given by

$$I = \frac{K}{2} \cdot \left(V_{GS} - V_T\right)^2 \tag{A.1}$$

Then the variance in the drain current may be written as

$$\frac{\sigma_I^2}{I^2} = \frac{4 \cdot \sigma_{VT}^2}{(V_{GS} - V_T)^2} + \frac{\sigma_K^2}{K^2} \tag{A.2}$$

From (A.2), we can conclude that the current and voltage matching depends on both $\sigma_{VT}$ and $\sigma_K$, and that the relative importance of the $V_T$ and $K$ mismatches depends on the gate overdrive $V_{GST} = (V_{GS} - V_T)$. If we define a corner gate-overdrive voltage as $V_{GST,m} = (V_{GS} - V_T)_m = 2 \cdot \sigma_{VT} / (\sigma_K / K)$, the effect of the $V_T$ mismatch dominates that of the $K$ mismatch for transistors with a $V_{GST}$ smaller than $V_{GST,m}$. Although $V_{GST,m}$ does scale down as technology advances, the values in Fig. A.1 make it clear that the $V_T$ mismatch is indeed dominant under normal biasing conditions, so that (A.2) can be simplified to [7, 71]:



Fig. A.1. Overdrive voltage at which the K mismatch begins to dominate.

$$\frac{\sigma_I^2}{I^2} = \frac{4 \cdot \sigma_{VT}^2}{(V_{GS} - V_T)^2} \tag{A.3}$$
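The relation between (A.2), the corner overdrive voltage, and the simplified form (A.3) can be illustrated numerically. In the sketch below, `sigma_k_rel` stands for the relative current-factor mismatch $\sigma_K / K$; the 5 mV and 1% mismatch values are hypothetical and are not taken from Fig. A.1.

```python
import math


def relative_current_sigma(sigma_vt, sigma_k_rel, v_gst):
    """sigma_I / I from Eq. (A.2); sigma_k_rel is sigma_K / K."""
    vt_term = 4.0 * sigma_vt ** 2 / v_gst ** 2
    return math.sqrt(vt_term + sigma_k_rel ** 2)


def corner_overdrive(sigma_vt, sigma_k_rel):
    """Overdrive at which the V_T and K contributions in Eq. (A.2) are equal."""
    return 2.0 * sigma_vt / sigma_k_rel


SIGMA_VT = 5e-3     # hypothetical: 5 mV threshold mismatch
SIGMA_K_REL = 0.01  # hypothetical: 1% current-factor mismatch

v_corner = corner_overdrive(SIGMA_VT, SIGMA_K_REL)

if __name__ == "__main__":
    v_gst = 0.2  # hypothetical overdrive well below the corner
    full = relative_current_sigma(SIGMA_VT, SIGMA_K_REL, v_gst)
    simplified = 2.0 * SIGMA_VT / v_gst  # Eq. (A.3)
    print(f"corner overdrive: {v_corner:.2f} V")
    print(f"sigma_I/I full: {full:.4f}, V_T-only: {simplified:.4f}")
```

For an overdrive well below the corner value, the $V_T$-only expression differs from the full expression by only a few percent in this example, which is the sense in which (A.3) approximates (A.2).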

The threshold voltage of a transistor may be expressed as [72]

$$V_T = \phi_{MS} + 2\phi_B + \frac{Q_B}{C_{ox}} - \frac{Q_f}{C_{ox}} + \frac{q \cdot D_I}{C_{ox}} \tag{A.5}$$

where $\phi_{MS}$ is the gate-semiconductor work function difference, $\phi_B$ is the Fermi potential in the bulk, $Q_B$ is the depletion charge density, $Q_f$ is the fixed oxide charge density, $D_I$ is the threshold-adjust implant dose, and $C_{ox}$ is the gate oxide capacitance per unit area. The last term in (A.5) accounts for the threshold-adjust implant, where the implanted ions are assumed to have a delta-function profile at the silicon-silicon dioxide interface. The standard deviation of $V_T$ may be determined if we can find the standard deviations of the various terms on the right-hand side of (A.5). The Fermi potential $\phi_B$ has a logarithmic dependence on the substrate doping, and $\phi_{MS}$ has a similar dependence on the doping in the substrate and in the polysilicon gate. Hence these terms may be regarded as constants that do not contribute to any mismatch.

Next we consider oxide fixed charge which is reported to have a Poisson distribution [73]. Then its variance is given by

$$\sigma_{Qf}^2 = \frac{qQ_f}{L \cdot W} \tag{A.6}$$

where L is the effective length and W is the effective width of the channel.

The depletion charge per unit area $Q_B$ is also a random variable that depends on the distribution of the dopant atoms. No theoretical treatment of fluctuations in dopant ion density is available. However, the physical conditions in the substrate favor a Poisson distribution [72]. The variance in $Q_B$ may then be shown to be

$$\frac{\sigma_{QB}^2}{Q_B^2} = \frac{1}{4 \cdot L \cdot W \cdot W_D \cdot N_A} \tag{A.7}$$

where  $W_D$  is the depletion layer width and  $N_A$  is the substrate doping.

Since the threshold-adjust implant is carried out for p-channel transistors only, $q D_I = 0$ for n-channel devices. Finally, the variance in $C_{ox}$ may be determined by estimating the variances in oxide thickness and permittivity [72]. It can be shown that

$$\frac{\sigma_{Cox}^2}{C_{ox}^2} = \frac{A_{ox}}{L \cdot W} \tag{A.8}$$

where  $A_{ox}$  is a parameter to be determined from measurements.

The random variables  $Q_f$ ,  $Q_B$ , and  $C_{ox}$  are all independent. Hence the variance in  $V_T$  may be written as

$$\begin{aligned}
\sigma_{VT}^{2} &= \frac{\sigma_{QB}^{2} + \sigma_{Qf}^{2}}{C_{ox}^{2}} + \frac{\sigma_{Cox}^{2}}{C_{ox}^{2}} \cdot \left(\frac{Q_{B}^{2} + Q_{f}^{2}}{C_{ox}^{2}}\right) \\
&= \frac{1}{W \cdot L \cdot C_{ox}^{2}} \cdot \left[q(Q_{B} + Q_{f}) + A_{ox}(Q_{B}^{2} + Q_{f}^{2})\right]
\end{aligned} \tag{A.9}$$

As the threshold voltage varies with temperature, it is interesting to know its matching behavior as a function of temperature. In the case of the threshold voltage, as expressed by (A.5), the only terms that depend on temperature are $\phi_B$ and $\phi_{MS}$ [72]. We have seen through (A.9) that the contribution of these terms to the threshold voltage mismatch is negligible. Therefore we may expect the matching behavior of the threshold voltage to be almost independent of temperature.

#### A.1.2. Body-Effect Model

To account for the threshold shift due to a nonzero flat-band voltage, which is caused mainly by the fixed oxide charges $Q_f$ and the work-function difference $\phi_{MS}$ between the gate material and the semiconductor, (A.5) becomes [74]

$$V_T = \phi_{MS} + 2\phi_B + \frac{\sqrt{2 \cdot \varepsilon_s \cdot q \cdot N_A \cdot (2 \cdot \phi_B)}}{C_{ox}} - \frac{Q_f}{C_{ox}} + \frac{q \cdot D_I}{C_{ox}} \tag{A.10}$$

Qualitatively, $V_T$ is the gate bias beyond flat-band at which an inversion charge sheet just starts to be induced; it is given by the sum of the voltages across the semiconductor ($2\phi_B$) and the oxide layer. The square-root term is the total depletion-layer charge $Q_B$.

When a substrate bias is applied (negative for n-channel or p-substrate), the threshold voltage becomes

$$V_T = \phi_{MS} + 2\phi_B + \frac{\sqrt{2 \cdot \varepsilon_s \cdot q \cdot N_A \cdot (2 \cdot \phi_B - V_{BS})}}{C_{ox}} - \frac{Q_f}{C_{ox}} + \frac{q \cdot D_I}{C_{ox}} \tag{A.11}$$

and is shifted by an amount of

$$\Delta V_T = \frac{\sqrt{2 \cdot \varepsilon_s \cdot q \cdot N_A}}{C_{ox}} \left( \sqrt{2 \cdot \phi_B - V_{BS}} - \sqrt{2 \cdot \phi_B} \right) \tag{A.12}$$

Interestingly, the threshold-voltage variation due to body-voltage trimming is also nearly independent of temperature, since temperature enters (A.12) only through $\phi_B$. From (A.9) and (A.12), we can therefore predict very good thermal stability for the body-voltage trimming technique.
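The attenuation from body voltage to threshold voltage expressed by (A.12) can be illustrated with a short numerical sketch for an n-channel device. The doping level, oxide thickness, and Fermi potential used below are hypothetical round numbers chosen for illustration; they do not describe the 65-nm process used in this work.

```python
import math

Q = 1.602e-19          # electron charge, C
EPS_0 = 8.854e-12      # vacuum permittivity, F/m
EPS_SI = 11.7 * EPS_0  # silicon permittivity
EPS_OX = 3.9 * EPS_0   # silicon-dioxide permittivity


def delta_vt(v_bs, n_a=1e24, t_ox=2e-9, phi_b=0.45):
    """Threshold shift from Eq. (A.12) for an n-channel device (v_bs <= 0).

    n_a (m^-3), t_ox (m) and phi_b (V) are hypothetical illustrative values.
    """
    c_ox = EPS_OX / t_ox                              # F/m^2
    gamma = math.sqrt(2.0 * EPS_SI * Q * n_a) / c_ox  # body-effect coefficient
    return gamma * (math.sqrt(2.0 * phi_b - v_bs) - math.sqrt(2.0 * phi_b))


if __name__ == "__main__":
    for v_bs in (-0.05, -0.1, -0.2):
        print(f"V_BS = {v_bs:+.2f} V -> dV_T = {delta_vt(v_bs) * 1e3:.1f} mV")
```

With these values, the threshold shift is several times smaller than the applied body-voltage step, which is the attenuation that relaxes the accuracy requirements on the trimming ladder.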

#### A.1.3. Process, Voltage, and Temperature Tracking

It is noteworthy that in a well-controlled process the nonuniform distribution of the fixed oxide charges has a negligible effect on the threshold voltage mismatch. Also, the gate oxide capacitance is quite uniform and hence has little influence on the threshold voltage mismatch. This makes the nonuniform distribution of the dopant atoms in the bulk the major contributor to the threshold voltage mismatch [72], which yields first-order process tracking between the threshold mismatch and the threshold sensitivity to body-voltage trimming, as depicted by (A.13)

$$\begin{aligned}
\frac{\Delta V_T}{\sigma_{VT}^2} &= \left[\frac{\sqrt{2 \cdot \varepsilon_s \cdot q \cdot N_A}}{C_{ox}} \left(\sqrt{2 \cdot \phi_B - V_{BS}} - \sqrt{2 \cdot \phi_B}\right)\right] \Big/ \left[\frac{q \cdot Q_B}{W \cdot L \cdot C_{ox}^2}\right] \\
&= \frac{W \cdot L \cdot C_{ox}}{q} \left(\sqrt{1 - \frac{V_{BS}}{2 \cdot \phi_B}} - 1\right)
\end{aligned} \tag{A.13}$$

This equation shows that body-voltage based trimming does track threshold mismatch changes across voltage and temperature variations, and shows the same scaling behavior as mismatches in terms of silicon parameters variations (e.g., doping and fixed-charge distribution).

## A.2. Short-Channel Effects

#### A.2.1. Reduced Threshold Sensitivity to Body-Voltage Variation

For short-channel devices, the full effect of $Q_B$ on the threshold voltage is reduced, because near the source and drain ends of the channel, some field lines originating from the bulk charges under the channel region terminate at the source or drain instead of the gate (Fig. A.2) [74]. A first-order estimate of the threshold voltage can be made by considering the charge partition. By assuming that all field lines within the trapezoid terminate within the channel length L, and all lines outside it terminate at the source and drain electrodes, we can approximate the bulk charge as

$$Q_B \cdot L = q \cdot N_A \cdot W\left(\frac{L+L'}{2}\right) \tag{A.14}$$

By trigonometry, we can write

$$\frac{L+L'}{2\cdot L} = 1 - \left(\sqrt{1 + \frac{2W}{r_j}} - 1\right) \cdot \frac{r_j}{L} \tag{A.15}$$

This yields the reduced body effect given by (A.16)

$$\Delta V_T = \lambda_b \cdot \frac{\sqrt{2 \cdot \varepsilon_s \cdot q \cdot N_A}}{C_{ox}} \left( \sqrt{2 \cdot \phi_B - V_{BS}} - \sqrt{2 \cdot \phi_B} \right) \tag{A.16}$$

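where $\lambda_b = 1 - \left(\sqrt{1 + \frac{2W}{r_j}} - 1\right) \cdot \frac{r_j}{L}$ [75]. Here $r_j$ is the junction depth, and $W$ in (A.14)-(A.16) denotes the depletion-layer width rather than the channel width.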
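The dependence of the reduction factor $\lambda_b$ on channel length can be illustrated with a short sketch. It interprets $W$ in (A.15) as the depletion-layer width and $r_j$ as the junction depth, which is an assumption of this sketch; the 50 nm and 100 nm values are hypothetical and the function name is illustrative.

```python
import math


def lambda_b(w_dep, r_j, length):
    """Body-effect reduction factor of Eq. (A.16).

    w_dep: depletion-layer width, r_j: junction depth, length: channel length
    (all in the same unit).
    """
    return 1.0 - (math.sqrt(1.0 + 2.0 * w_dep / r_j) - 1.0) * r_j / length


if __name__ == "__main__":
    w_dep, r_j = 50e-9, 100e-9  # hypothetical depletion width and junction depth
    for length in (1e-6, 250e-9, 60e-9):
        print(f"L = {length * 1e9:.0f} nm -> lambda_b = {lambda_b(w_dep, r_j, length):.2f}")
```

For a fixed junction depth, shortening the channel drives $\lambda_b$ well below unity, which is why the threshold sensitivity to body voltage shrinks when the channel length scales faster than the junction depth.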

While channel lengths have been scaled aggressively over the last several years, the junction depth has not been scaled quite as aggressively. Ultra-shallow junctions have been hard to achieve in manufacturing for various reasons. In particular, diffusion of dopants has limited the use of ultra-shallow junctions. This means that more body-effect reduction should be expected for future technologies.



Fig. A.2. Charge-conservation model.

# References

- [1] R. Yousry, H. Park, E. Chen, and K. Yang, "A Digitally-Calibrated 10GS/s Reconfigurable Flash ADC in 65nm CMOS," in *Proc. IEEE Int. Symp. Circuits and Systems (ISCAS)*, accepted for publication for May 2013.
- [2] Klaus von Arnim *et al.*, "A 1 GHz Bandwidth Low-Pass ΔΣ ADC With 20–50 GHz Adjustable Sampling Rate," *IEEE Journal of Solid-State Circuits*, vol.44, no.5, pp. 1401-1414, May 2009.
- [3] S. Danesh, J. Hurwitz, K. Findlater, D. Renshaw, and R. Henderson, "A reconfigurable 1Gsps to 250MSps, 7-bit to 9-bit highly time-interleaved counter ADC in 0.13μm CMOS," *IEEE Symp. VLSI Circuits*, pp. 268-269, June 2011.
- [4] C.-C. Hsu, *et al.*, "A 7b 1.1GS/s Reconfigurable time-interleaved ADC in 90nm CMOS," *IEEE Symp. VLSI Circuits*, pp. 66-67, June 2007.
- [5] B. Murmann, "ADC Performance Survey 1997-2012," [Online]. Available: http://www.stanford.edu/~murmann/adcsurvey.html.
- [6] Patrick J. Quinn and Arthur H.M. Van Roermund, "Switched-Capacitor Techniques For High-Accuracy Filter And ADC Design", Publication date: 2007, Publisher: Springer, ISBN.
- [7] Michael P. Flynn, Sunghyun Park, and Chun C. Lee, "Achieving Analog Accuracy in Nanometer CMOS," *International Journal of High Speed Electronics and Systems*, 2005, pp. 255-275.
- [8] Ian Galton, "Digital Cancellation of D/A Converter Noise in Pipelined A/D Converters," *IEEE Trans. Circuits Syst. II*, vol. 47, no. 3, pp. 185–196, March 2000.

- [9] Klaus von Arnim *et al.*, "Efficiency of body biasing in 90-nm CMOS for low-power digital circuits," *IEEE Journal of Solid-State Circuits*, vol.40, no.7, pp. 1549- 1556, July 2005.
- [10] E. Alpman *et al.*, "A 1.1V 50mW 2.5GS/s 7b Time-Interleaved C-2C SAR ADC in 45nm LP digital CMOS," *ISSCC Digest of Technical Papers*, February 2009, pp. 76-77.
- [11] Junjie Yao et al., "Bulk Voltage Trimming Offset Calibration for High-Speed Flash ADCs," *IEEE Trans. Circuits Syst. II*, vol. 57, no. 2, pp. 110–114, Feb. 2010.
- [12] G. Radulov, P. Quinn, J. Hegt, and A.H.M. van Roermund, "A start-up calibration method for generic current steering D/A converters with optimal area solution," in *Proc. IEEE Int. Symp. Circuits and Systems (ISCAS)*, May 2005, pp. 788-791.
- [13] W. Schofield, D. Mercer, and L. St. Onge, "A 16 b 400 MS/s DAC with 80 dBc IMD to 300 MHz and 160 dBm/Hz noise power spectral density," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2003, pp. 126-482.
- [14] S. Park, Y. Palaskas, and M. P. Flynn, "A 4GS/s 4b Flash ADC in 0.18µm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2006, pp. 2330-2339.
- [15] A. R. Bugeja and B.-S. Song, "A self-trimming 14-b 100-MS/s CMOS DAC," IEEE J. Solid-State Circuits, vol. 35, pp. 1841–1852, Dec. 2000.
- [16] J. Hyde, T. Humes, C. Diorio, M. Thomas, and M. Figueroa, "A 300-MS/s 14-bit digital-to-analog converter in logic CMOS," *IEEE J. Solid State Circuits*, vol. 38, no. 5, pp. 734–740, May 2003.
- [17] Chun-Ying Chen *et al.*, "A Low Power 6-bit Flash ADC with Reference Voltage and Common-Mode Calibration," *IEEE Symp. VLSI Circuits*, pp. 12-13, June 2008.
- [18] E-Hung Chen *et al.*, "10Gb/s Serial I/O Receiver Based on Variable Reference ADC," *IEEE Symp. VLSI Circuits*, pp. 288-289, June 2011.

- [19] Hairong Yu and M.-C Frank Chang, "A 1-V 1.25-GS/S 8-Bit Self-Calibrated Flash ADC in 90-nm Digital CMOS," *IEEE Trans. Circuits Syst. II*, vol. 55, no. 7, pp. 668–672, Jul. 2008.
- [20] Conor Donovan and Michael P. Flynn, "A "Digital" 6-bit ADC in 0.25-μm," *IEEE J. Solid State Circuits*, vol. 37, no. 3, pp. 432–437, Mar. 2002.
- [21] Wenchao Qu, "Design of a Cost-efficient Reconfigurable Pipeline ADC: Department of Electrical Engineering," *University of Tennessee, Knoxville*, 2007.
- [22] B. Xia, A. Valdes-Garcia, and E. Sanchez-Sinencio, "A configurable time-interleaved pipeline ADC for multi-standard wireless receivers," *Solid-State Circuits Conference*, 2004. ESSCIRC 2004. Proceeding of the 30th European, pp. 259-262, 2004.
- [23] J. Elbornsson, *Analysis, estimation and compensation of mismatch effects in A/D converters*: Department of Electrical Engineering, Linköpings universitet, 2003.
- [24] B. Razavi, "Problem of Timing Mismatch in Interleaved ADCs," *Proc. IEEE Custom Integrated Circuits Conference*, pp. 1-8, Sept. 2012.
- [25] C. Kun, A. Mason, and S. Chakrabartty, "A Dynamic Reconfigurable A/D Converter for Sensor Applications," *Sensors*, 2005 IEEE, pp. 1221-1224, 2005.
- [26] G. Mulliken, F. Adil, G. Cauwenberghs, and R. Genov, "Delta-sigma algorithmic analog-to-digital conversion," *Circuits and Systems*, 2002. ISCAS 2002. IEEE International Symposium on, vol. 4, 2002.
- [27] K. Gulati and H. S. Lee, "A low-power reconfigurable analog-to-digital converter," *Solid-State Circuits, IEEE Journal of*, vol. 36, pp. 1900-1911, 2001.
- [28] T. N. Andersen, B. Hernes, A. Briskemyr, F. Telstø, J. Bjørnsen, T. E. Bonnerud, and O. Moldsvor, "A cost-efficient high-speed 12-bit pipeline ADC in 0.18 µm digital CMOS," *IEEE journal of solid-state circuits*, vol. 40, pp. 1506-1513, 2005.
- [29] K. Iizuka, H. Matsui, M. Ueda, and M. Daito, "A 14-bit Digitally Self-Calibrated Pipelined ADC With Adaptive Bias Optimization for Arbitrary Speeds Up to 40 MS/s," *Solid-State Circuits, IEEE Journal of*, vol. 41, pp. 883-891, 2006.

- [30] I. Ahmed and D. A. Johns, "A 50-MS/s(35 mW) to 1-kS/s power scaleable 10-bit pipelined ADC using rapid power-on opamps and minimal bias current variation," *IEEE journal of solid-state circuits*, vol. 40, pp. 2446-2455, 2005.
- [31] A. R. Feldman, *High-speed, Low-power Sigma-delta Modulators for RF Baseband Channel Applications*: Electronics Research Laboratory, College of Engineering, University of California, 1997.
- [32] T. Burger and Q. Huang, "A 13.5-mW 185-Msample/s sigma-delta modulator for UMTS/GSM dual-standard IF reception," *Solid-State Circuits, IEEE Journal of*, vol. 36, pp. 1868-1878, 2001.
- [33] K. B. H. Khoo, "Programmable, high dynamic range sigma-delta A/D converters for multistandard, fully integrated RF receivers," *University of California at Berkeley*, pp. 20-21, 1998.
- [34] R. V. Veldhoven, "A tri-mode continuous-time ΣΔ modulator with switched-capacitor feedback DAC for a GSM-EDGE/CDMA2000/UMTS receiver," proc. *IEEE International Solid-State Circuits Conference*, pp. 60-61, 2003.
- [35] A. Dezzani and E. Andre, "A 1.2-V dual-mode WCDMA/GPRS ΣΔ modulator," proc. *IEEE International Solid-State Circuits Conference*, pp. 58-59, 2003.
- [36] T. Nikolaidis, A. Varagis, F. D., S. B.S., and V. N., "Reconfigurable pipeline Analog/Digital Converter for WCDMA/GSM operation," *Workshop on Multi-mode multi-band re-configurable systems for 3rd enhanced generation mobile phones* 2004.
- [37] W. Audoglio, E. Zuffetti, G. Cesura, and R. Castello, "A 6-10 bits Reconfigurable 20MS/s Digitally Enhanced Pipelined ADC for Multi-Standard Wireless Terminals," *Solid-State Circuits Conference, 2006. ESSCIRC 2006. Proceedings of the 32nd European*, pp. 496-499, 2006.
- [38] M. Anderson, K. Norling, A. Dreyfert, and J. Yuan, "A reconfigurable pipelined ADC in 0.18 um CMOS," VLSI Circuits, 2005. Digest of Technical Papers. 2005 Symposium on, pp. 326-329, 2005.
- [39] K. Ohhata *et al.*, "Design of a 770-MHz, 70-mW, 8-bit subranging ADC using reference voltage precharging architecture," *IEEE J. Solid-State Circuits*, vol. 44, no. 11, pp. 2881–2890, Nov. 2009.
- [40] D. Cline and P. Gray, "A power optimized 13-b 5 Msamples/s pipelined analog-to-digital converter in 1.2 μm CMOS," *IEEE J. Solid-State Circuits*, vol. 31, no. 3, pp. 294–303, Mar. 1996.

- [41] J. Goes, J.C. Vital, J.E. Franca, "Optimum resolution-per-stage in high-speed pipelined A/D converters using self-calibration," in Proc. *IEEE Int. Symp. Circuits and Systems* (*ISCAS*), vol.1, pp. 525-528, May 1995.
- [42] S. Hashemi and B. Razavi, "A 10-bit 1-GS/s CMOS ADC with FOM = 70 fJ/conversion," *IEEE Custom Integrated Circuits Conference (CICC)*, pp.1-4, Sept. 2012.
- [43] W. Schofield, D. Mercer, and L. St. Onge, "A 16b 400 MS/s DAC with 80dBc IMD to 300MHz and 160dBm/Hz noise power spectral density," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2003, pp. 1–10.
- [44] G.L Radulov *et al.*, "A binary-to-thermometer decoder with built-in redundancy for improved DAC yield," in Proc. *IEEE Int. Symp. Circuits and Systems (ISCAS)*, pp.1414-1417, May 2006.
- [45] W. Zhang and M. Hassoun, "A redundant-cell-relay continuous self-calibration method for current-steering DACs," *Proceedings of the European Solid-State Circuits Conference*, pp.349-352, Sept. 2001.
- [46] P.J.A. Harpe, J.M. de Meulmeester, J.A. Hegt, A.H.M. van Roermund, "Novel digital pre-correction method for mismatch in DACs with built-in-self measurement," Proceedings of IEE ADDA 2005.
- [47] C. Giovanni and P. Andrea, "Method of correction of the error introduced by a multibit DAC incorporated in an ADC," U.S. patent 6867718, Mar. 2005.
- [48] K. El-Sankary and M. Sawan, "A background calibration technique for multibit/stage pipelined and time-interleaved ADCs," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol.53, no.6, pp.448-452, June 2006.
- [49] S. Park, Y. Palaskas, and M.P. Flynn, "A 4-GS/s 4-bit flash ADC in 0.18-μm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 9, pp. 1865-1872, Sept. 2007.
- [50] G. Radulov, P. Quinn, H. Hegt, and A.H.M. van Roermund, *Smart and Flexible Digital-to-Analog Converters*. Springer, 2011.
- [51] P. Harpe *et al.*, "Digital post-correction of front-end track-and-hold circuits in ADCs," in *Proc. of IEEE Intl. Symp. On Circuits and Systems (ISCAS)*, May 2006, pp. 1503-1506.
- [52] A. Verma and B. Razavi, "A 10-bit 500-MS/s 55-mW CMOS ADC," *IEEE J. Solid-State Circuits*, vol.44, No.11, pp. 3039-3050, Nov. 2009.

- [53] Y. Zhu, C.-H. Chan, S.-W. Sin, S.-P. U, and R. P. Martins, "A 34fJ 10b 500 MS/s Partial-interleaving pipelined SAR ADC," *IEEE Symp. VLSI Circuits*, pp. 90-91, June 2012.
- [54] D.-L. Shen and T.-C. Lee, "A 6-bit 800-MS/s Pipelined A/D converter with open-loop amplifiers," *IEEE Symp. VLSI Circuits*, pp. 134-135, June 2006.
- [55] B. Murmann and B. Boser, "A 12 b 75 MS/s pipelined ADC using open-loop residue amplification," *in ISSCC Dig. Tech. Papers*, Feb. 2003, pp. 328-329.
- [56] Hayun Chung *et al.*, "A 7.5-GS/s 3.8-ENOB 52-mW flash ADC with clock duty cycle control in 65nm CMOS," *IEEE Symp. VLSI Circuits*, pp. 268-269, June 2009.
- [57] Naoki Kurosawa *et al.*, "Explicit Analysis of Channel Mismatch Effects in Time-Interleaved ADC Systems," *IEEE Trans. Circuits Syst. I*, vol. 48, no. 3, pp. 261–271, Mar. 2001.
- [58] Brian P. Ginsburg, Anantha P. Chandrakasan, "Highly Interleaved 5b 250MS/s ADC with Redundant Channels in 65nm," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2008, pp. 240-610.
- [59] S. Ouzounov *et al.*, "A 1.2V 121-Mode CT ΔΣ Modulator for Wireless Receivers in 90nm CMOS," *ISSCC Digest of Technical Papers*, February 2007, pp. 242-600.
- [60] K. Nagaraj *et al.*, "A 700MSample/s 6b Read Channel A/D Converter with 7b Servo Mode," *ISSCC Digest of Technical Papers*, February 2000, pp. 426-427, 476.
- [61] Jun-Xia Ma, Sai-Weng Sin, Seng-Pan U, and R.P.Martins, "A Power-Efficient 1.056 GS/s Resolution-Switchable 5-bit/6-bit Flash ADC for UWB Applications," *in Proc. of IEEE Intl. Symp. On Circuits and Systems (ISCAS)*, May 2006, pp. 4305-4308.
- [62] M. Harwood *et al.*, "A 12.5Gb/s SerDes in 65nm CMOS Using a Baud-Rate ADC with Digital Receiver Equalization and Clock Recovery," *ISSCC Digest of Technical Papers*, February 2007, pp. 436-591.
- [63] Siamak Sarvari *et al.*, "A 5Gb/s Speculative DFE for 2x Blind ADC-based Receivers in 65-nm CMOS," *IEEE Symp. VLSI Circuits*, pp. 69-70, June 2010.

- [64] Jun Cao *et al.*, "A 500mW digitally calibrated AFE in 65nm CMOS for 10Gb/s Serial links over backplane and multimode fiber," *ISSCC Digest of Technical Papers*, February 2009, pp. 370-371.
- [65] I. Ahmed, J. Mulder, D. Johns, "A 50MS/s 9.9mW Pipelined ADC with 58dB SNDR in 0.18μm CMOS using capacitive charge-pumps," *ISSCC*, pp. 164-165, Feb. 2009.
- [66] K. Hadidi, A. Khoei, "A Highly linear cascode-driver CMOS source follower buffer," *IEEE Intl. Conf. on Electronics, Circuits and Systems*, pp. 1243-1246. Oct. 1996.
- [67] John A. Schoeff, "An Inherently Monotonic 12 Bit DAC," *IEEE J. Solid State Circuits*, vol. 14, no. 6, pp. 904–911, Dec. 1979.
- [68] K. O. Andersson, N. U. Andersson, M. Vesterbacka, and J. J. Wikner, "A method of segmenting digital-to-analog converters," *Southwest Symposium on Mixed-Signal Design*, Apr. 2003, pp. 32-37.
- [69] Chi-Hung Lin and Klaas Bult, "A 10-b, 500-MSample/s CMOS DAC in 0.6 mm<sup>2</sup>," *IEEE J. Solid State Circuits*, vol. 33, no. 12, pp. 1948–1958, Dec. 1998.
- [70] G.J. Priatko, B.L. Thompson, J.A. Kaskey, "A hybrid 3 Gs/s, 6-bit digital to analog converter," *Eighth University/Government/Industry Microelectronics Symposium*. *Proceedings.*, Jun. 1989, pp.160-164.
- [71] P. Kinget and M. Steyaert, "Impact of transistor mismatch on the speed-accuracy-power trade-off of analog CMOS circuits," *IEEE Custom Integrated Circuits Conference* (*CICC*), pp.333-336, Sept. 1996.
- [72] K.R. Lakshmikumar, R.A. Hadaway, and M.A. Copeland, "Characterization and Modeling of Mismatch in MOS Transistors for Precision Analog Design," *IEEE J. Solid State Circuits*, vol. sc-21, no. 6, pp. 1057–1066, Dec. 1986.
- [73] E. H. Nicollian and J. R. Brews, *MOS Physics and Technology*. New York: Wiley, 1982.
- [74] S. M. Sze and Kwok K. Ng, *Physics of Semiconductor Devices*. New Jersey: Wiley, 2007.

- [75] H.C. Poon, L.D. Yau, R.L. Johnston, and D. Beecham, "DC Model for short-channel IGFET's," in *Proc. International Electron Devices Meeting*, vol. 19, Feb. 1973.