Efficiency Improvement Techniques for Millimeter-Wave Transmitters

Permalink
https://escholarship.org/uc/item/1gn4506w

Author
Rostomyan, Narek

Publication Date
2018

Peer reviewed|Thesis/dissertation
Efficiency Improvement Techniques for Millimeter-Wave Transmitters

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Electrical Engineering (Electronic Circuits and Systems)

by

Narek Rostomyan

Committee in charge:

Professor Peter Asbeck, Chair
Professor Gert Cauwenberghs
Professor William Hodgkiss
Professor Patrick Mercier
Professor Gabriel Rebeiz

2018
The dissertation of Narek Rostomyan is approved, and it is acceptable in quality and form for publication on microfilm and electronically:

__________________________________________

__________________________________________

__________________________________________

__________________________________________

__________________________________________

Chair

University of California San Diego

2018
DEDICATION

To my family.
TABLE OF CONTENTS

Signature Page ................................................................. iii
Dedication ................................................................. iv
Table of Contents .......................................................... v
List of Figures ............................................................. viii
List of Tables ............................................................... xii
Acknowledgements ......................................................... xiii
Vita ............................................................................. xv
Abstract of the Dissertation ............................................. xvii

Chapter 1  Introduction ..................................................... 1
  1.1 Efficient mm-Wave CMOS PA Design ............................... 2
  1.2 Challenges in Mm-Wave TDD Transceiver Front-Ends ............. 8
  1.3 Challenges in RF FDD Transceiver Front-Ends ..................... 11
  1.4 Dissertation Scope ................................................ 13
  1.5 Dissertation Organization ......................................... 14

Chapter 2  15 GHz Doherty Power Amplifier with RF Predistortion Linearizer ............. 17
  2.1 Introduction ....................................................... 17
  2.2 Doherty PA Implementation ....................................... 18
    2.2.1 High Power Final Stage and Driver Stage .................. 19
    2.2.2 Load Modulation of 4-Stack Devices ....................... 21
    2.2.3 Realization of Input and Output Combiners ............... 24
  2.3 Analog Predistortion .............................................. 25
    2.3.1 Analog Predistortion Architectures ....................... 26
    2.3.2 Proposed Analog Predistortion Circuit .................... 27
  2.4 Experimental Results ............................................ 30
    2.4.1 Small-Signal and CW Measurements ....................... 31
    2.4.2 Modulation Measurements .................................. 34
  2.5 Conclusion ........................................................ 40
  2.6 Acknowledgment .................................................. 40

Chapter 3  Comparison of pMOS and nMOS 28 GHz High Efficiency Linear Power
  Amplifiers in 45 nm CMOS SOI ....................................... 41
  3.1 Introduction ..................................................... 41
  3.2 Circuit Architecture and Design ................................ 43
<table>
<thead>
<tr>
<th>Chapter</th>
<th>Title</th>
<th>Pages</th>
</tr>
</thead>
<tbody>
<tr>
<td>3.3</td>
<td>Experimental Results</td>
<td>43</td>
</tr>
<tr>
<td>3.4</td>
<td>Additional Discussions</td>
<td>48</td>
</tr>
<tr>
<td>3.5</td>
<td>Conclusion</td>
<td>51</td>
</tr>
<tr>
<td>3.6</td>
<td>Acknowledgment</td>
<td>51</td>
</tr>
<tr>
<td>4.1</td>
<td>Introduction</td>
<td>53</td>
</tr>
<tr>
<td>4.2</td>
<td>Doherty PA Implementation</td>
<td>54</td>
</tr>
<tr>
<td>4.3</td>
<td>Combiner Performance under Mismatch and Process Variations</td>
<td>60</td>
</tr>
<tr>
<td>4.4</td>
<td>Experimental Results</td>
<td>61</td>
</tr>
<tr>
<td>4.5</td>
<td>Conclusion</td>
<td>65</td>
</tr>
<tr>
<td>4.6</td>
<td>Acknowledgment</td>
<td>65</td>
</tr>
<tr>
<td>5.1</td>
<td>Introduction</td>
<td>66</td>
</tr>
<tr>
<td>5.2</td>
<td>Circuit Architecture and Design</td>
<td>68</td>
</tr>
<tr>
<td>5.3</td>
<td>Experimental Results</td>
<td>70</td>
</tr>
<tr>
<td>5.4</td>
<td>Conclusion</td>
<td>75</td>
</tr>
<tr>
<td>5.5</td>
<td>Acknowledgment</td>
<td>76</td>
</tr>
<tr>
<td>6.1</td>
<td>Introduction</td>
<td>77</td>
</tr>
<tr>
<td>6.2</td>
<td>Combiner Synthesis</td>
<td>78</td>
</tr>
<tr>
<td>6.2.1</td>
<td>Boundary Conditions</td>
<td>79</td>
</tr>
<tr>
<td>6.2.2</td>
<td>Combiner Realization</td>
<td>82</td>
</tr>
<tr>
<td>6.3</td>
<td>Building Blocks of the Front-End</td>
<td>84</td>
</tr>
<tr>
<td>6.3.1</td>
<td>High Power PA Implementation</td>
<td>84</td>
</tr>
<tr>
<td>6.3.2</td>
<td>LNA Implementation</td>
<td>85</td>
</tr>
<tr>
<td>6.3.3</td>
<td>Switch Implementation</td>
<td>86</td>
</tr>
<tr>
<td>6.3.4</td>
<td>Combiner Synthesis</td>
<td>87</td>
</tr>
<tr>
<td>6.4</td>
<td>Experimental Results</td>
<td>92</td>
</tr>
<tr>
<td>6.4.1</td>
<td>Front-end LNA Measurements</td>
<td>92</td>
</tr>
<tr>
<td>6.4.2</td>
<td>Front-end PA Measurements</td>
<td>94</td>
</tr>
<tr>
<td>6.5</td>
<td>Conclusion</td>
<td>97</td>
</tr>
<tr>
<td>6.6</td>
<td>Acknowledgment</td>
<td>99</td>
</tr>
<tr>
<td>7.1</td>
<td>Digital Power Amplifier and Quantization Noise</td>
<td>102</td>
</tr>
<tr>
<td>7.2</td>
<td>Cancellation Technique</td>
<td>103</td>
</tr>
<tr>
<td>7.3</td>
<td>Experimental Results</td>
<td>107</td>
</tr>
<tr>
<td>Chapter 8</td>
<td>Conclusions and Future Work</td>
<td>111</td>
</tr>
<tr>
<td>8.1</td>
<td>Dissertation Summary</td>
<td>111</td>
</tr>
<tr>
<td>8.2</td>
<td>Future Work</td>
<td>114</td>
</tr>
<tr>
<td></td>
<td>Bibliography</td>
<td>116</td>
</tr>
</tbody>
</table>
LIST OF FIGURES

<table>
<thead>
<tr>
<th>Figure</th>
<th>Description</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>1.1</td>
<td>Four transistor stacking (a), and multigate four-stack layout (b).</td>
<td>4</td>
</tr>
<tr>
<td>1.2</td>
<td>Classic Doherty architecture with $\lambda/4$ lines.</td>
<td>6</td>
</tr>
<tr>
<td>1.3</td>
<td>Four transistor stacking (a), and multigate four-stack layout (b).</td>
<td>8</td>
</tr>
<tr>
<td>1.4</td>
<td>An example of a single channel of an analog beam-forming TDD transceiver.</td>
<td>9</td>
</tr>
<tr>
<td>1.5</td>
<td>Examples of an SPDT T/R switch with inductive compensation (a), and a transmission line based T/R switch (b).</td>
<td>10</td>
</tr>
<tr>
<td>1.6</td>
<td>An example of an FDD transceiver front-end demonstrating RxBN issues in the receiver.</td>
<td>12</td>
</tr>
<tr>
<td>2.1</td>
<td>Full schematic of the Doherty PA.</td>
<td>19</td>
</tr>
<tr>
<td>2.2</td>
<td>Schematics of the 4-stack final stage (a) and 2-stack driver stage (b).</td>
<td>20</td>
</tr>
<tr>
<td>2.3</td>
<td>Simulated IV-curves of an extracted 32$\mu$m wide NMOS device.</td>
<td>21</td>
</tr>
<tr>
<td>2.4</td>
<td>Simulated voltage waveforms of a 4-stack device with 4.8 V supply voltage and $R_{\text{Load}} = 35\Omega$.</td>
<td>22</td>
</tr>
<tr>
<td>2.5</td>
<td>Simulated voltage waveforms of a 4-stack device with 4.8 V supply voltage and $R_{\text{Load}} = 70\Omega$.</td>
<td>23</td>
</tr>
<tr>
<td>2.6</td>
<td>Simulated 4-stack device’s PAE vs. back-off for $R_L = 35\Omega$ and $R_L = 70\Omega$.</td>
<td>23</td>
</tr>
<tr>
<td>2.7</td>
<td>Output combiner schematic.</td>
<td>25</td>
</tr>
<tr>
<td>2.8</td>
<td>Input splitter schematic.</td>
<td>26</td>
</tr>
<tr>
<td>2.9</td>
<td>Proposed open loop analog predistorter (APD) for symmetric Doherty PAs.</td>
<td>28</td>
</tr>
<tr>
<td>2.10</td>
<td>Proposed analog linearizer schematic. $R_{L1} = 365\Omega$, $C_{L1} = 197\text{ fF}$.</td>
<td>28</td>
</tr>
<tr>
<td>2.11</td>
<td>Simulated total gain without linearization, gains of main and peaking amplifier, total gain with linearization, and APD loss as a function of output power.</td>
<td>29</td>
</tr>
<tr>
<td>2.12</td>
<td>Simulated envelope detector conversion gain.</td>
<td>30</td>
</tr>
<tr>
<td>2.13</td>
<td>Micrograph of the 1x1 mm$^2$ two stage Doherty power amplifier chip.</td>
<td>31</td>
</tr>
<tr>
<td>2.14</td>
<td>Measured S-parameters.</td>
<td>32</td>
</tr>
<tr>
<td>2.15</td>
<td>Measured Gain, PAE and drain efficiency (DE).</td>
<td>32</td>
</tr>
<tr>
<td>2.16</td>
<td>Measured saturated output power and peak PAE vs frequency.</td>
<td>33</td>
</tr>
<tr>
<td>2.17</td>
<td>6 dB back-off PAE of measured Doherty PA and theoretical class B based on peak PAE in Fig. 2.16 over frequency.</td>
<td>33</td>
</tr>
<tr>
<td>2.18</td>
<td>Measured gain and PAE with analog linearizer turned on and off.</td>
<td>34</td>
</tr>
<tr>
<td>2.19</td>
<td>Measured EVM and average PAE for 200 MHz 16-QAM signal with analog linearizer turned on and off.</td>
<td>35</td>
</tr>
<tr>
<td>2.20</td>
<td>Measured ACLR for 200 MHz 16-QAM signal with analog linearizer turned on and off.</td>
<td>35</td>
</tr>
<tr>
<td>2.21</td>
<td>Measured EVM and average PAE for 200 MHz 64-QAM signal with analog linearizer turned on and off.</td>
<td>36</td>
</tr>
<tr>
<td>2.22</td>
<td>Measured ACLR for 200 MHz 64-QAM signal with analog linearizer turned on and off.</td>
<td>36</td>
</tr>
</tbody>
</table>
Figure 2.23: Measured spectrum for 200 MHz 64-QAM signal with $EVM = 5.5\%$ and average output power $P_{\text{out}} = 16.4\, \text{dBm}$ with analog linearizer turned on and off. 37

Figure 2.24: Measured constellation for 200 MHz 16-QAM signal with $EVM = 9.5\%$ (a), and 64-QAM signal with $EVM = 5.5\%$ (b) with analog linearizer turned on. 38

Figure 2.25: Measured AM-PM for 200 MHz 64-QAM signal with $EVM = 5.5\%$ and average output power $P_{\text{out}} = 16.4\, \text{dBm}$ with analog linearizer turned on and off. An offset of $-40^\circ$ was added to the data without linearization for clarity. 38

Figure 3.1: Schematics of the 2-stack nMOS (a) and pMOS (b) based PAs. For nMOS PA: $V_{\text{DD}} = 2.4\, \text{V}$, $V_{G1} = 0.22\, \text{V}$, $V_{G2} = 1.7\, \text{V}$. For pMOS PA: $V_{\text{SS}} = -2.4\, \text{V}$, $V_{G1} = -0.22\, \text{V}$, $V_{G2} = -1.7\, \text{V}$. 43

Figure 3.2: Micrograph of the 0.6x0.62 mm$^2$ power amplifier chip. 44

Figure 3.3: Measured S-parameters of nMOS and pMOS based PAs. 44

Figure 3.4: Measured gain, PAE and drain efficiency for nMOS PA at 26.75 GHz and pMOS PA at 26.5 GHz. 45

Figure 3.5: Measured gain change vs time for nMOS PA at 26.75 GHz. 45

Figure 3.6: Measured gain change vs time for pMOS PA at 26.5 GHz. 46

Figure 3.7: Measured constellation and EVM for 800 MHz 64-QAM signal with average output power $P_{\text{out}} = 9.8\, \text{dBm}$. 47

Figure 3.8: Measured spectrum for 800 MHz 64-QAM signal with $EVM = 5.5\%$ and average output power $P_{\text{out}} = 9.8\, \text{dBm}$. 47

Figure 3.9: Schematic of the pMOS PA with a shunt feedback drain-source capacitance $C_3$. 49

Figure 3.10: Measured gain, PAE and drain efficiency for pMOS PA at 27 GHz. 49

Figure 3.11: Measured average PAE and EVM for 800 MHz 64-QAM OFDM for pMOS PA at 27 GHz. 50

Figure 4.1: Schematic of the Doherty PA. $R_b = 14.5\, \text{k}\Omega$, $C_g = 380\, \text{fF}$, $C_1 = 1\, \text{pF}$, $L_1 = 117\, \text{pH}$, $C_2 = 69\, \text{fF}$, $L_2 = 244\, \text{pH}$, $C_3 = 187\, \text{fF}$, $L_3 = 300\, \text{pH}$, $C_4 = 169\, \text{fF}$, $L_4 = 247\, \text{pH}$, $C_5 = 82\, \text{fF}$, $L_6 = 39\, \text{pH}$. 54

Figure 4.2: Optimized combiner (a) and conventional combiner (b). 55

Figure 4.3: Simulated losses in synthesized and conventional combiners vs $P_{\text{out}}$. 55

Figure 4.4: Simulated real part of the main and auxiliary PA’s load with varying output power. 56

Figure 4.5: Simulated complex load impedances of the main and auxiliary PA’s with varying output power. 56

Figure 4.6: Modeling of the layout parasitics of the power stage. 57

Figure 4.7: Layout of the 256$\mu$m wide 2-stack power device including external gate to ground capacitors for the top transistor. 58

Figure 4.8: Extracted (solid line, port 1,2) and modeled (dashed line, port 3,4) device’s input and output reflection coefficients. 59

Figure 4.9: Extracted (solid line, port 1,2) and modeled (dashed line, port 3,4) device’s forward and backward transmission coefficients. 59

Figure 4.10: Simulated combiner loss for $\pm 10\%$ LC process variations. 61
Figure 4.11: Simulated gain and PAE for ±10% LC process variations. 61
Figure 4.12: Simulated optimized and conventional combiner loss at peak power for antenna mismatch with $|\Gamma| = -10$ dB. 62
Figure 4.13: Micrograph of the 0.94x0.67 mm$^2$ Doherty power amplifier chip. 62
Figure 4.14: CW measurement of gain and PAE over output power. 63
Figure 4.15: CW measurement of $P_{sat}$, peak and 6 dB back-off PAE vs frequency. 63
Figure 4.16: Measured average PAE and EVM for 800 MHz 64-QAM OFDM. 64
Figure 4.17: Measured AM-PM for 800 MHz 64-QAM OFDM signal with EVM = 5.5% and average output power $P_{out} = 13$ dBm. 64

Figure 5.1: Ideal asymmetric Doherty operation with power ratio of 1:2, in back-off output power mode (a), and in peak output power mode (b). 67
Figure 5.2: Schematic of the asymmetric Doherty PA. $V_{DD,m} = 2.4$ V, $V_{G1M} = 0.22$ V, $V_{G2M} = 1.7$ V, $V_{DD,p} = 4.8$ V, $V_{G1P} = 0$ V, $V_{G2P} = 1.8$ V, $V_{G3P} = 2.9$ V, $V_{G4P} = 4.2$ V. 68
Figure 5.3: Simulated Gain and PAE of 2-stack stage with and without shunt-feedback drain to source capacitance (matching losses not included.) 69
Figure 5.4: Simulated losses in synthesized combiner vs. output power. 69
Figure 5.5: Micrograph of the asymmetric dual input Doherty PA. 70
Figure 5.6: Asymmetric input power splitting profile for the main and peaking amplifiers for $\alpha = 6$ dB. 71
Figure 5.7: Measured PAE and drain efficiency (DE) for asymmetric input power split with $\alpha = 10$ dB at 26 GHz. 72
Figure 5.8: Measured peak PAE, 6 dB, and 8 dB back-off PAE versus frequency for asymmetric input power split with $\alpha = 10$ dB. 72
Figure 5.9: Measured PAE for asymmetric input power split with $\alpha = 8$ dB, $\alpha = 10$ dB, $\alpha = 12$ dB at 26 GHz. 73
Figure 5.10: Measured constellation of 50 MHz 64QAM for asymmetric input power split for single carrier (a), and OFDM (b) signals at 26 GHz. 73
Figure 5.11: Measured gain, PAE and drain efficiency (DE) for symmetric input power split at 26 GHz. 74
Figure 5.12: Measured saturated output power, peak PAE, 6 dB, and 8 dB back-off PAE versus frequency for symmetric input power split. 74

Figure 6.1: T/R combiner network represented as a lossy and reciprocal 2-port network. 79
Figure 6.2: Transmit and receive states for determining the 2-port matrix. Transmit mode (a), general receive mode (b). 79
Figure 6.3: Combiner network represented as a lossless reciprocal 3-port network. 82
Figure 6.4: Representation of the lossless 3-port combiner network with two 2-port networks terminated with a load. 83
Figure 6.5: Schematic of the front-end. $V_{DD,LNA} = 2.4$ V, $V_{G1,LNA} = 0.5$ V, $V_{G2,LNA} = 1.8$ V, $V_{DD,PA} = 4.8$ V, $V_{G1,PA} = 0.25$ V, $V_{G2,PA} = 1.8$ V, $V_{G3,PA} = 2.9$ V, $V_{G4,PA} = 4.2$ V. 84
Figure 6.6: Simulated PA to LNA isolation in the transmit mode at $P_{\text{out}} = 23 \text{ dBm}$ with and without $L_{\text{series}}$. 88
Figure 6.7: Simulated PA to antenna and LNA to antenna losses. 88
Figure 6.8: Simulated PA to LNA isolation and voltage amplitude at the LNA input in the transmit mode at $P_{\text{out}} = 23 \text{ dBm}$. 89
Figure 6.9: Simulated PA to antenna and LNA to antenna losses at 28 GHz versus phase of antenna load reflection coefficient $\Gamma$ with $|\Gamma| = -10 \text{ dB}$. 90
Figure 6.10: Simulated output IMD3 in the transmit mode at $P_{\text{out}} = 23 \text{ dBm}$. 91
Figure 6.11: Simulated transient response in the transmit mode with settled PA. 91
Figure 6.12: Micrograph of the 0.7 x 0.77 mm$^2$ TDD front-end chip. 92
Figure 6.13: Measured (solid lines) and simulated (dashed lines) S-parameters of the LNA. 93
Figure 6.14: Measured IIP3 and input P1dB of the LNA. 93
Figure 6.15: Measured (solid lines) and simulated (dashed lines) NF of the LNA. 94
Figure 6.16: Measured (solid lines) and simulated (dashed lines) S-parameters of the PA. 95
Figure 6.17: Measured (solid lines) and simulated (dashed lines) Gain, PAE, and DE of the PA at 26 GHz. 96
Figure 6.18: Measured EVM and average PAE of the PA at 26 GHz with 64-QAM 800 MHz OFDM signal. 96
Figure 6.19: Measured spectrum of the PA at 26 GHz with 64-QAM 800 MHz OFDM signal at $P_{\text{out}} = 14 \text{ dBm}$, EVM = 5.5%. 97
Figure 6.20: Measured AM-PM of the PA at 26 GHz with 64-QAM 800 MHz OFDM signal at $P_{\text{out}} = 14 \text{ dBm}$, EVM = 5.5%. 98
Figure 7.1: Simulated spectrum of a 5 MHz LTE signal with 45 MHz sampling rate and ACW resolution varying from 4 - 10 bits. 103
Figure 7.2: Block diagram of an FDD transceiver including the proposed feedback receiver and the adaptive RxBN cancellation. 104
Figure 7.3: Measured DPA output spectrum for 22.3 dBm average output power using a 4 MHz 16-QAM OFDM signal. 108
Figure 7.4: Measured pre- and post-cancellation spectra together with thermal noise floor of the main receiver without an information carrying signal. 108
Figure 7.5: Measured pre- and post-cancellation EVM, as well as the main receiver EVM without the presence of RxBN. 109
Figure 7.6: Measured constellation for -80 dBm input power. Pre-cancellation (a), and post-cancellation (b). 109
Figure 8.1: Frequency reconfigurable, mm-wave Si-based power amplifier concept with back-off efficiency improvement technique. 115
LIST OF TABLES

<table>
<thead>
<tr>
<th>Table</th>
<th>Description</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>Table 1.1:</td>
<td>LTE FDD Frequency Band 5</td>
<td>13</td>
</tr>
<tr>
<td>Table 2.1:</td>
<td>Summary of modulation measurements</td>
<td>37</td>
</tr>
<tr>
<td>Table 2.2:</td>
<td>Comparison to Recent cm-wave and mm-wave Doherty PAs.</td>
<td>39</td>
</tr>
<tr>
<td>Table 3.1:</td>
<td>CMOS Linear Power Amplifier Performance Summary</td>
<td>48</td>
</tr>
<tr>
<td>Table 3.2:</td>
<td>Comparison of Reported High Efficiency Single-Ended 28 GHz PAs</td>
<td>51</td>
</tr>
<tr>
<td>Table 4.1:</td>
<td>Load-Pull simulations of the 256$\mu$m wide 2-stack device.</td>
<td>60</td>
</tr>
<tr>
<td>Table 4.2:</td>
<td>Comparison to Recent mm-wave Doherty PAs.</td>
<td>65</td>
</tr>
<tr>
<td>Table 5.1:</td>
<td>Comparison to Recent mm-wave Doherty and Outphasing PAs.</td>
<td>75</td>
</tr>
<tr>
<td>Table 6.1:</td>
<td>Comparison to Recent Ka-band mm-Wave Transceiver Front-ends.</td>
<td>98</td>
</tr>
<tr>
<td>Table 7.1:</td>
<td>Performance Summary of Main and Feedback Receivers</td>
<td>105</td>
</tr>
</tbody>
</table>
ACKNOWLEDGEMENTS

This dissertation would not have been possible without the tremendous support and guidance of my advisor, Dr. Peter Asbeck. I was very lucky to have the opportunity at UCSD to learn so many things from him. He not only shared his technical knowledge and experience and spent countless hours helping me with my research projects, but also guided me in achieving my career goals. I could not imagine having a more supportive, kind, knowledgeable, and inspiring PhD advisor. Dr. Asbeck, thank you for all the selfless efforts and hard work to teach, inspire, and mentor me.

I would also like to thank my dissertation committee: Prof. Gabriel Rebeiz, Prof. Patrick Mercier, Prof. William Hodgkiss, and Prof. Gert Cauwenberghs, for providing valuable and insightful feedback. I am also thankful to Dr. Prasad Gudem for his assistance and encouragement with regard to the receive band noise cancellation project.

A special thanks to all my friends, class-mates, group-mates, and many others I had the pleasure to interact with during my academic journey.

I would like to express my gratitude to my family for their immense support and encouragement throughout my education. They made it possible through great sacrifice, unconditional love, and inspiration.

The material of this dissertation is based on the following papers.

Chapter 2 is mostly a reprint of the material as it appears in N. Rostomyan, J. A. Jayamon, and P. M. Asbeck, “15 GHz Doherty Power Amplifier With RF Predistortion Linearizer in CMOS SOI,” IEEE Trans. Microw. Theory Tech., 2017. This dissertation author was the primary author of this material.

Chapter 3 is mostly a reprint of the material as it appears in N. Rostomyan, M. Özen, and P. Asbeck, “Comparison of pMOS and nMOS 28 GHz high efficiency linear power amplifiers in 45 nm CMOS SOI,” 2018 IEEE Topical Conference on RF/Microwave Power Amplifiers for Radio and Wireless Applications, 2018. This dissertation author was the primary author of this material. Chapter 3 is also, in part, a reprint of the material that has been accepted for publication and will appear in D. Thomas, N. Rostomyan, and P. Asbeck, “A 45% PAE pMOS Power Amplifier for 28GHz Applications in 45 nm SOI,” 2018 IEEE MWSCAS, 2018. The dissertation author was the collaborating author of these materials, and co-authors have approved the use of the material for this dissertation.

Chapter 4 is mostly a reprint of the material as it appears in N. Rostomyan, M. Özen, and P. Asbeck, “28 GHz Doherty Power Amplifier in CMOS SOI With 28% Back-Off PAE,” IEEE Microwave and Wireless Components Letters, pp. 13, 2018. This dissertation author was the primary author of this material.

Chapter 5 is mostly a reprint of the material that has been submitted for publication as it may appear in N. Rostomyan, M. Özen, and P. Asbeck, “A Ka-Band Asymmetric Dual Input CMOS SOI Doherty Power Amplifier with 25 dBm Output Power and High Back-Off Efficiency,” 2019 IEEE Topical Conference on RF/Microwave Power Amplifiers for Radio and Wireless Applications (PAWR), 2019. This dissertation author was the primary author of this material.

Chapter 6 is mostly a reprint of the material that has been submitted for publication as it may appear in N. Rostomyan, M. Özen, and P. Asbeck, “Synthesis Technique for Low Loss Mm-Wave T/R Combiners for TDD Front-Ends,” IEEE Trans. Microw. Theory Tech., 2018. This dissertation author was the primary author of this material.

Chapter 7 is mostly a reprint of the material that has been submitted for publication as it may appear in N. Rostomyan, V. Didi, P. Gudem, and P. Asbeck, “Adaptive Cancellation of Digital Power Amplifier Receive Band Noise for FDD Transceivers,” IEEE Microwave and Wireless Components Letters, 2018. This dissertation author was the primary author of this material.

The dissertation author was the primary or collaborating author of these materials, and co-authors have approved the use of the material for this dissertation.
VITA

2014-2018 Ph. D. in Electrical Engineering (Electronic Circuits and Systems), University of California San Diego, USA

2011-2013 M. Sc. in Electrical, Electronics and Communications Engineering, Technical University of Munich, Germany

2007-2011 B. Sc. in Electrical, Electronics and Communications Engineering, Hochschule Mannheim - University of Applied Sciences, Germany

PUBLICATIONS



ABSTRACT OF THE DISSERTATION

Efficiency Improvement Techniques for Millimeter-Wave Transmitters

by

Narek Rostomyan

Doctor of Philosophy in Electrical Engineering (Electronic Circuits and Systems)

University of California San Diego, 2018

Professor Peter Asbeck, Chair

Strong demand for mm-wave high data-rate links in emerging 5G communication systems has resulted in substantial interest in mm-wave silicon (Si) based radio front-ends. The efficiency of the power amplifier (PA) is a significant factor in the overall power dissipation and thermal management of mm-wave transceivers, which use arrays with a large number of antennas (RF channels). This dissertation focuses mainly on circuit design techniques for improving the efficiency of cm/mm-wave CMOS power amplifiers at frequencies from 15 GHz to 28 GHz. In addition, a DSP-based solution is proposed to increase the efficiency and performance of cellular (LTE band) transmitters in the 1-3 GHz frequency range.

For digital communication signals with multi-carrier modulation and high peak-to-average power ratios (PAPRs), high back-off efficiency of the PA is of significant importance. In the first part of the dissertation, possible implementations of linear and efficiency-enhanced CMOS PAs are described. The concept of stacking multiple FETs is applied in the design of symmetric and asymmetric Doherty power amplifiers, as well as compact linear PAs.

The dissertation demonstrates that high power density can be achieved with PAs based on 4-stack power devices, while 2-stack devices can be designed to have exceptionally high efficiency due to lower losses. A two-stage, high-power Doherty PA that uses 4-stack devices in the final stages is demonstrated in the 15 GHz band with more than 25 dBm output power and 25% power-added efficiency (PAE) at 6 dB back-off. To minimize chip area, the Doherty combiner is based on an optimized, lumped-element 90° phase shifter. To overcome the inherent non-linear gain response of Doherty PAs and to minimize the complexity of digital pre-distortion (DPD) due to large channel bandwidth at mm-wave bands, a simple RF-domain analog pre-distorter is demonstrated for the first time.

Various compact, linear 2-stack PAs based on nMOS and pMOS FETs are demonstrated for saturated output powers in the range of 20 dBm in the Ka-band. Performance and reliability advantages of pMOS-based PAs are shown. Also, by using inter-node impedance tuning with a shunt feedback drain-source capacitor, the PAE of the 2-stack PAs is increased even further, resulting in a world-record 46% PAE for the pMOS PA at 26.5 GHz.

Due to high passive losses in CMOS, achieving high-efficiency Doherty PAs requires careful design and a non-conventional synthesis methodology for the Doherty combiner. A high-efficiency, symmetric Doherty PA for the Ka-band that is based on efficient 2-stack power devices and a low-loss Doherty combiner synthesis technique is presented. At 6 dB back-off, the PAE exceeds 28%, which is 1.4x higher than the PAE achievable with ideal class B back-off from the peak PAE. Such high efficiency is attained due to low combiner losses of 0.5 dB, which is less than half of what can be achieved with a conventional Doherty combiner. Furthermore, an asymmetric Doherty PA is reported that is based on a low-loss output Doherty combiner and uses a 2-stack cell in the main path and a 4-stack cell in the peaking path, thus improving efficiency at more than 6 dB back-off and achieving high output power. In addition, a compact modeling approach for large, parasitic-extracted PA transistors is presented, which considerably reduces simulation time and accelerates the development of CMOS PAs.

A typical time-division duplex (TDD) transmit/receive (T/R) mm-wave front-end comprises a power amplifier, a low noise amplifier (LNA), an antenna switch, and appropriate passive matching and combining networks. In this dissertation, a synthesis methodology is proposed that minimizes the overall losses by combining the PA output and the LNA input matching networks together with the T/R switch into one network. The technique improves mm-wave transceiver performance in terms of PA efficiency and LNA noise figure (NF). The proposed T/R combiner can achieve high linearity and can handle large PA output voltage swings. The architecture can be implemented in any process that provides high integration capability. A Ka-band implementation is demonstrated in CMOS SOI that includes a high-power, 4-stack based PA and an inductively source degenerated, cascode based LNA. Within the front-end, the PA achieves a saturated output power of 23.6 dBm with a peak PAE of 28%, while the LNA achieves an NF of 3.2 dB.

Finally, in frequency division duplex (FDD) systems, spurious emissions from the transmitter (TX) can fall onto the receive (RX) band and lead to significant receiver desensitization. This dissertation proposes a DSP-based solution that relies on a linear auxiliary receiver to cancel the RX band noise from the received signal. This technique allows reduction of duplexer rejection requirements in the RX band and reduction of insertion loss in the TX band, thus resulting in high PA efficiency and a smaller duplexer footprint. PA architectures that inherently have high receive band noise (envelope tracking and digital PAs) can substantially benefit from this technique. More than 22 dB improvement in the signal-to-noise ratio (SNR) is shown without the presence of a desired signal at the antenna.
Chapter 1

Introduction

In recent years, the wireless industry has witnessed accelerated research and development efforts for the fifth generation (5G) wireless communication links to address increasing demand for high data rates. 5G networks promise download speeds of 1-10 Gbps and higher by using mm-wave spectrum, such as the 28 GHz, 39 GHz, and 60 GHz bands. Also, channel capacity will be increased by utilizing multiple antennas to realize phased arrays and MIMO systems [1].

The 5G wireless revolution presents significant challenges to the implementation of the radios inside wireless mobile devices and base-stations. The complexity of such systems and the large number of front-end modules required for phased array MIMO systems can result in very high cost. For 5G to succeed in the mass market, the effective cost of the front-end module per antenna should be low. Therefore, silicon-based technologies (SiGe or CMOS) are the best contenders for 5G radios. Wide modulation bandwidth requirements (i.e., above 800 MHz), high integration levels, complex modulated signals with large peak-to-average power ratios (PAPRs), and excessive heat production demand high efficiency and stringent linearity from the power amplifiers (PAs) in 5G transmitters. Meeting all these requirements with integrated PAs in CMOS or SiGe is a major challenge and an area of continuing research.

In this chapter, a short introduction is provided to the challenges associated with modern radio transceivers. In particular, a brief overview is given to outline limitations and circuit concepts for achieving high efficiency in mm-wave CMOS PAs. Second, implementation difficulties of time division duplex (TDD) and frequency division duplex (FDD) transceivers are described along with possible solutions. The final section describes the scope and organization of the dissertation.

1.1 Efficient mm-Wave CMOS PA Design

According to the well-known Shannon’s channel capacity theorem [1], higher data rates can be achieved by increasing the channel bandwidth (BW) or signal-to-noise ratio (SNR). For sub-6 GHz wireless communication, the channel bandwidth and SNR are limited due to frequency planning and reuse, interference, front-end linearity and health/environmental regulations. Utilization of wireless systems at mm-wave carrier frequencies is motivated by the fact that much larger bandwidth is available at mm-wave bands. Besides the advantage of offering higher data rates, the size of the antenna and the transceiver building blocks, which are typically proportional to the wavelength, are substantially smaller at mm-wave frequencies. However, maintaining high SNR at high frequencies creates additional challenges, as the propagation of electromagnetic waves at mm-wave frequencies experiences more attenuation compared to radio and microwave frequencies.
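The bandwidth argument can be made concrete with a minimal sketch of the Shannon capacity relation, C = B log2(1 + SNR), for an additive white Gaussian noise channel. The function name and the 20 MHz and 20 dB values are illustrative assumptions; only the 800 MHz bandwidth figure appears in this chapter.

```python
import math


def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon capacity C = B * log2(1 + SNR) of an AWGN channel, in bit/s."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


if __name__ == "__main__":
    # Assumed comparison: a 20 MHz sub-6 GHz channel versus an 800 MHz
    # mm-wave channel, both at an assumed 20 dB SNR.
    for bandwidth_hz in (20e6, 800e6):
        capacity_gbps = shannon_capacity_bps(bandwidth_hz, 20.0) / 1e9
        print(f"{bandwidth_hz / 1e6:.0f} MHz at 20 dB SNR: {capacity_gbps:.2f} Gbps")
```

Capacity grows linearly with bandwidth but only logarithmically with SNR, which is why the wide mm-wave bands are attractive even when SNR is harder to maintain.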

Multiple antenna systems provide a solution for achieving higher SNR at mm-wave frequencies. Such systems rely on spatial diversity to transmit or receive signals from multiple transmitters and receivers. Known as multiple input, multiple output (MIMO) communication, such systems rely on statistical independence of each channel. In MIMO systems, each transceiver has a dedicated base-band processing unit (down-converting mixers, A/D converters), and the SNR improvement is realized only in the digital domain. Phased array systems, on the other hand, are a special class of multi-antenna systems that enable electronically controllable beam-forming as well as beam-steering. This allows increasing the array antenna gain and realization of spatially directional links. High spatial selectivity enables a phased array system to mitigate interference at the receiver and allows higher frequency reuse. Also, in a phased array transmitter with \( M \) elements, the total Equivalent Isotropically Radiated Power (EIRP) in the direction of peak array gain increases by a factor of \( M^2 \) over that of a single channel. Thus, for a given power level at the receiver, high antenna gain allows reduction of transmitter output power per channel.

While most current mobile communication PAs are implemented using III-V or SiGe technologies, CMOS PAs are a major contender due to lower fabrication cost, compact size, and the possibility of integration with the digital and mixed-signal building blocks. CMOS solutions may be enabled by the relatively low power levels needed per antenna. In order to achieve a maximum average EIRP of 35 to 65 dBmi, an \( M = 64 \) element phased array transmitter with 3 dB antenna gain (including antenna losses, T/R switch and interconnects) per antenna and 10 dB PAPR (OFDM with QAM) requires the PA to output close to 6 to 36 dBm at P1dB. The output P1dB is reduced to -6.2 to 24 dBm for an \( M = 256 \) element array. Power levels of 15-25 dBm can easily be achieved with CMOS or SiGe based PAs, thus allowing mm-wave phased array systems that require low EIRP and a moderate number of antennas, or high EIRP and a large number of antennas. Handset scenarios, which may require EIRP = 35 dBmi with only 4-6 antennas active in the worst case, may require P1dB of 26-30 dBm and thus remain a challenge.
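The per-element power levels quoted above can be reproduced with a short link-budget sketch. It assumes that the EIRP scales as 20 log10(M) over a single element (the \( M^2 \) factor), that the 3 dB per-antenna gain adds directly in dB, and that the required P1dB equals the average per-element power plus the PAPR. The function name is hypothetical.

```python
import math


def required_p1db_dbm(avg_eirp_dbm: float, num_elements: int,
                      antenna_gain_db: float = 3.0, papr_db: float = 10.0) -> float:
    """Per-element PA P1dB needed to reach a given average array EIRP.

    Assumes EIRP = P_avg + 20*log10(M) + G_ant and P1dB = P_avg + PAPR.
    """
    array_gain_db = 20 * math.log10(num_elements)
    avg_power_per_pa_dbm = avg_eirp_dbm - array_gain_db - antenna_gain_db
    return avg_power_per_pa_dbm + papr_db


if __name__ == "__main__":
    # Array sizes and EIRP targets taken from the text.
    for m in (64, 256):
        low = required_p1db_dbm(35.0, m)
        high = required_p1db_dbm(65.0, m)
        print(f"M = {m}: P1dB from {low:.1f} to {high:.1f} dBm")
    # Handset scenario: 35 dBm EIRP with only 4 to 6 active antennas.
    for m in (4, 6):
        print(f"M = {m}: P1dB = {required_p1db_dbm(35.0, m):.1f} dBm")
```

Under these assumptions the sketch returns values that round to the 6 to 36 dBm, -6.2 to 24 dBm, and 26-30 dBm ranges stated in the text.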

Due to relatively low breakdown voltages of CMOS FETs, the power handling capability of a CMOS PA has traditionally been limited. However, the power levels can be considerably increased by transistor stacking. Recent research in transistor stacking in CMOS SOI has shown that more than 25 dBm peak power can be achieved with correct biasing and capacitive loading of the transistor gates. This allows the output voltage swing at the top transistor to reach \( N \times V_{ds,\text{max}} \), where \( N \) is the number of series transistors, and \( V_{ds,\text{max}} \) is the maximum drain to source voltage allowed on a single FET for reliable operation [2], as shown in Fig. 1.1a for \( N = 4 \). In [3] and [4], a compact unit cell implementation of a 4-stacked device was demonstrated in 45 nm CMOS SOI technology that is based on a four-gate finger, single diffusion FET, together with capacitors implemented with back end of line (BEOL) metallization layers. The multigate layout in GF 45RF SOI technology with 1.2 \(\mu\)m finger width is illustrated in Fig. 1.1b. This multigate cell structure considerably reduces the parasitics of interconnections between the stacked transistors and provides better heat removal mechanisms.

Integration of all critical mm-wave transceiver building blocks in Si-based front-ends poses particular concerns not only for the achievable output power but also for the efficiency of power amplifiers. The efficiency of the PA has a significant impact on the overall power consumption and thermal management of mm-wave transceivers with a large number of antennas. The power added efficiency (PAE) of a PA can be approximated as a product of factors corresponding to loss mechanisms, such as

\[
PAE \propto F_{V_{\text{min}}} \cdot F_{\text{Gain}} \cdot F_{\text{matching}} \cdot F_{\text{waveform}}, \tag{1.1}
\]

where the factor $F_{V_{\text{min}}} \propto (1 - V_{\text{min}}/V_{dd})$ describes the loss in efficiency due to the minimum drain voltage of the device at the peak of the voltage swing (assuming even harmonics shorted). The factor $F_{\text{Gain}} \propto (1 - 1/G)$ corresponds to the PAE degradation associated with finite gain. $F_{\text{matching}} \propto Q_L/(Q_L + Q)$ describes the loss in the output impedance matching network as a function of the quality factors of the impedance transformation $Q$ and the inductor $Q_L$ (assuming capacitor $Q_C \gg Q_L$). The last term $F_{\text{waveform}}$ is a function of the overlapping area between drain voltage and current waveforms, ranging between 0.5 for Class A and 1 for ideal switching mode operation.
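As a rough illustration of how these factors compound, the sketch below evaluates the product with each proportionality treated as an equality, which is a simplifying assumption. All numerical inputs are hypothetical and are not taken from any design in this dissertation.

```python
def pae_factors(v_min: float, v_dd: float, gain_db: float,
                q_inductor: float, q_transform: float, f_waveform: float):
    """Return (F_Vmin, F_Gain, F_matching, F_waveform) as defined in the text.

    The proportionalities in the text are treated as equalities (assumption).
    """
    f_vmin = 1 - v_min / v_dd
    gain_linear = 10 ** (gain_db / 10)  # power gain G as a linear ratio
    f_gain = 1 - 1 / gain_linear
    f_matching = q_inductor / (q_inductor + q_transform)
    return f_vmin, f_gain, f_matching, f_waveform


def pae_estimate(v_min: float, v_dd: float, gain_db: float,
                 q_inductor: float, q_transform: float, f_waveform: float) -> float:
    """Product of the four loss factors."""
    product = 1.0
    for factor in pae_factors(v_min, v_dd, gain_db,
                              q_inductor, q_transform, f_waveform):
        product *= factor
    return product


if __name__ == "__main__":
    # Hypothetical inputs: Vmin = 0.2 V, Vdd = 1.2 V, 10 dB gain,
    # inductor Q_L = 15, transformation Q = 1, F_waveform = 0.785.
    print(f"Estimated PAE: {100 * pae_estimate(0.2, 1.2, 10.0, 15.0, 1.0, 0.785):.1f} %")
```

Because the factors multiply, a moderate penalty in each term is enough to pull the overall PAE well below the ideal waveform-limited value.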

It is apparent from (1.1) that attaining high PA efficiency is not a straightforward task and requires consideration of many trade-offs. Both $V_{\text{min}}$ and $V_{\text{dd}}$ are technology limited, restricting the designer from increasing $F_{V_{\text{min}}}$ beyond a certain limit. The gain factor $F_{\text{Gain}}$ is limited by the power transistor's $f_{\text{max}}$ (on the order of 300 GHz for modern nanometer CMOS) and depends on the transistor technology, layout, and size. This factor decreases quickly as the operation frequency approaches $f_{\text{max}}$. The loss in the output impedance matching network can be minimized by choosing a load resistance close to 50 ohms, and by the availability of thick top copper metals, short interconnects, and a high resistivity substrate. Some of these trade-offs will be addressed throughout the dissertation.

For the above example of a 256 element phased array with average EIRP of 65 dBmi and 10 dB PAPR, the DC power consumption of the PAs will exceed 40 Watts even with a state-of-the-art class AB PA presented in this dissertation with drain efficiency of 15% at 10 dB back-off. It is expected that, in addition to this, as much DC power will be consumed in other building blocks of a mm-wave transceiver, such as LNAs, mixers, phase-shifters, VGAs, and LO generation. Such high DC power consumption poses extreme difficulties for cooling and results in much larger mechanical dimensions and higher cost of a mm-wave communication device. Therefore, optimization of the PA's peak efficiency is not sufficient, and techniques for improving the efficiency at back-off power levels are essential for successful deployment of mm-wave communication for the mass market.
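The 40 W figure can be checked with a short calculation. The sketch assumes the same link budget as the earlier example (array gain of 20 log10(M) and 3 dB gain per antenna) and divides the total average RF power by the drain efficiency at back-off. The function name is hypothetical.

```python
import math


def array_pa_dc_power_w(avg_eirp_dbm: float, num_elements: int,
                        antenna_gain_db: float, drain_efficiency: float) -> float:
    """Total DC power drawn by all PAs of a phased array, in watts.

    Assumes EIRP = P_avg + 20*log10(M) + G_ant for the per-element power.
    """
    avg_per_pa_dbm = (avg_eirp_dbm - 20 * math.log10(num_elements)
                      - antenna_gain_db)
    avg_per_pa_w = 10 ** (avg_per_pa_dbm / 10) / 1000  # dBm to watts
    total_rf_w = num_elements * avg_per_pa_w
    return total_rf_w / drain_efficiency


if __name__ == "__main__":
    # Values from the text: 65 dBm average EIRP, 256 elements,
    # 3 dB antenna gain, 15% drain efficiency at 10 dB back-off.
    dc_power_w = array_pa_dc_power_w(65.0, 256, 3.0, 0.15)
    print(f"PA DC power: {dc_power_w:.1f} W")
```

Each element radiates roughly 14 dBm on average, so the array delivers about 6 W of RF power, and a 15% drain efficiency turns that into a DC consumption slightly above 40 W.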

For RF frequencies below 6 GHz, various back-off efficiency enhancement techniques such as envelope tracking (ET), out-phasing, and Doherty have found successful application in mobile and base-station PAs. At mm-wave frequencies, the large signal bandwidth limits the use of ET and out-phasing. This is because these techniques require a 3-5x bandwidth expansion of the envelope signal relative to the modulated carrier bandwidth, as well as well-controlled time alignment between different signal paths. A Doherty PA, on the other hand, can be more readily implemented at mm-wave frequencies as it relies on a passive load modulation output network. This dissertation puts emphasis on Doherty PAs, and various solutions are presented to address challenges associated with their CMOS mm-wave implementation, such as output combiner loss and non-linear gain response.

Shown in Fig. 1.2 is a conventional Doherty PA which consists of main and peaking amplifiers, input/output matching networks, and λ/4 lines. This architecture was proposed by Doherty in 1936 [5]. It contains an always active main amplifier and a peaking amplifier, which turns on when the input power exceeds a certain threshold, e.g., 6 dB back-off from the peak power if the two amplifiers are identical. The classical implementation uses a class AB main amplifier and a class C peaking amplifier with identical transistors [6]. The Doherty amplifier can also be regarded as a class AB main amplifier with an active, variable output load that is lowered above a certain input power threshold to keep the voltage swing constant and allow the drain current to increase.

Doherty PAs have the inherent advantage of providing higher efficiency at low power levels compared to class AB PAs. When the peaking PA is turned off, the main amplifier operates as a normal class AB PA. However, because it is presented with a higher load at back-off, maximum PAE is reached when the drain voltage swing is approximately twice the supply voltage while the drain current swing is half of the maximum current that the main stage can deliver. For a symmetric Doherty, this happens at 6 dB back-off from the total peak output power, just before the peaking amplifier starts to turn on. Further increase of the input signal causes the peaking amplifier to turn on and deliver power. As the main amplifier creates a large voltage swing at the drain of the peaking amplifier, the peaking amplifier is presented with a high output load impedance, thus allowing it to operate with good efficiency when it turns on. At peak power, both amplifiers are presented with equal loads, which allows them to reach peak efficiency and deliver maximum power. Ideally, the peaking amplifier must compensate for the non-linear input/output power relation due to the load modulation effect.
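The back-off behavior described above can be illustrated with the textbook efficiency expression for an idealized symmetric Doherty PA, in which both amplifiers behave as lossless class B stages. This is an idealization, not a model of the CMOS amplifiers in this dissertation. Here `v_norm` is the output voltage amplitude normalized to its peak value, so 6 dB back-off corresponds to `v_norm = 0.5`.

```python
import math


def class_b_efficiency(v_norm: float) -> float:
    """Ideal class B drain efficiency versus normalized output voltage."""
    return (math.pi / 4) * v_norm


def ideal_doherty_efficiency(v_norm: float) -> float:
    """Ideal symmetric Doherty drain efficiency versus normalized output voltage.

    Below v_norm = 0.5 only the main amplifier is active and it sees twice
    the load, so its efficiency rises twice as fast as plain class B.
    Above 0.5 both amplifiers draw DC current, proportional to (3*v_norm - 1).
    """
    if not 0 < v_norm <= 1:
        raise ValueError("v_norm must lie in (0, 1]")
    if v_norm <= 0.5:
        return (math.pi / 2) * v_norm
    return (math.pi / 2) * v_norm ** 2 / (3 * v_norm - 1)


if __name__ == "__main__":
    for backoff_db in (0.0, 3.0, 6.0, 10.0):
        v = 10 ** (-backoff_db / 20)
        print(f"{backoff_db:4.1f} dB back-off: "
              f"class B {100 * class_b_efficiency(v):5.1f} %, "
              f"Doherty {100 * ideal_doherty_efficiency(v):5.1f} %")
```

In this idealization the Doherty efficiency equals the class B peak value of π/4 both at full power and at `v_norm = 0.5`, where a plain class B stage has already dropped to half of that.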

CMOS realizations of Doherty PAs at mm-wave frequencies suffer from high losses of the output network. For example, at 28 GHz the total output loss from the drain node of the PA device to the antenna load usually exceeds 1.5 dB (71%). Such high losses reduce the PAE improvements arising from having a Doherty PA, as the corresponding loss of a one stage output LC matching network of a linear PA is on the order of 0.8 dB (83%). Thus, more than 15% relative improvement in efficiency is necessary from a Doherty load modulation technique to compensate for the efficiency reduction due to the higher output network loss compared to a class AB PA.
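The percentages in parentheses follow from converting insertion loss in dB to a power transfer efficiency. A minimal sketch of that conversion, and of the relative improvement needed to break even, is given below. The function name is hypothetical.

```python
def loss_db_to_efficiency(loss_db: float) -> float:
    """Fraction of power delivered through a network with the given loss in dB."""
    return 10 ** (-loss_db / 10)


if __name__ == "__main__":
    doherty_network = loss_db_to_efficiency(1.5)   # Doherty output network
    class_ab_network = loss_db_to_efficiency(0.8)  # one stage LC match
    # Relative efficiency gain the Doherty must provide just to break even.
    break_even_gain = class_ab_network / doherty_network - 1
    print(f"1.5 dB loss -> {100 * doherty_network:.0f} % delivered")
    print(f"0.8 dB loss -> {100 * class_ab_network:.0f} % delivered")
    print(f"Break-even relative improvement: {100 * break_even_gain:.1f} %")
```

The 0.7 dB difference between the two networks corresponds to a break-even improvement of about 17%, consistent with the "more than 15%" stated above.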

A technique that significantly reduces the Doherty output combiner loss has been proposed in [7]. This combiner synthesis methodology combines the output matching networks of Fig. 1.2 together with the $\lambda/4$ lines into one single 2-port network, as shown in Fig. 1.3a. The realization of the 2-port network requires load-pull impedance data at the peak and back-off levels, which can be obtained either by simulations or measurements. The 2-port network contains the load (antenna) inside and can be converted into a cascaded combination of main and auxiliary lossless 2-port networks $T_{2p,m}$ and $T_{2p,a}$ as well as load impedance $Z_L$, as illustrated in Fig. 1.3b. Afterwards, the $T_{2p,m}$ and $T_{2p,a}$ networks are converted into $Y$ (Fig. 1.3c) or $\pi$ (Fig. 1.3d) network representations. It will be shown later in chapter 4 that the total output combiner loss can be more than halved, thus enabling high efficiency mm-wave Doherty PA implementations.

Integrated mm-wave Doherty PAs also face challenges associated with non-linearities that arise from load modulation. The lower gain of the class C peaking amplifier results in a non-linear input/output power response. This can be mitigated by adding a gain stage in the peaking path at the expense of efficiency. Also, the input and output phase-shifting networks limit the bandwidth and create unwanted memory effects for wideband modulated signals.

1.2 Challenges in Mm-Wave TDD Transceiver Front-Ends

In time division duplex systems, the transmitter and the receiver operate at different time instances, during which the same antenna is used either for transmitting or receiving. Such transceivers usually require a T/R switch between the PA, low noise amplifier (LNA) and the antenna. Shown in Fig. 1.4 is an example of a single channel, analog beam-forming TDD transceiver. The switch losses can have a significant effect on the overall performance of the front-end, as they can reduce the PA efficiency and output power, as well as increase the noise figure of the receiver.

Figure 1.4: An example of a single channel of an analog beam-forming TDD transceiver.

In a traditional TDD front-end, the PA output and the LNA input are matched to 50 Ω, followed by the T/R switch, which is also matched to 50 Ω at all three of its ports. The loss of a single stage LC matching network in CMOS at Ka-band is usually on the order of 0.8 dB. Together with the T/R switch, the overall losses at the PA output and the LNA input are usually higher than 1.5 dB.

At high frequencies, achieving low insertion loss with single-pole-double-throw (SPDT) switches becomes challenging, since the “on” state channel resistance of transistors increases with frequency. Reducing the “on” resistance by means of a larger transistor increases parasitic capacitance and requires an additional inductive compensation network at each port of the SPDT switch, thus increasing losses. Several implementations of CMOS SPDT switches with inductive compensation have been shown in the literature that are similar to the implementation example shown in Fig. 1.5a. It has been demonstrated in [8] that up to 1.4 dB insertion loss (IL) and more than 30 dB isolation can be achieved at Ka-band in 45 nm CMOS SOI technology.

Various other implementations of mm-wave T/R switches that deviate from the classical SPDT switch architecture have also been recently demonstrated in the context of 5G phased arrays. As illustrated in Fig. 1.5b, such T/R switch architectures rely on transmission lines for isolation of the LNA in the transmit mode. [9] demonstrated a single-ended Ka-band T/R switch based on lumped element $\lambda/4$ lines and SiGe HBT transistors in reverse saturation mode, which was earlier proposed in [10]. The switch exhibits about 1.5 dB IL and 19 dB isolation at 28 GHz. Another implementation of the T/R switch in [11] suggested eliminating the $\lambda/4$ line from the PA path in order to increase its efficiency. However, elimination of the $\lambda/4$ line necessitated a switchable capacitive bank at the PA output in order to provide an RF open in the RX mode. Besides, as both solutions use a single shunt transistor switch directly at the PA output, higher power PA implementations with large voltage swings can damage the switch transistors. Thus, both of these solutions will require switches with stacked transistors in order to accommodate high power PAs, which will result in higher insertion loss. Alternatively, [12] has shown a T/R switch implementation that uses a single shunt transistor at the LNA input. However, this solution does not provide a general method for simultaneously matching the PA output for high efficiency and the LNA input for minimum noise figure. This dissertation demonstrates a T/R combiner synthesis methodology that optimizes the losses by combining the PA output and the LNA input matching networks together with the T/R switch into one network. Additionally, this technique reduces chip area by minimizing the number of passive components. The synthesis of the network is based on the desired PA output load impedance from load-pull simulations and the optimum source impedance for minimum noise figure for the LNA.

Figure 1.5: Examples of an SPDT T/R switch with inductive compensation (a), and a transmission line based T/R switch (b).

1.3 Challenges in RF FDD Transceiver Front-Ends

Highly reconfigurable and multi-standard radio blocks have attracted considerable research interest to overcome the problems of overcrowded RF frequency bands and increased demand for lower cost, fully integrated radio systems. A promising approach to achieve high efficiency, high integration, and wideband operation is based on digitally modulated power amplifiers (DPAs), which function as RF power digital-to-analog converters. These circuits not only allow frequency agnostic PA designs, but also provide digital modulation and output power control. They facilitate efficiency enhancement techniques such as polar and Doherty techniques [13–16]. However, due to their clocked nature, DPAs suffer from a high level of out-of-band quantization noise, which makes their use in frequency division duplex (FDD) systems challenging.

In frequency division duplex (FDD) systems widely used for current cellular communication, the transmitter and the receiver operate at the same time but at different center frequencies, as shown in Fig. 1.6. The transmit and receive bands are usually closely spaced, as for example in LTE band 5 listed in Table 1.1, such that undesired spurious emissions from the transmitter can limit the performance of such systems. In particular, spurious emissions from the transmitter in the receive band, commonly referred to as receive band noise (RxBN), are filtered by a duplexer. Currently, in order to minimize degradation of receiver sensitivity, it is required that the RxBN power spectral density at the input of the receiver LNA be kept below -180 dBm/Hz. To achieve such a low RxBN floor (below $kT$), the out-of-band noise floor at the PA output is usually required to be below -130 dBm/Hz, and the duplexer is required to have large TX-RX isolation (> 50 dB).
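The noise budget above can be sketched numerically. The example assumes a thermal noise density $kT$ of about -174 dBm/Hz at room temperature, takes -130 dBm/Hz at the PA output and 50 dB of duplexer isolation as representative values, and uses a simple power-sum model for desensitization. The function names and the 3 dB receiver noise figure are illustrative assumptions.

```python
import math

KT_DBM_PER_HZ = -174.0  # thermal noise density at about 290 K


def rxbn_at_lna_dbm_hz(pa_noise_dbm_hz: float, tx_rx_isolation_db: float) -> float:
    """Receive band noise density at the LNA input after duplexer isolation."""
    return pa_noise_dbm_hz - tx_rx_isolation_db


def desensitization_db(rxbn_dbm_hz: float, rx_noise_figure_db: float = 0.0) -> float:
    """Rise of the receiver noise floor caused by RxBN (power-sum model)."""
    rx_floor_dbm_hz = KT_DBM_PER_HZ + rx_noise_figure_db
    noise_ratio = 10 ** ((rxbn_dbm_hz - rx_floor_dbm_hz) / 10)
    return 10 * math.log10(1 + noise_ratio)


if __name__ == "__main__":
    rxbn = rxbn_at_lna_dbm_hz(-130.0, 50.0)
    print(f"RxBN at LNA input: {rxbn:.0f} dBm/Hz")
    print(f"Desensitization vs. kT: {desensitization_db(rxbn):.2f} dB")
    # Assumed 3 dB receiver noise figure.
    print(f"Desensitization with 3 dB NF: {desensitization_db(rxbn, 3.0):.2f} dB")
```

With the RxBN 6 dB below $kT$, the noise floor rises by roughly 1 dB relative to an ideal noiseless receiver, and by less once the receiver's own noise figure is included.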

Given the close frequency spacing (tens of MHz) between transmit and receive bands, the use of high-Q resonators in duplexers is necessary, resulting in high insertion loss ($\approx 3$ to $4$ dB). Besides, the duplexers require large PCB area, usually on the order of $2 \times 2$ mm$^2$ per component.

As each band requires a separate duplexer, enabling multi-band operation results in a large number of duplexers, increasing the size and the cost of a cellular communication device. Significant improvements can be achieved if the duplexer rejection in the receive band is relaxed, for example, to achieve 1 dB lower insertion loss of the duplexer in the transmit band. The benefits of 1 dB lower insertion loss can be appreciated by recognizing that it leads to more than 25% reduction in power consumption of the PA (assuming 24 dBm average output power at the antenna port), nearly equivalent to the benefits of alternative efficiency enhancement techniques such as envelope tracking (ET). Furthermore, reduction of RxBN in the receiver will allow the use of more advanced PA architectures, such as DPAs, which have inherently high quantization noise ($> -120 \text{dBm/Hz}$) and at present fail the RxBN specs of current cellular systems.

1.4 Dissertation Scope

Si-based mm-wave front-ends have the potential to integrate all critical transceiver building blocks for implementing low-cost mm-wave communication systems, such as 5G and satellite communication. The achievable power and efficiency of power amplifiers in Si is of particular concern as the PAs contribute the most to the overall power consumption of mm-wave transceivers with arrays having a large number of antennas. The DC power consumption of high EIRP Si-based mm-wave transceivers with current state-of-the-art PAs can exceed tens of Watts, requiring active cooling, which incurs higher cost, larger mechanical dimensions, and reduced reliability. Thus, efficiency improvement of Si PAs and of overall TDD and FDD front-ends is of significant importance to allow affordable and compact mm-wave communication systems. This dissertation demonstrates a number of techniques to significantly increase the efficiency of Si-based radio front-ends. The main focus of this work is on mm-wave transceivers.

First, as the PA is the major contributor to the overall power dissipation in mm-wave transceivers, various techniques for improving CMOS PA efficiency both at the peak and at back-off (Doherty) are studied. The main bands of interest are the 15 and 28 GHz bands.

Second, as the designer has full control over each building block of a mm-wave transceiver, co-design and optimization of the PA, LNA and the T/R switch cannot be neglected. A new combiner synthesis methodology is proposed to combine matching networks and a T/R switch into one compact and low loss network.

Third, the issue of receiver desensitization by the transmitter in an FDD system for cellular communication is studied. An adaptive, DSP based technique is proposed based on an auxiliary receiver to cancel out receive band noise from the received signal in the digital domain.

1.5 Dissertation Organization

In this chapter, important challenges and limitations for achieving high efficiency in mm-wave power amplifiers and TDD front-ends, as well as RF FDD systems have been reviewed.

Chapter 2 describes a 15 GHz fully integrated symmetric Doherty PA. The PA is realized in 45 nm SOI CMOS technology. Both the main and the peaking amplifier branches consist of two-stage amplifiers, which allows higher gain and flexible control over the turn-on characteristics of the peaking PA. The driver stages consist of 2-stack amplifiers, while the final stages are implemented using 4-stack multigate cells to achieve high power. Both the input and output combiners were optimized for minimum area and loss. The PA achieves more than 25.7 dBm saturated output power and peak PAE of 31.2%. PAE at 6 dB back-off is 25%, which is more than 64% higher than for an ideal class B PA roll-off. The PA features the highest power and efficiency of reported high performance integrated silicon Doherty PAs. A simple analog linearizer is also proposed that performs Doherty gain correction in the RF domain. The linearizer effectively flattens the overall gain and extends the output P1dB of the amplifier from 23 dBm to 25.1 dBm without much penalty on the PAE. The performance of the linearized Doherty PA has been verified with 200 MHz single carrier 16-QAM and 64-QAM signals.

While the 15 GHz band has a lot of promise, various international organizations have instead agreed to allocate a portion of the 28 GHz, 39 GHz and 60 GHz bands for 5G communication. In order to demonstrate achievable efficiencies in CMOS for the Ka-band, chapter 3 presents high efficiency, one stage, mm-wave power amplifiers based on nMOS and pMOS transistors in IBM and later GlobalFoundries 45 nm CMOS SOI. The amplifiers are arranged in a 2-stack configuration to increase the output voltage swing. Preliminary reliability tests have also been conducted to demonstrate greater voltage handling capability of pMOS devices. The pMOS PA achieves world record PAE up to 46% and 19.5 dBm saturated output power, while the nMOS PA sustains 40% PAE with close to 19 dBm saturated power. These compact PAs occupy only 0.18 mm² and can be useful as standalone amplifiers or as components of more complex architectures such as Doherty or out-phasing for 5G transceivers. The use of pMOS provides the potential for increased robustness to hot carrier injection effects.

Linear power amplifiers have the inherent disadvantage of achieving high PAE only at close to saturation and exhibit a rapid drop in PAE at back-off. As modern communication signals exhibit high PAPRs, efficiency enhancement at back-off power levels is even more important than the peak performance. A high efficiency, linear mm-wave Doherty PA in CMOS that uses a novel low-loss combiner is demonstrated in chapter 4. In addition, a compact modeling approach for CMOS PAs is demonstrated that considerably reduces simulation times. With more than 22 dBm saturated power, 40% peak PAE, and 28% at 6 dB back-off, the PA features the highest peak and 6 dB back-off PAE among silicon Doherty PAs.

As back-off efficiency improvement is desirable for more than 6 dB back-off, a high efficiency, dual input, asymmetric, mm-wave Doherty PA in 45 nm CMOS SOI that uses a novel low-loss combiner is demonstrated in chapter 5. The main Doherty path uses a high efficiency 2-stack amplifier with a shunt feedback drain-source capacitance. The peaking path uses a high power 4-stack amplifier to achieve more than 6 dB back-off efficiency improvement. With dual, asymmetric input drive, the PA is able to output 25 dBm saturated power with 31% peak PAE as well as 34% 6 dB back-off PAE, which constitutes the highest peak power and back-off efficiency of any Si-based Doherty PA in the Ka-band to date.

The overall mm-wave transmitter front-end efficiency not only depends on the PA efficiency, but also on the losses that arise in the interface between the PA and the antenna. High integration capability of CMOS is utilized to develop a TDD T/R combiner synthesis methodology in chapter 6 that optimizes the losses by combining the PA output and the LNA input matching networks together with the T/R switch into one network. A front-end implementation that includes a high power 4-stack PA, an inductively source degenerated, cascode LNA and the proposed T/R switch combiner is also demonstrated in 45 nm CMOS SOI technology. The front-end achieves state-of-the-art performance both in the transmit and receive modes. The PA inside the front-end produces saturated output power of 23.6 dBm with peak PAE of 28%, while maintaining LNA noise figure of 3.2 dB.

In addition, as efficiency and multi-band operation are of high importance for current LTE cellular transmitters, the final chapter presents an adaptive filter based, digital cancellation technique for mitigating stochastic noise, in particular the quantization noise of digital power amplifiers. The cancellation technique uses an additional feedback receiver to capture the receive band noise at the output of the PA. The hardware is realized with off-the-shelf components for LTE band 5 to demonstrate the effectiveness of the technique. Cancellation results are presented both with and without the presence of a desired signal at the main receiver. It is shown for the first time that the quantization noise of a digital PA can be reduced below -180 dBm/Hz at the receiver. The cancellation technique enables higher efficiency PAs driven with stronger DPD along with less demanding design requirements for the duplexer, which facilitates their use in an ever increasing number of bands in FDD systems.
Chapter 2

15 GHz Doherty Power Amplifier with RF Predistortion Linearizer

2.1 Introduction

While the key requirements and standards for 5G communication systems are being actively developed, they are expected to provide considerably higher data rates, very low latency as well as more reliable radio links [17]. To achieve these objectives, 5G communication systems will utilize higher frequency bands to increase the channel capacity. Also, by using a large number of antennas [18], deployment of multiple-input-multiple-output (MIMO) architectures will be possible. Beam-forming and spatial multiplexing will be enabled to provide adequate coverage and higher data rates. Also, active research and field tests have been performed for characterizing channel propagation in cm-wave and mm-wave bands, such as 15 GHz [19], [20], 28 GHz [21], and 70 GHz [22] bands.

The technique of the 4-stack multigate-cell has already been used to implement a high output power and high gain tuned class AB power amplifier in the 15 GHz band [23], where more than 25 dBm saturated output power and 32.4% peak power added efficiency (PAE) could be demonstrated. However, for modern communication signals with high peak to average power ratio (PAPR), back-off efficiency enhancement of the PA is of high importance. This work presents a 15 GHz two stage, high output power symmetric Doherty PA that is based on a classic load modulation output network with a lumped $90^\circ$ phase shifter. The PA demonstrates more than 23 dB gain and more than 25.7 dBm saturated output power. Peak PAE is more than 31\% and 6 dB back-off efficiency of 25\% can be achieved. A simple RF predistortion linearizer network based on an envelope detector and an adaptive shunt loss element is also presented. The linearizer is able to considerably improve gain flatness of the Doherty PA. Measurements with 200 MHz 16-QAM and 64-QAM modulated signals have demonstrated that the linearized PA produces 20.6 dBm average output power and 21.8\% PAE for 16-QAM with 9.5\% error vector magnitude (EVM), and 16.4 dBm average output power and 15.2\% PAE for 64-QAM with 5.5\% EVM. The PA has a compact form-factor and can be confined within a 1 mm$^2$ chip area. To the authors' knowledge, the performance of the two stage Doherty PA presented in this work features the highest power and efficiency of an integrated silicon Doherty PA reported to date. Similar results were attained only in GaAs [24].

This chapter is organized as follows. In Section 2.2, the implementation of driver stages and high power output stages is introduced. Considerations for load-pulling of a stacked device are discussed. The input and output combiner network implementations are also presented. Section 2.3 presents the proposed RF predistortion network. Section 2.4 covers the experimental results for small-signal, continuous wave (CW), and modulation measurements. Finally, conclusions are given in Section 2.5.

2.2 Doherty PA Implementation

The design of the integrated, two stage CMOS Doherty PA with the analog predistortion linearizer is shown in Fig. 2.1. In this section, the general design approach for the realization of the high power final stage and the driver stage is first presented, followed by considerations for load modulation of stacked devices. Subsequently, compact, low loss input and output combiners based on classic lumped element 90° phase shifters are discussed.

### 2.2.1 High Power Final Stage and Driver Stage

In a CMOS SOI process, high output power can be achieved by transistor stacking as shown in Fig. 2.2a. This allows voltage swings on each transistor to add up and result in a high voltage swing at the output. In recent works, it has been demonstrated that more than 25 dBm can be achieved by using the multigate unit cell approach [3], [23]. Each multigate unit consists of four stacked FETs, each implemented with a single source and drain, together with four gate fingers which are $w_g$ wide. The contacts to inter-finger source and drain regions are removed, which produces a significant reduction in parasitic capacitance and resistance. Each of the four gate fingers in a unit cell is connected to a gate capacitor of appropriate size, which allows a finite voltage swing at the gate (Fig. 2.2a). The values of the gate capacitors are selected to guarantee equal drain-to-source voltage swings and to limit the drain-to-gate swings on each stacked transistor. In the unit cell, these capacitors are realized as Metal-Oxide-Metal (MOM) capacitors which are designed around the transistor using the available metallization layers. A complete analysis of the multigate cell can be found in [4].

An additional advantage of the multigate unit cell approach is its scalability. The cells can be arranged in an array of $M$ elements to achieve desired device width of $M \cdot w_g$. This allows a compact high power amplifier realization. The maximum number of elements $M$ is, however, limited due to the difficulty of ensuring phase coherence between cells that are spaced far apart.

In this chapter, both the main path and the peaking path of the Doherty PA are realized with two stage amplifiers. The first stage (driver) consists of a 256 $\mu$m wide 2-stack amplifier shown in Fig. 2.2b. The top device gate is terminated with a finite capacitance of 380 fF which results in a non-zero voltage swing at the gate. The 2-stack structure resembles the traditional cascode arrangement. However, the top device gate of the 2-stack amplifier is not at RF ground as in a cascode, hence the drain-to-gate swing, which is usually the limiting factor for the cascode voltage, is reduced. In this arrangement, the overall output voltage swing can be higher than each transistor’s breakdown voltage (BV), because the voltage can be equally distributed across each transistor. This stage can achieve saturated output power of \( P_{\text{sat}} = 20\,\text{dBm} \).

Figure 2.2: Schematics of the 4-stack final stage (a) and 2-stack driver stage (b).

The final, high power stage is realized using 256 multigate unit cells. Each cell is \( w_g = 1.2 \mu\text{m} \) wide, so that the resultant total device width is \( 256 \times 1.2 \mu\text{m} = 307.2 \mu\text{m} \). Minimum length, double pitch devices were used both for the 2-stack and the multigate cell 4-stack. The main and peaking devices are of the same size. Inter-stage matching between the driver and the final high power stage employs a second order matching network.

### 2.2.2 Load Modulation of 4-Stack Devices

The high allowable voltage swing of the 4-stack multigate device increases the optimum output load impedance, and with the present device the load can be designed to be close to 50\( \Omega \). From load-pull simulations of the high power output stage with 256 multigate unit cells, the optimum load impedance is 35\( \Omega \). This allows realization of an efficient and wide-band output matching network to 50\( \Omega \).

As already mentioned, each transistor in a stacked device will ideally have an equal voltage swing, controlled by the selection of gate capacitors. For a conventional class AB power amplifier, the values of these capacitors can be optimized for maximum output power, which means that the drain-to-source voltage swing of each transistor will be maximized within the allowed safe limits to avoid breakdown or excessive degradation. In a classical Doherty power amplifier, load modulation does not affect the voltage swing at the output of the power device. However, highly scaled devices usually demonstrate very non-linear I-V curves (as shown in Fig. 2.3 for an extracted 32 µm device), which can result in increased voltage swing during load modulation. Thus, during the design of a stacked device for a Doherty amplifier, it is important to ensure that the voltage swings do not exceed their limits at the back-off load impedance. For the power device with 256 multigate unit cells, the simulated voltage swings at each transistor’s drain (Fig. 2.2a) are illustrated in Fig. 2.4 and Fig. 2.5 for the peak load impedance of 35 Ω and the back-off impedance of 70 Ω, respectively. It can be observed that the voltage swings on the transistors slightly increase at back-off but they are still within experimentally established safe limits of reliability.

**Figure 2.3:** Simulated IV-curves of an extracted 32\( \mu\text{m} \) wide NMOS device.

Figure 2.5: Simulated voltage waveforms of a 4-stack device with 4.8 V supply voltage and $R_{\text{Load}} = 70 \, \Omega$.

Figure 2.6: Simulated 4-stack device’s PAE vs. back-off for $R_L = 35 \, \Omega$ and $R_L = 70 \, \Omega$.

Another important aspect is the achievable back-off power level. In a classical symmetric Doherty, load modulation of the main amplifier from $R_{\text{Load}}$ to $2R_{\text{Load}}$ should result in a peak PAE at a -3 dB reduction in output power. This is however not the case if the I-V curves of the transistors in the triode region are non-linear. In Fig. 2.6, PAE simulations of the extracted 256 multigate cells are shown vs. back-off for output load impedances of 35 Ω and 70 Ω. The back-off PAE peaks at about -2 dB. Due to this fact, the second efficiency peak of the complete symmetric Doherty amplifier is expected to be at a -5 dB rather than the conventional -6 dB back-off power level.
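The shift of the second efficiency peak can be checked with a short back-of-envelope relation. This is only a sketch under idealized assumptions: both amplifiers contribute equally at peak power, and the peaking amplifier is fully off at the back-off point. The symbols $P_{\max}$ (peak power of one amplifier), $P_{\text{main,bo}}$ (main amplifier power at the efficiency peak with the $2R_{\text{Load}}$ load) and $\Delta$ are introduced here for illustration and do not appear in the original analysis.

$$
\text{BO}_{\text{Doherty}} = 10\log_{10}\frac{2P_{\max}}{P_{\text{main,bo}}} \approx 3\,\text{dB} + \Delta,
\qquad
\Delta = 10\log_{10}\frac{P_{\max}}{P_{\text{main,bo}}}
$$

For ideal devices $\Delta = 3$ dB, which gives the conventional 6 dB back-off. With the simulated $\Delta \approx 2$ dB of Fig. 2.6, the same relation gives roughly 5 dB.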

### 2.2.3 Realization of Input and Output Combiners

Low loss and compact realization of the Doherty amplifier input and output combiners is important for optimizing efficiency and cost. In a classical Doherty amplifier, λ/4 transmission lines are used to realize impedance inversion at the output, and phase matching at the input. However, at mm-wave frequencies up to 60 GHz, the λ/4 transmission lines are very long and unsuitable for cost- and area-effective integration in a silicon process.

A commonly used solution is to realize a transmission line with a π- or T-network lumped element approximation. The lumped element network can be designed to have either a high-pass or a low-pass response. Among these combinations, high-pass π-networks are advantageous in terms of stability and reduced area.

The output combiner network that was used in this work is shown in Fig. 2.7. Here, inductors $L_1$ and $L_4$ tune out the output capacitance of the main and peaking amplifiers, which are realized as 256 multigate unit cells. The high-pass π network that provides $+90^\circ$ phase shift is implemented with inductors which can be combined with the tuning inductors; the parallel combination of them considerably reduces the overall size. As a result, the output combiner consists of only two inductors ($L_1||L_2$ and $L_3||L_4$) and a capacitor $C_1$. The resultant small inductor values allow space-efficient implementation on silicon.

For an arbitrary characteristic line impedance $Z_0$ and phase shift $\phi$, the values of the inductors and the capacitor of the high-pass lumped element phase shifter can be calculated as

$$L_2 = L_3 = \frac{Z_0 \sin(\phi)}{\omega (1 - \cos(\phi))} \tag{2.1}$$

and

$$C_1 = \frac{1}{\omega Z_0 \sin(\phi)}. \tag{2.2}$$

The line impedance \( Z_0 \) is equal to the desired load impedance \( R_{\text{Load}} \) at the output of the amplifier (35 \( \Omega \) in this case).
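As a numerical illustration of Eqs. (2.1) and (2.2), the following sketch evaluates the ideal element values for $Z_0 = 35\,\Omega$ and $\phi = 90^\circ$. The 15 GHz design frequency is taken from the operating frequency of the PA, and the function name is hypothetical. The values are for the stand-alone phase shifter only, before $L_2$ and $L_3$ are merged with the tuning inductors $L_1$ and $L_4$, whose values are not given in the text.

```python
import math

def highpass_pi_phase_shifter(z0_ohm, phi_deg, freq_hz):
    """Ideal shunt inductance and series capacitance of a high-pass
    pi-network phase shifter, following Eqs. (2.1) and (2.2)."""
    omega = 2.0 * math.pi * freq_hz
    phi = math.radians(phi_deg)
    l_shunt = z0_ohm * math.sin(phi) / (omega * (1.0 - math.cos(phi)))  # Eq. (2.1)
    c_series = 1.0 / (omega * z0_ohm * math.sin(phi))                   # Eq. (2.2)
    return l_shunt, c_series

# Assumed operating point: Z0 = 35 ohm, +90 degree shift, 15 GHz.
l2, c1 = highpass_pi_phase_shifter(35.0, 90.0, 15e9)
print(f"L2 = L3 = {l2 * 1e12:.0f} pH, C1 = {c1 * 1e15:.0f} fF")
```

Under these assumptions the inductors come out at roughly 0.37 nH and the capacitor at roughly 0.30 pF, which is in line with the remark that small inductor values allow a space-efficient implementation.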

The input power splitter network is illustrated in Fig. 2.8. It also relies on a high-pass \( \pi \)-network with +90° phase shift in the peaking path in order to compensate for the phase shift in the output combiner. Similar to the output combiner, the high-pass \( \pi \)-network does not increase the number of inductors and can be merged with the input matching L-C networks.

### 2.3 Analog Predistortion

Nonlinear behaviors of various system components within an RF front-end can distort the transmitted signal and result in EVM degradation in-band, and adjacent channel power leakage (ACPR) in neighboring frequency bands. The creation of these spurious output signals from inputs with varying envelope and high PAPR imposes stringent requirements on the linearity of the PA. The availability of low cost signal processing power has made digital pre-distortion (DPD) quite useful to counter PA nonlinearity effects. While DPD is widely deployed for signals with bandwidth below 100 MHz, the complexity and the power consumption of a DPD system limit its use in mm-wave systems consisting of an array of many PAs with bandwidths of several hundred MHz to a few GHz. In this section, a simple RF (analog) predistortion network that is effective for high gain Doherty amplifiers is presented.

### 2.3.1 Analog Predistortion Architectures

A variety of predistortion circuits in the analog domain (APD) have been proposed and implemented [25]. Mitigation of PA nonlinearities by feedback is extensively used in analog circuits. Mm-wave power devices, however, have relatively low gain, hence only limited amounts of feedback can be applied to each stage in order to not reduce efficiency, so that the effect on distortion is correspondingly small. More gain can be sacrificed if feedback is applied around a multistage power amplifier but the long feedback loop may produce instability and introduce considerable delays between the forward and feedback paths. These challenges bound the application of feedback to low bandwidth systems.

Another PA linearization technique is feedforward, which does not reduce the gain of the amplifier and does not cause instability. While the level of correction with this technique can be significant, the complexity of the system increases cost and chip area as it requires power splitters, combiners, couplers, and phase shifters.

A simple form of linearization is RF predistortion, where the nonlinear predistorting element operates at carrier frequency. An element is used whose distortion characteristics are the inverse of the distortion characteristics of the PA. Because of the implementation simplicity and the possibility to linearize large signal bandwidth, RF predistorters present a viable solution to the mm-wave PA nonlinearity problem. The form of the amplifier gain characteristics is critical to the degree of achievable linearity improvement. Different networks have been proposed that attempt to correct various types of PA characteristics. The most straightforward and widely explored networks strive to predistort the third or fifth order nonlinearities [26], [27]. These predistorters may, however, increase the amount of higher order distortion products. Other networks attempt to accurately fit the inverse transfer characteristics of the PA and thus correct the nonlinearities for a number of orders of distortion [25], [28], [29], [30].

### 2.3.2 Proposed Analog Predistortion Circuit

In this work, we demonstrate an analog predistorter/linearizer that addresses the gain nonlinearity problem of symmetrical Doherty amplifiers. The gain of these PAs is far from flat as a result of load modulation and of the fact that the peaking amplifier is biased in class C and typically has lower gain than the class B biased main amplifier. With the multistage design of this work, the higher gain of the main amplifier can be lowered with an open loop predistorter to match the gain level experienced when the peaking amplifier turns on. The central idea of the proposed APD is illustrated in Fig. 2.9.

Figure 2.9: Proposed open loop analog predistorter (APD) for symmetric Doherty PAs.

The gain of the main amplifier is linearized with an APD element that acts as an adaptive loss element and compensates for the higher gain at back-off. The proposed circuit for the APD element is shown in Fig. 2.10. The circuit consists of an envelope detector ($T_1$) and a shunt NMOS transistor ($T_2$) that acts as an adaptive loss at the input of the main amplifier. The gate voltage of the shunt transistor $T_2$ is proportional to the envelope output voltage, which has flexible swing and dc offset adjustment by means of the gate bias voltage $V_G$ as well as the positive and negative source voltages $V_{DD}$ and $V_{SS}$ of the envelope detector.

Figure 2.10: Proposed analog linearizer schematic. $R_{L1} = 365\,\Omega$, $C_{L1} = 197\,\text{fF}$.

At low RF powers, which correspond to the back-off operation of the main amplifier, the envelope voltage $V_{Env}$ is high and turns on the transistor $T_2$, which redirects some of the RF current to ground. $T_2$ must be sized appropriately to handle the RF current through it. At high RF power, which corresponds to operation with the peaking amplifier on, the node voltage $V_{Env}$ drops and turns off $T_2$, allowing all RF power to go into the main amplifier. By adjusting the bias and supply voltages of the envelope detector, considerable improvement in gain flatness can be achieved. Fig. 2.11 shows simulated gain curves of the main and peaking amplifiers, as well as total gain with and without the linearization circuit. Also shown is the shunt loss curve, defined as the decrease in gain provided by the linearizer along the main amplifier path. It can be seen that the linearized total gain is flat and the P1dB point of the Doherty amplifier can be considerably extended.

Figure 2.11: Simulated total gain without linearization, gains of main and peaking amplifier, total gain with linearization, and APD loss as a function of output power.

The bandwidth of the envelope detector is set by the RC low-pass filter that is formed at the drain of $T_1$. If the $C_{DG}$ parasitic feedback capacitance is ignored, the bandwidth can be approximated as

$$f_{-3\text{dB},\text{ED}} \approx \frac{1}{2\pi \left(R_{L1} \,\|\, r_{O,T1}\right) \left(C_{L1} + C_{DS,T1} + C_{GS,T2}\right)}, \tag{2.3}$$

where $r_{O,T1}$ is the output resistance of $T_1$ due to channel length modulation. Simulated envelope detector conversion gain vs envelope frequency is shown in Fig. 2.12. The $f_{-3\text{dB}}$ bandwidth is 1.95 GHz.
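A rough numerical check of Eq. (2.3) can be made with the component values from the caption of Fig. 2.10. The sketch below is a first-order estimate only: it assumes $r_{O,T1} \gg R_{L1}$, so that the parallel resistance is approximately $R_{L1}$, and it lumps $C_{DS,T1} + C_{GS,T2}$ into a single unknown parasitic term. Neither $r_{O,T1}$ nor the parasitic capacitances are given in the text, so the back-calculated value is illustrative rather than a design number, and the function names are hypothetical.

```python
import math

R_L1 = 365.0     # ohm, from the Fig. 2.10 caption
C_L1 = 197e-15   # farad, from the Fig. 2.10 caption
F_SIM = 1.95e9   # Hz, simulated -3 dB bandwidth quoted in the text

def rc_corner_hz(r_ohm, c_farad):
    """-3 dB corner frequency of a single-pole RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

def total_cap_for_corner(r_ohm, f_hz):
    """Total capacitance that places the RC corner at f_hz."""
    return 1.0 / (2.0 * math.pi * r_ohm * f_hz)

# Corner frequency if only R_L1 and C_L1 are considered
# (assumes r_O much larger than R_L1 and no parasitic capacitance).
f_first_order = rc_corner_hz(R_L1, C_L1)

# Extra capacitance that would reconcile this with the simulated
# bandwidth, under the same r_O assumption.
c_parasitic = total_cap_for_corner(R_L1, F_SIM) - C_L1

print(f"first-order corner: {f_first_order / 1e9:.2f} GHz")
print(f"implied parasitic capacitance: {c_parasitic * 1e15:.1f} fF")
```

With these assumptions, $R_{L1}$ and $C_{L1}$ alone give a corner near 2.2 GHz, and a few tens of femtofarads of additional capacitance at the drain of $T_1$ would account for the simulated 1.95 GHz.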

### 2.4 Experimental Results

The PA was fabricated in the GF 45 nm SOI CMOS process and occupies overall chip area of 1x1 mm$^2$; the RF portion (without pads) occupies a compact area of 0.85x0.52 mm$^2$. The chip micrograph is shown in Fig. 2.13.

The dual stage architecture allows independent control of all amplifiers’ bias voltages, and hence their mode of operation. Throughout the experiments, the best results in terms of back-off efficiency, output power and gain flatness were achieved when both the driver and the final stage of the main path were biased in class-AB mode with gate-to-source bias voltages in the range from 0.22 V to 0.25 V. The driver of the peaking path was operated in class C mode and the final stage in deep class C, with gate-to-source bias voltages of 0.1 V and 0 V, respectively.

### 2.4.1 Small-Signal and CW Measurements

Fig. 2.14 shows measured small-signal S-parameters of the PA. At 15 GHz, the $S_{21}$ gain measures 27 dB with a -3 dB bandwidth from 13.41 to 16.01 GHz, which corresponds to a fractional bandwidth of about 17%.

Fig. 2.15 illustrates measured large signal gain, power added efficiency (PAE), and drain efficiency (DE) together with the theoretical class B PAE roll-off curve at 15 GHz. Large signal measurements were conducted with the gate to source voltage of the driver and the final stage of the main path biased at 0.25 V, and of the peaking path at 0 V. The PA achieves maximum saturated output power $P_{sat} = 25.7$ dBm (370 mW) and a peak $PAE_{max} = 31.2\%$ as well as drain efficiency $DE = 35\%$ at $P_{out} = 25.2$ dBm. It can be seen that the PAE curve exhibits a second peak at 5.4 dB back-off, achieving $PAE_{-5.4dB} = 25.5\%$ at that point. Both the $P_{sat}$ and peak PAE, as well as the 6 dB back-off $PAE_{-6dB}$, demonstrate wide frequency response, shown in Fig. 2.16 and Fig. 2.17, respectively. The 1 dB bandwidth of $P_{sat}$ spans from 13.75 GHz to 16.25 GHz. Also shown in Fig. 2.17 is the theoretical class B 6 dB back-off curve which is based on the peak PAE curve of Fig. 2.16. Compared to a class B PA performance, the two stage Doherty PA achieves more than 64% higher PAE at 6 dB back-off.

The performance of the proposed analog linearizer is shown in Fig. 2.18. The best linearity is achieved by biasing the gate-to-source voltage of the driver and the final stage of the main path at 0.22 V, and of the peaking stage at 0.1 V (different from bias voltages for Fig. 2.15), which leads to a slight decrease in PAE. The envelope detector from Fig. 2.10 was biased at $V_{DD} = 0.5\,\text{V}$, $V_{SS} = -0.35\, \text{V}$, and $V_G = 0.3\, \text{V}$. It can be observed that the analog linearizer effectively flattens the gain and extends the $P_{1\text{dB}}$ from 23 dBm to 25 dBm. The effect of the linearization circuit on PAE is minimal, because the gain is still high and the power consumption of the linearizer is less than 2 mW.

Figure 2.18: Measured gain and PAE with analog linearizer turned on and off.

### 2.4.2 Modulation Measurements

The performance of the PA with and without the analog linearizer has been studied with 16-QAM and 64-QAM single carrier (SC) signals with 200 MHz modulation bandwidth. The signals were generated using a Keysight M8195A 65 GSa/s arbitrary waveform generator (AWG) which can directly generate modulated signals at 15 GHz carrier frequency. The PA output was then down-converted and captured with a high sampling rate digital oscilloscope. A root raised cosine (RRC) filter with a roll-off factor of 0.35 was applied to both signals. Measured EVM and average PAE results are shown in Fig. 2.19 for 16-QAM and Fig. 2.21 for 64-QAM. Input PAPR values for these signals are 5.4 dB and 6 dB, respectively. Gain linearization considerably improves EVM without significantly affecting average PAE. The effectiveness of the linearizer can also be observed based on measured ACLR values, which are depicted in Fig. 2.20 and Fig. 2.22.

16-QAM can tolerate much higher EVM than 64-QAM for the same bit error rate. As there is no 5G standard yet available that clearly specifies EVM requirements, we consider here EVM values close to the IEEE 802.11 maximum EVM requirements. For the 16-QAM signal, the PA achieves $EVM = 9.5\%$ with average output power of $P_{out} = 20.6\,\text{dBm}$ and average $PAE = 21.7\%$; ACLR improves from -20.8 dBc to -25.5 dBc. For the 64-QAM signal, the PA achieves $EVM = 5.5\%$ with average output power of $P_{\text{out}} = 16.4\,\text{dBm}$, and average $PAE = 15.2\%$; ACLR improves from -22.3 dBc to -28.8 dBc. The ACLR improvement can also be visualized by means of modulated spectrum measurements shown in Fig. 2.23 for the 64-QAM signal. ACLR values of this order can be well suited to 5G mm-wave phased arrays due to higher spatial selectivity. Fig. 2.24 shows the received constellations for these two signals with the analog linearizer turned on. Table 2.1 summarizes the modulation measurements.

Table 2.1: Summary of modulation measurements

<table>
<thead>
<tr>
<th></th>
<th>200 MHz 16-QAM</th>
<th>200 MHz 64-QAM</th>
</tr>
</thead>
<tbody>
<tr>
<td>EVM (%)</td>
<td>9.5</td>
<td>5.5</td>
</tr>
<tr>
<td>In PAPR (dB)</td>
<td>5.4</td>
<td>6</td>
</tr>
<tr>
<td>Out PAPR (dB)</td>
<td>4.2</td>
<td>6</td>
</tr>
<tr>
<td>Pout (dBm)</td>
<td>20.6</td>
<td>16.4</td>
</tr>
<tr>
<td>PAE (%)</td>
<td>21.8</td>
<td>15.2</td>
</tr>
<tr>
<td>ACLR (dBc)</td>
<td>-25.5</td>
<td>-28.8</td>
</tr>
</tbody>
</table>

It is also important to analyze phase distortion (AM-PM) of the PA. Fig. 2.25 illustrates the AM-PM response of the PA with the analog linearizer turned on and off for the 200 MHz 64-QAM signal. The improvement in AM-AM response of the amplifier due to the linearizer also improves the AM-PM response. Even without the linearizer, the PA already demonstrates very good AM-PM performance, in keeping with prior measurements of AM-PM using stacked FET PAs in CMOS SOI.

**Figure 2.23**: Measured spectrum for 200 MHz 64-QAM signal with $EVM = 5.5\%$ and average output power $P_{\text{out}} = 16.4\,\text{dBm}$ with analog linearizer turned on and off.

Figure 2.25: Measured AM-PM for 200 MHz 64-QAM signal with $EVM = 5.5\%$ and average output power $P_{out} = 16.4\, \text{dBm}$ with analog linearizer turned on and off. An offset of $-40^\circ$ was added to the data without linearization for clarity.

Table 2.2 gives an overview of recently reported Doherty power amplifiers on silicon for cm-waves and mm-waves. For comparison, the highest reported output power and efficiency of a GaAs Doherty PA is also included. To the author's knowledge, the two stage Doherty PA presented in this work achieves the highest power and efficiency reported to date for a silicon Doherty PA at high microwave frequencies.

In order to have a fairer comparison with the PAs implemented at higher frequencies, a frequency weighted efficiency can be used, defined here as

\[ FOM = \sqrt{f_0/\text{GHz}} \cdot \text{PAE}, \]

where \( f_0 \) is the operating frequency. The two stage Doherty PA with the linearizer presented in this work achieves \( FOM = 112 \), which is about 7\% higher than the PA presented in [31], 26\% higher than the PA in [32] but 32\% lower than the GaAs Doherty PA in [33].

Table 2.2: Comparison to Recent cm-wave and mm-wave Doherty PAs.

<table>
<tbody>
<tr>
<td><strong>Technology</strong></td>
</tr>
<tr>
<td><strong>Topology</strong></td>
</tr>
<tr>
<td><strong>F<sub>0</sub> (GHz)</strong></td>
</tr>
<tr>
<td><strong>Supply (V)</strong></td>
</tr>
<tr>
<td><strong>Gain (dB)</strong></td>
</tr>
<tr>
<td><strong>Psat (dBm)</strong></td>
</tr>
<tr>
<td><strong>P1dB (dBm)</strong></td>
</tr>
<tr>
<td><strong>PAE<sub>max</sub> (%)</strong></td>
</tr>
<tr>
<td><strong>PAE<sub>6dB</sub> (%)</strong></td>
</tr>
<tr>
<td><strong>Area (mm<sup>2</sup>)</strong></td>
</tr>
</tbody>
</table>

### 2.5 Conclusion

In this chapter, a 15 GHz fully integrated symmetric Doherty PA is presented. The PA is realized in 45 nm SOI CMOS technology. Both the main and the peaking amplifier branches consist of two-stage amplifiers, which allows higher gain and flexible control over the turn-on characteristics of the peaking PA. The driver stages consist of 2-stack amplifiers, while the final stages are implemented using 4-stack multigate cells to achieve high power. Both the input and output combiners were optimized for minimum area and loss. The PA achieves more than 25.7 dBm saturated output power and peak PAE of 31.2%. PAE at 6 dB back-off is 25%, which is more than 64% higher than for an ideal class B PA roll-off.

A simple analog linearizer is also proposed that performs Doherty gain correction in the RF domain. The linearizer effectively flattens the overall gain and extends the output P1dB of the amplifier from 23 dBm to 25.1 dBm without much penalty on the PAE. The performance of the linearized Doherty PA has been verified with 200 MHz single carrier 16-QAM and 64-QAM signals.

### 2.6 Acknowledgment

Chapter 2 is mostly a reprint of the material as it appears in N. Rostomyan, J. A. Jayamon, and P. M. Asbeck, "15 GHz Doherty Power Amplifier With RF Predistortion Linearizer in CMOS SOI," IEEE Trans. Microw. Theory Tech., 2017. The dissertation author was the primary author of this material.

Chapter 3

Comparison of pMOS and nMOS 28 GHz High Efficiency Linear Power Amplifiers in 45 nm CMOS SOI

### 3.1 Introduction

As discussed in Chapter 2, high data rate mm-wave communication links will require compact, low cost and efficient power amplifiers. Implementation of the PAs in CMOS offers the potential for greater integration capability with other system components and reduced cost per chip. The large channel bandwidth and high PAPR of mm-wave communication also require high efficiency and inherently linear mm-wave PAs. However, the efficiency and reliability of CMOS high power circuit blocks are limiting their widespread deployment.

Several techniques have been proposed in the literature to increase the power handling capability (limited by relatively low break-down voltages) of CMOS PAs, such as transistor stacking, and on chip or spatial power combining. Compact and high power implementations using 4-stack devices have been demonstrated in Chapter 2.

This chapter demonstrates high efficiency, compact, one stage, 2-stack power amplifiers based on nMOS and pMOS devices at Ka-Band. The PAs occupy a very small (0.18 mm²) active area. The nMOS PA achieves 12 dB gain, saturated output power of 18.9 dBm and 40.5% PAE at 26.75 GHz. The pMOS PA demonstrates comparable results with 10.3 dB gain, saturated output power of 17.8 dBm and 40.7% PAE at 26.5 GHz (when operated with an identical power supply voltage of 2.4 V). Even better performance is achieved with the pMOS PA on the new GF 45 nm CMOS SOI with thick copper top metalization option. In this process, the pMOS based PA achieves more than 46% peak PAE and 19.5 dBm saturated output power. pMOS amplifier results are of particular interest since these devices are known to be less susceptible than nMOS devices to various types of degradation, particularly hot carrier injection and time-dependent dielectric breakdown [35]. In this work, simple, preliminary short term reliability measurements were conducted to evaluate gain degradation at saturated output power vs. time for different drain bias voltages. The results showed that the pMOS PAs can withstand higher $V_{dd}$ values than the nMOS PAs without noticeable gain degradation.

In order to assess linearity and bandwidth, measurements with wideband 800 MHz 64-QAM OFDM signals (which are representative of 5G requirements) have been conducted with the nMOS PA. The PA achieves $EVM = 5.5\%$ with average output power of $P_{out} = 9.8$ dBm, average $PAE = 14.8\%$, and ACLR of -25.3 dBc.

The chapter is organized as follows. In Section 3.2, design considerations for the nMOS and pMOS PAs are presented. Section 3.3 covers the measurement results. Finally, additional discussions that include 2-stack PA design improvements in the new GF 45RFSOI technology are presented in Section 3.4.

### 3.2 Circuit Architecture and Design

Both the nMOS and pMOS PAs are realized with 256µm wide 2-stack amplifiers shown in Fig. 3.1a. The top device gate is terminated with a finite capacitance of 380 fF which results in a non-zero voltage swing at the gate. In this arrangement, the overall output voltage swing can be higher than each transistor’s breakdown voltage, because the voltage can be equally distributed across each transistor. The inputs and the outputs are matched to 50Ω. The output matching networks have been optimized for the second harmonic impedance matching in order to provide the highest PAE. The real part of the desired fundamental load impedance at the drain of the nMOS power stage is 26Ω, and of the pMOS is 40Ω.

### 3.3 Experimental Results

The PAs were fabricated in the GF 45 nm SOI CMOS process. Each occupies an overall chip area of 0.6x0.62 mm², while the active region (without pads) is 0.45x0.4 mm². The chip micrograph is shown in Fig. 3.2.

A supply voltage of $V_{dd} = 2.4\,\text{V}$ was used throughout the measurements. Fig. 3.3 illustrates measured small-signal S-parameters of the PAs. At 28 GHz, the $S_{21}$ gain of the nMOS PA measures 12.8 dB with a -3 dB bandwidth from 24.9 to 32.3 GHz, which corresponds to a fractional bandwidth of 26.5%. For the same DC quiescent current, the pMOS PA’s $S_{21}$ gain is 13 dB.
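The fractional bandwidth quoted here can be reproduced from the band edges with a small helper. This is an illustrative sketch: the function name is hypothetical, and whether the bandwidth is referred to the nominal 28 GHz frequency or to the arithmetic band center is an assumption, since the text does not say which convention was used.

```python
def fractional_bandwidth(f_low_hz, f_high_hz, f_ref_hz=None):
    """Fractional bandwidth (f_high - f_low) / f_ref.

    If no reference frequency is given, the arithmetic center of the
    band is used (an assumed convention).
    """
    if f_ref_hz is None:
        f_ref_hz = 0.5 * (f_low_hz + f_high_hz)
    return (f_high_hz - f_low_hz) / f_ref_hz

# -3 dB band edges of S21 from the text, referred to 28 GHz
# and to the band center.
fbw_nominal = fractional_bandwidth(24.9e9, 32.3e9, 28e9)
fbw_center = fractional_bandwidth(24.9e9, 32.3e9)
print(f"referred to 28 GHz: {100 * fbw_nominal:.1f} %")
print(f"referred to band center: {100 * fbw_center:.1f} %")
```

Both conventions give about 26%, close to the quoted value.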

Figure 3.3: Measured S-parameters of nMOS and pMOS based PAs.

Figure 3.5: Measured gain change vs time for nMOS PA at 26.75 GHz.

Fig. 3.4 illustrates measured large signal gain, power added efficiency (PAE), and drain efficiency (DE) for nMOS and pMOS PAs biased in class-AB mode. The nMOS based PA’s PAE peaks at 26.75 GHz, while the pMOS has its peak at 26.5 GHz. The nMOS PA achieves maximum saturated output power $P_{sat} = 18.9\,\text{dBm}$ (77 mW) and a peak $PAE_{max} = 40.5\%$ as well as drain efficiency of $DE = 45\%$ at $P_{out} = 18.25\,\text{dBm}$. The pMOS PA, on the other hand, achieves maximum saturated output power $P_{sat} = 17.8\,\text{dBm}$ (60 mW) and a peak $PAE_{max} = 40.7\%$, and $DE = 47.7\%$ at $P_{out} = 16.8\,\text{dBm}$.
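The milliwatt figures given next to the saturated output powers follow from the standard dBm definition, $P_{\text{mW}} = 10^{P_{\text{dBm}}/10}$. The short sketch below, with hypothetical helper names, converts the two quoted $P_{sat}$ values.

```python
import math

def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

def mw_to_dbm(p_mw):
    """Convert a power in milliwatts to dBm."""
    return 10.0 * math.log10(p_mw)

# Saturated output powers quoted in the text for the two PAs.
for name, p_sat_dbm in (("nMOS", 18.9), ("pMOS", 17.8)):
    print(f"{name}: {p_sat_dbm} dBm = {dbm_to_mw(p_sat_dbm):.1f} mW")
```

The conversion gives values slightly above 77 mW and 60 mW, consistent with the rounded numbers in the text.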

To make a very preliminary assessment of reliability of the nMOS and pMOS PAs, a short-term reliability study was carried out. Various drain bias voltages were used for the devices while they operated at saturated output power for 30 minutes at each bias voltage. Figures 3.5 and 3.6 show the saturated gain vs time for drain voltages varying from 2 to 3 V for the nMOS and pMOS PAs, respectively. It can be seen that for the nMOS at 2.8 V and 3 V drain voltages there were noticeable changes in the output power, while for the pMOS there was no noticeable degradation for the period of the measurements up to 3 V drain voltage. These results match the expectations based on reduced hot carrier injection in pMOS. It can be anticipated that with slight modifications of the load line, pMOS PAs could be operated at increased drain bias voltages to achieve higher output power and efficiency.

The linearity performance of the nMOS PA was verified with a 64-QAM OFDM signal with 800 MHz bandwidth and 9.8 dB input PAPR. The modulated signals were generated using a Keysight M8195A 65 GSa/s arbitrary waveform generator (AWG). The PA achieves $EVM = 5.5\%$ with average output power of $P_{out} = 9.8 \text{ dBm}$, average $PAE = 14.8\%$, and ACLR of -25.3 dBc. Figures 3.7 and 3.8 show the received constellations and spectrum for $EVM = 5.5\%$.

**Figure 3.7:** Measured constellation and EVM for 800 MHz 64-QAM signal with average output power $P_{out} = 9.8 \text{ dBm}$.

**Figure 3.8:** Measured spectrum for 800 MHz 64-QAM signal with $EVM = 5.5\%$ and average output power $P_{out} = 9.8 \text{ dBm}$. 

<table>
<thead>
<tr>
<th>EVM/MER</th>
<th>5.4756</th>
</tr>
</thead>
<tbody>
<tr>
<td>EVMPeak</td>
<td>14.521</td>
</tr>
<tr>
<td>PilotEVM</td>
<td>4.1189</td>
</tr>
<tr>
<td>DataEVM</td>
<td>5.8238</td>
</tr>
<tr>
<td>PmbIEVM</td>
<td>3.5559</td>
</tr>
</tbody>
</table>

Table 3.1: CMOS Linear Power Amplifier Performance Summary

<table>
<thead>
<tr>
<th></th>
<th>This Work</th>
<th>[4]</th>
<th>[36]</th>
<th>[37]</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology</td>
<td>45 nm SOI CMOS</td>
<td>45 nm SOI</td>
<td>28 nm CMOS</td>
<td>28 nm CMOS</td>
</tr>
<tr>
<td>Topology</td>
<td>2-stack N/P-MOS</td>
<td>4-stack</td>
<td>2-stack</td>
<td>2-stage differential</td>
</tr>
<tr>
<td>Fo (GHz)</td>
<td>26.75/26.5</td>
<td>29</td>
<td>28</td>
<td>30</td>
</tr>
<tr>
<td>Supply (V)</td>
<td>2.4</td>
<td>4.8</td>
<td>NA</td>
<td>1.15</td>
</tr>
<tr>
<td>Gain (dB)</td>
<td>12/10.3</td>
<td>13</td>
<td>13.6</td>
<td>16.3</td>
</tr>
<tr>
<td>Psat (dBm)</td>
<td>18.9/17.8</td>
<td>24.8</td>
<td>19.8</td>
<td>15.3</td>
</tr>
<tr>
<td>OP1dB (dBm)</td>
<td>17.6/16</td>
<td>NA</td>
<td>18.6</td>
<td>14.3</td>
</tr>
<tr>
<td>Peak PAE (%)</td>
<td>40.5/40.7</td>
<td>29</td>
<td>43.3</td>
<td>36.6</td>
</tr>
<tr>
<td>Active area (mm²)</td>
<td>0.18</td>
<td>0.3</td>
<td>0.28</td>
<td>0.16</td>
</tr>
</tbody>
</table>

Table 3.1 gives an overview of recently reported CMOS linear power amplifiers in the Ka-band. Compared to other works, the nMOS and pMOS PAs presented here have very compact dimensions, high gain and efficiency.

### 3.4 Additional Discussions

The pMOS 2-stack PA has been further improved by using the updated 45 nm CMOS SOI process from GlobalFoundries with a metalization option that incorporates two thick copper metal layers. This process is optimized for RF and mm-wave applications, as it provides mm-wave device models and the possibility of realizing high-Q inductors using the thick copper layers. The amplifier presented here has a slightly increased transistor size (310 µm) and makes use of an additional capacitor to improve impedance matching at the inter-device node (an “accelerator capacitor”), as shown in Fig. 3.9.

Fig. 3.10 illustrates large signal measurements at 27 GHz, performed with Class AB bias conditions: \( V_{dd} = 2.4 \text{ V} \), \( V_{g1} = 0.25 \text{ V} \), \( V_{g2} = 1.9 \text{ V} \). The saturated output power reaches 19 dBm, and the peak PAE reaches 46.7%. At the point of maximum PAE, the gain is compressed to 10 dB.

Figure 3.9: Schematic of the pMOS PA with a shunt feedback drain-source capacitance $C_3$.

Figure 3.10: Measured gain, PAE and drain efficiency for pMOS PA at 27 GHz.

For applications related to 5G, the efficiency achievable with signals that have high PAPR is of major interest. Under the Class AB bias conditions shown in Fig. 3.10, the efficiency at 6 dB back-off from Psat is 25%, and 18% for 8 dB back-off. With bias conditions changed to $V_{dd} = 2.1 \, \text{V}$, $V_{g1} = 0.2 \, \text{V}$, $V_{g2} = 1.8 \, \text{V}$ (corresponding to deeper Class AB), the PAE roll-off is a little slower, and PAE reaches 25.5% for 6 dB back-off and 20% for 8 dB back-off. For this bias, the peak power decreases to 18 dBm, and the gain to 11 dB. Output power above 20 dBm is obtained with bias conditions $V_{dd} = 2.6 \, \text{V}$, $V_{g1} = 0.25 \, \text{V}$, $V_{g2} = 1.9 \, \text{V}$.

Figure 3.11: Measured average PAE and EVM for 800 MHz 64-QAM OFDM for pMOS PA at 27 GHz.

The linearity performance of the pMOS PA with the accelerator capacitor was verified with a 64-QAM OFDM signal with 800 MHz bandwidth and 9.2 dB input PAPR. No digital pre-distortion (DPD) was used, in order to verify the PA’s inherent linearity. Measured average PAE and error-vector magnitude (EVM) results are shown in Fig. 3.11 with $V_{dd} = 2.1 \, \text{V}$. At the highest allowed $EVM = 5.5\%$ for 64-QAM signals, the PA achieves an average output power of $P_{out} = 9.2 \, \text{dBm}$, an average $PAE = 17\%$, and an ACLR of -29 dBc. The output AM-PM response over the power range of modulation is within $\pm 3.5^\circ$.

Table 3.2 summarizes the reported characteristics of various high performance PAs operating near 28 GHz with single-ended outputs. The PA described in this work has efficiency and power on the same order as those reported for other Si technologies, as well as for GaAs pHEMT.
### Table 3.2: Comparison of Reported High Efficiency Single-Ended 28 GHz PAs

<table>
<thead>
<tr>
<th>Technology</th>
<th>This Work</th>
<th>[38]</th>
<th>[36]</th>
<th>[37]</th>
<th>[39]</th>
<th>[40]</th>
<th>[41]</th>
</tr>
</thead>
<tbody>
<tr>
<td>45nm SOI CMOS</td>
<td>2-stack pMOS</td>
<td>2-stack pMOS</td>
<td>2-stack n/p-MOS</td>
<td>2-stack</td>
<td>2-stage</td>
<td>CS</td>
<td>2-stack</td>
</tr>
<tr>
<td>Fo (GHz)</td>
<td>27</td>
<td>26.75/26.5</td>
<td>28</td>
<td>30</td>
<td>28</td>
<td>28</td>
<td>32</td>
</tr>
<tr>
<td>Supply (V)</td>
<td>2.4</td>
<td>2.4</td>
<td>NA</td>
<td>1.15</td>
<td>2.4</td>
<td>5 &amp; 12</td>
<td>2V/3V</td>
</tr>
<tr>
<td>Gain (dB)</td>
<td>10</td>
<td>12/10.3</td>
<td>13.6</td>
<td>16.3</td>
<td>21.2</td>
<td>16.7</td>
<td>8.9/8.2</td>
</tr>
<tr>
<td>Psat (dBm)</td>
<td>19.5</td>
<td>18.9/17.8</td>
<td>19.8</td>
<td>15.3</td>
<td>17.1</td>
<td>31.5</td>
<td>21.2/24.3</td>
</tr>
<tr>
<td>Peak PAE (%)</td>
<td>46.7</td>
<td>40.5/40.7</td>
<td>43.3</td>
<td>36.6</td>
<td>42</td>
<td>33</td>
<td>59</td>
</tr>
<tr>
<td>Active Area (mm²)</td>
<td>0.18</td>
<td>0.18</td>
<td>0.28</td>
<td>0.16</td>
<td>0.5</td>
<td>2</td>
<td>NA</td>
</tr>
</tbody>
</table>

### 3.5 Conclusion

High efficiency, one stage mm-wave power amplifiers based on nMOS and pMOS transistors in IBM and later GlobalFoundries 45 nm CMOS SOI have been demonstrated. The amplifiers are arranged in a 2-stack configuration to increase the output voltage swing. Preliminary reliability tests have been conducted to demonstrate greater voltage handling capability of pMOS devices. These compact PAs can be useful as a standalone amplifier or a component of more complex architectures such as Doherty or out-phasing for 5G transceivers. The use of pMOS provides the potential for increased robustness to hot carrier injection effects.

### 3.6 Acknowledgment

Chapter 3 is mostly a reprint of the material as it appears in N. Rostomyan, M. Ozen, and P. Asbeck, “Comparison of pMOS and nMOS 28 GHz high efficiency linear power amplifiers in 45 nm CMOS SOI,” *2018 IEEE Topical Conference on RF/Microwave Power Amplifiers for Radio and Wireless Applications*, 2018. This dissertation author was the primary author of this material. Chapter 3 is also, in part, a reprint of the material that has been accepted for publication and will appear in D. Thomas, N. Rostomyan, and P. Asbeck, “A 45% PAE pMOS Power Amplifier for 28GHz Applications in 45 nm SOI,” 2018 IEEE MWSCAS, 2018. The dissertation author was the collaborating author of these materials, and co-authors have approved the use of the material for this dissertation.
Chapter 4

28 GHz Doherty Power Amplifier in CMOS SOI with 28% Back-Off PAE

4.1 Introduction

The last two chapters have outlined the possibility of achieving high efficiency using a 2-stack CMOS PA configuration. While class AB PAs are easy to design and occupy a small area, back-off efficiency enhancement of the PA is of high importance for signals with a high peak-to-average power ratio (PAPR). Also, the limited feasibility of digital pre-distortion (DPD) in mm-wave systems with an array of PAs and bandwidths of several hundred MHz to a few GHz imposes stringent linearity requirements on the PAs.

In this chapter, we demonstrate a CMOS Doherty PA for the 28 GHz band with high peak PAE, high 6 dB back-off PAE, and inherent linearity. The PA is based on an optimized, low-loss Doherty output combiner synthesis methodology that was proposed in [7] and on symmetric 2-stack power devices. To the author’s knowledge, the PA achieves the highest peak and back-off efficiency of any Si-based Doherty PA at 28 GHz. A compact modeling approach is also demonstrated which considerably reduces simulation times in the PA design.
4.2 Doherty PA Implementation

The full schematic of the proposed Doherty PA is shown in Fig. 4.1. Both the main and peaking amplifiers consist of 256 $\mu$m wide 2-stack devices. The 2-stack amplifier has an arrangement similar to the traditional cascode. However, the top device gate is terminated with a finite capacitance of 380 fF. Thus, the gate of the top device is not at RF ground as in a cascode, and the drain-to-gate swing is reduced [42]. In such a configuration, the total output voltage swing can be higher than each transistor’s breakdown voltage (BV), because the voltage can be equally distributed across the transistors.

The combiner used in this work is based on the analytical synthesis methodology which has already been demonstrated in [7], [43]. The combiner performs both the necessary Doherty impedance modulation and matching to the desired output load. Compared to a conventional Doherty lumped element (LCL) combiner, the losses are considerably reduced. The synthesis relies on load-pull simulations at the center frequency of interest (28 GHz) to obtain the optimal load impedance of the main ($Z_L = 20||j43\,\Omega$) and peaking ($Z_L = 21||j43\,\Omega$) amplifiers at the peak power, the optimal load impedance of the main amplifier at 3 dB back-off output power ($Z_L = 40||j33\,\Omega$), and the output impedance of the turned off peaking amplifier ($Z_{off} = -j11.5\,\Omega$).
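
The load impedances above are quoted in parallel form (a resistance in parallel with a reactance). As a reading aid, the following minimal sketch converts them to the equivalent series form $R + jX$. The function name and the dictionary labels are illustrative only and do not come from the design flow used in this work.

```python
def parallel_to_series(r_par, x_par):
    """Series impedance of a resistance r_par in parallel with a reactance j*x_par (ohms)."""
    z_r = complex(r_par, 0.0)
    z_x = complex(0.0, x_par)
    return z_r * z_x / (z_r + z_x)


# Parallel-form load impedances quoted in the text (ohms).
loads = {
    "main, peak power": (20.0, 43.0),
    "peaking, peak power": (21.0, 43.0),
    "main, back-off": (40.0, 33.0),
}

for name, (r, x) in loads.items():
    z = parallel_to_series(r, x)
    print(f"{name}: {r:g} || j{x:g} ohm = {z.real:.1f} + j{z.imag:.1f} ohm")
```

The same conversion applies to any parallel R-X pair; a capacitive branch is entered with a negative reactance.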

The synthesized network is shown in Fig. 4.2a. For comparison, a reference Doherty PA based on a conventional lumped element impedance inverter was designed. The reference design is also optimized based on the technique described in [42] and consists of a high-pass LCL network. While the detailed analysis of the novel combiner is beyond the scope of this chapter, it is important to mention that a significant reduction in combiner insertion loss can be achieved, as illustrated in Fig. 4.3. Here, quality factors of $Q_L = 25$ and $Q_C = 40$ are assumed for the conventional combiner. The optimized combiner loss is based on EM simulations.

![Diagram](image)

Figure 4.2: Optimized combiner (a) and conventional combiner (b).

![Graph](image)

Figure 4.3: Simulated losses in synthesized and conventional combiners vs $P_{out}$.

The real parts of the main and auxiliary PA loads, as well as the complex load impedance values on the Smith chart, are shown in Figures 4.4 and 4.5, respectively. It can be seen that the combiner performs proper impedance modulation from back-off to peak operation. The main PA sees about 38 $\Omega$ at 6 dB back-off and 22 $\Omega$ at the peak. The auxiliary PA sees about 24 $\Omega$ at the peak.

Often, the design of CMOS PAs involves custom layouts of the power device and parasitic extraction of the resultant layout. The extraction usually results in hundreds of thousands of nodes and leads to very long simulation times as well as convergence issues. The simulation times can be reduced from many hours to a few minutes if the layout parasitics of the power device are approximated in a compact model. Such a model was developed for the 256 \(\mu\)m wide 2-stack device, as shown in Fig. 4.6. The layout of the device, including external gate to ground capacitors for the top transistor, is depicted in Fig. 4.7. In the compact model, the distributed parasitics of the extracted device are represented with lumped elements \(C_s\) and \(R_s\). The model shown here has no parasitic inductors, as no significant performance change was observed with inductive extraction activated for the device size and center frequency used in this work. The derivation of the model parameters consists of the following steps:

1. Design the layout of the power device (or a sub-section if the device is very large) and run RCL parasitic extraction.

2. Simulate the s-parameters of the extracted device.

3. Estimate the location and type (resistive or capacitive) of the main parasitics from the layout and place appropriate lumped elements \(R_s\), \(C_s\) or \(L_s\).

4. Run an optimizer to fit the s-parameters of the modeled device to those of the extracted device for multiple bias points.

5. Add further parasitic elements if the fit does not converge to the desired error margin.
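
The fitting in step 4 can be illustrated with a deliberately reduced example. The sketch below is not the optimizer setup used in this work: it assumes a single one-port parasitic branch modeled as a shunt conductance in parallel with a capacitance, uses synthetic "extracted" data generated from hypothetical values (`G_TRUE`, `C_TRUE`), and replaces the circuit-simulator optimizer with a closed-form least-squares fit. The 50 $\Omega$ reference impedance is also an assumption.

```python
import math

Z0 = 50.0  # reference impedance in ohms (assumed)


def s11_to_y(s11, z0=Z0):
    """Convert a one-port reflection coefficient to an admittance."""
    return (1.0 - s11) / (z0 * (1.0 + s11))


def y_to_s11(y, z0=Z0):
    """Convert a one-port admittance to a reflection coefficient."""
    return (1.0 - z0 * y) / (1.0 + z0 * y)


def fit_shunt_gc(freqs_hz, s11_values):
    """Least-squares fit of Y(f) = G + j*2*pi*f*C to one-port S11 data."""
    ys = [s11_to_y(s) for s in s11_values]
    omegas = [2.0 * math.pi * f for f in freqs_hz]
    # G minimizes sum (Re{Y_i} - G)^2, C minimizes sum (Im{Y_i} - w_i*C)^2.
    g = sum(y.real for y in ys) / len(ys)
    c = sum(w * y.imag for w, y in zip(omegas, ys)) / sum(w * w for w in omegas)
    return g, c


# Synthetic stand-in for the extracted-device data (hypothetical values).
G_TRUE = 2e-3    # siemens
C_TRUE = 50e-15  # farads
freqs = [f * 1e9 for f in range(20, 37, 2)]
s11_extracted = [y_to_s11(complex(G_TRUE, 2.0 * math.pi * f * C_TRUE)) for f in freqs]

g_fit, c_fit = fit_shunt_gc(freqs, s11_extracted)

# Worst-case S11 error between the fitted model and the "extracted" data.
max_err = max(
    abs(y_to_s11(complex(g_fit, 2.0 * math.pi * f * c_fit)) - s)
    for f, s in zip(freqs, s11_extracted)
)
print(f"G = {g_fit * 1e3:.3f} mS, C = {c_fit * 1e15:.2f} fF, max |dS11| = {max_err:.2e}")
```

For a full device, the same idea extends to several branches and to two-port data at multiple bias points, at which stage a general-purpose optimizer becomes necessary, and step 5 corresponds to adding branches until `max_err` falls below the chosen margin.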

**Figure 4.7:** Layout of the 256\(\mu\)m wide 2-stack power device including external gate to ground capacitors for the top transistor.
Figure 4.8: Extracted (solid line, port 1,2) and modeled (dashed line, port 3,4) device’s input and output reflection coefficients.

The resultant s-parameters of the extracted and fitted models for the 256 µm wide 2-stack device used in this work are shown in Figures 4.8 and 4.9 for $V_{G1} = 0.3$ V, $V_{G2} = 2$ V, $V_{DD} = 2$ V. Here, ports 1 and 2 belong to the extracted device (solid lines), and ports 3 and 4 to the modeled one (dashed lines). It can be seen that the s-parameter curves perfectly overlap.

Load-pull simulations of the extracted and compact model are shown in Table 4.1. While the compact model slightly underestimates the peak PAE and $P_{sat,max}$ values, the optimum load impedance values are the same.

Table 4.1: Load-Pull simulations of the 256µm wide 2-stack device.

<table>
<thead>
<tr>
<th></th>
<th>Extracted</th>
<th>Compact Model</th>
</tr>
</thead>
<tbody>
<tr>
<td>$P_{\text{sat,max}}, \text{dBm}$</td>
<td>20.2</td>
<td>19.7</td>
</tr>
<tr>
<td>$Z_{\text{Lopt}(@ P_{\text{sat,max}})}, \Omega$</td>
<td>$10.6 + j7.6$</td>
<td>$10.6 + j7.6$</td>
</tr>
<tr>
<td>$\text{PAE}_{\text{max}}, \%$</td>
<td>51.9</td>
<td>49.6</td>
</tr>
<tr>
<td>$Z_{\text{Lopt}(@ \text{PAE}_{\text{max}})}, \Omega$</td>
<td>$13.7 + j13$</td>
<td>$13.7 + j13$</td>
</tr>
</tbody>
</table>

4.3 Combiner Performance under Mismatch and Process Variations

It is important to analyze the sensitivity of the novel combiner's performance when L/C values change due to process variation or when strong mismatch occurs at the load terminal. Based on simulations, the combiner is not very sensitive to L/C value changes. A ±10% variation of $L_s$ and $C_s$ has been simulated using an LC lumped element representation of the combiner according to Fig. 4.2a. Quality factors of $Q_L = 25$ and $Q_C = 40$ are assumed. The combiner loss does not vary by more than 0.1 dB over the output power range, as illustrated in Fig. 4.10. Also, according to Fig. 4.11, the gain and PAE experience only slight deviations from the nominal. A more complete corner analysis will require appropriate corner files for the EM simulator (EMX), which are not available for this process (IBM12SOI).

In a phased array system, the load that is presented to each PA of the array can change significantly during beam-steering due to antenna coupling. In order to evaluate the sensitivity of the combiner to mismatch, the loss at the peak output power of the EM extracted novel combiner and of the conventional combiner of Fig. 4.2b was simulated for a load reflection coefficient with magnitude $|\Gamma| = -10\text{dB}$ and phase swept from 0 to $2\pi$. The simulations are shown in Fig. 4.12 and suggest that the novel combiner performs better than the conventional one under mismatch conditions as well.
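
To give a sense of the load range covered by this mismatch sweep, the sketch below enumerates load impedances on the $|\Gamma| = -10$ dB circle. It assumes that the dB value refers to the voltage reflection coefficient (i.e., $20\log_{10}|\Gamma|$, a 10 dB return loss) and a 50 $\Omega$ reference impedance; neither assumption is stated explicitly in the text, and the function name is illustrative.

```python
import cmath
import math


def mismatch_loads(gamma_db=-10.0, z0=50.0, n_points=8):
    """Load impedances on a constant-|Gamma| circle, phase swept over 0..2*pi."""
    mag = 10.0 ** (gamma_db / 20.0)  # dB taken as 20*log10|Gamma| (assumed)
    loads = []
    for k in range(n_points):
        phase = 2.0 * math.pi * k / n_points
        gamma = cmath.rect(mag, phase)
        loads.append((phase, z0 * (1.0 + gamma) / (1.0 - gamma)))
    return mag, loads


mag, loads = mismatch_loads()
vswr = (1.0 + mag) / (1.0 - mag)
print(f"|Gamma| = {mag:.3f}, VSWR = {vswr:.2f}")
for phase, z in loads:
    print(f"phase = {math.degrees(phase):5.1f} deg -> Z = {z.real:6.1f} {z.imag:+6.1f}j ohm")
```

Under these assumptions the sweep corresponds to a VSWR of roughly 1.9:1, with purely resistive extremes of about 26 $\Omega$ and 96 $\Omega$.
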
4.4 Experimental Results

The Doherty PA was fabricated in GF 45 nm CMOS SOI and has dimensions of 0.63 mm$^2$ including the pads, as shown in Fig. 4.13. Measured and simulated continuous wave (CW) gain and PAE at 28 GHz versus output power are illustrated in Fig. 4.14 with fixed $V_{\text{DD,main/aux}} = 2.4\text{V}$, $V_{G2,\text{main/aux}} = 1.7\text{V}$, $V_{G1,\text{main}} = 0.22\text{V}$ and $V_{G1,\text{aux}} = -0.1\text{V}$. Both simulated and measured gains are flat; however, the measured gain is 2 dB lower than the simulated gain. Measured saturated output power is $P_{\text{sat}} = 22.4\text{dBm}$, about 1.5 dB lower than the simulated $P_{\text{sat}}$. Measured peak PAE is 40%, while the 6 dB back-off PAE is 28%. Measurements of $P_{\text{sat}}$, peak and 6 dB back-off PAE over frequency are shown in Fig. 4.15.

Figure 4.12: Simulated optimized and conventional combiner loss at peak power for antenna mismatch with $|\Gamma| = -10\text{dB}$.

Figure 4.13: Micrograph of the 0.94x0.67 mm$^2$ Doherty power amplifier chip.

The performance of the PA under modulated signals was verified with a 64-QAM 800 MHz OFDM signal. No digital pre-distortion (DPD) was used in order to assess the PA’s inherent linearity. Measured average PAE and error-vector magnitude (EVM) results are shown in Fig. 4.16. For the highest allowed EVM = 5.5% for 64-QAM according to IEEE 802.11 standard, the PA achieves average output power of $P_{\text{out}} = 13$ dBm, PAE = 16.8%, and ACLR1 = -27.2 dBc with output PAPR = 7.3 dB. As shown in Fig. 4.17, output AM-PM response over the power range of modulation is within $\pm 4^\circ$.

Figure 4.14: CW measurement of gain and PAE over output power.

Figure 4.15: CW measurement of $P_{\text{sat}}$, peak and 6 dB back-off PAE vs frequency.

Table 4.2 lists recently reported, state-of-the-art Doherty PAs for mm-waves. To the author's knowledge, the Doherty PA presented in this work features the highest peak and 6 dB back-off PAE among silicon Doherty PAs.

Figure 4.16: Measured average PAE and EVM for 800 MHz 64-QAM OFDM.

Figure 4.17: Measured AM-PM for 800 MHz 64-QAM OFDM signal with EVM = 5.5% and average output power $P_{out} = 13$dBm.
Table 4.2: Comparison to Recent mm-wave Doherty PAs.

<table>
<thead>
<tr>
<th></th>
<th>This Work</th>
<th>[43]</th>
<th>[32]</th>
<th>[33]</th>
<th>[44]</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology</td>
<td>45 nm SOI CMOS</td>
<td>130 nm SiGe</td>
<td>28 nm CMOS</td>
<td>0.15 um GaAs HEMT</td>
<td>130 nm SiGe</td>
</tr>
<tr>
<td>Fo (GHz)</td>
<td>28</td>
<td>30</td>
<td>32</td>
<td>28</td>
<td>28</td>
</tr>
<tr>
<td>Supply (V)</td>
<td>2.4</td>
<td>1.7</td>
<td>1</td>
<td>4</td>
<td>1.5</td>
</tr>
<tr>
<td>Gain (dB)</td>
<td>10</td>
<td>6.9</td>
<td>22</td>
<td>12</td>
<td>18.2</td>
</tr>
<tr>
<td>Psat (dBm)</td>
<td>22.4</td>
<td>21.3</td>
<td>19.8</td>
<td>26</td>
<td>16.8</td>
</tr>
<tr>
<td>OP1dB (dBm)</td>
<td>21.5</td>
<td>N/A</td>
<td>16</td>
<td>N/A</td>
<td>15.2</td>
</tr>
<tr>
<td>Peak PAE (%)</td>
<td>40</td>
<td>17</td>
<td>21</td>
<td>40</td>
<td>20.3</td>
</tr>
<tr>
<td>PAE@6dB(%)</td>
<td>28</td>
<td>24.3</td>
<td>12.8</td>
<td>29</td>
<td>13.9</td>
</tr>
<tr>
<td>Area (mm²)</td>
<td>0.63</td>
<td>1.87</td>
<td>1.79</td>
<td>2.86</td>
<td>1.76</td>
</tr>
</tbody>
</table>

### 4.5 Conclusion

A high efficiency, linear mm-wave Doherty PA in CMOS that uses a novel low-loss combiner is demonstrated. A compact modeling approach for CMOS PAs is demonstrated that considerably reduces simulation times.

### 4.6 Acknowledgment

Chapter 4 is mostly a reprint of the material as it appears in N. Rostomyan, M. Özen, and P. Asbeck, "28 GHz Doherty Power Amplifier in CMOS SOI With 28% Back-Off PAE," *IEEE Microwave and Wireless Components Letters*, pp. 13, 2018. This dissertation author was the primary author of this material.
Chapter 5

A Ka-Band Asymmetric Dual Input CMOS SOI Doherty Power Amplifier with 8 dB Back-Off PAE Above 30%

5.1 Introduction

In the previous chapters, various designs of high power and high efficiency class AB and symmetric Doherty PAs have been presented. As already mentioned in the introduction (Chapter 1), modern communication signals, such as OFDM, exhibit high peak-to-average power ratios, which usually exceed 6 dB. Thus, efficiency enhancement at more than 6 dB back-off can significantly reduce mm-wave transceiver power dissipation. Also, as shown in Fig. 8.1, frequency-reconfigurable PAs with a back-off efficiency improvement architecture that allows RF-domain predistortion and control of the auxiliary PA’s turn-on behavior will require a dual input PA as a key building block.

This chapter discusses a dual input CMOS Doherty PA for the Ka-band based on asymmetric main and peaking amplifiers using a Doherty combiner which is based on an optimized and low loss synthesis methodology [7]. The main amplifier is designed using a 2-stack architecture. The peaking amplifier is realized using multigate 4-stack devices to achieve high peak power. Both the main and peaking amplifiers utilize devices of the same size; however, the supply voltage of the 4-stack peaking PA is twice as high, which results in efficiency peaking at more than 6 dB back-off, as is desirable for modulated signals with high PAPR. As shown in Fig. 5.1, if the peaking amplifier can output twice as much power at the peak as the main amplifier, the peak to back-off output power ratio can be estimated as

$$\gamma = \frac{P_{\text{out,peak}}}{P_{\text{out,back-off}}} = \frac{3 \cdot P}{0.5 \cdot P} = 6 \approx 7.8 \text{ dB}. \quad (5.1)$$
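
The arithmetic of (5.1) can be checked with a short sketch. It assumes, as the equation does, that the main amplifier delivers half of its own peak power $P$ at the efficiency-peaking point; the function and parameter names are illustrative.

```python
import math


def peak_to_backoff_db(peaking_to_main_ratio, main_backoff_fraction=0.5):
    """Peak to back-off output power ratio, as a linear ratio and in dB.

    The main amplifier peak power is normalized to P = 1. At peak, the total
    output is P*(1 + ratio); at back-off, only the main amplifier contributes
    main_backoff_fraction * P (0.5 in (5.1)).
    """
    p_peak = 1.0 + peaking_to_main_ratio
    p_backoff = main_backoff_fraction
    ratio = p_peak / p_backoff
    return ratio, 10.0 * math.log10(ratio)


# Asymmetric case of (5.1): peaking amplifier delivers twice the main power.
ratio_asym, gamma_asym_db = peak_to_backoff_db(2.0)
# Symmetric case for reference: equal main and peaking peak powers.
ratio_sym, gamma_sym_db = peak_to_backoff_db(1.0)

print(f"asymmetric: ratio = {ratio_asym:.0f}, gamma = {gamma_asym_db:.2f} dB")
print(f"symmetric:  ratio = {ratio_sym:.0f}, gamma = {gamma_sym_db:.2f} dB")
```

With equal main and peaking peak powers, the same expression returns the familiar 6 dB back-off of a symmetric Doherty PA.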

The PA is fabricated in GlobalFoundries 45 nm CMOS SOI technology and achieves saturated output power of 25 dBm. Because of the dual input architecture, the PA can be operated with different input power profiles. With an asymmetric power split, the PA exhibits a second PAE peak of 37% at 8.3 dB back-off from peak PAE (9.2 dB from peak power), and a high power PAE of 32%. To the author’s knowledge, the PA achieves the highest peak power and back-off efficiency of any Si-based Doherty PA in the Ka-band. With a symmetric input power split, the PA achieves peak power of 25 dBm, peak PAE of 31%, 6 dB back-off PAE of 24%, and 8 dB back-off PAE of 21% (which in itself constitutes a record for peak power in a CMOS Doherty PA).

The chapter is organized as follows. In Section 5.2, design considerations for the asymmetric Doherty PA are presented, followed by experimental results in Section 5.3.

5.2 Circuit Architecture and Design

A high efficiency, compact, one stage, 2-stack, class AB power amplifier for Ka-band applications has been reported in [38] where the nMOS based PA achieved saturated output power of 18.9 dBm and 40.5% PAE. Compact and high power (more than 25 dBm) implementations using 4-stack devices have also been demonstrated in [4, 23] for 15 GHz and 28 GHz, respectively.

![Figure 5.2: Schematic of the asymmetric Doherty PA.](image)

Figure 5.2: Schematic of the asymmetric Doherty PA. $V_{DD,m} = 2.4 \text{ V}$, $V_{G1M} = 0.22 \text{ V}$, $V_{G2M} = 1.7 \text{ V}$, $V_{DD,p} = 4.8 \text{ V}$, $V_{G1P} = 0 \text{ V}$, $V_{G2P} = 1.8 \text{ V}$, $V_{G3P} = 2.9 \text{ V}$, $V_{G4P} = 4.2 \text{ V}$. 

Thus, incorporating an efficient 2-stack as a main amplifier with a high power 4-stack as a peaking amplifier in a Doherty configuration can result in an asymmetric Doherty PA with theoretical output voltage (power) ratio of 1:2 and $\gamma = 7.8$ dB.

The schematic of the integrated, asymmetric CMOS Doherty PA with low loss combiner is shown in Fig. 5.2. The transistor widths of both the main and peaking amplifiers are 307 $\mu$m. The gates of the stacked common-gate transistors are terminated with appropriate capacitors to allow equal voltage division along the stacked transistors and to maintain drain-gate voltage swings below the breakdown [38]. The 4-stack device is based on a multigate approach [4] and relies on the back-end-of-line (BEOL) metallization to realize the common-gate transistor capacitances. Additionally, a shunt feedback drain-source capacitance ($C_3$ in Fig. 5.2) was added to the 2-stack to increase gain and efficiency [45]. This technique helps to perform intermediate node impedance matching by acting as a negative capacitance at the fundamental frequency. Simulations (without including all matching losses) indicate that $C_3$ increases gain by 2 dB and increases PAE by more than 5%, as shown in Fig. 5.3.

The combiner used in this work is based on analytical synthesis methodology demonstrated in [7], [43], and [46]. The combiner performs both the necessary Doherty impedance modulation and matching to desired output load and achieves considerably lower loss compared to a conventional Doherty combiner [46]. Based on simulations using EM extraction, as illustrated in Fig. 5.4, the synthesized combiner and matching losses are less than 0.9 dB.

5.3 Experimental Results

The Doherty PA was fabricated in GF 45 nm CMOS SOI and has dimensions of 0.63 mm\(^2\) including the pads, as shown in Fig. 5.5. Measurements were carried out using separate inputs to the main and peaking PAs.

![Figure 5.5: Micrograph of the asymmetric dual input Doherty PA.](image)

The dual input arrangement can enable much better control of the peaking (turn-on point) of the auxiliary amplifier. An example of an input power splitting profile is shown in Fig. 5.6. Here, the turn-on input power value of the auxiliary path is defined as $P_o$, and the gain of this path is higher than that of the main path in order to reach the input power level that is necessary for saturation. We define the ratio of $P_o$ to the maximum input power $P_{\text{max}}$ as

$$\alpha = \frac{P_o}{P_{\text{max}}}. \quad (5.2)$$

The ability to control the peaking performance can significantly increase the back-off efficiency, as a sharper turn-on characteristic can be achieved. Fig. 5.7 shows measured PAE and DE for the input power profile of Fig. 5.6. The PA exhibits a PAE peak of 37% at 8.3 dB back-off from peak PAE (9.2 dB from peak power). The measured frequency dependence of the PAE at various back-offs is shown in Fig. 5.8. The PAE peaking back-off level can be changed by varying $\alpha$, as depicted in Fig. 5.9.

**Figure 5.6:** Asymmetric input power splitting profile for the main and peaking amplifiers for $\alpha = 6$ dB.
Figure 5.7: Measured PAE and drain efficiency (DE) for asymmetric input power split with $\alpha = 10$ dB at 26 GHz.

The linearity performance of the PA under modulated signals was verified with a 64-QAM 50 MHz signal with dual input asymmetric drive. The modulation bandwidth was equipment-limited, as the peaking path requires wide bandwidth due to spectral regrowth.
Figure 5.9: Measured PAE for asymmetric input power split with $\alpha = 8\,\text{dB}$, $\alpha = 10\,\text{dB}$, $\alpha = 12\,\text{dB}$ at 26 GHz.

For the highest allowed $\text{EVM}_{\text{RMS}} \approx 5.5\%$ (normalized to the RMS of the signal constellation) for 64-QAM according to IEEE 802.11 standard, with a single carrier signal, the PA achieves average output power of $P_{\text{out}} = 17.2\,\text{dBm}$, PAE = 23%, and ACLR1 = -28 dBc with output PAPR = 5.5 dB. With an OFDM signal, the PA achieves average output power of $P_{\text{out}} = 15.1\,\text{dBm}$, PAE = 19.2%, and ACLR1 = -27.8 dBc with output PAPR = 9 dB. Memory-less DPD was applied and measured constellations are shown in Fig. 5.10. Output AM-PM response over the power range of modulation is within $\pm 3.5^\circ$.

Figure 5.10: Measured constellation of 50 MHz 64QAM for asymmetric input power split for single carrier (a), and OFDM (b) signals at 26 GHz.

Figure 5.11: Measured gain, PAE and drain efficiency (DE) for symmetric input power split at 26 GHz.

It is possible, for simplicity, to drive both inputs symmetrically (with equal power).

Figure 5.12: Measured saturated output power, peak PAE, 6 dB, and 8 dB back-off PAE versus frequency for symmetric input power split.
Table 5.1 lists recently reported, state-of-the-art Doherty and out-phasing PAs for mm-wave applications. To the author’s knowledge, the Doherty PA presented in this work features the highest power and 8 dB back-off PAE among silicon Doherty PAs.

5.4 Conclusion

A high efficiency, dual input, asymmetric, mm-wave Doherty PA in GlobalFoundries 45 nm CMOS SOI that uses a novel low-loss combiner is demonstrated. The main Doherty path uses a high efficiency 2-stack amplifier with a shunt feedback drain-source capacitance. The peaking path uses a high power 4-stack amplifier to achieve efficiency improvement at more than 6 dB back-off.

5.5 Acknowledgment

Chapter 5 is mostly a reprint of the material that has been submitted for publication as it may appear in N. Rostomyan, M. Özen, and P. Asbeck, “A Ka-Band Asymmetric Dual Input CMOS SOI Doherty Power Amplifier with 25 dBm Output Power and High Back-Off Efficiency,” 2019 IEEE Topical Conference on RF/Microwave Power Amplifiers for Radio and Wireless Applications (PAWR), 2019. This dissertation author was the primary author of this material.
Chapter 6

Synthesis Technique for Low Loss Mm-Wave T/R Combiners for TDD Front-Ends

6.1 Introduction

Current developments in mm-wave high data-rate links for emerging 5G and low-cost satellite communication systems have resulted in substantial interest in Ka-band silicon (Si)-based front-ends, which can potentially integrate all the critical transceiver building blocks. Of particular concern for transceiver co-integration in Si is the achievable power and efficiency of power amplifiers. The efficiency of the PA can contribute significantly to the overall power consumption and thermal management of mm-wave transceivers with a large number of antennas. The high antenna gain of beam-forming arrays reduces the required output power for each antenna. In many systems, the required saturated output power is relatively low and can be in the range of 15 - 25 dBm, which can easily be achieved with CMOS or SiGe based PAs.

This chapter explores the possibility of reducing the losses of the overall mm-wave transceiver front-end by exploiting the integration capabilities of CMOS and the ability to co-design different building blocks of a fully integrated transceiver. A T/R combiner synthesis methodology is proposed that minimizes the losses by incorporating the PA output and the LNA input matching networks together with the T/R switch into one network. This technique can also reduce chip area by minimizing the number of passive components. It is demonstrated here using CMOS SOI technology, but it is not limited to this technology, as it does not rely on a stacked transistor switch architecture to handle high voltage swings. The synthesis of the network is based on the desired PA output load impedance from load-pull simulations and the optimum source impedance for minimum noise figure for the LNA. An implementation example is also considered that includes a high power, 4-stack based PA and an inductively source degenerated, cascode based LNA. The PA inside the front-end achieves saturated output power of 23.6 dBm with peak PAE of 28%, while maintaining LNA noise figure of 3.2 dB and IIP3 greater than 5 dBm.

The chapter is organized as follows. In Section 6.2, two-port network parameters of the T/R combiner are derived. Then, in Section 6.3 all building blocks of the front-end comprising the PA, LNA and the T/R combiner are described in detail. Finally, Section 6.4 covers the measurement results.

6.2 Combiner Synthesis

The T/R combiner synthesis methodology utilizes an analytical technique to design a combiner with minimum losses, given the impedance presented at its inputs. This technique has been previously used to provide a general solution for the synthesis of a Doherty PA combiner which combines impedance inversion and matching into one network [48]. In the context of a TDD T/R switch, the goal is to create a single 2-port network that provides the desired PA output impedance and the LNA input impedance together with a matched antenna port. This network can be represented as a lossy and reciprocal 2-port network that includes the antenna (load) inside of it, and an external shunt switch to isolate the LNA input in the transmit mode. This configuration is shown in Fig. 6.1. In this section, boundary conditions that are necessary for the synthesis are first presented, followed by derivations of the network parameters.

6.2.1 Boundary Conditions

The derivation of the \( Y_{2p} \) admittance parameters of the 2-port network from Fig. 6.1 requires a set of boundary conditions that are determined from design goals. For the PA, it is assumed that the large signal load impedance and the off-state output impedance can be determined from load-pull or s-parameter simulations (measurements), respectively. Similarly, the desired source impedance for the LNA and the on/off-state impedance values of the switch are also assumed to be known.

Figure 6.1: T/R combiner network represented as a lossy and reciprocal 2-port network.

Figure 6.2: Transmit and receive states for determining the 2-port matrix. Transmit mode (a), general receive mode (b).

In transmit mode, the shunt switch is closed and is represented as an admittance $Y_{sw,\text{on}}$ in parallel with the LNA input. As shown in Fig. 6.2a, in this mode of operation, it is desired that the PA is presented with an optimal load admittance $Y_{L,PA}$, which is given as

$$Y_{L,PA} = Y_{\text{in}} = Y_{11} - \frac{Y_{12}^2}{Y_{22} + Y_{sw,\text{on}}}. \quad (6.1)$$

As shown in Fig. 6.2b, in the receive mode, both the PA and the shunt switch are off and can be represented as $Y_{PA,\text{off}}$ and $Y_{sw,\text{off}}$, respectively. The desired LNA source admittance $Y_{S,LNA}$ at the input can be found with

$$Y_{S,LNA} = Y_{\text{out}} = Y_{22} - \frac{Y_{12}^2}{Y_{11} + Y_{PA,\text{off}}} + Y_{sw,\text{off}}. \quad (6.2)$$

Without much loss of accuracy, (6.2) can be simplified if we assume that the PA can be modeled as a capacitor to ground in the off-state. If desired, the equation can be further simplified if the off-state PA output capacitance is tuned out with a series inductor $L_{\text{series}}$, as illustrated in Fig. 6.2c. In this case, at resonance, (6.2) can be written as

$$Y_{S,LNA}(f_0) = Y_{22} + Y_{sw,\text{off}}, \quad (6.3)$$

where $f_0 = 1/ (2\pi\sqrt{L_{\text{series}}C_{PA,\text{off}}})$.
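
As a numerical illustration of the resonance condition, the sketch below computes the series inductance that tunes out a given off-state capacitance at a chosen frequency. The 500 fF capacitance and the 28 GHz frequency are hypothetical example values, not the values of the implemented PA.

```python
import math


def series_inductance_for_resonance(f0_hz, c_pa_off_f):
    """L_series that resonates with C_PA,off at f0: L = 1 / ((2*pi*f0)^2 * C)."""
    omega0 = 2.0 * math.pi * f0_hz
    return 1.0 / (omega0 ** 2 * c_pa_off_f)


def resonance_frequency(l_series_h, c_pa_off_f):
    """f0 = 1 / (2*pi*sqrt(L_series * C_PA,off))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_series_h * c_pa_off_f))


F0 = 28e9           # Hz, hypothetical design frequency
C_PA_OFF = 500e-15  # F, hypothetical off-state PA output capacitance

l_series = series_inductance_for_resonance(F0, C_PA_OFF)
f_check = resonance_frequency(l_series, C_PA_OFF)
print(f"L_series = {l_series * 1e12:.1f} pH, f0 = {f_check / 1e9:.2f} GHz")
```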

Circuit level realization of the lossy and reciprocal 2-port network of Fig. 6.1 requires transformation to a lossless reciprocal 3-port network where the antenna (load) port can be explicitly defined, as shown in Fig. 6.3. Thus, it is necessary to derive conditions for this transformation to be valid.

The $Y_{3p}$ admittance matrix of the 3-port network of Fig. 6.3 can be written as

$$Y_{3p} = \begin{bmatrix} y_{11} & y_{12} & y_{13} \\ y_{21} & y_{22} & y_{23} \\ y_{31} & y_{32} & y_{33} \end{bmatrix}, \quad (6.4)$$

where all the elements of the matrix are purely imaginary due to the lossless condition of the 3-port, and $y_{12} = y_{21}$, $y_{13} = y_{31}$, $y_{23} = y_{32}$ due to reciprocity.

If the third port of the 3-port is terminated with a load having admittance $Y_L$, the resultant 2-port network $y$-parameters expressed in terms of the 3-port network parameters are

$$Y_{11} = y_{11} - \frac{y_{13}^2}{y_{33} + Y_L}, \quad (6.5)$$

$$Y_{12} = y_{12} - \frac{y_{13} y_{23}}{y_{33} + Y_L}, \quad (6.6)$$

$$Y_{22} = y_{22} - \frac{y_{23}^2}{y_{33} + Y_L}. \quad (6.7)$$

In order to simplify these equations, the common denominator term can be abbreviated as

$$C = y_{33} + Y_L. \quad (6.8)$$

Given that all the terms of (6.4) are purely imaginary, the real parts of the 2-port network are found to be

$$\Re \{Y_{11}\} = -y_{13}^2 \frac{\Re \{Y_L\}}{|C|^2},$$  \hspace{1cm} (6.9)

$$\Re \{Y_{12}\} = -y_{13} y_{23} \frac{\Re \{Y_L\}}{|C|^2},$$  \hspace{1cm} (6.10)

$$\Re \{Y_{22}\} = -y_{23}^2 \frac{\Re \{Y_L\}}{|C|^2}.$$  \hspace{1cm} (6.11)
Thus, for the 2-port network that results from terminating a lossless reciprocal 3-port with a load $Y_L$, a necessary and sufficient condition for transformation is formulated from (6.9)-(6.11) to be

$$\Re \{Y_{12}\} = \sqrt{\Re \{Y_{11}\} \cdot \Re \{Y_{22}\}}, \quad (6.12)$$
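The relation between (6.5)-(6.7) and (6.12) can be checked numerically. The sketch below is only an illustration: the susceptance values and the 50 Ω load are hypothetical and are not taken from the design. It terminates the third port of a lossless reciprocal 3-port and compares magnitudes, since the sign of $\Re\{Y_{12}\}$ follows the sign of the product of the two coupling susceptances.

```python
import math

def terminate_third_port(y, y_load):
    """Reduce a 3x3 admittance matrix to a 2-port with port 3 loaded, per (6.5)-(6.7)."""
    c = y[2][2] + y_load                      # common denominator, (6.8)
    y11 = y[0][0] - y[0][2] ** 2 / c          # (6.5)
    y12 = y[0][1] - y[0][2] * y[1][2] / c     # (6.6)
    y22 = y[1][1] - y[1][2] ** 2 / c          # (6.7)
    return y11, y12, y22

# Hypothetical susceptances in siemens (assumed values, for illustration only).
b11, b12, b13, b22, b23, b33 = 0.05, -0.02, 0.03, -0.01, 0.015, 0.04

# Lossless: every entry purely imaginary. Reciprocal: the matrix is symmetric.
y3p = [
    [1j * b11, 1j * b12, 1j * b13],
    [1j * b12, 1j * b22, 1j * b23],
    [1j * b13, 1j * b23, 1j * b33],
]

y_load = 1 / 50 + 0j                          # assumed 50-ohm antenna load

Y11, Y12, Y22 = terminate_third_port(y3p, y_load)

lhs = abs(Y12.real)                           # |Re{Y12}|
rhs = math.sqrt(Y11.real * Y22.real)          # sqrt(Re{Y11} * Re{Y22})
print(f"|Re(Y12)| = {lhs:.6f} S, sqrt(Re(Y11) Re(Y22)) = {rhs:.6f} S")
```

Both printed values agree for any choice of susceptances and load conductance, which is the content of (6.12).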

So far, only five equations have been identified for determining six unknowns in the $Y_{2p}$ matrix. An additional condition can be set to maximize isolation between the PA and the LNA ports. Given that the real part of $Y_{12}$ is set by (6.12), highest isolation can be achieved if

$$\Im \{Y_{12}\} = 0. \quad (6.13)$$

Thus, all elements of the $Y_{2p}$ matrix are fully determined. Owing to its complexity, the equation system consisting of (6.1), (6.2), (6.12), and (6.13) is solved with numerical solvers.

### 6.2.2 Combiner Realization

After the $Y_{2p}$ admittance matrix is fully determined, it is necessary to convert the 2-port network of the combiner into its circuit implementation. This can be accomplished in different ways. In this work, the method presented in [48] was used, where two lossless and reciprocal 2-port networks are utilized for the combiner realization.
It can be observed that the 3-port lossless combiner of Fig. 6.3 can be presented using two lossless reciprocal 2-port networks that are cascaded together with the load network, as illustrated in Fig. 6.4. For the sake of simplicity, the networks are represented in $ABCD$ matrix form. By denoting the PA side network as $T_{2p,PA}$, and the LNA side as $T_{2p,LNA}$, the resultant 2-port network between $P_1$ and $P_2$ ports is determined as

$$\begin{aligned} T_{2p} &= T_{2p,PA}\, T_Z\, T_{2p,LNA} \\ &= \begin{bmatrix} A_1 & jB_1 \\ jC_1 & D_1 \end{bmatrix}_{PA} \begin{bmatrix} 1 & 0 \\ 1/Z_L & 1 \end{bmatrix} \begin{bmatrix} A_2 & jB_2 \\ jC_2 & D_2 \end{bmatrix}_{LNA}, \end{aligned}$$  \hspace{1cm} (6.14)

where the lossless property of the $ABCD$ matrix was used to assign real components to the diagonal elements and imaginary components to the off-diagonal ones.

Based on the equations derived in the previous section for $Y_{2p}$, the system of (6.14) can be solved analytically, which has been discussed in [48] in detail.

In the final stage of combiner realization, the 2-port networks $T_{2p,PA}$ and $T_{2p,LNA}$ can be converted to lumped element $\pi$- or T-networks, thus resulting in four possible combinations. The solution that results in the best compromise for losses and effective chip area should be selected.
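The cascade of (6.14) can be illustrated with a small numerical sketch. Everything below is an assumption made for illustration: the two lossless L-section networks, their element values, and the 50 Ω load are hypothetical and do not correspond to the synthesized combiner. The sketch builds the $ABCD$ matrices, converts the product to $y$-parameters, and checks reciprocity and the real-part condition of (6.12).

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def series(z):
    """ABCD matrix of a series impedance z."""
    return [[1, z], [0, 1]]

def shunt(y):
    """ABCD matrix of a shunt admittance y."""
    return [[1, 0], [y, 1]]

def abcd_to_y(t):
    """Convert an ABCD matrix to (Y11, Y12, Y22)."""
    (a, b), (c, d) = t
    return d / b, (b * c - a * d) / b, a / b

# Hypothetical lossless networks (reactances in ohms, susceptances in siemens).
t_pa = matmul(series(12j), shunt(0.03j))    # PA-side network
t_lna = matmul(shunt(0.01j), series(40j))   # LNA-side network
t_z = shunt(1 / 50)                         # assumed 50-ohm load at the junction

t_2p = matmul(matmul(t_pa, t_z), t_lna)     # cascade as in (6.14)
Y11, Y12, Y22 = abcd_to_y(t_2p)

det = t_2p[0][0] * t_2p[1][1] - t_2p[0][1] * t_2p[1][0]
print("det(T) =", det)                      # equals 1 for a reciprocal network
print("|Re(Y12)| =", abs(Y12.real))
print("sqrt(Re(Y11) Re(Y22)) =", math.sqrt(Y11.real * Y22.real))
```

Each lossless sub-network has real diagonal and imaginary off-diagonal $ABCD$ entries, as stated for (6.14). These arbitrary element values do not satisfy the isolation condition (6.13); meeting it is the task of the synthesis procedure.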

**Figure 6.4:** Representation of the lossless 3-port combiner network with two 2-port networks terminated with a load.

### 6.3 Building Blocks of the Front-End

The design of the fully integrated TDD front-end that includes the PA, LNA, and the T/R combiner is shown in Fig. 6.5. The values of the components are provided in the caption of the figure. In this section, the design approach of the high power PA is first presented, followed by the description of the LNA and the shunt switch. Subsequently, an implementation demonstration of the T/R combiner synthesis is discussed.

### 6.3.1 High Power PA Implementation

It is well known that the power handling capability of CMOS FETs is limited due to low breakdown voltages. However, in a CMOS SOI process, high output power can be achieved by transistor stacking, which allows voltage swings of each transistor to add together. Compact and high power implementations that produce more than 25 dBm saturated output power using 4-stack devices have been demonstrated in [4, 23] for 15 GHz and 28 GHz, respectively. In transistor stacking, gates of the stacked transistors are terminated with appropriate capacitors to maintain equal voltage division along the transistors and to limit drain to gate voltage swings to values below breakdown. The 4-stack device used in this work is based on a multigate approach [4] and relies on the back-end-of-line (BEOL) metallization to realize gate capacitors. The multigate approach has the advantage of significantly reducing inter-node parasitic inductance and resistance. Besides, it provides a scalable solution for realizing large sized devices, as the cells can be arranged in an array of \( N \) elements to achieve desired device width of \( N \cdot w_g \), where \( w_g \) is the width of each individual multigate cell. However, due to the difficulty of ensuring phase coherence between cells that are spaced far apart, the maximum number of elements \( N \) is limited and depends on the frequency of operation.

The high power PA used in this work consists of 256 multigate 4-stack unit cells, as shown in Fig. 6.5. Each cell is \( w_g = 1.2 \mu m \) wide, resulting in total device width of 307.2 \( \mu m \). Minimum length, double pitch devices were used. The PA uses drain supply voltage of \( V_{DD,PA} = 4.8 \) V. The input terminal is matched to 50\( \Omega \).

### 6.3.2 LNA Implementation

The LNA is based on an inductive source degenerated cascode architecture, as shown in Fig. 6.5. Device width of 48\( \mu m \) was chosen to achieve the real part of the optimal source impedance for minimum NF (\( NF_{\text{min}} \)) to be close to 50\( \Omega \). The current density for this size cascode device at the \( NF_{\text{min}} \) point is about three to four times smaller than the current density for peak \( f_T \). Thus, biasing for \( NF_{\text{min}} \) can result in reduced gain. To achieve a compromise between gain and noise figure, the LNA is biased at almost half of the current density for peak \( f_T \), at about 0.45 mA/\( \mu m \). Although this causes the noise figure to increase by 0.1 dB above the simulated \( NF_{\text{min}} = 1.4 \) dB, it allows the device to operate close to the peak \( f_T \) (180 GHz).
The source degeneration inductance needs to be carefully EM simulated to include the ground return path inductance as well. The output is matched with a single stage LC matching network. The top gate has independent bias control and is bypassed to ground with a large capacitor $C_g = 434 \text{ fF}$. The LNA uses $V_{\text{DD,LNA}} = 2.4 \text{ V}$, which is half of the PA supply voltage.

### 6.3.3 Switch Implementation

The design considerations for the switch are simplified due to the fact that its voltage handling capabilities are relaxed. Although the PA can output peak voltage swings up to $2V_{\text{DD,PA}}$, the voltage swing at the LNA input (and thus across the switch), is much smaller due to impedance transformation of the T/R combiner. This is a significant advantage over conventional SPDT switch architectures, which usually require transistor stacking in order to handle large voltage swings. However, transistor stacking increases switch insertion loss and has many limitations when implemented on a non-SOI process such as bulk CMOS or SiGe. By contrast, the T/R switch and combiner architecture shown here can be implemented in any process.

The size of the switch is mainly determined by considering two factors. First, increasing the switch size results in lower $R_{\text{on}}$ resistance, and thus lower losses in the combiner. However, large switch transistor size also leads to significant parasitic capacitance, which tends to increase losses. Second, as the switch is realized with a non-linear element, it can contribute to signal distortion and spectral regrowth in the transmit mode. To reduce the non-linearities arising from the switch, it is required to increase the transistor size and thus reduce the voltage drop on it. The process of finding the optimal size of the switch is iterative and simulations need to be carried out together with the T/R combiner network. The switch used in this work has $67 \mu\text{m}$ width, which results in equivalent $R_{\text{on}}$ resistance of $4.4 \Omega$. 

### 6.3.4 Combiner Synthesis

As mentioned in Section 6.2, the combiner synthesis methodology is based on the desired load impedance for the PA and source impedance for the LNA. These impedances can be determined either by simulations or measurements.

At 28 GHz, load-pull simulations for the high power 4-stack PA stage at $V_{DD,PA} = 4.8$ V result in an optimal load impedance at the fundamental frequency of $Z_{L,PA} = 6.2 + j 12.3 \, \Omega$, which corresponds to an equivalent parallel load resistance of $R_{L,PA} = 30 \, \Omega$ and reactance of $X_{L,PA} = j 15.4 \, \Omega$. In the off-state, with the bottom-most transistor of the 4-stack biased to zero by grounding the gate, the PA presents an output impedance of $Z_{PA,off} = -j 17.6 \, \Omega$, which is equivalent to a parasitic capacitance of $C_{PA,off} = 323 \, \text{fF}$ to ground.

For the LNA, the optimum source impedance for achieving minimum NF at 28 GHz is equal to $Z_{S,LNA} = 50 + j 70 \, \Omega$. The off-state input impedance of the LNA can be neglected as the switch on-state impedance is much lower. For the switch of 67 $\mu$m width and $Z_{sw,on} = 4.4 \, \Omega$, the off-state impedance is $Z_{sw,off} = -j 159.7 \, \Omega$.

In this work, the combiner topology that makes use of a series inductor $L_{series}$ to tune out the output off-state capacitance of the PA is utilized. This has the advantage of simplifying the equation (6.2) into (6.3) and speeding up the calculations. Besides, based on simulations, addition of $L_{series}$ substantially increases the isolation between the PA and the LNA in the transmit mode, as shown in Fig. 6.6. Here, isolation is defined as the ratio of voltage amplitudes at the LNA input and the PA output. The simulations are based on synthesized $Y_{2p}$ 2-port networks for both cases.
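The impedance figures quoted in this section can be reproduced with a few lines of arithmetic. The sketch below assumes the standard series-to-parallel conversion and ideal reactive elements at 28 GHz; the variable names are illustrative and not from the original design files.

```python
import math

f0 = 28e9                          # design frequency in Hz
w0 = 2 * math.pi * f0              # angular frequency in rad/s

# Optimal PA load from load-pull, expressed as a series impedance.
z_load = complex(6.2, 12.3)        # ohms
mag_sq = abs(z_load) ** 2

# Equivalent parallel resistance and reactance of the same load.
r_parallel = mag_sq / z_load.real  # ohms
x_parallel = mag_sq / z_load.imag  # ohms

# Off-state PA modeled as a capacitor with 17.6 ohms of reactance at f0.
x_pa_off = 17.6                    # ohms (magnitude)
c_pa_off = 1 / (w0 * x_pa_off)     # farads

# Series inductance that resonates with c_pa_off at f0.
l_series = x_pa_off / w0           # henries

print(f"R_parallel = {r_parallel:.1f} ohm, X_parallel = {x_parallel:.1f} ohm")
print(f"C_PA,off = {c_pa_off * 1e15:.0f} fF, L_series = {l_series * 1e12:.0f} pH")
```

The results round to the parallel resistance of about 30 Ω, the reactance of 15.4 Ω, and the 323 fF capacitance quoted above, and give an inductance of roughly 100 pH for resonance at 28 GHz.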

Figure 6.6: Simulated PA to LNA isolation in the transmit mode at $P_{\text{out}} = 23$ dBm with and without $L_{\text{series}}$.

$L_{series} = 100 \, \text{pH}$ is required to tune out $C_{PA,off} = 323 \, \text{fF}$ at 28 GHz, which effectively reduces the imaginary part of the PA optimum load impedance to $Z_{L,PA} = 6.2 + j 12.3 - j 17.6 = 6.2 - j 5.3 \, \Omega$. Plugging this impedance, as well as $Z_{sw,on}$, $Z_{sw,off}$, and $Z_{S,LNA}$ into equations (6.1), (6.3), (6.12), and (6.13), the 2-port admittance matrix of the combiner can be found as

$$Y_{2p} = \begin{bmatrix} 0.1323 e^{j35^\circ} & 0.02694 \\ 0.02694 & 0.0171 e^{j67^\circ} \end{bmatrix}. \quad (6.15)$$
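As a quick sanity check, the entries of (6.15) can be tested against the conditions (6.12) and (6.13). This is a minimal sketch that uses the rounded values printed in (6.15), so a small residual from rounding is expected.

```python
import cmath
import math

# Entries of the synthesized admittance matrix (6.15), in siemens.
Y11 = cmath.rect(0.1323, math.radians(35))
Y12 = complex(0.02694, 0.0)
Y22 = cmath.rect(0.0171, math.radians(67))

# Condition (6.12): Re{Y12} = sqrt(Re{Y11} * Re{Y22}).
rhs = math.sqrt(Y11.real * Y22.real)
relative_mismatch = abs(Y12.real - rhs) / Y12.real

print(f"Re(Y12) = {Y12.real:.5f} S, sqrt(Re(Y11) Re(Y22)) = {rhs:.5f} S")
print(f"relative mismatch = {relative_mismatch:.2%}")

# Condition (6.13): Im{Y12} = 0 for maximum PA-to-LNA isolation.
print("Im(Y12) =", Y12.imag)
```

The two sides of (6.12) agree to well within one percent, which is consistent with the three to four significant digits given in (6.15).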

Figure 6.7: Simulated PA to antenna and LNA to antenna losses.

It was outlined in Section 6.2.2 that the $Y_{2p}$ matrix can be realized with two 2-port networks, each represented with lumped element $\pi$- or T-networks. $\pi$-networks were chosen in this design as they lead to the lowest losses and optimal layout in terms of realizable LC component values. Furthermore, a choke element was added to the network to supply the drain of the PA. If a small choke is chosen, the designer can include the impedance of the choke in the calculations for $Z_{L,PA}$ and $Z_{PA,off}$. The realization of the final network will also require adjustments of the LC component values due to their finite quality factors, followed by detailed EM simulations.

The complete T/R combiner network is shown in Fig. 6.5. The network consists of three inductors and three capacitors. The effective quality factors of the inductors are in the range of 15-25, depending on the inductance. EM based simulations of the PA (from top transistor drain terminal) to antenna loss in the transmit mode; and antenna to LNA (to input transistor gate terminal) loss in the receive mode are illustrated in Fig. 6.7. It can be observed that the overall simulated PA output loss, which by the nature of the combiner design includes both PA matching and the T/R switch loss, is about 0.9 dB at 28 GHz. Similarly, the LNA input loss is

**Figure 6.8**: Simulated PA to LNA isolation and voltage amplitude at the LNA input in the transmit mode at $P_{\text{out}} = 23$ dBm.

In the transmit mode, it is important that the PA to LNA isolation is sufficient to maintain voltage swings at the input of the LNA within reliability or breakdown limits. As shown in Fig. 6.8, even at peak output power of 23 dBm, the isolation is more than 23 dB and the LNA gate voltage is below 0.35 V in the frequency range from 24 to 32 GHz. The synthesized T/R combiner provides wideband frequency response both for the PA and the LNA.

In a phased array system, the antenna impedance can vary depending on beam-steering angle. To evaluate the sensitivity of the T/R combiner against antenna mismatch, the PA to antenna and LNA to antenna losses at 28 GHz were simulated for a load reflection coefficient of magnitude $|\Gamma| = -10$ dB and phase of 0 to $2\pi$. The results are shown in Fig. 6.9 and suggest that antenna mismatch has a lesser effect on the PA performance than on the LNA.
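The set of antenna loads used in such a mismatch sweep can be generated as follows. This sketch assumes a 50 Ω reference impedance and interprets the stated magnitude as a 10 dB return loss ($|\Gamma| \approx 0.316$); both are assumptions, as is the choice of eight phase points.

```python
import cmath
import math

z0 = 50.0                                   # assumed reference impedance in ohms
gamma_mag = 10 ** (-10 / 20)                # 10 dB return loss -> |Gamma| ~ 0.316

# Sample the constant-|Gamma| circle at eight equally spaced phases.
phases = [2 * math.pi * k / 8 for k in range(8)]
gammas = [cmath.rect(gamma_mag, p) for p in phases]

# Load impedance corresponding to each reflection coefficient.
loads = [z0 * (1 + g) / (1 - g) for g in gammas]

vswr = (1 + gamma_mag) / (1 - gamma_mag)
print(f"|Gamma| = {gamma_mag:.3f}, VSWR = {vswr:.2f}")
for p, z in zip(phases, loads):
    print(f"phase = {math.degrees(p):5.1f} deg -> Z_L = {z.real:6.1f} {z.imag:+6.1f}j ohm")
```

Each of these loads would then replace the nominal antenna impedance in the loss simulation.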

As the switch transistor is the only active component in the T/R combiner, its size is the major parameter defining the linearity of the switch. Also, the gate resistance $R_{g,sw}$ of the $S_1$ transistor can substantially degrade the linearity. The combiner non-linearities were evaluated with two-tone simulations, with the goal of keeping the third-order intermodulation distortion (IMD3) component at the antenna terminal below -45 dBc at the peak output power in the transmit mode. The dependence of the output IMD3 component on $R_{g,sw}$ is illustrated in Fig. 6.10. It can be observed that for values of $R_{g,sw}$ above 1 kΩ, no additional improvement in linearity can be gained. $R_{g,sw} = 2.5\,\text{k}\Omega$ was chosen in this work.

Figure 6.10: Simulated output IMD3 in the transmit mode at $P_{\text{out}} = 23 \text{ dBm}$.

Figure 6.11: Simulated transient response in the transmit mode with settled PA.

The on/off switching time of the T/R switch combiner is determined by the $R_{g,sw}$ biasing resistor and the junction capacitance at the gate of the $S_1$ transistor, as well as the settling time of the network. The effective switching speed will also be determined by the settling behavior of the PA and LNA. The transient response for the transmit mode in the event of $S_1$ turning “on” while the PA is already settled is shown in Fig. 6.11. For $R_{g,sw} = 2.5\, \text{k}\Omega$, the simulated switching times $t_{TX,\text{on}}$ and $t_{TX,\text{off}}$ at 28 GHz are about 240 ps and 200 ps, respectively.

### 6.4 Experimental Results

The TDD front-end chip which contains the high power PA, the LNA and the novel T/R switch combiner for 28 GHz band was fabricated in GlobalFoundries 45 nm CMOS SOI process. It occupies overall chip area of 0.7x0.77 mm$^2$; the RF portion (without pads) occupies a compact area of 0.5x0.55 mm$^2$. The chip micrograph is shown in Fig. 6.12.

#### 6.4.1 Front-end LNA Measurements

Fig. 6.13 shows measured small-signal s-parameters of the integrated LNA. The measurements are done in the receive mode, for which the input gate voltage of the PA is set to zero ($V_{G1,PA} = 0\, \text{V}$, the rest is the same as on Fig. 6.5) and the $S_1$ switch is off. The DC power con-
Figure 6.13: Measured (solid lines) and simulated (dashed lines) s-parameters of the LNA.

The small-signal $s_{21}$ gain peaks at 27.5 GHz and measures 11.2 dB.

Continuous wave (CW) and two-tone large signal measurements were conducted to evaluate the linearity of the LNA. Fig. 6.14 shows the input-referred P1dB and IP3 versus frequency. A tone spacing of 10 MHz was used for the two-tone measurements. In the frequency range of 24 to 30 GHz the IIP3 is above 5 dBm, while the lowest input P1dB of -9 dBm (output P1dB = 1.6 dBm) is attained at 27 GHz. The LNA achieves saturated output power of 8.6 dBm and peak power added efficiency (PAE) of 19%.

Figure 6.14: Measured IIP3 and input P1dB of the LNA.

Noise figure measurements were conducted using Keysight 346CK01 noise source. Measured and simulated noise figure versus frequency curves are shown in Fig. 6.15. Minimum NF of 3.2 dB is achieved at 28 GHz.

### 6.4.2 Front-end PA Measurements

Fig. 6.16 shows measured small-signal s-parameters of the PA when the front-end operates in the transmit mode by de-biasing the LNA and turning the S1 switch on. $s_{21}$ gain measures 12 dB at 26 GHz with a -3 dB bandwidth from 23 to 30.9 GHz, which results in a fractional bandwidth of 30%.

Fig. 6.17 illustrates measured and simulated large signal gain, PAE, and drain efficiency (DE) at 26 GHz. Large signal measurements were conducted with the gate-to-source voltage biased at 0.25 V. The PA achieves maximum saturated output power $P_{\text{sat}} = 23.6\,\text{dBm}$ (230 mW) and a peak $\text{PAE}_{\text{max}} = 28\%$ as well as drain efficiency $DE = 31\%$ at $P_{\text{out}} = 23\,\text{dBm}$. The output P1dB is about 22 dBm, almost 1.5 dB below $P_{\text{sat}}$. Slight deviations from simulations can be attributed to inaccuracies in transistor large-signal modeling, input matching, and EM modeling and simulation of the output combiner.

The linearity of the PA has been studied with 64-QAM orthogonal frequency division multiplexing (OFDM) signals of 800 MHz modulation bandwidth. The signals were generated using a Keysight M8195A 65 GSa/s arbitrary waveform generator (AWG). A high sampling rate digital oscilloscope and an external mixer were used to capture down-converted output signals from the PA. Measured EVM and average PAE results at 26 GHz are shown in Fig. 6.18. The input PAPR of the signal used is 9.7 dB. For the highest allowed $EVM = 5.5\%$ (normalized to the RMS of the signal constellation) for 64-QAM according to the IEEE 802.11 standard, the PA achieves average output power of 14 dBm, average PAE of 11.5%, and ACLR1 of -30.4 dBc.
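The modulated-signal figures can be related to the CW results with simple decibel arithmetic. The sketch below is an illustration only; in particular, it assumes that the output PAPR equals the 9.7 dB input PAPR, whereas compression in the PA would reduce it in practice.

```python
import math

evm_percent = 5.5                  # EVM limit for 64-QAM used in the text
p_avg_dbm = 14.0                   # average output power at that EVM
papr_db = 9.7                      # input PAPR of the OFDM signal
p_sat_dbm = 23.6                   # measured CW saturated output power

# EVM expressed in dB relative to the RMS constellation amplitude.
evm_db = 20 * math.log10(evm_percent / 100)

# Peak envelope power if the PAPR were preserved at the output (assumption).
p_peak_dbm = p_avg_dbm + papr_db

# Back-off of the average power from saturation, and Psat in milliwatts.
backoff_db = p_sat_dbm - p_avg_dbm
p_sat_mw = 10 ** (p_sat_dbm / 10)

print(f"EVM = {evm_db:.1f} dB")
print(f"peak envelope power = {p_peak_dbm:.1f} dBm, back-off = {backoff_db:.1f} dB")
print(f"Psat = {p_sat_mw:.0f} mW")
```

Under this assumption the signal peaks land close to the saturated output power, which is consistent with the average power being limited by EVM at about 9.6 dB of back-off.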
Figure 6.17: Measured (solid lines) and simulated (dashed lines) Gain, PAE, and DE of the PA at 26 GHz.

Figure 6.18: Measured EVM and average PAE of the PA at 26 GHz with 64-QAM 800 MHz OFDM signal.

The captured baseband spectrum is shown in Fig. 6.19. No DPD was used for the modulated measurements in order to characterize PA’s inherent linearity limitations.

It is also important to analyze amplitude to phase distortion (AM-PM) of the PA. Fig. 6.20 illustrates output AM-PM response over the power range of modulation. The PA demonstrates excellent AM-PM response, in keeping with prior reported results of AM-PM using stacked FET PAs in CMOS SOI.

Table 6.1 lists recently reported, state of the art, Ka-band, mm-wave transceiver front-ends. Compared to other works, the front-end presented in this paper features higher PAE and saturated output power in transmit mode and state of the art noise figure (LNA including T/R switch) in receive mode.

### 6.5 Conclusion

In this paper, a T/R combiner synthesis methodology is presented that optimizes the losses by combining the PA output and the LNA input matching networks together with the T/R switch into one network. A front-end implementation that includes a high power 4-stack PA, an inductively source degenerated, cascode LNA and the proposed T/R switch combiner is also demonstrated in GlobalFoundries 45 nm CMOS SOI technology. The PA inside the front-end achieves saturated output power of 23.6 dBm with peak PAE of 28%, while the LNA maintains a noise figure of 3.2 dB.

**Figure 6.19:** Measured spectrum of the PA at 26 GHz with 64-QAM 800 MHz OFDM signal at $P_{\text{out}} = 14$ dBm, EVM = 5.5%.

Figure 6.20: Measured AM-PM of the PA at 26 GHz with 64-QAM 800 MHz OFDM signal at $P_{\text{out}} = 14 \text{dBm}$, EVM = 5.5%.

Table 6.1: Comparison to Recent Ka-band mm-Wave Transceiver Front-ends.

<table>
<thead>
<tr>
<th>Technology</th>
<th>This Work</th>
<th>[11]</th>
<th>[12]</th>
<th>[9]</th>
</tr>
</thead>
<tbody>
<tr>
<td>Fo (GHz)</td>
<td>28</td>
<td>28</td>
<td>28</td>
<td>29</td>
</tr>
<tr>
<td>Supply (V)</td>
<td>2.4 &amp; 4.8</td>
<td>2.7 &amp; 1.5</td>
<td>-</td>
<td>1.2 &amp; 2.2</td>
</tr>
<tr>
<td>TX Psat (dBm)</td>
<td>23.6</td>
<td>16</td>
<td>14</td>
<td>12.5</td>
</tr>
<tr>
<td>TX OP1dB (dBm)</td>
<td>22</td>
<td>13.5</td>
<td>12</td>
<td>10.5</td>
</tr>
<tr>
<td>TX PAEmax (%)</td>
<td>28</td>
<td>20</td>
<td>20</td>
<td>13</td>
</tr>
<tr>
<td>RX NF (dB), LNA w/ SW</td>
<td>3.2</td>
<td>6</td>
<td>3.8-4.4</td>
<td>4.6</td>
</tr>
<tr>
<td>RX IP1dB (dBm)</td>
<td>-7.1</td>
<td>-22.5</td>
<td>-</td>
<td>-22</td>
</tr>
<tr>
<td>RX IIP3 (dBm)</td>
<td>5.7</td>
<td>-</td>
<td>-</td>
<td>-12</td>
</tr>
<tr>
<td>Area per channel (mm$^2$)</td>
<td>0.54*</td>
<td>5.19</td>
<td>1.16</td>
<td>2.94</td>
</tr>
</tbody>
</table>

Notes: *Includes PA, LNA, T/R switch only.

### 6.6 Acknowledgment

Chapter 6 is mostly a reprint of the material that has been submitted for publication as it may appear in N. Rostomyan, M. Özen, and P. Asbeck, “Synthesis Technique for Low Loss Mm-Wave T/R Combiners for TDD Front-Ends,” IEEE Trans. Microw. Theory Tech., 2018. The dissertation author was the primary author of this material.
Chapter 7

Adaptive Cancellation of Digital Power Amplifier Receive Band Noise for FDD Transceivers

Highly reconfigurable and multi-standard radio blocks have attracted considerable research interest to overcome the problems of overcrowded RF frequency bands and increased demand for lower cost, fully integrated radio systems. A promising approach to achieve high efficiency, high integration, and wideband operation is based on digitally modulated power amplifiers (DPAs), which function as RF power digital-to-analog converters. These circuits not only allow frequency agnostic PA designs, but also provide digital modulation and output power control. They facilitate efficiency enhancement techniques such as polar and Doherty techniques [13–16]. However, due to their clocked nature, the DPAs suffer from high level of out of band quantization noise which makes their use in frequency division duplex (FDD) systems challenging.

In FDD systems widely used for current cellular communication, the transmitter and the receiver operate at the same time but at different center frequencies. The transmit and receive bands are usually closely spaced, such that undesired spurious emissions from the transmitter can limit the performance of such systems. In particular, spurious emissions from the transmitter in the receive band, commonly referred to as receive band noise (RxBN), are filtered by a duplexer. Currently, in order to minimize degradation of receiver sensitivity, it is required that the RxBN power spectral density at the input of the receiver LNA be kept below -180 dBm/Hz. To achieve such a low RxBN floor (below $kT$), the out of band noise floor at the PA output is usually required to be below -130 dBm/Hz, and the duplexer is required to have large TX-RX isolation (> 50 dB). Given the close frequency spacing (10s of MHz) between transmit and receive bands, the use of high-Q resonators in duplexers is necessary, resulting in high insertion loss ($\approx 3$ to $4$ dB). Besides, the duplexers require large PCB area, usually in the order of 2x2 mm$^2$ per component. As each band requires a separate duplexer, enabling multi-band operation results in a large number of duplexers, increasing the size and the cost of a cellular communication device.
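The noise budget described above can be written out numerically. The sketch below assumes a reference temperature of 290 K and an ideal noiseless receiver, so the computed sensitivity degradation is an upper bound for illustration rather than a figure from this work; the function names are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23         # J/K

def thermal_floor_dbm_hz(temp_k=290.0):
    """Thermal noise density kT in dBm/Hz at the given temperature."""
    return 10 * math.log10(K_BOLTZMANN * temp_k * 1000)

def db_power_sum(*levels_db):
    """Sum of uncorrelated noise powers given in dB units."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels_db))

pa_noise_floor = -130.0            # dBm/Hz at the PA output in the RX band
duplexer_isolation = 50.0          # dB of TX-RX isolation

rxbn_at_lna = pa_noise_floor - duplexer_isolation
kt = thermal_floor_dbm_hz()

# Rise of the total noise floor above kT caused by the residual RxBN.
desense_db = db_power_sum(kt, rxbn_at_lna) - kt

print(f"kT = {kt:.2f} dBm/Hz, RxBN at LNA input = {rxbn_at_lna:.0f} dBm/Hz")
print(f"noise floor rise for an ideal receiver = {desense_db:.2f} dB")
```

A PA floor of -130 dBm/Hz combined with 50 dB of isolation lands at -180 dBm/Hz, about 6 dB below $kT$; relaxing either number moves the RxBN toward or above the thermal floor.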

Significant improvements can be achieved if the duplexer rejection in the receive band is relaxed, for example, to achieve 1 dB lower insertion loss of the duplexer in the transmit band. The benefits of 1 dB lower insertion loss can be appreciated by recognizing that it leads to more than 25% reduction in power consumption of the PA (assuming 24 dBm average output power at the antenna port), nearly equivalent to the benefits of alternative efficiency enhancement techniques such as envelope tracking (ET). Furthermore, reduction of RxBN in the receiver will allow the use of more advanced PA architectures, such as DPAs, which have inherently high quantization noise (> $-120$ dBm/Hz) and at present fail the RxBN specs of current cellular systems.

This chapter proposes the use of a feedback receiver to cancel high levels of quantization noise of digital PAs at the receive band located at a small duplex spacing. In the digital domain, this technique utilizes well-known adaptive noise cancellation principles which reduce the noise in a signal if a correlated noise component is known [49, 50]. Adaptive cancellation also has the advantage of being able to track system response changes due to temperature variations and antenna mismatch. Using main and feedback receiver implementations based on off-the-shelf components, experimental verification of cancellation is discussed in detail for a high power CMOS digital PA at LTE band 5 (TX: 881.5 MHz and RX: 836.5 MHz). More than 22 dB cancellation is achieved to reduce RxBN below the thermal noise floor at 45 MHz duplex spacing. The cancellation technique is also able to recover fully corrupted constellations in the presence of a low power desired RX signal.

The organization of this chapter is as follows. In Section 7.1, an introduction to digital power amplifiers and their quantization noise is presented. Section 7.2 covers the proposed cancellation principle. Finally, experimental results are discussed in Section 7.3.

### 7.1 Digital Power Amplifier and Quantization Noise

The receive band noise cancellation technique presented in this paper utilizes a watt-level, digital polar PA implemented in 0.18\(\mu\)m CMOS SOI that was presented in [16]. The DPA uses a 10 bit amplitude control word (ACW) to realize amplitude modulation by controlling the number of active output cells, while a constant amplitude input signal centered at the carrier frequency provides the phase modulation.

Finite resolution of the amplitude modulation creates quantization error, which causes white amplitude noise commonly referred to as quantization noise in classical DAC theory. In a polar PA architecture, this wide-band noise is up-converted with the phase modulated signal and appears at the output spectrum as a white noise floor. Based on elementary sampling theory, the quantization noise relative to the signal decreases by approximately \(\nu \cdot 6\) dB, where \(\nu\) is the effective number of bits (ENOB). This can be confirmed with system-level simulations on a conventional polar DPA, in which the resulting spectral regrowth was estimated as a function of the ACW resolution. As shown in Fig. 7.1 for a 5 MHz LTE signal with 45 MHz sampling rate and ACW resolution varying from 4 to 10 bits, the spectral purity improves with higher resolution. The quantization noise floor can also be reduced by oversampling, which spreads the total quantization noise power across a larger bandwidth. However, increasing either the ACW resolution or the sampling clock rate results in non-ideal effects which degrade spectral purity both for close-in adjacent channels and far-out broadband flat noise. These effects reduce the achievable ENOB and are dominated by code glitches, as well as time misalignment between the amplitude and phase signals. Also, ENOB is further reduced by the number of bits required for DPD and power control, thus making it more challenging to achieve an RxBN floor below the required -130 dBm/Hz.

Figure 7.1: Simulated spectrum of a 5 MHz LTE signal with 45 MHz sampling rate and ACW resolution varying from 4 - 10 bits.
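The dependence of the quantization noise floor on resolution and sampling rate can be sketched with the textbook model of an ideal uniform quantizer. This is an idealized illustration, not a model of the DPA in this work: it assumes a full-scale sinusoidal signal and quantization noise spread uniformly from DC to half the sampling rate, and it ignores glitches, timing misalignment, and the polar up-conversion.

```python
import math

def ideal_sqnr_db(bits):
    """Signal-to-quantization-noise ratio of an ideal quantizer, full-scale sine."""
    return 6.02 * bits + 1.76

def noise_density_dbc_hz(bits, fs_hz):
    """Quantization noise density relative to the carrier, in dBc/Hz."""
    return -ideal_sqnr_db(bits) - 10 * math.log10(fs_hz / 2)

fs = 45e6                                   # sampling rate used for Fig. 7.1
for bits in range(4, 11):
    print(f"{bits:2d} bits: {noise_density_dbc_hz(bits, fs):7.1f} dBc/Hz")

# Each extra bit lowers the floor by about 6 dB; doubling fs lowers it by about 3 dB.
per_bit_db = noise_density_dbc_hz(9, fs) - noise_density_dbc_hz(10, fs)
per_octave_db = noise_density_dbc_hz(10, fs) - noise_density_dbc_hz(10, 2 * fs)
print(f"per bit: {per_bit_db:.2f} dB, per doubling of fs: {per_octave_db:.2f} dB")
```

The comparison shows why resolution is the stronger lever in the ideal case, and why the non-ideal effects described above, which cap the achievable ENOB, are so costly.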

### 7.2 Cancellation Technique

The quantization noise of DPAs has a stochastic, random nature, and cannot be reduced by DPD. Various approaches have been demonstrated to reduce the out of band non-deterministic noise of linear as well as digital power amplifiers. On-chip digital up-sampling and FIR filtering have been proposed to mitigate the quantization noise and sampling clock images of DPAs in [14] and [51]. However, these methods substantially increase the sampling clock rate as well as the complexity of the DPA and fail to reduce the noise floor below -115 dBm/Hz at less than 100 MHz offset from the center frequency. Use of an additional feedback receiver was proposed in [52] to model deterministic output noise components of the PA that arise from non-linearities. A feedback receiver was also used in [53] to filter stochastic white noise of a linear PA.

The cancellation technique used in this work is also based on the concept of capturing the PA output at the receive band with a feedback receiver through a coupler. The block diagram of the technique is illustrated in Fig. 7.2. The system is realized with off-the-shelf components from Mini-Circuits, ADI, and TI, while the digital processing is done on a PC. LTE band 5 commercial duplexer was used with TX-RX isolation of at least 60 dB. Note that the directional coupler is already part of commercial cellphone transceivers for output power control and DPD.

One of the main challenges of realizing the feedback receiver is the high blocker power tolerance requirement. The feedback receiver operates at $f_{RX}$ center frequency and captures the PA output noise through a directional coupler. With a -30 dB coupler, the transmit signal from the PA at $f_{TX}$ with up to 27 dBm peak power (for cellular uplink LTE systems) is still high enough to desensitize the feedback receiver.

High blocker tolerance can be achieved with mixer-first receivers [54] or N-path filter based LNAs [55,56]. While these receivers traditionally suffer from high NF (> 5 dB), the significant levels of RxBN from digital PAs (> −120 dBm/Hz) allow enough signal-to-noise ratio for cancellation. With IC level implementation of a feedback receiver, the additional DC power consumption (< 50 mW) required for it is more than an order of magnitude smaller than that of the transmitter. In this work, a mixer-first, low-IF architecture was used to enable system level (non-IC) implementation and verification. Key performance parameters of the realized main and feedback receivers are summarized in Table 7.1.

Table 7.1: Performance Summary of Main and Feedback Receivers

<table>
<thead>
<tr>
<th></th>
<th>Main RX</th>
<th>Feedback RX</th>
</tr>
</thead>
<tbody>
<tr>
<td>Architecture</td>
<td>Low IF</td>
<td></td>
</tr>
<tr>
<td>\( f_{RX} \) (MHz)</td>
<td>836.5</td>
<td></td>
</tr>
<tr>
<td>\( f_{IF} \) (MHz)</td>
<td>25</td>
<td></td>
</tr>
<tr>
<td>\( NF_{DSB} \) (dB)</td>
<td>0.911</td>
<td>12.55</td>
</tr>
<tr>
<td>Gain (dB)</td>
<td>56</td>
<td>36.8</td>
</tr>
<tr>
<td>TX blocker rejection</td>
<td>16 dB @70 MHz</td>
<td></td>
</tr>
<tr>
<td>Input P1dB (dBm)</td>
<td>-40</td>
<td>-9.5</td>
</tr>
</tbody>
</table>

The cancellation algorithm is based on the classical adaptive noise cancellation principle. According to Fig. 7.2, the \( d(n) \) signal of the main receiver path in the digital domain contains an information bearing signal \( s(n) \) coming from the antenna and a corrupting noise signal \( x(n) \) that is dominated by the PA's receive band noise. The feedback receiver, on the other hand, receives only the reference RxBN signal \( x'(n) \), which is statistically correlated with \( x(n) \), such that

\[
E \{ x(n)x'(n-k) \} \neq 0. \tag{7.1}
\]

Furthermore, it is assumed that the signal \( s(n) \) is not correlated with the noise sources \( x(n) \) and \( x'(n) \), thus

\[
E \{ s(n)x(n-k) \} = 0, \quad E \{ s(n)x'(n-k) \} = 0, \tag{7.2}
\]

for all \( k \).

For any sample at time instance $n$, the reference noise $x'(n)$ is processed by an adaptive filter with time varying filter coefficients $\mathbf{w}(n) = [w_0(n), w_1(n), \ldots, w_{M-1}(n)]$ to produce the filter output signal

\[
y(n) = \sum_{k=0}^{M-1} w_k(n)x'(n-k), \tag{7.3}
\]

while the residual error signal of the noise canceling system is

\[
e(n) = d(n) - y(n) = s(n) + x(n) - y(n). \tag{7.4}
\]

The error signal $e(n)$ is fed back to the filter in order to modify the filter coefficients in an adaptive manner, for example using the least-mean-square (LMS) or recursive least-squares (RLS) algorithms. The RLS algorithm has a higher computational cost than LMS, but performs considerably better in terms of steady-state mean square error (MSE) and convergence time. Upon convergence, both adaptation algorithms yield a filter output $y(n)$ that is statistically close to $x(n)$, so that $e(n)$ becomes a close replica of $s(n)$:

\[
e(n) \approx s(n). \tag{7.5}
\]

Thus, the output of the noise cancellation filter is the “cleaned” information carrying signal. In practice, $e(n)$ also contains a residual error term (excess mean square error (EMSE)) and thermal noise components of the main and feedback receivers, thus reducing the SNR of the output signal [50, 53].
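As an illustration of Eqs. (7.3) to (7.5), the following is a minimal sketch of an LMS-based noise canceller in Python. It is not the implementation used in this work; LMS is chosen only because it is the simpler of the two algorithms mentioned above. The real-valued signals, the sinusoidal stand-in for $s(n)$, the three-tap coupling path `h`, the step size `mu`, and the function name `lms_noise_canceller` are all assumptions made for illustration.

```python
import numpy as np


def lms_noise_canceller(d, x_ref, num_taps=6, mu=0.005):
    """Adaptive noise canceller using the LMS update (real-valued signals).

    d      : main-path samples d(n) = s(n) + x(n)
    x_ref  : reference noise samples x'(n)
    Returns the error signal e(n) and the final filter coefficients.
    """
    w = np.zeros(num_taps)
    e = np.zeros(len(d))
    # Zero-pad so that x'(n-k) is defined for the first samples.
    padded = np.concatenate([np.zeros(num_taps - 1), x_ref])
    for n in range(len(d)):
        # Tap vector [x'(n), x'(n-1), ..., x'(n-M+1)]
        u = padded[n:n + num_taps][::-1]
        y = w @ u                 # Eq. (7.3): filter output
        e[n] = d[n] - y           # Eq. (7.4): residual error
        w = w + mu * e[n] * u     # LMS coefficient update
    return e, w


# Synthetic test signals (all values are illustrative assumptions).
rng = np.random.default_rng(0)
N = 20000
s = 0.1 * np.sin(2 * np.pi * 0.01 * np.arange(N))  # stand-in for s(n)
x_ref = rng.standard_normal(N)                     # reference noise x'(n)
h = np.array([0.8, -0.4, 0.2])                     # hypothetical coupling path
x = np.convolve(x_ref, h)[:N]                      # corrupting noise x(n)
d = s + x

e, w = lms_noise_canceller(d, x_ref)

# Compare noise power before and after cancellation once the filter has settled.
tail = slice(-5000, None)
noise_before = np.mean(x[tail] ** 2)
noise_after = np.mean((e[tail] - s[tail]) ** 2)
print(f"cancellation: {10 * np.log10(noise_before / noise_after):.1f} dB")
print("final coefficients:", np.round(w, 3))
```

In this sketch the residual `e - s` plays the role of the excess mean square error: it shrinks as `mu` is reduced, at the cost of slower convergence.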

The adaptive noise cancellation offers several important advantages. First, the system can track changes in the main and feedback receivers due to temperature and antenna mismatch. In addition, this technique requires only a small amount of DC power, in line with current state-of-the-art receivers. Furthermore, the frequency reconfigurable architecture of the main and feedback receivers allows multi-band operation.
7.3 Experimental Results

As already mentioned, the verification of the receive band noise cancellation has been performed on a CMOS digital PA to mitigate its high levels of quantization noise and ideally reduce it below -180 dBm/Hz. The measured PA output spectrum at the center of LTE band 5 (881.5 MHz) is shown in Fig. 7.3. The DPA produces 22.3 dBm average output power using a 4 MHz 16-QAM OFDM signal. The sampling rate of the DPA is 85 MHz. At 45 MHz duplex spacing, the DPA produces a flat wideband noise floor with an average noise spectral density of -91 dBm/Hz. This high level of noise is not sufficiently attenuated by the duplexer isolation and can significantly increase the NF of the main receiver or completely desensitize it.

Verification of the cancellation has been conducted with and without the presence of a desired signal at the main receiver. The RLS algorithm was used with 6 filter taps. Shown in Fig. 7.4 are the spectra of the measured pre- and post-cancellation total noise floors, as well as the thermal noise floor of the main receiver without an information bearing signal (which corresponds to a noise figure of about 1 dB). The noise floors are referred to the input of the main receiver. The average RxBN power before cancellation over a 10 MHz band is $P_{\text{RxBN,pre}} = -156$ dBm. The total post-cancellation noise floor is the sum of the residual RxBN and the thermal noise of the main receiver. Because the thermal noise of the main receiver is not correlated with the RxBN of the PA, the power spectral density $N_{\text{RxBN}}$ of the residual RxBN can be computed by subtracting, in linear units, the thermal noise $N_{m,\text{RX}}$ of the main receiver from the post-cancellation total $N_{\text{total,post}}$, according to

$$N_{\text{RxBN}} = N_{\text{total,post}} - N_{m,\text{RX}}. \tag{7.6}$$

The average post-cancellation residual RxBN power over the same 10 MHz bandwidth is calculated to be $P_{\text{RxBN,post}} = -178.2$ dBm, achieving over 22 dB of cancellation.
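The subtraction in Eq. (7.6) applies to linear power densities, not to values in dB. The sketch below shows the conversion under stated assumptions: densities are expressed in dBm/Hz, the thermal floor of -173 dBm/Hz is an illustrative value obtained from -174 dBm/Hz at 290 K plus an assumed 1 dB noise figure, and the function names are hypothetical. The numbers are not measured data.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert a level in dBm (or dBm/Hz) to mW (or mW/Hz)."""
    return 10 ** (p_dbm / 10)


def mw_to_dbm(p_mw):
    """Convert a level in mW (or mW/Hz) to dBm (or dBm/Hz)."""
    return 10 * math.log10(p_mw)


def residual_rxbn_dbm_hz(n_total_post_dbm_hz, n_thermal_dbm_hz):
    """Eq. (7.6): subtract the thermal floor from the total in linear units."""
    residual_mw = dbm_to_mw(n_total_post_dbm_hz) - dbm_to_mw(n_thermal_dbm_hz)
    if residual_mw <= 0:
        raise ValueError("total noise floor must exceed the thermal floor")
    return mw_to_dbm(residual_mw)


# Illustrative values only (assumptions, not measurements).
n_thermal = -173.0    # assumed thermal floor, dBm/Hz
n_residual = -178.2   # assumed residual RxBN, dBm/Hz

# Uncorrelated noise contributions add in linear units.
n_total = mw_to_dbm(dbm_to_mw(n_thermal) + dbm_to_mw(n_residual))
print(f"total post-cancellation floor: {n_total:.2f} dBm/Hz")

# Applying Eq. (7.6) recovers the residual RxBN from the total.
recovered = residual_rxbn_dbm_hz(n_total, n_thermal)
print(f"recovered residual RxBN: {recovered:.2f} dBm/Hz")
```

Because the residual in this example lies several dB below the thermal floor, small errors in either measured floor translate into larger errors in the computed residual.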

It is also important to evaluate RxBN cancellation in the presence of a desired information carrying signal at the main receiver. Here, EVM was used as a metric to measure the effectiveness of the cancellation. A 5 MHz 16-QAM signal was used to compare the EVM before and after cancellation, as well as the main receiver EVM without the presence of RxBN. EVM versus input power at the antenna port for these three cases is illustrated in Fig. 7.5. It can be seen that the cancellation technique can successfully reconstruct corrupted signals for both low and high input powers. Examples of pre- and post-cancellation constellations are shown in Fig. 7.6 for -80 dBm input power.

Figure 7.3: Measured DPA output spectrum for 22.3 dBm average output power using a 4 MHz 16-QAM OFDM signal.

Figure 7.4: Measured pre- and post-cancellation spectra together with thermal noise floor of the main receiver without an information carrying signal.

**Figure 7.5:** Measured pre- and post-cancellation EVM, as well as the main receiver EVM without the presence of RxBN.

**Figure 7.6:** Measured constellation for -80 dBm input power. Pre-cancellation (a), and post-cancellation (b).
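For reference, the EVM metric used above can be sketched as follows. This is a generic RMS-EVM calculation, not the measurement procedure used in this work: it assumes the error is normalized to the RMS power of the reference constellation (other normalizations exist), and the 16-QAM symbols, the 20 dB SNR, and the function name `evm_percent` are illustrative assumptions.

```python
import numpy as np


def evm_percent(received, reference):
    """RMS EVM in percent, normalized to the RMS reference power."""
    received = np.asarray(received, dtype=complex)
    reference = np.asarray(reference, dtype=complex)
    error_power = np.mean(np.abs(received - reference) ** 2)
    reference_power = np.mean(np.abs(reference) ** 2)
    return 100.0 * np.sqrt(error_power / reference_power)


# Random 16-QAM reference symbols (illustrative data).
rng = np.random.default_rng(1)
levels = np.array([-3, -1, 1, 3])
num_symbols = 20000
ref = rng.choice(levels, num_symbols) + 1j * rng.choice(levels, num_symbols)

# Add complex white Gaussian noise at an assumed SNR of 20 dB.
snr_db = 20.0
noise_power = np.mean(np.abs(ref) ** 2) / 10 ** (snr_db / 10)
noise = np.sqrt(noise_power / 2) * (
    rng.standard_normal(num_symbols) + 1j * rng.standard_normal(num_symbols)
)

evm = evm_percent(ref + noise, ref)
print(f"EVM at {snr_db:.0f} dB SNR: {evm:.2f} %")
```

With this normalization and purely additive noise, the EVM is approximately $100 \cdot 10^{-\text{SNR}/20}$ percent, which is why a reduction of the in-band noise floor shows up directly as a lower EVM.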
7.4 Conclusion

An adaptive filter based, digital cancellation technique for mitigating stochastic quantization noise of digital power amplifiers is presented. The cancellation technique uses an additional feedback receiver to capture the receive band noise at the output of the PA. The hardware is realized with off the shelf components to demonstrate the effectiveness of the technique. Cancellation results have been presented both with and without the presence of a desired signal at the main receiver. The results indicate that excess RxBN from an important class of PAs can be canceled to appropriate levels, and that there is the possibility of relaxing duplexer requirements for multi-band radios.

7.5 Acknowledgment

Chapter 7 is mostly a reprint of the material that has been submitted for publication as it may appear in N. Rostomyan, V. Didi, P. Gudem and P. Asbeck, "Adaptive Cancellation of Digital Power Amplifier Receive Band Noise for FDD Transceivers," IEEE Microwave and Wireless Components Letters, 2018. The dissertation author was the primary author of this material.
Chapter 8

Conclusions and Future Work

8.1 Dissertation Summary

High integration capability and low cost of fabrication have increased the demand for using Si-based transceivers in high data rate mm-wave wireless communication. Such systems are based on beam-steering and MIMO architectures to increase the channel capacity and overcome high path-loss at mm-wave frequencies. The large number of front-ends, wide channel bandwidth (above 200 MHz for mobile, 800 MHz for base-stations), large integration, support of high PAPR signals, and excessive heat dissipation demand high efficiency and stringent linearity from the PAs and the overall front-ends.

This dissertation addresses various circuit designs and architectures to improve the efficiency of cm/mm-wave CMOS power amplifiers and TDD front-ends at frequencies from 15 GHz to 28 GHz. Furthermore, a DSP based noise cancellation is proposed to increase efficiency and performance of cellular (LTE band) transmitters.

Chapter 2 discusses a 15 GHz two-stage, high output power symmetric Doherty PA that is based on a classic load modulation output network with a lumped 90° phase shifter. The driver stages consist of 2-stack amplifiers, while the final stages are implemented using 4-stack multigate cells to achieve high power. A simple analog linearizer is also proposed that performs Doherty gain correction in the RF domain. The PA achieves more than 25.7 dBm saturated output power and peak PAE of 31.2%. PAE at 6 dB back-off is 25%, which is more than 64% higher than for an ideal class B PA roll-off. The analog linearizer effectively flattens the overall gain and extends the output P1dB of the amplifier from 23 dBm to 25.1 dBm without much penalty on the PAE.

While the 15 GHz band is actively being explored for satellite communication, various international organizations have proposed to allow portions of the 28 GHz, 39 GHz and 60 GHz bands to be used for 5G communication. In chapter 3, Ka-band, high efficiency, one-stage mm-wave power amplifiers based on nMOS and pMOS transistors in GF 45 nm CMOS SOI with different metallization options are demonstrated. The amplifiers are arranged in a 2-stack configuration to increase the output voltage swing and achieve high efficiency. Preliminary reliability tests have also been conducted to demonstrate the greater voltage handling capability of pMOS devices. The pMOS PA achieves world record PAE of up to 46% and 19.5 dBm saturated output power, while the nMOS PA sustains 40% PAE with close to 19 dBm saturated power. These compact PAs occupy only 0.18 mm² and can be useful as standalone amplifiers or as components of more complex architectures such as Doherty or out-phasing for 5G transceivers. The use of pMOS provides the potential for increased robustness to hot carrier injection effects.

Achieving high peak efficiency is not sufficient for modern communication signals with high PAPR, and efficiency improvement at back-off power levels is of significant importance for Si-based mm-wave transceivers. Chapter 4 considers a high efficiency, linear mm-wave Doherty PA in CMOS that uses a novel low-loss combiner. A compact modeling approach for CMOS PAs is also demonstrated that considerably reduces simulation times. With more than 22 dBm saturated power, 40% peak PAE, and 28% PAE at 6 dB back-off, the PA features the highest peak and 6 dB back-off PAE among silicon Doherty PAs.

Efficiency improvement at more than 6 dB back-off can be achieved with asymmetric Doherty PA realizations. Chapter 5 demonstrates a dual input CMOS Doherty PA for the Ka-band based on asymmetric main and peaking amplifiers, using a Doherty combiner that is based on an optimized, low-loss synthesis methodology. Both the main and peaking amplifiers utilize same-size devices; however, the supply voltage of the peaking PA is twice as high, which results in efficiency peaking at more than 6 dB back-off, as is desirable for modulated signals with high PAPR. The main Doherty path uses a high efficiency 2-stack amplifier with a shunt feedback drain-source capacitance. The peaking path uses a high power 4-stack amplifier to achieve efficiency improvement at more than 6 dB back-off. With dual, asymmetric input drive, the PA is able to output 25 dBm saturated power with 31% peak PAE as well as 34% PAE at 6 dB back-off, which constitutes the highest peak power and back-off efficiency of any Si-based Doherty PA in the Ka-band to date.

The losses that occur between the output of the PA and the antenna can have a significant effect on the overall mm-wave transmitter efficiency. In chapter 6, the high integration capability of CMOS is utilized to demonstrate a TDD front-end combiner synthesis methodology that optimizes the losses by combining the PA output and the LNA input matching networks together with the T/R switch into one network. A front-end implementation that includes a high power 4-stack PA, an inductively source degenerated cascode LNA, and the proposed T/R switch combiner is also demonstrated in 45 nm CMOS SOI technology. The front-end achieves state-of-the-art performance in both the transmit and receive modes. The PA inside the front-end produces a saturated output power of 23.6 dBm with peak PAE of 28%, while the LNA maintains a noise figure of 4.3 dB.

Finally, as the need for sub-6 GHz flexible, multi-band, and highly reconfigurable software defined radios increases, the use of digital power amplifiers for such systems becomes a viable option. However, for FDD radios, the quantization noise of digital power amplifiers causes high levels of receiver desensitization. To address this issue, the final chapter presents an adaptive filter based, digital cancellation technique for mitigating stochastic noise, in particular the quantization noise of digital power amplifiers. The cancellation technique uses an additional feedback receiver to capture the receive band noise at the output of the PA. The hardware is realized with off-the-shelf components for LTE band 5 to demonstrate the effectiveness of the technique. Cancellation results have been presented both with and without the presence of a desired signal at the main receiver. It has been shown for the first time that the quantization noise of a digital PA can be reduced below -180 dBm/Hz at the receiver. The cancellation technique enables higher efficiency PAs driven with stronger DPD, along with less demanding design requirements for duplexers, which facilitates their use in an ever-increasing number of bands in FDD systems.

8.2 Future Work

The higher voltage handling capability and efficiency of pMOS transistors in the GF 45 nm CMOS SOI technology open new possibilities for mm-wave power amplifiers. pMOS devices can be further utilized to implement three- and four-stack, higher power PAs. Also, the Doherty power amplifier design concepts presented in this thesis can benefit from pMOS devices to achieve better reliability and higher efficiency. More than 30% PAE at 6 dB back-off can be anticipated by using 2-stack pMOS PAs to implement a symmetric Doherty PA. In addition, the higher voltage handling of pMOS devices mitigates reliability issues related to strong antenna mismatch due to beam-steering.

Back-off efficiency enhancement of PAs will continue to be an active research area. While considerable advances have been demonstrated with Doherty PAs in this dissertation, out-phasing PAs offer an additional avenue for exploration.

It is envisioned that future Si-based power amplifiers will comprise re-configurable building blocks to achieve multi-band operation, power and gain control, RF domain linearization, as well as high back-off efficiency. As depicted in Fig. 8.1, such highly re-configurable PAs will first of all include a back-off efficiency improvement architecture, such as Doherty or out-phasing. In addition, the development of high speed VGAs and phase shifters that can support up to 3-5 times the modulation bandwidth will be necessary. These building blocks will allow memory-less AM-AM and AM-PM correction in the RF domain, thus substantially reducing the need for DPD. Also, having VGAs and phase shifters for each path of Doherty or out-phasing PAs will allow precise control of the turn-on behavior of the peaking PA and make frequency reconfigurable Doherty/out-phasing configurations possible. All of these efforts can also be extended to other mm-wave communication frequency bands, such as the 39 GHz and 60 GHz bands, as well as to other technology nodes, such as the CMOS SOI FDX 22 nm or below.

The system level demonstration of receive band noise cancellation serves as a basis for a custom IC implementation. The IC will include an LTE transmitter and the feedback receiver. If the transmitter is based on a digital PA, such a transceiver IC can enable low cost, highly flexible, and frequency agile cellular software defined radios (SDR).
Bibliography


