# **UCLA Electronic Theses and Dissertations**

#### **Title**

BROADBAND CLASS-AB POWER AMPLIFIER TECHNIQUES FOR CABLE TV APPLICATION

#### **Permalink**

<https://escholarship.org/uc/item/2801j20q>

#### **Author**

Lee, Jeffrey

#### **Publication Date**

2017

Peer reviewed|Thesis/dissertation

#### UNIVERSITY OF CALIFORNIA

Los Angeles

# BROADBAND CLASS-AB POWER AMPLIFIER TECHNIQUES FOR CABLE TV APPLICATION

A dissertation submitted in partial satisfaction of the

requirements for the degree Doctor of Philosophy

in Electrical Engineering

by

Jeffrey Lee

2017

© Copyright by

Jeffrey Lee

2017

#### ABSTRACT OF THE DISSERTATION

# BROADBAND CLASS-AB POWER AMPLIFIER TECHNIQUES FOR CABLE TV APPLICATION

by

Jeffrey Lee

Doctor of Philosophy in Electrical Engineering

University of California, Los Angeles, 2017

Professor Sudhakar Pamarti, Chair

Contemporary high-spectral-efficiency communication systems increasingly rely on complex modulation, with high-order constellations and multi-carrier signaling. These formats often have high peak-to-average-power ratios (PAPR). It is difficult to design power amplifiers (PAs) that achieve good power efficiency and linearity simultaneously for high-PAPR signals.

Strategies to improve power efficiency fit into established amplifier classes. Some aim to reduce standing current; others reduce supply-voltage overhead. There are also switch-mode classes and load-modulation classes. PA designs using these classes alone or in combination achieve good power efficiency for narrowband high-PAPR signals. However, the existing art lacks effective techniques for wideband systems. One of the bottlenecks is the limited speed of supply modulation.

Several linearization schemes have been presented to mitigate PA nonlinearity. Among these, digital pre-distortion (DPD) is currently the most popular. However, the conventional architecture, ADC-based DPD, requires a high-speed, high-resolution ADC, which is costly and power hungry. It also requires substantial processing power and memory to handle time-domain information.

Two techniques are proposed to address the power efficiency and linearity challenges of broadband Class-AB PAs separately. For efficiency enhancement, we introduce the Instantaneous Supply-Switching technique. This technique improves efficiency through high-speed current-mode supply switching in response to the instantaneous signal, unlike most prior supply-modulation implementations, which respond only to the signal envelope. For linearity improvement, we introduce the Signal-to-Distortion-Ratio (SDR)-based DPD technique. This methodology requires only the power information of the signal and distortion channels, which is more data efficient than ADC-based DPD. The hardware for SDR-based DPD therefore potentially has lower cost and power.

The intended application in this dissertation is cable TV upstream power amplifiers.

The dissertation of Jeffrey Lee is approved.

Ramon Gomez

Yuanxun Wang

Chih-Kong Ken Yang

Milos D Ercegovac

Sudhakar Pamarti, Committee Chair

University of California, Los Angeles

2017

## **Table of Contents**





## **LIST OF ACRONYMS**

<span id="page-8-0"></span>



## **LIST OF FIGURES**

<span id="page-10-0"></span>



## **LIST OF TABLES**

<span id="page-12-0"></span>

#### **ACKNOWLEDGEMENT**

<span id="page-13-0"></span>First, I would like to sincerely thank my advisor, Dr. Sudhakar Pamarti, for his technical support. Professor Pamarti inspires me through his pursuit of high-quality research. I have learned a lot from him, especially the way he analyzes a complex research topic from the basics. I would like to thank him for his patience and constant guidance, which helped me achieve my academic goals at UCLA.

Next, I would like to express my deepest gratitude to Dr. Ramon (Ray) Gomez, my mentor at Broadcom. Ray has effectively been another advisor to me. Not only have I received valuable advice and discussion from him, but he has also shown a passion for his career that makes him an excellent role model for me as an IC designer. He also taught me how industry and academia differ in evaluating a project, which has been an invaluable experience. I am eternally grateful to Broadcom Foundation and Broadcom Limited for their financial and technical support. I could not have finished my PhD degree without their support.

I also would like to express my appreciation to my committee members: Professor Yuanxun Ethan Wang, Professor Chih-Kong Ken Yang, and Professor Milos Ercegovac for serving on my committee.

I also want to thank my lab mates and friends at UCLA, where people are talented, humorous and help each other. In particular, I would like to thank Dr. Jim Sun, Dr. Ming-Shuan Chen, Dr. Neha Sinha and Dr. Boyu Hu. I enjoyed our time together, whether in technical discussion or simply chatting.

Most important of all, I have to thank my dog daughter Jiaw and my girlfriend Shirley, who constantly supported my PhD study. I could not have done it without your unconditional love and support.

## **VITA**

<span id="page-14-0"></span>

#### **PUBLICATIONS**

- <span id="page-15-0"></span>1. **J. Lee**, R. Gomez and S. Pamarti, "A Broadband Class-AB Power Amplifier with Instantaneous Supply-Switching Efficiency Enhancement for Cable TV Application," IEEE Journal of Solid-State Circuits, May 2018, to appear.
- 2. **J. Lee**, S. Pamarti and R. Gomez, "Training of Predistortion Based on Signal-to-Distortion-Ratio Measurements," IEEE Bipolar/BiCMOS Circuits and Technology Meeting, pp. 9-12, Oct. 2017.
- 3. **J. Lee**, S. Pamarti and R. Gomez, "A 10-to-650MHz 1.35W Class-AB Power Amplifier with Instantaneous Supply Switching Efficiency Enhancement," IEEE Custom Integrated Circuits Conference, pp. 1-4, May 2017.

# <span id="page-16-0"></span>**CHAPTER 1 INTRODUCTION**

## <span id="page-16-1"></span>**1.1 Motivation**

Modern cable television (CATV) systems provide not only one-way broadcast programming, but also high-speed two-way communications between customers and the Internet. Cable modems are a primary source of Internet connectivity for millions of consumers worldwide, backhauling local WiFi communications for residential and business customers. Modern high-spectral-efficiency CATV systems, such as those based on the Data Over Cable Service Interface Specification (DOCSIS) 3.1 standard, increasingly depend on complex signal modulation, with high-order constellations ($\geq$ 256-QAM), multi-carrier signaling (OFDM) and multi-channel aggregation.



Fig. 1.1 CATV distribution architecture

| Application                    | 4G LTE | 802.11ac           | CATV (DOCSIS 3.1) Upstream | CATV (DOCSIS 3.1) Downstream |
|--------------------------------|--------|--------------------|----------------------------|------------------------------|
| PAPR (dB)                      | ~1     | ~12                | ~14                        | ~14                          |
| Frequency (MHz)                | < 40   | < 160 <sup>a</sup> | 5-205                      | 50-1200                      |
| Fractional BW (%)              | < 2    | $\leq 7$           | 625                        | 470                          |
| Sideband Rejection Ratio (dBc) | 30     | 40                 | 50                         | 50                           |

Table 1.1 Specifications of the latest Wi-Fi, cellular and CATV protocol signals

<sup>a</sup> Centered at approx. 2.4/5 GHz

Fig. 1.1 illustrates the layout of a typical hybrid optical-fiber / coaxial cable CATV plant. Note that customer-premise cable modems and set-top boxes in a community communicate with a so-called "fiber node" via shared coaxial cable. The fiber node then communicates two-way traffic with the cable headend, similar in function to the wireline telephone central office. The bidirectional QAM signals often have a high peak-to-average-power ratio (PAPR), up to 14 dB. The DOCSIS 3.1 standard also requires high fractional bandwidth and output power. It uses frequency bands of approximately 5-200 MHz and 50-1200 MHz for Upstream (customer to headend) and Downstream (headend to customer) signals, respectively. Typical cable modems require Upstream peak output power of about 1 W. Typical fiber nodes require Downstream peak power of about 10 W, and may use costly GaAs or GaN PAs. CATV signals generally have higher PAPR, more stringent linearity requirements, and larger fractional bandwidth than Wi-Fi (e.g. 802.11ac) and cellular (e.g. 4G LTE) protocol signals, as shown in Table 1.1.

Conventional power amplifiers commonly have low efficiency under high-PAPR conditions, because of the high supply voltages and large bias currents necessary to avoid clipping the signal peaks. This leads to higher costs for power supplies, thermal management and battery backup. Conventional PAs also exhibit a trade-off between efficiency and linearity. Existing PA products for such applications may apply Class-A design, which meets the required linearity but achieves at best 4% average efficiency. It is therefore desirable to improve the efficiency and linearity of the PA design art for high PAPR / high fractional bandwidth signals.

#### <span id="page-18-0"></span>**1.2 Organization**

This dissertation proposes techniques to enhance average efficiency and improve linearity simultaneously for broadband PAs with high-PAPR signals. In particular, it develops two techniques that together save power in response to the instantaneous signal magnitude while using low-cost digital pre-distortion to improve linearity.

The first technique is called "Instantaneous Supply-Switching." It saves power through high-speed current-mode supply switching in response to the instantaneous signal magnitude, rather than the signal envelope as in most prior supply modulation. In this way, we can overcome the efficiency limit of classic linear PAs. We first discuss PA design challenges from the power efficiency perspective in Chapter 2. After establishing the fundamental limitations of present state-of-the-art PAs, we present the theoretical analysis, circuit design details, and thorough measurements of the proposed "Instantaneous Supply-Switching" technique in Chapter 3.

In Chapter 3, a proof-of-concept broadband power amplifier (PA) is demonstrated, combining a Class AB core with the proposed Instantaneous Supply-Switching (ISS) in a 0.18 μm SiGe BiCMOS process. High-$f_t$ NPN cascodes switch the amplifier signal current to high-supply or low-supply rails depending on instantaneous signal magnitude. Current-mode switching at GHz rates enhances efficiency for bandwidths well beyond existing envelope trackers. Selecting 7.5/4.5 V as PA core supply rails, this PA achieves 13.6% PAE for a 15-215 MHz noise-like signal, with near-Gaussian PDF and 14 dB PAPR. It shows superior efficiency compared to existing art for large fractional bandwidth, high-PAPR RF signals.

We then move on to the linearity issue after improving the efficiency. In Chapter 4, an overview of digital pre-distortion (DPD) is presented. It starts by describing the existing power amplifier linearization techniques, and then covers the nonlinear and pre-distortion behavior models for power amplifiers. The DPD estimators and algorithms for DPD coefficient adaptation are also discussed. Finally, we bring up the challenges of broadband DPD. With sufficient background knowledge of DPD, we can move on to our second proposed technique: "Signal-to-Distortion-Ratio (SDR)-based DPD."

Chapter 5 presents this novel training approach for digital pre-distorters, which is based solely on measurements of the output signal-to-distortion ratio in coarse bands. This approach simplifies the feedback receiver design. It does not require synchronized sample-by-sample comparison of the transmitter input and output, which also significantly reduces the complexity of the adaptation block. The algorithm was evaluated with the same SiGe BiCMOS broadband PA described in Chapters 2 and 3, demonstrating substantial and robust fidelity improvements.

# <span id="page-20-0"></span>**CHAPTER 2 Power Amplifier Design Challenges**

CATV signals generally have higher PAPR, more stringent linearity requirements, and larger fractional bandwidth than Wi-Fi and cellular signals, as shown in Table 1.1. Existing PA products for such applications may apply Class-A design [1], which meets the required linearity but theoretically achieves at best 4% average efficiency. Table 2.1 shows the power consumption of a state-of-the-art commercially available PA for CATV applications. The PA can transmit a peak continuous-wave (CW) power of 1 W from 5 MHz to 205 MHz. For a 14 dB PAPR broadband signal, it consumes 3.33 W of DC power and delivers only 100 mW on average, an average efficiency of 3%, with an ACPR of 55 dBc. The low average efficiency leads to higher costs for power supplies, thermal management, and battery backup. However, conventional PAs exhibit a trade-off between efficiency and linearity. It is therefore very challenging to improve both the efficiency and the linearity of the PA design art for high PAPR / high fractional bandwidth signals.

Table 2.1 A commercial CATV Power Amplifier [1]

| DC Power | AC Average Power | Peak AC CW Power | Average Efficiency | ACPR   |
|----------|------------------|------------------|--------------------|--------|
| 3.33 W   | 100 mW           | 1 W              | 3%                 | 55 dBc |
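The two efficiency figures quoted above can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes an idealized Class-A stage whose DC power equals its peak instantaneous output power, so that average efficiency is simply 1/PAPR (this reproduces the familiar 50% bound for a sinusoid, whose PAPR is about 3 dB). The function names are hypothetical and not taken from the datasheet [1].

```python
def class_a_average_efficiency(papr_db: float) -> float:
    """Ideal Class-A average efficiency for a signal with the given PAPR.

    Assumption (illustrative model): DC power equals the peak
    instantaneous output power, so efficiency is 1 / PAPR (linear).
    """
    papr_linear = 10.0 ** (papr_db / 10.0)
    return 1.0 / papr_linear


def average_efficiency(p_out_avg_w: float, p_dc_w: float) -> float:
    """Average efficiency: average RF output power divided by DC power."""
    return p_out_avg_w / p_dc_w


if __name__ == "__main__":
    # Ideal Class-A bound for a 14 dB PAPR signal.
    bound = class_a_average_efficiency(14.0)
    print(f"ideal Class-A at 14 dB PAPR: {100 * bound:.1f}%")
    # Table 2.1 values: 100 mW average output from 3.33 W of DC power.
    measured = average_efficiency(0.100, 3.33)
    print(f"commercial PA (Table 2.1): {100 * measured:.1f}%")
```

Under this model the 14 dB PAPR bound comes out near 4%, and the Table 2.1 numbers give about 3%, consistent with the text.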

This challenge of power amplifier design can be traced to a fundamental contradiction: the trade-off between efficiency and linearity. Various techniques have been proposed to improve PA average efficiency. Some reduce average bias current, such as Class-AB/B/C [2]-[6]. However, the efficiency drops to half at 6 dB output power back-off, as illustrated in Fig. 2.1. Linearity is also degraded by the $g_m$ variation as the bias current changes dynamically. High common-mode impedance further degrades linearity and efficiency by adding common-mode voltage swing at even-harmonic frequencies. This common-mode impedance issue was addressed in [5]-[6] by introducing a low-impedance common-mode matching network.



Fig. 2.1 Current bias of linear PAs
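The back-off behavior described above can be illustrated with the textbook ideal CW efficiency expressions. This is a minimal sketch under stated assumptions: a sinusoidal signal, ideal devices with no knee voltage, a Class-A efficiency proportional to output power (50% at full swing) and a Class-B efficiency proportional to output amplitude ($\pi/4$ at full swing). A Class-AB stage would fall between the two curves. The function names are illustrative only.

```python
import math


def class_a_efficiency(backoff_db: float) -> float:
    """Ideal Class-A CW efficiency: 50% at full swing, proportional to power."""
    return 0.5 * 10.0 ** (-backoff_db / 10.0)


def class_b_efficiency(backoff_db: float) -> float:
    """Ideal Class-B CW efficiency: pi/4 at full swing, proportional to amplitude."""
    return (math.pi / 4.0) * 10.0 ** (-backoff_db / 20.0)


if __name__ == "__main__":
    for backoff_db in (0.0, 3.0, 6.0, 12.0):
        eta_a = class_a_efficiency(backoff_db)
        eta_b = class_b_efficiency(backoff_db)
        print(f"back-off {backoff_db:4.1f} dB: "
              f"Class-A {100 * eta_a:5.1f}%  Class-B {100 * eta_b:5.1f}%")
```

In this model the Class-B efficiency at 6 dB back-off is half of its full-swing value, while the Class-A efficiency falls to a quarter.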

Others reduce average supply voltage, such as supply-switching (Class-G) [7]-[12], envelope tracking (ET / Class-H) [13]-[15] and envelope elimination and restoration (EER) [16]-[18]. There are also load modulation strategies, like Doherty [19], outphasing [20] and dynamic load modulation [21],[26]. Switching PAs, e.g. Class-D/E/F [2]-[3], can achieve high peak efficiency. RF PA design art using these schemes alone or in combination achieves good efficiency for narrowband signals with high PAPR [11]-[12], [21]-[25]. However, EER, load modulation strategies, and switching PAs are not good candidates for CATV applications because they fundamentally assume a single-carrier signal. Their strong nonlinearity is also a concern for CATV requirements, even after DPD. Class-G/H are not restricted to single-carrier signals and do not suffer from strong nonlinearity, but they are limited by supply modulation bandwidth.



Fig. 2.2 Conventional supply modulation approaches

The conventional Class-G/H topologies are shown in Fig. 2.2. One is ET (Class-H), where the supply follows the signal envelope continuously; the other is conventional voltage-mode supply switching (SS), where the supply changes in discrete steps in response to the envelope. Both approaches require high driving power for the large top-side PMOS devices. They also suffer from limited supply modulation bandwidth, as mentioned earlier. The modulation bandwidth of ET is limited to $k_1 f_{SW}$, where $k_1$ is the achievable ratio between the modulation bandwidth and the switch-mode converter frequency $f_{SW}$. This ratio $k_1$ is of course much less than unity. The converter frequency $f_{SW}$ is in turn limited to $k_2 f_t$, where $k_2 \ll 1$ allows enough time constants of the switch driver circuit for full switching. Combining these, we get $BW_{ET} \le k_1 k_2 f_{t,PMOS}$ (assuming a discrete PMOS switch is used for ET). Estimating $k_1 = k_2 = 0.1$, we get an ET bandwidth limit equal to 0.01 times the $f_t$ of the discrete off-chip PMOS device, perhaps on the order of tens or hundreds of MHz. Voltage-mode SS is limited by $CV^2f$ losses due to rapid switching of parasitic capacitances, which likewise limits its speed to the order of tens or hundreds of MHz.
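The bound $BW_{ET} \le k_1 k_2 f_{t,PMOS}$ is easy to evaluate numerically. The sketch below uses the estimate $k_1 = k_2 = 0.1$ from the text; the PMOS $f_t$ values are hypothetical placeholders chosen only to show the order of magnitude, not measured device data.

```python
def et_bandwidth_limit_hz(f_t_hz: float, k1: float = 0.1, k2: float = 0.1) -> float:
    """Upper bound on ET modulation bandwidth: BW_ET <= k1 * k2 * f_t.

    k1: ratio of modulation bandwidth to converter switching frequency.
    k2: ratio of converter switching frequency to the switch device f_t.
    """
    return k1 * k2 * f_t_hz


if __name__ == "__main__":
    # Hypothetical f_t values for a discrete PMOS switch (assumptions).
    for f_t_hz in (1e9, 5e9, 10e9):
        bw_hz = et_bandwidth_limit_hz(f_t_hz)
        print(f"f_t = {f_t_hz / 1e9:4.1f} GHz -> BW_ET <= {bw_hz / 1e6:6.1f} MHz")
```

With these assumed values the bound lands in the tens to hundreds of MHz, which is the range the text argues is insufficient for CATV bandwidths.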

Although the conventional supply modulation topologies do not have enough bandwidth for CATV applications, they barely degrade linearity. We therefore consider supply modulation a potential candidate for improving the average efficiency of CATV PAs. To address the bandwidth challenge, we propose a novel supply modulation topology, a high-speed current-mode instantaneous supply-switching (ISS) technique, combined with a broadband push-push Class-AB PA core, which is discussed in the next chapter.

#### <span id="page-23-0"></span>**References**

[1] NXP, "BGA3131 DOCSIS 3.1 Upstream Amplifier," Product Data Sheet Rev. 1, May 2016. [Online]. Available: <http://www.nxp.com/documents/data_sheet/BGA3131.pdf>

[2] S. Cripps, Advanced Techniques in RF Power Amplifier Design, Artech House Publisher, 2002.

[3] S. Cripps, RF Power Amplifiers for Wireless Communications, Artech House Publisher, 2nd edition, 2002.

[4] I. Aoki, et al., "Distributed Active Transformer—A New Power-Combining and Impedance-Transformation Technique," *IEEE Trans. Microwave Theory Tech.*, vol. 50, No.1, pp. 316–331, Jan., 2002.

[5] H. Wang et al., "A 5.2-to-13GHz Class-AB CMOS Power Amplifier with a 25.2dBm Peak Output Power at 21.6% PAE," *ISSCC Dig. Tech. Papers*, pp. 44-45, Feb 2010.

[6] W. Ye et al., "A 2-to-6GHz Class-AB Power Amplifier with 28.4% PAE in 65nm CMOS Supporting 256QAM," *ISSCC Dig. Tech. Papers*, pp. 38-39, Feb 2015.

[7] Feldman, L., "Class G high efficiency Hi-Fi amplifier," *Radio Electronics*, vol. 87, pp. 47-49, Aug 1976.

[8] Sampei, T., Ohashi, S., Ohta, Y., and Inoue, S., "Highest efficiency and super quality audio amplifier using MOS power FETs in class G operation," *IEEE Trans. Consum. Electron.*, vol. CE-24, no. 3, pp. 300-307, Aug. 1978.

[9] Self, D., "A new look at class-G power," *Electronics World*, vol. 107, pp. 900-905, Dec. 2002.

[10] A. D. Downey, et al., "A Class-F/FB Audio Amplifier," *IEEE Transactions on Consumer Electronics*, vol. 53, no. 4, pp. 1537–1545, Nov. 2007.

[11] J. S. Walling, et al. "A Class-G Supply Modulator and Class-E PA in 130 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 44, no. 9, Sep. 2009.

[12] S. Hu et al., "A Broadband CMOS Digital Power Amplifier with Hybrid Class-G Doherty Efficiency Enhancement," *ISSCC Dig. Tech. Papers*, pp. 44-45, Feb 2015.

[13] D. Debopriyo et al., "A fully integrated reconfigurable wideband envelope-tracking SoC for high-bandwidth WLAN applications in a 28nm CMOS technology," *ISSCC Dig. Tech. Papers*, pp. 34-35, Feb 2017.

[14] S.-H. Yang et al., "A single-inductor dual-output converter with linear-amplifier-driven cross regulation for prioritized energy-distribution control of envelope-tracking supply modulator," *ISSCC Dig. Tech. Papers*, pp. 36-37, Feb 2017.

[15] X. Liu et al., "A 2.4V 23.9dBm 35.7%-PAE -32.1dBc-ACLR LTE-20MHz envelope-shaping-and-tracking system with a multiloop-controlled AC-coupling supply modulator and a mode-switching PA," *ISSCC Dig. Tech. Papers*, pp. 38-39, Feb 2017.

[16] L. R. Kahn, "Single-sideband transmission by envelope elimination and restoration," *Proc. IRE*, vol. 40, no. 7, pp. 803–806, Jul. 1952.

[17] N. Wang et al., "Linearity of X-band class-E power amplifiers in EER operation," *IEEE Trans. Microw. Theory Tech.*, vol. 53, no. 3, pp. 1096–1102, Mar. 2005.

[18] C.-T. Chen et al., "Kahn envelope elimination and restoration technique using injection-locked oscillators," *IEEE MTT-S International Microwave Symposium Digest*, pp. 17-22, Jun. 2012.

[19] W. H. Doherty, "A new high efficiency power amplifier for modulated waves," *Proc. IRE*, vol. 24, no. 9, pp. 1163-1182, Sep. 1936.

[20] H. Chireix, "High power outphasing modulation," *Proc. Inst. Radio Engineers*, vol. 23, no. 11, pp. 1370–1392, 1935.

[21] S. Moloudi and A. A. Abidi, "The outphasing RF power amplifier: A comprehensive analysis and a Class-B CMOS realization," *IEEE J. Solid-State Circuits*, vol. 48, no. 6, pp. 1357-1369, Jun 2013.

[22] W. Tai, H. Xu, A Ravi, H. Lakdawala, O. Bochobza-Degani, L. R. Carley and Y. Palaskas, "A transformer-combined 31.5 dBm outphasing power amplifier in 45 nm LP CMOS with dynamic power control for back-off power efficiency enhancement." *IEEE J. Solid-State Circuits*, vol. 47, no. 7, pp. 1646-1658, Jul 2012.

[23] D. Chowdhury, S. V. Thyagarajan, L. Ye, E. Alon and A. M. Niknejad, "A fully-integrated efficient CMOS inverse Class-D power amplifier for digital polar transmitters," *IEEE J. Solid-State Circuits*, vol. 47, no. 5, pp. 1113-1122, May 2012.

[24] S.-M. Yoo, J. S. Walling, O. Degani, B. Jann, R. Sadhwani, J. C. Rudell and D. J. Allstot, "A Class-G switched-capacitor RF power amplifier," *IEEE J. Solid-State Circuits*, vol. 48, no. 5, pp. 1212-1224, May 2013.

[25] N. Singhal, H. Zhang and S. Pamarti, "A Zero-Voltage-Switching Contour-based Outphasing Power Amplifier," *IEEE Trans. Microw. Theory Tech.*, vol. 60, no. 6, pp. 1896–1906, Jun. 2012.

[26] F. H. Raab, "High-efficiency linear amplification by dynamic load modulation," *2003 IEEE MTT-S International Microwave Symposium Digest*, vol. 3, June 2003, pp. 1717-1720.

# <span id="page-27-0"></span>**CHAPTER 3 A Broadband Class-AB Power Amplifier with Instantaneous Supply-Switching Efficiency Enhancement for Cable TV Application**

#### <span id="page-27-1"></span>**Abstract**

A broadband power amplifier (PA) is presented, combining a Class AB core with a novel supply-modulation technique, Instantaneous Supply-Switching (ISS). High-$f_t$ NPN cascodes switch the amplifier signal current to high-supply or low-supply rails depending on instantaneous signal magnitude. Current-mode switching at GHz rates enhances efficiency for bandwidths well beyond existing envelope trackers. A peak power of 1.35 W and an efficiency of 13.6% at 14 dB PAPR are observed. The combination of output power, large fractional bandwidth and high-PAPR efficiency exceeds prior art.

#### <span id="page-27-2"></span>**3.1 Introduction**

Modern cable television (CATV) systems provide not only one-way broadcast programming, but also high-speed two-way communications between customers and the Internet. Cable modems are a primary source of Internet connectivity for millions of consumers worldwide, backhauling local WiFi communications for residential and business customers. Modern high-spectral-efficiency CATV systems, such as those based on the Data Over Cable Service Interface Specification (DOCSIS) 3.1 standard, increasingly depend on complex signal modulation, with high-order constellations (≥ 256-QAM), multi-carrier signaling (OFDM) and multi-channel aggregation.



Fig. 3.1 CATV distribution architecture

Fig. 3.1 illustrates the layout of a typical hybrid optical-fiber / coaxial cable CATV plant. Note that customer-premise cable modems and set-top boxes in a community communicate with a so-called "fiber node" via shared coaxial cable. The fiber node then communicates two-way traffic with the cable headend, similar in function to the wireline telephone central office. The bidirectional QAM signals often have a high peak-to-average-power ratio (PAPR), up to 14 dB. The DOCSIS 3.1 standard also requires high fractional bandwidth and output power. It uses frequency bands of approximately 5-200 MHz and 50-1200 MHz for *Upstream* (customer to headend) and *Downstream* (headend to customer) signals, respectively. Typical cable modems require Upstream peak output power of about 1 W. Typical fiber nodes require Downstream peak power of about 10 W, and may use costly GaAs or GaN PAs [1]. CATV signals generally have higher PAPR and fractional bandwidth than Wi-Fi protocol signals (e.g. 802.11ac).

Conventional power amplifiers commonly have low efficiency under high PAPR conditions, because of the high supply voltages and large bias currents necessary to avoid clipping the signal peaks. This leads to higher costs for power supplies, thermal management and battery backup. Existing PA products for such applications may apply Class-A design [2], which achieves at best 4% average efficiency, as we will show. It is therefore desirable to extend the PA design art for high PAPR / high fractional bandwidth signals.

Various techniques have been proposed to improve PA average efficiency. Some reduce average bias current, such as Class-AB/B/C; others reduce average supply voltage, such as supply-switching (Class-G), envelope tracking (ET / Class-H) and envelope elimination and restoration (EER) [3]. There are also load modulation strategies, like Doherty [4], outphasing [5] and dynamic load modulation. Switching PAs, e.g. Class-D/E/F [6], can achieve high peak efficiency. RF PA design art using these schemes alone or in combination achieves good efficiency for narrowband signals with high PAPR [7]-[12]. However, Class-G/ET/EER techniques are limited by supply modulator bandwidth (typically tens of MHz) and modulator circuit power. Existing load modulation techniques and switching techniques often have limited bandwidth due to the use of tuned circuits. Existing art lacks effective techniques to enhance efficiency for broadband signals with high PAPR.

To address the challenge, we proposed a novel high-speed current-mode instantaneous supply-switching (ISS) technique, combined with a broadband push-push Class-AB PA core in [13]. ISS here refers to supply modulation fast enough to follow not only slow envelope variations in a narrowband RF signal, but also fast enough to follow the *instantaneous* amplitude of a broadband signal with spectral occupancy of many hundreds of MHz.

In this topology, the cascode transistors necessary for high-voltage tolerance in a common-source PA are simultaneously used for supply switching, in response to the signal amplitude. A high-voltage supply is selected when the signal amplitude is large, and a low-voltage supply is selected when the signal amplitude is low. As is well established, current-mode switching is naturally fast, and fast enough to make ISS possible. This technique can theoretically achieve better efficiency than envelope-based supply-modulation schemes (e.g. Class-G/ET/EER).

In this chapter, Section 3.2 describes and analyzes the ISS technique. Problems that were found to arise from high-speed current-mode switching are discussed, together with the solutions that were developed and implemented. Expressions are derived for the ideal efficiency of ISS, including optimization of the supply voltages. These results are expressed in terms of the signal PDF, so that the effectiveness of ISS vs. other PA topologies can be assessed for specific applications.

Section 3.3 addresses circuit design details. The resistive shunt-feedback PA core is analyzed to derive small-signal characteristics. Auxiliary circuits, including the supply-switching (SS) driver and output common-mode impedance control network are also described and analyzed. With this information, we examine efficiency more realistically, including the power dissipation of the auxiliary circuits and losses due to voltage and current headroom as required for adequate linearity.

Selecting 7.5/4.5 V as PA core supply rails, this 0.18 μm SiGe BiCMOS PA achieves 13.6% PAE for a 15-215 MHz noise-like signal, with near-Gaussian PDF and 14 dB PAPR. It shows superior efficiency compared to existing art for large fractional bandwidth, high-PAPR RF signals. Measurement results and conclusions are presented in Sections 3.4 and 3.5, respectively.

## <span id="page-31-0"></span>**3.2 Current-Mode Instantaneous Supply-Switching**

Supply modulation is an established category of techniques for amplifier efficiency enhancement. The PA supply is modulated dynamically in response to some signal parameter in order to save DC power. Two prior supply modulation approaches are illustrated in Fig. 3.2. One is ET, where the supply follows the signal envelope *continuously*; the other is conventional voltage-mode supply switching (SS), where the supply changes in *discrete steps* in response to the envelope. However, both approaches require high driving power for the large top-side PMOS devices. They also suffer from limited supply modulation bandwidth, as mentioned earlier. ET is limited by the switch-mode power supply bandwidth [14], and voltage-mode SS is limited by $CV^2f$ losses due to rapid switching of parasitic capacitances.



Fig. 3.2 Conventional supply modulation approaches

The limited bandwidth of existing supply modulation techniques constrains the potential efficiency improvement. For example, it may not be possible to use the full signal envelope as input to the supply modulator. The envelope may require smoothing (low-pass filtering) of some form to match the modulator bandwidth. The modulated supply may then be forced to remain closer to its maximum value to prevent clipping after sudden jumps in the envelope. CATV signals have bandwidths on the order of 1 GHz, much greater than contemporary cellular or WAN signal formats. High-speed supply modulation is therefore essential in CATV applications, but also relevant to future wireless applications as standards incorporate higher bandwidth signals.



Fig. 3.3 Proposed BJT current-mode supply switching

#### *A. Proposed Current-mode Supply-Switching*

To resolve these drive power and bandwidth issues, we propose a current-mode supply-switching technique. In Fig. 3.3, the BJT current-mode switch is integrated in a push-push differential PA, switching the current between $V_{DD\_H}$ and $V_{DD\_L}$. Here we want to highlight that the current-mode switches are also the cascode devices, which serve to increase gain and protect the low-voltage NMOS input devices. Therefore, there is almost no extra cost for the switches. Unlike existing art, where the PMOS top-side switch has to be large enough to prevent excessive voltage drop, the NPN switch here only needs to meet current density and linearity requirements, and can be much smaller with reduced parasitics. The embedded NPN also has naturally high $f_t$ compared to embedded or external PMOS devices, and this also contributes to reduced parasitics. Another advantage of the proposed solution is that only $6v_T \approx 150$ mV is required for driving the BJT current-mode switch, which is much smaller than the typical gate drive that would be required for a top-side PMOS switch. In turn, the smaller switch size and smaller driving voltage contribute to lower drive power and better efficiency. Moreover, it is well known that very fast current-mode switching is relatively straightforward, even up to GHz rates. This efficient high-speed switching makes ISS possible. We can now explore the modulation bandwidth advantage of ISS in more detail.

The modulation bandwidth of ET is limited to $k_1 f_{SW}$, where $k_1$ is the achievable ratio between the modulation bandwidth and the switch-mode converter frequency $f_{SW}$. This ratio $k_1$ is of course much less than unity. The converter frequency $f_{SW}$ is in turn limited to $k_2 f_t$, where $k_2 \ll 1$ allows enough time constants of the switch driver circuit for full switching. Combining these, we get $BW_{ET} \leq k_1 k_2 f_{t,PMOS}$ (assuming a discrete PMOS switch is used for ET). Estimating $k_1 = k_2 = 0.1$, we get an ET bandwidth limit equal to 0.01 times the $f_t$ of the discrete off-chip PMOS device, perhaps on the order of tens or hundreds of MHz.

There are several time constants to consider in our proposed ISS scheme. Complete switching will require on the order of 10 time constants. One time constant is due to the transit time of the NPNs as the cascodes switch between rails. Given that the NPN $f_t$ is 28 GHz, this is not a significant limitation. Another time constant is the product of the switch-drive circuit output resistance and the base capacitance of the NPNs. This is mitigated by sizing the driver appropriately. A third time constant is approximately determined by the product of $1/g_{m,NPN}$ and $C_{\mu,NPN}$. This time constant corresponds to the time needed for current through the collector-base capacitance to become negligible. It is even shorter than $1/f_{t,NPN}$. Therefore, the bandwidth of our proposed scheme is limited by the supply-switching driver in Fig. 3.11, which is an inverter-chain-based driver. Its toggle rate is limited to about 1 / (8 fan-out-4 inverter delays) [20], which is about 2 GHz in 0.18-µm CMOS. This matches our simulation and measurement results. Testing of our PA was carried out with a 1 GHz switching rate, since this was adequate to achieve optimum PAE.
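The two bandwidth limits discussed above can be compared with a short numerical sketch. Only $k_1 = k_2 = 0.1$ and the eight-delay rule come from the text; the PMOS $f_t$ of 5 GHz and the fan-out-4 delay of 65 ps are illustrative assumptions chosen for this example, not values reported in this work.

```python
# Rough comparison of the ET and ISS bandwidth limits described in the text.
# ASSUMED values (not from the source): PMOS f_t = 5 GHz, FO4 delay = 65 ps.

def et_bandwidth_limit(f_t_hz, k1=0.1, k2=0.1):
    """Upper bound on envelope-tracking bandwidth: BW_ET <= k1 * k2 * f_t."""
    return k1 * k2 * f_t_hz


def inverter_chain_toggle_rate(fo4_delay_s, delays_per_toggle=8):
    """Toggle-rate limit of an inverter-chain driver: 1 / (8 FO4 delays)."""
    return 1.0 / (delays_per_toggle * fo4_delay_s)


bw_et = et_bandwidth_limit(5e9)                # hypothetical discrete PMOS
iss_rate = inverter_chain_toggle_rate(65e-12)  # hypothetical 0.18-um FO4 delay

print(f"ET bandwidth limit : {bw_et / 1e6:.0f} MHz")
print(f"ISS toggle limit   : {iss_rate / 1e9:.2f} GHz")
```

Under these assumptions the ET limit lands in the tens of MHz while the driver toggle limit is close to 2 GHz, which is the order-of-magnitude gap the argument relies on.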



Fig. 3.4 Conventional supply modulator vs. proposed ISS technique

#### *B. Instantaneous Supply-Switching*

Once again, ISS means that the PA selects its supply depending on the signal's instantaneous magnitude, rather than its envelope. If the magnitude of the signal is greater than the threshold $V_{TH}$, $V_{DD\_H}$ is selected; otherwise, $V_{DD\_L}$ is selected. The architecture and operating principle of ISS are shown in Fig. 3.4. Here we use a sinusoidal signal with peak amplitude greater than $V_{TH}$ and three envelope steps for illustration. When the envelope is smaller than $V_{TH}$, $V_{DD\_L}$ is always selected. When the envelope is greater than $V_{TH}$, there are some moments when the magnitude is larger than $V_{TH}$ and some moments when it is *smaller* than $V_{TH}$. A major difference in the proposed ISS technique versus existing art is that at these latter moments, the supply is switched to $V_{DD\_L}$ to save DC power. In this way, we can overcome the efficiency limitations of classic linear PAs. For example, an ideal Class-A PA has an efficiency upper bound of 50% for sinusoids. If ISS is applied to a Class-A PA, we can achieve greater than 50% efficiency for sinusoids, ideally up to 60%. In other words, ISS can achieve better efficiency than SS and ET *even for constant-envelope signals*. Note that current-mode supply switching can also be applied to high-speed ET if this is desired. Furthermore, more than two supply levels can be used for further efficiency improvement. A disadvantage of additional supply levels (besides complexity) is the extra parasitic capacitance that will be seen at the output nodes due to NPN collector-base and collector-substrate capacitance.
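The 60% figure for Class-A with ISS can be checked with a small idealized sketch. It assumes a normalized full-scale sinusoid, zero headroom, infinitely fast switching, a constant Class-A bias current of $2v_{peak}/R_L$, $V_{DD\_H} = v_{peak}/2$, and a switching threshold at $|v| = 2V_{DD\_L}$. These modeling choices mirror the ideal analysis used for the Class-B case in this section, but the function and variable names are hypothetical.

```python
import math


def class_a_iss_efficiency(vddl_over_vddh, n_samples=4000):
    """Ideal efficiency of a Class-A PA with two-level ISS for a full-scale sinusoid."""
    v_peak = 1.0                      # normalized peak differential output voltage
    r_load = 1.0                      # normalized load resistance
    vdd_h = v_peak / 2.0
    vdd_l = vddl_over_vddh * vdd_h
    i_bias = 2.0 * v_peak / r_load    # constant Class-A bias current
    p_dc = 0.0
    for n in range(n_samples):
        v = v_peak * math.sin(2.0 * math.pi * (n + 0.5) / n_samples)
        # ISS rule: pick the rail from the instantaneous magnitude, not the envelope.
        supply = vdd_h if abs(v) > 2.0 * vdd_l else vdd_l
        p_dc += supply * i_bias
    p_dc /= n_samples
    p_out = v_peak ** 2 / (2.0 * r_load)   # average power of the sinusoid
    return p_out / p_dc


ratios = [i / 100 for i in range(1, 100)]
best_ratio, best_eta = max(((r, class_a_iss_efficiency(r)) for r in ratios),
                           key=lambda pair: pair[1])
eta_no_iss = class_a_iss_efficiency(1.0)   # V_DD_L = V_DD_H: plain Class-A

print(f"Class-A without ISS: {eta_no_iss:.3f}")
print(f"Class-A with ISS   : {best_eta:.3f} at V_DD_L/V_DD_H = {best_ratio:.2f}")
```

In this model the optimum is fairly flat around a rail ratio slightly above one half, so the exact choice of $V_{DD\_L}$ is not critical for a sinusoid.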

We now analyze ideal efficiency for the target DOCSIS 3.1 CATV signals, assuming zero excess voltage and current headroom and an infinite supply switching rate. We compare the ideal efficiency between Class-A, Class-B, and Class-B with proposed ISS by considering the differential push-push PA topology in Fig. 3.5(a). DOCSIS 3.1 uses OFDM multi-carrier QAM signals. Multicarrier signals such as these tend to have near-Gaussian voltage probability density functions, due to the Central Limit Theorem. True Gaussian signals have infinite PAPR, with very infrequent large peaks. OFDM signals will have some naturally bounded PAPR, depending on the details of the signal format. For practical purposes, the peak voltage ($v_{peak}$) here is clipped to five standard deviations ($\sigma = v_{rms}$), which leads to 14 dB PAPR. Here PAPR is defined as $(v_{peak}/v_{rms})^2$ instead of $(v_{peak}/v_{rms})^2/2$. The latter definition is used in some RF literature, and is the ratio of the average power of a full-amplitude sinusoid to the average power of the actual signal. Clipping to five standard deviations does not significantly degrade the performance of DOCSIS 3.1 systems.



Fig. 3.5 (a) Differential push-push PA and (b) achievable efficiency with target signal and Class-B with ISS

In the Class-A case, $V_{DD}$ must be greater than $v_{peak}/2$ and $i_{bias}$ must be greater than $2v_{peak}/R_L$; therefore the minimum DC power ($P_{DC}$) to avoid clipping is $v_{peak}^2/R_L$. The average output power ($P_{out}$) is $v_{rms}^2/R_L$. Hence we have average power efficiency ($\eta$) of $v_{rms}^2/v_{peak}^2 = 1/PAPR$. For our target signal (PAPR = 14 dB, or a linear power ratio of 25:1), Class-A can ideally achieve $1/25 = 4\%$ efficiency. The linear power ratio is $10^{(PAPR/10)}$; in other words, PAPR expressed in linear instead of dB units. It is equal to the square of the voltage ratio.

In the Class-B case, the $V_{DD}$ limit is the same as for Class-A. The current follows the signal: $i_{bias} = 2|v_{sig}|/R_L$. The minimum average $P_{DC}$ is therefore:

$$
P_{DC} = V_{DD} \int_{-v_{peak}}^{v_{peak}} i_{bias} f(v_{sig})\, dv_{sig} = \sqrt{\frac{2}{\pi}}\, v_{peak} \frac{v_{rms}}{R_L} \tag{1}
$$

where $f(v_{sig})$ is the PDF of the Gaussian-distributed output voltage signal. We get $\eta = \sqrt{\pi/(2 \cdot PAPR)}$. For our target signals, Class-B can achieve 25% theoretical efficiency.

In our proposed operating mode, Class-B with ISS, $i_{bias}$ is the same as for Class-B. $V_{DD\_H}$ must be greater than $v_{peak}/2$. The supply-rail transitions occur when $V_{out} = 2V_{DD\_L}$. The minimum average $P_{DC}$ is then:

$$
P_{DC} = \int_0^{2V_{DD\_L}} 2V_{DD\_L}\, i_{bias} f(v_{sig})\, dv_{sig} + \int_{2V_{DD\_L}}^{v_{peak}} 2V_{DD\_H}\, i_{bias} f(v_{sig})\, dv_{sig} \tag{2}
$$



Fig. 3.6 Ideal efficiency with 14 dB PAPR Gaussian-distributed signal

Ideal efficiency vs. $V_{DD\_L}$ is plotted in Fig. 3.5(b). For our target signal, Class-B with ISS can achieve 52% efficiency when the optimum value of $V_{DD\_L}$ is chosen. This is much better than conventional Class-A and Class-B efficiency under such high PAPR conditions. Fig. 3.5(b) shows that $V_{DD\_L}$ cannot be too low; otherwise the supply would barely switch. It also cannot be too high; otherwise no power would be saved. Fig. 3.6 shows PAPR vs. achievable efficiency for a Gaussian-distributed signal. Our proposed ISS technique is strongly advantageous relative to Class-A and Class-B for high-PAPR signals.
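The ideal efficiencies quoted for the three operating modes can be reproduced with a short numerical sketch of the Class-A expression and the two $P_{DC}$ integrals of this section. It assumes a unit-variance Gaussian output voltage clipped at $5\sigma$, $R_L = 1$, $V_{DD\_H} = v_{peak}/2$, and an output power of $v_{rms}^2/R_L$ (the small power loss from clipping is neglected, as in the text). The function names are hypothetical.

```python
import math

K_CLIP = 5.0  # peak clipped at 5 sigma, as in the text (about 14 dB PAPR)


def partial_mean(a, b):
    """Integral of v * f(v) dv from a to b (0 <= a <= b), unit-variance Gaussian f."""
    return (math.exp(-0.5 * a * a) - math.exp(-0.5 * b * b)) / math.sqrt(2.0 * math.pi)


def eta_class_a(k=K_CLIP):
    """Class-A: eta = 1 / PAPR with PAPR = k^2."""
    return 1.0 / k ** 2


def eta_class_b(k=K_CLIP):
    """Class-B: V_DD = v_peak / 2, i_bias = 2|v| / R_L, using symmetry in v."""
    p_dc = 2.0 * k * partial_mean(0.0, k)
    return 1.0 / p_dc


def eta_class_b_iss(vdd_l, k=K_CLIP):
    """Class-B with ISS: low rail below the threshold 2 * V_DD_L, high rail above."""
    threshold = 2.0 * vdd_l
    p_dc = (2.0 * vdd_l * 2.0 * partial_mean(0.0, threshold)
            + k * 2.0 * partial_mean(threshold, k))   # 2 * V_DD_H = v_peak = k
    return 1.0 / p_dc


candidates = [i / 100 for i in range(1, 251)]  # V_DD_L in units of v_rms
best_vddl, best_eta = max(((v, eta_class_b_iss(v)) for v in candidates),
                          key=lambda pair: pair[1])

print(f"Class-A        : {eta_class_a():.3f}")
print(f"Class-B        : {eta_class_b():.3f}")
print(f"Class-B + ISS  : {best_eta:.3f} at V_DD_L = {best_vddl:.2f} v_rms")
```

In this idealized model the optimum low rail sits near one $v_{rms}$, that is, a switching threshold of about $2\sigma$. The practical design uses a higher ratio once headroom is included.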



Fig. 3.7 (a) Magnetic flux change and voltage jumps with RF choke bias and (b) reduction using center-tapped bias transformer

#### *C. Potential Issues and Solutions*

One major potential issue for current-mode ISS is the magnetic flux change in the bias chokes during rail switching. As shown in Fig. 3.7(a), if the PA is biased with conventional RF chokes and the current is switched from one rail to another, the NPN collectors see very large voltage jumps due to $L\,di/dt$. For example, in a typical broadband PA design using bias chokes with a value of 3 μH, with a current switching threshold of 100 mA and a transition time of 250 ps, the flux change would theoretically cause a voltage jump greater than 1 kV, which is of course not practical.

This issue is resolved first by using a center-tapped bias transformer instead of individual chokes. As the current switches from one rail to the other, the sum of the currents in the transformer windings remains constant. Because the transformer windings are coupled with $k = 1$ (ideally), there is no net flux change and therefore no voltage jump during a rail transition. The center-tapped transformer is needed anyway to reject Class-AB common-mode current changes, so it does not impose an additional cost. Fig. 3.7(b) illustrates how this works. When the current is switching from $V_{DD\_L}$ to $V_{DD\_H}$, the $V_{ch+}$ node has a flux change and voltage jump due to the left inductor, which is the same as the jump that occurs using an RF choke. Because of the mutual coupling, there is another flux change with the same amplitude but opposite direction. These two flux changes cancel each other out. Note that in CATV applications, a ferrite-core transformer is required due to the low frequency cutoff (5 MHz); but in wireless applications, an on-chip transformer could be used. Another way to understand this is that the center-tapped transformer has high differential-mode impedance from self-inductance ($L_{ch}$) plus mutual inductance ($M$), but low common-mode impedance from $L_{ch}$ minus $M$. The supply switching operates on the common mode, and the low impedance of the transformer fixes the flux change issue.
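The differential-mode and common-mode view can be written compactly. The expressions below are a sketch that assumes two identical half-windings, each with self-inductance $L_{ch}$ and coupling coefficient $k$, so that $M = kL_{ch}$; the symbols $L_{DM}$ and $L_{CM}$ are introduced here for illustration only.

$$
L_{DM} = L_{ch} + M = (1 + k)\,L_{ch}, \qquad L_{CM} = L_{ch} - M = (1 - k)\,L_{ch}
$$

With ideal coupling ($k = 1$), $L_{DM} = 2L_{ch}$ and $L_{CM} = 0$, so a common-mode current step produces no flux change. Any departure from $k = 1$ leaves a small residual common-mode inductance.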

In practice, the bias transformer has parasitic leakage inductance due to its package leads and also due to PCB traces, IC traces, bond wires, etc.; the equivalent model is shown in Fig. 3.8. The parasitic leakage inductance was modeled as 2 nH in series with each transformer lead, including the center tap. The residual voltage jump due to flux changes in the leakage inductances is now 2.4 V. This is much better than before, but still unacceptably large. Such jumps would encroach on the device headroom significantly, affecting efficiency and linearity.
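As a quick check of the magnitudes involved, applying $L\,di/dt$ with the switching threshold and transition time quoted for the RF-choke example gives the values below. Reusing the same 100 mA and 250 ps for a single 2 nH leakage inductance is an assumption made here for illustration; the 2.4 V residual quoted above comes from the full equivalent model of Fig. 3.8.

$$
\Delta V_{choke} \approx L\,\frac{\Delta i}{\Delta t} = 3\ \mu\text{H} \times \frac{100\ \text{mA}}{250\ \text{ps}} = 1.2\ \text{kV}
$$

$$
\Delta V_{leak} \approx 2\ \text{nH} \times \frac{100\ \text{mA}}{250\ \text{ps}} = 0.8\ \text{V per leakage inductance}
$$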



Fig. 3.8 Residual magnetic flux change and voltage jumps due to parasitic inductance and reduction with proposed capacitive coupling combiner

To resolve the residual flux change issue, we propose the *capacitive coupling combiner*, as shown in Fig. 3.8. The combiner is composed of two large equal-size capacitors between the $V_{DD\_H}$ and $V_{DD\_L}$ branches. The large capacitors maintain low impedance over the full frequency range. Note that the voltage jumps due to flux changes across the capacitors are equal and opposite. The low-impedance capacitive path between the two rails allows these jumps to offset each other. The voltage jumps due to flux changes in the parasitic inductances are reduced from 2.4 V to 0.4 V at the NPN collectors. Since the collector jumps are equal and opposite, the jump at the midpoint of the split and at the load is less than 10 µV in simulations.

The capacitive combiner is implemented with both internal and external capacitors. The internal capacitors (~30 pF) take care of the high-frequency switching transients without suffering from extra package and PCB trace ESL. External capacitors (~1 nF) on the PCB are necessary for the low-frequency limit of CATV upstream signals (5 MHz); internal capacitors of this magnitude would have excessive ground parasitics. The extra combiner pins are shared with the center-tapped bias transformers and common-mode chokes to the IC.

## **3.3 Circuit Implementation**

To demonstrate ISS for CATV applications, this PA was designed and fabricated in the TowerJazz 0.18 μm SiGe QF process. This process features high-breakdown-voltage BJT transistors and standard 0.18 µm CMOS devices. The complete PA architecture is shown in Fig. 3.9.



Fig. 3.9 PA schematic and test bench

We use pseudo-differential common-source NMOS transistors biased in Class-AB as the input stage, cascoded with NPN BJTs that serve as the proposed current-mode supply switches. The push-push differential architecture doubles the output swing and suppresses even-order distortion. The high-power (HP) core is connected to $V_{DD\_H} = 7.5$ V, whereas the low-power (LP) core is cascoded with Schottky diode clamps for protection and then connected to $V_{DD\_L} = 4.5$ V. The clamp diodes are necessary when the supply is switched from $V_{DD\_H}$ to $V_{DD\_L}$ with strong signal magnitude. Without the diodes, the base-collector PN junction at the LP core would turn on, leading to severe distortion and reliability issues. Moderate drawbacks of the clamp diodes are added capacitance and the extra headroom they require.

The PA is biased through center-tapped transformers, which prevent large voltage jumps. The proposed capacitive coupling combiner couples signal to the balun, and reduces jumps due to residual flux change in the parasitic inductances of the load circuit. Finally, a broadband 1:1 common-mode choke serves as the balun, coupling to a single-ended 75 $\Omega$ load. Resistive shunt-feedback is used for wideband input and output impedance matching. The supply-switching driver and output common-mode impedance control network are also included, and will be discussed later.



Fig. 3.10 Small-signal equivalent circuit of resistive shunt-feedback PA core

#### *A. Resistive Shunt-Feedback PA Core*

The differential-mode half circuit of the PA core is shown in Fig. 3.10. The shunt feedback resistor $R_{fb} = G_m Z_0^2$ is for broadband input and output impedance matching. The voltage gain of this topology is $-A_v = -(G_m Z_0 - 1)$. The shunt-feedback topology provides output matching with better efficiency than a shunt resistor ($R_{ter}$) to AC ground. Power dissipation in $R_{fb}$ is $P_{out}/(A_v + 1)$, which is very small compared to the dissipation in $R_{ter}$, which equals $P_{out}$. The bandwidth of the PA is from $\omega_{p1}$ to $\min[\omega_{p2}, 1/(C_{in}Z_0)]$, where

$$
C_{fb} \cong L_{ch}/(A_V Z_0^2) \tag{3}
$$

$$
\omega_{p1} \cong Z_0/(2L_{ch})\tag{4}
$$

$$
\omega_{p2} \cong 2\omega_{T_{BJT}}/(\alpha G_{m_{BJT}} Z_0), \alpha \cong 0.6 \tag{5}
$$

$$
1/(C_{in}Z_0) \cong \omega_{T_{CMOS}}/A_V \tag{6}
$$

The $f_T$ of the NPN BJT and NMOS transistors in this process are 28 GHz and 56 GHz, respectively. Substituting these process parameters, the PA has an estimated voltage gain of 20 dB and bandwidth from 6 MHz to 1.3 GHz. Note that the input common-source devices are NMOS for higher input bandwidth, and the cascode devices are BJT for high-voltage tolerance. $BV_{CBO} = 18$ V is the critical breakdown metric (not $BV_{CEO} = 8$ V) since the NPN bases have low-impedance drive.
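The shunt-feedback design relations can be exercised numerically. The sketch below assumes an ideal transconductor (infinite output resistance, no input capacitance) and an illustrative $Z_0$ of 75 Ω; the actual half-circuit impedance is not stated here, so the resulting $G_m$ and $R_{fb}$ values are hypothetical. The 20 dB gain and the 56 GHz NMOS $f_T$ are taken from the text.

```python
def shunt_feedback_design(gain_db, z0_ohm):
    """Return (A_v, G_m, R_fb) for a matched resistive shunt-feedback stage."""
    a_v = 10.0 ** (gain_db / 20.0)   # voltage gain magnitude
    g_m = (a_v + 1.0) / z0_ohm       # from A_v = G_m * Z_0 - 1
    r_fb = g_m * z0_ohm ** 2         # matching condition R_fb = G_m * Z_0^2
    return a_v, g_m, r_fb


def input_impedance(g_m, r_fb, r_load):
    """Input resistance of an ideal shunt-feedback transconductor driving r_load."""
    return (r_fb + r_load) / (1.0 + g_m * r_load)


Z0 = 75.0  # ASSUMED reference impedance for illustration
a_v, g_m, r_fb = shunt_feedback_design(20.0, Z0)
z_in = input_impedance(g_m, r_fb, Z0)

# Input-pole limit of Eq. (6): f ~= f_T(CMOS) / A_v, with f_T = 56 GHz from the text.
f_in_limit_hz = 56e9 / a_v
# Fraction of output power dissipated in R_fb, using the expression quoted in the text.
feedback_loss_fraction = 1.0 / (a_v + 1.0)

print(f"G_m = {g_m * 1e3:.1f} mS, R_fb = {r_fb:.0f} ohm, Z_in = {z_in:.1f} ohm")
print(f"Input-pole limit = {f_in_limit_hz / 1e9:.1f} GHz, "
      f"R_fb loss = {100 * feedback_loss_fraction:.1f}% of P_out")
```

The check that $Z_{in}$ returns to $Z_0$ confirms the matching condition. Because the upper band edge is the minimum of $\omega_{p2}$ and $1/(C_{in}Z_0)$, the input-pole limit computed here is only one of the two candidates.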



Fig. 3.11 Supply-switching driver

#### *B. Supply-Switching Driver*

The supply-switching driver is shown in Fig. 3.11. It comprises a level shifter cascaded with pre-driver and driver inverters. We use CMOS inverter drivers instead of current-mode logic (CML) drivers to save power at the expected supply-switching rates. Note that the target signals have high PAPR, so peaks are relatively infrequent. The Class-B CMOS drivers naturally have lower quiescent power consumption than CML drivers. The level shifter [15] transfers the supply-switching control signal to two paths. One path maintains the input 0-1.5 V levels; the second path level shifts upwards to 1.2-3 V. The voltage crossings are intentionally asymmetric to establish make-before-break action in current-mode switching; this minimizes flux-change-induced voltage jumps. Varactors at the gates of the drivers time-align the signals in the two paths (0-1.5 V and 1.2-3 V). The swing was increased from $6v_T = 150$ mV to 300 mV to ensure that the current is fully switched from one power rail to another. Simulated supply-switching bandwidth is approximately 2 GHz. The driver consumes less than 10 mW for typical signals, whereas ISS saves total DC power greater than 250 mW.
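The choice of base-drive swing can be illustrated with the standard large-signal relation for an ideal emitter-coupled bipolar pair. Treating the two NPN cascodes as such a pair is a simplifying assumption for this sketch (it ignores emitter resistance and dynamic effects), and $v_T = 25$ mV is the value implied by $6v_T = 150$ mV in the text.

```python
import math

V_T = 0.025  # thermal voltage in volts, implied by 6 * v_T = 150 mV in the text


def steered_fraction(v_diff, v_t=V_T):
    """Fraction of the shared emitter current carried by the more strongly driven NPN."""
    return 1.0 / (1.0 + math.exp(-v_diff / v_t))


frac_150mv = steered_fraction(0.150)
frac_300mv = steered_fraction(0.300)

print(f"150 mV drive: {100 * frac_150mv:.3f}% of the current is steered")
print(f"300 mV drive: {100 * frac_300mv:.4f}% of the current is steered")
```

Under this ideal model the current is already almost fully steered at 150 mV, so the 300 mV swing mainly adds margin against non-idealities.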



Fig. 3.12 Output common-mode impedance control network

#### *C. Output Common-mode Impedance Control Network*

The push-push Class-AB PA core generates large output common-mode current. The center-tapped bias transformer has low common-mode impedance at relatively low frequencies. Hence the output common-mode voltage signal is small enough that it does not constrict low-frequency differential swing and $P_{SAT}$. However, the parasitic leakage inductances have common-mode impedance proportional to frequency. At high frequencies, the increasing output common-mode voltage signal reduces the achievable undistorted differential swing, which limits $P_{SAT}$. To improve this, the common-mode impedance control network in Fig. 3.12 is placed in parallel with the collectors of the PA core. It has a high differential impedance of 4 k$\Omega$ and a low common-mode impedance of $(1 + 4\,\text{k}\Omega/R_{eq})/G_{m\_cmfb}$. Note that the values of the feedback capacitors and resistors (220 fF and 4 kΩ) are designed to have the same ratio as $C_{eq}$ to $R_{eq}$ of the common-emitter $G_m$ cell for flat frequency response. The simulation shows that the output common-mode impedance control network can achieve 10 GHz BW with common-mode impedance less than 50 Ω.

The total DC current of the common-mode feedback (CMFB) circuit is about 50 mA, with 25 mA for each of the HP and LP cores. The CMFB circuit can absorb up to ~50 mA peak common-mode current.

#### *D. Efficiency Estimation in Real Practice*

The previous ideal efficiency calculation for ISS assumes zero voltage and current headroom. However, these are not negligible in practice. The headroom voltages of the high-voltage and low-voltage cores are 0.9 V and 2.0 V, respectively. The low-voltage core requires higher voltage headroom due to the protection diodes (0.7 V) and residual flux change jumps (0.4 V). Note that in this work, we use the term "voltage headroom" rather than "knee voltage". The current headroom (Class-AB standing current) is 50 mA. There is a trade-off between current headroom and linearity due to $g_m$ variation. When the Class-AB standing current is too low, the input-stage $g_m$ variation over the full signal swing was found to be too large to easily invert using a digital pre-distortion (DPD) engine developed for this PA. Class-AB standing current is the major efficiency bottleneck in this design. An input stage with less transconductance variation is highly desirable.

Fig. 3.13 shows that our design can achieve drain efficiency (DE) and PAE of 19% and 15.5%, respectively, when $V_{DD\_L}$ is biased at 4.2 V. Drain efficiency and PAE with and without the Schottky clamp-diode drop are compared in Fig. 3.13. Efficiency measurements were performed by averaging over an extended modulated waveform, long enough to observe the full PAPR. The power of auxiliary circuits was included, but unlimited supply-switching speed was still assumed. $V_{DD\_L}$ was increased to 4.5 V for more headroom and linearity in subsequent measurements.



Fig. 3.13 Drain efficiency and PAE estimate with auxiliary circuits and headroom losses

## **3.4 Measurement Results**

The test setup uses a PC-controlled arbitrary waveform generator (Keysight M8190A AWG) to produce the RF input signal and supply-switching control waveforms. Fig. 3.14 shows the prototype PCB and PA die photo. The output of the PA drives an F-type coaxial connector and the 75 $\Omega$ CATV system. The chip area is $2.4 \times 2.4\ \text{mm}^2$, which is pin-limited (36-ball wafer-level BGA). The active area is $1.2 \times 1.2\ \text{mm}^2$.



Fig. 3.14 PCB and die photo



Fig. 3.15 Measured (a) S-parameters, (b) $P_{SAT}$ and (c) AM-AM at 50 MHz

Fig. 3.15(a) is a plot of measured S-parameters. The PA achieves 20 dB power gain over a range of 8 to 750 MHz (3 dB points). The bandwidth discrepancy between the earlier estimate and measurement is due to the relatively large parasitic capacitance of PCB routing. The PA also achieves good input and output impedance matching over this range, with return losses better than 8 dB. Fig. 3.15(b) shows measured  $P_{SAT}$ . Peak CW  $P_{SAT}$  is 31.3 dBm. This drops to 26 dBm above 400 MHz due to increasing output common-mode impedance from parasitic leakage inductance. Note that  $P_{SAT}$  would drop more without the output common-mode impedance control network. BW and  $P_{SAT}$  may be improved in the future with revised PCB design. Fig. 3.15(c) shows the AM-AM behavior of the PA at 50 MHz.

To verify the high-PAPR capability of this PA, we first demonstrate the output spectrum and constellation of a 10 Msym/s 256-QAM signal at carrier frequency of 100 MHz with PAPR of 9.6 dB. The PA achieves 23.6 dBm average output power, -34.8 dBc ACPR and 2.25% EVM in Fig. 3.16. The PAE is improved from 17% to 22.6% by applying ISS with 1 Gbit/s switching rate. Consumer-grade CATV equipment designed to the DOCSIS standard is expected to reach about -40 dB EVM for 256-QAM, and about -50 dB EVM for 1024-QAM. A novel broadband DPD algorithm was applied to the PA to achieve the linearity requirements [17].
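Because the paragraph above quotes EVM both as a percentage and in dB, a small conversion sketch may help when comparing the numbers. It uses the usual amplitude-ratio convention, $\text{EVM}_{dB} = 20\log_{10}(\text{EVM}_{\%}/100)$, which is assumed here rather than stated in the text.

```python
import math


def evm_percent_to_db(evm_percent):
    """Convert EVM from percent (amplitude ratio) to dB."""
    return 20.0 * math.log10(evm_percent / 100.0)


def evm_db_to_percent(evm_db):
    """Convert EVM from dB to percent (amplitude ratio)."""
    return 100.0 * 10.0 ** (evm_db / 20.0)


measured_evm_db = evm_percent_to_db(2.25)       # measured 256-QAM EVM from the text
target_256qam_pct = evm_db_to_percent(-40.0)    # target quoted for 256-QAM
target_1024qam_pct = evm_db_to_percent(-50.0)   # target quoted for 1024-QAM

print(f"2.25% EVM  = {measured_evm_db:.1f} dB")
print(f"-40 dB EVM = {target_256qam_pct:.2f}%")
print(f"-50 dB EVM = {target_1024qam_pct:.3f}%")
```

Under this convention the measured 2.25% is several dB short of the -40 dB level quoted for 256-QAM, which is consistent with the need for DPD noted above.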



Fig. 3.16 Measured spectrum and constellation of 256-QAM signal



Fig. 3.17 Measured spectrum of (a) NPR signal and (b) NPR signal with DPD

We also demonstrate performance with a 14 dB PAPR, 20-to-220 MHz noise-power-ratio (NPR) test signal with three 12 MHz notches across the band in Fig. 3.17(a). NPR testing is a well-established technique for evaluating broadband communication systems, such as DOCSIS [16]. NPR testing uses white Gaussian noise (WGN) signals with notches. The notches allow observation of any in-band noise and distortion contributed by the system under test, and accurate measurement of signal-to-noise-and-distortion ratios. NPR notch depth can be correlated with EVM measurements [21]. PAPR was controlled by clipping, as mentioned earlier.

With this test signal, the PA achieves average power of 20 dBm with -27 dBc notch depth and sideband rejection ratio across the band. The PAE is improved from 9% to 13.6% with ISS and a 1 Gbit/s switching rate. Note that the measured data, output power, and linearity are the same with or without ISS. The measured PAE is close to the estimated efficiency; the discrepancy is due to the finite supply-switching rate. To show that this DPD algorithm can be applied with ISS, we further illustrate the effectiveness of DPD applied to this PA. The notch depth/side-band rejection is improved with DPD from -27 dBc to -40 dBc across the band. The residual distortion increases at very low frequencies due to even-order effects; this may be addressed in future work.

We also used a 14 dB PAPR, 50-to-150 MHz NPR test signal with a 12 MHz notch centered at 100 MHz. Results are presented in Fig. 3.17(b). We compare three cases: PA operating in Class-AB, Class-AB with ISS (1 GHz SS rate), and Class-AB with ISS plus DPD. ISS improves PAE from 9% to 13%, and the notch depth/side-band rejection is improved with DPD from -30 dBc to -47 dBc. Although supply-switching artifacts increase the distortion floor somewhat at high frequencies, this effect is well below the main signal power spectrum (50 dB).



Fig. 3.18 Measured (a) CW efficiency at 50 MHz and (b) PAE vs. PAPR

Fig. 3.18(a) shows measured test results with a CW sinusoid at 50 MHz. Class-AB with ISS has superior efficiency relative to established PA classes even with CW signals for the first 7 dB of power backoff.

Fig. 3.18(b) shows PAE vs. PAPR. ISS with a 1 Gbit/s supply-switching rate shows superior efficiency vs. standard wideband Class-A and Class-AB [18]-[19] PAs, including the state-of-the-art commercial CATV upstream product [1]. It also achieves comparable efficiency to the narrow-band Doherty switching PA with conventional supply modulation in [12]. The following waveforms were used in Fig. 3.18(b): for 3 dB PAPR, a single tone; for 6 dB PAPR, two tones; for 9 dB PAPR, four tones; for 9.6 dB PAPR, a 256-QAM signal; for 12 dB PAPR, an NPR signal clipped to $\pm 4\sigma$; for 14 dB PAPR, an NPR signal clipped to $\pm 5\sigma$. Measured results are compared to recent literature in Table 3.1, confirming our superior average efficiency for high-PAPR, high-fractional-bandwidth signals.
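The PAPR values listed for the test waveforms follow from the definition $\text{PAPR} = (v_{peak}/v_{rms})^2$ used in this chapter. The sketch below assumes equal-amplitude tones whose peaks align (the worst case) and neglects the small reduction in rms caused by clipping a Gaussian signal; the function names are hypothetical.

```python
import math


def papr_db_equal_tones(n_tones):
    """PAPR of n equal-amplitude tones with aligned peaks: peak = n*A, rms = A*sqrt(n/2)."""
    return 10.0 * math.log10(2.0 * n_tones)


def papr_db_clipped_gaussian(k_sigma):
    """PAPR of a Gaussian signal clipped at +/- k sigma (rms change from clipping ignored)."""
    return 20.0 * math.log10(k_sigma)


tone_paprs = {n: papr_db_equal_tones(n) for n in (1, 2, 4)}
clip_paprs = {k: papr_db_clipped_gaussian(k) for k in (4, 5)}

for n, value in tone_paprs.items():
    print(f"{n} tone(s): {value:.2f} dB")
for k, value in clip_paprs.items():
    print(f"clipped at {k} sigma: {value:.2f} dB")
```

These evaluate to roughly 3, 6, 9, 12, and 14 dB, in line with the waveform list above.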





\*Single channel only \*\*$P_{1dB}$ \*\*\*Drain efficiency

## **3.5 Conclusion**

Broadband, efficient PA techniques are needed in anticipation of forthcoming multicarrier and multi-band communication standards. This work has demonstrated a high-speed supply-modulation approach, ISS. Combined with a Class-AB core, it achieves superior efficiency for broadband high-PAPR signals. The intended application of this work is CATV upstream/downstream, but extension to other systems is possible. Higher bandwidth or power may be obtained by using a greater step-up ratio (vs. 1:1 in this work).

## **Chapter Acknowledgement**

The text of this chapter will be published as a regular paper in the IEEE Journal of Solid-State Circuits, May, 2018. The dissertation author was the primary researcher. The authors thank Kevin Miller, Guiseppe Cusmai, Tak Hayashi, CY Chen and Jiangfeng Wu for many ideas, and Kim Ng, Ardie Venes, Myles Wakayama and the Broadcom Foundation for their support.

## **References**

[1] Skyworks, "1.218 GHz High Output GaN CATV Power Doubler Amplifier," Preliminary Data Sheet, Oct 2016. Online Available: http://www.skyworksinc.com/uploads/documents/ACA2429\_204207B.pdf

[2] NXP, "BGA3131 DOCSIS 3.1 Upstream Amplifier," Product Data Sheet Rev. 1, May 2016. Online Available: http://www.nxp.com/documents/data\_sheet/BGA3131.pdf

[3] M. K. Kazimierczuk, *RF Power Amplifiers.* Chichester, West Sussex, UK: Wiley, 2008.

[4] W. H. Doherty, "A new high efficiency power amplifier for modulated waves," *Proc. IRE*, vol. 24, no. 9, pp. 1163-1182, Sep. 1936.

[5] H. Chireix, "High power outphasing modulation," *Proc. Inst. Radio Engineers*, vol. 23, no. 11, pp. 1370–1392, 1935.

[6] S. C. Cripps*, Advanced Techniques in RF Power Amplifier Design*. Norwood, MA, USA: Artech House, 2002.

[7] S. Moloudi and A. A. Abidi, "The outphasing RF power amplifier: A comprehensive analysis and a Class-B CMOS realization," *IEEE J. Solid-State Circuits*, vol. 48, no. 6, pp. 1357-1369, Jun 2013.

[8] W. Tai, H. Xu, A. Ravi, H. Lakdawala, O. Bochobza-Degani, L. R. Carley and Y. Palaskas, "A transformer-combined 31.5 dBm outphasing power amplifier in 45 nm LP CMOS with dynamic power control for back-off power efficiency enhancement," *IEEE J. Solid-State Circuits*, vol. 47, no. 7, pp. 1646-1658, Jul 2012.

[9] D. Chowdhury, S. V. Thyagarajan, L. Ye, E. Alon and A. M. Niknejad, "A fullyintegrated efficient CMOS inverse Class-D power amplifier for digital polar transmitters," *IEEE J. Solid-State Circuits*, vol. 47, no. 5, pp. 1113-1122, May 2012.

[10] S.-M. Yoo, J. S. Walling, O. Degani, B. Jann, R. Sadhwani, J. C. Rudell and D. J. Allstot, "A Class-G switched-capacitor RF power amplifier," *IEEE J. Solid-State Circuits*, vol. 48, no. 5, pp. 1212-1224, May 2013.

[11] N. Singhal, H. Zhang and S. Pamarti, "A Zero-Voltage-Switching Contour-based Outphasing Power Amplifier," *IEEE Trans. Microw. Theory Tech.*, vol. 60, no. 6, pp. 1896-1906, Jun. 2012.

[12] S. Hu et al., "A Broadband CMOS Digital Power Amplifier with Hybrid Class-G Doherty Efficiency Enhancement," *ISSCC Dig. Tech. Papers*, pp. 44-45, Feb 2015.

[13] J. Lee, S. Pamarti and R. Gomez, "A 10-to-650MHz 1.35W Class-AB Power Amplifier with Instantaneous Supply Switching Efficiency Enhancement," in Proc. IEEE CICC, May 2017, pp. 1–4.

[14] D. Chowdhury, S. R. Mundlapudi and A. Afsahi, "A fully integrated reconfigurable wideband envelope-tracking SoC for high-bandwidth WLAN applications in a 28nm CMOS technology," ISSCC Dig. Tech. Papers, pp. 34-35, Feb 2017.

[15] B. Serneels, M. Steyaert and W. Dehaene, "A High speed, Low Voltage to High Voltage Level Shifter in Standard 1.2V 0.13μm CMOS," *IEEE ICECS*, pp. 668-671, 2006.

[16] F. H. Irons, K. J. Riley, D. M. Hummels and G. A. Friel, "The noise power ratio-theory and ADC testing," in *IEEE Transactions on Instrumentation and Measurement*, vol. 49, no. 3, pp. 659-665, Jun 2000.

[17] J. Lee, S. Pamarti and R. Gomez, "Training of Predistortion Based on Signal-to-Distortion-Ratio Measurements," IEEE Bipolar/BiCMOS Circuits and Technology Meeting (BCTM), pp 9-12, Oct. 2017.

[18] H. Wang et al., "A 5.2-to-13GHz Class-AB CMOS Power Amplifier with a 25.2dBm Peak Output Power at 21.6% PAE," ISSCC Dig. Tech. Papers, pp. 44-45, Feb 2010.

[19] W. Ye et al., "A 2-to-6GHz Class-AB Power Amplifier with 28.4% PAE in 65nm CMOS Supporting 256QAM," ISSCC Dig. Tech. Papers, pp. 38-39, Feb 2015.

[20] C.-K. K. Yang, "Design of high-speed serial link in CMOS," Ph.D. dissertation, Stanford Univ., Stanford, CA, 1998.

[21] J. B. Sombrin, "On the formal identity of EVM and NPR measurement methods: Conditions for identity of error vector magnitude and noise power ratio," 2011 41st European Microwave Conference, Manchester, pp. 337-340, Oct 2011.

# **CHAPTER 4 Overview of Digital Pre-distortion for Power Amplifiers**

The CATV network carries TV and internet traffic between the headend and customers. Because of rapidly growing demands on spectral efficiency and multi-user support, CATV signals use high-order constellations (1024-QAM), multi-carrier signaling (OFDM), and multi-channel aggregation. As a result, CATV signals have high PAPR and high fractional bandwidth compared with the latest Wi-Fi protocols. They also require high linearity, up to 50 dBc sideband rejection ratio. However, conventional PAs trade off efficiency against linearity, so a stringent linearity requirement leads to poor efficiency and hence high thermal-management cost. For CATV applications, a wide-bandwidth PA with both high linearity and high average efficiency is therefore desirable.

Several linearization schemes have been presented to mitigate PA nonlinearity. Among these, digital pre-distortion (DPD) is currently the most popular. In this chapter, an overview of DPD for power amplifiers is presented. Section 4.1 describes power amplifier linearization techniques, including feedback linearization, feedforward linearization, and digital pre-distortion. Section 4.2 addresses nonlinear and pre-distortion models for power amplifiers, covering memoryless models, nonlinear models with linear memory, and nonlinear models with nonlinear memory. Section 4.3 reviews direct-learning and indirect-learning DPD estimators. Section 4.4 presents algorithms for DPD coefficient adaptation, from least squares (LS) and singular value decomposition (SVD) to QR factorization. Finally, Section 4.5 discusses the challenges of wideband DPD.

## **4.1 Power Amplifier Linearization Techniques**

Several PA linearization techniques have been proposed to overcome the trade-off between efficiency and linearity. These techniques can be grouped into three categories: feedback, feedforward, and digital pre-distortion [1].

#### *A. Feedback Linearization*



Fig. 4.1 System diagram of feedback linearization

Feedback linearization was first proposed by Harold Black at Bell Labs [2] to relieve linearity issues in telephone systems. For long-distance voice transmission, amplifiers are used to recover the signals from channel loss. However, the signals are distorted by the nonlinearity of those amplifiers. Black proposed negative feedback linearization, whose system diagram is shown in Fig. 4.1. The main concept is that a portion (β) of the output signal, including its distortion, is fed back and subtracted from the input signal. The distortion is reduced by a factor of $(1 + \text{loop gain})$ [13]-[14], at the cost of gain. Feedback linearization became popular in the RF PA industry. However, the frequency response of the loop gain limits its application at high frequencies and bandwidths.
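The $(1 + \text{loop gain})$ distortion reduction can be illustrated with a memoryless numerical sketch. The amplifier below is a hypothetical weakly nonlinear cubic model, and the gain, cubic coefficient, and feedback fraction are arbitrary assumptions chosen so that the loop gain is 9. The open-loop and closed-loop cases are driven to roughly the same output amplitude so that the third-harmonic distortion (HD3) can be compared fairly.

```python
import cmath
import math

A1, A3 = 100.0, -1.0e4   # ASSUMED open-loop gain and cubic coefficient
BETA = 0.09              # ASSUMED feedback fraction; loop gain T = A1 * BETA = 9
N = 256                  # samples per period


def amp(e):
    """Weakly nonlinear open-loop amplifier (hypothetical model)."""
    return A1 * e + A3 * e ** 3


def closed_loop(x):
    """Solve y = amp(x - BETA * y) with Newton's method."""
    y = x * A1 / (1.0 + A1 * BETA)   # start from the linear closed-loop solution
    for _ in range(30):
        e = x - BETA * y
        residual = y - amp(e)
        slope = 1.0 + BETA * (A1 + 3.0 * A3 * e * e)
        y -= residual / slope
    return y


def harmonic(samples, k):
    """Amplitude of the k-th harmonic from one period of samples (single-bin DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / len(samples))
              for n, s in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)


def hd3(samples):
    """Third-harmonic amplitude relative to the fundamental."""
    return harmonic(samples, 3) / harmonic(samples, 1)


tone = [math.cos(2.0 * math.pi * n / N) for n in range(N)]
open_loop_out = [amp(0.01 * c) for c in tone]           # about 1 V output amplitude
closed_loop_out = [closed_loop(0.1 * c) for c in tone]  # about 1 V output amplitude

improvement = hd3(open_loop_out) / hd3(closed_loop_out)
print(f"HD3 improvement with feedback: {improvement:.1f}x (1 + T = {1 + A1 * BETA:.0f})")
```

The closed-loop input is ten times larger for the same output, which is the gain cost mentioned above. This static model has no loop dynamics, so it does not capture the bandwidth limitation.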

#### *B. Feedforward Linearization*



Fig. 4.2 System diagram of feedforward linearization

Feedforward linearization was also first proposed by Black while dealing with the same issue in telephone systems. The system diagram is shown in Fig. 4.2. Unlike feedback linearization, which corrects the signal at the PA input, the feedforward scheme cancels the distortion at the output. The scheme is composed of a signal cancellation path and a distortion cancellation path. The signal cancellation path subtracts the delayed input signal from the coupled PA output signal, leaving only the nonlinear products, and delivers this distortion-only signal to the distortion cancellation path. The distortion cancellation path then subtracts the amplified distortion-only signal from the delayed PA output signal, removing the PA distortion and transmitting the desired undistorted signal to the load. Feedforward linearization can achieve high linearity over a high fractional bandwidth [3][15], so it was commonly used in wireless base stations. However, it suffers from poor efficiency because of losses in the complex passive components and in the auxiliary amplifier, which is inefficient when the linearity requirement is high.

#### *C. Digital Pre-distortion*



Fig. 4.3 System diagram of digital pre-distortion

Because of the limited power efficiency of the feedforward linearization technique, base-station manufacturers sought alternative linearization schemes. With the rapid advancement of digital-to-analog and analog-to-digital converters (DACs/ADCs) and digital signal processing, digital pre-distortion (DPD) has become increasingly popular [4]-[12], [16]. It achieves comparable linearity over a wide bandwidth with better power efficiency than feedforward linearization. Moreover, the system is less complex.

The system diagram of a DPD system is shown in Fig. 4.3. The ADC captures the coupled PA output signal and feeds the digital data to the adaptation block (estimator), which evaluates the coefficients of the PA nonlinear/pre-distortion model. The DPD nonlinear equalizer uses the estimated coefficients and the model to generate the pre-distorted input signal $\tilde{x}$, which drives the PA through the DAC. To achieve adequate linearity improvement, it is important to build an appropriate PA nonlinear/pre-distortion model and to retrieve accurate coefficients, as discussed in the following sections.

## **4.2 Power Amplifier Nonlinear/Pre-distortion Models**

In this section, PA nonlinear/pre-distortion models are addressed. We start from the simplest special case, memoryless models, and proceed to more general cases: models with linear memory, and then models with nonlinear memory. These are suited to narrow-band, medium-band, and wide-band systems, respectively.

#### *A. Memoryless Models*

Memoryless models [1] are suitable when the PA output signal depends only on the present input signal, as shown in Fig. 4.4. A memoryless model describes the PA well when it has little or no memory effect, which is typically true for narrow-band systems. To build the model, AM-AM and AM-PM curves are required to capture the amplitude and phase behavior, as shown in Fig. 4.5. Several memoryless models have been presented to describe these curves. Among them, the power series and the look-up table (LUT) are two of the most popular.



Fig. 4.4 Memoryless model for narrowband system



Fig. 4.5 AM-AM and AM-PM curves
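As an illustration of how AM-AM and AM-PM curves follow from a memoryless power-series model, the sketch below evaluates a complex baseband polynomial at a few input amplitudes. The coefficients are hypothetical and chosen only to show gain compression and phase rotation; they are not fitted to any measured amplifier.

```python
import cmath
import math

# Hypothetical complex power-series coefficients (odd orders only).
COEFFS = {1: 10.0 + 0.0j, 3: -2.0 + 1.0j, 5: 0.1 - 0.2j}

def pa_memoryless(x):
    """y = x * sum_k a_k |x|^(k-1): the output depends only on the present sample."""
    r = abs(x)
    return x * sum(a * r ** (k - 1) for k, a in COEFFS.items())

def am_am_am_pm(amplitude):
    """Return (output amplitude, phase shift in degrees) for an input amplitude."""
    y = pa_memoryless(complex(amplitude, 0.0))
    return abs(y), math.degrees(cmath.phase(y))

for a in (0.1, 0.5, 1.0):
    out, phase = am_am_am_pm(a)
    print(f"|x| = {a:.1f}  |y| = {out:.3f}  gain = {out / a:.3f}  phase = {phase:.2f} deg")
```

A LUT model would store the same two curves as tables indexed by input amplitude instead of evaluating a polynomial.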

#### *B. Nonlinear Models with Linear Memory*

The memory effect becomes non-negligible as the signal bandwidth increases. We start by considering linear memory. Nonlinear models with linear memory [1] are quite intuitive when we look at the PA topology, as shown in Fig. 4.6. The topology is simplified to input and output matching networks and a nonlinear transistor. The matching networks are composed of passive linear components, which contribute linear memory. Therefore, when the nonlinear memory effect of the transistor is not significant, nonlinear models with linear memory are good candidates for modeling PA behavior.



Fig. 4.6 Simplified PA's topology



Fig. 4.7 Nonlinear models with linear memory

There are three well-known categories of these models: the Wiener model [17], [20]-[21], the Hammerstein model [20], and the Wiener-Hammerstein model [19]-[20], shown in Fig. 4.7. In the Wiener model, a linear filter precedes the memoryless nonlinear block, while in the Hammerstein model, a linear filter follows it. The Wiener-Hammerstein model combines both, with linear filters at each end of the memoryless nonlinear block. Nonlinear models with linear memory are suitable for medium-band systems.
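The three structures differ only in where the linear filters sit relative to the memoryless block. A minimal sketch follows, assuming a short FIR filter and a cubic nonlinearity with hypothetical coefficients:

```python
def fir(x, taps):
    """Causal FIR filter with zero initial state."""
    return [sum(h * x[n - k] for k, h in enumerate(taps) if n - k >= 0)
            for n in range(len(x))]

def cubic(x, a3=-0.2):
    """Hypothetical memoryless nonlinearity."""
    return [v + a3 * v ** 3 for v in x]

def wiener(x, taps):
    return cubic(fir(x, taps))           # linear filter first, then nonlinearity

def hammerstein(x, taps):
    return fir(cubic(x), taps)           # nonlinearity first, then linear filter

def wiener_hammerstein(x, pre_taps, post_taps):
    return fir(cubic(fir(x, pre_taps)), post_taps)

x = [1.0, -0.5, 0.25, 0.8, -1.0, 0.3]
taps = [0.7, 0.3]
print(wiener(x, taps))
print(hammerstein(x, taps))
```

With a single unit tap both models collapse to the memoryless block. With more than one tap the outputs differ, because the filter and the nonlinearity do not commute.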

#### *C. Nonlinear Models with Nonlinear Memory*

When the nonlinear memory effect of the transistor becomes significant, the models in Fig. 4.7 are no longer sufficient, and nonlinear models with nonlinear memory [1] must be applied to fit the PA behavior. The Volterra series can represent a very broad class of nonlinear systems with memory. Equation (1) gives its general form:

$$
y(n) = h_0 + \sum_{m_1=0}^{M} h_1(m_1)x(n - m_1) + \sum_{m_1=0}^{M} \sum_{m_2=m_1}^{M} h_2(m_2, m_1)x(n - m_2)x(n - m_1) + \sum_{m_1=0}^{M} \sum_{m_2=m_1}^{M} \sum_{m_3=m_2}^{M} h_3(m_3, m_2, m_1)x(n - m_3)x(n - m_2)x(n - m_1) + \cdots \tag{1}
$$

Here $h_i$ can be understood as the filter (kernel) for the i-th order nonlinearity: each term in the Volterra series is a multidimensional convolution of the i-th order kernel with i-th order products of the input. Though the Volterra series is very powerful, in its general form it has unlimited nonlinear order and memory, and the number of coefficients grows rapidly with both, which requires a lot of hardware resources. Therefore, pruned forms of the Volterra series are necessary in practice; these are introduced in the following.

#### *a. Memory Polynomial (MP)*

The memory polynomial model is one of the best-known pruned Volterra series [5], [18]-[19]. The MP model is given by equation (2):

$$
y(n) = \sum_{p=1}^{P} \sum_{m=0}^{M} h_{pm} [x(n-m)]^p \tag{2}
$$

The MP model can be understood as a reduction of the Volterra series that keeps only products of samples taken from the same memory tap. P and M are the maximum nonlinear order and the maximum memory tap, respectively. There are also 2-D MP models for dual-band transmitters, which include the cross-terms between the two channels [22]-[24].
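Equation (2) can be evaluated by first building a matrix whose columns are the delayed, powered input samples. The sketch below assumes a real-valued signal, zero samples before n = 0, and hypothetical coefficient values; the function names are illustrative only.

```python
import numpy as np

def mp_basis(x, P, M):
    """Matrix with one column per term [x(n-m)]^p of eq. (2), ordered p-major.

    Samples before n = 0 are taken as zero.
    """
    x = np.asarray(x, dtype=float)
    cols = []
    for p in range(1, P + 1):
        for m in range(M + 1):
            delayed = np.concatenate([np.zeros(m), x[:len(x) - m]])
            cols.append(delayed ** p)
    return np.column_stack(cols)

def memory_polynomial(x, h):
    """Evaluate eq. (2); h[p-1, m] holds the coefficient h_pm."""
    h = np.asarray(h, dtype=float)
    P, M = h.shape[0], h.shape[1] - 1
    return mp_basis(x, P, M) @ h.reshape(-1)

x = np.array([1.0, 2.0, -1.0, 0.5])
h = np.array([[1.0, 0.5],      # p = 1, taps m = 0 and m = 1
              [0.0, 0.0],      # p = 2
              [-0.1, 0.02]])   # p = 3
y = memory_polynomial(x, h)
print(y)
```

The model needs only P(M + 1) coefficients (six in this example), and its output is a matrix-vector product in those coefficients.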

#### *b. Generalized Memory Polynomial (GMP)*

The generalized memory polynomial model is another popular pruned Volterra series [6]-[7], [17]. The GMP model generalizes the MP model by introducing cross terms between different memory taps. It is given by equation (3):

$$
y(n) = \sum_{p=1}^{P} \sum_{g=0}^{G} \sum_{m=0}^{M} h_{pgm} x(n-g) [x(n-m)]^{p-1} \tag{3}
$$

P and M have the same definitions as in the MP model. G is the memory length of the cross terms.

#### *c. Dynamic-Deviation Reduction (DDR)*

The Dynamic-Deviation Reduction model is also a well-known pruned Volterra series [8]. It is given by equation (4):

$$
y(n) = \sum_{p \in P} x^{p-r}(n) \sum_{i_1=1}^{M} \cdots \sum_{i_r=1}^{M} h_{p,r}(0, \cdots, 0, i_r, \cdots, i_1) \prod_{j=1}^{r} x(n - i_j) \tag{4}
$$

P and M have the same definitions as in the MP model, and r is the highest order of the memory-dependent terms. Note that in the GMP model, each cross term multiplies samples from at most two different memory taps, whereas the DDR model relaxes this restriction to up to min(r, M) different taps.

## **4.3 Digital Pre-distortion Estimator**

With appropriate models for PA behavior, the estimator is required to evaluate the coefficients. Two estimators are considered here: direct learning and indirect learning [9].

#### *A. Direct Learning*

Direct learning is also known as closed-loop estimation. It uses the system of Fig. 4.3, the same diagram used to introduce DPD. The loop minimizes the difference between x and the data sampled by the ADC while deriving accurate coefficients. The details of coefficient adaptation are addressed in the next section.

As a rule of thumb, the ADC sampling rate is about 3-5 times the data bandwidth. However, the number of samples captured by the ADC is much larger than the number of unknown coefficients. Thus, sub-sampling can be applied as long as the sub-sampled data set is long enough to provide sufficient rank [10].

#### *B. Indirect Learning*



Fig. 4.8 System diagram of indirect learning

The system diagram of indirect learning [11]-[12], [19] is shown in Fig. 4.8. The DPD nonlinear equalizers before the DAC (pre-distorter) and after the ADC (post-distorter) are identical. The estimator updates the coefficients based on the outputs of both DPD nonlinear equalizers. The target is to minimize the mean square of the difference between the two inputs of the estimator, the pre-distorted and post-distorted signals.

Indirect learning is known to converge faster than the direct learning estimator. However, output noise and an inaccurate pre-distortion model introduce an offset [9] into the estimate, which limits the linearization performance. In addition, the ADC sampling rate is constrained by the memory tap spacing and must be much higher than the sub-sampling rate possible in the direct learning scheme [9].

## **4.4 Digital Pre-distortion Coefficients Adaptation**

Inside the estimator, a coefficient-adaptation algorithm is required to calculate the coefficients. From equations $(1)-(4)$, it is noted that the model outputs are linear in the coefficients, even though the individual terms are nonlinear in x. Therefore, those equations describing PA behavior can be represented more compactly as:

$$
y(n) = \sum_{i=1}^{N} a_i f_i(x(n)) = Fa \tag{5}
$$

where F and a are the matrix of nonlinear terms and the vector of coefficients, respectively. The intuitive way to solve for a is to use the inverse matrix:

$$
F^{-1}y = \hat{a} \tag{6}
$$

However, F is generally not a square matrix, so a pseudo-inverse is required, and computing it can be challenging. Approaches for obtaining the pseudo-inverse solution are discussed in the following:

#### *A. Least Squares (LS)*

The commonly used least squares approach is presented as:

$$
(F^H F)^{-1} F^H y = \hat{a} \tag{7}
$$

It can be applied when F is not a square matrix, which is the usual case. However, inverting a large matrix is expensive. Moreover, ill-conditioning becomes a problem when $F^H F$ has small eigenvalues, so solvers that are more robust to ill-conditioning are needed.
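A minimal sketch of the least-squares solution of eq. (7) is given below, using synthetic data. The regression matrix, the "true" coefficients, and the noise level are all hypothetical. The last line illustrates the conditioning concern: the condition number of $F^H F$ is the square of that of F.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)

# Columns f_i(x(n)) of F in eq. (5): here x, x^2 and x^3 (a memoryless example).
F = np.column_stack([x, x ** 2, x ** 3])
a_true = np.array([10.0, 0.3, -2.0])           # hypothetical PA coefficients
y = F @ a_true + 1e-3 * rng.standard_normal(len(x))

# Eq. (7): solve (F^H F) a = F^H y rather than forming the inverse explicitly.
gram = F.conj().T @ F
a_hat = np.linalg.solve(gram, F.conj().T @ y)

print("estimate:", a_hat)
print("cond(F) =", np.linalg.cond(F), " cond(F^H F) =", np.linalg.cond(gram))
```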

#### *B. Singular Value Decomposition (SVD)*

Singular value decomposition is more robust to ill-conditioning. It also avoids the expensive explicit matrix inversion. The SVD of F is:

$$
F = A \Sigma B^T \tag{8}
$$

where A and B have orthonormal columns and $\Sigma$ is a diagonal matrix of the singular values of F, which are the square roots of the eigenvalues of $F^H F$. For complex data the transposes become conjugate transposes. The pseudo-inverse matrix is:

$$
F^{+} = B\Sigma^{-1}A^T \tag{9}
$$
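The pseudo-inverse of eq. (9) can be formed directly from the factors of the decomposition. The sketch below uses a hypothetical regression matrix and noiseless data, and relies on NumPy's economy-size SVD so that $\Sigma$ is square.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 200)
F = np.column_stack([x, x ** 2, x ** 3])
a_true = np.array([10.0, 0.3, -2.0])            # hypothetical coefficients
y = F @ a_true

# Economy-size SVD: F = A @ diag(s) @ Bh, where Bh is the conjugate transpose of B.
A, s, Bh = np.linalg.svd(F, full_matrices=False)

# Eq. (9): inverting Sigma only requires reciprocals of the singular values.
F_pinv = Bh.conj().T @ np.diag(1.0 / s) @ A.conj().T
a_hat = F_pinv @ y
print(a_hat)
```

In an ill-conditioned problem, very small singular values can be discarded before taking reciprocals. That regularization step is not shown here.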

#### *C. QR Factorization*

QR factorization is another approach that is more robust to ill-conditioning. It is given by:

$$
F = QR \tag{10}
$$

where Q is an $M \times N$ matrix with orthonormal columns and R is an $N \times N$ upper-triangular, invertible matrix (here M denotes the number of observations, not the memory depth). Q and R can be derived by Gram-Schmidt orthonormalization. The coefficient vector is solved as:

$$
R^{-1}Q^H y = \hat{a} \tag{11}
$$
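Because R is upper-triangular, eq. (11) does not need a general matrix inverse: back-substitution is enough. A minimal sketch with hypothetical data follows; the helper `solve_upper_triangular` is written out here for illustration.

```python
import numpy as np

def solve_upper_triangular(R, b):
    """Back-substitution for R a = b, with R upper-triangular and invertible."""
    n = len(b)
    a = np.zeros(n, dtype=np.result_type(R, b))
    for i in range(n - 1, -1, -1):
        a[i] = (b[i] - R[i, i + 1:] @ a[i + 1:]) / R[i, i]
    return a

rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, 200)
F = np.column_stack([x, x ** 2, x ** 3])
a_true = np.array([10.0, 0.3, -2.0])             # hypothetical coefficients
y = F @ a_true

Q, R = np.linalg.qr(F)                           # reduced QR: Q is 200 x 3, R is 3 x 3
a_hat = solve_upper_triangular(R, Q.conj().T @ y)
print(a_hat)
```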

## **4.5 Challenges of Wideband Digital Pre-distortion**



Fig. 4.9 Wideband PA's nonlinearity behavior.

The main challenge of wideband DPD is that the PA nonlinearity is complicated and strongly memory dependent, as shown in Fig. 4.9. Though the full Volterra series can describe such behavior, including all nonlinearity orders and unlimited memory requires tremendous hardware resources in the DPD nonlinear equalizer, to the point that applying DPD may not be worthwhile once overall power efficiency is considered. Moreover, the computational complexity of coefficient adaptation is more than $O(n^2)$. An adequately simplified model is necessary but challenging to find.

Another challenge of wideband DPD is the ADC requirement. As a rule of thumb, the ADC sampling rate is about 3-5 times the data bandwidth. For CATV upstream/downstream systems, sampling rates of 600 MHz to 1 GHz and 3 GHz to 5 GHz are required, respectively. The ADC resolution also needs to be greater than 10 ENOB for a signal with 14 dB PAPR and 50 dBc ACPR at the transmitter output. With such requirements the ADC becomes power hungry and costly, consuming at least 500 mW and occupying 0.4 mm<sup>2</sup> in 40 nm CMOS [25].

### **References**

[1] D. Schreurs, M. O'Droma, A. A. Goacher, and M. Gadringer, RF Power Amplifier Behavioral Modeling. Cambridge, 2008.

[2] H. S. Black, "Inventing the negative feedback amplifier," IEEE Spectr., vol. 14, no. 12, pp. 55–60, Dec. 1977.

[3] A. Mohammadi and F. M. Ghannouchi, RF Transceiver Design for MIMO Wireless Communications. Springer Berlin Heidelberg, 2012.

[4] A. Katz, J. Wood and D. Chokola, "The Evolution of PA Linearization: From Classic Feedforward and Feedback Through Analog and Digital Predistortion," in *IEEE Microwave Magazine*, vol. 17, no. 2, pp. 32-40, Feb. 2016.

[5] J. Kim and K. Konstantinou, "Digital predistortion of wideband signals based on power amplifier model with memory," *Electron. Lett.*, vol. 37, no. 23, pp. 1417–1418, Nov. 2001.

[6] S. Afsardoost, T. Eriksson, and C. Fager, "Digital predistortion using a vector-switched model," *IEEE Trans. Microw. Theory Tech.*, vol. 60, no. 4, pp. 1166–1174, Apr. 2012.

[7] F. Mkadem et al., "Multi-Band Complexity-Reduced Generalized-Memory-Polynomial Power-Amplifier Digital Predistortion," *IEEE MTTs*, vol. 64, no. 6, pp. 1763-1774, Jun 2016.

[8] A. Zhu et al., "Dynamic Deviation Reduction-based Volterra Behavioral Modeling of RF Power Amplifiers," *IEEE MTTs*, vol. 54, no. 12, pp. 4323-4332, Dec 2006.

[9] R. N. Braithwaite, "A comparison of indirect learning and closed loop estimators used in digital predistortion of power amplifiers," in *IEEE MTT-S Int. Microw. Symp. Dig.*, Phoenix, AZ, USA, May 2015, pp. 1–4.

[10] L. Ding, F. Mujica, and Z. Yang, "Digital predistortion using direct learning with reduced bandwidth feedback," in *IEEE MTT-S Int. Microw. Symp. Dig.*, Seattle, WA, June 2-7, 2013, pp. 1-3.

[11] C. Eun and E. J. Powers, "A new Volterra predistorter based on the indirect learning architecture," *IEEE Trans. Signal Process.*, vol. 45, no. 1, pp. 223–227, Jan. 1997.

[12] L. Ding et al., "Memory polynomial predistorter based on the indirect learning architecture," *in Proc. IEEE Global Telecommunicat. Conf. GLOBECOM,* Taipei, Taiwan, Nov. 2002, vol. 1, pp. 967–971.

[13] A. A. Abidi, "General relations between IP2, IP3, and offsets in differential circuits and the effects of feedback," IEEE MTTs, vol. 51, no. 15, pp. 1610-1612, May 2003.

[14] F. Waldhauer, Feedback. New York: Wiley, 1982.

[15] A. M. Smith and J. K. Cavers, "A wideband architecture for adaptive feedforward linearization," in *Proc. IEEE Vehicular Technology Conf.*, vol. 3, May 1998, pp. 2488–2492.

[16] B. Murmann et al., "Digitally enhanced analog circuits: System aspects," *IEEE International Symposium on Circuits and Systems*, pp. 18-21, May 2008.

[17] D. R. Morgan, Z. Ma, J. Kim, M. G. Z. Zierdt, and J. Pastalan, "A generalized memory polynomial model for digital predistortion of RF power amplifiers," *IEEE Trans. Signal Process.*, vol. 54, no. 10, pp. 3852–3860, Oct. 2006.

[18] F. M. Ghannouchi and O. Hammi, "Behavioral modeling and predistortion," *IEEE Microw. Mag.*, vol. 10, no. 7, pp. 52–64, Dec 2009.

[19] L. Ding et al., "A Robust Digital Baseband Predistorter Constructed Using Memory Polynomials," *IEEE Trans. Commun.*, vol. 52, no. 1, pp. 159–165, Jan. 2004.

[20] V. J. Mathews and G. L. Sicuranza, Polynomial Signal Processing. New York: Wiley, 2000.

[21] T. Liu, S. Boumaiza, and F. M. Ghannouchi, "Pre-compensation for the dynamic nonlinearity of wideband wireless transmitters using augmented Wiener predistorters," in *Proc. Asia–Pacific Microw. Conf.*, Suzhou, China, Dec. 2005, vol. 5, pp. 4–7.

[22] S. A. Bassam, M. Helaoui, and F. M. Ghannouchi, "2-D digital Predistortion (2-D-DPD) architecture for concurrent dual-band transmitters," *IEEE Trans. Microw. Theory Techn.*, vol. 59, no. 10, pp. 2547–2553, Oct. 2011.

[23] C. Quindroit, N. Naraharisetti, P. Roblin, S. Gheitanchi, V. Mauer, and M. Fitton, "2D forward twin nonlinear two-box model for concurrent dual-band digital predistortion," in *Proc. IEEE Radio Wireless Symp.*, Jan. 2014, pp. 1–3.

[24] H. Xiang, C. Yu, J. Gao, S. Li, Y. Wu, M. Su, and Y. Liu, "Dynamic deviation reduction-based concurrent dual-band digital predistortion," *Int. J. RF Microw. Comput.-Aided Eng.*, vol. 24, pp. 401–411, Aug. 2013.

[25] C.-Y. Chen et al., "A 12-Bit 3GS/s Pipeline ADC with 0.4 mm<sup>2</sup> and 500mW in 40nm Digital CMOS," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 4, pp. 1013–1021, Apr. 2012.

## **CHAPTER 5 Training of Digital Predistortion Based on Signal-to-Distortion-Ratio Measurements**

## **Abstract**

A novel training technique for digital pre-distortion (DPD) systems is presented. Previous techniques rely on the time-domain (TD) difference between the ideal transmitter input and the distorted amplifier output as an error metric. This work presents DPD training based solely on frequency-domain signal-to-distortion-ratio (SDR) information. The only *a priori* information required is the spectral occupancy of the desired signal. The approach is applied to a broadband SiGe BiCMOS power amplifier, including measured results.

## **5.1 Background: Limitations of Existing Digital Predistortion Algorithms and Motivation**



Fig. 5.1 Conventional time-domain-feedback DPD training

Digital pre-distortion is increasingly being used to improve the fidelity of RF transmitters. Improved fidelity relative to legacy approaches is often required to accurately transmit modern communication signals with complex constellations and multi-carrier signaling. Most existing DPD approaches [1]-[5] require time-domain training of a nonlinear equalizer, as shown in Fig. 5.1. Here, the amplifier output is digitized, either directly with an ADC or with a heterodyne receiver followed by an ADC.

This feedback receiver, however it is implemented, will typically add significant complexity to the transmitter subsystem. Furthermore, depending on the duty cycle of adaptation, it may add significant power dissipation. Many existing algorithms also require substantial memory and computational resources. A DPD approach which does not require high-rate time-domain digital feedback from the transmitter output is therefore desirable.

In this chapter, a DPD training algorithm based solely on coarse frequency-domain Signal-to-Distortion Ratio (SDR) information is described. It has been applied to a broadband RF BiCMOS power amplifier (PA), linearized using a memory polynomial equalizer. Measurements show robust convergence and significant distortion reduction for representative signal scenarios. It is a *blind* algorithm because it does not rely on foreknowledge of the input signal.

## **5.2 Outline of the Proposed Approach**

#### *A. Band-SDR-Based Training*

Conceptually, we propose training the memory polynomial coefficients using only *band SDR*, as illustrated in Fig. 5.2. Band SDR is defined as the ratio of total desired signal power to the distortion in a specific frequency band. Several bands may be observed and used for adaptation, capturing spectral regions impaired by harmonic and/or intermodulation distortion of several orders. For example, consider a transmitter impaired solely by memoryless third-order intermodulation distortion. Conceptually, we could observe the third-order adjacent sidebands and adjust the coefficient of a compensating third-order term until the sideband levels improve optimally. Additional sideband regions could be observed to train additional compensating nonlinear coefficients.



Fig. 5.2 Proposed band-SDR-feedback DPD training
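To make the band-SDR definition concrete, the sketch below computes it from a sampled waveform using an FFT. It assumes a two-tone stimulus and a hypothetical memoryless cubic nonlinearity, with tone frequencies placed exactly on FFT bins so that no window is needed. In the measurements the band powers are obtained with a spectrum analyzer rather than from samples.

```python
import numpy as np

def band_power(x, fs, f_lo, f_hi):
    """Power of the real signal x in [f_lo, f_hi] Hz, from a one-sided periodogram."""
    spectrum = np.fft.rfft(x) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return 2.0 * np.sum(np.abs(spectrum[mask]) ** 2)

def band_sdr_db(x, fs, signal_band, distortion_band):
    """Ratio of total signal power to the distortion power in one band, in dB."""
    p_sig = band_power(x, fs, *signal_band)
    p_dist = band_power(x, fs, *distortion_band)
    return 10.0 * np.log10(p_sig / p_dist)

# Two tones through a hypothetical cubic nonlinearity; bin spacing is 1 MHz.
fs, n = 1000e6, 1000
t = np.arange(n) / fs
x = np.cos(2 * np.pi * 100e6 * t) + np.cos(2 * np.pi * 110e6 * t)
y = x - 0.01 * x ** 3

# The third-order intermodulation products fall at 90 MHz and 120 MHz.
sdr = band_sdr_db(y, fs, signal_band=(95e6, 115e6), distortion_band=(85e6, 92e6))
print(f"band SDR = {sdr:.1f} dB")
```

Only the band edges are needed to form the ratio, which is why the spectral occupancy of the signal is the only prior information required.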

Training with noise-like signals, as reported below, was found to be more difficult than training with multiple tones. This is because tonal inputs produce tonal distortion products which generally do not overlap, and are more easily measured than noise-like distortion products which do overlap. We emphasize that noise-like stimuli are more realistic models of typical communication signals today. In practice, we anticipate continuous foreground training with live signals.

This approach greatly simplifies the feedback receiver. It can be based, for example, on a swept heterodyne power detector, i.e., a spectrum analyzer. This spectrum analyzer needs only dynamic range comparable to the desired transmitter fidelity. There is no need for synchronized real-time sample-by-sample comparison of the digitized Tx output and the Tx input data. Offline comparison is possible, provided the transmitter and signal characteristics do not change too rapidly. *Post-distortion* of nonlinear receivers is also possible, since the only *a priori* information required is spectral occupancy. A similar approach is used in [3], but with a memoryless analog-domain predistorter.

Real-world broadband amplifiers often have nonlinear memory effects which cannot be ignored. Therefore, an ideal approach will also be able to train the coefficients of a suitable nonlinear equalizer, for example, one based on memory polynomials [5][6].

#### *B. Nonlinear Model*

Volterra series are a familiar starting point for nonlinear modeling and equalization. Eq. 1 gives the general form of a $P^{\text{th}}$-order causal Volterra series, with unlimited memory.

$$
\tilde{x}(n) = h_0 + \sum_{p=1}^{P} \sum_{i_1=0}^{\infty} \cdots \sum_{i_p=0}^{\infty} h_p(i_p, \cdots, i_1) \prod_{j=1}^{p} x(n - i_j) \tag{1}
$$

Note that this series contains every possible product of present and past samples of *x*, up to $P^{\text{th}}$ order. The kernels $h_p$ weight each term in *x*. Of course, this series has an infinite number of terms and must be pruned for practical use. One aspect which may be pruned is the memory depth. Then the summations would only extend to some number *M* of past samples. *M* may become $M(p)$, so that the memory depth varies with respect to the nonlinear order $p$. Furthermore, *p* may be restricted to a subset of the integers from one to *P*. In other words, some of the nonlinear orders less than *P* are discarded. This pruned model corresponds to Eq. 2. The DC term $h_0$ has also been discarded.

$$
\tilde{x}(n) = \sum_{p \in P} \sum_{i_1=0}^{M(p)} \cdots \sum_{i_p=0}^{M(p)} h_p(i_p, \cdots, i_1) \prod_{j=1}^p x(n - i_j) \tag{2}
$$

Note that the indexes $i_j$ refer to time delays. Each kernel $h_p$ is a function of $p$ different delays, none of which may exceed $M(p)$. Reference [4] proposes a pruned Volterra series where the kernels $h_p$ are additionally restricted. The first $r(p)$ indexes of $h_p$ are allowed to range to the memory depth $M(p)$, but the additional $p-r(p)$ delays are constrained to be zero (current sample). This pruned model is termed the Dynamic-Deviation Reduction (DDR) Volterra model, and the general form is given in Eq. 3.

$$
\tilde{x}(n) = \sum_{p \in P} x^{p-r}(n) \sum_{i_1=1}^{M(p)} \cdots \sum_{i_r=1}^{M(p)} h_{p,r}(0, \cdots, 0, i_r, \cdots, i_1) \prod_{j=1}^r x(n-i_j) \tag{3}
$$

Fig. 5.3 DPD model with pre- and post-filters

To help minimize the model order, linear filter blocks are prepended and postpended to the nonlinear model above, as shown in Fig. 5.3. Using established terminology, these are respectively the Wiener and Hammerstein model blocks [5]. The postfilter is the approximate inverse of the amplifier input frequency response (for example, due to $R_{source}$ and input capacitance) and the prefilter is the approximate inverse of the amplifier output response (for example, due to output capacitance and $R_{load}$). Good Wiener / Hammerstein transfer functions may be estimated from simulations using known circuit parameters, or measured, for example using a network analyzer. It is explicitly expected that the exact choice of linear transfer functions will not be critical, since any discrepancy can be made up in the nonlinear series. The intention is only a best *ad hoc* effort to reduce model complexity. From [4], the total number of coefficients in the DDR model is

$$
n_{coeff} = \sum_{p \in P} C(M(p) + r(p), r(p)). \tag{4}
$$
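Eq. 4 is straightforward to evaluate. The short sketch below applies it to the $p \in P$, $M(p)$ and $r(p)$ sets listed in Table 5.1; the function name is illustrative.

```python
from math import comb

def ddr_coefficient_count(orders, memory_depth, dynamic_order):
    """Eq. 4: sum over the retained orders p of C(M(p) + r(p), r(p))."""
    return sum(comb(M + r, r)
               for _, M, r in zip(orders, memory_depth, dynamic_order))

orders = (2, 3, 5, 7)
M = (4, 3, 3, 3)
print(ddr_coefficient_count(orders, M, (1, 3, 3, 2)))   # 55
print(ddr_coefficient_count(orders, M, (2, 3, 2, 1)))   # 49
```

These counts agree with the "# Terms" row of Table 5.1.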

## **5.3 Band-SDR-Based Training**

The emphasis of this work is band-SDR-based training, as described earlier. The model above is only exemplary, and other models may be compatible with band-SDR training.



Fig. 5.4 Flow chart of proposed SDR-based DPD training

A flowchart of the training procedure is given in Fig. 5.4. The model coefficients are estimated sequentially, beginning with the terms expected to produce the strongest distortion. Recall that the signal spectral occupancy is known *a priori* (if it is not known, it can be measured). Therefore, the spectral occupancy of the distortion terms at each order is known. Spectral bands are chosen which roughly correspond to the occupancy of the signal and key distortion terms. Perfect segregation of distortion bands is not generally possible, as there may be overlap.

The flowchart proceeds as follows. First, the signal power is measured within its known spectral limits. Then the distortion power in the band associated with the first coefficient is measured. The ratio of the signal power to the first band distortion power is calculated. Particular band ratios are denoted $SDR_k$. For each distortion term having an unknown coefficient in the DDR model, we define the *signal-to-term ratio* ($STR_k$) as the precalculated ratio of the signal power to the power of the term when the coefficient is one. $\sqrt{STR_k/SDR_k}$ bounds the size of the nonlinear coefficients. The relevant kernel coefficients are then optimized within this bound using the Nelder-Mead search algorithm, as implemented in MATLAB. The signal and distortion power measurements are made using a Keysight MXA RF spectrum analyzer. The nonlinear DDR Volterra predistortion is applied to the wideband test signal, which drives the PA from a Keysight M8190A arbitrary waveform generator (AWG).

The Nelder-Mead (downhill simplex) algorithm can optimize functions of several variables. It evaluates the function at $n+1$ vertices of a simplex per iteration, where *n* is the number of independent variables. In this case, $n=1$, so only two evaluations are required per iteration. The SDR is intuitively expected to be smooth throughout its bounded range. However, if the SDR is dominated at any point by terms other than the one currently being optimized, the coefficient may not converge to a good value (convergence to *some* value is guaranteed by limiting the number of iterations per coefficient). Therefore, the order in which coefficients are optimized is important. Multiple passes may be necessary to uncover the less dominant distortion terms and obtain the best convergence.
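For a single coefficient, the simplex has only two vertices and the search reduces to reflecting, expanding, or contracting one point about the other. The sketch below is a simplified pure-Python stand-in for the MATLAB routine, not the implementation used in the measurements. The initial simplex, the clipping to the bound, and the quadratic cost function are all assumptions made for illustration.

```python
def nelder_mead_1d(f, lo, hi, max_iter=20):
    """Minimize f over [lo, hi] with a two-vertex (n = 1) Nelder-Mead simplex."""
    def clip(v):
        return min(max(v, lo), hi)

    a, b = lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)   # assumed initial simplex
    fa, fb = f(a), f(b)
    for _ in range(max_iter):
        if fb < fa:                        # keep (a, fa) as the best vertex
            a, b, fa, fb = b, a, fb, fa
        xr = clip(a + (a - b))             # reflect the worst vertex through the best
        fr = f(xr)
        if fr < fa:
            xe = clip(a + 2.0 * (a - b))   # expansion
            fe = f(xe)
            b, fb = (xe, fe) if fe < fr else (xr, fr)
        else:
            # contraction toward the best vertex, on the better side
            xc = a + 0.5 * ((xr if fr < fb else b) - a)
            fc = f(xc)
            if fc < min(fr, fb):
                b, fb = xc, fc
            else:
                b = a + 0.5 * (b - a)      # shrink
                fb = f(b)
    return (a, fa) if fa <= fb else (b, fb)

# Hypothetical cost: residual distortion power versus coefficient, minimum at 0.3.
best, cost = nelder_mead_1d(lambda a3: (a3 - 0.3) ** 2 + 0.01, -1.0, 1.0)
print(best, cost)
```

In the training loop, each call to the cost function would correspond to one band-power measurement, with the negative of the measured SDR as the quantity to minimize.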



Fig. 5.5 Spectrum observation without DPD

The steps for a memoryless $3^{\text{rd}}$-order distortion model will now be outlined in detail, as an example. Fig. 5.5 shows the transmitter output spectrum without DPD. Note the signal spectrum extending from 50-150 MHz, and the intermodulation and harmonic distortion. Now the DPD block generates a compensated signal $\tilde{x}$:

$$
\tilde{x} = x + a_3 x^3 \tag{5}
$$

The $x^3$ term is not generally orthogonal to the desired signal. Intuitively, this is because the best-fit straight line to a cubic curve has non-zero slope. The best fit depends on the statistical properties of the signal. The search for the optimum value of $\tilde{a}_3$ is facilitated by choosing an orthogonal $3^{\text{rd}}$-order correction term, obtained by subtracting from the correction term its projection onto the desired signal, as in Eq. 6-7. Here, $proj_a | b \stackrel{\text{def}}{=} \frac{\langle a, b \rangle}{\langle a, a \rangle} a$.

$$
\tilde{x} = x + \tilde{a}_3(x^3)_{orth} \tag{6}
$$

$$
(x^3)_{orth} \stackrel{\text{def}}{=} x^3 - proj_x | x^3 \tag{7}
$$
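The projection step of Eq. 7 can be checked numerically. The sketch below uses clipped Gaussian noise as a stand-in for the noise-like test signal; the clipping level and sequence length are arbitrary assumptions.

```python
import numpy as np

def project(a, b):
    """Projection of b onto a: (<a, b> / <a, a>) a."""
    return (np.vdot(a, b) / np.vdot(a, a)) * a

rng = np.random.default_rng(3)
x = np.clip(rng.standard_normal(4096), -3.0, 3.0)    # stand-in noise-like signal

x3 = x ** 3
x3_orth = x3 - project(x, x3)                        # Eq. 7

print("correlation of x^3 with x:       ", np.vdot(x, x3) / len(x))
print("correlation of (x^3)_orth with x:", np.vdot(x, x3_orth) / len(x))
```

The raw cubic term is strongly correlated with the signal, while the orthogonalized term is uncorrelated to within rounding error, so adjusting its coefficient leaves the linear gain unchanged.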

The signal-to-term ratio $STR_3$ and signal-to-distortion ratio $SDR_3$ are defined in Eq. 8-9.

$$
STR_3 \stackrel{\text{def}}{=} \frac{P_x}{\left. P_{(x^3)_{orth}} \right|_{\text{band 1 \& 2}}} \tag{8}
$$

$$
SDR_3 \stackrel{\text{def}}{=} \frac{P_{sig}}{P_{band1} + P_{band2}}\tag{9}
$$

We can now bound the range of the correction coefficient prior to searching.

$$
\tilde{a}_3 \in [-1,1]\sqrt{STR_3/SDR_3} \tag{10}
$$
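The bound in Eq. 10 follows from a power balance: if the measured band distortion were due entirely to this term, then $\tilde{a}_3^2$ times the unit-coefficient term power would equal the distortion power, giving $\tilde{a}_3^2 = STR_3/SDR_3$. A small numeric sketch, with hypothetical dB values:

```python
import math

def coefficient_bound(str_db, sdr_db):
    """Eq. 10: |a| <= sqrt(STR / SDR), with both ratios given in dB."""
    return math.sqrt(10.0 ** ((str_db - sdr_db) / 10.0))

# Hypothetical numbers: the unit-coefficient term sits 10 dB below the signal in
# the observed bands, and the measured band SDR is 40 dB.
bound = coefficient_bound(str_db=10.0, sdr_db=40.0)
search_interval = (-bound, bound)
print(search_interval)
```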

Successive terms are orthogonalized using this Gram-Schmidt approach prior to searching. The Volterra series is linear with respect to its coefficients. This, combined with the orthogonalization process, facilitates convergence of the algorithm by limiting interaction between terms. There is some interaction due to the cascade of the predistortion model with the actual nonlinearity of the PA, but this effect is not strong, for the same reason that high-order cross terms can often be neglected in cascaded Taylor series.



Fig. 5.6 Coefficient estimation for proposed SDR-based training

As shown in the flowchart, the overall procedure may be iterated until the distortion no longer improves. Successive passes may prune the model beyond an initial guess, based on the converged values of the coefficients. The operation of the algorithm in the simple $3^{\text{rd}}$-order case is illustrated in Fig. 5.6.

## **5.4 Band-SDR-Based Training**

This predistortion algorithm was applied to a broadband SiGe BiCMOS Class-AB power amplifier [7], driven with noise-like signals having a high peak-to-average-power ratio (PAPR). This PA features instantaneous supply switching (ISS), a supply modulation efficiency-enhancement technique suitable for wideband signals with complex modulation. Table 5.1 gives the complexity parameters of the pruned DDR model.



Fig. 5.7 DPD training results with proposed SDR-based training and broadband SiGe PA

Fig. 5.7 shows the measured spectrum of the PA output with two different test signals. In both cases, the input is bandlimited Gaussian noise, clipped to 14 dB PAPR. This input simulates contemporary complex signal formats, such as OFDM. The distortion floor is improved by 10-15 dB. This result was obtained after two passes of the optimization flowchart. The convergence time on the laboratory bench is limited by AWG access. The convergence time estimates in Table 5.1 assume that the time for each iteration step is limited by the settling time of the resolution-bandwidth filters in the spectrum analyzer block (proportional to 1/BW), given 20 Nelder-Mead iterations per coefficient. The processing power estimates are based on the analysis of digital equalizer power in [8] and surveys of 40nm SoC DSP performance.

| Freq (MHz)                      | 85-115 w/ 1 notch | 50-150 w/ 1 notch | 20-220 w/ 3 notches |
|---------------------------------|-------------------|-------------------|---------------------|
| **Sideband Rejection (dBc)**    | 48-50             | 45-47             | 40                  |
| # Terms                         | 55                | 49                | 49                  |
| $p \in P$                       | $\{2,3,5,7\}$     | $\{2,3,5,7\}$     | $\{2,3,5,7\}$       |
| $M(p)$                          | $\{4,3,3,3\}$     | $\{4,3,3,3\}$     | $\{4,3,3,3\}$       |
| $r(p)$                          | $\{1,3,3,2\}$     | $\{2,3,2,1\}$     | $\{2,3,2,1\}$       |
| **Convergence Time (ms, est.)** | 40                | 24                | 60                  |
| **DSP Power (mW, est.)**        | 45                | 48                | 96                  |

Table 5.1 DPD Training Results for Three Different Bandwidth Cases

The converged parameters were found to be stable over time (hours) for many spectrum scenarios. Some temperature variation was observed, and it is anticipated that periodic training updates will be necessary in practice.

## **5.5 Conclusions**

A novel training approach for digital predistorters has been presented, based solely on measurements of output signal-to-distortion ratio in coarse bands. This approach simplifies the feedback receiver design. It does not require synchronized sample-by-sample comparison of the transmitter input and output, also significantly reducing the complexity of the adaptation block. The algorithm was evaluated with a SiGe BiCMOS broadband PA, demonstrating substantial and robust fidelity improvements.

### **Chapter Acknowledgement**

The text of this chapter will be published as a regular paper in the IEEE Bipolar/BiCMOS Circuits and Technology Meeting, Oct. The dissertation author was the primary researcher. The authors would like to thank Myles Wakayama for his support, and Loke Tan and Lin He for invaluable advice.

### **References**

[1] A. Katz, J. Wood and D. Chokola, "The Evolution of PA Linearization: From Classic Feedforward and Feedback Through Analog and Digital Predistortion," in *IEEE Microwave Magazine*, vol. 17, no. 2, pp. 32-40, Feb. 2016.

[2] Y. W. Lee and M. Schetzen, "Measurement of the Wiener Kernels of a Non-linear System by Cross-correlation," International Journal of Control, 2:3, pp.237-254, 1965.

[3] S. P. Stapleton and F. C. Costescu, "An adaptive predistorter for a power amplifier based on adjacent channel emissions [mobile communications]," in *IEEE Transactions on Vehicular Technology*, vol. 41, no. 1, pp. 49-56, Feb 1992.

[4] A. Zhu *et al.*, "Dynamic Deviation Reduction-based Volterra Behavioral Modeling of RF Power Amplifiers," *IEEE MTTs*, vol. 54, no. 12, pp. 4323-4332, Dec 2006.


[5] L. Ding *et al.*, "A Robust Digital Baseband Predistorter Constructed Using Memory Polynomials," *IEEE Trans. Commun.,* vol. 52, no. 1, pp. 159–165, Jan. 2004.

[6] F. Mkadem *et al.*, "Multi-Band Complexity-Reduced Generalized-Memory-Polynomial Power-Amplifier Digital Predistortion," *IEEE MTTs*, vol. 64, no. 6, pp. 1763-1774, Jun 2016.

[7] Jeffrey Lee *et al.*, "A 10-to-650MHz 1.35W Class-AB Power Amplifier with Instantaneous Supply Switching Efficiency Enhancement," *IEEE CICC*, 2017, to appear.

[8] R. Gomez, "Theoretical Comparison of Direct-Sampling Versus Heterodyne RF Receivers," *IEEE TCAS I*, vol. 63, no. 8, pp. 1276-1282, Aug. 2016.