# UC Santa Barbara

**UC Santa Barbara Electronic Theses and Dissertations** 

## Title

Integrated Electronics for Energy-efficient Coherent Optical Communication

## Permalink

https://escholarship.org/uc/item/94f5q7mp

## Author

Movaghar, Ghazal

## Publication Date

2024

Peer reviewed|Thesis/dissertation

University of California Santa Barbara

# Integrated Electronics for Energy-efficient Coherent Optical Communication

A dissertation submitted in partial satisfaction of the requirements for the degree

Doctor of Philosophy

in Electrical and Computer Engineering

by

### Ghazal Movaghar

Committee in charge:

Professor James F. Buckwalter, Chair

Professor Clint L. Schow

Professor Mark J. W. Rodwell

Professor Loai G. Salem

March 2024

The Dissertation of Ghazal Movaghar is approved.

Professor Mark J. W. Rodwell

Professor Loai G. Salem

Professor Clint L. Schow

Professor James F. Buckwalter, Committee Chair

### Integrated Electronics for Energy-efficient Coherent Optical Communication

Copyright © 2024

by

Ghazal Movaghar

### Acknowledgements

I would like to begin by thanking my advisor, Professor Jim Buckwalter, who welcomed me to pursue my MS and PhD with his research team at the RF and Mixed-Signal lab. The opportunity to work with him has been a real privilege, and I have learned a great deal from him over the past five years. He patiently provided guidance through every challenge and was always understanding and supportive.

I would also like to thank Professor Clint Schow, with whom I collaborated closely on my projects. He welcomed me to work in his lab, fully supported me with the experiments, and always provided valuable insight and feedback for every challenge faced in the lab.

I am also very grateful to Professors Mark Rodwell and Loai Salem for being on my committee and for their valuable feedback and advice.

I would like to thank the other PhD students and postdocs in the ECE department whom I had the chance to work with and learn from, especially Viviana Arrunategui, Aaron Maharry, Junqian Liu, Hector Andrade, Luis Valenzuela, Evan Chansky, Navid Hosseinzadeh, Stephen Misak, and Xinhong Du.

I would also like to acknowledge Alethea Butler-Nalin at UCSB for helping with wirebond assemblies.

I would like to thank GlobalFoundries for providing silicon fabrication through the 45SPCLO university program. I would also like to thank Ted Letavic, Rod Augur, Ken Giewont, Takako Hirokawa, and Kevin Dezfulian at GlobalFoundries for technical discussions.

Finally, I want to thank my parents, Maryam and Asaad, and my brother, Ashkan, who always supported me and helped me follow my dreams. I also want to thank my little fur buddy, Ginger, who never failed to bring a smile to my face even during the hardest PhD days.

### Curriculum Vitæ Ghazal Movaghar

### Education

| Year | Degree |
|------|--------|
| 2023 | Ph.D. in Electrical and Computer Engineering (Expected), University of California, Santa Barbara. |
| 2020 | M.S. in Electrical and Computer Engineering, University of California, Santa Barbara. |
| 2018 | B.S. in Electrical Engineering, Sharif University of Technology, Tehran, Iran. |

### Publications

1. Ghazal Movaghar, Viviana Arrunategui, Junqian Liu, Aaron Maharry, Stephen Misak, Xinhong Du, Clint L Schow, James F Buckwalter, "A Monolithic O-band Coherent Optical Receiver for Energy-efficient Links," in IEEE Journal of Solid-State Circuits (JSSC), doi: 10.1109/JSSC.2023.339494.
2. G. Movaghar, Viviana Arrunategui, Junqian Liu, Aaron Maharry, Clint L Schow, James F Buckwalter, "A 112-Gbps, 0.73-pJ/bit Fully-Integrated O-band I-Q Optical Receiver in a 45-nm CMOS SOI-Photonic Process," 2023 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), San Diego, CA, USA, 2023, pp. 5-8, doi: 10.1109/RFIC54547.2023.10186202.
3. G. Movaghar, Viviana Arrunategui, Aaron Maharry, Evan Chansky, Junqian Liu, Hector Andrade, Clint L Schow, James F Buckwalter, "First Monolithically-Integrated Silicon CMOS Coherent Optical Receiver," 2023 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA, 2023, pp. 1-3, doi: 10.1364/OFC.2023.Th2A.2.
4. G. Movaghar, Viviana Arrunategui, Evan Chansky, Aaron Maharry, Clint L Schow, James F Buckwalter, "A 40-Gbps, 900-fJ/bit Dual-Channel Receiver in a 45-nm Monolithic RF/Photonic Integrated Circuit Process," in IEEE Solid-State Circuits Letters, vol. 5, pp. 313-316, 2022, doi: 10.1109/LSSC.2022.3232340.
5. G. Movaghar, Luis A Valenzuela, Junqian Liu, Aaron Maharry, Clint Schow, James F Buckwalter, "An 88-Gbps, 3.3-pJ/bit I/Q Receiver with Current-Mode Phase-frequency Detection in a 130-nm SiGe HBT Technology," in IEEE Solid-State Circuits Letters, vol. 5, pp. 308-311, 2022, doi: 10.1109/LSSC.2022.3231238.
6. G. Movaghar, Junqian Liu, James Dalton, Luis A Valenzuela, Clint L Schow, James F Buckwalter, "Improved Signal Integrity at 64 Gbps in a 130-nm SiGe Optical Receiver With Through-Silicon Vias," 2022 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS), Phoenix, AZ, USA, 2022, pp. 132-135, doi: 10.1109/BCICTS53451.2022.10051734.
7. Luis A Valenzuela, Ghazal Movaghar, James Dalton, Navid Hosseinzadeh, Hector Andrade, Aaron Maharry, Clint L Schow, James F Buckwalter, "An Energy-Efficient, 60-Gbps Variable Transimpedance Optical Receiver in a 90-nm SiGe HBT Technology," 2022 IEEE/MTT-S International Microwave Symposium - IMS 2022, Denver, CO, USA, 2022, pp. 279-282, doi: 10.1109/IMS37962.2022.9865274.
8. T. Hirokawa, S. Pinna, N. Hosseinzadeh, A. Maharry, H. Andrade, J. Liu, T. Meissner, S. Misak, G. Movaghar, L. Valenzuela, Y. Xia, S. Bhat, F. Gambini, J. Klamkin, A. A. M. Saleh, L. Coldren, J.F. Buckwalter, C. L. Schow, "Analog Coherent Detection for Energy Efficient Intra-Data Center Links at 200 Gbps Per Wavelength," in Journal of Lightwave Technology, vol. 39, no. 2, pp. 520-531, 15 Jan. 2021, doi: 10.1109/JLT.2020.3029788.
9. Aaron Maharry, Hector Andrade, Stephen Misak, Junqian Liu, Yujie Xia, Aaron Wissing, Ghazal Movaghar, Viviana Arrunategui-Norvick, Evan D. Chansky, Xinhong Du, Adel A. M. Saleh, James F. Buckwalter, Larry Coldren, and Clint L. Schow, "Integrated SOAs enable energy-efficient intra-data center coherent links," Opt. Express 31, 17480-17493 (2023).
10. Hector Andrade, Junqian Liu, Takako Hirokawa, Aaron Maharry, Ghazal Movaghar, Luis Valenzuela, Clint Schow, James Buckwalter, "Optical Transmitter Equalization With Tunable Mismatched Terminations in a Silicon Modulator," in IEEE Photonics Technology Letters, vol. 34, no. 15, pp. 775-778, 1 Aug. 2022, doi: 10.1109/LPT.2022.3186237.
11. Luis A Valenzuela, James Dalton, Aaron Maharry, Ghazal Movaghar, Hector Andrade, Clint L Schow, James F Buckwalter, "An 80-Gbps Distributed Driver with Two-Tap Feedforward Equalization in 45-nm CMOS SOI," 2022 IEEE 22nd Topical Meeting on Silicon Monolithic Integrated Circuits in RF Systems (SiRF), Las Vegas, NV, USA, 2022, pp. 49-51, doi: 10.1109/SiRF53094.2022.9720041.

#### Abstract

#### Integrated Electronics for Energy-efficient Coherent Optical Communication

by

#### Ghazal Movaghar

Data center traffic continues to grow considerably due to the vast amount of data generated by cloud computing, augmented reality, and the internet of things. Intra-data center traffic accounts for up to 77% of the total traffic, so improvements in the spectral efficiency, bandwidth, and power consumption of data center interconnects contribute to overall energy efficiency. Intra-data center interconnects aim for data rates above 200 Gbps per wavelength while reducing power consumption. Coherent links leveraging orthogonal polarizations and quadrature modulation schemes are an energy-efficient alternative to the commonly used intensity modulation direct detection (IMDD). A component of this vision is the realization of low-power, broadband optical receivers for quadrature phase shift keying (QPSK) or higher-order coherent waveforms. Improvements in energy efficiency through increased data rate and reduced power consumption are also significantly affected by electronic-photonic integration. Co-packaged optics have been proposed as one approach to fulfill this demand by minimizing the high-speed I/O power consumption. Nevertheless, parasitic resistance, inductance, and capacitance between electronic and photonic circuits deteriorate the high-speed performance and require power-hungry equalization, thereby eliminating improvements in energy efficiency. Consequently, packaging approaches that enable either monolithic or 3D integration of heterogeneous ICs, i.e., silicon photonic and electronic ICs, are promising ways to improve performance. The focus of this work is to develop energy-efficient optical fiber communication links by studying the system architecture trade-offs as well as the integrated opto-electrical circuit design for the link implementation. Performance degradation due to packaging effects is also studied and quantified. Several fiber optic communication links have been designed and measured.
The first monolithically integrated CMOS-photonic coherent optical receiver was implemented and achieved 80 Gbps with an energy efficiency of 1.2 pJ/bit. The O-band receiver was then redesigned to further improve performance, achieving above 100 Gbps and a record energy efficiency below 1 pJ/bit. These results show that O-band coherent optical links supporting 200 Gbps per wavelength at below 10 pJ/bit can be implemented for next-generation intra-data center applications.

# Contents

| Section | Title | Page |
|---------|-------|------|
| | Curriculum Vitae | v |
| | Abstract | vii |
| | List of Figures | xi |
| | List of Tables | xvii |
| 1 | Introduction | 1 |
| 1.1 | | 2 |
| 1.2 | Co-Packaged Optics | 5 |
| 1.3 | Dissertation Preview | 7 |
| 2 | Analog Coherent Detection | 9 |
| 2.1 | Introduction | 9 |
| 2.2 | Optical Phase Lock Loop (OPLL) Homodyne Detection | 11 |
| 2.3 | Electrical Phase Lock Loop (EPLL) Heterodyne Detection | 27 |
| 2.4 | Self-Homodyne Detection | 32 |
| 2.5 | Conclusion | 36 |
| 3 | Coherent Detection Power Optimization | 37 |
| 3.1 | Introduction | 37 |
| 3.2 | Transimpedance Amplifier (TIA) Architecture | 39 |
| 3.3 | Laser Power Requirements | 49 |
| 3.4 | Receiver Power Requirement | 50 |
| 3.5 | Transmitter Drive Power Requirement | 56 |
| 3.6 | | 58 |
| 3.7 | Acknowledgment | 58 |
| 4 | Electronic and Photonic Packaging and Integration | 59 |
| 4.1 | Introduction | 59 |
| 4.2 | Through Silicon Vias | 61 |
| 4.3 | Optical Receiver Circuit | 62 |
| 4.4 | Experimental Results | 65 |
| 4.5 | Conclusion | 67 |
| 4.6 | Acknowledgment | 67 |
| 5 | A Costas PFD Implementation | 70 |
| 5.1 | Introduction | 70 |
| 5.2 | Current-mode QPSK Receiver Design | 72 |
| 5.3 | Experimental results | 75 |
| 5.4 | Conclusion | 79 |
| 5.5 | Acknowledgment | 79 |
| 6 | Coherent Receiver Circuit Design | 81 |
| 6.1 | Introduction | 81 |
| 6.2 | 80-Gbps, 1.2-pJ/bit Fully-Integrated O-band I-Q Optical Receiver | 81 |
| 6.3 | Receiver Measurement Results | 87 |
| 6.4 | Conclusion | 91 |
| 6.5 | Acknowledgment | 92 |
| 7 | Coherent Receiver Circuit Design: Optimization and Co-simulation | 94 |
| 7.1 | Introduction | 94 |
| 7.2 | 112-Gbps, 0.73-pJ/bit Fully-Integrated O-band I-Q Optical Receiver | 95 |
| 7.3 | Coherent Transmitter | 102 |
| 7.4 | Receiver Measurement Results | 103 |
| 7.5 | Conclusion | 113 |
| 7.6 | Acknowledgment | 114 |
| 8 | Summary and Outlook | 116 |
| 8.1 | Future Work | 117 |
| | Bibliography | 119 |

# List of Figures

| Figure | Caption | Page |
|--------|---------|------|
| 1.1 | Global data center IP traffic showing threefold increase in the last 5 years [1, 2]. | 1 |
| 1.2 | Optical link architecture using (a) 4-lane IMDD (b) dual polarization coherent link. | 3 |
| 1.3 | Google DCI optical modules evolution for short range applications [3]. | 3 |
| 1.4 | Power breakdown of 400-G coherent transceiver [12]. | 4 |
| 1.5 | Evolution of electronic/photonic interconnects published by Broadcom [16]. | 6 |
| 1.6 | GlobalFoundries 45-nm CMOS SOI photonic technology [23]. | 7 |
| 2.1 | Coherent optical data link with OPLL carrier recovery circuitry. | 12 |
| 2.2 | Calculated PFD response normalized to $V_{PFD}$ as a function of phase error. | 14 |
| 2.3 | PLL mathematical model including a PFD, low pass filter (LPF), a loop filter (LF) and voltage-controlled oscillator (VCO). | 16 |
| 2.4 | Phase margin as a function of time delay and unity gain bandwidth for (a) $f_{zero} = 10$ MHz (b) $f_{zero} = 100$ MHz. | 17 |
| 2.5 | Normalized step response. | 19 |
| 2.6 | Pull-in frequency as a function of loop delay and PFD bandwidth for $\omega_{zero} = 10$ MHz. | 23 |
| 2.7 | Pull-in frequency as a function of loop delay and PFD bandwidth for $\omega_{zero} = 10$ MHz. | 23 |
| 2.8 | Calculated PFD response normalized to $V_{PFD}$ as a function of phase error. | 25 |
| 2.9 | Lock-in frequency as a function of $P_{RX}$ and $K_{LF}$. | 26 |
| 2.10 | Coherent optical data link with EPLL carrier recovery circuitry. | 27 |
| 2.11 | Heterodyne wireless data link with PLL carrier recovery circuitry. | 28 |
| 2.12 | EPLL front-end circuit. | 29 |
| 2.13 | QVCO circuit schematic. | 30 |
| 2.14 | Simulated oscillation frequency as a function of control voltage. | 30 |
| 2.15 | Lock-in frequency for the EPLL as a function of $P_{RX}$ and $K_{LF}$. | 32 |
| 2.16 | Self-homodyne optical link a) using a separate fiber to forward the LO laser and ODL for phase locking b) using an orthogonal polarization to forward LO laser. | 33 |
| 2.17 | A linear model for the DLL. The PFD and LPF blocks model the Costas PFD. The LF is modeled as an ideal integrator. The voltage controlled delay line (VCDL) adjusts the output phase based on the input voltage. | 34 |
| 2.18 | Lock time as a function of $P_{RX}$ and $K_{LF}$. | 35 |
| 3.1 | Single polarization coherent optical data link with | 38 |
| 3.2 | Common gate TIA schematic (a) ideal dc bias (b) circuit implementation for the dc bias. | 40 |
| 3.3 | Noise model schematic including thermal noise of $R_D$, and channel noise of $M_1, M_c$. | 43 |
| 3.4 | (a) Gain boosted common gate TIA (b) circuit implementation for the gain boosting amplifier. | 45 |
| 3.5 | (a) Generic shunt feedback TIA (b) noise sources for the TIA. | 47 |
| 3.6 | Optical power loss as a function of modulated voltage normalized to the MZM's $V_{\pi}$. | 52 |
| 3.7 | Shunt feedback TIA block diagram with an inverter cell for the core amplifier. | 53 |
| 3.8 | $f_T$ and intrinsic gain $A_0 = G_m \cdot R_{DS} = 4.8$ for an inverter cell in 45CLO technology. | 54 |
| 3.9 | (a) IRNC (b) total power consumption (c) EE as a function of DR and $f_T$. Cross lines show the maximum expected data rate for a given $f_T$. | 55 |
| 3.10 | Required driver power and optimized receiver and laser power found in (3.34). | 57 |
| 4.1 | Two packaging approaches where chip ground is connected to PCB ground either through wirebonds (right) or TSVs (left). | 60 |
| 4.2 | Ground TSV cross-section and equivalent circuit model. M1 connection to the backside metallization through silicon can be modeled as an inductance in series with a resistance depending on the number of vias in the array. | 61 |
| 4.3 | Differential optical receiver consisting of a TIA, VGA, and 50-$\Omega$ output buffer fabricated in 130 nm BiCMOS process. | 62 |
| 4.4 | … ground wirebond inductance from chip pads to PCB for a) 570 $\mu$m CPW transmission line driving the circuit, b) TIA, c) VGA, and d) OB. | 63 |
| 4.5 | Photograph of the two assemblies under test for investigating packaging effects. | 65 |
| 4.6 | … the non-TSV assembly and 36.6 GHz for TSV assembly. | 66 |
| 4.7 | Noise histograms for TSV and non-TSV assemblies. The TSV assembly shows a slightly higher output voltage noise due to its larger bandwidth. | 66 |
| 4.8 | Measured eye diagrams at 50, 56, and 64 Gb/s with 6.5 mV inputs. TSV assembly shows a larger eye opening resulting in lower BER at high data rates. | 68 |
| 4.9 | Bathtub curves at 50, 56, and 64 Gb/s for TSV and non-TSV assemblies for 6.5 mV input voltage. TSV assembly shows almost 100 times improvement in BER as a result of higher bandwidth. | 69 |
| 4.10 | Sensitivity curves at 50, 56, and 64 Gb/s for TSV and non-TSV assemblies. | 69 |
| 5.1  | a) Conventional shunt-feedback TIA, b) shunt-feedback TIA with emitter follower in the feedback and c) proposed common-base (CB)/common-emitter (CE) front-end for hybrid TIA/mixer interface. | 71 |
| 5.2  | Schematic of the current-mode receiver combining the TIA and PFD mixer interface with an input common-base TIA embedded within a Gilbert amplifier (LA) and output buffer with CTLE | 73 |
| 5.3  | Midband TI gain and input impedance as a function of $V_{E,CE}$. | 74 |
| 5.4  | Frequency response of the receiver for 700 pH and 400 pH input wirebond inductance. Also assuming 50 fF photo-detector capacitance, and 200 pH output wirebond inductance | 75 |
| 5.5  | Receiver assembly on a FR-4 PCB and chip micrograph | 76 |
| 5.6  | PFD output at (a) 100 MHz and for two $V_{E,CE,I/Q}$ settings to show offset and imbalance compensation capability of this node | 76 |

| 5.7                                       | Eye diagrams for I and Q channels at 40 and 44 Gbps data rates. Measurements indicate 45 mV eye opening at 40 Gbps and 30 mV at 44 Gbps.                                                                        |          |
|-------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
| $5.8 \\ 5.9$                              | The time scale is 5 ps per division and the voltage scale is 28 mV per division BER bathtub curves for I(solid line) and Q(dashed line) Noise histogram, measured output noise voltage and IRNC variation for I | 77<br>78 |
| 5.10                                      | and Q channels as a function of $V_{E,CE}$ compared with simulated IRNC<br>Power consumption breakdown for the modified and conventional Costas<br>loop design presented in [74]                                | 78<br>79 |
| <i>C</i> 1                                |                                                                                                                                                                                                                 | 00       |
| $6.1 \\ 6.2$                              | Power splitting ratio inside directional couplers and PD responsivity $R_{PD}$                                                                                                                                  | 82       |
| 6.3                                       | Simulated 56 GBd QPSK constellations at the output of hybrid with the                                                                                                                                           | 00       |
|                                           | phase tuner biased at (a) 17mW (b) 13.6mW with normalized amplitudes<br>(a)                                                                                                                                     | 83<br>83 |
|                                           | (b)                                                                                                                                                                                                             | 83       |
| 6.4                                       | Half-circuit schematic of pseudodifferential receiver implemented in 45-nm<br>BE/photonic integrated circuit process with a transimpedance amplifier                                                            |          |
|                                           | limiting amplifier and output buffer.                                                                                                                                                                           | 84       |
| 6.5                                       | Simulated transimpedance and input-referred noise current (IRNC) to<br>show the effect of feedforward capacitor. Simulated IRNC for electrical                                                                  | 01       |
|                                           | characterization indicates that the total integrated input noise current over<br>twice the simulated bandwidth of 23 GHz is 10 $\mu$ A when the wirebonds                                                       |          |
|                                           | are not present                                                                                                                                                                                                 | 86       |
| $\begin{array}{c} 6.6 \\ 6.7 \end{array}$ | Receiver assembly on a PCB and chip micrograph                                                                                                                                                                  | 88       |
|                                           | pared with measurement results. The receiver presented here has a 13-                                                                                                                                           | 00       |
| 68                                        | GHz bandwidth penalty due to the wirebond inductance                                                                                                                                                            | 88       |
| 0.8                                       | time scale is 6 ps per division and the voltage scale is 21 mV per division.                                                                                                                                    | 89       |
|                                           | (a)                                                                                                                                                                                                             | 89       |
|                                           | (b)                                                                                                                                                                                                             | 89       |
| 6.9                                       | Noise histogram measurement for a) I channel indicating 0.9727mV rms                                                                                                                                            |          |
| C 10                                      | output noise and b) Qchannel indicating 1.081mV rms output noise                                                                                                                                                | 90       |
| 0.10                                      | Bathtub curves illustrating FEC-compatible operation up to 40 Gbps per                                                                                                                                          | 01       |
| 6.11                                      | Sensitivity curves as a function of input current                                                                                                                                                               | 91<br>92 |
| 6.12                                      | Self-homodyne link configuration and microscopic view of the chip.                                                                                                                                              | 92       |

| 6.13 | Measured constellations at 20, 32, and 40 GBd and sensitivity curves.<br>Equalized results use a post-processing script on raw data to apply a 7-<br>tap feed-forward equalizer (FFE) to compensate for bandwidth limitations<br>of TX and RX packaging losses. The equalization adds 6dB of peaking at<br>40GHz and reduces BER for a given signal power and data rate | 93           |
|------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|
| 7.1  | Optical receiver implemented in 45-nm RF/photonic integrated circuit process consisting of an optical hybrid, TIA, limiting amplifier (LA), $50\Omega$ output buffer (OB), and a Costas phase/frequency detector (PFD). Detailed design parameters can also be found in [78].                                                                                           | 95           |
| 7.2  | Simulation of the IRNC and BW for an inverter TIA as a function of device width assuming a photodetector capacitance of 50 fF                                                                                                                                                                                                                                           | 98           |
| 7.3  | EE of the design as a function of device width                                                                                                                                                                                                                                                                                                                          | 99           |
| 7.4  | Comparison of simulated transimpedance for the RX channel and mea-<br>surements based on an electrical test structure                                                                                                                                                                                                                                                   | 100          |
| 7.5  | Power spectral density of output noise voltage for PD operating in dark current and 0dBm optical power, translating to 0.93mA PD current                                                                                                                                                                                                                                | 101          |
| 7.6  | Simulated constellations on the CORX (a) 40GBd with EVM=-14.5dB (b) 56GBd with EVM=-10.9dB and normalized amplitudes                                                                                                                                                                                                                                                    | 101          |
|      | (a)                                                                                                                                                                                                                                                                                                                                                                     | 101          |
| 7.7  | <ul><li>(b)</li></ul>                                                                                                                                                                                                                                                                                                                                                   | 101          |
|      | and I/Q MZMs and (b) transmitter assembly used for testing (a)                                                                                                                                                                                                                                                                                                          | 103          |
|      | (a) $\dots \dots \dots$                                                                                                                                                                                                                                                 | $103 \\ 103$ |
| 7.8  | Simulated S21 of the driver showing 11 dB of peaking at 36 GHz and 66                                                                                                                                                                                                                                                                                                   |              |
|      | GHz of 3-dB bandwidth                                                                                                                                                                                                                                                                                                                                                   | 104          |
| 7.9  | Standalone TX (a) constellation (b) eye diagram at 56 GBd with 18-mV swing and 4-mV eye opening, and (c) BER as a function of RX input                                                                                                                                                                                                                                  |              |
|      | power for -2-dBm LO power per PD                                                                                                                                                                                                                                                                                                                                        | 105          |
|      | $ \begin{array}{c} (a) \\ (b) \\ (c) \end{array}$                                                                                                                                                                                                                                                                                                                       | 105<br>105   |
| 7 10 | Chip micrograph and PCB assembly for the coherent optical receiver chip                                                                                                                                                                                                                                                                                                 | 100          |
| 1.10 | and assembly.                                                                                                                                                                                                                                                                                                                                                           | 106          |
| 7.11 | Self-homodyne test setup for link testing of the coherent optical receiver.                                                                                                                                                                                                                                                                                             | 106          |
| 7.12 | Power consumption from the current drawn form $V_{DD}$ , $V_{DD,buffer}$ and optical to involve the set                                                                                                                                                                                                                                                                 | 105          |
| 7.13 | QPSK raw and sampled constellations at (a) 28 (b) 40 Gbaud with BER<br>less than $1 \times 10^{-5}$ and (b) 56 Gbaud with $1.5 \times 10^{-3}$ BER (c) 60 Gbaud                                                                                                                                                                                                         | 107          |
|      | with $6 \times 10^{-3}$                                                                                                                                                                                                                                                                                                                                                 | 108          |
|      | (a)                                                                                                                                                                                                                                                                                                                                                                     | 108          |

|      | (b)                                                                        | 108 |
|------|----------------------------------------------------------------------------|-----|
|      | (c)                                                                        | 108 |
|      | (d)                                                                        | 108 |
| 7.14 | All electrical and full link optical eyes at (a) 28 Gbaud (b) 40 Gbaud (c) |     |
|      | 56 Gbaud (d) 60 Gbaud                                                      | 109 |
|      | (a)                                                                        | 109 |
|      | (b)                                                                        | 109 |
|      | (c)                                                                        | 109 |
|      | (d)                                                                        | 109 |
| 7.15 | BER curves at different bit rates indicating the power penalty to higher   |     |
|      | data rates as referenced to the FEC limit.                                 | 110 |
| 7.16 | PFD output as a function of phase error                                    | 111 |
|      |                                                                            |     |
| 8.1  | Heterodyne monolithic receiver                                             | 118 |
| 8.2  | Dual polarization monolithic receiver                                      | 118 |

# List of Tables

| 2.1 | Architecture comparison     | 36  |
|-----|-----------------------------|-----|
| 3.1 | Link parameters             | 57  |
| 5.1 | State-of-the-Art Comparison | 80  |
| 7.1 | State-of-the-Art Comparison | 115 |

# Chapter 1

# Introduction

Within the past decade, data centers have become an integral technology for enabling internet-based applications. The vast amount of data generated by augmented reality, the internet of things, and other cloud-based applications has resulted in a significant increase in data traffic, as shown in Fig. 1.1 [1, 2]. As also shown in Fig. 1.1, more than 77% of total data center traffic is attributed to short-range (< 2 km) intra-data center links.



Figure 1.1: Global data center IP traffic showing threefold increase in the last 5 years [1, 2].

The low loss of optical fibers, on the order of 0.2 dB/km, together with their high bandwidth enables Tb/s-per-fiber data transfer over tens of kilometers without optical amplification. These characteristics have made fiber optic interconnects the key technology for implementing data center interconnects (DCI), covering link distances from a few meters to thousands of kilometers [3]. However, the technology requirements for short-range DCI differ significantly from those of traditional long-haul communication, where maximizing per-fiber capacity is critical; short-range links instead face more stringent requirements on power consumption and cost [3, 4]. Because these short-range links involve a large number of connections that demand continual scaling in spectral efficiency and bandwidth, their cost is largely dominated by the high-speed transceivers. As a result, improvements in the spectral efficiency, bandwidth, and power consumption of optical transceivers contribute to overall data center efficiency [5].

To address this data growth, DCIs will operate above 200 Gbps per wavelength with optimized energy efficiency. This dissertation aims to outline various approaches toward implementing these optical links. The rest of this chapter provides a thorough comparison between direct and coherent detection followed by a discussion of co-packaged optics.

## **1.1** Intra-data Center Link Implementation

### 1.1.1 Intensity-Modulation Direct Detection (IMDD) Links

The simplicity and low cost of IMDD links, illustrated in Fig. 1.2a, have made them a popular approach to implementing fiber optic links despite their low tolerance to optical impairments. For example, 400 G transceivers use 4x100 G 4-level pulse amplitude modulation (PAM4) to deploy next-generation 1.6 T ethernet [6]. Fig. 1.3 shows the evolution of Google's IMDD-based DCI modules for short-range applications [3].

As shown in Fig. 1.3, Google's optical module implementation has evolved from one 10-Gbps NRZ lane to eight 100-Gbps PAM4 lanes. Scaling the per-lane speed from 100 Gbps to 200 Gbps is achievable either by using more spectrally efficient modulation formats such as PAM8 or by doubling the circuit bandwidth. Nevertheless, both approaches prove challenging because of their stringent requirements on SNR and linearity. Moreover, heavy equalization is essential for bandwidth improvements, resulting in significant power consumption.

Figure 1.2: Optical link architecture using (a) 4-lane IMDD (b) dual polarization coherent link.

Figure 1.3: Google DCI optical modules evolution for short range applications [3].

As a case in point, a 200-Gbps/lane PAM-4 link has been shown to operate over 400 meters using 71 feedforward equalizer (FFE) taps and 15 decision feedback equalizer (DFE) taps to achieve a pre-FEC (forward error correction) bit error rate (BER) limit of $2 \times 10^{-2}$. This result required more than 7 dBm of received optical power, which demands significant output power from the transmitter (TX) source laser [7]. Moreover, scaling beyond PAM-4 results in further increases in linearity and power consumption in the transmitter and

### 1.1.2 Coherent Detection Links

Coherent detection is an alternative to IMDD [8]. Fig. 1.2b illustrates the overall link architecture for dual-polarization coherent detection to enable 200 Gbps/lane. Unlike direct detection, where the only variable is the signal intensity, coherent detection takes advantage of three different dimensions to modulate the signal. Quadrature amplitude modulation (QAM) adds the in-phase and quadrature phase space as well as orthogonal polarizations to the intensity of light to transmit data, and hence offers more scalability than IMDD. Moreover, coherent detection offers improved receiver sensitivity as well as higher tolerance to optical impairments such as chromatic dispersion (CD) and polarization mode dispersion (PMD). Due to the high power requirements of high-speed coherent DSP, as well as the higher cost of coherent transceivers, this technology has mostly been deployed for long-haul communication links. However, continual advancements in IC technologies have helped reduce power consumption while improving performance [9].

Fig. 1.4 depicts the power breakdown for a 400-G coherent transceiver using 7-nm CMOS technology [10, 11, 12].



Figure 1.4: Power breakdown of 400-G coherent transceiver [12].

As shown in Fig. 1.4, the coherent laser and driver/TIA electronics constitute 40% of the total power consumption, while almost 50% of the total power is consumed within the ASIC chip. Inside the ASIC chip, coherent and IMDD links mostly share the same functional blocks, except for the carrier phase/frequency recovery and the polarization demultiplexing equalizer, which are specific to coherent digital signal processing (DSP). These extra functions account for only 10% of the total ASIC power, suggesting that a coherent DSP could potentially consume 10% more power than an IMDD DSP. Additionally, it is important to note that the most significant power contributors within the DSP are the high-speed analog-to-digital converter (ADC) and digital-to-analog converter (DAC).

Detailed comparisons of IMDD and coherent detection have been studied extensively and conclude that coherent links require lower laser power at comparable ASIC power [6, 3]. These power analyses and comparisons with traditional IMDD links show the potential to replace IMDD with coherent detection for short-range DCI.

While 1.6-T coherent links have been demonstrated with discrete components [13], strict power consumption requirements must be met with reduced equalization and more energy-efficient demodulation using an analog coherent optical receiver (CORX) [12, 8, 14, 15]. Consequently, this thesis studies different techniques for implementing coherent optical links with improved efficiency. The remainder of this chapter reviews the evolution of electronic/photonic technologies and interconnects and gives a brief preview of the thesis.

## **1.2** Co-Packaged Optics

As previously discussed, there is a continual demand for higher capacity and speed of data transfer, which requires innovation in both system and chip architectures. Optical fiber has been a promising technology for high-speed data transfer. Nevertheless, the bandwidth limitation of electronic/photonic interconnects remains a major bottleneck to further scaling of the baud rate.

Fig. 1.5 shows the currently used technology within data centers with discrete electronic and photonic components. Co-Packaged Optics (CPO) is an advanced heterogeneous integration of optics and silicon on a single packaged substrate aimed at addressing next generation bandwidth and power challenges [16].



Figure 1.5: Evolution of electronic/photonic interconnects published by Broadcom [16].

Let us return to coherent optical signal processing, which places demanding requirements on both photonic and electronic circuits. Heterogeneously integrated, energy-efficient dual-polarization coherent optical links operating at 224 Gbps/wavelength have been demonstrated [17, 18]. However, monolithic optical transmitters and receivers offer reduced parasitics between the photonic and RF integrated circuit components. Silicon photonics (SiPh) has enabled CMOS-compatible optical structures, with extensive research on implementing high-speed electro-optic silicon modulators, SiGe photodetectors, low-loss fiber-to-waveguide couplers, and silicon-based lasers [19]. A monolithic coherent receiver at C-band has been demonstrated operating at 3.2 pJ/bit using a photonic BiCMOS 0.25-$\mu$m SiGe technology [20]. However, RF CMOS circuit techniques complementing SiPh devices enable further improvement in energy efficiency [21].

Fig. 1.6 depicts GlobalFoundries 45-nm CMOS SOI technology (45CLO) which offers NMOS devices with  $f_T = 290$  GHz and supports a process development kit (PDK) that includes optical structures for waveguides, photodetectors, fiber coupling, polarization control structures, as well as ring and Mach-Zehnder modulators (MZMs) [22] with recent implementations of 112 Gbps IMDD links [23].



Figure 1.6: GlobalFoundries 45-nm CMOS SOI photonic technology [23].

As this dissertation focuses on implementing coherent optical links with optimized energy efficiency, it is essential to study various integration and packaging approaches and to leverage the minimized parasitic components of monolithic technologies.

## **1.3** Dissertation Preview

This dissertation will cover system and chip architectures, as well as design and implementation of energy-efficient intra-data center coherent optical links.

Chapter 2 will cover design methodologies and architectures for short-reach coherent links with a focus on transition from digital to analog signal processing for power saving.

Chapter 3 will outline an analysis on power optimization for short reach coherent detection.

A quantitative comparison between two high-speed packaging platforms will be presented in Chapter 4, showing reduced error rates in data transfer when through-silicon vias (TSVs) are used to implement chip-to-PCB interconnects.

Chapter 5 will cover the design and measurement of a Costas phase/frequency detector (PFD) as an integral part of analog coherent signal processing.

Chapter 6 shows the design and full demonstration of the first O-band coherent receiver in a monolithic CMOS SOI photonic technology.

Coherent link design optimization enabled by co-simulation of electrical and optical components will be reported in Chapter 7. This chapter also shows the O-band coherent full-link measurement results with the most energy-efficient coherent receiver design reported to date.

Finally, Chapter 8 will conclude the designs with a discussion of future work.

# Chapter 2

# **Analog Coherent Detection**

## **2.1** Introduction

As indicated in the previous chapter, coherent detection improves receiver sensitivity and supports a pathway to improving energy efficiency with high-speed electronic devices. In this section, we review various approaches to implementing a coherent architecture.

Currently, high-speed coherent detection in long-haul communication relies on digital signal processing (DSP), a mature technology using a linear receiver front-end followed by an analog-to-digital converter (ADC) that allows the DSP to perform carrier recovery, equalization, and polarization de-multiplexing. CMOS scaling of DSP circuitry has provided continued improvements in the energy required by DSP; however, the high-speed, high-resolution ADC requirement results in high power consumption. For instance, [24] demonstrates a 56-GS/s 8-bit ADC implemented in 28-nm CMOS that consumes 702 mW. A dual-polarization I/Q receiver operating at 56 GBd requires four such ADCs, resulting in 12.5 pJ/bit just from analog-to-digital conversion.
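The 12.5 pJ/bit figure follows from simple arithmetic. The sketch below reproduces it under the assumption, not stated explicitly in the sentence above, that each polarization carries QPSK (2 bits per symbol); the function name is illustrative only.

```python
def adc_energy_per_bit_pj(adc_power_w, n_adcs, baud_rate, bits_per_symbol, n_pol):
    """ADC energy per bit in pJ/bit for a coherent receiver."""
    total_power_w = adc_power_w * n_adcs                # all ADCs in the receiver
    bit_rate = baud_rate * bits_per_symbol * n_pol      # aggregate bit rate in bit/s
    return total_power_w / bit_rate * 1e12

# 702 mW per ADC [24], four ADCs, 56 GBd, QPSK (assumed), two polarizations
energy = adc_energy_per_bit_pj(0.702, 4, 56e9, 2, 2)
print(f"{energy:.1f} pJ/bit")
```

With these inputs the aggregate bit rate is 224 Gbps and the four ADCs together draw about 2.8 W.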

An alternative approach is to use analog coherent circuits for short-range interconnects that operate in the O-band to avoid the equalization otherwise required by optical dispersion. Quadrature phase shift keying (QPSK) uses a single decision threshold, which eliminates the need for a high-speed ADC and enables direct demodulation using analog circuitry [8, 25]. A multiwavelength analog coherent detection (ACD) architecture utilizing a chip-scale optical phase lock loop (OPLL) is proposed in [25]. The proposed architecture is based on 50 GBd polarization-multiplexed QPSK (PM-QPSK) for an aggregate data rate of 200 Gbps/$\lambda$ at sub-10 pJ/bit energy per bit. Although higher-order QAM formats still need high-speed ADCs, analog carrier recovery enables DSP that is compatible with IMDD links.

Moreover, re-configurable data center links and improvements in energy efficiency through optical switching have become a major research topic [5, 26, 27, 28, 29, 30, 31, 32]. However, the losses of the switched network result in a strict link budget. One approach to relax link budget requirements is to make the switch transparent by incorporating optical gain through semiconductor optical amplifiers [8]. This approach is presented in [33], which shows integration challenges in Si photonic platforms as well as operational issues including added noise, gain uniformity across wavelengths, and crosstalk. To enable photonic switching by expanding the available link budget, an analysis of various modulation formats is presented in [8, 3]. For an analysis conducted under a consistent set of assumptions for each link, the conclusions of these publications were the following. For drive swings above $V_{\pi}$, 16QAM can offer some improvement in budget compared to IMDD, but the advantages of QPSK are much more substantial. At full $2V_{\pi}$ drive levels, QPSK expands link budgets by 8 dB compared to PAM4 and 12 dB compared to PAM8. At a more practically realizable drive voltage of $0.6V_{\pi}$, QPSK offers increases of 2 dB and 6 dB compared to PAM4 and PAM8, respectively.

In summary, the analog coherent detection architecture proposed in [8] and reviewed briefly here enables optical switching through an improved link budget while operating at sub-10 pJ/bit energy per bit. However, in the previous analysis the complications of locking the LO laser to the transmitter were not fully considered. In this section we review QPSK link architectures with analog carrier recovery circuitry and discuss the advantages and drawbacks of each architecture.

# 2.2 Optical Phase Lock Loop (OPLL) Homodyne Detection

Fig. 2.1 shows a coherent link block diagram using an OPLL to lock the frequency and phase of the LO laser to an incoming wavelength channel. The CORX includes an optical hybrid to produce quadrature versions of the LO and received data. The electronic circuits capture and amplify the differential I/Q signals and drive a Costas phase/frequency detector (PFD) used to lock the phase/frequency of the tunable LO laser [34, 35]. Previous implementations of optical phase locking with the simpler BPSK modulation format can be found in [36, 37, 38, 39]. Highly tunable InP [40, 41] and silicon photonic integrated lasers have been demonstrated [42, 43, 44, 45]; however, temperature control to ensure mode-hop-free operation is essential. Moreover, a high loop bandwidth with minimum loop delay is essential to ensure stable locking with the required pull-in and lock-in range. In the following section, a simplified analysis of the loop dynamics is conducted to quantify the OPLL performance and implementation challenges.

### 2.2.1 Dynamics of the OPLL

To review the dynamics of the frequency recovery loop, let us quantify the output of the hybrid for a general case. We will further discuss the hybrid functionality and design to generate the desired fields in Chapters 3 and 5. The hybrid can be viewed as a mixer generating electrical fields with components at $\omega_{LO} + \omega_{RX}$ and $\omega_{LO} - \omega_{RX}$. The PDs act as a low pass filter (LPF) and remove the $\omega_{LO} + \omega_{RX}$ component.

Figure 2.1: Coherent optical data link with OPLL carrier recovery circuitry.

The I/Q received signal can be expressed as $m_I \cos(\omega_{RX} t) + m_Q \sin(\omega_{RX} t)$ and the LO laser as $\cos(\omega_{LO} t + \phi)$. As depicted in Fig. 2.1, the current at each PD is the following

$$\begin{bmatrix} I_p \\ I_n \\ Q_p \\ Q_n \end{bmatrix} = \begin{bmatrix} \frac{1}{2}(m_I \cos(\omega_{IF}t - \phi) + m_Q \sin(\omega_{IF}t - \phi)) \\ \frac{1}{2}(-m_I \cos(\omega_{IF}t - \phi) - m_Q \sin(\omega_{IF}t - \phi)) \\ \frac{1}{2}(-m_I \sin(\omega_{IF}t - \phi) + m_Q \cos(\omega_{IF}t - \phi)) \\ \frac{1}{2}(m_I \sin(\omega_{IF}t - \phi) - m_Q \cos(\omega_{IF}t - \phi)) \end{bmatrix},$$
(2.1)

where $\omega_{IF} = \omega_{RX} - \omega_{LO}$, and $m_I$ and $m_Q$ are the random binary data on the I and Q channels (the analysis below takes $\frac{m_I}{m_Q} = \pm 1$). The feedback loop shown in Fig. 2.1 minimizes the correlation seen between $m_I$ and $m_Q$ in (2.1) to near zero. We will start by discussing the Costas PFD performance and continue by introducing a linear model for the OPLL to help determine the functionality for small frequency perturbations.

#### Costas PFD Gain and Bandwidth

As shown in Fig. 2.1, the PFD response is generated from the multiplication and addition of I/Q signal with the limited version of Q/I waveforms. Setting  $\Phi(t) = \omega_{IF}t - \phi$ , the signal generated at the output of the adder equals

$$\begin{aligned} PFD_{out} = \big[ & Z_{TIA} \cdot [m_I \cos(\Phi(t)) + m_Q \sin(\Phi(t))] \cdot G_{mix} \cdot sign[-m_I \sin(\Phi(t)) + m_Q \cos(\Phi(t))] \\ & - Z_{TIA} \cdot [-m_I \sin(\Phi(t)) + m_Q \cos(\Phi(t))] \cdot G_{mix} \cdot sign[m_I \cos(\Phi(t)) + m_Q \sin(\Phi(t))] \big] \cdot G_{add}, \end{aligned} \qquad (2.2)$$

where  $Z_{TIA}$  is the transimpedance from the TIA,  $G_{mix}$  is the mixer gain and  $G_{add}$  is the gain from addition circuitry. Of course, this equation does not take into account mixer and adder non-linearity that will affect the PFD response. Using trigonometric identities, we have

$$m_I \cos(\Phi(t)) + m_Q \sin(\Phi(t)) = \sqrt{m_I^2 + m_Q^2} \times \sin(\Phi(t) + \arctan(\frac{m_I}{m_Q})), \qquad (2.3)$$

and

$$-m_I \sin(\Phi(t)) + m_Q \cos(\Phi(t)) = \sqrt{m_I^2 + m_Q^2} \times \sin(\Phi(t) - \arctan(\frac{m_I}{m_Q})).$$
(2.4)

This will generate the following output voltage as a function of phase error and quadrature data  $m_I, m_Q$ 

$$PFD_{out}(t) = \begin{cases} \sqrt{2} \cdot V_{PFD} \cdot \sin(\Phi(t) - \arctan(\frac{m_I}{m_Q}) - \frac{\pi}{4}), & 0 < \Phi(t) - \arctan(\frac{m_I}{m_Q}) < \frac{\pi}{2}, \\ -\sqrt{2} \cdot V_{PFD} \cdot \sin(\Phi(t) - \arctan(\frac{m_I}{m_Q}) + \frac{\pi}{4}), & \frac{\pi}{2} < \Phi(t) - \arctan(\frac{m_I}{m_Q}) < \pi, \\ -\sqrt{2} \cdot V_{PFD} \cdot \sin(\Phi(t) - \arctan(\frac{m_I}{m_Q}) - \frac{\pi}{4}), & \pi < \Phi(t) - \arctan(\frac{m_I}{m_Q}) < \frac{3\pi}{2}, \\ \sqrt{2} \cdot V_{PFD} \cdot \sin(\Phi(t) - \arctan(\frac{m_I}{m_Q}) + \frac{\pi}{4}), & \frac{3\pi}{2} < \Phi(t) - \arctan(\frac{m_I}{m_Q}) < 2\pi, \end{cases}$$

where $V_{PFD} = Z_{TIA} \cdot G_{mix} \cdot G_{add} \cdot \sqrt{m_I^2 + m_Q^2}$. Taking into account that $\frac{m_I}{m_Q} = \pm 1$, for both cases the response simplifies to

$$PFD_{out}(t) = \begin{cases} V_{PFD} \cdot \sin(\Phi(t)), & -\frac{\pi}{4} < \Phi(t) < \frac{\pi}{4}, \\ -V_{PFD} \cdot \cos(\Phi(t)), & \frac{\pi}{4} < \Phi(t) < \frac{3\pi}{4}, \\ -V_{PFD} \cdot \sin(\Phi(t)), & \frac{3\pi}{4} < \Phi(t) < \frac{5\pi}{4}, \\ V_{PFD} \cdot \cos(\Phi(t)), & \frac{5\pi}{4} < \Phi(t) < \frac{7\pi}{4}. \end{cases}$$

The response normalized to $V_{PFD}$ as a function of phase error is shown in Fig. 2.2.



Figure 2.2: Calculated PFD response normalized to $V_{PFD}$ as a function of phase error.
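The shape of this response can be cross-checked numerically by evaluating (2.2) directly. The following is a minimal sketch, assuming unit gains ($Z_{TIA} = G_{mix} = G_{add} = 1$), ideal limiters modeled with `sign()`, and data symbols $m_I, m_Q = \pm 1$; the function and variable names are illustrative and not taken from the design.

```python
import numpy as np

def costas_pfd(phi, m_i, m_q):
    """Evaluate (2.2) with Z_TIA = G_mix = G_add = 1 (normalized gains, an assumption)."""
    i_sig = m_i * np.cos(phi) + m_q * np.sin(phi)
    q_sig = -m_i * np.sin(phi) + m_q * np.cos(phi)
    # Each path is multiplied by the limited (sign) version of the other path.
    return i_sig * np.sign(q_sig) - q_sig * np.sign(i_sig)

# Phase-error grid; the small offset keeps samples away from the sign() transitions.
phi = np.linspace(-np.pi, np.pi, 2000, endpoint=False) + 1e-4

symbols = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
responses = {sym: costas_pfd(phi, *sym) for sym in symbols}
reference = responses[(1, 1)]

# The response does not depend on the transmitted QPSK symbol ...
same_for_all_symbols = all(np.allclose(r, reference) for r in responses.values())
# ... and repeats every pi/2 of phase error.
periodic_quarter_turn = np.allclose(costas_pfd(phi + np.pi / 2, 1, 1), reference)
print(same_for_all_symbols, periodic_quarter_turn)
```

The $\pi/2$ periodicity is the property that makes the detector insensitive to the data and gives four stable lock points per cycle of phase error.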

The PFD gain, $K_{PFD}$, is found to be $K_{PFD} = \frac{2V_{PFD}}{\pi}$ V/rad. The input signal amplitude $\sqrt{m_I^2 + m_Q^2}$ depends on both the signal power, $P_{RX}$, and the LO power, $P_{LO}$, and is found to be $R_{PD}\sqrt{2P_{LO}P_{RX}}$. We will review the derivation of this expression in the next chapter in (3.3).

Consequently, we have $K_{PFD} = \frac{2}{\pi} Z_{TIA} \cdot G_{mix} \cdot G_{add} \cdot R_{PD} \sqrt{2P_{LO}P_{RX}}$ V/rad. There is a well-known limit for the transimpedance, $Z_{TIA}$, as a function of the bandwidth that we will address in the next chapter [46], but for the purpose of this analysis let us assume that $Z_{TIA} = 380\ \Omega$ is chosen for optimized performance in the data path. For a passive mixer $G_{mix} = \frac{2}{\pi}$, and assuming ideal addition $G_{add} = 1$, we find

$$K_{PFD} = 154R_{PD}\sqrt{2P_{LO}P_{RX}}.$$
 (2.5)
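The coefficient in (2.5) can be reproduced from the values quoted above. The sketch below does so; the received power used in the usage line is a hypothetical value chosen only for illustration, while $R_{PD} = 0.9$ A/W and $P_{LO} = 10$ mW are the values used later in this chapter.

```python
import math

Z_TIA = 380.0            # ohm, transimpedance assumed in the text
G_MIX = 2.0 / math.pi    # passive mixer gain
G_ADD = 1.0              # ideal adder

def k_pfd(r_pd, p_lo, p_rx):
    """PFD gain in V/rad: (2/pi) * Z_TIA * G_mix * G_add * R_PD * sqrt(2 * P_LO * P_RX)."""
    return (2.0 / math.pi) * Z_TIA * G_MIX * G_ADD * r_pd * math.sqrt(2.0 * p_lo * p_rx)

# Numerical prefactor of (2.5)
coefficient = (2.0 / math.pi) * Z_TIA * G_MIX * G_ADD
print(f"coefficient = {coefficient:.1f}")

# Usage: R_PD = 0.9 A/W, P_LO = 10 mW, and a hypothetical P_RX = 100 uW
print(f"K_PFD = {k_pfd(0.9, 10e-3, 100e-6):.3f} V/rad")
```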

Since the PFD response is dependent on the input signal amplitude, it is essential to ensure that the OPLL can achieve locking at the receiver sensitivity levels.

The PFD bandwidth depends on TIA front-end bandwidth, how fast the mixer can switch to follow high speed data transitions as well as the adder capability to subtract the signals without degradation in amplitude at higher data rates. As circuit components in the Costas loop are designed to be wide-band to support high data rates up to 56 GBaud, we can safely model the PFD as a first order system having a wider bandwidth compared to other components in the loop. In the following section, a mathematical model for the PLL is introduced in order to review loop stability as well as phase/frequency tracking ability of the OPLL.

#### Tunable Laser

The design and fabrication of the tunable laser can be found in [42]. The tunable laser can be modeled as a current-controlled oscillator (CCO) with a phase/frequency efficiency $K_{CCO} = 600 \text{ GHz/A}$. The phase tuning diode is forward biased to achieve a high phase efficiency and can be modeled as a 15 $\Omega$ resistor. Since the tuning current is the control voltage divided by this resistance, we can rewrite the phase efficiency as $K_{VCO} = K_{CCO}/15\ \Omega = 40 \text{ GHz/V}$.
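The conversion from current tuning to voltage tuning is a single division. It is sketched below under the purely resistive diode model stated above; the helper name is hypothetical.

```python
def cco_to_vco_gain(k_cco_ghz_per_a, r_ohm):
    """Convert a current-tuning gain (GHz/A) to a voltage-tuning gain (GHz/V), using I = V / R."""
    return k_cco_ghz_per_a / r_ohm

k_vco = cco_to_vco_gain(600.0, 15.0)
print(k_vco)  # 40.0 (GHz/V)
```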

The maximum LO laser power including on chip semiconductor optical amplifiers (SOA) for this design is 10 mW.

#### Linear PLL Model

Fig. 2.3 depicts a block diagram to help evaluate the PLL dynamics. To simplify the mathematical analysis, the electrical circuitry in the Costas loop can be modeled as a first order system with $H_{PFD} = \frac{k_{PFD}}{\frac{s}{\omega_{PFD}}+1}$, described as the PFD and a low pass filter (LPF) in Fig. 2.1.



Figure 2.3: PLL mathematical model including a PFD, a low pass filter (LPF), a loop filter (LF), and a voltage-controlled oscillator (VCO).

The Costas loop is then followed by a loop filter (LF). Because of its superior performance, an integrator with a zero is chosen as the LF. The output of the LF will see a time delay before reaching the tunable laser, which is modeled as a voltage- or current-controlled oscillator (VCO/CCO).

#### Stability

The open loop transfer function  $H_{OL}$  for the PLL model depicted in Fig. 2.3 is defined by

$$H_{OL} = \frac{\theta_e}{\theta_{in}} = k_{PFD} k_{VCO} \frac{1 + s\tau_{zero}}{s^2 \tau_{LF}} e^{-s\tau_{delay}}.$$
(2.6)

The amplitude of the transfer function is unaffected by the delay; however, the phase is shifted by $\phi_{delay} = -\omega \tau_{delay}$. To ensure stability, the phase must remain larger than $-\pi$ rad at the frequency where the amplitude crosses 0 dB. The amplitude of the transfer function equals

$$|H_{OL}| = k_{PFD}k_{VCO}\frac{\sqrt{1+\omega^2.\tau_{zero}^2}}{\omega^2\tau_{LF}},$$
(2.7)

and the phase of the transfer function is

$$\angle H_{OL} = -\pi + \arctan(\omega \tau_{zero}) - \omega \tau_{delay}.$$
(2.8)

Assuming that the LF zero frequency is much smaller than the unity gain frequency  $\omega_u$ ; i.e.  $\omega_u \tau_{zero} >> 1$ , we find  $\omega_u = k_{PFD} k_{VCO} \frac{\tau_{zero}}{\tau_{LF}}$ . The phase at this frequency equals  $\phi_u = -\pi + \arctan(k_{PFD} k_{VCO} \frac{\tau_{zero}^2}{\tau_{LF}}) - k_{PFD} k_{VCO} \frac{\tau_{zero}}{\tau_{LF}} \cdot \tau_{delay}$ .

As a graphical aid, the phase margin, $\arctan(\omega_u \tau_{zero}) - \omega_u \tau_{delay}$, is shown as a function of the unity gain bandwidth, $\omega_u$, and the time delay, $\tau_{delay}$, in Figs. 2.4a and 2.4b for LF zero frequencies $f_{zero} = \frac{1}{2\pi\tau_{zero}}$ of 10 and 100 MHz. As shown, pushing the LF zero to higher frequencies limits the phase margin. It is also important to note that larger delay values significantly limit the stable unity gain bandwidth. Although the zero improves stability for smaller delays, its effect is reduced as the loop delay increases and it cannot help improve the stable bandwidth. We will review circuit designs of the LF to further evaluate actual implementation challenges, but for calculation purposes we will just use the ideal transfer function in this section.



Figure 2.4: Phase margin as a function of time delay and unity gain bandwidth for (a) $f_{zero} = 10$ MHz and (b) $f_{zero} = 100$ MHz.
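The trends in Fig. 2.4 can be explored with a few lines of code. The sketch below evaluates the phase margin expression $\arctan(\omega_u \tau_{zero}) - \omega_u \tau_{delay}$ and the delay at which it reaches zero; the unity gain bandwidth and delay in the usage loop are hypothetical values, not design values from this work.

```python
import math

def phase_margin_rad(f_u_hz, f_zero_hz, tau_delay_s):
    """Phase margin arctan(w_u * tau_zero) - w_u * tau_delay, in radians."""
    w_u = 2.0 * math.pi * f_u_hz
    tau_zero = 1.0 / (2.0 * math.pi * f_zero_hz)
    return math.atan(w_u * tau_zero) - w_u * tau_delay_s

def max_stable_delay_s(f_u_hz, f_zero_hz):
    """Loop delay at which the phase margin drops to zero."""
    w_u = 2.0 * math.pi * f_u_hz
    tau_zero = 1.0 / (2.0 * math.pi * f_zero_hz)
    return math.atan(w_u * tau_zero) / w_u

# Hypothetical operating point: 500 MHz unity gain bandwidth, 100 ps loop delay
for f_zero in (10e6, 100e6):
    pm = phase_margin_rad(f_u_hz=500e6, f_zero_hz=f_zero, tau_delay_s=100e-12)
    limit = max_stable_delay_s(500e6, f_zero)
    print(f"f_zero = {f_zero / 1e6:.0f} MHz: PM = {math.degrees(pm):.1f} deg, "
          f"max delay = {limit * 1e12:.0f} ps")
```

For a fixed unity gain bandwidth, the lower zero frequency yields the larger margin, in line with the observation above.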

The closed loop transfer function for this linear PLL model can also be found. For the closed loop, we have

$$(\theta_{in} - \theta_e)(k_{PFD}k_{VCO}\frac{1 + s\tau_{zero}}{s^2\tau_{LF}}e^{-s\tau_{delay}}) = \theta_e.$$
(2.9)

As a result, we can find

$$H_{CL} = \frac{\theta_e}{\theta_{in}} = \frac{k_{PFD}k_{VCO}}{\tau_{LF}} \frac{1 + s\tau_{zero}}{s^2 e^{s\tau_{delay}} + s.k_{PFD}k_{VCO}\frac{\tau_{zero}}{\tau_{LF}} + \frac{k_{PFD}k_{VCO}}{\tau_{LF}}}.$$
 (2.10)

For small loop delay values such that $|s\tau_{delay}| < 0.1$, we can approximate the closed loop transfer function as a second order system with natural frequency $\omega_n = \sqrt{\frac{k_{PFD}k_{VCO}}{\tau_{LF}}}$ and damping factor $\zeta = \frac{\tau_{zero}\omega_n}{2}$, whereas with higher delay values the closed loop response approaches the open loop transfer function and its magnitude can be estimated as $|H_{CL}| \approx k_{PFD}k_{VCO}\frac{\sqrt{1+\omega^2\tau_{zero}^2}}{\omega^2\tau_{LF}}$. This can indicate instability and failure to lock for large loop delays.

The linear model determines the step response and the PLL lock dynamic for small instantaneous frequency drifts. Let us find the step response assuming the abrupt changes in frequency are such that the phase shift from the time delay can be neglected.

$$H_{step} = \frac{1}{s} 2\zeta \omega_n \frac{s + \omega_{zero}}{s^2 + 2\zeta \omega_n s + \omega_n^2} = \frac{2\zeta \omega_{zero}}{\omega_n} \left[\frac{1}{s} - \frac{s + (2\zeta \omega_n - \frac{\omega_n^2}{\omega_{zero}})}{s^2 + 2\zeta \omega_n s + \omega_n^2}\right],\tag{2.11}$$

where $\omega_{zero} = \frac{1}{\tau_{zero}}$. For damping factor $0 < \zeta < 1$, the response can be expressed as

$$H_{step} = \frac{2\zeta\omega_{zero}}{\omega_n} \left[\frac{1}{s} - \frac{\zeta\omega_n - \frac{\omega_n^2}{\omega_{zero}}}{(s + \zeta\omega_n)^2 + \omega_d^2} - \frac{s + \zeta\omega_n}{(s + \zeta\omega_n)^2 + \omega_d^2}\right],\tag{2.12}$$

where $\omega_d = \sqrt{\omega_n^2(1-\zeta^2)}$. The time domain response in this case equals

$$h_{step}(t) = \frac{2\zeta\omega_{zero}}{\omega_n} \left[1 - \left(\frac{\zeta\omega_n - \frac{\omega_n^2}{\omega_{zero}}}{\omega_d}\right)e^{-\zeta\omega_n t}\sin\omega_d t - e^{-\zeta\omega_n t}\cos\omega_d t\right].$$
(2.13)

The equation can be further simplified

$$h_{step}(t) = \frac{2\zeta\omega_{zero}}{\omega_n} \left[1 - e^{-\zeta\omega_n t} \left(\frac{-\zeta}{\sqrt{1-\zeta^2}}\sin\omega_d t + \cos\omega_d t\right)\right].$$
 (2.14)

 $\zeta = \frac{1}{\sqrt{2}}$  is a proper choice to optimize rise time for a reasonable overshoot. The normalized step response is shown in Fig. 2.5. The response settles at  $\frac{\omega_n t}{\sqrt{2}} = 4$  and hence the settling time is  $t_s = \frac{4\sqrt{2}}{\omega_n}$ .



Figure 2.5: Normalized step response.
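The normalized response in (2.14) is straightforward to evaluate. The sketch below assumes the prefactor $\frac{2\zeta\omega_{zero}}{\omega_n}$ equals one, which follows from $\zeta = \frac{\tau_{zero}\omega_n}{2}$ and $\omega_{zero} = \frac{1}{\tau_{zero}}$; the natural frequency is a hypothetical value used only to set the time axis.

```python
import numpy as np

def step_response(t, omega_n, zeta):
    """Normalized under-damped step response of (2.14), valid for 0 < zeta < 1."""
    omega_d = omega_n * np.sqrt(1.0 - zeta**2)
    envelope = np.exp(-zeta * omega_n * t)
    oscillation = np.cos(omega_d * t) - zeta / np.sqrt(1.0 - zeta**2) * np.sin(omega_d * t)
    return 1.0 - envelope * oscillation

omega_n = 2.0 * np.pi * 100e6        # hypothetical natural frequency, rad/s
zeta = 1.0 / np.sqrt(2.0)
t_s = 4.0 * np.sqrt(2.0) / omega_n   # settling time quoted in the text
t = np.linspace(0.0, 2.0 * t_s, 2001)
h = step_response(t, omega_n, zeta)

print(f"h(0) = {h[0]:.3f}, h(t_s) = {step_response(t_s, omega_n, zeta):.3f}, peak = {h.max():.3f}")
```

Evaluating the curve at $t_s$ confirms that the response stays within 2% of its final value from that point on.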

And for damping factor  $\zeta > 1$ , the response can be expressed as

$$H_{step} = \frac{2\zeta\omega_{zero}}{\omega_n} \left[\frac{1}{s} - \frac{\zeta\omega_n - \frac{\omega_n^2}{\omega_{zero}}}{(s + \zeta\omega_n)^2 - \omega_d^2} - \frac{s + \zeta\omega_n}{(s + \zeta\omega_n)^2 - \omega_d^2}\right],\tag{2.15}$$

where  $\omega_d = \sqrt{\omega_n^2(\zeta^2 - 1)}$ . The time domain response in this case equals

$$h_{step}(t) = \frac{2\zeta\omega_{zero}}{\omega_n} \left[1 - \left(\frac{\zeta\omega_n - \frac{\omega_n^2}{\omega_{zero}}}{2\omega_d} + \frac{1}{2}\right)e^{(-\zeta\omega_n + \omega_d)t} + \left(\frac{\zeta\omega_n - \frac{\omega_n^2}{\omega_{zero}}}{2\omega_d} - \frac{1}{2}\right)e^{(-\zeta\omega_n - \omega_d)t}\right].$$
(2.16)

The over-damped response has a slower rise time, which will limit the lock time. However, the damping factor $\zeta = \frac{\tau_{zero}}{2} \sqrt{\frac{k_{PFD}k_{VCO}}{\tau_{LF}}}$ depends on $K_{PFD}$, which varies with the transmitted signal amplitude. As a result, the OPLL tracking capability depends strongly on signal power.

To summarize, in an analog Costas OPLL, locking dynamics highly depend on signal and LO powers as well as loop delay which is inherently large due to integration challenges between electronic and photonic components. Techniques for rapidly acquiring the LO might use non-linear adaptation schemes to change the loop filter dynamically and allow for a fast acquisition period, followed by a longer time constant to improve the phase noise rejection. Moving forward we will review frequency locking dynamics for the linear model by calculating pull-in, and lock-in range and time.

#### Pull-in Frequency

Pull-in range is defined as the largest interval  $[0, \Delta \omega_P)$  of frequency such that the loop achieves lock for any initial state.

To find the pull-in frequency, let us revisit the QPSK Costas PFD response shown in Fig. 2.8. Assume there is an initial frequency offset $\Delta \omega_P$. The Costas PFD response quadruples the phase/frequency offset; hence, the signal traveling through the LF has a frequency equal to $4\Delta\omega_P$. The output of the LF will have another phase shift equal to $\phi_{delay}(\omega) = -\omega\tau_{delay}$ due to the time delay before reaching the tunable laser.

To ensure that the output frequency, $\omega_{out}$, pulls toward the desired frequency of the received signal, $\omega_{RX}$, the total phase shift of the signal driving the VCO must be at most $-\pi/2$. The signal space analysis as well as the mathematical proof for this statement can be found in [35]. The phase of the signal traveling in the loop equals

$$\phi_{total}(\Delta\omega_P) = 4\phi_{PFD}(\Delta\omega_P) + \phi_{LF}(4\Delta\omega_P) + \phi_{delay}(4\Delta\omega_P), \qquad (2.17)$$

where $\phi_{PFD} = -\arctan(\frac{\Delta\omega_P}{\omega_{PFD}})$ is the phase shift in the PFD response from the single-pole LPF model. At $4\Delta\omega_P$, the phase shift from the loop filter shown in Fig. 2.3 consists of a $-\pi/2$ phase shift from the integrating pole and an $\arctan(\frac{4\Delta\omega_P}{\omega_{zero}})$ phase shift from the zero in the transfer function. Finally, the phase shift from the time delay equals $\phi_{delay}(4\Delta\omega_P) = -4\Delta\omega_P \tau_{delay}$. Replacing the phase shifts in (2.17) with the calculated values, to find $\Delta\omega_P$ we should solve

$$\phi_{total}(\Delta\omega_P) = -4\arctan(\frac{\Delta\omega_P}{\omega_{PFD}}) - \pi/2 + \arctan(\frac{4\Delta\omega_P}{\omega_{zero}}) - 4\Delta\omega_P \cdot \tau_{delay} = -\pi/2. \quad (2.18)$$

We can modify the equation using the following trigonometric identity

$$4\arctan(x) = \arctan(\frac{4x(1-x^2)}{1-6x^2+x^4}).$$
(2.19)
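Because $\arctan$ returns a principal value in $(-\pi/2, \pi/2)$, the identity (2.19) holds as written only while $4\arctan(x) < \pi/2$, that is for $|x| < \tan(\pi/8) \approx 0.414$; beyond that a multiple of $\pi$ must be added. The short check below illustrates this; the sample values of $x$ are arbitrary.

```python
import math

def four_arctan_via_identity(x):
    """Right-hand side of (2.19)."""
    return math.atan(4 * x * (1 - x**2) / (1 - 6 * x**2 + x**4))

x_limit = math.tan(math.pi / 8)   # 4 * arctan(x) reaches pi/2 at this value
print(f"identity valid without correction for |x| < {x_limit:.3f}")

for x in (0.1, 0.3, 0.4, 0.5):
    lhs = 4 * math.atan(x)
    rhs = four_arctan_via_identity(x)
    print(f"x = {x}: 4*arctan(x) = {lhs:.4f}, identity = {rhs:.4f}, "
          f"match = {math.isclose(lhs, rhs)}")
```

The sample at $x = 0.5$ lies above the limit and differs from $4\arctan(x)$ by exactly $\pi$.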

The pull-in frequency,  $\Delta \omega_P$ , must satisfy the following equation

$$-\arctan(\frac{4\frac{\Delta\omega_P}{\omega_{PFD}}(1-(\frac{\Delta\omega_P}{\omega_{PFD}})^2)}{1-6(\frac{\Delta\omega_P}{\omega_{PFD}})^2+(\frac{\Delta\omega_P}{\omega_{PFD}})^4}) + \arctan(\frac{4\Delta\omega_P}{\omega_{zero}}) = 4\Delta\omega_P.\tau_{delay}, \quad (2.20)$$

and using

$$\arctan(x) + \arctan(y) = \begin{cases} \arctan(\frac{x+y}{1-xy}), & xy < 1\\ \pi + \arctan(\frac{x+y}{1-xy}), & xy > 1 \end{cases}$$

We find that for

$$\Delta \omega_P < \omega_{PFD} \sqrt{\frac{6 - \frac{\omega_{zero}}{\omega_{PFD}} - \sqrt{(6 - \frac{\omega_{zero}}{\omega_{PFD}})^2 - 4(1 + \frac{\omega_{zero}}{\omega_{PFD}})}}{2}},$$

$\Delta \omega_P$ must satisfy

$$\arctan\left(\frac{\frac{4\frac{\Delta\omega_P}{\omega_{PFD}}(1-(\frac{\Delta\omega_P}{\omega_{PFD}})^2)}{1-6(\frac{\Delta\omega_P}{\omega_{PFD}})^2+(\frac{\Delta\omega_P}{\omega_{PFD}})^4}-\frac{4\Delta\omega_P}{\omega_{zero}}}{1+\frac{4\frac{\Delta\omega_P}{\omega_{PFD}}(1-(\frac{\Delta\omega_P}{\omega_{PFD}})^2)}{1-6(\frac{\Delta\omega_P}{\omega_{PFD}})^2+(\frac{\Delta\omega_P}{\omega_{PFD}})^4}\cdot\frac{4\Delta\omega_P}{\omega_{zero}}}\right) = 4\Delta\omega_P \tau_{delay}.$$
(2.21)

And for

$$\Delta \omega_P > \omega_{PFD} \sqrt{\frac{6 - \frac{\omega_{zero}}{\omega_{PFD}} - \sqrt{(6 - \frac{\omega_{zero}}{\omega_{PFD}})^2 - 4(1 + \frac{\omega_{zero}}{\omega_{PFD}})}}{2}}$$

we have

$$\arctan\left(\frac{\frac{4\frac{\Delta\omega_P}{\omega_{PFD}}(1-(\frac{\Delta\omega_P}{\omega_{PFD}})^2)}{1-6(\frac{\Delta\omega_P}{\omega_{PFD}})^2+(\frac{\Delta\omega_P}{\omega_{PFD}})^4}-\frac{4\Delta\omega_P}{\omega_{zero}}}{1+\frac{4\frac{\Delta\omega_P}{\omega_{PFD}}(1-(\frac{\Delta\omega_P}{\omega_{PFD}})^2)}{1-6(\frac{\Delta\omega_P}{\omega_{PFD}})^2+(\frac{\Delta\omega_P}{\omega_{PFD}})^4}\cdot\frac{4\Delta\omega_P}{\omega_{zero}}}\right) = -\pi + 4\Delta\omega_P.\tau_{delay}.$$
 (2.22)

The equation does not provide much insight into how each variable contributes to the pull-in range. To better understand the design criteria, let us start by analyzing a special case where the loop delay is negligible. In this case we can find the pull-in frequency by solving

$$\left(\frac{\Delta\omega_P}{\omega_{PFD}}\right)^4 + \left(\frac{\Delta\omega_P}{\omega_{PFD}}\right)^2 \left(\frac{\omega_{zero}}{\omega_{PFD}} - 6\right) + 1 + \frac{\omega_{zero}}{\omega_{PFD}} = 0, \qquad (2.23)$$

which results in

$$\Delta\omega_P = \omega_{PFD} \sqrt{\frac{6 - \frac{\omega_{zero}}{\omega_{PFD}} - \sqrt{(6 - \frac{\omega_{zero}}{\omega_{PFD}})^2 - 4(1 + \frac{\omega_{zero}}{\omega_{PFD}})}}{2}}.$$
(2.24)

However, in general, the equation needs to be solved numerically to find the pull-in range. The zero in the loop filter is at a much lower frequency than the pull-in frequency and hence contributes a $\pi/2$ phase shift. The effects of PFD bandwidth and time delay on the pull-in frequency are shown in Fig. 2.6.

Figure 2.6: Pull-in frequency as a function of loop delay and PFD bandwidth for $\omega_{zero} = 10$ MHz.

As shown, for large loop delay values the pull-in range is highly limited and improving the PFD performance does not have a significant effect on the pull-in range. For instance, with a 100 ps time delay in the loop, which is expected from the packaging traces, at least 20 GHz of 3-dB bandwidth is required for the PFD design to achieve only 800 MHz of pull-in range. Reducing the PFD bandwidth to 5 GHz reduces $f_{pull-in}$ to 400 MHz. This means the LO laser frequency must be within 400 MHz of the transmitter laser to achieve lock. The limited dependency of $f_{pull-in}$ on PFD bandwidth for large loop delays is more easily seen in Fig. 2.7.



Figure 2.7: Pull-in frequency as a function of loop delay and PFD bandwidth for  $\omega_{zero} = 10$  MHz.

Since an electrical signal is fed back to the tunable laser, a large loop delay is inherent to the architecture. As a result, a high level of integration is essential to enable reliable OPLL functionality.

#### Pull-in Time

To find the pull-in time, the signal traveling in the loop should be studied in the time domain. Derivation of time domain signal at each node when not in a lock state can be found in [35] and the pull-in time  $T_p$  is calculated by solving the following equation

$$\frac{\tau_{LF}\pi^2}{k_{LF} K_{PFD}^2 k_{VCO}^2} \int_{\Delta\omega_{init}}^{\Delta\omega_L} \frac{\Delta\omega}{\cos(\phi_{total})} d\Delta\omega = -\int_0^{T_p} dt, \qquad (2.25)$$

where  $\Delta \omega_{init}$  is the initial frequency offset, and  $\phi_{total}$  taking into account the phase shift from the loop delay was found in (2.17). The equation is not easily solvable and can be studied numerically for various loop components.

#### Lock-in Frequency

Lock-in range is defined as the largest interval  $[0, \Delta \omega_L)$  of frequency within pull-in range such that the loop achieves lock without cycle slipping after an abrupt change.

Assuming the phase shifts from the time delay and limited BW are negligible within lock-in range, the VCO frequency is modulated with the following signal

$$\omega_{out} = \omega_{free} + k_{LF} \cdot k_{VCO} PFD_{out}(\Delta\omega_L), \qquad (2.26)$$

where $k_{LF}$ is the LF amplitude variation at the desired frequency and $PFD_{out}(\Delta\omega_L)$ is the PFD output for the frequency error $\Delta\omega_L$. The PFD response was calculated and is shown in Fig. 2.8. Based on this response, the maximum abrupt frequency change the PLL can track equals

$$\Delta\omega_L = \frac{\sqrt{2}}{2} V_{PFD} k_{LF} k_{VCO}. \qquad (2.27)$$

Figure 2.8: Calculated PFD response normalized to $V_{PFD}$ as a function of phase error.

For $K_{VCO} = 40$ GHz/V, $R_{PD} = 0.9$ A/W, and the LO laser power set to its maximum, $P_{LO} = 10$ mW, we find

$$\Delta f_L = \frac{\Delta \omega_L}{2\pi} = 138\, k_{LF}\sqrt{P_{RX}}\ \text{GHz}.$$
(2.28)
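The coefficient in (2.28) can be reproduced by following (2.27) and (2.28) literally. The sketch below assumes $V_{PFD} = Z_{TIA} G_{mix} G_{add} R_{PD}\sqrt{2P_{LO}P_{RX}}$ with $Z_{TIA} = 380\ \Omega$, $G_{mix} = \frac{2}{\pi}$ and $G_{add} = 1$ as chosen earlier, enters $K_{VCO}$ as 40 (GHz/V), and takes $P_{RX}$ in watts; the received power in the usage line is a hypothetical value.

```python
import math

Z_TIA = 380.0             # ohm
G_MIX = 2.0 / math.pi     # passive mixer gain
G_ADD = 1.0               # ideal adder
R_PD = 0.9                # A/W
P_LO = 10e-3              # W
K_VCO_GHZ_PER_V = 40.0    # GHz/V

def lock_in_ghz(k_lf, p_rx_w):
    """Lock-in frequency in GHz, evaluating (2.27) and then dividing by 2*pi as in (2.28)."""
    v_pfd = Z_TIA * G_MIX * G_ADD * R_PD * math.sqrt(2.0 * P_LO * p_rx_w)
    delta_omega_l = (math.sqrt(2.0) / 2.0) * v_pfd * k_lf * K_VCO_GHZ_PER_V   # (2.27)
    return delta_omega_l / (2.0 * math.pi)                                    # (2.28)

# Prefactor multiplying k_LF * sqrt(P_RX)
coefficient = lock_in_ghz(k_lf=1.0, p_rx_w=1.0)
print(f"coefficient = {coefficient:.1f}")

# Usage: unity LF gain and a hypothetical received power of 100 uW
print(f"lock-in range = {lock_in_ghz(1.0, 100e-6):.2f} GHz")
```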

The lock-in frequency as a function of $k_{LF}$ and signal power $P_{RX}$ is shown in Fig. 2.9. This figure provides guidance on the LF design in order to ensure that frequency lock can be achieved for the minimum detectable signal.

Note that, for simplicity, the effects of loop delay were neglected while calculating the lock-in range; however, as discussed, the inherently large loop delay strongly affects stability, pull-in frequency, and ultimately lock-in range.

#### Lock-in Time

Figure 2.9: Lock-in frequency as a function of $P_{RX}$ and $K_{LF}$.

In the previous section, the step response for the linear PLL model was studied. As discussed, for a second order system, the choice of damping factor, $\zeta = \frac{\tau_{zero}\omega_n}{2}$ where $\omega_n = \sqrt{\frac{k_{PFD}k_{VCO}}{\tau_{LF}}}$, affects how fast the system responds to an instantaneous change in input frequency. $\zeta = \frac{1}{\sqrt{2}}$ is a reasonable choice to balance the overshoot and rise time, which results in a settling time equal to $t_s = \frac{4\sqrt{2}}{\omega_n}$. For this case, the lock-in time is found to be

$$t_L = t_s = \frac{4\sqrt{2\tau_{LF}}}{\sqrt{k_{PFD}k_{VCO}}}.$$
(2.29)
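The relations between the loop filter time constants, the damping factor, and the lock-in time in (2.29) can be collected in a short calculation. All numerical values below are hypothetical and chosen only to illustrate the procedure; $k_{VCO}$ is taken in rad/s/V and $\tau_{LF}$ in seconds.

```python
import math

def natural_frequency(k_pfd, k_vco, tau_lf):
    """omega_n = sqrt(k_PFD * k_VCO / tau_LF), in rad/s."""
    return math.sqrt(k_pfd * k_vco / tau_lf)

def lock_in_time(k_pfd, k_vco, tau_lf):
    """Lock-in time of (2.29), which assumes zeta = 1/sqrt(2)."""
    return 4.0 * math.sqrt(2.0 * tau_lf) / math.sqrt(k_pfd * k_vco)

# Hypothetical loop parameters
k_pfd = 0.2        # V/rad
k_vco = 2.5e11     # rad/s/V
tau_lf = 1e-6      # s

omega_n = natural_frequency(k_pfd, k_vco, tau_lf)
tau_zero = math.sqrt(2.0) / omega_n      # places the zero so that zeta = 1/sqrt(2)
zeta = tau_zero * omega_n / 2.0
t_l = lock_in_time(k_pfd, k_vco, tau_lf)

print(f"omega_n = {omega_n:.3e} rad/s, zeta = {zeta:.3f}, t_L = {t_l * 1e9:.1f} ns")
```

Because $k_{PFD}$ scales with the received signal amplitude, a fixed $\tau_{zero}$ yields the intended damping at only one signal level.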

Consequently, the lock-in time depends on the loop filter design as well as on $k_{PFD}$ and $k_{VCO}$.

#### Summary

To summarize the findings in this section, the OPLL design proves to be challenging, raising concerns regarding stability and frequency locking range. The loop stability as well as the locking dynamics depend on signal levels, and a dynamic loop filter is essential. Moreover, the loop delay drastically affects the stability of the loop and limits the choice of loop filter components for fast tracking. To address these drawbacks, this dissertation will focus on two alternative architectures to implement the CORX.

# 2.3 Electrical Phase Lock Loop (EPLL) Heterodyne Detection

As reviewed in the previous section, the loop delay has a significant effect on the locking dynamics. For instance, if the output of the loop filter has a 50 ps delay before reaching the laser, the pull-in range remains below 5 GHz, meaning the initial LO frequency should be within that range to ensure locking. As discussed, the inherently large loop delay in OPLLs can cause instability, and chip-scale integration is essential. To minimize the loop delay, the carrier recovery may be performed on a single chip. Fig. 2.10 shows the architecture for a heterodyne optical receiver, where the LO laser is free running at an offset from the transmitter laser frequency. The electrical carrier recovery circuitry follows the offset frequency between the lasers, which may drift up to 10 GHz.



Figure 2.10: Coherent optical data link with EPLL carrier recovery circuitry.

The PD currents depicted in Fig. 2.10 are calculated in (2.1). In the wireless heterodyne receiver, depicted in Fig. 2.11, the intermediate-frequency signal is down converted and low pass filtered to remove frequency components at $2\omega_{IF}$. However, unlike the wireless architecture, the optical receiver is wide-band, and low pass filtering at $2\omega_{IF}$ may filter the data $m_I$ and $m_Q$ as well.

Figure 2.11: Heterodyne wireless data link with PLL carrier recovery circuitry.

To down convert the data to base-band, the differential I/Q signals $\Delta I = m_I \cos(\omega_{IF}t - \phi) + m_Q \sin(\omega_{IF}t - \phi)$ and $\Delta Q = -m_I \sin(\omega_{IF}t - \phi) + m_Q \cos(\omega_{IF}t - \phi)$ are mixed down with $\cos(\omega_{IF}t - \phi)$ and $\sin(\omega_{IF}t - \phi)$.

$$\begin{bmatrix} A\\ B\\ C\\ D \end{bmatrix} = \begin{bmatrix} \Delta I \cos(\omega_{IF}t - \phi)\\ \Delta I \sin(\omega_{IF}t - \phi)\\ \Delta Q \cos(\omega_{IF}t - \phi)\\ \Delta Q \sin(\omega_{IF}t - \phi) \end{bmatrix} = \begin{bmatrix} \frac{1}{2}(m_I + m_I \cos(2\omega_{IF}t) + m_Q \sin(2\omega_{IF}t))\\ \frac{1}{2}(m_Q + m_I \sin(2\omega_{IF}t) - m_Q \cos(2\omega_{IF}t))\\ \frac{1}{2}(m_Q - m_I \sin(2\omega_{IF}t) + m_Q \cos(2\omega_{IF}t))\\ \frac{1}{2}(-m_I + m_I \cos(2\omega_{IF}t) + m_Q \sin(2\omega_{IF}t))\end{bmatrix}.$$
(2.30)

Based on (2.30), $m_I$ is recovered from $A-D$ and $m_Q$ is recovered from $B+C$. Fig. 2.10 shows the circuit block diagram to generate the required signals. Fig. 2.12 shows the proposed circuit design to perform the above-mentioned analog signal processing. The front-end consists of four double-balanced mixers in which the in-phase and quadrature signals are each down converted to base-band with $\cos(\omega_{IF}t-\phi)$ and $\sin(\omega_{IF}t-\phi)$ to generate the outputs shown in (2.30). The outputs are then combined as expressed to remove the unwanted signals.
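The cancellation of the $2\omega_{IF}$ terms can be verified numerically. The following is a minimal sketch, assuming ideal unit-gain mixers, a QVCO perfectly locked to $\omega_{IF}t - \phi$, and data symbols of $\pm 1$; $D$ is formed with the $\sin$ mixing term, which is what the right-hand side of (2.30) corresponds to.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)   # samples of omega_IF * t - phi
m_i = rng.choice([-1.0, 1.0], size=n)           # in-phase data (assumed +/-1)
m_q = rng.choice([-1.0, 1.0], size=n)           # quadrature data (assumed +/-1)

# Differential I/Q signals at the intermediate frequency
delta_i = m_i * np.cos(theta) + m_q * np.sin(theta)
delta_q = -m_i * np.sin(theta) + m_q * np.cos(theta)

# Outputs of the four mixers in (2.30)
a = delta_i * np.cos(theta)
b = delta_i * np.sin(theta)
c = delta_q * np.cos(theta)
d = delta_q * np.sin(theta)

# Combining the mixer outputs cancels the 2*omega_IF terms without any filtering
recovered_i = a - d
recovered_q = b + c
print(np.allclose(recovered_i, m_i), np.allclose(recovered_q, m_q))
```

The data are recovered at every sample instant, which is why no low pass filter at $2\omega_{IF}$ is needed in the wide-band optical receiver.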

A quadrature VCO (QVCO) generates $\cos(\omega_{IF}t - \phi)$ and $\sin(\omega_{IF}t - \phi)$, and an EPLL tracks the offset frequency and phase to down convert the signal using the circuit shown in Fig. 2.12. The rest of this section reviews a proposed QVCO design, its characteristics, and the locking dynamics of the EPLL.



Figure 2.12: EPLL front-end circuit.

### 2.3.1 Quadrature VCO

Low-noise oscillators with high oscillation frequency are widely used in wireless communication. For instance, a 110 GHz VCO is proposed in [47], although with a limited tuning range. LC tank oscillators with a wide tuning range and low phase noise have been widely studied [48, 49, 50, 51]. Ring oscillators, in contrast, easily generate multiple phases while consuming relatively low power, but with worse noise compared to LC oscillators [52]. A ring oscillator with improved phase noise is proposed in [53]. For the EPLL architecture, we took a similar approach. The proposed QVCO schematic is shown in Fig. 2.13.

The QVCO consists of a two stage differential ring oscillator to generate 0°, 180°, 90°, 270° as needed for the down conversion shown in Fig. 2.11.

Figure 2.13: QVCO circuit schematic.

The QVCO consumes 10 mW and the schematic simulation, depicted in Fig. 2.14, shows a 14 GHz tuning range, as needed to compensate for any drift in the LO laser frequency. The center frequency can be adjusted by varying the transistor sizing, and hence $g_m$, depending on the link architecture and laser frequency offset.



Figure 2.14: Simulated oscillation frequency as a function of control voltage.

We next analyze the loop dynamics of the proposed architecture and then review the coherent circuit design and co-simulations of the electronic and photonic ICs.

# 2.3.2 Dynamics of the EPLL

The effects of loop delay on OPLL stability and locking were thoroughly discussed in the previous section. The main advantage of the EPLL architecture is that it minimizes the loop delay; however, the electrical VCO has a smaller tuning range than the tunable laser. In this section, the effects of the limited $K_{VCO}$ and the minimized loop delay are reviewed.

#### Pull-in Frequency

Fig. 2.6 and 2.7 show the pull-in range dependency on the loop delay. Restricting the delay to 20 ps in the design will provide more than 5 GHz of pull-in range with  $\omega_{PD} = 10$  GHz. As shown, for smaller loop delay, the response is more sensitive to the PFD bandwidth and hence to improve the pull-in range further, wide-band PFD design is required.

#### Pull-in Time

The pull-in time is calculated in the same way as for the OPLL design.

#### Lock-in Frequency

The lock-in range depends on $K_{VCO}$, which is smaller than the laser tunability. For the QVCO design described in Section 2.3.1, $K_{VCO} = 14$ GHz/V. With the same PFD design, the lock-in frequency $f_{lock-in}$ for the EPLL is

$$\Delta f_L = \frac{\Delta \omega_L}{2\pi} = 24.5\, k_{LF} \sqrt{P_{RX}}\ \text{GHz}.$$
(2.31)

The lock-in range is shown in Fig. 2.15. To compensate for limited  $K_{VCO}$  the LF may be redesigned to provide higher gain.

#### Lock-in Time

Previously, the OPLL lock-in time was found to be $t_L = \frac{4\sqrt{2\tau_{LF}}}{\sqrt{k_{PFD}k_{VCO}}}$; hence, the smaller $k_{VCO}$ in the EPLL increases the lock-in time for the same loop filter design.

Figure 2.15: Lock-in frequency for the EPLL as a function of $P_{RX}$ and $K_{LF}$.

# 2.4 Self-Homodyne Detection

To meet strict power and cost requirements and to further simplify the use of coherent optical architectures in low-power environments, self-homodyne architectures have gained popularity. A self-homodyne receiver simplifies the LO requirements by receiving the unmodulated carrier from the transmitter, either through a separate fiber or on an orthogonal polarization.

Fig. 2.16a illustrates a self-homodyne link in which a portion of the laser power is split off and forwarded to the RX on a separate fiber for use as the LO, $P_{LO}$. The forwarded laser is then used as a reference for the receiver; however, because the LO experiences a different delay than the modulated signal, the demodulation circuitry needs to lock the phase of the LO laser with an optical delay lock loop (ODLL) [54]. Similar to the OPLL architecture, a Costas PFD can be used to drive a high-speed active optical phase tuner that adjusts the phase of the LO laser.



Figure 2.16: Self-homodyne optical link a) using a separate fiber to forward the LO laser and ODLL for phase locking b) using an orthogonal polarization to forward LO laser.

Fig. 2.16b shows an alternative to using an extra fiber, in which the LO is sent on the orthogonal polarization. This removes the need for phase locking since the LO and signal now travel on a single fiber; however, it eliminates the ability to double the data rate by sending data on both polarizations.

# 2.4.1 Dynamics of the ODLL

Fig. 2.17 represents a linear phase model for the DLL.

Figure 2.17: A linear model for the DLL. The PFD and LPF blocks model the Costas PFD. The LF is modeled as an ideal integrator. The voltage-controlled delay line (VCDL) adjusts the output phase based on the input voltage.

Similar to the PLL, the PFD and LPF model the Costas loop as a first-order linear system. The role of the LF is to provide infinite gain at low frequencies and to remove the high-frequency components of the PFD output; hence, an integrator is a proper choice. The VCDL provides a variable time delay, or phase shift, in the signal as a function of the voltage applied to it. For now, let us assume that the VCDL has a linear phase response and can be modeled as $\phi_{out} = \phi_{in} + K_{VCDL}V_{cont}$. Using the linear model, the phase through the loop obeys the following equation

$$\phi_{out} = \phi_{in} + \frac{K_{LF}}{S} k_{PFD} k_{VCDL} (\phi_{in} - \phi_{out}).$$
(2.32)

Consequently, the phase error,  $\phi_e = \phi_{in} - \phi_{out}$ , follows

$$\phi_e \cdot \frac{K_{LF}}{S} k_{PFD} \cdot k_{VCDL} = -\phi_e. \tag{2.33}$$

For the equation to hold at all frequencies, $\phi_e$ must approach 0. In the time domain, if the input phase fluctuates slowly, the output phase follows the input phase with a time lag, and the phase error decays exponentially. So we have

$$\phi_{out} = \phi_{in} - \phi_{e0} e^{\frac{-t}{\tau_L}},\tag{2.34}$$

where $\phi_{e0}$ is the initial phase error and $\tau_L = \frac{1}{K_{LF}k_{PFD}k_{VCDL}}$ is the loop time constant. At $t = 7\tau_L$, the phase error has decayed to about $10^{-3}$ of its initial value. Hence, we need to ensure that the loop tracks the phase faster than the phase varies. The VCDL in the optical domain can be implemented with either an active PN phase tuner or a thermal phase tuner. The active phase tuner responds faster and is therefore more suitable. For the PN phase tuner, the phase efficiency $V_{\pi}L$ (V·cm) is defined as the voltage required for a $\pi$ radian phase shift in a 1 cm long device; it is technology dependent and is about 1.8 V·cm for the 45CLO technology. For this case, we have $k_{VCDL} = \frac{\pi}{V_{\pi}} = \frac{\pi \cdot L}{1.8}$ rad/V, which depends on the length $L$ (in cm) of the active device. For a 3 mm device, we have $k_{VCDL} \approx 0.5$ rad/V.
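The step from the frequency-domain relation (2.33) to the exponential settling of the phase error can be made explicit. The following is a short sketch, assuming a linear loop and an input phase that is constant for $t > 0$:

$$\begin{aligned}
s\,\phi_e &= -K_{LF}\,k_{PFD}\,k_{VCDL}\,\phi_e
\quad\Longleftrightarrow\quad
\frac{d\phi_e}{dt} = -\frac{\phi_e}{\tau_L},\\
\phi_e(t) &= \phi_{e0}\,e^{-t/\tau_L},\qquad
\frac{\phi_e(7\tau_L)}{\phi_{e0}} = e^{-7} \approx 9.1\times 10^{-4}.
\end{aligned}$$

The last ratio is the origin of the $7\tau_L$ settling criterion used in the text.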

Revisiting (2.5), the PFD gain is a function of the signal and LO laser power and equals $K_{PFD} = 154R_{PD}\sqrt{2P_{LO}P_{RX}}$ V/rad for the same Costas circuit design. In the self-homodyne approach, the LO is forwarded from the transmit side, and the loss in the LO path limits $K_{PFD}$ as well as the receiver sensitivity. Assuming a 10 dB loss in the LO path compared to the OPLL design, we use $P_{LO} = 1$ mW for this analysis. With $R_{PD} = 0.9$ A/W, we have $K_{PFD} = 6.2\sqrt{P_{RX}}$ V/rad.

The loop time constant, $\tau_L = \frac{1}{3.1K_{LF}\sqrt{P_{RX}}}$, is plotted as a function of the signal power $P_{RX}$ and $K_{LF}$ in Fig. 2.18. The figure shows the LF design requirement that ensures the loop locks within the required time frame for the minimum detectable signal on the received channel.
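The numbers in this subsection can be reproduced with a few lines of Python. This is a minimal sketch that uses only the values quoted in the text ($V_{\pi}L = 1.8$ V·cm, a 3 mm tuner, $R_{PD} = 0.9$ A/W, $P_{LO} = 1$ mW). The integrator gain `k_lf` (taken here in 1/s) and the received power in the usage example are hypothetical, and the function names are illustrative.

```python
import math

V_PI_L = 1.8   # V*cm, PN phase-tuner efficiency quoted in the text
R_PD = 0.9     # A/W, photodiode responsivity
P_LO = 1e-3    # W, forwarded LO power after the assumed 10 dB path loss


def k_vcdl(length_cm):
    """Phase-tuner gain in rad/V for a device of the given length."""
    return math.pi * length_cm / V_PI_L


def k_pfd(p_rx_w):
    """Costas PFD gain in V/rad for a received power in watts."""
    return 154.0 * R_PD * math.sqrt(2.0 * P_LO * p_rx_w)


def loop_time_constant(k_lf, p_rx_w, length_cm=0.3):
    """tau_L = 1 / (K_LF * k_PFD * k_VCDL), in seconds."""
    return 1.0 / (k_lf * k_pfd(p_rx_w) * k_vcdl(length_cm))


def settling_time(k_lf, p_rx_w, n_tau=7.0):
    """Time for the phase error to decay by exp(-n_tau)."""
    return n_tau * loop_time_constant(k_lf, p_rx_w)


if __name__ == "__main__":
    print("k_VCDL [rad/V]:", k_vcdl(0.3))
    print("K_PFD / sqrt(P_RX):", k_pfd(1.0))
    # Hypothetical operating point: K_LF = 1e9 1/s, P_RX = 100 uW.
    print("7*tau_L [s]:", settling_time(1e9, 100e-6))
```

With the unrounded $k_{VCDL} = \pi \cdot 0.3 / 1.8$, the product $k_{PFD}k_{VCDL}$ evaluates to roughly $3.2\sqrt{P_{RX}}$; the coefficient 3.1 in the text follows from rounding $k_{VCDL}$ to 0.5 rad/V.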



Figure 2.18: Lock time as a function of  $P_{RX}$  and  $K_{LF}$ .

Thus far we assumed the LF is based on an ideal integrator; however, in practice, the integrator may saturate before generating enough swing to adjust the phase.

# 2.5 Conclusion

To conclude, in this chapter we reviewed three different approaches to implementing a coherent optical link with analog carrier recovery. Carrier recovery for the OPLL homodyne architecture is challenging because it requires an on-chip tunable laser, which places strict requirements on the photonic IC (PIC). The self-homodyne architecture simplifies the carrier recovery and the PIC design by replacing the tunable laser with an optical phase tuner that adjusts the incoming LO phase. However, the sensitivity is degraded because, for the same laser power, the optical loss incurred in forwarding the LO from the transmit side limits the received power.

The heterodyne architecture requires a higher front-end bandwidth than the homodyne design to achieve the same data rate. This is because the signals at the PDs are not at baseband but are up-converted by the offset frequency between the TX and RX lasers. The higher bandwidth requirement may also result in worse integrated noise and higher power consumption. In this approach the PIC design is simple; however, all the complexity is transferred to the electronic front-end.

|                  | Homodyne   | Self-homodyne    | Heterodyne    |
|------------------|------------|------------------|---------------|
| Front-end BW     | Moderate   | Moderate         | Large         |
| Carrier recovery | Hard       | Easy             | Moderate      |
| Photonic IC      | Hard       | Moderate         | Easy          |
| Sensitivity      | Best       | Moderate         | Worst         |
| Pitfalls         | LO leakage | Additional fiber | I/Q imbalance |

 Table 2.1: Architecture comparison

# Chapter 3

# Coherent Detection Power Optimization

# 3.1 Introduction

In chapter 1, we reviewed the importance of continual improvement in the bandwidth and power consumption of optical transceivers. Moreover, we studied different analog coherent architectures for implementing energy-efficient short-range optical links.

In this chapter, we present an analysis for coherent link optimization.

The CORX consists of an optical 90° hybrid which mixes the local oscillator (LO) and signal electric fields, respectively  $E_{LO}$  and  $E_{RX}$ , that impinge on a photodetector (PD) as illustrated in Fig. 3.1. In terms of the optical power, the fields can be expressed as a function of the LO power  $P_{LO}$ , the received optical power  $P_{RX}$  and the relative frequency and phase of each, e.g.  $\omega_{LO}$  is the LO frequency and  $\phi_{LO}$  is the LO phase of electric field. Therefore, the electric fields are expressed respectively as

$$E_{LO} = \sqrt{P_{LO}} e^{j(\omega_{LO}t + \phi_{LO})}.$$
(3.1)



Figure 3.1: Single polarization coherent optical data link with.

$$E_{RX} = \sqrt{P_{RX}} e^{j(\omega_{RX}t + \phi_{RX})}.$$
(3.2)

The transmitted optical power is found from the received optical power by accounting for the transmitter loss $L_{TX}$, i.e., $P_{LAS} = P_{RX}L_{TX}$, where $P_{LAS}$ is the transmitter input optical power. Channel loss is neglected in short-range data center interconnects (DCI) since it is much smaller than the transmitter loss. The field incident on each PD of the quadrature differential pairs can be found by applying (3.1) and (3.2) according to

$$\begin{bmatrix} E_1 \\ E_2 \\ E_3 \\ E_4 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}(E_{RX} + jE_{LO}) \\ \frac{1}{2}(E_{RX} - jE_{LO}) \\ \frac{1}{2}(E_{RX} - E_{LO}) \\ \frac{1}{2}(-E_{RX} + E_{LO}) \end{bmatrix}$$
(3.3)

For a locked phase and frequency between signal and LO, i.e.  $\omega_{LO} = \omega_{RX}$  and  $\phi_{LO} = \phi_{RX}$ , the amplitude of the current at each PD is attributed to the optical power converted into electrical current through the PD responsivity,  $R_{PD}$  [55].

$$I_{PD} = \frac{R_{PD}}{4} (P_{LO} + P_{RX} + 2\sqrt{P_{LO}P_{RX}}).$$
(3.4)

Usually $P_{LO}$ is much larger than $P_{RX}$, so the $P_{LO}$ term generates a DC current at the PD while the cross term $2\sqrt{P_{LO}P_{RX}}$ generates the modulated current. The peak-to-peak current swing at each PD is then $I_{pp} = \frac{1}{2}R_{PD}\sqrt{P_{LO}P_{RX}} = \frac{1}{2}R_{PD}\sqrt{\frac{P_{LAS}P_{LO}}{L_{TX}}}$.

The PD current is then amplified using a transimpedance amplifier (TIA) to generate a voltage for sampling. The overall link efficiency depends on the DC power required to amplify the PD current to a minimum sampling voltage as well as on the optical power needed to generate a minimum detectable current for the TIA.

In order to design an energy-efficient coherent optical link, section 3.2 reviews different TIA architectures and their noise, bandwidth (BW), gain, and power consumption trade-offs. The chapter then analyzes the effect of these factors on the overall link performance. A detailed power consumption analysis is performed to determine the required optical power and the dc power consumption of the TIA, and to establish a methodology for optimizing the link efficiency. Finally, we review the coherent transmitter design and the trade-offs between driver swing and optical power dissipation.

# 3.2 Transimpedance Amplifier (TIA) Architecture

The transimpedance amplifier (TIA) plays an important role in the optical receiver design. The TIA must have a wide bandwidth to support high data rates as well as a high gain to minimize the noise contribution of the following stages. Power consumption is also an important design criterion.

In this section we review how these design requirements trade-off with one another for different TIA architectures and discuss design optimization based on the application requirements.

# 3.2.1 Open Loop TIA

### Common Gate (CG) TIA

Fig. 3.2a shows a circuit schematic for a common-gate TIA with the PD modeled as an ideal current source and a capacitance  $C_{PD}$ .



Figure 3.2: Common gate TIA schematic (a) ideal dc bias (b) circuit implementation for the dc bias.

Let us start with the low-frequency behavior of the TIA. Neglecting channel length modulation, all of the input current flows through the output load; hence, the output voltage equals $V_{out} = R_D I_{in}$ and the low-frequency transimpedance equals $R_T = R_D$. The input resistance of the TIA together with the input capacitance determines the input pole and the BW. For the current source implementation shown in Fig. 3.2b, the input resistance of the common-gate TIA equals the input resistance of $M_1$ in parallel with the output resistance of the current source $M_C$.

$$R_{IN} = \frac{r_{ds} + R_D}{1 + g_m \cdot r_{ds}} || r_{ds,c}.$$
(3.5)

Assuming that  $R_D \ll r_{ds}$  the input resistance can be estimated by  $\frac{1}{g_m}$ . The input pole for the CG TIA equals

$$f_{in} = \frac{1}{2\pi R_{IN} C_{IN}}.$$
(3.6)

where $C_{IN} = C_{PD} + c_{gs} + c_{gd,c} + c_{db,c}$. The output pole roughly equals

$$f_{out} = \frac{1}{2\pi R_D C_{out}}.$$
(3.7)

where $C_{out} = c_{gd} + c_{db} + C_{in,next}$, with $C_{in,next}$ denoting the loading capacitance of the next stage. $C_{PD}$ is typically the largest capacitance, so we can assume that the 3-dB bandwidth is determined by the input pole. To push the input pole to higher frequencies, $M_1$ should be biased at a higher dc current, which requires either a larger $M_C$ device or more headroom for the device. A higher dc bias also increases the voltage drop across $R_D$ and limits the maximum gain for a given supply voltage. Before moving to the noise performance, let us quantify the trade-offs between gain, bandwidth, and power consumption for a CG TIA.

The current flowing through a short channel transistor can be estimated with

$$I_D = v_{sat} C_{ox} w (v_{gs} - v_{th}).$$
(3.8)

where $w$ is the device width, $C_{ox}$ is the gate oxide capacitance per unit area, $v_{gs}$ is the gate-source voltage of the device, and $v_{th}$ is the transistor threshold voltage. The minimum supply voltage that keeps the devices in saturation equals

$$V_{DD} = I_D R_D + v_{gs,1} + v_{gs,C} - 2v_{th} = v_{sat} C_{ox} w_1 (v_{gs,1} - v_{th}) R_D + v_{gs,1} + v_{gs,C} - 2v_{th}.$$
 (3.9)

Where  $v_{gs,c} = \frac{w_1}{w_c} v_{gs,1} + (1 - \frac{w_1}{w_c}) v_{th}$ .

Estimating the 3dB BW with the input pole we can roughly find

$$BW_{3dB} = \frac{g_{m,1}}{2\pi (C_{PD} + c_{gs} + c_{gd,c} + c_{db,c})}.$$
(3.10)

Where  $g_{m,1}$  is calculated from

$$g_{m,1} = v_{sat} C_{ox} w_1.$$
(3.11)

Device capacitance also depends on the device size: $c_{gs} = \frac{2}{3}w_1LC_{ox} + w_1C_{ov}$, $c_{gd,c} = w_cC_{ov}$, and $c_{db,c} = C_jw_cE + 2(w_c + E)C_{j,sw}$. Here $C_{ov}$ is the overlap capacitance between gate and source/drain per unit width, $C_j$ is the bottom-plate junction capacitance per unit area between source/drain and substrate, $C_{j,sw}$ is the sidewall junction capacitance per unit length between source/drain and substrate, and $E$ is the source/drain length [56]. For simplicity, we neglect the junction capacitance between source/drain and substrate. The 3-dB bandwidth can then be estimated as a function of device width, as in (3.12).

$$BW_{3dB} = \frac{v_{sat}C_{ox}}{2\pi(\frac{C_{PD}}{w_1} + \frac{2}{3}LC_{ox} + C_{ov} + \frac{w_c}{w_1}C_{ov})}.$$
(3.12)

To improve the BW, $M_1$ should be a large device, while the current source width $w_C$ should be minimized, which increases the voltage required across it to keep the current constant. In practice, increasing $w_1$ also increases the output capacitance, which eventually starts to limit the BW; hence, the width of $M_1$ can be optimized to maximize the BW.
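The width dependence in (3.12) is easy to explore numerically. The sketch below evaluates (3.12) for a few values of $w_1$. Every device number in it (saturation velocity, oxide capacitance, gate length, overlap capacitance, PD capacitance) is an illustrative assumption rather than 45CLO data, and the function names are hypothetical.

```python
import math

# All device numbers below are illustrative assumptions, not process data.
V_SAT = 1.0e5     # carrier saturation velocity, m/s (assumed)
C_OX = 2.0e-2     # gate oxide capacitance per unit area, F/m^2 (assumed)
L_GATE = 45e-9    # gate length, m (assumed)
C_OV = 3.0e-10    # overlap capacitance per unit width, F/m (assumed)
C_PD = 20e-15     # photodiode capacitance, F (assumed)


def cg_bandwidth_hz(w1, w_c, c_pd=C_PD):
    """3-dB bandwidth of the CG TIA from (3.12); widths in metres."""
    cap_per_width = (c_pd / w1 + (2.0 / 3.0) * L_GATE * C_OX
                     + C_OV + (w_c / w1) * C_OV)
    return V_SAT * C_OX / (2.0 * math.pi * cap_per_width)


def cg_bandwidth_limit_hz():
    """Asymptote of (3.12) once the PD and M_C terms become negligible."""
    cap_per_width = (2.0 / 3.0) * L_GATE * C_OX + C_OV
    return V_SAT * C_OX / (2.0 * math.pi * cap_per_width)


if __name__ == "__main__":
    for w1_um in (5, 10, 20, 40, 80):
        bw = cg_bandwidth_hz(w1_um * 1e-6, w_c=10e-6)
        print(f"w1 = {w1_um:3d} um -> BW = {bw / 1e9:6.1f} GHz")
    print(f"asymptote = {cg_bandwidth_limit_hz() / 1e9:.1f} GHz")
```

Note that (3.12) alone increases monotonically with $w_1$ and saturates at the intrinsic device limit. The optimum width mentioned in the text only appears once the output pole, which (3.12) does not model, is included.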

Revisiting (3.9), we can find the minimum dc power dissipation for a given $M_1$ device width and overdrive voltage.

$$P_{dc,min} = I_D V_{DD} = w_1 v_{sat} C_{ox} (v_{gs,1} - v_{th})^2 [v_{sat} C_{ox} w_1 R_D + 1 + \frac{w_1}{w_c}].$$
 (3.13)
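For reference, (3.13) follows from (3.8) and (3.9) in three steps. The sketch below assumes, as in the expression for $v_{gs,c}$ above, that $M_1$ and $M_C$ carry the same current and share the same $v_{sat}$, $C_{ox}$, and $v_{th}$:

$$\begin{aligned}
v_{gs,c} - v_{th} &= \frac{w_1}{w_c}\,(v_{gs,1} - v_{th}),\\
V_{DD} &= (v_{gs,1} - v_{th})\left[v_{sat} C_{ox} w_1 R_D + 1 + \frac{w_1}{w_c}\right],\\
P_{dc,min} &= I_D V_{DD} = w_1 v_{sat} C_{ox}\,(v_{gs,1} - v_{th})^2\left[v_{sat} C_{ox} w_1 R_D + 1 + \frac{w_1}{w_c}\right].
\end{aligned}$$

The quadratic dependence on the overdrive voltage is what makes a small $v_{gs,1} - v_{th}$ the most effective lever for reducing power.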

To conclude, we need a large  $M_1$  device while the current source device width  $w_C$  needs to be minimized to improve BW. The transimpedance is determined by  $R_D$  which also needs to be maximized. Moreover, the overdrive voltage across  $M_1$  must be minimized to reduce required supply voltage and power consumption. Let us now analyze the noise contribution of each transistor and load resistance.



Figure 3.3: Noise model schematic including thermal noise of  $R_D$ , and channel noise of  $M_1, M_c$ 

Fig. 3.3 shows the thermal noise contribution of $R_D$ and the channel noise of $M_1$ and $M_c$. First, ignoring channel length modulation, all of $i_{n,R_D}$ flows through $R_D$ and generates an output noise voltage of $4kTR_D$. Likewise, all of $i_{n,M_c}$ flows through $R_D$ and generates an output noise voltage of $4kT\gamma g_{m,c}R_D^2$. For infinite output resistance of $M_1$ and $M_c$, all of $i_{n,M_1}$ circulates within $M_1$ and it does not contribute any noise at the output. The input-referred noise current is therefore

$$i_{n,in}^2 = 4kT(\gamma g_{m,c} + \frac{1}{R_D}).$$
(3.14)

So, to minimize the input-referred noise, $R_D$ must be maximized and $g_{m,c}$ minimized, which again leads to an increase in the overdrive voltage and the required supply. In other words, the noise contributions of $R_D$ and $M_c$ trade off with one another. Since $g_{m,c} = \frac{I_D}{v_{gs,c}-v_{th}}$, the overdrive voltage can be expressed as $v_{gs,c} - v_{th} = \frac{4kT\gamma I_D}{i_{n,M_c}^2}$. We can also find $V_{DD,min} = I_D R_D + (v_{gs,c} - v_{th})(1 + \frac{w_c}{w_1})$, which results in

$$V_{DD,min} = \frac{4kTI_D}{i_{n,R_D}^2} + \frac{4kT\gamma I_D}{i_{n,M_c}^2} (1 + \frac{w_c}{w_1}).$$
(3.15)

Equation (3.15) indicates that, for a given supply voltage and bias current, reducing the noise contribution of $M_c$ necessarily increases the noise contribution of $R_D$.

Now let us analyze how channel length modulation contributes to the noise. We can find that the output voltage noise from  $M_1$  is equal to

$$v_{n,out,M_1} = -\frac{R_D r_{ds,1}}{r_{ds,c}(1 + g_{m,1}r_{ds,1}) + R_D} i_{n,M_1} \approx -\frac{R_D}{r_{ds,c}g_{m,1}} i_{n,M_1}.$$
 (3.16)

Consequently, we have $v_{n,out,M_1}^2 = 4kT\gamma \frac{R_D^2}{r_{ds,c}^2 g_{m,1}}$ and $i_{n,in,M_1}^2 = \frac{v_{n,out,M_1}^2}{R_D^2} = \frac{4kT\gamma}{r_{ds,c}^2 g_{m,1}}$. Taking into account the noise contribution from $M_1$, the total input-referred noise current equals

$$i_{n,in}^2 = 4kT(\gamma g_{m,c} + \frac{\gamma}{r_{ds,c}^2 g_{m,1}} + \frac{1}{R_D}).$$
(3.17)

In conclusion, in a CG TIA the design parameters trade off closely with one another and are tightly coupled. As a result, there is little freedom to satisfy all design criteria while minimizing power consumption. In the next section we review a modified CG TIA design that addresses these challenges.

## Gain-boosted Common-gate TIA

The CG design can be modified with a negative feedback between source and gate as shown in Fig. 3.4a.



Figure 3.4: (a) Gain boosted common gate TIA (b) circuit implementation for the gain boosting amplifier.

Similar to the CG design, neglecting channel length modulation, all the input current flows through $R_D$, and hence the low-frequency transimpedance equals $R_T = R_D$. The transimpedance including channel length modulation is given in (3.18), which shows that the effect of channel length modulation is negligible even for $A_v = 0$.

$$R_T = R_D \left(1 - \frac{r_{ds,1} + R_D}{r_{ds,1} + r_{ds,c} + g_{m,1} r_{ds,1} r_{ds,c} (1 - A_v) + R_D}\right).$$
(3.18)

Let us find the input resistance for this modified circuit to evaluate how the input pole deviates from the CG design. We can find that

$$R_{IN} = \frac{r_{ds,1} + R_D}{1 + g_m \cdot r_{ds,1} (1 - A_v)} || r_{ds,c}.$$
(3.19)

Assuming $R_D \ll r_{ds,1}$, the input resistance reduces to $R_{IN} = \frac{1}{g_m (1-A_v)}$. Ignoring device parasitic capacitance, the input pole is $P_{IN} = \frac{1}{2\pi C_{PD}R_{IN}}$, which is $(1 - A_v)$ times higher than in the CG design, so the BW is more likely limited by the output pole. However, in practice the effective gate capacitance of the modified design increases by $(1 - A_v)$, which limits the BW improvement. Fig. 3.4b shows an implementation of the feedback amplifier with the parasitic components of each device. Looking into the source of $M_1$, the device capacitance can be estimated as $(c_{gs,1} + c_{gd,f})(1 - A_v) + c_{gs,f} + c_{gd,c} + c_{db,c}$, where $A_v = -g_{m,f}R_L$. The input pole is given in (3.20).

$$P_{IN} = \frac{g_{m,1}(1 + g_{m,f}R_L)}{2\pi(C_{PD} + (c_{gs,1} + c_{gd,f})(1 + g_{m,f}R_L) + c_{gs,f} + c_{gd,c} + c_{db,c})}.$$
(3.20)

Simplifying (3.20) by neglecting the remaining fixed parasitics, we find $P_{IN} \approx \frac{g_{m,1}}{2\pi(c_{gs,1}+c_{gd,f}+\frac{C_{PD}}{1+g_{m,f}R_L})}$, indicating that in the modified CG design the effective PD capacitance is reduced to $\frac{C_{PD}}{1+g_{m,f}R_L}$.
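The diminishing return of gain boosting can be seen by evaluating (3.20) for increasing loop gain $g_{m,f}R_L$. The following sketch uses hypothetical function and parameter names, and the component values in the usage example are assumptions chosen only for illustration.

```python
import math


def boosted_input_pole_hz(g_m1, loop_gain, c_pd, c_boosted, c_fixed):
    """Input pole of the gain-boosted CG TIA from (3.20).

    loop_gain : g_mf * R_L, so that (1 - A_v) = 1 + loop_gain
    c_boosted : c_gs1 + c_gdf, the part multiplied by (1 - A_v)
    c_fixed   : c_gsf + c_gdc + c_dbc, the part that is not multiplied
    """
    boost = 1.0 + loop_gain
    total_cap = c_pd + c_boosted * boost + c_fixed
    return g_m1 * boost / (2.0 * math.pi * total_cap)


def boosted_pole_ceiling_hz(g_m1, c_boosted):
    """Upper bound of (3.20) as the loop gain grows without limit."""
    return g_m1 / (2.0 * math.pi * c_boosted)


if __name__ == "__main__":
    # Assumed values: g_m1 = 10 mS, C_PD = 50 fF, 10 fF boosted, 5 fF fixed.
    for gain in (0.0, 1.0, 2.0, 4.0, 8.0):
        pole = boosted_input_pole_hz(10e-3, gain, 50e-15, 10e-15, 5e-15)
        print(f"g_mf*R_L = {gain:3.0f} -> P_IN = {pole / 1e9:6.1f} GHz")
```

With zero loop gain the expression reduces to the plain CG input pole. For large loop gain the pole approaches $g_{m,1}/2\pi(c_{gs,1}+c_{gd,f})$, so the benefit saturates once the scaled PD capacitance is comparable to the device capacitance.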

Let us now estimate the power consumption of the modified design. Clearly, this design requires more headroom to bias $M_f$ and consumes additional power in the feedback amplifier. The minimum supply voltage that ensures all devices operate in saturation equals

$$V_{DD} = I_D R_D + v_{gs,1} + v_{gs,f} - v_{th}.$$
(3.21)

Moreover, we need to ensure $I_{D,f}R_L + v_{gs,1} + v_{gs,f} < V_{DD}$. Although this modification provides more design flexibility and removes the trade-offs between gain, bandwidth, and power consumption by introducing the extra design parameters $R_L$ and $g_{m,f}$, we should explore other architectures for low-power applications.

Let us now examine the trade-offs in a shunt feedback amplifier design in order to determine the more appropriate approach for the power-optimized link.

# 3.2.2 Shunt Feedback TIA

Fig. 3.5a shows the shunt feedback TIA using a core open loop amplifier with a feedback resistance  $R_F$ . The transimpedance is



Figure 3.5: (a) Generic shunt feedback TIA (b) noise sources for the TIA.

$$Z_T = \frac{V_O}{I_{PD}} = \frac{\frac{A_0\omega_0}{C_{IN}}}{s^2 + (\frac{1}{R_F C_{IN}} + \omega_0)s + \frac{(A_0+1)\omega_0}{R_F C_{IN}}}.$$
(3.22)

where  $C_{IN}$  is the total input capacitance contribution due to the PD and the transistor capacitance. The damping factor in the second-order transfer function must be equal to  $\sqrt{2}/2$  to ensure a well-behaved response, forcing the pole frequency of the core amplifier to be  $f_0 = \frac{\omega_0}{2\pi} = \frac{2A_0}{2\pi R_F C_{IN}}$ , resulting in a 3-dB bandwidth for the TIA equal to [56]

$$BW = \frac{1}{2\pi} \frac{\sqrt{2}A_0}{R_F C_{IN}}.$$
(3.23)

The gain-bandwidth product of the core amplifier, $A_0 f_0$, is limited by the technology, which leads to the transimpedance-bandwidth limit [46].

$$R_F = \frac{A_0 f_0}{2\pi C_{IN} B W^2}.$$
(3.24)
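The relations among the core-amplifier pole, the damping factor, and the transimpedance limit can be verified numerically. This is a sketch under stated assumptions: the damping factor and natural frequency are read directly from the denominator of (3.22), and the bandwidth is approximated by the large-$A_0$ form $\sqrt{2}A_0/(2\pi R_F C_{IN})$, which is the form that makes (3.24) an identity. Function names and example values are hypothetical.

```python
import math


def core_pole_hz(a0, r_f, c_in):
    """Core amplifier pole f_0 = 2*A_0 / (2*pi*R_F*C_IN) from the text."""
    return 2.0 * a0 / (2.0 * math.pi * r_f * c_in)


def damping_and_natural_freq(a0, r_f, c_in):
    """Damping factor and natural frequency (Hz) of the denominator of (3.22)."""
    w0 = 2.0 * math.pi * core_pole_hz(a0, r_f, c_in)
    wn = math.sqrt((a0 + 1.0) * w0 / (r_f * c_in))
    zeta = (1.0 / (r_f * c_in) + w0) / (2.0 * wn)
    return zeta, wn / (2.0 * math.pi)


def bandwidth_approx_hz(a0, r_f, c_in):
    """Large-A_0 approximation of the 3-dB bandwidth."""
    return math.sqrt(2.0) * a0 / (2.0 * math.pi * r_f * c_in)


def r_f_from_limit(a0, f0, c_in, bw):
    """Feedback resistance allowed by the transimpedance limit (3.24)."""
    return a0 * f0 / (2.0 * math.pi * c_in * bw ** 2)


if __name__ == "__main__":
    # Assumed example: A_0 = 4.8, R_F = 500 ohm, C_IN = 40 fF.
    a0, r_f, c_in = 4.8, 500.0, 40e-15
    zeta, f_n = damping_and_natural_freq(a0, r_f, c_in)
    bw = bandwidth_approx_hz(a0, r_f, c_in)
    print("zeta:", zeta, "f_n [GHz]:", f_n / 1e9, "BW approx [GHz]:", bw / 1e9)
    print("R_F from (3.24):", r_f_from_limit(a0, core_pole_hz(a0, r_f, c_in), c_in, bw))
```

For a modest gain such as $A_0 = 4.8$, the damping factor is slightly above $\sqrt{2}/2$ and the natural frequency is somewhat higher than the approximate bandwidth, so the closed-form expressions should be treated as design estimates.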

Let us now examine the noise contributions of the feedback resistor and the core amplifier. Based on the noise model shown in Fig. 3.5b, we can find

$$V_{n,out} = \frac{V_{n,R_F} + (R_F C_{PD} S + 1) V_{n,A}}{1 + \frac{R_F C_{PD} S}{A}}.$$
(3.25)

The IRNC at low frequency, where $A \gg R_F C_{PD} S$, can be simplified to

$$i_{n,in}^2 = \frac{4kT}{R_F} + \frac{V_{n,A}^2}{R_F^2}.$$
(3.26)

Consequently, the current noise of the feedback resistor is referred directly to the input, similar to the contribution of the load resistor $R_D$ in the common-gate design. However, unlike in the common-gate design, the resistor value does not affect the headroom, which enables low-power design with noise optimization.

In summary, in the shunt feedback design, the feedback resistor determines the gain with no DC current flowing through it and affecting the headroom. This may simplify low power design while optimizing the trade-offs between gain, bandwidth and noise.

Taking into account the findings in this section, we choose a shunt feedback TIA for link optimization and continue to explore gain, bandwidth and noise trade-offs for this design taking into account the transimpedance limit. For this purpose, we will continue the discussion by first exploring how receiver noise affects the required laser power. We will then continue to evaluate how the receiver design can be optimized.

# **3.3** Laser Power Requirements

In the previous section, the TIA design and the trade-offs between noise, bandwidth, and power consumption were reviewed. As discussed previously, to minimize the DC power consumed in the coherent link for a given bit rate, both the receiver power and the transmit optical power must be optimized.

The minimum transmit laser power is dependent on the receiver sensitivity. Hence, to design the coherent link, one must study how required optical power trades off with the DC power inside the receiver.

The minimum peak-to-peak current at each PD to achieve the desired BER is [57]

$$I_{pp} = 2Q \cdot i_{n,rms}.\tag{3.27}$$

where Q is a constant for a given BER and  $i_{n,rms}$  is the rms input referred noise current (IRNC). In terms of the IRNC, the minimum required transmit laser power is

$$P_{LAS} = \frac{L_{TX}}{P_{LO}} (\frac{4Q \cdot i_{n,rms}}{R_{PD}})^2.$$
(3.28)

The total DC laser power consumption is

$$P_{DC} = \frac{P_{LAS} + P_{LO}}{\eta_{LAS}} = \frac{1}{\eta_{LAS}} \left( P_{LO} + \left( \frac{4Q \cdot i_{n,rms}}{R_{PD}} \right)^2 \frac{L_{TX}}{P_{LO}} \right).$$
(3.29)

where $\eta_{LAS}$ is the wall-plug efficiency, defined as the laser's ability to convert electrical DC power into optical power, and is assumed to be the same for both the LO and transmit lasers. An optimum DC laser power consumption is found by trading off the LO power in the receiver against the TX power. This minimum power is

$$P_{DC,LAS,MIN} = \frac{8}{\eta_{LAS}} \frac{Q \cdot i_{n,rms}}{R_{PD}} \sqrt{L_{TX}}.$$
(3.30)
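The optimum in (3.30) follows from minimizing (3.29) with respect to $P_{LO}$. As a brief sketch, writing $a = L_{TX}\left(\frac{4Q \cdot i_{n,rms}}{R_{PD}}\right)^2$ as shorthand (a symbol introduced here only for compactness):

$$\begin{aligned}
\frac{\partial P_{DC}}{\partial P_{LO}} &= \frac{1}{\eta_{LAS}}\left(1 - \frac{a}{P_{LO}^2}\right) = 0
\quad\Rightarrow\quad
P_{LO,opt} = \sqrt{a} = \frac{4Q \cdot i_{n,rms}}{R_{PD}}\sqrt{L_{TX}},\\
P_{LAS}\big|_{P_{LO,opt}} &= \frac{a}{P_{LO,opt}} = P_{LO,opt},\\
P_{DC,LAS,MIN} &= \frac{2\sqrt{a}}{\eta_{LAS}} = \frac{8}{\eta_{LAS}}\,\frac{Q \cdot i_{n,rms}}{R_{PD}}\sqrt{L_{TX}}.
\end{aligned}$$

At the optimum, the LO power and the transmit laser power are therefore equal.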

The total DC optical power is clearly closely connected to the IRNC of the electronic receiver and the losses of the transmitter. The laser efficiency and PD responsivity are contributions to power beyond the scope of this work.

# **3.4** Receiver Power Requirement

Calculating the power consumption required to reach a given IRNC requires some details about the process technology. The PD current is amplified using a transimpedance amplifier (TIA) to generate a voltage for sampling. The overall link efficiency depends on the DC power required to amplify the PD current to a minimum sampling voltage as well as on the optical power needed to generate a minimum detectable current for the TIA. To detect a peak voltage $V_O$ at the RX output, the required transimpedance $Z_T$ is $\frac{2V_O}{I_{PP}}$; substituting $I_{PP}$ from (3.27),

$$Z_T = \frac{V_O}{Q \cdot i_{n,rms}}.$$
(3.31)

Assuming a technology-dependent coefficient  $K_Z$  that relates the desired  $Z_T$  to power consumption, the DC power dissipation of a single channel is  $P_{DC,RX} = K_Z \times Z_T$ . The total power consumption for a dual channel I/Q receiver, excluding the transmitter electronic driver is

$$P_{DC,TOT} = \frac{8Q}{\eta_{LAS}R_{PD}}\sqrt{L_{TX}}i_{n,rms} + \frac{2K_Z \cdot V_O}{Q \cdot i_{n,rms}}.$$
(3.32)

The first term is found from (3.30), while the second term is the electronic receiver contribution. Since the optical power consumption decreases with lower rms noise current but the receiver power increases with lower rms noise current, the total power is minimized for

$$i_{n,rms,MIN} = \sqrt{\frac{K_Z \cdot V_O}{4Q^2 \cdot \sqrt{L_{TX}}} \cdot R_{PD} \cdot \eta_{LAS}}.$$
(3.33)

Applying this condition to the total power consumption, the minimum required total power for a dual channel I/Q receiver is

$$P_{DC,TOT,MIN} = 8\sqrt{\frac{K_Z \cdot V_O \sqrt{L_{TX}}}{\eta_{LAS} \cdot R_{PD}}}.$$
(3.34)

Consequently, the minimum power is closely related to the efficiency of the transistors at producing transimpedance gain for a given DC power consumption and the reduction of the sampling voltage range.

The TX losses also feature prominently in (3.34). For amplitude modulation, the MZM is biased at the quadrature point, where the applied voltage produces the maximal optical power variation. For phase modulation, the MZM is biased at the null of the optical power. The optical carrier undergoes a 180° phase shift as the modulated signal swings around the null bias point. As the signal swings, the electric field as well as the optical power varies depending on the modulator phase efficiency $V_{\pi}$, defined as the voltage required to generate a $\pi$ phase shift [8]. Fig. 3.6 plots the optical power loss as a function of the modulated voltage normalized to the MZM's $V_{\pi}$. For SiPh processes, a typical $V_{\pi}L$ of 2 V-cm is expected, where $L$ is the modulator length, which is inversely proportional to the speed. Trade-offs between driver and laser power, as well as the optimum swing for minimum TX power consumption, can be found in [58]. For this analysis, a typical optical loss of 20 dB due to limited modulation is assumed.

Moreover, assuming 5-dB coupling loss for each of the input and output couplers at the TX as well as at the receiver, $L_{TX}$ is at least 35 dB. The $\eta_{LAS}$ depends strongly on the linewidth requirements and device technology and might be relatively low for an integrated SiPh tunable laser. For instance, [59] shows an implementation of a heterogeneously-integrated III-V/silicon interferometric widely tunable laser with 17% peak efficiency. Moreover, the optical versus electrical power curve is typically not linear, and we would expect a drop in efficiency at higher output optical powers. However, for an external cavity laser (ECL), $\eta_{LAS}$ could be as high as 50%. For this analysis, $\eta_{LAS} = 25\%$ is assumed as an estimate for the ECL used in the measurement. Considering a minimum 50-mV peak swing requirement, $K_Z = 0.01$ mW/$\Omega$, $R_{PD} = 0.9$ A/W, and $Q = 7$ for a BER below $10^{-12}$, the minimum required optical power will be 16.4 dBm, and the power consumption for the dual channel receiver will be 44 mW. This minimum power consumption requires an IRNC of $3.2\,\mu$A. Nevertheless, the dependence of noise on bandwidth, together with the high bandwidth required for a desired SR exceeding 50 GBd (where SR = BW/0.7 for each channel and BR = 2BW/0.7 for the dual channel I/Q receiver), makes this power challenging to reach; determining the bandwidth that provides the minimum current suggests the optimal SR. The transimpedance required to amplify the minimum detectable current to a 50-mV peak swing is 66 dB$\Omega$. The DC power $P_{DC,TOT,MIN}$ is proportional to $\frac{1}{\sqrt{\eta_{LAS}}}$; hence, a laser with twice the efficiency reduces the total DC power by a factor of 1.4 while increasing the minimum noise requirement by the same factor. In general, the exact optimum depends on several link components and could be refined through further study.

Figure 3.6: Optical power loss as a function of modulated voltage normalized to the MZM's $V_{\pi}$.

Figure 3.7: Shunt feedback TIA block diagram with an inverter cell for the core amplifier
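The optimum operating point can be reproduced from (3.32) to (3.34) with the parameter values quoted in this section. The script below is a minimal sketch with illustrative function names. It yields an optimum IRNC of about 3.2 µA and roughly 45 mW for each of the two terms of (3.32), which are equal at the optimum.

```python
import math


def db_to_linear(db):
    """Convert a power ratio in dB to a linear ratio."""
    return 10.0 ** (db / 10.0)


def optimum_irnc(k_z, v_o, q, l_tx, r_pd, eta_las):
    """rms noise current (A) that minimizes (3.32), as given by (3.33)."""
    return math.sqrt(k_z * v_o * r_pd * eta_las
                     / (4.0 * q ** 2 * math.sqrt(l_tx)))


def power_terms(i_n, k_z, v_o, q, l_tx, r_pd, eta_las):
    """Laser and receiver terms of (3.32), both in watts."""
    laser = 8.0 * q * math.sqrt(l_tx) * i_n / (eta_las * r_pd)
    receiver = 2.0 * k_z * v_o / (q * i_n)
    return laser, receiver


# Values quoted in the text, converted to SI units.
PARAMS = dict(k_z=0.01e-3, v_o=50e-3, q=7.0,
              l_tx=db_to_linear(35.0), r_pd=0.9, eta_las=0.25)

i_opt = optimum_irnc(**PARAMS)
laser_w, receiver_w = power_terms(i_opt, **PARAMS)
# Closed-form minimum of (3.34) for comparison with the sum of both terms.
closed_form_w = 8.0 * math.sqrt(
    PARAMS["k_z"] * PARAMS["v_o"] * math.sqrt(PARAMS["l_tx"])
    / (PARAMS["eta_las"] * PARAMS["r_pd"]))

if __name__ == "__main__":
    print(f"optimum IRNC  = {i_opt * 1e6:.2f} uA")
    print(f"laser term    = {laser_w * 1e3:.1f} mW")
    print(f"receiver term = {receiver_w * 1e3:.1f} mW")
    print(f"total (3.34)  = {closed_form_w * 1e3:.1f} mW")
```

Changing `eta_las` in `PARAMS` shows the $1/\sqrt{\eta_{LAS}}$ scaling of the total power discussed above.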

## 3.4.1 Noise and Bandwidth

To evaluate the optimal power consumption against bandwidth requirements, the analysis might assume a shunt-feedback TIA shown in Fig. 3.7.

Transimpedance, bandwidth, and the transimpedance limit were previously explored in (3.22), (3.23), and (3.24). With enough transimpedance, the noise added by the following stages can be neglected. Hence, neglecting the shot noise contribution of the PDs, the IRNC for the shunt feedback TIA is calculated from [57] to include the thermal noise contribution at the input due to $R_F$ and the channel noise contributions at the output. In terms of the Boltzmann constant $k$ and temperature $T$, the rms current is

$$i_{n,rms} = \sqrt{kT(\frac{4p_2}{R_F}BW + \frac{2p_3}{G_m}(2\pi C_{IN})^2 BW^3)}.$$
(3.35)

The noise bandwidth is scaled using Personick coefficients, where $p_2$ and $p_3$ are roughly 1.11 and 3.3 for a Butterworth response, and $C_{IN} = C_{PD} + C_{GS} = C_{PD} + \frac{G_m}{2\pi f_T}$ is the total



Figure 3.8:  $f_T$  and intrinsic gain  $A_0 = G_m \cdot R_{DS} = 4.8$  for an inverter cell in 45CLO technology

input capacitance. To minimize the noise contributions, the approximation $C_{PD} = C_{GS}$ is applied to (3.24) and (3.35) [57], and the IRNC becomes

$$i_{n,rms} = \sqrt{8\pi kTBW^3 C_{IN} (\frac{p_2}{\sqrt{2}A_0 BW} + \frac{p_3}{f_T})}.$$
(3.36)
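To make the dependence of (3.36) on bandwidth and $f_T$ concrete, the following Python sketch evaluates the expression directly. The total input capacitance `C_IN` and the 300 K temperature are hypothetical values chosen only for illustration; $A_0 = 4.8$, $p_2$, and $p_3$ follow the values quoted in the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def irnc_rms(bw_hz, f_t_hz, c_in_f, a0=4.8, p2=1.11, p3=3.3, temp_k=300.0):
    """RMS input-referred noise current from (3.36), in amperes."""
    bracket = p2 / (math.sqrt(2.0) * a0 * bw_hz) + p3 / f_t_hz
    return math.sqrt(8.0 * math.pi * K_B * temp_k * bw_hz ** 3 * c_in_f * bracket)

C_IN = 100e-15    # hypothetical total input capacitance, farads
BW = 0.7 * 60e9   # bandwidth for a 60-GBd symbol rate, since SR = BW/0.7

for f_t in (200e9, 300e9, 400e9):
    i_n = irnc_rms(BW, f_t, C_IN)
    print(f"f_T = {f_t / 1e9:.0f} GHz -> IRNC = {i_n * 1e6:.2f} uA rms")
```

Under these assumptions the IRNC falls as $f_T$ increases and rises steeply with bandwidth, because the channel-noise term scales with $BW^3$.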

For the core amplifier, an inverter produces high gain from the composite $G_m$ of the NMOS and PMOS devices while operating at low DC current, which yields high intrinsic gain and minimizes power consumption for a given bandwidth [60]. The composite $G_m$ for the inverter is

$$G_m = g_{m,n} + g_{m,p} = v_{sat,n} W_n C_{ox} + v_{sat,p} W_p C_{ox},$$
(3.37)

where $W_n$ and $W_p = 1.2W_n$ are the NMOS and PMOS transistor widths. The DC intrinsic gain is $A_0 = G_m R_{DS}$, where $R_{DS} = r_{ds,n} || r_{ds,p}$.

For the 45CLO technology, $f_T$ and the intrinsic gain are plotted as a function of current in Fig. 3.8. The cell has an $A_0 = G_m \cdot R_{DS} = 4.8$ over a wide current range.

Based on (3.36), the IRNC is plotted as a function of SR and  $f_T$  in Fig. 3.9a. The


Figure 3.9: (a) IRNC (b) total power consumption (c) EE as a function of DR and  $f_T$ . Cross lines show the maximum expected data rate for a given  $f_T$ .

IRNC contours indicate that, for a given SR, improving $f_T$ reduces the IRNC and therefore should achieve lower DC power. Extending the argument to the power consumption based on (3.32), (3.36), and the earlier assumptions of 50 mV for $V_O$ and 35 dB for $L_{TX}$, the total power consumption for the coherent link is plotted in Fig. 3.9b. Higher transmit laser power is required to achieve higher SR; however, increasing $f_T$ can reduce the required laser power by improving receiver sensitivity. Nevertheless, biasing the device at a higher $f_T$ requires an increase in current and receiver power dissipation. As a result, there is a trade-off between speed and $f_T$ in the total power consumption shown in Fig. 3.9b.

The EE is calculated as $EE = \frac{P_{TOT}}{BR}$, where the BR of the dual-channel receiver is twice the SR, i.e. $\frac{2 \cdot BW}{0.7}$. The EE is plotted in Fig. 3.9c, which identifies the minimum EE for a desired SR. For instance, a maximum SR of 60 GBd can be achieved with an EE of 0.75 pJ/bit for an $f_T$ of 300 GHz, which consumes 44 mW of DC power in the optical sources, requires an IRNC of 3.2 $\mu$A, and consumes 46 mW in the dual-channel RX. In practice, device-level analysis should be conducted to capture the exact bandwidth for a given $f_T$. Section III reviews implementation challenges and the trade-off between achievable bandwidth, noise, and EE in CMOS circuits.
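The quoted operating point can be reproduced with a few lines of Python. This is only an arithmetic sketch of $EE = P_{TOT}/BR$ using the numbers given above; the function name is illustrative.

```python
def energy_efficiency_pj_per_bit(p_total_w, symbol_rate_bd):
    """EE = P_TOT / BR for the dual-channel I/Q receiver, in pJ/bit."""
    bit_rate = 2.0 * symbol_rate_bd   # BR = 2*BW/0.7 = 2*SR
    return p_total_w / bit_rate * 1e12

p_optical = 44e-3   # DC power for the optical sources, W
p_rx = 46e-3        # DC power for the dual-channel receiver, W
ee = energy_efficiency_pj_per_bit(p_optical + p_rx, 60e9)
print(f"EE = {ee:.2f} pJ/bit")
```

With 90 mW of total power and a 120-Gb/s bit rate, the result is the 0.75 pJ/bit stated in the text.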

### 3.5 Transmitter Drive Power Requirement

In previous sections, we explored the trade-offs between the required transmit power and the receiver power consumption. As a result, the minimum DC power consumption was found in (3.34). However, the transmitter electrical power has not been considered so far. Fig. 3.6 depicts the optical power loss as a function of electrical signal swing. This loss due to the modulation factor can be modeled as

$$L_{TX} = \left[\frac{1}{2}(1 - \cos(\pi \frac{V_{sig}}{2V_{\pi}})) = (\sin(\pi \frac{V_{sig}}{4V_{\pi}}))^2\right]^{-1}.$$
(3.38)

| Parameter | $K_Z$ | $\eta_{LAS}$ | $V_O$ | $R_{PD}$ | $V_{\pi}$ | $Z_{MZM}$ | $C_0$ | $C_1$ | $C_2$ |
|-----------|-------|--------------|-------|----------|-----------|-----------|-------|-------|-------|
| Value | $0.01\ \mathrm{mW}/\Omega$ | 25% | 50 mV | $0.9\ \mathrm{A/W}$ | 10 V | $36\ \Omega$ | 75 mW | 175 mV | 1.225 |

Table 3.1: Link parameters

The transmitter power consumption as a function of the required swing is modeled in [58], based on linear fitting to various designs, as

$$P_{TX} = C_0 + C_1 \frac{V_{sig}}{Z_{MZM}} + C_2 \frac{V_{sig}^2}{Z_{MZM}},$$
(3.39)

where $Z_{MZM}$ is the MZM termination impedance and $C_0$, $C_1$, $C_2$ are process-dependent coefficients.

Consequently, adding the transmitter electronic power to the minimum power consumption we previously found, we have

$$P_{DC,TOT,MIN} + P_{TX} = 8 \frac{1}{\sqrt{\sin(\pi \frac{V_{sig}}{4V_{\pi}})}} \sqrt{\frac{K_Z \cdot V_O}{\eta_{LAS} \cdot R_{PD}}} + C_0 + C_1 \frac{V_{sig}}{Z_{MZM}} + C_2 \frac{V_{sig}^2}{Z_{MZM}}.$$
 (3.40)

Let us return to previously assumed link parameters summarized in Table 3.1 to examine how the power consumption varies with driver swing.



Figure 3.10: Required driver power and optimized receiver and laser power found in (3.34)

Fig. 3.10 shows how increasing the driver swing reduces the required laser power and the DC power consumption in the receiver by reducing the optical loss, while it results in higher power consumption in the electrical driver. Based on these link parameters, there is an optimum swing below 2 V; however, in this simplified model, driver noise and degradation of the transmitter OSNR are neglected. A lower driver swing may result in an undetectable signal at the receiver, so for better transmitter design and optimization, the driver noise must be included in the analysis.
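The trade-off can be explored numerically by sweeping the driver swing in (3.40) with the parameters of Table 3.1. The Python sketch below is a minimal illustration under the same simplifications as the text (driver noise and OSNR degradation are neglected). The sweep range and step are arbitrary choices, and the function names are illustrative.

```python
import math

# Link parameters from Table 3.1, converted to SI units.
K_Z = 0.01e-3                       # W/ohm
ETA_LAS = 0.25
V_O = 50e-3                         # V
R_PD = 0.9                          # A/W
V_PI = 10.0                         # V
Z_MZM = 36.0                        # ohm
C0, C1, C2 = 75e-3, 175e-3, 1.225   # W, V, dimensionless

def link_power(v_sig):
    """First term of (3.40): optimized receiver plus laser power, in watts."""
    modulation = math.sin(math.pi * v_sig / (4.0 * V_PI))
    return 8.0 / math.sqrt(modulation) * math.sqrt(K_Z * V_O / (ETA_LAS * R_PD))

def driver_power(v_sig):
    """Driver power fit of (3.39), in watts."""
    return C0 + C1 * v_sig / Z_MZM + C2 * v_sig ** 2 / Z_MZM

def total_power(v_sig):
    return link_power(v_sig) + driver_power(v_sig)

swings = [0.05 * k for k in range(1, 81)]   # 0.05 V to 4 V in 50-mV steps
v_opt = min(swings, key=total_power)
print(f"optimum swing ~ {v_opt:.2f} V, total power ~ {total_power(v_opt) * 1e3:.0f} mW")
```

For these parameters the minimum of the sweep falls well below 2 V, in line with the observation above. The location of the optimum shifts with $V_{\pi}$ and with the fitted driver coefficients.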

### 3.6 Conclusion

In this chapter, a power analysis for coherent optical links was provided. Shunt-feedback and open-loop TIAs were reviewed, and a shunt-feedback inverter-based TIA was chosen to minimize power consumption in the link. Based on the TIA topology, the analysis provides insight into how different link parameters, including laser efficiency, transmitter optical loss, and electronic technology node, affect the link performance. This chapter mainly focused on receiver power consumption and sensitivity; however, the transmitter electrical driver plays an important role in the link efficiency as well. The trade-off between the optical loss and the electrical voltage swing was also briefly studied.

#### 3.7 Acknowledgment

This chapter is in part a reprint of material in the manuscript, "A Monolithic O-Band Coherent Optical Receiver for Energy-Efficient Links," published in the IEEE Journal of Solid-State Circuits ©2023 IEEE.

# Chapter 4

# Electronic and Photonic Packaging and Integration

### 4.1 Introduction

As data centers demand substantial improvements in energy efficiency through increased data rates and reduced power consumption, co-packaged optics have been proposed as one approach to fulfill this demand by minimizing the high-speed I/O power consumption. Nevertheless, parasitic resistance, inductance, and capacitance between electronic and photonic circuits deteriorate the high-speed performance and require power-hungry equalization, thereby eliminating improvements in energy efficiency. Consequently, packaging approaches that enable 2.5D and 3D integration of heterogeneous ICs, i.e. silicon photonic and electronic ICs, are promising because they avoid parasitics associated with interconnects by introducing through-silicon vias (TSVs).

Fig. 4.1 juxtaposes a conventional multi-chip wirebond assembly with the TSV approach to assembly. The TSV approach proposed in this work allows ground connections and eliminates the need for ground wirebonds. These TSVs are connected electrically



Figure 4.1: Two packaging approaches where chip ground is connected to PCB ground either through wirebonds (right) or TSVs (left).

to a backside metallization layer to produce a ground plane beneath the die that reduces the inductance of the ground path. While the TSV is limited to ground connections here, future TSV development might allow signal and power interconnects that support a general three-dimensional (3D) IC packaging approach.

TSVs have been widely evaluated in the context of radio-frequency and millimeter-wave circuitry for 3D packaging [61, 62, 63, 64, 65, 66, 67]. Recent work has demonstrated that TSVs can offer low loss up to 300 GHz and support equalization of signals [68]. However, less work has been reported on the signal integrity of TSVs as measured through bit-error rate.

To quantify the signal integrity improvement of the ground TSV, this chapter presents an analysis of the impact of the ground TSV on the electrical bandwidth of each circuit block as well as the temperature rise of the chip in the presence of the TSV. In Sections 4.2 and 4.3, the TSV feature and the simulated improvement are reviewed for the optical receiver (ORX). In Section 4.4, the electrical characterization of the ORX with and without ground TSVs is demonstrated through eye diagrams and BER measurements, and both assembly techniques are compared.

#### 4.2 Through Silicon Vias

The TSVs are vertical vias etched through the silicon wafer, each consisting of a field of 21 $\mu$m by 3 $\mu$m rectangular tungsten-filled bars. These bars are organized into arrays that are scaled by the number of elements and the linear dimension. An illustration of the via and its parasitic components from the M1 metal layer to the backside metallization is shown in Fig. 4.2. The wafer is thinned to 100 $\mu$m to form the TSV and finished with backside metallization. The thinning of the wafer already provides a substantial reduction in wirebond length, so the experimental comparison between wirebonded and TSV chips used similar die thicknesses.



Figure 4.2: Ground TSV cross-section and equivalent circuit model. M1 connection to the backside metallization through silicon can be modeled as an inductance in series with a resistance depending on the number of vias in the array.

TSVs can potentially affect performance through different electrical and thermal mechanisms. By shortening the signal or ground path through a direct connection to the backside of the IC, the parasitic inductance is minimized. As illustrated in Fig. 4.2, a larger number of via bars reduces both the series resistance and the inductance. Additionally, DC power consumption in the chip increases the operating temperature, deteriorates signal performance, and ultimately can lead to more DC power consumption in power amplifiers or drivers. As a result, understanding improvements in the gain and frequency response of a circuit also requires investigating the TSVs' thermal conductivity and their potential impact on die temperature.

#### 4.3 Optical Receiver Circuit

To quantify the impact of the TSV on circuit performance, a reference circuit is introduced in Fig. 4.3. The ORX consists of a fully differential Cherry-Hooper transimpedance amplifier (TIA), a Gilbert-cell variable-gain amplifier (VGA), and a 50-$\Omega$ buffer to provide matching to the 50-$\Omega$ measurement environment. The detailed schematics for the TIA and VGA were previously presented in [69, 70] and were redesigned to incorporate the TSVs. The output buffer uses CTLE and inductive peaking for bandwidth improvement. The ORX variants include a wirebond-ground version and a TSV-ground version.

In the following section, the bandwidth dependence on wirebond inductance is first reviewed. A thermal analysis on the ORX IC is then provided to show temperature variations throughout the chip and heat sinking effects of tungsten TSVs.



Figure 4.3: Differential optical receiver consisting of a TIA, VGA, and 50- $\Omega$  output buffer fabricated in 130 nm BiCMOS process.

#### 4.3.1 Wire-bond Effects

Wirebond inductance introduces a pole and a zero to the transfer function. As a result, circuits may experience either peaking or degradation in the 3-dB bandwidth depending on the design. To evaluate circuit sensitivity to wirebond inductance, each circuit block was simulated individually. Individual wirebonds from the chip pads to the PCB were modeled as ideal inductors, and their values were swept for the ground and signal inductance.

Fig. 4.4 shows simulated 3-dB bandwidth as a function of signal and ground wirebond inductance for a) 570  $\mu$ m CPW transmission line separated from the substrate by metal layer with coupled differential signal paths driving the circuit, b) TIA, c) VGA, and d) OB. In Fig. 4.4a, increasing the ground and signal inductance from 0 to 300 pH for a CPW transmission line reduces the bandwidth significantly from more than 70 GHz to



Figure 4.4: Simulated 3-dB bandwidth as a function of input signal and individual ground wirebond inductance from chip pads to PCB for a) 570  $\mu$ m CPW transmission line driving the circuit, b) TIA, c) VGA, and d) OB.

35 GHz. Notably, the signal and ground wirebonds have a similar impact in this case.

In Fig. 4.4b, increasing the ground inductance on the TIA reduces the bandwidth more than increasing the inductance of the signal wirebond between the source (e.g. the photodiode) and the TIA.

In Fig. 4.4c, the signal wirebond inductance improves the bandwidth due to peaking in the frequency response, while the ground inductance significantly reduces the bandwidth. Since the VGA already has very high gain at lower frequencies, slightly peaking its response at the input with $\sim 200$ pH of inductance results in optimal performance.

Fig. 4.4d plots the dependence of the output buffer bandwidth on the wirebond inductance. Notably, given some signal wirebond inductance, the ground inductance introduces almost no bandwidth reduction.

#### 4.3.2 Thermal Effects

To evaluate thermal performance, the Lumerical HEAT simulator was used to model the temperature increase when the total dissipated power of the measured circuit, 168 mW, is uniformly sourced from the center of the chip, where the vast majority of electronic devices are located. The simulation was run with TSVs both enabled and disabled on the GND plane to investigate thermal effects on the chip. The simulations indicate negligible differences in temperature rise between TSV and non-TSV chips at these power levels, with both heating up to $\sim$307 K around the center. The results are consistent with expectation since the tungsten TSVs have a thermal conductivity similar to that of pure silicon. Thus, most of the performance improvement for this specific circuit can be attributed to the RF bandwidth differences introduced by wirebonds.

#### 4.4 Experimental Results

Fig. 4.5 shows the PCB-mounted chip assembly for the two variants. The photograph on the top shows the wirebond variant and the one below shows the TSV assembly. As mentioned, the non-TSV variant was thinned from 350 $\mu$m to 100 $\mu$m to ensure that both assemblies have the same signal wirebond lengths for the input and output signal paths and to eliminate any other substrate thickness factors affecting the measured performance. Moreover, the VGA gain setting is the same in both assemblies.



Figure 4.5: Photograph of the two assemblies under test for investigating packaging effects.

The frequency response and stability factor $\mu$ of both assemblies are measured and presented in Fig. 4.6. The S-parameter measurements are single-ended, where one input is ac grounded and its output is terminated in 50 $\Omega$. As expected, wirebond inductance constrains the bandwidth, and the TSV variant has a flatter gain over a larger frequency range. The assembly with TSVs shows a 3-dB bandwidth of 36.6 GHz while the bandwidth of the non-TSV variant is reduced to 32.3 GHz, suggesting a roughly 12% improvement. Both assemblies remain stable over the entire frequency range.

Figure 4.6: Measured S21 for two assemblies showing 3-dB bandwidth of 32.3 GHz for the non-TSV assembly and 36.6 GHz for TSV assembly.

The output noise voltage is also measured and compared in Fig. 4.7, indicating 3.84 mV$_{rms}$ output noise for the TSV assembly and 3.58 mV$_{rms}$ for the non-TSV assembly. This 7% increase in integrated noise for the TSV assembly is consistent with its larger bandwidth.
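The consistency between the noise increase and the bandwidth increase can be checked with a short calculation. The sketch below assumes a flat (white) output noise density, so that the integrated rms noise scales with the square root of the bandwidth. This is a simplification introduced here for illustration, not a claim about the measured noise spectrum.

```python
import math

bw_tsv, bw_wirebond = 36.6, 32.3         # measured 3-dB bandwidths, GHz
noise_tsv, noise_wirebond = 3.84, 3.58   # measured output noise, mV rms

bw_gain_ghz = bw_tsv - bw_wirebond
# White-noise assumption: rms noise scales as sqrt(bandwidth).
predicted_noise_ratio = math.sqrt(bw_tsv / bw_wirebond)
measured_noise_ratio = noise_tsv / noise_wirebond

print(f"bandwidth gain: {bw_gain_ghz:.1f} GHz")
print(f"predicted noise ratio: {predicted_noise_ratio:.3f}")
print(f"measured noise ratio:  {measured_noise_ratio:.3f}")
```

Under this assumption the bandwidth difference alone predicts an increase of about 6% in rms noise, close to the measured 7%.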

The TSV effects are further investigated through eye opening measurements with a



Figure 4.7: Noise histograms for TSV and non-TSV assemblies. The TSV assembly shows a slightly higher output voltage noise due to its larger bandwidth.

fully differential PRBS31 sequence. Fig. 4.8 shows eye diagrams for the TSV and non-TSV assemblies at 50, 56, and 64 Gb/s data rates. In all cases, the peak-to-peak swing is 100 mV, while the TSV assembly shows a substantial improvement in eye opening resulting from the larger 3-dB bandwidth. The bit error rate (BER) as a function of sampling time and the sensitivity curves as a function of bit pattern generator (BPG) voltage are shown in Fig. 4.9 and Fig. 4.10. Dashed lines show the BER for the TSV assembly and solid lines correspond to the non-TSV assembly. The BER performance satisfies the FEC limit (BER below 2.2e-3) for G.975.1 I.4 (UFEC) up to 64 Gb/s in both assemblies. Nevertheless, TSV packaging shows a significantly lower error rate, enabling higher-speed data transfer, and could potentially be pushed to higher rates.

### 4.5 Conclusion

This chapter compared the frequency response, eye opening, and BER for two variants of an optical receiver to assess the benefit of through-silicon vias and analyzed how packaging parasitic components affect different blocks in the optical receiver. We compared the performance of a 64 Gb/s (2.6 pJ/bit) optical receiver with and without ground TSVs and found more than 4 GHz of improvement in 3-dB bandwidth, as well as a 100-fold reduction in BER for the TSV assembly.

#### 4.6 Acknowledgment

This chapter is in part a reprint of material in the manuscript, "Improved Signal Integrity at 64 Gbps in a 130-nm SiGe Optical Receiver With Through-Silicon Vias," published in the IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS) ©2022 IEEE.




Figure 4.8: Measured eye diagrams at 50, 56, and 64 Gb/s with 6.5 mV inputs. TSV assembly shows a larger eye opening resulting in lower BER at high data rates



Figure 4.9: Bathtub curves at 50, 56, and 64 Gb/s for TSV and non-TSV assemblies for 6.5 mV input voltage. TSV assembly shows almost 100 times improvement in BER as a result of higher bandwidth



Figure 4.10: Sensitivity curves at 50, 56, and 64 Gb/s for TSV and non TSV assemblies.

# Chapter 5

# A Costas PFD Implementation

### 5.1 Introduction

A key component in enabling intra-data center interconnects (intra-DCIs) operating at data rates above 200 Gbps per lambda is the realization of low-power, broadband optical receivers for quadrature phase shift keying (QPSK) or higher-order coherent waveforms. Recent work has demonstrated RXICs for QPSK and 16-QAM coherent links operating with optical data rates above 100 Gbps/polarization with energy efficiencies below 5 pJ/bit [71, 72, 73, 74]. In the analog coherent receiver architecture, an optical phase-locked loop (OPLL) locks to the phase and frequency of the incoming optical carrier. The OPLL uses a Costas loop to lock to a QPSK signal [54]. However, adding the Costas PFD to a conventional shunt-feedback TIA and variable gain amplifier (VGA) with high bandwidth demands additional emitter followers to buffer the signal through separate signal paths. The energy efficiency of the conventional voltage-mode SiGe HBT-based QPSK receiver design with a Costas loop is 5.34 pJ/bit [74]. To improve the energy efficiency of the RXIC, the blocks in the Costas loop-based receiver must be combined with the receive blocks to reuse the current and reduce power consumption.



Figure 5.1: a) Conventional shunt-feedback TIA, b) shunt-feedback TIA with emitter follower in the feedback and c) proposed common-base(CB)/common-emitter(CE) front-end for hybrid TIA/mixer interface.

In this chapter, we present a new architecture for the QPSK receiver based on a current-mode TIA. The scheme indicated in Fig. 5.1 replaces the conventional shunt-feedback resistor with a cross-coupled CB current buffer. As the comparison between the shunt-feedback TIA and the shunt-feedback TIA with an emitter follower in the feedback path (to isolate input and output) indicates, the CB stage takes over the role of $R_F$ and improves $\frac{Z_T}{Z_{in}}$ from $1 + g_{m1}R_D$ to $(g_{m1}+g_{m2})R_D$ while allowing a current-controlled transimpedance for the front-end. Additionally, the current generated by the CB and CE stages is tapped into a current-mode mixer that replaces $R_D$ to form the Costas loop and reduce power consumption by roughly 50%.

Section 5.2 discusses a current-mode CB/CE TIA circuit implementation with emphasis on the overall power reduction of the novel QPSK Costas loop architecture. The electrical characterization of the current-mode I/Q receiver based on a Costas loop fabricated in the 130-nm SiGe technology is presented in Section 5.3 with measurements of the data rate, PFD output, and BER for the RXIC. To the best of our knowledge, this is the first current-mode I/Q receiver, and it achieves the best energy efficiency among SiGe HBT-based designs operating over 40 GBaud (GBd).

### 5.2 Current-mode QPSK Receiver Design

The detailed schematic of the current-mode QPSK receiver is illustrated in Fig. 5.2. The input stage of the receiver consists of a differential common-base current buffer (Q2-Q4) with a cross-coupled common-emitter pair (Q1-Q3) to increase the current gain. The current generated by the common-base and common-emitter devices is fed directly into a Gilbert cell current-mode mixer pair. The output voltage of the TIA is

$$V_{TIA,I/Q} = \frac{\frac{\beta(g_{m,1}+g_{m,2})}{g_{m,1}+\beta g_{m,2}}}{g_{m,eff,5/6}} I_{PD,I/Q},$$
(5.1)

where we assume  $g_{m,1} = g_{m,3}$ ,  $g_{m,2} = g_{m,4}$  and  $g_{m,eff,5/6}$  is the effective transconductance of the switched HBT devices.

Photo-detector current is provided through the dc current source used to bias the common-base devices (Q7). The TIA provides a limited gain of 38 dB$\Omega$; hence, the collector voltage of Q2/Q4 is further amplified by the Cherry-Hooper limiting amplifier (LA), shown to the right of the TIA schematic, to produce the limited versions of the signal. In the modified Costas loop architecture, the limited voltage output from the quadrature channel, shown as $V_{LIM,I/Q}$, drives the Gilbert cell multiplier through an emitter follower buffer to combine it with the output current of the TIA ($I_{TIA,Q/I}$).

The limiting amplifier output voltage  $V_{LA}$  is written as

$$V_{LA,I/Q} = \tanh(G_{CH} V_{TIA,I/Q}), \tag{5.2}$$

where  $G_{CH}$  is the gain of the Cherry-Hooper stage. Therefore, the differential current at



Figure 5.2: Schematic of the current-mode receiver combining the TIA and PFD mixer interface with an input common-base TIA embedded within a Gilbert cell mixer for input photodector current reuse, Cherry-Hooper limiting amplifier (LA), and output buffer with CTLE

the output of the current-mode HBT switches is

$$\Delta I_{PFD,I/Q} = \Delta I_{TIA,Q/I} \times \tanh(\frac{\Delta V_{LA,I/Q}}{2V_T}),$$
(5.3)

where $V_T$ is the thermal voltage, $I_{TIA,I/Q}$ is the respective collector current of the TIA stages, and $V_{LA,I/Q}$ is the respective output voltage of the limiting amplifier. Ideally, the LA provides large enough gain to rail the output voltage and remove any amplitude information, generating the ideal QPSK Costas loop PFD response presented in Eq. (5.4).

$$V_{PFD} = (I_{PFD,I} - I_{PFD,Q}) \cdot R_L. \tag{5.4}$$
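The behavior implied by (5.2) to (5.4) can be sketched numerically. In the Python example below, the TIA currents are normalized to $I = \cos\theta$ and $Q = \sin\theta$, and the limiter of (5.2) and the switching function of (5.3) are lumped into a single steep `tanh` with a hypothetical gain. These are simplifying assumptions made for illustration, and the function names are not taken from the design.

```python
import math

def pfd_output(theta, limiter_gain=200.0, r_load=1.0):
    """Normalized V_PFD of (5.4) for TIA currents I = cos(theta), Q = sin(theta)."""
    i_tia, q_tia = math.cos(theta), math.sin(theta)
    # Assumption: the limiter of (5.2) and the switching tanh of (5.3)
    # are lumped into one steep tanh with a hypothetical gain.
    lim_i = math.tanh(limiter_gain * i_tia)
    lim_q = math.tanh(limiter_gain * q_tia)
    i_pfd_i = q_tia * lim_i   # (5.3): Q-channel current switched by the limited I
    i_pfd_q = i_tia * lim_q   # (5.3): I-channel current switched by the limited Q
    return (i_pfd_i - i_pfd_q) * r_load

def rising_zero_crossings(n_samples=3600):
    """Count negative-to-positive crossings of V_PFD over one input cycle."""
    values = [pfd_output(2.0 * math.pi * (n + 0.5) / n_samples)
              for n in range(n_samples)]
    return sum(1 for a, b in zip(values, values[1:]) if a < 0.0 <= b)

print("rising zero crossings per input cycle:", rising_zero_crossings())
```

In this idealized model the output repeats every quarter of the input cycle and has four rising zero crossings per cycle, one for each QPSK lock point. That corresponds to a sawtooth-like waveform at four times the input frequency.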

Finally, the limited signal is buffered for the 50-$\Omega$ environment as shown on the far right of Fig. 5.2. The 50-$\Omega$ output buffer includes a continuous-time linear equalization network through a parallel RC-degeneration circuit that allows for a broadband output interface to standard 50-$\Omega$ measurement equipment.

The current of the cross-coupled pair (Q2/Q4) is tuned by modulating $V_{E,CE}$, providing variable gain control by effectively changing $g_{m,1}$ and thus the transimpedance gain, as illustrated in (5.1). In this prototype, manual tuning of $V_{E,CE}$ allows for gain control in the TIA through the common-emitter current. A comprehensive automatic gain control (AGC) design would ensure that the TIA remains in the linear region and would compensate for any mismatch between the I and Q channels for correct Costas loop operation [72]. The transimpedance ($Z_T$) and input impedance ($Z_{IN}$) as functions of $V_{E,CE}$ are shown in Fig. 5.3. The RXIC has a maximum $Z_T$ of 61.2 dB$\Omega$, and changing $V_{E,CE}$ from -0.4 V to 0.4 V provides 2.4 dB of gain variation while $Z_{IN}$ remains roughly constant. For short-range data center interconnects, the required dynamic range is around 10 dB, and integrated optical attenuators or laser current adaptation avoid wasting power. Gain control is included to help compensate for the mismatch between channels.



Figure 5.3: Midband TI gain and input impedance as a function of  $V_{E,CE}$ .

The frequency response is shown in Fig. 5.4. The simulation assumes 50 fF photo-detector capacitance, 700 pH wirebond inductance from the photo-detector to the receiver input pads, and 200 pH output wirebond inductance; the resulting 3-dB bandwidth is 29.7 GHz. Co-packaged electronic and photonic ICs can leverage the flexibility to use a long input wirebond, generating peaking above 20 GHz and improving the 3-dB bandwidth, while maintaining minimum output wirebond inductance.



Figure 5.4: Frequency response of the receiver for 700 pH and 400 pH input wirebond inductance. Also assuming 50 fF photo-detector capacitance, and 200 pH output wirebond inductance

#### 5.3 Experimental results

Fig. 5.5 shows the chip assembly mounted on an FR-4 PCB as well as the chip micrograph. The TIA combined with the Gilbert cell draws less than 17 mA from a 3.8-V supply. The rest of the receive chain draws around 63 mA from a 3-V supply. This results in 254.6 mW of power consumption for the main branches and a total of 290 mW including all reference biasing circuits. All packaging parasitic components, such as wirebond inductance, are embedded in the measurements.

The functionality of the PFD is demonstrated by driving the I and Q channels with sine and cosine waveforms at 100 MHz. The PFD output measured using a Keysight real-time oscilloscope (DSAV134A) is illustrated in Fig. 5.6. The sawtooth waveform is at four times the input frequency. Some non-uniformity in the peaks of the sawtooth waveform is attributed to minor offsets in the circuit. $V_{E,CE}$ can help compensate for minor offsets and imbalance in the I/Q channels and create a more uniform PFD waveform. As shown in



Figure 5.5: Receiver assembly on a FR-4 PCB and chip micro-graph.

Fig. 5.6, $V_{E,CE,I} = -200$ mV and $V_{E,CE,Q} = -400$ mV provide the minimum mismatch between the I and Q channels and were used to measure the data paths. The main source of gain mismatch between the two channels can be attributed to local mismatch, for which Monte Carlo simulation predicts a 5-dB variation at 3-sigma deviation.



Figure 5.6: PFD output at (a) 100 MHz and for two  $V_{E,CE,I/Q}$  settings to show offset and imbalance compensation capability of this node

NRZ eyes up to 44 Gbps were measured for both channels using a PRBS31 sequence without the use of any transmit pre-emphasis. The bit pattern generator (BPG) output drives the RXIC input with a 13.3-mV peak input voltage. For eye diagram measurements, one of the differential outputs of the RXIC was connected to a Tektronix digital serial analyzer sampling oscilloscope (DSA8300) and the other output was terminated with a 50-$\Omega$ load. Measured eyes are shown in Fig. 5.7. The output of the receiver is limited; hence, unlike the PFD response, where the output of the TIA with limited gain is tapped, the eye diagrams do not show the mismatch between channels. The output voltage swing is 238 mV for the I channel and 231 mV for the Q channel.



Figure 5.7: Eye diagrams for I and Q channels at 40 and 44 Gbps data rates. Measurements indicate 45 mV eye opening at 40 Gbps and 30 mV at 44 Gbps. The time scale is 5 ps per division and the voltage scale is 28 mV per division

The receiver bit error rate (BER) was characterized by connecting the differential outputs to the error analyzer (EA) (SHF 1104A). The BER is plotted as a function of sampling time in Fig. 5.8. Dashed lines show the BER for the I channel and solid lines correspond to the Q channel. The combination of noise and intersymbol interference causes the eye to close rapidly above 40 Gbps, indicating that the BER is degrading rapidly. The BER performance satisfies the FEC limit (BER below 2.2e-3) up to 44 Gbps.

Fig. 5.9 shows the measured output voltage noise as a function of $V_{E,CE,I/Q}$. The input-referred noise current (IRNC) is then calculated by dividing the total output voltage



Figure 5.8: BER bathtub curves for I(solid line) and Q(dashed line)

noise by the TI gain. Both the output voltage noise and the TI gain decrease as $V_{E,CE}$ increases, so the IRNC remains almost constant. The variation between the two channels may be expected due to local mismatch, and Monte Carlo simulation predicts a large integrated noise variation from 1.2 mV to 3.6 mV.



Figure 5.9: Noise histogram, measured output noise voltage and IRNC variation for I and Q channels as a function of  $V_{E,CE}$  compared with simulated IRNC.

Table 5.1 compares the state-of-the-art performance of differential RXICs across SiGe HBT technologies. Notably, this work provides among the best energy efficiencies. Prior work [74] uses a conventional Costas loop, and Fig. 5.10 compares the power of the conventional and proposed Costas loop designs, showing an improvement in overall receiver energy efficiency by a factor of 1.8.



Figure 5.10: Power consumption breakdown for the modified and conventional Costas loop design presented in [74].

### 5.4 Conclusion

This chapter presented a 3.3-pJ/bit I/Q receiver with a Costas loop PFD in a 130-nm SiGe technology. The amplifier and PFD employ a current reuse technique to improve energy efficiency. Electrical characterization demonstrates the functionality of the PFD and I/Q data paths up to 88 Gb/s while maintaining BER below FEC limits.

#### 5.5 Acknowledgment

This chapter is in part a reprint of material in the manuscript, "An 88-Gbps, 3.3pJ/Bit I/Q Receiver With Current-Mode Phase-Frequency Detection in a 130-nm SiGe HBT Technology," published in the IEEE Solid-State Circuits Letters ©2022 IEEE.

Reference [75][72][74]This Work [71]NRZ/QPSK Modulation 16QAM QPSK QPSK QPSK 64 128100 Bit rate 13688 (Gbps/Pol.) Efficiency  $4.6^{1}$  $6.81^{1}$  $4.33^{1}$  $3.32^{1}$  $3.3^{2}$ (pJ/bit)  $5.34^{2}$ TI Gain  $74^{3}$  $80^{3}$ 67.2 73 61.2 $(dB\Omega)$  $12.2^{3}$  $24.86^3$  $10.7^{4}$  $15.6^{4}$ Avg. IRNCD 20(pA/sqrt(Hz))Optical/ Optical Electrical Optical Optical Electrical Electrical

Table 5.1: State-of-the-Art Comparison

<sup>1</sup> only RX circuits, <sup>2</sup> RX and Costas loop circuits, <sup>3</sup> differential transimpedance gain:  $Z_{T,diff} = 2Z_T = 2\Delta V_{out}/_{in}$ , <sup>4</sup> Calculated using simulated gain/BW and measured output noise histogram statistics at the maximum gain setting

# Chapter 6

# Coherent Receiver Circuit Design

### 6.1 Introduction

In Chapter 4, the effects of parasitic components due to package interconnects were studied, showing significant improvement with TSV interconnects. Consequently, moving toward fully-integrated designs is beneficial for high-speed coherent links.

In this chapter, a monolithic design is presented: an O-band coherent optical receiver (CORX) integrated in a 45-nm monolithic CMOS SOI process. The CORX operates up to 80 Gbps with FEC-acceptable BER at 1.2 pJ/bit energy efficiency. To our knowledge, this is the first monolithically-integrated silicon CMOS CORX.

# 6.2 80-Gbps, 1.2-pJ/bit Fully-Integrated O-band I-Q Optical Receiver

The coherent receiver consists of an optical 90° hybrid which produces differential optical signals and drives two integrated Germanium photodetectors (PDs). The differential PD current is then amplified through a low-power pseudo-differential push-pull shunt-feedback TIA. This section reviews optical and electrical front-ends.

#### 6.2.1 Optical Front-end

The received optical signal and LO are coupled into the chip through waveguide grating couplers. Each coupler has an anticipated loss of 3 dB. The optical 90° hybrid, depicted in Fig. 6.1, comprises 3-dB directional couplers and a thermal phase shifter (PS), biased to 90° to generate quadrature fields expressed in (3.3). The directional coupler splits the power and applies a 90° phase shift to the opposing output arm.



Figure 6.1: Block diagram for optical hybrid.

Fig. 6.2 shows the simulated wavelength dependence of the power splitting ratio and indicates less than 0.1-dB penalty across the band. Fig. 6.2 also shows the wavelength variation of the responsivity of the PDK PDs.

Figure 6.2: Power splitting ratio inside directional couplers and PD responsivity  $R_{PD}$  as a function of wavelength.

To illustrate the ability of the optical hybrid to tune the quadrature phase relationship, an optical simulation of the QPSK constellation is performed with the PDK elements for the directional couplers and the thermal phase shifters. A 56-GBd constellation is plotted under two heater bias conditions in Figs. 6.3a and 6.3b. A mismatch in the heater bias results in a phase mismatch that distorts the constellation. The simulation uses 0-dBm laser power and -15-dBm modulated signal power incident at the hybrid, generating an average 218-$\mu$A DC current and a 70-$\mu$A peak current swing at each PD, both close to the values predicted by (3.4). These constellations do not capture the bandwidth limitation of the receiver.

Figure 6.3: Simulated 56 GBd QPSK constellations at the output of the hybrid with the phase tuner biased at (a) 17 mW and (b) 13.6 mW, with normalized amplitudes

#### 6.2.2 Electrical Front-end

In the proposed technology, the RX consists of a TIA, LA, and OB as indicated in Fig. 5.1. The Costas loop analysis used to implement the carrier recovery circuit typically assumes limiting amplification in the PFD, but analysis indicates that gain compression still produces the desired phase response with slightly lower sensitivity. Achieving a large swing through the LA incurs a higher power penalty. Considering the 845-$\Omega$ transimpedance achieved in this design at the output of the LA, a 235-$\mu$A peak input current generates a 200-mV peak voltage swing, which will saturate the LA. Considering a 0.8-A/W responsivity for the photodetectors (PDs), -5.3-dBm input power is required to saturate the receiver channel.
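The saturation numbers quoted above can be reproduced with a few lines of arithmetic. The sketch below is only a sanity check: it assumes the quoted input power is the optical power that would produce the 235-$\mu$A peak current directly through the 0.8-A/W responsivity, and the helper name `dbm_from_watts` is introduced here purely for illustration.

```python
import math

def dbm_from_watts(p_w):
    """Convert an optical power in watts to dBm (hypothetical helper)."""
    return 10.0 * math.log10(p_w / 1e-3)

# Values quoted in the text
z_t_ohm = 845.0        # transimpedance at the LA output
i_peak_a = 235e-6      # peak input current that saturates the LA
r_pd_a_per_w = 0.8     # photodetector responsivity

# Peak output swing produced by the peak input current
v_peak_v = z_t_ohm * i_peak_a

# Assumption: optical power that maps directly to i_peak through the responsivity
p_sat_w = i_peak_a / r_pd_a_per_w
p_sat_dbm = dbm_from_watts(p_sat_w)

print(f"peak swing = {v_peak_v * 1e3:.1f} mV")
print(f"saturating optical power = {p_sat_dbm:.2f} dBm")
```

Under these assumptions the swing comes out slightly below 200 mV and the power close to -5.3 dBm, in line with the rounded values in the text.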



Figure 6.4: Half-circuit schematic of pseudodifferential receiver implemented in 45-nm RF/photonic integrated circuit process with a transimpedance amplifier, limiting amplifier and output buffer.

A pseudodifferential scheme is used since the optical hybrid produces differential currents from two PDs receiving differential optical signals. A variable current source at the differential input allows manual adjustment of the DC current to the PDs. To avoid the overhead power consumption of a fully differential circuit, the TIA and LA are pseudodifferential while the OB is fully differential to provide improved common mode rejection.

In this design, the objective was to establish techniques that reduce electronic IC power consumption while improving bandwidth, and to support the analysis of a full optical coherent receiver in which the required LO and transmit optical power is related to the input-referred noise current and sensitivity of the receiver.

The circuit schematic of the RX is shown in Fig. 6.4. The TIA is a push-pull amplifier with resistive feedback. In a shunt-feedback TIA, assuming the bandwidth limitation is only due to PD capacitance at the input, the transimpedance can be estimated from (6.1), where A is the voltage gain of the push-pull amplifier,  $R_f$  is the feedback resistor, and  $C_{PD}$  is the PD capacitance.

$$Z_T = -\frac{A}{A+1} \frac{R_f}{1 + \frac{R_f C_{PD}}{A+1}s}.$$
(6.1)
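As a rough numerical illustration of (6.1), the following sketch evaluates the single-pole transimpedance and its 3-dB bandwidth, $(A+1)/(2\pi R_f C_{PD})$. The values $A = 5$ and $C_{PD} = 50$ fF echo numbers quoted elsewhere in this chapter, while $R_f = 500\ \Omega$ is a hypothetical placeholder and not a design value.

```python
import math

def tia_transimpedance(f_hz, a, r_f, c_pd):
    """Complex Z_T(f) of the shunt-feedback TIA from Eq. (6.1)."""
    s = 2j * math.pi * f_hz
    return -(a / (a + 1.0)) * r_f / (1.0 + r_f * c_pd * s / (a + 1.0))

def tia_bandwidth_hz(a, r_f, c_pd):
    """3-dB bandwidth implied by the single pole of Eq. (6.1)."""
    return (a + 1.0) / (2.0 * math.pi * r_f * c_pd)

# Illustrative values (R_F is hypothetical)
A = 5.0
R_F = 500.0
C_PD = 50e-15

z_dc = abs(tia_transimpedance(0.0, A, R_F, C_PD))
bw = tia_bandwidth_hz(A, R_F, C_PD)
z_at_bw = abs(tia_transimpedance(bw, A, R_F, C_PD))

print(f"|Z_T(0)| = {z_dc:.1f} ohm, BW = {bw / 1e9:.1f} GHz")
print(f"|Z_T(BW)| / |Z_T(0)| = {z_at_bw / z_dc:.4f}")
```

The ratio printed on the last line is $1/\sqrt{2}$, confirming that the pole of (6.1) sets the 3-dB point, and the bandwidth scales with $A + 1$ for a fixed $R_f C_{PD}$ product.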

To achieve a high bandwidth, the push-pull amplifier in the TIA needs a large voltage gain and, hence, a large  $g_m$, which requires high power consumption. Inverter-based amplifiers produce high gain by adding the  $g_m$  of both the NMOS and PMOS devices while operating at low DC currents to reduce the  $g_{ds}$. Circuit simulations demonstrate that the 45CLO process maintains an intrinsic  $g_m/g_{ds}$  gain of 5 across a wide range of current density. Consequently, the choice of current density in the TIA is limited by input-noise current requirements.

To improve the TIA bandwidth, a feedforward capacitor,  $C_f$, provides a negative capacitance at the TIA output, which reduces the loading effects of the next stage. In Fig. 6.5, the simulated transimpedance at the output of the TIA and the IRNC are plotted to twice the 3-dB bandwidth. The feedforward capacitor generates peaking in the frequency response but worsens the noise slightly at high frequencies. A second inverter stage increases the gain and is followed by a common-source amplifier with active peaking. The frequency-dependent load impedance of this stage is

$$Z_L = \frac{1 + R_2 C_{gs6} s}{g_{m6} + C_{gs6} s}.$$
(6.2)

Through the zero in the load impedance, the gain increases at higher frequencies as the  $C_{gs}$  of  $M_6$  becomes a small impedance and the impedance presented to  $M_5$  is limited by  $R_2$  rather than  $\frac{1}{g_{m6}}$ . This peaking compensates for the loading effects of the feedforward capacitor.
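The behavior of the active-peaking load in (6.2) can be illustrated numerically. The sketch below uses hypothetical values for $R_2$, $g_{m6}$, and $C_{gs6}$ (none are taken from the design) and simply checks the two limits described above: $1/g_{m6}$ at low frequency and $R_2$ at high frequency, with a zero at $1/(2\pi R_2 C_{gs6})$ and a pole at $g_{m6}/(2\pi C_{gs6})$.

```python
import math

def load_impedance(f_hz, r2, gm6, cgs6):
    """Complex load impedance Z_L(f) from Eq. (6.2)."""
    s = 2j * math.pi * f_hz
    return (1.0 + r2 * cgs6 * s) / (gm6 + cgs6 * s)

# Hypothetical device values chosen only so that R2 > 1/gm6 (peaking condition)
R2 = 400.0      # ohm
GM6 = 10e-3     # siemens
CGS6 = 10e-15   # farad

f_zero_hz = 1.0 / (2.0 * math.pi * R2 * CGS6)   # where the impedance starts rising
f_pole_hz = GM6 / (2.0 * math.pi * CGS6)        # where it flattens toward R2

z_low = abs(load_impedance(1.0, R2, GM6, CGS6))     # approaches 1/gm6
z_high = abs(load_impedance(1e15, R2, GM6, CGS6))   # approaches R2

print(f"zero at {f_zero_hz / 1e9:.1f} GHz, pole at {f_pole_hz / 1e9:.1f} GHz")
print(f"|Z_L| rises from {z_low:.1f} ohm to {z_high:.1f} ohm")
```

Because the zero sits below the pole whenever $R_2 > 1/g_{m6}$, the load impedance, and therefore the stage gain, increases with frequency between the two corner frequencies.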



Figure 6.5: Simulated transimpedance and input-referred noise current (IRNC) to show the effect of feedforward capacitor. Simulated IRNC for electrical characterization indicates that the total integrated input noise current over twice the simulated bandwidth of 23 GHz is 10  $\mu$ A when the wirebonds are not present

In order to provide common-mode rejection, a fully differential stage is added before the 50-$\Omega$ buffer. To provide the headroom for current sources, a PMOS level shifter is used. The OB is designed to drive the 50-$\Omega$ load through a wirebonded chip and hence is the most power-hungry part of the circuit. Most silicon photonic systems target short-reach applications where the transceivers might be co-packaged with a switch or other functions. In a co-packaged solution, the wirebonds would be replaced with flip-chip packaging. To demonstrate the functionality and characterize the electronics in the process, we used a source follower at the output, which introduces 8 dB of additional loss, limiting both output swing and gain. Capacitive degeneration and inductive peaking are added to the differential stage to counterbalance the loading effects of the large output transistors. Each of the pseudodifferential receive channels consumes 36 mW of DC power, of which 21.6 mW is consumed in the OB.

#### 6.3 Receiver Measurement Results

#### 6.3.1 Electrical Measurement

Fig. 6.6 shows the receiver chip micrograph and the mounted chip assembly on a PCB. The area of the chip is  $800\,\mu m \times 840\,\mu m$  with wirebonds of approximately  $800\,\mu m$, which constrain the 3-dB bandwidth to just under 20 GHz as described in Fig. 6.7, which plots the measured transimpedance based on fully-differential S-parameter measurements. The simulated frequency response of the receiver is compared with measured data without any de-embedding. In the absence of the wirebonds, the simulated bandwidth is 23 GHz assuming a 50-fF PD capacitance, typical in this process. The 800-pH input and output wirebond inductance expected for this assembly, as well as losses from the PCB traces, reduce the bandwidth to 9.6 GHz. The dual-channel RX performance presented here demonstrates the eye opening with a PRBS31 sequence, and Fig. 6.8 shows output eye diagrams at 36 and 40 Gbps. The eyes are generated through 20-dB attenuators connected to a bit pattern generator (BPG) (SHF 12105A) to produce a differential input with 40-mV peak swing to drive the RX. The I channel output swing is 150 mV for an 11-dB differential voltage gain and corresponds to a 49 dB-$\Omega$ transimpedance. The Q channel shows a larger swing of 180 mV and a corresponding transimpedance gain of 52 dB-$\Omega$, which is in agreement with the small-signal measurements.



Figure 6.6: Receiver assembly on a PCB and chip micrograph



Figure 6.7: Simulated transimpedance gain without the wirebond inductance compared with measurement results. The receiver presented here has a 13-GHz bandwidth penalty due to the wirebond inductance.

The output noise histogram for both channels is shown in Fig. 6.9 and indicates approximately 1-mV rms output noise. Based on the transimpedance, this suggests an integrated input-referred noise current of 3 $\mu$A rms, which differs from the simulated value because of the 3-dB bandwidth degradation due to packaging parasitics.
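The step from output voltage noise to input-referred current noise is a division by the transimpedance. The sketch below pairs the per-channel rms noise values reported in Fig. 6.9 with the 49 and 52 dB-$\Omega$ gains quoted for the I and Q channels; that pairing is an assumption made here for illustration, since the text does not state which transimpedance value was used.

```python
def ohms_from_dbohm(z_dbohm):
    """Convert a transimpedance in dB-ohm to ohms."""
    return 10.0 ** (z_dbohm / 20.0)

def input_referred_noise_a(v_noise_rms_v, z_dbohm):
    """Refer an rms output noise voltage to the input as an rms current."""
    return v_noise_rms_v / ohms_from_dbohm(z_dbohm)

# Measured rms output noise (Fig. 6.9) and assumed per-channel transimpedance
i_noise_i_a = input_referred_noise_a(0.9727e-3, 49.0)  # I channel
i_noise_q_a = input_referred_noise_a(1.081e-3, 52.0)   # Q channel

print(f"I channel: {i_noise_i_a * 1e6:.2f} uA rms")
print(f"Q channel: {i_noise_q_a * 1e6:.2f} uA rms")
```

Both channels land near the approximately 3-$\mu$A rms figure given in the text.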



Figure 6.8: Measured eye diagrams at a) 36 Gbps and b) 40 Gbps per channel. The time scale is 6 ps per division and the voltage scale is 21 mV per division.

To verify the open eye diagrams, the BER is plotted as a function of sampling time in Fig. 6.10. The BER curves are relatively symmetric and show that up to 40 Gbps the BER remains below the FEC limit (2.2e-3) for G.975.1 I.4 (UFEC). Fig. 6.11 shows the BER variation as a function of input voltage and indicates that below 4-mV input voltage the BER is noise limited. BER mismatches between the two channels may be attributed to both gain mismatch and imbalanced wirebond lengths limiting the bandwidth.

Figure 6.9: Noise histogram measurement for a) I channel indicating 0.9727-mV rms output noise and b) Q channel indicating 1.081-mV rms output noise

A performance summary for this design is shown in Table ?? and compared against recent work. While the 50-$\Omega$ buffer is required for measurement purposes, it might be eliminated in a link implemented for data centers, changing the energy efficiency from 0.9 to 0.36 pJ/bit. While recent work using FinFET CMOS has indicated excellent power for a given transimpedance at similar data rates, that process does not support silicon photonic integration. When compared with recent assembled optical receivers based on separate 45-nm CMOS and 90-nm silicon photonic chips for coherent detection, the energy efficiency is improved (0.9 pJ/bit).
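The two energy-efficiency figures follow directly from the per-channel power numbers given in Section 6.2.2. The sketch below assumes two channels at 40 Gbps each (80 Gbps aggregate); the function name is introduced here only for illustration.

```python
def energy_per_bit_pj(power_mw, bit_rate_gbps):
    """Energy efficiency in pJ/bit (mW divided by Gbps is pJ/bit)."""
    return power_mw / bit_rate_gbps

# Values quoted in the text
n_channels = 2
p_channel_mw = 36.0       # DC power per pseudodifferential channel
p_ob_mw = 21.6            # portion of that consumed in the output buffer
rate_per_channel_gbps = 40.0

total_rate_gbps = n_channels * rate_per_channel_gbps

ee_with_ob = energy_per_bit_pj(n_channels * p_channel_mw, total_rate_gbps)
ee_without_ob = energy_per_bit_pj(n_channels * (p_channel_mw - p_ob_mw),
                                  total_rate_gbps)

print(f"with 50-ohm buffer:    {ee_with_ob:.2f} pJ/bit")
print(f"without 50-ohm buffer: {ee_without_ob:.2f} pJ/bit")
```

The output buffer accounts for 60% of each channel's power, which is why removing it moves the efficiency from 0.9 to 0.36 pJ/bit.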

#### 6.3.2 Optical Coherent Link Measurement

The receiver was tested in a self-homodyne link configuration. The measurement setup as well as the chip micrograph and assembly on an FR-4 PCB are shown in Fig. 6.12. The output of a 1310-nm external cavity laser (ECL) is split into local oscillator (LO) and signal paths. In the signal path, an iXblue MXIQER-LN-30 I/Q modulator is driven with a 600-mV PRBS-15 signal from a bit pattern generator (BPG) (SHF 12105A). The signal path also includes an attenuator for sensitivity measurements. The receiver I/Q channels are connected to a real-time oscilloscope (RTO) (Keysight UXR0702A) with a 0.875-µs acquisition time at 256 GSa/s to detect the transmitted QPSK signal.

The ECL power is set to 13 dBm, providing 10-dBm LO power and 10-dBm input power to the transmitter. The transmitter has a typical bandwidth of 25 GHz and 15 dB of loss at O-band. The 10-dBm LO power and -5.3-dBm signal power translate to 0.59-mA LO and 8-$\mu$A signal current per PD, indicating 12.2 dB of coupling loss. The differential dual-channel electrical circuit draws 60-mA current from a 1.2-V supply, and the thermal phase shifter inside the optical hybrid consumes 24 mW for quadrature bias, corresponding to 96-mW DC power consumption for the CORX.

Figure 6.10: Bathtub curves illustrating FEC-compatible operation up to 40 Gbps per channel

Fig. 6.13 shows the measured constellations at 20, 32, and 40 Gbaud and the bit error rate (BER) as a function of the signal power incident at the signal grating coupler, which is 23 dB larger than the signal power at each PD. The single-ended output voltage swing is 28 mV for a 67-$\mu$A input current per PD (calculated based on the signal and LO power) driving the TIA, showing a 415-$\Omega$ (52.3-dB$\Omega$) transimpedance gain.

A performance summary for this design is shown in Table 7.1 and compared against recent work. FinFET CMOS has indicated excellent power; however, that process does not support silicon photonic integration. When compared with monolithic coherent design in the O-band, we achieved a 3.6 times improvement in energy efficiency.

### 6.4 Conclusion

This chapter described the performance of a 1.2-pJ/bit coherent optical receiver fabricated in the 45CLO technology, which supports both silicon photonic components and high-speed electronics. Measured constellations and sensitivity curves show performance up to 40 Gbaud below the FEC BER limit of  $2.2 \times 10^{-3}$.

Figure 6.11: Sensitivity curves as a function of input current

Figure 6.12: Self-homodyne link configuration and microscopic view of the chip.

## 6.5 Acknowledgment

This chapter is in part a reprint of material in the manuscript, "A 40-Gb/s, 900fJ/bit Dual-Channel Receiver in a 45-nm Monolithic RF/Photonic Integrated Circuit Process," published in the IEEE Solid-State Circuits Letters ©2022 IEEE, and "First Monolithically-Integrated Silicon CMOS Coherent Optical Receiver," published in the IEEE Optical Fiber Communication (OFC) Conference ©2023.

Figure 6.13: Measured constellations at 20, 32, and 40 GBd and sensitivity curves. Equalized results use a post-processing script on raw data to apply a 7-tap feed-forward equalizer (FFE) to compensate for bandwidth limitations of TX and RX packaging losses. The equalization adds 6 dB of peaking at 40 GHz and reduces BER for a given signal power and data rate.

# Chapter 7

# **Coherent Receiver Circuit Design: Optimization and Co-simulation**

# 7.1 Introduction

In the previous chapter, a prototype monolithic coherent receiver designed in a 45-nm CMOS-SOI photonic process was presented. Nevertheless, as reviewed in chapter 2, analog carrier recovery in a self-homodyne architecture requires a PFD. Moreover, chapter 3 provided a power analysis and a guideline for optimizing the overall coherent link efficiency. In this chapter, we use the discussions in chapters 2 and 3 to implement an optimized coherent link.

A 1310-nm (O-band) coherent optical link is demonstrated for short-range optical interconnects that operates up to a 56-GBd symbol rate (SR) (112 Gbps) with FEC-acceptable BER. The coherent optical receiver (CORX) leverages a monolithic 45-nm CMOS SOI photonic-enabled process to realize energy-efficient quadrature phase shift keying (QPSK) demodulation. Co-design of the optical and electronic circuit elements supports high-speed operation and low power consumption. The coherent link is demonstrated with an optical transmitter photonic IC (PIC) fabricated in a silicon photonic (SiPh) process with laser diodes wirebonded to a 90-nm SiGe driver electronic RFIC. The transmitter operates at 5.9-pJ/bit energy efficiency (EE) while the receiver achieves 0.73 pJ/bit, which, to our knowledge, is the best EE reported for a coherent optical receiver.

Figure 7.1: Optical receiver implemented in 45-nm RF/photonic integrated circuit process consisting of an optical hybrid, TIA, limiting amplifier (LA),  $50\Omega$  output buffer (OB), and a Costas phase/frequency detector (PFD). Detailed design parameters can also be found in [78].

# 7.2 112-Gbps, 0.73-pJ/bit Fully-Integrated O-band I-Q Optical Receiver

The optical and electronic receive circuitry implemented in the 45CLO process is illustrated in Fig. 7.1.

#### 7.2.1 Optical Front-end

The optical front-end is similar to the design in chapter 6. The received optical signal and LO are again coupled into the chip through waveguide grating couplers. In Fig. 7.1, a PN-type phase shifter is also introduced to allow a large tuning range to adjust the phase of the LO.

#### 7.2.2 Electronic Front-end

In this design, the differential PD current is amplified through a low-power pseudodifferential push-pull shunt-feedback transimpedance amplifier (TIA). The PD DC current calculated in (3.4) flows through a variable current source shown in Fig. 7.1. The current source is adjusted manually based on the incident optical power so that the DC current does not flow through the feedback resistor, which would shift the inverter bias and gain and cause the outputs to rail. An automatic DC sink can be implemented similar to [73].

A fully-differential TIA offers high common-mode rejection and immunity to environmental noise, such as supply noise, compared to a single-ended design, at the cost of a power penalty. To better evaluate the common-mode rejection of the inverter cell, the output voltage variation with a noisy  $V_{DD}$  is found from the inverter model in Fig. 3.7. Neglecting channel length modulation,  $-V_{DD} = V_{gs,p} - V_{gs,n}$  and  $V_{out} = V_{gs,n}$. As a result,  $\frac{\partial V_{out}}{\partial V_{DD}} = \frac{\partial V_{gs,n}}{\partial I_D} \frac{\partial I_D}{\partial V_{DD}}$. Moreover,  $|\frac{\partial V_{DD}}{\partial I_D}| = \frac{\partial V_{gs,n}}{\partial I_D} - \frac{\partial V_{gs,p}}{\partial I_D} = \frac{1}{g_{m,n}} + \frac{1}{g_{m,p}}$. Consequently,

$$\left|\frac{\partial V_{out}}{\partial V_{DD}}\right| = \frac{1}{\frac{g_{m,n}}{g_{m,p}} + 1}.$$
(7.1)

This provides roughly 6-dB isolation between the output and  $V_{DD}$  when  $g_{m,n} = g_{m,p}$. In practice, channel length modulation further limits the ideal isolation. Simulation shows 4.6-dB supply noise rejection for a single-ended inverter TIA. The common-mode noise rejection is improved by the differential PDs and the pseudo-differential outputs. Assuming the differential NMOS and PMOS devices have mismatches of  $\delta_n$  and  $\delta_p$, the supply noise rejection becomes

$$\left|\frac{\partial \Delta V_{out}}{\partial V_{DD}}\right| = \frac{\frac{g_{m,n}(\delta_n - \delta_p)}{g_{m,p}(1 + \delta_p)}}{\left(\frac{g_{m,n}}{g_{m,p}}\right)^2 \frac{1 + \delta_n}{1 + \delta_p} + \frac{g_{m,n}}{g_{m,p}} \left(\frac{1 + \delta_n}{1 + \delta_p} + 1\right) + 1}.$$
(7.2)

In the limit that the mismatch is zero, the rejection is infinite. Otherwise, the mismatch produces finite rejection.
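To make the supply-rejection argument concrete, the sketch below evaluates (7.1) for a single inverter and then models the pseudo-differential output as the difference between two half-circuits, the second having its transconductances scaled by $(1+\delta_n)$ and $(1+\delta_p)$. Treating the differential response as this simple difference is an assumption made here for illustration, and the transconductance and mismatch values are hypothetical.

```python
import math

def supply_gain_single(gm_n, gm_p):
    """|dVout/dVDD| of one inverter from Eq. (7.1)."""
    return 1.0 / (gm_n / gm_p + 1.0)

def supply_gain_differential(gm_n, gm_p, delta_n, delta_p):
    """Difference of two half-circuits; the second has mismatched devices."""
    nominal = supply_gain_single(gm_n, gm_p)
    mismatched = supply_gain_single(gm_n * (1.0 + delta_n),
                                    gm_p * (1.0 + delta_p))
    return abs(nominal - mismatched)

def to_db(ratio):
    """Express a voltage ratio in dB."""
    return 20.0 * math.log10(ratio)

# Hypothetical matched transconductances (only their ratio matters here)
GM_N = 10e-3
GM_P = 10e-3

single = supply_gain_single(GM_N, GM_P)
diff_matched = supply_gain_differential(GM_N, GM_P, 0.0, 0.0)
diff_5pct = supply_gain_differential(GM_N, GM_P, 0.05, 0.0)

print(f"single-ended supply gain: {to_db(single):.1f} dB")
print(f"differential, no mismatch: {diff_matched}")
print(f"differential, 5% NMOS mismatch: {to_db(diff_5pct):.1f} dB")
```

With equal transconductances the single-ended supply gain is one half (about -6 dB), the matched differential case cancels exactly, and a small mismatch leaves a finite but much smaller residual.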

In Section II, the minimum EE that can be achieved for a given technology was calculated. In practice, layout and packaging parasitic components further limit the available  $f_T$  and achievable EE. Section II assumes the maximum DR based on available  $f_T$; however, the shunt feedback TIA can be designed to maximize  $Z_T$  for a given power consumption while achieving the desired bandwidth using inductive peaking [76, 77].

For further analysis, device level parameters, illustrated in Fig. 3.7, are included in the equations. The 3-dB bandwidth for the core cell  $\omega_0$  is estimated as

$$\omega_0 = \frac{1}{R_{DS}C_{DS}} = \frac{G_m}{A_0 C_{DS}},$$
(7.3)

where  $C_{DS} = C_{gd,n} + C_{gd,p} + C_{db,n} + C_{db,p}$  is the total capacitance at the drain of the nfet and pfet transistors,  $C_{db,n/p}$  is the capacitance between drain and substrate, and  $C_{gd,n/p}$  is the capacitance between gate and drain. Note that, as described in section II, C, the damping factor in the second-order transfer function of the shunt feedback TIA found in (3.22) must be equal to  $\sqrt{2}/2$  to ensure a well-behaved response, forcing the pole frequency of the core amplifier to be  $\omega_0 = \frac{2A_0}{R_F C_{IN}}$  and resulting in a 3-dB bandwidth for the TIA equal to  $BW = \frac{1}{2\pi} \frac{\sqrt{2}A_0}{R_F C_{IN}}$. Consequently, the bandwidth of the core amplifier ( $\frac{\omega_0}{2\pi}$ ), modeled as a first-order amplifier, should be chosen  $\sqrt{2}$  times higher than the desired BW of the TIA.
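The $\sqrt{2}$ relationship between the core-amplifier pole and the TIA bandwidth can be checked numerically from the two expressions above. In this sketch $R_F = 375\ \Omega$ echoes a value mentioned later in the section, whereas $A_0 = 5$ and $C_{IN} = 100$ fF are hypothetical placeholders chosen only to give plausible magnitudes.

```python
import math

def core_pole_hz(a0, r_f, c_in):
    """Core amplifier pole frequency, omega_0 / (2 pi), with omega_0 = 2 A0 / (R_F C_IN)."""
    return 2.0 * a0 / (r_f * c_in) / (2.0 * math.pi)

def tia_bw_hz(a0, r_f, c_in):
    """TIA 3-dB bandwidth, sqrt(2) A0 / (2 pi R_F C_IN), for a damping factor of sqrt(2)/2."""
    return math.sqrt(2.0) * a0 / (2.0 * math.pi * r_f * c_in)

# Illustrative values (A0 and C_IN are hypothetical)
A0 = 5.0
R_F = 375.0
C_IN = 100e-15

f_core = core_pole_hz(A0, R_F, C_IN)
f_tia = tia_bw_hz(A0, R_F, C_IN)

print(f"core amplifier pole: {f_core / 1e9:.1f} GHz")
print(f"TIA bandwidth:       {f_tia / 1e9:.1f} GHz")
print(f"ratio:               {f_core / f_tia:.4f}")
```

The ratio is $\sqrt{2}$ regardless of the values chosen, so a target TIA bandwidth immediately fixes the minimum bandwidth required from the core inverter cell.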

The IRNC in (3.35) is recalculated as a function of transistor parameters.

$$i_{n,rms} = \sqrt{\frac{kT}{\sqrt{2}\pi} (\frac{1}{A_0 C_{DS}})^3 (C_{PD} + C_{in,i})} \times \sqrt{G_m^2 (p_2 C_{DS} + p_3 (C_{PD} + C_{in,i}))}.$$
(7.4)

The total inverter input capacitance equals  $C_{in,i} = C_{gs,n} + C_{gs,p} + (1 - A)(C_{gd,n} + C_{gd,p})$, where the gate-source capacitance of each device,  $C_{gs,n/p}$,  $G_m$, and  $C_{DS}$  are all proportional to the device widths  $W_n, W_p$. A larger device improves noise performance and reduces the required transmit power at the expense of higher receiver power and limited bandwidth. These trade-offs between noise, bandwidth, and power consumption yield an optimum choice for device width. Fig. 7.2 shows the bandwidth and noise trade-off as a function of device width and indicates that for data rates exceeding 50 Gb/s, the device width should ideally be under 10 $\mu$m.



Figure 7.2: Simulation of the IRNC and BW for an inverter TIA as a function of device width assuming a photodetector capacitance of 50 fF.

With the link parameters used in Section II, the EE of the receiver, taking into account the total power consumption calculated from (3.32), is shown in Fig. 7.3. The DC power consumed in the receiver as well as the minimum transmit laser power requirement is also shown in Fig. 7.3. Introduction of  $K_Z$  allows an estimation of the power consumption of a multistage receive chain based on the desired gain while assuming the TIA dominates the IRNC. Although the calculated bandwidth of the TIA will be further limited in the multistage design, series inductive peaking allows for bandwidth adjustment. The total power consumption is minimized to 90 mW with 45 mW of receiver DC power consumption for an 8-$\mu$m device, resulting in 32 GHz of bandwidth, which should support the desired SR of 56 GBd. A higher bandwidth is possible through passive inductive peaking, which extends the frequency response to 40 GHz without extra power in the receiver chain. However, this frequency response will increase the IRNC slightly and reduce the maximum  $R_f$  to 375 $\Omega$. Consequently, the TIA illustrated in Fig. 7.1 uses M1 = 8.25 $\mu$m and M2 = 10 $\mu$m to scale the PMOS slightly according to the relative velocity. The EE of just under 1 pJ/bit considers only a single channel of the receiver. The TIA stage consumes around 2.4 mW with 47-dB$\Omega$ gain, suggesting a transimpedance power efficiency of  $K_Z = 0.01 \text{ mW}/\Omega$  as estimated in Section II. The calculations also predict 3.2-$\mu$A IRNC and a desired transimpedance of 66 dB$\Omega$ for the receive chain. To achieve the higher desired transimpedance, a cascade of inverter cells with scaled transistors as well as inductive peaking follows the TIA to minimize loading effects and bandwidth reduction.
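The transimpedance power efficiency quoted above can be reproduced from the TIA power and gain. The sketch below assumes $K_Z$ is simply the stage power divided by its transimpedance in ohms, which is how the quoted numbers appear to combine; the formal definition in the earlier analysis may include additional factors.

```python
def ohms_from_dbohm(z_dbohm):
    """Convert a transimpedance in dB-ohm to ohms."""
    return 10.0 ** (z_dbohm / 20.0)

def transimpedance_power_efficiency(power_mw, z_dbohm):
    """Assumed definition: stage power (mW) per ohm of transimpedance."""
    return power_mw / ohms_from_dbohm(z_dbohm)

# Values quoted in the text for the TIA stage
p_tia_mw = 2.4
z_tia_dbohm = 47.0

z_tia_ohm = ohms_from_dbohm(z_tia_dbohm)
k_z = transimpedance_power_efficiency(p_tia_mw, z_tia_dbohm)

print(f"Z_T = {z_tia_ohm:.0f} ohm, K_Z = {k_z:.4f} mW/ohm")
```

The result is close to 0.011 mW/$\Omega$, consistent with the rounded 0.01 mW/$\Omega$ stated in the text.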



Figure 7.3: EE of the design as a function of device width

Fig. 7.4 plots the simulated receiver frequency response assuming 400-pH output wirebond inductance from the driver to a printed circuit board assembly. The post-layout simulations indicate that a 40-GHz 3-dB bandwidth is achievable. Measured S-parameters of an electrical test structure with wirebond assembly are cascaded with the PD model to determine the transimpedance of the packaged receiver. The measured result is also shown in Fig. 7.4, including the 4-inch cable connection to the VNA and the PCB traces. The slight discrepancy between simulation and measurement may be attributed to the PCB packaging and the connection to the measurement device, which is present in all time-domain measurements as well.



Figure 7.4: Comparison of simulated transimpedance for the RX channel and measurements based on an electrical test structure

The simulated output voltage noise for the PD operating under dark current and with 0-dBm unmodulated optical power is shown in Fig. 7.5. Higher DC currents flowing in the PD will increase the shot noise current at the input. The IRNC integrated across twice the bandwidth will increase from 3.9  $\mu$A for dark current to 5.6  $\mu$A for 0.93-mA DC current at the PD.
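A rough consistency check on these two noise figures can be made with the textbook shot-noise expression $\sqrt{2qIB}$. The sketch below assumes the shot noise is white, uncorrelated with the amplifier noise (so the two add in quadrature), and generated by a single 0.93-mA PD current; the effective noise bandwidth it returns is inferred from the quoted numbers and is not a value stated in the text.

```python
import math

Q_E = 1.602176634e-19  # elementary charge in coulombs

def shot_noise_rms(i_dc_a, bw_hz):
    """RMS shot-noise current sqrt(2 q I B) for DC current I over bandwidth B."""
    return math.sqrt(2.0 * Q_E * i_dc_a * bw_hz)

# Values quoted in the text
i_dark_a = 3.9e-6   # integrated IRNC under dark current
i_lit_a = 5.6e-6    # integrated IRNC with 0.93-mA PD current
i_dc_a = 0.93e-3    # PD DC current at 0-dBm optical power

# Extra noise needed to go from the dark to the illuminated figure (quadrature sum)
i_added_a = math.sqrt(i_lit_a ** 2 - i_dark_a ** 2)

# Effective bandwidth over which pure shot noise would supply that extra noise
bw_eff_hz = i_added_a ** 2 / (2.0 * Q_E * i_dc_a)

# Round trip: dark noise plus shot noise over bw_eff reproduces the lit figure
i_check_a = math.hypot(i_dark_a, shot_noise_rms(i_dc_a, bw_eff_hz))

print(f"added noise = {i_added_a * 1e6:.2f} uA rms")
print(f"implied effective noise bandwidth = {bw_eff_hz / 1e9:.0f} GHz")
```

Under these assumptions the added noise is about 4 $\mu$A rms, which corresponds to shot noise integrated over a few tens of gigahertz, the same order as the integration range described above.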

Figure 7.5: Power spectral density of output noise voltage for PD operating in dark current and 0-dBm optical power, translating to 0.93-mA PD current.

The co-simulation of the photonic and electronic circuits is shown in Figs. 7.6a and 7.6b. The demodulation of the QPSK constellation at 40 and 56 GBd is performed using the post-layout CORX circuitry. No noise is added to the transient simulation, and the impairments in the eye indicate slight intersymbol interference. The simulated error vector magnitude (EVM) equals -10.9 dB for the constellation shown at 56 GBd and -14.5 dB for the constellation shown at 40 GBd.

Figure 7.6: Simulated constellations on the CORX (a) 40 GBd with EVM = -14.5 dB (b) 56 GBd with EVM = -10.9 dB and normalized amplitudes

As shown in Fig. 7.1, the design also includes a Costas phase/frequency detector with detailed analysis in [34, 35] to enable analog phase recovery of LO similar to the implementation in [54].

### 7.3 Coherent Transmitter

#### 7.3.1 Optical Front-end

The TX PIC was fabricated in Intel's silicon photonic process and includes a dual-polarization (DP) IQ traveling-wave Mach-Zehnder modulator (MZM) with more than 30-GHz EO bandwidth. Previous work has reported on the design and performance of the SiPh DP-IQ MZM and includes detailed DP transmitter measurements [79].

#### 7.3.2 Electrical Front-end

Fig. 7.7a provides a schematic of the MZM driver EIC fabricated in a 90-nm GlobalFoundries SiGe BiCMOS process (9HP). The output stage load resistor  $R_L$  is 200 Ω to reduce the total current required to drive the 30-Ω MZM termination while suppressing backward reflections. The dual-channel driver consumes 250 mW (2.2 pJ/bit/channel). The driver also includes a continuous-time linear equalizer (CTLE) circuit in the output stage to peak the output. This is realized by the emitter degeneration shown in Fig. 7.7a. As the operating frequency increases, the emitter degeneration impedance decreases and hence the gain of the driver increases, resulting in a peak in the frequency response. As shown in Fig. 7.8, the CTLE circuit generates 11 dB of peaking at 36 GHz to compensate for bandwidth degradation in the silicon modulator. The simulated driver circuit exhibits 66-GHz bandwidth and can provide 2-V peak-to-peak swing excluding packaging and parasitic components. Trade-offs between driver and laser power as well as the optimum drive swing for TX EE can be found in [58].

The measured TX 56-GBd constellation and eye diagram with a 70-GHz reference PD are shown in Figs. 7.9a and 7.9b. Moreover, Fig. 7.9c shows the BER measurement with -2-dBm LO power per PD. This constellation offers a baseline comparison to the QPSK constellations that will be plotted for the CORX.

Figure 7.7: (a) Coherent optical transmitter including differential driver with CTLE and I/Q MZMs and (b) transmitter assembly used for testing.

### 7.4 Receiver Measurement Results

The chip micrographs for the TX and RX chips and the chip-on-board assemblies are illustrated in Figs. 7.7b and 7.10. The TX chip measures 3.4 mm by 8.25 mm. The entire RX MEPIC is contained within 2.6 mm by 1.1 mm, where a significant area is required for the LO phase shifter. The optical hybrid and electronics occupy roughly equal areas. The die is wirebonded to a high-speed test PCB.

Figure 7.8: Simulated S21 of the driver showing 11 dB of peaking at 36 GHz and 66 GHz of 3-dB bandwidth.

The measurement setup is illustrated in Fig. 7.11. For testing, a 1310-nm external cavity laser (ECL) splits into the LO and signal paths where 25% of ECL power goes to LO and 75% goes to the transmitter. In the signal path, a coherent transmitter is driven with a 500-mV PRBS-15 signal from a bit pattern generator (BPG) (SHF 12105A).

Figure 7.9: Standalone TX (a) constellation (b) eye diagram at 56 GBd with 18-mV swing and 4-mV eye opening, and (c) BER as a function of RX input power for -2-dBm LO power per PD.

Figure 7.10: Chip micrograph and PCB assembly for the coherent optical receiver.

Figure 7.11: Self-homodyne test setup for link testing of the coherent optical receiver.

Figure 7.12: Power consumption from the current drawn from  $V_{DD}$,  $V_{DD,buffer}$  and optical tuning element.

The signal path also includes an O-band fiber amplifier (PDFA), which compensates for the high coupling loss in both the transmitter and receiver, and an attenuator for sensitivity measurements. The ECL output power is set to 20 dBm for 200-mA input current, providing 14-dBm LO power and 18.7-dBm input power to the TX. The signal power at the output of the attenuator with a minimum attenuation of 0.6 dB is 2.6 dBm. The 14-dBm LO power and 2.6-dBm signal power correspond to 0.3-mA LO and 4-$\mu$A peak signal current per PD, indicating a coupling loss of 12.2 dB for the LO and 19.8 dB for the signal. Manual alignment and the sensitivity of the couplers to mechanical perturbations contributed to the high coupling loss. Probes are used only to introduce the optical fibers to the waveguide grating couplers on the receiver chip.

The excessive optical loss in the link measurement is due to non-ideal coupling as well as other components in the setup that significantly attenuate the modulated signal. As described in Section II, the modulators are biased at minimum transmission, resulting in very low signal power, and are driven with a limited swing that does not provide the full  $V_{\pi}$  swing, further limiting the modulation factor for the QPSK signal. With the additional optical loss compared to the initial estimation and a 50/50 split in the laser power, the modulated signal power was significantly limited, generating less than 1-$\mu$A current. In theory, the LO can compensate for low signal swing as shown in (3.4). However, in practice, the transmitter is not ideal and its noise affects the received signal, and the signal becomes undetectable even with an ideal noiseless amplifier in the receiver as the optical SNR (OSNR) reduces. Moreover, when the modulated current is boosted with a very high LO power, the DC current shown in (3.4) increases more than the AC signal, which results in higher shot noise that was originally neglected in the analysis, where the noise contribution was assumed to be due to the thermal and channel noise in the receiver. To compensate for the high transmitter loss and generate a detectable signal at the receiver, a higher portion of the optical power was split into the transmitter.
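The LO side of the optical power budget described in this section can be retraced in decibel arithmetic. The sketch below assumes the optical hybrid divides the coupled LO power equally among four PDs and uses the 0.8-A/W responsivity quoted in chapter 6; both are assumptions made here, although they reproduce the -4.2-dBm per-PD LO power and the 0.3-mA LO current reported in this chapter.

```python
import math

def dbm_to_mw(p_dbm):
    """Convert dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

def split_dbm(p_dbm, fraction):
    """Power in dBm after taking a linear fraction of the input."""
    return p_dbm + 10.0 * math.log10(fraction)

# Values quoted in the text
ecl_dbm = 20.0
lo_fraction, tx_fraction = 0.25, 0.75
lo_coupling_loss_db = 12.2

# Assumptions (not stated explicitly for this measurement)
n_pd = 4        # hybrid splits the LO equally among four PDs
r_pd = 0.8      # A/W, responsivity quoted in chapter 6

lo_dbm = split_dbm(ecl_dbm, lo_fraction)
tx_dbm = split_dbm(ecl_dbm, tx_fraction)
lo_per_pd_dbm = split_dbm(lo_dbm - lo_coupling_loss_db, 1.0 / n_pd)
i_lo_ma = r_pd * dbm_to_mw(lo_per_pd_dbm)   # A/W times mW gives mA

print(f"LO = {lo_dbm:.1f} dBm, TX input = {tx_dbm:.1f} dBm")
print(f"LO per PD = {lo_per_pd_dbm:.1f} dBm -> {i_lo_ma:.2f} mA")
```

Shifting the split from 50/50 to 25/75 costs the LO about 3 dB while giving the lossy transmitter path roughly 1.8 dB more power.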

Figure 7.13: QPSK raw and sampled constellations at (a) 28 Gbaud and (b) 40 Gbaud with BER less than  $1 \times 10^{-5}$, (c) 56 Gbaud with  $1.5 \times 10^{-3}$  BER, and (d) 60 Gbaud with  $6 \times 10^{-3}$  BER.

The receiver outputs are connected through high-speed 2.4-mm connectors to a real-time oscilloscope (RTO). The receiver I/Q channels are connected to a 70-GHz RTO (Keysight UXR0702A) with a 0.875-µs acquisition time at 256 GSa/s to capture the received QPSK signal. The differential, dual-channel electrical circuit draws 42-mA current from a 1.1-V supply, or 46.2 mW. The adder used in the Costas loop draws 5.4 mA from a 1.5-V supply, consuming 8 mW. The thermal phase shifter inside the optical hybrid consumes 36 mW for quadrature bias, corresponding to 82.2-mW DC power consumption for the data path and an additional 8 mW for the Costas implementation. A significant portion of the total receiver power is therefore consumed in the optical tuning elements. Fig. 7.12 details the simulated power breakdown compared to the total measured power consumption of the CORX.
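The power breakdown and the headline efficiency can be tallied as follows. The sketch assumes the 0.73-pJ/bit figure refers to the data-path power (electronics plus thermal phase shifter, excluding the Costas adder) at 112 Gbps, which is how the quoted numbers combine; the dictionary keys are labels chosen here for illustration.

```python
# Supply currents and voltages quoted in the text
breakdown_mw = {
    "rx_electronics": 42e-3 * 1.1 * 1e3,   # 42 mA from 1.1 V
    "thermal_phase_shifter": 36.0,         # quadrature bias heater
    "costas_adder": 5.4e-3 * 1.5 * 1e3,    # 5.4 mA from 1.5 V
}

# Data path excludes the Costas adder (assumption about the 0.73-pJ/bit figure)
data_path_mw = (breakdown_mw["rx_electronics"]
                + breakdown_mw["thermal_phase_shifter"])

bit_rate_gbps = 2 * 56.0   # QPSK carries 2 bits per symbol at 56 GBd

ee_pj_per_bit = data_path_mw / bit_rate_gbps   # mW per Gbps equals pJ/bit
heater_share = breakdown_mw["thermal_phase_shifter"] / data_path_mw

for name, p_mw in breakdown_mw.items():
    print(f"{name:>22s}: {p_mw:5.1f} mW")
print(f"data path: {data_path_mw:.1f} mW -> {ee_pj_per_bit:.2f} pJ/bit")
print(f"heater share of data path: {heater_share:.0%}")
```

The thermal phase shifter alone accounts for roughly 44% of the data-path power, which quantifies the remark about optical tuning elements.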

Figs. 7.13a, 7.13b, 7.13c, and 7.13d plot the measured QPSK constellations at 28, 40, 56, and 60 Gbaud based on the I/Q electrical outputs of the receiver. The constellations at 28 and 40 Gbaud illustrate slight gain and phase imbalance. The constellations on the left show the transitions between symbols, while the constellations on the right are sampled to isolate the effects of ISI and noise in broadening the points and causing errors. The constellations at 56 and 60 Gbaud indicate lower imbalance but higher noise and intersymbol interference (ISI) contributions. Figs. 7.14a, 7.14b, 7.14c, and 7.14d compare the all-electrical eye diagrams, driven with a 40-mV input voltage generated directly from the SHF 12105A railing at 200 mV, with the full-link optical eyes.



Figure 7.14: All electrical and full link optical eyes at (a) 28 Gbaud (b) 40 Gbaud (c) 56 Gbaud (d) 60 Gbaud.

The bit-error rate (BER) as a function of signal power incident at each PD is shown in Fig. 7.15. At lower data rates, the error rate is mainly due to noise, while at higher data rates ISI degrades the error rate and sensitivity. The receiver input-referred noise current (IRNC) can be estimated at 28 GBd, where the minimum signal power to achieve a BER below the FEC limit of  $3.8 \times 10^{-3}$  is -35 dBm. The LO power of -4.2 dBm incident at each PD results in a sensitivity of 19.7  $\mu$ A assuming  $R_{PD} = 0.9$ A/W (-19.6 dBm optical power from  $\sqrt{P_{LO}P_{RX}}$, calculated from (3.4)). The IRNC can be estimated from (3.27) to be roughly 3.6  $\mu$ A. The LO power is constant for all BRs, yielding a sensitivity of -14.6 dBm at 40 GBd and -13 dBm at 56 GBd. The observed degradation above 28 GBd is worse than expected and was not predicted in RX simulations, but it can be attributed to other frequency-dependent non-idealities, including group delay dispersion and power supply sensitivity. Based on electrical eye openings, more BER degradation is expected as the data rate increases from 40 GBd to 56 GBd than from 28 GBd to 40 GBd. In the full optical link measurements, the received signal has limited eye opening and worse OSNR at higher data rates, which could further limit the sensitivity above 28 GBd. To compensate, fiber amplifiers are also used to improve optical swing, but also



Figure 7.15: BER curves at different bit rates indicating the power penalty to higher data rates as referenced to the FEC limit.

optical noise for higher Baud rates. Note that all data include the packaging and cable connections to the measurement equipment, which are not included in the simulation, so their effect is not predicted.
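The 28-GBd sensitivity numbers above can be reproduced with a short calculation. This is a sketch under stated assumptions: equation (3.4) is not reproduced in this section, so the prefactor of 2 on the beat current is an assumption chosen because it recovers the quoted 19.7 $\mu$A, and the helper names are hypothetical.

```python
# Sketch: coherent beat power and photocurrent at the 28-GBd sensitivity point.
# Assumption: signal current = scale * R_PD * sqrt(P_LO * P_RX) with scale = 2.
# The factor of 2 is NOT taken from the text; it is picked to match 19.7 uA.

def dbm_to_mw(p_dbm):
    """Convert optical power from dBm to mW."""
    return 10 ** (p_dbm / 10)


def beat_power_dbm(p_lo_dbm, p_rx_dbm):
    """sqrt(P_LO * P_RX) expressed in dBm is the average of the two dBm values."""
    return 0.5 * (p_lo_dbm + p_rx_dbm)


def signal_current_ua(p_lo_dbm, p_rx_dbm, r_pd_a_per_w=0.9, scale=2.0):
    """Beat photocurrent in microamps for the assumed prefactor `scale`."""
    beat_mw = dbm_to_mw(beat_power_dbm(p_lo_dbm, p_rx_dbm))
    return scale * r_pd_a_per_w * beat_mw * 1e3  # mW * A/W = mA, then mA -> uA


p_lo_dbm = -4.2    # LO power incident at each PD
p_rx_dbm = -35.0   # minimum signal power at 28 GBd for BER below the FEC limit

print(f"sqrt(P_LO * P_RX) = {beat_power_dbm(p_lo_dbm, p_rx_dbm):.1f} dBm")
print(f"signal current    = {signal_current_ua(p_lo_dbm, p_rx_dbm):.1f} uA")
```

With these inputs the beat power evaluates to -19.6 dBm and the current to about 19.7 $\mu$A, matching the values quoted in the text.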

The Costas loop performance was investigated in an electrical test structure by generating quadrature beat tones at the I/Q input with 80-mV swing, translating to a 600-$\mu$A current. The measured output voltage is shown in Fig. 7.16. To investigate the PFD performance, let us revisit the analysis in chapter 2 and examine how it may enable a self-homodyne link. The ideal Costas PFD response is analyzed in [35] and is shown in (7.5), where  $V_{PFD} = Z_{TIA} \cdot G_{mix} \cdot I_{PD}$  assuming the LA stage fully limits the signal and the addition is ideal and perfectly linear. Here,  $G_{mix} = 2/\pi$  is the gain of the passive mixer stage.



Figure 7.16: PFD output as a function of phase error.

$$PFD_{out}(t) = \begin{cases} V_{PFD} \cdot \sin(\Phi(t)), & -\frac{\pi}{4} < \Phi(t) < \frac{\pi}{4}, \\ -V_{PFD} \cdot \cos(\Phi(t)), & \frac{\pi}{4} < \Phi(t) < \frac{3\pi}{4}, \\ -V_{PFD} \cdot \sin(\Phi(t)), & \frac{3\pi}{4} < \Phi(t) < \frac{5\pi}{4}, \\ V_{PFD} \cdot \cos(\Phi(t)), & \frac{5\pi}{4} < \Phi(t) < \frac{7\pi}{4}, \end{cases} \tag{7.5}$$

Fig. 7.16 also shows the ideal PFD response as a baseline for performance analysis considering  $Z_{TIA} = 223\ \Omega$  and  $I_{PD} = 600 \ \mu A$ . The finite LA stage gain limits the PFD swing in the measurement. The response also suffers from imbalance in amplitude and zero crossing, which may be due to slight gain mismatch between the I and Q channels and DC offsets in the mixing stage. For a more symmetrical response, gain control and DC offset compensation circuitry should be added to the design.
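The ideal response in (7.5) is straightforward to evaluate numerically. The following is a minimal sketch, not the code used to produce Fig. 7.16: the function name is hypothetical, $Z_{TIA} = 223$ is assumed to be in ohms, and the phase error is wrapped into $[-\pi/4, 7\pi/4)$ so that a single period of (7.5) covers any input. At the branch boundaries, where (7.5) is discontinuous, the sketch simply takes the upper branch.

```python
import math

# Values from the text; Z_TIA is assumed to be in ohms.
Z_TIA_OHM = 223.0
G_MIX = 2.0 / math.pi   # passive mixer gain
I_PD_A = 600e-6         # input current of the electrical test structure

# Ideal PFD amplitude, V_PFD = Z_TIA * G_mix * I_PD, in volts.
V_PFD = Z_TIA_OHM * G_MIX * I_PD_A


def costas_pfd(phi, v_pfd=V_PFD):
    """Ideal QPSK Costas PFD output from (7.5) for a phase error phi in radians."""
    # Wrap the phase error into [-pi/4, 7*pi/4) so the four branches apply.
    p = (phi + math.pi / 4) % (2 * math.pi) - math.pi / 4
    if p < math.pi / 4:
        return v_pfd * math.sin(p)
    if p < 3 * math.pi / 4:
        return -v_pfd * math.cos(p)
    if p < 5 * math.pi / 4:
        return -v_pfd * math.sin(p)
    return v_pfd * math.cos(p)


print(f"V_PFD = {V_PFD * 1e3:.1f} mV")
for deg in (-30, 0, 30, 60, 90, 120):
    print(f"{deg:4d} deg -> {costas_pfd(math.radians(deg)) * 1e3:7.2f} mV")
```

The response repeats every $\pi/2$ and crosses zero at multiples of $\pi/2$, which is the four-fold symmetry expected for QPSK. With the values above, the ideal amplitude evaluates to roughly 85 mV.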

Network switching would ideally be much faster than the dynamics of a phase-locked loop for carrier recovery. Two solutions for eliminating this bottleneck are possible. The first is the self-homodyne approach discussed here, where the clock is sent along with the data. This eliminates the need to track the frequency, so the loop only has to adjust to the phase rapidly. Second, if a network architecture really benefits from generating the LO locally, techniques for rapidly acquiring the LO might use non-linear adaptation schemes to change the loop filter dynamically, allowing a fast acquisition period followed by a longer time constant to improve the phase noise rejection.

As discussed in chapter 2, the linear Costas loop follows a linear phase model, where the Costas PFD provides an error voltage based on the initial phase error between the signal and the LO. To remove the high-frequency component of the PFD output and provide high DC gain, a loop filter should follow the Costas PFD, and its output drives the optical phase tuner. An integrator is an ideal choice for the loop filter. The optical phase tuner can be modeled as a voltage-controlled delay line (VCDL) that provides a variable time delay or phase shift in the signal as a function of the voltage applied to it. Let us assume that the VCDL has a linear phase response and can be modeled as  $\phi_{out} = \phi_{in} + k_{VCDL}V_{cont}$ . Using the linear model, the phase through the loop obeys the following equation

$$\phi_{out} = \phi_{in} + \frac{K_{LF}}{S} k_{PFD} k_{VCDL} (\phi_{in} - \phi_{out}). \tag{7.6}$$

Consequently, the phase error,  $\phi_e = \phi_{in} - \phi_{out}$ , follows

$$\phi_e \cdot \frac{K_{LF}}{S} k_{PFD} \cdot k_{VCDL} = -\phi_e, \tag{7.7}$$

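For the equation to hold across all frequencies,  $\phi_e$  should approach 0. Equivalently, multiplying (7.7) by  $S$  gives  $S\phi_e = -K_{LF} k_{PFD} k_{VCDL}\,\phi_e$, which describes a first-order decay of the phase error. In the time domain, when the input phase fluctuates slowly, the output phase follows the input phase with a time delay. We have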

$$\phi_{out} = \phi_{in} - \phi_{e0} e^{\frac{-t}{\tau_L}},\tag{7.8}$$

where  $\phi_{e0}$  is the initial phase error, and  $\tau_L = \frac{1}{K_{LF}k_{PFD}k_{VCDL}}$  is the loop time constant. At  $t = 7\tau_L$ , the phase error is reduced to roughly  $10^{-3}$  of its initial value. Hence, to ensure the loop can track the phase faster than the phase variations,  $k_{PFD}$  and the loop filter should be designed properly.

The PFD response depends strongly on optical power. The PFD swing is proportional to  $R_{PD}\sqrt{P_{LO}P_{RX}}$ . For instance, the PFD swing for a 600-$\mu$A current is 100 mV. Assuming a linear attenuation, the PFD swing is expected to reduce to 7.5 mV for the receiver sensitivity of -13 dBm at 56 GBd. Also assuming the optical phase tuner has  $k_{VCDL} = 0.5$ rad/V and the integrator has a time constant  $(1/K_{LF})$  of 10 ps, the loop takes 15 ns to reduce the phase error to  $10^{-3}$  of its initial value.
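The settling estimate follows directly from the loop time constant $\tau_L = 1/(K_{LF} k_{PFD} k_{VCDL})$. The sketch below evaluates it for the quoted tuner gain and integrator time constant. The text does not state which PFD gain was used for its 15-ns figure, so the code assumes the small-angle slope of (7.5), $k_{PFD} \approx V_{PFD}$ per radian, with $V_{PFD}$ set to the 7.5-mV swing. That choice, like the function names, is an assumption made for illustration.

```python
import math


def loop_time_constant(tau_lf_s, k_pfd_v_per_rad, k_vcdl_rad_per_v):
    """tau_L = 1 / (K_LF * k_PFD * k_VCDL), where K_LF = 1 / tau_lf_s."""
    return tau_lf_s / (k_pfd_v_per_rad * k_vcdl_rad_per_v)


def residual_error(t_s, tau_l_s, phi_e0=1.0):
    """First-order decay of the phase error: phi_e0 * exp(-t / tau_L)."""
    return phi_e0 * math.exp(-t_s / tau_l_s)


tau_lf = 10e-12   # integrator time constant 1/K_LF from the text, in seconds
k_vcdl = 0.5      # optical phase tuner gain from the text, in rad/V
k_pfd = 7.5e-3    # ASSUMED: small-angle PFD slope equal to the 7.5-mV swing, V/rad

tau_l = loop_time_constant(tau_lf, k_pfd, k_vcdl)
settle = 7 * tau_l   # 7 time constants, since exp(-7) is about 1e-3

print(f"tau_L        = {tau_l * 1e9:.2f} ns")
print(f"7 * tau_L    = {settle * 1e9:.1f} ns")
print(f"residual     = {residual_error(settle, tau_l):.2e} of the initial error")
```

Under this particular assumption the settling time comes out at roughly 19 ns, the same order as the 15 ns quoted in the text. The exact figure depends on how the PFD gain is extracted from the swing, for example from the small-angle slope or from the peak-to-peak swing over the linear range.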

A performance summary for this design is provided in Table 7.1 with a comparison against recent work at similar data rates. Notably, this result is fully integrated and was tested on a PCB assembly rather than probed electrically. The finFET CMOS design has shown excellent power; however, this process does not support silicon photonic integration and the measured results are not for a full-link optical assembly. When compared to the prior monolithic coherent design in O-band, this design achieved a 6-fold improvement in energy efficiency for similar data rates. Compared to the monolithic IMDD design, the energy per bit we achieved was almost half for the same data rate. Although SNR requirements are stricter for a PAM4 receiver than for QPSK to achieve the same BER, the low IRNC in [76] allows the sensitivity of -12 dBm required for FEC-level BER. Compared to coherent designs, this work achieves the best sensitivity except for [81], which has a much lower BW and hence lower integrated noise. This design also has the highest FOM, defined as  $Z_T BW/P_{DC}$, among coherent receivers.

### 7.5 Conclusion

This chapter described a coherent optical receiver, fabricated in a 45-nm CMOS silicon photonic technology, that achieves 0.73 pJ/bit at 56 GBd. An analysis of the trade-offs between device speed and energy efficiency illustrates the optimal input-referred noise current. Design optimization, including transistor sizing, together with the monolithic technology with high  $f_T$, allows for maximized FOM and the best EE when compared to other coherent designs at the same data rates. Measured constellations and sensitivity curves indicate bit error rates below the forward error correction limit of  $3.8 \times 10^{-3}$.

### 7.6 Acknowledgment

This chapter is in part a reprint of material in the manuscript, "A 112-Gbps, 0.73pJ/bit Fully-Integrated O-band I-Q Optical Receiver in a 45-nm CMOS SOI-Photonic Process," published in the IEEE Symposium on Radio Frequency Integrated Circuits (RFIC) ©2023 IEEE, and "A Monolithic O-Band Coherent Optical Receiver for Energy-Efficient Links," published in the IEEE Journal of Solid-State Circuits ©2023 IEEE.

Table 7.1: State-of-the-Art Comparison

| Ref | [20] | [80] | [74] | [73] | [92] | [23] | [81] | This work |
|---|---|---|---|---|---|---|---|---|
| Process | 0.25 $\mu$m SiGe-Photonics | 0.25 $\mu$m SiGe-Photonics | 130 nm SiGe BiCMOS EIC, 90-nm Silicon-Photonic PIC | 45 nm CMOS SOI EIC, 90-nm Silicon-Photonic PIC | 22 nm FinFET<sup>*</sup> | 45nm CMOS SOI-Photonic | 45nm CMOS SOI-Photonic | 45nm CMOS SOI-Photonic |
| Interface | Diff | Diff | Diff | Pseudo Diff | SE | S2D | Pseudo Diff | Pseudo Diff |
| Supply (V) | 2.4 | 1 | 3 | - | 0.8 | 1.2 | 1.2 | 1.1 |
| $\mathrm{R_{PD}}$ (A/W) | 0.7 | - | - | - | - | 0.9 | 0.0 | 0.9 |
| Modulation | QPSK<sup>++</sup> | QPSK | QPSK | QPSK | NRZ, PAM4 | PAM4 | QPSK | QPSK |
| Speed (Gb/s) | 128<sup>1</sup> | 112<sup>1</sup> | 100 | 100<sup>1</sup> | 80, 128 | 112<sup>1</sup> | 80 | 112 |
| TI (dB$\Omega$) | 22 | - | 67.2 | 53.6 | 59.3 | 68 | 52.3 | 58.7 |
| EE (pJ/bit) | 3.2 | 4.3 | 3.32<sup>2</sup> | 0.93<sup>2</sup> | 0.098<sup>3</sup> | 1 | 1.2, 0.9<sup>2</sup> | 0.73, 0.41<sup>2</sup> |
| Chip area (mm$^2$) | 2.5 × 1.1 | 3.65 × 1.45 | 1.475 × 1.9 EIC | 1.885 × 1.28 EIC | 0.23 × 0.11<sup>4</sup> | - | 2.7 × 0.84 | 2.63 × 1.1 |
| Sensitivity<sup>+</sup> (dBm) | - | ≈ -14 | -12 | -8.4 | -12<sup>5</sup> | - | -16 | -13 |
| IRNC | - | - | 10.7 pA/$\sqrt{Hz}$ | 6.3 $\mu A_{rms}$ | 2.7 $\mu A_{rms}$ | 17.7 pA/$\sqrt{Hz}$ | 3 $\mu A_{rms}$ | 3.6 $\mu A_{rms}$ |
| FOM | 561 | - | 486 | 112 | 3979 | 616 | 127 | 675 |

<sup>++</sup> C-band, <sup>+</sup> at FEC limit of $3.8 \times 10^{-3}$ at max speed, <sup>*</sup> No optical measurements, <sup>1</sup> With post-processing equalization, <sup>2</sup> EIC only, <sup>3</sup> No integration with PIC, <sup>4</sup> EIC active area without pads, <sup>5</sup> Simulated

# Summary and Outlook

In this dissertation, the implementation of next-generation, power-efficient intra-data center optical links leveraging analog coherent detection was studied. A detailed assessment of homodyne, heterodyne, and self-homodyne analog coherent detection was provided in chapter 2. This thesis was mainly focused on implementing heterodyne and self-homodyne links. In the proposed optical transceiver implementation, the receiver and transmitter driver circuits, as well as the laser power, account for the majority of dissipated power, which motivates the link optimization proposed in chapter 3. Various TIA circuit designs and techniques were also presented for use in low-power, high-bandwidth links based on analog coherent detection.

The significant effects of package parasitic components were evaluated in chapter 4.

A QPSK phase/frequency detector (PFD), which may be employed for analog carrier recovery, was discussed in chapter 5, where a 3.3-pJ/bit I/Q receiver with a Costas loop PFD in a 130-nm SiGe technology was presented. The amplifier and PFD employ a current reuse technique to improve energy efficiency. Electrical characterization demonstrates the functionality of the PFD and I/Q data paths up to 88 Gb/s while maintaining BER below FEC limits.

The advantage of monolithic electronic/photonic integration in optical receivers was further described in chapter 6. The performance of a 1.2-pJ/bit coherent optical receiver fabricated in the 45CLO technology, which supports both silicon photonic components and high-speed electronics, was explored in this chapter. Measured constellations and sensitivity curves showed performance up to 40 Gbaud below the FEC BER limit of  $2.2 \times 10^{-3}$.

Finally, in chapter 7, a 1310-nm (O-band) coherent optical link is demonstrated for short-range optical interconnects that operates up to a 56-GBd symbol rate (SR) (112 Gbps) with FEC-acceptable BER. The coherent optical receiver (CORX) leverages a monolithic 45-nm CMOS SOI photonic-enabled process to realize energy-efficient quadrature phase shift keying (QPSK) demodulation. Co-design of the optical and electronic circuit elements supports high-speed operation and low power consumption. An analysis of the trade-offs between device speed and energy efficiency illustrates the optimal input-referred noise current. Design optimization, including transistor sizing, together with the monolithic technology with high  $f_T$, allows for maximized FOM and the best EE when compared to other coherent designs at the same data rates. The coherent link is demonstrated with an optical transmitter photonic IC (PIC) fabricated in a silicon photonic (SiPh) process with laser diodes, wirebonded to a 90-nm SiGe driver electronic RFIC. The transmitter operates at 5.9-pJ/bit energy efficiency (EE), while the receiver achieves 0.73 pJ/bit, which is, to our knowledge, the best EE reported for a coherent optical receiver.

## 8.1 Future Work

Chapter 2 explores a heterodyne analog coherent receiver design and reviews its circuit implementation. This chip was also taped out and fabricated in the monolithic 45-nm CMOS SOI photonics process. Fig. 8.1 shows the chip micrograph of this design, which is under measurement.



Figure 8.1: Heterodyne monolithic receiver

The CORX designed in chapter 7 showed great potential for an energy-efficient short-range optical link. Consequently, we designed a dual-polarization version of the receiver with an on-chip polarization controller. Fig. 8.2 shows the chip micrograph for this design. Future work includes a dual-polarization coherent link measurement to implement a 224 Gbps/$\lambda$ QPSK coherent link.



Figure 8.2: Dual polarization monolithic receiver

# Bibliography

- [1] Cisco, "Cisco annual Internet report (2018-2023)," 2020, https://www.cisco.com/c/en/us/solutions/collateral/executive-perspectives/annual-internet-report/white-paper-c11-741490.html
- [2] Cisco, "Cisco global cloud index: forecast and methodology, 2016-2021," 2018, https://virtualization.network/Resources/Whitepapers/0b75cf2e-0c53-4891-918e-b542a5d364c5\_white-paper-c11-738085.pdf
- [3] X. Zhou, R. Urata and H. Liu, "Beyond 1 Tb/s Intra-Data Center Interconnect Technology: IM-DD OR Coherent?," in Journal of Lightwave Technology, vol. 38, no. 2, pp. 475-484, 15 Jan.15, 2020, doi: 10.1109/JLT.2019.2956779.
- [4] R. Urata, H. Liu, X. Zhou and A. Vahdat, "Datacenter interconnect and networking: From evolution to holistic revolution," 2017 Optical Fiber Communications Conference and Exhibition (OFC), Los Angeles, CA, USA, 2017, pp. 1-3.
- [5] A. Saleh, K. Schmidtke, R. Stone, J. Buckwalter, L. Coldren, and C. Schow, "INTREPID program: technology and architecture for next-generation, energy-efficient, hyper-scale data centers," in Journal of Optical Communications and Networking, vol. 13, no. 12, pp. 347-359, December 2021.
- [6] J. Cheng, C. Xie, Y. Chen, X. Chen, M. Tang and S. Fu, "Comparison of Coherent and IMDD Transceivers for Intra Datacenter Optical Interconnects," 2019 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA, 2019, pp. 1-3.
- [7] X. Pang et al., "200 Gbps/Lane IM/DD Technologies for Short Reach Optical Interconnects," in Journal of Lightwave Technology, vol. 38, no. 2, pp. 492-503, 15 Jan.15, 2020, doi: 10.1109/JLT.2019.2962322.
- [8] T. Hirokawa et al., "Analog Coherent Detection for Energy Efficient Intra-Data Center Links at 200 Gbps Per Wavelength," in Journal of Lightwave Technology, vol. 39, no. 2, pp. 520-531, 15 Jan.15, 2021

- [9] R. Urata, X. Zhou, and H. Liu, "Beyond 400G: Business as usual or coherent convergence?" in Proc. OFC Workshop Talk: Beyond 400G Hyperscale DCs Workshop, 2019, pp. 1–7.
- [10] R. Nagarajan et al., "Low Power DSP-Based Transceivers for Data Center Optical Fiber Communications," JLT 54, pp, 5221-5231, 2021.
- [11] H. Zhang, "1.6 Tb/s SiPh Coherent Solution for Intra data Center Interconnection", OFC Panel 2022.
- [12] H. Zhang, "Power Efficient Coherent Detection for Short-Reach System," 2023 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA, 2023, pp. 1-3, doi: 10.1364/OFC.2023.M1E.1.
- [13] E. Berikaa et al., "Net 1.6 Tbps O-band Coherent Transmission over 10 km Using a TFLN IQM and DFB Lasers for Carrier and LO," 2023 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA, 2023, pp. 1-3, doi: 10.1364/OFC.2023.Th4B.1.
- [14] J. K. Perin, A. Shastri and J. M. Kahn, "Coherent Data Center Links," in Journal of Lightwave Technology, vol. 39, no. 3, pp. 730-741, 1 Feb.1, 2021, doi: 10.1109/JLT.2020.3043951.
- [15] J. K. Perin, A. Shastri and J. M. Kahn, "Design of Low-Power DSP-Free Coherent Receivers for Data Center Links," in Journal of Lightwave Technology, vol. 35, no. 21, pp. 4650-4662, 1 Nov.1, 2017, doi: 10.1109/JLT.2017.2752079.
- [16] https://www.broadcom.com/info/optics/cpo
- [17] A. Maharry et al., "First Demonstration of an O-Band Coherent Link for Intra-Data Center Applications," 2022 European Conference on Optical Communication (ECOC), 2022,
- [18] A. Maharry et al., "A 224 Gbps/λ O-Band Coherent Link for Intra-Data Center Applications," 2023 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA, 2023, pp. 1-3, doi: 10.1364/OFC.2023.M1E.5
- [19] R. Soref, "The Past, Present, and Future of Silicon Photonics," in IEEE Journal of Selected Topics in Quantum Electronics, vol. 12, no. 6, pp. 1678-1687, Nov.-dec. 2006, doi: 10.1109/JSTQE.2006.883151.
- [20] C. Kress et al., "64 GBd Monolithically Integrated Coherent QPSK Single Polarization Receiver in 0.25 μ m SiGe-Photonic Technology," 2018 Optical Fiber Communications Conference and Exposition (OFC), 2018, pp. 1-3.
- [21] L. A. Valenzuela, J. F. Buckwalter https://rfic.ece.ucsb.edu/files/ opticalrxsurveyv0xlsx

- [22] M. Rakowski, et al. "45nm CMOS Silicon Photonics Monolithic Technology (45CLO) for next-generation, low power and high speed optical interconnects," in Optical Fiber Communication Conference (OFC) 2020, OSA Technical Digest (Optica Publishing Group, 2020), paper T3H.3.
- [23] Thomas Baehr-Jones et al "Monolithically integrated 112 Gbps PAM4 optical transmitter and receiver in a 45nm CMOS-silicon photonics process," Opt. Express 31, 24926-24938 (2023)
- [24] K. Sun, G. Wang, Q. Zhang, S. Elahmadi and P. Gui, "A 56-GS/s 8-bit Time-Interleaved ADC With ENOB and BW Enhancement Techniques in 28-nm CMOS," in IEEE Journal of Solid-State Circuits, vol. 54, no. 3, pp. 821-833, March 2019
- [25] T. Hirokawa, "Wavelength-Selective Photonic Switches for Energy Efficient Reconfigurable Data Center Networks"
- [26] A. A. M. Saleh, "Scaling-out data centers using photonics technologies," in Proc. Photon. Switching Conf., Jul. 2014, Paper JM4B.5.
- [27] A. A. M. Saleh, A. S. P. Khope, J. E. Bowers, and R. C. Alferness, "Elastic WDM switching for scalable data center and HPC interconnect networks," in Proc. 21st Opto Electron. Commun. Conf., 2016, pp. 1–3.
- [28] G. Michelogiannakis et al., "Bandwidth steering in HPC using Silicon nanophotonics," in Proc. Int. Conf. High Perfor. Comput. Netw. Storage Anal., Nov. 2019, pp. 1–25.
- [29] F. Testa and L. Pavesi, Optical Switching in Next Generation Data Centers. Springer, 2017.
- [30] K. Suzuki et al., "Nonduplicate polarization-diversity 32 × 32 Silicon photonics switch based on a SIN/Si double-layer platform," J. Lightw. Technol., vol. 38, no. 2, pp. 226–232, Jan. 2020.
- [31] K. Kwon, T. J. Seok, J. Henriksson, J. Luo, and M. C. Wu, "Large-scale silicon photonic switches," in Proc. Photon. Electromagn. Res. Symp. -Spring, 2019, pp. 268–273.
- [32] T. Hirokawa et al., "An all-optical wavelength-selective O-band chipscale silicon photonic.
- [33] N. Dupuis et al., "A 4 × 4 electrooptic silicon photonic switch fabric with net neutral insertion loss," J. Lightw. Technol., vol. 38, no. 2, pp. 178–184, Jan. 2020.
- [34] M. Lu et al., "An Integrated 40 Gbit/s Optical Costas Receiver," in Journal of Lightwave Technology, vol. 31, no. 13, pp. 2244-2253, July1, 2013

- [35] R.E. Best, N.V. Kuznetsov, G.A. Leonov, M.V. Yuldashev, R.V. Yuldashev, "Tutorial on dynamic analysis of the Costas loop", Annual Reviews in Control, Volume 42, 2016, Pages 27-49,
- [36] M. J. W. Rodwell et al., "Optical Phase-Locking and Wavelength Synthesis," 2014 IEEE Compound Semiconductor Integrated Circuit Symposium (CSICS), La Jolla, CA, USA, 2014, pp. 1-4, doi: 10.1109/CSICS.2014.6978571.
- [37] M. Lu, H. Park, E. Bloch, L. A. Johansson, M. J. Rodwell, and L. A. Coldren, "Highly Integrated Homodyne Receiver for Short-reach Coherent Communication," in International Photonics and OptoElectronics, OSA Technical Digest (online) (Optica Publishing Group, 2015), paper OT2A.4.
- [38] M. J. W. Rodwell et al., "Phase-locked coherent optical interconnects for data links," 2014 Optical Interconnects Conference, San Diego, CA, USA, 2014, pp. 119-120, doi: 10.1109/OIC.2014.6886108.
- [39] M. Lu, H. Park, E. Bloch, L. A. Johansson, M. J. Rodwell, and L. A. Coldren, "A Highly-Integrated Optical Frequency Synthesizer Based on Phase-locked Loops," in Optical Fiber Communication Conference, OSA Technical Digest (online) (Optica Publishing Group, 2014), paper W1G.4.
- [40] J. S. Parker et al., "Highly-Stable Integrated InGaAsP/InP Mode-Locked Laser and Optical Phase-Locked Loop," in IEEE Photonics Technology Letters, vol. 25, no. 18, pp. 1851-1854, Sept.15, 2013, doi: 10.1109/LPT.2013.2277865.
- [41] M. Lu et al., "Monolithic Integration of a High-Speed Widely Tunable Optical Coherent Receiver," in IEEE Photonics Technology Letters, vol. 25, no. 11, pp. 1077-1080, June1, 2013, doi: 10.1109/LPT.2013.2259474.
- [42] S. Misak, A. Maharry, J. Liu, R. Kumar, D. Huang, G. Gilardi, R. Jones, A. Liu, L. Coldren, and C. Schow, "Heterogeneously Integrated O-band SG-DBR Lasers for Short Reach Analog Coherent Links," in OSA Advanced Photonics Congress 2021
- [43] J. Liu, A. Maharry, A. Wissing, H. Andrade, S. Misak, G. Gilardi, S. Liao, A. Liu, Y. Akulova, L. Coldren, J. F. Buckwalter, C. L. Schow, "First O-band silicon coherent transmitter with integrated hybrid tunable laser and SOAs," Proc. SPIE 12426, Silicon Photonics XVIII, 124260A (13 March 2023); https://doi.org/10.1117/12.2668010
- [44] J. Buus, M. Amann, and D. Blumenthal, "Tunable Laser Diodes and Related Optical Sources," New York, Wiley-Interscience, 2005.
- [45] Yang, Changjin, Liang, Lei, Qin, Li, Tang, Hui, Lei, Yuxin, Jia, Peng, Chen, Yongyi, Wang, Yubing, Song, Yu, Qiu, Cheng, Zheng, Chuantao, Zhao, Huan, Li, Xin, Li, Dabing and Wang, Lijun. "Advances in silicon-based, integrated tunable semiconductor lasers" Nanophotonics, vol. 12, no. 2, 2023, pp. 197-217. https://doi.org/10.1515/nanoph-2022-0699

- [46] E. Säckinger, "The Transimpedance Limit," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 57, no. 8, pp. 1848-1856, Aug. 2010, doi: 10.1109/TCSI.2009.2037847.
- [47] C. Ding et al., "110-GHz PLL Frequency Synthesizer With High-Efficiency Voltage-Controlled Oscillator," in IEEE Transactions on Microwave Theory and Techniques, doi: 10.1109/TMTT.2023.3308169.
- [48] Y. Li, T. Tan and X. Li, "A 40.6% Tuning Range Low-Phase-Noise Class-F-1/3 VCO Using Simultaneous Frequency and Harmonic-Mode Switching," in IEEE Transactions on Circuits and Systems II: Express Briefs, doi: 10.1109/TCSII.2023.3327498.
- [49] Y. Li, Z. Huang, L. Song, T. Yang and X. Li, "An X -Band Low-Phase-Noise Class-F 23 VCO Without Manual Harmonic Tuning Based on Switched-Transformer and Wideband Common-Mode Resonance," in IEEE Transactions on Microwave Theory and Techniques, doi: 10.1109/TMTT.2023.3319988.
- [50] X. Meng, H. Li, P. Chen, J. Yin, P. -I. Mak and R. P. Martins, "Analysis and Design of a 15.2-to-18.2-GHz Inverse-Class-F VCO With a Balanced Dual-Core Topology Suppressing the Flicker Noise Upconversion," in IEEE Transactions on Circuits and Systems I: Regular Papers, doi: 10.1109/TCSI.2023.3312817.
- [51] A. Iesurum, D. Manente, F. Padovan, M. Bassi and A. Bevilacqua, "Analysis and Design of Coupled PLL-Based CMOS Quadrature VCOs," in IEEE Journal of Solid-State Circuits, doi: 10.1109/JSSC.2023.3280360.
- [52] Y. -S. Lin, C. -W. Hsu, C. -L. Lu and Y. -H. Wang, "A Low-Power Quadrature Local Oscillator Using Current-Mode-Logic Ring Oscillator and Frequency Triplers," in IEEE Microwave and Wireless Components Letters, vol. 23, no. 12, pp. 650-652, Dec. 2013, doi: 10.1109/LMWC.2013.2283860.
- [53] J. -M. Kim, S. Kim, I. -Y. Lee, S. -K. Han and S. -G. Lee, "A Low-Noise Four-Stage Voltage-Controlled Ring Oscillator in Deep-Submicrometer CMOS Technology," in IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 60, no. 2, pp. 71-75, Feb. 2013, doi: 10.1109/TCSII.2012.2235734.
- [54] R. Ashok, S. Naaz, R. Kamran and S. Gupta, "Analog Domain Carrier Phase Synchronization in Coherent Homodyne Data Center Interconnects," in Journal of Lightwave Technology, vol. 39, no. 19, pp. 6204-6214, Oct.1, 2021
- [55] Agrawal, G., 1992. "Fiber-optic communication systems," John Wiley & Sons.

- [56] Razavi, B., 2012. "Design of integrated circuits for optical communications," John Wiley & Sons.
- [57] E. Säckinger, Broadband Circuits for Optical Fiber Communications. Hoboken, NJ: Wiley, 2005.
- [58] Aaron Maharry, Hector Andrade, Stephen Misak, Junqian Liu, Yujie Xia, Aaron Wissing, Ghazal Movaghar, Viviana Arrunategui-Norvick, Evan D. Chansky, Xinhong Du, Adel A. M. Saleh, James F. Buckwalter, Larry Coldren, and Clint L. Schow, "Integrated SOAs enable energy-efficient intra-data center coherent links," Opt. Express 31, 17480-17493 (2023)
- [59] G. Su, M. N. Sakib, J. Heck, H. Rong, and M. C. Wu, "A heterogeneously-integrated III-V/silicon interferometric widely tunable laser," in OSA Advanced Photonics Congress (AP) 2019 (IPR, Networks, NOMA, SPPCom, PVLED), OSA Technical Digest (Optica Publishing Group, 2019), paper IW3A.1.
- [60] S. Saeedi, S. Menezo, G. Pares and A. Emami, "A 25 Gb/s 3D-Integrated CMOS/Silicon-Photonic Receiver for Low-Power High-Sensitivity Optical Communication," in Journal of Lightwave Technology, vol. 34, no. 12, pp. 2924-2933, 15 June15, 2016
- [61] K. Fu, W. -S. Zhao, D. -W. Wang, G. Wang, M. Swaminathan and W. -Y. Yin, "A Compact Passive Equalizer Design for Differential Channels in TSV-Based 3-D ICs," in IEEE Access, vol. 6, pp. 75278-75292, 2018
- [62] D. Malta et al., "TSV-Last, Heterogeneous 3D Integration of a SiGe BiCMOS Beamformer and Patch Antenna for a W-Band Phased array Radar," 2016 IEEE 66th Electronic Components and Technology Conference (ECTC), 2016
- [63] G. Katti, M. Stucchi, K. De Meyer and W. Dehaene, "Electrical Modeling and Characterization of Through Silicon via for Three-Dimensional ICs," in IEEE Transactions on Electron Devices, vol. 57, no. 1, pp. 256-262, Jan. 2010
- [64] A. Rahimi, P. Somarajan and Q. Yu, "Modeling and Characterization of Through-Silicon-Vias (TSVs) in Radio Frequency Regime in an Active Interposer Technology," 2020 IEEE 70th Electronic Components and Technology Conference (ECTC), 2020, pp. 1383-1389
- [65] V. Blaschke and H. Jebory, "Test structure and analysis for accurate RF characterization of tungsten through silicon via (TSV) grounding devices," 2013 IEEE International Conference on Microelectronic Test Structures (ICMTS)
- [66] J. Yook, Y. Kim, W. Kim, S. Kim and J. C. Kim, "Ultrawideband Signal Transition Using Quasi-Coaxial Through-Silicon-Via (TSV) for mm-Wave IC Packaging," in IEEE Microwave and Wireless Components Letters, vol. 30, no. 2, pp. 167-169, Feb. 2020

- [67] H. Liao and H. Chiou, "RF Model and Verification of Through-Silicon Vias in Fully Integrated SiGe Power Amplifier," in IEEE Electron Device Letters, vol. 32, no. 6, pp. 809-811, June 2011
- [68] M. Wietstruck, S. Marschmeyer, C. Wipf, M. Stocchi and M. Kaynak, "BiCMOS Through-Silicon Via (TSV) Signal Transition at 240/300 GHz for MM-Wave & Sub-THz Packaging and Heterogeneous Integration," 2020 50th European Microwave Conference (EuMC), 2021, pp. 244-247
- [69] L. A. Valenzuela, A. Maharry, H. Andrade, C. L. Schow and J. F. Buckwalter, "A 108-Gbps, 162-mW Cherry-Hooper Transimpedance Amplifier," 2020 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS), 2020, pp. 1-4
- [70] L. A. Valenzuela, A. Maharry, H. Andrade, C. L. Schow and J. F. Buckwalter, "Energy Optimization for Optical Receivers Based on a Cherry-Hooper Emitter Follower Transimpedance Amplifier Front-end in 130-nm SiGe HBT Technology," in Journal of Lightwave Technology, vol. 39, no. 23, pp. 7393-7405, Dec. 1, 2021
- [71] M. G. Ahmed, T. N. Huynh, C. Williams, Y. Wang, P. K. Hanumolu and A. Rylyakov, "34-GBd Linear Transimpedance Amplifier for 200-Gb/s DP-16-QAM Optical Coherent Receivers," in IEEE Journal of Solid-State Circuits, vol. 54, no. 3, pp. 834-844, March 2019, doi: 10.1109/JSSC.2018.2882265.
- [72] A. Awny et al., "23.5 A dual 64Gbaud 10kΩ 5% THD linear differential transimpedance amplifier with automatic gain control in 0.13µm BiCMOS technology for optical fiber coherent receivers," 2016 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2016, pp. 406-407, doi: 10.1109/ISSCC.2016.7418079.
- [73] H. Andrade, Y. Xia, A. Maharry, L. Valenzuela, J. F. Buckwalter and C. L. Schow, "50 GBaud QPSK 0.98 pJ/bit Receiver in 45 nm CMOS and 90 nm Silicon Photonics," 2021 European Conference on Optical Communication (ECOC), 2021, pp. 1-4.
- [74] L. A. Valenzuela, Y. Xia, A. Maharry, H. Andrade, C. L. Schow and J. F. Buckwalter, "A 50-GBaud QPSK Optical Receiver With a Phase/Frequency Detector for Energy-Efficient Intra-Data Center Interconnects," in IEEE Open Journal of the Solid-State Circuits Society, vol. 2, pp. 50-60, 2022

- [75] A. Awny et al., "A Linear Differential Transimpedance Amplifier for 100-Gb/s Integrated Coherent Optical Fiber Receivers," in IEEE Transactions on Microwave Theory and Techniques, vol. 66, no. 2, pp. 973-986, Feb. 2018, doi: 10.1109/TMTT.2017.2752170.
- [76] S. Daneshgar, H. Li, T. Kim and G. Balamurugan, "A 128 Gb/s, 11.2 mW Single-Ended PAM4 Linear TIA With 2.7 $\mu A_{rms}$ Input Noise in 22 nm FinFET CMOS," in IEEE Journal of Solid-State Circuits, vol. 57, no. 5, pp. 1397-1408, May 2022, doi: 10.1109/JSSC.2022.3147467
- [77] S. Daneshgar, H. Li, T. Kim and G. Balamurugan, "A 128 Gb/s PAM4 Linear TIA with 12.6  $pA/\sqrt{Hz}$  Noise Density in 22nm FinFET CMOS," 2021 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), Atlanta, GA, USA, 2021, pp. 135-138, doi: 10.1109/RFIC51843.2021.9490496.
- [78] G. Movaghar, V. Arrunategui, J. Liu, A. Maharry, C. Schow and J. Buckwalter, "A 112-Gbps, 0.73-pJ/bit Fully-Integrated O-band I-Q Optical Receiver in a 45-nm CMOS SOI-Photonic Process," 2023 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), San Diego, CA, USA, 2023, pp. 5-8, doi: 10.1109/RFIC54547.2023.10186202.
- [79] A. Maharry et al., "First Demonstration of an O-Band Coherent Link for Intra-Data Center Applications," in Journal of Lightwave Technology, doi: 10.1109/JLT.2023.3290487.
- [80] P. M. Seiler et al., "56 GBaud O-Band Transmission using a Photonic BiCMOS Coherent Receiver," 2020 European Conference on Optical Communications (ECOC), 2020, pp. 1-4, doi: 10.1109/ECOC48923.2020.9333218.
- [81] G. Movaghar et al., "First Monolithically-Integrated Silicon CMOS Coherent Optical Receiver," 2023 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA, 2023, pp. 1-3, doi: 10.1364/OFC.2023.Th2A.2.