## **UC Berkeley**

### **UC Berkeley Electronic Theses and Dissertations**

#### **Title**

Energy-efficient and High-bandwidth Density Monolithic Optical Transceivers in Advanced CMOS Processes

#### **Permalink**

https://escholarship.org/uc/item/5c17z5w8

#### **Author**

Moazeni, Sajjad

#### **Publication Date**

2018

Peer reviewed|Thesis/dissertation

# Energy-efficient and High-bandwidth Density Monolithic Optical Transceivers in Advanced CMOS Processes

by

Sajjad Moazeni

A dissertation submitted in partial satisfaction of the

requirements for the degree of

Doctor of Philosophy

in

Engineering - Electrical Engineering and Computer Sciences

in the

Graduate Division

of the

University of California, Berkeley

Committee in charge:

Professor Vladimir Stojanović, Chair

Professor Ming C. Wu

Professor Liwei Lin

Summer 2018

Copyright 2018 by Sajjad Moazeni

#### Abstract

Energy-efficient and High-bandwidth Density Monolithic Optical Transceivers in Advanced CMOS Processes

by

Sajjad Moazeni

Doctor of Philosophy in Engineering - Electrical Engineering and Computer Sciences

University of California, Berkeley

Professor Vladimir Stojanović, Chair

Today's conventional cloud computing and mobile platforms have been challenged by the advent of Machine Learning (ML) and Internet of Things (IoT). The performance and diversity requirements of these applications demand the shift towards hyper-scale data centers, Exascale high-performance computing (HPC), energy-efficient edge computing, and new sensing and imaging modalities. My research goal is to design and implement large-scale and energy-efficient integrated systems that answer these technological changes by merging state-of-the-art electronics with photonics.

This thesis develops several monolithic photonics platforms in advanced CMOS technologies that were designed as key enablers for the next generation of integrated systems: (1) Using unmodified CMOS in 32/45 nm SOI nodes places photonics next to some of the fastest transistors and enhances integrated system applications beyond Moore scaling, while allowing major communication tasks to be offloaded from more deeply scaled compute and memory chips without the complications of 3D integration approaches. (2) Polysilicon-based photonics in bulk CMOS provides a path for embedding photonics in the most advanced CMOS nodes (sub-10 nm). We demonstrate system results using these platforms for the immediate application area of high-performance optical transceivers. We elaborate on the electronic-photonic co-optimization opportunities using the example of an optical interconnect application: a 40 Gb/s optical transmitter achieving world-record energy efficiency and bandwidth density. Furthermore, we explain how deep insight into the details of an advanced CMOS process can benefit photonic device design, enabling new degrees of freedom in a seemingly constrained environment. Lastly, we demonstrate the first monolithic integrated photonics platform in a commercial 300 mm-wafer bulk CMOS technology. We implemented a photonic system-on-chip (SoC) in this platform for *in-situ* device characterization and process development, and demonstrated wavelength division multiplexed (WDM) optical transceivers. These integrated platforms and system design methodologies can unlock new functionalities in many applications such as HPC, high-bandwidth wireless connectivity, LiDAR, and bio-sensing.

To My Family

# Contents

| Chapter | Section | Title |
|---------|---------|-------|
|   |     | Contents |
|   |     | List of Figures |
|   |     | List of Tables |
| 1 |     | Introduction |
|   | 1.1 | Thesis Organization and Contributions |
| 2 |     | Preliminaries |
|   | 2.1 | Optical Interconnects |
|   |     | 2.1.3 Laser Sources and Integration |
|   | 2.2 |  |
|   | 2.3 | Zero-change 45nm SOI CMOS Platform |
|   |     | 2.3.1 Waveguides |
|   |     | 2.3.2 Grating Couplers |
|   |     | 2.3.3 Modulators |
|   |     | 2.3.4 Photodetectors |
|   | 2.4 | Photonic SoC Design Flow |
|   | 2.5 | Energy-efficiency & Bandwidth-density |
| 3 |     | Electronic-Photonic Co-optimization |
|   | 3.1 | Motivation |
|   | 3.2 | […] Optical Links |
|   | 3.3 | Ring-resonator based Optical DAC |
|   |     | 3.3.2 Segmented Ring-resonator ODAC in zero-change 45nm SOI platform |
|   | 3.4 | PAM-4 Optical Transmitter Building Blocks |
|   |     | 3.4.1 Transmitter Data-path |
|   |     | 3.4.2 Digital PLL (DPLL) |
|   |     | 3.4.3 Thermal Tuning |
|   | 3.5 | Complete Transmitter Design |
|   | 3.6 | Experimental Demonstration |
|   | 3.7 | Summary |
| 4 |     | Monolithic Photonics in 32nm SOI CMOS |
|   | 4.1 | Monolithic Photonic Platform in Zero-change 32nm SOI CMOS |
|   | 4.2 | Photonic Device Design |
|   |     | 4.2.1 Passive Photonic Devices |
|   |     | 4.2.2 Active Photonic Devices |
|   | 4.3 | Electronic-Photonic Optical Transceivers |
|   | 4.4 | Summary |
| 5 |     | Photonic SoCs in Bulk CMOS |
|   | 5.1 | Monolithic Photonic Platform in 65nm Bulk CMOS |
|   | 5.2 | Photonic Devices |
|   |     | 5.2.1 Loss Mechanisms in Polysilicon Waveguides |
|   |     | 5.2.2 Defect States based Photodetection in Polysilicon |
|   |     | 5.2.3 Passive Devices |
|   |     | 5.2.4 Active Devices |
|   | 5.3 | Design of a Photonic SoC in Bulk CMOS |
|   |     | 5.3.1 Chip Implementation |
|   |     | 5.3.2 Electrical Performance Evaluation |
|   |     | 5.3.3 Electronic-Photonic WDM Transceivers |
|   | 5.4 | Summary |
| 6 |     | Conclusion |
|   | 6.1 | Thesis Summary |
|   | 6.2 | Next-generation Electronic-Photonic Integrated Systems |
|   |     | Bibliography |

# List of Figures

| Figure | Caption |
|--------|---------|
| 1.1  | (a) Evolution of the average top 10 supercomputers normalized to year 2010 (top500–June rankings), (b) Evolution of the top 10 systems average node computer power, node bandwidth, and resulting byte/flop aspect ratio [1] |
| 1.2  | Processor-memory bandwidth trend for high-performance CPU/GPUs |
| 1.3  | (a) Advanced heterogeneous GPU/CPU packaging (AMD GPU), (b) Emerging high-performance computing systems (Nvidia DGX-2 supercomputer) |
| 2.1  | Electrical transceivers power efficiency vs. copper channel loss (from ISSCC Trends 2018) |
| 2.2  | Ring-resonator characteristics and transfer functions |
| 2.3  | A ring resonator based optical link [2] |
| 2.4  | An example of a WDM link using ring resonators [2] |
| 2.5  | Unity current gain frequency ($f_T$) of NMOS devices in IBM/GF processes |
| 2.6  | Hybrid and 3D integration methods: (a) Wire-bond packaging, (b) Next-generation 3D micro bump packaging with Through-silicon vias (TSVs) [3] |
| 2.7  | Wafer-level 3D integration via through-oxide vias (TOV) [4] |
| 2.8  | Commercial fabrication platforms for the CMOS technology |
| 2.9  | Zero-change SOI platform evolution; (a) Development timeline, (b) EOS22 die photo, (c) WDM transceivers, (d) Key photonic devices of an optical link |
| 2.10 | Cross-section of the zero-change 45 nm SOI platform [5]. (Figure is not drawn to scale) |
| 2.11 | Simplified diagrams of two major waveguide geometries in silicon-photonics (strip and ridge waveguides) |
| 2.12 | Scanning electron micrographs (SEM) of a strip waveguide in the zero-change 45 nm SOI platform [6] |
| 2.13 | Simple diagrams of a diffraction grating coupler indicating loss mechanisms |
| 2.14 | (a) 3D layout of a unidirectional grating coupler, (b) Optical transmission at 10.5 degree vertical angle [7] |
| 2.15 | Major $p$-$n$ junction configurations in depletion-mode modulators |
| 2.16 | (a) 3D layout of a spoked-ring modulator, (b) Optical transmission of a WDM transmitter row with 11 channels (numbers indicate channel ordering) over 3.2 THz FSR. Channel 3's heater is turned on by 20% strength to show the individual resonance tuning functionality [7] |
| 2.17 | Photodetectors in zero-change platforms: (a) PMOS cross-sections in 45 nm and 32 nm processes and their features used for O-band light detection, (b) 3D layout of a resonant SiGe PD in 32 nm, (c) and (d) Micrograph and cross-section of the defect-based resonant PD for L-band [7]. |
| 2.18 | Photonic SoC design flow. |
| 2.19 | The flowchart of the Simulink co-optimization framework for silicon-photonic transmitters [8] |
| 2.20 | Estimated energy breakdown for today's commercial silicon-photonic transceivers. |
| 2.21 | Optical power flow in an optical link |
| 3.1  | PAM-4 versus NRZ modulation eye-diagrams. |
| 3.2  | A ring-resonator based optical link. |
| 3.3  | Trade-off between $OMA_{TX,outer}$ and ring resonator's available optical bandwidth. |
| 3.4  | (a) Normalized achievable $OMA_{TX,outer}$ of a ring modulator versus the required bandwidth and (b) the ratio of PAM-4 $OMA_{TX,outer}$ over NRZ. The energy-efficiency improvement at any data rate is determined by the difference of this curve and the PAM-4 receivers' energy penalty (red dashed lines) |
| 3.5  | The ratio of PAM-8 $OMA_{TX,outer}$ over NRZ. The energy-efficiency improvement at any data rate is determined by the difference of this curve and the PAM-8 receivers' energy penalty (red dashed lines) |
| 3.6  | Segmented ring-resonator optical DAC concept |
| 3.7  | Linearity comparison of the proposed optical DAC versus an ideal electrical DAC driven microring modulator for driver's voltage swings of 1.5 V (a) and 4 V (b). |
| 3.8  | Linearity comparison of the proposed optical DAC versus an ideal electrical DAC driven microring modulator for Q-factors of 7.5 k (a) and 15 k (b) |
| 3.9  | 3D layout of a segmented ring-resonator based ODAC in zero-change 45 nm SOI platform |
| 3.10 | Block-diagram of the transmitter's data-path |
| 3.11 | 20 GHz Digital PLL's block-diagram. |
| 3.12 | Circuit diagram of a CDAC unit cell (total of 48 unit cells are connected to the LC-DCO differential output nodes) |
| 3.13 | Thermal sensitivity of the resonance wavelength and its effects on a PAM-4 transmit eye. PAM-4 levels are uncoded while any coding can be applied through the LUT. |
| 3.14 | PAM-4 transmitter's thermal tuning feedback loop block-diagram |
| 3.15 | Simulated thermal tuning procedure; the optimum heater value that maximizes eye-openings is found (*Sweep Phase*) and resonance is locked to the corresponding wavelength by tracking $i_3$ level (*Tracking Phase*) |
| 3.16 | Simulated heater output power and ring's temperature during the thermal tuning process |
| 3.17 | […] ring's resonance and laser's wavelength |
| 3.18 | PAM-4 transmitter's full block-diagram |
| 3.19 |  |
| 3.20 | Test setup and packaging scheme of the test chip. |
| 3.21 | (a) Optical DAC static measurement at low optical input power to avoid thermal drifts. Notice the resonance is shifted down to 1280 nm as cavity's optical power and consequently temperature is lower than the eye-diagram measurement. (b) Normalized optical output for each DAC code compared with an ideal linear DAC (Code 1's output is derived from Codes 0 and 2 transmissions since it is skipped in transmitter's 16b thermometer coding) |
| 3.22 | Transient waveforms with all possible ODAC codes in order to measure linearity. |
| 3.23 | Measured transmit eye-diagrams. The highest and lowest optical levels are the same in both cases since the operating point remains the same |
| 3.24 |  |
| 3.25 | […] tuning and (b) 20 Gb/s PAM-4 transmit eye-diagrams with thermal tuning on and off |
| 4.1  | Cross section of the 32 nm SOI CMOS process |
| 4.2  |  |
| 4.3  | Post-processing steps to release silicon substrate in the case of flip-chip packaging on the PCB. |
| 4.4  | Micrographs of substrate released chips from top and bottom views |
| 4.5  | SEM of a strip waveguide before substrate release step |
| 4.6  | Diffraction grating couplers in 32 nm zero-change platform. (a) 3D layout, (b) Optical transmission with fibers coupled from the bottom side of the die at 12.5 degree coupling angle |
| 4.7  | […] modulator for different biases in the depletion mode |
| 4.8  | An example of a heater shifting the microring's resonance wavelength for various heater strengths |
| 4.9        | (a) SEM of a PMOS in 45 nm SOI process highlighting embedded SiGe (eSiGe) available as a source/drain material [9], (b) SEM of a PMOS in 32 nm SOI process                                                                                                                                                                                                                                                                                           |
|            | showing additional channel SiGe (cSiGe) material available as a channel of this                                                                                                                                                                                                                                                                                                                                                                      |
| 1 10       | device [10], (c) PMOS device's channel SEM                                                                                                                                                                                                                                                                                                                                                                                                           |
| 4.10       |                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| 111        | 32 nm processes indicating available SiGe layers for photodetection                                                                                                                                                                                                                                                                                                                                                                                  |
| 4.11       | SiGe optical characteristics dependency on the concentration of Ge fraction                                                                                                                                                                                                                                                                                                                                                                          |

| 4.12 | Cross-section of eSiGe- and cSiGe-based resonant PDs in the zero-change 32 nm platform. |
|------|------|
| 4.13 | cSiGe-based resonant PD measured characteristics: (a) optical transmission, (b) I-V curve. |
| 4.14 | cSiGe-based resonant PD's electro-optical frequency response |
| 4.15 | (a) Optical transmitter's block-diagram and (b) micro-graph |
| 4.16 | High-swing modulator driver circuit. |
| 4.17 | (a) Optical receiver's block-diagram and (b) micro-graph |
| 4.18 | (a) Transmitter's eye-diagram captured via commercial photo-detector and electrical scope, (b) Receiver's eye-diagram tested by externally modulated input light. |
| 5.1 | Our technique for integrating electronic and photonic devices on a single silicon microchip by adding isolated patches (islands) of the insulator material silicon dioxide to a bulk silicon substrate [11] |
| 5.2 | Photonic integration with nano-scale transistors. (a) Illustration of three major deeply-scaled CMOS processes: planar bulk CMOS, FinFET bulk CMOS, and fully-depleted SOI CMOS, (b) Integration of photonics process module into planar bulk CMOS, (c) SEM of different photonic and electronic blocks in our monolithic platform. |
| 5.3 | Monolithic electronic-photonic platform in 65 nm bulk CMOS. (a) Photograph of a fully fabricated 300 mm wafer with monolithic electronics and photonics, and close-ups of a reticle on this wafer, and a packaged WDM chiplet, (b) Micrograph of a WDM chiplet, (c) Close-up of a single transceiver macro and its photonic and electronic circuit components |
| 5.4 | Photo of a reticle showing various device/system test structures |
| 5.5 | Cross-sectional TEM of a gate polysilicon waveguide integrated in a 28 nm bulk-CMOS process. The core thickness is approximately 73 nm with the surface roughness of 5 nm RMS with a 100 nm correlation length [12] |
| 5.6 | Photonics Platform Performance. (a) Passive components specifications at 1300 nm for partial- and full-flow wafers, (b) Transmission spectrum and the longitudinal cross-section of grating couplers |
| 5.7 | Passive photonic performance at 1300 nm and 1550 nm. (a) Waveguide propagation loss at 1300 nm. Waveguide loss drops with wavelength because of a combination of lower absorption and scattering by polysilicon, (b) Q-factor of a 15 µm diameter microring resonator, (c) Waveguide propagation loss at 1550 nm, (d) One resonance of a 17 µm diameter microring near 1540 nm with a Q-factor of 38,000 |
| 5.8 | Photonics Platform Performance. (a) Microring modulator 3D layout, (b) Transmission spectrum of a modulator resonance with loaded Q-factor of 5,000, (c) Modulator electro-optic frequency response (S21) and the eye-diagram obtained with 2 Vpp drive voltage |

| 5.9 | Photonics Platform Performance. (a) Microring photodiode 3D layout, (b) Responsivity vs. reverse bias voltage. Avalanche gain is observed at biases above 8 V, (c) Photodiode frequency response (S21) under 0 V and 5 V reverse bias with 3 dB bandwidths of 8 GHz and 11 GHz, respectively. Inset shows the eye diagram obtained under 5 V bias |
|------|------|
| 5.10 | [...] diode under dark and illumination for input optical power of $20\,\mu\mathrm{W}$. Dynamic range is $\sim 60\,\mathrm{dB}$ and $\sim 10\,\mathrm{dB}$ at 0 V and 16 V, respectively. (b) One microring photo-detector resonance and the corresponding photo-current as the wavelength is swept across the resonance. (c) NEP (blue curve) of the photo-diode estimated based on the dark-current shot noise, which dominates the detector noise. Avalanche gain is 13 at 16 V bias, with an NEP of $0.27\,\mathrm{pW}/\sqrt{\mathrm{Hz}}$. Simulated SNR (red curve) at the output of the optical receiver, assuming $1\,\mu\mathrm{W}$ optical signal, and a receiver circuit input-referred noise spectral density of $1\,\mathrm{pA}/\sqrt{\mathrm{Hz}}$. (d) The responsivity of the photo-diode vs. input optical power, showing minimal power dependency. (e) and (f) Eye-diagrams at 12.5 Gb/s at 0 V and 14.5 V reverse bias. |
| 5.11 | Cross-section of a WDM photonic SoC packaging |
| 5.12 | WDM photonic SoC top-level (CMOS clock distribution networks for transmit and receive clocks are shown in red) |
| 5.13 | [...] performance and variation inspection, (b) Histogram comparison of measured frequencies with Monte-Carlo simulation results from the original PDK of 65 nm CMOS process. |
| 5.14 | Electro-optical test setup of WDM transceiver chips and block diagram of one WDM transmit-receive row |
| 5.15 | [...] (c) 10 Gb/s transmit eye-diagram using the on-chip PRBS generator, (d) 7 Gb/s receiver bathtub curve, (e) Thermal tuning of one WDM channel using the integrated micro-heater. |
| 5.16 | Block-diagram of the receiver's analog front-end |
| 5.17 | Block-diagram of the duty-cycle corrector (DCC) and delay line (DL) |
| 6.1 | Summary and time-line of all integrated systems described in this thesis |
| 6.2 | (a) Block-diagram of the microsecond optical switching network demonstration, (b) Silicon photonics MEMS switch chip, (c) Electro-optically packaged processor SoC [13] |

# List of Tables

|     | Summary and comparison of non-monolithic silicon photonic platforms |
|-----|-----|
|     | Summary and comparison of monolithic silicon photonic platforms |
| 3.1 | Summary and comparison of the ODAC-based PAM-4 transmitter with prior high-speed optical transmitters |
| 4.1 | Photonic devices in zero-change platforms performance summary |

#### Acknowledgments

Looking back on the last 5 years of my life in Berkeley, I cannot imagine having a better time and experience, and I owe this to my great advisor, collaborators, colleagues, friends, and family.

First and foremost, I would like to thank my advisor, Professor Vladimir Stojanović. He taught me countless invaluable lessons in research and academia. His guidance, passion for research, versatile expertise in different areas, and unique perspective on problem solving fueled my achievements and inspired my future career directions. His level of optimism always amazed me, and proved to me that nothing is impossible! Vladimir was always more than just an academic advisor to me, more like a true role model in life, and I want to thank him for always having faith in me and letting me lead many projects in his group.

Most of the work during my Ph.D. was performed in collaboration with other academic and industrial research groups. I would like to thank Professor Rajeev Ram (MIT), Professor Milos Popović (Boston University), and Professor Ming C. Wu (UC Berkeley) for their technical guidance, leadership, and support. During these team efforts, I really enjoyed learning and working alongside many talented post-docs and students. In particular, Chen Sun mentored me when I had just joined Vladimir's group and passed on many essential skills and knowledge, from design to hands-on experiments in the lab. On the photonics side, I had the privilege of working with Amir Atabaki, Fabio Pavanello, and Mark Wade, to whom I am deeply thankful for their mentorship and guidance. I would also like to thank Christopher Baiocco from the College of Nanoscience and Engineering (CNSE) for providing technical knowledge and fabrication support for many of the projects I was involved in.

I also had the pleasure of sharing this memorable journey with the Integrated Systems Group's students. Special thanks to Krishna Settaluri, Sen Lin, Nandish Mehta, Pavan Bhargava, Taehwan Kim, Sidney Buchbinder, Panagiotis Zarkos, and Christos Adamopoulos for providing everlasting friendships and proving me either right or wrong through our interesting discussions.

The work in this thesis and many other projects I worked on during my Ph.D. would not be possible without the assistance of the amazing administrators and top-notch technical staff at Berkeley Wireless Research Center (BWRC). Fred Burghardt, Brian Richards, Candy Corpus, Amber Sanchez, and many others did such a fantastic job supporting me and other students at BWRC by providing the best imaginable work environment and facilities. I am extremely grateful to be a member of BWRC and enjoy their support.

Special thanks to all my friends for supporting me and being there for me during one of the most stressful and enjoyable moments of my life. In particular, thank you to Ahmad Zareei, Sina Akhbari, Pouria Kourehpaz, Zhaleh Amini, Salar Fattahi, and Ali Moin, with all of whom I was blessed to have a chance of sharing this unforgettable and wonderful time.

Finally, I would like to thank my parents and sisters for their unwavering support and endless encouragement through my whole life. For many years, they patiently tolerated thousands of miles between us created by unfair visa laws. I owe a special thanks to my sister, Somayeh, who was a great inspiration in my life and a main reason for me to continue my graduate studies.

**Support.** The work in this thesis is funded in part by DARPA, NSF, and BWRC. I would also like to thank the CNSE, Trusted Foundries, MOSIS, and the IBM/GlobalFoundries staff involved in handling the design submissions.

# Chapter 1

## Introduction

Today's conventional cloud computing and mobile platforms have been challenged by the advent of Machine Learning (ML) and Internet of Things (IoT). The performance and diversity requirements of these applications demand hyper-scale data centers and Exascale high-performance computing (HPC). Powerful HPC has always been the major driving force behind solving complex and large models, leading to great scientific discoveries, solutions to environmental problems, and inventions in life sciences. Hyper-scale data centers have also played important roles in cloud computing for many applications and in providing global connectivity. Additionally, hyper-scale data centers help alleviate the energy consumption issue of data centers, as large-scale data centers normally operate with significantly higher energy-efficiency [14]. However, a major hurdle in scaling up data centers and enabling next generations of HPC is the energy, cost, and areal bandwidth density of optical transceivers.

Figure 1.1a presents the trend of the average of the top 10 supercomputers normalized to year 2010 [1]. These curves show that the improvement in compute power over recent years was almost solely owing to the boost in single-node compute power caused by massive parallelization and the development of more advanced CMOS nodes. On the other hand, the network size has not increased proportionally. As we are approaching the end of Moore's law, energy/area enhancements of new CMOS nodes are plateauing, and we cannot rely on single-node compute power enhancement for long. Furthermore, the available bandwidth per node has grown significantly more slowly than the compute power per node, and so could not provide adequate memory resources and data exchange per FLOP (floating point operation). Consequently, the byte/FLOP ratio has declined by almost a factor of $6\times$, while the FLOPs/node metric has progressed by nearly $20\times$ (Fig. 1.1b). Smaller byte/FLOP ratios negatively impact parallel efficiency and/or increase the burden on programmers.
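The relation between these trends can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the byte/FLOP ratio is simply the node bandwidth divided by the node compute power and using the approximate factors quoted above; the variable names are illustrative and do not come from [1].

```python
# Growth factors relative to 2010, as approximately quoted in the text.
flops_per_node_growth = 20.0   # FLOPs/node progressed by nearly 20x
byte_per_flop_decline = 6.0    # byte/FLOP ratio declined by almost 6x

# Assumption: byte/FLOP = (node bandwidth) / (node compute power), so the
# bandwidth growth is the compute growth scaled by the change in the ratio.
node_bandwidth_growth = flops_per_node_growth / byte_per_flop_decline

print(f"Implied node bandwidth growth: {node_bandwidth_growth:.1f}x")
```

Under these assumptions the node bandwidth grew by only a small single-digit factor over the same period in which the node compute power grew by nearly $20\times$.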

Figure 1.1: (a) Evolution of the average top 10 supercomputers normalized to year 2010 (top500–June rankings), (b) Evolution of the top 10 systems average node compute power, node bandwidth, and resulting byte/flop ratio [1].

One of the major reasons for this stagnation of the network size and performance is the energy-efficiency and cost of optical transceivers. Today's silicon photonic optical transceivers operate at around $30\,\mathrm{pJ/b}$ energy consumption with a cost of \$5/Gbps. These numbers prohibit us from building the next generations of data centers and HPC. For instance, in the case of HPC, reaching the next milestone of Exascale ($10^{18}$ FLOPS) by 2020 requires about 40 Pb/s of bisection network bandwidth. Supporting this bandwidth using existing silicon photonics demands around 6.8 MW of electrical power [1] at a cost of \$200 M. Considering that the total projected power and cost budgets for the whole system are 25 MW and \$250 M, respectively, we have to find an alternative solution to drastically improve the energy and cost of optical modules. Thus, building the next generations of supercomputers and hyper-scale data centers depends on significant improvement in the energy-efficiency, bandwidth density, and cost of optical transceivers for inter- and intra-rack communication at high data rates.
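The budget argument above can be reproduced with simple arithmetic. The sketch below uses only the figures quoted in the text; the 6.8 MW power figure is taken as given from [1], since its derivation is not spelled out here, and the variable names are illustrative.

```python
# Figures quoted in the text for an Exascale system.
bisection_bw_bps = 40e15         # 40 Pb/s bisection network bandwidth
cost_per_gbps_usd = 5.0          # $5/Gbps for today's optical transceivers
optics_power_w = 6.8e6           # 6.8 MW, taken as given from [1]
system_power_budget_w = 25e6     # 25 MW total projected power budget
system_cost_budget_usd = 250e6   # $250 M total projected cost budget

# Cost of the optical modules: bandwidth in Gb/s times cost per Gb/s.
optics_cost_usd = (bisection_bw_bps / 1e9) * cost_per_gbps_usd

# Share of the whole-system budgets consumed by the optical modules alone.
power_share = optics_power_w / system_power_budget_w
cost_share = optics_cost_usd / system_cost_budget_usd

print(f"Optics cost: ${optics_cost_usd / 1e6:.0f} M")
print(f"Share of power budget: {power_share:.0%}")
print(f"Share of cost budget: {cost_share:.0%}")
```

With these inputs the optical modules alone would consume roughly a quarter of the power budget and most of the cost budget, which is why both metrics need to improve drastically.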

Moreover, the advent of computationally intensive applications such as machine learning (ML) demands larger memory sizes and processor-memory bandwidths (Fig. 1.2). These memory bandwidth requirements (on the order of TB/s) were achieved by heterogeneous packaging of high-bandwidth memories (HBM) with CPU/GPUs on interposers, providing high-density and ultra-short-reach electrical interconnects for processor-memory links (Fig. 1.3a). However, enough off-chip interconnect bandwidth should also be supported to provide adequate data for processing. For instance, the latest Nvidia DGX-2 (Fig. 1.3b) utilizes 16 interconnected GPUs, each demanding 2.4 Tb/s of off-chip bandwidth via electrical NVLinks. As these data rates keep increasing, supporting them via electrical signaling is becoming more and more challenging due to increased channel losses, limited I/O pins, and the limited power allocated for off-chip communication on CPU/GPUs. Furthermore, these applications require distributed computation scaling and communication over multiple racks of such boxes, necessitating efficient optical interconnects. These area- and energy-efficient optical interconnects should be realized either through co-packaging or direct integration into SoC computing chips.

Figure 1.2: Processor-memory bandwidth trend for high-performance CPU/GPUs.

Figure 1.3: (a) Advanced heterogeneous GPU/CPU packaging (AMD GPU), (b) Emerging high-performance computing systems (Nvidia DGX-2 supercomputer).

In order to solve the area, energy, and cost efficiency issues of optical transceivers, we should note that the electronic-photonic integration platform always sets and limits these performance metrics. There are two major approaches to merging electronics and photonics. While hybrid/3D silicon photonics technologies utilize advanced CMOS microelectronics, the density and parasitic capacitances of the interconnections between electronics and photonics impact the performance and energy-efficiency of the final system. On the other hand, monolithic solutions have all been demonstrated in old CMOS nodes (older than 90 nm), leading to poor energy-efficiency and speed.

This thesis describes two monolithic photonics platforms in advanced CMOS processes to solve the above-mentioned trade-off. Starting with the 45 nm node, we present how a full photonic transmitter can be implemented in this process without any change or modification (zero-change) to the original CMOS process. A ring-resonator based 40 Gb/s PAM-4 optical transmitter has been demonstrated in this platform, achieving the world record in energy and bandwidth density by using a novel photonic device configuration (an optical digital-to-analog converter (ODAC)) and co-optimizing electronics and photonics. To show the extensibility of the zero-change approach to a more complex and advanced CMOS node, we applied our zero-change scheme to a 32 nm silicon-on-insulator (SOI) CMOS node as well. Here, we elaborate on the new opportunities for improving the photonics performance by exploiting new features of this process. These platforms can be "sweet-spots" for placing photonics monolithically next to one of the fastest CMOS processes demonstrated so far. Achieving ultra-high bandwidth densities of around 0.5 Tb/s/mm² makes these photonic transceivers suitable for in-package heterogeneous integration with high-performance systems-on-chip (SoCs).
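To give a sense of scale for this bandwidth density, the sketch below estimates the transceiver area needed to carry the per-GPU off-chip bandwidth quoted earlier for the DGX-2. Pairing these two numbers is an illustrative assumption made for this example, not a design point reported in this thesis, and the variable names are hypothetical.

```python
# Numbers quoted in the text; combining them is an illustrative assumption.
bandwidth_density_tbps_per_mm2 = 0.5   # ~0.5 Tb/s/mm^2 for these platforms
gpu_offchip_bw_tbps = 2.4              # per-GPU off-chip bandwidth (DGX-2)
num_gpus = 16                          # GPUs in one DGX-2 system

# Transceiver area needed to carry one GPU's off-chip bandwidth optically.
area_per_gpu_mm2 = gpu_offchip_bw_tbps / bandwidth_density_tbps_per_mm2

# Aggregate off-chip bandwidth across all GPUs in the system.
aggregate_bw_tbps = num_gpus * gpu_offchip_bw_tbps

print(f"Transceiver area per GPU: {area_per_gpu_mm2:.1f} mm^2")
print(f"Aggregate off-chip bandwidth: {aggregate_bw_tbps:.1f} Tb/s")
```

Under these assumptions the optical I/O for one GPU would occupy only a few square millimeters, which is consistent with the in-package integration scenario described above.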

Finally, we develop a monolithic electronic-photonic platform in a 300 mm-wafer commercial bulk CMOS process for the first time. This milestone opens up the path for embedding photonics monolithically in sub-32 nm CMOS nodes, which are all fabricated in bulk CMOS or thin-body SOI technologies. Consequently, we can implement monolithic electronic-photonic SoCs in the most advanced sub-32 nm CMOS nodes. Since, with the 32/45 nm zero-change platforms, we are already in the regime where photonics takes less than 10% of the energy and area, moving to a more advanced CMOS node is the only way to further boost the energy-efficiency and bandwidth density of the optical transceiver. Additionally, today's high-performance SoCs are mainly fabricated in sub-20 nm nodes due to higher transistor energy-efficiency and densities (>20 MTr/mm²). Thus, our approach can unlock embedding optical transceivers directly in the CPU/GPU chips by enabling photonics in bulk CMOS platforms. In doing so, we can alleviate the issue of limited I/O pins of modern SoCs in addition to achieving the energy, area, and cost performance metrics necessary for the future of computing.
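The claim that further gains must come from the electronics follows from an Amdahl-style bound. The sketch below is an illustration under the assumption that photonics accounts for 10% of the transceiver energy (the upper end of the "less than 10%" regime stated above); the function name and the example gain values are hypothetical.

```python
def overall_improvement(photonics_fraction: float, photonics_gain: float) -> float:
    """Overall energy improvement when only the photonic share is improved.

    photonics_fraction: share of total energy consumed by photonics (0 to 1).
    photonics_gain: factor by which the photonic energy is reduced (> 0).
    """
    if not 0.0 <= photonics_fraction <= 1.0:
        raise ValueError("photonics_fraction must be between 0 and 1")
    if photonics_gain <= 0.0:
        raise ValueError("photonics_gain must be positive")
    # The electronic share is unchanged; the photonic share shrinks by the gain.
    remaining = (1.0 - photonics_fraction) + photonics_fraction / photonics_gain
    return 1.0 / remaining


# Assumed 10% photonic share; gains of 2x, 10x, and an ideal lossless limit.
for gain in (2.0, 10.0, float("inf")):
    print(f"photonics gain {gain}: overall {overall_improvement(0.10, gain):.3f}x")
```

Even in the ideal limit, improving the photonics alone cannot reduce the total energy by much more than about 11% under this assumption, so the remaining share, which is set by the CMOS node, dominates any further improvement.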

### 1.1 Thesis Organization and Contributions

Chapter 2 of this thesis provides an overview of micro-ring resonator based optical links and integrated photonics technologies. We discuss the trade-offs and challenges of electronic-photonic integration platforms and introduce our zero-change 45 nm SOI CMOS monolithic photonics. Finally, we summarize the photonic device performance achieved in that technology and discuss the importance of energy-efficiency and bandwidth density in optical transceivers. Parts of this chapter appear in:

• [7] V. Stojanović, R. J. Ram, M. Popović, S. Lin, S. Moazeni, M. Wade, C. Sun, L. Alloatti, A. Atabaki, F. Pavanello, N. Mehta, and P. Bhargava, "Monolithic Silicon-Photonic Platforms in State-of-the-art CMOS SOI Processes," *Optics Express*, vol. 15, no. 19, pp. 11798–11807, 2018

Chapter 3 describes the opportunities of electronic-photonic co-design and co-optimization. We introduce an optical digital-to-analog converter (ODAC) and discuss the implementation methodology and benefits of this new device. Using this element, we demonstrate a multi-level optical transmitter and address the thermal stability issue of micro-rings for higher-order modulations. Chapter 3 has the following references:

- [15] S. Moazeni, S. Lin, M. Wade, L. Alloatti, R. Ram, M. Popović, and V. Stojanović, "A 40Gb/s PAM-4 Transmitter Based on a Ring-Resonator Optical DAC in 45nm SOI CMOS," (Invited Paper) IEEE Journal of Solid-State Circuits (JSSC), vol. 52, no. 12, pp. 3503–3516, Dec 2017
- [16] S. Moazeni, S. Lin, M. T. Wade, L. Alloatti, R. J. Ram, M. A. Popović, and V. Stojanović, "A 40Gb/s PAM-4 transmitter based on a ring-resonator optical DAC in 45nm SOI CMOS," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), Feb 2017, pp. 486–487

Although CMOS processes seem to be very constrained environments for optical devices, we can always exploit new features of these platforms to open up new degrees of freedom in device design and improve the performance of our devices. Chapter 4 elaborates on how we extended the zero-change monolithic photonic approach to a more advanced 32 nm process node and exploited features of this process to our advantage to boost the system's performance. While we recently showed microring modulators in this process [17], here we demonstrate the full electronic-photonic capabilities and SiGe photodetectors in this platform as well. This chapter has the following reference:

• [18] S. Moazeni, A. Atabaki, D. Cheian, S. Lin, R. J. Ram, and V. Stojanović, "Monolithic Integration of O-band Photonic Transceivers in a "Zero-change" 32nm SOI CMOS," in 2017 IEEE International Electron Devices Meeting (IEDM), 2017

The zero-change monolithic photonics approach in 45/32 nm nodes drastically improved the energy-efficiency and bandwidth density of optical transceivers. However, further improvement can only be achieved by using more advanced CMOS technologies, as we are in the regime where less than 10% of the area/energy is consumed by photonics. Enabling monolithic photonics in sub-32 nm CMOS nodes is challenging and has not been demonstrated so far. In Chapter 5, we demonstrate a monolithic photonic platform in a 300 mm wafer bulk CMOS process for the first time. Parts of this chapter appear in:

• [19] A. H. Atabaki\*, S. Moazeni\*, F. Pavanello\*, H. Gevorgyan, J. Notaros, L. Alloatti, M. T. Wade, C. Sun, S. A. Kruger, H. Meng, K. Al Qubaisi, I. Wang, B. Zhang, A. Khilo, C. V. Baiocco, M. A. Popović, V. M. Stojanović, and R. J. Ram, "Integrating photonics with silicon nanoelectronics for the next generation of systems on a chip," *Nature*, vol. 556, no. 7701, pp. 349–354, 2018

- [20] S. Moazeni\*, A. Atabaki\*, F. Pavanello\*, H. Gevorgyan, J. Notaros, L. Alloatti, M. T. Wade, C. Sun, S. A. Kruger, H. Meng, K. A. Qubaisi, I. Wang, B. Zhang, A. Khilo, C. Baiocco, M. A. Popović, R. Ram, and V. Stojanović, "Integration of Polysilicon-based Photonics in a 12-inch Wafer 65nm Bulk CMOS Process," in 2017 Fifth Berkeley Symposium on Energy Efficient Electronic Systems (E3S), 2017
- [21] A. Atabaki\*, S. Moazeni\*, F. Pavanello\*, H. Gevorgyan, J. Notaros, L. Alloatti, M. T. Wade, C. Sun, S. A. Kruger, H. Meng, K. A. Qubaisi, I. Wang, B. Zhang, A. Khilo, C. Baiocco, M. A. Popović, V. Stojanović, and R. Ram, "Monolithic Optical Transceivers in 65 nm Bulk CMOS," Optical Fiber Communications Conference and Exhibition (OFC), 2018

Chapter 6 summarizes the results and outlines the future opportunities and applications of developed electronic-photonic platforms and integrated systems. Parts of this chapter appear in:

• [13] S. Moazeni, J. Henriksson, T. J. Seok, M. T. Wade, C. Sun, M. C. Wu, and V. Stojanović, "Microsecond Optical Switching Network of Processor SoCs with Optical I/O," Optical Fiber Communications Conference and Exhibition (OFC), 2018

\*These authors contributed equally.

# Chapter 2

### **Preliminaries**

Silicon photonics is a promising technology to realize low-cost and energy-efficient optical links for emerging short-reach intra-rack and rack-to-rack interconnects in data-center and high-performance computing applications [22]. This technology also enables high-radix network switches, scalable interconnect fabric for future memory systems, and core-to-core cross-bars [23, 24, 25] once it is integrated with the large-scale electronic system-on-chips. The high performance in all of the above mentioned applications requires tightly integrated electronic-photonic circuits. Various integration strategies have been utilized to meet these demands including hybrid [26], heterogeneous via 3D-stacking [4, 27, 28], and monolithic [6, 29, 30, 31] integration. Among these, monolithic integration has the potential for reliable, low-cost, and large-scale integration, while being most promising in terms of energy-efficiency and bandwidth density.

In this chapter, we first give a brief review of various approaches to building an optical link and explain the advantages of micro-ring resonator based optical links over other schemes. Next, we describe the basics of modeling micro-ring resonators and optical links. Section 2.2 compares hybrid and monolithic integration approaches and reviews the state-of-the-art platforms for electronic-photonic integration. Details of our recently developed monolithic photonic platform in zero-change 45 nm SOI CMOS are presented in Section 2.3. Due to the high level of integration and complexity of emerging electronic-photonic integrated systems, including the photonic SoCs in this thesis, we developed various tools for modeling, simulating, and implementing such integrated systems. This automated photonic SoC design flow and our toolkits are explained in Section 2.4. Finally, we discuss the critical role of energy-efficiency and bandwidth density in future optical interconnects in Section 2.5.

### 2.1 Optical Interconnects

Light-based data transfer and communication were first deployed about 30 years ago in transatlantic links connecting Europe and America. The low loss of optical fiber  $(0.2 \, \mathrm{dB/km} \ \mathrm{at} \ 1550 \, \mathrm{nm} \ \mathrm{wavelength})$  compared with copper wires was the main attraction for



Figure 2.1: Electrical transceivers power efficiency vs. copper channel loss (from ISSCC Trends 2018).

optical links. As the demand for higher data-rates grew, copper channel losses (which are frequency dependent) increased drastically even at short distances, causing considerable energy penalties in electrical transceivers. As an example, a 5 m copper cable with 50 dB loss has been deployed for 28 Gb/s data communication with a transceiver energy-efficiency of 15 pJ/b [32]. For comparison, an optical link with almost the same energy-efficiency and data-rate can communicate data over more than 2 km. Furthermore, more advanced CMOS nodes cannot alleviate this issue, as the limit is imposed by the channel loss mechanisms. Today, even on-board electrical signaling over 12-inch traces faces serious challenges as data-rates increase above 25 Gb/s, typically consuming around 10 pJ/b. Optical interconnects achieving high energy-efficiency and bandwidth density can break the electrical signaling barriers and empower future computing and communication systems. The target opportunities for optical links with regard to electrical channel loss are shown in Fig. 2.1. Designing energy-efficient photonic transceivers with high bandwidth density can revolutionize the interconnection paradigms in applications where copper wires cannot reach. Additionally, achieving ultra-high energy efficiencies of sub-1 pJ/b for short-reach links such as on-board signaling can also bring new opportunities for optical links.

Overall, we can categorize optical links into two main types: (1) Directly Modulated Laser (DML) links, or (2) Externally Modulated Laser (EML) links. The receiver has the same architecture in both methods, where a photodiode converts optical intensity (or phase in coherent links) into an electrical photocurrent. Afterwards, the receiver restores the electrical current into the digital domain via thresholding circuitry. On the transmitter side, in DML links the laser diode is directly modulated by applying a modulated electrical signal. In the EML approach, on the other hand, the laser acts as an optical continuous-wave (CW) source and the transmitter imprints digital data on the light stream via an optical modulator.

DML links require laser sources with high relaxation frequencies. However, implementing such lasers is very challenging and they still suffer from frequency chirping. Despite demonstrations of distributed feedback (DFB) lasers operating at 56 Gb/s data-rates [33], DML optical links have very limited usage due to various issues. One of the major issues is that lasers cannot be monolithically integrated with CMOS driver chips, which sets an upper limit on energy-efficiency due to the parasitic capacitance of the interconnection between the laser and the electrical driver, which must be driven at the data-rate. Moreover, a laser's threshold and performance depend greatly on its temperature. As thermoelectric cooling (TEC) is extremely power and area inefficient, DMLs should operate in an uncooled mode. This also prohibits them from being co-packaged with high-performance SoCs, which normally run at 80-100 °C. Finally, multi-wavelength operation such as wavelength division multiplexing (WDM) requires extra optical Mux/DeMux devices that add optical path loss and area overhead.

Today, DML optical links are only used for short-reach applications via vertical-cavity surface-emitting lasers (VCSELs). These are multi-mode sources at 850 nm, leading to limited working distances of up to 300 m. Data-rates of up to around 40 Gb/s have been reported with transmitter energy-efficiency of 0.5-1 pJ/b [34]. One of the advantages of VCSELs is their cost efficiency; however, due to the challenges of achieving higher speeds and lower power consumption, their usage may shrink with the advent of future EML optical transceivers. Thus, we focus our analysis only on EML-based photonic transceivers.

EML optical interconnects bring many opportunities to improve energy-efficiency and bandwidths of optical transceivers. Separating laser sources from the transmitter module solves all the issues of frequency chirping and relaxes the laser frequency metrics. Also, there's no need to drive the parasitic capacitance of laser's anode/cathodes at high-data rates and we only use DC currents to bias the laser (whether laser is 3D integrated or deployed as a separate module). Due to the temperature dependency of laser sources and also relatively low utilization of optical links in data-centers, separating laser modules from the transceiver module can eventually improve the energy-efficiency and bandwidth density as the transceivers can now be placed very close to high-performance SoCs (like in package or directly embedded on CPU/GPU die). One of the main disadvantages of EML links is the inevitable extra optical loss of coupling laser light into the transmitter chip. This can be alleviated by devising low-loss fiber-to-chip or chip-to-chip optical coupling mechanisms.

The key device in EML links is an optical modulator that can operate based on any of the following principles: (1) Pockels effect, (2) Thermo-optics, (3) Electro-absorption, (4) Carrier-plasma effect. The Pockels electro-optic effect changes or produces birefringence in an optical medium in response to an applied electric field. However, since materials showing the Pockels effect cannot be easily integrated with silicon platforms, using this type of optical modulator for silicon photonics is still under research [35]. Thermo-optical effects are normally slow (<1 GHz) and cannot be used for high data-rate modulation. Ge-based electro-absorption modulators have been demonstrated at data-rates up to 50 Gb/s with relatively compact footprints. However, 100% Ge is hard to integrate into a CMOS process and the insertion



Figure 2.2: Ring-resonator characteristics and transfer functions.

loss of these devices is still large (5 dB) due to the intrinsic absorption of Ge even at low bias voltages. Thus, electro-optical phase shifters based on the carrier-plasma effect [36] are the most promising approach for CMOS integration, and can be utilized in the form of either a Mach-Zehnder interferometer (MZI) or a ring-resonator.

Today, MZIs are the workhorse modulators in commercial optical transceivers. However, MZIs with high-enough extinction ratio (ER) are inherently millimeter-sized devices, which leads to high energy consumption, high insertion loss (IL), and large footprints. These transmitters have energy-efficiencies around 5 pJ/b, which ironically dominates the total link power budget. MZIs with improved phase shifters [37, 38] also cannot alleviate this issue, as their fabrication in advanced monolithic CMOS processes is problematic, requiring hybrid integration, which in turn reduces energy-efficiency. Compared to traditional MZI-based modulator structures, of which even the smallest are hundreds of micrometers long, the resonant structure of rings increases the effective optical interaction length, allowing them to perform modulation in a much smaller form factor. Thus, this thesis focuses on developing high-speed ring-resonator based optical transceivers and addresses their thermal sensitivity and non-linearity issues for multi-level modulation.

### 2.1.1 Micro-ring Resonators

The micro-ring resonator is a ring-shaped waveguide that can trap the input light coupled into its cavity from the bus waveguide. In other words, whenever the optical path length around the ring (its circumference scaled by the effective refractive index) is an integer multiple of the input light wavelength, namely  $\lambda_0$ , the light wave coupled from the bus port interferes constructively in the ring cavity as it travels around the ring circumference, building up the optical power. Consequently, the through-port intensity is reduced significantly. This behavior appears as a set of notches in the wavelength domain, as shown in Fig. 2.2.

The spacing between adjacent resonances is called the free spectral range (FSR), and rings with smaller radius have larger FSR. Since every wavelength that satisfies the constructive interference condition is a resonance, the FSR can be quantitatively derived:

$$\lambda_0 = \frac{n_{eff}L}{m}, \quad m = 1, 2, 3, \dots \tag{2.1}$$

$$FSR = \frac{\lambda^2}{n_g L} \tag{2.2}$$

where L is the circumference of the ring,  $n_{eff}$  is the effective refractive index, and  $n_g$  is the group index, which takes the dispersion of the silicon waveguide into account. In multi-wavelength communication systems such as WDM links, all the channels should fit in one FSR and generally larger FSRs are preferable for modulators in optical communications. We can also define the quality factor (Q-factor) of the resonance as the ratio of the resonance wavelength to its bandwidth:

$$Q = \frac{\lambda_0}{\Delta \lambda} \tag{2.3}$$

where  $\Delta\lambda$  is the microring's full-width half-maximum (FWHM) bandwidth, which is indicative of the sharpness of the resonance notch. The ratio of the FSR to  $\Delta\lambda$  is called the finesse of the resonator, denoted by  $\mathcal{F}$ . The finesse represents the number of round-trips (within a factor of  $2\pi$ ) made by light in the ring before its energy is reduced to 1/e of its initial value.
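As a minimal numerical sketch of Eqs. (2.1)-(2.3) and the finesse definition, the following Python helpers evaluate the resonance comb, FSR, Q-factor, and finesse of a ring. The ring radius, effective index, group index, operating wavelength, and linewidth used in the example are hypothetical values chosen only for illustration; they are not measured parameters of the devices in this thesis.

```python
import math


def resonance_wavelengths(n_eff, circumference_m, lam_min_m, lam_max_m):
    """Resonances from Eq. (2.1), lambda_0 = n_eff * L / m, inside a window.

    Dispersion of n_eff is ignored here (an assumption of this sketch).
    """
    optical_length = n_eff * circumference_m
    m_lo = math.ceil(optical_length / lam_max_m)
    m_hi = math.floor(optical_length / lam_min_m)
    return [(m, optical_length / m) for m in range(m_lo, m_hi + 1)]


def free_spectral_range(lam_m, n_g, circumference_m):
    """Eq. (2.2): FSR = lambda^2 / (n_g * L)."""
    return lam_m ** 2 / (n_g * circumference_m)


def quality_factor(lam0_m, fwhm_m):
    """Eq. (2.3): Q = lambda_0 / delta_lambda."""
    return lam0_m / fwhm_m


def finesse(fsr_m, fwhm_m):
    """Finesse: ratio of the FSR to the FWHM linewidth."""
    return fsr_m / fwhm_m


# Hypothetical example values (assumptions, not thesis data).
radius_m = 5e-6
circumference_m = 2 * math.pi * radius_m
lam_m = 1300e-9      # assumed operating wavelength
n_eff = 2.5          # assumed effective index
n_g = 4.0            # assumed group index
fwhm_m = 0.13e-9     # assumed resonance linewidth

fsr_m = free_spectral_range(lam_m, n_g, circumference_m)
print("FSR [nm]:", fsr_m * 1e9)
print("Q:", quality_factor(lam_m, fwhm_m))
print("Finesse:", finesse(fsr_m, fwhm_m))
print("Resonances:", resonance_wavelengths(n_eff, circumference_m, 1280e-9, 1320e-9))
```

Halving the radius in this sketch doubles the FSR, which is the reason compact rings are attractive when many WDM channels must fit inside one FSR.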

In order to derive the through-port transfer function  $(\alpha)$ , we define the quantities r (the self-coupling of the bus waveguide), t (the cross-coupling between the bus waveguide and the microring waveguide), and  $\tau$  (the total round-trip loss, which includes the drop-port coupling loss  $t_d$ ). Note that  $r^2 + t^2 = 1$ , as  $r^2$  and  $t^2$  are the power splitting ratios of the coupler and we assume the coupling region is lossless. Thus, the through-port transfer function can be derived as a function of the coupling coefficients between the bus waveguide and the ring cavity and the round-trip loss along the ring [39], as follows:

$$\alpha = \frac{I_{through}}{I_{input}} = \frac{\tau^2 - 2r\tau\cos\phi + r^2}{1 - 2r\tau\cos\phi + r^2\tau^2} \tag{2.4}$$

 $I_{through}$  and  $I_{input}$  are the light intensities at the through port and input port, respectively. Also,  $\phi = 2\pi n_{eff}L/\lambda$  is the round-trip phase ( $\phi$  is an integer multiple of  $2\pi$ , equivalent to  $\phi = 0$ , when the ring is on resonance). Near a resonance, this function can be approximated in terms of the main resonator parameters:

$$\alpha(\lambda) = 1 - \frac{A}{1 + 4\left(\frac{\lambda - \lambda_0}{\Delta\lambda}\right)^2} \tag{2.5}$$

where A represents the depth of the notch and  $\Delta\lambda$  is the FWHM bandwidth. The intrinsic extinction ratio  $(ER_i)$  is defined as the drop in through-port transmission (relative to unity) when the ring is on resonance, expressed in dB:  $ER_i = -10 \cdot \log_{10}(1-A)$ . When the ring is critically coupled, A = 1 and  $ER_i$  is ideally infinite.
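The transfer functions above can be checked numerically. The sketch below evaluates Eq. (2.4) directly, evaluates the Lorentzian form of Eq. (2.5), and computes the notch depth A and intrinsic extinction ratio from r and τ. The closed form used for A follows from setting cos φ = 1 in Eq. (2.4); the coupling and loss values in the example are hypothetical.

```python
import math


def through_port(phi, r, tau):
    """Eq. (2.4): through-port power transmission versus round-trip phase."""
    num = tau ** 2 - 2 * r * tau * math.cos(phi) + r ** 2
    den = 1 - 2 * r * tau * math.cos(phi) + (r * tau) ** 2
    return num / den


def lorentzian_notch(lam, lam0, fwhm, depth):
    """Eq. (2.5): Lorentzian notch with depth A and FWHM delta_lambda."""
    return 1 - depth / (1 + 4 * ((lam - lam0) / fwhm) ** 2)


def notch_depth(r, tau):
    """Depth A = 1 - alpha on resonance.

    With cos(phi) = 1, Eq. (2.4) reduces to ((tau - r) / (1 - r * tau))**2.
    """
    return 1 - ((tau - r) / (1 - r * tau)) ** 2


def intrinsic_er_db(depth):
    """ER_i = -10 * log10(1 - A); infinite at critical coupling (A = 1)."""
    if depth >= 1:
        return math.inf
    return -10 * math.log10(1 - depth)


# Hypothetical coupler and loss values (assumptions for illustration only).
r, tau = 0.97, 0.98
depth = notch_depth(r, tau)
print("On-resonance transmission:", through_port(0.0, r, tau))
print("Notch depth A:", depth)
print("ER_i [dB]:", intrinsic_er_db(depth))
```

Setting r equal to τ in this sketch drives the on-resonance transmission to zero, which is the critical-coupling condition mentioned in the text.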

#### 2.1.2 Ring-resonator based Optical Links

Optical links can be realized by modulating a microring resonator's resonance wavelength. This approach can be interpreted as on-off keying (OOK) modulation of the input light if the input laser wavelength is in the proximity of the resonance. One way to modulate the resonance of these resonators is to use the electro-optical properties of the ring material to change the refractive index. Silicon is a popular material used to fabricate these structures, and we know that it has a centrosymmetric crystal structure, which makes silicon's refractive index insensitive, to first order, to an applied electric field. However, it has been shown that changes in free carrier concentration cause a linear change in the index of refraction [36]. Hence, we can form active cavities by injecting or depleting carriers inside the cavity. Carrier-injection modulators are based on p-i-n junctions, which inject electron/hole carriers into the intrinsic region during forward bias and blue-shift the resonance wavelength. Carrier-depletion modulators are based on p-n junctions, which deplete the carriers from the junction during reverse bias and red-shift the resonance wavelength. Due to the minority carrier lifetimes in the cavity, carrier-injection modulators are normally limited in speed and hence require pre-emphasis drive for high data-rates [40, 41]. They also consume static power and show lower energy-efficiency by operating in the forward-biased regime. Carrier-depletion modulators avoid these issues; however, they require mid-level doping control (not common in CMOS processes) to balance the  $\lambda_0$  shift against the Q-factor degradation caused by free carrier absorption. A higher Q-factor leads to a greater modulation depth for a fixed capability to shift the resonance wavelength, but also translates to a longer lifetime for photons resonating in the ring cavity.
Photon lifetimes comparable to or greater than the bit time result in optical inter-symbol interference (ISI) due to residual light left in the ring from bit to bit. A ring's Q-factor therefore sets an optical bound on its maximum data-rate, in addition to the RC bandwidth limitations caused by the p-n junctions.
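To make the Q-factor versus data-rate trade-off concrete, the sketch below uses the standard resonator relations τ_ph = Qλ₀/(2πc) for the photon (energy decay) lifetime and Δf = c/(λ₀Q) for the optical FWHM bandwidth. These relations are textbook results rather than formulas stated in this chapter, and the Q-factor, wavelength, and data-rate in the example are hypothetical.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum [m/s]


def photon_lifetime_s(q_factor, lam0_m):
    """Photon (energy decay) lifetime: tau_ph = Q * lambda_0 / (2 * pi * c)."""
    return q_factor * lam0_m / (2 * math.pi * C_LIGHT)


def optical_bandwidth_hz(q_factor, lam0_m):
    """Optical FWHM bandwidth in frequency: delta_f = c / (lambda_0 * Q)."""
    return C_LIGHT / (lam0_m * q_factor)


def lifetime_to_bit_time_ratio(q_factor, lam0_m, data_rate_bps):
    """Ratio of photon lifetime to bit time; values near or above 1 suggest ISI."""
    return photon_lifetime_s(q_factor, lam0_m) * data_rate_bps


# Hypothetical example (assumed values, not thesis measurements).
q_factor, lam0_m, rate_bps = 10_000, 1300e-9, 25e9
print("Photon lifetime [ps]:", photon_lifetime_s(q_factor, lam0_m) * 1e12)
print("Optical bandwidth [GHz]:", optical_bandwidth_hz(q_factor, lam0_m) / 1e9)
print("Lifetime / bit time:", lifetime_to_bit_time_ratio(q_factor, lam0_m, rate_bps))
```

Doubling Q in this model doubles the photon lifetime and halves the optical bandwidth, so a sharper notch gained for modulation depth is paid for in usable data-rate.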

Figure 2.3 shows how ring resonators can be used on both the transmitter and receiver sides. At the transmit side, the modulator switches between two resonance positions, each corresponding to one value of the data bit, imprinting the data stream on the through-port optical intensity. As a result, the through-port optical intensity takes one of two values ( $T_0$  or  $T_1$ ). The extinction ratio (ER) is defined as the ratio of the transmitted optical intensities at these two levels, and the difference between the two levels is called the optical modulation amplitude (OMA). Notice that there is a certain insertion loss (IL) even for the high-intensity state, due to the limited amount of resonance shift imposed by the doping level and shape of the p-n junctions. IL and ER can be expressed in terms of the Q-factor,  $ER_i$ , and the amount of resonance shift per applied voltage. The speed of modulation is limited by the dynamic characteristics of the p-n junctions and by  $\Delta\lambda$ , which should accommodate the modulated signal bandwidth, proportional to the data-rate.
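The dependence of ER, IL, and OMA on the resonance shift can be sketched with the Lorentzian notch of Eq. (2.5). The function below assumes an ideal Lorentzian line shape whose depth and linewidth do not change with the applied voltage, which is a simplification; the laser wavelength, shift, linewidth, and notch depth in the example are hypothetical.

```python
import math


def notch(lam, lam0, fwhm, depth):
    """Lorentzian through-port transmission, following Eq. (2.5)."""
    return 1 - depth / (1 + 4 * ((lam - lam0) / fwhm) ** 2)


def modulation_metrics(laser_nm, res0_nm, shift_nm, fwhm_nm, depth, p_in_mw=1.0):
    """ER, IL, and OMA for a ring switched between two resonance positions.

    Assumes the notch shape is unchanged by the drive voltage (simplification).
    """
    t_a = notch(laser_nm, res0_nm, fwhm_nm, depth)
    t_b = notch(laser_nm, res0_nm + shift_nm, fwhm_nm, depth)
    t_hi, t_lo = max(t_a, t_b), min(t_a, t_b)
    er_db = math.inf if t_lo == 0 else 10 * math.log10(t_hi / t_lo)
    return {
        "er_db": er_db,                      # ratio of the two levels
        "il_db": -10 * math.log10(t_hi),     # loss of the high-intensity state
        "oma_mw": p_in_mw * (t_hi - t_lo),   # difference between the levels
    }


# Hypothetical example: laser parked on resonance, shift of half a linewidth.
metrics = modulation_metrics(
    laser_nm=1300.0, res0_nm=1300.0, shift_nm=0.065, fwhm_nm=0.13, depth=0.9
)
print(metrics)
```

In this model a larger shift relative to the linewidth lowers IL and raises OMA, which is why a higher Q-factor improves modulation depth for a fixed achievable shift.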

The wavelength selectivity of ring resonators can help the receiver side to bandpass filter the received light. The selected portion of the spectrum is converted to an electrical signal via a photodiode (PD), and trans-impedance amplifiers and samplers are then used to recover the transmitted digital data stream at the receiver. We will explain later in this thesis that the ring-resonator can also be used as a resonant PD by creating p-i-n junctions inside the resonator's cavity. The resonant PD combines the filtering and photodetection functionality, which eliminates the need for a drop port and enhances the PD's responsivity.



Figure 2.3: A ring resonator based optical link [2].

Conventional optical transmitters use MZI-based modulators, which are millimeter-sized devices because the weak carrier-plasma depletion effect requires long phase shifters to realize the required  $\pi$  phase shift. Large footprints make them energy inefficient, RC-bandwidth limited, and costly.

A major benefit of microring based optical links is the capability of communicating over multiple wavelengths simply by cascading rings with slightly different radii on a shared bus waveguide (Fig. 2.4). This scheme is called wavelength division multiplexing (WDM), and it is gaining popularity since it can drastically improve the aggregate link data-rate in a single waveguide/fiber. The absence of wavelength selectivity in MZIs makes them unfavorable in WDM systems, as they need extra optical Mux/DeMux devices such as arrayed waveguide gratings (AWG). These multiplexers not only require millimeter-scale area, but also consume relatively large static power for thermal tuning, while thermal tuning of ring resonators can be done in a very power-efficient way due to their compact footprint [42].
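Because all WDM channels must fit within one FSR, the channel count and the aggregate data-rate per waveguide follow from simple arithmetic. The sketch below is illustrative only: the FSR, channel spacing, and per-channel data-rate are hypothetical inputs, and guard bands or crosstalk limits that would reduce the usable channel count are not modeled.

```python
import math


def max_wdm_channels(fsr_nm, channel_spacing_nm):
    """Largest number of equally spaced channels that fit within one FSR.

    A small epsilon guards against floating-point round-off in the division.
    """
    return math.floor(fsr_nm / channel_spacing_nm + 1e-9)


def aggregate_rate_gbps(n_channels, rate_per_channel_gbps):
    """Aggregate data-rate carried by a single waveguide or fiber."""
    return n_channels * rate_per_channel_gbps


# Hypothetical example values (assumptions for illustration).
fsr_nm = 13.4
spacing_nm = 1.0
rate_gbps = 25.0

n_channels = max_wdm_channels(fsr_nm, spacing_nm)
print("Channels per FSR:", n_channels)
print("Aggregate rate [Gb/s]:", aggregate_rate_gbps(n_channels, rate_gbps))
```

The same aggregate rate can be reached with fewer, faster channels, which reduces the number of laser lines required at the cost of a more demanding per-channel transceiver.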

### 2.1.3 Laser Sources and Integration

CW laser sources commonly used in EML links can be either integrated on-chip or used as a separate off-chip module, possibly even outside the package. On-chip laser sources have been demonstrated using Ge-based devices or hybrid silicon/indium phosphide (InP) integration [43]. Uncooled off-chip lasers based on quantum dots (QD) are also commercially available for fiber optic communications at reasonably high efficiencies [44]. Although off-chip sources are fabricated on a standalone die, they can be co-packaged very closely with the photonic transceiver chip via 3D integration [45].

Figure 2.4: An example of a WDM link using ring resonators [2].

One of the main considerations in choosing laser sources is their temperature dependency. Their threshold current normally increases with temperature, which causes a rapid fall-off in wall-plug efficiency. Additionally, the lasing wavelength and output power also vary strongly with temperature due to thermal expansion and the temperature dependency of the gain medium. Thus, the temperature of laser sources is usually stabilized via a thermoelectric cooler (TEC) driven by a temperature-stabilizing control loop. However, the energy cost of the TEC is significant and ironically it dominates the overall wall-plug efficiency of the laser unit [46]. Thus, uncooled lasers with low temperature drift are preferred in order to achieve high energy-efficiency and a small form factor. Thermal variations can be compensated in the design of the transceiver; for instance, for ring modulators, we can design thermal tuning control loops that track the laser wavelength while its wavelength and power slowly drift. In doing so, we can significantly improve overall link efficiency by eliminating the need for a TEC.

Off-chip laser sources can be either co-packaged with the transceiver IC (3D integrated) or kept as a separate module outside the transceiver package. Although most of today's transceivers contain laser sources, once the transceivers can be co-packaged with high-performance SoCs, separating the laser source from the transceiver IC becomes preferable. The main reason is that high-performance SoC packages have temperatures of around 100 °C, and placing uncooled lasers in this environment drastically degrades their efficiency. Moreover, in many applications where link utilization is not uniform or high, we can share laser sources among multiple transceivers. For example, a separate set of lasers combined with low-loss optical switches can be used to manage the optical power budget more efficiently. We note that tight integration of laser sources with electronic-photonic systems is still essential and preferred in low-power integrated systems such as sensors and imagers for IoT and biomedical applications.

Laser combs providing multi-wavelength optical carriers have also been demonstrated and are commercially available in limited optical bands and channel spacings [47]. These sources are essential to enable dense WDM (DWDM) optical links and they are more power efficient than separate laser diodes. However, limited maximum power per wavelength channel, channel-to-channel peak power variations, and limited choices of optical bands and channel-to-channel spacings are some of the major challenges for these sources to become commercially useful in optical links.

It is worth noting that the cost of fiber and laser modules, together with the extra power and packaging issues added by extra lasers, prevents us from increasing the total number of channels in WDM systems [46]. Recently, many solutions such as laser combs [48] have been proposed to provide as many laser lines as possible; however, they still provide a limited number of channels. Thus, it is preferable to achieve a higher data-rate per channel to minimize the energy consumption and cost of WDM links. Today, designing multi-wavelength laser sources with enough optical power on each wavelength channel and high energy-efficiency is still an active research area. These sources play an important role in enabling emerging energy-efficient and high-bandwidth density DWDM optical links. In order to achieve the best overall system performance, such lasers and optical transceivers should be co-designed and co-optimized.

### 2.2 Electronic-Photonic Integration

Three decades ago the work of Soref and Bennett [36] signaled the dawn of silicon-photonics. To many, this meant that finally, optics would realize the same economies of scale that silicon-based microelectronics (especially CMOS) has enjoyed for decades. In this section, we take a look at the development trajectory of the silicon-photonic technology and the state-of-the-art in the capability of silicon-photonic processes available today, in the context of the photonic interconnects as the flagship application for this technology.

Being able to create passive photonic devices in silicon, as well as affect the index of refraction through some current or voltage controlled mechanism are the necessary steps towards creating optical coupling, guiding, and modulating devices for photonic interconnects. However, the other key pieces of technology are the photodetector and the approach for integration with electronics, which determine the effectiveness of photon-electron conversion, and ultimately the energy cost, speed, and bandwidth-density and integration cost of the overall solution.

Indeed, the first commercial high-volume process, developed by Luxtera, attempted to address all of the issues above at the same time by integrating the photonic devices in a then state-of-the-art 130 nm silicon-on-insulator (SOI) CMOS process [49]. To yield good photonic device performance, the process had to be modified with an epitaxial Ge step for efficient photodetectors, as well as a partial etch of the Si body for passive and active waveguide structures. Small parasitic capacitances between transistors and devices were realized through monolithic integration, enabling energy-efficient, high-bandwidth transmitter and receiver components. However, the process customizations and economic investment that led to the improved photonic device performance also prevented the technology from following the CMOS scaling trend of shrinking the device features every two years, and hence from improving the transistor and system speed and energy-efficiency. Furthermore, since interconnect speeds are scaling at an even faster rate of  $4\times$  every two years, this meant that the technology would soon fail to deliver the speeds required in new interconnect standards. For example, it has been challenging for this technology to achieve  $25\,\text{Gb/s}$  modulation even into relatively small photonic loads such as ring-based optical modulators [29].

To make a major impact, every process technology has to be qualified and available for high-volume production, and every additional process step complicates and slows down this process, further preventing the technology from following the mainstream scaling trends. Similarly, IBM's monolithic photonic platform [22], which was implemented in a more advanced 90 nm node, took several years to qualify and achieve high-volume and availability due to process customizations.

The issues with limited transistor performance and process qualification/availability recently have taken the manufacturers in a different direction. Both STMicroelectronics and TSMC have demonstrated hybrid integration of CMOS with Luxtera's photonic technology implemented in standalone photonic SOI wafers [3, 50]. This approach decouples the transistor process development from the photonic process development and is seemingly very attractive since it allows the latest node CMOS circuitry to be used in conjunction with optimized photonic devices. However, this arrangement suffers from several issues which significantly limit its effectiveness to a narrow range of applications. First, the micro-bumps used to connect the chip with transistors to the chip with photonic devices have limited scaling pitch (to about 40-50 µm) and parasitic capacitances larger than 20 fF, which significantly impacts the speed and energy-efficiency of photonic interconnects. Second, this connectivity arrangement limits the integration scenarios of photonics to 100G pluggable transceivers applications [3]. To enable larger density and quality of electrical connections to the transceiver circuit chip, such as those needed in 400G pluggable and mid-board optics scenarios, the photonic process has to be modified to add through-silicon-via (TSV) technology, further complicating the process and qualification. Alas, this multi-chip stacked solution is cumbersome for highly-integrated optics-in-package scenarios.

Photonic interconnects can achieve high volumes and remain the technology of choice for future system connectivity applications if they can help address the bandwidth density and energy-cost limits of electrical I/O on large SoC chips such as network switches, graphics and multi-core processors (GPUs and CPUs), or field-programmable gate arrays (FPGAs). The integration and packaging approach has to enable both a low-energy, high bandwidth-density connection from the large SoC to the photonic transceiver chip, and a photonic connection out of the transceiver chip. The only way to achieve this is if: 1) the transceiver chip is integrated as close as possible (preferably in the same package and on the same interposer) to these large SoCs; 2) photonic interconnects are monolithically integrated with transistors that enable the highest performance in mixed-signal transceiver applications while not further complicating the packaging and process development. With this in mind, our team has worked to create a photonic technology platform in high-volume 45 nm and 32 nm SOI process nodes, creating photonic devices in the native processes with no required process modifications. The advanced features of these processes opened new degrees of freedom in photonic device design that mitigated some of the inherent process limitations and also enabled tight electronic-photonic design optimization. This approach led to record-breaking energy-efficient high-speed transmitters as well as the highest degree of electronic-photonic integration, demonstrated in the world's first microprocessor with photonic I/O [6]. However, since the zero-change approach cannot be extended to nodes below 32 nm, we devised a new method to enable monolithic photonics in the most advanced CMOS nodes by minimally changing the process (Chapter 5).

In this section, we summarize and compare the results of silicon-photonics technology platforms, demonstrating the potential of the monolithic integration technologies, and in particular our unmodified (zero-change) 45 nm SOI CMOS.

### 2.2.1 Monolithic vs. Hybrid/3D Integration

In this subsection, we summarize the performance of the state-of-the-art silicon photonic process technology platforms and discuss the advantages of monolithic integration in advanced high-performance CMOS processes for meeting the needs of future optical interconnects.

High-performance integrated systems demand advanced CMOS technologies with high  $f_T$  (frequency at which transistor current gain is unity) and  $f_{max}$  (frequency at which transistor power gain is unity). These are the transistor performance metrics that represent analog circuit speed and sensitivity, as well as logic speed. Figure 2.5 shows the trend of  $f_T$  for NMOS devices in IBM/GlobalFoundries (GF) technology nodes, which is representative of the performance in other similar foundries and process nodes. Notice that  $f_T$  has peaked in the 45 nm and 32 nm CMOS nodes, due to the change of focus in more scaled-down nodes toward logic energy and area density optimization for memory and logic chips, rather than the speed and performance of analog and mixed-signal circuits [51]. Since photonic interconnects are primarily based on mixed-signal transceiver circuitry, these transistor metrics directly impact link performance metrics such as speed, sensitivity, and energy efficiency. For photonic interconnects to be an attractive alternative to electrical short-to-long-range (chip-to-chip to backplane) I/O of large SoC chips, they have to provide sub-1 pJ/b, 25-50 Gb/s links with a low-energy electrical connection to the SoC and aggregate throughputs larger than 10 Tb/s. In addition to these performance metrics, for such large-volume applications, it is key that photonic interconnects are manufactured in a high-volume, state-of-the-art 300 mm-wafer foundry.
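The link targets stated above translate into a channel count and an I/O power budget through simple arithmetic. The sketch below works this out for the 10 Tb/s aggregate throughput, 25-50 Gb/s per-link rates, and 1 pJ/b energy bound quoted in the text; treating 1 pJ/b as the energy of every transmitted bit, and ignoring laser and thermal tuning overheads, are assumptions of this illustration.

```python
import math


def channels_needed(aggregate_tbps, per_channel_gbps):
    """Number of links (or wavelengths) needed to reach an aggregate throughput."""
    return math.ceil(aggregate_tbps * 1000 / per_channel_gbps)


def io_power_w(aggregate_tbps, energy_pj_per_bit):
    """I/O power = throughput [b/s] * energy per bit [J/b]."""
    return aggregate_tbps * 1e12 * energy_pj_per_bit * 1e-12


for rate_gbps in (25, 50):
    print(rate_gbps, "Gb/s per link ->", channels_needed(10, rate_gbps), "links")

print("Power at 1 pJ/b [W]:", io_power_w(10, 1.0))
print("Power at 10 pJ/b [W]:", io_power_w(10, 10.0))
```

The second power figure uses the roughly 10 pJ/b quoted earlier in this chapter for on-board electrical signaling above 25 Gb/s, for comparison with the sub-1 pJ/b optical target.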

Figure 2.5: Unity current gain frequency $(f_T)$ of NMOS devices in IBM/GF processes.

From this perspective, non-monolithic platforms are expected to achieve high energy efficiencies and receiver sensitivities for high-speed optical transceivers, owing to the flexibility to choose the best-performing electronics process independently of the photonics process. A performance summary of the latest non-monolithic silicon photonic technologies is shown in Table 2.1. Despite the advantage of optimizing the electronics and photonics separately, these platforms still consume $>1\,\mathrm{pJ/b}$ modulator driver energies with $>50\,\mathrm{\mu A}$ receiver sensitivity, which clearly does not satisfy the electrical and optical power budget of future optical interconnects. The main reason is the additional parasitic inductance and capacitance of the wire-bonds or micro-bumps (Cu-pillars) interconnecting the electronic and photonic chips (Fig. 2.6). This extra capacitance, ranging from 20 fF to 100 fF, degrades the transmitter's energy efficiency and imposes a stringent gain-bandwidth constraint on the receiver design, leading to degraded receiver sensitivity.
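A rough feel for the transmitter-side penalty comes from the energy needed just to charge the interconnect capacitance. The sketch below uses the standard $CV^2/4$ average for random NRZ data; the 1 V swing is an assumed value, not one taken from the cited platforms, and the estimate ignores the pre-driver chain and any inductive peaking, so it is a lower bound on the real overhead.

```python
def switching_energy_fj_per_bit(c_ff: float, v_swing: float) -> float:
    """Average supply energy per bit to drive a capacitance with random NRZ data.

    A 0->1 transition draws C*V^2 from the supply and occurs on one bit in
    four for random data, so the average is C*V^2/4 per bit.
    With C in fF and V in volts the result is in fJ/b.
    """
    return c_ff * v_swing ** 2 / 4.0


V_SWING = 1.0  # volts; assumed drive swing, not taken from the text

for c_ff in (20.0, 100.0):  # parasitic range quoted for wire-bonds and micro-bumps
    energy = switching_energy_fj_per_bit(c_ff, V_SWING)
    print(f"{c_ff:5.0f} fF -> {energy:5.2f} fJ/b")
```

Because the energy scales with the square of the swing, the same parasitic becomes four times as costly if the modulator needs a 2 V drive.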

Aside from the packaging of photonics with mixed-signal transceiver circuits, the final packaging with the SoC chip is important for the overall photonic interconnect performance, since it determines the quality of the electrical link between the SoC and the photonic transceiver. Current non-monolithic platforms require wire-bonds to connect the photonic transceivers to the package, which degrades the electrical channel between the SoC and the electronic transceiver chip in the photonic interconnect module. Flip-chip packaging capability is required for high-performance applications such as 400G optical transceivers, mid-board modules, and co-packaging with the SoC. Solving this problem demands the development of a silicon photonics platform with TSVs, as shown in Fig. 2.6b and discussed in [3].

Wafer-level 3D integration via through-oxide vias (TOVs) has been demonstrated as a promising solution to the parasitic and density issues of 3D integration. As illustrated in Fig. 2.7, in this platform, 300 mm photonic and electronic wafers are manufactured separately and then bonded face-to-face using oxide bonding. The silicon substrate of the photonic wafer is then removed and TOVs are punched through at a 4 µm pitch to connect the top-layer metal of the photonic wafer to the top-layer metal of the CMOS wafer. For packaging, wire-bonded back metal pads are deposited on top of selected TOVs. Due to the relatively short height of the TOVs, this process achieved 3 fF parasitic capacitance per TOV. Despite the preliminary results in this platform [4, 27], the technology has yet to reach mature process control and yield, due to the thermal and stress requirements of the wafer-to-wafer bonding and the TOV-metal contacts. Additionally, flip-chip packaging of these



Figure 2.6: Hybrid and 3D integration methods: (a) Wire-bond packaging, (b) Next-generation 3D micro bump packaging with Through-silicon vias (TSVs) [3].

Table 2.1: Summary and comparison of non-monolithic silicon photonic platforms.

|  | IMEC<br>[52, 53] | HP<br>[54, 55] | Luxtera/TSMC<br>[3] | STMicroelectronics<br>[50, 56] |
|---|---|---|---|---|
| Technology |  |  |  |  |
| Availability | Prototyping/Research | Prototyping/Research | High-volume<sup>\*</sup> | High-volume |
| Integration Method | Wire-bond | Wire-bond | 3D Cu-pillar | 3D Cu-pillar |
| Photonics process node | 220 nm SOI | 130 nm SOI | 130 nm SOI | PIC25G SOI |
| Circuits CMOS node | 28 nm/40 nm | 65 nm | 28 nm | 55 nm (BiCMOS)/65 nm |
| NFET $f_T$ | 275 GHz/305 GHz | N/R | N/R | 300 GHz/200 GHz |
| Wavelength | 1550 nm | 1550 nm | 1310 nm | 1310 nm |
| Photonics |  |  |  |  |
| Waveguide Loss | 1 dB/cm | 3 dB/cm | 1.9 dB/cm | 3 dB/cm |
| Couplers Loss | 2 dB | 5 dB | 2.2 dB | 2.15 dB |
| Modulator Device | Ring-resonator | Ring-resonator | MZM | MZM |
| Modulator Bandwidth | 38 GHz | N/R | 25 Gb/s | 25 Gb/s |
| PD Responsivity | 0.8 A/W | 0.45 A/W | 1 A/W | 0.88 A/W |
| PD Bandwidth | 50 GHz | 30 GHz | 24 GHz | 20 GHz |
| System Performances |  |  |  |  |
| Transmitter Data-rate | 50 Gb/s | 25 Gb/s | 25 Gb/s | 56 Gb/s |
| ER (IL) | 5 dB (5 dB) | 7 dB (5 dB) | 4.2 dB (1.5 dB) | 2.5 dB (7.5 dB) |
| Transmitter Energy <sup>†</sup> | 0.61 pJ/b | 2.5 pJ/b | N/R | 5.35 pJ/b |
| Receiver Data-rate | 20 Gb/s | 25 Gb/s | 25 Gb/s | 25 Gb/s |
| Receiver Sensitivity | 70 µA | 72 µA | N/R | 97 µA<sub>pp</sub> |
| Receiver Energy | 0.58 pJ/b | 0.68 pJ/b | N/R | 1.24 pJ/b |

N/R = Not Reported

<sup>\*</sup> High-volume assumes a 300mm foundry

3D integrated chips for higher electrical signal integrity requires extra sensitive processing and packaging steps lacking in the current demonstrations.

### 2.2.2 Monolithically Integrated Photonics

Monolithic silicon-photonic integration can minimize the parasitic capacitance of both the interconnection between the optical transceiver's electronic and photonic devices (now implemented on the same die) and the connection between the transceiver chip and the package substrate or interposer, through flip-chip packaging. However, a major challenge in monolithic integration is

<sup>†</sup> Modulator and driver energy efficiency



Figure 2.7: Wafer-level 3D integration via through-oxide vias (TOV) [4].

that process optimizations for photonics and electronics cannot be performed independently of each other. As such, the transistors in monolithic platforms tend to derive from older CMOS processes, where transistor properties are less sensitive to fabrication changes made for photonics. For instance, adding epitaxial Ge to the process requires front-end process modifications that are more easily tolerated in old CMOS nodes above 90 nm (Table 2.2). Such front-end process modifications are significantly more challenging in more advanced process nodes with higher-performance transistors. Furthermore, old CMOS processes do not have enough lithographic precision to build high-quality ring resonators with the good coupling and relative resonant wavelength control required for dense wavelength division multiplexed (DWDM) applications; consequently, transmitters use Mach-Zehnder modulators (MZMs), which are much less area- and energy-efficient.

Our solution to the above-mentioned problems is to use unmodified high-volume 45 nm and 32 nm SOI CMOS technologies, which have the highest $f_T/f_{max}$ demonstrated, and to achieve the needed photonic performance by utilizing the advanced lithography and new process features, coupled with device and circuit co-optimization. We call this approach "zero-change", as we do not change the native CMOS fabrication steps. These nodes are the latest partially depleted SOI (PDSOI) processes, which provide a crystalline silicon (c-Si) body layer thick enough to build low-loss optical waveguides, unlike the fully depleted thin-body SOI (FDSOI) used in the 28 nm node and below. Figure 2.8 illustrates the different platforms commercially developed for CMOS technologies. Unlike the PDSOI platform, which can be used natively to build low-loss optical waveguides, implementing photonics in any of the other three platforms requires process changes. The reason is the lack of any low-loss optical material such as c-Si in bulk CMOS processes, and the small c-Si thickness in FDSOI processes, both of which prevent implementing the most fundamental optical device, a low-loss optical waveguide, in the first place. In Chapter 5, we will discuss and demonstrate a solution for bringing photonic capabilities to these platforms by adding a minimal number of mask sets and introducing new materials to these processes.



Figure 2.8: Commercial fabrication platforms for the CMOS technology.

We have demonstrated ring-resonator-based optical transmitters with radii as small as 5 µm operating at 40 Gb/s with only 40 fJ/b modulator and driver energy in zero-change 45 nm SOI CMOS [15]. Although the photodiode (PD) responsivity and bandwidth are sacrificed by not using pure Ge-based PDs, co-design of electronics and photonics is used to obtain sensitive and high-speed optical receivers with low receiver energy. Additionally, our platforms allow direct flip-chip packaging of the photonic transceiver chip on interposers or package substrates, suitable for providing dense and low-parasitic electrical signaling to the host SoC.

Table 2.2: Summary and comparison of monolithic silicon photonic platforms.

|  | This platform | Luxtera<br>[57] | IBM (now GF)<br>[58] | IHP<br>[59] | Oracle<br>[29] |
|---|---|---|---|---|---|
| Technology |  |  |  |  |  |
| Availability | High-volume | Medium-volume | High-volume | Medium-volume | Medium-volume |
| CMOS Node | 45 nm SOI | 130 nm SOI | 90 nm SOI | 250 nm (BiCMOS) | 130 nm SOI |
| NFET $f_T$ | 485 GHz | 140 GHz | 190 GHz | 190 GHz | 140 GHz |
| Wavelength | 1290 nm | 1550 nm | 1310 nm | 1550 nm | 1550 nm |
| Photonics |  |  |  |  |  |
| Waveguide Loss | 3.7 dB/cm | 1 dB/cm | 2.5 dB/cm | 2.4 dB/cm | 3 dB/cm |
| Couplers Loss | 1.5 dB<sup>\*</sup> | 1.5 dB | 2.5 dB | 1.5 dB | 5.5 dB |
| Modulator Device | Ring-resonator | MZM | MZM | MZM | Ring-resonator |
| Modulator Bandwidth | 13 GHz | N/R | 21 GHz | <7.5 GHz | 15 GHz |
| PD Responsivity | 0.5 A/W | 0.6 A/W | 0.5 A/W | 0.8 A/W | 0.8 A/W |
| PD Bandwidth | 5 GHz | 20 GHz | 15 GHz | 31 GHz | 17.6 GHz |
| System Performances |  |  |  |  |  |
| Transmitter Data-rate | 40 Gb/s | 10 Gb/s | 25 Gb/s | 10 Gb/s | 25 Gb/s |
| ER (IL) | 3 dB (4.7 dB) | 4 dB (N/R) | 6.3 dB (5 dB) | 8 dB (13 dB) | 6.9 dB (5 dB) |
| Transmitter Energy <sup>†</sup> | 0.04 pJ/b | 57.5 pJ/b | 10.8 pJ/b | 83 pJ/b | 7.2 pJ/b |
| Receiver Data-rate | 12 Gb/s | 10 Gb/s | 25 Gb/s | 40 Gb/s | 25 Gb/s |
| Receiver Sensitivity | 8.6 µA<sub>pp</sub> | 6 µA<sub>pp</sub> | 250 µA | 200 µA | 200 µA |
| Receiver Energy | 0.36 pJ/b<sup>‡</sup> | 8 pJ/b | 3.8 pJ/b | 6.9 pJ/b | 1.9 pJ/b |

N/R = Not Reported

<sup>\*</sup> Pigtailed loss is 2.5 dB

<sup>†</sup> Modulator and driver energy efficiency

<sup>‡</sup> Full receiver energy including samplers and digital circuitry
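The transmitter energy and data-rate rows of Table 2.2 can be combined into the modulator-plus-driver power per channel, which is often the more intuitive number when budgeting a multi-channel link. The short sketch below does only that multiplication; the dictionary layout and function name are illustrative choices.

```python
# (transmitter data rate in Gb/s, transmitter energy in pJ/b) copied from Table 2.2
platforms = {
    "This platform": (40.0, 0.04),
    "Luxtera": (10.0, 57.5),
    "IBM (now GF)": (25.0, 10.8),
    "IHP": (10.0, 83.0),
    "Oracle": (25.0, 7.2),
}


def tx_power_mw(rate_gbps: float, energy_pj_per_bit: float) -> float:
    """Gb/s x pJ/b = mW, since 1e9 bit/s x 1e-12 J/bit = 1e-3 W."""
    return rate_gbps * energy_pj_per_bit


tx_power = {name: tx_power_mw(*row) for name, row in platforms.items()}

for name, power in tx_power.items():
    print(f"{name:14s} {power:8.1f} mW per channel")
```

These figures cover only the modulator and driver, as stated in the table footnote, so laser and thermal tuning power are not included.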

#### 2.3 Zero-change 45 nm SOI CMOS Platform

Figure 2.9a presents the timeline of platform development for this technology, which used the available multi-project wafer (MPW) runs without explicit foundry support. Owing to the maturity of the high-volume 45 nm SOI process, but constrained by the MPW run availability and turnaround times, we were able to go from device test chips to a fully functional processor with photonic I/O in less than four years, on limited research grant funds. Commercially available CMOS technologies normally have much faster turnaround times, which expedites device development and the development of new systems. In translating the lessons learned from this platform to the 32 nm SOI platform, we have already shrunk the development cycles significantly. Further optimization and acceleration will be possible with tighter foundry support and coordination.

The demonstrated processor with photonic I/O in the 45 nm zero-change platform showcases the power of this technology. Ultra-power-efficient ring-resonator-based silicon photonic links, with millions of transistors and hundreds of photonic devices fabricated on the same chip, are aimed at improving processor-memory link bandwidth [6]. This SoC (Fig. 2.9b) has a dual-core RISC-V processor [60], 1 MB of SRAM-based cache memory, and the DWDM optical I/Os illustrated in Fig. 2.9c. Figure 2.9d shows the key photonic devices of an optical link implemented in this technology. This work achieved the highest level of integration scale and system complexity among state-of-the-art electronic-photonic systems.

We will use the zero-change monolithic photonics platform in a commercial 45 nm SOI CMOS process [61] to demonstrate an area- and energy-efficient optical PAM-4 transmitter in Chapter 3, and later, in Chapter 4, prove the extensibility of this scheme to the more advanced 32 nm SOI CMOS node. More detailed process development steps and performance results are discussed in that chapter.

All photonic devices are designed to conform to the existing (purely electrical) foundry design flow [62], without any modifications to the native process (Fig. 2.10). The key enabler of optical devices is the sub-100 nm thick, high-index crystalline silicon (c-Si) layer normally used as the body of transistors. Since the buried oxide (BOX) layer is not thick enough to optically isolate the c-Si waveguide core from the silicon substrate, the substrate has to be removed to reduce the waveguide optical loss. Substrate removal is done in a single post-processing step on the flip-chip die-attached chips [62]. The flip-chip under-fill keeps the released die mechanically stable even under thermal stress tests. Transistors are also unaffected, and all existing foundry IP, timing libraries, and simulation models remain valid [62, 6]. A waveguide loss of approximately 3 dB/cm is achieved after this step. In addition, flip-chip packaging is favorable for high-performance electronics due to better power delivery, pin counts, and signal integrity of the I/O pins. Light is coupled to the chip via vertical grating couplers, which are fabricated by patterning the c-Si and polysilicon layers. The polysilicon layer helps to break the vertical symmetry and achieve sub-2 dB loss (including the taper) with a 78 nm 1 dB bandwidth around the 1320 nm wavelength [63, 64]. Active devices including microring modulators and photodiodes are demonstrated using



Figure 2.9: Zero-change SOI platform evolution; (a) Development timeline, (b) EOS22 die photo, (c) WDM transceivers, (d) Key photonic devices of an optical link.

existing source/drain and well implant doping levels. Microring modulators have been designed by placing interleaved p and n junctions along the ring cavity. These 5 µm radius microrings achieved loaded Q-factors of better than 10 k with high thermal tuning efficiency (3.8 µW/GHz) and a tuning range of 524 GHz (>50 K temperature) [65, 5]. Resonant SiGe detectors on this platform showed a responsivity of 0.55 A/W [66]. Polysilicon-based resonant detectors have also been demonstrated, covering optical telecommunication bands from the O-band to the L-band [67]. More details on the fundamental photonic device designs are given in the following subsections.
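The thermal tuning figures quoted above translate directly into a heater power budget. The sketch below multiplies the tuning efficiency by the tuning range and divides the range by the quoted temperature span; it assumes the heater response stays linear over the whole range, which the text does not state, and the constant names are illustrative.

```python
TUNING_EFFICIENCY_UW_PER_GHZ = 3.8   # heater power per GHz of resonance shift
TUNING_RANGE_GHZ = 524.0             # quoted tuning range
TEMPERATURE_SPAN_K = 50.0            # the range is quoted as covering ">50 K"


def heater_power_mw(shift_ghz: float) -> float:
    """Heater power needed for a given resonance shift, assuming linear tuning."""
    return TUNING_EFFICIENCY_UW_PER_GHZ * shift_ghz / 1e3


full_range_power_mw = heater_power_mw(TUNING_RANGE_GHZ)  # about 2 mW

# The temperature span is a lower bound, so this drift rate is an upper bound.
max_drift_ghz_per_k = TUNING_RANGE_GHZ / TEMPERATURE_SPAN_K

print(f"heater power for full range: {full_range_power_mw:.2f} mW")
print(f"implied resonance drift: below {max_drift_ghz_per_k:.1f} GHz/K")
```

Under these assumptions, covering the entire tuning range costs about 2 mW per ring, which is comparable to the 1.6 mW that a 40 Gb/s transmitter at 40 fJ/b spends on modulation.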



Figure 2.10: Cross-section of the zero-change 45 nm SOI platform [5]. (Figure is not drawn to scale)

#### 2.3.1 Waveguides

Waveguides are the most fundamental building blocks of any guided-light photonic system. They are used to route light on the chip and also inside active devices for modulation or absorption. The majority of waveguide structures used in silicon-photonic platforms have either strip (rectangular) or ridge geometries (Fig. 2.11). Both consist of a low-loss, high-refractive-index core material and a lower-index cladding material. The cladding materials used in silicon photonics are mostly silicon dioxide (SiO<sub>2</sub>), nitrides, deposited polymers, and even air. The choice of core material, which can be crystalline silicon (c-Si), epitaxially grown crystalline silicon, polysilicon, amorphized polysilicon, or silicon nitride, depends greatly on the platform and the operating wavelength.

The refractive index contrast determines how tightly light is confined inside the core, while some portion extends out of the waveguide core as evanescent fields. Radiation and bending losses can be reduced by confining most of the optical power inside the core. On the other hand, a stronger evanescent field is preferable in some applications such as optical molecular sensing. The evanescent field also enables the waveguide coupling required in devices such as directional couplers and ring resonators. Waveguide losses are caused by core/cladding material absorption, sidewall-roughness scattering induced by line-edge etching, surface scattering, and bending. For active photonic devices with doped waveguide sections, the dominant loss mechanism is free-carrier absorption caused by the presence of free electrons or holes. Waveguides can be built to operate in the single-mode or multi-mode regime and guide TE, TM, or TEM modes.

Waveguides in the 45 nm SOI CMOS process are built in the sub-100 nm thick c-Si body layer by blocking all transistor body dopants to lower the optical loss, and they are sized to guide the fundamental TE mode (Fig. 2.12). In the zero-change 45 nm platform, the measured loss is 3.7 dB/cm at 1280 nm and 4.6 dB/cm at 1550 nm [62]. The extracted intrinsic quality factors



Figure 2.11: Simplified diagrams of two major waveguide geometries in silicon-photonics (strip and ridge waveguides).

of 227 k and 112 k were obtained for 1280 nm and 1550 nm undoped rings with 7 µm radius, respectively. These high Q-factors are made possible by the advanced processing, which offers very small line-edge roughness. Advanced photolithography and patterning in this process also allow ring resonators with radii as small as 5 µm with small bending loss. Further discussion of waveguide structures in the zero-change 32 nm SOI platform can be found in Section 4.2.1.
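An intrinsic quality factor can be converted into an equivalent propagation loss with the standard relation $\alpha = 2\pi n_g / (Q_i \lambda)$, which gives a quick cross-check between ring and straight-waveguide measurements. The sketch below applies it to the two quoted Q-factors. The group index of 3.0 is an assumed value (it is not given in this section, and in practice it differs between the two wavelengths), so the results are indicative only.

```python
import math


def loss_db_per_cm(q_intrinsic: float, wavelength_nm: float, group_index: float) -> float:
    """Propagation loss implied by an intrinsic Q: alpha = 2*pi*n_g / (Q_i * lambda)."""
    wavelength_cm = wavelength_nm * 1e-7
    alpha_per_cm = 2.0 * math.pi * group_index / (q_intrinsic * wavelength_cm)
    # Convert a power attenuation coefficient (1/cm) to dB/cm.
    return 10.0 * math.log10(math.e) * alpha_per_cm


N_G = 3.0  # assumed group index, used for both wavelengths

loss_1280 = loss_db_per_cm(227e3, 1280.0, N_G)
loss_1550 = loss_db_per_cm(112e3, 1550.0, N_G)

print(f"1280 nm: Q_i = 227 k -> {loss_1280:.1f} dB/cm")
print(f"1550 nm: Q_i = 112 k -> {loss_1550:.1f} dB/cm")
```

With this assumed group index, both values land within roughly 1 dB/cm of the measured waveguide losses, which suggests the ring and straight-waveguide measurements are broadly consistent.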



Figure 2.12: Scanning electron micrographs (SEM) of a strip waveguide in the zero-change 45 nm SOI platform [6].

#### 2.3.2 Grating Couplers

Another fundamental block in integrated photonics is the fiber-to-chip coupler. The main approaches to building such a device are vertical and edge coupling. Vertical coupling is achieved through a grating coupler, a periodic structure that diffracts light from the on-chip propagation direction in the waveguide to free space in the direction of the optical fiber. The periodic scattering structure is normally made of the waveguide core material (Fig. 2.13). The diffraction grating size needs to be matched to the size of the single-mode fiber core (8 µm) so that the mode spot sizes at the fiber and at the coupler output match. This requires tapering from the 400 nm-wide waveguide to the grating. Smaller couplers can be used by deploying lensed fibers to reduce the mode diameter at the grating site.

The critical coupling loss mechanisms are back reflection due to the tapering, directionality, fiber-coupler mode mismatch (Gaussian mode shape and mode diameter mismatch), and destructive interference of the up- and down-propagating light. An extra partial etching step can also be added to the process to break the horizontal symmetry of the grating teeth and reduce the directionality loss. To reduce the polarization sensitivity of grating couplers, polarization-splitting grating couplers have been demonstrated by combining two orthogonal gratings [68].



Figure 2.13: Simple diagrams of a diffraction grating coupler indicating loss mechanisms.

Edge couplers are tapered waveguide extensions for coupling to and from a single-mode fiber. They are generally broadband and support both TE and TM polarizations with low insertion loss (lower than 0.5 dB [69]). However, due to their need for precision alignment, facet polishing/etching, and anti-reflection coatings, as well as beam astigmatism, they have limited applications in silicon photonics compared with grating couplers.

Vertical grating couplers have been used in zero-change platforms to couple the light from on-chip waveguides to optical fibers. The grating couplers are implemented using both the c-Si and transistor gate poly-Si layers. Since the two silicon layers can be patterned independently of each other, we have more degrees of freedom for design and optimization compared to other custom silicon photonic processes that use a partial silicon etch for unidirectional grating couplers. This optimization led to couplers achieving 1.5 dB loss (including the taper), and 78 nm 1 dB bandwidth around 1320 nm wavelength [63, 64, 70] (Fig. 2.14). The measured pigtailed optically packaged couplers also achieved 2.5 dB loss [70].
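The coupler losses above are easier to weigh in a link budget once converted to transmitted power fractions. The sketch below does that conversion and adds up a simple fiber-to-fiber path; the 1 mm on-chip path length is an assumed value chosen only for illustration, and the function names are hypothetical.

```python
def db_to_fraction(loss_db: float) -> float:
    """Fraction of optical power transmitted through a loss given in dB."""
    return 10.0 ** (-loss_db / 10.0)


def fiber_to_fiber_loss_db(coupler_db: float, wg_db_per_cm: float, length_cm: float) -> float:
    """Input coupler + on-chip waveguide + output coupler, all in dB."""
    return 2.0 * coupler_db + wg_db_per_cm * length_cm


bare_coupler = db_to_fraction(1.5)       # unpackaged coupler, about 71% transmitted
packaged_coupler = db_to_fraction(2.5)   # pigtailed coupler, about 56% transmitted

# Assumed 1 mm on-chip path at the 3.7 dB/cm waveguide loss of this platform.
path_db = fiber_to_fiber_loss_db(2.5, 3.7, 0.1)

print(f"1.5 dB -> {bare_coupler:.1%}, 2.5 dB -> {packaged_coupler:.1%}")
print(f"packaged fiber-to-fiber loss over 1 mm: {path_db:.2f} dB")
```

For on-chip paths of a millimeter or so, the two couplers dominate the passive loss, which is why coupler optimization receives so much attention.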



Figure 2.14: (a) 3D layout of a unidirectional grating coupler, (b) Optical transmission at 10.5 degree vertical angle [7].

#### 2.3.3 Modulators

Depletion-mode micro-ring modulators can be realized by forming lateral, vertical, or interleaved p-n junction profiles along the ring cavity (Fig. 2.15). While vertical junctions provide the largest resonance shift for fixed doping levels among these profiles (due to the maximum interaction of the optical mode with the depletion region), they require precise vertical control of the doping implantation, which does not exist in commercial CMOS processes. Furthermore, lateral junctions need a partial etch step to form ridge structures, which also does not exist in today's CMOS platforms. Thus, we have used interleaved junction profiles to design the micro-ring modulators and, similarly, to embed p-i-n junctions in resonant photodetectors. It can also be shown that interleaved-junction modulators achieve a larger resonance shift for a fixed applied voltage than modulators with lateral junctions.

In this platform, microring-modulators have been realized by placing interleaved p and n junctions along the ring cavity. This technique utilizes the fine lithography advantage of the deeply-scaled 45 nm process, in order to enable efficient modulation through high junction capacitance density, in the absence of the partial etch or customized doping to form other types of junctions. Resonance wavelength can be modulated by changing the carrier density in the depletion regions of inter-digitated junctions via carrier-plasma effect in silicon [36, 71].

A variety of different p and n doping profiles can be implemented by combining available implants for transistor well and source/drain dopings that set various threshold voltage options. Cathode and anode segments are all connected via spoke-shaped metal contacts in the center of the ring in order to avoid proximity of metal to the optical mode. These 5 µm radius active microrings achieved intrinsic Q-factors of 18 k and up to 10 k loaded Q-factors with 3.2 THz free spectral range (FSR) in the telecom O-band (1310 nm wavelength) [72, 65, 73]. Measured resonance wavelength shift efficiency is $20 \,\mathrm{pm/V}$ in the depletion region (reverse bias). The resonator has an embedded silicided c-Si heater structure for thermal



Figure 2.15: Major p-n junction configurations in depletion-mode modulators.

tuning of the resonance required to compensate for thermal and process variations. The ring heater resistance is  $500\,\Omega$  with a high thermal tuning efficiency (3.8  $\mu$ W/GHz) achieved through a combination of heater proximity to the optical mode and high thermal impedance of the BOX-air interface of the removed Si substrate.
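The figures quoted for the 5 µm active microrings can be related to each other with a few textbook ring-resonator formulas. The sketch below derives the group index from the FSR, the resonance linewidth from the loaded Q, and the frequency shift per volt from the 20 pm/V efficiency. It assumes the optical path length is exactly $2\pi R$ with $R = 5$ µm and evaluates everything at 1310 nm; the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def group_index(fsr_hz: float, radius_m: float) -> float:
    """n_g from the free spectral range of a ring: FSR = c / (n_g * 2*pi*R)."""
    return C / (fsr_hz * 2.0 * math.pi * radius_m)


def linewidth_hz(wavelength_m: float, q_loaded: float) -> float:
    """Full width at half maximum of the resonance: f0 / Q."""
    return (C / wavelength_m) / q_loaded


def frequency_shift_hz(wavelength_m: float, wavelength_shift_m: float) -> float:
    """Convert a small wavelength shift to a frequency shift: c * d_lambda / lambda^2."""
    return C * wavelength_shift_m / wavelength_m ** 2


WAVELENGTH = 1310e-9                                       # O-band wavelength quoted above
n_g = group_index(3.2e12, 5e-6)                            # roughly 3.0
fwhm = linewidth_hz(WAVELENGTH, 10e3)                      # roughly 23 GHz
shift_per_volt = frequency_shift_hz(WAVELENGTH, 20e-12)    # roughly 3.5 GHz/V
volts_per_linewidth = fwhm / shift_per_volt                # roughly 6.5 V

print(f"n_g = {n_g:.2f}, linewidth = {fwhm / 1e9:.1f} GHz")
print(f"shift = {shift_per_volt / 1e9:.2f} GHz/V, {volts_per_linewidth:.1f} V per linewidth")
```

Under these assumptions, a drive swing of a volt or two moves the resonance by only a fraction of its linewidth, which is one way to see why the extinction ratios in Table 2.2 are a few dB rather than tens of dB.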

A segmented ring resonator can also be configured as an optical digital-to-analog converter (ODAC) (Section 3.3.2). The amount of resonance shift can be set by independently controlling the individual interleaved junction segments. Figure 2.16a shows the 3D rendered layout of a spoked ring modulator with separate anode contacts. Our analysis showed improved linearity of this structure over the conventional method of controlling the resonance shift by using electrical DACs to set the voltage applied to the p-n junctions. This device is used to perform 40 Gb/s PAM-4 transmission, and can also be used in other systems such as optical arbitrary waveform generators. We will elaborate on this device and the transmitter design in the next chapter. Figure 2.16b shows the measured optical transmission of a WDM transmitter row with 11 channels. Due to the high lithographic precision and film thickness control of the $45/32\,\mathrm{nm}$ processes, the measured resonances appear in the designed order, with channel-to-channel resonance variation of less than half the channel spacing across the 1.5 mm WDM row length.
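One plausible way to see where a linearity advantage of segment control could come from is a toy model, which is not the analysis referred to above. The sketch assumes a textbook abrupt junction whose depletion width grows as $\sqrt{V_{bi} + V_R}$, a resonance shift proportional to the change in depletion width, and segment contributions that simply add. The built-in potential, full-scale voltage, and the use of three equal segments are all hypothetical values chosen for illustration.

```python
import math

V_BI = 0.7    # assumed built-in potential in volts (illustrative)
V_MAX = 2.0   # assumed full-scale reverse bias in volts (illustrative)
LEVELS = 4    # PAM-4


def depletion_shift(v_reverse: float) -> float:
    """Relative resonance shift of an abrupt junction at a given reverse bias."""
    return math.sqrt(V_BI + v_reverse) - math.sqrt(V_BI)


full_shift = depletion_shift(V_MAX)
ideal = [k / (LEVELS - 1) for k in range(LEVELS)]

# Electrical DAC: the whole ring is driven with equally spaced voltages.
edac = [depletion_shift(V_MAX * k / (LEVELS - 1)) / full_shift for k in range(LEVELS)]

# Optical DAC: three equal segments, each switched between 0 V and V_MAX.
segment_shift = full_shift / (LEVELS - 1)
odac = [k * segment_shift / full_shift for k in range(LEVELS)]

edac_error = max(abs(a - b) for a, b in zip(edac, ideal))
odac_error = max(abs(a - b) for a, b in zip(odac, ideal))

print("electrical DAC levels:", [round(x, 3) for x in edac])
print("optical DAC levels:   ", [round(x, 3) for x in odac])
print(f"worst-case level error: EDAC {edac_error:.3f}, ODAC {odac_error:.3f}")
```

In this toy model the equal-voltage steps compress toward the top of the range because of the square-root dependence, while the segment-switched levels stay evenly spaced. The model covers only the resonance shift; the mapping from shift to transmitted optical power is a separate nonlinearity that it does not capture.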

#### 2.3.4 Photodetectors

A photodetector converts the optical intensity of the light into an electrical current at the receiver side. The phase of the received light field can also be captured by combining photodiodes and optical couplers for coherent detection. Light absorption can be provided by materials such as pure germanium, SiGe (which is already used in advanced CMOS processes) [74, 27], and defect-trap-based poly-Si [75]. Photodetectors can be built either as straight waveguides or combined with resonant cavities. Resonant photodetectors are useful in multi-wavelength systems for optical channel selection. They also benefit from the enhancement of optical power inside the cavity, which improves the responsivity per unit detector area and enhances weak absorption effects such as those in low mole-fraction SiGe materials and defect-based poly-Si. Compact footprints are favorable for achieving higher receiver bandwidth density and reducing the parasitic RC of the device junctions. In doing so, the



Figure 2.16: (a) 3D layout of a spoked-ring modulator, (b) Optical transmission of a WDM transmitter row with 11 channels (numbers indicate channel ordering) over 3.2 THz FSR. Channel 3's heater is turned on by 20% strength to show the individual resonance tuning functionality [7].

effective absorption length is boosted linearly by the finesse of the cavity.

In multi-wavelength systems, we can place the photodetector at the drop-port output of a ring-resonator filter to perform optical filtering and detection. Alternatively, we can embed *p-i-n* junctions directly in the ring cavity to combine filtering and detection. This approach reduces the overall size and avoids the extra coupling loss of the drop port. The main trade-off is that, although a higher Q-factor (higher finesse) potentially leads to higher responsivity, it also limits the optical bandwidth of the receiver, which is a disadvantage at high data rates.
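The Q-factor versus optical bandwidth trade-off can be illustrated with the standard relation between the full-width-at-half-maximum bandwidth of a resonance and its loaded Q, $\Delta f \approx f_0 / Q$. The Python sketch below is only an illustration: the function name, the 1310 nm wavelength, and the Q values are assumptions chosen for the example, not device data from this work.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def optical_bandwidth_hz(wavelength_m: float, loaded_q: float) -> float:
    """FWHM optical bandwidth of a resonance: f0 / Q, with f0 = c / wavelength."""
    return C / wavelength_m / loaded_q


# Hypothetical loaded Q values at an assumed O-band wavelength of 1310 nm.
for q in (5_000, 10_000, 20_000):
    bw = optical_bandwidth_hz(1310e-9, q)
    print(f"Q = {q:>6}: optical bandwidth ~ {bw / 1e9:.1f} GHz")
```

Doubling Q halves the available optical bandwidth, which is why a resonant PD tuned for maximum responsivity can become the bandwidth bottleneck at high data rates.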

Both 45 nm and 32 nm CMOS nodes feature epitaxial SiGe to improve the performance of PMOS devices (Fig. 2.17a). Embedded SiGe (eSiGe) with a Ge concentration of around 20% has been used in the source/drain regions of PMOS transistors to apply compressive stress since the 45 nm technology node [9]. To compensate for the low Ge concentration and minimize the PD parasitic capacitance, we built resonant PDs by forming p-i-n junctions in the ring resonator's cavity, as shown in Fig. 2.17b.

Resonant eSiGe detectors in the $45\,\mathrm{nm}$ platform showed responsivities of $0.55\,\mathrm{A/W}$ and $0.5\,\mathrm{A/W}$ at $1180\,\mathrm{nm}$ and $1270\,\mathrm{nm}$ wavelengths, respectively, at a $-4\,\mathrm{V}$ bias [66, 70]. This PD has a best-in-class dark current of $20\,\mathrm{pA}$ and an electro-optical $3\,\mathrm{dB}$ bandwidth of $5\,\mathrm{GHz}$, limited by the RC of the junctions.

Two types of resonant SiGe-based PDs are implemented in the $32\,\mathrm{nm}$ process using the two variants of epitaxial SiGe available in this process. Photonic structures in this process are still at an early stage of development, and the PDs are implemented using unoptimized microring resonators with a loaded Q-factor of $6.5\,\mathrm{k}$ (intrinsic Q $> 15\,\mathrm{k}$). PDs using the eSiGe layer achieved $0.06\,\mathrm{A/W}$ responsivity at $1310\,\mathrm{nm}$. This technology node also features another epitaxially grown SiGe layer with a higher Ge concentration (approximately 40%), which leads to higher responsivity. This SiGe epi layer is used for PMOS channels (cSiGe) to reduce the threshold voltage $(V_{TH})$ after the introduction of the metal gate to the process [10]. Measurements showed that cSiGe-based resonant PDs have an improved responsivity of $0.13\,\mathrm{A/W}$ at $-8\,\mathrm{V}$ bias. The responsivity of both types of SiGe PDs will improve as the Q of the microrings improves through the reduction of optical loss. These devices exhibit a $>12.5\,\mathrm{GHz}$ 3 dB bandwidth (measured via a $13.5\,\mathrm{GHz}$ VNA) and $150\,\mathrm{nA}$ dark current. Details of these devices in a zero-change $32\,\mathrm{nm}$ SOI CMOS platform and electronic-photonic system results are described in Chapter 4.

Figure 2.17: Photodetectors in zero-change platforms: (a) PMOS cross-sections in 45 nm and 32 nm processes and their features used for O-band light detection, (b) 3D layout of a resonant SiGe PD in 32 nm, (c) and (d) Micrograph and cross-section of the defect-based resonant PD for L-band [7].

We have also extended the operation of modulators and PDs beyond the O-band. We have redesigned the microring spoked modulators for operation in 1550 nm (C-band) and have demonstrated 25 Gb/s modulation [76]. The loaded Q-factor of 13 k of these modulators shows that the sub-100 nm thickness of the silicon device layer is not a limiting factor of these platforms (32 nm and 45 nm PD-SOI nodes) to implement compact and high performance devices for wavelengths longer than the O-band (most silicon-photonic platforms have a silicon thickness of 200-250 nm).

In addition to the SiGe PDs, defect-based resonant PDs covering the optical telecommunication O to L bands have been demonstrated [77]. These PDs rely on the absorption enabled by the defect states in the transistor gate polysilicon layer [31]. Figure 2.17c is a micrograph of this design, and Fig. 2.17d depicts the cross-section of the absorption region. This PD achieved $0.15\,\mathrm{A/W}$ responsivity with $10\,\mathrm{GHz}$ bandwidth at $-15\,\mathrm{V}$ bias.

#### 2.4 Photonic SoC Design Flow

The design of large-scale electronic-photonic integrated systems requires new CAD tools to model and co-simulate electro-optical dynamics and to automate the entire design flow. These tools allow us to build photonic SoCs in the fashion of existing VLSI flows in electronics. As systems grow to hundreds of photonic elements and millions of transistors [6], the need for such photonic SoC CAD tools is becoming pressing.

In order to consolidate electronic and photonic design flows, we first leveraged the original CMOS node's process design kit (PDK) by including new photonic layers that should be recognized and processed by the CAD tools. Based on this modified PDK, we developed new tools for designing photonic VLSI systems. Figure 2.18 presents the design flow for a photonic SoC comprised of conventional VLSI design and newly developed tools in our research group. Here, we briefly describe some of the major parts of these new tools:

**Electronic-photonic co-design and co-simulation.** Simulating and modeling both electronics and photonics in a unified environment, with high accuracy on both sides, is the first step in designing photonic SoCs. Although CMOS circuitry is already supported by abundant CAD tools for simulation using either schematic- or layout-level models, photonic designers use electromagnetic (EM) field solvers such as finite-difference time-domain (FDTD) tools to design and verify their devices. Fundamentally, because of the frequency-range mismatch between electronics and optics and the computational complexity of optical modeling, merging the existing CAD tools is not trivial. Additionally, simulation speed is a critical factor, as many iterations are required at the system design step.

To ease some of the above-mentioned issues, we have developed multiple tools that can be utilized at different system design levels. DSENT (Design Space Exploration for Network Tool) is a modeling framework for design space exploration with a focus on networks-on-chip (NoC) [78]. By considering CMOS technology parameters and scaling, photonic behavioral parameters, and thermal tuning, the tool can be used to compare the energy-efficiency of WDM links.

We have also built a framework in a Simulink toolbox to run system-level simulations (Fig. 2.19) and, using optical transmitters as an example, showed the importance of co-designing electronics and photonics to achieve high energy-efficiency. This tool captures the electro-optical behavior of photonic devices such as phase shifters and ring resonators [8]. Additionally, it uses more detailed photonic models by taking physical parameters such as doping levels into account. The optimum ring-resonator coupling, doping levels, drive voltage, etc. can be analyzed and found via this simulator.

Figure 2.18: Photonic SoC design flow.

Ultimately, we developed a Cadence toolkit library written in Verilog-A for the simulation of electro-optical systems [79]. In this tool, we can directly use the most accurate CMOS device models from the PDK in conjunction with behavioral models of the photonic devices. Photonic models can be built by simple instantiation of fundamental primitives such as waveguides and couplers. Both the amplitude and phase of optical signals, as well as optical-electrical interactions, can be simulated. The photonic models are highly accurate and match the Matlab analytical models and measurement results. We can use this tool for both time-domain transient and frequency-domain simulations. So far, we have used this tool to design WDM optical links, single-sideband modulators, and Pound-Drever-Hall (PDH) locking loops. Other nonlinearities and non-idealities such as the Kerr effect, dispersion, etc. have been added to the models for use in other electronic-photonic applications.

**Photonic automated layout generation.** We recently presented an automated layout generation tool capable of producing photonic devices and circuits and auto-correcting major DRC violations [80]. This tool is written in SKILL, the native language of the mainstream electronic IC design automation software, Cadence. This allows seamless integration of photonic and electronic design in a single environment predominantly used for SoC design. The tool has a modular structure: the basic photonic elements are first defined and parametrized, and they can later be assembled with automatic waveguide routing to make a full photonic circuit. Moreover, designers need only use abstract photonic layers and do not have to deal with the complex layer combinations required to realize certain physical layers. This makes the tool compatible with various silicon-photonic platforms, and designers do not need full access to the underlying proprietary technology information.

Figure 2.19: The flowchart of the Simulink co-optimization framework for silicon-photonic transmitters [8].

**Photonic design rule check (DRC).** All designs sent to the foundry for tape-out must conform to the manufacturing rules. This is normally verified by running a DRC rule deck, provided by the foundry, with a tool such as Mentor Graphics Calibre. For our zero-change platforms, the rules are already specified in the original PDK. However, we need to make sure that all the utility layers we use in our photonic designs are also checked, by modifying the boolean rules that generate these layers. In Chapter 5, where we change the process by adding 5 new masks/layers to the CMOS platform, we had to write our own DRC rules in the Calibre Standard Verification Rule Format (SVRF) language and include them in the original CMOS process's rule file.

**Photonic layout versus schematic (LVS).** We developed a photonic LVS verification tool using Calibre software that is compatible with the original CMOS PDK and can be used in various integration technologies with similar device designs. This tool verifies that the physical geometries of the photonic devices in the layout match the desired design specifications, and ensures that mistakes do not make it to tape-out. These mistakes can be on the electrical side, such as connecting a photonic device with the wrong polarity, or on the optical side, such as placing two waveguides too close together and causing unintentional optical coupling, or leaving breaks between waveguide segments. Furthermore, just as in electrical LVS, the tool extracts particular parameters such as the ring resonator's radius and coupling, the grating coupler's pitch, etc., which can be used for post-layout extracted (PEX) electro-optical simulation to close the design loop.

We will be using the photonic SoC design flow in Chapter 3 to build a PAM-4 optical transmitter in a zero-change 45 nm SOI CMOS. Moreover, we will implement an SoC containing about 400 photonic devices and 32M transistors in a modified 65 nm bulk CMOS process in Chapter 5 using the same flow.

#### 2.5 Energy-efficiency & Bandwidth-density

The majority of today's commercial optical transceivers are designed as pluggable modules with relatively large form factors, such as QSFP28, that attach to the edge of the motherboards in server blades and switches. Compact form factors are favorable because they maximize the number of connections to the blade, which determines the total available bandwidth per blade. However, the package must be large enough to dissipate the heat generated by the transceiver. Additionally, the cost of packaging, including laser integration and fiber interfacing, is another important factor in choosing the form factor and packaging scheme. Today, state-of-the-art 100G ($4 \times 25\,\mathrm{Gb/s}$ NRZ) pluggable modules consume about 3 W. This corresponds to about 30 pJ/b for transferring a bit over the link. Figure 2.20 shows the energy breakdown for this type of transceiver. Although the bandwidth density is currently determined by the size of the package, the physical limit estimated from the size of the photonic die (the largest component in today's optical transceivers) is still only $<36\,\mathrm{Gb/s/mm^2}$. In contrast, next-generation hyper-scale data centers and HPC demand optical links with $\mathrm{Tb/s/mm^2}$ bandwidth densities and $<5\,\mathrm{pJ/b}$ energy-efficiency [1]. This wide gap between the future need and the state of the art can be addressed with the techniques described in this thesis. Here, we briefly discuss how this work can improve the energy and area consumption of each of these components to meet the demands of future optical interconnects.
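The 30 pJ/b figure follows directly from dividing module power by data rate, and it matches the sum of the components in Figure 2.20. The short Python sketch below reproduces this arithmetic; the function and variable names are illustrative assumptions, while the numbers are those quoted in the text and in Figure 2.20.

```python
def energy_per_bit_pj(power_w: float, data_rate_gbps: float) -> float:
    """Energy per bit in pJ/b for a given power (W) and data rate (Gb/s)."""
    return power_w / (data_rate_gbps * 1e9) * 1e12


# 100G pluggable module consuming about 3 W (values from the text).
module_pj_per_bit = energy_per_bit_pj(3.0, 100.0)

# Estimated breakdown from Figure 2.20, in pJ/b.
breakdown_pj_per_bit = {"electrical_link": 10.0, "tx": 10.0, "rx": 2.0, "laser": 8.0}
breakdown_total = sum(breakdown_pj_per_bit.values())

print(f"module: {module_pj_per_bit:.1f} pJ/b, breakdown total: {breakdown_total:.1f} pJ/b")
```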

| Electrical Link | Tx      | Rx     | Laser  |
|-----------------|---------|--------|--------|
| 10 pJ/b         | 10 pJ/b | 2 pJ/b | 8 pJ/b |

Figure 2.20: Estimated energy breakdown for today's commercial silicon-photonic transceivers.

**Electrical Link Energy.** About 10 pJ/b is consumed by the electrical link connecting the edge connector to the CPU/GPU/switch ASIC sitting in the middle of the motherboard (with about 10-inch-long traces on the motherboard PCB). This portion is increasing further due to the higher copper channel losses at the higher data rates required in the near future ($>50\,\mathrm{Gb/s}$). To alleviate this, modern electrical links are shifting towards multi-level modulations like PAM-4, but they still suffer from relatively poor energy-efficiency. One promising way to reduce this power consumption is to decrease the distance between the optical transceiver and the processor or switch chip. Currently, there are some commercial efforts to move the electro-optical (E/O) bridge chip from the edge to the surface of the board to reduce the signal trace lengths. However, electrical signals still have to go through the connectors and PCB traces, and the overall energy saving is not significant. Ultimately, if we could co-package the optical transceiver chip with the SoC, the distance would be on the order of millimeters without any socket connector, and consequently the energy-efficiency of the electrical link could be around 1 pJ/b [81].

Today's high-performance SoCs already use complex packages to co-package multiple platforms in a single socket. For instance, recent Nvidia and AMD GPUs use interposers to place high-bandwidth memories (HBM) very close to the processing unit, lowering the latency and increasing the bandwidth of memory-processor interconnects. In doing so, the energy cost of the interconnect between the processor and other chips (on the same interposer) is 0.5-1 pJ/b. Hence, if the optical transceiver chip can achieve bandwidth densities high enough to be placed in these types of packages, we can drastically cut down the electrical link energy from 15 pJ/b to less than 1 pJ/b. Ultimately, enabling monolithic photonics in advanced CMOS processes allows optical transceivers to be implemented directly on the SoC chips and eliminates the need for any electrical interconnect between the optical transceiver and the SoC. Consequently, the processor connects to the optical transceiver through on-chip wiring with sub-100 fJ/b energy-efficiency.

**Optical Transmitter Energy.** The second critical part of the energy consumption is the optical transmitter's power. About 10 pJ/b is consumed by the optical transmitter, mainly due to the use of area- and energy-inefficient MZI modulators in today's commercial transceivers. These MZIs are normally bulky devices with 3 mm long footprints and picofarads of capacitance to drive. They are also the dominant contributor to the transceiver's area, which limits the bandwidth density to below $36\,\mathrm{Gb/s/mm^2}$. They are widely used in industry because of their broadband optical characteristics, which do not require resonance wavelength tuning. However, even state-of-the-art CMOS drivers in 7 nm nodes cannot lower their energy consumption below 3.5 pJ/b (for 26 GBaud NRZ) [3], while micro-ring resonators consume less than 50 fJ/b of driver energy. Thus, we need to use more energy-efficient optical modulators, such as micro-rings, to improve the transmitter's energy.

In Chapter 3, we will address the thermal tuning and optical bandwidth limitations of micro-ring modulators at high data rates. Note also that the high-speed PLL and serializers are unavoidable parts of any transmitter. They consume about 350 fJ/b in the 45 nm process node, assuming a PLL shared between 4 transmitters. Thus, the total transmitter energy in this process node is around 400 fJ/b, and since it is dominated by the circuit energy, the only way to improve this number further is to move monolithic photonics to a more advanced CMOS node. We will discuss this issue in Chapter 5 and demonstrate a solution for this purpose. Moreover, we can achieve a modulator and driver bandwidth density of 3.6 Tb/s/mm² by using micro-rings at high data rates and leveraging the low parasitics between the ring modulator and the driver circuits in the monolithic zero-change integration platform [15].

**Optical Receiver Energy.** Generally, there is a trade-off between the receiver's energy-efficiency and its sensitivity. However, the receiver's sensitivity directly determines the optical power required at the laser output. Due to the low wall-plug efficiency of lasers and the challenges of building high-power laser sources, designers prefer to achieve the best sensitivity within a reasonable receiver circuit energy bound. As a state-of-the-art example, a 64 Gb/s NRZ optical receiver has been demonstrated in a 14 nm CMOS FinFET process with 1.4 pJ/b energy-efficiency [82]. Despite the deployment of various circuit techniques to improve the sensitivity, there are two main factors impacting the energy-efficiency. First, the total PD capacitance, including the parasitic capacitance, directly affects the sensitivity and hence the energy-efficiency. For instance, in [82] the capacitance is estimated to be 69 fF, dominated by the wire-bonding capacitance. This capacitance imposes bandwidth limitations that must be resolved via equalization techniques (such as continuous-time linear equalization (CTLE) or decision feedback equalization (DFE)), which come with their own energy penalty. A smaller parasitic capacitance also relaxes the bandwidth limitations on the design of the receiver's analog front-end, leading to higher sensitivity (for instance, by increasing the TIA gain). Unlike in monolithic platforms, this capacitance is strongly dominated by the capacitance of the electronic-to-photonic interconnect and not by the capacitance of the PD itself. Second, using more advanced CMOS nodes can reduce the circuit energy consumption thanks to the area/energy scaling of each CMOS generation. Thus, monolithic photonics in advanced CMOS processes can reduce the overall link energy consumption by reducing both the receiver circuit energy and the laser energy, the latter due to the improved receiver sensitivity that results from low PD parasitic interconnect capacitance. Since the receiver area is mostly composed of electrical circuitry, using more advanced process nodes reduces the receiver's area and eventually enhances the total transceiver's bandwidth density.

**Laser Energy.** To analyze the factors determining the required laser power, we can follow the optical power flow through the link, which is illustrated in Fig. 2.21. A laser source with a wall-plug efficiency of $\Gamma_L$ and an output power of $P_L$ is coupled into the transmitter chip. The light is modulated via an optical modulator with a normalized OMA of $OMA_{TX}$ and an insertion loss of IL. Next, the modulated light is coupled out into a second fiber and transferred to the receiver chip.

Figure 2.21: Optical power flow in an optical link.

Now, we can write the laser energy per bit  $(E_L)$  as:

$$E_L = \frac{OMA_{RX}}{\eta_L \cdot \Gamma_L \cdot OMA_{TX} \cdot DR} \tag{2.6}$$

where $OMA_{RX}$ is the minimum OMA required by the receiver for the target data rate (DR) and bit error rate. Note that $OMA_{RX}$ is the absolute difference between $P_{High}$ and $P_{Low}$, the optical power levels at the receiver input for bit "1" and "0", respectively. $\eta_L$ is the total optical path loss, including the loss of three couplers, the modulator's IL, and the loss of on-chip waveguides and fibers/connectors. It is clear that a more sensitive receiver (i.e., smaller $OMA_{RX}$) directly reduces the required laser power. However, achieving sensitivities better than $-15\,\mathrm{dBm}$ at high data rates (above $50\,\mathrm{Gb/s}$) with bit error rates (BER) of $< 10^{-7}$ is very challenging even in advanced CMOS processes. Monolithic platforms can potentially help in this regard by providing a low-parasitic connection between the PD and the circuits. Reducing the fiber couplers' loss below 1 dB is also essential to improve the laser energy-efficiency. Overall, we expect to lower $E_L$ to near 1 pJ/b by solving the above-mentioned challenges in our monolithic photonic platforms.
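Equation 2.6 can be evaluated numerically to see how each term scales the laser energy. The Python sketch below is a minimal illustration, assuming that $\eta_L$ enters as a linear power transmission factor (between 0 and 1). The function names and every numerical value in the example call are placeholders chosen for illustration, not measured results from this work.

```python
def dbm_to_watts(p_dbm: float) -> float:
    """Convert optical power from dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)


def loss_db_to_transmission(loss_db: float) -> float:
    """Convert a path loss in dB to a linear power transmission factor."""
    return 10 ** (-loss_db / 10)


def laser_energy_per_bit(oma_rx_w, eta_l, gamma_l, oma_tx, data_rate_bps):
    """Eq. 2.6: E_L = OMA_RX / (eta_L * Gamma_L * OMA_TX * DR), in J/b."""
    return oma_rx_w / (eta_l * gamma_l * oma_tx * data_rate_bps)


# Illustrative placeholder values (assumptions, not data from this thesis):
oma_rx = dbm_to_watts(-15.0)           # receiver OMA requirement
path_loss_db = 3 * 1.0 + 3.0 + 1.0     # three couplers + modulator IL + routing
eta_l = loss_db_to_transmission(path_loss_db)
e_l = laser_energy_per_bit(oma_rx, eta_l, gamma_l=0.10, oma_tx=0.5,
                           data_rate_bps=40e9)
print(f"E_L = {e_l * 1e15:.0f} fJ/b for the assumed parameters")
```

Because every term in the denominator is multiplicative, each extra 3 dB of path loss or 3 dB of lost receiver sensitivity doubles the laser energy per bit.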

In total, we are expecting to achieve  $<1\,\mathrm{pJ/b}$  energy-efficiency for each of the energy components (ultra short-reach electrical link, transmitter, receiver, and laser source) at  $40\,\mathrm{Gb/s}$  data-rate. This leads to  $<4\,\mathrm{pJ/b}$  energy-efficiency in total that is about  $7.5\times$  more efficient compared with today's state-of-the-art.

Next-generation hyper-scale data centers and HPC demand optical links with $\mathrm{Tb/s/mm^2}$ bandwidth densities and $<5\,\mathrm{pJ/b}$ energy-efficiency. Overall, zero-change monolithic photonics in $32/45\,\mathrm{nm}$ is capable of providing energy-efficiencies of $<4\,\mathrm{pJ/b}$ with bandwidth densities of $>0.5\,\mathrm{Tb/s/mm^2}$. Moving beyond this milestone demands monolithic photonics in sub-10 nm processes, where we expect to achieve an energy-efficiency of $<2\,\mathrm{pJ/b}$ and a bandwidth density of $>2\,\mathrm{Tb/s/mm^2}$. Furthermore, once high-performance CPUs/GPUs demand $>10\,\mathrm{Tb/s}$ of off-chip bandwidth, electrical links cannot meet the limited energy budget. The energy budget of off-chip links cannot be increased, because CMOS SoCs have a limited total power budget of around $300\,\mathrm{W}$, which does not improve with Moore's law. The off-chip power budget is currently about 10% of the total SoC power consumption $(30\,\mathrm{W})$. Therefore, implementing optical transceivers with energy-efficiency better than $3\,\mathrm{pJ/b}$ directly on CPU/GPU dies (fabricated in sub-10 nm technologies) can solve a major challenge in chip-to-chip interconnects (as well as in networking interconnects).
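The 3 pJ/b target follows from dividing the off-chip power budget by the required off-chip bandwidth, using the figures quoted above (about 10% of a roughly 300 W SoC budget, and 10 Tb/s of off-chip bandwidth). The symbols $E_{\mathrm{bit}}$, $P_{\mathrm{off\text{-}chip}}$, and $BW_{\mathrm{off\text{-}chip}}$ are labels introduced here for this worked step only:

$$E_{\mathrm{bit}} \le \frac{P_{\mathrm{off\text{-}chip}}}{BW_{\mathrm{off\text{-}chip}}} = \frac{0.1 \times 300\,\mathrm{W}}{10\,\mathrm{Tb/s}} = \frac{30\,\mathrm{W}}{10^{13}\,\mathrm{b/s}} = 3\,\mathrm{pJ/b}$$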

From the cost point of view, monolithic photonic platforms and co-packaging of optical transceivers with SoCs can also significantly reduce the total cost per Gb/s. Currently, the cost of optical transceivers is around \$5/Gb/s, which ideally needs to be scaled down to less than \$1/Gb/s for the next generations. Reusing recently developed heterogeneous packages significantly reduces the cost of these transceivers by eliminating the need for extra packaging (like QSFPs). Although we may still need an off-package laser module, the cost of that package can be amortized over multiple transceivers. Fibers and connectors are another component of the total system cost, which can be reduced by increasing the capacity per fiber. This will be achieved by utilizing the DWDM scheme to boost the aggregate data rate per fiber beyond $1\,\mathrm{Tb/s}$. Finally, fabrication and integration will be much cheaper by reusing the mature and cost-efficient CMOS technologies discussed in this thesis.

## Chapter 3

## **Electronic-Photonic Co-optimization**

The next generations of large-scale data centers and supercomputers demand that optical interconnects migrate to 400G and beyond. Microring modulators in silicon-photonic VLSI chips are promising devices to meet this demand due to their energy-efficiency and compatibility with dense wavelength division multiplexed (DWDM) chip-to-chip optical I/O. Higher-order pulse amplitude modulation (PAM) schemes can be exploited to mitigate their fundamental energy-bandwidth trade-off at the system level for high data rates. In this chapter, we propose an optical digital-to-analog converter based on a segmented microring resonator, capable of operating at 20 GSym/s with improved linearity over conventional optical multilevel generators, that can be used in a variety of applications such as optical arbitrary waveform generators, PAM transmitters, etc. Using this technique, we demonstrate a PAM-4 transmitter that directly converts the digital data into optical levels in a commercially available 45 nm SOI CMOS process. We achieved 40 Gb/s PAM-4 transmission at 42 fJ/b modulator and driver energy, and 685 fJ/b total transmitter energy-efficiency, with an area bandwidth density of $0.67\,\mathrm{Tb/s/mm^2}$. The transmitter incorporates a thermal tuning feedback loop to address the thermal and process variations of the microrings' resonance wavelength. This scheme is suitable for system-on-chip applications with a large number of I/O links, such as switches and general-purpose and specialized processors in large-scale computing and storage systems.

### 3.1 Motivation

Today's silicon photonic links still do not meet the cost and energy requirements of future interconnects due to high integration/packaging costs and the poor energy-efficiency of laser sources. Laser wall-plug efficiencies are still below 10% over the standard temperature range, despite many efforts to heterogeneously integrate un-cooled lasers [46, 48]. Hence, scaling the overall link data rate by increasing the data rate per wavelength is a favorable direction as long as it also improves the energy-efficiency. However, designing energy-efficient optical transceivers at data rates beyond 50 Gb/s is a challenging problem. Bandwidth limitations of optical modulators and directly modulated lasers are leading optical links toward higher-order modulations to increase the spectral efficiency. PAM-4 has recently been adopted in optical transceivers to double the data rate and accommodate the emerging PAM-4 electrical line rates and modulation formats without introducing PAM-4 to NRZ gearboxes. For instance, all of the proposed long-reach 400G IEEE standards (400G-DR4, 400G-FR8, and 400G-LR8) are based on this scheme. Multi-level optical modulation can be realized with an electrical DAC driving an optical Mach-Zehnder interferometer (MZI) modulator and/or by segmenting the phase-shift portions of the MZI and driving them digitally [30]. However, MZIs with a high enough extinction ratio (ER) are inherently millimeter-sized devices, which leads to high energy consumption, high insertion loss (IL), and large footprints. These transmitters have energy-efficiencies around 5 pJ/b, which ironically dominates the total link power budget. MZIs with improved phase shifters [37, 38] also cannot alleviate this issue, as their fabrication in advanced monolithic CMOS processes is problematic, requiring hybrid integration, which in turn reduces energy-efficiency.

Silicon microring resonators have proven to be energy-efficient, compact, and CMOS-compatible devices for high-speed optical transmitters [83]. Moreover, their wavelength selectivity enables DWDM links at the µm scale and consequently increases the aggregate link bandwidth drastically. The thermal and process variation sensitivity issues of microring modulators have also been addressed [5]. However, microring modulators have limited optical bandwidths due to their resonant nature, which constrains the link performance by trading off the optical modulation amplitude (OMA) against the optical bandwidth [84]. Optical PAM-4 microring modulators have recently been demonstrated [85] in a hybrid platform using an electrical DAC driver.

In this work, we demonstrate an optical digital-to-analog converter (ODAC) based on a segmented ring-resonator in an unmodified ("zero-change") state-of-the-art 45 nm SOI CMOS monolithic platform described in Section 2.3. Using this scheme, we propose an O-band optical PAM-4 transmitter, which eliminates the electrical DAC and its non-linearity, power consumption, and area overhead. Measurements show two orders of magnitude improvement in the energy-efficiency compared to the MZI-based PAM-4 transmitters. This high-speed resonant ODAC also enables a multitude of optical pulse shaping applications such as optical arbitrary waveform generators (AWG) and RF-photonics [86, 87, 88].

This chapter is organized as follows: Section 3.2 describes the benefits of using PAM-4 over conventional NRZ modulation in ring-resonator based optical links. The idea of a segmented ring-resonator based ODAC is presented in Section 3.3. Sections 3.4 and 3.5 elaborate on the transmitter's architecture and building blocks. Experimental results are discussed in Section 3.6 and compared with prior works in Section 3.7.

### 3.2 Choosing the Right Modulation

In this section, we present an overview of ring-resonator based optical link modeling and compare the energy-efficiency of PAM-4 and NRZ modulation schemes for these links. For PAM-4 modulation, two data bits are transmitted per symbol period (Fig. 3.1). In doing so, we can essentially double the data rate within a fixed available bandwidth. Bits are normally mapped to the levels according to Gray coding, so that an incorrect decision between adjacent levels on the receiver side causes only one bit error per symbol. Here, for simplicity, we associate symbols with the levels linearly. In the next part, we analyze and discuss the trade-offs between NRZ and PAM-4 microring-based optical links and show under which conditions it is more energy-efficient to use higher-order modulations.
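The difference between the linear and Gray bit-to-level mappings can be checked with a few lines of Python. This is a minimal sketch: the mapping tables and function names are illustrative assumptions, with the linear map following the 00, 01, 10, 11 level ordering used in this chapter.

```python
# Bit pair -> PAM-4 level (0 is the lowest optical level, 3 the highest).
LINEAR_MAP = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 3}
GRAY_MAP = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def bits_to_levels(bits, mapping):
    """Group a bit sequence into pairs and map each pair to a PAM-4 level."""
    if len(bits) % 2:
        raise ValueError("PAM-4 needs an even number of bits")
    return [mapping[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def worst_adjacent_bit_errors(mapping):
    """Largest number of bit errors caused by mistaking a level for its neighbor."""
    level_to_bits = {level: pair for pair, level in mapping.items()}
    worst = 0
    for level in range(3):
        a, b = level_to_bits[level], level_to_bits[level + 1]
        worst = max(worst, sum(x != y for x, y in zip(a, b)))
    return worst


bits = [0, 0, 0, 1, 1, 0, 1, 1]
print(bits_to_levels(bits, LINEAR_MAP), worst_adjacent_bit_errors(LINEAR_MAP))
print(bits_to_levels(bits, GRAY_MAP), worst_adjacent_bit_errors(GRAY_MAP))
```

With the linear map, confusing levels 1 and 2 flips both bits, whereas with Gray coding any adjacent-level error flips exactly one bit.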



Figure 3.1: PAM-4 versus NRZ modulation eye-diagrams.

#### 3.2.1 Modeling & Discussion on Ring-resonator based Optical Links

Figure 3.2 shows an optical link with a ring-resonator based transmitter. For simplicity, we focus on a single-wavelength link, although the following discussion can be extended to DWDM optical links as well. Light from a single-wavelength ($\lambda_{Laser}$) laser source (Fig. 3.2a) with a wall-plug efficiency of $\eta_L$ and an output optical power of $P_L$ is coupled into the transmitter. On the transmit side (Fig. 3.2b), a microring resonator imprints the digital data stream onto the input light. As described in Section 2.1.1, microring resonators are waveguide loops coupled to a bus waveguide, which trap input light of wavelength $\lambda$ inside the loop (cavity) whenever the round-trip optical length is an integer multiple of $\lambda$. Thus, the microring resonator acts as an optical filter with a resonance wavelength of:

$$\lambda_0 = (n_{eff} \cdot L)/m \tag{3.1}$$

where $n_{eff}$ is the effective refractive index, L represents the ring's circumference, and m is the integer number associated with this resonance.

Figure 3.2: A ring-resonator based optical link.

The normalized thru-port transmission, $\alpha(\lambda)$, of a ring resonator around this resonance wavelength can be fit to a Lorentzian characteristic [5, 29] as follows:

$$\alpha(\lambda) = 1 - \frac{A}{1 + 4\left(\frac{\lambda - \lambda_0}{\Delta \lambda}\right)^2} \tag{3.2}$$

where $\Delta\lambda$ is the linewidth around the $\lambda_0$ resonance. The constant A indicates the device's intrinsic extinction: A=1 when the device is perfectly critically coupled [31], such that $\alpha(\lambda_0)=0$. The ring's Q-factor is equal to $\lambda_0/\Delta\lambda$ and depends on the round-trip optical loss as well as the coupling strength to the bus waveguide. The Lorentzian transmission and relevant parameters are detailed in Fig. 3.2b.
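Equation 3.2 is simple to evaluate numerically. The Python sketch below implements it together with the Q-factor relation $Q = \lambda_0/\Delta\lambda$; the function names and the example wavelength and linewidth are illustrative assumptions.

```python
def thru_transmission(wavelength, lambda_0, linewidth, a=1.0):
    """Normalized thru-port transmission of Eq. 3.2 (Lorentzian dip)."""
    detuning = (wavelength - lambda_0) / linewidth
    return 1.0 - a / (1.0 + 4.0 * detuning ** 2)


def q_factor(lambda_0, linewidth):
    """Loaded Q-factor, Q = lambda_0 / linewidth."""
    return lambda_0 / linewidth


# Assumed example: a 1310 nm resonance with a 100 pm linewidth.
lambda_0, linewidth = 1310e-9, 100e-12
print(f"Q = {q_factor(lambda_0, linewidth):.0f}")
for offset_pm in (0, 50, 200):
    t = thru_transmission(lambda_0 + offset_pm * 1e-12, lambda_0, linewidth)
    print(f"offset {offset_pm:>3} pm: transmission = {t:.3f}")
```

At a detuning of half the linewidth the dip depth is exactly A/2, which is what defines $\Delta\lambda$ as the full width at half maximum.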

Active microrings can be built using p-n or p-i-n junctions capable of shifting the resonance wavelength via the free-carrier plasma dispersion effect [36, 71]. The amount of resonance shift is a function of the applied voltage, the doping profiles, and the junction geometry. This resonance shift can be utilized for amplitude modulation of a carrier wavelength. Figure 3.2b illustrates an example of on-off keying (OOK) modulation, where the transmitted light intensity is digitized into $P_{High}$ and $P_{Low}$ levels. The separation of the highest and lowest normalized optical powers $(P_{High} - P_{Low})$ in any PAM modulation can be defined as the transmitter OMA $(OMA_{TX,outer})$, where "outer" indicates the distance between the highest and lowest levels (for instance, the distance between optical levels 0 (00) and 3 (11) in PAM-4 modulation). $OMA_{TX,outer}$ depends on the Q-factor and the amount of resonance shift. Since the resonance shift is governed mainly by the electrical and optical capabilities of the technology, $OMA_{TX,outer}$ is dictated by the Q-factor. However, the Q-factor should be decreased proportionally as the data rate increases, in order to provide enough optical bandwidth for inter-symbol interference (ISI)-free communication [84]. Hence, there is a trade-off between $OMA_{TX,outer}$ and the optical bandwidth of a microring modulator, as illustrated in Fig. 3.3.



Figure 3.3: Trade-off between  $OMA_{TX,outer}$  and ring resonator's available optical bandwidth.

Figure 3.4a presents the achievable $OMA_{TX,outer}$ for different required bandwidths. In this model, the ring is always critically coupled with A=0.9 and a resonance wavelength shift of $20\,\mathrm{pm/V}$ of reverse bias voltage. The applied voltage is determined by the driver's voltage swing $(V_{DR})$, set to either 1.5 V (for the CMOS inverter based drivers used in this work) or 4 V, which can be achieved by differential thick-oxide or cascode drivers [54]. A further increase in $V_{DR}$ is not helpful because of the limited depletion region width of the junctions, in addition to the energy consumption and bandwidth limitation overheads of higher-swing drivers. The $OMA_{TX,outer}$ curve is flat at low bandwidths since the Q-factor cannot be higher than $Q_{Max}$, which is determined by the carrier absorption and intrinsic optical loss mechanisms inside the cavity ($Q_{Max} = 18\mathrm{K}$ is assumed here [65]).
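The shape of this trade-off can be reproduced with a rough numerical sketch. Several things here are assumptions rather than the model actually used for Fig. 3.4a: the optical bandwidth is taken as $f_0/Q$ (capped at $Q_{Max}$), the operating wavelength is 1280 nm, the laser wavelength is found by a simple grid search, and all function names are hypothetical.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def thru_transmission(lam_nm, res_nm, linewidth_nm, extinction):
    """Lorentzian thru-port transmission, Equation (3.2)."""
    detune = (lam_nm - res_nm) / linewidth_nm
    return 1.0 - extinction / (1.0 + 4.0 * detune ** 2)


def q_for_bandwidth(bandwidth_ghz, lam0_nm=1280.0, q_max=18_000.0):
    """Assumed relation: optical bandwidth ~ f0 / Q, with Q capped at Q_Max."""
    f0_ghz = C_LIGHT / (lam0_nm * 1e-9) / 1e9
    return min(q_max, f0_ghz / bandwidth_ghz)


def outer_oma(q_factor, v_dr, lam0_nm=1280.0, shift_pm_per_v=20.0,
              extinction=0.9, steps=4001):
    """Largest |P_High - P_Low| over laser wavelength for a given Q and swing."""
    linewidth = lam0_nm / q_factor
    shift = shift_pm_per_v * 1e-3 * v_dr  # resonance shift in nm
    lo = lam0_nm - 3.0 * linewidth
    hi = lam0_nm + shift + 3.0 * linewidth
    best = 0.0
    for i in range(steps):
        lam = lo + (hi - lo) * i / (steps - 1)
        t_unshifted = thru_transmission(lam, lam0_nm, linewidth, extinction)
        t_shifted = thru_transmission(lam, lam0_nm + shift, linewidth, extinction)
        best = max(best, abs(t_shifted - t_unshifted))
    return best


oma_20ghz = outer_oma(q_for_bandwidth(20.0), v_dr=1.5)
oma_40ghz = outer_oma(q_for_bandwidth(40.0), v_dr=1.5)
oma_40ghz_4v = outer_oma(q_for_bandwidth(40.0), v_dr=4.0)
```

Halving the required bandwidth (as PAM-4 does relative to NRZ at the same data rate) raises the usable Q and therefore the achievable outer OMA, and a larger driver swing raises it further.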

The OMA at the receiver is denoted by $P_{RX}$. It follows from the total optical path loss $\Gamma_L$, which includes the 3 couplers and the waveguide/fiber losses, as $P_{RX} = P_L \cdot \Gamma_L \cdot OMA_{TX,outer}$. In the receiver front-end (Fig. 3.2c), optical power is converted into the photocurrent $(I_{Ph} = R \cdot P_{RX})$, where $R$ represents the responsivity of the photodiode (PD) with a parasitic capacitance of $C_p$. This current is amplified either by a trans-impedance amplifier (TIA) [82] or by an integrator [89]. The conversion gain, 3 dB bandwidth (BW), and input-referred current noise $(\sigma_n)$ can be estimated for the TIA-based front-ends as follows:

$$Gain = \frac{V_S}{I_{Ph}} = R_f, \quad BW = \frac{1 + A_V}{2\pi C_p R_f}, \quad \sigma_n = \frac{1}{Gain} \sqrt{\frac{4kT(1 + A_V)}{2\pi C_p} + \overline{\sigma_{v,Amp}^2} + \overline{\sigma_{v,Dig}^2}} \tag{3.3}$$

where $R_f$ is the feedback resistance, $A_V$ is the amplifier's open loop voltage gain, $k$ denotes the Boltzmann constant, and $T$ is the temperature. Here we assume the input pole with $(C_{in} \approx C_p)$ is dominant, and $\sigma_{v,Amp}$ and $\sigma_{v,Dig}$ denote the voltage noise of the amplifier and samplers, respectively.

Figure 3.4: (a) Normalized achievable $OMA_{TX,outer}$ of a ring modulator versus the required bandwidth and (b) the ratio of PAM-4 $OMA_{TX,outer}$ over NRZ. The energy-efficiency improvement at any data rate is determined by the difference between this curve and the PAM-4 receivers' energy penalty (red dashed lines).

For the integrating front-ends, the gain depends on the integration period, normally a fraction $(k_{int} < 1)$ of the symbol time $(T_{Sym})$:

$$Gain = \frac{k_{int}T_{Sym}}{C_{int} + C_p}, \quad BW = \frac{1}{2\pi(C_p||C_{int})R_{ON}}, \quad \sigma_n = \frac{1}{Gain}\sqrt{\frac{kT}{(C_p||C_{int})} + \overline{\sigma_{v,Dig}^2}}, \tag{3.4}$$

where $R_{ON}$ is the ON resistance of the integration switch and $C_{int}$ is the integration capacitance. Finally, the resulting voltage, $V_S$, is converted back into the digital domain via a set of samplers (Fig. 3.2d). The samplers require a minimum voltage swing, $V_{S,Min}$, in order to resolve the input analog voltage into digital symbols. The receiver therefore requires a minimum OMA level, $OMA_{RX,outer}$, at each data rate in order to satisfy the samplers' sensitivity and signal-to-noise ratio (SNR) requirements and, consequently, to achieve the target bit error rate (BER). $OMA_{RX,outer}$ also depends on timing jitter, which is ignored in this first-order analysis under ISI-free conditions.

#### 3.2.2 Energy-efficiency Comparison of PAM-4 and NRZ

Total energy-efficiency (per bit) can be expressed as $E_{tot} = E_L + E_{TX} + E_{RX}$, where the terms correspond to the energy per bit of the laser, the transmitter, and the receiver, respectively. In monolithic ring-resonator based transmitters, $E_{TX}$ is strictly dominated by the serializer and clocking blocks, since the driver required to drive a microring's capacitance is relatively small. Therefore, we assume $E_{TX}$ is a fixed technology-dependent parameter, which can be optimized independently. Considering the high optical losses and inefficient laser sources in today's optical links, $E_L$ dominates the total link energy budget. Furthermore, improving $E_L$ reduces laser integration and packaging cost, in addition to improving the form factor. The laser energy per bit at data rate $DR$ can be obtained from the following:

$$E_L = \frac{OMA_{RX,outer}}{\eta_L \cdot \Gamma_L \cdot OMA_{TX,outer} \cdot DR} \tag{3.5}$$

According to Fig. 3.4a, PAM-4 transmitters provide a larger $OMA_{TX,outer}$ than NRZ at the same data rate, since they require half the bandwidth. However, PAM-4 receivers require a larger $OMA_{RX,outer}$ due to the $3\times$ reduction in symbol level separation. In order to estimate the penalty in $OMA_{RX,outer}$, we consider a case where PAM-4 and NRZ receivers both have the same sampler sensitivity at a fixed data rate. This is illustrated in Fig. 3.2d, where the 3 samplers in the PAM-4 case and the 2 samplers in the NRZ case operate at the same sampling rate. Equations 3.3 and 3.4 imply that PAM-4 receivers have 2 times larger Gain and therefore half the noise level $\sigma_n$ (assuming constant $\sigma_{v,Amp}$ and $\sigma_{v,Dig}$). This can be realized by doubling $R_f$ in the TIA-based front-end or, equivalently, by doubling $T_{Sym}$ in the integrating front-end. Combining this with the $3\times$ reduction in the distance between nearest symbol levels, $OMA_{RX,outer}$ must be $3/2 = 1.5\times$ larger for a PAM-4 receiver to satisfy the sampler's sensitivity requirement. In practice this penalty is smaller once the amplifiers' bandwidth limitations [90] and the adjustment of the integration capacitances are taken into account.

Hence, if the PAM-4 $OMA_{TX,outer}$ is larger than that of NRZ by more than $1.5\times$ ($\approx 1.76\,\mathrm{dB}$) at any data rate, the optical energy-efficiency of the link is improved by using PAM-4 modulation. Figure 3.4b shows the ratio of $OMA_{TX,outer}$ for PAM-4 over NRZ, along with the 1.76 dB penalty of the PAM-4 receivers' OMA requirement (black dashed line), versus data rate for two different $V_{DR}$ values. For data rates above the crossover points, PAM-4 links require less optical energy. As the laser energy improves, we may reach a point where the receiver energy becomes comparable with the optical energy $(E_{RX} \approx E_L)$, suggesting that the PAM-4 $E_{RX}$ overhead due to the larger number of samplers should be taken into account as well. The upper bound for this dynamic energy overhead is 50%, as the receivers operate at the same data rate and clock frequency and the ratio of the number of samplers is 3 to 2 (Fig. 3.2d). Dynamic energy is about half of the energy consumption of optical receivers [82, 89], so the PAM-4 $E_{RX}$ overhead is at most 25% ($\approx 1\,\mathrm{dB}$). Notice that here we ignored all the static energy savings of PAM-4 receivers due to the relaxed bandwidth requirement. In addition, the energy-efficiency of NRZ receivers degrades more rapidly, since the samplers' noise starts to dominate because of the additional time interleaving required and the lower available Gain at high data rates [91, 92]. The red dashed line in Fig. 3.4b shows the resulting PAM-4 receiver energy overhead, obtained by adding the $E_{RX}$ overhead upper bound to the OMA requirement penalty (black dashed line). In practice, PAM-4 starts improving the total energy-efficiency for data rates higher than a point between the crossovers of the OMA improvement curve with these two lines, depending on the actual optimized PAM-4 receivers' energy overhead. A similar analysis can be applied to other silicon photonics technologies and to higher-order PAM modulations such as PAM-8/16.
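The arithmetic behind the two dashed lines, together with Equation (3.5), can be written out as a short sketch. The function names and the example link numbers passed to `laser_energy_per_bit` are hypothetical and only illustrate the units; they are not measured values from this work.

```python
import math


def ratio_to_db(ratio):
    """Convert an optical power ratio to decibels."""
    return 10.0 * math.log10(ratio)


def laser_energy_per_bit(oma_rx_w, eta_laser, path_loss, oma_tx, data_rate_bps):
    """Equation (3.5): wall-plug laser energy per bit in joules."""
    return oma_rx_w / (eta_laser * path_loss * oma_tx * data_rate_bps)


# OMA requirement penalty: 3x smaller level spacing, 2x larger front-end gain.
oma_penalty = 3.0 / 2.0
oma_penalty_db = ratio_to_db(oma_penalty)

# Receiver energy overhead bound: 3 samplers instead of 2 (50% more dynamic
# energy), with dynamic energy taken as half of the receiver energy.
rx_overhead = (3.0 / 2.0 - 1.0) * 0.5
rx_overhead_db = ratio_to_db(1.0 + rx_overhead)


def pam4_saves_laser_energy(oma_tx_pam4, oma_tx_nrz):
    """True when the PAM-4 OMA gain exceeds the receiver OMA penalty."""
    return oma_tx_pam4 / oma_tx_nrz > oma_penalty


# Hypothetical example: 50 uW receiver OMA, 10% laser efficiency,
# 10 dB path loss, normalized transmitter OMA of 0.5, 40 Gb/s.
example_energy = laser_energy_per_bit(50e-6, 0.1, 0.1, 0.5, 40e9)
```

For these assumed inputs the laser contributes 0.25 pJ/b, and the two penalties evaluate to about 1.76 dB and 0.97 dB, matching the values quoted above.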

For instance, assuming PAM-8 modulation, Fig. 3.5 shows the ratio of $OMA_{TX,outer}$ for PAM-8 over NRZ (assuming the same ring parameters). This analysis indicates that switching to PAM-8 cannot improve the energy-efficiency of the optical link for this modulator. However, since we have assumed the worst-case scenario for the RX energy penalty, a more accurate analytical RX energy model is needed to correctly evaluate the performance of a PAM-8 optical link. Such a model depends on the architecture of the receiver circuit.



Figure 3.5: The ratio of PAM-8  $OMA_{TX,outer}$  over NRZ. The energy-efficiency improvement at any data rate is determined by the difference of this curve and the PAM-8 receivers' energy penalty (red dashed lines).

### 3.3 Ring-resonator based Optical DAC

In this section we present a multi-level electro-optic modulator, based on the ODAC idea using a segmented ring-resonator design. First, we characterize the ODAC's transmission and linearity and later describe the implementation details of this device in our zero-change platform.

#### 3.3.1 Optical DAC (ODAC)

A microring resonator based optical DAC can be realized by segmenting the p-n junction phase shifter along the ring and driving each of the segments separately. In doing so, the density of carriers in the cavity can be controlled by depleting a certain number of segments, consequently producing a corresponding resonance shift and an optical thru-port intensity change (Fig. 3.6). For interleaved junctions, the phase shifter is already segmented and we only need to drive each junction segment independently.

Figure 3.6: Segmented ring-resonator optical DAC concept.

In order to compare the linearity of this approach with that of the conventional electrical DAC driven microring modulator, we derive the thru-port transmission as a function of the number of depleted junctions and the applied voltage. First, we derive the resonance wavelength shift $(\Delta \lambda_{Shift})$ and later substitute it into the Lorentzian characteristic of the ring resonator (3.2). The shift $\Delta \lambda_{Shift}$ caused by a perturbation in the effective refractive index $(\Delta n_{eff})$ can be directly obtained from (3.1) as:

$$\Delta \lambda_{Shift} = (L/m) \cdot \Delta n_{eff}, \quad \Delta n_{eff} = \gamma \cdot (k_e \cdot \Delta N_e + k_h \cdot \Delta N_h). \tag{3.6}$$

where $\Delta n_{eff}$ is a linear function of the electron/hole density change $(\Delta N_e/\Delta N_h)$ [36], $\gamma$ expresses the effective interaction of the optical mode with the depletion regions, and $k_e/k_h$ are material- and optical-band-dependent coefficients. Since carrier injection modulators are slow due to the long carrier lifetime in the p-n junctions along the cavity [72], only depletion-mode (reverse bias) operation is considered. Assuming an abrupt p-n junction model, the width of the n-side depletion region of a single junction can be derived in terms of the n- and p-side doping concentrations ($N_D$ and $N_A$), the junction's built-in voltage $(V_b)$, the dielectric permittivity of silicon $(\epsilon_{Si})$, and the elementary charge $(q)$ as follows:

$$x_e(V) = \frac{1}{N_D} \sqrt{\frac{2\epsilon_{Si}(V + V_b)}{q} \frac{N_A N_D}{N_A + N_D}} \tag{3.7}$$

where $V$ is the applied reverse bias voltage of the segments. Hence, the depletion region width is a function of the applied voltage, and we denote its change by $\Delta x_e(V) = x_e(V) - x_e(0)$. By summing over all $M$ depleted segments, the total change in electron/hole density per $\mathrm{cm}^3$ can be obtained from:

$$\Delta N_e(V) = \frac{M \times N_D \Delta x_e(V)}{L} = \frac{M \times N_D}{L} \frac{x_e(0)}{\sqrt{V_b}} (\sqrt{V + V_b} - \sqrt{V_b}) \tag{3.8}$$

Since  $N_D x_e(V) = N_A x_h(V)$  holds, we suppose  $\Delta N_h(V) = \Delta N_e(V)$ . Finally, we can derive  $\Delta \lambda_{Shift}$  as a function of M and V from (3.6) and (3.8), and rewrite (3.2) in terms of the  $\Delta \lambda_{Shift}(M, V)$  as:

$$\Delta \lambda_{Shift}(M, V) = M(\sqrt{V + V_b} - \sqrt{V_b}) \times \left[ \gamma \frac{k_e + k_h}{m} \sqrt{\frac{2\epsilon_{Si}}{q} \frac{N_A N_D}{N_A + N_D}} \right] \tag{3.9}$$

$$\alpha_{M,V}(\lambda) = 1 - \frac{A}{1 + 4\left(\frac{\lambda - \lambda_0 - \Delta \lambda_{Shift}(M,V)}{\Delta \lambda}\right)^2} \tag{3.10}$$



Figure 3.7: Linearity comparison of the proposed optical DAC versus an ideal electrical DAC driven microring modulator for driver's voltage swings of 1.5 V (a) and 4 V (b).

Figure 3.7 depicts the normalized transmission function, $\alpha_{M,V}(\lambda)$, for an ODAC and for an ideal electrical DAC driven ring modulator. $\lambda$ is set to the wavelength that maximizes $OMA_{TX}$ for a microring with $\lambda_0 = 1280\,\mathrm{nm}$, Q = 7.5K, A = 0.9, $V_b = 0.5\,\mathrm{V}$, 16 junction segments, and a $20\,\mathrm{pm}$ resonance shift at 1 V applied voltage. Two different driving voltage capabilities (1.5 V and 4 V) are considered to study the effect of the maximum resonance shift on the linearity of both methods. The ODAC shows a slight improvement for the smaller resonance shift range (Fig. 3.7a) and a larger improvement for larger shifts (Fig. 3.7b). Notice that here we assumed the electrical DAC is ideal, whereas designing a linear electrical DAC operating at $20\,\mathrm{GS/s}$ and above is not trivial and incurs extra area and energy overhead. Our analysis also showed that the effect on ODAC linearity of the optical loss variations inside the cavity, which arise from the different densities of depleted carriers, is negligible in a PAM-4 transmitter design. The direct digital drive of the ODAC also lends itself to efficient pre-emphasis equalization, which can help mitigate the bandwidth limitation of the ring resonators. Pre-distortion of the ODAC non-linearity is critical in some applications, such as high-resolution optical arbitrary waveform generators.
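The comparison can be sketched numerically from Equations (3.9) and (3.10). In the sketch below the bracketed technology constant of (3.9) is not computed from doping values; instead it is calibrated (an assumption) so that all 16 segments at 1 V give the stated 20 pm shift. The function names and the uniform 16-step code for the electrical DAC are also illustrative choices.

```python
import math

LAMBDA0_NM = 1280.0     # resonance wavelength
Q_FACTOR = 7500.0
EXTINCTION = 0.9        # A in Equation (3.2)
V_BUILT_IN = 0.5        # V_b in volts
N_SEGMENTS = 16
SHIFT_AT_1V_NM = 0.020  # 20 pm with all segments depleted at 1 V


def depletion_term(v):
    """Voltage dependence of Equation (3.9): sqrt(V + V_b) - sqrt(V_b)."""
    return math.sqrt(v + V_BUILT_IN) - math.sqrt(V_BUILT_IN)


# Calibrated stand-in for the bracketed constant in Equation (3.9).
K_NM = SHIFT_AT_1V_NM / (N_SEGMENTS * depletion_term(1.0))


def resonance_shift_nm(m, v):
    """Resonance shift for m depleted segments at reverse bias v."""
    return K_NM * m * depletion_term(v)


def thru_transmission(laser_nm, m, v):
    """Equation (3.10) evaluated at a given laser wavelength."""
    linewidth = LAMBDA0_NM / Q_FACTOR
    detune = (laser_nm - LAMBDA0_NM - resonance_shift_nm(m, v)) / linewidth
    return 1.0 - EXTINCTION / (1.0 + 4.0 * detune ** 2)


def odac_shifts(v_dr):
    """ODAC: full swing on m = 0..16 segments."""
    return [resonance_shift_nm(m, v_dr) for m in range(N_SEGMENTS + 1)]


def edac_shifts(v_dr):
    """Ideal electrical DAC: all segments driven, voltage stepped uniformly."""
    return [resonance_shift_nm(N_SEGMENTS, v_dr * code / N_SEGMENTS)
            for code in range(N_SEGMENTS + 1)]


odac = odac_shifts(4.0)
edac = edac_shifts(4.0)
odac_steps = [b - a for a, b in zip(odac, odac[1:])]
```

The ODAC shift grows in equal steps with the number of depleted segments, while the electrical DAC follows the square-root voltage dependence, so its mid-code shift overshoots the straight line between the same end points. The remaining non-linearity in the ODAC transmission comes only from the Lorentzian shape.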

Linearity of the microring's characteristics also depends on the Q-factor of the resonator. As expected, a higher Q-factor leads to a more non-linear Lorentzian shape. To show this effect, Fig. 3.8 presents the ODAC transmission for two different Q-factor values of 7.5 k and 15 k. First, notice that a higher Q-factor improves $OMA_{TX}$ for a fixed resonance shift capability (assuming $20\,\mathrm{pm/V}$). Second, the transmission characteristic of the ODAC for the higher-Q modulator is slightly more non-linear, although it is still linear enough for PAM-4 modulation due to the relatively small resonance shift.



Figure 3.8: Linearity comparison of the proposed optical DAC versus an ideal electrical DAC driven microring modulator for Q-factors of 7.5 k (a) and 15 k (b).

#### 3.3.2 Segmented Ring-resonator ODAC in zero-change 45nm SOI platform

The ODAC is built using interleaved lateral p-n "T-junctions" with a cavity diameter of 10 µm in the zero-change 45 nm SOI platform [73] (Fig. 3.9). Although more segments increase the resolution of the ODAC, the number of segments is limited by the minimum allowed doping region width set by the technology design rules. In the current design, we chose to place 64 segments (32 anodes and 32 cathodes) along the ring, while 128 is the upper limit with the current junction shapes. The 64 segments provide more than enough resolution to linearize the PAM-4 levels and to characterize the ODAC functionality.

The ring cavity is wider than the single-mode width to allow electrical contacts to be placed at the inner-radius edge of the junctions, minimizing the optical loss of the fundamental mode. All the cathode segments are connected together via a spoked-ring shaped metal contact in the center of the ring, while each anode segment has its own contact pin. The spoked-shape contact prevents extra optical loss due to the proximity of the metal and contacts to the inner radius of the ring waveguide [65]. Although the higher-order modes are suppressed in Q-factor by scattering from these contacts and by bending loss, their Q remains high enough to leave an undesirable spectral signature. Hence, excitation of only the fundamental optical mode, and suppression of the higher-order modes, is further accomplished by a suitably designed bent coupler. A propagation-constant-matched, curved bus-to-resonator coupler with a long interaction length has a small k-space spread of the perturbation and does not excite the higher-order, low-Q resonances. The 5 µm outer ring radius is larger than the minimum permitted by bending loss, in order to accommodate an efficient coupler design and a high enough intrinsic Q.

A resistive c-Si based heater and a weakly-coupled drop-port with a straight PD are also added to this structure for the closed-loop thermal tuning of the ring described in Section 3.4.3. The heater is built in c-Si with a ring shape in order to minimize the thermal impedance between the heater and the ring cavity. The heater structure is also salicided to reduce its resistance. We used wide, low-resistance upper metal levels to connect the heater to its driver, while the p-n junctions are driven through narrow lower metal levels to reduce the wiring parasitic capacitance.

### 3.4 PAM-4 Optical Transmitter Building Blocks

The ODAC based PAM-4 transmitter electronics consist of a fully digital data-path, a digital phase-locked loop (DPLL), and a thermal tuning feedback loop. This section explains the design details of each of these blocks.

#### 3.4.1 Transmitter Data-path

Figure 3.10 shows the block diagram of the transmitter data-path. PAM-4 symbols are generated from two separate PRBS-31 modules in the digital backend. Setting the same seed for both PRBS blocks enables an NRZ modulation mode in addition to PAM-4, for both the pattern and PRBS modes. Next, every 4 generated PAM-4 symbols are serialized and encoded through the data-path. The 2-to-1 serializers are implemented using a 2:1 MUX and a latch on the even path. While the ODAC design has 32 anode junctions, here we drive every two anodes together. The segments are activated in a thermometer manner to achieve better linearity, electrical bandwidth, and energy-efficiency by minimizing the wire-to-wire parasitic capacitance of the segment control wires. This segment partitioning with thermometer coding leads to a 4-bit binary ODAC. The mapping is done by grouping the first two segments together, so that the 16 states of the driver can be matched to the modulator conditions from all to none of the p-n junctions depleted. Although PAM-4 symbols consist of 2 bits, the extra 2 bits of the ODAC provide flexibility for linearization of the Lorentzian behavior discussed in Section 3.3.1. These extra bits give more flexibility to adjust level mismatch, but they come with extra wiring parasitics that degrade the serialization and driver energy-efficiency. In fact, the linearity analysis in Section 3.3.1 shows that with only two segments we should be able to produce PAM-4 levels with equal spacing between them. However, the current design (with 16 segments) is implemented to fully characterize the ODAC for other applications as well.

Figure 3.9: 3D layout of a segmented ring-resonator based ODAC in the zero-change $45\,\mathrm{nm}$ SOI platform.

MUX-based encoders convert the 2 b PAM-4 symbols to 16 b thermometer codes mapped by a programmable look-up table (LUT). The LUT size is only $4 \times 4\,\mathrm{b} = 16\,\mathrm{b}$, and it is implemented with scan-flop cells. The LUT values are programmable through an off-chip control FPGA used in the experimental setup. Placing the encoders ahead of the final serializers relaxes the timing constraints, at the price of 16 times more final-stage 2-to-1 serializers. For higher baud rates (>25 GS/s), the LUT has to be moved into the digital backend (before the first stage of 2-to-1 serialization), with a full set of $16 \times$ 2-to-1 serializers instantiated for the first serialization stage. Alternatively, custom-designed digital sub-blocks can be used instead of standard cells to speed up encoding and serialization at higher data rates.
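The encoding path can be illustrated with a small behavioral sketch. It rests on one reading of the text, stated here as an assumption: the 4 b LUT output selects one of 16 states on 16 drive lines, with the first two lines always switching together, so the states cover 0, 2, 3, ..., 16 active lines. The LUT contents in `DEFAULT_LUT` and the function names are hypothetical.

```python
N_DRIVE_LINES = 16

# Hypothetical LUT contents: one 4-bit ODAC code per 2-bit PAM-4 symbol.
DEFAULT_LUT = [0, 5, 10, 15]


def thermometer_code(code):
    """Map a 4-bit code to 16 drive bits, with the first two lines grouped.

    Assumed mapping: code 0 leaves every line off; code k >= 1 turns on the
    grouped pair plus k - 1 further lines, so code 15 turns on all 16.
    """
    if not 0 <= code <= 15:
        raise ValueError("code must be a 4-bit value")
    active = 0 if code == 0 else code + 1
    return [1] * active + [0] * (N_DRIVE_LINES - active)


def encode_symbol(symbol, lut=DEFAULT_LUT):
    """Look up the ODAC code for a PAM-4 symbol and expand it to drive bits."""
    return thermometer_code(lut[symbol])


drive_bits = [encode_symbol(s) for s in range(4)]
```

Reprogramming the four LUT entries moves the two inner PAM-4 levels without touching the serializers, which is how the extra ODAC resolution can be used for level linearization.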



Figure 3.10: Block-diagram of the transmitter's data-path.

#### 3.4.2 Digital PLL (DPLL)

The transmitter's target symbol rate was 20 GS/s, requiring a 10 GHz clock source since it operates in a double-data-rate fashion. The clocking blocks are illustrated in Fig. 3.11. The DPLL generates a differential 20 GHz clock as well as divided clocks (10 GHz and 5 GHz) from the PLL clock divider chain, any of which can be selected as the transmitter's reference clock. The DPLL comprises an LC digitally controlled oscillator (DCO), a CML/CMOS clock divider chain, and a fully digital loop filter [93]. The digital loop consists of a bang-bang phase detector (BB-PD), a frequency detector, a digital filter, and a $\Sigma\Delta$ modulator. $\Sigma\Delta$ modulation is implemented to improve the effective resolution of the DCO and to reduce the contribution of quantization noise to the jitter. The LC-DCO is custom designed with a tunable capacitive DAC (CDAC). The tuning range of the DCO frequency is from 16 GHz to 22 GHz, with the reference clock frequency equal to 1/32 of the output clock frequency ($f_{ref}$: 500 MHz - 656 MHz). The CML output of the oscillator is divided using a CML-latch based clock divider, followed by a CML-to-CMOS converter and four standard-cell based flip-flops in the CMOS domain. Overall, the clock divider chain divides the LC-DCO output down by 32, to be compared with the reference clock provided from an off-chip source.

Since the substrate will be removed, we adjusted the inductor design to achieve the target clock frequency. The inductor has an inductance of $580\,\mathrm{pH}$ and the LC tank has a quality factor of 10. The CDAC consists of 17 LSB capacitor units and 31 MSB capacitor units, both of which are thermometer coded. The circuit implementation of a CDAC unit cell is presented in Fig. 3.12. This symmetric design provides a sufficient On/Off capacitance ratio for the target frequency range. Capacitor C is implemented using VN back-end-of-line metal capacitors to reduce parasitics and voltage dependency. The output of the $\Sigma\Delta$ modulator drives one of the LSB units. The sizes of the switch transistors are optimized to achieve the highest quality factor while maintaining a reasonable tuning range. The DPLL has dimensions of 250 µm by 85 µm.
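As a rough sanity check on these numbers, the total tank capacitance implied by the 580 pH inductor can be estimated from the ideal resonance condition $f = 1/(2\pi\sqrt{LC})$. This is a first-order sketch that lumps the CDAC, device, and wiring capacitances into a single value and ignores losses; the function name is hypothetical.

```python
import math

L_TANK_H = 580e-12  # tank inductance from the text


def tank_capacitance(freq_hz, inductance_h=L_TANK_H):
    """Total capacitance that resonates with the inductor at freq_hz."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * inductance_h)


c_at_20ghz = tank_capacitance(20e9)
c_at_16ghz = tank_capacitance(16e9)
c_at_22ghz = tank_capacitance(22e9)
capacitance_ratio = c_at_16ghz / c_at_22ghz
```

Under these assumptions the tank needs about 109 fF in total at 20 GHz, and covering 16 GHz to 22 GHz requires the total capacitance to change by a factor of $(22/16)^2 \approx 1.89$, which is the ratio the CDAC On/Off switching has to provide on top of the fixed parasitics.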



Figure 3.11: 20 GHz Digital PLL's block-diagram.

#### 3.4.3 Thermal Tuning

Figure 3.12: Circuit diagram of a CDAC unit cell (a total of 48 unit cells are connected to the LC-DCO differential output nodes).

Despite the multiple advantages of microring modulators, no microring-based optical transceiver is commercially available, due to their sensitivity to thermal and process variations [94, 12]. A microring's resonance wavelength can be greatly affected by process variations and temperature. Thermo-optical resonance shifts are estimated to be as large as $-10\,\mathrm{GHz/K}$ [12]. Figure 3.13 depicts this effect, where solid and dashed lines represent the ring positions corresponding to each PAM-4 symbol at temperatures $T_0$ and $T_0 + \Delta T$, respectively. The resonance wavelength is chosen on the right side of the laser wavelength, where it is thermally stable [5], such that $OMA_{TX,outer}$ is maximized at temperature $T_0$. We note that this optimum distance between the laser and the ring's resonance wavelength depends on the Q-factor and the total resonance shift capability of the modulator. If the temperature increases by $\Delta T$, this distance changes due to the increase in the resonance wavelength of the ring modulator. This resonance shift causes unbalanced PAM-4 levels in addition to degrading $OMA_{TX,outer}$.

In order to solve this issue, we have previously demonstrated a thermal tuner capable of optimizing and locking the ring position in NRZ transceivers regardless of the data encoding [5]. In this work, we adopt the same scheme and show that this approach can be extended to higher-order PAM transmitters. The block diagram of the proposed thermal tuner for PAM-4 is shown in Fig. 3.14. The underlying procedure is as follows: a weakly coupled drop-port senses a small fraction ($\approx 1\%$) of the optical power inside the ring cavity. Next, this optical power is converted to the equivalent photocurrents (denoted by $i_{0-3}$). The current is integrated on a capacitor of size $C$ over an interval of $N$ symbols. The resultant voltage is digitized via a 6 b analog-to-digital converter (ADC) and fed into the controller as the sensing input of the feedback loop. Let $N_n$ denote the number of "n" symbols transmitted during the integration period, so that $N_0 + N_1 + N_2 + N_3 = N$. The output voltage of the integrator is then:

$$v_N = \frac{1}{C} \int_0^{NT_{Sym}} i(t)dt = \frac{T_{Sym}}{C} \left( \sum_{n=0}^3 N_n \cdot i_n \right) \tag{3.11}$$

where i(t) is the photocurrent and  $T_{Sym}$  is the symbol-time set by the baud-rate. We can write intermediate levels ( $i_1$  and  $i_2$ ) in terms of the lowest and highest optical levels ( $i_0$  and  $i_3$ ):

$$\begin{aligned}
i_1 &= i_0 + \Delta i + \delta_1 \\
i_2 &= i_3 - \Delta i - \delta_2 \\
\Delta i &= \frac{i_3 - i_0}{3}
\end{aligned} \tag{3.12}$$



Figure 3.13: Thermal sensitivity of the resonance wavelength and its effects on a PAM-4 transmit eye. PAM-4 levels are uncoded while any coding can be applied through the LUT.

where $\Delta i$ is the ideal distance between adjacent levels, and $\delta_1, \delta_2$ are the errors in the actual distances of the first and third level spacings due to the nonlinear ring characteristic. Assuming the level mismatch around the optimum point is negligible $(\delta_1, \delta_2 \ll \Delta i)$, we can reformulate Equation (3.11) by rewriting the intermediate levels according to Equations (3.12) as:

$$v_N = \frac{T_{Sym}}{3C} \left( (3N_3 + 2N_2 + N_1) \cdot i_3 + (N_2 + 2N_1 + 3N_0) \cdot i_0 \right) \tag{3.13}$$
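For reference, the substitution behind (3.13) can be written out explicitly. Setting $\delta_1 = \delta_2 = 0$ in (3.12) gives $i_1 = (2i_0 + i_3)/3$ and $i_2 = (i_0 + 2i_3)/3$, so the sum in (3.11) regroups as:

$$\sum_{n=0}^{3} N_n \cdot i_n = N_0 i_0 + N_1 \frac{2 i_0 + i_3}{3} + N_2 \frac{i_0 + 2 i_3}{3} + N_3 i_3 = \frac{(3N_3 + 2N_2 + N_1) \cdot i_3 + (N_2 + 2N_1 + 3N_0) \cdot i_0}{3}$$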

The coefficients of  $i_3$  and  $i_0$  in Equation (3.13) are equivalent to the binary summation over all the transmitted symbols  $(S_i)$  and inverted symbols  $(\overline{S_i})$ , respectively. Thus we can rewrite (3.13) as:

$$v_N = \frac{T_{Sym}}{3C} \left( \sum_{i=1}^N S_i \cdot i_3 + \sum_{i=1}^N \overline{S_i} \cdot i_0 \right) \tag{3.14}$$

Figure 3.14: PAM-4 transmitter's thermal tuning feedback loop block-diagram.

Now assume we perform this integration over two consecutive time windows denoted by $t_a$ and $t_b$. $i_0$ and $i_3$ are approximately constant over these intervals, since the thermal time constant $(\tau_t)$ is always much larger than the integration time $(N \cdot T_{Sym})$. Thus, assuming $i_{0,3}(t_a) \approx i_{0,3}(t_b)$, the resultant voltage at the output of the integrator in each window will be:

$$\begin{aligned}
v_{N}(t_{a}) &= \frac{T_{Sym}}{3C} \left( \sum_{i=1}^{N} S_{i}(t_{a}) \cdot i_{3} + \sum_{i=1}^{N} \overline{S_{i}(t_{a})} \cdot i_{0} \right) \\
v_{N}(t_{b}) &= \frac{T_{Sym}}{3C} \left( \sum_{i=1}^{N} S_{i}(t_{b}) \cdot i_{3} + \sum_{i=1}^{N} \overline{S_{i}(t_{b})} \cdot i_{0} \right)
\end{aligned} \tag{3.15}$$

If $\sum_{i=1}^{N} S_i(t_a) \neq \sum_{i=1}^{N} S_i(t_b)$, the two equations above are independent and can be solved for $i_0$ and $i_3$ and for any quantity derived from them, such as their average or difference. Now, suppose we digitize the $v_N$ values via an ADC with conversion gain $G_A$ such that $L_N = v_N \cdot G_A$. Equation (3.15) then becomes:

$$\begin{aligned}
L_{N}(t_{a}) &= \frac{1}{3N} \left( \sum_{i=1}^{N} S_{i}(t_{a}) \cdot L_{3} + \sum_{i=1}^{N} \overline{S_{i}(t_{a})} \cdot L_{0} \right) \\
L_{N}(t_{b}) &= \frac{1}{3N} \left( \sum_{i=1}^{N} S_{i}(t_{b}) \cdot L_{3} + \sum_{i=1}^{N} \overline{S_{i}(t_{b})} \cdot L_{0} \right)
\end{aligned} \tag{3.16}$$

where  $L_0 = (G_A \cdot N \cdot T_{Sym}/C) \cdot i_0$  and  $L_3 = (G_A \cdot N \cdot T_{Sym}/C) \cdot i_3$  are digital representations of the "0" and "3" levels, respectively. Notice  $\sum_{i=1}^N S_i(t) + \sum_{i=1}^N \overline{S_i(t)} = \sum_{i=1}^N 3 = 3N$  holds for every timing window. Let  $L_d = L_3 - L_0$  be the digital representation of the "0" and "3" levels' difference, thus:

$$\begin{aligned}
L_{d} &= \frac{3N \cdot (L_{N}(t_{a}) - L_{N}(t_{b}))}{\sum_{i=1}^{N} \overline{S_{i}}(t_{b}) - \sum_{i=1}^{N} \overline{S_{i}}(t_{a})} \\
L_{3} &= L_{N} + \frac{\sum_{i=1}^{N} \overline{S_{i}}}{3N} \cdot L_{d} \\
L_{0} &= L_{3} - L_{d}
\end{aligned} \tag{3.17}$$

Equations (3.17) show how we can decouple the calculation of  $L_0$  and  $L_3$  in terms of  $L_d$  from the updates to  $L_d$  itself while the thermal tuning loop is running. We can update  $L_d$  only if  $\sum_{i=1}^{N} S_i(t_b) \neq \sum_{i=1}^{N} S_i(t_a)$ . Therefore, for the integration windows where this constraint does not hold, we only update  $L_0$  and  $L_3$  based on the latest value of  $L_d$ . We can show that even without the up-to-date  $L_d$ ,  $L_{0,3}$  still have the same local monotonicity to enable locking to the correct point, as explained in [5]. Thus, during the initialization (Sweep Phase) the loop learns the optimum/correct value of  $L_d$  and updates to  $L_d$  are not necessary during the locking operation (Tracking Phase). We note that this scheme does not rely on any data encoding since the control loop always has sufficient information to lock the ring, regardless of the transmitted symbol patterns. Consequently, we can estimate  $L_{0,3,d}$  constantly during the transmission time. This allows us to use an embedded heater structure inside the ring (discussed in Section 3.3.2) to control the resonance wavelength of the ring such that  $L_{0,3,d}$  are locked to desired values.
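The level recovery in Equations (3.16) and (3.17) can be exercised with a small behavioral sketch. It assumes ideal, evenly spaced levels and a noiseless ADC, and the function names and the example symbol patterns are hypothetical; it is not the controller implemented in this work.

```python
def window_code(symbols, level0, level3):
    """Equation (3.16): normalized ADC code of one integration window."""
    n = len(symbols)
    s_sum = sum(symbols)       # sum of transmitted symbols S_i (0..3)
    inv_sum = 3 * n - s_sum    # sum of inverted symbols (3 - S_i)
    return (s_sum * level3 + inv_sum * level0) / (3 * n)


def estimate_levels(code_a, symbols_a, code_b, symbols_b):
    """Equation (3.17): recover (L_0, L_3, L_d) from two windows.

    Returns None when both windows have the same symbol sum, in which case
    L_d cannot be updated from this pair of windows.
    """
    n = len(symbols_a)
    inv_a = 3 * n - sum(symbols_a)
    inv_b = 3 * n - sum(symbols_b)
    if inv_a == inv_b:
        return None
    l_d = 3 * n * (code_a - code_b) / (inv_b - inv_a)
    l_3 = code_a + inv_a / (3 * n) * l_d
    return l_3 - l_d, l_3, l_d


# Two 128-symbol windows with different symbol statistics (example patterns).
window_a = [0, 1, 2, 3] * 32
window_b = [3, 3, 2, 1] * 32
true_l0, true_l3 = 10.0, 50.0

code_a = window_code(window_a, true_l0, true_l3)
code_b = window_code(window_b, true_l0, true_l3)
estimate = estimate_levels(code_a, window_a, code_b, window_b)
no_update = estimate_levels(code_a, window_a, code_a, window_a)
```

With these patterns the two window codes are 30 and 40, and the estimator returns the original levels; feeding the same window twice yields no update, matching the constraint on the symbol sums.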

The resonance wavelength of the ring is controlled by adjusting the current flowing in the resistive heater. This current determines the power dissipated by the heater, which heats the ring cavity and thereby moves the ring resonance. The heater is driven by a CMOS driver that consists of an NMOS driver head and a digital accumulator. The accumulator generates a pulse-density modulated (PDM) waveform whose average density is set by its digital input $(D_H)$. $D_H$ is set by the loop controller according to the desired heater strength. The PDM waveform drives the gate of the NMOS driver, which turns on and injects current into the heater whenever the waveform is 1. Overall, the heater and its driver act as a DAC that converts the controller's output $(D_H)$ into a change in the temperature of the ring cavity and consequently a shift in the ring's resonance.
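One common way to build such an accumulator-based PDM generator is to add the input word every clock cycle and emit a 1 whenever the accumulator overflows. The sketch below follows that scheme as an assumption; the bit width, the function name, and the overflow convention are illustrative and are not taken from the implemented driver.

```python
def pdm_waveform(d_h, n_bits, n_cycles):
    """Generate a pulse-density modulated bit stream from input word d_h.

    The accumulator adds d_h each cycle, and the output is 1 on overflow.
    For 0 <= d_h < 2**n_bits the long-run density of ones is d_h / 2**n_bits.
    """
    modulus = 1 << n_bits
    if not 0 <= d_h < modulus:
        raise ValueError("d_h must fit in n_bits")
    acc = 0
    bits = []
    for _ in range(n_cycles):
        acc += d_h
        if acc >= modulus:
            acc -= modulus
            bits.append(1)
        else:
            bits.append(0)
    return bits


quarter_density = pdm_waveform(64, n_bits=8, n_cycles=256)
sparse_stream = pdm_waveform(3, n_bits=4, n_cycles=16)
```

Over one full period of $2^{n}$ cycles the number of ones equals the input word exactly, so the average heater power scales linearly with $D_H$ while the pulses stay spread out in time.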

In order to find the optimum locking point that achieves the maximum eye opening  $(L_d)$,  $\lambda_0$  is initially set at a large offset from the laser wavelength  $\lambda_{Laser}$  by choosing a large value of  $P_{HT}$, such that the ring is completely off resonance. During the Sweep Phase, the heater strength  $(P_{HT})$  is swept down while the transmitted data is set to a training sequence that is favorable to the operation of the symbol-statistical tracker and leads to frequent updates of  $L_d$.  $P_{HT}$  is first swept with coarse steps and then with fine steps once  $\lambda_0$  reaches the proximity of  $\lambda_{Laser}$. After this phase, we find the  $P_{HT}$  value that maximizes  $L_d$  and restart the heater to this optimum point. This restart is necessary to return to the optimal lock point in the presence of optical bi-stability, which is caused by the hysteresis of the heater tuning [5]. In the Tracking Phase, the controller configures the tracker for a fixed  $L_d$, and arbitrary data can be sent while the ring is held at the optimum location by tracking and locking the value of either  $L_0$  or  $L_3$.
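The coarse-then-fine sweep can be sketched as follows. This is an illustrative model only: `measure_ld` is a hypothetical callback standing in for the tracker's $L_d$ estimate at a given heater code, the threshold used to detect proximity to the laser wavelength is an assumption, and the toy response `toy_ld` is invented for the example. Hysteresis and bi-stability are not modeled, and restarting the heater at the returned code is left to the caller.

```python
from typing import Callable, Tuple


def sweep_heater(measure_ld: Callable[[int], float], p_max: int,
                 coarse_step: int, fine_step: int,
                 near_threshold: float) -> Tuple[int, float]:
    """Sweep the heater code downward and return (best_code, best_ld).

    The sweep starts with coarse steps and switches to fine steps once the
    measured eye opening exceeds near_threshold (assumed proximity test).
    """
    code = p_max
    best_code, best_ld = code, measure_ld(code)
    step = coarse_step
    while code - step >= 0:
        code -= step
        ld = measure_ld(code)
        if ld > near_threshold:
            step = fine_step            # resonance is close: refine the sweep
        if ld > best_ld:
            best_code, best_ld = code, ld
    return best_code, best_ld


def toy_ld(code: int) -> float:
    """Invented single-peak response, maximal at heater code 300."""
    return 1.0 / (1.0 + ((code - 300) / 40.0) ** 2)


best_code, best_ld = sweep_heater(toy_ld, p_max=1000, coarse_step=32,
                                  fine_step=1, near_threshold=0.05)
print(best_code, best_ld)
```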

Moreover, the digital controller is capable of self-heating cancellation via a fast feed-forward path that adjusts  $P_{HT}$. Self-heating is the effect of the optical power inside the ring on the ring temperature. Large self-heating perturbations occur whenever the average optical power inside the ring changes (due to an unbalanced data pattern), shifting the resonance wavelength with a much faster time constant than ambient temperature variations. To counteract these sudden power changes, the controller offsets the calculated  $P_{HT}$  (which maintains the level tracking) by fixed steps proportional to the density of "3" levels in the transmitted data pattern.
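A minimal sketch of such a feed-forward offset is given below, assuming the offset is simply a fixed step multiplied by the number of "3" symbols in the current window and that the result is clamped to the heater code range. The function name, the step size and the code range are hypothetical. The sign follows the simulation described in this section, where a higher density of "3" symbols lowers the optical power in the ring and the heater power is raised to compensate.

```python
def feedforward_heater_code(p_ht_loop: int, symbols: list[int],
                            step_per_three: int, max_code: int) -> int:
    """Add a feed-forward offset to the loop's heater code.

    p_ht_loop      : heater code computed by the (slow) level-tracking loop
    symbols        : PAM-4 symbols (0..3) sent in the current window
    step_per_three : assumed fixed code step per transmitted "3" symbol
    max_code       : assumed full-scale heater code used for clamping
    """
    threes = sum(1 for s in symbols if s == 3)
    code = p_ht_loop + step_per_three * threes
    return max(0, min(max_code, code))


print(feedforward_heater_code(100, [3, 3, 0, 1], step_per_three=2, max_code=255))
```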

Figure 3.15 shows a MATLAB simulation of the thermal tuning procedure. In this simulation, we integrate over 128 PAM-4 symbols (at 25 GSym/s) in each window, and the controller locks the  $i_3$  value in the Tracking Phase. We assume a heater time constant of 1 µs, and the self-heating cancellation path is activated.  $i_{Diff}$  is the difference between the  $i_3$  and  $i_0$  values calculated by the controller, and all parameters are normalized to the ADC output code. During the *Tracking Phase*, we intentionally forced most of the symbols to be "3" from time  $1.5 \times 10^5$ to  $2.5 \times 10^5$  (time is measured in integration windows, where each window is 128×40 ps or 5.12 ns long) in order to validate the ability of the loop to recover and lock again to the optimal  $i_3$  value. As expected, at the perturbation times  $i_3$  changes abruptly due to the lower average optical power inside the ring and the fast time constant of self-heating. However, the thermal tuning loop locks the ring's resonance back to the target location, restoring the desired level of  $i_3$  shortly ( $\sim 0.5 \,\mathrm{ns}$ ) after each perturbation, thanks to the fast self-heating cancellation path in the controller. The simulated heater output power and ring modulator temperature, assuming 7 µW/GHz heater efficiency, are plotted in Fig. 3.16. Note that increasing the density of "3" symbols during the perturbation intervals lowers the average power inside the ring, which reduces the ring's temperature. The controller therefore increases the embedded heater's power such that the overall ring temperature remains constant regardless of the rapid change in symbol densities.

To summarize, the tuning procedure starts by sweeping the resonance wavelength, which is done by sweeping down the heater strength via a PDM driver. Afterwards, the heater PDM value is set to the optimum value at which the maximum of  $(i_3 - i_0)$  occurred. After this optimization phase, the controller keeps the ring locked to this position by tracking either the "0" or the "3" symbol level. Figure 3.17 shows the PAM-4 level separation mismatch ratio,  $R_{LM}$, and  $OMA_{TX,outer}$  as a function of the spacing between the microring resonance and the laser wavelength.  $R_{LM}$  is defined as the ratio of the minimum PAM-4 eye height to 1/3 of  $OMA_{TX,outer}$. This simulation validates the  $(\delta_1, \delta_2 \ll \Delta i)$  assumption used during the optimization and locking procedures. Note that in the *Sweep Phase* (during initialization), the resonance of the ring is swept down to reach the laser wavelength, and in this region  $R_{LM}$  is larger than 90%, satisfying the  $(\delta_1, \delta_2 \ll \Delta i)$  constraint.
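The $R_{LM}$ definition above translates directly into a few lines of code. The sketch below takes four measured optical levels (in any consistent unit) and returns the ratio of the smallest eye height to one third of the outer OMA; the function name and the example levels are illustrative, not measured data.

```python
def level_mismatch_ratio(levels: list[float]) -> float:
    """R_LM = (minimum PAM-4 eye height) / (outer OMA / 3)."""
    if len(levels) != 4:
        raise ValueError("PAM-4 requires exactly four levels")
    lv = sorted(levels)
    oma_outer = lv[3] - lv[0]
    if oma_outer <= 0:
        raise ValueError("levels must not all be equal")
    eye_heights = [lv[k + 1] - lv[k] for k in range(3)]
    return min(eye_heights) / (oma_outer / 3.0)


# Made-up levels with a slightly compressed top eye.
print(level_mismatch_ratio([0.0, 1.0, 2.1, 3.0]))
```

Perfectly equidistant levels give $R_{LM} = 1$; any compression of one eye lowers the ratio.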

Figure 3.15: Simulated thermal tuning procedure; the optimum heater value that maximizes eye-openings is found (*Sweep Phase*) and resonance is locked to the corresponding wavelength by tracking  $i_3$  level (*Tracking Phase*).

For the circuit implementation, the integration capacitance can be tuned from 50 fF to 150 fF to keep the integrated value within the ADC input range for various optical power levels. Buffers are made out of reset-based inverters. We designed a 6b successive approximation register (SAR) ADC with a differential input to digitize the integration capacitor's voltage. The ADC's CDAC has an LSB value of 1 fF.

The thermal tuning scheme described in this section can also tune the resonance of the ring to compensate for deviations due to process variations. We first note that since we implement our photonic devices in an unmodified or minimally modified CMOS process (Chapter 5), the fabrication is significantly more accurate, with far fewer variations, than in custom photonic processes. Although the resonance wavelength is sensitive to the actual circumference of the cavity, the resulting deviation (normally with  $3\sigma < 1 \,\mathrm{nm}$ ) is within the range that the current thermal tuner design can tolerate. Moreover, since the thermal tuning procedure sweeps the whole tuning range, the initial location of the resonance does not affect the functionality of this method.
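To put the 1 nm figure in the same units as the heater tuning range, a wavelength offset can be converted to a frequency offset with $\Delta f \approx c\,\Delta\lambda/\lambda^2$. The short sketch below does this at 1280 nm; the helper name is an assumption, and the 524 GHz value is the tuning range quoted for the current design in Section 3.6.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def wavelength_offset_to_ghz(delta_lambda_nm: float, center_nm: float) -> float:
    """Small-offset conversion: delta_f = c * delta_lambda / lambda^2."""
    delta_lambda_m = delta_lambda_nm * 1e-9
    center_m = center_nm * 1e-9
    return C_M_PER_S * delta_lambda_m / center_m ** 2 / 1e9


deviation_ghz = wavelength_offset_to_ghz(1.0, 1280.0)
tuning_range_ghz = 524.0  # quoted in Section 3.6
print(round(deviation_ghz, 1), deviation_ghz < tuning_range_ghz)
```

A 1 nm deviation corresponds to roughly 183 GHz at this wavelength, which is well inside the quoted tuning range.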

## 3.5 Complete Transmitter Design

The complete ODAC-based PAM-4 transmitter integrates all the sub-blocks described in the previous section (Fig. 3.18). The transmitter can operate in both NRZ and PAM-4 modes with a selectable baud rate of around 10 GS/s or 20 GS/s. The transmitter runs on the nominal supply of  $0.9\,\mathrm{V}$, except for the drivers, which use  $1.55\,\mathrm{V}$. Although we have not used thick-oxide devices for the drivers, using them can reduce the reliability risks and increase the drivers' voltage swing up to  $2.4\,\mathrm{V}$  in this process. The whole data path is designed using CMOS digital standard cells in the  $45\,\mathrm{nm}$  SOI CMOS process. This makes the design easily portable to advanced nodes without the need to design an analog high-speed electrical DAC. The ODAC operates in the depletion mode with a swing of  $0$  to  $-1.55\,\mathrm{V}$.

Figure 3.16: Simulated heater output power and ring's temperature during the thermal tuning process.

The fabricated test chip accommodates 5 full PAM-4 transmitter test sites so that many variants of ODAC designs can be measured. Each variant uses a different junction shape (i.e. a different wavelength shift per volt) and different bus/drop-port waveguide coupling gaps (i.e. a different Q-factor). All test sites share a single DPLL as a reference clock source via a high-speed CMOS clock distribution network.

## 3.6 Experimental Demonstration

The transmitter is designed and fabricated in a multi-project wafer (MPW) run in the 45 nm SOI CMOS technology along with other electrical designs. Assuming a dedicated DPLL, the transmitter occupies an area of 0.06 mm<sup>2</sup> including clocking blocks. Figure 3.19 shows the micrographs of the test chip, the DPLL, a full transmitter site, and a zoom-in on an ODAC. Sub-blocks of the PLL are labeled in this micrograph, including the inductor, decoupling capacitor array, capacitive DAC, divider, scan chain and digital control logic. The die is flip-chip assembled onto a PCB and undergoes substrate removal afterwards (Fig. 3.20). A tunable laser source is coupled into the backside of the chip via a lensed fiber and unidirectional vertical couplers on the chip with 3 dB loss at 1280 nm wavelength, and the modulated light is fed directly into a 30 GHz optical scope without any optical amplification.

Figure 3.17:  $OMA_{TX,outer}$  and PAM-4 level-mismatch ratio versus the relative distance of the ring's resonance and laser's wavelength.

Figure 3.18: PAM-4 transmitter's full block-diagram.

Figure 3.21a presents the ODAC's static characteristics for the codes used in the PAM-4 transmission measurement. The microring modulator achieved a free spectral range (FSR) of 3.2 THz and a Q-factor of 6.5k. This microring design is over-coupled since the bus waveguide/ring gap was originally designed for the 1180 nm wavelength. The static transmission of the ODAC for different DAC settings (Fig. 3.21b) is measured by capturing the transient waveforms (in pattern mode) shown in Fig. 3.22. This experiment is done by changing the ODAC values through the LUT so that the average optical power inside the ring stays constant, which avoids thermal drifts. Note that we cannot turn on the thermal tuning for this experiment, as it would counteract the slow resonance movements we are trying to measure. This measurement also confirms that an  $R_{LM}$  of about 98% is achievable at the maximum  $OMA_{TX}$  wavelength without any pre-distortion. The ODAC's transmission is also indirectly measured for all 16 codes by running the transmitter in pattern mode with 4 different code settings to measure all possible transmissions at the output (Fig. 3.21b). The estimated differential non-linearity (DNL) and integral non-linearity (INL) for this ODAC are  $0.22\,\mathrm{LSB}$  and  $0.62\,\mathrm{LSB}$, respectively.
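For reference, DNL and INL can be estimated from a list of per-code output levels as sketched below. The text does not state which fitting convention was used, so this example assumes the common end-point definition, in which the ideal LSB is the full-scale span divided by the number of steps. The function name and the sample data are illustrative only.

```python
def dnl_inl_endpoint(outputs: list[float]) -> tuple[float, float]:
    """Return (max |DNL|, max |INL|) in LSB using an end-point fit.

    outputs[k] is the measured output for code k, in any linear unit.
    """
    n = len(outputs)
    if n < 2 or outputs[-1] == outputs[0]:
        raise ValueError("need at least two codes with a non-zero span")
    lsb = (outputs[-1] - outputs[0]) / (n - 1)           # ideal step size
    dnl = [(outputs[k + 1] - outputs[k]) / lsb - 1.0 for k in range(n - 1)]
    inl = [(outputs[k] - (outputs[0] + k * lsb)) / lsb for k in range(n)]
    return max(abs(d) for d in dnl), max(abs(i) for i in inl)


# Made-up 4-code example: code 2 sits half an LSB too high.
print(dnl_inl_endpoint([0.0, 1.0, 2.5, 3.0]))
```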

The measured 10-90% rise/fall time is 20 ps, indicating that the transmitter can potentially run faster than 20 GS/s. The transmitted eye-diagram (captured via a 30 GHz optical scope module) is open with 3 dB ER and 5.5 dB IL at 1285 nm laser wavelength (Fig. 3.23). ER and IL can be improved by critically coupling the ring. In addition, higher-swing drivers can further enhance the performance, as the junctions are not yet fully depleted. The optimal numbers of depleted segments for the four PAM-4 symbols are (0, 5, 10, 15), which yield a balanced PAM-4 transmit eye-diagram without any pre-distortion. The modulator and drivers achieved a 20 Gb/s data rate in the NRZ mode with 155 fJ/b and 40 Gb/s PAM-4 with 42 fJ/b energy-efficiency. Higher energy consumption per symbol for NRZ was expected due to the higher transition probabilities and the different wiring capacitance of the segments. The energy-efficiency of the complete transmitter at 40 Gb/s PAM-4 is 685 fJ/b.

Figure 3.19: Micrographs of the test chip, DPLL, full transmitter, and an ODAC.

The area/energy breakdown is summarized in Fig. 3.24. Both area and energy are dominated by the clocking, which can be amortized by sharing the DPLL among multiple transmitters, since the small transmitter form factor simplifies the clock distribution. For instance, assuming a DPLL shared between 4 transmitter channels, the total area and energy will be 0.045 mm<sup>2</sup> and 400 fJ/b, respectively. Note that the area taken by the photonic components is only 16%, which can be reduced to below 10% by optimizing the floor planning and sharing the thermal tuner digital controller among multiple transmitters (using time multiplexing). Similarly, the energy consumption of the modulator and its driver is only about 10% of the total energy, and the rest is the inevitable power consumption required for any electrical or optical serial link. Thus, choosing the right modulator design and co-optimizing the electronics and photonics enable monolithic photonics on CMOS chips with negligible area/energy overheads, while extending the link reach to 2 km distances and beyond.
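The amortization argument can be written as a simple two-term model, in which the per-channel cost is a private part plus a shared part divided by the number of channels. The sketch below back-solves the two parts from the dedicated-DPLL and 4-channel totals quoted in the text. This split is an inference under the stated linear model and not a reported breakdown (the actual breakdown is the one in Fig. 3.24), and the quoted totals are rounded.

```python
def split_private_shared(total_dedicated: float, total_shared_n: float,
                         n: int) -> tuple[float, float]:
    """Solve private + shared = total_dedicated and
    private + shared / n = total_shared_n for (private, shared)."""
    if n < 2:
        raise ValueError("need at least two channels to separate the terms")
    shared = (total_dedicated - total_shared_n) * n / (n - 1)
    return total_dedicated - shared, shared


def per_channel_cost(private: float, shared: float, n: int) -> float:
    """Per-channel cost when one shared block serves n channels."""
    return private + shared / n


# Totals quoted in the text: dedicated DPLL vs. DPLL shared by 4 channels.
area_private, area_shared = split_private_shared(0.06, 0.045, 4)    # mm^2
energy_private, energy_shared = split_private_shared(685.0, 400.0, 4)  # fJ/b

# Hypothetical projection to 8 channels under the same linear model.
print(per_channel_cost(area_private, area_shared, 8),
      per_channel_cost(energy_private, energy_shared, 8))
```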

Figure 3.20: Test setup and packaging scheme of the test chip.

We also note that the heater energy is not included in the energy breakdown. This energy can be easily calculated from the heater efficiency  $(3.7\,\mu\text{W}/\text{GHz})$  and the tuning range required by a particular application. For instance, assuming a DWDM ring-resonator based link with 200 GHz channel-to-channel spacing, the thermal tuner should be able to shift the resonance of any channel by up to 200 GHz. Hence, the maximum power dissipated by the heater will be around 740  $\mu$W, corresponding to 18.5 fJ/b at 40 Gb/s data-rate. Another example is the case where the resonance wavelength needs to be tuned over the extended industrial temperature range (-5 °C to 85 °C). Assuming a thermal sensitivity of  $10\,\mathrm{GHz/K}$, the heater should be able to sweep the resonance by  $900\,\mathrm{GHz}$, consuming  $3.3\,\mathrm{mW}$  ( $83\,\mathrm{fJ/b}$  at  $40\,\mathrm{Gb/s}$  data-rate).
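The two heater estimates above follow from one multiplication and one division, as the sketch below reproduces. The helper names are illustrative; the input numbers are the ones quoted in the text.

```python
def heater_power_uw(efficiency_uw_per_ghz: float, shift_ghz: float) -> float:
    """Heater power (in microwatts) needed for a given resonance shift."""
    return efficiency_uw_per_ghz * shift_ghz


def energy_per_bit_fj(power_uw: float, data_rate_gbps: float) -> float:
    """1 uW / (1 Gb/s) = 1e-6 W / 1e9 b/s = 1 fJ/b, so the ratio is in fJ/b."""
    return power_uw / data_rate_gbps


# Case 1: DWDM link with 200 GHz channel spacing.
p_dwdm = heater_power_uw(3.7, 200.0)
print(p_dwdm, energy_per_bit_fj(p_dwdm, 40.0))

# Case 2: -5 C to 85 C range at 10 GHz/K thermal sensitivity.
shift_ghz = (85.0 - (-5.0)) * 10.0
p_temp = heater_power_uw(3.7, shift_ghz)
print(shift_ghz, p_temp, energy_per_bit_fj(p_temp, 40.0))
```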

Thermal tuning functionality is verified only up to 20 Gb/s PAM-4 transmission, due to a timing violation in the digital controller at higher data rates. In this stress test, after the tuner locked the ring to the maximum eye-opening point by tracking the level "3" symbol, we turned the other test sites around this transmitter on and off to create "hot spots" on the die with a fixed random pattern (Fig. 3.25a). Since these test sites are the closest circuitry to the transmitter, they have the smallest thermal impedance to the transmitter among all thermal perturbation sources (such as other circuits, the package, etc.). Thus, testing the thermal tuning functionality and loop bandwidth under this stress test demonstrates the robustness of this scheme to other heat sources. Note that the current design's tuning range is about 524 GHz (more than 50 K of temperature change), but it can be extended by driving the resistive heater with voltages higher than the nominal 1 V used in this work.

Transmit eye-diagrams for a fixed bit-stream are captured via an ac-coupled external TIA with 10 GHz bandwidth. The eye is completely closed without thermal tuning, while active thermal tuning keeps the PAM-4 eyes open by adjusting the heater strength according to the injected ambient heat (Fig. 3.25b). In this experiment, the total peak-to-peak temperature change is about  $5\,^{\circ}\mathrm{C}$  (each heater DAC LSB corresponds to  $1\,\mathrm{GHz}$  resonance shift).

Figure 3.21: (a) Optical DAC static measurement at low optical input power to avoid thermal drifts. Note that the resonance is shifted down to 1280 nm because the cavity's optical power, and consequently its temperature, is lower than in the eye-diagram measurement. (b) Normalized optical output for each DAC code compared with an ideal linear DAC (Code 1's output is derived from Codes 0 and 2 transmissions since it is skipped in transmitter's 16b thermometer coding).

## 3.7 Summary

We have demonstrated a DWDM-compatible PAM-4 transmitter based on a digital-to-optical converter design using segmented microring resonators in a commercial 45 nm SOI CMOS process. This device can support even higher-order modulations such as PAM-8/16 as well as coherent modulations like QAM-16, and it can serve as an equalizer for data rates beyond the optical linewidth. Furthermore, it can enable high-speed, moderate-resolution (5-7 bits) optical AWGs for RF-photonics applications. In addition to high energy-efficiency, the complete transmitter occupies only  $0.06\,\mathrm{mm^2}$, achieving a bandwidth density of  $0.67\,\mathrm{Tb/s/mm^2}$  including the PLL, which makes this approach suitable for systems-on-chip such as processors and switches with a large number of I/O links.
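The headline figures can be cross-checked with simple arithmetic, as in the sketch below. The helper names are illustrative; the inputs are the data rate, areas and energy per bit quoted in this chapter (the 0.01 mm<sup>2</sup> photonics and 0.001 mm<sup>2</sup> driver areas are the Table 3.1 entries).

```python
def bandwidth_density_tbps_per_mm2(data_rate_gbps: float, area_mm2: float) -> float:
    """Bandwidth density in Tb/s/mm^2 from a data rate in Gb/s and an area in mm^2."""
    return data_rate_gbps / area_mm2 / 1000.0


def power_mw(energy_fj_per_bit: float, data_rate_gbps: float) -> float:
    """1 fJ/b * 1 Gb/s = 1e-15 J * 1e9 /s = 1 uW, so divide by 1000 for mW."""
    return energy_fj_per_bit * data_rate_gbps / 1000.0


full_tx = bandwidth_density_tbps_per_mm2(40.0, 0.06)           # including the PLL
mod_drv = bandwidth_density_tbps_per_mm2(40.0, 0.01 + 0.001)   # modulator and driver
total_power = power_mw(685.0, 40.0)                            # complete transmitter
print(round(full_tx, 2), round(mod_drv, 1), round(total_power, 1))
```

These reproduce the 0.67 Tb/s/mm<sup>2</sup> and 3.6 Tb/s/mm<sup>2</sup> densities and correspond to a total transmitter power of about 27.4 mW at 40 Gb/s.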



Figure 3.22: Transient waveforms with all possible ODAC codes in order to measure linearity.



Figure 3.23: Measured transmit eye-diagrams. The highest and lowest optical levels are the same in both cases since the operating point remains the same.

The performance of our PAM-4 transmitter is summarized in Table 3.1 and compared against other high-speed optical transmitters. This work proves the benefits of eliminating the electrical DAC, using microrings, and exploiting monolithic silicon photonics platforms to achieve energy-efficiency. These elements improved both total energy-efficiency (685 fJ/b) and bandwidth density (0.67 Tb/s/mm<sup>2</sup>) over the state-of-the-art MZI and microring-based transmitters by more than an order of magnitude.

Table 3.1: Summary and comparison of the ODAC-based PAM-4 transmitter with prior high-speed optical transmitters.

|                          | This Work                  | [85]                     | [30]                     | [38]                          | [28]                       |
|--------------------------|----------------------------|--------------------------|--------------------------|-------------------------------|----------------------------|
| Technology               |                            |                          |                          |                               |                            |
| Photonics<br>Circuits    | 45 nm CMOS SOI             | GP 65 nm CMOS            | 90 nm CMOS SOI           | 130 nm SOI CMOS<br>40 nm CMOS | PIC25G SOI<br>55 nm BiCMOS |
| Integration              | Monolithic                 | Wirebond                 | Monolithic               | Wirebond                      | 3D (copper pillars)        |
| Wavelength               | 1280 nm                    | 1550 nm                  | 1300 nm                  | 1310 nm                       | 1310 nm                    |
| Transmitter              |                            |                          |                          |                               |                            |
| Driver Supply            | 1.55 V                     | 2.4 V                    | 1.5 V                    | 1 V                           | N/R                        |
| Modulator Device         | Ring-resonator             | Ring-resonator           | MZI                      | SISCAP-MZI ¶                  | MZI                        |
| Extinction               | 3 dB                       | 7 dB                     | 6.3 dB                   | N/R                           | 2.5 dB                     |
| Insertion Loss           | 5.5 dB                     | 5 dB                     | 5 dB                     | N/R                           | >5.7 dB †                  |
| NRZ Data Rate            | 20 Gb/s                    | N/A                      | 25 Gb/s                  | 20 Gb/s                       | 56 Gb/s                    |
| NRZ Energy Efficiency\*  | 0.155 pJ/bit               | N/A                      | N/R                      | 4.5 pJ/bit                    | 5.4 pJ/bit                 |
| PAM-4 Data Rate          | 40 Gb/s                    | 40 Gb/s                  | 56 Gb/s                  | 20 Gb/s                       | N/A                        |
| PAM-4 Energy Efficiency\* | 0.042 pJ/bit              | 3.04 pJ/bit              | 4.8 pJ/bit               | 0.29 pJ/bit                   | N/A                        |
| Photonics Area           | 0.01 mm<sup>2</sup>        | 0.01 mm<sup>2</sup> †    | 1.5 mm<sup>2</sup> †     | 0.18 mm<sup>2</sup>           | 2.3 mm<sup>2</sup>         |
| Driver Area              | 0.001 mm<sup>2</sup> ‡     | 0.07 mm<sup>2</sup> †    | 1.5 mm<sup>2</sup>       | 0.18 mm<sup>2</sup>           | 0.45 mm<sup>2</sup> †      |
| BW Density\*             | 3.6 Tb/s/mm<sup>2</sup>    | 0.5 Tb/s/mm<sup>2</sup>  | 0.036 Tb/s/mm<sup>2</sup> | 0.053 Tb/s/mm<sup>2</sup>    | 0.02 Tb/s/mm<sup>2</sup>   |
| Total Transmitter Area   | 0.06 mm<sup>2</sup> §      | 0.08 mm<sup>2</sup>      | 1.56 mm<sup>2</sup>      | 0.38 mm<sup>2</sup>           | 2.75 mm<sup>2</sup>        |

N/A = Not Applicable, N/R = Not Reported

<sup>\*</sup> Modulator and Driver
† Estimated from figures
‡ Including LUT
¶ Silicon-insulator-silicon Capacitor MZI

Figure 3.24: Transmitter's energy and area breakdown at 40 Gb/s PAM-4.



Figure 3.25: (a) Thermal tuning stress test pattern and heater value during the active thermal tuning and (b)  $20\,\mathrm{Gb/s}$  PAM-4 transmit eye-diagrams with thermal tuning on and off.

## Chapter 4

# Monolithic Photonics in 32nm SOI CMOS

One of the main advantages of the CMOS microelectronics technology is the ability to evolve into smaller transistor technologies by scaling the effective channel length. More advanced CMOS process nodes provide larger density of transistors with better performance and energy-efficiency. From the perspective of electronic-photonic system design, this is favorable as we can also improve the energy-efficiency and bandwidth density of optical transceivers utilizing advanced CMOS nodes. However, photonic devices are limited in size by the optical wavelength and smaller film thicknesses and feature sizes of advanced CMOS nodes are limiting the monolithic integration of photonic components in these technologies.

As a step toward implementing photonics in a more advanced CMOS node, we deployed our zero-change approach in the 32 nm SOI technology. Performance benefits of the 32 nm node compared to the 45 nm node include faster transistors, 20% higher transistor density, and the introduction of embedded DRAM (eDRAM), a dense on-chip memory. This will most likely be the last CMOS node that can be utilized in zero-change monolithic platforms, as more advanced nodes are not fabricated in partially-depleted SOI (PDSOI) platforms. Integrating photonic components in sub-28 nm nodes requires process changes, as the c-Si layer in fully-depleted SOI (FDSOI) platforms is not thick enough to provide sufficient light confinement for the waveguides. Moreover, FinFET technologies mostly use bulk CMOS platforms, which do not natively provide any medium for building low-loss optical waveguides. We will address this challenge in the next chapter by introducing a minimal number of new masks and materials to enable monolithic photonics in the most advanced FDSOI and FinFET CMOS processes.

In this chapter, we extend our zero-change monolithic integration approach to a more advanced 32 nm technology node in order to further improve the speed of the electronics and the performance of the device platform by exploiting new process features, such as channel SiGe with the higher Ge concentration available in this technology [17, 18]. This process node features high-k/metal gates (HKMG) with a minimum gate length of 25 nm and about 33% logic speed improvement over the 45 nm node [10]. Because the silicon body thickness is similar to that of the 45 nm process, the photonic devices are designed based on our reference structures and techniques in that node. In a single multi-project wafer run, we were able to demonstrate 12 Gb/s optical transceivers in the standard telecom O-band (1260 nm-1360 nm) in this platform. This transceiver uses resonant modulators and detectors with analog front-end circuits. As with the 45 nm platform, we expect that future development of photonic devices that exploit the improved lithography and expanded material selection will allow for continued improvement of the 32 nm platform. This platform provides the electronic and photonic performance needed to address the integration of high-bandwidth and energy-efficient photonic I/O with high-performance and low-power logic chips, particularly for HPC applications.

## 4.1 Monolithic Photonic Platform in Zero-change 32nm SOI CMOS

In this chapter, we extend the monolithic integration of photonics with transistors to a more advanced 32 nm technology node to further improve the performance of the device platform [17]. This process features high-k/metal gates (HKMG) with the minimum gate length of 25 nm [95] and  $\sim$ 33% logic speed improvement over 45 nm node [10]. Unlocking these advanced CMOS nodes for photonic and electronic integration is a promising path toward future electro-optical SoC solutions.

We use an unmodified 32 nm SOI CMOS process fabricated at GlobalFoundries for this platform. The sub-100 nm thick high-index crystalline silicon (c-Si) layer in 32 nm SOI CMOS provides enough confinement for the optical modes to design compact photonic devices at datacom and telecom wavelengths (1300-1600 nm) [62]. This layer is sandwiched between the buried oxide (BOX) layer at the bottom and nitride dual stress liners (DSL) on the top, as shown in the process cross-section (Fig. 4.1). Since the BOX is not thick enough to prevent the guided modes in the c-Si layer from leaking into the substrate, the silicon substrate is removed using a XeF<sub>2</sub> etch in a single post-processing step. Transistor performance is not affected, and all existing foundry models and standard cell timing libraries remain valid [6, 62]. For this work, we transferred the die to a glass carrier substrate after the substrate removal step in order to probe the test structures and circuits from the top side of the chip. Removing the substrate also enables us to couple light from the backside of the chip, while the chip can be flip-chip attached to a package substrate for electrical connectivity with high signal integrity and low parasitics.

Figure 4.1: Cross section of the 32 nm SOI CMOS process.

In this work, after releasing the silicon substrate, we mount the die on a glass substrate so that we have access to the top of the chip for probing our devices and circuits. This procedure is shown in Fig. 4.2. The process steps, in order, are:

1. The die is epoxied on the first glass substrate via crystal-bond on the hot plate. The epoxy needs to cover only the bottom  $10\,\mu m$  of the chip to prevent over-etching of the c-Si during etching.
2. The silicon substrate is completely etched away using  $XeF_2$.
3. The second glass substrate is attached to the back of the die with Norland Optical Adhesive (NOA) using UV curing.
4. The first glass substrate is released by dissolving the crystal-bond in acetone. The chip surface is washed at the end using isopropyl alcohol (IPA).



Figure 4.2: Post-processing steps to release the silicon substrate and transfer the die to a second substrate for accessing the pads for probing and characterizing devices.

In the case of full electrical flip-chip packaging, there is no need to perform the substrate transfer steps. Here, we apply under-fill epoxy for mechanical and thermal stability after flip-chip bonding and use XeF<sub>2</sub> etching to remove the silicon substrate (Fig. 4.3).



Figure 4.3: Post-processing steps to release silicon substrate in the case of flip-chip packaging on the PCB.

## 4.2 Photonic Device Design

#### 4.2.1 Passive Photonic Devices

The photonic components for optical transceivers in this platform are implemented using the body c-Si and epitaxial SiGe layers. The well and source/drain doping implants are re-purposed for implementing device junctions in the high-speed modulators and detectors. Figure 4.4 presents the top and bottom view micrographs of the  $3 \times 3\,\mathrm{mm}^2$  test chip. Here, we discuss the design considerations of passive and active photonic devices in detail.



Figure 4.4: Micro-graphs of substrate released chips from top and bottom views.

Waveguides. A waveguide is the most fundamental photonic element required for routing the light on a chip and forming other critical building blocks of all on-chip optical devices.

Waveguides used in silicon-photonic microchips consist of a waveguide core with a high refractive index and low optical loss, surrounded by a cladding made of lower refractive index materials. A high index contrast between the core and the cladding material allows bend radii on the order of a few microns to be achieved with minimal radiative losses, enabling optical routing within tight geometries, such as that of a chip. The index contrast also sets the limit on waveguide spacing and the minimum cladding thickness required: an evanescent field extends out from the waveguide core into the cladding material, allowing waveguides to interact optically with each other when brought into close enough proximity. This field decays exponentially with distance from the waveguide core, and the amount of coupling between interacting waveguides can be adjusted using the spacing.
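As a textbook approximation of this exponential decay (a slab-waveguide picture, not a result from this work), the field at a distance $x$ into the cladding can be written as below. The symbols are introduced here for illustration: $\lambda$ is the free-space wavelength, $n_{\mathrm{eff}}$ the effective index of the guided mode and $n_{\mathrm{clad}}$ the cladding index.

$$
E(x) \propto \exp(-\gamma x), \qquad \gamma = \frac{2\pi}{\lambda}\sqrt{n_{\mathrm{eff}}^{2} - n_{\mathrm{clad}}^{2}}
$$

In this picture the coupling between two identical waveguides separated by a gap $g$ falls off roughly as $\exp(-\gamma g)$, which is why a higher index contrast (larger $\gamma$) permits tighter waveguide spacing and thinner cladding.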

There are multiple sources for optical loss in the waveguides including: the bulk material absorption loss, line-edge etch induced sidewall roughness scattering loss, surface scattering loss, and bending loss. The presence of free electrons or holes (from a doped silicon waveguide or from a piece of metal in close proximity to the waveguide) can also introduce an additional loss due to free carrier absorption. Waveguides used for silicon-photonics are usually designed to be single-mode, though multi-mode structures can be used in straight waveguides to create low-loss wide structures, waveguide crossings or create beat-patterns that avoid certain lossy obstacles such as contacts or high-doping regions in active devices.

The choice of materials for the waveguide core and the cladding depends on the silicon-photonic platform. Figure 4.5 shows the SEM of a strip waveguide cross-section before substrate removal. Here, we used the sub-100 nm thick crystalline silicon (c-Si) as the core. The 200 nm-thick BOX acts as the under-cladding medium, while the waveguide core is surrounded by nitride liners and silicon oxide on top. We measured 25 dB/cm propagation loss at 1310 nm and 20 dB/cm at 1550 nm wavelengths. By blocking the unwanted doping layers that are present in the waveguides of the current chip design, we expect to be able to lower the loss to below 10 dB/cm, as we have already shown in the 45 nm node (3 dB/cm) [62]. Note that these doping layers have different numbers and names in different CMOS processes, and we need to identify all of them and block them in the photonic waveguide regions. Thus, multiple runs are normally necessary for developing such electronic-photonic platforms.
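To get a feel for what these loss figures mean at chip scale, the sketch below converts a propagation loss in dB/cm into the fraction of optical power remaining after a given routing length. The 1 mm length is an arbitrary example and the function name is illustrative.

```python
def transmitted_fraction(loss_db_per_cm: float, length_cm: float) -> float:
    """Fraction of optical power left after propagating length_cm."""
    return 10.0 ** (-(loss_db_per_cm * length_cm) / 10.0)


# 1 mm of waveguide at the measured 32 nm loss and at the 45 nm reference loss.
for loss in (25.0, 3.0):
    print(loss, round(transmitted_fraction(loss, 0.1), 3))
```

At 25 dB/cm a 1 mm route costs 2.5 dB (about 56% transmission), whereas at 3 dB/cm the same route costs only 0.3 dB.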



Figure 4.5: SEM of a strip waveguide before substrate release step.

Grating Couplers. Grating couplers for coupling light into and out of the chip are designed by patterning the c-Si layer. Light can be coupled into the chip from either the front or the back side because of the vertical symmetry of this structure (Fig. 4.6a). The measured optical transmission from the backside is shown in Fig. 4.6b, with a coupling loss of 4.9 dB and an 84 nm 1 dB-bandwidth. We measured a higher coupling loss of 7.5 dB when light is coupled from the top side, due to the inter-layer dielectric films in the optical path that absorb and reflect the input light. Advanced CMOS processes use multiple variants of low-k materials in the BEOL, and the reflections caused by their different refractive indices are the dominant cause of the extra coupling loss in this case. This issue can be solved by an extra post-processing step that selectively etches the BEOL oxides on top of the grating couplers.

The coupling loss can be reduced to sub-2 dB by utilizing the polysilicon layer in this process to break the vertical symmetry of the device and design a unidirectional grating [96]. One critical difference from the 45 nm node is the introduction of a metal gate instead of the polysilicon gate in this process. However, polysilicon is still utilized in the gate formation after depositing a thin layer of metal gate [97]. Pure polysilicon (polysilicon without the gate electrode underneath) can also be utilized as a routing layer. Thus, the available pure polysilicon can be patterned and used as the second grating layer to make the couplers unidirectional, as we demonstrated in the zero-change 45 nm SOI platform (Section 2.3.2).



Figure 4.6: Diffraction grating couplers in 32 nm zero-change platform. (a) 3D layout, (b) Optical transmission with fibers coupled from the bottom side of the die at 12.5 degree coupling angle.

All photonic devices and components are covered by exclude layers for c-Si, poly-silicon, and the first 7 metal layers (M1-M7) to prevent the foundry from auto-placing filler cells of these layers in their proximity. However, we still have to meet the density requirements of these layers inside the exclude regions. For this purpose, a custom-made 10 µm-wide belt of high-density (HD) fillers for these layers is placed around each photonic circuit (visible in Fig. 4.6a). Additionally, we need to block the placement of any metal pieces, including filler cells, in the higher metal levels over the grating couplers. This is necessary in order to couple light in and out through the back-end (top of the chip surface) without any blockage. Therefore, all the exclude layers should be instantiated on the grating coupler regions.

#### 4.2.2 Active Photonic Devices

Microring Modulators. We used a spoked-ring resonator design for the modulators similar to the one described in Section 2.3.3. The microring modulators are implemented using interleaved lateral p-n junctions that modulate the light by shifting the resonance wavelength via the carrier plasma dispersion effect [72]. In this design, 30 cathode and 30 anode segments are placed along the perimeter of the microring cavity. T-shaped junctions are used to increase the interaction of the depletion regions with the optical mode and thereby enhance the modulation efficiency [73]. The microring has a $5\,\mu\mathrm{m}$ radius (Fig. 4.7a), resulting in a free spectral range (FSR) of 18.9 nm around 1310 nm wavelength, with a loaded Q-factor of 6k (intrinsic Q > 12k). The interleaved junctions are operated in the depletion mode for high-speed operation and exhibit a resonance wavelength shift efficiency of $20\,\mathrm{pm/V}$ (Fig. 4.7b).
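The quoted FSR and loaded Q-factor together fix the resonance linewidth and the finesse of the modulator cavity. The following is a minimal sketch of that arithmetic, using only the wavelength, FSR, and loaded Q stated above; the function names are illustrative and not part of the design flow.

```python
def linewidth_nm(wavelength_nm: float, q_factor: float) -> float:
    """Resonance full width at half maximum: FWHM = lambda / Q."""
    return wavelength_nm / q_factor


def finesse(fsr_nm: float, fwhm_nm: float) -> float:
    """Finesse = free spectral range divided by the resonance linewidth."""
    return fsr_nm / fwhm_nm


# Values quoted in the text for the 32 nm microring modulator.
WAVELENGTH_NM = 1310.0
FSR_NM = 18.9
LOADED_Q = 6000.0

fwhm = linewidth_nm(WAVELENGTH_NM, LOADED_Q)
cavity_finesse = finesse(FSR_NM, fwhm)
print(f"linewidth ~ {fwhm * 1e3:.0f} pm, finesse ~ {cavity_finesse:.0f}")
```

With these numbers the loaded linewidth is roughly 0.22 nm and the finesse is close to 87, which gives a sense of how large a resonance shift is needed relative to the linewidth.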



Figure 4.7: (a) Microring modulator's micrograph, (b) Optical transmission of a ring modulator for different biases in the depletion mode.

In order to compensate for resonance wavelength changes due to process variations and temperature changes, a doughnut-shaped resistive heater is implemented in the salicided c-Si layer and embedded inside the microring. The microheater has a resistance of $500\,\Omega$ and a wavelength tuning efficiency of $0.8\,\mathrm{nm/mW}$ ($14\,\mu\mathrm{W/GHz}$). Figure 4.8 shows the functionality of the heater, where the resonance of the microring is tuned by controlling the heater voltage and consequently its power. The current heater efficiency can be improved to around $3.7\,\mu\mathrm{W/GHz}$ once the die is flip-chip packaged on the PCB instead of being mounted on a glass substrate. The reason is that, in the current setup, heat is dissipated through the NOA epoxy, whereas in the flip-chip case there is no epoxy on the BOX surface and consequently heat can be efficiently transferred on the chip surface.
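Because the heater is resistive, its power scales with the square of the applied voltage, and so does the resonance shift. The sketch below illustrates this relation with the quoted $500\,\Omega$ resistance and $0.8\,\mathrm{nm/mW}$ efficiency. It assumes a constant heater resistance and a linear shift-versus-power response, which are simplifications made here for illustration; the helper names are hypothetical.

```python
import math

HEATER_RESISTANCE_OHM = 500.0   # from the text
TUNING_EFF_NM_PER_MW = 0.8      # from the text
FSR_NM = 18.9                   # modulator FSR quoted earlier in this section


def heater_power_mw(voltage_v: float) -> float:
    """Dissipated power P = V^2 / R, returned in milliwatts."""
    return 1e3 * voltage_v ** 2 / HEATER_RESISTANCE_OHM


def resonance_shift_nm(voltage_v: float) -> float:
    """Resonance shift assuming a linear shift-vs-power response."""
    return TUNING_EFF_NM_PER_MW * heater_power_mw(voltage_v)


def voltage_for_shift(shift_nm: float) -> float:
    """Heater voltage needed for a given shift (inverse of the above)."""
    power_w = 1e-3 * shift_nm / TUNING_EFF_NM_PER_MW
    return math.sqrt(power_w * HEATER_RESISTANCE_OHM)


shift_at_1v = resonance_shift_nm(1.0)
volts_for_full_fsr = voltage_for_shift(FSR_NM)
print(f"shift at 1 V: {shift_at_1v:.2f} nm")
print(f"voltage to tune across one FSR: {volts_for_full_fsr:.2f} V")
```

Under these assumptions, 1 V on the heater dissipates 2 mW and shifts the resonance by about 1.6 nm, while tuning across a full FSR would take roughly 3.4 V.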



Figure 4.8: An example of a heater shifting the microring's resonance wavelength for various heater strengths.

Resonant Photodetectors. Although the majority of silicon photonic platforms rely on Ge for photodetection, growing high-quality Ge on silicon-based substrates and films has always been very challenging. Additionally, our premise in the zero-change scheme is not to customize the commercial CMOS process, and to exploit only the process features already available in these platforms to achieve the optical capabilities needed. Despite the lack of pure Ge in CMOS technologies, SiGe has been used abundantly, especially in the 45/32 nm nodes. Epitaxial SiGe has been used in the source/drain regions of PMOS transistors to apply compressive stress since the 45 nm technology node [9]. This type of SiGe is called embedded SiGe (eSiGe) and also exists in the 32 nm node. However, the concentration of Ge in this type of SiGe is estimated to be less than 19%. The 32 nm technology node also features another epitaxially grown SiGe layer with a higher Ge concentration, which leads to higher responsivity for PDs. This SiGe epi layer is called channel SiGe (cSiGe) and is used in PMOS channels to reduce the threshold voltage (V<sub>TH</sub>) after the introduction of metallic gate material [10]. Figure 4.9 shows SEMs of PMOS devices in both of these technologies and highlights the SiGe layers.

Figure 4.9: (a) SEM of a PMOS in 45 nm SOI process highlighting embedded SiGe (eSiGe) available as a source/drain material [9], (b) SEM of a PMOS in 32 nm SOI process showing additional channel SiGe (cSiGe) material available as a channel of this device [10], (c) PMOS device's channel SEM.

In this work, we deployed both types of SiGe layers available in the 32 nm process to build photodetectors and compared their performance. Due to the relatively low quantum efficiency of SiGe, we designed resonant PDs in order to improve the responsivity within a compact area. The compact area of the device keeps the PD capacitance low enough to achieve high electro-optical bandwidths. The resonant waveguide photodetector exploits carrier generation in SiGe within a microring cavity, so the effective absorption length is multiplied by the finesse of the resonator. Additionally, resonant PDs are advantageous in multi-wavelength optical receivers, as they combine the optical filter and the photodetector in a single device.

The 3D layout of the design is shown in Fig. 4.10a, where interleaved p and n junctions are placed radially along the ring cavity. These well implants are fabricated prior to SiGe deposition and affect the c-Si only. Source-drain (S/D) well implants, halo/extension implants, and salicidation complete the cathode and anode electrical contacts. By adding a 0.3 µm-wide ring of SiGe into the microring cavity [98], we were able to build resonant PDs. This SiGe ring can be made out of either eSiGe or cSiGe (Fig. 4.10b).

A higher Ge concentration has multiple benefits that ultimately improve the responsivity and bandwidth of the resonant PDs. Higher Ge content leads to both a larger absorption coefficient and a larger refractive index. These factors have been studied in [99], and the results are presented in Fig. 4.11. We have extrapolated the optical behavior of SiGe at 40% Ge concentration based on the model described in that reference. These characteristics show that a higher Ge concentration not only increases the absorption coefficient, but also raises the refractive index, which enhances the effective refractive index and the light confinement in the waveguide.

Figure 4.10: (a) Resonant PD's 3D layout, (b) Cross-section of PMOS devices in 45 nm and 32 nm processes indicating available SiGe layers for photodetection.

Figure 4.11: Dependence of SiGe optical characteristics on the Ge concentration.

Figure 4.13a shows the optical transmission of the cSiGe-based resonant PD. The Q-factor of these devices is > 6.5k (intrinsic Q > 15k). Measurements showed that cSiGe-based resonant PDs have an improved responsivity of 0.13 A/W at 1310 nm, compared with 0.06 A/W for eSiGe-based resonant PDs. The cSiGe-based device exhibits a >12.5 GHz 3 dB electro-optical bandwidth (measured via a 13.5 GHz VNA) and 150 nA dark current at 8 V reverse bias (Fig. 4.14). eSiGe-based resonant PDs have a smaller electro-optical bandwidth of 5 GHz, dominated by the RC time constant of the junctions. The reason is that the eSiGe layer is thicker than the cSiGe layer, so the remaining doped silicon region underneath, which forms the junction contacts, is thinner and has a higher resistance (Fig. 4.12).
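The responsivities quoted for the two detector types can be translated into external quantum efficiencies using the standard relation $R = \eta q \lambda / (h c)$. The sketch below applies it to the 0.13 A/W and 0.06 A/W figures at 1310 nm. It treats the measured responsivity as including all coupling and cavity effects, which is an assumption made here; the function names are illustrative.

```python
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 2.99792458e8
ELECTRON_CHARGE_C = 1.602176634e-19


def ideal_responsivity_a_per_w(wavelength_nm: float) -> float:
    """Responsivity of a detector with 100% quantum efficiency: q*lambda/(h*c)."""
    wavelength_m = wavelength_nm * 1e-9
    return ELECTRON_CHARGE_C * wavelength_m / (PLANCK_J_S * LIGHT_SPEED_M_S)


def quantum_efficiency(responsivity_a_per_w: float, wavelength_nm: float) -> float:
    """Fraction of incident photons converted to collected carriers."""
    return responsivity_a_per_w / ideal_responsivity_a_per_w(wavelength_nm)


qe_csige = quantum_efficiency(0.13, 1310.0)  # cSiGe-based resonant PD
qe_esige = quantum_efficiency(0.06, 1310.0)  # eSiGe-based resonant PD
print(f"cSiGe QE ~ {qe_csige:.1%}, eSiGe QE ~ {qe_esige:.1%}")
```

At 1310 nm an ideal detector would deliver about 1.06 A/W, so the measured values correspond to roughly 12% and 6% quantum efficiency for the cSiGe and eSiGe devices, respectively.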

The diode behavior of the resonant PDs is presented in Fig. 4.13b, where we can apply up to $10\,\mathrm{V}$ of reverse bias before the diode breaks down. This is expected, as the electric field is applied over a $300\,\mathrm{nm}$-wide region, which is about $10\times$ wider than the FET channel length that tolerates up to $1.2\,\mathrm{V}$.
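As a rough sanity check of this scaling argument, one can assume that the junction tolerates the same peak electric field as the transistor channel and that the field is uniform across the region. Both are simplifying assumptions introduced here for illustration, and the symbols $V_{\max,\mathrm{PD}}$, $V_{\max,\mathrm{FET}}$, $W_{\mathrm{PD}}$, and $L_{\mathrm{ch}}$ are notation chosen for this sketch. Using the $1.2\,\mathrm{V}$ rating and the roughly $10\times$ width ratio quoted above:

$$
V_{\max,\mathrm{PD}} \approx V_{\max,\mathrm{FET}} \times \frac{W_{\mathrm{PD}}}{L_{\mathrm{ch}}} \approx 1.2\,\mathrm{V} \times 10 = 12\,\mathrm{V}
$$

This first-order estimate is of the same order as the roughly 10 V reverse bias observed before breakdown.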

Figure 4.12: Cross-section of eSiGe- and cSiGe-based resonant PDs in the zero-change 32 nm platform.

Figure 4.13: cSiGe-based resonant PD measured characteristics: (a) optical transmission, (b) I-V curve.

Exploiting new features of advanced CMOS nodes, such as cSiGe in the 32 nm technology, is only one example of the new degrees of freedom existing in state-of-the-art CMOS processes. Today, these platforms utilize various elements of the periodic table beyond silicon and its derivatives. Furthermore, these materials are all deposited or grown with high-resolution lithography masks and process control. Reusing these built-in materials and layers for emerging applications such as electronic-photonic devices and new memory technologies significantly lowers the development cost and cycle time and makes them more readily available for next-generation large-scale microelectronics.

## 4.3 Electronic-Photonic Optical Transceivers

Our first electro-optical system demonstration in this platform is a microring-based O-band transceiver with on-chip CMOS front-ends. These electronic-photonic transceivers help us characterize our photonic device designs in situ and prove the full electro-optical functionality. Figure 4.15 shows the circuit-level block diagram and micrograph of the transmitter. The circuits are supplied by a $1.2\,\mathrm{V}$ DC source, except for the transmitter's high-swing driver head, which requires an additional $2.4\,\mathrm{V}$ supply. The ring-modulator diodes are biased such that they see a 0 to $-2.4\,\mathrm{V}$ voltage swing in the depletion mode.

Figure 4.14: cSiGe-based resonant PD's electro-optical frequency response.



Figure 4.15: (a) Optical transmitter's block diagram and (b) micrograph.

3D and hybrid silicon-photonic platforms impose large parasitic capacitances on the interconnects between the photonic and electronic chips, which limits the energy efficiency of ring-resonator-based optical transmitters. In monolithic platforms, by contrast, the wiring capacitance between these devices is only on the order of femtofarads. Here, the wiring parasitic is estimated to be smaller than 5 fF, and it can be reduced further by optimizing the floor plan.

In this work, we used high-swing modulator drivers capable of driving the modulator with a $2\times\mathrm{VDD}$ swing to improve the modulation efficiency by increasing the resonance shift. With the nominal swing of VDD, only about 50 nm of each 400 nm-wide p and n region (in each segment) is depleted. This means that the undepleted junction portions do not contribute to the modulation depth and only cause extra free-carrier loss inside the ring cavity. To increase the modulator's efficiency, either the junction width should be decreased to fit more segments, or a larger portion of each junction should be depleted by applying higher voltages. Since the minimum junction width is set by the manufacturing mask capabilities, which are specified in the foundry's design rule check (DRC) files, the only option left is to boost the driver's swing.

Figure 4.16 shows the circuit block diagram of the high-swing modulator driver. The driver stage is composed of thick-oxide devices supporting up to $2.4\,\mathrm{V}$. The HVDD and AVDD supplies are set to $2.4\,\mathrm{V}$ and $1.2\,\mathrm{V}$, respectively. The input data (IN) is provided by a CML-to-CMOS converter and is level-shifted via an AC-coupled level-shifting circuit to drive this stage. Thanks to the fast transistors in this process, electrical data rates of $>25\,\mathrm{Gb/s}$ can be achieved with this design, assuming a total load capacitance of 30 fF. A higher voltage swing of $4\times\mathrm{VDD}$ can also be applied by using two copies of this driver and an AC-coupler to drive both ends of the modulator for further improvement.
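To see what the higher swing costs in dynamic energy, a common first-order estimate for a capacitive load driven with random NRZ data is $E_{\mathrm{bit}} \approx C V^2 / 4$. The sketch below applies this textbook expression to the 30 fF load and the two swing values mentioned above. It is an illustrative estimate only: it ignores the pre-driver, level shifter, and any static current, and it is not the energy figure reported for this design.

```python
def nrz_energy_per_bit_fj(load_cap_ff: float, swing_v: float) -> float:
    """Average dynamic energy per bit, C*V^2/4, for random NRZ data.

    Assumes equally likely ones and zeros, so a charging (0 -> 1)
    transition drawing C*V^2 from the supply occurs once every four bits.
    With capacitance in fF and voltage in V the result is in fJ.
    """
    return load_cap_ff * swing_v ** 2 / 4.0


LOAD_CAP_FF = 30.0  # total load capacitance assumed in the text

energy_nominal = nrz_energy_per_bit_fj(LOAD_CAP_FF, 1.2)  # VDD swing
energy_high = nrz_energy_per_bit_fj(LOAD_CAP_FF, 2.4)     # 2 x VDD swing
print(f"VDD swing: {energy_nominal:.1f} fJ/bit, 2xVDD swing: {energy_high:.1f} fJ/bit")
```

Under these assumptions, doubling the swing quadruples the switching energy of the final stage, from about 10.8 fJ/bit to about 43.2 fJ/bit, which is the price paid for depleting a larger portion of each junction.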



Figure 4.16: High-swing modulator driver circuit.

The receiver's block diagram and micrograph are shown in Fig. 4.17. The small parasitic capacitance of the interconnect between the PD and the receiver front-end is a major advantage of monolithic photonics over 3D/hybrid platforms and significantly improves the receiver's sensitivity. The receiver front-end is composed of a split cSiGe-based resonant PD with two trans-impedance amplifiers (TIAs) for each arm. Splitting the resonant PD provides a pseudo-differential photocurrent and detection, which suppresses common-mode and supply noise at the receiver [100]. The TIAs are followed by two current-mode logic (CML) stages that drive the output pads. The total differential trans-impedance gain of the receiver is $13\,\mathrm{k}\Omega$ with $5\,\mathrm{GHz}$ electrical bandwidth (TIA gain: $4.5\,\mathrm{k}\Omega$).



Figure 4.17: (a) Optical receiver's block diagram and (b) micrograph.

Figure 4.18 presents the measured transmitter and receiver eye diagrams. We achieved 13.5 Gb/s transmission with an extinction ratio (ER) of 3.7 dB and an insertion loss (IL) of 2.8 dB. The data rate is currently limited by the RC time constant of the modulator junctions and the lack of high-speed data generation on the chip. The RC bandwidth limitation can be addressed by optimizing the junction profiles to reduce each segment's electrical resistance. We expect to achieve data rates similar to those demonstrated in the previous chapter by fixing this issue and adding an on-chip high-speed PLL and serializers. The receiver is tested by coupling externally modulated light into the chip. We used a commercial MZI modulator to produce the modulated light and captured the differential electrical signal at the output pads to monitor the receiver's functionality. The receiver's eye diagram is measured at data rates of up to 12 Gb/s, limited by the TIAs' 5 GHz bandwidth (due to the large input capacitance of the following buffer). The TIA design can be improved to achieve bandwidths beyond 5 GHz by properly sizing the CML buffer chain and optimizing the TIA architecture and transistor sizing.
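The ER and IL figures can be converted into normalized optical "one" and "zero" levels, and from there into an optical modulation amplitude (OMA). The sketch below does this conversion under the assumption that the insertion loss is referenced to the "one" level, which is a convention chosen here for illustration and is not stated in the measurement description; the function names are hypothetical.

```python
def db_to_linear(value_db: float) -> float:
    """Convert a power ratio from decibels to a linear ratio."""
    return 10.0 ** (value_db / 10.0)


def nrz_levels(insertion_loss_db: float, extinction_ratio_db: float):
    """Return (one_level, zero_level, oma), normalized to the input power.

    Assumes the insertion loss is defined at the 'one' level.
    """
    one_level = 1.0 / db_to_linear(insertion_loss_db)
    zero_level = one_level / db_to_linear(extinction_ratio_db)
    return one_level, zero_level, one_level - zero_level


one_level, zero_level, oma = nrz_levels(insertion_loss_db=2.8, extinction_ratio_db=3.7)
print(f"one: {one_level:.3f}, zero: {zero_level:.3f}, OMA: {oma:.3f} of input power")
```

With the quoted 2.8 dB IL and 3.7 dB ER, the "one" and "zero" levels sit at roughly 52% and 22% of the input power, so about 30% of the input power is available as modulation amplitude under this convention.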

## 4.4 Summary

We monolithically integrated all the necessary building blocks of WDM optical links with CMOS in a $32\,\mathrm{nm}$ SOI technology node without any change to the native process. We were able to demonstrate $12\,\mathrm{Gb/s}$ transceivers in the standard O-band. The speed and energy efficiency of these transceivers can be further improved by optimizing the circuit and photonic designs. We expect to lower the waveguide losses to levels similar to our previous work in the $45\,\mathrm{nm}$ node, which would enhance the modulator and detector Q-factors. One of the benefits of advanced CMOS nodes is that the high-resolution masks and lithography can be exploited to achieve high-density p-n junction segments and to open the possibility of more advanced modulation formats that improve energy efficiency and data rate [16].

Figure 4.18: (a) Transmitter's eye-diagram captured via commercial photo-detector and electrical scope, (b) Receiver's eye-diagram tested by externally modulated input light.

Table 4.1 summarizes the performance of our photonic devices implemented in zero-change platforms. Overall, the integration of photonics in advanced CMOS is a promising solution to a multitude of challenges facing today's CMOS, including the need for terabit-scale photonic I/O on processor and high-radix switch chips for next-generation HPC.

Table 4.1: Photonic devices in zero-change platforms performance summary.

| CMOS Technology                | 45 nm SOI               | 45 nm SOI           | 32 nm SOI               |
|--------------------------------|-------------------------|---------------------|-------------------------|
| Wavelength                     | 1550 nm                 | 1310 nm             | 1310 nm                 |
| Waveguide Loss                 | 4.6 dB/cm               | 3.7 dB/cm           | 25 dB/cm                |
| Grating Coupler Loss           | 10 dB<sup>+</sup>       | 1.5 dB              | 4.9 dB<sup>+</sup>      |
| Grating Coupler 1 dB Bandwidth | N/A                     | 78 nm               | 84 nm                   |
| Modulation Speed<sup>\*</sup>  | 25 Gb/s                 | 40 Gb/s             | 13.5 Gb/s               |
| Photodetector Responsivity     | 0.15 A/W                | 0.5 A/W             | 0.13 A/W                |
| Photodetector Bandwidth        | 10 GHz                  | 5 GHz               | >12.5 GHz               |

<sup>\*</sup>Without electrical equalization. <sup>+</sup>Bidirectional couplers.

## Chapter 5

## Photonic SoCs in Bulk CMOS

Electronic and photonic technologies have transformed our way of living – from computing and mobile devices, to information technology and the Internet. Our future demands in these fields require innovation in each technology separately, but also depend on our ability to harness their complementary physics through integrated solutions [6, 101]. Today, this goal is hindered by the fact that the majority of silicon nanotechnologies that enable our processors, computer memory, communications chips, and image sensors, utilize bulk silicon substrates, a cost-effective solution with an abundant supply chain, but with significant limitations for the integration of photonic functions. In this chapter, we aim to address this challenge by bringing photonics into bulk silicon complementary metal-oxide-semiconductor or bulk CMOS chips using a deposited layer of poly-crystalline silicon (polysilicon) on silicon oxide (glass) islands fabricated alongside transistors (Fig. 5.1). We have used this single deposited layer to realize optical waveguides and resonators, high-speed optical modulators, and sensitive avalanche photo-detectors. We have integrated this photonic platform with an entire 65 nm bulk CMOS process inside a 300 mm-wafer microelectronics foundry. We have implemented integrated high-speed optical transceivers in this platform operating at 10 Gb/s, composed of millions of transistors, and arrayed on a single optical bus for WDM links, to address the immediate demand for energy-efficient and high-bandwidth optical interconnects in data-centers and high-performance computing [102, 103]. By decoupling the formation of photonic devices from transistors, the demonstrated integration approach can achieve many of the goals of multi-chip solutions [50], with the performance, cost advantage, and scalability of monolithic platforms [6, 104, 57, 105]. 
As transistors are scaled beyond 10 nm in the near future [106], and as new nanotechnologies emerge [107, 108], this approach can provide the means for integration of photonics with the state-of-the-art in nanoelectronics.

Sustained innovations in electronics, predominantly in CMOS, have transformed computing, communications, sensing, and imaging. More recently, silicon photonics has been leveraging the CMOS infrastructure to address the growing demands for optical communications in Internet and data-center networks [104, 102, 57]. This convergence of photonics with CMOS promises to create a new paradigm in electronic-photonic technologies: processor and memory chips with high-bandwidth optical input/output [6, 103], communications chips with high-fidelity optical signal processing [109, 101], and highly parallel optical biochemical sensors for blood analysis [110] and gene sequencing [111]. To make these aspirations a reality, photonic devices need to be integrated with a variety of nanoelectronic functions (digital, analog, memory, storage, etc.) on a single silicon die.

As we discussed in Section 2.2, monolithic integration of photonic devices in close proximity to electronic circuits is crucial for two main reasons: it allows us to simultaneously achieve the required levels of performance, scalability, and complexity for electronic-photonic systems, and it significantly accelerates system-level innovation by enabling a cohesive design environment and device ecosystem to realize entire SoCs. In fact, the accelerated progress of electronics in recent years is a direct result of such an SoC approach and of the addition of new functions and components to CMOS to create new monolithic device platforms: wireless communications and radar imaging chips (through the addition of inductors and transmission lines [112]), and image sensors (through silicon photodiodes [113]).



Figure 5.1: Our technique for integrating electronic and photonic devices on a single silicon microchip by adding isolated patches (islands) of the insulator material silicon dioxide to a bulk silicon substrate [11].

The greatest challenge toward the integration of photonic circuits into CMOS has been the lack of a semiconductor material with suitable optical properties for realizing active and passive photonic functions in bulk CMOS, the dominant manufacturing platform for microelectronic chips (every Intel, Apple, and Nvidia CPU/GPU, all computer memory and Flash storage, etc.). As a result, all efforts to date to integrate photonics into CMOS have been limited to SOI substrates [6, 104, 57, 105]. These processes are cost-prohibitive for many applications (e.g., computer memory) and have a limited supply chain for high-volume markets. The same photonic integration challenge also exists for the leading CMOS technologies below the 28 nm transistor node (FinFET and thin-body fully-depleted SOI (TBFD-SOI) [114]), where the crystalline silicon (c-Si) layers are too thin (less than 20 nm) to support photonic structures with sufficient optical confinement. To address these integration challenges, we have developed a photonic platform for high-speed and low-power operation using an optimized polysilicon film that can be deposited on the silicon oxide islands that are ubiquitous in CMOS (where they are used to isolate transistors), even in the most recent technologies using FinFET and TBFD-SOI [114] (Fig. 5.2a).

Deposited electronic and photonic devices on glass have already impacted many fields: thin-film transistors (TFTs) have enabled today's display technologies, and photonic platforms with thin-film components on glass have been commercially deployed in optical communications systems [115]. However, deposited photonic components have been restricted to passive functions (e.g., filters and delay lines), lacking light detection and modulation. A variety of materials, including amorphous and polycrystalline silicon [116, 31, 117], polymers [118], and chalcogenides [119], have been investigated for deposition onto glass to realize active photonic components. Nevertheless, a fully functional photonic platform (i.e., passive functions, optical modulators, and detectors) integrated with sub-100 nm CMOS nanoelectronics in a 300 mm foundry had not been demonstrated prior to this work. Here, we have integrated a fully functional polysilicon photonic platform with a 65 nm bulk CMOS process through the addition of a few extra processing steps, without affecting the transistors' native performance, and demonstrated large-scale monolithic electronic-photonic systems.

## 5.1 Monolithic Photonic Platform in 65 nm Bulk CMOS

Figure 5.2a shows transistor structures in today's three dominant deeply-scaled CMOS processes. The silicon oxide shallow trench isolation (STI) for transistors in advanced CMOS nodes is too thin to support low-loss optical waveguides on top of it, because light leaks into the substrate. We address this issue by locally adding a thicker oxide photonic isolation layer (~1.5 µm) with a fabrication process very similar to STI. An optimized polysilicon film (220 nm thick) with low optical propagation loss and high carrier mobility is then deposited on this layer, and is used for passive photonic components, free-carrier plasma dispersion modulators [71, 72], and photodetectors that utilize the absorption by defect states at polysilicon grain boundaries [75, 77]. Photonic isolation layer fabrication and polysilicon film deposition are followed by two etching steps (full and partial, for strip and ridge structures) and two doping implants (p and n, for modulators and detectors) to form our photonics process module, which is inserted into the CMOS fabrication process flow. Figure 5.2b shows the cross-sectional drawing of three representative photonic components in our polysilicon photonic platform next to a transistor in a planar bulk CMOS process.

Figure 5.2: Photonic integration with nano-scale transistors. (a) Illustration of three major deeply-scaled CMOS processes: planar bulk CMOS, FinFET bulk CMOS, and fully-depleted SOI CMOS, (b) Integration of photonics process module into planar bulk CMOS, (c) SEM of different photonic and electronic blocks in our monolithic platform.

The photonics process module is inserted in the middle of transistor processing, after gate definition but before the source and drain implants. With this approach, all of the high-temperature photonics processing takes place before the definition of the source, drain, and channel of transistors. This eliminates the need for re-optimizing the source and drain implants and anneal processes that would otherwise be needed because of the sensitivity of deeply scaled transistors to the source and drain doping profiles [120]. Also, this approach allows us to reuse some of the front-end processing steps (high-doping implants and silicide formation) for active photonic components to minimize the number of photolithography masks. In doing so, the entire fabrication development is shifted to the photonics side, as low-loss photonic structures have to be implemented while transistor gate features already exist on the same level. This necessitates careful optimization of the polysilicon film deposition, polishing, and etching steps to achieve low (<1 nm) surface and sidewall roughness for low-loss and high-performance devices.

Designs were fabricated on 300 mm wafers in the fabrication facility at Colleges for Nanoscale Sciences and Engineering (CNSE), State University of New York, Albany, New York. The photonics passive-only wafers (partial flow) were fabricated on silicon wafers with 1.5 µm thick SiO<sub>2</sub> under-cladding blankets with the whole CMOS backend dielectric stack as the over-cladding. Partial-flow wafers were fabricated for photonics process optimization before integration with electronics.

Each wafer quadrant on the full-flow wafers received a separate mid-level doping implant concentration for the photonic active components (modulators and detectors): $[1, 2, 3, 6] \times 10^{18}$ cm<sup>-3</sup>. Due to the large density of defects in polysilicon, carrier activation occurs only after the majority of defect states are occupied, with an onset at a doping concentration of roughly $10^{18}$ cm<sup>-3</sup> [121]. This necessitates careful optimization of the photonic mid-level p and n doping concentrations to balance loss, modulator efficiency, and device series resistance, which affects the speed of both modulators and detectors. By using a separate doping concentration in each quadrant of the wafer, we tested the performance of modulators and detectors as a function of doping concentration. From the results of doping splits in an earlier fabrication run for optimizing the photonics process, we expected the optimal doping concentration to be close to $3 \times 10^{18}$ cm<sup>-3</sup>. However, in the full-flow run, an increase in optical loss caused by polishing residues reduced the microring Q-factors by a factor of two. This required larger wavelength shifts in the modulators to compensate for the broadened resonance line shape and achieve the same level of modulation depth. Therefore, we observed the best overall performance for a p and n implant concentration of $6 \times 10^{18}$ cm<sup>-3</sup>. The results presented in this work are for devices receiving this implant concentration.
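The link between a halved Q-factor and the need for a larger wavelength shift can be illustrated with a simple Lorentzian model of the resonance notch. The sketch below assumes a critically coupled ring (zero transmission on resonance), a laser parked on resonance in the "zero" state, and a hypothetical starting Q of 10,000. None of these values are taken from the measurements; they only serve to show the scaling.

```python
import math


def fwhm_nm(wavelength_nm: float, q_factor: float) -> float:
    """Resonance linewidth, FWHM = lambda / Q."""
    return wavelength_nm / q_factor


def notch_transmission(detuning_nm: float, fwhm: float) -> float:
    """Lorentzian notch of a critically coupled ring (0 on resonance)."""
    x = 2.0 * detuning_nm / fwhm
    return 1.0 - 1.0 / (1.0 + x * x)


def shift_for_depth(depth: float, fwhm: float) -> float:
    """Resonance shift that raises the transmission from 0 to `depth`."""
    return 0.5 * fwhm * math.sqrt(depth / (1.0 - depth))


WAVELENGTH_NM = 1300.0
Q_BEFORE = 10_000.0        # hypothetical Q before the extra loss
Q_AFTER = Q_BEFORE / 2.0   # Q dropped by a factor of two
TARGET_DEPTH = 0.5         # hypothetical modulation depth target

shift_before = shift_for_depth(TARGET_DEPTH, fwhm_nm(WAVELENGTH_NM, Q_BEFORE))
shift_after = shift_for_depth(TARGET_DEPTH, fwhm_nm(WAVELENGTH_NM, Q_AFTER))
print(f"required shift: {shift_before * 1e3:.0f} pm -> {shift_after * 1e3:.0f} pm")
```

In this model the required shift is proportional to the linewidth, so halving the Q doubles the shift needed for the same modulation depth, consistent with the move to a higher doping concentration described above.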

For full-flow wafers (passive and active photonics with electronics), the major fabrication steps, indicated by numbers in Fig. 5.2b, are as follows:

1. Transistor isolation (STI) fabrication.
2. Photonic isolation fabrication: The deep photonic trench is first fabricated by etching the trench in the silicon substrate and filling it with SiO<sub>2</sub> (~1.5 µm) by chemical vapor deposition (CVD), followed by a planarization step.
3. Transistor front-end fabrication up to the source/drain implant, including gate definition and S/D extension and halo implants. At this point, the wafer has gone through the CMOS front-end process up to the source and drain formation.
4. Photonic polysilicon film deposition, anneal, and polish to a thickness of 220 nm.
5. Polysilicon full and partial etching to form strip and ridge photonic structures. Two reactive ion etching steps are used: one full (etching the entire 220 nm depth of the polysilicon film) and one partial (120 nm deep).
6. Doping implants (mid-level p and n) for building active photonics.
7. From here, electronics and photonics share the rest of the fabrication process, including the high-doping implants (p++ and n++ for the transistor source and drain, and for the photonic modulator and detector ohmic contacts), nitride liners (silicide block, and etch-stop for the first via), and silicide formation.
8. Metallization. There are a total of 7 metal interconnect layers in this process, with the first 4 having a lithography resolution of less than 100 nm.

This optimized photonics process was integrated with an entire commercial low-power 65 nm bulk CMOS process on 300 mm wafers, featuring transistors with three different threshold voltages and two oxide thickness variants. Figure 5.2c shows bird's-eye SEMs of our monolithic platform with photonic components next to transistors with 60 nm channel length.

Figure 5.3: Monolithic electronic-photonic platform in 65 nm bulk CMOS. (a) Photograph of a fully fabricated 300 mm wafer with monolithic electronics and photonics, and close-ups of a reticle on this wafer, and a packaged WDM chiplet, (b) Micrograph of a WDM chiplet, (c) Close-up of a single transceiver macro and its photonic and electronic circuit components.

Figure 5.3a shows a photo of a fully fabricated wafer, and close-up photos of the entire reticle and one packaged chiplet composed of several WDM photonic transceiver rows. For this photo, the silicon substrate was entirely removed using XeF<sub>2</sub> gas after mounting the die on a carrier substrate, and the micrograph was taken from the backside of the die. The micrographs of the transceiver chiplet, the transmitter and receiver circuit blocks, and individual photonic components are shown in Fig. 5.3b and Fig. 5.3c. We were able to build a library of passive and active photonic components (waveguides, microring resonators, vertical grating couplers, high-speed modulators, and avalanche photodetectors) with performance similar to or better than previous demonstrations on polysilicon [31, 75, 117], next to circuit blocks composed of millions of transistors operating at native CMOS process specifications.

Electronic-Photonic Systems on Glass. The current work was aimed at integrating photonics into bulk CMOS technologies. However, a fully functional deposited photonic platform on glass transcends any one particular substrate or application. All of our partial-flow photonics-only silicon wafers were covered by a blanket of 1.5 µm-thick plasma-enhanced chemical vapor deposition (PECVD) silicon oxide (glass), on which we have fabricated optical waveguides, resonators, modulators, and photodetectors. These thin-film integrated photonic devices, along with the TFTs that are currently used in display panels, can enable 'electronic-photonic systems on glass'. These systems can be fabricated on low-cost large-area substrates such as metal foils, transparent glass, or even flexible substrates, as long as they are covered with roughly 1 µm of glass. Such a platform can enable a variety of new systems and applications which current electronic-photonic technologies cannot address due to substrate size or cost limitations. For example, several space and astrophysics applications, such as laser communications and astronomical spectroscopy, require large-area optics and detectors. Also, many optical phased array applications (lidars, augmented reality headsets, etc.) can benefit greatly from large-area integrated photonic circuits. An electronic-photonic platform on glass, enabled by the deposited polysilicon photonics demonstrated in this work, can address these application areas. The performance of photonics on this platform would be similar to that of the devices we have reported on partial-flow wafers in this work.

Photonic Integration into More Advanced CMOS. One great advantage of the thin-film photonics platform is that it decouples the photonic and electronic implementations, which opens the possibility of photonic integration into state-of-the-art CMOS. One of the more important milestones moving forward is to integrate photonics with FinFET bulk CMOS, which is already at sub-10 nm transistor nodes. Some features of the photonic process module in the current work might require modifications to facilitate integration with such advanced CMOS nodes. For example, the fabrication of the thick photonic isolation layer can be skipped and photonic structures can be fabricated on the transistor shallow trench isolation layer. The silicon substrate under the photonic devices can later be selectively removed in a post-processing step to eliminate any light leakage to the substrate, as previously shown in [6]. This would minimize processing steps prior to transistor fabrication, which could be important for compatibility with deeply-scaled CMOS nodes with nano-scale structures. Also, the photonic fabrication process can be moved completely after the transistor fabrication to eliminate any cross-compatibility integration issues. For example, photonic structures can be fabricated on the planarized pre-metal dielectric layer (the first dielectric film after transistor fabrication). This will require re-centering the transistor specifications by adjusting the anneal conditions of different implants. This approach is possible since our photonic process module only uses short spike anneal steps, which induce small changes in the doping profile of the transistors.



Figure 5.4: Photo of a reticle showing various device/system test structures.

### 5.2 Photonic Devices

The reticle area was divided into two main sections: the test device section with photonic and electrical test structures and the electronic-photonic microsystem section (WDM chiplets). Test structures are designed to measure and characterize waveguide loss, microring Q-factors, grating coupler efficiency, and sheet resistances, as well as standalone modulator and detector performance (Fig. 5.4). Various devices with different design parameters are fabricated to find the optimum design and extract the details of fabrication quality (e.g., estimating surface and sidewall roughness, doping activation, etc.). Photonic test devices were designed for both 1300 nm and 1550 nm wavelengths. Visible light waveguides can also be built in this platform after two post-processing steps [122].

Overall, approximately 1000 test devices were laid out on the reticle. A total of six transceiver SoC chiplets are also included, each occupying a 4.8 mm×5 mm area. Since the same set of masks was used for process optimization and system implementation, there were some uncertainties about the performance of photonic passives and actives. This required sweeping modulator and detector parameters in the WDM transceiver designs to cover a large enough range of device performance.

All photonic components were designed using two etch steps (partial and full). The thickness of the polysilicon layer (220 nm) and the depth of the partial etch (120 nm) were

chosen to optimize the overall performance of the whole platform, including the efficiency of grating couplers, radiation loss of the microring resonators, and series resistance of the modulators and detectors. The optimal width of the single-mode waveguides and diameter of the high-Q microrings were around 450 nm and 15  $\mu$ m, respectively, for 1300 nm operation. The high doping regions for ohmic contacts were  $\sim 1 \mu$ m away from the center of the ridge waveguides to avoid free-carrier loss.

#### 5.2.1 Loss Mechanisms in Polysilicon Waveguides

The fundamental problem in embedding photonics in bulk CMOS processes is the lack of a low-optical-loss material to use as the waveguide core. Because these processes have no c-Si layer, we decided to build waveguides out of polysilicon. However, the original gate polysilicon developed for transistors is not optimized for optical waveguiding, so its loss is intolerable. For instance, the gate polysilicon waveguide loss is 55 dB/cm (near 1550 nm) in a 65 nm process and 76 dB/cm (at 1280 nm) in a 28 nm CMOS process [123].

Here, we summarize the loss mechanisms of polysilicon waveguides (described in [123] with more details):

- Line edge roughness (LER)
- Absorption associated with the nitride and other dielectric liners used in the CMOS gate process
- Top surface roughness
- Polysilicon absorption due to defect states

LER is well controlled in CMOS processes as it may affect the gate electrical characteristics and S/D junction profiles. In fact, LER is even defined as a target metric in ITRS roadmaps for the development of next generation technology nodes. Based on the estimated roughness (for example ITRS 2011 target is 1.9 nm), the LER is expected to contribute less than 1.5 dB/cm of loss for a polysilicon based waveguide [124].

The second loss component is due to the dielectric and nitride liners used in CMOS technology for stress engineering. Although the nitride material loss could be as high as $20\,\mathrm{dB/cm}$, the small overlap of the optical mode and the nitride layers suggests an upper limit of loss closer to $9\,\mathrm{dB/cm}$. Additionally, the nitride liner can pull the optical mode toward the waveguide's surfaces and cause higher roughness losses due to LER and top surface roughness. The nitride liner also applies stress on the waveguides, which changes the grain size and boundaries and affects polysilicon absorption.

The dominant loss process in polysilicon waveguides is normally the top surface roughness. Since the top surface roughness of polysilicon gates does not impact the electrical functionality much, the roughness is not well characterized in CMOS technology. The polysilicon grains are deposited in a columnar structure that is optimized to promote vertical diffusion of



Figure 5.5: Cross-sectional TEM of a gate polysilicon waveguide integrated in a 28 nm bulk-CMOS process. The core thickness is approximately 73 nm with the surface roughness of 5 nm RMS with a 100 nm correlation length [12].

dopants along grain boundaries. Since the grains are growing during the deposition process, the surface is formed by the differential growth rates for the crystal orientations of each grain. Based on theoretical loss curves estimated from a published sidewall roughness analysis, the waveguide sensitivity to top-surface roughness is approximately $10\,\mathrm{dB/cm}$ per $1\,\mathrm{nm}$ RMS roughness for the measured $100\,\mathrm{nm}$ correlation length [124]. The resulting rough surface is also clearly visible in transmission electron microscopy (TEM) images (Fig. 5.5). The close agreement of the $50\,\mathrm{dB/cm}$ prediction with measurements offered further confirmation that the top surface roughness is the dominant contributor to the overall integrated polysilicon waveguide loss.
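The $50\,\mathrm{dB/cm}$ figure follows from the quoted sensitivity and the roughness measured in Fig. 5.5. The short sketch below reproduces that arithmetic. It assumes the linear rule of thumb stated in the text holds over the whole range; the function name is illustrative only.

```python
# Sensitivity quoted in the text: ~10 dB/cm of loss per 1 nm RMS of
# top-surface roughness, for the measured 100 nm correlation length.
SENSITIVITY_DB_PER_CM_PER_NM = 10.0


def top_surface_loss_db_per_cm(rms_roughness_nm: float) -> float:
    """Linear estimate of the scattering loss caused by top-surface roughness."""
    return SENSITIVITY_DB_PER_CM_PER_NM * rms_roughness_nm


# 5 nm RMS is the roughness reported for the TEM cross-section in Fig. 5.5.
predicted = top_surface_loss_db_per_cm(5.0)
print(f"predicted top-surface loss: {predicted:.0f} dB/cm")  # 50 dB/cm
```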

Light can also be absorbed or scattered by the polysilicon bulk material itself. This loss mechanism scales with the volume of the polysilicon material and can result from either defect state absorption or scattering at the grain boundaries. These two phenomena are expected to have inverse scaling relations with grain size: defect state absorption should be proportional to the relative volume of grain boundaries in the material and decreases with larger grain sizes. For grain sizes below the light wavelength, scattering is expected to result in larger propagation losses as the grain size, and therefore the correlation length of the index heterogeneity, increases.

In the work presented in this chapter, waveguide performance was optimized under multiple polysilicon deposition and anneal conditions through a series of different recipes in collaboration with CNSE. The primary fabrication optimization step was to minimize the top surface roughness of the polysilicon layer which is the dominant loss mechanism as explained above. In order to do so, two approaches have been taken to integrate low loss polysilicon waveguides: optimizing deposition and anneal conditions and temperatures, and post-deposition polish. Due to the differential crystal plane etch rates, we can also remove the chemical component of the polish to improve the process and use a purely mechanical polish for producing smooth polysilicon films.

We also note that we are using defect states of polysilicon for photodetection. This creates a trade-off between waveguide loss and the quantum efficiency of PDs as they have inverse relationships with grain size. The defect state density could be enhanced to enable all-silicon, infrared photodetectors [125] and smaller grain sizes provide larger density

of defect states increasing responsivity. However, we optimized the grain size to minimize the waveguide loss by refining the deposit, anneal, and polish recipes. Although this reduces the photodetection efficiency, we use resonant photodetectors to boost the overall responsivity of the PDs. Additionally, lower polysilicon loss leads to higher-Q resonators, which also improves the performance of resonant PDs.

#### 5.2.2 Defect States based Photodetection in Polysilicon

Integrated photonic platforms widely use pure germanium (Ge) or SiGe for photodetection due to their large absorption coefficient in the near-infrared optical band. However, they require dedicated high-temperature (>700 °C) epitaxial growth of Ge due to lattice mismatch, which limits the process compatibility with microelectronics [126]. Polycrystalline Ge photodetectors fabricated at low temperatures (<300 °C) have also been reported, but with relatively low efficiency (15%) due to the poor crystal quality. Moreover, they have been fabricated in the back-end of old CMOS processes, which are less sensitive to process change. Another approach is to use polycrystalline silicon. This material can be fabricated and optimized in microelectronic platforms with low-temperature processes. Photodetection in polysilicon is based on absorption by the defect states. Quantum efficiency can be enhanced to be greater than 5% by optimizing the polysilicon structure (adding defects and scaling the grain size). Resonant PDs can be utilized to achieve responsivities of greater than 0.2 A/W [75]. Furthermore, defect states lead to very wide-band light absorption, enabling photodetection from O-band (1310 nm) to L-band (1550 nm) [77].

Defects can generate energy states inside the bandgap of silicon, leading to the absorption of infrared photons. Absorption has been shown in moderately doped silicon waveguides with p-n junctions through the defects generated during ion implantation. Silicon has been intentionally implanted with high doping concentrations in these designs to introduce defects in order to increase the quantum efficiency of these photodetectors [127]. Resonant detectors using this technique have achieved responsivities in the range of 0.1 to $0.2\,\text{A/W}$ with a few GHz bandwidth [128]. A waveguide detector was also demonstrated with a responsivity as high as 6-10 A/W with 35 GHz bandwidth at 20 V bias by optimizing the anneal condition for activation of the defect states [129]. Polysilicon with grain boundary midgap states has also been used for building photodetectors and has achieved comparable or better performance compared to ion-implanted silicon resonant photodetectors. Polysilicon detectors at 1280 nm and 1550 nm with a responsivity of $0.2\,\text{A/W}$ and 8 GHz bandwidth (at $10\,\text{V}$ bias) have already been demonstrated in a modified bulk CMOS process [75]. A resonant polysilicon-based photodetector with $10\,\text{GHz}$ bandwidth and a responsivity of more than $0.14\,\text{A/W}$ has been demonstrated in the zero-change $45\,\text{nm}$ SOI process as well [77].

Here, we implement photodetectors that exploit the defect states of the polysilicon. This approach guarantees that all fabrication steps occur at low temperatures in order to be compatible with the front-end-of-line (FEOL) thermal budget of advanced CMOS technologies. We measured responsivities of greater than 0.1 A/W under low bias voltages (compared with previous work) by optimizing the deposit/anneal time and temperature of the polysilicon.

We also observed avalanche behavior in this type of polysilicon-based photodetector for the first time. More results on photonic devices are described in the next sections.

#### 5.2.3 Passive Devices

Ridge waveguides with 400 nm wide ridge geometries have been used to implement all passive and active devices. We avoided placing any of the first four metal layers (M1-M4) on top of the waveguides and devices to prevent extra metallic loss. Figure 5.6a summarizes the performance of passive components measured on partial-flow (passive photonics only) and full-flow (active and passive photonics with electronics) wafers, at a wavelength of 1300 nm. We achieved approximately $10 \, \mathrm{dB/cm}$ propagation loss for ridge and strip waveguides, and >20,000 loaded Q-factor for microring resonators on partial-flow wafers. Full-flow wafers exhibit higher loss; however, this issue did not significantly affect the performance of our optical transceivers: the $20 \, \mathrm{dB/cm}$ waveguide loss results in $3 \, \mathrm{dB}$ loss across the 10-lambda WDM rows, and the 10,000 loaded Q-factor of microring resonators is close to optimal for $10-20 \, \mathrm{GHz}$ bandwidth resonant modulators and detectors. Waveguide loss and resonator Q-factor are two times better at $1550 \, \mathrm{nm}$ (Fig. 5.7), but all optical transceivers were initially designed for $1300 \, \mathrm{nm}$.
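The two full-flow numbers quoted above can be sanity-checked with a short calculation. The sketch below is illustrative and rests on two assumptions not stated in the text: the $3\,\mathrm{dB}$ row loss is attributed entirely to propagation loss, and the optical linewidth $f_0/Q$ is used as a first-order ceiling on the bandwidth of a resonant device (the electrical RC limit is ignored).

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def row_length_mm(total_loss_db: float, loss_db_per_cm: float) -> float:
    """Waveguide length that accumulates total_loss_db at the given loss rate."""
    return total_loss_db / loss_db_per_cm * 10.0


def optical_linewidth_ghz(wavelength_nm: float, loaded_q: float) -> float:
    """Full width at half maximum of a resonance, f0 / Q, in GHz."""
    f0_hz = C_M_PER_S / (wavelength_nm * 1e-9)
    return f0_hz / loaded_q / 1e9


# 3 dB at 20 dB/cm implies a bus waveguide of about 1.5 mm per WDM row.
print(f"implied row length: {row_length_mm(3.0, 20.0):.2f} mm")

# Loaded Q of 10,000 (full flow) and 20,000 (partial flow) at 1300 nm.
for q in (10_000, 20_000):
    print(f"Q = {q}: linewidth ~ {optical_linewidth_ghz(1300.0, q):.1f} GHz")
```

Under these assumptions a loaded Q of 10,000 gives a linewidth of about 23 GHz, which leaves room for the 10-20 GHz device bandwidths mentioned above, while a Q of 20,000 would narrow it to roughly 11.5 GHz.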

Grating couplers (GCs) for coupling light into and out of the chip are designed using both the partial- and full-etch steps to construct a periodic L-shaped geometry (Fig. 5.6b). The thickness of the photonic trench should be optimized ( $\approx 1.5 \mu m$ ) such that the downward beam reflected from the silicon interface at the bottom of the trench interferes constructively with the main beam radiated upwards to improve the directionality and coupling efficiency. All metal layers have been blocked on top and around the grating in order to avoid any light blockage and extra loss. The measured grating transmission, shown in Fig. 5.6b, indicates a peak efficiency of  $-4.2\,\mathrm{dB}$  for the partial flow and  $-5.2\,\mathrm{dB}$  for the full flow, with  $1\,\mathrm{dB}$  bandwidth of around  $40\,\mathrm{nm}$ . The  $1\,\mathrm{dB}$  extra loss in the full flow is caused by the fabrication error in the photonic trench thickness which will be fixed for the next fabrication runs.
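To put the coupler efficiencies in linear terms, the sketch below converts the measured peak values from decibels to power fractions and adds up a hypothetical fiber-to-fiber budget. The path (one input coupler, one WDM row with the $3\,\mathrm{dB}$ full-flow loss quoted earlier, one output coupler) is an assumption made for illustration, not a measured result.

```python
def db_to_fraction(value_db: float) -> float:
    """Convert a power ratio in dB to a linear fraction."""
    return 10.0 ** (value_db / 10.0)


PARTIAL_FLOW_GC_DB = -4.2  # measured peak efficiency, partial flow
FULL_FLOW_GC_DB = -5.2     # measured peak efficiency, full flow
ROW_LOSS_DB = -3.0         # full-flow loss across one WDM row (assumed path)

print(f"partial-flow coupler: {db_to_fraction(PARTIAL_FLOW_GC_DB):.1%}")
print(f"full-flow coupler:    {db_to_fraction(FULL_FLOW_GC_DB):.1%}")

# Hypothetical fiber-to-fiber path: coupler in, one row, coupler out.
fiber_to_fiber_db = 2 * FULL_FLOW_GC_DB + ROW_LOSS_DB
print(f"fiber-to-fiber: {fiber_to_fiber_db:.1f} dB "
      f"({db_to_fraction(fiber_to_fiber_db):.1%} of the input power)")
```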

Discussion on Fabrication Results. The mask density rules assure that the material density during polishing, etching, and lithography is within an acceptable range over the entire reticle and wafer to eliminate pattern-dependent results and achieve a high fabrication yield. Nevertheless, the maximum density range for each layer (polysilicon, metals, etc.) is desirable for more design flexibility. In this fabrication run, we faced unforeseen issues with photonic trench planarization, due to a large density gradient of photonic trenches across the reticle field. This caused dielectric residues on the wafer after photonic trench planarization. We also experienced metal residues after the fabrication of the first via contact. Both of these issues led to a factor of two degradation in the passive photonic performance in full flow runs (20 dB/cm vs. 10 dB/cm for waveguide loss, and 10,000 vs. 20,000 for microring Q-factor at 1300 nm). Both of these issues are resolved through modified design rules, and optimized fabrication processes for the next fabrication run.



Figure 5.6: Photonics Platform Performance. (a) Passive components specifications at 1300 nm for partial- and full-flow wafers, (b) Transmission spectrum and the longitudinal cross-section of grating couplers.



Figure 5.7: Passive photonic performance at  $1300\,\mathrm{nm}$  and  $1550\,\mathrm{nm}$ . (a) Waveguide propagation loss at  $1300\,\mathrm{nm}$ . Waveguide loss drops with wavelength because of a combination of lower absorption and scattering by polysilicon, (b) Q-factor of a  $15\,\mathrm{\mu m}$  diameter microring resonator, (c) Waveguide propagation loss at  $1550\,\mathrm{nm}$ , (d) One resonance of a  $17\,\mathrm{\mu m}$  diameter microring near  $1540\,\mathrm{nm}$  with a Q-factor of 38,000.

Improvement of Passive Photonic Performance. We have already taken the necessary steps to improve the waveguide loss in our full-flow wafers by fixing the photonic isolation planarization issue on the next fabrication run. Hence, we expect to achieve the same passive photonics performance in our fully integrated wafers as that reported in partial-flow runs in this work. This means a factor of two improvement in waveguide loss (10 dB/cm) and microring Q-factor (20,000) by simply resolving the photonic isolation planarization issue.

The loss improvement will also affect the performance of active photonics. The loss improvement will result in a higher quantum efficiency (QE) for photo-detectors by reducing the fraction of photons lost by scattering and absorption. We expect 50-100% improvement in detector responsivity (0.15-0.2 A/W in the linear mode, and 2-2.6 A/W with avalanche gain). The improvement in the microring modulator Q-factor leads to a sharper resonance feature, which consequently reduces the drive voltage and lowers the transmitter power consumption.

The waveguide loss can be improved even further to about  $6\,\mathrm{dB/cm}$  across the entire telecom and datacom bands  $(1250-1700\,\mathrm{nm})$  by optimizing the polysilicon film. Currently, waveguide loss at shorter wavelengths  $(1300\,\mathrm{nm})$  is twice as high as at  $1550\,\mathrm{nm}$  (Fig. 5.7). The similarity of the detector QE (11%) at  $1300\,\mathrm{nm}$  and  $1550\,\mathrm{nm}$  suggests that the scattering and absorption losses are increasing proportionately at shorter wavelengths. As optical modes at shorter wavelengths are more confined in the waveguide core and exhibit less scattering by the sidewall roughness, the dominant mechanism for the increase in scattering loss at shorter wavelengths is most likely the increase in scattering by polysilicon grain boundaries. This source of scattering can be reduced by optimizing polysilicon deposition and anneal conditions such that the scattering correlation length is reduced.

We also expect an improvement in the GC efficiency in the next fabrication run. In the current run, a  $30\,\mathrm{nm}$  photolithography bias caused the dimensions of the grating couplers to be smaller than the nominal design values. This caused the grating coupler efficiency to drop from  $-1.8\,\mathrm{dB}$  in design to  $-5\,\mathrm{dB}$  in the fabricated devices. We are addressing this issue by optimizing the lithography step and pre-biasing the photolithography mask. Also, by utilizing a nonuniform grating design [130], we expect to improve the mode matching of the grating coupler to the Gaussian mode profile of the optical fiber. This will enable us to improve the coupler efficiency to below  $1\,\mathrm{dB}$  loss.

#### 5.2.4 Active Devices

Depletion-mode resonant modulators and defect-based photodetectors were implemented using ridge microring structures with lateral p-n and p-i-n diode junctions, respectively. Device micrographs are shown in Fig. 5.3c and 3D layouts of the modulator and resonant PD are drawn in Fig. 5.8a and Fig. 5.9a, respectively. Note that we built lateral-junction devices in ridge structures, unlike the previous demonstrations in zero-change platforms, where interleaved junction profiles were used. The reason is that the optical mode is confined within the proximity of the ridge, allowing us to place contacts on the wings of the waveguide without extra metallic loss.

Microheaters were incorporated in the microring modulators and detectors for tuning their resonances to the desired laser wavelength. We used silicided p+ polysilicon resistors for the microheaters. The fabricated microheaters had a resistance of  $\sim 100\Omega$ . These microheaters enabled us to implement closed-loop thermal tuning (similar to our design from Section 3.4.3) to compensate for thermal and process variations.

Two mid-level doping implants, p and n with concentrations of $6 \times 10^{18}$ cm<sup>-3</sup>, are optimized for active photonics. The transistor source and drain implants were reused for the high-doping regions (p++ and n++) under the contact areas of modulators and detectors. The resonant design enhances the modulation and detection effects and enables operating with CMOS-compatible voltages in this process (<1.5 V). The resonator cavity has a radius of 7.5 µm, leading to an FSR of 8.8 nm. Within this FSR range, we have placed 10 equally spaced WDM channels $(10-\lambda)$ in the current design. The FSR can be increased by reducing the radius of the ring resonators. This demands further improvements in polysilicon loss and surface roughness to maintain the high intrinsic Q-factor of the cavity.
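The relation between ring radius and FSR can be made concrete with the standard expression $\mathrm{FSR} = \lambda^2 / (n_g \cdot 2\pi R)$. The sketch below is a rough illustration only: it assumes operation at 1300 nm, infers the group index $n_g$ from the quoted radius and FSR (the text does not report $n_g$), and holds it fixed when evaluating a hypothetical 5 µm radius.

```python
import math


def fsr_nm(wavelength_nm: float, group_index: float, radius_um: float) -> float:
    """Free spectral range of a ring: lambda^2 / (n_g * 2*pi*R), in nm."""
    circumference_nm = 2.0 * math.pi * radius_um * 1e3
    return wavelength_nm ** 2 / (group_index * circumference_nm)


def group_index_from_fsr(wavelength_nm: float, fsr: float, radius_um: float) -> float:
    """Invert the FSR expression to infer the group index."""
    circumference_nm = 2.0 * math.pi * radius_um * 1e3
    return wavelength_nm ** 2 / (fsr * circumference_nm)


WAVELENGTH_NM = 1300.0  # assumed operating wavelength
n_g = group_index_from_fsr(WAVELENGTH_NM, fsr=8.8, radius_um=7.5)
channel_spacing_nm = 8.8 / 10  # ten equally spaced channels within one FSR

print(f"inferred group index: {n_g:.2f}")             # about 4.08
print(f"channel spacing: {channel_spacing_nm:.2f} nm")
# Hypothetical smaller ring, same group index assumed.
print(f"FSR at R = 5 um: {fsr_nm(WAVELENGTH_NM, n_g, 5.0):.1f} nm")
```

With the group index held fixed, shrinking the radius from 7.5 µm to 5 µm would widen the FSR from 8.8 nm to about 13.2 nm, at the cost of higher bending and roughness loss as noted above.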

The modulator uses a lateral p-n junction to modulate light by shifting the resonance wavelength through the free-carrier plasma dispersion effect [131] (Fig. 5.8b). By operating the p-n junction under full depletion, a 3 dB bandwidth of 16.8 GHz (Fig. 5.8c) and digital modulation at 10 Gb/s are achieved with only a 2 Vpp modulation signal (Fig. 5.8c inset). We achieved a modulation efficiency of $\sim$17 pm/V for the resonance wavelength shift.

The defect-based photodetector has a responsivity of $0.11\,\mathrm{A/W}$ (QE of $\sim 10\%$) near $1300\,\mathrm{nm}$ under very low bias voltages (Fig. 5.9b) using a resonant design that enhances the weak absorption in polysilicon (Fig. 5.9c inset). The width of the intrinsic region for the photodiode is $800\,\mathrm{nm}$ in the reported device. We have also observed avalanche gain for the first time in polysilicon photodetectors [132] at bias voltages above $8\,\mathrm{V}$ (Fig. 5.9c), leading to a responsivity of $1.3\,\mathrm{A/W}$ at $16\,\mathrm{V}$ bias with a noise equivalent power (NEP) of $0.27\,\mathrm{pW/\sqrt{Hz}}$. This device has a $3\,\mathrm{dB}$ bandwidth of more than $8\,\mathrm{GHz}$ under reverse bias voltages above $0\,\mathrm{V}$, reaching $11\,\mathrm{GHz}$ for $5\,\mathrm{V}$ bias (Fig. 5.9c). More measurement results on avalanche photodetectors are given in Fig. 5.10.
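The quoted responsivity and quantum efficiency are linked by $\mathrm{QE} = R \cdot hc / (q\lambda)$. The following sketch checks that $0.11\,\mathrm{A/W}$ at $1300\,\mathrm{nm}$ corresponds to roughly 10% QE, and computes the responsivity a detector would have at 100% QE without gain. The function names are illustrative.

```python
H_PLANCK = 6.62607015e-34   # J*s
C_M_PER_S = 299_792_458.0   # m/s
Q_ELECTRON = 1.602176634e-19  # C


def quantum_efficiency(responsivity_a_per_w: float, wavelength_nm: float) -> float:
    """QE = R * h*c / (q * lambda): electrons collected per incident photon."""
    photon_energy_ev = H_PLANCK * C_M_PER_S / (Q_ELECTRON * wavelength_nm * 1e-9)
    return responsivity_a_per_w * photon_energy_ev


def unity_qe_responsivity(wavelength_nm: float) -> float:
    """Responsivity (A/W) of an ideal detector with QE = 1 and no gain."""
    return Q_ELECTRON * wavelength_nm * 1e-9 / (H_PLANCK * C_M_PER_S)


print(f"QE at 0.11 A/W, 1300 nm: {quantum_efficiency(0.11, 1300.0):.3f}")
print(f"unity-QE responsivity at 1300 nm: {unity_qe_responsivity(1300.0):.2f} A/W")
```

The unity-QE limit at 1300 nm is about $1.05\,\mathrm{A/W}$, so the $1.3\,\mathrm{A/W}$ measured at $16\,\mathrm{V}$ can only be reached with internal gain, which is consistent with the avalanche behavior reported above.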

Improvement of Active Photonic Performance. The bandwidth of the modulators and detectors can also be improved by optimizing the doping profiles. In the current run, we employed the same source and drain implants for the low-resistance regions (p++ and n++) of our modulators and detectors. These implants have concentrations that are designed for the shallow source and drain of deeply scaled transistors and are not high enough to provide low series resistance for the 100 nm-thick polysilicon regions in our photonic devices. By using dedicated implant masks and high implant dosages, and by further optimizing dimensions in all doping profiles, we expect to improve the bandwidth of modulators by 25-50% to 20-24 GHz based on the estimates for the contributions of mid-level (p and n) and high-level doping (p++ and n++) regions to the series resistance. As for the detectors, we also expect an improvement in bandwidth by reducing the width of the intrinsic region from 800 nm to reduce the transit time, which is currently limiting the speed of the device. We expect



Figure 5.8: Photonics Platform Performance. (a) Microring modulator 3D layout, (b) Transmission spectrum of a modulator resonance with loaded Q-factor of 5,000, (c) Modulator electro-optic frequency response (S21) and the eye-diagram obtained with 2 Vpp drive voltage.



Figure 5.9: Photonics Platform Performance. (a) Microring photodiode 3D layout, (b) Responsivity vs. reverse bias voltage. Avalanche gain is observed at biases above 8 V, (c) Photodiode frequency response (S21) under 0 V and 5 V reverse bias with 3 dB bandwidths of 8 GHz and 11 GHz, respectively. Inset shows the eye diagram obtained under 5 V bias.



Figure 5.10: Polysilicon avalanche photo-detector. (a) I-V curve of the microring photo-diode under dark and illumination for input optical power of 20 $\mu$W. Dynamic range is ~60 dB and ~10 dB at 0 V and 16 V, respectively. (b) One microring photo-detector resonance and the corresponding photo-current as the wavelength is swept across the resonance. (c) NEP (blue curve) of the photo-diode estimated based on the dark-current shot noise, which dominates the detector noise. Avalanche gain is 13 at 16 V bias, with an NEP of 0.27 pW/$\sqrt{\rm Hz}$. Simulated SNR (red curve) at the output of the optical receiver, assuming 1 $\mu$W optical signal, and a receiver circuit input-referred noise spectral density of 1 pA/$\sqrt{\rm Hz}$. (d) The responsivity of the photo-diode vs. input optical power, showing minimal power dependency. (e) and (f) Eye-diagrams at 12.5 Gb/s at 0 V and 14.5 V reverse bias.



Figure 5.11: Cross-section of a WDM photonic SoC packaging.

that 50% reduction of the intrinsic region width to 400 nm would not significantly affect the Q-factor and responsivity of detectors, while reducing the depletion width and transit time. A careful optimization of the intrinsic region width combined with the reduction of RC time constant by adjusting the implant conditions can improve the bandwidth of detectors by 25-50% to 14-16.5 GHz in the linear mode, and to 10-12 GHz in the avalanche mode.

### 5.3 Design of a Photonic SoC in Bulk CMOS

As a first demonstration of monolithic electronic-photonic integrated systems in this platform, we have implemented high-bandwidth photonic WDM transceivers. We designed a total of six chiplets, each containing four stand-alone WDM transmitter and receiver rows supporting up to 16 channels. Different designs for resonant modulators and photodetectors were used in WDM rows. These chiplets are diced and wire-bond packaged with 100 pads to provide DC supplies, bias signals, digital I/O, and high-speed clocks for electro-optical testing. Chips were assembled in ceramic packages (CPG20809) with 100 wirebond connections (Fig. 5.11). These packages were plugged into a socket on a host board PCB, which delivers supplies, bias signals, high-speed clock (from an Agilent 81142A pulse generator), and control scan signals from an Opal-Kelly FPGA. Scan commands for each measurement are set in Python scripts on a computer and then sent to the FPGA to configure the chip for each particular experiment.

Figure 5.12 shows the top-level floor-plan of these chiplets. Two separate high-speed CMOS clock distribution networks have been used to bring reference clocks for transmitters and receivers. Local delay lines are embedded in each macro to compensate clock delay between macros and synchronize timing for different channels in a row. By integrating all of the analog and digital blocks, data-stream generation and bit error estimation for transceivers are performed on the chip.



Figure 5.12: WDM photonic SoC top-level (CMOS clock distribution networks for transmit and receive clocks are shown in red).

#### 5.3.1 Chip Implementation

We adopted a photonic SoC design flow similar to the one described in Section 2.4. A modified PDK has been developed for our new platform by leveraging the 65 nm CMOS node's original PDK. The 5 new layers (photonic trenches, polysilicon, poly partial etch, and dopings) are added to the PDK along with their manufacturing DRC rules. SKILL primitives are also redefined for automatic photonic layout generation. Finally, we modified our photonic behavioral models based on the expected parameters for the new active and passive device designs in this process.

Photonic device layouts were developed and drawn in Cadence Virtuoso (an industry-standard design tool for front-end electronics in conjunction with mixed-signal electronics [80]). Digital electronics were implemented using a combination of digital synthesis and place and route tools from Cadence. All photonic and electronic designs conform to the

65 nm CMOS technology manufacturing rules (more than 5000 rules). New design rules were added to the original CMOS rules for the new photonics masks that were added to the process. The most critical mask rules with the introduction of photonics into the process are density rules. This is important as photonics and electronics occupy separate regions on the chip, but their respective masks have to maintain a certain maximum and minimum density of shapes across the whole design area. These density rules were met by custom fill shapes designed by our team. Density fill shapes can be seen in the SEM images in Fig. 5.2c. The physical design verification was performed using Mentor Graphics Calibre.

#### 5.3.2 Electrical Performance Evaluation

In order to examine the performance and process variation of transistors after introducing the photonics module into the CMOS process, electrical ring oscillators composed of 15 equally-sized inverting stages (Fig. 5.13a) are used inside every transceiver macro to probe the speed of transistors, as well as the intra- and inter-die variations. The fastest transistors (low-threshold voltage) in the process with gate lengths of 55 nm were used for the ring oscillator design. In order to read out the ring-oscillator frequencies, the output of the oscillator is fed into an asynchronous digital divider (divide by 8) and the divided clock runs a digital counter block. Then, the oscillator's frequency was estimated by scanning out the counter's value.

Figure 5.13b shows the histogram of the ring oscillators' measured frequencies and compares it against the Monte-Carlo simulation results based on the original PDK provided by the foundry. Each peak in the measured or simulated distributions can be associated with the slow-slow (SS), typical-typical (TT), and fast-fast (FF) corners in CMOS modeling. Thus, each of these distributions can be decomposed into three Gaussian distributions. These Gaussian components are broader in the Monte-Carlo simulation. This is because we assumed that all model parameters vary according to normal distributions, which is not necessarily the case for intra-wafer variations. However, this provides a worst-case estimate of the oscillator frequencies. At the TT corner, the nominal frequency from the simulations should be 2.33 GHz (single-stage delay of 14.3 ps). The comparison of the distributions confirms that the $3\sigma$ span is the same for both the measurement and the simulation, and hence our photonic process module does not degrade the performance of the transistors.
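The nominal frequency follows from the usual ring-oscillator relation $f = 1/(2 N t_d)$ for $N$ inverting stages with stage delay $t_d$. The sketch below checks the 2.33 GHz value and illustrates how a frequency could be recovered from the divide-by-8 counter readout. The counting window and the example counter value are hypothetical, since the text does not specify them.

```python
def ring_oscillator_freq_hz(n_stages: int, stage_delay_s: float) -> float:
    """f = 1 / (2 * N * t_d): one period is two traversals of the ring."""
    return 1.0 / (2.0 * n_stages * stage_delay_s)


def freq_from_counter_hz(count: int, window_s: float, divide_ratio: int = 8) -> float:
    """Recover the oscillator frequency from a counter clocked by the divided output."""
    return divide_ratio * count / window_s


# 15 stages with a 14.3 ps stage delay (TT corner, from simulation).
f_nominal = ring_oscillator_freq_hz(15, 14.3e-12)
print(f"nominal frequency: {f_nominal / 1e9:.2f} GHz")  # 2.33 GHz

# Hypothetical readout: 29,138 counts in an assumed 100 us window.
f_measured = freq_from_counter_hz(count=29_138, window_s=100e-6)
print(f"frequency from counter: {f_measured / 1e9:.3f} GHz")
```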

#### 5.3.3 Electronic-Photonic WDM Transceivers

Figure 5.3b shows the die photo of the full electronic-photonic SoC containing four standalone WDM transmitter and receiver rows, each supporting up to 16 channels. This chip contains 64 transceiver macros in total. Approximately 30,000 logic gates ($\sim 0.5$ million transistors) from IP standard cells are used in each transceiver macro to form all the necessary digital circuitry, including serialization and deserialization, scan chain configuration bits, and the thermal tuning control loop. Additionally,



Figure 5.13: (a) Embedded ring-oscillator test block in each transceiver macro for electrical performance and variation inspection, (b) Histogram comparison of measured frequencies with Monte-Carlo simulation results from the original PDK of the 65 nm CMOS process.

macros include analog front-end of the transmitter and receiver and auxiliary clocking blocks to correct duty-cycle and delay in between channels.

Test Setup. Figure 5.14 depicts the test setup for a WDM optical transmit and receive row. Input light in the O-band is coupled into the transmitter row from a tunable Santec laser source. The transmitter's eye-diagram was captured via an external Ortel photoreceiver with 10 GHz bandwidth on an electrical DCA scope by running the on-chip PRBS modules at a 5 GHz external clock frequency. The receiver's BER test was performed by first programming a KC705 Xilinx FPGA with the same PRBS coefficients used for our on-chip bit-error-rate (BER) checkers. The output of the FPGA was then amplified using a high-voltage modulator driver (JDSU H301), which then drives an external JDSU MZI optical modulator. The modulated light is amplified using a semiconductor optical amplifier (Thorlabs BOA1130) and is coupled into the chip.

The transmitter is composed of a full digital backend that generates a pseudo-random binary sequence (PRBS) signal, a serializer with an 8 to 1 ratio, and finally an inverter chain that drives the microring modulator (Fig. 5.15a).

On the receiver side, a trans-impedance amplifier (TIA) analog front-end converts and amplifies the received photo-current into a voltage signal, and a pair of double-data-rate (DDR) samplers converts the signal into the digital domain [4] (Fig. 5.15b). These bits are deserialized and fed into a BER checker on the chip. The generated PRBS signal and BER data are controlled and monitored via on-chip scan chains to measure the functionality and performance of the transceivers.

Figure 5.14: Electro-optical test setup of WDM transceiver chips and block diagram of one WDM transmit-receive row.

Figure 5.15: (a) Block-diagram of a WDM transmitter, (b) Block-diagram of a WDM receiver, (c) 10 Gb/s transmit eye-diagram using the on-chip PRBS generator, (d) 7 Gb/s receiver bathtub curve, (e) Thermal tuning of one WDM channel using the integrated microheater.

Figure 5.16: Block-diagram of the receiver's analog front-end.

The TIA-based receiver circuit has a pseudo-differential front-end with a cascode pre-amplifier feeding into double-data-rate (DDR) sense amplifiers and dynamic-to-static converters (D2S) (Fig. 5.16). The TIA stage with $3\,\mathrm{k\Omega}$ feedback contains a 5-bit current bleeder at the input node, which is set to the average current of the photodiode. This allows the TIA input and output to swing around the midpoint voltage of the inverter. The TIA input and output are directly fed into a cascode amplifier with resistive pull-up. The bias voltage of the cascode is tuned through a 5-bit DAC. Adjusting this bias voltage results in a trade-off between the output common-mode voltage and the signal gain of the cascode stage. More specifically, increasing this bias voltage results in a higher cascode gain but a lower output common-mode voltage, which reduces the sense-amplifier speed. For a given data rate, an optimal bias voltage is determined so as to minimize the overall evaluation time of the sense amplifier. The subsequent sense amplifiers then evaluate the cascode outputs, which are deserialized and fed into the on-chip BER checkers. Each sense amplifier has a coarse, 3-bit current-bleeding DAC as well as a fine, 5-bit capacitive DAC for offset correction. During the initial seeding phase, the incoming receiver data are used to seed the on-chip PRBS generators for the BER check. The receiver and deserializer achieved a 7 Gb/s data rate with a BER below $10^{-10}$ at a photocurrent sensitivity of $26\,\mu\mathrm{A}$.
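The self-seeded BER check described above can be sketched in software. This is only an illustration under stated assumptions: the text does not give the PRBS polynomial used on chip, so PRBS-7 ($x^7 + x^6 + 1$) is assumed here, and the function names are hypothetical. The checker loads the first received bits into a local generator, lets it free-run, and counts mismatches.

```python
PRBS_ORDER = 7  # assumption: the source does not state the PRBS order


def next_bit(state):
    """Feedback bit for the PRBS-7 polynomial x^7 + x^6 + 1."""
    return ((state >> 6) ^ (state >> 5)) & 1


def prbs7(seed, n_bits):
    """Return n_bits of PRBS-7 output starting from a nonzero 7-bit seed."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be nonzero")
    bits = []
    for _ in range(n_bits):
        bit = next_bit(state)
        state = ((state << 1) | bit) & 0x7F
        bits.append(bit)
    return bits


def count_bit_errors(received):
    """Seed a local generator from the first 7 received bits (seeding phase),
    then free-run it and count mismatches against the remaining bits."""
    state = 0
    for bit in received[:PRBS_ORDER]:
        state = ((state << 1) | bit) & 0x7F
    errors = 0
    for bit in received[PRBS_ORDER:]:
        expected = next_bit(state)
        errors += int(expected != bit)
        state = ((state << 1) | expected) & 0x7F
    return errors


tx = prbs7(seed=0x5A, n_bits=1000)
rx = list(tx)
rx[500] ^= 1  # inject one bit error after the seeding phase
print(count_bit_errors(tx), count_bit_errors(rx))
```

Because the local generator free-runs after seeding, a single corrupted bit is counted once rather than propagating, provided the seeding bits themselves were received correctly.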

The on-chip clock adjustment circuits are composed of a duty-cycle corrector (DCC) and a delay line (DL). These circuits are used to synchronize the timing of different transceivers and to perform open-loop clock-data recovery (CDR) (Fig. 5.17). The DCC can correct up to 20% duty-cycle variation, and the DL has 3 control bits with an LSB size of 20 ps and the capability to invert the clock signal.
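As a rough illustration of the delay-line range, the sketch below enumerates the clock phases reachable with 3 control bits, a 20 ps LSB, and optional inversion. It assumes that inversion acts as a half-period shift (true for a 50% duty-cycle clock) and that a 7 Gb/s DDR receiver runs from a 3.5 GHz clock; neither assumption is stated explicitly in the text, and the function names are hypothetical.

```python
LSB_PS = 20.0  # delay-line step size from the text
N_CODES = 8    # 3 control bits


def clock_delay_ps(code, invert, period_ps):
    """Delay of one DL setting, modulo the clock period."""
    if not 0 <= code < N_CODES:
        raise ValueError("code must fit in 3 bits")
    delay = code * LSB_PS
    if invert:
        delay += period_ps / 2.0  # assumption: inversion = half-period shift
    return delay % period_ps


def reachable_delays(period_ps):
    """All 16 settings (8 codes x 2 polarities), sorted by delay."""
    return sorted(
        clock_delay_ps(code, invert, period_ps)
        for invert in (False, True)
        for code in range(N_CODES)
    )


def largest_gap_ps(period_ps):
    """Largest spacing between adjacent reachable phases, including wrap-around."""
    delays = reachable_delays(period_ps)
    gaps = [b - a for a, b in zip(delays, delays[1:])]
    gaps.append(delays[0] + period_ps - delays[-1])
    return max(gaps)


period_ps = 1e3 / 3.5  # assumed 3.5 GHz clock for 7 Gb/s DDR sampling
print(reachable_delays(period_ps))
print(largest_gap_ps(period_ps))
```

Under these assumptions the 140 ps span of the delay line plus the inversion covers the whole clock period with steps no larger than one LSB.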

The operation of one channel of a $10\text{-}\lambda$ WDM transceiver is shown in Fig. 5.15. Modulators were operated in the depletion mode (0 to $-1.5\,\mathrm{V}$ voltage swing) at a $10\,\mathrm{Gb/s}$ data rate with an extinction ratio of $4.7\,\mathrm{dB}$ (Fig. 5.15c). The BER bathtub curves in Fig. 5.15d show the measured BER at each time-delay point between the clock fed into the chip and the FPGA reference clock. Note that the FPGA reference clock determines the timing of the externally modulated light. The receiver achieved a BER better than $10^{-10}$ with $-3\,\mathrm{dBm}$ input optical



Figure 5.17: Block-diagram of the duty-cycle corrector (DCC) and delay line (DL).

power sensitivity at 7 Gb/s, as shown in Fig. 5.15d. The speed of receivers is currently limited by the long on-chip clock distribution network and can be further improved by integrating local clock generators [4].
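For readers who want the reported figures in linear units, the following sketch converts the $-3\,\mathrm{dBm}$ sensitivity and $4.7\,\mathrm{dB}$ extinction ratio, and estimates how many error-free bits are needed to claim a BER below $10^{-10}$. The 95% confidence level and the zero-error criterion are assumptions chosen for illustration; the text does not state how long each BER point was measured.

```python
import math


def dbm_to_mw(power_dbm):
    """Convert optical power from dBm to milliwatts."""
    return 10 ** (power_dbm / 10.0)


def db_to_ratio(value_db):
    """Convert a power ratio in dB (e.g. extinction ratio) to linear scale."""
    return 10 ** (value_db / 10.0)


def bits_for_zero_error_confidence(ber, confidence=0.95):
    """Bits that must be received without error to claim the true BER is
    below `ber` at the given confidence level (standard zero-error bound)."""
    return -math.log(1.0 - confidence) / ber


sensitivity_mw = dbm_to_mw(-3.0)   # about 0.5 mW
er_linear = db_to_ratio(4.7)       # about 2.95
n_bits = bits_for_zero_error_confidence(1e-10)
test_time_s = n_bits / 7e9         # at the 7 Gb/s receiver data rate
print(sensitivity_mw, er_linear, n_bits, test_time_s)
```

At 7 Gb/s the zero-error bound corresponds to a few seconds of error-free reception per delay point.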

Thermal tuning controllers and heater drivers are also included in the transceivers, to adjust for microring resonance fluctuations due to temperature and process variations [42]. Using the digital-to-analog converters in the thermal tuning controllers and heater drivers, we measured  $45\,\mu\text{W}/\text{GHz}$  tuning efficiency for integrated microheaters on microring modulators and detectors (Fig. 5.15e).
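The measured tuning efficiency translates directly into a heater power budget. The sketch below is a back-of-the-envelope helper: it assumes the $45\,\mu\text{W}/\text{GHz}$ figure holds linearly over the tuning range and uses a 1310 nm center wavelength as a representative O-band value. Both are assumptions, and the function names are hypothetical.

```python
C_M_PER_S = 299_792_458.0    # speed of light
TUNING_UW_PER_GHZ = 45.0     # measured tuning efficiency from the text


def heater_power_uw(shift_ghz):
    """Heater power (in microwatts) for a resonance shift, assuming linear tuning."""
    return TUNING_UW_PER_GHZ * abs(shift_ghz)


def wavelength_shift_to_ghz(shift_nm, center_nm=1310.0):
    """Convert a small wavelength shift to a frequency shift: df = c * dl / l^2."""
    shift_m = shift_nm * 1e-9
    center_m = center_nm * 1e-9
    return C_M_PER_S * shift_m / center_m ** 2 / 1e9


shift_ghz = wavelength_shift_to_ghz(1.0)  # 1 nm of resonance drift
print(shift_ghz, heater_power_uw(shift_ghz) / 1e3, "mW")
```

Under these assumptions, correcting 1 nm of resonance drift near 1310 nm costs on the order of 8 mW per ring.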

The total electrical energy consumption of the complete transmitter and receiver was $100\,\mathrm{fJ/b}$ and $500\,\mathrm{fJ/b}$, respectively. The transceiver achieved a bandwidth density of $180\,\mathrm{Gb/s/mm^2}$ with 10% of the effective area occupied by photonics, which can be reduced to 5% by optimizing the floorplan. Incorporating this photonics platform in advanced sub-10 nm technology nodes with higher transistor densities [133] would lead to $>2\,\mathrm{Tb/s/mm^2}$ bandwidth densities, meeting the needs of next-generation SoCs. The total expected energy efficiency, including the laser and high-speed clocking blocks, after optimizing the device performance is $<2\,\mathrm{pJ/b}$ at a $40\,\mathrm{Gb/s}$ data rate.
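These energy and density figures can be turned into per-channel power and area with simple arithmetic. The sketch below assumes the transmitter energy applies at 10 Gb/s and the receiver energy at 7 Gb/s, which are the demonstrated data rates; the text does not state the operating point of the energy measurement explicitly.

```python
def power_mw(energy_fj_per_bit, rate_gbps):
    """Electrical power in mW for a given energy per bit and data rate."""
    return energy_fj_per_bit * 1e-15 * rate_gbps * 1e9 * 1e3


def area_per_channel_mm2(rate_gbps, density_gbps_per_mm2):
    """Effective area one channel occupies at a given bandwidth density."""
    return rate_gbps / density_gbps_per_mm2


tx_power = power_mw(100.0, 10.0)            # assumed 10 Gb/s transmitter
rx_power = power_mw(500.0, 7.0)             # assumed 7 Gb/s receiver
channel_area = area_per_channel_mm2(10.0, 180.0)
density_gain_needed = 2000.0 / 180.0        # to reach 2 Tb/s/mm^2
print(tx_power, rx_power, channel_area, density_gain_needed)
```

The last value shows that the projected $>2\,\mathrm{Tb/s/mm^2}$ requires roughly an 11-fold density improvement over the 65 nm demonstration.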

## 5.4 Summary

We have demonstrated a monolithic electronic-photonic platform in a 300 mm-wafer commercial bulk CMOS process for the first time. This scheme, implemented in a 65 nm bulk CMOS process, is a proof-of-concept for the viability of embedding monolithic photonics in the most advanced CMOS nodes with sub-10 nm technology. The cost overhead of this method is negligible, as we minimized the number of additional masks. Note that the cost of any CMOS technology is dominated by the number and type of required fabrication masks. We have enabled monolithic photonics by adding only five inexpensive, binary chrome-on-glass masks to the 65 nm process, which already has more than 30 masks. This cost overhead in state-of-the-art nodes like FinFET, with more than 70 masks, will be even lower (less than 10%). This cost-effective photonic platform and integration approach show a path forward for adding photonic functions on a variety of substrates to enable the next generation of systems on a chip for computing, communications, imaging, and sensing.
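The mask-count argument can be checked with a one-line calculation. The sketch below treats every mask as equally expensive, which is a simplifying assumption: the text notes that the added masks are inexpensive binary chrome-on-glass masks, so the true cost overhead should be lower than these ratios.

```python
def mask_overhead_percent(extra_masks, baseline_masks):
    """Added masks as a percentage of the baseline mask count."""
    return 100.0 * extra_masks / baseline_masks


overhead_65nm = mask_overhead_percent(5, 30)    # baseline of "more than 30" masks
overhead_finfet = mask_overhead_percent(5, 70)  # baseline of "more than 70" masks
print(round(overhead_65nm, 1), round(overhead_finfet, 1))
```

Since both baselines are lower bounds on the mask count, the two percentages are upper bounds, and the FinFET case stays below the 10% quoted above.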

We have implemented $10\,\mathrm{Gb/s}$ $10\text{-}\lambda\,\mathrm{WDM}$ optical transceivers as the first electronic-photonic system design. The demonstrated optical transceivers in bulk CMOS are an important milestone toward terabyte-scale optical interconnects for direct integration with logic and memory to improve the performance of computing systems, which is currently limited by the chip input/output bandwidth.

# Chapter 6

## Conclusion

## 6.1 Thesis Summary

For decades, the exponential growth of the transistor count in an integrated circuit (Moore's law), along with the integration of new devices like micro-electro-mechanical systems (MEMS), led to the invention of impactful computers and mobile system-on-chip platforms. Today, transistor scaling is slowing down, while there are even greater demands and expectations placed on integrated systems. I believe the next wave of innovations and technologies in integrated systems begins with the fusion of photonics and electronics. The integration of photonic devices with state-of-the-art electronics advances both cloud and edge computing, and unlocks new functionalities for sensing and imaging by exploiting lightwave properties. Therefore, the seamless integration of photonic and other emerging devices with state-of-the-art electronics empowers future computation and sensing integrated systems for new scientific discoveries, smarter human-machine interactions, and better healthcare. The demonstrations of this thesis set the stage for this seamless photonics integration in memory technologies and the most advanced CMOS nodes.

Figure 6.1 illustrates a summary and time-line of the integrated systems described in this thesis. First, we described a monolithic photonics platform in an advanced 45 nm SOI CMOS process. We showed how we implemented a full photonic transmitter in this process without any change or modification (zero-change). A ring-resonator based 40 Gb/s PAM-4 optical transmitter has been demonstrated in this platform, achieving the record energy and bandwidth density by using a new ODAC photonic device and co-optimizing electronics and photonics. We have also studied and addressed the non-linearity and thermal tuning issues of ring-resonator based PAM-4 transmitters. Next, we implemented our zero-change platform in the 32 nm CMOS node in order to show the extendability of the zero-change approach to a more complex and advanced CMOS node. In doing so, we improved both electronic and photonic performance. We demonstrated new opportunities for improving the photonics performance by exploiting new features of this process. The zero-change platforms in 45 nm and 32 nm CMOS can be "sweet-spots" for placing photonics monolithically in one of the fastest CMOS processes demonstrated. Achieving ultra-high energy efficiencies of $<4\,\mathrm{pJ/b}$ and bandwidth densities of around $0.5\,\mathrm{Tb/s/mm^2}$ for complete transceivers at no extra fabrication cost makes these photonic transceivers suitable for in-package heterogeneous integration with high-performance SoCs.

With adoption by major semiconductor foundries, prospects for silicon photonics are bright. A combination of a slow-down in transistor performance scaling for advanced process nodes and successful photonic device and transceiver demonstrations with state-of-the-art energy, bandwidth-density and cost performance in the 45 nm platform points to these process nodes as potential strongholds for a variety of integrated electronic-photonic systems-on-a-chip. These fully-integrated electronic-photonic SoCs will be able to offload the communication work from co-packaged, more deeply scaled compute and memory SoCs, as well as perform other complex sensing and communication functions. Furthermore, the advanced lithographic capability and some limited process customizations of these process nodes will likely enable an even more powerful class of photonic devices and electronic-photonic systems to address future applications. The position of the 32 nm and 45 nm SOI platforms as planar process nodes with the fastest transistor and analog/mixed-signal performance adds the possibility of inexpensive process customizations, such as photonic structure doping optimizations, that would further improve the device performance for a number of applications, even beyond photonic interconnects.

Finally, the work in this thesis resolved a major gating issue in unifying state-of-the-art microelectronic processes with photonics for high-performance and large-scale electronic-photonic SoCs. Advanced CMOS platforms like FinFET are all fabricated on bulk silicon wafers, unlike photonics, which requires SOI platforms. In order to resolve this disparity, we demonstrated monolithic photonics in a 300 mm-wafer 65 nm commercial bulk CMOS process for the first time. Wafer-level process development requires careful and strategic planning due to the high level of design and fabrication complexity. We designed and implemented the photonic SoC in this platform for in-situ device characterization and process development, and demonstrated wavelength division multiplexed (WDM) optical transceivers. The cost overhead of this solution is negligible: by optimizing the photonic fabrication steps and re-using the CMOS mask set, it requires only 5 additional masks, while advanced CMOS processes (sub-10 nm) demand over 70 masks. This solution significantly improves the energy efficiency and bandwidth density of the optical transceivers and also enables monolithic implementation of optical transceivers directly on high-performance SoCs such as CPU/GPUs, memory chips, and switch ASICs. Moving forward, the integration and complexity of electronic-photonic integrated systems will be improved by mature monolithic photonics in sub-10 nm CMOS nodes. At that turning point, these integrated systems can benefit from Moore's law trends and the economies of scale of microelectronics.



Figure 6.1: Summary and time-line of all integrated systems described in this thesis.

## 6.2 Next-generation Electronic-Photonic Integrated Systems

This thesis focused on developing multiple electronic-photonic integrated technologies to embed photonics monolithically with advanced CMOS technologies. Additionally, we provided insights on electronic and photonic co-design and co-optimization, and on opportunities in advanced CMOS processes to implement large-scale and energy-efficient electronic-photonic integrated systems. As the immediate application, we discussed opportunities and demonstrations in the area of optical interconnects and networks. Although this application has emerged in the last two decades, today's demand for energy-efficient and high-bandwidth density optical transceivers is even more critical due to the advent of latency-sensitive applications such as machine learning (ML) and the exponential growth of data sets. In [13], we demonstrated a microsecond optical switching network (Fig. 6.2) between processor SoCs with optical I/O in the zero-change 45 nm platform [6] using MEMS-actuated adiabatic coupler switches [134] in silicon photonic technologies. Achieving microsecond-scale switching in conjunction with high-bandwidth and energy-efficient WDM photonic SoCs opens up new possibilities for imminent disaggregated and heterogeneous data-center and HPC architectures. Data movement and memory access dominate today's computing latency and energy, which can be significantly improved by utilizing monolithically integrated optical transceivers in computing SoCs and interconnecting them via high-radix and fast silicon photonic switches at both intra- and inter-cluster levels. This is one example of how emerging electronic-photonic integrated systems can revolutionize computing paradigms in the future.



Figure 6.2: (a) Block-diagram of the microsecond optical switching network demonstration, (b) Silicon photonics MEMS switch chip, (c) Electro-optically packaged processor SoC [13].

For the last decade, optical interconnects and networks have been the primary application target for silicon-photonic platforms. Even in this domain, further advances are possible by co-optimizing electronics and photonics and utilizing more advanced device concepts implementable in state-of-the-art CMOS processes, with fast and low-cost development cycles. With the inherently high connection density between transistors and photonics in monolithic platforms, these platforms are also suitable for emerging photonic applications such as phased arrays [135] for lidar and free-space optical communication, as well as molecular sensing arrays [136, 137]. Beyond classical applications, integrated photonic platforms hold great promise for quantum communication and computing [138, 139], as well as low-energy interconnects from cryogenic environments.

# **Bibliography**

- [1] S. Rumley, M. Bahadori, R. Polster, S. D. Hammond, D. M. Calhoun, K. Wen, A. Rodrigues, and K. Bergman, "Optical interconnects for extreme scale computing systems," *Parallel Computing*, vol. 64, pp. 65–80, 2017, High-End Computing for Next-Generation Scientific Discovery.
- [2] C. Sun, "Silicon-photonics for VLSI systems," Ph.D. thesis, MIT, 2015.
- [3] P. D. Dobbelaere, A. Dahl, A. Mekis, B. Chase, B. Weber, B. Welch, D. Foltz, G. Armijo, G. Masini, G. McGee, G. Wong, J. Balardeta, J. Dotson, J. Schramm, K. Hon, K. Khauv, K. Robertson, K. Stechschulte, K. Yokoyama, L. Planchon, L. Tullgren, M. Eker, M. Mack, M. Peterson, N. Rudnick, P. Milton, P. Sun, R. Bruck, R. Zhou, S. Denton, S. Fathpour, S. Gloeckner, S. Jackson, S. Pang, S. Sahni, S. Wang, S. Yu, T. Pinguet, Y. D. Koninck, Y. Chi, and Y. Liang, "Advanced silicon photonics technology platform leveraging a semiconductor supply chain," in 2017 IEEE International Electron Devices Meeting (IEDM), Dec 2017, pp. 34.1.1–34.1.4.
- [4] K. T. Settaluri, S. Lin, S. Moazeni, E. Timurdogan, C. Sun, M. Moresco, Z. Su, Y.-H. Chen, G. Leake, D. LaTulipe, C. McDonough, J. Hebding, D. Coolbaugh, M. Watts, and V. Stojanović, "Demonstration of an optical chip-to-chip link in a 3D integrated electronic-photonic platform," in *European Solid-State Circuits Conference* (ESSCIRC), ESSCIRC 2015 41st, Sept 2015, pp. 156–159.
- [5] C. Sun, M. Wade, M. Georgas, S. Lin, L. Alloatti, B. Moss, R. Kumar, A. H. Atabaki, F. Pavanello, J. M. Shainline, J. S. Orcutt, R. J. Ram, M. Popović, and V. Stojanović, "A 45 nm CMOS-SOI Monolithic Photonics Platform With Bit-Statistics-Based Resonant Microring Thermal Tuning," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 4, pp. 893–907, April 2016.
- [6] C. Sun, M. T. Wade, Y. Lee, J. S. Orcutt, L. Alloatti, M. S. Georgas, A. S. Waterman, J. M. Shainline, R. R. Avizienis, S. Lin, B. R. Moss, R. Kumar, F. Pavanello, A. Atabaki, H. M. Cook, A. J. Ou, J. C. Leu, Y.-H. Chen, K. Asanović, R. J. Ram, M. A. Popović, and V. M. Stojanović, "Single-chip microprocessor that communicates directly using light," *Nature*, vol. 528, no. 7583, pp. 534–538, 2015.

- [7] V. Stojanović, R. J. Ram, M. Popović, S. Lin, S. Moazeni, M. Wade, C. Sun, L. Alloatti, A. Atabaki, F. Pavanello, N. Mehta, and P. Bhargava, "Monolithic Silicon-Photonic Platforms in State-of-the-art CMOS SOI Processes," *Optics Express*, vol. 15, no. 19, pp. 11798–11807, 2018.

- [8] S. Lin, S. Moazeni, K. T. Settaluri, and V. Stojanović, "Electronic-Photonic Co-Optimization of High-Speed Silicon Photonic Transmitters," *Journal of Lightwave Technology*, vol. 35, no. 21, pp. 4766–4780, Nov 2017.
- [9] S. E. Thompson, M. Armstrong, C. Auth, M. Alavi, M. Buehler, R. Chau, S. Cea, T. Ghani, G. Glass, T. Hoffman, C. H. Jan, C. Kenyon, J. Klaus, K. Kuhn, Z. Ma, B. Mcintyre, K. Mistry, A. Murthy, B. Obradovic, R. Nagisetty, P. Nguyen, S. Sivakumar, R. Shaheed, L. Shifren, B. Tufts, S. Tyagi, M. Bohr, and Y. El-Mansy, "A 90-nm logic technology featuring strained-silicon," *IEEE Transactions on Electron Devices*, vol. 51, no. 11, pp. 1790–1797, Nov 2004.
- [10] S. Krishnan, U. Kwon, N. Moumen, M. W. Stoker, E. C. T. Harley, S. Bedell, D. Nair, B. Greene, W. Henson, M. Chowdhury, D. P. Prakash, E. Wu, D. Ioannou, E. Cartier, M. H. Na, S. Inumiya, K. Mcstay, L. Edge, R. Iijima, J. Cai, M. Frank, M. Hargrove, D. Guo, A. Kerber, H. Jagannathan, T. Ando, J. Shepard, S. Siddiqui, M. Dai, H. Bu, J. Schaeffer, D. Jaeger, K. Barla, T. Wallner, S. Uchimura, Y. Lee, G. Karve, S. Zafar, D. Schepis, Y. Wang, R. Donaton, S. Saroop, P. Montanini, Y. Liang, J. Stathis, R. Carter, R. Pal, V. Paruchuri, H. Yamasaki, J. H. Lee, M. Ostermayr, J. P. Han, Y. Hu, M. Gribelyuk, D. G. Park, X. Chen, S. Samavedam, S. Narasimha, P. Agnello, M. Khare, R. Divakaruni, V. Narayanan, and M. Chudzik, "A manufacturable dual channel (Si and SiGe) high-k metal gate CMOS technology with multiple oxides for high performance and low power applications," in 2011 International Electron Devices Meeting, Dec 2011, pp. 28.1.1–28.1.4.
- [11] G. Z. Mashanovich, "Electronics and photonics united," 2018.
- [12] J. S. Orcutt, A. Khilo, C. W. Holzwarth, M. A. Popović, H. Li, J. Sun, T. Bonifield, R. Hollingsworth, F. X. Kärtner, H. I. Smith, V. Stojanović, and R. J. Ram, "Nanophotonic integration in state-of-the-art CMOS foundries," *Opt. Express*, vol. 19, no. 3, pp. 2335–2346, Jan 2011.
- [13] S. Moazeni, J. Henriksson, T. J. Seok, M. T. Wade, C. Sun, M. C. Wu, and V. Stojanović, "Microsecond Optical Switching Network of Processor SoCs with Optical I/O," Optical Fiber Communications Conference and Exhibition (OFC), 2018.
- [14] A. Shehabi, S. J. Smith, D. A. Sartor, R. E. Brown, M. Herrlin, J. G. Koomey, E. R. Masanet, N. Horner, I. L. Azevedo, and W. Lintner, "United States Data Center Energy Usage Report," Tech. Rep., June 2016.

- [15] S. Moazeni, S. Lin, M. Wade, L. Alloatti, R. Ram, M. Popović, and V. Stojanović, "A 40Gb/s PAM-4 Transmitter Based on a Ring-Resonator Optical DAC in 45nm SOI CMOS," (Invited Paper) IEEE Journal of Solid-State Circuits (JSSC), vol. 52, no. 12, pp. 3503–3516, Dec 2017.

- [16] S. Moazeni, S. Lin, M. T. Wade, L. Alloatti, R. J. Ram, M. A. Popović, and V. Stojanović, "A 40Gb/s PAM-4 transmitter based on a ring-resonator optical DAC in 45nm SOI CMOS," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), Feb 2017, pp. 486–487.
- [17] M. T. Wade, F. Pavanello, J. Orcutt, R. Kumar, J. M. Shainline, V. Stojanović, R. Ram, and M. A. Popović, "Scaling zero-change photonics: An active photonics platform in a 32nm microelectronics SOI CMOS process," in 2015 Conference on Lasers and Electro-Optics (CLEO), May 2015, pp. 1–2.
- [18] S. Moazeni, A. Atabaki, D. Cheian, S. Lin, R. J. Ram, and V. Stojanović, "Monolithic Integration of O-band Photonic Transceivers in a "Zero-change" 32nm SOI CMOS," in 2017 IEEE International Electron Devices Meeting (IEDM), 2017.
- [19] A. H. Atabaki\*, S. Moazeni\*, F. Pavanello\*, H. Gevorgyan, J. Notaros, L. Alloatti, M. T. Wade, C. Sun, S. A. Kruger, H. Meng, K. Al Qubaisi, I. Wang, B. Zhang, A. Khilo, C. V. Baiocco, M. A. Popović, V. M. Stojanović, and R. J. Ram, "Integrating photonics with silicon nanoelectronics for the next generation of systems on a chip," *Nature*, vol. 556, no. 7701, pp. 349–354, 2018.
- [20] S. Moazeni\*, A. Atabaki\*, F. Pavanello\*, H. Gevorgyan, J. Notaros, L. Alloatti, M. T. Wade, C. Sun, S. A. Kruger, H. Meng, K. A. Qubaisi, I. Wang, B. Zhang, A. Khilo, C. Baiocco, M. A. Popović, R. Ram, and V. Stojanović, "Integration of Polysilicon-based Photonics in a 12-inch Wafer 65nm Bulk CMOS Process," in 2017 Fifth Berkeley Symposium on Energy Efficient Electronic Systems (E3S), 2017.
- [21] A. Atabaki\*, S. Moazeni\*, F. Pavanello\*, H. Gevorgyan, J. Notaros, L. Alloatti, M. T. Wade, C. Sun, S. A. Kruger, H. Meng, K. A. Qubaisi, I. Wang, B. Zhang, A. Khilo, C. Baiocco, M. A. Popović, V. Stojanović, and R. Ram, "Monolithic Optical Transceivers in 65 nm Bulk CMOS," Optical Fiber Communications Conference and Exhibition (OFC), 2018.
- [22] Y. A. Vlasov, "Silicon CMOS-integrated nano-photonics for computer and data communications beyond 100G," *IEEE Communications Magazine*, vol. 50, no. 2, pp. s67–s72, February 2012.
- [23] C. Kachris and I. Tomkos, "A Survey on Optical Interconnects for Data Centers," *IEEE Communications Surveys Tutorials*, vol. 14, no. 4, pp. 1021–1036, April 2012.

- [24] D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N. P. Jouppi, M. Fiorentino, A. Davis, N. Binkert, R. G. Beausoleil, and J. H. Ahn, "Corona: System implications of emerging nanophotonic technology," in 2008 International Symposium on Computer Architecture, June 2008, pp. 153–164.

- [25] S. Beamer, C. Sun, Y.-J. Kwon, A. Joshi, C. Batten, V. Stojanović, and K. Asanović, "Re-architecting DRAM Memory Systems with Monolithically Integrated Silicon Photonics," in *Proceedings of the 37th Annual International Symposium on Computer Architecture*, ser. ISCA '10. New York, NY, USA: ACM, 2010, pp. 129–140.
- [26] M. Rakowski, M. Pantouvaki, P. D. Heyn, P. Verheyen, M. Ingels, H. Chen, J. D. Coster, G. Lepage, B. Snyder, K. D. Meyer, M. Steyaert, N. Pavarelli, J. S. Lee, P. O'Brien, P. Absil, and J. V. Campenhout, "A 4×20Gb/s WDM ring-based hybrid CMOS silicon photonics transceiver," in 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, Feb 2015, pp. 1–3.
- [27] E. Timurdogan, Z. Su, K. Settaluri, S. Lin, S. Moazeni, C. Sun, G. Leake, D. D. Coolbaugh, B. R. Moss, M. Moresco, V. Stojanović, and M. R. Watts, "An ultra low power 3D integrated intra-chip silicon electronic-photonic link," in 2015 Optical Fiber Communications Conference (OFC), March 2015, pp. 1–3.
- [28] E. Temporiti, G. Minoia, M. Repossi, D. Baldi, A. Ghilioni, and F. Svelto, "A 56Gb/s 300mW silicon-photonics transmitter in 3D-integrated PIC25G and 55nm BiCMOS technologies," in 2016 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, Jan 2016, pp. 404–405.
- [29] J. F. Buckwalter, X. Zheng, G. Li, K. Raj, and A. V. Krishnamoorthy, "A Monolithic 25-Gb/s Transceiver With Photonic Ring Modulators and Ge Detectors in a 130-nm CMOS SOI Process," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 6, pp. 1309–1322, June 2012.
- [30] C. Xiong, D. M. Gill, J. E. Proesel, J. S. Orcutt, W. Haensch, and W. M. J. Green, "Monolithic 56Gb/s silicon photonic pulse-amplitude modulation transmitter," *Optica*, vol. 3, no. 10, pp. 1060–1065, Oct 2016.
- [31] C. Sun, M. Georgas, J. Orcutt, B. Moss, Y.-H. Chen, J. Shainline, M. Wade, K. Mehta, K. Nammari, E. Timurdogan, D. Miller, O. Tehar-Zahav, Z. Sternberg, J. Leu, J. Chong, R. Bafrali, G. Sandhu, M. Watts, R. Meade, M. Popovic, R. Ram, and V. Stojanovic, "A monolithically-integrated chip-to-chip optical link in bulk CMOS," Solid-State Circuits, IEEE Journal of, vol. 50, no. 4, pp. 828–844, April 2015.
- [32] T. Norimatsu, T. Kawamoto, K. Kogo, N. Kohmu, F. Yuki, N. Nakajima, T. Muto, J. Nasu, T. Komori, H. Koba, T. Usugi, T. Hokari, T. Kawamata, Y. Ito, S. Umai, M. Tsuge, T. Yamashita, M. Hasegawa, and K. Higeta, "A 25Gb/s multistandard serial link transceiver for 50dB-loss copper cable in 28nm CMOS," in 2016 IEEE International Solid-State Circuits Conference (ISSCC), Jan 2016, pp. 60–61.
- [33] K. Nakahara, Y. Wakayama, T. Kitatani, T. Taniguchi, T. Fukamachi, Y. Sakuma, and S. Tanaka, "56-Gb/s direct modulation in InGaAlAs BH-DFB lasers at 1550nm," in *OFC 2014*, March 2014, pp. 1–3.
- [34] A. Sharif-Bakhtiar, M. G. Lee, and A. C. Carusone, "A 40-Gbps 0.5-pJ/bit VCSEL driver in 28nm CMOS with complex zero equalizer," in 2017 IEEE Custom Integrated Circuits Conference (CICC), April 2017, pp. 1–4.
- [35] F. Eltes, M. Kroh, D. Caimi, C. Mai, Y. Popoff, G. Winzer, D. Petousi, S. Lischke, J. E. Ortmann, L. Czornomaz, L. Zimmermann, J. Fompeyrine, and S. Abel, "A novel 25 Gbps electro-optic Pockels modulator integrated on an advanced Si photonic platform," in 2017 IEEE International Electron Devices Meeting (IEDM), Dec 2017, pp. 24.5.1–24.5.4.
- [36] R. Soref and B. Bennett, "Electro-optical effects in silicon," *IEEE Journal of Quantum Electronics*, vol. 23, no. 1, pp. 123–129, Jan 1987.
- [37] M. R. Watts, W. A. Zortman, D. C. Trotter, R. W. Young, and A. L. Lentine, "Low-Voltage, Compact, Depletion-Mode, Silicon Mach-Zehnder Modulator," *IEEE Journal of Selected Topics in Quantum Electronics*, vol. 16, no. 1, pp. 159–164, Jan 2010.
- [38] X. Wu, B. Dama, P. Gothoskar, P. Metz, K. Shastri, S. Sunder, J. V. d. Spiegel, Y. Wang, M. Webster, and W. Wilson, "A 20Gb/s NRZ/PAM-4 1V transmitter in 40nm CMOS driving a Si-photonic modulator in 0.13um CMOS," in 2013 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, Feb 2013, pp. 128–129.
- [39] J. E. Heebner, V. Wong, A. Schweinsberg, R. W. Boyd, and D. J. Jackson, "Optical transmission characteristics of fiber ring resonators," *IEEE Journal of Quantum Electronics*, vol. 40, no. 6, pp. 726–730, June 2004.
- [40] C. Li, R. Bai, A. Shafik, E. Tabasy, B. Wang, G. Tang, C. Ma, C.-H. Chen, Z. Peng, M. Fiorentino, R. Beausoleil, P. Chiang, and S. Palermo, "Silicon photonic transceiver circuits with microring resonator bias-based wavelength stabilization in 65 nm CMOS," Solid-State Circuits, IEEE Journal of, vol. 49, no. 6, pp. 1419–1436, June 2014.
- [41] B. Moss, C. Sun, M. Georgas, J. Shainline, J. Orcutt, J. Leu, M. Wade, Y.-H. Chen, K. Nammari, X. Wang, H. Li, R. Ram, M. Popović, and V. Stojanović, "A 1.23pJ/b 2.5Gb/s monolithically integrated optical carrier-injection ring modulator and all-digital driver circuit in commercial 45nm SOI," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2013 IEEE International, Feb 2013, pp. 126–127.

- [42] C. Sun, M. Wade, M. Georgas, S. Lin, L. Alloatti, B. Moss, R. Kumar, A. H. Atabaki, F. Pavanello, J. M. Shainline, J. S. Orcutt, R. J. Ram, M. Popović, and V. Stojanović, "A 45nm CMOS-SOI Monolithic Photonics Platform With Bit-Statistics-Based Resonant Microring Thermal Tuning," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 4, pp. 893–907, April 2016.

- [43] D. Liang and J. E. Bowers, "Recent progress in lasers on silicon," *Nature Photonics*, vol. 4, p. 511, jul 2010.
- [44] "QD Laser," Website, http://www.qdlaser.com.
- [45] P. D. Dobbelaere, A. Ayazi, Y. Chi, A. Dahl, S. Denton, S. Gloeckner, K.-Y. Hon, S. Hovey, Y. Liang, M. Mack, G. Masini, A. Mekis, M. Peterson, T. Pinguet, J. Schramm, M. Sharp, C. Sohn, K. Stechschulte, P. Sun, G. Vastola, L. Verslegers, and R. Zhou, "Packaging of silicon photonics systems," in *Optical Fiber Communication Conference*. Optical Society of America, 2014, p. W3I.2.
- [46] X. Zheng, S. Lin, Y. Luo, J. Yao, G. Li, S. S. Djordjevic, J. H. Lee, H. D. Thacker, I. Shubin, K. Raj, J. E. Cunningham, and A. V. Krishnamoorthy, "Efficient WDM Laser Sources Towards Terabyte/s Silicon Photonic Interconnects," *Journal of Lightwave Technology*, vol. 31, no. 24, pp. 4142–4154, Dec 2013.
- [47] G. L. Wojcik, D. Yin, A. R. Kovsh, A. E. Gubenko, I. L. Krestnikov, S. S. Mikhrin, D. A. Livshits, D. A. Fattal, M. Fiorentino, and R. G. Beausoleil, "A single comb laser source for short reach WDM interconnects," pp. 7230–7230–12, 2009.
- [48] B. R. Koch, E. J. Norberg, B. Kim, J. Hutchinson, J. H. Shin, G. Fish, and A. Fang, "Integrated Silicon Photonic Laser Sources for Telecom and Datacom," in 2013 Optical Fiber Communication Conference (OFC), March 2013, pp. 1–3.
- [49] C. Gunn, "CMOS photonics for high-speed interconnects," *IEEE Micro*, vol. 26, no. 2, pp. 58–66, March 2006.
- [50] F. Boeuf, S. Crémer, N. Vulliet, T. Pinguet, A. Mekis, G. Masini, L. Verslegers, P. Sun, A. Ayazi, N. Hon *et al.*, "A multi-wavelength 3D-compatible silicon photonics platform on 300mm SOI wafers for 25Gb/s applications," in *Electron Devices Meeting (IEDM)*, 2013 IEEE International. IEEE, 2013, pp. 13–3.
- [51] M. H. Wakayama, "Nanometer CMOS from a mixed-signal/RF perspective," in 2013 IEEE International Electron Devices Meeting, Dec 2013, pp. 17.4.1–17.4.4.
- [52] M. Pantouvaki, P. D. Heyn, R. Michal, P. Verheyen, S. Brad, A. Srinivasan, H. Chen, J. D. Coster, G. Lepage, P. Absil, and J. V. Campenhout, "50Gb/s Silicon Photonics Platform for Short-Reach Optical Interconnects," in *Optical Fiber Communication Conference*. Optical Society of America, 2016, p. Th4H.4.

- [53] M. Rakowski, M. Pantouvaki, P. Verheyen, J. D. Coster, G. Lepage, P. Absil, and J. V. Campenhout, "A 50Gb/s, 610fJ/bit hybrid CMOS-Si photonics ring-based NRZ-OOK transmitter," in *Optical Fiber Communication Conference*. Optical Society of America, 2016, p. Th1F.4.

- [54] H. Li, Z. Xuan, A. Titriku, C. Li, K. Yu, B. Wang, A. Shafik, N. Qi, Y. Liu, R. Ding, T. Baehr-Jones, M. Fiorentino, M. Hochberg, S. Palermo, and P. Y. Chiang, "A 25 Gb/s, 4.4 V-Swing, AC-Coupled Ring Modulator-Based WDM Transmitter with Wavelength Stabilization in 65 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 12, pp. 3145–3159, Dec 2015.
- [55] K. Yu, C. Li, H. Li, A. Titriku, A. Shafik, B. Wang, Z. Wang, R. Bai, C. H. Chen, M. Fiorentino, P. Y. Chiang, and S. Palermo, "A 25 Gb/s Hybrid-Integrated Silicon Photonic Source-Synchronous Receiver With Microring Wavelength Stabilization," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 9, pp. 2129–2141, Sept 2016.
- [56] E. Temporiti, G. Minoia, M. Repossi, D. Baldi, A. Ghilioni, and F. Svelto, "A 56Gb/s 300mW silicon-photonics transmitter in 3D-integrated PIC25G and 55nm BiCMOS technologies," in 2016 IEEE International Solid-State Circuits Conference (ISSCC), Jan 2016, pp. 404–405.
- [57] A. Narasimha, B. Analui, E. Balmater, A. Clark, T. Gal, D. Guckenberger, S. Gutierrez, M. Harrison, R. Koumans, D. Kucharski *et al.*, "A 40-Gb/s QSFP optoelectronic transceiver in a 0.13 μm CMOS Silicon-on-Insulator Technology," in *Optical Fiber Communication Conference*. Optical Society of America, 2008, p. OMK7.
- [58] N. B. Feilchenfeld, F. G. Anderson, T. Barwicz, S. Chilstedt, Y. Ding, J. Ellis-Monaghan, D. M. Gill, C. Hedges, J. Hofrichter, F. Horst, M. Khater, E. Kiewra, R. Leidy, Y. Martin, K. McLean, M. Nicewicz, J. S. Orcutt, B. Porth, J. Proesel, C. Reinholm, J. C. Rosenberg, W. D. Sacher, A. D. Stricker, C. Whiting, C. Xiong, A. Agrawal, F. Baker, C. W. Baks, B. Cucci, D. Dang, T. Doan, F. Doany, S. Engelmann, M. Gordon, E. Joseph, J. Maling, S. Shank, X. Tian, C. Willets, J. Ferrario, M. Meghelli, F. Libsch, B. Offrein, W. M. J. Green, and W. Haensch, "An integrated silicon photonics technology for O-band datacom," in 2015 IEEE International Electron Devices Meeting (IEDM), Dec 2015, pp. 25.7.1–25.7.4.
- [59] L. Zimmermann, D. Knoll, M. Kroh, S. Lischke, D. Petousi, G. Winzer, and Y. Yamamoto, "BiCMOS silicon photonics platform," in 2015 Optical Fiber Communications Conference and Exhibition (OFC), March 2015, pp. 1–3.
- [60] Y. Lee, A. Waterman, R. Avizienis, H. Cook, C. Sun, V. Stojanović, and K. Asanovic, "A 45nm 1.3 GHz 16.7 double-precision GFLOPS/W RISC-V processor with vector accelerators," in *European Solid State Circuits Conference (ESSCIRC)*, ESSCIRC 2014-40th, 2014, pp. 199–202.

- [61] S. Narasimha, K. Onishi, H. Nayfeh, A. Waite, M. Weybright, J. Johnson, C. Fonseca, D. Corliss, C. Robinson, M. Crouse, D. Yang, C.-H. Wu, A. Gabor, T. Adam, I. Ahsan, M. Belyansky, L. Black, S. Butt, J. Cheng, A. Chou, G. Costrini, C. Dimitrakopoulos, A. Domenicucci, P. Fisher, A. Frye, S. Gates, S. Greco, S. Grunow, M. Hargrove, J. Holt, S.-J. Jeng, M. Kelling, B. Kim, W. Landers, G. LaRosa, D. Lea, M. Lee, X. Liu, N. Lustig, A. McKnight, L. Nicholson, D. Nielsen, K. Nummy, V. Ontalus, C. Ouyang, X. Ouyang, C. Prindle, R. Pal, W. Rausch, D. Restaino, C. Sheraw, J. Sim, A. Simon, T. Standaert, C. Sung, K. Tabakman, C. Tian, R. Van Den Nieuwenhuizen, H. van Meer, A. Vayshenker, D. Wehella-Gamage, J. Werking, R. Wong, J. Yu, S. Wu, R. Augur, D. Brown, X. Chen, D. Edelstein, A. Grill, M. Khare, Y. Li, S. Luning, J. Norum, S. Sankaran, D. Schepis, R. Wachnik, R. Wise, C. Warm, T. Ivers, and P. Agnello, "High performance 45-nm SOI technology with enhanced strain, porous low-k BEOL, and immersion lithography," in *Electron Devices Meeting*, 2006. IEDM '06. International, Dec 2006, pp. 1–4.

- [62] J. S. Orcutt, B. Moss, C. Sun, J. Leu, M. Georgas, J. Shainline, E. Zgraggen, H. Li, J. Sun, M. Weaver, S. Urošević, M. Popović, R. J. Ram, and V. Stojanović, "Open foundry platform for high-performance electronic-photonic integration," *Opt. Express*, vol. 20, no. 11, pp. 12222–12232, May 2012.
- [63] M. T. Wade, F. Pavanello, R. Kumar, C. M. Gentry, A. Atabaki, R. Ram, V. Stojanović, and M. A. Popović, "75% efficient wide bandwidth grating couplers in a 45nm microelectronics CMOS process," in 2015 IEEE Optical Interconnects Conference (OI), April 2015, pp. 46–47.
- [64] J. Notaros, F. Pavanello, M. T. Wade, C. M. Gentry, A. Atabaki, L. Alloatti, R. J. Ram, and M. A. Popović, "Ultra-efficient CMOS fiber-to-chip grating couplers," in 2016 Optical Fiber Communications Conference (OFC), March 2016, pp. 1–3.
- [65] M. T. Wade, J. M. Shainline, J. S. Orcutt, C. Sun, R. Kumar, B. Moss, M. Georgas, R. J. Ram, V. Stojanović, and M. A. Popović, "Energy-efficient active photonics in a zero-change, state-of-the-art CMOS process," in 2014 Optical Fiber Communications Conference (OFC), March 2014, pp. 1–3.
- [66] L. Alloatti and R. J. Ram, "Resonance-enhanced waveguide-coupled silicon-germanium detector," arXiv preprint arXiv:1601.00542, 2016.
- [67] A. H. Atabaki, H. Meng, L. Alloatti, and R. J. Ram, "A high-speed photodetector for telecom, ethernet, and FTTH applications in zero-change CMOS process," in 2016 Optical Fiber Communications Conference (OFC), March 2016, pp. 1–3.
- [68] L. Verslegers, A. Mekis, T. Pinguet, Y. Chi, G. Masini, P. Sun, A. Ayazi, K. Y. Hon, S. Sahni, S. Gloeckner, C. Baudot, F. Boeuf, and P. D. Dobbelaere, "Silicon photonics device libraries for high-speed transceivers," in 2014 IEEE Photonics Conference, Oct 2014, pp. 65–66.

- [69] M. Pu, L. Liu, H. Ou, K. Yvind, and J. M. Hvam, "Ultra-low-loss inverted taper coupler for silicon-on-insulator ridge waveguide," *Optics Communications*, vol. 283, no. 19, pp. 3678–3682, 2010.

- [70] M. S. Akhter, P. Somogyi, C. Sun, M. Wade, R. Meade, P. Bhargava, S. Lin, and N. Mehta, "WaveLight: A Monolithic Low Latency Silicon-Photonics Communication Platform for the Next-Generation Disaggregated Cloud Data Centers," in 2017 IEEE 25th Annual Symposium on High-Performance Interconnects (HOTI), Aug 2017, pp. 25–28.
- [71] Q. Xu, B. Schmidt, S. Pradhan, and M. Lipson, "Micrometre-scale silicon electro-optic modulator," *Nature*, vol. 435, no. 7040, pp. 325–327, 2005.
- [72] J. M. Shainline, J. S. Orcutt, M. T. Wade, K. Nammari, B. Moss, M. Georgas, C. Sun, R. J. Ram, V. Stojanović, and M. A. Popović, "Depletion-mode carrier-plasma optical modulator in zero-change advanced CMOS," *Optics letters*, vol. 38, no. 15, pp. 2657–2659, 2013.
- [73] L. Alloatti, D. Cheian, and R. J. Ram, "High-speed modulator with interleaved junctions in zero-change CMOS photonics," *Applied Physics Letters*, vol. 108, no. 13, p. 131101, 2016.
- [74] L. Kimerling, "Silicon microphotonics," *Applied Surface Science*, vol. 159, pp. 8–13, Jun. 2000.
- [75] K. K. Mehta, J. S. Orcutt, J. M. Shainline, O. Tehar-Zahav, Z. Sternberg, R. Meade, M. A. Popović, and R. J. Ram, "Polycrystalline silicon ring resonator photodiodes in a bulk complementary metal-oxide-semiconductor process," *Opt. Lett.*, vol. 39, no. 4, pp. 1061–1064, Feb 2014.
- [76] M. D. C. Falco, A. Atabaki, L. Alloatti, M. Wade, M. Popovic, and R. Ram, "A thin silicon photonic platform for telecommunication wavelengths," in *ECOC2017 European Conference on Optical Communication (ECOC)*, p. SC2.25, 2017.
- [77] A. H. Atabaki, H. Meng, L. Alloatti, K. K. Mehta, and R. J. Ram, "High-speed polysilicon CMOS photodetector for telecom and datacom," *Applied Physics Letters*, vol. 109, no. 11, p. 111106, 2016.
- [78] C. Sun, C.-H. Chen, G. Kurian, L. Wei, J. Miller, A. Agarwal, L.-S. Peh, and V. Stojanović, "DSENT-a tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling," in *Networks on Chip (NoCS)*, 2012 Sixth IEEE/ACM International Symposium on. IEEE, 2012, pp. 201–210.
- [79] C. Sorace-Agaskar, J. Leu, M. R. Watts, and V. Stojanovic, "Electro-optical co-simulation for integrated CMOS photonic circuits with VerilogA," *Opt. Express*, vol. 23, no. 21, pp. 27180–27203, Oct 2015.

- [80] L. Alloatti, M. Wade, V. Stojanovic, M. Popovic, and R. Ram, "Photonics design tool for advanced CMOS nodes," *IET Optoelectronics*, vol. 9, no. 4, pp. 163–167, 2015.

- [81] J. M. Wilson, W. J. Turner, J. W. Poulton, B. Zimmer, X. Chen, S. S. Kudva, S. Song, S. G. Tell, N. Nedovic, W. Zhao, S. R. Sudhakaran, C. T. Gray, and W. J. Dally, "A 1.17pJ/b 25Gb/s/pin ground-referenced single-ended serial link for off- and on-package communication in 16nm CMOS using a process- and temperature-adaptive voltage regulator," in 2018 IEEE International Solid State Circuits Conference (ISSCC), Feb 2018, pp. 276–278.
- [82] A. Cevrero, I. Ozkaya, P. A. Francese, C. Menolfi, T. Morf, M. Brandli, D. Kuchta, L. Kull, J. Proesel, M. Kossel, D. Luu, B. Lee, F. Doany, M. Meghelli, Y. Leblebici, and T. Toifl, "A 64Gb/s 1.4pJ/b NRZ optical-receiver data-path in 14nm CMOS FinFET," in 2017 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, Feb 2017, pp. 482–483.
- [83] M. Pantouvaki, P. Verheyen, J. D. Coster, G. Lepage, P. Absil, and J. V. Campenhout, "56Gb/s ring modulator on a 300mm silicon photonics platform," in 2015 European Conference on Optical Communication (ECOC), Sept 2015, pp. 1–3.
- [84] I. L. Gheorma and R. M. Osgood, "Fundamental limitations of optical resonator based high-speed EO modulators," *IEEE Photonics Technology Letters*, vol. 14, no. 6, pp. 795–797, June 2002.
- [85] A. Roshan-Zamir, B. Wang, S. Telaprolu, K. Yu, C. Li, M. A. Seyedi, M. Fiorentino, R. Beausoleil, and S. Palermo, "A 40 Gb/s PAM4 silicon microring resonator modulator transmitter in 65nm CMOS," in 2016 IEEE Optical Interconnects Conference (OI), May 2016, pp. 8–9.
- [86] T. Yilmaz, C. M. DePriest, T. Turpin, J. H. Abeles, and P. J. Delfyett, "Toward a photonic arbitrary waveform generator using a modelocked external cavity semiconductor laser," *IEEE Photonics Technology Letters*, vol. 14, no. 11, pp. 1608–1610, Nov 2002.
- [87] M. H. Khan, H. Shen, Y. Xuan, L. Zhao, S. Xiao, D. E. Leaird, A. M. Weiner, and M. Qi, "Ultrabroad-bandwidth arbitrary radiofrequency waveform generation with a silicon photonic chip-based spectral shaper," *Nature Photonics*, vol. 4, no. 2, pp. 117–122, Feb 2010.
- [88] J. Wang, H. Shen, L. Fan, R. Wu, B. Niu, L. T. Varghese, Y. Xuan, D. E. Leaird, X. Wang, F. Gan, A. M. Weiner, and M. Qi, "Reconfigurable radio-frequency arbitrary waveforms synthesized in a silicon photonic chip," *Nature communications*, vol. 6, 2015.
- [89] M. Nazari and A. Emami-Neyestanak, "A 24-Gb/s Double-Sampling Receiver for Ultra-Low-Power Optical Communication," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 2, pp. 344–357, Feb 2013.

- [90] E. Sackinger, "The Transimpedance Limit," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 57, no. 8, pp. 1848–1856, Aug 2010.

- [91] B. Wang, K. Yu, H. Li, P. Y. Chiang, and S. Palermo, "Energy efficiency comparisons of NRZ and PAM4 modulation for ring-resonator-based silicon photonic links," in 2015 IEEE 58th International Midwest Symposium on Circuits and Systems (MWSCAS), Aug 2015, pp. 1–4.
- [92] K. T. Settaluri, C. Lalau-Keraly, E. Yablonovitch, and V. Stojanović, "First Principles Optimization of Opto-Electronic Communication Links," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. PP, no. 99, pp. 1–14, 2017.
- [93] J. Crossley, E. Naviasky, and E. Alon, "An energy-efficient ring-oscillator digital PLL," in *Custom Integrated Circuits Conference (CICC)*, 2010 IEEE, Sept 2010, pp. 1–4.
- [94] S. K. Selvaraja, W. Bogaerts, P. Dumon, D. V. Thourhout, and R. Baets, "Subnanometer Linewidth Uniformity in Silicon Nanophotonic Waveguide Devices Using CMOS Fabrication Technology," *IEEE Journal of Selected Topics in Quantum Electronics*, vol. 16, no. 1, pp. 316–324, Jan 2010.
- [95] B. Greene, Q. Liang, K. Amarnath, Y. Wang, J. Schaeffer, M. Cai, Y. Liang, S. Saroop, J. Cheng, A. Rotondaro, S. J. Han, R. Mo, K. McStay, S. Ku, R. Pal, M. Kumar, B. Dirahoui, B. Yang, F. Tamweber, W. H. Lee, M. Steigerwalt, H. Weijtmans, J. Holt, L. Black, S. Samavedam, M. Turner, K. Ramani, D. Lee, M. Belyansky, M. Chowdhury, D. Aime, B. Min, H. van Meer, H. Yin, K. Chan, M. Angyal, M. Zaleski, O. Ogunsola, C. Child, L. Zhuang, H. Yan, D. Permanaa, J. Sleight, D. Guo, S. Mittl, D. Ioannou, E. Wu, M. Chudzik, D. G. Park, D. Brown, S. Luning, D. Mocuta, E. Maciejewski, K. Henson, and E. Leobandung, "High performance 32nm SOI CMOS with high-k/metal gate and 0.149um<sup>2</sup> SRAM and ultra low-k back end with eleven levels of copper," in 2009 Symposium on VLSI Technology, June 2009, pp. 140–141.
- [96] M. Wade, F. Pavanello, R. Kumar, C. Gentry, R. Ram, V. Stojanovic, and M. Popovic, "75% efficient wide bandwidth grating couplers in a 45 nm microelectronics CMOS process," in *Optical Interconnects Conference*, 2015 IEEE, April 2015.
- [97] M. Chudzik, B. Doris, R. Mo, J. Sleight, E. Cartier, C. Dewan, D. Park, H. Bu, W. Natzle, W. Yan, C. Ouyang, K. Henson, D. Boyd, S. Callegari, R. Carter, D. Casarotto, M. Gribelyuk, M. Hargrove, W. He, Y. Kim, B. Linder, N. Moumen, V. K. Paruchuri, J. Stathis, M. Steen, A. Vayshenker, X. Wang, S. Zafar, T. Ando, R. Iijima, M. Takayanagi, V. Narayanan, R. Wise, Y. Zhang, R. Divakaruni, M. Khare, and T. C. Chen, "High-performance high-k/metal gates for 45nm CMOS and beyond with gate-first processing," in 2007 IEEE Symposium on VLSI Technology, June 2007, pp. 194–195.

- [98] L. Alloatti and R. J. Ram, "Resonance-enhanced waveguide-coupled silicon-germanium detector," *Applied Physics Letters*, vol. 108, no. 7, p. 071105, 2016.

- [99] Y. Kim, M. Takenaka, T. Osada, M. Hata, and S. Takagi, "Strain-induced enhancement of plasma dispersion effect and free-carrier absorption in SiGe optical modulators," *Scientific Reports*, vol. 4, p. 4683, Apr 2014.
- [100] N. Mehta, C. Sun, M. Wade, S. Lin, M. Popovic, and V. Stojanovic, "A 12Gb/s, 8.6 µA<sub>pp</sub> input sensitivity, monolithic-integrated fully differential optical receiver in CMOS 45nm SOI process," in *ESSCIRC Conference 2016: 42nd European Solid-State Circuits Conference*, Sept 2016, pp. 491–494.
- [101] A. Khilo, S. J. Spector, M. E. Grein, A. H. Nejadmalayeri, C. W. Holzwarth, M. Y. Sander, M. S. Dahlem, M. Y. Peng, M. W. Geis, N. A. DiLello *et al.*, "Photonic ADC: overcoming the bottleneck of electronic jitter," *Optics Express*, vol. 20, no. 4, pp. 4454–4469, 2012.
- [102] S. Rumley, D. Nikolova, R. Hendry, Q. Li, D. Calhoun, and K. Bergman, "Silicon photonics for exascale systems," *Journal of Lightwave Technology*, vol. 33, no. 3, pp. 547–562, 2015.
- [103] D. A. Miller, "Rationale and challenges for optical interconnects to electronic chips," *Proceedings of the IEEE*, vol. 88, no. 6, pp. 728–749, 2000.
- [104] S. Assefa, W. Green, A. Rylyakov, C. Schow, F. Horst, and Y. Vlasov, "Monolithic integration of silicon nanophotonics with CMOS," in *Photonics Conference (IPC)*, 2012 IEEE, Sept 2012, pp. 626–627.
- [105] A. Awny, R. Nagulapalli, G. Winzer, M. Kroh, D. Micusik, S. Lischke, D. Knoll, G. Fischer, D. Kissinger, A. Ç. Ulusoy *et al.*, "A 40 Gb/s Monolithically Integrated Linear Photonic Receiver in a 0.25 µm BiCMOS SiGe:C Technology," *IEEE Microwave and Wireless Components Letters*, vol. 25, no. 7, pp. 469–471, 2015.
- [106] M. Ieong, B. Doris, J. Kedzierski, K. Rim, and M. Yang, "Silicon device scaling to the sub-10-nm regime," *Science*, vol. 306, no. 5704, pp. 2057–2060, 2004.
- [107] J. A. Del Alamo, "Nanometre-scale electronics with III-V compound semiconductors," *Nature*, vol. 479, no. 7373, pp. 317–323, 2011.
- [108] S. B. Desai, S. R. Madhvapathy, A. B. Sachid, J. P. Llinas, Q. Wang, G. H. Ahn, G. Pitner, M. J. Kim, J. Bokor, C. Hu *et al.*, "MoS<sub>2</sub> transistors with 1-nanometer gate lengths," *Science*, vol. 354, no. 6308, pp. 99–102, 2016.
- [109] P. Ghelfi, F. Laghezza, F. Scotti, G. Serafino, A. Capria, S. Pinna, D. Onori, C. Porzi, M. Scaffardi, A. Malacarne *et al.*, "A fully photonics-based coherent radar system," *Nature*, vol. 507, no. 7492, p. 341, 2014.

- [110] S. Mudumba, S. de Alba, R. Romero, C. Cherwien, A. Wu, J. Wang, M. A. Gleeson, M. Iqbal, and R. W. Burlingame, "Photonic ring resonance is a versatile platform for performing multiplex immunoassays in real time," *Journal of Immunological Methods*, 2017.

- [111] M. A. Quail, M. Smith, P. Coupland, T. D. Otto, S. R. Harris, T. R. Connor, A. Bertoni, H. P. Swerdlow, and Y. Gu, "A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers," *BMC Genomics*, vol. 13, no. 1, p. 341, 2012.
- [112] A. Zolfaghari, A. Chan, and B. Razavi, "Stacked inductors and transformers in CMOS technology," *IEEE Journal of Solid-State Circuits*, vol. 36, no. 4, pp. 620–628, 2001.
- [113] E. R. Fossum, "CMOS image sensors: Electronic camera-on-a-chip," *IEEE transactions on electron devices*, vol. 44, no. 10, pp. 1689–1698, 1997.
- [114] P. Magarshack, P. Flatresse, and G. Cesana, "UTBB FD-SOI: A process/design symbiosis for breakthrough energy-efficiency," in *Proceedings of the Conference on Design, Automation and Test in Europe*. EDA Consortium, 2013, pp. 952–957.
- [115] A. Melloni, R. Costa, P. Monguzzi, and M. Martinelli, "Ring-resonator filters in silicon oxynitride technology for dense wavelength-division multiplexing systems," *Optics letters*, vol. 28, no. 17, pp. 1567–1569, 2003.
- [116] S. K. Selvaraja, E. Sleeckx, M. Schaekers, W. Bogaerts, D. Van Thourhout, P. Dumon, and R. Baets, "Low-loss amorphous silicon-on-insulator technology for photonic integrated circuitry," *Optics Communications*, vol. 282, no. 9, pp. 1767–1770, 2009.
- [117] K. Preston, S. Manipatruni, A. Gondarenko, C. B. Poitras, and M. Lipson, "Deposited silicon high-speed integrated electro-optic modulator," *Optics express*, vol. 17, no. 7, pp. 5118–5124, 2009.
- [118] R. Dangel, C. Berger, R. Beyeler, L. Dellmann, M. Gmur, R. Hamelin, F. Horst, T. Lamprecht, T. Morf, S. Oggioni *et al.*, "Polymer-waveguide-based board-level optical interconnect technology for datacom applications," *IEEE Transactions on Advanced Packaging*, vol. 31, no. 4, pp. 759–767, 2008.
- [119] B. J. Eggleton, B. Luther-Davies, and K. Richardson, "Chalcogenide photonics," *Nature photonics*, vol. 5, no. 3, pp. 141–148, 2011.
- [120] Y. Nishi and R. Doering, *Handbook of semiconductor manufacturing technology*. CRC Press, 2000.
- [121] J. Y. W. Seto, "The electrical properties of polycrystalline silicon films," *Journal of Applied Physics*, vol. 46, p. 5247, Jun. 1975.

- [122] A. H. Atabaki, G. N. West, K. K. Mehta, D. Kramnik, and R. J. Ram, "Full spectrum visible integrated photonics in scaled microelectronic CMOS," in 2017 IEEE Photonics Society Summer Topical Meeting Series (SUM), July 2017, pp. 129–130.

- [123] R. J. Ram, "Photonic-electronic integration with polysilicon photonics in bulk CMOS," pp. 9367-9367-8, 2015.
- [124] T. Barwicz and H. A. Haus, "Three-dimensional analysis of scattering losses due to sidewall roughness in microphotonic waveguides," *J. Lightwave Technol.*, vol. 23, no. 9, p. 2719, Sep 2005.
- [125] K. Preston, B. Schmidt, and M. Lipson, "Polysilicon photonic resonators for large-scale 3D integration of optical networks," *Opt. Express*, vol. 15, no. 25, pp. 17283–17290, Dec 2007.
- [126] L. Chen, P. Dong, and M. Lipson, "High performance germanium photodetectors integrated on submicron silicon waveguides by low temperature wafer bonding," *Opt. Express*, vol. 16, no. 15, pp. 11513–11518, Jul 2008.
- [127] H. Yu, D. Korn, M. Pantouvaki, J. V. Campenhout, K. Komorowska, P. Verheyen, G. Lepage, P. Absil, D. Hillerkuss, L. Alloatti, J. Leuthold, R. Baets, and W. Bogaerts, "Using carrier-depletion silicon modulators for optical power monitoring," *Opt. Lett.*, vol. 37, no. 22, pp. 4681–4683, Nov 2012.
- [128] J. K. Doylend, P. E. Jessop, and A. P. Knights, "Silicon photonic resonator-enhanced defect-mediated photodiode for sub-bandgap detection," *Opt. Express*, vol. 18, no. 14, pp. 14671–14678, Jul 2010.
- [129] M. W. Geis, S. J. Spector, M. E. Grein, J. U. Yoon, D. M. Lennon, and T. M. Lyszczarz, "Silicon waveguide infrared photodiodes with 35 GHz bandwidth and phototransistors with 50 A/W response," *Opt. Express*, vol. 17, no. 7, pp. 5193–5204, Mar 2009.
- [130] J. Notaros, F. Pavanello, M. T. Wade, C. M. Gentry, A. Atabaki, L. Alloatti, R. J. Ram, and M. A. Popović, "Ultra-efficient CMOS fiber-to-chip grating couplers," in Optical Fiber Communications Conference and Exhibition (OFC), 2016. IEEE, 2016, pp. 1–3.
- [131] R. Soref and B. Bennett, "Electrooptical effects in silicon," *IEEE Journal of Quantum Electronics*, vol. 23, no. 1, January 1987.
- [132] J. J. Ackert, A. S. Karar, D. J. Paez, P. E. Jessop, J. C. Cartledge, and A. P. Knights, "10 Gbps silicon waveguide-integrated infrared avalanche photodiode," *Opt. Express*, vol. 21, no. 17, pp. 19530–19537, Aug 2013.

- [133] S. Yang, Y. Liu, M. Cai, J. Bao, P. Feng, X. Chen, L. Ge, J. Yuan, J. Choi, P. Liu, Y. Suh, H. Wang, J. Deng, Y. Gao, J. Yang, X. Y. Wang, D. Yang, J. Zhu, P. Penzes, S. C. Song, C. Park, S. Kim, J. Kim, S. Kang, E. Terzioglu, K. Rim, and P. R. C. Chidambaram, "10nm high performance mobile SoC design and technology co-developed for performance, power, and area scaling," in 2017 Symposium on VLSI Technology, June 2017, pp. T70-T71.

- [134] T. J. Seok, N. Quack, S. Han, R. S. Muller, and M. C. Wu, "Large-scale broadband digital silicon photonic switches with vertical adiabatic couplers," *Optica*, vol. 3, no. 1, pp. 64–70, Jan 2016.
- [135] S. Chung, H. Abediasl, and H. Hashemi, "A Monolithically Integrated Large-Scale Optical Phased Array in Silicon-on-Insulator CMOS," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 1, pp. 275–296, Jan 2018.
- [136] M. Iqbal, M. A. Gleeson, B. Spaugh, F. Tybor, W. G. Gunn, M. Hochberg, T. Baehr-Jones, R. C. Bailey, and L. C. Gunn, "Label-free biosensor arrays based on silicon ring resonators and high-speed optical scanning instrumentation," *IEEE Journal of Selected Topics in Quantum Electronics*, vol. 16, no. 3, pp. 654–661, May 2010.
- [137] J. T. Robinson, L. Chen, and M. Lipson, "On-chip gas detection in silicon optical microcavities," *Opt. Express*, vol. 16, no. 6, pp. 4296–4301, Mar 2008.
- [138] C. M. Gentry, J. M. Shainline, M. T. Wade, M. J. Stevens, S. D. Dyer, X. Zeng, F. Pavanello, T. Gerrits, S. W. Nam, R. P. Mirin, and M. A. Popović, "Quantum-correlated photon pairs generated in a commercial 45nm complementary metal-oxide semiconductor microelectronic chip," *Optica*, vol. 2, no. 12, pp. 1065–1071, Dec 2015.
- [139] K. K. Mehta, C. D. Bruzewicz, R. McConnell, R. J. Ram, J. M. Sage, and J. Chiaverini, "Integrated optical addressing of an ion qubit," *Nature Nanotechnology*, vol. 11, p. 1066, Aug 2016.