Deep-Learning Electron Diffractive Imaging

We report the development of deep-learning coherent electron diffractive imaging at subangstrom resolution using convolutional neural networks (CNNs) trained with only simulated data. We experimentally demonstrate this method by applying the trained CNNs to recover the phase images from electron diffraction patterns of twisted hexagonal boron nitride, monolayer graphene, and a gold nanoparticle with comparable quality to those reconstructed by a conventional ptychographic algorithm. Fourier ring correlation between the CNN and ptychographic images indicates the achievement of a resolution in the range of 0.70 and 0.55 Å. We further develop CNNs to recover the probe function from the experimental data. The ability to replace iterative algorithms with CNNs and perform real-time atomic imaging from coherent diffraction patterns is expected to find applications in the physical and biological sciences. DOI: 10.1103/PhysRevLett.130.016101

Coherent diffractive imaging (CDI), which replaces the physical lens of a microscope by coherent illumination and a computational algorithm [1], is revolutionizing the imaging and microscopy field [2]. In particular, ptychography, a powerful scanning CDI method, takes advantage of overlapping illuminations in the sample plane as a constraint to simultaneously reconstruct the complex exit wave of the sample and the probe function [3,4]. Although ptychography was proposed to extract the phase differences of overlapping diffraction spots for crystalline samples in 1969 [5], the modern version of ptychography utilizing iterative algorithms to recover the phases of noncrystalline objects was demonstrated with coherent x rays in 2007 [3], which was based on the original CDI experiment in 1999 [1]. Ptychographic CDI with iterative algorithms has found wide applications with synchrotron radiation, high harmonic generation, electron, and optical microscopy [2,6–20].
Albeit powerful, iterative algorithms are not only computationally expensive, especially for the reconstruction of large fields of view, but also require practitioners to have algorithmic training to optimize the parameters for accurate phase retrieval. These difficulties have thus far prevented ptychographic CDI from being accessible to a wider user community. One approach to overcome these difficulties is to replace iterative algorithms with convolutional neural networks (CNNs) [21–26]. However, current experimental realizations of deep-learning CDI usually require a large experimental database to train the CNNs: numerous experiments with a specific physical setup must be performed before any predictions can be made on that same setup. Deep-learning CDI would become a more general and powerful method if one could perform real-time, automated phase retrieval without the requirement of using experimental databases to train CNNs. In this Letter, we report that such a method is not only possible, but powerful enough to compete against conventional ptychographic reconstruction algorithms for real-time, atomic-scale imaging. By training CNNs with a sufficiently large database of randomly generated stock images, we demonstrate the power and generality of deep-learning CDI with subangstrom resolution using three experimental data sets of different samples acquired from different electron microscopes.
Deep-learning CDI with augmented data begins with the forward propagation of a coherent source. Figure 1(a) shows a diagram of a typical electron ptychography setup where a focused coherent source illuminates an object. The resulting wave function is propagated to the far field and only the square of the magnitude of the wave function is measured by a pixel array detector. Mathematically, this forward process relates the object function and the measurement by

$$M(\mathbf{k}) = \left| \mathcal{F}\left[ P(\mathbf{r}) \cdot O(\mathbf{r}) \right] \right|,$$

where $M(\mathbf{k})$ is the magnitude of the Fourier transform ($\mathcal{F}$), and $P(\mathbf{r})$ and $O(\mathbf{r})$ are the complex probe and object functions, respectively. Since the phase of the Fourier transform is lost during measurement, the inverse of this forward process is nonlinear. Figure 1(b) shows the process of converting a random stock image found on the Internet into a pure phase object, which is illuminated by a probe function to produce an exit wave. The magnitude and phase of the probe function are estimated based on the defocus and aberration of the electron optics. The diffraction intensity of the exit wave with Poisson noise is calculated to satisfy the oversampling requirement for phase retrieval [27]. The square root of the noisy diffraction intensity is used to train CNNs with an L1-norm loss function to recover the phases in the illuminated area (named a phase patch) [Fig. 1(c)]. Note that from the diffraction intensity alone, one cannot distinguish a function $f(\mathbf{r})$ from its twin, $f^{*}(-\mathbf{r})$, which is known as the twin image problem [27]. CNNs solve this problem by using the constraint of the probe function. When training CNNs, the function taken to the far field is modeled as $f(\mathbf{r}) = P(\mathbf{r}) \cdot O(\mathbf{r})$, whereas its twin function is represented by

$$f^{*}(-\mathbf{r}) = P^{*}(-\mathbf{r}) \cdot O^{*}(-\mathbf{r}) = P(\mathbf{r}) \cdot \left[ \frac{P^{*}(-\mathbf{r})}{P(\mathbf{r})} \right] O^{*}(-\mathbf{r}).$$

Since the inequality $P(\mathbf{r}) \neq P^{*}(-\mathbf{r})$ usually holds due to the aberration and/or defocus of the probe, the CNN is trained to recover $O(\mathbf{r})$ and can eliminate the nonphysical solution $[P^{*}(-\mathbf{r})/P(\mathbf{r})]\,O^{*}(-\mathbf{r})$, as demonstrated in our numerical simulations and experimental results below.
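The forward process described above can be illustrated with a minimal NumPy sketch. It is not the code used in this work: the function names (`toy_probe`, `simulate_magnitude`, `twin`), the dimensionless defocus-only probe, the dose, and the phase range are all illustrative assumptions, and the cropping and oversampling steps are omitted. The sketch converts an image into a pure phase object, multiplies it by a probe, propagates the exit wave to the far field, adds Poisson noise, and returns the square root of the noisy intensity.

```python
import numpy as np

def toy_probe(n=64, k_cut=0.15, defocus=50.0):
    """Illustrative probe: circular aperture with a quadratic (defocus-like) phase.

    k_cut is in cycles/pixel and defocus is dimensionless; both are assumed
    values, not experimental parameters.
    """
    k = np.fft.fftfreq(n)                          # spatial frequency, cycles/pixel
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    aperture = (k2 <= k_cut**2).astype(float)
    chi = np.pi * defocus * k2                     # defocus-only aberration phase
    probe = np.fft.fftshift(np.fft.ifft2(aperture * np.exp(-1j * chi)))
    return probe / np.sqrt(np.sum(np.abs(probe) ** 2))   # unit total intensity

def simulate_magnitude(image, probe, dose=1e4, max_phase=1.0, rng=None):
    """Return (sqrt of noisy diffraction intensity, ground-truth phase)."""
    rng = np.random.default_rng() if rng is None else rng
    img = image.astype(float)
    span = img.max() - img.min()
    phase = max_phase * (img - img.min()) / span if span > 0 else np.zeros_like(img)
    obj = np.exp(1j * phase)                       # pure phase object O(r)
    exit_wave = probe * obj                        # P(r) * O(r)
    intensity = np.abs(np.fft.fft2(exit_wave, norm="ortho")) ** 2
    counts = rng.poisson(dose * intensity)         # Poisson (shot) noise
    return np.sqrt(counts), phase

def twin(f):
    """Discrete twin image conj(f(-r)), with periodic indexing."""
    return np.conj(np.roll(f[::-1, ::-1], shift=(1, 1), axis=(0, 1)))
```

In this sketch, `f = probe * np.exp(1j * phase)` and `twin(f)` have identical Fourier magnitudes, which is the twin image ambiguity discussed above. The two can only be told apart because the probe factor is known and not symmetric.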
The CNNs have an encoder-decoder architecture [Fig. 1(d)], commonly known as U-Net [28], with skip connections between corresponding tensor sizes to prevent vanishing gradient issues. A schematic of the residual layers is shown in Supplemental Material [29], Fig. S1, with skip connections acting as concatenations that provide direct throughput within the architecture. In our experience, using randomly generated stock photos from the Internet provides a rich source of entropy within the images to sufficiently train the CNNs without imposing any regularizations [30]. Detailed examples of phase image generation, forward process, and CNN performance on validation data can be found in Supplemental Material [29], Fig. S2, while a step-by-step procedure on how real-space support is applied with cropping and oversampling is shown in Supplemental Material [29], Fig. S3. As the diffraction intensity is corrupted with Poisson noise, a low learning rate of $1.0 \times 10^{-4}$ is used and a high dropout rate of 0.2 is applied at every layer to prevent overfitting of noise while training with stochastic gradient descent. A quantitative study of the effect of the training data size on CNNs indicated that 250 000 simulated diffraction patterns of stock images from the Internet are sufficient for accurate phase retrieval (Supplemental Material [29], Fig. S4). The trained CNN is then used to directly map from the magnitudes of the Fourier transform to phase patches without any iteration, and the phase patches are merged to form a phase image by a stitching algorithm [Fig. 1(e)]. The detailed description of the stitching algorithm is provided in Supplemental Material [29].
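To make the idea of merging phase patches concrete, the following is a minimal sketch of one possible stitching scheme: weighted averaging of overlapping patches on a common canvas. This is an assumption for illustration only and is not the stitching algorithm of Supplemental Material [29], which may, for example, handle phase offsets between patches differently. The function name `stitch_patches` and its arguments are hypothetical.

```python
import numpy as np

def stitch_patches(patches, positions, weight, canvas_shape):
    """Merge phase patches by weighted averaging (illustrative assumption).

    patches      : iterable of (h, w) real arrays, one per scan position
    positions    : iterable of (row, col) integer top-left corners on the canvas
    weight       : (h, w) nonnegative array, e.g. the probe intensity
    canvas_shape : shape of the stitched phase image
    """
    acc = np.zeros(canvas_shape)     # weighted sum of patch values
    wsum = np.zeros(canvas_shape)    # sum of weights per canvas pixel
    h, w = weight.shape
    for patch, (r, c) in zip(patches, positions):
        acc[r:r + h, c:c + w] += weight * patch
        wsum[r:r + h, c:c + w] += weight
    out = np.zeros(canvas_shape)
    np.divide(acc, wsum, out=out, where=wsum > 0)   # leave uncovered pixels at 0
    return out
```

Weighting by the probe intensity down-weights the edges of each patch, where the illumination is weak and the retrieved phase is expected to be least reliable.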
To investigate the CNN and the stitching algorithm under various shot noise conditions, we simulate diffraction patterns with three different orders of the noise and confirm that the final stitched phase reconstructions using the CNN are consistent with those obtained by the extended ptychographic iterative engine (ePIE) [31] (Supplemental Material [29], Fig. S5).
To demonstrate the versatility of deep-learning CDI, we performed electron ptychography experiments on three different samples using different microscopes. The first sample consists of two 5-nm-thick hexagonal boron nitride (h-BN) flakes with a twisted interface [32]. The experiment was conducted on the TEAM I double-corrected S/TEM instrument at the National Center for Electron Microscopy (NCEM) (Supplemental Material [29], Table S1). The microscope was equipped with a Gatan K3 pixelated detector, which operated in electron-counting mode and was binned (×2) and windowed (×2) to 512 × 512 pixels. The diffraction patterns were further binned to 32 × 32 pixels in postprocessing. Figure 2(a) shows a representative diffraction pattern with high noise. Conventional STEM imaging modes such as annular dark-field (ADF) imaging produce images with a poor signal-to-noise ratio [Fig. 2(b)]. To train the CNN, a probe function is analytically generated by parametrizing the aberration function up to second order with seven parameters in total: one from defocus, two from twofold astigmatism, two from coma, and two from threefold astigmatism [33]. After training with 250 000 simulated diffraction patterns (Supplemental Material [29], Table S2), [...] the ePIE reconstruction in Figs. 2(e) and 2(f), which is implemented on the GPU in MATLAB and takes 14 min for 50 iterations. Fourier ring correlation (FRC) between the CNN and ePIE reconstructions shows good agreement between the two methods [Fig. 2(g)]. The slight difference at low spatial frequencies of the FRC curve is likely because the stitching algorithm can combine the phase patches together more uniformly than ePIE. Based on the cutoff criteria of FRC = 0.5 and 0.143 [34], the resolution of the CNN phase image is determined to be 0.71 and 0.53 Å, respectively [red and blue dashed lines in Fig. 2(g)], demonstrating that both methods consistently reconstructed the diffraction signal beyond the bright-field disk.
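The seven-parameter probe model mentioned above (defocus, twofold astigmatism, coma, and threefold astigmatism) can be sketched as follows. The aberration function is written here in a commonly used notation, $\chi = (2\pi/\lambda)[\tfrac{1}{2}C_{10}\theta^2 + \tfrac{1}{2}C_{12}\theta^2\cos 2(\phi-\phi_{12}) + \tfrac{1}{3}C_{21}\theta^3\cos(\phi-\phi_{21}) + \tfrac{1}{3}C_{23}\theta^3\cos 3(\phi-\phi_{23})]$; this notation, the sign convention $e^{-i\chi}$, and the function name `aberration_probe` are assumptions and may differ from the parametrization of Ref. [33].

```python
import numpy as np

def aberration_probe(n, pixel_size, wavelength, semi_angle,
                     c10=0.0, c12=(0.0, 0.0), c21=(0.0, 0.0), c23=(0.0, 0.0)):
    """Real-space probe from a second-order aberration function (sketch).

    pixel_size and wavelength are in angstroms, semi_angle in radians.
    c10 is the defocus coefficient; c12, c21, c23 are (magnitude, azimuth)
    pairs for twofold astigmatism, coma, and threefold astigmatism,
    giving seven parameters in total. Coefficients are in angstroms.
    """
    k = np.fft.fftfreq(n, d=pixel_size)             # spatial frequency, 1/angstrom
    kx, ky = np.meshgrid(k, k, indexing="ij")
    theta = wavelength * np.hypot(kx, ky)           # small-angle scattering angle
    phi = np.arctan2(ky, kx)                        # azimuth
    chi = (2 * np.pi / wavelength) * (
        0.5 * c10 * theta**2
        + 0.5 * c12[0] * theta**2 * np.cos(2 * (phi - c12[1]))
        + (1 / 3) * c21[0] * theta**3 * np.cos(phi - c21[1])
        + (1 / 3) * c23[0] * theta**3 * np.cos(3 * (phi - c23[1]))
    )
    aperture = theta <= semi_angle                  # probe-forming aperture
    psi = np.fft.fftshift(np.fft.ifft2(aperture * np.exp(-1j * chi)))
    return psi / np.sqrt(np.sum(np.abs(psi) ** 2))  # unit total intensity
```

For example, `aberration_probe(64, 0.2, 0.0197, 0.02, c10=100.0)` gives a defocused probe on a 64 × 64 grid. The numerical values here are placeholders chosen for illustration, not the experimental settings of Table S1.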
Next, we use an experimental data set of monolayer graphene to investigate the tolerance of the stitching algorithm to varying overlap of the diffraction patterns. The ptychographic data set was acquired using a JEOL 4DCanvas pixelated detector installed on a JEM-ARM200F probe-corrected microscope (Supplemental Material [29], Table S1) [35,36]. The probe function is analytically generated with a second-order aberration function as implemented in the h-BN experiment, and a CNN is trained with 250 000 simulated diffraction patterns (Supplemental Material [29], Table S2). Phase patches independently retrieved by the CNN from the experimental diffraction patterns are stitched together to form a phase image [Fig. 3(a)], which is in good agreement with the ePIE reconstruction of the same data set [Fig. 3(e)]. To study the performance of the stitching algorithm with varying overlap of the diffraction patterns, every other diffraction pattern in both x and y scanning directions is used to reconstruct the phase image in Fig. 3(b), doubling the scanning step size from 0.132 to 0.264 Å and quartering the effective electron dose. Additionally, the stitched phase images of the CNN by taking every third (0.396 Å step size) and every fourth (0.528 Å step size) diffraction pattern are [...]

To examine the performance of the CNN with strongly scattering atoms, we performed a ptychographic experiment on a 5-nm Au nanoparticle. The diffraction patterns were acquired using the 4D camera (576 × 576 pixels) [37] installed on the TEAM 0.5 double-corrected S/TEM instrument at NCEM (Supplemental Material [29], Table S1). Figure 4(a) shows a representative diffraction pattern after binning with high noise. An ADF STEM image is generated by integrating the intensity outside the bright-field disk [Fig. 4(b)], corresponding to an incoherent image of the sample.
The probe function is analytically generated with a second-order aberration function as for the h-BN and graphene cases, and a CNN is trained with 250 000 simulated diffraction patterns (Supplemental Material [29], Table S2). Figures 4(c) and 4(d) show the phase images retrieved by the CNN and ePIE, respectively, exhibiting consistent atomic features along the zone axis. The FRC curve of the two images indicates a resolution of 0.70 and 0.55 Å with the cutoff criteria of FRC = 0.5 and 0.143, respectively [Fig. 4(e)]. The dip in the low spatial frequencies in the FRC curve is consistent with the observation in the h-BN result [Fig. 2(g)].
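The FRC analysis used to compare the CNN and ePIE phase images can be sketched in NumPy as below. This is a generic implementation of the standard FRC definition, not the authors' code; the function names, the integer ring binning, and the rule of taking the first ring that falls below the threshold are assumptions.

```python
import numpy as np

def fourier_ring_correlation(img1, img2):
    """FRC between two real, square images of equal size, one value per ring."""
    n = img1.shape[0]
    f1 = np.fft.fftshift(np.fft.fft2(img1))
    f2 = np.fft.fftshift(np.fft.fft2(img2))
    y, x = np.indices(img1.shape)
    ring = np.hypot(x - n // 2, y - n // 2).astype(int)   # integer ring index
    nbins = n // 2
    mask = ring < nbins                                   # keep rings up to Nyquist
    idx = ring[mask]
    num = np.bincount(idx, weights=(f1 * np.conj(f2)).real[mask], minlength=nbins)
    d1 = np.bincount(idx, weights=(np.abs(f1) ** 2)[mask], minlength=nbins)
    d2 = np.bincount(idx, weights=(np.abs(f2) ** 2)[mask], minlength=nbins)
    denom = np.sqrt(d1 * d2)
    return np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)

def resolution_at(frc, threshold, pixel_size):
    """Resolution (same unit as pixel_size) where the FRC first drops below threshold."""
    n = 2 * len(frc)                       # image side length implied by the ring count
    below = np.nonzero(frc < threshold)[0]
    if below.size == 0:
        return 2 * pixel_size              # never drops: report the Nyquist limit
    ring = max(int(below[0]), 1)           # avoid dividing by the DC ring
    return n * pixel_size / ring
```

With the two cutoff criteria quoted in the text, one would call `resolution_at(frc, 0.5, pixel_size)` and `resolution_at(frc, 0.143, pixel_size)` on the same FRC curve.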
To remove the necessity of estimating the probe function from the defocus and aberration parameters, we have developed a CNN-based probe recovery method. A CNN is trained using simulated diffraction patterns illuminated by randomly generated probe functions with seven defocus and aberration parameters (Supplemental Material [29], Table S3). The probe recovery process starts with an initial probe function created by randomly choosing seven defocus and aberration parameters. The trained CNN retrieves the phase patches from pairs of adjacent diffraction patterns with the initial probe function as input. The cumulative L1 error within the overlapping regions is used as the loss function to optimize the defocus and aberration parameters [Supplemental Material [29], Fig. S6(a)]. We employ 100 pairs of diffraction patterns, and a well-defined probe function can be recovered after 50 iterations of stochastic optimization with 30 trial searches per iteration. Figures S6(b)-S6(d) in Supplemental Material [29] show the initial probe function and the probe functions retrieved by the CNN and ePIE, respectively. With the initial (incorrect) probe function, the CNN retrieves a phase image of the twisted h-BN interface with degraded atomic structure [Supplemental Material [29], Figs. S6(e) and S6(f)]. With the CNN-recovered probe function, the phase image of the h-BN sample retrieved by the CNN [Supplemental Material [29], Figs. S6(g) and S6(h)] agrees well with the ePIE reconstruction [Figs. 2(e) and 2(f)]. We also apply the same trained CNN to other experimental data, and successfully recover the probe function and the phase image of the samples from the diffraction patterns alone.
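The structure of this probe recovery loop can be sketched as below. The text specifies only the loss (cumulative L1 error in the overlapping regions) and the budget (50 iterations with 30 trial searches each); the Gaussian perturbation scheme, the step size, its decay, and the function names `overlap_l1` and `random_search` are assumptions for illustration. In practice `loss_fn` would wrap the trained CNN and sum `overlap_l1` over the 100 pairs of patches; here it is left as a user-supplied callable.

```python
import numpy as np

def overlap_l1(patch_a, patch_b, shift):
    """L1 error between two patches in their overlap.

    patch_b is assumed to be displaced by `shift` columns (shift >= 0)
    relative to patch_a along the scan direction.
    """
    width = patch_a.shape[1]
    return np.abs(patch_a[:, shift:] - patch_b[:, :width - shift]).sum()

def random_search(loss_fn, dim=7, n_iters=50, n_trials=30,
                  step=0.5, decay=0.95, rng=None):
    """Stochastic search over `dim` probe parameters (assumed scheme).

    Each iteration draws n_trials Gaussian perturbations of the current
    best parameters and keeps the best trial if it lowers the loss.
    """
    rng = np.random.default_rng() if rng is None else rng
    best = rng.uniform(-1.0, 1.0, size=dim)     # random initial parameters
    best_loss = loss_fn(best)
    for _ in range(n_iters):
        trials = best + step * rng.standard_normal((n_trials, dim))
        losses = np.array([loss_fn(t) for t in trials])
        i = int(np.argmin(losses))
        if losses[i] < best_loss:               # accept only improvements
            best, best_loss = trials[i], losses[i]
        step *= decay                           # shrink the search radius
    return best, best_loss
```

Because only improving trials are accepted, the loss in this sketch never increases from one iteration to the next, and no gradient through the CNN is required.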
We demonstrate deep-learning CDI at subangstrom resolution using CNNs trained only with stock images from the Internet. The CNNs are then coupled with a stitching algorithm to directly retrieve the phase images of twisted h-BN, monolayer graphene, and a Au nanoparticle from experimental electron diffraction patterns. Quantitative analysis using FRC curves indicates that the phase images recovered by deep-learning CDI have comparable quality to those reconstructed by ePIE. The spatial resolution of these phase images is quantified by the FRC to be in the range of 0.71-0.53 Å. We also demonstrate that the CNNs can be used to recover the probe function from the experimental diffraction patterns based on the overlap conditions. Compared with iterative algorithms such as ePIE [3,4,31], deep-learning CDI independently recovers phase patches at different scanning positions by CNNs and then stitches them together to form a phase image, which can be implemented in real time with a computational time much shorter than that of iterative algorithms. In contrast to noniterative ptychographic methods that require small scan step sizes to sample at the Nyquist limit [10,38], deep-learning CDI can be implemented with larger scan step sizes to achieve larger fields of view, which can be tuned by defocusing the probe (Supplemental Material [29], Fig. S7). Additionally, it can reach higher spatial resolutions than the noniterative methods, as the resolution of deep-learning CDI is only limited by the spatial frequency of the diffraction signal. Although we focus on 2D atomic-scale imaging in this Letter, deep-learning CDI can also be combined with atomic electron tomography [39,40] to determine the 3D atomic structure of radiation-sensitive, low-Z, and amorphous materials [41–43]. With further development, we expect that deep-learning CDI will become an important tool for real-time, atomic-scale imaging of a wide range of samples across different disciplines.
All the data and codes are available at [44].
We thank Gatan, Inc. as well as P.