We present recent work on a prototype compact neutron generator for associated particle imaging (API). API uses the alpha particles that are produced simultaneously with neutrons in the deuterium-tritium fusion reaction, ³H(²H,n)⁴He, to determine the direction of the neutrons as they exit the reaction. This method determines the spatial position of each neutron interaction and requires the neutrons to be generated from a small spot in order to achieve high spatial resolution. The ion source for API is designed to produce a focused ion beam with a beam spot diameter of 1 mm or less on the target. We use an axial-type neutron generator with a predicted neutron yield of 10^8 n/s for a 50 μA D/T ion beam current accelerated to 80 kV. The generator utilizes an RF planar spiral antenna at 13.56 MHz to create a highly efficient inductively coupled plasma in the ion source. Experimental results show that beams with an atomic ion fraction of over 80% can be obtained with only 100 W of RF power in the ion source. A single acceleration gap with a secondary-electron suppression electrode is used in the tube. Experimental results from ion source testing, including the current density, atomic ion fraction, electron temperature, and electron density, will be discussed.
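As a quick sanity check on the quoted figures, the neutron yield per beam ion can be estimated from the 50 μA current and 10^8 n/s yield. The sketch below is back-of-the-envelope arithmetic only, assuming a singly charged beam so that one elementary charge corresponds to one ion; it is not part of the original analysis.

```python
def neutrons_per_ion(beam_current_amps, neutron_yield_per_s):
    """Back-of-the-envelope reaction probability per beam ion.

    Assumes a singly charged D/T beam, so the ion arrival rate is the
    beam current divided by the elementary charge.
    """
    elementary_charge = 1.602176634e-19  # coulombs
    ions_per_second = beam_current_amps / elementary_charge
    return neutron_yield_per_s / ions_per_second

# 50 uA beam producing 1e8 n/s: roughly 3 neutrons per 10 million ions.
print(neutrons_per_ion(50e-6, 1e8))
```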

## Scholarly Works (124 results)

To achieve the required damping time in the main damping rings for the Next Linear Collider (NLC), a wiggler will be required in each ring with an integrated squared field strength up to 110 T^2 m. There are concerns that nonlinear components of the wiggler field will degrade the dynamic aperture of the ring, leading to poor injection efficiency. Severe effects from an insertion device have been observed and corrected in SPEAR 2. In this paper, we describe a model that we have developed to study the effects of the damping wiggler, compare the predictions of the model with actual experience with the SPEAR 2 wiggler, and consider the predicted effects of the current damping wiggler design on the NLC main damping rings.

This thesis addresses the need to automate defect testing in the hard disk drive production process, focusing on machine learning and artificial intelligence. The objective is to predict defective units and improve the accuracy rate using tree-based algorithms and neural networks. These models can help the manufacturing process improve time and labor efficiency.

The existence of missing values is a major problem in real-world data. Unless those values are missing completely at random, we cannot simply disregard them. This paper demonstrates several methods for dealing with missing values, presenting the theory and implementation of the EM algorithm, regression imputation, stochastic regression imputation, and multiple imputation. The paper begins by introducing the theory behind these methods, then applies them to two examples, and finally presents diagnostics. Worked examples demonstrate each algorithm, with the aim of helping researchers from various backgrounds solve the missing-values problem in their data.
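Of the methods listed, stochastic regression imputation is the easiest to sketch compactly. The following is a minimal illustration, not the paper's implementation: it fits ordinary least squares on the complete cases and imputes each missing response as the fitted value plus Gaussian noise scaled by the residual standard deviation, which preserves variability that deterministic regression imputation would understate. All names are illustrative.

```python
import random
import statistics

def stochastic_regression_impute(pairs, seed=0):
    """Impute missing y-values via regression on x plus residual noise.

    pairs: list of (x, y) tuples where y may be None (missing).
    Returns a new list with every y filled in.
    """
    rng = random.Random(seed)
    complete = [(x, y) for x, y in pairs if y is not None]
    xs = [x for x, _ in complete]
    ys = [y for _, y in complete]
    xbar, ybar = statistics.fmean(xs), statistics.fmean(ys)
    # Ordinary least squares slope and intercept from the complete cases.
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in complete) / sxx
    intercept = ybar - slope * xbar
    # The residual standard deviation drives the stochastic component,
    # keeping the imputed values from understating the variance.
    resid = [y - (intercept + slope * x) for x, y in complete]
    sigma = statistics.stdev(resid) if len(resid) > 2 else 0.0
    filled = []
    for x, y in pairs:
        if y is None:
            y = intercept + slope * x + rng.gauss(0.0, sigma)
        filled.append((x, y))
    return filled
```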


Click-through rate (CTR) and conversion rate estimation are two core prediction tasks in online advertising. Four major challenges confront data scientists analyzing advertising data: sheer volume, since the amount of data available for mining is massive; complex structure, since there is no easy way to tell which factors drive a user to click an ad or convert, or how those factors interact; high cardinality in categorical variables, since features like device ID can take an enormous number of values, leading to very sparse data; and severe skewness in the response variable, since the majority of users never click the ad. In this paper, I give a comprehensive summary of the state-of-the-art machine learning models (decision-tree-based methods, regularized logistic regression, online learning, and factorization machines) that are often used in industry to solve these problems. Insights and practical tricks are then provided, based on a wide range of experiments conducted on multiple data sets with different characteristics.
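Two of the listed ingredients, online learning and high-cardinality categorical features, can be combined in a few lines. The sketch below is a generic illustration (not the paper's code) of online logistic regression with the hashing trick, which maps features like device IDs into a fixed-size weight vector so no explicit vocabulary is ever built; the class name and toy features are invented for the example.

```python
import math
import zlib

class HashedLogReg:
    """Online logistic regression with the hashing trick.

    Categorical features (arbitrary strings) are hashed into a fixed-size
    weight vector, so high-cardinality fields like device IDs never require
    an explicit vocabulary, at the cost of occasional hash collisions.
    """

    def __init__(self, n_bits=18, lr=0.1):
        self.dim = 1 << n_bits          # hashed weight-vector size
        self.w = [0.0] * self.dim
        self.lr = lr

    def _indices(self, features):
        # Stable hash so a feature always maps to the same bucket.
        return [zlib.crc32(f.encode()) % self.dim for f in features]

    def predict(self, features):
        z = sum(self.w[i] for i in self._indices(features))
        z = max(min(z, 35.0), -35.0)    # clip to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, clicked):
        # One stochastic gradient step on the logistic loss.
        p = self.predict(features)
        g = p - clicked                 # gradient w.r.t. the logit
        for i in self._indices(features):
            self.w[i] -= self.lr * g
```

With 2^18 buckets the memory footprint is fixed regardless of how many distinct device IDs stream past, which is the main appeal for sparse advertising data.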

Generative models, as an unsupervised learning approach, are a promising route to learning meaningful representations without focusing on specific tasks. Finding such generative models is one of the most fundamental problems in statistics, computer vision, and artificial intelligence research. The deep energy-based model (EBM) is one of the most promising candidates. Previous work has demonstrated the capability of EBMs on image domains. In this dissertation, we explore the capability of EBMs in three important domains: unordered set modeling, 3D shape representation, and continuous inverse optimal control. For each domain, we propose a novel EBM-based approach and obtain competitive results.

Originating in statistical physics, an EBM directly defines a probability density as the exponential of a negative energy function, where the energy function maps the input variable to a scalar energy. Training an EBM from observed data entails finding an energy function under which observed data are assigned lower energies than unobserved data. Given the observed training data, EBMs are trained by maximum likelihood estimation, which leads to an ``analysis by synthesis'' algorithm. The training process iterates two steps: (1) Synthesis step: sample data from the current probability distribution using the Markov chain Monte Carlo (MCMC) method. (2) Analysis step: update the model parameters based on the statistical difference between the synthesized data and the observed data. Compared with other commonly used generative models, such as the Generative Adversarial Network (GAN) or the Variational Auto-encoder (VAE), the EBM is appealing because (1) it provides an explicit density function for the data; (2) its training does not rely on any auxiliary models; (3) its training does not suffer from mode collapse; and (4) it unifies representation and generation in a single framework.
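The two-step loop above can be made concrete on a toy one-dimensional model. The sketch below is an illustration, not any model from the dissertation: it trains an EBM with energy E(x) = (x - mu)^2 / 2 (so the model density is N(mu, 1)), using short-run Langevin dynamics for the synthesis step and the data-versus-sample statistical difference for the analysis step; the fitted mu should track the data mean.

```python
import random

def train_ebm_mean(data, steps=200, langevin_steps=100, step=0.1, lr=0.5, seed=0):
    """Analysis-by-synthesis training of a toy 1-D EBM E(x) = (x - mu)^2 / 2.

    Synthesis step: short-run Langevin dynamics samples from the current model.
    Analysis step: mu is updated by the statistical difference between the
    observed data and the synthesized samples (here, their means), which is
    the maximum-likelihood gradient for this energy.
    """
    rng = random.Random(seed)
    mu = 0.0
    n = len(data)
    for _ in range(steps):
        # --- Synthesis: Langevin chains initialized from noise ---
        samples = []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            for _ in range(langevin_steps):
                grad = x - mu                       # dE/dx for E = (x - mu)^2 / 2
                x += -0.5 * step * grad + rng.gauss(0.0, step ** 0.5)
            samples.append(x)
        # --- Analysis: maximum-likelihood update on mu ---
        data_mean = sum(data) / n
        model_mean = sum(samples) / n
        mu += lr * (data_mean - model_mean)
    return mu
```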

We first propose an EBM for unordered set data, such as the point clouds widely used in 3D shape representation. We propose a generative model of unordered point sets in the form of an EBM, where the energy function is parameterized by an input-permutation-invariant bottom-up neural network. The energy function learns a coordinate encoding of each point and then aggregates all individual point features into an energy for the whole point cloud. We call our model the Generative PointNet because it can be derived from the discriminative PointNet. Our model can be trained by MCMC-based maximum likelihood learning (as well as its variants), without the help of any assisting networks like those in GANs and VAEs. Unlike most point cloud generators, our model does not require any hand-crafted distance metric for point cloud generation, because it synthesizes point clouds by matching observed examples in terms of statistical properties defined by the energy function. Furthermore, we can learn a short-run MCMC toward the EBM as a flow-like generator for point cloud reconstruction and interpolation. The learned point cloud representation is useful for point cloud classification and segmentation.
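The key structural idea, an input-permutation-invariant energy obtained by encoding each point with a shared network and sum-pooling, can be illustrated in miniature. The following toy network (with made-up random weights, not the Generative PointNet architecture) shows why the pooled energy is unchanged when the points are reordered.

```python
import math
import random

class PointSetEnergy:
    """A tiny input-permutation-invariant energy function for 2-D point sets.

    Each point is encoded independently by a shared one-layer network, the
    per-point features are sum-pooled, and a linear head maps the pooled
    feature to a scalar energy. Sum-pooling makes the energy invariant to
    the ordering of the points.
    """

    def __init__(self, hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(hidden)]
        self.b1 = [rng.gauss(0, 1) for _ in range(hidden)]
        self.w2 = [rng.gauss(0, 1) for _ in range(hidden)]

    def _encode(self, p):
        # Shared per-point encoder: tanh(W1 p + b1).
        return [math.tanh(sum(w * x for w, x in zip(row, p)) + b)
                for row, b in zip(self.w1, self.b1)]

    def energy(self, points):
        pooled = [0.0] * len(self.w2)
        for p in points:
            for i, h in enumerate(self._encode(p)):
                pooled[i] += h            # sum-pooling: order-independent
        return sum(w * h for w, h in zip(self.w2, pooled))
```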

We then design a novel implicit shape representation based on the EBM. Implicit representations, which use a function to represent a 3D shape, perform very well in 3D graphics. Unlike previous work, which requires a human-defined function, we propose an energy-based implicit function, naturally defined as the probability that a point lies on the surface. The energy-based implicit function learns a probability distribution for points over 3D space. With the introduction of a conditional latent code, an energy function approximated by a deep neural network can represent multiple objects. We use importance sampling and maximum likelihood estimation to learn this network. Our training procedure requires neither extra human-defined loss functions nor sample points off the surface. Furthermore, we combine this energy-based implicit function with a variational auto-encoder for improved generative capacity.

Finally, we address continuous inverse optimal control (over a finite time horizon) by learning the unknown cost function over a sequence of continuous control variables from expert demonstrations. We study this fundamental problem in the EBM framework, where the observed expert trajectories are assumed to be random samples from a probability density defined as the exponential of the negative cost function up to a normalizing constant. The parameters of the cost function are learned by maximum likelihood via an ``analysis by synthesis'' scheme, which iterates (1) a synthesis step that samples synthesized trajectories from the current probability density using Langevin dynamics via back-propagation through time, and (2) an analysis step that updates the model parameters based on the statistical difference between the synthesized trajectories and the observed trajectories. Since an efficient optimization algorithm is usually available for an optimal control problem, we also consider a convenient approximation of this learning method in which the sampling in the synthesis step is replaced by optimization. Moreover, to make the sampling or optimization more efficient, we propose training the EBM simultaneously with a top-down trajectory generator via cooperative learning, where the trajectory generator rapidly initializes the synthesis step of the EBM. We demonstrate that the proposed methods work well on autonomous driving tasks and show that they can learn suitable cost functions for optimal control.

It is widely held that larger language models, trained on vast quantities of text, excel at generating coherent and fluent text, while small language models still struggle to produce meaningful text beyond a few words. The specific scale at which these abilities emerge remains ill-defined, so a lingering question persists: must a model be large-scale to generate coherent text? In this paper we train a small language model on TinyStories, a synthetic dataset of short stories. The objective is to study the ability of small language models to generate coherent and consistent English text. We perform a comparative study in which we analyze the convergence of the loss and investigate how adjustments to the number of heads, the number of layers, and the embedding size affect the generation of English text in small language models.
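When varying heads, layers, and embedding size, it helps to track how each knob changes model capacity. The helper below is a generic, approximate parameter count for a GPT-style decoder (assuming tied input/output embeddings and a 4x feed-forward width; it is not the configuration used in the paper) and illustrates that the head count partitions the width rather than adding parameters.

```python
def gpt_param_count(vocab_size, n_layers, d_model, n_heads, d_ff=None):
    """Approximate parameter count of a GPT-style decoder-only transformer.

    Assumes tied input/output embeddings, biases and layer-norm parameters
    included, and d_ff = 4 * d_model by default. n_heads must divide d_model
    but does not change the count: heads split the width among themselves.
    """
    assert d_model % n_heads == 0
    d_ff = d_ff or 4 * d_model
    embed = vocab_size * d_model                    # token embedding (tied)
    # Attention: Q, K, V, and output projections, each d_model x d_model + bias.
    attn = 4 * (d_model * d_model + d_model)
    # Feed-forward: up- and down-projections with biases.
    ff = d_model * d_ff + d_ff + d_ff * d_model + d_model
    # Two layer norms per block, each with scale and shift vectors.
    ln = 2 * 2 * d_model
    return embed + n_layers * (attn + ff + ln) + 2 * d_model  # + final layer norm
```

Because head count only repartitions `d_model`, scaling studies that want to isolate its effect must hold width and depth fixed, as done in the comparative study above.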

Popular online review sites like Yelp greatly affect user purchase behavior. Users either search for reviews to judge the quality of businesses of interest or receive recommendations of businesses they might like. Accordingly, this paper explores review rating prediction with two approaches. Binary sentiment analysis uses vectorized review documents to predict generally positive or negative attitudes in text reviews; it gives higher prediction accuracy and adds real-world interpretability to text reviews. A nearest-neighbor collaborative filtering recommender predicts five-star ratings based on similar users and similar businesses; it produces predictions closer to actual ratings with decent model accuracy and can make user-specific recommendations.
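The nearest-neighbor collaborative filtering predictor can be sketched in a few lines. The following illustration is not the paper's implementation (the users and businesses are invented): it predicts a rating as the cosine-similarity-weighted average of the k most similar users who rated the target business.

```python
import math

def predict_rating(ratings, target_user, target_item, k=2):
    """User-based nearest-neighbor collaborative filtering.

    ratings: dict mapping user -> {item: star rating}. The prediction is
    the cosine-similarity-weighted average rating of the k most similar
    users who rated the target item; returns None if no such user exists.
    """
    def cosine(u, v):
        shared = set(u) & set(v)        # items both users rated
        if not shared:
            return 0.0
        num = sum(u[i] * v[i] for i in shared)
        den = (math.sqrt(sum(u[i] ** 2 for i in shared))
               * math.sqrt(sum(v[i] ** 2 for i in shared)))
        return num / den if den else 0.0

    me = ratings[target_user]
    candidates = []
    for other, theirs in ratings.items():
        if other == target_user or target_item not in theirs:
            continue
        candidates.append((cosine(me, theirs), theirs[target_item]))
    candidates.sort(reverse=True)       # most similar neighbors first
    top = candidates[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None
    return sum(sim * r for sim, r in top) / total
```

A production system would typically mean-center each user's ratings before computing similarity; the raw-cosine version above keeps the sketch short.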

Probabilistic generative models, especially ones parameterized by convolutional neural networks (ConvNets), are compact representation tools for knowledge understanding and can be crucial in statistics as well as artificial intelligence. The generator model and the energy-based model are two notable examples. Yet learning and understanding such models can be challenging because of the high dimensionality of the input and the high non-linearity of the network. In this dissertation, we pay particular attention to the generator model, studying its learning algorithm and the behavior of the learned model. We also develop a joint learning scheme for the generator model and the energy-based model.

To learn the generator model, we view it through the lens of a non-linear generalization of factor analysis and propose an alternating back-propagation algorithm for learning. The algorithm iterates two steps: (1) Inferential back-propagation, which infers the latent factors by Langevin dynamics or gradient descent. (2) Learning back-propagation, which updates the parameters by gradient descent given the inferred latent factors. The gradient computations in both steps are powered by back-propagation and share most of their code. We show that the alternating back-propagation algorithm can learn realistic generator models of natural images, video sequences, and sounds. Moreover, it can also learn from incomplete or indirect training data.
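The two alternating steps can be demonstrated on the simplest possible generator, a one-factor linear model x = w·z + noise (a toy stand-in for the ConvNet generator; everything below is illustrative). The inferential step runs short warm-started Langevin chains on each latent factor, and the learning step takes a gradient step on w; for data with variance 4 and unit noise variance, maximum likelihood gives |w| near sqrt(3).

```python
import math
import random

def alternating_backprop(data, steps=300, infer_steps=20, step=0.1,
                         lr_w=0.05, seed=0):
    """Toy alternating back-propagation for a one-factor linear generator
    x = w * z + noise, with latent z ~ N(0, 1) and unit noise variance.

    (1) Inferential step: short-run Langevin dynamics on the posterior
        objective U(z) = (x - w*z)^2 / 2 + z^2 / 2 infers each latent factor.
    (2) Learning step: gradient descent updates the generator weight w
        given the inferred factors.
    """
    rng = random.Random(seed)
    w = 0.5                      # small positive init avoids the w = 0 saddle
    zs = [0.0] * len(data)
    for _ in range(steps):
        # (1) Inferential back-propagation (warm-started Langevin chains).
        for i, x in enumerate(data):
            z = zs[i]
            for _ in range(infer_steps):
                grad_z = -(x - w * z) * w + z          # dU/dz
                z += -0.5 * step * grad_z + rng.gauss(0.0, math.sqrt(step))
            zs[i] = z
        # (2) Learning back-propagation: gradient of reconstruction loss in w.
        grad_w = -sum((x - w * z) * z for x, z in zip(data, zs)) / len(data)
        w -= lr_w * grad_w
    return w, zs
```

The noise in the Langevin step is what makes the learning update an (approximate) maximum-likelihood update; with pure gradient-descent inference the weight estimate would be biased.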

The generator model extends naturally to multi-view representation learning, where we build a separate generator model for each domain but share the latent variables. The proposed multi-view generator model can be easily learned through alternating back-propagation. Our experiments show that the proposed method is powerful in generation, prediction, and recognition. Specifically, we demonstrate that our model can accurately rotate and complete faces as well as predict missing modalities. We also show that our model achieves state-of-the-art or competitive recognition performance in quantitative comparisons.

Further, the generator model can be jointly learned with the energy-based model. We propose a probabilistic framework, the divergence triangle, as a compact and symmetric (anti-symmetric) objective function that seamlessly integrates variational learning, adversarial learning, the wake-sleep algorithm, and contrastive divergence. This unification makes sampling, inference, and energy evaluation readily available without the need for costly Markov chain Monte Carlo methods. Our experiments demonstrate that the divergence triangle is capable of learning (1) an energy-based model with a well-formed energy landscape, (2) direct sampling in the form of a generator model, and (3) feed-forward inference that faithfully reconstructs observed as well as synthesized data. The divergence triangle is also a robust training method that can learn from incomplete data.

Last but not least, we take inspiration from a recent discovery in neuroscience: for face stimuli generated by a pre-trained active appearance model (AAM), the responses of neurons in selected areas of the primate brain exhibit a strong linear relationship with the shape and appearance variables of the AAM that generates the stimuli. We show that this behavior can be replicated by a generator model. Specifically, we learn a generator model from face images generated by a pre-trained AAM using a variational auto-encoder, and we show that the inferred latent variables of the learned generator model have a strong linear relationship with the shape and appearance variables of the AAM that generated the images. Unlike the AAM, which has an explicit shape model in which shape variables generate the landmarks, the generator model has no such shape model or shape variables. Yet the generator model can learn shape knowledge, in the sense that some of the latent variables of the learned generator network capture the shape variations in the face images generated by the AAM.