Space-filling designs are commonly used in computer experiments and other settings for investigating complex systems, but constructing such designs is challenging. In this thesis, we construct a series of maximin-distance Latin hypercube designs via Williams transformations of good lattice point designs. Some of the constructed designs are optimal under the maximin L1-distance criterion, while others are asymptotically optimal. These designs are also shown to have small pairwise correlations between columns. The procedure is further extended to the construction of multilevel nonregular fractional factorial designs, which have better properties than regular designs. Existing research on the construction of nonregular designs focuses on two-level designs. We construct a novel class of multilevel nonregular designs by permuting levels of regular designs via the Williams transformation. The constructed designs can reduce aliasing among effects without increasing the run size, and they are more efficient than regular designs for studying quantitative factors. In addition, we explore the application of experimental design strategies to data-driven problems and develop a subsampling framework for big data linear regression. The subsampling procedure inherits optimality from the design matrices and therefore minimizes the mean squared error of the coefficient estimates for sufficiently large data. It works especially well for label-constrained regression, where a large covariate dataset is available but only a small set of labels is observable. The subsampling procedure can also be used for big data reduction, where computation and storage are the primary concerns.
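The abstract does not spell out the construction, so the following is only a minimal sketch under stated assumptions. It uses the definitions commonly given in the design literature: a good lattice point design with N runs has entries (i · h) mod N for generators h coprime to N, and the Williams transformation maps a level x to 2x when x < N/2 and to 2(N − x) − 1 otherwise. The function names are hypothetical, and the thesis's actual constructions may involve further steps (for example level shifts or column selection) that are not shown here.

```python
from itertools import combinations
from math import gcd


def good_lattice_point_design(n_runs):
    """N x n design with entry (i * h) mod N for each generator h coprime to N."""
    generators = [h for h in range(1, n_runs) if gcd(h, n_runs) == 1]
    return [[(i * h) % n_runs for h in generators] for i in range(n_runs)]


def williams(x, n_runs):
    """Williams transformation of a level x in {0, ..., N-1} (assumed definition)."""
    return 2 * x if 2 * x < n_runs else 2 * (n_runs - x) - 1


def williams_design(design, n_runs):
    """Apply the Williams transformation to every entry of the design."""
    return [[williams(x, n_runs) for x in row] for row in design]


def min_l1_distance(design):
    """Smallest L1 (Manhattan) distance between any two distinct runs."""
    return min(
        sum(abs(a - b) for a, b in zip(r, s))
        for r, s in combinations(design, 2)
    )


N = 7  # illustrative run size, chosen prime so every h in 1..N-1 is a generator
glp = good_lattice_point_design(N)
transformed = williams_design(glp, N)

print("GLP min L1 distance:", min_l1_distance(glp))
print("Williams-transformed min L1 distance:", min_l1_distance(transformed))
```

Because the transformation is a permutation of the levels 0, ..., N − 1, each column of the transformed design is still a permutation of those levels, so the result remains a Latin hypercube design. Comparing the two printed distances is one way to see the effect of the level permutation on the maximin L1-distance criterion.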

## Scholarly Works (113 results)

We present a new linearly scaling three-dimensional fragment (LS3DF) method for large-scale ab initio electronic structure calculations. LS3DF is based on a divide-and-conquer approach, which incorporates a novel patching scheme that effectively cancels out the artificial boundary effects due to the subdivision of the system. As a consequence, the LS3DF program yields essentially the same results as direct density functional theory (DFT) calculations. The fragments of the LS3DF algorithm can be calculated separately with different groups of processors. This leads to almost perfect parallelization on tens of thousands of processors. After code optimization, we were able to achieve 35.1 Tflop/s, which is 39% of the theoretical speed on 17,280 Cray XT4 processor cores. Our 13,824-atom ZnTeO alloy calculation runs 400 times faster than a direct DFT calculation, even presuming that the direct DFT calculation can scale well up to 17,280 processor cores. These results demonstrate the applicability of the LS3DF method to material simulations, the advantage of using linearly scaling algorithms over conventional O(N^3) methods, and the potential for petascale computation using the LS3DF method.
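As a quick consistency check on the performance figures quoted above, the sustained rate and the stated fraction of peak imply a theoretical peak for the machine partition. The sketch below only rearranges numbers from the abstract; the variable names are illustrative, and because the 39% figure is rounded, the derived values are approximate.

```python
achieved_tflops = 35.1    # sustained rate reported in the abstract
fraction_of_peak = 0.39   # "39%" of theoretical speed (rounded in the abstract)
cores = 17_280            # Cray XT4 processor cores

# Peak implied by the two reported figures, in aggregate and per core.
implied_peak_tflops = achieved_tflops / fraction_of_peak
implied_peak_per_core_gflops = implied_peak_tflops * 1000 / cores

print(f"Implied aggregate peak: {implied_peak_tflops:.1f} Tflop/s")
print(f"Implied per-core peak:  {implied_peak_per_core_gflops:.2f} Gflop/s")
```

The reported numbers therefore correspond to roughly 90 Tflop/s of theoretical peak in aggregate, or about 5.2 Gflop/s per core.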

We have carried out a survey of codes and algorithms used on NERSC computers within the science category of materials science. This is part of the effort to track the usage of different algorithms in the NERSC community. The survey is based on the data provided in the FY06 ERCAP applications. To estimate the usage of each code within an account, we multiplied the account's total high-performance computing (HPC) time allocation (MPP hours) by the percentage usage of that code as estimated by the users in the ERCAP application. This is not the actual usage time, but it should be a good estimate of it, and it represents the intention of the users.
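The estimation rule described above (allocation multiplied by the user-reported percentage) can be sketched as follows. The account names, code names, and numbers are hypothetical placeholders, not survey data, and summing the per-account estimates across accounts is an assumption about how the totals might be tallied.

```python
from collections import defaultdict

# Hypothetical records: (account, allocation in MPP hours,
# {code name: fraction of the allocation the users expect to spend on it}).
accounts = [
    ("acct_a", 100_000, {"code_x": 0.5, "code_y": 0.5}),
    ("acct_b", 40_000, {"code_x": 0.25, "code_z": 0.75}),
]


def estimated_usage(records):
    """Estimate MPP hours per code as allocation * user-estimated fraction."""
    totals = defaultdict(float)
    for _account, mpp_hours, shares in records:
        for code, fraction in shares.items():
            totals[code] += mpp_hours * fraction
    return dict(totals)


usage = estimated_usage(accounts)
for code, hours in sorted(usage.items(), key=lambda item: -item[1]):
    print(f"{code}: {hours:,.0f} estimated MPP hours")
```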

An efficient new method is presented for calculating quantum transport using periodic boundary conditions. This method allows conventional ground-state ab initio programs to be used without major changes. The computational effort is only a few times that of a normal ground-state calculation, which makes accurate quantum transport calculations for large systems possible.