Abstract. Microbialites form when microbial communities trap and bind sediment, and they are considered some of the most ancient records of life on Earth. Microbialites are commonly believed to be limited to extreme, hypersaline settings; however, more recent studies report their occurrence in a wider range of environments. The goal of this study is to explore whether microbialite-bearing sites share common geochemical properties. We apply statistical techniques to identify common traits in these environments. These techniques could ultimately be used to address questions of microbialite distribution: are microbialites restricted to environments with specific characteristics, or are they more broadly distributed? A dataset of hydrographic characteristics of several microbialite sites, comprising pH, conductivity, alkalinity, and concentrations of several major anions and cations, was constructed from previously published studies. To group the water samples by their natural similarities and differences, a clustering approach was chosen for analysis. k-means clustering with partial distances was applied to the dataset, which contains missing values, and separated the data into two clusters. One cluster consists of samples from Kiritimati Atoll (central Pacific Ocean), and the second contains all other observations. Using these two clusters, the missing values were imputed by the k-nearest-neighbor method, producing a complete dataset that can be used for further multivariate analysis. Salinity is not found to be an important variable in defining the clusters, and although pH defines clustering in this dataset, it is not an important variable for microbialite formation. The clustering and imputation procedures outlined here can be applied to an expanded dataset on microbialite characteristics in order to determine properties associated with microbialite-containing environments.
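
The two-step procedure summarized in the abstract (clustering incomplete records with partial distances, then imputing within clusters from nearest neighbors) can be illustrated with a minimal Python sketch. It is not the code used in the study. The function names, the rescaling of the partial distance by the fraction of observed variables, the centroid initialization, and the default parameter values are all assumptions made for illustration. Missing values are assumed to be encoded as NaN, each variable is assumed to have at least one observed value, and the variables are assumed to have been standardized beforehand.

```python
import numpy as np


def partial_sq_dist(X, C):
    """Squared partial distances from rows of X (NaN = missing) to centroids C.

    Only observed variables contribute; the sum is rescaled by p / n_observed
    (an assumed, commonly used convention) so that rows with different
    numbers of gaps remain comparable.
    """
    observed = ~np.isnan(X)                                        # (n, p)
    diff = np.where(observed[:, None, :], X[:, None, :] - C[None, :, :], 0.0)
    n_obs = np.maximum(observed.sum(axis=1), 1)                    # avoid /0
    return (diff ** 2).sum(axis=2) * (X.shape[1] / n_obs)[:, None]


def kmeans_partial(X, k=2, n_iter=100, seed=0):
    """k-means on incomplete data using partial distances (illustrative)."""
    rng = np.random.default_rng(seed)
    col_means = np.nanmean(X, axis=0)
    # Initialize from k random rows; gaps in those rows get the column mean.
    idx = rng.choice(len(X), size=k, replace=False)
    C = np.where(np.isnan(X[idx]), col_means, X[idx])
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        new_labels = partial_sq_dist(X, C).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue  # keep the previous centroid for an empty cluster
            count = (~np.isnan(members)).sum(axis=0)
            total = np.nansum(members, axis=0)
            # Each centroid coordinate is the mean of the observed values only.
            C[j] = np.where(count > 0, total / np.maximum(count, 1), C[j])
    return labels, C


def knn_impute_within_clusters(X, labels, n_neighbors=3):
    """Fill each gap with the mean of the nearest same-cluster donors."""
    X_filled = X.copy()
    p = X.shape[1]
    for i in np.flatnonzero(np.isnan(X).any(axis=1)):
        same = np.flatnonzero(labels == labels[i])
        same = same[same != i]
        for j in np.flatnonzero(np.isnan(X[i])):
            donors = same[~np.isnan(X[same, j])]
            if donors.size == 0:
                continue  # no donor has this variable; leave the gap
            # Partial distance over variables observed in both rows.
            both = ~np.isnan(X[i]) & ~np.isnan(X[donors])
            diff = np.where(both, X[donors] - X[i], 0.0)
            n_both = both.sum(axis=1)
            d = np.where(n_both > 0,
                         (diff ** 2).sum(axis=1) * p / np.maximum(n_both, 1),
                         np.inf)
            nearest = donors[np.argsort(d)[:n_neighbors]]
            X_filled[i, j] = X[nearest, j].mean()
    return X_filled
```

Under these assumptions, a call such as `labels, _ = kmeans_partial(X, k=2)` followed by `knn_impute_within_clusters(X, labels)` returns a filled copy of `X`. Restricting donors to the same cluster mirrors the abstract's statement that the two clusters were used for imputation; how the study selected the number of neighbors is not stated there, so `n_neighbors` is a placeholder.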