Gaussian processes (GPs) are widely used in science and engineering as metamodels. Standard GPs, however, can handle only quantitative (numerical) variables. In this paper, we introduce latent map Gaussian processes (LMGPs) that inherit the attractive properties of GPs but are also applicable to mixed data that have both quantitative and qualitative inputs. The core idea behind LMGPs is to learn a low-dimensional manifold where all qualitative inputs are represented by quantitative features. To learn this manifold, we first assign a unique prior vector representation to each combination of qualitative inputs. We then use a linear map to project these priors onto a manifold that characterizes the posterior representations. As the posteriors are quantitative, they can be used in any standard correlation function such as the Gaussian. Hence, the optimal map and the corresponding manifold can be efficiently learned by maximizing the Gaussian likelihood function.
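The construction described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation, and several details are assumptions made only for illustration: the priors are taken to be one-hot vectors over all level combinations, the roughness of each quantitative input is parameterized as `10**omega`, the GP mean is a constant, and the function names (`build_priors`, `gaussian_correlation`, `negative_log_likelihood`) are hypothetical. The sketch shows how a linear map `A` turns priors into latent points that enter a Gaussian correlation function, and how the map and the roughness parameters appear together in a single likelihood-based objective.

```python
import itertools
import numpy as np


def build_priors(levels_per_variable):
    """Assign a unique prior vector to each combination of qualitative levels.

    Assumption: one-hot priors, so the prior matrix is the identity.
    """
    combos = list(itertools.product(*(range(m) for m in levels_per_variable)))
    return combos, np.eye(len(combos))


def gaussian_correlation(X, Z, omega):
    """Gaussian correlation over quantitative inputs X and latent points Z.

    R_ij = exp(-sum_k 10**omega_k (x_ik - x_jk)**2 - ||z_i - z_j||**2)
    (assumed parameterization).
    """
    Xs = X * np.sqrt(10.0 ** omega)          # scale each quantitative input
    U = np.hstack([Xs, Z])                   # joint quantitative + latent space
    sq = np.sum((U[:, None, :] - U[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq)


def negative_log_likelihood(params, X, combo_idx, y, priors, latent_dim,
                            nugget=1e-6):
    """Concentrated Gaussian negative log-likelihood (up to constants).

    params stacks the roughness parameters omega and the entries of the
    linear map A; combo_idx gives each sample's level-combination index.
    """
    n, dx = X.shape
    omega = params[:dx]
    A = params[dx:].reshape(priors.shape[1], latent_dim)

    # Linear map: prior representations -> posterior (latent) representations.
    Z = (priors @ A)[combo_idx]

    R = gaussian_correlation(X, Z, omega) + nugget * np.eye(n)
    L = np.linalg.cholesky(R)

    def solve(b):
        # Solve R v = b using the Cholesky factor L (R = L L^T).
        return np.linalg.solve(L.T, np.linalg.solve(L, b))

    ones = np.ones(n)
    beta = (ones @ solve(y)) / (ones @ solve(ones))   # constant mean estimate
    resid = y - beta
    sigma2 = (resid @ solve(resid)) / n               # process variance estimate
    log_det_R = 2.0 * np.sum(np.log(np.diag(L)))
    return n * np.log(sigma2) + log_det_R
```

Minimizing `negative_log_likelihood` over `params` with any general-purpose optimizer would estimate the map and the roughness parameters jointly; the rows of `priors @ A` can then be plotted to inspect how the level combinations are arranged in the latent space.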
Through a wide range of analytical and real-world examples, we demonstrate the advantages of LMGPs over state-of-the-art methods in terms of accuracy and versatility. In particular, we show that LMGPs can handle variable-length inputs and provide insights into how qualitative inputs affect the response or interact with each other. We also provide a neural network interpretation of LMGPs and study the effect of prior latent representations on their performance. Lastly, we demonstrate that LMGPs can be applied to data assimilation problems in which data from multiple sources (e.g., simulations, mathematical models, and real-world experiments) are fused to improve prediction performance.