Using machine learning to classify extant apes and interpret the dental morphology of the chimpanzee-human last common ancestor


Machine learning is a formidable tool for pattern recognition in large datasets. We extended these methods and applied them to a problem in paleoanthropology and evolution. For decades, paleontologists have used the chimpanzee as a model for the chimpanzee-human last common ancestor (LCA) because it is our closest living relative. Using a large sample of extant and extinct primates, we tested the hypothesis that machine learning methods can accurately classify extant apes based on dental data. We then used this classification tool to assess the affinities between extant apes and Miocene hominoids. We assessed the discrimination accuracy of supervised learning algorithms tasked with classifying extant apes (n=175), using three types of data from the postcanine dentition: linear measurements, 2-dimensional measurements, and the morphological output of two genetic patterning mechanisms that are independent of body size, the molar module component (MMC) and premolar-molar module (PMM) ratios. We next used the trained algorithms to classify a sample of fossil hominoids (n=95), treated as unknowns. Machine learning classifies extant apes with greater than 92% accuracy using linear and 2-dimensional dental measurements, and with greater than 60% accuracy using the MMC and PMM ratios. In dental size and shape, Miocene hominoids are morphologically most similar to extant chimpanzees. In relative dental proportions, however, Miocene hominoids are more similar to extant gorillas and follow a strong trajectory through evolutionary time. Machine learning is a powerful tool that can discriminate between the dentitions of extant apes with high accuracy and quantitatively compare fossil and extant morphology. Beyond detailing applications of machine learning to vertebrate paleontology, our study highlights the impact of the phenotypes chosen for study and the importance of comparative samples in paleontological studies.
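The two-step workflow described above (train on labeled extant specimens, then classify fossils treated as unknowns) can be illustrated with a minimal sketch. It is not the study's method: the abstract does not name the supervised algorithms used, so a simple nearest-centroid classifier with leave-one-out accuracy stands in for them. The taxon labels, the feature values, and the function names `centroids`, `classify`, and `loo_accuracy` are all hypothetical placeholders, not measurements or code from the study.

```python
from math import dist
from statistics import fmean


def centroids(samples):
    """Return {label: mean feature vector} for (label, features) pairs."""
    by_label = {}
    for label, features in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(fmean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def classify(features, cents):
    """Assign the label whose centroid is nearest in Euclidean distance."""
    return min(cents, key=lambda label: dist(features, cents[label]))


def loo_accuracy(samples):
    """Leave-one-out accuracy: hold out each specimen, train on the rest."""
    hits = 0
    for i, (label, features) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        hits += classify(features, centroids(training)) == label
    return hits / len(samples)


# Invented placeholder values standing in for two size-independent ratios
# (for example MMC and PMM). These are NOT real measurements.
EXTANT = [
    ("taxon_A", (0.90, 1.45)),
    ("taxon_A", (0.92, 1.50)),
    ("taxon_A", (0.88, 1.48)),
    ("taxon_B", (1.05, 1.30)),
    ("taxon_B", (1.08, 1.28)),
    ("taxon_B", (1.03, 1.33)),
]

# Step 1: estimate discrimination accuracy on the labeled (extant) sample.
accuracy = loo_accuracy(EXTANT)

# Step 2: classify an unlabeled (fossil) specimen with the trained model.
fossil_label = classify((1.06, 1.29), centroids(EXTANT))
```

In this toy setup the held-out accuracy plays the role of the reported classification accuracy for extant apes, and the label assigned to the unknown specimen plays the role of a fossil's morphological affinity. A real analysis would need far larger samples and a comparison of several algorithms.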

