What makes some words harder to learn than others in a second
language? Although some robust factors have been identified
based on small-scale experimental studies, many relevant
factors are difficult to study in such experiments due to the
amount of data necessary to test them. Here, we investigate
which factors affect how easily a word is learned in a second
language using a large data set of users learning English as a
second language through the Duolingo mobile app. In a
regression analysis, we test and confirm the well-studied effect
of cognate status on word learning accuracy. Furthermore, we
find significant effects for both cross-linguistic semantic
alignment and English semantic density, two novel predictors
derived from large-scale distributional models of lexical
semantics. Finally, we provide data on several other
psycholinguistically plausible word-level predictors. We
conclude with a discussion of the limits, benefits and future
research potential of using big data for investigating second
language learning.