What creates violations of linguistic universals? Correlates of linguistic rarities and their importance in studying linguodiversity
Absolute linguistic universals (ALUs) are theoretically useful. However, do non-trivial ALUs exist? Two reasons for doubt can be identified. First, many proposed ALUs have turned out to be only strong statistical tendencies. Second, computational modeling suggests that positive typological evidence alone can rarely establish an ALU. Thus, the debate over the existence of ALUs persists. We advance this discussion by analyzing which languages tend to violate previously proposed ALUs and whether these languages are distributed randomly with respect to sociogeography and phylogeny. A non-random distribution would suggest bias in the estimation of ALUs from existing data. We find that ALU violations are distributed non-randomly: languages with more speakers, certain language families (including Indo-European), and language isolates violate ALUs more often than expected. The results suggest that linguistic diversity is underdescribed and that many ALUs are proposed because sampling biases lead researchers to underestimate the space of possible linguistic structures.