Recent studies have shown that the long-held claim of an arbitrary
relation between the sound of a word and its meaning needs to be
revisited. In two computational studies, we
investigated whether word valence can be derived from sound
features in English, Dutch and German. In Study 1, we identified
the extent to which individual phonological features explained
valence scores in each language separately. In Study 2, we aimed to
determine the optimal combination of cues that can predict valence
scores across the three languages using two statistical classifiers
and four machine learning classifiers. Our results showed that
frequency and word complexity were the most reliable shared cues
for predicting valence in all three languages, yielding correct
valence classification in about 60% of cases. This percentage could
be improved for individual languages or pairs of languages by adding
further relevant cues. These findings demonstrate that the claim of
an arbitrary relation between the sound of a word and its meaning is
too strong.