A Layered Bridge from Sound to Meaning: Investigating Cross-linguistic Phonosemantic Correspondences
The present paper investigates cross-linguistic phonosemantic correspondences within a deep learning framework. An LSTM-based recurrent neural network is trained to map the phonetic representation of a word, encoded as a sequence of feature vectors, onto its semantic representation in a multilingual, cross-family vector space. The trained network is then tested, without further training, on a language that does not appear in the training set and belongs to a different language family. The model's performance is evaluated against a monolingual, mono-family upper bound and a randomized baseline. After assessing the network's performance, the distribution of phonosemantic properties in the lexicon is examined in relation to different (psycho)linguistic variables, revealing a link between lexical non-arbitrariness and semantic, syntactic, pragmatic, and developmental factors.
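The sound-to-meaning mapping described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: a single LSTM cell, written from scratch in NumPy, reads a word as a sequence of phonetic feature vectors and its final hidden state is linearly projected into a semantic embedding space. All dimensions (22 phonetic features, 64 hidden units, 300-dimensional semantic vectors) and the random inputs are illustrative assumptions.

```python
import numpy as np

# Illustrative dimensions (assumptions, not taken from the paper).
PHON_DIM, HIDDEN_DIM, SEM_DIM = 22, 64, 300

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTM:
    """A single LSTM cell operating on one sequence at a time."""

    def __init__(self, in_dim, hid_dim):
        # One stacked weight block for the four gates: input, forget, cell, output.
        scale = 1.0 / np.sqrt(hid_dim)
        self.W = rng.uniform(-scale, scale, (4 * hid_dim, in_dim + hid_dim))
        self.b = np.zeros(4 * hid_dim)
        self.hid_dim = hid_dim

    def forward(self, seq):
        h = np.zeros(self.hid_dim)
        c = np.zeros(self.hid_dim)
        for x in seq:  # one phonetic feature vector per segment
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        return h  # final hidden state summarizes the whole word

lstm = TinyLSTM(PHON_DIM, HIDDEN_DIM)
# Linear projection from the LSTM's hidden state to the semantic space.
W_out = rng.standard_normal((SEM_DIM, HIDDEN_DIM)) / np.sqrt(HIDDEN_DIM)

# A toy 5-segment "word": each segment is a binary phonetic feature vector.
word = rng.integers(0, 2, size=(5, PHON_DIM)).astype(float)
sem_pred = W_out @ lstm.forward(word)
print(sem_pred.shape)
```

In the paper's setup, such a network would be trained to bring `sem_pred` close to the word's gold semantic embedding; zero-shot evaluation then applies the frozen network to words of a held-out language from a different family.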