Human beings get a lot of information from a picture based on
what we see and our background knowledge. However, much
computer vision research depends heavily on
image features and has paid little attention to the background
knowledge we use in texture processing. The present study
explores the degree to which onomatopoeia evoked by visual
images is affected by multimodal, experience-based
knowledge such as tactile experience. In Experiment 1,
participants viewed the original, complete images from the Flickr
Material Database (FMD) and reported onomatopoeia that expressed
their textures; in Experiment 2, participants viewed cut-out
images and reported onomatopoeia that expressed their
textures. We obtained 17,487 onomatopoeic words (1,827 types)
from Experiment 1 and 30,138 onomatopoeic words (2,442 types)
from Experiment 2. We counted the number of types of
onomatopoeia evoked by each image. The results showed that the
original images evoked a significantly greater variety of
onomatopoeia than the cut-out images. This result suggests that
human texture evaluations based on the original, complete
images of the FMD are more readily affected by experience-based
knowledge about the material. Furthermore, we showed that
images whose material category is relatively easy to recognize
evoke tactile onomatopoeia significantly more frequently than
images whose material category is hard to recognize.
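
The counting step described above (word tokens, word types, and the number of types evoked by each image) can be illustrated with a minimal sketch. It assumes responses are stored as (image ID, word) pairs; the function names, image IDs, and example words are hypothetical placeholders, not data or code from the study, and no statistical test is implied.

```python
from collections import defaultdict


def types_per_image(responses):
    """Return {image_id: number of distinct onomatopoeic words for that image}."""
    words_by_image = defaultdict(set)
    for image_id, word in responses:
        words_by_image[image_id].add(word)
    return {image_id: len(words) for image_id, words in words_by_image.items()}


def token_and_type_totals(responses):
    """Return (total responses, number of distinct words) for one experiment."""
    tokens = len(responses)
    types = len({word for _, word in responses})
    return tokens, types


# Hypothetical toy responses (assumed format; not taken from the experiments).
original = [
    ("fabric_01", "fuwa-fuwa"),
    ("fabric_01", "mofu-mofu"),
    ("fabric_01", "fuwa-fuwa"),  # repeated word: a new token, not a new type
    ("stone_01", "gotsu-gotsu"),
]

print(types_per_image(original))
print(token_and_type_totals(original))
```

Running the same two functions on the responses from each experiment would give per-image type counts that could then be compared across the original and cut-out conditions.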