Young children can reason about direct and indirect visual information, but fully mapping this understanding to linguistic forms encoding the two knowledge sources appears to come later in development. In English, perception verbs with small clause complements (I saw something happen) report direct perception of an event, while perception verbs with sentential complements (I saw that something happened) can report inferences about an event. In two experiments, we explore when 4-9-year-old English-speaking children have linked the conceptual distinction between direct perception and inference to different complements expressing this distinction. We find that unlike older children or adults, 4-6-year-olds do not recognize that see with a sentential complement can report visually-based inference, even when syntactic and contextual cues make inference interpretations highly salient. Until around age seven, children are still learning the syntax and semantics of perception verbs like see and how distinct syntactic forms encode different kinds of perceptual experience.