Theories suggest that representational gestures, which depict properties of referents in the accompanying speech, could facilitate language production and comprehension. To shed light on how gesture and speech are coordinated during production, we investigate whether representational gestures are time-locked to the onset of utterances (and hence planned when full events are encoded) or to the onset of Lexical Affiliates (LAs; the words most closely aligned with a gesture's meaning; hence planned when individual concepts are encoded), using a large corpus of naturalistic conversation (1803 gestures from 24 speakers). Our data show that representational gestures are more tightly tied to LA onsets than to utterance onsets, consistent with theories of multimodal communication in which gestures aid the conceptual packaging or retrieval of individual concepts rather than of events. We also show that in naturalistic speech, representational gestures tend to precede their LAs by around 370 ms, meaning that they could plausibly allow addressees to predict upcoming words (ter Bekke, Drijvers & Holler, 2021; Ferré, 2010; Habets et al., 2011).