Children’s prelinguistic gestures play a central role in their communicative development. Early gesture use has been shown to be predictive of both concurrent and later language ability, making the identification of gestures in video data at scale a potentially valuable tool for both theoretical and clinical purposes. We describe a new dataset consisting of videos of 72 infants interacting with their caregivers at 11 and 12 months, annotated for the appearance of 12 different gesture types. We propose a model based on deep convolutional neural networks to classify these gestures. The model achieves 48.32% classification accuracy overall, but with significant variation between gesture types. Critically, we found strong (0.7 or above) rank-order correlations between by-child gesture counts from human and machine coding for 7 of the 12 gestures (including the critical gestures of declarative pointing, hold outs and gives). Given the challenging nature of the data (recordings of many different dyads in different environments engaged in diverse activities), we consider these results a very encouraging first attempt at the task, and evidence that automatic or machine-assisted gesture identification could make a valuable contribution to the study of cognitive development.
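The evaluation described above compares per-child gesture counts from human and machine coding using rank-order (Spearman) correlations. A minimal sketch of that comparison is shown below; the variable names and toy counts are illustrative assumptions, not the authors' actual data or pipeline.

```python
# Sketch: rank-order correlation between human and machine by-child
# gesture counts for one gesture type. Counts below are invented for
# illustration only.
from scipy.stats import spearmanr

# Hypothetical per-child counts for a single gesture type
# (e.g. declarative pointing), one entry per child.
human_counts   = [4, 0, 7, 2, 5, 1, 3, 6]   # manual annotation
machine_counts = [3, 1, 6, 2, 4, 0, 3, 7]   # model predictions

# Spearman's rho measures agreement in the *ranking* of children by
# gesture frequency, which is what the 0.7-or-above threshold refers to.
rho, p_value = spearmanr(human_counts, machine_counts)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```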