How do children bootstrap language through noisy supervision? Most prior work has focused on tracking co-occurrences between individual words and referents. We model cross-situational learning (CSL) at the sentence level with few (1000) training examples. We compare reservoir computing (RC) and LSTMs on three datasets, including complex robotic commands. In most experiments, reservoirs outperform LSTMs. Surprisingly, reservoirs generalize robustly as vocabulary size increases: the error grows slowly. In contrast, LSTMs are not robust: the number of hidden units must be increased dramatically to keep up with the growth in vocabulary size, which is questionable from a biological or cognitive perspective. This suggests that the random projections used in RC help to bootstrap generalization quickly. To our knowledge, this is a new result in developmental learning modelling. We analyse the evolution of internal representations during training of both recurrent networks and suggest why reservoir generalization seems more efficient.