Language is a productive system: we routinely produce well-formed utterances that we have never heard before. It is, however, difficult to assess when children first achieve linguistic productivity, simply because we rarely know all the utterances a child has experienced. The onset of linguistic productivity has been at the heart of a long-standing theoretical question in language acquisition: do children come to language learning with abstract categories that they deploy from the earliest moments of acquisition? We address the problem of when linguistic productivity begins by marrying longitudinal behavioral observations and computational modeling to capitalize on the strengths of each. We used behavioral data to assess when a sample of 64 English-learning children began to productively combine determiners and nouns, a linguistic construction previously used to address this theoretical question. After the onset of productivity, the children produced determiner-noun combinations that were not attested in our sample of their linguistic input from caregivers. We used computational techniques to model the onsets and trajectories of determiner-noun combinations in these 64 children, as well as characteristics of their utterances in which the determiner was omitted. Because we knew exactly what input the model was trained on, we could be confident that the model had gone beyond its input. The parallels found between child and model in the timing and number of novel combinations suggest that the children, too, were creatively going beyond their input.
Keywords: generalization; grammatical development; linguistic productivity; modeling language acquisition; syntactic categories.