Open Access Publications from the University of California

The Omniglot Jr. challenge: Can a model achieve child-level character generation and classification?


Lake et al. (2015) presented the Omniglot dataset to study how people generate and classify characters. They proposed a model for one-shot learning of new concepts that infers compositionally structured generative models and can transfer from familiar concepts to new ones. Their Bayesian Program Learning (BPL) model was able to both classify and generate characters in ways similar to adults. However, adults have years of experience; would a similar model apply to children, who lack this prior knowledge? We introduce a new dataset called Omniglot_Jr., composed of Omniglot letters generated by children aged 3-6. When we train BPL on children's data for classification and generation, we find that BPL achieves higher classification accuracy and generates more adult-looking letters than the children do. We propose the challenge of reproducing children's distinctive pattern of mistakes. Skills such as character recognition depend on child-like learning; this challenge should help us understand how that learning is possible and how to simulate it in computational systems.
