Few-shot Concept Induction through Lenses of Intelligence Quotient Tests
Humans not only learn concepts from labeled supervision but also induce new relational concepts without supervision by observing recurring sequences of events. In contrast to the abundance of tasks that challenge machines on perception, a task that evaluates machines’ few-shot concept induction ability is long overdue. To endow machines with this capability and fill the gap, we start by introducing RAVEN, a dataset based on cognitive studies of Raven’s Progressive Matrices (RPM), which have proven effective in measuring humans’ few-shot concept induction. In particular, we note that neural methods equipped with contrastive learning significantly improve both model performance and learning efficiency. However, purely neural methods are neither interpretable nor performant. Therefore, we further propose neuro-symbolic approaches. We first introduce a neuro-symbolic Probabilistic Abduction and Execution (PrAE) learner; central to the PrAE learner is the process of probabilistic abduction and execution on a probabilistic scene representation, akin to the mental manipulation of objects. In PrAE, we disentangle perception and reasoning, which a monolithic model would entangle. The neural visual perception frontend predicts objects’ attributes, which a scene inference engine then aggregates to produce a probabilistic scene representation. In the symbolic logical reasoning backend, the PrAE learner uses the representation to abduce the hidden rules. An answer is predicted by executing the rules on the probabilistic representation. The entire system is trained end-to-end in an analysis-by-synthesis manner without any visual attribute annotations. While effective, PrAE essentially turns the induction problem into an abduction problem, since explicit knowledge is recruited. We then introduce the ALgebra-Aware Neuro-Semi-Symbolic (ALANS) learner. The ALANS learner is motivated by abstract algebra and representation theory.
It consists of a neural visual perception frontend and an algebraic abstract reasoning backend: the frontend summarizes the visual information from an object-based representation, while the backend transforms it into an algebraic structure and induces the hidden operator on the fly. The induced operator is then executed to predict the answer’s representation, and the choice most similar to the prediction is selected as the solution. Both methods explicitly realize the computational process of reasoning, achieve improved performance, and are more interpretable and easier to debug. Unlike PrAE, however, ALANS fully implements the induction process as on-the-fly optimization. Experiments show that the ALANS learner outperforms various pure connectionist models in domains requiring systematic generalization. We further show the generative nature of the learned algebraic representation; it can be decoded by isomorphism to generate an answer. The results and analysis demonstrate that the learned algebraic architecture facilitates relational learning and is a viable schema for few-shot concept learning.
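
To make the induce-then-execute idea concrete, the following is a minimal sketch and not the ALANS implementation. It assumes, purely for illustration, that each panel is already encoded as a small square matrix and that the hidden operator acts by right multiplication. The operator is induced from context pairs by ridge-regularized least squares, executed on a query, and the candidate closest to the prediction is selected. The names `induce_operator` and `select_answer`, the matrix size, and the regularization value are all hypothetical.

```python
import numpy as np


def induce_operator(inputs, outputs, reg=1e-3):
    """Solve min_T sum_i ||X_i @ T - Y_i||^2 + reg * ||T||^2 in closed form.

    Assumption: the relation is linear and acts by right multiplication.
    """
    dim = inputs[0].shape[1]
    gram = sum(x.T @ x for x in inputs) + reg * np.eye(dim)
    cross = sum(x.T @ y for x, y in zip(inputs, outputs))
    return np.linalg.solve(gram, cross)


def select_answer(prediction, candidates):
    """Return the index of the candidate closest to the prediction."""
    distances = [np.linalg.norm(prediction - c) for c in candidates]
    return int(np.argmin(distances))


# Toy problem: a hidden operator generates the context pairs.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 4))
context_in = [rng.normal(size=(4, 4)) for _ in range(2)]
context_out = [x @ hidden for x in context_in]

# Induce the operator on the fly from the context alone.
operator = induce_operator(context_in, context_out, reg=1e-8)

# Execute the induced operator on the query to predict the answer.
query = rng.normal(size=(4, 4))
prediction = query @ operator

# Three random distractors plus the true answer placed at index 2.
candidates = [rng.normal(size=(4, 4)) for _ in range(3)]
candidates.insert(2, query @ hidden)
answer = select_answer(prediction, candidates)
```

In this toy setting the operator is recovered per problem instance rather than stored in learned weights, which is the sense in which induction is cast as on-the-fly optimization.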