The robust and efficient recognition of visual relations in im-ages is a hallmark of biological vision. We argue that, de-spite recent progress in visual recognition, modern machinevision algorithms are severely limited in their ability to learnvisual relations. Through controlled experiments, we demon-strate that visual-relation problems strain convolutional neuralnetworks (CNNs). The networks eventually break altogetherwhen rote memorization becomes impossible, as when intra-class variability exceeds network capacity. Motivated by thecomparable success of biological vision, we argue that feed-back mechanisms including attention and perceptual groupingmay be the key computational components underlying abstractvisual reasoning.