When learning about events through visual experience, one must not only identify which events are visually similar but also retrieve those events' associates, which may be visually dissimilar, and recognize when different events have similar predictive relations. How are these demands balanced? To address this question, we taught participants the predictive structures among four events, which appeared in four different sequences, each cued by a distinct object. In each sequence, one event (the cause) was predictably followed by another (the effect). Sequences in the same relational category had similar predictive structure, whereas across categories the cause and effect events were reversed. Using functional magnetic resonance imaging, we measured associative coding, indicated by correlated responses between effect and cause events; perceptual coding, indicated by correlated responses to visually similar events; and relational category coding, indicated by correlated responses to sequences in the same relational category. All three models characterized responses within the right middle temporal gyrus (MTG), but in different ways: perceptual and associative coding diverged along the posterior-to-anterior axis, while relational category coding emerged anteriorly in tandem with associative coding. Thus, along the posterior-to-anterior axis of the MTG, the representation of the visual attributes of events is transformed into a representation of both specific and generalizable relational attributes.
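
To make the logic of the three coding models concrete, the following is a minimal representational-similarity sketch in Python. The design, condition labels, and simulated patterns are all hypothetical stand-ins, not the authors' actual stimuli or analysis pipeline; it only illustrates how perceptual, associative, and relational category models can each be expressed as a predicted similarity matrix and compared against neural pattern correlations.

```python
# Hypothetical design, loosely following the abstract: four events, four
# sequences, two relational categories with cause/effect roles reversed.
#   seq 0 (cat A): event 0 -> event 1      seq 2 (cat B): event 1 -> event 0
#   seq 1 (cat A): event 2 -> event 3      seq 3 (cat B): event 3 -> event 2
import numpy as np
from scipy.stats import spearmanr

sequences = [  # (category, cause_event, effect_event) -- illustrative only
    ("A", 0, 1), ("A", 2, 3), ("B", 1, 0), ("B", 3, 2),
]

# One condition per (sequence, role); record which visual event it shows.
conditions = []
for s, (cat, cause, effect) in enumerate(sequences):
    conditions.append(dict(seq=s, cat=cat, role="cause", event=cause))
    conditions.append(dict(seq=s, cat=cat, role="effect", event=effect))

def model_matrix(pred):
    """Predicted similarity matrix: 1 where a pair of conditions should
    evoke correlated responses under the model, 0 otherwise."""
    n = len(conditions)
    m = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and pred(conditions[i], conditions[j]):
                m[i, j] = 1.0
    return m

# Perceptual coding: conditions showing the same visual event.
perceptual = model_matrix(lambda a, b: a["event"] == b["event"])
# Associative coding: cause and effect of the same sequence.
associative = model_matrix(lambda a, b: a["seq"] == b["seq"] and a["role"] != b["role"])
# Relational category coding: different sequences in the same category.
relational = model_matrix(lambda a, b: a["seq"] != b["seq"] and a["cat"] == b["cat"])

# Simulated voxel patterns as a stand-in for fMRI data (8 conditions x 100 voxels).
rng = np.random.default_rng(0)
patterns = rng.standard_normal((len(conditions), 100))

# Neural similarity: correlation between condition patterns, then compare
# each model's unique off-diagonal pairs against the neural pairs.
neural = np.corrcoef(patterns)
idx = np.triu_indices(len(conditions), k=1)

for name, model in [("perceptual", perceptual), ("associative", associative),
                    ("relational category", relational)]:
    rho, _ = spearmanr(model[idx], neural[idx])
    print(f"{name:>20s} model fit: rho = {rho:+.3f}")
```

With random patterns the fits hover near zero; in a real analysis, running this comparison within local neighborhoods (e.g., searchlights along the MTG) is what would reveal where each model best explains the neural similarity structure.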