Visual working memory has a limited capacity for information, but people can use objects’ statistical structure to help remember their features. If you know that your papers are scattered around your desk, for example, this constrains their possible locations (e.g., it is unlikely they are in the bathroom) and can help you remember specifically where each paper is on your desk. However, it is often uncertain what information visual working memory should summarize to aid later recall. Is it sufficient to remember that the papers were near the desk? Or will you need to know where they were relative to each other? My dissertation investigates what statistical structure visual working memory seeks to encode by (Chapter 1) revealing what visuospatial groupings people expect and tend to use, (Chapter 2) examining how people use those expectations to form structured memories of objects’ groupings, and (Chapter 3) evaluating the cost of using this grouping structure, that is, what information is lost by encoding objects as components of groups. Overall, my dissertation shows that while exploiting the statistics of scenes introduces structured biases into memories, doing so enables visual memory to build accurate, multi-level representations of scenes.