Robots are becoming increasingly woven into the fabric of our society, from self-driving cars on our streets to assistive manipulators in our homes. To act in the world, robots rely on a representation of the salient features of a task: for example, to hand me a cup of coffee, the robot considers movement efficiency and cup orientation in its behavior. However, if we want robots to act for and with people, their representations must be not just functional but also reflective of what humans care about, i.e., their representations must be aligned with humans'. What is holding us back from successful human-robot interaction is that these representations are often misaligned, resulting in anything from miscoordination and misunderstandings to learning and executing dangerous behaviors.
To learn the human's representation of what matters in a task, typical methods rely on data sets of human behavior, but this data cannot reflect every individual, environment, and task the robot will be exposed to. This dissertation advocates that we should instead treat humans as active participants in the interaction, not as static data sources: robots must engage with humans in an interactive process for finding a shared representation. We formalize the representation alignment problem as a joint search for a common representation. Then, rather than hoping that representations will naturally be aligned, we propose having humans directly teach them to robots with representation-specific input. Next, we enable robots to automatically detect representation misalignment with the human by estimating a confidence in how well the robot's representation can explain the human's behavior. We demonstrate how human-aligned representations can lead to novel human behavior models with broad implications reaching beyond robotics to econometrics and cognitive science. Finally, this dissertation concludes by asking ``How can robots help the human-robot team converge to a shared representation?'' and discusses opportunities for future work in expanding representation alignment for seamless human-robot interaction.