As we introduce complex algorithmic systems into decision-making in high-stakes domains, system designers need principled approaches to help people set their expectations of these systems and to give them mechanisms for recovery when these systems fail. For example, a doctor using a machine learning-based system in clinical care needs to know when they can expect the model to perform well and when they should not rely on its output. However, supporting these kinds of judgments is difficult when stakeholders have conflicting needs and goals, or cannot directly assess the quality of a system's output. This dissertation examines two such contexts: matching algorithms for assigning students to public schools; and machine translation systems, which use machine learning to translate between natural languages. I take three approaches to designing for reliability in these contexts: first, teaching users what a system can and cannot do; second, aligning system evaluations with people's actual use cases, needs, and goals; and third, helping users recover from failures. By making it easier for users to understand what kinds of inputs a system supports, and how they can express their intent within that supported scope, we can increase users' agency to make appropriate decisions about how and when to use these systems in high-stakes settings.