Humans effortlessly parse continuous experience based on its temporal structure, allowing us to recognize speech, detect regularity in sequences of events, and predict when things will happen. Yet it remains poorly understood how the nervous system accomplishes this multifaceted feat. In this dissertation, I first review organisms’ abilities to sense timing on the scale of tens to hundreds of milliseconds, as well as evidence of sensory neurons that respond selectively to temporal features or respond differently depending on recent temporal context. I propose that neuronal selectivity to timing results from time-varying neural and synaptic properties, most notably short-term synaptic plasticity (STP), and I review supporting evidence. Next, I present a computational model that explains why different sensory neurons show different patterns of sensitivity to temporal context. Like the responses of real neurons observed in mice, model neuron responses decrease, remain stable, or increase over the course of repeated stimulation on short timescales, an effect that relies on the model’s use of experimentally observed STP at synapses with two distinct types of inhibitory interneurons. I test and confirm model predictions by analyzing the responses of mouse auditory neurons to trains of repeated pure tones. Subsequently, I shift my focus toward a potential mechanism of internally generated timing on the circa-second scale: persistent activity states. I build on recent computational work by showing that counter-intuitive “cross-homeostatic” plasticity rules can configure a large, sparsely connected spiking network model to exhibit stable persistent activity states. Importantly, I show that when cross-homeostatic plasticity operates using only local signals, it fails unless counterbalanced by classical homeostatic plasticity.
Finally, I test the idea that timing is computationally linked with working memory by performing behavioral experiments in humans. To do so, I employ two tasks that have the same stimulus structure but differ in whether timing or working memory is required to respond correctly. I find that in each task, participants learn about and use the other, task-irrelevant component. This is consistent with the hypothesis that in some cases working memory and timing information are multiplexed in a time-varying format because of the importance of predicting when working memory will be used.