Drift diffusion (or evidence accumulation) models have found
widespread use in the modelling of simple decision tasks.
Extensions of these models, in which the model’s
instantaneous drift rate is not fixed but instead allowed to
vary over time as a function of a stream of perceptual inputs,
have allowed these models to account for more complex
sensorimotor decision tasks. However, many real-world tasks
appear to rely on many even more complex underlying
processes. One interesting example is the task of deciding
whether to cross a road in front of an approaching vehicle. This
action decision seemingly depends on sensory information
about both one’s own affordances (whether one can make it across
before the vehicle arrives) and the action intentions of others (whether the
vehicle is yielding to oneself). Here, we compare three
extensions of a standard drift diffusion model with regard to
their ability to capture the timing of pedestrian crossing decisions
in a virtual reality environment. We find that a single
variable-drift diffusion model (S-VDDM), in which the
time-varying drift rate is determined by visual quantities describing
vehicle approach and deceleration and is saturated at upper and
lower bounds, can explain multimodal distributions of crossing
times well across a broad range of vehicle approach scenarios.
More complex models, which attempt to partition the final
crossing decision into constituent perceptual decisions,
improve the fit to the human data, but further work is needed
before firm conclusions can be drawn from this finding.
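
As a rough illustration of the kind of model described above, the following minimal sketch simulates a single trial of a drift diffusion process whose drift rate follows an input stream and is saturated at lower and upper bounds. It is not the fitted S-VDDM: the function name, the parameters (gain, drift_min, drift_max, noise_sd, threshold), the single-threshold stopping rule, and the use of one generic input signal in place of the actual visual quantities are all assumptions made for illustration.

```python
import numpy as np


def simulate_saturated_vddm(inputs, dt=0.01, gain=1.0, drift_min=-1.0,
                            drift_max=1.0, noise_sd=1.0, threshold=1.0,
                            rng=None):
    """Simulate one trial of a variable-drift diffusion process.

    `inputs` is a hypothetical perceptual input sampled every `dt` seconds.
    Returns the time at which accumulated evidence first reaches
    `threshold`, or None if it never does within the input stream.
    """
    rng = np.random.default_rng() if rng is None else rng
    evidence = 0.0
    for step, u in enumerate(inputs):
        # The drift follows the input but is saturated at both bounds.
        drift = float(np.clip(gain * u, drift_min, drift_max))
        # Euler-Maruyama update: deterministic drift plus Gaussian noise.
        evidence += drift * dt + noise_sd * np.sqrt(dt) * rng.normal()
        if evidence >= threshold:
            return (step + 1) * dt
    return None


if __name__ == "__main__":
    # Hypothetical input: weak evidence early, strong evidence later.
    stream = np.concatenate([np.full(200, 0.2), np.full(800, 3.0)])
    rng = np.random.default_rng(0)
    times = [simulate_saturated_vddm(stream, rng=rng) for _ in range(1000)]
    crossed = [t for t in times if t is not None]
    print(f"{len(crossed)} of {len(times)} simulated trials reached threshold")
```

Saturating the drift keeps an extreme input value from driving the accumulator arbitrarily fast, so the noise term continues to shape the decision-time distribution even when the input is large.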