The dot-probe task has been widely used in research to produce an index of biased attention based on reaction times (RTs). Despite its popularity, very few published studies have examined the psychometric properties of the task, including test-retest reliability, and no previous study has examined reliability in clinically anxious samples or systematically explored the effects of task design and analysis decisions on reliability. In the current analysis, we used dot-probe data from 3 studies in which attention bias toward threat-related faces was assessed at multiple (≥5) time-points. Two of the studies were similar (adults with social anxiety disorder, similar design features), whereas 1 was more disparate (pediatric healthy volunteers, distinct task design). We explored the effects of analysis choices (e.g., bias score formula, outlier handling method) on reliability and searched for convergent findings across the 3 studies. When the 3 studies were considered together, the most reliable RT index of bias used data from dot-bottom trials, comparing congruent to incongruent trials, with rescaled outliers, particularly after averaging across more than 1 assessment point. Although the reliability of RT bias indices was moderate to low, within-session variability in bias (attention bias variability; ABV), a recently proposed RT index, was more reliable across sessions. Several eyetracking-based indices of attention bias (available in the pediatric healthy sample only) showed reliability that matched that of the most reliable RT index (ABV). On the basis of these findings, we make specific recommendations to researchers using the dot-probe, particularly those wishing to investigate individual differences and/or single-patient applications.
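To make the RT bias index described above concrete, the following is a minimal Python sketch, not the authors' analysis code. It rests on several assumptions that the abstract does not state: that a congruent trial is one in which the probe replaces the threat-related face, that the bias score is the mean incongruent RT minus the mean congruent RT (so positive values indicate bias toward threat), that "rescaled outliers" means pulling extreme RTs back to a fence at 1.5 interquartile ranges beyond the quartiles, and that rescaling is applied within each trial type. The function names and the `k` parameter are hypothetical.

```python
from statistics import mean, quantiles


def rescale_outliers(rts, k=1.5):
    """Pull RTs lying more than k * IQR beyond the quartiles back to that fence.

    Assumption: this is one common reading of "rescaled outliers";
    the abstract does not specify the exact rule.
    """
    q1, _, q3 = quantiles(rts, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(rt, lower), upper) for rt in rts]


def bias_score(congruent_rts, incongruent_rts):
    """Mean incongruent RT minus mean congruent RT, in the units of the input.

    Assumption: positive scores are read as attention bias toward threat.
    """
    return mean(rescale_outliers(incongruent_rts)) - mean(
        rescale_outliers(congruent_rts)
    )


def averaged_bias(sessions):
    """Average the bias score over several assessment points.

    `sessions` is an iterable of (congruent_rts, incongruent_rts) pairs,
    one pair per assessment point.
    """
    return mean(bias_score(c, i) for c, i in sessions)


# Illustrative RTs in milliseconds (made-up values, not study data).
session_1 = ([500, 510, 520, 530], [520, 530, 540, 550])
session_2 = ([500, 510, 520, 530], [510, 520, 530, 540])
print(averaged_bias([session_1, session_2]))
```

To mirror the index highlighted in the abstract, the RT lists passed in would first be restricted to dot-bottom trials. ABV is not sketched here because the abstract does not give its formula.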