Abstract

Background: Streetscape (microscale) features of the built environment can influence people’s perceptions of their neighborhoods’ suitability for physical activity. Many microscale audit tools have been developed, but few have published systematic scoring methods. We present the development, scoring, and reliability of the Microscale Audit of Pedestrian Streetscapes (MAPS) tool and its theoretically based subscales.

Methods: MAPS was based on prior instruments and was developed to assess details of streetscapes considered relevant for physical activity. MAPS sections (route, segments, crossings, and cul-de-sacs) were scored by two independent raters for reliability analyses. The reliability sample comprised 290 route pairs, 516 segment pairs, 319 crossing pairs, and 53 cul-de-sac pairs. Inter-rater reliability of individual items was assessed using kappa, the intra-class correlation coefficient (ICC), and percent agreement. A conceptual framework for subscale creation was developed using theory, expert consensus, and policy relevance. Items were grouped into subscales, and subscales were analyzed for inter-rater reliability at tiered levels of aggregation.

Results: Of the 201 items in total, 160 were included in the subscales. Of these 160 items, 80 (50.0%) had good/excellent reliability, 41 (25.6%) had moderate reliability, 18 (11.3%) had low reliability, and the remaining 21 (13.1%) had limited variability. Of the 20 route section subscales, valence (positive/negative) scores, and overall scores, 17 (85.0%) demonstrated good/excellent reliability and 3 demonstrated moderate reliability. Of the 16 segment subscales, valence scores, and overall scores, 12 (75.0%) demonstrated good/excellent reliability, 3 demonstrated moderate reliability, and 1 demonstrated poor reliability. Of the 8 crossing subscales, valence scores, and overall scores, 6 (75.0%) demonstrated good/excellent reliability and 2 demonstrated moderate reliability. The cul-de-sac subscale demonstrated good/excellent reliability.

Conclusions: MAPS items and subscales predominantly demonstrated moderate to excellent reliability. The subscales and scoring system represent a theoretically based framework for using these complex microscale data and may be applicable to other similar instruments.
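
As an illustration of two of the item-level statistics named in the Methods, the following minimal Python sketch computes percent agreement and Cohen's kappa for one categorical item scored by two raters. It is not the authors' analysis code: the function names and the example ratings are hypothetical, and the abstract does not say which kappa variant or ICC form was used, so unweighted Cohen's kappa is assumed here and the ICC is left out.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Percentage of observations on which the two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of identical ratings.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if p_e == 1.0:
        # Both raters used a single identical category throughout;
        # kappa is undefined when there is no variability at all.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of one present/absent item on ten street segments.
rater_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
rater_b = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

print(percent_agreement(rater_a, rater_b))       # 80.0
print(round(cohens_kappa(rater_a, rater_b), 3))  # 0.6
```

In this made-up example the raters agree on 8 of 10 segments (80%), but because half of that agreement would be expected by chance given the marginal frequencies, kappa is only 0.6. This difference is one reason reliability studies often report percent agreement alongside a chance-corrected statistic.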