The spontaneous imitation paradigm (Goldinger, 1998), in which subjects' speech is compared before and after they are exposed to target speech, has shown that subjects shift their production in the direction of the target. This indicates that episodic traces are used in speech perception and that perception and production are closely tied. Using this paradigm, the current study investigates the psychological reality of three levels of linguistic unit (i.e., word, phoneme, and sub-phonemic unit such as feature/gesture) through physical measurements instead of perceptual assessments. An experiment was carried out to test: 1) whether spontaneous phonetic imitation generalizes to (a) new words that share the same initial phoneme, and (b) new words with a new phoneme in the same natural class (sharing a feature/gesture); 2) whether word-level specificity can be observed through physical measurements of a phonetic feature; and 3) whether and how phonetic imitation interacts with linguistic representations when the change might impair linguistic (in this case, phonemic) contrast. The feature manipulated in the experiments was aspiration, or [+/- spread glottis], on the phonemes /p/ and /k/. The results revealed a significant effect of spontaneous phonetic imitation: subjects produced significantly longer VOTs after they were exposed to target speech with extended VOTs, replicating Shockley (2004) in a non-shadowing paradigm. Furthermore, the modeled feature (increased aspiration) was generalized to new instances of the target phoneme /p/ (i.e., in new words) as well as to the new segment /k/. On the other hand, the subjects did not imitate reduced VOTs, despite the fact that the (modeled) shorter VOTs occur more often than the (modeled) longer VOTs in the baseline recordings.
Taken together, these results indicate that 1) speakers possess sub-phonemic representations, and 2) knowledge of linguistic (i.e., phonemic) contrast constrains spontaneous phonetic imitation. The expected word-level specificity, tested through lexical frequency and the number of training exposures, was not observed in this study.