Inferring mental states from faces during social perception is sensitive to context and higher-level semantic information. This is vividly observed in captioned humor, such as memes, where a single line can dramatically reshape how a scene is perceived and understood. This EEG study explores the real-time neural dynamics of these sudden insights in social perception. Across three experimental phases (pre-insight, insight, post-insight), 40 participants viewed images of 120 scenes showing public figures. During the insight phase, humorous captions (e.g., "trying to set somebody on fire with his mind") either matched or mismatched the following image (e.g., a politician, mid-speech, rubbing his temples). Comparing event-related potentials between trials with vs. without sudden insight revealed distinct changes in the N170, early posterior negativity (EPN), and N400 components from the pre-insight to the insight phase. These results link sudden, humorous insight in social perception to instant alterations in visual processing, fast affective responses, and higher-level semantic processing.