Eye movement detection plays a crucial role in various fields, including eye-tracking applications and the study of human perception and cognitive states. Existing detection methods typically rely on gaze positions predicted by gaze estimation algorithms, which may introduce cumulative errors. Although some video-based methods that classify behaviours directly from videos have been introduced to address this issue, they are limited in that they primarily focus on blink detection. In this paper, we propose a video-based two-stream framework designed to detect four eye movement behaviours—fixations, saccades, smooth pursuits, and blinks—from infrared near-eye videos. To explicitly capture motion information, we introduce optical flow as the input to one stream. Additionally, we propose a spatio-temporal feature fusion module to combine information from the two streams. The framework is evaluated on a large-scale eye movement dataset and achieves excellent results.
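As a rough illustration of the two-stream idea (not the paper's exact architecture; all layer sizes, module names, and the concatenation-based fusion below are hypothetical), one stream can encode the raw infrared frames while the other encodes optical-flow fields, with a fusion step combining the two feature vectors before classification into the four behaviour classes:

```python
# Minimal PyTorch sketch of a two-stream eye-movement classifier.
# Hypothetical sizes and names; the paper's fusion module is not reproduced here.
import torch
import torch.nn as nn

class Stream(nn.Module):
    """Small 3D-CNN encoder for a clip of shape (B, C, T, H, W)."""
    def __init__(self, in_channels: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),  # global spatio-temporal pooling
        )

    def forward(self, x):
        return self.encoder(x).flatten(1)  # (B, 32)

class TwoStreamClassifier(nn.Module):
    """Appearance stream (IR frames) + motion stream (optical flow),
    fused by concatenation and classified into four behaviours:
    fixation, saccade, smooth pursuit, blink."""
    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.appearance = Stream(in_channels=1)  # grayscale IR frames
        self.motion = Stream(in_channels=2)      # flow field (dx, dy)
        self.classifier = nn.Linear(32 + 32, num_classes)

    def forward(self, frames, flow):
        fused = torch.cat([self.appearance(frames), self.motion(flow)], dim=1)
        return self.classifier(fused)

# Example: a batch of 8 clips, each 16 frames of 64x64 IR images plus flow fields.
model = TwoStreamClassifier()
frames = torch.randn(8, 1, 16, 64, 64)
flow = torch.randn(8, 2, 16, 64, 64)
logits = model(frames, flow)  # shape (8, 4)
```

In this sketch the fusion is plain feature concatenation; the spatio-temporal feature fusion module proposed in the paper would replace that step with a learned combination of the two streams' feature maps.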