We propose a novel approach to event detection in real-world continuous video sequences. The method: 1) models arbitrary-order non-Markovian dependences in videos to mitigate local visual ambiguities; 2) performs event segmentation and labeling simultaneously; and 3) is time-window free. The idea is to represent a video as an event stream containing both high-level semantic events and low-level video observations. In training, we learn a point process model called a piecewise-constant conditional intensity model (PCIM), which can capture complex non-Markovian dependences in event streams. In testing, event detection is cast as inference of the high-level semantic events given the low-level image observations. We develop the first inference algorithm for PCIM and show that it samples exactly from the posterior distribution. We then evaluate the method on video event detection in real-world video sequences. Our model not only achieves competitive results on video event segmentation and labeling, but is also interpretable and efficient.
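To make the piecewise-constant intensity idea concrete, the following is a minimal sketch, not the implementation used in this work. It assumes the standard PCIM likelihood form, in which each state s of a label contributes rate_s^count_s * exp(-rate_s * duration_s). The state function `window_state`, the labels "obs" and "sem", the lookback length `w`, and the rate values are all hypothetical and chosen only for illustration. In a PCIM the mapping from history to state is typically learned rather than hand-written as it is here.

```python
import math

def window_state(events, t, trigger, w):
    """Hypothetical state function: 1 if a `trigger` event lies in [t - w, t), else 0."""
    return int(any(lbl == trigger and t - w <= s < t for s, lbl in events))

def pcim_log_likelihood(events, horizon, target, trigger, w, rates):
    """Log-likelihood of `target` events on [0, horizon] under a
    piecewise-constant intensity that depends only on the state."""
    # The state can change only at event times and at event times + w.
    cuts = {0.0, horizon}
    for s, _ in events:
        for c in (s, s + w):
            if 0.0 < c < horizon:
                cuts.add(c)
    cuts = sorted(cuts)

    counts = {k: 0 for k in rates}
    durations = {k: 0.0 for k in rates}

    # Total time spent in each state (the state is constant between cuts,
    # so evaluating it at the midpoint of each interval is sufficient).
    for a, b in zip(cuts, cuts[1:]):
        durations[window_state(events, 0.5 * (a + b), trigger, w)] += b - a

    # Number of target events that occurred while in each state.
    for t, lbl in events:
        if lbl == target:
            counts[window_state(events, t, trigger, w)] += 1

    return sum(counts[k] * math.log(rates[k]) - rates[k] * durations[k]
               for k in rates)

# Toy event stream: one low-level observation and two semantic events.
events = [(1.0, "obs"), (1.5, "sem"), (4.0, "sem")]
rates = {0: 0.2, 1: 2.0}  # assumed intensities for states 0 and 1
ll = pcim_log_likelihood(events, 5.0, "sem", "obs", 1.0, rates)
```

In this toy stream the semantic event at time 1.5 falls in the high-rate state because an observation occurred within the preceding second, while the one at time 4.0 falls in the low-rate state. This illustrates how low-level observations can raise or lower the intensity of high-level events.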