Non-parametric Decoding on Discrete Time Series and Its Applications in Bioinformatics
- Author(s): Fushing, Hsieh
- Chen, Shu-Chun
- Hwang, Chii-Ruey
- et al.
Published Web Locationhttps://doi.org/10.1007/s12561-010-9019-9
We address the question: How do we non-parametrically decode the unknown state-space vector underlying a lengthy discrete time series? The time series of concern is governed by one non-autonomous dynamics with only two internal states. This question pertinently reflects the dilemma of computing infeasibility against inferential bias found in many scientific areas. This dilemma becomes an issue when considering whether to have, or not to have likely very unrealistic structural assumptions on the state-space dynamics in most of real-world applications. To resolve this dilemma, the decoding problem is transformed into an event-intensity change-point problem without prior knowledge of the number of change-points involved. A new decoding algorithm, called Hierarchical Factor Segmentation (HFS), is proposed to achieve computability and robustness. Performance of the HFS algorithm in terms of total decoding error is compared to the decoding benchmark Viterbi algorithm through computer experiments. Under Hidden Markov Model (HMM) settings with true parameter values, our HFS algorithm is competitive against the Viterbi algorithm. Interestingly, when the Viterbi algorithm operates with maximum likelihood estimated (MLE) parameter values, our HFS algorithm performs significantly better. Similar favorable results are found when the Markov assumption is violated. We further demonstrate one very important application of our HFS algorithm in bioinformatics as a promising computational solution for finding CpG islands—DNA segments with aggregated CpG dinucleotides—on a genome sequence. A real illustration on a subsequence of human chromosome #22 is carried out and compared with one popular search algorithm.
Many UC-authored scholarly publications are freely available on this site because of the UC Academic Senate's Open Access Policy. Let us know how this access is important for you.