Algorithms for testing fault-tolerance of sequenced jobs
- Author(s): Chrobak, Marek
- Hurand, Mathilde
- Sgall, Jiří
- et al.
Published Web Locationhttps://doi.org/10.1007/s10951-009-0126-8
We study the problem of testing whether a given set of sequenced jobs can tolerate transient faults. We present efficient algorithms for this problem in several fault models. A fault model describes what types of faults are allowed and specifies assumptions on their frequency. Two types of faults are considered: hidden faults, that can only be detected after a job completes, and exposed faults, that can be detected immediately. First, we give an O(n)-time fault-tolerance testing algorithm, for both exposed and hidden faults, if the number of faults does not exceed a given parameter k. Then we consider the model in which any two faults are separated in time by a gap of length at least Δ, where Δ is at least twice the maximum job length. For exposed faults, we give an O(n)-time algorithm. For hidden faults, we give an algorithm with running time O(n 2), and we prove that if job lengths are distributed uniformly over an interval [0,p max ], then this algorithm’s expected running time is O(n). Our experimental study shows that this linear-time performance extends to other distributions. Finally, we provide evidence that improving the worst-case performance may not be possible, by proving an Ω(n 2) lower bound, in the algebraic computation tree model, on a slight generalization of this problem.