eScholarship
Open Access Publications from the University of California

A Methodology and Tool Support for the Design and Evaluation of Fault Tolerant, Distributed Embedded Systems

  • Author(s): McKelvin, Jr., Mark Lee
  • Advisor(s): Sangiovanni-Vincentelli, Alberto L
  • et al.
Abstract

Embedded systems are becoming pervasive in diverse application domains, such as automotive, avionic, medical, and industrial automation control systems.

Advancements in technology and the demand for sophisticated functionality to support a variety of applications are driving the increase in complexity of embedded systems, particularly in systems whose incorrect operation can result in significant consequences, such as financial loss or loss of human life.

As a result, these systems require high assurance that they meet stringent constraints on reliability and fault tolerance, the ability to keep operating even when components operate incorrectly.

Reliability is an important design goal in distributed embedded systems; it may be achieved by providing additional components in parallel or by improving component reliability.

Thus, the reliability of a fault tolerant system is dictated by the combinations of components that operate incorrectly, or fail.

Since redundancy comes at a cost, the problem that designers face is determining which components to improve.

Most existing approaches that seek to achieve better system reliability by simultaneously determining levels of component redundancy and selecting component reliabilities do not consider the design of embedded systems.

Of the approaches that do consider applications in the design of embedded systems, many do not consider the combinations of component failures, their location in the system architecture, and their rate of failure, because of the challenges and limitations of constructing reliability models that can express those characteristics.

In this dissertation, I present a design flow and a set of tools to support the design and analysis of distributed embedded systems with fault tolerance and reliability requirements using fault trees.

A fault tree is a reliability model that is based on the failure characteristics of a system and its structure.

The proposed design flow integrates the automatic generation and analysis of fault trees to enable the design of fault tolerant architectures.

I apply this design flow to the evaluation of a fault tolerant control application and to the evaluation of architecture alternatives for an automotive application.
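
To make the fault tree idea from the abstract concrete, the following is a minimal illustrative sketch and not the dissertation's tooling. It assumes basic events fail independently and that no basic event appears in more than one branch of the tree. The names `BasicEvent`, `Gate`, and `failure_probability`, as well as the example components and their probabilities, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class BasicEvent:
    """Leaf of the fault tree: one component failing (hypothetical type)."""
    name: str
    probability: float  # assumed probability that this component fails


@dataclass
class Gate:
    """Internal node combining child failures with AND or OR logic."""
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]]


def failure_probability(node: Union[Gate, BasicEvent]) -> float:
    """Probability of the event at `node`, assuming independent,
    non-repeated basic events."""
    if isinstance(node, BasicEvent):
        return node.probability
    probs = [failure_probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must fail: multiply the failure probabilities.
        result = 1.0
        for p in probs:
            result *= p
        return result
    if node.kind == "OR":
        # At least one input fails: complement of "no input fails".
        none_fail = 1.0
        for p in probs:
            none_fail *= 1.0 - p
        return 1.0 - none_fail
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical architecture: two redundant sensors feeding one controller.
# The system fails if both sensors fail, or if the controller fails.
sensors = Gate("AND", [BasicEvent("sensor_a", 0.1), BasicEvent("sensor_b", 0.1)])
top = Gate("OR", [sensors, BasicEvent("controller", 0.01)])
system_failure = failure_probability(top)
```

In this made-up example the redundant sensor pair contributes about as much to system failure as the single controller, which is the kind of comparison a designer would use when deciding which component to replicate or improve.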
