universal fault-tolerant non-linear: Topics by Science.gov

Sample records for universal fault-tolerant non-linear

A hybrid robust fault tolerant control based on adaptive joint unscented Kalman filter.

PubMed

Shabbouei Hagh, Yashar; Mohammadi Asl, Reza; Cocquempot, Vincent

2017-01-01

In this paper, a new hybrid robust fault tolerant control scheme is proposed. A robust H∞ control law is used in the non-faulty situation, while a Non-Singular Terminal Sliding Mode (NTSM) controller is activated as soon as an actuator fault is detected. Since a linear robust controller is designed, the system is first linearized through the feedback linearization method. To switch from one controller to the other, a fuzzy-based switching system is used. An Adaptive Joint Unscented Kalman Filter (AJUKF) is used for fault detection and diagnosis. The proposed method is based on the simultaneous estimation of the system states and parameters. In order to show the efficiency of the proposed scheme, a simulated 3-DOF robotic manipulator is used. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
Active fault tolerant control based on interval type-2 fuzzy sliding mode controller and non linear adaptive observer for 3-DOF laboratory helicopter.

PubMed

Zeghlache, Samir; Benslimane, Tarak; Bouguerra, Abderrahmen

2017-11-01

In this paper, a robust controller for a three-degree-of-freedom (3-DOF) helicopter is proposed in the presence of actuator and sensor faults. For this purpose, the interval type-2 fuzzy logic control (IT2FLC) approach and the sliding mode control (SMC) technique are used to design a controller, named the active fault tolerant interval type-2 fuzzy sliding mode controller (AFTIT2FSMC), based on a non-linear adaptive observer that estimates and detects the system faults for each subsystem of the 3-DOF helicopter. The proposed control scheme avoids difficult modeling, attenuates the chattering effect of the SMC, and reduces the number of rules of the fuzzy controller. Exponential stability of the closed loop is guaranteed by using the Lyapunov method. The simulation results show that the AFTIT2FSMC can greatly alleviate the chattering effect and provide good tracking performance, even in the presence of actuator and sensor faults. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
Fault tolerant linear actuator

DOEpatents

Tesar, Delbert

2004-09-14

In varying embodiments, the fault tolerant linear actuator of the present invention is a new and improved linear actuator with fault tolerance and positional control that may incorporate velocity summing, force summing, or a combination of the two. In one embodiment, the invention offers a velocity summing arrangement with a differential gear between two prime movers driving a cage, which then drives a linear spindle screw transmission. Other embodiments feature two prime movers driving separate linear spindle screw transmissions, one internal and one external, in a totally concentric and compact integrated module.
Error rates and resource overheads of encoded three-qubit gates

NASA Astrophysics Data System (ADS)

Takagi, Ryuji; Yoder, Theodore J.; Chuang, Isaac L.

2017-10-01

A non-Clifford gate is required for universal quantum computation, and, typically, this is the most error-prone and resource-intensive logical operation on an error-correcting code. Small, single-qubit rotations are popular choices for this non-Clifford gate, but certain three-qubit gates, such as Toffoli or controlled-controlled-Z (ccz), are equivalent options that are also more suited for implementing some quantum algorithms, for instance, those with coherent classical subroutines. Here, we calculate error rates and resource overheads for implementing logical ccz with pieceable fault tolerance, a nontransversal method for implementing logical gates. We provide a comparison with a nonlocal magic-state scheme on a concatenated code and a local magic-state scheme on the surface code. We find the pieceable fault-tolerance scheme particularly advantaged over magic states on concatenated codes and in certain regimes over magic states on the surface code. Our results suggest that pieceable fault tolerance is a promising candidate for fault tolerance in a near-future quantum computer.
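The equivalence between Toffoli and ccz mentioned in this abstract can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the standard identity that conjugating ccz with a Hadamard on the target qubit yields Toffoli, and the helper names (`kron`, `matmul`, `H_target`) are illustrative.

```python
import math

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Plain matrix product of nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

I2 = [[1, 0], [0, 1]]
s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

# CCZ is diagonal: it flips the sign of |111> only (basis index 7).
CCZ = [[0] * 8 for _ in range(8)]
for i in range(8):
    CCZ[i][i] = -1 if i == 7 else 1

# Toffoli swaps |110> and |111> (indices 6 and 7); the target is the last qubit.
TOFFOLI = [[0] * 8 for _ in range(8)]
for i in range(8):
    TOFFOLI[i][{6: 7, 7: 6}.get(i, i)] = 1

# Hadamard on the target qubit only, applied before and after CCZ.
H_target = kron(kron(I2, I2), H)
candidate = matmul(matmul(H_target, CCZ), H_target)

# Largest entry-wise deviation between the two 8x8 matrices.
max_diff = max(abs(candidate[i][j] - TOFFOLI[i][j])
               for i in range(8) for j in range(8))
print(max_diff < 1e-12)
```

Because the Hadamard is a Clifford gate, this is one way to see why either three-qubit gate can serve as the non-Clifford resource.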
Fault-tolerant wait-free shared objects

NASA Technical Reports Server (NTRS)

Jayanti, Prasad; Chandra, Tushar D.; Toueg, Sam

1992-01-01

A concurrent system consists of processes communicating via shared objects, such as shared variables, queues, etc. The concept of wait-freedom was introduced to cope with process failures: each process that accesses a wait-free object is guaranteed to get a response even if all the other processes crash. However, if a wait-free object 'crashes,' all the processes that access that object are prevented from making progress. In this paper, we introduce the concept of fault-tolerant wait-free objects, and study the problem of implementing them. We give a universal method to construct fault-tolerant wait-free objects, for all types of 'responsive' failures (including one in which faulty objects may 'lie'). In sharp contrast, we prove that many common and interesting types (such as queues, sets, and test&set) have no fault-tolerant wait-free implementations even under the most benign of the 'non-responsive' types of failure. We also introduce several concepts and techniques that are central to the design of fault-tolerant concurrent systems: the concepts of self-implementation and graceful degradation, and techniques to automatically increase the fault-tolerance of implementations. We prove matching lower bounds on the resource complexity of most of our algorithms.
Universal fault-tolerant quantum computation with only transversal gates and error correction.

PubMed

Paetznick, Adam; Reichardt, Ben W

2013-08-30

Transversal implementations of encoded unitary gates are highly desirable for fault-tolerant quantum computation. Though transversal gates alone cannot be computationally universal, they can be combined with specially distilled resource states in order to achieve universality. We show that "triorthogonal" stabilizer codes, introduced for state distillation by Bravyi and Haah [Phys. Rev. A 86, 052329 (2012)], admit transversal implementation of the controlled-controlled-Z gate. We then construct a universal set of fault-tolerant gates without state distillation by using only transversal controlled-controlled-Z, transversal Hadamard, and fault-tolerant error correction. We also adapt the distillation procedure of Bravyi and Haah to Toffoli gates, improving on existing Toffoli distillation schemes.
Room temperature high-fidelity holonomic single-qubit gate on a solid-state spin.

PubMed

Arroyo-Camejo, Silvia; Lazariev, Andrii; Hell, Stefan W; Balasubramanian, Gopalakrishnan

2014-09-12

At its most fundamental level, circuit-based quantum computation relies on the application of controlled phase shift operations on quantum registers. While these operations are generally compromised by noise and imperfections, quantum gates based on geometric phase shifts can provide intrinsically fault-tolerant quantum computing. Here we demonstrate the high-fidelity realization of a recently proposed fast (non-adiabatic) and universal (non-Abelian) holonomic single-qubit gate, using an individual solid-state spin qubit under ambient conditions. This fault-tolerant quantum gate provides an elegant means for achieving the fidelity threshold indispensable for implementing quantum error correction protocols. Since we employ a spin qubit associated with a nitrogen-vacancy colour centre in diamond, this system is based on integrable and scalable hardware exhibiting strong analogy to current silicon technology. This quantum gate realization is a promising step towards viable, fault-tolerant quantum computing under ambient conditions.
Adaptive robust fault-tolerant control for linear MIMO systems with unmatched uncertainties

NASA Astrophysics Data System (ADS)

Zhang, Kangkang; Jiang, Bin; Yan, Xing-Gang; Mao, Zehui

2017-10-01

In this paper, two novel fault-tolerant control design approaches are proposed for linear MIMO systems with actuator additive faults, multiplicative faults and unmatched uncertainties. For time-varying multiplicative and additive faults, new adaptive laws and additive compensation functions are proposed. A set of conditions is developed such that the unmatched uncertainties are compensated by actuators in control. On the other hand, for unmatched uncertainties with their projection in unmatched space being not zero, based on a (vector) relative degree condition, additive functions are designed to compensate for the uncertainties from output channels in the presence of actuator faults. The developed fault-tolerant control schemes are applied to two aircraft systems to demonstrate the efficiency of the proposed approaches.
Design and Analysis of Linear Fault-Tolerant Permanent-Magnet Vernier Machines

PubMed Central

Xu, Liang; Liu, Guohai; Du, Yi; Liu, Hu

2014-01-01

This paper proposes a new linear fault-tolerant permanent-magnet (PM) vernier (LFTPMV) machine, which can offer high thrust by using the magnetic gear effect. Both PMs and windings of the proposed machine are on short mover, while the long stator is only manufactured from iron. Hence, the proposed machine is very suitable for long stroke system applications. The key of this machine is that the magnetizer splits the two movers with modular and complementary structures. Hence, the proposed machine offers improved symmetrical and sinusoidal back electromotive force waveform and reduced detent force. Furthermore, owing to the complementary structure, the proposed machine possesses favorable fault-tolerant capability, namely, independent phases. In particular, differing from the existing fault-tolerant machines, the proposed machine offers fault tolerance without sacrificing thrust density. This is because neither fault-tolerant teeth nor the flux-barriers are adopted. The electromagnetic characteristics of the proposed machine are analyzed using the time-stepping finite-element method, which verifies the effectiveness of the theoretical analysis. PMID:24982959
Design and analysis of linear fault-tolerant permanent-magnet vernier machines.

PubMed

Xu, Liang; Ji, Jinghua; Liu, Guohai; Du, Yi; Liu, Hu

2014-01-01

This paper proposes a new linear fault-tolerant permanent-magnet (PM) vernier (LFTPMV) machine, which can offer high thrust by using the magnetic gear effect. Both PMs and windings of the proposed machine are on short mover, while the long stator is only manufactured from iron. Hence, the proposed machine is very suitable for long stroke system applications. The key of this machine is that the magnetizer splits the two movers with modular and complementary structures. Hence, the proposed machine offers improved symmetrical and sinusoidal back electromotive force waveform and reduced detent force. Furthermore, owing to the complementary structure, the proposed machine possesses favorable fault-tolerant capability, namely, independent phases. In particular, differing from the existing fault-tolerant machines, the proposed machine offers fault tolerance without sacrificing thrust density. This is because neither fault-tolerant teeth nor the flux-barriers are adopted. The electromagnetic characteristics of the proposed machine are analyzed using the time-stepping finite-element method, which verifies the effectiveness of the theoretical analysis.
Demonstration of a quantum error detection code using a square lattice of four superconducting qubits

PubMed Central

Córcoles, A.D.; Magesan, Easwar; Srinivasan, Srikanth J.; Cross, Andrew W.; Steffen, M.; Gambetta, Jay M.; Chow, Jerry M.

2015-01-01

The ability to detect and deal with errors when manipulating quantum systems is a fundamental requirement for fault-tolerant quantum computing. Unlike classical bits that are subject to only digital bit-flip errors, quantum bits are susceptible to a much larger spectrum of errors, for which any complete quantum error-correcting code must account. Whilst classical bit-flip detection can be realized via a linear array of qubits, a general fault-tolerant quantum error-correcting code requires extending into a higher-dimensional lattice. Here we present a quantum error detection protocol on a two-by-two planar lattice of superconducting qubits. The protocol detects an arbitrary quantum error on an encoded two-qubit entangled state via quantum non-demolition parity measurements on another pair of error syndrome qubits. This result represents a building block towards larger lattices amenable to fault-tolerant quantum error correction architectures such as the surface code. PMID:25923200
Demonstration of a quantum error detection code using a square lattice of four superconducting qubits.

PubMed

Córcoles, A D; Magesan, Easwar; Srinivasan, Srikanth J; Cross, Andrew W; Steffen, M; Gambetta, Jay M; Chow, Jerry M

2015-04-29

The ability to detect and deal with errors when manipulating quantum systems is a fundamental requirement for fault-tolerant quantum computing. Unlike classical bits that are subject to only digital bit-flip errors, quantum bits are susceptible to a much larger spectrum of errors, for which any complete quantum error-correcting code must account. Whilst classical bit-flip detection can be realized via a linear array of qubits, a general fault-tolerant quantum error-correcting code requires extending into a higher-dimensional lattice. Here we present a quantum error detection protocol on a two-by-two planar lattice of superconducting qubits. The protocol detects an arbitrary quantum error on an encoded two-qubit entangled state via quantum non-demolition parity measurements on another pair of error syndrome qubits. This result represents a building block towards larger lattices amenable to fault-tolerant quantum error correction architectures such as the surface code.
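The parity-check idea in this abstract can be illustrated with a small state-vector calculation. This is a sketch under stated assumptions, not the experimental protocol: it assumes the encoded state is the Bell state (|00⟩ + |11⟩)/√2, it evaluates the ZZ and XX parities directly instead of modelling the syndrome qubits, and all function names are hypothetical.

```python
import math

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def expectation(m, v):
    """Real part of <v|m|v> for a normalized state v."""
    mv = apply(m, v)
    return sum(complex(v[i]).conjugate() * mv[i] for i in range(len(v))).real

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

s = 1 / math.sqrt(2)
bell = [s, 0, 0, s]              # assumed code state (|00> + |11>)/sqrt(2)
ZZ, XX = kron(Z, Z), kron(X, X)  # the two parity checks

# Parities of the undisturbed state: both +1.
clean = (round(expectation(ZZ, bell)), round(expectation(XX, bell)))

# Apply every single-qubit Pauli error and record the (ZZ, XX) parities.
syndromes = {}
for name, pauli in (("X", X), ("Y", Y), ("Z", Z)):
    for qubit in (0, 1):
        error = kron(pauli, I2) if qubit == 0 else kron(I2, pauli)
        corrupted = apply(error, bell)
        syndromes[(name, qubit)] = (round(expectation(ZZ, corrupted)),
                                    round(expectation(XX, corrupted)))

for key, value in sorted(syndromes.items()):
    print(key, value)
```

In this sketch every single-qubit Pauli error flips at least one of the two parities, so it is detected. Errors on the two qubits give the same signature, which is why the scheme detects errors but cannot locate and correct them.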
Integral Sliding Mode Fault-Tolerant Control for Uncertain Linear Systems Over Networks With Signals Quantization.

PubMed

Hao, Li-Ying; Park, Ju H; Ye, Dan

2017-09-01

In this paper, a new robust fault-tolerant compensation control method for uncertain linear systems over networks is proposed, where only quantized signals are assumed to be available. This approach is based on the integral sliding mode (ISM) method where two kinds of integral sliding surfaces are constructed. One is the continuous-state-dependent surface with the aim of sliding mode stability analysis and the other is the quantization-state-dependent surface, which is used for ISM controller design. A scheme that combines the adaptive ISM controller and quantization parameter adjustment strategy is then proposed. Through utilizing H∞ control analytical technique, once the system is in the sliding mode, the nature of performing disturbance attenuation and fault tolerance from the initial time can be found without requiring any fault information. Finally, the effectiveness of our proposed ISM control fault-tolerant schemes against quantization errors is demonstrated in the simulation.
Quantitative fault tolerant control design for a hydraulic actuator with a leaking piston seal

NASA Astrophysics Data System (ADS)

Karpenko, Mark

Hydraulic actuators are complex fluid power devices whose performance can be degraded in the presence of system faults. In this thesis a linear, fixed-gain, fault tolerant controller is designed that can maintain the positioning performance of an electrohydraulic actuator operating under load with a leaking piston seal and in the presence of parametric uncertainties. Developing a control system tolerant to this class of internal leakage fault is important since a leaking piston seal can be difficult to detect, unless the actuator is disassembled. The designed fault tolerant control law is of low-order, uses only the actuator position as feedback, and can: (i) accommodate nonlinearities in the hydraulic functions, (ii) maintain robustness against typical uncertainties in the hydraulic system parameters, and (iii) keep the positioning performance of the actuator within prescribed tolerances despite an internal leakage fault that can bypass up to 40% of the rated servovalve flow across the actuator piston. Experimental tests verify the functionality of the fault tolerant control under normal and faulty operating conditions. The fault tolerant controller is synthesized based on linear time-invariant equivalent (LTIE) models of the hydraulic actuator using the quantitative feedback theory (QFT) design technique. A numerical approach for identifying LTIE frequency response functions of hydraulic actuators from acceptable input-output responses is developed so that linearizing the hydraulic functions can be avoided. The proposed approach can properly identify the features of the hydraulic actuator frequency response that are important for control system design and requires no prior knowledge about the asymptotic behavior or structure of the LTIE transfer functions. 
A distributed hardware-in-the-loop (HIL) simulation architecture is constructed that enables the performance of the proposed fault tolerant control law to be further substantiated, under realistic operating conditions. Using the HIL framework, the fault tolerant hydraulic actuator is operated as a flight control actuator against the real-time numerical simulation of a high-performance jet aircraft. A robust electrohydraulic loading system is also designed using QFT so that the in-flight aerodynamic load can be experimentally replicated. The results of the HIL experiments show that using the fault tolerant controller to compensate the internal leakage fault at the actuator level can benefit the flight performance of the airplane.
General linear codes for fault-tolerant matrix operations on processor arrays

NASA Technical Reports Server (NTRS)

Nair, V. S. S.; Abraham, J. A.

1988-01-01

Various checksum codes have been suggested for fault-tolerant matrix computations on processor arrays. Use of these codes is limited due to potential roundoff and overflow errors. Numerical errors may also be misconstrued as errors due to physical faults in the system. In this paper, a set of linear codes is identified which can be used for fault-tolerant matrix operations such as matrix addition, multiplication, transposition, and LU-decomposition, with minimum numerical error. Encoding schemes are given for some of the example codes which fall under the general set of codes. With the help of experiments, a rule of thumb for the selection of a particular code for a given application is derived.
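As background for the checksum codes this abstract refers to, here is a minimal sketch of the classical row/column checksum scheme for matrix multiplication. It is not the general linear codes proposed in the report. It uses integer matrices so that roundoff, the limitation the abstract highlights, does not arise, and the function names are illustrative.

```python
def column_checksum(a):
    """Append a row holding the sum of each column."""
    return a + [[sum(col) for col in zip(*a)]]

def row_checksum(b):
    """Append to each row the sum of its entries."""
    return [row + [sum(row)] for row in b]

def matmul(a, b):
    """Plain matrix product of nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def locate_fault(c):
    """Return (row, col) of a single corrupted data entry, or None if consistent."""
    n, m = len(c) - 1, len(c[0]) - 1
    bad_rows = [i for i in range(n) if sum(c[i][:m]) != c[i][m]]
    bad_cols = [j for j in range(m) if sum(c[i][j] for i in range(n)) != c[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("fault is not a single data-entry error")

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# Multiplying the encoded operands yields a product carrying both checksums.
C = matmul(column_checksum(A), row_checksum(B))
clean_result = locate_fault(C)

# Simulate a faulty processor corrupting one entry of the product.
C[1][0] += 5
fault = locate_fault(C)

# Repair the entry from its row checksum.
i, j = fault
m = len(C[0]) - 1
C[i][j] = C[i][m] - sum(C[i][k] for k in range(m) if k != j)
print(fault, C[i][j])
```

With floating-point data the exact comparisons in `locate_fault` would need a tolerance, and choosing that tolerance is where numerical error can be mistaken for a physical fault.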
Nonuniform code concatenation for universal fault-tolerant quantum computing

NASA Astrophysics Data System (ADS)

Nikahd, Eesa; Sedighi, Mehdi; Saheb Zamani, Morteza

2017-09-01

Using transversal gates is a straightforward and efficient technique for fault-tolerant quantum computing. Since transversal gates alone cannot be computationally universal, they must be combined with other approaches such as magic state distillation, code switching, or code concatenation to achieve universality. In this paper we propose an alternative approach for universal fault-tolerant quantum computing, mainly based on the code concatenation approach proposed in [T. Jochym-O'Connor and R. Laflamme, Phys. Rev. Lett. 112, 010505 (2014), 10.1103/PhysRevLett.112.010505], but in a nonuniform fashion. The proposed approach is described based on nonuniform concatenation of the 7-qubit Steane code with the 15-qubit Reed-Muller code, as well as the 5-qubit code with the 15-qubit Reed-Muller code, which lead to two 49-qubit and 47-qubit codes, respectively. These codes can correct any arbitrary single physical error with the ability to perform a universal set of fault-tolerant gates, without using magic state distillation.
Fault-tolerant quantum computation with nondeterministic entangling gates

NASA Astrophysics Data System (ADS)

Auger, James M.; Anwar, Hussain; Gimeno-Segovia, Mercedes; Stace, Thomas M.; Browne, Dan E.

2018-03-01

Performing entangling gates between physical qubits is necessary for building a large-scale universal quantum computer, but in some physical implementations—for example, those that are based on linear optics or networks of ion traps—entangling gates can only be implemented probabilistically. In this work, we study the fault-tolerant performance of a topological cluster state scheme with local nondeterministic entanglement generation, where failed entangling gates (which correspond to bonds on the lattice representation of the cluster state) lead to a defective three-dimensional lattice with missing bonds. We present two approaches for dealing with missing bonds; the first is a nonadaptive scheme that requires no additional quantum processing, and the second is an adaptive scheme in which qubits can be measured in an alternative basis to effectively remove them from the lattice, hence eliminating their damaging effect and leading to better threshold performance. We find that a fault-tolerance threshold can still be observed with a bond-loss rate of 6.5% for the nonadaptive scheme, and a bond-loss rate as high as 14.5% for the adaptive scheme.
Distributed fault-tolerant time-varying formation control for high-order linear multi-agent systems with actuator failures.

PubMed

Hua, Yongzhao; Dong, Xiwang; Li, Qingdong; Ren, Zhang

2017-11-01

This paper investigates the fault-tolerant time-varying formation control problems for high-order linear multi-agent systems in the presence of actuator failures. Firstly, a fully distributed formation control protocol is presented to compensate for the influences of both bias fault and loss of effectiveness fault. Using the adaptive online updating strategies, no global knowledge about the communication topology is required and the bounds of actuator failures can be unknown. Then an algorithm is proposed to determine the control parameters of the fault-tolerant formation protocol, where the time-varying formation feasible conditions and an approach to expand the feasible formation set are given. Furthermore, the stability of the proposed algorithm is proven based on the Lyapunov-like theory. Finally, two simulation examples are given to demonstrate the effectiveness of the theoretical results. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
Using concatenated quantum codes for universal fault-tolerant quantum gates.

PubMed

Jochym-O'Connor, Tomas; Laflamme, Raymond

2014-01-10

We propose a method for universal fault-tolerant quantum computation using concatenated quantum error correcting codes. The concatenation scheme exploits the transversal properties of two different codes, combining them to provide a means to protect against low-weight arbitrary errors. We give the required properties of the error correcting codes to ensure universal fault tolerance and discuss a particular example using the 7-qubit Steane and 15-qubit Reed-Muller codes. Namely, other than computational basis state preparation as required by the DiVincenzo criteria, our scheme requires no special ancillary state preparation to achieve universality, as opposed to schemes such as magic state distillation. We believe that optimizing the codes used in such a scheme could provide a useful alternative to state distillation schemes that exhibit high overhead costs.
Tutorial: Advanced fault tree applications using HARP

NASA Technical Reports Server (NTRS)

Dugan, Joanne Bechta; Bavuso, Salvatore J.; Boyd, Mark A.

1993-01-01

Reliability analysis of fault tolerant computer systems for critical applications is complicated by several factors. These modeling difficulties are discussed and dynamic fault tree modeling techniques for handling them are described and demonstrated. Several advanced fault tolerant computer systems are described, and fault tree models for their analysis are presented. HARP (Hybrid Automated Reliability Predictor) is a software package developed at Duke University and NASA Langley Research Center that is capable of solving the fault tree models presented.
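For orientation, the static part of a fault tree can be evaluated with two gate formulas. The sketch below is not HARP and does not model the dynamic gates the tutorial covers. It assumes statistically independent basic events, and the example system (two redundant processors sharing one bus) and its probabilities are hypothetical.

```python
def and_gate(*probs):
    """Output fails only if all independent inputs fail."""
    p = 1.0
    for q in probs:
        p *= q
    return p

def or_gate(*probs):
    """Output fails if at least one independent input fails."""
    p_none = 1.0
    for q in probs:
        p_none *= (1.0 - q)
    return 1.0 - p_none

# Hypothetical basic-event failure probabilities over some mission time.
p_cpu = 0.01    # each of two redundant processors
p_bus = 0.001   # single shared bus

# Top event: both processors fail, or the bus fails.
p_top = or_gate(and_gate(p_cpu, p_cpu), p_bus)
print(p_top)
```

Behaviour that depends on the order of failures or on imperfect fault coverage cannot be expressed with these two gates alone, which is the kind of modeling difficulty dynamic fault trees are meant to address.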

Advanced development for space robotics with emphasis on fault tolerance

NASA Technical Reports Server (NTRS)

Tesar, D.; Chladek, J.; Hooper, R.; Sreevijayan, D.; Kapoor, C.; Geisinger, J.; Meaney, M.; Browning, G.; Rackers, K.

1995-01-01

This paper describes the ongoing work in fault tolerance at the University of Texas at Austin. The paper describes the technical goals the group is striving to achieve and includes a brief description of the individual projects focusing on fault tolerance. The ultimate goal is to develop and test technology applicable to all future missions of NASA (lunar base, Mars exploration, planetary surveillance, space station, etc.).
Fault-tolerant conversion between adjacent Reed-Muller quantum codes based on gauge fixing

NASA Astrophysics Data System (ADS)

Quan, Dong-Xiao; Zhu, Li-Li; Pei, Chang-Xing; Sanders, Barry C.

2018-03-01

We design forward and backward fault-tolerant conversion circuits, which convert between the Steane code and the 15-qubit Reed-Muller quantum code so as to provide a universal transversal gate set. In our method, only seven out of a total 14 code stabilizers need to be measured, and we further enhance the circuit by simplifying some stabilizers; thus, we need only to measure eight weight-4 stabilizers for one round of forward conversion and seven weight-4 stabilizers for one round of backward conversion. For conversion, we treat random single-qubit errors and their influence on syndromes of gauge operators, and our novel single-step process enables more efficient fault-tolerant conversion between these two codes. We make our method quite general by showing how to convert between any two adjacent Reed-Muller quantum codes $\overline{\textsf{RM}}(1,m)$ and $\overline{\textsf{RM}}(1,m+1)$, for which we need only measure stabilizers whose number scales linearly with m rather than exponentially with m obtained in previous work. We provide the explicit mathematical expression for the necessary stabilizers and the concomitant resources required.
H∞ robust fault-tolerant controller design for an autonomous underwater vehicle's navigation control system

NASA Astrophysics Data System (ADS)

Cheng, Xiang-Qin; Qu, Jing-Yuan; Yan, Zhe-Ping; Bian, Xin-Qian

2010-03-01

In order to improve the security and reliability for autonomous underwater vehicle (AUV) navigation, an H∞ robust fault-tolerant controller was designed after analyzing variations in state-feedback gain. Operating conditions and the design method were then analyzed so that the control problem could be expressed as a mathematical optimization problem. This permitted the use of linear matrix inequalities (LMI) to solve for the H∞ controller for the system. When considering different actuator failures, these conditions were then also mathematically expressed, allowing the H∞ robust controller to solve for these events and thus be fault-tolerant. Finally, simulation results showed that the H∞ robust fault-tolerant controller could provide precise AUV navigation control with strong robustness.
Fault Tolerant Real-Time Systems

DTIC Science & Technology

1993-09-30

The ART (Advanced Real-Time Technology) Project of Carnegie Mellon University is engaged in wide ranging research on hard real-time systems. The...including hardware and software fault tolerance using temporal redundancy and analytic redundancy to permit the construction of real-time systems whose
Safety Verification of a Fault Tolerant Reconfigurable Autonomous Goal-Based Robotic Control System

NASA Technical Reports Server (NTRS)

Braman, Julia M. B.; Murray, Richard M; Wagner, David A.

2007-01-01

Fault tolerance and safety verification of control systems are essential for the success of autonomous robotic systems. A control architecture called Mission Data System (MDS), developed at the Jet Propulsion Laboratory, takes a goal-based control approach. In this paper, a method for converting goal network control programs into linear hybrid systems is developed. The linear hybrid system can then be verified for safety in the presence of failures using existing symbolic model checkers. An example task is simulated in MDS and successfully verified using HyTech, a symbolic model checking software for linear hybrid systems.
Symposium on the Interface: Computing Science and Statistics (20th). Theme: Computationally Intensive Methods in Statistics Held in Reston, Virginia on April 20-23, 1988

DTIC Science & Technology

1988-08-20

34 William A. Link, Patuxent Wildlife Research Center "Increasing reliability of multiversion fault-tolerant software design by modulation," Junryo 3... Multiversion Fault-Tolerant Software Design by Modularization Junryo Miyashita Department of Computer Science California State University at San Bernardino Fault...They shall be referred to as "multiversion fault-tolerant software design". One problem of developing multi-versions of a program is the high cost
Fault-tolerant linear optical quantum computing with small-amplitude coherent states.

PubMed

Lund, A P; Ralph, T C; Haselgrove, H L

2008-01-25

Quantum computing using two coherent states as a qubit basis is a proposed alternative architecture with lower overheads but has been questioned as a practical way of performing quantum computing due to the fragility of diagonal states with large coherent amplitudes. We show that using error correction only small amplitudes (α > 1.2) are required for fault-tolerant quantum computing. We study fault tolerance under the effects of small amplitudes and loss using a Monte Carlo simulation. The first encoding level resources are orders of magnitude lower than the best single photon scheme.
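One reason the amplitude matters is that two coherent states are never exactly orthogonal. The sketch below assumes the qubit basis is the pair |α⟩ and |−α⟩ (the abstract says only "two coherent states") and uses the standard overlap formula ⟨α|−α⟩ = exp(−2|α|²). The function name is illustrative.

```python
import math

def basis_overlap(alpha):
    """Overlap <alpha|-alpha> of two coherent states with opposite amplitudes."""
    return math.exp(-2.0 * abs(alpha) ** 2)

# Compare a few amplitudes, including the threshold quoted in the abstract.
for alpha in (0.5, 1.2, 2.0):
    print(alpha, basis_overlap(alpha))
```

At the quoted amplitude of 1.2 the overlap is already below 0.06 and it falls off quickly as the amplitude grows. The fault-tolerance analysis itself relies on the Monte Carlo simulation described in the abstract, which this sketch does not reproduce.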
A Non-linear Geodetic Data Inversion Using ABIC for Slip Distribution on a Fault With an Unknown dip Angle

NASA Astrophysics Data System (ADS)

Fukahata, Y.; Wright, T. J.

2006-12-01

We developed a method of geodetic data inversion for slip distribution on a fault with an unknown dip angle. When fault geometry is unknown, the problem of geodetic data inversion is non-linear. A common strategy for obtaining slip distribution is to first determine the fault geometry by minimizing the square misfit under the assumption of a uniform slip on a rectangular fault, and then apply the usual linear inversion technique to estimate a slip distribution on the determined fault. It is not guaranteed, however, that the fault determined under the assumption of a uniform slip gives the best fault geometry for a spatially variable slip distribution. In addition, in obtaining a uniform slip fault model, we have to simultaneously determine the values of the nine mutually dependent parameters, which is a highly non-linear, complicated process. Although the inverse problem is non-linear for cases with unknown fault geometries, the non-linearity of the problems is actually weak, when we can assume the fault surface to be flat. In particular, when a clear fault trace is observed on the Earth's surface after an earthquake, we can precisely estimate the strike and the location of the fault. In this case only the dip angle has large ambiguity. In geodetic data inversion we usually need to introduce smoothness constraints in order to compromise reciprocal requirements for model resolution and estimation errors in a natural way. Strictly speaking, the inverse problem with smoothness constraints is also non-linear, even if the fault geometry is known. The non-linearity has been dissolved by introducing Akaike's Bayesian Information Criterion (ABIC), with which the optimal value of the relative weight of observed data to smoothness constraints is objectively determined. In this study, using ABIC in determining the optimal dip angle, we dissolved the non-linearity of the inverse problem. We applied the method to the InSAR data of the 1995 Dinar, Turkey earthquake and obtained a much shallower dip angle than before.
Reliability Assessment for Low-cost Unmanned Aerial Vehicles

NASA Astrophysics Data System (ADS)

Freeman, Paul Michael

Existing low-cost unmanned aerospace systems are unreliable, and engineers must blend reliability analysis with fault-tolerant control in novel ways. This dissertation introduces the University of Minnesota unmanned aerial vehicle flight research platform, a comprehensive simulation and flight test facility for reliability and fault-tolerance research. An industry-standard reliability assessment technique, the failure modes and effects analysis, is performed for an unmanned aircraft. Particular attention is afforded to the control surface and servo-actuation subsystem. Maintaining effector health is essential for safe flight; failures may lead to loss of control incidents. Failure likelihood, severity, and risk are qualitatively assessed for several effector failure modes. Design changes are recommended to improve aircraft reliability based on this analysis. Most notably, the control surfaces are split, providing independent actuation and dual-redundancy. The simulation models for control surface aerodynamic effects are updated to reflect the split surfaces using a first-principles geometric analysis. The failure modes and effects analysis is extended by using a high-fidelity nonlinear aircraft simulation. A trim state discovery is performed to identify the achievable steady, wings-level flight envelope of the healthy and damaged vehicle. Tolerance of elevator actuator failures is studied using familiar tools from linear systems analysis. This analysis reveals significant inherent performance limitations for candidate adaptive/reconfigurable control algorithms used for the vehicle. Moreover, it demonstrates how these tools can be applied in a design feedback loop to make safety-critical unmanned systems more reliable. Control surface impairments that do occur must be quickly and accurately detected. 
This dissertation also considers fault detection and identification for an unmanned aerial vehicle using model-based and model-free approaches and applies those algorithms to experimental faulted and unfaulted flight test data. Flight tests are conducted with actuator faults that affect the plant input and sensor faults that affect the vehicle state measurements. A model-based detection strategy is designed and uses robust linear filtering methods to reject exogenous disturbances, e.g. wind, while providing robustness to model variation. A data-driven algorithm is developed to operate exclusively on raw flight test data without physical model knowledge. The fault detection and identification performance of these complementary but different methods is compared. Together, enhanced reliability assessment and multi-pronged fault detection and identification techniques can help to bring about the next generation of reliable low-cost unmanned aircraft.
Self-stabilizing byzantine-fault-tolerant clock synchronization system and method

NASA Technical Reports Server (NTRS)

Malekpour, Mahyar R. (Inventor)

2012-01-01

Systems and methods for rapid Byzantine-fault-tolerant self-stabilizing clock synchronization are provided. The systems and methods are based on a protocol comprising a state machine and a set of monitors that execute once every local oscillator tick. The protocol is independent of application-specific requirements. The faults are assumed to be arbitrary and/or malicious. All timing measures of variables are based on the node's local clock and thus no central clock or externally generated pulse is used. Instances of the protocol are shown to tolerate bursts of transient failures and deterministically converge with a linear convergence time with respect to the synchronization period as predicted.
Robust fault tolerant control based on sliding mode method for uncertain linear systems with quantization.

PubMed

Hao, Li-Ying; Yang, Guang-Hong

2013-09-01

This paper is concerned with the robust fault-tolerant compensation control problem for uncertain linear systems subject to both state and input signal quantization. By combining a novel matrix full-rank factorization technique with sliding surface design, the total failure of certain actuators can be coped with, under a special actuator redundancy assumption. In order to compensate for quantization errors, an adjustment range of quantization sensitivity for a dynamic uniform quantizer is given through the flexible choices of design parameters. Compared with the existing results, the derived inequality condition leads to stronger fault tolerance ability and a much wider scope of applicability. With a static adjustment policy of quantization sensitivity, an adaptive sliding mode controller is then designed to maintain the sliding mode, where the gain of the nonlinear unit vector term is updated automatically to compensate for the effects of actuator faults, quantization errors, exogenous disturbances and parameter uncertainties without the need for a fault detection and isolation (FDI) mechanism. Finally, the effectiveness of the proposed design method is illustrated via a structural-acoustic model of a rocket fairing. Copyright © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
Adaptive-gain fast super-twisting sliding mode fault tolerant control for a reusable launch vehicle in reentry phase.

PubMed

Zhang, Yao; Tang, Shengjing; Guo, Jie

2017-11-01

In this paper, a novel adaptive-gain fast super-twisting (AGFST) sliding mode attitude control synthesis is carried out for a reusable launch vehicle (RLV) subject to actuator faults and unknown disturbances. According to the fast nonsingular terminal sliding mode surface (FNTSMS) and adaptive-gain fast super-twisting algorithm, an adaptive fault tolerant control law for the attitude stabilization is derived to protect against the actuator faults and unknown uncertainties. Firstly, a second-order nonlinear control-oriented model for the RLV is established by the feedback linearization method. On this basis, a fast nonsingular terminal sliding mode (FNTSM) manifold is designed, which provides fast finite-time global convergence and avoids the singularity problem as well as the chattering phenomenon. Based on the merits of the standard super-twisting (ST) algorithm and fast reaching law with adaption, a novel adaptive-gain fast super-twisting (AGFST) algorithm is proposed for the finite-time fault tolerant attitude control problem of the RLV without any knowledge of the bounds of uncertainties and actuator faults. The important features of the AGFST algorithm include avoiding overestimation of the control gains and a faster convergence speed than the standard ST algorithm. A formal proof of the finite-time stability of the closed-loop system is derived using the Lyapunov function technique. An estimation of the convergence time and an accurate expression of the convergence region are also provided. Finally, simulations are presented to illustrate the effectiveness and superiority of the proposed control scheme. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
Experiments in fault tolerant software reliability

NASA Technical Reports Server (NTRS)

Mcallister, David F.; Tai, K. C.; Vouk, Mladen A.

1987-01-01

The reliability of voting was evaluated in a fault-tolerant software system for small output spaces. The effectiveness of the back-to-back testing process was investigated. Version 3.0 of the RSDIMU-ATS, a semi-automated test bed for certification testing of RSDIMU software, was prepared and distributed. Software reliability estimation methods based on non-random sampling are being studied. The investigation of existing fault-tolerance models was continued and formulation of new models was initiated.
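The voting step evaluated in this record can be illustrated with a minimal sketch. It assumes a strict-majority voter over the outputs of independently developed versions; the function name `majority_vote` and the sample outputs are hypothetical and are not taken from the report.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of versions, else None."""
    if not outputs:
        return None
    # most_common(1) gives the single most frequent (value, count) pair.
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


# Three versions, one of which disagrees: the voter masks the outlier.
agreed = majority_vote([1, 1, 0])
# Three versions that all disagree: no majority, so the voter abstains.
split = majority_vote([1, 2, 3])
```

When the output space is small (for example a single bit), two faulty versions can agree on the same wrong value by chance, and a voter of this kind would then return that wrong value. That is one reason voting over small output spaces deserves separate study.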
Evaluation of reliability modeling tools for advanced fault tolerant systems

NASA Technical Reports Server (NTRS)

Baker, Robert; Scheper, Charlotte

1986-01-01

The Computer Aided Reliability Estimation (CARE III) and Automated Reliability Interactive Estimation System (ARIES 82) reliability tools for application to advanced fault tolerant aerospace systems were evaluated. To determine reliability modeling requirements, the evaluation focused on the Draper Laboratories' Advanced Information Processing System (AIPS) architecture as an example architecture for fault tolerant aerospace systems. Advantages and limitations were identified for each reliability evaluation tool. The CARE III program was designed primarily for analyzing ultrareliable flight control systems. The ARIES 82 program's primary use was to support university research and teaching. Neither CARE III nor ARIES 82 was suited for determining the reliability of complex nodal networks of the type used to interconnect processing sites in the AIPS architecture. It was concluded that ARIES was not suitable for modeling advanced fault tolerant systems. It was further concluded that, subject to some limitations (the difficulty in modeling systems with unpowered spare modules, systems where equipment maintenance must be considered, systems where failure depends on the sequence in which faults occurred, and systems where more than two near-coincident faults must be considered), CARE III is best suited for evaluating the reliability of advanced fault tolerant systems for air transport.
Gaussian error correction of quantum states in a correlated noisy channel.

PubMed

Lassen, Mikael; Berni, Adriano; Madsen, Lars S; Filip, Radim; Andersen, Ulrik L

2013-11-01

Noise is the main obstacle for the realization of fault-tolerant quantum information processing and secure communication over long distances. In this work, we propose a communication protocol relying on simple linear optics that optimally protects quantum states from non-Markovian or correlated noise. We implement the protocol experimentally and demonstrate the near-ideal protection of coherent and entangled states in an extremely noisy channel. Since all real-life channels exhibit pronounced non-Markovian behavior, the proposed protocol will have immediate implications in improving the performance of various quantum information protocols.
Determination of the optimal tolerance for MLC positioning in sliding window and VMAT techniques

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hernandez, V., E-mail: vhernandezmasgrau@gmail.com; Abella, R.; Calvo, J. F.

2015-04-15

Purpose: Several authors have recommended a 2 mm tolerance for multileaf collimator (MLC) positioning in sliding window treatments. In volumetric modulated arc therapy (VMAT) treatments, however, the optimal tolerance for MLC positioning remains unknown. In this paper, the authors present the results of a multicenter study to determine the optimal tolerance for both techniques. Methods: The procedure used is based on dynalog file analysis. The study was carried out using seven Varian linear accelerators from five different centers. Dynalogs were collected from over 100 000 clinical treatments and in-house software was used to compute the number of tolerance faults as a function of the user-defined tolerance. Thus, the optimal value for this tolerance, defined as the lowest achievable value, was investigated. Results: Dynalog files accurately predict the number of tolerance faults as a function of the tolerance value, especially for low fault incidences. All MLCs behaved similarly and the Millennium120 and the HD120 models yielded comparable results. In sliding window techniques, the number of beams with an incidence of hold-offs >1% rapidly decreases for a tolerance of 1.5 mm. In VMAT techniques, the number of tolerance faults sharply drops for tolerances around 2 mm. For a tolerance of 2.5 mm, less than 0.1% of the VMAT arcs presented tolerance faults. Conclusions: Dynalog analysis provides a feasible method for investigating the optimal tolerance for MLC positioning in dynamic fields. In sliding window treatments, the tolerance of 2 mm was found to be adequate, although it can be reduced to 1.5 mm. In VMAT treatments, the typically used 5 mm tolerance is excessively high. Instead, a tolerance of 2.5 mm is recommended.
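Counting tolerance faults as a function of the user-defined tolerance can be sketched as follows. This is a minimal illustration, not the authors' in-house software: it assumes the leaf-position deviations (planned minus actual, in mm) have already been extracted from the dynalog files, and the function names and sample values are hypothetical.

```python
def count_tolerance_faults(deviations_mm, tolerance_mm):
    """Count samples whose absolute leaf-position deviation exceeds the tolerance."""
    return sum(1 for d in deviations_mm if abs(d) > tolerance_mm)


def fault_curve(deviations_mm, tolerances_mm):
    """Map each candidate tolerance to the number of faults it would trigger."""
    return {t: count_tolerance_faults(deviations_mm, t) for t in tolerances_mm}


# Hypothetical deviations in mm, chosen for illustration only (not measured data).
deviations = [0.3, -1.2, 1.7, 2.2, -2.6, 0.9, 3.1]
curve = fault_curve(deviations, [1.5, 2.0, 2.5, 5.0])
```

Scanning such a curve for the smallest tolerance at which the fault count stays acceptably low mirrors the abstract's definition of the optimal tolerance as the lowest achievable value.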
Robust Gain-Scheduled Fault Tolerant Control for a Transport Aircraft

NASA Technical Reports Server (NTRS)

Shin, Jong-Yeob; Gregory, Irene

2007-01-01

This paper presents an application of robust gain-scheduled control concepts using a linear parameter-varying (LPV) control synthesis method to design fault tolerant controllers for a civil transport aircraft. To apply the robust LPV control synthesis method, the nonlinear dynamics must be represented by an LPV model, which is developed using the function substitution method over the entire flight envelope. The developed LPV model associated with the aerodynamic coefficient uncertainties represents nonlinear dynamics including those outside the equilibrium manifold. Passive and active fault tolerant controllers (FTC) are designed for the longitudinal dynamics of the Boeing 747-100/200 aircraft in the presence of elevator failure. Both FTC laws are evaluated in the full nonlinear aircraft simulation in the presence of the elevator fault and the results are compared to show pros and cons of each control law.
Allocating application to group of consecutive processors in fault-tolerant deadlock-free routing path defined by routers obeying same rules for path selection

DOEpatents

Leung, Vitus J [Albuquerque, NM; Phillips, Cynthia A [Albuquerque, NM; Bender, Michael A [East Northport, NY; Bunde, David P [Urbana, IL

2009-07-21

In a multiple processor computing apparatus, directional routing restrictions and a logical channel construct permit fault tolerant, deadlock-free routing. Processor allocation can be performed by creating a linear ordering of the processors based on routing rules used for routing communications between the processors. The linear ordering can assume a loop configuration, and bin-packing is applied to this loop configuration. The interconnection of the processors can be conceptualized as a generally rectangular 3-dimensional grid, and the MC allocation algorithm is applied with respect to the 3-dimensional grid.
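Allocation on a linear ordering that closes into a loop can be sketched in a few lines. The sketch below is a simple first-fit search for a run of consecutive free processors on a ring; it stands in for the bin-packing heuristics named in the record and is not the patented method. The function name `allocate_on_ring` and the boolean free-list representation are assumptions made for illustration.

```python
def allocate_on_ring(free, size):
    """Return the indices of the first run of `size` consecutive free slots.

    `free` is a list of booleans in ring order; the run may wrap from the
    last slot back to the first. Returns None when no such run exists.
    """
    n = len(free)
    if size <= 0 or size > n:
        return None
    for start in range(n):
        # Indices are taken modulo n so the run can wrap around the loop.
        run = [(start + k) % n for k in range(size)]
        if all(free[i] for i in run):
            return run
    return None


# Slots 3, 4 and 0 are free and adjacent once the ordering is closed into a loop.
ring = [True, False, False, True, True]
wrapped = allocate_on_ring(ring, 3)
```

Treating the ordering as a loop lets a job use free processors at both ends of the list, which a purely linear search would miss.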
Step-by-step magic state encoding for efficient fault-tolerant quantum computation

PubMed Central

Goto, Hayato

2014-01-01

Quantum error correction allows one to make quantum computers fault-tolerant against unavoidable errors due to decoherence and imperfect physical gate operations. However, the fault-tolerant quantum computation requires impractically large computational resources for useful applications. This is a current major obstacle to the realization of a quantum computer. In particular, magic state distillation, which is a standard approach to universality, consumes the most resources in fault-tolerant quantum computation. For the resource problem, here we propose step-by-step magic state encoding for concatenated quantum codes, where magic states are encoded step by step from the physical level to the logical one. To manage errors during the encoding, we carefully use error detection. Since the sizes of intermediate codes are small, it is expected that the resource overheads will become lower than previous approaches based on the distillation at the logical level. Our simulation results suggest that the resource requirements for a logical magic state will become comparable to those for a single logical controlled-NOT gate. Thus, the present method opens a new possibility for efficient fault-tolerant quantum computation. PMID:25511387
Step-by-step magic state encoding for efficient fault-tolerant quantum computation.

PubMed

Goto, Hayato

2014-12-16

Quantum error correction allows one to make quantum computers fault-tolerant against unavoidable errors due to decoherence and imperfect physical gate operations. However, the fault-tolerant quantum computation requires impractically large computational resources for useful applications. This is a current major obstacle to the realization of a quantum computer. In particular, magic state distillation, which is a standard approach to universality, consumes the most resources in fault-tolerant quantum computation. For the resource problem, here we propose step-by-step magic state encoding for concatenated quantum codes, where magic states are encoded step by step from the physical level to the logical one. To manage errors during the encoding, we carefully use error detection. Since the sizes of intermediate codes are small, it is expected that the resource overheads will become lower than previous approaches based on the distillation at the logical level. Our simulation results suggest that the resource requirements for a logical magic state will become comparable to those for a single logical controlled-NOT gate. Thus, the present method opens a new possibility for efficient fault-tolerant quantum computation.

Observer-based distributed adaptive fault-tolerant containment control of multi-agent systems with general linear dynamics.

PubMed

Ye, Dan; Chen, Mengmeng; Li, Kui

2017-11-01

In this paper, we consider the distributed containment control problem of multi-agent systems with actuator bias faults based on observer method. The objective is to drive the followers into the convex hull spanned by the dynamic leaders, where the input is unknown but bounded. By constructing an observer to estimate the states and bias faults, an effective distributed adaptive fault-tolerant controller is developed. Different from the traditional method, an auxiliary controller gain is designed to deal with the unknown inputs and bias faults together. Moreover, the coupling gain can be adjusted online through the adaptive mechanism without using the global information. Furthermore, the proposed control protocol can guarantee that all the signals of the closed-loop systems are bounded and all the followers converge to the convex hull with bounded residual errors formed by the dynamic leaders. Finally, a decoupled linearized longitudinal motion model of the F-18 aircraft is used to demonstrate the effectiveness. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
Fault-tolerant cooperative output regulation for multi-vehicle systems with sensor faults

NASA Astrophysics Data System (ADS)

Qin, Liguo; He, Xiao; Zhou, D. H.

2017-10-01

This paper presents a unified framework of fault diagnosis and fault-tolerant cooperative output regulation (FTCOR) for a linear discrete-time multi-vehicle system with sensor faults. The FTCOR control law is designed through three steps. A cooperative output regulation (COR) controller is designed based on the internal model principle when there are no sensor faults. A sufficient condition on the existence of the COR controller is given based on the discrete-time algebraic Riccati equation (DARE). Then, a decentralised fault diagnosis scheme is designed to cope with sensor faults occurring in followers. A residual generator is developed to detect sensor faults of each follower, and a bank of fault-matching estimators is proposed to isolate and estimate sensor faults of each follower. Unlike the current distributed fault diagnosis for multi-vehicle systems, the presented decentralised fault diagnosis scheme in each vehicle reduces the communication and computation load by only using the information of the vehicle. By combining the sensor fault estimation and the COR control law, an FTCOR controller is proposed. Finally, the simulation results demonstrate the effectiveness of the FTCOR controller.
Fault-tolerant optimised tracking control for unknown discrete-time linear systems using a combined reinforcement learning and residual compensation methodology

NASA Astrophysics Data System (ADS)

Han, Ke-Zhen; Feng, Jian; Cui, Xiaohong

2017-10-01

This paper considers the fault-tolerant optimised tracking control (FTOTC) problem for unknown discrete-time linear systems. A research scheme is proposed on the basis of data-based parity space identification, reinforcement learning and residual compensation techniques. The main characteristic of this research scheme lies in the parity-space-identification-based simultaneous tracking control and residual compensation. The approach comprises four main components: applying a subspace-aided method to design an observer-based residual generator; using the reinforcement Q-learning approach to solve for the optimised tracking control policy; relying on robust H∞ theory to achieve noise attenuation; and adopting fault estimation triggered by the residual generator to perform fault compensation. To clarify the design and implementation procedures, an integrated algorithm is further constructed to link up these four functional units. A detailed analysis and proof are subsequently given to explain the guaranteed FTOTC performance of the proposed scheme. Finally, a case simulation is provided to verify its effectiveness.
Multi-version software reliability through fault-avoidance and fault-tolerance

NASA Technical Reports Server (NTRS)

Vouk, Mladen A.; Mcallister, David F.

1989-01-01

A number of experimental and theoretical issues associated with the practical use of multi-version software to provide run-time tolerance to software faults were investigated. A specialized tool was developed and evaluated for measuring testing coverage for a variety of metrics. The tool was used to collect information on the relationships between software faults and coverage provided by the testing process as measured by different metrics (including data flow metrics). Considerable correlation was found between coverage provided by some higher metrics and the elimination of faults in the code. Back-to-back testing was continued as an efficient mechanism for removal of un-correlated faults, and common-cause faults of variable span. Work also continued on software reliability estimation methods based on non-random sampling, and on the relationship between software reliability and code coverage provided through testing. New fault tolerance models were formulated. Simulation studies of the Acceptance Voting and Multi-stage Voting algorithms were finished and it was found that these two schemes for software fault tolerance are superior in many respects to some commonly used schemes. Particularly encouraging are the safety properties of the Acceptance testing scheme.
Superconducting quantum circuits at the surface code threshold for fault tolerance.

PubMed

Barends, R; Kelly, J; Megrant, A; Veitia, A; Sank, D; Jeffrey, E; White, T C; Mutus, J; Fowler, A G; Campbell, B; Chen, Y; Chen, Z; Chiaro, B; Dunsworth, A; Neill, C; O'Malley, P; Roushan, P; Vainsencher, A; Wenner, J; Korotkov, A N; Cleland, A N; Martinis, John M

2014-04-24

A quantum computer can solve hard problems, such as prime factoring, database searching and quantum simulation, at the cost of needing to protect fragile quantum states from error. Quantum error correction provides this protection by distributing a logical state among many physical quantum bits (qubits) by means of quantum entanglement. Superconductivity is a useful phenomenon in this regard, because it allows the construction of large quantum circuits and is compatible with microfabrication. For superconducting qubits, the surface code approach to quantum computing is a natural choice for error correction, because it uses only nearest-neighbour coupling and rapidly cycled entangling gates. The gate fidelity requirements are modest: the per-step fidelity threshold is only about 99 per cent. Here we demonstrate a universal set of logic gates in a superconducting multi-qubit processor, achieving an average single-qubit gate fidelity of 99.92 per cent and a two-qubit gate fidelity of up to 99.4 per cent. This places Josephson quantum computing at the fault-tolerance threshold for surface code error correction. Our quantum processor is a first step towards the surface code, using five qubits arranged in a linear array with nearest-neighbour coupling. As a further demonstration, we construct a five-qubit Greenberger-Horne-Zeilinger state using the complete circuit and full set of gates. The results demonstrate that Josephson quantum computing is a high-fidelity technology, with a clear path to scaling up to large-scale, fault-tolerant quantum circuits.
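The five-qubit Greenberger-Horne-Zeilinger state mentioned in this record can be reproduced in an idealised, noiseless state-vector sketch. The sketch uses textbook Hadamard and CNOT gates acting only on neighbouring qubits of a linear array; these are stand-ins and not necessarily the gates used in the experiment, and all function names are hypothetical.

```python
import math


def apply_hadamard(state, qubit, n):
    """Apply a Hadamard gate to `qubit` (0 is the most significant bit)."""
    mask = 1 << (n - 1 - qubit)
    s = 1 / math.sqrt(2)
    new = state[:]
    for i in range(len(state)):
        if not i & mask:
            a, b = state[i], state[i | mask]
            new[i] = s * (a + b)
            new[i | mask] = s * (a - b)
    return new


def apply_cnot(state, control, target, n):
    """Apply a CNOT: flip `target` on basis states where `control` is 1."""
    cmask = 1 << (n - 1 - control)
    tmask = 1 << (n - 1 - target)
    new = state[:]
    for i in range(len(state)):
        if i & cmask and not i & tmask:
            # Swap the amplitudes of the two basis states differing in `target`.
            new[i], new[i | tmask] = state[i | tmask], state[i]
    return new


def ghz_state(n=5):
    """Build (|0...0> + |1...1>)/sqrt(2) using only nearest-neighbour CNOTs."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0
    state = apply_hadamard(state, 0, n)
    for q in range(n - 1):
        state = apply_cnot(state, q, q + 1, n)
    return state


ghz = ghz_state(5)
```

Only the all-zeros and all-ones basis states carry amplitude in the result. In hardware each gate has finite fidelity, which is why the per-gate figures quoted in the abstract matter for a circuit of this depth.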
Fault tolerant features and experiments of ANTS distributed real-time system

NASA Astrophysics Data System (ADS)

Dominic-Savio, Patrick; Lo, Jien-Chung; Tufts, Donald W.

1995-01-01

The ANTS project at the University of Rhode Island introduces the concept of Active Nodal Task Seeking (ANTS) as a way to efficiently design and implement dependable, high-performance, distributed computing. This paper presents the fault tolerant design features that have been incorporated in the ANTS experimental system implementation. The results of performance evaluations and fault injection experiments are reported. The fault-tolerant version of ANTS categorizes all computing nodes into three groups. They are: the up-and-running green group, the self-diagnosing yellow group and the failed red group. Each available computing node will be placed in the yellow group periodically for a routine diagnosis. In addition, for long-life missions, ANTS uses a monitoring scheme to identify faulty computing nodes. In this monitoring scheme, the communication pattern of each computing node is monitored by two other nodes.
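The three node groups and the two-monitor scheme described in this record can be sketched as below. The abstract does not say how monitors are chosen or how their observations are combined, so the ring-based assignment in `assign_monitors` and the decision rule in `classify` are assumptions made purely for illustration.

```python
from enum import Enum


class Group(Enum):
    GREEN = "up-and-running"
    YELLOW = "self-diagnosing"
    RED = "failed"


def assign_monitors(node_ids):
    """Map each node to two other nodes that watch it (hypothetical ring layout)."""
    n = len(node_ids)
    if n < 3:
        raise ValueError("need at least three nodes for two distinct monitors")
    return {
        node_ids[i]: (node_ids[(i + 1) % n], node_ids[(i + 2) % n])
        for i in range(n)
    }


def classify(node, flagged_by, monitors):
    """Hypothetical rule: two flags mean RED, one means YELLOW, none means GREEN."""
    votes = sum(1 for m in monitors[node] if m in flagged_by)
    if votes == 2:
        return Group.RED
    if votes == 1:
        return Group.YELLOW
    return Group.GREEN


monitors = assign_monitors(["a", "b", "c", "d"])
status = classify("a", {"b"}, monitors)
```

Requiring agreement from both monitors before declaring a node failed keeps a single faulty monitor from removing a healthy node, at the cost of slower detection.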
Award ER25750: Coordinated Infrastructure for Fault Tolerance Systems Indiana University Final Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lumsdaine, Andrew

2013-03-08

The main purpose of the Coordinated Infrastructure for Fault Tolerance in Systems initiative has been to conduct research with a goal of providing end-to-end fault tolerance on a systemwide basis for applications and other system software. While fault tolerance has been an integral part of most high-performance computing (HPC) system software developed over the past decade, it has been treated mostly as a collection of isolated stovepipes. Visibility and response to faults has typically been limited to the particular hardware and software subsystems in which they are initially observed. Little fault information is shared across subsystems, allowing little flexibility or control on a system-wide basis, making it practically impossible to provide cohesive end-to-end fault tolerance in support of scientific applications. As an example, consider faults such as communication link failures that can be seen by a network library but are not directly visible to the job scheduler, or consider faults related to node failures that can be detected by system monitoring software but are not inherently visible to the resource manager. If information about such faults could be shared by the network libraries or monitoring software, then other system software, such as a resource manager or job scheduler, could ensure that failed nodes or failed network links were excluded from further job allocations and that further diagnosis could be performed. As a founding member and one of the lead developers of the Open MPI project, our efforts over the course of this project have been focused on making Open MPI more robust to failures by supporting various fault tolerance techniques, and using fault information exchange and coordination between MPI and the HPC system software stack from the application, numeric libraries, and programming language runtime to other common system components such as job schedulers, resource managers, and monitoring tools.
Coordinated Fault-Tolerance for High-Performance Computing Final Project Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Panda, Dhabaleswar Kumar; Beckman, Pete

2011-07-28

With the Coordinated Infrastructure for Fault Tolerance Systems (CIFTS, as the original project came to be called) project, our aim has been to understand and tackle the following broad research questions, the answers to which will help the HEC community analyze and shape the direction of research in the field of fault tolerance and resiliency on future high-end leadership systems. Will availability of global fault information, obtained by fault information exchange between the different HEC software on a system, allow individual system software to better detect, diagnose, and adaptively respond to faults? If fault-awareness is raised throughout the system through fault information exchange, is it possible to get all system software working together to provide a more comprehensive end-to-end fault management on the system? What are the missing fault-tolerance features that widely used HEC system software lacks today that would inhibit such software from taking advantage of systemwide global fault information? What are the practical limitations of a systemwide approach for end-to-end fault management based on fault awareness and coordination? What mechanisms, tools, and technologies are needed to bring about fault awareness and coordination of responses on a leadership-class system? What standards, outreach, and community interaction are needed for adoption of the concept of fault awareness and coordination for fault management on future systems? Keeping our overall objectives in mind, the CIFTS team has taken a parallel fourfold approach. Our central goal was to design and implement a light-weight, scalable infrastructure with a simple, standardized interface to allow communication of fault-related information through the system and facilitate coordinated responses.
This work led to the development of the Fault Tolerance Backplane (FTB) publish-subscribe API specification, together with a reference implementation and several experimental implementations on top of existing publish-subscribe tools. We enhanced the intrinsic fault tolerance capabilities of representative implementations of a variety of key HPC software subsystems and integrated them with the FTB. Targeted software subsystems included MPI communication libraries, checkpoint/restart libraries, resource managers and job schedulers, and system monitoring tools. Leveraging the aforementioned infrastructure, as well as developing and utilizing additional tools, we have examined issues associated with expanded, end-to-end fault response from both system and application viewpoints. From the standpoint of system operations, we have investigated log and root cause analysis, anomaly detection and fault prediction, and generalized notification mechanisms. Our applications work has included libraries for fault-tolerant linear algebra, application frameworks for coupled multiphysics applications, and external frameworks to support the monitoring and response for general applications. Our final goal was to engage the high-end computing community to increase awareness of tools and issues around coordinated end-to-end fault management.
Enhanced fault-tolerant quantum computing in d-level systems.

PubMed

Campbell, Earl T

2014-12-05

Error-correcting codes protect quantum information and form the basis of fault-tolerant quantum computing. Leading proposals for fault-tolerant quantum computation require codes with an exceedingly rare property, a transversal non-Clifford gate. Codes with the desired property are presented for d-level qudit systems with prime d. The codes use n=d-1 qudits and can detect up to ∼d/3 errors. We quantify the performance of these codes for one approach to quantum computation known as magic-state distillation. Unlike prior work, we find performance is always enhanced by increasing d.
Dust-Tolerant Intelligent Electrical Connection System

NASA Technical Reports Server (NTRS)

Lewis, Mark; Dokos, Adam; Perotti, Jose; Calle, Carlos; Mueller, Robert; Bastin, Gary; Carlson, Jeffrey; Townsend, Ivan, III; Immer, Chirstopher; Medelius, Pedro

2012-01-01

Faults in wiring systems are a serious concern for the aerospace and aeronautic (commercial, military, and civilian) industries. Circuit failures and vehicle accidents have occurred and have been attributed to faulty wiring created by open and/or short circuits. Often, such circuit failures occur due to vibration during vehicle launch or operation. Therefore, developing non-intrusive fault-tolerant techniques is necessary to detect circuit faults and automatically route signals through alternate recovery paths while the vehicle or lunar surface systems equipment is in operation. Electrical connector concepts combining dust mitigation strategies and cable diagnostic technologies have significant application for lunar and Martian surface systems, as well as for dusty terrestrial applications. The dust-tolerant intelligent electrical connection system has several novel concepts and unique features. It combines intelligent cable diagnostics (health monitoring) and automatic circuit routing capabilities into a dust-tolerant electrical umbilical. It retrofits a clamshell protective dust cover to an existing connector for reduced gravity operation, and features a universal connector housing with three styles of dust protection: inverted cap, rotating cap, and clamshell. It uses a self-healing membrane as a dust barrier for electrical connectors where required, while also combining lotus leaf technology for applications where a dust-resistant coating providing low surface tension is needed to mitigate Van der Waals forces, thereby disallowing dust particle adhesion to connector surfaces. It also permits using a ruggedized iris mechanism with an embedded electrodynamic dust shield as a dust barrier for electrical connectors where required.
Roads towards fault-tolerant universal quantum computation

NASA Astrophysics Data System (ADS)

Campbell, Earl T.; Terhal, Barbara M.; Vuillot, Christophe

2017-09-01

A practical quantum computer must not merely store information, but also process it. To prevent errors introduced by noise from multiplying and spreading, a fault-tolerant computational architecture is required. Current experiments are taking the first steps toward noise-resilient logical qubits. But to convert these quantum devices from memories to processors, it is necessary to specify how a universal set of gates is performed on them. The leading proposals for doing so, such as magic-state distillation and colour-code techniques, have high resource demands. Alternative schemes, such as those that use high-dimensional quantum codes in a modular architecture, have potential benefits, but need to be explored further.
Roads towards fault-tolerant universal quantum computation.

PubMed

Campbell, Earl T; Terhal, Barbara M; Vuillot, Christophe

2017-09-13

A practical quantum computer must not merely store information, but also process it. To prevent errors introduced by noise from multiplying and spreading, a fault-tolerant computational architecture is required. Current experiments are taking the first steps toward noise-resilient logical qubits. But to convert these quantum devices from memories to processors, it is necessary to specify how a universal set of gates is performed on them. The leading proposals for doing so, such as magic-state distillation and colour-code techniques, have high resource demands. Alternative schemes, such as those that use high-dimensional quantum codes in a modular architecture, have potential benefits, but need to be explored further.
Fault tolerant high-performance PACS network design and implementation

NASA Astrophysics Data System (ADS)

Chimiak, William J.; Boehme, Johannes M.

1998-07-01

The Wake Forest University School of Medicine and the Wake Forest University/Baptist Medical Center (WFUBMC) are implementing a second generation PACS. The first generation PACS provided helpful information about the functional and temporal requirements of the system. It highlighted the importance of image retrieval speed, system availability, RIS/HIS integration, the ability to rapidly view images on any PACS workstation, network bandwidth, equipment redundancy, and the ability for the system to evolve using standards-based components. This paper deals with the network design and implementation of the PACS. The physical layout of the hospital areas served by the PACS, the choice of network equipment and installation issues encountered are addressed. Efforts to optimize fault tolerance are discussed. The PACS network is a gigabit, mixed-media network based on LAN emulation over ATM (LANE) with a rapid migration from LANE to Multiple Protocols Over ATM (MPOA) planned. Two fault-tolerant backbone ATM switches serve to distribute network accesses with two load-balancing 622 megabit per second (Mbps) OC-12 interconnections. The switch was sized to be upgradable to provide a 2.54 Gbps OC-48 interconnection with an OC-12 interconnection as a load-balancing backup. Modalities connect with legacy network interface cards to a switched-ethernet device. This device has two 155 Mbps OC-3 load-balancing uplinks to each of the backbone ATM switches of the PACS. This provides a fault-tolerant logical connection to the modality servers which pass verified DICOM images to the PACS servers and proper PACS diagnostic workstations. Where fiber pulls were prohibitively expensive, edge ATM switches were installed with an OC-12 uplink to a backbone ATM switch. The PACS and data base servers are fault-tolerant, hot-swappable Sun Enterprise Servers with an OC-12 connection to a backbone ATM switch and a fast-ethernet connection to a back-up network. 
The workstations come with 10/100BASE-T autosense cards. A redundant switched-ethernet network will be installed to provide yet another degree of network fault-tolerance. The switched-ethernet devices are connected to each of the backbone ATM switches with two load-balancing OC-3 connections to provide fault-tolerant connectivity in the event of a primary network failure.
A fault-tolerant control architecture for unmanned aerial vehicles

NASA Astrophysics Data System (ADS)

Drozeski, Graham R.

Research has presented several approaches to achieve varying degrees of fault-tolerance in unmanned aircraft. Approaches in reconfigurable flight control are generally divided into two categories: those which incorporate multiple non-adaptive controllers and switch between them based on the output of a fault detection and identification element, and those that employ a single adaptive controller capable of compensating for a variety of fault modes. Regardless of the approach for reconfigurable flight control, certain fault modes dictate system restructuring in order to prevent a catastrophic failure. System restructuring enables active control of actuation not employed by the nominal system to recover controllability of the aircraft. After system restructuring, continued operation requires the generation of flight paths that adhere to an altered flight envelope. The control architecture developed in this research employs a multi-tiered hierarchy to allow unmanned aircraft to generate and track safe flight paths despite the occurrence of potentially catastrophic faults. The hierarchical architecture increases the level of autonomy of the system by integrating five functionalities with the baseline system: fault detection and identification, active system restructuring, reconfigurable flight control, reconfigurable path planning, and mission adaptation. Fault detection and identification algorithms continually monitor aircraft performance and issue fault declarations. When the severity of a fault exceeds the capability of the baseline flight controller, active system restructuring expands the controllability of the aircraft using unconventional control strategies not exploited by the baseline controller. Each of the reconfigurable flight controllers and the baseline controller employs a proven adaptive neural network control strategy. A reconfigurable path planner employs an adaptive model of the vehicle to re-shape the desired flight path. 
Generation of the revised flight path is posed as a linear program constrained by the response of the degraded system. Finally, a mission adaptation component estimates limitations on the closed-loop performance of the aircraft and adjusts the aircraft mission accordingly. A combination of simulation and flight test results using two unmanned helicopters validates the utility of the hierarchical architecture.
Evaluating and extending user-level fault tolerance in MPI applications

DOE PAGES

Laguna, Ignacio; Richards, David F.; Gamblin, Todd; ...

2016-01-11

The user-level failure mitigation (ULFM) interface has been proposed to provide fault-tolerant semantics in the Message Passing Interface (MPI). Previous work presented performance evaluations of ULFM; yet questions related to its programmability and applicability, especially to non-trivial, bulk synchronous applications, remain unanswered. In this article, we present our experiences on using ULFM in a case study with a large, highly scalable, bulk synchronous molecular dynamics application to shed light on the advantages and difficulties of this interface to program fault-tolerant MPI applications. We found that, although ULFM is suitable for master–worker applications, it provides few benefits for more common bulk synchronous MPI applications. Furthermore, to address these limitations, we introduce a new, simpler fault-tolerant interface for complex, bulk synchronous MPI programs with better applicability and support than ULFM for application-level recovery mechanisms, such as global rollback.
An Analysis of Failure Handling in Chameleon, A Framework for Supporting Cost-Effective Fault Tolerant Services

NASA Technical Reports Server (NTRS)

Haakensen, Erik Edward

1998-01-01

The desire for low-cost reliable computing is increasing. Most current fault tolerant computing solutions are not very flexible, i.e., they cannot adapt to reliability requirements of newly emerging applications in business, commerce, and manufacturing. It is important that users have a flexible, reliable platform to support both critical and noncritical applications. Chameleon, under development at the Center for Reliable and High-Performance Computing at the University of Illinois, is a software framework for supporting cost-effective adaptable networked fault tolerant service. This thesis details a simulation of fault injection, detection, and recovery in Chameleon. The simulation was written in C++ using the DEPEND simulation library. The results obtained from the simulation included the amount of overhead incurred by the fault detection and recovery mechanisms supported by Chameleon. In addition, information about fault scenarios from which Chameleon cannot recover was gained. The results of the simulation showed that both critical and noncritical applications can be executed in the Chameleon environment with a fairly small amount of overhead. No single point of failure from which Chameleon could not recover was found. Chameleon was also found to be capable of recovering from several multiple failure scenarios.
A Self-Stabilizing Byzantine-Fault-Tolerant Clock Synchronization Protocol

NASA Technical Reports Server (NTRS)

Malekpour, Mahyar R.

2009-01-01

This report presents a rapid Byzantine-fault-tolerant self-stabilizing clock synchronization protocol that is independent of application-specific requirements. It is focused on clock synchronization of a system in the presence of Byzantine faults after the cause of any transient faults has dissipated. A model of this protocol is mechanically verified using the Symbolic Model Verifier (SMV) [SMV] where the entire state space is examined and proven to self-stabilize in the presence of one arbitrary faulty node. Instances of the protocol are proven to tolerate bursts of transient failures and deterministically converge with a linear convergence time with respect to the synchronization period. This protocol does not rely on assumptions about the initial state of the system other than the presence of a sufficient number of good nodes. All timing measures of variables are based on the node's local clock, and no central clock or externally generated pulse is used. The Byzantine faulty behavior modeled here is a node with arbitrarily malicious behavior that is allowed to influence other nodes at every clock tick. The only constraint is that the interactions are restricted to defined interfaces.
Fault tolerance in an inner-outer solver: A GVR-enabled case study

DOE PAGES

Zhang, Ziming; Chien, Andrew A.; Teranishi, Keita

2015-04-18

Resilience is a major challenge for large-scale systems. It is particularly important for iterative linear solvers, since they take much of the time of many scientific applications. We show that single bit flip errors in the Flexible GMRES iterative linear solver can lead to high computational overhead or even failure to converge to the right answer. Informed by these results, we design and evaluate several strategies for fault tolerance in both inner and outer solvers appropriate across a range of error rates. We implement them, extending Trilinos’ solver library with the Global View Resilience (GVR) programming model, which provides multi-stream snapshots, multi-version data structures with portable and rich error checking/recovery. Lastly, experimental results validate correct execution with low performance overhead under varied error conditions.
Model-based design and experimental verification of a monitoring concept for an active-active electromechanical aileron actuation system

NASA Astrophysics Data System (ADS)

Arriola, David; Thielecke, Frank

2017-09-01

Electromechanical actuators have become a key technology for the onset of power-by-wire flight control systems in the next generation of commercial aircraft. The design of robust control and monitoring functions for these devices capable of mitigating the effects of safety-critical faults is essential in order to achieve the required level of fault tolerance. A primary flight control system comprising two electromechanical actuators nominally operating in active-active mode is considered. A set of five signal-based monitoring functions is designed using a detailed model of the system under consideration which includes non-linear parasitic effects, measurement and data acquisition effects, and actuator faults. Robust detection thresholds are determined based on the analysis of parametric and input uncertainties. The designed monitoring functions are verified experimentally and by simulation through the injection of faults in the validated model and in a test-rig suited to the actuation system under consideration, respectively. They guarantee a robust and efficient fault detection and isolation with a low risk of false alarms, additionally enabling the correct reconfiguration of the system for an enhanced operational availability. In 98% of the performed experiments and simulations, the correct faults were detected and confirmed within the time objectives set.
Verification of a Byzantine-Fault-Tolerant Self-stabilizing Protocol for Clock Synchronization

NASA Technical Reports Server (NTRS)

Malekpour, Mahyar R.

2008-01-01

This paper presents the mechanical verification of a simplified model of a rapid Byzantine-fault-tolerant self-stabilizing protocol for distributed clock synchronization systems. This protocol does not rely on any assumptions about the initial state of the system except for the presence of sufficient good nodes, thus making the weakest possible assumptions and producing the strongest results. This protocol tolerates bursts of transient failures, and deterministically converges within a time bound that is a linear function of the self-stabilization period. A simplified model of the protocol is verified using the Symbolic Model Verifier (SMV). The system under study consists of 4 nodes, where at most one of the nodes is assumed to be Byzantine faulty. The model checking effort is focused on verifying correctness of the simplified model of the protocol in the presence of a permanent Byzantine fault as well as confirmation of claims of determinism and linear convergence with respect to the self-stabilization period. Although model checking results of the simplified model of the protocol confirm the theoretical predictions, these results do not necessarily confirm that the protocol solves the general case of this problem. Modeling challenges of the protocol and the system are addressed. A number of abstractions are utilized in order to reduce the state space.
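The four-node, one-fault configuration described above is consistent with the classical bound for Byzantine agreement without message signatures, n >= 3f + 1. The following is a minimal sketch of that arithmetic only; the function names are illustrative and are not taken from the paper, and the protocol itself is not modeled.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f with n >= 3f + 1 (classical bound, assumed here)."""
    if n < 1:
        raise ValueError("need at least one node")
    return (n - 1) // 3


def min_nodes(f: int) -> int:
    """Smallest network size that tolerates f Byzantine nodes under the same bound."""
    if f < 0:
        raise ValueError("fault count cannot be negative")
    return 3 * f + 1


if __name__ == "__main__":
    for n in (3, 4, 7):
        print(n, "nodes tolerate", max_byzantine_faults(n), "Byzantine fault(s)")
```

Under this bound a three-node system tolerates no Byzantine node, which is why four nodes is the smallest interesting case for model checking.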

Quantum Error Correction

NASA Astrophysics Data System (ADS)

Lidar, Daniel A.; Brun, Todd A.

2013-09-01

Prologue; Preface; Part I. Background: 1. Introduction to decoherence and noise in open quantum systems Daniel Lidar and Todd Brun; 2. Introduction to quantum error correction Dave Bacon; 3. Introduction to decoherence-free subspaces and noiseless subsystems Daniel Lidar; 4. Introduction to quantum dynamical decoupling Lorenza Viola; 5. Introduction to quantum fault tolerance Panos Aliferis; Part II. Generalized Approaches to Quantum Error Correction: 6. Operator quantum error correction David Kribs and David Poulin; 7. Entanglement-assisted quantum error-correcting codes Todd Brun and Min-Hsiu Hsieh; 8. Continuous-time quantum error correction Ognyan Oreshkov; Part III. Advanced Quantum Codes: 9. Quantum convolutional codes Mark Wilde; 10. Non-additive quantum codes Markus Grassl and Martin Rötteler; 11. Iterative quantum coding systems David Poulin; 12. Algebraic quantum coding theory Andreas Klappenecker; 13. Optimization-based quantum error correction Andrew Fletcher; Part IV. Advanced Dynamical Decoupling: 14. High order dynamical decoupling Zhen-Yu Wang and Ren-Bao Liu; 15. Combinatorial approaches to dynamical decoupling Martin Rötteler and Pawel Wocjan; Part V. Alternative Quantum Computation Approaches: 16. Holonomic quantum computation Paolo Zanardi; 17. Fault tolerance for holonomic quantum computation Ognyan Oreshkov, Todd Brun and Daniel Lidar; 18. Fault tolerant measurement-based quantum computing Debbie Leung; Part VI. Topological Methods: 19. Topological codes Héctor Bombín; 20. Fault tolerant topological cluster state quantum computing Austin Fowler and Kovid Goyal; Part VII. Applications and Implementations: 21. Experimental quantum error correction Dave Bacon; 22. Experimental dynamical decoupling Lorenza Viola; 23. Architectures Jacob Taylor; 24. Error correction in quantum communication Mark Wilde; Part VIII. Critical Evaluation of Fault Tolerance: 25. 
Hamiltonian methods in QEC and fault tolerance Eduardo Novais, Eduardo Mucciolo and Harold Baranger; 26. Critique of fault-tolerant quantum information processing Robert Alicki; References; Index.
Cascading Policies Provide Fault Tolerance for Pervasive Clinical Communications.

PubMed

Williams, Rose; Jalan, Srikant; Stern, Edie; Lussier, Yves A

2005-03-21

We implemented an end-to-end notification system that pushed urgent clinical laboratory results to Blackberry 7510 devices over the Nextel cellular network. We designed our system to use user roles and notification policies to abstract and execute clinical notification procedures. We anticipated some problems with dropped and non-delivered messages when the device was out-of-network; however, we did not expect the same problems in other situations like device reconnection to the network. We addressed these problems by creating cascading "fault tolerance" policies to drive notification escalation when messages timed out or delivery failed. This paper describes our experience in providing an adaptable, fault tolerant pervasive notification system for delivering secure, critical, time-sensitive patient laboratory results.
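The escalation idea can be illustrated with a short sketch: try each level of a policy in order and move on when a message times out or delivery fails. This is an assumption-laden illustration, not the system described in the paper; the `Step` type, the `deliver` callback and the role names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    recipient: str   # hypothetical role or device name
    timeout_s: int   # how long to wait for an acknowledgement


def escalate(steps: List[Step], deliver: Callable[[str, int], bool]) -> Optional[str]:
    """Try each step in order; return the first recipient that acknowledges.

    `deliver(recipient, timeout_s)` is an assumed callback that returns True
    only if the message was delivered and acknowledged within the timeout.
    """
    for step in steps:
        if deliver(step.recipient, step.timeout_s):
            return step.recipient
    return None  # every level failed; the caller decides what happens next


if __name__ == "__main__":
    policy = [
        Step("ordering_physician", 300),
        Step("covering_resident", 300),
        Step("charge_nurse", 600),
    ]
    unreachable = {"ordering_physician"}
    print(escalate(policy, lambda who, _timeout: who not in unreachable))
```

Keeping the policy as plain data separates the escalation order from the delivery mechanism, so the same loop works whatever transport sits behind `deliver`.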
Design of LPV fault-tolerant controller for pitch system of wind turbine

NASA Astrophysics Data System (ADS)

Wu, Dinghui; Zhang, Xiaolin

2017-07-01

To address failures of wind turbine pitch-angle sensors, the traditional wind turbine linear parameter varying (LPV) model is transformed into a double-layer convex polyhedron LPV model. On the basis of this model, for the case in which several sensors fail and the details of the failures are difficult to obtain, each sub-controller is designed using a distributed approach and the gain scheduling method. The final controller is obtained as a convex combination of all the sub-controllers. The design method corrects the errors of the linear model, improves the linearity of the system, and solves the problem of multiple pitch angle faults to ensure stable operation of the wind turbine.
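A convex combination of sub-controllers can be sketched as a weighted sum of their gains with non-negative weights that sum to one. The example below is a minimal illustration under that assumption; the gain vectors and scheduling weights are made-up values, and how the paper computes its weights is not stated in the abstract.

```python
from typing import List, Sequence


def blend_gains(weights: Sequence[float], gains: Sequence[Sequence[float]]) -> List[float]:
    """Return sum_i weights[i] * gains[i] after checking that the weights are convex."""
    if not gains or len(weights) != len(gains):
        raise ValueError("need one weight per sub-controller gain vector")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    width = len(gains[0])
    if any(len(g) != width for g in gains):
        raise ValueError("all gain vectors must have the same length")
    # Blend element by element across the sub-controllers.
    return [sum(w * g[j] for w, g in zip(weights, gains)) for j in range(width)]


if __name__ == "__main__":
    # Two hypothetical vertex controllers and scheduling weights.
    print(blend_gains([0.25, 0.75], [[4.0, 0.0], [0.0, 4.0]]))
```

Because the weights are convex, the blended gain always lies inside the polytope spanned by the sub-controller gains.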
Joint University Program for Air Transportation Research, 1985

NASA Technical Reports Server (NTRS)

Morrell, Frederick R. (Compiler)

1987-01-01

Air transportation research being carried on at the Massachusetts Institute of Technology, Princeton University, and Ohio University is discussed. Global Positioning System experiments, Loran-C monitoring, inertial navigation, the optimization of aircraft trajectories through severe microbursts, fault tolerant flight control systems, and expert systems for air traffic control are among the topics covered.
A Log-Scaling Fault Tolerant Agreement Algorithm for a Fault Tolerant MPI

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hursey, Joshua J; Naughton, III, Thomas J; Vallee, Geoffroy R

The lack of fault tolerance is becoming a limiting factor for application scalability in HPC systems. The MPI does not provide standardized fault tolerance interfaces and semantics. The MPI Forum's Fault Tolerance Working Group is proposing a collective fault tolerant agreement algorithm for the next MPI standard. Such algorithms play a central role in many fault tolerant applications. This paper combines a log-scaling two-phase commit agreement algorithm with a reduction operation to provide the necessary functionality for the new collective without any additional messages. Error handling mechanisms are described that preserve the fault tolerance properties while maintaining overall scalability.
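The combination of a two-phase commit with a reduction can be pictured on a binary tree of ranks: votes are folded toward the root in the first phase, and the root's decision travels back down in the second. The following is a fault-free simulation of that idea only; it is a sketch under stated assumptions, not the paper's algorithm, and it omits the error handling the paper describes. The tree layout (rank r has parent (r - 1) // 2) is an assumption.

```python
from typing import List, Tuple


def tree_agreement(votes: List[bool]) -> Tuple[bool, int]:
    """Simulate a fault-free run and return (decision, tree_depth).

    Rank 0 is the root and rank r's parent is (r - 1) // 2. The depth is the
    number of levels a message crosses in each of the two phases.
    """
    n = len(votes)
    if n == 0:
        raise ValueError("need at least one participant")
    partial = list(votes)
    # Phase 1 (vote + reduction): walk ranks from the leaves upward so every
    # child has absorbed its own subtree before it is folded into its parent.
    for rank in range(n - 1, 0, -1):
        parent = (rank - 1) // 2
        partial[parent] = partial[parent] and partial[rank]
    decision = partial[0]
    # Phase 2 (commit): the root's decision is sent back down the same tree,
    # so no separate reduction messages are needed.
    depth = n.bit_length() - 1  # floor(log2(n)) for n >= 1
    return decision, depth


if __name__ == "__main__":
    print(tree_agreement([True] * 8))
    print(tree_agreement([True, True, False, True]))
```

The depth grows as floor(log2(n)), which is the log-scaling behavior the title refers to, and a single dissenting vote anywhere in the tree forces the root to decide False.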
Fault-tolerant Greenberger-Horne-Zeilinger paradox based on non-Abelian anyons.

PubMed

Deng, Dong-Ling; Wu, Chunfeng; Chen, Jing-Ling; Oh, C H

2010-08-06

We propose a scheme to test the Greenberger-Horne-Zeilinger paradox based on braidings of non-Abelian anyons, which are exotic quasiparticle excitations of topological states of matter. Because topologically ordered states are robust against local perturbations, this scheme is in some sense "fault-tolerant" and might close the detection inefficiency loophole problem in previous experimental tests of the Greenberger-Horne-Zeilinger paradox. In turn, the construction of the Greenberger-Horne-Zeilinger paradox reveals the nonlocal property of non-Abelian anyons. Our results indicate that the non-Abelian fractional statistics is a pure quantum effect and cannot be described by local realistic theories. Finally, we present a possible experimental implementation of the scheme based on the anyonic interferometry technologies.
Towards scalable Byzantine fault-tolerant replication

NASA Astrophysics Data System (ADS)

Zbierski, Maciej

2017-08-01

Byzantine fault-tolerant (BFT) replication is a powerful technique, enabling distributed systems to remain available and correct even in the presence of arbitrary faults. Unfortunately, existing BFT replication protocols are mostly load-unscalable, i.e. they fail to respond with adequate performance increase whenever new computational resources are introduced into the system. This article proposes a universal architecture facilitating the creation of load-scalable distributed services based on BFT replication. The suggested approach exploits parallel request processing to fully utilize the available resources, and uses a load balancer module to dynamically adapt to the properties of the observed client workload. The article additionally provides a discussion on selected deployment scenarios, and explains how the proposed architecture could be used to increase the dependability of contemporary large-scale distributed systems.
Autonomous control system reconfiguration for spacecraft with non-redundant actuators

NASA Astrophysics Data System (ADS)

Grossman, Walter

1995-05-01

The Small Satellite Technology Initiative (SSTI) 'CLARK' spacecraft is required to be single-failure tolerant, i.e., no failure of any single component or subsystem shall result in complete mission loss. Fault tolerance is usually achieved by implementing redundant subsystems. Fault tolerant systems are therefore heavier and cost more to build and launch than non-redundant, non-fault-tolerant spacecraft. The SSTI CLARK satellite Attitude Determination and Control System (ADACS) achieves single-fault tolerance without redundancy. The attitude determination system uses a Kalman Filter which is inherently robust to loss of any single attitude sensor. The attitude control system uses three orthogonal reaction wheels for attitude control and three magnetic dipoles for momentum control. The nominal six-actuator control system functions by projecting the attitude correction torque onto the reaction wheels while a slower momentum management outer loop removes the excess momentum in the direction normal to the local B field. The actuators are not redundant so the nominal control law cannot be implemented in the event of a loss of a single actuator (dipole or reaction wheel). The spacecraft dynamical state (attitude, angular rate, and momentum) is controllable from any five-element subset of the six actuators. With loss of an actuator the instantaneous control authority may not span $\mathbb{R}^3$, but the controllability Gramian $\int_0^t \Phi(t,\tau)\,B(\tau)\,B'(\tau)\,\Phi'(t,\tau)\,d\tau$ retains full rank. Upon detection of an actuator failure the control torque is decomposed onto the remaining active axes. The attitude control torque is effected and the over-orbit momentum is controlled. The resulting control system performance approaches that of the nominal system.
Real time health monitoring and control system methodology for flexible space structures

NASA Astrophysics Data System (ADS)

Jayaram, Sanjay

This dissertation is concerned with the Near Real-time Autonomous Health Monitoring of Flexible Space Structures. The dynamics of multi-body flexible systems is uncertain due to factors such as high non-linearity, consideration of higher modal frequencies, high dimensionality, multiple inputs and outputs, operational constraints, as well as unexpected failures of sensors and/or actuators. Hence a systematic framework of developing a high fidelity, dynamic model of a flexible structural system needs to be understood. The fault detection mechanism that will be an integrated part of an autonomous health monitoring system comprises the detection of abnormalities in the sensors and/or actuators and correcting these detected faults (if possible). Applying the robust control law and the robust measures that are capable of detecting and recovering/replacing the actuators rectifies the actuator faults. The fault tolerant concept applied to the sensors will be in the form of an Extended Kalman Filter (EKF). The EKF is going to weigh the information coming from multiple sensors (redundant sensors used to measure the same information) and automatically identify the faulty sensors and weigh the best estimate from the remaining sensors. The mechanization is comprised of instrumenting flexible deployable panels (solar array) with multiple angular position and rate sensors connected to the data acquisition system. The sensors will give position and rate information of the solar panel in all three axes (i.e. roll, pitch and yaw). The position data corresponds to the steady state response and the rate data will give better insight on the transient response of the system. This is a critical factor for real-time autonomous health monitoring. MATLAB (and/or C++) software will be used for high fidelity modeling and fault tolerant mechanism.
A Self-Stabilizing Hybrid Fault-Tolerant Synchronization Protocol

NASA Technical Reports Server (NTRS)

Malekpour, Mahyar R.

2015-01-01

This paper presents a strategy for solving the Byzantine generals problem for self-stabilizing a fully connected network from an arbitrary state and in the presence of any number of faults with various severities including any number of arbitrary (Byzantine) faulty nodes. The strategy consists of two parts: first, converting Byzantine faults into symmetric faults, and second, using a proven symmetric-fault tolerant algorithm to solve the general case of the problem. A protocol (algorithm) is also presented that tolerates symmetric faults, provided that there are more good nodes than faulty ones. The solution applies to realizable systems, while allowing for differences in the network elements, provided that the number of arbitrary faults is not more than a third of the network size. The only constraint on the behavior of a node is that the interactions with other nodes are restricted to defined links and interfaces. The solution does not rely on assumptions about the initial state of the system and no central clock nor centrally generated signal, pulse, or message is used. Nodes are anonymous, i.e., they do not have unique identities. A mechanical verification of a proposed protocol is also presented. A bounded model of the protocol is verified using the Symbolic Model Verifier (SMV). The model checking effort is focused on verifying correctness of the bounded model of the protocol as well as confirming claims of determinism and linear convergence with respect to the self-stabilization period.
Fault-Tolerant Heat Exchanger

NASA Technical Reports Server (NTRS)

Izenson, Michael G.; Crowley, Christopher J.

2005-01-01

A compact, lightweight heat exchanger has been designed to be fault-tolerant in the sense that a single-point leak would not cause mixing of heat-transfer fluids. This particular heat exchanger is intended to be part of the temperature-regulation system for habitable modules of the International Space Station and to function with water and ammonia as the heat-transfer fluids. The basic fault-tolerant design is adaptable to other heat-transfer fluids and heat exchangers for applications in which mixing of heat-transfer fluids would pose toxic, explosive, or other hazards: Examples could include fuel/air heat exchangers for thermal management on aircraft, process heat exchangers in the cryogenic industry, and heat exchangers used in chemical processing. The reason this heat exchanger can tolerate a single-point leak is that the heat-transfer fluids are everywhere separated by a vented volume and at least two seals. The combination of fault tolerance, compactness, and light weight is implemented in a unique heat-exchanger core configuration: Each fluid passage is entirely surrounded by a vented region bridged by solid structures through which heat is conducted between the fluids. Precise, proprietary fabrication techniques make it possible to manufacture the vented regions and heat-conducting structures with very small dimensions to obtain a very large coefficient of heat transfer between the two fluids. A large heat-transfer coefficient favors compact design by making it possible to use a relatively small core for a given heat-transfer rate. Calculations and experiments have shown that in most respects, the fault-tolerant heat exchanger can be expected to equal or exceed the performance of the non-fault-tolerant heat exchanger that it is intended to supplant (see table). The only significant disadvantages are a slight weight penalty and a small decrease in the mass-specific heat transfer.
Mixed linear-non-linear inversion of crustal deformation data: Bayesian inference of model, weighting and regularization parameters

NASA Astrophysics Data System (ADS)

Fukuda, Jun'ichi; Johnson, Kaj M.

2010-06-01

We present a unified theoretical framework and solution method for probabilistic, Bayesian inversions of crustal deformation data. The inversions involve multiple data sets with unknown relative weights, model parameters that are related linearly or non-linearly through theoretic models to observations, prior information on model parameters and regularization priors to stabilize underdetermined problems. To efficiently handle non-linear inversions in which some of the model parameters are linearly related to the observations, this method combines both analytical least-squares solutions and a Monte Carlo sampling technique. In this method, model parameters that are linearly and non-linearly related to observations, relative weights of multiple data sets and relative weights of prior information and regularization priors are determined in a unified Bayesian framework. In this paper, we define the mixed linear-non-linear inverse problem, outline the theoretical basis for the method, provide a step-by-step algorithm for the inversion, validate the inversion method using synthetic data and apply the method to two real data sets. We apply the method to inversions of multiple geodetic data sets with unknown relative data weights for interseismic fault slip and locking depth. We also apply the method to the problem of estimating the spatial distribution of coseismic slip on faults with unknown fault geometry, relative data weights and smoothing regularization weight.
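The mixed linear-non-linear idea can be illustrated with a minimal sketch. It assumes a simple 2-D screw-dislocation interseismic model, v(x) = (s/π) arctan(x/D), chosen here only for illustration and not necessarily the model used by the authors. The slip rate s enters linearly and the locking depth D non-linearly, so for each trial D the best s has a closed-form least-squares solution. A plain grid search stands in for the Monte Carlo sampling and Bayesian weighting described in the abstract, and all function names are hypothetical.

```python
import math

def forward(x, slip, depth):
    """Surface velocity of a buried screw dislocation (illustrative model)."""
    return slip / math.pi * math.atan(x / depth)

def best_slip(xs, vs, depth):
    """Closed-form least-squares slip rate for a fixed locking depth."""
    g = [math.atan(x / depth) / math.pi for x in xs]
    return sum(v * gi for v, gi in zip(vs, g)) / sum(gi * gi for gi in g)

def invert(xs, vs, depths):
    """Search the non-linear parameter; solve the linear one analytically."""
    best = None
    for depth in depths:
        slip = best_slip(xs, vs, depth)
        misfit = sum((v - forward(x, slip, depth)) ** 2 for x, v in zip(xs, vs))
        if best is None or misfit < best[0]:
            best = (misfit, slip, depth)
    return best[1], best[2]

# Synthetic, noise-free data (assumed values: 30 mm/yr slip, 15 km depth).
xs = [-50.0, -20.0, -5.0, 5.0, 20.0, 50.0]   # distance from fault, km
vs = [forward(x, 30.0, 15.0) for x in xs]
slip, depth = invert(xs, vs, [5.0 + i for i in range(26)])  # trial depths 5..30 km
```

Solving the linear parameter analytically shrinks the space that has to be searched or sampled to the non-linear parameters alone, which is the efficiency gain the abstract refers to.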
Fault Tolerance in ZigBee Wireless Sensor Networks

NASA Technical Reports Server (NTRS)

Alena, Richard; Gilstrap, Ray; Baldwin, Jarren; Stone, Thom; Wilson, Pete

2011-01-01

Wireless sensor networks (WSN) based on the IEEE 802.15.4 Personal Area Network standard are finding increasing use in the home automation and emerging smart energy markets. The network and application layers, based on the ZigBee 2007 PRO Standard, provide a convenient framework for component-based software that supports customer solutions from multiple vendors. This technology is supported by System-on-a-Chip solutions, resulting in extremely small and low-power nodes. The Wireless Connections in Space Project addresses the aerospace flight domain for both flight-critical and non-critical avionics. WSNs provide the inherent fault tolerance required for aerospace applications utilizing such technology. The team from Ames Research Center has developed techniques for assessing the fault tolerance of ZigBee WSNs challenged by radio frequency (RF) interference or WSN node failure.
Fault-tolerant software - Experiment with the SIFT operating system. [Software Implemented Fault Tolerance computer]

NASA Technical Reports Server (NTRS)

Brunelle, J. E.; Eckhardt, D. E., Jr.

1985-01-01

Results are presented of an experiment conducted in the NASA Avionics Integrated Research Laboratory (AIRLAB) to investigate the implementation of fault-tolerant software techniques on fault-tolerant computer architectures, in particular the Software Implemented Fault Tolerance (SIFT) computer. The N-version programming and recovery block techniques were implemented on a portion of the SIFT operating system. The results indicate that, to effectively implement fault-tolerant software design techniques, system requirements will be impacted and suggest that retrofitting fault-tolerant software on existing designs will be inefficient and may require system modification.
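The two software fault-tolerance techniques named in this abstract can be sketched generically as follows. This is a textbook-style illustration, not the SIFT implementation: N-version programming runs independently written versions and takes a majority vote, while a recovery block tries alternates in order until one passes an acceptance test. All names and the toy functions are hypothetical.

```python
import math
from collections import Counter

def n_version(versions, x):
    """Run independently written versions and return the majority result."""
    results = []
    for version in versions:
        try:
            results.append(version(x))
        except Exception:
            pass  # a crashed version simply casts no vote
    if not results:
        raise RuntimeError("all versions failed")
    value, votes = Counter(results).most_common(1)[0]
    if votes * 2 <= len(versions):
        raise RuntimeError("no majority among versions")
    return value

def recovery_block(alternates, acceptance_test, x):
    """Try alternates in order until one passes the acceptance test."""
    for alternate in alternates:
        try:
            result = alternate(x)
        except Exception:
            continue
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates rejected")

# Toy usage: the third version and the primary alternate are deliberately faulty.
versions = [lambda x: x * x, lambda x: x ** 2, lambda x: x * x + 1]
alternates = [lambda x: x // 2, lambda x: math.isqrt(x)]
accept = lambda x, r: r * r == x

squared = n_version(versions, 4)                # majority masks the faulty version
root = recovery_block(alternates, accept, 16)   # falls back to the second alternate
```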
Model Checking a Byzantine-Fault-Tolerant Self-Stabilizing Protocol for Distributed Clock Synchronization Systems

NASA Technical Reports Server (NTRS)

Malekpour, Mahyar R.

2007-01-01

This report presents the mechanical verification of a simplified model of a rapid Byzantine-fault-tolerant self-stabilizing protocol for distributed clock synchronization systems. This protocol does not rely on any assumptions about the initial state of the system. This protocol tolerates bursts of transient failures, and deterministically converges within a time bound that is a linear function of the self-stabilization period. A simplified model of the protocol is verified using the Symbolic Model Verifier (SMV) [SMV]. The system under study consists of 4 nodes, where at most one of the nodes is assumed to be Byzantine faulty. The model checking effort is focused on verifying correctness of the simplified model of the protocol in the presence of a permanent Byzantine fault as well as confirmation of claims of determinism and linear convergence with respect to the self-stabilization period. Although model checking results of the simplified model of the protocol confirm the theoretical predictions, these results do not necessarily confirm that the protocol solves the general case of this problem. Modeling challenges of the protocol and the system are addressed. A number of abstractions are utilized in order to reduce the state space. Also, additional innovative state space reduction techniques are introduced that can be used in future verification efforts applied to this and other protocols.
Computer Sciences and Data Systems, volume 1

NASA Technical Reports Server (NTRS)

1987-01-01

Topics addressed include: software engineering; university grants; institutes; concurrent processing; sparse distributed memory; distributed operating systems; intelligent data management processes; expert system for image analysis; fault tolerant software; and architecture research.
Universal non-adiabatic geometric manipulation of pseudo-spin charge qubits

NASA Astrophysics Data System (ADS)

Azimi Mousolou, Vahid

2017-01-01

Reliable quantum information processing requires high-fidelity universal manipulation of quantum systems within the characteristic coherence times. Non-adiabatic holonomic quantum computation offers a promising approach to implement fast, universal, and robust quantum logic gates particularly useful in nano-fabricated solid-state architectures, which typically have short coherence times. Here, we propose an experimentally feasible scheme to realize high-speed universal geometric quantum gates in nano-engineered pseudo-spin charge qubits. We use a system of three coupled quantum dots containing a single electron, where two computational states of a double quantum dot charge qubit interact through an intermediate quantum dot. The additional degree of freedom introduced into the qubit makes it possible to create a geometric model system, which allows robust and efficient single-qubit rotations through careful control of the inter-dot tunneling parameters. We demonstrate that a capacitive coupling between two charge qubits permits a family of non-adiabatic holonomic controlled two-qubit entangling gates, and thus provides a promising procedure to maintain entanglement in charge qubits and a pathway toward fault-tolerant universal quantum computation. We estimate the feasibility of the proposed structure by analyzing the gate fidelities to some extent.
Trends in non-stationary signal processing techniques applied to vibration analysis of wind turbine drive train - A contemporary survey

NASA Astrophysics Data System (ADS)

Uma Maheswari, R.; Umamaheswari, R.

2017-02-01

Condition Monitoring System (CMS) substantiates potential economic benefits and enables prognostic maintenance in wind turbine-generator failure prevention. Vibration Monitoring and Analysis is a powerful tool in drive train CMS, which enables the early detection of impending failure/damage. In variable speed drives such as wind turbine-generator drive trains, the acquired vibration signal is non-stationary and non-linear. Traditional stationary signal processing techniques are inefficient at diagnosing machine faults under time-varying conditions. The current research trend in CMS for drive trains focuses on developing/improving non-linear, non-stationary feature extraction and fault classification algorithms to improve fault detection/prediction sensitivity and selectivity, thereby reducing the misdetection and false alarm rates. Stationary signal processing algorithms employed in vibration analysis have been reviewed extensively in the literature. In this paper, an attempt is made to review the recent research advances in non-linear, non-stationary signal processing algorithms particularly suited for variable speed wind turbines.
FTAPE: A fault injection tool to measure fault tolerance

NASA Technical Reports Server (NTRS)

Tsai, Timothy K.; Iyer, Ravishankar K.

1995-01-01

The paper introduces FTAPE (Fault Tolerance And Performance Evaluator), a tool that can be used to compare fault-tolerant computers. The tool combines system-wide fault injection with a controllable workload. A workload generator is used to create high stress conditions for the machine. Faults are injected based on this workload activity in order to ensure a high level of fault propagation. The errors/fault ratio and performance degradation are presented as measures of fault tolerance.
Design of reliable universal QCA logic in the presence of cell deposition defect

NASA Astrophysics Data System (ADS)

Sen, Bibhash; Mukherjee, Rijoy; Mohit, Kumar; Sikdar, Biplab K.

2017-08-01

Quantum-dot Cellular Automata (QCA) has emerged as a promising alternative to the currently prevailing techniques of very large scale integration. QCA can provide low-power nanocircuits with high device density. Despite its wide acceptance, QCA faces the challenge of susceptibility to a high error rate. The work in this article aims at the design of a reliable universal logic gate (r-ULG) in QCA (r-ULG with a single clock zone and r-ULG-II with multiple clock zones). The design includes a hybrid orientation of cells that realises majority and minority functions and high fault tolerance simultaneously. The defective behaviour of r-ULGs under different kinds of cell deposition defects is characterised. The investigation indicates that the proposed r-ULG provides a fault tolerance of 75% under a single clock zone and a fault tolerance of 100% under dual clock zones. The experimental results affirm that r-ULGs successfully implement different logic functions under cell deposition defects. The high-level logic around the multiplexer is synthesised, which helps to extend the design capability to higher-level circuit synthesis.

A Fault-tolerant RISC Microprocessor for Spacecraft Applications

NASA Technical Reports Server (NTRS)

Timoc, Constantin; Benz, Harry

1990-01-01

Viewgraphs on a fault-tolerant RISC microprocessor for spacecraft applications are presented. Topics covered include: reduced instruction set computer; fault tolerant registers; fault tolerant ALU; and double rail CMOS logic.
Experimental fault-tolerant universal quantum gates with solid-state spins under ambient conditions

PubMed Central

Rong, Xing; Geng, Jianpei; Shi, Fazhan; Liu, Ying; Xu, Kebiao; Ma, Wenchao; Kong, Fei; Jiang, Zhen; Wu, Yang; Du, Jiangfeng

2015-01-01

Quantum computation provides great speedup over its classical counterpart for certain problems. One of the key challenges for quantum computation is to realize precise control of the quantum system in the presence of noise. Control of the spin-qubits in solids with the accuracy required by fault-tolerant quantum computation under ambient conditions remains elusive. Here, we quantitatively characterize the sources of noise during quantum gate operation and demonstrate strategies to suppress their effect. A universal set of logic gates in a nitrogen-vacancy centre in diamond is reported with an average single-qubit gate fidelity of 0.999952 and two-qubit gate fidelity of 0.992. These high control fidelities have been achieved at room temperature in naturally abundant 13C diamond via composite pulses and an optimized control method. PMID:26602456
Defense Small Business Innovation Research Program (SBIR). Volume 2. Navy Projects, Abstracts of Phase 1 Awards from FY 1989 SBIR Solicitation

DTIC Science & Technology

1990-04-01

DECISION AIDS HAVE CREATED A VAST NEW POTENTIAL FOR SUPPORT OF STRATEGIC AND TACTICAL OPERATIONS. THE NON-MONOTONIC PROBABILIST (NMP), DEVELOPED BY...QUALITY OF THE NEW DESIGN WILL BE EVALUATED BY CREATING A VIDEO TAPE USING A VIDEO ANIMATION SYSTEM, AND A SOFTWARE SIMULATION OF THE NEW DESIGN. THE...FAULT TOLERANT, SECURE SHIPBOARD COMMUNICATIONS. THE LAN WILL UTILIZE PHOENIX DIGITAL’S FAULT TOLERANT, " SELF - HEALING " SMALL BUSINESS INNOVATION RESEARCH
Braid read-only memory

NASA Technical Reports Server (NTRS)

Mckenna, J. F.

1973-01-01

Transformer-type memory is fault-tolerant array of independent read-only memory units. Information pattern in each unit is written by weaving wires through array of linear (nonswitching) transformers. Presence or absence of a bit is determined by whether a given wire threads or bypasses given transformer.
Analysis of a hardware and software fault tolerant processor for critical applications

NASA Technical Reports Server (NTRS)

Dugan, Joanne B.

1993-01-01

Computer systems for critical applications must be designed to tolerate software faults as well as hardware faults. A unified approach to tolerating hardware and software faults is characterized by classifying faults in terms of duration (transient or permanent) rather than source (hardware or software). Errors arising from transient faults can be handled through masking or voting, but errors arising from permanent faults require system reconfiguration to bypass the failed component. Most errors which are caused by software faults can be considered transient, in that they are input-dependent. Software faults are triggered by a particular set of inputs. Quantitative dependability analysis of systems which exhibit a unified approach to fault tolerance can be performed by a hierarchical combination of fault tree and Markov models. A methodology for analyzing hardware and software fault tolerant systems is applied to the analysis of a hypothetical system, loosely based on the Fault Tolerant Parallel Processor. The models consider both transient and permanent faults, hardware and software faults, independent and related software faults, automatic recovery, and reconfiguration.
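As a rough illustration of the Markov-model side of such an analysis, the sketch below integrates a three-state chain for a triplex system with majority voting and permanent faults only, then checks the result against the standard closed form 3R² − 2R³ with R = exp(−λt). The model, the failure rate and the function names are assumptions made for this example. It is not the hierarchical fault-tree/Markov model of the paper and ignores transient faults, recovery and reconfiguration.

```python
import math

def tmr_reliability_markov(lam, t, steps=10_000):
    """Euler-integrate a 3-state Markov chain: 3 good -> 2 good -> failed."""
    dt = t / steps
    p3, p2 = 1.0, 0.0          # start with all three units working
    for _ in range(steps):
        d3 = -3.0 * lam * p3                     # any of 3 units may fail
        d2 = 3.0 * lam * p3 - 2.0 * lam * p2     # then either of the remaining 2
        p3 += d3 * dt
        p2 += d2 * dt
    return p3 + p2             # the voter still has a majority in both states

def tmr_reliability_closed_form(lam, t):
    """Standard result for 2-of-3 redundancy with independent units."""
    r = math.exp(-lam * t)
    return 3.0 * r**2 - 2.0 * r**3

lam, mission = 1e-3, 100.0     # assumed failure rate (per hour) and mission time
numeric = tmr_reliability_markov(lam, mission)
exact = tmr_reliability_closed_form(lam, mission)
simplex = math.exp(-lam * mission)   # single unit, for comparison
```

For these assumed values the triplex reliability is about 0.9746 against about 0.9048 for a single unit.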
Fault-tolerant control of large space structures using the stable factorization approach

NASA Technical Reports Server (NTRS)

Razavi, H. C.; Mehra, R. K.; Vidyasagar, M.

1986-01-01

Large space structures are characterized by the following features: they are in general infinite-dimensional systems, and have large numbers of undamped or lightly damped poles. Any attempt to apply linear control theory to large space structures must therefore take into account these features. Phase I consisted of an attempt to apply the recently developed Stable Factorization (SF) design philosophy to problems of large space structures, with particular attention to the aspects of robustness and fault tolerance. The final report on the Phase I effort consists of four sections, each devoted to one task. The first three sections report theoretical results, while the last consists of a design example. Significant results were obtained in all four tasks of the project. More specifically, an innovative approach to order reduction was obtained, stabilizing controller structures for plants with an infinite number of unstable poles were determined under some conditions, conditions for simultaneous stabilizability of an infinite number of plants were explored, and a fault tolerance controller design that stabilizes a flexible structure model was obtained which is robust against one failure condition.
Rule-based fault diagnosis of hall sensors and fault-tolerant control of PMSM

NASA Astrophysics Data System (ADS)

Song, Ziyou; Li, Jianqiu; Ouyang, Minggao; Gu, Jing; Feng, Xuning; Lu, Dongbin

2013-07-01

Hall sensors are widely used for estimating the rotor phase of permanent magnet synchronous motors (PMSM). Rotor position is an essential parameter of the PMSM control algorithm, hence Hall sensor faults are very dangerous. However, there is scarcely any research focusing on fault diagnosis and fault-tolerant control of Hall sensors used in PMSM. From this standpoint, the Hall sensor faults which may occur during PMSM operation are theoretically analyzed. According to the analysis results, a fault diagnosis algorithm for Hall sensors, based on three rules, is proposed to classify the fault phenomena accurately. The rotor phase estimation algorithms, based on one or two Hall sensor(s), are initialized to engender the fault-tolerant control algorithm. The fault diagnosis algorithm can detect 60 Hall fault phenomena in total, and all detections can be fulfilled within 1/138 of a rotor rotation period. The fault-tolerant control algorithm can achieve smooth torque production, which means the same control effect as the normal control mode (with three Hall sensors). Finally, the PMSM bench test verifies the accuracy and rapidity of the fault diagnosis and fault-tolerant control strategies. The fault diagnosis algorithm can detect all Hall sensor faults promptly, and the fault-tolerant control algorithm allows the PMSM to face failure conditions of one or two Hall sensor(s). In addition, the transitions between health-control and fault-tolerant control conditions are smooth, without any additional noise and harshness. The proposed algorithms can deal with the Hall sensor faults of PMSM in real applications, and can be used to realize the fault diagnosis and fault-tolerant control of PMSM.
Agent Based Fault Tolerance for the Mobile Environment

NASA Astrophysics Data System (ADS)

Park, Taesoon

This paper presents a fault-tolerance scheme based on mobile agents for reliable mobile computing systems. The mobility of the agent is suitable for tracing the mobile hosts, and the intelligence of the agent makes it efficient in supporting the fault tolerance services. This paper presents two approaches to implementing the mobile agent based fault tolerant service, and their performance is evaluated and compared with other fault-tolerant schemes.
Sequential behavior and its inherent tolerance to memory faults.

NASA Technical Reports Server (NTRS)

Meyer, J. F.

1972-01-01

Representation of a memory fault of a sequential machine M by a function mu on the states of M and the result of the fault by an appropriately determined machine M(mu). Given some sequential behavior B, its inherent tolerance to memory faults can then be measured in terms of the minimum memory redundancy required to realize B with a state-assigned machine having fault tolerance type tau and fault tolerance level t. A behavior having maximum inherent tolerance is exhibited, and it is shown that behaviors of the same size can have different inherent tolerance.
Coordinated Fault Tolerance for High-Performance Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dongarra, Jack; Bosilca, George; et al.

2013-04-08

Our work to meet our goal of end-to-end fault tolerance has focused on two areas: (1) improving fault tolerance in various software currently available and widely used throughout the HEC domain and (2) using fault information exchange and coordination to achieve holistic, systemwide fault tolerance and understanding how to design and implement interfaces for integrating fault tolerance features for multiple layers of the software stack, from the application, math libraries, and programming language runtime to other common system software such as job schedulers, resource managers, and monitoring tools.
Byzantine-fault tolerant self-stabilizing protocol for distributed clock synchronization systems

NASA Technical Reports Server (NTRS)

Malekpour, Mahyar R. (Inventor)

2010-01-01

A rapid Byzantine self-stabilizing clock synchronization protocol that self-stabilizes from any state, tolerates bursts of transient failures, and deterministically converges within a linear convergence time with respect to the self-stabilization period. Upon self-stabilization, all good clocks proceed synchronously. The Byzantine self-stabilizing clock synchronization protocol does not rely on any assumptions about the initial state of the clocks. Furthermore, there is neither a central clock nor an externally generated pulse system. The protocol converges deterministically, is scalable, and self-stabilizes in a short amount of time. The convergence time is linear with respect to the self-stabilization period.
Physical fault tolerance of nanoelectronics.

PubMed

Szkopek, Thomas; Roychowdhury, Vwani P; Antoniadis, Dimitri A; Damoulakis, John N

2011-04-29

The error rate in complementary transistor circuits is suppressed exponentially in electron number, arising from an intrinsic physical implementation of fault-tolerant error correction. Contrariwise, explicit assembly of gates into the most efficient known fault-tolerant architecture is characterized by a subexponential suppression of error rate with electron number, and incurs significant overhead in wiring and complexity. We conclude that it is more efficient to prevent logical errors with physical fault tolerance than to correct logical errors with fault-tolerant architecture.
Analysis of typical fault-tolerant architectures using HARP

NASA Technical Reports Server (NTRS)

Bavuso, Salvatore J.; Bechta Dugan, Joanne; Trivedi, Kishor S.; Rothmann, Elizabeth M.; Smith, W. Earl

1987-01-01

Difficulties encountered in the modeling of fault-tolerant systems are discussed. The Hybrid Automated Reliability Predictor (HARP) approach to modeling fault-tolerant systems is described. The HARP is written in FORTRAN, consists of nearly 30,000 lines of codes and comments, and is based on behavioral decomposition. Using the behavioral decomposition, the dependability model is divided into fault-occurrence/repair and fault/error-handling models; the characteristics and combining of these two models are examined. Examples in which the HARP is applied to the modeling of some typical fault-tolerant systems, including a local-area network, two fault-tolerant computer systems, and a flight control system, are presented.
What does fault tolerant Deep Learning need from MPI?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Amatya, Vinay C.; Vishnu, Abhinav; Siegel, Charles M.

Deep Learning (DL) algorithms have become the de facto Machine Learning (ML) algorithm for large scale data analysis. DL algorithms are computationally expensive; even distributed DL implementations which use MPI require days of training (model learning) time on commonly studied datasets. Long running DL applications become susceptible to faults, requiring development of a fault tolerant system infrastructure in addition to fault tolerant DL algorithms. This raises an important question: What is needed from MPI for designing fault tolerant DL implementations? In this paper, we address this problem for permanent faults. We motivate the need for a fault tolerant MPI specification by an in-depth consideration of recent innovations in DL algorithms and their properties, which drive the need for specific fault tolerance features. We present an in-depth discussion on the suitability of different parallelism types (model, data and hybrid); a need (or lack thereof) for check-pointing of any critical data structures; and most importantly, consideration for several fault tolerance proposals (user-level fault mitigation (ULFM), Reinit) in MPI and their applicability to fault tolerant DL implementations. We leverage a distributed memory implementation of Caffe, currently available under the Machine Learning Toolkit for Extreme Scale (MaTEx). We implement our approaches by extending MaTEx-Caffe for using ULFM-based implementation. Our evaluation using the ImageNet dataset and AlexNet neural network topology demonstrates the effectiveness of the proposed fault tolerant DL implementation using OpenMPI based ULFM.
Modeling and experimental verification of single event upsets

NASA Technical Reports Server (NTRS)

Fogarty, T. N.; Attia, J. O.; Kumar, A. A.; Tang, T. S.; Lindner, J. S.

1993-01-01

The research performed and the results obtained at the Laboratory for Radiation Studies, Prairie View A&M University and Texas A&I University, on the problem of Single Event Upsets, the various schemes employed to limit them, and the effects they have on reliability and fault tolerance at the systems level (for example, in robotic systems) are reviewed.
Aircraft Engine On-Line Diagnostics Through Dual-Channel Sensor Measurements: Development of a Baseline System

NASA Technical Reports Server (NTRS)

Kobayashi, Takahisa; Simon, Donald L.

2008-01-01

In this paper, a baseline system which utilizes dual-channel sensor measurements for aircraft engine on-line diagnostics is developed. This system is composed of a linear on-board engine model (LOBEM) and fault detection and isolation (FDI) logic. The LOBEM provides the analytical third channel against which the dual-channel measurements are compared. When the discrepancy among the triplex channels exceeds a tolerance level, the FDI logic determines the cause of the discrepancy. Through this approach, the baseline system achieves the following objectives: (1) anomaly detection, (2) component fault detection, and (3) sensor fault detection and isolation. The performance of the baseline system is evaluated in a simulation environment using faults in sensors and components.
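The triplex comparison described above can be sketched as a simple decision table. The abstract does not give the actual FDI logic, so the rules below are a hypothetical reading of it: two physical sensor channels are compared with each other and with a model estimate (standing in for the LOBEM output), and the pattern of pairwise agreement points to the likely cause. The tolerance value and all names are assumptions.

```python
def diagnose(channel_a, channel_b, model_estimate, tolerance):
    """Classify a triplex reading from its pairwise agreement pattern."""
    ab = abs(channel_a - channel_b) <= tolerance
    am = abs(channel_a - model_estimate) <= tolerance
    bm = abs(channel_b - model_estimate) <= tolerance
    pattern = (ab, am, bm)
    if all(pattern):
        return "nominal"
    if pattern == (True, False, False):
        # Sensors agree with each other but not with the model:
        # the engine itself has probably departed from the model.
        return "anomaly or component fault"
    if pattern == (False, True, False):
        return "sensor fault: channel B"
    if pattern == (False, False, True):
        return "sensor fault: channel A"
    return "undetermined"  # ambiguous pattern, left unresolved in this sketch

# Hypothetical readings with a tolerance of 1.0 unit.
healthy = diagnose(100.0, 100.5, 100.2, 1.0)
bad_b = diagnose(100.0, 120.0, 100.3, 1.0)
shifted = diagnose(110.0, 110.2, 100.0, 1.0)
```

The model channel is what allows a faulty sensor to be isolated: with only two physical channels a disagreement can be detected but not attributed.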
Fault-tolerant processing system

NASA Technical Reports Server (NTRS)

Palumbo, Daniel L. (Inventor)

1996-01-01

A fault-tolerant, fiber optic interconnect, or backplane, which serves as a via for data transfer between modules. Fault tolerance algorithms are embedded in the backplane by dividing the backplane into a read bus and a write bus and placing a redundancy management unit (RMU) between the read bus and the write bus so that all data transmitted by the write bus is subjected to the fault tolerance algorithms before the data is passed for distribution to the read bus. The RMU provides both backplane control and fault tolerance.
A single dynamic observer-based module for design of simultaneous fault detection, isolation and tracking control scheme

NASA Astrophysics Data System (ADS)

Davoodi, M.; Meskin, N.; Khorasani, K.

2018-03-01

The problem of simultaneous fault detection, isolation and tracking (SFDIT) control design for linear systems subject to both bounded energy and bounded peak disturbances is considered in this work. A dynamic observer is proposed and implemented by using the H∞/H-/L1 formulation of the SFDIT problem. A single dynamic observer module is designed that generates the residuals as well as the control signals. The objective of the SFDIT module is to ensure that simultaneously the effects of disturbances and control signals on the residual signals are minimised (in order to accomplish the fault detection goal) subject to the constraint that the transfer matrix from the faults to the residuals is equal to a pre-assigned diagonal transfer matrix (in order to accomplish the fault isolation goal), while the effects of disturbances, reference inputs and faults on the specified control outputs are minimised (in order to accomplish the fault-tolerant and tracking control goals). A set of linear matrix inequality (LMI) feasibility conditions are derived to ensure solvability of the problem. In order to illustrate and demonstrate the effectiveness of our proposed design methodology, the developed and proposed schemes are applied to an autonomous unmanned underwater vehicle (AUV).
Identification of significant intrinsic mode functions for the diagnosis of induction motor fault.

PubMed

Cho, Sangjin; Shahriar, Md Rifat; Chong, Uipil

2014-08-01

For the analysis of non-stationary signals generated by a non-linear process like fault of an induction motor, empirical mode decomposition (EMD) is the best choice as it decomposes the signal into its natural oscillatory modes known as intrinsic mode functions (IMFs). However, some of these oscillatory modes obtained from a fault signal are not significant as they do not bear any fault signature and can cause misclassification of the fault instance. To solve this issue, a novel IMF selection algorithm is proposed in this work.
A methodology for testing fault-tolerant software

NASA Technical Reports Server (NTRS)

Andrews, D. M.; Mahmood, A.; Mccluskey, E. J.

1985-01-01

A methodology for testing fault tolerant software is presented. There are problems associated with testing fault tolerant software because many errors are masked or corrected by voters, limiters, or automatic channel synchronization. This methodology illustrates how the same strategies used for testing fault tolerant hardware can be applied to testing fault tolerant software. For example, one strategy used in testing fault tolerant hardware is to disable the redundancy during testing. A similar testing strategy is proposed for software, namely, to move the major emphasis on testing earlier in the development cycle (before the redundancy is in place), thus reducing the possibility that undetected errors will be masked when limiters and voters are added.

Minimalist fault-tolerance techniques for mitigating single-event effects in non-radiation-hardened microcontrollers

NASA Astrophysics Data System (ADS)

Caldwell, Douglas Wyche

Commercial microcontrollers--monolithic integrated circuits containing microprocessor, memory and various peripheral functions--such as are used in industrial, automotive and military applications, present spacecraft avionics system designers an appealing mix of higher performance and lower power together with faster system-development time and lower unit costs. However, these parts are not radiation-hardened for application in the space environment and Single-Event Effects (SEE) caused by high-energy, ionizing radiation present a significant challenge. Mitigating these effects with techniques which require minimal additional support logic, and thereby preserve the high functional density of these devices, can allow their benefits to be realized. This dissertation uses fault-tolerance to mitigate the transient errors and occasional latchups that non-hardened microcontrollers can experience in the space radiation environment. Space systems requirements and the historical use of fault-tolerant computers in spacecraft provide context. Space radiation and its effects in semiconductors define the fault environment. A reference architecture is presented which uses two or three microcontrollers with a combination of hardware and software voting techniques to mitigate SEE. A prototypical spacecraft function (an inertial measurement unit) is used to illustrate the techniques and to explore how real application requirements impact the fault-tolerance approach. Low-cost approaches which leverage features of existing commercial microcontrollers are analyzed. A high-speed serial bus is used for voting among redundant devices and a novel wire-OR output voting scheme exploits the bidirectional controls of I/O pins. A hardware testbed and prototype software were constructed to evaluate two- and three-processor configurations. Simulated Single-Event Upsets (SEUs) were injected at high rates and the response of the system monitored. 
The resulting statistics were used to evaluate technical effectiveness. Fault-recovery probabilities (coverages) higher than 99.99% were experimentally demonstrated. The greater than thousand-fold reduction in observed effects provides performance comparable with SEE tolerance of tested, rad-hard devices. Technical results were combined with cost data to assess the cost-effectiveness of the techniques. It was found that a three-processor system was only marginally more effective than a two-device system at detecting and recovering from faults, but consumed substantially more resources, suggesting that simpler configurations are generally more cost-effective.
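The difference between two- and three-device configurations can be shown with a minimal sketch of generic redundancy voting. This is not the dissertation's serial-bus or wire-OR scheme: it only shows that three copies allow a bitwise majority vote to mask a single upset, while two copies can detect a mismatch but need some other recovery action, such as a retry or reset. The data word and the injected bit flip are made-up test values.

```python
def vote3(a, b, c):
    """Bitwise 2-of-3 majority of three redundant words."""
    return (a & b) | (a & c) | (b & c)

def compare2(a, b):
    """Duplex comparison: detects a mismatch but cannot say which copy is wrong."""
    return a == b

def inject_seu(word, bit):
    """Simulate a single-event upset by flipping one bit."""
    return word ^ (1 << bit)

good = 0b10110010
upset = inject_seu(good, 5)

masked = vote3(good, upset, good)   # triplex: the upset is out-voted
detected = not compare2(good, upset)  # duplex: the upset is only detected
```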
Fault Tolerant Homopolar Magnetic Bearings

NASA Technical Reports Server (NTRS)

Li, Ming-Hsiu; Palazzolo, Alan; Kenny, Andrew; Provenza, Andrew; Beach, Raymond; Kascak, Albert

2003-01-01

Magnetic suspensions (MS) satisfy the long life and low loss conditions demanded by satellite and ISS based flywheels used for Energy Storage and Attitude Control (ACESE) service. This paper summarizes the development of a novel MS that improves reliability via fault tolerant operation. Specifically, flux coupling between poles of a homopolar magnetic bearing is shown to deliver desired forces even after termination of coil currents to a subset of failed poles. Linear, coordinate decoupled force-voltage relations are also maintained before and after failure by bias linearization. Current distribution matrices (CDM) which adjust the currents and fluxes following a pole set failure are determined for many faulted pole combinations. The CDMs and the system responses are obtained utilizing 1D magnetic circuit models with fringe and leakage factors derived from detailed, 3D, finite element field models. Reliability results are presented vs. detection/correction delay time and individual power amplifier reliability for 4, 6, and 7 pole configurations. Reliability is shown for two success criteria, i.e. (a) no catcher bearing contact following pole failures and (b) re-levitation off of the catcher bearings following pole failures. An advantage of the method presented over other redundant operation approaches is a significantly reduced requirement for backup hardware such as additional actuators or power amplifiers.
Eigenstructure Assignment for Fault Tolerant Flight Control Design

NASA Technical Reports Server (NTRS)

Sobel, Kenneth; Joshi, Suresh (Technical Monitor)

2002-01-01

In recent years, fault tolerant flight control systems have gained an increased interest for high performance military aircraft as well as civil aircraft. Fault tolerant control systems can be described as either active or passive. An active fault tolerant control system has to either reconfigure or adapt the controller in response to a failure. One approach is to reconfigure the controller based upon detection and identification of the failure. Another approach is to use direct adaptive control to adjust the controller without explicitly identifying the failure. In contrast, a passive fault tolerant control system uses a fixed controller which achieves acceptable performance for a presumed set of failures. We have obtained a passive fault tolerant flight control law for the F/A-18 aircraft which achieves acceptable handling qualities for a class of control surface failures. The class of failures includes the symmetric failure of any one control surface being stuck at its trim value. A comparison was made of an eigenstructure assignment gain designed for the unfailed aircraft with a fault tolerant multiobjective optimization gain. We have shown that time responses for the unfailed aircraft using the eigenstructure assignment gain and the fault tolerant gain are identical. Furthermore, the fault tolerant gain achieves MIL-F-8785C specifications for all failure conditions.
Modular Adder Designs Using Optimal Reversible and Fault Tolerant Gates in Field-Coupled QCA Nanocomputing

NASA Astrophysics Data System (ADS)

Bilal, Bisma; Ahmed, Suhaib; Kakkar, Vipan

2018-02-01

The challenges that CMOS technology faces toward the end of the technology roadmap call for an investigation of various logical and technological alternatives to CMOS at the nano scale. Two such paradigms considered in this paper are reversible logic and quantum-dot cellular automata (QCA) nanotechnology. First, a new 3 × 3 reversible and universal gate, RG-QCA, is proposed and implemented in QCA technology using conventional 3-input majority voter based logic. The gate is then optimized by using explicit interaction of cells, and this optimized gate is used to design an optimized modular full adder in QCA. Another configuration of the RG-QCA gate, CRG-QCA, is then proposed, which is a 4 × 4 gate with fault tolerant and parity preserving characteristics. The proposed CRG-QCA gate is then tested by designing a fault tolerant full adder circuit. Extensive comparisons of gate and adder circuits are drawn with the existing literature and it is envisaged that our proposed designs perform better and are cost efficient in QCA technology.
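The two properties named in this abstract, reversibility and parity preservation, can be checked exhaustively for any small gate. The truth tables of RG-QCA and CRG-QCA are not given here, so the sketch below uses the well-known Fredkin and Toffoli gates as stand-ins; the helper names are hypothetical.

```python
from itertools import product


def fredkin(c, a, b):
    """Controlled swap: a and b are exchanged when the control bit c is 1."""
    return (c, b, a) if c else (c, a, b)


def toffoli(a, b, c):
    """Controlled-controlled NOT: c is flipped when a and b are both 1."""
    return (a, b, c ^ (a & b))


def is_reversible(gate, width):
    """A gate is reversible if it maps the 2**width inputs to distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=width)}
    return len(outputs) == 2 ** width


def is_parity_preserving(gate, width):
    """Parity preserving: XOR of the inputs equals XOR of the outputs."""
    return all(
        sum(bits) % 2 == sum(gate(*bits)) % 2
        for bits in product((0, 1), repeat=width)
    )


print(is_reversible(fredkin, 3), is_parity_preserving(fredkin, 3))
print(is_reversible(toffoli, 3), is_parity_preserving(toffoli, 3))
```

Parity preservation matters for fault tolerance because a single flipped output bit changes the output parity and can therefore be detected by comparing input and output parity.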
A verified design of a fault-tolerant clock synchronization circuit: Preliminary investigations

NASA Technical Reports Server (NTRS)

Miner, Paul S.

1992-01-01

Schneider demonstrates that many fault tolerant clock synchronization algorithms can be represented as refinements of a single proven correct paradigm. Shankar provides mechanical proof that Schneider's schema achieves Byzantine fault tolerant clock synchronization provided that 11 constraints are satisfied. Some of the constraints are assumptions about physical properties of the system and cannot be established formally. Proofs are given that the fault tolerant midpoint convergence function satisfies three of the constraints. A hardware design is presented, implementing the fault tolerant midpoint function, which is shown to satisfy the remaining constraints. The synchronization circuit will recover completely from transient faults provided the maximum fault assumption is not violated. The initialization protocol for the circuit also provides a recovery mechanism from total system failure caused by correlated transient faults.
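For orientation, the fault tolerant midpoint convergence function is usually defined as follows: sort the clock readings, discard the f largest and f smallest, and take the midpoint of the remaining extremes. The sketch below assumes that usual definition; the function name and the example readings are hypothetical, and the report's hardware design is not reproduced.

```python
def fault_tolerant_midpoint(readings: list[float], f: int) -> float:
    """Midpoint of the readings left after dropping the f lowest and f highest.

    The trimming itself needs more than 2*f readings; Byzantine clock
    synchronization results typically assume at least 3*f + 1 clocks.
    """
    if f < 0 or len(readings) <= 2 * f:
        raise ValueError("need more than 2*f readings")
    ordered = sorted(readings)
    trimmed = ordered[f:len(ordered) - f]
    return (trimmed[0] + trimmed[-1]) / 2


# Hypothetical example: four clocks (in ticks), one of them wildly wrong.
print(fault_tolerant_midpoint([100, 102, 99, 500], f=1))  # 101.0
```

Because up to f faulty values are trimmed from each end, a faulty clock cannot pull the result outside the range spanned by the good clocks.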
The Programming Language Python In Earth System Simulations

NASA Astrophysics Data System (ADS)

Gross, L.; Imranullah, A.; Mora, P.; Saez, E.; Smillie, J.; Wang, C.

2004-12-01

Mathematical models in earth sciences are based on the solution of systems of coupled, non-linear, time-dependent partial differential equations (PDEs). The spatial and time scales vary from a planetary scale and millions of years for convection problems to 100 km and 10 years for fault systems simulations. Various techniques are in use to deal with the time dependency (e.g. Crank-Nicolson), with the non-linearity (e.g. Newton-Raphson) and with weakly coupled equations (e.g. non-linear Gauss-Seidel). Besides these high-level solution algorithms, discretization methods (e.g. finite element method (FEM), boundary element method (BEM)) are used to deal with spatial derivatives. Typically, large-scale, three dimensional meshes are required to resolve geometrical complexity (e.g. in the case of fault systems) or features in the solution (e.g. in mantle convection simulations). The modelling environment escript allows the rapid implementation of new physics as required for the development of simulation codes in earth sciences. Its main objective is to provide a programming language in which the user can define new models and rapidly develop high-level solution algorithms. The current implementation is linked with the finite element package finley as a PDE solver. However, the design is open and other discretization technologies such as finite differences and boundary element methods could be included. escript is implemented as an extension of the interactive programming environment python (see www.python.org). Key concepts introduced are Data objects, which hold values on nodes or elements of the finite element mesh, and linearPDE objects, which define linear partial differential equations to be solved by the underlying discretization technology. In this paper we will show the basic concepts of escript and will show how escript is used to implement a simulation code for interacting fault systems. 
We will show some results of large-scale, parallel simulations on an SGI Altix system. Acknowledgements: Project work is supported by Australian Commonwealth Government through the Australian Computational Earth Systems Simulator Major National Research Facility, Queensland State Government Smart State Research Facility Fund, The University of Queensland and SGI.
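To make the idea of a high-level solution algorithm written in Python concrete, here is a minimal Newton-Raphson iteration for a scalar non-linear equation. It is a plain-Python sketch only: it does not use escript or finley, and the function name, tolerance and test equation are assumptions chosen for illustration.

```python
import math


def newton_raphson(f, dfdx, x0, tol=1e-12, max_iter=50):
    """Find a root of f by repeatedly solving the linearized problem."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / dfdx(x)   # solution of the linearized equation
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")


# Example: solve x = cos(x), i.e. f(x) = x - cos(x) = 0.
root = newton_raphson(lambda x: x - math.cos(x),
                      lambda x: 1 + math.sin(x),
                      x0=1.0)
print(root)
```

In a PDE setting the division by the derivative would be replaced by the solution of a linear PDE at each step, which is the role the abstract assigns to linearPDE objects.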
Fault Analysis in Solar Photovoltaic Arrays

NASA Astrophysics Data System (ADS)

Zhao, Ye

Fault analysis in solar photovoltaic (PV) arrays is a fundamental task to increase reliability, efficiency and safety in PV systems. Conventional fault protection methods usually add fuses or circuit breakers in series with PV components. But these protection devices are only able to clear faults and isolate faulty circuits if they carry a large fault current. However, this research shows that faults in PV arrays may not be cleared by fuses under some fault scenarios, due to the current-limiting nature and non-linear output characteristics of PV arrays. First, this thesis introduces new simulation and analytic models that are suitable for fault analysis in PV arrays. Based on the simulation environment, this thesis studies a variety of typical faults in PV arrays, such as ground faults, line-line faults, and mismatch faults. The effect of a maximum power point tracker on fault current is discussed and shown to, at times, prevent the fault current protection devices from tripping. A small-scale experimental PV benchmark system has been developed at Northeastern University to further validate the simulation conclusions. Additionally, this thesis examines two types of unique faults found in a PV array that have not been studied in the literature. One is a fault that occurs under low irradiance condition. The other is a fault evolution in a PV array during night-to-day transition. Our simulation and experimental results show that overcurrent protection devices are unable to clear the fault under "low irradiance" and "night-to-day transition". However, the overcurrent protection devices may work properly when the same PV fault occurs in daylight. As a result, a fault under "low irradiance" and "night-to-day transition" might be hidden in the PV array and become a potential hazard for system efficiency and reliability.
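A back-of-the-envelope sketch can show why a current-limited source may fail to blow a fuse at low irradiance. It rests on simplifying assumptions that are not taken from the thesis: short-circuit current scales linearly with irradiance, the current fed back into a faulted string is at most the sum of the other strings' short-circuit currents, and a fuse can only clear if that current exceeds its rating. All names and numbers are hypothetical.

```python
def max_backfeed_current(n_strings, isc_stc, irradiance, g_stc=1000.0):
    """Upper bound (A) on current fed into one faulted string by the others.

    Assumes short-circuit current is proportional to irradiance (W/m^2).
    """
    return (n_strings - 1) * isc_stc * irradiance / g_stc


def fuse_can_clear(n_strings, isc_stc, irradiance, fuse_rating):
    """True if the upper-bound fault current exceeds the fuse rating."""
    return max_backfeed_current(n_strings, isc_stc, irradiance) > fuse_rating


# Hypothetical array: 4 parallel strings, 8 A short-circuit current, 12 A fuses.
for g in (1000.0, 200.0):
    print(g, max_backfeed_current(4, 8.0, g), fuse_can_clear(4, 8.0, g, 12.0))
```

Under these assumptions the same fault that exceeds the fuse rating in full sun stays well below it at 200 W/m^2, which matches the qualitative conclusion of the abstract.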
A distributed fault-tolerant signal processor /FTSP/

NASA Astrophysics Data System (ADS)

Bonneau, R. J.; Evett, R. C.; Young, M. J.

1980-01-01

A digital fault-tolerant signal processor (FTSP), an example of a self-repairing programmable system, is analyzed. The design configuration is discussed in terms of fault tolerance, system-level fault detection, isolation and common memory. Special attention is given to the FDIR (fault detection, isolation and reconfiguration) logic, noting that the reconfiguration decisions are based on configuration, summary status, end-around tests, and north marker/synchro data. Several mechanisms of fault detection are described which initiate reconfiguration at different levels. It is concluded that the reliability of a signal processor can be significantly enhanced by the use of fault-tolerant techniques.
Fault tolerant software modules for SIFT

NASA Technical Reports Server (NTRS)

Hecht, M.; Hecht, H.

1982-01-01

The implementation of software fault tolerance is investigated for critical modules of the Software Implemented Fault Tolerance (SIFT) operating system to support the computational and reliability requirements of advanced fly-by-wire transport aircraft. Fault tolerant designs generated for the error reporter and global executive are examined. A description of the alternate routines, implementation requirements, and software validation is included.
Fault tree models for fault tolerant hypercube multiprocessors

NASA Technical Reports Server (NTRS)

Boyd, Mark A.; Tuazon, Jezus O.

1991-01-01

Three candidate fault tolerant hypercube architectures are modeled, their reliability analyses are compared, and the resulting implications of these methods of incorporating fault tolerance into hypercube multiprocessors are discussed. In the course of performing the reliability analyses, the use of HARP and fault trees in modeling sequence dependent system behaviors is demonstrated.
Parallel and fault-tolerant algorithms for hypercube multiprocessors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aykanat, C.

1988-01-01

Several techniques for increasing the performance of parallel algorithms on distributed-memory message-passing multi-processor systems are investigated. These techniques are effectively implemented for the parallelization of the Scaled Conjugate Gradient (SCG) algorithm on a hypercube connected message-passing multi-processor. Significant performance improvement is achieved by using these techniques. The SCG algorithm is used for the solution phase of an FE modeling system. Almost linear speed-up is achieved, and it is shown that hypercube topology is scalable for an FE class of problem. The SCG algorithm is also shown to be suitable for vectorization, and near supercomputer performance is achieved on a vector hypercube multiprocessor by exploiting both parallelization and vectorization. Fault-tolerance issues for the parallel SCG algorithm and for the hypercube topology are also addressed.
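For readers unfamiliar with the solver, the following serial sketch shows a conjugate gradient iteration with diagonal scaling. It assumes that "scaled" refers to diagonal (Jacobi) scaling, which this abstract does not state, and it contains none of the hypercube parallelization; all names are hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    # In a message-passing setting each processor would own a block of rows.
    return [dot(row, v) for row in A]


def scaled_cg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A (CG, diagonal scaling)."""
    inv_diag = [1.0 / A[i][i] for i in range(len(b))]
    x = [0.0] * len(b)
    r = list(b)                                  # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]   # diagonally scaled residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


x = scaled_cg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(x)  # close to [1/11, 7/11]
```

The matrix-vector product and the dot products are the steps that dominate the cost and would be distributed across processors in a parallel version.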
The Design of a Fault-Tolerant COTS-Based Bus Architecture for Space Applications

NASA Technical Reports Server (NTRS)

Chau, Savio N.; Alkalai, Leon; Tai, Ann T.

2000-01-01

The high-performance, scalability and miniaturization requirements together with the power, mass and cost constraints mandate the use of commercial-off-the-shelf (COTS) components and standards in the X2000 avionics system architecture for deep-space missions. In this paper, we report our experiences and findings on the design of an IEEE 1394 compliant fault-tolerant COTS-based bus architecture. While the COTS standard IEEE 1394 adequately supports power management, high performance and scalability, its topological criteria impose restrictions on fault tolerance realization. To circumvent the difficulties, we derive a "stack-tree" topology that not only complies with the IEEE 1394 standard but also facilitates fault tolerance realization in a spaceborne system with limited dedicated resource redundancies. Moreover, by exploiting pertinent standard features of the 1394 interface which are not purposely designed for fault tolerance, we devise a comprehensive set of fault detection mechanisms to support the fault-tolerant bus architecture.
Survivable algorithms and redundancy management in NASA's distributed computing systems

NASA Technical Reports Server (NTRS)

Malek, Miroslaw

1992-01-01

The design of survivable algorithms requires a solid foundation for executing them. While hardware techniques for fault-tolerant computing are relatively well understood, fault-tolerant operating systems, as well as fault-tolerant applications (survivable algorithms), are, by contrast, little understood, and much more work in this field is required. We outline some of our work that contributes to the foundation of ultrareliable operating systems and fault-tolerant algorithm design. We introduce our consensus-based framework for fault-tolerant system design. This is followed by a description of a hierarchical partitioning method for efficient consensus. A scheduler for redundancy management is introduced, and application-specific fault tolerance is described. We give an overview of our hybrid algorithm technique, which is an alternative to the formal approach given.
Fault tolerant architectures for integrated aircraft electronics systems, task 2

NASA Technical Reports Server (NTRS)

Levitt, K. N.; Melliar-Smith, P. M.; Schwartz, R. L.

1984-01-01

The architectural basis for an advanced fault tolerant on-board computer to succeed the current generation of fault tolerant computers is examined. The network error tolerant system architecture is studied with particular attention to intercluster configurations and communication protocols, and to refined reliability estimates. The diagnosis of faults, so that appropriate choices for reconfiguration can be made, is discussed. The analysis relates particularly to the recognition of transient faults in a system with tasks at many levels of priority. The demand driven data-flow architecture, which appears to have possible application in fault tolerant systems, is described, and work investigating the feasibility of automatic generation of aircraft flight control programs from abstract specifications is reported.
Rapid recovery from transient faults in the fault-tolerant processor with fault-tolerant shared memory

NASA Technical Reports Server (NTRS)

Harper, Richard E.; Butler, Bryan P.

1990-01-01

The Draper fault-tolerant processor with fault-tolerant shared memory (FTP/FTSM), which is designed to allow application tasks to continue execution during the memory alignment process, is described. Processor performance is not affected by memory alignment. In addition, the FTP/FTSM incorporates a hardware scrubber device to perform the memory alignment quickly during unused memory access cycles. The FTP/FTSM architecture is described, followed by an estimate of the time required for channel reintegration.
Parallel and distributed computation for fault-tolerant object recognition

NASA Technical Reports Server (NTRS)

Wechsler, Harry

1988-01-01

The distributed associative memory (DAM) model is suggested for distributed and fault-tolerant computation as it relates to object recognition tasks. The fault-tolerance is with respect to geometrical distortions (scale and rotation), noisy inputs, occlusion/overlap, and memory faults. An experimental system was developed for fault-tolerant structure recognition which shows the feasibility of such an approach. The approach is further extended to the problem of multisensory data integration and applied successfully to the recognition of colored polyhedral objects.
State and actuator fault estimation observer design integrated in a riderless bicycle stabilization system.

PubMed

Brizuela Mendoza, Jorge Aurelio; Astorga Zaragoza, Carlos Manuel; Zavala Río, Arturo; Pattalochi, Leo; Canales Abarca, Francisco

2016-03-01

This paper deals with an observer design for Linear Parameter Varying (LPV) systems with high-order time-varying parameter dependency. The proposed design, considered as the main contribution of this paper, corresponds to an observer for the estimation of the actuator fault and the system state, considering measurement noise at the system outputs. The observer gains are computed by considering the extension of linear systems theory to polynomial LPV systems, in such a way that the observer reaches the characteristics of LPV systems. As a result, the actuator fault estimation is ready to be used in a Fault Tolerant Control scheme, where the estimated state with reduced noise should be used to generate the control law. The effectiveness of the proposed methodology has been tested using a riderless bicycle model with dependency on the translational velocity v, where the control objective corresponds to the system stabilization towards the upright position despite the variation of v along the closed-loop system trajectories. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 1: FTMP principles of operation

NASA Technical Reports Server (NTRS)

Smith, T. B., Jr.; Lala, J. H.

1983-01-01

The basic organization of the fault tolerant multiprocessor (FTMP) is that of a general purpose homogeneous multiprocessor. Three processors operate on a shared system (memory and I/O) bus. Replication and tight synchronization of all elements and hardware voting are employed to detect and correct any single fault. Reconfiguration is then employed to repair a fault. Multiple faults may be tolerated as a sequence of single faults with repair between fault occurrences.
A Novel Dual Separate Paths (DSP) Algorithm Providing Fault-Tolerant Communication for Wireless Sensor Networks.

PubMed

Tien, Nguyen Xuan; Kim, Semog; Rhee, Jong Myung; Park, Sang Yoon

2017-07-25

Fault tolerance has long been a major concern for sensor communications in fault-tolerant cyber physical systems (CPSs). Network failure problems often occur in wireless sensor networks (WSNs) due to various factors such as the insufficient power of sensor nodes, the dislocation of sensor nodes, the unstable state of wireless links, and unpredictable environmental interference. Fault tolerance is thus one of the key requirements for data communications in WSN applications. This paper proposes a novel path redundancy-based algorithm, called dual separate paths (DSP), that provides fault-tolerant communication with the improvement of the network traffic performance for WSN applications, such as fault-tolerant CPSs. The proposed DSP algorithm establishes two separate paths between a source and a destination in a network based on the network topology information. These paths are node-disjoint paths and have optimal path distances. Unicast frames are delivered from the source to the destination in the network through the dual paths, providing fault-tolerant communication and reducing redundant unicast traffic for the network. The DSP algorithm can be applied to wired and wireless networks, such as WSNs, to provide seamless fault-tolerant communication for mission-critical and life-critical applications such as fault-tolerant CPSs. The analyzed and simulated results show that the DSP-based approach not only provides fault-tolerant communication, but also improves network traffic performance. For the case study in this paper, when the DSP algorithm was applied to high-availability seamless redundancy (HSR) networks, the proposed DSP-based approach reduced the network traffic by 80% to 88% compared with the standard HSR protocol, thus improving network traffic performance.
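The core idea of two node-disjoint paths can be sketched with a naive greedy search: find one shortest path, ban its interior nodes, and search again. This is not the DSP algorithm from the paper. The greedy approach can miss a disjoint pair that exists and does not guarantee optimal path lengths, which Suurballe-style algorithms address. The graph, function names and ring example are hypothetical.

```python
from collections import deque


def shortest_path(graph, src, dst, banned=frozenset(), skip_direct=False):
    """Breadth-first shortest path that avoids the banned nodes."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph[node]:
            if skip_direct and node == src and nxt == dst:
                continue  # do not reuse a direct src-dst link
            if nxt not in parent and nxt not in banned:
                parent[nxt] = node
                queue.append(nxt)
    return None


def two_disjoint_paths(graph, src, dst):
    """Greedy pair of node-disjoint paths, or None if the greedy search fails."""
    first = shortest_path(graph, src, dst)
    if first is None:
        return None
    second = shortest_path(graph, src, dst, banned=frozenset(first[1:-1]),
                           skip_direct=(len(first) == 2))
    return (first, second) if second is not None else None


# Hypothetical six-node ring, similar in shape to an HSR ring.
ring = {"A": ["B", "F"], "B": ["A", "C"], "C": ["B", "D"],
        "D": ["C", "E"], "E": ["D", "F"], "F": ["E", "A"]}
print(two_disjoint_paths(ring, "A", "D"))
```

Sending each unicast frame only along these two paths, rather than flooding it around the network, is the intuition behind the traffic reduction reported in the abstract.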
General Monte Carlo reliability simulation code including common mode failures and HARP fault/error-handling

NASA Technical Reports Server (NTRS)

Platt, M. E.; Lewis, E. E.; Boehm, F.

1991-01-01

A Monte Carlo Fortran computer program was developed that uses two variance reduction techniques for computing system reliability applicable to solving very large highly reliable fault-tolerant systems. The program is consistent with the hybrid automated reliability predictor (HARP) code which employs behavioral decomposition and complex fault-error handling models. This new capability is called MC-HARP, which efficiently solves reliability models with non-constant failure rates (Weibull). Common mode failure modeling is also a specialty.
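As a small illustration of Monte Carlo reliability estimation with Weibull lifetimes, the sketch below simulates a hypothetical two-out-of-three redundant system and compares the estimate with the closed-form value. It uses plain sampling with no variance reduction, no fault/error-handling model and no common mode failures, so it is not a reproduction of MC-HARP; all names and parameter values are assumptions.

```python
import math
import random


def simulate_2oo3_reliability(t, scale, shape, trials=100_000, seed=0):
    """Fraction of trials in which at least 2 of 3 units survive past time t."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        lifetimes = sorted(rng.weibullvariate(scale, shape) for _ in range(3))
        if lifetimes[1] > t:      # the system fails at the second unit failure
            survived += 1
    return survived / trials


def analytic_2oo3_reliability(t, scale, shape):
    """Closed form for independent, identical Weibull units."""
    r = math.exp(-((t / scale) ** shape))   # single-unit reliability at time t
    return 3 * r**2 - 2 * r**3


estimate = simulate_2oo3_reliability(t=0.5, scale=1.0, shape=1.5)
exact = analytic_2oo3_reliability(t=0.5, scale=1.0, shape=1.5)
print(estimate, exact)
```

For highly reliable systems, failures are so rare that plain sampling like this needs an impractical number of trials, which is the motivation for the variance reduction techniques mentioned in the abstract.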

cost and benefits optimization model for fault-tolerant aircraft electronic systems

NASA Technical Reports Server (NTRS)

1983-01-01

The factors involved in economic assessment of fault tolerant systems (FTS) and fault tolerant flight control systems (FTFCS) are discussed. Algorithms for optimization and economic analysis of FTFCS are documented.
Two fault tolerant toggle-hook release

NASA Technical Reports Server (NTRS)

Graves, Thomas Joseph (Inventor); Brown, Christopher William (Inventor)

1991-01-01

A coupling device is disclosed which is mechanically two fault tolerant for release. The device comprises a fastener plate and fastener body, each of which is attachable to a different one of a pair of structures to be joined. The fastener plate and body are coupled by an elongate toggle mounted at one end in a socket on the fastener plate for universal pivotal movement thereon. The other end of the toggle is received in an opening in the fastener body and adapted for limited pivotal movement therein. The toggle is adapted to be restrained by three latch hooks arranged in symmetrical equiangular spacing about the axis of the toggle, each hook being mounted on the fastener body for pivotal movement between an unlatching non-contact position with respect to the toggle and a latching position in engagement with a latching surface of the toggle. The device includes releasable lock means for locking each latch hook in its latching position whereby the toggle couples the fastener plate to the fastener body and means for releasing the lock means to unlock each said latch hook from the latch position whereby the unlocking of at least one of the latch hooks from its latching position results in the decoupling of the fastener plate from the fastener body.
Reliable Cellular Automata with Self-Organization

NASA Astrophysics Data System (ADS)

Gács, Peter

2001-04-01

In a probabilistic cellular automaton in which all local transitions have positive probability, the problem of keeping a bit of information indefinitely is nontrivial, even in an infinite automaton. Still, there is a solution in 2 dimensions, and this solution can be used to construct a simple 3-dimensional discrete-time universal fault-tolerant cellular automaton. This technique does not help much to solve the following problems: remembering a bit of information in 1 dimension; computing in dimensions lower than 3; computing in any dimension with non-synchronized transitions. Our more complex technique organizes the cells in blocks that perform a reliable simulation of a second (generalized) cellular automaton. The cells of the latter automaton are also organized in blocks, simulating even more reliably a third automaton, etc. Since all this (a possibly infinite hierarchy) is organized in "software," it must be under repair all the time from damage caused by errors. A large part of the problem is essentially self-stabilization, that is, recovering from a mess of arbitrary size and content. The present paper constructs an asynchronous one-dimensional fault-tolerant cellular automaton, with the further feature of "self-organization." The latter means that unless a large amount of input information must be given, the initial configuration can be chosen homogeneous.
End-to-End Fault Tolerance Using Transport Layer Multihoming

DTIC Science & Technology

2005-01-01

Validation Methods for Fault-Tolerant avionics and control systems, working group meeting 1

NASA Technical Reports Server (NTRS)

1979-01-01

The proceedings of the first working group meeting on validation methods for fault tolerant computer design are presented. The state of the art in fault tolerant computer validation was examined in order to provide a framework for future discussions concerning research issues for the validation of fault tolerant avionics and flight control systems. Positions developed concerning critical aspects of the validation process are given.
Sliding Mode Fault Tolerant Control with Adaptive Diagnosis for Aircraft Engines

NASA Astrophysics Data System (ADS)

Xiao, Lingfei; Du, Yanbin; Hu, Jixiang; Jiang, Bin

2018-03-01

In this paper, a novel sliding mode fault tolerant control method is presented for aircraft engine systems with uncertainties and disturbances on the basis of an adaptive diagnostic observer. By taking both sensor faults and actuator faults into account, a general model of aircraft engine control systems subject to uncertainties and disturbances is considered. Then, the corresponding augmented dynamic model is established in order to facilitate the fault diagnosis and fault tolerant controller design. Next, a suitable detection observer is designed to detect the faults effectively. Through creating an adaptive diagnostic observer and based on a sliding mode strategy, the sliding mode fault tolerant controller is constructed. Robust stabilization is discussed and the closed-loop system can be stabilized robustly. It is also proven that the adaptive diagnostic observer output errors and the estimations of faults converge to a set exponentially, and that the convergence rate is greater than some value which can be adjusted by choosing designable parameters properly. The simulation on a twin-shaft aircraft engine verifies the applicability of the proposed fault tolerant control method.
Advanced cloud fault tolerance system

NASA Astrophysics Data System (ADS)

Sumangali, K.; Benny, Niketa

2017-11-01

Cloud computing has become a prevalent on-demand service on the internet to store, manage and process data. A pitfall that accompanies cloud computing is the failures that can be encountered in the cloud. To overcome these failures, we require a fault tolerance mechanism to abstract faults from users. We have proposed a fault tolerant architecture, which is a combination of proactive and reactive fault tolerance. This architecture essentially increases the reliability and the availability of the cloud. In the future, we would like to compare evaluations of our proposed architecture with existing architectures and further improve it.
ROBUS-2: A Fault-Tolerant Broadcast Communication System

NASA Technical Reports Server (NTRS)

Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Miner, Paul S.

2005-01-01

The Reliable Optical Bus (ROBUS) is the core communication system of the Scalable Processor-Independent Design for Enhanced Reliability (SPIDER), a general-purpose fault-tolerant integrated modular architecture currently under development at NASA Langley Research Center. The ROBUS is a time-division multiple access (TDMA) broadcast communication system with medium access control by means of a time-indexed communication schedule. ROBUS-2 is a developmental version of the ROBUS providing guaranteed fault-tolerant services to the attached processing elements (PEs), in the presence of a bounded number of faults. These services include message broadcast (Byzantine Agreement), dynamic communication schedule update, clock synchronization, and distributed diagnosis (group membership). The ROBUS also features fault-tolerant startup and restart capabilities. ROBUS-2 is tolerant to internal as well as PE faults, and incorporates a dynamic self-reconfiguration capability driven by the internal diagnostic system. This version of the ROBUS is intended for laboratory experimentation and demonstrations of the capability to reintegrate failed nodes, dynamically update the communication schedule, and tolerate and recover from correlated transient faults.
Controllability Analysis for Multirotor Helicopter Rotor Degradation and Failure

NASA Astrophysics Data System (ADS)

Du, Guang-Xun; Quan, Quan; Yang, Binxian; Cai, Kai-Yuan

2015-05-01

This paper considers the controllability analysis problem for a class of multirotor systems subject to rotor failure/wear. It is shown that classical controllability theories of linear systems are not sufficient to test the controllability of the considered multirotors. Owing to this, an easy-to-use measurement index is introduced to assess the available control authority. Based on it, a new necessary and sufficient condition for the controllability of multirotors is derived. Furthermore, a controllability test procedure is presented. The proposed controllability test method is applied to a class of hexacopters with different rotor configurations and different rotor efficiency parameters to show its effectiveness. The analysis results show that hexacopters with different rotor configurations have different fault-tolerant capabilities. It is therefore necessary to test the controllability of the multirotors before any fault-tolerant control strategies are employed.
Decoupling control of a five-phase fault-tolerant permanent magnet motor by radial basis function neural network inverse

NASA Astrophysics Data System (ADS)

Chen, Qian; Liu, Guohai; Xu, Dezhi; Xu, Liang; Xu, Gaohong; Aamir, Nazir

2018-05-01

This paper proposes a new decoupled control for a five-phase in-wheel fault-tolerant permanent magnet (IW-FTPM) motor drive, in which radial basis function neural network inverse (RBF-NNI) and internal model control (IMC) are combined. The RBF-NNI system is introduced into the original system to construct a pseudo-linear system, and IMC is used as a robust controller. Hence, the newly proposed control system incorporates the merits of the IMC and RBF-NNI methods. In order to verify the proposed strategy, an IW-FTPM motor drive is designed based on the dSPACE real-time control platform. Then, the experimental results are offered to verify that the d-axis current and the rotor speed are successfully decoupled. Besides, the proposed motor drive exhibits strong robustness even under load torque disturbance.
Probabilistic evaluation of on-line checks in fault-tolerant multiprocessor systems

NASA Technical Reports Server (NTRS)

Nair, V. S. S.; Hoskote, Yatin V.; Abraham, Jacob A.

1992-01-01

The analysis of fault-tolerant multiprocessor systems that use concurrent error detection (CED) schemes is much more difficult than the analysis of conventional fault-tolerant architectures. Various analytical techniques have been proposed to evaluate CED schemes deterministically. However, these approaches are based on worst-case assumptions related to the failure of system components. Often, the evaluation results do not reflect the actual fault tolerance capabilities of the system. A probabilistic approach to evaluate the fault detecting and locating capabilities of on-line checks in a system is developed. The various probabilities associated with the checking schemes are identified and used in the framework of the matrix-based model. Based on these probabilistic matrices, estimates for the fault tolerance capabilities of various systems are derived analytically.
Ultrareliable fault-tolerant control systems

NASA Technical Reports Server (NTRS)

Webster, L. D.; Slykhouse, R. A.; Booth, L. A., Jr.; Carson, T. M.; Davis, G. J.; Howard, J. C.

1984-01-01

It is demonstrated that fault-tolerant computer systems, such as those on the Shuttles, based on redundant, independent operation are a viable alternative in fault tolerant system designs. The ultrareliable fault-tolerant control system (UFTCS) was developed and tested in laboratory simulations of a UH-1H helicopter. UFTCS includes asymptotically stable independent control elements in a parallel, cross-linked system environment. Static redundancy provides the fault tolerance. Polling is performed among the computers, with results allowing for time-delay channel variations with tight bounds. When compared with the laboratory and actual flight data for the helicopter, the probability of a fault was, for the first 10 hr of flight given a quintuple computer redundancy, found to be 1 in 290 billion. Two weeks of untended Space Station operations would experience a fault probability of 1 in 24 million. Techniques for avoiding channel divergence problems are identified.
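The polling among redundant computers mentioned above can be illustrated with a generic mid-value voter. This is only a minimal sketch under stated assumptions: the function name `vote`, the tolerance band, and the sample channel values are hypothetical, and the abstract does not say that UFTCS votes this way.

```python
from statistics import median

def vote(channel_outputs, tolerance):
    """Return (voted_value, suspect_channel_indexes) for redundant channel outputs."""
    if not channel_outputs:
        raise ValueError("at least one channel output is required")
    # The median is unaffected by a minority of arbitrarily wrong channels.
    voted = median(channel_outputs)
    # Channels that differ from the voted value by more than the
    # tolerance band are flagged as suspect.
    suspects = [i for i, value in enumerate(channel_outputs)
                if abs(value - voted) > tolerance]
    return voted, suspects

# Five hypothetical channels (quintuple redundancy); channel 3 has failed.
outputs = [1.02, 0.98, 1.00, 7.50, 1.01]
voted, suspects = vote(outputs, tolerance=0.1)
print(voted, suspects)
```

With five channels, the median stays within the range of the healthy values as long as at most two channels are wrong, which is one reason odd channel counts are a common choice for static redundancy.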
Fault recovery characteristics of the fault tolerant multi-processor

NASA Technical Reports Server (NTRS)

Padilla, Peter A.

1990-01-01

The fault handling performance of the fault tolerant multiprocessor (FTMP) was investigated. Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles byzantine or lying faults. It is pointed out that these weak areas in the FTMP's design increase the probability that, for any hardware fault, a good LRU (line replaceable unit) is mistakenly disabled by the fault management software. It is concluded that fault injection can help detect and analyze the behavior of a system in the ultra-reliable regime. Although fault injection testing cannot be exhaustive, it has been demonstrated that it provides a unique capability to unmask problems and to characterize the behavior of a fault-tolerant system.
Three dimensional modelling of earthquake rupture cycles on frictional faults

NASA Astrophysics Data System (ADS)

Simpson, Guy; May, Dave

2017-04-01

We are developing an efficient MPI-parallel numerical method to simulate earthquake sequences on preexisting faults embedded within a three-dimensional viscoelastic half-space. We solve the velocity form of the elasto(visco)dynamic equations using a continuous Galerkin Finite Element Method on an unstructured pentahedral mesh, which thus permits local spatial refinement in the vicinity of the fault. Frictional sliding is coupled to the viscoelastic solid via rate- and state-dependent friction laws using the split-node technique. Our coupled formulation employs a Picard-type non-linear solver with a fully implicit, first-order accurate time integrator that utilises an adaptive time step to evolve the system efficiently through multiple seismic cycles. The implementation leverages advanced parallel solvers, preconditioners and linear algebra from the Portable Extensible Toolkit for Scientific Computing (PETSc) library. The model can treat heterogeneous frictional properties and stress states on the fault and surrounding solid as well as non-planar fault geometries. Preliminary tests show that the model successfully reproduces dynamic rupture on a vertical strike-slip fault in a half-space governed by rate-state friction with the ageing law.
Optimizing the Reliability and Performance of Service Composition Applications with Fault Tolerance in Wireless Sensor Networks

PubMed Central

Wu, Zhao; Xiong, Naixue; Huang, Yannong; Xu, Degang; Hu, Chunyang

2015-01-01

The services composition technology provides flexible methods for building service composition applications (SCAs) in wireless sensor networks (WSNs). The high reliability and high performance of SCAs help services composition technology promote the practical application of WSNs. The optimization methods for reliability and performance used for traditional software systems are mostly based on the instantiations of software components, which are inapplicable and inefficient in the ever-changing SCAs in WSNs. In this paper, we consider the SCAs with fault tolerance in WSNs. Based on a Universal Generating Function (UGF) we propose a reliability and performance model of SCAs in WSNs, which generalizes a redundancy optimization problem to a multi-state system. Based on this model, an efficient optimization algorithm for reliability and performance of SCAs in WSNs is developed based on a Genetic Algorithm (GA) to find the optimal structure of SCAs with fault-tolerance in WSNs. In order to examine the feasibility of our algorithm, we have evaluated the performance. Furthermore, the interrelationships between the reliability, performance and cost are investigated. In addition, a distinct approach to determine the most suitable parameters in the suggested algorithm is proposed. PMID:26561818
Adaptive Control Allocation for Fault Tolerant Overactuated Autonomous Vehicles

DTIC Science & Technology

2007-11-01

Tolerant Overactuated Autonomous Vehicles Casavola, A.; Garone, E. (2007) Adaptive Control Allocation for Fault Tolerant Overactuated Autonomous ...Adaptive Control Allocation for Fault Tolerant Overactuated Autonomous Vehicles 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6...Tolerant Overactuated Autonomous Vehicles 3.2 - 2 RTO-MP-AVT-145 UNCLASSIFIED/UNLIMITED Control allocation problem (CAP) - Given a virtual input v(t
Study of fault-tolerant software technology

NASA Technical Reports Server (NTRS)

Slivinski, T.; Broglio, C.; Wild, C.; Goldberg, J.; Levitt, K.; Hitt, E.; Webb, J.

1984-01-01

Presented is an overview of the current state of the art of fault-tolerant software and an analysis of quantitative techniques and models developed to assess its impact. It examines research efforts as well as experience gained from commercial application of these techniques. The paper also addresses the computer architecture and design implications on hardware, operating systems and programming languages (including Ada) of using fault-tolerant software in real-time aerospace applications. It concludes that fault-tolerant software has progressed beyond the pure research state. The paper also finds that, although not perfectly matched, newer architectural and language capabilities provide many of the notations and functions needed to effectively and efficiently implement software fault-tolerance.
Stability, performance and sensitivity analysis of I.I.D. jump linear systems

NASA Astrophysics Data System (ADS)

Chávez Fuentes, Jorge R.; González, Oscar R.; Gray, W. Steven

2018-06-01

This paper presents a symmetric Kronecker product analysis of independent and identically distributed jump linear systems to develop new, lower dimensional equations for the stability and performance analysis of this type of systems than what is currently available. In addition, new closed form expressions characterising multi-parameter relative sensitivity functions for performance metrics are introduced. The analysis technique is illustrated with a distributed fault-tolerant flight control example where the communication links are allowed to fail randomly.
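For orientation, the standard full Kronecker product test for mean-square stability of an i.i.d. jump linear system can be written as follows. The notation ($A_i$, $p_i$, $\theta_k$, $Q_k$) is an assumption introduced here for illustration and is not taken from the paper, whose symmetric Kronecker formulation is described as lower dimensional than this one.

```latex
% Assumed notation: the mode \theta_k is i.i.d. and independent of the state x_k.
\begin{align*}
x_{k+1} &= A_{\theta_k} x_k, \qquad \Pr(\theta_k = i) = p_i, \quad i = 1, \dots, N, \\
Q_k &= \mathbb{E}\!\left[x_k x_k^{\top}\right]
  \;\Longrightarrow\;
  Q_{k+1} = \sum_{i=1}^{N} p_i\, A_i Q_k A_i^{\top}, \\
\operatorname{vec}(Q_{k+1}) &= \Big(\sum_{i=1}^{N} p_i\, A_i \otimes A_i\Big) \operatorname{vec}(Q_k), \\
\text{mean-square stability} &\iff \rho\Big(\sum_{i=1}^{N} p_i\, A_i \otimes A_i\Big) < 1 .
\end{align*}
```

Because $Q_k$ is symmetric, only $n(n+1)/2$ of its $n^2$ entries are distinct, which suggests why a symmetric Kronecker product can yield lower dimensional equations.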
Propulsion Health Monitoring for Enhanced Safety

NASA Technical Reports Server (NTRS)

Butz, Mark G.; Rodriguez, Hector M.

2003-01-01

This report presents the results of the NASA contract Propulsion System Health Management for Enhanced Safety performed by General Electric Aircraft Engines (GE AE), General Electric Global Research (GE GR), and Pennsylvania State University Applied Research Laboratory (PSU ARL) under the NASA Aviation Safety Program. This activity supports the overall goal of enhanced civil aviation safety through a reduction in the occurrence of safety-significant propulsion system malfunctions. Specific objectives are to develop and demonstrate vibration diagnostics techniques for the on-line detection of turbine rotor disk cracks, and model-based fault tolerant control techniques for the prevention and mitigation of in-flight engine shutdown, surge/stall, and flameout events. The disk crack detection work was performed by GE GR which focused on a radial-mode vibration monitoring technique, and PSU ARL which focused on a torsional-mode vibration monitoring technique. GE AE performed the Model-Based Fault Tolerant Control work which focused on the development of analytical techniques for detecting, isolating, and accommodating gas-path faults.
Velocity Gradient Across the San Andreas Fault and Changes in Slip Behavior as Outlined by Full non Linear Tomography

NASA Astrophysics Data System (ADS)

Chiarabba, C.; Giacomuzzi, G.; Piana Agostinetti, N.

2017-12-01

The San Andreas Fault (SAF) near Parkfield is the best known fault section that exhibits a clear transition in slip behavior from stable to unstable. Intensive monitoring and decades of studies make it possible to identify details of these processes with a good definition of fault structure and subsurface models. Tomographic models computed so far revealed the existence of large velocity contrasts, yielding physical insight into fault rheology. In this study, we applied a recently developed full non-linear tomography method to compute Vp and Vs models which focus on the section of the fault that exhibits the fault slip transition. The new tomographic code does not require imposing a vertical seismic discontinuity at the fault position, as is routinely done in linearized codes. Any lateral velocity contrast found is directly dictated by the data themselves and not imposed by subjective choices. The use of the same dataset as previous tomographic studies allows a proper comparison of results. We use a total of 861 earthquakes, 72 blasts and 82 shots and the overall arrival time dataset consists of 43948 P- and 29158 S-wave arrival times, accurately selected to take care of seismic anisotropy. Computed Vp and Vp/Vs models, which by-pass the main problems related to linearized LET algorithms, excellently match independent available constraints and show crustal heterogeneities with a high resolution. The high resolution obtained in the fault surroundings permits lateral changes of Vp and Vp/Vs across the fault (velocity gradient) to be inferred. We observe that stable and unstable sliding sections of the SAF have different velocity gradients, small and negligible in the stable slip segment, but larger than 15% in the unstable slip segment. Our results suggest that Vp and Vp/Vs gradients across the fault control fault rheology and the attitude of fault slip behavior.

Lattice surgery on the Raussendorf lattice

NASA Astrophysics Data System (ADS)

Herr, Daniel; Paler, Alexandru; Devitt, Simon J.; Nori, Franco

2018-07-01

Lattice surgery is a method to perform quantum computation fault-tolerantly by using operations on boundary qubits between different patches of the planar code. This technique allows for universal planar code computation without eliminating the intrinsic two-dimensional nearest-neighbor properties of the surface code that eases physical hardware implementations. Lattice surgery approaches to algorithmic compilation and optimization have been demonstrated to be more resource efficient for resource-intensive components of a fault-tolerant algorithm, and consequently may be preferable over braid-based logic. Lattice surgery can be extended to the Raussendorf lattice, providing a measurement-based approach to the surface code. In this paper we describe how lattice surgery can be performed on the Raussendorf lattice and therefore give a viable alternative to computation using braiding in measurement-based implementations of topological codes.
Software fault tolerance in computer operating systems

NASA Technical Reports Server (NTRS)

Iyer, Ravishankar K.; Lee, Inhwan

1994-01-01

This chapter provides data and analysis of the dependability and fault tolerance for three operating systems: the Tandem/GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Based on measurements from these systems, basic software error characteristics are investigated. Fault tolerance in operating systems resulting from the use of process pairs and recovery routines is evaluated. Two levels of models are developed to analyze error and recovery processes inside an operating system and interactions among multiple instances of an operating system running in a distributed environment. The measurements show that the use of process pairs in Tandem systems, which was originally intended for tolerating hardware faults, allows the system to tolerate about 70% of defects in system software that result in processor failures. The loose coupling between processors which results in the backup execution (the processor state and the sequence of events occurring) being different from the original execution is a major reason for the measured software fault tolerance. The IBM/MVS system fault tolerance almost doubles when recovery routines are provided, in comparison to the case in which no recovery routines are available. However, even when recovery routines are provided, there is almost a 50% chance of system failure when critical system jobs are involved.
Performance and Fault-Tolerance of Neural Networks for Optimization

DTIC Science & Technology

1991-06-01

initialization to overcome the unstable equilibrium point at uij--O. "’ used the initial values Vij--0.5+6 with small, uniform noise _10-7򔄮 -7 . The...connectionist network: Investigations of acquired dyslexia . Technical Report CRG-TR-89-3, Dept. of Computer Science, University of Toronto, May 1989
A fault-tolerant intelligent robotic control system

NASA Technical Reports Server (NTRS)

Marzwell, Neville I.; Tso, Kam Sing

1993-01-01

This paper describes the concept, design, and features of a fault-tolerant intelligent robotic control system being developed for space and commercial applications that require high dependability. The comprehensive strategy integrates system level hardware/software fault tolerance with task level handling of uncertainties and unexpected events for robotic control. The underlying architecture for system level fault tolerance is the distributed recovery block which protects against application software, system software, hardware, and network failures. Task level fault tolerance provisions are implemented in a knowledge-based system which utilizes advanced automation techniques such as rule-based and model-based reasoning to monitor, diagnose, and recover from unexpected events. The two level design provides tolerance of two or more faults occurring serially at any level of command, control, sensing, or actuation. The potential benefits of such a fault tolerant robotic control system include: (1) a minimized potential for damage to humans, the work site, and the robot itself; (2) continuous operation with a minimum of uncommanded motion in the presence of failures; and (3) more reliable autonomous operation providing increased efficiency in the execution of robotic tasks and decreased demand on human operators for controlling and monitoring the robotic servicing routines.
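The recovery block idea that underlies the architecture described above can be sketched on a single node: try alternates in order and accept the first result that passes an acceptance test. This is a minimal illustration only. The function names and the toy square-root routines are hypothetical, and the distributed aspects (multiple nodes, network failures) are not modeled.

```python
def recovery_block(alternates, acceptance_test, inputs):
    """Run alternates in order; return the first result that passes the acceptance test."""
    for routine in alternates:
        try:
            result = routine(inputs)
        except Exception:
            continue  # a crash counts as a failed alternate
        if acceptance_test(inputs, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")

def primary(x):
    return x / 2.0   # hypothetical faulty primary: not a square root in general

def alternate(x):
    return x ** 0.5  # independently written backup routine

def accept(x, y):
    return abs(y * y - x) < 1e-9  # acceptance test: y squared must reproduce x

print(recovery_block([primary, alternate], accept, 9.0))
```

The acceptance test is the critical design choice here: it has to be simpler and more trustworthy than the routines it checks, otherwise it becomes the weakest link.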
A Low Mass Translation Mechanism for Planetary FTIR Spectrometry using an Ultrasonic Piezo Linear Motor

NASA Technical Reports Server (NTRS)

Heverly, Matthew; Dougherty, Sean; Toon, Geoffrey; Soto, Alejandro; Blavier, Jean-Francois

2004-01-01

One of the key components of a Fourier Transform Infrared Spectrometer (FTIR) is the linear translation stage used to vary the optical path length between the two arms of the interferometer. This translation mechanism must produce extremely constant velocity motion across its entire range of travel to allow the instrument to attain high signal-to-noise ratio and spectral resolving power. A new spectrometer is being developed at the Jet Propulsion Laboratory under NASA's Planetary Instrument Definition and Development Program (PIDDP). The goal of this project is to build upon existing spaceborne FTIR spectrometer technology to produce a new instrument prototype that has drastically superior spectral resolution and substantially lower mass, making it feasible for planetary exploration. In order to achieve these goals, Alliance Spacesystems, Inc. (ASI) has developed a linear translation mechanism using a novel ultrasonic piezo linear motor in conjunction with a fully kinematic, fault tolerant linear rail system. The piezo motor provides extremely smooth motion, is inherently redundant, and is capable of producing unlimited travel. The kinematic rail uses spherical Vespel(R) rollers and bushings, which eliminates the need for wet lubrication, while providing a fault tolerant platform for smooth linear motion that will not bind under misalignment or structural deformation. This system can produce velocities from 10 to 100 mm/s with less than 1% velocity error over the entire 100-mm length of travel for a total mechanism mass of less than 850 grams. This system has performed over half a million strokes under vacuum without excessive wear or degradation in performance. This paper covers the design, development, and testing of this linear translation mechanism as part of the Planetary Atmosphere Occultation Spectrometer (PAOS) instrument prototype development program.
Heterogeneous rupture on homogenous faults: Three-dimensional spontaneous rupture simulations with thermal pressurization

NASA Astrophysics Data System (ADS)

Urata, Yumi; Kuge, Keiko; Kase, Yuko

2008-11-01

To understand the role of fluid in earthquake rupture processes, we investigated the effects of thermal pressurization on the spatial variation of dynamic rupture by computing spontaneous rupture propagation on a rectangular fault. We found that thermal pressurization can cause heterogeneity of rupture even on a fault of uniform properties. On drained faults, tractions drop linearly with increasing slip in the same way everywhere. However, by changing the drained condition to an undrained one, the slip-weakening curves become non-linear and depend on locations on faults with small shear zone thickness w, and the dynamic frictional stresses vary spatially and temporally. Consequently, the super-shear transition fault length decreases for small w, and the final slip distribution can have some peaks regardless of w, especially on undrained faults. These effects should be taken into account when determining dynamic rupture parameters and modeling earthquake cycles when the presence of fluid is suggested in the source regions.
Method and system for environmentally adaptive fault tolerant computing

NASA Technical Reports Server (NTRS)

Copenhaver, Jason L. (Inventor); Jeremy, Ramos (Inventor); Wolfe, Jeffrey M. (Inventor); Brenner, Dean (Inventor)

2010-01-01

A method and system for adapting fault tolerant computing. The method includes the steps of measuring an environmental condition representative of an environment. An on-board processing system's sensitivity to the measured environmental condition is measured. It is determined whether to reconfigure a fault tolerance of the on-board processing system based in part on the measured environmental condition. The fault tolerance of the on-board processing system may be reconfigured based in part on the measured environmental condition.
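The measure-then-reconfigure loop described above can be sketched as a simple threshold rule. Everything concrete here is a hypothetical assumption made for illustration: the mode names, the use of a particle flux as the environmental condition, the linear sensitivity model, and the threshold values. The abstract does not specify any of them.

```python
# Hypothetical redundancy modes, ordered from least to most protective.
MODES = ["simplex", "duplex", "triple modular redundancy"]

def expected_upsets_per_hour(flux, sensitivity):
    """Assumed linear model: upset rate = measured flux x processor sensitivity."""
    return flux * sensitivity

def choose_mode(flux, sensitivity, thresholds=(0.01, 1.0)):
    """Pick a redundancy mode from the measured environment and sensitivity."""
    rate = expected_upsets_per_hour(flux, sensitivity)
    # Count how many thresholds the expected upset rate has reached.
    level = sum(rate >= t for t in thresholds)
    return MODES[level]

for flux in (10.0, 1e3, 1e5):
    print(flux, choose_mode(flux, sensitivity=1e-4))
```

A practical controller would likely add hysteresis so that the system does not switch modes repeatedly when the measurement hovers near a threshold.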
Joint University Program for Air Transportation Research, 1990-1991

NASA Technical Reports Server (NTRS)

Morrell, Frederick R. (Compiler)

1991-01-01

The goals of this program are consistent with the interests of both NASA and the FAA in furthering the safety and efficiency of the National Airspace System. Research carried out at the Massachusetts Institute of Technology (MIT), Ohio University, and Princeton University are covered. Topics studied include passive infrared ice detection for helicopters, the cockpit display of hazardous windshear information, fault detection and isolation for multisensor navigation systems, neural networks for aircraft system identification, and intelligent failure tolerant control.
Privacy-Assured Aggregation Protocol for Smart Metering: A Proactive Fault-Tolerant Approach [Proactive Fault-Tolerant Aggregation Protocol for Privacy-Assured Smart Metering

DOE PAGES

Won, Jongho; Ma, Chris Y. T.; Yau, David K. Y.; ...

2016-06-01

Smart meters are integral to demand response in emerging smart grids, by reporting the electricity consumption of users to serve application needs. But reporting real-time usage information for individual households raises privacy concerns. Existing techniques to guarantee differential privacy (DP) of smart meter users either are not fault tolerant or achieve (possibly partial) fault tolerance at high communication overheads. In this paper, we propose a fault-tolerant protocol for smart metering that can handle general communication failures while ensuring DP with significantly improved efficiency and lower errors compared with the state of the art. Our protocol handles fail-stop faults proactively by using a novel design of future ciphertexts, and distributes trust among the smart meters by sharing secret keys among them. We prove the DP properties of our protocol and analyze its advantages in fault tolerance, accuracy, and communication efficiency relative to competing techniques. We illustrate our analysis by simulations driven by real-world traces of electricity consumption.
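The idea of distributing trust by sharing secrets among meters can be illustrated with a generic pairwise-masking aggregation. This sketch is not the protocol of the paper: it omits the future ciphertexts and the differential privacy noise, and the modulus, function names, and readings are hypothetical. It uses `random.Random` only for reproducibility, where a real system would need a cryptographically secure source.

```python
import random

MODULUS = 2 ** 32  # hypothetical size of the masking group

def pairwise_masks(n_meters, rng):
    """Give every pair of meters (i, j), i < j, one shared random mask."""
    return {(i, j): rng.randrange(MODULUS)
            for i in range(n_meters) for j in range(i + 1, n_meters)}

def masked_report(i, reading, n_meters, masks):
    """Meter i hides its reading behind the masks it shares with the other meters."""
    value = reading
    for j in range(n_meters):
        if i < j:
            value += masks[(i, j)]   # the lower-indexed meter adds the shared mask
        elif j < i:
            value -= masks[(j, i)]   # the higher-indexed meter subtracts it
    return value % MODULUS

def aggregate(reports):
    """The masks cancel pairwise, so the sum of reports equals the sum of readings."""
    return sum(reports) % MODULUS

readings = [120, 75, 310, 42]  # made-up consumption values
rng = random.Random(0)
masks = pairwise_masks(len(readings), rng)
reports = [masked_report(i, r, len(readings), masks) for i, r in enumerate(readings)]
print(aggregate(reports))
```

In this naive scheme a single missing report leaves some masks uncancelled and the aggregate becomes unusable, which is the kind of fail-stop problem that motivates fault-tolerant designs.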
Privacy-Assured Aggregation Protocol for Smart Metering: A Proactive Fault-Tolerant Approach [Proactive Fault-Tolerant Aggregation Protocol for Privacy-Assured Smart Metering

DOE Office of Scientific and Technical Information (OSTI.GOV)

Won, Jongho; Ma, Chris Y. T.; Yau, David K. Y.

Smart meters are integral to demand response in emerging smart grids, by reporting the electricity consumption of users to serve application needs. But reporting real-time usage information for individual households raises privacy concerns. Existing techniques to guarantee differential privacy (DP) of smart meter users either are not fault tolerant or achieve (possibly partial) fault tolerance at high communication overheads. In this paper, we propose a fault-tolerant protocol for smart metering that can handle general communication failures while ensuring DP with significantly improved efficiency and lower errors compared with the state of the art. Our protocol handles fail-stop faults proactively by using a novel design of future ciphertexts, and distributes trust among the smart meters by sharing secret keys among them. We prove the DP properties of our protocol and analyze its advantages in fault tolerance, accuracy, and communication efficiency relative to competing techniques. We illustrate our analysis by simulations driven by real-world traces of electricity consumption.
On the design of fault-tolerant robotic manipulator systems

NASA Technical Reports Server (NTRS)

Tesar, Delbert

1993-01-01

Robotic systems are finding increasing use in space applications. Many of these devices are going to be operational on board the Space Station Freedom. Fault tolerance has been deemed necessary because of the criticality of the tasks and the inaccessibility of the systems to maintenance and repair. Design for fault tolerance in manipulator systems is an area within robotics that is without precedent in the literature. In this paper, we will attempt to lay down the foundations for such a technology. Design for fault tolerance demands new and special approaches to design, often at considerable variance from established design practices. These design aspects, together with reliability evaluation and modeling tools, are presented. Mechanical architectures that employ protective redundancies at many levels and have a modular architecture are then studied in detail. Once a mechanical architecture for fault tolerance has been derived, the chronological stages of operational fault tolerance are investigated. Failure detection, isolation, and estimation methods are surveyed, and such methods for robot sensors and actuators are derived. Failure recovery methods are also presented for each of the protective layers of redundancy. Failure recovery tactics often span all of the layers of a control hierarchy. Thus, a unified framework for decision-making and control, which orchestrates both the nominal redundancy management tasks and the failure management tasks, has been derived. The well-developed field of fault-tolerant computers is studied next, and some design principles relevant to the design of fault-tolerant robot controllers are abstracted. Conclusions are drawn, and a road map for the design of fault-tolerant manipulator systems is laid out with recommendations for a 10 DOF arm with dual actuators at each joint.
Software fault tolerance for real-time avionics systems

NASA Technical Reports Server (NTRS)

Anderson, T.; Knight, J. C.

1983-01-01

Avionics systems have very high reliability requirements and are therefore prime candidates for the inclusion of fault tolerance techniques. In order to provide tolerance to software faults, some form of state restoration is usually advocated as a means of recovery. State restoration can be very expensive for systems which utilize concurrent processes. The concurrency present in most avionics systems and the further difficulties introduced by timing constraints imply that providing tolerance for software faults may be inordinately expensive or complex. A straightforward pragmatic approach to software fault tolerance which is believed to be applicable to many real-time avionics systems is proposed. A classification system for software errors is presented together with approaches to recovery and continued service for each error type.
Switch failure diagnosis based on inductor current observation for boost converters

NASA Astrophysics Data System (ADS)

Jamshidpour, E.; Poure, P.; Saadate, S.

2016-09-01

Faced with the growing number of applications using DC-DC power converters, improving their reliability is the subject of an increasing number of studies. Especially in safety critical applications, designing fault-tolerant converters is becoming mandatory. In this paper, a switch fault-tolerant DC-DC converter is studied. First, some of the fastest Fault Detection Algorithms (FDAs) are recalled. Then, a fast switch FDA is proposed that can detect both types of failure: an open circuit fault as well as a short circuit fault can be detected in less than one switching period. Second, a fault-tolerant converter that can be reconfigured under those types of fault is introduced. Hardware-In-the-Loop (HIL) results and experimental validations are given to verify the proposed switch fault-tolerant approach in the case of a single switch DC-DC boost converter with one redundant switch.
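One simple way to reason about switch diagnosis from the inductor current is to compare the sign of the current slope with the gate command. The sketch below assumes a boost converter in continuous conduction mode, where the current should rise while the switch is on and fall while it is off. The function name, the sample values, and the `min_slope` noise margin are hypothetical, and the abstract does not give the details of the proposed algorithm.

```python
def diagnose_switch(gate_on, current_samples, min_slope=0.0):
    """Classify the switch from the gate command and inductor current samples.

    current_samples holds successive inductor current readings (amperes)
    taken within one on-interval or one off-interval of the switching period.
    """
    slope = current_samples[-1] - current_samples[0]
    if gate_on and slope < -min_slope:
        # Commanded on, but the current falls: the switch did not close.
        return "open-circuit fault"
    if not gate_on and slope > min_slope:
        # Commanded off, but the current keeps rising: the switch is stuck closed.
        return "short-circuit fault"
    return "healthy"

print(diagnose_switch(True, [1.0, 1.2, 1.4]))   # rising while on
print(diagnose_switch(True, [1.4, 1.2, 1.0]))   # falling while on
print(diagnose_switch(False, [1.0, 1.2, 1.4]))  # rising while off
```

Because the check needs only a few samples from a single interval, a rule of this kind is compatible with detection in less than one switching period, although a real implementation would have to handle measurement noise and switching transients.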
Distributed Fault-Tolerant Control of Networked Uncertain Euler-Lagrange Systems Under Actuator Faults.

PubMed

Chen, Gang; Song, Yongduan; Lewis, Frank L

2016-05-03

This paper investigates the distributed fault-tolerant control problem of networked Euler-Lagrange systems with actuator and communication link faults. An adaptive fault-tolerant cooperative control scheme is proposed to achieve the coordinated tracking control of networked uncertain Lagrange systems on a general directed communication topology, which contains a spanning tree with the root node being the active target system. The proposed algorithm is capable of compensating for the actuator bias fault, the partial loss of effectiveness actuation fault, the communication link fault, the model uncertainty, and the external disturbance simultaneously. The control scheme does not use any fault detection and isolation mechanism to detect, separate, and identify the actuator faults online, which largely reduces the online computation and expedites the responsiveness of the controller. To validate the effectiveness of the proposed method, a test-bed of multiple robot-arm cooperative control system is developed for real-time verification. Experiments on the networked robot-arms are conducted and the results confirm the benefits and the effectiveness of the proposed distributed fault-tolerant control algorithms.
The Design of a Fault-Tolerant COTS-Based Bus Architecture

NASA Technical Reports Server (NTRS)

Chau, Savio N.; Alkalai, Leon; Burt, John B.; Tai, Ann T.

1999-01-01

In this paper, we report our experiences and findings on the design of a fault-tolerant bus architecture comprised of two COTS buses, the IEEE 1394 and the I2C. This fault-tolerant bus is the backbone system bus for the avionics architecture of the X2000 program at the Jet Propulsion Laboratory. COTS buses are attractive because of the availability of low cost commercial products. However, they are not specifically designed for highly reliable applications such as long-life deep-space missions. The X2000 design team has devised a multi-level fault tolerance approach to compensate for this shortcoming of COTS buses. First, the approach enhances the fault tolerance capabilities of the IEEE 1394 and I2C buses by adding a layer of fault handling hardware and software. Second, algorithms are developed to enable the IEEE 1394 and the I2C buses to assist each other in isolating and recovering from faults. Third, the set of IEEE 1394 and I2C buses is duplicated to further enhance system reliability. The X2000 design team has paid special attention to guarantee that all fault tolerance provisions will not cause the bus design to deviate from the commercial standard specifications. Otherwise, the economic attractiveness of using COTS will be diminished. The hardware and software design of the X2000 fault-tolerant bus are being implemented and flight hardware will be delivered to the ST4 and Europa Orbiter missions.
Modeling the Fault Tolerant Capability of a Flight Control System: An Exercise in SCR Specification

NASA Technical Reports Server (NTRS)

Alexander, Chris; Cortellessa, Vittorio; DelGobbo, Diego; Mili, Ali; Napolitano, Marcello

2000-01-01

In life-critical and mission-critical applications, it is important to make provisions for a wide range of contingencies, by providing means for fault tolerance. In this paper, we discuss the specification of a flight control system that is fault tolerant with respect to sensor faults. Redundancy is provided by analytical relations that hold between sensor readings; depending on the conditions, this redundancy can be used to detect, identify and accommodate sensor faults.
Design study of Software-Implemented Fault-Tolerance (SIFT) computer

NASA Technical Reports Server (NTRS)

Wensley, J. H.; Goldberg, J.; Green, M. W.; Kutz, W. H.; Levitt, K. N.; Mills, M. E.; Shostak, R. E.; Whiting-Okeefe, P. M.; Zeidler, H. M.

1982-01-01

Software-implemented fault tolerant (SIFT) computer design for commercial aviation is reported. A SIFT design concept is addressed. Alternate strategies for physical implementation are considered. Hardware and software design correctness is addressed. System modeling and effectiveness evaluation are considered from a fault-tolerant point of view.
Linear complementarity formulation for 3D frictional sliding problems

USGS Publications Warehouse

Kaven, Joern; Hickman, Stephen H.; Davatzes, Nicholas C.; Mutlu, Ovunc

2012-01-01

Frictional sliding on quasi-statically deforming faults and fractures can be modeled efficiently using a linear complementarity formulation. We review the formulation in two dimensions and expand the formulation to three-dimensional problems including problems of orthotropic friction. This formulation accurately reproduces analytical solutions to static Coulomb friction sliding problems. The formulation accounts for opening displacements that can occur near regions of non-planarity even under large confining pressures. Such problems are difficult to solve owing to the coupling of relative displacements and tractions; thus, many geomechanical problems tend to neglect these effects. Simple test cases highlight the importance of including friction and allowing for opening when solving quasi-static fault mechanics models. These results also underscore the importance of considering the effects of non-planarity in modeling processes associated with crustal faulting.
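For reference, the generic linear complementarity problem that underlies a formulation of this kind can be written as below. The symbols M, q, z, and w are standard textbook notation rather than the authors' own, and reading z as a slip or opening variable and w as the complementary traction or gap quantity is an illustrative assumption.

```latex
\text{Find } z \in \mathbb{R}^{n} \text{ such that} \quad
w = M z + q, \qquad z \ge 0, \qquad w \ge 0, \qquad z^{\mathsf{T}} w = 0 .
```

The complementarity condition forces each pair (z_i, w_i) to have at least one zero entry, which is how a stick-or-slip or open-or-closed alternative can be encoded without enumerating cases.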
Use of non-adiabatic geometric phase for quantum computing by NMR.

PubMed

Das, Ranabir; Kumar, S K Karthick; Kumar, Anil

2005-12-01

Geometric phases have stimulated researchers because of their potential applications in many areas of science. One of them is fault-tolerant quantum computation. A preliminary requisite of quantum computation is the implementation of controlled dynamics of qubits. In controlled dynamics, one qubit undergoes coherent evolution and acquires an appropriate phase, depending on the state of the other qubits. If the evolution is geometric, then the phase acquired depends only on the geometry of the path executed, and is robust against certain types of error. This phenomenon leads to an inherently fault-tolerant quantum computation. Here we suggest a technique of using non-adiabatic geometric phase for quantum computation, using selective excitation. In a two-qubit system, we selectively evolve a suitable subsystem where the control qubit is in state |1⟩, through a closed circuit. By this evolution, the target qubit gains a phase controlled by the state of the control qubit. Using the non-adiabatic geometric phase, we demonstrate implementation of the Deutsch-Jozsa algorithm and Grover's search algorithm in a two-qubit system.
14 CFR Special Federal Aviation... - Fuel Tank System Fault Tolerance Evaluation Requirements

Code of Federal Regulations, 2014 CFR

2014-01-01

... 14 Aeronautics and Space 1 2014-01-01 2014-01-01 false Fuel Tank System Fault Tolerance Evaluation Requirements Federal Special Federal Aviation Regulation No. 88 Aeronautics and Space FEDERAL AVIATION..., SFAR No. 88 Special Federal Aviation Regulation No. 88—Fuel Tank System Fault Tolerance Evaluation...

14 CFR Special Federal Aviation... - Fuel Tank System Fault Tolerance Evaluation Requirements

Code of Federal Regulations, 2011 CFR

2011-01-01

... 14 Aeronautics and Space 1 2011-01-01 2011-01-01 false Fuel Tank System Fault Tolerance Evaluation Requirements Federal Special Federal Aviation Regulation No. 88 Aeronautics and Space FEDERAL AVIATION..., SFAR No. 88 Special Federal Aviation Regulation No. 88—Fuel Tank System Fault Tolerance Evaluation...
14 CFR Special Federal Aviation... - Fuel Tank System Fault Tolerance Evaluation Requirements

Code of Federal Regulations, 2012 CFR

2012-01-01

... 14 Aeronautics and Space 1 2012-01-01 2012-01-01 false Fuel Tank System Fault Tolerance Evaluation Requirements Federal Special Federal Aviation Regulation No. 88 Aeronautics and Space FEDERAL AVIATION..., SFAR No. 88 Special Federal Aviation Regulation No. 88—Fuel Tank System Fault Tolerance Evaluation...
14 CFR Special Federal Aviation... - Fuel Tank System Fault Tolerance Evaluation Requirements

Code of Federal Regulations, 2010 CFR

2010-01-01

... 14 Aeronautics and Space 1 2010-01-01 2010-01-01 false Fuel Tank System Fault Tolerance Evaluation Requirements Federal Special Federal Aviation Regulation No. 88 Aeronautics and Space FEDERAL AVIATION..., SFAR No. 88 Special Federal Aviation Regulation No. 88—Fuel Tank System Fault Tolerance Evaluation...
14 CFR Special Federal Aviation... - Fuel Tank System Fault Tolerance Evaluation Requirements

Code of Federal Regulations, 2013 CFR

2013-01-01

... 14 Aeronautics and Space 1 2013-01-01 2013-01-01 false Fuel Tank System Fault Tolerance Evaluation Requirements Federal Special Federal Aviation Regulation No. 88 Aeronautics and Space FEDERAL AVIATION..., SFAR No. 88 Special Federal Aviation Regulation No. 88—Fuel Tank System Fault Tolerance Evaluation...
Investigation of Air Transportation Technology at Princeton University, 1989-1990

NASA Technical Reports Server (NTRS)

Stengel, Robert F.

1990-01-01

The Air Transportation Technology Program at Princeton University proceeded along six avenues during the past year: microburst hazards to aircraft; machine-intelligent, fault tolerant flight control; computer aided heuristics for piloted flight; stochastic robustness for flight control systems; neural networks for flight control; and computer aided control system design. These topics are briefly discussed, and an annotated bibliography of publications that appeared between January 1989 and June 1990 is given.
A second generation experiment in fault-tolerant software

NASA Technical Reports Server (NTRS)

Knight, J. C.

1986-01-01

The primary goal was to determine whether the application of fault tolerance to software increases its reliability if the cost of production is the same as for an equivalent non-fault-tolerant version derived from the same requirements specification. Software development protocols are discussed. The feasibility of adapting the technique of N-fold Modular Redundancy with majority voting to software design fault tolerance was studied.
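A minimal sketch of majority voting over the outputs of N independently developed versions may help illustrate the technique named above. The function name `majority_vote` and the convention of returning `None` when no strict majority exists are assumptions made for illustration, not details taken from the report.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of versions, else None.

    `outputs` is a list with one (hashable) result per software version.
    Returning None on disagreement is an illustrative convention only.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # A strict majority needs more than half of all versions to agree.
    return value if count > len(outputs) / 2 else None


if __name__ == "__main__":
    print(majority_vote([42, 42, 41]))  # two of three versions agree
    print(majority_vote([1, 2, 3]))     # no majority among the versions
```

A voter of this kind masks a fault only while faulty versions remain a minority and do not fail identically, which is why independence between versions matters in experiments like this one.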
An experiment in software reliability

NASA Technical Reports Server (NTRS)

Dunham, J. R.; Pierce, J. L.

1986-01-01

The results of a software reliability experiment conducted in a controlled laboratory setting are reported. The experiment was undertaken to gather data on software failures and is one in a series of experiments being pursued by the Fault Tolerant Systems Branch of NASA Langley Research Center to find a means of credibly performing reliability evaluations of flight control software. The experiment tests a small sample of implementations of radar tracking software having ultra-reliability requirements and uses n-version programming for error detection, and repetitive run modeling for failure and fault rate estimation. The experiment results agree with those of Nagel and Skrivan in that the program error rates suggest an approximate log-linear pattern and the individual faults occurred with significantly different error rates. Additional analysis of the experimental data raises new questions concerning the phenomenon of interacting faults. This phenomenon may provide one explanation for software reliability decay.
A tutorial on the CARE III approach to reliability modeling. [of fault tolerant avionics and control systems]

NASA Technical Reports Server (NTRS)

Trivedi, K. S.; Geist, R. M.

1981-01-01

The CARE 3 reliability model for aircraft avionics and control systems is described by utilizing a number of examples which frequently use state-of-the-art mathematical modeling techniques as a basis for their exposition. Behavioral decomposition followed by aggregation was used in an attempt to deal with reliability models with a large number of states. A comprehensive set of models of the fault-handling processes in a typical fault-tolerant system was used. These models were semi-Markov in nature, thus removing the usual restrictions of exponential holding times within the coverage model. The aggregate model is a non-homogeneous Markov chain, thus allowing the times to failure to possess Weibull-like distributions. Because of the departures from traditional models, the solution method employed is that of Kolmogorov integral equations, which are evaluated numerically.
A Byzantine-Fault Tolerant Self-Stabilizing Protocol for Distributed Clock Synchronization Systems

NASA Technical Reports Server (NTRS)

Malekpour, Mahyar R.

2006-01-01

Embedded distributed systems have become an integral part of safety-critical computing applications, necessitating system designs that incorporate fault tolerant clock synchronization in order to achieve ultra-reliable assurance levels. Many efficient clock synchronization protocols do not, however, address Byzantine failures, and most protocols that do tolerate Byzantine failures do not self-stabilize. The Byzantine self-stabilizing clock synchronization algorithms that exist in the literature are based either on unjustifiably strong assumptions about initial synchrony of the nodes or on the existence of a common pulse at the nodes. The Byzantine self-stabilizing clock synchronization protocol presented here does not rely on any assumptions about the initial state of the clocks. Furthermore, there is neither a central clock nor an externally generated pulse system. The proposed protocol converges deterministically, is scalable, and self-stabilizes in a short amount of time. The convergence time is linear with respect to the self-stabilization period. Proofs of the correctness of the protocol as well as the results of formal verification efforts are reported.
AADL and Model-based Engineering

DTIC Science & Technology

2014-10-20

and MBE Feiler, Oct 20, 2014 © 2014 Carnegie Mellon University We Rely on Software for Safe Aircraft Operation Embedded software systems ...Developer Compute Platform Runtime Architecture Application Software Embedded SW System Engineer Data Stream Characteristics Latency...confusion Hardware Engineer Why do system level failures still occur despite fault tolerance techniques being deployed in systems? Embedded software
An Efficient Silent Data Corruption Detection Method with Error-Feedback Control and Even Sampling for HPC Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Di, Sheng; Berrocal, Eduardo; Cappello, Franck

The silent data corruption (SDC) problem is attracting more and more attention because it is expected to have a great impact on exascale HPC applications. SDC faults are hazardous in that they pass unnoticed by hardware and can lead to wrong computation results. In this work, we formulate SDC detection as a runtime one-step-ahead prediction method, leveraging multiple linear prediction methods in order to improve the detection results. The contributions are twofold: (1) we propose an error feedback control model that can reduce the prediction errors for different linear prediction methods, and (2) we propose a spatial-data-based even-sampling method to minimize the detection overheads (including memory and computation cost). We implement our algorithms in the fault tolerance interface, a fault tolerance library with multiple checkpoint levels, such that users can conveniently protect their HPC applications against both SDC errors and fail-stop errors. We evaluate our approach by using large-scale traces from well-known, large-scale HPC applications, as well as by running those HPC applications on a real cluster environment. Experiments show that our error feedback control model can improve detection sensitivity by 34-189% for bit-flip memory errors injected with the bit positions in the range [20,30], without any degradation in detection accuracy. Furthermore, memory size can be reduced by 33% with our spatial-data even-sampling method, with only a slight and graceful degradation in the detection sensitivity.
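The idea of detecting corruption through one-step-ahead prediction can be sketched as follows. This is not the authors' algorithm: the linear extrapolation predictor, the fixed `threshold`, and the choice to substitute the predicted value for a flagged sample are all simplifying assumptions, and the error feedback control and even-sampling contributions are not reproduced.

```python
def detect_sdc(series, threshold):
    """Flag indices whose value deviates from a one-step-ahead linear prediction.

    The prediction extrapolates from the two most recent trusted values.
    A flagged sample is replaced by its prediction in the trusted history so
    that one corrupted value does not distort later predictions (an
    illustrative choice, not part of the cited work).
    """
    trusted = list(series[:2])
    flagged = []
    for i in range(2, len(series)):
        predicted = 2 * trusted[-1] - trusted[-2]  # linear extrapolation
        if abs(series[i] - predicted) > threshold:
            flagged.append(i)
            trusted.append(predicted)
        else:
            trusted.append(series[i])
    return flagged


if __name__ == "__main__":
    clean = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    corrupted = list(clean)
    corrupted[3] = 30.0  # emulate a silent corruption of one data point
    print(detect_sdc(clean, threshold=0.5))
    print(detect_sdc(corrupted, threshold=0.5))
```

The threshold controls the trade-off between missed corruptions and false alarms, which is the trade-off the abstract describes in terms of detection sensitivity and accuracy.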
Fault tolerance of artificial neural networks with applications in critical systems

NASA Technical Reports Server (NTRS)

Protzel, Peter W.; Palumbo, Daniel L.; Arras, Michael K.

1992-01-01

This paper investigates the fault tolerance characteristics of time continuous recurrent artificial neural networks (ANN) that can be used to solve optimization problems. The principle of operations and performance of these networks are first illustrated by using well-known model problems like the traveling salesman problem and the assignment problem. The ANNs are then subjected to 13 simultaneous 'stuck at 1' or 'stuck at 0' faults for network sizes of up to 900 'neurons'. The effects of these faults are demonstrated and the cause for the observed fault tolerance is discussed. An application is presented in which a network performs a critical task for a real-time distributed processing system by generating new task allocations during the reconfiguration of the system. The performance degradation of the ANN under the presence of faults is investigated by large-scale simulations, and the potential benefits of delegating a critical task to a fault tolerant network are discussed.
Fault-tolerant dynamic task graph scheduling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kurt, Mehmet C.; Krishnamoorthy, Sriram; Agrawal, Kunal

2014-11-16

In this paper, we present an approach to fault tolerant execution of dynamic task graphs scheduled using work stealing. In particular, we focus on selective and localized recovery of tasks in the presence of soft faults. We elicit from the user the basic task graph structure in terms of successor and predecessor relationships. The work stealing-based algorithm to schedule such a task graph is augmented to enable recovery when the data and meta-data associated with a task get corrupted. We use this redundancy, and the knowledge of the task graph structure, to selectively recover from faults with low space and time overheads. We show that the fault tolerant design retains the essential properties of the underlying work stealing-based task scheduling algorithm, and that the fault tolerant execution is asymptotically optimal when task re-execution is taken into account. Experimental evaluation demonstrates the low cost of recovery under various fault scenarios.
Design of on-board Bluetooth wireless network system based on fault-tolerant technology

NASA Astrophysics Data System (ADS)

You, Zheng; Zhang, Xiangqi; Yu, Shijie; Tian, Hexiang

2007-11-01

In this paper, Bluetooth wireless data transmission technology is applied in an on-board computer system to realize wireless data transmission between peripherals of the micro-satellite integrated electronic system. In view of the high reliability demanded of a micro-satellite, a design of a Bluetooth wireless network based on fault-tolerant technology is introduced. The reliability of two fault-tolerant systems is first estimated using a Markov model; then the structural design of this fault-tolerant system is introduced. Several protocols are established to make the system operate correctly, and some related problems are listed and analyzed, with emphasis on the Fault Auto-diagnosis System, the Active-standby switch design, and the Data-Integrity process.
Spacecraft fault tolerance: The Magellan experience

NASA Technical Reports Server (NTRS)

Kasuda, Rick; Packard, Donna Sexton

1993-01-01

Interplanetary and earth orbiting missions are now imposing unique fault tolerant requirements upon spacecraft design. Mission success is the prime motivator for building spacecraft with fault tolerant systems. The Magellan spacecraft had many such requirements imposed upon its design. Magellan met these requirements by building redundancy into all the major subsystem components and designing the onboard hardware and software with the capability to detect a fault, isolate it to a component, and issue commands to achieve a back-up configuration. This discussion is limited to fault protection, which is the autonomous capability to respond to a fault. The Magellan fault protection design is discussed, as well as the developmental and flight experiences and a summary of the lessons learned.
Development of N-version software samples for an experiment in software fault tolerance

NASA Technical Reports Server (NTRS)

Lauterbach, L.

1987-01-01

The report documents the task planning and software development phases of an effort to obtain twenty versions of code independently designed and developed from a common specification. These versions were created for use in future experiments in software fault tolerance, in continuation of the experimental series underway at the Systems Validation Methods Branch (SVMB) at NASA Langley Research Center. The 20 versions were developed under controlled conditions at four U.S. universities, by 20 teams of two researchers each. The versions process raw data from a modified Redundant Strapped Down Inertial Measurement Unit (RSDIMU). The specifications, and over 200 questions submitted by the developers concerning the specifications, are included as appendices to this report. Design documents, and design and code walkthrough reports for each version, were also obtained in this task for use in future studies.
Matrix Algebra for GPU and Multicore Architectures (MAGMA) for Large Petascale Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dongarra, Jack J.; Tomov, Stanimire

2014-03-24

The goal of the MAGMA project is to create a new generation of linear algebra libraries that achieve the fastest possible time to an accurate solution on hybrid Multicore+GPU-based systems, using all the processing power that future high-end systems can make available within given energy constraints. Our efforts at the University of Tennessee achieved the goals set in all of the five areas identified in the proposal: 1. Communication optimal algorithms; 2. Autotuning for GPU and hybrid processors; 3. Scheduling and memory management techniques for heterogeneity and scale; 4. Fault tolerance and robustness for large scale systems; 5. Building energy efficiency into software foundations. The University of Tennessee’s main contributions, as proposed, were the research and software development of new algorithms for hybrid multi/many-core CPUs and GPUs, as related to two-sided factorizations and complete eigenproblem solvers, hybrid BLAS, and energy efficiency for dense, as well as sparse, operations. Furthermore, as proposed, we investigated and experimented with various techniques targeting the five main areas outlined.
An optimized implementation of a fault-tolerant clock synchronization circuit

NASA Technical Reports Server (NTRS)

Torres-Pomales, Wilfredo

1995-01-01

A fault-tolerant clock synchronization circuit was designed and tested. A comparison to a previous design and the procedure followed to achieve the current optimization are included. The report also includes a description of the system and the results of tests performed to study the synchronization and fault-tolerant characteristics of the implementation.
VLSI Implementation of Fault Tolerance Multiplier based on Reversible Logic Gate

NASA Astrophysics Data System (ADS)

Ahmad, Nabihah; Hakimi Mokhtar, Ahmad; Othman, Nurmiza binti; Fhong Soon, Chin; Rahman, Ab Al Hadi Ab

2017-08-01

A multiplier is an essential component of digital systems such as digital signal processors, microprocessors, and quantum computers, and is widely used in arithmetic units. Because of the complexity of the multiplier, its tendency toward errors is very high. This paper aims to design a 2×2-bit fault-tolerant multiplier based on reversible logic gates with low power consumption and high performance. The design was implemented using 90 nm Complementary Metal Oxide Semiconductor (CMOS) technology in Synopsys Electronic Design Automation (EDA) tools. The multiplier architecture is implemented using reversible logic gates. The fault-tolerant multiplier uses a combination of three reversible logic gates, the Double Feynman gate (F2G), the New Fault Tolerance (NFT) gate, and the Islam Gate (IG), and occupies an area of 160 μm × 420.3 μm (approximately 0.067 mm²). The design achieves a power consumption of 122.85 μW and a propagation delay of 16.99 ns. The proposed multiplier achieves low power consumption and high performance, which, together with its fault tolerance capabilities, makes it suitable for modern computing applications.
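As a small illustration of why gates such as the Double Feynman gate (F2G) are described as fault tolerant, the sketch below models F2G using its commonly cited definition from the reversible logic literature, (A, B, C) → (A, A⊕B, A⊕C). That definition is not stated in the abstract, so treat it as an assumption; the NFT and IG gates and the multiplier itself are not modelled.

```python
from itertools import product


def double_feynman(a, b, c):
    """Double Feynman gate (F2G) as commonly defined: (A, A xor B, A xor C)."""
    return a, a ^ b, a ^ c


# Enumerate all 3-bit inputs and the corresponding gate outputs.
inputs = list(product((0, 1), repeat=3))
outputs = [double_feynman(*bits) for bits in inputs]

# Reversible: the mapping from inputs to outputs is one-to-one.
is_reversible = len(set(outputs)) == len(inputs)

# Parity preserving: the XOR of the inputs equals the XOR of the outputs,
# so a single-bit error at the outputs changes the parity and is detectable.
preserves_parity = all(
    (a ^ b ^ c) == (p ^ q ^ r)
    for (a, b, c), (p, q, r) in zip(inputs, outputs)
)

if __name__ == "__main__":
    print("reversible:", is_reversible)
    print("parity preserving:", preserves_parity)
```

Parity preservation is the property usually meant by "fault tolerant" for reversible gates: it supports detection of single-bit errors, not their correction.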
An Integrated Fault Tolerant Robotic Controller System for High Reliability and Safety

NASA Technical Reports Server (NTRS)

Marzwell, Neville I.; Tso, Kam S.; Hecht, Myron

1994-01-01

This paper describes the concepts and features of a fault-tolerant intelligent robotic control system being developed for applications that require high dependability (reliability, availability, and safety). The system consists of two major elements: a fault-tolerant controller and an operator workstation. The fault-tolerant controller uses a strategy which allows for detection and recovery of hardware, operating system, and application software failures. The fault-tolerant controller can be used by itself in a wide variety of applications in industry, process control, and communications. The controller in combination with the operator workstation can be applied to robotic applications such as spaceborne extravehicular activities, hazardous materials handling, inspection and maintenance of high value items (e.g., space vehicles, reactor internals, or aircraft), medicine, and other tasks where a robot system failure poses a significant risk to life or property.

Reliability of Fault Tolerant Control Systems. Part 1

NASA Technical Reports Server (NTRS)

Wu, N. Eva

2001-01-01

This paper reports Part I of a two-part effort intended to delineate the relationship between reliability and fault tolerant control in a quantitative manner. Reliability analysis of fault-tolerant control systems is performed using Markov models. Reliability properties peculiar to fault-tolerant control systems are emphasized. As a consequence, coverage of failures through redundancy management can be severely limited. It is shown that in the early life of a system composed of highly reliable subsystems, the reliability of the overall system is affine with respect to coverage, and inadequate coverage induces dominant single point failures. The utility of some existing software tools for assessing the reliability of fault tolerant control systems is also discussed. Coverage modeling is attempted in Part II in a way that captures its dependence on the control performance and on the diagnostic resolution.
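The affine dependence of reliability on coverage can be seen in a very small Markov-style example. The sketch below assumes a hypothetical duplex system (two identical active units, constant failure rate `lam`, and a first failure that is handled successfully with probability `coverage`); this toy model is chosen for illustration and is not the model analyzed in the paper.

```python
import math


def duplex_reliability(lam, t, coverage):
    """Reliability at time t of a hypothetical two-unit system with coverage.

    The system survives if both units are still working, or if exactly one
    unit has failed, that failure was covered, and the other unit survives.
    """
    both_alive = math.exp(-2 * lam * t)
    one_failed_covered = (
        2 * coverage * math.exp(-lam * t) * (1 - math.exp(-lam * t))
    )
    return both_alive + one_failed_covered


if __name__ == "__main__":
    lam, t = 1e-4, 10.0  # assumed failure rate per hour and mission time
    for c in (0.0, 0.9, 0.99, 1.0):
        exact = duplex_reliability(lam, t, c)
        early_life = 1 - 2 * (1 - c) * lam * t  # first-order approximation
        print(c, exact, early_life)
```

In this toy model the reliability is exactly affine in the coverage, and for small `lam * t` the unreliability is approximately `2 * (1 - coverage) * lam * t`, so uncovered failures behave like a dominant single point failure.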
Distributed asynchronous microprocessor architectures in fault tolerant integrated flight systems

NASA Technical Reports Server (NTRS)

Dunn, W. R.

1983-01-01

The paper discusses the implementation of fault tolerant digital flight control and navigation systems for rotorcraft application. It is shown that in implementing fault tolerance at the systems level using advanced LSI/VLSI technology, aircraft physical layout and flight systems requirements tend to define a system architecture of distributed, asynchronous microprocessors in which fault tolerance can be achieved locally through hardware redundancy and/or globally through application of analytical redundancy. The effects of asynchronism on the execution of dynamic flight software are discussed. It is shown that if the asynchronous microprocessors have knowledge of time, these errors can be significantly reduced through appropriate modifications of the flight software. Finally, the paper extends previous work to show that through the combined use of time referencing and stable flight algorithms, individual microprocessors can be configured to autonomously tolerate intermittent faults.
Advanced Development for Space Robotics With Emphasis on Fault Tolerance Technology

NASA Technical Reports Server (NTRS)

Tesar, Delbert

1997-01-01

This report describes work developing fault tolerant redundant robotic architectures and adaptive control strategies for robotic manipulator systems which can dynamically accommodate drastic robot manipulator mechanism, sensor or control failures and maintain stable end-point trajectory control with minimum disturbance. Kinematic designs of redundant, modular, reconfigurable arms for fault tolerance were pursued at a fundamental level. The approach developed robotic testbeds to evaluate disturbance responses of fault tolerant concepts in robotic mechanisms and controllers. The development was implemented in various fault tolerant mechanism testbeds including duality in the joint servo motor modules, parallel and serial structural architectures, and dual arms. All have real-time adaptive controller technologies to react to mechanism or controller disturbances (failures) to perform real-time reconfiguration to continue the task operations. The developments fall into three main areas: hardware, software, and theoretical.
Simulated fault injection - A methodology to evaluate fault tolerant microprocessor architectures

NASA Technical Reports Server (NTRS)

Choi, Gwan S.; Iyer, Ravishankar K.; Carreno, Victor A.

1990-01-01

A simulation-based fault-injection method for validating fault-tolerant microprocessor architectures is described. The approach uses mixed-mode simulation (electrical/logic analysis), and injects transient errors in run-time to assess the resulting fault impact. As an example, a fault-tolerant architecture which models the digital aspects of a dual-channel real-time jet-engine controller is used. The level of effectiveness of the dual configuration with respect to single and multiple transients is measured. The results indicate 100 percent coverage of single transients. Approximately 12 percent of the multiple transients affect both channels; none result in controller failure since two additional levels of redundancy exist.
Fault-tolerant measurement-based quantum computing with continuous-variable cluster states.

PubMed

Menicucci, Nicolas C

2014-03-28

A long-standing open question about Gaussian continuous-variable cluster states is whether they enable fault-tolerant measurement-based quantum computation. The answer is yes. Initial squeezing in the cluster above a threshold value of 20.5 dB ensures that errors from finite squeezing acting on encoded qubits are below the fault-tolerance threshold of known qubit-based error-correcting codes. By concatenating with one of these codes and using ancilla-based error correction, fault-tolerant measurement-based quantum computation of theoretically indefinite length is possible with finitely squeezed cluster states.
Sequoia: A fault-tolerant tightly coupled multiprocessor for transaction processing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bernstein, P.A.

1988-02-01

The Sequoia computer is a tightly coupled multiprocessor, and thus attains the performance advantages of this style of architecture. It avoids most of the fault-tolerance disadvantages of tight coupling by using a new fault-tolerance design. The Sequoia architecture is similar to other multimicroprocessor architectures, such as those of Encore and Sequent, in that it gives dozens of microprocessors shared access to a large main memory. It resembles the Stratus architecture in its extensive use of hardware fault-detection techniques. It resembles Stratus and Auragen in its ability to quickly recover all processes after a single point failure, transparently to the user. However, Sequoia is unique in its combination of a large-scale tightly coupled architecture with a hardware approach to fault tolerance. This article gives an overview of how the hardware architecture and operating systems (OS) work together to provide a high degree of fault tolerance with good system performance.
A Convex Approach to Fault Tolerant Control

NASA Technical Reports Server (NTRS)

Maghami, Peiman G.; Cox, David E.; Bauer, Frank (Technical Monitor)

2002-01-01

The design of control laws for dynamic systems with the potential for actuator failures is considered in this work. The use of Linear Matrix Inequalities allows more freedom in controller design criteria than typically available with robust control. This work proposes an extension of fault-scheduled control design techniques that can find a fixed controller with provable performance over a set of plants. Through convexity of the objective function, performance bounds on this set of plants imply performance bounds on a range of systems defined by a convex hull. This is used to incorporate performance bounds for a variety of soft and hard failures into the control design problem.
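The convex-hull argument can be illustrated with the simplest Linear Matrix Inequality, the Lyapunov stability condition. The closed-loop matrices A_i (one per nominal or failed plant) and the common matrix P are notation assumed here for illustration; the paper's actual performance criteria are not reproduced.

```latex
\text{If } P \succ 0 \text{ satisfies } A_i^{\mathsf{T}} P + P A_i \prec 0
\text{ for } i = 1, \dots, N, \text{ then for any }
A = \sum_{i=1}^{N} \alpha_i A_i, \quad \alpha_i \ge 0, \quad \sum_{i=1}^{N} \alpha_i = 1,
\\[4pt]
A^{\mathsf{T}} P + P A \;=\; \sum_{i=1}^{N} \alpha_i \left( A_i^{\mathsf{T}} P + P A_i \right) \;\prec\; 0 .
```

Because the inequality is linear in the plant matrix, checking it at the N vertices is enough to certify every plant in their convex hull, for example a partial actuator failure lying between the healthy and fully failed cases.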
Advanced information processing system - Status report. [for fault tolerant and damage tolerant data processing for aerospace vehicles

NASA Technical Reports Server (NTRS)

Brock, L. D.; Lala, J.

1986-01-01

The Advanced Information Processing System (AIPS) is designed to provide a fault tolerant and damage tolerant data processing architecture for a broad range of aerospace vehicles. The AIPS architecture also has attributes to enhance system effectiveness such as graceful degradation, growth and change tolerance, integrability, etc. Two key building blocks being developed by the AIPS program are a fault and damage tolerant processor and communication network. A proof-of-concept system is now being built and will be tested to demonstrate the validity and performance of the AIPS concepts.
Error Mitigation of Point-to-Point Communication for Fault-Tolerant Computing

NASA Technical Reports Server (NTRS)

Akamine, Robert L.; Hodson, Robert F.; LaMeres, Brock J.; Ray, Robert E.

2011-01-01

Fault tolerant systems require the ability to detect and recover from physical damage caused by the hardware's environment, faulty connectors, and system degradation over time. This ability applies to military, space, and industrial computing applications. The integrity of Point-to-Point (P2P) communication, between two microcontrollers for example, is an essential part of fault tolerant computing systems. In this paper, different methods of fault detection and recovery are presented and analyzed.
A survey of NASA and military standards on fault tolerance and reliability applied to robotics

NASA Technical Reports Server (NTRS)

Cavallaro, Joseph R.; Walker, Ian D.

1994-01-01

There is currently increasing interest and activity in the area of reliability and fault tolerance for robotics. This paper discusses the application of Standards in robot reliability, and surveys the literature of relevant existing standards. A bibliography of relevant Military and NASA standards for reliability and fault tolerance is included.
Power Supply Fault Tolerant Reliability Study

DTIC Science & Technology

1991-04-01

easier to design than for equivalent bipolar transistors. MCDONNELL DOUGLAS ELECTRONICS SYSTEMS COMPANY 9. Base circuitry should be designed to drive...SWITCHING REGULATORS (Ref. 28), SWITCHING AND LINEAR POWER SUPPLY DESIGN (Ref. 25) 6. Sequence the turn-off/turn-on logic in an orderly and controllable ...for equivalent bipolar transistors. MCDONNELL DOUGLAS ELECTRONICS SYSTEMS COMPANY 8. Base circuitry should be designed to drive the transistor into
Abstractions for Fault-Tolerant Distributed System Verification

NASA Technical Reports Server (NTRS)

Pike, Lee S.; Maddalon, Jeffrey M.; Miner, Paul S.; Geser, Alfons

2004-01-01

Four kinds of abstraction for the design and analysis of fault tolerant distributed systems are discussed. These abstractions concern system messages, faults, fault masking voting, and communication. The abstractions are formalized in higher order logic, and are intended to facilitate specifying and verifying such systems in higher order theorem provers.
A fault-tolerant strategy based on SMC for current-controlled converters

NASA Astrophysics Data System (ADS)

Azer, Peter M.; Marei, Mostafa I.; Sattar, Ahmed A.

2018-05-01

The sliding mode control (SMC) is used to control variable structure systems such as power electronics converters. This paper presents a fault-tolerant strategy based on the SMC for current-controlled AC-DC converters. The proposed SMC is based on three sliding surfaces for the three legs of the AC-DC converter. Two sliding surfaces are assigned to control the phase currents since the input three-phase currents are balanced. Hence, the third sliding surface is considered as an extra degree of freedom which is utilised to control the neutral voltage. This action is utilised to enhance the performance of the converter during open-switch faults. The proposed fault-tolerant strategy is based on allocating the sliding surface of the faulty leg to control the neutral voltage. Consequently, the current waveform is improved. The behaviour of the current-controlled converter during different types of open-switch faults is analysed. Double switch faults include three cases: two upper switch fault; upper and lower switch fault at different legs; and two switches of the same leg. The dynamic performance of the proposed system is evaluated during healthy and open-switch fault operations. Simulation results exhibit the various merits of the proposed SMC-based fault-tolerant strategy.
Fault tolerant architectures for integrated aircraft electronics systems

NASA Technical Reports Server (NTRS)

Levitt, K. N.; Melliar-Smith, P. M.; Schwartz, R. L.

1983-01-01

Work into possible architectures for future flight control computer systems is described. Ada for Fault-Tolerant Systems, the NETS Network Error-Tolerant System architecture, and voting in asynchronous systems are covered.
Predeployment validation of fault-tolerant systems through software-implemented fault insertion

NASA Technical Reports Server (NTRS)

Czeck, Edward W.; Siewiorek, Daniel P.; Segall, Zary Z.

1989-01-01

The fault injection-based automated testing (FIAT) environment, which can be used to experimentally characterize and evaluate distributed realtime systems under fault-free and faulted conditions, is described. A survey is presented of validation methodologies. The need for fault insertion based on validation methodologies is demonstrated. The origins and models of faults, and motivation for the FIAT concept are reviewed. FIAT employs a validation methodology which builds confidence in the system through first providing a baseline of fault-free performance data and then characterizing the behavior of the system with faults present. Fault insertion is accomplished through software and allows faults or the manifestation of faults to be inserted by either seeding faults into memory or triggering error detection mechanisms. FIAT is capable of emulating a variety of fault-tolerant strategies and architectures, can monitor system activity, and can automatically orchestrate experiments involving insertion of faults. There is a common system interface which allows ease of use to decrease experiment development and run time. Fault models chosen for experiments on FIAT have generated system responses which parallel those observed in real systems under faulty conditions. These capabilities are shown by two example experiments each using a different fault-tolerance strategy.
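The baseline-then-fault methodology described in this abstract can be illustrated with a minimal Python sketch. It is not FIAT itself: the names `inject_bit_flip` and `run_experiment`, the use of a `bytearray` as "memory", and the CRC-32 check as the error detection mechanism are all assumptions made for illustration.

```python
import random
import zlib


def inject_bit_flip(memory, rng):
    """Flip one randomly chosen bit in place; return (byte_index, bit_index)."""
    byte_index = rng.randrange(len(memory))
    bit_index = rng.randrange(8)
    memory[byte_index] ^= 1 << bit_index
    return byte_index, bit_index


def run_experiment(payload, seed):
    """Record a fault-free baseline, seed one fault, and report detection."""
    rng = random.Random(seed)                 # seeded so the run is repeatable
    memory = bytearray(payload)
    baseline_crc = zlib.crc32(memory)         # fault-free reference value
    location = inject_bit_flip(memory, rng)   # software-implemented fault
    detected = zlib.crc32(memory) != baseline_crc
    return {"location": location, "detected": detected, "memory": bytes(memory)}


result = run_experiment(b"flight control state", seed=1)
print(result["location"], result["detected"])
```

A CRC detects every single-bit error, so the seeded fault is always flagged here; a real campaign would vary fault models and record how the system under test responds.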
Research in computer science

NASA Technical Reports Server (NTRS)

Ortega, J. M.

1984-01-01

The research efforts of University of Virginia students under a NASA sponsored program are summarized and the status of the program is reported. The research includes: testing method evaluations for N version programming; a representation scheme for modeling three dimensional objects; fault tolerant protocols for real time local area networks; performance investigation of Cyber network; XFEM implementation; and vectorizing incomplete Cholesky conjugate gradients.
Fault-tolerant locomotion of the hexapod robot.

PubMed

Yang, J M; Kim, J H

1998-01-01

In this paper, we propose a scheme for fault detection and tolerance of the hexapod robot locomotion on even terrain. The fault stability margin is defined to represent potential stability which a gait can have in case a sudden fault event occurs to one leg. Based on this, the fault-tolerant quadruped periodic gaits of the hexapod walking over perfectly even terrain are derived. It is demonstrated that the derived quadruped gait is the optimal one the hexapod can have maintaining fault stability margin nonnegative and a geometric condition should be satisfied for the optimal locomotion. By this scheme, when one leg is in failure, the hexapod robot has the modified tripod gait to continue the optimal locomotion.
Reliable fuzzy H∞ control for active suspension of in-wheel motor driven electric vehicles with dynamic damping

NASA Astrophysics Data System (ADS)

Shao, Xinxin; Naghdy, Fazel; Du, Haiping

2017-03-01

A fault-tolerant fuzzy H∞ control design approach for active suspension of in-wheel motor driven electric vehicles in the presence of sprung mass variation, actuator faults and control input constraints is proposed. The controller is designed based on the quarter-car active suspension model with a dynamic-damping-in-wheel-motor-driven-system, in which the suspended motor is operated as a dynamic absorber. The Takagi-Sugeno (T-S) fuzzy model is used to model this suspension with possible sprung mass variation. The parallel-distributed compensation (PDC) scheme is deployed to derive a fault-tolerant fuzzy controller for the T-S fuzzy suspension model. In order to reduce the motor wear caused by the dynamic force transmitted to the in-wheel motor, the dynamic force is taken as an additional controlled output besides the traditional optimization objectives such as sprung mass acceleration, suspension deflection and actuator saturation. The H∞ performance of the proposed controller is derived as linear matrix inequalities (LMIs) comprising three equality constraints which are solved efficiently by means of MATLAB LMI Toolbox. The proposed controller is applied to an electric vehicle suspension and its effectiveness is demonstrated through computer simulation.
Disjointness of Stabilizer Codes and Limitations on Fault-Tolerant Logical Gates

NASA Astrophysics Data System (ADS)

Jochym-O'Connor, Tomas; Kubica, Aleksander; Yoder, Theodore J.

2018-04-01

Stabilizer codes are among the most successful quantum error-correcting codes, yet they have important limitations on their ability to fault tolerantly compute. Here, we introduce a new quantity, the disjointness of the stabilizer code, which, roughly speaking, is the number of mostly nonoverlapping representations of any given nontrivial logical Pauli operator. The notion of disjointness proves useful in limiting transversal gates on any error-detecting stabilizer code to a finite level of the Clifford hierarchy. For code families, we can similarly restrict logical operators implemented by constant-depth circuits. For instance, we show that it is impossible, with a constant-depth but possibly geometrically nonlocal circuit, to implement a logical non-Clifford gate on the standard two-dimensional surface code.
Fault Mitigation Schemes for Future Spaceflight Multicore Processors

NASA Technical Reports Server (NTRS)

Alexander, James W.; Clement, Bradley J.; Gostelow, Kim P.; Lai, John Y.

2012-01-01

Future planetary exploration missions demand significant advances in on-board computing capabilities over current avionics architectures based on a single-core processing element. The state-of-the-art multi-core processor provides much promise in meeting such challenges while introducing new fault tolerance problems when applied to space missions. Software-based schemes are being presented in this paper that can achieve system-level fault mitigation beyond that provided by radiation-hard-by-design (RHBD). For mission and time critical applications such as the Terrain Relative Navigation (TRN) for planetary or small body navigation, and landing, a range of fault tolerance methods can be adapted by the application. The software methods being investigated include Error Correction Code (ECC) for data packet routing between cores, virtual network routing, Triple Modular Redundancy (TMR), and Algorithm-Based Fault Tolerance (ABFT). A robust fault tolerance framework that provides fail-operational behavior under hard real-time constraints and graceful degradation will be demonstrated using TRN executing on a commercial Tilera(R) processor with simulated fault injections.
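Of the software methods listed in this abstract, Triple Modular Redundancy is the simplest to sketch. The following Python fragment is a hedged illustration only: `tmr_vote`, `run_tmr`, and the simulated bit upset are hypothetical names and choices, not the paper's implementation, and the three "replicas" run sequentially rather than on separate cores.

```python
from collections import Counter


def tmr_vote(a, b, c):
    """Return the value produced by at least two of the three replicas."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority: all three replicas disagree")
    return value


def run_tmr(compute, x, faulty_replica=None):
    """Run `compute` three times, optionally corrupting one result, then vote."""
    results = []
    for replica in range(3):
        value = compute(x)
        if replica == faulty_replica:
            value ^= 0x4          # simulated single-bit upset in one replica
        results.append(value)
    return tmr_vote(*results)


print(run_tmr(lambda v: v * v, 7, faulty_replica=1))
```

The voter masks any single faulty replica; when all three results differ it raises instead of guessing, leaving recovery to a higher-level policy.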

A highly reliable, high performance open avionics architecture for real time Nap-of-the-Earth operations

NASA Technical Reports Server (NTRS)

Harper, Richard E.; Elks, Carl

1995-01-01

An Army Fault Tolerant Architecture (AFTA) has been developed to meet real-time fault tolerant processing requirements of future Army applications. AFTA is the enabling technology that will allow the Army to configure existing processors and other hardware to provide high throughput and ultrahigh reliability necessary for TF/TA/NOE flight control and other advanced Army applications. A comprehensive conceptual study of AFTA has been completed that addresses a wide range of issues including requirements, architecture, hardware, software, testability, producibility, analytical models, validation and verification, common mode faults, VHDL, and a fault tolerant data bus. A Brassboard AFTA for demonstration and validation has been fabricated, and two operating systems and a flight-critical Army application have been ported to it. Detailed performance measurements have been made of fault tolerance and operating system overheads while AFTA was executing the flight application in the presence of faults.
Fault-tolerant rotary actuator

DOEpatents

Tesar, Delbert

2006-10-17

A fault-tolerant actuator module, in a single containment shell, containing two actuator subsystems that are either asymmetrically or symmetrically laid out is provided. Fault tolerance in the actuators of the present invention is achieved by the employment of dual sets of equal resources. Dual resources are integrated into single modules, with each having the external appearance and functionality of a single set of resources.
Fault diagnosis and fault-tolerant finite control set-model predictive control of a multiphase voltage-source inverter supplying BLDC motor.

PubMed

Salehifar, Mehdi; Moreno-Equilaz, Manuel

2016-01-01

Due to its fault tolerance, a multiphase brushless direct current (BLDC) motor can meet high reliability demand for application in electric vehicles. The voltage-source inverter (VSI) supplying the motor is subjected to open circuit faults. Therefore, it is necessary to design a fault-tolerant (FT) control algorithm with an embedded fault diagnosis (FD) block. In this paper, finite control set-model predictive control (FCS-MPC) is developed to implement the fault-tolerant control algorithm of a five-phase BLDC motor. The developed control method is fast, simple, and flexible. A FD method based on available information from the control block is proposed; this method is simple, robust to common transients in motor and able to localize multiple open circuit faults. The proposed FD and FT control algorithm are embedded in a five-phase BLDC motor drive. In order to validate the theory presented, simulation and experimental results are conducted on a five-phase two-level VSI supplying a five-phase BLDC motor. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
A Bayesian least squares support vector machines based framework for fault diagnosis and failure prognosis

NASA Astrophysics Data System (ADS)

Khawaja, Taimoor Saleem

A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. 
The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior and any abnormal or novel data during real-time operation. The results of the scheme are interpreted as a posterior probability of health (1 - probability of fault). As shown through two case studies in Chapter 3, the scheme is well suited for diagnosing imminent faults in dynamical non-linear systems. Finally, the failure prognosis scheme is based on an incremental weighted Bayesian LS-SVR machine. It is particularly suited for online deployment given the incremental nature of the algorithm and the quick optimization problem solved in the LS-SVR algorithm. By way of kernelization and a Gaussian Mixture Modeling (GMM) scheme, the algorithm can estimate "possibly" non-Gaussian posterior distributions for complex non-linear systems. An efficient regression scheme associated with the more rigorous core algorithm allows for long-term predictions, fault growth estimation with confidence bounds and remaining useful life (RUL) estimation after a fault is detected. The leading contributions of this thesis are (a) the development of a novel Bayesian Anomaly Detector for efficient and reliable Fault Detection and Identification (FDI) based on Least Squares Support Vector Machines, (b) the development of a data-driven real-time architecture for long-term Failure Prognosis using Least Squares Support Vector Machines, (c) Uncertainty representation and management using Bayesian Inference for posterior distribution estimation and hyper-parameter tuning, and finally (d) the statistical characterization of the performance of diagnosis and prognosis algorithms in order to relate the efficiency and reliability of the proposed schemes.
Adding Fault Tolerance to NPB Benchmarks Using ULFM

DOE Office of Scientific and Technical Information (OSTI.GOV)

Parchman, Zachary W; Vallee, Geoffroy R; Naughton III, Thomas J

2016-01-01

In the world of high-performance computing, fault tolerance and application resilience are becoming some of the primary concerns because of increasing hardware failures and memory corruptions. While the research community has been investigating various options, from system-level solutions to application-level solutions, standards such as the Message Passing Interface (MPI) are also starting to include such capabilities. The current proposal for MPI fault tolerance is centered around the User-Level Failure Mitigation (ULFM) concept, which provides means for fault detection and recovery of the MPI layer. This approach does not address application-level recovery, which is currently left to application developers. In this work, we present a modification of some of the benchmarks of the NAS parallel benchmark (NPB) to include support of the ULFM capabilities as well as application-level strategies and mechanisms for application-level failure recovery. As such, we present: (i) an application-level library to checkpoint and restore data, (ii) extensions of NPB benchmarks for fault tolerance based on different strategies, (iii) a fault injection tool, and (iv) some preliminary results that show the impact of such fault tolerant strategies on the application execution.
How does the architecture of a fault system controls magma upward migration through the crust?

NASA Astrophysics Data System (ADS)

Iturrieta, P. C.; Cembrano, J. M.; Stanton-Yonge, A.; Hurtado, D.

2017-12-01

The orientation and relative disposition of adjacent faults locally disrupt the regional stress field, thus enhancing magma flow through previous or newly created favorable conduits. Moreover, the brittle-plastic transition (BPT), due to its stronger rheology, governs the average state of stress of shallower portions of the fault system. Furthermore, the BPT may coincide with the location of transient magma reservoirs, from which dikes can propagate upwards into the upper crust, shaping the inner structure of the volcanic arc. In this work, we examine the stress distribution in strike-slip duplexes with variable geometry, along with the critical fluid overpressure ratio (CFOP), which is the minimum value required for individual faults to fracture in tension. We also determine the stress state disruption of the fault system when a dike is emplaced, to answer open questions such as: what is the nature of favorable pathways for magma to migrate? What is the influence of the architecture on the feedback between fault system kinematics and magma injection? To this end, we present a 3D coupled hydro-mechanical finite element model of the continental lithosphere, where faults are represented as continuum volumes with an elastic-plastic rheology. Magma flow upon fracturing is modeled through non-linear Stokes flow, coupling solid and fluid equilibrium. A non-linear sensitivity analysis is performed as a function of tectonic, rheology and geometry inputs, to assess which first-order factors govern the nature of dike emplacement. Results show that the CFOP is heterogeneously distributed in the fault system, and within individual fault segments. Minimum values are displayed near fault intersections, where local kinematics superimpose on regional tectonic loading. Furthermore, when magma is transported through a fault segment, the CFOP is now minimized in faults with non-favorable orientations. 
This suggests that these faults act as transient pathways for magma to continue migrating upwards, which may explain the heterogeneity of seismicity patterns in volcano-tectonic seismic swarms. Likewise, once magma is injected, the consequent disruption of the stress field enhances the slip of faults which are not favorably oriented to the regional tectonic loading.
Fault tolerant filtering and fault detection for quantum systems driven by fields in single photon states

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gao, Qing; Dong, Daoyi; Petersen, Ian R.

The purpose of this paper is to solve the fault tolerant filtering and fault detection problem for a class of open quantum systems driven by a continuous-mode bosonic input field in single photon states when the systems are subject to stochastic faults. Optimal estimates of both the system observables and the fault process are simultaneously calculated and characterized by a set of coupled recursive quantum stochastic differential equations.
Final Project Report. Scalable fault tolerance runtime technology for petascale computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Krishnamoorthy, Sriram; Sadayappan, P

With the massive number of components comprising the forthcoming petascale computer systems, hardware failures will be routinely encountered during execution of large-scale applications. Due to the multidisciplinary, multiresolution, and multiscale nature of scientific problems that drive the demand for high end systems, applications place increasingly differing demands on the system resources: disk, network, memory, and CPU. In addition to MPI, future applications are expected to use advanced programming models such as those developed under the DARPA HPCS program as well as existing global address space programming models such as Global Arrays, UPC, and Co-Array Fortran. While there has been a considerable amount of work in fault tolerant MPI with a number of strategies and extensions for fault tolerance proposed, virtually none of the advanced models proposed for emerging petascale systems is currently fault aware. To achieve fault tolerance, development of underlying runtime and OS technologies able to scale to petascale level is needed. This project has evaluated a range of runtime techniques for fault tolerance for advanced programming models.
Full-Authority Fault-Tolerant Electronic Engine Control System for Variable Cycle Engines.

DTIC Science & Technology

1982-04-01

single internally self-checked VLSI microprocessor. The selected configuration is an externally checked pair of commercially available...Electronic Engine Control FPMH Failures per Million Hours FTMP Fault Tolerant Multi-Processor FTSC Fault Tolerant Spaceborne Computer GRAMP Generalized...Removal * MTBR Mean Time Between Repair MTTF Mean Time to Failure xiii List of Abbreviations (continued) - NH High Pressure Rotor Speed O&S Operating
High-throughput state-machine replication using software transactional memory.

PubMed

Zhao, Wenbing; Yang, William; Zhang, Honglei; Yang, Jack; Luo, Xiong; Zhu, Yueqin; Yang, Mary; Luo, Chaomin

2016-11-01

State-machine replication is a common way of constructing general purpose fault tolerance systems. To ensure replica consistency, requests must be executed sequentially according to some total order at all non-faulty replicas. Unfortunately, this could severely limit the system throughput. This issue has been partially addressed by identifying non-conflicting requests based on application semantics and executing these requests concurrently. However, identifying and tracking non-conflicting requests requires intimate knowledge of application design and implementation, and a custom fault tolerance solution developed for one application cannot be easily adopted by other applications. Software transactional memory offers a new way of constructing concurrent programs. In this article, we present the mechanisms needed to retrofit existing concurrency control algorithms designed for software transactional memory for state-machine replication. The main benefit of using software transactional memory in state-machine replication is that general purpose concurrency control mechanisms can be designed without deep knowledge of application semantics. As such, new fault tolerance systems based on state-machine replication with excellent throughput can be easily designed and maintained. In this article, we introduce three different concurrency control mechanisms for state-machine replication using software transactional memory, namely, ordered strong strict two-phase locking, conventional timestamp-based multiversion concurrency control, and speculative timestamp-based multiversion concurrency control. Our experiments show that the speculative timestamp-based multiversion concurrency control mechanism has the best performance in all types of workload, whereas the conventional timestamp-based multiversion concurrency control offers the worst performance due to its high abort rate in the presence of even moderate contention between transactions. 
The ordered strong strict two-phase locking mechanism offers the simplest solution with excellent performance in low contention workload, and fairly good performance in high contention workload.
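The consistency requirement stated in this abstract (same total order at every replica, with reordering allowed only for non-conflicting requests) can be illustrated with a small Python sketch. It does not implement software transactional memory or any of the three mechanisms evaluated in the article; the `Replica` class, the request tuples, and the "same key means conflict" rule are assumptions chosen for illustration.

```python
class Replica:
    """A deterministic key-value state machine."""

    def __init__(self):
        self.state = {}

    def apply(self, request):
        op, key, value = request
        if op == "set":
            self.state[key] = value
        elif op == "add":
            self.state[key] = self.state.get(key, 0) + value
        else:
            raise ValueError(f"unknown operation: {op}")


def conflicts(r1, r2):
    """Assumed conflict rule: two requests conflict if they touch the same key."""
    return r1[1] == r2[1]


def run(log):
    """Apply a request log in order on a fresh replica and return its state."""
    replica = Replica()
    for request in log:
        replica.apply(request)
    return replica.state


agreed_order = [("set", "x", 1), ("add", "x", 5), ("set", "y", 10)]
# Moving the request on "y" ahead of the requests on "x" is safe: no conflict.
safe_reorder = [agreed_order[2], agreed_order[0], agreed_order[1]]
# Swapping the two requests on "x" reorders conflicting requests.
unsafe_reorder = [agreed_order[1], agreed_order[0], agreed_order[2]]

print(run(agreed_order), run(safe_reorder), run(unsafe_reorder))
```

Replicas that reorder only non-conflicting requests still agree on the final state, while reordering conflicting requests makes them diverge, which is why a concurrency control mechanism is needed once requests run in parallel.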
High-throughput state-machine replication using software transactional memory

PubMed Central

Yang, William; Zhang, Honglei; Yang, Jack; Luo, Xiong; Zhu, Yueqin; Yang, Mary; Luo, Chaomin

2017-01-01

State-machine replication is a common way of constructing general purpose fault tolerance systems. To ensure replica consistency, requests must be executed sequentially according to some total order at all non-faulty replicas. Unfortunately, this could severely limit the system throughput. This issue has been partially addressed by identifying non-conflicting requests based on application semantics and executing these requests concurrently. However, identifying and tracking non-conflicting requests requires intimate knowledge of application design and implementation, and a custom fault tolerance solution developed for one application cannot be easily adopted by other applications. Software transactional memory offers a new way of constructing concurrent programs. In this article, we present the mechanisms needed to retrofit existing concurrency control algorithms designed for software transactional memory for state-machine replication. The main benefit of using software transactional memory in state-machine replication is that general purpose concurrency control mechanisms can be designed without deep knowledge of application semantics. As such, new fault tolerance systems based on state-machine replication with excellent throughput can be easily designed and maintained. In this article, we introduce three different concurrency control mechanisms for state-machine replication using software transactional memory, namely, ordered strong strict two-phase locking, conventional timestamp-based multiversion concurrency control, and speculative timestamp-based multiversion concurrency control. Our experiments show that the speculative timestamp-based multiversion concurrency control mechanism has the best performance in all types of workload, whereas the conventional timestamp-based multiversion concurrency control offers the worst performance due to its high abort rate in the presence of even moderate contention between transactions. 
The ordered strong strict two-phase locking mechanism offers the simplest solution with excellent performance in low contention workload, and fairly good performance in high contention workload. PMID:29075049
Stochastic Stability of Sampled Data Systems with a Jump Linear Controller

NASA Technical Reports Server (NTRS)

Gonzalez, Oscar R.; Herencia-Zapana, Heber; Gray, W. Steven

2004-01-01

In this paper an equivalence between the stochastic stability of a sampled-data system and its associated discrete-time representation is established. The sampled-data system consists of a deterministic, linear, time-invariant, continuous-time plant and a stochastic, linear, time-invariant, discrete-time, jump linear controller. The jump linear controller models computer systems and communication networks that are subject to stochastic upsets or disruptions. This sampled-data model has been used in the analysis and design of fault-tolerant systems and computer-control systems with random communication delays without taking into account the inter-sample response. This paper shows that the known equivalence between the stability of a deterministic sampled-data system and the associated discrete-time representation holds even in a stochastic framework.
Parameter Transient Behavior Analysis on Fault Tolerant Control System

NASA Technical Reports Server (NTRS)

Belcastro, Christine (Technical Monitor); Shin, Jong-Yeob

2003-01-01

In a fault tolerant control (FTC) system, a parameter varying FTC law is reconfigured based on fault parameters estimated by fault detection and isolation (FDI) modules. FDI modules require some time to detect fault occurrences in aero-vehicle dynamics. This paper illustrates analysis of a FTC system based on estimated fault parameter transient behavior which may include false fault detections during a short time interval. Using Lyapunov function analysis, the upper bound of an induced-L2 norm of the FTC system performance is calculated as a function of a fault detection time and the exponential decay rate of the Lyapunov function.
The Use of Efficient Broadcast Protocols in Asynchronous Distributed Systems. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Schmuck, Frank Bernhard

1988-01-01

Reliable broadcast protocols are important tools in distributed and fault-tolerant programming. They are useful for sharing information and for maintaining replicated data in a distributed system. However, a wide range of such protocols has been proposed. These protocols differ in their fault tolerance and delivery ordering characteristics. There is a tradeoff between the cost of a broadcast protocol and how much ordering it provides. It is, therefore, desirable to employ protocols that support only a low degree of ordering whenever possible. This dissertation presents techniques for deciding how strongly ordered a protocol is necessary to solve a given application problem. It is shown that there are two distinct classes of application problems: problems that can be solved with efficient, asynchronous protocols, and problems that require global ordering. The concept of a linearization function that maps partially ordered sets of events to totally ordered histories is introduced. How to construct an asynchronous implementation that solves a given problem if a linearization function for it can be found is shown. It is proved that in general the question of whether a problem has an asynchronous solution is undecidable. Hence there exists no general algorithm that would automatically construct a suitable linearization function for a given problem. Therefore, an important subclass of problems that have certain commutativity properties are considered. Techniques for constructing asynchronous implementations for this class are presented. These techniques are useful for constructing efficient asynchronous implementations for a broad range of practical problems.
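The idea of a linearization function, mapping a partially ordered set of events to a totally ordered history, can be sketched in Python as a deterministic topological sort. This is only an illustration of the general notion under stated assumptions: the dissertation's linearization functions may carry further properties, and the name `linearize`, the event labels, and the tie-breaking by label are hypothetical.

```python
import heapq


def linearize(events, happens_before):
    """Map a partial order, given as (earlier, later) pairs, to one total order.

    Ties between concurrent events are broken by event label, so every process
    that sees the same partial order computes the same history.
    """
    successors = {event: [] for event in events}
    indegree = {event: 0 for event in events}
    for earlier, later in happens_before:
        successors[earlier].append(later)
        indegree[later] += 1

    ready = [event for event, degree in indegree.items() if degree == 0]
    heapq.heapify(ready)
    history = []
    while ready:
        event = heapq.heappop(ready)      # smallest label among ready events
        history.append(event)
        for successor in successors[event]:
            indegree[successor] -= 1
            if indegree[successor] == 0:
                heapq.heappush(ready, successor)

    if len(history) != len(indegree):
        raise ValueError("happens_before contains a cycle")
    return history


events = ["a1", "a2", "b1", "b2"]
happens_before = [("a1", "a2"), ("b1", "b2"), ("a1", "b2")]
print(linearize(events, happens_before))
```

Because concurrent events are ordered without extra communication, each process can compute the history locally, which fits the efficient asynchronous setting the abstract favors.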
Deterministic and robust generation of single photons from a single quantum dot with 99.5% indistinguishability using adiabatic rapid passage.

PubMed

Wei, Yu-Jia; He, Yu-Ming; Chen, Ming-Cheng; Hu, Yi-Nan; He, Yu; Wu, Dian; Schneider, Christian; Kamp, Martin; Höfling, Sven; Lu, Chao-Yang; Pan, Jian-Wei

2014-11-12

Single photons are attractive candidates of quantum bits (qubits) for quantum computation and are the best messengers in quantum networks. Future scalable, fault-tolerant photonic quantum technologies demand both stringently high levels of photon indistinguishability and generation efficiency. Here, we demonstrate deterministic and robust generation of pulsed resonance fluorescence single photons from a single semiconductor quantum dot using adiabatic rapid passage, a method robust against fluctuation of driving pulse area and dipole moments of solid-state emitters. The emitted photons are background-free, have a vanishing two-photon emission probability of 0.3% and a raw (corrected) two-photon Hong-Ou-Mandel interference visibility of 97.9% (99.5%), reaching a precision that places single photons at the threshold for fault-tolerant surface-code quantum computing. This single-photon source can be readily scaled up to multiphoton entanglement and used for quantum metrology, boson sampling, and linear optical quantum computing.
Chance of Vulnerability Reduction in Application-Specific NoC through Distance Aware Mapping Algorithm

NASA Astrophysics Data System (ADS)

Janidarmian, Majid; Fekr, Atena Roshan; Bokharaei, Vahhab Samadi

2011-08-01

The mapping algorithm, which determines which core should be linked to which router, is one of the key issues in the design flow of a network-on-chip (NoC). To achieve an application-specific NoC design procedure that minimizes the communication cost and improves the fault-tolerance property, a heuristic mapping algorithm that produces a set of different mappings in a reasonable time is first presented. This algorithm allows designers to identify the set of most promising solutions in a large design space, which have low communication costs and, in some cases, optimum communication costs. Another evaluated parameter, the vulnerability index, is then considered as a basis for estimating the fault-tolerance property of all produced mappings. Finally, in order to yield a mapping that considers trade-offs between these two parameters, a linear function is defined and introduced. It is also observed that more flexibility to prioritize solutions within the design space is possible by adjusting a set of if-then rules in fuzzy logic.
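The linear trade-off between communication cost and vulnerability index could look roughly like the Python sketch below. The weight `alpha`, the normalization by the largest value in the candidate set, and the tuple layout are all assumptions for illustration; the abstract does not give the actual function.

```python
def pick_mapping(mappings, alpha=0.5):
    """Choose the mapping with the lowest weighted score.

    mappings: list of (name, communication_cost, vulnerability_index)
              tuples with positive values; lower is better for both.
    alpha:    hypothetical weight on communication cost, in [0, 1].
    """
    max_cost = max(m[1] for m in mappings)
    max_vuln = max(m[2] for m in mappings)

    def score(m):
        # Normalize both terms so that neither unit dominates the sum.
        return alpha * m[1] / max_cost + (1 - alpha) * m[2] / max_vuln

    return min(mappings, key=score)
```

Setting `alpha` to 1.0 ranks purely by communication cost and 0.0 purely by vulnerability; intermediate values expose the trade-off between the two.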
Fault Tolerance Middleware for a Multi-Core System

NASA Technical Reports Server (NTRS)

Some, Raphael R.; Springer, Paul L.; Zima, Hans P.; James, Mark; Wagner, David A.

2012-01-01

Fault Tolerance Middleware (FTM) provides a framework to run on a dedicated core of a multi-core system and handles detection of single-event upsets (SEUs), and the responses to those SEUs, occurring in an application running on multiple cores of the processor. This software was written expressly for a multi-core system and can support different kinds of fault strategies, such as introspection, algorithm-based fault tolerance (ABFT), and triple modular redundancy (TMR). It focuses on providing fault tolerance for the application code, and represents the first step in a plan to eventually include fault tolerance in message passing and the FTM itself. In the multi-core system, the FTM resides on a single, dedicated core, separate from the cores used by the application. This is done in order to isolate the FTM from application faults and to allow it to swap out any application core for a substitute. The structure of the FTM consists of an interface to a fault tolerant strategy module, a responder module, a fault manager module, an error factory, and an error mapper that determines the severity of the error. In the present reference implementation, the only fault tolerant strategy implemented is introspection. The introspection code waits for an application node to send an error notification to it. It then uses the error factory to create an error object, and at this time, a severity level is assigned to the error. The introspection code uses its built-in knowledge base to generate a recommended response to the error. Responses might include ignoring the error, logging it, rolling back the application to a previously saved checkpoint, swapping in a new node to replace a bad one, or restarting the application. The original error and recommended response are passed to the top-level fault manager module, which invokes the response. The responder module also notifies the introspection module of the generated response. 
This provides additional information to the introspection module that it can use in generating its next response. For example, if the responder triggers an application rollback and errors are still occurring, the introspection module may decide to recommend an application restart.
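The feedback loop between the responder and the introspection module described above can be sketched as follows. This is a toy model, not the FTM code: the class and method names, the numeric severity scale, and the four-step escalation ladder (node swapping is left out for brevity) are assumptions.

```python
from dataclasses import dataclass

# Escalation ladder; the response names come from the abstract, the ordering is assumed.
LADDER = ["ignore", "log", "rollback", "restart"]


@dataclass
class Error:
    node: int      # id of the application node that reported the error
    severity: int  # hypothetical scale: 0 (benign) to 3 (critical)


class Introspection:
    """Toy stand-in for the introspection strategy module."""

    def __init__(self):
        self.last_response = {}  # node id -> index into LADDER

    def recommend(self, error):
        level = min(error.severity, len(LADDER) - 1)
        previous = self.last_response.get(error.node)
        # If an earlier response did not stop the errors, escalate one step.
        if previous is not None and level <= previous:
            level = min(previous + 1, len(LADDER) - 1)
        return LADDER[level]

    def notify(self, node, response):
        """Called by the responder after it has carried out a response."""
        self.last_response[node] = LADDER.index(response)
```

With this sketch, a severity-2 error first yields `rollback`; if the responder reports that rollback was performed and the same node raises the error again, the next recommendation is `restart`, mirroring the example in the abstract.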
Magic state distillation protocols with noisy Clifford gates

NASA Astrophysics Data System (ADS)

Brooks, Peter

2013-03-01

A promising approach to universal fault-tolerant quantum computation is to implement the non-universal group of Clifford gates, and to achieve universality by adding the ability to prepare high-fidelity copies of certain "magic states". By applying state distillation protocols, many noisy copies of a magic state ancilla can be purified into a smaller number of clean copies which are arbitrarily close to the perfect state, using only Clifford operations. In practice, the Clifford gates themselves will be noisy, which can limit the efficiency of state distillation and put a floor on the achievable fidelity with the desired state. Recently, a number of new state distillation protocols have been proposed that have the potential to reduce the required resource overhead. I analyze these protocols and explore the tradeoffs between these different approaches to magic state distillation when noisy Clifford gates are taken into account. Supported in part by IARPA under contract D11PC20165, by NSF under Grant No. PHY-0803371, by DOE under Grant No. DE-FG03-92-ER40701, and by NSA/ARO under Grant No. W911NF-09-1-0442.
Protection Relaying Scheme Based on Fault Reactance Operation Type

NASA Astrophysics Data System (ADS)

Tsuji, Kouichi

The theories of operation of existing relays are roughly divided into two types: current differential types based on Kirchhoff's first law, and impedance types based on Kirchhoff's second law. Kirchhoff's laws can be applied to formulate fault phenomena strictly, in which case the circuit equations become non-linear simultaneous equations in the fault point k and the fault resistance Rf. This method has two defects: 1) a heavy computational burden from the iterative Newton-Raphson (N-R) calculation, and 2) relay operators cannot easily understand the principle of the numerical matrix operation. The new protection relay principle proposed in this paper focuses on the fact that the reactance component at the fault point is almost zero. Two reactances, Xf(S) and Xf(R), at the two ends of the branch are calculated by solving linear equations. If the signs of Xf(S) and Xf(R) differ, it can be judged that the fault point lies in the branch. The reactance Xf corresponds to the difference in branch reactance between the actual fault point and the imaginary fault point, so relay engineers can understand the fault location through the concept of "distance". Simulation results using this new method indicate highly precise estimation of fault locations compared with the inspected fault locations on operating transmission lines.
Linear Parameter Varying Control for Actuator Failure

NASA Technical Reports Server (NTRS)

Shin, Jong-Yeob; Wu, N. Eva; Belcastro, Christine; Bushnell, Dennis M. (Technical Monitor)

2002-01-01

A robust linear parameter varying (LPV) control synthesis is carried out for a HiMAT vehicle subject to loss of control effectiveness. The scheduling parameter is selected to be a function of the estimates of the control effectiveness factors. The estimates are provided on-line by a two-stage Kalman estimator. The inherent conservatism of the LPV design is reduced through the use of a scaling factor on the uncertainty block that represents the estimation errors of the effectiveness factors. Simulations of the controlled system with the on-line estimator show that superior fault tolerance can be achieved.

Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing

DTIC Science & Technology

2012-12-14

Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing. Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, Ion... ...time. However, current programming models for distributed stream processing are relatively low-level, often leaving the user to worry about consistency of
Integrated Environment for Development and Assurance

DTIC Science & Technology

2015-01-26

Jan 26, 2015 © 2015 Carnegie Mellon University We Rely on Software for Safe Aircraft Operation Embedded software systems introduce a new class of...eveloper Compute Platform Runtime Architecture Application Software Embedded SW System Engineer Data Stream Characteristics Latency jitter affects...Why do system level failures still occur despite fault tolerance techniques being deployed in systems ? Embedded software system as major source of
Double swivel toggle release

NASA Technical Reports Server (NTRS)

King, Guy L.; Schneider, William C.

1989-01-01

A pyrotechnic actuated structural release device is disclosed which is mechanically two fault tolerant for release. The device comprises a fastener plate and fastener body each attachable to one of a pair of structures to be joined. The fastener plate and the fastener body are fastened by a dual swivel toggle member. The toggle member is supported at one end on the fastener plate and mounted for universal pivotal movement thereon. Its other end is received in a central opening in the fastener body, and has a universally mounted retainer ring member. The toggle member is restrained by three retractable latching pins symmetrically disposed in equiangular spacing about the axis of the toggle member and positionable in latching engagement with the retainer ring member on the toggle member. Each pin is retractable by a pyrotechnic charge, the expanding gases of which are applied to a pressure receiving face on the latch pins to effect retraction from the ring member. While retraction of all three pins releases the ring member, the fastener is mechanically two fault tolerant since the failure of any single one or pair of the latch pins to retract results in an asymmetrical loading on the ring member and its dual pivotal movement ensures a release.
CSP: A Multifaceted Hybrid Architecture for Space Computing

NASA Technical Reports Server (NTRS)

Rudolph, Dylan; Wilson, Christopher; Stewart, Jacob; Gauvin, Patrick; George, Alan; Lam, Herman; Crum, Gary Alex; Wirthlin, Mike; Wilson, Alex; Stoddard, Aaron

2014-01-01

Research on the CHREC Space Processor (CSP) takes a multifaceted hybrid approach to embedded space computing. Working closely with the NASA Goddard SpaceCube team, researchers at the National Science Foundation (NSF) Center for High-Performance Reconfigurable Computing (CHREC) at the University of Florida and Brigham Young University are developing hybrid space computers that feature an innovative combination of three technologies: commercial-off-the-shelf (COTS) devices, radiation-hardened (RadHard) devices, and fault-tolerant computing. Modern COTS processors provide the utmost in performance and energy-efficiency but are susceptible to ionizing radiation in space, whereas RadHard processors are virtually immune to this radiation but are more expensive, larger, less energy-efficient, and generations behind in speed and functionality. By featuring COTS devices to perform the critical data processing, supported by simpler RadHard devices that monitor and manage the COTS devices, and augmented with novel uses of fault-tolerant hardware, software, information, and networking within and between COTS devices, the resulting system can maximize performance and reliability while minimizing energy consumption and cost. NASA Goddard has adopted the CSP concept and technology with plans underway to feature flight-ready CSP boards on two upcoming space missions.
Highly fault-tolerant parallel computation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Spielman, D.A.

We re-introduce the coded model of fault-tolerant computation in which the input and output of a computational device are treated as words in an error-correcting code. A computational device correctly computes a function in the coded model if its input and output, once decoded, are a valid input and output of the function. In the coded model, it is reasonable to hope to simulate all computational devices by devices whose size is greater by a constant factor but which are exponentially reliable even if each of their components can fail with some constant probability. We consider fine-grained parallel computations in which each processor has a constant probability of producing the wrong output at each time step. We show that any parallel computation that runs for time $t$ on $w$ processors can be performed reliably on a faulty machine in the coded model using $w \log^{O(1)} w$ processors and time $t \log^{O(1)} w$. The failure probability of the computation will be at most $t \cdot \exp(-w^{1/4})$. The codes used to communicate with our fault-tolerant machines are generalized Reed-Solomon codes and can thus be encoded and decoded in $O(n \log^{O(1)} n)$ sequential time and are independent of the machine they are used to communicate with. We also show how coded computation can be used to self-correct many linear functions in parallel with arbitrarily small overhead.
Sliding mode based fault detection, reconstruction and fault tolerant control scheme for motor systems.

PubMed

Mekki, Hemza; Benzineb, Omar; Boukhetala, Djamel; Tadjine, Mohamed; Benbouzid, Mohamed

2015-07-01

The fault-tolerant control problem belongs to the domain of complex control systems in which inter-control-disciplinary information and expertise are required. This paper proposes an improved fault detection, reconstruction and fault-tolerant control (FTC) scheme for motor systems (MS) with typical faults. For this purpose, a sliding mode controller (SMC) with an integral sliding surface is adopted. This controller can make the system output track the desired position reference signal in finite time and obtain a better dynamic response and anti-disturbance performance, but it cannot deal directly with total system failures. The adopted SMC is therefore combined with a sliding mode observer (SMO); the latter is designed to detect and reconstruct the faults on-line and also to give a sensorless control strategy that can achieve tolerance to a wide class of total additive failures. The closed-loop stability is proved using Lyapunov stability theory. Simulation results in healthy and faulty conditions confirm the reliability of the suggested framework. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
Network-Physics(NP) Bec DIGITAL(#)-VULNERABILITY Versus Fault-Tolerant Analog

NASA Astrophysics Data System (ADS)

Alexander, G. K.; Hathaway, M.; Schmidt, H. E.; Siegel, E.

2011-03-01

Siegel[AMS Joint Mtg.(2002)-Abs.973-60-124] digits logarithmic-(Newcomb(1881)-Weyl(1914; 1916)-Benford(1938)-"NeWBe"/"OLDbe")-law algebraic-inversion to ONLY BEQS BEC:Quanta/Bosons= digits: Synthesis reveals EMP-like SEVERE VULNERABILITY of ONLY DIGITAL-networks(VS. FAULT-TOLERANT ANALOG INvulnerability) via Barabasi "Network-Physics" relative-``statics''(VS.dynamics-[Willinger-Alderson-Doyle(Not.AMS(5/09)]-]critique); (so called)"Quantum-computing is simple-arithmetic(sans division/ factorization); algorithmic-complexities: INtractibility/ UNdecidability/ INefficiency/NONcomputability / HARDNESS(so MIScalled) "noise"-induced-phase-transitions(NITS) ACCELERATION: Cook-Levin theorem Reducibility is Renormalization-(Semi)-Group fixed-points; number-Randomness DEFINITION via WHAT? Query(VS. Goldreich[Not.AMS(02)] How? mea culpa)can ONLY be MBCS "hot-plasma" versus digit-clumping NON-random BEC; Modular-arithmetic Congruences= Signal X Noise PRODUCTS = clock-model; NON-Shor[Physica A,341,586(04)] BEC logarithmic-law inversion factorization:Watkins number-thy. U stat.-phys.); P=/=NP TRIVIAL Proof: Euclid!!! [(So Miscalled) computational-complexity J-O obviation via geometry.
SABRE: a bio-inspired fault-tolerant electronic architecture.

PubMed

Bremner, P; Liu, Y; Samie, M; Dragffy, G; Pipe, A G; Tempesti, G; Timmis, J; Tyrrell, A M

2013-03-01

As electronic devices become increasingly complex, ensuring their reliable, fault-free operation is becoming correspondingly more challenging. It can be observed that, in spite of their complexity, biological systems are highly reliable and fault tolerant. Hence, we are motivated to take inspiration from biological systems in the design of electronic ones. In SABRE (self-healing cellular architectures for biologically inspired highly reliable electronic systems), we have designed a bio-inspired fault-tolerant hierarchical architecture for this purpose. As in biology, the foundation for the whole system is cellular in nature, with each cell able to detect faults in its operation and trigger intra-cellular or extra-cellular repair as required. At the next level in the hierarchy, arrays of cells are configured and controlled as function units in a transport triggered architecture (TTA), which is able to perform partial-dynamic reconfiguration to rectify problems that cannot be solved at the cellular level. Each TTA is, in turn, part of a larger multi-processor system which employs coarser grain reconfiguration to tolerate faults that cause a processor to fail. In this paper, we describe the details of operation of each layer of the SABRE hierarchy, and how these layers interact to provide a high systemic level of fault tolerance.
Fault tolerance in computational grids: perspectives, challenges, and issues.

PubMed

Haider, Sajjad; Nazir, Babar

2016-01-01

Computational grids are established with the intention of providing shared access to hardware and software based resources with special reference to increased computational capabilities. Fault tolerance is one of the most important issues faced by computational grids. The main contribution of this survey is the creation of an extended classification of problems that occur in computational grid environments. The proposed classification will help researchers, developers, and maintainers of grids to understand the types of issues to be anticipated. Moreover, different types of problems, such as omission, interaction, and timing related problems, have been identified that need to be handled on various layers of the computational grid. In this survey, fault tolerance and fault detection mechanisms are also analyzed and examined. Our conclusion is that a dependable and reliable grid can only be established when more emphasis is placed on fault identification. Moreover, our survey reveals that adaptive and intelligent fault identification and tolerance techniques can improve the dependability of grid working environments.
Fault Injection and Monitoring Capability for a Fault-Tolerant Distributed Computation System

NASA Technical Reports Server (NTRS)

Torres-Pomales, Wilfredo; Yates, Amy M.; Malekpour, Mahyar R.

2010-01-01

The Configurable Fault-Injection and Monitoring System (CFIMS) is intended for the experimental characterization of effects caused by a variety of adverse conditions on a distributed computation system running flight control applications. A product of research collaboration between NASA Langley Research Center and Old Dominion University, the CFIMS is the main research tool for generating actual fault response data with which to develop and validate analytical performance models and design methodologies for the mitigation of fault effects in distributed flight control systems. Rather than a fixed design solution, the CFIMS is a flexible system that enables the systematic exploration of the problem space and can be adapted to meet the evolving needs of the research. The CFIMS has the capabilities of system-under-test (SUT) functional stimulus generation, fault injection and state monitoring, all of which are supported by a configuration capability for setting up the system as desired for a particular experiment. This report summarizes the work accomplished so far in the development of the CFIMS concept and documents the first design realization.
Application of a Resource Theory for Magic States to Fault-Tolerant Quantum Computing.

PubMed

Howard, Mark; Campbell, Earl

2017-03-03

Motivated by their necessity for most fault-tolerant quantum computation schemes, we formulate a resource theory for magic states. First, we show that robustness of magic is a well-behaved magic monotone that operationally quantifies the classical simulation overhead for a Gottesman-Knill-type scheme using ancillary magic states. Our framework subsequently finds immediate application in the task of synthesizing non-Clifford gates using magic states. When magic states are interspersed with Clifford gates, Pauli measurements, and stabilizer ancillas (the most general synthesis scenario), the class of synthesizable unitaries is hard to characterize. Our techniques can place nontrivial lower bounds on the number of magic states required for implementing a given target unitary. Guided by these results, we have found new and optimal examples of such synthesis.
Verification of fault-tolerant clock synchronization systems. M.S. Thesis - College of William and Mary, 1992

NASA Technical Reports Server (NTRS)

Miner, Paul S.

1993-01-01

A critical function in a fault-tolerant computer architecture is the synchronization of the redundant computing elements. The synchronization algorithm must include safeguards to ensure that failed components do not corrupt the behavior of good clocks. Reasoning about fault-tolerant clock synchronization is difficult because of the possibility of subtle interactions involving failed components. Therefore, mechanical proof systems are used to ensure that the verification of the synchronization system is correct. In 1987, Schneider presented a general proof of correctness for several fault-tolerant clock synchronization algorithms. Subsequently, Shankar verified Schneider's proof by using the mechanical proof system EHDM. This proof ensures that any system satisfying its underlying assumptions will provide Byzantine fault-tolerant clock synchronization. The utility of Shankar's mechanization of Schneider's theory for the verification of clock synchronization systems is explored. Some limitations of Shankar's mechanically verified theory were encountered. With minor modifications to the theory, a mechanically checked proof is provided that removes these limitations. The revised theory also allows for proven recovery from transient faults. Use of the revised theory is illustrated with the verification of an abstract design of a clock synchronization system.
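Proofs of this kind are typically parameterized by a convergence function that each clock applies to the readings it collects from the others. One widely used choice is the fault-tolerant midpoint, sketched below in Python for illustration only; the abstract does not say which function the thesis verifies, and the function name and integer readings are assumptions.

```python
def fault_tolerant_midpoint(readings, max_faults):
    """Resynchronization value that tolerates up to max_faults Byzantine clocks.

    Discards the max_faults lowest and max_faults highest readings, then
    returns the midpoint of the extremes that remain, so a faulty clock
    cannot drag the result outside the range spanned by good clocks.
    """
    if len(readings) < 3 * max_faults + 1:
        raise ValueError("need at least 3f + 1 readings to tolerate f faults")
    ordered = sorted(readings)
    trimmed = ordered[max_faults:len(ordered) - max_faults]
    return (trimmed[0] + trimmed[-1]) / 2
```

With four clocks and one tolerated fault, a wildly wrong reading is simply trimmed away before the midpoint is taken.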
Fault-tolerant onboard digital information switching and routing for communications satellites

NASA Technical Reports Server (NTRS)

Shalkhauser, Mary JO; Quintana, Jorge A.; Soni, Nitin J.; Kim, Heechul

1993-01-01

The NASA Lewis Research Center is developing an information-switching processor for future meshed very-small-aperture terminal (VSAT) communications satellites. The information-switching processor will switch and route baseband user data onboard the VSAT satellite to connect thousands of Earth terminals. Fault tolerance is a critical issue in developing information-switching processor circuitry that will provide and maintain reliable communications services. In parallel with the conceptual development of the meshed VSAT satellite network architecture, NASA designed and built a simple test bed for developing and demonstrating baseband switch architectures and fault-tolerance techniques. The meshed VSAT architecture and the switching demonstration test bed are described, and the initial switching architecture and the fault-tolerance techniques that were developed and tested are discussed.
Intelligent fault-tolerant controllers

NASA Technical Reports Server (NTRS)

Huang, Chien Y.

1987-01-01

A system with fault tolerant controls is one that can detect, isolate, and estimate failures and perform necessary control reconfiguration based on this new information. Artificial intelligence (AI) is concerned with semantic processing, and it has evolved to include the topics of expert systems and machine learning. This research represents an attempt to apply AI to fault tolerant controls, hence, the name intelligent fault tolerant control (IFTC). A generic solution to the problem is sought, providing a system based on logic in addition to analytical tools, and offering machine learning capabilities. The advantages are that redundant system specific algorithms are no longer needed, that reasonableness is used to quickly choose the correct control strategy, and that the system can adapt to new situations by learning about its effects on system dynamics.
Hybrid routing technique for a fault-tolerant, integrated information network

NASA Technical Reports Server (NTRS)

Meredith, B. D.

1986-01-01

The evolutionary growth of the space station and the diverse activities onboard are expected to require a hierarchy of integrated, local area networks capable of supporting data, voice, and video communications. In addition, fault-tolerant network operation is necessary to protect communications between critical systems attached to the net and to relieve the valuable human resources onboard the space station of time-critical data system repair tasks. A key issue for the design of the fault-tolerant, integrated network is the development of a robust routing algorithm which dynamically selects the optimum communication paths through the net. A routing technique is described that adapts to topological changes in the network to support fault-tolerant operation and system evolvability.
Provable Transient Recovery for Frame-Based, Fault-Tolerant Computing Systems

NASA Technical Reports Server (NTRS)

DiVito, Ben L.; Butler, Ricky W.

1992-01-01

We present a formal verification of the transient fault recovery aspects of the Reliable Computing Platform (RCP), a fault-tolerant computing system architecture for digital flight control applications. The RCP uses NMR-style redundancy to mask faults and internal majority voting to purge the effects of transient faults. The system design has been formally specified and verified using the EHDM verification system. Our formalization accommodates a wide variety of voting schemes for purging the effects of transients.
Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis

NASA Technical Reports Server (NTRS)

Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.

1992-01-01

Described here is the Army Fault Tolerant Architecture (AFTA) hardware architecture and components and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions.
Development and analysis of the Software Implemented Fault-Tolerance (SIFT) computer

NASA Technical Reports Server (NTRS)

Goldberg, J.; Kautz, W. H.; Melliar-Smith, P. M.; Green, M. W.; Levitt, K. N.; Schwartz, R. L.; Weinstock, C. B.

1984-01-01

SIFT (Software Implemented Fault Tolerance) is an experimental, fault-tolerant computer system designed to meet the extreme reliability requirements for safety-critical functions in advanced aircraft. Errors are masked by performing a majority voting operation over the results of identical computations, and faulty processors are removed from service by reassigning computations to the nonfaulty processors. This scheme has been implemented in a special architecture using a set of standard Bendix BDX930 processors, augmented by a special asynchronous-broadcast communication interface that provides direct, processor to processor communication among all processors. Fault isolation is accomplished in hardware; all other fault-tolerance functions, together with scheduling and synchronization are implemented exclusively by executive system software. The system reliability is predicted by a Markov model. Mathematical consistency of the system software with respect to the reliability model has been partially verified, using recently developed tools for machine-aided proof of program correctness.
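The masking and reconfiguration steps described above can be sketched in a few lines of Python. This is an illustrative model rather than SIFT's executive software: the function names, the dictionary layout, and the round-robin reassignment policy are assumptions.

```python
from collections import Counter


def vote(results):
    """Majority-vote over replicated results.

    results: dict mapping processor id -> computed value.
    Returns (majority_value, sorted list of disagreeing processors).
    """
    value, count = Counter(results.values()).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError("no strict majority among replicas")
    suspects = sorted(p for p, v in results.items() if v != value)
    return value, suspects


def reassign(tasks, processors, faulty):
    """Spread tasks round-robin over processors not marked faulty."""
    healthy = [p for p in processors if p not in faulty]
    if not healthy:
        raise RuntimeError("no healthy processors left")
    return {task: healthy[i % len(healthy)] for i, task in enumerate(tasks)}
```

A single wrong replica is outvoted and named as a suspect, after which its work can be moved to the remaining processors.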
A fault tolerant gait for a hexapod robot over uneven terrain.

PubMed

Yang, J M; Kim, J H

2000-01-01

The fault tolerant gait of legged robots in static walking is a gait which maintains its stability against a fault event preventing a leg from having the support state. In this paper, a fault tolerant quadruped gait is proposed for a hexapod traversing uneven terrain with forbidden regions, which do not offer viable footholds but can be stepped over. By comparing performance of straight-line motion and crab walking over even terrain, it is shown that the proposed gait has better mobility and terrain adaptability than previously developed gaits. Based on the proposed gait, we present a method for the generation of the fault tolerant locomotion of a hexapod over uneven terrain with forbidden regions. The proposed method minimizes the number of legs on the ground during walking, and a foot adjustment algorithm is used to avoid stepping on forbidden regions. The effectiveness of the proposed strategy over uneven terrain is demonstrated with a computer simulation.
Algorithm-Based Fault Tolerance Integrated with Replication

NASA Technical Reports Server (NTRS)

Some, Raphael; Rennels, David

2008-01-01

In a proposed approach to programming and utilization of commercial off-the-shelf computing equipment, a combination of algorithm-based fault tolerance (ABFT) and replication would be utilized to obtain high degrees of fault tolerance without incurring excessive costs. The basic idea of the proposed approach is to integrate ABFT with replication such that the algorithmic portions of computations would be protected by ABFT, and the logical portions by replication. ABFT is an extremely efficient, inexpensive, high-coverage technique for detecting and mitigating faults in computer systems used for algorithmic computations, but does not protect against errors in logical operations surrounding algorithms.
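To make the ABFT half of the approach concrete, here is a small Python sketch of the classic checksum scheme for matrix multiplication, in which row and column sums carried through the product let a single corrupted element be located and repaired. It is offered as a generic example of ABFT under the assumption of exact integer arithmetic; the abstract does not specify which ABFT algorithms the proposal uses, and all function names are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def with_column_checksum(a):
    """Append a row holding the sum of each column."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append to each row the sum of that row."""
    return [row + [sum(row)] for row in b]


def locate_error(c_full):
    """Return (row, col) of a single corrupted data element, or None."""
    n_rows, n_cols = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n_rows)
                if sum(c_full[i][:n_cols]) != c_full[i][n_cols]]
    bad_cols = [j for j in range(n_cols)
                if sum(c_full[i][j] for i in range(n_rows)) != c_full[n_rows][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("error pattern is not a single data-element fault")


def correct(c_full, pos):
    """Rebuild the element at pos from its row checksum, in place."""
    i, j = pos
    n_cols = len(c_full[0]) - 1
    others = sum(c_full[i][:n_cols]) - c_full[i][j]
    c_full[i][j] = c_full[i][n_cols] - others
```

Multiplying `with_column_checksum(a)` by `with_row_checksum(b)` yields the product with both checksums attached, so the check costs one extra row and column rather than a full replica. The control logic around the multiplication is not covered by the checksums, which is the gap replication is meant to fill in the proposed approach.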

Detection of faults and software reliability analysis

NASA Technical Reports Server (NTRS)

Knight, J. C.

1986-01-01

Multiversion or N-version programming was proposed as a method of providing fault tolerance in software. The approach requires the separate, independent preparation of multiple versions of a piece of software for some application. Specific topics addressed are: failure probabilities in N-version systems, consistent comparison in N-version systems, descriptions of the faults found in the Knight and Leveson experiment, analytic models of comparison testing, characteristics of the input regions that trigger faults, fault tolerance through data diversity, and the relationship between failures caused by automatically seeded faults.
Fault detection and isolation for complex system

NASA Astrophysics Data System (ADS)

Jing, Chan Shi; Bayuaji, Luhur; Samad, R.; Mustafa, M.; Abdullah, N. R. H.; Zain, Z. M.; Pebrianti, Dwi

2017-07-01

Fault Detection and Isolation (FDI) is a method to monitor, identify, and pinpoint the type and location of a system fault in a complex multiple input multiple output (MIMO) non-linear system. A two-wheel robot is used as the complex system in this study. The aim of the research is to construct and design a Fault Detection and Isolation algorithm. The proposed method for fault identification uses a hybrid technique that combines a Kalman filter and an Artificial Neural Network (ANN). The Kalman filter is able to recognize the data from the sensors of the system and indicate the fault of the system in the sensor reading. Error prediction is based on the fault magnitude and the time of occurrence of the fault. Additionally, the ANN is used to determine the type of fault and isolate the fault in the system.
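A common way to use a Kalman filter for sensor fault detection is to test the innovation (measurement minus prediction) against its expected spread. The Python sketch below does this for a single scalar sensor with a random-walk state model; the model, the noise values, and the three-sigma threshold are assumptions for illustration, and the ANN classification stage of the paper is not sketched here.

```python
import math


class ScalarKalman:
    """One-dimensional Kalman filter with an innovation-based fault flag."""

    def __init__(self, x0, p0, q, r):
        self.x = x0  # state estimate
        self.p = p0  # estimate variance
        self.q = q   # process noise variance (assumed)
        self.r = r   # measurement noise variance (assumed)

    def step(self, z, threshold=3.0):
        """Process measurement z; return True if it looks faulty."""
        p_pred = self.p + self.q      # predict under a random-walk model
        innovation = z - self.x
        s = p_pred + self.r           # innovation variance
        faulty = abs(innovation) > threshold * math.sqrt(s)
        if faulty:
            self.p = p_pred           # skip the update, keep the prediction
        else:
            gain = p_pred / s
            self.x += gain * innovation
            self.p = (1 - gain) * p_pred
        return faulty
```

Rejecting the flagged measurement keeps one bad reading from corrupting the estimate, while the size and timing of the innovation are the kind of features a downstream classifier could use.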
Reconfiguration Schemes for Fault-Tolerant Processor Arrays

DTIC Science & Technology

1992-10-15

partially notion of linear schedule are easily related to similar ordered subset of a multidimensional integer lattice models and concepts used in [1]-[13]...and several other (called index set). The points of this lattice correspond works. to (i.e., are the indices of) computations, and the partial There are...These data dependencies are represented as vectors that of all computations of the algorithm is to be minimized. connect points of the lattice. If a
Fault Detection for Automotive Shock Absorber

NASA Astrophysics Data System (ADS)

Hernandez-Alcantara, Diana; Morales-Menendez, Ruben; Amezquita-Brooks, Luis

2015-11-01

Fault detection for automotive semi-active shock absorbers is a challenge due to the non-linear dynamics and the strong influence of disturbances such as the road profile. The first obstacle for this task is the modeling of the fault, which has been shown to be of multiplicative nature. Many of the most widespread fault detection schemes consider additive faults. Two model-based fault detection algorithms for semi-active shock absorbers are compared: an observer-based approach and a parameter identification approach. The performance of these schemes is validated and compared using a commercial vehicle model that was experimentally validated. Early results show that a parameter identification approach is more accurate, whereas an observer-based approach is less sensitive to parametric uncertainty.
On-line node fault injection training algorithm for MLP networks: objective function and convergence analysis.

PubMed

Sum, John Pui-Fai; Leung, Chi-Sing; Ho, Kevin I-J

2012-02-01

Improving the fault tolerance of neural networks has been studied for more than two decades, and various training algorithms have since been proposed. The on-line node fault injection-based algorithm is one of these algorithms, in which hidden nodes randomly output zeros during training. While the idea is simple, theoretical analyses of this algorithm are far from complete. This paper presents its objective function and the convergence proof. We consider three cases for multilayer perceptrons (MLPs). They are: (1) MLPs with a single linear output node; (2) MLPs with multiple linear output nodes; and (3) MLPs with a single sigmoid output node. For the convergence proof, we show that the algorithm converges with probability one. For the objective function, we show that the corresponding objective functions of cases (1) and (2) are of the same form. They both consist of a mean square error term, a regularizer term, and a weight decay term. For case (3), the objective function is slightly different from that of cases (1) and (2). With the objective functions derived, we can compare the similarities and differences among various algorithms and various cases.
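As an illustration of case (1), one on-line update with injected node faults might look like the sketch below. The tanh hidden activation, the squared-error loss, the learning rate and all names are assumptions made for the example; the abstract does not specify them.

```python
import math
import random


def random_mask(n_hidden, fault_prob, rng):
    """Draw a fault mask: 0 means the hidden node outputs zero for this sample."""
    return [0 if rng.random() < fault_prob else 1 for _ in range(n_hidden)]


def train_step(w, v, x, target, mask, lr=0.1):
    """One gradient step for an MLP with a single linear output node.

    w[j] are the input weights of hidden node j, v[j] its output weight.
    Hidden outputs are multiplied by mask[j] before reaching the output node.
    Returns the updated (w, v) and the output error before the update.
    """
    h = [m * math.tanh(sum(wk * xk for wk, xk in zip(wj, x)))
         for wj, m in zip(w, mask)]
    y = sum(vj * hj for vj, hj in zip(v, h))
    err = y - target
    new_v = [vj - lr * err * hj for vj, hj in zip(v, h)]
    new_w = [[wk - lr * err * vj * m * (1.0 - hj * hj) * xk
              for wk, xk in zip(wj, x)]
             for wj, vj, hj, m in zip(w, v, h, mask)]
    return new_w, new_v, err
```

In use, a fresh mask would be drawn for every training sample, for example `random_mask(len(v), 0.2, random.Random(0))`. A node that is faulted for a given sample receives no gradient, so its weights are left unchanged by that step.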
Energy-efficient fault tolerance in multiprocessor real-time systems

NASA Astrophysics Data System (ADS)

Guo, Yifeng

The recent progress in multiprocessor/multicore systems has important implications for real-time system design and operation. From vehicle navigation to space applications as well as industrial control systems, the trend is to deploy multiple processors in real-time systems: systems with 4 to 8 processors are common, and it is expected that many-core systems with dozens of processing cores will be available in the near future. For such systems, in addition to the general temporal requirements common to all real-time systems, two additional operational objectives are seen as critical: energy efficiency and fault tolerance. An intriguing dimension of the problem is that energy efficiency and fault tolerance are typically conflicting objectives, due to the fact that tolerating faults (e.g., permanent/transient) often requires extra resources with high energy consumption potential. In this dissertation, various techniques for energy-efficient fault tolerance in multiprocessor real-time systems have been investigated. First, the Reliability-Aware Power Management (RAPM) framework, which can preserve the system reliability with respect to transient faults when Dynamic Voltage Scaling (DVS) is applied for energy savings, is extended to support parallel real-time applications with precedence constraints. Next, the traditional Standby-Sparing (SS) technique for dual processor systems, which takes both transient and permanent faults into consideration while saving energy, is generalized to support multiprocessor systems with an arbitrary number of identical processors. Observing the inefficient usage of slack time in the SS technique, a Preference-Oriented Scheduling Framework is designed to address the problem where tasks are given preferences for being executed as soon as possible (ASAP) or as late as possible (ALAP). A preference-oriented earliest deadline (POED) scheduler is proposed and its application in multiprocessor systems for energy-efficient fault tolerance is investigated, where tasks' main copies are executed ASAP and backup copies ALAP, to reduce the overlapped execution of main and backup copies of the same task and thus reduce energy consumption. All proposed techniques are evaluated through extensive simulations and compared with other state-of-the-art approaches. The simulation results confirm that the proposed schemes can preserve the system reliability while still achieving substantial energy savings. Finally, for both SS and POED based Energy-Efficient Fault-Tolerant (EEFT) schemes, a series of recovery strategies are designed for cases in which more than one (transient or permanent) fault needs to be tolerated.
Optimal Management of Redundant Control Authority for Fault Tolerance

NASA Technical Reports Server (NTRS)

Wu, N. Eva; Ju, Jianhong

2000-01-01

This paper is intended to demonstrate the feasibility of a solution to a fault tolerant control problem. It explains, through a numerical example, the design and the operation of a novel scheme for fault tolerant control. The fundamental principle of the scheme was formalized in [5] based on the notion of normalized nonspecificity. The novelty lies with the use of a reliability criterion for redundancy management, and therefore leads to a high overall system reliability.
Advanced information processing system: Hosting of advanced guidance, navigation and control algorithms on AIPS using ASTER

NASA Technical Reports Server (NTRS)

Brenner, Richard; Lala, Jaynarayan H.; Nagle, Gail A.; Schor, Andrei; Turkovich, John

1994-01-01

This program demonstrated the integration of a number of technologies that can increase the availability and reliability of launch vehicles while lowering costs. Availability is increased with an advanced guidance algorithm that adapts trajectories in real-time. Reliability is increased with fault-tolerant computers and communication protocols. Costs are reduced by automatically generating code and documentation. This program was realized through the cooperative efforts of academia, industry, and government. The NASA-LaRC coordinated the effort, while Draper performed the integration. Georgia Institute of Technology supplied a weak Hamiltonian finite element method for optimal control problems. Martin Marietta used MATLAB to apply this method to a launch vehicle (FENOC). Draper supplied the fault-tolerant computing and software automation technology. The fault-tolerant technology includes sequential and parallel fault-tolerant processors (FTP & FTPP) and authentication protocols (AP) for communication. Fault-tolerant technology was incrementally incorporated. Development culminated with a heterogeneous network of workstations and fault-tolerant computers using AP. Draper's software automation system, ASTER, was used to specify a static guidance system based on FENOC, navigation, flight control (GN&C), models, and the interface to a user interface for mission control. ASTER generated Ada code for GN&C and C code for models. An algebraic transform engine (ATE) was developed to automatically translate MATLAB scripts into ASTER.
Fault Injection Campaign for a Fault Tolerant Duplex Framework

NASA Technical Reports Server (NTRS)

Sacco, Gian Franco; Ferraro, Robert D.; von Allmen, Paul; Rennels, Dave A.

2007-01-01

Fault tolerance is an efficient approach adopted to avoid or reduce the damage of a system failure. In this work we present the results of a fault injection campaign we conducted on the Duplex Framework (DF). The DF is software developed by the UCLA group [1, 2] that uses a fault-tolerant approach and allows two replicas of the same process to run on two different nodes of a commercial off-the-shelf (COTS) computer cluster. A third process, running on a different node, constantly monitors the results computed by the two replicas and restarts the two replica processes if an inconsistency in their computation is detected. This approach is very cost efficient and can be adopted to control processes on spacecraft where the fault rate produced by cosmic rays is not very high.
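The compare-and-restart loop described here can be sketched in a few lines. This is a single-process simulation under stated assumptions, not the Duplex Framework's actual interface: the replicas are ordinary callables standing in for processes on separate nodes, and `run_duplex` and `flaky_node` are hypothetical names.

```python
def run_duplex(task, data, replica_a, replica_b, max_restarts=3):
    """Run task on two replicas; rerun both whenever their results disagree.

    Returns (result, number_of_restarts). Raises RuntimeError if the replicas
    still disagree after max_restarts reruns.
    """
    for attempt in range(max_restarts + 1):
        result_a = replica_a(task, data)
        result_b = replica_b(task, data)
        if result_a == result_b:      # the monitor's consistency check
            return result_a, attempt
    raise RuntimeError("replicas disagree after %d restarts" % max_restarts)


def flaky_node(fail_on_calls):
    """Simulated node that corrupts its result on the given call numbers."""
    calls = {"n": 0}

    def node(task, data):
        calls["n"] += 1
        result = task(data)
        return result + 1 if calls["n"] in fail_on_calls else result

    return node
```

With only two replicas, the monitor can detect a mismatch but cannot tell which replica is wrong, which is why both are restarted. That is cheap when faults are rare and transient, and it fails (here, by raising) when a fault persists.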
Fault tolerant control of multivariable processes using auto-tuning PID controller.

PubMed

Yu, Ding-Li; Chang, T K; Yu, Ding-Wen

2005-02-01

Fault tolerant control of dynamic processes is investigated in this paper using an auto-tuning PID controller. A fault tolerant control scheme is proposed comprising an auto-tuning PID controller based on an adaptive neural network model. The model is trained online using the extended Kalman filter (EKF) algorithm to learn system post-fault dynamics. Based on this model, the PID controller adjusts its parameters to compensate for the effects of the faults, so that the control performance is recovered from degradation. The auto-tuning algorithm for the PID controller is derived with the Lyapunov method and, therefore, the model-predicted tracking error is guaranteed to converge asymptotically. The method is applied to a simulated two-input two-output continuous stirred tank reactor (CSTR) with various faults, which demonstrates the applicability of the developed scheme to industrial processes.
Investigation of the applicability of a functional programming model to fault-tolerant parallel processing for knowledge-based systems

NASA Technical Reports Server (NTRS)

Harper, Richard

1989-01-01

In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper Fault-Tolerant Parallel Processor (FTPP). When used in conjunction with the FTPP's fault detection and masking capabilities, this implementation results in a graceful degradation of system performance after faults. Three graceful degradation algorithms have been implemented and are presented. A user interface has been implemented which requires minimal cognitive overhead by the application programmer, masking such complexities as the system's redundancy, distributed nature, variable complement of processing resources, load balancing, fault occurrence and recovery. This user interface is described and its use demonstrated. The applicability of the functional programming style to the Activation Framework, a paradigm for intelligent systems, is then briefly described.
Adaptive sensor-fault tolerant control for a class of multivariable uncertain nonlinear systems.

PubMed

Khebbache, Hicham; Tadjine, Mohamed; Labiod, Salim; Boulkroune, Abdesselem

2015-03-01

This paper deals with the active fault tolerant control (AFTC) problem for a class of multiple-input multiple-output (MIMO) uncertain nonlinear systems subject to sensor faults and external disturbances. The proposed AFTC method can tolerate three additive (bias, drift and loss of accuracy) and one multiplicative (loss of effectiveness) sensor faults. By employing backstepping technique, a novel adaptive backstepping-based AFTC scheme is developed using the fact that sensor faults and system uncertainties (including external disturbances and unexpected nonlinear functions caused by sensor faults) can be on-line estimated and compensated via robust adaptive schemes. The stability analysis of the closed-loop system is rigorously proven using a Lyapunov approach. The effectiveness of the proposed controller is illustrated by two simulation examples. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Reconfigurable tree architectures using subtree oriented fault tolerance

NASA Technical Reports Server (NTRS)

Lowrie, Matthew B.

1987-01-01

An approach to the design of reconfigurable tree architectures is presented in which spare processors are allocated at the leaves. The approach is unique in that spares are associated with subtrees and sharing of spares between these subtrees can occur. The Subtree Oriented Fault Tolerance (SOFT) approach is more reliable than previous approaches capable of tolerating link and switch failures for both single chip and multichip tree implementations while reducing redundancy in terms of both spare processors and links. VLSI layout is O(n) for binary trees and is directly extensible to N-ary trees and fault tolerance through performance degradation.
Airborne Advanced Reconfigurable Computer System (ARCS)

NASA Technical Reports Server (NTRS)

Bjurman, B. E.; Jenkins, G. M.; Masreliez, C. J.; Mcclellan, K. L.; Templeman, J. E.

1976-01-01

A digital computer subsystem fault-tolerant concept was defined, and the potential benefits and costs of such a subsystem were assessed when used as the central element of a new transport's flight control system. The derived advanced reconfigurable computer system (ARCS) is a triple-redundant computer subsystem that automatically reconfigures, under multiple fault conditions, from triplex to duplex to simplex operation, with redundancy recovery if the fault condition is transient. The study included criteria development covering factors at the aircraft's operation level that would influence the design of a fault-tolerant system for commercial airline use. A new reliability analysis tool was developed for evaluating redundant, fault-tolerant system availability and survivability; and a stringent digital system software design methodology was used to achieve design/implementation visibility.
A benchmark for fault tolerant flight control evaluation

NASA Astrophysics Data System (ADS)

Smaili, H.; Breeman, J.; Lombaerts, T.; Stroosma, O.

2013-12-01

A large transport aircraft simulation benchmark (REconfigurable COntrol for Vehicle Emergency Return - RECOVER) has been developed within the GARTEUR (Group for Aeronautical Research and Technology in Europe) Flight Mechanics Action Group 16 (FM-AG(16)) on Fault Tolerant Control (2004-2008) for the integrated evaluation of fault detection and identification (FDI) and reconfigurable flight control strategies. The benchmark includes a suitable set of assessment criteria and failure cases, based on reconstructed accident scenarios, to assess the potential of new adaptive control strategies to improve aircraft survivability. The application of reconstruction and modeling techniques, based on accident flight data, has resulted in high-fidelity nonlinear aircraft and fault models to evaluate new Fault Tolerant Flight Control (FTFC) concepts and their real-time performance to accommodate in-flight failures.
Fault-tolerant communication channel structures

NASA Technical Reports Server (NTRS)

Tai, Ann T. (Inventor); Alkalai, Leon (Inventor); Chau, Savio N. (Inventor)

2006-01-01

Systems and techniques for implementing fault-tolerant communication channels and features in communication systems. Selected commercial-off-the-shelf devices can be integrated in such systems to reduce the cost.
Study of fault tolerant software technology for dynamic systems

NASA Technical Reports Server (NTRS)

Caglayan, A. K.; Zacharias, G. L.

1985-01-01

The major aim of this study is to investigate the feasibility of using systems-based failure detection isolation and compensation (FDIC) techniques in building fault-tolerant software and extending them, whenever possible, to the domain of software fault tolerance. First, it is shown that systems-based FDIC methods can be extended to develop software error detection techniques by using system models for software modules. In particular, it is demonstrated that systems-based FDIC techniques can yield consistency checks that are easier to implement than acceptance tests based on software specifications. Next, it is shown that systems-based failure compensation techniques can be generalized to the domain of software fault tolerance in developing software error recovery procedures. Finally, the feasibility of using fault-tolerant software in flight software is investigated. In particular, possible system and version instabilities, and functional performance degradation that may occur in N-Version programming applications to flight software are illustrated. Finally, a comparative analysis of N-Version and recovery block techniques in the context of generic blocks in flight software is presented.
Closed-Loop Evaluation of an Integrated Failure Identification and Fault Tolerant Control System for a Transport Aircraft

NASA Technical Reports Server (NTRS)

Shin, Jong-Yeob; Belcastro, Christine; Khong, Thuan

2006-01-01

Formal robustness analysis of aircraft control upset prevention and recovery systems could play an important role in their validation and ultimate certification. Such systems developed for failure detection, identification, and reconfiguration, as well as upset recovery, need to be evaluated over broad regions of the flight envelope or under extreme flight conditions, and should include various sources of uncertainty. To apply formal robustness analysis, formulation of linear fractional transformation (LFT) models of complex parameter-dependent systems is required, which represent system uncertainty due to parameter uncertainty and actuator faults. This paper describes a detailed LFT model formulation procedure from the nonlinear model of a transport aircraft by using a preliminary LFT modeling software tool developed at the NASA Langley Research Center, which utilizes a matrix-based computational approach. The closed-loop system is evaluated over the entire flight envelope based on the generated LFT model which can cover nonlinear dynamics. The robustness analysis results of the closed-loop fault tolerant control system of a transport aircraft are presented. A reliable flight envelope (safe flight regime) is also calculated from the robust performance analysis results, over which the closed-loop system can achieve the desired performance of command tracking and failure detection.
Preliminary design of the redundant software experiment

NASA Technical Reports Server (NTRS)

Campbell, Roy; Deimel, Lionel; Eckhardt, Dave, Jr.; Kelly, John; Knight, John; Lauterbach, Linda; Lee, Larry; Mcallister, Dave; Mchugh, John

1985-01-01

The goal of the present experiment is to characterize the fault distributions of highly reliable software replicates, constructed using techniques and environments which are similar to those used in contemporary industrial software facilities. The fault distributions and their effect on the reliability of fault tolerant configurations of the software will be determined through extensive life testing of the replicates against carefully constructed randomly generated test data. Each detected error will be carefully analyzed to provide insight into its nature and cause. A direct objective is to develop techniques for reducing the intensity of coincident errors, thus increasing the reliability gain which can be achieved with fault tolerance. Data on the reliability gains realized, and the cost of the fault tolerant configurations, can be used to design a companion experiment to determine the cost effectiveness of the fault tolerant strategy. Finally, the data and analysis produced by this experiment will be valuable to the software engineering community as a whole because it will provide a useful insight into the nature and cause of hard-to-find, subtle faults which escape standard software engineering validation techniques and thus persist far into the software life cycle.
Experimental Demonstration of Fault-Tolerant State Preparation with Superconducting Qubits.

PubMed

Takita, Maika; Cross, Andrew W; Córcoles, A D; Chow, Jerry M; Gambetta, Jay M

2017-11-03

Robust quantum computation requires encoding delicate quantum information into degrees of freedom that are hard for the environment to change. Quantum encodings have been demonstrated in many physical systems by observing and correcting storage errors, but applications require not just storing information; we must accurately compute even with faulty operations. The theory of fault-tolerant quantum computing illuminates a way forward by providing a foundation and collection of techniques for limiting the spread of errors. Here we implement one of the smallest quantum codes in a five-qubit superconducting transmon device and demonstrate fault-tolerant state preparation. We characterize the resulting code words through quantum process tomography and study the free evolution of the logical observables. Our results are consistent with fault-tolerant state preparation in a protected qubit subspace.

A Voyager attitude control perspective on fault tolerant systems

NASA Technical Reports Server (NTRS)

Rasmussen, R. D.; Litty, E. C.

1981-01-01

In current spacecraft design, a trend can be observed to achieve greater fault tolerance through the application of on-board software dedicated to detecting and isolating failures. Whether fault tolerance through software can meet the desired objectives depends on very careful consideration and control of the system in which the software is imbedded. The considered investigation has the objective to provide some of the insight needed for the required analysis of the system. A description is given of the techniques which have been developed in this connection during the development of the Voyager spacecraft. The Voyager Galileo Attitude and Articulation Control Subsystem (AACS) fault tolerant design is discussed to emphasize basic lessons learned from this experience. The central driver of hardware redundancy implementation on Voyager was known as the 'single point failure criterion'.
Neural-like computing with populations of superparamagnetic basis functions.

PubMed

Mizrahi, Alice; Hirtzlin, Tifenn; Fukushima, Akio; Kubota, Hitoshi; Yuasa, Shinji; Grollier, Julie; Querlioz, Damien

2018-04-18

In neuroscience, population coding theory demonstrates that neural assemblies can achieve fault-tolerant information processing. Mapped to nanoelectronics, this strategy could allow for reliable computing with scaled-down, noisy, imperfect devices. Doing so requires that the population components form a set of basis functions in terms of their response functions to inputs, offering a physical substrate for computing. Such a population can be implemented with CMOS technology, but the corresponding circuits have high area or energy requirements. Here, we show that nanoscale magnetic tunnel junctions can instead be assembled to meet these requirements. We demonstrate experimentally that a population of nine junctions can implement a basis set of functions, providing the data to achieve, for example, the generation of cursive letters. We design hybrid magnetic-CMOS systems based on interlinked populations of junctions and show that they can learn to realize non-linear variability-resilient transformations with a low imprint area and low power.
Robust Characterization of Loss Rates

NASA Astrophysics Data System (ADS)

Wallman, Joel J.; Barnhill, Marie; Emerson, Joseph

2015-08-01

Many physical implementations of qubits—including ion traps, optical lattices and linear optics—suffer from loss. A nonzero probability of irretrievably losing a qubit can be a substantial obstacle to fault-tolerant methods of processing quantum information, requiring new techniques to safeguard against loss that introduce an additional overhead that depends upon the loss rate. Here we present a scalable and platform-independent protocol for estimating the average loss rate (averaged over all input states) resulting from an arbitrary Markovian noise process, as well as an independent estimate of detector efficiency. Moreover, we show that our protocol gives an additional constraint on estimated parameters from randomized benchmarking that improves the reliability of the estimated error rate and provides a new indicator for non-Markovian signatures in the experimental data. We also derive a bound for the state-dependent loss rate in terms of the average loss rate.
Copilot: Monitoring Embedded Systems

NASA Technical Reports Server (NTRS)

Pike, Lee; Wegmann, Nis; Niller, Sebastian; Goodloe, Alwyn

2012-01-01

Runtime verification (RV) is a natural fit for ultra-critical systems, where correctness is imperative. In ultra-critical systems, even if the software is fault-free, because of the inherent unreliability of commodity hardware and the adversity of operational environments, processing units (and their hosted software) are replicated, and fault-tolerant algorithms are used to compare the outputs. We investigate both software monitoring in distributed fault-tolerant systems, as well as implementing fault-tolerance mechanisms using RV techniques. We describe the Copilot language and compiler, specifically designed for generating monitors for distributed, hard real-time systems. We also describe two case-studies in which we generated Copilot monitors in avionics systems.
Which preferences associate with school performance?-Lessons from an exploratory study with university students.

PubMed

Horn, Daniel; Kiss, Hubert Janos

2018-01-01

Success in life is determined to a large extent by school performance so it is important to understand the effect of the factors that influence it. In this exploratory study, in addition to cognitive abilities, we attempt to link measures of preferences with outcomes of school performance. We measured in an incentivized way risk, time, social and competitive preferences and cognitive abilities of university students to look for associations between these measures and two important academic outcome measures: exam results and GPA. We find consistently that cognitive abilities (proxied by the Cognitive Reflection Test) are very well correlated with school performance. Regarding non-cognitive skills, we report suggestive evidence for many of our measured preferences. We used two alternative measures of time preference: patience and present bias. Present bias explains exam grades better, while patience explains GPA relatively better. Both measures of time preferences have a non-linear relation to school performance. Competitiveness matters, as students, who opt for a more competitive payment scheme in our experimental task have a higher average GPA. We observe also that risk-averse students perform a little better than more risk-tolerant students. That makes sense in case of multiple choice exams, because more risk-tolerant students may want to try to pass the exam less prepared, as the possibility of passing an exam just by chance is not zero. Finally, we have also detected that cooperative preferences-the amount of money offered in a public good game-associates strongly with GPA in a non-linear way. Students who offered around half of their possible amounts had significantly higher GPAs than those, who offered none or all their money.
Which preferences associate with school performance?—Lessons from an exploratory study with university students

PubMed Central

2018-01-01

Success in life is determined to a large extent by school performance so it is important to understand the effect of the factors that influence it. In this exploratory study, in addition to cognitive abilities, we attempt to link measures of preferences with outcomes of school performance. We measured in an incentivized way risk, time, social and competitive preferences and cognitive abilities of university students to look for associations between these measures and two important academic outcome measures: exam results and GPA. We find consistently that cognitive abilities (proxied by the Cognitive Reflection Test) are very well correlated with school performance. Regarding non-cognitive skills, we report suggestive evidence for many of our measured preferences. We used two alternative measures of time preference: patience and present bias. Present bias explains exam grades better, while patience explains GPA relatively better. Both measures of time preferences have a non-linear relation to school performance. Competitiveness matters, as students, who opt for a more competitive payment scheme in our experimental task have a higher average GPA. We observe also that risk-averse students perform a little better than more risk-tolerant students. That makes sense in case of multiple choice exams, because more risk-tolerant students may want to try to pass the exam less prepared, as the possibility of passing an exam just by chance is not zero. Finally, we have also detected that cooperative preferences—the amount of money offered in a public good game—associates strongly with GPA in a non-linear way. Students who offered around half of their possible amounts had significantly higher GPAs than those, who offered none or all their money. PMID:29451886
Fault tolerant programmable digital attitude control electronics study

NASA Technical Reports Server (NTRS)

Sorensen, A. A.

1974-01-01

The attitude control electronics mechanization study to develop a fault tolerant autonomous concept for a three axis system is reported. Programmable digital electronics are compared to general purpose digital computers. The requirements, constraints, and tradeoffs are discussed. It is concluded that: (1) general fault tolerance can be achieved relatively economically, (2) recovery times of less than one second can be obtained, (3) the number of faulty behavior patterns must be limited, and (4) adjoined processes are the best indicators of faulty operation.
Refinement for fault-tolerance: An aircraft hand-off protocol

NASA Technical Reports Server (NTRS)

Marzullo, Keith; Schneider, Fred B.; Dehn, Jon

1994-01-01

Part of the Advanced Automation System (AAS) for air-traffic control is a protocol to permit flight hand-off from one air-traffic controller to another. The protocol must be fault-tolerant and, therefore, is subtle -- an ideal candidate for the application of formal methods. This paper describes a formal method for deriving fault-tolerant protocols that is based on refinement and proof outlines. The AAS hand-off protocol was actually derived using this method; that derivation is given.
Testing For EM Upsets In Aircraft Control Computers

NASA Technical Reports Server (NTRS)

Belcastro, Celeste M.

1994-01-01

Effects of transient electrical signals evaluated in laboratory tests. Method of evaluating nominally fault-tolerant, aircraft-type digital-computer-based control system devised. Provides for evaluation of susceptibility of system to upset and evaluation of integrity of control when system subjected to transient electrical signals like those induced by electromagnetic (EM) source, in this case lightning. Beyond aerospace applications, fault-tolerant control systems becoming more wide-spread in industry; such as in automobiles. Method supports practical, systematic tests for evaluation of designs of fault-tolerant control systems.
Design of the Protocol Processor for the ROBUS-2 Communication System

NASA Technical Reports Server (NTRS)

Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Miner, Paul S.

2005-01-01

The ROBUS-2 Protocol Processor (RPP) is a custom-designed hardware component implementing the functionality of the ROBUS-2 fault-tolerant communication system. The Reliable Optical Bus (ROBUS) is the core communication system of the Scalable Processor-Independent Design for Enhanced Reliability (SPIDER), a general-purpose fault tolerant integrated modular architecture currently under development at NASA Langley Research Center. ROBUS is a time-division multiple access (TDMA) broadcast communication system with medium access control by means of time-indexed communication schedule. ROBUS-2 is a developmental version of the ROBUS providing guaranteed fault-tolerant services to the attached processing elements (PEs), in the presence of a bounded number of faults. These services include message broadcast (Byzantine Agreement), dynamic communication schedule update, time reference (clock synchronization), and distributed diagnosis (group membership). ROBUS also features fault-tolerant startup and restart capabilities. ROBUS-2 tolerates internal as well as PE faults, and incorporates a dynamic self-reconfiguration capability driven by the internal diagnostic system. ROBUS consists of RPPs connected to each other by a lower-level physical communication network. The RPP has a pipelined architecture and the design is parameterized in the behavioral and structural domains. The design of the RPP enables the bus to achieve a PE-message throughput that approaches the available bandwidth at the physical layer.
Optimal fault-tolerant control strategy of a solid oxide fuel cell system

NASA Astrophysics Data System (ADS)

Wu, Xiaojuan; Gao, Danhui

2017-10-01

For solid oxide fuel cell (SOFC) development, load tracking, heat management, air excess ratio constraint, high efficiency, low cost and fault diagnosis are six key issues. However, no published study combines optimization and fault diagnosis in the control of the SOFC system. An optimal fault-tolerant control strategy is presented in this paper, which involves four parts: a fault diagnosis module, a switching module, two backup optimizers and a controller loop. The fault diagnosis part identifies the current SOFC fault type, and the switching module selects the appropriate backup optimizer based on the diagnosis result. NSGA-II and TOPSIS are employed to design the two backup optimizers under normal and air compressor fault states. A PID algorithm is used to design the control loop, which includes a power tracking controller, an anode inlet temperature controller, a cathode inlet temperature controller and an air excess ratio controller. The simulation results show that the proposed optimal fault-tolerant control method can track the power, temperature and air excess ratio at the desired values while achieving the maximum efficiency and the minimum unit cost, both in normal SOFC operation and under an air compressor fault.
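As an illustration of the TOPSIS step named in this abstract, the following is a minimal sketch of ranking candidate operating points by their closeness to an ideal solution. A common pairing, assumed here because the abstract does not spell it out, is to let NSGA-II produce a set of trade-off candidates and let TOPSIS pick one. The function name `topsis`, the candidate values and the weights are hypothetical and are not taken from the paper.

```python
import math

def topsis(rows, weights, benefit):
    """Return one closeness score per candidate, in [0, 1]; higher is better.

    rows    : list of candidates, each a list of criterion values
    weights : one weight per criterion
    benefit : True where larger is better, False where smaller is better
    Assumes the candidates are not all identical (otherwise both distances are 0).
    """
    n = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(r[j] ** 2 for r in rows)) for j in range(n)]
    v = [[weights[j] * r[j] / norms[j] for j in range(n)] for r in rows]
    # The ideal and anti-ideal points depend on the direction of each criterion.
    cols = list(zip(*v))
    best = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    scores = []
    for row in v:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical trade-off candidates: (efficiency, unit cost).
candidates = [[0.50, 0.30], [0.45, 0.20], [0.55, 0.40]]
scores = topsis(candidates, weights=[0.5, 0.5], benefit=[True, False])
best_index = max(range(len(scores)), key=scores.__getitem__)
```

Equal weights are an arbitrary choice in this sketch; changing them shifts which trade-off point is selected.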
Reliability and coverage analysis of non-repairable fault-tolerant memory systems

NASA Technical Reports Server (NTRS)

Cox, G. W.; Carroll, B. D.

1976-01-01

A method was developed for the construction of probabilistic state-space models for nonrepairable systems. Models were developed for several systems which achieved reliability improvement by means of error-coding, modularized sparing, massive replication and other fault-tolerant techniques. From the models developed, sets of reliability and coverage equations for the systems were developed. Comparative analyses of the systems were performed using these equation sets. In addition, the effects of varying subunit reliabilities on system reliability and coverage were described. The results of these analyses indicated that a significant gain in system reliability may be achieved by use of combinations of modularized sparing, error coding, and software error control. For sufficiently reliable system subunits, this gain may far exceed the reliability gain achieved by use of massive replication techniques, yet result in a considerable saving in system cost.
Abnormal fault-recovery characteristics of the fault-tolerant multiprocessor uncovered using a new fault-injection methodology

NASA Technical Reports Server (NTRS)

Padilla, Peter A.

1991-01-01

An investigation was made in AIRLAB of the fault handling performance of the Fault Tolerant MultiProcessor (FTMP). Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once in every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles Byzantine or lying faults. Byzantine faults behave such that the faulted unit points to a working unit as the source of errors. The design's problems involve: (1) the design and interface between the simplex error detection hardware and the error processing software, (2) the functional capabilities of the FTMP system bus, and (3) the communication requirements of a multiprocessor architecture. These weak areas in the FTMP's design increase the probability that, for any hardware fault, a good line replacement unit (LRU) is mistakenly disabled by the fault management software.
Formal Techniques for Synchronized Fault-Tolerant Systems

NASA Technical Reports Server (NTRS)

DiVito, Ben L.; Butler, Ricky W.

1992-01-01

We present the formal verification of synchronizing aspects of the Reliable Computing Platform (RCP), a fault-tolerant computing system for digital flight control applications. The RCP uses NMR-style redundancy to mask faults and internal majority voting to purge the effects of transient faults. The system design has been formally specified and verified using the EHDM verification system. Our formalization is based on an extended state machine model incorporating snapshots of local processor clocks.
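The fault masking by majority voting described in this abstract can be illustrated with a minimal sketch. The function name `majority_vote` and the sample values are hypothetical, and the sketch does not model the RCP's clock synchronization or its EHDM specification.

```python
from collections import Counter

def majority_vote(values):
    """Return the value reported by a strict majority of replicas, else None."""
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) // 2 else None

# Three replicas, one of which returns a corrupted result.
voted = majority_vote([42, 42, 17])
```

Returning `None` when no strict majority exists leaves the choice of fallback to the caller, which seemed the least presumptuous design for a sketch.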
Fault-tolerant Control of a Cyber-physical System

NASA Astrophysics Data System (ADS)

Roxana, Rusu-Both; Eva-Henrietta, Dulf

2017-10-01

Cyber-physical systems represent a new emerging field in automatic control. The fault system is a key component, because modern, large scale processes must meet high standards of performance, reliability and safety. Fault propagation in large scale chemical processes can lead to loss of production, energy, raw materials and even environmental hazard. The present paper develops a multi-agent fault-tolerant control architecture using robust fractional order controllers for a (13C) cryogenic separation column cascade. The JADE (Java Agent DEvelopment Framework) platform was used to implement the multi-agent fault tolerant control system while the operational model of the process was implemented in Matlab/SIMULINK environment. MACSimJX (Multiagent Control Using Simulink with Jade Extension) toolbox was used to link the control system and the process model. In order to verify the performance and to prove the feasibility of the proposed control architecture several fault simulation scenarios were performed.
Fault-tolerant building-block computer study

NASA Technical Reports Server (NTRS)

Rennels, D. A.

1978-01-01

Ultra-reliable core computers are required for improving the reliability of complex military systems. Such computers can provide reliable fault diagnosis, failure circumvention, and, in some cases, serve as an automated repairman for their host systems. A small set of building-block circuits which can be implemented as single very large integration devices, and which can be used with off-the-shelf microprocessors and memories to build self checking computer modules (SCCM) is described. Each SCCM is a microcomputer which is capable of detecting its own faults during normal operation and is designed to communicate with other identical modules over one or more Mil Standard 1553A buses. Several SCCMs can be connected into a network with backup spares to provide fault-tolerant operation, i.e. automated recovery from faults. Alternative fault-tolerant SCCM configurations are discussed along with the cost and reliability associated with their implementation.
Fault Tolerant Paradigms

DTIC Science & Technology

2016-02-26

say that A is a JL(m,d,)-embedding of S into C^m. Linear JL(m,d,)-embeddings are closely related to the Restricted Isometry Property [9, 4, 18...holds ∀x ∈ C^d containing at most s nonzero coordinates. In this case we will say that A is RIP(s,). In particular, the following theorem due to Krahmer...implement reliable edge detector functions, especially in the presence of noise. Needless to say, the same issues exist in two dimensions, as
Guest Editor's Introduction: Special section on dependable distributed systems

NASA Astrophysics Data System (ADS)

Fetzer, Christof

1999-09-01

We rely more and more on computers. For example, the Internet reshapes the way we do business. A 'computer outage' can cost a company a substantial amount of money, not only with respect to the business lost during an outage, but also with respect to the negative publicity the company receives. This is especially true for Internet companies. After recent computer outages of Internet companies, we have seen a drastic fall of the shares of the affected companies. There are multiple causes for computer outages. Although computer hardware becomes more reliable, hardware-related outages remain an important issue. For example, some of the recent computer outages of companies were caused by failed memory and system boards, and even by crashed disks - a failure type which can easily be masked using disk mirroring. Transient hardware failures might also look like software failures and, hence, might be incorrectly classified as such. However, many outages are software-related. Faulty system software, middleware, and application software can crash a system. Dependable computing systems are systems we can rely on. Dependable systems are, by definition, reliable, available, safe and secure [3]. This special section focuses on issues related to dependable distributed systems. Distributed systems have the potential to be more dependable than a single computer because the probability that all computers in a distributed system fail is smaller than the probability that a single computer fails. However, if a distributed system is not built well, it is potentially less dependable than a single computer since the probability that at least one computer in a distributed system fails is higher than the probability that one computer fails. For example, if the crash of any computer in a distributed system can bring the complete system to a halt, the system is less dependable than a single-computer system. Building dependable distributed systems is an extremely difficult task.
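The two probability claims in the paragraph above can be made concrete with a short worked sketch. It assumes that computers fail independently and with the same probability, which the text does not state; the values of `p` and `n` are hypothetical.

```python
def p_all_fail(p, n):
    """Probability that all n independent computers fail, each with probability p."""
    return p ** n

def p_any_fail(p, n):
    """Probability that at least one of n independent computers fails."""
    return 1 - (1 - p) ** n

p, n = 0.01, 3  # hypothetical per-computer failure probability and system size

# Failure probability of a system that survives as long as one computer is up.
print(p_all_fail(p, n))
# Failure probability of a system that halts when any single computer crashes.
print(p_any_fail(p, n))
```

For any 0 < p < 1 and n > 1, the first value is below p and the second is above it, which is the contrast the paragraph draws between well-built and poorly built distributed systems.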
There is no silver bullet solution. Instead one has to apply a variety of engineering techniques [2]: fault-avoidance (minimize the occurrence of faults, e.g. by using a proper design process), fault-removal (remove faults before they occur, e.g. by testing), fault-evasion (predict faults by monitoring and reconfigure the system before failures occur), and fault-tolerance (mask and/or contain failures). Building a system from scratch is an expensive and time-consuming effort. To reduce the cost of building dependable distributed systems, one would choose to use commercial off-the-shelf (COTS) components whenever possible. The usage of COTS components has several potential advantages beyond minimizing costs. For example, through the widespread usage of a COTS component, design failures might be detected and fixed before the component is used in a dependable system. Custom-designed components have to mature without the widespread in-field testing of COTS components. COTS components have various potential disadvantages when used in dependable systems. For example, minimizing the time to market might lead to the release of components with inherent design faults (e.g. use of 'shortcuts' that only work most of the time). In addition, the components might be more complex than needed and, hence, potentially have more design faults than simpler components. However, given economic constraints and the ability to cope with some of the problems using fault-evasion and fault-tolerance, only for a small percentage of systems can one justify not using COTS components. Distributed systems built from current COTS components are asynchronous systems in the sense that there exists no a priori known bound on the transmission delay of messages or the execution time of processes. When designing a distributed algorithm, one would like to make sure (e.g. by testing or verification) that it is correct, i.e. satisfies its specification.
Many distributed algorithms make use of consensus (eventually all non-crashed processes have to agree on a value), leader election (a crashed leader is eventually replaced by a new leader, but at any time there is at most one leader) or a group membership detection service (a crashed process is eventually suspected to have crashed but only crashed processes are suspected). From a theoretical point of view, the service specifications given for such services are not implementable in asynchronous systems. In particular, for each implementation one can derive a counterexample in which the service violates its specification. From a practical point of view, the consensus, the leader election, and the membership detection problem are solvable in asynchronous distributed systems. In this special section, Raynal and Tronel show how to bridge this gap by showing how to solve the group membership detection problem in an asynchronous system with a negligible probability [1] of failure. The group membership detection problem is specified by a liveness condition (L) and a safety property (S): (L) if a process p crashes, then eventually every non-crashed process q has to suspect that p has crashed; and (S) if a process q suspects p, then p has indeed crashed. One can show that either (L) or (S) is implementable, but one cannot implement both (L) and (S) at the same time in an asynchronous system. In practice, one only needs to implement (L) and (S) such that the probability that (L) or (S) is violated becomes negligible. Raynal and Tronel propose and analyse a protocol that implements (L) with certainty and that can be tuned such that the probability that (S) is violated becomes negligible. Designing and implementing distributed fault-tolerant protocols for asynchronous systems is a difficult but not an impossible task. A fault-tolerant protocol has to detect and mask certain failure classes, e.g. crash failures and message omission failures.
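The tension between (L) and (S) can be seen in a generic heartbeat-and-timeout detector, sketched below. This is not the Raynal and Tronel protocol, whose details are not given here; the class name, method names and timeout value are hypothetical.

```python
class TimeoutFailureDetector:
    """Suspect a process once no heartbeat has arrived for `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, process, now):
        # Record the local time at which a heartbeat from `process` arrived.
        self.last_seen[process] = now

    def suspects(self, now):
        # A crashed process stops sending heartbeats, so it is eventually suspected (L).
        # A slow but live process may also be suspected, which would violate (S).
        return {p for p, t in self.last_seen.items() if now - t > self.timeout}

detector = TimeoutFailureDetector(timeout=2.0)
detector.heartbeat("p", now=0.0)
detector.heartbeat("q", now=0.0)
detector.heartbeat("q", now=3.0)   # q keeps sending; p has gone silent
suspected = detector.suspects(now=3.5)
```

A larger timeout makes a false suspicion of a slow process less likely, at the price of detecting real crashes later; that is the kind of tuning the paragraph refers to.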
There is a trade-off between the performance of a fault-tolerant protocol and the failure classes the protocol can tolerate. One wants to tolerate as many failure classes as needed to satisfy the stochastic requirements of the protocol [1] while still maintaining a sufficient performance. Since clients of a protocol have different requirements with respect to the performance/fault-tolerance trade-off, one would like to be able to customize protocols such that one can select an appropriate performance/fault-tolerance trade-off. In this special section Hiltunen et al describe how one can compose protocols from micro-protocols in their Cactus system. They show how a group RPC system can be tailored to the needs of a client. In particular, they show how considering additional failure classes affects the performance of a group RPC system. References [1] Cristian F 1991 Understanding fault-tolerant distributed systems Communications of ACM 34 (2) 56-78 [2] Heimerdinger W L and Weinstock C B 1992 A conceptual framework for system fault tolerance Technical Report 92-TR-33, CMU/SEI [3] Laprie J C (ed) 1992 Dependability: Basic Concepts and Terminology (Vienna: Springer)
Broad-band simulation of M7.2 earthquake on the North Tehran fault, considering non-linear soil effects

NASA Astrophysics Data System (ADS)

Majidinejad, A.; Zafarani, H.; Vahdani, S.

2018-05-01

The North Tehran fault (NTF) is known to be one of the most severe sources of seismic hazard for the city of Tehran. In this study, we provide broad-band (0-10 Hz) ground motions for the city as a consequence of a probable M7.2 earthquake on the NTF. Low-frequency motions (0-2 Hz) are provided from spectral element dynamic simulation of 17 scenario models. High-frequency (2-10 Hz) motions are calculated with a physics-based method based on S-to-S backscattering theory. Broad-band ground motions at the bedrock level show amplifications, both at low and high frequencies, due to the existence of the deep Tehran basin in the vicinity of the NTF. By employing soil profiles obtained from regional studies, the effect of shallow soil layers on broad-band ground motions is investigated by both linear and non-linear analyses. While the linear soil response overestimates ground motion prediction equations, the non-linear response predicts plausible results within one standard deviation of empirical relationships. Average Peak Ground Accelerations (PGAs) at the northern, central and southern parts of the city are estimated at about 0.93, 0.59 and 0.4 g, respectively. Increased damping caused by non-linear soil behaviour reduces the soil linear responses considerably, in particular at frequencies above 3 Hz. Non-linear deamplification reduces linear spectral accelerations up to 63 per cent at stations above soft thick sediments. By performing more general analyses, which exclude source-to-site effects on stations, a correction function is proposed for typical site classes of Tehran. Parameters for the function which reduces linear soil response in order to take into account non-linear soil deamplification are provided for various frequencies in the range of engineering interest.
In addition to fully non-linear analyses, equivalent-linear calculations were also conducted; the comparison showed that the equivalent-linear method is appropriate for large peaks and low frequencies but falls short for small to medium peaks and for motions at frequencies above 3 Hz.
Making classical ground-state spin computing fault-tolerant.

PubMed

Crosson, I J; Bacon, D; Brown, K R

2010-09-01

We examine a model of classical deterministic computing in which the ground state of the classical system is a spatial history of the computation. This model is relevant to quantum dot cellular automata as well as to recent universal adiabatic quantum computing constructions. In its most primitive form, systems constructed in this model cannot compute in an error-free manner when working at nonzero temperature. However, by exploiting a mapping between the partition function for this model and probabilistic classical circuits we are able to show that it is possible to make this model effectively error-free. We achieve this by using techniques in fault-tolerant classical computing and the result is that the system can compute effectively error-free if the temperature is below a critical temperature. We further link this model to computational complexity and show that a certain problem concerning finite temperature classical spin systems is complete for the complexity class Merlin-Arthur. This provides an interesting connection between the physical behavior of certain many-body spin systems and computational complexity.

Software-Implemented Fault Tolerance in Communications Systems

NASA Technical Reports Server (NTRS)

Gantenbein, Rex E.

1994-01-01

Software-implemented fault tolerance (SIFT) is used in many computer-based command, control, and communications (C3) systems to provide the nearly continuous availability that they require. In the communications subsystem of Space Station Alpha, SIFT algorithms are used to detect and recover from failures in the data and command link between the Station and its ground support. The paper presents a review of these algorithms and discusses how such techniques can be applied to similar systems found in applications such as manufacturing control, military communications, and programmable devices such as pacemakers. With support from the Tracking and Communication Division of NASA's Johnson Space Center, researchers at the University of Wyoming are developing a testbed for evaluating the effectiveness of these algorithms prior to their deployment. This testbed will be capable of simulating a variety of C3 system failures and recording the response of the Space Station SIFT algorithms to these failures. The design of this testbed and the applicability of the approach in other environments are described.
The core legion object model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lewis, M.; Grimshaw, A.

1996-12-31

The Legion project at the University of Virginia is an architecture for designing and building system services that provide the illusion of a single virtual machine to users, a virtual machine that provides secure shared object and shared name spaces, application adjustable fault-tolerance, improved response time, and greater throughput. Legion targets wide area assemblies of workstations, supercomputers, and parallel supercomputers. Legion tackles problems not solved by existing workstation based parallel processing tools; the system will enable fault-tolerance, wide area parallel processing, inter-operability, heterogeneity, a single global name space, protection, security, efficient scheduling, and comprehensive resource management. This paper describes the core Legion object model, which specifies the composition and functionality of Legion's core objects, those objects that cooperate to create, locate, manage, and remove objects in the Legion system. The object model facilitates a flexible extensible implementation, provides a single global name space, grants site autonomy to participating organizations, and scales to millions of sites and trillions of objects.
Post-seismic and interseismic fault creep I: model description

NASA Astrophysics Data System (ADS)

Hetland, E. A.; Simons, M.; Dunham, E. M.

2010-04-01

We present a model of localized, aseismic fault creep during the full interseismic period, including both transient and steady fault creep, in response to a sequence of imposed coseismic slip events and tectonic loading. We consider the behaviour of models with linear viscous, non-linear viscous, rate-dependent friction, and rate- and state-dependent friction fault rheologies. Both the transient post-seismic creep and the pattern of steady interseismic creep rates surrounding asperities depend on recent coseismic slip and fault rheologies. In these models, post-seismic fault creep is manifest as pulses of elevated creep rates that propagate from the coseismic slip; these pulses feature sharper fronts and are longer lived in models with rate-state friction compared to other models. With small characteristic slip distances in rate-state friction models, interseismic creep is similar to that in models with rate-dependent friction faults, except for the earliest periods of post-seismic creep. Our model can be used to constrain fault rheologies from geodetic observations in cases where the coseismic slip history is relatively well known. When only considering surface deformation over a short period of time, there are strong trade-offs between fault rheology and the details of the imposed coseismic slip. Geodetic observations over longer times following an earthquake will reduce these trade-offs, while simultaneous modelling of interseismic and post-seismic observations provides the strongest constraints on fault rheologies.
Evaluation of fault-tolerant parallel-processor architectures over long space missions

NASA Technical Reports Server (NTRS)

Johnson, Sally C.

1989-01-01

The impact of a five-year space mission environment on fault-tolerant parallel processor architectures is examined. The target application is a Strategic Defense Initiative (SDI) satellite requiring 256 parallel processors to provide the computation throughput. The reliability requirements are that the system still be operational after five years with 0.99 probability and that the probability of system failure during one-half hour of full operation be less than 10^-7. The fault tolerance features an architecture must possess to meet these reliability requirements are presented, many potential architectures are briefly evaluated, and one candidate architecture, the Charles Stark Draper Laboratory's Fault-Tolerant Parallel Processor (FTPP) is evaluated in detail. A methodology for designing a preliminary system configuration to meet the reliability and performance requirements of the mission is then presented and demonstrated by designing an FTPP configuration.
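To give a feel for the sizing question behind the five-year requirement, the following sketch computes a k-of-n reliability and the smallest number of processors that meets a target. It is not the methodology of the report: it assumes independent processors, perfect fault coverage and no repair. The 256 working processors and the 0.99 target come from the abstract; the per-processor five-year reliability of 0.9 and the function names are hypothetical.

```python
from math import comb

def k_of_n_reliability(k, n, r):
    """Probability that at least k of n independent units, each with reliability r, survive."""
    return sum(comb(n, i) * r**i * (1 - r)**(n - i) for i in range(k, n + 1))

def units_needed(k, r, target, limit=10_000):
    """Smallest n for which the k-of-n reliability reaches target (None if limit is hit)."""
    for n in range(k, limit + 1):
        if k_of_n_reliability(k, n, r) >= target:
            return n
    return None

# 256 and 0.99 are from the abstract; r=0.9 is a hypothetical value.
n = units_needed(k=256, r=0.9, target=0.99)
print(n, k_of_n_reliability(256, n, 0.9))
```

Imperfect coverage, which other records in this listing analyse, would lower these figures, so the result is best read as an optimistic bound under the stated assumptions.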
Measurement and analysis of operating system fault tolerance

NASA Technical Reports Server (NTRS)

Lee, I.; Tang, D.; Iyer, R. K.

1992-01-01

This paper demonstrates a methodology to model and evaluate the fault tolerance characteristics of operational software. The methodology is illustrated through case studies on three different operating systems: the Tandem GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Measurements are made on these systems for substantial periods to collect software error and recovery data. In addition to investigating basic dependability characteristics such as major software problems and error distributions, we develop two levels of models to describe error and recovery processes inside an operating system and on multiple instances of an operating system running in a distributed environment. Based on the models, reward analysis is conducted to evaluate the loss of service due to software errors and the effect of the fault-tolerance techniques implemented in the systems. Software error correlation in multicomputer systems is also investigated.
A DICOM-based 2nd generation Molecular Imaging Data Grid implementing the IHE XDS-i integration profile.

PubMed

Lee, Jasper; Zhang, Jianguo; Park, Ryan; Dagliyan, Grant; Liu, Brent; Huang, H K

2012-07-01

A Molecular Imaging Data Grid (MIDG) was developed to address current informatics challenges in archival, sharing, search, and distribution of preclinical imaging studies between animal imaging facilities and investigator sites. This manuscript presents a 2nd generation MIDG replacing the Globus Toolkit with a new system architecture that implements the IHE XDS-i integration profile. Implementation and evaluation were conducted using a 3-site interdisciplinary test-bed at the University of Southern California. The 2nd generation MIDG design architecture replaces the initial design's Globus Toolkit with dedicated web services and XML-based messaging for dedicated management and delivery of multi-modality DICOM imaging datasets. The Cross-enterprise Document Sharing for Imaging (XDS-i) integration profile from the field of enterprise radiology informatics was adopted into the MIDG design because streamlined image registration, management, and distribution dataflow are likewise needed in preclinical imaging informatics systems as in enterprise PACS application. Implementation of the MIDG is demonstrated at the University of Southern California Molecular Imaging Center (MIC) and two other sites with specified hardware, software, and network bandwidth. Evaluation of the MIDG involves data upload, download, and fault-tolerance testing scenarios using multi-modality animal imaging datasets collected at the USC Molecular Imaging Center. The upload, download, and fault-tolerance tests of the MIDG were performed multiple times using 12 collected animal study datasets. Upload and download times demonstrated reproducibility and improved real-world performance. Fault-tolerance tests showed that automated failover between Grid Node Servers has minimal impact on normal download times. 
Building upon the 1st generation concepts and experiences, the 2nd generation MIDG system improves accessibility of disparate animal-model molecular imaging datasets to users outside a molecular imaging facility's LAN using a new architecture, dataflow, and dedicated DICOM-based management web services. Productivity and efficiency of preclinical research for translational sciences investigators has been further streamlined for multi-center study data registration, management, and distribution.
Study of a unified hardware and software fault-tolerant architecture

NASA Technical Reports Server (NTRS)

Lala, Jaynarayan; Alger, Linda; Friend, Steven; Greeley, Gregory; Sacco, Stephen; Adams, Stuart

1989-01-01

A unified architectural concept, called the Fault Tolerant Processor Attached Processor (FTP-AP), that can tolerate hardware as well as software faults is proposed for applications requiring ultrareliable computation capability. An emulation of the FTP-AP architecture, consisting of a breadboard Motorola 68010-based quadruply redundant Fault Tolerant Processor, four VAX 750s as attached processors, and four versions of a transport aircraft yaw damper control law, is used as a testbed in the AIRLAB to examine a number of critical issues. Solutions of several basic problems associated with N-Version software are proposed and implemented on the testbed. This includes a confidence voter to resolve coincident errors in N-Version software. A reliability model of N-Version software that is based upon the recent understanding of software failure mechanisms is also developed. The basic FTP-AP architectural concept appears suitable for hosting N-Version application software while at the same time tolerating hardware failures. Architectural enhancements for greater efficiency, software reliability modeling, and N-Version issues that merit further research are identified.
Software dependability in the Tandem GUARDIAN system

NASA Technical Reports Server (NTRS)

Lee, Inhwan; Iyer, Ravishankar K.

1995-01-01

Based on extensive field failure data for Tandem's GUARDIAN operating system this paper discusses evaluation of the dependability of operational software. Software faults considered are major defects that result in processor failures and invoke backup processes to take over. The paper categorizes the underlying causes of software failures and evaluates the effectiveness of the process pair technique in tolerating software faults. A model to describe the impact of software faults on the reliability of an overall system is proposed. The model is used to evaluate the significance of key factors that determine software dependability and to identify areas for improvement. An analysis of the data shows that about 77% of processor failures that are initially considered due to software are confirmed as software problems. The analysis shows that the use of process pairs to provide checkpointing and restart (originally intended for tolerating hardware faults) allows the system to tolerate about 75% of reported software faults that result in processor failures. The loose coupling between processors, which results in the backup execution (the processor state and the sequence of events) being different from the original execution, is a major reason for the measured software fault tolerance. Over two-thirds (72%) of measured software failures are recurrences of previously reported faults. Modeling, based on the data, shows that, in addition to reducing the number of software faults, software dependability can be enhanced by reducing the recurrence rate.
Fault-tolerant, high-level quantum circuits: form, compilation and description

NASA Astrophysics Data System (ADS)

Paler, Alexandru; Polian, Ilia; Nemoto, Kae; Devitt, Simon J.

2017-06-01

Fault-tolerant quantum error correction is a necessity for any quantum architecture destined to tackle interesting, large-scale problems. Its theoretical formalism has been well founded for nearly two decades. However, we still do not have an appropriate compiler to produce a fault-tolerant, error-corrected description from a higher-level quantum circuit for state-of-the-art hardware models. There are many technical hurdles, including dynamic circuit constructions that occur when constructing fault-tolerant circuits with commonly used error correcting codes. We introduce a package that converts high-level quantum circuits consisting of commonly used gates into a form employing all decompositions and ancillary protocols needed for fault-tolerant error correction. We call this form the (I)initialisation, (C)NOT, (M)measurement form (ICM); it consists of an initialisation layer of qubits into one of four distinct states, a massive, deterministic array of CNOT operations and a series of time-ordered X- or Z-basis measurements. The form allows a more flexible approach towards circuit optimisation. At the same time, the package outputs a standard circuit or a canonical geometric description which is a necessity for operating current state-of-the-art hardware architectures using topological quantum codes.
Software Fault Tolerance: A Tutorial

NASA Technical Reports Server (NTRS)

Torres-Pomales, Wilfredo

2000-01-01

Because of our present inability to produce error-free software, software fault tolerance is and will continue to be an important consideration in software systems. The root cause of software design errors is the complexity of the systems. Compounding the problems in building correct software is the difficulty in assessing the correctness of software for highly complex systems. After a brief overview of the software development processes, we note how hard-to-detect design faults are likely to be introduced during development and how software faults tend to be state-dependent and activated by particular input sequences. Although component reliability is an important quality measure for system level analysis, software reliability is hard to characterize and the use of post-verification reliability estimates remains a controversial issue. For some applications software safety is more important than reliability, and fault tolerance techniques used in those applications are aimed at preventing catastrophes. Single version software fault tolerance techniques discussed include system structuring and closure, atomic actions, inline fault detection, exception handling, and others. Multiversion techniques are based on the assumption that software built differently should fail differently and thus, if one of the redundant versions fails, it is expected that at least one of the other versions will provide an acceptable output. Recovery blocks, N-version programming, and other multiversion techniques are reviewed.
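The recovery block technique named in this abstract can be sketched as follows: alternates are tried in order, each on a fresh copy of the saved state, until one produces a result that passes an acceptance test. This is a minimal illustration, not code from the tutorial; the function names, the square-root task and the acceptance test are hypothetical.

```python
def recovery_block(alternates, acceptance_test, state):
    """Try each alternate on a copy of the state until one passes the acceptance test."""
    for alternate in alternates:
        checkpoint = dict(state)          # fresh copy so a failed attempt leaves no trace
        try:
            result = alternate(checkpoint)
        except Exception:
            continue                      # treat an exception as a failed alternate
        if acceptance_test(result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")

# Hypothetical alternates for computing an integer square root.
def primary(s):
    return s["x"] // 2                    # deliberately faulty version

def secondary(s):
    return int(s["x"] ** 0.5)

root = recovery_block(
    [primary, secondary],
    acceptance_test=lambda r: r * r <= 49 < (r + 1) ** 2,
    state={"x": 49},
)
```

N-version programming differs in that all versions run and a voter compares their outputs, rather than running alternates one at a time behind an acceptance test.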
Compact, Low-Force, Low-Noise Linear Actuator

NASA Technical Reports Server (NTRS)

Badescu, Mircea; Sherrit, Stewart; Bar-Cohen, Yoseph

2012-01-01

Actuators are critical to all the robotic and manipulation mechanisms that are used in current and future NASA missions, and are also needed for many other industrial, aeronautical, and space activities. There are many types of actuators that were designed to operate as linear or rotary motors, but there is still a need for low-force, low-noise linear actuators for specialized applications, and the disclosed mechanism addresses this need. A simpler implementation of a rotary actuator was developed where the end effector controls the motion of a brush for cleaning a thermal sensor. The mechanism uses an SMA (shape-memory alloy) wire for low force and low noise. The linear implementation of the actuator incorporates a set of springs and mechanical hard-stops for resetting and fault tolerance to mechanical resistance. The actuator can be designed to work in a pull or push mode, or both. Depending on the volume envelope criteria, the actuator can be configured for scaling its volume down to 4 × 2 × 1 cm³. The actuator design has an inherent fault tolerance to mechanical resistance. The actuator has the flexibility of being designed for both linear and rotary motion. A specific configuration was designed and analyzed where fault-tolerant features have been implemented. In this configuration, an externally applied force larger than the design force does not damage the active components of the actuator. The actuator housing can be configured and produced using cost-effective methods such as injection molding, or alternatively, its components can be mounted directly on a small circuit board. The actuator is driven by an SMA (NiTi) wire as its primary active element, and it requires energy on the order of 20 W·s (J) per cycle. Electrical connections to points A and B are used to apply electrical power in the resistive NiTi wire, causing a phase change that contracts the wire on the order of 5%.
The actuation period is of the order of a second for generating the stroke, and 4 to 10 seconds for resetting. Thus, this design allows the actuator to work at a frequency of up to 0.1 Hz. The actuator does not make use of the whole range of motion of the SMA material, allowing for large margins on the mechanical parameters of the design. The efficiency of the actuator is of the order of 10%, including the margins. The average dissipated power while driving at full speed is of the order of 1 W, and can be scaled down linearly if the rate of cycling is reduced. This design produces an extremely quiet actuator; it can generate a force greater than 2 N and a stroke greater than 1 cm. The operational duration of SMA materials is of the order of millions of cycles with some reduced stroke over a wide temperature range up to 150 C.
77 FR 39353 - Wassenaar Arrangement 2011 Plenary Agreements Implementation: Commerce Control List, Definitions...

Federal Register 2010, 2011, 2012, 2013, 2014

2012-07-02

... Controls for Conventional Arms and Dual-Use Goods and Technologies is a group of 41 like-minded states... specified and packaged as medical products, are not subject to control. ECCN 1C008 (Non-Fluorinated... technology and computer system design have made control of fault tolerance neither warranted nor feasible...
Application of Fault-Tolerant Computing For Spacecraft Using Commercial-Off-The-Shelf Microprocessors

DTIC Science & Technology

2000-06-01

real-time operating system and design of a human-computer interface (HCI) for a triple modular redundant (TMR) fault-tolerant microprocessor for use in space-based applications. One disadvantage of using COTS hardware components is their susceptibility to the radiation effects present in the space environment, specifically radiation-induced single-event upsets (SEUs). In the event of an SEU, a fault-tolerant system can mitigate the effects of the upset and continue to process from the last known correct system state. The TMR basic hardware
The cost of software fault tolerance

NASA Technical Reports Server (NTRS)

Migneault, G. E.

1982-01-01

The proposed use of software fault tolerance techniques as a means of reducing software costs in avionics and as a means of addressing the issue of system unreliability due to faults in software is examined. A model is developed to provide a view of the relationships among cost, redundancy, and reliability which suggests strategies for software development and maintenance which are not conventional.
Computer-Aided Reliability Estimation

NASA Technical Reports Server (NTRS)

Bavuso, S. J.; Stiffler, J. J.; Bryant, L. A.; Petersen, P. L.

1986-01-01

CARE III (Computer-Aided Reliability Estimation, Third Generation) helps estimate reliability of complex, redundant, fault-tolerant systems. Program specifically designed for evaluation of fault-tolerant avionics systems. However, CARE III general enough for use in evaluation of other systems as well.
Fault-Tolerant Control For A Robotic Inspection System

NASA Technical Reports Server (NTRS)

Tso, Kam Sing

1995-01-01

Report describes first phase of continuing program of research on fault-tolerant control subsystem of telerobotic visual-inspection system. Goal of program to develop robotic system for remotely controlled visual inspection of structures in outer space.
Characterization of the faulted behavior of digital computers and fault tolerant systems

NASA Technical Reports Server (NTRS)

Bavuso, Salvatore J.; Miner, Paul S.

1989-01-01

A development status evaluation is presented for efforts conducted at NASA-Langley since 1977, toward the characterization of the latent fault in digital fault-tolerant systems. Attention is given to the practical, high speed, generalized gate-level logic system simulator developed, as well as to the validation methodology used for the simulator, on the basis of faultable software and hardware simulations employing a prototype MIL-STD-1750A processor. After validation, latency tests will be performed.
Fly-By-Light/Power-By-Wire Fault-Tolerant Fiber-Optic Backplane

NASA Technical Reports Server (NTRS)

Malekpour, Mahyar R.

2002-01-01

The design and development of a fault-tolerant fiber-optic backplane to demonstrate the feasibility of such an architecture is presented. The simulation results of test cases on the backplane in the event of induced faults are presented, and the fault recovery capability of the architecture is demonstrated. The architecture was designed, developed, and implemented using the Very High Speed Integrated Circuits (VHSIC) Hardware Description Language (VHDL). The architecture was synthesized and implemented in hardware using Field Programmable Gate Arrays (FPGA) on multiple prototype boards.
Redundant and fault-tolerant algorithms for real-time measurement and control systems for weapon equipment.

PubMed

Li, Dan; Hu, Xiaoguang

2017-03-01

Because of the high availability requirements from weapon equipment, an in-depth study has been conducted on the real-time fault-tolerance of the widely applied Compact PCI (CPCI) bus measurement and control system. A redundancy design method that uses heartbeat detection to connect the primary and alternate devices has been developed. To address the low successful execution rate and relatively large waste of time slices in the primary version of the task software, an improved algorithm for real-time fault-tolerant scheduling is proposed based on the Basic Checking available time Elimination idle time (BCE) algorithm, applying a single-neuron self-adaptive proportion sum differential (PSD) controller. The experimental validation results indicate that this system has excellent redundancy and fault-tolerance, and the newly developed method can effectively improve the system availability. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
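The heartbeat-based primary/alternate arrangement mentioned in the abstract above can be illustrated with a small timeout monitor. This is a sketch under stated assumptions, not the authors' CPCI design: the class name `HeartbeatMonitor`, the millisecond timestamps, and the latched (no fail-back) policy are all hypothetical choices made for illustration.

```python
class HeartbeatMonitor:
    """Promote the standby device when the primary's heartbeat goes stale."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms
        self.last_beat_ms = 0
        self.active = "primary"

    def beat(self, now_ms):
        """Record a heartbeat received from the primary at time now_ms."""
        self.last_beat_ms = now_ms

    def poll(self, now_ms):
        """Return which device should be active at time now_ms."""
        stale = now_ms - self.last_beat_ms > self.timeout_ms
        if self.active == "primary" and stale:
            # Assumed policy: failover is latched, with no automatic fail-back.
            self.active = "standby"
        return self.active
```

Passing the clock in as an argument keeps the monitor deterministic and easy to test; a real system would read a hardware timer instead.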

Multiple Embedded Processors for Fault-Tolerant Computing

NASA Technical Reports Server (NTRS)

Bolotin, Gary; Watson, Robert; Katanyoutanant, Sunant; Burke, Gary; Wang, Mandy

2005-01-01

A fault-tolerant computer architecture has been conceived in an effort to reduce vulnerability to single-event upsets (spurious bit flips caused by impingement of energetic ionizing particles or photons). As in some prior fault-tolerant architectures, the redundancy needed for fault tolerance is obtained by use of multiple processors in one computer. Unlike prior architectures, the multiple processors are embedded in a single field-programmable gate array (FPGA). What makes this new approach practical is the recent commercial availability of FPGAs that are capable of having multiple embedded processors. A working prototype (see figure) consists of two embedded IBM PowerPC 405 processor cores and a comparator built on a Xilinx Virtex-II Pro FPGA. This relatively simple instantiation of the architecture implements an error-detection scheme. A planned future version, incorporating four processors and two comparators, would correct some errors in addition to detecting them.
Analysis of fault-tolerant neurocontrol architectures

NASA Technical Reports Server (NTRS)

Troudet, T.; Merrill, W.

1992-01-01

The fault-tolerance of analog parallel distributed implementations of a multivariable aircraft neurocontroller is analyzed by simulating weight and neuron failures in a simplified scheme of analog processing based on the functional architecture of the ETANN chip (Electrically Trainable Artificial Neural Network). The neural information processing is found to be only partially distributed throughout the set of weights of the neurocontroller synthesized with the backpropagation algorithm. Although the degree of distribution of the neural processing, and consequently the fault-tolerance of the neurocontroller, could be enhanced using Locally Distributed Weight and Neuron Approaches, a satisfactory level of fault-tolerance could only be obtained by retraining the degraded VLSI neurocontroller. The possibility of maintaining neurocontrol performance and stability in the presence of single weight or neuron failures was demonstrated through an automated retraining procedure of the neurocontroller based on a pre-programmed choice and sequence of the training parameters.
Machine-checked proofs of the design and implementation of a fault-tolerant circuit

NASA Technical Reports Server (NTRS)

Bevier, William R.; Young, William D.

1990-01-01

A formally verified implementation of the 'oral messages' algorithm of Pease, Shostak, and Lamport is described. An abstract implementation of the algorithm is verified to achieve interactive consistency in the presence of faults. This abstract characterization is then mapped down to a hardware level implementation which inherits the fault-tolerant characteristics of the abstract version. All steps in the proof were checked with the Boyer-Moore theorem prover. A significant result is the demonstration of a fault-tolerant device that is formally specified and whose implementation is proved correct with respect to this specification. A significant simplifying assumption is that the redundant processors behave synchronously. A mechanically checked proof that the oral messages algorithm is 'optimal', in the sense that no algorithm which achieves agreement via similar message passing can tolerate a larger proportion of faulty processors, is also described.
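The recursive structure of the oral messages algorithm OM(m) can be simulated in a few lines. The sketch below is a plain Python simulation, not the verified hardware design the abstract describes; the fault model in `send` (a faulty node lies according to the receiver's parity), the `DEFAULT` order, and all function names are assumptions made for illustration.

```python
from collections import Counter

DEFAULT = "retreat"  # fallback order when no value wins the vote


def send(sender, receiver, value, faulty):
    """Deliver a message; a faulty sender lies by receiver parity (assumed fault model)."""
    if sender in faulty:
        return "attack" if receiver % 2 == 0 else "retreat"
    return value


def majority(values):
    """Most common value, or DEFAULT when the top two counts tie."""
    ranked = Counter(values).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return DEFAULT
    return ranked[0][0]


def om(m, commander, lieutenants, value, faulty):
    """Simulate OM(m); return {lieutenant: value that lieutenant settles on}."""
    # Step 1: the commander sends its value to every lieutenant.
    received = {lt: send(commander, lt, value, faulty) for lt in lieutenants}
    if m == 0:
        return received
    # Step 2: each lieutenant relays what it received to the others via OM(m - 1).
    relayed = {lt: [] for lt in lieutenants}
    for i in lieutenants:
        others = [lt for lt in lieutenants if lt != i]
        for j, v in om(m - 1, i, others, received[i], faulty).items():
            relayed[j].append(v)
    # Step 3: each lieutenant votes over its own value and the relayed ones.
    return {j: majority([received[j]] + relayed[j]) for j in lieutenants}
```

With four nodes and one traitor, `om(1, 0, [1, 2, 3], "attack", faulty={3})` leaves the loyal lieutenants 1 and 2 agreeing on "attack". With only three nodes, `om(1, 0, [1, 2], "attack", faulty={2})` leaves loyal lieutenant 1 unable to follow its loyal commander, which illustrates why more than 3m nodes are needed to tolerate m faults.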
Design and experimental validation for direct-drive fault-tolerant permanent-magnet vernier machines.

PubMed

Liu, Guohai; Yang, Junqin; Chen, Ming; Chen, Qian

2014-01-01

A fault-tolerant permanent-magnet vernier (FT-PMV) machine is designed for direct-drive applications, incorporating the merits of high torque density and high reliability. Based on the so-called magnetic gearing effect, PMV machines have the ability of high torque density by introducing the flux-modulation poles (FMPs). This paper investigates the fault-tolerant characteristic of PMV machines and provides a design method, which is able to not only meet the fault-tolerant requirements but also keep the ability of high torque density. The operation principle of the proposed machine has been analyzed. The design process and optimization are presented specifically, such as the combination of slots and poles, the winding distribution, and the dimensions of PMs and teeth. By using the time-stepping finite element method (TS-FEM), the machine performances are evaluated. Finally, the FT-PMV machine is manufactured, and the experimental results are presented to validate the theoretical analysis.
Slip Distribution of the 2008 Iwate-Miyagi Nairiku, Japan, Earthquake Inverted from PALSAR Data

NASA Astrophysics Data System (ADS)

Fukahata, Y.; Fukushima, Y.; Arimoto, M.

2008-12-01

On 14 June 2008, the Iwate-Miyagi Nairiku earthquake struck northeast Japan, where active seismicity has been observed under east-west compressional stress fields. According to the Japan Meteorological Agency, the magnitude and the hypocenter depth of the earthquake are 7.2 and 8 km, respectively. The earthquake is considered to have occurred on a west dipping reverse fault with a roughly north-south strike. The earthquake caused significant surface displacements, which were detected by PALSAR, a Synthetic Aperture Radar (SAR) onboard the Advanced Land Observing Satellite (ALOS) employed by the Japan Aerospace Exploration Agency (JAXA). Several pairs of PALSAR images are available to measure the coseismic displacements. InSAR data show up to 1 m of line-of-sight displacements both for ascending and descending paths. The pixel matching method was also used to obtain range and azimuth offset data around the epicentral region, where displacements were too large for the interferometric technique (see Fukushima (this meeting) for details). We inverted the obtained SAR interferometric and pixel matching data to estimate slip distribution on the fault. Since the geometry of the fault is not well known, the inverse problem is non-linear. If the fault surface is assumed to be a flat plane, however, the non-linearity is weak. Following the method of Fukahata & Wright (2008), we resolved the weak non-linearity based on ABIC (Akaike's Bayesian Information Criterion). That is to say, the fault parameters (e.g. strike, dip and location) as well as the weight of the smoothing parameter were objectively determined by minimizing ABIC. We first estimated slip distribution by assuming a pure dip slip for simplicity, since it has been reported that the dip slip component is dominant. Then, the optimal fault geometry was dip 26 and strike 203 degrees with the location passing through (140.90E, 38.97N).
The maximum slip was more than 8 m and most slips concentrated at shallow depths (< 4 km). Without fixing the rake, a large slip area with the maximum slip of about 8 m concentrated in the shallow region was obtained again.
Gyro-based Maximum-Likelihood Thruster Fault Detection and Identification

NASA Technical Reports Server (NTRS)

Wilson, Edward; Lages, Chris; Mah, Robert; Clancy, Daniel (Technical Monitor)

2002-01-01

When building smaller, less expensive spacecraft, there is a need for intelligent fault tolerance vs. increased hardware redundancy. If fault tolerance can be achieved using existing navigation sensors, cost and vehicle complexity can be reduced. A maximum likelihood-based approach to thruster fault detection and identification (FDI) for spacecraft is developed here and applied in simulation to the X-38 space vehicle. The system uses only gyro signals to detect and identify hard, abrupt, single and multiple jet on- and off-failures. Faults are detected within one second and identified within one to five seconds,
Implementation of a Helicopter Flight Simulator with Individual Blade Control

NASA Astrophysics Data System (ADS)

Zinchiak, Andrew G.

2011-12-01

Nearly all modern helicopters are designed with a swashplate-based system for control of the main rotor blades. However, the swashplate-based approach does not provide the level of redundancy necessary to cope with abnormal actuator conditions. For example, if an actuator fails (becomes locked) on the main rotor, the cyclic inputs are consequently fixed and the helicopter may become stuck in a flight maneuver. This can obviously be seen as a catastrophic failure, and would likely lead to a crash. These types of failures can be overcome with the application of individual blade control (IBC). IBC is achieved using the blade pitch control method, which provides complete authority of the aerodynamic characteristics of each rotor blade at any given time by replacing the normally rigid pitch links between the swashplate and the pitch horn of the blade with hydraulic or electronic actuators. Thus, IBC can provide the redundancy necessary for subsystem failure accommodation. In this research effort, a simulation environment is developed to investigate the potential of the IBC main rotor configuration for fault-tolerant control. To examine the applications of IBC to failure scenarios and fault-tolerant controls, a conventional, swashplate-based linear model is first developed for hover and forward flight scenarios based on the UH-60 Black Hawk helicopter. The linear modeling techniques for the swashplate-based helicopter are then adapted and expanded to include IBC. Using these modified techniques, an IBC based mathematical model of the UH-60 helicopter is developed for the purposes of simulation and analysis. The methodology can be used to model and implement a different aircraft if geometric, gravimetric, and general aerodynamic data are available. Without the kinetic restrictions of the swashplate, the IBC model effectively decouples the cyclic control inputs between different blades. 
Simulations of the IBC model prove that the primary control functions can be manually reconfigured after local actuator failures are initiated, thus preventing a catastrophic failure or crash. Furthermore, this simulator promises to be a useful tool for the design, testing, and analysis of fault-tolerant control laws.
Fault-tolerant logical gates in quantum error-correcting codes

NASA Astrophysics Data System (ADS)

Pastawski, Fernando; Yoshida, Beni

2015-01-01

Recently, S. Bravyi and R. König [Phys. Rev. Lett. 110, 170503 (2013), 10.1103/PhysRevLett.110.170503] have shown that there is a trade-off between fault-tolerantly implementable logical gates and geometric locality of stabilizer codes. They consider locality-preserving operations which are implemented by a constant-depth geometrically local circuit and are thus fault tolerant by construction. In particular, they show that, for local stabilizer codes in $D$ spatial dimensions, locality-preserving gates are restricted to a set of unitary gates known as the $D$th level of the Clifford hierarchy. In this paper, we explore this idea further by providing several extensions and applications of their characterization to qubit stabilizer and subsystem codes. First, we present a no-go theorem for self-correcting quantum memory. Namely, we prove that a three-dimensional stabilizer Hamiltonian with a locality-preserving implementation of a non-Clifford gate cannot have a macroscopic energy barrier. This result implies that non-Clifford gates do not admit such implementations in Haah's cubic code and Michnicki's welded code. Second, we prove that the code distance of a $D$-dimensional local stabilizer code with a nontrivial locality-preserving $m$th-level Clifford logical gate is upper bounded by $O(L^{D+1-m})$. For codes with non-Clifford gates ($m > 2$), this improves the previous best bound by S. Bravyi and B. Terhal [New J. Phys. 11, 043029 (2009), 10.1088/1367-2630/11/4/043029]. Topological color codes, introduced by H. Bombin and M. A. Martin-Delgado [Phys. Rev. Lett. 97, 180501 (2006), 10.1103/PhysRevLett.97.180501; Phys. Rev. Lett. 98, 160502 (2007), 10.1103/PhysRevLett.98.160502; Phys. Rev. B 75, 075103 (2007), 10.1103/PhysRevB.75.075103], saturate the bound for $m = D$. Third, we prove that the qubit erasure threshold for codes with a nontrivial transversal $m$th-level Clifford logical gate is upper bounded by $1/m$.
This implies that no family of fault-tolerant codes with transversal gates in increasing level of the Clifford hierarchy may exist. This result applies to arbitrary stabilizer and subsystem codes and is not restricted to geometrically local codes. Fourth, we extend the result of Bravyi and König to subsystem codes. Unlike stabilizer codes, the so-called union lemma does not apply to subsystem codes. This problem is avoided by assuming the presence of an error threshold in a subsystem code, and a conclusion analogous to that of Bravyi and König is recovered.
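As a worked reading of the two bounds stated in the abstract above, obtained by substitution only: here $d$ denotes the code distance and $p_{\mathrm{erasure}}$ the erasure threshold, and $L$ is assumed to be the linear system size, which the abstract leaves implicit.

```latex
% Distance bound for a D-dimensional local stabilizer code with a
% locality-preserving m-th level Clifford logical gate:
d \le O\!\left(L^{D+1-m}\right)
% Substituting D = 3:
%   m = 2 (Clifford gate):      d \le O(L^{2})
%   m = 3 (non-Clifford gate):  d \le O(L)
% Setting m = D in general gives d \le O(L), the case the abstract says
% topological color codes saturate.

% Erasure-threshold bound for a transversal m-th level gate:
p_{\mathrm{erasure}} \le \frac{1}{m},
\qquad
\lim_{m \to \infty} \frac{1}{m} = 0
% so a code family whose transversal gates climb the Clifford hierarchy
% would have a threshold tending to zero.
```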
Gait planning for a quadruped robot with one faulty actuator

NASA Astrophysics Data System (ADS)

Chen, Xianbao; Gao, Feng; Qi, Chenkun; Tian, Xinghua

2015-01-01

Fault tolerance is essential for quadruped robots when they work in remote areas or hazardous environments. Many fault-tolerant gait planning methods proposed in the past decade constrain more degrees of freedom (DOFs) of a robot than necessary. Thus a novel method to realize fault-tolerant walking is proposed. The mobility of the robot is analyzed first by using screw theory. The result shows that the translation of the center of body (CoB) can be kept with one faulty actuator if the rotations of the body are controlled. Thus the DOFs of the robot body are divided into two parts: the translation of the CoB and the rotation of the body. The kinematic model of the whole robot is built, and an algorithm is developed to actively control the body orientations at the velocity level so that the planned CoB trajectory can be realized in spite of the constraint of the faulty actuator. This gait has a generation sequence similar to that of the normal gait and can be applied to the robot at any position. Simulations and experiments of the fault-tolerant gait with one faulty actuator are carried out. The CoB errors and the body rotation angles are measured. Compared with the traditional fault-tolerant gait, they are reduced by at least 50%. A fault-tolerant gait planning algorithm is presented, which not only realizes the walking of a quadruped robot with a faulty actuator, but also efficiently improves the walking performance by taking full advantage of the remaining operational actuators, according to the results of the simulations and experiments.
Fault detection and fault tolerance in robotics

NASA Technical Reports Server (NTRS)

Visinsky, Monica; Walker, Ian D.; Cavallaro, Joseph R.

1992-01-01

Robots are used in inaccessible or hazardous environments in order to alleviate some of the time, cost and risk involved in preparing men to endure these conditions. In order to perform their expected tasks, the robots are often quite complex, thus increasing their potential for failures. If men must be sent into these environments to repair each component failure in the robot, the advantages of using the robot are quickly lost. Fault tolerant robots are needed which can effectively cope with failures and continue their tasks until repairs can be realistically scheduled. Before fault tolerant capabilities can be created, methods of detecting and pinpointing failures must be perfected. This paper develops a basic fault tree analysis of a robot in order to obtain a better understanding of where failures can occur and how they contribute to other failures in the robot. The resulting failure flow chart can also be used to analyze the resiliency of the robot in the presence of specific faults. By simulating robot failures and fault detection schemes, the problems involved in detecting failures for robots are explored in more depth.
SFTP: A Secure and Fault-Tolerant Paradigm against Blackhole Attack in MANET

NASA Astrophysics Data System (ADS)

KumarRout, Jitendra; Kumar Bhoi, Sourav; Kumar Panda, Sanjaya

2013-02-01

Security in MANETs is a challenging problem. MANETs are vulnerable to passive and active attacks because of limited resources and the lack of a centralized authority. A Blackhole attack is a network-layer attack that degrades network performance by dropping packets. In this paper, we propose a Secure Fault-Tolerant Paradigm (SFTP) which checks for the Blackhole attack in the network. The three phases of the SFTP algorithm are design of the coverage area, a Network Connection algorithm to design a fault-tolerant model, and a Route Discovery algorithm to discover the route and deliver data from source to destination. SFTP gives better network performance by making the network fault free.
Using Performance Tools to Support Experiments in HPC Resilience

DOE Office of Scientific and Technical Information (OSTI.GOV)

Naughton, III, Thomas J; Boehm, Swen; Engelmann, Christian

2014-01-01

The high performance computing (HPC) community is working to address fault tolerance and resilience concerns for current and future large scale computing platforms. This is driving enhancements in the programming environments, specifically research on enhancing message passing libraries to support fault tolerant computing capabilities. The community has also recognized that tools for resilience experimentation are greatly lacking. However, we argue that there are several parallels between performance tools and resilience tools. As such, we believe the rich set of HPC performance-focused tools can be extended (repurposed) to benefit the resilience community. In this paper, we describe the initial motivation to leverage standard HPC performance analysis techniques to aid in developing diagnostic tools to assist fault tolerance experiments for HPC applications. These diagnosis procedures help to provide context for the system when the errors (failures) occurred. We describe our initial work in leveraging an MPI performance trace tool to assist in providing global context during fault injection experiments. Such tools will assist the HPC resilience community as they extend existing and new application codes to support fault tolerance.
Adaptive Fault-Tolerant Control of Uncertain Nonlinear Large-Scale Systems With Unknown Dead Zone.

PubMed

Chen, Mou; Tao, Gang

2016-08-01

In this paper, an adaptive neural fault-tolerant control scheme is proposed and analyzed for a class of uncertain nonlinear large-scale systems with unknown dead zone and external disturbances. To tackle the unknown nonlinear interaction functions in the large-scale system, the radial basis function neural network (RBFNN) is employed to approximate them. To further handle the unknown approximation errors and the effects of the unknown dead zone and external disturbances, integrated as the compounded disturbances, the corresponding disturbance observers are developed for their estimations. Based on the outputs of the RBFNN and the disturbance observer, the adaptive neural fault-tolerant control scheme is designed for uncertain nonlinear large-scale systems by using a decentralized backstepping technique. The closed-loop stability of the adaptive control system is rigorously proved via Lyapunov analysis and the satisfactory tracking performance is achieved under the integrated effects of unknown dead zone, actuator fault, and unknown external disturbances. Simulation results of a mass-spring-damper system are given to illustrate the effectiveness of the proposed adaptive neural fault-tolerant control scheme for uncertain nonlinear large-scale systems.
A Primer on Architectural Level Fault Tolerance

NASA Technical Reports Server (NTRS)

Butler, Ricky W.

2008-01-01

This paper introduces the fundamental concepts of fault tolerant computing. Key topics covered are voting, fault detection, clock synchronization, Byzantine Agreement, diagnosis, and reliability analysis. Low level mechanisms such as Hamming codes or low level communications protocols are not covered. The paper is tutorial in nature and does not cover any topic in detail. The focus is on rationale and approach rather than detailed exposition.
Reliability model derivation of a fault-tolerant, dual, spare-switching, digital computer system

NASA Technical Reports Server (NTRS)

1974-01-01

A computer based reliability projection aid, tailored specifically for application in the design of fault-tolerant computer systems, is described. Its more pronounced characteristics include the facility for modeling systems with two distinct operational modes, measuring the effect of both permanent and transient faults, and calculating conditional system coverage factors. The underlying conceptual principles, mathematical models, and computer program implementation are presented.
Fault-tolerant arithmetic via time-shared TMR

NASA Astrophysics Data System (ADS)

Swartzlander, Earl E.

1999-11-01

Fault tolerance is increasingly important as society has come to depend on computers for more and more aspects of daily life. The current concern about the Y2K problems indicates just how much we depend on accurate computers. This paper describes work on time-shared TMR, a technique which is used to provide arithmetic operations that produce correct results in spite of circuit faults.
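The abstract does not spell out how time-shared TMR works, so the following is only one plausible reading, labeled as an assumption: three narrow adders process the same operand segment in each time step, a bitwise voter masks a single faulty adder, and the voted carry feeds the next segment. The names `vote`, `narrow_adder` and `tstmr_add`, and the 4-bit segment width, are hypothetical.

```python
def vote(a, b, c):
    """Bitwise 2-of-3 majority of three integers."""
    return (a & b) | (a & c) | (b & c)


def narrow_adder(a, b, carry_in):
    """Fault-free narrow adder; the result includes the carry-out bit."""
    return a + b + carry_in


def tstmr_add(x, y, units=None, seg_bits=4, segments=3):
    """Add unsigned ints one segment per time step, voting on every segment."""
    if units is None:
        units = [narrow_adder, narrow_adder, narrow_adder]
    mask = (1 << seg_bits) - 1
    carry = 0
    result = 0
    for k in range(segments):
        xs = (x >> (k * seg_bits)) & mask
        ys = (y >> (k * seg_bits)) & mask
        # All three units compute the same segment; the voter masks one bad unit.
        voted = vote(*(unit(xs, ys, carry) for unit in units))
        result |= (voted & mask) << (k * seg_bits)
        carry = voted >> seg_bits
    return result | (carry << (segments * seg_bits))
```

Replacing one unit with a faulty version, for example `lambda a, b, c: (a + b + c) | 1` to model a stuck-at-1 output bit, still yields the correct sum because the two healthy units outvote it on every bit.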
Implementation of an experimental fault-tolerant memory system

NASA Technical Reports Server (NTRS)

Carter, W. C.; Mccarthy, C. E.

1976-01-01

The experimental fault-tolerant memory system described in this paper has been designed to enable the modular addition of spares, to validate the theoretical fault-secure and self-testing properties of the translator/corrector, to provide a basis for experiments using the new testing and correction processes for recovery, and to determine the practicality of such systems. The hardware design and implementation are described, together with methods of fault insertion. The hardware/software interface, including a restricted single error correction/double error detection (SEC/DED) code, is specified. Procedures are carefully described which (1) test for specified physical faults, (2) ensure that single error corrections are not miscorrections due to triple faults, and (3) enable recovery from double errors.
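The SEC/DED behaviour mentioned above can be sketched with an extended Hamming (8,4) code. This is a generic textbook construction chosen for brevity, not the restricted code used in the paper. Index 0 of the codeword holds the overall parity bit and indices 1 to 7 are the usual Hamming positions.

```python
def encode(data):
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    d1, d2, d3, d4 = data
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d1, d2, d3, d4
    code[1] = d1 ^ d2 ^ d4          # covers positions 3, 5, 7
    code[2] = d1 ^ d3 ^ d4          # covers positions 3, 6, 7
    code[4] = d2 ^ d3 ^ d4          # covers positions 5, 6, 7
    code[0] = sum(code[1:]) % 2     # overall parity over positions 1..7
    return code

def decode(code):
    """Return (data_bits, status); status is ok, corrected or double_error."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        code[syndrome] ^= 1         # syndrome 0 means the parity bit itself
        status = "corrected"
    else:
        status = "double_error"     # detected but not correctable
    return [code[3], code[5], code[6], code[7]], status
```

A triple error has odd overall parity, so this decoder treats it as a single error and may "correct" the wrong bit. That is the miscorrection case the abstract's procedure (2) guards against.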
Neuromorphic Computing – From Materials Research to Systems Architecture Roundtable

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schuller, Ivan K.; Stevens, Rick; Pino, Robinson

2015-10-29

Computation in its many forms is the engine that fuels our modern civilization. Modern computation, based on the von Neumann architecture, has allowed, until now, the development of continuous improvements, as predicted by Moore’s law. However, computation using current architectures and materials will inevitably, within the next 10 years, reach a limit because of fundamental scientific reasons. DOE convened a roundtable of experts in neuromorphic computing systems, materials science, and computer science in Washington on October 29-30, 2015 to address the following basic questions: Can brain-like (“neuromorphic”) computing devices based on new material concepts and systems be developed to dramatically outperform conventional CMOS based technology? If so, what are the basic research challenges for materials science and computing? The overarching answer that emerged was: The development of novel functional materials and devices incorporated into unique architectures will allow a revolutionary technological leap toward the implementation of a fully “neuromorphic” computer. To address this challenge, the following issues were considered: (1) the main differences between neuromorphic and conventional computing as related to signaling models, timing/clock, non-volatile memory, architecture, fault tolerance, integrated memory and compute, noise tolerance, analog vs. digital, and in situ learning; (2) new neuromorphic architectures needed to produce lower energy consumption, potential novel nanostructured materials, and enhanced computation; (3) device and materials properties needed to implement functions such as hysteresis, stability, and fault tolerance; and (4) comparisons of different implementations (spin torque, memristors, resistive switching, phase change, and optical schemes) for enhanced breakthroughs in performance, cost, fault tolerance, and/or manufacturability.
Fault Tolerance for Fight Through (FTFT)

DTIC Science & Technology

2013-02-01

eventually to the lowest level. Now this information pyramid is being inverted: the lowest, most populated level is being “elevated” so that it is the... Egypt, December 2010, pp. 269-273. 8. Roger Myerson, Game Theory: Analysis of Conflict, Harvard University Press, 1997. 9. Li Wang, Zheng Li...Published by Springer, Delhi, India, May 2012, pp. 883-896. 27. “Inverting the Information Pyramid,” Federal Computer Week, Vol. 26, No.4, March
Distributed adaptive neural network control for a class of heterogeneous nonlinear multi-agent systems subject to actuation failures

NASA Astrophysics Data System (ADS)

Cui, Bing; Zhao, Chunhui; Ma, Tiedong; Feng, Chi

2017-02-01

In this paper, the cooperative adaptive consensus tracking problem for heterogeneous nonlinear multi-agent systems on directed graph is addressed. Each follower is modelled as a general nonlinear system with unknown and nonidentical nonlinear dynamics, disturbances and actuator failures. Cooperative fault tolerant neural network tracking controllers with online adaptive learning features are proposed to guarantee that all agents synchronise to the trajectory of one leader with bounded adjustable synchronisation errors. With the help of linear quadratic regulator-based optimal design, a graph-dependent Lyapunov proof provides error bounds that depend on the graph topology, one virtual matrix and some design parameters. Of particular interest is that if the control gain is selected appropriately, the proposed control scheme can be implemented in a unified framework no matter whether there are faults or not. Furthermore, fault detection and isolation are not needed for implementation. Finally, a simulation is given to verify the effectiveness of the proposed method.

A fault-tolerant multiprocessor architecture for aircraft, volume 1. [autopilot configuration

NASA Technical Reports Server (NTRS)

Smith, T. B.; Hopkins, A. L.; Taylor, W.; Ausrotas, R. A.; Lala, J. H.; Hanley, L. D.; Martin, J. H.

1978-01-01

A fault-tolerant multiprocessor architecture is reported. This architecture, together with a comprehensive information system architecture, has important potential for future aircraft applications. A preliminary definition and assessment of a suitable multiprocessor architecture for such applications is developed.
Validation Methods Research for Fault-Tolerant Avionics and Control Systems: Working Group Meeting, 2

NASA Technical Reports Server (NTRS)

Gault, J. W. (Editor); Trivedi, K. S. (Editor); Clary, J. B. (Editor)

1980-01-01

The validation process comprises the activities required to ensure the agreement of system realization with system specification. A preliminary validation methodology for fault tolerant systems is documented. A general framework for a validation methodology is presented along with a set of specific tasks intended for the validation of two specimen systems, SIFT and FTMP. Two major areas of research are identified: first, those activities required to support the ongoing development of the validation process itself, and second, those activities required to support the design, development, and understanding of fault tolerant systems.
Modeling and Simulation Reliable Spacecraft On-Board Computing

NASA Technical Reports Server (NTRS)

Park, Nohpill

1999-01-01

The proposed project will investigate modeling and simulation-driven testing and fault tolerance schemes for Spacecraft On-Board Computing, thereby achieving reliable spacecraft telecommunication. A spacecraft communication system has inherent capabilities of providing multipoint and broadcast transmission, connectivity between any two distant nodes within a wide-area coverage, quick network configuration/reconfiguration, rapid allocation of space segment capacity, and distance-insensitive cost. To realize the capabilities mentioned above, both the size and cost of the ground-station terminals have to be reduced by using a reliable, high-throughput, fast and cost-effective on-board computing system, which is known to be a critical contributor to the overall performance of space mission deployment. Controlled vulnerability of mission data (measured in sensitivity), improved performance (measured in throughput and delay) and fault tolerance (measured in reliability) are some of the most important features of these systems. The system should be thoroughly tested and diagnosed before fault tolerance is built into it. Testing and fault tolerance strategies should be driven by accurate performance models (i.e. throughput, delay, reliability and sensitivity) to find an optimal solution in terms of reliability and cost. The modeling and simulation tools will be integrated with a system architecture module, a testing module and a fault tolerance module, all of which interact through a central graphical user interface.
An improved fault-tolerant control scheme for PWM inverter-fed induction motor-based EVs.

PubMed

Tabbache, Bekheïra; Benbouzid, Mohamed; Kheloui, Abdelaziz; Bourgeot, Jean-Matthieu; Mamoune, Abdeslam

2013-11-01

This paper proposes an improved fault-tolerant control scheme for PWM inverter-fed induction motor-based electric vehicles. The proposed strategy deals with mitigating power switch (IGBT) failures within a reconfigurable induction motor control. To increase the vehicle powertrain reliability regarding IGBT open-circuit failures, 4-wire and 4-leg PWM inverter topologies are investigated and their performances discussed in a vehicle context. The proposed fault-tolerant topologies require only minimum hardware modifications to the conventional off-the-shelf six-switch three-phase drive, mitigating the IGBT failures by specific inverter control. Indeed, the two topologies exploit the induction motor neutral accessibility for fault-tolerant purposes. The 4-wire topology then uses classical hysteresis controllers to account for the IGBT failures. The 4-leg topology, meanwhile, uses a specific 3D space vector PWM to handle vehicle requirements in terms of size (DC bus capacitors) and cost (number of IGBTs). Experiments on an induction motor drive and simulations on an electric vehicle are carried out using a European urban driving cycle to show that the proposed fault-tolerant control approach is effective and provides a simple configuration with high performance in terms of speed and torque responses. Copyright © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
High-Intensity Radiated Field Fault-Injection Experiment for a Fault-Tolerant Distributed Communication System

NASA Technical Reports Server (NTRS)

Yates, Amy M.; Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Gonzalez, Oscar R.; Gray, W. Steven

2010-01-01

Safety-critical distributed flight control systems require robustness in the presence of faults. In general, these systems consist of a number of input/output (I/O) and computation nodes interacting through a fault-tolerant data communication system. The communication system transfers sensor data and control commands and can handle most faults under typical operating conditions. However, the performance of the closed-loop system can be adversely affected as a result of operating in harsh environments. In particular, High-Intensity Radiated Field (HIRF) environments have the potential to cause random fault manifestations in individual avionic components and to generate simultaneous system-wide communication faults that overwhelm existing fault management mechanisms. This paper presents the design of an experiment conducted at the NASA Langley Research Center's HIRF Laboratory to statistically characterize the faults that a HIRF environment can trigger on a single node of a distributed flight control system.
Fault-Tolerant Control of ANPC Three-Level Inverter Based on Order-Reduction Optimal Control Strategy under Multi-Device Open-Circuit Fault.

PubMed

Xu, Shi-Zhou; Wang, Chun-Jie; Lin, Fang-Li; Li, Shi-Xiang

2017-10-31

The multi-device open-circuit fault is a common fault of the ANPC (Active Neutral-Point Clamped) three-level inverter and affects the operational stability of the whole system. To improve operational stability, this paper first summarizes the main existing solutions and analyzes all the possible states of a multi-device open-circuit fault. Second, an order-reduction optimal control strategy is proposed for the multi-device open-circuit fault to realize fault-tolerant control, based on the topology and control requirements of the ANPC three-level inverter and on operational stability. This control strategy can handle faults in different operating states, and can work in an order-reduction state under specific open-circuit faults with specific combinations of devices, sacrificing control quality to give priority to stability. Finally, simulation and experiment prove the effectiveness of the proposed strategy.
Deep neural networks: A promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data

NASA Astrophysics Data System (ADS)

Jia, Feng; Lei, Yaguo; Lin, Jing; Zhou, Xin; Lu, Na

2016-05-01

Aiming to promptly process the massive fault data and automatically provide accurate diagnosis results, numerous studies have been conducted on intelligent fault diagnosis of rotating machinery. Among these studies, the methods based on artificial neural networks (ANNs) are commonly used, which employ signal processing techniques for extracting features and further input the features to ANNs for classifying faults. Though these methods did work in intelligent fault diagnosis of rotating machinery, they still have two deficiencies. (1) The features are manually extracted depending on much prior knowledge about signal processing techniques and diagnostic expertise. In addition, these manual features are extracted according to a specific diagnosis issue and probably unsuitable for other issues. (2) The ANNs adopted in these methods have shallow architectures, which limits the capacity of ANNs to learn the complex non-linear relationships in fault diagnosis issues. As a breakthrough in artificial intelligence, deep learning holds the potential to overcome the aforementioned deficiencies. Through deep learning, deep neural networks (DNNs) with deep architectures, instead of shallow ones, could be established to mine the useful information from raw data and approximate complex non-linear functions. Based on DNNs, a novel intelligent method is proposed in this paper to overcome the deficiencies of the aforementioned intelligent diagnosis methods. The effectiveness of the proposed method is validated using datasets from rolling element bearings and planetary gearboxes. These datasets contain massive measured signals involving different health conditions under various operating conditions. The diagnosis results show that the proposed method is able to not only adaptively mine available fault characteristics from the measured signals, but also obtain superior diagnosis accuracy compared with the existing methods.
End State: The Fallacy of Modern Military Planning

DTIC Science & Technology

2017-04-06

operational planning for non-linear, complex scenarios requires application of non-linear, advanced planning techniques such as design methodology...cannot be approached in a linear, mechanistic manner by a universal planning methodology. Theater/global campaign plans and theater strategies offer no...strategic environments, and instead prescribes a universal linear methodology that pays no mind to strategic complexity. This universal application
Advanced Information Processing System - Fault detection and error handling

NASA Technical Reports Server (NTRS)

Lala, J. H.

1985-01-01

The Advanced Information Processing System (AIPS) is designed to provide a fault tolerant and damage tolerant data processing architecture for a broad range of aerospace vehicles, including tactical and transport aircraft, and manned and autonomous spacecraft. A proof-of-concept (POC) system is now in the detailed design and fabrication phase. This paper gives an overview of a preliminary fault detection and error handling philosophy in AIPS.
Catastrophic Fault Recovery with Self-Reconfigurable Chips

NASA Technical Reports Server (NTRS)

Zheng, Will Hua; Marzwell, Neville I.; Chau, Savio N.

2006-01-01

Mission critical systems typically employ multi-string redundancy to cope with possible hardware failure. Such systems are only as fault tolerant as the number of redundant strings they carry. Once a particular critical component exhausts its redundant spares, the multi-string architecture cannot tolerate any further hardware failure. This paper aims at addressing such catastrophic faults through the use of 'Self-Reconfigurable Chips' as a last resort effort to 'repair' a faulty critical component.
A novel Lagrangian approach for the stable numerical simulation of fault and fracture mechanics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Franceschini, Andrea; Ferronato, Massimiliano, E-mail: massimiliano.ferronato@unipd.it; Janna, Carlo

The simulation of the mechanics of geological faults and fractures is of paramount importance in several applications, such as ensuring the safety of the underground storage of wastes and hydrocarbons or predicting the possible seismicity triggered by the production and injection of subsurface fluids. However, the stable numerical modeling of ground ruptures is still an open issue. The present work introduces a novel formulation based on the use of the Lagrange multipliers to prescribe the constraints on the contact surfaces. The variational formulation is modified in order to take into account the frictional work along the activated fault portion according to the principle of maximum plastic dissipation. The numerical model, developed in the framework of the Finite Element method, provides stable solutions with a fast convergence of the non-linear problem. The stabilizing properties of the proposed model are emphasized with the aid of a realistic numerical example dealing with the generation of ground fractures due to groundwater withdrawal in arid regions. Highlights: • A numerical model is developed for the simulation of fault and fracture mechanics. • The model is implemented in the framework of the Finite Element method and with the aid of Lagrange multipliers. • The proposed formulation introduces a new contribution due to the frictional work on the portion of activated fault. • The resulting algorithm is highly non-linear as the portion of activated fault is itself unknown. • The numerical solution is validated against analytical results and proves to be stable also in realistic applications.
Fenix, A Fault Tolerant Programming Framework for MPI Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gamel, Marc; Teranihi, Keita; Valenzuela, Eric

2016-10-05

Fenix provides APIs to allow the users to add fault tolerance capability to MPI-based parallel programs in a transparent manner. Fenix-enabled programs can run through process failures during program execution using a pool of spare processes accommodated by Fenix.
Increases in Tolerance within Naturalistic, Self-Help Recovery Homes

PubMed Central

Olson, Brad D.; Jason, Leonard A.; Davidson, Michelle; Ferrari, Joseph R.

2011-01-01

Changes in tolerance toward others (i.e., universality/diversity measure) were explored among 150 participants (93 women, 57 men) discharged from inpatient treatment centers, randomly assigned to either a self-help, communal living setting or usual after-care, and interviewed every 6 months for a 24 month period. Hierarchical Linear Modeling examined the effect of condition (Therapeutic Communal Living versus Usual Care) and other moderator variables on wave trajectories of tolerance attitudes (i.e., universality/diversity scores). Over time, residents of the communal living recovery model showed significantly greater tolerance trajectories than usual care participants. Results supported the claim that residents of communal living settings unite around super-ordinate goals of overcoming substance abuse problems. Also, older residents, compared to younger ones, who lived in a house for 6 or more months experienced the greatest increases in tolerance. Theories regarding these differential increases in tolerance, such as social contact theory and transtheoretical processes of change, are discussed. PMID:19838787
Neural-Network-Based Adaptive Decentralized Fault-Tolerant Control for a Class of Interconnected Nonlinear Systems.

PubMed

Li, Xiao-Jian; Yang, Guang-Hong

2018-01-01

This paper is concerned with the adaptive decentralized fault-tolerant tracking control problem for a class of uncertain interconnected nonlinear systems with unknown strong interconnections. An algebraic graph theory result is introduced to address the considered interconnections. In addition, to achieve the desirable tracking performance, a neural-network-based robust adaptive decentralized fault-tolerant control (FTC) scheme is given to compensate the actuator faults and system uncertainties. Furthermore, via the Lyapunov analysis method, it is proven that all the signals of the resulting closed-loop system are semiglobally bounded, and the tracking errors of each subsystem exponentially converge to a compact set, whose radius is adjustable by choosing different controller design parameters. Finally, the effectiveness and advantages of the proposed FTC approach are illustrated with two simulated examples.
Advanced information processing system: Fault injection study and results

NASA Technical Reports Server (NTRS)

Burkhardt, Laura F.; Masotto, Thomas K.; Lala, Jaynarayan H.

1992-01-01

The objective of the AIPS program is to achieve a validated fault tolerant distributed computer system. The goals of the AIPS fault injection study were: (1) to present the fault injection study components addressing the AIPS validation objective; (2) to obtain feedback for fault removal from the design implementation; (3) to obtain statistical data regarding fault detection, isolation, and reconfiguration responses; and (4) to obtain data regarding the effects of faults on system performance. The parameters are described that must be varied to create a comprehensive set of fault injection tests, the subset of test cases selected, the test case measurements, and the test case execution. Both pin level hardware faults using a hardware fault injector and software injected memory mutations were used to test the system. An overview is provided of the hardware fault injector and the associated software used to carry out the experiments. Detailed specifications are given of fault and test results for the I/O Network and the AIPS Fault Tolerant Processor, respectively. The results are summarized and conclusions are given.
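A software-injected memory mutation of the kind mentioned above can be sketched as a single bit flip in a buffer, with a CRC-32 check standing in for the detection mechanism. The function names, the buffer and the use of a CRC are all illustrative assumptions. The abstract does not describe how AIPS detects such faults.

```python
import random
import zlib

def inject_bit_flip(memory, rng):
    """Flip one randomly chosen bit in a bytearray; return (byte_index, bit)."""
    byte_index = rng.randrange(len(memory))
    bit = rng.randrange(8)
    memory[byte_index] ^= 1 << bit
    return byte_index, bit

def run_trial(payload, seed):
    """Inject one fault into a copy of payload and report if it was detected."""
    memory = bytearray(payload)
    reference_crc = zlib.crc32(memory)
    location = inject_bit_flip(memory, random.Random(seed))
    detected = zlib.crc32(memory) != reference_crc
    return location, detected

if __name__ == "__main__":
    payload = bytes(range(64))
    outcomes = [run_trial(payload, seed) for seed in range(10)]
    print(sum(detected for _, detected in outcomes), "of", len(outcomes), "detected")
```

Seeding each trial makes a campaign repeatable, so a fault that produced an interesting response can be re-injected at the same location.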
Optical asymmetric cryptography based on elliptical polarized light linear truncation and a numerical reconstruction technique.

PubMed

Lin, Chao; Shen, Xueju; Wang, Zhisong; Zhao, Cheng

2014-06-20

We demonstrate a novel optical asymmetric cryptosystem based on the principle of elliptical polarized light linear truncation and a numerical reconstruction technique. The device of an array of linear polarizers is introduced to achieve linear truncation on the spatially resolved elliptical polarization distribution during image encryption. This encoding process can be characterized as confusion-based optical cryptography that involves no Fourier lens and diffusion operation. Based on the Jones matrix formalism, the intensity transmittance for this truncation is deduced to perform elliptical polarized light reconstruction based on two intensity measurements. Use of a quick response code makes the proposed cryptosystem practical, with versatile key sensitivity and fault tolerance. Both simulation and preliminary experimental results that support theoretical analysis are presented. An analysis of the resistance of the proposed method on a known public key attack is also provided.
Monitoring and Control Interface Based on Virtual Sensors

PubMed Central

Escobar, Ricardo F.; Adam-Medina, Manuel; García-Beltrán, Carlos D.; Olivares-Peregrino, Víctor H.; Juárez-Romero, David; Guerrero-Ramírez, Gerardo V.

2014-01-01

In this article, a toolbox based on a monitoring and control interface (MCI) is presented and applied to a heat exchanger. The MCI was programmed in order to realize sensor fault detection and isolation and fault tolerance using virtual sensors. The virtual sensors were designed from model-based high-gain observers. To develop the control task, different kinds of control laws were included in the monitoring and control interface. These control laws are PID, MPC and a non-linear model-based control law. The MCI helps to maintain the heat exchanger under operation, even if a temperature outlet sensor fault occurs; in the case of outlet temperature sensor failure, the MCI will display an alarm. The monitoring and control interface is used as a practical tool to support electronic engineering students with heat transfer and control concepts to be applied in a double-pipe heat exchanger pilot plant. The method aims to teach the students through the observation and manipulation of the main variables of the process and by the interaction with the monitoring and control interface (MCI) developed in LabVIEW©. The MCI provides the electronic engineering students with the knowledge of heat exchanger behavior, since the interface is provided with a thermodynamic model that approximates the temperatures and the physical properties of the fluid (density and heat capacity). An advantage of the interface is the easy manipulation of the actuator for an automatic or manual operation. Another advantage of the monitoring and control interface is that all algorithms can be manipulated and modified by the users. PMID:25365462
On the Effect of Variability on Fermi, Pasta and Ulam Matrices

NASA Astrophysics Data System (ADS)

Nelson, Heather; Choubey, Bhaskar

The first numerical experiment by Fermi, Pasta, Ulam and Tsingou in 1955 observed recurrence in an array of non-linear systems. This has led to a large number of nonlinear numerical experiments with various new results from a chain of ideal oscillators. FPUT arrays consist of linear oscillators connected nonlinearly, which leads to recurrence of energy modes with time. However, if such a system were to be physically constructed, inherent process variations would introduce a manufacturing tolerance into the parameters of the system. This abstract reports investigation into the effects of these tolerances on the FPU matrices. It has been observed that tolerance in the oscillators can degrade the observance of recurrence and, with a chain of even 64 oscillators, recurrence cannot be observed with tolerances of more than 10%. It has also been observed that tolerances in the linear oscillators have more effect on recurrence than those in the nonlinear coupling. Even with very small tolerances of +/- 1% on the linear components, one starts to observe variations in the quality and magnitude of the recurrence, and at +/- 5% recurrence starts to break down.
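One way such a tolerance study might be set up is sketched below. It assumes an FPUT-α chain with unit masses, fixed ends and a quadratic nonlinearity, and draws each linear spring constant uniformly within the stated tolerance of its nominal value. The model variant, the parameter values and the velocity-Verlet integrator are assumptions, since the abstract does not specify them.

```python
import math
import random

def spring_constants(n_bonds, tolerance, seed=0):
    """Nominal constant 1.0, perturbed uniformly within +/- tolerance."""
    rng = random.Random(seed)
    return [1.0 + rng.uniform(-tolerance, tolerance) for _ in range(n_bonds)]

def bond_stretches(x):
    ext = [0.0] + list(x) + [0.0]               # fixed walls at both ends
    return [ext[i + 1] - ext[i] for i in range(len(ext) - 1)]

def forces(x, k, alpha):
    tension = [kb * d + alpha * d * d for kb, d in zip(k, bond_stretches(x))]
    return [tension[i + 1] - tension[i] for i in range(len(x))]

def energy(x, v, k, alpha):
    kinetic = 0.5 * sum(vi * vi for vi in v)
    potential = sum(0.5 * kb * d * d + alpha * d ** 3 / 3.0
                    for kb, d in zip(k, bond_stretches(x)))
    return kinetic + potential

def simulate(n, tolerance, alpha=0.25, dt=0.01, steps=2000, seed=0):
    """Start in the lowest mode; integrate with velocity Verlet."""
    k = spring_constants(n + 1, tolerance, seed)
    x = [0.5 * math.sin(math.pi * (i + 1) / (n + 1)) for i in range(n)]
    v = [0.0] * n
    f = forces(x, k, alpha)
    e0 = energy(x, v, k, alpha)
    for _ in range(steps):
        v = [vi + 0.5 * dt * fi for vi, fi in zip(v, f)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        f = forces(x, k, alpha)
        v = [vi + 0.5 * dt * fi for vi, fi in zip(v, f)]
    return x, v, k, e0, energy(x, v, k, alpha)
```

To look for recurrence one would run this for much longer, project `x` and `v` onto the normal modes of the nominal chain, and compare the mode-energy histories for different tolerance values. Checking that the total energy stays nearly constant is a basic sanity test of the integration.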
Depth optimal sorting networks resistant to k passive faults

DOE Office of Scientific and Technical Information (OSTI.GOV)

Piotrow, M.

In this paper, we study the problem of constructing a sorting network that is tolerant to faults and whose running time (i.e. depth) is as small as possible. We consider the scenario of worst-case comparator faults and follow the model of passive comparator failure proposed by Yao and Yao, in which a faulty comparator outputs directly its inputs without comparison. Our main result is the first construction of an N-input, k-fault-tolerant sorting network that is of an asymptotically optimal depth Θ(log N + k). That improves over the recent result of Leighton and Ma, whose network is of depth O(log N + k log log N / log k). Actually, we present a fault-tolerant correction network that can be added after any N-input sorting network to correct its output in the presence of at most k faulty comparators. Since the depth of the network is O(log N + k) and the constants hidden behind the "O" notation are not big, the construction can be of practical use. Developing the techniques necessary to show the main result, we construct a fault-tolerant network for the insertion problem. As a by-product, we get an N-input, O(log N)-depth INSERT-network that is tolerant to random faults, thereby answering a question posed by Ma in his PhD thesis. The results are based on a new notion of constant delay comparator networks, that is, networks in which each register is used (compared) only in a period of time of a constant length. Copies of such networks can be put one after another with only a constant increase in depth per copy.
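The passive fault model described above is easy to simulate: a faulty comparator simply passes its two inputs through unchanged. The sketch below applies that model to the standard five-comparator network for four inputs. It illustrates the fault model only, not the paper's correction network, and the names are illustrative.

```python
from itertools import permutations

# Standard 5-comparator sorting network for 4 inputs.
SORT4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

def run_network(values, comparators, faulty=frozenset()):
    """Apply comparators in order; indices in `faulty` are passive faults."""
    out = list(values)
    for index, (i, j) in enumerate(comparators):
        if index in faulty:
            continue  # passive fault: inputs pass through uncompared
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out

def sorts_everything(comparators, n, faulty=frozenset()):
    """Check the network against every permutation of n distinct keys."""
    return all(run_network(p, comparators, faulty) == sorted(p)
               for p in permutations(range(n)))

if __name__ == "__main__":
    print(sorts_everything(SORT4, 4))
    print([sorts_everything(SORT4, 4, {f}) for f in range(len(SORT4))])
```

Because five comparators is the minimum for four inputs, any single passive fault breaks this network. Redundant comparators, for example a correction stage appended to the network, are therefore required for fault tolerance.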
Cost and benefits design optimization model for fault tolerant flight control systems

NASA Technical Reports Server (NTRS)

Rose, J.

1982-01-01

Requirements and specifications for a method of optimizing the design of fault-tolerant flight control systems are provided. Algorithms that could be used for developing new and modifying existing computer programs are also provided, with recommendations for follow-on work.

Software reliability models for fault-tolerant avionics computers and related topics

NASA Technical Reports Server (NTRS)

Miller, Douglas R.

1987-01-01

Software reliability research is briefly described. General research topics are reliability growth models, quality of software reliability prediction, the complete monotonicity property of reliability growth, conceptual modelling of software failure behavior, assurance of ultrahigh reliability, and analysis techniques for fault-tolerant systems.
Design and Experimental Validation for Direct-Drive Fault-Tolerant Permanent-Magnet Vernier Machines

PubMed Central

Liu, Guohai; Yang, Junqin; Chen, Ming; Chen, Qian

2014-01-01

A fault-tolerant permanent-magnet vernier (FT-PMV) machine is designed for direct-drive applications, incorporating the merits of high torque density and high reliability. Based on the so-called magnetic gearing effect, PMV machines have the ability of high torque density by introducing the flux-modulation poles (FMPs). This paper investigates the fault-tolerant characteristic of PMV machines and provides a design method, which is able to not only meet the fault-tolerant requirements but also keep the ability of high torque density. The operation principle of the proposed machine has been analyzed. The design process and optimization are presented specifically, such as the combination of slots and poles, the winding distribution, and the dimensions of PMs and teeth. By using the time-stepping finite element method (TS-FEM), the machine performances are evaluated. Finally, the FT-PMV machine is manufactured, and the experimental results are presented to validate the theoretical analysis. PMID:25045729
Self-adaptive Fault-Tolerance of HLA-Based Simulations in the Grid Environment

NASA Astrophysics Data System (ADS)

Huang, Jijie; Chai, Xudong; Zhang, Lin; Li, Bo Hu

The objects of an HLA-based simulation can access model services to update their attributes. However, the grid server may be overloaded and refuse to let the model service handle object accesses. Because these objects accessed this model service during the last simulation loop and their intermediate state is stored on this server, this may terminate the simulation. A fault-tolerance mechanism must therefore be introduced into simulations. But the traditional fault-tolerance methods cannot meet the above needs because the transmission latency between a federate and the RTI in the grid environment varies from several hundred milliseconds to several seconds. By adding model service URLs to the OMT and expanding the HLA services and model services with some interfaces, this paper proposes a self-adaptive fault-tolerance mechanism for simulations according to the characteristics of federates accessing model services. Benchmark experiments indicate that the expanded HLA/RTI can make simulations run self-adaptively in the grid environment.
Reliability of Fault Tolerant Control Systems. Part 2

NASA Technical Reports Server (NTRS)

Wu, N. Eva

2000-01-01

This paper reports Part II of a two part effort that is intended to delineate the relationship between reliability and fault tolerant control in a quantitative manner. Reliability properties peculiar to fault-tolerant control systems are emphasized, such as the presence of analytic redundancy in high proportion, the dependence of failures on control performance, and high risks associated with decisions in redundancy management due to multiple sources of uncertainties and sometimes large processing requirements. As a consequence, coverage of failures through redundancy management can be severely limited. The paper proposes to formulate the fault tolerant control problem as an optimization problem that maximizes coverage of failures through redundancy management. Coverage modeling is attempted in a way that captures its dependence on the control performance and on the diagnostic resolution. Under the proposed redundancy management policy, it is shown that an enhanced overall system reliability can be achieved with a control law of a superior robustness, with an estimator of a higher resolution, and with a control performance requirement of a lesser stringency.
A fail-safe CMOS logic gate

NASA Technical Reports Server (NTRS)

Bobin, V.; Whitaker, S.

1990-01-01

This paper reports a design technique to make Complex CMOS Gates fail-safe for a class of faults. Two classes of faults are defined. The fail-safe design presented has limited fault-tolerance capability. Multiple faults are also covered.
Data-based fault-tolerant control for affine nonlinear systems with actuator faults.

PubMed

Xie, Chun-Hua; Yang, Guang-Hong

2016-09-01

This paper investigates the fault-tolerant control (FTC) problem for unknown nonlinear systems with actuator faults including stuck, outage, bias and loss of effectiveness. The upper bounds of stuck faults, bias faults and loss of effectiveness faults are unknown. A new data-based FTC scheme is proposed. It consists of the online estimations of the bounds and a state-dependent function. The estimations are adjusted online to compensate automatically the actuator faults. The state-dependent function solved by using real system data helps to stabilize the system. Furthermore, all signals in the resulting closed-loop system are uniformly bounded and the states converge asymptotically to zero. Compared with the existing results, the proposed approach is data-based. Finally, two simulation examples are provided to show the effectiveness of the proposed approach. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
Advanced I&C for Fault-Tolerant Supervisory Control of Small Modular Reactors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cole, Daniel G.

In this research, we have developed a supervisory control approach to enable automated control of SMRs. By design, the supervisory control system has a hierarchical, interconnected, adaptive control architecture. A considerable advantage of this architecture is that it allows subsystems to communicate at different/finer granularity, facilitates monitoring of processes at the modular and plant levels, and enables supervisory control. We have investigated the deployment of automation, monitoring, and data collection technologies to enable operation of multiple SMRs. Each unit's controller collects and transfers information from local loops and optimizes that unit's parameters. Information is passed from each SMR unit controller to the supervisory controller, which supervises the actions of the SMR units and manages plant processes. The information processed at the supervisory level will provide operators with the information needed for reactor, unit, and plant operation. In conjunction with the supervisory effort, we have investigated techniques for fault-tolerant networks, over which information is transmitted between local loops and the supervisory controller to maintain a safe level of operational normalcy in the presence of anomalies. The fault tolerance of the supervisory control architecture, the network that supports it, and the impact of fault tolerance on multi-unit SMR plant control have been a second focus of this research. To this end, we have investigated the deployment of advanced automation, monitoring, and data collection and communications technologies to enable operation of multiple SMRs. We have created a fault-tolerant multi-unit SMR supervisory controller that collects and transfers information from local loops, supervises their actions, and adaptively optimizes the controller parameters. The goal of this research has been to develop the methodologies and procedures for fault-tolerant supervisory control of small modular reactors. To achieve this goal, we have identified the following objectives, which form an ordered approach to the research: I) Development of a supervisory digital I&C system II) Fault-tolerance of the supervisory control architecture III) Automated decision making and online monitoring.
Adaptive extended-state observer-based fault tolerant attitude control for spacecraft with reaction wheels

NASA Astrophysics Data System (ADS)

Ran, Dechao; Chen, Xiaoqian; de Ruiter, Anton; Xiao, Bing

2018-04-01

This study presents an adaptive second-order sliding control scheme to solve the attitude fault tolerant control problem of spacecraft subject to system uncertainties, external disturbances and reaction wheel faults. A novel fast terminal sliding mode is preliminarily designed to guarantee that finite-time convergence of the attitude errors can be achieved globally. Based on this novel sliding mode, an adaptive second-order observer is then designed to reconstruct the system uncertainties and the actuator faults. One feature of the proposed observer is that the design of the observer does not necessitate any priori information of the upper bounds of the system uncertainties and the actuator faults. In view of the reconstructed information supplied by the designed observer, a second-order sliding mode controller is developed to accomplish attitude maneuvers with great robustness and precise tracking accuracy. Theoretical stability analysis proves that the designed fault tolerant control scheme can achieve finite-time stability of the closed-loop system, even in the presence of reaction wheel faults and system uncertainties. Numerical simulations are also presented to demonstrate the effectiveness and superiority of the proposed control scheme over existing methodologies.
Low-Power Fault Tolerance for Spacecraft FPGA-Based Numerical Computing

DTIC Science & Technology

2006-09-01

Ranganathan , “Power Management – Guest Lecture for CS4135, NPS,” Naval Postgraduate School, Nov 2004 [32] R. L. Phelps, “Operational Experiences with the...4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188) Washington DC 20503. 1. AGENCY USE ONLY (Leave blank) 2...undesirable, are not necessarily harmful. Our intent is to prevent errors by properly managing faults. This research focuses on developing fault-tolerant
Fault-tolerant computer study. [logic designs for building block circuits]

NASA Technical Reports Server (NTRS)

Rennels, D. A.; Avizienis, A. A.; Ercegovac, M. D.

1981-01-01

A set of building block circuits is described which can be used with commercially available microprocessors and memories to implement fault tolerant distributed computer systems. Each building block circuit is intended for VLSI implementation as a single chip. Several building blocks and associated processor and memory chips form a self checking computer module with self contained input output and interfaces to redundant communications buses. Fault tolerance is achieved by connecting self checking computer modules into a redundant network in which backup buses and computer modules are provided to circumvent failures. The requirements and design methodology which led to the definition of the building block circuits are discussed.
Different-Level Simultaneous Minimization Scheme for Fault Tolerance of Redundant Manipulator Aided with Discrete-Time Recurrent Neural Network

PubMed Central

Jin, Long; Liao, Bolin; Liu, Mei; Xiao, Lin; Guo, Dongsheng; Yan, Xiaogang

2017-01-01

By incorporating the physical constraints in joint space, a different-level simultaneous minimization scheme, which takes both the robot kinematics and robot dynamics into account, is presented and investigated for fault-tolerant motion planning of redundant manipulator in this paper. The scheme is reformulated as a quadratic program (QP) with equality and bound constraints, which is then solved by a discrete-time recurrent neural network. Simulative verifications based on a six-link planar redundant robot manipulator substantiate the efficacy and accuracy of the presented acceleration fault-tolerant scheme, the resultant QP and the corresponding discrete-time recurrent neural network. PMID:28955217
Influence of slot number and pole number in fault-tolerant brushless dc motors having unequal tooth widths

NASA Astrophysics Data System (ADS)

Ishak, D.; Zhu, Z. Q.; Howe, D.

2005-05-01

The electromagnetic performance of fault-tolerant three-phase permanent magnet brushless dc motors, in which the wound teeth are wider than the unwound teeth and their tooth tips span approximately one pole pitch and which have similar numbers of slots and poles, is investigated. It is shown that they have a more trapezoidal phase back-emf waveform, a higher torque capability, and a lower torque ripple than similar fault-tolerant machines with equal tooth widths. However, these benefits gradually diminish as the pole number is increased, due to the effect of interpole leakage flux.
Fault Tolerant Frequent Pattern Mining

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shohdy, Sameh; Vishnu, Abhinav; Agrawal, Gagan

FP-Growth algorithm is a Frequent Pattern Mining (FPM) algorithm that has been extensively used to study correlations and patterns in large scale datasets. While several researchers have designed distributed memory FP-Growth algorithms, it is pivotal to consider fault tolerant FP-Growth, which can address the increasing fault rates in large scale systems. In this work, we propose a novel parallel, algorithm-level fault-tolerant FP-Growth algorithm. We leverage algorithmic properties and MPI advanced features to guarantee an O(1) space complexity, achieved by using the dataset memory space itself for checkpointing. We also propose a recovery algorithm that can use in-memory and disk-based checkpointing, though in many cases the recovery can be completed without any disk access, and incurring no memory overhead for checkpointing. We evaluate our FT algorithm on a large scale InfiniBand cluster with several large datasets using up to 2K cores. Our evaluation demonstrates excellent efficiency for checkpointing and recovery in comparison to the disk-based approach. We have also observed 20x average speed-up in comparison to Spark, establishing that a well designed algorithm can easily outperform a solution based on a general fault-tolerant programming model.
Vision & Needs for Distributed Controls: Customers for Control Systems and What Do They Value (Postprint)

DTIC Science & Technology

2009-08-01

in engine technology 7 VS. • Military demand is growing for FADEC & control systems with expert system embedded in the S/W for fault tolerance...leverage commercial FADECs & control systems S/W & H/W. •Modular / Universal/Distributed design can reduce development time and cost. S/W could offer...baseline for military-qualified FADECs . •To promote dual use, the services must recognize the similarities between commercial applications & military
Failure Accommodation Tested in Magnetic Suspension Systems for Rotating Machinery

NASA Technical Reports Server (NTRS)

Provenza, Andy J.

2000-01-01

The NASA Glenn Research Center at Lewis Field and Texas A&M University are developing techniques for accommodating certain types of failures in magnetic suspension systems used in rotating machinery. In recent years, magnetic bearings have become a viable alternative to rolling element bearings for many applications. For example, industrial machinery such as machine tool spindles and turbomolecular pumps can today be bought off the shelf with magnetically supported rotating components. Nova Gas Transmission Ltd. has large gas compressors in Canada that have been running flawlessly for years on magnetic bearings. To help mature this technology and quiet concerns over the reliability of magnetic bearings, NASA researchers have been investigating ways of making the bearing system tolerant to faults. Since the potential benefits from an oil-free, actively controlled bearing system are so attractive, research that is focused on assuring system reliability and safety is justifiable. With support from the Fast Quiet Engine program, Glenn's Structural Mechanics and Dynamics Branch is working to demonstrate fault-tolerant magnetic suspension systems targeted for aerospace engine applications. The Flywheel Energy Storage Program is also helping to fund this research.
Avionic Air Data Sensors Fault Detection and Isolation by means of Singular Perturbation and Geometric Approach

PubMed Central

2017-01-01

Singular Perturbations represent an advantageous theory to deal with systems characterized by a two-time scale separation, such as the longitudinal dynamics of aircraft which are called phugoid and short period. In this work, the combination of the NonLinear Geometric Approach and the Singular Perturbations leads to an innovative Fault Detection and Isolation system dedicated to the isolation of faults affecting the air data system of a general aviation aircraft. The isolation capabilities, obtained by means of the approach proposed in this work, allow for the solution of a fault isolation problem otherwise not solvable by means of standard geometric techniques. Extensive Monte-Carlo simulations, exploiting a high fidelity aircraft simulator, show the effectiveness of the proposed Fault Detection and Isolation system. PMID:28946673
Design of Power System Architectures for Small Spacecraft Systems

NASA Technical Reports Server (NTRS)

Momoh, James A.; Subramonian, Rama; Dias, Lakshman G.

1996-01-01

The objective of this research is to perform a trade study on several candidate power system architectures for small spacecraft to be used in NASA's New Millennium Program. Three initial candidate architectures have been proposed by NASA and two other candidate architectures have been proposed by Howard University. Howard University is currently conducting the necessary analysis, synthesis, and simulation needed to perform the trade studies and arrive at the optimal power system architecture. Statistical, sensitivity and tolerance studies have been performed on the systems. It is concluded from present studies that certain components such as the series regulators, buck-boost converters and power converters can be minimized while retaining the desired functionality of the overall architecture. This, in conjunction with battery scalability studies and system efficiency studies, has enabled us to develop more economic architectures. Future studies will include artificial neural networks and fuzzy logic to analyze the performance of the systems. Fault simulation studies and fault diagnosis studies using EMTP and artificial neural networks will also be conducted.
Is the Multigrid Method Fault Tolerant? The Two-Grid Case

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ainsworth, Mark; Glusa, Christian

2016-06-30

The predicted reduced resiliency of next-generation high performance computers means that it will become necessary to take into account the effects of randomly occurring faults on numerical methods. Further, in the event of a hard fault occurring, a decision has to be made as to what remedial action should be taken in order to resume the execution of the algorithm. The action that is chosen can have a dramatic effect on the performance and characteristics of the scheme. Ideally, the resulting algorithm should be subjected to the same kind of mathematical analysis that was applied to the original, deterministic variant. The purpose of this work is to provide an analysis of the behaviour of the multigrid algorithm in the presence of faults. Multigrid is arguably the method of choice for the solution of large-scale linear algebra problems arising from discretization of partial differential equations and it is of considerable importance to anticipate its behaviour on an exascale machine. The analysis of resilience of algorithms is in its infancy and the current work is perhaps the first to provide a mathematical model for faults and analyse the behaviour of a state-of-the-art algorithm under the model. It is shown that the Two Grid Method fails to be resilient to faults. Attention is then turned to identifying the minimal necessary remedial action required to restore the rate of convergence to that enjoyed by the ideal fault-free method.
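To make the setting concrete, here is a minimal sketch of one two-grid cycle for the 1D Poisson problem -u'' = f with zero boundary values, with an optional fault injected into the coarse-grid correction. The model problem, the damped Jacobi smoother, and in particular the fault model (each entry of the prolongated correction is zeroed with probability `fault_prob`) are illustrative assumptions, not necessarily the operators or the fault model analysed in the paper.

```python
import math
import random


def residual(u, f, h):
    """r = f - A u for A = tridiag(-1, 2, -1) / h^2 with zero boundary values."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(f[i] - (2.0 * u[i] - left - right) / (h * h))
    return r


def jacobi(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Damped Jacobi smoothing; the diagonal of A is 2 / h^2."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * 0.5 * h * h * ri for ui, ri in zip(u, r)]
    return u


def restrict(r):
    """Full weighting; coarse point j sits at fine index 2j + 1."""
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
            for j in range((len(r) - 1) // 2)]


def prolong(ec, n):
    """Linear interpolation from the coarse grid to n fine points."""
    e = [0.0] * n
    for j, v in enumerate(ec):
        e[2 * j] += 0.5 * v
        e[2 * j + 1] += v
        e[2 * j + 2] += 0.5 * v
    return e


def coarse_solve(rhs, h):
    """Thomas algorithm for tridiag(-1, 2, -1) / h^2 x = rhs."""
    n, a, b = len(rhs), -1.0 / (h * h), 2.0 / (h * h)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = a / b, rhs[0] / b
    for i in range(1, n):
        m = b - a * c[i - 1]
        c[i], d[i] = a / m, (rhs[i] - a * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def two_grid(u, f, h, fault_prob=0.0, rng=None):
    """Pre-smooth, coarse-grid correction (possibly faulty), post-smooth."""
    u = jacobi(u, f, h)
    ec = coarse_solve(restrict(residual(u, f, h)), 2.0 * h)
    e = prolong(ec, len(u))
    if fault_prob > 0.0:
        # Assumed fault model: lost entries of the correction are replaced by zero.
        e = [0.0 if rng.random() < fault_prob else v for v in e]
    return jacobi([ui + ei for ui, ei in zip(u, e)], f, h)


def relative_residual(cycles, fault_prob, n=31, seed=0):
    """Residual norm after `cycles` two-grid cycles, relative to the initial one."""
    h, f, u, rng = 1.0 / (n + 1), [1.0] * n, [0.0] * n, random.Random(seed)
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    r0 = norm(residual(u, f, h))
    for _ in range(cycles):
        u = two_grid(u, f, h, fault_prob, rng)
    return norm(residual(u, f, h)) / r0
```

Calling `relative_residual(6, 0.0)` and then `relative_residual(6, p)` for increasing `p` gives a quick, informal feel for how losing parts of the correction degrades convergence under this assumed model; it is an experiment, not a substitute for the analysis the abstract describes.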
Usability Studies in Virtual and Traditional Computer Aided Design Environments for Fault Identification

DTIC Science & Technology

2017-08-08

Usability Studies In Virtual And Traditional Computer Aided Design Environments For Fault Identification Dr. Syed Adeel Ahmed, Xavier University...virtual environment with wand interfaces compared directly with a workstation non-stereoscopic traditional CAD interface with keyboard and mouse. In...the differences in interaction when compared with traditional human computer interfaces. This paper provides analysis via usability study methods
Contextuality supplies the 'magic' for quantum computation.

PubMed

Howard, Mark; Wallman, Joel; Veitch, Victor; Emerson, Joseph

2014-06-19

Quantum computers promise dramatic advantages over their classical counterparts, but the source of the power in quantum computing has remained elusive. Here we prove a remarkable equivalence between the onset of contextuality and the possibility of universal quantum computation via 'magic state' distillation, which is the leading model for experimentally realizing a fault-tolerant quantum computer. This is a conceptually satisfying link, because contextuality, which precludes a simple 'hidden variable' model of quantum mechanics, provides one of the fundamental characterizations of uniquely quantum phenomena. Furthermore, this connection suggests a unifying paradigm for the resources of quantum information: the non-locality of quantum theory is a particular kind of contextuality, and non-locality is already known to be a critical resource for achieving advantages with quantum communication. In addition to clarifying these fundamental issues, this work advances the resource framework for quantum computation, which has a number of practical applications, such as characterizing the efficiency and trade-offs between distinct theoretical and experimental schemes for achieving robust quantum computation, and putting bounds on the overhead cost for the classical simulation of quantum algorithms.

Validation of fault-free behavior of a reliable multiprocessor system - FTMP: A case study. [Fault-Tolerant Multi-Processor avionics

NASA Technical Reports Server (NTRS)

Clune, E.; Segall, Z.; Siewiorek, D.

1984-01-01

A program of experiments has been conducted at NASA-Langley to test the fault-free performance of a Fault-Tolerant Multiprocessor (FTMP) avionics system for next-generation aircraft. Baseline measurements of an operating FTMP system were obtained with respect to the following parameters: instruction execution time, frame size, and the variation of clock ticks. The mechanisms of frame stretching were also investigated. The experimental results are summarized in a table. Areas of interest for future tests are identified, with emphasis given to the implementation of a synthetic workload generation mechanism on FTMP.
The Design and Semi-Physical Simulation Test of Fault-Tolerant Controller for Aero Engine

NASA Astrophysics Data System (ADS)

Liu, Yuan; Zhang, Xin; Zhang, Tianhong

2017-11-01

A new fault-tolerant control method for aero engines is proposed, which can accurately diagnose sensor faults using Kalman filter banks and reconstruct the signal using a real-time on-board adaptive model that combines a simplified real-time model with an improved Kalman filter. To verify the feasibility of the proposed method, a semi-physical simulation experiment has been carried out. Besides the real I/O interfaces, controller hardware and the virtual plant model, the semi-physical simulation system also contains a real fuel system. Compared with hardware-in-the-loop (HIL) simulation, the semi-physical simulation system has a higher degree of confidence. To meet the needs of semi-physical simulation, a rapid prototyping controller with fault-tolerant control ability, based on the NI CompactRIO platform, is designed and verified on the semi-physical simulation test platform. The results show that the controller can control the aero engine safely and reliably, with little influence on controller performance, in the event of a sensor fault.
A comparative study of sensor fault diagnosis methods based on observer for ECAS system

NASA Astrophysics Data System (ADS)

Xu, Xing; Wang, Wei; Zou, Nannan; Chen, Long; Cui, Xiaoli

2017-03-01

The performance and practicality of an electronically controlled air suspension (ECAS) system depend heavily on the state information supplied by various sensors, but sensor faults occur frequently. Based on a non-linearized 3-DOF 1/4 vehicle model, different fault detection and isolation (FDI) methods are used to diagnose sensor faults in the ECAS system. The approaches considered are an extended Kalman filter (EKF), noted for its concise algorithm, a strong tracking filter (STF), noted for its robust tracking ability, and a cubature Kalman filter (CKF), noted for its numerical precision. Each of the three filters (EKF, STF, and CKF) is used to design a state observer for the ECAS system under typical sensor faults and noise. Results show that all three approaches can successfully detect and isolate faults despite environmental noise, although FDI time delay and fault sensitivity differ between the algorithms. Compared with EKF and STF, the CKF method gives the best FDI performance for sensor faults in the ECAS system.
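Observer-based sensor FDI of this kind typically monitors the filter's innovation (measurement minus prediction). The sketch below shows that idea with a scalar linear Kalman filter and a fixed threshold on the normalized innovation. The scalar model, the noise variances, the threshold of three standard deviations, and the function name are all assumptions for illustration; the paper's EKF, STF, and CKF observers for the nonlinear vehicle model are considerably more involved.

```python
def kalman_fdi(measurements, a=1.0, q=1e-4, r=0.01, x0=0.0, p0=1.0, threshold=3.0):
    """Flag measurements whose innovation is implausibly large.

    Model (assumed): x[k+1] = a * x[k] + w,  z[k] = x[k] + v,
    with process noise variance q and measurement noise variance r.
    Returns a list of booleans, True where a sensor fault is suspected.
    """
    x, p = x0, p0
    flags = []
    for z in measurements:
        # Prediction step.
        x = a * x
        p = a * p * a + q
        # Innovation and its variance.
        nu = z - x
        s = p + r
        faulty = abs(nu) > threshold * s ** 0.5
        flags.append(faulty)
        if not faulty:
            # Measurement update only with measurements judged healthy,
            # so a faulty sensor does not drag the estimate along.
            k = p / s
            x = x + k * nu
            p = (1.0 - k) * p
    return flags


if __name__ == "__main__":
    # Healthy readings near 1.0, then a bias fault of +5 from sample 30 onward.
    healthy = [1.0 + 0.05 * (-1) ** i for i in range(30)]
    biased = [6.0 + 0.05 * (-1) ** i for i in range(20)]
    print(kalman_fdi(healthy + biased))
```

Skipping the update while a fault is flagged is one design choice among several; it keeps the estimate anchored to the last healthy data, at the cost of never recovering automatically if the flag was a false alarm.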
Definition and trade-off study of reconfigurable airborne digital computer system organizations

NASA Technical Reports Server (NTRS)

Conn, R. B.

1974-01-01

A highly reliable, fault-tolerant reconfigurable computer system for aircraft applications was developed. The development and application of reliability and fault-tolerance assessment techniques are described. Particular emphasis is placed on the needs of an all-digital, fly-by-wire control system appropriate for a passenger-carrying airplane.
Fault-tolerant nonlinear adaptive flight control using sliding mode online learning.

PubMed

Krüger, Thomas; Schnetter, Philipp; Placzek, Robin; Vörsmann, Peter

2012-08-01

An expanded nonlinear model inversion flight control strategy using sliding mode online learning for neural networks is presented. The proposed control strategy is implemented for a small unmanned aircraft system (UAS). This class of aircraft is very susceptible towards nonlinearities like atmospheric turbulence, model uncertainties and of course system failures. Therefore, these systems mark a sensible testbed to evaluate fault-tolerant, adaptive flight control strategies. Within this work the concept of feedback linearization is combined with feed forward neural networks to compensate for inversion errors and other nonlinear effects. Backpropagation-based adaption laws of the network weights are used for online training. Within these adaption laws the standard gradient descent backpropagation algorithm is augmented with the concept of sliding mode control (SMC). Implemented as a learning algorithm, this nonlinear control strategy treats the neural network as a controlled system and allows a stable, dynamic calculation of the learning rates. While considering the system's stability, this robust online learning method therefore offers a higher speed of convergence, especially in the presence of external disturbances. The SMC-based flight controller is tested and compared with the standard gradient descent backpropagation algorithm in the presence of system failures. Copyright © 2012 Elsevier Ltd. All rights reserved.
Universal quantum gate set approaching fault-tolerant thresholds with superconducting qubits.

PubMed

Chow, Jerry M; Gambetta, Jay M; Córcoles, A D; Merkel, Seth T; Smolin, John A; Rigetti, Chad; Poletto, S; Keefe, George A; Rothwell, Mary B; Rozen, J R; Ketchen, Mark B; Steffen, M

2012-08-10

We use quantum process tomography to characterize a full universal set of all-microwave gates on two superconducting single-frequency single-junction transmon qubits. All extracted gate fidelities, including those for Clifford group generators, single-qubit π/4 and π/8 rotations, and a two-qubit controlled-not, exceed 95% (98%), without (with) subtracting state preparation and measurement errors. Furthermore, we introduce a process map representation in the Pauli basis which is visually efficient and informative. This high-fidelity gate set serves as a critical building block towards scalable architectures of superconducting qubits for error correction schemes and pushes up on the known limits of quantum gate characterization.
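For readers unfamiliar with the gate set mentioned here, the sketch below checks a few standard identities among the ideal textbook matrices: the Hadamard gate H, the phase gate S, and the gate usually called T or the π/8 gate. It works purely with ideal 2x2 matrices and hypothetical helper names; it does not model the transmon hardware or the process tomography used in the paper.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


s2 = 1.0 / math.sqrt(2.0)
H = [[s2, s2], [s2, -s2]]                          # Hadamard (Clifford)
S = [[1, 0], [0, 1j]]                              # phase gate (Clifford)
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]     # pi/8 gate (non-Clifford)
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

if __name__ == "__main__":
    print("T*T == S:", close(matmul(T, T), S))
    print("S*S == Z:", close(matmul(S, S), Z))
    print("H*Z*H == X:", close(matmul(matmul(H, Z), H), X))
```

The Clifford generators alone are not universal; adding a non-Clifford gate such as T, together with a two-qubit gate like the controlled-NOT, is the usual route to a universal set.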
Universal Quantum Computing with Measurement-Induced Continuous-Variable Gate Sequence in a Loop-Based Architecture.

PubMed

Takeda, Shuntaro; Furusawa, Akira

2017-09-22

We propose a scalable scheme for optical quantum computing using measurement-induced continuous-variable quantum gates in a loop-based architecture. Here, time-bin-encoded quantum information in a single spatial mode is deterministically processed in a nested loop by an electrically programmable gate sequence. This architecture can process any input state and an arbitrary number of modes with almost minimum resources, and offers a universal gate set for both qubits and continuous variables. Furthermore, quantum computing can be performed fault tolerantly by a known scheme for encoding a qubit in an infinite-dimensional Hilbert space of a single light mode.
An universal read-out controller

NASA Astrophysics Data System (ADS)

Manz, S.; Abel, N.; Gebelein, J.; Kebschull, U.

2010-11-01

Since 2007 we have been designing and developing a ROC (read-out controller) for FAIR's data acquisition. While our first implementation focused solely on the nXYTER, today we are also designing and implementing readout logic for the GET4, which is expected to be part of the ToF detector. Furthermore, we fully support both Ethernet and optical transport as two transparent solutions. Strict modularization of the read-out controller enables us to provide a universal ROC in which front-end-specific logic and transport logic can be combined very flexibly. Fault tolerance techniques are required for only some of these modules and hence are implemented only there.
Universal Quantum Computing with Measurement-Induced Continuous-Variable Gate Sequence in a Loop-Based Architecture

NASA Astrophysics Data System (ADS)

Takeda, Shuntaro; Furusawa, Akira

2017-09-01

We propose a scalable scheme for optical quantum computing using measurement-induced continuous-variable quantum gates in a loop-based architecture. Here, time-bin-encoded quantum information in a single spatial mode is deterministically processed in a nested loop by an electrically programmable gate sequence. This architecture can process any input state and an arbitrary number of modes with almost minimum resources, and offers a universal gate set for both qubits and continuous variables. Furthermore, quantum computing can be performed fault tolerantly by a known scheme for encoding a qubit in an infinite-dimensional Hilbert space of a single light mode.
Universal Quantum Gate Set Approaching Fault-Tolerant Thresholds with Superconducting Qubits

NASA Astrophysics Data System (ADS)

Chow, Jerry M.; Gambetta, Jay M.; Córcoles, A. D.; Merkel, Seth T.; Smolin, John A.; Rigetti, Chad; Poletto, S.; Keefe, George A.; Rothwell, Mary B.; Rozen, J. R.; Ketchen, Mark B.; Steffen, M.

2012-08-01

We use quantum process tomography to characterize a full universal set of all-microwave gates on two superconducting single-frequency single-junction transmon qubits. All extracted gate fidelities, including those for Clifford group generators, single-qubit π/4 and π/8 rotations, and a two-qubit controlled-not, exceed 95% (98%), without (with) subtracting state preparation and measurement errors. Furthermore, we introduce a process map representation in the Pauli basis which is visually efficient and informative. This high-fidelity gate set serves as a critical building block towards scalable architectures of superconducting qubits for error correction schemes and pushes up on the known limits of quantum gate characterization.
Design Principles for resilient cyber-physical Early Warning Systems - Challenges, Experiences, Design Patterns, and Best Practices

NASA Astrophysics Data System (ADS)

Gensch, S.; Wächter, J.; Schnor, B.

2014-12-01

Early warning systems (EWS) are safety-critical IT-infrastructures that serve the purpose of potentially saving lives or assets by observing real-world phenomena and issuing timely warning products to authorities and communities. An EWS consists of sensors, communication networks, data centers, simulation platforms, and dissemination channels. The components of this cyber-physical system may all be affected by both natural hazards and malfunctions of components alike. Resilience engineering so far has mostly been applied to safety-critical systems and processes in transportation (aviation, automobile), construction and medicine. Early warning systems need equivalent techniques to compensate for failures, and furthermore means to adapt to changing threats, emerging technology and research findings. We present threats and pitfalls from our experiences with the German and Indonesian tsunami early warning system, as well as architectural, technological and organizational concepts employed that can enhance an EWS' resilience. The current EWS is comprised of a multi-type sensor data upstream part, different processing and analysis engines, a decision support system, and various warning dissemination channels. Each subsystem requires a set of approaches towards ensuring stable functionality across system layer boundaries, including also institutional borders. Not only must services be available, but also produce correct results. Most sensors are distributed components with restricted resources, communication channels and power supply. An example for successful resilience engineering is the power capacity based functional management for buoy and tide gauge stations. We discuss various fault-models like cause and effect models on linear pathways, interaction of multiple events, complex and non-linear interaction of assumedly reliable subsystems and fault tolerance means implemented to tackle these threats.
Stochastic Stability of Nonlinear Sampled Data Systems with a Jump Linear Controller

NASA Technical Reports Server (NTRS)

Gonzalez, Oscar R.; Herencia-Zapana, Heber; Gray, W. Steven

2004-01-01

This paper analyzes the stability of a sampled-data system consisting of a deterministic, nonlinear, time-invariant, continuous-time plant and a stochastic, discrete-time, jump linear controller. The jump linear controller models, for example, computer systems and communication networks that are subject to stochastic upsets or disruptions. This sampled-data model has been used in the analysis and design of fault-tolerant systems and computer-control systems with random communication delays without taking into account the inter-sample response. To analyze stability, appropriate topologies are introduced for the signal spaces of the sampled-data system. With these topologies, the ideal sampling and zero-order-hold operators are shown to be measurable maps. This paper shows that the known equivalence between the stability of a deterministic, linear sampled-data system and its associated discrete-time representation as well as between a nonlinear sampled-data system and a linearized representation holds even in a stochastic framework.
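A much simpler relative of the problem described here is a scalar discrete-time jump linear recursion x[k+1] = a(θ[k]) x[k], where the mode θ[k] is drawn independently at each step. In that special case E[x[k]^2] is multiplied by the sum of p_i a_i^2 at every step, so the recursion is mean-square stable exactly when that sum is below one. The sketch below encodes this; the independent (rather than Markov) switching, the scalar state, and the function names are simplifying assumptions, and the continuous-time plant and inter-sample behaviour that the paper focuses on are not modelled at all.

```python
def second_moment_factor(gains, probs):
    """Per-step growth factor of E[x^2] for x[k+1] = a_i * x[k] with P(mode i) = p_i."""
    return sum(p * a * a for a, p in zip(gains, probs))


def is_mean_square_stable(gains, probs):
    """True if E[x[k]^2] decays to zero under i.i.d. mode switching (assumed)."""
    return second_moment_factor(gains, probs) < 1.0


def expected_square(x0, gains, probs, steps):
    """Exact E[x[steps]^2] for a deterministic initial state x0."""
    m = x0 * x0
    factor = second_moment_factor(gains, probs)
    for _ in range(steps):
        m *= factor
    return m


if __name__ == "__main__":
    gains = [0.5, 1.2]   # nominal mode is stable, upset mode is unstable
    print(is_mean_square_stable(gains, [0.7, 0.3]))   # upsets are rare
    print(is_mean_square_stable(gains, [0.2, 0.8]))   # upsets dominate
```

The example illustrates that an individually unstable mode can be tolerated if it occurs rarely enough, which is the basic intuition behind using jump linear models for systems subject to random upsets.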
Indirect adaptive fuzzy fault-tolerant tracking control for MIMO nonlinear systems with actuator and sensor failures.

PubMed

Bounemeur, Abdelhamid; Chemachema, Mohamed; Essounbouli, Najib

2018-05-10

In this paper, an active fuzzy fault tolerant tracking control (AFFTTC) scheme is developed for a class of multi-input multi-output (MIMO) unknown nonlinear systems in the presence of unknown actuator faults, sensor failures and external disturbance. The developed control scheme deals with four kinds of faults for both sensors and actuators. The bias, drift, and loss of accuracy additive faults are considered along with the loss of effectiveness multiplicative fault. A fuzzy adaptive controller based on back-stepping design is developed to deal with actuator failures and unknown system dynamics. However, an additional robust control term is added to deal with sensor faults, approximation errors, and external disturbances. Lyapunov theory is used to prove the stability of the closed loop system. Numerical simulations on a quadrotor are presented to show the effectiveness of the proposed approach. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
From Three-Photon Greenberger-Horne-Zeilinger States to Ballistic Universal Quantum Computation.

PubMed

Gimeno-Segovia, Mercedes; Shadbolt, Pete; Browne, Dan E; Rudolph, Terry

2015-07-10

Single photons, manipulated using integrated linear optics, constitute a promising platform for universal quantum computation. A series of increasingly efficient proposals have shown linear-optical quantum computing to be formally scalable. However, existing schemes typically require extensive adaptive switching, which is experimentally challenging and noisy, thousands of photon sources per renormalized qubit, and/or large quantum memories for repeat-until-success strategies. Our work overcomes all these problems. We present a scheme to construct a cluster state universal for quantum computation, which uses no adaptive switching, no large memories, and which is at least an order of magnitude more resource efficient than previous passive schemes. Unlike previous proposals, it is constructed entirely from loss-detecting gates and offers robustness to photon loss. Even without the use of an active loss-tolerant encoding, our scheme naturally tolerates a total loss rate ∼1.6% in the photons detected in the gates. This scheme uses only three-photon Greenberger-Horne-Zeilinger states as a resource, together with a passive linear-optical network. We fully describe and model the iterative process of cluster generation, including photon loss and gate failure. This demonstrates that building a linear-optical quantum computer need not be as challenging as previously thought.
Advanced information processing system

NASA Technical Reports Server (NTRS)

Lala, J. H.

1984-01-01

Design and performance details of the advanced information processing system (AIPS) for fault and damage tolerant data processing on aircraft and spacecraft are presented. AIPS comprises several computers distributed throughout the vehicle and linked by a damage tolerant data bus. Most I/O functions are available to all the computers, which run in a TDMA mode. Each computer performs separate specific tasks in normal operation and assumes other tasks in degraded modes. Redundant software assures that all fault monitoring, logging and reporting are automated, together with control functions. Redundant duplex links and damage-spread limitation provide the fault tolerance. Details of an advanced design of a laboratory-scale proof-of-concept system are described, including functional operations.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Sadayappan, Ponnuswamy

Exascale computing systems will provide a thousand-fold increase in parallelism and a proportional increase in failure rate relative to today's machines. Systems software for exascale machines must provide the infrastructure to support existing applications while simultaneously enabling efficient execution of new programming models that naturally express dynamic, adaptive, irregular computation; coupled simulations; and massive data analysis in a highly unreliable hardware environment with billions of threads of execution. We propose a new approach to the data and work distribution model provided by system software based on the unifying formalism of an abstract file system. The proposed hierarchical data model provides simple, familiar visibility and access to data structures through the file system hierarchy, while providing fault tolerance through selective redundancy. The hierarchical task model features work queues whose form and organization are represented as file system objects. Data and work are both first class entities. By exposing the relationships between data and work to the runtime system, information is available to optimize execution time and provide fault tolerance. The data distribution scheme provides replication (where desirable and possible) for fault tolerance and efficiency, and it is hierarchical to make it possible to take advantage of locality. The user, tools, and applications, including legacy applications, can interface with the data, work queues, and one another through the abstract file model. This runtime environment will provide multiple interfaces to support traditional Message Passing Interface applications, languages developed under DARPA's High Productivity Computing Systems program, as well as other, experimental programming models. We will validate our runtime system with pilot codes on existing platforms and will use simulation to validate for exascale-class platforms.
In this final report, we summarize research results from the work done at the Ohio State University towards the larger goals of the project listed above.
A research program in empirical computer science

NASA Technical Reports Server (NTRS)

Knight, J. C.

1991-01-01

During the grant reporting period our primary activities have been to begin preparation for the establishment of a research program in experimental computer science. The focus of research in this program will be safety-critical systems. Many questions that arise in the effort to improve software dependability can only be addressed empirically. For example, there is no way to predict the performance of the various proposed approaches to building fault-tolerant software. Performance models, though valuable, are parameterized and cannot be used to make quantitative predictions without experimental determination of underlying distributions. In the past, experimentation has been able to shed some light on the practical benefits and limitations of software fault tolerance. It is common, also, for experimentation to reveal new questions or new aspects of problems that were previously unknown. A good example is the Consistent Comparison Problem that was revealed by experimentation and subsequently studied in depth. The result was a clear understanding of a previously unknown problem with software fault tolerance. The purpose of a research program in empirical computer science is to perform controlled experiments in the area of real-time, embedded control systems. The goal of the various experiments will be to determine better approaches to the construction of the software for computing systems that have to be relied upon. As such it will validate research concepts from other sources, provide new research results, and facilitate the transition of research results from concepts to practical procedures that can be applied with low risk to NASA flight projects. The target of experimentation will be the production software development activities undertaken by any organization prepared to contribute to the research program. Experimental goals, procedures, data analysis and result reporting will be performed for the most part by the University of Virginia.
Fault-tolerance of a neural network solving the traveling salesman problem

NASA Technical Reports Server (NTRS)

Protzel, P.; Palumbo, D.; Arras, M.

1989-01-01

This study presents the results of a fault-injection experiment that simulates a neural network solving the Traveling Salesman Problem (TSP). The network is based on a modified version of Hopfield's and Tank's original method. We define a performance characteristic for the TSP that allows an overall assessment of the solution quality for different city-distributions and problem sizes. Five different 10-, 20-, and 30-city cases are used for the injection of up to 13 simultaneous stuck-at-0 and stuck-at-1 faults. The results of more than 4000 simulation-runs show the extreme fault-tolerance of the network, especially with respect to stuck-at-0 faults. One possible explanation for the overall surprising result is the redundancy of the problem representation.
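The stuck-at fault injection described in this abstract can be illustrated with a minimal sketch. It is not the authors' simulator: the function names, the initial output value of 0.5, and the random seed are assumptions made for illustration. It relies only on the facts that a Hopfield-Tank network for an N-city tour uses N x N neurons and that a stuck-at fault pins a neuron output to 0 or 1.

```python
import random

def inject_stuck_at(outputs, faults):
    """Return a copy of the neuron outputs with stuck-at faults applied.

    faults maps a neuron index to the value (0.0 or 1.0) it is stuck at.
    """
    faulty = list(outputs)
    for index, stuck_value in faults.items():
        faulty[index] = stuck_value
    return faulty

def random_faults(num_neurons, num_faults, stuck_value, rng):
    """Pick num_faults distinct neurons and pin them all to stuck_value."""
    chosen = rng.sample(range(num_neurons), num_faults)
    return {index: stuck_value for index in chosen}

rng = random.Random(0)                    # fixed seed, assumed for repeatability
outputs = [0.5] * 100                     # 10-city case: 10 x 10 neurons (assumed initial value)
faults = random_faults(len(outputs), 13, 0.0, rng)   # 13 simultaneous stuck-at-0 faults
faulty_outputs = inject_stuck_at(outputs, faults)
```

In a full simulation the faults would be re-applied after every network update so that the affected neurons never leave their stuck value.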
Polarization of stacking fault related luminescence in GaN nanorods

NASA Astrophysics Data System (ADS)

Pozina, G.; Forsberg, M.; Serban, E. A.; Hsiao, C.-L.; Junaid, M.; Birch, J.; Kaliteevski, M. A.

2017-01-01

Linear polarization properties of light emission are presented for GaN nanorods (NRs) grown along [0001] direction on Si(111) substrates by direct-current magnetron sputter epitaxy. The near band gap photoluminescence (PL) measured at low temperature for a single NR demonstrated an excitonic line at ~3.48 eV and the stacking faults (SFs) related transition at ~3.43 eV. The SF related emission is linearly polarized in the direction perpendicular to the NR growth axis, in contrast to the non-polarized excitonic PL. The results are explained in the frame of the model describing basal plane SFs as polymorphic heterostructure of type II, where anisotropy of chemical bonds at the interfaces between zinc blende and wurtzite GaN subjected to in-built electric field is responsible for linear polarization parallel to the interface planes.
Faults on Skylab imagery of the Salton Trough area, Southern California

NASA Technical Reports Server (NTRS)

Merifield, P. M.; Lamar, D. L. (Principal Investigator)

1975-01-01

The author has identified the following significant results. Large segments of the major high angle faults in the Salton Trough area are readily identifiable in Skylab images. Along active faults, distinctive topographic features such as scarps and offset drainage, and vegetation differences due to ground water blockage in alluvium are visible. Other fault-controlled features along inactive as well as active faults visible in Skylab photography include straight mountain fronts, linear valleys, and lithologic differences producing contrasting tone, color or texture. A northwestern extension of a fault in the San Andreas set is postulated on the basis of the regional alignment of possible fault-controlled features. The suspected fault is covered by Holocene deposits, principally windblown sand. A northwest trending tonal change in cultivated fields across Mexicali Valley is visible on Skylab photos. Surface evidence for faulting was not observed; however, the linear may be caused by differences in soil conditions along an extension of a segment of the San Jacinto fault zone. No evidence of faulting could be found along linears which appear as possible extensions of the Substation and Victory Pass faults, demonstrating that the interpretation of linears as faults in small scale photography must be corroborated by field investigations.

Slip accumulation and lateral propagation of active normal faults in Afar

NASA Astrophysics Data System (ADS)

Manighetti, I.; King, G. C. P.; Gaudemer, Y.; Scholz, C. H.; Doubre, C.

2001-01-01

We investigate fault growth in Afar, where normal fault systems are known to be currently growing fast and most are propagating to the northwest. Using digital elevation models, we have examined the cumulative slip distribution along 255 faults with lengths ranging from 0.3 to 60 km. Faults exhibiting the elliptical or "bell-shaped" slip profiles predicted by simple linear elastic fracture mechanics or elastic-plastic theories are rare. Most slip profiles are roughly linear for more than half of their length, with overall slopes always <0.035. For the dominant population of NW striking faults and fault systems longer than 2 km, the slip profiles are asymmetric, with slip being maximum near the eastern ends of the profiles where it drops abruptly to zero, whereas slip decreases roughly linearly and tapers in the direction of overall Aden rift propagation. At a more detailed level, most faults appear to be composed of distinct, shorter subfaults or segments, whose slip profiles, while different from one to the next, combine to produce the roughly linear overall slip decrease along the entire fault. On a larger scale, faults cluster into kinematically coupled systems, along which the slip on any scale individual fault or fault system complements that of its neighbors, so that the total slip of the whole system is roughly linearly related to its length, with an average slope again <0.035. We discuss the origin of these quasilinear, asymmetric profiles in terms of "initiation points" where slip starts, and "barriers" where fault propagation is arrested. In the absence of a barrier, slip apparently extends with a roughly linear profile, tapered in the direction of fault propagation.
Hydraulic Universal Display Processor System (HUDPS).

DTIC Science & Technology

1981-11-21

emphasis on smart alphanumeric devices in Task II. Volatile and non-volatile memory components were utilized along with the Intel 8748 microprocessor...system. 1.2 TASK II Fault display methods for ground support personnel were investigated during Phase II with emphasis on smart alphanumeric devices...CONSIDERATIONS Methods of display fault indication for ground support personnel have been investigated with emphasis on "smart" alphanumeric devices
Software reliability through fault-avoidance and fault-tolerance

NASA Technical Reports Server (NTRS)

Vouk, Mladen A.; Mcallister, David F.

1993-01-01

Strategies and tools for the testing, risk assessment and risk control of dependable software-based systems were developed. Part of this project consists of studies to enable the transfer of technology to industry, for example the risk management techniques for safety-conscious systems. Theoretical investigations of Boolean and Relational Operator (BRO) testing strategy were conducted for condition-based testing. The Basic Graph Generation and Analysis tool (BGG) was extended to fully incorporate several variants of the BRO metric. Single- and multi-phase risk, coverage and time-based models are being developed to provide additional theoretical and empirical basis for estimation of the reliability and availability of large, highly dependable software. A model for software process and risk management was developed. The use of cause-effect graphing for software specification and validation was investigated. Lastly, advanced software fault-tolerance models were studied to provide alternatives and improvements in situations where simple software fault-tolerance strategies break down.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Fang, Aiman; Laguna, Ignacio; Sato, Kento

Future high-performance computing systems may face frequent failures with their rapid increase in scale and complexity. Resilience to faults has become a major challenge for large-scale applications running on supercomputers, which demands fault tolerance support for prevalent MPI applications. Among failure scenarios, process failures are one of the most severe issues as they usually lead to termination of applications. However, the widely used MPI implementations do not provide mechanisms for fault tolerance. We propose FTA-MPI (Fault Tolerance Assistant MPI), a programming model that provides support for failure detection, failure notification and recovery. Specifically, FTA-MPI exploits a try/catch model that enables failure localization and transparent recovery of process failures in MPI applications. We demonstrate FTA-MPI with synthetic applications and a molecular dynamics code CoMD, and show that FTA-MPI provides high programmability for users and enables convenient and flexible recovery of process failures.
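The try/catch recovery pattern named in this abstract can be illustrated generically. The sketch below does not use MPI and does not reproduce the FTA-MPI interface, which the abstract does not describe: `ProcessFailure`, `run_with_recovery`, and the retry limit are hypothetical names and choices introduced only to show the control flow of catching a failure notification, recovering, and retrying the interrupted step.

```python
class ProcessFailure(Exception):
    """Hypothetical exception standing in for a process-failure notification."""

    def __init__(self, rank):
        super().__init__(f"process {rank} failed")
        self.rank = rank

def run_with_recovery(step, recover, num_steps, max_retries=3):
    """Run step(i) for each i; on failure call recover(failure) and retry."""
    completed = []
    for i in range(num_steps):
        for attempt in range(max_retries + 1):
            try:
                completed.append(step(i))
                break                      # step succeeded, move to the next one
            except ProcessFailure as failure:
                if attempt == max_retries:
                    raise                  # give up after the last allowed retry
                recover(failure)           # e.g. replace the failed process
    return completed

# Usage: a step that fails once at i == 2 and succeeds when retried.
failed_once = set()
recovered_ranks = []

def step(i):
    if i == 2 and i not in failed_once:
        failed_once.add(i)
        raise ProcessFailure(rank=1)
    return i * i

result = run_with_recovery(step, lambda f: recovered_ranks.append(f.rank), num_steps=4)
```

Scoping the handler to a single step keeps the failure localized: only the interrupted step is repeated, and the steps completed before it are left untouched.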
Dual-quaternion based fault-tolerant control for spacecraft formation flying with finite-time convergence.

PubMed

Dong, Hongyang; Hu, Qinglei; Ma, Guangfu

2016-03-01

Results are presented from the development of a control system for spacecraft formation proximity operations between a target and a chaser. In particular, a coupled model using dual quaternions is employed to describe the proximity problem of spacecraft formation, and a nonlinear adaptive fault-tolerant feedback control law is developed to enable the chaser spacecraft to track the position and attitude of the target even when its actuators are faulty. The multiple-task capability of the proposed control system is further demonstrated in the presence of disturbances and parametric uncertainties. In addition, the practical finite-time stability of the closed-loop system is guaranteed theoretically under the designed control law. Numerical simulation of the proposed method is presented to demonstrate its advantages with respect to interference suppression, fast tracking, fault tolerance and practical finite-time stability. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
Experimental Robot Position Sensor Fault Tolerance Using Accelerometers and Joint Torque Sensors

NASA Technical Reports Server (NTRS)

Aldridge, Hal A.; Juang, Jer-Nan

1997-01-01

Robot systems in critical applications, such as those in space and nuclear environments, must be able to operate during component failure to complete important tasks. One failure mode that has received little attention is the failure of joint position sensors. Current fault tolerant designs require the addition of directly redundant position sensors which can affect joint design. The proposed method uses joint torque sensors found in most existing advanced robot designs along with easily locatable, lightweight accelerometers to provide a joint position sensor fault recovery mode. This mode uses the torque sensors along with a virtual passive control law for stability and accelerometers for joint position information. Two methods for conversion from Cartesian acceleration to joint position based on robot kinematics, not integration, are presented. The fault tolerant control method was tested on several joints of a laboratory robot. The controllers performed well with noisy, biased data and a model with uncertain parameters.
Integrated Fault Diagnosis Algorithm for Motor Sensors of In-Wheel Independent Drive Electric Vehicles.

PubMed

Jeon, Namju; Lee, Hyeongcheol

2016-12-12

An integrated fault-diagnosis algorithm for a motor sensor of in-wheel independent drive electric vehicles is presented. This paper proposes a method that integrates the high- and low-level fault diagnoses to improve the robustness and performance of the system. For the high-level fault diagnosis of vehicle dynamics, a planar two-track non-linear model is first selected, and the longitudinal and lateral forces are calculated. To ensure redundancy of the system, correlation between the sensor and residual in the vehicle dynamics is analyzed to detect and separate the fault of the drive motor system of each wheel. To diagnose the motor system for low-level faults, the state equation of an interior permanent magnet synchronous motor is developed, and a parity equation is used to diagnose the fault of the electric current and position sensors. The validity of the high-level fault-diagnosis algorithm is verified using Carsim and Matlab/Simulink co-simulation. The low-level fault diagnosis is verified through Matlab/Simulink simulation and experiments. Finally, according to the residuals of the high- and low-level fault diagnoses, fault-detection flags are defined. On the basis of this information, an integrated fault-diagnosis strategy is proposed.
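Residual-based detection of the kind this abstract describes can be sketched in a few lines. The example below is a generic illustration, not the authors' parity equation: the residual is taken simply as measured minus model-predicted value, and the threshold, persistence count, and sample data are assumed values chosen for the example.

```python
def residuals(measured, predicted):
    """Residual per sample: sensor reading minus model prediction."""
    return [m - p for m, p in zip(measured, predicted)]

def detect_fault(residual_series, threshold, persistence):
    """Return the sample index at which a fault is declared, or None.

    A fault is declared once the residual magnitude has exceeded the
    threshold for `persistence` consecutive samples, which suppresses
    isolated noise spikes.
    """
    run = 0
    for k, r in enumerate(residual_series):
        run = run + 1 if abs(r) > threshold else 0
        if run >= persistence:
            return k
    return None

# Assumed example data: a current sensor that develops a bias from sample 4 on.
measured = [10.0, 10.1, 12.5, 9.9, 12.4, 12.6, 12.5]
predicted = [10.0] * len(measured)
fault_index = detect_fault(residuals(measured, predicted), threshold=0.5, persistence=3)
```

With these numbers the single spike at sample 2 is ignored and the flag is raised at sample 6, the third consecutive sample above the threshold.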
Model-Based Fault Tolerant Control

NASA Technical Reports Server (NTRS)

Kumar, Aditya; Viassolo, Daniel

2008-01-01

The Model Based Fault Tolerant Control (MBFTC) task was conducted under the NASA Aviation Safety and Security Program. The goal of MBFTC is to develop and demonstrate real-time strategies to diagnose and accommodate anomalous aircraft engine events such as sensor faults, actuator faults, or turbine gas-path component damage that can lead to in-flight shutdowns, aborted take offs, asymmetric thrust/loss of thrust control, or engine surge/stall events. A suite of model-based fault detection algorithms were developed and evaluated. Based on the performance and maturity of the developed algorithms two approaches were selected for further analysis: (i) multiple-hypothesis testing, and (ii) neural networks; both used residuals from an Extended Kalman Filter to detect the occurrence of the selected faults. A simple fusion algorithm was implemented to combine the results from each algorithm to obtain an overall estimate of the identified fault type and magnitude. The identification of the fault type and magnitude enabled the use of an online fault accommodation strategy to correct for the adverse impact of these faults on engine operability thereby enabling continued engine operation in the presence of these faults. The performance of the fault detection and accommodation algorithm was extensively tested in a simulation environment.
Fault tolerant control laws

NASA Technical Reports Server (NTRS)

Ly, U. L.; Ho, J. K.

1986-01-01

A systematic procedure for the synthesis of control laws that are tolerant to actuator failure is presented. Two design methods were used to synthesize fault tolerant controllers: the conventional LQ design method and a direct feedback controller design method SANDY. The latter method is used primarily to streamline the full-state LQ feedback design into a practical implementable output feedback controller structure. To achieve robustness to control actuator failure, the redundant surfaces are properly balanced according to their control effectiveness. A simple gain schedule based on the landing gear up/down logic involving only three gains was developed to handle three design flight conditions: Mach .25 and Mach .60 at 5000 ft and Mach .90 at 20,000 ft. The fault tolerant control law developed in this study provides good stability augmentation and performance for the relaxed static stability aircraft. The augmented aircraft responses are found to be invariant to the presence of a failure. Furthermore, single-loop stability margins of +6 dB in gain and +30 deg in phase were achieved along with -40 dB/decade rolloff at high frequency.
High-Threshold Fault-Tolerant Quantum Computation with Analog Quantum Error Correction

NASA Astrophysics Data System (ADS)

Fukui, Kosuke; Tomita, Akihisa; Okamoto, Atsushi; Fujii, Keisuke

2018-04-01

To implement fault-tolerant quantum computation with continuous variables, the Gottesman-Kitaev-Preskill (GKP) qubit has been recognized as an important technological element. However, it is still challenging to experimentally generate the GKP qubit with the required squeezing level, 14.8 dB, of the existing fault-tolerant quantum computation. To reduce this requirement, we propose a high-threshold fault-tolerant quantum computation with GKP qubits using topologically protected measurement-based quantum computation with the surface code. By harnessing analog information contained in the GKP qubits, we apply analog quantum error correction to the surface code. Furthermore, we develop a method to prevent the squeezing level from decreasing during the construction of the large-scale cluster states for the topologically protected, measurement-based, quantum computation. We numerically show that the required squeezing level can be relaxed to less than 10 dB, which is within the reach of the current experimental technology. Hence, this work can considerably alleviate this experimental requirement and take a step closer to the realization of large-scale quantum computation.
Active Fault Tolerant Control for Ultrasonic Piezoelectric Motor

NASA Astrophysics Data System (ADS)

Boukhnifer, Moussa

2012-07-01

Ultrasonic piezoelectric motor technology is an important system component in integrated mechatronics devices working under extreme operating conditions. Due to these constraints, robustness and performance of the control interfaces should be taken into account in the motor design. In this paper, we apply a new architecture for fault tolerant control using Youla parameterization for an ultrasonic piezoelectric motor. The distinguishing feature of the proposed controller architecture is that it shows structurally how the controller design for performance and robustness may be done separately, which has the potential to overcome the conflict between performance and robustness in the traditional feedback framework. The fault tolerant control architecture includes two parts: one part for performance and the other part for robustness. The controller design works in such a way that the feedback control system is controlled solely by the proportional plus double-integral (PI2) performance controller for a nominal model without disturbances, and the H∞ robustification controller is activated only in the presence of uncertainties or external disturbances. The simulation results demonstrate the effectiveness of the proposed fault tolerant control architecture.
Detection of faults and software reliability analysis

NASA Technical Reports Server (NTRS)

Knight, J. C.

1987-01-01

Specific topics briefly addressed include: the consistent comparison problem in N-version systems; analytic models of comparison testing; fault tolerance through data diversity; and the relationship between failures caused by automatically seeded faults.
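The consistent comparison problem mentioned in this record can be illustrated with a small sketch. It is an assumed example, not taken from the report: two hypothetical versions compute the same quantity by different but equally valid arithmetic, obtain floating-point results that differ only in the last bit, and then disagree when each compares its own result against the same threshold.

```python
def version_a(x, y):
    """Version A: adds the two inputs directly in floating point."""
    return x + y

def version_b(x, y):
    """Version B: works in scaled integers (tenths) and converts back."""
    return (round(x * 10) + round(y * 10)) / 10

THRESHOLD = 0.3   # assumed specification: raise the alarm when the sum exceeds 0.3

value_a = version_a(0.1, 0.2)   # 0.30000000000000004 in IEEE 754 double precision
value_b = version_b(0.1, 0.2)   # 0.3

alarm_a = value_a > THRESHOLD   # True
alarm_b = value_b > THRESHOLD   # False

# The numeric results agree within any reasonable tolerance,
# yet the decisions derived from them do not.
values_agree = abs(value_a - value_b) < 1e-9
decisions_agree = alarm_a == alarm_b
```

Comparing the numeric values with a tolerance does not resolve the disagreement, because each version has already branched on its own comparison by the time the outputs reach the voter.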
Robust adaptive fault-tolerant control for leader-follower flocking of uncertain multi-agent systems with actuator failure.

PubMed

Yazdani, Sahar; Haeri, Mohammad

2017-11-01

In this work, we study the flocking problem of multi-agent systems with uncertain dynamics subject to actuator failure and external disturbances. By considering some standard assumptions, we propose a robust adaptive fault tolerant protocol to compensate for the actuator bias fault, the partial loss of actuator effectiveness fault, the model uncertainties, and external disturbances. Under the designed protocol, velocity convergence of the agents to that of the virtual leader is guaranteed, while connectivity preservation of the network and collision avoidance among agents are ensured as well. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
Dynamic rupture scenarios from Sumatra to Iceland - High-resolution earthquake source physics on natural fault systems

NASA Astrophysics Data System (ADS)

Gabriel, Alice-Agnes; Madden, Elizabeth H.; Ulrich, Thomas; Wollherr, Stephanie

2017-04-01

Capturing the observed complexity of earthquake sources in dynamic rupture simulations may require: non-linear fault friction, thermal and fluid effects, heterogeneous fault stress and fault strength initial conditions, fault curvature and roughness, on- and off-fault non-elastic failure. All of these factors have been independently shown to alter dynamic rupture behavior and thus possibly influence the degree of realism attainable via simulated ground motions. In this presentation we will show examples of high-resolution earthquake scenarios, e.g. based on the 2004 Sumatra-Andaman Earthquake, the 1994 Northridge earthquake and a potential rupture of the Husavik-Flatey fault system in Northern Iceland. The simulations combine a multitude of representations of source complexity at the necessary spatio-temporal resolution enabled by excellent scalability on modern HPC systems. Such simulations allow an analysis of the dominant factors impacting earthquake source physics and ground motions given distinct tectonic settings or distinct focuses of seismic hazard assessment. Across all simulations, we find that fault geometry concurrently with the regional background stress state provide a first order influence on source dynamics and the emanated seismic wave field. The dynamic rupture models are performed with SeisSol, a software package based on an ADER-Discontinuous Galerkin scheme for solving the spontaneous dynamic earthquake rupture problem with high-order accuracy in space and time. Use of unstructured tetrahedral meshes allows for a realistic representation of the non-planar fault geometry, subsurface structure and bathymetry. The results presented highlight the fact that modern numerical methods are essential to further our understanding of earthquake source physics and complement both physics-based ground motion research and empirical approaches in seismic hazard analysis.
Mini-Ckpts: Surviving OS Failures in Persistent Memory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fiala, David; Mueller, Frank; Ferreira, Kurt Brian

Concern is growing in the high-performance computing (HPC) community on the reliability of future extreme-scale systems. Current efforts have focused on application fault-tolerance rather than the operating system (OS), despite the fact that recent studies have suggested that failures in OS memory are more likely. The OS is critical to a system's correct and efficient operation of the node and processes it governs -- and in HPC also for any other nodes a parallelized application runs on and communicates with: Any single node failure generally forces all processes of this application to terminate due to tight communication in HPC. Therefore, the OS itself must be capable of tolerating failures. In this work, we introduce mini-ckpts, a framework which enables application survival despite the occurrence of a fatal OS failure or crash. Mini-ckpts achieves this tolerance by ensuring that the critical data describing a process is preserved in persistent memory prior to the failure. Following the failure, the OS is rejuvenated via a warm reboot and the application continues execution, effectively making the failure and restart transparent. The mini-ckpts rejuvenation and recovery process is measured to take between three and six seconds and has a failure-free overhead of 3-5% for a number of key HPC workloads. In contrast to current fault-tolerance methods, this work ensures that the operating and runtime system can continue in the presence of faults. This is a much finer-grained and dynamic method of fault-tolerance than the current, coarse-grained, application-centric methods. Handling faults at this level has the potential to greatly reduce overheads and enables mitigation of additional fault scenarios.
Soft-Fault Detection Technologies Developed for Electrical Power Systems

NASA Technical Reports Server (NTRS)

Button, Robert M.

2004-01-01

The NASA Glenn Research Center, partner universities, and defense contractors are working to develop intelligent power management and distribution (PMAD) technologies for future spacecraft and launch vehicles. The goals are to provide higher performance (efficiency, transient response, and stability), higher fault tolerance, and higher reliability through the application of digital control and communication technologies. It is also expected that these technologies will eventually reduce the design, development, manufacturing, and integration costs for large, electrical power systems for space vehicles. The main focus of this research has been to incorporate digital control, communications, and intelligent algorithms into power electronic devices such as direct-current to direct-current (dc-dc) converters and protective switchgear. These technologies, in turn, will enable revolutionary changes in the way electrical power systems are designed, developed, configured, and integrated in aerospace vehicles and satellites. Initial successes in integrating modern, digital controllers have proven that transient response performance can be improved using advanced nonlinear control algorithms. One technology being developed includes the detection of "soft faults," those not typically covered by current systems in use today. Soft faults include arcing faults, corona discharge faults, and undetected leakage currents. Using digital control and advanced signal analysis algorithms, we have shown that it is possible to reliably detect arcing faults in high-voltage dc power distribution systems (see the preceding photograph). Another research effort has shown that low-level leakage faults and cable degradation can be detected by analyzing power system parameters over time. This additional fault detection capability will result in higher reliability for long-lived power systems such as reusable launch vehicles and space exploration missions.
Evaluating SPLASH-2 Applications Using MapReduce

NASA Astrophysics Data System (ADS)

Zhu, Shengkai; Xiao, Zhiwei; Chen, Haibo; Chen, Rong; Zhang, Weihua; Zang, Binyu

MapReduce has become prevalent for running data-parallel applications. By hiding non-functional concerns such as parallelism, fault tolerance, and load balancing from programmers, MapReduce significantly simplifies the programming of large clusters. Because of these features, researchers have also explored the use of MapReduce in other application domains, such as machine learning, textual retrieval, and statistical translation, among others.
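As a rough illustration of the programming model this abstract refers to, the following single-process Python sketch runs a word count through separate map, shuffle, and reduce steps. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and do not come from the paper; a real MapReduce runtime would distribute these steps over a cluster and take care of parallelism, fault tolerance, and load balancing itself.

```python
from collections import defaultdict


def map_phase(document):
    """Emit one (word, 1) pair for every word in a single input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a runtime would between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values that were emitted for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence (hypothetical driver, single process)."""
    intermediate = []
    for doc in documents:
        intermediate.extend(map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
```

The programmer only writes `map_phase` and `reduce_phase`; everything in `shuffle` and `word_count` stands in for what the framework would normally hide.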
Fault tolerant operation of switched reluctance machine

NASA Astrophysics Data System (ADS)

Wang, Wei

The energy crisis and environmental challenges have driven industry towards more energy efficient solutions. With nearly 60% of electricity consumed by various electric machines in the industrial sector, advancement in the efficiency of the electric drive system is of vital importance. An adjustable speed drive system (ASDS) provides excellent speed regulation and dynamic performance as well as dramatically improved system efficiency compared with conventional motors without electronic drives. Industry has witnessed tremendous growth in ASDS applications, not only as a driving force but also as an electric auxiliary system replacing bulky and low-efficiency auxiliary hydraulic and mechanical systems. With the vast penetration of ASDS, its fault tolerant operation capability is more widely recognized as an important feature of drive performance, especially for aerospace and automotive applications and other industrial drive applications demanding high reliability. The Switched Reluctance Machine (SRM), a low cost, highly reliable electric machine with fault tolerant operation capability, has drawn substantial attention in the past three decades. Nevertheless, SRM is not free of faults. Certain faults such as converter faults, sensor faults, winding shorts, eccentricity and position sensor faults are commonly shared among all ASDS. In this dissertation, a thorough understanding of various faults and their influence on transient and steady state performance of SRM is developed via simulation and experimental study, providing necessary knowledge for fault detection and post fault management. Lumped parameter models are established for fast real time simulation and drive control. Based on the behavior of the faults, a fault detection scheme is developed for the purpose of fast and reliable fault diagnosis.
In order to improve the SRM power and torque capacity under faults, maximum torque per ampere excitation is conceptualized and validated through theoretical analysis and experiments. With the proposed optimal waveform, torque production is greatly improved under the same Root Mean Square (RMS) current constraint. Additionally, position sensorless operation methods under phase faults are investigated to account for the combination of physical position sensor and phase winding faults. A comprehensive solution for position sensorless operation under single- and multiple-phase faults is proposed and validated through experiments. Continuous position sensorless operation with seamless transition between various numbers of phase faults is achieved.
Generation of the September 29, 2009 Samoa Tsunami: Examination of a Possible Non-Double Couple Component (Invited)

NASA Astrophysics Data System (ADS)

Geist, E. L.; Kirby, S. H.; Ross, S.; Dartnell, P.

2009-12-01

A non-double couple component associated with the Mw=8.0 September 29, 2009 Samoa earthquake is investigated to explain direct tsunami arrivals at deep-ocean pressure sensors (i.e., DART stations). In particular, we seek a tsunami generation model that correctly predicts the polarity of first motions: negative at the Apia station (#51425) NW of the epicenter and positive at the Tonga (#51426) and Auckland (#54401) stations south of the epicenter. Slip on a single, finite fault corresponding to either nodal plane of the best-fitting double couple fails to predict the positive first-motion polarity observed at the southerly (Tonga and Auckland) DART stations. The Samoa earthquake has a significant non-double couple component as measured by the compensated linear vector dipole (CLVD) ratio, which ranges from |ɛ| = 0.15 (USGS CMT) to |ɛ| = 0.37 (Global CMT). To test what effect the non-double couple component has on tsunami generation, the static elastic displacement field at the sea floor is computed from the full moment tensor. This displacement field represents the initial conditions for tsunami propagation computed using a finite-difference approximation to the linear shallow-water wave equations. The tsunami waveforms calculated from the full moment tensor are consistent with the observed polarities at all of the DART stations. The static displacement field is then decomposed into double-couple and non-double couple components to determine the relative contribution of each to the tsunami wavefield. Although a point-source approximation to the tsunami source is typically inadequate at near-field and regional distances, finite-fault inversions of the 2009 Samoa earthquake indicate that peak slip is spatially concentrated near the hypocenter, suggesting that the point-source representation may be acceptable in this case.
Generation of the 2009 Samoa tsunami may involve earthquake rupture on multiple faults and/or along curved faults, both of which are observed from multibeam bathymetry in the epicentral region. The exact rupture path of the earthquake is presently unclear. It is evident from seismological and tsunami observations of the 2009 Samoa event, however, that uniform slip on a single, planar fault cannot explain all aspects of the observed tsunami wavefield.
Combining dynamical decoupling with fault-tolerant quantum computation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ng, Hui Khoon; Preskill, John; Lidar, Daniel A.

2011-07-15

We study how dynamical decoupling (DD) pulse sequences can improve the reliability of quantum computers. We prove upper bounds on the accuracy of DD-protected quantum gates and derive sufficient conditions for DD-protected gates to outperform unprotected gates. Under suitable conditions, fault-tolerant quantum circuits constructed from DD-protected gates can tolerate stronger noise and have a lower overhead cost than fault-tolerant circuits constructed from unprotected gates. Our accuracy estimates depend on the dynamics of the bath that couples to the quantum computer and can be expressed either in terms of the operator norm of the bath's Hamiltonian or in terms of the power spectrum of bath correlations; we explain in particular how the performance of recursively generated concatenated pulse sequences can be analyzed from either viewpoint. Our results apply to Hamiltonian noise models with limited spatial correlations.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Duan, Sisi; Li, Yun; Levitt, Karl N.

Consensus is a fundamental approach to implementing fault-tolerant services through replication, where there exists a tradeoff between cost and resilience. For instance, Crash Fault Tolerant (CFT) protocols have a low cost but can only handle crash failures, while Byzantine Fault Tolerant (BFT) protocols handle arbitrary failures but have a higher cost. Hybrid protocols enjoy the benefits of both high performance without failures and high resiliency under failures by switching among different subprotocols. However, it is challenging to determine which subprotocols should be used. We propose a moving target approach to switch among protocols according to the existing system and network vulnerability. At the core of our approach is a formalized cost model that evaluates the vulnerability and performance of consensus protocols based on real-time Intrusion Detection System (IDS) signals. Based on the evaluation results, we demonstrate that a safe, cheap, and unpredictable protocol is always used and a high IDS error rate can be tolerated.
Measurement system analysis of viscometers used for drilling mud characterization

NASA Astrophysics Data System (ADS)

Mat-Shayuti, M. S.; Adzhar, S. N.

2017-07-01

Viscometers in the Faculty of Chemical Engineering, Universiti Teknologi MARA, are heavily used by members of the faculty. Due to doubts surrounding the integrity of their results and their maintenance management, a Measurement System Analysis was carried out. Five samples of drilling mud with barite content varying from 5 to 25 weight% were prepared, and their rheological properties were determined in 3 trials by 3 operators using the viscometers. A Gage Linearity and Bias Study was performed using Minitab software, and the results show high biases in the range of 19.2% to 38.7%, with a non-linear trend along the span of measurements. Gage Repeatability & Reproducibility (Nested) analysis then produces Percent Repeatability & Reproducibility of more than 7.7% and Percent Tolerance above 30%. Lastly, good and marginal Distinct Categories outputs are seen among the results. Despite acceptable performance of the measurement system in Distinct Categories, the poor results in accuracy, linearity, and Percent Repeatability & Reproducibility render the gage generally not capable. Improvement to the measurement system is imminent.
Fault Tolerant Software Technology for Distributed Computer Systems

DTIC Science & Technology

1989-03-01

...TR-88-296 Final Technical Report ... FAULT TOLERANT SOFTWARE TECHNOLOGY FOR DISTRIBUTED COMPUTER SYSTEMS Georgia Institute... TITLE (Include Security Classification) FAULT TOLERANT SOFTWARE FOR DISTRIBUTED COMPUTER ...Technology for Distributed Computing Systems," a two year effort performed at Georgia Institute of Technology as part of the Clouds Project. The Clouds
Combined methods of tolerance increasing for embedded SRAM

NASA Astrophysics Data System (ADS)

Shchigorev, L. A.; Shagurin, I. I.

2016-10-01

The combined use of different methods for increasing the fault tolerance of SRAM, such as error detection and correction codes, parity bits, and redundant elements, is considered. Area penalties due to using combinations of these methods are investigated. Estimates are made for different configurations of a 4K x 128 RAM memory block in a 28 nm manufacturing process. An evaluation of the effectiveness of the proposed combinations is also reported. The results of these investigations can be useful for designing fault-tolerant "systems on chip".
Multiple Intelligence Scores of Science Stream Students and Their Relation with Reading Competency in Malaysian University English Test (MUET)

ERIC Educational Resources Information Center

Razak, Norizan Abdul; Zaini, Nuramirah

2014-01-01

Much research has shown that different approaches are needed to analyse linear and non-linear reading comprehension texts and that different cognitive skills are required. This research attempts to discover the relationship between Science Stream students' reading competency on linear and non-linear texts in Malaysian University English Test (MUET) with…
High reliability linear drive device for artificial hearts

NASA Astrophysics Data System (ADS)

Ji, Jinghua; Zhao, Wenxiang; Liu, Guohai; Shen, Yue; Wang, Fangqun

2012-04-01

In this paper, a new high reliability linear drive device, termed the stator-permanent-magnet tubular oscillating actuator (SPM-TOA), is proposed for artificial hearts (AHs). The key is to incorporate the concept of two independent phases into this linear AH device, hence achieving high reliability operation. Fault-tolerant teeth are employed to provide the desired decoupling of the phases in the magnetic circuit. Also, as the magnets and the coils are located in the stator, the proposed SPM-TOA has the definite advantages of a robust mover and direct-drive capability. By using the time-stepping finite element method, the electromagnetic characteristics of the proposed SPM-TOA are analyzed, including magnetic field distributions, flux linkages, back-electromotive forces (back-EMFs), self- and mutual inductances, as well as cogging and thrust forces. The results confirm that the proposed SPM-TOA meets the dimension, weight, and force requirements of the AH drive device.
3D Ta/TaOx/TiO2/Ti synaptic array and linearity tuning of weight update for hardware neural network applications

NASA Astrophysics Data System (ADS)

Wang, I.-Ting; Chang, Chih-Cheng; Chiu, Li-Wen; Chou, Teyuh; Hou, Tuo-Hung

2016-09-01

The implementation of highly anticipated hardware neural networks (HNNs) hinges largely on the successful development of a low-power, high-density, and reliable analog electronic synaptic array. In this study, we demonstrate a two-layer Ta/TaOx/TiO2/Ti cross-point synaptic array that emulates the high-density three-dimensional network architecture of human brains. Excellent uniformity and reproducibility among intralayer and interlayer cells were realized. Moreover, at least 50 analog synaptic weight states could be precisely controlled with minimal drifting during a cycling endurance test of 5000 training pulses at an operating voltage of 3 V. We also propose a new state-independent bipolar-pulse-training scheme to improve the linearity of weight updates. The improved linearity considerably enhances the fault tolerance of HNNs, thus improving the training accuracy.
An approximation formula for a class of fault-tolerant computers

NASA Technical Reports Server (NTRS)

White, A. L.

1986-01-01

An approximation formula is derived for the probability of failure for fault-tolerant process-control computers. These computers use redundancy and reconfiguration to achieve high reliability. Finite-state Markov models capture the dynamic behavior of component failure and system recovery, and the approximation formula permits an estimation of system reliability by an easy examination of the model.
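The kind of finite-state Markov reliability model mentioned in this abstract can be sketched in a few lines of Python. The example below is not White's approximation formula, which the abstract does not state; it is a generic, assumed model of a triple-redundant system without repair (states: three units good, two units good, failed), with a hypothetical per-unit failure rate `lam`. The forward equations are integrated with plain Euler steps and compared with the well-known closed form 1 − (3r² − 2r³), where r = exp(−lam·t).

```python
import math


def tmr_failure_probability(lam, t, steps=100_000):
    """Integrate the forward equations of a 3-state Markov chain with Euler steps.

    States: p3 = three units good, p2 = two units good, pf = system failed.
    Transitions: p3 -> p2 at rate 3*lam, p2 -> pf at rate 2*lam (no repair assumed).
    """
    dt = t / steps
    p3, p2, pf = 1.0, 0.0, 0.0
    for _ in range(steps):
        d3 = -3 * lam * p3
        d2 = 3 * lam * p3 - 2 * lam * p2
        df = 2 * lam * p2
        p3, p2, pf = p3 + d3 * dt, p2 + d2 * dt, pf + df * dt
    return pf


def tmr_failure_closed_form(lam, t):
    """Closed-form failure probability of the same model: 1 - (3r^2 - 2r^3)."""
    r = math.exp(-lam * t)
    return 1.0 - (3 * r**2 - 2 * r**3)
```

For small `lam * t` the closed form is close to 3·(lam·t)², which shows how a simple expression can stand in for a full numerical solution of the model; the formula derived in the report itself may differ.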
Design of penicillin fermentation process simulation system

NASA Astrophysics Data System (ADS)

Qi, Xiaoyu; Yuan, Zhonghu; Qi, Xiaoxuan; Zhang, Wenqi

2011-10-01

Real-time monitoring of batch processes is attracting increasing attention. It can ensure safety and provide products with consistent quality. The design of a simulation system for batch process fault diagnosis is of great significance. In this paper, penicillin fermentation, a typical non-linear, dynamic, multi-stage batch production process, is taken as the research object. A visual human-machine interactive simulation software system based on the Windows operating system is developed. The simulation system can provide an effective platform for research on batch process fault diagnosis.
Fault tolerant onboard packet switch architecture for communication satellites: Shared memory per beam approach

NASA Technical Reports Server (NTRS)

Shalkhauser, Mary Jo; Quintana, Jorge A.; Soni, Nitin J.

1994-01-01

The NASA Lewis Research Center is developing a multichannel communication signal processing satellite (MCSPS) system which will provide low data rate, direct to user, commercial communications services. The focus of current space segment developments is a flexible, high-throughput, fault tolerant onboard information switching processor. This information switching processor (ISP) is a destination-directed packet switch which performs both space and time switching to route user information among numerous user ground terminals. Through both industry study contracts and in-house investigations, several packet switching architectures were examined. A contention-free approach, the shared memory per beam architecture, was selected for implementation. The shared memory per beam architecture, fault tolerance insertion, implementation, and demonstration plans are described.
Formal design specification of a Processor Interface Unit

NASA Technical Reports Server (NTRS)

Fura, David A.; Windley, Phillip J.; Cohen, Gerald C.

1992-01-01

This report describes work to formally specify the requirements and design of a processor interface unit (PIU), a single-chip subsystem providing memory-interface, bus-interface, and additional support services for a commercial microprocessor within a fault-tolerant computer system. This system, the Fault-Tolerant Embedded Processor (FTEP), is targeted towards applications in avionics and space requiring extremely high levels of mission reliability, extended maintenance-free operation, or both. The need for high-quality design assurance in such applications is an undisputed fact, given the disastrous consequences that even a single design flaw can produce. Thus, the further development and application of formal methods to fault-tolerant systems is of critical importance as these systems see increasing use in modern society.
Techniques for modeling the reliability of fault-tolerant systems with the Markov state-space approach

NASA Technical Reports Server (NTRS)

Butler, Ricky W.; Johnson, Sally C.

1995-01-01

This paper presents a step-by-step tutorial of the methods and the tools that were used for the reliability analysis of fault-tolerant systems. The approach used in this paper is the Markov (or semi-Markov) state-space method. The paper is intended for design engineers with a basic understanding of computer architecture and fault tolerance, but little knowledge of reliability modeling. The representation of architectural features in mathematical models is emphasized. This paper does not present details of the mathematical solution of complex reliability models. Instead, it describes the use of several recently developed computer programs (SURE, ASSIST, STEM, and PAWS) that automate the generation and the solution of these models.
A Unified Fault-Tolerance Protocol

NASA Technical Reports Server (NTRS)

Miner, Paul; Geser, Alfons; Pike, Lee; Maddalon, Jeffrey

2004-01-01

Davies and Wakerly show that Byzantine fault tolerance can be achieved by a cascade of broadcasts and middle value select functions. We present an extension of the Davies and Wakerly protocol, the unified protocol, and its proof of correctness. We prove that it satisfies validity and agreement properties for communication of exact values. We then introduce bounded communication error into the model. Inexact communication is inherent for clock synchronization protocols. We prove that validity and agreement properties hold for inexact communication, and that exact communication is a special case. As a running example, we illustrate the unified protocol using the SPIDER family of fault-tolerant architectures. In particular we demonstrate that the SPIDER interactive consistency, distributed diagnosis, and clock synchronization protocols are instances of the unified protocol.
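The middle value select function named in this abstract can be illustrated with a short Python sketch. This is only the selection primitive, not the unified protocol or its broadcast cascade; the function name and the error handling are assumptions made for the example. The point it shows is that when a receiver collects 2f + 1 values and at most f of them are arbitrary, the median always lies within the range of the good values.

```python
def middle_value_select(values):
    """Return the median of an odd number of received values.

    With 2f + 1 inputs of which at most f are faulty, the median is bounded
    below and above by values from non-faulty sources.
    """
    if len(values) % 2 == 0:
        raise ValueError("middle value select expects an odd number of inputs")
    return sorted(values)[len(values) // 2]


# Hypothetical readings: three good sources near 10.0, two arbitrarily faulty ones.
received = [10.0, 10.2, 9.9, 1e9, -1e9]
selected = middle_value_select(received)
```

Here `selected` stays between the smallest and largest good reading even though two of the five inputs are wildly wrong, which is the validity property in the inexact-communication setting.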
Features and dimensions of the Hayward Fault Zone in the Strawberry and Blackberry Creek Area, Berkeley, California

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, P.L.

1995-03-01

This report presents an examination of the geometry of the Hayward fault adjacent to the Lawrence Berkeley Laboratory and University of California campuses in central Berkeley. The fault crosses inside the eastern border of the UC campus. Most subtle geomorphic (landform) expressions of the fault have been removed by development and by the natural processes of landsliding and erosion. Some clear expressions of the fault remain, however, and these are key to mapping the main trace through the campus area. In addition, original geomorphic evidence of the fault's location was recovered from large scale mapping of the site dating from 1873 to 1897. Before construction obscured and removed natural landforms, the fault was expressed by a linear, northwest-trending zone of fault-related geomorphic features. There existed well-defined and subtle stream offsets and beheaded channels, fault scarps, and a prominent "shutter ridge". To improve our confidence in fault locations interpreted from landforms, we referred to clear fault exposures revealed in trenching, revealed during the construction of the Foothill Housing Complex, and revealed along the length of the Lawson Adit mining tunnel. Also utilized were the locations of offset cultural features. At several locations across the study area, distress features in buildings and streets have been used to precisely locate the fault. Recently published mapping of the fault (Lienkaemper, 1992) was principally used for reference to evidence of the fault's location to the northwest and southeast of Lawrence Berkeley Laboratory.
2009 fault tolerance for extreme-scale computing workshop, Albuquerque, NM - March 19-20, 2009.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Katz, D. S.; Daly, J.; DeBardeleben, N.

2009-02-01

This is a report on the third in a series of petascale workshops co-sponsored by Blue Waters and TeraGrid to address challenges and opportunities for making effective use of emerging extreme-scale computing. This workshop was held to discuss fault tolerance on large systems for running large, possibly long-running applications. The main point of the workshop was to have systems people, middleware people (including fault-tolerance experts), and applications people talk about the issues and figure out what needs to be done, mostly at the middleware and application levels, to run such applications on the emerging petascale systems, without having faults cause large numbers of application failures. The workshop found that there is considerable interest in fault tolerance, resilience, and reliability of high-performance computing (HPC) systems in general, at all levels of HPC. The only way to recover from faults is through the use of some redundancy, either in space or in time. Redundancy in time, in the form of writing checkpoints to disk and restarting at the most recent checkpoint after a fault that causes an application to crash/halt, is the most common tool used in applications today, but there are questions about how long this can continue to be a good solution as systems and memories grow faster than I/O bandwidth to disk. There is interest in both modifications to this, such as checkpoints to memory, partial checkpoints, and message logging, and alternative ideas, such as in-memory recovery using residues. We believe that systematic exploration of these ideas holds the most promise for the scientific applications community. Fault tolerance has been an issue of discussion in the HPC community for at least the past 10 years; but much like other issues, the community has managed to put off addressing it during this period.
There is a growing recognition that as systems continue to grow to petascale and beyond, the field is approaching the point where we don't have any choice but to address this through R&D efforts.
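The checkpoint-and-restart pattern that this report describes as the most common form of redundancy in time can be sketched as follows. This is a minimal single-process illustration, not an HPC implementation: the function names, the `crash_at` parameter used to simulate a fault, and the use of `pickle` with an atomic file replace are all assumptions made for the example.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state to a temporary file, then atomically replace the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the most recent checkpoint, or the default state if none exists."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as handle:
        return pickle.load(handle)


def run(path, total_steps, checkpoint_every=10, crash_at=None):
    """Sum 0..total_steps-1, resuming from the latest checkpoint if one exists."""
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    while state["step"] < total_steps:
        if crash_at is not None and state["step"] == crash_at:
            raise RuntimeError("simulated fault")  # stands in for a crash/halt
        state["acc"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state["acc"]
```

After a simulated fault, calling `run` again with the same path repeats only the work done since the last checkpoint. The cost the report worries about is visible even here: every checkpoint is a full write of the state to disk.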
Critical fault patterns determination in fault-tolerant computer systems

NASA Technical Reports Server (NTRS)

McCluskey, E. J.; Losq, J.

1978-01-01

The method proposed tries to enumerate all the critical fault-patterns (successive occurrences of failures) without analyzing every single possible fault. The conditions for the system to be operating in a given mode can be expressed in terms of the static states. Thus, one can find all the system states that correspond to a given critical mode of operation. The next step consists in analyzing the fault-detection mechanisms, the diagnosis algorithm and the process of switch control. From them, one can find all the possible system configurations that can result from a failure occurrence. Thus, one can list all the characteristics, with respect to detection, diagnosis, and switch control, that failures must have to constitute critical fault-patterns. Such an enumeration of the critical fault-patterns can be directly used to evaluate the overall system tolerance to failures. Present research is focused on how to efficiently make use of these system-level characteristics to enumerate all the failures that verify these characteristics.
Fractional-order active fault-tolerant force-position controller design for the legged robots using saturated actuator with unknown bias and gain degradation

NASA Astrophysics Data System (ADS)

Farid, Yousef; Majd, Vahid Johari; Ehsani-Seresht, Abbas

2018-05-01

In this paper, a novel fault accommodation strategy is proposed for the legged robots subject to the actuator faults including actuation bias and effective gain degradation as well as the actuator saturation. First, the combined dynamics of two coupled subsystems consisting of the dynamics of the legs subsystem and the body subsystem are developed. Then, the interaction of the robot with the environment is formulated as the contact force optimization problem with equality and inequality constraints. The desired force is obtained by a dynamic model. A robust super twisting fault estimator is proposed to precisely estimate the defective torque amplitude of the faulty actuator in finite time. Defining a novel fractional sliding surface, a fractional nonsingular terminal sliding mode control law is developed. Moreover, by introducing a suitable auxiliary system and using its state vector in the designed controller, the proposed fault-tolerant control (FTC) scheme guarantees the finite-time stability of the closed-loop control system. The robustness and finite-time convergence of the proposed control law is established using the Lyapunov stability theory. Finally, numerical simulations are performed on a quadruped robot to demonstrate the stable walking of the robot with and without actuator faults, and actuator saturation constraints, and the results are compared to results with an integer order fault-tolerant controller.
Dynamic rupture scenarios from Sumatra to Iceland - High-resolution earthquake source physics on natural fault systems

NASA Astrophysics Data System (ADS)

Gabriel, A. A.; Madden, E. H.; Ulrich, T.; Wollherr, S.

2016-12-01

Capturing the observed complexity of earthquake sources in dynamic rupture simulations may require: non-linear fault friction, thermal and fluid effects, heterogeneous fault stress and strength initial conditions, fault curvature and roughness, on- and off-fault non-elastic failure. All of these factors have been independently shown to alter dynamic rupture behavior and thus possibly influence the degree of realism attainable via simulated ground motions. In this presentation we will show examples of high-resolution earthquake scenarios, e.g. based on the 2004 Sumatra-Andaman Earthquake and a potential rupture of the Husavik-Flatey fault system in Northern Iceland. The simulations combine a multitude of representations of source complexity at the necessary spatio-temporal resolution enabled by excellent scalability on modern HPC systems. Such simulations allow an analysis of the dominant factors impacting earthquake source physics and ground motions given distinct tectonic settings or distinct focuses of seismic hazard assessment. Across all simulations, we find that fault geometry, together with the regional background stress state, provides a first-order influence on source dynamics and the emanated seismic wave field. The dynamic rupture models are performed with SeisSol, a software package based on an ADER-Discontinuous Galerkin scheme for solving the spontaneous dynamic earthquake rupture problem with high-order accuracy in space and time. Use of unstructured tetrahedral meshes allows for a realistic representation of the non-planar fault geometry, subsurface structure and bathymetry. The results presented highlight the fact that modern numerical methods are essential to further our understanding of earthquake source physics and complement both physics-based ground motion research and empirical approaches in seismic hazard analysis.
Fault-tolerance in Two-dimensional Topological Systems

NASA Astrophysics Data System (ADS)

Anderson, Jonas T.

This thesis is a collection of ideas with the general goal of building, at least in the abstract, a local fault-tolerant quantum computer. The connection between quantum information and topology has proven to be an active area of research in several fields. The introduction of the toric code by Alexei Kitaev demonstrated the usefulness of topology for quantum memory and quantum computation. Many quantum codes used for quantum memory are modeled by spin systems on a lattice, with operators that extract syndrome information placed on vertices or faces of the lattice. It is natural to wonder whether the useful codes in such systems can be classified. This thesis presents work that leverages ideas from topology and graph theory to explore the space of such codes. Homological stabilizer codes are introduced and it is shown that, under a set of reasonable assumptions, any qubit homological stabilizer code is equivalent to either a toric code or a color code. Additionally, the toric code and the color code correspond to distinct classes of graphs. Many systems have been proposed as candidate quantum computers. It is very desirable to design quantum computing architectures with two-dimensional layouts and low complexity in parity-checking circuitry. Kitaev's surface codes provided the first example of codes satisfying this property. They provided a new route to fault tolerance with more modest overheads and thresholds approaching 1%. The recently discovered color codes share many properties with the surface codes, such as the ability to perform syndrome extraction locally in two dimensions. Some families of color codes admit a transversal implementation of the entire Clifford group. This work investigates color codes on the 4.8.8 lattice known as triangular codes. I develop a fault-tolerant error-correction strategy for these codes in which repeated syndrome measurements on this lattice generate a three-dimensional space-time combinatorial structure. 
I then develop an integer program that analyzes this structure and determines the most likely set of errors consistent with the observed syndrome values. I implement this integer program to find the threshold for depolarizing noise on small versions of these triangular codes. Because the threshold for magic-state distillation is likely to be higher than this value and because logical CNOT gates can be performed by code deformation in a single block instead of between pairs of blocks, the threshold for fault-tolerant quantum memory for these codes is also the threshold for fault-tolerant quantum computation with them. Since the advent of a threshold theorem for quantum computers much has been improved upon. Thresholds have increased, architectures have become more local, and gate sets have been simplified. The overhead for magic-state distillation has been studied, but not nearly to the extent of the aforementioned topics. A method for greatly reducing this overhead, known as reusable magic states, is studied here. While examples of reusable magic states exist for Clifford gates, I give strong reasons to believe they do not exist for non-Clifford gates.
Using certification trails to achieve software fault tolerance

NASA Technical Reports Server (NTRS)

Sullivan, Gregory F.; Masson, Gerald M.

1993-01-01

A conceptually novel and powerful technique to achieve fault tolerance in hardware and software systems is introduced. When used for software fault tolerance, this new technique uses time and software redundancy and can be outlined as follows. In the initial phase, a program is run to solve a problem and store the result. In addition, this program leaves behind a trail of data called a certification trail. In the second phase, another program is run which solves the original problem again. This program, however, has access to the certification trail left by the first program. Because of the availability of the certification trail, the second phase can be performed by a less complex program and can execute more quickly. In the final phase, the two results are compared and, if they agree, they are accepted as correct; otherwise an error is indicated. An essential aspect of this approach is that the second program must always generate either an error indication or a correct output even when the certification trail it receives from the first program is incorrect. The certification trail approach to fault tolerance was formalized and it was illustrated by applying it to the fundamental problem of finding a minimum spanning tree. Cases in which the second phase can be run concurrently with the first and act as a monitor are discussed. The certification trail approach was compared to other approaches to fault tolerance. Because of space limitations we have omitted examples of our technique applied to the Huffman tree and convex hull problems. These can be found in the full version of this paper.
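The three phases outlined in this abstract can be made concrete with a small Python sketch. For brevity it uses sorting rather than the minimum spanning tree problem treated in the paper, and all names (`phase_one`, `phase_two`, `certified_sort`) are hypothetical. The first program sorts and leaves behind the sorting permutation as its trail; the second program rebuilds the output in one linear pass using that trail, and raises an error whenever the trail is not a valid permutation or does not produce sorted output.

```python
def phase_one(items):
    """Sort the input and leave behind a trail: the index permutation that sorts it."""
    trail = sorted(range(len(items)), key=lambda i: items[i])
    return [items[i] for i in trail], trail


def phase_two(items, trail):
    """Rebuild the sorted output in linear time, checking the trail along the way.

    Raises ValueError instead of returning a wrong answer when the trail is bad.
    """
    n = len(items)
    if len(trail) != n:
        raise ValueError("trail has the wrong length")
    seen = [False] * n
    result = []
    for i in trail:
        if not isinstance(i, int) or not 0 <= i < n or seen[i]:
            raise ValueError("trail is not a permutation of the input indices")
        seen[i] = True
        if result and items[i] < result[-1]:
            raise ValueError("trail does not yield sorted order")
        result.append(items[i])
    return result


def certified_sort(items):
    """Final phase: accept the result only if both programs agree."""
    first, trail = phase_one(items)
    second = phase_two(items, trail)
    if first != second:
        raise ValueError("results disagree: error indicated")
    return first
```

The second phase never sorts anything itself, which is why it can be simpler and faster than the first, yet a corrupted trail can only lead to an error indication, never to an accepted wrong output.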

Aircraft applications of fault detection and isolation techniques

NASA Astrophysics Data System (ADS)

Marcos Esteban, Andres

In this thesis the problems of fault detection & isolation and fault tolerant systems are studied from the perspective of LTI frequency-domain, model-based techniques. Emphasis is placed on the applicability of these LTI techniques to nonlinear models, especially to aerospace systems. Two applications of H∞ LTI fault diagnosis are given using an open-loop (no controller) design approach: one for the longitudinal motion of a Boeing 747-100/200 aircraft, the other for a turbofan jet engine. An algorithm formalizing a robust identification approach based on model validation ideas is also given and applied to the previous jet engine. A general linear fractional transformation formulation is given in terms of the Youla and Dual Youla parameterizations for the integrated (control and diagnosis filter) approach. This formulation provides better insight into the trade-off between the control and the diagnosis objectives. It also provides the basic groundwork towards the development of nested schemes for the integrated approach. These nested structures allow iterative improvements on the control/filter Youla parameters based on successive identification of the system uncertainty (as given by the Dual Youla parameter). The thesis concludes with an application of H∞ LTI techniques to the integrated design for the longitudinal motion of the previous Boeing 747-100/200 model.
Ada 9X Project Revision Request Report. Supplement 1

DTIC Science & Technology

1990-01-01

Non-portable use of operating system primitives or of Ada run time system internals. POSSIBLE SOLUTIONS: Mandate that compilers recognize tasks that...complex than a simple operating system file, the compiler vendor must provide routines to manipulate it (create, copy, move etc.) as a single entity...system, to support fault tolerance, load sharing, change of system operating mode etc. It is highly desirable that such important software be written in
An experimental evaluation of software redundancy as a strategy for improving reliability

NASA Technical Reports Server (NTRS)

Eckhardt, Dave E., Jr.; Caglayan, Alper K.; Knight, John C.; Lee, Larry D.; Mcallister, David F.; Vouk, Mladen A.; Kelly, John P. J.

1990-01-01

The strategy of using multiple versions of independently developed software as a means to tolerate residual software design faults is suggested by the success of hardware redundancy for tolerating hardware failures. Although, as generally accepted, the independence of hardware failures resulting from physical wearout can lead to substantial increases in reliability for redundant hardware structures, a similar conclusion is not immediate for software. The degree to which design faults are manifested as independent failures determines the effectiveness of redundancy as a method for improving software reliability. Interest in multi-version software centers on whether it provides an adequate measure of increased reliability to warrant its use in critical applications. The effectiveness of multi-version software is studied by comparing estimates of the failure probabilities of these systems with the failure probabilities of single versions. The estimates are obtained under a model of dependent failures and compared with estimates obtained when failures are assumed to be independent. The experimental results are based on twenty versions of an aerospace application developed and certified by sixty programmers from four universities. Descriptions of the application, development and certification processes, and operational evaluation are given together with an analysis of the twenty versions.
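For reference, the independence assumption that the study compares against can be written down directly. The sketch below is a hedged illustration, not the model used in the report: it assumes a majority-voted system of `n` versions, each failing on a given input with the same probability `p`, and the function name `majority_failure_probability` is hypothetical.

```python
from math import comb


def majority_failure_probability(n: int, p: float) -> float:
    """Probability that a majority of n versions fail on one input,
    assuming failures are independent and equally likely (probability p).

    This is the textbook binomial tail; it is exactly the assumption
    whose validity the experiment calls into question.
    """
    if n < 1 or not 0.0 <= p <= 1.0:
        raise ValueError("need n >= 1 and 0 <= p <= 1")
    majority = n // 2 + 1  # smallest number of failures that outvotes the rest
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(majority, n + 1))


if __name__ == "__main__":
    # Single version versus a 3-version system under independence.
    print(majority_failure_probability(1, 0.01))
    print(majority_failure_probability(3, 0.01))
```

If design faults cause versions to fail on the same inputs, the true system failure probability can be much closer to the single-version figure than this formula suggests.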
A survey of provably correct fault-tolerant clock synchronization techniques

NASA Technical Reports Server (NTRS)

Butler, Ricky W.

1988-01-01

Six provably correct fault-tolerant clock synchronization algorithms are examined. These algorithms are all presented in the same notation to permit easier comprehension and comparison. The advantages and disadvantages of the different techniques are examined and issues related to the implementation of these algorithms are discussed. The paper argues for the use of such algorithms in life-critical applications.
Fault Tolerance for VLSI Multicomputers

DTIC Science & Technology

1985-08-01

that consists of hundreds or thousands of VLSI computation nodes interconnected by dedicated links. Some important applications of high-end computers...technology, and intended applications. A proposed fault tolerance scheme combines hardware that performs error detection and system-level protocols for...order to recover from the error and resume correct operation, a valid system state must be restored. A low-overhead, application-transparent error
Fault-Tolerant Computing: An Overview

DTIC Science & Technology

1991-06-01

Addison Wesley:, Reading, MA) 1984. [8] J. Wakerly, Error Detecting Codes, Self-Checking Circuits and Applications, (Elsevier North Holland, Inc.- New York...applicable to bit-sliced organizations of hardware. In the first time step, the normal computation is performed on the operands and the results...for error detection and fault tolerance in parallel processor systems while performing specific computation-intensive applications [111. Contrary to
COTS-Based Fault Tolerance in Deep Space: Qualitative and Quantitative Analyses of a Bus Network Architecture

NASA Technical Reports Server (NTRS)

Tai, Ann T.; Chau, Savio N.; Alkalai, Leon

2000-01-01

Using COTS products, standards and intellectual properties (IPs) for all the system and component interfaces is a crucial step toward significant reduction of both system cost and development cost as the COTS interfaces enable other COTS products and IPs to be readily accommodated by the target system architecture. With respect to the long-term survivable systems for deep-space missions, the major challenge for us is, under stringent power and mass constraints, to achieve ultra-high reliability of the system comprising COTS products and standards that are not developed for mission-critical applications. The spirit of our solution is to exploit the pertinent standard features of a COTS product to circumvent its shortcomings, though these standard features may not be originally designed for highly reliable systems. In this paper, we discuss our experiences and findings on the design of an IEEE 1394 compliant fault-tolerant COTS-based bus architecture. We first derive and qualitatively analyze a "stacktree topology" that not only complies with IEEE 1394 but also enables the implementation of a fault-tolerant bus architecture without node redundancy. We then present a quantitative evaluation that demonstrates significant reliability improvement from the COTS-based fault tolerance.
Design and Implementation of Replicated Object Layer

NASA Technical Reports Server (NTRS)

Koka, Sudhir

1996-01-01

One of the widely used techniques for construction of fault tolerant applications is the replication of resources so that if one copy fails sufficient copies may still remain operational to allow the application to continue to function. This thesis involves the design and implementation of an object oriented framework for replicating data on multiple sites and across different platforms. Our approach, called the Replicated Object Layer (ROL), provides a mechanism for consistent replication of data over dynamic networks. ROL uses the Reliable Multicast Protocol (RMP) as a communication protocol that provides for reliable delivery, serialization and fault tolerance. Besides providing type registration, this layer facilitates distributed atomic transactions on replicated data. A novel algorithm called the RMP Commit Protocol, which commits transactions efficiently in a reliable multicast environment, is presented. ROL provides recovery procedures to ensure that site and communication failures do not corrupt persistent data, and make the system fault tolerant to network partitions. ROL will facilitate building distributed fault tolerant applications by performing the burdensome details of replica consistency operations, and making it completely transparent to the application. Replicated databases are a major class of applications which could be built on top of ROL.
Fault-Tolerant Algorithms for Connectivity Restoration in Wireless Sensor Networks.

PubMed

Zeng, Yali; Xu, Li; Chen, Zhide

2015-12-22

As wireless sensor networks (WSNs) are often deployed in hostile environments, their nodes are prone to large-scale failures that prevent the network from working normally. In this case, an effective restoration scheme is needed to restore the faulty network in a timely manner. Most existing restoration schemes focus on the number of deployed nodes or on fault tolerance alone, but fail to take into account the fact that network coverage and topology quality are also important to a network. To address this issue, we present two algorithms named Full 2-Connectivity Restoration Algorithm (F2CRA) and Partial 3-Connectivity Restoration Algorithm (P3CRA), which restore a faulty WSN in different aspects. F2CRA constructs the fan-shaped topology structure to reduce the number of deployed nodes, while P3CRA constructs the dual-ring topology structure to improve the fault tolerance of the network. F2CRA is suitable when the restoration cost is given priority, and P3CRA is suitable when the network quality is considered first. Compared with other algorithms, these two algorithms ensure that the network has stronger fault tolerance, a larger coverage area and a better balanced load after the restoration.
The mechanics of fault-bend folding and tear-fault systems in the Niger Delta

NASA Astrophysics Data System (ADS)

Benesh, Nathan Philip

This dissertation investigates the mechanics of fault-bend folding using the discrete element method (DEM) and explores the nature of tear-fault systems in the deep-water Niger Delta fold-and-thrust belt. In Chapter 1, we employ the DEM to investigate the development of growth structures in anticlinal fault-bend folds. This work was inspired by observations that growth strata in active folds show a pronounced upward decrease in bed dip, in contrast to traditional kinematic fault-bend fold models. Our analysis shows that the modeled folds grow largely by parallel folding as specified by the kinematic theory; however, the process of folding over a broad axial surface zone yields a component of fold growth by limb rotation that is consistent with the patterns observed in natural folds. This result has important implications for how growth structures can be used to constrain slip and paleo-earthquake ages on active blind-thrust faults. In Chapter 2, we expand our DEM study to investigate the development of a wider range of fault-bend folds. We examine the influence of mechanical stratigraphy and quantitatively compare our models with the relationships between fold and fault shape prescribed by the kinematic theory. While the synclinal fault-bend models closely match the kinematic theory, the modeled anticlinal fault-bend folds show robust behavior that is distinct from the kinematic theory. Specifically, we observe that modeled structures maintain a linear relationship between fold shape (gamma) and fault-horizon cutoff angle (theta), rather than expressing the non-linear relationship with two distinct modes of anticlinal folding that is prescribed by the kinematic theory. These observations lead to a revised quantitative relationship for fault-bend folds that can serve as a useful interpretation tool. Finally, in Chapter 3, we examine the 3D relationships of tear- and thrust-fault systems in the western, deep-water Niger Delta. Using 3D seismic reflection data and new map-based structural restoration techniques, we find that the tear faults have distinct displacement patterns that distinguish them from conventional strike-slip faults and reflect their roles in accommodating displacement gradients within the fold-and-thrust belt.
Sensor fault-tolerant control for gear-shifting engaging process of automated manual transmission

NASA Astrophysics Data System (ADS)

Li, Liang; He, Kai; Wang, Xiangyu; Liu, Yahui

2018-01-01

The angular displacement sensor on the actuator of an automated manual transmission (AMT) is prone to faults, and a sensor fault disturbs normal control, which affects the entire gear-shifting process of the AMT and results in poor ride comfort. In order to solve this problem, this paper proposes a method of fault-tolerant control for the AMT gear-shifting engaging process. By using the measured current of the actuator motor and the angular displacement of the actuator, the gear-shifting engaging load torque table is built and updated before the occurrence of the sensor fault. Meanwhile, the residual between the estimated and measured angular displacements is used to detect the sensor fault. Once the residual exceeds a determined fault threshold, the sensor fault is detected. Then, switch control is triggered, and the current observer and load torque table estimate the actual gear-shifting position, which replaces the measured one so that control of the gear-shifting process can continue. Numerical and experimental tests are carried out to evaluate the reliability and feasibility of the proposed methods, and the results show that the performance of estimation and control is satisfactory.
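The residual test and switch-over described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the estimate is taken as given (the paper's current observer and load torque table are not reproduced), the function name `select_position` and the threshold value are hypothetical, and latching the fault once it is detected is a design choice of this sketch rather than something the abstract states.

```python
from typing import Tuple


def select_position(measured: float, estimated: float, threshold: float,
                    fault_latched: bool = False) -> Tuple[float, bool]:
    """Choose which angular displacement the controller should use.

    Returns (position, fault_latched). Once the residual has exceeded
    the threshold, the estimate is used from then on (assumed latching).
    """
    residual = abs(estimated - measured)
    if fault_latched or residual > threshold:
        return estimated, True   # sensor declared faulty: use the estimate
    return measured, False       # sensor trusted: use the measurement


if __name__ == "__main__":
    latched = False
    # Hypothetical (measured, estimated) samples; the sensor sticks at 10.0.
    for meas, est in [(9.8, 10.0), (10.0, 10.2), (10.0, 12.0), (10.0, 13.5)]:
        pos, latched = select_position(meas, est, threshold=0.5,
                                       fault_latched=latched)
        print(pos, latched)
```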
Development and evaluation of a Fault-Tolerant Multiprocessor (FTMP) computer. Volume 3: FTMP test and evaluation

NASA Technical Reports Server (NTRS)

Lala, J. H.; Smith, T. B., III

1983-01-01

The experimental test and evaluation of the Fault-Tolerant Multiprocessor (FTMP) is described. Major objectives of this exercise include expanding validation envelope, building confidence in the system, revealing any weaknesses in the architectural concepts and in their execution in hardware and software, and in general, stressing the hardware and software. To this end, pin-level faults were injected into one LRU of the FTMP and the FTMP response was measured in terms of fault detection, isolation, and recovery times. A total of 21,055 stuck-at-0, stuck-at-1 and invert-signal faults were injected in the CPU, memory, bus interface circuits, Bus Guardian Units, and voters and error latches. Of these, 17,418 were detected. At least 80 percent of undetected faults are estimated to be on unused pins. The multiprocessor identified all detected faults correctly and recovered successfully in each case. Total recovery time for all faults averaged a little over one second. This can be reduced to half a second by including appropriate self-tests.
A formally verified algorithm for interactive consistency under a hybrid fault model

NASA Technical Reports Server (NTRS)

Lincoln, Patrick; Rushby, John

1993-01-01

Consistent distribution of single-source data to replicated computing channels is a fundamental problem in fault-tolerant system design. The 'Oral Messages' (OM) algorithm solves this problem of Interactive Consistency (Byzantine Agreement) assuming that all faults are worst-case. Thambidurai and Park introduced a 'hybrid' fault model that distinguished three fault modes: asymmetric (Byzantine), symmetric, and benign; they also exhibited, along with an informal 'proof of correctness', a modified version of OM. Unfortunately, their algorithm is flawed. The discipline of mechanically checked formal verification eventually enabled us to develop a correct algorithm for Interactive Consistency under the hybrid fault model. This algorithm withstands $a$ asymmetric, $s$ symmetric, and $b$ benign faults simultaneously, using $m+1$ rounds, provided $n > 2a + 2s + b + m$ and $m \geq a$. We present this algorithm, discuss its subtle points, and describe its formal specification and verification in PVS. We argue that formal verification systems such as PVS are now sufficiently effective that their application to fault-tolerance algorithms should be considered routine.
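The resilience condition stated in this abstract (more than 2a + 2s + b + m channels, with m at least a) is easy to evaluate for a given configuration. The Python sketch below only checks that inequality; it does not implement the algorithm, and the function names `tolerates` and `min_channels` are hypothetical.

```python
from typing import Optional


def tolerates(n: int, a: int, s: int, b: int, m: int) -> bool:
    """True if n channels satisfy n > 2a + 2s + b + m and m >= a,
    for a asymmetric, s symmetric and b benign faults with m + 1 rounds."""
    if min(n, a, s, b, m) < 0:
        raise ValueError("all arguments must be non-negative")
    return n > 2 * a + 2 * s + b + m and m >= a


def min_channels(a: int, s: int, b: int, m: Optional[int] = None) -> int:
    """Smallest n satisfying the condition; m defaults to a, its minimum."""
    if m is None:
        m = a
    if min(a, s, b) < 0 or m < a:
        raise ValueError("need non-negative fault counts and m >= a")
    return 2 * a + 2 * s + b + m + 1


if __name__ == "__main__":
    print(min_channels(a=1, s=0, b=0))   # purely Byzantine case
    print(min_channels(a=1, s=1, b=1))   # mixed fault modes
```

With only asymmetric faults and m = a, the condition reduces to n > 3a, the familiar bound for Byzantine agreement with oral messages; counting symmetric and benign faults separately lets the same number of channels cover more faults of the milder kinds.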
Cluster-state quantum computing enhanced by high-fidelity generalized measurements.

PubMed

Biggerstaff, D N; Kaltenbaek, R; Hamel, D R; Weihs, G; Rudolph, T; Resch, K J

2009-12-11

We introduce and implement a technique to extend the quantum computational power of cluster states by replacing some projective measurements with generalized quantum measurements (POVMs). As an experimental demonstration we fully realize an arbitrary three-qubit cluster computation by implementing a tunable linear-optical POVM, as well as fast active feedforward, on a two-qubit photonic cluster state. Over 206 different computations, the average output fidelity is 0.9832 ± 0.0002; furthermore the error contribution from our POVM device and feedforward is only of O(10^-3), less than some recent thresholds for fault-tolerant cluster computing.
Integrated Fault Diagnosis Algorithm for Motor Sensors of In-Wheel Independent Drive Electric Vehicles

PubMed Central

Jeon, Namju; Lee, Hyeongcheol

2016-01-01

An integrated fault-diagnosis algorithm for a motor sensor of in-wheel independent drive electric vehicles is presented. This paper proposes a method that integrates the high- and low-level fault diagnoses to improve the robustness and performance of the system. For the high-level fault diagnosis of vehicle dynamics, a planar two-track non-linear model is first selected, and the longitudinal and lateral forces are calculated. To ensure redundancy of the system, correlation between the sensor and residual in the vehicle dynamics is analyzed to detect and separate the fault of the drive motor system of each wheel. To diagnose the motor system for low-level faults, the state equation of an interior permanent magnet synchronous motor is developed, and a parity equation is used to diagnose the fault of the electric current and position sensors. The validity of the high-level fault-diagnosis algorithm is verified using Carsim and Matlab/Simulink co-simulation. The low-level fault diagnosis is verified through Matlab/Simulink simulation and experiments. Finally, according to the residuals of the high- and low-level fault diagnoses, fault-detection flags are defined. On the basis of this information, an integrated fault-diagnosis strategy is proposed. PMID:27973431
A design fix to supervisory control for fault-tolerant scheduling of real-time multiprocessor systems with aperiodic tasks

NASA Astrophysics Data System (ADS)

Devaraj, Rajesh; Sarkar, Arnab; Biswas, Santosh

2015-11-01

In the article 'Supervisory control for fault-tolerant scheduling of real-time multiprocessor systems with aperiodic tasks', Park and Cho presented a systematic way of computing a largest fault-tolerant and schedulable language that provides information on whether the scheduler (i.e., supervisor) should accept or reject a newly arrived aperiodic task. The computation of such a language is mainly dependent on the task execution model presented in their paper. However, the task execution model is unable to capture the situation when the fault of a processor occurs even before the task has arrived. Consequently, under a task execution model that does not capture this fact, a task may be assigned for execution on a faulty processor. This problem has been illustrated with an appropriate example. Then, the task execution model of Park and Cho has been modified to strengthen the requirement that none of the tasks are assigned for execution on a faulty processor.
Design and analysis of new fault-tolerant permanent magnet motors for four-wheel-driving electric vehicles

NASA Astrophysics Data System (ADS)

Liu, Guohai; Gong, Wensheng; Chen, Qian; Jian, Linni; Shen, Yue; Zhao, Wenxiang

2012-04-01

In this paper, a novel in-wheel permanent-magnet (PM) motor for four-wheel-driving electrical vehicles is proposed. It adopts an outer-rotor topology, which can help generate a large drive torque, in order to achieve prominent dynamic performance of the vehicle. Moreover, by adopting single-layer concentrated-windings, fault-tolerant teeth, and the optimal combination of slot and pole numbers, the proposed motor inherently offers negligible electromagnetic coupling between different phase windings, hence, it possesses a fault-tolerant characteristic. Meanwhile, the phase back electromotive force waveforms can be designed to be sinusoidal by employing PMs with a trapezoidal shape, eccentric armature teeth, and unequal tooth widths. The electromagnetic performance is comprehensively investigated and the optimal design is conducted by using the finite-element method.
Noise Threshold and Resource Cost of Fault-Tolerant Quantum Computing with Majorana Fermions in Hybrid Systems.

PubMed

Li, Ying

2016-09-16

Fault-tolerant quantum computing in systems composed of both Majorana fermions and topologically unprotected quantum systems, e.g., superconducting circuits or quantum dots, is studied in this Letter. Errors caused by topologically unprotected quantum systems need to be corrected with error-correction schemes, for instance, the surface code. We find that the error-correction performance of such a hybrid topological quantum computer is not superior to a normal quantum computer unless the topological charge of Majorana fermions is insusceptible to noise. If errors changing the topological charge are rare, the fault-tolerance threshold is much higher than the threshold of a normal quantum computer and a surface-code logical qubit could be encoded in only tens of topological qubits instead of about 1,000 normal qubits.
Economic modeling of fault tolerant flight control systems in commercial applications

NASA Technical Reports Server (NTRS)

Finelli, G. B.

1982-01-01

This paper describes the current development of a comprehensive model which will supply the assessment and analysis capability to investigate the economic viability of Fault Tolerant Flight Control Systems (FTFCS) for commercial aircraft of the 1990's and beyond. An introduction to the unique attributes of fault tolerance and how they will influence aircraft operations and consequent airline costs and benefits is presented. Specific modeling issues and elements necessary for accurate assessment of all costs affected by ownership and operation of FTFCS are delineated. Trade-off factors are presented, aimed at exposing economically optimal realizations of system implementations, resource allocation, and operating policies. A trade-off example is furnished to graphically display some of the analysis capabilities of the comprehensive simulation model now being developed.
Problems related to the integration of fault tolerant aircraft electronic systems

NASA Technical Reports Server (NTRS)

Bannister, J. A.; Adlakha, V.; Triyedi, K.; Alspaugh, T. A., Jr.

1982-01-01

Problems related to the design of the hardware for an integrated aircraft electronic system are considered. Taxonomies of concurrent systems are reviewed and a new taxonomy is proposed. An informal methodology intended to identify feasible regions of the taxonomic design space is described. Specific tools are recommended for use in the methodology. Based on the methodology, a preliminary strawman integrated fault tolerant aircraft electronic system is proposed. Next, problems related to the programming and control of integrated aircraft electronic systems are discussed. Issues of system resource management, including the scheduling and allocation of real time periodic tasks in a multiprocessor environment, are treated in detail. The role of software design in integrated fault tolerant aircraft electronic systems is discussed. Conclusions and recommendations for further work are included.

The Design of Fault Tolerant Quantum Dot Cellular Automata Based Logic

NASA Technical Reports Server (NTRS)

Armstrong, C. Duane; Humphreys, William M.; Fijany, Amir

2002-01-01

As transistor geometries are reduced, quantum effects begin to dominate device performance. At some point, transistors cease to have the properties that make them useful computational components. New computing elements must be developed in order to keep pace with Moore's Law. Quantum dot cellular automata (QCA) represent an alternative paradigm to transistor-based logic. QCA architectures that are robust to manufacturing tolerances and defects must be developed. We are developing software that allows the exploration of fault tolerant QCA gate architectures by automating the specification, simulation, analysis and documentation processes.
Application of power spectrum, cepstrum, higher order spectrum and neural network analyses for induction motor fault diagnosis

NASA Astrophysics Data System (ADS)

Liang, B.; Iwnicki, S. D.; Zhao, Y.

2013-08-01

The power spectrum is defined as the square of the magnitude of the Fourier transform (FT) of a signal. The advantage of FT analysis is that it allows the decomposition of a signal into individual periodic frequency components and establishes the relative intensity of each component. It is the most commonly used signal processing technique today. If the same principle is applied to the detection of periodic components in a Fourier spectrum, the process is called cepstrum analysis. Cepstrum analysis is a very useful tool for detecting families of harmonics with uniform spacing or the families of sidebands commonly found in gearbox, bearing and engine vibration fault spectra. Higher order spectra (HOS) (also known as polyspectra) consist of higher order moments of spectra, which are able to detect non-linear interactions between frequency components. For HOS, the most commonly used is the bispectrum. The bispectrum is the third-order frequency domain measure, which contains information that standard power spectral analysis techniques cannot provide. It is well known that neural networks can represent complex non-linear relationships, and therefore they are extremely useful for fault identification and classification. This paper presents an application of power spectrum, cepstrum, bispectrum and neural network analyses for fault pattern extraction of induction motors. The potential for using the power spectrum, cepstrum, bispectrum and neural network as a means for differentiating between healthy and faulty induction motor operation is examined. A series of experiments is carried out and the advantages and disadvantages of each method are discussed. It has been found that a combination of power spectrum, cepstrum and bispectrum plus neural network analyses could be a very useful tool for condition monitoring and fault diagnosis of induction motors.
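The first two definitions in this abstract can be made concrete with a short, dependency-free Python sketch. It assumes a naive O(n²) discrete Fourier transform (adequate only for short signals) and takes the cepstrum to be the inverse transform of the log power spectrum, which is one common convention; the function names and the `floor` parameter are assumptions of this example, not taken from the paper.

```python
import cmath
import math
from typing import List, Sequence


def dft(x: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive discrete Fourier transform; the inverse includes the 1/n factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def power_spectrum(x: Sequence[float]) -> List[float]:
    """Squared magnitude of the Fourier transform of the signal."""
    return [abs(c) ** 2 for c in dft(x)]


def cepstrum(x: Sequence[float], floor: float = 1e-12) -> List[float]:
    """Inverse transform of the log power spectrum (real part).

    `floor` guards against log(0) in empty frequency bins; its value
    is an arbitrary choice for this sketch.
    """
    log_ps = [math.log(max(p, floor)) for p in power_spectrum(x)]
    return [c.real for c in dft(log_ps, inverse=True)]


if __name__ == "__main__":
    n = 32
    tone = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
    ps = power_spectrum(tone)
    print(max(range(n // 2 + 1), key=lambda k: ps[k]))   # dominant frequency bin

    # An impulse plus a half-amplitude echo 5 samples later puts a
    # periodic ripple in the spectrum, which the cepstrum turns into
    # a single peak at the echo delay.
    echo = [0.0] * n
    echo[0], echo[5] = 1.0, 0.5
    c = cepstrum(echo)
    print(max(range(1, n // 2), key=lambda q: c[q]))     # dominant quefrency
```

The same mechanism is what makes the cepstrum useful for the uniformly spaced harmonic and sideband families mentioned in the abstract: regular spacing in the spectrum collapses to a peak at the corresponding quefrency.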
Non-double-couple microearthquakes at Long Valley caldera, California, provide evidence for hydraulic fracturing

USGS Publications Warehouse

Foulger, G.R.; Julian, B.R.; Hill, D.P.; Pitt, A.M.; Malin, P.E.; Shalev, E.

2004-01-01

Most of 26 small (0.4 ≲ M ≲ 3.1) microearthquakes at Long Valley caldera in mid-1997, analyzed using data from a dense temporary network of 69 digital three-component seismometers, have significantly non-double-couple focal mechanisms, inconsistent with simple shear faulting. We determined their mechanisms by inverting P- and S-wave polarities and amplitude ratios using linear-programming methods, and tracing rays through a three-dimensional Earth model derived using tomography. More than 80% of the mechanisms have positive (volume increase) isotropic components and most have compensated linear-vector dipole components with outward-directed major dipoles. The simplest interpretation of these mechanisms is combined shear and extensional faulting with a volume-compensating process, such as rapid flow of water, steam, or CO2 into opening tensile cracks. Source orientations of earthquakes in the south moat suggest extensional faulting on ESE-striking subvertical planes, an orientation consistent with planes defined by earthquake hypocenters. The focal mechanisms show that clearly defined hypocentral planes in different locations result from different source processes. One such plane in the eastern south moat is consistent with extensional faulting, while one near Casa Diablo Hot Springs reflects en echelon right-lateral shear faulting. Source orientations at Mammoth Mountain vary systematically with location, indicating that the volcano influences the local stress field. Events in a 'spasmodic burst' at Mammoth Mountain have practically identical mechanisms that indicate nearly pure compensated tensile failure and high fluid mobility. Five earthquakes had mechanisms involving small volume decreases, but these may not be significant. No mechanisms have volumetric moment fractions larger than that of a force dipole, but the reason for this fact is unknown. Published by Elsevier B.V.
Multiplex Networks of Cortical and Hippocampal Neurons Revealed at Different Timescales

PubMed Central

Timme, Nicholas; Ito, Shinya; Myroshnychenko, Maxym; Yeh, Fang-Chin; Hiolski, Emma; Hottowy, Pawel; Beggs, John M.

2014-01-01

Recent studies have emphasized the importance of multiplex networks – interdependent networks with shared nodes and different types of connections – in systems primarily outside of neuroscience. Though the multiplex properties of networks are frequently not considered, most networks are actually multiplex networks and the multiplex specific features of networks can greatly affect network behavior (e.g. fault tolerance). Thus, the study of networks of neurons could potentially be greatly enhanced using a multiplex perspective. Given the wide range of temporally dependent rhythms and phenomena present in neural systems, we chose to examine multiplex networks of individual neurons with time scale dependent connections. To study these networks, we used transfer entropy – an information theoretic quantity that can be used to measure linear and nonlinear interactions – to systematically measure the connectivity between individual neurons at different time scales in cortical and hippocampal slice cultures. We recorded the spiking activity of almost 12,000 neurons across 60 tissue samples using a 512-electrode array with 60 micrometer inter-electrode spacing and 50 microsecond temporal resolution. To the best of our knowledge, this preparation and recording method represents a superior combination of number of recorded neurons and temporal and spatial recording resolutions to any currently available in vivo system. We found that highly connected neurons (“hubs”) were localized to certain time scales, which, we hypothesize, increases the fault tolerance of the network. Conversely, a large proportion of non-hub neurons were not localized to certain time scales. In addition, we found that long and short time scale connectivity was uncorrelated. Finally, we found that long time scale networks were significantly less modular and more disassortative than short time scale networks in both tissue types. 
As far as we are aware, this analysis represents the first systematic study of temporally dependent multiplex networks among individual neurons. PMID:25536059
Neural Networks and other Techniques for Fault Identification and Isolation of Aircraft Systems

NASA Technical Reports Server (NTRS)

Innocenti, M.; Napolitano, M.

2003-01-01

Fault identification, isolation, and accommodation have become critical issues in the overall performance of advanced aircraft systems. Neural Networks have been shown to be a very attractive alternative to classic adaptation methods for identification and control of non-linear dynamic systems. The purpose of this paper is to show the improvements in neural network applications achievable through the use of learning algorithms more efficient than the classic Back-Propagation, and through the implementation of the neural schemes in parallel hardware. The results of the analysis of a scheme for Sensor Failure, Detection, Identification and Accommodation (SFDIA) using experimental flight data of a research aircraft model are presented. Conventional approaches to the problem are based on observers and Kalman Filters while more recent methods are based on neural approximators. The work described in this paper is based on the use of neural networks (NNs) as on-line learning non-linear approximators. The performances of two different neural architectures were compared. The first architecture is based on a Multi Layer Perceptron (MLP) NN trained with the Extended Back Propagation algorithm (EBPA). The second architecture is based on a Radial Basis Function (RBF) NN trained with the Extended-MRAN (EMRAN) algorithms. In addition, alternative methods for communications links fault detection and accommodation are presented, relative to multiple unmanned aircraft applications.
Quantum neuromorphic hardware for quantum artificial intelligence

NASA Astrophysics Data System (ADS)

Prati, Enrico

2017-08-01

The development of machine learning methods based on deep learning boosted the field of artificial intelligence towards unprecedented achievements and applications in several fields. Such prominent results were made in parallel with the first successful demonstrations of fault tolerant hardware for quantum information processing. To what extent deep learning can take advantage of hardware based on qubits behaving as a universal quantum computer is an open question under investigation. Here I review the convergence between the two fields towards implementation of advanced quantum algorithms, including quantum deep learning.
Adiabatic gate teleportation.

PubMed

Bacon, Dave; Flammia, Steven T

2009-09-18

The difficulty in producing precisely timed and controlled quantum gates is a significant source of error in many physical implementations of quantum computers. Here we introduce a simple universal primitive, adiabatic gate teleportation, which is robust to timing errors and many control errors and maintains a constant energy gap throughout the computation above a degenerate ground state space. This construction allows for geometric robustness based upon the control of two independent qubit interactions. Further, our piecewise adiabatic evolution easily relates to the quantum circuit model, enabling the use of standard methods from fault-tolerance theory for establishing thresholds.
BigBWA: approaching the Burrows-Wheeler aligner to Big Data technologies.

PubMed

Abuín, José M; Pichel, Juan C; Pena, Tomás F; Amigo, Jorge

2015-12-15

BigBWA is a new tool that uses the Big Data technology Hadoop to boost the performance of the Burrows-Wheeler aligner (BWA). Important reductions in the execution times were observed when using this tool. In addition, BigBWA is fault tolerant and it does not require any modification of the original BWA source code. BigBWA is available at the project GitHub repository: https://github.com/citiususc/BigBWA. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
FTMP - A highly reliable Fault-Tolerant Multiprocessor for aircraft

NASA Technical Reports Server (NTRS)

Hopkins, A. L., Jr.; Smith, T. B., III; Lala, J. H.

1978-01-01

The FTMP (Fault-Tolerant Multiprocessor) is a complex multiprocessor computer that employs a form of redundancy related to systems considered by Mathur (1971), in which each major module can substitute for any other module of the same type. Despite the conceptual simplicity of the redundancy form, the implementation has many intricacies owing partly to the low target failure rate, and partly to the difficulty of eliminating single-fault vulnerability. An extensive analysis of the computer through the use of such modeling techniques as Markov processes and combinatorial mathematics shows that for random hard faults the computer can meet its requirements. It is also shown that the maintenance scheduled at intervals of 200 hr or more can be adequate most of the time.
Redundancy management for efficient fault recovery in NASA's distributed computing system

NASA Technical Reports Server (NTRS)

Malek, Miroslaw; Pandya, Mihir; Yau, Kitty

1991-01-01

The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance.
Machine learning techniques for fault isolation and sensor placement

NASA Technical Reports Server (NTRS)

Carnes, James R.; Fisher, Douglas H.

1993-01-01

Fault isolation and sensor placement are vital for monitoring and diagnosis. A sensor conveys information about a system's state that guides troubleshooting if problems arise. We are using machine learning methods to uncover behavioral patterns over snapshots of system simulations that will aid fault isolation and sensor placement, with an eye towards minimality, fault coverage, and noise tolerance.
Braiding by Majorana tracking and long-range CNOT gates with color codes

NASA Astrophysics Data System (ADS)

Litinski, Daniel; von Oppen, Felix

2017-11-01

Color-code quantum computation seamlessly combines Majorana-based hardware with topological error correction. Specifically, as Clifford gates are transversal in two-dimensional color codes, they enable the use of the Majoranas' non-Abelian statistics for gate operations at the code level. Here, we discuss the implementation of color codes in arrays of Majorana nanowires that avoid branched networks such as T junctions, thereby simplifying their realization. We show that, in such implementations, non-Abelian statistics can be exploited without ever performing physical braiding operations. Physical braiding operations are replaced by Majorana tracking, an entirely software-based protocol which appropriately updates the Majoranas involved in the color-code stabilizer measurements. This approach minimizes the required hardware operations for single-qubit Clifford gates. For Clifford completeness, we combine color codes with surface codes, and use color-to-surface-code lattice surgery for long-range multitarget CNOT gates which have a time overhead that grows only logarithmically with the physical distance separating control and target qubits. With the addition of magic state distillation, our architecture describes a fault-tolerant universal quantum computer in systems such as networks of tetrons, hexons, or Majorana box qubits, but can also be applied to nontopological qubit platforms.
From Fault-Diagnosis and Performance Recovery of a Controlled System to Chaotic Secure Communication

NASA Astrophysics Data System (ADS)

Hsu, Wen-Teng; Tsai, Jason Sheng-Hong; Guo, Fang-Cheng; Guo, Shu-Mei; Shieh, Leang-San

Chaotic systems are often applied to encryption in secure communication, but they may not provide a high degree of security. In order to improve the security of communication, chaotic systems may need to add other secure signals, but this may cause the system to diverge. In this paper, we redesign a communication scheme that could create secure communication with additional secure signals, and the proposed scheme could keep system convergence. First, we introduce the universal state-space adaptive observer-based fault diagnosis/estimator and the high-performance tracker for the sampled-data linear time-varying system with unanticipated decay factors in actuators/system states. In addition, robustness, convergence in the mean, and tracking ability are established in this paper. A residual generation scheme and a mechanism for auto-tuning switched gain are also presented, so that the introduced methodology is applicable for the fault detection and diagnosis (FDD) for actuator and state faults to yield a high tracking performance recovery. The evolutionary programming-based adaptive observer is then applied to the problem of secure communication. Whenever the tracker induces a large control input which might not conform to the input constraint of some physical systems, the proposed modified linear quadratic optimal tracker (LQT) can effectively restrict the control input within the specified constraint interval, under the acceptable tracking performance. The effectiveness of the proposed design methodology is illustrated through tracking control simulation examples.
Self-calibrating models for dynamic monitoring and diagnosis

NASA Technical Reports Server (NTRS)

Kuipers, Benjamin

1996-01-01

A method for automatically building qualitative and semi-quantitative models of dynamic systems, and using them for monitoring and fault diagnosis, is developed and demonstrated. The qualitative approach and semi-quantitative method are applied to monitoring observation streams, and to design of non-linear control systems.
Insights into the relationship between surface and subsurface activity from mechanical modeling of the 1992 Landers M7.3 earthquake

NASA Astrophysics Data System (ADS)

Madden, E. H.; Pollard, D. D.

2009-12-01

Multi-fault, strike-slip earthquakes have proved difficult to incorporate into seismic hazard analyses due to the difficulty of determining the probability of these ruptures, despite collection of extensive data associated with such events. Modeling the mechanical behavior of these complex ruptures contributes to a better understanding of their occurrence by elucidating the relationship between surface and subsurface earthquake activity along transform faults. This insight is especially important for hazard mitigation, as multi-fault systems can produce earthquakes larger than those associated with any one fault involved. We present a linear elastic, quasi-static model of the southern portion of the 28 June 1992 Landers earthquake built in the boundary element software program Poly3D. This event did not rupture the extent of any one previously mapped fault, but trended 80km N and NW across segments of five sub-parallel, N-S and NW-SE striking faults. At M7.3, the earthquake was larger than the potential earthquakes associated with the individual faults that ruptured. The model extends from the Johnson Valley Fault, across the Landers-Kickapoo Fault, to the Homestead Valley Fault, using data associated with a six-week time period following the mainshock. It honors the complex surface deformation associated with this earthquake, which was well exposed in the desert environment and mapped extensively in the field and from aerial photos in the days immediately following the earthquake. Thus, the model incorporates the non-linearity and segmentation of the main rupture traces, the irregularity of fault slip distributions, and the associated secondary structures such as strike-slip splays and thrust faults. Interferometric Synthetic Aperture Radar (InSAR) images of the Landers event provided the first satellite images of ground deformation caused by a single seismic event and provide constraints on off-fault surface displacement in this six-week period. 
Insight is gained by comparing the density, magnitudes and focal plane orientations of relocated aftershocks for this time frame with the magnitude and orientation of planes of maximum Coulomb shear stress around the fault planes at depth.
A programmable five qubit quantum computer using trapped atomic ions

NASA Astrophysics Data System (ADS)

Debnath, Shantanu

2017-04-01

In order to harness the power of quantum information processing, several candidate systems have been investigated, and tailored to demonstrate only specific computations. In my thesis work, we construct a general-purpose multi-qubit device using a linear chain of trapped ion qubits, which in principle can be programmed to run any quantum algorithm. To achieve such flexibility, we develop a pulse shaping technique to realize a set of fully connected two-qubit rotations that entangle arbitrary pairs of qubits using multiple motional modes of the chain. Following a computation architecture, such highly expressive two-qubit gates along with arbitrary single-qubit rotations can be used to compile modular universal logic gates that are effected by targeted optical fields and hence can be reconfigured according to any algorithm circuit programmed in the software. As a demonstration, we run the Deutsch-Jozsa and Bernstein-Vazirani algorithms, and a fully coherent quantum Fourier transform, which we use to solve the 'period finding' and 'quantum phase estimation' problems. Combining these results with recent demonstrations of quantum fault-tolerance, Grover's search algorithm, and simulation of boson hopping establishes the versatility of such a computation module that can potentially be connected to other modules for future large-scale computations.
Algorithm-Based Fault Tolerance for Numerical Subroutines

NASA Technical Reports Server (NTRS)

Tumon, Michael; Granat, Robert; Lou, John

2007-01-01

A software library implements a new methodology of detecting faults in numerical subroutines, thus enabling application programs that contain the subroutines to recover transparently from single-event upsets. The software library in question is fault-detecting middleware that is wrapped around the numerical subroutines. Conventional serial versions (based on LAPACK and FFTW) and a parallel version (based on ScaLAPACK) exist. The source code of the application program that contains the numerical subroutines is not modified, and the middleware is transparent to the user. The methodology used is a type of algorithm-based fault tolerance (ABFT). In ABFT, a checksum is computed before a computation and compared with the checksum of the computational result; an error is declared if the difference between the checksums exceeds some threshold. Novel normalization methods are used in the checksum comparison to ensure correct fault detections independent of algorithm inputs. In tests of this software reported in the peer-reviewed literature, this library was shown to enable detection of 99.9 percent of significant faults while generating no false alarms.
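The checksum comparison described above can be sketched for matrix multiplication. This is an illustrative example only, not the library's code: the names `matmul`, `column_sums` and `checked_matmul` are hypothetical, and the simple scaling by the largest expected checksum stands in for the normalization methods the abstract mentions without detailing.

```python
def matmul(a, b):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def column_sums(m):
    """Sum of each column, i.e. the row vector 1^T * m."""
    return [sum(col) for col in zip(*m)]


def checked_matmul(a, b, multiply=matmul, rel_tol=1e-9):
    """Wrap a matrix multiply with a checksum test.

    Uses the identity 1^T (A B) = (1^T A) B: the checksum is computed from
    the inputs before the main computation and compared with the column
    sums of the result. Returns (result, fault_detected).
    """
    expected = matmul([column_sums(a)], b)[0]   # checksum from the inputs
    c = multiply(a, b)                          # computation being protected
    observed = column_sums(c)                   # checksum of the result
    # Assumed normalization: scale the threshold by the largest checksum.
    scale = max(1.0, max(abs(v) for v in expected))
    fault = any(abs(e - o) > rel_tol * scale
                for e, o in zip(expected, observed))
    return c, fault
```

Passing a different `multiply` callable shows the wrapper idea: the caller's routine is left untouched and the check sits around it.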
A review of fault tolerant control strategies applied to proton exchange membrane fuel cell systems

NASA Astrophysics Data System (ADS)

Dijoux, Etienne; Steiner, Nadia Yousfi; Benne, Michel; Péra, Marie-Cécile; Pérez, Brigitte Grondin

2017-08-01

Fuel cells are powerful systems for power generation. They have good efficiency and do not generate greenhouse gases. This technology involves many scientific fields, which leads to strongly inter-dependent parameters. This makes the system particularly hard to control and increases the frequency of fault occurrence. These two issues make it necessary to maintain the system performance at the expected level, even in faulty operating conditions. This is called "fault tolerant control" (FTC). The present paper aims to give the state of the art of FTC applied to the proton exchange membrane fuel cell (PEMFC). The FTC approach is composed of two parts. First, a diagnosis part allows the identification and the isolation of a fault; it requires a good a priori knowledge of all the possible faults. Then, a control part allows an optimal control strategy to find the best operating point to recover/mitigate the fault; it requires the knowledge of the degradation phenomena and their mitigation strategies.
Non-functional Avionics Requirements

NASA Astrophysics Data System (ADS)

Paulitsch, Michael; Ruess, Harald; Sorea, Maria

Embedded systems in aerospace become more and more integrated in order to reduce weight, volume/size, and power of hardware for more fuel-efficiency. Such integration tendencies change architectural approaches of system architectures, which subsequently change non-functional requirements for platforms. This paper provides some insight into state-of-the-practice of non-functional requirements for developing ultra-critical embedded systems in the aerospace industry, including recent changes and trends. In particular, formal requirement capture and formal analysis of non-functional requirements of avionic systems - including hard-real time, fault-tolerance, reliability, and performance - are exemplified by means of recent developments in SAL and HiLiTE.
Partitioning in Avionics Architectures: Requirements, Mechanisms, and Assurance

NASA Technical Reports Server (NTRS)

Rushby, John

1999-01-01

Automated aircraft control has traditionally been divided into distinct "functions" that are implemented separately (e.g., autopilot, autothrottle, flight management); each function has its own fault-tolerant computer system, and dependencies among different functions are generally limited to the exchange of sensor and control data. A by-product of this "federated" architecture is that faults are strongly contained within the computer system of the function where they occur and cannot readily propagate to affect the operation of other functions. More modern avionics architectures contemplate supporting multiple functions on a single, shared, fault-tolerant computer system where natural fault containment boundaries are less sharply defined. Partitioning uses appropriate hardware and software mechanisms to restore strong fault containment to such integrated architectures. This report examines the requirements for partitioning, mechanisms for their realization, and issues in providing assurance for partitioning. Because partitioning shares some concerns with computer security, security models are reviewed and compared with the concerns of partitioning.

High Speed, High Temperature, Fault Tolerant Operation of a Combination Magnetic-Hydrostatic Bearing Rotor Support System for Turbomachinery

NASA Technical Reports Server (NTRS)

Jansen, Mark; Montague, Gerald; Provenza, Andrew; Palazzolo, Alan

2004-01-01

Closed loop operation of a single, high temperature magnetic radial bearing to 30,000 RPM (2.25 million DN) and 540 C (1000 F) is discussed. Also, high temperature, fault tolerant operation for the three axis system is examined. A novel, hydrostatic backup bearing system was employed to attain high speed, high temperature, lubrication free support of the entire rotor system. The hydrostatic bearings were made of a high lubricity material and acted as journal-type backup bearings. New, high temperature displacement sensors were successfully employed to monitor shaft position throughout the entire temperature range and are described in this paper. Control of the system was accomplished through a stand alone, high speed computer controller and it was used to run both the fault-tolerant PID and active vibration control algorithms.
Guidance, Navigation, and Control System Design in a Mass Reduction Exercise

NASA Technical Reports Server (NTRS)

Crain, Timothy; Begly, Michael; Jackson, Mark; Broome, Joel

2008-01-01

Early Orion GN&C system designs optimized for robustness, simplicity, and utilization of commercially available components. During the System Definition Review (SDR), all subsystems on Orion were asked to re-optimize with component mass and steady state power as primary design metrics. The objective was to create a mass reserve in the Orion point of departure vehicle design prior to beginning the PDR analysis cycle. The Orion GN&C subsystem team transitioned from a philosophy of absolute 2 fault tolerance for crew safety and 1 fault tolerance for mission success to an approach of 1 fault tolerance for crew safety and risk based redundancy to meet probability allocations of loss of mission and loss of crew. This paper will discuss the analyses, rationale, and end results of this activity regarding Orion navigation sensor hardware, control effectors, and trajectory design.
Verification of the FtCayuga fault-tolerant microprocessor system. Volume 1: A case study in theorem prover-based verification

NASA Technical Reports Server (NTRS)

Srivas, Mandayam; Bickford, Mark

1991-01-01

The design and formal verification of a hardware system for a task that is an important component of a fault tolerant computer architecture for flight control systems is presented. The hardware system implements an algorithm for obtaining interactive consistency (Byzantine agreement) among four microprocessors as a special instruction on the processors. The property verified ensures that an execution of the special instruction by the processors correctly accomplishes interactive consistency, provided certain preconditions hold. An assumption is made that the processors execute synchronously. For verification, the authors used a computer aided design hardware design verification tool, Spectool, and the theorem prover, Clio. A major contribution of the work is the demonstration of a significant fault tolerant hardware design that is mechanically verified by a theorem prover.
Verification of the FtCayuga fault-tolerant microprocessor system. Volume 2: Formal specification and correctness theorems

NASA Technical Reports Server (NTRS)

Bickford, Mark; Srivas, Mandayam

1991-01-01

Presented here is a formal specification and verification of a property of a quadruplicately redundant fault tolerant microprocessor system design. A complete listing of the formal specification of the system and the correctness theorems that are proved is given. The system performs the task of obtaining interactive consistency among the processors using a special instruction on the processors. The design is based on an algorithm proposed by Pease, Shostak, and Lamport. The property verified ensures that an execution of the special instruction by the processors correctly accomplishes interactive consistency, provided certain preconditions hold. The verification used a computer aided design verification tool, Spectool, and the theorem prover, Clio. A major contribution of the work is the demonstration of a significant fault tolerant hardware design that is mechanically verified by a theorem prover.
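Interactive consistency among four synchronous processors with at most one faulty member can be sketched in software as a two-round exchange followed by a majority vote. The code below is a simplified illustration in the spirit of the Pease, Shostak, and Lamport algorithm, not the verified hardware design; the function names and the `lie` callback that models a faulty processor are assumptions made for the example.

```python
from collections import Counter


def majority(values, default=None):
    """Return the value held by a strict majority, else `default`."""
    (top, count), = Counter(values).most_common(1)
    return top if count > len(values) / 2 else default


def interactive_consistency(private, faulty, lie):
    """Two-round exchange among len(private) processors.

    private: list of each processor's private value.
    faulty:  set of indices of faulty processors (assumed: at most one of four).
    lie:     lie(sender, receiver, value) -> message a faulty sender emits.
    Returns {processor index: agreed vector} for the non-faulty processors.
    """
    n = len(private)

    def transmit(sender, receiver, value):
        return lie(sender, receiver, value) if sender in faulty else value

    # Round 1: every processor sends its private value to every other one.
    direct = {(p, i): transmit(p, i, private[p])
              for p in range(n) for i in range(n) if i != p}

    # Round 2: every processor relays what it received; receivers vote.
    vectors = {}
    for i in range(n):
        if i in faulty:
            continue
        vector = []
        for p in range(n):
            if p == i:
                vector.append(private[i])
                continue
            reports = [direct[(p, i)]]
            reports += [transmit(j, i, direct[(p, j)])
                        for j in range(n) if j not in (i, p)]
            vector.append(majority(reports))
        vectors[i] = vector
    return vectors
```

With four processors and one faulty, every non-faulty processor ends with the same vector, and the entries for non-faulty processors equal their private values; the entry for the faulty processor may be the default when no majority exists.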
Proactive Fault Tolerance for HPC with Xen Virtualization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nagarajan, Arun Babu; Mueller, Frank; Engelmann, Christian

2007-01-01

with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current techniques to tolerate faults focus on reactive schemes to recover from faults and generally rely on a checkpoint/restart mechanism. Yet, in today's systems, node failures can often be anticipated by detecting a deteriorating health status. Instead of a reactive scheme for fault tolerance (FT), we are promoting a proactive one where processes automatically migrate from unhealthy nodes to healthy ones. Our approach relies on operating system virtualization techniques exemplified by but not limited to Xen. This paper contributes an automatic and transparent mechanism for proactive FT for arbitrary MPI applications. It leverages virtualization techniques combined with health monitoring and load-based migration. We exploit Xen's live migration mechanism for a guest operating system (OS) to migrate an MPI task from a health-deteriorating node to a healthy one without stopping the MPI task during most of the migration. Our proactive FT daemon orchestrates the tasks of health monitoring, load determination and initiation of guest OS migration. Experimental results demonstrate that live migration hides migration costs and limits the overhead to only a few seconds, making it an attractive approach to realize FT in HPC systems. Overall, our enhancements make proactive FT a valuable asset for long-running MPI applications that is complementary to reactive FT using full checkpoint/restart schemes since checkpoint frequencies can be reduced as fewer unanticipated failures are encountered. In the context of OS virtualization, we believe that this is the first comprehensive study of proactive fault tolerance where live migration is actually triggered by health monitoring.
System Wide Joint Position Sensor Fault Tolerance in Robot Systems Using Cartesian Accelerometers

NASA Technical Reports Server (NTRS)

Aldridge, Hal A.; Juang, Jer-Nan

1997-01-01

Joint position sensors are necessary for most robot control systems. A single position sensor failure in a normal robot system can greatly degrade performance. This paper presents a method to obtain position information from Cartesian accelerometers without integration. Depending on the number and location of the accelerometers, the proposed system can tolerate the loss of multiple position sensors. A solution technique suitable for real-time implementation is presented. Simulations were conducted using 5 triaxial accelerometers to recover from the loss of up to 4 joint position sensors on a 7 degree of freedom robot moving in general three dimensional space. The simulations show good estimation performance using non-ideal accelerometer measurements.
Flight test results of the Strapdown hexad Inertial Reference Unit (SIRU). Volume 1: Flight test summary

NASA Technical Reports Server (NTRS)

Hruby, R. J.; Bjorkman, W. S.

1977-01-01

Flight test results of the strapdown inertial reference unit (SIRU) navigation system are presented. The fault-tolerant SIRU navigation system features a redundant inertial sensor unit and dual computers. System software provides for detection and isolation of inertial sensor failures and continued operation in the event of failures. Flight test results include assessments of the system's navigational performance and fault tolerance.
Implementation Of The Configurable Fault Tolerant System Experiment On NPSAT 1

DTIC Science & Technology

2016-03-01

REPORT TYPE AND DATES COVERED Master’s thesis 4. TITLE AND SUBTITLE IMPLEMENTATION OF THE CONFIGURABLE FAULT TOLERANT SYSTEM EXPERIMENT ON NPSAT...open-source microprocessor without interlocked pipeline stages (MIPS) based processor softcore, a cached memory structure capable of accessing double...data rate type three and secure digital card memories, an interface to the main satellite bus, and XILINX’s soft error mitigation softcore. The
Proactive Fault Tolerance Using Preemptive Migration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Engelmann, Christian; Vallee, Geoffroy R; Naughton, III, Thomas J

2009-01-01

Proactive fault tolerance (FT) in high-performance computing is a concept that prevents compute node failures from impacting running parallel applications by preemptively migrating application parts away from nodes that are about to fail. This paper provides a foundation for proactive FT by defining its architecture and classifying implementation options. This paper further relates prior work to the presented architecture and classification, and discusses the challenges ahead for needed supporting technologies.
The art of fault-tolerant system reliability modeling

NASA Technical Reports Server (NTRS)

Butler, Ricky W.; Johnson, Sally C.

1990-01-01

A step-by-step tutorial of the methods and tools used for the reliability analysis of fault-tolerant systems is presented. Emphasis is on the representation of architectural features in mathematical models. Details of the mathematical solution of complex reliability models are not presented. Instead the use of several recently developed computer programs--SURE, ASSIST, STEM, PAWS--which automate the generation and solution of these models is described.
A General theory of Signal Integration for Fault-Tolerant Dynamic Distributed Sensor Networks

DTIC Science & Technology

1993-10-01

related to a) the architecture and fault-tolerance of the distributed sensor network, b) the proper synchronisation of sensor signals, c) the...Computational complexities of the problem of distributed detection. 5) Issues related to recording of events and synchronization in distributed sensor...Intervals for Synchronization in Real Time Distributed Systems", Submitted to Electronic Encyclopedia. 3. V. G. Hegde and S. S. Iyengar "Efficient
Systems Design Factors: The Essential Ingredients of System Design, Version 0.4

DTIC Science & Technology

1994-03-18

Reliability Function). 4. Barry W. Johnson, Design and Analysis of Fault Tolerant Digital Systems, p. 4, Addison-Wesley Publishing Company, 1985. METRICS...the system was performing correctly at time t. The unreliability is often referred to as the probability of failure. SOURCE: 1. Barry W. Johnson...Systems Engineering. 3. Barry W. Johnson, Design and Analysis of Fault Tolerant Digital Systems, Addison-Wesley Publishing Company, 1985, p. 5
Implementation of a Configurable Fault Tolerant Processor (CFTP) Using Internal Triple Modular Redundancy (TMR)

DTIC Science & Technology

2005-12-01

Upsets in SRAM FPGAs,” Military and Aerospace Applications of Programmable Logic Devices, September 2002. 8. Wakerly, John F. “Microcomputer...change. The goal of the Configurable Fault Tolerant Processor (CFTP) Project is to explore, develop and demonstrate the applicability of using off-the...develop and demonstrate the applicability of using commercial-off-the-shelf (COTS) Field Programmable Gate Arrays (FPGA) in the design of
Advanced information processing system: The Army Fault-Tolerant Architecture detailed design overview

NASA Technical Reports Server (NTRS)

Harper, Richard E.; Babikyan, Carol A.; Butler, Bryan P.; Clasen, Robert J.; Harris, Chris H.; Lala, Jaynarayan H.; Masotto, Thomas K.; Nagle, Gail A.; Prizant, Mark J.; Treadwell, Steven

1994-01-01

The Army Avionics Research and Development Activity (AVRADA) is pursuing programs that would enable effective and efficient management of the large amounts of situational data that occur during tactical rotorcraft missions. The Computer Aided Low Altitude Night Helicopter Flight Program has identified automated Terrain Following/Terrain Avoidance, Nap of the Earth (TF/TA, NOE) operation as a key enabling technology for advanced tactical rotorcraft to enhance mission survivability and mission effectiveness. The processing of critical information at low altitudes with short reaction times is life-critical and mission-critical, necessitating an ultra-reliable/high throughput computing platform for dependable service for flight control, fusion of sensor data, route planning, near-field/far-field navigation, and obstacle avoidance operations. To address these needs the Army Fault Tolerant Architecture (AFTA) is being designed and developed. This computer system is based upon the Fault Tolerant Parallel Processor (FTPP) developed by Charles Stark Draper Labs (CSDL). AFTA is a hard real-time, Byzantine fault-tolerant parallel processor which is programmed in the Ada language. This document describes the results of the Detailed Design (Phases 2 and 3 of a 3-year project) of the AFTA development. This document contains detailed descriptions of the program objectives, the TF/TA NOE application requirements, architecture, hardware design, operating systems design, systems performance measurements and analytical models.
ALLIANCE: An architecture for fault tolerant, cooperative control of heterogeneous mobile robots

DOE Office of Scientific and Technical Information (OSTI.GOV)

Parker, L.E.

1995-02-01

This research addresses the problem of achieving fault tolerant cooperation within small- to medium-sized teams of heterogeneous mobile robots. The author describes a novel behavior-based, fully distributed architecture, called ALLIANCE, that utilizes adaptive action selection to achieve fault tolerant cooperative control in robot missions involving loosely coupled, largely independent tasks. The robots in this architecture possess a variety of high-level functions that they can perform during a mission, and must at all times select an appropriate action based on the requirements of the mission, the activities of other robots, the current environmental conditions, and their own internal states. Since such cooperative teams often work in dynamic and unpredictable environments, the software architecture allows the team members to respond robustly and reliably to unexpected environmental changes and modifications in the robot team that may occur due to mechanical failure, the learning of new skills, or the addition or removal of robots from the team by human intervention. After presenting ALLIANCE, the author describes in detail experimental results of an implementation of this architecture on a team of physical mobile robots performing a cooperative box pushing demonstration. These experiments illustrate the ability of ALLIANCE to achieve adaptive, fault-tolerant cooperative control amidst dynamic changes in the capabilities of the robot team.
Accurate electrostatic and van der Waals pull-in prediction for fully clamped nano/micro-beams using linear universal graphs of pull-in instability

NASA Astrophysics Data System (ADS)

Tahani, Masoud; Askari, Amir R.

2014-09-01

Although the pull-in instability of electrically actuated nano/micro-beams has been investigated by many researchers to date, no explicit formula has yet been presented which can predict pull-in voltage based on a geometrically non-linear and distributed parameter model. The objective of the present paper is to introduce a simple and accurate formula to predict this value for a fully clamped electrostatically actuated nano/micro-beam. To this end, a non-linear Euler-Bernoulli beam model is employed, which accounts for the axial residual stress, geometric non-linearity of mid-plane stretching, distributed electrostatic force and the van der Waals (vdW) attraction. The non-linear boundary value governing equation of equilibrium is non-dimensionalized and solved iteratively through a single-term Galerkin based reduced order model (ROM). The solutions are validated through direct comparison with experimental and other existing results reported in previous studies. Pull-in instability under electrical and vdW loads is also investigated using universal graphs. Based on the results of these graphs, the non-dimensional pull-in and vdW parameters, which are defined in the text, vary linearly versus the other dimensionless parameters of the problem. Using this fact, some linear equations are presented to predict pull-in voltage, the maximum allowable length, the so-called detachment length, and the minimum allowable gap for a nano/micro-system. These linear equations are also reduced to a couple of universal pull-in formulas for systems with small initial gap. The accuracy of the universal pull-in formulas is also validated by comparing their results with available experimental and some previous geometrically linear and closed-form findings published in the literature.
A Decentralized Adaptive Approach to Fault Tolerant Flight Control

NASA Technical Reports Server (NTRS)

Wu, N. Eva; Nikulin, Vladimir; Heimes, Felix; Shormin, Victor

2000-01-01

This paper briefly reports some results of our study on the application of a decentralized adaptive control approach to a 6 DOF nonlinear aircraft model. The simulation results showed the potential of using this approach to achieve fault tolerant control. Based on this observation and some analysis, the paper proposes a multiple channel adaptive control scheme that makes use of the functionally redundant actuating and sensing capabilities in the model, and explains how to implement the scheme to tolerate actuator and sensor failures. The conditions under which the scheme is applicable are stated in the paper.
Validation of multiprocessor systems

NASA Technical Reports Server (NTRS)

Siewiorek, D. P.; Segall, Z.; Kong, T.

1982-01-01

Experiments that can be used to validate fault free performance of multiprocessor systems in aerospace systems integrating flight controls and avionics are discussed. Engineering prototypes for two fault tolerant multiprocessors are tested.
Periodic Application of Concurrent Error Detection in Processor Array Architectures. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Chen, Paul Peichuan

1993-01-01

Processor arrays can provide an attractive architecture for some applications. Featuring modularity, regular interconnection and high parallelism, such arrays are well-suited for VLSI/WSI implementations, and applications with high computational requirements, such as real-time signal processing. Preserving the integrity of results can be of paramount importance for certain applications. In these cases, fault tolerance should be used to ensure reliable delivery of a system's service. One aspect of fault tolerance is the detection of errors caused by faults. Concurrent error detection (CED) techniques offer the advantage that transient and intermittent faults may be detected with greater probability than with off-line diagnostic tests. Applying time-redundant CED techniques can reduce hardware redundancy costs. However, most time-redundant CED techniques degrade a system's performance.
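As a rough illustration of time-redundant CED, the sketch below uses recomputation with shifted operands (RESO). This technique is swapped in here for illustration and is not necessarily the one studied in the thesis. The `faulty_add` model, its stuck-at-0 output bit, and all names are hypothetical.

```python
def faulty_add(a, b, stuck_bit=None):
    """Adder model; optionally forces one output bit to 0 (hypothetical stuck-at-0 fault)."""
    total = a + b
    if stuck_bit is not None:
        total &= ~(1 << stuck_bit)
    return total


def add_with_reso(a, b, stuck_bit=None):
    """Run the same adder twice (time redundancy) and compare the two results.

    Pass 1 uses the operands as given. Pass 2 shifts both operands left by one
    bit, adds, and shifts the result back, so a fixed faulty bit position
    corrupts a different bit of the sum in each pass.
    """
    first = faulty_add(a, b, stuck_bit)
    second = faulty_add(a << 1, b << 1, stuck_bit) >> 1
    return first, first != second


ok_sum, ok_flag = add_with_reso(5, 3)                 # fault-free adder
bad_sum, bad_flag = add_with_reso(5, 3, stuck_bit=3)  # output bit 3 stuck at 0
print(ok_sum, ok_flag)
print(bad_sum, bad_flag)
```

The second pass roughly doubles the time per addition, which is the kind of performance cost the abstract attributes to most time-redundant techniques.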
Design and evaluation of a fault-tolerant multiprocessor using hardware recovery blocks

NASA Technical Reports Server (NTRS)

Lee, Y. H.; Shin, K. G.

1982-01-01

A fault-tolerant multiprocessor with a rollback recovery mechanism is discussed. The rollback mechanism is based on the hardware recovery block which is a hardware equivalent to the software recovery block. The hardware recovery block is constructed by consecutive state-save operations and several state-save units in every processor and memory module. When a fault is detected, the multiprocessor reconfigures itself to replace the faulty component and then the process originally assigned to the faulty component retreats to one of the previously saved states in order to resume fault-free execution. A mathematical model is proposed to calculate both the coverage of multi-step rollback recovery and the risk of restart. A performance evaluation in terms of task execution time is also presented.
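The multi-step rollback idea can be sketched in software as follows. This is a toy model under stated assumptions, not the hardware design from the report: the class name, the bounded buffer standing in for the state-save units, and the `depth` parameter are all hypothetical.

```python
from collections import deque
from copy import deepcopy


class RecoveryBlockProcess:
    """Toy process that keeps only the last `depth` saved states (hypothetical model)."""

    def __init__(self, state, depth=3):
        self.state = state
        # Bounded buffer standing in for a fixed number of state-save units.
        self.saved = deque(maxlen=depth)

    def save_state(self):
        # State-save operation: store a snapshot, dropping the oldest when full.
        self.saved.append(deepcopy(self.state))

    def rollback(self, steps=1):
        # Retreat `steps` save points; if too few are kept, a restart is needed.
        if steps < 1 or steps > len(self.saved):
            raise RuntimeError("rollback exceeds saved states; restart required")
        for _ in range(steps - 1):
            self.saved.pop()
        self.state = deepcopy(self.saved[-1])
        return self.state


proc = RecoveryBlockProcess({"pc": 0, "acc": 0}, depth=3)
for step in range(1, 5):
    proc.state["pc"] = step
    proc.state["acc"] += step
    proc.save_state()

# Assume a fault is detected late, so the newest save point is already
# contaminated and the process retreats two save points.
recovered = proc.rollback(steps=2)
print(recovered)
```

In this toy model, a rollback request deeper than the retained save points raises an error, which corresponds loosely to the restart case whose risk the report's mathematical model estimates.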

An improved ant colony optimization algorithm with fault tolerance for job scheduling in grid computing systems

PubMed Central

Idris, Hajara; Junaidu, Sahalu B.; Adewumi, Aderemi O.

2017-01-01

The Grid scheduler schedules user jobs on the best available resource in terms of resource characteristics by optimizing job execution time. Resource failure in Grid is no longer an exception but a regularly occurring event, as resources are increasingly being used by the scientific community to solve computationally intensive problems which typically run for days or even months. It is therefore essential that these long-running applications are able to tolerate failures and avoid re-computations from scratch after resource failure has occurred, to satisfy the user’s Quality of Service (QoS) requirement. Job Scheduling with Fault Tolerance in Grid Computing using Ant Colony Optimization is proposed to ensure that jobs are executed successfully even when resource failure has occurred. The technique employed in this paper is the use of resource failure rate, as well as a checkpoint-based rollback recovery strategy. Checkpointing aims at reducing the amount of work that is lost upon failure of the system by immediately saving the state of the system. A comparison of the proposed approach with an existing Ant Colony Optimization (ACO) algorithm is discussed. The experimental results of the implemented Fault Tolerance scheduling algorithm show that there is an improvement in the user’s QoS requirement over the existing ACO algorithm, which has no fault tolerance integrated in it. The performance of the two algorithms was measured in terms of the three main scheduling performance metrics: makespan, throughput and average turnaround time. PMID:28545075
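The three metrics named above, and the benefit of checkpointing, can be sketched as follows. The definitions are the conventional ones and may differ in detail from those used in the paper. `JobRecord`, `work_lost`, and the sample numbers are hypothetical, and checkpoint overhead is ignored.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    submit: float  # time the job entered the queue
    finish: float  # time the job completed


def makespan(jobs):
    """Time from the earliest submission to the last completion."""
    return max(j.finish for j in jobs) - min(j.submit for j in jobs)


def throughput(jobs):
    """Completed jobs per unit time over the makespan."""
    return len(jobs) / makespan(jobs)


def average_turnaround(jobs):
    """Mean time between submission and completion."""
    return sum(j.finish - j.submit for j in jobs) / len(jobs)


def work_lost(failure_time, checkpoint_interval=None):
    """Work that must be redone after a failure at `failure_time`."""
    if checkpoint_interval is None:
        return failure_time                    # restart from scratch
    return failure_time % checkpoint_interval  # resume from the last checkpoint


jobs = [JobRecord(0.0, 4.0), JobRecord(1.0, 6.0), JobRecord(2.0, 10.0)]
print(makespan(jobs), throughput(jobs), average_turnaround(jobs))
print(work_lost(47.0), work_lost(47.0, checkpoint_interval=10.0))
```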
Convergence and objective functions of some fault/noise-injection-based online learning algorithms for RBF networks.

PubMed

Ho, Kevin I-J; Leung, Chi-Sing; Sum, John

2010-06-01

In the last two decades, many online fault/noise injection algorithms have been developed to attain a fault tolerant neural network. However, not much theoretical work related to their convergence and objective functions has been reported. This paper studies six common fault/noise-injection-based online learning algorithms for radial basis function (RBF) networks, namely 1) injecting additive input noise, 2) injecting additive/multiplicative weight noise, 3) injecting multiplicative node noise, 4) injecting multiweight fault (random disconnection of weights), 5) injecting multinode fault during training, and 6) weight decay with injecting multinode fault. Based on the Gladyshev theorem, we show that the convergence of these six online algorithms is almost sure. Moreover, their true objective functions being minimized are derived. For injecting additive input noise during training, the objective function is identical to that of the Tikhonov regularizer approach. For injecting additive/multiplicative weight noise during training, the objective function is the simple mean square training error. Thus, injecting additive/multiplicative weight noise during training cannot improve the fault tolerance of an RBF network. Similar to injecting additive input noise, the objective functions of other fault/noise-injection-based online algorithms contain a mean square error term and a specialized regularization term.
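To make "injecting multinode fault during training" concrete, here is a minimal sketch that assumes a plain online least-mean-squares update on the output weights of a Gaussian RBF network. The update rule, the fault probability, the learning rate, and the toy sine data are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np


def rbf_design(x, centers, width):
    """Gaussian RBF outputs, one row per sample and one column per node."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / width)


def train_multinode_fault(phi, y, p_fault=0.1, lr=0.05, epochs=200, seed=0):
    """Online LMS on the output weights with random node faults at every step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(phi.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            # False entries mark nodes whose output is forced to zero this step.
            alive = rng.random(phi.shape[1]) >= p_fault
            phi_faulty = phi[i] * alive
            err = y[i] - phi_faulty @ w
            w += lr * err * phi_faulty
    return w


x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x)
centers = np.linspace(0.0, 1.0, 10)
phi = rbf_design(x, centers, width=0.02)
w = train_multinode_fault(phi, y)
mse_fault_free = float(np.mean((y - phi @ w) ** 2))
print(mse_fault_free)
```

Because each update sees a randomly damaged network, the weights are pushed toward solutions that do not rely on any single node, which is the intuition behind the regularization term mentioned in the abstract.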
Proprioceptive Sensors' Fault Tolerant Control Strategy for an Autonomous Vehicle.

PubMed

Boukhari, Mohamed Riad; Chaibet, Ahmed; Boukhnifer, Moussa; Glaser, Sébastien

2018-06-09

In this contribution, a fault-tolerant control strategy for the longitudinal dynamics of an autonomous vehicle is presented. The aim is to be able to detect potential failures of the vehicle's speed sensor and then to keep the vehicle in a safe state. For this purpose, the separation principle, composed of a static output feedback controller and fault estimation observers, is designed. Indeed, two observer techniques were proposed: the proportional and integral observer and the descriptor observer. The effectiveness of the proposed scheme is validated by means of the experimental demonstrator of the VEDECOM (Véhicule Décarboné et Communicant) Institute.
77 FR 10968 - Fluopyram; Pesticide Tolerances

Federal Register 2010, 2011, 2012, 2013, 2014

2012-02-24

... 16 ppm; vegetable, fruiting, except non-bell pepper, group 8 at 1.0 ppm; vegetable, leafy, except... sufficient information on the carcinogenic mode of action is available, a threshold or non-linear approach is... notice. Submit a copy of your non-CBI objection or hearing request, identified by docket ID number EPA-HQ...
Shallow Vs Structure Across Hayward Fault Zone Inferred from Multichannel Analysis of Surface Waves (MASW)

NASA Astrophysics Data System (ADS)

Chan, J. H.; Richardson, I. S.; Strayer, L. M.; Catchings, R.; McEvilly, A.; Goldman, M.; Criley, C.; Sickler, R. R.

2017-12-01

The Hayward Fault Zone (HFZ) includes the Hayward fault (HF), as well as several named and unnamed subparallel, subsidiary faults to the east, among them the Quaternary-active Chabot Fault (CF), the Miller Creek Fault (MCF), and a heretofore unnamed fault, the Redwood Thrust Fault (RTF). With an ≥M6.0 recurrence interval of 130 y for the HF and the last major earthquake in 1868, the HFZ is a major seismic hazard in the San Francisco Bay Area, exacerbated by the many unknown and potentially active secondary faults of the HFZ. In 2016, researchers from California State University, East Bay, working in concert with the United States Geological Survey, conducted the East Bay Seismic Investigation (EBSI). We deployed 296 RefTek RT125 (Texan) seismographs along a 15-km-long linear seismic profile across the HF, extending from the bay in San Leandro to the hills in Castro Valley. Two-channel seismographs were deployed at 100 m intervals to record P- and S-waves, and additional single-channel seismographs were deployed at 20 m intervals where the seismic line crossed mapped faults. The active-source survey consisted of 16 buried explosive shots located at approximately 1-km intervals along the seismic line. We used the Multichannel Analysis of Surface Waves (MASW) method to develop 2-D shear-wave velocity models across the CF, MCF, and RTF. Preliminary MASW analyses show areas of anomalously low S-wave velocities, indicating zones of reduced shear modulus, coincident with these three mapped faults; additional velocity anomalies coincide with unmapped faults within the HFZ. Such compliant zones likely correspond to heavily fractured rock surrounding the faults, where the shear modulus is expected to be low compared to the undeformed host rock.
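The link between low S-wave velocity and reduced shear modulus follows from the standard relation for an isotropic elastic medium. The worked ratio below is illustrative only: the 30% velocity reduction and the equal-density assumption are hypothetical and are not values from the survey.

```latex
V_S = \sqrt{\frac{G}{\rho}}
\quad\Longrightarrow\quad
G = \rho V_S^{2},
\qquad
\frac{G_{\text{fault}}}{G_{\text{host}}}
= \frac{\rho_{\text{fault}}}{\rho_{\text{host}}}
\left(\frac{V_{S,\text{fault}}}{V_{S,\text{host}}}\right)^{2}
= 1 \times (0.7)^{2} = 0.49 .
```

Because the modulus scales with the square of velocity, even a moderate velocity anomaly implies a substantially more compliant zone.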
Control design and performance analysis of a 6 MW wind turbine-generator

DOE Office of Scientific and Technical Information (OSTI.GOV)

Murdoch, A.; Barton, R.S.; Javid, S.H.

1983-05-01

This paper discusses an approach to the modeling and performance for the preliminary design phase of a large (6.2 MW) horizontal axis wind turbine generator (WTG). Two control philosophies are presented, both of which are based on linearized models of the WT mechanical and electrical systems. The control designs are compared by showing the performance through detailed non-linear time simulation. The disturbances considered are wind gusts, and electrical faults near the WT terminals.
Control design and performance analysis of a 6 MW wind turbine-generator

NASA Technical Reports Server (NTRS)

Murdoch, A.; Winkelman, J. R.; Javid, S. H.; Barton, R. S.

1983-01-01

This paper discusses an approach to the modeling and performance for the preliminary design phase of a large (6.2 MW) horizontal axis wind turbine generator (WTG). Two control philosophies are presented, both of which are based on linearized models of the WT mechanical and electrical systems. The control designs are compared by showing the performance through detailed non-linear time simulation. The disturbances considered are wind gusts, and electrical faults near the WT terminals.
Quantum Error Correction with Biased Noise

NASA Astrophysics Data System (ADS)

Brooks, Peter

Quantum computing offers powerful new techniques for speeding up the calculation of many classically intractable problems. Quantum algorithms can allow for the efficient simulation of physical systems, with applications to basic research, chemical modeling, and drug discovery; other algorithms have important implications for cryptography and internet security. At the same time, building a quantum computer is a daunting task, requiring the coherent manipulation of systems with many quantum degrees of freedom while preventing environmental noise from interacting too strongly with the system. Fortunately, we know that, under reasonable assumptions, we can use the techniques of quantum error correction and fault tolerance to achieve an arbitrary reduction in the noise level. In this thesis, we look at how additional information about the structure of noise, or "noise bias," can improve or alter the performance of techniques in quantum error correction and fault tolerance. In Chapter 2, we explore the possibility of designing certain quantum gates to be extremely robust with respect to errors in their operation. This naturally leads to structured noise where certain gates can be implemented in a protected manner, allowing the user to focus their protection on the noisier unprotected operations. In Chapter 3, we examine how to tailor error-correcting codes and fault-tolerant quantum circuits in the presence of dephasing biased noise, where dephasing errors are far more common than bit-flip errors. By using an appropriately asymmetric code, we demonstrate the ability to improve the amount of error reduction and decrease the physical resources required for error correction. In Chapter 4, we analyze a variety of protocols for distilling magic states, which enable universal quantum computation, in the presence of faulty Clifford operations. Here again there is a hierarchy of noise levels, with a fixed error rate for faulty gates, and a second rate for errors in the distilled states which decreases as the states are distilled to better quality. The interplay of these different rates sets limits on the achievable distillation and how quickly states converge to that limit.
Development and Evaluation of Fault-Tolerant Flight Control Systems

NASA Technical Reports Server (NTRS)

Song, Yong D.; Gupta, Kajal (Technical Monitor)

2004-01-01

The research is concerned with developing a new approach to enhancing fault tolerance of flight control systems. The original motivation for fault-tolerant control comes from the need for safe operation of control elements (e.g. actuators) in the event of hardware failures in high reliability systems. One such example is a modern space vehicle subjected to actuator/sensor impairments. A major task in flight control is to revise the control policy to balance impairment detectability and to achieve sufficient robustness. This involves careful selection of types and parameters of the controllers and the impairment detecting filters used. It also involves a decision, upon the identification of some failures, on whether and how a control reconfiguration should take place in order to maintain a certain system performance level. In this project a new flight dynamic model under uncertain flight conditions is considered, in which the effects of both ramp and jump faults are reflected. Stabilization algorithms based on neural network and adaptive methods are derived. The control algorithms are shown to be effective in dealing with uncertain dynamics due to external disturbances and unpredictable faults. The overall strategy is easy to set up and the computation involved is much less than with other strategies. Computer simulation software is developed. A series of simulation studies has been conducted with varying flight conditions.
Robust Routing Protocol For Digital Messages

NASA Technical Reports Server (NTRS)

Marvit, Maclen

1994-01-01

Refinement of digital-message-routing protocol increases fault tolerance of polled networks. AbNET-3 is latest of generic AbNET protocols for transmission of messages among computing nodes. AbNET concept described in "Multiple-Ring Digital Communication Network" (NPO-18133). Specifically aimed at increasing fault tolerance of network in broadcast mode, in which one node broadcasts message to and receives responses from all other nodes. Communication in network of computers maintained even when links fail.
Theory of reliable systems. [systems analysis and design]

NASA Technical Reports Server (NTRS)

Meyer, J. F.

1973-01-01

The analysis and design of reliable systems are discussed. The attributes of system reliability studied are fault tolerance, diagnosability, and reconfigurability. Objectives of the study include: to determine properties of system structure that are conducive to a particular attribute; to determine methods for obtaining reliable realizations of a given system; and to determine how properties of system behavior relate to the complexity of fault tolerant realizations. A list of 34 references is included.
MAGMA: A Liquid Software Approach to Fault Tolerance, Computer Network Security, and Survivable Networking

DTIC Science & Technology

2001-12-01

and Lieutenant Namik Kaplan, Turkish Navy. Maj Tiefert’s thesis, “Modeling Control Channel Dynamics of SAAM using NS Network Simulation”, helped lay...DEC99] Deconinck, Dr. ir. Geert, Fault Tolerant Systems, ESAT / Division ACCA, Katholieke Universiteit Leuven, October 1999. [FRE00] Freed...Systems”, Addison-Wesley, 1989. [KAP99] Kaplan, Namik, “Prototyping of an Active and Lightweight Router,” March 1999 [KAT99] Kati, Effraim
Fault-tolerant composite Householder reflection

NASA Astrophysics Data System (ADS)

Torosov, Boyan T.; Kyoseva, Elica; Vitanov, Nikolay V.

2015-07-01

We propose a fault-tolerant implementation of the quantum Householder reflection, which is a key operation in various quantum algorithms, quantum-state engineering, generation of arbitrary unitaries, and entanglement characterization. We construct this operation using the modular approach of composite pulses and a relation between the Householder reflection and the quantum phase gate. The proposed implementation is highly insensitive to variations in the experimental parameters, which makes it suitable for high-fidelity quantum information processing.
Flight test results of the strapdown hexad inertial reference unit (SIRU). Volume 2: Test report

NASA Technical Reports Server (NTRS)

Hruby, R. J.; Bjorkman, W. S.

1977-01-01

Results of flight tests of the Strapdown Inertial Reference Unit (SIRU) navigation system are presented. The fault tolerant SIRU navigation system features a redundant inertial sensor unit and dual computers. System software provides for detection and isolation of inertial sensor failures and continued operation in the event of failures. Flight test results include assessments of the system's navigational performance and fault tolerance. Performance shortcomings are analyzed.
Imperfect construction of microclusters

NASA Astrophysics Data System (ADS)

Schneider, E.; Zhou, K.; Gilbert, G.; Weinstein, Y. S.

2014-01-01

Microclusters are the basic building blocks used to construct cluster states capable of supporting fault-tolerant quantum computation. In this paper, we explore the consequences of errors on microcluster construction using two error models. To quantify the effect of the errors we calculate the fidelity of the constructed microclusters and the fidelity with which two such microclusters can be fused together. Such simulations are vital for gauging the capability of an experimental system to achieve fault tolerance.
An Investigative Redesign of the ECG and EMG Signal Conditioning Circuits for Two-fault Tolerance and Circuit Improvement

NASA Technical Reports Server (NTRS)

Obrien, Edward M.

1991-01-01

An investigation was undertaken to make the electrocardiography (ECG) and the electromyography (EMG) signal conditioning circuits two-fault tolerant and to update the circuitry. The present signal conditioning circuits provide at least one level of subject protection against electrical shock hazard but at a level of 100 micro-A (for voltages of up to 200 V). However, it is necessary to provide catastrophic fault tolerance protection for the astronauts and to provide protection at a current level of less than 100 micro-A. For this study, protection at the 10 micro-A level was sought. This is the generally accepted value below which no possibility of microshock exists. Only the possibility of macroshock exists in the case of the signal conditioners. However, this extra amount of protection is desirable. The initial part deals with current limiter circuits followed by an investigation into the signal conditioner specifications and circuit design.
A Conceptual Design for a Reliable Optical Bus (ROBUS)

NASA Technical Reports Server (NTRS)

Miner, Paul S.; Malekpour, Mahyar; Torres, Wilfredo

2002-01-01

The Scalable Processor-Independent Design for Electromagnetic Resilience (SPIDER) is a new family of fault-tolerant architectures under development at NASA Langley Research Center (LaRC). The SPIDER is a general-purpose computational platform suitable for use in ultra-reliable embedded control applications. The design scales from a small configuration supporting a single aircraft function to a large distributed configuration capable of supporting several functions simultaneously. SPIDER consists of a collection of simplex processing elements communicating via a Reliable Optical Bus (ROBUS). The ROBUS is an ultra-reliable, time-division multiple access broadcast bus with strictly enforced write access (no babbling idiots) providing basic fault-tolerant services using formally verified fault-tolerance protocols including Interactive Consistency (Byzantine Agreement), Internal Clock Synchronization, and Distributed Diagnosis. The conceptual design of the ROBUS is presented in this paper including requirements, topology, protocols, and the block-level design. Verification activities, including the use of formal methods, are also discussed.
Fault Tolerant Characteristics of Artificial Neural Network Electronic Hardware

NASA Technical Reports Server (NTRS)

Zee, Frank

1995-01-01

The fault tolerant characteristics of analog-VLSI artificial neural network (with 32 neurons and 532 synapses) chips are studied by exposing them to high energy electrons, high energy protons, and gamma ionizing radiations under biased and unbiased conditions. The biased chips became nonfunctional after receiving a cumulative dose of less than 20 krads, while the unbiased chips only started to show degradation with a cumulative dose of over 100 krads. As the total radiation dose increased, all the components demonstrated graceful degradation. The analog sigmoidal function of the neuron became steeper (increase in gain), current leakage from the synapses progressively shifted the sigmoidal curve, and the digital memory of the synapses and the memory addressing circuits began to gradually fail. From these radiation experiments, we can learn how to modify certain designs of the neural network electronic hardware without using radiation-hardening techniques to increase its reliability and fault tolerance.
Fault tolerance with noisy and slow measurements and preparation.

PubMed

Paz-Silva, Gerardo A; Brennen, Gavin K; Twamley, Jason

2010-09-03

It is not so well known that measurement-free quantum error correction protocols can be designed to achieve fault-tolerant quantum computing. Despite their potential advantages in terms of the relaxation of accuracy, speed, and addressing requirements, they have usually been overlooked since they are expected to yield a very bad threshold. We show that this is not the case. We design fault-tolerant circuits for the 9-qubit Bacon-Shor code and find an error threshold for unitary gates and preparation of $p^{(p,g)}_{\mathrm{thresh}} = 3.76 \times 10^{-5}$ (30% of the best known result for the same code using measurement) while admitting up to 1/3 error rates for measurements and allocating no constraints on measurement speed. We further show that demanding gate error rates sufficiently below the threshold pushes the preparation threshold up to $p^{(p)}_{\mathrm{thresh}} = 1/3$.
BFT replication resistant to MAC attacks

NASA Astrophysics Data System (ADS)

Zbierski, Maciej

2016-09-01

Over the last decade numerous Byzantine fault-tolerant (BFT) replication protocols have been proposed in the literature. However, the vast majority of these solutions reuse the same authentication scheme, which makes them susceptible to a so-called MAC attack. Such a vulnerability enables malicious clients to undetectably prevent the replicated service from processing incoming client requests, consequently making it permanently unavailable. While some BFT protocols attempted to address this issue by using different authentication mechanisms, they at the same time significantly degraded the performance achieved in correct environments. This article presents a novel adaptive authentication mechanism which can be combined with practically any Byzantine fault-tolerant replication protocol. Unlike previous solutions, the proposed scheme dynamically switches between two operation modes to combine high performance in correct environments and liveness during MAC attacks. The experimental results presented in the article demonstrate that the proposed mechanism can sufficiently tolerate MAC attacks without introducing any observable overhead whenever no faults are present.
