Flight elements: Fault detection and fault management
NASA Technical Reports Server (NTRS)
Lum, H.; Patterson-Hine, A.; Edge, J. T.; Lawler, D.
1990-01-01
Fault management for an intelligent computational system must be developed using a top-down, integrated engineering approach. The proposed approach integrates the overall environment, involving sensors and their associated data; design knowledge capture; operations; fault detection, identification, and reconfiguration (FDIR); testability; causal models, including digraph matrix analysis; and overall performance impacts on the hardware and software architecture. A real-time intelligent fault detection and management system will be achieved by implementing several objectives: development of fault-tolerance/FDIR requirements and specifications at the systems level that carry through from conceptual design to implementation and mission operations; implementation of monitoring, diagnosis, and reconfiguration at all system levels, providing fault isolation and system integration; optimization of system operations to manage degraded system performance through system integration; and reduction of development and operations costs through the implementation of an intelligent real-time fault detection and fault management system and an information management system.
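The detect-isolate-reconfigure cycle described above can be pictured as a small control loop. The sketch below is purely illustrative: the limit-check detection rule, channel-to-unit mapping, and spare-swapping policy are assumptions for demonstration, not the system described in the abstract.

```python
# Minimal FDIR (fault detection, identification, reconfiguration) loop sketch.
# All names and the limit-based detection rule are illustrative assumptions.

def detect(readings, limits):
    """Flag channels whose readings fall outside their allowed limits."""
    return {ch for ch, v in readings.items()
            if not (limits[ch][0] <= v <= limits[ch][1])}

def isolate(flagged, channel_to_unit):
    """Map flagged sensor channels back to candidate faulty units."""
    return {channel_to_unit[ch] for ch in flagged}

def reconfigure(active_units, faulty, spares):
    """Swap each suspected-faulty unit for a spare, if one is available."""
    config = list(active_units)
    spares = list(spares)
    for i, unit in enumerate(config):
        if unit in faulty and spares:
            config[i] = spares.pop(0)
    return config

limits = {"bus_v": (24.0, 32.0), "temp": (-10.0, 50.0)}
channel_to_unit = {"bus_v": "PSU-A", "temp": "PSU-A"}
readings = {"bus_v": 19.5, "temp": 21.0}      # undervoltage on PSU-A

flagged = detect(readings, limits)
faulty = isolate(flagged, channel_to_unit)
new_config = reconfigure(["PSU-A"], faulty, spares=["PSU-B"])
print(flagged, faulty, new_config)
```

A real system would add persistence checks before declaring a fault, but the three-stage structure is the same.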
Managing Space System Faults: Coalescing NASA's Views
NASA Technical Reports Server (NTRS)
Muirhead, Brian; Fesq, Lorraine
2012-01-01
Managing faults and their resultant failures is a fundamental and critical part of developing and operating aerospace systems. Yet, recent studies have shown that the engineering "discipline" required to manage faults is neither widely recognized nor evenly practiced within the NASA community. Attempts simply to name this discipline in recent years have been fraught with controversy among members of the Integrated Systems Health Management (ISHM), Fault Management (FM), Fault Protection (FP), Hazard Analysis (HA), and Aborts communities. Approaches to managing space system faults are typically unique to each organization, with little commonality in the architectures, processes, and practices across the industry.
Fault management for data systems
NASA Technical Reports Server (NTRS)
Boyd, Mark A.; Iverson, David L.; Patterson-Hine, F. Ann
1993-01-01
Issues related to automating the process of fault management (fault diagnosis and response) for data management systems are considered. Substantial benefits are to be gained by successful automation of this process, particularly for large, complex systems. The use of graph-based models to develop a computer assisted fault management system is advocated. The general problem is described and the motivation behind choosing graph-based models over other approaches for developing fault diagnosis computer programs is outlined. Some existing work in the area of graph-based fault diagnosis is reviewed, and a new fault management method which was developed from existing methods is offered. Our method is applied to an automatic telescope system intended as a prototype for future lunar telescope programs. Finally, an application of our method to general data management systems is described.
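The graph-based diagnosis idea advocated above can be illustrated with a small sketch: model fault propagation as a digraph (an edge a → b meaning "a failure of a can produce a symptom at b") and find candidate root causes of an observed symptom by backward reachability. The component names and graph are invented for illustration, not taken from the paper's telescope application.

```python
# Graph-based fault diagnosis sketch: backward reachability over a fault
# propagation digraph. The example graph is an invented illustration.

def root_cause_candidates(propagation, symptom):
    """Return every node from which the symptom is reachable."""
    # Build the reverse graph, then walk it from the symptom.
    reverse = {}
    for src, dsts in propagation.items():
        for dst in dsts:
            reverse.setdefault(dst, set()).add(src)
    seen, stack = set(), [symptom]
    while stack:
        node = stack.pop()
        for parent in reverse.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

propagation = {
    "power_supply": {"motor_ctrl", "encoder"},
    "motor_ctrl": {"drive_motor"},
    "encoder": {"pointing_error"},
    "drive_motor": {"pointing_error"},
}
print(root_cause_candidates(propagation, "pointing_error"))
```

Additional observations would then prune this candidate set, which is the essence of model-based fault isolation.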
NASA Technical Reports Server (NTRS)
Johnson, Stephen B.; Ghoshal, Sudipto; Haste, Deepak; Moore, Craig
2017-01-01
This paper describes the theory and considerations in the application of metrics to measure the effectiveness of fault management. Fault management refers here to the operational aspect of system health management, and as such is considered as a meta-control loop that operates to preserve or maximize the system's ability to achieve its goals in the face of current or prospective failure. As a suite of control loops, the metrics to estimate and measure the effectiveness of fault management are similar to those of classical control loops in being divided into two major classes: state estimation, and state control. State estimation metrics can be classified into lower-level subdivisions for detection coverage, detection effectiveness, fault isolation and fault identification (diagnostics), and failure prognosis. State control metrics can be classified into response determination effectiveness and response effectiveness. These metrics are applied to each and every fault management control loop in the system, for each failure to which they apply, and probabilistically summed to determine the effectiveness of these fault management control loops to preserve the relevant system goals that they are intended to protect.
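One simple way to read "probabilistically summed" is to weight each control loop's estimation and control effectiveness by the probability of the failure it covers. The product model and the numbers below are illustrative assumptions, not the paper's actual formulation.

```python
# Illustrative roll-up of fault management effectiveness: each loop is
# (failure probability, P(detect), P(effective response)). The simple
# product model is an assumption for demonstration purposes.

def fm_effectiveness(loops):
    """Expected fraction of failure probability that FM successfully handles."""
    total_p = sum(p for p, _, _ in loops)
    handled = sum(p * p_det * p_resp for p, p_det, p_resp in loops)
    return handled / total_p

loops = [
    (0.02, 0.99, 0.95),   # e.g. sensor dropout: well covered
    (0.01, 0.90, 0.80),   # e.g. thruster stuck-open
    (0.005, 0.70, 0.60),  # e.g. rare processor upset: weak coverage
]
print(round(fm_effectiveness(loops), 4))
```

The breakdown by detection and response mirrors the paper's split between state estimation and state control metrics.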
On-board fault management for autonomous spacecraft
NASA Technical Reports Server (NTRS)
Fesq, Lorraine M.; Stephan, Amy; Doyle, Susan C.; Martin, Eric; Sellers, Suzanne
1991-01-01
The dynamic nature of the Cargo Transfer Vehicle's (CTV) mission and the high level of autonomy required mandate a complete fault management system capable of operating under uncertain conditions. Such a fault management system must take into account the current mission phase and the environment (including the target vehicle), as well as the CTV's state of health. This level of capability is beyond the scope of current on-board fault management systems. This presentation will discuss work in progress at TRW to apply artificial intelligence to the problem of on-board fault management. The goal of this work is to develop fault management systems that can meet the needs of spacecraft that have long-range autonomy requirements. We have implemented a model-based approach to fault detection and isolation that does not require explicit characterization of failures prior to launch. It is thus able to detect failures that were not considered in the failure modes and effects analysis. We have applied this technique to several different subsystems and tested our approach against both simulations and an electrical power system hardware testbed. We present findings from simulation and hardware tests which demonstrate the ability of our model-based system to detect and isolate failures, and describe our work in porting the Ada version of this system to a flight-qualified processor. We also discuss current research aimed at expanding our system to monitor the entire spacecraft.
NASA Technical Reports Server (NTRS)
Freeman, Kenneth A.; Walsh, Rick; Weeks, David J.
1988-01-01
Space Station issues in fault management are discussed. The system background is described with attention given to design guidelines and power hardware. A contractually developed fault management system, FRAMES, is integrated with the energy management functions, the control switchgear, and the scheduling and operations management functions. The constraints that shaped the FRAMES system and its implementation are considered.
Orion GN&C Fault Management System Verification: Scope And Methodology
NASA Technical Reports Server (NTRS)
Brown, Denise; Weiler, David; Flanary, Ronald
2016-01-01
In order to ensure long-term ability to meet mission goals and to provide for the safety of the public, ground personnel, and any crew members, nearly all spacecraft include a fault management (FM) system. For a manned vehicle such as Orion, the safety of the crew is of paramount importance. The goal of the Orion Guidance, Navigation and Control (GN&C) fault management system is to detect, isolate, and respond to faults before they can result in harm to the human crew or loss of the spacecraft. Verification of fault management/fault protection capability is challenging due to the large number of possible faults in a complex spacecraft, the inherent unpredictability of faults, the complexity of interactions among the various spacecraft components, and the inability to easily quantify human reactions to failure scenarios. The Orion GN&C Fault Detection, Isolation, and Recovery (FDIR) team has developed a methodology for bounding the scope of FM system verification while ensuring sufficient coverage of the failure space and providing high confidence that the fault management system meets all safety requirements. The methodology utilizes a swarm search algorithm to identify failure cases that can result in catastrophic loss of the crew or the vehicle and rare event sequential Monte Carlo to verify safety and FDIR performance requirements.
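The Monte Carlo side of the verification approach can be sketched in a toy form: draw random fault scenarios, simulate whether the fault response succeeds in each, and estimate the probability of loss. The scenario model below (detection time versus an available-time deadline) and its parameters are invented for illustration and bear no relation to Orion's actual failure space.

```python
# Toy Monte Carlo verification sketch: estimate P(loss) over randomly drawn
# fault scenarios. The timing model is an invented illustration.
import random

def simulate_once(rng):
    """Return True if the (toy) fault response saves the vehicle."""
    detect_time = rng.expovariate(1 / 2.0)   # mean 2 s to detect the fault
    deadline = rng.uniform(5.0, 15.0)        # time available before loss
    return detect_time + 1.0 <= deadline     # +1 s to execute the response

def estimate_loss_probability(trials, seed=0):
    rng = random.Random(seed)
    losses = sum(not simulate_once(rng) for _ in range(trials))
    return losses / trials

p_loss = estimate_loss_probability(100_000)
print(p_loss)
```

For genuinely rare events, plain sampling like this wastes most trials on uneventful runs, which is why the team's rare-event sequential Monte Carlo methods matter.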
Coordinated Fault-Tolerance for High-Performance Computing Final Project Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Panda, Dhabaleswar Kumar; Beckman, Pete
2011-07-28
With the Coordinated Infrastructure for Fault Tolerance Systems (CIFTS, as the original project came to be called) project, our aim has been to understand and tackle the following broad research questions, the answers to which will help the HEC community analyze and shape the direction of research in the field of fault tolerance and resiliency on future high-end leadership systems. Will availability of global fault information, obtained by fault information exchange between the different HEC software on a system, allow individual system software to better detect, diagnose, and adaptively respond to faults? If fault-awareness is raised throughout the system through fault information exchange, is it possible to get all system software working together to provide a more comprehensive end-to-end fault management on the system? What are the missing fault-tolerance features that widely used HEC system software lacks today that would inhibit such software from taking advantage of systemwide global fault information? What are the practical limitations of a systemwide approach for end-to-end fault management based on fault awareness and coordination? What mechanisms, tools, and technologies are needed to bring about fault awareness and coordination of responses on a leadership-class system? What standards, outreach, and community interaction are needed for adoption of the concept of fault awareness and coordination for fault management on future systems? Keeping our overall objectives in mind, the CIFTS team has taken a parallel fourfold approach. Our central goal was to design and implement a light-weight, scalable infrastructure with a simple, standardized interface to allow communication of fault-related information through the system and facilitate coordinated responses.
This work led to the development of the Fault Tolerance Backplane (FTB) publish-subscribe API specification, together with a reference implementation and several experimental implementations on top of existing publish-subscribe tools. We enhanced the intrinsic fault tolerance capabilities of representative implementations of a variety of key HPC software subsystems and integrated them with the FTB. Targeted subsystems included MPI communication libraries, checkpoint/restart libraries, resource managers and job schedulers, and system monitoring tools. Leveraging the aforementioned infrastructure, as well as developing and utilizing additional tools, we have examined issues associated with expanded, end-to-end fault response from both system and application viewpoints. From the standpoint of system operations, we have investigated log and root cause analysis, anomaly detection and fault prediction, and generalized notification mechanisms. Our applications work has included libraries for fault-tolerant linear algebra, application frameworks for coupled multiphysics applications, and external frameworks to support the monitoring and response for general applications. Our final goal was to engage the high-end computing community to increase awareness of tools and issues around coordinated end-to-end fault management.
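The backplane concept is easiest to see as a publish-subscribe pattern: components publish fault events on named topics, and any interested subscriber (scheduler, resource manager, monitor) receives them without the publisher knowing who is listening. The sketch below illustrates the pattern only; it is not the actual FTB API.

```python
# Minimal publish-subscribe sketch in the spirit of a fault-information
# backplane. Topic names and the event schema are invented illustrations.

class Backplane:
    def __init__(self):
        self._subs = {}

    def subscribe(self, topic, callback):
        """Register a callback to receive all events on `topic`."""
        self._subs.setdefault(topic, []).append(callback)

    def publish(self, topic, event):
        """Deliver `event` to every subscriber of `topic`."""
        for cb in self._subs.get(topic, []):
            cb(event)

bp = Backplane()
seen = []
bp.subscribe("fault.node", lambda e: seen.append(("scheduler", e)))
bp.subscribe("fault.node", lambda e: seen.append(("monitor", e)))
bp.publish("fault.node", {"node": "n42", "kind": "link_down"})
print(seen)
```

Decoupling publishers from subscribers is what lets, say, a job scheduler react to a fault first reported by a network library.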
Fault recovery characteristics of the fault tolerant multi-processor
NASA Technical Reports Server (NTRS)
Padilla, Peter A.
1990-01-01
The fault handling performance of the fault tolerant multiprocessor (FTMP) was investigated. Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles Byzantine or lying faults. It is pointed out that these weak areas in the FTMP's design increase the probability that, for any hardware fault, a good LRU (line replaceable unit) is mistakenly disabled by the fault management software. It is concluded that fault injection can help detect and analyze the behavior of a system in the ultra-reliable regime. Although fault injection testing cannot be exhaustive, it has been demonstrated that it provides a unique capability to unmask problems and to characterize the behavior of a fault-tolerant system.
NASA Technical Reports Server (NTRS)
Rogers, William H.
1993-01-01
In rare instances, flight crews of commercial aircraft must manage complex systems faults in addition to all their normal flight tasks. Pilot errors in fault management have been attributed, at least in part, to an incomplete or inaccurate awareness of the fault situation. The current study is part of a program aimed at assuring that the types of information potentially available from an intelligent fault management aiding concept developed at NASA Langley called 'Faultfinder' (see Abbott, Schutte, Palmer, and Ricks, 1987) are an asset rather than a liability: additional information should improve pilot performance and aircraft safety, but it should not confuse, distract, overload, mislead, or generally exacerbate already difficult circumstances.
Fault management and systems knowledge
DOT National Transportation Integrated Search
2016-12-01
Pilots are asked to manage faults during flight operations. This leads to the training question of the type and depth of system knowledge required to respond to these faults. Based on discussions with multiple airline operators, there is agreement th...
Automated Generation of Fault Management Artifacts from a Simple System Model
NASA Technical Reports Server (NTRS)
Kennedy, Andrew K.; Day, John C.
2013-01-01
Our understanding of off-nominal behavior - failure modes and fault propagation - in complex systems is often based purely on engineering intuition; specific cases are assessed in an ad hoc fashion as a (fallible) fault management engineer sees fit. This work is an attempt to provide a more rigorous approach to this understanding and assessment by automating the creation of a fault management artifact, the Failure Modes and Effects Analysis (FMEA), through querying a representation of the system in a SysML model. This work builds on the previous development of an off-nominal behavior model for the upcoming Soil Moisture Active-Passive (SMAP) mission at the Jet Propulsion Laboratory. We further developed the previous system model to more fully incorporate the ideas of State Analysis, and it was restructured in an organizational hierarchy that models the system as layers of control systems while also incorporating the concept of "design authority". We present software that was developed to traverse the elements and relationships in this model to automatically construct an FMEA spreadsheet. We further discuss extending this model to automatically generate other typical fault management artifacts, such as Fault Trees, in order to portray system behavior efficiently and depend less on the intuition of fault management engineers to ensure complete examination of off-nominal behavior.
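The core move, traversing a system model to emit FMEA rows, can be sketched with a toy model. The dict below stands in for the SysML representation; the component names, failure modes, and "feeds" relationship are invented for illustration.

```python
# Sketch of auto-generating an FMEA table by traversing a simple system
# model. The dict-based model and its contents are invented illustrations.
import csv
import io

model = {
    "battery": {"modes": ["cell short", "open circuit"], "feeds": ["power_bus"]},
    "power_bus": {"modes": ["overvoltage"], "feeds": ["flight_computer"]},
    "flight_computer": {"modes": ["processor halt"], "feeds": []},
}

def downstream(component):
    """Collect every component reachable from `component` (the effect scope)."""
    seen, queue = [], list(model[component]["feeds"])
    while queue:
        c = queue.pop(0)
        if c not in seen:
            seen.append(c)
            queue.extend(model[c]["feeds"])
    return seen

def fmea_rows():
    """Yield one FMEA row per (component, failure mode) pair."""
    for comp, info in model.items():
        effects = ", ".join(downstream(comp)) or "local only"
        for mode in info["modes"]:
            yield {"Component": comp, "Failure Mode": mode, "Effects on": effects}

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["Component", "Failure Mode", "Effects on"])
writer.writeheader()
writer.writerows(fmea_rows())
print(buf.getvalue())
```

Because the rows are derived mechanically from the model, a model change regenerates a consistent FMEA instead of relying on an engineer to update the spreadsheet by hand.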
Model Transformation for a System of Systems Dependability Safety Case
NASA Technical Reports Server (NTRS)
Murphy, Judy; Driskell, Stephen B.
2010-01-01
Software plays an increasingly large role in all aspects of NASA's science missions. This has been extended to the identification, management, and control of faults which affect safety-critical functions and, by default, the overall success of the mission. Traditionally, the analyses of fault identification, management, and control are hardware based. Due to the increasing complexity of systems, there has been a corresponding increase in the complexity of fault management software. The NASA Independent Verification & Validation (IV&V) program is creating processes and procedures to identify and incorporate safety-critical software requirements along with corresponding software faults so that potential hazards may be mitigated. This paper, "Specific to Generic ... A Case for Reuse," describes the phases of a dependability and safety study which identifies a new process to create a foundation for reusable assets. These assets support the identification and management of specific software faults and their transformation from specific to generic software faults. This approach also has applications to other systems outside of the NASA environment. This paper addresses how a mission-specific dependability and safety case is being transformed to a generic dependability and safety case which can be reused for any type of space mission, with an emphasis on software fault conditions.
Health management and controls for Earth-to-orbit propulsion systems
NASA Astrophysics Data System (ADS)
Bickford, R. L.
1995-03-01
Avionics and health management technologies increase the safety and reliability while decreasing the overall cost for Earth-to-orbit (ETO) propulsion systems. New ETO propulsion systems will depend on highly reliable fault tolerant flight avionics, advanced sensing systems and artificial intelligence aided software to ensure critical control, safety and maintenance requirements are met in a cost effective manner. Propulsion avionics consist of the engine controller, actuators, sensors, software and ground support elements. In addition to control and safety functions, these elements perform system monitoring for health management. Health management is enhanced by advanced sensing systems and algorithms which provide automated fault detection and enable adaptive control and/or maintenance approaches. Aerojet is developing advanced fault tolerant rocket engine controllers which provide very high levels of reliability. Smart sensors and software systems which significantly enhance fault coverage and enable automated operations are also under development. Smart sensing systems, such as flight capable plume spectrometers, have reached maturity in ground-based applications and are suitable for bridging to flight. Software to detect failed sensors has reached similar maturity. This paper will discuss fault detection and isolation for advanced rocket engine controllers as well as examples of advanced sensing systems and software which significantly improve component failure detection for engine system safety and health management.
NASA Technical Reports Server (NTRS)
Padilla, Peter A.
1991-01-01
An investigation was made in AIRLAB of the fault handling performance of the Fault Tolerant MultiProcessor (FTMP). Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once in every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles Byzantine or lying faults. Byzantine faults behave such that the faulted unit points to a working unit as the source of errors. The design's problems involve: (1) the design and interface between the simplex error detection hardware and the error processing software, (2) the functional capabilities of the FTMP system bus, and (3) the communication requirements of a multiprocessor architecture. These weak areas in the FTMP's design increase the probability that, for any hardware fault, a good line replacement unit (LRU) is mistakenly disabled by the fault management software.
2009-09-01
this information supports the decision-making process as it is applied to the management of risk. 2. Operational Risk. Operational risk is the threat... reasonability. However, to make a software system fault tolerant, the system needs to recognize and fix a system state condition. To detect a fault, a fault...
ARGES: an Expert System for Fault Diagnosis Within Space-Based ECLS Systems
NASA Technical Reports Server (NTRS)
Pachura, David W.; Suleiman, Salem A.; Mendler, Andrew P.
1988-01-01
ARGES (Atmospheric Revitalization Group Expert System) is a demonstration prototype expert system for fault management for the Solid Amine, Water Desorbed (SAWD) CO2 removal assembly, associated with the Environmental Control and Life Support (ECLS) System. ARGES monitors and reduces data in real time from either the SAWD controller or a simulation of the SAWD assembly. It can detect gradual degradations or predict failures. This allows graceful shutdown and scheduled maintenance, which reduces crew maintenance overhead. Status and fault information is presented in a user interface that simulates what would be seen by a crewperson. The user interface employs animated color graphics and an object-oriented approach to provide detailed status information, fault identification, and explanation of reasoning in a rapidly assimilated manner. In addition, ARGES recommends possible courses of action for predicted and actual faults. ARGES is seen as a forerunner of AI-based fault management systems for manned space systems.
Survivable algorithms and redundancy management in NASA's distributed computing systems
NASA Technical Reports Server (NTRS)
Malek, Miroslaw
1992-01-01
The design of survivable algorithms requires a solid foundation for executing them. While hardware techniques for fault-tolerant computing are relatively well understood, fault-tolerant operating systems, as well as fault-tolerant applications (survivable algorithms), are, by contrast, little understood, and much more work in this field is required. We outline some of our work that contributes to the foundation of ultrareliable operating systems and fault-tolerant algorithm design. We introduce our consensus-based framework for fault-tolerant system design. This is followed by a description of a hierarchical partitioning method for efficient consensus. A scheduler for redundancy management is introduced, and application-specific fault tolerance is described. We give an overview of our hybrid algorithm technique, which is an alternative to the formal approach given.
Analytical Approaches to Guide SLS Fault Management (FM) Development
NASA Technical Reports Server (NTRS)
Patterson, Jonathan D.
2012-01-01
Extensive analysis is needed to determine the right set of FM capabilities to provide the most coverage without significantly increasing the cost, false-positive/false-negative (FP/FN) rates, and complexity of the overall vehicle systems. Strong collaboration with the stakeholders is required to support the determination of the best triggers and response options. The SLS Fault Management process has been documented in the Space Launch System Program (SLSP) Fault Management Plan (SLS-PLAN-085).
Redundancy management for efficient fault recovery in NASA's distributed computing system
NASA Technical Reports Server (NTRS)
Malek, Miroslaw; Pandya, Mihir; Yau, Kitty
1991-01-01
The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance.
NASA Technical Reports Server (NTRS)
Rogers, William H.; Schutte, Paul C.
1993-01-01
Advanced fault management aiding concepts for commercial pilots are being developed in a research program at NASA Langley Research Center. One aim of this program is to re-evaluate current design principles for display of fault information to the flight crew: (1) from a cognitive engineering perspective and (2) in light of the availability of new types of information generated by advanced fault management aids. The study described in this paper specifically addresses principles for organizing fault information for display to pilots based on their mental models of fault management.
Implementation of Integrated System Fault Management Capability
NASA Technical Reports Server (NTRS)
Figueroa, Fernando; Schmalzel, John; Morris, Jon; Smith, Harvey; Turowski, Mark
2008-01-01
Fault management supports the rocket engine test mission with highly reliable and accurate measurements while improving availability and lifecycle costs. Core elements: an architecture, taxonomy, and ontology (ATO) for data, information, and knowledge (DIaK) management; intelligent sensor processes; intelligent element processes; intelligent controllers; intelligent subsystem processes; intelligent system processes; and intelligent component processes.
Formal Validation of Fault Management Design Solutions
NASA Technical Reports Server (NTRS)
Gibson, Corrina; Karban, Robert; Andolfato, Luigi; Day, John
2013-01-01
The work presented in this paper describes an approach used to develop SysML modeling patterns to express the behavior of fault protection, test the model's logic by performing fault injection simulations, and verify the fault protection system's logical design via model checking. A representative example, using a subset of the fault protection design for the Soil Moisture Active-Passive (SMAP) system, was modeled with SysML State Machines and JavaScript as Action Language. The SysML model captures interactions between relevant system components and system behavior abstractions (mode managers, error monitors, fault protection engine, and devices/switches). Development of a method to implement verifiable and lightweight executable fault protection models enables future missions to have access to larger fault test domains and verifiable design patterns. A tool-chain to transform the SysML model to jpf-Statechart compliant Java code and then verify the generated code via model checking was established. Conclusions and lessons learned from this work are also described, as well as potential avenues for further research and development.
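The idea of an executable fault protection model that can be exercised by fault injection can be shown in miniature: encode the state machine as a transition table, drive it with injected event sequences, and check a safety property on every resulting trace. The states, events, and property below are invented illustrations, not the SMAP design (which used SysML State Machines and full model checking rather than trace testing).

```python
# Toy executable fault-protection state machine plus fault-injection runs.
# States, events, and the safety property are invented illustrations.

TRANSITIONS = {
    ("NOMINAL", "error_monitor_trip"): "DIAGNOSING",
    ("DIAGNOSING", "fault_isolated"): "SAFING",
    ("DIAGNOSING", "false_alarm"): "NOMINAL",
    ("SAFING", "safe_config_reached"): "SAFE",
}

def run(events, state="NOMINAL"):
    """Drive the machine with an event sequence; return the state trace."""
    trace = [state]
    for ev in events:
        state = TRANSITIONS.get((state, ev), state)  # ignore impossible events
        trace.append(state)
    return trace

def property_holds(trace):
    """Safety property: never step directly from SAFING back to NOMINAL."""
    return not any(a == "SAFING" and b == "NOMINAL"
                   for a, b in zip(trace, trace[1:]))

injected = [
    ["error_monitor_trip", "fault_isolated", "safe_config_reached"],
    ["error_monitor_trip", "false_alarm"],
    ["error_monitor_trip", "fault_isolated", "error_monitor_trip"],
]
assert all(property_holds(run(seq)) for seq in injected)
print(run(injected[0]))
```

A model checker differs from this in that it explores all reachable traces exhaustively rather than a hand-picked set, which is what makes the verification claim stronger.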
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lumsdaine, Andrew
2013-03-08
The main purpose of the Coordinated Infrastructure for Fault Tolerance in Systems initiative has been to conduct research with a goal of providing end-to-end fault tolerance on a systemwide basis for applications and other system software. While fault tolerance has been an integral part of most high-performance computing (HPC) system software developed over the past decade, it has been treated mostly as a collection of isolated stovepipes. Visibility and response to faults has typically been limited to the particular hardware and software subsystems in which they are initially observed. Little fault information is shared across subsystems, allowing little flexibility or control on a system-wide basis, making it practically impossible to provide cohesive end-to-end fault tolerance in support of scientific applications. As an example, consider faults such as communication link failures that can be seen by a network library but are not directly visible to the job scheduler, or consider faults related to node failures that can be detected by system monitoring software but are not inherently visible to the resource manager. If information about such faults could be shared by the network libraries or monitoring software, then other system software, such as a resource manager or job scheduler, could ensure that failed nodes or failed network links were excluded from further job allocations and that further diagnosis could be performed. As a founding member and one of the lead developers of the Open MPI project, our efforts over the course of this project have been focused on making Open MPI more robust to failures by supporting various fault tolerance techniques, and using fault information exchange and coordination between MPI and the HPC system software stack from the application, numeric libraries, and programming language runtime to other common system components such as job schedulers, resource managers, and monitoring tools.
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Schreckenghost, Debra L.; Woods, David D.; Potter, Scott S.; Johannesen, Leila; Holloway, Matthew; Forbus, Kenneth D.
1991-01-01
Initial results are reported from a multi-year, interdisciplinary effort to provide guidance and assistance for designers of intelligent systems and their user interfaces. The objective is to achieve more effective human-computer interaction (HCI) for systems with real time fault management capabilities. Intelligent fault management systems within the NASA were evaluated for insight into the design of systems with complex HCI. Preliminary results include: (1) a description of real time fault management in aerospace domains; (2) recommendations and examples for improving intelligent systems design and user interface design; (3) identification of issues requiring further research; and (4) recommendations for a development methodology integrating HCI design into intelligent system design.
Health management and controls for earth to orbit propulsion systems
NASA Technical Reports Server (NTRS)
Bickford, R. L.
1992-01-01
Fault detection and isolation for advanced rocket engine controllers are discussed focusing on advanced sensing systems and software which significantly improve component failure detection for engine safety and health management. Aerojet's Space Transportation Main Engine controller for the National Launch System is the state of the art in fault tolerant engine avionics. Health management systems provide high levels of automated fault coverage and significantly improve vehicle delivered reliability and lower preflight operations costs. Key technologies, including the sensor data validation algorithms and flight capable spectrometers, have been demonstrated in ground applications and are found to be suitable for bridging programs into flight applications.
Operations management system advanced automation: Fault detection isolation and recovery prototyping
NASA Technical Reports Server (NTRS)
Hanson, Matt
1990-01-01
The purpose of this project is to address the global fault detection, isolation and recovery (FDIR) requirements for Operations Management System (OMS) automation within the Space Station Freedom program. This shall be accomplished by developing a selected FDIR prototype for the Space Station Freedom distributed processing systems. The prototype shall be based on advanced automation methodologies in addition to traditional software methods to meet the requirements for automation. A secondary objective is to expand the scope of the prototyping to encompass multiple aspects of station-wide fault management (SWFM) as discussed in OMS requirements documentation.
Automatic Fault Characterization via Abnormality-Enhanced Classification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bronevetsky, G; Laguna, I; de Supinski, B R
Enterprise and high-performance computing systems are growing extremely large and complex, employing hundreds to hundreds of thousands of processors and software/hardware stacks built by many people across many organizations. As the growing scale of these machines increases the frequency of faults, system complexity makes these faults difficult to detect and to diagnose. Current system management techniques, which focus primarily on efficient data access and query mechanisms, require system administrators to examine the behavior of various system services manually. Growing system complexity is making this manual process unmanageable: administrators require more effective management tools that can detect faults and help to identify their root causes. System administrators need timely notification when a fault is manifested that includes the type of fault, the time period in which it occurred and the processor on which it originated. Statistical modeling approaches can accurately characterize system behavior. However, the complex effects of system faults make these tools difficult to apply effectively. This paper investigates the application of classification and clustering algorithms to fault detection and characterization. We show experimentally that naively applying these methods achieves poor accuracy. Further, we design novel techniques that combine classification algorithms with information on the abnormality of application behavior to improve detection and characterization accuracy. Our experiments demonstrate that these techniques can detect and characterize faults with 65% accuracy, compared to just 5% accuracy for naive approaches.
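One simple take on "abnormality-enhanced" classification is to convert raw metrics into abnormality scores (deviation from a nominal baseline, in baseline standard deviations) before classifying, here with a nearest-centroid rule. The data, fault classes, and scoring below are assumptions for illustration, not the paper's actual techniques.

```python
# Sketch: classify faults on abnormality features rather than raw metrics.
# Baseline statistics, centroids, and fault classes are invented examples.

def abnormality(sample, mean, std):
    """Deviation of each metric from its nominal baseline, in baseline stddevs."""
    return [abs(x - m) / s for x, m, s in zip(sample, mean, std)]

def nearest_centroid(features, centroids):
    """Return the fault class whose centroid is closest in feature space."""
    return min(centroids, key=lambda label: sum(
        (f - c) ** 2 for f, c in zip(features, centroids[label])))

# Nominal baseline for two metrics: CPU load, message latency (ms).
mean, std = [0.5, 10.0], [0.1, 2.0]

# Centroids of abnormality vectors observed for known fault classes.
centroids = {
    "cpu_hog":      [6.0, 0.5],   # CPU abnormality dominates
    "network_slow": [0.5, 6.0],   # latency abnormality dominates
}

sample = [0.52, 24.0]             # latency far outside its baseline
features = abnormality(sample, mean, std)
print(nearest_centroid(features, centroids))
```

Normalizing by the nominal baseline keeps metrics with large raw magnitudes from drowning out small but significant deviations, which is the intuition behind using abnormality information at all.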
NASA Technical Reports Server (NTRS)
Hayashi, Miwa; Ravinder, Ujwala; McCann, Robert S.; Beutter, Brent; Spirkovska, Lily
2009-01-01
Performance enhancements associated with selected forms of automation were quantified in a recent human-in-the-loop evaluation of two candidate operational concepts for fault management on next-generation spacecraft. The baseline concept, called Elsie, featured a full-suite of "soft" fault management interfaces. However, operators were forced to diagnose malfunctions with minimal assistance from the standalone caution and warning system. The other concept, called Besi, incorporated a more capable C&W system with an automated fault diagnosis capability. Results from analyses of participants' eye movements indicate that the greatest empirical benefit of the automation stemmed from eliminating the need for text processing on cluttered, text-rich displays.
NASA Technical Reports Server (NTRS)
Yates, Amy M.; Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Gonzalez, Oscar R.; Gray, W. Steven
2010-01-01
Safety-critical distributed flight control systems require robustness in the presence of faults. In general, these systems consist of a number of input/output (I/O) and computation nodes interacting through a fault-tolerant data communication system. The communication system transfers sensor data and control commands and can handle most faults under typical operating conditions. However, the performance of the closed-loop system can be adversely affected as a result of operating in harsh environments. In particular, High-Intensity Radiated Field (HIRF) environments have the potential to cause random fault manifestations in individual avionic components and to generate simultaneous system-wide communication faults that overwhelm existing fault management mechanisms. This paper presents the design of an experiment conducted at the NASA Langley Research Center's HIRF Laboratory to statistically characterize the faults that a HIRF environment can trigger on a single node of a distributed flight control system.
NASA Astrophysics Data System (ADS)
Xu, Jiuping; Zhong, Zhengqiang; Xu, Lei
2015-10-01
In this paper, an integrated system health management-oriented adaptive fault diagnostics model for avionics is proposed. With avionics becoming increasingly complicated, precise and comprehensive avionics fault diagnostics has become an extremely complicated task. For the proposed fault diagnostic system, specific approaches, such as the artificial immune system, the intelligent agents system and the Dempster-Shafer evidence theory, are used to conduct deep avionics fault diagnostics. Through this proposed fault diagnostic system, efficient and accurate diagnostics can be achieved. A numerical example is conducted to apply the proposed hybrid diagnostics to a set of radar transmitters on an avionics system and to illustrate that the proposed system and model have the ability to achieve efficient and accurate fault diagnostics. By analyzing the diagnostic system's feasibility and pragmatics, the advantages of this system are demonstrated.
NASA Spacecraft Fault Management Workshop Results
NASA Technical Reports Server (NTRS)
Newhouse, Marilyn; McDougal, John; Barley, Bryan; Fesq, Lorraine; Stephens, Karen
2010-01-01
Fault Management is a critical aspect of deep-space missions. For the purposes of this paper, fault management is defined as the ability of a system to detect, isolate, and mitigate events that impact, or have the potential to impact, nominal mission operations. The fault management capabilities are commonly distributed across flight and ground subsystems, impacting hardware, software, and mission operations designs. The National Aeronautics and Space Administration (NASA) Discovery & New Frontiers (D&NF) Program Office at Marshall Space Flight Center (MSFC) recently studied cost overruns and schedule delays for 5 missions. The goal was to identify the underlying causes for the overruns and delays, and to develop practical mitigations to assist the D&NF projects in identifying potential risks and controlling the associated impacts to proposed mission costs and schedules. The study found that 4 out of the 5 missions studied had significant overruns due to underestimating the complexity and support requirements for fault management. As a result of this and other recent experiences, the NASA Science Mission Directorate (SMD) Planetary Science Division (PSD) commissioned a workshop to bring together invited participants across government, industry, and academia to assess the state of the art in fault management practice and research, identify current and potential issues, and make recommendations for addressing these issues. The workshop was held in New Orleans in April of 2008. The workshop concluded that fault management is not being limited by technology, but rather by a lack of emphasis and discipline in both the engineering and programmatic dimensions.
Some of the areas cited in the findings include different, conflicting, and changing institutional goals and risk postures; unclear ownership of end-to-end fault management engineering; inadequate understanding of the impact of mission-level requirements on fault management complexity; and practices, processes, and tools that have not kept pace with the increasing complexity of mission requirements and spacecraft systems. This paper summarizes the findings and recommendations from that workshop, as well as opportunities identified for future investment in tools, processes, and products to facilitate the development of space flight fault management capabilities.
Technologies for unattended network operations
NASA Technical Reports Server (NTRS)
Jaworski, Allan; Odubiyi, Jide; Holdridge, Mark; Zuzek, John
1991-01-01
The necessary network management functions for a telecommunications, navigation and information management (TNIM) system in the framework of an extension of the ISO model for communications network management are described. Various technologies that could substantially reduce the need for TNIM network management, automate manpower intensive functions, and deal with synchronization and control at interplanetary distances are presented. Specific technologies addressed include the use of the ISO Common Management Interface Protocol, distributed artificial intelligence for network synchronization and fault management, and fault-tolerant systems engineering.
NASA Technical Reports Server (NTRS)
Ashworth, Barry R.
1989-01-01
A description is given of the SSM/PMAD power system automation testbed, which was developed using a systems engineering approach. The architecture includes a knowledge-based system and has been successfully used in power system management and fault diagnosis. Architectural issues which affect overall system activities and performance are examined. The knowledge-based system is discussed along with its associated automation implications, and interfaces throughout the system are presented.
Reliability of Fault Tolerant Control Systems. Part 2
NASA Technical Reports Server (NTRS)
Wu, N. Eva
2000-01-01
This paper reports Part II of a two-part effort that is intended to delineate the relationship between reliability and fault tolerant control in a quantitative manner. Reliability properties peculiar to fault-tolerant control systems are emphasized, such as the presence of analytic redundancy in high proportion, the dependence of failures on control performance, and high risks associated with decisions in redundancy management due to multiple sources of uncertainties and sometimes large processing requirements. As a consequence, coverage of failures through redundancy management can be severely limited. The paper proposes to formulate the fault tolerant control problem as an optimization problem that maximizes coverage of failures through redundancy management. Coverage modeling is attempted in a way that captures its dependence on the control performance and on the diagnostic resolution. Under the proposed redundancy management policy, it is shown that an enhanced overall system reliability can be achieved with a control law of a superior robustness, with an estimator of a higher resolution, and with a control performance requirement of a lesser stringency.
Failure detection and fault management techniques for flush airdata sensing systems
NASA Technical Reports Server (NTRS)
Whitmore, Stephen A.; Moes, Timothy R.; Leondes, Cornelius T.
1992-01-01
Methods based on chi-squared analysis are presented for detecting system and individual-port failures in the high-angle-of-attack flush airdata sensing system on the NASA F-18 High Alpha Research Vehicle. The HI-FADS hardware is introduced, and the aerodynamic model describes measured pressure in terms of dynamic pressure, angle of attack, angle of sideslip, and static pressure. Chi-squared analysis is described in the presentation of the concept for failure detection and fault management, which includes nominal, iteration, and fault-management modes. A matrix of pressure orifices arranged in concentric circles on the nose of the aircraft indicates the parameters which are applied to the regression algorithms. The sensing techniques are applied to the F-18 flight data, and two examples are given of the computed angle-of-attack time histories. The failure-detection and fault-management techniques permit the matrix to be multiply redundant, and the chi-squared analysis is shown to be useful in the detection of failures.
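The chi-squared failure-detection concept described above can be sketched in a few lines; the pressure values, shared measurement sigma, and threshold below are hypothetical placeholders, not the HI-FADS implementation, which fits a full aerodynamic model to the redundant port matrix:

```python
def chi_squared(measured, predicted, sigma):
    """Chi-squared statistic of measured port pressures against model fit."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / sigma ** 2

def detect_failed_port(measured, predicted, sigma, threshold):
    """If the fit fails the chi-squared test, flag the port with the largest
    residual as the failed one; with a multiply redundant port matrix, that
    port can be dropped and the model re-fit on the remaining ports.

    Returns the index of the suspect port, or None if the fit is acceptable.
    """
    if chi_squared(measured, predicted, sigma) <= threshold:
        return None
    residuals = [abs(m - p) for m, p in zip(measured, predicted)]
    return residuals.index(max(residuals))
```

The iteration mode in the abstract corresponds to repeating this drop-and-refit step until the chi-squared statistic falls below the threshold or redundancy is exhausted.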
NASA Technical Reports Server (NTRS)
Lee, S. C.; Lollar, Louis F.
1988-01-01
The overall approach currently being taken in the development of AMPERES (Autonomously Managed Power System Extendable Real-time Expert System), a knowledge-based expert system for fault monitoring and diagnosis of space power systems, is discussed. The system architecture, knowledge representation, and fault monitoring and diagnosis strategy are examined. A 'component-centered' approach developed in this project is described. Critical issues requiring further study are identified.
Fault Management Design Strategies
NASA Technical Reports Server (NTRS)
Day, John C.; Johnson, Stephen B.
2014-01-01
Development of dependable systems relies on the ability of the system to determine and respond to off-nominal system behavior. Specification and development of these fault management capabilities must be done in a structured and principled manner to improve our understanding of these systems, and to make significant gains in dependability (safety, reliability and availability). Prior work has described a fundamental taxonomy and theory of System Health Management (SHM), and of its operational subset, Fault Management (FM). This conceptual foundation provides a basis to develop a framework to design and implement FM design strategies that protect mission objectives and account for system design limitations. Selection of an SHM strategy has implications for the functions required to perform the strategy, and it places constraints on the set of possible design solutions. The framework developed in this paper provides a rigorous and principled approach to classifying SHM strategies, as well as methods for determination and implementation of SHM strategies. An illustrative example is used to describe the application of the framework and the resulting benefits to system and FM design and dependability.
NASA Technical Reports Server (NTRS)
Melcher, Kevin J.; Sowers, T. Shane; Maul, William A.
2005-01-01
The constraints of future Exploration Missions will require unique Integrated System Health Management (ISHM) capabilities throughout the mission. An ambitious launch schedule, human-rating requirements, long quiescent periods, limited human access for repair or replacement, and long communication delays all require an ISHM system that can span distinct yet interdependent vehicle subsystems, anticipate failure states, provide autonomous remediation, and support the Exploration Mission from beginning to end. NASA Glenn Research Center has developed and applied health management system technologies to aerospace propulsion systems for almost two decades. Lessons learned from past activities help define the approach to proper ISHM development: sensor selection - identifies sensor sets required for accurate health assessment; data qualification and validation - ensures the integrity of measurement data from sensor to data system; fault detection and isolation - uses measurements in a component/subsystem context to detect faults and identify their point of origin; information fusion and diagnostic decision criteria - aligns data from similar and disparate sources in time and uses that data to perform higher-level system diagnosis; and verification and validation - uses data, real or simulated, to provide variable exposure to the diagnostic system for faults that may only manifest themselves in actual implementation, as well as faults that are detectable via hardware testing. This presentation describes a framework for developing health management systems and highlights the health management research activities performed by the Controls and Dynamics Branch at the NASA Glenn Research Center. It illustrates how those activities contribute to the development of solutions for Integrated System Health Management.
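The staged pipeline the abstract lists (data qualification, fault detection, fault isolation) can be sketched as below; the sensor names, limits, topology, and deliberately naive shared-component isolation rule are all illustrative assumptions, not NASA Glenn's actual methods:

```python
def qualify(readings, lo, hi):
    """Data qualification: discard readings outside plausible physical limits."""
    return {s: v for s, v in readings.items() if lo <= v <= hi}

def detect(readings, expected, tol):
    """Fault detection: flag sensors deviating from model-expected values."""
    return [s for s, v in readings.items() if abs(v - expected[s]) > tol]

def isolate(flags, topology):
    """Fault isolation: name the upstream component(s) shared by every
    flagged sensor (a naive fusion rule standing in for real diagnosis)."""
    candidates = [set(topology[s]) for s in flags]
    common = set.intersection(*candidates) if candidates else set()
    return sorted(common)

# Hypothetical two-sensor feed line: p3 returns an out-of-range value,
# p1 drifts away from its model prediction, and both sensors sit
# downstream of the same pump.
readings = qualify({"p1": 310.0, "p2": 298.0, "p3": -999.0}, 0.0, 1000.0)
flags = detect(readings, {"p1": 300.0, "p2": 300.0}, 5.0)
suspects = isolate(flags, {"p1": ["pump"], "p2": ["pump", "valve"]})
```

Each stage consumes only the output of the previous one, which is the layering the lessons-learned list above is arguing for.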
Solar Photovoltaic (PV) Distributed Generation Systems - Control and Protection
NASA Astrophysics Data System (ADS)
Yi, Zhehan
This dissertation proposes a comprehensive control, power management, and fault detection strategy for solar photovoltaic (PV) distribution generations. Battery storages are typically employed in PV systems to mitigate the power fluctuation caused by unstable solar irradiance. With AC and DC loads, a PV-battery system can be treated as a hybrid microgrid which contains both DC and AC power resources and buses. In this thesis, a control and power management system (CAPMS) for a PV-battery hybrid microgrid is proposed, which provides 1) the DC and AC bus voltage and AC frequency regulating scheme and controllers designed to track set points; 2) a power flow management strategy in the hybrid microgrid to achieve system generation and demand balance in both grid-connected and islanded modes; 3) smooth transition control during grid reconnection by frequency and phase synchronization control between the main grid and microgrid. Due to the increasing demands for PV power, scales of PV systems are getting larger and fault detection in PV arrays becomes challenging. High-impedance faults, low-mismatch faults, and faults occurring in low irradiance conditions tend to be hidden due to low fault currents, particularly when a PV maximum power point tracking (MPPT) algorithm is in-service. If they remain undetected, these faults can considerably lower the output energy of solar systems, damage the panels, and potentially cause fire hazards. In this dissertation, fault detection challenges in PV arrays are analyzed in depth, considering the crossing relations among the characteristics of PV, interactions with MPPT algorithms, and the nature of solar irradiance. Two fault detection schemes are then designed as attempts to address these technical issues, which detect faults inside PV arrays accurately even under challenging circumstances, e.g., faults in low irradiance conditions or high-impedance faults.
Taking advantage of multi-resolution signal decomposition (MSD), a powerful signal processing technique based on discrete wavelet transformation (DWT), the first attempt is devised, which extracts the features of both line-to-line (L-L) and line-to-ground (L-G) faults and employs a fuzzy inference system (FIS) for the decision-making stage of fault detection. This scheme is then improved as the second attempt by further studying the system's behaviors during L-L faults, extracting more efficient fault features, and devising a more advanced decision-making stage: the two-stage support vector machine (SVM). For the first time, the two-stage SVM method is proposed in this dissertation to detect L-L faults in PV systems with satisfactory accuracy. Numerous simulation and experimental case studies are carried out to verify the proposed control and protection strategies. The simulation environment is set up using the PSCAD/EMTDC and Matlab/Simulink software packages. Experimental case studies are conducted in a PV-battery hybrid microgrid using the dSPACE real-time controller to demonstrate the ease of hardware implementation and the controller performance. Another small-scale grid-connected PV system is set up to verify both fault detection algorithms, which demonstrate promising performance and fault detection accuracy.
Modeling Off-Nominal Behavior in SysML
NASA Technical Reports Server (NTRS)
Day, John C.; Donahue, Kenneth; Ingham, Michel; Kadesch, Alex; Kennedy, Andrew K.; Post, Ethan
2012-01-01
Specification and development of fault management functionality in systems is performed in an ad hoc way - more of an art than a science. Improvements to system reliability, availability, safety and resilience will be limited without infusion of additional formality into the practice of fault management. Key to the formalization of fault management is a precise representation of off-nominal behavior. Using the upcoming Soil Moisture Active-Passive (SMAP) mission for source material, we have modeled the off-nominal behavior of the SMAP system during its initial spin-up activity, using the System Modeling Language (SysML). In the course of developing these models, we have developed generic patterns for capturing off-nominal behavior in SysML. We show how these patterns provide useful ways of reasoning about the system (e.g., checking for completeness and effectiveness) and allow the automatic generation of typical artifacts (e.g., success trees and FMECAs) used in system analyses.
A System for Fault Management and Fault Consequences Analysis for NASA's Deep Space Habitat
NASA Technical Reports Server (NTRS)
Colombano, Silvano; Spirkovska, Liljana; Baskaran, Vijaykumar; Aaseng, Gordon; McCann, Robert S.; Ossenfort, John; Smith, Irene; Iverson, David L.; Schwabacher, Mark
2013-01-01
NASA's exploration program envisions the utilization of a Deep Space Habitat (DSH) for human exploration of the space environment in the vicinity of Mars and/or asteroids. Communication latencies with ground control of as long as 20+ minutes make it imperative that DSH operations be highly autonomous, as any telemetry-based detection of a systems problem on Earth could well occur too late to assist the crew with the problem. A DSH-based development program has been initiated to develop and test the automation technologies necessary to support highly autonomous DSH operations. One such technology is a fault management tool to support performance monitoring of vehicle systems operations and to assist with real-time decision making in connection with operational anomalies and failures. Toward that end, we are developing Advanced Caution and Warning System (ACAWS), a tool that combines dynamic and interactive graphical representations of spacecraft systems, systems modeling, automated diagnostic analysis and root cause identification, system and mission impact assessment, and mitigation procedure identification to help spacecraft operators (both flight controllers and crew) understand and respond to anomalies more effectively. In this paper, we describe four major architecture elements of ACAWS: Anomaly Detection, Fault Isolation, System Effects Analysis, and Graphic User Interface (GUI), and how these elements work in concert with each other and with other tools to provide fault management support to both the controllers and crew. We then describe recent evaluations and tests of ACAWS on the DSH testbed. The results of these tests support the feasibility and strength of our approach to failure management automation and enhanced operational autonomy.
SLURM: Simple Linux Utility for Resource Management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jette, M; Dunlap, C; Garlick, J
2002-04-24
Simple Linux Utility for Resource Management (SLURM) is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for Linux clusters of thousands of nodes. Components include machine status, partition management, job management, and scheduling modules. The design also includes a scalable, general-purpose communication infrastructure. Development will take place in four phases: Phase I results in a solid infrastructure; Phase II produces a functional but limited interactive job initiation capability without use of the interconnect/switch; Phase III provides switch support and documentation; Phase IV provides job status, fault-tolerance, and job queuing and control through Livermore's Distributed Production Control System (DPCS), a meta-batch and resource management system.
Fault management for the Space Station Freedom control center
NASA Technical Reports Server (NTRS)
Clark, Colin; Jowers, Steven; Mcnenny, Robert; Culbert, Chris; Kirby, Sarah; Lauritsen, Janet
1992-01-01
This paper describes model based reasoning fault isolation in complex systems using automated digraph analysis. It discusses the use of the digraph representation as the paradigm for modeling physical systems and a method for executing these failure models to provide real-time failure analysis. It also discusses the generality, ease of development and maintenance, complexity management, and susceptibility to verification and validation of digraph failure models. It specifically describes how a NASA-developed digraph evaluation tool and an automated process working with that tool can identify failures in a monitored system when supplied with one or more fault indications. This approach is well suited to commercial applications of real-time failure analysis in complex systems because it is both powerful and cost effective.
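The digraph-based isolation idea in the abstract above can be illustrated with a toy model; the node names, edge set, and single-failure assumption below are hypothetical stand-ins, not the representation used by the NASA digraph evaluation tool:

```python
def reachable(graph, node, seen=None):
    """All effects that a failure at `node` can propagate to in the digraph
    (cause -> effect edges), including the node itself."""
    seen = seen or {node}
    for nxt in graph.get(node, []):
        if nxt not in seen:
            seen.add(nxt)
            reachable(graph, nxt, seen)
    return seen

def candidate_causes(graph, indications):
    """Nodes from which every observed fault indication is reachable:
    the single-failure hypotheses consistent with the evidence."""
    return sorted(n for n in graph
                  if set(indications) <= reachable(graph, n))

# Toy failure model: a pump failure can manifest as both indications,
# while a biased sensor explains only one of them.
graph = {
    "pump": ["low_flow", "high_temp"],
    "sensor_bias": ["low_flow"],
    "low_flow": [],
    "high_temp": [],
}
```

Given both `low_flow` and `high_temp` indications, only `pump` explains the full evidence, which is the kind of narrowing the real-time digraph analysis performs on fault indications from the monitored system.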
A structural model decomposition framework for systems health management
NASA Astrophysics Data System (ADS)
Roychoudhury, I.; Daigle, M.; Bregon, A.; Pulido, B.
Systems health management (SHM) is an important set of technologies aimed at increasing system safety and reliability by detecting, isolating, and identifying faults; and predicting when the system reaches end of life (EOL), so that appropriate fault mitigation and recovery actions can be taken. Model-based SHM approaches typically make use of global, monolithic system models for online analysis, which results in a loss of scalability and efficiency for large-scale systems. Improvement in scalability and efficiency can be achieved by decomposing the system model into smaller local submodels and operating on these submodels instead. In this paper, the global system model is analyzed offline and structurally decomposed into local submodels. We define a common model decomposition framework for extracting submodels from the global model. This framework is then used to develop algorithms for solving model decomposition problems for the design of three separate SHM technologies, namely, estimation (which is useful for fault detection and identification), fault isolation, and EOL prediction. We solve these model decomposition problems using a three-tank system as a case study.
A Structural Model Decomposition Framework for Systems Health Management
NASA Technical Reports Server (NTRS)
Roychoudhury, Indranil; Daigle, Matthew J.; Bregon, Anibal; Pulido, Belamino
2013-01-01
Systems health management (SHM) is an important set of technologies aimed at increasing system safety and reliability by detecting, isolating, and identifying faults; and predicting when the system reaches end of life (EOL), so that appropriate fault mitigation and recovery actions can be taken. Model-based SHM approaches typically make use of global, monolithic system models for online analysis, which results in a loss of scalability and efficiency for large-scale systems. Improvement in scalability and efficiency can be achieved by decomposing the system model into smaller local submodels and operating on these submodels instead. In this paper, the global system model is analyzed offline and structurally decomposed into local submodels. We define a common model decomposition framework for extracting submodels from the global model. This framework is then used to develop algorithms for solving model decomposition problems for the design of three separate SHM technologies, namely, estimation (which is useful for fault detection and identification), fault isolation, and EOL prediction. We solve these model decomposition problems using a three-tank system as a case study.
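A toy version of the structural decomposition described in the two records above, loosely in the spirit of their three-tank case study (the variable names and dependency lists here are invented), extracts a local submodel by cutting the dependency graph at measured variables:

```python
def submodel(equations, measured, target):
    """Extract the minimal set of equations needed to compute `target`.

    `equations` maps each computed variable to the variables it depends on.
    Measured variables act as cut points: they are treated as local inputs,
    so the submodel is decoupled from the rest of the global model there.
    """
    needed, frontier, seen = {}, [target], set()
    while frontier:
        var = frontier.pop()
        if var in seen or var in measured:
            continue
        seen.add(var)
        deps = equations.get(var, [])
        needed[var] = deps
        frontier.extend(deps)
    return needed

# Hypothetical three-tank dependency structure: tank levels h1..h3 and
# inter-tank flows q12, q23, with the middle level h2 measured.
eqs = {
    "h1": ["qin", "q12"],
    "q12": ["h1", "h2"],
    "h2": ["q12", "q23"],
    "q23": ["h2", "h3"],
    "h3": ["q23"],
}
sub = submodel(eqs, measured={"h2"}, target="h1")
```

Because `h2` is measured, the submodel for estimating `h1` never touches the third tank, which is the scalability gain the papers describe: local estimators, isolators, and EOL predictors each operate on a small cut of the global model.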
Optimal Management of Redundant Control Authority for Fault Tolerance
NASA Technical Reports Server (NTRS)
Wu, N. Eva; Ju, Jianhong
2000-01-01
This paper is intended to demonstrate the feasibility of a solution to a fault tolerant control problem. It explains, through a numerical example, the design and the operation of a novel scheme for fault tolerant control. The fundamental principle of the scheme was formalized in [5] based on the notion of normalized nonspecificity. The novelty lies with the use of a reliability criterion for redundancy management, and therefore leads to a high overall system reliability.
Characterization of Model-Based Reasoning Strategies for Use in IVHM Architectures
NASA Technical Reports Server (NTRS)
Poll, Scott; Iverson, David; Patterson-Hine, Ann
2003-01-01
Open architectures are gaining popularity for Integrated Vehicle Health Management (IVHM) applications due to the diversity of subsystem health monitoring strategies in use and the need to integrate a variety of techniques at the system health management level. The basic concept of an open architecture suggests that whatever monitoring or reasoning strategy a subsystem wishes to deploy, the system architecture will support the needs of that subsystem and will be capable of transmitting subsystem health status across subsystem boundaries and up to the system level for system-wide fault identification and diagnosis. There is a need to understand the capabilities of various reasoning engines and how they, coupled with intelligent monitoring techniques, can support fault detection and system level fault management. Researchers in IVHM at NASA Ames Research Center are supporting the development of an IVHM system for liquefying-fuel hybrid rockets. In the initial stage of this project, a few readily available reasoning engines were studied to assess candidate technologies for application in next generation launch systems. Three tools representing the spectrum of model-based reasoning approaches, from a quantitative simulation based approach to a graph-based fault propagation technique, were applied to model the behavior of the Hybrid Combustion Facility testbed at Ames. This paper summarizes the characterization of the modeling process for each of the techniques.
Breaking down barriers in cooperative fault management: Temporal and functional information displays
NASA Technical Reports Server (NTRS)
Potter, Scott S.; Woods, David D.
1994-01-01
At the highest level, the fundamental question addressed by this research is how to aid human operators engaged in dynamic fault management. In dynamic fault management there is some underlying dynamic process (an engineered or physiological process referred to as the monitored process - MP) whose state changes over time and whose behavior must be monitored and controlled. In these types of applications (dynamic, real-time systems), a vast array of sensor data is available to provide information on the state of the MP. Faults disturb the MP and diagnosis must be performed in parallel with responses to maintain process integrity and to correct the underlying problem. These situations frequently involve time pressure, multiple interacting goals, high consequences of failure, and multiple interleaved tasks.
Results from the NASA Spacecraft Fault Management Workshop: Cost Drivers for Deep Space Missions
NASA Technical Reports Server (NTRS)
Newhouse, Marilyn E.; McDougal, John; Barley, Bryan; Stephens Karen; Fesq, Lorraine M.
2010-01-01
Fault Management, the detection of and response to in-flight anomalies, is a critical aspect of deep-space missions. Fault management capabilities are commonly distributed across flight and ground subsystems, impacting hardware, software, and mission operations designs. The National Aeronautics and Space Administration (NASA) Discovery & New Frontiers (D&NF) Program Office at Marshall Space Flight Center (MSFC) recently studied cost overruns and schedule delays for five missions. The goal was to identify the underlying causes for the overruns and delays, and to develop practical mitigations to assist the D&NF projects in identifying potential risks and controlling the associated impacts to proposed mission costs and schedules. The study found that four out of the five missions studied had significant overruns due to underestimating the complexity and support requirements for fault management. As a result of this and other recent experiences, the NASA Science Mission Directorate (SMD) Planetary Science Division (PSD) commissioned a workshop to bring together invited participants across government, industry, and academia to assess the state of the art in fault management practice and research, identify current and potential issues, and make recommendations for addressing these issues. The workshop was held in New Orleans in April of 2008. The workshop concluded that fault management is not being limited by technology, but rather by a lack of emphasis and discipline in both the engineering and programmatic dimensions. Some of the areas cited in the findings include different, conflicting, and changing institutional goals and risk postures; unclear ownership of end-to-end fault management engineering; inadequate understanding of the impact of mission-level requirements on fault management complexity; and practices, processes, and tools that have not kept pace with the increasing complexity of mission requirements and spacecraft systems. 
This paper summarizes the findings and recommendations from that workshop, particularly as fault management development issues affect operations and the development of operations capabilities.
Havens: Explicit Reliable Memory Regions for HPC Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hukerikar, Saurabh; Engelmann, Christian
2016-01-01
Supporting error resilience in future exascale-class supercomputing systems is a critical challenge. Due to transistor scaling trends and increasing memory density, scientific simulations are expected to experience more interruptions caused by transient errors in the system memory. Existing hardware-based detection and recovery techniques will be inadequate to manage the presence of high memory fault rates. In this paper we propose a partial memory protection scheme based on region-based memory management. We define the concept of regions called havens that provide fault protection for program objects. We provide reliability for the regions through a software-based parity protection mechanism. Our approach enables critical program objects to be placed in these havens. The fault coverage provided by our approach is application agnostic, unlike algorithm-based fault tolerance techniques.
Fault Tree Analysis as a Planning and Management Tool: A Case Study
ERIC Educational Resources Information Center
Witkin, Belle Ruth
1977-01-01
Fault Tree Analysis is an operations research technique used to analyse the most probable modes of failure in a system, so that the system can be redesigned or monitored more closely to increase its likelihood of success. (Author)
On the design of fault-tolerant robotic manipulator systems
NASA Technical Reports Server (NTRS)
Tesar, Delbert
1993-01-01
Robotic systems are finding increasing use in space applications. Many of these devices are going to be operational on board the Space Station Freedom. Fault tolerance has been deemed necessary because of the criticality of the tasks and the inaccessibility of the systems to maintenance and repair. Design for fault tolerance in manipulator systems is an area within robotics that is without precedent in the literature. In this paper, we will attempt to lay down the foundations for such a technology. Design for fault tolerance demands new and special approaches to design, often at considerable variance from established design practices. These design aspects, together with reliability evaluation and modeling tools, are presented. Mechanical architectures that employ protective redundancies at many levels and have a modular architecture are then studied in detail. Once a mechanical architecture for fault tolerance has been derived, the chronological stages of operational fault tolerance are investigated. Failure detection, isolation, and estimation methods are surveyed, and such methods for robot sensors and actuators are derived. Failure recovery methods are also presented for each of the protective layers of redundancy. Failure recovery tactics often span all of the layers of a control hierarchy. Thus, a unified framework for decision-making and control, which orchestrates both the nominal redundancy management tasks and the failure management tasks, has been derived. The well-developed field of fault-tolerant computers is studied next, and some design principles relevant to the design of fault-tolerant robot controllers are abstracted. Conclusions are drawn, and a road map for the design of fault-tolerant manipulator systems is laid out with recommendations for a 10 DOF arm with dual actuators at each joint.
A systematic risk management approach employed on the CloudSat project
NASA Technical Reports Server (NTRS)
Basilio, R. R.; Plourde, K. S.; Lam, T.
2000-01-01
The CloudSat Project has developed a simplified approach for fault tree analysis and probabilistic risk assessment. A system-level fault tree has been constructed to identify credible fault scenarios and failure modes leading up to a potential failure to meet the nominal mission success criteria.
Fault Management Architectures and the Challenges of Providing Software Assurance
NASA Technical Reports Server (NTRS)
Savarino, Shirley; Fitz, Rhonda; Fesq, Lorraine; Whitman, Gerek
2015-01-01
Satellite system Fault Management (FM) is focused on safety, the preservation of assets, and maintaining the desired functionality of the system. How FM is implemented varies among missions. Common to most is system complexity due to a need to establish a multi-dimensional structure across hardware, software and operations. This structure is necessary to identify and respond to system faults, mitigate technical risks and ensure operational continuity. These architecture, implementation and software assurance efforts increase with mission complexity. Because FM is a systems engineering discipline with a distributed implementation, providing efficient and effective verification and validation (V&V) is challenging. A breakout session at the 2012 NASA Independent Verification & Validation (IV&V) Annual Workshop titled "V&V of Fault Management: Challenges and Successes" exposed these issues in terms of V&V for a representative set of architectures. NASA's IV&V is funded by NASA's Software Assurance Research Program (SARP) in partnership with NASA's Jet Propulsion Laboratory (JPL) to extend the work performed at the Workshop session. NASA IV&V will extract FM architectures across the IV&V portfolio and evaluate the data set for robustness, assess visibility for validation and test, and define software assurance methods that could be applied to the various architectures and designs. This work focuses efforts on FM architectures from critical and complex projects within NASA. The identification of particular FM architectures, visibility, and associated V&V/IV&V techniques provides a data set that can enable higher assurance that a satellite system will adequately detect and respond to adverse conditions. Ultimately, results from this activity will be incorporated into the NASA Fault Management Handbook providing dissemination across NASA, other agencies and the satellite community.
This paper discusses the approach taken to perform the evaluations and preliminary findings from the research, including identification of FM architectures, visibility observations, and methods utilized for V&V/IV&V.
Fault Management Technology Maturation for NASA's Constellation Program
NASA Technical Reports Server (NTRS)
Waterman, Robert D.
2010-01-01
This slide presentation reviews the maturation of fault management technology in preparation for the Constellation Program. There is a review of the Space Shuttle Main Engine (SSME) and a discussion of a couple of incidents with the shuttle main engine and tanking that indicated the necessity for predictive maintenance. Included is a review of the planned Ares I-X Ground Diagnostic Prototype (GDP) and further information about detection and isolation of faults using the Testability Engineering and Maintenance System (TEAMS). Another system being readied for use that detects anomalies is the Inductive Monitoring System (IMS). The IMS automatically learns how the system behaves and alerts operations if the current behavior is anomalous. The comparison of STS-83 and STS-107 (i.e., the Columbia accident) is shown as an example of the anomaly detection capabilities.
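IMS's core idea of learning envelopes of nominal sensor behavior and flagging data that falls outside all of them can be sketched as follows. This is a simplified stand-in for the actual IMS clustering algorithm; the envelope representation, margin, and data are illustrative assumptions.

```python
# Sketch of IMS-style anomaly monitoring: learn per-channel bounds
# ("envelopes") from nominal training runs, then flag sensor vectors
# that lie outside every learned envelope. A simplified stand-in for
# the real IMS clustering; thresholds and data are illustrative.

def learn_envelopes(nominal_runs):
    """Each envelope is a (per-channel min, per-channel max) pair."""
    envelopes = []
    for run in nominal_runs:                 # run: list of sensor vectors
        lows = [min(col) for col in zip(*run)]
        highs = [max(col) for col in zip(*run)]
        envelopes.append((lows, highs))
    return envelopes

def is_anomalous(sample, envelopes, margin=0.0):
    """Anomalous if the sample falls outside all learned envelopes."""
    for lows, highs in envelopes:
        inside = all(lo - margin <= x <= hi + margin
                     for x, lo, hi in zip(sample, lows, highs))
        if inside:
            return False
    return True
```

In this framing, "learning how the system behaves" is just the training pass over nominal runs; no fault models are needed, which is what distinguishes this style of monitoring from the model-based diagnosis tools discussed elsewhere in this collection.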
Plan for the Characterization of HIRF Effects on a Fault-Tolerant Computer Communication System
NASA Technical Reports Server (NTRS)
Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Miner, Paul S.; Koppen, Sandra V.
2008-01-01
This report presents the plan for the characterization of the effects of high intensity radiated fields on a prototype implementation of a fault-tolerant data communication system. Various configurations of the communication system will be tested. The prototype system is implemented using off-the-shelf devices. The system will be tested in a closed-loop configuration with extensive real-time monitoring. This test is intended to generate data suitable for the design of avionics health management systems, as well as redundancy management mechanisms and policies for robust distributed processing architectures.
Reliability of Fault Tolerant Control Systems. Part 1
NASA Technical Reports Server (NTRS)
Wu, N. Eva
2001-01-01
This paper reports Part I of a two-part effort intended to delineate the relationship between reliability and fault-tolerant control in a quantitative manner. Reliability analysis of fault-tolerant control systems is performed using Markov models. Reliability properties peculiar to fault-tolerant control systems are emphasized; in particular, coverage of failures through redundancy management can be severely limited. It is shown that in the early life of a system composed of highly reliable subsystems, the reliability of the overall system is affine with respect to coverage, and inadequate coverage induces dominant single point failures. The utility of some existing software tools for assessing the reliability of fault-tolerant control systems is also discussed. Coverage modeling is attempted in Part II in a way that captures its dependence on the control performance and on the diagnostic resolution.
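The affine relationship between system reliability and coverage can be illustrated with the simplest redundant configuration: a duplex system survives if both modules work, or if exactly one fails and that failure is covered (detected and isolated) by redundancy management. The formula and numbers below are a standard textbook illustration, not taken from the paper.

```python
# Duplex-system reliability as a function of coverage c, for module
# reliability R: the system survives if both modules are up, or if one
# fails and the failure is covered. Illustrates the abstract's point
# that reliability is affine in c; numbers are illustrative.

def duplex_reliability(R, c):
    """R_sys(c) = R^2 + 2*c*R*(1 - R), affine in c."""
    return R * R + 2.0 * c * R * (1.0 - R)

R = 0.999
r_full = duplex_reliability(R, 1.0)   # perfect coverage
r_none = duplex_reliability(R, 0.0)   # no coverage: redundancy wasted
```

With R = 0.999, perfect coverage yields about 0.999999 while zero coverage yields only 0.998001, worse than a single module: uncovered failures turn the redundancy into a liability, which is the "dominant single point failures" effect the abstract notes.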
A design approach for ultrareliable real-time systems
NASA Technical Reports Server (NTRS)
Lala, Jaynarayan H.; Harper, Richard E.; Alger, Linda S.
1991-01-01
A design approach developed over the past few years to formalize redundancy management and validation is described. Redundant elements are partitioned into individual fault-containment regions (FCRs). An FCR is a collection of components that operates correctly regardless of any arbitrary logical or electrical fault outside the region. Conversely, a fault in an FCR cannot cause hardware outside the region to fail. The outputs of all channels are required to agree bit-for-bit under no-fault conditions (exact bitwise consensus). Synchronization, input agreement, and input validity conditions are discussed. The Advanced Information Processing System (AIPS), which is a fault-tolerant distributed architecture based on this approach, is described. A brief overview of recent applications of these systems and current research is presented.
Coordinated Fault Tolerance for High-Performance Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dongarra, Jack; Bosilca, George; et al.
2013-04-08
Our work to meet our goal of end-to-end fault tolerance has focused on two areas: (1) improving fault tolerance in various software currently available and widely used throughout the HEC domain, and (2) using fault information exchange and coordination to achieve holistic, system-wide fault tolerance, and understanding how to design and implement interfaces for integrating fault tolerance features across multiple layers of the software stack, from the application, math libraries, and programming language runtime to other common system software such as job schedulers, resource managers, and monitoring tools.
Product Support Manager Guidebook
2011-04-01
package is being developed using supportability analysis concepts such as Failure Mode, Effects and Criticality Analysis (FMECA), Fault Tree Analysis (FTA) ... Analysis (LORA), Condition Based Maintenance + (CBM+), Fault Tree Analysis (FTA), Failure Mode, Effects, and Criticality Analysis (FMECA), Maintenance Task ... Reporting and Corrective Action System (FRACAS), Fault Tree Analysis (FTA), Level of Repair Analysis (LORA), Maintenance Task Analysis (MTA ...
NASA Technical Reports Server (NTRS)
Truong, Long V.; Walters, Jerry L.; Roth, Mary Ellen; Quinn, Todd M.; Krawczonek, Walter M.
1990-01-01
The goal of the Autonomous Power System (APS) program is to develop and apply intelligent problem solving and control to the Space Station Freedom Electrical Power System (SSF/EPS) testbed being developed and demonstrated at NASA Lewis Research Center. The objectives of the program are to establish artificial intelligence technology paths, to craft knowledge-based tools with advanced human-operator interfaces for power systems, and to interface and integrate knowledge-based systems with conventional controllers. The Autonomous Power EXpert (APEX) portion of the APS program will integrate a knowledge-based fault diagnostic system and a power resource planner-scheduler. Then APEX will interface on-line with the SSF/EPS testbed and its Power Management Controller (PMC). The key tasks include establishing knowledge bases for system diagnostics, fault detection and isolation analysis, on-line information accessing through PMC, enhanced data management, and multiple-level, object-oriented operator displays. The first prototype of the diagnostic expert system for fault detection and isolation has been developed. The knowledge bases and the rule-based model that were developed for the Power Distribution Control Unit subsystem of the SSF/EPS testbed are described. A corresponding troubleshooting technique is also described.
Fault Management Architectures and the Challenges of Providing Software Assurance
NASA Technical Reports Server (NTRS)
Savarino, Shirley; Fitz, Rhonda; Fesq, Lorraine; Whitman, Gerek
2015-01-01
Fault Management (FM) is focused on safety, the preservation of assets, and maintaining the desired functionality of the system. How FM is implemented varies among missions. Common to most missions is system complexity due to a need to establish a multi-dimensional structure across hardware, software and spacecraft operations. FM is necessary to identify and respond to system faults, mitigate technical risks and ensure operational continuity. Generally, FM architecture, implementation, and software assurance efforts increase with mission complexity. Because FM is a systems engineering discipline with a distributed implementation, providing efficient and effective verification and validation (V&V) is challenging. A breakout session at the 2012 NASA Independent Verification & Validation (IV&V) Annual Workshop titled "V&V of Fault Management: Challenges and Successes" exposed this issue in terms of V&V for a representative set of architectures. NASA's Software Assurance Research Program (SARP) has provided funds to NASA IV&V to extend the work performed at the Workshop session in partnership with NASA's Jet Propulsion Laboratory (JPL). NASA IV&V will extract FM architectures across the IV&V portfolio and evaluate the data set, assess visibility for validation and test, and define software assurance methods that could be applied to the various architectures and designs. This SARP initiative focuses efforts on FM architectures from critical and complex projects within NASA. The identification of particular FM architectures and associated V&V/IV&V techniques provides a data set that can enable improved assurance that a system will adequately detect and respond to adverse conditions. Ultimately, results from this activity will be incorporated into the NASA Fault Management Handbook providing dissemination across NASA, other agencies and the space community. This paper discusses the approach taken to perform the evaluations and preliminary findings from the research.
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Basham, Bryan D.
1989-01-01
CONFIG is a modeling and simulation tool prototype for analyzing the normal and faulty qualitative behaviors of engineered systems. Qualitative modeling and discrete-event simulation have been adapted and integrated, to support early development, during system design, of software and procedures for management of failures, especially in diagnostic expert systems. Qualitative component models are defined in terms of normal and faulty modes and processes, which are defined by invocation statements and effect statements with time delays. System models are constructed graphically by using instances of components and relations from object-oriented hierarchical model libraries. Extension and reuse of CONFIG models and analysis capabilities in hybrid rule- and model-based expert fault-management support systems are discussed.
Model-Based Fault Diagnosis: Performing Root Cause and Impact Analyses in Real Time
NASA Technical Reports Server (NTRS)
Figueroa, Jorge F.; Walker, Mark G.; Kapadia, Ravi; Morris, Jonathan
2012-01-01
Generic, object-oriented fault models, built according to causal-directed graph theory, have been integrated into an overall software architecture dedicated to monitoring and predicting the health of mission-critical systems. Processing over the generic fault models is triggered by event detection logic that is defined according to the specific functional requirements of the system and its components. Once triggered, the fault models provide an automated way both to perform upstream root cause analysis (RCA) and to predict downstream effects (impact analysis). The methodology has been applied to integrated system health management (ISHM) implementations at NASA SSC's Rocket Engine Test Stands (RETS).
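The two traversals over a causal-directed graph can be sketched as follows: root cause analysis walks edges upstream from an observed event, and impact analysis walks them downstream. The graph below is a hypothetical example, not an SSC test-stand model.

```python
# Sketch of root-cause (upstream) and impact (downstream) analysis
# over a causal-directed graph. Edges point from cause to effect;
# the example graph is hypothetical.

def upstream(graph, node):
    """All potential root causes: nodes with a causal path *to* node."""
    reverse = {}
    for cause, effects in graph.items():
        for effect in effects:
            reverse.setdefault(effect, set()).add(cause)
    return _reach(reverse, node)

def downstream(graph, node):
    """All potential impacts: nodes reachable *from* node."""
    return _reach({k: set(v) for k, v in graph.items()}, node)

def _reach(adj, start):
    """Depth-first reachability from start (start itself excluded)."""
    seen, stack = set(), [start]
    while stack:
        n = stack.pop()
        for nxt in adj.get(n, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Hypothetical example: a stuck valve or a drifting sensor can explain
# low pressure, and low pressure degrades thrust.
causal = {
    "valve_stuck": {"low_pressure"},
    "sensor_drift": {"low_pressure"},
    "low_pressure": {"low_thrust"},
}
```

In an event-triggered setting like the one described, `upstream` would be run on the detected event to enumerate candidate root causes, and `downstream` on each confirmed fault to predict its mission impacts.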
Fault Management Techniques in Human Spaceflight Operations
NASA Technical Reports Server (NTRS)
O'Hagan, Brian; Crocker, Alan
2006-01-01
This paper discusses human spaceflight fault management operations. Fault detection and response capabilities available in the current US human spaceflight programs, Space Shuttle and International Space Station, are described while emphasizing system design impacts on operational techniques and constraints. Preflight and inflight processes, along with products used to anticipate, mitigate and respond to failures, are introduced. Examples of operational products used to support failure responses are presented. Possible improvements in the state of the art, as well as prioritization and success criteria for their implementation, are proposed. This paper describes how the architecture of a command and control system impacts operations in areas such as the required fault response times, automated vs. manual fault responses, use of workarounds, etc. The architecture includes the use of redundancy at the system and software function level, software capabilities, use of intelligent or autonomous systems, number and severity of software defects, etc. This in turn drives which Caution and Warning (C&W) events should be annunciated, C&W event classification, operator display designs, crew training, flight control team training, and procedure development. Other factors impacting operations are the complexity of a system, the skills needed to understand and operate a system, and the use of commonality vs. optimized solutions for software and responses. Fault detection, annunciation, safing responses, and recovery capabilities are explored using real examples to uncover underlying philosophies and constraints. These factors directly impact operations in that the crew and flight control team need to understand what happened, why it happened, what the system is doing, and what, if any, corrective actions they need to perform.
If a fault results in multiple C&W events, or if several faults occur simultaneously, the root cause(s) of the fault(s), as well as their vehicle-wide impacts, must be determined in order to maintain situational awareness. This allows both automated and manual recovery operations to focus on the real cause of the fault(s). An appropriate balance must be struck between correcting the root cause failure and addressing the impacts of that fault on other vehicle components. Lastly, this paper presents a strategy for using lessons learned to improve the software, displays, and procedures in addition to determining what is a candidate for automation. Enabling technologies and techniques are identified to promote system evolution from one that requires manual fault responses to one that uses automation and autonomy where they are most effective. These considerations include the value in correcting software defects in a timely manner, automation of repetitive tasks, making time critical responses autonomous, etc. The paper recommends the appropriate use of intelligent systems to determine the root causes of faults and correctly identify separate unrelated faults.
Extended Testability Analysis Tool
NASA Technical Reports Server (NTRS)
Melcher, Kevin; Maul, William A.; Fulton, Christopher
2012-01-01
The Extended Testability Analysis (ETA) Tool is a software application that supports fault management (FM) by performing testability analyses on the fault propagation model of a given system. Fault management includes the prevention of faults through robust design margins and quality assurance methods, or the mitigation of system failures. Fault management requires an understanding of the system design and operation, potential failure mechanisms within the system, and the propagation of those potential failures through the system. The purpose of the ETA Tool software is to process the testability analysis results from a commercial software program called TEAMS Designer in order to provide a detailed set of diagnostic assessment reports. The ETA Tool is a command-line process with several user-selectable report output options. The ETA Tool also extends the COTS testability analysis and enables variation studies with sensor sensitivity impacts on system diagnostics and component isolation using a single testability output. The ETA Tool can also provide extended analyses from a single set of testability output files. The following analysis reports are available to the user: (1) the Detectability Report provides a breakdown of how each tested failure mode was detected, (2) the Test Utilization Report identifies all the failure modes that each test detects, (3) the Failure Mode Isolation Report demonstrates the system's ability to discriminate between failure modes, (4) the Component Isolation Report demonstrates the system's ability to discriminate between failure modes relative to the components containing the failure modes, (5) the Sensor Sensitivity Analysis Report shows the diagnostic impact due to loss of sensor information, and (6) the Effect Mapping Report identifies failure modes that result in specified system-level effects.
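The kinds of reports listed can be sketched from a simple dependency ("D") matrix mapping tests to the failure modes they detect. The matrix below is a hypothetical example, not TEAMS Designer output, and the report logic is a simplified reading of the descriptions above.

```python
# Sketch of detectability and isolation analysis over a dependency
# matrix mapping each test to the failure modes it detects. The
# matrix is a hypothetical example, not TEAMS Designer output.

d_matrix = {
    "test_A": {"fm1", "fm2"},
    "test_B": {"fm2", "fm3"},
}

def detectable_modes(dm):
    """Detectability: failure modes seen by at least one test."""
    out = set()
    for modes in dm.values():
        out |= modes
    return out

def signature(dm, mode):
    """The set of tests that fire when a given failure mode occurs."""
    return frozenset(t for t, modes in dm.items() if mode in modes)

def ambiguity_groups(dm):
    """Isolation: modes sharing a test signature cannot be told apart."""
    groups = {}
    for mode in detectable_modes(dm):
        groups.setdefault(signature(dm, mode), set()).add(mode)
    return list(groups.values())
```

Dropping a test (analogous to the Sensor Sensitivity Analysis Report's loss-of-sensor study) and re-running `ambiguity_groups` shows directly how isolation capability degrades, which is the variation-study use case the abstract describes.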
Design for interaction between humans and intelligent systems during real-time fault management
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Schreckenghost, Debra L.; Thronesbery, Carroll G.
1992-01-01
Initial results are reported to provide guidance and assistance for designers of intelligent systems and their human interfaces. The objective is to achieve more effective human-computer interaction (HCI) for real time fault management support systems. Studies of the development of intelligent fault management systems within NASA have resulted in a new perspective of the user. If the user is viewed as one of the subsystems in a heterogeneous, distributed system, system design becomes the design of a flexible architecture for accomplishing system tasks with both human and computer agents. HCI requirements and design should be distinguished from user interface (displays and controls) requirements and design. Effective HCI design for multi-agent systems requires explicit identification of activities and information that support coordination and communication between agents. The effects of HCI design on overall system design are characterized, and approaches to addressing HCI requirements in system design are identified. The results include definition of (1) guidance based on information level requirements analysis of HCI, (2) high level requirements for a design methodology that integrates the HCI perspective into system design, and (3) requirements for embedding HCI design tools into intelligent system development environments.
NASA Astrophysics Data System (ADS)
Wang, S.; Zhang, X. N.; Gao, D. D.; Liu, H. X.; Ye, J.; Li, L. R.
2016-08-01
As solar photovoltaic (PV) power is applied extensively, more attention is paid to the maintenance and fault diagnosis of PV power plants. Based on analysis of the structure of a PV power station, the global partitioned gradually approximation method is proposed as a fault diagnosis algorithm to determine and locate faults in PV panels. The PV array is divided into 16x16 blocks and numbered. On the basis of this modular processing of the PV array, the current values of each block are analyzed. The mean current value of each block is used to calculate the fault weight factor. A fault threshold is defined to determine the fault, and shading is considered in order to reduce the probability of misjudgment. A fault diagnosis system is designed and implemented with LabVIEW, with functions including real-time data display, online checking, statistics, real-time prediction and fault diagnosis. The algorithm is verified with data from PV plants. The results show that the fault diagnosis results are accurate and that the system works well, confirming the validity and feasibility of the approach. The developed system will benefit the maintenance and management of large-scale PV arrays.
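The block-based diagnosis step can be sketched as follows. The weight formula (relative current deficit of a block against the array-wide mean) and the threshold value are simplified assumptions based on the abstract, not the paper's exact method, and shading compensation is omitted.

```python
# Sketch of block-based PV fault diagnosis: compare each block's mean
# current with the array-wide mean and flag blocks whose fault weight
# exceeds a threshold. Weight formula and threshold are simplified
# assumptions; shading compensation is omitted.

def fault_blocks(block_currents, threshold=0.2):
    """block_currents: {block_id: [string currents]} -> faulty block ids."""
    means = {b: sum(v) / len(v) for b, v in block_currents.items()}
    overall = sum(means.values()) / len(means)
    faulty = set()
    for b, m in means.items():
        weight = (overall - m) / overall   # relative current deficit
        if weight > threshold:
            faulty.add(b)
    return faulty
```

A block shaded by a cloud also shows a current deficit, which is why the paper adds a shading check before declaring a fault; in this sketch that step would gate the final `faulty.add` call.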
Multi-Agent Diagnosis and Control of an Air Revitalization System for Life Support in Space
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Kowing, Jeffrey; Nieten, Joseph; Graham, Jeffrey S.; Schreckenghost, Debra; Bonasso, Pete; Fleming, Land D.; MacMahon, Matt; Thronesbery, Carroll
2000-01-01
An architecture of interoperating agents has been developed to provide control and fault management for advanced life support systems in space. In this adjustable autonomy architecture, software agents coordinate with human agents and provide support in novel fault management situations. This architecture combines the Livingstone model-based mode identification and reconfiguration (MIR) system with the 3T architecture for autonomous flexible command and control. The MIR software agent performs model-based state identification and diagnosis. MIR identifies novel recovery configurations and the set of commands required for the recovery. The 3T procedural executive and the human operator use the diagnoses and recovery recommendations, and provide command sequencing. User interface extensions have been developed to support human monitoring of both 3T and MIR data and activities. This architecture has been demonstrated performing control and fault management for an oxygen production system for air revitalization in space. The software operates in a dynamic simulation testbed.
Fault-tolerant processing system
NASA Technical Reports Server (NTRS)
Palumbo, Daniel L. (Inventor)
1996-01-01
A fault-tolerant, fiber optic interconnect, or backplane, serves as a via for data transfer between modules. Fault tolerance algorithms are embedded in the backplane by dividing the backplane into a read bus and a write bus and placing a redundancy management unit (RMU) between them, so that all data transmitted by the write bus is subjected to the fault tolerance algorithms before being passed for distribution to the read bus. The RMU provides both backplane control and fault tolerance.
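The redundancy-management step between the write bus and the read bus can be sketched as a bit-for-bit majority vote over redundant channel words. The three-channel configuration and word width below are illustrative assumptions, not details from the patent.

```python
# Sketch of write-bus redundancy management: a bit-for-bit majority
# vote across redundant channel words before the result is released
# to the read bus. Three channels are an illustrative choice.

def majority_vote(words):
    """Vote each bit position across an odd number of channel words."""
    assert len(words) % 2 == 1
    voted = 0
    for bit in range(max(words).bit_length()):
        ones = sum((w >> bit) & 1 for w in words)
        if ones > len(words) // 2:
            voted |= 1 << bit
    return voted

def disagreeing_channels(words, voted):
    """Flag channels whose word differs from the voted result."""
    return [i for i, w in enumerate(words) if w != voted]
```

The disagreement list is what lets an RMU both mask a faulty channel's data and report which channel is failing, covering the "backplane control and fault tolerance" roles in one pass.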
A System for Fault Management for NASA's Deep Space Habitat
NASA Technical Reports Server (NTRS)
Colombano, Silvano P.; Spirkovska, Liljana; Aaseng, Gordon B.; McCann, Robert S.; Baskaran, Vijayakumar; Ossenfort, John P.; Smith, Irene Skupniewicz; Iverson, David L.; Schwabacher, Mark A.
2013-01-01
NASA's exploration program envisions the utilization of a Deep Space Habitat (DSH) for human exploration of the space environment in the vicinity of Mars and/or asteroids. Communication latencies with ground control of as long as 20+ minutes make it imperative that DSH operations be highly autonomous, as any telemetry-based detection of a systems problem on Earth could well occur too late to assist the crew with the problem. A DSH-based development program has been initiated to develop and test the automation technologies necessary to support highly autonomous DSH operations. One such technology is a fault management tool to support performance monitoring of vehicle systems operations and to assist with real-time decision making in connection with operational anomalies and failures. Toward that end, we are developing Advanced Caution and Warning System (ACAWS), a tool that combines dynamic and interactive graphical representations of spacecraft systems, systems modeling, automated diagnostic analysis and root cause identification, system and mission impact assessment, and mitigation procedure identification to help spacecraft operators (both flight controllers and crew) understand and respond to anomalies more effectively. In this paper, we describe four major architecture elements of ACAWS: Anomaly Detection, Fault Isolation, System Effects Analysis, and Graphic User Interface (GUI), and how these elements work in concert with each other and with other tools to provide fault management support to both the controllers and crew. We then describe recent evaluations and tests of ACAWS on the DSH testbed. The results of these tests support the feasibility and strength of our approach to failure management automation and enhanced operational autonomy.
Real-Time Simulation for Verification and Validation of Diagnostic and Prognostic Algorithms
NASA Technical Reports Server (NTRS)
Aguilar, Robet; Luu, Chuong; Santi, Louis M.; Sowers, T. Shane
2005-01-01
To verify that a health management system (HMS) performs as expected, a virtual system simulation capability, including interaction with the associated platform or vehicle, very likely will need to be developed. The rationale for developing this capability is discussed and includes the limited ability to seed faults into the actual target system due to the risk of potential damage to high value hardware. The capability envisioned would accurately reproduce the propagation of a fault or failure as observed by sensors located at strategic locations on and around the target system and would also accurately reproduce the control system and vehicle response. In this way, HMS operation can be exercised over a broad range of conditions to verify that it meets requirements for accurate, timely response to actual faults with adequate margin against false and missed detections. An overview is also presented of a real-time rocket propulsion health management system laboratory that is available for future rocket engine programs. The health management elements and approaches of this lab are directly applicable to future space systems. In this paper the various components are discussed and the general fault detection, diagnosis, isolation, and response (FDIR) concept is presented. Additionally, the complexities of V&V (Verification and Validation) for advanced algorithms and the simulation capabilities required to meet the changing state-of-the-art in HMS are discussed.
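The fault-seeding idea at the heart of this V&V approach can be sketched as follows: inject a fault into a simulated sensor stream at a known time and measure how long the monitor takes to alarm. The sensor model, seeded bias, and limit values are illustrative assumptions.

```python
# Sketch of fault seeding in a simulated sensor stream to verify that
# a monitor detects the fault within a required latency. The sensor
# model, seeded bias, and limit values are illustrative assumptions.

def simulate(n_steps, fault_step, bias=5.0, nominal=100.0):
    """Yield (t, reading): nominal until fault_step, then biased."""
    for t in range(n_steps):
        yield t, nominal + (bias if t >= fault_step else 0.0)

def detection_latency(stream, fault_step, limit=102.0):
    """Steps between fault injection and the first out-of-limits alarm,
    or None if the monitor never fires (a missed detection)."""
    for t, reading in stream:
        if reading > limit:
            return t - fault_step
    return None
```

Running this over many seeded fault magnitudes and injection times is what builds the "margin against false and missed detections" evidence the abstract calls for, without risking hardware.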
Application of a Multimedia Service and Resource Management Architecture for Fault Diagnosis
Castro, Alfonso; Sedano, Andrés A.; García, Fco. Javier; Villoslada, Eduardo
2017-01-01
Nowadays, the complexity of global video products has substantially increased. They are composed of several associated services whose functionalities need to adapt across heterogeneous networks with different technologies and administrative domains. Each of these domains has different operational procedures; therefore, the comprehensive management of multi-domain services presents serious challenges. This paper discusses an approach to service management linking fault diagnosis system and Business Processes for Telefónica’s global video service. The main contribution of this paper is the proposal of an extended service management architecture based on Multi Agent Systems able to integrate the fault diagnosis with other different service management functionalities. This architecture includes a distributed set of agents able to coordinate their actions under the umbrella of a Shared Knowledge Plane, inferring and sharing their knowledge with semantic techniques and three types of automatic reasoning: heterogeneous, ontology-based and Bayesian reasoning. This proposal has been deployed and validated in a real scenario in the video service offered by Telefónica Latam. PMID:29283398
Application of a Multimedia Service and Resource Management Architecture for Fault Diagnosis.
Castro, Alfonso; Sedano, Andrés A; García, Fco Javier; Villoslada, Eduardo; Villagrá, Víctor A
2017-12-28
Nowadays, the complexity of global video products has substantially increased. They are composed of several associated services whose functionalities need to adapt across heterogeneous networks with different technologies and administrative domains. Each of these domains has different operational procedures; therefore, the comprehensive management of multi-domain services presents serious challenges. This paper discusses an approach to service management linking fault diagnosis system and Business Processes for Telefónica's global video service. The main contribution of this paper is the proposal of an extended service management architecture based on Multi Agent Systems able to integrate the fault diagnosis with other different service management functionalities. This architecture includes a distributed set of agents able to coordinate their actions under the umbrella of a Shared Knowledge Plane, inferring and sharing their knowledge with semantic techniques and three types of automatic reasoning: heterogeneous, ontology-based and Bayesian reasoning. This proposal has been deployed and validated in a real scenario in the video service offered by Telefónica Latam.
Simulation of demand-response power management in smart city
NASA Astrophysics Data System (ADS)
Kadam, Kshitija
Smart grids manage energy efficiently through intelligent monitoring and control of all the components connected to the electrical grid. Advanced digital technology, combined with sensors and power electronics, can greatly improve transmission-line efficiency. This thesis proposed a model of a deregulated grid that supplies power to a diverse set of consumers and allows them to participate in the decision-making process through two-way communication. The deregulated market encourages competition at the generation and distribution levels through communication with the central system operator. A software platform was developed and executed to manage this communication as well as the energy management of the overall system. The platform also demonstrated the self-healing property of the system when a fault occurs and causes an outage: the system not only recovered from the fault but did so quickly, with minimal human involvement.
Functional Fault Modeling Conventions and Practices for Real-Time Fault Isolation
NASA Technical Reports Server (NTRS)
Ferrell, Bob; Lewis, Mark; Perotti, Jose; Oostdyk, Rebecca; Brown, Barbara
2010-01-01
The purpose of this paper is to present the conventions, best practices, and processes that were established based on the prototype development of a Functional Fault Model (FFM) for a Cryogenic System that would be used for real-time Fault Isolation in a Fault Detection, Isolation, and Recovery (FDIR) system. The FDIR system is envisioned to perform health management functions for both a launch vehicle and the ground systems that support the vehicle during checkout and launch countdown by using a suite of complementary software tools that alert operators to anomalies and failures in real-time. The FFMs were created offline but would eventually be used by a real-time reasoner to isolate faults in a Cryogenic System. Through their development and review, a set of modeling conventions and best practices were established. The prototype FFM development also provided a pathfinder for future FFM development processes. This paper documents the rationale and considerations for robust FFMs that can easily be transitioned to a real-time operating environment.
Fault Diagnosis of Power Systems Using Intelligent Systems
NASA Technical Reports Server (NTRS)
Momoh, James A.; Oliver, Walter E., Jr.
1996-01-01
The power system operator's need for a reliable power delivery system calls for a real-time or near-real-time AI-based fault diagnosis tool. Such a tool will allow NASA ground controllers to re-establish a normal or near-normal degraded operating state of the EPS (a DC power system) for Space Station Alpha by isolating the faulted branches and loads of the system and, after isolation, re-energizing those branches and loads found to be free of faults. A proposed solution involves using the Fault Diagnosis Intelligent System (FDIS) to perform near-real-time fault diagnosis of Alpha's EPS by downloading power transient telemetry at fault time from onboard data loggers. The FDIS uses an ANN clustering algorithm augmented with a wavelet-transform feature extractor. This combination enables the system to perform pattern recognition on the power transient signatures to diagnose the fault type and its location down to the orbital replaceable unit. FDIS has been tested using a simulation of the LeRC Testbed Space Station Freedom configuration, including the topology from the DDCUs to the electrical loads attached to the TPDUs. FDIS works in conjunction with the Power Management Load Scheduler to determine the state of the system at the time of the fault condition. This information is used to activate the appropriate diagnostic section and, if necessary, to refine the solution obtained. In the latter case, if the FDIS reports that the faulty device is equally likely to be 'star tracker #1' or the 'time generation unit,' then, based on a priori knowledge of the system's state, the refined solution would be 'star tracker #1' located in cabinet ITAS2.
It is concluded from the present studies that artificial intelligence diagnostic abilities are improved by the addition of the wavelet transform, and that when a system such as FDIS is coupled to the Power Management Load Scheduler, a faulty device can be located and isolated from the rest of the system. These studies give NASA the ability to quickly restore a space station's operating status from a critical state to a safe degraded mode, thereby saving costs in experiment rescheduling, fault diagnostics, and prevention of loss of life.
Planning and Resource Management in an Intelligent Automated Power Management System
NASA Technical Reports Server (NTRS)
Morris, Robert A.
1991-01-01
Power system management is the process of guiding a power system toward the objective of continuous supply of electrical power to a set of loads. Spacecraft power system management requires planning and scheduling, since electrical power is a scarce resource in space. The automation of power system management for future spacecraft has been recognized as an important R&D goal. Several automation technologies have emerged, including the use of expert systems to automate human problem-solving capabilities, such as rule-based expert systems for fault diagnosis and load scheduling. It is questionable whether current-generation expert system technology is applicable to power system management in space. The objective of ADEPTS (ADvanced Electrical Power management Techniques for Space systems) is to study new techniques for power management automation. These techniques involve integrating current expert system technology with parallel and distributed computing, as well as a distributed, object-oriented approach to software design. The focus of the current study is the integration of new procedures for automatically planning and scheduling loads with procedures for performing fault diagnosis and control. The objective is the concurrent execution of both sets of tasks on separate transputer processors, thus adding parallelism to the overall management process.
Intelligent Operation and Maintenance of Micro-grid Technology and System Development
NASA Astrophysics Data System (ADS)
Fu, Ming; Song, Jinyan; Zhao, Jingtao; Du, Jian
2018-01-01
To achieve intelligent micro-grid operation and management, a micro-grid operation and maintenance knowledge base is studied. Based on advanced Petri net theory, a fault diagnosis model of the micro-grid is established, and an intelligent diagnosis and analysis method for micro-grid faults is put forward. On this basis, the functional system and architecture of an intelligent operation and maintenance system for the micro-grid are studied, and its fault diagnosis function is introduced in detail. Finally, the system is deployed on the micro-grid of a park, and fault diagnosis and analysis is carried out on live micro-grid operation data. The system's operation and maintenance function interface is presented, which verifies the correctness and reliability of the system.
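The Petri-net-style diagnosis the abstract describes can be sketched minimally: places carry alarm tokens, and a transition fires when all of its input alarms are present, depositing a token in a diagnostic-conclusion place. The place and transition names below are invented for illustration; this is not the paper's actual model.

```python
class PetriNet:
    """Minimal place/transition net: a transition fires when every input
    place holds a token, moving tokens toward diagnostic conclusions."""
    def __init__(self, marking):
        self.marking = dict(marking)      # place -> token count
        self.transitions = []             # list of (input places, output places)

    def add_transition(self, inputs, outputs):
        self.transitions.append((inputs, outputs))

    def step(self):
        """Fire the first enabled transition; return False if none can fire."""
        for inputs, outputs in self.transitions:
            if all(self.marking.get(p, 0) > 0 for p in inputs):
                for p in inputs:
                    self.marking[p] -= 1
                for p in outputs:
                    self.marking[p] = self.marking.get(p, 0) + 1
                return True
        return False

# Two alarm places both marked -> the diagnosis transition fires.
net = PetriNet({"feeder_overcurrent": 1, "breaker_tripped": 1})
net.add_transition(["feeder_overcurrent", "breaker_tripped"], ["fault_on_feeder"])
net.step()
```

After the step, the token in `fault_on_feeder` represents the inferred diagnosis; real micro-grid models would chain many such transitions.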
An expert systems approach to automated fault management in a regenerative life support subsystem
NASA Technical Reports Server (NTRS)
Malin, J. T.; Lance, N., Jr.
1986-01-01
This paper describes FIXER, a prototype expert system for automated fault management in a regenerative life support subsystem typical of Space Station applications. The development project provided an evaluation of the use of expert systems technology to enhance controller functions in space subsystems. The software development approach permitted evaluation of the effectiveness of direct involvement of the expert in design and development. The approach also permitted intensive observation of the knowledge and methods of the expert. This paper describes the development of the prototype expert system and presents results of the evaluation.
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Schreckenghost, Debra K.
2001-01-01
The Adjustable Autonomy Testbed (AAT) is a simulation-based testbed located in the Intelligent Systems Laboratory in the Automation, Robotics and Simulation Division at NASA Johnson Space Center. The purpose of the testbed is to support evaluation and validation of prototypes of adjustable autonomous agent software for control and fault management for complex systems. The AAT project has developed prototype adjustable autonomous agent software and human interfaces for cooperative fault management. This software builds on current autonomous agent technology by altering the architecture, components and interfaces for effective teamwork between autonomous systems and human experts. Autonomous agents include a planner, flexible executive, low-level control and deductive model-based fault isolation. Adjustable autonomy is intended to increase the flexibility and effectiveness of fault management with an autonomous system. The test domain for this work is control of advanced life support systems for habitats for planetary exploration. The CONFIG hybrid discrete event simulation environment provides flexible and dynamically reconfigurable models of the behavior of components and fluids in the life support systems. Both discrete event and continuous (discrete time) simulation are supported, and flows and pressures are computed globally. This provides fast dynamic simulations of interacting hardware systems in closed loops that can be reconfigured during operations scenarios, producing complex cascading effects of operations and failures. Current object-oriented model libraries support modeling of fluid systems, and models have been developed of physico-chemical and biological subsystems for processing advanced life support gases. In FY01, water recovery system models will be developed.
Modeling, Detection, and Disambiguation of Sensor Faults for Aerospace Applications
NASA Technical Reports Server (NTRS)
Balaban, Edward; Saxena, Abhinav; Bansal, Prasun; Goebel, Kai F.; Curran, Simon
2009-01-01
Sensor faults continue to be a major hurdle for systems health management to reach its full potential. At the same time, few recorded instances of sensor faults exist. It is equally difficult to seed particular sensor faults. Therefore, research is underway to better understand the different fault modes seen in sensors and to model the faults. The fault models can then be used in simulated sensor fault scenarios to ensure that algorithms can distinguish between sensor faults and system faults. The paper illustrates the work with data collected from an electro-mechanical actuator in an aerospace setting, equipped with temperature, vibration, current, and position sensors. The most common sensor faults, such as bias, drift, scaling, and dropout were simulated and injected into the experimental data, with the goal of making these simulations as realistic as feasible. A neural network based classifier was then created and tested on both experimental data and the more challenging randomized data sequences. Additional studies were also conducted to determine sensitivity of detection and disambiguation efficacy to severity of fault conditions.
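The paper does not include its injection code; a minimal sketch of the four fault modes it names (bias, drift, scaling, dropout), with hypothetical parameter choices, might look like:

```python
import numpy as np

def inject_fault(signal, mode, start, magnitude=1.0, rate=0.01):
    """Inject a simulated sensor fault into a clean signal from index `start` on."""
    faulty = signal.astype(float).copy()
    n = len(signal)
    if mode == "bias":        # constant offset added to all later samples
        faulty[start:] += magnitude
    elif mode == "drift":     # offset that grows linearly with time
        faulty[start:] += rate * np.arange(n - start)
    elif mode == "scaling":   # multiplicative gain error
        faulty[start:] *= magnitude
    elif mode == "dropout":   # sensor freezes at its last good value
        faulty[start:] = faulty[start - 1]
    else:
        raise ValueError(f"unknown fault mode: {mode}")
    return faulty

clean = np.sin(np.linspace(0, 10, 200))
biased = inject_fault(clean, "bias", start=100, magnitude=0.5)
```

Faulted traces like `biased` can then be mixed with nominal and system-fault data to test whether a classifier disambiguates sensor faults from system faults.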
NASA Technical Reports Server (NTRS)
Lo, Yunnhon; Johnson, Stephen B.; Breckenridge, Jonathan T.
2014-01-01
This paper describes the quantitative application of the theory of System Health Management and its operational subset, Fault Management, to the selection of abort triggers for a human-rated launch vehicle, the United States' National Aeronautics and Space Administration's (NASA) Space Launch System (SLS). The results demonstrate the efficacy of the theory to assess the effectiveness of candidate failure detection and response mechanisms to protect humans from time-critical and severe hazards. The quantitative method was successfully used on the SLS to aid selection of its suite of abort triggers.
Emerging technologies for V&V of ISHM software for space exploration
NASA Technical Reports Server (NTRS)
Feather, Martin S.; Markosian, Lawrence Z.
2006-01-01
Systems required to exhibit high operational reliability often rely on some form of fault protection to recognize and respond to faults, preventing faults' escalation to catastrophic failures. Integrated System Health Management (ISHM) extends the functionality of fault protection both to scale to more complex systems (and systems of systems) and to maintain capability rather than just avert catastrophe. Forms of ISHM have been utilized to good effect in the maintenance phase of systems' total lifecycles (often referred to as 'condition-based maintenance'), but less so in a 'fault protection' role during actual operations. One of the impediments to such use lies in the challenges of verification, validation, and certification of ISHM systems themselves. This paper makes the case that state-of-the-practice V&V and certification techniques will not suffice for emerging forms of ISHM systems; however, a number of maturing software engineering assurance technologies show particular promise for addressing these ISHM V&V challenges.
Managing Fault Management Development
NASA Technical Reports Server (NTRS)
McDougal, John M.
2010-01-01
As the complexity of space missions grows, development of Fault Management (FM) capabilities is an increasingly common driver for significant cost overruns late in the development cycle. FM issues and the resulting cost overruns are rarely caused by a lack of technology, but rather by a lack of planning and emphasis by project management. A recent NASA FM Workshop brought together FM practitioners from a broad spectrum of institutions, mission types, and functional roles to identify the drivers underlying FM overruns and recommend solutions. They identified a number of areas in which increased program and project management focus can be used to control FM development cost growth. These include up-front planning for FM as a distinct engineering discipline; managing different, conflicting, and changing institutional goals and risk postures; ensuring the necessary resources for a disciplined, coordinated approach to end-to-end fault management engineering; and monitoring FM coordination across all mission systems.
Software reliability through fault-avoidance and fault-tolerance
NASA Technical Reports Server (NTRS)
Vouk, Mladen A.; Mcallister, David F.
1993-01-01
Strategies and tools for the testing, risk assessment and risk control of dependable software-based systems were developed. Part of this project consists of studies to enable the transfer of technology to industry, for example the risk management techniques for safety-conscious systems. Theoretical investigations of the Boolean and Relational Operator (BRO) testing strategy were conducted for condition-based testing. The Basic Graph Generation and Analysis tool (BGG) was extended to fully incorporate several variants of the BRO metric. Single- and multi-phase risk, coverage and time-based models are being developed to provide additional theoretical and empirical basis for estimation of the reliability and availability of large, highly dependable software. A model for software process and risk management was developed. The use of cause-effect graphing for software specification and validation was investigated. Lastly, advanced software fault-tolerance models were studied to provide alternatives and improvements in situations where simple software fault-tolerance strategies break down.
Modeling Off-Nominal Behavior in SysML
NASA Technical Reports Server (NTRS)
Day, John; Donahue, Kenny; Ingham, Mitch; Kadesch, Alex; Kennedy, Kit; Post, Ethan
2012-01-01
Fault Management is an essential part of the system engineering process that is limited in its effectiveness by the ad hoc nature of the applied approaches and methods. Providing a rigorous way to develop and describe off-nominal behavior is a necessary step in the improvement of fault management and, as a result, will enable safe, reliable and available systems even as system complexity increases. The basic concepts described in this paper provide a foundation on which to build a larger set of necessary concepts and relationships for precise modeling of off-nominal behavior, and a basis for incorporating these ideas into the overall systems engineering process. The simple FMEA example provided applies the modeling patterns we have developed and illustrates how the information in the model can be used to reason about the system and derive typical fault management artifacts. A key insight from the FMEA work was the utility of defining failure modes as the "inverse of intent" and deriving them from the behavior models. Additional work is planned to extend these ideas and capabilities to other types of relevant information and additional products.
A Unified Nonlinear Adaptive Approach for Detection and Isolation of Engine Faults
NASA Technical Reports Server (NTRS)
Tang, Liang; DeCastro, Jonathan A.; Zhang, Xiaodong; Farfan-Ramos, Luis; Simon, Donald L.
2010-01-01
A challenging problem in aircraft engine health management (EHM) system development is to detect and isolate faults in system components (i.e., compressor, turbine), actuators, and sensors. Existing nonlinear EHM methods often deal with component faults, actuator faults, and sensor faults separately, which may potentially lead to incorrect diagnostic decisions and unnecessary maintenance. Therefore, it would be ideal to address sensor faults, actuator faults, and component faults under one unified framework. This paper presents a systematic and unified nonlinear adaptive framework for detecting and isolating sensor faults, actuator faults, and component faults for aircraft engines. The fault detection and isolation (FDI) architecture consists of a parallel bank of nonlinear adaptive estimators. Adaptive thresholds are appropriately designed such that, in the presence of a particular fault, all components of the residual generated by the adaptive estimator corresponding to the actual fault type remain below their thresholds. If the faults are sufficiently different, then at least one component of the residual generated by each remaining adaptive estimator should exceed its threshold. Therefore, based on the specific response of the residuals, sensor faults, actuator faults, and component faults can be isolated. The effectiveness of the approach was evaluated using the NASA C-MAPSS turbofan engine model, and simulation results are presented.
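The isolation logic described, a parallel bank of estimators whose residuals are compared componentwise against thresholds, can be reduced to a toy sketch. For simplicity the thresholds below are fixed arrays, whereas the paper designs them adaptively; the numbers are invented.

```python
def isolate_fault(residuals, thresholds):
    """
    Residual-based fault isolation over a bank of estimators.
    residuals[k] is the residual vector produced by estimator k (one per
    hypothesized fault class); thresholds[k] is its componentwise bound.
    The isolated fault is the hypothesis whose estimator keeps every
    residual component below its threshold while all others exceed theirs.
    Returns the index of the isolated hypothesis, or None if ambiguous.
    """
    consistent = [
        k for k, (res, thr) in enumerate(zip(residuals, thresholds))
        if all(abs(r) <= t for r, t in zip(res, thr))
    ]
    return consistent[0] if len(consistent) == 1 else None

# Three hypothesized faults: only estimator 1's residuals stay in bounds,
# so the fault is isolated to hypothesis 1.
residuals = [[0.9, 0.2], [0.05, 0.1], [0.4, 0.8]]
thresholds = [[0.3, 0.3], [0.3, 0.3], [0.3, 0.3]]
fault = isolate_fault(residuals, thresholds)
```

The `None` return corresponds to the paper's caveat that faults must be "sufficiently different" for every wrong-hypothesis estimator to violate at least one threshold.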
Advanced cloud fault tolerance system
NASA Astrophysics Data System (ADS)
Sumangali, K.; Benny, Niketa
2017-11-01
Cloud computing has become a prevalent on-demand service on the Internet for storing, managing and processing data. A pitfall that accompanies cloud computing is the failures that can be encountered in the cloud. To overcome these failures, a fault tolerance mechanism is required to abstract faults away from users. We have proposed a fault-tolerant architecture that combines proactive and reactive fault tolerance. This architecture essentially increases the reliability and the availability of the cloud. In future work, we would like to evaluate our proposed architecture against existing architectures and improve it further.
Evolution of shuttle avionics redundancy management/fault tolerance
NASA Technical Reports Server (NTRS)
Boykin, J. C.; Thibodeau, J. R.; Schneider, H. E.
1985-01-01
The challenge of providing redundancy management (RM) and fault tolerance to meet the Shuttle Program requirements of fail operational/fail safe for the avionics systems was complicated by the critical program constraints of weight, cost, and schedule. The basic and sometimes false effectivity of less than pure RM designs is addressed. Evolution of the multiple input selection filter (the heart of the RM function) is discussed with emphasis on the subtle interactions of the flight control system that were found to be potentially catastrophic. Several other general RM development problems are discussed, with particular emphasis on the inertial measurement unit RM, indicative of the complexity of managing that three string system and its critical interfaces with the guidance and control systems.
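The multiple input selection filter at the heart of the RM function can be illustrated with a mid-value select over three redundant channels, which outvotes a single hard failure and flags the miscomparing channel. This is a generic sketch of the technique, not the actual Shuttle implementation; the tolerance and values are invented.

```python
def mid_value_select(a, b, c, tolerance):
    """
    Three-string redundancy management: command the middle of three
    redundant inputs, and flag any channel that miscompares with the
    selected value by more than the tolerance.
    """
    selected = sorted((a, b, c))[1]   # middle value rejects one hard failure
    miscompares = [i for i, v in enumerate((a, b, c))
                   if abs(v - selected) > tolerance]
    return selected, miscompares

# A hard-over failure on channel 2 is outvoted and flagged.
value, failed = mid_value_select(10.1, 10.0, 99.9, tolerance=0.5)
```

The subtle flight-control interactions the paper mentions arise precisely because the selected value can jump discontinuously when a channel is declared failed and removed from the vote.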
Managing moral hazard in motor vehicle accident insurance claims.
Ebrahim, Shanil; Busse, Jason W; Guyatt, Gordon H; Birch, Stephen
2013-05-01
Motor vehicle accident (MVA) insurance in Canada is based primarily on two different compensation systems: (i) no-fault, in which policyholders are unable to seek recovery for losses caused by other parties (unless they have specified dollar or verbal thresholds) and (ii) tort, in which policyholders may seek general damages. As insurance companies pay for MVA-related health care costs, excess use of health care services may occur as a result of consumers' (accident victims) and/or producers' (health care providers) behavior - often referred to as the moral hazard of insurance. In the United States, moral hazard is greater for low dollar threshold no-fault insurance compared with tort systems. In Canada, high dollar threshold or pure no-fault versus tort systems are associated with faster patient recovery and reduced MVA claims. These findings suggest that high threshold no-fault or pure no-fault compensation systems may be associated with improved outcomes for patients and reduced moral hazard.
The Design of a Fault-Tolerant COTS-Based Bus Architecture for Space Applications
NASA Technical Reports Server (NTRS)
Chau, Savio N.; Alkalai, Leon; Tai, Ann T.
2000-01-01
The high-performance, scalability and miniaturization requirements together with the power, mass and cost constraints mandate the use of commercial-off-the-shelf (COTS) components and standards in the X2000 avionics system architecture for deep-space missions. In this paper, we report our experiences and findings on the design of an IEEE 1394 compliant fault-tolerant COTS-based bus architecture. While the COTS standard IEEE 1394 adequately supports power management, high performance and scalability, its topological criteria impose restrictions on fault tolerance realization. To circumvent the difficulties, we derive a "stack-tree" topology that not only complies with the IEEE 1394 standard but also facilitates fault tolerance realization in a spaceborne system with limited dedicated resource redundancies. Moreover, by exploiting pertinent standard features of the 1394 interface which are not purposely designed for fault tolerance, we devise a comprehensive set of fault detection mechanisms to support the fault-tolerant bus architecture.
Advanced information processing system: Local system services
NASA Technical Reports Server (NTRS)
Burkhardt, Laura; Alger, Linda; Whittredge, Roy; Stasiowski, Peter
1989-01-01
The Advanced Information Processing System (AIPS) is a multi-computer architecture composed of hardware and software building blocks that can be configured to meet a broad range of application requirements. The hardware building blocks are fault-tolerant, general-purpose computers, fault-and damage-tolerant networks (both computer and input/output), and interfaces between the networks and the computers. The software building blocks are the major software functions: local system services, input/output, system services, inter-computer system services, and the system manager. The foundation of the local system services is an operating system with the functions required for a traditional real-time multi-tasking computer, such as task scheduling, inter-task communication, memory management, interrupt handling, and time maintenance. Resting on this foundation are the redundancy management functions necessary in a redundant computer and the status reporting functions required for an operator interface. The functional requirements, functional design and detailed specifications for all the local system services are documented.
A diagnosis system using object-oriented fault tree models
NASA Technical Reports Server (NTRS)
Iverson, David L.; Patterson-Hine, F. A.
1990-01-01
Spaceborne computing systems must provide reliable, continuous operation for extended periods. Due to weight, power, and volume constraints, these systems must manage resources very effectively. A fault diagnosis algorithm is described which enables fast and flexible diagnoses in the dynamic distributed computing environments planned for future space missions. The algorithm uses a knowledge base that is easily changed and updated to reflect current system status. Augmented fault trees represented in an object-oriented form provide deep system knowledge that is easy to access and revise as a system changes. Given such a fault tree, a set of failure events that have occurred, and a set of failure events that have not occurred, this diagnosis system uses forward and backward chaining to propagate causal and temporal information about other failure events in the system being diagnosed. Once the system has established temporal and causal constraints, it reasons backward from heuristically selected failure events to find a set of basic failure events which are a likely cause of the occurrence of the top failure event in the fault tree. The diagnosis system has been implemented in Common Lisp using Flavors.
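The backward-chaining step over a fault tree can be sketched as follows. This is a simplified Python illustration rather than the original Common Lisp/Flavors implementation; the gate structure and event names are hypothetical, and the temporal-constraint reasoning is omitted.

```python
# Each gate maps an event to ("AND" | "OR", [child events]); leaves are basic events.
def candidate_causes(tree, event, absent):
    """
    Backward-chain through a fault tree to collect basic failure events
    that could explain `event`, given events known NOT to have occurred
    (negative evidence prunes whole branches).
    """
    if event in absent:
        return set()                      # ruled out by negative evidence
    if event not in tree:
        return {event}                    # basic (leaf) failure event
    gate, children = tree[event]
    if gate == "OR":                      # any child branch suffices as a cause
        causes = set()
        for child in children:
            causes |= candidate_causes(tree, child, absent)
        return causes
    # AND gate: every child must be explainable, or the branch is pruned
    causes = set()
    for child in children:
        sub = candidate_causes(tree, child, absent)
        if not sub:
            return set()
        causes |= sub
    return causes

tree = {
    "power_loss": ("OR", ["bus_fault", "dual_supply_fail"]),
    "dual_supply_fail": ("AND", ["supply_A_fail", "supply_B_fail"]),
}
# supply B is known good, so the AND branch is pruned; only the bus remains.
causes = candidate_causes(tree, "power_loss", absent={"supply_B_fail"})
```

Because the tree is plain data, updating the knowledge base to reflect current system status amounts to editing the dictionary, echoing the paper's emphasis on an easily revised object-oriented representation.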
Manned spacecraft automation and robotics
NASA Technical Reports Server (NTRS)
Erickson, Jon D.
1987-01-01
The Space Station holds promise of being a showcase user and driver of advanced automation and robotics technology. The author addresses the advances in automation and robotics from the Space Shuttle - with its high-reliability redundancy management and fault tolerance design and its remote manipulator system - to the projected knowledge-based systems for monitoring, control, fault diagnosis, planning, and scheduling, and the telerobotic systems of the future Space Station.
NASA Astrophysics Data System (ADS)
Haziza, M.
1990-10-01
The DIAMS satellite fault isolation expert system shell concept is described. The project, initiated in 1985, has led to the development of a prototype Expert System (ES) dedicated to the Telecom 1 attitude and orbit control system. The prototype ES has been installed in the Telecom 1 satellite control center and evaluated by Telecom 1 operations. The development of a fault isolation ES covering a whole spacecraft (the French telecommunication satellite Telecom 2) is currently being undertaken. Full scale industrial applications raise stringent requirements in terms of knowledge management and software development methodology. The approach used by MATRA ESPACE to face this challenge is outlined.
NASA Astrophysics Data System (ADS)
Kratov, Sergey
2018-01-01
Modern information systems designed to serve a wide range of users, regardless of their subject area, are increasingly based on Web technologies and are available to users via the Internet. The article discusses the issues of providing fault-tolerant operation of such information systems built on free and open-source content management systems. The toolkit available to administrators of such systems is shown, and scenarios for using these tools are described. Options for organizing backups and restoring system operability after failures are suggested. Application of the proposed methods and approaches provides continuous monitoring of system state, timely response to emerging problems, and their prompt solution.
Fault Tolerance Middleware for a Multi-Core System
NASA Technical Reports Server (NTRS)
Some, Raphael R.; Springer, Paul L.; Zima, Hans P.; James, Mark; Wagner, David A.
2012-01-01
Fault Tolerance Middleware (FTM) provides a framework to run on a dedicated core of a multi-core system and handles detection of single-event upsets (SEUs), and the responses to those SEUs, occurring in an application running on multiple cores of the processor. This software was written expressly for a multi-core system and can support different kinds of fault strategies, such as introspection, algorithm-based fault tolerance (ABFT), and triple modular redundancy (TMR). It focuses on providing fault tolerance for the application code, and represents the first step in a plan to eventually include fault tolerance in message passing and the FTM itself. In the multi-core system, the FTM resides on a single, dedicated core, separate from the cores used by the application. This is done in order to isolate the FTM from application faults and to allow it to swap out any application core for a substitute. The structure of the FTM consists of an interface to a fault tolerant strategy module, a responder module, a fault manager module, an error factory, and an error mapper that determines the severity of the error. In the present reference implementation, the only fault tolerant strategy implemented is introspection. The introspection code waits for an application node to send an error notification to it. It then uses the error factory to create an error object, and at this time, a severity level is assigned to the error. The introspection code uses its built-in knowledge base to generate a recommended response to the error. Responses might include ignoring the error, logging it, rolling back the application to a previously saved checkpoint, swapping in a new node to replace a bad one, or restarting the application. The original error and recommended response are passed to the top-level fault manager module, which invokes the response. The responder module also notifies the introspection module of the generated response. 
This provides additional information to the introspection module that it can use in generating its next response. For example, if the responder triggers an application rollback and errors are still occurring, the introspection module may decide to recommend an application restart.
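The loop described, an error factory assigning severity, a knowledge-base recommendation, and responder feedback that escalates when a rollback fails to clear the errors, might be sketched as below. The severity table, error kinds, and escalation rule are illustrative assumptions, not the actual FTM interfaces.

```python
# Hypothetical severity and response tables standing in for FTM's
# error factory, error mapper, and introspection knowledge base.
SEVERITY = {"transient_seu": 1, "repeated_seu": 2, "node_unresponsive": 3}
RESPONSES = {1: "log", 2: "rollback_to_checkpoint", 3: "swap_in_spare_node"}

class Introspector:
    def __init__(self):
        self.history = []     # responses already issued (responder feedback)

    def recommend(self, error_kind):
        severity = SEVERITY.get(error_kind, 3)   # unknown errors treated as severe
        response = RESPONSES[severity]
        # Feedback loop: if a rollback was already tried and errors persist,
        # escalate to an application restart.
        if response == "rollback_to_checkpoint" and response in self.history:
            response = "restart_application"
        self.history.append(response)
        return response

ftm = Introspector()
first = ftm.recommend("repeated_seu")    # rollback on the first occurrence
second = ftm.recommend("repeated_seu")   # escalate when the error recurs
```

Running the introspector on its own dedicated core, as the article describes, keeps this decision logic isolated from the very faults it is judging.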
Software Health Management with Bayesian Networks
NASA Technical Reports Server (NTRS)
Mengshoel, Ole; Schumann, Johann
2011-01-01
Most modern aircraft, as well as other complex machinery, are equipped with diagnostics systems for their major subsystems. During operation, sensors provide important information about a subsystem (e.g., the engine), and that information is used to detect and diagnose faults. Most of these systems focus on the monitoring of a mechanical, hydraulic, or electromechanical subsystem of the vehicle or machinery. Only recently have health management systems that monitor software been developed. In this paper, we discuss our approach of using Bayesian networks for Software Health Management (SWHM). We discuss SWHM requirements, which make advanced reasoning capabilities important for detection and diagnosis. We then present our approach to using Bayesian networks for the construction of health models that dynamically monitor a software system and are capable of detecting and diagnosing faults.
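A minimal illustration of the idea: a two-monitor Bayesian health model with exact inference by enumeration. The network structure and probabilities below are invented for illustration and are far simpler than the health models the paper describes.

```python
# Tiny illustrative health model: a latent software fault influences two
# observable monitors (a bad-output detector and a heartbeat monitor).
P_FAULT = 0.01                             # prior P(fault)
P_BAD_OUTPUT = {True: 0.9, False: 0.05}    # P(bad_output | fault)
P_MISSED_BEAT = {True: 0.8, False: 0.02}   # P(missed_heartbeat | fault)

def posterior_fault(bad_output, missed_beat):
    """Exact inference by enumeration: P(fault | the two observations)."""
    def joint(fault):
        p = P_FAULT if fault else 1 - P_FAULT
        p *= P_BAD_OUTPUT[fault] if bad_output else 1 - P_BAD_OUTPUT[fault]
        p *= P_MISSED_BEAT[fault] if missed_beat else 1 - P_MISSED_BEAT[fault]
        return p
    num = joint(True)
    return num / (num + joint(False))

p = posterior_fault(bad_output=True, missed_beat=True)
```

Combining both pieces of evidence raises the posterior from the 1% prior to roughly 88%, which is the kind of evidence fusion that makes Bayesian networks attractive for SWHM.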
A fuzzy logic intelligent diagnostic system for spacecraft integrated vehicle health management
NASA Technical Reports Server (NTRS)
Wu, G. Gordon
1995-01-01
Due to the complexity of future space missions and the large amount of data involved, greater autonomy in data processing is demanded for mission operations, training, and vehicle health management. In this paper, we develop a fuzzy logic intelligent diagnostic system to perform data reduction, data analysis, and fault diagnosis for spacecraft vehicle health management applications. The diagnostic system contains a data filter and an inference engine. The data filter is designed to intelligently select only the necessary data for analysis, while the inference engine is designed for failure detection, warning, and decision on corrective actions using fuzzy logic synthesis. Due to its adaptive nature and on-line learning ability, the diagnostic system is capable of dealing with environmental noise, uncertainties, conflict information, and sensor faults.
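The inference-engine idea can be illustrated with a toy fuzzy rule base. The membership functions, variables, and rules below are hypothetical, and the sketch omits the adaptive, on-line learning aspects the paper describes.

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def failure_warning(temp, vibration):
    """
    Two illustrative single-antecedent rules, combined by max (fuzzy OR):
      IF temp is high         THEN warning
      IF vibration is severe  THEN warning
    Returns the warning degree in [0, 1].
    """
    temp_high = tri(temp, 60.0, 90.0, 120.0)
    vib_severe = tri(vibration, 5.0, 9.0, 13.0)
    return max(temp_high, vib_severe)

w = failure_warning(temp=85.0, vibration=3.0)
```

Graded outputs like `w` let the inference engine escalate from warning to corrective action gradually, rather than at a single crisp threshold, which is what gives the fuzzy approach its tolerance of noise and conflicting sensor readings.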
Resilience Design Patterns - A Structured Approach to Resilience at Extreme Scale (version 1.0)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hukerikar, Saurabh; Engelmann, Christian
Reliability is a serious concern for future extreme-scale high-performance computing (HPC) systems. Projections based on the current generation of HPC systems and technology roadmaps suggest very high fault rates in future systems. The errors resulting from these faults will propagate and generate various kinds of failures, which may result in outcomes ranging from result corruptions to catastrophic application crashes. Practical limits on power consumption in HPC systems will require future systems to embrace innovative architectures, increasing the levels of hardware and software complexity. The resilience challenge for extreme-scale HPC systems requires management of various hardware and software technologies that are capable of handling a broad set of fault models at accelerated fault rates. These techniques must seek to improve resilience at reasonable overheads to power consumption and performance. While the HPC community has developed various solutions, application-level as well as system-based, the solution space of HPC resilience techniques remains fragmented. There are no formal methods and metrics to investigate and evaluate resilience holistically in HPC systems that consider impact scope, handling coverage, and performance and power efficiency across the system stack. Additionally, few of the current approaches are portable to newer architectures and software ecosystems, which are expected to be deployed on future systems. In this document, we develop a structured approach to the management of HPC resilience based on the concept of resilience-based design patterns. A design pattern is a general repeatable solution to a commonly occurring problem. We identify the commonly occurring problems and solutions used to deal with faults, errors and failures in HPC systems. The catalog of resilience design patterns provides designers with reusable design elements.
We define a design framework that enhances our understanding of the important constraints and opportunities for solutions deployed at various layers of the system stack. The framework may be used to establish mechanisms and interfaces to coordinate flexible fault management across hardware and software components. The framework also enables optimization of the cost-benefit trade-offs among performance, resilience, and power consumption. The overall goal of this work is to enable a systematic methodology for the design and evaluation of resilience technologies in extreme-scale HPC systems that keep scientific applications running to a correct solution in a timely and cost-efficient manner in spite of frequent faults, errors, and failures of various types.
Partitioning in Avionics Architectures: Requirements, Mechanisms, and Assurance
NASA Technical Reports Server (NTRS)
Rushby, John
1999-01-01
Automated aircraft control has traditionally been divided into distinct "functions" that are implemented separately (e.g., autopilot, autothrottle, flight management); each function has its own fault-tolerant computer system, and dependencies among different functions are generally limited to the exchange of sensor and control data. A by-product of this "federated" architecture is that faults are strongly contained within the computer system of the function where they occur and cannot readily propagate to affect the operation of other functions. More modern avionics architectures contemplate supporting multiple functions on a single, shared, fault-tolerant computer system where natural fault containment boundaries are less sharply defined. Partitioning uses appropriate hardware and software mechanisms to restore strong fault containment to such integrated architectures. This report examines the requirements for partitioning, mechanisms for their realization, and issues in providing assurance for partitioning. Because partitioning shares some concerns with computer security, security models are reviewed and compared with the concerns of partitioning.
A PC based fault diagnosis expert system
NASA Technical Reports Server (NTRS)
Marsh, Christopher A.
1990-01-01
The Integrated Status Assessment (ISA) prototype expert system performs system level fault diagnosis using rules and models created by the user. The ISA evolved from concepts to a stand-alone demonstration prototype using OPS5 on a LISP Machine. The LISP based prototype was rewritten in C and the C Language Integrated Production System (CLIPS) to run on a Personal Computer (PC) and a graphics workstation. The ISA prototype has been used to demonstrate fault diagnosis functions of Space Station Freedom's Operation Management System (OMS). This paper describes the development of the ISA prototype from early concepts to the current PC/workstation version used today and describes future areas of development for the prototype.
Managing Risk to Ensure a Successful Cassini/Huygens Saturn Orbit Insertion (SOI)
NASA Technical Reports Server (NTRS)
Witkowski, Mona M.; Huh, Shin M.; Burt, John B.; Webster, Julie L.
2004-01-01
I. Design: a) spacecraft designed to be largely single-fault tolerant; b) operate within the flight-demonstrated envelope, with margin; c) strict compliance with requirements and flight rules.
II. Test: a) baseline, fault, and stress testing using flight system testbeds (hardware and software); b) in-flight checkout and demonstrations to remove first-time events.
III. Failure Analysis: a) critical-event-driven fault tree analysis; b) risk mitigation and development of contingencies.
IV. Residual Risks: a) accepted pre-launch waivers to single point failures; b) unavoidable risks (e.g., natural disaster).
V. Mission Assurance: a) strict process for characterization of variances (ISAs, PFRs, and waivers); b) a full-time Mission Assurance Manager reporting to the Program Manager, providing: 1) independent assessment of compliance with institutional standards; 2) oversight and risk assessment of ISAs, PFRs, waivers, etc.; 3) Risk Management Process facilitation.
NASA Technical Reports Server (NTRS)
Abbott, Kathy
1990-01-01
The objective of the research in this area of fault management is to develop and implement a decision aiding concept for diagnosing faults, especially faults which are difficult for pilots to identify, and to develop methods for presenting the diagnosis information to the flight crew in a timely and comprehensible manner. The requirements for the diagnosis concept were identified by interviewing pilots, analyzing actual incident and accident cases, and examining psychology literature on how humans perform diagnosis. The diagnosis decision aiding concept developed from those requirements takes abnormal sensor readings as input, as identified by a fault monitor. Based on these abnormal sensor readings, the diagnosis concept identifies the cause or source of the fault and all components affected by it. This concept was implemented for diagnosis of aircraft propulsion and hydraulic subsystems in a computer program called Draphys (Diagnostic Reasoning About Physical Systems). Draphys is unique in two important ways. First, it uses models of both functional and physical relationships in the subsystems. Using both models enables the diagnostic reasoning to identify fault propagation as the faulted system continues to operate, and to diagnose physical damage. Draphys also reasons about the behavior of the faulted system over time, to eliminate possibilities as more information becomes available and to update the system status as more components are affected by the fault. The crew interface research is examining display issues associated with presenting diagnosis information to the flight crew. One study examined issues for presenting system status information. One lesson learned from that study was that pilots found fault situations to be more complex if they involved multiple subsystems. Another was that pilots could identify the faulted systems more quickly if the system status was presented in pictorial or text format.
Another study is currently under way to examine pilot mental models of the aircraft subsystems and their use in diagnosis tasks. Future research plans include piloted simulation evaluation of the diagnosis decision aiding concepts and crew interface issues. Information is given in viewgraph form.
NASA Astrophysics Data System (ADS)
Polverino, Pierpaolo; Frisk, Erik; Jung, Daniel; Krysander, Mattias; Pianese, Cesare
2017-07-01
The present paper proposes an advanced approach for Polymer Electrolyte Membrane Fuel Cell (PEMFC) system fault detection and isolation through a model-based diagnostic algorithm. The considered algorithm is developed upon a lumped parameter model simulating a whole PEMFC system oriented towards automotive applications. This model is inspired by other models available in the literature, with further attention to stack thermal dynamics and water management. The developed model is analysed by means of Structural Analysis to identify the correlations among the involved physical variables, the defined equations, and a set of faults which may occur in the system (related to both auxiliary component malfunctions and stack degradation phenomena). Residual generators are designed by means of Causal Computation analysis, and the maximum theoretical fault isolability achievable with a minimal number of installed sensors is investigated. The achieved results prove the capability of the algorithm to theoretically detect and isolate almost all faults using only stack voltage and temperature sensors, with significant advantages from an industrial point of view. The effective fault isolability is proved through fault simulations at a specific fault magnitude with an advanced residual evaluation technique, considering quantitative residual deviations from normal conditions to achieve unambiguous fault isolation.
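The residual-based isolation logic described above can be sketched very compactly. This is not the paper's PEMFC model: the residual definitions, fault signature matrix, and thresholds below are hypothetical placeholders illustrating how triggered residuals map to an isolated fault.

```python
# Toy residual-based fault detection and isolation.
# Each fault has a signature: which residuals it is expected to excite.
# Rows of each signature: r1 (voltage-based residual), r2 (temperature-based).
SIGNATURES = {
    "compressor_fault":  (1, 0),
    "cooling_fault":     (0, 1),
    "stack_degradation": (1, 1),
}
THRESHOLD = 0.1  # invented residual threshold

def isolate(r1, r2):
    """Return the fault candidates whose signature matches the fired residuals."""
    fired = (int(abs(r1) > THRESHOLD), int(abs(r2) > THRESHOLD))
    if fired == (0, 0):
        return ["nominal"]
    return [f for f, sig in SIGNATURES.items() if sig == fired]

print(isolate(0.02, 0.03))   # residuals below threshold -> nominal
print(isolate(0.5, 0.4))     # both residuals fire -> stack degradation
```

Structural Analysis, as used in the paper, is essentially a systematic way of deriving such residual/fault incidence structure from the model equations and of proving which columns of the signature matrix are distinguishable.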
NASA Astrophysics Data System (ADS)
de Barros, Felipe P. J.; Bolster, Diogo; Sanchez-Vila, Xavier; Nowak, Wolfgang
2011-05-01
Assessing health risk in hydrological systems is an interdisciplinary field. It relies on the expertise in the fields of hydrology and public health and needs powerful translation concepts to provide decision support and policy making. Reliable health risk estimates need to account for the uncertainties and variabilities present in hydrological, physiological, and human behavioral parameters. Despite significant theoretical advancements in stochastic hydrology, there is still a dire need to further propagate these concepts to practical problems and to society in general. Following a recent line of work, we use fault trees to address the task of probabilistic risk analysis and to support related decision and management problems. Fault trees allow us to decompose the assessment of health risk into individual manageable modules, thus tackling a complex system by a structural divide and conquer approach. The complexity within each module can be chosen individually according to data availability, parsimony, relative importance, and stage of analysis. Three differences are highlighted in this paper when compared to previous works: (1) The fault tree proposed here accounts for the uncertainty in both hydrological and health components, (2) system failure within the fault tree is defined in terms of risk being above a threshold value, whereas previous studies that used fault trees used auxiliary events such as exceedance of critical concentration levels, and (3) we introduce a new form of stochastic fault tree that allows us to weaken the assumption of independent subsystems that is required by a classical fault tree approach. We illustrate our concept in a simple groundwater-related setting.
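A minimal numeric sketch of the classical fault-tree machinery this abstract builds on, under the independence assumption the authors set out to weaken. The event names and probabilities are illustrative only, not taken from the paper's groundwater setting.

```python
# Classical fault-tree gate evaluation assuming independent basic events.
from functools import reduce

def p_or(probs):
    """OR gate: top event occurs if any independent child event occurs."""
    return 1.0 - reduce(lambda acc, p: acc * (1.0 - p), probs, 1.0)

def p_and(probs):
    """AND gate: top event occurs only if all independent child events occur."""
    return reduce(lambda acc, p: acc * p, probs, 1.0)

# Hypothetical tree: risk exceeds its threshold if (a contaminant plume
# reaches the well AND exposure occurs) OR the treatment barrier fails.
p_plume_reaches_well = 0.10
p_exposure = 0.50
p_treatment_failure = 0.01

p_top = p_or([p_and([p_plume_reaches_well, p_exposure]), p_treatment_failure])
print(round(p_top, 4))
```

The paper's stochastic fault tree generalizes exactly this computation: where `p_and` above multiplies marginal probabilities, correlated subsystems require joint probabilities instead.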
Grounding explanations in evolving, diagnostic situations
NASA Technical Reports Server (NTRS)
Johannesen, Leila J.; Cook, Richard I.; Woods, David D.
1994-01-01
Certain fields of practice involve the management and control of complex dynamic systems. These include flight deck operations in commercial aviation, control of space systems, anesthetic management during surgery, and chemical or nuclear process control. Fault diagnosis of these dynamic systems generally must occur with the monitored process on-line and in conjunction with maintaining system integrity. This research seeks to understand in more detail what it means for an intelligent system to function cooperatively, or as a 'team player', in complex, dynamic environments. The approach taken was to study human practitioners engaged in the management of a complex, dynamic process: anesthesiologists during neurosurgical operations. The investigation focused on understanding how team members cooperate in management and fault diagnosis and comparing this interaction to the situation with an Artificial Intelligence (AI) system that provides diagnoses and explanations. Of particular concern was to study the ways in which practitioners support one another in keeping aware of relevant information concerning the state of the monitored process and of the problem solving process.
Soft-Fault Detection Technologies Developed for Electrical Power Systems
NASA Technical Reports Server (NTRS)
Button, Robert M.
2004-01-01
The NASA Glenn Research Center, partner universities, and defense contractors are working to develop intelligent power management and distribution (PMAD) technologies for future spacecraft and launch vehicles. The goals are to provide higher performance (efficiency, transient response, and stability), higher fault tolerance, and higher reliability through the application of digital control and communication technologies. It is also expected that these technologies will eventually reduce the design, development, manufacturing, and integration costs for large, electrical power systems for space vehicles. The main focus of this research has been to incorporate digital control, communications, and intelligent algorithms into power electronic devices such as direct-current to direct-current (dc-dc) converters and protective switchgear. These technologies, in turn, will enable revolutionary changes in the way electrical power systems are designed, developed, configured, and integrated in aerospace vehicles and satellites. Initial successes in integrating modern, digital controllers have proven that transient response performance can be improved using advanced nonlinear control algorithms. One technology being developed includes the detection of "soft faults," those not typically covered by current systems in use today. Soft faults include arcing faults, corona discharge faults, and undetected leakage currents. Using digital control and advanced signal analysis algorithms, we have shown that it is possible to reliably detect arcing faults in high-voltage dc power distribution systems (see the preceding photograph). Another research effort has shown that low-level leakage faults and cable degradation can be detected by analyzing power system parameters over time. This additional fault detection capability will result in higher reliability for long-lived power systems such as reusable launch vehicles and space exploration missions.
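As a hedged sketch of the idea behind soft-fault detection (not Glenn's algorithms): arcing superimposes broadband noise on an otherwise smooth bus current, so a metric sensitive to high-frequency content separates the two conditions. The signals, noise level, and thresholds below are synthetic.

```python
# Crude high-frequency-content metric for arc-fault detection on synthetic data.
import math
import random

random.seed(0)

def diff_energy(signal):
    """Mean squared sample-to-sample difference: rises with broadband noise."""
    return sum((b - a) ** 2 for a, b in zip(signal, signal[1:])) / (len(signal) - 1)

n = 1000
# Smooth dc bus current with a slow ripple (hypothetical 10 A load).
clean = [10.0 + 0.5 * math.sin(2 * math.pi * k / 100) for k in range(n)]
# Same current with injected arc-like broadband noise.
arcing = [s + random.gauss(0.0, 0.3) for s in clean]

print(diff_energy(clean))    # small: slow ripple contributes little
print(diff_energy(arcing))   # orders of magnitude larger under arcing
```

Production systems use far richer signal analysis than a first difference, but the principle is the same: the fault signature lives in spectral content that a simple overcurrent threshold never sees.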
Autonomous power expert system
NASA Technical Reports Server (NTRS)
Walters, Jerry L.; Petrik, Edward J.; Roth, Mary Ellen; Truong, Long Van; Quinn, Todd; Krawczonek, Walter M.
1990-01-01
The Autonomous Power Expert (APEX) system was designed to monitor and diagnose fault conditions that occur within the Space Station Freedom Electrical Power System (SSF/EPS) Testbed. APEX is designed to interface with SSF/EPS testbed power management controllers to provide enhanced autonomous operation and control capability. The APEX architecture consists of three components: (1) a rule-based expert system, (2) a testbed data acquisition interface, and (3) a power scheduler interface. Fault detection, fault isolation, justification of probable causes, recommended actions, and incipient fault analysis are the main functions of the expert system component. The data acquisition component requests and receives pertinent parametric values from the EPS testbed and asserts the values into a knowledge base. Power load profile information is obtained from a remote scheduler through the power scheduler interface component. The current APEX design and development work is discussed. Operation and use of APEX by way of the user interface screens is also covered.
Advanced building energy management system demonstration for Department of Defense buildings.
O'Neill, Zheng; Bailey, Trevor; Dong, Bing; Shashanka, Madhusudana; Luo, Dong
2013-08-01
This paper presents an advanced building energy management system (aBEMS) that employs advanced methods of whole-building performance monitoring combined with statistical methods of learning and data analysis to enable identification of both gradual and discrete performance erosion and faults. This system assimilated data collected from multiple sources, including blueprints, reduced-order models (ROM) and measurements, and employed advanced statistical learning algorithms to identify patterns of anomalies. The results were presented graphically in a manner understandable to facilities managers. A demonstration of aBEMS was conducted in buildings at Naval Station Great Lakes. The facility building management systems were extended to incorporate the energy diagnostics and analysis algorithms, producing systematic identification of more efficient operation strategies. At Naval Station Great Lakes, greater than 20% savings were demonstrated for building energy consumption by improving facility manager decision support to diagnose energy faults and prioritize alternative, energy-efficient operation strategies. The paper concludes with recommendations for widespread aBEMS success. © 2013 New York Academy of Sciences.
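To illustrate the statistical anomaly-flagging idea in miniature (this is not the deployed aBEMS algorithm suite): compare each new reading against a baseline period and flag readings far outside normal variation. The consumption data and threshold are synthetic.

```python
# Simple z-score style anomaly flag for daily building energy consumption.
import statistics

# Hypothetical baseline period of daily consumption readings, in kWh.
baseline_kwh = [410, 395, 402, 398, 405, 399, 401, 404, 396, 400]
mu = statistics.mean(baseline_kwh)
sigma = statistics.stdev(baseline_kwh)

def is_anomalous(kwh, k=3.0):
    """Flag consumption outside mu +/- k*sigma of the baseline period."""
    return abs(kwh - mu) > k * sigma

print(is_anomalous(403))  # within normal variation
print(is_anomalous(455))  # e.g., a stuck damper driving excess HVAC load
```

The gradual performance erosion the paper targets needs more than a fixed baseline (e.g., weather normalization and reduced-order models), but the flag-and-prioritize workflow for facility managers starts from comparisons of this shape.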
Study on Unified Chaotic System-Based Wind Turbine Blade Fault Diagnostic System
NASA Astrophysics Data System (ADS)
Kuo, Ying-Che; Hsieh, Chin-Tsung; Yau, Her-Terng; Li, Yu-Chung
At present, vibration signals are processed and analyzed mostly in the frequency domain. The spectrum clearly shows the signal structure, and the specific characteristic frequency band is analyzed, but the number of calculations required is huge, resulting in delays. Therefore, this study uses the characteristics of a nonlinear system to load the complete vibration signal into the unified chaotic system, applying the dynamic error to analyze the wind turbine vibration signal, and adopting extenics theory for artificial-intelligence fault diagnosis of the analyzed signal. Hence, a fault diagnoser has been developed for wind turbine rotating blades. This study simulates three wind turbine blade states, namely stress rupture, screw loosening and blade loss, and validates the methods. The experimental results prove that the unified chaotic system used in this paper has a significant effect on vibration signal analysis. Thus, the operating conditions of wind turbines can be quickly known from this fault diagnostic system, and the maintenance schedule can be arranged before the faults worsen, making the management and operation of wind turbines smoother and reducing unnecessary costs.
Fault tolerant operation of switched reluctance machine
NASA Astrophysics Data System (ADS)
Wang, Wei
The energy crisis and environmental challenges have driven industry towards more energy efficient solutions. With nearly 60% of electricity consumed by various electric machines in the industry sector, advancement in the efficiency of the electric drive system is of vital importance. Adjustable speed drive systems (ASDS) provide excellent speed regulation and dynamic performance as well as dramatically improved system efficiency compared with conventional motors without electronic drives. Industry has witnessed tremendous growth in ASDS applications, not only as a driving force but also as an electric auxiliary system replacing bulky and low-efficiency auxiliary hydraulic and mechanical systems. With the vast penetration of ASDS, its fault tolerant operation capability is more widely recognized as an important feature of drive performance, especially for aerospace, automotive, and other industrial drive applications demanding high reliability. The Switched Reluctance Machine (SRM), a low-cost, highly reliable electric machine with fault tolerant operation capability, has drawn substantial attention in the past three decades. Nevertheless, SRM is not free of faults. Certain faults such as converter faults, sensor faults, winding shorts, eccentricity, and position sensor faults are commonly shared among all ASDS. In this dissertation, a thorough understanding of various faults and their influence on transient and steady state performance of SRM is developed via simulation and experimental study, providing necessary knowledge for fault detection and post-fault management. Lumped parameter models are established for fast real-time simulation and drive control. Based on the behavior of the faults, a fault detection scheme is developed for the purpose of fast and reliable fault diagnosis.
In order to improve the SRM power and torque capacity under faults, maximum torque per ampere excitation is conceptualized and validated through theoretical analysis and experiments. With the proposed optimal waveform, torque production is greatly improved under the same Root Mean Square (RMS) current constraint. Additionally, position sensorless operation methods under phase faults are investigated to account for the combination of physical position sensor and phase winding faults. A comprehensive solution for position sensorless operation under single and multiple phase faults is proposed and validated through experiments. Continuous position sensorless operation with seamless transition between various numbers of faulted phases is achieved.
Knowledge-based systems for power management
NASA Technical Reports Server (NTRS)
Lollar, L. F.
1992-01-01
NASA-Marshall's Electrical Power Branch has undertaken the development of expert systems in support of further advancements in electrical power system automation. Attention is given to the features of (1) the Fault Recovery and Management Expert System, (2) a resource scheduler, the Master of Automated Expert Scheduling Through Resource Orchestration, and (3) an adaptive load-priority manager, the Load Priority List Management System. The characteristics of an advisory battery manager for the Hubble Space Telescope, designated the 'nickel-hydrogen expert system', are also noted.
Optimal fault-tolerant control strategy of a solid oxide fuel cell system
NASA Astrophysics Data System (ADS)
Wu, Xiaojuan; Gao, Danhui
2017-10-01
For solid oxide fuel cell (SOFC) development, load tracking, heat management, air excess ratio constraint, high efficiency, low cost and fault diagnosis are six key issues. However, no literature studies control techniques combining optimization and fault diagnosis for the SOFC system. An optimal fault-tolerant control strategy is presented in this paper, which involves four parts: a fault diagnosis module, a switching module, two backup optimizers and a controller loop. The fault diagnosis part identifies the current SOFC fault type, and the switching module selects the appropriate backup optimizer based on the diagnosis result. NSGA-II and TOPSIS are employed to design the two backup optimizers under normal and air compressor fault states. A PID algorithm is used to design the control loop, which includes a power tracking controller, an anode inlet temperature controller, a cathode inlet temperature controller and an air excess ratio controller. The simulation results show the proposed optimal fault-tolerant control method can track the power, temperature and air excess ratio at the desired values, simultaneously achieving the maximum efficiency and the minimum unit cost both under normal SOFC operation and under an air compressor fault.
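A minimal discrete PID loop of the general kind used for the power-tracking and temperature controllers described above. The gains and the first-order stand-in plant are illustrative, not the paper's SOFC model.

```python
# Discrete PID controller driving a simple first-order lag plant.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, setpoint, measurement):
        err = setpoint - measurement
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# Hypothetical lag dynamics standing in for one SOFC loop (e.g. inlet temperature).
pid = PID(kp=2.0, ki=1.0, kd=0.05, dt=0.1)
y = 0.0
for _ in range(200):
    u = pid.step(1.0, y)     # track a normalized setpoint of 1.0
    y += 0.1 * (u - y)       # first-order plant response

print(round(y, 3))  # settles near the setpoint
```

In the paper's scheme the setpoints fed to loops like this are not fixed: the backup optimizer selected by the diagnosis result re-computes them when the plant moves from the normal to the compressor-fault regime.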
A Hybrid Stochastic-Neuro-Fuzzy Model-Based System for In-Flight Gas Turbine Engine Diagnostics
2001-04-05
...Margin (ADM) and (ii) Fault Detection Margin (FDM). Key Words: ANFIS, Engine Health Monitoring, Gas Path Analysis, and Stochastic Analysis Adaptive Network... The paper illustrates the application of a hybrid Stochastic-Fuzzy-Inference Model-Based System (StoFIS) to fault diagnostics and prognostics for both... operational history monitored on-line by the engine health management (EHM) system. To capture the complex functional relationships between different...
Data-Centric Situational Awareness and Management in Intelligent Power Systems
NASA Astrophysics Data System (ADS)
Dai, Xiaoxiao
The rapid development of technology and society has made the current power system a much more complicated system than ever. The need for big-data-based situation awareness and management is urgent today. In this dissertation, to respond to this grand challenge, two data-centric power system situation awareness and management approaches are proposed, addressing security problems in the transmission/distribution grids and social benefits augmentation at the distribution-customer level, respectively. To address the security problem in the transmission/distribution grids utilizing big data, the first approach provides a fault analysis solution based on characterization and analytics of synchrophasor measurements. Specifically, the optimal synchrophasor measurement devices selection algorithm (OSMDSA) and the matching pursuit decomposition (MPD) based spatial-temporal synchrophasor data characterization method were developed to reduce data volume while preserving comprehensive information for big data analyses. The weighted Granger causality (WGC) method was investigated to conduct fault impact causal analysis during system disturbances for fault localization. Numerical results and comparison with other methods demonstrate the effectiveness and robustness of this analytic approach. As more social effects become important considerations in power system management, the goal of situation awareness should be expanded to also include achievements in social benefits. The second approach investigates the concept and application of social energy on the University of Denver campus grid to provide management improvement solutions for optimizing social cost. The social element (human working productivity cost) and the economic element (electricity consumption cost) are both considered in the evaluation of overall social cost.
Moreover, power system simulation, numerical experiments for smart building modeling, distribution level real-time pricing and social response to the pricing signals are studied for implementing the interactive artificial-physical management scheme.
V&V of Fault Management: Challenges and Successes
NASA Technical Reports Server (NTRS)
Fesq, Lorraine M.; Costello, Ken; Ohi, Don; Lu, Tiffany; Newhouse, Marilyn
2013-01-01
This paper describes the results of a special breakout session of the NASA Independent Verification and Validation (IV&V) Workshop held in the fall of 2012 entitled "V&V of Fault Management: Challenges and Successes." The NASA IV&V Program is in a unique position to interact with projects across all of the NASA development domains. Using this unique opportunity, the IV&V program convened a breakout session to enable IV&V teams to share their challenges and successes with respect to the V&V of Fault Management (FM) architectures and software. The presentations and discussions provided practical examples of pitfalls encountered while performing V&V of FM, including the lack of consistent designs for implementing fault monitors and the fact that FM information is not centralized but scattered among many diverse project artifacts. The discussions also solidified the need for an early commitment to developing FM in parallel with the spacecraft systems as well as clearly defining FM terminology within a project.
Real-time automated failure identification in the Control Center Complex (CCC)
NASA Technical Reports Server (NTRS)
Kirby, Sarah; Lauritsen, Janet; Pack, Ginger; Ha, Anhhoang; Jowers, Steven; Mcnenny, Robert; Truong, The; Dell, James
1993-01-01
A system which will provide real-time failure management support to the Space Station Freedom program is described. The system's use of a simplified form of model-based reasoning qualifies it as an advanced automation system. However, it differs from most such systems in that it was designed from the outset to meet two sets of requirements. First, it must provide a useful increment to the fault management capabilities of the Johnson Space Center (JSC) Control Center Complex (CCC) Fault Detection Management system. Second, it must satisfy CCC operational environment constraints such as cost, computer resource requirements, verification, and validation. The need to meet both requirement sets presents a much greater design challenge than would have been the case had functionality been the sole design consideration. The paper overviews the choice of technology, aspects of that choice, and the process for migrating the system into the control center.
Incipient fault detection and power system protection for spaceborne systems
NASA Technical Reports Server (NTRS)
Russell, B. Don; Hackler, Irene M.
1987-01-01
A program was initiated to study the feasibility of using advanced terrestrial power system protection techniques for spacecraft power systems. It was designed to enhance and automate spacecraft power distribution systems in the areas of safety, reliability and maintenance. The proposed power management/distribution system is described as well as security assessment and control, incipient and low current fault detection, and the proposed spaceborne protection system. It is noted that the intelligent remote power controller permits the implementation of digital relaying algorithms with both adaptive and programmable characteristics.
Problems related to the integration of fault tolerant aircraft electronic systems
NASA Technical Reports Server (NTRS)
Bannister, J. A.; Adlakha, V.; Trivedi, K.; Alspaugh, T. A., Jr.
1982-01-01
Problems related to the design of the hardware for an integrated aircraft electronic system are considered. Taxonomies of concurrent systems are reviewed and a new taxonomy is proposed. An informal methodology intended to identify feasible regions of the taxonomic design space is described. Specific tools are recommended for use in the methodology. Based on the methodology, a preliminary strawman integrated fault tolerant aircraft electronic system is proposed. Next, problems related to the programming and control of integrated aircraft electronic systems are discussed. Issues of system resource management, including the scheduling and allocation of real time periodic tasks in a multiprocessor environment, are treated in detail. The role of software design in integrated fault tolerant aircraft electronic systems is discussed. Conclusions and recommendations for further work are included.
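One of the resource-management issues named above, scheduling of real-time periodic tasks, admits a compact classical check. The sketch below applies the Liu-Layland rate-monotonic utilization bound on a single processor; the task set is invented for illustration and this sufficient test is only one ingredient of the multiprocessor analysis the report treats.

```python
# Rate-monotonic schedulability: sufficient (not necessary) utilization test.

def rm_utilization_bound(n):
    """Liu-Layland bound for n periodic tasks: n * (2^(1/n) - 1)."""
    return n * (2 ** (1.0 / n) - 1)

def rm_schedulable(tasks):
    """tasks: list of (execution_time, period) pairs with matching time units."""
    u = sum(c / t for c, t in tasks)
    return u <= rm_utilization_bound(len(tasks))

# Hypothetical avionics task set: (C_i, T_i) in milliseconds.
tasks = [(1, 4), (1, 5), (2, 10)]   # utilization = 0.25 + 0.20 + 0.20 = 0.65
print(rm_schedulable(tasks))
```

Because the bound tends to ln 2 (about 0.693) as n grows, task sets above that utilization require an exact response-time analysis rather than this quick test.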
Resilience Design Patterns - A Structured Approach to Resilience at Extreme Scale (version 1.1)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hukerikar, Saurabh; Engelmann, Christian
Reliability is a serious concern for future extreme-scale high-performance computing (HPC) systems. Projections based on the current generation of HPC systems and technology roadmaps suggest the prevalence of very high fault rates in future systems. The errors resulting from these faults will propagate and generate various kinds of failures, which may result in outcomes ranging from result corruptions to catastrophic application crashes. Therefore the resilience challenge for extreme-scale HPC systems requires management of various hardware and software technologies that are capable of handling a broad set of fault models at accelerated fault rates. Also, due to practical limits on power consumption in HPC systems, future systems are likely to embrace innovative architectures, increasing the levels of hardware and software complexity. As a result, the techniques that seek to improve resilience must navigate the complex trade-off space between resilience and the overheads to power consumption and performance. While the HPC community has developed various resilience solutions, application-level techniques as well as system-based solutions, the solution space of HPC resilience techniques remains fragmented. There are no formal methods and metrics to investigate and evaluate resilience holistically in HPC systems that consider impact scope, handling coverage, and performance and power efficiency across the system stack. Additionally, few of the current approaches are portable to newer architectures and software environments that will be deployed on future systems. In this document, we develop a structured approach to the management of HPC resilience using the concept of resilience-based design patterns. A design pattern is a general repeatable solution to a commonly occurring problem. We identify the commonly occurring problems and solutions used to deal with faults, errors and failures in HPC systems.
Each established solution is described in the form of a pattern that addresses concrete problems in the design of resilient systems. The complete catalog of resilience design patterns provides designers with reusable design elements. We also define a framework that enhances a designer's understanding of the important constraints and opportunities for the design patterns to be implemented and deployed at various layers of the system stack. This design framework may be used to establish mechanisms and interfaces to coordinate flexible fault management across hardware and software components. The framework also supports optimization of the cost-benefit trade-offs among performance, resilience, and power consumption. The overall goal of this work is to enable a systematic methodology for the design and evaluation of resilience technologies in extreme-scale HPC systems that keep scientific applications running to a correct solution in a timely and cost-efficient manner in spite of frequent faults, errors, and failures of various types.
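One of the most common resilience patterns of the kind such a catalog describes is checkpoint and rollback. The sketch below is a minimal, self-contained illustration of that pattern only; the fault model, function names, and parameters are invented for illustration and are not drawn from the report:

```python
import copy
import random

def run_with_checkpoints(steps, state, checkpoint_every=10, fault_rate=0.05):
    """Checkpoint/rollback resilience pattern: save the application
    state at fixed intervals; on a (simulated) transient fault, roll
    back to the last checkpoint instead of restarting from scratch."""
    checkpoint = copy.deepcopy(state)
    step = 0
    while step < steps:
        if step % checkpoint_every == 0:
            checkpoint = copy.deepcopy(state)       # persist a recovery point
        if random.random() < fault_rate:            # simulated transient fault
            state = copy.deepcopy(checkpoint)       # restore last good state
            step = (step // checkpoint_every) * checkpoint_every
            continue                                # redo the lost interval
        state["sum"] += step                        # one unit of real work
        step += 1
    return state

random.seed(1)
result = run_with_checkpoints(100, {"sum": 0})
```

Because rollback restores both the state and the step counter to the checkpoint boundary, each unit of work contributes exactly once to the final result regardless of how many faults were injected.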
41 CFR 101-39.404 - Claims in favor of the Government.
Code of Federal Regulations, 2010 CFR
2010-07-01
... VEHICLES 39-INTERAGENCY FLEET MANAGEMENT SYSTEMS 39.4-Accidents and Claims § 101-39.404 Claims in favor of... Interagency Fleet Management System (IFMS) vehicle is at fault and that party can be reasonably identified... pertaining to the accident and its investigation to the servicing GSA IFMS fleet management center. The GSA...
NASA Technical Reports Server (NTRS)
Ricks, Brian W.; Mengshoel, Ole J.
2009-01-01
Reliable systems health management is an important research area of NASA. A health management system that can accurately and quickly diagnose faults in various on-board systems of a vehicle will play a key role in the success of current and future NASA missions. We introduce in this paper the ProDiagnose algorithm, a diagnostic algorithm that uses a probabilistic approach, accomplished with Bayesian Network models compiled to Arithmetic Circuits, to diagnose these systems. We describe the ProDiagnose algorithm, how it works, and the probabilistic models involved. We show by experimentation on two Electrical Power Systems based on the ADAPT testbed, used in the Diagnostic Challenge Competition (DX 09), that ProDiagnose can produce results with over 96% accuracy and less than 1 second mean diagnostic time.
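The probabilistic reasoning behind an algorithm like ProDiagnose can be shown at toy scale. The sketch below applies Bayes' rule by direct enumeration over a hypothetical two-fault model; the actual system compiles Bayesian network models to arithmetic circuits to obtain such posteriors efficiently, and every prior and likelihood value here is invented for illustration:

```python
# Hypothetical fault modes of a small electrical subsystem.
PRIOR = {"healthy": 0.97, "relay_stuck": 0.02, "sensor_bias": 0.01}

# P(observed voltage band | mode) -- illustrative numbers only.
LIKELIHOOD = {
    "healthy":     {"nominal": 0.95, "low": 0.04, "zero": 0.01},
    "relay_stuck": {"nominal": 0.05, "low": 0.15, "zero": 0.80},
    "sensor_bias": {"nominal": 0.10, "low": 0.85, "zero": 0.05},
}

def diagnose(observation):
    """Return P(mode | observation) for each mode via Bayes' rule."""
    joint = {m: PRIOR[m] * LIKELIHOOD[m][observation] for m in PRIOR}
    z = sum(joint.values())                 # normalizing constant
    return {m: p / z for m, p in joint.items()}

posterior = diagnose("zero")
best = max(posterior, key=posterior.get)    # most probable fault mode
```

A zero-voltage reading shifts the posterior mass onto the stuck-relay hypothesis even though its prior is small, which is the essence of probabilistic fault isolation.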
NASA Technical Reports Server (NTRS)
Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Miner, Paul S.; Koppen, Sandra V.
2008-01-01
This report describes the design of the test articles and monitoring systems developed to characterize the response of a fault-tolerant computer communication system when stressed beyond the theoretical limits for guaranteed correct performance. A high-intensity radiated electromagnetic field (HIRF) environment was selected as the means of injecting faults, as such environments are known to have the potential to cause arbitrary and coincident common-mode fault manifestations that can overwhelm redundancy management mechanisms. The monitors generate stimuli for the systems-under-test (SUTs) and collect data in real-time on the internal state and the response at the external interfaces. A real-time health assessment capability was developed to support the automation of the test. A detailed description of the nature and structure of the collected data is included. The goal of the report is to provide insight into the design and operation of these systems, and to serve as a reference document for use in post-test analyses.
Automated fault-management in a simulated spaceflight micro-world
NASA Technical Reports Server (NTRS)
Lorenz, Bernd; Di Nocera, Francesco; Rottger, Stefan; Parasuraman, Raja
2002-01-01
BACKGROUND: As human spaceflight missions extend in duration and distance from Earth, a self-sufficient crew will bear far greater onboard responsibility and authority for mission success. This will increase the need for automated fault management (FM). Human factors issues in the use of such systems include maintenance of cognitive skill, situational awareness (SA), trust in automation, and workload. This study examined the human performance consequences of operator use of intelligent FM support in interaction with an autonomous, space-related, atmospheric control system. METHODS: An expert system representing a model-based reasoning agent supported operators at a low level of automation (LOA) by a computerized fault-finding guide, at a medium LOA by an automated diagnosis and recovery advisory, and at a high LOA by automated diagnosis and recovery implementation, subject to operator approval or veto. Ten percent of the experimental trials involved complete failure of FM support. RESULTS: Benefits of automation were reflected in more accurate diagnoses, shorter fault identification time, and reduced subjective operator workload. Unexpectedly, fault identification times deteriorated more at the medium than at the high LOA during automation failure. Analyses of information sampling behavior showed that offloading operators from recovery implementation during reliable automation enabled operators at high LOA to engage in fault assessment activities. CONCLUSIONS: The potential threat to SA imposed by high-level automation, in which decision advisories are automatically generated, need not inevitably be counteracted by choosing a lower LOA. Instead, freeing operator cognitive resources by automatic implementation of recovery plans at a higher LOA can promote better fault comprehension, so long as the automation interface is designed to support efficient information sampling.
Artificial neural network application for space station power system fault diagnosis
NASA Technical Reports Server (NTRS)
Momoh, James A.; Oliver, Walter E.; Dias, Lakshman G.
1995-01-01
This study presents a methodology for fault diagnosis using a Two-Stage Artificial Neural Network Clustering Algorithm. Previously, SPICE models of a 5-bus DC power distribution system with assumed constant output power during contingencies from the DDCU were used to evaluate the ANN's fault diagnosis capabilities. This on-going study uses EMTP models of the components (distribution lines, SPDU, TPDU, loads) and power sources (DDCU) of Space Station Alpha's electrical Power Distribution System as a basis for the ANN fault diagnostic tool. The results from the two studies are contrasted. In the event of a major fault, ground controllers need the ability to identify the type of fault, isolate the fault to the orbital replaceable unit level, and provide the necessary information for the power management expert system to optimally determine a degraded-mode load schedule. To accomplish these goals, the electrical power distribution system's architecture can be subdivided into three major classes: DC-DC converter to loads, DC Switching Unit (DCSU) to Main Bus Switching Unit (MBSU), and power sources to DCSU. Each class, with its own electrical characteristics and operations, requires a unique fault analysis philosophy. This study identifies these philosophies as Riddles 1, 2, and 3, respectively. The results of the on-going study address Riddle 1. It is concluded that the combination of the EMTP models of the DDCU, distribution cables, and electrical loads yields a more accurate model of system behavior and, in addition, more accurate ANN fault diagnosis than the results obtained with the SPICE models.
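The classification stage of a clustering-based diagnosis can be sketched as a nearest-centroid classifier. The paper's two-stage ANN learns its cluster centers from circuit-simulation data, whereas the centroids and fault signature below are purely illustrative placeholders:

```python
def nearest_centroid(signature, centroids):
    """Assign a measured fault signature (feature vector) to the
    closest learned fault-class centroid by squared Euclidean
    distance -- a minimal stand-in for the trained clustering stage."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(signature, centroids[label]))

# Illustrative centroids: (bus voltage p.u., feeder current p.u.).
CENTROIDS = {
    "no_fault":       (1.00, 1.00),
    "line_to_ground": (0.20, 4.00),
    "open_circuit":   (1.05, 0.05),
}

label = nearest_centroid((0.25, 3.60), CENTROIDS)
```

A depressed voltage combined with a large overcurrent lands nearest the line-to-ground centroid, which is the kind of ORU-level isolation the abstract describes.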
Computer Sciences and Data Systems, volume 1
NASA Technical Reports Server (NTRS)
1987-01-01
Topics addressed include: software engineering; university grants; institutes; concurrent processing; sparse distributed memory; distributed operating systems; intelligent data management processes; expert system for image analysis; fault tolerant software; and architecture research.
Goal-Function Tree Modeling for Systems Engineering and Fault Management
NASA Technical Reports Server (NTRS)
Johnson, Stephen B.; Breckenridge, Jonathan T.
2013-01-01
The draft NASA Fault Management (FM) Handbook (2012) states that Fault Management (FM) is a "part of systems engineering", and that it "demands a system-level perspective" (NASA-HDBK-1002, 7). What, exactly, is the relationship between systems engineering and FM? To NASA, systems engineering (SE) is "the art and science of developing an operable system capable of meeting requirements within often opposed constraints" (NASA/SP-2007-6105, 3). Systems engineering starts with the elucidation and development of requirements, which set the goals that the system is to achieve. To achieve these goals, the systems engineer typically defines functions, and the functions in turn are the basis for design trades to determine the best means to perform the functions. System Health Management (SHM), by contrast, defines "the capabilities of a system that preserve the system's ability to function as intended" (Johnson et al., 2011, 3). Fault Management, in turn, is the operational subset of SHM, which detects current or future failures, and takes operational measures to prevent or respond to these failures. Failure, in turn, is the "unacceptable performance of intended function." (Johnson 2011, 605) Thus the relationship of SE to FM is that SE defines the functions and the design to perform those functions to meet system goals and requirements, while FM detects the inability to perform those functions and takes action. SHM and FM are in essence "the dark side" of SE. For every function to be performed (SE), there is the possibility that it is not successfully performed (SHM); FM defines the means to operationally detect and respond to this lack of success. We can also describe this in terms of goals: for every goal to be achieved, there is the possibility that it is not achieved; FM defines the means to operationally detect and respond to this inability to achieve the goal.
This brief description of the relationships between SE, SHM, and FM provides hints toward a modeling approach that formally connects the nominal (SE) and off-nominal (SHM and FM) aspects of functions and designs. This paper describes a formal modeling approach to the initial phases of the development process that integrates the nominal and off-nominal perspectives in a model uniting SE goals and functions with the failure to achieve those goals and functions (SHM/FM).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Myrent, Noah J.; Barrett, Natalie C.; Adams, Douglas E.
2014-07-01
Operations and maintenance costs for offshore wind plants are significantly higher than the current costs for land-based (onshore) wind plants. One way to reduce these costs would be to implement a structural health and prognostic management (SHPM) system as part of a condition based maintenance paradigm with smart load management and utilize a state-based cost model to assess the economics associated with use of the SHPM system. To facilitate the development of such a system, a multi-scale modeling and simulation approach developed in prior work is used to identify how the underlying physics of the system are affected by the presence of damage and faults, and how these changes manifest themselves in the operational response of a full turbine. This methodology was used to investigate two case studies: (1) the effects of rotor imbalance due to pitch error (aerodynamic imbalance) and mass imbalance and (2) disbond of the shear web; both for a 5-MW offshore wind turbine in the present report. Sensitivity analyses were carried out for the detection strategies of rotor imbalance and shear web disbond developed in prior work by evaluating the robustness of key measurement parameters in the presence of varying wind speeds, horizontal shear, and turbulence. Detection strategies were refined for these fault mechanisms and probabilities of detection were calculated. For all three fault mechanisms, the probability of detection was 96% or higher for the optimized wind speed ranges of the laminar, 30% horizontal shear, and 60% horizontal shear wind profiles. The revised cost model provided insight into the estimated savings in operations and maintenance costs as they relate to the characteristics of the SHPM system.
The integration of the health monitoring information and O&M cost versus damage/fault severity information provides the initial steps to identify processes to reduce operations and maintenance costs for an offshore wind farm while increasing turbine availability, revenue, and overall profit.
NASA Astrophysics Data System (ADS)
Zeng, Yajun; Skibniewski, Miroslaw J.
2013-08-01
Enterprise resource planning (ERP) system implementations are often characterised by large capital outlay, long implementation duration, and high risk of failure. In order to avoid ERP implementation failure and realise the benefits of the system, sound risk management is the key. This paper proposes a probabilistic risk assessment approach for ERP system implementation projects based on fault tree analysis, which models the relationship between ERP system components and specific risk factors. Unlike traditional risk management approaches that have mostly focused on meeting project budget and schedule objectives, the proposed approach intends to address the risks that may cause ERP system usage failure. The approach can be used to identify the root causes of ERP system implementation usage failure and quantify the impact of critical component failures or critical risk events in the implementation process.
Sub-Network Access Control Technology Demonstrator: Software Design of the Network Management System
2002-08-01
Canadian Operational Fleet. Requirements The proposed network management solution must provide the normal monitoring and configuration mechanisms generally...Joint Warrior Interoperability Demonstrations (JWID) and the Communication System Network Interoperability (CSNI) Navy Network Trials. In short...management functional area normally includes two main functions: fault isolation and diagnosis, and restoration of the system. In short, an operator
NEXT Single String Integration Test Results
NASA Technical Reports Server (NTRS)
Soulas, George C.; Patterson, Michael J.; Pinero, Luis; Herman, Daniel A.; Snyder, Steven John
2010-01-01
As a critical part of NASA's Evolutionary Xenon Thruster (NEXT) test validation process, a single string integration test was performed on the NEXT ion propulsion system. The objectives of this test were to verify that an integrated system of major NEXT ion propulsion system elements meets project requirements, to demonstrate that the integrated system is functional across the entire power processor and xenon propellant management system input ranges, and to demonstrate to potential users that the NEXT propulsion system is ready for transition to flight. Propulsion system elements included in this system integration test were an engineering model ion thruster, an engineering model propellant management system, an engineering model power processor unit, and a digital control interface unit simulator that acted as a test console. Project requirements that were verified during this system integration test included individual element requirements, integrated system requirements, and fault handling. This paper will present the results of these tests, which include: integrated ion propulsion system demonstrations of performance, functionality, and fault handling; a thruster re-performance acceptance test to establish baseline performance; a risk-reduction PMS-thruster integration test; and propellant management system calibration checks.
Power system monitoring and source control of the Space Station Freedom DC power system testbed
NASA Technical Reports Server (NTRS)
Kimnach, Greg L.; Baez, Anastacio N.
1992-01-01
Unlike a terrestrial electric utility which can purchase power from a neighboring utility, the Space Station Freedom (SSF) has strictly limited energy resources; as a result, source control, system monitoring, system protection, and load management are essential to the safe and efficient operation of the SSF Electric Power System (EPS). These functions are being evaluated in the DC Power Management and Distribution (PMAD) Testbed which NASA LeRC has developed at the Power System Facility (PSF) located in Cleveland, Ohio. The testbed is an ideal platform to develop, integrate, and verify power system monitoring and control algorithms. State Estimation (SE) is a monitoring tool used extensively in terrestrial electric utilities to ensure safe power system operation. It uses redundant system information to calculate the actual state of the EPS, to isolate faulty sensors, to determine source operating points, to verify faults detected by subsidiary controllers, and to identify high impedance faults. Source control and monitoring safeguard the power generation and storage subsystems and ensure that the power system operates within safe limits while satisfying user demands with minimal interruptions. System monitoring functions, in coordination with hardware implemented schemes, provide for a complete fault protection system. The objective of this paper is to overview the development and integration of the state estimator and the source control algorithms.
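The state-estimation idea, combining redundant measurements by weighted least squares and using residual tests to isolate faulty sensors, can be sketched for the simplest possible case of a single bus voltage. The sensor values, noise levels, and threshold below are illustrative, not testbed data:

```python
def estimate_state(readings, sigmas):
    """Weighted least-squares estimate of one bus voltage from
    redundant sensors: x = sum(z_i / s_i^2) / sum(1 / s_i^2)."""
    weights = [1.0 / s ** 2 for s in sigmas]
    return sum(w * z for w, z in zip(weights, readings)) / sum(weights)

def worst_sensor(readings, sigmas, threshold=3.0):
    """Largest-residual test: return the index of the sensor whose
    residual (in standard deviations) is largest, if it exceeds
    `threshold`; a full estimator would remove it and re-estimate."""
    x = estimate_state(readings, sigmas)
    residuals = [abs(z - x) / s for z, s in zip(readings, sigmas)]
    worst = max(range(len(residuals)), key=residuals.__getitem__)
    return worst if residuals[worst] > threshold else None

# Three redundant voltage sensors (volts); the third has failed high.
readings = [120.1, 119.9, 131.0]
sigmas = [0.1, 0.1, 0.1]
bad = worst_sensor(readings, sigmas)    # index of the suspect sensor
```

With the suspect sensor excluded, the re-estimated voltage collapses back to the value the two healthy sensors agree on, which is the sensor-isolation behavior the abstract attributes to state estimation.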
Rocket Engine Health Management: Early Definition of Critical Flight Measurements
NASA Technical Reports Server (NTRS)
Christenson, Rick L.; Nelson, Michael A.; Butas, John P.
2003-01-01
The NASA-led Space Launch Initiative (SLI) program has established key requirements related to safety, reliability, launch availability, and operations cost to be met by the next generation of reusable launch vehicles. Key to meeting these requirements will be an integrated vehicle health management (IVHM) system that includes sensors, harnesses, software, memory, and processors. Such a system must be integrated across all the vehicle subsystems and meet component, subsystem, and system requirements relative to fault detection, fault isolation, and false alarm rate. The purpose of this activity is to evolve techniques for defining critical flight engine system measurements early within the definition of an engine health management system (EHMS). Two approaches, performance-based and failure mode-based, are integrated to provide a proposed set of measurements to be collected. This integrated approach is applied to MSFC's MC-1 engine. Early identification of measurements supports early identification of candidate sensor systems whose design and impacts to the engine components must be considered in engine design.
A pilot GIS database of active faults of Mt. Etna (Sicily): A tool for integrated hazard evaluation
NASA Astrophysics Data System (ADS)
Barreca, Giovanni; Bonforte, Alessandro; Neri, Marco
2013-02-01
A pilot GIS-based system has been implemented for the assessment and analysis of hazard related to active faults affecting the eastern and southern flanks of Mt. Etna. The system structure was developed in ArcGis® environment and consists of different thematic datasets that include spatially-referenced arc-features and associated database. Arc-type features, georeferenced into WGS84 Ellipsoid UTM zone 33 Projection, represent the five main fault systems that develop in the analysed region. The backbone of the GIS-based system is constituted by the large amount of information which was collected from the literature and then stored and properly geocoded in a digital database. This consists of thirty-five alpha-numeric fields which include all fault parameters available from the literature, such as location, kinematics, landform, slip rate, etc. Although the system has been implemented according to the most common procedures used by GIS developers, the architecture and content of the database represent a pilot backbone for digital storing of fault parameters, providing a powerful tool in modelling hazard related to the active tectonics of Mt. Etna. The database collects, organises and shares all currently available scientific information about the active faults of the volcano. Furthermore, thanks to the strong effort spent on defining the fields of the database, the structure proposed in this paper is open to the collection of further data coming from future improvements in the knowledge of the fault systems. By layering additional user-specific geographic information and managing the proposed database (topological querying) a great diversity of hazard and vulnerability maps can be produced by the user. This is a proposal of a backbone for a comprehensive geographical database of fault systems, universally applicable to other sites.
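The attribute-table side of such a database can be sketched with a handful of fields. The real schema holds thirty-five fields per fault; the field names, fault names, systems, and slip rates below are placeholders for illustration, not data from the actual database:

```python
from dataclasses import dataclass

@dataclass
class FaultRecord:
    """Minimal sketch of one arc-feature attribute row; the actual
    database carries many more fields (location, kinematics,
    landform, slip rate, ...). All values here are illustrative."""
    name: str
    fault_system: str
    kinematics: str
    slip_rate_mm_yr: float

faults = [
    FaultRecord("Fault A", "System 1", "normal", 2.0),
    FaultRecord("Fault B", "System 2", "left-lateral", 14.0),
    FaultRecord("Fault C", "System 3", "right-lateral", 1.5),
]

def fast_slipping(records, min_rate):
    """Attribute query of the kind a user would run before building
    a hazard map: faults slipping at least `min_rate` mm/yr."""
    return [r.name for r in records if r.slip_rate_mm_yr >= min_rate]

selected = fast_slipping(faults, 2.0)
```

Queries like this one, combined with the spatial layer, are what let the user derive hazard and vulnerability maps from the stored fault parameters.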
SLURM: Simple Linux Utility for Resource Management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jette, M; Grondona, M
2002-12-19
Simple Linux Utility for Resource Management (SLURM) is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for Linux clusters of thousands of nodes. Components include machine status, partition management, job management, scheduling and stream copy modules. This paper presents an overview of the SLURM architecture and functionality.
SLURM: Simple Linux Utility for Resource Management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jette, M; Grondona, M
2003-04-22
Simple Linux Utility for Resource Management (SLURM) is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for Linux clusters of thousands of nodes. Components include machine status, partition management, job management, scheduling, and stream copy modules. This paper presents an overview of the SLURM architecture and functionality.
Expert systems applied to fault isolation and energy storage management, phase 2
NASA Technical Reports Server (NTRS)
1987-01-01
A user's guide for the Fault Isolation and Energy Storage (FIES) II system is provided. Included are a brief discussion of the background and scope of this project, a discussion of basic and advanced operating installation and problem determination procedures for the FIES II system and information on hardware and software design and implementation. A number of appendices are provided including a detailed specification for the microprocessor software, a detailed description of the expert system rule base and a description and listings of the LISP interface software.
The fault monitoring and diagnosis knowledge-based system for space power systems: AMPERES, phase 1
NASA Technical Reports Server (NTRS)
Lee, S. C.
1989-01-01
The objective is to develop a real time fault monitoring and diagnosis knowledge-based system (KBS) for space power systems which can save costly operational manpower and can achieve more reliable space power system operation. The proposed KBS was developed using the Autonomously Managed Power System (AMPS) test facility currently installed at NASA Marshall Space Flight Center (MSFC), but the basic approach taken for this project could be applicable for other space power systems. The proposed KBS is entitled Autonomously Managed Power-System Extendible Real-time Expert System (AMPERES). In Phase 1 the emphasis was put on the design of the overall KBS, the identification of the basic research required, the initial performance of the research, and the development of a prototype KBS. In Phase 2, emphasis is put on the completion of the research initiated in Phase 1, and the enhancement of the prototype KBS developed in Phase 1. This enhancement is intended to achieve a working real time KBS incorporated with the NASA space power system test facilities. Three major research areas were identified and progress was made in each area. These areas are: real time data acquisition and its supporting data structure; sensor value validation; and the development of an inference scheme for effective fault monitoring and diagnosis, together with its supporting knowledge representation scheme.
Methods for Probabilistic Fault Diagnosis: An Electrical Power System Case Study
NASA Technical Reports Server (NTRS)
Ricks, Brian W.; Mengshoel, Ole J.
2009-01-01
Health management systems that more accurately and quickly diagnose faults that may occur in different technical systems on-board a vehicle will play a key role in the success of future NASA missions. We discuss in this paper the diagnosis of abrupt continuous (or parametric) faults within the context of probabilistic graphical models, more specifically Bayesian networks that are compiled to arithmetic circuits. This paper extends our previous research, within the same probabilistic setting, on diagnosis of abrupt discrete faults. Our approach and diagnostic algorithm ProDiagnose are domain-independent; however we use an electrical power system testbed called ADAPT as a case study. In one set of ADAPT experiments, performed as part of the 2009 Diagnostic Challenge, our system turned out to have the best performance among all competitors. In a second set of experiments, we show how we have recently further significantly improved the performance of the probabilistic model of ADAPT. While these experiments are obtained for an electrical power system testbed, we believe they can easily be transitioned to real-world systems, thus promising to increase the success of future NASA missions.
Using a CLIPS expert system to automatically manage TCP/IP networks and their components
NASA Technical Reports Server (NTRS)
Faul, Ben M.
1991-01-01
An expert system that can directly manage network components on a Transmission Control Protocol/Internet Protocol (TCP/IP) network is described. Previous expert systems for managing networks have focused on managing network faults after they occur. However, this proactive expert system can monitor and control network components in near real time. The ability to directly manage network elements from the C Language Integrated Production System (CLIPS) is accomplished by the integration of the Simple Network Management Protocol (SNMP) and an Abstract Syntax Notation (ASN) parser into the CLIPS artificial intelligence language.
Intelligent systems technology infrastructure for integrated systems
NASA Technical Reports Server (NTRS)
Lum, Henry, Jr.
1991-01-01
Significant advances have occurred during the last decade in intelligent systems technologies (a.k.a. knowledge-based systems, KBS) including research, feasibility demonstrations, and technology implementations in operational environments. Evaluation and simulation data obtained to date in real-time operational environments suggest that cost-effective utilization of intelligent systems technologies can be realized for Automated Rendezvous and Capture applications. The successful implementation of these technologies involve a complex system infrastructure integrating the requirements of transportation, vehicle checkout and health management, and communication systems without compromise to systems reliability and performance. The resources that must be invoked to accomplish these tasks include remote ground operations and control, built-in system fault management and control, and intelligent robotics. To ensure long-term evolution and integration of new validated technologies over the lifetime of the vehicle, system interfaces must also be addressed and integrated into the overall system interface requirements. An approach for defining and evaluating the system infrastructures including the testbed currently being used to support the on-going evaluations for the evolutionary Space Station Freedom Data Management System is presented and discussed. Intelligent system technologies discussed include artificial intelligence (real-time replanning and scheduling), high performance computational elements (parallel processors, photonic processors, and neural networks), real-time fault management and control, and system software development tools for rapid prototyping capabilities.
NASA Technical Reports Server (NTRS)
Shaver, Charles; Williamson, Michael
1986-01-01
The NASA Ames Research Center sponsors a research program for the investigation of Intelligent Flight Control Actuation systems. The use of artificial intelligence techniques in conjunction with algorithmic techniques for autonomous, decentralized fault management of flight-control actuation systems is explored under this program. The design, development, and operation of the interface for laboratory investigation under this program are documented. The interface, architecturally based on the Intel 8751 microcontroller, is an interrupt-driven system designed to receive a digital message from an ultrareliable fault-tolerant control system (UFTCS). The interface links the UFTCS to an electronic servo-control unit, which controls a set of hydraulic actuators. It was necessary to build a UFTCS emulator (also based on the Intel 8751) to provide signal sources for testing the equipment.
Risk assessment techniques with applicability in marine engineering
NASA Astrophysics Data System (ADS)
Rudenko, E.; Panaitescu, F. V.; Panaitescu, M.
2015-11-01
Risk management is nowadays a carefully planned process, woven into the broader problem of increasing business efficiency; a passive attitude to risk and mere awareness of its existence are being replaced by active management techniques. Risk assessment is one of the most important stages of risk management, since risk must first be analyzed and evaluated before it can be managed. Although many definitions of the notion exist, risk assessment generally refers to the systematic process of identifying the factors and types of risk and assessing them quantitatively; risk analysis methodology thus combines mutually complementary quantitative and qualitative approaches. Purpose of the work: This paper considers Fault Tree Analysis (FTA) as a risk assessment technique. The objectives are to understand the purpose of FTA, to understand and apply the rules of Boolean algebra, to analyse a simple system using FTA, and to weigh FTA's advantages and disadvantages. Research and methodology: The main purpose is to help identify potential causes of system failures before the failures actually occur, and to evaluate the probability of the top event. The steps of the analysis are: examination of the system from top to bottom; the use of symbols to represent events; the use of mathematical tools for critical areas; and the use of fault tree logic diagrams to identify the causes of the top event. Results: The study yields the critical areas, the fault tree logic diagrams, and the probability of the top event; these results can be used in risk assessment analyses.
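The Boolean-algebra rules used to evaluate the top-event probability reduce to simple gate arithmetic when the basic events are independent. The sketch below shows only that arithmetic; the example tree and all probabilities are invented for illustration:

```python
def or_gate(probabilities):
    """P(A or B or ...) for independent basic events:
    1 - product(1 - p_i)."""
    survive = 1.0
    for p in probabilities:
        survive *= (1.0 - p)
    return 1.0 - survive

def and_gate(probabilities):
    """P(A and B and ...) for independent basic events:
    product(p_i)."""
    fail = 1.0
    for p in probabilities:
        fail *= p
    return fail

# Illustrative tree:
#   Top = (pump fails OR valve sticks) AND backup system fails
p_pump, p_valve, p_backup = 0.01, 0.02, 0.05
p_top = and_gate([or_gate([p_pump, p_valve]), p_backup])
```

Evaluating the gates bottom-up in this way gives the top-event probability directly once the tree and basic-event probabilities are fixed.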
Energy-efficient fault tolerance in multiprocessor real-time systems
NASA Astrophysics Data System (ADS)
Guo, Yifeng
The recent progress in multiprocessor/multicore systems has important implications for real-time system design and operation. From vehicle navigation to space applications as well as industrial control systems, the trend is to deploy multiple processors in real-time systems: systems with 4-8 processors are common, and it is expected that many-core systems with dozens of processing cores will be available in the near future. For such systems, in addition to the general temporal requirements common to all real-time systems, two additional operational objectives are seen as critical: energy efficiency and fault tolerance. An intriguing dimension of the problem is that energy efficiency and fault tolerance are typically conflicting objectives, because tolerating faults (e.g., permanent or transient) often requires extra resources with high energy consumption potential. In this dissertation, various techniques for energy-efficient fault tolerance in multiprocessor real-time systems are investigated. First, the Reliability-Aware Power Management (RAPM) framework, which can preserve the system reliability with respect to transient faults when Dynamic Voltage Scaling (DVS) is applied for energy savings, is extended to support parallel real-time applications with precedence constraints. Next, the traditional Standby-Sparing (SS) technique for dual-processor systems, which takes both transient and permanent faults into consideration while saving energy, is generalized to support multiprocessor systems with an arbitrary number of identical processors. Observing the inefficient usage of slack time in the SS technique, a Preference-Oriented Scheduling Framework is designed to address the problem where tasks are given preferences for being executed as soon as possible (ASAP) or as late as possible (ALAP).
A preference-oriented earliest deadline (POED) scheduler is proposed and its application in multiprocessor systems for energy-efficient fault tolerance is investigated, where tasks' main copies are executed ASAP and backup copies ALAP to reduce the overlapped execution of the main and backup copies of the same task and thus reduce energy consumption. All proposed techniques are evaluated through extensive simulations and compared with other state-of-the-art approaches. The simulation results confirm that the proposed schemes can preserve the system reliability while still achieving substantial energy savings. Finally, for both the SS-based and POED-based Energy-Efficient Fault-Tolerant (EEFT) schemes, a series of recovery strategies is designed for the case where more than one fault (transient or permanent) needs to be tolerated.
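The reliability/energy tension behind RAPM-style schemes can be illustrated with a commonly used analytical model in which the transient-fault rate rises exponentially as DVS scales the frequency down. All constants and the recovery-copy formulation below are generic illustrations under that assumed model, not the dissertation's exact scheme:

```python
import math

# Assumed model: transient-fault rate at normalized frequency f in
# [F_MIN, 1] is LAMBDA0 * 10^(D*(1-f)/(1-F_MIN)); illustrative constants.
LAMBDA0 = 1e-6   # fault rate at maximum frequency (assumed)
D = 2.0          # fault-rate sensitivity to scaling (assumed)
F_MIN = 0.4      # lowest normalized frequency (assumed)

def fault_rate(f):
    return LAMBDA0 * 10 ** (D * (1.0 - f) / (1.0 - F_MIN))

def reliability(c, f):
    """Probability that a task with worst-case execution time c (at f=1)
    completes without a transient fault when run at frequency f.
    Execution stretches to c/f, and the fault rate is higher at low f."""
    return math.exp(-fault_rate(f) * c / f)

def rapm_reliability(c, f):
    """RAPM idea: run a scaled main copy and, only if it fails, a
    full-speed recovery copy; overall reliability then exceeds that of
    a single full-speed execution."""
    r_scaled = reliability(c, f)
    return r_scaled + (1.0 - r_scaled) * reliability(c, 1.0)

c = 10.0
print(reliability(c, 1.0), reliability(c, 0.5), rapm_reliability(c, 0.5))
```

The sketch shows why naive DVS hurts reliability (scaled execution is both longer and more fault-prone) and how reserving slack for a recovery copy restores it.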
Assurance of Fault Management: Risk-Significant Adverse Condition Awareness
NASA Technical Reports Server (NTRS)
Fitz, Rhonda
2016-01-01
Fault Management (FM) systems are ranked high in risk-based assessment of criticality within flight software, emphasizing the importance of establishing highly competent domain expertise to provide assurance for NASA projects, especially as spaceflight systems continue to increase in complexity. Insight into specific characteristics of FM architectures seen embedded within safety- and mission-critical software systems analyzed by the NASA Independent Verification and Validation (IV&V) Program has been enhanced with an FM Technical Reference (TR) suite. Benefits are aimed beyond the IV&V community to those that seek ways to efficiently and effectively provide software assurance to reduce the FM risk posture of NASA and other space missions. The identification of particular FM architectures, visibility, and associated IV&V techniques provides a TR suite that enables greater assurance that critical software systems will adequately protect against faults and respond to adverse conditions. The role FM has with regard to overall asset protection of flight software systems is being addressed with the development of an adverse condition (AC) database encompassing flight software vulnerabilities. Identification of potential off-nominal conditions and analysis to determine how a system responds to these conditions are important aspects of hazard analysis and fault management. Understanding what ACs the mission may face, and ensuring that they are prevented or addressed, is the responsibility of the assurance team, which necessarily should have insight into ACs beyond those defined by the project itself. Research efforts sponsored by NASA's Office of Safety and Mission Assurance defined terminology, categorized data fields, and designed a baseline repository that centralizes and compiles a comprehensive listing of ACs and correlated data relevant across many NASA missions.
This prototype tool helps projects improve analysis by tracking ACs and allowing queries based on project, mission type, domain component, causal fault, and other key characteristics. The repository has a firm structure, an initial collection of data, and an interface established for informational queries, with plans for integration within the Enterprise Architecture at NASA IV&V, enabling support and accessibility across the Agency. The development of an improved workflow process for adaptive, risk-informed FM assurance is currently underway.
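The kind of query the repository supports can be sketched with a tiny in-memory model; the field names and records below are invented for illustration and are not the actual OSMA repository schema:

```python
from dataclasses import dataclass

# Hypothetical adverse-condition (AC) records; fields mirror the query
# criteria named above (project, mission type, domain component, causal
# fault), with invented values.

@dataclass
class AdverseCondition:
    ac_id: str
    project: str
    mission_type: str
    domain_component: str
    causal_fault: str

REPOSITORY = [
    AdverseCondition("AC-001", "MissionX", "Earth Orbiter",
                     "attitude_control", "sensor_dropout"),
    AdverseCondition("AC-002", "MissionY", "Deep Space Robotic",
                     "power", "cell_degradation"),
    AdverseCondition("AC-003", "MissionZ", "Earth Orbiter",
                     "comm", "sensor_dropout"),
]

def query(**criteria):
    """Return all ACs matching every given field=value criterion."""
    return [ac for ac in REPOSITORY
            if all(getattr(ac, k) == v for k, v in criteria.items())]

hits = query(mission_type="Earth Orbiter", causal_fault="sensor_dropout")
print([ac.ac_id for ac in hits])
```

Cross-mission queries like this are what let an assurance team see ACs beyond those defined by their own project.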
Augmentation of the space station module power management and distribution breadboard
NASA Technical Reports Server (NTRS)
Walls, Bryan; Hall, David K.; Lollar, Louis F.
1991-01-01
The space station module power management and distribution (SSM/PMAD) breadboard models power distribution and management, including scheduling, load prioritization, and a fault detection, identification, and recovery (FDIR) system within a Space Station Freedom habitation or laboratory module. This 120 VDC system is capable of distributing up to 30 kW of power among more than 25 loads. In addition to the power distribution hardware, the system includes computer control through a hierarchy of processes. The lowest level consists of fast, simple (from a computing standpoint) switchgear that is capable of quickly safing the system. At the next level are local load center processors (LLPs), which execute load scheduling, perform redundant switching, and shed loads which use more than scheduled power. Above the LLPs are three cooperating artificial intelligence (AI) systems which manage load prioritization, load scheduling, load shedding, and fault recovery and management. Recent upgrades to hardware and modifications to software at both the LLP and AI system levels promise a drastic increase in speed, a significant increase in functionality and reliability, and potential for further examination of advanced automation techniques. The background, the SSM/PMAD, the interface to the Lewis Research Center test bed, the large autonomous spacecraft electrical power system, and future plans are discussed.
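The LLP-level load-shedding step described above can be sketched as a priority-ordered walk over the load table; load names, priorities, and wattages are invented for illustration:

```python
# Hypothetical sketch of priority-based load shedding: when total demand
# exceeds the scheduled power budget, the lowest-priority loads are shed
# first until demand fits within the budget.

def shed_loads(loads, budget_w):
    """loads: list of (name, priority, watts); higher priority means
    the load is kept longer. Returns (kept, shed) name lists."""
    kept, shed = [], []
    total = sum(w for _, _, w in loads)
    # Consider lowest-priority loads first as shedding candidates.
    for name, prio, w in sorted(loads, key=lambda l: l[1]):
        if total > budget_w:
            shed.append(name)
            total -= w
        else:
            kept.append(name)
    return kept, shed

loads = [("life_support", 10, 12000),
         ("experiment_a", 3, 8000),
         ("lighting", 5, 4000),
         ("experiment_b", 2, 9000)]   # illustrative 33 kW of demand
kept, shed = shed_loads(loads, budget_w=30000)   # 30 kW module capacity
print("shed:", shed)
```

In the breadboard the priorities themselves come from the AI-level priority manager, so the same mechanism also serves fault recovery: after a fault removes a power source, the reduced budget is re-applied and low-priority loads drop out automatically.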
Goal-Function Tree Modeling for Systems Engineering and Fault Management
NASA Technical Reports Server (NTRS)
Johnson, Stephen B.; Breckenridge, Jonathan T.
2013-01-01
This paper describes a new representation that enables rigorous definition and decomposition of both nominal and off-nominal system goals and functions: the Goal-Function Tree (GFT). GFTs extend the concept and process of functional decomposition, utilizing state variables as a key mechanism to ensure physical and logical consistency and completeness of the decomposition of goals (requirements) and functions, and enabling full and complete traceability to the design. The GFT also provides means to define and represent off-nominal goals and functions that are activated when the system's nominal goals are not met. The physical accuracy of the GFT, and its ability to represent both nominal and off-nominal goals, enable the GFT to be used for various analyses of the system, including assessments of the completeness and traceability of system goals and functions, the coverage of fault management failure detections, and definition of system failure scenarios.
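A minimal sketch of the GFT idea, assuming a simplified node structure (the paper's actual representation is richer, and all goal and state-variable names below are invented): each goal constrains state variables, decomposes into subgoals, and may carry an off-nominal goal activated when the nominal goal is not met.

```python
from dataclasses import dataclass, field

@dataclass
class Goal:
    name: str
    state_vars: list                  # state variables this goal constrains
    children: list = field(default_factory=list)   # decomposed subgoals
    off_nominal: "Goal" = None        # activated on nominal-goal failure

def state_var_coverage(goal, seen=None):
    """Collect every state variable referenced anywhere in the tree --
    a simple completeness/coverage check of the kind the GFT enables."""
    if seen is None:
        seen = set()
    seen.update(goal.state_vars)
    for child in goal.children:
        state_var_coverage(child, seen)
    if goal.off_nominal:
        state_var_coverage(goal.off_nominal, seen)
    return seen

# Illustrative launch-ascent fragment:
ascent = Goal("achieve_orbit", ["altitude", "velocity"], children=[
    Goal("maintain_thrust", ["chamber_pressure"],
         off_nominal=Goal("abort_to_orbit",
                          ["altitude", "engine_out_count"])),
    Goal("maintain_attitude", ["attitude_error"]),
])
print(sorted(state_var_coverage(ascent)))
```

Because every goal is tied to concrete state variables, checks like the one above can flag state variables that no failure detection monitors, which is how the GFT supports coverage assessment of fault management detections.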
Performance monitor system functional simulator, environmental data, orbiter 101 (HFT)
NASA Technical Reports Server (NTRS)
Parker, F. W.
1974-01-01
Information concerning the environment component of the space shuttle performance monitor system simulator (PMSS) and those subsystems operational on the shuttle orbiter 101 used for horizontal flight test (HFT) is provided, along with detailed data for the shuttle performance monitor system (PMS) whose software requirements evolve from three basic PMS functions: (1) fault detection and annunciation; (2) subsystem measurement management; and (3) subsystem configuration management. Information relative to the design and operation of Orbiter systems for HFT is also presented, and the functional paths are identified to the lowest level at which the crew can control the system functions. Measurement requirements are given which are necessary to adequately monitor the health status of the system. PMS process requirements, relative to the measurements which are necessary for fault detection and annunciation of a failed functional path, consist of measurement characteristics, tolerance limits, precondition tests, and correlation measurements.
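The limit-sensing fault detection described above (measurement characteristics, tolerance limits, precondition tests) can be sketched as a simple table-driven check; the measurement names, limits, and states below are illustrative assumptions, not Orbiter 101 values:

```python
# Each measurement carries tolerance limits and a precondition test, so
# a parameter is only checked when the system is in a state where the
# limit applies (e.g., hydraulic pressure only while the pump runs).

MEASUREMENTS = {
    # name: (low_limit, high_limit, precondition)
    "hyd_pressure_psi": (2800.0, 3200.0, lambda state: state["hyd_pump_on"]),
    "bus_voltage_v":    (26.0, 32.0,     lambda state: True),
}

def check(telemetry, state):
    """Return annunciations for out-of-tolerance measurements whose
    precondition test is satisfied; a missing measurement also faults."""
    faults = []
    for name, (lo, hi, precond) in MEASUREMENTS.items():
        if not precond(state):
            continue
        value = telemetry.get(name)
        if value is None or not (lo <= value <= hi):
            faults.append(name)
    return faults

telemetry = {"hyd_pressure_psi": 2500.0, "bus_voltage_v": 28.0}
print(check(telemetry, {"hyd_pump_on": True}))    # pressure out of limits
print(check(telemetry, {"hyd_pump_on": False}))   # precondition masks it
```

The precondition test is what prevents spurious annunciations on functional paths the crew has deliberately powered down.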
NASA Astrophysics Data System (ADS)
Nolan, S.; Jones, C. E.; Munro, R.; Norman, P.; Galloway, S.; Venturumilli, S.; Sheng, J.; Yuan, W.
2017-12-01
Hybrid electric propulsion aircraft are proposed to improve overall aircraft efficiency, enabling future rising demands for air travel to be met. The development of appropriate electrical power systems to provide thrust for the aircraft is a significant challenge due to the much higher required power generation capacity levels and the complexity of the aero-electrical power systems (AEPS). The efficiency and weight of the AEPS are critical to ensure that the benefits of hybrid propulsion are not negated by the electrical power train. Hence it is proposed that for larger aircraft (~200 passengers) superconducting power systems be used to meet target power densities. Central to the design of the hybrid propulsion AEPS is a robust and reliable electrical protection and fault management system. It is known from previous studies that the choice of protection system may have a significant impact on the overall efficiency of the AEPS. Hence an informed design process which considers the key trades between choice of cable and protection requirements is needed. To date, the fault response of a rail-to-rail fault on a voltage-source-converter-interfaced DC link in a superconducting power system has only been investigated using simulation models validated against theoretical values from the literature. This paper presents the experimentally obtained fault response of a variety of different types of superconducting tape for a rail-to-rail DC fault. The paper then uses these results as a platform to identify key trades between protection requirements and cable design, providing guidelines to enable future informed decisions to optimise hybrid propulsion electrical power system and protection design.
Priority scheme planning for the robust SSM/PMAD testbed
NASA Technical Reports Server (NTRS)
Elges, Michael R.; Ashworth, Barry R.
1991-01-01
When the priorities of manually controlled resources are mixed with those of autonomously controlled resources, the space station module power management and distribution (SSM/PMAD) environment requires cooperating expert-system interaction between the planning function and the priority manager. The elements and interactions of the SSM/PMAD planning and priority management functions are presented, and their cooperation toward common achievement is described. In the SSM/PMAD testbed these actions are guided by a system planning function, KANT, which has insight into the executing system and its automated database. First, the user must be given access to all information which may have an effect on the desired outcome. Second, the fault manager element, FRAMES, must be informed of any change so that correct diagnoses and operations take place if and when faults occur. Third, some element must act as mediator for the selection of resources and actions to be added or removed at the user's request; this is performed by the priority manager, LPLMS. Lastly, the scheduling mechanism, MAESTRO, must provide future schedules adhering to the user-modified resource base.
Risk-Significant Adverse Condition Awareness Strengthens Assurance of Fault Management Systems
NASA Technical Reports Server (NTRS)
Fitz, Rhonda
2017-01-01
As spaceflight systems increase in complexity, Fault Management (FM) systems are ranked high in risk-based assessment of software criticality, emphasizing the importance of establishing highly competent domain expertise to provide assurance. Adverse conditions (ACs) and specific vulnerabilities encountered by safety- and mission-critical software systems have been identified through efforts to reduce the risk posture of software-intensive NASA missions. Acknowledgement of potential off-nominal conditions and analysis to determine software system resiliency are important aspects of hazard analysis and FM. A key component of assuring FM is an assessment of how well software addresses susceptibility to failure through consideration of ACs. Focus on significant risk predicted through experienced analysis conducted at the NASA Independent Verification & Validation (IV&V) Program enables the scoping of effective assurance strategies with regard to overall asset protection of complex spaceflight as well as ground systems. Research efforts sponsored by NASA's Office of Safety and Mission Assurance (OSMA) defined terminology, categorized data fields, and designed a baseline repository that centralizes and compiles a comprehensive listing of ACs and correlated data relevant across many NASA missions. This prototype tool helps projects improve analysis by tracking ACs and allowing queries based on project, mission type, domain/component, causal fault, and other key characteristics. Vulnerability in off-nominal situations, architectural design weaknesses, and unexpected or undesirable system behaviors in reaction to faults are curtailed with the awareness of ACs and risk-significant scenarios modeled for analysts through this database. Integration within the Enterprise Architecture at NASA IV&V enables interfacing with other tools and datasets, technical support, and accessibility across the Agency.
This paper discusses the development of an improved workflow process utilizing this database for adaptive, risk-informed FM assurance that critical software systems will safely and securely protect against faults and respond to ACs in order to achieve successful missions.
Improving Service Management in Campus IT Operations
ERIC Educational Resources Information Center
Wan, Stewart H. C.; Chan, Yuk-Hee
2008-01-01
Purpose: This paper aims at presenting the benefits from implementing IT service management (ITSM) in an organization for managing campus-wide IT operations. In order to improve the fault correlation from business perspectives, we proposed a framework to automate network and system alerts with respect to its business service impact for proactive…
Fault tolerant data management system
NASA Technical Reports Server (NTRS)
Gustin, W. M.; Smither, M. A.
1972-01-01
Described in detail are: (1) results obtained in modifying the onboard data management system software to a multiprocessor fault tolerant system; (2) a functional description of the prototype buffer I/O units; (3) description of modification to the ACADC and stimuli generating unit of the DTS; and (4) summaries and conclusions on techniques implemented in the rack and prototype buffers. Also documented is the work done in investigating techniques of high speed (5 Mbps) digital data transmission in the data bus environment. The application considered is a multiport data bus operating with the following constraints: no preferred stations; random bus access by all stations; all stations equally likely to source or sink data; no limit to the number of stations along the bus; no branching of the bus; and no restriction on station placement along the bus.
Intelligent fault diagnosis and failure management of flight control actuation systems
NASA Technical Reports Server (NTRS)
Bonnice, William F.; Baker, Walter
1988-01-01
The real-time fault diagnosis and failure management (FDFM) of current operational and experimental dual tandem aircraft flight control system actuators was investigated. Dual tandem actuators were studied because of the active FDFM capability required to manage the redundancy of these actuators. The FDFM methods used on current dual tandem actuators were determined by examining six specific actuators. The FDFM capability on these six actuators was also evaluated. One approach for improving the FDFM capability on dual tandem actuators may be through the application of artificial intelligence (AI) technology. Existing AI approaches and applications of FDFM were examined and evaluated. Based on the general survey of AI FDFM approaches, the potential role of AI technology for real-time actuator FDFM was determined. Finally, FDFM and maintainability improvements for dual tandem actuators were recommended.
"DMS-R, the Brain of the ISS": 10 Years of Continuous Successful Operation in Space
NASA Astrophysics Data System (ADS)
Wolff, Bernd; Scheffers, Peter
2012-08-01
Space industries on both sides of the Atlantic were faced with a new situation of collaboration at the beginning of the 1990s. In 1995, industrial cooperation between ASTRIUM ST, Bremen and RSC-E, Moscow began, aiming at outfitting the Russian Service Module ZVEZDA for the ISS with computers. The requested equipment had to provide not only redundancy but also fault tolerance and high availability. The design and development of two fault-tolerant computers (FTCs), responsible for telemetry (Telemetry Computer: TC) and central control (CC), as well as the man-machine interface CPC, were contracted to ASTRIUM ST, Bremen. The computer system is responsible, e.g., for the life support system and ISS re-boost control. In July 2000, the integration of the Russian Service Module ZVEZDA with the Russian ZARYA FGB and the American Node 1 bore witness to transatlantic and European cooperation. The Russian Service Module ZVEZDA provides several basic functions such as avionics control, Environmental Control and Life Support (ECLS) in the ISS, and control of the docked Automated Transfer Vehicle (ATV), which includes re-boost of the ISS. If these elementary functions fail or do not work reliably, the effects for the ISS would be catastrophic with respect to safety (manned spaceflight) and the ISS mission. For that reason the responsible computer system, Data Management System - Russia (DMS-R), is also called "the brain of the ISS". The Russian Service Module ZVEZDA, including DMS-R, was launched on 12 July 2000; DMS-R was operational during launch and docking. The talk provides information about the definition, design, and development of DMS-R, the integration of DMS-R into the Russian Service Module, and the maintenance of the system in space. Besides the technical aspects, German-Russian cooperation is also an important subject of this talk.
An outlook concludes the talk, covering further development activities and applications of fault-tolerant systems. The importance of the DMS-R equipment for the ISS with respect to availability and reliability is reported in paragraph 1.2, which describes a serious incident. The DMS-R architecture, consisting of two fault-tolerant computers, their interconnection via the MIL-STD-1553 bus, and the Control Post Computer (CPC) as man-machine interface, is given in figure 1. The main data transfer within the ISS, and therefore also within the Russian segment, is managed by the MIL-STD-1553 bus. The focus of this script is neither the operational concept nor the fault-tolerant design according to the Byzantine theorem, but the architectural embedment. One fault-tolerant computer consists of up to four fault containment regions (FCRs), which compare input and output data and decide by majority voting whether a faulty FCR has to be isolated. For this purpose all data have to pass the so-called fault management element and are distributed to the other participants in the computer pool (FTC). Each fault containment region is connected to the avionics buses of the vehicle avionics system. In case of a faulty FCR (a wrong calculation result detected by the other FCRs or by built-in self-detection), the dedicated FCR will reset itself or be reset by the others. The bus controller functions of the isolated FCR are taken over, according to a specific deterministic scheme, by another FCR. The FTC data throughput is maintained, and FTC operation continues without interruption. Each FCR consists of an application CPU board (ALB), the fault management layer (FML), the avionics bus interface board (AVI), and a power supply (PSU), sharing a VME data bus. The FML is fully transparent, in terms of I/O accessibility, to the application S/W, and autonomously votes the data received from the avionics buses and transmitted from the application.
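The FCR majority-voting step described above can be sketched in a few lines; this is an illustrative software model only (the real FML votes in hardware), and the FCR identifiers and values are invented:

```python
from collections import Counter

# Each fault containment region (FCR) produces an output; the fault
# management layer compares them and any FCR disagreeing with the
# majority is flagged for isolation/reset.

def vote(fcr_outputs):
    """fcr_outputs: dict of FCR id -> computed value.
    Returns (majority_value, faulty_fcr_ids)."""
    counts = Counter(fcr_outputs.values())
    majority, n = counts.most_common(1)[0]
    if n <= len(fcr_outputs) // 2:
        raise RuntimeError("no majority -- fault cannot be masked")
    faulty = [fcr for fcr, v in fcr_outputs.items() if v != majority]
    return majority, faulty

value, faulty = vote({"FCR1": 42, "FCR2": 42, "FCR3": 7, "FCR4": 42})
print(value, faulty)
```

With four regions the voter can mask one faulty FCR and keep producing the correct value without interruption, which is exactly the property that lets FTC operation continue while the isolated FCR resets.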
Comprehensive Fault Tolerance and Science-Optimal Attitude Planning for Spacecraft Applications
NASA Astrophysics Data System (ADS)
Nasir, Ali
Spacecraft operate in a harsh environment, are costly to launch, and experience unavoidable communication delay and bandwidth constraints. These factors motivate the need for effective onboard mission and fault management. This dissertation presents an integrated framework to optimize science goal achievement while identifying and managing encountered faults. Goal-related tasks are defined by pointing the spacecraft instrumentation toward distant targets of scientific interest. The relative value of science data collection is traded against the risk of failures to determine an optimal policy for mission execution. Our major innovation in fault detection and reconfiguration is to incorporate fault information obtained from two types of spacecraft models: one based on the dynamics of the spacecraft and the second based on the internal composition of the spacecraft. For fault reconfiguration, we consider possible changes in both the dynamics-based control law configuration and the composition-based switching configuration. We formulate our problem as a stochastic sequential decision problem, or Markov Decision Process (MDP). To avoid the computational complexity involved in a fully-integrated MDP, we decompose our problem into multiple MDPs. These MDPs include planning MDPs for different fault scenarios, a fault detection MDP based on a logic-based model of spacecraft component and system functionality, an MDP for resolving conflicts between fault information from the logic-based model and the dynamics-based spacecraft models, and a reconfiguration MDP that generates a policy optimized over the relative importance of the mission objectives versus spacecraft safety. Approximate Dynamic Programming (ADP) methods for the decomposition of the planning and fault detection MDPs are applied. To show the performance of the MDP-based frameworks and ADP methods, a suite of spacecraft attitude planning case studies is described.
These case studies are used to analyze the content and behavior of computed policies in response to the changes in design parameters. A primary case study is built from the Far Ultraviolet Spectroscopic Explorer (FUSE) mission for which component models and their probabilities of failure are based on realistic mission data. A comparison of our approach with an alternative framework for spacecraft task planning and fault management is presented in the context of the FUSE mission.
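The MDP formulation above can be illustrated with a toy two-state spacecraft model solved by value iteration. The states, actions, transition probabilities, and rewards below are all invented for illustration; the dissertation's decomposed MDPs are far richer:

```python
# Tiny science-vs-safety MDP: keep doing science (high reward, risk of
# entering a faulted state) or safe-and-diagnose (no science return,
# chance of recovering). Value iteration yields the optimal policy.

STATES = ["nominal", "faulted"]
ACTIONS = ["do_science", "safe_and_diagnose"]

# P[s][a] = list of (next_state, probability); R[s][a] = reward.
P = {
    "nominal": {
        "do_science":        [("nominal", 0.95), ("faulted", 0.05)],
        "safe_and_diagnose": [("nominal", 1.0)],
    },
    "faulted": {
        "do_science":        [("faulted", 1.0)],
        "safe_and_diagnose": [("nominal", 0.8), ("faulted", 0.2)],
    },
}
R = {
    "nominal": {"do_science": 10.0, "safe_and_diagnose": 0.0},
    "faulted": {"do_science": -50.0, "safe_and_diagnose": -1.0},
}

GAMMA = 0.9

def q(s, a, V):
    return R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])

def value_iteration(eps=1e-8):
    V = {s: 0.0 for s in STATES}
    while True:
        V_new = {s: max(q(s, a, V) for a in ACTIONS) for s in STATES}
        if max(abs(V_new[s] - V[s]) for s in STATES) < eps:
            return V_new
        V = V_new

V = value_iteration()
policy = {s: max(ACTIONS, key=lambda a: q(s, a, V)) for s in STATES}
print(policy)
```

Even this toy shows the trade the dissertation optimizes: the policy gathers science while healthy but sacrifices immediate science return to restore safety once a fault is believed present.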
SLURM: Simple Linux Utility for Resource Management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jette, M; Dunlap, C; Garlick, J
2002-07-08
Simple Linux Utility for Resource Management (SLURM) is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for Linux clusters of thousands of nodes. Components include machine status, partition management, job management, scheduling, and stream copy modules. The design also includes a scalable, general-purpose communication infrastructure. This paper presents an overview of the SLURM architecture and functionality.
Dolev, Danny; Függer, Matthias; Posch, Markus; Schmid, Ulrich; Steininger, Andreas; Lenzen, Christoph
2014-06-01
We present the first implementation of a distributed clock generation scheme for Systems-on-Chip that recovers from an unbounded number of arbitrary transient faults despite a large number of arbitrary permanent faults. We devise self-stabilizing hardware building blocks and a hybrid synchronous/asynchronous state machine enabling metastability-free transitions of the algorithm's states. We provide a comprehensive modeling approach that permits one to prove, given correctness of the constructed low-level building blocks, the high-level properties of the synchronization algorithm (which have been established in a more abstract model). We believe this approach to be of interest in its own right, since it is the first technique permitting one to mathematically verify, at manageable complexity, high-level properties of a fault-prone system in terms of its very basic components. We evaluate a prototype implementation, which has been designed in VHDL, using the Petrify tool in conjunction with some extensions, and synthesized for an Altera Cyclone FPGA.
Space Station Freedom ECLSS: A step toward autonomous regenerative life support systems
NASA Technical Reports Server (NTRS)
Dewberry, Brandon S.
1990-01-01
The Environmental Control and Life Support System (ECLSS) is a Freedom Station distributed system with inherent applicability to extensive automation, primarily due to its comparatively long control-system latencies. These allow longer contemplation times in which to form a more intelligent control strategy and to prevent and diagnose faults. The regenerative nature of the Space Station Freedom ECLSS will introduce closed-loop complexities never before encountered in life support systems. A study to determine ECLSS automation approaches has been completed. The ECLSS baseline software and system processes could be augmented with more advanced fault management and regenerative control systems for a more autonomous, evolutionary system, as well as serving as a firm foundation for future regenerative life support systems. Emerging advanced software technology and tools can be successfully applied to fault management, but a fully automated life support system will require research and development of regenerative control systems and models. The baseline Environmental Control and Life Support System utilizes ground tests in the development of batch chemical and microbial control processes. Long-duration regenerative life support systems will require more active chemical and microbial feedback control systems, which, in turn, will require advancements in regenerative life support models and tools. These models can be verified using ground and on-orbit life support test and operational data, and used in the engineering analysis of proposed intelligent instrumentation feedback and flexible process control technologies for future autonomous regenerative life support systems, including the evolutionary Space Station Freedom ECLSS.
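The shift from batch processes to active feedback control can be illustrated with a toy closed-loop simulation; the proportional controller, cabin model, and every constant below are assumptions for illustration, not Freedom ECLSS values:

```python
# Toy cabin-CO2 loop: crew adds CO2 at a constant rate; the scrubber
# removal rate is commanded proportionally to the error from setpoint.

def simulate_co2(hours=48, setpoint=0.30, kp=0.5):
    """Returns the hourly trace of cabin CO2 partial pressure (kPa)."""
    co2 = 0.50                    # initial partial pressure, kPa (assumed)
    crew_production = 0.05        # kPa added per hour (assumed)
    trace = []
    for _ in range(hours):
        error = co2 - setpoint
        removal = max(0.0, kp * error)     # scrubber cannot add CO2
        co2 += crew_production - min(removal, co2)
        trace.append(co2)
    return trace

trace = simulate_co2()
print(f"final CO2: {trace[-1]:.3f} kPa")
```

Note the classic proportional-control droop: the loop settles above the setpoint (here near 0.4 kPa rather than 0.30) because a steady error is needed to command enough removal to match crew production. It is exactly this kind of model-based analysis of feedback behavior that the abstract argues long-duration regenerative systems will need.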
Automated Power Systems Management (APSM)
NASA Technical Reports Server (NTRS)
Bridgeforth, A. O.
1981-01-01
A breadboard power system incorporating autonomous functions of monitoring, fault detection and recovery, command and control was developed, tested and evaluated to demonstrate technology feasibility. Autonomous functions including switching of redundant power processing elements, individual load fault removal, and battery charge/discharge control were implemented by means of a distributed microcomputer system within the power subsystem. Three local microcomputers provide the monitoring, control and command function interfaces between the central power subsystem microcomputer and the power sources, power processing and power distribution elements. The central microcomputer is the interface between the local microcomputers and the spacecraft central computer or ground test equipment.
NASA Technical Reports Server (NTRS)
Fitz, Rhonda; Whitman, Gerek
2016-01-01
Research into complexities of software systems Fault Management (FM) and how architectural design decisions affect safety, preservation of assets, and maintenance of desired system functionality has coalesced into a technical reference (TR) suite that advances the provision of safety and mission assurance. The NASA Independent Verification and Validation (IVV) Program, with Software Assurance Research Program support, extracted FM architectures across the IVV portfolio to evaluate robustness, assess visibility for validation and test, and define software assurance methods applied to the architectures and designs. This investigation spanned IVV projects with seven different primary developers, a wide range of sizes and complexities, and encompassed Deep Space Robotic, Human Spaceflight, and Earth Orbiter mission FM architectures. The initiative continues with an expansion of the TR suite to include Launch Vehicles, adding the benefit of investigating differences intrinsic to model-based FM architectures and insight into complexities of FM within an Agile software development environment, in order to improve awareness of how nontraditional processes affect FM architectural design and system health management.
Fault Detection and Diagnosis of Railway Point Machines by Sound Analysis
Lee, Jonguk; Choi, Heesu; Park, Daihee; Chung, Yongwha; Kim, Hee-Young; Yoon, Sukhan
2016-01-01
Railway point devices act as actuators that provide different routes to trains by driving switchblades from the current position to the opposite one. Point failure can significantly affect railway operations, with potentially disastrous consequences. Therefore, early detection of anomalies is critical for monitoring and managing the condition of rail infrastructure. We present a data mining solution that utilizes audio data to efficiently detect and diagnose faults in railway condition monitoring systems. The system extracts mel-frequency cepstrum coefficients (MFCCs) from audio data, reduces feature dimensions using attribute subset selection, and employs support vector machines (SVMs) for early detection and classification of anomalies. Experimental results show that the system enables cost-effective detection and diagnosis of faults using an inexpensive microphone, with accuracy exceeding 94.1% whether used alone or in combination with other known methods. PMID:27092509
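The full pipeline (MFCC extraction plus an SVM) needs DSP and machine learning libraries; as a self-contained stand-in for the same feature-then-classify pattern, the sketch below extracts two crude acoustic features (RMS energy and zero-crossing rate) from synthetic waveforms and classifies with a nearest-centroid rule. Everything here, signals, features, and labels, is invented for illustration.

```python
import math

# Feature-then-classify sketch: a "faulty" point machine is assumed to
# rattle at a higher frequency, which shows up in the zero-crossing rate.

def features(signal):
    rms = math.sqrt(sum(s * s for s in signal) / len(signal))
    zcr = sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0) / len(signal)
    return (rms, zcr)

def nearest_centroid(x, centroids):
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))

n = 1000
normal = [0.5 * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
faulty = [0.5 * math.sin(2 * math.pi * 60 * t / n) for t in range(n)]  # rattle

centroids = {"normal": features(normal), "fault": features(faulty)}
test = [0.5 * math.sin(2 * math.pi * 55 * t / n) for t in range(n)]
print(nearest_centroid(features(test), centroids))  # fault
```

Real MFCCs summarize the short-time spectral envelope on a mel scale; the point of the sketch is only the shape of the pipeline, cheap audio in, compact features, and a trained classifier out.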
Decision support system for outage management and automated crew dispatch
Kang, Ning; Mousavi, Mirrasoul
2018-01-23
A decision support system is provided for utility operations to assist with crew dispatch and restoration activities following the occurrence of a disturbance in a multiphase power distribution network, by providing a real-time visualization of possible location(s). The system covers faults that occur on fuse-protected laterals. The system uses real-time data from intelligent electronics devices coupled with other data sources such as static feeder maps to provide a complete picture of the disturbance event, guiding the utility crew to the most probable location(s). This information is provided in real-time, reducing restoration time and avoiding more costly and laborious fault location finding practices.
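One way such a decision support system might narrow a fault to a fuse-protected lateral is to match real-time overcurrent flags from intelligent electronic devices (IEDs) against a static feeder map. The toy sketch below is hypothetical: the feeder map, device names, and matching rule are invented, not taken from the patent.

```python
# Hypothetical sketch: rank probable fault locations by matching the set
# of IEDs reporting overcurrent against each lateral's upstream IED path.

FEEDER_MAP = {            # lateral -> IEDs on the path feeding it
    "lateral-A": ["ied-1"],
    "lateral-B": ["ied-1", "ied-2"],
    "lateral-C": ["ied-3"],
}

def probable_locations(overcurrent_flags):
    """Candidates are laterals whose upstream IED set matches the flags."""
    flagged = {ied for ied, on in overcurrent_flags.items() if on}
    return sorted(lat for lat, ieds in FEEDER_MAP.items()
                  if set(ieds) == flagged and flagged)

print(probable_locations({"ied-1": True, "ied-2": True}))  # ['lateral-B']
print(probable_locations({"ied-1": True}))                 # ['lateral-A']
```

A fault on lateral-B draws fault current through both ied-1 and ied-2, so both flag; a fault on lateral-A flags only ied-1. Visualizing the matching lateral on the feeder map is what guides the crew dispatch.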
NASA Astrophysics Data System (ADS)
Sanchez-Vila, X.; de Barros, F.; Bolster, D.; Nowak, W.
2010-12-01
Assessing the potential risk of hydro(geo)logical supply systems to human populations is an interdisciplinary field. It relies on expertise in fields as distant as hydrogeology, medicine, and anthropology, and needs powerful translation concepts to provide decision support and inform policy making. Reliable health risk estimates need to account for the uncertainties in hydrological, physiological and human behavioral parameters. We propose the use of fault trees to address the task of probabilistic risk analysis (PRA) and to support related management decisions. Fault trees allow decomposing the assessment of health risk into individual manageable modules, thus tackling a complex system by a structural “Divide and Conquer” approach. The complexity within each module can be chosen individually according to data availability, parsimony, relative importance and stage of analysis. The separation into modules allows for a truly inter- and multi-disciplinary approach. This presentation highlights the three novel features of our work: (1) we define failure in terms of risk being above a threshold value, whereas previous studies used auxiliary events such as exceedance of critical concentration levels, (2) we build an integrated fault tree that handles uncertainty in both hydrological and health components in a unified way, and (3) we introduce a new form of stochastic fault tree that allows us to weaken the assumption of independent subsystems required by a classical fault tree approach. We illustrate our concept in a simple groundwater-related setting.
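The modular fault-tree idea can be sketched with classical AND/OR gate algebra over independent events: each module (hydrogeology, exposure behavior, dose-response) reports a probability, and the top event is "health risk above the threshold". All the probabilities and the tree shape below are illustrative assumptions, not the study's model.

```python
# Minimal modular fault tree: gate algebra assuming independent inputs.

def or_gate(*ps):    # event occurs if any independent input occurs
    prod = 1.0
    for p in ps:
        prod *= 1.0 - p
    return 1.0 - prod

def and_gate(*ps):   # event occurs only if all independent inputs occur
    prod = 1.0
    for p in ps:
        prod *= p
    return prod

hydro = or_gate(0.05, 0.02)   # contamination via either of two pathways
expose = or_gate(0.8, 0.5)    # ingestion or dermal exposure
dose = 0.4                    # P(dose-response exceeds risk threshold)
top = and_gate(hydro, expose, dose)   # top event: risk above threshold
print(round(top, 4))  # 0.0248
```

The paper's stochastic fault tree generalizes exactly the independence assumption baked into these gate formulas.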
ESRDC - Designing and Powering the Future Fleet
2018-02-22
... managing short circuit faults in MVDC Systems, and 5) modeling of SiC-based electronic power converters to support accurate scalable models in S3D... Research in advanced thermal management followed three tracks. We developed models of thermal system components that are suitable for use in early stage
Review: Evaluation of Foot-and-Mouth Disease Control Using Fault Tree Analysis.
Isoda, N; Kadohira, M; Sekiguchi, S; Schuppers, M; Stärk, K D C
2015-06-01
An outbreak of foot-and-mouth disease (FMD) causes huge economic losses and animal welfare problems. Although much can be learnt from past FMD outbreaks, several countries are not satisfied with their degree of contingency planning and are aiming for more assurance that their control measures will be effective. The purpose of the present article was to develop a generic fault tree framework for the control of an FMD outbreak as a basis for systematic improvement and refinement of control activities and general preparedness. Fault trees are typically used in engineering to document pathways that can lead to an undesired event, here ineffective FMD control. The fault tree method allows risk managers to identify immature parts of the control system and to analyse the events or steps that will most probably delay rapid and effective disease control during a real outbreak. The fault tree developed here is generic and can be tailored to fit the specific needs of countries. For instance, the specific fault tree for the 2001 FMD outbreak in the UK was refined based on control weaknesses discussed in peer-reviewed articles. Furthermore, the specific fault tree based on the 2001 outbreak was applied to the subsequent FMD outbreak in 2007 to assess the refinement of control measures following the earlier, major outbreak. The FMD fault tree can assist risk managers in developing more refined and adequate control activities against FMD outbreaks and in finding optimum strategies for rapid control. Further application using the current tree will be one of the basic measures for FMD control worldwide. © 2013 Blackwell Verlag GmbH.
DEVELOPMENT AND TESTING OF FAULT-DIAGNOSIS ALGORITHMS FOR REACTOR PLANT SYSTEMS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grelle, Austin L.; Park, Young S.; Vilim, Richard B.
Argonne National Laboratory is further developing fault diagnosis algorithms for use by the operator of a nuclear plant to aid in improved monitoring of overall plant condition and performance. The objective is better management of plant upsets through more timely, informed decisions on control actions, with the ultimate goal of improved plant safety, production, and cost management. Integration of these algorithms with visual aids for operators is taking place through a collaboration under the concept of an operator advisory system. This is a software entity whose purpose is to manage and distill the enormous amount of information an operator must process to understand the plant state, particularly in off-normal situations, and how the state trajectory will unfold in time. The fault diagnosis algorithms were exhaustively tested using computer simulations of twenty different faults introduced into the chemical and volume control system (CVCS) of a pressurized water reactor (PWR). The algorithms are unique in that each new application to a facility requires providing only the piping and instrumentation diagram (PID) and no other plant-specific information; a subject-matter expert is not needed to install and maintain each instance of an application. The testing approach followed accepted procedures for verifying and validating software. It was shown that the code satisfies its functional requirement, which is to accept sensor information, identify process variable trends based on this sensor information, and then return an accurate diagnosis based on chains of rules related to these trends. The validation and verification exercise made use of GPASS, a one-dimensional systems code, for simulating CVCS operation. Plant components were failed and the code generated the resulting plant response.
Parametric studies with respect to the severity of the fault, the richness of the plant sensor set, and the accuracy of sensors were performed as part of the validation exercise. The paper presents the background and an overview of the software, followed by the verification and validation effort using the GPASS code for simulation of plant transients, including a sensitivity study on important parameters.
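The "chains of rules related to trends" approach can be illustrated in a few lines: reduce each sensor history to a qualitative trend (rising, falling, steady), then match trend patterns against diagnostic rules. The CVCS-style sensor names and rules below are invented for the example, not Argonne's actual rule base.

```python
# Trend-based rule diagnosis sketch: sensor histories -> trends -> rules.

def trend(samples, tol=0.01):
    slope = (samples[-1] - samples[0]) / (len(samples) - 1)
    if slope > tol:
        return "rising"
    if slope < -tol:
        return "falling"
    return "steady"

RULES = [  # (required trend per sensor) -> diagnosis; order matters
    ({"tank_level": "falling", "charging_flow": "steady"}, "letdown line leak"),
    ({"tank_level": "falling", "charging_flow": "falling"}, "charging pump degradation"),
]

def diagnose(sensor_histories):
    trends = {name: trend(h) for name, h in sensor_histories.items()}
    for pattern, diagnosis in RULES:
        if all(trends.get(s) == t for s, t in pattern.items()):
            return diagnosis
    return "no fault identified"

print(diagnose({"tank_level": [5.0, 4.8, 4.6, 4.4],
                "charging_flow": [2.0, 2.0, 2.0, 2.0]}))  # letdown line leak
```

The appeal of the approach described in the abstract is that such rules can be generated from the piping and instrumentation diagram alone, without plant-specific tuning by an expert.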
Analytical concepts for health management systems of liquid rocket engines
NASA Technical Reports Server (NTRS)
Williams, Richard; Tulpule, Sharayu; Hawman, Michael
1990-01-01
Substantial improvement in health management systems performance can be realized by implementing advanced analytical methods of processing existing liquid rocket engine sensor data. In this paper, such techniques ranging from time series analysis to multisensor pattern recognition to expert systems to fault isolation models are examined and contrasted. The performance of several of these methods is evaluated using data from test firings of the Space Shuttle main engines.
Probabilistic fault tree analysis of a radiation treatment system.
Ekaette, Edidiong; Lee, Robert C; Cooke, David L; Iftody, Sandra; Craighead, Peter
2007-12-01
Inappropriate administration of radiation for cancer treatment can result in severe consequences such as premature death or appreciably impaired quality of life. There has been little study of vulnerable treatment process components and their contribution to the risk of radiation treatment (RT). In this article, we describe the application of probabilistic fault tree methods to assess the probability of radiation misadministration to patients at a large cancer treatment center. We conducted a systematic analysis of the RT process that identified four process domains: Assessment, Preparation, Treatment, and Follow-up. For the Preparation domain, we analyzed possible incident scenarios via fault trees. For each task, we also identified existing quality control measures. To populate the fault trees we used subjective probabilities from experts and compared results with incident report data. Both the fault tree and the incident report analysis revealed simulation tasks to be most prone to incidents, and the treatment prescription task to be least prone to incidents. The probability of a Preparation domain incident was estimated to be in the range of 0.1-0.7% based on incident reports, which is comparable to the mean value of 0.4% from the fault tree analysis using probabilities from the expert elicitation exercise. In conclusion, an analysis of part of the RT system using a fault tree populated with subjective probabilities from experts was useful in identifying vulnerable components of the system, and provided quantitative data for risk management.
Development and Operation of a Database Machine for Online Access and Update of a Large Database.
ERIC Educational Resources Information Center
Rush, James E.
1980-01-01
Reviews the development of a fault tolerant database processor system which replaced OCLC's conventional file system. A general introduction to database management systems and the operating environment is followed by a description of the hardware selection, software processes, and system characteristics. (SW)
Fault detection and multiclassifier fusion for unmanned aerial vehicles (UAVs)
NASA Astrophysics Data System (ADS)
Yan, Weizhong
2001-03-01
UAVs demand more accurate fault accommodation for their mission manager and vehicle control system in order to achieve a reliability level comparable to that of a piloted aircraft. This paper applies multi-classifier fusion techniques to achieve the necessary performance of the fault detection function for the Lockheed Martin Skunk Works (LMSW) UAV Mission Manager. Three different classifiers that meet the design requirements of UAV fault detection are employed. The binary decision outputs from the classifiers are then aggregated using three different classifier fusion schemes, namely majority vote, weighted majority vote, and Naive Bayes combination. All three schemes are simple and need no retraining. The fusion schemes (except the majority vote, which gives an average performance of the three classifiers) show classification performance better than or equal to that of the best individual classifier. An unavoidable correlation between the classifiers with binary outputs is observed in this study. We conclude that it is this correlation between the classifiers that limits the fusion schemes from achieving even better performance.
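The three fusion rules named in the abstract are standard and easy to state over binary outputs (1 = fault detected). The accuracies, weights, and likelihoods below are illustrative, not the LMSW study's measured values.

```python
# Three fusion rules for binary classifier outputs.

def majority_vote(votes):
    return int(sum(votes) > len(votes) / 2)

def weighted_majority(votes, weights):
    score = sum(w if v else -w for v, w in zip(votes, weights))
    return int(score > 0)

def naive_bayes_combine(votes, p_fault, sens, spec):
    # Treat classifiers as conditionally independent given the true state:
    # multiply the prior odds by each classifier's likelihood ratio.
    odds = p_fault / (1.0 - p_fault)
    for v, se, sp in zip(votes, sens, spec):
        odds *= (se / (1.0 - sp)) if v else ((1.0 - se) / sp)
    return int(odds > 1.0)

votes = [1, 0, 1]
print(majority_vote(votes))                                   # 1
print(weighted_majority(votes, weights=[0.5, 2.0, 0.6]))      # 0
print(naive_bayes_combine(votes, p_fault=0.3,
                          sens=[0.9, 0.7, 0.8], spec=[0.8, 0.9, 0.85]))  # 1
```

The conditional-independence assumption inside `naive_bayes_combine` is exactly what the observed inter-classifier correlation violates, which is the paper's explanation for the performance ceiling.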
Distributed Cooperation Solution Method of Complex System Based on MAS
NASA Astrophysics Data System (ADS)
Weijin, Jiang; Yuhui, Xu
To adapt the fault diagnosis model to dynamic environments and to fully meet the needs of solving complex-system tasks, this paper introduces multi-agent and related technology to complicated fault diagnosis and studies an integrated intelligent control system. Based on the structure of diagnostic decision-making and hierarchical modeling, and on a multi-layer decomposition strategy for the diagnosis task, a multi-agent synchronous diagnosis federation integrating different knowledge representation modes and inference mechanisms is presented. The functions of the management agent, diagnosis agent, and decision agent are analyzed; the organization and evolution of agents in the system are proposed; and the corresponding conflict resolution algorithm is given. A layered structure of abstract agents with public attributes is built, and the system architecture is realized on a MAS distributed layered blackboard. A real-world application shows that the proposed control structure successfully solves the fault diagnosis problem of a complex plant, with particular advantages in distributed domains.
Lessons Learned in the Livingstone 2 on Earth Observing One Flight Experiment
NASA Technical Reports Server (NTRS)
Hayden, Sandra C.; Sweet, Adam J.; Shulman, Seth
2005-01-01
The Livingstone 2 (L2) model-based diagnosis software is a reusable diagnostic tool for monitoring complex systems. In 2004, L2 was integrated with the JPL Autonomous Sciencecraft Experiment (ASE) and deployed on-board Goddard's Earth Observing One (EO-1) remote sensing satellite, to monitor and diagnose the EO-1 space science instruments and imaging sequence. This paper reports on lessons learned from this flight experiment. The goals for this experiment, including validation of minimum success criteria and of a series of diagnostic scenarios, have all been successfully met. Long-term operations in space are on-going, as a test of the maturity of the system, with L2 performance remaining flawless. L2 has demonstrated the ability to track the state of the system during nominal operations, detect simulated abnormalities in operations and isolate failures to their root cause fault. Specific advances demonstrated include diagnosis of ambiguity groups rather than a single fault candidate; hypothesis revision given new sensor evidence about the state of the system; and the capability to check for faults in a dynamic system without having to wait until the system is quiescent. The major benefits of this advanced health management technology are to increase mission duration and reliability through intelligent fault protection, and robust autonomous operations with reduced dependency on supervisory operations from Earth. The work-load for operators will be reduced by telemetry of processed state-of-health information rather than raw data. The long-term vision is that of making diagnosis available to the onboard planner or executive, allowing autonomy software to re-plan in order to work around known component failures.
For a system that is expected to evolve substantially over its lifetime, as for the International Space Station, the model-based approach has definite advantages over rule-based expert systems and limit-checking fault protection systems, as these do not scale well. The model-based approach facilitates reuse of the L2 diagnostic software; only the model of the system to be diagnosed and the telemetry monitoring software have to be rebuilt for a new system or expanded for a growing system. The hierarchical L2 model supports modularity and expandability, and as such is a suitable solution for integrated system health management as envisioned for systems-of-systems.
Advanced Ground Systems Maintenance Enterprise Architecture Project
NASA Technical Reports Server (NTRS)
Harp, Janicce Leshay
2014-01-01
The project implements an architecture for delivery of integrated health management capabilities for the 21st Century launch complex. Capabilities include anomaly detection, fault isolation, prognostics and physics-based diagnostics.
Fault Detection and Correction for the Solar Dynamics Observatory Attitude Control System
NASA Technical Reports Server (NTRS)
Starin, Scott R.; Vess, Melissa F.; Kenney, Thomas M.; Maldonado, Manuel D.; Morgenstern, Wendy M.
2007-01-01
The Solar Dynamics Observatory is an Explorer-class mission that will launch in early 2009. The spacecraft will operate in a geosynchronous orbit, sending data 24 hours a day to a dedicated ground station in White Sands, New Mexico. It will carry a suite of instruments designed to observe the Sun in multiple wavelengths at unprecedented resolution. The Atmospheric Imaging Assembly includes four telescopes with focal plane CCDs that can image the full solar disk in four different visible wavelengths. The Extreme-ultraviolet Variability Experiment will collect time-correlated data on the activity of the Sun's corona. The Helioseismic and Magnetic Imager will enable study of pressure waves moving through the body of the Sun. The attitude control system on Solar Dynamics Observatory is responsible for four main phases of activity. The physical safety of the spacecraft after separation must be guaranteed. Fine attitude determination and control must be sufficient for instrument calibration maneuvers. The mission science mode requires 2-arcsecond control according to error signals provided by guide telescopes on the Atmospheric Imaging Assembly, one of the three instruments to be carried. Lastly, accurate execution of linear and angular momentum changes to the spacecraft must be provided for momentum management and orbit maintenance. In this paper, single-fault tolerant fault detection and correction of the Solar Dynamics Observatory attitude control system is described. The attitude control hardware suite for the mission is catalogued, with special attention to redundancy at the hardware level. Four reaction wheels are used where any three are satisfactory. Four pairs of redundant thrusters are employed for orbit change maneuvers and momentum management. Three two-axis gyroscopes provide full redundancy for rate sensing. A digital Sun sensor and two autonomous star trackers provide two-out-of-three redundancy for fine attitude determination.
The use of software to maximize chances of recovery from any hardware or software fault is detailed. A generic fault detection and correction software structure is used, allowing additions, deletions, and adjustments to fault detection and correction rules. This software structure is fed by in-line fault tests that are also able to take appropriate actions to avoid corruption of the data stream.
NASA Technical Reports Server (NTRS)
Sweet, Adam
2008-01-01
The IVHM Project in the Aviation Safety Program has funded research in electrical power system (EPS) health management. This problem domain contains both discrete and continuous behavior, and thus is directly relevant for the hybrid diagnostic tool HyDE. In FY2007 work was performed to expand the HyDE diagnosis model of the ADAPT system. The work completed resulted in a HyDE model with the capability to diagnose five times the number of ADAPT components previously tested. The expanded diagnosis model passed a corresponding set of new ADAPT fault injection scenario tests with no incorrect faults reported. The time required for the HyDE diagnostic system to isolate the fault varied widely between tests; this variance was reduced by tuning HyDE input parameters. These results and other diagnostic design trade-offs are discussed. Finally, possible future improvements for both the HyDE diagnostic model and HyDE itself are presented.
An overview of New Zealand's trauma system.
Paice, Rhondda
2007-01-01
Patterns of trauma and trauma systems in New Zealand are similar to those in Australia. Both countries have geographical considerations, terrain and distance, that can cause delay to definitive care. There are only 7 hospitals in New Zealand that currently manage major trauma patients, and consequently, trauma patients are often hospitalized some distance from their homes. The prehospital services are provided by one major provider throughout the country, with a high level of volunteers providing these services in the rural areas. New Zealand has a national no-fault accident insurance system, the Accident Compensation Corporation, which funds all trauma-related healthcare from the roadside to rehabilitation. This insurance system provides 24-hour no-fault personal injury insurance coverage. The Accident Compensation Corporation provides bulk funding to hospitals for resources to manage the care of trauma patients. Case managers are assigned for major trauma patients. This national system also has a rehabilitation focus. The actual funds are managed by the hospitals, and this allows hospital staff to provide optimum care for trauma patients. New Zealand works closely with Australia in the development of a national trauma registry, research, and education in trauma care for patients in Australasia (the islands of the southern Pacific Ocean, including Australia, New Zealand, and New Guinea).
An overview of the artificial intelligence and expert systems component of RICIS
NASA Technical Reports Server (NTRS)
Feagin, Terry
1987-01-01
Artificial intelligence and expert systems are an important component of the RICIS (Research Institute for Computing and Information Systems) research program. For space applications, a number of problem areas that should be able to make good use of these tools include: resource allocation and management; control and monitoring; environmental control and life support; power distribution; communications scheduling; orbit and attitude maintenance; redundancy management; intelligent man-machine interfaces; and fault detection, isolation and recovery.
Modeling and Performance Considerations for Automated Fault Isolation in Complex Systems
NASA Technical Reports Server (NTRS)
Ferrell, Bob; Oostdyk, Rebecca
2010-01-01
The purpose of this paper is to document the modeling considerations and performance metrics that were examined in the development of a large-scale Fault Detection, Isolation and Recovery (FDIR) system. The FDIR system is envisioned to perform health management functions for both a launch vehicle and the ground systems that support the vehicle during checkout and launch countdown, using a suite of complementary software tools that alert operators to anomalies and failures in real time. The FDIR team members developed a set of operational requirements for the models that would be used for fault isolation and worked closely with the vendor of the software tools selected for fault isolation to ensure that the software was able to meet the requirements. Once the requirements were established, example models of sufficient complexity were used to test the performance of the software. The results of the performance testing demonstrated the need for enhancements to the software in order to meet the demands of the full-scale ground and vehicle FDIR system. The paper highlights the importance of the development of operational requirements and preliminary performance testing as a strategy for identifying deficiencies in highly scalable systems and rectifying those deficiencies before they imperil the success of the project.
Updating the USGS seismic hazard maps for Alaska
Mueller, Charles; Briggs, Richard; Wesson, Robert L.; Petersen, Mark D.
2015-01-01
The U.S. Geological Survey makes probabilistic seismic hazard maps and engineering design maps for building codes, emergency planning, risk management, and many other applications. The methodology considers all known earthquake sources with their associated magnitude and rate distributions. Specific faults can be modeled if slip-rate or recurrence information is available. Otherwise, areal sources are developed from earthquake catalogs or GPS data. Sources are combined with ground-motion estimates to compute the hazard. The current maps for Alaska were developed in 2007, and included modeled sources for the Alaska-Aleutian megathrust, a few crustal faults, and areal seismicity sources. The megathrust was modeled as a segmented dipping plane with segmentation largely derived from the slip patches of past earthquakes. Some megathrust deformation is aseismic, so recurrence was estimated from seismic history rather than plate rates. Crustal faults included the Fairweather-Queen Charlotte system, the Denali–Totschunda system, the Castle Mountain fault, two faults on Kodiak Island, and the Transition fault, with recurrence estimated from geologic data. Areal seismicity sources were developed for Benioff-zone earthquakes and for crustal earthquakes not associated with modeled faults. We review the current state of knowledge in Alaska from a seismic-hazard perspective, in anticipation of future updates of the maps. Updated source models will consider revised seismicity catalogs, new information on crustal faults, new GPS data, and new thinking on megathrust recurrence, segmentation, and geometry. Revised ground-motion models will provide up-to-date shaking estimates for crustal earthquakes and subduction earthquakes in Alaska.
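The core hazard computation the abstract describes, combining source rates with conditional ground-motion exceedance estimates, can be sketched as a rate summation followed by a Poisson conversion. The source rates and conditional probabilities below are illustrative placeholders, not values from the Alaska model.

```python
import math

# Minimal probabilistic seismic hazard sketch: sum annual exceedance
# rates over sources, then convert to a probability of exceedance in
# T years under a Poisson occurrence assumption.

SOURCES = [   # (annual event rate, P(ground motion > g | event))
    (0.01,  0.50),   # megathrust segment
    (0.005, 0.30),   # crustal fault
    (0.10,  0.02),   # areal seismicity zone
]

def prob_exceedance(sources, years):
    annual_rate = sum(rate * p_exc for rate, p_exc in sources)
    return 1.0 - math.exp(-annual_rate * years)

print(round(prob_exceedance(SOURCES, years=50), 4))  # 0.3462
```

Repeating this for a grid of ground-motion levels g gives the hazard curve at a site; the maps are contours of such curves at fixed exceedance probabilities.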
Modeling and Measurement Constraints in Fault Diagnostics for HVAC Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Najafi, Massieh; Auslander, David M.; Bartlett, Peter L.
2010-05-30
Many studies have shown that energy savings of five to fifteen percent are achievable in commercial buildings by detecting and correcting building faults, and optimizing building control systems. However, in spite of good progress in developing tools for determining HVAC diagnostics, methods to detect faults in HVAC systems are still generally undeveloped. Most approaches use numerical filtering or parameter estimation methods to compare data from energy meters and building sensors to predictions from mathematical or statistical models. They are effective when models are relatively accurate and data contain few errors. In this paper, we address the case where models are imperfect and data are variable, uncertain, and can contain error. We apply a Bayesian updating approach that is systematic in managing and accounting for most forms of model and data errors. The proposed method uses both knowledge of first principle modeling and empirical results to analyze the system performance within the boundaries defined by practical constraints. We demonstrate the approach by detecting faults in commercial building air handling units. We find that the limitations that exist in air handling unit diagnostics due to practical constraints can generally be effectively addressed through the proposed approach.
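The Bayesian-updating idea can be sketched in its simplest discrete form: maintain a belief over fault hypotheses for an air handling unit and update it with each noisy observation. The hypotheses and likelihood table below are illustrative assumptions, not the paper's model.

```python
# Discrete Bayesian updating over fault hypotheses for an air handler.

HYPOTHESES = ["healthy", "stuck damper", "fouled coil"]
LIKELIHOOD = {   # P(observation | hypothesis), illustrative values
    "high supply temp": {"healthy": 0.05, "stuck damper": 0.40, "fouled coil": 0.70},
    "normal airflow":   {"healthy": 0.90, "stuck damper": 0.30, "fouled coil": 0.80},
}

def update(prior, observation):
    post = {h: prior[h] * LIKELIHOOD[observation][h] for h in prior}
    total = sum(post.values())
    return {h: p / total for h, p in post.items()}

belief = {h: 1.0 / len(HYPOTHESES) for h in HYPOTHESES}
for obs in ["high supply temp", "normal airflow"]:
    belief = update(belief, obs)

print(max(belief, key=belief.get))  # fouled coil
```

The appeal for imperfect models and noisy data is that nothing here requires the likelihoods to be exact; evidence accumulates gracefully across observations instead of a single reading triggering a hard threshold.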
Control of large flexible space structures
NASA Technical Reports Server (NTRS)
Vandervelde, W. E.
1986-01-01
Progress in robust design of generalized parity relations, design of failure-sensitive observers using the geometric system theory of Wonham, computational techniques for evaluating the performance of control systems with fault tolerance and redundancy management features, and the design and evaluation of control systems for structures having nonlinear joints is described.
A hierarchical approach to reliability modeling of fault-tolerant systems. M.S. Thesis
NASA Technical Reports Server (NTRS)
Gossman, W. E.
1986-01-01
A methodology for performing fault tolerant system reliability analysis is presented. The method decomposes a system into its subsystems, evaluates event rates derived from each subsystem's conditional state probability vector, and incorporates those results into a hierarchical Markov model of the system. This is done in a manner that addresses the failure sequence dependence associated with the system's redundancy management strategy. The method is derived for application to a specific system definition. Results are presented that compare the hierarchical model's unreliability prediction to that of a more complicated standard Markov model of the system. The results for the example given indicate that the hierarchical method predicts system unreliability to a desirable level of accuracy while achieving significant computational savings relative to a component-level Markov model of the system.
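The hierarchical idea, evaluate each redundant subsystem in isolation and then combine subsystem results at the system level, can be sketched with a simple static reliability model (exponential component failures, 2-of-3 voting subsystems combined in series). This is a combinatorial stand-in for the thesis's Markov formulation; the failure rates are illustrative.

```python
import math

# Hierarchical reliability sketch: per-subsystem 2-of-3 evaluation,
# then a series combination at the system level.

def comp_unrel(rate_per_hr, t_hr):
    """Exponential component unreliability over mission time t."""
    return 1.0 - math.exp(-rate_per_hr * t_hr)

def two_of_three_unrel(q):
    """Subsystem fails if 2 or 3 of its identical components fail."""
    return 3 * q * q * (1 - q) + q ** 3

def system_unrel(subsystem_rates, t_hr):
    r = 1.0
    for rate in subsystem_rates:
        r *= 1.0 - two_of_three_unrel(comp_unrel(rate, t_hr))
    return 1.0 - r

print(f"{system_unrel([1e-4, 5e-5], t_hr=10):.2e}")
```

What the hierarchical Markov model adds beyond this static picture is exactly the failure sequence dependence: in the thesis's method, the order in which components fail (and how redundancy management responds) changes the subsystem event rates, which a combinatorial formula like `two_of_three_unrel` cannot capture.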
Halicioglu, Kerem; Ozener, Haluk
2008-01-01
Both seismological and geodynamic research emphasize that the Aegean Region, which comprises the Hellenic Arc, the Greek mainland and Western Turkey is the most seismically active region in Western Eurasia. The convergence of the Eurasian and African lithospheric plates forces a westward motion on the Anatolian plate relative to the Eurasian one. Western Anatolia is a valuable laboratory for Earth Science research because of its complex geological structure. Izmir is a large city in Turkey with a population of about 2.5 million that is at great risk from big earthquakes. Unfortunately, previous geodynamics studies performed in this region are insufficient or cover large areas instead of specific faults. The Tuzla Fault, which is aligned trending NE–SW between the town of Menderes and Cape Doganbey, is an important fault in terms of seismic activity and its proximity to the city of Izmir. This study aims to perform a large scale investigation focusing on the Tuzla Fault and its vicinity for better understanding of the region's tectonics. In order to investigate the crustal deformation along the Tuzla Fault and Izmir Bay, a geodetic network has been designed and optimizations were performed. This paper suggests a schedule for a crustal deformation monitoring study which includes research on the tectonics of the region, network design and optimization strategies, theory and practice of processing. The study is also open for extension in terms of monitoring different types of fault characteristics. A one-dimensional fault model with two parameters – standard strike-slip model of dislocation theory in an elastic half-space – is formulated in order to determine which sites are suitable for the campaign based geodetic GPS measurements. Geodetic results can be used as a background data for disaster management systems. PMID:27873783
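Under a common reading, the "one-dimensional fault model with two parameters" in an elastic half-space is the classical screw-dislocation model for interseismic strike-slip deformation: surface velocity v(x) = (s/π)·atan(x/D), with slip rate s and locking depth D as the two parameters. The sketch below uses illustrative values, not Tuzla Fault estimates.

```python
import math

# Screw-dislocation (elastic half-space) model of interseismic
# strike-slip deformation: v(x) = (s / pi) * atan(x / D).
# Parameter values are illustrative, not Tuzla Fault results.

def surface_velocity(x_km, slip_rate_mm_yr, locking_depth_km):
    return (slip_rate_mm_yr / math.pi) * math.atan(x_km / locking_depth_km)

s, d = 10.0, 15.0          # slip rate (mm/yr), locking depth (km)
for x in [-100, -15, 0, 15, 100]:
    print(f"x={x:5d} km  v={surface_velocity(x, s, d):6.2f} mm/yr")
```

Evaluating this profile across candidate station locations shows where the velocity gradient is steepest, which is one way to decide which sites are worth occupying in campaign GPS measurements.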
Advanced Ground Systems Maintenance Enterprise Architecture Project
NASA Technical Reports Server (NTRS)
Perotti, Jose M. (Compiler)
2015-01-01
The project implements an architecture for delivery of integrated health management capabilities for the 21st Century launch complex. The delivered capabilities include anomaly detection, fault isolation, prognostics and physics-based diagnostics.
Mission Management Computer and Sequencing Hardware for RLV-TD HEX-01 Mission
NASA Astrophysics Data System (ADS)
Gupta, Sukrat; Raj, Remya; Mathew, Asha Mary; Koshy, Anna Priya; Paramasivam, R.; Mookiah, T.
2017-12-01
The Reusable Launch Vehicle-Technology Demonstrator Hypersonic Experiment (RLV-TD HEX-01) mission posed some unique challenges in the design and development of avionics hardware. This work presents the details of mission-critical avionics hardware, mainly the Mission Management Computer (MMC) and sequencing hardware. The Navigation, Guidance and Control (NGC) chain for RLV-TD is dual redundant with cross-strapped Remote Terminals (RTs) interfaced through the MIL-STD-1553B bus. The MMC is the Bus Controller on the 1553 bus and performs the functions of GPS-aided navigation, guidance, digital autopilot and sequencing for the RLV-TD launch vehicle at different periodicities (10, 20, 500 ms). Digital autopilot execution in the MMC with a periodicity of 10 ms (in the ascent phase) was introduced for the first time and successfully demonstrated in flight. The MMC is built around the Intel i960 processor and has built-in fault tolerance features like ECC for memories. Fault detection and isolation schemes are implemented to isolate a failed MMC. The sequencing hardware comprises the Stage Processing System (SPS) and the Command Execution Module (CEM). The SPS is an RT on the 1553 bus which receives sequencing and control related commands from the MMCs and posts them to downstream modules, after proper error handling, for final execution. The SPS is designed as a high-reliability system by incorporating various fault tolerance and fault detection features. The CEM is a relay-based module for sequence command execution.
Xu, Jun; Wang, Jing; Li, Shiying; Cao, Binggang
2016-01-01
Recently, state of energy (SOE) has become one of the most fundamental parameters for battery management systems in electric vehicles. Current information is critical in SOE estimation, and a current sensor is usually utilized to obtain the latest current information. However, if the current sensor fails, the SOE estimation may suffer large errors. Therefore, this paper attempts to make the following contributions: current sensor fault detection and SOE estimation are realized simultaneously. Using a proportional integral observer (PIO) based method, the current sensor fault can be accurately estimated. By taking advantage of the accurate estimate of the current sensor fault, the influence caused by the fault can be eliminated and compensated. As a result, the SOE estimation results are influenced little by the fault. In addition, a simulation and experimental workbench is established to verify the proposed method. The results indicate that the current sensor fault can be estimated accurately. Simultaneously, the SOE can also be estimated accurately, and the estimation error is influenced little by the fault. The maximum SOE estimation error is less than 2%, even though the large current error caused by the current sensor fault still exists.
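The core idea, an observer whose integral state converges to the additive sensor fault so the measured current can be compensated, can be sketched with a scalar SOE model. The model, gains, and fault size below are our illustrative assumptions, not the paper's design.

```python
# Minimal sketch of a proportional-integral observer (PIO) estimating an
# additive current-sensor fault. The scalar SOE dynamics and the gains
# are illustrative assumptions, not the paper's battery model.
B = 0.001   # SOE drop per unit current per step (assumed)
LP = 0.5    # proportional gain (assumed)
LI = 50.0   # integral gain (assumed)

def run_pio(n_steps, true_current=1.0, sensor_fault=0.5):
    x = 1.0                  # true SOE (fully charged)
    x_hat, f_hat = 1.0, 0.0  # observer SOE and fault estimates
    for _ in range(n_steps):
        u_meas = true_current + sensor_fault  # faulty measurement
        e = x - x_hat                         # innovation (SOE assumed observable)
        x = x - B * true_current              # true SOE dynamics
        # Compensate the measured current with the current fault estimate.
        x_hat = x_hat - B * (u_meas - f_hat) + LP * e
        f_hat = f_hat + LI * e                # integral action tracks the fault
    return x, x_hat, f_hat

x, x_hat, f_hat = run_pio(500)
print(round(f_hat, 3))   # fault estimate converges toward the true 0.5
```

Because the compensated current `u_meas - f_hat` approaches the true current, the SOE estimate stays accurate even while the sensor keeps reporting a biased value.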
Product quality management based on CNC machine fault prognostics and diagnosis
NASA Astrophysics Data System (ADS)
Kozlov, A. M.; Al-jonid, Kh M.; Kozlov, A. A.; Antar, Sh D.
2018-03-01
This paper presents a new fault classification model and an integrated approach to fault diagnosis which involves the combination of ideas of Neuro-fuzzy Networks (NF), Dynamic Bayesian Networks (DBN) and the Particle Filtering (PF) algorithm on a single platform. In the new model, faults are categorized in two aspects, namely first and second degree faults. First degree faults are instantaneous in nature, and second degree faults are evolutional and appear as a developing phenomenon which starts from the initial stage, goes through the development stage and finally ends at the mature stage. These categories of faults have a lifetime which is inversely proportional to a machine tool's life according to the modified version of Taylor's equation. For fault diagnosis, this framework consists of two phases: the first one focuses on fault prognosis, which is done online, and the second one is concerned with fault diagnosis, which depends on both off-line and on-line modules. In the first phase, a neuro-fuzzy predictor is used to take a decision on whether to embark on condition-based maintenance (CBM) or fault diagnosis based on the severity of a fault. The second phase only comes into action when an evolving fault goes beyond a critical threshold limit, called the CBM limit, for a command to be issued for fault diagnosis. During this phase, DBN and PF techniques are used as an intelligent fault diagnosis system to determine the severity, time and location of the fault. The feasibility of this approach was tested in a simulation environment using a CNC machine as a case study, and the results were studied and analyzed.
Galileo spacecraft power management and distribution system
NASA Technical Reports Server (NTRS)
Detwiler, R. C.; Smith, R. L.
1990-01-01
The Galileo PMAD (power management and distribution system) is described, and the design drivers that established the final as-built hardware are discussed. The spacecraft is powered by two general-purpose heat-source-radioisotope thermoelectric generators. Power bus regulation is provided by a shunt regulator. Galileo PMAD distributes a 570-W beginning of mission (BOM) power source to a user complement of some 137 load elements. Extensive use of pyrotechnics requires two pyro switching subassemblies. They initiate 148 squibs which operate the 47 pyro devices on the spacecraft. Detection and correction of faults in the Galileo PMAD is an autonomous feature dictated by requirements for long life and reliability in the absence of ground-based support. Volatile computer memories in the spacecraft command and data system and attitude control system require a continuous source of backup power during all anticipated power bus fault scenarios. Power for the Jupiter Probe is conditioned, isolated, and controlled by a Probe interface subassembly. Flight performance of the spacecraft and the PMAD has been successful to date, with no major anomalies.
Research on key technology of prognostic and health management for autonomous underwater vehicle
NASA Astrophysics Data System (ADS)
Zhou, Zhi
2017-12-01
Autonomous Underwater Vehicles (AUVs) are untethered, autonomously operating underwater robots. With a wide range of activities, they can reach thousands of kilometers. Because they have the advantages of wide range, good maneuverability, safety and intelligence, they have become an important tool for various underwater tasks. How to improve the diagnosis accuracy of AUV electrical system faults, and how to repair AUVs using that information, are a focus of navies around the world. In turn, ensuring safe and reliable operation of the system is of great significance for improving AUV sailing performance. To solve these problems, this paper researches prognostic and health management (PHM) technology and applies it to AUVs, and proposes the overall framework and key technologies, such as data acquisition, feature extraction, fault diagnosis, failure prediction and so on.
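The pipeline stages the abstract lists (acquisition, feature extraction, fault detection) can be made concrete with a toy example; the signal, the RMS feature, and the threshold below are invented for illustration and are not the paper's design.

```python
import math

# Toy sketch of a PHM pipeline: acquire samples -> extract a feature
# (RMS here) -> flag a fault when the feature crosses a set limit.
# The signals and the 1.5 threshold are illustrative assumptions.
def rms(samples):
    """Root-mean-square amplitude of a signal window."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def detect_fault(samples, rms_limit=1.5):
    """Flag a fault when the RMS feature exceeds the limit."""
    return rms(samples) > rms_limit

healthy = [math.sin(0.1 * i) for i in range(200)]       # RMS about 0.7
faulty = [3.0 * math.sin(0.1 * i) for i in range(200)]  # RMS about 2.1
print(detect_fault(healthy), detect_fault(faulty))
```

Real PHM systems layer prediction on top of this, trending the feature over time to forecast when it will cross the limit rather than merely reacting when it does.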
Toward a Model-Based Approach to Flight System Fault Protection
NASA Technical Reports Server (NTRS)
Day, John; Murray, Alex; Meakin, Peter
2012-01-01
Fault Protection (FP) is a distinct and separate systems engineering sub-discipline that is concerned with the off-nominal behavior of a system. Flight system fault protection is an important part of the overall flight system systems engineering effort, with its own products and processes. As with other aspects of systems engineering, the FP domain is highly amenable to expression and management in models. However, while there are standards and guidelines for performing FP-related analyses, there are no standards or guidelines for formally relating the FP analyses to each other or to the system hardware and software design. As a result, the material generated for these analyses effectively forms separate models that are only loosely related to the system being designed. Development of approaches that enable modeling of FP concerns in the same model as the system hardware and software design enables establishment of formal relationships that have great potential for improving the efficiency, correctness, and verification of the implementation of flight system FP. This paper begins with an overview of the FP domain, and then continues with a presentation of a SysML/UML model of the FP domain and the particular analyses that it contains, by way of showing a potential model-based approach to flight system fault protection, and an exposition of the use of the FP models in FSW engineering. The analyses are small examples, inspired by current real-project examples of FP analyses.
2009 fault tolerance for extreme-scale computing workshop, Albuquerque, NM - March 19-20, 2009.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Katz, D. S.; Daly, J.; DeBardeleben, N.
2009-02-01
This is a report on the third in a series of petascale workshops co-sponsored by Blue Waters and TeraGrid to address challenges and opportunities for making effective use of emerging extreme-scale computing. This workshop was held to discuss fault tolerance on large systems for running large, possibly long-running applications. The main point of the workshop was to have systems people, middleware people (including fault-tolerance experts), and applications people talk about the issues and figure out what needs to be done, mostly at the middleware and application levels, to run such applications on the emerging petascale systems without having faults cause large numbers of application failures. The workshop found that there is considerable interest in fault tolerance, resilience, and reliability of high-performance computing (HPC) systems in general, at all levels of HPC. The only way to recover from faults is through the use of some redundancy, either in space or in time. Redundancy in time, in the form of writing checkpoints to disk and restarting at the most recent checkpoint after a fault that causes an application to crash or halt, is the most common tool used in applications today, but there are questions about how long this can continue to be a good solution as systems and memories grow faster than I/O bandwidth to disk. There is interest both in modifications to this, such as checkpoints to memory, partial checkpoints, and message logging, and in alternative ideas, such as in-memory recovery using residues. We believe that systematic exploration of these ideas holds the most promise for the scientific applications community. Fault tolerance has been an issue of discussion in the HPC community for at least the past 10 years, but much like other issues, the community has managed to put off addressing it during this period. There is a growing recognition that as systems continue to grow to petascale and beyond, the field is approaching the point where we don't have any choice but to address this through R&D efforts.
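The "redundancy in time" pattern the workshop discusses can be sketched in a few lines: persist state periodically, and on restart resume from the last checkpoint. The file name and toy workload below are our own choices, not anything from the workshop report.

```python
import json, os, tempfile

# Bare-bones checkpoint/restart sketch: state is written to disk every
# few steps, so a crashed run resumes from the last checkpoint instead
# of recomputing from scratch. File name and workload are illustrative.
CKPT = os.path.join(tempfile.gettempdir(), "demo_app.ckpt")

def run(total_steps, ckpt_every=10):
    # Restart path: resume from the most recent checkpoint if present.
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "acc": 0}
    while state["step"] < total_steps:
        state["acc"] += state["step"]        # stand-in for real work
        state["step"] += 1
        if state["step"] % ckpt_every == 0:  # redundancy in time
            with open(CKPT, "w") as f:
                json.dump(state, f)
    return state["acc"]

if os.path.exists(CKPT):
    os.remove(CKPT)   # start the demo from a clean slate
print(run(100))       # -> 4950, i.e. sum(range(100))
```

The I/O-bandwidth concern in the report is visible even here: the cost of each `json.dump` scales with state size, so as memory grows faster than disk bandwidth, the checkpoint interval must grow too.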
Low-Power Fault Tolerance for Spacecraft FPGA-Based Numerical Computing
2006-09-01
… undesirable, are not necessarily harmful. Our intent is to prevent errors by properly managing faults. This research focuses on developing fault-tolerant …
Intelligent Engine Systems Work Element 1.3: Sub System Health Management
NASA Technical Reports Server (NTRS)
Ashby, Malcolm; Simpson, Jeffrey; Singh, Anant; Ferguson, Emily; Frontera, Mark
2005-01-01
The objectives of this program were to develop health monitoring systems and physics-based fault detection models for engine sub-systems including the start, lubrication, and fuel systems. These models will ultimately be used to provide more effective sub-system fault identification and isolation to reduce engine maintenance costs and engine down-time. Additionally, the bearing sub-system health is addressed in this program through identification of sensing requirements, a review of available technologies, and a demonstration of a conceptual monitoring system for a differential roller bearing. This report is divided into four sections, one for each of the subtasks. The start system subtask is documented in section 2.0, the oil system is covered in section 3.0, the bearing in section 4.0, and the fuel system is presented in section 5.0.
Long term fault system reorganization of convergent and strike-slip systems
NASA Astrophysics Data System (ADS)
Cooke, M. L.; McBeck, J.; Hatem, A. E.; Toeneboehn, K.; Beyer, J. L.
2017-12-01
Laboratory and numerical experiments representing deformation over many earthquake cycles demonstrate that fault evolution includes episodes of fault reorganization that optimize work on the fault system. Consequently, the mechanical and kinematic efficiencies of fault systems do not increase monotonically through their evolution. New fault configurations can optimize the external work required to accommodate deformation, suggesting that changes in system efficiency can drive fault reorganization. Laboratory evidence and numerical results show that fault reorganization within accretion, strike-slip and oblique convergent systems is associated with increasing efficiency due to increased fault slip (frictional work and seismic energy) and commensurate decreased off-fault deformation (internal work and work against gravity). Between episodes of fault reorganization, fault systems may become less efficient as they produce increasing off-fault deformation. For example, laboratory and numerical experiments show that the interference and interaction between different fault segments may increase local internal work, or that increasing convergence can increase the work against gravity produced by a fault system. This accumulation of work triggers fault reorganization, as stored work provides the energy required to grow new faults that reorganize the system to a more efficient configuration. The results of laboratory and numerical experiments reveal that we should expect crustal fault systems to reorganize following periods of increasing inefficiency, even in the absence of changes to the tectonic regime. In other words, fault reorganization doesn't require a change in tectonic loading. The time frame of fault reorganization depends on fault system configuration, strain rate and processes that relax stresses within the crust. For example, stress relaxation may keep pace with stress accumulation, which would limit the increase in the internal work and gravitational work so that irregularities can persist along active fault systems without reorganization of the fault system. Consequently, steady-state behavior, for example with constant fault slip rates, may arise either in systems with a high degree of stress relaxation or only within the intervals between episodes of fault reorganization.
NASA Astrophysics Data System (ADS)
Barba, M.; Rains, C.; von Dassow, W.; Parker, J. W.; Glasscoe, M. T.
2013-12-01
Knowing the location and behavior of active faults is essential for earthquake hazard assessment and disaster response. In Interferometric Synthetic Aperture Radar (InSAR) images, faults are revealed as linear discontinuities. Currently, interferograms are manually inspected to locate faults. During the summer of 2013, the NASA-JPL DEVELOP California Disasters team contributed to the development of a method to expedite fault detection in California using remote-sensing technology. The team utilized InSAR images created from polarimetric L-band data from NASA's Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) project. A computer-vision technique known as 'edge-detection' was used to automate the fault-identification process. We tested and refined an edge-detection algorithm under development through NASA's Earthquake Data Enhanced Cyber-Infrastructure for Disaster Evaluation and Response (E-DECIDER) project. To optimize the algorithm we used both UAVSAR interferograms and synthetic interferograms generated through Disloc, a web-based modeling program available through NASA's QuakeSim project. The edge-detection algorithm detected seismic, aseismic, and co-seismic slip along faults that were identified and compared with databases of known fault systems. Our optimization process was the first step toward integration of the edge-detection code into E-DECIDER to provide decision support for earthquake preparation and disaster management. E-DECIDER partners that will use the edge-detection code include the California Earthquake Clearinghouse and the US Department of Homeland Security through delivery of products using the Unified Incident Command and Decision Support (UICDS) service. Through these partnerships, researchers, earthquake disaster response teams, and policy-makers will be able to use this new methodology to examine the details of ground and fault motions for moderate to large earthquakes. 
Following an earthquake, the newly discovered faults can be paired with infrastructure overlays, allowing emergency response teams to identify sites that may have been exposed to damage. The faults will also be incorporated into a database for future integration into fault models and earthquake simulations, improving future earthquake hazard assessment. As new faults are mapped, they will further understanding of the complex fault systems and earthquake hazards within the seismically dynamic state of California.
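The edge-detection idea described above, faults appearing as linear discontinuities in an interferogram, can be illustrated with a toy gradient threshold on a synthetic 2D phase field. The field, the threshold, and the one-directional gradient below are simplifications we chose for illustration; E-DECIDER's actual algorithm is more involved.

```python
# Toy sketch of fault detection as edge detection: mark cells where the
# horizontal gradient of a (synthetic) interferogram phase field jumps.
# The field and threshold are illustrative, not UAVSAR data.
def edges(field, threshold=0.5):
    """Mark cells whose horizontal gradient magnitude exceeds threshold."""
    rows, cols = len(field), len(field[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(1, cols):
            if abs(field[r][c] - field[r][c - 1]) > threshold:
                out[r][c] = True
    return out

# Synthetic interferogram: smooth ramp plus a sharp step (the "fault")
# at column 5, mimicking a linear discontinuity in the phase field.
field = [[0.1 * c + (2.0 if c >= 5 else 0.0) for c in range(10)]
         for _ in range(4)]
print([c for c in range(10) if edges(field)[0][c]])   # -> [5]
```

Because the detected edge is a column of flagged cells across all rows, it traces a linear feature, exactly the signature a fault leaves in an interferogram.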
Fault tolerant multi-sensor fusion based on the information gain
NASA Astrophysics Data System (ADS)
Hage, Joelle Al; El Najjar, Maan E.; Pomorski, Denis
2017-01-01
In the last decade, multi-robot systems have been used in several applications such as the military, intervention in areas presenting danger to human life, the management of natural disasters, environmental monitoring, exploration and agriculture. The integrity of the robots' localization must be ensured in order to achieve their mission in the best conditions. Robots are equipped with proprioceptive (encoders, gyroscope) and exteroceptive sensors (Kinect). However, these sensors can be affected by various fault types that can be assimilated to erroneous measurements, biases, outliers, drifts, etc. In the absence of a sensor fault diagnosis step, the integrity and the continuity of the localization are affected. In this work, we present a multi-sensor fusion approach with Fault Detection and Exclusion (FDE) based on information theory. In this context, we are interested in the information gain given by an observation, which may be relevant when dealing with the fault tolerance aspect. Moreover, threshold optimization based on the quantity of information given by a decision on the true hypothesis is highlighted.
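One standard way to quantify the "information gain given by an observation" is the entropy reduction from a Bayes update: an observation that sharpens the posterior carries information, while one that leaves it flat carries none. The two-hypothesis setup and likelihoods below are our own illustration, not the paper's sensor models.

```python
import math

# Sketch: information gain of an observation as prior entropy minus
# posterior entropy after a Bayes update. Hypotheses and likelihoods
# are illustrative assumptions, not the paper's models.
def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def posterior(prior, likelihood):
    """Bayes update of a discrete prior given per-hypothesis likelihoods."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    z = sum(joint)
    return [j / z for j in joint]

prior = [0.5, 0.5]          # {sensor healthy, sensor faulty}
informative = [0.9, 0.1]    # observation strongly favoring "healthy"
uninformative = [0.5, 0.5]  # observation saying nothing either way

for lik in (informative, uninformative):
    post = posterior(prior, lik)
    print(round(entropy(prior) - entropy(post), 3))
```

An FDE scheme in this spirit can exclude a sensor whose observations consistently contribute negligible (or misleading) information gain relative to the other sensors.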
Fault Detection, Isolation and Recovery (FDIR) Portable Liquid Oxygen Hardware Demonstrator
NASA Technical Reports Server (NTRS)
Oostdyk, Rebecca L.; Perotti, Jose M.
2011-01-01
The Fault Detection, Isolation and Recovery (FDIR) hardware demonstration will highlight the effort being conducted by Constellation's Ground Operations (GO) to provide the Launch Control System (LCS) with system-level health management during vehicle processing and countdown activities. A proof-of-concept demonstration of the FDIR prototype established the capability of the software to provide real-time fault detection and isolation using generated Liquid Hydrogen data. The FDIR portable testbed unit (presented here) aims to enhance FDIR by providing a dynamic simulation of Constellation subsystems that feeds the FDIR software live data based on Liquid Oxygen system properties. The LO2 cryogenic ground system has key properties that are analogous to the properties of an electronic circuit. The LO2 system is modeled using electrical components, and an equivalent circuit is designed on a printed circuit board to simulate the live data. The portable testbed is also equipped with data acquisition and communication hardware to relay the measurements to the FDIR application running on a PC. This portable testbed is an ideal capability for performing FDIR software testing, troubleshooting, and training, among other activities.
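The fluid/electrical analogy mentioned above maps pressure to voltage, flow to current, a tank to a capacitor, and a pipe or valve to a resistor, so a pressurizing tank behaves like an RC charging circuit. The parameter values in this sketch are illustrative, not the actual LO2 system's.

```python
# Sketch of the fluid/electrical analogy: a tank pressurizing through a
# pipe behaves like a capacitor charging through a resistor. All
# values are illustrative assumptions, not LO2 system parameters.
def simulate_tank(p_supply=100.0, r_pipe=5.0, c_tank=2.0,
                  dt=0.01, steps=5000):
    """Euler integration of dP/dt = (P_supply - P) / (R * C)."""
    p = 0.0
    for _ in range(steps):
        flow = (p_supply - p) / r_pipe   # "current" through the pipe
        p += flow * dt / c_tank          # "capacitor" integrates the flow
    return p

print(round(simulate_tank(), 1))   # approaches the 100.0 supply pressure
```

An equivalent circuit like this, realized on a printed circuit board, lets the testbed feed the FDIR software realistic transients without plumbing any actual cryogens.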
NASA Technical Reports Server (NTRS)
Harper, Richard E.; Babikyan, Carol A.; Butler, Bryan P.; Clasen, Robert J.; Harris, Chris H.; Lala, Jaynarayan H.; Masotto, Thomas K.; Nagle, Gail A.; Prizant, Mark J.; Treadwell, Steven
1994-01-01
The Army Avionics Research and Development Activity (AVRADA) is pursuing programs that would enable effective and efficient management of the large amounts of situational data that occur during tactical rotorcraft missions. The Computer Aided Low Altitude Night Helicopter Flight Program has identified automated Terrain Following/Terrain Avoidance, Nap of the Earth (TF/TA, NOE) operation as a key enabling technology for advanced tactical rotorcraft to enhance mission survivability and mission effectiveness. The processing of critical information at low altitudes with short reaction times is life-critical and mission-critical, necessitating an ultra-reliable, high-throughput computing platform for dependable service for flight control, fusion of sensor data, route planning, near-field/far-field navigation, and obstacle avoidance operations. To address these needs, the Army Fault Tolerant Architecture (AFTA) is being designed and developed. This computer system is based upon the Fault Tolerant Parallel Processor (FTPP) developed by Charles Stark Draper Laboratory (CSDL). AFTA is a hard real-time, Byzantine fault-tolerant parallel processor programmed in the Ada language. This document describes the results of the Detailed Design (Phases 2 and 3 of a 3-year project) of the AFTA development. It contains detailed descriptions of the program objectives, the TF/TA NOE application requirements, architecture, hardware design, operating systems design, and systems performance measurements and analytical models.
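The redundancy idea behind a Byzantine fault-tolerant processor can be shown in miniature: replicated channels compute the same value and vote, so one arbitrarily faulty channel is outvoted (the classic result requires 3f+1 channels to tolerate f Byzantine faults). This sketch is only the voting step, not AFTA's actual exchange protocol.

```python
from collections import Counter

# Tiny sketch of majority voting among replicated channels: with four
# channels, one Byzantine (arbitrarily wrong) value is outvoted.
def vote(values):
    """Return the majority value, or raise if no strict majority exists."""
    winner, count = Counter(values).most_common(1)[0]
    if count <= len(values) // 2:
        raise RuntimeError("no majority: too many faulty channels")
    return winner

print(vote([42, 42, 7, 42]))   # -> 42; the faulty channel is outvoted
```

Full Byzantine agreement also requires interactive exchange rounds so that non-faulty channels vote on the same input sets; simple output voting is only the last stage of that machinery.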
Model-based diagnostics for Space Station Freedom
NASA Technical Reports Server (NTRS)
Fesq, Lorraine M.; Stephan, Amy; Martin, Eric R.; Lerutte, Marcel G.
1991-01-01
An innovative approach to fault management was recently demonstrated for the NASA LeRC Space Station Freedom (SSF) power system testbed. This project capitalized on research in model-based reasoning, which uses knowledge of a system's behavior to monitor its health. The fault management system (FMS) can isolate failures online, or in a post-analysis mode, and requires no knowledge of failure symptoms to perform its diagnostics. An in-house tool called MARPLE was used to develop and run the FMS. MARPLE's capabilities are similar to those available from commercial expert system shells, although MARPLE is designed to build model-based as opposed to rule-based systems. These capabilities include functions for capturing behavioral knowledge, a reasoning engine that implements a model-based technique known as constraint suspension, and a tool for quickly generating new user interfaces. The prototype produced by applying MARPLE to SSF not only demonstrated that model-based reasoning is a valuable diagnostic approach, but it also suggested several new applications of MARPLE, including an integration and testing aid, and a complement to state estimation.
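Constraint suspension, the technique named above, diagnoses by relaxing one component's behavioral constraint at a time: a component is a suspect if suspending it makes the remaining constraints consistent with the observations. The toy multiplier-adder circuit and values below are our own illustration, not the SSF power system model.

```python
# Minimal sketch of constraint suspension. Each component contributes a
# behavioral constraint over observed "wires"; a component is a suspect
# if suspending its constraint leaves all others satisfied.
# The toy circuit (two multipliers feeding an adder) is illustrative.
CONSTRAINTS = {
    "M1": lambda w: w["m1"] == w["a"] * w["b"],
    "M2": lambda w: w["m2"] == w["c"] * w["d"],
    "A":  lambda w: w["out"] == w["m1"] + w["m2"],
}

def suspects(wires):
    out = []
    for suspended in CONSTRAINTS:
        ok = all(check(wires) for name, check in CONSTRAINTS.items()
                 if name != suspended)
        if ok:
            out.append(suspended)
    return out

# The adder's output disagrees with its (observed) inputs, so only
# suspending "A" restores consistency: the adder is isolated.
wires = {"a": 3, "b": 2, "c": 2, "d": 3, "m1": 6, "m2": 6, "out": 10}
print(suspects(wires))   # -> ['A']
```

Note what makes this "symptom-free": no fault signatures are enumerated in advance; only the model of correct behavior is needed, which matches the abstract's claim that the FMS requires no knowledge of failure symptoms.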
Predeployment validation of fault-tolerant systems through software-implemented fault insertion
NASA Technical Reports Server (NTRS)
Czeck, Edward W.; Siewiorek, Daniel P.; Segall, Zary Z.
1989-01-01
The Fault Injection-based Automated Testing (FIAT) environment, which can be used to experimentally characterize and evaluate distributed real-time systems under fault-free and faulted conditions, is described. A survey is presented of validation methodologies. The need for fault insertion based on validation methodologies is demonstrated. The origins and models of faults, and the motivation for the FIAT concept, are reviewed. FIAT employs a validation methodology which builds confidence in the system by first providing a baseline of fault-free performance data and then characterizing the behavior of the system with faults present. Fault insertion is accomplished through software and allows faults, or the manifestations of faults, to be inserted either by seeding faults into memory or by triggering error detection mechanisms. FIAT is capable of emulating a variety of fault-tolerant strategies and architectures, can monitor system activity, and can automatically orchestrate experiments involving insertion of faults. There is a common system interface which allows ease of use and decreases experiment development and run time. Fault models chosen for experiments on FIAT have generated system responses which parallel those observed in real systems under faulty conditions. These capabilities are shown by two example experiments, each using a different fault-tolerance strategy.
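The "seeding faults into memory" style of software-implemented fault insertion can be sketched as a bit-flip injected into stored data, followed by a check that an error-detection mechanism notices. The XOR checksum below is our illustrative detector, not FIAT's actual mechanism.

```python
# Sketch of software-implemented fault insertion: flip one bit of one
# byte (emulating a memory fault), then verify that a simple parity
# checksum detects the corruption. The checksum choice is illustrative.
def checksum(data):
    """XOR of all bytes: any single bit-flip changes the result."""
    x = 0
    for b in data:
        x ^= b
    return x

def inject_bit_flip(data, index, bit):
    """Emulate a memory fault by flipping one bit of one byte."""
    faulty = bytearray(data)
    faulty[index] ^= 1 << bit
    return bytes(faulty)

memory = bytes(range(16))
good = checksum(memory)
faulted = inject_bit_flip(memory, index=3, bit=5)
print(checksum(faulted) != good)   # -> True: the inserted fault is detected
```

The same injection harness, run many times over random locations and bits, yields the fault-free baseline versus faulted-behavior comparison that the FIAT methodology is built around.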
Huang, Weiqing; Fan, Hongbo; Qiu, Yongfu; Cheng, Zhiyu; Xu, Pingru; Qian, Yu
2016-05-01
Recently, China has frequently experienced large-scale, severe and persistent haze pollution due to surging urbanization and industrialization and a rapid growth in the number of motor vehicles and energy consumption. Vehicle emissions due to the consumption of large amounts of fossil fuels are no doubt a critical factor in the haze pollution. This work focuses on the causation mechanism of haze pollution related to vehicle emissions for Guangzhou city by employing the Fault Tree Analysis (FTA) method for the first time. With the establishment of the fault tree system of "Haze weather-Vehicle exhausts explosive emission", all of the important risk factors are discussed and identified by using this deductive FTA method. The qualitative and quantitative assessments of the fault tree system are carried out based on the structure, probability and critical importance degree analysis of the risk factors. The study may provide a new, simple and effective tool/strategy for the causation mechanism analysis and risk management of haze pollution in China.
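The quantitative step of FTA combines basic-event probabilities through the tree's AND/OR gates (assuming independent events) to get the top-event probability. The event names and numbers below are invented for illustration, not the paper's Guangzhou values.

```python
# Sketch of quantitative fault tree analysis: top-event probability
# from basic events through OR/AND gates, assuming independence.
# Event names and probabilities are illustrative, not the paper's data.
def p_or(*ps):
    """P(at least one input event occurs) = 1 - prod(1 - p_i)."""
    out = 1.0
    for p in ps:
        out *= (1.0 - p)
    return 1.0 - out

def p_and(*ps):
    """P(all input events occur) = prod(p_i)."""
    out = 1.0
    for p in ps:
        out *= p
    return out

exhaust_surge = p_or(0.3, 0.2)            # vehicle growth OR poor fuel quality
stagnant_air = 0.4                         # meteorological basic event
haze = p_and(exhaust_surge, stagnant_air)  # top event needs both branches
print(round(haze, 3))
```

Importance analysis, also mentioned in the abstract, then asks how much the top-event probability drops when each basic event is removed, ranking which risk factors most deserve management attention.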
Pressure Monitoring to Detect Fault Rupture Due to CO2 Injection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keating, Elizabeth; Dempsey, David; Pawar, Rajesh
2017-08-18
The capacity for fault systems to be reactivated by fluid injection is well known. In the context of CO2 sequestration, however, the consequence of reactivated faults with respect to leakage and monitoring is poorly understood. Using multi-phase fluid flow simulations, this study addresses key questions concerning the likelihood of ruptures, the timing of consequent upward leakage of CO2, and the effectiveness of pressure monitoring in the reservoir and overlying zones for rupture detection. A range of injection scenarios was simulated using random sampling of uncertain parameters. These include the assumed distance between the injector and the vulnerable fault zone, the critical overpressure required for the fault to rupture, reservoir permeability, and the CO2 injection rate. We assumed a conservative scenario, in which, if at any time during the five-year simulations the critical fault overpressure is exceeded, the fault permeability is assumed to instantaneously increase. For the purposes of conservatism we assume that CO2 injection continues 'blindly' after fault rupture. We show that, despite this assumption, in most cases the CO2 plume does not reach the base of the ruptured fault after 5 years. One possible implication of this result is that leak mitigation strategies such as pressure management have a reasonable chance of preventing a CO2 leak.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lewis, M.; Grimshaw, A.
1996-12-31
The Legion project at the University of Virginia is an architecture for designing and building system services that provide the illusion of a single virtual machine to users: a virtual machine that provides secure shared object and shared name spaces, application-adjustable fault tolerance, improved response time, and greater throughput. Legion targets wide-area assemblies of workstations, supercomputers, and parallel supercomputers, and tackles problems not solved by existing workstation-based parallel processing tools; the system will enable fault tolerance, wide-area parallel processing, interoperability, heterogeneity, a single global name space, protection, security, efficient scheduling, and comprehensive resource management. This paper describes the core Legion object model, which specifies the composition and functionality of Legion's core objects, those objects that cooperate to create, locate, manage, and remove objects in the Legion system. The object model facilitates a flexible, extensible implementation, provides a single global name space, grants site autonomy to participating organizations, and scales to millions of sites and trillions of objects.
The engine fuel system fault analysis
NASA Astrophysics Data System (ADS)
Zhang, Yong; Song, Hanqiang; Yang, Changsheng; Zhao, Wei
2017-05-01
To improve the reliability of the engine fuel system, the typical fault factors of the engine fuel system were analyzed from the points of view of structure and function. The fault characteristics were obtained by building a fuel system fault tree. By applying the failure mode and effects analysis (FMEA) method, several attributes of the key component, the fuel regulator, were obtained, including its fault modes, fault causes, and fault influences. All of this lays the foundation for the subsequent development of a fault diagnosis system.
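A fault tree of the kind mentioned above can be evaluated mechanically once it is encoded as nested AND/OR gates over basic events. The tree below is a hypothetical miniature for illustration, not the paper's actual fuel system tree:

```python
# Minimal fault-tree evaluation sketch. Event names and tree structure
# are invented; a real tree would come from the FMEA/fault-tree analysis.
def evaluate(node, events):
    """Recursively evaluate an ("event", name) leaf or an
    ("AND"/"OR", label, children) gate against basic-event states."""
    kind = node[0]
    if kind == "event":
        return events[node[1]]
    children = [evaluate(child, events) for child in node[2]]
    return all(children) if kind == "AND" else any(children)

# Top event: loss of fuel regulation (hypothetical).
tree = ("OR", "loss_of_fuel_regulation", [
    ("event", "regulator_spring_fatigue"),
    ("AND", "loss_of_redundant_sensing", [
        ("event", "pressure_sensor_a_fail"),
        ("event", "pressure_sensor_b_fail"),
    ]),
])

states = {"regulator_spring_fatigue": False,
          "pressure_sensor_a_fail": True,
          "pressure_sensor_b_fail": True}
print(evaluate(tree, states))  # both sensors failed, so the top event fires
```

The same structure supports cut-set enumeration or probability propagation once basic-event probabilities are attached.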
Space power system automation approaches at the George C. Marshall Space Flight Center
NASA Technical Reports Server (NTRS)
Weeks, D. J.
1987-01-01
This paper discusses the automation approaches employed in various electrical power system breadboards at the Marshall Space Flight Center. Of particular interest is the application of knowledge-based systems to fault management and dynamic payload scheduling. A description of each major breadboard and the automation approach taken for each is given.
Autonomous power expert system
NASA Technical Reports Server (NTRS)
Ringer, Mark J.; Quinn, Todd M.
1990-01-01
The goal of the Autonomous Power System (APS) program is to develop and apply intelligent problem solving and control technologies to the Space Station Freedom Electrical Power Systems (SSF/EPS). The objectives of the program are to establish artificial intelligence/expert system technology paths, to create knowledge-based tools with advanced human-operator interfaces, and to integrate and interface knowledge-based and conventional control schemes. This program is being developed at NASA Lewis. The APS Brassboard represents a subset of a 20 kHz Space Station Power Management And Distribution (PMAD) testbed. A distributed control scheme is used to manage multiple levels of computers and switchgear. The brassboard comprises a set of intelligent switchgear used to effectively switch power from the sources to the loads. The Autonomous Power Expert System (APEX) portion of the APS program integrates a knowledge-based fault diagnostic system, a power resource scheduler, and an interface to the APS Brassboard. The system includes knowledge bases for system diagnostics, fault detection and isolation, and recommended actions. The scheduler autonomously assigns start times to the attached loads based on temporal and power constraints, and is able to work in a near-real-time environment for both scheduling and dynamic replanning.
Novel Directional Protection Scheme for the FREEDM Smart Grid System
NASA Astrophysics Data System (ADS)
Sharma, Nitish
This research primarily deals with the design and validation of the protection system for a large-scale meshed distribution system. The large scale system simulation (LSSS) is a system-level PSCAD model used to validate component models for different time-scale platforms and to provide a virtual testing platform for the Future Renewable Electric Energy Delivery and Management (FREEDM) system. It is also used to validate cases of power system protection, renewable energy integration and storage, and load profiles. Protecting the FREEDM system against any abnormal condition is one of the important tasks. The addition of distributed generation and the power-electronic-based solid state transformer adds to the complexity of the protection. The FREEDM loop system has a fault current limiter, and in addition the Solid State Transformer (SST) limits the fault current to 2.0 per unit. Former students at ASU developed a protection scheme using fiber-optic cable; however, during the NSF-FREEDM site visit, the National Science Foundation (NSF) team deemed that system incompatible with long distances. Hence, a new protection scheme based on wireless communication is presented in this thesis. The use of wireless communication is extended to protect the large-scale meshed distributed generation from any fault. The trip signal generated by the pilot protection system is used to trigger the FID (fault isolation device), an electronic circuit breaker, to open. The trip signal must also be received and accepted by the SST, which must block its operation immediately. A comprehensive protection system for the large-scale meshed distribution system has been developed in PSCAD with the ability to quickly detect faults. The protection system is validated with a hardware model built using commercial relays at the ASU power laboratory.
NASA Astrophysics Data System (ADS)
Piatyszek, E.; Voignier, P.; Graillot, D.
2000-05-01
One of the aims of sewer networks is the protection of the population against floods and the reduction of pollution discharged to the receiving water during rainy events. To meet these goals, managers have to equip sewer networks with real-time control systems. Unfortunately, a component fault (leading to intolerable behaviour of the system) or a sensor fault (deteriorating the view of the process and disturbing local automatic control) makes sewer network supervision delicate. In order to ensure adequate flow management during rainy events, it is essential to set up procedures capable of detecting and diagnosing these anomalies. This article introduces a real-time fault detection method, applicable to sewer networks, for the follow-up of rainy events. The method consists in comparing the sensor response with a forecast of that response, provided by a model and more precisely by a state estimator: a Kalman filter. The Kalman filter provides not only a flow estimate but also an entity called the 'innovation'. In order to detect abnormal operation within the network, this innovation is analysed with Wald's binary sequential probability ratio test. Moreover, by crossing available information from several nodes of the network, a diagnosis of the detected anomalies is carried out. The method provided encouraging results during the analysis of several rain events on the sewer network of Seine-Saint-Denis County, France.
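The detection step described above, Wald's sequential probability ratio test applied to Kalman-filter innovations, can be sketched as follows. The innovation sequences, noise level, and fault bias are illustrative, not values from the Seine-Saint-Denis network:

```python
import math

# SPRT on innovations. Under H0 (nominal) the innovation is N(0, sigma^2);
# under H1 (fault) it is N(bias, sigma^2). Thresholds follow Wald's
# approximations for false-alarm rate alpha and miss rate beta.
def sprt(innovations, sigma, bias, alpha=0.01, beta=0.01):
    upper = math.log((1 - beta) / alpha)   # cross upward -> decide fault
    lower = math.log(beta / (1 - alpha))   # cross downward -> decide nominal
    llr = 0.0
    for k, nu in enumerate(innovations):
        # Gaussian log-likelihood-ratio increment with known sigma.
        llr += (bias * nu - 0.5 * bias ** 2) / sigma ** 2
        if llr >= upper:
            return "fault", k
        if llr <= lower:
            return "nominal", k
    return "undecided", len(innovations) - 1

nominal_innov = [0.1, -0.2, 0.05, 0.0, -0.1]   # small, zero-mean
faulty_innov = [2.1, 1.9, 2.2, 2.0, 1.8]       # biased by a sensor fault
print(sprt(nominal_innov, sigma=1.0, bias=2.0))  # ('nominal', 2)
print(sprt(faulty_innov, sigma=1.0, bias=2.0))   # ('fault', 2)
```

In the real method the innovations come from a Kalman filter driven by the flow model, and detections at several nodes are crossed for diagnosis.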
Candidate Mission from Planet Earth control and data delivery system architecture
NASA Technical Reports Server (NTRS)
Shapiro, Phillip; Weinstein, Frank C.; Hei, Donald J., Jr.; Todd, Jacqueline
1992-01-01
Using a structured, experienced-based approach, Goddard Space Flight Center (GSFC) has assessed the generic functional requirements for a lunar mission control and data delivery (CDD) system. This analysis was based on lunar mission requirements outlined in GSFC-developed user traffic models. The CDD system will facilitate data transportation among user elements, element operations, and user teams by providing functions such as data management, fault isolation, fault correction, and link acquisition. The CDD system for the lunar missions must not only satisfy lunar requirements but also facilitate and provide early development of data system technologies for Mars. Reuse and evolution of existing data systems can help to maximize system reliability and minimize cost. This paper presents a set of existing and currently planned NASA data systems that provide the basic functionality. Reuse of such systems can have an impact on mission design and significantly reduce CDD and other system development costs.
Using EMIS to Identify Top Opportunities for Commercial Building Efficiency
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Guanjing; Singla, Rupam; Granderson, Jessica
Energy Management and Information Systems (EMIS) comprise a broad family of tools and services to manage commercial building energy use. These technologies offer a mix of capabilities to store, display, and analyze energy use and system data, and in some cases provide control. EMIS technologies enable 10–20 percent site energy savings in best-practice implementations. Energy Information Systems (EIS) and Fault Detection and Diagnosis (FDD) systems are two key technologies in the EMIS family. Energy Information Systems are broadly defined as the web-based software, data acquisition hardware, and communication systems used to analyze and display building energy performance. At a minimum, an EIS provides daily, hourly, or sub-hourly interval meter data at the whole-building level, with graphical and analytical capability. Fault Detection and Diagnosis systems automatically identify heating, ventilation, and air-conditioning (HVAC) system- or equipment-level performance issues, and in some cases are able to isolate the root causes of the problem. They use computer algorithms to continuously analyze system-level operational data to detect faults and diagnose their causes. Many FDD tools integrate trend log data from a Building Automation System (BAS) but otherwise are stand-alone software packages; other FDD tools are implemented as "on-board" equipment-embedded diagnostics. (This document focuses on the former.) Analysis approaches adopted in FDD technologies span a variety of techniques, from rule-based methods to process-history-based approaches. FDD tools automate investigations that would otherwise require manual data inspection by someone with expert knowledge, thereby expanding accessibility and breadth of analysis opportunity while reducing complexity.
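The rule-based end of the FDD spectrum mentioned above can be illustrated with a single diagnostic rule applied to BAS trend data. The point names, threshold, and rule below are hypothetical:

```python
# One classic rule-based FDD check: heating and cooling valves should not
# be open simultaneously. Point names and the tolerance are illustrative.
def simultaneous_heating_cooling(sample, tol_pct=5.0):
    """Flag a fault if both valves are open beyond a tolerance (percent)."""
    return (sample["heating_valve_pct"] > tol_pct
            and sample["cooling_valve_pct"] > tol_pct)

trend = [  # 15-minute BAS trend samples (made-up data)
    {"time": "08:00", "heating_valve_pct": 0.0,  "cooling_valve_pct": 40.0},
    {"time": "08:15", "heating_valve_pct": 30.0, "cooling_valve_pct": 25.0},
    {"time": "08:30", "heating_valve_pct": 80.0, "cooling_valve_pct": 0.0},
]
faults = [s["time"] for s in trend if simultaneous_heating_cooling(s)]
print(faults)  # ['08:15']
```

A production FDD tool runs dozens of such rules (or model-based residual checks) continuously over the trend log and aggregates hits for diagnosis.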
Automation of the space station core module power management and distribution system
NASA Technical Reports Server (NTRS)
Weeks, David J.
1988-01-01
Under the Advanced Development Program for Space Station, Marshall Space Flight Center has been developing advanced automation applications for the Power Management and Distribution (PMAD) system inside the Space Station modules for the past three years. The Space Station Module Power Management and Distribution System (SSM/PMAD) test bed features three artificial intelligence (AI) systems coupled with conventional automation software functioning in an autonomous or closed-loop fashion. The AI systems in the test bed include a baseline scheduler/dynamic rescheduler (LES), a load shedding management system (LPLMS), and a fault recovery and management expert system (FRAMES). This test bed will be part of the NASA Systems Autonomy Demonstration for 1990 featuring cooperating expert systems in various Space Station subsystem test beds. It is concluded that advanced automation technology involving AI approaches is sufficiently mature to begin applying the technology to current and planned spacecraft applications including the Space Station.
A Cryogenic Fluid System Simulation in Support of Integrated Systems Health Management
NASA Technical Reports Server (NTRS)
Barber, John P.; Johnston, Kyle B.; Daigle, Matthew
2013-01-01
Simulations serve as important tools throughout the design and operation of engineering systems. In the context of systems health management, simulations serve many uses. For one, the underlying physical models can be used by model-based health management tools to develop diagnostic and prognostic models. These simulations should incorporate both nominal and faulty behavior, with the ability to inject various faults into the system. Such simulations can therefore be used for operator training, for both nominal and faulty situations, as well as for developing and prototyping health management algorithms. In this paper, we describe a methodology for building such simulations. We discuss the design decisions and tools used to build a simulation of a cryogenic fluid test bed, and how it serves as a core technology for systems health management development and maturation.
A distributed fault-detection and diagnosis system using on-line parameter estimation
NASA Technical Reports Server (NTRS)
Guo, T.-H.; Merrill, W.; Duyar, A.
1991-01-01
The development of a model-based fault detection and diagnosis (FDD) system is reviewed. The system can be used as an integral part of an intelligent control system. It determines the faults of a system by comparing measurements of the system with a priori information represented by a model of the system. The method of modeling a complex system is described, and a description of diagnosis models that include process faults is presented. Three distinct classes of fault modes are covered by the system performance model equation: actuator faults, sensor faults, and performance degradation. A system equation for a complete model that describes all three classes of faults is given. The strategy for detecting a fault and estimating the fault parameters using a distributed on-line parameter identification scheme is presented as a two-step approach. The first step is composed of a group of hypothesis testing modules (HTMs) operating in parallel, one to test each class of faults. The second step is the fault diagnosis module, which checks all the information obtained from the HTM level, isolates the fault, and determines its magnitude. The proposed FDD system was demonstrated by applying it to detect actuator and sensor faults added to a simulation of the Space Shuttle Main Engine. The simulation results show that the proposed FDD system can adequately detect the faults and estimate their magnitudes.
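The two-step structure described above, a bank of hypothesis testing modules followed by a diagnosis module, can be sketched as follows. The residual "signatures," scores, and threshold are invented placeholders for the paper's parameter-estimation models:

```python
# Each HTM scores its own fault hypothesis; the diagnosis module then
# isolates the fault with the strongest supported hypothesis.
def htm_score(residuals, signature):
    """Toy HTM: project the residual sequence onto the fault signature."""
    return sum(r * s for r, s in zip(residuals, signature))

SIGNATURES = {  # hypothetical residual directions for the three classes
    "actuator_fault": [1.0, 0.0, 0.0],
    "sensor_fault":   [0.0, 1.0, 0.0],
    "degradation":    [0.0, 0.0, 1.0],
}

def diagnose(residuals, threshold=1.0):
    """Step 1: run all HTMs in parallel. Step 2: isolate the fault."""
    scores = {name: htm_score(residuals, sig)
              for name, sig in SIGNATURES.items()}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return "nominal", scores
    return best, scores

fault, scores = diagnose([0.1, 2.7, 0.2])
print(fault)  # the sensor channel dominates the residual
```

In the actual system each HTM is a statistical test driven by on-line parameter estimates, and the magnitude of the fault is also estimated in step two.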
Layered clustering multi-fault diagnosis for hydraulic piston pump
NASA Astrophysics Data System (ADS)
Du, Jun; Wang, Shaoping; Zhang, Haiyan
2013-04-01
Efficient diagnosis is very important for improving the reliability and performance of an aircraft hydraulic piston pump, and it is one of the key technologies in a prognostic and health management system. In practice, due to the harsh working environment and heavy working loads, multiple faults of an aircraft hydraulic pump may occur simultaneously after long periods of operation. However, most existing diagnosis methods can only distinguish pump faults that occur individually. Therefore, a new method needs to be developed to realize effective diagnosis of simultaneous multiple faults in an aircraft hydraulic pump. In this paper, a new method based on a layered clustering algorithm is proposed to diagnose multiple faults of an aircraft hydraulic pump that occur simultaneously. Intensive failure mechanism analyses of the five main types of faults are carried out, and based on these analyses the optimal combination and layout of diagnostic sensors is attained. A three-layered diagnosis reasoning engine is designed according to the faults' risk priority numbers and the characteristics of different fault feature extraction methods. The most serious failures are first distinguished with individual signal processing. For the subtler, intermittent faults, namely swash plate eccentricity and incremental clearance increase between piston and slipper, a clustering diagnosis algorithm based on the statistical average relative power difference (ARPD) is proposed. By effectively enhancing the fault features of these two faults, the ARPDs calculated from vibration signals are employed to complete the hypothesis testing. The ARPDs of the different faults follow different probability distributions. Compared with the classical fast-Fourier-transform-based spectrum diagnosis method, the experimental results demonstrate that the proposed algorithm can diagnose multiple faults, occurring synchronously, with higher precision and reliability.
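The ARPD statistic described above can be sketched as band power relative to a healthy baseline, averaged over frequency bands. The signals, band edges, and amplitudes below are synthetic illustrations, not pump vibration data:

```python
import cmath
import math

def band_power(signal, lo, hi):
    """Power in DFT bins [lo, hi) via a direct (O(n^2)) transform."""
    n = len(signal)
    total = 0.0
    for k in range(lo, hi):
        x = sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
        total += abs(x) ** 2 / n
    return total

def arpd(signal, baseline_powers, bands):
    """Average of (P_band - P_baseline) / P_baseline over the bands."""
    diffs = [(band_power(signal, lo, hi) - p0) / p0
             for (lo, hi), p0 in zip(bands, baseline_powers)]
    return sum(diffs) / len(diffs)

n = 64  # a fundamental plus a weak harmonic; the fault amplifies the harmonic
healthy = [math.sin(2 * math.pi * 4 * t / n)
           + 0.1 * math.sin(2 * math.pi * 12 * t / n) for t in range(n)]
faulty = [math.sin(2 * math.pi * 4 * t / n)
          + 0.6 * math.sin(2 * math.pi * 12 * t / n) for t in range(n)]
bands = [(3, 6), (11, 14)]
baseline = [band_power(healthy, lo, hi) for lo, hi in bands]
print(arpd(healthy, baseline, bands))   # zero by construction
print(arpd(faulty, baseline, bands) > 1.0)
```

In the actual method the ARPDs of measured signals are compared against the probability distributions of the candidate faults to complete the hypothesis test.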
Undesirable leakage to overlying formations with horizontal and vertical injection wells
NASA Astrophysics Data System (ADS)
Mosaheb, M.; Zeidouni, M.
2017-12-01
Deep saline aquifers are considered for underground storage of carbon dioxide. Undesirable leakage of injected CO2 into adjacent layers would disturb the storage process and can pollute shallower fresh water resources as well as the atmosphere. Leaky caprocks, faults, and abandoned wells are examples of leakage pathways. In addition, overpressure can reactivate a sealing fault or damage the caprock layer. Pressure management is applicable during the storage operation to avoid these consequences and to reduce undesirable leakage. Fluids can be injected through horizontal wells over a wider interval than with vertical wells; horizontal well injection produces less overpressure by delocalizing the induced pressure, especially in thin formations. In this work, numerical and analytical approaches are applied to model different leakage pathways with horizontal and vertical injection wells, and we compare leakage rate and overpressure for the two well types in different leaky pathway systems. Results show that horizontal well technology allows high injection rates with lower leakage rates for leaky well, leaky fault, and leaky caprock cases. The overpressure is reduced considerably by horizontal well injection compared to vertical well injection, especially in the leaky fault system. Horizontal well injection is thus an effective way to avoid reaching the threshold pressure of fault reactivation and to prevent the consequent induced seismicity.
NASA Astrophysics Data System (ADS)
Rusu-Anghel, S.; Ene, A.
2017-05-01
The quality of electric energy capture and the operational safety of the equipment depend essentially on the technical state of the contact line (CL). The present method for determining the technical state of the CL, based on advance programming, is no longer efficient, because faults can occur in areas outside the programme and therefore cannot be remediated. A different management method for the repair and maintenance of the CL is needed, based on its real state, which must be very well known. In this paper a new method for detecting faults in the CL is described. It is based on analysing the variation of the pantograph-CL contact force in the dynamic regime. Using mathematical modelling and experimental tests, it was established that each type of fault generates a 'signature' in the contact force diagram. The identification of these signatures can be accomplished by an informatics system which will provide the fault location, its type, and, in the future, the probable evolution of the CL technical state. The contact force is measured optically using a railway inspection trolley with appropriate equipment. The analysis of the desired parameters can be accomplished in real time by a data acquisition system based on dedicated software.
Application of the Systematic Sensor Selection Strategy for Turbofan Engine Diagnostics
NASA Technical Reports Server (NTRS)
Sowers, T. Shane; Kopasakis, George; Simon, Donald L.
2008-01-01
The data acquired from available system sensors forms the foundation upon which any health management system is based, and the available sensor suite directly impacts the overall diagnostic performance that can be achieved. While additional sensors may provide improved fault diagnostic performance, there are other factors that also need to be considered, such as instrumentation cost, weight, and reliability. A systematic sensor selection approach is desired to perform sensor selection from a holistic system-level perspective as opposed to making decisions in an ad hoc or heuristic fashion. The Systematic Sensor Selection Strategy is a methodology that optimally selects a sensor suite from a pool of sensors based on the system fault diagnostic approach, with the ability to take cost, weight, and reliability into consideration. This procedure was applied to a large commercial turbofan engine simulation. In this initial study, sensor suites tailored for improved diagnostic performance are constructed from a prescribed collection of candidate sensors. The diagnostic performance of the best-performing sensor suites in terms of fault detection and identification is demonstrated, with a discussion of the results and implications for future research.
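A much-simplified version of cost-constrained sensor selection can be sketched as a greedy set cover over a fault detectability matrix. The paper's strategy performs an optimal selection, so this greedy pass is only a conceptual stand-in, and the sensors, faults, and costs below are invented:

```python
# Greedy sensor selection: at each step, pick the affordable sensor that
# detects the most not-yet-covered fault classes. All data is illustrative.
DETECTS = {   # sensor -> set of fault classes it can detect
    "N1_speed":  {"fan_fault"},
    "N2_speed":  {"hpc_fault", "hpt_fault"},
    "EGT":       {"hpt_fault", "lpt_fault"},
    "fuel_flow": {"combustor_fault", "hpc_fault"},
}
COST = {"N1_speed": 1.0, "N2_speed": 1.0, "EGT": 2.0, "fuel_flow": 1.5}

def greedy_select(budget):
    chosen, covered, spent = [], set(), 0.0
    while True:
        best, best_gain = None, 0
        for sensor in DETECTS:
            if sensor in chosen or spent + COST[sensor] > budget:
                continue
            gain = len(DETECTS[sensor] - covered)  # new faults covered
            if gain > best_gain:
                best, best_gain = sensor, gain
        if best is None:          # nothing affordable adds coverage
            return chosen, covered
        chosen.append(best)
        covered |= DETECTS[best]
        spent += COST[best]

suite, covered = greedy_select(budget=3.5)
print(suite, covered)
```

Under this budget the EGT sensor is unaffordable, so the lpt fault class stays uncovered, illustrating the diagnostic cost of a constrained suite.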
Test plan. GCPS task 7, subtask 7.1: IHM development
NASA Technical Reports Server (NTRS)
Greenberg, H. S.
1994-01-01
The overall objective of Task 7 is to identify cost-effective life cycle integrated health management (IHM) approaches for a reusable launch vehicle's primary structure. Acceptable IHM approaches must: eliminate and accommodate faults through robust designs, identify optimum inspection/maintenance periods, automate ground and on-board test and check-out, and accommodate and detect structural faults by providing wide and localized area sensor and test coverage as required. These requirements are elements of our targeted primary structure low cost operations approach using airline-like maintenance by exception philosophies. This development plan will follow an evolutionary path paving the way to the ultimate development of flight-quality production, operations, and vehicle systems. This effort will be focused on maturing the recommended sensor technologies required for localized and wide area health monitoring to a technology readiness level (TRL) of 6 and to establish flight ready system design requirements. The following is a brief list of IHM program objectives: design out faults by analyzing material properties, structural geometry, and load and environment variables and identify failure modes and damage tolerance requirements; design in system robustness while meeting performance objectives (weight limitations) of the reusable launch vehicle primary structure; establish structural integrity margins to preclude the need for test and checkout and predict optimum inspection/maintenance periods through life prediction analysis; identify optimum fault protection system concept definitions combining system robustness and integrity margins established above with cost effective health monitoring technologies; and use coupons, panels, and integrated full scale primary structure test articles to identify, evaluate, and characterize the preferred NDE/NDI/IHM sensor technologies that will be a part of the fault protection system.
Fault tolerant and lifetime control architecture for autonomous vehicles
NASA Astrophysics Data System (ADS)
Bogdanov, Alexander; Chen, Yi-Liang; Sundareswaran, Venkataraman; Altshuler, Thomas
2008-04-01
Increased vehicle autonomy, survivability and utility can provide an unprecedented impact on mission success and are one of the most desirable improvements for modern autonomous vehicles. We propose a general architecture of intelligent resource allocation, reconfigurable control and system restructuring for autonomous vehicles. The architecture is based on fault-tolerant control and lifetime prediction principles, and it provides improved vehicle survivability, extended service intervals, greater operational autonomy through lower rate of time-critical mission failures and lesser dependence on supplies and maintenance. The architecture enables mission distribution, adaptation and execution constrained on vehicle and payload faults and desirable lifetime. The proposed architecture will allow managing missions more efficiently by weighing vehicle capabilities versus mission objectives and replacing the vehicle only when it is necessary.
Low cost management of replicated data in fault-tolerant distributed systems
NASA Technical Reports Server (NTRS)
Joseph, Thomas A.; Birman, Kenneth P.
1990-01-01
Many distributed systems replicate data for fault tolerance or availability. In such systems, a logical update on a data item results in a physical update on a number of copies. The synchronization and communication required to keep the copies of replicated data consistent introduce a delay when operations are performed. A technique is described that relaxes the usual degree of synchronization, permitting replicated data items to be updated concurrently with other operations, while at the same time ensuring that correctness is not violated. The additional concurrency thus obtained results in better response time when performing operations on replicated data. How this technique performs in conjunction with a roll-back and a roll-forward failure recovery mechanism is also discussed.
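One generic way to realize the relaxed synchronization described above, letting a logical update return immediately while physical copies catch up asynchronously, is optimistic replication with version numbers. This is a minimal illustrative sketch, not the paper's actual protocol (which additionally guarantees correctness under concurrency and interacts with roll-back/roll-forward recovery):

```python
# A logical write is applied to the local copy at once and queued for
# asynchronous delivery; stale deliveries are discarded by version number.
class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None
        self.version = 0
        self.outbox = []          # updates awaiting async propagation

    def local_update(self, value):
        """Caller returns immediately; no cross-replica synchronization."""
        self.version += 1
        self.value = value
        self.outbox.append((self.version, value))

    def deliver(self, version, value):
        """Apply only if newer than what this replica already holds."""
        if version > self.version:
            self.version, self.value = version, value

def propagate(src, replicas):
    """Background step: flush src's queued updates to its peers."""
    for update in src.outbox:
        for r in replicas:
            if r is not src:
                r.deliver(*update)
    src.outbox.clear()

a, b = Replica("a"), Replica("b")
a.local_update("x1")          # operation completes without waiting on b
a.local_update("x2")
propagate(a, [a, b])          # deferred propagation brings b up to date
print(b.value, b.version)     # x2 2
```

The response-time benefit comes from `local_update` never blocking on remote copies; the cost is a window during which replicas may briefly diverge.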
Fault tolerance analysis and applications to microwave modules and MMIC's
NASA Astrophysics Data System (ADS)
Boggan, Garry H.
A project whose objective was to provide an overview of built-in-test (BIT) considerations applicable to microwave systems, modules, and MMICs (monolithic microwave integrated circuits) is discussed. Available analytical techniques and software for assessing system failure characteristics were researched, and the resulting investigation provides a review of two techniques which have applicability to microwave systems design. A system-level approach to fault tolerance and redundancy management is presented in its relationship to the subsystem/element design. An overview of the microwave BIT focus from the Air Force Integrated Diagnostics program is presented. The technical reports prepared by the GIMADS team were reviewed for applicability to microwave modules and components. A review of MIMIC (millimeter and microwave integrated circuit) program activities relative to BIT/BITE is given.
A data management system to enable urgent natural disaster computing
NASA Astrophysics Data System (ADS)
Leong, Siew Hoon; Kranzlmüller, Dieter; Frank, Anton
2014-05-01
Civil protection, in particular natural disaster management, is very important to most nations and civilians in the world. When disasters like flash floods, earthquakes and tsunamis are expected or have taken place, it is of utmost importance to make timely decisions for managing the affected areas and reduce casualties. Computer simulations can generate information and provide predictions to facilitate this decision making process. Getting the data to the required resources is a critical requirement to enable the timely computation of the predictions. An urgent data management system to support natural disaster computing is thus necessary to effectively carry out data activities within a stipulated deadline. Since the trigger of a natural disaster is usually unpredictable, it is not always possible to prepare required resources well in advance. As such, an urgent data management system for natural disaster computing has to be able to work with any type of resources. Additional requirements include the need to manage deadlines and huge volume of data, fault tolerance, reliable, flexibility to changes, ease of usage, etc. The proposed data management platform includes a service manager to provide a uniform and extensible interface for the supported data protocols, a configuration manager to check and retrieve configurations of available resources, a scheduler manager to ensure that the deadlines can be met, a fault tolerance manager to increase the reliability of the platform and a data manager to initiate and perform the data activities. These managers will enable the selection of the most appropriate resource, transfer protocol, etc. such that the hard deadline of an urgent computation can be met for a particular urgent activity, e.g. data staging or computation. We associated 2 types of deadlines [2] with an urgent computing system. 
Soft/firm deadline: missing a soft or firm deadline renders the computation less useful, resulting in a cost that can have severe consequences. Hard deadline: missing a hard deadline renders the computation useless and results in catastrophic consequences. A prototype of this system has a REST-based service manager. The REST-based implementation provides a uniform interface that is easy to use, and new and upcoming file transfer protocols can easily be added and accessed via the service manager. The service manager interacts with the other four managers to coordinate the data activities so that the fundamental requirement of natural disaster urgent computing, i.e. the deadline, can be fulfilled in a reliable manner. A data activity can include data staging, data archiving and data storing. Reliability is ensured by the choice of a network-of-managers organisation model [1], the configuration manager and the fault tolerance manager. With this proposed design, an easy-to-use, resource-independent data management system that can support and fulfill the computation of a natural disaster prediction within stipulated deadlines can thus be realised. References: [1] H. G. Hegering, S. Abeck, and B. Neumair, Integrated Management of Networked Systems: Concepts, Architectures, and Their Operational Application, Morgan Kaufmann Publishers, San Francisco, CA, USA, 1999. [2] H. Kopetz, Real-Time Systems: Design Principles for Distributed Embedded Applications, second edition, Springer, New York, NY, USA, 2011. [3] S. H. Leong, A. Frank, and D. Kranzlmüller, Leveraging e-infrastructures for urgent computing, Procedia Computer Science 18 (2013), 2177-2186, 2013 International Conference on Computational Science. [4] N. Trebon, Enabling urgent computing within the existing distributed computing infrastructure, Ph.D. thesis, University of Chicago, August 2011, http://people.cs.uchicago.edu/~ntrebon/docs/dissertation.pdf.
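The resource-selection role of the scheduler manager described above can be sketched in Python (a minimal illustration only; the resource attributes, safety margin, and selection rule are assumptions, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    bandwidth_mb_s: float   # sustained transfer rate to this resource (MB/s)
    setup_s: float          # protocol/authentication overhead (s)

def pick_resource(data_mb, deadline_s, resources, margin=1.25):
    """Return the fastest resource whose estimated transfer time,
    inflated by a safety margin, still fits the hard deadline;
    None signals that the deadline cannot be met with any resource."""
    feasible = []
    for r in resources:
        est = r.setup_s + data_mb / r.bandwidth_mb_s
        if est * margin <= deadline_s:
            feasible.append((est, r))
    if not feasible:
        return None
    return min(feasible, key=lambda t: t[0])[1]
```

A real scheduler manager would also consult the configuration manager for live resource state and fall back to alternative protocols on failure; the sketch shows only the deadline test itself.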
Model Transformation for a System of Systems Dependability Safety Case
NASA Technical Reports Server (NTRS)
Murphy, Judy; Driskell, Steve
2011-01-01
The presentation reviews the dependability and safety effort of NASA's Independent Verification and Validation Facility. Topics include: safety engineering process, applications to non-space environment, Phase I overview, process creation, sample SRM artifact, Phase I end result, Phase II model transformation, fault management, and applying Phase II to individual projects.
Embedded Multiprocessor Technology for VHSIC Insertion
NASA Technical Reports Server (NTRS)
Hayes, Paul J.
1990-01-01
Viewgraphs on embedded multiprocessor technology for VHSIC insertion are presented. The objective was to develop multiprocessor system technology providing user-selectable fault tolerance, increased throughput, and ease of application representation for concurrent operation. The approach was to develop graph management mapping theory for proper performance, model multiprocessor performance, and demonstrate performance in selected hardware systems.
Design distributed simulation platform for vehicle management system
NASA Astrophysics Data System (ADS)
Wen, Zhaodong; Wang, Zhanlin; Qiu, Lihua
2006-11-01
Next-generation military aircraft place high performance demands on the airborne management system. General-purpose modules, data integration, high-speed data buses and related technologies are needed to share and manage subsystem information efficiently. The subsystems include the flight control system, propulsion system, hydraulic power system, environmental control system, fuel management system, electrical power system and others. The federated or mixed architecture is being replaced by an integrated architecture, in which the whole airborne system is managed as a single system: the physical devices remain distributed, but the system information is integrated and shared. The processing functions of each subsystem are integrated (including general processing modules and dynamic reconfiguration), and the sensors and signal-processing functions are shared, which in turn lays a foundation for shared power. A distributed vehicle management system built on a 1553B bus and distributed processors provides a validation platform for research on integrated management of airborne systems. This paper establishes such a Vehicle Management System (VMS) simulation platform, discusses its software and hardware configuration, and analyzes its communication and fault-tolerance methods.
Procedural errors in air traffic control: effects of traffic density, expertise, and automation.
Di Nocera, Francesco; Fabrizi, Roberto; Terenzi, Michela; Ferlazzo, Fabio
2006-06-01
Air traffic management requires operators to frequently shift among multiple tasks and/or goals at different levels of accomplishment. Procedural errors can occur when a controller completes one of the tasks before the entire operation has been completed. The present study had two goals: first, to verify the occurrence of post-completion errors in air traffic control (ATC) tasks; and second, to assess the effects of medium-term conflict detection (MTCD) tools on performance. Eighteen military controllers performed a simulated ATC task with and without automation support (MTCD vs. manual) in high and low air traffic density conditions. During the task, which consisted of managing several simulated flights in an en-route ATC scenario, a trace suddenly disappeared "after" the operator took charge of the aircraft, "during" the management of the trace, or "before" the pilot's first contact. In the manual condition, only the fault type "during" was found to differ significantly from the other two. By contrast, in the MTCD condition the fault type "after" generated significantly fewer errors than the fault type "before." Additionally, automation was found to affect the performance of junior controllers, whereas seniors' performance was not affected. Procedural errors can happen in ATC, but automation can mitigate this effect. The lack of benefit for the "before" fault type may be due to the fact that operators extend their reliance to a part of the task that is unsupported by the automated system.
Adaptive Fault-Resistant Systems
1994-10-01
An Architectural Overview of the Alpha Real-Time Distributed Kernel. In Proceedings of the USENIX Workshop on Microkernels and Other Kernel ...system and the controller are monolithic. We have noted earlier some of the problems of distributed systems, for example, the need to bound the...are monolithic. In practice, designers employ a layered structuring for their systems in order to manage complexity, and we expect that practical
A fuzzy decision tree for fault classification.
Zio, Enrico; Baraldi, Piero; Popescu, Irina C
2008-02-01
In plant accident management, control room operators are required to identify the causes of the accident based on the different patterns of evolution that develop in the monitored process variables. This task is often quite challenging, given the large number of process parameters monitored and the intense emotional state under which it is performed. To aid the operators, various techniques of fault classification have been engineered. An important requirement for their practical application is the physical interpretability of the relationships among the process variables underpinning the fault classification. In this view, the present work propounds a fuzzy approach to fault classification, which relies on fuzzy if-then rules inferred from the clustering of available pre-classified signal data, which are then organized in a logical and transparent decision tree structure. The advantages offered by the proposed approach are precisely that a transparent fault classification model is mined out of the signal data and that the underlying physical relationships among the process variables are easily interpretable as linguistic if-then rules that can be explicitly visualized in the decision tree structure. The approach is applied to a case study regarding the classification of simulated faults in the feedwater system of a boiling water reactor.
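A fuzzy if-then rule base of the kind described above can be illustrated with a minimal sketch (the linguistic terms, membership shapes, and fault labels below are invented for illustration; the paper infers its rules from clustered, pre-classified signal data):

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Invented linguistic terms for two monitored variables (illustrative only)
LOW  = lambda x: tri(x, -0.5, 0.0, 0.5)
HIGH = lambda x: tri(x,  0.5, 1.0, 1.5)

# Each rule: (membership functions for (flow, level), fault class)
RULES = [
    ((HIGH, LOW),  "feedwater leak"),
    ((LOW,  HIGH), "valve stuck"),
    ((LOW,  LOW),  "nominal"),
]

def classify(flow, level):
    """Fire every rule (AND = min of memberships) and return the
    fault class with the strongest overall firing strength."""
    strengths = {}
    for (m_flow, m_level), label in RULES:
        s = min(m_flow(flow), m_level(level))
        strengths[label] = max(strengths.get(label, 0.0), s)
    return max(strengths, key=strengths.get)
```

The transparency claimed in the abstract corresponds to the fact that each entry of `RULES` reads directly as a linguistic statement ("IF flow is HIGH AND level is LOW THEN feedwater leak").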
NASA Astrophysics Data System (ADS)
Khawaja, Taimoor Saleem
A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, the definition of a set of fault indicators, and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful LS-SVM algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVMs are founded on the principle of Structural Risk Minimization (SRM), which tends to find a good trade-off between low empirical risk and small capacity. The key features of SVMs are the use of non-linear kernels, the absence of local minima, the sparseness of the solution, and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on LS-SVMs.
The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior and any abnormal or novel data during real-time operation. The results of the scheme are interpreted as a posterior probability of health (1 - probability of fault). As shown through two case studies in Chapter 3, the scheme is well suited for diagnosing imminent faults in dynamical non-linear systems. Finally, the failure prognosis scheme is based on an incremental weighted Bayesian LS-SVR machine. It is particularly suited for online deployment given the incremental nature of the algorithm and the quick optimization problem solved in the LS-SVR algorithm. By way of kernelization and a Gaussian Mixture Modeling (GMM) scheme, the algorithm can estimate "possibly" non-Gaussian posterior distributions for complex non-linear systems. An efficient regression scheme associated with the more rigorous core algorithm allows for long-term predictions, fault growth estimation with confidence bounds and remaining useful life (RUL) estimation after a fault is detected. The leading contributions of this thesis are (a) the development of a novel Bayesian Anomaly Detector for efficient and reliable Fault Detection and Identification (FDI) based on Least Squares Support Vector Machines, (b) the development of a data-driven real-time architecture for long-term Failure Prognosis using Least Squares Support Vector Machines, (c) Uncertainty representation and management using Bayesian Inference for posterior distribution estimation and hyper-parameter tuning, and finally (d) the statistical characterization of the performance of diagnosis and prognosis algorithms in order to relate the efficiency and reliability of the proposed schemes.
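As a rough illustration of the 1-class idea (not the Bayesian LS-SVM machinery of the thesis), a detector can be fit on baseline data only and report a health score for new samples; the distance measure and the logistic mapping to a pseudo-probability below are simplifying assumptions:

```python
import math

class BaselineAnomalyDetector:
    """Toy 1-class detector: it is fit on nominal (baseline) data only
    and scores new points by distance from the baseline mean, squashed
    into a pseudo 'probability of health' (1 = healthy, 0 = faulty)."""
    def fit(self, baseline):
        n = len(baseline)
        self.mean = [sum(col) / n for col in zip(*baseline)]
        self.scale = max(self._dist(x) for x in baseline) or 1.0
    def _dist(self, x):
        return math.sqrt(sum((a - m) ** 2 for a, m in zip(x, self.mean)))
    def health(self, x):
        z = self._dist(x) / self.scale        # ~1 at the edge of the baseline cloud
        arg = min(4.0 * (z - 1.5), 50.0)      # clamp to avoid overflow
        return 1.0 / (1.0 + math.exp(arg))
```

The actual scheme replaces the mean-distance score with a kernelized 1-class LS-SVM decision function and derives the posterior probability of health from Bayesian inference rather than a fixed logistic.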
NASA Astrophysics Data System (ADS)
Li, Shuanghong; Cao, Hongliang; Yang, Yupu
2018-02-01
Fault diagnosis is a key process for the reliability and safety of solid oxide fuel cell (SOFC) systems. However, it is difficult to rapidly and accurately identify faults in complicated SOFC systems, especially when simultaneous faults appear. In this research, a data-driven Multi-Label (ML) pattern identification approach is proposed to address the simultaneous fault diagnosis of SOFC systems. The framework of the simultaneous-fault diagnosis primarily includes two components: feature extraction and an ML-SVM classifier. The approach can be trained to diagnose simultaneous SOFC faults, such as fuel leakage and air leakage at different positions in the SOFC system, using simple training data sets consisting only of single faults, without demanding simultaneous-fault data. The experimental results show the proposed framework can diagnose simultaneous SOFC system faults with high accuracy while requiring only a small amount of training data and a low computational burden. In addition, Fault Inference Tree Analysis (FITA) is employed to identify the correlations among possible faults and their corresponding symptoms at the system component level.
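The core idea of diagnosing simultaneous faults from single-fault training data can be sketched as follows (a toy signature-projection stand-in for the ML-SVM classifier; the fault labels, threshold, and linear signatures are assumptions made for illustration):

```python
import math

def _sub(a, b): return [x - y for x, y in zip(a, b)]
def _dot(a, b): return sum(x * y for x, y in zip(a, b))

class MultiLabelDiagnoser:
    """Learns one signature direction per fault from single-fault data;
    a test sample is assigned every fault whose signature projection
    exceeds a threshold, so simultaneous faults need no joint training data."""
    def fit(self, nominal, single_fault_samples):
        self.nominal = nominal
        self.sigs = {}
        for label, sample in single_fault_samples.items():
            d = _sub(sample, nominal)
            n = math.sqrt(_dot(d, d)) or 1.0
            self.sigs[label] = [x / n for x in d]   # unit-length signature
    def predict(self, sample, thresh=0.5):
        d = _sub(sample, self.nominal)
        return sorted(label for label, sig in self.sigs.items()
                      if _dot(d, sig) > thresh)
```

A sample whose deviation combines two signatures fires both per-label tests, which is the multi-label behaviour the abstract describes; the paper realizes each per-label decision with an SVM rather than a linear projection.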
Fault detection and isolation for complex system
NASA Astrophysics Data System (ADS)
Jing, Chan Shi; Bayuaji, Luhur; Samad, R.; Mustafa, M.; Abdullah, N. R. H.; Zain, Z. M.; Pebrianti, Dwi
2017-07-01
Fault Detection and Isolation (FDI) is a method to monitor, identify, and pinpoint the type and location of a fault in a complex multiple-input multiple-output (MIMO) non-linear system. A two-wheel robot is used as the complex system in this study. The aim of the research is to design and construct a Fault Detection and Isolation algorithm. The proposed method for fault identification is a hybrid technique that combines a Kalman filter and an Artificial Neural Network (ANN). The Kalman filter processes the data from the sensors of the system and indicates faults in the sensor readings. Error prediction is based on the fault magnitude and the time of occurrence of the fault. The Artificial Neural Network (ANN) is then used to determine the type of fault and isolate it in the system.
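The Kalman-filter half of such a hybrid scheme can be sketched with a scalar filter whose innovation is gated to flag sensor faults (the random-walk model, noise values, and gate below are illustrative assumptions; the paper's filter and the ANN isolation stage are not reproduced):

```python
class ScalarKalmanDetector:
    """Scalar Kalman filter tracking an (assumed) slowly varying signal;
    a fault is flagged when the innovation exceeds a k-sigma gate, and
    faulty measurements are excluded from the filter update."""
    def __init__(self, q=1e-4, r=0.1, k=4.0):
        self.q, self.r, self.k = q, r, k    # process noise, sensor noise, gate
        self.x, self.p = 0.0, 1.0           # state estimate and its variance
    def step(self, z):
        self.p += self.q                    # predict (random-walk model)
        innov = z - self.x
        s = self.p + self.r                 # innovation variance
        fault = innov * innov > (self.k ** 2) * s
        if not fault:                       # update only on healthy data
            g = self.p / s
            self.x += g * innov
            self.p *= (1.0 - g)
        return fault
```

In the full FDI scheme the flagged innovation, its magnitude, and its onset time would be handed to the ANN stage for fault-type classification and isolation.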
Framework for a space shuttle main engine health monitoring system
NASA Technical Reports Server (NTRS)
Hawman, Michael W.; Galinaitis, William S.; Tulpule, Sharayu; Mattedi, Anita K.; Kamenetz, Jeffrey
1990-01-01
A framework developed for a health management system (HMS) which is directed at improving the safety of operation of the Space Shuttle Main Engine (SSME) is summarized. An emphasis was placed on near term technology through requirements to use existing SSME instrumentation and to demonstrate the HMS during SSME ground tests within five years. The HMS framework was developed through an analysis of SSME failure modes, fault detection algorithms, sensor technologies, and hardware architectures. A key feature of the HMS framework design is that a clear path from the ground test system to a flight HMS was maintained. Fault detection techniques based on time series, nonlinear regression, and clustering algorithms were developed and demonstrated on data from SSME ground test failures. The fault detection algorithms exhibited 100 percent detection of faults, had an extremely low false alarm rate, and were robust to sensor loss. These algorithms were incorporated into a hierarchical decision making strategy for overall assessment of SSME health. A preliminary design for a hardware architecture capable of supporting real time operation of the HMS functions was developed. Utilizing modular, commercial off-the-shelf components produced a reliable low cost design with the flexibility to incorporate advances in algorithm and sensor technology as they become available.
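A persistence requirement of the kind that keeps false-alarm rates low in such time-series schemes can be sketched as follows (the window size, band, and persistence count are invented for illustration and are not the SSME HMS algorithms):

```python
from collections import deque

class PersistenceDetector:
    """Moving-window detector: a fault is declared only after `persist`
    consecutive out-of-band samples, which suppresses isolated noise
    spikes and keeps the false-alarm rate low."""
    def __init__(self, window=20, band=4.0, persist=3):
        self.buf = deque(maxlen=window)
        self.band, self.persist = band, persist
        self.count = 0
    def step(self, z):
        out = False
        if len(self.buf) >= 5:
            m = sum(self.buf) / len(self.buf)
            var = sum((x - m) ** 2 for x in self.buf) / len(self.buf)
            out = abs(z - m) > self.band * max(var ** 0.5, 1e-3)
        self.count = self.count + 1 if out else 0
        if not out:
            self.buf.append(z)              # learn only from in-band samples
        return self.count >= self.persist
```

Requiring several consecutive exceedances before alarming is one simple way to trade a small detection delay for robustness to sensor noise and dropouts, in the spirit of the hierarchical decision strategy described above.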
Analysis of a hardware and software fault tolerant processor for critical applications
NASA Technical Reports Server (NTRS)
Dugan, Joanne B.
1993-01-01
Computer systems for critical applications must be designed to tolerate software faults as well as hardware faults. A unified approach to tolerating hardware and software faults is characterized by classifying faults in terms of duration (transient or permanent) rather than source (hardware or software). Errors arising from transient faults can be handled through masking or voting, but errors arising from permanent faults require system reconfiguration to bypass the failed component. Most errors which are caused by software faults can be considered transient, in that they are input-dependent. Software faults are triggered by a particular set of inputs. Quantitative dependability analysis of systems which exhibit a unified approach to fault tolerance can be performed by a hierarchical combination of fault tree and Markov models. A methodology for analyzing hardware and software fault tolerant systems is applied to the analysis of a hypothetical system, loosely based on the Fault Tolerant Parallel Processor. The models consider both transient and permanent faults, hardware and software faults, independent and related software faults, automatic recovery, and reconfiguration.
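The Markov half of such a hierarchical model can be illustrated with a toy three-state chain for a duplex system with imperfect coverage (the states, rates, and Euler time-stepping are illustrative assumptions, not the paper's FTPP model):

```python
def markov_reliability(lam_p, coverage, t, dt=0.001):
    """Euler stepping of a toy 3-state Markov model of a duplex system:
    state 0 = both units OK, state 1 = simplex after reconfiguration,
    state 2 = failed. lam_p is the permanent-fault rate per unit;
    `coverage` is the probability a fault is safely reconfigured."""
    p = [1.0, 0.0, 0.0]
    for _ in range(int(t / dt)):
        d0 = -2.0 * lam_p * p[0]
        d1 = 2.0 * lam_p * coverage * p[0] - lam_p * p[1]
        d2 = 2.0 * lam_p * (1.0 - coverage) * p[0] + lam_p * p[1]
        p = [p[0] + d0 * dt, p[1] + d1 * dt, p[2] + d2 * dt]
    return p[0] + p[1]      # probability the system is still operational
```

In the hierarchical methodology, the entry rates of such a chain would themselves come from fault-tree solutions, and transient (input-dependent software) faults would be absorbed by masking rather than by a state change.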
Space station automation of common module power management and distribution, volume 2
NASA Technical Reports Server (NTRS)
Ashworth, B.; Riedesel, J.; Myers, C.; Jakstas, L.; Smith, D.
1990-01-01
The new Space Station Module Power Management and Distribution System (SSM/PMAD) testbed automation system is described. The subjects discussed include the testbed 120 volt dc star bus configuration and operation, the SSM/PMAD automation system architecture, the fault recovery and management expert system (FRAMES) rules' English representation, the SSM/PMAD user interface, and the SSM/PMAD future direction. Several appendices are presented and include the following: SSM/PMAD interface user manual version 1.0, SSM/PMAD lowest level processor (LLP) reference, SSM/PMAD technical reference version 1.0, SSM/PMAD LLP visual control logic representations (VCLRs), SSM/PMAD LLP/FRAMES interface control document (ICD), and SSM/PMAD LLP switchgear interface controller (SIC) ICD.
Benchmarking Gas Path Diagnostic Methods: A Public Approach
NASA Technical Reports Server (NTRS)
Simon, Donald L.; Bird, Jeff; Davison, Craig; Volponi, Al; Iverson, R. Eugene
2008-01-01
Recent technology reviews have identified the need for objective assessments of engine health management (EHM) technology. The need is two-fold: technology developers require relevant data and problems to design and validate new algorithms and techniques while engine system integrators and operators need practical tools to direct development and then evaluate the effectiveness of proposed solutions. This paper presents a publicly available gas path diagnostic benchmark problem that has been developed by the Propulsion and Power Systems Panel of The Technical Cooperation Program (TTCP) to help address these needs. The problem is coded in MATLAB (The MathWorks, Inc.) and coupled with a non-linear turbofan engine simulation to produce "snap-shot" measurements, with relevant noise levels, as if collected from a fleet of engines over their lifetime of use. Each engine within the fleet will experience unique operating and deterioration profiles, and may encounter randomly occurring relevant gas path faults including sensor, actuator and component faults. The challenge to the EHM community is to develop gas path diagnostic algorithms to reliably perform fault detection and isolation. An example solution to the benchmark problem is provided along with associated evaluation metrics. A plan is presented to disseminate this benchmark problem to the engine health management technical community and invite technology solutions.
Poka Yoke system based on image analysis and object recognition
NASA Astrophysics Data System (ADS)
Belu, N.; Ionescu, L. M.; Misztal, A.; Mazăre, A.
2015-11-01
Poka Yoke is a quality management method aimed at preventing faults from arising during production processes; it deals with "fail-safing" or "mistake-proofing". The Poka Yoke concept was created and developed by Shigeo Shingo for the Toyota Production System, and is used in many fields, especially in monitoring production processes. In many cases, identifying faults in a production process involves a higher cost than the disposal cost it avoids. Usually, Poka Yoke solutions are based on multiple sensors that identify nonconformities, which means adding mechanical and electronic equipment to the production line. As a consequence, and because the method itself is invasive and affects the production process, the cost of diagnostics increases, and the machines by which a Poka Yoke system is implemented become bulkier and more sophisticated. In this paper we propose a Poka Yoke solution based on image analysis and fault identification. The solution consists of an image acquisition module, a mid-level processing module, and an object recognition module using associative memory (a Hopfield network). All are integrated into an embedded system with an AD (analog-to-digital) converter and a Zynq-7000 (22 nm technology).
Software architecture of the Magdalena Ridge Observatory Interferometer
NASA Astrophysics Data System (ADS)
Farris, Allen; Klinglesmith, Dan; Seamons, John; Torres, Nicolas; Buscher, David; Young, John
2010-07-01
Merging software from 36 independent work packages into a coherent, unified software system with a lifespan of twenty years is the challenge faced by the Magdalena Ridge Observatory Interferometer (MROI). We solve this problem by using standardized interface software automatically generated from simple high-level descriptions of these systems, relying only on Linux, GNU, and POSIX without complex software such as CORBA. This approach, based on gigabit Ethernet with a TCP/IP protocol, provides the flexibility to integrate and manage diverse, independent systems using a centralized supervisory system that provides a database manager, data collectors, fault handling, and an operator interface.
Overview of error-tolerant cockpit research
NASA Technical Reports Server (NTRS)
Abbott, Kathy
1990-01-01
The objectives of research in intelligent cockpit aids and intelligent error-tolerant systems are stated. In intelligent cockpit aids research, the objective is to provide increased aid and support to the flight crew of civil transport aircraft through the use of artificial intelligence techniques combined with traditional automation. In intelligent error-tolerant systems, the objective is to develop and evaluate cockpit systems that provide flight crews with safe and effective ways and means to manage aircraft systems, plan and replan flights, and respond to contingencies. A subsystems fault management functional diagram is given. All information is in viewgraph form.
Numerical modeling of fluid flow in a fault zone: a case of study from Majella Mountain (Italy).
NASA Astrophysics Data System (ADS)
Romano, Valentina; Battaglia, Maurizio; Bigi, Sabina; De'Haven Hyman, Jeffrey; Valocchi, Albert J.
2017-04-01
The study of fluid flow in fractured rocks plays a key role in reservoir management, including CO2 sequestration and waste isolation. We present a numerical model of fluid flow in a fault zone, based on field data acquired at Majella Mountain, in the Central Apennines (Italy). This fault zone is considered a good analogue because of the massive presence of fluid migration in the form of tar. Faults are mechanical features that cause permeability heterogeneities in the upper crust, so they strongly influence fluid flow. The distribution of the main components (core, damage zone) can lead the fault zone to act as a conduit, a barrier, or a combined conduit-barrier system. We integrated existing information and our own structural surveys of the area to better identify the major fault features (e.g., type of fractures, statistical properties, geometrical and petro-physical characteristics). In our model the damage zones of the fault are described as a discretely fractured medium, while the core of the fault is described as a porous one. Our model utilizes the dfnWorks code, a parallelized computational suite developed at Los Alamos National Laboratory (LANL), which generates a three-dimensional Discrete Fracture Network (DFN) of the damage zones of the fault and characterizes its hydraulic parameters. The challenge of the study is the coupling between the discrete domain of the damage zones and the continuum domain of the core. The field investigations and the basic computational workflow are described, along with preliminary results of a fluid flow simulation at the scale of the fault.
Systems and Methods for Determining Inertial Navigation System Faults
NASA Technical Reports Server (NTRS)
Bharadwaj, Raj Mohan (Inventor); Bageshwar, Vibhor L. (Inventor); Kim, Kyusung (Inventor)
2017-01-01
An inertial navigation system (INS) includes a primary inertial navigation system (INS) unit configured to receive accelerometer measurements from an accelerometer and angular velocity measurements from a gyroscope. The primary INS unit is further configured to receive global navigation satellite system (GNSS) signals from a GNSS sensor and to determine a first set of kinematic state vectors based on the accelerometer measurements, the angular velocity measurements, and the GNSS signals. The INS further includes a secondary INS unit configured to receive the accelerometer measurements and the angular velocity measurements and to determine a second set of kinematic state vectors of the vehicle based on the accelerometer measurements and the angular velocity measurements. A health management system is configured to compare the first set of kinematic state vectors and the second set of kinematic state vectors to determine faults associated with the accelerometer or the gyroscope based on the comparison.
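The cross-check between the two INS units can be sketched as a simple residual test (the vector layout, gate, and sigma below are illustrative; the patent's actual comparison logic is not specified here):

```python
import math

def ins_cross_check(primary_state, secondary_state, gate=3.0, sigma=0.5):
    """Flags a probable accelerometer/gyro fault when the kinematic
    state vectors of the GNSS-aided primary INS and the unaided
    secondary INS disagree by more than `gate` standard deviations."""
    resid = math.sqrt(sum((a - b) ** 2
                          for a, b in zip(primary_state, secondary_state)))
    return resid > gate * sigma
```

Because only the primary unit incorporates GNSS corrections, a persistent disagreement between the two solutions points at the shared inertial sensors rather than at the satellite signals.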
Education Reform: A Managerial Agenda.
ERIC Educational Resources Information Center
Bacharach, Samuel B.; Conley, Sharon C.
1986-01-01
Education reform has wrongly focused on teacher motivation and rewards, when the organizational system itself is at fault. Research shows that effective school management hinges on increased individual discretion and decision-making opportunities for teachers and less controlling behavior by administrators. Ten characteristics of effective…
Yu, Dantong; Katramatos, Dimitrios; Sim, Alexander; Shoshani, Arie
2014-04-22
A cross-domain network resource reservation scheduler configured to schedule a path from at least one end-site includes a management plane device configured to monitor and provide information representing at least one of functionality, performance, faults, and fault recovery associated with a network resource; a control plane device configured to at least one of schedule the network resource, provision local area network quality of service, provision local area network bandwidth, and provision wide area network bandwidth; and a service plane device configured to interface with the control plane device to reserve the network resource based on a reservation request and the information from the management plane device. Corresponding methods and computer-readable medium are also disclosed.
Pratt and Whitney Overview and Advanced Health Management Program
NASA Technical Reports Server (NTRS)
Inabinett, Calvin
2008-01-01
Hardware Development Activity: design and test custom multi-layer circuit boards for use in the Fault Emulation Unit; logic design performed using VHDL; lay out the power system for lab hardware; work lab issues with software developers and software testers; interface with Engine Systems personnel on the performance of engine hardware components; perform off-nominal testing with new engine hardware.
Data Aggregation in Multi-Agent Systems in the Presence of Hybrid Faults
ERIC Educational Resources Information Center
Srinivasan, Satish Mahadevan
2010-01-01
Data Aggregation (DA) is a set of functions that provide components of a distributed system access to global information for purposes of network management and user services. With the diverse new capabilities that networks can provide, applicability of DA is growing. DA is useful in dealing with multi-value domain information and often requires…
Operator Performance Evaluation of Fault Management Interfaces for Next-Generation Spacecraft
NASA Technical Reports Server (NTRS)
Hayashi, Miwa; Ravinder, Ujwala; Beutter, Brent; McCann, Robert S.; Spirkovska, Lilly; Renema, Fritz
2008-01-01
In the cockpit of NASA's next generation of spacecraft, most vehicle commanding will be carried out via electronic interfaces instead of hard cockpit switches. Checklists will also be displayed and completed on electronic procedure viewers rather than on paper. Transitioning to electronic cockpit interfaces opens up opportunities for more automated assistance, including automated root-cause diagnosis capability. This paper reports an empirical study evaluating two potential concepts for fault management interfaces incorporating two different levels of automation. The operator performance benefits produced by automation were assessed, and some design recommendations for spacecraft fault management interfaces are discussed.
NO-FAULT COMPENSATION FOR MEDICAL INJURIES: TRENDS AND CHALLENGES.
Kassim, Puteri Nemie
2014-12-01
As an alternative to the tort or fault-based system, a no-fault compensation system has been viewed as having the potential to overcome problems inherent in the tort system by providing fair, speedy and adequate compensation for medically injured victims. Proponents of the suggested no-fault compensation system have argued that this system is more efficient in terms of time and money, as well as in making the circumstances in which compensation is paid, much clearer. However, the arguments against no-fault compensation systems are mainly on issues of funding difficulties, accountability and deterrence, particularly, once fault is taken out of the equation. Nonetheless, the no-fault compensation system has been successfully implemented in various countries but, at the same time, rejected in some others, as not being implementable. In the present trend, the no-fault system seems to fit the needs of society by offering greater access to justice for medically injured victims and providing a clearer "road map" towards obtaining suitable redress. This paper aims at providing the readers with an overview of the characteristics of the no fault compensation system and some examples of countries that have implemented it. Qualitative Research-Content Analysis. Given the many problems and hurdles posed by the tort or fault-based system, it is questionable that it can efficiently play its role as a mechanism that affords fair and adequate compensation for victims of medical injuries. However, while a comprehensive no-fault compensation system offers a tempting alternative to the tort or fault-based system, to import such a change into our local scenario requires a great deal of consideration. There are major differences, mainly in terms of social standing, size of population, political ideology and financial commitment, between Malaysia and countries that have successfully implemented no-fault systems. 
Nevertheless, implementing a no-fault compensation system in Malaysia is not entirely impossible. A custom-made no-fault model tailored to suit our local scenario can be promising, provided that a thorough research is made, assessing the viability of a no-fault system in Malaysia, addressing the inherent problems and, consequently, designing a workable no-fault system in Malaysia.
DMS Advanced Applications for Accommodating High Penetrations of DERs and Microgrids: Preprint
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pratt, Annabelle; Veda, Santosh; Maitra, Arindam
Efficient and effective management of the electrical distribution system requires an integrated system approach for Distribution Management Systems (DMS), Distributed Energy Resources (DERs), Distributed Energy Resources Management System (DERMS), and microgrids to work in harmony. This paper highlights some of the outcomes from a U.S. Department of Energy (DOE), Office of Electricity (OE) project, including 1) the architecture of these integrated systems, and 2) expanded functions of two example DMS applications, Volt-VAR optimization (VVO) and Fault Location, Isolation and Service Restoration (FLISR), to accommodate DER. For these two example applications, the relevant DER Group Functions necessary to support communication between DMS and Microgrid Controller (MC) in grid-tied mode are identified.
DMS Advanced Applications for Accommodating High Penetrations of DERs and Microgrids
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pratt, Annabelle; Veda, Santosh; Maitra, Arindam
Efficient and effective management of the electric distribution system requires an integrated approach to allow various systems to work in harmony, including distribution management systems (DMS), distributed energy resources (DERs), distributed energy resources management systems, and microgrids. This study highlights some outcomes from a recent project sponsored by the US Department of Energy, Office of Electricity Delivery and Energy Reliability, including information about (i) the architecture of these integrated systems and (ii) expanded functions of two example DMS applications to accommodate DERs: volt-var optimisation and fault location, isolation, and service restoration. In addition, the relevant DER group functions necessary to support communications between the DMS and a microgrid controller in grid-tied mode are identified.
NMESys: An expert system for network fault detection
NASA Technical Reports Server (NTRS)
Nelson, Peter C.; Warpinski, Janet
1991-01-01
The problem of network management is becoming an increasingly difficult and challenging task. It is very common today to find heterogeneous networks consisting of many different types of computers, operating systems, and protocols. Implementing a network with this many components is difficult enough; maintaining such a network is an even larger problem. NMESys, a prototype network management expert system, is implemented in the C Language Integrated Production System (CLIPS). NMESys concentrates on solving some of the critical problems encountered in managing a large network. The major goal of NMESys is to provide a network operator with an expert system tool that quickly and accurately detects hard failures and potential failures, and that minimizes or eliminates user downtime in a large network.
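The escalation from raw network events to "hard" versus "potential" failure alerts can be sketched as a simple forward-chaining rule. NMESys itself is written as CLIPS rules; the Python below is only an illustrative analogue, and the event fields and threshold are assumptions, not the NMESys knowledge base:

```python
# Hypothetical sketch of the kind of rule NMESys encodes in CLIPS:
# repeated link timeouts on a node escalate to a "hard failure" alert.
from collections import Counter

def detect_failures(events, hard_threshold=3):
    """Classify nodes by timeout count: potential vs. hard failure."""
    timeouts = Counter(e["node"] for e in events if e["type"] == "timeout")
    alerts = {}
    for node, count in timeouts.items():
        alerts[node] = "hard failure" if count >= hard_threshold else "potential failure"
    return alerts

events = [{"node": "gw1", "type": "timeout"}] * 3 + [{"node": "fs2", "type": "timeout"}]
print(detect_failures(events))  # {'gw1': 'hard failure', 'fs2': 'potential failure'}
```

A real rule base would of course correlate many event types; this shows only the detect-and-escalate pattern.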
Key recovery factors for the August 24, 2014, South Napa Earthquake
Hudnut, Kenneth W.; Brocher, Thomas M.; Prentice, Carol S.; Boatwright, John; Brooks, Benjamin A.; Aagaard, Brad T.; Blair, James Luke; Fletcher, Jon Peter B.; Erdem, Jemile; Wicks, Chuck; Murray, Jessica R.; Pollitz, Fred F.; Langbein, John O.; Svarc, Jerry L.; Schwartz, David P.; Ponti, Daniel J.; Hecker, Suzanne; DeLong, Stephen B.; Rosa, Carla M.; Jones, Brenda; Lamb, Rynn M.; Rosinski, Anne M.; McCrink, Timothy P.; Dawson, Timothy E.; Seitz, Gordon G.; Glennie, Craig; Hauser, Darren; Ericksen, Todd; Mardock, Dan; Hoirup, Don F.; Bray, Jonathan D.; Rubin, Ron S.
2014-01-01
Through discussions between the Federal Emergency Management Agency (FEMA) and the U.S. Geological Survey (USGS) following the South Napa earthquake, it was determined that several key decision points would be faced by FEMA for which additional information should be sought and provided by USGS and its partners. This report addresses the four tasks that were agreed to. These tasks are (1) assessment of ongoing fault movement (called afterslip) especially in the Browns Valley residential neighborhood, (2) assessment of the shaking pattern in the downtown area of the City of Napa, (3) improvement of information on the fault hazards posed by the West Napa Fault System (record of past earthquakes and slip rate, for example), and (4) imagery acquisition and data processing to provide overall geospatial information support to FEMA.
Intelligent fault management for the Space Station active thermal control system
NASA Technical Reports Server (NTRS)
Hill, Tim; Faltisco, Robert M.
1992-01-01
The Thermal Advanced Automation Project (TAAP) approach and architecture is described for automating the Space Station Freedom (SSF) Active Thermal Control System (ATCS). The baseline functionality and advanced automation techniques for Fault Detection, Isolation, and Recovery (FDIR) will be compared and contrasted. Advanced automation techniques such as rule-based systems and model-based reasoning should be utilized to efficiently control, monitor, and diagnose this extremely complex physical system. TAAP is developing advanced FDIR software for use on the SSF thermal control system. The goal of TAAP is to join Knowledge-Based System (KBS) technology, using a combination of rules and model-based reasoning, with conventional monitoring and control software in order to maximize autonomy of the ATCS. TAAP's predecessor was NASA's Thermal Expert System (TEXSYS) project, which was the first large real-time expert system to use both extensive rules and model-based reasoning to control and perform FDIR on a large, complex physical system. TEXSYS showed that a method is needed for safely and inexpensively testing all possible faults of the ATCS, particularly those potentially damaging to the hardware, in order to develop a fully capable FDIR system. TAAP therefore includes the development of a high-fidelity simulation of the thermal control system. The simulation provides realistic, dynamic ATCS behavior and fault insertion capability for software testing without hardware-related risks or expense. In addition, thermal engineers will gain greater confidence in the KBS FDIR software than was possible prior to this kind of simulation testing. The TAAP KBS will initially be a ground-based extension of the baseline ATCS monitoring and control software and could be migrated on-board as additional computational resources are made available.
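The fault-insertion idea behind TAAP's simulation testing can be illustrated with a toy first-order coolant-loop model. This is our sketch under invented parameters, not TAAP code: a pump-degradation fault is injected mid-run so that FDIR logic could be exercised without touching hardware.

```python
# Toy illustration of fault insertion into a thermal-loop simulation
# (all parameters are assumed for illustration, not SSF ATCS values).
def simulate(steps=200, dt=1.0, fault_at=100):
    T, flow = 20.0, 1.0          # coolant temp (deg C), normalized pump flow
    history = []
    for k in range(steps):
        if k == fault_at:
            flow = 0.3           # injected fault: pump degradation at step 100
        heat_in = 5.0            # constant heat load
        cooling = 0.2 * flow * (T - 10.0)  # heat rejection scales with flow
        T += dt * (heat_in - cooling)
        history.append(T)
    return history

hist = simulate()
print(round(hist[99], 1), round(hist[199], 1))  # ~35.0 pre-fault, ~93.2 after
```

An FDIR monitor watching this trace would flag the departure from the pre-fault equilibrium, which is exactly the behavior a fault-insertion simulator lets developers test safely.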
Towards an Autonomic Cluster Management System (ACMS) with Reflex Autonomicity
NASA Technical Reports Server (NTRS)
Truszkowski, Walt; Hinchey, Mike; Sterritt, Roy
2005-01-01
Cluster computing, whereby a large number of simple processors or nodes are combined together to apparently function as a single powerful computer, has emerged as a research area in its own right. The approach offers a relatively inexpensive means of providing a fault-tolerant environment and achieving significant computational capabilities for high-performance computing applications. However, the task of manually managing and configuring a cluster quickly becomes daunting as the cluster grows in size. Autonomic computing, with its vision to provide self-management, can potentially solve many of the problems inherent in cluster management. We describe the development of a prototype Autonomic Cluster Management System (ACMS) that exploits autonomic properties in automating cluster management and its evolution to include reflex reactions via pulse monitoring.
Simpson, Roy L
2004-08-01
The Institute of Medicine's landmark report asserted that medical error is seldom the fault of individuals, but the result of faulty healthcare policy/procedure systems. Numerous studies have shown that information technology (IT) can shore up weak systems. For nursing, IT plays a key role in eliminating nursing mistakes. However, managing IT is a function of managing the people who use it. For nursing administrators, successful IT implementations depend on adroit management of the three 'P's: People, processes and (computer) programs. This paper examines critical issues for managing each entity. It discusses the importance of developing trusting organizations, the requirements of process change, how to implement technology in harmony with the organization and the significance of vision.
NASA Technical Reports Server (NTRS)
Fitz, Rhonda; Whitman, Gerek
2016-01-01
Research into the complexities of software-system Fault Management (FM) and how architectural design decisions affect safety, preservation of assets, and maintenance of desired system functionality has coalesced into a technical reference (TR) suite that advances the provision of safety and mission assurance. The NASA Independent Verification and Validation (IV&V) Program, with Software Assurance Research Program support, extracted FM architectures across the IV&V portfolio to evaluate robustness, assess visibility for validation and test, and define software assurance methods applied to the architectures and designs. This investigation spanned IV&V projects with seven different primary developers, a wide range of sizes and complexities, and encompassed Deep Space Robotic, Human Spaceflight, and Earth Orbiter mission FM architectures. The initiative continues with an expansion of the TR suite to include Launch Vehicles, adding the benefit of investigating differences intrinsic to model-based FM architectures and insight into the complexities of FM within an Agile software development environment, in order to improve awareness of how nontraditional processes affect FM architectural design and system health management. The identification of particular FM architectures, visibility, and associated IV&V techniques provides a TR suite that enables greater assurance that critical software systems will adequately protect against faults and respond to adverse conditions. Additionally, the role FM plays with regard to strengthened security requirements, with the potential to advance overall asset protection of flight software systems, is being addressed with the development of an adverse-conditions database encompassing flight software vulnerabilities. Capitalizing on the established framework, this TR suite provides assurance capability for a variety of FM architectures and varied development approaches.
Research results are being disseminated across NASA, other agencies, and the software community. This paper discusses the findings and TR suite informing the FM domain in best practices for FM architectural design, visibility observations, and methods employed for IV&V and mission assurance.
NASA Technical Reports Server (NTRS)
1985-01-01
The primary objective of the Test Active Control Technology (ACT) System laboratory tests was to verify and validate the system concept, hardware, and software. The initial lab tests were open loop hardware tests of the Test ACT System as designed and built. During the course of the testing, minor problems were uncovered and corrected. Major software tests were run. The initial software testing was also open loop. These tests examined pitch control laws, wing load alleviation, signal selection/fault detection (SSFD), and output management. The Test ACT System was modified to interface with the direct drive valve (DDV) modules. The initial testing identified problem areas with DDV nonlinearities, valve friction induced limit cycling, DDV control loop instability, and channel command mismatch. The other DDV issue investigated was the ability to detect and isolate failures. Some simple schemes for failure detection were tested but were not completely satisfactory. The Test ACT System architecture continues to appear promising for ACT/FBW applications in systems that must be immune to worst case generic digital faults, and be able to tolerate two sequential nongeneric faults with no reduction in performance. The challenge in such an implementation would be to keep the analog element sufficiently simple to achieve the necessary reliability.
The Development of NASA's Fault Management Handbook
NASA Technical Reports Server (NTRS)
Fesq, Lorraine
2011-01-01
A disciplined approach to Fault Management (FM) has not always been emphasized by projects, contributing to major schedule and cost overruns: (1) Often faults aren't addressed until the nominal spacecraft design is fairly stable. (2) FM design is relegated to an after-the-fact, Band-Aid patchwork approach. Progress is being made on a number of fronts outside of the Handbook effort: (1) Processes, practices and tools are being developed at some Centers and institutions. (2) Management recognition: Constellation FM roles, Discovery/New Frontiers mission reviews. (3) Potential technology solutions: new approaches could avoid many current pitfalls, including (3a) new FM architectures, among them a model-based approach integrated with NASA's MBSE (Model-Based Systems Engineering) efforts, and (3b) NASA's Office of the Chief Technologist: FM is identified in seven of NASA's 14 Space Technology Roadmaps, an opportunity to coalesce and establish a thrust area to progressively develop new FM techniques. The FM Handbook will help ensure that future missions do not encounter the same FM-related problems as previous missions. Version 1 of the FM Handbook is a good start: (1) An Agency-wide Version 2 is still needed to expand the Handbook to other areas, especially crewed missions. (2) Outreach to other organizations is still needed to develop a common understanding and vocabulary. The Handbook doesn't, and can't, address all Workshop recommendations; how to address programmatic and infrastructure issues still needs to be identified.
Pierce, Herbert A.
2001-01-01
As of 1999, surface water collected and stored in reservoirs is the sole source of municipal water for the city of Williams. During 1996 and 1999, reservoirs reached historically low levels. Understanding the ground-water flow system is critical to managing the ground-water resources in this part of the Coconino Plateau. The nearly 1,000-meter-deep regional aquifer in the Redwall and Muav Limestones, however, makes studying or utilizing the resource difficult. Near-vertical faults and complex geologic structures control the ground-water flow system on the southwest side of the Kaibab Uplift near Williams, Arizona. To address the hydrogeologic complexities in the study area, a suite of techniques, which included aeromagnetic, gravity, square-array resistivity, and audiomagnetotelluric surveys, was applied as part of a regional study near Bill Williams Mountain. Existing well data and interpreted geophysical data were compiled and used to estimate depths to the water table and to prepare a potentiometric map. Geologic characteristics, such as secondary porosity, coefficient of anisotropy, and fracture-strike direction, were calculated at several sites to examine how these characteristics change with depth. The 14-kilometer-wide, seismically active northwestward-trending Cataract Creek and the northeastward-trending Mesa Butte Fault systems intersect near Bill Williams Mountain. Several north-south-trending faults may provide additional block faulting north and west of Bill Williams Mountain. Because of the extensive block faulting and regional folding, the volcanic and sedimentary rocks are tilted toward one or more of these faults. These faults provide near-vertical flow paths to the regional water table. The nearly radial fractures allow water that reaches the regional aquifer to move away from the Bill Williams Mountain area. Depth to the regional aquifer is highly variable and depends on location and local structures.
On the basis of interpreted audiomagnetotelluric and square-array resistivity sounding curves and limited well data, depths to water may range from 450 to 1,300 meters.
Propulsion Health Monitoring for Enhanced Safety
NASA Technical Reports Server (NTRS)
Butz, Mark G.; Rodriguez, Hector M.
2003-01-01
This report presents the results of the NASA contract Propulsion System Health Management for Enhanced Safety performed by General Electric Aircraft Engines (GE AE), General Electric Global Research (GE GR), and Pennsylvania State University Applied Research Laboratory (PSU ARL) under the NASA Aviation Safety Program. This activity supports the overall goal of enhanced civil aviation safety through a reduction in the occurrence of safety-significant propulsion system malfunctions. Specific objectives are to develop and demonstrate vibration diagnostics techniques for the on-line detection of turbine rotor disk cracks, and model-based fault tolerant control techniques for the prevention and mitigation of in-flight engine shutdown, surge/stall, and flameout events. The disk crack detection work was performed by GE GR which focused on a radial-mode vibration monitoring technique, and PSU ARL which focused on a torsional-mode vibration monitoring technique. GE AE performed the Model-Based Fault Tolerant Control work which focused on the development of analytical techniques for detecting, isolating, and accommodating gas-path faults.
Investigation of an advanced fault tolerant integrated avionics system
NASA Technical Reports Server (NTRS)
Dunn, W. R.; Cottrell, D.; Flanders, J.; Javornik, A.; Rusovick, M.
1986-01-01
Presented is an advanced, fault-tolerant multiprocessor avionics architecture as could be employed in an advanced rotorcraft such as LHX. The processor structure is designed to interface with existing digital avionics systems and concepts including the Army Digital Avionics System (ADAS) cockpit/display system, navaid and communications suites, integrated sensing suite, and the Advanced Digital Optical Control System (ADOCS). The report defines mission, maintenance and safety-of-flight reliability goals as might be expected for an operational LHX aircraft. Based on use of a modular, compact (16-bit) microprocessor card family, results of a preliminary study examining simplex, dual, and standby-sparing architectures are presented. Given the stated constraints, it is shown that the dual architecture is best suited to meet reliability goals with minimum hardware and software overhead. The report presents hardware and software design considerations for realizing the architecture, including redundancy management requirements and techniques as well as verification and validation needs and methods.
Lee, Jasper; Zhang, Jianguo; Park, Ryan; Dagliyan, Grant; Liu, Brent; Huang, H K
2012-07-01
A Molecular Imaging Data Grid (MIDG) was developed to address current informatics challenges in archival, sharing, search, and distribution of preclinical imaging studies between animal imaging facilities and investigator sites. This manuscript presents a 2nd generation MIDG replacing the Globus Toolkit with a new system architecture that implements the IHE XDS-i integration profile. Implementation and evaluation were conducted using a 3-site interdisciplinary test-bed at the University of Southern California. The 2nd generation MIDG design architecture replaces the initial design's Globus Toolkit with dedicated web services and XML-based messaging for dedicated management and delivery of multi-modality DICOM imaging datasets. The Cross-enterprise Document Sharing for Imaging (XDS-i) integration profile from the field of enterprise radiology informatics was adopted into the MIDG design because streamlined image registration, management, and distribution dataflow are likewise needed in preclinical imaging informatics systems as in enterprise PACS applications. Implementation of the MIDG is demonstrated at the University of Southern California Molecular Imaging Center (MIC) and two other sites with specified hardware, software, and network bandwidth. Evaluation of the MIDG involves data upload, download, and fault-tolerance testing scenarios using multi-modality animal imaging datasets collected at the USC Molecular Imaging Center. The upload, download, and fault-tolerance tests of the MIDG were performed multiple times using 12 collected animal study datasets. Upload and download times demonstrated reproducibility and improved real-world performance. Fault-tolerance tests showed that automated failover between Grid Node Servers has minimal impact on normal download times.
Building upon the 1st generation concepts and experiences, the 2nd generation MIDG system improves accessibility of disparate animal-model molecular imaging datasets to users outside a molecular imaging facility's LAN using a new architecture, dataflow, and dedicated DICOM-based management web services. Productivity and efficiency of preclinical research for translational sciences investigators has been further streamlined for multi-center study data registration, management, and distribution.
A fault isolation method based on the incidence matrix of an augmented system
NASA Astrophysics Data System (ADS)
Chen, Changxiong; Chen, Liping; Ding, Jianwan; Wu, Yizhong
2018-03-01
This paper proposes a new approach for isolating faults and quickly identifying the redundant sensors of a system. By introducing fault signals as additional state variables, an augmented system model is constructed from the original system model, the fault signals, and the sensor measurement equations. The structural properties of the augmented system model are presented. From the viewpoint of evaluating the fault variables, the calculating correlations of the fault variables in the system can be found, which imply the fault isolation properties of the system. Compared with previous isolation approaches, the highlights of the new approach are that it can quickly find the faults that can be isolated using exclusive residuals and, at the same time, can identify the redundant sensors in the system, both of which are useful for the design of a diagnosis system. The simulation of a four-tank system is reported to validate the proposed method.
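The structural notion of isolability used above can be illustrated with a fault-signature view of an incidence matrix: rows are residuals, columns are the fault variables of the augmented model, and a fault is isolable when its column differs from every other fault's column. The matrix below is an invented example, not the paper's four-tank model:

```python
import numpy as np

# Illustrative structural fault-signature matrix (assumed values):
# entry 1 means residual r_i is sensitive to fault f_j.
S = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 0]])   # residuals r1..r3 vs faults f1..f3

def isolable_faults(S):
    """Indices of faults whose signature (column) is unique."""
    cols = [tuple(S[:, j]) for j in range(S.shape[1])]
    return [j for j, c in enumerate(cols) if cols.count(c) == 1]

print(isolable_faults(S))  # [0, 1, 2]: every fault has a distinct signature
```

If two columns coincided, the corresponding faults could only be detected, not isolated, which is the distinction the paper's augmented-system analysis formalizes.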
Developing a Fault Management Guidebook for Nasa's Deep Space Robotic Missions
NASA Technical Reports Server (NTRS)
Fesq, Lorraine M.; Jacome, Raquel Weitl
2015-01-01
NASA designs and builds systems that achieve incredibly ambitious goals, as evidenced by the Curiosity rover traversing on Mars, the highly complex International Space Station orbiting our Earth, and the compelling plans for capturing, retrieving and redirecting an asteroid into a lunar orbit to create a nearby target to be investigated by astronauts. In order to accomplish these feats, the missions must be imbued with sufficient knowledge and capability not only to realize the goals, but also to identify and respond to off-nominal conditions. Fault Management (FM) is the discipline of establishing how a system will respond to preserve its ability to function even in the presence of faults. In 2012, NASA released a draft FM Handbook in an attempt to coalesce the field by establishing a unified terminology and a common process for designing FM mechanisms. However, FM approaches are very diverse across NASA, especially between the different mission types such as Earth orbiters, launch vehicles, deep space robotic vehicles and human spaceflight missions, and the authors were challenged to capture and represent all of these views. The authors recognized that a necessary precursor step is for each sub-community to codify its FM policies, practices and approaches in individual, focused guidebooks. Then, the sub-communities can look across NASA to better understand the different ways off-nominal conditions are addressed, and to seek commonality or at least an understanding of the multitude of FM approaches. This paper describes the development of the "Deep Space Robotic Fault Management Guidebook," which is intended to be the first of NASA's FM guidebooks. Its purpose is to be a field-guide for FM practitioners working on deep space robotic missions, as well as a planning tool for project managers. Publication of this Deep Space Robotic FM Guidebook is expected in early 2015.
The guidebook will be posted on NASA's Engineering Network on the FM Community of Practice website so that it will be available to all NASA projects. Future plans for subsequent guidebooks for the other NASA sub-communities are proposed.
Parameter Transient Behavior Analysis on Fault Tolerant Control System
NASA Technical Reports Server (NTRS)
Belcastro, Christine (Technical Monitor); Shin, Jong-Yeob
2003-01-01
In a fault tolerant control (FTC) system, a parameter-varying FTC law is reconfigured based on fault parameters estimated by fault detection and isolation (FDI) modules. FDI modules require some time to detect fault occurrences in aero-vehicle dynamics. This paper illustrates analysis of an FTC system based on estimated fault parameter transient behavior, which may include false fault detections during a short time interval. Using Lyapunov function analysis, the upper bound of an induced-L2 norm of the FTC system performance is calculated as a function of the fault detection time and the exponential decay rate of the Lyapunov function.
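The Lyapunov argument behind such induced-L2 bounds is the standard dissipation inequality; the generic form is sketched below in our notation (not necessarily the paper's exact conditions):

```latex
% If there exists V(x) = x^{\top} P x with P \succ 0 such that along
% trajectories, for disturbance w and performance output z,
\dot V(x) \;+\; z^{\top} z \;-\; \gamma^{2}\, w^{\top} w \;\le\; -2\alpha V(x),
% then integrating from 0 to T with x(0) = 0 (and V \ge 0) gives
\int_{0}^{T} z^{\top} z \,\mathrm{d}t \;\le\; \gamma^{2} \int_{0}^{T} w^{\top} w \,\mathrm{d}t ,
% i.e. the induced-L2 norm from w to z is at most \gamma, while the
% exponential decay rate \alpha of V governs the transient, e.g. during
% the interval before the FDI module reports the correct fault parameter.
```

The paper's contribution is to express the achievable bound as a function of the fault detection time together with this decay rate.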
NASA Technical Reports Server (NTRS)
Lee, Harry
1994-01-01
A highly accurate transmission line fault locator based on the traveling-wave principle was developed and successfully operated within B.C. Hydro. A transmission line fault produces a fast-risetime traveling wave at the fault point which propagates along the transmission line. This fault locator system consists of traveling wave detectors located at key substations which detect and time tag the leading edge of the fault-generated traveling wave as it passes through. A master station gathers the time-tagged information from the remote detectors and determines the location of the fault. Precise time is a key element to the success of this system. This fault locator system derives its timing from the Global Positioning System (GPS) satellites. System tests confirmed the accuracy of locating faults to within the design objective of +/-300 meters.
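The textbook two-ended traveling-wave calculation that such a master station performs can be sketched as follows (this is the generic formula, not B.C. Hydro's proprietary algorithm; the wave speed and line length are assumed values):

```python
# Two-ended traveling-wave fault location: with GPS-synchronized arrival
# times tA and tB at the two line terminals, the distance from terminal A
# is x = (L + v * (tA - tB)) / 2.
V = 2.94e8   # assumed wave speed on an overhead line, m/s (~0.98 c)

def locate_fault(L, tA, tB, v=V):
    """Distance of the fault from terminal A, in meters."""
    return (L + v * (tA - tB)) / 2.0

L = 120e3                      # 120 km line (assumed)
x_true = 45e3                  # simulated fault 45 km from terminal A
tA, tB = x_true / V, (L - x_true) / V
print(round(locate_fault(L, tA, tB) / 1e3, 3))  # -> 45.0 (km from terminal A)
```

The +/-300 m design objective quoted above corresponds to roughly a 1-microsecond timing error at this wave speed, which is why GPS-disciplined time tagging is essential.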
A footwall system of faults associated with a foreland thrust in Montana
NASA Astrophysics Data System (ADS)
Watkinson, A. J.
1993-05-01
Some recent structural geology models of faulting have promoted the idea of a rigid footwall behaviour or response under the main thrust fault, especially for fault ramps or fault-bend folds. However, a very well-exposed thrust fault in the Montana fold and thrust belt shows an intricate but well-ordered system of subsidiary minor faults in the footwall position with respect to the main thrust fault plane. Considerable shortening has occurred off the main fault in this footwall collapse zone and the distribution and style of the minor faults accord well with published patterns of aftershock foci associated with thrust faults. In detail, there appear to be geometrically self-similar fault systems from metre length down to a few centimetres. The smallest sets show both slip and dilation. The slickensides show essentially two-dimensional displacements, and three slip systems were operative—one parallel to the bedding, and two conjugate and symmetric about the bedding (acute angle of 45-50°). A reconstruction using physical analogue models suggests one possible model for the evolution and sequencing of slip of the thrust fault system.
Health Management Applications for International Space Station
NASA Technical Reports Server (NTRS)
Alena, Richard; Duncavage, Dan
2005-01-01
Traditional mission and vehicle management involves teams of highly trained specialists monitoring vehicle status and crew activities, responding rapidly to any anomalies encountered during operations. These teams work from the Mission Control Center and have access to engineering support teams with specialized expertise in International Space Station (ISS) subsystems. Integrated System Health Management (ISHM) applications can significantly augment these capabilities by providing enhanced monitoring, prognostic and diagnostic tools for critical decision support and mission management. The Intelligent Systems Division of NASA Ames Research Center is developing many prototype applications using model-based reasoning, data mining and simulation, working with Mission Control through the ISHM Testbed and Prototypes Project. This paper will briefly describe information technology that supports current mission management practice, and will extend this to a vision for future mission control workflow incorporating new ISHM applications. It will describe ISHM applications currently under development at NASA and will define technical approaches for implementing our vision of future human exploration mission management incorporating artificial intelligence and distributed web service architectures using specific examples. Several prototypes are under development, each highlighting a different computational approach. The ISStrider application allows in-depth analysis of Caution and Warning (C&W) events by correlating real-time telemetry with the logical fault trees used to define off-nominal events. The application uses live telemetry data and the Livingstone diagnostic inference engine to display the specific parameters and fault trees that generated the C&W event, allowing a flight controller to identify the root cause of the event from thousands of possibilities by simply navigating animated fault tree models on their workstation. 
SimStation models the functional power flow for the ISS Electrical Power System and can predict power balance for nominal and off-nominal conditions. SimStation uses real-time telemetry data to keep detailed computational physics models synchronized with the actual ISS power system state. In the event of failure, the application can then rapidly diagnose root cause, predict future resource levels and even correlate technical documents relevant to the specific failure. These advanced computational models will allow better insight and more precise control of ISS subsystems, increasing safety margins by speeding up anomaly resolution and reducing engineering team effort and cost. This technology will make operating ISS more efficient and is directly applicable to next-generation exploration missions and Crew Exploration Vehicles.
Autonomous control system reconfiguration for spacecraft with non-redundant actuators
NASA Astrophysics Data System (ADS)
Grossman, Walter
1995-05-01
The Small Satellite Technology Initiative (SSTI) 'CLARK' spacecraft is required to be single-failure tolerant, i.e., no failure of any single component or subsystem shall result in complete mission loss. Fault tolerance is usually achieved by implementing redundant subsystems. Fault tolerant systems are therefore heavier and cost more to build and launch than non-redundant, non-fault-tolerant spacecraft. The SSTI CLARK satellite Attitude Determination and Control System (ADACS) achieves single-fault tolerance without redundancy. The attitude determination system uses a Kalman filter which is inherently robust to loss of any single attitude sensor. The attitude control system uses three orthogonal reaction wheels for attitude control and three magnetic dipoles for momentum control. The nominal six-actuator control system functions by projecting the attitude correction torque onto the reaction wheels while a slower momentum management outer loop removes the excess momentum in the direction normal to the local B field. The actuators are not redundant, so the nominal control law cannot be implemented in the event of a loss of a single actuator (dipole or reaction wheel). The spacecraft dynamical state (attitude, angular rate, and momentum) is controllable from any five-element subset of the six actuators. With loss of an actuator the instantaneous control authority may not span R^3, but the controllability Gramian W(t) = ∫₀ᵗ Φ(t,τ) B(τ) Bᵀ(τ) Φᵀ(t,τ) dτ retains full rank. Upon detection of an actuator failure the control torque is decomposed onto the remaining active axes. The attitude control torque is effected and the over-orbit momentum is controlled. The resulting control system performance approaches that of the nominal system.
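The controllability argument above, that losing an actuator can destroy instantaneous span of R^3 while the state remains controllable over time, can be illustrated numerically. The matrices below are a toy example of this phenomenon, not the CLARK dynamics:

```python
import numpy as np

# Toy system: after an actuator failure, B spans only two axes, but the
# dynamics couple axis 2 into axis 3, so [B, AB, A^2 B] still has rank 3.
A = np.array([[0.,  0., 0.],
              [0.,  0., 1.],
              [0., -1., 0.]])          # assumed coupling between axes 2 and 3
B = np.array([[1., 0.],
              [0., 1.],
              [0., 0.]])               # two surviving actuators: axes 1 and 2

ctrb = np.hstack([B, A @ B, A @ A @ B])  # finite-dim analogue of the Gramian test
print(np.linalg.matrix_rank(B))          # 2: instantaneous torque misses axis 3
print(np.linalg.matrix_rank(ctrb))       # 3: still controllable through the dynamics
```

For linear time-invariant systems, full rank of the controllability matrix is equivalent to full rank of the Gramian W(t) quoted in the abstract, which is why this cheaper rank test suffices for the illustration.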
Simple Linux Utility for Resource Management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jette, M.
2009-09-09
SLURM is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small computer clusters. As a cluster resource manager, SLURM has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates conflicting requests for resources by managing a queue of pending work.
Development of Asset Fault Signatures for Prognostic and Health Management in the Nuclear Industry
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vivek Agarwal; Nancy J. Lybeck; Randall Bickford
2014-06-01
Proactive online monitoring in the nuclear industry is being explored using the Electric Power Research Institute’s Fleet-Wide Prognostic and Health Management (FW-PHM) Suite software. The FW-PHM Suite is a set of web-based diagnostic and prognostic tools and databases that serves as an integrated health monitoring architecture. The FW-PHM Suite has four main modules: Diagnostic Advisor, Asset Fault Signature (AFS) Database, Remaining Useful Life Advisor, and Remaining Useful Life Database. This paper focuses on development of asset fault signatures to assess the health status of generator step-up transformers and emergency diesel generators in nuclear power plants. Asset fault signatures describe the distinctive features based on technical examinations that can be used to detect a specific fault type. At the most basic level, fault signatures are comprised of an asset type, a fault type, and a set of one or more fault features (symptoms) that are indicative of the specified fault. The AFS Database is populated with asset fault signatures via a content development exercise that is based on the results of intensive technical research and on the knowledge and experience of technical experts. The developed fault signatures capture this knowledge and implement it in a standardized approach, thereby streamlining the diagnostic and prognostic process. This will support the automation of proactive online monitoring techniques in nuclear power plants to diagnose incipient faults, perform proactive maintenance, and estimate the remaining useful life of assets.
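The asset-type / fault-type / fault-features triple described above maps naturally onto a small data structure. The sketch below uses invented field names and example signatures, not the FW-PHM schema:

```python
from dataclasses import dataclass

# Minimal sketch of an asset fault signature: a fault matches when all of
# its characteristic features appear among the observed symptoms.
@dataclass(frozen=True)
class FaultSignature:
    asset_type: str
    fault_type: str
    features: frozenset

SIGNATURES = [
    FaultSignature("EDG", "bearing wear",
                   frozenset({"high vibration", "high bearing temp"})),
    FaultSignature("EDG", "injector fouling",
                   frozenset({"low cylinder pressure", "exhaust smoke"})),
]

def diagnose(asset_type, symptoms):
    """Fault types whose full feature set is present in the symptoms."""
    symptoms = set(symptoms)
    return [s.fault_type for s in SIGNATURES
            if s.asset_type == asset_type and s.features <= symptoms]

print(diagnose("EDG", {"high vibration", "high bearing temp", "exhaust smoke"}))
# -> ['bearing wear']
```

A production Diagnostic Advisor would rank partial matches rather than require all features, but the subset test shows the core signature-matching idea.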
SUMC fault tolerant computer system
NASA Technical Reports Server (NTRS)
1980-01-01
The results of the trade studies are presented. These trades cover: establishing the basic configuration, establishing the CPU/memory configuration, establishing an approach to crosstrapping interfaces, defining the requirements of the redundancy management unit (RMU), establishing a spare plane switching strategy for the fault-tolerant memory (FTM), and identifying the most cost effective way of extending the memory addressing capability beyond the 64 K-bytes (K=1024) of SUMC-II B. The results of the design are compiled in Contract End Item (CEI) Specification for the NASA Standard Spacecraft Computer II (NSSC-II), IBM 7934507. The implementation of the FTM and the memory address expansion is also described.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ali, Amjad Majid; Albert, Don; Andersson, Par
SLURM is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small computer clusters. As a cluster resource manager, SLURM has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates conflicting requests for resources by managing a queue of pending work.
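The third function described above, arbitrating conflicting resource requests through a queue of pending work, can be illustrated with a toy FIFO allocator. This is a hypothetical sketch of the idea only; it is not SLURM code and does not reflect SLURM's actual scheduling policies (backfill, priorities, partitions).

```python
# Toy illustration of queue-based arbitration of node requests,
# in the spirit of the resource-manager functions described above.
# Hypothetical sketch; not SLURM's implementation or API.
from collections import deque

class TinyScheduler:
    def __init__(self, total_nodes: int):
        self.free_nodes = total_nodes
        self.pending = deque()   # FIFO queue of (job_id, nodes_needed)
        self.running = {}        # job_id -> nodes allocated exclusively

    def submit(self, job_id: str, nodes: int):
        self.pending.append((job_id, nodes))
        self._dispatch()

    def finish(self, job_id: str):
        self.free_nodes += self.running.pop(job_id)
        self._dispatch()

    def _dispatch(self):
        # Grant exclusive node allocations strictly in submission order.
        while self.pending and self.pending[0][1] <= self.free_nodes:
            job_id, nodes = self.pending.popleft()
            self.free_nodes -= nodes
            self.running[job_id] = nodes

sched = TinyScheduler(total_nodes=4)
sched.submit("a", 3)   # runs immediately on 3 of 4 nodes
sched.submit("b", 2)   # queued: only 1 node free
sched.finish("a")      # frees 3 nodes; "b" is dispatched
```

Even this toy version shows the core arbitration property: conflicting requests never overcommit nodes, and deferred work is started as soon as resources free up.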
Integrated Approach To Design And Analysis Of Systems
NASA Technical Reports Server (NTRS)
Patterson-Hine, F. A.; Iverson, David L.
1993-01-01
Object-oriented fault-tree representation unifies evaluation of reliability and diagnosis of faults. Programming/fault tree described more fully in "Object-Oriented Algorithm For Evaluation Of Fault Trees" (ARC-12731). Augmented fault tree object contains more information than fault tree object used in quantitative analysis of reliability. Additional information needed to diagnose faults in system represented by fault tree.
Fault recovery for real-time, multi-tasking computer system
NASA Technical Reports Server (NTRS)
Hess, Richard (Inventor); Kelly, Gerald B. (Inventor); Rogers, Randy (Inventor); Stange, Kent A. (Inventor)
2011-01-01
System and methods for providing a recoverable real time multi-tasking computer system are disclosed. In one embodiment, a system comprises a real time computing environment, wherein the real time computing environment is adapted to execute one or more applications and wherein each application is time and space partitioned. The system further comprises a fault detection system adapted to detect one or more faults affecting the real time computing environment and a fault recovery system, wherein upon the detection of a fault the fault recovery system is adapted to restore a backup set of state variables.
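The recovery mechanism this abstract describes, restoring a backup set of state variables when a fault is detected, can be sketched as a simple checkpoint/rollback pattern. The class and state variables below are illustrative assumptions, not the patented design.

```python
# Hedged sketch of checkpoint-and-restore recovery: snapshot a task's
# state variables at known-good points, and roll back on fault detection.
# Purely illustrative; names and state are hypothetical.
import copy

class RecoverableTask:
    def __init__(self, state: dict):
        self.state = state
        self._backup = copy.deepcopy(state)

    def checkpoint(self):
        """Save the current state as the known-good backup set."""
        self._backup = copy.deepcopy(self.state)

    def recover(self):
        """On a detected fault, restore the backup set of state variables."""
        self.state = copy.deepcopy(self._backup)

task = RecoverableTask({"altitude": 1000, "mode": "cruise"})
task.checkpoint()
task.state["altitude"] = -999   # state corrupted by a fault
task.recover()                  # rolled back to the checkpoint
```

In a time- and space-partitioned environment, each partition would maintain its own backup set so that recovery of one application does not disturb the others.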
An Assessment of Integrated Health Management (IHM) Frameworks
DOE Office of Scientific and Technical Information (OSTI.GOV)
N. Lybeck; M. Tawfik; L. Bond
In order to meet the ever-increasing demand for energy, the United States nuclear industry is turning to life extension of existing nuclear power plants (NPPs). Economically ensuring the safe, secure, and reliable operation of aging nuclear power plants presents many challenges. The 2009 Light Water Reactor Sustainability Workshop identified online monitoring of active and structural components as essential to the better understanding and management of the challenges posed by aging nuclear power plants. Additionally, there is increasing adoption of condition-based maintenance (CBM) for active components in NPPs. These techniques provide a foundation upon which a variety of advanced online surveillance, diagnostic, and prognostic techniques can be deployed to continuously monitor and assess the health of NPP systems and components. The next step in the development of advanced online monitoring is to move beyond CBM to estimating the remaining useful life of active components using prognostic tools. Deployment of prognostic health management (PHM) on the scale of a NPP requires the use of an integrated health management (IHM) framework - a software product (or suite of products) used to manage the necessary elements needed for a complete implementation of online monitoring and prognostics. This paper provides a thoughtful look at the desirable functions and features of IHM architectures. A full PHM system involves several modules, including data acquisition, system modeling, fault detection, fault diagnostics, system prognostics, and advisory generation (operations and maintenance planning). The standards applicable to PHM applications are identified and summarized. A list of evaluation criteria for PHM software products, developed to ensure scalability of the toolset to an environment with the complexity of a NPP, is presented.
Fourteen commercially available PHM software products are identified and classified into four groups: research tools, PHM system development tools, deployable architectures, and peripheral tools.
Real-Time Model-Based Leak-Through Detection within Cryogenic Flow Systems
NASA Technical Reports Server (NTRS)
Walker, M.; Figueroa, F.
2015-01-01
The timely detection of leaks within cryogenic fuel replenishment systems is of significant importance to operators on account of the safety and economic impacts associated with material loss and operational inefficiencies. The associated loss of pressure control also affects the stability of cryogenic fluids and the ability to control their phase during replenishment operations. Current research dedicated to providing Prognostics and Health Management (PHM) coverage of such cryogenic replenishment systems has focused on the detection of leaks to atmosphere involving relatively simple model-based diagnostic approaches that, while effective, are unable to isolate the fault to specific piping system components. The authors have extended this research to focus on the detection of leaks through closed valves that are intended to isolate sections of the piping system from the flow and pressurization of cryogenic fluids. The described approach employs model-based detection of leak-through conditions based on correlations of pressure changes across isolation valves and attempts to isolate the faults to specific valves. Implementation of this capability is enabled by knowledge and information embedded in the domain model of the system. The approach has been used effectively to detect such leak-through faults during cryogenic operational testing at the Cryogenic Testbed at NASA's Kennedy Space Center.
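The core idea above, correlating pressure changes across a closed isolation valve, can be sketched as a simple residual check: with the valve closed, downstream pressure should not track upstream excursions, so a strong correlation of pressure deltas suggests leak-through. The function, thresholds, and sample data below are illustrative assumptions, not the actual domain model or its tuning.

```python
# Hedged sketch of model-based leak-through detection across a CLOSED
# isolation valve, per the correlation idea described above.
# Thresholds and data are hypothetical.
def leak_through_suspected(up_p, down_p, corr_threshold=0.9, min_delta=1.0):
    """up_p, down_p: synchronized pressure samples on either side of the valve."""
    d_up = [b - a for a, b in zip(up_p, up_p[1:])]
    d_down = [b - a for a, b in zip(down_p, down_p[1:])]
    if max(abs(x) for x in d_up) < min_delta:
        return False  # no meaningful upstream excursion to correlate with
    # Pearson correlation of the pressure deltas, computed directly:
    mu, md = sum(d_up) / len(d_up), sum(d_down) / len(d_down)
    cov = sum((x - mu) * (y - md) for x, y in zip(d_up, d_down))
    vu = sum((x - mu) ** 2 for x in d_up)
    vd = sum((y - md) ** 2 for y in d_down)
    corr = cov / (vu * vd) ** 0.5 if vu and vd else 0.0
    return corr > corr_threshold

# Downstream pressure rises in lockstep with upstream despite the closed valve:
up = [100, 110, 125, 140, 150]
down = [20, 29, 43, 57, 66]
suspected = leak_through_suspected(up, down)
```

Running the same check across every isolation valve in the piping model is what allows the fault to be localized to a specific valve rather than just flagged system-wide.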
A Generic Modeling Process to Support Functional Fault Model Development
NASA Technical Reports Server (NTRS)
Maul, William A.; Hemminger, Joseph A.; Oostdyk, Rebecca; Bis, Rachael A.
2016-01-01
Functional fault models (FFMs) are qualitative representations of a system's failure space that are used to provide a diagnostic of the modeled system. An FFM simulates the failure effect propagation paths within a system between failure modes and observation points. These models contain a significant amount of information about the system including the design, operation and off nominal behavior. The development and verification of the models can be costly in both time and resources. In addition, models depicting similar components can be distinct, both in appearance and function, when created individually, because there are numerous ways of representing the failure space within each component. Generic application of FFMs has the advantages of software code reuse: reduction of time and resources in both development and verification, and a standard set of component models from which future system models can be generated with common appearance and diagnostic performance. This paper outlines the motivation to develop a generic modeling process for FFMs at the component level and the effort to implement that process through modeling conventions and a software tool. The implementation of this generic modeling process within a fault isolation demonstration for NASA's Advanced Ground System Maintenance (AGSM) Integrated Health Management (IHM) project is presented and the impact discussed.
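The propagation behavior described above, failure effects traveling along paths from failure modes to observation points, can be sketched as reachability in a directed graph. The graph representation and the small plumbing example below are illustrative assumptions, not the AGSM project's actual modeling conventions or tooling.

```python
# Sketch of FFM-style failure effect propagation: a directed graph from
# failure modes through intermediate effects to observation points, with
# diagnosis support via simple reachability. All names are hypothetical.
from collections import defaultdict, deque

def reachable_observations(edges, failure_mode):
    """Return the observation points a failure mode's effects can reach."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    seen, queue = {failure_mode}, deque([failure_mode])
    while queue:                      # breadth-first effect propagation
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return {n for n in seen if n.startswith("obs_")}

# Hypothetical propagation paths in a small fluid-system model:
edges = [("valve_stuck", "low_flow"),
         ("low_flow", "obs_flow_meter"),
         ("low_flow", "pump_cavitation"),
         ("pump_cavitation", "obs_vibration")]
obs = reachable_observations(edges, "valve_stuck")
```

Inverting this map (from a set of triggered observation points back to the failure modes that could reach them) is the essence of the diagnostic use of an FFM, and standardized component submodels make those maps consistent across systems.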
Stafford fault system: 120 million year fault movement history of northern Virginia
Powars, David S.; Catchings, Rufus D.; Horton, J. Wright; Schindler, J. Stephen; Pavich, Milan J.
2015-01-01
The Stafford fault system, located in the mid-Atlantic coastal plain of the eastern United States, provides the most complete record of fault movement during the past ~120 m.y. across the Virginia, Washington, District of Columbia (D.C.), and Maryland region, including displacement of Pleistocene terrace gravels. The Stafford fault system is close to and aligned with the Piedmont Spotsylvania and Long Branch fault zones. The dominant southwest-northeast trend of strong shaking from the 23 August 2011, moment magnitude Mw 5.8 Mineral, Virginia, earthquake is consistent with the connectivity of these faults, as seismic energy appears to have traveled along the documented and proposed extensions of the Stafford fault system into the Washington, D.C., area. Some other faults documented in the nearby coastal plain are clearly rooted in crystalline basement faults, especially along terrane boundaries. These coastal plain faults are commonly assumed to have undergone relatively uniform movement through time, with average slip rates from 0.3 to 1.5 m/m.y. However, there were higher rates during the Paleocene–early Eocene and the Pliocene (4.4–27.4 m/m.y.), suggesting that slip occurred primarily during large earthquakes. Further investigation of the Stafford fault system is needed to understand potential earthquake hazards for the Virginia, Maryland, and Washington, D.C., area. The combined Stafford fault system and aligned Piedmont faults are ~180 km long, so if the combined fault system ruptured in a single event, it would result in a significantly larger magnitude earthquake than the Mineral earthquake. Many structures most strongly affected during the Mineral earthquake are along or near the Stafford fault system and its proposed northeastward extension.
NASA Astrophysics Data System (ADS)
Katopody, D. T.; Oldow, J. S.
2015-12-01
The northwest-striking Furnace Creek - Fish Lake Valley (FC-FLV) fault system stretches for >250 km from southeastern California to western Nevada, forms the eastern boundary of the northern segment of the Eastern California Shear Zone, and has contemporary displacement. The FC-FLV fault system initiated in the mid-Miocene (10-12 Ma) and shows a south to north decrease in displacement from a maximum of 75-100 km to less than 10 km. Coeval elongation by extension on north-northeast striking faults within the adjoining blocks to the FC-FLV fault both supply and remove cumulative displacement measured at the northern end of the transcurrent fault system. Elongation and displacement transfer in the eastern block, constituting the southern Walker Lane of western Nevada, exceeds that of the western block and results in the net south to north decrease in displacement on the FC-FLV fault system. Elongation in the eastern block is accommodated by late Miocene to Pliocene detachment faulting followed by extension on superposed, east-northeast striking, high-angle structures. Displacement transfer from the FC-FLV fault system to the northwest-trending faults of the central Walker Lane to the north is accomplished by motion on a series of west-northwest striking transcurrent faults, named the Oriental Wash, Sylvania Mountain, and Palmetto Mountain fault systems. The west-northwest striking transcurrent faults cross-cut earlier detachment structures and are kinematically linked to east-northeast high-angle extensional faults. The transcurrent faults are mapped along strike for 60 km to the east, where they merge with north-northwest faults forming the eastern boundary of the southern Walker Lane. The west-northwest trending transcurrent faults have 30-35 km of cumulative left-lateral displacement and are a major contributor to the decrease in right-lateral displacement on the FC-FLV fault system.
Current Fault Management Trends in NASA's Planetary Spacecraft
NASA Technical Reports Server (NTRS)
Fesq, Lorraine M.
2009-01-01
The key product of this three-day workshop is a NASA White Paper that documents lessons learned from previous missions, recommended best practices, and future opportunities for investments in the fault management domain. This paper summarizes the findings and recommendations that are captured in the White Paper.
Evaluating the Effect of Integrated System Health Management on Mission Effectiveness
2013-03-01
[OV-5a activity-diagram residue removed; recoverable labels: Health Status, Fault Detection, IMS Commands.] ...UAS to self-detect, isolate, and diagnose system health problems. Current flight avionics architectures may include lower-level sub-system health monitoring or may isolate health monitoring functions to a black box configuration, but a vehicle-wide health monitoring information system has
Fleet-Wide Prognostic and Health Management Suite: Asset Fault Signature Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vivek Agarwal; Nancy J. Lybeck; Randall Bickford
Proactive online monitoring in the nuclear industry is being explored using the Electric Power Research Institute’s Fleet-Wide Prognostic and Health Management (FW-PHM) Suite software. The FW-PHM Suite is a set of web-based diagnostic and prognostic tools and databases that serves as an integrated health monitoring architecture. The FW-PHM Suite has four main modules: (1) Diagnostic Advisor, (2) Asset Fault Signature (AFS) Database, (3) Remaining Useful Life Advisor, and (4) Remaining Useful Life Database. The paper focuses on the AFS Database of the FW-PHM Suite, which is used to catalog asset fault signatures. A fault signature is a structured representation of the information that an expert would use to first detect and then verify the occurrence of a specific type of fault. The fault signatures developed to assess the health status of generator step-up transformers are described in the paper. The developed fault signatures capture this knowledge and implement it in a standardized approach, thereby streamlining the diagnostic and prognostic process. This will support the automation of proactive online monitoring techniques in nuclear power plants to diagnose incipient faults, perform proactive maintenance, and estimate the remaining useful life of assets.
Multiple incipient sensor faults diagnosis with application to high-speed railway traction devices.
Wu, Yunkai; Jiang, Bin; Lu, Ningyun; Yang, Hao; Zhou, Yang
2017-03-01
This paper deals with the problem of incipient fault diagnosis for a class of Lipschitz nonlinear systems with sensor biases and explores further results of total measurable fault information residual (ToMFIR). Firstly, state and output transformations are introduced to transform the original system into two subsystems. The first subsystem is subject to system disturbances and free from sensor faults, while the second subsystem contains sensor faults but without any system disturbances. Sensor faults in the second subsystem are then formed as actuator faults by using a pseudo-actuator based approach. Since the effects of system disturbances on the residual are completely decoupled, multiple incipient sensor faults can be detected by constructing ToMFIR, and the fault detectability condition is then derived for discriminating the detectable incipient sensor faults. Further, a sliding-mode observers (SMOs) based fault isolation scheme is designed to guarantee accurate isolation of multiple sensor faults. Finally, simulation results conducted on a CRH2 high-speed railway traction device are given to demonstrate the effectiveness of the proposed approach. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
Model-based reasoning for power system management using KATE and the SSM/PMAD
NASA Technical Reports Server (NTRS)
Morris, Robert A.; Gonzalez, Avelino J.; Carreira, Daniel J.; Mckenzie, F. D.; Gann, Brian
1993-01-01
The overall goal of this research effort has been the development of a software system which automates tasks related to monitoring and controlling electrical power distribution in spacecraft electrical power systems. The resulting software system is called the Intelligent Power Controller (IPC). The specific tasks performed by the IPC include continuous monitoring of the flow of power from a source to a set of loads, fast detection of anomalous behavior indicating a fault to one of the components of the distribution systems, generation of diagnosis (explanation) of anomalous behavior, isolation of faulty object from remainder of system, and maintenance of flow of power to critical loads and systems (e.g. life-support) despite fault conditions being present (recovery). The IPC system has evolved out of KATE (Knowledge-based Autonomous Test Engineer), developed at NASA-KSC. KATE consists of a set of software tools for developing and applying structure and behavior models to monitoring, diagnostic, and control applications.
NASA Astrophysics Data System (ADS)
Madden, E. H.; McBeck, J.; Cooke, M. L.
2013-12-01
Over multiple earthquake cycles, strike-slip faults link to form through-going structures, as demonstrated by the continuous nature of the mature San Andreas fault system in California relative to the younger and more segmented San Jacinto fault system nearby. Despite its immaturity, the San Jacinto system accommodates between one third and one half of the slip along the boundary between the North American and Pacific plates. It therefore poses a significant seismic threat to southern California. Better understanding of how the San Jacinto system has evolved over geologic time and of current interactions between faults within the system is critical to assessing this seismic hazard accurately. Numerical models are well suited to simulating kilometer-scale processes, but models of fault system development are challenged by the multiple physical mechanisms involved. For example, laboratory experiments on brittle materials show that faults propagate and eventually join (hard-linkage) by both opening-mode and shear failure. In addition, faults interact prior to linkage through stress transfer (soft-linkage). The new algorithm GROW (GRowth by Optimization of Work) accounts for this complex array of behaviors by taking a global approach to fault propagation while adhering to the principles of linear elastic fracture mechanics. This makes GROW a powerful tool for studying fault interactions and fault system development over geologic time. In GROW, faults evolve to minimize the work (or energy) expended during deformation, thereby maximizing the mechanical efficiency of the entire system. Furthermore, the incorporation of both static and dynamic friction allows GROW models to capture fault slip and fault propagation in single earthquakes as well as over consecutive earthquake cycles. GROW models with idealized faults reveal that the initial fault spacing and the applied stress orientation control fault linkage propensity and linkage patterns.
These models allow the gains in efficiency provided by both hard-linkage and soft-linkage to be quantified and compared. Specialized models of interactions over the past 1 Ma between the Clark and Coyote Creek faults within the San Jacinto system reveal increasing mechanical efficiency as these fault structures change over time. Alongside this increasing efficiency is an increasing likelihood for single, larger earthquakes that rupture multiple fault segments. These models reinforce the sensitivity of mechanical efficiency to both fault structure and the regional tectonic stress orientation controlled by plate motions and provide insight into how slip may have been partitioned between the San Andreas and San Jacinto systems over the past 1 Ma.
Expert Systems for United States Navy Shore Facilities Utility Operations.
1988-03-01
of expertise when assessing the applicability of an expert system. Each of the tasks was similarly ranked to reflect subjective judgement on the...United States Navy Shore Facilities Utility Operations ABSTRACT A technology assessment of expert systems as they might be used in Navy utility...of these applications include design, fault diagnoses, training, data base management, and real-time monitoring. An assessment is given of each
The continuation of the Kazerun fault system across the Sanandaj-Sirjan zone (Iran)
NASA Astrophysics Data System (ADS)
Safaei, Homayon
2009-08-01
The Kazerun (or Kazerun-Qatar) fault system is a north-trending dextral strike-slip fault zone in the Zagros mountain belt of Iran. It probably originated as a structure in the Panafrican basement. This fault system played an important role in the sedimentation and deformation of the Phanerozoic cover sequence and is still seismically active. No previous studies have reported the continuation of this important and ancient fault system northward across the Sanandaj-Sirjan zone. The Isfahan fault system is a north-trending dextral strike-slip fault across the Sanandaj-Sirjan zone that passes west of Isfahan city and is here recognized for the first time. This important fault system is about 220 km long and is seismically active in the basement as well as the sedimentary cover sequence. This fault system terminates to the south near the Main Zagros Thrust and to the north at the southern boundary of the Urumieh-Dokhtar zone. The Isfahan fault system is the boundary between the northern and southern parts of Sanandaj-Sirjan zone, which have fundamentally different stratigraphy, petrology, geomorphology, and geodynamic histories. Similarities in the orientations, kinematics, and geologic histories of the Isfahan and Kazerun faults and the way they affect the magnetic basement suggest that they are related. In fact, the Isfahan fault is a continuation of the Kazerun fault across the Sanandaj-Sirjan zone that has been offset by about 50 km of dextral strike-slip displacement along the Main Zagros Thrust.
Application Research of Fault Tree Analysis in Grid Communication System Corrective Maintenance
NASA Astrophysics Data System (ADS)
Wang, Jian; Yang, Zhenwei; Kang, Mei
2018-01-01
This paper attempts to apply the fault tree analysis method to corrective maintenance of grid communication systems. Based on a fault tree model of a typical system and on engineering experience, fault tree analysis theory is used to analyze the model, covering structural functions, probability importance, and related measures. The results show that fault tree analysis enables fast fault localization and effective repair of the system. The analysis also offers guidance for reliability research and upgrading of the system.
Analysis of typical fault-tolerant architectures using HARP
NASA Technical Reports Server (NTRS)
Bavuso, Salvatore J.; Bechta Dugan, Joanne; Trivedi, Kishor S.; Rothmann, Elizabeth M.; Smith, W. Earl
1987-01-01
Difficulties encountered in the modeling of fault-tolerant systems are discussed. The Hybrid Automated Reliability Predictor (HARP) approach to modeling fault-tolerant systems is described. The HARP is written in FORTRAN, consists of nearly 30,000 lines of code and comments, and is based on behavioral decomposition. Using the behavioral decomposition, the dependability model is divided into fault-occurrence/repair and fault/error-handling models; the characteristics and combining of these two models are examined. Examples in which the HARP is applied to the modeling of some typical fault-tolerant systems, including a local-area network, two fault-tolerant computer systems, and a flight control system, are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gao, Qing, E-mail: qing.gao.chance@gmail.com; Dong, Daoyi, E-mail: daoyidong@gmail.com; Petersen, Ian R., E-mail: i.r.petersen@gmai.com
The purpose of this paper is to solve the fault tolerant filtering and fault detection problem for a class of open quantum systems driven by a continuous-mode bosonic input field in single photon states when the systems are subject to stochastic faults. Optimal estimates of both the system observables and the fault process are simultaneously calculated and characterized by a set of coupled recursive quantum stochastic differential equations.
NASA ground terminal communication equipment automated fault isolation expert systems
NASA Technical Reports Server (NTRS)
Tang, Y. K.; Wetzel, C. R.
1990-01-01
The prototype expert systems are described that diagnose the Distribution and Switching System I and II (DSS1 and DSS2), Statistical Multiplexers (SM), and Multiplexer and Demultiplexer systems (MDM) at the NASA Ground Terminal (NGT). A system level fault isolation expert system monitors the activities of a selected data stream, verifies that the fault exists in the NGT and identifies the faulty equipment. Equipment level fault isolation expert systems are invoked to isolate the fault to a Line Replaceable Unit (LRU) level. Input and sometimes output data stream activities for the equipment are available. The system level fault isolation expert system compares the equipment input and output status for a data stream and performs loopback tests (if necessary) to isolate the faulty equipment. The equipment level fault isolation system utilizes the process of elimination and/or the maintenance personnel's fault isolation experience stored in its knowledge base. The DSS1, DSS2 and SM fault isolation systems, using the knowledge of the current equipment configuration and the equipment circuitry, issue a set of test connections according to the predefined rules. The faulty component or board can be identified by the expert system by analyzing the test results. The MDM fault isolation system correlates the failure symptoms with the faulty component based on maintenance personnel experience. The faulty component can be determined by knowing the failure symptoms. The DSS1, DSS2, SM, and MDM equipment simulators are implemented in PASCAL. The DSS1 fault isolation expert system was converted to C language from VP-Expert and integrated into the NGT automation software for offline switch diagnoses. Potentially, the NGT fault isolation algorithms can be used for the DSS1, SM, and MDM located at Goddard Space Flight Center (GSFC).
Mission Management Computer Software for RLV-TD
NASA Astrophysics Data System (ADS)
Manju, C. R.; Joy, Josna Susan; Vidya, L.; Sheenarani, I.; Sruthy, C. N.; Viswanathan, P. C.; Dinesh, Sudin; Jayalekshmy, L.; Karuturi, Kesavabrahmaji; Sheema, E.; Syamala, S.; Unnikrishnan, S. Manju; Ali, S. Akbar; Paramasivam, R.; Sheela, D. S.; Shukkoor, A. Abdul; Lalithambika, V. R.; Mookiah, T.
2017-12-01
The Mission Management Computer (MMC) software is responsible for the autonomous navigation, sequencing, guidance and control of the Re-usable Launch Vehicle (RLV), through lift-off, ascent, coasting, re-entry, controlled descent and splashdown. A hard real-time system has been designed for handling the mission requirements in an integrated manner and for meeting the stringent timing constraints. Redundancy management and fault-tolerance techniques are also built into the system, in order to achieve a successful mission even in presence of component failures. This paper describes the functions and features of the components of the MMC software which has accomplished the successful RLV-Technology Demonstrator mission.
Expert System Detects Power-Distribution Faults
NASA Technical Reports Server (NTRS)
Walters, Jerry L.; Quinn, Todd M.
1994-01-01
Autonomous Power Expert (APEX) computer program is prototype expert-system program detecting faults in electrical-power-distribution system. Assists human operators in diagnosing faults and deciding what adjustments or repairs needed for immediate recovery from faults or for maintenance to correct initially nonthreatening conditions that could develop into faults. Written in Lisp.
Nouri.Gharahasanlou, Ali; Mokhtarei, Ashkan; Khodayarei, Aliasqar; Ataei, Mohammad
2014-01-01
Evaluating and analyzing risk in the mining industry is a new approach for improving machinery performance. Reliability, safety, and maintenance management based on risk analysis can enhance the overall availability and utilization of mining technological systems. This study investigates the failure occurrence probability of the crushing and mixing bed hall department at the Azarabadegan Khoy cement plant by using the fault tree analysis (FTA) method. The results of the analysis over a 200 h operating interval show that the probability of failure occurrence for the crushing system, the conveyor system, and the crushing and mixing bed hall department is 73, 64, and 95 percent, respectively, and the conveyor belt subsystem was found to be the most likely to fail. Finally, maintenance is proposed as a method to control and prevent the occurrence of failures. PMID:26779433
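The gate arithmetic underlying a fault tree analysis like the one above can be sketched with the standard OR/AND formulas for independent basic events. The gate structure below is an illustrative assumption, not the authors' actual tree: OR-ing just the two reported subsystem probabilities gives about 90 percent, slightly below the reported 95 percent for the department, which evidently reflects additional basic events in the full model.

```python
# Minimal fault-tree gate arithmetic, assuming independent basic events.
# The subsystem probabilities are the abstract's reported values; the
# two-input OR gate is a hypothetical simplification of the real tree.
def p_or(*ps):
    """Top event occurs if ANY input event occurs: 1 - prod(1 - p_i)."""
    q = 1.0
    for p in ps:
        q *= (1.0 - p)
    return 1.0 - q

def p_and(*ps):
    """Top event occurs only if ALL input events occur: prod(p_i)."""
    q = 1.0
    for p in ps:
        q *= p
    return q

p_crushing = 0.73
p_conveyor = 0.64
# Probability that at least one of the two subsystems fails in the interval:
p_either = p_or(p_crushing, p_conveyor)  # 1 - 0.27 * 0.36 = 0.9028
```

In a full FTA each basic-event probability would itself come from failure-rate data, e.g. P = 1 - exp(-lambda * t) for a 200 h interval, before being combined up through the gates.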
Monitoring and decision making by people in man machine systems
NASA Technical Reports Server (NTRS)
Johannsen, G.
1979-01-01
The analysis of human monitoring and decision making behavior as well as its modeling are described. Classical and optimal-control-theoretic monitoring models are surveyed. The relationship between attention allocation and eye movements is discussed. As an example of applications, the evaluation of predictor displays by means of the optimal control model is explained. Fault detection involving continuous signals and decision making behavior of a human operator engaged in fault diagnosis during different operation and maintenance situations are illustrated. Computer aided decision making is considered as a queueing problem. It is shown to what extent computer aids can be based on the state of human activity as measured by psychophysiological quantities. Finally, management information systems for different application areas are mentioned. The possibilities of mathematical modeling of human behavior in complex man machine systems are also critically assessed.
Nouri Gharahasanlou, Ali; Mokhtarei, Ashkan; Khodayarei, Aliasqar; Ataei, Mohammad
2014-04-01
Evaluating and analyzing risk in the mining industry is a new approach for improving machinery performance. Reliability, safety, and maintenance management based on risk analysis can enhance the overall availability and utilization of mining technological systems. This study investigates the failure occurrence probability of the crushing and mixing bed hall department at the Azarabadegan Khoy cement plant by using the fault tree analysis (FTA) method. The results of the analysis over a 200 h operating interval show that the probability of failure occurrence for the crushing system, the conveyor system, and the crushing and mixing bed hall department is 73, 64, and 95 percent, respectively, and the conveyor belt subsystem was found to be the most likely to fail. Finally, maintenance is proposed as a method to control and prevent the occurrence of failures.
A Classification of Management Teachers
ERIC Educational Resources Information Center
Walker, Bob
1974-01-01
There are many classifications of management teachers today. Each has his style, successes, and faults. Some of the more prominent are: the company man, the management technician, the man of principle, the evangelist, and the entrepreneur. A mixture of these classifications would be ideal since each by itself has its faults. (DS)
Results of an electrical power system fault study (CDDF)
NASA Technical Reports Server (NTRS)
Dugal-Whitehead, N. R.; Johnson, Y. B.
1993-01-01
This report gives the results of an electrical power system fault study which has been conducted over the last 2 and one-half years. First, the results of the literature search into electrical power system faults in space and terrestrial power system applications are reported. A description of the intended implementations of the power system faults into the Large Autonomous Spacecraft Electrical Power System (LASEPS) breadboard is then presented. Then, the actual implementation of the faults into the breadboard is discussed along with a discussion describing the LASEPS breadboard. Finally, the results of the injected faults and breadboard failures are discussed.
Multiple Fault Isolation in Redundant Systems
NASA Technical Reports Server (NTRS)
Pattipati, Krishna R.; Patterson-Hine, Ann; Iverson, David
1997-01-01
Fault diagnosis in large-scale systems that are products of modern technology presents formidable challenges to manufacturers and users. This is due to the large number of failure sources in such systems and the need to quickly isolate and rectify failures with minimal downtime. In addition, for fault-tolerant systems and systems with infrequent opportunity for maintenance (e.g., the Hubble telescope, the space station), the assumption of at most a single fault in the system is unrealistic. In this project, we have developed novel block and sequential diagnostic strategies to isolate multiple faults in the shortest possible time without making the unrealistic single-fault assumption.
NASA Technical Reports Server (NTRS)
2006-01-01
This full-frame image from the High Resolution Imaging Science Experiment camera on NASA's Mars Reconnaissance Orbiter shows faults and pits in Mars' north polar residual cap that have not been previously recognized. The faults and depressions between them are similar to features seen on Earth where the crust is being pulled apart. Such tectonic extension must have occurred very recently because the north polar residual cap is very young, as indicated by the paucity of impact craters on its surface. Alternatively, the faults and pits may be caused by collapse due to removal of material beneath the surface. The pits are aligned along the faults, either because material has drained into the subsurface along the faults or because gas has escaped from the subsurface through them. NASA's Jet Propulsion Laboratory, a division of the California Institute of Technology in Pasadena, manages the Mars Reconnaissance Orbiter for NASA's Science Mission Directorate, Washington. Lockheed Martin Space Systems, Denver, is the prime contractor for the project and built the spacecraft. The High Resolution Imaging Science Experiment is operated by the University of Arizona, Tucson, and the instrument was built by Ball Aerospace and Technology Corp., Boulder, Colo.
Developing RCM Strategy for Hydrogen Fuel Cells Utilizing On Line E-Condition Monitoring
NASA Astrophysics Data System (ADS)
Baglee, D.; Knowles, M. J.
2012-05-01
Fuel cell vehicles are considered to be a viable solution to problems such as carbon emissions and fuel shortages for road transport. Proton Exchange Membrane (PEM) fuel cells are mainly used for this purpose because they can run at low temperatures and have a simple structure. Yet high maintenance costs and the inherent dangers of maintaining equipment using hydrogen are two main issues which need to be addressed. Appropriate and efficient strategies for fuel cell maintenance are currently lacking. A Reliability Centered Maintenance (RCM) approach offers considerable benefit to the management of fuel cell maintenance since it includes the identification and consideration of the impact of critical components. Technological developments in e-maintenance systems, radio-frequency identification (RFID), and personal digital assistants (PDAs) have proven to satisfy the increasing demand for improved reliability, efficiency, and safety. RFID technology is used to store and remotely retrieve electronic maintenance data in order to provide instant access to up-to-date, accurate, and detailed information. The aim is to support fuel cell maintenance decisions by developing and applying a blend of leading-edge communications and sensor technology, including RFID. The purpose of this paper is to review and present the state of the art in fuel cell condition monitoring and maintenance utilizing RCM and RFID technologies. Using an RCM analysis, critical components and fault modes are identified. RFID tags are used to store the critical information, possible faults, and their cause and effect. The relationships between causes, faults, symptoms, and long-term implications of fault conditions are summarized. Finally, conclusions are drawn regarding suggested maintenance strategies and the optimal structure for an integrated, cost-effective condition monitoring and maintenance management system.
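An RCM-style criticality ranking of fault modes can be sketched with Risk Priority Numbers (RPN = severity × occurrence × detection difficulty). The fault modes and scores below are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of ranking fuel-cell fault modes by RPN (made-up scores).

fault_modes = [
    # (fault mode, severity, occurrence, detection difficulty), each on a 1-10 scale
    ("membrane dehydration",   8, 5, 6),
    ("catalyst degradation",   7, 4, 7),
    ("hydrogen leak",         10, 2, 4),
    ("coolant pump failure",   6, 6, 3),
]

# Higher RPN -> higher maintenance priority.
ranked = sorted(fault_modes, key=lambda m: m[1] * m[2] * m[3], reverse=True)

for name, s, o, d in ranked:
    print(f"{name:22s} RPN = {s * o * d}")
```

In an RFID-backed scheme like the one described, each tag could carry the mode's scores and cause/effect notes so the ranking can be recomputed in the field.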
Simultaneous Sensor and Process Fault Diagnostics for Propellant Feed System
NASA Technical Reports Server (NTRS)
Cao, J.; Kwan, C.; Figueroa, F.; Xu, R.
2006-01-01
The main objective of this research is to extract fault features from sensor faults and process faults by using advanced fault detection and isolation (FDI) algorithms. A tank system that shares some characteristics with a NASA testbed at Stennis Space Center was used to verify the proposed algorithms. First, a generic tank system was modeled. Second, a mathematical model suitable for FDI was derived for the tank system. Third, a new and general FDI procedure was designed to distinguish process faults from sensor faults. Extensive simulations clearly demonstrated the advantages of the new design.
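The residual-based detection idea underlying such FDI schemes can be illustrated on a single tank. The model, parameter values, and fault magnitude below are assumptions for illustration, not the paper's testbed: the level obeys a mass balance, and a sensor bias makes the measurement diverge from the model prediction.

```python
# Illustrative residual-based sensor-fault detection for one tank
# (assumed model: A * dh/dt = q_in - q_out).

A = 2.0                     # tank cross-section [m^2] (assumed)
dt = 1.0                    # time step [s]
q_in, q_out = 0.30, 0.10    # inflow/outflow [m^3/s] (assumed constant)

h_true = 1.0                # actual level [m]
h_model = 1.0               # model-predicted level [m]
threshold = 0.05            # residual alarm threshold [m] (assumed)

alarms = []
for t in range(20):
    h_true += dt * (q_in - q_out) / A
    h_model += dt * (q_in - q_out) / A
    # inject a +0.2 m sensor bias fault starting at step 10
    measured = h_true + (0.2 if t >= 10 else 0.0)
    residual = abs(measured - h_model)
    if residual > threshold:
        alarms.append(t)

print("first alarm at step:", alarms[0])
```

A process fault (e.g. a leak changing `q_out`) would instead make the *true* level drift away from the model, so structured residuals can separate the two fault classes.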
NASA Technical Reports Server (NTRS)
Brunelle, J. E.; Eckhardt, D. E., Jr.
1985-01-01
Results are presented of an experiment conducted in the NASA Avionics Integrated Research Laboratory (AIRLAB) to investigate the implementation of fault-tolerant software techniques on fault-tolerant computer architectures, in particular the Software Implemented Fault Tolerance (SIFT) computer. The N-version programming and recovery block techniques were implemented on a portion of the SIFT operating system. The results indicate that, to effectively implement fault-tolerant software design techniques, system requirements will be impacted and suggest that retrofitting fault-tolerant software on existing designs will be inefficient and may require system modification.
Making intelligent systems team players: Additional case studies
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Schreckenghost, Debra L.; Rhoads, Ron W.
1993-01-01
Observations from a case study of intelligent systems are reported as part of a multi-year interdisciplinary effort to provide guidance and assistance for designers of intelligent systems and their user interfaces. A series of studies were conducted to investigate issues in designing intelligent fault management systems in aerospace applications for effective human-computer interaction. The results of the initial study are documented in two NASA technical memoranda: TM 104738, Making Intelligent Systems Team Players: Case Studies and Design Issues, Volumes 1 and 2; and TM 104751, Making Intelligent Systems Team Players: Overview for Designers. The objective of this additional study was to broaden the investigation of human-computer interaction design issues beyond the initial study's focus on monitoring and fault detection. The results of this second study are documented in this report, which is intended as a supplement to the original design guidance documents. These results should be of interest to designers of intelligent systems for use in real-time operations and to researchers in the areas of human-computer interaction and artificial intelligence.
NASA Astrophysics Data System (ADS)
Ye, Jiyang; Liu, Mian
2017-08-01
In Southern California, the Pacific-North America relative plate motion is accommodated by the complex southern San Andreas Fault system that includes many young faults (<2 Ma). The initiation of these young faults and their impact on strain partitioning and fault slip rates are important for understanding the evolution of this plate boundary zone and assessing earthquake hazard in Southern California. Using a three-dimensional viscoelastoplastic finite element model, we have investigated how this plate boundary fault system has evolved to accommodate the relative plate motion in Southern California. Our results show that when the plate boundary faults are not optimally configured to accommodate the relative plate motion, strain is localized in places where new faults would initiate to improve the mechanical efficiency of the fault system. In particular, the Eastern California Shear Zone, the San Jacinto Fault, the Elsinore Fault, and the offshore dextral faults all developed in places of highly localized strain. These younger faults compensate for the reduced fault slip on the San Andreas Fault proper because of the Big Bend, a major restraining bend. The evolution of the fault system changes the apportionment of fault slip rates over time, which may explain some of the slip rate discrepancy between geological and geodetic measurements in Southern California. For the present fault configuration, our model predicts localized strain in western Transverse Ranges and along the dextral faults across the Mojave Desert, where numerous damaging earthquakes occurred in recent years.
Advanced information processing system: Fault injection study and results
NASA Technical Reports Server (NTRS)
Burkhardt, Laura F.; Masotto, Thomas K.; Lala, Jaynarayan H.
1992-01-01
The objective of the AIPS program is to achieve a validated fault-tolerant distributed computer system. The goals of the AIPS fault injection study were: (1) to present the fault injection study components addressing the AIPS validation objective; (2) to obtain feedback for fault removal from the design implementation; (3) to obtain statistical data regarding fault detection, isolation, and reconfiguration responses; and (4) to obtain data regarding the effects of faults on system performance. The parameters that must be varied to create a comprehensive set of fault injection tests are described, along with the subset of test cases selected, the test case measurements, and the test case execution. Both pin-level hardware faults, using a hardware fault injector, and software-injected memory mutations were used to test the system. An overview is provided of the hardware fault injector and the associated software used to carry out the experiments. Detailed specifications of the faults and the test results are given for the I/O Network and the AIPS Fault Tolerant Processor. The results are summarized and conclusions are given.
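A software-injected memory mutation of the kind mentioned above is, at its simplest, a deliberate bit flip. The sketch below is illustrative (the function names and the parity-based detection scheme are assumptions, not the AIPS implementation); it shows why a single-bit mutation is always caught by a parity check.

```python
# Sketch of software-injected memory mutation via a single bit flip.

def parity(word: int) -> int:
    """Even-parity bit over the word's binary representation."""
    return bin(word).count("1") % 2

def inject_bit_flip(word: int, bit: int) -> int:
    """Emulate a memory mutation by flipping one bit of the stored word."""
    return word ^ (1 << bit)

original = 0xDEADBEEF
stored_parity = parity(original)      # recorded when the word was written

faulty = inject_bit_flip(original, bit=7)

# A single-bit flip always changes the number of 1-bits by one,
# so the parity no longer matches and the fault is detected.
detected = parity(faulty) != stored_parity
print("fault detected:", detected)
```

Multi-bit mutations can evade a single parity bit, which is one reason real systems layer stronger checks (CRCs, voting) on top.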
NASA Astrophysics Data System (ADS)
Hassanabadi, Amir Hossein; Shafiee, Masoud; Puig, Vicenc
2018-01-01
In this paper, sensor fault diagnosis of a singular delayed linear parameter varying (LPV) system is considered. In the considered system, the model matrices depend on parameters which are measurable in real time. The case of inexact parameter measurements is considered, which is closer to real situations. Fault diagnosis in this system is achieved via fault estimation. For this purpose, an augmented system is created by including the sensor faults as additional system states. Then, an unknown input observer (UIO) is designed which estimates both the system states and the faults in the presence of measurement noise, disturbances, and the uncertainty induced by the inexactly measured parameters. The error dynamics and the original system constitute an uncertain system due to the inconsistencies between the real and measured values of the parameters. The robust estimation of the system states and the faults is then achieved with H∞ performance and formulated as a set of linear matrix inequalities (LMIs). The designed UIO is also applicable to fault diagnosis of singular delayed LPV systems with unmeasurable scheduling variables. The efficiency of the proposed approach is illustrated with an example.
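The fault-augmentation step described above is a standard construction; the sketch below uses illustrative symbols (the matrix names and the slowly-varying-fault assumption are choices made here, not necessarily the paper's exact formulation).

```latex
% Singular delayed LPV system with sensor fault $f(t)$:
\begin{aligned}
E\,\dot{x}(t) &= A(\rho)\,x(t) + A_d(\rho)\,x(t-\tau) + B(\rho)\,u(t) + d(t),\\
y(t) &= C\,x(t) + F\,f(t) + n(t).
\end{aligned}
% Augment the state with the fault, assuming it varies slowly ($\dot{f}\approx 0$):
\begin{aligned}
\bar{x} &= \begin{bmatrix} x \\ f \end{bmatrix}, &
\bar{E} &= \begin{bmatrix} E & 0 \\ 0 & I \end{bmatrix}, &
\bar{A}(\rho) &= \begin{bmatrix} A(\rho) & 0 \\ 0 & 0 \end{bmatrix}, &
\bar{C} &= \begin{bmatrix} C & F \end{bmatrix},
\end{aligned}
% with $\bar{A}_d(\rho)=\mathrm{diag}(A_d(\rho),0)$, $\bar{B}(\rho)=[B(\rho);\,0]$,
% $\bar{d}=[d;\,0]$, giving the augmented descriptor system
\begin{aligned}
\bar{E}\,\dot{\bar{x}}(t) &= \bar{A}(\rho)\,\bar{x}(t)
  + \bar{A}_d(\rho)\,\bar{x}(t-\tau) + \bar{B}(\rho)\,u(t) + \bar{d}(t),\\
y(t) &= \bar{C}\,\bar{x}(t) + n(t),
\end{aligned}
% so a UIO estimating $\bar{x}$ recovers both $\hat{x}$ and the fault $\hat{f}$.
```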
Hierarchical Simulation to Assess Hardware and Software Dependability
NASA Technical Reports Server (NTRS)
Ries, Gregory Lawrence
1997-01-01
This thesis presents a method for conducting hierarchical simulations to assess system hardware and software dependability. The method is intended to model embedded microprocessor systems. A key contribution of the thesis is the idea of using fault dictionaries to propagate fault effects upward from the level of abstraction where a fault model is assumed to the system level where the ultimate impact of the fault is observed. A second important contribution is the analysis of the software behavior under faults as well as the hardware behavior. The simulation method is demonstrated and validated in four case studies analyzing Myrinet, a commercial, high-speed networking system. One key result from the case studies shows that the simulation method predicts the same fault impact 87.5% of the time as is obtained by similar fault injections into a real Myrinet system. Reasons for the remaining discrepancy are examined in the thesis. A second key result shows the reduction in the number of simulations needed due to the fault dictionary method. In one case study, 500 faults were injected at the chip level, but only 255 propagated to the system level. Of these 255 faults, 110 shared identical fault dictionary entries at the system level and so did not need to be resimulated. The necessary number of system-level simulations was therefore reduced from 500 to 145. Finally, the case studies show how the simulation method can be used to improve the dependability of the target system. The simulation analysis was used to add recovery to the target software for the most common fault propagation mechanisms that would cause the software to hang. After the modification, the number of hangs was reduced by 60% for fault injections into the real system.
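The fault-dictionary bookkeeping described above can be sketched in a few lines. The fault names and signature strings below are made up for illustration; the point is that faults whose effects collapse to the same system-level dictionary entry share one system-level simulation.

```python
# Sketch of fault-dictionary deduplication: group propagated chip-level
# faults by their system-level signature, then simulate each signature once.

from collections import defaultdict

# fault id -> system-level signature, or None if the fault's effects
# were masked and never propagated out of the chip-level model
propagated = {
    "f1": "crc_error",
    "f2": "crc_error",       # same signature as f1 -> shares its simulation
    "f3": "host_hang",
    "f4": None,              # masked; never reached the system level
    "f5": "dropped_packet",
}

by_signature = defaultdict(list)
for fault, sig in propagated.items():
    if sig is not None:
        by_signature[sig].append(fault)

# One system-level simulation per distinct signature.
n_system_sims = len(by_signature)
print("system-level simulations needed:", n_system_sims)
```

This mirrors the case study's arithmetic: of 500 injected faults, only 255 propagated, and signature sharing cut the required system-level simulations to 145.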
NASA Astrophysics Data System (ADS)
Wallace, W. K.; Sherrod, B. L.; Dawson, T. E.
2002-12-01
Preliminary observations suggest that right-lateral strike-slip on the Denali fault is transferred to the Totschunda fault via an extensional bend in the Little Tok River valley. Most of the surface rupture during the Denali fault earthquake was along an east- to east-southeast striking, gently curved segment of the Denali fault. However, in the Little Tok River valley, rupture transferred to the southeast-striking Totschunda fault and continued to the southeast for another 75 km. West of the Little Tok River valley, 5-7 m of right-lateral slip and up to 2 m of vertical offset occurred on the main strand of the Denali fault, but no apparent displacement occurred on the Denali fault east of the valley. Rupture west of the intersection also occurred on multiple discontinuous strands parallel to and south of the main strand of the Denali fault. In the Little Tok River valley, the northern part of the Totschunda fault system consists of multiple discontinuous southeast-striking strands that are connected locally by south-striking stepover faults. Faults of the northern Totschunda system display 0-2.5 m of right-lateral slip and 0-2.75 m of vertical offset, with the largest vertical offset on a dominantly extensional stepover fault. The strands of the Totschunda system converge southeastward to a single strand that had up to 2 m of slip. Complex and discontinuous faulting may reflect in part the immaturity of the northern Totschunda system, which is known to be younger and have much less total slip than the Denali. The Totschunda fault forms an extensional bend relative to the dominantly right-lateral Denali fault to the west. The fault geometry and displacements at the intersection suggest that slip on the Denali fault during the earthquake was accommodated largely by extension in the northern Totschunda fault system, allowing a significant decrease in strike-slip relative to the Denali fault. 
Strands to the southwest in the area of the bend may represent shortcut faults that have reduced the curvature at the intersection of the two fault systems.
Design of Energy Storage Management System Based on FPGA in Micro-Grid
NASA Astrophysics Data System (ADS)
Liang, Yafeng; Wang, Yanping; Han, Dexiao
2018-01-01
The energy storage system is the core that maintains the stable operation of a smart micro-grid. Aiming at the existing problems of energy storage management systems in the micro-grid, such as low fault tolerance and a tendency to cause fluctuations in the micro-grid, a new intelligent battery management system based on a field programmable gate array (FPGA) is proposed: it takes advantage of the FPGA to combine the battery management system with the intelligent micro-grid control strategy. Finally, to address the problem that inaccurate initialization of weights and thresholds during neural-network estimation of the battery state of charge leads to large prediction errors, a genetic algorithm is proposed to optimize the neural network, and an experimental simulation is carried out. The experimental results show that the algorithm has high precision and provides a guarantee for the stable operation of the micro-grid.
Comparative study of superconducting fault current limiter both for LCC-HVDC and VSC-HVDC systems
NASA Astrophysics Data System (ADS)
Lee, Jong-Geon; Khan, Umer Amir; Lim, Sung-Woo; Shin, Woo-ju; Seo, In-Jin; Lee, Bang-Wook
2015-11-01
The High Voltage Direct Current (HVDC) system has been evaluated as the optimum solution for renewable energy transmission and long-distance power grid connections. In spite of the various advantages of HVDC systems, they are still regarded as less reliable than AC systems due to their vulnerability to power system faults. Furthermore, unlike for AC systems, optimum protection and switching devices have not yet been fully developed. Therefore, in order to enhance the reliability of HVDC systems, fault mitigation measures and reliable fault current limiting and switching devices should be developed. In this paper, in order to mitigate HVDC faults in both Line Commutated Converter HVDC (LCC-HVDC) and Voltage Source Converter HVDC (VSC-HVDC) systems, the application of a resistive superconducting fault current limiter (SFCL), known as an optimum solution for coping with power system faults, was considered. First, point-to-point simulation models for both LCC-HVDC and VSC-HVDC systems were developed. From the designed models, fault current characteristics under faulty conditions were analyzed. Second, the application of an SFCL to each type of HVDC system was considered, and a comparative study of the modified fault current characteristics was performed. Consequently, it was deduced that applying an AC-SFCL to a point-to-point LCC-HVDC system is a desirable solution to mitigate fault current stresses and to prevent commutation failure in an HVDC electric power system interconnected with an AC grid.
Intermittent/transient fault phenomena in digital systems
NASA Technical Reports Server (NTRS)
Masson, G. M.
1977-01-01
An overview of the intermittent/transient (IT) fault study is presented. An interval survivability evaluation of digital systems for IT faults is discussed along with a method for detecting and diagnosing IT faults in digital systems.
Managing Network Partitions in Structured P2P Networks
NASA Astrophysics Data System (ADS)
Shafaat, Tallat M.; Ghodsi, Ali; Haridi, Seif
Structured overlay networks form a major class of peer-to-peer systems, which are touted for their abilities to scale, tolerate failures, and self-manage. Any long-lived Internet-scale distributed system is destined to face network partitions. Consequently, the problem of network partitions and mergers is highly related to fault-tolerance and self-management in large-scale systems, making resilience to network partitions a crucial requirement for building any structured peer-to-peer system. Yet the problem has hardly been studied in the context of structured peer-to-peer systems. Structured overlays have mainly been studied under churn (frequent joins/failures), which as a side effect solves the problem of network partitions, as it is similar to massive node failures. The crucial aspect of network mergers, however, has been ignored. In fact, it has been claimed that ring-based structured overlay networks, which constitute the majority of the structured overlays, are intrinsically ill-suited for merging rings. In this chapter, we motivate the problem of network partitions and mergers in structured overlays. We discuss how a structured overlay can automatically detect a network partition and merger. We present an algorithm for merging multiple similar ring-based overlays when the underlying network merges. We examine the solution in dynamic conditions, showing how our solution is resilient to churn during the merger, something widely believed to be difficult or impossible. We evaluate the algorithm for various scenarios and show that even when falsely detecting a merger, the algorithm quickly terminates and does not clutter the network with many messages. The algorithm is flexible, as the tradeoff between message complexity and time complexity can be adjusted by a parameter.
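The end state a ring merger must reach can be sketched centrally, even though the chapter's algorithm achieves it with local messages only. This is a deliberate simplification under assumed node identifiers: each ring is a set of ids, and after the partition heals the merged ring is the sorted union with recomputed successor pointers.

```python
# Simplified (centralized) view of merging two ring-based overlays.
# Node ids are illustrative; real protocols converge to this state
# via pairwise, message-passing repairs rather than a global sort.

def successors(ring):
    """Map each node id to its successor on the sorted, wrapped ring."""
    ring = sorted(ring)
    return {n: ring[(i + 1) % len(ring)] for i, n in enumerate(ring)}

ring_a = [5, 20, 40]     # one side of the healed partition
ring_b = [12, 30, 55]    # the other side

merged = sorted(set(ring_a) | set(ring_b))
succ = successors(merged)

print(merged)       # ids interleave into a single ring
print(succ[55])     # the largest id wraps around to the smallest
```

The hard part the chapter addresses is reaching this interleaved state while nodes keep joining and failing, and while only neighbors can exchange messages.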
Chen, Gang; Song, Yongduan; Lewis, Frank L
2016-05-03
This paper investigates the distributed fault-tolerant control problem of networked Euler-Lagrange systems with actuator and communication link faults. An adaptive fault-tolerant cooperative control scheme is proposed to achieve the coordinated tracking control of networked uncertain Lagrange systems on a general directed communication topology, which contains a spanning tree with the root node being the active target system. The proposed algorithm is capable of compensating for the actuator bias fault, the partial loss-of-effectiveness actuation fault, the communication link fault, the model uncertainty, and the external disturbance simultaneously. The control scheme does not use any fault detection and isolation mechanism to detect, separate, and identify the actuator faults online, which largely reduces the online computation and expedites the responsiveness of the controller. To validate the effectiveness of the proposed method, a test-bed of a multiple robot-arm cooperative control system was developed for real-time verification. Experiments on the networked robot arms were conducted, and the results confirm the benefits and the effectiveness of the proposed distributed fault-tolerant control algorithms.
NASA Astrophysics Data System (ADS)
Gonzalez-Nicolas, A.; Cihan, A.; Birkholzer, J. T.; Petrusak, R.; Zhou, Q.; Riestenberg, D. E.; Trautz, R. C.; Godec, M.
2016-12-01
Industrial-scale injection of CO2 into the subsurface can cause reservoir pressure increases that must be properly controlled to prevent any potential environmental impact. Excessive pressure buildup in the reservoir may result in groundwater contamination stemming from leakage through conductive pathways, such as improperly plugged abandoned wells or distant faults, and in the potential for fault reactivation and possibly seal breaching. Brine extraction is a viable approach for managing formation pressure, effective stress, and plume movement during industrial-scale CO2 injection projects. The main objective of this study is to investigate different pressure management strategies involving active brine extraction and passive pressure relief wells. Adaptive optimized management of CO2 storage projects utilizes advanced automated optimization algorithms and suitable process models. The adaptive management integrates monitoring, forward modeling, inversion modeling, and optimization through an iterative process. In this study, we employ an adaptive framework to understand primarily how the initial site characterization and the frequency of model updates (calibration) and optimization calculations, which control extraction rates based on monitoring data, affect the accuracy and success of the management without violating pressure buildup constraints in the subsurface reservoir system. We will present results of applying the adaptive framework to test the appropriateness of different management strategies for a realistic field injection project.
Smart intimation and location of faults in distribution system
NASA Astrophysics Data System (ADS)
Hari Krishna, K.; Srinivasa Rao, B.
2018-04-01
Location of faults in the distribution system is one of the most complicated problems that we face today. Identifying the location and severity of a fault within a short time is required to provide a continuous power supply, but fault identification and the transfer of that information to the operator are the biggest challenges in the distribution network. This paper proposes a fault location method for the distribution system based on an Arduino nano and a GSM module with a flame sensor. The main idea is to locate a fault in the distribution transformer by sensing the arc coming out of the fuse element. Well-operated transmission and distribution systems play a key role in an uninterrupted power supply, so whenever a fault occurs in the distribution system, the time taken to locate and eliminate it has to be reduced. The proposed design was achieved with a flame sensor and a GSM module. Under a faulty condition, the system automatically sends an alert message to the operator in the distribution system about the abnormal conditions near the transformer, with the site code and its exact location, for possible power restoration.
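The decision-and-alert logic described above can be sketched independently of the hardware. The paper runs this on an Arduino nano with a GSM module; the Python version below only mirrors the two steps (threshold check, message formatting), and the threshold value, function name, and site codes are assumptions for illustration.

```python
# Illustrative sketch of the fault-intimation logic (hardware-free).

FLAME_THRESHOLD = 600   # assumed ADC reading above which a fuse arc is inferred

def check_and_compose_alert(adc_reading, site_code, location):
    """Return an SMS-style alert string if the flame sensor trips, else None."""
    if adc_reading <= FLAME_THRESHOLD:
        return None
    return (f"FAULT ALERT: fuse arc detected at transformer "
            f"site {site_code}, location {location}")

msg = check_and_compose_alert(742, site_code="TX-017", location="Feeder 3")
print(msg)
```

On the actual device, the returned string would be handed to the GSM module as the SMS payload rather than printed.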
Seismic interpretation of the deep structure of the Wabash Valley Fault System
Bear, G.W.; Rupp, J.A.; Rudman, A.J.
1997-01-01
Interpretations of newly available seismic reflection profiles near the center of the Illinois Basin indicate that the Wabash Valley Fault System is rooted in a series of basement-penetrating faults. The fault system is composed predominantly of north-northeast-trending high-angle normal faults. The largest faults in the system bound the 22-km-wide, 40-km-long Grayville Graben. Structure contour maps drawn on the base of the Mount Simon Sandstone (Cambrian System) and a deeper pre-Mount Simon horizon show dip-slip displacements totaling at least 600 meters across the New Harmony fault. In contrast to previous interpretations, the N-S extent of significant fault offsets is restricted to a region north of 38° latitude and south of 38.35° latitude. This suggests that the graben is not a NE extension of the structural complex composed of the Rough Creek Fault System and the Reelfoot Rift as previously interpreted. Structural complexity on the graben floor also decreases to the south. Structural trends north of 38° latitude are offset laterally across several large faults, indicating strike-slip motions of 2 to 4 km. Some of the major faults are interpreted to penetrate to depths of 7 km or more. Correlation of these faults with steep potential field gradients suggests that the fault positions are controlled by major lithologic contacts within the basement and that the faults may extend into the depth range where earthquakes are generated, revealing a potential link between specific faults and recently observed low-level seismicity in the area.
Anatomy of landslides along the Dead Sea Transform Fault System in NW Jordan
NASA Astrophysics Data System (ADS)
Dill, H. G.; Hahne, K.; Shaqour, F.
2012-03-01
In the mountainous region north of Amman, Jordan, Cenomanian calcareous rocks are being monitored constantly for their mass wasting processes which occasionally cause severe damage to the Amman-Irbid Highway. Satellite remote sensing data (Landsat TM, ASTER, and SRTM) and ground measurements are applied to investigate the anatomy of landslides along the Dead Sea Transform Fault System (DSTFS), a prominent strike-slip fault. The joints and faults pertinent to the DSTFS match the architectural elements identified in landslides of different size. This similarity attests to a close genetic relation between the tectonic setting of one of the most prominent fault zones on the earth and modern geomorphologic processes. Six indicators stand out in particular: 1) The fractures developing in N-S and splay faults represent the N-S lateral movement of the DSTFS. They governed the position of the landslides. 2) Cracks and faults aligned in NE-SW to NNW-SSW were caused by compressional strength. They were subsequently reactivated during extensional processes and used in some cases as slip planes during mass wasting. 3) Minor landslides with NE-SW straight scarps were derived from compressional features which were turned into slip planes during the incipient stages of mass wasting. They occur mainly along the slopes in small wadis or where a wide wadi narrows upstream. 4) Major landslides with curved instead of straight scarps and rotational slides are representative of a more advanced level of mass wasting. These areas have to be marked in the maps and during land management projects as high-risk area mainly and may be encountered in large wadis with steep slopes or longitudinal slopes undercut by road construction works. 5) The spatial relation between minor faults and slope angle is crucial as to the vulnerability of the areas in terms of mass wasting. 
6) Springs lined up along faults cause serious problems for engineering geology in that they alter the behavior of marly interbeds, accelerating sliding during mass wasting. The most vulnerable areas prone to slope instabilities are those with compressional tectonics followed by extensional movements, with fault-bound springs and smectite-bearing marly layers interbedded with pure massive limestones. The semi-arid to arid climate with periodic rainfalls, combined with subsurface water circulation along the joints and faults, can trigger mass wasting.
The 1991 Goddard Conference on Space Applications of Artificial Intelligence
NASA Technical Reports Server (NTRS)
Rash, James L. (Editor)
1991-01-01
The purpose of this annual conference is to provide a forum in which current research and development directed at space applications of artificial intelligence can be presented and discussed. The papers in these proceedings fall into the following areas: planning and scheduling; fault monitoring, diagnosis, and recovery; machine vision; robotics; system development; information management; knowledge acquisition and representation; distributed systems; tools; neural networks; and miscellaneous applications.
Advanced Ground Systems Maintenance Functional Fault Models For Fault Isolation Project
NASA Technical Reports Server (NTRS)
Perotti, Jose M. (Compiler)
2014-01-01
This project implements functional fault models (FFM) to automate the isolation of failures during ground systems operations. FFMs will also be used to recommend sensor placement to improve fault isolation capabilities. The project enables the delivery of system health advisories to ground system operators.
Knowledge-based fault diagnosis system for refuse collection vehicle
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tan, CheeFai; Juffrizal, K.; Khalil, S. N.
The refuse collection vehicle is manufactured by a local vehicle body manufacturer. Currently, the company supplies six models of the waste compactor truck to the local authority as well as to waste management companies. The company faces difficulty acquiring knowledge from the expert when the expert is absent. To solve this problem, the knowledge from the expert can be stored in an expert system, which is able to provide the necessary support to the company when the expert is not available. The implementation of the process and tool can thereby be standardized and made more accurate. The knowledge that is input to the expert system is based on design guidelines and experience from the expert. This project highlights another application of the knowledge-based system (KBS) approach, in troubleshooting the refuse collection vehicle production process. The main aim of the research is to develop a novel expert fault diagnosis system framework for the refuse collection vehicle.
Estimating earthquake-induced failure probability and downtime of critical facilities.
Porter, Keith; Ramer, Kyle
2012-01-01
Fault trees have long been used to estimate failure risk in earthquakes, especially for nuclear power plants (NPPs). One interesting application is that one can assess and manage the probability that two facilities - a primary and backup - would be simultaneously rendered inoperative in a single earthquake. Another is that one can calculate the probabilistic time required to restore a facility to functionality, and the probability that, during any given planning period, the facility would be rendered inoperative for any specified duration. A large new peer-reviewed library of component damageability and repair-time data for the first time enables fault trees to be used to calculate the seismic risk of operational failure and downtime for a wide variety of buildings other than NPPs. With the new library, seismic risk of both the failure probability and probabilistic downtime can be assessed and managed, considering the facility's unique combination of structural and non-structural components, their seismic installation conditions, and the other systems on which the facility relies. An example is offered of real computer data centres operated by a California utility. The fault trees were created and tested in collaboration with utility operators, and the failure probability and downtime results validated in several ways.
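As a rough illustration of the fault-tree arithmetic described in this abstract, the sketch below combines component failure probabilities through AND/OR gates, including the primary-plus-backup case. All probabilities are illustrative, and the independence assumption is a deliberate simplification: in a real seismic fault tree, correlated ground-motion demand on the primary and backup facilities must be modeled explicitly.

```python
def and_gate(probs):
    # AND gate: the event occurs only if all inputs fail
    # (assumes independent failures; illustrative only)
    p = 1.0
    for q in probs:
        p *= q
    return p

def or_gate(probs):
    # OR gate: the event occurs if at least one input fails
    # (assumes independent failures; illustrative only)
    p = 1.0
    for q in probs:
        p *= (1.0 - q)
    return 1.0 - p

# hypothetical numbers: each facility fails if its structural
# system OR its power supply fails; a site-wide outage requires
# the primary AND the backup to fail in the same earthquake
p_primary = or_gate([0.02, 0.05])
p_backup = or_gate([0.02, 0.05])
p_both_down = and_gate([p_primary, p_backup])
```

Under independence, the joint outage probability is far below either single-facility probability; accounting for correlated shaking would raise it substantially, which is exactly the risk the primary/backup analysis is meant to manage.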
Fisher, M.A.; Langenheim, V.E.; Sorlien, C.C.; Dartnell, P.; Sliter, R.W.; Cochrane, G.R.; Wong, F.L.
2005-01-01
Offshore faults west of Point Dume, southern California, are part of an important regional fault system that extends for about 206 km, from near the city of Los Angeles westward along the south flank of the Santa Monica Mountains and through the northern Channel Islands. This boundary fault system separates the western Transverse Ranges, on the north, from the California Continental Borderland, on the south. Previous research showed that the fault system includes many active fault strands; consequently, the entire system is considered a serious potential earthquake hazard to nearby Los Angeles. We present an integrated analysis of multichannel seismic- and high-resolution seismic-reflection data and multibeam-bathymetric information to focus on the central part of the fault system that lies west of Point Dume. We show that some of the main offshore faults have cumulative displacements of 3-5 km, and many faults are currently active because they deform the seafloor or very shallow sediment layers. The main offshore fault is the Dume fault, a large north-dipping reverse fault. In the eastern part of the study area, this fault offsets the seafloor, showing Holocene displacement. Onshore, the Malibu Coast fault dips steeply north, is active, and shows left-oblique slip. The probable offshore extension of this fault is a large fault that dips steeply in its upper part but flattens at depth. High-resolution seismic data show that this fault deforms shallow sediment making up the Hueneme fan complex, indicating Holocene activity. A structure near Sycamore knoll strikes transversely to the main faults and could be important to the analysis of the regional earthquake hazard because the structure might form a boundary between earthquake-rupture segments.
Activation of preexisting transverse structures in an evolving magmatic rift in East Africa
NASA Astrophysics Data System (ADS)
Muirhead, J. D.; Kattenhorn, S. A.
2018-01-01
Inherited crustal weaknesses have long been recognized as important factors in strain localization and basin development in the East African Rift System (EARS). However, the timing and kinematics (e.g., sense of slip) of transverse (rift-oblique) faults that exploit these weaknesses are debated, and thus the roles of inherited weaknesses at different stages of rift basin evolution are often overlooked. The mechanics of transverse faulting were addressed through an analysis of the Kordjya fault of the Magadi basin (Kenya Rift). Fault kinematics were investigated from field and remote-sensing data collected on fault and joint systems. Our analysis indicates that the Kordjya fault consists of a complex system of predominantly NNE-striking, rift-parallel fault segments that collectively form a NNW-trending array of en echelon faults. The transverse Kordjya fault therefore reactivated existing rift-parallel faults in ∼1 Ma lavas as oblique-normal faults with a component of sinistral shear. In all, these fault motions accommodate dip-slip on an underlying transverse structure that exploits the Aswa basement shear zone. This study shows that transverse faults may be activated through a complex interplay among magma-assisted strain localization, preexisting structures, and local stress rotations. Rather than forming during rift initiation, transverse structures can develop after the establishment of pervasive rift-parallel fault systems, and may exhibit dip-slip kinematics when activated from local stress rotations. The Kordjya fault is shown here to form a kinematic linkage that transfers strain to a newly developing center of concentrated magmatism and normal faulting. It is concluded that recently activated transverse faults not only reveal the effects of inherited basement weaknesses on fault development, but also provide important clues regarding developing magmatic and tectonic systems as young continental rift basins evolve.
Space station electric power system requirements and design
NASA Technical Reports Server (NTRS)
Teren, Fred
1987-01-01
An overview of the conceptual definition and design of the space station Electric Power System (EPS) is given. Responsibilities for the design and development of the EPS are defined. The EPS requirements are listed and discussed, including average and peak power requirements, contingency requirements, and fault tolerance. The most significant Phase B trade study results are summarized, and the design selections and rationale are given. Finally, the power management and distribution system architecture is presented.
NASA Astrophysics Data System (ADS)
Zuza, A. V.; Yin, A.; Lin, J. C.
2015-12-01
Parallel evenly-spaced strike-slip faults are prominent in the southern San Andreas fault system, as well as other settings along plate boundaries (e.g., the Alpine fault) and within continental interiors (e.g., the North Anatolian, central Asian, and northern Tibetan faults). In southern California, the parallel San Jacinto, Elsinore, Rose Canyon, and San Clemente faults to the west of the San Andreas are regularly spaced at ~40 km. In the Eastern California Shear Zone, east of the San Andreas, faults are spaced at ~15 km. These characteristic spacings provide unique mechanical constraints on how the faults interact. Despite the common occurrence of parallel strike-slip faults, the fundamental questions of how and why these fault systems form remain unanswered. We address this issue by using the stress shadow concept of Lachenbruch (1961)—developed to explain extensional joints by using the stress-free condition on the crack surface—to present a mechanical analysis of the formation of parallel strike-slip faults that relates fault spacing and brittle-crust thickness to fault strength, crustal strength, and the crustal stress state. We discuss three independent models: (1) a fracture mechanics model, (2) an empirical stress-rise function model embedded in a plastic medium, and (3) an elastic-plate model. The assumptions and predictions of these models are quantitatively tested using scaled analogue sandbox experiments that show that strike-slip fault spacing is linearly related to the brittle-crust thickness. We derive constraints on the mechanical properties of the southern San Andreas strike-slip faults and fault-bounded crust (e.g., local fault strength and crustal/regional stress) given the observed fault spacing and brittle-crust thickness, which is obtained by defining the base of the seismogenic zone with high-resolution earthquake data. 
Our models allow direct comparison of the parallel faults in the southern San Andreas system with other similar strike-slip fault systems, both on Earth and throughout the solar system (e.g., the Tiger Stripe Fractures on Enceladus).
Ruleman, Chester A.; Larsen, Mort; Stickney, Michael C.
2014-01-01
The catastrophic Hebgen Lake earthquake of 18 August 1959 (MW 7.3) led many geoscientists to develop new methods to better understand active tectonics in extensional tectonic regimes that address seismic hazards. The Madison Range fault system and adjacent Hebgen Lake–Red Canyon fault system provide an intermountain active tectonic analog for regional analyses of extensional crustal deformation. The Madison Range fault system comprises fault zones (~100 km in length) that have multiple salients and embayments marked by preexisting structures exposed in the footwall. Quaternary tectonic activity rates differ along the length of the fault system, with less displacement to the north. Within the Hebgen Lake basin, the 1959 earthquake is the latest slip event in the Hebgen Lake–Red Canyon fault system and southern Madison Range fault system. Geomorphic and paleoseismic investigations indicate previous faulting events on both fault systems. Surficial geologic mapping and historic seismicity support a coseismic structural linkage between the Madison Range and Hebgen Lake–Red Canyon fault systems. On this trip, we will look at Quaternary surface ruptures that characterize prehistoric earthquake magnitudes. The one-day field trip begins and ends in Bozeman, and includes an overview of the active tectonics within the Madison Valley and Hebgen Lake basin, southwestern Montana. We will also review geologic evidence, which includes new geologic maps and geomorphic analyses that demonstrate preexisting structural controls on surface rupture patterns along the Madison Range and Hebgen Lake–Red Canyon fault systems.
Software-implemented fault insertion: An FTMP example
NASA Technical Reports Server (NTRS)
Czeck, Edward W.; Siewiorek, Daniel P.; Segall, Zary Z.
1987-01-01
This report presents a model for fault insertion through software; describes its implementation on a fault-tolerant computer, FTMP; presents a summary of fault detection, identification, and reconfiguration data collected with software-implemented fault insertion; and compares the results to hardware fault insertion data. Experimental results show detection time to be a function of time of insertion and system workload. For fault detection time, there is no correlation between software-inserted faults and hardware-inserted faults; this is because hardware-inserted faults must manifest as errors before detection, whereas software-inserted faults immediately exercise the error detection mechanisms. In summary, software-implemented fault insertion can be used to evaluate a system's fault-handling capabilities in fault detection, identification, and recovery. Although software-inserted faults do not map directly to hardware-inserted faults, experiments show that software-implemented fault insertion can emulate hardware fault insertion with greater ease and automation.
Abstractions for Fault-Tolerant Distributed System Verification
NASA Technical Reports Server (NTRS)
Pike, Lee S.; Maddalon, Jeffrey M.; Miner, Paul S.; Geser, Alfons
2004-01-01
Four kinds of abstraction for the design and analysis of fault tolerant distributed systems are discussed. These abstractions concern system messages, faults, fault masking voting, and communication. The abstractions are formalized in higher order logic, and are intended to facilitate specifying and verifying such systems in higher order theorem provers.
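Fault-masking voting, one of the four abstractions discussed in this abstract, can be illustrated with a minimal exact-match majority voter. This is a generic sketch, not the higher-order-logic formalization from the paper: with 2f+1 replicas, an exact-match majority masks up to f faulty values.

```python
from collections import Counter

def majority_vote(values):
    # exact-match majority voter over replica outputs:
    # masks up to f faults among 2f+1 replicas
    value, count = Counter(values).most_common(1)[0]
    if count > len(values) // 2:
        return value
    raise ValueError("no majority: too many faulty replicas")
```

For example, `majority_vote([3, 3, 9])` masks the single faulty replica and returns 3, while two disagreeing replicas out of two raise an error, since no value holds a strict majority.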
NASA Technical Reports Server (NTRS)
2001-01-01
Traditional spacecraft power systems incorporate a solar array energy source, an energy storage element (battery), and battery charge control and bus voltage regulation electronics to provide continuous electrical power for spacecraft systems and instruments. Dedicated power conditioning components provide limited fault isolation between systems and instruments, while a centralized power-switching unit provides spacecraft load control. Battery undervoltage conditions are detected by the spacecraft processor, which removes fault conditions and non-critical loads before permanent battery damage can occur. Cost effective operation of a micro-sat constellation requires a fault tolerant spacecraft architecture that minimizes on-orbit operational costs by permitting autonomous reconfiguration in response to unexpected fault conditions. A new micro-sat power system architecture that enhances spacecraft fault tolerance and improves power system survivability by continuously managing the battery charge and discharge processes on a cell-by-cell basis has been developed. This architecture is based on the Integrated Power Source (US patent 5644207), which integrates dual junction solar cells, Lithium Ion battery cells, and processor based charge control electronics into a structural panel that can be deployed or used to form a portion of the outer shell of a micro-spacecraft. The first generation Integrated Power Source is configured as a one inch thick panel in which prismatic Lithium Ion battery cells are arranged in a 3x7 matrix (26VDC) and a 3x1 matrix (3.7VDC) to provide the required output voltages and load currents. A multi-layer structure holds the battery cells, as well as the thermal insulators that are necessary to protect the Lithium Ion battery cells from the extreme temperatures of the solar cell layer. Independent thermal radiators, located on the back of the panel, are dedicated to the solar cell array, the electronics, and the battery cell array. 
In deployed panel applications, these radiators maintain the battery cells in an appropriate operational temperature range.
NASA Astrophysics Data System (ADS)
Yin, An; Kelty, Thomas K.; Davis, Gregory A.
1989-09-01
Geologic mapping in southern Glacier National Park, Montana, reveals the presence of two duplexes sharing the same floor thrust fault, the Lewis thrust. The westernmost duplex (Brave Dog Mountain) includes the low-angle Brave Dog roof fault and Elk Mountain imbricate system, and the easternmost (Rising Wolf Mountain) duplex includes the low-angle Rockwell roof fault and Mt. Henry imbricate system. The geometry of these duplexes suggests that they differ from previously described geometric-kinematic models for duplex development. Their low-angle roof faults were preexisting structures that were locally utilized as roof faults during the formation of the imbricate systems. Crosscutting of the Brave Dog fault by the Mt. Henry imbricate system indicates that the two duplexes formed at different times. The younger Rockwell-Mt. Henry duplex developed 20 km east of the older Brave Dog-Elk Mountain duplex; the roof fault of the former is at a higher structural level. Field relations confirm that the low-angle Rockwell fault existed across the southern Glacier Park area prior to localized formation of the Mt. Henry imbricate thrusts beneath it. These thrusts kinematically link the Rockwell and Lewis faults and may be analogous to P shears that form between two synchronously active faults bounding a simple shear system. The abandonment of one duplex and its replacement by another with a new and higher roof fault may have been caused by (1) warping of the older and lower Brave Dog roof fault during the formation of the imbricate system (Elk Mountain) beneath it, (2) an upward shifting of the highest level of a simple shear system in the Lewis plate to a new decollement level in subhorizontal belt strata (= the Rockwell fault) that lay above inclined strata within the first duplex, and (3) a reinitiation of P-shear development (= Mt. Henry imbricate faults) between the Lewis thrust and the subparallel, synkinematic Rockwell fault.
NASA Astrophysics Data System (ADS)
Wang, Rongxi; Gao, Xu; Gao, Jianmin; Gao, Zhiyong; Kang, Jiani
2018-02-01
As one of the most important approaches for analyzing the mechanism of fault propagation, fault root cause tracing is a powerful and useful tool for detecting the fundamental causes of faults so as to prevent further propagation and amplification. To address the problems arising from the lack of systematic and comprehensive integration, a novel information transfer-based, data-driven framework for fault root cause tracing of complex electromechanical systems in the processing industry was proposed, taking into consideration the experience and qualitative analysis of conventional fault root cause tracing methods. First, an improved symbolic transfer entropy method was presented to construct a directed-weighted information model of a specific complex electromechanical system based on information flow. Second, considering the feedback mechanisms in complex electromechanical systems, a method for determining the threshold values of edge weights was developed to explore the patterns of fault propagation. Last, an iterative method was introduced to identify the fault development process. The fault root cause was traced by analyzing changes in information transfer between nodes along the fault propagation pathway. An actual fault root cause tracing application for a complex electromechanical system is used to verify the effectiveness of the proposed framework: a unique fault root cause is obtained regardless of the choice of initial variable. The proposed framework can thus be flexibly and effectively used for fault root cause tracing of complex electromechanical systems in the processing industry, and lays the foundation for system vulnerability analysis, condition prediction, and other engineering applications.
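The core quantity behind the directed-weighted information model is transfer entropy: how much knowing x at time t reduces uncertainty about y at time t+1, beyond what y's own past provides. The sketch below is a simplified quantile-based symbolic transfer entropy, not the authors' improved method; bin counts and history length are illustrative choices.

```python
import math
from collections import Counter

def symbolize(series, n_bins=3):
    # map each sample to a quantile bin index (0 .. n_bins-1)
    ranked = sorted(series)
    edges = [ranked[int(len(series) * k / n_bins)] for k in range(1, n_bins)]
    return [sum(v > e for e in edges) for v in series]

def transfer_entropy(x, y, n_bins=3):
    # T(X -> Y): information x_t adds about y_{t+1} beyond y_t (in bits),
    # estimated from plug-in symbol frequencies
    xs, ys = symbolize(x, n_bins), symbolize(y, n_bins)
    triples = Counter(zip(ys[1:], ys[:-1], xs[:-1]))   # (y_{t+1}, y_t, x_t)
    pairs_yy = Counter(zip(ys[1:], ys[:-1]))           # (y_{t+1}, y_t)
    pairs_yx = Counter(zip(ys[:-1], xs[:-1]))          # (y_t, x_t)
    singles_y = Counter(ys[:-1])                       # y_t
    n = len(ys) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_joint = c / n
        p_full = c / pairs_yx[(y0, x0)]                # p(y_{t+1} | y_t, x_t)
        p_self = pairs_yy[(y1, y0)] / singles_y[y0]    # p(y_{t+1} | y_t)
        te += p_joint * math.log2(p_full / p_self)
    return te
```

In a directed-weighted graph of system variables, an edge from x to y would be weighted by T(X -> Y), and edges below a threshold pruned, which is the role the threshold-determination step plays in the framework.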
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Huijuan; Diao, Xiaoxu; Li, Boyuan
This paper studies the propagation and effects of faults of critical components that pertain to the secondary loop of a nuclear power plant found in Nuclear Hybrid Energy Systems (NHES). This information is used to design an on-line monitoring (OLM) system which is capable of detecting and forecasting faults that are likely to occur during NHES operation. In this research, the causes, features, and effects of possible faults are investigated by simulating the propagation of faults in the secondary loop. The simulation is accomplished by using the Integrated System Failure Analysis (ISFA). ISFA is used for analyzing hardware and software faults during the conceptual design phase. In this paper, the models of system components required by ISFA are initially constructed. Then, the fault propagation analysis is implemented, which is conducted under the bounds set by acceptance criteria derived from the design of an OLM system. The result of the fault simulation is utilized to build a database for fault detection and diagnosis, provide preventive measures, and propose an optimization plan for the OLM system.
Clustering of GPS velocities in the Mojave Block, southeastern California
Savage, James C.; Simpson, Robert W.
2013-01-01
We find subdivisions within the Mojave Block using cluster analysis to identify groupings in the velocities observed at GPS stations there. The clusters are represented on a fault map by symbols located at the positions of the GPS stations, each symbol representing the cluster to which the velocity of that GPS station belongs. Fault systems that separate the clusters are readily identified on such a map. The most significant representation as judged by the gap test involves 4 clusters within the Mojave Block. The fault systems bounding the clusters from east to west are 1) the faults defining the eastern boundary of the Northeast Mojave Domain extended southward to connect to the Hector Mine rupture, 2) the Calico-Paradise fault system, 3) the Landers-Blackwater fault system, and 4) the Helendale-Lockhart fault system. This division of the Mojave Block is very similar to that proposed by Meade and Hager. However, no cluster boundary coincides with the Garlock Fault, the northern boundary of the Mojave Block. Rather, the clusters appear to continue without interruption from the Mojave Block north into the southern Walker Lane Belt, similar to the continuity across the Garlock Fault of the shear zone along the Blackwater-Little Lake fault system observed by Peltzer et al. Mapped traces of individual faults in the Mojave Block terminate within the block and do not continue across the Garlock Fault [Dokka and Travis, ].
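The block subdivisions come from grouping stations whose velocity vectors move together. The paper's specific cluster analysis and gap test are not reproduced here; the sketch below is a generic k-means grouping of (east, north) velocity components, with illustrative coordinates, just to show the mechanics of assigning stations to velocity clusters.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    # minimal 2-D k-means; points are (v_east, v_north) velocities, e.g. mm/yr
    rng = random.Random(seed)
    centers = rng.sample(points, k)

    def nearest(p):
        return min(range(k),
                   key=lambda i: (p[0] - centers[i][0]) ** 2
                               + (p[1] - centers[i][1]) ** 2)

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p)].append(p)
        # recompute each center as the mean of its group (keep empty centers)
        centers = [(sum(p[0] for p in g) / len(g),
                    sum(p[1] for p in g) / len(g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    labels = [nearest(p) for p in points]
    return labels, centers
```

Plotting each station's cluster label at its map position, as the abstract describes, then makes the bounding fault systems visible as the boundaries between label regions.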
Improving Multiple Fault Diagnosability using Possible Conflicts
NASA Technical Reports Server (NTRS)
Daigle, Matthew J.; Bregon, Anibal; Biswas, Gautam; Koutsoukos, Xenofon; Pulido, Belarmino
2012-01-01
Multiple fault diagnosis is a difficult problem for dynamic systems. Due to fault masking, compensation, and relative time of fault occurrence, multiple faults can manifest in many different ways as observable fault signature sequences. This decreases diagnosability of multiple faults, and therefore leads to a loss in effectiveness of the fault isolation step. We develop a qualitative, event-based, multiple fault isolation framework, and derive several notions of multiple fault diagnosability. We show that using Possible Conflicts, a model decomposition technique that decouples faults from residuals, we can significantly improve the diagnosability of multiple faults compared to an approach using a single global model. We demonstrate these concepts and provide results using a multi-tank system as a case study.
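The diagnosability issue can be made concrete with fault signatures over a set of residuals. The sketch below is a simplification of the ideas in this abstract, not the authors' event-based framework: it assumes a double fault fires the union of its members' residuals (ignoring masking and timing), and checks which double faults are indistinguishable. With decoupled, Possible-Conflict-style residuals, the ambiguities disappear.

```python
from itertools import combinations

def union_signature(sig_a, sig_b):
    # simplifying assumption: a double fault triggers the union
    # of the residuals triggered by each single fault
    return tuple(a | b for a, b in zip(sig_a, sig_b))

def ambiguous_double_faults(signatures):
    # signatures: fault name -> tuple of 0/1 residual triggers
    doubles = {
        frozenset(pair): union_signature(signatures[pair[0]],
                                         signatures[pair[1]])
        for pair in combinations(signatures, 2)
    }
    # pairs of double faults producing identical observable signatures
    return [(set(p), set(q))
            for p, q in combinations(doubles, 2)
            if doubles[p] == doubles[q]]
```

With a shared global model, overlapping signatures such as (1,0,0), (0,1,0), (1,1,0) make every double fault look the same; decoupled signatures such as (1,0,0), (0,1,0), (0,0,1) leave no ambiguous pair, which is the intuition behind the diagnosability gain reported in the paper.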
Geophysical Characterization of the Hilton Creek Fault System
NASA Astrophysics Data System (ADS)
Lacy, A. K.; Macy, K. P.; De Cristofaro, J. L.; Polet, J.
2016-12-01
The Long Valley Caldera straddles the eastern edge of the Sierra Nevada Batholith and the western edge of the Basin and Range Province, and represents one of the largest caldera complexes on Earth. The caldera is intersected by numerous fault systems, including the Hartley Springs Fault System, the Round Valley Fault System, the Long Valley Ring Fault System, and the Hilton Creek Fault System, which is our main region of interest. The Hilton Creek Fault System appears as a single NW-striking fault, dipping to the NE, from Davis Lake in the south to the southern rim of the Long Valley Caldera. Inside the caldera, it splays into numerous parallel faults that extend toward the resurgent dome. Seismicity in the area increased significantly in May 1980, following a series of large earthquakes in the vicinity of the caldera and a subsequent large earthquake swarm which has been suggested to be the result of magma migration. A large portion of the earthquake swarms in the Long Valley Caldera occurs on or around the Hilton Creek Fault splays. We are conducting an interdisciplinary geophysical study of the Hilton Creek Fault System from just south of the onset of splay faulting, to its extension into the dome of the caldera. Our investigation includes ground-based magnetic field measurements, high-resolution total station elevation profiles, Structure-From-Motion derived topography and an analysis of earthquake focal mechanisms and statistics. Preliminary analysis of topographic profiles, of approximately 1 km in length, reveals the presence of at least three distinct fault splays within the caldera with vertical offsets of 0.5 to 1.0 meters. More detailed topographic mapping is expected to highlight smaller structures. We are also generating maps of the variation in b-value along different portions of the Hilton Creek system to determine whether we can detect any transition to more swarm-like behavior towards the North. 
We will show maps of magnetic anomalies, topography, various models of the Hilton Creek Fault System and cross-sections through focal mechanism and earthquake catalogs, and will attempt to integrate these observations into a single fault geometry model.
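The b-value mapping mentioned above rests on a standard estimator: for earthquakes above a completeness magnitude, the Gutenberg-Richter b-value has the Aki/Utsu maximum-likelihood form, with a half-bin correction when magnitudes are binned. The sketch below is that textbook estimator, not the authors' mapping procedure.

```python
import math

def b_value(mags, m_c, dm=0.1):
    # Aki/Utsu maximum-likelihood b-value estimate:
    #   b = log10(e) / (mean(M) - (Mc - dm/2))
    # m_c: magnitude of completeness; dm: magnitude bin width
    # (dm = 0 for continuous, unbinned magnitudes)
    above = [m for m in mags if m >= m_c]
    mean_m = sum(above) / len(above)
    return math.log10(math.e) / (mean_m - (m_c - dm / 2.0))
```

Mapping b along the fault system then amounts to applying this estimator in sliding spatial windows; elevated b is commonly read as more swarm-like, possibly fluid- or magma-influenced seismicity, which is the transition the survey looks for toward the north.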
X-33/RLV System Health Management/ Vehicle Health Management
NASA Technical Reports Server (NTRS)
Garbos, Raymond J.; Mouyos, William
1998-01-01
To reduce operations cost, the RLV must include the following elements: highly reliable, robust subsystems designed for simple repair access, with a simplified servicing infrastructure and expedited decision making about faults and anomalies. A key component of the Single Stage to Orbit (SSTO) RLV system used to meet these objectives is System Health Management (SHM). SHM comprises Vehicle Health Management (VHM) for the vehicle itself, Ground Vehicle Health Management (GVHM) for the ground processing associated with the fleet, and Ground Infrastructure Health Management (GIHM). The objective is to provide automated data collection and a paperless system for health decisions, maintenance, and logistics. Many critical technologies are necessary to make SHM (and more specifically VHM) practical, reliable, and cost effective. Sanders is leading the design, development, and integration of the SHM system for the RLV and for X-33 SHM (a sub-scale, sub-orbit Advanced Technology Demonstrator). This paper will present the X-33 SHM design, which forms the baseline for RLV SHM, and will also discuss other applications of these technologies.
Development of a microprocessor controller for stand-alone photovoltaic power systems
NASA Technical Reports Server (NTRS)
Millner, A. R.; Kaufman, D. L.
1984-01-01
A controller for stand-alone photovoltaic systems has been developed using a low-power CMOS microprocessor. It performs battery state-of-charge estimation, array control, load management, instrumentation, automatic testing, and communications functions. Array control options are sequential subarray switching and maximum power control. A calculator keypad and LCD display provide manual control, fault diagnosis, and digital multimeter functions. An RS-232 port provides data logging or remote control capability. A prototype 5 kW unit has been built and tested successfully. The controller is expected to be useful in village photovoltaic power systems, large solar water pumping installations, and other battery management applications.
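Battery state-of-charge estimation in controllers of this kind is often based on coulomb counting: integrating current into and out of the battery. The sketch below is a generic illustration of that idea, not the Lincoln Laboratory controller's algorithm; the charge-efficiency value and sign convention are assumptions.

```python
def update_soc(soc, current_a, dt_s, capacity_ah, efficiency=0.95):
    # one coulomb-counting step for battery state of charge (0.0 - 1.0)
    # sign convention (assumed): positive current = charging,
    # negative = discharging; charging is derated by coulombic efficiency
    dq_ah = current_a * dt_s / 3600.0
    if current_a > 0:
        dq_ah *= efficiency
    return min(1.0, max(0.0, soc + dq_ah / capacity_ah))
```

A practical controller would also correct the running estimate against open-circuit voltage or end-of-charge detection, since pure current integration drifts with sensor bias over time.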
The Deep Space Network information system in the year 2000
NASA Technical Reports Server (NTRS)
Markley, R. W.; Beswick, C. A.
1992-01-01
The Deep Space Network (DSN), the largest, most sensitive scientific communications and radio navigation network in the world, is considered. Focus is made on the telemetry processing, monitor and control, and ground data transport architectures of the DSN ground information system envisioned for the year 2000. The telemetry architecture will be unified from the front-end area to the end user. It will provide highly automated monitor and control of the DSN, automated configuration of support activities, and a vastly improved human interface. Automated decision support systems will be in place for DSN resource management, performance analysis, fault diagnosis, and contingency management.
Active faulting, earthquakes, and restraining bend development near Kerman city in southeastern Iran
NASA Astrophysics Data System (ADS)
Walker, Richard Thomas; Talebian, Morteza; Saiffori, Sohei; Sloan, Robert Alastair; Rasheedi, Ali; MacBean, Natasha; Ghassemi, Abbas
2010-08-01
We provide descriptions of strike-slip and reverse faulting, active within the late Quaternary, in the vicinity of Kerman city in southeastern Iran. The faults accommodate north-south, right-lateral, shear between central Iran and the Dasht-e-Lut depression. The regions that we describe have been subject to numerous earthquakes in the historical and instrumental periods, and many of the faults that are documented in this paper constitute hazards for local populations, including the city of Kerman itself (population ˜200,000). Faults to the north and east of Kerman are associated with the transfer of slip from the Gowk to the Kuh Banan right-lateral faults across a 40 km-wide restraining bend. Faults south and west of the city are associated with oblique slip on the Mahan and Jorjafk systems. The patterns of faulting observed along the Mahan-Jorjafk system, the Gowk-Kuh Banan system, and also the Rafsanjan-Rayen system further to the south, appear to preserve different stages in the development of these oblique-slip fault systems. We suggest that the faulting evolves through time. Topography is initially generated on oblique slip faults (as is seen on the Jorjafk fault). The shortening component then migrates to reverse faults situated away from the high topography whereas strike-slip continues to be accommodated in the high, mountainous, regions (as is seen, for example, on the Rafsanjan fault). The reverse faults may then link together and eventually evolve into new, through-going, strike-slip faults in a process that appears to be occurring, at present, in the bend between the Gowk and Kuh Banan faults.
Wu, Zhenyu; Guo, Yang; Lin, Wenfang; Yu, Shuyang; Ji, Yang
2018-04-05
Predictive maintenance plays an important role in modern Cyber-Physical Systems (CPSs) and data-driven methods have been a worthwhile direction for Prognostics Health Management (PHM). However, two main challenges have significant influences on the traditional fault diagnostic models: one is that extracting hand-crafted features from multi-dimensional sensors with internal dependencies depends too much on expertise knowledge; the other is that imbalance pervasively exists among faulty and normal samples. As deep learning models have proved to be good methods for automatic feature extraction, the objective of this paper is to study an optimized deep learning model for imbalanced fault diagnosis for CPSs. Thus, this paper proposes a weighted Long Recurrent Convolutional LSTM model with sampling policy (wLRCL-D) to deal with these challenges. The model consists of 2-layer CNNs, 2-layer inner LSTMs and 2-Layer outer LSTMs, with under-sampling policy and weighted cost-sensitive loss function. Experiments are conducted on PHM 2015 challenge datasets, and the results show that wLRCL-D outperforms other baseline methods.
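The imbalance-handling idea in the weighted cost-sensitive loss can be sketched without any deep learning framework: weight each class inversely to its frequency so that rare faulty samples contribute as much to the loss as abundant normal ones. This is a generic illustration of class weighting, not the wLRCL-D model itself.

```python
import math

def class_weights(labels):
    # inverse-frequency weights: w_c = N / (num_classes * count_c),
    # so rare fault classes count more in the loss
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}

def weighted_cross_entropy(true_class_probs, labels, weights):
    # true_class_probs[i]: predicted probability of sample i's true class
    total = sum(-weights[y] * math.log(p)
                for p, y in zip(true_class_probs, labels))
    return total / len(labels)
```

In the paper, this kind of weighting is combined with under-sampling of the majority class, so the model is penalized both less often and less heavily for the easy normal samples.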
Sensor Selection for Aircraft Engine Performance Estimation and Gas Path Fault Diagnostics
NASA Technical Reports Server (NTRS)
Simon, Donald L.; Rinehart, Aidan W.
2015-01-01
This paper presents analytical techniques for aiding system designers in making aircraft engine health management sensor selection decisions. The presented techniques, which are based on linear estimation and probability theory, are tailored for gas turbine engine performance estimation and gas path fault diagnostics applications. They enable quantification of the performance estimation and diagnostic accuracy offered by different candidate sensor suites. For performance estimation, sensor selection metrics are presented for two types of estimators including a Kalman filter and a maximum a posteriori estimator. For each type of performance estimator, sensor selection is based on minimizing the theoretical sum of squared estimation errors in health parameters representing performance deterioration in the major rotating modules of the engine. For gas path fault diagnostics, the sensor selection metric is set up to maximize correct classification rate for a diagnostic strategy that performs fault classification by identifying the fault type that most closely matches the observed measurement signature in a weighted least squares sense. Results from the application of the sensor selection metrics to a linear engine model are presented and discussed. Given a baseline sensor suite and a candidate list of optional sensors, an exhaustive search is performed to determine the optimal sensor suites for performance estimation and fault diagnostics. For any given sensor suite, Monte Carlo simulation results are found to exhibit good agreement with theoretical predictions of estimation and diagnostic accuracies.
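The diagnostic strategy described above, matching the observed measurement signature to the closest fault signature in a weighted least squares sense, can be sketched directly. The fault names, signatures, and weights below are hypothetical stand-ins, not values from the paper; weights would typically be inverse sensor noise variances.

```python
def wls_classify(residual, signatures, weights):
    # pick the fault whose signature best matches the observed residual
    # vector in a weighted least squares sense
    def cost(sig):
        return sum(w * (r - s) ** 2
                   for w, r, s in zip(weights, residual, sig))
    return min(signatures, key=lambda name: cost(signatures[name]))

# hypothetical two-sensor example: each fault's expected residual signature
signatures = {"fan": [1.0, 0.0], "compressor": [0.0, 1.0]}
```

The sensor-selection metric in the paper then scores a candidate sensor suite by the correct-classification rate this matcher achieves over simulated faulty measurements.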
Building a risk-targeted regional seismic hazard model for South-East Asia
NASA Astrophysics Data System (ADS)
Woessner, J.; Nyst, M.; Seyhan, E.
2015-12-01
The last decade has tragically shown the social and economic vulnerability of countries in South-East Asia to earthquake hazard and risk. While many disaster mitigation programs and initiatives to improve societal earthquake resilience are under way with the focus on saving lives and livelihoods, the risk management sector is challenged to develop appropriate models to cope with the economic consequences and impact on the insurance business. We present the source model and ground motion model components suitable for a South-East Asia earthquake risk model covering Indonesia, Malaysia, the Philippines and the countries of Indochina. The source model builds upon refined modelling approaches to characterize 1) seismic activity on crustal faults from geologic and geodetic data, 2) seismicity along the interface of subduction zones and within the slabs, and 3) earthquakes not occurring on mapped fault structures. We elaborate on building a self-consistent rate model for the hazardous crustal fault systems (e.g. the Sumatra fault zone and Philippine fault zone) as well as the subduction zones, and showcase some characteristics and sensitivities, due to existing uncertainties in the rate and hazard space, using a well-selected suite of ground motion prediction equations. Finally, we analyze the source model by quantifying the contribution by source type (e.g., subduction zone, crustal fault) to typical risk metrics (e.g., return-period losses, average annual loss) and reviewing their relative impact on various lines of business.
AGSM Functional Fault Models for Fault Isolation Project
NASA Technical Reports Server (NTRS)
Harp, Janicce Leshay
2014-01-01
This project implements functional fault models (FFMs) to automate the isolation of failures during ground systems operations. FFMs will also be used to recommend sensor placement to improve fault isolation capabilities. The project enables the delivery of system health advisories to ground system operators.
Data-based fault-tolerant control for affine nonlinear systems with actuator faults.
Xie, Chun-Hua; Yang, Guang-Hong
2016-09-01
This paper investigates the fault-tolerant control (FTC) problem for unknown nonlinear systems with actuator faults, including stuck, outage, bias, and loss-of-effectiveness faults. The upper bounds of the stuck, bias, and loss-of-effectiveness faults are unknown. A new data-based FTC scheme is proposed. It consists of online estimations of the bounds and a state-dependent function. The estimations are adjusted online to automatically compensate for the actuator faults. The state-dependent function, solved by using real system data, helps to stabilize the system. Furthermore, all signals in the resulting closed-loop system are uniformly bounded and the states converge asymptotically to zero. Compared with existing results, the proposed approach is data-based. Finally, two simulation examples are provided to show the effectiveness of the proposed approach. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
Validation techniques for fault emulation of SRAM-based FPGAs
Quinn, Heather; Wirthlin, Michael
2015-08-07
A variety of fault emulation systems have been created to study the effect of single-event effects (SEEs) in static random access memory (SRAM) based field-programmable gate arrays (FPGAs). These systems are useful for augmenting radiation-hardness assurance (RHA) methodologies for verifying the effectiveness of mitigation techniques; understanding error signatures and failure modes in FPGAs; and failure rate estimation. For radiation effects researchers, it is important that these systems properly emulate how SEEs manifest in FPGAs. If the fault emulation system does not mimic the radiation environment, the system will generate erroneous data and incorrect predictions of the behavior of the FPGA in a radiation environment. Validation determines whether the emulated faults are reasonable analogs to the radiation-induced faults. In this study we present methods for validating fault emulation systems and provide several examples of validated FPGA fault emulation systems.
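One simple check in the spirit of such validation is to compare the distribution of error signatures produced by the emulator against beam-test data. The total-variation distance and acceptance tolerance below are an illustrative stand-in, not the specific validation methods of the study:

```python
import numpy as np

def signature_agreement(beam_counts, emul_counts):
    """Total-variation distance between the error-signature category
    distributions observed under the beam and under the fault emulator
    (0 means identical mixes of failure modes)."""
    p = np.asarray(beam_counts, float)
    p = p / p.sum()
    q = np.asarray(emul_counts, float)
    q = q / q.sum()
    return 0.5 * float(np.abs(p - q).sum())

def validate(beam_counts, emul_counts, tol=0.1):
    """Accept the emulator if its signature mix is within tol of beam data."""
    return signature_agreement(beam_counts, emul_counts) <= tol
```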
Fault-tolerant cooperative output regulation for multi-vehicle systems with sensor faults
NASA Astrophysics Data System (ADS)
Qin, Liguo; He, Xiao; Zhou, D. H.
2017-10-01
This paper presents a unified framework of fault diagnosis and fault-tolerant cooperative output regulation (FTCOR) for a linear discrete-time multi-vehicle system with sensor faults. The FTCOR control law is designed through three steps. A cooperative output regulation (COR) controller is designed based on the internal model principle when there are no sensor faults. A sufficient condition on the existence of the COR controller is given based on the discrete-time algebraic Riccati equation (DARE). Then, a decentralised fault diagnosis scheme is designed to cope with sensor faults occurring in followers. A residual generator is developed to detect sensor faults of each follower, and a bank of fault-matching estimators is proposed to isolate and estimate sensor faults of each follower. Unlike current distributed fault diagnosis for multi-vehicle systems, the presented decentralised fault diagnosis scheme reduces the communication and computation load by using only the information of the vehicle itself. By combining the sensor fault estimation and the COR control law, an FTCOR controller is proposed. Finally, the simulation results demonstrate the effectiveness of the FTCOR controller.
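The bank-of-fault-matching-estimators idea can be sketched generically: each hypothesis assumes a bias fault on one sensor, and the hypothesis that best explains a window of residuals wins. This is a generic illustration under a bias-fault assumption, not the paper's DARE-based design:

```python
import numpy as np

def isolate_sensor_fault(residuals):
    """Bank of fault-matching hypotheses over a residual window
    (rows = time steps, cols = sensors). Hypothesis j: a constant bias on
    sensor j explains that channel's mean residual; the winning hypothesis
    leaves the least unexplained residual energy.
    Returns (faulty sensor index, estimated bias)."""
    R = np.asarray(residuals, float)
    costs, biases = [], []
    for j in range(R.shape[1]):
        b = R[:, j].mean()            # matched-fault bias estimate
        E = R.copy()
        E[:, j] -= b                  # remove the hypothesized fault
        costs.append(float((E ** 2).sum()))
        biases.append(b)
    j_star = int(np.argmin(costs))
    return j_star, biases[j_star]
```

In the paper's scheme, the corresponding estimate would then be fed back to correct the faulty measurement before it enters the COR control law.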
Previously unrecognized now-inactive strand of the North Anatolian fault in the Thrace basin
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perincek, D.
1988-08-01
The North Anatolian fault is a major 1,200-km-long transform fault bounding the Anatolian plate to the north. It formed in late middle Miocene time as a broad shear zone with a number of strands splaying westward in a horsetail fashion. Later, movement became localized along the stem, and the southerly and northerly splays became inactive. One such right-lateral, now-inactive splay is the west-northwest-striking Thrace strike-slip fault system, consisting of three subparallel strike-slip faults. From north to south these are the Kirklareli, Lueleburgaz, and Babaeski fault zones, extending ±130 km along strike. The Thrace fault zone probably connected with the presently active northern strand of the North Anatolian fault in the Sea of Marmara in the southeast and may have joined the Plovdiv graben zone in Bulgaria in the northwest. The Thrace basin, in which the Thrace fault system is located, is Cenozoic, with a sedimentary basin fill from middle Eocene to Pliocene. The Thrace fault system formed in pre-Pliocene time and had become inactive by the Pliocene. Strike-slip fault zones with normal and reverse separation are detected by seismic reflection profiles and subsurface data. Releasing-bend extensional structures (e.g., near the town of Lueleburgaz) and restraining-bend compressional structures (near the Vakiflar-1 well) are abundant on the fault zones. The Umurca and Hamitabad fields are en echelon structures on the Lueleburgaz fault zone. The Thrace strike-slip fault system has itself a horsetail shape, the various strands of which become younger southward. The entire system died before the Pliocene, and motion on the North Anatolian fault zone began to be accommodated in the Sea of Marmara region. Thus the Thrace fault system represents the oldest strand of the North Anatolian fault in the west.
Magma-tectonic Interaction at Laguna del Maule, Chile
NASA Astrophysics Data System (ADS)
Keranen, K. M.; Peterson, D. E.; Miller, C. A.; Garibaldi, N.; Tikoff, B.; Williams-Jones, G.
2016-12-01
The Laguna del Maule Volcanic Field (LdM), Chile, the largest concentration of rhyolite erupted within the past 20 kyr globally, exhibits crustal deformation at rates higher than any non-erupting volcano. The interaction of large magmatic systems with faulting is poorly understood; however, the Chaitén rhyolitic system demonstrated that faults can serve as magma pathways during an eruption. We present a complex fault system at LdM in close proximity to the magma reservoir. In March 2016, 18 CHIRP seismic reflection lines were acquired at LdM to identify faults and analyze potential spatial and temporal impacts of the fault system on volcanic activity. We mapped three key horizons on each line, bounding sediment packages between the Holocene onset, 870 yr B.P., and the present. Faults were mapped on each line and offset was calculated across the key horizons. Our results indicate a system of normal-component faults in the northern lake sector, striking subparallel to the mapped Troncoso Fault southwest of the lake. These faults correlate with prominent magnetic lineations mapped by boat magnetic data acquired in February 2016, which are interpreted as dykes intruding along faults. We also imaged a vertical fault, interpreted as a strike-slip fault, and a series of normal faults in the SW lake sector near the center of magmatic inflation. Isochron and fault offset maps illuminate areas of growth strata and indicate migration and an increase of fault activity from south to north through time. We identify a domal structure in the SW lake sector, coincident with an area of low magnetization, in the region of maximum deformation from InSAR results. The dome experienced 10 ms TWT (∼10 m) of uplift over the past 16 kyr, which we interpret as magmatic inflation in a shallow magma reservoir. This inflation is isolated to a 1.5-km-diameter region in the hanging wall of the primary normal fault system, indicating possible fault-facilitated inflation.
Tools for Evaluating Fault Detection and Diagnostic Methods for HVAC Secondary Systems
NASA Astrophysics Data System (ADS)
Pourarian, Shokouh
Although modern buildings are using increasingly sophisticated energy management and control systems that have tremendous control and monitoring capabilities, building systems routinely fail to perform as designed. More advanced building control, operation, and automated fault detection and diagnosis (AFDD) technologies are needed to achieve the goal of net-zero energy commercial buildings. Much effort has been devoted to developing such technologies for primary heating, ventilating, and air conditioning (HVAC) systems and some secondary systems. However, secondary systems such as fan coil units and dual duct systems, although widely used in commercial, industrial, and multifamily residential buildings, have received very little attention. This research study aims at developing tools that provide simulation capabilities to develop and evaluate advanced control, operation, and AFDD technologies for these less studied secondary systems. In this study, HVACSIM+ is selected as the simulation environment. Besides developing dynamic models for the above-mentioned secondary systems, two other issues related to the HVACSIM+ environment are also investigated. One issue is the nonlinear equation solver used in HVACSIM+ (Powell's Hybrid method in subroutine SNSQ). It has been found from several previous research projects (ASHRAE RP 825 and RP 1312) that SNSQ is especially unstable at the beginning of a simulation and sometimes unable to converge to a solution. Another issue is related to the zone model in the HVACSIM+ library of components. Dynamic simulation of secondary HVAC systems unavoidably requires a zone model that interacts dynamically with the building's surroundings. The accuracy and reliability of the building zone model therefore affects the operational data generated by the developed dynamic tool to predict secondary HVAC system behavior.
The available model does not simulate the impact of direct solar radiation that enters a zone through glazing, and the study of the zone model is conducted in this direction to modify the existing zone model. In this research project, the following tasks are completed and summarized in this report: 1. Develop dynamic simulation models in the HVACSIM+ environment for common fan coil unit and dual duct system configurations; the developed simulation models are able to produce both fault-free and faulty operational data under a wide variety of faults and severity levels for advanced control, operation, and AFDD technology development and evaluation purposes. 2. Develop a model structure, which includes the grouping of blocks and superblocks, treatment of state variables, initial and boundary conditions, and selection of equation solver, that can simulate a dual duct system efficiently with satisfactory stability. 3. Design and conduct a comprehensive and systematic validation procedure using collected experimental data to validate the developed simulation models under both fault-free and faulty operational conditions. 4. Conduct a numerical study to compare two solution techniques, Powell's Hybrid (PH) and Levenberg-Marquardt (LM), in terms of their robustness and accuracy. 5. Modify the thermal state calculation of the existing building zone model in the HVACSIM+ library of components; this component is revised to consider the heat transmitted through glazing as a heat source for transient building zone load prediction. In this report, the literature, including existing HVAC dynamic modeling environments and models, HVAC model validation methodologies, and fault modeling and validation methodologies, is reviewed. The overall methodologies used for fault-free and fault model development and validation are introduced. Detailed model development and validation results for the two secondary systems, i.e., the fan coil unit and dual duct system, are summarized.
Experimental data, mostly from the Iowa Energy Center Energy Resource Station, are used to validate the models developed in this project. Satisfactory model performance in both fault-free and fault simulation studies is observed for all studied systems.
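The PH-versus-LM solver comparison in task 4 can be reproduced in miniature with SciPy, whose `hybr` method wraps the same MINPACK lineage as HVACSIM+'s SNSQ. The two-equation system below is an illustrative stand-in for an HVAC component network, not a model from the study:

```python
import numpy as np
from scipy.optimize import root, least_squares

def residuals(x):
    """A tiny nonlinear system standing in for coupled component balances."""
    return [x[0] ** 2 + x[1] ** 2 - 1.0, x[0] - x[1]]

# Powell's Hybrid method (MINPACK hybrd, the family SNSQ belongs to)
sol_ph = root(residuals, x0=[1.0, 0.0], method='hybr')

# Levenberg-Marquardt, the alternative solution technique compared in task 4
sol_lm = least_squares(residuals, x0=[1.0, 0.0], method='lm')
```

Both solvers should converge here; the study's point is that their robustness differs on stiff, poorly initialized HVAC networks, which a comparison loop over harder start points would expose.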
McLaughlin, Robert J.; Sarna-Wojcicki, Andrei M.; Wagner, David L.; Fleck, Robert J.; Langenheim, V.E.; Jachens, Robert C.; Clahan, Kevin; Allen, James R.
2012-01-01
The Rodgers Creek–Maacama fault system in the northern California Coast Ranges (United States) takes up substantial right-lateral motion within the wide transform boundary between the Pacific and North American plates, over a slab window that has opened northward beneath the Coast Ranges. The fault system evolved in several right steps and splays preceded and accompanied by extension, volcanism, and strike-slip basin development. Fault and basin geometries have changed with time, in places with younger basins and faults overprinting older structures. Along-strike and successional changes in fault and basin geometry at the southern end of the fault system probably are adjustments to frequent fault zone reorganizations in response to Mendocino Triple Junction migration and northward transit of a major releasing bend in the northern San Andreas fault. The earliest Rodgers Creek fault zone displacement is interpreted to have occurred ca. 7 Ma along extensional basin-forming faults that splayed northwest from a west-northwest proto-Hayward fault zone, opening a transtensional basin west of Santa Rosa. After ca. 5 Ma, the early transtensional basin was compressed and extensional faults were reactivated as thrusts that uplifted the northeast side of the basin. After ca. 2.78 Ma, the Rodgers Creek fault zone again splayed from the earlier extensional and thrust faults to steeper dipping faults with more north-northwest orientations. In conjunction with the changes in orientation and slip mode, the Rodgers Creek fault zone dextral slip rate increased from ∼2–4 mm/yr during 7–3 Ma to 5–8 mm/yr after 3 Ma. The Maacama fault zone is shown from several data sets to have initiated ca. 3.2 Ma and has slipped right-laterally at ∼5–8 mm/yr since its initiation.
The initial Maacama fault zone splayed northeastward from the south end of the Rodgers Creek fault zone, accompanied by the opening of several strike-slip basins, some of which were later uplifted and compressed during late-stage fault zone reorganization. The Santa Rosa pull-apart basin formed ca. 1 Ma, during the reorganization of the right stepover geometry of the Rodgers Creek–Maacama fault system, when the maturely evolved overlapping geometry of the northern Rodgers Creek and Maacama fault zones was overprinted by a less evolved, non-overlapping stepover geometry. The Rodgers Creek–Maacama fault system has contributed at least 44–53 km of right-lateral displacement to the East Bay fault system south of San Pablo Bay since 7 Ma, at a minimum rate of 6.1–7.8 mm/yr.
Clendenin, C.W.; Diehl, S.F.
1999-01-01
A pronounced, subparallel set of northeast-striking faults occurs in southeastern Missouri, but little is known about these faults because of poor exposure. The Commerce fault system is the southernmost exposed fault system in this set and has an ancestry related to Reelfoot rift extension. Recent published work indicates that this fault system has a long history of reactivation. The northeast-striking Grays Point fault zone is a segment of the Commerce fault system and is well exposed along the southeast rim of an inactive quarry. Our mapping shows that the Grays Point fault zone also has a complex history of polyphase reactivation, involving three periods of Paleozoic reactivation that occurred in Late Ordovician, Devonian, and post-Mississippian time. Each period is characterized by divergent, right-lateral oblique-slip faulting. Petrographic examination of sidewall rip-out clasts in calcite-filled faults associated with the Grays Point fault zone supports a minimum of three periods of right-lateral oblique-slip. The reported observations imply that a genetic link exists between intracratonic fault reactivation and strain produced by Paleozoic orogenies affecting the eastern margin of Laurentia (North America). Interpretation of this link indicates that right-lateral oblique-slip has occurred on all of the northeast-striking faults in southeastern Missouri as a result of strain influenced by the convergence directions of the different Paleozoic orogenies.
Geomorphic controls on riparian meadows in the Central Great Basin of Nevada are an important aspect in determining the formation of and planning the management of these systems. The current hypothesis is that both alluvial fan sediment and faulted bedrock steps interact to cont...
Autonomous Cryogenics Loading Operations Simulation Software: Knowledgebase Autonomous Test Engineer
NASA Technical Reports Server (NTRS)
Wehner, Walter S.
2012-01-01
The simulation software KATE (Knowledgebase Autonomous Test Engineer) is used to demonstrate the automatic identification of faults in a system. The ACLO (Autonomous Cryogenics Loading Operation) project uses KATE to monitor and find faults in the loading of cryogenics into a vehicle fuel tank. The KATE software interfaces with the IHM (Integrated Health Management) systems bus to communicate with other systems that are part of ACLO. One system that KATE uses the IHM bus to communicate with is AIS (Advanced Inspection System). KATE will send messages to AIS when there is a detected anomaly. These messages include requests for visual inspection of specific valves and pressure gauges, and control messages to have AIS open or close manual valves. My goals include implementing the connection to the IHM bus within KATE and for the AIS project. I will also be working on implementing changes to KATE's UI and implementing the physics objects in KATE that will model portions of the cryogenics loading operation.
Provable Transient Recovery for Frame-Based, Fault-Tolerant Computing Systems
NASA Technical Reports Server (NTRS)
DiVito, Ben L.; Butler, Ricky W.
1992-01-01
We present a formal verification of the transient fault recovery aspects of the Reliable Computing Platform (RCP), a fault-tolerant computing system architecture for digital flight control applications. The RCP uses NMR-style redundancy to mask faults and internal majority voting to purge the effects of transient faults. The system design has been formally specified and verified using the EHDM verification system. Our formalization accommodates a wide variety of voting schemes for purging the effects of transients.
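The internal majority voting that masks faults in the RCP can be sketched generically. This is the textbook NMR voter, not the formally specified EHDM design verified in the paper:

```python
from collections import Counter

def majority_vote(replicas):
    """NMR-style voter: the value agreed on by a strict majority of redundant
    channels masks a faulty (including transiently faulty) minority."""
    value, count = Counter(replicas).most_common(1)[0]
    if count > len(replicas) // 2:
        return value
    raise RuntimeError("no majority: too many simultaneous faults")
```

Repeated voting rounds are what purge a transient: once the affected channel recomputes from voted state, its output rejoins the majority.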
Automatic Detection of Electric Power Troubles (ADEPT)
NASA Technical Reports Server (NTRS)
Wang, Caroline; Zeanah, Hugh; Anderson, Audie; Patrick, Clint; Brady, Mike; Ford, Donnie
1988-01-01
ADEPT is an expert system that integrates knowledge from three different suppliers to offer an advanced fault-detection system, and is designed for two modes of operation: real-time fault isolation and simulated modeling. Real-time fault isolation of components is accomplished on a power system breadboard through the Fault Isolation Expert System (FIES II) interface with a rule system developed in-house. Faults are quickly detected and displayed, and the rules and chain of reasoning are optionally provided on a laser printer. The system consists of a simulated Space Station power module using direct-current power supplies for solar arrays on three power busses. For tests of the system's ability to locate faults inserted via switches, loads are configured by an INTEL microcomputer and the Symbolics artificial intelligence development system. As these loads are resistive in nature, Ohm's Law is used as the basis for rules by which faults are located. The three-bus system can correct faults automatically where there is a surplus of power available on any of the three busses. Techniques developed and used here can be applied readily to other control systems requiring rapid intelligent decisions. Simulated modeling, used for theoretical studies, is implemented using a modified version of Kennedy Space Center's KATE (Knowledge-Based Automatic Test Equipment), FIES II windowing, and an ADEPT knowledge base. A load scheduler and a fault recovery system are currently under development to support both modes of operation.
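An Ohm's-Law rule of the kind described can be sketched as follows. The bus voltage, load table, and tolerance below are illustrative, not the actual FIES II rule base:

```python
def locate_faults(bus_voltage, loads, tol=0.05):
    """Ohm's-Law rule base for resistive loads: the expected current through
    each load is V/R; a measured current outside tol relative error flags
    that load as faulty.

    loads maps load name -> (resistance_ohms, measured_current_amps)."""
    faulty = []
    for name, (r_ohms, i_meas) in loads.items():
        i_expect = bus_voltage / r_ohms
        if abs(i_meas - i_expect) > tol * i_expect:
            faulty.append(name)
    return faulty
```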
Simple Random Sampling-Based Probe Station Selection for Fault Detection in Wireless Sensor Networks
Huang, Rimao; Qiu, Xuesong; Rui, Lanlan
2011-01-01
Fault detection for wireless sensor networks (WSNs) has been studied intensively in recent years. Most existing works statically choose the manager nodes as probe stations and probe the network at a fixed frequency. This straightforward solution leads, however, to several deficiencies. Firstly, by assigning the fault detection task only to the manager node, the whole network is out of balance, and this quickly overloads the already heavily burdened manager node, which in turn ultimately shortens the lifetime of the whole network. Secondly, probing with a fixed frequency often generates too much useless network traffic, which results in a waste of the limited network energy. Thirdly, the traditional algorithm for choosing a probing node is too complicated to be used in energy-critical wireless sensor networks. In this paper, we study the distribution characteristics of the faulty nodes in wireless sensor networks and validate the Pareto principle that a small number of clusters contain most of the faults. We then present a Simple Random Sampling-based algorithm to dynamically choose sensor nodes as probe stations. A dynamic adjusting rule for the probing frequency is also proposed to reduce the number of useless probing packets. The simulation experiments demonstrate that the algorithm and adjusting rule we present can effectively prolong the lifetime of a wireless sensor network without decreasing the fault detection rate. PMID:22163789
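The probe-station selection and frequency-adjustment ideas can be sketched in a few lines. The halving/doubling rule and interval bounds are illustrative assumptions, not the paper's exact adjusting rule:

```python
import random

def choose_probe_stations(candidate_nodes, sample_fraction, seed=None):
    """Simple random sampling of candidate nodes to serve as probe stations,
    instead of statically burdening the single manager node."""
    rng = random.Random(seed)
    k = max(1, round(sample_fraction * len(candidate_nodes)))
    return rng.sample(candidate_nodes, k)

def next_interval(interval_s, faults_found, lo=5.0, hi=300.0):
    """Back off probing while the network is healthy; probe faster after a
    fault, keeping the interval within [lo, hi] seconds."""
    interval_s = interval_s / 2 if faults_found else interval_s * 2
    return min(max(interval_s, lo), hi)
```

Re-drawing the sample each probing round spreads the energy cost of probing across the network over time.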
Jeon, Namju; Lee, Hyeongcheol
2016-12-12
An integrated fault-diagnosis algorithm for a motor sensor of in-wheel independent drive electric vehicles is presented. This paper proposes a method that integrates high- and low-level fault diagnoses to improve the robustness and performance of the system. For the high-level fault diagnosis of vehicle dynamics, a planar two-track non-linear model is first selected, and the longitudinal and lateral forces are calculated. To ensure redundancy of the system, correlation between the sensors and residuals in the vehicle dynamics is analyzed to detect and isolate faults of the drive motor system of each wheel. To diagnose the motor system for low-level faults, the state equation of an interior permanent magnet synchronous motor is developed, and a parity equation is used to diagnose faults of the electric current and position sensors. The validity of the high-level fault-diagnosis algorithm is verified using CarSim and Matlab/Simulink co-simulation. The low-level fault diagnosis is verified through Matlab/Simulink simulation and experiments. Finally, according to the residuals of the high- and low-level fault diagnoses, fault-detection flags are defined. On the basis of this information, an integrated fault-diagnosis strategy is proposed.
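A minimal parity-equation check of the kind used for the low-level diagnosis can be sketched on a generic discrete-time state-space model. The matrices below are placeholders, not the interior-PMSM model from the paper:

```python
import numpy as np

def parity_residual(A, B, C, x, u, y_next_meas):
    """One-step parity check: propagate the model state x_{k+1} = A x + B u,
    compare the predicted output C x_{k+1} against the measurement. A residual
    beyond noise level on a channel points at that sensor being faulty."""
    x_next = A @ x + B @ u
    return y_next_meas - C @ x_next, x_next

def sensor_fault_flags(residual, threshold):
    """Per-channel fault-detection flags from the residual magnitudes."""
    return np.abs(residual) > threshold
```

In the paper's strategy, these low-level flags are then combined with the high-level vehicle-dynamics residuals to decide which wheel's motor system is at fault.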
Neotectonics of interior Alaska and the late Quaternary slip rate along the Denali fault system
Haeussler, Peter J.; Matmon, Ari; Schwartz, David P.; Seitz, Gordon G.
2017-01-01
The neotectonics of southern Alaska (USA) are characterized by a several-hundred-kilometer-wide zone of dextral transpression that spans the Alaska Range. The Denali fault system is the largest active strike-slip fault system in interior Alaska, and it produced a Mw 7.9 earthquake in 2002. To evaluate the late Quaternary slip rate on the Denali fault system, we collected samples for cosmogenic surface exposure dating from surfaces offset by the fault system. This study includes data from 107 samples at 19 sites, including 7 sites we previously reported, as well as an estimated slip rate at another site. We utilize the interpreted surface ages to provide estimated slip rates. These new slip rate data confirm that the highest late Quaternary slip rate is ∼13 mm/yr on the central Denali fault near its intersection with the eastern Denali and the Totschunda faults, with decreasing slip rate both to the east and west. The slip rate decreases westward along the central and western parts of the Denali fault system to 5 mm/yr over a length of ∼575 km. An additional site on the eastern Denali fault near Kluane Lake, Yukon, implies a slip rate of ∼2 mm/yr, based on geological considerations. The Totschunda fault has a maximum slip rate of ∼9 mm/yr. The Denali fault system is transpressional, and there are active thrust faults on both its north and south sides. We explore four geometric models for southern Alaska tectonics to explain the slip rates along the Denali fault system and the active fault geometries: rotation, indentation, extrusion, and a combination of the three. We conclude that all three end-member models have strengths and shortcomings, and a combination of rotation, indentation, and extrusion best explains the slip rate observations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Binh T. Pham; Nancy J. Lybeck; Vivek Agarwal
The Light Water Reactor Sustainability program at Idaho National Laboratory is actively conducting research to develop and demonstrate online monitoring capabilities for active components in existing nuclear power plants. Idaho National Laboratory and the Electric Power Research Institute are working jointly to implement a pilot project to apply these capabilities to emergency diesel generators and generator step-up transformers. The Electric Power Research Institute Fleet-Wide Prognostic and Health Management Software Suite will be used to implement monitoring in conjunction with utility partners: Braidwood Generating Station (owned by Exelon Corporation) for emergency diesel generators, and Shearon Harris Nuclear Generating Station (owned by Duke Energy Progress) for generator step-up transformers. This report presents monitoring techniques, fault signatures, and diagnostic and prognostic models for emergency diesel generators. Emergency diesel generators provide backup power to the nuclear power plant, allowing operation of essential equipment such as pumps in the emergency core coolant system during catastrophic events, including loss of offsite power. Technical experts from Braidwood are assisting Idaho National Laboratory and the Electric Power Research Institute in identifying critical faults and defining fault signatures associated with each fault. The resulting diagnostic models will be implemented in the Fleet-Wide Prognostic and Health Management Software Suite and tested using data from Braidwood. Parallel research on generator step-up transformers was summarized in an interim report during the fourth quarter of fiscal year 2012.
Nickel cadmium battery expert system
NASA Technical Reports Server (NTRS)
1986-01-01
The applicability of artificial intelligence methodologies for the automation of energy storage management, in this case, nickel cadmium batteries, is demonstrated. With the Hubble Space Telescope Electrical Power System (HST/EPS) testbed as the application domain, an expert system was developed which incorporates the physical characterization of the EPS, in particular, the nickel cadmium batteries, as well as the human's operational knowledge. The expert system returns not only fault diagnostics but also status and advice along with justifications and explanations in the form of decision support.
Vehicle fault diagnostics and management system
NASA Astrophysics Data System (ADS)
Gopal, Jagadeesh; Gowthamsachin
2017-11-01
This project applies a kind of advanced automatic identification technology that is increasingly widely used in the fields of transportation and logistics. It covers main functions such as vehicle management and vehicle speed limiting and control. The system starts with an authentication process to keep itself secure. We connect sensors to the STM32 board, which in turn is connected to the car through an Ethernet cable, as Ethernet is capable of sending large amounts of data at high speeds. The technology involved clearly shows how a careful combination of software and hardware can produce an extremely cost-effective solution to a problem.
Beard, Sue; Campagna, David J.; Anderson, R. Ernest
2010-01-01
The Lake Mead fault system is a northeast-striking, 130-km-long zone of left-slip in the southeast Great Basin, active from before 16 Ma to Quaternary time. The northeast end of the Lake Mead fault system in the Virgin Mountains of southeast Nevada and northwest Arizona forms a partitioned strain field comprising kinematically linked northeast-striking left-lateral faults, north-striking normal faults, and northwest-striking right-lateral faults. Major faults bound large structural blocks whose internal strain reflects their position within a left step-over of the left-lateral faults. Two north-striking large-displacement normal faults, the Lakeside Mine segment of the South Virgin–White Hills detachment fault and the Piedmont fault, intersect the left step-over from the southwest and northeast, respectively. The left step-over in the Lake Mead fault system therefore corresponds to a right-step in the regional normal fault system. Within the left step-over, displacement transfer between the left-lateral faults and linked normal faults occurs near their junctions, where the left-lateral faults become oblique and normal fault displacement decreases away from the junction. Southward from the center of the step-over in the Virgin Mountains, down-to-the-west normal faults splay northward from left-lateral faults, whereas north and east of the center, down-to-the-east normal faults splay southward from left-lateral faults. Minimum slip is thus in the central part of the left step-over, between east-directed slip to the north and west-directed slip to the south. 
Attenuation faults parallel or subparallel to bedding cut Lower Paleozoic rocks and are inferred to be early structures that accommodated footwall uplift during the initial stages of extension. Fault-slip data indicate oblique extensional strain within the left step-over in the South Virgin Mountains, manifested as east-west extension; shortening is partitioned between vertical for extension-dominated structural blocks and south-directed for strike-slip faults. Strike-slip faults are oblique to the extension direction due to structural inheritance from NE-striking fabrics in Proterozoic crystalline basement rocks. We hypothesize that (1) during early phases of deformation oblique extension was partitioned to form east-west–extended domains bounded by left-lateral faults of the Lake Mead fault system, from ca. 16 to 14 Ma. (2) Beginning ca. 13 Ma, increased south-directed shortening impinged on the Virgin Mountains and forced uplift, faulting, and overturning along the north and west side of the Virgin Mountains. (3) By ca. 10 Ma, initiation of the younger Hen Spring to Hamblin Bay fault segment of the Lake Mead fault system accommodated westward tectonic escape, and the focus of south-directed shortening transferred to the western Lake Mead region. The shift from early partitioned oblique extension to south-directed shortening may have resulted from initiation of right-lateral shear of the eastern Walker Lane to the west coupled with left-lateral shear along the eastern margin of the Great Basin.
Slip distribution, strain accumulation and aseismic slip on the Chaman Fault system
NASA Astrophysics Data System (ADS)
Amelug, F.
2015-12-01
The Chaman fault system is a transcurrent fault system that developed due to the oblique convergence of the India and Eurasia plates along the western boundary of the India plate. To evaluate the contemporary rates of strain accumulation along and across the Chaman fault system, we use 2003-2011 Envisat SAR imagery and InSAR time-series methods to obtain a ground velocity field in the radar line-of-sight (LOS) direction. We correct the InSAR data for different sources of systematic bias, including phase unwrapping errors, local oscillator drift, topographic residuals, and stratified tropospheric delay, and evaluate the uncertainty due to the residual delay using time series of MODIS observations of precipitable water vapor. The InSAR velocity field and modeling demonstrate the distribution of deformation across the Chaman fault system. In the central Chaman fault system, the InSAR velocity shows clear strain localization on the Chaman and Ghazaband faults, and modeling suggests a total slip rate of ~24 mm/yr distributed between the two faults at rates of 8 and 16 mm/yr, respectively. This corresponds to 80% of the total ~3 cm/yr plate motion between India and Eurasia at these latitudes and is consistent with kinematic models that predict a slip rate of ~17-24 mm/yr for the Chaman fault. In the northern Chaman fault system (north of 30.5°N), ~6 mm/yr of the relative plate motion is accommodated across the Chaman fault. North of 30.5°N, where the topographic expression of the Ghazaband fault vanishes, its slip does not transfer to the Chaman fault but rather distributes among different faults in the Kirthar Range and Sulaiman lobe. Observed surface creep on the southern Chaman fault, between Nushki and north of the city of Chaman, indicates that the fault is partially locked, consistent with the M<7 earthquakes recorded on this segment in the last century. 
The Chaman fault between north of the city of Chaman and north of Kabul does not show an increase in the rate of strain accumulation; however, the lack of seismicity on this segment presents a significant hazard to Kabul. The high rate of strain accumulation on the Ghazaband fault and the lack of evidence for its rupture during the 1935 Quetta earthquake present a growing earthquake hazard to Balochistan and populated areas such as the city of Quetta.
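Interseismic InSAR velocity profiles like these are commonly interpreted with the Savage-Burford screw-dislocation model, in which surface velocity varies as an arctangent of distance from a fault locked above some depth. A sketch of that standard model; the two-fault spacing, rates, and locking depth below are illustrative assumptions, not values from the study:

```python
import math

def interseismic_velocity(x_km, slip_rate_mm_yr, locking_depth_km):
    """Fault-parallel surface velocity at distance x from a strike-slip
    fault locked down to depth D (Savage-Burford screw dislocation):
    v(x) = (s / pi) * atan(x / D)."""
    return (slip_rate_mm_yr / math.pi) * math.atan(x_km / locking_depth_km)

def two_fault_profile(x_km, separation_km=80.0, locking_depth_km=15.0):
    """Superposed profiles for two parallel locked faults slipping at
    8 and 16 mm/yr (hypothetical geometry and rates)."""
    return (interseismic_velocity(x_km, 8.0, locking_depth_km)
            + interseismic_velocity(x_km - separation_km, 16.0, locking_depth_km))
```

Far from both faults the combined profile approaches half the total 24 mm/yr on each side; it is the shape of the velocity gradient across each fault that lets the individual rates be separated.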
NASA Technical Reports Server (NTRS)
1990-01-01
The present conference on digital avionics discusses vehicle-management systems, spacecraft avionics, special vehicle avionics, communication/navigation/identification systems, software qualification and quality assurance, launch-vehicle avionics, Ada applications, sensor and signal processing, general aviation avionics, automated software development, design-for-testability techniques, and avionics-software engineering. Also discussed are optical technology and systems, modular avionics, fault-tolerant avionics, commercial avionics, space systems, data buses, crew-station technology, embedded processors and operating systems, AI and expert systems, data links, and pilot/vehicle interfaces.
NASA Astrophysics Data System (ADS)
Zhao, H.; Wu, L.; Xiao, A.
2016-12-01
We present a detailed structural analysis of the fault geometry and Cenozoic development of the Dongping area, northwestern Qaidam Basin, based on precise 3-D seismic interpretation, remote sensing images, and seismic attribute analysis. Two conflicting fault systems with different orientations (EW-striking and NNW-striking) and opposing senses of shear are recognized and discussed, and the interaction between them provides new insights into the intracontinental deformation of the Qaidam Basin within the NE Tibetan Plateau. The EW-striking fault system constitutes the south part of the Altyn left-slip positive flower structure. Faulting on the EW-striking faults has dominated the northwestern Qaidam Basin since ~40 Ma in response to the inception of the Altyn Tagh fault system as a ductile shear zone, tilting the south slope of the Altyn Tagh, whereas the NNW-striking fault system has been the dominant structure since the mid-Miocene (~15 Ma), induced by large-scale strike-slip motion on the Altyn Tagh fault, which led to NE-SW directed compression of the Qaidam Basin. This evidently implies a structural conversion within the NE Tibetan Plateau since the mid-Miocene (~15 Ma). Interestingly, the preexisting faults possibly restrained the development of the later faults, while the latter tended to track and link to the former fault traces. Taking the large-scale sinistral strike-slip East Kunlun fault system into account, the late Cenozoic intracontinental deformation in the Qaidam Basin, which shows a dextral transpressional character, is suggested to be the consequence of the combined effect of its two bordering sinistral strike-slip faults, which furthermore favors a continuous, lateral-extrusion mechanism for the growth of the NE Tibetan Plateau.
Clustering of GPS velocities in the Mojave Block, southeastern California
NASA Astrophysics Data System (ADS)
Savage, J. C.; Simpson, R. W.
2013-04-01
We find subdivisions within the Mojave Block using cluster analysis to identify groupings in the velocities observed at GPS stations there. The clusters are represented on a fault map by symbols located at the positions of the GPS stations, each symbol representing the cluster to which the velocity of that GPS station belongs. Fault systems that separate the clusters are readily identified on such a map. The most significant representation, as judged by the gap test, involves four clusters within the Mojave Block. The fault systems bounding the clusters from east to west are 1) the faults defining the eastern boundary of the Northeast Mojave Domain extended southward to connect to the Hector Mine rupture, 2) the Calico-Paradise fault system, 3) the Landers-Blackwater fault system, and 4) the Helendale-Lockhart fault system. This division of the Mojave Block is very similar to that proposed by Meade and Hager []. However, no cluster boundary coincides with the Garlock Fault, the northern boundary of the Mojave Block. Rather, the clusters appear to continue without interruption from the Mojave Block north into the southern Walker Lane Belt, similar to the continuity across the Garlock Fault of the shear zone along the Blackwater-Little Lake fault system observed by Peltzer et al. []. Mapped traces of individual faults in the Mojave Block terminate within the block and do not continue across the Garlock Fault [Dokka and Travis, ].
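The clustering step can be sketched with a minimal k-means over 2-D velocity vectors; the paper's actual method and its gap test may differ, and the station velocities below are synthetic:

```python
def kmeans(points, k, iters=50):
    """Minimal k-means for 2-D velocity vectors (illustrative, not the
    paper's clustering or gap-test machinery)."""
    # Deterministic initialization: spread seed centers across the list.
    centers = [points[i * (len(points) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            groups[i].append(p)
        centers = [(sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
                   if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

# Two synthetic velocity populations (east, north components in mm/yr),
# standing in for stations on opposite sides of a fault system:
pts = [(0.1 * i, 2.0) for i in range(5)] + [(0.1 * i, 8.0) for i in range(5)]
centers, groups = kmeans(pts, 2)
print(sorted(round(c[1], 1) for c in centers))  # [2.0, 8.0]
```

Plotting each station with a symbol for its cluster, as the abstract describes, then makes the bounding fault systems visible as the boundaries between symbol populations.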
Data Management Working Group report
NASA Technical Reports Server (NTRS)
Filardo, Edward J.; Smith, David B.
1986-01-01
The current flight qualification program lags technology insertion by 6 to 10 years. The objective is to develop an integrated software engineering and development environment assisted by expert system technology. An operating system needs to be developed that is portable to the on-board computers of the year 2000. The use of Ada versus a high-order language; fault tolerance; fiber-optic networks; communication protocols; and security are also examined and outlined.
Reliability issues in active control of large flexible space structures
NASA Technical Reports Server (NTRS)
Vandervelde, W. E.
1986-01-01
Efforts in this reporting period were centered on four research tasks: design of failure detection filters for robust performance in the presence of modeling errors, design of generalized parity relations for robust performance in the presence of modeling errors, design of failure-sensitive observers using the geometric system theory of Wonham, and computational techniques for evaluation of the performance of control systems with fault tolerance and redundancy management.
A resource management architecture based on complex network theory in cloud computing federation
NASA Astrophysics Data System (ADS)
Zhang, Zehua; Zhang, Xuejie
2011-10-01
Cloud Computing Federation is a main trend in Cloud Computing. Resource management has a significant effect on the design, realization, and efficiency of a Cloud Computing Federation. A Cloud Computing Federation has the typical characteristics of a complex system; therefore, in this paper we propose a resource management architecture based on complex network theory for Cloud Computing Federation (abbreviated as RMABC), with a detailed design of the resource discovery and resource announcement mechanisms. Compared with existing resource management mechanisms in distributed computing systems, a Task Manager in RMABC can use historical information and current state data obtained from other Task Managers for the evolution of the complex network composed of Task Managers, and thus has advantages in resource discovery speed, fault tolerance, and adaptive ability. The results of the model experiment confirmed the advantages of RMABC in resource discovery performance.
Timing of activity of two fault systems on Mercury
NASA Astrophysics Data System (ADS)
Galluzzi, V.; Guzzetta, L.; Giacomini, L.; Ferranti, L.; Massironi, M.; Palumbo, P.
2015-10-01
Here we discuss two fault systems found in the Victoria and Shakespeare quadrangles of Mercury. The two fault sets intersect each other and show probable evidence for two stages of deformation. The most prominent system is N-S oriented and comprises easily recognizable fault segments several tens to hundreds of kilometers long. The other system strikes NE-SW and comprises mostly degraded, short fault segments. The structural framework of the studied area and the morphological appearance of the faults suggest that the second system is older than the first. We intend to apply the buffered crater counting technique to both systems to make a quantitative study of their timing of activity, which could confirm the already clear morphological evidence.
Modeling and Fault Simulation of Propellant Filling System
NASA Astrophysics Data System (ADS)
Jiang, Yunchun; Liu, Weidong; Hou, Xiaobo
2012-05-01
The propellant filling system is one of the key ground plants at launch sites for rockets that use liquid propellant. There is an urgent demand to ensure and improve its reliability and safety, and Failure Mode and Effects Analysis (FMEA) is a good approach to meeting it. Driven by the need for more fault information for FMEA, and because of the high expense of propellant filling, in this paper the working process of the propellant filling system under fault conditions was studied by simulation based on AMESim. First, based on an analysis of its structure and function, the filling system was decomposed into modules and the mathematical model of each module was given, on the basis of which the whole filling system was modeled in AMESim. Second, a general method of injecting faults into a dynamic system was proposed, and as an example two typical faults, leakage and blockage, were injected into the model of the filling system, yielding two fault models in AMESim. After that, fault simulation was performed and the dynamic characteristics of several key parameters were analyzed under fault conditions. The results show that the model can effectively simulate the two faults and can be used to provide guidance for maintenance and improvement of the filling system.
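The fault-injection idea, running the nominal dynamic model and switching a parameter at a chosen fault time, can be illustrated outside AMESim with a toy Euler simulation of a tank being filled; all parameters here are invented for illustration:

```python
def simulate_filling(t_end=100.0, dt=0.1, fault=None, t_fault=50.0):
    """Forward-Euler model of a tank filled at constant pump flow.
    fault: None, 'leakage' (an extra outflow opens), or 'blockage'
    (the inflow is restricted) -- injected at t_fault, mimicking the
    two fault types named in the abstract (parameters invented here)."""
    level, q_in, q_leak = 0.0, 1.0, 0.0
    history = []
    for step in range(int(t_end / dt)):
        t = step * dt
        if fault == 'leakage' and t >= t_fault:
            q_leak = 0.4           # leak flow appears downstream
        elif fault == 'blockage' and t >= t_fault:
            q_in = 0.3             # pump flow restricted by blockage
        level += (q_in - q_leak) * dt
        history.append((t, level))
    return history

final = {f: simulate_filling(fault=f)[-1][1] for f in (None, 'leakage', 'blockage')}
print(final)  # nominal ~100, leakage ~80, blockage ~65
```

Comparing the faulted trajectories against the nominal one is exactly the kind of fault signature information FMEA needs when real filling experiments are too expensive to run.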
Structural superposition in fault systems bounding Santa Clara Valley, California
Graymer, Russell W.; Stanley, Richard G.; Ponce, David A.; Jachens, Robert C.; Simpson, Robert W.; Wentworth, Carl M.
2015-01-01
Santa Clara Valley is bounded on the southwest and northeast by active strike-slip and reverse-oblique faults of the San Andreas fault system. On both sides of the valley, these faults are superposed on older normal and/or right-lateral normal oblique faults. The older faults comprised early components of the San Andreas fault system as it formed in the wake of the northward passage of the Mendocino Triple Junction. On the east side of the valley, the great majority of fault displacement was accommodated by the older faults, which were almost entirely abandoned when the presently active faults became active after ca. 2.5 Ma. On the west side of the valley, the older faults were abandoned earlier, before ca. 8 Ma and probably accumulated only a small amount, if any, of the total right-lateral offset accommodated by the fault zone as a whole. Apparent contradictions in observations of fault offset and the relation of the gravity field to the distribution of dense rocks at the surface are explained by recognition of superposed structures in the Santa Clara Valley region.
Goal-Function Tree Modeling for Systems Engineering and Fault Management
NASA Technical Reports Server (NTRS)
Patterson, Jonathan D.; Johnson, Stephen B.
2013-01-01
The draft NASA Fault Management (FM) Handbook (2012) states that Fault Management (FM) is a "part of systems engineering", and that it "demands a system-level perspective" (NASA-HDBK-1002, 7). What, exactly, is the relationship between systems engineering and FM? To NASA, systems engineering (SE) is "the art and science of developing an operable system capable of meeting requirements within often opposed constraints" (NASA/SP-2007-6105, 3). Systems engineering starts with the elucidation and development of requirements, which set the goals that the system is to achieve. To achieve these goals, the systems engineer typically defines functions, and the functions in turn are the basis for design trades to determine the best means to perform the functions. System Health Management (SHM), by contrast, defines "the capabilities of a system that preserve the system's ability to function as intended" (Johnson et al., 2011, 3). Fault Management, in turn, is the operational subset of SHM, which detects current or future failures, and takes operational measures to prevent or respond to these failures. Failure, in turn, is the "unacceptable performance of intended function." (Johnson 2011, 605) Thus the relationship of SE to FM is that SE defines the functions and the design to perform those functions to meet system goals and requirements, while FM detects the inability to perform those functions and takes action. SHM and FM are in essence "the dark side" of SE. For every function to be performed (SE), there is the possibility that it is not successfully performed (SHM); FM defines the means to operationally detect and respond to this lack of success. We can also describe this in terms of goals: for every goal to be achieved, there is the possibility that it is not achieved; FM defines the means to operationally detect and respond to this inability to achieve the goal. 
This brief description of the relationships between SE, SHM, and FM provides hints toward a modeling approach that formally connects the nominal (SE) and off-nominal (SHM and FM) aspects of functions and designs. This paper describes a formal modeling approach to the initial phases of the development process that integrates the nominal and off-nominal perspectives in a model that unites SE goals and functions with the failure to achieve those goals and functions (SHM/FM). This methodology and corresponding model, known as a Goal-Function Tree (GFT), provides a means to represent, decompose, and elaborate system goals and functions in a rigorous manner that connects directly to design through use of state variables that translate natural language requirements and goals into logical-physical state language. The state variable-based approach also provides the means to directly connect FM to the design, by specifying the range in which state variables must be controlled to achieve goals, and conversely, the failures that exist if system behavior goes out of range. This in turn allows the systems engineers and SHM/FM engineers to determine which state variables to monitor, and what action(s) to take should the system fail to achieve a goal. In sum, the GFT representation provides a unified approach to early-phase SE and FM development. This representation and methodology have been successfully developed and implemented using the Systems Modeling Language (SysML) on the NASA Space Launch System (SLS) Program. They enabled early design trade studies of failure detection coverage to ensure complete detection coverage of all crew-threatening failures. The representation maps directly both to FM algorithm designs and to the failure scenario definitions needed for design analysis and testing. 
The GFT representation also provided the basis for mapping abort triggers into scenarios, both of which were needed for the initial, successful quantitative analyses of abort effectiveness (detection of and response to crew-threatening events).
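The core GFT idea can be sketched as a tree of goals in which each leaf constrains one state variable to the range in which it must be controlled, with failure detected when a variable leaves its range. The goal names, variables, and ranges below are hypothetical, not the SLS model:

```python
from dataclasses import dataclass, field

@dataclass
class Goal:
    """A Goal-Function Tree node: a goal optionally constrains one state
    variable to the range in which it must be controlled; out-of-range
    behavior is the failure FM must detect. (Hypothetical example.)"""
    name: str
    variable: str = ""
    low: float = float("-inf")
    high: float = float("inf")
    children: list = field(default_factory=list)

    def failed_goals(self, state):
        """Walk the tree and return the goals whose state variable is
        missing or outside its control range."""
        failed = []
        if self.variable:
            value = state.get(self.variable)
            if value is None or not (self.low <= value <= self.high):
                failed.append(self.name)
        for child in self.children:
            failed += child.failed_goals(state)
        return failed

ascent = Goal("achieve orbit", children=[
    Goal("maintain thrust", "chamber_pressure", 90.0, 110.0),
    Goal("maintain attitude", "pitch_error_deg", -2.0, 2.0),
])
print(ascent.failed_goals({"chamber_pressure": 70.0, "pitch_error_deg": 0.5}))
# ['maintain thrust']
```

The leaf ranges are exactly the monitoring points the abstract describes: they tell the SHM/FM engineer which state variables to monitor and when a detection should fire.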
Delivery and application of precise timing for a traveling wave powerline fault locator system
NASA Technical Reports Server (NTRS)
Street, Michael A.
1990-01-01
The Bonneville Power Administration (BPA) has successfully operated an in-house developed powerline fault locator system since 1986. The BPA fault locator system consists of remotes installed at cardinal power transmission line system nodes and a central master which polls the remotes for traveling wave time-of-arrival data. A power line fault produces a fast rise-time traveling wave which emanates from the fault point and propagates throughout the power grid. The remotes time-tag the traveling wave leading edge as it passes through the power system cardinal substation nodes. A synchronizing pulse transmitted via the BPA analog microwave system on a wideband channel synchronizes the time-tagging counters in the remote units to an accuracy of better than one microsecond. The remote units correct the raw time tags for synchronizing pulse propagation delay and return these corrected values to the fault locator master. The master then calculates the location of the power system disturbance using the collected time tags. The system design objective is a fault location accuracy of 300 meters. BPA's fault locator system operation, error-producing phenomena, and the method of distributing precise timing are described.
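For a fault between two time-synchronized terminals, the traveling-wave arithmetic reduces to a difference of arrival times. This is the generic two-ended method, not necessarily BPA's exact formulation, and the wave speed is an assumed value:

```python
def fault_distance_km(line_length_km, t_a_us, t_b_us, v_km_per_us=0.294):
    """Distance from terminal A to the fault, from synchronized
    traveling-wave arrival times at terminals A and B.
    v is an assumed propagation speed (close to the speed of light)."""
    return (line_length_km - v_km_per_us * (t_b_us - t_a_us)) / 2.0

# A fault 40 km from A on a 100 km line: the wave reaches A after
# 40/v microseconds and B after 60/v microseconds.
t_a, t_b = 40.0 / 0.294, 60.0 / 0.294
print(round(fault_distance_km(100.0, t_a, t_b), 6))  # 40.0
```

A one-microsecond timing error shifts the estimate by v/2, roughly 150 meters, which is why the sub-microsecond synchronization described above matters for a 300-meter design objective.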
The Impact of Contextual Factors on the Security of Code
2014-12-30
in which a system is resourced, overseen, managed and assured will have a lot to do with how successfully it performs in actual practice. Software is...ensure proper and adequate system assurance . Because of the high degree of skill and specialization required, details about software and systems are...whole has to be carefully coordinated in order to assure against the types of faults that are the basis for most of the exploits listed in the Common
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Leifker, Daniel B.
1991-01-01
Current qualitative device and process models represent only the structure and behavior of physical systems. However, systems in the real world include goal-oriented activities that generally cannot be easily represented using current modeling techniques. We propose an extension of qualitative modeling, known as functional modeling, that captures goal-oriented activities explicitly, and we show how functional models may be used to support intelligent automation and fault management.
Online fault adaptive control for efficient resource management in Advanced Life Support Systems
NASA Technical Reports Server (NTRS)
Abdelwahed, Sherif; Wu, Jian; Biswas, Gautam; Ramirez, John; Manders, Eric-J
2005-01-01
This article presents the design and implementation of a controller scheme for efficient resource management in Advanced Life Support Systems. In the proposed approach, a switching hybrid system model is used to represent the dynamics of the system components and their interactions. The operational specifications for the controller are represented by utility functions, and the corresponding resource management problem is formulated as a safety control problem. The controller is designed as a limited-horizon online supervisory controller that performs a limited forward search on the state-space of the system at each time step, and uses the utility functions to decide on the best action. The feasibility and accuracy of the online algorithm can be assessed at design time. We demonstrate the effectiveness of the scheme by running a set of experiments on the Reverse Osmosis (RO) subsystem of the Water Recovery System (WRS).
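The limited-horizon search can be sketched as an enumeration of action sequences over a short horizon, scoring each with a utility function and committing to the first action of the best sequence. The toy dynamics and utility below are illustrative, not the RO subsystem model:

```python
import itertools

def limited_horizon_control(state, step, utility, actions, horizon=3):
    """Limited-horizon supervisory control: enumerate all action
    sequences up to `horizon`, simulate them with `step`, score the
    visited states with `utility`, and return the first action of the
    best sequence. (Sketch of the idea, not the WRS controller.)"""
    best_action, best_value = None, float("-inf")
    for seq in itertools.product(actions, repeat=horizon):
        s, value = state, 0.0
        for a in seq:
            s = step(s, a)
            value += utility(s)
        if value > best_value:
            best_value, best_action = value, seq[0]
    return best_action

# Toy resource dynamics: a level to be held near a setpoint of 5.0,
# with actions that add or remove one unit per step.
advance = lambda level, a: level + a
reward = lambda level: -abs(level - 5.0)
print(limited_horizon_control(2.0, advance, reward, actions=[-1.0, 0.0, 1.0]))
# 1.0
```

Because the search space is `len(actions) ** horizon`, the cost of each control decision is fixed and known, which is what lets the feasibility of the online algorithm be assessed at design time.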
Intelligent (Autonomous) Power Controller Development for Human Deep Space Exploration
NASA Technical Reports Server (NTRS)
Soeder, James; Raitano, Paul; McNelis, Anne
2016-01-01
As NASA's Evolvable Mars Campaign and other exploration initiatives continue to mature, they have identified the need for more autonomous operation of the power system. For current human space operations such as the International Space Station, the paradigm is to perform planning, operation, and fault diagnosis from the ground. However, the dual problems of communication lag and limited communication bandwidth beyond geosynchronous orbit underscore the need to change the operational methodology for human operations in deep space. To address this need, for the past several years the Glenn Research Center has had an effort to develop an autonomous power controller for human deep space vehicles. This presentation discusses the present roadmap for deep space exploration along with a description of a conceptual power system architecture for exploration modules. It then contrasts the present ground-centric control and management architecture, with limited autonomy on board the spacecraft, with an advanced autonomous power control system that features ground-based monitoring and a spacecraft mission manager with autonomous control of all core systems, including power. It then presents a functional breakdown of the autonomous power control system and examines its operation in both normal and fault modes. Finally, it discusses progress made in the development of a real-time power system model and how it is being used to evaluate the performance of the controller as well as for verification of the overall operation.
Metric Ranking of Invariant Networks with Belief Propagation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tao, Changxia; Ge, Yong; Song, Qinbao
The management of large-scale distributed information systems relies on the effective use and modeling of monitoring data collected at various points in the distributed information systems. A promising approach is to discover invariant relationships among the monitoring data and generate invariant networks, where a node is a monitoring data source (metric) and a link indicates an invariant relationship between two monitoring data sources. Such an invariant network representation can help system experts to localize and diagnose system faults by examining broken invariant relationships and their related metrics, because system faults usually propagate among the monitoring data and eventually lead to some broken invariant relationships. However, at any one time there are usually many broken links (invariant relationships) within an invariant network. Without proper guidance, it is difficult for system experts to manually inspect this large number of broken links. Thus, a critical challenge is how to effectively and efficiently rank the metrics (nodes) of invariant networks according to their anomaly levels. The ranked list of metrics provides system experts with useful guidance for localizing and diagnosing system faults. To this end, we propose to model the nodes and the broken links as a Markov Random Field (MRF), and develop an iterative algorithm to infer the anomaly of each node based on belief propagation (BP). Finally, we validate the proposed algorithm on both real-world and synthetic data sets to illustrate its effectiveness.
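A deliberately simplified stand-in for the MRF/BP formulation: seed each metric's anomaly score with the fraction of its invariant links that are broken, then iteratively blend in neighbor scores before ranking. The paper's actual message passing is more involved; this only conveys the propagate-then-rank shape of the method:

```python
def rank_metrics(edges, broken, iters=3, alpha=0.5):
    """Rank metrics (nodes) of an invariant network by anomaly level.
    Simplified stand-in for the MRF/belief-propagation inference:
    initial score = fraction of a node's invariant links now broken,
    followed by a few rounds of neighbor averaging."""
    nodes = {n for e in edges for n in e}
    nbrs = {n: [] for n in nodes}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    score = {n: sum(n in e for e in broken) / len(nbrs[n]) for n in nodes}
    for _ in range(iters):
        score = {n: (1 - alpha) * score[n]
                    + alpha * sum(score[m] for m in nbrs[n]) / len(nbrs[n])
                 for n in nodes}
    return sorted(nodes, key=score.get, reverse=True)

# Metric "a" sits at the hub of the broken invariants, so it should
# rank first as the likely fault location:
edges = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")}
broken = {("a", "b"), ("a", "c")}
print(rank_metrics(edges, broken)[0])  # a
```

The ranked list, rather than the raw set of broken links, is what a system expert would inspect first when localizing the fault.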
Spatial and Temporal Variations in Slip Partitioning During Oblique Convergence Experiments
NASA Astrophysics Data System (ADS)
Beyer, J. L.; Cooke, M. L.; Toeneboehn, K.
2017-12-01
Physical experiments of oblique convergence in wet kaolin demonstrate the development of slip partitioning, where two faults accommodate strain via different slip vectors. In these experiments, the second fault forms after the development of the first fault. As one strain component is relieved by one fault, the local stress field then favors the development of a second fault with different slip sense. A suite of physical experiments reveals three styles of slip partitioning development controlled by the convergence angle and presence of a pre-existing fault. In experiments with low convergence angles, strike-slip faults grow prior to reverse faults (Type 1) regardless of whether the fault is precut or not. In experiments with moderate convergence angles, slip partitioning is dominantly controlled by the presence of a pre-existing fault. In all experiments, the primarily reverse fault forms first. Slip partitioning then develops with the initiation of strike-slip along the precut fault (Type 2) or growth of a secondary reverse fault where the first fault is steepest. Subsequently, the slip on the first fault transitions to primarily strike-slip (Type 3). Slip rates and rakes along the slip partitioned faults for both precut and uncut experiments vary temporally, suggesting that faults in these slip-partitioned systems are constantly adapting to the conditions produced by slip along nearby faults in the system. While physical experiments show the evolution of slip partitioning, numerical simulations of the experiments provide information about both the stress and strain fields, which can be used to compute the full work budget, providing insight into the mechanisms that drive slip partitioning. Preliminary simulations of precut experiments show that strain energy density (internal work) can be used to predict fault growth, highlighting where fault growth can reduce off-fault deformation in the physical experiments. 
In numerical simulations of uncut experiments with a first non-planar oblique slip fault, strain energy density is greatest where the first fault is steepest, as less convergence is accommodated along this portion of the fault. The addition of a second slip-partitioning fault to the system decreases external work indicating that these faults increase the mechanical efficiency of the system.
Late Quaternary faulting along the Death Valley-Furnace Creek fault system, California and Nevada
Brogan, George E.; Kellogg, Karl; Slemmons, D. Burton; Terhune, Christina L.
1991-01-01
The Death Valley-Furnace Creek fault system, in California and Nevada, has a variety of impressive late Quaternary neotectonic features that record a long history of recurrent earthquake-induced faulting. Although no neotectonic features of unequivocal historical age are known, paleoseismic features from multiple late Quaternary events of surface faulting are well developed throughout the length of the system. Comparison of scarp heights to the amount of horizontal offset of stream channels, and the relationships of both scarps and channels to the ages of different geomorphic surfaces, demonstrate that Quaternary faulting along the northwest-trending Furnace Creek fault zone is predominantly right lateral, whereas that along the north-trending Death Valley fault zone is predominantly normal. These observations are compatible with tectonic models of Death Valley as a northwest-trending pull-apart basin. The largest late Quaternary scarps along the Furnace Creek fault zone, with vertical separation of late Pleistocene surfaces of as much as 64 m (meters), are in Fish Lake Valley. Despite the predominance of normal faulting along the Death Valley fault zone, vertical offset of late Pleistocene surfaces along the Death Valley fault zone apparently does not exceed about 15 m. Evidence for four to six separate late Holocene faulting events along the Furnace Creek fault zone and three or more late Holocene events along the Death Valley fault zone is indicated by rupturing of Q1B (about 200-2,000 years old) geomorphic surfaces. Probably the youngest neotectonic feature observed along the Death Valley-Furnace Creek fault system, possibly historic in age, consists of vegetation lineaments in southernmost Fish Lake Valley.
Near-historic faulting in Death Valley, within several kilometers south of Furnace Creek Ranch, is represented by (1) a 2,000-year-old lake shoreline that is cut by sinuous scarps, and (2) a system of young scarps with free-faceted faces (representing several faulting events) that cuts Q1B surfaces.
Li, Yunji; Wu, QingE; Peng, Li
2018-01-23
In this paper, a synthesized design of a fault-detection filter and fault estimator is considered for a class of discrete-time stochastic systems in the framework of an event-triggered transmission scheme subject to unknown disturbances and deception attacks. A random variable obeying the Bernoulli distribution is employed to characterize the phenomenon of randomly occurring deception attacks. To achieve a fault-detection residual that is sensitive only to faults while robust to disturbances, a coordinate transformation approach is exploited. This approach transforms the considered system into two subsystems, and the unknown disturbances are removed from one of the subsystems. The gain of the fault-detection filter is derived by minimizing an upper bound on the filter error covariance. Meanwhile, system faults can be reconstructed by the remote fault estimator. A recursive approach is developed to obtain the fault estimator gains as well as to guarantee the fault estimator performance. Furthermore, the corresponding event-triggered sensor data transmission scheme is also presented for improving the working life of the wireless sensor node when measurement information is aperiodically transmitted. Finally, a scaled version of an industrial system consisting of a local PC, a remote estimator, and a wireless sensor node is used to experimentally evaluate the proposed theoretical results. In particular, a novel fault-alarming strategy is proposed so that the real-time capacity of fault detection is guaranteed when the event condition is triggered.
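The send-on-delta flavor of event triggering can be sketched in a few lines. The threshold rule, signal, and numbers below are invented for illustration; the paper's actual triggering condition, filter design, and attack model are considerably richer:

```python
import numpy as np

def event_triggered_stream(measurements, delta):
    """Send-on-delta rule (an assumed, simplified event condition): transmit
    a sample only if it differs from the last transmitted value by > delta."""
    transmitted, last = [], None
    for k, y in enumerate(measurements):
        if last is None or abs(y - last) > delta:
            transmitted.append((k, float(y)))
            last = y
    return transmitted

# Slowly drifting healthy signal, then a step fault at sample 50.
y = np.concatenate([np.linspace(0.0, 0.2, 50), 2.0 * np.ones(50)])
sent = event_triggered_stream(y, delta=0.5)
print(f"{len(sent)} of {len(y)} samples transmitted")
```

Only the initial sample and the fault-induced step are transmitted, which is the working-life saving the abstract refers to: the radio stays quiet while the signal is uneventful.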
NASA Technical Reports Server (NTRS)
Kobayashi, Takahisa; Simon, Donald L.
2004-01-01
In this paper, an approach for in-flight fault detection and isolation (FDI) of aircraft engine sensors based on a bank of Kalman filters is developed. This approach utilizes multiple Kalman filters, each of which is designed based on a specific fault hypothesis. When the propulsion system experiences a fault, only one Kalman filter with the correct hypothesis is able to maintain the nominal estimation performance. Based on this knowledge, the isolation of faults is achieved. Since the propulsion system may experience component and actuator faults as well, a sensor FDI system must be robust in terms of avoiding misclassifications of any anomalies. The proposed approach utilizes a bank of (m+1) Kalman filters where m is the number of sensors being monitored. One Kalman filter is used for the detection of component and actuator faults while each of the other m filters detects a fault in a specific sensor. With this setup, the overall robustness of the sensor FDI system to anomalies is enhanced. Moreover, numerous component fault events can be accounted for by the FDI system. The sensor FDI system is applied to a commercial aircraft engine simulation, and its performance is evaluated at multiple power settings at a cruise operating point using various fault scenarios.
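The hypothesis-bank idea can be illustrated with a toy scalar plant. The model, noise levels, and decision statistic below are assumptions for the sketch (the paper uses a full commercial engine simulation), but the structure matches: m+1 filters, each ignoring one hypothesized-faulty sensor, with the fault isolated as the hypothesis whose filter keeps the smallest normalized residual energy:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy scalar plant with m = 3 sensors; sensor 1 carries a constant bias
# fault. All numbers are invented; the paper uses a full engine model.
a, q, r, m, N = 0.95, 0.01, 0.04, 3, 200
faulty, bias = 1, 1.5

x = 0.0
Y = np.empty((N, m))
for k in range(N):
    x = a * x + rng.normal(0.0, np.sqrt(q))
    Y[k] = x + rng.normal(0.0, np.sqrt(r), m)
    Y[k, faulty] += bias

def bank_isolate(Y):
    """Bank of m+1 scalar Kalman filters: filter j < m assumes sensor j
    is faulty and fuses only the remaining sensors; filter m assumes all
    sensors are healthy. The winning hypothesis is the filter with the
    smallest accumulated normalized residual energy."""
    score = np.zeros(m + 1)
    est = np.zeros(m + 1)
    P = np.ones(m + 1)
    for y in Y:
        for j in range(m + 1):
            xp, Pp = a * est[j], a * a * P[j] + q   # time update
            for i in range(m):
                if i == j:            # skip the hypothesized-faulty sensor
                    continue
                resid = y[i] - xp
                score[j] += resid ** 2 / (Pp + r)   # normalized innovation
                K = Pp / (Pp + r)
                xp += K * resid       # sequential measurement update
                Pp *= 1.0 - K
            est[j], P[j] = xp, Pp
    return int(np.argmin(score))

print("isolated faulty sensor:", bank_isolate(Y))
```

Only the filter that excludes the biased sensor keeps nominal innovations; every other filter accumulates the bias in its residuals, which is exactly the knowledge the paper exploits for isolation.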
Faults Discovery By Using Mined Data
NASA Technical Reports Server (NTRS)
Lee, Charles
2005-01-01
Fault discovery in complex systems consists of model-based reasoning, fault tree analysis, rule-based inference methods, and other approaches. Model-based reasoning builds models for the systems either by mathematical formulation or by experimental modeling. Fault tree analysis shows the possible causes of a system malfunction by enumerating the suspect components and their respective failure modes that may have induced the problem. Rule-based inference builds the model from expert knowledge. These models and methods have one thing in common: they presume some prior conditions. Complex systems often use fault trees to analyze faults. Fault diagnosis, when an error occurs, is performed by engineers and analysts through extensive examination of all data gathered during the mission. The International Space Station (ISS) control center operates on the data fed back from the system, and decisions are made based on threshold values by using fault trees. Since those decision-making tasks are safety critical and must be done promptly, the engineers who manually analyze the data face a time challenge. To automate this process, this paper presents an approach that uses decision trees to discover faults from data in real time, capturing the contents of fault trees as the initial state of the trees.
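A minimal sketch of the idea, with an invented tree layout, sensor names, and thresholds (the paper derives its trees from mined mission data and existing fault trees): telemetry is classified in real time by walking threshold tests down to a leaf fault label.

```python
# A fault tree's threshold logic captured as a decision tree that classifies
# a telemetry sample in real time. Node layout, sensor names, and thresholds
# are invented for illustration.
TREE = {
    "test": ("pressure", 85.0),            # branch on pressure > 85
    "hi": {"test": ("temp", 400.0),
           "hi": "overheat_fault",
           "lo": "nominal"},
    "lo": {"test": ("flow", 2.0),
           "hi": "nominal",
           "lo": "pump_fault"},
}

def classify(tree, sample):
    """Walk the tree until a leaf (a fault label or 'nominal') is reached."""
    while isinstance(tree, dict):
        key, threshold = tree["test"]
        tree = tree["hi"] if sample[key] > threshold else tree["lo"]
    return tree

print(classify(TREE, {"pressure": 90.0, "temp": 430.0, "flow": 3.0}))
print(classify(TREE, {"pressure": 70.0, "temp": 300.0, "flow": 1.2}))
```

Each classification is a handful of comparisons, which is what makes the decision-tree form attractive for the prompt, safety-critical decisions described above.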
NASA Astrophysics Data System (ADS)
Fitzenz, D. D.; Miller, S. A.
2001-12-01
We present preliminary results from a 3-dimensional fault interaction model, with the fault system specified by the geometry and tectonics of the San Andreas Fault (SAF) system. We use the forward model for earthquake generation on interacting faults of Fitzenz and Miller [2001], which incorporates the analytical solutions of Okada [1985, 1992], GPS-constrained tectonic loading, creep compaction and frictional dilatancy [Sleep and Blanpied, 1994; Sleep, 1995], and undrained poro-elasticity. The model fault system is centered at the Big Bend, and includes three large strike-slip faults (each discretized into multiple subfaults): 1) a 300 km, right-lateral segment of the SAF to the North, 2) a 200 km-long left-lateral segment of the Garlock fault to the East, and 3) a 100 km-long right-lateral segment of the SAF to the South. In the initial configuration, three shallow-dipping faults are also included that correspond to the thrust belt sub-parallel to the SAF. Tectonic loading is decomposed into basal shear drag parallel to the plate boundary with a 35 mm yr-1 plate velocity, and East-West compression approximated by a vertical dislocation surface applied at the far-field boundary, resulting in fault-normal compression rates in the model space of about 4 mm yr-1. Our aim is to study the long-term seismicity characteristics, tectonic evolution, and fault interaction of this system. We find that faults overpressured through creep compaction are a necessary consequence of the tectonic loading, specifically where high normal stress acts on long straight fault segments. The optimal orientation of thrust faults is a function of the strike-slip behavior, and therefore results in a complex stress state in the elastic body. This stress state is then used to generate new fault surfaces, and preliminary results of dynamically generated faults will also be presented. Our long-term aim is to target measurable properties in or around fault zones (e.g. pore pressures, hydrofractures, seismicity catalogs, stress orientation, surface strain, triggering, etc.), which may allow inferences on the stress state of fault systems.
NASA Technical Reports Server (NTRS)
Duyar, A.; Guo, T.-H.; Merrill, W.; Musgrave, J.
1992-01-01
In a previous study, Guo, Merrill and Duyar, 1990, reported a conceptual development of a fault detection and diagnosis system for actuation faults of the space shuttle main engine. This study, which is a continuation of the previous work, implements the developed fault detection and diagnosis scheme for the real time actuation fault diagnosis of the space shuttle main engine. The scheme will be used as an integral part of an intelligent control system demonstration experiment at NASA Lewis. The diagnosis system utilizes a model based method with real time identification and hypothesis testing for actuation, sensor, and performance degradation faults.
Robust Fault Detection and Isolation for Stochastic Systems
NASA Technical Reports Server (NTRS)
George, Jemin; Gregory, Irene M.
2010-01-01
This paper outlines the formulation of a robust fault detection and isolation scheme that can precisely detect and isolate simultaneous actuator and sensor faults for uncertain linear stochastic systems. The given robust fault detection scheme based on the discontinuous robust observer approach would be able to distinguish between model uncertainties and actuator failures and therefore eliminate the problem of false alarms. Since the proposed approach involves precise reconstruction of sensor faults, it can also be used for sensor fault identification and the reconstruction of true outputs from faulty sensor outputs. Simulation results presented here validate the effectiveness of the robust fault detection and isolation system.
Zhao, Kaihui; Li, Peng; Zhang, Changfan; Li, Xiangfei; He, Jing; Lin, Yuliang
2017-12-06
This paper proposes a new scheme for reconstructing current sensor faults and estimating unknown load disturbance for a permanent magnet synchronous motor (PMSM)-driven system. First, the original PMSM system is transformed into two subsystems: the first subsystem has unknown system load disturbances, which are unrelated to sensor faults, and the second subsystem has sensor faults but is free from unknown load disturbances. By introducing a new state variable, the subsystem with sensor faults can be augmented so that the sensor faults appear as actuator faults. Second, two sliding mode observers (SMOs) are designed: the unknown load disturbance is estimated by the first SMO in the subsystem that has unknown load disturbance, and the sensor faults are reconstructed using the second SMO in the augmented subsystem that has sensor faults. The gains of the proposed SMOs and their stability analysis are developed via the solution of linear matrix inequalities (LMIs). Finally, the effectiveness of the proposed scheme is verified by simulations and experiments. The results demonstrate that the proposed scheme can reconstruct current sensor faults and estimate unknown load disturbance for the PMSM-driven system.
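A toy analogue of the disturbance-estimation half of the scheme, for a first-order plant rather than a PMSM (plant constants, observer gain, and the low-pass recovery of the equivalent injection are all invented for the sketch):

```python
import numpy as np

# First-order plant x' = -a*x + u + d with unknown constant disturbance d.
# The observer injects v = rho*sign(x - xhat); once the estimation error
# reaches the sliding surface, the low-pass filtered injection recovers d.
# All constants below are assumptions for this illustration.
a, d_true, rho, dt, T = 2.0, 0.7, 5.0, 1e-4, 2.0
u = 1.0
x = xhat = d_est = 0.0
for _ in range(int(T / dt)):
    x += dt * (-a * x + u + d_true)       # true plant (Euler step)
    v = rho * np.sign(x - xhat)           # discontinuous injection
    xhat += dt * (-a * xhat + u + v)      # sliding mode observer
    d_est += dt * 50.0 * (v - d_est)      # low-pass "equivalent injection"
print(f"estimated disturbance: {d_est:.2f} (true {d_true})")
```

The switching gain rho must dominate the disturbance bound for sliding to occur; the averaged (equivalent) injection then equals the disturbance, which is the mechanism the paper's first SMO uses for load-disturbance estimation.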
Kinematics of shallow backthrusts in the Seattle fault zone, Washington State
Pratt, Thomas L.; Troost, K.G.; Odum, Jackson K.; Stephenson, William J.
2015-01-01
Near-surface thrust fault splays and antithetic backthrusts at the tips of major thrust fault systems can distribute slip across multiple shallow fault strands, complicating earthquake hazard analyses based on studies of surface faulting. The shallow expression of the fault strands forming the Seattle fault zone of Washington State shows the structural relationships and interactions between such fault strands. Paleoseismic studies document an ∼7000 yr history of earthquakes on multiple faults within the Seattle fault zone, with some backthrusts inferred to rupture in small (M ∼5.5–6.0) earthquakes at times other than during earthquakes on the main thrust faults. We interpret seismic-reflection profiles to show three main thrust faults, one of which is a blind thrust fault directly beneath downtown Seattle, and four small backthrusts within the Seattle fault zone. We then model fault slip, constrained by shallow deformation, to show that the Seattle fault forms a fault propagation fold rather than the alternatively proposed roof thrust system. Fault slip modeling shows that back-thrust ruptures driven by moderate (M ∼6.5–6.7) earthquakes on the main thrust faults are consistent with the paleoseismic data. The results indicate that paleoseismic data from the back-thrust ruptures reveal the times of moderate earthquakes on the main fault system, rather than indicating smaller (M ∼5.5–6.0) earthquakes involving only the backthrusts. Estimates of cumulative shortening during known Seattle fault zone earthquakes support the inference that the Seattle fault has been the major seismic hazard in the northern Cascadia forearc in the late Holocene.
NASA Astrophysics Data System (ADS)
Fagereng, A.; Hodge, M.; Biggs, J.; Mdala, H. S.; Goda, K.
2016-12-01
Faults grow through the interaction and linkage of isolated fault segments. Continuous fault systems are those whose segments interact, link, and may slip synchronously, whereas non-continuous fault systems comprise isolated faults. As seismic moment is related to fault length (Wells and Coppersmith, 1994), understanding whether a fault system is continuous or not is critical in evaluating seismic hazard. Maturity may be a control on fault continuity: immature, low-displacement faults are typically assumed to be non-continuous. Here, we study two overlapping, 20 km long, normal fault segments of the N-S striking Bilila-Mtakataka fault, Malawi, in the southern section of the East African Rift System. Despite the fault's relative immaturity, previous studies concluded that the Bilila-Mtakataka fault is continuous for its entire 100 km length, with the most recent event equating to an Mw 8.0 earthquake (Jackson and Blenkinsop, 1997). We explore whether segment geometry and the relationship to pre-existing high-grade metamorphic foliation have influenced segment interaction and fault development. Fault geometry and scarp height are constrained by DEMs derived from SRTM, Pleiades and `Structure from Motion' photogrammetry using a UAV, alongside direct field observations. The segment strikes differ on average by 10°, but by up to 55° at their adjacent tips. The southern segment is sub-parallel to the foliation, whereas the northern segment is highly oblique to it. Geometrical surface discontinuities suggest two isolated faults; however, displacement-length profiles and Coulomb stress change models suggest segment interaction, with potential for linkage at depth. Further work must be undertaken on other segments to assess the continuity of the entire fault and to conclude whether an earthquake greater than the maximum instrumentally recorded event (the 1910 M7.4 Rukwa earthquake) is possible.
Modeling the data management system of Space Station Freedom with DEPEND
NASA Technical Reports Server (NTRS)
Olson, Daniel P.; Iyer, Ravishankar K.; Boyd, Mark A.
1993-01-01
Some of the features and capabilities of the DEPEND simulation-based modeling tool are described. A study of a 1553B local bus subsystem of the Space Station Freedom Data Management System (SSF DMS) is used to illustrate some types of system behavior that can be important to reliability and performance evaluations of this type of spacecraft. A DEPEND model of the subsystem is used to illustrate how these types of system behavior can be modeled, and shows what kinds of engineering and design questions can be answered through the use of these modeling techniques. DEPEND's process-based simulation environment is shown to provide a flexible method for modeling complex interactions between hardware and software elements of a fault-tolerant computing system.
Verification of an IGBT Fusing Switch for Over-current Protection of the SNS HVCM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benwell, Andrew; Kemp, Mark; Burkhart, Craig
2010-06-11
An IGBT-based over-current protection system has been developed to detect faults and limit the damage caused by faults in high voltage converter modulators. During normal operation, an IGBT enables energy to be transferred from storage capacitors to an H-bridge. When a fault occurs, the over-current protection system detects the fault, limits the fault current, and opens the IGBT to isolate the remaining stored energy from the fault. This paper presents an experimental verification of the over-current protection system under applicable conditions.
Discrete Wavelet Transform for Fault Locations in Underground Distribution System
NASA Astrophysics Data System (ADS)
Apisit, C.; Ngaopitakkul, A.
2010-10-01
In this paper, a technique for detecting faults in an underground distribution system is presented. The Discrete Wavelet Transform (DWT), based on traveling waves, is employed in order to detect the high frequency components and to identify fault locations in the underground distribution system. The first peak time obtained from the faulty bus is employed for calculating the distance of the fault from the sending end. The validity of the proposed technique is tested with various fault inception angles, fault locations, and faulty phases. The results show that the proposed technique provides satisfactory results and will be very useful in the development of power system protection schemes.
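A compact sketch of the traveling-wave principle, using a one-level Haar DWT on a synthetic surge record; the sampling rate, propagation velocity, and two-peak (incident plus reflection) location rule are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def haar_detail(signal):
    """One-level Haar DWT: return the detail (high-frequency) coefficients."""
    s = np.asarray(signal, dtype=float)
    return (s[0::2] - s[1::2]) / np.sqrt(2.0)

fs = 1.0e6         # sampling rate, 1 MHz (assumed)
v = 1.5e8          # wave propagation velocity in the cable, m/s (assumed)
sig = np.zeros(2048)
sig[200] = 1.0     # incident surge arrival (synthetic sample index)
sig[260] = 0.6     # reflection from the fault (synthetic)

d = np.abs(haar_detail(sig))
k1, k2 = sorted(np.argsort(d)[::-1][:2])   # two largest detail peaks
dt = (k2 - k1) * 2 / fs                    # times 2: DWT halves the rate
distance = v * dt / 2.0                    # half the round-trip time
print(f"estimated fault distance: {distance:.0f} m")
```

The detail coefficients localize the sharp wavefronts that a smooth 50/60 Hz waveform hides, and the peak-to-peak time converts directly to distance; a field implementation would add multi-level decomposition and noise thresholds.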
Dynamic characteristics of a 20 kHz resonant power system - Fault identification and fault recovery
NASA Technical Reports Server (NTRS)
Wasynczuk, O.
1988-01-01
A detailed simulation of a dc inductor resonant driver and receiver is used to demonstrate the transient characteristics of a 20 kHz resonant power system during fault and overload conditions. The simulated system consists of a dc inductor resonant inverter (driver), a 50-meter transmission cable, and a dc inductor resonant receiver load. Of particular interest are the driver and receiver performance during fault and overload conditions and on the recovery characteristics following removal of the fault. The information gained from these studies sets the stage for further work in fault identification and autonomous power system control.
The Quaternary thrust system of the northern Alaska Range
Bemis, Sean P.; Carver, Gary A.; Koehler, Richard D.
2012-01-01
The framework of Quaternary faults in Alaska remains poorly constrained. Recent studies in the Alaska Range north of the Denali fault add significantly to the recognition of Quaternary deformation in this active orogen. Faults and folds active during the Quaternary occur over a length of ∼500 km along the northern flank of the Alaska Range, extending from Mount McKinley (Denali) eastward to the Tok River valley. These faults exist as a continuous system of active structures, but we divide the system into four regions based on east-west changes in structural style. At the western end, the Kantishna Hills have only two known faults but the highest rate of shallow crustal seismicity. The western northern foothills fold-thrust belt consists of a 50-km-wide zone of subparallel thrust and reverse faults. This broad zone of deformation narrows to the east in a transition zone where the range-bounding fault of the western northern foothills fold-thrust belt terminates and displacement occurs on thrust and/or reverse faults closer to the Denali fault. The eastern northern foothills fold-thrust belt is characterized by ∼40-km-long thrust fault segments separated across left-steps by NNE-trending left-lateral faults. Altogether, these faults accommodate much of the topographic growth of the northern flank of the Alaska Range. Recognition of this thrust fault system represents a significant concern in addition to the Denali fault for infrastructure adjacent to and transecting the Alaska Range. Although additional work is required to characterize these faults sufficiently for seismic hazard analysis, the regional extent and structural character should require the consideration of the northern Alaska Range thrust system in regional tectonic models.
Jurassic faults of southwest Alabama and offshore areas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mink, R.M.; Tew, B.H.; Bearden, B.L.
1991-03-01
Four fault groups affecting Jurassic strata occur in the southwest and offshore Alabama areas. They include the regional basement rift trend, the regional peripheral fault trend, the Mobile graben fault system, and the Lower Mobile Bay fault system. The regional basement rift and regional peripheral fault trends are distinct and rim the inner margin of the eastern Gulf Coastal Plain. The regional basement rift trend is genetically related to the breakup of Pangea and the opening of the Gulf of Mexico in the Late Triassic-Early Jurassic. This fault trend is thought to have formed contemporaneously with deposition of the Late Triassic-Early Jurassic Eagle Mills Formation and to displace pre-Mesozoic rocks. The regional peripheral fault trend consists of a group of en echelon extensional faults that are parallel or subparallel to the regional strike of Gulf Coastal Plain strata and correspond to the approximate updip limit of thick Louann Salt. Nondiapiric salt features are associated with the trend, and maximum structural development is exhibited in the Haynesville-Smackover section. No hydrocarbon accumulations have been documented in the pre-Jurassic strata of southwest and offshore Alabama. Productive hydrocarbon reservoirs occur in Jurassic strata along the trends of the fault groups, suggesting a significant relationship between structural development in the Jurassic and hydrocarbon accumulation. Hydrocarbon traps are generally structural or contain a major structural component and include salt anticlines, faulted salt anticlines, and extensional fault traps. All of the major hydrocarbon accumulations are associated with movement of the Louann Salt along the regional peripheral fault trend, the Mobile graben fault system, or the Lower Mobile Bay fault system.
NASA Astrophysics Data System (ADS)
Nicholson, C.; Plesch, A.; Sorlien, C. C.; Shaw, J. H.; Hauksson, E.
2014-12-01
Southern California represents an ideal natural laboratory to investigate oblique deformation in 3D owing to its comprehensive datasets, complex tectonic history, evolving components of oblique slip, and continued crustal rotations about horizontal and vertical axes. As the SCEC Community Fault Model (CFM) aims to accurately reflect this 3D deformation, we present the results of an extensive update to the model by using primarily detailed fault trace, seismic reflection, relocated hypocenter and focal mechanism nodal plane data to generate improved, more realistic digital 3D fault surfaces. The results document a wide variety of oblique strain accommodation, including various aspects of strain partitioning and fault-related folding, sets of both high-angle and low-angle faults that mutually interact, significant non-planar, multi-stranded faults with variable dip along strike and with depth, and active mid-crustal detachments. In places, closely-spaced fault strands or fault systems can remain surprisingly subparallel to seismogenic depths, while in other areas, major strike-slip to oblique-slip faults can merge, such as the S-dipping Arroyo Parida-Mission Ridge and Santa Ynez faults with the N-dipping North Channel-Pitas Point-Red Mountain fault system, or diverge with depth. Examples of the latter include the steep-to-west-dipping Laguna Salada-Indiviso faults with the steep-to-east-dipping Sierra Cucapah faults, and the steep southern San Andreas fault with the adjacent NE-dipping Mecca Hills-Hidden Springs fault system. In addition, overprinting by steep predominantly strike-slip faulting can segment which parts of intersecting inherited low-angle faults are reactivated, or result in mutual cross-cutting relationships. 
The updated CFM 3D fault surfaces thus help characterize a more complex pattern of fault interactions at depth between various fault sets and linked fault systems, and a more complex fault geometry than typically inferred or expected from projecting near-surface data down-dip, or modeled from surface strain and potential field data alone.
QuakeSim: a Web Service Environment for Productive Investigations with Earth Surface Sensor Data
NASA Astrophysics Data System (ADS)
Parker, J. W.; Donnellan, A.; Granat, R. A.; Lyzenga, G. A.; Glasscoe, M. T.; McLeod, D.; Al-Ghanmi, R.; Pierce, M.; Fox, G.; Grant Ludwig, L.; Rundle, J. B.
2011-12-01
The QuakeSim science gateway environment includes a visually rich portal interface, web service access to data and data processing operations, and the QuakeTables ontology-based database of fault models and sensor data. The integrated tools and services are designed to assist investigators by covering the entire earthquake cycle of strain accumulation and release. The Web interface now includes Drupal-based access to diverse and changing content, with new ability to access data and data processing directly from the public page, as well as the traditional project management areas that require password access. The system is designed to make initial browsing of fault models and deformation data particularly engaging for new users. Popular data and data processing include GPS time series with data mining techniques to find anomalies in time and space, experimental forecasting methods based on catalogue seismicity, faulted deformation models (both half-space and finite element), and model-based inversion of sensor data. The fault models include the CGS and UCERF 2.0 faults of California and are easily augmented with self-consistent fault models from other regions. The QuakeTables deformation data include the comprehensive set of UAVSAR interferograms as well as a growing collection of satellite InSAR data. Fault interaction simulations are also being incorporated in the web environment based on Virtual California. A sample usage scenario is presented which follows an investigation of UAVSAR data from viewing as an overlay in Google Maps, to selection of an area of interest via a polygon tool, to fast extraction of the relevant correlation and phase information from large data files, to a model inversion of fault slip followed by calculation and display of a synthetic model interferogram.
Structural controls on a geothermal system in the Tarutung Basin, north central Sumatra
NASA Astrophysics Data System (ADS)
Nukman, Mochamad; Moeck, Inga
2013-09-01
The Sumatra Fault System provides a unique geologic setting to evaluate the influence of structural controls on geothermal activity. Whereas most of the geothermal systems in Indonesia are controlled by volcanic activity, geothermal systems along the Sumatra Fault System might be controlled by faults and fractures. Exploration strategies for these geothermal systems need to be verified because the typical pattern of heat source and alteration clays is missing, so that conventional exploration with magnetotelluric surveys might not provide sufficient data to delineate favorable settings for drilling. We present field geological, structural and geomorphological evidence, combined with mapping of geothermal manifestations, to constrain the relationship between fault dynamics and geothermal activity in the Tarutung Basin in north central Sumatra. Our results indicate that the fault pattern in the Tarutung Basin is generated by a compressional stress direction acting at a high angle to the right-lateral Sumatra Fault System. NW-SE striking normal faults, possibly related to negative flower structures, and NNW-SSE to NNE-SSW oriented dilative Riedel shears are preferential fluid pathways, whereas ENE-WSW striking faults act as barriers in this system. The dominance of geothermal manifestations in the eastern part of the basin indicates local extension due to clockwise block rotation in the Sumatra Fault System. Our results support the effort to integrate detailed field geological surveys into refined exploration strategies, even in tropical areas where outcrops are limited.
The Trans-Rocky Mountain Fault System - A Fundamental Precambrian Strike-Slip System
Sims, P.K.
2009-01-01
Recognition of a major Precambrian continental-scale, two-stage conjugate strike-slip fault system - here designated as the Trans-Rocky Mountain fault system - provides new insights into the architecture of the North American continent. The fault system consists chiefly of steep linear to curvilinear, en echelon, braided and branching ductile-brittle shears and faults, and local coeval en echelon folds of northwest strike, that cut indiscriminately across both Proterozoic and Archean cratonic elements. The fault system formed during late stages of two distinct tectonic episodes: Neoarchean and Paleoproterozoic orogenies at about 2.70 and 1.70 billion years (Ga). In the Archean Superior province, the fault system formed (about 2.70-2.65 Ga) during a late stage of the main deformation that involved oblique shortening (dextral transpression) across the region and progressed from crystal-plastic to ductile-brittle deformation. In Paleoproterozoic terranes, the fault system formed about 1.70 Ga, shortly following amalgamation of Paleoproterozoic and Archean terranes and the main Paleoproterozoic plastic-fabric-producing events in the protocontinent, chiefly during sinistral transpression. The postulated driving force for the fault system is subcontinental mantle deformation, the bottom-driven deformation of previous investigators. This model, based on seismic anisotropy, invokes mechanical coupling and subsequent shear between the lithosphere and the asthenosphere such that a major driving force for plate motion is deep-mantle flow.
NASA Technical Reports Server (NTRS)
Patterson, Jonathan D.; Breckenridge, Jonathan T.; Johnson, Stephen B.
2013-01-01
Building upon the purpose, theoretical approach, and use of a Goal-Function Tree (GFT) presented by Dr. Stephen B. Johnson in a related Infotech 2013 ISHM abstract titled "Goal-Function Tree Modeling for Systems Engineering and Fault Management", this paper describes the core framework used to implement the GFT-based systems engineering process using the Systems Modeling Language (SysML). These two papers are ideally accepted and presented together in the same Infotech session. Statement of problem: SysML, as a tool, is currently not capable of implementing the theoretical approach described within the "Goal-Function Tree Modeling for Systems Engineering and Fault Management" paper cited above. More generally, SysML's current capabilities to model functional decompositions in the rigorous manner required by the GFT approach are limited. The GFT is a new Model-Based Systems Engineering (MBSE) approach to the development of goals and requirements, functions, and their linkage to design. As SysML is a growing standard for systems engineering, it is important to develop methods to implement GFT in SysML. Proposed Method of Solution: Many of the central concepts of the SysML language are needed to implement a GFT for large complex systems. In the implementation of those central concepts, the following will be described in detail: changes to the nominal SysML process, model view definitions and examples, diagram definitions and examples, and detailed SysML construct and stereotype definitions.
Sequential Test Strategies for Multiple Fault Isolation
NASA Technical Reports Server (NTRS)
Shakeri, M.; Pattipati, Krishna R.; Raghavan, V.; Patterson-Hine, Ann; Kell, T.
1997-01-01
In this paper, we consider the problem of constructing near-optimal test sequencing algorithms for diagnosing multiple faults in redundant (fault-tolerant) systems. The computational complexity of solving the optimal multiple-fault isolation problem is super-exponential; that is, it is much more difficult than the single-fault isolation problem, which, by itself, is NP-hard. By employing concepts from information theory and Lagrangian relaxation, we present several static and dynamic (on-line or interactive) test sequencing algorithms for the multiple-fault isolation problem that provide a trade-off between the degree of suboptimality and computational complexity. Furthermore, we present novel diagnostic strategies that generate a static diagnostic directed graph (digraph), instead of a static diagnostic tree, for multiple-fault diagnosis. Using this approach, the storage complexity of the overall diagnostic strategy is reduced substantially. Computational results based on real-world systems indicate that the size of a static multiple-fault strategy is strictly related to the structure of the system, and that an on-line multiple-fault strategy can diagnose faults in systems with as many as 10,000 failure sources.
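The information-theoretic heuristic at the core of such test sequencing can be illustrated for the simpler single-fault case: repeatedly pick the test whose pass/fail outcome minimizes the expected remaining entropy over the fault candidates. A minimal stdlib-Python sketch, with hypothetical fault priors, test names, and coverage sets (the paper's multiple-fault algorithms add Lagrangian relaxation on top of this idea):

```python
import math

def entropy(probs):
    """Shannon entropy of a normalized probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def best_test(priors, tests):
    """Pick the test whose outcome minimizes expected remaining entropy.

    priors: dict fault -> prior probability (single-fault assumption)
    tests:  dict test name -> set of faults the test detects
    """
    total = sum(priors.values())
    best, best_h = None, float("inf")
    for name, covers in tests.items():
        p_fail = sum(p for f, p in priors.items() if f in covers) / total
        if p_fail in (0.0, 1.0):           # test outcome is uninformative
            continue
        h_fail = entropy([p / (p_fail * total)
                          for f, p in priors.items() if f in covers])
        h_pass = entropy([p / ((1 - p_fail) * total)
                          for f, p in priors.items() if f not in covers])
        h = p_fail * h_fail + (1 - p_fail) * h_pass
        if h < best_h:
            best, best_h = name, h
    return best

# Hypothetical priors and test coverage sets
priors = {"F1": 0.4, "F2": 0.3, "F3": 0.2, "F4": 0.1}
tests = {"T1": {"F1"}, "T2": {"F1", "F2"}, "T3": {"F1", "F2", "F3"}}
print(best_test(priors, tests))   # → T1
```

Applying the greedy step repeatedly, with the fault set restricted after each observed outcome, yields a diagnostic tree; the digraph strategies in the paper merge shared subtrees to cut storage.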
Fault Modeling of Extreme Scale Applications Using Machine Learning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vishnu, Abhinav; Dam, Hubertus van; Tallent, Nathan R.
2016-05-01
Faults are commonplace in large scale systems. These systems experience a variety of faults such as transient, permanent, and intermittent. Multi-bit faults are typically not corrected by the hardware, resulting in an error. This paper attempts to answer an important question: given a multi-bit fault in main memory, will it result in an application error — and hence a recovery algorithm should be invoked — or can it be safely ignored? We propose an application fault modeling methodology to answer this question. Given a fault signature (a set of attributes comprising system and application state), we use machine learning to create a model which predicts whether a multi-bit permanent/transient main memory fault will likely result in an error. We present the design elements, such as the fault injection methodology for covering important data structures, the application and system attributes which should be used for learning the model, the supervised learning algorithms (and potentially ensembles), and important metrics. Lastly, we use three applications — NWChem, LULESH and SVM — as examples for demonstrating the effectiveness of the proposed fault modeling methodology.
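As a toy illustration of the methodology's core step (learning an error predictor from fault signatures), the sketch below trains a plain stochastic-gradient logistic regression, stdlib only, on synthetic signatures. The two features and the ground-truth labeling rule are invented for illustration and are not the attributes or learners used in the paper:

```python
import math, random

def train_logistic(xs, ys, epochs=2000, lr=0.5):
    """Plain stochastic-gradient logistic regression (stdlib only)."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y                                  # gradient of log-loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Hypothetical fault signatures: (normalized flipped-bit count,
# 1 if the fault hit live application data, else 0).  Assumed ground
# truth: an error results only when live data takes a multi-bit flip.
random.seed(0)
xs, ys = [], []
for _ in range(200):
    bits = random.randint(1, 4)
    live = random.randint(0, 1)
    xs.append([bits / 4.0, float(live)])
    ys.append(1 if live and bits >= 2 else 0)

w, b = train_logistic(xs, ys)
acc = sum(predict(w, b, x) == y for x, y in zip(xs, ys)) / len(xs)
print("training accuracy:", acc)
```

A "safe to ignore" prediction corresponds to output 0, so the recovery algorithm would be invoked only when the model returns 1.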
A comparative study of sensor fault diagnosis methods based on observer for ECAS system
NASA Astrophysics Data System (ADS)
Xu, Xing; Wang, Wei; Zou, Nannan; Chen, Long; Cui, Xiaoli
2017-03-01
The performance and practicality of an electronically controlled air suspension (ECAS) system depend heavily on the state information supplied by various sensors, yet sensor faults occur frequently. Based on a non-linearized 3-DOF 1/4 vehicle model, different fault detection and isolation (FDI) methods are used to diagnose sensor faults in the ECAS system. The approaches considered are an extended Kalman filter (EKF), with its concise algorithm; a strong tracking filter (STF), with its robust tracking ability; and a cubature Kalman filter (CKF), with its numerical precision. Each of the three filters is used to design a state observer for the ECAS system under typical sensor faults and noise. Results show that all three approaches can successfully detect and isolate faults despite environmental noise, although their FDI time delays and fault sensitivities differ; compared with the EKF and STF, the CKF performs best at FDI of sensor faults for the ECAS system.
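The residual-based detection idea shared by all three filters can be shown with a scalar Kalman filter: the innovation (measurement minus prediction) is compared against a multiple of its predicted standard deviation, and a sensor bias fault shows up as a persistent out-of-bounds residual. A stdlib-Python sketch under assumed noise parameters and fault size (the paper's observers run EKF/STF/CKF on the full vehicle model):

```python
import random

def kalman_residual_flags(measurements, q=1e-4, r=0.04, thresh=3.0):
    """Scalar random-walk Kalman filter.  Each sample is flagged when the
    innovation (measurement minus prediction) exceeds `thresh` predicted
    standard deviations, the usual residual test for sensor FDI."""
    x, p = measurements[0], 1.0
    flags = []
    for z in measurements:
        p += q                          # predict step (random-walk model)
        s = p + r                       # innovation variance
        nu = z - x                      # innovation (residual)
        flags.append(abs(nu) > thresh * s ** 0.5)
        k = p / s                       # Kalman gain
        x += k * nu
        p *= 1.0 - k
    return flags

random.seed(1)
zs = [1.0 + random.gauss(0.0, 0.2) for _ in range(100)]
for i in range(60, 100):
    zs[i] += 2.0                        # inject a sensor bias fault at t=60
flags = kalman_residual_flags(zs)
print("alarms before fault:", sum(flags[:60]), "alarm at fault onset:", flags[60])
```

Isolation (deciding which sensor is faulty) then requires a bank of such observers, each insensitive to a different sensor, as in the paper.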
Hardware fault insertion and instrumentation system: Mechanization and validation
NASA Technical Reports Server (NTRS)
Benson, J. W.
1987-01-01
Automated test capability for extensive low-level hardware fault insertion testing is developed. The test capability is used to calibrate fault detection coverage and associated latency times as relevant to projecting overall system reliability. Described are modifications made to the NASA Ames Reconfigurable Digital Flight Control System (RDFCS) Facility to fully automate the total test loop involving the Draper Laboratories' Fault Injector Unit. The automated capability provided included the application of sequences of simulated low-level hardware faults, the precise measurement of fault latency times, the identification of fault symptoms, and bulk storage of test case results. A PDP-11/60 served as a test coordinator, and a PDP-11/04 as an instrumentation device. The fault injector was controlled by applications test software in the PDP-11/60, rather than by manual commands from a terminal keyboard. The time base was especially developed for this application to use a variety of signal sources in the system simulator.
The weakest t-norm based intuitionistic fuzzy fault-tree analysis to evaluate system reliability.
Kumar, Mohit; Yadav, Shiv Prasad
2012-07-01
In this paper, a new approach to intuitionistic fuzzy fault-tree analysis is proposed to evaluate system reliability and to find the most critical system component affecting it. A weakest t-norm based intuitionistic fuzzy fault-tree analysis is presented to calculate the fault interval of system components by integrating expert knowledge and experience, expressed as the possibility of failure of bottom events. It applies fault-tree analysis, the α-cut of intuitionistic fuzzy sets, and T(ω) (the weakest t-norm) based arithmetic operations on triangular intuitionistic fuzzy sets to obtain the fault interval and reliability interval of the system. This paper also modifies Tanaka et al.'s fuzzy fault-tree definition. For numerical verification, a malfunction of the weapon system "automatic gun" is presented as an example, and the result of the proposed method is compared with existing reliability analysis approaches.
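The weakest t-norm T(ω) is what keeps the computed fault intervals from blowing up: under T(ω)-based arithmetic the spreads of triangular fuzzy numbers do not accumulate additively; the result keeps only the larger of the two spreads on each side. A sketch of T(ω)-addition for ordinary triangular fuzzy numbers, with hypothetical failure possibilities (the paper's intuitionistic sets carry a non-membership triangle as well, which this sketch omits):

```python
def tw_add(A, B):
    """Add two triangular fuzzy numbers (left, mode, right) under the
    weakest t-norm T_w: the modes add, but each side keeps only the
    larger of the two spreads instead of their sum."""
    aL, aM, aR = A
    bL, bM, bR = B
    left = max(aM - aL, bM - bL)     # left spread of the result
    right = max(aR - aM, bR - bM)    # right spread of the result
    m = aM + bM
    return (m - left, m, m + right)

# Failure possibilities of two bottom events (numbers hypothetical)
top = tw_add((0.02, 0.05, 0.09), (0.01, 0.03, 0.04))
print(top)   # ~(0.05, 0.08, 0.12)
```

Standard fuzzy addition would have produced spreads of 0.05 and 0.05 here (sums of the individual spreads); the T(ω) result stays tighter, which is why the method yields narrower reliability intervals.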
Probabilistic evaluation of on-line checks in fault-tolerant multiprocessor systems
NASA Technical Reports Server (NTRS)
Nair, V. S. S.; Hoskote, Yatin V.; Abraham, Jacob A.
1992-01-01
The analysis of fault-tolerant multiprocessor systems that use concurrent error detection (CED) schemes is much more difficult than the analysis of conventional fault-tolerant architectures. Various analytical techniques have been proposed to evaluate CED schemes deterministically. However, these approaches are based on worst-case assumptions related to the failure of system components. Often, the evaluation results do not reflect the actual fault tolerance capabilities of the system. A probabilistic approach to evaluate the fault detecting and locating capabilities of on-line checks in a system is developed. The various probabilities associated with the checking schemes are identified and used in the framework of the matrix-based model. Based on these probabilistic matrices, estimates for the fault tolerance capabilities of various systems are derived analytically.
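The basic probabilistic idea can be sketched directly: if each entry of a coverage matrix gives the probability that a given on-line check flags a given fault, and the checks miss a fault independently, then the detection probability per fault is one minus the product of the miss probabilities. A minimal sketch with hypothetical coverage numbers (the paper's matrix-based model is considerably richer):

```python
def detection_probabilities(coverage):
    """coverage[i][j]: probability that on-line check j flags fault i.
    Assuming the checks miss a fault independently of one another, the
    probability that fault i is caught by at least one check is
    1 - prod_j (1 - coverage[i][j])."""
    probs = []
    for row in coverage:
        miss = 1.0
        for c in row:
            miss *= 1.0 - c
        probs.append(1.0 - miss)
    return probs

cov = [[0.9, 0.5],   # fault 0: seen well by check 0, partly by check 1
       [0.0, 0.7]]   # fault 1: invisible to check 0
print(detection_probabilities(cov))   # ~[0.95, 0.7]
```

Contrast this with the deterministic worst-case analyses the paper criticizes, which would credit a check with either full or zero coverage.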
Jeon, Namju; Lee, Hyeongcheol
2016-01-01
An integrated fault-diagnosis algorithm for a motor sensor of in-wheel independent drive electric vehicles is presented. This paper proposes a method that integrates the high- and low-level fault diagnoses to improve the robustness and performance of the system. For the high-level fault diagnosis of vehicle dynamics, a planar two-track non-linear model is first selected, and the longitudinal and lateral forces are calculated. To ensure redundancy of the system, correlation between the sensor and residual in the vehicle dynamics is analyzed to detect and separate the fault of the drive motor system of each wheel. To diagnose the motor system for low-level faults, the state equation of an interior permanent magnet synchronous motor is developed, and a parity equation is used to diagnose the fault of the electric current and position sensors. The validity of the high-level fault-diagnosis algorithm is verified using Carsim and Matlab/Simulink co-simulation. The low-level fault diagnosis is verified through Matlab/Simulink simulation and experiments. Finally, according to the residuals of the high- and low-level fault diagnoses, fault-detection flags are defined. On the basis of this information, an integrated fault-diagnosis strategy is proposed. PMID:27973431
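A parity equation exploits an analytic relation that must hold among measured signals when all sensors are healthy. As a simple stand-in for the paper's motor-model parity equations, the sketch below uses the fact that the three phase currents of a balanced machine with no neutral connection must sum to zero, so a persistent nonzero sum indicates a current-sensor fault (the threshold, amplitudes, and bias size are assumed values):

```python
import math

def current_sensor_residual(ia, ib, ic, thresh=0.5):
    """Parity-style check: in a balanced three-phase machine with no
    neutral connection, ia + ib + ic = 0.  A persistent nonzero sum
    points at a current-sensor fault."""
    r = ia + ib + ic
    return abs(r) > thresh, r

# One sample of a balanced 50 Hz, 10 A three-phase current set
t = 0.01
ia = 10 * math.sin(2 * math.pi * 50 * t)
ib = 10 * math.sin(2 * math.pi * 50 * t - 2 * math.pi / 3)
ic = 10 * math.sin(2 * math.pi * 50 * t + 2 * math.pi / 3)

fault, _ = current_sensor_residual(ia, ib, ic)               # healthy
fault_biased, _ = current_sensor_residual(ia + 2.0, ib, ic)  # biased sensor
print(fault, fault_biased)   # False True
```

The paper's low-level diagnosis generalizes this: residuals are generated from the interior permanent magnet synchronous motor state equations, so position-sensor faults can be caught the same way.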
McLaughlin, R.J.; Langenheim, V.E.; Schmidt, K.M.; Jachens, R.C.; Stanley, R.G.; Jayko, A.S.; McDougall, K.A.; Tinsley, J.C.; Valin, Z.C.
1999-01-01
In the southern San Francisco Bay region of California, oblique dextral reverse faults that verge northeastward from the San Andreas fault experienced triggered slip during the 1989 M7.1 Loma Prieta earthquake. The role of these range-front thrusts in the evolution of the San Andreas fault system and the future seismic hazard that they may pose to the urban Santa Clara Valley are poorly understood. Based on recent geologic mapping and geophysical investigations, we propose that the range-front thrust system evolved in conjunction with development of the San Andreas fault system. In the early Miocene, the region was dominated by a system of northwestwardly propagating, basin-bounding, transtensional faults. Beginning as early as middle Miocene time, however, the transtensional faulting was superseded by transpressional NE-stepping thrust and reverse faults of the range-front thrust system. Age constraints on the thrust faults indicate that the locus of contraction has focused on the Monte Vista, Shannon, and Berrocal faults since about 4.8 Ma. Fault slip and fold reconstructions suggest that crustal shortening between the San Andreas fault and the Santa Clara Valley within this time frame is ~21%, amounting to as much as 3.2 km at a rate of 0.6 mm/yr. Rates probably have not remained constant; average rates appear to have been much lower in the past few 100 ka. The distribution of coseismic surface contraction during the Loma Prieta earthquake, active seismicity, late Pleistocene to Holocene fluvial terrace warping, and geodetic data further suggest that the active range-front thrust system includes blind thrusts. Critical unresolved issues include information on the near-surface locations of buried thrusts, the timing of recent thrust earthquake events, and their recurrence in relation to earthquakes on the San Andreas fault.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quinn, Heather; Wirthlin, Michael
A variety of fault emulation systems have been created to study the effect of single-event effects (SEEs) in static random access memory (SRAM) based field-programmable gate arrays (FPGAs). These systems are useful for augmenting radiation-hardness assurance (RHA) methodologies for verifying the effectiveness of mitigation techniques; understanding error signatures and failure modes in FPGAs; and failure rate estimation. For radiation effects researchers, it is important that these systems properly emulate how SEEs manifest in FPGAs. If a fault emulation system does not mimic the radiation environment, it will generate erroneous data and incorrect predictions of the behavior of the FPGA in a radiation environment. Validation determines whether the emulated faults are reasonable analogs to the radiation-induced faults. In this study we present methods for validating fault emulation systems and provide several examples of validated FPGA fault emulation systems.
In-flight Fault Detection and Isolation in Aircraft Flight Control Systems
NASA Technical Reports Server (NTRS)
Azam, Mohammad; Pattipati, Krishna; Allanach, Jeffrey; Poll, Scott; Patterson-Hine, Ann
2005-01-01
In this paper we consider the problem of test design for real-time fault detection and isolation (FDI) in the flight control system of fixed-wing aircraft. We focus on faults that are manifested in the control surface elements (e.g., aileron, elevator, rudder, and stabilizer) of an aircraft. For demonstration purposes, we restrict our focus to faults belonging to nine basic fault classes. The diagnostic tests are performed on features extracted from fifty monitored system parameters. The proposed tests are able to uniquely isolate each of the faults at almost all severity levels. A neural network-based flight control simulator, FLTZ(Registered TradeMark), is used for the simulation of various faults in fixed-wing aircraft flight control systems for the purpose of FDI.
NASA Astrophysics Data System (ADS)
Yim, Keun Soo
This dissertation summarizes experimental validation and co-design studies conducted to optimize the fault detection capabilities and overheads in hybrid computer systems (e.g., using CPUs and Graphics Processing Units, or GPUs), and consequently to improve the scalability of parallel computer systems using computational accelerators. The experimental validation studies were conducted to help us understand the failure characteristics of CPU-GPU hybrid computer systems under various types of hardware faults. The main characterization targets were faults that are difficult to detect and/or recover from, e.g., faults that cause long latency failures (Ch. 3), faults in dynamically allocated resources (Ch. 4), faults in GPUs (Ch. 5), faults in MPI programs (Ch. 6), and microarchitecture-level faults with specific timing features (Ch. 7). The co-design studies were based on the characterization results. One of the co-designed systems has a set of source-to-source translators that customize and strategically place error detectors in the source code of target GPU programs (Ch. 5). Another co-designed system uses an extension card to learn the normal behavioral and semantic execution patterns of message-passing processes executing on CPUs, and to detect abnormal behaviors of those parallel processes (Ch. 6). The third co-designed system is a co-processor that has a set of new instructions in order to support software-implemented fault detection techniques (Ch. 7). The work described in this dissertation gains more importance because heterogeneous processors have become an essential component of state-of-the-art supercomputers. GPUs were used in three of the five fastest supercomputers that were operating in 2011. Our work included comprehensive fault characterization studies in CPU-GPU hybrid computers. 
In CPUs, we monitored the target systems for a long period of time after injecting faults (a temporally comprehensive experiment), and injected faults into various types of program states that included dynamically allocated memory (to be spatially comprehensive). In GPUs, we used fault injection studies to demonstrate the importance of detecting silent data corruption (SDC) errors that are mainly due to the lack of fine-grained protections and the massive use of fault-insensitive data. This dissertation also presents transparent fault tolerance frameworks and techniques that are directly applicable to hybrid computers built using only commercial off-the-shelf hardware components. This dissertation shows that by developing understanding of the failure characteristics and error propagation paths of target programs, we were able to create fault tolerance frameworks and techniques that can quickly detect and recover from hardware faults with low performance and hardware overheads.
NASA Astrophysics Data System (ADS)
Gomila, Rodrigo; Arancibia, Gloria; Mitchell, Thomas M.; Cembrano, Jose M.; Faulkner, Daniel R.
2016-02-01
Understanding fault zone permeability and its spatial distribution allows assessment of the fluid migration leading to precipitation of hydrothermal minerals. This work aims to unravel the conditions and distribution of fluid transport properties in fault zones based on hydrothermally filled microfractures, which reflect the "frozen-in" instantaneous advective hydrothermal activity and record palaeopermeability conditions of the fault-fracture system. We studied the Jorgillo Fault, an exposed 20 km long, left-lateral strike-slip fault belonging to the Atacama Fault System in northern Chile, which juxtaposes Jurassic gabbro against metadiorite. Tracing of microfracture networks in 19 oriented thin sections from a 400 m long transect across the main fault trace was carried out to estimate the hydraulic properties of the low-strain fault damage zone, adjacent to the high-strain fault core, by assuming penny-shaped microfractures of constant radius and aperture within an anisotropic fracture system. Palaeopermeability values of 9.1 x 10(exp -11) to 3.2 x 10(exp -13) m2 in the gabbro and of 5.0 x 10(exp -10) to 1.2 x 10(exp -13) m2 in the metadiorite were determined, both decreasing perpendicularly away from the fault core. Fracture porosity values range from 40.00% to 0.28%. The Jorgillo Fault has acted as a left-lateral dilational fault bend, generating large-scale dilation sites north of the fault during co-seismic activity.
NASA Technical Reports Server (NTRS)
Schurmeier, H. M.
1974-01-01
The long life of Pioneer interplanetary spacecraft is considered along with a general accelerated methodology for long-life mechanical components, dependable long-lived household appliances, and the design and development philosophy to achieve reliability and long life in large turbine generators. Other topics discussed include an integrated management approach to long life in space, artificial heart reliability factors, and architectural concepts and redundancy techniques in fault-tolerant computers. Individual items are announced in this issue.
Evaluation of reliability modeling tools for advanced fault tolerant systems
NASA Technical Reports Server (NTRS)
Baker, Robert; Scheper, Charlotte
1986-01-01
The Computer Aided Reliability Estimation (CARE III) and Automated Reliability Interactive Estimation System (ARIES 82) reliability tools were evaluated for application to advanced fault-tolerant aerospace systems. To determine reliability modeling requirements, the evaluation focused on the Draper Laboratories' Advanced Information Processing System (AIPS) architecture as an example architecture for fault-tolerant aerospace systems. Advantages and limitations were identified for each reliability evaluation tool. The CARE III program was designed primarily for analyzing ultrareliable flight control systems. The ARIES 82 program's primary use was to support university research and teaching. Neither CARE III nor ARIES 82 was suited for determining the reliability of complex nodal networks of the type used to interconnect processing sites in the AIPS architecture. It was concluded that ARIES is not suitable for modeling advanced fault-tolerant systems. It was further concluded that, subject to some limitations (difficulty in modeling systems with unpowered spare modules, systems where equipment maintenance must be considered, systems where failure depends on the sequence in which faults occurred, and systems where multiple faults beyond double near-coincident faults must be considered), CARE III is best suited for evaluating the reliability of advanced fault-tolerant systems for air transport.
Expert systems for real-time monitoring and fault diagnosis
NASA Technical Reports Server (NTRS)
Edwards, S. J.; Caglayan, A. K.
1989-01-01
Methods for building real-time onboard expert systems were investigated, and the use of expert systems technology was demonstrated in improving the performance of current real-time onboard monitoring and fault diagnosis applications. The potential applications of the proposed research include an expert system environment allowing the integration of expert systems into conventional time-critical application solutions, a grammar for describing the discrete event behavior of monitoring and fault diagnosis systems, and their applications to new real-time hardware fault diagnosis and monitoring systems for aircraft.
Ultrareliable fault-tolerant control systems
NASA Technical Reports Server (NTRS)
Webster, L. D.; Slykhouse, R. A.; Booth, L. A., Jr.; Carson, T. M.; Davis, G. J.; Howard, J. C.
1984-01-01
It is demonstrated that fault-tolerant computer systems based on redundant, independent operation, such as those on the Shuttles, are a viable alternative in fault-tolerant system design. The ultrareliable fault-tolerant control system (UFTCS) was developed and tested in laboratory simulations of a UH-1H helicopter. UFTCS includes asymptotically stable independent control elements in a parallel, cross-linked system environment. Static redundancy provides the fault tolerance. Polling is performed among the computers, with the results allowing for time-delay channel variations within tight bounds. Comparison with laboratory and actual flight data for the helicopter showed that, given quintuple computer redundancy, the probability of a fault during the first 10 hr of flight is 1 in 290 billion; two weeks of untended Space Station operations would experience a fault probability of 1 in 24 million. Techniques for avoiding channel divergence problems are identified.
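System-level fault probabilities like those quoted come from combining per-channel reliabilities across the redundant set. A generic sketch of that combinatorial step for an n-channel majority-voted system with exponentially distributed channel failures (the failure rate and channel counts below are hypothetical, not the UFTCS figures):

```python
from math import comb, exp

def majority_vote_failure(n, k, lam, t):
    """Probability that at least k of n independent channels, each with
    constant failure rate lam (per hour), have failed by time t.
    For a majority-voted system, k is the smallest number of channel
    failures that defeats the vote."""
    p = 1.0 - exp(-lam * t)                      # one channel down by t
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i)
               for i in range(k, n + 1))

# Five channels, vote defeated once three have failed (rate hypothetical)
print(majority_vote_failure(5, 3, 1e-4, 10.0))
```

With these toy numbers the result is on the order of 1e-8, illustrating how quintuple redundancy drives the short-mission fault probability far below any single channel's.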
Groundwater management in Cusco region, Peru Present and future challenges
NASA Astrophysics Data System (ADS)
Guttman, Joseph; Berger, Diego; Pumayalli Saloma, Rene
2013-04-01
Agriculture in the rural areas of the Andes Mountains in the Cusco region of Peru is mainly rain-fed and basically concentrated in one crop season per year. This situation limits the farmers' development. To extend the agricultural season into the winter period (May to November), also known as the dry season, many farmers pump water from streams or from underground sources, but unfortunately many of these become dry during the dry season. In addition, some of those streams are polluted by the cities' wastewater and by heavy metals released from the mines that are quite abundant in the Andes Mountains. The regional government, through its engineering organization "Per Plan Meriss Inka", is trying to increase the water quantity and quality delivered to the end users (farmers in the valleys) by promoting projects that, among others, include capturing springs that emerge from the high mountain ridges, diverting streams, and harvesting surface reservoirs. In the Ancahuasi area (northwest of Cusco) there are many springs that emerge along several geological faults, which act as a border between permeable layers (mostly sandstone) in the upper throw of the fault and impermeable layers in the lower throw. The discharge of the springs varies with the size of each catchment area or aquifer structure. The spring water is collected in small pools and then conveyed by gravity through open channels to the farmers in the valleys. During the past 25 years, in some places, springs have been captured by horizontal wells (galleries) excavated from the fault zone into the mountain a few tens of meters below the spring outlet. A gallery drains excess water from the spring storage and increases the overall discharge. The galleries are a limited solution for individual places where the geology, hydrology, and topography enable them.
The farmers use flood irrigation systems whose overall efficiency, according to World Bank documents, is about 35% (most of the water recharges the underground or is lost by evaporation). Increasing the efficiency by only 10-20%, together with bringing in additional water, would cause a dramatic change in the farmers' lives and income. A pre-feasibility study indicates that there are deeper subsurface groundwater systems flowing from the Andes Mountains downstream to the valleys. These deeper systems are most probably separated from the spring systems; they flow from the Andes Mountains downstream via individual paths, in places where both sides of the faults contain permeable layers, and through several alluvial fans. Detailed research is planned in the next few years to identify those individual sites and to locate sites for drilling boreholes (observation and production). Today, integrated water resources management at the local and regional level is lacking. The feasibility studies will include recommendations to the regional government on how to implement such an integrated management program, together with building the institutional capacity of regional governments.
Web-based monitoring and management system for integrated enterprise-wide imaging networks
NASA Astrophysics Data System (ADS)
Ma, Keith; Slik, David; Lam, Alvin; Ng, Won
2003-05-01
Mass proliferation of IP networks and the maturity of standards have enabled the creation of sophisticated image distribution networks that operate over Intranets, Extranets, Communities of Interest (CoI) and even the public Internet. Unified monitoring, provisioning and management of such systems at the application and protocol levels represent a challenge. This paper presents a web-based monitoring and management tool that employs established telecom standards for the creation of an open system that enables proactive management, provisioning and monitoring of image management systems at the enterprise level and across multi-site geographically distributed deployments. Utilizing established standards including ITU-T M.3100, and web technologies such as XML/XSLT, JSP/JSTL, and J2SE, the system allows for seamless device and protocol adaptation between multiple disparate devices. The goal has been to develop a unified interface that provides network topology views, multi-level customizable alerts, real-time fault detection as well as real-time and historical reporting of all monitored resources, including network connectivity, system load, DICOM transactions and storage capacities.
NASA Technical Reports Server (NTRS)
Coleman, Anthony S.; Hansen, Irving G.
1994-01-01
NASA is pursuing a program in Advanced Subsonic Transport (AST) to develop the technology for a highly reliable Fly-By-Light/Power-By-Wire aircraft. One of the primary objectives of the program is to develop the technology base for confident application of integrated PBW components and systems to transport aircraft to improve operating reliability and efficiency. Technology will be developed so that the present hydraulic and pneumatic systems of the aircraft can be systematically eliminated and replaced by electrical systems. These motor-driven actuators would move the aircraft wing surfaces as well as the rudder to provide steering controls for the pilot. Existing aircraft electrical systems are not flight critical and are prone to failure due to electromagnetic interference (EMI) (1), ground faults, and component failures. In order to successfully implement electromechanical flight control actuation, a Power Management and Distribution (PMAD) system must be designed having a reliability of 1 failure in 10(exp +9) hours, EMI hardening, and a fault-tolerant architecture to ensure uninterrupted power to all aircraft flight-critical systems. The focus of this paper is to analyze, define, and describe the technically challenging areas associated with the development of a power-by-wire aircraft and the typical requirements to be established at the box level. The authors propose areas of investigation, citing specific military standards and requirements that need to be revised to accommodate the 'More Electric Aircraft Systems'.
[Development of fixed-base full task space flight training simulator].
Xue, Liang; Chen, Shan-quang; Chang, Tian-chun; Yang, Hong; Chao, Jian-gang; Li, Zhi-peng
2003-01-01
The fixed-base full task flight training simulator is a critical and important integrated training facility. It is mostly used for training integrated skills and tasks, such as running the flight program of manned space flight, dealing with faults, operating and controlling spacecraft flight, and communicating information between spacecraft and ground. The simulator is made up of several subsystems, including spacecraft simulation, simulated cabin, visual scene, acoustics, main control computer, instructor station, and auxiliary support. It implements many simulation functions, such as spacecraft environment, spacecraft movement, communication between spacecraft and ground, typical faults, manual control and operating training, training control, training monitoring, training database management, training data recording, and system self-test.
Design of on-board Bluetooth wireless network system based on fault-tolerant technology
NASA Astrophysics Data System (ADS)
You, Zheng; Zhang, Xiangqi; Yu, Shijie; Tian, Hexiang
2007-11-01
In this paper, Bluetooth wireless data transmission technology is applied to an on-board computer system to realize wireless data transmission between peripherals of the micro-satellite integrated electronic system. In view of the high reliability demands of a micro-satellite, a design for a Bluetooth wireless network based on fault-tolerant technology is introduced. The reliability of two fault-tolerant systems is first estimated using a Markov model; then the structural design of the fault-tolerant system is introduced. Several protocols are established to make the system operate correctly, and some related problems are listed and analyzed, with emphasis on fault auto-diagnosis, active-standby switchover design, and data-integrity processing.
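The kind of Markov-model reliability comparison mentioned above can be illustrated with the classic closed-form result for an active unit backed by one standby spare with perfect fault detection and switchover: R(t) = e^(-λt)(1 + λt), versus e^(-λt) for a single unit. A sketch with hypothetical failure-rate numbers (not taken from the paper):

```python
from math import exp

def single_unit_reliability(lam, t):
    """Reliability of one unit with constant failure rate lam."""
    return exp(-lam * t)

def standby_pair_reliability(lam, t):
    """Active unit plus one cold-standby spare with perfect fault
    detection and switchover: the system survives if at most one
    failure occurs by time t, giving R(t) = e^(-lam*t) * (1 + lam*t)."""
    return exp(-lam * t) * (1 + lam * t)

lam, t = 1e-3, 1000.0      # hypothetical: failures/hour, mission hours
print(single_unit_reliability(lam, t))    # ~0.368
print(standby_pair_reliability(lam, t))   # ~0.736
```

Imperfect fault detection or switchover lowers the pair's reliability toward the single-unit curve, which is why the paper's Markov models carry coverage terms for the active-standby switch.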
Slip accumulation and lateral propagation of active normal faults in Afar
NASA Astrophysics Data System (ADS)
Manighetti, I.; King, G. C. P.; Gaudemer, Y.; Scholz, C. H.; Doubre, C.
2001-01-01
We investigate fault growth in Afar, where normal fault systems are known to be currently growing fast and most are propagating to the northwest. Using digital elevation models, we have examined the cumulative slip distribution along 255 faults with lengths ranging from 0.3 to 60 km. Faults exhibiting the elliptical or "bell-shaped" slip profiles predicted by simple linear elastic fracture mechanics or elastic-plastic theories are rare. Most slip profiles are roughly linear for more than half of their length, with overall slopes always <0.035. For the dominant population of NW striking faults and fault systems longer than 2 km, the slip profiles are asymmetric, with slip being maximum near the eastern ends of the profiles where it drops abruptly to zero, whereas slip decreases roughly linearly and tapers in the direction of overall Aden rift propagation. At a more detailed level, most faults appear to be composed of distinct, shorter subfaults or segments, whose slip profiles, while different from one to the next, combine to produce the roughly linear overall slip decrease along the entire fault. On a larger scale, faults cluster into kinematically coupled systems, along which the slip on any scale individual fault or fault system complements that of its neighbors, so that the total slip of the whole system is roughly linearly related to its length, with an average slope again <0.035. We discuss the origin of these quasilinear, asymmetric profiles in terms of "initiation points" where slip starts, and "barriers" where fault propagation is arrested. In the absence of a barrier, slip apparently extends with a roughly linear profile, tapered in the direction of fault propagation.
NASA Technical Reports Server (NTRS)
Ferrell, Bob A.; Lewis, Mark E.; Perotti, Jose M.; Brown, Barbara L.; Oostdyk, Rebecca L.; Goetz, Jesse W.
2010-01-01
This paper's main purpose is to detail issues and lessons learned regarding designing, integrating, and implementing Fault Detection Isolation and Recovery (FDIR) for Constellation Exploration Program (CxP) Ground Operations at Kennedy Space Center (KSC). As part of the overall implementation of the National Aeronautics and Space Administration's (NASA's) CxP, FDIR is being implemented in three main components of the program (Ares, Orion, and Ground Operations/Processing). While not initially part of the design baseline for CxP Ground Operations, NASA considered FDIR important enough to develop that NASA's Exploration Systems Mission Directorate's (ESMD's) Exploration Technology Development Program (ETDP) initiated a task for it under their Integrated System Health Management (ISHM) research area. This task, referred to as the FDIR project, is a multi-year, multi-center effort. The primary purpose of the FDIR project is to develop a prototype and pathway upon which Fault Detection and Isolation (FDI) may be transitioned into the Ground Operations baseline. Currently, Qualtech Systems Inc. (QSI) Commercial Off The Shelf (COTS) software products Testability Engineering and Maintenance System (TEAMS) Designer and TEAMS RDS/RT are being utilized in the implementation of FDI within the FDIR project. The TEAMS Designer COTS software product is being utilized to model the system with Functional Fault Models (FFMs). A limited set of systems in Ground Operations is being modeled by the FDIR project, and the entire Ares Launch Vehicle is being modeled under the Functional Fault Analysis (FFA) project at Marshall Space Flight Center (MSFC). Integration of the Ares FFMs and the Ground Processing FFMs is also being done under the FDIR project, utilizing the TEAMS Designer COTS software product. One of the most significant challenges related to integration is to ensure that FFMs developed by different organizations can be integrated easily and without errors.
Software Interface Control Documents (ICDs) for the FFMs and their usage will be addressed as the solution to this issue. In particular, the advantages and disadvantages of these ICDs across physically separate development groups will be delineated.
Comparing Different Fault Identification Algorithms in Distributed Power System
NASA Astrophysics Data System (ADS)
Alkaabi, Salim
A power system is a huge, complex system that delivers electrical power from the generation units to the consumers. As the demand for electrical power increases, distributed power generation was introduced to the power system. Faults may occur in the power system at any time and in different locations. These faults cause severe damage to the system, as they might lead to full failure of the power system. Using distributed generation in the power system made it even harder to identify the location of faults in the system. The main objective of this work is to test different fault location identification algorithms on a power system with different amounts of power injected by distributed generators. As faults may lead the system to full failure, this is an important area for research. In this thesis, different fault location identification algorithms have been tested and compared while different amounts of power are injected from distributed generators. The algorithms were tested on the IEEE 34 node test feeder using MATLAB, and the results were compared to find when these algorithms might fail and to assess the reliability of these methods.
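One of the classical algorithms compared in studies like this is the simple reactance method, sketched below under idealized single-source conditions; the feeder parameters and phasors are invented for illustration. Distributed-generation infeed violates the method's single-source assumption, which is exactly why such algorithms can fail in the scenarios the thesis examines.

```python
def reactance_fault_distance(v, i, x_per_km):
    # Apparent impedance seen from the measuring terminal; its reactive part
    # divided by the per-km line reactance estimates the distance to the fault.
    z_apparent = v / i
    return z_apparent.imag / x_per_km

# Hypothetical feeder: 0.20 ohm/km resistance, 0.35 ohm/km reactance, fault at 12 km.
x_km = 0.35
z_to_fault = complex(0.20 * 12, x_km * 12)
i_meas = complex(400, -150)       # measured current phasor (A)
v_meas = i_meas * z_to_fault      # measured voltage phasor (V), no remote infeed
d_est = reactance_fault_distance(v_meas, i_meas, x_km)
```

With no remote infeed the apparent impedance equals the line impedance to the fault, so the estimate recovers the true distance; injecting current from a distributed generator between the terminal and the fault would bias z_apparent and hence d_est.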
Sensor Selection and Data Validation for Reliable Integrated System Health Management
NASA Technical Reports Server (NTRS)
Garg, Sanjay; Melcher, Kevin J.
2008-01-01
For new access-to-space systems with challenging mission requirements, effective implementation of integrated system health management (ISHM) must be available early in the program to support the design of systems that are safe, reliable, and highly autonomous. Early ISHM availability is also needed to promote design for affordable operations; increased knowledge of functional health provided by ISHM supports construction of more efficient operations infrastructure. Lack of early ISHM inclusion in the system design process could result in retrofitting health management systems to augment and expand operational and safety requirements, thereby increasing program cost and risk due to increased instrumentation and computational complexity. Having the right sensors generating the required data to perform condition assessment, such as fault detection and isolation, with a high degree of confidence is critical to reliable operation of ISHM. Also, the data being generated by the sensors need to be qualified to ensure that the assessments made by the ISHM are not based on faulty data. NASA Glenn Research Center has been developing technologies for sensor selection and data validation as part of the FDDR (Fault Detection, Diagnosis, and Response) element of the Upper Stage project of the Ares I launch vehicle development. This presentation will provide an overview of the GRC approach to sensor selection and data quality validation and will present recent results from applications that are representative of the complexity of propulsion systems for access-to-space vehicles. A brief overview of the sensor selection and data quality validation approaches is provided below. The NASA GRC-developed Systematic Sensor Selection Strategy (S4) is a model-based procedure for systematically and quantitatively selecting an optimal sensor suite to provide overall health assessment of a host system.
S4 can be logically partitioned into three major subdivisions: the knowledge base, the down-select iteration, and the final selection analysis. The knowledge base required for productive use of S4 consists of system design information and heritage experience together with a focus on components with health implications. The sensor suite down-selection is an iterative process for identifying a group of sensors that provide good fault detection and isolation for targeted fault scenarios. In the final selection analysis, a statistical evaluation algorithm provides the final robustness test for each down-selected sensor suite. NASA GRC has developed an approach to sensor data qualification that applies empirical relationships, threshold detection techniques, and Bayesian belief theory to a network of sensors related by physics (i.e., analytical redundancy) in order to identify the failure of a given sensor within the network. This data quality validation approach extends the state-of-the-art, from red-lines and reasonableness checks that flag a sensor after it fails, to include analytical redundancy-based methods that can identify a sensor in the process of failing. The focus of this effort is on understanding the proper application of analytical redundancy-based data qualification methods for onboard use in monitoring Upper Stage sensors.
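The down-select iteration described above can be sketched as a greedy search over a fault-signature matrix: each round keeps the sensor that detects the most still-undetected faults and separates the most still-ambiguous fault pairs. This is a toy analogue of an S4-style down-select, not the actual algorithm; the fault and sensor names are invented.

```python
from itertools import combinations

def greedy_sensor_select(signature):
    # signature[fault][sensor] = 1 if that sensor fires for that fault.
    faults = list(signature)
    unseparated = set(combinations(faults, 2))  # fault pairs not yet isolated
    undetected = set(faults)
    chosen = []
    remaining = sorted(next(iter(signature.values())))  # sensor names
    while remaining:
        def gain(s):
            sep = sum(1 for f, g in unseparated if signature[f][s] != signature[g][s])
            det = sum(1 for f in undetected if signature[f][s])
            return sep + det
        best = max(remaining, key=gain)
        if gain(best) == 0:
            break  # no sensor improves detection or isolation any further
        chosen.append(best)
        unseparated = {(f, g) for f, g in unseparated
                       if signature[f][best] == signature[g][best]}
        undetected = {f for f in undetected if not signature[f][best]}
        remaining.remove(best)
    return chosen

# Hypothetical signature matrix for three fault scenarios and three sensors.
sigs = {
    "leak":     {"p1": 1, "p2": 0, "t1": 1},
    "blockage": {"p1": 1, "p2": 1, "t1": 0},
    "heater":   {"p1": 0, "p2": 0, "t1": 1},
}
picked = greedy_sensor_select(sigs)
```

Here two of the three sensors suffice to detect every fault and distinguish every pair, so the third is dropped; a statistical robustness test over the down-selected suite, as in S4's final selection analysis, would follow.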
Space Station Module Power Management and Distribution System (SSM/PMAD)
NASA Technical Reports Server (NTRS)
Miller, William (Compiler); Britt, Daniel (Compiler); Elges, Michael (Compiler); Myers, Chris (Compiler)
1994-01-01
This report provides an overview of the Space Station Module Power Management and Distribution (SSM/PMAD) testbed system and describes recent enhancements to that system. Four tasks made up the original contract: (1) common module power management and distribution system automation plan definition; (2) definition of hardware and software elements of automation; (3) design, implementation, and delivery of the hardware and software making up the SSM/PMAD system; and (4) definition and development of the host breadboard computer environment. Additions and/or enhancements to the SSM/PMAD testbed that have occurred since July 1990 are reported. These include: (1) rehosting of the MAESTRO scheduler; (2) reorganization of the automation software internals; (3) a more robust communications package; (4) addition of the activity editor to the MAESTRO scheduler; (5) rehosting of the LPLMS to execute under KNOMAD; (6) completion of the KNOMAD knowledge management facility; (7) significant improvement of the user interface; (8) soft and incipient fault handling design; (9) implementation of intermediate levels of autonomy; and (10) switch maintenance.
Automatic translation of digraph to fault-tree models
NASA Technical Reports Server (NTRS)
Iverson, David L.
1992-01-01
The author presents a technique for converting digraph models, including those models containing cycles, to a fault-tree format. A computer program which automatically performs this translation using an object-oriented representation of the models has been developed. The fault-trees resulting from translations can be used for fault-tree analysis and diagnosis. Programs to calculate fault-tree and digraph cut sets and perform diagnosis with fault-tree models have also been developed. The digraph to fault-tree translation system has been successfully tested on several digraphs of varying size and complexity. Details of some representative translation problems are presented. Most of the computation performed by the program is dedicated to finding minimal cut sets for digraph nodes in order to break cycles in the digraph. Fault-trees produced by the translator have been successfully used with NASA's Fault-Tree Diagnosis System (FTDS) to produce automated diagnostic systems.
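Minimal cut set computation of the kind the translator relies on can be sketched with a MOCUS-style expansion over a small fault tree; the gate structure and basic-event names here are illustrative only, not the paper's models.

```python
def minimal_cut_sets(tree):
    # A gate is ('AND', children) or ('OR', children); leaves are event names.
    def expand(node):
        if isinstance(node, str):
            return [{node}]
        op, kids = node
        if op == 'OR':
            # Any child's cut set fails the gate on its own.
            return [cs for k in kids for cs in expand(k)]
        # AND: cross-product of the children's cut sets.
        result = [set()]
        for k in kids:
            result = [r | cs for r in result for cs in expand(k)]
        return result
    candidates = expand(tree)
    # Keep only minimal sets and drop duplicates.
    minimal = [s for s in candidates if not any(t < s for t in candidates)]
    out = []
    for s in minimal:
        if s not in out:
            out.append(s)
    return out

top = ('AND', ['power_loss',
               ('OR', ['valve_stuck', ('AND', ['sensor_a', 'sensor_b'])])])
cuts = minimal_cut_sets(top)
```

For cyclic digraphs the translator must first break cycles by computing cut sets of digraph nodes, which is where the paper notes most of the computation goes; the expansion above assumes an acyclic tree.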
Fault tolerant architectures for integrated aircraft electronics systems, task 2
NASA Technical Reports Server (NTRS)
Levitt, K. N.; Melliar-Smith, P. M.; Schwartz, R. L.
1984-01-01
The architectural basis for an advanced fault tolerant on-board computer to succeed the current generation of fault tolerant computers is examined. The network error tolerant system architecture is studied with particular attention to intercluster configurations and communication protocols, and to refined reliability estimates. The diagnosis of faults, so that appropriate choices for reconfiguration can be made is discussed. The analysis relates particularly to the recognition of transient faults in a system with tasks at many levels of priority. The demand driven data-flow architecture, which appears to have possible application in fault tolerant systems is described and work investigating the feasibility of automatic generation of aircraft flight control programs from abstract specifications is reported.
Nonlinear dynamic failure process of tunnel-fault system in response to strong seismic event
NASA Astrophysics Data System (ADS)
Yang, Zhihua; Lan, Hengxing; Zhang, Yongshuang; Gao, Xing; Li, Langping
2013-03-01
Strong earthquakes and faults have a significant effect on the stability of underground tunnel structures. This study used a 3-Dimensional Discrete Element model and real records of ground motion in the Wenchuan earthquake to investigate the dynamic response of a tunnel-fault system. The typical tunnel-fault system was composed of one planned railway tunnel and one seismically active fault. The discrete numerical model was carefully calibrated by comparing field survey and numerical results of ground motion. It was then used to examine detailed quantitative information on the dynamic response characteristics of the tunnel-fault system, including stress distribution, strain, vibration velocity, and the tunnel failure process. The intensive tunnel-fault interaction during seismic loading induces dramatic stress redistribution and stress concentration at the intersection of the tunnel and fault. The tunnel-fault system behavior is characterized by a complicated nonlinear dynamic failure process in response to a real strong seismic event. It can be qualitatively divided into 5 main stages in terms of its stress, strain, and rupturing behaviors: (1) strain localization, (2) rupture initiation, (3) rupture acceleration, (4) spontaneous rupture growth, and (5) stabilization. This study provides insight into further stability estimation of underground tunnel structures under the combined effect of strong earthquakes and faults.
NASA Technical Reports Server (NTRS)
Andre, Constance G.
1989-01-01
SPOT stereoscopic and TM multispectral images support evidence in AVHRR thermal-IR images of a major unmapped shear zone in Phanerozoic cover rocks southeast of the ancient Najd Fault System in the Arabian Shield. This shear zone and faults of the Najd share a common alignment, orientation, and sinistral sense of movement. These similarities suggest a 200-km extension of the Najd Fault System and reactivation since it formed in the late Precambrian. Topographic and lithologic features in the TM and SPOT data along one of three faults inferred from the AVHRR data indicate sinistral offsets up to 2.5 km, en echelon folds and secondary faults like those predicted by models of left-lateral strike-slip faulting. The age of the affected outcrops indicates reactivation of Najd faults in the Cretaceous, judging from TM and SPOT data or in the Tertiary, based on AVHRR data. The total length of the system visible at the surface measures 1300 km. If the Najd Fault System is extrapolated beneath sands of the Empty Quarter to faults of a similar trend in South Yemen, the shear zone would span the Arabian Plate. Furthermore, if extensions into the Arabian Sea bed and into Egypt proposed by others are considered, it would exceed 3000 km.
Software fault tolerance in computer operating systems
NASA Technical Reports Server (NTRS)
Iyer, Ravishankar K.; Lee, Inhwan
1994-01-01
This chapter provides data and analysis of the dependability and fault tolerance for three operating systems: the Tandem/GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Based on measurements from these systems, basic software error characteristics are investigated. Fault tolerance in operating systems resulting from the use of process pairs and recovery routines is evaluated. Two levels of models are developed to analyze error and recovery processes inside an operating system and interactions among multiple instances of an operating system running in a distributed environment. The measurements show that the use of process pairs in Tandem systems, which was originally intended for tolerating hardware faults, allows the system to tolerate about 70% of defects in system software that result in processor failures. The loose coupling between processors which results in the backup execution (the processor state and the sequence of events occurring) being different from the original execution is a major reason for the measured software fault tolerance. The IBM/MVS system fault tolerance almost doubles when recovery routines are provided, in comparison to the case in which no recovery routines are available. However, even when recovery routines are provided, there is almost a 50% chance of system failure when critical system jobs are involved.
Possible strand of the North Anatolian fault in the Thrace basin, Turkey - An interpretation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perincek, D.
1991-02-01
This study focuses on the presence of a major strike-slip fault system in the Thrace basin. This new discovery is important for the geology of the Thrace basin and also brings a new perspective to petroleum exploration. The wrench fault system is named the Thrace strike-slip fault system (Perincek, 1988). Similarities with the North Anatolian fault zone prompted an investigation of the relationship between these two fault systems. The study area covers most of the Thrace region of Turkey. The purposes of this paper are (1) to outline the geometry of the Thrace fault system, (2) to demonstrate its tectonic relation with other major structures of the region, (3) to define the age of its inception, and (4) to discuss possible magnitudes of the lateral displacement. The interpretation is based mainly on seismic data consisting of 180 seismic reflection profiles that have a total cumulative length of 2,800 km. Seismic data are complemented with subsurface control from 54 wells.
Critical fault patterns determination in fault-tolerant computer systems
NASA Technical Reports Server (NTRS)
Mccluskey, E. J.; Losq, J.
1978-01-01
The method proposed tries to enumerate all the critical fault-patterns (successive occurrences of failures) without analyzing every single possible fault. The conditions for the system to be operating in a given mode can be expressed in terms of the static states. Thus, one can find all the system states that correspond to a given critical mode of operation. The next step consists in analyzing the fault-detection mechanisms, the diagnosis algorithm and the process of switch control. From them, one can find all the possible system configurations that can result from a failure occurrence. Thus, one can list all the characteristics, with respect to detection, diagnosis, and switch control, that failures must have to constitute critical fault-patterns. Such an enumeration of the critical fault-patterns can be directly used to evaluate the overall system tolerance to failures. Present research is focused on how to efficiently make use of these system-level characteristics to enumerate all the failures that verify these characteristics.
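The enumeration of critical fault patterns can be sketched as a breadth-first walk over system configurations, where each transition is a component failure as resolved by the (abstracted) detection, diagnosis, and switch-control mechanisms. The duplex-system states and component names below are hypothetical, not the paper's formalism.

```python
from collections import deque

def critical_fault_patterns(transitions, start, failed, max_len=3):
    # transitions[state][component] = resulting configuration after that
    # component fails; enumerate all failure sequences reaching `failed`.
    patterns = []
    queue = deque([(start, ())])
    while queue:
        state, seq = queue.popleft()
        if state == failed:
            patterns.append(seq)
            continue
        if len(seq) >= max_len:
            continue
        for comp, nxt in sorted(transitions.get(state, {}).items()):
            queue.append((nxt, seq + (comp,)))
    return patterns

# Hypothetical duplex system: a first CPU fault is covered (degrade to
# simplex), a second is not; an uncovered switch fault fails the system.
t = {
    "duplex":  {"cpu": "simplex", "switch": "failed"},
    "simplex": {"cpu": "failed"},
}
pats = critical_fault_patterns(t, "duplex", "failed")
```

Such an enumeration yields exactly the successive-failure sequences that constitute critical fault patterns, which can then feed an overall tolerance evaluation as the abstract describes.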
Practical Methods for Estimating Software Systems Fault Content and Location
NASA Technical Reports Server (NTRS)
Nikora, A.; Schneidewind, N.; Munson, J.
1999-01-01
Over the past several years, we have developed techniques to discriminate between fault-prone software modules and those that are not, to estimate a software system's residual fault content, to identify those portions of a software system having the highest estimated number of faults, and to estimate the effects of requirements changes on software quality.
Cost and benefits optimization model for fault-tolerant aircraft electronic systems
NASA Technical Reports Server (NTRS)
1983-01-01
The factors involved in economic assessment of fault tolerant systems (FTS) and fault tolerant flight control systems (FTFCS) are discussed. Algorithms for optimization and economic analysis of FTFCS are documented.
NASA Astrophysics Data System (ADS)
Heilman, E.; Kolawole, F.; Mayle, M.; Atekwana, E. A.; Abdelsalam, M. G.
2017-12-01
We address the longstanding question of the role of long-lived basement structures in strain accommodation within active rift systems. Studies have highlighted the influence of pre-existing zones of lithospheric weakness in modulating faulting and fault kinematics. Here, we investigate the role of the Neoproterozoic Mughese Shear Zone (MSZ) in Cenozoic rifting along the Rukwa-Malawi rift segment of the East African Rift System (EARS). Detailed analyses of Shuttle Radar Topography Mission (SRTM) DEM and filtered aeromagnetic data allowed us to determine the relationship between rift-related basement-rooted normal faults and the MSZ fabric extending along the southern boundary of the Rukwa-Malawi Rift North Basin. Our results show that the magnetic lineaments defining the MSZ coincide with the collinear Rukwa Rift border fault (Ufipa Fault), a dextral strike-slip fault (Mughese Fault), and the North Basin hinge-zone fault (Mbiri Fault). Fault-scarp and minimum fault-throw analyses reveal that within the Rukwa Rift, the Ufipa Border Fault has been accommodating significant displacement relative to the Lupa Border Fault, which represents the northeastern border fault of the Rukwa Rift. Our analysis also shows that within the North Basin half-graben, the Mbiri Fault has accommodated the most vertical displacement relative to other faults along the half-graben hinge zone. We propose that the Cenozoic reactivation along the MSZ facilitated significant normal slip displacement along the Ufipa Border Fault and the Mbiri Fault, and minor dextral strike-slip between the two faults. We suggest that the fault kinematics along the Rukwa-Malawi Rift is the result of reactivation of the MSZ through regional oblique extension.
Li, Xiangfei; Lin, Yuliang
2017-01-01
This paper proposes a new scheme of reconstructing current sensor faults and estimating unknown load disturbance for a permanent magnet synchronous motor (PMSM)-driven system. First, the original PMSM system is transformed into two subsystems; the first subsystem has unknown system load disturbances, which are unrelated to sensor faults, and the second subsystem has sensor faults, but is free from unknown load disturbances. Introducing a new state variable, the augmented subsystem that has sensor faults can be transformed into having actuator faults. Second, two sliding mode observers (SMOs) are designed: the unknown load disturbance is estimated by the first SMO in the subsystem, which has unknown load disturbance, and the sensor faults can be reconstructed using the second SMO in the augmented subsystem, which has sensor faults. The gains of the proposed SMOs and their stability analysis are developed via the solution of linear matrix inequality (LMI). Finally, the effectiveness of the proposed scheme was verified by simulations and experiments. The results demonstrate that the proposed scheme can reconstruct current sensor faults and estimate unknown load disturbance for the PMSM-driven system. PMID:29211017
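The sliding mode observer idea can be illustrated on a first-order plant with an unknown constant disturbance: once the observer reaches the sliding surface, the low-pass-filtered switching injection converges to the disturbance, the same equivalent-injection principle the paper uses for reconstruction. All plant parameters and gains below are made up; the paper's actual observers are designed via LMIs for the full PMSM model.

```python
import math

def smo_disturbance_estimate(a=2.0, u=1.0, d=0.5, gain=5.0,
                             dt=1e-3, steps=20_000, tau=0.2):
    # Plant:    x' = -a*x + u + d   (d unknown to the observer)
    # Observer: xh' = -a*xh + u + v, with v = gain * sign(x - xh)
    # After sliding, the filtered injection v approximates d.
    x, xh, d_hat = 0.0, 0.0, 0.0
    for _ in range(steps):
        e = x - xh
        v = gain * math.copysign(1.0, e) if e != 0.0 else 0.0
        x += dt * (-a * x + u + d)
        xh += dt * (-a * xh + u + v)
        d_hat += dt * (v - d_hat) / tau  # low-pass filter of the injection
    return d_hat

d_est = smo_disturbance_estimate()  # approaches the true d = 0.5
```

The switching gain must dominate the disturbance bound (here 5.0 > 0.5) for sliding to occur; the residual chattering is what the low-pass filter averages out.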
Model-Based Diagnostics for Propellant Loading Systems
NASA Technical Reports Server (NTRS)
Daigle, Matthew John; Foygel, Michael; Smelyanskiy, Vadim N.
2011-01-01
The loading of spacecraft propellants is a complex, risky operation. Therefore, diagnostic solutions are necessary to quickly identify when a fault occurs, so that recovery actions can be taken or an abort procedure can be initiated. Model-based diagnosis solutions, established using an in-depth analysis and understanding of the underlying physical processes, offer the advanced capability to quickly detect and isolate faults, identify their severity, and predict their effects on system performance. We develop a physics-based model of a cryogenic propellant loading system, which describes the complex dynamics of liquid hydrogen filling from a storage tank to an external vehicle tank, as well as the influence of different faults on this process. The model takes into account the main physical processes such as highly nonequilibrium condensation and evaporation of the hydrogen vapor, pressurization, and also the dynamics of liquid hydrogen and vapor flows inside the system in the presence of helium gas. Since the model incorporates multiple faults in the system, it provides a suitable framework for model-based diagnostics and prognostics algorithms. Using this model, we analyze the effects of faults on the system, derive symbolic fault signatures for the purposes of fault isolation, and perform fault identification using a particle filter approach. We demonstrate the detection, isolation, and identification of a number of faults using simulation-based experiments.
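The symbolic fault-signature isolation step can be sketched as sign matching between observed residual deviations and each fault's predicted signature; the residuals and fault names below are hypothetical, not the paper's actual loading-system model.

```python
def isolate_faults(signatures, observed):
    # A fault stays a candidate if every observed residual deviation matches
    # its predicted sign ('+', '-', '0'); '?' marks a residual not yet observed.
    return [f for f, sig in sorted(signatures.items())
            if all(o == '?' or o == s for s, o in zip(sig, observed))]

# Hypothetical signatures over three residuals (tank pressure, flow, temperature).
sigs = {
    "valve_stuck_closed": ('+', '-', '0'),
    "leak":               ('-', '-', '0'),
    "heater_on_failure":  ('0', '0', '+'),
}
candidates = isolate_faults(sigs, ('+', '-', '?'))  # temperature not yet deviated
```

As more residuals deviate, the candidate set shrinks; identification of fault severity would then proceed on the surviving candidates, for instance with the particle filter the paper describes.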
Active Structures as Deduced from Geomorphic Features: A case in Hsinchu Area, northwestern Taiwan
NASA Astrophysics Data System (ADS)
Chen, Y.; Shyu, J.; Ota, Y.; Chen, W.; Hu, J.; Tsai, B.; Wang, Y.
2002-12-01
The Hsinchu area is located in northwestern Taiwan, within the fold-and-thrust belt created by the arc-continent collision between the Eurasian and Philippine Sea plates. Since the collision is still ongoing, the island is tectonically active and full of active faults. According to the historical records, some of the faults are seismically active. In the Hsinchu area, two active faults, the Hsinchu and Hsincheng, have been previously mapped. To evaluate their recent activity, we studied the related geomorphic features using a newly developed Digital Elevation Model (DEM), aerial photos, and field investigation. Geologically, both faults are coupled with a hanging-wall anticline. The anticlines are recently active, as indicated by the deformation of the geomorphic surfaces. The Hsinchu fault system shows complicated corresponding scarps, distributed sub-parallel to the fault trace previously suggested by projection of subsurface geology. This is probably caused by its strike-slip component tearing the surrounding area along the main trace. The scarps associated with the Hsincheng fault system are rather simple and unique. It offsets a flight of terraces all the way down to the recent flood plain, indicating its long-lasting activity. One to two kilometers east of the main trace, a back-thrust is found, showing vertical surface offsets coupled with those of the main fault. The striking discovery in this study is that the surface deformation is only distributed on the southern bank of the Touchien river, and suddenly decreases when crossing another tear fault system, which originates from the Hsincheng fault in the west and extends southeastward parallel to the Touchien river. The strike-slip fault system mentioned above not only bisects the Hsinchu fault, but also divides the Hsincheng fault into segments. The supporting evidence found in this study includes pressure ridges and depressions. As a whole, the study area is tectonically dominated by three active fault systems and two actively growing anticlines.
The interactions between these active structural systems formed the complicated geomorphic features presented in this paper.
An Ontology for Identifying Cyber Intrusion Induced Faults in Process Control Systems
NASA Astrophysics Data System (ADS)
Hieb, Jeffrey; Graham, James; Guan, Jian
This paper presents an ontological framework that permits formal representations of process control systems, including elements of the process being controlled and the control system itself. A fault diagnosis algorithm based on the ontological model is also presented. The algorithm can identify traditional process elements as well as control system elements (e.g., IP network and SCADA protocol) as fault sources. When these elements are identified as a likely fault source, the possibility exists that the process fault is induced by a cyber intrusion. A laboratory-scale distillation column is used to illustrate the model and the algorithm. Coupled with a well-defined statistical process model, this fault diagnosis approach provides cyber security enhanced fault diagnosis information to plant operators and can help identify that a cyber attack is underway before a major process failure is experienced.
Farrington, R.B.; Pruett, J.C. Jr.
1984-05-14
A fault detecting apparatus and method are provided for use with an active solar system. The apparatus provides an indication as to whether one or more predetermined faults have occurred in the solar system. The apparatus includes a plurality of sensors, each sensor being used in determining whether a predetermined condition is present. The outputs of the sensors are combined in a pre-established manner in accordance with the kind of predetermined faults to be detected. Indicators communicate with the outputs generated by combining the sensor outputs to give the user of the solar system and the apparatus an indication as to whether a predetermined fault has occurred. Upon detection and indication of any predetermined fault, the user can take appropriate corrective action so that the overall reliability and efficiency of the active solar system are increased.
Farrington, Robert B.; Pruett, Jr., James C.
1986-01-01
A fault detecting apparatus and method are provided for use with an active solar system. The apparatus provides an indication as to whether one or more predetermined faults have occurred in the solar system. The apparatus includes a plurality of sensors, each sensor being used in determining whether a predetermined condition is present. The outputs of the sensors are combined in a pre-established manner in accordance with the kind of predetermined faults to be detected. Indicators communicate with the outputs generated by combining the sensor outputs to give the user of the solar system and the apparatus an indication as to whether a predetermined fault has occurred. Upon detection and indication of any predetermined fault, the user can take appropriate corrective action so that the overall reliability and efficiency of the active solar system are increased.
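The pre-established combination of sensor outputs can be sketched as a small rule table: each predetermined fault is indicated when all of its associated sensor conditions are present. The condition and fault names are invented for illustration; the patent itself combines the sensor outputs in hardware.

```python
def detect_faults(readings, fault_rules):
    # A fault is indicated when every one of its required sensor
    # conditions is currently true.
    return [fault for fault, required in sorted(fault_rules.items())
            if all(readings.get(cond, False) for cond in required)]

# Hypothetical solar-loop sensor conditions and fault definitions.
readings = {"pump_on": True, "flow_low": True,
            "collector_hot": True, "tank_hot": False}
rules = {
    "blocked_loop":   ["pump_on", "flow_low"],
    "pump_failure":   ["collector_hot", "flow_low"],
    "controller_err": ["tank_hot", "pump_on"],
}
faults = detect_faults(readings, rules)
```

A single sensor condition can participate in several fault rules, which mirrors how the apparatus combines sensor outputs "in accordance with the kind of predetermined faults to be detected."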
Real-time fault diagnosis for propulsion systems
NASA Technical Reports Server (NTRS)
Merrill, Walter C.; Guo, Ten-Huei; Delaat, John C.; Duyar, Ahmet
1991-01-01
Current research toward real time fault diagnosis for propulsion systems at NASA-Lewis is described. The research is being applied to both air breathing and rocket propulsion systems. Topics include fault detection methods including neural networks, system modeling, and real time implementations.
Situation Awareness of Onboard System Autonomy
NASA Technical Reports Server (NTRS)
Schreckenghost, Debra; Thronesbery, Carroll; Hudson, Mary Beth
2005-01-01
We have developed intelligent agent software for onboard system autonomy. Our approach is to provide control agents that automate crew and vehicle systems, and operations assistants that aid humans in working with these autonomous systems. We use the 3 Tier control architecture to develop the control agent software that automates system reconfiguration and routine fault management. We use the Distributed Collaboration and Interaction (DCI) System to develop the operations assistants that provide human services, including situation summarization, event notification, activity management, and support for manual commanding of autonomous systems. In this paper we describe how the operations assistants aid situation awareness of the autonomous control agents. We also describe our evaluation of the DCI System to support control engineers during a ground test at Johnson Space Center (JSC) of the Post Processing System (PPS) for regenerative water recovery.
The Gabbs Valley, Nevada, geothermal prospect: Exploring for a potential blind geothermal resource
NASA Astrophysics Data System (ADS)
Payne, J.; Bell, J. W.; Calvin, W. M.
2012-12-01
The Gabbs Valley prospect in west-central Nevada is a potential blind geothermal resource system. Possible structural controls on this system were investigated using high-resolution LiDAR, low sun-angle aerial (LSA) photography, exploratory fault trenching and a shallow temperature survey. Active Holocene faults have previously been identified at 37 geothermal systems with indication of temperatures greater than 100° C in the western Nevada region. Active fault controls in Gabbs Valley include both Holocene and historical structures. Two historical earthquakes occurring in 1932 and 1954 have overlapping surface rupture patterns in Gabbs Valley. Three active fault systems identified through LSA and LiDAR mapping have characteristics of Basin and Range normal faulting and Walker Lane oblique dextral faulting. The East Monte Cristo Mountains fault zone is an 8.5 km long continuous NNE striking, discrete fault with roughly 0.5 m right-normal historic motion and 3 m vertical Quaternary separation. The Phillips Wash fault zone is an 8.2 km long distributed fault system striking NE to N, with Quaternary fault scarps of 1-3 m vertical separation and a 500 m wide graben adjacent to the Cobble Cuesta anticline. This fault displays ponded drainages, an offset terrace riser and right stepping en echelon fault patterns suggestive of left lateral offset, and fault trenching exposed non-matching stratigraphy typical of a significant component of lateral offset. The unnamed faults of Gabbs Valley are a 10.6 km long system of normal faults striking NNE and Quaternary scarps are up to 4 m high. These normal faults largely do not have historic surface rupture, but a small segment of 1932 rupture has been identified. A shallow (2 m deep) temperature survey of 80 points covering roughly 65 square kilometers was completed. Data were collected over approximately 2 months, and continual base station temperature measurements were used to seasonally correct temperature measurements. 
A 2.5 km long temperature anomaly, more than 3 °C above background, forms a west-northwest-trending zone between the terminations of the Phillips Wash fault zone and the unnamed faults of Gabbs Valley to the south. Rupture segments of two young active faults bracket the temperature anomaly, which may have several possible causes. 1. Increases in stress near the rupture segments or tip-lines of these faults, or where multiple fault splays exist, can increase fault permeability; the un-ruptured segments of these faults may be controlling the location of the Gabbs Valley thermal anomaly between ruptured segments of the 1932 Cedar Mountain and 1954 Fairview Peak earthquakes. 2. Numerous unnamed normal faults may interact, with the hanging wall of these faults hosting the thermal anomaly; the size and extent of the anomaly may reflect its proximity to a flat playa rather than the direct location of the shallow heat source. 3. The linear northwest trend of the thermal anomaly may reflect a hydrologic barrier in the subsurface controlling where heated fluids rise. A concealed NW-striking fault is possible, but has not been identified in previous studies or in the LiDAR or LSA fault mapping.
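The seasonal correction described above (continuous base-station logging used to remove the ambient temperature trend from a two-month campaign) can be sketched as follows; the days, temperatures, and the simple drift-subtraction scheme are invented for illustration, not taken from the survey:

```python
def seasonal_correct(survey, base_log, reference_temp):
    """Correct each field measurement by the base-station drift on its day.

    survey: list of (day, raw_temp_C) field measurements
    base_log: dict day -> base-station temperature (C) on that day
    reference_temp: base-station temperature chosen as the campaign datum
    """
    corrected = []
    for day, raw in survey:
        drift = base_log[day] - reference_temp
        corrected.append(raw - drift)
    return corrected

survey = [(1, 14.2), (30, 16.9), (60, 18.4)]   # (day, raw 2-m temperature)
base_log = {1: 12.0, 30: 14.5, 60: 16.0}       # base station warms over ~2 months
print(seasonal_correct(survey, base_log, reference_temp=12.0))
```

Once the seasonal trend is removed, points that remain warm relative to their neighbors stand out as candidate anomalies.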
Nearly frictionless faulting by unclamping in long-term interaction models
Parsons, T.
2002-01-01
In defiance of direct rock-friction observations, some transform faults appear to slide with little resistance. In this paper, finite element models are used to show how strain energy is minimized by interacting faults that can cause long-term reduction in fault-normal stresses (unclamping). A model fault contained within a sheared elastic medium concentrates stress at its end points with increasing slip. If accommodating structures free up the ends, then the fault responds by rotating, lengthening, and unclamping. This concept is illustrated by a comparison between simple strike-slip faulting and a mid-ocean-ridge model with the same total transform length; calculations show that the more complex system unclamps the transforms and operates at lower energy. In another example, the overlapping San Andreas fault system in the San Francisco Bay region is modeled; this system is complicated by junctions and stepovers. A finite element model indicates that the normal stress along parts of the faults could be reduced to hydrostatic levels after ~60-100 k.y. of system-wide slip. If this process occurs in the earth, then parts of major transform fault zones could appear nearly frictionless.
Automatic Detection of Electric Power Troubles (ADEPT)
NASA Technical Reports Server (NTRS)
Wang, Caroline; Zeanah, Hugh; Anderson, Audie; Patrick, Clint; Brady, Mike; Ford, Donnie
1988-01-01
Automatic Detection of Electric Power Troubles (ADEPT) is an expert system that integrates knowledge from three different suppliers to offer an advanced fault-detection system. It is designed for two modes of operation: real-time fault isolation and simulated modeling. Real-time fault isolation of components is accomplished on a power system breadboard through the Fault Isolation Expert System (FIES II) interface with a rule system developed in-house. Faults are quickly detected and displayed, and the rules and chain of reasoning are optionally provided on a laser printer. This system consists of a simulated space station power module using direct-current power supplies for solar arrays on three power buses. For tests of the system's ability to locate faults inserted via switches, loads are configured by an INTEL microcomputer and the Symbolics artificial intelligence development system. As these loads are resistive in nature, Ohm's Law is used as the basis for rules by which faults are located. The three-bus system can correct faults automatically where there is a surplus of power available on any of the three buses. Techniques developed and used can be applied readily to other control systems requiring rapid intelligent decisions. Simulated modeling, used for theoretical studies, is implemented using a modified version of Kennedy Space Center's KATE (Knowledge-Based Automatic Test Equipment), FIES II windowing, and an ADEPT knowledge base.
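The Ohm's-Law rule base mentioned in the abstract can be illustrated with a minimal sketch; the tolerance, bus voltage, and diagnosis labels below are assumptions for illustration, not ADEPT's actual rules:

```python
def expected_current(volts, ohms):
    """Ohm's Law: the current a healthy resistive load should draw."""
    return volts / ohms

def check_load(volts, ohms, measured_amps, tol=0.05):
    """Compare a measured current against Ohm's Law; tol is a fractional
    tolerance (assumed value)."""
    i_exp = expected_current(volts, ohms)
    if measured_amps < i_exp * (1 - tol):
        return "open/high-resistance fault suspected"
    if measured_amps > i_exp * (1 + tol):
        return "short/overload fault suspected"
    return "nominal"

print(check_load(28.0, 14.0, 2.0))   # 28 V / 14 ohm = 2 A, as measured: nominal
print(check_load(28.0, 14.0, 0.4))   # far below the Ohm's-Law prediction
```

A rule engine like FIES II would chain many such checks across the three buses to localize which load or switch caused the discrepancy.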
Catchings, R.D.; Gandhok, G.; Goldman, M.R.; Steedman, Clare
2007-01-01
Introduction: The Santa Clara Valley (SCV) is located in the southern San Francisco Bay area of California and is bounded by the Santa Cruz Mountains to the southwest, the Diablo Range to the northeast, and the San Francisco Bay to the north (Fig. 1). The SCV, which includes the City of San Jose, numerous smaller cities, and much of the high-technology manufacturing and research area commonly referred to as the Silicon Valley, has a population in excess of 1.7 million people (2000 U.S. Census; http://quickfacts.census.gov/qfd/states/06/06085.html). The SCV is situated between major active faults of the San Andreas Fault system, including the San Andreas Fault to the southwest and the Hayward and Calaveras faults to the northeast, and other faults inferred to lie beneath the alluvium of the SCV (CWDR, 1967; Bortugno et al., 1991). The importance of the SCV as a major industrial center, its large population, and its proximity to major earthquake faults are important considerations with respect to earthquake hazards and water-resource management. The fault-bounded alluvial aquifer system beneath the valley is the source of about one-third of the water supply for the metropolitan area (Hanson et al., 2004). To better address the earthquake hazards of the SCV, the U.S. Geological Survey (USGS) has undertaken a program to evaluate potential seismic sources, the effects of strong ground shaking, and stratigraphy associated with the regional aquifer system. As part of that program, and to better understand the water resources of the valley, the USGS and the Santa Clara Valley Water District (SCVWD) began joint studies in 2000 to characterize the faults, stratigraphy, and structures beneath the SCV.
Such features are important to both agencies because they directly influence the availability and management of groundwater resources in the valley, and they affect the severity and distribution of strong shaking from local and regional earthquake sources that may affect reservoirs, pipelines, and flood-protection facilities maintained by SCVWD. As one component of these joint studies, the USGS acquired an approximately 10-km-long, high-resolution, combined seismic reflection/refraction transect from the Santa Cruz Mountains to the central SCV in December 2000 (Figs. 1 and 2a,b). The overall seismic investigation of the western Santa Clara Valley also included an ~18-km-long, lower-resolution (~50-m sensor spacing) seismic imaging survey from the central Santa Cruz Mountains to the central part of the valley (Fig. 1). Collectively, we refer to these seismic investigations as the 2000 western Santa Clara Seismic Investigations (SCSI). Results of the high-resolution investigation, referred to as SCSI-HR, are presented in this report, and Catchings et al. (2006) present results of the low-resolution investigation (SCSI-LR) in a separate report. In this report, we present data acquisition parameters, unprocessed and processed seismic data, and interpretations of the SCSI-HR seismic transect.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bruns, T.R.; Carlson, P.R.; Stevenson, A.J.
1990-06-01
GLORIA images collected in 1989 along southeast Alaska and British Columbia strikingly show the active trace of the Fairweather-Queen Charlotte transform fault system beneath the outer shelf and slope; seismic-reflection data are used to track the fault system across the continental shelf where GLORIA data are not available. From Cross Sound to Chatham Strait, the fault system comprises two sets of subparallel fault traces separated by 3 to 6 km. The fault system crosses the shelf from Icy Point to south of Yakobi Valley, then follows the shelf edge to Chatham Strait. Between Chatham Strait and Dixon Entrance, a single, sharply defined active fault trace underlies the upper and middle slope. This fault segment is bounded on the seaward side by a high, midslope ridge and by lower slope Quaternary(?) anticlines up to 35 km wide. Southeast of Dixon Entrance, the active fault trace trends back onto the outer shelf until midway along the Queen Charlotte Islands, then cuts back to and stays at midslope to the Tuzo Wilson Knolls south of the Queen Charlotte Islands. The fault steps westward at Tuzo Wilson Knolls, which are likely part of a spreading ridge segment. Major deep-sea fans along southeast Alaska show a southeastward age progression from older to younger and record both point-source deposition at Chatham Strait and Dixon Entrance and subsequent (Quaternary?) offset along the fault system. Subsidence of the ocean plate now adjacent to the Chatham Strait-Dixon Entrance fault segment initiated development of both Mukluk and Horizon Channels.
Tutorial: Advanced fault tree applications using HARP
NASA Technical Reports Server (NTRS)
Dugan, Joanne Bechta; Bavuso, Salvatore J.; Boyd, Mark A.
1993-01-01
Reliability analysis of fault tolerant computer systems for critical applications is complicated by several factors. These modeling difficulties are discussed and dynamic fault tree modeling techniques for handling them are described and demonstrated. Several advanced fault tolerant computer systems are described, and fault tree models for their analysis are presented. HARP (Hybrid Automated Reliability Predictor) is a software package developed at Duke University and NASA Langley Research Center that is capable of solving the fault tree models presented.
NASA Astrophysics Data System (ADS)
Mbaya, Timmy
Embedded Aerospace Systems have to perform safety and mission critical operations in a real-time environment where timing and functional correctness are extremely important. Guidance, Navigation, and Control (GN&C) systems substantially rely on complex software interfacing with hardware in real-time; any faults in software or hardware, or their interaction could result in fatal consequences. Integrated Software Health Management (ISWHM) provides an approach for detection and diagnosis of software failures while the software is in operation. The ISWHM approach is based on probabilistic modeling of software and hardware sensors using a Bayesian network. To meet memory and timing constraints of real-time embedded execution, the Bayesian network is compiled into an Arithmetic Circuit, which is used for on-line monitoring. This type of system monitoring, using an ISWHM, provides automated reasoning capabilities that compute diagnoses in a timely manner when failures occur. This reasoning capability enables time-critical mitigating decisions and relieves the human agent from the time-consuming and arduous task of foraging through a multitude of isolated---and often contradictory---diagnosis data. For the purpose of demonstrating the relevance of ISWHM, modeling and reasoning is performed on a simple simulated aerospace system running on a real-time operating system emulator, the OSEK/Trampoline platform. Models for a small satellite and an F-16 fighter jet GN&C (Guidance, Navigation, and Control) system have been implemented. Analysis of the ISWHM is then performed by injecting faults and analyzing the ISWHM's diagnoses.
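The Bayesian reasoning at the core of the ISWHM can be illustrated at toy scale; the one-sensor model and every probability below are invented for the sketch, and a real ISWHM would compile a full Bayesian network over many software and hardware sensors into an arithmetic circuit:

```python
def posterior_fault(p_fault, p_alarm_given_fault, p_alarm_given_ok, alarm=True):
    """P(fault | alarm observation) by Bayes' rule for a one-sensor model."""
    if alarm:
        num = p_fault * p_alarm_given_fault
        den = num + (1 - p_fault) * p_alarm_given_ok
    else:
        num = p_fault * (1 - p_alarm_given_fault)
        den = num + (1 - p_fault) * (1 - p_alarm_given_ok)
    return num / den

# Prior 1% fault rate; alarm fires 95% of the time under a fault,
# 2% of the time spuriously (all numbers illustrative).
print(posterior_fault(0.01, 0.95, 0.02, alarm=True))   # fault now far more credible
print(posterior_fault(0.01, 0.95, 0.02, alarm=False))  # silence keeps fault unlikely
```

Compiling the network to an arithmetic circuit replaces this on-the-fly arithmetic with a fixed feed-forward evaluation, which is what makes the diagnosis cheap enough for real-time embedded use.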
Fault Analysis in Solar Photovoltaic Arrays
NASA Astrophysics Data System (ADS)
Zhao, Ye
Fault analysis in solar photovoltaic (PV) arrays is a fundamental task to increase reliability, efficiency and safety in PV systems. Conventional fault protection methods usually add fuses or circuit breakers in series with PV components. But these protection devices are only able to clear faults and isolate faulty circuits if they carry a large fault current. However, this research shows that faults in PV arrays may not be cleared by fuses under some fault scenarios, due to the current-limiting nature and non-linear output characteristics of PV arrays. First, this thesis introduces new simulation and analytic models that are suitable for fault analysis in PV arrays. Based on the simulation environment, this thesis studies a variety of typical faults in PV arrays, such as ground faults, line-line faults, and mismatch faults. The effect of a maximum power point tracker on fault current is discussed and shown to, at times, prevent the fault-current protection devices from tripping. A small-scale experimental PV benchmark system has been developed at Northeastern University to further validate the simulation conclusions. Additionally, this thesis examines two types of unique faults found in a PV array that have not been studied in the literature. One is a fault that occurs under low irradiance condition. The other is a fault evolution in a PV array during night-to-day transition. Our simulation and experimental results show that overcurrent protection devices are unable to clear the fault under "low irradiance" and "night-to-day transition". However, the overcurrent protection devices may work properly when the same PV fault occurs in daylight. As a result, a fault under "low irradiance" and "night-to-day transition" might be hidden in the PV array and become a potential hazard for system efficiency and reliability.
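The current-limiting argument can be made concrete with a back-of-the-envelope sketch; the string current, the number of back-feeding strings, and the 1.56x fuse-sizing factor are assumptions drawn from common PV practice, not values from the thesis:

```python
def fuse_clears(fault_current, fuse_rating):
    """A series fuse only clears the fault if the fault current exceeds
    its rating (a simplification: real fuses have time-current curves)."""
    return fault_current > fuse_rating

i_sc = 8.0                       # string short-circuit current, A (illustrative)
fuse_rating = 1.56 * i_sc        # typical series-fuse sizing above I_sc

fault_day = 1.9 * i_sc           # two strings back-feed a line-line fault at noon
fault_low_irradiance = 0.3 * i_sc  # same fault near dawn: PV source is starved

print(fuse_clears(fault_day, fuse_rating))             # fuse blows, fault cleared
print(fuse_clears(fault_low_irradiance, fuse_rating))  # fuse holds: hidden fault
```

Because the array can never push much more than its irradiance-limited short-circuit current, the low-irradiance fault sits permanently below the fuse rating, which is the "hidden fault" hazard the thesis describes.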
Siler, Drew; Hinz, Nicholas H.; Faulds, James E.
2018-01-01
Slip can induce concentration of stresses at discontinuities along fault systems. These structural discontinuities, i.e., fault terminations, fault step-overs, intersections, bends, and other fault interaction areas, are known to host fluid flow in ore deposition systems, oil and gas reservoirs, and geothermal systems. We modeled stress transfer associated with slip on faults with Holocene-to-historic slip histories at the Salt Wells and Bradys geothermal systems in western Nevada, United States. Results show discrete locations of stress perturbation within discontinuities along these fault systems. Well field data, surface geothermal manifestations, and subsurface temperature data, each a proxy for modern fluid circulation in the fields, indicate that geothermal fluid flow is focused in these same areas where stresses are most highly perturbed. These results suggest that submeter- to meter-scale slip on these fault systems generates stress perturbations that are sufficiently large to promote slip on an array of secondary structures spanning the footprint of the modern geothermal activity. Slip on these secondary faults and fractures generates permeability through kinematic deformation and allows for transmission of fluids. Still, mineralization is expected to seal permeability along faults and fractures over time scales that are generally shorter than either earthquake recurrence intervals or the estimated life span of geothermal fields. This suggests that though stress perturbations resulting from fault slip are broadly important for defining the location and spatial extent of enhanced permeability at structural discontinuities, continual generation and maintenance of flow conduits throughout these areas are probably dependent on the deformation mechanism(s) affecting individual structures.
Adaptive robust fault-tolerant control for linear MIMO systems with unmatched uncertainties
NASA Astrophysics Data System (ADS)
Zhang, Kangkang; Jiang, Bin; Yan, Xing-Gang; Mao, Zehui
2017-10-01
In this paper, two novel fault-tolerant control design approaches are proposed for linear MIMO systems with actuator additive faults, multiplicative faults and unmatched uncertainties. For time-varying multiplicative and additive faults, new adaptive laws and additive compensation functions are proposed. A set of conditions is developed such that the unmatched uncertainties are compensated by actuators in control. On the other hand, for unmatched uncertainties whose projection onto the unmatched space is nonzero, additive functions are designed, based on a (vector) relative degree condition, to compensate for the uncertainties from output channels in the presence of actuator faults. The developed fault-tolerant control schemes are applied to two aircraft systems to demonstrate the efficiency of the proposed approaches.
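The flavor of adaptive compensation for a multiplicative actuator fault can be shown on a scalar toy problem; this is a textbook gradient-type adaptive law, not the authors' MIMO design, and every constant below is illustrative:

```python
def simulate(rho=0.4, gamma=5.0, dt=0.001, steps=20000, r=1.0):
    """Scalar plant x' = -x + rho*u with unknown actuator effectiveness
    rho in (0, 1]; an adaptive gain theta (ideal value 1/rho) restores
    tracking of the reference model xm' = -xm + r."""
    x = xm = 0.0
    theta = 0.0
    for _ in range(steps):
        e = x - xm                   # tracking error
        u = theta * r                # adaptive feedforward control
        x += dt * (-x + rho * u)     # plant with faulty actuator
        xm += dt * (-xm + r)         # reference model to be tracked
        theta -= dt * gamma * e * r  # gradient adaptation on tracking error
    return x, xm, theta

x, xm, theta = simulate()
print(abs(x - xm), theta)  # error decays toward 0, theta approaches 1/rho
```

The adaptive gain converges without ever measuring the effectiveness loss directly, which is the essential mechanism (in one dimension) behind the multiplicative-fault compensation the paper develops for the MIMO case.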
Publications - PIR 2015-5-2 | Alaska Division of Geological & Geophysical Surveys
Betka, P.M., and Gillis, R.J.
Strike-slip and reverse-slip faults in the Bruin Bay fault system, Ursus Head, lower Cook Inlet
Identifiability of Additive, Time-Varying Actuator and Sensor Faults by State Augmentation
NASA Technical Reports Server (NTRS)
Upchurch, Jason M.; Gonzalez, Oscar R.; Joshi, Suresh M.
2014-01-01
Recent work has provided a set of necessary and sufficient conditions for identifiability of additive step faults (e.g., lock-in-place actuator faults, constant bias in the sensors) using state augmentation. This paper extends these results to an important class of faults which may affect linear, time-invariant systems. In particular, the faults under consideration are those which vary with time and affect the system dynamics additively. Such faults may manifest themselves in aircraft as, for example, control surface oscillations, control surface runaway, and sensor drift. The set of necessary and sufficient conditions presented in this paper is general, and applies when a class of time-varying faults affects arbitrary combinations of actuators and sensors. The results in the main theorems are illustrated by two case studies, which provide some insight into how the conditions may be used to check the theoretical identifiability of fault configurations of interest for a given system. It is shown that while state augmentation can be used to identify certain fault configurations, other fault configurations are theoretically impossible to identify using state augmentation, giving practitioners valuable insight into such situations. That is, the limitations of state augmentation for a given system and configuration of faults are made explicit. Another limitation of model-based methods is that there can be large numbers of fault configurations, thus making identification of all possible configurations impractical. However, the theoretical identifiability of known, credible fault configurations can be tested using the theorems presented in this paper, which can then assist the efforts of fault identification practitioners.
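The state-augmentation idea can be sketched for the simplest case, a constant actuator bias; the rank test below is a standard observability check standing in for the paper's identifiability conditions, and the system matrices are invented:

```python
import numpy as np

# Augment x' = A x + B(u + f) with the bias state f' = 0, then test whether
# the augmented pair (A_aug, C_aug) is observable: if so, the fault state f
# can be reconstructed from the measured output y = C x.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

n, m = A.shape[0], B.shape[1]
A_aug = np.block([[A, B], [np.zeros((m, n)), np.zeros((m, m))]])
C_aug = np.hstack([C, np.zeros((C.shape[0], m))])

# Observability matrix of the augmented (n+m)-state system
O = np.vstack([C_aug @ np.linalg.matrix_power(A_aug, k) for k in range(n + m)])
rank = np.linalg.matrix_rank(O)
print(rank, n + m)  # full rank here: the bias fault is identifiable from y
```

Time-varying faults, the subject of the paper, require augmenting with a model of the fault's dynamics (e.g. an oscillator for control-surface oscillation) rather than a single constant state, and the identifiability conditions become correspondingly richer.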
Designing Fault-Injection Experiments for the Reliability of Embedded Systems
NASA Technical Reports Server (NTRS)
White, Allan L.
2012-01-01
This paper considers the long-standing problem of conducting fault-injection experiments to establish the ultra-reliability of embedded systems. There have been extensive efforts in fault injection, and this paper offers a partial summary of them, but these previous efforts have focused on realism and efficiency. Fault injections have been used to examine diagnostics and to test algorithms, but the literature does not contain any framework that says how to conduct fault-injection experiments to establish ultra-reliability. A solution to this problem integrates field data, arguments from design, and fault injection into a seamless whole. The solution in this paper is to derive a model reduction theorem for a class of semi-Markov models suitable for describing ultra-reliable embedded systems. The derivation shows that a tight upper bound on the probability of system failure can be obtained using only the means of system-recovery times, thus reducing the experimental effort to estimating a reasonable number of easily observed parameters. The paper includes an example of a system subject to both permanent and transient faults. There is a discussion of integrating fault injection with field data and arguments from design.
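The spirit of the result, a failure-probability bound that needs only mean recovery times, can be illustrated with a standard textbook bound; the formula and numbers below are generic, not taken from the paper:

```python
import math

def coincident_fault_bound(lam, mean_recovery):
    """Upper bound on the probability that a second (near-coincident) fault
    arrives during recovery, for Poisson arrivals at rate lam:
    P = 1 - E[exp(-lam*R)] <= lam * E[R], needing only the MEAN of R."""
    return lam * mean_recovery

def p_exact_fixed_recovery(lam, r):
    """Exact probability when the recovery time is a fixed value r."""
    return 1.0 - math.exp(-lam * r)

lam = 1e-4   # fault arrivals per hour (illustrative)
r = 0.05     # 180-second recovery window, expressed in hours

print(p_exact_fixed_recovery(lam, r), "<=", coincident_fault_bound(lam, r))
```

The practical consequence mirrors the paper's: the experimenter need not characterize the full recovery-time distribution, only estimate its mean, to certify an upper bound.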
New methods for the condition monitoring of level crossings
NASA Astrophysics Data System (ADS)
García Márquez, Fausto Pedro; Pedregal, Diego J.; Roberts, Clive
2015-04-01
Level crossings represent a high risk for railway systems. This paper demonstrates the potential to improve maintenance management through the use of intelligent condition monitoring coupled with reliability centred maintenance (RCM). RCM combines advanced electronics, control, computing and communication technologies to address the multiple objectives of cost effectiveness, improved quality, reliability and services. RCM collects digital and analogue signals utilising distributed transducers connected to either point-to-point or digital bus communication links. Assets in many industries use data logging capable of providing post-failure diagnostic support, but to date little use has been made of combined qualitative and quantitative fault detection techniques. The research takes the hydraulic railway level crossing barrier (LCB) system as a case study and develops a generic strategy for failure analysis, data acquisition and incipient fault detection. For each barrier the hydraulic characteristics, the motor's current and voltage, hydraulic pressure and the barrier's position are acquired. In order to acquire the data at a central point efficiently, without errors, a distributed single-cable Fieldbus is utilised. This allows the connection of all sensors through the project's proprietary communication nodes to a high-speed bus. The system developed in this paper for the condition monitoring described above detects faults by means of comparing what can be considered a 'normal' or 'expected' shape of a signal with respect to the actual shape observed as new data become available. ARIMA (autoregressive integrated moving average) models were employed for detecting faults. The statistical tests known as Jarque-Bera and Ljung-Box have been considered for testing the model.
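One of the residual diagnostics named above, the Ljung-Box test statistic, can be computed from scratch; in practice the residuals would come from a fitted ARIMA model (e.g. via statsmodels), and the two series below are synthetic stand-ins for healthy and faulty barrier signatures:

```python
import math
import random

def ljung_box_q(resid, max_lag):
    """Ljung-Box Q = n(n+2) * sum_{k=1..h} r_k^2 / (n-k), where r_k is the
    lag-k sample autocorrelation of the residuals."""
    n = len(resid)
    mean = sum(resid) / n
    c0 = sum((x - mean) ** 2 for x in resid) / n
    q = 0.0
    for k in range(1, max_lag + 1):
        ck = sum((resid[i] - mean) * (resid[i - k] - mean) for i in range(k, n)) / n
        q += (ck / c0) ** 2 / (n - k)
    return n * (n + 2) * q

rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(200)]    # model still fits: white residuals
drifting = [math.sin(i / 3.0) for i in range(200)]   # fault: structure left in residuals

print(ljung_box_q(white, 10))     # compare against a chi-square(10) threshold
print(ljung_box_q(drifting, 10))  # far above threshold: signal shape has changed
```

A large Q says the residuals are still autocorrelated, i.e. the 'expected' signal shape no longer explains the new data, which is exactly the comparison the monitoring system uses to flag an incipient fault.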
NASA Technical Reports Server (NTRS)
Gwaltney, David A.; Briscoe, Jeri M.
2005-01-01
Integrated System Health Management (ISHM) architectures for spacecraft will include hard real-time, critical subsystems and soft real-time monitoring subsystems. Interaction between these subsystems will be necessary and an architecture supporting multiple criticality levels will be required. Demonstration hardware for the Integrated Safety-Critical Advanced Avionics Communication & Control (ISAACC) system has been developed at NASA Marshall Space Flight Center. It is a modular system using a commercially available time-triggered protocol, TTP/C, that supports hard real-time distributed control systems independent of the data transmission medium. The protocol is implemented in hardware and provides guaranteed low-latency messaging with inherent fault-tolerance and fault-containment. Interoperability between modules and systems of modules using the TTP/C is guaranteed through definition of messages and the precise message schedule implemented by the master-less Time Division Multiple Access (TDMA) communications protocol. "Plug-and-play" capability for sensors and actuators provides automatically configurable modules supporting sensor recalibration and control algorithm re-tuning without software modification. Modular components of controlled physical system(s) critical to control algorithm tuning, such as pumps or valve components in an engine, can be replaced or upgraded as "plug and play" components without modification to the ISAACC module hardware or software. ISAACC modules can communicate with other vehicle subsystems through time-triggered protocols or other communications protocols implemented over Ethernet, MIL-STD-1553 and RS-485/422. Other communication bus physical layers and protocols can be included as required. In this way, the ISAACC modules can be part of a system-of-systems in a vehicle with multi-tier subsystems of varying criticality.
The goal of the ISAACC architecture development is control and monitoring of safety critical systems of a manned spacecraft. These systems include spacecraft navigation and attitude control, propulsion, automated docking, vehicle health management and life support. ISAACC can integrate local critical subsystem health management with subsystems performing long term health monitoring. The ISAACC system and its relationship to ISHM will be presented.
Formal Techniques for Synchronized Fault-Tolerant Systems
NASA Technical Reports Server (NTRS)
DiVito, Ben L.; Butler, Ricky W.
1992-01-01
We present the formal verification of synchronizing aspects of the Reliable Computing Platform (RCP), a fault-tolerant computing system for digital flight control applications. The RCP uses NMR-style redundancy to mask faults and internal majority voting to purge the effects of transient faults. The system design has been formally specified and verified using the EHDM verification system. Our formalization is based on an extended state machine model incorporating snapshots of local processors' clocks.
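The NMR-style majority voting used to mask faults can be sketched in a few lines; the redundancy level and the values voted on are illustrative:

```python
from collections import Counter

def majority_vote(replicas):
    """Return the strict-majority value among redundant channel outputs,
    or None when no value holds a majority (an uncorrectable disagreement)."""
    value, count = Counter(replicas).most_common(1)[0]
    return value if count > len(replicas) // 2 else None

print(majority_vote([42, 42, 42, 7]))  # transient fault on one channel is masked
print(majority_vote([1, 2, 3, 4]))     # no majority: disagreement must be handled
```

Voting on each frame not only masks a faulty channel's output but, because healthy state is voted back into all channels, also purges the lingering effects of a transient fault, which is the property the RCP verification establishes formally.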
Duchek, A.B.; McBride, J.H.; Nelson, W.J.; Leetaru, H.E.
2004-01-01
The Cottage Grove fault system in southern Illinois has long been interpreted as an intracratonic dextral strike-slip fault system. We investigated its structural geometry and kinematics in detail using (1) outcrop data, (2) extensive exposures in underground coal mines, (3) abundant borehole data, and (4) a network of industry seismic reflection profiles, including data reprocessed by us. Structural contour mapping delineates distinct monoclines, broad anticlines, and synclines that express Paleozoic-age deformation associated with strike slip along the fault system. As shown on seismic reflection profiles, prominent near-vertical faults that cut the entire Paleozoic section and basement-cover contact branch upward into outward-splaying, high-angle reverse faults. The master fault, sinuous along strike, is characterized along its length by an elongate anticline, ~3 km wide, that parallels the southern side of the master fault. These features signify that the overall kinematic regime was transpressional. Due to the absence of suitable piercing points, the amount of slip cannot be measured, but is constrained at less than 300 m near the ground surface. The Cottage Grove fault system apparently follows a Precambrian terrane boundary, as suggested by magnetic intensity data, the distribution of ultramafic igneous intrusions, and patterns of earthquake activity. The fault system was primarily active during the Alleghanian orogeny of Late Pennsylvanian and Early Permian time, when ultramafic igneous magma intruded along en echelon tensional fractures. © 2004 Geological Society of America.
NASA Astrophysics Data System (ADS)
Pantosti, Daniela
2017-04-01
The October 30, 2016 (06:40 UTC) Mw 6.5 earthquake occurred about 28 km NW of Amatrice village as the result of upper crust normal faulting on a nearly 30 km-long, NW-SE oriented, SW dipping fault system in the Central Apennines. This earthquake is the strongest Italian seismic event since the 1980 Mw 6.9 Irpinia earthquake. The Mw 6.5 event was the largest shock of a seismic sequence, which began on August 24 with a Mw 6.0 earthquake and also included a Mw 5.9 earthquake on October 26, about 9 and 35 km NW of Amatrice village, respectively. Field surveys of coseismic geological effects at the surface started within hours of the mainshock and were carried out by several national and international teams of earth scientists (about 120 people) from different research institutions and universities coordinated by the EMERGEO Working Group of the Istituto Nazionale di Geofisica e Vulcanologia. This collaborative effort was focused on the detailed recognition and mapping of: 1) the total extent of the October 30 coseismic surface ruptures, 2) their geometric and kinematic characteristics, 3) the coseismic displacement distribution along the activated fault system, including subsidiary and antithetic ruptures. The huge amount of collected data (more than 8000 observation points of several types of coseismic effects at the surface) were stored, managed and shared using a specifically designed spreadsheet to populate a georeferenced database. More comprehensive mapping of the details and extent of surface rupture was facilitated by Structure-from-Motion photogrammetry surveys by means of several helicopter flights. An almost continuous alignment of ruptures about 30 km long, N150/160 striking, mainly SW side down was observed along the already known active Mt. Vettore - Mt. Bove fault system. The mapped ruptures occasionally overlapped those of the August 24 Mw 6.0 and October 26 Mw 5.9 shocks. 
The coincidence between the observed surface ruptures and the trace of active normal faults mapped in the available geological literature is noteworthy. The field data collected suggest a complex coseismic surface faulting pattern along closely spaced, parallel or subparallel, overlapping or step-like synthetic and antithetic fault splays. The cumulative surface faulting length has been estimated at about 40 km. The maximum vertical offset is significant, locally exceeding 2 meters along the Mt. Vettore Fault, measured both along bedrock fault planes and free faces affecting unconsolidated deposits. This enormous collaborative experience has a twofold relevance: on the one hand, it allowed us to document the earthquake ruptures in high detail before winter destroyed them; on the other, it represents the first large European experience of coseismic-effects surveying, which should serve as a leading case for establishing a European coseismic-effects team ready to respond to future seismic crises at the European level.
NASA Technical Reports Server (NTRS)
Lundgren, Paul; Saucier, Francois; Palmer, Randy; Langon, Marc
1995-01-01
We compute crustal motions in Alaska by calculating the finite element solution for an elastic spherical shell problem. The method we use allows the finite element mesh to include faults and very long baseline interferometry (VLBI) baseline rates of change. Boundary conditions include Pacific-North American (PA-NA) plate motions. The solution is constrained by the oblique orientation of the Fairweather-Queen Charlotte strike-slip faults relative to the PA-NA relative motion direction and the oblique orientation from normal convergence of the eastern Aleutian trench fault systems, as well as strike-slip motion along the Denali and Totschunda fault systems. We explore the effects that a range of fault slip constraints and weighting of VLBI rates of change has on the solution. This allows us to test the motion on faults, such as the Denali fault, where there are conflicting reports on its present-day slip rate. We find a pattern of displacements which produce fault motions generally consistent with geologic observations. The motion of the continuum has the general pattern of radial movement of crust to the NE away from the Fairweather-Queen Charlotte fault systems in SE Alaska and Canada. This pattern of crustal motion is absorbed across the Mackenzie Mountains in NW Canada, with strike-slip motion constrained along the Denali and Tintina fault systems. In south central Alaska and the Alaska forearc, oblique convergence at the eastern Aleutian trench and the strike-slip motion of the Denali fault system produce a counterclockwise pattern of motion which is partially absorbed along the Contact and related fault systems in southern Alaska and is partially extruded into the Bering Sea and into the forearc parallel to the Aleutian trench from the Alaska Peninsula westward.
Rates of motion and fault slip are small in western and northern Alaska, but the motions we compute are consistent with the senses of strike-slip motion inferred geologically along the Kaltag, Kobuk Trench, and Thompson Creek faults and with the normal faulting observed in NW Alaska near Nome. The nonrigid behavior of our finite element solution produces patterns of motion that would not have been expected from rigid block models: strike-slip faults can exist in a continuum that has motion mostly perpendicular to their strikes, and faults can exhibit along-strike differences in magnitudes and directions.
Method and system for environmentally adaptive fault tolerant computing
NASA Technical Reports Server (NTRS)
Copenhaver, Jason L. (Inventor); Ramos, Jeremy (Inventor); Wolfe, Jeffrey M. (Inventor); Brenner, Dean (Inventor)
2010-01-01
A method and system for adapting fault tolerant computing. The method includes the steps of measuring an environmental condition representative of an environment. The sensitivity of an on-board processing system to the measured environmental condition is then assessed. It is determined whether to reconfigure the fault tolerance of the on-board processing system based in part on the measured environmental condition. The fault tolerance of the on-board processing system may be reconfigured based in part on the measured environmental condition.
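The reconfiguration logic described above can be sketched minimally: pick a redundancy mode from the product of a measured environmental condition (e.g. radiation flux) and the processor's sensitivity to it, and reconfigure only when the recommended mode changes. The function names, modes, and thresholds below are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch of environmentally adaptive fault tolerance:
# mode selection from an effective upset-risk score. All names and
# threshold values are invented for illustration.

def select_ft_mode(flux, sensitivity, low=1.0, high=10.0):
    """Return a redundancy mode based on effective upset risk."""
    risk = flux * sensitivity          # effective upset risk (arbitrary units)
    if risk < low:
        return "simplex"               # benign environment: no redundancy
    elif risk < high:
        return "duplex"                # duplicate-and-compare
    return "tmr"                       # triple modular redundancy

def reconfigure(current_mode, flux, sensitivity):
    """Return (target mode, whether a reconfiguration is needed)."""
    target = select_ft_mode(flux, sensitivity)
    return target, target != current_mode
```

The point of the scheme is the second function: fault tolerance is treated as a run-time knob driven by the environment, rather than a fixed design-time choice.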
A Solid-State Fault Current Limiting Device for VSC-HVDC Systems
NASA Astrophysics Data System (ADS)
Larruskain, D. Marene; Zamora, Inmaculada; Abarrategui, Oihane; Iturregi, Araitz
2013-08-01
Faults in the DC circuit constitute one of the main limitations of voltage source converter (VSC) HVDC systems, as the high fault currents can seriously damage the converters. In this article, a new design for a fault current limiter (FCL) is proposed, which is capable of limiting the fault current as well as interrupting it, isolating the DC grid. The operation of the proposed FCL is analysed and verified with the most common faults that can occur in overhead lines.
Graph-based real-time fault diagnostics
NASA Technical Reports Server (NTRS)
Padalkar, S.; Karsai, G.; Sztipanovits, J.
1988-01-01
A real-time fault detection and diagnosis capability is absolutely crucial in the design of large-scale space systems. Some of the existing AI-based fault diagnostic techniques like expert systems and qualitative modelling are frequently ill-suited for this purpose. Expert systems are often inadequately structured, difficult to validate, and suffer from knowledge acquisition bottlenecks. Qualitative modelling techniques sometimes generate a large number of failure source alternatives, thus hampering speedy diagnosis. In this paper we present a graph-based technique which is well suited for real-time fault diagnosis; structured knowledge representation and acquisition; and testing and validation. A Hierarchical Fault Model of the system to be diagnosed is developed. At each level of hierarchy, there exist fault propagation digraphs denoting causal relations between failure modes of subsystems. The edges of such a digraph are weighted with fault propagation time intervals. Efficient and restartable graph algorithms are used for on-line speedy identification of failure source components.
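The core idea of such a digraph can be sketched: edges carry [min, max] propagation-time intervals, and a node is a candidate failure source if some failure onset time is consistent with every observed alarm time. The algorithm below is an illustrative simplification (it accumulates intervals along one discovered path per node), not the paper's actual restartable algorithms.

```python
# Minimal sketch of failure-source identification on a fault propagation
# digraph with time-interval edge weights, assuming the structure described
# in the abstract. Illustrative only; path widening is omitted for brevity.

def cumulative_windows(graph, source):
    """BFS from `source`, accumulating [min, max] propagation delays.
    graph: {node: [(successor, tmin, tmax), ...]}"""
    windows = {source: (0.0, 0.0)}
    frontier = [source]
    while frontier:
        node = frontier.pop()
        lo, hi = windows[node]
        for succ, tmin, tmax in graph.get(node, []):
            if succ not in windows:      # keep the first path found
                windows[succ] = (lo + tmin, hi + tmax)
                frontier.append(succ)
    return windows

def candidate_sources(graph, alarms):
    """Nodes whose propagation windows are consistent with all alarm times.
    alarms: {node: observed alarm time}"""
    candidates = []
    for node in graph:
        win = cumulative_windows(graph, node)
        if not all(a in win for a in alarms):
            continue                     # some alarmed node is unreachable
        # The unknown failure onset t0 must satisfy, for every alarm a:
        #   t_a - hi(a) <= t0 <= t_a - lo(a)
        t0_lo = max(t - win[a][1] for a, t in alarms.items())
        t0_hi = min(t - win[a][0] for a, t in alarms.items())
        if t0_lo <= t0_hi:               # the onset-time intervals intersect
            candidates.append(node)
    return candidates
```

Timing information does real work here: a source is ruled out not only when an alarmed node is unreachable, but also when the alarms arrive in a pattern no single onset time can explain.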
The emergence of asymmetric normal fault systems under symmetric boundary conditions
NASA Astrophysics Data System (ADS)
Schöpfer, Martin P. J.; Childs, Conrad; Manzocchi, Tom; Walsh, John J.; Nicol, Andrew; Grasemann, Bernhard
2017-11-01
Many normal fault systems and, on a smaller scale, fracture boudinage often exhibit asymmetry with one fault dip direction dominating. It is a common belief that the formation of domino and shear band boudinage with a monoclinic symmetry requires a component of layer-parallel shearing. Moreover, domains of parallel faults are frequently used to infer the presence of a décollement. Using Distinct Element Method (DEM) modelling we show that asymmetric fault systems can emerge under symmetric boundary conditions. A statistical analysis of DEM models suggests that the fault dip directions and system polarities can be explained using a random process if the strength contrast between the brittle layer and the surrounding material is high. The models indicate that domino and shear band boudinage are unreliable shear-sense indicators. Moreover, the presence of a décollement should not be inferred on the basis of a domain of parallel faults alone.
Bounemeur, Abdelhamid; Chemachema, Mohamed; Essounbouli, Najib
2018-05-10
In this paper, an active fuzzy fault tolerant tracking control (AFFTTC) scheme is developed for a class of multi-input multi-output (MIMO) unknown nonlinear systems in the presence of unknown actuator faults, sensor failures and external disturbance. The developed control scheme deals with four kinds of faults for both sensors and actuators. The bias, drift, and loss of accuracy additive faults are considered along with the loss of effectiveness multiplicative fault. A fuzzy adaptive controller based on back-stepping design is developed to deal with actuator failures and unknown system dynamics. However, an additional robust control term is added to deal with sensor faults, approximation errors, and external disturbances. Lyapunov theory is used to prove the stability of the closed loop system. Numerical simulations on a quadrotor are presented to show the effectiveness of the proposed approach.
NASA Astrophysics Data System (ADS)
Polverino, Pierpaolo; Pianese, Cesare; Sorrentino, Marco; Marra, Dario
2015-04-01
The paper focuses on the design of a procedure for the development of an on-field diagnostic algorithm for solid oxide fuel cell (SOFC) systems. The diagnosis design phase relies on an in-depth analysis of the mutual interactions among all system components by exploiting the physical knowledge of the SOFC system as a whole. This phase consists of the Fault Tree Analysis (FTA), which identifies the correlations among possible faults and their corresponding symptoms at the system component level. The main outcome of the FTA is an inferential isolation tool (Fault Signature Matrix - FSM), which unambiguously links the faults to the symptoms detected during the system monitoring. In this work the FTA is considered as a starting point to develop an improved FSM. Making use of a model-based investigation, a fault-to-symptoms dependency study is performed. For this purpose a dynamic model, previously developed by the authors, is exploited to simulate the system under faulty conditions. Five faults are simulated, one for the stack and four occurring at the balance-of-plant (BOP) level. Moreover, the robustness of the FSM design is increased by exploiting symptom thresholds defined for the investigation of the quantitative effects of the simulated faults on the affected variables.
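The FSM idea reduces to a small lookup: each fault has a binary symptom signature, and isolation matches the observed symptom set against those signatures. The fault and symptom names below are invented examples, not the paper's actual matrix.

```python
# Illustrative Fault Signature Matrix for an SOFC-like system. Each row maps
# a fault to the set of symptoms (threshold crossings) it is expected to
# produce. Fault/symptom names are hypothetical.

FSM = {
    "stack_degradation": {"low_stack_voltage", "high_stack_temp"},
    "air_blower_fault":  {"low_air_flow", "high_stack_temp"},
    "fuel_leak":         {"low_fuel_pressure", "low_stack_voltage"},
    "reformer_fault":    {"low_fuel_pressure", "high_reformer_temp"},
}

def symptom_active(value, nominal, threshold):
    """A symptom fires when a monitored variable leaves its nominal band."""
    return abs(value - nominal) > threshold

def isolate(observed):
    """Return faults whose signature exactly matches the observed symptoms."""
    return [fault for fault, sig in FSM.items() if sig == observed]
```

Unambiguous isolation, in this framing, is the requirement that no two rows of the matrix share the same signature; the thresholds mentioned in the abstract determine when each symptom is declared active.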
NASA Technical Reports Server (NTRS)
Mccann, Robert S.; Spirkovska, Lilly; Smith, Irene
2013-01-01
Integrated System Health Management (ISHM) technologies have advanced to the point where they can provide significant automated assistance with real-time fault detection, diagnosis, guided troubleshooting, and failure consequence assessment. To exploit these capabilities in actual operational environments, however, ISHM information must be integrated into operational concepts and associated information displays in ways that enable human operators to process and understand the ISHM system information rapidly and effectively. In this paper, we explore these design issues in the context of an advanced caution and warning system (ACAWS) for next-generation crewed spacecraft missions. User interface concepts for depicting failure diagnoses, failure effects, redundancy loss, "what-if" failure analysis scenarios, and resolution of ambiguity groups are discussed and illustrated.
Research on Fault Characteristics and Line Protections Within a Large-scale Photovoltaic Power Plant
NASA Astrophysics Data System (ADS)
Zhang, Chi; Zeng, Jie; Zhao, Wei; Zhong, Guobin; Xu, Qi; Luo, Pandian; Gu, Chenjie; Liu, Bohan
2017-05-01
Centralized photovoltaic (PV) systems have different fault characteristics from distributed PV systems due to the different system structures and controls. This makes the fault analysis and protection methods used in distribution networks with distributed PV unsuitable for a centralized PV power plant. Therefore, a consolidated expression for the fault current within a PV power plant under different controls was calculated considering the fault response of the PV array. Then, supported by the fault current analysis and the on-site testing data, the overcurrent relay (OCR) performance was evaluated in the collection system of an 850 MW PV power plant. The analysis reveals that OCRs at the downstream side of overhead lines may malfunction. In this case, a new relay scheme was proposed using directional distance elements. In PSCAD/EMTDC, a detailed PV system model was built and verified using the on-site testing data. Simulation results indicate that the proposed relay scheme could effectively solve the problems under various fault scenarios and PV plant output levels.
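Why OCRs struggle in inverter-fed plants is worth making concrete: inverters limit fault current to only slightly above rated current, so the relay's multiple of pickup stays near 1 and inverse-time trip times become very long or infinite. The sketch below uses the standard IEEE C37.112 inverse-time characteristic; the pickup and time-dial settings are invented for illustration, not taken from the paper's plant.

```python
# IEEE C37.112 inverse-time overcurrent characteristic,
#   t = TDS * (A / (M^p - 1) + B),  M = I_fault / I_pickup.
# Defaults are the standard "very inverse" curve constants; relay settings
# used in the example are hypothetical.

def trip_time(i_fault, i_pickup, tds, A=19.61, B=0.491, p=2.0):
    """Relay operating time in seconds; infinite below pickup."""
    M = i_fault / i_pickup             # multiple of pickup current
    if M <= 1.0:
        return float("inf")            # below pickup: the relay never operates
    return tds * (A / (M ** p - 1.0) + B)
```

With a 400 A pickup, a 4000 A network-fed fault (M = 10) trips in well under a second, while an inverter-limited 480 A contribution (M = 1.2) takes tens of seconds, and anything under pickup is never cleared. This is the misoperation mode the directional-distance scheme is meant to fix.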
Evidence of Quaternary and recent activity along the Kyaukkyan Fault, Myanmar
NASA Astrophysics Data System (ADS)
Crosetto, Silvia; Watkinson, Ian M.; Soe Min; Gori, Stefano; Falcucci, Emanuela; Nwai Le Ngal
2018-05-01
Cenozoic right-lateral shear between the eastern Indian margin and Eurasia is expressed by numerous N-S trending fault systems inboard of the Sunda trench, including the Sagaing Fault. The most easterly of these fault systems is the prominent ∼500 km long Kyaukkyan Fault, on the Shan Plateau. Myanmar's largest recorded earthquake, Mw 7.7 on 23rd May 1912, focused near Maymyo, has been attributed to the Kyaukkyan Fault, but the area has experienced little significant seismicity since then. Despite its demonstrated seismic potential and remarkable topographic expression, questions remain about the Kyaukkyan Fault's neotectonic history.
A distributed fault-tolerant signal processor /FTSP/
NASA Astrophysics Data System (ADS)
Bonneau, R. J.; Evett, R. C.; Young, M. J.
1980-01-01
A digital fault-tolerant signal processor (FTSP), an example of a self-repairing programmable system, is analyzed. The design configuration is discussed in terms of fault tolerance, system-level fault detection, isolation, and common memory. Special attention is given to the FDIR (fault detection, isolation, and reconfiguration) logic, noting that the reconfiguration decisions are based on configuration, summary status, end-around tests, and north marker/synchro data. Several mechanisms of fault detection are described which initiate reconfiguration at different levels. It is concluded that the reliability of a signal processor can be significantly enhanced by the use of fault-tolerant techniques.
Mayer, Larry; Lu, Zhong
2001-01-01
A basic model incorporating satellite synthetic aperture radar (SAR) interferometry of the fault rupture zone that formed during the Kocaeli earthquake of August 17, 1999, documents the elastic rebound that resulted from the concomitant elastic strain release along the North Anatolian fault. For pure strike-slip faults, the elastic rebound function derived from SAR interferometry is directly invertible from the distribution of elastic strain on the fault at criticality, just before the critical shear stress was exceeded and the fault ruptured. The Kocaeli earthquake, which was accompanied by as much as ∼5 m of surface displacement, distributed strain over a zone ∼110 km wide around the fault prior to faulting, although most of it was concentrated in a narrower, asymmetric 10-km-wide zone on either side of the fault. The use of SAR interferometry to document the distribution of elastic strain at the critical condition for faulting is clearly a valuable tool, both for scientific investigation and for the effective management of earthquake hazard.
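The standard elastic model underlying this kind of analysis is the screw-dislocation solution for a strike-slip fault locked above depth D: fault-parallel surface displacement accumulates as u(x) = (s/π)·arctan(x/D) with distance x from the trace. The sketch below is that textbook model, not the paper's specific inversion; the parameter values in the test are illustrative.

```python
# Screw-dislocation model for a vertical strike-slip fault locked to depth D:
#   u(x) = (s / pi) * atan(x / D)
# where s is the total (deep) slip and x is distance from the fault trace.
# This is the classic elastic half-space result, used here for illustration.

import math

def interseismic_displacement(x, slip, locking_depth):
    """Fault-parallel surface displacement at distance x from the trace."""
    return (slip / math.pi) * math.atan(x / locking_depth)
```

The model reproduces the pattern the abstract describes: half of the total relative displacement accrues within one locking depth of the fault (the arctangent reaches π/4 at x = D), while the remainder tails off over many tens of kilometres, consistent with a broad ∼110 km strained zone around a narrow high-strain core.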
NASA Technical Reports Server (NTRS)
Russell, B. Don
1989-01-01
This research concentrated on the application of advanced signal processing, expert system, and digital technologies for the detection and control of low grade, incipient faults on spaceborne power systems. The researchers have considerable experience in the application of advanced digital technologies and the protection of terrestrial power systems. This experience was used in the current contracts to develop new approaches for protecting the electrical distribution system in spaceborne applications. The project was divided into three distinct areas: (1) investigate the applicability of fault detection algorithms developed for terrestrial power systems to the detection of faults in spaceborne systems; (2) investigate the digital hardware and architectures required to monitor and control spaceborne power systems with full capability to implement new detection and diagnostic algorithms; and (3) develop a real-time expert operating system for implementing diagnostic and protection algorithms. Significant progress has been made in each of the above areas. Several terrestrial fault detection algorithms were modified to better adapt to spaceborne power system environments. Several digital architectures were developed and evaluated in light of the fault detection algorithms.
Early Tertiary transtension-related deformation and magmatism along the Tintina fault system, Alaska
Till, A.B.; Roeske, S.M.; Bradley, D.C.; Friedman, R.; Layer, P.W.
2007-01-01
Transtensional deformation was concentrated in a zone adjacent to the Tintina strike-slip fault system in Alaska during the early Tertiary. The deformation occurred along the Victoria Creek fault, the trace of the Tintina system that connects it with the Kaltag fault; together the Tintina and Kaltag fault systems girdle Alaska from east to west. Over an area of ∼25 by 70 km between the Victoria Creek and Tozitna faults, bimodal volcanics erupted; lacustrine and fluvial rocks were deposited; plutons were emplaced and deformed; and metamorphic rocks cooled, all at about the same time. Plutonic and volcanic rocks in this zone yield U-Pb zircon ages of ca. 60 Ma; 40Ar/39Ar cooling ages from those plutons and adjacent metamorphic rocks are also ca. 60 Ma. Although early Tertiary magmatism occurred over a broad area in central Alaska, metamorphism and ductile deformation accompanied that magmatism in this one zone only. Within the zone of deformation, pluton aureoles and metamorphic rocks display consistent NE-SW-stretching lineations parallel to the Victoria Creek fault, suggesting that deformation processes involved subhorizontal elongation of the package. The most deeply buried metamorphic rocks, kyanite-bearing metapelites, occur as lenses adjacent to the fault, which cuts the crust to the Moho (Beaudoin et al., 1997). Geochronologic data and field relationships suggest that the amount of early Tertiary exhumation was greatest adjacent to the Victoria Creek fault. The early Tertiary crustal-scale events that may have operated to produce transtension in this area are (1) increased heat flux and related bimodal within-plate magmatism, (2) movement on a releasing stepover within the Tintina fault system or on a regional scale involving both the Tintina and the Kobuk fault systems, and (3) oroclinal bending of the Tintina-Kaltag fault system with counterclockwise rotation of western Alaska.
Robust fault detection of wind energy conversion systems based on dynamic neural networks.
Talebi, Nasser; Sadrnia, Mohammad Ali; Darabi, Ahmad
2014-01-01
Occurrence of faults in wind energy conversion systems (WECSs) is inevitable. In order to detect the occurred faults at the appropriate time, avoid heavy economic losses, ensure safe system operation, prevent damage to adjacent relevant systems, and facilitate timely repair of failed components, a fault detection system (FDS) is required. Recurrent neural networks (RNNs) have gained a noticeable position in FDSs and they have been widely used for modeling of complex dynamical systems. One method for designing an FDS is to prepare a dynamic neural model emulating the normal system behavior. By comparing the outputs of the real system and the neural model, incidence of faults can be identified. In this paper, by utilizing a comprehensive dynamic model which contains both mechanical and electrical components of the WECS, an FDS is suggested using dynamic RNNs. The presented FDS detects faults of the generator's angular velocity sensor, pitch angle sensors, and pitch actuators. Robustness of the FDS is achieved by employing an adaptive threshold. Simulation results show that the proposed scheme is capable of detecting faults promptly, with very low false-alarm and missed-alarm rates.
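The residual-plus-adaptive-threshold logic described here can be sketched without the neural model itself: compare measurements against a model of normal behavior, and flag residuals that exceed a threshold tracking recent residual statistics. The smoothing constants below are illustrative assumptions, not the paper's design values.

```python
# Sketch of model-based fault detection with an adaptive threshold. The
# "estimated" sequence stands in for the output of a model of normal
# behavior (an RNN in the paper); the constants k and alpha are invented.

def detect_faults(measured, estimated, k=3.0, alpha=0.98):
    """Flag samples whose residual exceeds k times the smoothed mean
    absolute residual (the adaptive threshold)."""
    flags, mean_abs = [], None
    for y, y_hat in zip(measured, estimated):
        r = abs(y - y_hat)
        if mean_abs is None:
            mean_abs = r                 # seed the threshold from sample one
            flags.append(False)
            continue
        flags.append(r > k * mean_abs)   # fault indication
        mean_abs = alpha * mean_abs + (1 - alpha) * r  # adapt to normal noise
    return flags
```

Because the threshold scales with the recent noise level, the same detector tolerates a noisy sensor without false alarms yet still fires on an abrupt residual jump, which is the robustness property the abstract attributes to the adaptive threshold.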
Quasi-dynamic earthquake fault systems with rheological heterogeneity
NASA Astrophysics Data System (ADS)
Brietzke, G. B.; Hainzl, S.; Zoeller, G.; Holschneider, M.
2009-12-01
Seismic risk and hazard estimates mostly use purely empirical, stochastic models of earthquake fault systems tuned specifically to the vulnerable areas of interest. Although such models allow for reasonable risk estimates, they cannot support physical statements about the described seismicity. In contrast to such empirical stochastic models, physics-based earthquake fault system models allow for a physical reasoning and interpretation of the produced seismicity and system dynamics. Recently, different fault system earthquake simulators based on frictional stick-slip behavior have been used to study effects of stress heterogeneity, rheological heterogeneity, or geometrical complexity on earthquake occurrence, spatial and temporal clustering of earthquakes, and system dynamics. Here we present a comparison of characteristics of synthetic earthquake catalogs produced by two different formulations of quasi-dynamic fault system earthquake simulators. Both models are based on discretized frictional faults embedded in an elastic half-space. While one (1) is governed by rate- and state-dependent friction allowing three evolutionary stages of independent fault patches, the other (2) is governed by instantaneous frictional weakening with scheduled (and therefore causal) stress transfer. We analyze spatial and temporal clustering of events and characteristics of system dynamics by means of physical parameters of the two approaches.
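The building block of the second formulation (instantaneous frictional weakening) is the classic spring-slider: a patch is loaded at a constant rate, sticks until its stress reaches a static strength, then drops instantly to a dynamic level. The sketch below is a single isolated patch with invented parameter values; real simulators couple many patches through elastic stress transfer.

```python
# Minimal stick-slip sketch for one fault patch with instantaneous
# weakening: constant tectonic loading, failure at tau_static, immediate
# stress drop to tau_dynamic. Parameter values are illustrative.

def stick_slip(steps, load_rate=1.0, tau_static=10.0, tau_dynamic=4.0):
    """Simulate `steps` loading increments; return (event times, final stress)."""
    stress, events = 0.0, []
    for t in range(steps):
        stress += load_rate              # tectonic loading
        if stress >= tau_static:         # failure: instantaneous weakening
            events.append(t)
            stress = tau_dynamic         # stress drop = tau_static - tau_dynamic
    return events, stress
```

A single uncoupled patch is perfectly periodic, with recurrence interval (tau_static − tau_dynamic)/load_rate; the clustering and aperiodicity studied in the abstract emerge only once heterogeneous patches interact through stress transfer.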