CIFTS

Welcome to the CIFTS Wiki.

The Coordinated Infrastructure for Fault Tolerant Systems (CIFTS) initiative aims to develop open-source "Fault Tolerance Backplane" software that improves fault tolerance in large systems through a coordinated approach. It integrates the fault-tolerance features of the software components present in a high-end computing system, from top-level applications through middleware down to the file system and operating system. Such integration will make possible a level of fault prediction, notification, management, and recovery that is impossible today but critical to the productive use of the high-end petascale systems of tomorrow.


The CIFTS project aims to create a fault tolerance backplane and to build the infrastructure necessary to enable systems to adapt to faults in a holistic manner. The objectives of this effort are as follows:

  1. Design an open-source reference implementation of a fault awareness and notification backplane that provides common, uniform event handling and notification mechanisms for fault-aware libraries and middleware.
  2. Create a public interface specification that allows libraries, run-time systems, and applications to connect to and use the fault-tolerance backplane.
  3. Extend key libraries and applications to validate the interface choices and to form the critical mass necessary for adoption in the community.



