Minutes for FTB conference call - 2007 May 29th

From CIFTS
Revision as of 16:09, 31 May 2007 by Rgupta@mcs.anl.gov (Talk | contribs)

Attendees

  • Argonne National Lab: Rinku Gupta, Pete Beckman, Susan Coghlan, Narayan Desai, Rob Ross, Rajeev Thakur
  • Lawrence Berkeley National Lab: Paul Hargrove
  • Oak Ridge National Lab: Al Geist, Aniruddha Shet
  • Indiana University: Tim Prins
  • Ohio State University: Qi Gao, Abhinav Vishnu
  • University of Tennessee: George Bosilca

Items Discussed

  1. Progress Reports
    1. A single consolidated progress report will be created. Everyone is to send Rinku a one-page report on their fault-tolerance work by Monday, June 4th.
  2. Presentation Walk-through: Important items discussed
    1. Mapping between event categories and software component categories - It was suggested that we use the same categories across all components, since some errors and warnings raised by different components will be similar in nature.
    2. A mechanism is needed to assign affinity between faults that are thrown by different components but occur due to a common failure.
    3. A mechanism is needed to provide a single aggregate response to the different errors reported for a common failure.
    4. 'Event scoping and grouping' was discussed - Event grouping will be important for implementors who wish to include FTB in their product in isolation. In addition, some faults/warnings may not need to be propagated beyond the local system, which establishes a need for events that are local in scope. Given the complexity of this topic, it was jointly decided to discuss it at a later stage in the design cycle.
    5. OpenMPI folks to provide input/ideas on any fault-tolerance-specific features derived from experience gained with OpenMPI.
  3. Face-to-face All-Hands meeting
    1. This will take place in Salt Lake City on a weekend in July.
    2. Cray representatives to be invited.

Action Items

  1. Component owners to work on the schemas for their components. Rinku to send out a sample.
  2. Owners to send Rinku a one-page update on their FTB work by Monday, June 4th.

Next Meeting

June 12th 2007
