Minutes for FTB-MPI conference call - 2010 Sep 8th
- Argonne National Lab: Rinku, Darius
- Oak Ridge National Lab: Aniruddha, Hoony, Thomas, David
- Ohio State University: Ouyang
- Indiana University: Abhishek
- University of Tennessee: Absent
- Lawrence Berkeley National Lab: Paul
- E-Poster for CIFTS submitted. Second poster is waiting for slides from everyone. Deadline for slides is Sep 13th. If no slides are received by this date, the poster will be cancelled.
- Rinku could not present the FTB 1.0 feature list due to lack of time. Postponed to next week.
- Reliability: Last week's conference call was followed by a lot of discussion on the mailing list. It was ultimately decided that having a separate API for reliability would be the way to go. This proposal was seconded by everyone. Rinku and David are to work on an initial API that can be used for reliability. It was decided that we take a scenario (e.g., one involving the job scheduler) and frame the API around it. It was also decided that we should re-check whether a suitable API already exists for our needs.
- David and Aniruddha are continuing work on their IPDPS paper.
- No update on test scripts from Thomas.
- Hoony presented slides on robustness in FTB.
- Everyone agreed that the d-fault tolerant design was a good addition to improve FTB robustness.
- This will be added in FTB 1.0.
- Ouyang will attend the next few calls instead of Raghu. No significant update from Ouyang. The OSU folks are working on their MVAPICH release.
- Abhishek to send the MPI document to the team. The new team member (name ?) is looking at MTT to be used in the testing environment.
- Paul said he did not have any significant updates.
- Aurelien was absent due to travel. [Aurelien did mention offline that he would be interested in being involved in the topology- and reliability-related work in FTB.]