MPICH, DCMF, and SPI


The ZeptoOS team has enabled IBM CNK's communication software stack to work with the Zepto compute node Linux environment for high-performance computing (HPC) applications, specifically for MPI applications. The performance of MPI applications on the Zepto compute node Linux environment is comparable to that on CNK.

As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that provides non-blocking operations; please refer to the [DCMF wiki] for details. SPI is the lowest-level user-space API for the torus DMA, the collective network, BGP-specific lock mechanisms, and other compute-node-specific functionality. There is no public documentation available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI.
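
An ordinary MPI program needs no Zepto-specific changes to its source. Below is a minimal, generic MPI example (the file name hello.c is purely illustrative and not part of the Zepto distribution); it can be compiled with the wrapper scripts described in the next section.

/* hello.c -- minimal MPI example (illustrative file name) */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* initialize the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                        /* shut down the MPI runtime */
    return 0;
}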


Compilation wrapper scripts

To run your HPC application in the Zepto environment, you first need to recompile your code with the compilation wrapper scripts (see below), which are installed in your Zepto installation path. We provide the same set of wrapper scripts that IBM provides. Once you have successfully compiled your code, you need to submit it with a Zepto kernel profile (see the Kernel Profile section). Note: only SMP mode is currently supported.

- Wrapper scripts that invoke the BGP-enhanced GNU compilers
zmpicc
zmpicxx
zmpif77
zmpif90

- Wrapper scripts that invoke IBM XL compilers
zmpixlc
zmpixlcxx
zmpixlf2003
zmpixlf77
zmpixlf90
zmpixlf95

- Wrapper scripts that invoke IBM XL compilers (thread-safe compilation)
zmpixlc_r
zmpixlcxx_r
zmpixlf2003_r
zmpixlf77_r
zmpixlf90_r
zmpixlf95_r

If you need to understand what these scripts actually do internally, run the wrapper script with the -show option.
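
For example, using the hypothetical hello.c shown in the introduction, a build with the GNU-based wrapper could look like the following; the second command, with -show, only prints the underlying compile and link command instead of running it:

$ zmpicc -O3 -o hello hello.c
$ zmpicc -show -o hello hello.c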

Without the compilation wrapper scripts

If you cannot use the compilation wrapper scripts, make sure that your makefile or build environment points to the Zepto header files and libraries correctly. An example would be:

/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \
-lzoid_cn -lrt -lpthread -lm
__INST_PREFIX__/bin/zelftool -e mpi-test-linux

NOTE:

  • Replace __INST_PREFIX__ with your actual Zepto install path.
  • Don't forget to run the zelftool utility, which marks your executable as a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.


The file layout in the Zepto install path is as follows:

|-- bin
|   |-- zelftool
|-- include
|   |-- dcmf.h
|   |-- dcmf_collectives.h
|   |-- dcmf_coremath.h
|   |-- dcmf_globalcollectives.h
|   |-- dcmf_multisend.h
|   |-- dcmf_optimath.h
|   |-- mpe_thread.h
|   |-- mpi.h
|   |-- mpi.mod
|   |-- mpi_base.mod
|   |-- mpi_constants.mod
|   |-- mpi_sizeofs.mod
|   |-- mpicxx.h
|   |-- mpif.h
|   |-- mpio.h
|   |-- mpiof.h
|   `-- mpix.h
`-- lib
    |-- libSPI.zcl.a
    |-- libcxxmpich.zcl.a
    |-- libdcmf.zcl.a
    |-- libdcmfcoll.zcl.a
    |-- libfmpich.zcl.a
    |-- libfmpich_.zcl.a
    |-- libmpich.zcl.a
    |-- libmpich.zclf90.a
    |-- libzcl.a
    `-- libzoid_cn.a

Rebuilding the libraries

All of the source code necessary to build MPICH, DCMF, and SPI is included. To build these libraries, just type:

$ make -C comm rebuild-target

The build may take half an hour to an hour to complete, depending on what file system you are using; e.g., GPFS is definitely slower than a local scratch file system.

The rebuild-target target does not know anything about your installation. To apply the newly compiled libraries, perform the following steps:

$ make -C comm update-prebuilt
$ python install.py __INST_PREFIX__

Software stack layout and source code

[Figure: Zepto-Comm-Stack.png -- layout of the communication software stack in the Zepto compute node environment]

The figure above depicts the layout of the communication software stack for the Zepto compute node environment. It is essentially the same as IBM CNK's stack, except that the CNK stack has no ZEPTO SPI layer and runs on CNK instead of Linux. We skip a description of MPICH, since it is a well-known piece of software, and briefly describe DCMF and SPI here.

  • DCMF
    • Stands for Deep Computing Messaging Framework
    • Developed by IBM, originally for the Blue Gene architecture
    • Hardware initialization and query functions
    • Supports the BGP torus DMA and the collective network
    • Provides a timer
    • Supports non-blocking collective operations
    • BGP MPICH uses DCMF internally (IBM provides a glue layer)
  • SPI
    • Stands for System Programming Interface
    • Developed by IBM; BGP-specific code
    • Kernel interfaces: DMA control, lockbox, etc.
    • DMA-related definitions
      • can be used in both user space and kernel space
    • RAS, BGP personality, and mapping-related functions

BGP SPI is designed only for IBM CNK, so it is not directly compatible with Linux. ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops requests that Linux cannot handle.
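
To make direct DCMF use more concrete, below is a minimal sketch that exercises only the basic messager entry points declared in dcmf.h (DCMF_Messager_initialize, DCMF_Messager_rank, DCMF_Messager_size, DCMF_Messager_advance, DCMF_Messager_finalize). The file name dcmf-hello.c is purely illustrative, and the exact signatures should be verified against the dcmf.h installed under __INST_PREFIX__/include; link it against -ldcmf.zcl and the other Zepto libraries as in the manual compilation example above.

/* dcmf-hello.c -- hypothetical example of direct DCMF use.
 * Verify the entry points below against the installed dcmf.h;
 * they are listed here from memory, not taken from the Zepto sources. */
#include <stdio.h>
#include <dcmf.h>

int main(void)
{
    DCMF_Messager_initialize();          /* bring up the messaging layer */

    size_t rank = DCMF_Messager_rank();  /* rank of this compute node */
    size_t size = DCMF_Messager_size();  /* number of participating nodes */
    printf("DCMF rank %zu of %zu\n", rank, size);

    DCMF_Messager_advance();             /* poll the progress engine once */

    DCMF_Messager_finalize();            /* shut down the messaging layer */
    return 0;
}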


The source code and header files of MPICH, DCMF, and SPI can be found in the comm directory. Technically, the MPICH source code is shipped in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz), which is extracted at build time.

|-- DCMF
|   |-- lib
|   |   |-- dev
|   |   |-- ga
|   |   `-- mpich2
|   |       `-- make
|   |-- sys
|   |   |-- collectives
|   |   |-- include
|   |   |-- messaging
|-- arch-runtime
|   |-- arch
|   |   `-- include
|   |       |-- bpcore
|   |       |-- cnk
|   |       |-- common
|   |       |-- spi
|   |       `-- zepto
|   |-- runtime
|   |-- testcodes
|   `-- zcl_spi
|-- prebuilt
`-- testcodes