<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https:/// /zeptoos/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Kazutomo</id>
	<title>ZeptoOS - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https:/// /zeptoos/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Kazutomo"/>
	<link rel="alternate" type="text/html" href=" /zeptoos/index.php/Special:Contributions/Kazutomo"/>
	<updated>2026-06-12T12:15:47Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.35.6</generator>
	<entry>
		<id> /zeptoos/index.php?title=Limitations&amp;diff=619</id>
		<title>Limitations</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Limitations&amp;diff=619"/>
		<updated>2009-05-15T18:44:05Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* I/O Helper thread bug(fails with a DCMF assertion) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==Known Bugs / Current Limitations==&lt;br /&gt;
&lt;br /&gt;
===I/O Helper thread bug(fails with a DCMF assertion)===&lt;br /&gt;
&lt;br /&gt;
Your MPI program might exit abnormally with a DCMF assertion failure caused by a race condition in the I/O helper thread.&lt;br /&gt;
If this happens, you will see an error message like the one below in your .error file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
void DCMF::Queueing::Tree::Device::post(DCMF::Queueing::Tree::TreeSendMessage&amp;amp;):&lt;br /&gt;
Assertion `currentSend() != &amp;amp;smsg' failed.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A '''workaround''' is to disable the I/O helper thread.&lt;br /&gt;
You can disable it by setting the DCMF_ZEPTO_TREE_THREAD environment variable to 0&lt;br /&gt;
when you submit a job. Here is an example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -n 64 -t 20 -k zeptoos -e DCMF_ZEPTO_TREE_THREAD=0 .....&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This problem is triggered by MPI collective primitives such as MPI_Allreduce() and MPI_Bcast()&lt;br /&gt;
when the BGP tree device is used.&lt;br /&gt;
&lt;br /&gt;
===No VN/DUAL mode in MPI===&lt;br /&gt;
&lt;br /&gt;
Blue Gene/P supports three job modes:&lt;br /&gt;
&lt;br /&gt;
* SMP (one application process per node)&lt;br /&gt;
* DUAL (two application processes per node)&lt;br /&gt;
* VN (four application processes per node)&lt;br /&gt;
&lt;br /&gt;
In Cobalt, the job mode can be specified using &amp;lt;tt&amp;gt;cqsub -m&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;qsub --mode&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
ZeptoOS will launch the appropriate number of application processes per node as determined by the mode; however, MPI jobs currently only work in the SMP mode.  We plan to fix this problem in the near future.&lt;br /&gt;
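&lt;br /&gt;
As a rough illustration, an MPI job could request the SMP mode explicitly at submission time.  This is only a sketch: the &amp;lt;tt&amp;gt;smp&amp;lt;/tt&amp;gt; mode name is an assumption about what the local Cobalt installation accepts, the node count and time limit are copied from the example earlier on this page, and the application path is a placeholder.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Sketch only: request one application process per node (SMP) for an MPI job.&lt;br /&gt;
# "smp" as the mode value and the application path are assumptions.&lt;br /&gt;
$ cqsub -n 64 -t 20 -k zeptoos -m smp /path/to/mpi/application&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;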
&lt;br /&gt;
===No Universal Performance Counter (UPC)===&lt;br /&gt;
&lt;br /&gt;
UPC is not available in this release. Thus, PAPI will not work since it depends on UPC.&lt;br /&gt;
We are currently trying to enable the UPC support in our Linux environment.&lt;br /&gt;
&lt;br /&gt;
===MPI-IO support===&lt;br /&gt;
&lt;br /&gt;
Due to the limitations of FUSE (the compute-node infrastructure we use for I/O forwarding of POSIX calls), if using the standard glibc, pathnames passed to MPI-IO routines need to be prefixed with &amp;lt;tt&amp;gt;bglockless:&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;bgl:&amp;lt;/tt&amp;gt; (the latter will not work with PVFS; the former should work with all filesystems).&lt;br /&gt;
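&lt;br /&gt;
For instance, if an application takes its output file name from the command line and hands it unchanged to an MPI-IO routine such as &amp;lt;tt&amp;gt;MPI_File_open()&amp;lt;/tt&amp;gt;, the prefix can be supplied at submission time.  This is a hedged sketch: both paths are placeholders, and whether a given application passes its argument through unmodified is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Sketch only: the application is assumed to pass its first argument to MPI-IO as-is.&lt;br /&gt;
$ cqsub -n 64 -t 20 -k zeptoos /path/to/mpi/application bglockless:/path/to/output/file&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;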
&lt;br /&gt;
This should not be necessary when using the version of glibc [[Other Packages#ZOID glibc|modified for ZOID]].  That version should also give better performance, so please give it a try if the performance with the standard glibc is unsatisfactory.&lt;br /&gt;
&lt;br /&gt;
Also, within the DOE FastOS [http://www.iofsl.org/ I/O forwarding project] we are working on a new, high performance I/O forwarding infrastructure for parallel applications and as this work matures, we will integrate it into ZeptoOS.&lt;br /&gt;
&lt;br /&gt;
===Some MPI jobs hang when they are killed===&lt;br /&gt;
&lt;br /&gt;
We have been seeing this a lot with &amp;lt;tt&amp;gt;cn-ipfwd&amp;lt;/tt&amp;gt;, the [[Other Packages#IP over torus|IP-over-torus]] program.  This program runs &amp;quot;forever&amp;quot;, so it eventually needs to be killed.  When that happens, it will frequently hang one or more compute nodes, preventing the partition from shutting down cleanly.&lt;br /&gt;
&lt;br /&gt;
However, the service node will force a shutdown after a timeout of five minutes, so in practice this is not a significant problem.  Also, we have not seen this problem with ordinary MPI applications (unlike most MPI applications, &amp;lt;tt&amp;gt;cn-ipfwd&amp;lt;/tt&amp;gt; is multithreaded and communicates a lot with the kernel).&lt;br /&gt;
&lt;br /&gt;
===mpirun -nofree does not work===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;mpirun -nofree&amp;lt;/tt&amp;gt; (submitting multiple jobs without rebooting the nodes) does not work in the current release.  Currently, partitions must be rebooted.  We intend to fix it in the next version.&lt;br /&gt;
&lt;br /&gt;
==Features Coming Soon==&lt;br /&gt;
&lt;br /&gt;
===Multiple MPI jobs one after another===&lt;br /&gt;
&lt;br /&gt;
Since ZeptoOS supports submitting a shell script as a compute node &amp;quot;application&amp;quot;, it is possible to run multiple &amp;quot;real&amp;quot; applications from within one job:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
&lt;br /&gt;
for i in 1 2 3 4 5 6 7 8 9 10; do&lt;br /&gt;
    /path/to/real/application&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This does work for sequential applications, but not for those that are linked with MPI; with MPI, an application can only be run once.  However, we have experimental code that lifts this limitation, and we plan to include it in the next release.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Limitations&amp;diff=618</id>
		<title>Limitations</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Limitations&amp;diff=618"/>
		<updated>2009-05-15T18:43:43Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* I/O Helper thread bug(fails with a DCMF assertion) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==Known Bugs / Current Limitations==&lt;br /&gt;
&lt;br /&gt;
===I/O Helper thread bug(fails with a DCMF assertion)===&lt;br /&gt;
&lt;br /&gt;
Your MPI program might exit abnormally with a DCMF assertion failure caused by a race condition in the I/O helper thread.&lt;br /&gt;
If this happens, you will see an error message like the one below in your .error file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
?????????: /gpfs/home/kazutomo/BGP/comm/DCMF/sys/messaging/devices/prod/tree/Device.cc:673:&lt;br /&gt;
void DCMF::Queueing::Tree::Device::post(DCMF::Queueing::Tree::TreeSendMessage&amp;amp;):&lt;br /&gt;
Assertion `currentSend() != &amp;amp;smsg' failed.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A '''workaround''' is to disable the I/O helper thread.&lt;br /&gt;
You can disable it by setting the DCMF_ZEPTO_TREE_THREAD environment variable to 0&lt;br /&gt;
when you submit a job. Here is an example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -n 64 -t 20 -k zeptoos -e DCMF_ZEPTO_TREE_THREAD=0 .....&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This problem is triggered by MPI collective primitives such as MPI_Allreduce() and MPI_Bcast()&lt;br /&gt;
when the BGP tree device is used.&lt;br /&gt;
&lt;br /&gt;
===No VN/DUAL mode in MPI===&lt;br /&gt;
&lt;br /&gt;
Blue Gene/P supports three job modes:&lt;br /&gt;
&lt;br /&gt;
* SMP (one application process per node)&lt;br /&gt;
* DUAL (two application processes per node)&lt;br /&gt;
* VN (four application processes per node)&lt;br /&gt;
&lt;br /&gt;
In Cobalt, the job mode can be specified using &amp;lt;tt&amp;gt;cqsub -m&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;qsub --mode&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
ZeptoOS will launch the appropriate number of application processes per node as determined by the mode; however, MPI jobs currently only work in the SMP mode.  We plan to fix this problem in the near future.&lt;br /&gt;
&lt;br /&gt;
===No Universal Performance Counter (UPC)===&lt;br /&gt;
&lt;br /&gt;
UPC is not available in this release. Thus, PAPI will not work since it depends on UPC.&lt;br /&gt;
We are currently trying to enable the UPC support in our Linux environment.&lt;br /&gt;
&lt;br /&gt;
===MPI-IO support===&lt;br /&gt;
&lt;br /&gt;
Due to the limitations of FUSE (the compute-node infrastructure we use for I/O forwarding of POSIX calls), if using the standard glibc, pathnames passed to MPI-IO routines need to be prefixed with &amp;lt;tt&amp;gt;bglockless:&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;bgl:&amp;lt;/tt&amp;gt; (the latter will not work with PVFS; the former should work with all filesystems).&lt;br /&gt;
&lt;br /&gt;
This should not be necessary when using the version of glibc [[Other Packages#ZOID glibc|modified for ZOID]].  That version should also give better performance, so please give it a try if the performance with the standard glibc is unsatisfactory.&lt;br /&gt;
&lt;br /&gt;
Also, within the DOE FastOS [http://www.iofsl.org/ I/O forwarding project] we are working on a new, high performance I/O forwarding infrastructure for parallel applications and as this work matures, we will integrate it into ZeptoOS.&lt;br /&gt;
&lt;br /&gt;
===Some MPI jobs hang when they are killed===&lt;br /&gt;
&lt;br /&gt;
We have been seeing this a lot with &amp;lt;tt&amp;gt;cn-ipfwd&amp;lt;/tt&amp;gt;, the [[Other Packages#IP over torus|IP-over-torus]] program.  This program runs &amp;quot;forever&amp;quot;, so it eventually needs to be killed.  When that happens, it will frequently hang one or more compute nodes, preventing the partition from shutting down cleanly.&lt;br /&gt;
&lt;br /&gt;
However, the service node will force a shutdown after a timeout of five minutes, so in practice this is not a significant problem.  Also, we have not seen this problem with ordinary MPI applications (unlike most MPI applications, &amp;lt;tt&amp;gt;cn-ipfwd&amp;lt;/tt&amp;gt; is multithreaded and communicates a lot with the kernel).&lt;br /&gt;
&lt;br /&gt;
===mpirun -nofree does not work===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;mpirun -nofree&amp;lt;/tt&amp;gt; (submitting multiple jobs without rebooting the nodes) does not work in the current release.  Currently, partitions must be rebooted.  We intend to fix it in the next version.&lt;br /&gt;
&lt;br /&gt;
==Features Coming Soon==&lt;br /&gt;
&lt;br /&gt;
===Multiple MPI jobs one after another===&lt;br /&gt;
&lt;br /&gt;
Since ZeptoOS supports submitting a shell script as a compute node &amp;quot;application&amp;quot;, it is possible to run multiple &amp;quot;real&amp;quot; applications from within one job:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
&lt;br /&gt;
for i in 1 2 3 4 5 6 7 8 9 10; do&lt;br /&gt;
    /path/to/real/application&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This does work for sequential applications, but not for those that are linked with MPI; with MPI, an application can only be run once.  However, we have experimental code that lifts this limitation, and we plan to include it in the next release.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Limitations&amp;diff=617</id>
		<title>Limitations</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Limitations&amp;diff=617"/>
		<updated>2009-05-15T18:43:29Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* I/O Helper thread bug(fails with a DCMF assertion) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==Known Bugs / Current Limitations==&lt;br /&gt;
&lt;br /&gt;
===I/O Helper thread bug(fails with a DCMF assertion)===&lt;br /&gt;
&lt;br /&gt;
Your MPI program might exit abnormally with a DCMF assertion failure caused by a race condition in the I/O helper thread.&lt;br /&gt;
If this happens, you will see an error message like the one below in your .error file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
?????????: /gpfs/home/kazutomo/BGP/comm/DCMF/sys/messaging/devices/prod/tree/Device.cc:673: \&lt;br /&gt;
void DCMF::Queueing::Tree::Device::post(DCMF::Queueing::Tree::TreeSendMessage&amp;amp;): \&lt;br /&gt;
Assertion `currentSend() != &amp;amp;smsg' failed.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A '''workaround''' is to disable the I/O helper thread.&lt;br /&gt;
You can disable it by setting the DCMF_ZEPTO_TREE_THREAD environment variable to 0&lt;br /&gt;
when you submit a job. Here is an example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -n 64 -t 20 -k zeptoos -e DCMF_ZEPTO_TREE_THREAD=0 .....&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This problem is triggered by MPI collective primitives such as MPI_Allreduce() and MPI_Bcast()&lt;br /&gt;
when the BGP tree device is used.&lt;br /&gt;
&lt;br /&gt;
===No VN/DUAL mode in MPI===&lt;br /&gt;
&lt;br /&gt;
Blue Gene/P supports three job modes:&lt;br /&gt;
&lt;br /&gt;
* SMP (one application process per node)&lt;br /&gt;
* DUAL (two application processes per node)&lt;br /&gt;
* VN (four application processes per node)&lt;br /&gt;
&lt;br /&gt;
In Cobalt, the job mode can be specified using &amp;lt;tt&amp;gt;cqsub -m&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;qsub --mode&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
ZeptoOS will launch the appropriate number of application processes per node as determined by the mode; however, MPI jobs currently only work in the SMP mode.  We plan to fix this problem in the near future.&lt;br /&gt;
&lt;br /&gt;
===No Universal Performance Counter (UPC)===&lt;br /&gt;
&lt;br /&gt;
UPC is not available in this release. Thus, PAPI will not work since it depends on UPC.&lt;br /&gt;
We are currently trying to enable the UPC support in our Linux environment.&lt;br /&gt;
&lt;br /&gt;
===MPI-IO support===&lt;br /&gt;
&lt;br /&gt;
Due to the limitations of FUSE (the compute-node infrastructure we use for I/O forwarding of POSIX calls), if using the standard glibc, pathnames passed to MPI-IO routines need to be prefixed with &amp;lt;tt&amp;gt;bglockless:&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;bgl:&amp;lt;/tt&amp;gt; (the latter will not work with PVFS; the former should work with all filesystems).&lt;br /&gt;
&lt;br /&gt;
This should not be necessary when using the version of glibc [[Other Packages#ZOID glibc|modified for ZOID]].  That version should also give better performance, so please give it a try if the performance with the standard glibc is unsatisfactory.&lt;br /&gt;
&lt;br /&gt;
Also, within the DOE FastOS [http://www.iofsl.org/ I/O forwarding project] we are working on a new, high performance I/O forwarding infrastructure for parallel applications and as this work matures, we will integrate it into ZeptoOS.&lt;br /&gt;
&lt;br /&gt;
===Some MPI jobs hang when they are killed===&lt;br /&gt;
&lt;br /&gt;
We have been seeing this a lot with &amp;lt;tt&amp;gt;cn-ipfwd&amp;lt;/tt&amp;gt;, the [[Other Packages#IP over torus|IP-over-torus]] program.  This program runs &amp;quot;forever&amp;quot;, so it eventually needs to be killed.  When that happens, it will frequently hang one or more compute nodes, preventing the partition from shutting down cleanly.&lt;br /&gt;
&lt;br /&gt;
However, the service node will force a shutdown after a timeout of five minutes, so in practice this is not a significant problem.  Also, we have not seen this problem with ordinary MPI applications (unlike most MPI applications, &amp;lt;tt&amp;gt;cn-ipfwd&amp;lt;/tt&amp;gt; is multithreaded and communicates a lot with the kernel).&lt;br /&gt;
&lt;br /&gt;
===mpirun -nofree does not work===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;mpirun -nofree&amp;lt;/tt&amp;gt; (submitting multiple jobs without rebooting the nodes) does not work in the current release.  Currently, partitions must be rebooted.  We intend to fix it in the next version.&lt;br /&gt;
&lt;br /&gt;
==Features Coming Soon==&lt;br /&gt;
&lt;br /&gt;
===Multiple MPI jobs one after another===&lt;br /&gt;
&lt;br /&gt;
Since ZeptoOS supports submitting a shell script as a compute node &amp;quot;application&amp;quot;, it is possible to run multiple &amp;quot;real&amp;quot; applications from within one job:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
&lt;br /&gt;
for i in 1 2 3 4 5 6 7 8 9 10; do&lt;br /&gt;
    /path/to/real/application&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This does work for sequential applications, but not for those that are linked with MPI; with MPI, an application can only be run once.  However, we have experimental code that lifts this limitation, and we plan to include it in the next release.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Limitations&amp;diff=616</id>
		<title>Limitations</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Limitations&amp;diff=616"/>
		<updated>2009-05-15T18:42:56Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* I/O Helper thread bug(fails with a DCMF assertion) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==Known Bugs / Current Limitations==&lt;br /&gt;
&lt;br /&gt;
===I/O Helper thread bug(fails with a DCMF assertion)===&lt;br /&gt;
&lt;br /&gt;
Your MPI program might exit abnormally with a DCMF assertion failure caused by a race condition in the I/O helper thread.&lt;br /&gt;
If this happens, you will see an error message like the one below in your .error file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
?????????: /gpfs/home/kazutomo/BGP/comm/DCMF/sys/messaging/devices/prod/tree/Device.cc:673: void DCMF::Queueing::Tree::Device::post(DCMF::Queueing::Tree::TreeSendMessage&amp;amp;): Assertion `currentSend() != &amp;amp;smsg' failed.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A '''workaround''' is to disable the I/O helper thread.&lt;br /&gt;
You can disable it by setting the DCMF_ZEPTO_TREE_THREAD environment variable to 0&lt;br /&gt;
when you submit a job. Here is an example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -n 64 -t 20 -k zeptoos -e DCMF_ZEPTO_TREE_THREAD=0 .....&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This problem is triggered by MPI collective primitives such as MPI_Allreduce() and MPI_Bcast()&lt;br /&gt;
when the BGP tree device is used.&lt;br /&gt;
&lt;br /&gt;
===No VN/DUAL mode in MPI===&lt;br /&gt;
&lt;br /&gt;
Blue Gene/P supports three job modes:&lt;br /&gt;
&lt;br /&gt;
* SMP (one application process per node)&lt;br /&gt;
* DUAL (two application processes per node)&lt;br /&gt;
* VN (four application processes per node)&lt;br /&gt;
&lt;br /&gt;
In Cobalt, the job mode can be specified using &amp;lt;tt&amp;gt;cqsub -m&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;qsub --mode&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
ZeptoOS will launch the appropriate number of application processes per node as determined by the mode; however, MPI jobs currently only work in the SMP mode.  We plan to fix this problem in the near future.&lt;br /&gt;
&lt;br /&gt;
===No Universal Performance Counter (UPC)===&lt;br /&gt;
&lt;br /&gt;
UPC is not available in this release. Thus, PAPI will not work since it depends on UPC.&lt;br /&gt;
We are currently trying to enable the UPC support in our Linux environment.&lt;br /&gt;
&lt;br /&gt;
===MPI-IO support===&lt;br /&gt;
&lt;br /&gt;
Due to the limitations of FUSE (the compute-node infrastructure we use for I/O forwarding of POSIX calls), if using the standard glibc, pathnames passed to MPI-IO routines need to be prefixed with &amp;lt;tt&amp;gt;bglockless:&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;bgl:&amp;lt;/tt&amp;gt; (the latter will not work with PVFS; the former should work with all filesystems).&lt;br /&gt;
&lt;br /&gt;
This should not be necessary when using the version of glibc [[Other Packages#ZOID glibc|modified for ZOID]].  That version should also give better performance, so please give it a try if the performance with the standard glibc is unsatisfactory.&lt;br /&gt;
&lt;br /&gt;
Also, within the DOE FastOS [http://www.iofsl.org/ I/O forwarding project] we are working on a new, high performance I/O forwarding infrastructure for parallel applications and as this work matures, we will integrate it into ZeptoOS.&lt;br /&gt;
&lt;br /&gt;
===Some MPI jobs hang when they are killed===&lt;br /&gt;
&lt;br /&gt;
We have been seeing this a lot with &amp;lt;tt&amp;gt;cn-ipfwd&amp;lt;/tt&amp;gt;, the [[Other Packages#IP over torus|IP-over-torus]] program.  This program runs &amp;quot;forever&amp;quot;, so it eventually needs to be killed.  When that happens, it will frequently hang one or more compute nodes, preventing the partition from shutting down cleanly.&lt;br /&gt;
&lt;br /&gt;
However, the service node will force a shutdown after a timeout of five minutes, so in practice this is not a significant problem.  Also, we have not seen this problem with ordinary MPI applications (unlike most MPI applications, &amp;lt;tt&amp;gt;cn-ipfwd&amp;lt;/tt&amp;gt; is multithreaded and communicates a lot with the kernel).&lt;br /&gt;
&lt;br /&gt;
===mpirun -nofree does not work===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;mpirun -nofree&amp;lt;/tt&amp;gt; (submitting multiple jobs without rebooting the nodes) does not work in the current release.  Currently, partitions must be rebooted.  We intend to fix it in the next version.&lt;br /&gt;
&lt;br /&gt;
==Features Coming Soon==&lt;br /&gt;
&lt;br /&gt;
===Multiple MPI jobs one after another===&lt;br /&gt;
&lt;br /&gt;
Since ZeptoOS supports submitting a shell script as a compute node &amp;quot;application&amp;quot;, it is possible to run multiple &amp;quot;real&amp;quot; applications from within one job:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
&lt;br /&gt;
for i in 1 2 3 4 5 6 7 8 9 10; do&lt;br /&gt;
    /path/to/real/application&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This does work for sequential applications, but not for those that are linked with MPI; with MPI, an application can only be run once.  However, we have experimental code that lifts this limitation, and we plan to include it in the next release.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Limitations&amp;diff=615</id>
		<title>Limitations</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Limitations&amp;diff=615"/>
		<updated>2009-05-15T18:41:23Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Known Bugs / Current Limitations */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==Known Bugs / Current Limitations==&lt;br /&gt;
&lt;br /&gt;
===I/O Helper thread bug(fails with a DCMF assertion)===&lt;br /&gt;
&lt;br /&gt;
Your MPI program might exit abnormally with a DCMF assertion failure caused by a race condition in the I/O helper thread.&lt;br /&gt;
If this happens, you will see an error message like the one below in your .error file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
............  Assertion `currentSend() != &amp;amp;smsg' failed.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A '''workaround''' is to disable the I/O helper thread.&lt;br /&gt;
You can disable it by setting the DCMF_ZEPTO_TREE_THREAD environment variable to 0&lt;br /&gt;
when you submit a job. Here is an example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -n 64 -t 20 -k zeptoos -e DCMF_ZEPTO_TREE_THREAD=0 .....&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This problem is triggered by MPI collective primitives such as MPI_Allreduce() and MPI_Bcast()&lt;br /&gt;
when the BGP tree device is used.&lt;br /&gt;
&lt;br /&gt;
===No VN/DUAL mode in MPI===&lt;br /&gt;
&lt;br /&gt;
Blue Gene/P supports three job modes:&lt;br /&gt;
&lt;br /&gt;
* SMP (one application process per node)&lt;br /&gt;
* DUAL (two application processes per node)&lt;br /&gt;
* VN (four application processes per node)&lt;br /&gt;
&lt;br /&gt;
In Cobalt, the job mode can be specified using &amp;lt;tt&amp;gt;cqsub -m&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;qsub --mode&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
ZeptoOS will launch the appropriate number of application processes per node as determined by the mode; however, MPI jobs currently only work in the SMP mode.  We plan to fix this problem in the near future.&lt;br /&gt;
&lt;br /&gt;
===No Universal Performance Counter (UPC)===&lt;br /&gt;
&lt;br /&gt;
UPC is not available in this release. Thus, PAPI will not work since it depends on UPC.&lt;br /&gt;
We are currently trying to enable the UPC support in our Linux environment.&lt;br /&gt;
&lt;br /&gt;
===MPI-IO support===&lt;br /&gt;
&lt;br /&gt;
Due to the limitations of FUSE (the compute-node infrastructure we use for I/O forwarding of POSIX calls), if using the standard glibc, pathnames passed to MPI-IO routines need to be prefixed with &amp;lt;tt&amp;gt;bglockless:&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;bgl:&amp;lt;/tt&amp;gt; (the latter will not work with PVFS; the former should work with all filesystems).&lt;br /&gt;
&lt;br /&gt;
This should not be necessary when using the version of glibc [[Other Packages#ZOID glibc|modified for ZOID]].  That version should also give better performance, so please give it a try if the performance with the standard glibc is unsatisfactory.&lt;br /&gt;
&lt;br /&gt;
Also, within the DOE FastOS [http://www.iofsl.org/ I/O forwarding project] we are working on a new, high performance I/O forwarding infrastructure for parallel applications and as this work matures, we will integrate it into ZeptoOS.&lt;br /&gt;
&lt;br /&gt;
===Some MPI jobs hang when they are killed===&lt;br /&gt;
&lt;br /&gt;
We have been seeing this a lot with &amp;lt;tt&amp;gt;cn-ipfwd&amp;lt;/tt&amp;gt;, the [[Other Packages#IP over torus|IP-over-torus]] program.  This program runs &amp;quot;forever&amp;quot;, so it eventually needs to be killed.  When that happens, it will frequently hang one or more compute nodes, preventing the partition from shutting down cleanly.&lt;br /&gt;
&lt;br /&gt;
However, the service node will force a shutdown after a timeout of five minutes, so in practice this is not a significant problem.  Also, we have not seen this problem with ordinary MPI applications (unlike most MPI applications, &amp;lt;tt&amp;gt;cn-ipfwd&amp;lt;/tt&amp;gt; is multithreaded and communicates a lot with the kernel).&lt;br /&gt;
&lt;br /&gt;
===mpirun -nofree does not work===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;mpirun -nofree&amp;lt;/tt&amp;gt; (submitting multiple jobs without rebooting the nodes) does not work in the current release.  Currently, partitions must be rebooted.  We intend to fix it in the next version.&lt;br /&gt;
&lt;br /&gt;
==Features Coming Soon==&lt;br /&gt;
&lt;br /&gt;
===Multiple MPI jobs one after another===&lt;br /&gt;
&lt;br /&gt;
Since ZeptoOS supports submitting a shell script as a compute node &amp;quot;application&amp;quot;, it is possible to run multiple &amp;quot;real&amp;quot; applications from within one job:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
&lt;br /&gt;
for i in 1 2 3 4 5 6 7 8 9 10; do&lt;br /&gt;
    /path/to/real/application&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This does work for sequential applications, but not for those that are linked with MPI; with MPI, an application can only be run once.  However, we have experimental code that lifts this limitation, and we plan to include it in the next release.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Configuration&amp;diff=596</id>
		<title>Configuration</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Configuration&amp;diff=596"/>
		<updated>2009-05-08T16:58:39Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Building */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Introduction]] | [[ZeptoOS_Documentation|Top]] | [[Installation]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
== Downloading ==&lt;br /&gt;
&lt;br /&gt;
* Log on to one of the front-end nodes of the Blue Gene (a login node or a service node).&lt;br /&gt;
&lt;br /&gt;
* Download the ZeptoOS tarball from the ZeptoOS [http://press.mcs.anl.gov/zeptoos/download download page].&lt;br /&gt;
&lt;br /&gt;
* Extract the sources from the package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ tar xjf ZeptoOS-&amp;lt;version&amp;gt;.tar.bz2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Configuring ==&lt;br /&gt;
&lt;br /&gt;
Change to the top-level &amp;lt;tt&amp;gt;ZeptoOS-&amp;lt;version&amp;gt;&amp;lt;/tt&amp;gt; directory:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd ZeptoOS-&amp;lt;version&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A &amp;lt;tt&amp;gt;configure&amp;lt;/tt&amp;gt; script is provided to set the pathnames to various system directories:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./configure&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If invoked without any arguments, it will use the defaults, which should be appropriate if ZeptoOS is configured on a system with a supported BG/P driver version.  The pathnames can be changed with the help of a textual user interface by invoking the script as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./configure --edit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will display the following menu:&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure1.png|border|Main menu]]&lt;br /&gt;
&lt;br /&gt;
Please select the top item (&amp;lt;tt&amp;gt;BG/P DIST_DIR&amp;lt;/tt&amp;gt;).  The screen will change to:&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure2.png|border|DIST_DIR menu]]&lt;br /&gt;
&lt;br /&gt;
The following options are available:&lt;br /&gt;
&lt;br /&gt;
; DRV_DIR&lt;br /&gt;
: The directory with the BG/P driver tree.  The default (&amp;lt;tt&amp;gt;/bgsys/drivers/ppcfloor/&amp;lt;/tt&amp;gt;) is a link pointing to the currently active driver.&lt;br /&gt;
; BGP_CROSS&lt;br /&gt;
: A prefix to the pathnames of the GNU cross-compilers used to build the compute node and I/O node software.&lt;br /&gt;
; BGCNS_H_PATH and BGCNS_H&lt;br /&gt;
: The location of a file needed to rebuild the kernel (these options are temporary and will be removed in the next version).&lt;br /&gt;
; OS_DIR&lt;br /&gt;
: The directory with the supplementary I/O node software used when booting the I/O nodes.  It needs to be set to match the BG/P driver version being used.&lt;br /&gt;
&lt;br /&gt;
The second top-level menu (&amp;lt;tt&amp;gt;Debugging&amp;lt;/tt&amp;gt;) has only one option:&lt;br /&gt;
&lt;br /&gt;
; ADD_DEBUG_TOOLS&lt;br /&gt;
: Check this option to include &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;strace&amp;lt;/tt&amp;gt; in the compute node ramdisk.  They are not included by default because of their size.&lt;br /&gt;
&lt;br /&gt;
The third top-level menu (&lt;tt&gt;Kernel Profiling&lt;/tt&gt;) is discussed in the [[(K)TAU#Configure ZeptoOS to point to KTAU patch and path|(K)TAU section]].&lt;br /&gt;
&lt;br /&gt;
Select &amp;lt;tt&amp;gt;Exit&amp;lt;/tt&amp;gt; (multiple times if needed) and confirm if you want to save any changes made.&lt;br /&gt;
&lt;br /&gt;
== Building ==&lt;br /&gt;
&lt;br /&gt;
To start using the pre-built binaries simply type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the first invocation, this will ask for a root password to use on I/O nodes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Create root password for I/O Node&lt;br /&gt;
   Leave the password field empty if you want to disable root login&lt;br /&gt;
   New password:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Security note: root-level access to I/O nodes should only be given to trusted individuals.  A root user can access and modify files of all users in the system.'''&lt;br /&gt;
&lt;br /&gt;
Once the password has been entered and confirmed, &amp;lt;tt&amp;gt;make&amp;lt;/tt&amp;gt; will use pre-built kernel images, and will build the ramdisks from pre-built tools and utilities.  The following generated files will be placed in the top-level directory:&lt;br /&gt;
&lt;br /&gt;
; BGP-CN-zImage-with-initrd.elf&lt;br /&gt;
: ZeptoOS compute node Linux with embedded compute node ramdisk.&lt;br /&gt;
; BGP-ION-zImage.elf&lt;br /&gt;
: ZeptoOS I/O node kernel.&lt;br /&gt;
; BGP-ION-ramdisk-for-CNL.elf&lt;br /&gt;
: ZeptoOS I/O node ramdisk for use with the ZeptoOS compute node Linux.&lt;br /&gt;
; BGP-ION-ramdisk-for-CNK.elf&lt;br /&gt;
: ZeptoOS I/O node ramdisk for use with the IBM CNK (optional).&lt;br /&gt;
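&lt;br /&gt;
As a quick sanity check after &lt;tt&gt;make&lt;/tt&gt; finishes, a small shell function along the following lines can confirm that the files listed above are present.  This is only a sketch: &lt;tt&gt;check_zepto_images&lt;/tt&gt; is a hypothetical helper, not something shipped with ZeptoOS, and it treats the CNK ramdisk as optional, as noted above.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper (not part of ZeptoOS): verify that the images
# generated by "make" exist in the given directory (default: current).
check_zepto_images() {
    dir=${1:-.}
    rc=0
    for f in BGP-CN-zImage-with-initrd.elf BGP-ION-zImage.elf \
             BGP-ION-ramdisk-for-CNL.elf
    do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            rc=1
        fi
    done
    # The CNK ramdisk is optional, so its absence is reported but not an error.
    if [ ! -f "$dir/BGP-ION-ramdisk-for-CNK.elf" ]; then
        echo "optional image not present: BGP-ION-ramdisk-for-CNK.elf"
    fi
    return $rc
}
```
&lt;br /&gt;
Calling &lt;tt&gt;check_zepto_images .&lt;/tt&gt; from the top-level directory returns a non-zero status if any required image is missing.&lt;br /&gt;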
&lt;br /&gt;
It is possible to rebuild individual ZeptoOS components using one of the following &amp;lt;tt&amp;gt;make&amp;lt;/tt&amp;gt; targets (the list is also available by typing &amp;lt;tt&amp;gt;make help&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;make menu&amp;lt;/tt&amp;gt;):&lt;br /&gt;
&lt;br /&gt;
; bgp-cn-linux&lt;br /&gt;
: Rebuilds the compute node ramdisk and embeds it into a compute node kernel image.&lt;br /&gt;
; bgp-ion-ramdisk-cnl&lt;br /&gt;
: Rebuilds the I/O node ramdisk for the ZeptoOS compute node Linux.&lt;br /&gt;
; bgp-ion-ramdisk-cnk&lt;br /&gt;
: Rebuilds the I/O node ramdisk for the IBM CNK.&lt;br /&gt;
; bgp-ion-linux-build&lt;br /&gt;
: Rebuilds the I/O node kernel.&lt;br /&gt;
; bgp-cn-linux-build&lt;br /&gt;
: Rebuilds the compute node kernel and ramdisk and embeds the ramdisk into the kernel.&lt;br /&gt;
; bgp-all-pkg-rebuild&lt;br /&gt;
: Rebuilds all packages from sources.&lt;br /&gt;
; bgp-libs-build&lt;br /&gt;
: Rebuilds SPI, DCMF and MPICH from sources.&lt;br /&gt;
(the following &amp;lt;tt&amp;gt;make&amp;lt;/tt&amp;gt; targets are mostly for internal use)&lt;br /&gt;
; bgp-ion-linux&lt;br /&gt;
: Copies a recently rebuilt I/O node kernel if one is available; otherwise, uses a prebuilt binary (will not rebuild the kernel).&lt;br /&gt;
; bgp-all-pkg-smart&lt;br /&gt;
: Copies recently rebuilt packages if available; otherwise, uses prebuilt binaries (used when preparing to rebuild ramdisks).&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[Introduction]] | [[ZeptoOS_Documentation|Top]] | [[Installation]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Other_Packages&amp;diff=591</id>
		<title>Other Packages</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Other_Packages&amp;diff=591"/>
		<updated>2009-05-08T15:09:34Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* PVFS */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==PVFS==&lt;br /&gt;
&lt;br /&gt;
[http://www.pvfs.org/ PVFS] (Parallel Virtual File System) is designed to scale to petabytes &lt;br /&gt;
of storage and provide access rates of hundreds of GB/s.  On the Argonne BG/P system, PVFS servers &lt;br /&gt;
are running and a PVFS start-up script is installed in the BG/P site-specific directory (&lt;tt&gt;/bgp/iofs&lt;/tt&gt;).&lt;br /&gt;
A PVFS volume is mounted at ION boot time. &lt;br /&gt;
&lt;br /&gt;
We include the PVFS version 2.8.1 source code and prebuilt client binaries &lt;br /&gt;
in the ZeptoOS release for sites that are interested in PVFS. &lt;br /&gt;
We also include a very simple pvfs2 start-up script as an example that you can add to your ION ramdisk. &lt;br /&gt;
If you have a pvfs2 server running on your system, &lt;br /&gt;
you can follow the steps below to add all the pvfs2 client components to the ramdisk.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd packages/pvfs2/prebuilt&lt;br /&gt;
$ sh add-pvfs2-client-ION-ramdisk.sh  tcp://192.168.1.1:3334/pvfs2-fs  &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please replace tcp://192.168.1.1:3334/pvfs2-fs with your actual server info.&lt;br /&gt;
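&lt;br /&gt;
The argument has the shape &lt;tt&gt;tcp://HOST:PORT/FSNAME&lt;/tt&gt;.  As a minimal sketch, assuming exactly that shape, the pieces can be pulled apart with plain shell parameter expansion to double-check them before running the script; &lt;tt&gt;parse_pvfs2_url&lt;/tt&gt; is a hypothetical helper and not part of the ZeptoOS or PVFS distributions.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: split a PVFS2 server specification of the assumed
# form tcp://HOST:PORT/FSNAME into its three parts.
parse_pvfs2_url() {
    url=$1
    rest=${url#tcp://}        # HOST:PORT/FSNAME
    hostport=${rest%%/*}      # HOST:PORT
    echo "host=${hostport%%:*} port=${hostport##*:} fs=${rest#*/}"
}

parse_pvfs2_url tcp://192.168.1.1:3334/pvfs2-fs
# prints: host=192.168.1.1 port=3334 fs=pvfs2-fs
```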
&lt;br /&gt;
Details on building and running the pvfs2 server are outside the scope of this document, but the following example &lt;br /&gt;
should give you a basic idea of how to build and run it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Build]&lt;br /&gt;
$ cd pvfs-2.8.1&lt;br /&gt;
$ ./configure  [options....]&lt;br /&gt;
$ make&lt;br /&gt;
&lt;br /&gt;
[Create a server config file]&lt;br /&gt;
$ ./src/apps/admin/pvfs2-genconfig fs.conf&lt;br /&gt;
&lt;br /&gt;
[Start the server]&lt;br /&gt;
$ ./src/server/pvfs2-server -f fs.conf  -a  ALIAS   &lt;br /&gt;
$ ./src/server/pvfs2-server    fs.conf  -a  ALIAS&lt;br /&gt;
&lt;br /&gt;
NOTE:&lt;br /&gt;
- replace ALIAS with your real alias in fs.conf&lt;br /&gt;
- the first pvfs2-server invocation (with -f) only initializes the pvfs2 volume &lt;br /&gt;
- the second invocation actually starts the server&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==IP over torus==&lt;br /&gt;
&lt;br /&gt;
This is currently a preview feature.  It implements IP packet forwarding on top of MPI, over the torus network.  Torus is a point-to-point network that interconnects all the compute nodes in a partition.  Every compute node gets a unique IP address, of the form:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
10.128.0.0 | &amp;lt;rank&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;tt&amp;gt;&amp;lt;rank&amp;gt;&amp;lt;/tt&amp;gt; is the [[FAQ#MPI rank|MPI rank]].  Thus, for a 64-node partition, the IP addresses will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.0.63&amp;lt;/tt&amp;gt;, and for a 1024-node partition, they will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.3.255&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
To try this feature out, submit as a compute job the &amp;lt;tt&amp;gt;cn-ipfwd.sh&amp;lt;/tt&amp;gt; script, which should have been installed in &amp;lt;tt&amp;gt;/path/to/install/cnbin/&amp;lt;/tt&amp;gt;.  The script can act as a standalone job or as a wrapper.  If invoked without any arguments, it initializes the IP forwarding and then goes to sleep; if any arguments have been passed, they are interpreted as the name of the binary (along with its command line arguments) to invoke once the IP forwarding is initialized, e.g. (an example with Cobalt):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n 64 /path/to/install/cnbin/ipfwd.sh \&lt;br /&gt;
&amp;lt;name of another binary&amp;gt; &amp;lt;arguments to that binary&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The script can be copied to another location and adjusted to one's needs.&lt;br /&gt;
&lt;br /&gt;
Once the job is running, log into a compute node, and run &amp;lt;tt&amp;gt;ifconfig&amp;lt;/tt&amp;gt;; there should be a new virtual network device &amp;lt;tt&amp;gt;tun1&amp;lt;/tt&amp;gt; (in addition to the usual &amp;lt;tt&amp;gt;tun0&amp;lt;/tt&amp;gt;, used for IP forwarding between compute nodes and I/O nodes):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
~ # ifconfig tun1&lt;br /&gt;
tun1      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  &lt;br /&gt;
          inet addr:10.128.0.0  P-t-P:10.128.0.0  Mask:255.255.255.255&lt;br /&gt;
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:65535  Metric:1&lt;br /&gt;
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0&lt;br /&gt;
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0&lt;br /&gt;
          collisions:0 txqueuelen:500 &lt;br /&gt;
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)&lt;br /&gt;
~ # ping 10.128.0.1&lt;br /&gt;
PING 10.128.0.1 (10.128.0.1): 56 data bytes&lt;br /&gt;
64 bytes from 10.128.0.1: seq=0 ttl=64 time=0.321 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=1 ttl=64 time=0.191 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=2 ttl=64 time=0.203 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=3 ttl=64 time=0.194 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=4 ttl=64 time=0.207 ms&lt;br /&gt;
--- 10.128.0.1 ping statistics ---&lt;br /&gt;
5 packets transmitted, 5 packets received, 0% packet loss&lt;br /&gt;
round-trip min/avg/max = 0.191/0.223/0.321 ms&lt;br /&gt;
~ # rsh 10.128.0.1 'grep BG_RANK_IN_PSET /proc/personality.sh'&lt;br /&gt;
BG_RANK_IN_PSET=59&lt;br /&gt;
~ # &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This feature can be used to implement an arbitrary IP-based network protocol between the compute nodes.  We have even experimented running a TCP/IP-based MPICH on top of it (which, while obviously not as fast as the native Blue Gene one, has the advantage of being able to, e.g., run multiple MPI jobs at a time on a single partition).&lt;br /&gt;
&lt;br /&gt;
One major disadvantage of this feature is that the current implementation is computationally intensive; it permanently occupies one core on each node.&lt;br /&gt;
&lt;br /&gt;
==ZOID glibc==&lt;br /&gt;
&lt;br /&gt;
This is another preview feature.  It provides a modified version of GNU libc for the compute nodes, which features much better file I/O throughput rates to the I/O nodes and remote file systems than the default one.  It does so by communicating with the ZOID daemon directly, instead of going through the Linux kernel and the FUSE client (which, while convenient, is slow).&lt;br /&gt;
&lt;br /&gt;
The modified glibc is meant for compiled application processes, not for shell scripts and such.  It is currently only available in a static (&lt;tt&gt;.a&lt;/tt&gt;) version.  It is installed with the rest of ZeptoOS, in &lt;tt&gt;/path/to/install/lib/zoid/&lt;/tt&gt;.  To link with it, simply add &lt;tt&gt;-L/path/to/install/lib/zoid&lt;/tt&gt; to the final linking stage.  Use the following command to verify that the modified version of glibc has been used for linking:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ nm &amp;lt;binary&amp;gt; | grep __zoid_init&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(no output will be generated if the standard glibc was used)&lt;br /&gt;
&lt;br /&gt;
When submitting a job linked with this glibc, please set the environment variable &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt; to a list of &amp;lt;tt&amp;gt;:&amp;lt;/tt&amp;gt;-separated pathname prefixes.  Only files opened using pathnames beginning with those prefixes will be directly forwarded to the I/O node; other files will be handled via the compute node kernel and possibly FUSE, which is much slower.&lt;br /&gt;
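&lt;br /&gt;
To see ahead of time whether a given pathname would fall under &lt;tt&gt;ZOID_DIRS&lt;/tt&gt;, the rule above can be mimicked in shell.  This sketch assumes a plain string-prefix comparison, which is only what the description above states; the matching actually performed inside the modified glibc may differ in details, and &lt;tt&gt;zoid_would_forward&lt;/tt&gt; is a hypothetical name.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: succeed if the pathname begins with one of the
# ':'-separated prefixes (second argument, defaulting to $ZOID_DIRS).
zoid_would_forward() {
    path=$1
    dirs=${2-$ZOID_DIRS}
    old_ifs=$IFS
    IFS=:
    for prefix in $dirs; do
        # Skip empty entries such as those produced by "a::b".
        [ -n "$prefix" ] || continue
        case $path in
            "$prefix"*) IFS=$old_ifs; return 0 ;;
        esac
    done
    IFS=$old_ifs
    return 1
}
```
&lt;br /&gt;
For example, &lt;tt&gt;zoid_would_forward "$HOME/speed_zoid-out" "$HOME"&lt;/tt&gt; succeeds, while a relative pathname such as &lt;tt&gt;speed_zoid-out&lt;/tt&gt; does not match an absolute prefix.&lt;br /&gt;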
&lt;br /&gt;
Here is a simple benchmark:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#include &amp;lt;fcntl.h&amp;gt;&lt;br /&gt;
#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;
#include &amp;lt;sys/time.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#define BUFSIZE (1024 * 1024 * 1024) /* 1 GiB */&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char* argv[])&lt;br /&gt;
{&lt;br /&gt;
    char* buffer;&lt;br /&gt;
    int fd;&lt;br /&gt;
    struct timeval start, stop;&lt;br /&gt;
    double time;&lt;br /&gt;
&lt;br /&gt;
    if (argc != 2)&lt;br /&gt;
    {&lt;br /&gt;
	fprintf(stderr, &amp;quot;Usage: %s &amp;lt;pathname&amp;gt;\n&amp;quot;, argv[0]);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
&lt;br /&gt;
    if (!(buffer = malloc(BUFSIZE)))&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;malloc&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    if ((fd = open(argv[1], O_CREAT | O_WRONLY, 0666)) == -1)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;open&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;start, NULL);&lt;br /&gt;
    if (write(fd, buffer, BUFSIZE) != BUFSIZE)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;write&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;stop, NULL);&lt;br /&gt;
    close(fd);&lt;br /&gt;
    free(buffer);&lt;br /&gt;
&lt;br /&gt;
    time = stop.tv_sec - start.tv_sec + (stop.tv_usec - start.tv_usec) * 1e-6;&lt;br /&gt;
    printf(&amp;quot;Writing %d B took %g s, %g B/s\n&amp;quot;, BUFSIZE, time, BUFSIZE / time);&lt;br /&gt;
&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It writes 1&amp;amp;nbsp;GB of data to a file passed on the command line.  With Cobalt, we submit it as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t 10 -n 1 -e ZOID_DIRS=$HOME $PWD/speed_zoid $HOME/speed_zoid-out&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With our home directories on a GPFS filesystem, we get the following performance:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 4.58026 s, 2.34428e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the other hand, if we link it with the standard glibc, or if we forget to set &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt;, the performance we observe is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 10.4905 s, 1.02354e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The modified glibc is not used by default, because it is not yet complete.  However, if one does not try to outsmart it (in particular, we recommend always passing absolute pathnames), it should work reliably.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Other_Packages&amp;diff=590</id>
		<title>Other Packages</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Other_Packages&amp;diff=590"/>
		<updated>2009-05-08T15:05:38Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* PVFS */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==PVFS==&lt;br /&gt;
&lt;br /&gt;
[http://www.pvfs.org/ PVFS] (Parallel Virtual File System) is designed to scale to petabytes &lt;br /&gt;
of storage and provide access rates of hundreds of GB/s.  On the Argonne BG/P system, PVFS servers &lt;br /&gt;
are running and a PVFS start-up script is installed in the BG/P site-specific directory (&lt;tt&gt;/bgp/iofs&lt;/tt&gt;).&lt;br /&gt;
A PVFS volume is mounted at ION boot time. &lt;br /&gt;
&lt;br /&gt;
We include the PVFS version 2.8.1 source code and prebuilt client binaries &lt;br /&gt;
in the ZeptoOS release for sites that are interested in PVFS. &lt;br /&gt;
We also include a very simple pvfs2 start-up script as an example that you can add to your ION ramdisk. &lt;br /&gt;
If you have a pvfs2 server running on your system, &lt;br /&gt;
you can follow the steps below to add all the pvfs2 client components to the ramdisk.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd packages/pvfs2/prebuilt&lt;br /&gt;
$ sh add-pvfs2-client-ION-ramdisk.sh  tcp://192.168.1.1:3334/pvfs2-fs  &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please replace tcp://192.168.1.1:3334/pvfs2-fs with your actual server info.&lt;br /&gt;
&lt;br /&gt;
Details on building and running the pvfs2 server are outside the scope of this document, but the following example &lt;br /&gt;
should give you a basic idea of how to build and run it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd pvfs-2.8.1&lt;br /&gt;
$ ./configure           # may need some configure options&lt;br /&gt;
$ make&lt;br /&gt;
$ ./src/apps/admin/pvfs2-genconfig fs.conf&lt;br /&gt;
( you will be asked some basic information )&lt;br /&gt;
$ ./src/server/pvfs2-server -f fs.conf  -a  ALIAS   # replace ALIAS with your real alias in fs.conf&lt;br /&gt;
$ ./src/server/pvfs2-server fs.conf  -a  ALIAS&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==IP over torus==&lt;br /&gt;
&lt;br /&gt;
This is currently a preview feature.  It implements IP packet forwarding on top of MPI, over the torus network.  Torus is a point-to-point network that interconnects all the compute nodes in a partition.  Every compute node gets a unique IP address, of the form:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
10.128.0.0 | &amp;lt;rank&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;tt&amp;gt;&amp;lt;rank&amp;gt;&amp;lt;/tt&amp;gt; is the [[FAQ#MPI rank|MPI rank]].  Thus, for a 64-node partition, the IP addresses will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.0.63&amp;lt;/tt&amp;gt;, and for a 1024-node partition, they will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.3.255&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
To try this feature out, submit as a compute job the &amp;lt;tt&amp;gt;cn-ipfwd.sh&amp;lt;/tt&amp;gt; script, which should have been installed in &amp;lt;tt&amp;gt;/path/to/install/cnbin/&amp;lt;/tt&amp;gt;.  The script can act as a standalone job or as a wrapper.  If invoked without any arguments, it initializes the IP forwarding and then goes to sleep; if any arguments have been passed, they are interpreted as the name of the binary (along with its command line arguments) to invoke once the IP forwarding is initialized, e.g. (an example with Cobalt):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n 64 /path/to/install/cnbin/ipfwd.sh \&lt;br /&gt;
&amp;lt;name of another binary&amp;gt; &amp;lt;arguments to that binary&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The script can be copied to another location and adjusted to one's needs.&lt;br /&gt;
&lt;br /&gt;
Once the job is running, log into a compute node, and run &amp;lt;tt&amp;gt;ifconfig&amp;lt;/tt&amp;gt;; there should be a new virtual network device &amp;lt;tt&amp;gt;tun1&amp;lt;/tt&amp;gt; (in addition to the usual &amp;lt;tt&amp;gt;tun0&amp;lt;/tt&amp;gt;, used for IP forwarding between compute nodes and I/O nodes):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
~ # ifconfig tun1&lt;br /&gt;
tun1      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  &lt;br /&gt;
          inet addr:10.128.0.0  P-t-P:10.128.0.0  Mask:255.255.255.255&lt;br /&gt;
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:65535  Metric:1&lt;br /&gt;
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0&lt;br /&gt;
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0&lt;br /&gt;
          collisions:0 txqueuelen:500 &lt;br /&gt;
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)&lt;br /&gt;
~ # ping 10.128.0.1&lt;br /&gt;
PING 10.128.0.1 (10.128.0.1): 56 data bytes&lt;br /&gt;
64 bytes from 10.128.0.1: seq=0 ttl=64 time=0.321 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=1 ttl=64 time=0.191 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=2 ttl=64 time=0.203 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=3 ttl=64 time=0.194 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=4 ttl=64 time=0.207 ms&lt;br /&gt;
--- 10.128.0.1 ping statistics ---&lt;br /&gt;
5 packets transmitted, 5 packets received, 0% packet loss&lt;br /&gt;
round-trip min/avg/max = 0.191/0.223/0.321 ms&lt;br /&gt;
~ # rsh 10.128.0.1 'grep BG_RANK_IN_PSET /proc/personality.sh'&lt;br /&gt;
BG_RANK_IN_PSET=59&lt;br /&gt;
~ # &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This feature can be used to implement an arbitrary IP-based network protocol between the compute nodes.  We have even experimented running a TCP/IP-based MPICH on top of it (which, while obviously not as fast as the native Blue Gene one, has the advantage of being able to, e.g., run multiple MPI jobs at a time on a single partition).&lt;br /&gt;
&lt;br /&gt;
One major disadvantage of this feature is that the current implementation is computationally intensive; it permanently occupies one core on each node.&lt;br /&gt;
&lt;br /&gt;
==ZOID glibc==&lt;br /&gt;
&lt;br /&gt;
This is another preview feature.  It provides a modified version of GNU libc for the compute nodes, which features much better file I/O throughput rates to the I/O nodes and remote file systems than the default one.  It does so by communicating with the ZOID daemon directly, instead of going through the Linux kernel and the FUSE client (which, while convenient, is slow).&lt;br /&gt;
&lt;br /&gt;
The modified glibc is meant for compiled application processes, not for shell scripts and such.  It is currently only available in a static (&lt;tt&gt;.a&lt;/tt&gt;) version.  It is installed with the rest of ZeptoOS, in &lt;tt&gt;/path/to/install/lib/zoid/&lt;/tt&gt;.  To link with it, simply add &lt;tt&gt;-L/path/to/install/lib/zoid&lt;/tt&gt; to the final linking stage.  Use the following command to verify that the modified version of glibc has been used for linking:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ nm &amp;lt;binary&amp;gt; | grep __zoid_init&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(no output will be generated if the standard glibc was used)&lt;br /&gt;
&lt;br /&gt;
When submitting a job linked with this glibc, please set the environment variable &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt; to a list of &amp;lt;tt&amp;gt;:&amp;lt;/tt&amp;gt;-separated pathname prefixes.  Only files opened using pathnames beginning with those prefixes will be directly forwarded to the I/O node; other files will be handled via the compute node kernel and possibly FUSE, which is much slower.&lt;br /&gt;
&lt;br /&gt;
Here is a simple benchmark:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#include &amp;lt;fcntl.h&amp;gt;&lt;br /&gt;
#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;
#include &amp;lt;sys/time.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#define BUFSIZE (1024 * 1024 * 1024) /* 1 GiB */&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char* argv[])&lt;br /&gt;
{&lt;br /&gt;
    char* buffer;&lt;br /&gt;
    int fd;&lt;br /&gt;
    struct timeval start, stop;&lt;br /&gt;
    double time;&lt;br /&gt;
&lt;br /&gt;
    if (argc != 2)&lt;br /&gt;
    {&lt;br /&gt;
	fprintf(stderr, &amp;quot;Usage: %s &amp;lt;pathname&amp;gt;\n&amp;quot;, argv[0]);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
&lt;br /&gt;
    if (!(buffer = malloc(BUFSIZE)))&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;malloc&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    if ((fd = open(argv[1], O_CREAT | O_WRONLY, 0666)) == -1)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;open&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;start, NULL);&lt;br /&gt;
    if (write(fd, buffer, BUFSIZE) != BUFSIZE)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;write&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;stop, NULL);&lt;br /&gt;
    close(fd);&lt;br /&gt;
    free(buffer);&lt;br /&gt;
&lt;br /&gt;
    time = stop.tv_sec - start.tv_sec + (stop.tv_usec - start.tv_usec) * 1e-6;&lt;br /&gt;
    printf(&amp;quot;Writing %d B took %g s, %g B/s\n&amp;quot;, BUFSIZE, time, BUFSIZE / time);&lt;br /&gt;
&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It writes 1&amp;amp;nbsp;GB of data to a file passed on the command line.  With Cobalt, we submit it as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t 10 -n 1 -e ZOID_DIRS=$HOME $PWD/speed_zoid $HOME/speed_zoid-out&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With our home directories on a GPFS filesystem, we get the following performance:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 4.58026 s, 2.34428e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the other hand, if we link it with the standard glibc, or if we forget to set &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt;, the performance we observe is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 10.4905 s, 1.02354e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The modified glibc is not used by default, because it is not yet complete.  However, if one does not try to outsmart it (in particular, we recommend always passing absolute pathnames), it should work reliably.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Other_Packages&amp;diff=588</id>
		<title>Other Packages</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Other_Packages&amp;diff=588"/>
		<updated>2009-05-08T15:01:33Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* PVFS */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==PVFS==&lt;br /&gt;
&lt;br /&gt;
PVFS (Parallel Virtual File System) is designed to scale to petabytes &lt;br /&gt;
of storage and provide access rates of hundreds of GB/s.  On the Argonne BG/P system, PVFS servers &lt;br /&gt;
are running and a PVFS start-up script is installed in the BG/P site-specific directory (&lt;tt&gt;/bgp/iofs&lt;/tt&gt;).&lt;br /&gt;
A PVFS volume is mounted at ION boot time. &lt;br /&gt;
&lt;br /&gt;
We include the PVFS version 2.8.1 source code and prebuilt client binaries &lt;br /&gt;
in the ZeptoOS release for sites that are interested in PVFS. We also provide a very simple pvfs2 start-up&lt;br /&gt;
script as an example. If you have a pvfs2 server running on your system, &lt;br /&gt;
you can follow the steps below to add the start-up script to the ION ramdisk.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd packages/pvfs2/prebuilt&lt;br /&gt;
$ sh add-pvfs2-client-ION-ramdisk.sh  tcp://192.168.1.1:3334/pvfs2-fs  &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please replace tcp://192.168.1.1:3334/pvfs2-fs  with your actual server info.&lt;br /&gt;
&lt;br /&gt;
Details on building and running the pvfs2 server are outside the scope of this document, but the following example &lt;br /&gt;
should give you an idea of how to build and start it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd pvfs-2.8.1&lt;br /&gt;
$ ./configure           # may need some configure options&lt;br /&gt;
$ make&lt;br /&gt;
$ ./src/apps/admin/pvfs2-genconfig fs.conf&lt;br /&gt;
( you will be asked some basic information )&lt;br /&gt;
$ ./src/server/pvfs2-server -f fs.conf  -a  ALIAS   # Replace ALIAS with your real alias&lt;br /&gt;
$ ./src/server/pvfs2-server fs.conf  -a  ALIAS&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Reference: [http://www.pvfs.org/ PVFS2 page]&lt;br /&gt;
&lt;br /&gt;
==IP over torus==&lt;br /&gt;
&lt;br /&gt;
This is currently a preview feature.  It implements IP packet forwarding on top of MPI, over the torus network.  Torus is a point-to-point network that interconnects all the compute nodes in a partition.  Every compute node gets a unique IP address, of the form:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
10.128.0.0 | &amp;lt;rank&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;tt&amp;gt;&amp;lt;rank&amp;gt;&amp;lt;/tt&amp;gt; is the [[FAQ#MPI rank|MPI rank]].  Thus, for a 64-node partition, the IP addresses will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.0.63&amp;lt;/tt&amp;gt;, and for a 1024-node partition, they will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.3.255&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
To try this feature out, submit as a compute job the &amp;lt;tt&amp;gt;cn-ipfwd.sh&amp;lt;/tt&amp;gt; script, which should have been installed in &amp;lt;tt&amp;gt;/path/to/install/cnbin/&amp;lt;/tt&amp;gt;.  The script can act as a standalone job or as a wrapper.  If invoked without any arguments, it initializes the IP forwarding and then goes to sleep; if any arguments have been passed, they are interpreted as the name of the binary (along with its command line arguments) to invoke once the IP forwarding is initialized, e.g. (an example with Cobalt):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n 64 /path/to/install/cnbin/ipfwd.sh \&lt;br /&gt;
&amp;lt;name of another binary&amp;gt; &amp;lt;arguments to that binary&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The script can be copied to another location and adjusted to one's needs.&lt;br /&gt;
&lt;br /&gt;
Once the job is running, log into a compute node, and run &amp;lt;tt&amp;gt;ifconfig&amp;lt;/tt&amp;gt;; there should be a new virtual network device &amp;lt;tt&amp;gt;tun1&amp;lt;/tt&amp;gt; (in addition to the usual &amp;lt;tt&amp;gt;tun0&amp;lt;/tt&amp;gt;, used for IP forwarding between compute nodes and I/O nodes):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
~ # ifconfig tun1&lt;br /&gt;
tun1      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  &lt;br /&gt;
          inet addr:10.128.0.0  P-t-P:10.128.0.0  Mask:255.255.255.255&lt;br /&gt;
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:65535  Metric:1&lt;br /&gt;
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0&lt;br /&gt;
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0&lt;br /&gt;
          collisions:0 txqueuelen:500 &lt;br /&gt;
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)&lt;br /&gt;
~ # ping 10.128.0.1&lt;br /&gt;
PING 10.128.0.1 (10.128.0.1): 56 data bytes&lt;br /&gt;
64 bytes from 10.128.0.1: seq=0 ttl=64 time=0.321 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=1 ttl=64 time=0.191 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=2 ttl=64 time=0.203 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=3 ttl=64 time=0.194 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=4 ttl=64 time=0.207 ms&lt;br /&gt;
--- 10.128.0.1 ping statistics ---&lt;br /&gt;
5 packets transmitted, 5 packets received, 0% packet loss&lt;br /&gt;
round-trip min/avg/max = 0.191/0.223/0.321 ms&lt;br /&gt;
~ # rsh 10.128.0.1 'grep BG_RANK_IN_PSET /proc/personality.sh'&lt;br /&gt;
BG_RANK_IN_PSET=59&lt;br /&gt;
~ # &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This feature can be used to implement an arbitrary IP-based network protocol between the compute nodes.  We have even experimented running a TCP/IP-based MPICH on top of it (which, while obviously not as fast as the native Blue Gene one, has the advantage of being able to, e.g., run multiple MPI jobs at a time on a single partition).&lt;br /&gt;
&lt;br /&gt;
One major disadvantage of this feature is that the current implementation is computationally intensive; it permanently occupies one core on each node.&lt;br /&gt;
&lt;br /&gt;
==ZOID glibc==&lt;br /&gt;
&lt;br /&gt;
This is another preview feature.  It provides a modified version of GNU libc for the compute nodes, which features much better file I/O throughput rates to the I/O nodes and remote file systems than the default one.  It does so by communicating with the ZOID daemon directly, instead of going through the Linux kernel and the FUSE client (which, while convenient, is slow).&lt;br /&gt;
&lt;br /&gt;
The modified glibc is meant for compiled application processes, not for shell scripts and such.  It is currently only available in a static (&amp;lt;tt&amp;gt;.a&amp;lt;/tt&amp;gt;) version.  It is installed with the rest of ZeptoOS, in &amp;lt;tt&amp;gt;/path/to/install/lib/zoid/&amp;lt;/tt&amp;gt;.  To link with it, simply add &amp;lt;tt&amp;gt;-L/path/to/install/lib/zoid&amp;lt;/tt&amp;gt; to the final linking stage.  Use the following command to verify that the modified version of glibc has been used for linking:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ nm &amp;lt;binary&amp;gt; | grep __zoid_init&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(no output will be generated if the standard glibc was used)&lt;br /&gt;
&lt;br /&gt;
When submitting a job linked with this glibc, please set the environment variable &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt; to a list of &amp;lt;tt&amp;gt;:&amp;lt;/tt&amp;gt;-separated pathname prefixes.  Only files opened using pathnames beginning with those prefixes will be directly forwarded to the I/O node; other files will be handled via the compute node kernel and possibly FUSE, which is much slower.&lt;br /&gt;
&lt;br /&gt;
Here is a simple benchmark:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#include &amp;lt;fcntl.h&amp;gt;&lt;br /&gt;
#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;
#include &amp;lt;sys/time.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#define BUFSIZE (1024 * 1024 * 1024)&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char* argv[])&lt;br /&gt;
{&lt;br /&gt;
    char* buffer;&lt;br /&gt;
    int fd;&lt;br /&gt;
    struct timeval start, stop;&lt;br /&gt;
    double time;&lt;br /&gt;
&lt;br /&gt;
    if (argc != 2)&lt;br /&gt;
    {&lt;br /&gt;
	fprintf(stderr, &amp;quot;Usage: %s &amp;lt;pathname&amp;gt;\n&amp;quot;, argv[0]);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
&lt;br /&gt;
    if (!(buffer = malloc(BUFSIZE)))&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;malloc&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    if ((fd = open(argv[1], O_CREAT | O_WRONLY, 0666)) == -1)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;open&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;start, NULL);&lt;br /&gt;
    if (write(fd, buffer, BUFSIZE) != BUFSIZE)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;write&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;stop, NULL);&lt;br /&gt;
    close(fd);&lt;br /&gt;
    free(buffer);&lt;br /&gt;
&lt;br /&gt;
    time = stop.tv_sec - start.tv_sec + (stop.tv_usec - start.tv_usec) * 1e-6;&lt;br /&gt;
    printf(&amp;quot;Writing %d B took %g s, %g B/s\n&amp;quot;, BUFSIZE, time, BUFSIZE / time);&lt;br /&gt;
&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It writes 1&amp;amp;nbsp;GB of data to a file passed on the command line.  With Cobalt, we submit it as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t 10 -n 1 -e ZOID_DIRS=$HOME $PWD/speed_zoid $HOME/speed_zoid-out&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With our home directories on a GPFS filesystem, we get the following performance:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 4.58026 s, 2.34428e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the other hand, if we link it with the standard glibc, or if we forget to set &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt;, the performance we observe is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 10.4905 s, 1.02354e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The modified glibc is not used by default, because it is not yet complete.  However, if one does not try to outsmart it (in particular, we recommend always passing absolute pathnames), it should work reliably.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Other_Packages&amp;diff=584</id>
		<title>Other Packages</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Other_Packages&amp;diff=584"/>
		<updated>2009-05-07T22:48:13Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* PVFS */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==PVFS==&lt;br /&gt;
&lt;br /&gt;
We include PVFS version 2.8.1 source code and its prebuilt client binaries &lt;br /&gt;
in the current ZeptoOS release.  We also provide a very simple pvfs2 start-up&lt;br /&gt;
script as an example. If you have a pvfs2 server running on your system, &lt;br /&gt;
you can use the following steps to add the start-up script to the ION ramdisk.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd packages/pvfs2/prebuilt&lt;br /&gt;
$ sh add-pvfs2-client-ION-ramdisk.sh  tcp://192.168.1.1:3334/pvfs2-fs  &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please replace tcp://192.168.1.1:3334/pvfs2-fs with your actual server information.&lt;br /&gt;
&lt;br /&gt;
Building and running a pvfs2 server is beyond the scope of this document, but the following example &lt;br /&gt;
may give you an idea of how to build and start one.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd pvfs-2.8.1&lt;br /&gt;
$ ./configure           # may need some configure options&lt;br /&gt;
$ make&lt;br /&gt;
$ ./src/apps/admin/pvfs2-genconfig fs.conf&lt;br /&gt;
( you will be asked for some basic information )&lt;br /&gt;
$ ./src/server/pvfs2-server -f fs.conf  -a  ALIAS   # Replace ALIAS with your real alias&lt;br /&gt;
$ ./src/server/pvfs2-server fs.conf  -a  ALIAS&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Reference: [http://www.pvfs.org/ PVFS2 page]&lt;br /&gt;
&lt;br /&gt;
==IP over torus==&lt;br /&gt;
&lt;br /&gt;
This is currently a preview feature.  It implements IP packet forwarding on top of MPI, over the torus network.  Torus is a point-to-point network that interconnects all the compute nodes in a partition.  Every compute node gets a unique IP address, of the form:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
10.128.0.0 | &amp;lt;rank&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;tt&amp;gt;&amp;lt;rank&amp;gt;&amp;lt;/tt&amp;gt; is the [[FAQ#MPI rank|MPI rank]].  Thus, for a 64-node partition, the IP addresses will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.0.63&amp;lt;/tt&amp;gt;, and for a 1024-node partition, they will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.3.255&amp;lt;/tt&amp;gt;.&lt;br /&gt;
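&lt;br /&gt;
The addressing rule above can be sketched in a few lines of Python.  This is only an illustration, assuming the rank is simply OR-ed into the low bits of the base address; the helper name rank_to_ip and the range check are hypothetical and not part of ZeptoOS.&lt;br /&gt;
&lt;br /&gt;
```python
import ipaddress

# Base address of the torus IP range, as given in the text.
BASE = int(ipaddress.IPv4Address("10.128.0.0"))


def rank_to_ip(rank):
    """Return the dotted-quad address for an MPI rank (hypothetical helper)."""
    # Assumption: ranks are non-negative and fit in the 23 bits
    # that 10.128.0.0 leaves free below its set bits.
    if rank not in range(2 ** 23):
        raise ValueError("rank out of range")
    return str(ipaddress.IPv4Address(BASE | rank))


if __name__ == "__main__":
    # First and last ranks of a 64-node and a 1024-node partition.
    for rank in (0, 63, 1023):
        print(rank, rank_to_ip(rank))
```
&lt;br /&gt;
For ranks 63 and 1023 this yields the upper ends of the two ranges quoted above.&lt;br /&gt;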
&lt;br /&gt;
To try this feature out, submit as a compute job the &amp;lt;tt&amp;gt;cn-ipfwd.sh&amp;lt;/tt&amp;gt; script, which should have been installed in &amp;lt;tt&amp;gt;/path/to/install/cnbin/&amp;lt;/tt&amp;gt;.  The script can act as a standalone job or as a wrapper.  If invoked without any arguments, it initializes the IP forwarding and then goes to sleep; if any arguments have been passed, they are interpreted as the name of the binary (along with its command line arguments) to invoke once the IP forwarding is initialized, e.g. (an example with Cobalt):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n 64 /path/to/install/cnbin/cn-ipfwd.sh \&lt;br /&gt;
&amp;lt;name of another binary&amp;gt; &amp;lt;arguments to that binary&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The script can be copied to another location and adjusted to one's needs.&lt;br /&gt;
&lt;br /&gt;
Once the job is running, log into a compute node, and run &amp;lt;tt&amp;gt;ifconfig&amp;lt;/tt&amp;gt;; there should be a new virtual network device &amp;lt;tt&amp;gt;tun1&amp;lt;/tt&amp;gt; (in addition to the usual &amp;lt;tt&amp;gt;tun0&amp;lt;/tt&amp;gt;, used for IP forwarding between compute nodes and I/O nodes):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
~ # ifconfig tun1&lt;br /&gt;
tun1      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  &lt;br /&gt;
          inet addr:10.128.0.0  P-t-P:10.128.0.0  Mask:255.255.255.255&lt;br /&gt;
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:65535  Metric:1&lt;br /&gt;
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0&lt;br /&gt;
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0&lt;br /&gt;
          collisions:0 txqueuelen:500 &lt;br /&gt;
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)&lt;br /&gt;
~ # ping 10.128.0.1&lt;br /&gt;
PING 10.128.0.1 (10.128.0.1): 56 data bytes&lt;br /&gt;
64 bytes from 10.128.0.1: seq=0 ttl=64 time=0.321 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=1 ttl=64 time=0.191 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=2 ttl=64 time=0.203 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=3 ttl=64 time=0.194 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=4 ttl=64 time=0.207 ms&lt;br /&gt;
--- 10.128.0.1 ping statistics ---&lt;br /&gt;
5 packets transmitted, 5 packets received, 0% packet loss&lt;br /&gt;
round-trip min/avg/max = 0.191/0.223/0.321 ms&lt;br /&gt;
~ # rsh 10.128.0.1 'grep BG_RANK_IN_PSET /proc/personality.sh'&lt;br /&gt;
BG_RANK_IN_PSET=59&lt;br /&gt;
~ # &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This feature can be used to implement an arbitrary IP-based network protocol between the compute nodes.  We have even experimented with running a TCP/IP-based MPICH on top of it (which, while obviously not as fast as the native Blue Gene one, has the advantage of being able to, e.g., run multiple MPI jobs at a time on a single partition).&lt;br /&gt;
&lt;br /&gt;
One major disadvantage of this feature is that the current implementation is computationally intensive; it permanently occupies one core on each node.&lt;br /&gt;
&lt;br /&gt;
==ZOID glibc==&lt;br /&gt;
&lt;br /&gt;
This is another preview feature.  It provides a modified version of GNU libc for the compute nodes, which features much better file I/O throughput rates to the I/O nodes and remote file systems than the default one.  It does so by communicating with the ZOID daemon directly, instead of going through the Linux kernel and the FUSE client (which, while convenient, is slow).&lt;br /&gt;
&lt;br /&gt;
The modified glibc is meant for compiled application processes, not for shell scripts and such.  It is currently only available in a static (&amp;lt;tt&amp;gt;.a&amp;lt;/tt&amp;gt;) version.  It is installed with the rest of ZeptoOS, in &amp;lt;tt&amp;gt;/path/to/install/lib/zoid/&amp;lt;/tt&amp;gt;.  To link with it, simply add &amp;lt;tt&amp;gt;-L/path/to/install/lib/zoid&amp;lt;/tt&amp;gt; to the final linking stage.  Use the following command to verify that the modified version of glibc has been used for linking:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ nm &amp;lt;binary&amp;gt; | grep __zoid_init&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(no output will be generated if the standard glibc was used)&lt;br /&gt;
&lt;br /&gt;
When submitting a job linked with this glibc, please set the environment variable &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt; to a list of &amp;lt;tt&amp;gt;:&amp;lt;/tt&amp;gt;-separated pathname prefixes.  Only files opened using pathnames beginning with those prefixes will be directly forwarded to the I/O node; other files will be handled via the compute node kernel and possibly FUSE, which is much slower.&lt;br /&gt;
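&lt;br /&gt;
The prefix rule described above can be illustrated with a short Python sketch.  It assumes a plain string-prefix comparison against each entry of the list; the actual matching inside the modified glibc may differ in details, and the function name is_forwarded and the example paths are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def is_forwarded(pathname, zoid_dirs):
    """Return True if pathname begins with one of the ':'-separated prefixes.

    Hypothetical model of the rule in the text, not the glibc code itself.
    """
    # Split the list and ignore empty entries (e.g. from a trailing ':').
    prefixes = [p for p in zoid_dirs.split(":") if p]
    return any(pathname.startswith(p) for p in prefixes)


if __name__ == "__main__":
    zoid_dirs = "/home/user:/gpfs/scratch"  # hypothetical ZOID_DIRS value
    for path in ("/home/user/out.dat", "/tmp/out.dat", "out.dat"):
        route = "direct to I/O node" if is_forwarded(path, zoid_dirs) else "via kernel"
        print(path, route)
```
&lt;br /&gt;
Under this model a relative pathname such as out.dat never matches an absolute prefix, so it would take the slower path.&lt;br /&gt;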
&lt;br /&gt;
Here is a simple benchmark:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#include &amp;lt;fcntl.h&amp;gt;&lt;br /&gt;
#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;
#include &amp;lt;sys/time.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#define BUFSIZE (1024 * 1024 * 1024)&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char* argv[])&lt;br /&gt;
{&lt;br /&gt;
    char* buffer;&lt;br /&gt;
    int fd;&lt;br /&gt;
    struct timeval start, stop;&lt;br /&gt;
    double time;&lt;br /&gt;
&lt;br /&gt;
    if (argc != 2)&lt;br /&gt;
    {&lt;br /&gt;
	fprintf(stderr, &amp;quot;Usage: %s &amp;lt;pathname&amp;gt;\n&amp;quot;, argv[0]);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
&lt;br /&gt;
    if (!(buffer = malloc(BUFSIZE)))&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;malloc&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    if ((fd = open(argv[1], O_CREAT | O_WRONLY, 0666)) == -1)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;open&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;start, NULL);&lt;br /&gt;
    if (write(fd, buffer, BUFSIZE) != BUFSIZE)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;write&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;stop, NULL);&lt;br /&gt;
    close(fd);&lt;br /&gt;
    free(buffer);&lt;br /&gt;
&lt;br /&gt;
    time = stop.tv_sec - start.tv_sec + (stop.tv_usec - start.tv_usec) * 1e-6;&lt;br /&gt;
    printf(&amp;quot;Writing %d B took %g s, %g B/s\n&amp;quot;, BUFSIZE, time, BUFSIZE / time);&lt;br /&gt;
&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It writes 1&amp;amp;nbsp;GB of data to a file passed on the command line.  With Cobalt, we submit it as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t 10 -n 1 -e ZOID_DIRS=$HOME $PWD/speed_zoid $HOME/speed_zoid-out&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With our home directories on a GPFS filesystem, we get the following performance:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 4.58026 s, 2.34428e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the other hand, if we link it with the standard glibc, or if we forget to set &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt;, the performance we observe is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 10.4905 s, 1.02354e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The modified glibc is not used by default, because it is not yet complete.  However, if one does not try to outsmart it (in particular, we recommend always passing absolute pathnames), it should work reliably.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Other_Packages&amp;diff=583</id>
		<title>Other Packages</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Other_Packages&amp;diff=583"/>
		<updated>2009-05-07T22:47:45Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* PVFS */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==PVFS==&lt;br /&gt;
&lt;br /&gt;
We include PVFS version 2.8.1 source code and its prebuilt client binaries &lt;br /&gt;
in the current ZeptoOS release.  We also provide a very simple pvfs2 start-up&lt;br /&gt;
script as an example. If you have a pvfs2 server running on your system, &lt;br /&gt;
you can use the following steps to add the start-up script to the ION ramdisk.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd packages/pvfs2/prebuilt&lt;br /&gt;
$ sh add-pvfs2-client-ION-ramdisk.sh  tcp://192.168.1.1:3334/pvfs2-fs  &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please replace tcp://192.168.1.1:3334/pvfs2-fs with your actual server information.&lt;br /&gt;
&lt;br /&gt;
Building and running a pvfs2 server is beyond the scope of this document, but the following example &lt;br /&gt;
may give you an idea of how to build and start one.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd pvfs-2.8.1&lt;br /&gt;
$ ./configure           # may need some configure options&lt;br /&gt;
$ make&lt;br /&gt;
$ ./src/apps/admin/pvfs2-genconfig fs.conf&lt;br /&gt;
( you will be asked for some basic information )&lt;br /&gt;
$ ./src/server/pvfs2-server -f fs.conf  -a  ALIAS   # Replace ALIAS with your real alias&lt;br /&gt;
$ ./src/server/pvfs2-server fs.conf  -a  ALIAS&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Reference: [http://www.pvfs.org/]&lt;br /&gt;
&lt;br /&gt;
==IP over torus==&lt;br /&gt;
&lt;br /&gt;
This is currently a preview feature.  It implements IP packet forwarding on top of MPI, over the torus network.  Torus is a point-to-point network that interconnects all the compute nodes in a partition.  Every compute node gets a unique IP address, of the form:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
10.128.0.0 | &amp;lt;rank&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;tt&amp;gt;&amp;lt;rank&amp;gt;&amp;lt;/tt&amp;gt; is the [[FAQ#MPI rank|MPI rank]].  Thus, for a 64-node partition, the IP addresses will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.0.63&amp;lt;/tt&amp;gt;, and for a 1024-node partition, they will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.3.255&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
To try this feature out, submit as a compute job the &amp;lt;tt&amp;gt;cn-ipfwd.sh&amp;lt;/tt&amp;gt; script, which should have been installed in &amp;lt;tt&amp;gt;/path/to/install/cnbin/&amp;lt;/tt&amp;gt;.  The script can act as a standalone job or as a wrapper.  If invoked without any arguments, it initializes the IP forwarding and then goes to sleep; if any arguments have been passed, they are interpreted as the name of the binary (along with its command line arguments) to invoke once the IP forwarding is initialized, e.g. (an example with Cobalt):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n 64 /path/to/install/cnbin/cn-ipfwd.sh \&lt;br /&gt;
&amp;lt;name of another binary&amp;gt; &amp;lt;arguments to that binary&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The script can be copied to another location and adjusted to one's needs.&lt;br /&gt;
&lt;br /&gt;
Once the job is running, log into a compute node, and run &amp;lt;tt&amp;gt;ifconfig&amp;lt;/tt&amp;gt;; there should be a new virtual network device &amp;lt;tt&amp;gt;tun1&amp;lt;/tt&amp;gt; (in addition to the usual &amp;lt;tt&amp;gt;tun0&amp;lt;/tt&amp;gt;, used for IP forwarding between compute nodes and I/O nodes):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
~ # ifconfig tun1&lt;br /&gt;
tun1      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  &lt;br /&gt;
          inet addr:10.128.0.0  P-t-P:10.128.0.0  Mask:255.255.255.255&lt;br /&gt;
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:65535  Metric:1&lt;br /&gt;
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0&lt;br /&gt;
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0&lt;br /&gt;
          collisions:0 txqueuelen:500 &lt;br /&gt;
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)&lt;br /&gt;
~ # ping 10.128.0.1&lt;br /&gt;
PING 10.128.0.1 (10.128.0.1): 56 data bytes&lt;br /&gt;
64 bytes from 10.128.0.1: seq=0 ttl=64 time=0.321 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=1 ttl=64 time=0.191 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=2 ttl=64 time=0.203 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=3 ttl=64 time=0.194 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=4 ttl=64 time=0.207 ms&lt;br /&gt;
--- 10.128.0.1 ping statistics ---&lt;br /&gt;
5 packets transmitted, 5 packets received, 0% packet loss&lt;br /&gt;
round-trip min/avg/max = 0.191/0.223/0.321 ms&lt;br /&gt;
~ # rsh 10.128.0.1 'grep BG_RANK_IN_PSET /proc/personality.sh'&lt;br /&gt;
BG_RANK_IN_PSET=59&lt;br /&gt;
~ # &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This feature can be used to implement an arbitrary IP-based network protocol between the compute nodes.  We have even experimented with running a TCP/IP-based MPICH on top of it (which, while obviously not as fast as the native Blue Gene one, has the advantage of being able to, e.g., run multiple MPI jobs at a time on a single partition).&lt;br /&gt;
&lt;br /&gt;
One major disadvantage of this feature is that the current implementation is computationally intensive; it permanently occupies one core on each node.&lt;br /&gt;
&lt;br /&gt;
==ZOID glibc==&lt;br /&gt;
&lt;br /&gt;
This is another preview feature.  It provides a modified version of GNU libc for the compute nodes, which features much better file I/O throughput rates to the I/O nodes and remote file systems than the default one.  It does so by communicating with the ZOID daemon directly, instead of going through the Linux kernel and the FUSE client (which, while convenient, is slow).&lt;br /&gt;
&lt;br /&gt;
The modified glibc is meant for compiled application processes, not for shell scripts and such.  It is currently only available in a static (&amp;lt;tt&amp;gt;.a&amp;lt;/tt&amp;gt;) version.  It is installed with the rest of ZeptoOS, in &amp;lt;tt&amp;gt;/path/to/install/lib/zoid/&amp;lt;/tt&amp;gt;.  To link with it, simply add &amp;lt;tt&amp;gt;-L/path/to/install/lib/zoid&amp;lt;/tt&amp;gt; to the final linking stage.  Use the following command to verify that the modified version of glibc has been used for linking:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ nm &amp;lt;binary&amp;gt; | grep __zoid_init&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(no output will be generated if the standard glibc was used)&lt;br /&gt;
&lt;br /&gt;
When submitting a job linked with this glibc, please set the environment variable &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt; to a list of &amp;lt;tt&amp;gt;:&amp;lt;/tt&amp;gt;-separated pathname prefixes.  Only files opened using pathnames beginning with those prefixes will be directly forwarded to the I/O node; other files will be handled via the compute node kernel and possibly FUSE, which is much slower.&lt;br /&gt;
&lt;br /&gt;
Here is a simple benchmark:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#include &amp;lt;fcntl.h&amp;gt;&lt;br /&gt;
#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;
#include &amp;lt;sys/time.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#define BUFSIZE (1024 * 1024 * 1024)&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char* argv[])&lt;br /&gt;
{&lt;br /&gt;
    char* buffer;&lt;br /&gt;
    int fd;&lt;br /&gt;
    struct timeval start, stop;&lt;br /&gt;
    double time;&lt;br /&gt;
&lt;br /&gt;
    if (argc != 2)&lt;br /&gt;
    {&lt;br /&gt;
	fprintf(stderr, &amp;quot;Usage: %s &amp;lt;pathname&amp;gt;\n&amp;quot;, argv[0]);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
&lt;br /&gt;
    if (!(buffer = malloc(BUFSIZE)))&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;malloc&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    if ((fd = open(argv[1], O_CREAT | O_WRONLY, 0666)) == -1)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;open&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;start, NULL);&lt;br /&gt;
    if (write(fd, buffer, BUFSIZE) != BUFSIZE)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;write&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;stop, NULL);&lt;br /&gt;
    close(fd);&lt;br /&gt;
    free(buffer);&lt;br /&gt;
&lt;br /&gt;
    time = stop.tv_sec - start.tv_sec + (stop.tv_usec - start.tv_usec) * 1e-6;&lt;br /&gt;
    printf(&amp;quot;Writing %d B took %g s, %g B/s\n&amp;quot;, BUFSIZE, time, BUFSIZE / time);&lt;br /&gt;
&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It writes 1&amp;amp;nbsp;GB of data to a file passed on the command line.  With Cobalt, we submit it as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t 10 -n 1 -e ZOID_DIRS=$HOME $PWD/speed_zoid $HOME/speed_zoid-out&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With our home directories on a GPFS filesystem, we get the following performance:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 4.58026 s, 2.34428e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the other hand, if we link it with the standard glibc, or if we forget to set &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt;, the performance we observe is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 10.4905 s, 1.02354e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The modified glibc is not used by default, because it is not yet complete.  However, if one does not try to outsmart it (in particular, we recommend always passing absolute pathnames), it should work reliably.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Other_Packages&amp;diff=582</id>
		<title>Other Packages</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Other_Packages&amp;diff=582"/>
		<updated>2009-05-07T22:39:06Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* PVFS */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==PVFS==&lt;br /&gt;
&lt;br /&gt;
We include PVFS version 2.8.1 source code and its prebuilt client binaries &lt;br /&gt;
in the current ZeptoOS release.  We also provide a very simple pvfs2 start-up&lt;br /&gt;
script as an example. If you have a pvfs2 server running on your system, &lt;br /&gt;
you can use the following steps to add the start-up script.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd packages/pvfs2/prebuilt&lt;br /&gt;
$ sh add-pvfs2-client-ION-ramdisk.sh  tcp://192.168.1.1:3334/pvfs2-fs  &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
tcp://192.168.1.1:3334/pvfs2-fs is an example. Please replace it with your actual server info.&lt;br /&gt;
&lt;br /&gt;
==IP over torus==&lt;br /&gt;
&lt;br /&gt;
This is currently a preview feature.  It implements IP packet forwarding on top of MPI, over the torus network.  Torus is a point-to-point network that interconnects all the compute nodes in a partition.  Every compute node gets a unique IP address, of the form:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
10.128.0.0 | &amp;lt;rank&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;tt&amp;gt;&amp;lt;rank&amp;gt;&amp;lt;/tt&amp;gt; is the [[FAQ#MPI rank|MPI rank]].  Thus, for a 64-node partition, the IP addresses will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.0.63&amp;lt;/tt&amp;gt;, and for a 1024-node partition, they will range between &amp;lt;tt&amp;gt;10.128.0.0&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10.128.3.255&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
To try this feature out, submit as a compute job the &amp;lt;tt&amp;gt;cn-ipfwd.sh&amp;lt;/tt&amp;gt; script, which should have been installed in &amp;lt;tt&amp;gt;/path/to/install/cnbin/&amp;lt;/tt&amp;gt;.  The script can act as a standalone job or as a wrapper.  If invoked without any arguments, it initializes the IP forwarding and then goes to sleep; if any arguments have been passed, they are interpreted as the name of the binary (along with its command line arguments) to invoke once the IP forwarding is initialized, e.g. (an example with Cobalt):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n 64 /path/to/install/cnbin/cn-ipfwd.sh \&lt;br /&gt;
&amp;lt;name of another binary&amp;gt; &amp;lt;arguments to that binary&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The script can be copied to another location and adjusted to one's needs.&lt;br /&gt;
&lt;br /&gt;
Once the job is running, log into a compute node, and run &amp;lt;tt&amp;gt;ifconfig&amp;lt;/tt&amp;gt;; there should be a new virtual network device &amp;lt;tt&amp;gt;tun1&amp;lt;/tt&amp;gt; (in addition to the usual &amp;lt;tt&amp;gt;tun0&amp;lt;/tt&amp;gt;, used for IP forwarding between compute nodes and I/O nodes):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
~ # ifconfig tun1&lt;br /&gt;
tun1      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  &lt;br /&gt;
          inet addr:10.128.0.0  P-t-P:10.128.0.0  Mask:255.255.255.255&lt;br /&gt;
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:65535  Metric:1&lt;br /&gt;
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0&lt;br /&gt;
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0&lt;br /&gt;
          collisions:0 txqueuelen:500 &lt;br /&gt;
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)&lt;br /&gt;
~ # ping 10.128.0.1&lt;br /&gt;
PING 10.128.0.1 (10.128.0.1): 56 data bytes&lt;br /&gt;
64 bytes from 10.128.0.1: seq=0 ttl=64 time=0.321 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=1 ttl=64 time=0.191 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=2 ttl=64 time=0.203 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=3 ttl=64 time=0.194 ms&lt;br /&gt;
64 bytes from 10.128.0.1: seq=4 ttl=64 time=0.207 ms&lt;br /&gt;
--- 10.128.0.1 ping statistics ---&lt;br /&gt;
5 packets transmitted, 5 packets received, 0% packet loss&lt;br /&gt;
round-trip min/avg/max = 0.191/0.223/0.321 ms&lt;br /&gt;
~ # rsh 10.128.0.1 'grep BG_RANK_IN_PSET /proc/personality.sh'&lt;br /&gt;
BG_RANK_IN_PSET=59&lt;br /&gt;
~ # &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This feature can be used to implement an arbitrary IP-based network protocol between the compute nodes.  We have even experimented with running a TCP/IP-based MPICH on top of it (which, while obviously not as fast as the native Blue Gene one, has the advantage of being able to, e.g., run multiple MPI jobs at a time on a single partition).&lt;br /&gt;
&lt;br /&gt;
One major disadvantage of this feature is that the current implementation is computationally intensive; it permanently occupies one core on each node.&lt;br /&gt;
&lt;br /&gt;
==ZOID glibc==&lt;br /&gt;
&lt;br /&gt;
This is another preview feature.  It provides a modified version of GNU libc for the compute nodes, which features much better file I/O throughput rates to the I/O nodes and remote file systems than the default one.  It does so by communicating with the ZOID daemon directly, instead of going through the Linux kernel and the FUSE client (which, while convenient, is slow).&lt;br /&gt;
&lt;br /&gt;
The modified glibc is meant for compiled application processes, not for shell scripts and such.  It is currently only available in a static (&amp;lt;tt&amp;gt;.a&amp;lt;/tt&amp;gt;) version.  It is installed with the rest of ZeptoOS, in &amp;lt;tt&amp;gt;/path/to/install/lib/zoid/&amp;lt;/tt&amp;gt;.  To link with it, simply add &amp;lt;tt&amp;gt;-L/path/to/install/lib/zoid&amp;lt;/tt&amp;gt; to the final linking stage.  Use the following command to verify that the modified version of glibc has been used for linking:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ nm &amp;lt;binary&amp;gt; | grep __zoid_init&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(no output will be generated if the standard glibc was used)&lt;br /&gt;
&lt;br /&gt;
When submitting a job linked with this glibc, please set the environment variable &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt; to a list of &amp;lt;tt&amp;gt;:&amp;lt;/tt&amp;gt;-separated pathname prefixes.  Only files opened using pathnames beginning with those prefixes will be directly forwarded to the I/O node; other files will be handled via the compute node kernel and possibly FUSE, which is much slower.&lt;br /&gt;
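&lt;br /&gt;
The prefix rule can be illustrated with a small shell sketch.  It only mimics the behavior described above and is not the code used inside the modified glibc; the function name &amp;lt;tt&amp;gt;is_forwarded&amp;lt;/tt&amp;gt; and the example prefixes are made up for this illustration, and we assume a plain string-prefix comparison (whether the real implementation respects path component boundaries is not stated here).&lt;br /&gt;
&lt;br /&gt;
```shell
# Illustration only: returns success if PATHNAME begins with one of the
# colon-separated prefixes, i.e. if it would be forwarded directly.
# Usage: is_forwarded PATHNAME "PREFIX1:PREFIX2:..."
# The body runs in a subshell, so the IFS and "set -f" changes stay local.
is_forwarded() (
    path=$1
    IFS=:
    set -f                          # no pathname expansion while splitting
    for prefix in $2; do
        [ -n "$prefix" ] || continue    # ignore empty list entries
        case $path in
            "$prefix"*) exit 0 ;;
        esac
    done
    exit 1
)

# Example values (hypothetical): two prefixes, two files.
ZOID_DIRS=/gpfs/home/iskra:/scratch
for f in /gpfs/home/iskra/speed_zoid-out /tmp/scratch.dat; do
    if is_forwarded "$f" "$ZOID_DIRS"; then
        echo "$f: forwarded directly to the I/O node"
    else
        echo "$f: handled by the compute node kernel (possibly FUSE)"
    fi
done
```
&lt;br /&gt;
Note that a relative pathname never matches an absolute prefix in this sketch, which is one reason to pass absolute pathnames to the application.&lt;br /&gt;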
&lt;br /&gt;
Here is a simple benchmark:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#include &amp;lt;fcntl.h&amp;gt;&lt;br /&gt;
#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;
#include &amp;lt;sys/time.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
#define BUFSIZE (1024 * 1024 * 1024)&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char* argv[])&lt;br /&gt;
{&lt;br /&gt;
    char* buffer;&lt;br /&gt;
    int fd;&lt;br /&gt;
    struct timeval start, stop;&lt;br /&gt;
    double time;&lt;br /&gt;
&lt;br /&gt;
    if (argc != 2)&lt;br /&gt;
    {&lt;br /&gt;
	fprintf(stderr, &amp;quot;Usage: %s &amp;lt;pathname&amp;gt;\n&amp;quot;, argv[0]);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
&lt;br /&gt;
    if (!(buffer = malloc(BUFSIZE)))&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;malloc&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    if ((fd = open(argv[1], O_CREAT | O_WRONLY, 0666)) == -1)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;open&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;start, NULL);&lt;br /&gt;
    if (write(fd, buffer, BUFSIZE) != BUFSIZE)&lt;br /&gt;
    {&lt;br /&gt;
	perror(&amp;quot;write&amp;quot;);&lt;br /&gt;
	return 1;&lt;br /&gt;
    }&lt;br /&gt;
    gettimeofday(&amp;amp;stop, NULL);&lt;br /&gt;
    close(fd);&lt;br /&gt;
    free(buffer);&lt;br /&gt;
&lt;br /&gt;
    time = stop.tv_sec - start.tv_sec + (stop.tv_usec - start.tv_usec) * 1e-6;&lt;br /&gt;
    printf(&amp;quot;Writing %d B took %g s, %g B/s\n&amp;quot;, BUFSIZE, time, BUFSIZE / time);&lt;br /&gt;
&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It writes 1&amp;amp;nbsp;GB of data to a file passed on the command line.  With Cobalt, we submit it as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t 10 -n 1 -e ZOID_DIRS=$HOME $PWD/speed_zoid $HOME/speed_zoid-out&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With our home directories on a GPFS filesystem, we get the following performance:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 4.58026 s, 2.34428e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the other hand, if we link it with the standard glibc, or if we forget to set &amp;lt;tt&amp;gt;ZOID_DIRS&amp;lt;/tt&amp;gt;, the performance we observe is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Writing 1073741824 B took 10.4905 s, 1.02354e+08 B/s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
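&lt;br /&gt;
To compare the two results more easily, the figures can be converted to MiB/s and to a speedup ratio.  The following is a small sketch using &amp;lt;tt&amp;gt;awk&amp;lt;/tt&amp;gt;; the byte count and timings are copied from the two outputs above, and the helper name &amp;lt;tt&amp;gt;mib_per_s&amp;lt;/tt&amp;gt; is ours.&lt;br /&gt;
&lt;br /&gt;
```shell
# Convert a byte count and an elapsed time in seconds to MiB/s.
mib_per_s() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes / secs / 1048576 }'
}

mib_per_s 1073741824 4.58026    # modified glibc with ZOID_DIRS set
mib_per_s 1073741824 10.4905    # standard glibc, or ZOID_DIRS not set

# Ratio of the two elapsed times.
awk 'BEGIN { printf "%.2f\n", 10.4905 / 4.58026 }'
```
&lt;br /&gt;
For the measurements shown here, this works out to roughly 224 MiB/s versus 98 MiB/s, i.e., a speedup of about 2.3.&lt;br /&gt;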
&lt;br /&gt;
The modified glibc is not used by default, because it is not yet complete.  However, if one does not try to outsmart it (in particular, we recommend always passing absolute pathnames), it should work reliably.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[(K)TAU]] | [[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Testing&amp;diff=559</id>
		<title>Testing</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Testing&amp;diff=559"/>
		<updated>2009-05-06T19:44:51Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Compiling */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Installation]] | [[ZeptoOS_Documentation|Top]] | [[MPICH, DCMF, and SPI]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
Once ZeptoOS is configured and installed, it is time to test it.  Here are a few trivial tests to verify that the environment is working:&lt;br /&gt;
&lt;br /&gt;
==The /bin/sleep job==&lt;br /&gt;
&lt;br /&gt;
If you are using Cobalt, submit using either of the commands below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n 1 /bin/sleep 3600&lt;br /&gt;
$ qsub --kernel &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n 1 /bin/sleep 3600&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are using &amp;lt;tt&amp;gt;mpirun&amp;lt;/tt&amp;gt; directly, submit as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -verbose 1 -partition &amp;lt;partition-name&amp;gt; -np 1 -timeout &amp;lt;time&amp;gt; \&lt;br /&gt;
-cwd $PWD -exe /bin/sleep 3600&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This test, if successful, will verify that the ZeptoOS compute and I/O node environments are booting correctly.  We deliberately chose a system binary such as &amp;lt;tt&amp;gt;/bin/sleep&amp;lt;/tt&amp;gt; instead of something from a network filesystem so that even if the network filesystem does not come up for some reason, the test can still succeed.&lt;br /&gt;
&lt;br /&gt;
If everything works out fine, messages such as the following will be found in the error stream (''jobid''.error file if using Cobalt):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
FE_MPI (Info) : initialize() - using jobname '' provided by scheduler interface&lt;br /&gt;
FE_MPI (Info) : Invoking mpirun backend&lt;br /&gt;
FE_MPI (Info) : connectToServer() - Handshake successful&lt;br /&gt;
BRIDGE (Info) : rm_set_serial() - The machine serial number (alias) is BGP&lt;br /&gt;
FE_MPI (Info) : Preparing partition&lt;br /&gt;
BE_MPI (Info) : Examining specified partition&lt;br /&gt;
BE_MPI (Info) : Checking partition ANL-R00-M1-N12-64 initial state ...&lt;br /&gt;
BE_MPI (Info) : Partition ANL-R00-M1-N12-64 initial state = FREE ('F')&lt;br /&gt;
BE_MPI (Info) : Checking partition owner...&lt;br /&gt;
BE_MPI (Info) : Setting new owner&lt;br /&gt;
BE_MPI (Info) : Initiating boot of the partition&lt;br /&gt;
BE_MPI (Info) : Waiting for partition ANL-R00-M1-N12-64 to boot...&lt;br /&gt;
BE_MPI (Info) : Partition is ready&lt;br /&gt;
BE_MPI (Info) : Done preparing partition&lt;br /&gt;
FE_MPI (Info) : Adding job&lt;br /&gt;
BE_MPI (Info) : Adding job to database...&lt;br /&gt;
FE_MPI (Info) : Job added with the following id: 98461&lt;br /&gt;
FE_MPI (Info) : Starting job 98461&lt;br /&gt;
FE_MPI (Info) : Waiting for job to terminate&lt;br /&gt;
BE_MPI (Info) : IO - Threads initialized&lt;br /&gt;
BE_MPI (Info) : I/O input runner thread terminated&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(we stripped the timestamp prefixes to make the lines shorter)&lt;br /&gt;
&lt;br /&gt;
If these messages are immediately followed by additional error messages, then there is a problem.  One common instance would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BE_MPI (Info) : I/O output runner thread terminated&lt;br /&gt;
BE_MPI (Info) : Job 98463 switched to state ERROR ('E')&lt;br /&gt;
BE_MPI (ERROR): Job execution failed&lt;br /&gt;
[...]&lt;br /&gt;
BE_MPI (ERROR): The error message in the job record is as follows:&lt;br /&gt;
BE_MPI (ERROR):   &amp;quot;Load failed on 172.16.3.11: Program segment is not 1MB aligned&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This error indicates that the job was submitted to the default software environment, not to ZeptoOS (at the very least, the default I/O node ramdisk was used).  You need to go back to the [[Installation#Setting up a kernel profile|Installation]] section to fix the problem.  Information from the system log files can be useful to diagnose the problem.&lt;br /&gt;
&lt;br /&gt;
==Log files==&lt;br /&gt;
&lt;br /&gt;
===I/O node===&lt;br /&gt;
&lt;br /&gt;
Every I/O node has its own log file located in &amp;lt;tt&amp;gt;/bgsys/logs/BGP/&amp;lt;/tt&amp;gt;, with a name such as &amp;lt;tt&amp;gt;R*-M*-N*-J*.log&amp;lt;/tt&amp;gt;.  This name will generally correspond to the name of the partition where the job was running.  Above, our job ran on &amp;lt;tt&amp;gt;ANL-R00-M1-N12-64&amp;lt;/tt&amp;gt; (we could see that in the error stream; Cobalt users can also use &amp;lt;tt&amp;gt;[c]qstat&amp;lt;/tt&amp;gt;); a corresponding I/O node log file on Argonne machines will be &amp;lt;tt&amp;gt;R00-M1-N12-J00.log&amp;lt;/tt&amp;gt;.  This is what a log file from a successful ZeptoOS boot looks like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;Linux version 2.6.16.46-297 (geeko@buildhost) (gcc version 4.1.2 (BGP)) #1 SMP Wed Apr 22 15:04:42 CDT 2009&lt;br /&gt;
Kernel command line: console=bgcons root=/dev/ram0 lpj=8500000&lt;br /&gt;
init started:  BusyBox v1.4.2 (2008-04-10 05:20:01 UTC) multi-call binary&lt;br /&gt;
Starting RPC portmap daemon..done&lt;br /&gt;
eth0: Link status [RX+,TX+]&lt;br /&gt;
mount server reported tcp not available, falling back to udp&lt;br /&gt;
mount: RPC: Remote system error - No route to host&lt;br /&gt;
Zepto ION startup-00&lt;br /&gt;
eth0      Link encap:Ethernet  HWaddr 00:14:5E:7D:0C:57  &lt;br /&gt;
          inet addr:172.16.3.15  Bcast:172.31.255.255  Mask:255.240.0.0&lt;br /&gt;
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1&lt;br /&gt;
          RX packets:880 errors:0 dropped:0 overruns:0 frame:0&lt;br /&gt;
          TX packets:1009 errors:0 dropped:0 overruns:0 carrier:0&lt;br /&gt;
          collisions:0 txqueuelen:1000 &lt;br /&gt;
          RX bytes:3878545 (3.6 Mb)  TX bytes:151458 (147.9 Kb)&lt;br /&gt;
          Interrupt:32 &lt;br /&gt;
Zepto ION startup-00 done&lt;br /&gt;
                                                                      done&lt;br /&gt;
Starting syslog servicesDec 31 18:00:36 ion-15 syslogd 1.4.1: restart.&lt;br /&gt;
                                                                      done&lt;br /&gt;
Starting network time protocol daemon (NTPD) using 172.17.3.1&lt;br /&gt;
May  1 12:57:11 ion-15 ntpdate[642]: step time server 172.17.3.1 offset 1241200617.470271 sec&lt;br /&gt;
May  1 12:57:11 ion-15 ntpd[653]: ntpd 4.2.0a@1.1196-r Sat Oct  4 00:01:53 UTC 2008 (1)&lt;br /&gt;
May  1 12:57:11 ion-15 ntpd[653]: precision = 1.000 usec&lt;br /&gt;
May  1 12:57:11 ion-15 ntpd[653]: Listening on interface wildcard, 0.0.0.0#123&lt;br /&gt;
May  1 12:57:11 ion-15 ntpd[653]: Listening on interface eth0, 172.16.3.15#123&lt;br /&gt;
May  1 12:57:11 ion-15 ntpd[653]: Listening on interface lo, 127.0.0.1#123&lt;br /&gt;
May  1 12:57:11 ion-15 ntpd[653]: kernel time sync status 0040&lt;br /&gt;
                                                                      done&lt;br /&gt;
Enabling ssh&lt;br /&gt;
Mounting site filesystems&lt;br /&gt;
                                                                      done&lt;br /&gt;
Loading PVFS2 kernel module                                           done&lt;br /&gt;
Sleeping 0 seconds before starting PVFS                               done&lt;br /&gt;
Starting PVFS2 client                                                 done&lt;br /&gt;
Sleeping 10 seconds before mounting PVFS&lt;br /&gt;
                                                                      done&lt;br /&gt;
Mounting PVFS2 filesystems                                            done&lt;br /&gt;
Starting SSH daemonMay  1 12:57:21 ion-15 sshd[833]: Server listening on 0.0.0.0 port 22.&lt;br /&gt;
                                                                      done&lt;br /&gt;
Zepto ION startup-12&lt;br /&gt;
Zepto ION startup-12 done&lt;br /&gt;
Starting GPFS&lt;br /&gt;
May  1 12:57:26 ion-15 syslogd 1.4.1: restart.&lt;br /&gt;
/etc/init.d/rc3.d/S40gpfs: GPFS is ready on I/O node ion-15 : 172.16.3.15 : R00-M1-N12-J00&lt;br /&gt;
ln: creating symbolic link `/home/acherryl/acherryl' to `/gpfs/home/acherryl': File exists&lt;br /&gt;
ln: creating symbolic link `/home/bgpadmin/bgpadmin' to `/gpfs/home/bgpadmin': File exists&lt;br /&gt;
ln: creating symbolic link `/home/davidr/davidr' to `/gpfs/home/davidr': File exists&lt;br /&gt;
ln: creating symbolic link `/home/scullinl/scullinl' to `/gpfs/home/scullinl': File exists&lt;br /&gt;
Starting ZOID...&lt;br /&gt;
                                                                      done&lt;br /&gt;
Zepto ION startup-99&lt;br /&gt;
Zepto ION startup-99 done&lt;br /&gt;
May  1 17:57:59 ion-15 init: Starting pid 2823, console /dev/console: '/bin/sh'&lt;br /&gt;
BusyBox v1.4.2 (2008-10-04 00:02:35 UTC) Built-in shell (ash)&lt;br /&gt;
Enter 'help' for a list of built-in commands.&lt;br /&gt;
/bin/sh: can't access tty; job control turned off&lt;br /&gt;
~ # &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(again, we stripped the prefixes to make the lines shorter)&lt;br /&gt;
&lt;br /&gt;
Messages such as &amp;lt;tt&amp;gt;Zepto ION startup&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;Starting ZOID&amp;lt;/tt&amp;gt; clearly indicate that a ZeptoOS I/O node ramdisk is being used.  If one instead mistakenly booted with the default ramdisk, this could be recognized by messages such as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Starting CIO services&lt;br /&gt;
[ciod:initialized]                                                    done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(&amp;lt;tt&amp;gt;ciod&amp;lt;/tt&amp;gt; is ''never'' started when using Zepto Compute Node Linux)&lt;br /&gt;
&lt;br /&gt;
In addition to verifying the ramdisk, the correct I/O node kernel can also be verified using the I/O node logfile by checking the kernel build timestamp in the first line of the boot log.  As of this writing the default kernel on the Argonne machines has a timestamp of &amp;lt;tt&amp;gt;Wed Oct 29 18:51:19 UTC 2008&amp;lt;/tt&amp;gt;; as can be seen above, the ZeptoOS kernel was built more recently.&lt;br /&gt;
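&lt;br /&gt;
The checks described in this subsection can be scripted.  The sketch below defines a few hypothetical helpers (the names are ours).  &amp;lt;tt&amp;gt;ionode_log_name&amp;lt;/tt&amp;gt; derives the log file name from a partition name by following the single Argonne example given above (drop the &amp;lt;tt&amp;gt;ANL-&amp;lt;/tt&amp;gt; prefix and the trailing size, append &amp;lt;tt&amp;gt;-J00&amp;lt;/tt&amp;gt;); other sites, or partitions with several I/O nodes, will need a different rule.  &amp;lt;tt&amp;gt;ramdisk_kind&amp;lt;/tt&amp;gt; looks for the marker messages quoted above, and &amp;lt;tt&amp;gt;last_kernel_banner&amp;lt;/tt&amp;gt; prints the most recent kernel version line, assuming that the log file accumulates the output of several boots.&lt;br /&gt;
&lt;br /&gt;
```shell
# ANL-R00-M1-N12-64 becomes R00-M1-N12-J00.log (assumed Argonne convention).
ionode_log_name() {
    name=${1#ANL-}      # drop the site prefix
    name=${name%-*}     # drop the trailing partition size, e.g. "-64"
    printf '%s-J00.log\n' "$name"
}

# Report which I/O node ramdisk produced the given log file.
ramdisk_kind() {
    if grep -q 'Zepto ION startup' "$1"; then
        echo zeptoos
    elif grep -q 'Starting CIO services' "$1"; then
        echo default
    else
        echo unknown
    fi
}

# Print the last kernel banner; its build timestamp identifies the kernel.
last_kernel_banner() {
    grep 'Linux version' "$1" | tail -n 1
}

log=/bgsys/logs/BGP/$(ionode_log_name ANL-R00-M1-N12-64)
if [ -r "$log" ]; then
    ramdisk_kind "$log"
    last_kernel_banner "$log"
fi
```
&lt;br /&gt;
Because &amp;lt;tt&amp;gt;ramdisk_kind&amp;lt;/tt&amp;gt; searches the whole file, a log that contains both an older default boot and a newer ZeptoOS boot will be reported as &amp;lt;tt&amp;gt;zeptoos&amp;lt;/tt&amp;gt;; inspect the end of the file by hand when in doubt.&lt;br /&gt;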
&lt;br /&gt;
===Compute node===&lt;br /&gt;
&lt;br /&gt;
All the compute nodes on the machine share the same MMCS log file, located in &amp;lt;tt&amp;gt;/bgsys/logs/BGP/&amp;lt;/tt&amp;gt;.  The name of the log file is not fixed (it contains a timestamp), but &amp;lt;tt&amp;gt;sn1-bgdb0-mmcs_db_server-current.log&amp;lt;/tt&amp;gt; always links to the current file.  Because the file is shared with other jobs, we recommend grepping it for the user name, the partition name, or both.&lt;br /&gt;
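&lt;br /&gt;
A minimal sketch of such a filter is shown below.  The helper name &amp;lt;tt&amp;gt;mmcs_grep&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;MMCS_LOG&amp;lt;/tt&amp;gt; variable are ours; the default pathname is the symbolic link mentioned above and may differ at other sites.&lt;br /&gt;
&lt;br /&gt;
```shell
# Default to the "current" symbolic link unless MMCS_LOG is already set.
MMCS_LOG=${MMCS_LOG:-/bgsys/logs/BGP/sn1-bgdb0-mmcs_db_server-current.log}

# Print the MMCS log lines that mention a partition and, optionally, a user.
# Usage: mmcs_grep PARTITION [USER]
mmcs_grep() {
    if [ -n "${2:-}" ]; then
        grep -F -e "$1" "$MMCS_LOG" | grep -F -e "$2"
    else
        grep -F -e "$1" "$MMCS_LOG"
    fi
}
```
&lt;br /&gt;
For the job used in this section, one would run &amp;lt;tt&amp;gt;mmcs_grep ANL-R00-M1-N12-64 iskra&amp;lt;/tt&amp;gt;.  Filtering by partition name alone is the safer choice when in doubt, because some job-related lines carry only the partition name.&lt;br /&gt;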
&lt;br /&gt;
A correct boot log when booting ZeptoOS will look something like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Common Node Services V1R3M0 (efix:0)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Licensed Machine Code - Property of IBM.&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Blue Gene/P Licensed Machine Code.&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Copyright IBM Corp., 2006, 2007 All Rights Reserved.&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Z: Zepto Linux Kernel relocating CNS... dst=80280000 src=fff40000 size=262144&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Z: CNS is successfully relocated to 00280000 in physical memory&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Linux version 2.6.19.2-g66cbca2d (kazutomo@login1) (gcc version 4.1.2 (BGP)) #12 SMP Tue Apr 21 12:58:11 CDT 2009&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Zone PFN ranges:&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0:   DMA             0 -&amp;gt;    28672&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0:   Normal      28672 -&amp;gt;    28672&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: early_node_map[1] active PFN ranges&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.1:     0:        0 -&amp;gt;    28672&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.1: Built 1 zonelists.  Total pages: 28658&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.1: Kernel command line: console=bgcons root=/dev/ram0 lpj=8500000&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.1: PID hash table entries: 4096 (order: 12, 16384 bytes)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Dentry cache hash table entries: 262144 (order: 4, 1048576 bytes)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Inode-cache hash table entries: 131072 (order: 3, 524288 bytes)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Memory: 1826560k available (1408k kernel code, 832k data, 192k init, 0k highmem)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Calibrating delay loop (skipped)... 1700.00 BogoMIPS preset&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Mount-cache hash table entries: 8192&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: CPU 1 done callin...&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: CPU 1 done setup...&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: CPU 1 done timebase take...&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Processor 1 found.&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: CPU 2 done callin...&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: CPU 2 done setup...&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: CPU 2 done timebase take...&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Processor 2 found.&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: CPU 3 done callin...&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: CPU 3 done setup...&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: CPU 3 done timebase take...&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Processor 3 found.&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Brought up 4 CPUs&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: migration_cost=0&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: checking if image is initramfs... it is&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Freeing initrd memory: 2575k freed&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: NET: Registered protocol family 16&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: NET: Registered protocol family 2&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: IP route cache hash table entries: 16384 (order: 0, 65536 bytes)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: TCP established hash table entries: 65536 (order: 3, 524288 bytes)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: TCP bind hash table entries: 32768 (order: 2, 262144 bytes)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: TCP: Hash tables configured (established 65536 bind 32768)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: TCP reno registered&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: fuse init (API version 7.7)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: io scheduler noop registered (default)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: RAMDISK driver initialized: 16 RAM disks of 32768K size 1024 blocksize&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: tun: Universal TUN/TAP device driver, 1.6&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: tun: (C) 1999-2004 Max Krasnyansky &amp;lt;maxk@qualcomm.com&amp;gt;&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: TCP cubic registered&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: NET: Registered protocol family 1&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: NET: Registered protocol family 17&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: NET: Registered protocol family 15&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: Freeing unused kernel memory: 192k init&lt;br /&gt;
iskra:ANL-R00-M1-N12-64 {20}.0: init started: BusyBox(for ZeptoOS Compute Node) v1.12.1 (2009-04-21 16:08:55 CDT)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is very easy to distinguish from the boot log of the default light-weight kernel, which consists of the first four lines ''only''.&lt;br /&gt;
&lt;br /&gt;
The MMCS log file contains other useful information besides the boot log of the compute nodes.  Before the kernel starts booting, the following messages related to the newly submitted job can be found there:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
DBBlockCmd  DatabaseBlockCommandThread started: block ANL-R00-M1-N12-64, user iskra, action 1&lt;br /&gt;
DBBlockCmd  setusername iskra &lt;br /&gt;
iskra       db_allocate ANL-R00-M1-N12-64 &lt;br /&gt;
iskra       DBConsoleController::setAllocating() ANL-R00-M1-N12-64&lt;br /&gt;
iskra       block state C&lt;br /&gt;
iskra       DBConsoleController::addBlock(ANL-R00-M1-N12-64)&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     BlockController::connect()&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     connecting to mcServer at 127.0.0.1:1206&lt;br /&gt;
    Connected to MCServer as iskra@sn1. Client version 3. Server version 3 on fd 101&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     connected to mcServer&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     mcServer target set ANL-R00-M1-N12-64 created&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     mcServer target set ANL-R00-M1-N12-64 opened&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     {0} I/O log file: /bgsys/logs/BGP/R00-M1-N12-J00.log&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     MailboxListener starting&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     DBConsoleController::doneAllocating() ANL-R00-M1-N12-64&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     BlockController::boot_block \&lt;br /&gt;
uloader=/bgsys/argonne-utils/partitions/ANL-R00-M1-N12-64/uloader \&lt;br /&gt;
cnload=/bgsys/argonne-utils/partitions/ANL-R00-M1-N12-64/CNS,/bgsys/argonne-utils/partitions/ANL-R00-M1-N12-64/CNK \&lt;br /&gt;
ioload=/bgsys/argonne-utils/partitions/ANL-R00-M1-N12-64/CNS,/bgsys/argonne-utils/partitions/ANL-R00-M1-N12-64/INK,/bgsys/argonne-utils/partitions/ANL-R00-M1-N12-64/ramdisk &lt;br /&gt;
iskra:ANL-R00-M1-N12-64     boot_block cookie: 587867023 compute_nodes: 64 io_nodes: 1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Of particular relevance is the pathname to the I/O node log file(s) (if it cannot be easily guessed from the partition name) and the pathnames to the kernels and ramdisks used to boot the partition.&lt;br /&gt;
&lt;br /&gt;
After the kernel boot log, the log file will also contain information about subsequent phases of starting a job:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     I/O node initialized: R00-M1-N12-J00&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     DBBlockController::waitBoot(ANL-R00-M1-N12-64) block initialization successful&lt;br /&gt;
iskra       DatabaseBlockCommandThread stopped&lt;br /&gt;
DBJobCmd    DatabaseJobCommandThread started: job 98461, user iskra, action 1&lt;br /&gt;
DBJobCmd    setusername iskra &lt;br /&gt;
iskra       Starting Job 98461&lt;br /&gt;
    New thread 4398305505840, for jobid 98461&lt;br /&gt;
    selectBlock(): ANL-R00-M1-N12-64        iskra(1)        connected state: I owner: iskra&lt;br /&gt;
ANL-R00-M1-N12-64   Jobid is 98461, homedir is /gpfs/home/iskra&lt;br /&gt;
ANL-R00-M1-N12-64   persist: 1&lt;br /&gt;
ANL-R00-M1-N12-64   connecting to mpirun...&lt;br /&gt;
ANL-R00-M1-N12-64   setting mpirun stream, fd=386&lt;br /&gt;
ANL-R00-M1-N12-64   contacting control node 0 at 172.16.3.15:7000&lt;br /&gt;
ANL-R00-M1-N12-64   connected to control node 0 at 172.16.3.15:7000&lt;br /&gt;
ANL-R00-M1-N12-64   Job::load() /bin/sleep &lt;br /&gt;
ANL-R00-M1-N12-64   Job loaded: 98461&lt;br /&gt;
ANL-R00-M1-N12-64   About to start /bin/sleep&lt;br /&gt;
ANL-R00-M1-N12-64   Job 98461 set to RUNNING&lt;br /&gt;
iskra:ANL-R00-M1-N12-64     {20}.0: floating point used in kernel (task=8080cfe0, pc=80017064)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Interactive login==&lt;br /&gt;
&lt;br /&gt;
We are assuming at this point that launching &amp;lt;tt&amp;gt;/bin/sleep&amp;lt;/tt&amp;gt; has been successful and that the &amp;quot;job&amp;quot; is running.  We can now start an interactive session on our BG/P resources.  Probably the most complicated part of this operation is finding the IP address of the I/O node(s).  The allocation of I/O nodes to partitions is fixed, so on a small machine one could simply make a list.  This information is also available in the log files discussed above.&lt;br /&gt;
&lt;br /&gt;
The IP address is printed near the top of the I/O node boot log, as part of the interface configuration of the Ethernet device:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
eth0      Link encap:Ethernet  HWaddr 00:14:5E:7D:0C:57  &lt;br /&gt;
          inet addr:172.16.3.15  Bcast:172.31.255.255  Mask:255.240.0.0&lt;br /&gt;
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1&lt;br /&gt;
          RX packets:880 errors:0 dropped:0 overruns:0 frame:0&lt;br /&gt;
          TX packets:1009 errors:0 dropped:0 overruns:0 carrier:0&lt;br /&gt;
          collisions:0 txqueuelen:1000 &lt;br /&gt;
          RX bytes:3878545 (3.6 Mb)  TX bytes:151458 (147.9 Kb)&lt;br /&gt;
          Interrupt:32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this case, the address is &amp;lt;tt&amp;gt;172.16.3.15&amp;lt;/tt&amp;gt; (the &amp;lt;tt&amp;gt;inet addr&amp;lt;/tt&amp;gt; value).&lt;br /&gt;
&lt;br /&gt;
The IP address is also available from the MMCS log file:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ANL-R00-M1-N12-64   contacting control node 0 at 172.16.3.15:7000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With larger partitions that include multiple I/O nodes, querying the MMCS logfile is probably better, as it will list all the addresses.&lt;br /&gt;
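&lt;br /&gt;
A sketch of such a query follows.  It assumes that every I/O node of the partition produces a &amp;quot;contacting control node&amp;quot; line in the same format as the one shown above; the helper name &amp;lt;tt&amp;gt;ionode_addrs&amp;lt;/tt&amp;gt; is ours.&lt;br /&gt;
&lt;br /&gt;
```shell
# List the distinct I/O node IP addresses recorded for a partition.
# Usage: ionode_addrs PARTITION MMCS_LOG_FILE
ionode_addrs() {
    grep -F -e "$1" "$2" |
        sed -n 's/.*contacting control node [0-9][0-9]* at \([0-9.]*\):[0-9]*.*/\1/p' |
        sort -u
}
```
&lt;br /&gt;
For example, &amp;lt;tt&amp;gt;ionode_addrs ANL-R00-M1-N12-64 /bgsys/logs/BGP/sn1-bgdb0-mmcs_db_server-current.log&amp;lt;/tt&amp;gt; prints one address per line.  Since the log file also covers earlier jobs on the same partition, addresses from previous boots will be included as well; with the fixed allocation of I/O nodes to partitions this should be harmless.&lt;br /&gt;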
&lt;br /&gt;
Once the IP address is known, one can simply use SSH:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
iskra@login1.surveyor:~&amp;gt; ssh 172.16.3.15&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
BusyBox v1.4.2 (2008-10-04 00:02:35 UTC) Built-in shell (ash)&lt;br /&gt;
Enter 'help' for a list of built-in commands.&lt;br /&gt;
&lt;br /&gt;
/gpfs/home/iskra $ hostname&lt;br /&gt;
ion-15&lt;br /&gt;
/gpfs/home/iskra $ &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SSH is supposed to let the partition owner onto the I/O node without asking for a password (no other unprivileged user will be allowed on the node), but that might require site-specific customization to work properly (see [[ZOID#The /bin.rd/update_passwd_file.sh file|update_passwd_file.sh]]).  Until this is set up, one might prefer to log on as root (&amp;lt;tt&amp;gt;ssh -l root&amp;lt;/tt&amp;gt;), using the password provided while building the ZeptoOS environment.&lt;br /&gt;
&lt;br /&gt;
Also, even when the partition owner is correctly set up, there will be a time window while booting the I/O node when the SSH daemon is already running, but a job has not yet been started; during that window, the partition owner cannot log on. If that happens, wait a few seconds and try again.&lt;br /&gt;
&lt;br /&gt;
Here's part of the &amp;lt;tt&amp;gt;ps&amp;lt;/tt&amp;gt; output from the I/O node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/gpfs/home/iskra $ ps -ef&lt;br /&gt;
UID        PID  PPID  C STIME TTY          TIME CMD&lt;br /&gt;
[...]&lt;br /&gt;
65534       98     1  0 16:09 ?        00:00:00 /sbin/portmap&lt;br /&gt;
root       108    19  0 16:09 ?        00:00:00 [rpciod/0]&lt;br /&gt;
root       109    19  0 16:09 ?        00:00:00 [rpciod/1]&lt;br /&gt;
root       110    19  0 16:09 ?        00:00:00 [rpciod/2]&lt;br /&gt;
root       111    19  0 16:09 ?        00:00:00 [rpciod/3]&lt;br /&gt;
root       570     1  0 16:09 ?        00:00:00 /sbin/syslogd&lt;br /&gt;
root       577     1  0 16:09 ?        00:00:00 /sbin/klogd -c 1 -x -x&lt;br /&gt;
ntp        653     1  0 16:09 ?        00:00:00 /usr/sbin/ntpd -p /var/run/ntpd.&lt;br /&gt;
root       688     1  0 16:09 ?        00:00:00 [lockd]&lt;br /&gt;
root       775     1  0 16:09 ?        00:00:00 /bgsys/iosoft/pvfs2/sbin/pvfs2-c&lt;br /&gt;
root       776   775  0 16:09 ?        00:00:00 pvfs2-client-core --child -a 5 -&lt;br /&gt;
root       833     1  0 16:10 ?        00:00:00 /usr/sbin/sshd -o PidFile=/var/r&lt;br /&gt;
root      1016     1  0 16:10 ?        00:00:00 /bin/ksh /usr/lpp/mmfs/bin/runmm&lt;br /&gt;
root      1079     1  0 16:10 ?        00:00:00 [nfsWatchKproc]&lt;br /&gt;
root      1080     1  0 16:10 ?        00:00:00 [gpfsSwapdKproc]&lt;br /&gt;
root      1146  1016  0 16:10 ?        00:00:01 /usr/lpp/mmfs/bin//mmfsd&lt;br /&gt;
root      1153     1  0 16:10 ?        00:00:00 [mmkproc]&lt;br /&gt;
root      1152     1  0 16:10 ?        00:00:00 [mmkproc]&lt;br /&gt;
root      1154     1  0 16:10 ?        00:00:00 [mmkproc]&lt;br /&gt;
iskra     2810     1 98 16:10 ?        00:04:09 /bin.rd/zoid -a 8 -m unix_impl.s&lt;br /&gt;
root      2823     1  0 16:10 ?        00:00:00 /bin/sh&lt;br /&gt;
root      3328   833  0 16:10 ?        00:00:00 sshd: iskra [priv]             &lt;br /&gt;
iskra     3332  3328  0 16:10 ?        00:00:00 sshd: iskra@ttyp0              &lt;br /&gt;
iskra     3333  3332  0 16:10 ttyp0    00:00:00 -sh&lt;br /&gt;
iskra     3346  3333  0 16:14 ttyp0    00:00:00 ps -ef&lt;br /&gt;
/gpfs/home/iskra $ &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The I/O nodes run a small Linux setup with the root filesystem in the ramdisk.  Custom processes can be started, just like on any ordinary Linux node.  In the example above, the running processes are mostly a few system daemons and the remote filesystem clients (GPFS, PVFS).  Please verify at this stage that the remote filesystems have been mounted correctly.&lt;br /&gt;
&lt;br /&gt;
One custom process running on the node is [[ZOID]], the I/O forwarding and job control daemon, which enables the communication with the compute nodes.  One of the facilities offered by ZOID is IP forwarding between the I/O node and the compute nodes, implemented using the virtual network tunneling device available in Linux:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/gpfs/home/iskra $ ifconfig tun0&lt;br /&gt;
tun0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  &lt;br /&gt;
          inet addr:192.168.1.254  P-t-P:192.168.1.254  Mask:255.255.255.255&lt;br /&gt;
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:65535  Metric:1&lt;br /&gt;
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0&lt;br /&gt;
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0&lt;br /&gt;
          collisions:0 txqueuelen:500 &lt;br /&gt;
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)&lt;br /&gt;
/gpfs/home/iskra $ &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
At least on Argonne machines, with a 64:1 ratio of compute nodes to I/O nodes, compute nodes have addresses &amp;lt;tt&amp;gt;192.168.1.1&amp;lt;/tt&amp;gt; to &amp;lt;tt&amp;gt;192.168.1.64&amp;lt;/tt&amp;gt; (the last octet of the address is the [[FAQ#Pset rank|pset rank]]).  Somewhat confusingly, the first compute node (compute node &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;) has IP address &amp;lt;tt&amp;gt;192.168.1.64&amp;lt;/tt&amp;gt;, so if one submits a one-node job as we did, that is the IP address that needs to be used to log on to that sole running compute node.  The IP address of the second compute node is... &amp;lt;tt&amp;gt;192.168.1.59&amp;lt;/tt&amp;gt; (do not blame us &amp;amp;#8211; blame IBM&amp;amp;nbsp;:-).&lt;br /&gt;
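&lt;br /&gt;
The mapping from pset rank to address is simple enough to put into a helper.  The sketch below assumes the Argonne configuration described above (ranks 1 to 64 on the &amp;lt;tt&amp;gt;192.168.1.0&amp;lt;/tt&amp;gt; virtual network); the function name &amp;lt;tt&amp;gt;cn_tunnel_addr&amp;lt;/tt&amp;gt; is ours.&lt;br /&gt;
&lt;br /&gt;
```shell
# Print the virtual network address of the compute node with the given
# pset rank; fail for anything that is not a plain number from 1 to 64.
cn_tunnel_addr() {
    case $1 in
        ''|0*|*[!0-9]*) return 1 ;;     # empty, leading zero, or non-digit
    esac
    if [ "$1" -ge 1 ]; then
        if [ "$1" -le 64 ]; then
            printf '192.168.1.%s\n' "$1"
            return 0
        fi
    fi
    return 1
}
```
&lt;br /&gt;
From the I/O node, &amp;lt;tt&amp;gt;cn_tunnel_addr 64&amp;lt;/tt&amp;gt; thus yields the address of the first compute node.  On a compute node, the rank itself should be available as &amp;lt;tt&amp;gt;BG_RANK_IN_PSET&amp;lt;/tt&amp;gt; in &amp;lt;tt&amp;gt;/proc/personality.sh&amp;lt;/tt&amp;gt;.&lt;br /&gt;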
&lt;br /&gt;
The compute nodes are running a &amp;lt;tt&amp;gt;telnet&amp;lt;/tt&amp;gt; daemon, and no password is required to log on to them:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/gpfs/home/iskra $ telnet 192.168.1.64&lt;br /&gt;
&lt;br /&gt;
Entering character mode&lt;br /&gt;
Escape character is '^]'.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
BusyBox(for ZeptoOS Compute Node) v1.12.1 (2009-04-21 16:08:55 CDT) built-in shell (ash)&lt;br /&gt;
Enter 'help' for a list of built-in commands.&lt;br /&gt;
&lt;br /&gt;
~ # &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The IP address of the I/O node on this virtual network is &amp;lt;tt&amp;gt;192.168.1.254&amp;lt;/tt&amp;gt;. The network is local to each I/O node, so for larger jobs, there will be multiple distinct virtual networks that cannot communicate with each other, and the same IP addresses will be reused on each of them.&lt;br /&gt;
&lt;br /&gt;
Here's part of the &amp;lt;tt&amp;gt;ps&amp;lt;/tt&amp;gt; output from the compute node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
~ # ps -ef&lt;br /&gt;
  PID USER       VSZ STAT COMMAND&lt;br /&gt;
[...]&lt;br /&gt;
   34 root      5440 S    /bin/sh /etc/init.d/rc.sysinit &lt;br /&gt;
   44 root      5504 S    /sbin/telnetd -l /bin/sh &lt;br /&gt;
   47 root      6528 S    /sbin/inetd &lt;br /&gt;
   48 root     46400 R N  /sbin/control &lt;br /&gt;
   62 root      7872 S    /bin/zoid-fuse -o allow_other -s /fuse &lt;br /&gt;
  116 root      5248 S    /bin/sleep 3600 &lt;br /&gt;
  118 root      5504 S    /bin/sh &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Compute nodes have an even more stripped-down environment than the I/O nodes.  There are no user accounts &amp;amp;#8211; everything runs as root, including the application processes.  This is not a security concern, because the only practical way for a compute node to communicate with the outside world is through the I/O node, and I/O nodes ''do'' enforce user-level access control.&lt;br /&gt;
&lt;br /&gt;
There are two custom processes running on each compute node:&lt;br /&gt;
&lt;br /&gt;
'''control''' is a job management daemon responsible for tasks such as the launching of application processes, for the forwarding of stdin/out/err data, and for the management of the virtual network tunneling device from the compute node side.  Do not interfere with this process in any way; this would likely make the node inaccessible.&lt;br /&gt;
&lt;br /&gt;
'''zoid-fuse''' is a FUSE ([http://fuse.sourceforge.net/ Filesystem in Userspace]) client responsible for making the filesystems from the I/O nodes available to ordinary POSIX-compliant processes running on the compute nodes.  The whole filesystem namespace from the I/O nodes is made available on the compute nodes under &amp;lt;tt&amp;gt;/fuse/&amp;lt;/tt&amp;gt;, and symbolic links such as &amp;lt;tt&amp;gt;/home -&amp;gt; /fuse/home&amp;lt;/tt&amp;gt; are set up to keep the login and I/O node pathnames valid on the compute nodes.  Please verify that this is correctly set up.  We do not foresee a need to change this setup, but should that prove necessary, the responsible &amp;lt;tt&amp;gt;fuse-start&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;fuse-stop&amp;lt;/tt&amp;gt; scripts can be found under &amp;lt;tt&amp;gt;ramdisk/CN/tree/bin&amp;lt;/tt&amp;gt;.&lt;br /&gt;
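&lt;br /&gt;
One way to check the symbolic links is sketched below.  This is only an illustration under our own assumptions: the helper name &amp;lt;tt&amp;gt;check_fuse_link&amp;lt;/tt&amp;gt; is hypothetical and not part of ZeptoOS, and we assume that the compute node shell provides &amp;lt;tt&amp;gt;readlink&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper (not part of ZeptoOS): succeed if $1 is a symbolic
# link whose target starts with the prefix $2 (default: /fuse/).
check_fuse_link() {
    link=$1
    prefix=${2:-/fuse/}
    # Fail right away if the path is not a symbolic link at all.
    [ -L "$link" ] || return 1
    target=$(readlink "$link")
    case $target in
        "$prefix"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use on a compute node:
#   check_fuse_link /home || echo "/home does not point into /fuse"
```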
&lt;br /&gt;
==Shell script job==&lt;br /&gt;
&lt;br /&gt;
Assuming that the above steps have been successful, one can now test running a simple job from a network filesystem, such as one's home directory.&lt;br /&gt;
&lt;br /&gt;
Here is a sample shell script to try:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
&lt;br /&gt;
. /proc/personality.sh&lt;br /&gt;
&lt;br /&gt;
while true; do&lt;br /&gt;
    echo &amp;quot;Node $BG_RANK_IN_PSET running (stdout)&amp;quot;&lt;br /&gt;
    echo &amp;quot;Node $BG_RANK_IN_PSET running (stderr)&amp;quot; 1&amp;gt;&amp;amp;2&lt;br /&gt;
    sleep 10&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(please see the [[FAQ#Pset rank|FAQ]] for the explanation of &amp;lt;tt&amp;gt;/proc/personality.sh&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;BG_RANK_IN_PSET&amp;lt;/tt&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
Please create the script file on the network filesystem, set the executable bit (&amp;lt;tt&amp;gt;chmod 755&amp;lt;/tt&amp;gt;) and submit it.  Verify that the script starts correctly and that at least the standard error output is visible immediately.  The script prints a line of output from each node every ten seconds.  It does so both to the standard output and to the standard error, because, depending on software configuration, the standard output stream could be buffered.  If that is the case, kill the job and verify that the standard output data did appear.&lt;br /&gt;
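&lt;br /&gt;
To check that every node is producing output, one can count the lines that each pset rank has written so far.  The following is a minimal sketch, assuming the job's standard error has been saved to a file; the helper name &amp;lt;tt&amp;gt;count_node_lines&amp;lt;/tt&amp;gt; and the file name in the example are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: print how many lines pset rank $2 has written
# to the saved output file $1.  The pattern matches the lines printed
# by the sample script above ("Node <rank> running ...").
count_node_lines() {
    grep -c "^Node $2 running" "$1"
}

# Example (the file name is an assumption):
#   count_node_lines 12345.error 64
```
&lt;br /&gt;
With a ten-second sleep in the script, the count for each rank should grow by roughly one every ten seconds while the job is running.&lt;br /&gt;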
&lt;br /&gt;
==MPI and OpenMP jobs==&lt;br /&gt;
&lt;br /&gt;
The final tests involve parallel programming jobs, respectively MPI and OpenMP.  Use the test programs provided with the distribution. From the top level directory:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd comm/testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Compiling===&lt;br /&gt;
&lt;br /&gt;
The programs can be compiled on a login node using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ /path/to/install/bin/zmpicc -o mpi-test-linux mpi-test.c&lt;br /&gt;
$ /path/to/install/bin/zmpixlc_r -qsmp=omp -o omp-test-linux omp-test.c&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Submitting===&lt;br /&gt;
&lt;br /&gt;
Submit the MPI test like any other job; use one of the below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n &amp;lt;number-of-processes&amp;gt; $PWD/mpi-test-linux&lt;br /&gt;
$ qsub --kernel &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n &amp;lt;number-of-processes&amp;gt;  $PWD/mpi-test-linux&lt;br /&gt;
$ mpirun -verbose 1 -partition &amp;lt;partition-name&amp;gt; -np &amp;lt;number-of-processes&amp;gt; -timeout &amp;lt;time&amp;gt; \&lt;br /&gt;
-cwd $PWD -exe $PWD/mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For the OpenMP test, we pass the number of OpenMP threads to use in the &amp;lt;tt&amp;gt;OMP_NUM_THREADS&amp;lt;/tt&amp;gt; variable:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -k &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n 1 -e OMP_NUM_THREADS=&amp;lt;num&amp;gt; $PWD/omp-test-linux&lt;br /&gt;
$ qsub --kernel &amp;lt;profile-name&amp;gt; -t &amp;lt;time&amp;gt; -n 1 --env OMP_NUM_THREADS=&amp;lt;num&amp;gt; $PWD/omp-test-linux&lt;br /&gt;
$ mpirun -verbose 1 -partition &amp;lt;partition-name&amp;gt; -np 1 -timeout &amp;lt;time&amp;gt; \&lt;br /&gt;
-cwd $PWD -env OMP_NUM_THREADS=&amp;lt;num&amp;gt; -exe $PWD/omp-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The MPI test benchmarks the performance of various MPI operations.  The OpenMP test is just a parallel &amp;quot;Hello world&amp;quot;.&lt;br /&gt;
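&lt;br /&gt;
To try the OpenMP test with several thread counts, the &amp;lt;tt&amp;gt;cqsub&amp;lt;/tt&amp;gt; command lines can be generated in a loop.  The sketch below only prints the commands so that they can be reviewed before submission; the helper name &amp;lt;tt&amp;gt;print_omp_sweep&amp;lt;/tt&amp;gt; is hypothetical, and the profile name, time limit, and thread counts in the example are placeholders.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one cqsub command line per
# OpenMP thread count.
#   $1      kernel profile name
#   $2      time limit, as passed to cqsub -t
#   $3...   thread counts to try
print_omp_sweep() {
    profile=$1
    minutes=$2
    shift 2
    for n in "$@"; do
        echo "cqsub -k $profile -t $minutes -n 1 -e OMP_NUM_THREADS=$n $PWD/omp-test-linux"
    done
}

# Example (all values are placeholders):
#   print_omp_sweep my-profile 10 1 2 4
```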
&lt;br /&gt;
----&lt;br /&gt;
[[Installation]] | [[ZeptoOS_Documentation|Top]] | [[MPICH, DCMF, and SPI]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=515</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=515"/>
		<updated>2009-05-05T18:59:04Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* MMCS console */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
The BlueGene/P system is capable of assigning different boot images to each partition.  &lt;br /&gt;
The system allows us to specify a loader program, CN images, and ION images.&lt;br /&gt;
The loader program is loaded into main memory via the JTAG network and executed first. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/uloader is the only loader program available; it is not open source.&lt;br /&gt;
The system allows us to specify multiple CN images. By default, the Common Node Services (CNS) &lt;br /&gt;
image and the CN kernel image (IBM CNK) are loaded into the CN's main memory, in that order. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/cns is the only choice for CNS; it is not open source either. &lt;br /&gt;
We can also specify multiple ION images. The defaults are CNS, the IBM ION Linux kernel image, and the ION Linux ramdisk. They are also loaded into the ION's main memory in order. &lt;br /&gt;
&lt;br /&gt;
To enable Zepto features, we need to boot the Zepto CN and ION kernels on the partition that we use. &lt;br /&gt;
This section describes how to assign and boot the Zepto images.&lt;br /&gt;
&lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, it is easy to &lt;br /&gt;
boot the Zepto kernel for your job.&lt;br /&gt;
 &lt;br /&gt;
All you need to do is make a directory in the kernel profile directory&lt;br /&gt;
and create a few symbolic links that point to the Zepto images.&lt;br /&gt;
&lt;br /&gt;
On the ANL BGP, /bgsys/argonne-utils/profiles/ is the kernel profile directory.&lt;br /&gt;
Here are the concrete steps to create a new kernel profile.  We assume that &lt;br /&gt;
you have already built your Zepto kernel images and have write permission to the kernel profile directory.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: your Zepto images must be readable by others (world-readable); otherwise your &lt;br /&gt;
job will fail. Please double-check!&lt;br /&gt;
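&lt;br /&gt;
The permission check can be scripted.  The following is a minimal sketch, not part of the Zepto distribution: the function name list_unreadable_images is hypothetical, and we assume a find command that supports -L (follow symbolic links).  It prints every image reachable from the profile directory that lacks the read bit for others.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: list regular files under profile directory $1
# (following symbolic links) that lack the read permission bit for
# "others".  No output means every image passed this check.
list_unreadable_images() {
    find -L "$1" -type f ! -perm -004 -print
}

# Example:
#   list_unreadable_images KERNEL_PROFILE_DIR/YOUR_PROFILE_NAME
```
&lt;br /&gt;
Note that this looks only at the permission bits of the files themselves; the directories leading to the images must also be accessible to other users.&lt;br /&gt;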
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some extra &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows its help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite existing profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly set up your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
Cobalt option. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
&lt;br /&gt;
Cobalt is open source software, but there is no guarantee that your system has the Cobalt scheduler installed. &lt;br /&gt;
If you don't have Cobalt, you will probably need to use the Midplane Management Control System (MMCS). &lt;br /&gt;
This section explains how to assign and boot your own kernel images using MMCS.&lt;br /&gt;
&lt;br /&gt;
Here is a brief summary of MMCS:&lt;br /&gt;
* available on all BGP systems&lt;br /&gt;
* the lowest-level control mechanism for BGP partitions&lt;br /&gt;
* allocates, frees, and queries blocks (partitions)&lt;br /&gt;
* checks status&lt;br /&gt;
* assigns boot images&lt;br /&gt;
* provides low-level debug commands&lt;br /&gt;
&lt;br /&gt;
Because MMCS is a low-level interface, using it requires administrator-level permission.&lt;br /&gt;
You also need to reserve a partition (or block).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start MMCS.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Please make a note of the current configuration. You will need to revert the &lt;br /&gt;
blockinfo to the original configuration once you are done with the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
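&lt;br /&gt;
Rather than relying on memory, the getblockinfo output can be pasted into a file and turned back into the matching setblockinfo command later.  The following is a minimal sketch under our own assumptions: the function names blockinfo_field and restore_command are hypothetical, and the saved file is assumed to contain the mloader, cnloadImg and ioloadImg lines exactly as printed above.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: print the value of field $1 (for example
# "mloader") from saved getblockinfo output in file $2, with trailing
# blanks removed.
blockinfo_field() {
    sed -n "s/^$1: *//p" "$2" | sed 's/ *$//'
}

# Hypothetical helper: print the setblockinfo command that restores
# block $1 to the configuration saved in file $2.  The argument order
# (loader, CN images, ION images) follows the setblockinfo examples
# on this page.
restore_command() {
    printf 'setblockinfo %s %s %s %s\n' "$1" \
        "$(blockinfo_field mloader "$2")" \
        "$(blockinfo_field cnloadImg "$2")" \
        "$(blockinfo_field ioloadImg "$2")"
}

# Example (the file name is an assumption):
#   restore_command BGP_BLOCK_NAME saved-blockinfo.txt
```
&lt;br /&gt;
The printed command is meant to be reviewed and then typed at the MMCS console prompt; the sketch does not talk to MMCS itself.&lt;br /&gt;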
 &lt;br /&gt;
Assign the Zepto images to a partition: &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have correctly configured a partition with the Zepto kernels, &lt;br /&gt;
they will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore to the original configuration(Don't forget!!!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished your work with the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=514</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=514"/>
		<updated>2009-05-05T18:53:39Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Cobalt installed system */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
The BlueGene/P system is capable of assigning different boot images to each partition.  &lt;br /&gt;
The system allows us to specify a loader program, CN images, and ION images.&lt;br /&gt;
The loader program is loaded into main memory via the JTAG network and executed first. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/uloader is the only loader program available; it is not open source.&lt;br /&gt;
The system allows us to specify multiple CN images. By default, the Common Node Services (CNS) &lt;br /&gt;
image and the CN kernel image (IBM CNK) are loaded into the CN's main memory, in that order. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/cns is the only choice for CNS; it is not open source either. &lt;br /&gt;
We can also specify multiple ION images. The defaults are CNS, the IBM ION Linux kernel image, and the ION Linux ramdisk. They are also loaded into the ION's main memory in order. &lt;br /&gt;
&lt;br /&gt;
To enable Zepto features, we need to boot the Zepto CN and ION kernels on the partition that we use. &lt;br /&gt;
This section describes how to assign and boot the Zepto images.&lt;br /&gt;
&lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, it is easy to &lt;br /&gt;
boot the Zepto kernel for your job.&lt;br /&gt;
 &lt;br /&gt;
All you need to do is make a directory in the kernel profile directory&lt;br /&gt;
and create a few symbolic links that point to the Zepto images.&lt;br /&gt;
&lt;br /&gt;
On the ANL BGP, /bgsys/argonne-utils/profiles/ is the kernel profile directory.&lt;br /&gt;
Here are the concrete steps to create a new kernel profile.  We assume that &lt;br /&gt;
you have already built your Zepto kernel images and have write permission to the kernel profile directory.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: your Zepto images must be readable by others (world-readable); otherwise your &lt;br /&gt;
job will fail. Please double-check!&lt;br /&gt;
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some extra &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows its help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite existing profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly set up your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
Cobalt option. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
&lt;br /&gt;
This section explains how to assign and boot your own kernel images using the Midplane Management Control System (MMCS). &lt;br /&gt;
MMCS is the lowest-level control mechanism for BGP partitions and is installed on all BGP systems. &lt;br /&gt;
Here is a brief summary of what MMCS provides:&lt;br /&gt;
* allocate, free or query of block(partition)&lt;br /&gt;
* status check&lt;br /&gt;
* assign boot images&lt;br /&gt;
* low level debug command&lt;br /&gt;
&lt;br /&gt;
Because MMCS is a low-level interface, using it requires administrator-level permission.&lt;br /&gt;
You also need to reserve a partition.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start MMCS.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Make a note of the current configuration. You will need to revert the &lt;br /&gt;
blockinfo to the original configuration once you are done with the &lt;br /&gt;
Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Assign the Zepto images to a partition: &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have correctly configured a partition with the Zepto kernels, &lt;br /&gt;
they will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore to the original configuration(Don't forget!!!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished your work with the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=513</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=513"/>
		<updated>2009-05-05T18:50:56Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Introduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
The BlueGene/P system is capable of assigning different boot images to each partition.  &lt;br /&gt;
The system allows us to specify a loader program, CN images, and ION images.&lt;br /&gt;
The loader program is loaded into main memory via the JTAG network and executed first. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/uloader is the only loader program available; it is not open source.&lt;br /&gt;
The system allows us to specify multiple CN images. By default, the Common Node Services (CNS) &lt;br /&gt;
image and the CN kernel image (IBM CNK) are loaded into the CN's main memory, in that order. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/cns is the only choice for CNS; it is not open source either. &lt;br /&gt;
We can also specify multiple ION images. The defaults are CNS, the IBM ION Linux kernel image, and the ION Linux ramdisk. They are also loaded into the ION's main memory in order. &lt;br /&gt;
&lt;br /&gt;
To enable Zepto features, we need to boot the Zepto CN and ION kernels on the partition that we use. &lt;br /&gt;
This section describes how to assign and boot the Zepto images.&lt;br /&gt;
&lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, it is easy to &lt;br /&gt;
boot the Zepto kernel for your computational job.&lt;br /&gt;
 &lt;br /&gt;
All you need to do is make a directory in the kernel profile directory&lt;br /&gt;
and create a few symbolic links that point to the Zepto images.&lt;br /&gt;
On the ANL BGP, /bgsys/argonne-utils/profiles/ is the kernel profile directory.&lt;br /&gt;
Here are the concrete steps to create a new kernel profile.  We assume that &lt;br /&gt;
you have already built your Zepto kernel images and have write permission to the kernel profile directory.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: your Zepto images must be readable by others (world-readable); otherwise your &lt;br /&gt;
job will fail. Please double-check!&lt;br /&gt;
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some extra &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows its help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite existing profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly configured your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
Cobalt option. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
&lt;br /&gt;
This section explains how to assign and boot your own kernel images using the Midplane Management Control System (MMCS). &lt;br /&gt;
MMCS is the lowest-level control mechanism for BGP partitions and is installed on all BGP systems. &lt;br /&gt;
Here is a brief summary of what MMCS provides:&lt;br /&gt;
* allocate, free or query of block(partition)&lt;br /&gt;
* status check&lt;br /&gt;
* assign boot images&lt;br /&gt;
* low level debug command&lt;br /&gt;
&lt;br /&gt;
Because MMCS is a low-level interface, using it requires administrator-level permission.&lt;br /&gt;
You also need to reserve a partition.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start MMCS.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Make a note of the current configuration. You will need to revert the &lt;br /&gt;
blockinfo to the original configuration once you are done with the &lt;br /&gt;
Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Assign the Zepto images to a partition: &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have correctly configured a partition with the Zepto kernels, &lt;br /&gt;
they will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore to the original configuration(Don't forget!!!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished your work with the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=497</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=497"/>
		<updated>2009-05-04T16:35:14Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Introduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
The BlueGene/P system is capable of assigning different boot images to each partition.  &lt;br /&gt;
The system allows us to specify a loader program, CN images, and ION images.&lt;br /&gt;
The loader program is loaded into main memory via the JTAG network and executed first. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/uloader is the only loader program available, and it is not open source.&lt;br /&gt;
For CN images, the system allows us to specify multiple images. By default, the Common Node Services (CNS) &lt;br /&gt;
image and the CN kernel image (IBM CNK) are specified. They are loaded into main memory in order. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/cns is the only choice for CNS. No source code for CNS is available. &lt;br /&gt;
We can also specify multiple ION images. By default, CNS, the IBM ION Linux kernel image, and the ION Linux ramdisk are specified. They are also loaded in order. &lt;br /&gt;
&lt;br /&gt;
To enable Zepto feature, we need to boot Zepto CN kernel and ION kernel in a partition that we use. &lt;br /&gt;
We describe how to assign and boot Zepto images in this section.&lt;br /&gt;
&lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, it is easy to &lt;br /&gt;
boot the Zepto kernel for your computational job.&lt;br /&gt;
 &lt;br /&gt;
All you need to do is make a directory in the kernel profile directory&lt;br /&gt;
and create a few symbolic links that point to the Zepto images.&lt;br /&gt;
On the ANL BGP, /bgsys/argonne-utils/profiles/ is the kernel profile directory.&lt;br /&gt;
Here are the concrete steps to create a new kernel profile, assuming that &lt;br /&gt;
you have already built your Zepto kernel images and have write permission to the kernel profile directory.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: Your Zepto images must be readable by others; otherwise your &lt;br /&gt;
job will fail. Please double-check!&lt;br /&gt;
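 &lt;br /&gt;
The readability requirement can be checked before a job is submitted. The following is a minimal sketch, not part of the Zepto distribution: the function name check_world_readable is hypothetical, and it inspects only the permission bits of each file (following symbolic links), not those of the parent directories.&lt;br /&gt;
 &lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper (not shipped with Zepto): report files whose
# "other" read permission bit is not set. Symbolic links are followed,
# because profile entries are usually links to the real images.
check_world_readable() {
    status=0
    for f in "$@"; do
        # find prints the path only if the permission test matches
        if [ -z "$(find -L "$f" -prune -perm -004)" ]; then
            echo "not readable by others: $f"
            status=1
        fi
    done
    return $status
}
```
 &lt;br /&gt;
For example, running check_world_readable CNK INK ramdisk inside the profile directory returns a nonzero status if any image lacks the read bit for others.&lt;br /&gt;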
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh &lt;br /&gt;
that essentially does what is described above but has some extra &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows a help message. Use -c if you &lt;br /&gt;
actually need to copy images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion, or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite existing profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly configured your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via Cobalt's &lt;br /&gt;
-k option. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
&lt;br /&gt;
This section explains how to assign and boot your own kernel images using the Midplane Management Control System (MMCS). &lt;br /&gt;
MMCS is the lowest-level control mechanism for BGP partitions and is installed on all BGP systems. &lt;br /&gt;
Here is a brief summary of what MMCS can do:&lt;br /&gt;
* allocate, free, or query a block (partition)&lt;br /&gt;
* check status&lt;br /&gt;
* assign boot images&lt;br /&gt;
* run low-level debug commands&lt;br /&gt;
&lt;br /&gt;
Because MMCS is a low-level interface, using it requires administrator-level permission.&lt;br /&gt;
You also need to reserve a partition.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start MMCS.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Make a note of the current configuration. You need to revert the &lt;br /&gt;
block info to the original configuration after you have finished using &lt;br /&gt;
the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
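 &lt;br /&gt;
Because the block info has to be restored later, it can help to save the getblockinfo output to a file and generate the matching setblockinfo command from it. The following is a sketch under assumptions: make_restore_cmd is a hypothetical name, awk is assumed to be available where you run it, and the input is assumed to use the mloader:, cnloadImg: and ioloadImg: labels shown in the output above. The argument order follows the setblockinfo examples on this page.&lt;br /&gt;
 &lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: read saved getblockinfo output on stdin and
# print the setblockinfo command that restores that configuration.
# $1 is the block (partition) name.
make_restore_cmd() {
    awk -v block="$1" '
        $1 == "mloader:"   { mloader = $2 }
        $1 == "cnloadImg:" { cn = $2 }
        $1 == "ioloadImg:" { io = $2 }
        END { print "setblockinfo", block, mloader, cn, io }
    '
}
```
 &lt;br /&gt;
The printed line can be pasted into the MMCS console when you are done with the Zepto kernel. Check it against your saved output before using it.&lt;br /&gt;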
 &lt;br /&gt;
Assign the Zepto images to a partition: &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have configured a partition with Zepto kernels correctly, &lt;br /&gt;
the Zepto kernels will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore the original configuration (don't forget!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished working with the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Limitations&amp;diff=489</id>
		<title>Limitations</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Limitations&amp;diff=489"/>
		<updated>2009-05-01T19:23:33Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* No Universal Performance Counter(UPC) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==Known Bugs / Current Limitations==&lt;br /&gt;
&lt;br /&gt;
===No VN/DUAL mode in MPI===&lt;br /&gt;
&lt;br /&gt;
Blue Gene/P supports three job modes:&lt;br /&gt;
&lt;br /&gt;
* SMP (one application process per node)&lt;br /&gt;
* DUAL (two application processes per node)&lt;br /&gt;
* VN (four application processes per node)&lt;br /&gt;
&lt;br /&gt;
In Cobalt, the job mode can be specified using &amp;lt;tt&amp;gt;cqsub -m&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;qsub --mode&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
ZeptoOS will launch the appropriate number of application processes per node as determined by the mode; however, MPI jobs currently only work in the SMP mode.  We plan to fix this problem in the near future.&lt;br /&gt;
&lt;br /&gt;
===No Universal Performance Counter (UPC)===&lt;br /&gt;
&lt;br /&gt;
UPC is not available in this release. Thus, PAPI won't work, since it depends on UPC.&lt;br /&gt;
We are currently working on making UPC available in our Linux environment.&lt;br /&gt;
&lt;br /&gt;
===MPI-IO support===&lt;br /&gt;
&lt;br /&gt;
Due to the limitations of FUSE (the compute-node infrastructure we use for I/O forwarding of POSIX calls), pathnames passed to MPI-IO routines need to be prefixed with &amp;lt;tt&amp;gt;bglockless:&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;bgl:&amp;lt;/tt&amp;gt; (the latter will not work with PVFS; the former should work with all filesystems).&lt;br /&gt;
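&lt;br /&gt;
If file names reach the application from a job script, the prefix can be added there. The following is a small sketch: mpiio_path is a hypothetical helper, not part of ZeptoOS, and it prepends bglockless: unless the path already carries one of the two prefixes.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: add the bglockless: prefix to a path unless it
# already starts with bglockless: or bgl:.
mpiio_path() {
    case "$1" in
        bglockless:*|bgl:*) printf '%s\n' "$1" ;;
        *)                  printf 'bglockless:%s\n' "$1" ;;
    esac
}
```
&lt;br /&gt;
The sketch defaults to bglockless: because, as noted above, that prefix should work with all filesystems.&lt;br /&gt;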
&lt;br /&gt;
In general, the file I/O performance with ZeptoOS is not very good, again, due to the limitations of FUSE.  Within the DOE FastOS [http://www.iofsl.org/ I/O forwarding project] we are working on a new, high performance I/O forwarding infrastructure for parallel applications and as this work matures, we will integrate it into ZeptoOS.&lt;br /&gt;
&lt;br /&gt;
===Some MPI jobs hang when they are killed===&lt;br /&gt;
&lt;br /&gt;
We have been seeing this a lot with &amp;lt;tt&amp;gt;cnip&amp;lt;/tt&amp;gt;, the IP-over-torus program.  This program runs &amp;quot;forever&amp;quot;, so it eventually needs to be killed.  When that happens, it will frequently hang one or more compute nodes, preventing the partition from shutting down cleanly.&lt;br /&gt;
&lt;br /&gt;
However, the service node will force a shutdown after a timeout of five minutes, so in practice this is not a significant problem.  Also, we have not seen this problem with ordinary MPI applications (unlike most MPI applications, &amp;lt;tt&amp;gt;cnip&amp;lt;/tt&amp;gt; is multithreaded and communicates a lot with the kernel).&lt;br /&gt;
&lt;br /&gt;
==Features Coming Soon==&lt;br /&gt;
&lt;br /&gt;
===Multiple MPI jobs one after another===&lt;br /&gt;
&lt;br /&gt;
Since ZeptoOS supports submitting a shell script as a compute node &amp;quot;application&amp;quot;, it is possible to run multiple &amp;quot;real&amp;quot; applications from within one job:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
&lt;br /&gt;
for i in 1 2 3 4 5 6 7 8 9 10; do&lt;br /&gt;
    /path/to/real/application&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This does work for sequential applications, but not for those that are linked with MPI; with MPI, an application can only be run once.  However, we have experimental code that lifts this limitation, and we plan to include it in the next release.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Limitations&amp;diff=488</id>
		<title>Limitations</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Limitations&amp;diff=488"/>
		<updated>2009-05-01T18:56:48Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* No VN/DUAL mode in MPI */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[ZeptoOS_Documentation|Top]]&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==Known Bugs / Current Limitations==&lt;br /&gt;
&lt;br /&gt;
===No VN/DUAL mode in MPI===&lt;br /&gt;
&lt;br /&gt;
Blue Gene/P supports three job modes:&lt;br /&gt;
&lt;br /&gt;
* SMP (one application process per node)&lt;br /&gt;
* DUAL (two application processes per node)&lt;br /&gt;
* VN (four application processes per node)&lt;br /&gt;
&lt;br /&gt;
In Cobalt, the job mode can be specified using &amp;lt;tt&amp;gt;cqsub -m&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;qsub --mode&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
ZeptoOS will launch the appropriate number of application processes per node as determined by the mode; however, MPI jobs currently only work in the SMP mode.  We plan to fix this problem in the near future.&lt;br /&gt;
&lt;br /&gt;
===No Universal Performance Counter (UPC)===&lt;br /&gt;
&lt;br /&gt;
UPC is not available in this release. Thus, PAPI won't work, since it depends on UPC.&lt;br /&gt;
We are currently working on making UPC available in our Linux environment.&lt;br /&gt;
&lt;br /&gt;
===MPI-IO support===&lt;br /&gt;
&lt;br /&gt;
Due to the limitations of FUSE (the compute-node infrastructure we use for I/O forwarding of POSIX calls), pathnames passed to MPI-IO routines need to be prefixed with &amp;lt;tt&amp;gt;bglockless:&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;bgl:&amp;lt;/tt&amp;gt; (the latter will not work with PVFS; the former should work with all filesystems).&lt;br /&gt;
&lt;br /&gt;
In general, the file I/O performance with ZeptoOS is not very good, again, due to the limitations of FUSE.  Within the DOE FastOS [http://www.iofsl.org/ I/O forwarding project] we are working on a new, high performance I/O forwarding infrastructure for parallel applications and as this work matures, we will integrate it into ZeptoOS.&lt;br /&gt;
&lt;br /&gt;
===Some MPI jobs hang when they are killed===&lt;br /&gt;
&lt;br /&gt;
We have been seeing this a lot with &amp;lt;tt&amp;gt;cnip&amp;lt;/tt&amp;gt;, the IP-over-torus program.  This program runs &amp;quot;forever&amp;quot;, so it eventually needs to be killed.  When that happens, it will frequently hang one or more compute nodes, preventing the partition from shutting down cleanly.&lt;br /&gt;
&lt;br /&gt;
However, the service node will force a shutdown after a timeout of five minutes, so in practice this is not a significant problem.  Also, we have not seen this problem with ordinary MPI applications (unlike most MPI applications, &amp;lt;tt&amp;gt;cnip&amp;lt;/tt&amp;gt; is multithreaded and communicates a lot with the kernel).&lt;br /&gt;
&lt;br /&gt;
==Features Coming Soon==&lt;br /&gt;
&lt;br /&gt;
===Multiple MPI jobs one after another===&lt;br /&gt;
&lt;br /&gt;
Since ZeptoOS supports submitting a shell script as a compute node &amp;quot;application&amp;quot;, it is possible to run multiple &amp;quot;real&amp;quot; applications from within one job:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
&lt;br /&gt;
for i in 1 2 3 4 5 6 7 8 9 10; do&lt;br /&gt;
    /path/to/real/application&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This does work for sequential applications, but not for those that are linked with MPI; with MPI, an application can only be run once.  However, we have experimental code that lifts this limitation, and we plan to include it in the next release.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[ZeptoOS_Documentation|Top]]&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=485</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=485"/>
		<updated>2009-05-01T16:25:04Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Source code */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
To support high-performance computing (HPC) applications, specifically MPI applications,  &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
MPICH in the Zepto release is mpich2-1.0.7 with IBM's patch. It is reasonably stable, and the performance &lt;br /&gt;
of MPI applications on Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** debugging tools such as gdb, strace, etc.&lt;br /&gt;
** various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not yet provided&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that  &lt;br /&gt;
provides non-blocking operations. Please refer to the [http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki] for details. &lt;br /&gt;
DCMF in the Zepto release is 1.0.0, which is older than the DCMF in the current driver release (V1R3M0). SPI is the lowest-level user-space API for the torus DMA, the collective network, BGP-specific lock mechanisms, and other compute-node-specific implementations. There is no public documentation available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI. &lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application on the Zepto compute node environment &lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags) in the ELF header. When the kernel loads an executable, &lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from normal processes: it creates a special memory mapping &lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into &lt;br /&gt;
it. The big memory region incurs virtually no TLB misses and allows DMA operations &lt;br /&gt;
because it uses an offset-based mapping instead of paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB. &lt;br /&gt;
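&lt;br /&gt;
To see whether a binary has been marked, you can inspect e_flags directly. The sketch below assumes a 32-bit ELF file, where e_flags is the 4-byte field at byte offset 36 of the ELF header (the offset comes from the ELF specification, not from the Zepto sources); print_e_flags is a hypothetical name. It prints the four bytes as hex in file order, so the value can be compared before and after the binary is converted to a ZCB.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: print the e_flags field of a 32-bit ELF file as
# eight hex digits, in the byte order stored in the file.
# Assumption: ELF32 layout, where e_flags occupies bytes 36..39.
print_e_flags() {
    od -A n -t x1 -j 36 -N 4 "$1" | tr -d ' \n'
    echo
}
```
&lt;br /&gt;
On a big-endian target such as the BGP PowerPC, the printed digits read directly as the numeric value of e_flags.&lt;br /&gt;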
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code, the&lt;br /&gt;
Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what these scripts actually do internally, run the wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
====A compilation example====&lt;br /&gt;
&lt;br /&gt;
Understanding a program's build system might take some time, &lt;br /&gt;
but nothing special is needed to compile a program for the Zepto environment.&lt;br /&gt;
&lt;br /&gt;
Here is a real example of how to build a well-known parallel application, the &lt;br /&gt;
Parallel Ocean Program (POP). &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see the patch below )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, please make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path.&lt;br /&gt;
* Don't forget to call the zelftool utility, which makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout in the Zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
We provide all the source code necessary to build MPICH, DCMF, and SPI.&lt;br /&gt;
To build these libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It may take half an hour to an hour to complete the build process, depending on the file system you are using;&lt;br /&gt;
for example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. It copies &lt;br /&gt;
the built libraries and header files to the comm/tmp directory temporarily. &lt;br /&gt;
To install the newly built libraries, do the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The update-prebuilt target copies the files from the comm/tmp directory to the comm/prebuilt directory. &lt;br /&gt;
The install.py script then copies them from the comm/prebuilt directory to __INST_PREFIX__ and installs the compilation wrapper scripts to __INST_PREFIX__. Templates for the compilation wrapper scripts are in the templates directory.&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as the IBM CNK stack, except that the CNK stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip an explanation of MPICH, since it is well-known software, &lt;br /&gt;
and briefly describe DCMF and SPI here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Originally developed by IBM for the Blue Gene architecture &lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI was designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of DCMF and SPI can be found in the comm directory. The source code of MPICH is in the archive DCMF/lib/mpich2/mpich2-1.0.7.tar.gz, which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
The DCMF source code is located in DCMF/sys/. &lt;br /&gt;
The DCMF core source code is in DCMF/sys/messaging.&lt;br /&gt;
The Component Collective Messaging Interface (CCMI) is part of DCMF, and its source code is in&lt;br /&gt;
DCMF/sys/collectives. Test code can be found in DCMF/sys/collectives/tests for CCMI&lt;br /&gt;
and in DCMF/sys/messaging/tests. These tests are good examples of DCMF/CCMI programming.&lt;br /&gt;
&lt;br /&gt;
SPI headers are in arch-runtime/arch, and the SPI source code is in comm/arch-runtime/runtime/.&lt;br /&gt;
arch-runtime/zcl_spi contains the source code of the ZEPTO SPI layer, and&lt;br /&gt;
arch-runtime/arch/include/zepto contains its header files.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
comm&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |   |-- adaptor&lt;br /&gt;
|   |   |   |-- kernel&lt;br /&gt;
|   |   |   |-- tests&lt;br /&gt;
|   |   |   `-- tools&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|   |   |   |-- devices&lt;br /&gt;
|   |   |   |-- messager&lt;br /&gt;
|   |   |   |-- protocols&lt;br /&gt;
|   |   |   |-- queueing&lt;br /&gt;
|   |   |   |-- sysdep&lt;br /&gt;
|   |   `-- tests&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Zepto trace print function===&lt;br /&gt;
&lt;br /&gt;
Zepto trace print functions are embedded in parts of the SPI and DCMF code. &lt;br /&gt;
You can enable the trace feature by passing the ZEPTO_TRACE environment variable &lt;br /&gt;
when you submit a job.  ZEPTO_TRACE takes an integer that indicates the trace level; a higher number produces more detail.&lt;br /&gt;
&lt;br /&gt;
An example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cqsub -n 64 -t 10 ... -e ZEPTO_TRACE=2 ./a.out&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=471</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=471"/>
		<updated>2009-04-30T20:42:59Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* MMCS console */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
The Blue Gene/P (BGP) system is capable of assigning different boot images to each partition.  &lt;br /&gt;
The system allows us to specify a loader program, compute node (CN) images, and I/O node (ION) images.&lt;br /&gt;
The loader program is loaded into main memory via the JTAG network and executed first. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/uloader is the only choice for the loader program. &lt;br /&gt;
No source code for uloader is available. We can specify multiple CN images. By default, the Common Node Services (CNS) &lt;br /&gt;
image and the CN kernel image (IBM CNK) are specified. They are loaded into main memory in order. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/cns is the only choice for CNS; its source code is not available. &lt;br /&gt;
We can also specify multiple ION images. By default, CNS, the IBM ION Linux kernel image, and the ION Linux ramdisk are specified. They are also loaded in order. &lt;br /&gt;
&lt;br /&gt;
To enable Zepto features, we need to boot the Zepto CN kernel and ION kernel on the partition that we use. &lt;br /&gt;
This section describes how to assign and boot Zepto images.&lt;br /&gt;
&lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, it is easy to &lt;br /&gt;
boot the Zepto kernel for your computational job.&lt;br /&gt;
 &lt;br /&gt;
All you need to do is make a directory in the kernel profile directory&lt;br /&gt;
and create a few symbolic links that point to the Zepto images.&lt;br /&gt;
On the ANL BGP, /bgsys/argonne-utils/profiles/ is the kernel profile directory.&lt;br /&gt;
Here are the concrete steps to create a new kernel profile.  We assume that &lt;br /&gt;
you have already built your Zepto kernel images and have write permission to the kernel profile directory.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: your Zepto images must be readable by others; otherwise your &lt;br /&gt;
job will fail. Please double-check!&lt;br /&gt;
 &lt;br /&gt;
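The readability requirement can be checked before you submit a job. The following is a minimal sketch and not part of ZeptoOS: the function names unreadable_images and check_profile are hypothetical, and it assumes a find that supports -mindepth and -maxdepth (GNU and BSD find do). &lt;br /&gt;
It only inspects the permission bits of the files that the links point to; the directories leading to those files must also be accessible. &lt;br /&gt;
 &lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helpers (not shipped with ZeptoOS).

# List entries of a profile directory whose targets lack the "read by others" bit.
# -L makes find follow symbolic links, so the mode tested is that of the image file.
unreadable_images() {
    find -L "$1" -mindepth 1 -maxdepth 1 ! -perm -004
}

# Report the result and return non-zero if any image is not readable by others.
check_profile() {
    bad=$(unreadable_images "$1")
    if [ -n "$bad" ]; then
        echo "not readable by others:"
        echo "$bad"
        return 1
    fi
    echo "all images are readable by others"
}
```
 &lt;br /&gt;
For example, check_profile KERNEL_PROFILE_DIR/YOUR_PROFILE_NAME lists every entry whose target is not readable by others and returns a non-zero status if it finds one. &lt;br /&gt;
 &lt;br /&gt;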
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some extra &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows a help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite existing profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly configured your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
Cobalt option. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
&lt;br /&gt;
This section explains how to assign and boot your own kernel images using the Midplane Management Control System (MMCS). &lt;br /&gt;
MMCS is the lowest-level control mechanism for BGP partitions and is installed on all BGP systems. &lt;br /&gt;
Here is a brief summary of what MMCS can do:&lt;br /&gt;
* allocate, free or query a block (partition)&lt;br /&gt;
* check status&lt;br /&gt;
* assign boot images&lt;br /&gt;
* run low-level debug commands&lt;br /&gt;
&lt;br /&gt;
Because MMCS is a low-level interface, using it requires administrator-level permission.&lt;br /&gt;
You also need to reserve a partition.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start MMCS. Here mmcs.sh is a small wrapper script; its contents are listed after the commands.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Record the current configuration. You need to revert the &lt;br /&gt;
blockinfo to the original configuration after you have finished using &lt;br /&gt;
the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
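 &lt;br /&gt;
One way to avoid retyping the original paths later is to save the getblockinfo output to a file and generate the matching restore command from it. The sketch below is an illustration only: restore_cmd is a hypothetical helper, not an MMCS command, and it assumes the output contains the mloader, cnloadImg and ioloadImg fields in the format shown above. It prints the command and never talks to MMCS itself. &lt;br /&gt;
 &lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper (not part of MMCS or ZeptoOS): read saved getblockinfo
# output on stdin and print the setblockinfo command that restores it.
# Usage: restore_cmd BLOCK_NAME
restore_cmd() {
    awk -v block="$1" '
        $1 == "mloader:"   { mloader = $2 }
        $1 == "cnloadImg:" { cn = $2 }
        $1 == "ioloadImg:" { io = $2 }
        END {
            # Refuse to print a partial command if any field is missing.
            if (mloader == "" || cn == "" || io == "") {
                exit 1
            }
            print "setblockinfo", block, mloader, cn, io
        }'
}
```
 &lt;br /&gt;
Feeding the saved output to restore_cmd BGP_BLOCK_NAME prints a single setblockinfo line that can be pasted into the console when you are finished; the function returns a non-zero status and prints nothing if any of the three fields is missing. &lt;br /&gt;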
 &lt;br /&gt;
Assign the Zepto images to a partition. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
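 &lt;br /&gt;
Because the setblockinfo line is long and easy to mistype, it can help to assemble it from the three image paths. This is a sketch under stated assumptions: zepto_assign_cmd is a hypothetical helper, and the uloader and cns paths are the defaults that appear in the getblockinfo output above. It only prints the command; nothing is sent to MMCS. &lt;br /&gt;
 &lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper (not part of MMCS or ZeptoOS): print the setblockinfo
# command that assigns Zepto images to a block.
# Arguments: BLOCK_NAME CN_KERNEL_PATH ION_KERNEL_PATH ION_RAMDISK_PATH
zepto_assign_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: zepto_assign_cmd BLOCK CN_KERNEL ION_KERNEL ION_RAMDISK"
        return 1
    fi
    boot=/bgsys/drivers/ppcfloor/boot
    # The loader and CNS keep their default paths; only the kernels and ramdisk change.
    echo "setblockinfo $1 $boot/uloader $boot/cns,$2 $boot/cns,$3,$4"
}
```
 &lt;br /&gt;
The printed line has the same argument order as the command above: block name, loader, comma-separated CN images, then comma-separated ION images. &lt;br /&gt;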
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have configured a partition with Zepto kernels correctly, &lt;br /&gt;
the Zepto kernels will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore to the original configuration(Don't forget!!!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished your work on the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=470</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=470"/>
		<updated>2009-04-30T20:40:52Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* MMCS console */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
The BlueGene/P system is capable of assigning different boot images to each partition.  &lt;br /&gt;
The system allows us to specify a loader program, CN images and ION images.&lt;br /&gt;
The loader program is loaded into main memory via the JTAG network and executed first. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/uloader is the only choice for the loader program. &lt;br /&gt;
No source code for uloader is available. We can specify multiple CN images. By default, the Common Node Services (CNS) &lt;br /&gt;
image and the CN kernel image (IBM CNK) are specified. They are loaded into main memory in the order listed. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/cns is the only choice for CNS. No source code for cns is available. &lt;br /&gt;
We can also specify multiple ION images. By default, CNS, the IBM ION Linux kernel image and the ION Linux ramdisk are specified. They are also loaded in the order listed. &lt;br /&gt;
&lt;br /&gt;
To enable Zepto features, we need to boot the Zepto CN kernel and ION kernel in the partition that we use. &lt;br /&gt;
This section describes how to assign and boot Zepto images.&lt;br /&gt;
&lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, it is easy to &lt;br /&gt;
boot the Zepto kernel for your computational job.&lt;br /&gt;
 &lt;br /&gt;
All you need to do is make a directory in the kernel profile directory&lt;br /&gt;
and create a few symbolic links that point to the Zepto images.&lt;br /&gt;
On the ANL BGP, /bgsys/argonne-utils/profiles/ is the kernel profile directory.&lt;br /&gt;
Here are the concrete steps to create a new kernel profile.  We assume that &lt;br /&gt;
you have already built your Zepto kernel images and have write permission to the kernel profile directory.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: your Zepto images must be readable by others; otherwise your &lt;br /&gt;
job will fail. Please double-check!&lt;br /&gt;
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some extra &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows a help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite existing profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly configured your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
Cobalt option. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
&lt;br /&gt;
This section explains how to assign and boot your own kernel images using the Midplane Management Control System (MMCS). &lt;br /&gt;
MMCS is the lowest-level control mechanism for BGP partitions and is installed on all BGP systems. &lt;br /&gt;
Here is a brief summary of what MMCS can do:&lt;br /&gt;
* allocate, free or query a block (partition)&lt;br /&gt;
* check status&lt;br /&gt;
* assign boot images&lt;br /&gt;
* run low-level debug commands&lt;br /&gt;
&lt;br /&gt;
Because MMCS is a low-level interface, using it requires administrator-level permission.&lt;br /&gt;
You also need to reserve a partition. The overall procedure is as follows:&lt;br /&gt;
&lt;br /&gt;
# Assign Zepto images to the partition that you will use&lt;br /&gt;
# Start your job on that partition via mpirun.  The Zepto kernel starts automatically.&lt;br /&gt;
# Reset the partition to the default kernel profile&lt;br /&gt;
&lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start MMCS. Here mmcs.sh is a small wrapper script; its contents are listed after the commands. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Record the current configuration. You need to revert the &lt;br /&gt;
blockinfo to the original configuration after you have finished using &lt;br /&gt;
the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Assign the Zepto images to a partition. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have configured a partition with Zepto kernels correctly, &lt;br /&gt;
the Zepto kernels will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore to the original configuration(Don't forget!!!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished your work on the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=469</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=469"/>
		<updated>2009-04-30T20:18:41Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Cobalt installed system */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
The BlueGene/P system is capable of assigning different boot images to each partition.  &lt;br /&gt;
The system allows us to specify a loader program, CN images and ION images.&lt;br /&gt;
The loader program is loaded into main memory via the JTAG network and executed first. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/uloader is the only choice for the loader program. &lt;br /&gt;
No source code for uloader is available. We can specify multiple CN images. By default, the Common Node Services (CNS) &lt;br /&gt;
image and the CN kernel image (IBM CNK) are specified. They are loaded into main memory in the order listed. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/cns is the only choice for CNS. No source code for cns is available. &lt;br /&gt;
We can also specify multiple ION images. By default, CNS, the IBM ION Linux kernel image and the ION Linux ramdisk are specified. They are also loaded in the order listed. &lt;br /&gt;
&lt;br /&gt;
To enable Zepto features, we need to boot the Zepto CN kernel and ION kernel in the partition that we use. &lt;br /&gt;
This section describes how to assign and boot Zepto images.&lt;br /&gt;
&lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, it is easy to &lt;br /&gt;
boot the Zepto kernel for your computational job.&lt;br /&gt;
 &lt;br /&gt;
All you need to do is make a directory in the kernel profile directory&lt;br /&gt;
and create a few symbolic links that point to the Zepto images.&lt;br /&gt;
On the ANL BGP, /bgsys/argonne-utils/profiles/ is the kernel profile directory.&lt;br /&gt;
Here are the concrete steps to create a new kernel profile.  We assume that &lt;br /&gt;
you have already built your Zepto kernel images and have write permission to the kernel profile directory.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: your Zepto images must be readable by others; otherwise your &lt;br /&gt;
job will fail. Please double-check!&lt;br /&gt;
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some extra &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows a help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite existing profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly configured your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
Cobalt option. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
 &lt;br /&gt;
If the Cobalt kernel profile feature is not available on your BGP system, &lt;br /&gt;
you can use the MMCS console instead. With the MMCS console, you &lt;br /&gt;
statically assign Zepto kernels to the partition you use. &lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start MMCS. Here mmcs.sh is a small wrapper script; its contents are listed after the commands. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Record the current configuration. You need to revert the &lt;br /&gt;
blockinfo to the original configuration after you have finished using &lt;br /&gt;
the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Assign the Zepto images to a partition. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have configured a partition with Zepto kernels correctly, &lt;br /&gt;
the Zepto kernels will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore to the original configuration(Don't forget!!!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished your work on the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=468</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=468"/>
		<updated>2009-04-30T20:17:15Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Cobalt installed system */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
The BlueGene/P system is capable of assigning different boot images to each partition.  &lt;br /&gt;
The system allows us to specify a loader program, CN images and ION images.&lt;br /&gt;
The loader program is loaded into main memory via the JTAG network and executed first. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/uloader is the only choice for the loader program. &lt;br /&gt;
No source code for uloader is available. We can specify multiple CN images. By default, the Common Node Services (CNS) &lt;br /&gt;
image and the CN kernel image (IBM CNK) are specified. They are loaded into main memory in the order listed. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/cns is the only choice for CNS. No source code for cns is available. &lt;br /&gt;
We can also specify multiple ION images. By default, CNS, the IBM ION Linux kernel image and the ION Linux ramdisk are specified. They are also loaded in the order listed. &lt;br /&gt;
&lt;br /&gt;
To enable Zepto features, we need to boot the Zepto CN kernel and ION kernel in the partition that we use. &lt;br /&gt;
This section describes how to assign and boot Zepto images.&lt;br /&gt;
&lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, it is easy to &lt;br /&gt;
boot the Zepto kernel for your computational job.&lt;br /&gt;
 &lt;br /&gt;
All you need to do is make a directory in the kernel profile directory&lt;br /&gt;
and create a few symbolic links that point to the Zepto images.&lt;br /&gt;
On the ANL BGP, /bgsys/argonne-utils/profiles/ is the kernel profile directory.&lt;br /&gt;
Here are the concrete steps to create a new kernel profile.  We assume that &lt;br /&gt;
you have already built your Zepto kernel images and have write permission to the kernel profile directory.&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: your Zepto images must be readable by others; otherwise your &lt;br /&gt;
job will fail. Please double-check!&lt;br /&gt;
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some other &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows a help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite existing profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly configured your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
Cobalt option. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
 &lt;br /&gt;
If the Cobalt kernel profile feature is not available on your BGP system, &lt;br /&gt;
you can use the MMCS console instead. With the MMCS console, you &lt;br /&gt;
statically assign Zepto kernels to the partition you use. &lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start MMCS. Here mmcs.sh is a small wrapper script; its contents are listed after the commands. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Record the current configuration. You need to revert the &lt;br /&gt;
blockinfo to the original configuration after you have finished using &lt;br /&gt;
the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Assign the Zepto images to a partition. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have configured a partition with Zepto kernels correctly, &lt;br /&gt;
the Zepto kernels will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore to the original configuration(Don't forget!!!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished your work on the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=466</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=466"/>
		<updated>2009-04-30T19:59:37Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Introduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
The BlueGene/P system is capable of assigning different boot images to each partition.  &lt;br /&gt;
The system allows us to specify a loader program, CN images and ION images.&lt;br /&gt;
The loader program is loaded into main memory via the JTAG network and executed first. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/uloader is the only choice for the loader program. &lt;br /&gt;
No source code for uloader is available. We can specify multiple CN images. By default, the Common Node Services (CNS) &lt;br /&gt;
image and the CN kernel image (IBM CNK) are specified. They are loaded into main memory in the order listed. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/cns is the only choice for CNS. No source code for cns is available. &lt;br /&gt;
We can also specify multiple ION images. By default, CNS, the IBM ION Linux kernel image and the ION Linux ramdisk are specified. They are also loaded in the order listed. &lt;br /&gt;
&lt;br /&gt;
To enable Zepto features, we need to boot the Zepto CN kernel and ION kernel in the partition that we use. &lt;br /&gt;
This section describes how to assign and boot Zepto images.&lt;br /&gt;
&lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, booting the Zepto kernel &lt;br /&gt;
is easy. &lt;br /&gt;
 &lt;br /&gt;
The login nodes have a kernel profile directory &lt;br /&gt;
(/bgsys/argonne-utils/profiles/ on the ANL BGP system, for &lt;br /&gt;
example). The following steps create a new kernel profile, assuming that &lt;br /&gt;
you have already built your Zepto kernel images. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: Your Zepto images must be readable by other users; otherwise your &lt;br /&gt;
job will fail. Please double-check! &lt;br /&gt;
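 &lt;br /&gt;
One way to double-check this is to list every image in the profile directory that lacks the read bit for other users. The following is a minimal sketch, assuming a find command that supports -maxdepth (GNU and BSD find do); list_unreadable_images is a hypothetical helper name, not part of ZeptoOS or Cobalt. &lt;br /&gt;
 &lt;br /&gt;
```shell
# Print every image in a profile directory that "others" cannot read.
# -L follows the symbolic links created with ln -s, so the permission
# bits of the link targets (the real image files) are the ones tested.
list_unreadable_images() {
    find -L "$1" -maxdepth 1 -type f ! -perm -004
}

# Example call with the placeholder names used in the steps above:
#   list_unreadable_images KERNEL_PROFILE_DIR/YOUR_PROFILE_NAME
```
 &lt;br /&gt;
Any path it prints can be fixed with chmod o+r. The directories leading to the images must also be searchable by other users, which this sketch does not check. &lt;br /&gt;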
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some other &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows a help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite exsiting profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly configured your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
option of Cobalt's cqsub command. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
 &lt;br /&gt;
If the Cobalt kernel profile feature is not available on your BGP system, &lt;br /&gt;
use the MMCS console instead. With the MMCS console, you &lt;br /&gt;
statically assign Zepto kernels to the partition you use. &lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start the MMCS console. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Make a note of the current configuration. You need to restore the &lt;br /&gt;
blockinfo to the original configuration after you have finished using &lt;br /&gt;
the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Assign the Zepto images to a partition. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
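 &lt;br /&gt;
Because the setblockinfo line is long and easy to mistype, it can help to assemble it in a shell first and paste the result into the console. The following is a minimal sketch, assuming the boot directory reported by getblockinfo above; zepto_setblockinfo is a hypothetical helper that only prints the command and does not talk to MMCS. &lt;br /&gt;
 &lt;br /&gt;
```shell
# Default IBM boot directory, as reported by getblockinfo above.
BOOT=/bgsys/drivers/ppcfloor/boot

# Print a setblockinfo command line for pasting into the MMCS console.
# Arguments: block name, CN kernel path, ION kernel path, ION ramdisk path.
zepto_setblockinfo() {
    printf 'setblockinfo %s %s %s,%s %s,%s,%s\n' \
        "$1" "$BOOT/uloader" "$BOOT/cns" "$2" "$BOOT/cns" "$3" "$4"
}

# Placeholder arguments; substitute your own block name and image paths.
zepto_setblockinfo BGP_BLOCK_NAME BGP_CN_LINUX_KERNEL_PATH \
    BGP_ION_LINUX_KERNEL_PATH BGP_ION_LINUX_RAMDISK_PATH
```
 &lt;br /&gt;
The printed line keeps uloader and cns from the IBM defaults and swaps in only the kernel and ramdisk images, which mirrors the command shown above. &lt;br /&gt;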
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have configured a partition with Zepto kernels correctly, &lt;br /&gt;
the Zepto kernels will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore the original configuration (don't forget!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished working with the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=465</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=465"/>
		<updated>2009-04-30T19:55:40Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Introduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
The BlueGene/P system is capable of assigning different boot images to each partition.  &lt;br /&gt;
The system allows us to specify a loader program, CN images and ION images.&lt;br /&gt;
The loader program is loaded into main memory via the JTAG network and executed first. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/uloader is the only choice for the loader program. &lt;br /&gt;
No source code for uloader is available. &lt;br /&gt;
We can specify multiple CN images. By default, the Common Node Services (CNS) &lt;br /&gt;
image and the CN kernel image (IBM CNK) are specified. They are loaded into main memory in that order. &lt;br /&gt;
/bgsys/drivers/ppcfloor/boot/cns is the only choice for CNS. No source code for cns is available. &lt;br /&gt;
We can also specify multiple ION images. By default, CNS, the IBM ION Linux kernel image and the ION Linux ramdisk are specified. They are also loaded in that order. &lt;br /&gt;
&lt;br /&gt;
To enable Zepto features, you need to configure the system properly with the Zepto kernel.&lt;br /&gt;
&lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, booting the Zepto kernel &lt;br /&gt;
is easy. &lt;br /&gt;
 &lt;br /&gt;
The login nodes have a kernel profile directory &lt;br /&gt;
(/bgsys/argonne-utils/profiles/ on the ANL BGP system, for &lt;br /&gt;
example). The following steps create a new kernel profile, assuming that &lt;br /&gt;
you have already built your Zepto kernel images. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: Your Zepto images must be readable by other users; otherwise your &lt;br /&gt;
job will fail. Please double-check! &lt;br /&gt;
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some other &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows a help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite exsiting profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly configured your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
option of Cobalt's cqsub command. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
 &lt;br /&gt;
If the Cobalt kernel profile feature is not available on your BGP system, &lt;br /&gt;
use the MMCS console instead. With the MMCS console, you &lt;br /&gt;
statically assign Zepto kernels to the partition you use. &lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start the MMCS console. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Make a note of the current configuration. You need to restore the &lt;br /&gt;
blockinfo to the original configuration after you have finished using &lt;br /&gt;
the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Assign the Zepto images to a partition. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have configured a partition with Zepto kernels correctly, &lt;br /&gt;
the Zepto kernels will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore the original configuration (don't forget!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished working with the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=464</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=464"/>
		<updated>2009-04-30T19:03:51Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Cobalt installed system */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
BGP is capable of running a different kernel in each partition.  By default, &lt;br /&gt;
the IBM CNK and the ION kernel (Linux) are booted in a partition when you submit &lt;br /&gt;
a job. To enable Zepto features, you need to configure the system &lt;br /&gt;
properly with the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
==Cobalt installed system==  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, booting the Zepto kernel &lt;br /&gt;
is easy. &lt;br /&gt;
 &lt;br /&gt;
The login nodes have a kernel profile directory &lt;br /&gt;
(/bgsys/argonne-utils/profiles/ on the ANL BGP system, for &lt;br /&gt;
example). The following steps create a new kernel profile, assuming that &lt;br /&gt;
you have already built your Zepto kernel images. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: Your Zepto images must be readable by other users; otherwise your &lt;br /&gt;
job will fail. Please double-check! &lt;br /&gt;
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some other &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows a help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite exsiting profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly configured your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
option of Cobalt's cqsub command. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
 &lt;br /&gt;
If the Cobalt kernel profile feature is not available on your BGP system, &lt;br /&gt;
use the MMCS console instead. With the MMCS console, you &lt;br /&gt;
statically assign Zepto kernels to the partition you use. &lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start the MMCS console. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Make a note of the current configuration. You need to restore the &lt;br /&gt;
blockinfo to the original configuration after you have finished using &lt;br /&gt;
the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Assign the Zepto images to a partition. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have configured a partition with Zepto kernels correctly, &lt;br /&gt;
the Zepto kernels will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore the original configuration (don't forget!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished working with the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel_Profile&amp;diff=463</id>
		<title>Kernel Profile</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel_Profile&amp;diff=463"/>
		<updated>2009-04-30T19:03:06Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
BGP is capable of running a different kernel in each partition.  By default, &lt;br /&gt;
the IBM CNK and the ION kernel (Linux) are booted in a partition when you submit &lt;br /&gt;
a job. To enable Zepto features, you need to configure the system &lt;br /&gt;
properly with the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
===Cobalt installed system===  &lt;br /&gt;
 &lt;br /&gt;
If your BGP system has the Cobalt scheduler installed and its kernel &lt;br /&gt;
profile feature has been configured properly, booting the Zepto kernel &lt;br /&gt;
is easy. &lt;br /&gt;
 &lt;br /&gt;
The login nodes have a kernel profile directory &lt;br /&gt;
(/bgsys/argonne-utils/profiles/ on the ANL BGP system, for &lt;br /&gt;
example). The following steps create a new kernel profile, assuming that &lt;br /&gt;
you have already built your Zepto kernel images. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd KERNEL_PROFILE_DIR &lt;br /&gt;
$ mkdir YOUR_PROFILE_NAME &amp;amp;&amp;amp; cd YOUR_PROFILE_NAME &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-CN-zImage-with-initrd.elf  CNK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-zImage.elf  INK &lt;br /&gt;
$ ln -s ZEPTO_DIR/BGP-ION-ramdisk-for-CNL.elf ramdisk &lt;br /&gt;
$ ln -s ../factory-default/CNS &lt;br /&gt;
$ ln -s ../factory-default/uloader &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
NOTE: Your Zepto images must be readable by other users; otherwise your &lt;br /&gt;
job will fail. Please double-check! &lt;br /&gt;
 &lt;br /&gt;
For ANL users, we provide a convenience script named mkprofile-ANL.sh, &lt;br /&gt;
which essentially does what is described above but has some other &lt;br /&gt;
features. The following command line is equivalent to the steps &lt;br /&gt;
described above. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd ZEPTO_DIR &amp;amp;&amp;amp; ./mkprofile-ANL.sh --profile=YOUR_PROFILE_NAME &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Invoking it with the -h option shows a help message. Use -c if you &lt;br /&gt;
actually need to copy the images instead of making symbolic links.  Use &lt;br /&gt;
--cn, --ion or --rd if you have a custom-named image. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ./mkprofile-ANL.sh -h &lt;br /&gt;
Usage: ./mkprofile-ANL.sh [OPTIONS]   &lt;br /&gt;
 &lt;br /&gt;
Options: &lt;br /&gt;
-h             : Show this message &lt;br /&gt;
-c             : Copy images instead of making symbolic link &lt;br /&gt;
-f             : Overwrite exsiting profile  &lt;br /&gt;
--profile=name : Specify profile name &lt;br /&gt;
--cn=fn        : Compute Node Kernel Image   &lt;br /&gt;
--ion=fn       : Specify I/O Node Kernel Image       &lt;br /&gt;
--rd=fn        : Specify I/O Node Ramdisk Image &lt;br /&gt;
--ls           : show files in profile &lt;br /&gt;
--dryrun  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Once you have properly configured your Zepto kernel profile, you can &lt;br /&gt;
boot the Zepto kernel by specifying your kernel profile name via the -k &lt;br /&gt;
option of Cobalt's cqsub command. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ cqsub -k YOUR_PROFILE_NAME .... &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MMCS console==&lt;br /&gt;
 &lt;br /&gt;
If the Cobalt kernel profile feature is not available on your BGP system, &lt;br /&gt;
use the MMCS console instead. With the MMCS console, you &lt;br /&gt;
statically assign Zepto kernels to the partition you use. &lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
===Assign Zepto images to a BGP partition=== &lt;br /&gt;
 &lt;br /&gt;
Log in to the service node and start the MMCS console. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;$ ssh sn         &lt;br /&gt;
sn $ ./mmcs.sh  &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;[mmcs.sh] &lt;br /&gt;
#!/bin/sh &lt;br /&gt;
 &lt;br /&gt;
export DB2HOME=/dbhome/bgpdb2c/sqllib &lt;br /&gt;
DB2SRC=${DB2HOME}/db2profile &lt;br /&gt;
[ -f &amp;quot;$DB2SRC&amp;quot; ] &amp;amp;&amp;amp; . $DB2SRC &lt;br /&gt;
 &lt;br /&gt;
cd /bgsys/drivers/ppcfloor/bin &lt;br /&gt;
./mmcs_db_console &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Make a note of the current configuration. You need to restore the &lt;br /&gt;
blockinfo to the original configuration after you have finished using &lt;br /&gt;
the Zepto kernel. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ getblockinfo BGP_BLOCK_NAME &lt;br /&gt;
OK &lt;br /&gt;
boot info for block BGP_BLOCK_NAME: &lt;br /&gt;
mloader: /bgsys/drivers/ppcfloor/boot/uloader &lt;br /&gt;
cnloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk &lt;br /&gt;
ioloadImg: /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
status: F &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
Assign the Zepto images to a partition. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,BGP_CN_LINUX_KERNEL_PATH /bgsys/drivers/ppcfloor\&lt;br /&gt;
/boot/cns,BGP_ION_LINUX_KERNEL_PATH,BGP_ION_LINUX_RAMDISK_PATH &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
===Boot Zepto kernel=== &lt;br /&gt;
 &lt;br /&gt;
Once you have configured a partition with Zepto kernels correctly, &lt;br /&gt;
the Zepto kernels will be booted when you run a job on that partition (via &lt;br /&gt;
mpirun, for example). &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ mpirun -verbose 1 -partition BGP_BLOCK_NAME  -np 64 -timeout 600 -cwd `pwd` -exe ./a.out &lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
 &lt;br /&gt;
===Restore the original configuration (don't forget!)=== &lt;br /&gt;
 &lt;br /&gt;
After you have finished working with the Zepto kernel, you need to restore &lt;br /&gt;
the original configuration. Here is an example. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;pre&amp;gt;fen $ ssh sn &lt;br /&gt;
sn $ ./mmcs.sh &lt;br /&gt;
console $ set_username YOUR_LOGIN_NAME &lt;br /&gt;
console $ setblockinfo BGP_BLOCK_NAME /bgsys/drivers/ppcfloor/boot/uloader /bgsys/drivers/ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/cnk /bgsys/drivers/\&lt;br /&gt;
ppcfloor/boot/cns,/bgsys/drivers/ppcfloor/boot/linux,/bgsys/drivers/ppcfloor/boot/ramdisk &lt;br /&gt;
console $ quit &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Ramdisk&amp;diff=462</id>
		<title>Ramdisk</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Ramdisk&amp;diff=462"/>
		<updated>2009-04-30T19:02:45Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
Both the CN Linux kernel and the ION Linux kernel require a ramdisk to&lt;br /&gt;
boot.  Ramdisk images contain a minimal set of Linux utilities, init&lt;br /&gt;
scripts, configuration files, kernel modules, etc., which are required&lt;br /&gt;
by the OS boot process.&lt;br /&gt;
&lt;br /&gt;
The ION ramdisk is an ELF file that contains a cpio-format archive of&lt;br /&gt;
system files. Two ION ramdisk images are currently generated.&lt;br /&gt;
&lt;br /&gt;
* BGP-ION-ramdisk-for-CNL.elf&lt;br /&gt;
** Default ION ramdisk for ZeptoOS&lt;br /&gt;
* BGP-ION-ramdisk-for-CNK.elf &lt;br /&gt;
** Use this one if you need to run IBM CNK on the compute nodes&lt;br /&gt;
** IBM CIOD is used instead of ZOID&lt;br /&gt;
&lt;br /&gt;
Our ION ramdisks are similar to the default IBM ION ramdisk, but we add some&lt;br /&gt;
extra files to support our features. The extra files are located in&lt;br /&gt;
ramdisk/ION/ramdisk-add. The build-ramdisk script from the IBM BGP driver is used to create&lt;br /&gt;
the ION ramdisk. The default path of the build-ramdisk script is&lt;br /&gt;
/bgsys/drivers/ppcfloor. The build-ramdisk script path can be&lt;br /&gt;
set through the main configure script.&lt;br /&gt;
&lt;br /&gt;
The CN ramdisk is also a gzip'ed cpio-format archive of system files, but&lt;br /&gt;
it is embedded into the CN kernel&lt;br /&gt;
image (BGP-CN-zImage-with-initrd.elf).  The CN ramdisk is created by our&lt;br /&gt;
local ramdisk build script (ramdisk/CN/create-bgp-cn-linux-ramdisk.pl).&lt;br /&gt;
Both build-ramdisk and create-bgp-cn-linux-ramdisk.pl are wrapper scripts around the &lt;br /&gt;
Linux kernel's gen_init_cpio command.&lt;br /&gt;
&lt;br /&gt;
==How to create ramdisk images==&lt;br /&gt;
&lt;br /&gt;
The CN kernel image, ION kernel image and ION ramdisk images are always (re-)created from prebuilt&lt;br /&gt;
objects if you type make in the top-level directory (without any make target).&lt;br /&gt;
&lt;br /&gt;
If you need to create only the ION ramdisk (without rebuilding the other images), &lt;br /&gt;
you can run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make bgp-ion-ramdisk-cnl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to create the CN ramdisk (technically, to create the CN kernel image&lt;br /&gt;
with new ramdisk contents), type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make bgp-cn-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
NOTE: The newly built CN ramdisk is ramdisk/CN/bgp-cn-ramdisk.cpio.gz.&lt;br /&gt;
&lt;br /&gt;
==How to modify ramdisk contents==&lt;br /&gt;
&lt;br /&gt;
You can customize the ramdisk contents for your own purposes, e.g., for debugging or for &lt;br /&gt;
running your own system software on BGP.&lt;br /&gt;
&lt;br /&gt;
===CN ramdisk===&lt;br /&gt;
&lt;br /&gt;
You can customize the CN ramdisk by editing our CN ramdisk build script, &lt;br /&gt;
ramdisk/CN/create-bgp-cn-linux-ramdisk.pl.  &lt;br /&gt;
You can add files with specific permissions, delete files, create device files, etc.&lt;br /&gt;
&lt;br /&gt;
While we keep our CN ramdisk contents in ramdisk/CN/tree, you can add any file to the ramdisk as long as it is accessible from the script. You may be able to add the login node's executables if they are 32-bit PPC binaries and you also copy all required files, such as shared libraries and config files.&lt;br /&gt;
&lt;br /&gt;
Here is a useful example. Suppose that you need the od command in the CN ramdisk.&lt;br /&gt;
You can build the od command from source code for the CN ramdisk.&lt;br /&gt;
If you want something quick, you can check whether the login node's od command works.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ file /usr/bin/od&lt;br /&gt;
/usr/bin/od: ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1 (SYSV), &lt;br /&gt;
for GNU/Linux 2.6.4, dynamically linked (uses shared libs), for GNU/Linux 2.6.4, stripped&lt;br /&gt;
$ ldd /usr/bin/od&lt;br /&gt;
linux-vdso32.so.1 =&amp;gt;  (0x00100000)&lt;br /&gt;
libc.so.6 =&amp;gt; /lib/ppc970/libc.so.6 (0x0fe8b000)&lt;br /&gt;
/lib/ld.so.1 (0xf7fe1000)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is a 32-bit PPC executable and the current CN ramdisk has all the necessary shared libraries, so you can simply use the login node's od command.&lt;br /&gt;
All you need to do is add one command to the Perl array named @cmdlists in ramdisk/CN/create-bgp-cn-linux-ramdisk.pl&lt;br /&gt;
and run make to recreate the CN ramdisk.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ vi ramdisk/CN/create-bgp-cn-linux-ramdisk.pl # add the following line to @cmdlists &lt;br /&gt;
     &amp;quot;file /bin/od   /usr/bin/od 0755  0  0&amp;quot;,&lt;br /&gt;
$ make bgp-cn-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now the CN ramdisk has /bin/od with file permission 0755, uid=0 and gid=0.&lt;br /&gt;
&lt;br /&gt;
The line that you added is a gen_init_cpio command. You can also&lt;br /&gt;
create a directory, device file, symbolic link, pipe file, or socket&lt;br /&gt;
file. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
file &amp;lt;name&amp;gt; &amp;lt;location&amp;gt; &amp;lt;mode&amp;gt; &amp;lt;uid&amp;gt; &amp;lt;gid&amp;gt;&lt;br /&gt;
dir &amp;lt;name&amp;gt; &amp;lt;mode&amp;gt; &amp;lt;uid&amp;gt; &amp;lt;gid&amp;gt;&lt;br /&gt;
nod &amp;lt;name&amp;gt; &amp;lt;mode&amp;gt; &amp;lt;uid&amp;gt; &amp;lt;gid&amp;gt; &amp;lt;dev_type&amp;gt; &amp;lt;maj&amp;gt; &amp;lt;min&amp;gt;&lt;br /&gt;
slink &amp;lt;name&amp;gt; &amp;lt;target&amp;gt; &amp;lt;mode&amp;gt; &amp;lt;uid&amp;gt; &amp;lt;gid&amp;gt;&lt;br /&gt;
pipe &amp;lt;name&amp;gt; &amp;lt;mode&amp;gt; &amp;lt;uid&amp;gt; &amp;lt;gid&amp;gt;&lt;br /&gt;
sock &amp;lt;name&amp;gt; &amp;lt;mode&amp;gt; &amp;lt;uid&amp;gt; &amp;lt;gid&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;name&amp;gt;      name of the file/dir/nod/etc in the archive&lt;br /&gt;
&amp;lt;location&amp;gt;  location of the file in the current filesystem&lt;br /&gt;
&amp;lt;target&amp;gt;    link target&lt;br /&gt;
&amp;lt;mode&amp;gt;      mode/permissions of the file&lt;br /&gt;
&amp;lt;uid&amp;gt;       user id (0=root)&lt;br /&gt;
&amp;lt;gid&amp;gt;       group id (0=root)&lt;br /&gt;
&amp;lt;dev_type&amp;gt;  device type (b=block, c=character)&lt;br /&gt;
&amp;lt;maj&amp;gt;       major number of nod&lt;br /&gt;
&amp;lt;min&amp;gt;       minor number of nod&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The order of the gen_init_cpio commands in @cmdlists matters. They are interpreted and executed from top to bottom. &lt;br /&gt;
You can't add a file to a directory that has not been created yet.&lt;br /&gt;
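&lt;br /&gt;
The ordering rule can be checked before running make. The following is a minimal Python sketch, not part of the Zepto tools: find_order_errors is a hypothetical helper, and it assumes that every entry's parent directory must appear in an earlier dir entry, as described above.&lt;br /&gt;
&lt;br /&gt;
```python
import posixpath

def find_order_errors(cmdlist):
    '''Return entries whose parent directory has not been created yet.'''
    created = {'/'}          # the archive root always exists
    errors = []
    for entry in cmdlist:
        fields = entry.split()
        kind, name = fields[0], fields[1]
        parent = posixpath.dirname(name)
        if parent not in created:
            errors.append(entry)
        if kind == 'dir':
            created.add(name)
    return errors

# Hypothetical entries in gen_init_cpio syntax.
good = [
    'dir /bin 0755 0 0',
    'file /bin/od /usr/bin/od 0755 0 0',
]
bad = list(reversed(good))

print(find_order_errors(good))  # no problems expected
print(find_order_errors(bad))   # the file entry precedes its directory
```
&lt;br /&gt;
The check only looks at the first two fields of each entry, so it works for all six entry types listed above.&lt;br /&gt;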
&lt;br /&gt;
====CN Linux startup script====&lt;br /&gt;
&lt;br /&gt;
The first thing the Linux kernel does after it boots successfully&lt;br /&gt;
is to execute the init program.&lt;br /&gt;
The init program is usually /sbin/init, which is internally busybox. &lt;br /&gt;
/sbin/init then executes the startup script defined in /etc/inittab,&lt;br /&gt;
which is /etc/init.d/rc.sysinit.&lt;br /&gt;
&lt;br /&gt;
Our startup script is a very small shell script that does minimal setup, starts telnetd to allow users to log in from the ION, and&lt;br /&gt;
starts the Zoid control process, which takes care of job control.&lt;br /&gt;
&lt;br /&gt;
In case you need to start your software at CN boot time, &lt;br /&gt;
you can add command invocations to ramdisk/CN/tree/etc/init.d/rc.sysinit.&lt;br /&gt;
&lt;br /&gt;
===ION ramdisk===&lt;br /&gt;
&lt;br /&gt;
Unlike the CN ramdisk, the ION ramdisk offers only limited customization.&lt;br /&gt;
You have no control over permissions, you can't create device nodes, etc.&lt;br /&gt;
Currently we build the ION ramdisk using the IBM build-ramdisk script, specifying &lt;br /&gt;
an add-on tree that contains our extra files.&lt;br /&gt;
&lt;br /&gt;
Basically, you can:&lt;br /&gt;
* Add files&lt;br /&gt;
* Overwrite files that are in the ramdisk created by build-ramdisk&lt;br /&gt;
&lt;br /&gt;
Once you add files under ramdisk/ION/ramdisk-add/, they will be automatically added to the ramdisk. &lt;br /&gt;
Here is an example to add a file to the ION ramdisk.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ vi ramdisk/ION/ramdisk-add/etc/yourfile&lt;br /&gt;
$ make bgp-ion-ramdisk-cnl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to do more than add files, you might need to edit the build-ramdisk script itself.&lt;br /&gt;
The script's default location is /bgsys/drivers/ppcfloor/build-ramdisk. Copy &lt;br /&gt;
the script to your working directory, edit it, and change the script path in ramdisk/ION/Makefile.&lt;br /&gt;
&lt;br /&gt;
====ION startup script====&lt;br /&gt;
&lt;br /&gt;
There is no rc.sysinit in ramdisk/ION/ramdisk-add/ because &lt;br /&gt;
rc.sysinit is provided by the IBM ramdisk tree; &lt;br /&gt;
i.e., /bgsys/drivers/ppcfloor/ramdisk/etc/init.d/rc.sysinit is the default one.&lt;br /&gt;
You can copy the default one to ramdisk/etc/init.d/rc.sysinit (local) and modify it &lt;br /&gt;
to change the startup behaviour, but this is not recommended. &lt;br /&gt;
&lt;br /&gt;
In most cases, all you need is to start your software at ION boot time. &lt;br /&gt;
For that purpose, you can add your own ION RC script to ramdisk-add/etc/init.d/rc3.d.&lt;br /&gt;
&lt;br /&gt;
RC scripts follow a naming convention. &lt;br /&gt;
&lt;br /&gt;
* S##xxxx : boot time scripts&lt;br /&gt;
* K##xxxx : shut down scripts&lt;br /&gt;
&lt;br /&gt;
A script name starts with S or K. Scripts starting with S are boot time scripts and scripts starting with K are shut down scripts.&lt;br /&gt;
The two-digit number that follows the 'S' or 'K' decides the execution order; &lt;br /&gt;
a script with a smaller number is executed before a script with a larger number. The script name follows the number. &lt;br /&gt;
The init scripts pass &amp;quot;start&amp;quot; as the first argument to boot time scripts when they are executed. Similarly, &amp;quot;stop&amp;quot; is passed to shut down scripts.&lt;br /&gt;
If you parse the argument, one rc script can serve as both a boot time and a shut down script. &lt;br /&gt;
Here is a template of an rc script. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
. /etc/rc.status&lt;br /&gt;
&lt;br /&gt;
rc_reset&lt;br /&gt;
case &amp;quot;$1&amp;quot; in&lt;br /&gt;
    start)&lt;br /&gt;
        # fill here #&lt;br /&gt;
        ;;&lt;br /&gt;
    stop)&lt;br /&gt;
        # fill here #&lt;br /&gt;
        ;;&lt;br /&gt;
    restart)&lt;br /&gt;
        # fill here #&lt;br /&gt;
        ;;&lt;br /&gt;
    status)&lt;br /&gt;
        # fill here #&lt;br /&gt;
        ;;&lt;br /&gt;
    *)&lt;br /&gt;
	echo &amp;quot;Usage: $0 {start|stop|restart|status}&amp;quot;&lt;br /&gt;
	exit 1&lt;br /&gt;
	;;&lt;br /&gt;
esac&lt;br /&gt;
rc_exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The default Zepto ION ramdisk contains the following rc scripts.&lt;br /&gt;
&lt;br /&gt;
'''boot''' scripts&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
S00zepto&lt;br /&gt;
S01bootsysctl&lt;br /&gt;
S02syslog&lt;br /&gt;
S05ntp&lt;br /&gt;
S10sitefs&lt;br /&gt;
S11pvfs&lt;br /&gt;
S11sshd&lt;br /&gt;
S12zepto&lt;br /&gt;
S40gpfs&lt;br /&gt;
S41homelinks&lt;br /&gt;
S43ibmcmp&lt;br /&gt;
S46essl&lt;br /&gt;
S50ciod&lt;br /&gt;
S51zoid&lt;br /&gt;
S99zepto&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''shutdown''' scripts&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
K05ntp&lt;br /&gt;
K10sshd&lt;br /&gt;
K15ciod&lt;br /&gt;
K20gpfs&lt;br /&gt;
K30syslog&lt;br /&gt;
K50bgsys.64&lt;br /&gt;
K70pvfs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
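&lt;br /&gt;
To see where a new script would run relative to the existing ones, the naming convention can be modelled in a few lines of Python. This is only a sketch: rc_order is a hypothetical helper, and breaking ties between equal numbers by name is an assumption, since the tie order is not specified here.&lt;br /&gt;
&lt;br /&gt;
```python
def rc_order(names, prefix):
    '''Sort rc script names that start with prefix by their two-digit number.'''
    selected = [n for n in names if n.startswith(prefix)]
    # Key: (two-digit number, full name); the name only breaks ties (assumption).
    return sorted(selected, key=lambda n: (int(n[1:3]), n))

# A few names taken from the lists above, deliberately shuffled.
scripts = ['S50ciod', 'K10sshd', 'S02syslog', 'S11sshd', 'K05ntp', 'S00zepto']

print(rc_order(scripts, 'S'))  # boot order
print(rc_order(scripts, 'K'))  # shutdown order
```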
&lt;br /&gt;
===Ramdisk size limitation===&lt;br /&gt;
&lt;br /&gt;
In a regular Linux environment, the ramdisk size is &lt;br /&gt;
basically limited by the amount of free memory at the time the ramdisk is loaded into memory.&lt;br /&gt;
However, on BGP, the system software (non-open-source) cannot handle large images. &lt;br /&gt;
We don't have the exact boot image size limit, &lt;br /&gt;
but a ramdisk of 100MB or more might fail to boot in the current environment.&lt;br /&gt;
If you add large files to the ramdisk, please check the size of the resulting image files,&lt;br /&gt;
specifically BGP-ION-ramdisk-for-CNL.elf and BGP-CN-zImage-with-initrd.elf.&lt;br /&gt;
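&lt;br /&gt;
A size check like this is easy to script. The Python sketch below is not part of the Zepto tools; oversized_images is a hypothetical helper, and the 100 MB threshold is only the rough figure mentioned above, not an exact limit.&lt;br /&gt;
&lt;br /&gt;
```python
import os

LIMIT_MB = 100  # rough figure quoted in the text, not an exact limit

def oversized_images(paths, limit_mb=LIMIT_MB):
    '''Return (path, size_in_mb) pairs for existing files of limit_mb or more.'''
    found = []
    for path in paths:
        if not os.path.isfile(path):
            continue  # image has not been built yet
        size_mb = os.path.getsize(path) / (1024 * 1024)
        if size_mb >= limit_mb:
            found.append((path, size_mb))
    return found

images = ['BGP-ION-ramdisk-for-CNL.elf', 'BGP-CN-zImage-with-initrd.elf']
for path, size_mb in oversized_images(images):
    print('%s is %.1f MB and might fail to boot' % (path, size_mb))
```
&lt;br /&gt;
Run it from the Zepto top directory after a build; it prints nothing when both images are below the threshold.&lt;br /&gt;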
&lt;br /&gt;
==How to extract files from existing ramdisk image==&lt;br /&gt;
&lt;br /&gt;
If you want to extract files from an existing ramdisk image, follow&lt;br /&gt;
these steps (ION ramdisk only).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./packages/tools/z-extract-cpio-from-ramdisk.sh  existingramdisk.elf  ramdisk.cpio&lt;br /&gt;
$ mkdir treeroot &amp;amp;&amp;amp; cd treeroot&lt;br /&gt;
$ cpio -idv &amp;lt; ../ramdisk.cpio&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Kernel&amp;diff=461</id>
		<title>Kernel</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Kernel&amp;diff=461"/>
		<updated>2009-04-30T19:02:28Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
We currently provide two Linux kernels because of GPFS support on ION.&lt;br /&gt;
&lt;br /&gt;
* 2.6.19 based kernel : Zepto CN kernel&lt;br /&gt;
** IBM V1R3 patch and zepto patch applied&lt;br /&gt;
** 64KB pagesize and bigmemory region available&lt;br /&gt;
** Device drivers for compute node devices such as DMA, lockbox, etc&lt;br /&gt;
** Allows running MPICH/DCMF code through Zepto Compute Binary (ZCB)&lt;br /&gt;
** Can be used as enhanced ION kernel (no GPFS modules available)&lt;br /&gt;
&lt;br /&gt;
* 2.6.16 based kernel : Zepto ION kernel&lt;br /&gt;
** IBM V1R3 patch applied&lt;br /&gt;
** Essentially the same as the IBM ION kernel; supports GPFS&lt;br /&gt;
&lt;br /&gt;
==Kernel directory structure==&lt;br /&gt;
&lt;br /&gt;
The kernel directory basically consists of three main directories: prebuilt, config and tarball.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
kernel&lt;br /&gt;
|-- prebuilt&lt;br /&gt;
|   |-- 2.6.16&lt;br /&gt;
|   |   `-- ION&lt;br /&gt;
|   `-- 2.6.19&lt;br /&gt;
|       |-- CN&lt;br /&gt;
|       `-- objs&lt;br /&gt;
|-- tarball&lt;br /&gt;
`-- config&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The prebuilt directory contains prebuilt kernel images and kernel modules. While a prebuilt ION kernel &lt;br /&gt;
ELF file is provided there, no CN kernel ELF file is. This is because the CN kernel ramdisk is embedded in the kernel &lt;br /&gt;
and we wanted to provide a way to replace that ramdisk without invoking the CN build process, which requires &lt;br /&gt;
untar'ing the source code. Instead, the CN kernel objects are kept in the prebuilt directory and  &lt;br /&gt;
a CN kernel ELF is created from the objects and the ramdisk every time you invoke certain make targets.&lt;br /&gt;
&lt;br /&gt;
The tarball directory contains separate kernel tarballs for the ION and CN Linux kernels. Technically, those tarballs are snapshots of &lt;br /&gt;
the Zepto kernel git repository. The directory might also contain a .patch file that holds the difference between the last snapshot and the current git HEAD, since we wanted to avoid creating a new snapshot from git for every small modification. &lt;br /&gt;
The associated git log file can also be found in this directory. A .SNAPSHOT_HEAD file records the git revision at the time the snapshot was created; this information is used to create the patch file.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
linux-2.6.19.2-BGP-V1R3.git.log&lt;br /&gt;
linux-2.6.19.2-BGP-V1R3.patch&lt;br /&gt;
linux-2.6.19.2-BGP-V1R3.SNAPSHOT_HEAD&lt;br /&gt;
linux-2.6.19.2-BGP-V1R3.tar.bz2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The config directory contains the Linux kernel config files. A config file is associated with a tarball by kernel version, &lt;br /&gt;
e.g., bgp-cn-2.6.19.2-dot-config for linux-2.6.19.2-BGP-V1R3.tar.bz2.&lt;br /&gt;
See also the Makefile.&lt;br /&gt;
&lt;br /&gt;
==Build kernel==&lt;br /&gt;
&lt;br /&gt;
The Makefile in the kernel directory has many options. Just type 'make' and it will show you a help message. &lt;br /&gt;
&lt;br /&gt;
If you need to build (or rebuild) the kernel from the kernel source tarball, use the '''bgp-ion-linux-build''' or '''bgp-cn-linux-build''' target.&lt;br /&gt;
By default it extracts the ION or CN kernel tarball into a directory named work, applies a patch if there is one,&lt;br /&gt;
and starts the kernel build. Once the kernel has been built successfully, the kernel images&lt;br /&gt;
(in both the Zepto top directory and the tmp directory) are replaced with the newly built images. &lt;br /&gt;
The ION kernel source code is extracted into '''work/linux-2.6.16.46-297-BGP-V1R3''' and &lt;br /&gt;
the CN kernel source into '''work/linux-2.6.19.2-BGP-V1R3'''.&lt;br /&gt;
Here is an example of building and then rebuilding the kernel.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd kernel&lt;br /&gt;
$ make bgp-cn-linux-build&lt;br /&gt;
....&lt;br /&gt;
$ ls -al ../BGP-CN-zImage-with-initrd.elf&lt;br /&gt;
$ vi work/linux-2.6.19.2-BGP-V1R3/kernel/sched.c&lt;br /&gt;
$ make bgp-cn-linux-build&lt;br /&gt;
....&lt;br /&gt;
$ ls -al ../BGP-CN-zImage-with-initrd.elf&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Build kernel from Zepto kernel git repo===&lt;br /&gt;
As mentioned above, the kernel tarball is used as the source code by default. &lt;br /&gt;
If you pass '''GIT=1''' to the make command, &lt;br /&gt;
you can build directly from our Zepto kernel git tree. This is very useful for kernel development since &lt;br /&gt;
you can keep track of your modifications. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd kernel&lt;br /&gt;
$ make GIT=1 bgp-cn-linux-build&lt;br /&gt;
....&lt;br /&gt;
$ vi repo/linux-2.6.19.2-BGP-V1R3/kernel/sched.c&lt;br /&gt;
$ make GIT=1 bgp-cn-linux-build&lt;br /&gt;
....&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that repo/linux-2.6.19.2-BGP-V1R3 is a git repository cloned from&lt;br /&gt;
http://git.anl-external.org/bg-linux.repos/linux-2.6.19-BGP-V1R3.git/. &lt;br /&gt;
Our http repo is read-only, so you can't push your modifications to it. &lt;br /&gt;
Instead, you can post your git patch to the [mailto:zeptoos@mcs.anl.gov ZeptoOS mailing list]. &lt;br /&gt;
&lt;br /&gt;
See also [http://bg-linux.anl-external.org/wiki/index.php/Main_Page BG-Linux page] for our kernel git repo details.&lt;br /&gt;
&lt;br /&gt;
===Kernel config===&lt;br /&gt;
&lt;br /&gt;
Initially, config/bgp-cn-2.6.19.2-dot-config is applied to the CN Linux kernel build tree&lt;br /&gt;
and config/bgp-ion-2.6.16.46-dot-config is applied to the ION Linux kernel build tree. &lt;br /&gt;
Technically, the first time you invoke a kernel build target, the associated kernel config file is copied to .config in the&lt;br /&gt;
kernel build directory. &lt;br /&gt;
&lt;br /&gt;
Here are the locations of the kernel config files.&lt;br /&gt;
* Regular build&lt;br /&gt;
** work/build-2.6.19.2-BGP-V1R3/.config&lt;br /&gt;
** work/build-2.6.16.46-297-BGP-V1R3/.config&lt;br /&gt;
* GIT build&lt;br /&gt;
** repo/build-2.6.19.2-BGP-V1R3/.config&lt;br /&gt;
** repo/build-2.6.16.46-297-BGP-V1R3/.config&lt;br /&gt;
&lt;br /&gt;
Please note that the kernel config file is copied only once; it is not copied again until you run distclean or &lt;br /&gt;
remove the files manually.&lt;br /&gt;
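&lt;br /&gt;
The copy-once behaviour can be modelled as follows. This Python sketch only illustrates the rule described above and is not how the Makefile is actually written; ensure_config is a hypothetical helper and CONFIG_EXAMPLE is a made-up option used as stand-in file content.&lt;br /&gt;
&lt;br /&gt;
```python
import os
import shutil
import tempfile

def ensure_config(default_config, build_dir):
    '''Copy default_config to build_dir/.config unless one already exists.

    Returns True if the file was copied, False if an existing .config was kept.
    '''
    target = os.path.join(build_dir, '.config')
    if os.path.exists(target):
        return False
    os.makedirs(build_dir, exist_ok=True)
    shutil.copyfile(default_config, target)
    return True

# Demonstration in a scratch directory with stand-in file contents.
scratch = tempfile.mkdtemp()
default = os.path.join(scratch, 'bgp-cn-2.6.19.2-dot-config')
with open(default, 'w') as f:
    f.write('CONFIG_EXAMPLE=y\n')
build = os.path.join(scratch, 'work', 'build-2.6.19.2-BGP-V1R3')

print(ensure_config(default, build))  # first build: copied
print(ensure_config(default, build))  # later builds: existing .config kept
```
&lt;br /&gt;
The practical consequence is that edits to the files in the config directory have no effect on a build tree that already has a .config.&lt;br /&gt;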
&lt;br /&gt;
The bgp-cn-linux-menuconfig and bgp-ion-linux-menuconfig targets invoke the text-based Linux kernel configuration menu.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make bgp-ion-linux-menuconfig&lt;br /&gt;
$ make bgp-cn-linux-menuconfig&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For GIT build,&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make GIT=1 bgp-ion-linux-menuconfig&lt;br /&gt;
$ make GIT=1 bgp-cn-linux-menuconfig&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Those menuconfig targets never update the default kernel config files in the config directory. &lt;br /&gt;
If you want to apply a new config permanently, please copy it to the config directory by hand. &lt;br /&gt;
For example,&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cp work/build-2.6.19.2-BGP-V1R3/.config  config/bgp-cn-2.6.19.2-dot-config&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Kernel (command line) Parameters==&lt;br /&gt;
&lt;br /&gt;
In a typical server/desktop Linux environment, kernel parameters&lt;br /&gt;
are passed via a bootloader such as grub. However, the BlueGene/P boot mechanism &lt;br /&gt;
does not have such a capability. Kernel parameters are very handy since they let us change kernel behavior without a kernel rebuild, &lt;br /&gt;
so we have modified the CN Linux kernel (2.6.19) to load a kernel parameter string &lt;br /&gt;
embedded in the kernel ELF image file. &lt;br /&gt;
&lt;br /&gt;
You can add (or reset) kernel parameters in a kernel ELF file with &lt;br /&gt;
a command line tool named zkparam.py, which is located in the kernel directory.&lt;br /&gt;
Here is the synopsis of the tool.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
zkparam.py  KERNEL_ELF  [options]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you omit options, the tool shows you the current kernel parameters.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./kernel/zkparam.py  BGP-CN-zImage-with-initrd.elf  zepto_console_output=2&lt;br /&gt;
$ ./kernel/zkparam.py  BGP-CN-zImage-with-initrd.elf &lt;br /&gt;
Current Kernel Parameters:&lt;br /&gt;
 zepto_console_output=2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Zepto kernel specific kernel parameters===&lt;br /&gt;
&lt;br /&gt;
* '''zepto_debug'''=INTEGER    &lt;br /&gt;
** Specify the zepto kernel debug level&lt;br /&gt;
** 0 turns off all zepto debug messages; higher numbers give more detail&lt;br /&gt;
** default=1&lt;br /&gt;
* '''flatmemsizeMB'''=INTEGER&lt;br /&gt;
** Specify the size of flatmemory in MB.&lt;br /&gt;
** Currently the granularity of memory size is limited to 256 MB&lt;br /&gt;
** default=256  min=256  max=1792&lt;br /&gt;
* '''zepto_console_output'''=INTEGER	&lt;br /&gt;
** Specify the console output behavior. &lt;br /&gt;
** 0 disables all console output from kernel. &lt;br /&gt;
** 1 enables console output from one of the nodes (personality rank=1) &lt;br /&gt;
** 2 enables console output from all nodes&lt;br /&gt;
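&lt;br /&gt;
The value ranges listed above can be validated before a parameter string is written into a kernel image. The following Python sketch is illustrative only: flatmem_param and kernel_params are hypothetical helpers, reading the 256 MB granularity as a multiple-of-256 requirement is an assumption, and how zkparam.py accepts several parameters at once is not shown here.&lt;br /&gt;
&lt;br /&gt;
```python
def flatmem_param(size_mb):
    '''Return a flatmemsizeMB=... string, or raise ValueError if out of range.'''
    if size_mb % 256 != 0:
        # Assumption: 256 MB granularity means a multiple of 256.
        raise ValueError('flatmemsizeMB must be a multiple of 256')
    if size_mb not in range(256, 1792 + 1):
        raise ValueError('flatmemsizeMB must be between 256 and 1792')
    return 'flatmemsizeMB=%d' % size_mb

def kernel_params(flatmem_mb, debug, console):
    '''Join the three Zepto-specific parameters into one string.'''
    if console not in (0, 1, 2):
        raise ValueError('zepto_console_output must be 0, 1 or 2')
    return ' '.join([
        'zepto_debug=%d' % debug,
        flatmem_param(flatmem_mb),
        'zepto_console_output=%d' % console,
    ])

print(kernel_params(flatmem_mb=512, debug=1, console=2))
```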
&lt;br /&gt;
==Log files,etc==&lt;br /&gt;
&lt;br /&gt;
===Compute Node log===&lt;br /&gt;
&lt;br /&gt;
Debug messages from compute nodes (e.g., via printk) appear in one of the system log files.&lt;br /&gt;
The system log file is recreated at every system reset. You can find the location of the current system log file&lt;br /&gt;
by typing the following command. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ls -1 /bgsys/logs/BGP/sn*-mmcs_db_server*.log|tail -1&lt;br /&gt;
/bgsys/logs/BGP/sn1-mmcs_db_server-2009-0209-11:58:20.log&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
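&lt;br /&gt;
If you want the same lookup from a script, it can be sketched in Python. latest_log is a hypothetical helper; it relies, like the ls command above, on the timestamp in the file name sorting in chronological order, although Python's sort order may differ slightly from the locale-dependent order that ls uses.&lt;br /&gt;
&lt;br /&gt;
```python
import glob

def latest_log(pattern):
    '''Return the last path matching pattern in sorted order, or None.'''
    matches = sorted(glob.glob(pattern))
    return matches[-1] if matches else None

# Prints None on a machine that has no such log files.
print(latest_log('/bgsys/logs/BGP/sn*-mmcs_db_server*.log'))
```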
&lt;br /&gt;
Please also take a look at a convenient script BGP/packages/tools/cn-log.sh&lt;br /&gt;
&lt;br /&gt;
===ION Node log===&lt;br /&gt;
&lt;br /&gt;
Debug messages from an ION appear in the ION log files in /bgsys/logs/BGP/,&lt;br /&gt;
/bgsys/logs/BGP/R00-M0-N00-J00.log for example.&lt;br /&gt;
Each ION has its own log file. The ION to CN ratio is 1:64 on the ANL system, so if your job uses 64 (physical) nodes,&lt;br /&gt;
you have one ION log file. &lt;br /&gt;
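&lt;br /&gt;
For larger jobs the same arithmetic gives a rough count of the log files to look at. This Python sketch assumes that a job always occupies whole 64-node groups and that each group has exactly one ION; ion_log_count is a hypothetical helper and the ratio should be adjusted for systems other than the ANL one.&lt;br /&gt;
&lt;br /&gt;
```python
import math

CN_PER_ION = 64  # ratio quoted above for the ANL system

def ion_log_count(physical_nodes, cn_per_ion=CN_PER_ION):
    '''Estimate how many ION log files a job of this size touches.'''
    return math.ceil(physical_nodes / cn_per_ion)

for nodes in (64, 128, 512):
    print(nodes, ion_log_count(nodes))
```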
&lt;br /&gt;
Please also take a look at a convenient script BGP/packages/tools/ion-log.sh&lt;br /&gt;
&lt;br /&gt;
===RAS events===&lt;br /&gt;
&lt;br /&gt;
RAS messages won't appear in the log files. &lt;br /&gt;
A command line tool named bg-listevents shows you a record of RAS events.&lt;br /&gt;
Type 'bg-listevents -h' for command line parameters.&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=460</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=460"/>
		<updated>2009-04-30T19:02:08Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
To support high performance computing (HPC) applications, specifically MPI applications,  &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
MPICH in the Zepto release is mpich2-1.0.7 with the IBM patch. It is reasonably stable, and the performance &lt;br /&gt;
of MPI applications on the Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** debugging tools such as gdb, strace, etc&lt;br /&gt;
** various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not provided now&lt;br /&gt;
* No Binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that  &lt;br /&gt;
provides non-blocking operations. Please refer to the [http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki] for details. &lt;br /&gt;
DCMF in the Zepto release is 1.0.0, which is older than the DCMF in the current driver release (V1R3M0). SPI is the lowest level user space API for the Torus DMA, the collective network, BGP-specific lock mechanisms and other compute node specific implementations. There is no public documentation available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI. &lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application in the Zepto compute node environment &lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags) in the ELF header. When the kernel loads an executable, &lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from normal processes: it creates a special memory mapping &lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into &lt;br /&gt;
it. The big memory region incurs virtually no TLB misses and allows DMA operations &lt;br /&gt;
since it is an offset paged mapping instead of ordinary paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB. &lt;br /&gt;
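&lt;br /&gt;
To make the e_flags check concrete, here is a small Python sketch that reads e_flags from a 32-bit ELF header and tests one bit. It is not the zelftool implementation. The e_flags offset (36) follows the standard ELF32 header layout, but the mask is an assumption: the text does not say how the 24th bit is numbered, so ZCB_MASK below simply counts from bit 0 at the least significant end.&lt;br /&gt;
&lt;br /&gt;
```python
E_FLAGS_OFFSET = 36      # offset of e_flags in a 32-bit ELF header
ZCB_MASK = 2 ** 24       # ASSUMPTION: '24th bit' counted from bit 0 (LSB)

def read_e_flags(header):
    '''Return e_flags from the first 40 bytes of a 32-bit ELF file.'''
    if header[:4] != b'\x7fELF':
        raise ValueError('not an ELF file')
    if header[4] != 1:
        raise ValueError('not a 32-bit ELF file')
    byteorder = 'big' if header[5] == 2 else 'little'
    return int.from_bytes(header[E_FLAGS_OFFSET:E_FLAGS_OFFSET + 4], byteorder)

def has_flag(e_flags, mask=ZCB_MASK):
    '''True if the single-bit mask is set in e_flags.'''
    return (e_flags // mask) % 2 == 1

# Synthetic big-endian header, for demonstration only.
fake = bytearray(40)
fake[:6] = b'\x7fELF' + bytes([1, 2])
fake[36:40] = ZCB_MASK.to_bytes(4, 'big')
print(has_flag(read_e_flags(bytes(fake))))
```
&lt;br /&gt;
For a real executable, pass the first 40 bytes of the file (opened in binary mode) to read_e_flags.&lt;br /&gt;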
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with the Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what those scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
====A compilation example====&lt;br /&gt;
&lt;br /&gt;
Understanding the build system of a program might take some time, &lt;br /&gt;
but nothing special is needed to compile a program for the Zepto environment.&lt;br /&gt;
&lt;br /&gt;
Here is a real example of how to build a well-known parallel application, the &lt;br /&gt;
Parallel Ocean Program (POP). &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see the patch below )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
In case you can't use those compilation wrapper scripts, please make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path&lt;br /&gt;
* Don't forget to call the zelftool utility&lt;br /&gt;
** It makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
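&lt;br /&gt;
If you generate these commands from a build script, the prefix substitution can be done in one place. The Python sketch below only rebuilds the two command lines from the example above for a given prefix; zepto_link_command and zelftool_command are hypothetical helpers, and /path/to/zepto is a placeholder, not a real install path.&lt;br /&gt;
&lt;br /&gt;
```python
def zepto_link_command(prefix, source, output):
    '''Build the compile-and-link argv from the example for a given prefix.'''
    return [
        '/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc',
        '-o', output, '-Wall', '-O3',
        '-I' + prefix + '/include/', source,
        '-L' + prefix + '/lib/',
        '-lmpich.zcl', '-ldcmfcoll.zcl', '-ldcmf.zcl', '-lSPI.zcl', '-lzcl',
        '-lzoid_cn', '-lrt', '-lpthread', '-lm',
    ]

def zelftool_command(prefix, output):
    '''Build the zelftool argv that marks the executable as a ZCB.'''
    return [prefix + '/bin/zelftool', '-e', output]

prefix = '/path/to/zepto'   # placeholder, not a real install path
print(' '.join(zepto_link_command(prefix, 'mpi-test.c', 'mpi-test-linux')))
print(' '.join(zelftool_command(prefix, 'mpi-test-linux')))
```
&lt;br /&gt;
The library order is kept exactly as in the example because the libraries are static archives, where link order can matter.&lt;br /&gt;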
&lt;br /&gt;
&lt;br /&gt;
The file layout in the zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code needed to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It may take half an hour to an hour to complete the build process, depending on what file system you are using.&lt;br /&gt;
For example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. It copies &lt;br /&gt;
the built libraries and header files to the comm/tmp directory temporarily. &lt;br /&gt;
If you need to apply newly built libraries, do the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The update-prebuilt target basically copies the files from the comm/tmp directory to the comm/prebuilt directory, and &lt;br /&gt;
the install.py script copies them from the comm/prebuilt directory to __INST_PREFIX__ and also installs the compilation wrapper scripts into __INST_PREFIX__. Templates of the compilation wrapper scripts are in the templates directory.&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that CNK's stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip an explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
and briefly describe what DCMF and SPI are here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM originally for the BlueGene architecture &lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI is designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of DCMF and SPI can be found in the comm directory. The source code of MPICH is in the archive DCMF/lib/mpich2/mpich2-1.0.7.tar.gz, which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
The DCMF source code is located in DCMF/sys/.&lt;br /&gt;
The DCMF core source code is in DCMF/sys/messaging.&lt;br /&gt;
The Component Collective Messaging Interface (CCMI) is part of DCMF and its source code is in&lt;br /&gt;
DCMF/sys/collectives. Test code can be found in DCMF/sys/collectives/tests for CCMI&lt;br /&gt;
and in DCMF/sys/messaging/tests for DCMF. These tests are good examples of DCMF/CCMI programming.&lt;br /&gt;
&lt;br /&gt;
The SPI headers are in arch-runtime/arch and the SPI source code is in arch-runtime/runtime/.&lt;br /&gt;
arch-runtime/zcl_spi contains the source code of the ZEPTO SPI layer and&lt;br /&gt;
arch-runtime/arch/include/zepto contains its header files.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
comm&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |   |-- adaptor&lt;br /&gt;
|   |   |   |-- kernel&lt;br /&gt;
|   |   |   |-- tests&lt;br /&gt;
|   |   |   `-- tools&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|   |   |   |-- devices&lt;br /&gt;
|   |   |   |-- messager&lt;br /&gt;
|   |   |   |-- protocols&lt;br /&gt;
|   |   |   |-- queueing&lt;br /&gt;
|   |   |   |-- sysdep&lt;br /&gt;
|   |   `-- tests&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=Requirements&amp;diff=457</id>
		<title>Requirements</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=Requirements&amp;diff=457"/>
		<updated>2009-04-30T18:32:45Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: New page: * Blue Gene/P system with the V1R3 driver installed  * Blue Gene/P PowerPC Front End Node  * Permission to create kernel profile ** If the system has cobalt scheduler installed and its ker...&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;* Blue Gene/P system with the V1R3 driver installed &lt;br /&gt;
* Blue Gene/P PowerPC Front End Node &lt;br /&gt;
* Permission to create kernel profile&lt;br /&gt;
** If the system has the Cobalt scheduler installed and its kernel profile feature is available, you need write permission to the profile directory.&lt;br /&gt;
** If no cobalt kernel profile feature is available, you need access permission to the service node and mmcs console.&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=ZeptoOS_Documentation&amp;diff=456</id>
		<title>ZeptoOS Documentation</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=ZeptoOS_Documentation&amp;diff=456"/>
		<updated>2009-04-30T18:25:43Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{| align=&amp;quot;left&amp;quot; border=&amp;quot;0&amp;quot; cellspacing=&amp;quot;0&amp;quot; style=&amp;quot;width:700px&amp;quot;&lt;br /&gt;
| [[Changes in V2.0]]&lt;br /&gt;
|rowspan=&amp;quot;16&amp;quot;|[[Image:Zepto.png|frameless|410px]]&lt;br /&gt;
|-&lt;br /&gt;
| [[ZeptoOS V2.0 Feature List]]&lt;br /&gt;
|- &lt;br /&gt;
| [[Requirements]]&lt;br /&gt;
|- &lt;br /&gt;
|&lt;br /&gt;
;Complete Documentation &lt;br /&gt;
# [[Introduction]]&lt;br /&gt;
# [[The Build Process]]&lt;br /&gt;
# [[Installation]]&lt;br /&gt;
# [[Kernel Profile]]&lt;br /&gt;
# [[Testing]]&lt;br /&gt;
# [[MPICH,DCMF and SPI]]&lt;br /&gt;
# [[Kernel]]&lt;br /&gt;
# [[Ramdisk]]&lt;br /&gt;
# [[ZOID]]&lt;br /&gt;
# [[(K)TAU]]&lt;br /&gt;
# [[FAQ]]&lt;br /&gt;
|-&lt;br /&gt;
| [[Limitations]]&lt;br /&gt;
|-&lt;br /&gt;
| [[Licensing]]&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.mcs.anl.gov/research/projects/zeptoos/cms/index.php/projects Overview of ZeptoOS Projects]&lt;br /&gt;
|-&lt;br /&gt;
|[[Source code documentation tags]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=454</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=454"/>
		<updated>2009-04-30T16:08:41Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Building MPICH, DCMF and SPI libraries */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications, &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
MPICH in the Zepto release is mpich2-1.0.7 with the IBM patch. It is reasonably stable, and the performance &lt;br /&gt;
of MPI applications on the Zepto compute node Linux is comparable to that on CNK.&lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** Debugging tools such as gdb, strace, etc.&lt;br /&gt;
** Various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not provided yet&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available.&lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that &lt;br /&gt;
provides non-blocking operations. Please refer to the [http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki] for details.&lt;br /&gt;
DCMF in the Zepto release is 1.0.0, which is older than the DCMF in the current driver release (V1R3M0). SPI is the lowest-level user-space API for the Torus DMA, the collective network, BGP-specific lock mechanisms, and other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which depends on SPI.&lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application in the Zepto compute node environment &lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags) in the ELF header. When the kernel loads an executable, &lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from normal processes: it creates a special memory mapping &lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into &lt;br /&gt;
it. The big memory region incurs virtually no TLB misses and allows DMA operations &lt;br /&gt;
since it uses an offset mapping instead of paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB.&lt;br /&gt;
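&lt;br /&gt;
As a rough sketch, the e_flags field can be inspected with the standard binutils readelf tool (which is not part of Zepto; mpi-test-linux is only a placeholder executable name). The exact flag value that marks a ZCB is not given on this page, so compare the output before and after running zelftool rather than relying on a specific value:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ readelf -h mpi-test-linux | grep Flags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;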
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path. We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what these scripts actually do internally, run the wrapper script with the -show option.&lt;br /&gt;
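&lt;br /&gt;
For instance, a minimal sketch, assuming the wrapper scripts sit directly under __INST_PREFIX__ (as in the POP makefile patch on this page) and using mpi-test.c as a placeholder source file:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ __INST_PREFIX__/zmpicc -show&lt;br /&gt;
$ __INST_PREFIX__/zmpicc -o mpi-test-linux mpi-test.c&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;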
&lt;br /&gt;
&lt;br /&gt;
====A compilation example====&lt;br /&gt;
&lt;br /&gt;
Understanding the build system of a program might take some time, &lt;br /&gt;
but compiling a program for the Zepto environment requires nothing special.&lt;br /&gt;
&lt;br /&gt;
Here is a real example of how to build a well-known parallel application, the &lt;br /&gt;
Parallel Ocean Program (POP).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see the patch below )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, please make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path&lt;br /&gt;
* Don't forget to call the zelftool utility, which makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout in the zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All source code needed to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour, depending on the file system you are using.&lt;br /&gt;
For example, GPFS is slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. It copies &lt;br /&gt;
the built libraries and header files to the comm/tmp directory temporarily. &lt;br /&gt;
If you need to apply newly built libraries, do the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The update-prebuilt target copies the files from the comm/tmp directory to the comm/prebuilt directory. &lt;br /&gt;
The install.py script then copies them from the comm/prebuilt directory to __INST_PREFIX__ and installs the compilation wrapper scripts there. Templates for the compilation wrapper scripts are in the templates directory.&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that the CNK stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip MPICH here since it is well-known software, &lt;br /&gt;
and briefly describe what DCMF and SPI are.&lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture&lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI is designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of DCMF and SPI can be found in the comm directory. The source code of MPICH is in the archive DCMF/lib/mpich2/mpich2-1.0.7.tar.gz, which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
The DCMF source code is located in DCMF/sys/.&lt;br /&gt;
The DCMF core source code is in DCMF/sys/messaging.&lt;br /&gt;
The Component Collective Messaging Interface (CCMI) is part of DCMF and its source code is in&lt;br /&gt;
DCMF/sys/collectives. Test code can be found in DCMF/sys/collectives/tests for CCMI&lt;br /&gt;
and in DCMF/sys/messaging/tests for DCMF. These tests are good examples of DCMF/CCMI programming.&lt;br /&gt;
&lt;br /&gt;
The SPI headers are in arch-runtime/arch and the SPI source code is in arch-runtime/runtime/.&lt;br /&gt;
arch-runtime/zcl_spi contains the source code of the ZEPTO SPI layer and&lt;br /&gt;
arch-runtime/arch/include/zepto contains its header files.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
comm&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |   |-- adaptor&lt;br /&gt;
|   |   |   |-- kernel&lt;br /&gt;
|   |   |   |-- tests&lt;br /&gt;
|   |   |   `-- tools&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|   |   |   |-- devices&lt;br /&gt;
|   |   |   |-- messager&lt;br /&gt;
|   |   |   |-- protocols&lt;br /&gt;
|   |   |   |-- queueing&lt;br /&gt;
|   |   |   |-- sysdep&lt;br /&gt;
|   |   `-- tests&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=453</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=453"/>
		<updated>2009-04-30T16:04:16Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Building MPICH, DCMF and SPI libraries */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications, &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
MPICH in the Zepto release is mpich2-1.0.7 with the IBM patch. It is reasonably stable, and the performance &lt;br /&gt;
of MPI applications on the Zepto compute node Linux is comparable to that on CNK.&lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** Debugging tools such as gdb, strace, etc.&lt;br /&gt;
** Various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not provided yet&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available.&lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that &lt;br /&gt;
provides non-blocking operations. Please refer to the [http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki] for details.&lt;br /&gt;
DCMF in the Zepto release is 1.0.0, which is older than the DCMF in the current driver release (V1R3M0). SPI is the lowest-level user-space API for the Torus DMA, the collective network, BGP-specific lock mechanisms, and other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which depends on SPI.&lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application in the Zepto compute node environment &lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags) in the ELF header. When the kernel loads an executable, &lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from normal processes: it creates a special memory mapping &lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into &lt;br /&gt;
it. The big memory region incurs virtually no TLB misses and allows DMA operations &lt;br /&gt;
since it uses an offset mapping instead of paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path. We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what these scripts actually do internally, run the wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
====A compilation example====&lt;br /&gt;
&lt;br /&gt;
Understanding the build system of a program might take some time, &lt;br /&gt;
but compiling a program for the Zepto environment requires nothing special.&lt;br /&gt;
&lt;br /&gt;
Here is a real example of how to build a well-known parallel application, the &lt;br /&gt;
Parallel Ocean Program (POP).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see the patch below )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, please make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path&lt;br /&gt;
* Don't forget to call the zelftool utility, which makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout in the zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All source code needed to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour, depending on the file system you are using.&lt;br /&gt;
For example, GPFS is slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. It copies &lt;br /&gt;
the built libraries and header files to the comm/tmp directory temporarily. &lt;br /&gt;
If you need to apply newly built libraries, do the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The update-prebuilt target copies files to the comm/prebuilt directory, and &lt;br /&gt;
the install.py script installs the built libraries, header files, and compilation scripts to __INST_PREFIX__.&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that the CNK stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip MPICH here since it is well-known software, &lt;br /&gt;
and briefly describe what DCMF and SPI are.&lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture &lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI was designed only for IBM CNK, so it is not compatible with Linux as is.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of DCMF and SPI can be found in the comm directory. The source code of MPICH is in the archive DCMF/lib/mpich2/mpich2-1.0.7.tar.gz, which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
The DCMF source code is located in DCMF/sys/. &lt;br /&gt;
The DCMF core source code is in DCMF/sys/messaging.&lt;br /&gt;
The Component Collective Messaging Interface (CCMI) is part of DCMF, and its source code is in&lt;br /&gt;
DCMF/sys/collectives. Test codes can be found in DCMF/sys/collectives/tests for CCMI&lt;br /&gt;
and in DCMF/sys/messaging/tests for the DCMF core. These test codes are good examples of DCMF/CCMI programming.&lt;br /&gt;
&lt;br /&gt;
The SPI headers are in arch-runtime/arch, and the SPI source code is in comm/arch-runtime/runtime/.&lt;br /&gt;
arch-runtime/zcl_spi contains the source code of the ZEPTO SPI layer, and&lt;br /&gt;
arch-runtime/arch/include/zepto contains its header files.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
comm&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |   |-- adaptor&lt;br /&gt;
|   |   |   |-- kernel&lt;br /&gt;
|   |   |   |-- tests&lt;br /&gt;
|   |   |   `-- tools&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|   |   |   |-- devices&lt;br /&gt;
|   |   |   |-- messager&lt;br /&gt;
|   |   |   |-- protocols&lt;br /&gt;
|   |   |   |-- queueing&lt;br /&gt;
|   |   |   |-- sysdep&lt;br /&gt;
|   |   `-- tests&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=452</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=452"/>
		<updated>2009-04-30T15:44:00Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Source code */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications, &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
MPICH in the Zepto release is mpich2-1.0.7 with IBM's patch. It is reasonably stable, and the performance &lt;br /&gt;
of MPI applications on Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** Debugging tools such as gdb, strace, etc.&lt;br /&gt;
** Various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not provided yet&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that &lt;br /&gt;
provides non-blocking operations. Please refer to the [[http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki]] for details. &lt;br /&gt;
DCMF in the Zepto release is 1.0.0, which is older than the DCMF in the current driver release (V1R3M0). SPI is the lowest-level user-space API for the Torus DMA, the collective network, BGP-specific lock mechanisms, and other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI. &lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application in the Zepto compute node environment &lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags field) of the ELF header. When the kernel loads an executable, &lt;br /&gt;
it examines this bit first, and it treats a ZCB executable differently from a normal process. The kernel creates a special memory mapping &lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into &lt;br /&gt;
it. The big memory region incurs virtually no TLB misses and allows DMA operations &lt;br /&gt;
since it is an offset-based mapping instead of ordinary paged memory. Because of big memory, some system calls, such as fork, cannot be used from a ZCB. &lt;br /&gt;
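&lt;br /&gt;
The e_flags check described above can be sketched in Python. This is a minimal sketch, not the actual logic of the Zepto kernel or of zelftool: the function names are hypothetical, and reading the 24th bit as zero-based bit index 24 is an assumption that should be checked against the Zepto sources. The header offsets follow the standard ELF layout.&lt;br /&gt;

```python
ELF_MAGIC = b"\x7fELF"
ZCB_FLAG_BIT = 24  # assumption: zero-based index of the ZCB bit in e_flags


def read_e_flags(header):
    """Return e_flags from the leading bytes of an ELF file."""
    if header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                          # EI_CLASS: 1 = ELF32, 2 = ELF64
    byteorder = "big" if header[5] == 2 else "little"  # EI_DATA: 2 = big-endian
    offset = 48 if is_64bit else 36                    # e_flags follows e_shoff
    field = header[offset:offset + 4]
    if len(field) != 4:
        raise ValueError("ELF header is truncated")
    return int.from_bytes(field, byteorder)


def has_flag_bit(e_flags, bit=ZCB_FLAG_BIT):
    """Report whether the given bit of e_flags is set."""
    return (e_flags // (2 ** bit)) % 2 == 1


if __name__ == "__main__":
    # Synthetic 32-bit big-endian header with only the assumed bit set.
    demo = bytearray(52)
    demo[:4] = ELF_MAGIC
    demo[4], demo[5] = 1, 2
    demo[36:40] = (2 ** ZCB_FLAG_BIT).to_bytes(4, "big")
    print(has_flag_bit(read_e_flags(bytes(demo))))
```

To inspect a real executable, read its first 64 bytes in binary mode and pass them to read_e_flags.&lt;br /&gt;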
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with the Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path. We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile &lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To see what these scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
====A compilation example====&lt;br /&gt;
&lt;br /&gt;
Understanding a program's build system might take some time, &lt;br /&gt;
but nothing special is needed to compile a program for the Zepto environment.&lt;br /&gt;
&lt;br /&gt;
Here is a real example of how to build a well-known parallel application called the &lt;br /&gt;
Parallel Ocean Program (POP). &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see the patch below )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, please make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path&lt;br /&gt;
* Don't forget to call the zelftool utility,&lt;br /&gt;
** which makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
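&lt;br /&gt;
For build systems driven from a script, the manual compile and zelftool steps above can be assembled programmatically. The following is a sketch under stated assumptions: the function name zepto_link_command is hypothetical, and the compiler path, flags, and library order are copied from the example above rather than taken from the wrapper scripts themselves.&lt;br /&gt;

```python
# Compiler path taken from the manual example above.
DEFAULT_CC = "/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc"

# Library order mirrors the example; order matters for static linking.
ZEPTO_LIBS = ["mpich.zcl", "dcmfcoll.zcl", "dcmf.zcl", "SPI.zcl", "zcl",
              "zoid_cn", "rt", "pthread", "m"]


def zepto_link_command(inst_prefix, source, output, compiler=DEFAULT_CC):
    """Return (compile_cmd, zelftool_cmd) as argument lists.

    Hypothetical helper; inst_prefix stands in for __INST_PREFIX__.
    """
    compile_cmd = [compiler, "-o", output, "-Wall", "-O3",
                   "-I" + inst_prefix + "/include/", source,
                   "-L" + inst_prefix + "/lib/"]
    compile_cmd += ["-l" + name for name in ZEPTO_LIBS]
    # zelftool marks the result as a Zepto Compute Binary.
    zelftool_cmd = [inst_prefix + "/bin/zelftool", "-e", output]
    return compile_cmd, zelftool_cmd


if __name__ == "__main__":
    # "/path/to/zepto" is a placeholder install prefix.
    for cmd in zepto_link_command("/path/to/zepto", "mpi-test.c", "mpi-test-linux"):
        print(" ".join(cmd))
```

Returning the zelftool command together with the compile command makes it harder to forget the second step.&lt;br /&gt;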
&lt;br /&gt;
&lt;br /&gt;
The file layout in the Zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code needed to build MPICH, DCMF, and SPI is included.&lt;br /&gt;
To build these libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour, depending on the file system you are using.&lt;br /&gt;
For example, GPFS is slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. To apply the newly compiled libraries to your installation,&lt;br /&gt;
run the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that the CNK stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip an explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
and briefly describe DCMF and SPI here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture &lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI was designed only for IBM CNK, so it is not compatible with Linux as is.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of DCMF and SPI can be found in the comm directory. The source code of MPICH is in the archive DCMF/lib/mpich2/mpich2-1.0.7.tar.gz, which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
The DCMF source code is located in DCMF/sys/. &lt;br /&gt;
The DCMF core source code is in DCMF/sys/messaging.&lt;br /&gt;
The Component Collective Messaging Interface (CCMI) is part of DCMF, and its source code is in&lt;br /&gt;
DCMF/sys/collectives. Test codes can be found in DCMF/sys/collectives/tests for CCMI&lt;br /&gt;
and in DCMF/sys/messaging/tests for the DCMF core. These test codes are good examples of DCMF/CCMI programming.&lt;br /&gt;
&lt;br /&gt;
The SPI headers are in arch-runtime/arch, and the SPI source code is in comm/arch-runtime/runtime/.&lt;br /&gt;
arch-runtime/zcl_spi contains the source code of the ZEPTO SPI layer, and&lt;br /&gt;
arch-runtime/arch/include/zepto contains its header files.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
comm&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |   |-- adaptor&lt;br /&gt;
|   |   |   |-- kernel&lt;br /&gt;
|   |   |   |-- tests&lt;br /&gt;
|   |   |   `-- tools&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|   |   |   |-- devices&lt;br /&gt;
|   |   |   |-- messager&lt;br /&gt;
|   |   |   |-- protocols&lt;br /&gt;
|   |   |   |-- queueing&lt;br /&gt;
|   |   |   |-- sysdep&lt;br /&gt;
|   |   `-- tests&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=451</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=451"/>
		<updated>2009-04-30T15:33:42Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Source code */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications, &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
MPICH in the Zepto release is mpich2-1.0.7 with IBM's patch. It is reasonably stable, and the performance &lt;br /&gt;
of MPI applications on Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** Debugging tools such as gdb, strace, etc.&lt;br /&gt;
** Various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not provided yet&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that &lt;br /&gt;
provides non-blocking operations. Please refer to the [[http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki]] for details. &lt;br /&gt;
DCMF in the Zepto release is 1.0.0, which is older than the DCMF in the current driver release (V1R3M0). SPI is the lowest-level user-space API for the Torus DMA, the collective network, BGP-specific lock mechanisms, and other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI. &lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application in the Zepto compute node environment &lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags field) of the ELF header. When the kernel loads an executable, &lt;br /&gt;
it examines this bit first, and it treats a ZCB executable differently from a normal process. The kernel creates a special memory mapping &lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into &lt;br /&gt;
it. The big memory region incurs virtually no TLB misses and allows DMA operations &lt;br /&gt;
since it is an offset-based mapping instead of ordinary paged memory. Because of big memory, some system calls, such as fork, cannot be used from a ZCB. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with the Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path. We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile &lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To see what these scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
====A compilation example====&lt;br /&gt;
&lt;br /&gt;
Understanding a program's build system might take some time, &lt;br /&gt;
but nothing special is needed to compile a program for the Zepto environment.&lt;br /&gt;
&lt;br /&gt;
Here is a real example of how to build a well-known parallel application called the &lt;br /&gt;
Parallel Ocean Program (POP). &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see the patch below )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, please make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path&lt;br /&gt;
* Don't forget to call the zelftool utility,&lt;br /&gt;
** which makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout in the Zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code needed to build MPICH, DCMF, and SPI is included.&lt;br /&gt;
To build these libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour, depending on the file system you are using.&lt;br /&gt;
For example, GPFS is slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. To apply the newly compiled libraries to your installation,&lt;br /&gt;
run the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that the CNK stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip an explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
and briefly describe DCMF and SPI here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture &lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI was designed only for IBM CNK, so it is not compatible with Linux as is.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of DCMF and SPI can be found in the comm directory. The source code of MPICH is in the archive comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz, which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
The DCMF source code is located in comm/DCMF/sys/. &lt;br /&gt;
The DCMF core source code is in comm/DCMF/sys/messaging.&lt;br /&gt;
The Component Collective Messaging Interface (CCMI) is part of DCMF, and its source code is in&lt;br /&gt;
comm/DCMF/sys/collectives.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
comm&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|   |   |   |-- devices&lt;br /&gt;
|   |   |   |-- messager&lt;br /&gt;
|   |   |   |-- protocols&lt;br /&gt;
|   |   |   |-- queueing&lt;br /&gt;
|   |   |   |-- sysdep&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=450</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=450"/>
		<updated>2009-04-30T14:35:40Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* A compilation example */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, MPI applications in particular,&lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
MPICH in the Zepto release is mpich2-1.0.7 with the IBM patch. It is reasonably stable, and the performance&lt;br /&gt;
of MPI applications on Zepto compute node Linux is comparable to that on CNK.&lt;br /&gt;
Although the port currently has some limitations, it also offers some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** debugging tools such as gdb, strace, etc.&lt;br /&gt;
** various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not yet provided&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available.&lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that&lt;br /&gt;
provides non-blocking operations. Please refer to the [[http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki]] for details.&lt;br /&gt;
DCMF in the Zepto release is 1.0.0, which is older than the DCMF in the current driver release (V1R3M0). SPI is the lowest-level user-space API for the Torus DMA, the collective network, BGP-specific lock mechanisms and other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI.&lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application on the Zepto compute node environment&lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags) in its ELF header. When the kernel loads an executable,&lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from a normal process: it creates a special memory mapping&lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into&lt;br /&gt;
the big memory region. The big memory region incurs virtually no TLB misses and allows DMA operations&lt;br /&gt;
since it is an offset paged mapping instead of paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that&lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path. We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what those scripts actually do internally, run the wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
====A compilation example====&lt;br /&gt;
&lt;br /&gt;
Understanding a program's build system might take some time,&lt;br /&gt;
but nothing special is required to compile a program for the Zepto environment.&lt;br /&gt;
&lt;br /&gt;
Here is a real example of how to build a well-known parallel application, the&lt;br /&gt;
Parallel Ocean Program (POP).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see the patch below )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, please make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path&lt;br /&gt;
* Don't forget to call the zelftool utility&lt;br /&gt;
** It makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout in the Zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code necessary to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour to complete, depending on the file system you are using.&lt;br /&gt;
For example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. To apply the newly compiled libraries to your installation,&lt;br /&gt;
run the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that CNK's stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip an explanation of MPICH since it is a well-known piece of software,&lt;br /&gt;
and briefly describe DCMF and SPI here.&lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM originally for the Blue Gene architecture&lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM; BGP-specific code&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI is designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
The MPICH source code is in the archive comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz, which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=449</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=449"/>
		<updated>2009-04-30T14:33:54Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Source code */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, MPI applications in particular,&lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
MPICH in the Zepto release is mpich2-1.0.7 with the IBM patch. It is reasonably stable, and the performance&lt;br /&gt;
of MPI applications on Zepto compute node Linux is comparable to that on CNK.&lt;br /&gt;
Although the port currently has some limitations, it also offers some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** debugging tools such as gdb, strace, etc.&lt;br /&gt;
** various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not yet provided&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available.&lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that&lt;br /&gt;
provides non-blocking operations. Please refer to the [[http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki]] for details.&lt;br /&gt;
DCMF in the Zepto release is 1.0.0, which is older than the DCMF in the current driver release (V1R3M0). SPI is the lowest-level user-space API for the Torus DMA, the collective network, BGP-specific lock mechanisms and other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI.&lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application on the Zepto compute node environment&lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags) in its ELF header. When the kernel loads an executable,&lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from a normal process: it creates a special memory mapping&lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into&lt;br /&gt;
the big memory region. The big memory region incurs virtually no TLB misses and allows DMA operations&lt;br /&gt;
since it is an offset paged mapping instead of paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that&lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path. We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what those scripts actually do internally, run the wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===A compilation example===&lt;br /&gt;
&lt;br /&gt;
Understanding a program's build system might take some time,&lt;br /&gt;
but nothing special is required to compile a program for the Zepto environment.&lt;br /&gt;
&lt;br /&gt;
Here is a real example of how to build a well-known parallel application, the&lt;br /&gt;
Parallel Ocean Program (POP).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see the patch below )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, please make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path&lt;br /&gt;
* Don't forget to call the zelftool utility&lt;br /&gt;
** It makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout in the Zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code necessary to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour to complete, depending on the file system you are using.&lt;br /&gt;
For example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. To apply the newly compiled libraries to your installation,&lt;br /&gt;
run the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that CNK's stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip an explanation of MPICH since it is a well-known piece of software,&lt;br /&gt;
and briefly describe DCMF and SPI here.&lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM originally for the Blue Gene architecture&lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM; BGP-specific code&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI is designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
The MPICH source code is in the archive comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz, which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=448</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=448"/>
		<updated>2009-04-30T03:53:55Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, MPI applications in particular,&lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
MPICH in the Zepto release is mpich2-1.0.7 with the IBM patch. It is reasonably stable, and the performance&lt;br /&gt;
of MPI applications on Zepto compute node Linux is comparable to that on CNK.&lt;br /&gt;
Although the port currently has some limitations, it also offers some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** debugging tools such as gdb, strace, etc.&lt;br /&gt;
** various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not yet provided&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available.&lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that&lt;br /&gt;
provides non-blocking operations. Please refer to the [[http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki]] for details.&lt;br /&gt;
DCMF in the Zepto release is 1.0.0, which is older than the DCMF in the current driver release (V1R3M0). SPI is the lowest-level user-space API for the Torus DMA, the collective network, BGP-specific lock mechanisms and other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI.&lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application on the Zepto compute node environment&lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags) in its ELF header. When the kernel loads an executable,&lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from a normal process: it creates a special memory mapping&lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into&lt;br /&gt;
the big memory region. The big memory region incurs virtually no TLB misses and allows DMA operations&lt;br /&gt;
since it is an offset paged mapping instead of paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that&lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path. We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what those scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===A compilation example===&lt;br /&gt;
&lt;br /&gt;
Understanding a program's build system might take some time, &lt;br /&gt;
but nothing special is needed to compile a program for the Zepto environment.&lt;br /&gt;
&lt;br /&gt;
Here is a real example of how to build a well-known parallel application, the &lt;br /&gt;
Parallel Ocean Program (POP). &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see the patch below )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you cannot use the compilation wrapper scripts, make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path.&lt;br /&gt;
* Don't forget to call the zelftool utility,&lt;br /&gt;
** which makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
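The manual compile and zelftool steps shown above can also be generated for a given install prefix. The following Python sketch is only an illustration: &lt;br /&gt;
the compiler path, flags and library order are copied from the example, while build_commands and the sample prefix are hypothetical names. &lt;br /&gt;

```python
# Sketch only: assemble the manual compile and zelftool commands shown above
# for a given install prefix. The compiler path and flags are copied from the
# example; build_commands and the sample prefix are hypothetical.
GCC = "/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc"
ZEPTO_LIBS = ["mpich.zcl", "dcmfcoll.zcl", "dcmf.zcl", "SPI.zcl", "zcl", "zoid_cn"]
SYSTEM_LIBS = ["rt", "pthread", "m"]


def build_commands(prefix, source, output):
    """Return (compile_argv, zelftool_argv) for one C source file."""
    compile_argv = [GCC, "-o", output, "-Wall", "-O3",
                    "-I" + prefix + "/include/", source,
                    "-L" + prefix + "/lib/"]
    # Keep the library order of the example: Zepto libraries first.
    compile_argv += ["-l" + name for name in ZEPTO_LIBS + SYSTEM_LIBS]
    zelftool_argv = [prefix + "/bin/zelftool", "-e", output]
    return compile_argv, zelftool_argv


if __name__ == "__main__":
    # "/path/to/zepto" stands in for __INST_PREFIX__.
    for argv in build_commands("/path/to/zepto", "mpi-test.c", "mpi-test-linux"):
        print(" ".join(argv))
```

Returning argument lists rather than one shell string keeps the two steps separate, so the zelftool step is harder to forget when the commands are passed to a build script. &lt;br /&gt;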
&lt;br /&gt;
The file layout in the zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
We include all the source code necessary to build MPICH, DCMF and SPI.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour to complete, depending on the file system you are using;&lt;br /&gt;
for example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. To apply the newly compiled libraries to it,&lt;br /&gt;
do the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as the IBM CNK stack, except that the CNK stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip an explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
but briefly describe what DCMF and SPI are here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Originally developed by IBM for the Blue Gene architecture &lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI is basically designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code or header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Technically, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz), which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=447</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=447"/>
		<updated>2009-04-30T03:49:13Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* A compilation example */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high performance computing (HPC) applications, specifically MPI applications,  &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
It is reasonably stable, and the performance of MPI applications on Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** debugging tools such as gdb, strace, etc.&lt;br /&gt;
** various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not provided yet&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that  &lt;br /&gt;
provides non-blocking operations. Please refer to the [http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki] for details. SPI is the lowest-level user-space API for Torus DMA, the collective network, BGP-specific lock mechanisms and &lt;br /&gt;
other compute-node-specific implementations. There is no public documentation available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which depends on SPI. &lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation as such, but an MPI application in the Zepto compute node environment &lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags field) of its ELF header. When the kernel loads an executable, &lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from a normal process: it creates a special memory mapping &lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into &lt;br /&gt;
it. Accesses to the big memory region incur virtually no TLB misses, and the region allows DMA operations &lt;br /&gt;
because it is an offset mapping rather than paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (listed below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what those scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===A compilation example===&lt;br /&gt;
&lt;br /&gt;
Understanding a program's build system might take some time, &lt;br /&gt;
but nothing special is needed to compile a program for the Zepto environment.&lt;br /&gt;
&lt;br /&gt;
Here is a real example of how to build a well-known parallel application, the &lt;br /&gt;
Parallel Ocean Program (POP). &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see the patch below )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you cannot use the compilation wrapper scripts, make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path.&lt;br /&gt;
* Don't forget to call the zelftool utility,&lt;br /&gt;
** which makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout in the zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
We include all the source code necessary to build MPICH, DCMF and SPI.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour to complete, depending on the file system you are using;&lt;br /&gt;
for example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. To apply the newly compiled libraries to it,&lt;br /&gt;
do the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as the IBM CNK stack, except that the CNK stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip an explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
but briefly describe what DCMF and SPI are here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Originally developed by IBM for the Blue Gene architecture &lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI is basically designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code or header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Technically, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz), which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=446</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=446"/>
		<updated>2009-04-30T03:42:32Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* A compilation example */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high performance computing (HPC) applications, specifically MPI applications,  &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
It is reasonably stable, and the performance of MPI applications on Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** debugging tools such as gdb, strace, etc.&lt;br /&gt;
** various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not provided yet&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that  &lt;br /&gt;
provides non-blocking operations. Please refer to the [http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki] for details. SPI is the lowest-level user-space API for Torus DMA, the collective network, BGP-specific lock mechanisms and &lt;br /&gt;
other compute-node-specific implementations. There is no public documentation available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which depends on SPI. &lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation as such, but an MPI application in the Zepto compute node environment &lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in e_flags (the processor-specific flags field) of its ELF header. When the kernel loads an executable, &lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from a normal process: it creates a special memory mapping &lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into &lt;br /&gt;
it. Accesses to the big memory region incur virtually no TLB misses, and the region allows DMA operations &lt;br /&gt;
because it is an offset mapping rather than paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (listed below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what those scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===A compilation example===&lt;br /&gt;
&lt;br /&gt;
Nothing special is actually needed to compile a program for the Zepto environment,&lt;br /&gt;
but a real example can be helpful.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see below  as an example )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you cannot use the compilation wrapper scripts, make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path.&lt;br /&gt;
* Don't forget to call the zelftool utility,&lt;br /&gt;
** which makes your executable a Zepto Compute Binary so that the Zepto kernel loads all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout in the zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
We include all the source code necessary to build MPICH, DCMF and SPI.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour, depending on the file system you are using;&lt;br /&gt;
for example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not update your installation. To install the newly compiled libraries,&lt;br /&gt;
follow these steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that CNK's stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip the explanation of MPICH, since it is a well-known piece of software, &lt;br /&gt;
and briefly describe DCMF and SPI here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture &lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI is designed essentially only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Strictly speaking, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz), which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=445</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=445"/>
		<updated>2009-04-30T03:40:21Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Compilation wrapper scripts */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications,  &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
It is reasonably stable, and the performance of MPI applications on Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limit on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** Debugging tools such as gdb, strace, etc.&lt;br /&gt;
** Various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not yet provided&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that  &lt;br /&gt;
provides non-blocking operations. Please refer to the [[http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki]] for details. SPI is the lowest-level user-space API for Torus DMA, the collective network, BGP-specific lock mechanisms and &lt;br /&gt;
other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which depends on SPI. &lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application on the Zepto compute node environment &lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in the e_flags field (processor-specific flags) of its ELF header. When the kernel loads an executable, &lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from a normal process: it creates a special memory mapping &lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into &lt;br /&gt;
it. The big memory region incurs virtually no TLB misses and allows DMA operations, &lt;br /&gt;
since it is an offset mapping rather than paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB. &lt;br /&gt;
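&lt;br /&gt;
As a rough illustration of the e_flags check described above, the following Python sketch reads the e_flags field from an ELF header and tests one bit.&lt;br /&gt;
The function names read_e_flags and bit_is_set are made up for this example and are not part of Zepto.&lt;br /&gt;
The text does not say how the "24th bit" is numbered, so the bit index is a parameter; the value 24 used here (zero-based, counted from the least significant bit) is only an assumption.&lt;br /&gt;
The header offsets (36 for ELF32, 48 for ELF64) follow the generic ELF header layout.&lt;br /&gt;

```python
import sys


def read_e_flags(path):
    """Return the e_flags field of the ELF header stored in the file at path."""
    with open(path, "rb") as f:
        header = f.read(64)  # large enough for an ELF32 or ELF64 header
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: " + path)
    # e_ident[4] is EI_CLASS (1 = 32-bit, 2 = 64-bit);
    # e_flags sits at offset 36 in an ELF32 header and 48 in an ELF64 header.
    offset = {1: 36, 2: 48}[header[4]]
    # e_ident[5] is EI_DATA (1 = little-endian, 2 = big-endian).
    byteorder = {1: "little", 2: "big"}[header[5]]
    raw = header[offset:offset + 4]
    if len(raw) != 4:
        raise ValueError("truncated ELF header: " + path)
    return int.from_bytes(raw, byteorder)


def bit_is_set(value, bit):
    """Return True if the zero-based bit (counted from the least significant) is set."""
    return (value // 2 ** bit) % 2 == 1


if __name__ == "__main__":
    flags = read_e_flags(sys.argv[1])
    # ASSUMPTION: "24th bit" is read here as zero-based bit index 24.
    print(hex(flags), bit_is_set(flags, 24))
```

Run it with the path of an executable as the only argument. To decide whether a binary really is a ZCB, rely on zelftool and the Zepto kernel sources rather than on this sketch.&lt;br /&gt;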
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To see what these scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===A compilation example===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ wget http://climate.lanl.gov/Models/POP/POP_2.0.1.tar.Z&lt;br /&gt;
$ tar xvfz POP_2.0.1.tar.Z ; cd pop&lt;br /&gt;
$  ./setup_run_dir ztest ; cd ztest&lt;br /&gt;
$ edit ibm_mpi.gnu  ( see below  as an example )&lt;br /&gt;
$ export ARCHDIR=ibm_mpi&lt;br /&gt;
$ make   # wait for a while&lt;br /&gt;
$ edit  pop_in   # test data set&lt;br /&gt;
-  nprocs_clinic = 4&lt;br /&gt;
-  nprocs_tropic = 4&lt;br /&gt;
+  nprocs_clinic = 64&lt;br /&gt;
+  nprocs_tropic = 64&lt;br /&gt;
$ cqsub -n 64 -t 8 -k your_zepto_profile  ./pop&lt;br /&gt;
&lt;br /&gt;
--------------------&lt;br /&gt;
--- orig/ibm_mpi.gnu    2009-04-15 15:01:58.666457601 -0500&lt;br /&gt;
+++ ztest/ibm_mpi.gnu    2009-04-15 14:17:58.099132435 -0500&lt;br /&gt;
@@ -6,17 +6,18 @@&lt;br /&gt;
# will someday be a file which is a cookbook in Q&amp;amp;A style: &amp;quot;How do I do X?&amp;quot;&lt;br /&gt;
# is followed by something like &amp;quot;Go to file Y and add Z to line NNN.&amp;quot;&lt;br /&gt;
#&lt;br /&gt;
-FC = mpxlf90_r&lt;br /&gt;
-LD = mpxlf90_r&lt;br /&gt;
-CC = mpcc_r&lt;br /&gt;
-Cp = /usr/bin/cp&lt;br /&gt;
-Cpp = /usr/ccs/lib/cpp -P&lt;br /&gt;
+ZPATH=__INST_PREFIX__&lt;br /&gt;
+FC = $(ZPATH)/zmpixlf90&lt;br /&gt;
+LD = $(ZPATH)/zmpixlf90&lt;br /&gt;
+CC = $(ZPATH)/zmpixlc&lt;br /&gt;
+Cp = //bin/cp&lt;br /&gt;
+Cpp = /usr/bin/cpp -P&lt;br /&gt;
AWK = /usr/bin/awk&lt;br /&gt;
-ABI = -q64&lt;br /&gt;
+#ABI = -q64&lt;br /&gt;
COMMDIR = mpi&lt;br /&gt;
&lt;br /&gt;
-NETCDFINC = -I/usr/local/include&lt;br /&gt;
-NETCDFLIB = -L/usr/local/lib&lt;br /&gt;
+NETCDFINC = -I/soft/apps/netcdf-4.0/include/&lt;br /&gt;
+NETCDFLIB = -L/soft/apps/netcdf-4.0/lib&lt;br /&gt;
&lt;br /&gt;
#  Enable MPI library for parallel code, yes/no.&lt;br /&gt;
&lt;br /&gt;
@@ -58,7 +59,8 @@&lt;br /&gt;
#&lt;br /&gt;
#----------------------------------------------------------------------------&lt;br /&gt;
&lt;br /&gt;
-FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+#FBASE = $(ABI) -qarch=auto -qnosave -bmaxdata:0x80000000 $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
+FBASE = $(ABI) -qarch=auto -qnosave  $(NETCDFINC) -I$(ObjDepDir)&lt;br /&gt;
&lt;br /&gt;
ifeq ($(TRAP_FPE),yes)&lt;br /&gt;
  FBASE := $(FBASE) -qflttrap=overflow:zerodivide:enable -qspillsize=32704&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path.&lt;br /&gt;
* Don't forget to run the zelftool utility,&lt;br /&gt;
** which marks your executable as a Zepto Compute Binary so that the Zepto kernel loads&lt;br /&gt;
all application segments into the big memory area.&lt;br /&gt;
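&lt;br /&gt;
If you drive the build from a script, the two commands above can be generated from the install prefix. The following Python sketch is one possible way to do that, not an official Zepto tool.&lt;br /&gt;
The compiler path, flags and library order are copied from the example above; the function name build_commands and the placeholder prefix /path/to/zepto are assumptions made for illustration.&lt;br /&gt;

```python
import shlex

# Cross compiler path and library order are copied from the example above.
COMPILER = "/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc"
ZEPTO_LIBS = ["mpich.zcl", "dcmfcoll.zcl", "dcmf.zcl", "SPI.zcl", "zcl",
              "zoid_cn", "rt", "pthread", "m"]


def build_commands(prefix, source, output):
    """Return the compile/link command and the zelftool command as argv lists."""
    prefix = prefix.rstrip("/")
    compile_cmd = [COMPILER, "-o", output, "-Wall", "-O3",
                   "-I" + prefix + "/include/", source,
                   "-L" + prefix + "/lib/"]
    compile_cmd += ["-l" + lib for lib in ZEPTO_LIBS]
    # zelftool must run after linking to mark the binary as a ZCB.
    zelftool_cmd = [prefix + "/bin/zelftool", "-e", output]
    return compile_cmd, zelftool_cmd


if __name__ == "__main__":
    # "/path/to/zepto" is a placeholder, not a real default.
    for cmd in build_commands("/path/to/zepto", "mpi-test.c", "mpi-test-linux"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The script only prints the commands; pass the returned lists to subprocess.run if you want to execute them.&lt;br /&gt;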
&lt;br /&gt;
&lt;br /&gt;
The file layout in the zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
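&lt;br /&gt;
Before building against an install path, it can be useful to confirm that the key files from the layout above are present. The Python sketch below checks a subset of them.&lt;br /&gt;
The list EXPECTED and the function name missing_files are assumptions made for this example; extend the list with the Fortran or C++ libraries if you need them.&lt;br /&gt;

```python
import os
import sys

# A subset of the files shown in the layout above (C toolchain only).
EXPECTED = [
    "bin/zelftool",
    "include/mpi.h",
    "include/dcmf.h",
    "lib/libmpich.zcl.a",
    "lib/libdcmfcoll.zcl.a",
    "lib/libdcmf.zcl.a",
    "lib/libSPI.zcl.a",
    "lib/libzcl.a",
    "lib/libzoid_cn.a",
]


def missing_files(prefix):
    """Return the expected files that do not exist under the install prefix."""
    return [rel for rel in EXPECTED
            if not os.path.isfile(os.path.join(prefix, rel))]


if __name__ == "__main__":
    missing = missing_files(sys.argv[1])
    for rel in missing:
        print("missing:", rel)
    sys.exit(1 if missing else 0)
```

Pass your Zepto install path as the only argument; the exit status is non-zero if any listed file is missing.&lt;br /&gt;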
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code needed to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour, depending on the file system you are using;&lt;br /&gt;
for example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not update your installation. To install the newly compiled libraries,&lt;br /&gt;
follow these steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that CNK's stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip the explanation of MPICH, since it is a well-known piece of software, &lt;br /&gt;
and briefly describe DCMF and SPI here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture &lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI is designed essentially only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Strictly speaking, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz), which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=444</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=444"/>
		<updated>2009-04-30T03:30:37Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications,  &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
It is reasonably stable, and the performance of MPI applications on Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limit on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** Debugging tools such as gdb, strace, etc.&lt;br /&gt;
** Various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not yet provided&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that  &lt;br /&gt;
provides non-blocking operations. Please refer to the [[http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki]] for details. SPI is the lowest-level user-space API for Torus DMA, the collective network, BGP-specific lock mechanisms and &lt;br /&gt;
other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which depends on SPI. &lt;br /&gt;
&lt;br /&gt;
===ZCB and Big memory===&lt;br /&gt;
This is not a limitation, but an MPI application on the Zepto compute node environment &lt;br /&gt;
(technically, any application that requires DMA operations and maximum memory bandwidth) needs to be a Zepto Compute Binary (ZCB).&lt;br /&gt;
A ZCB has the 24th bit set in the e_flags field (processor-specific flags) of its ELF header. When the kernel loads an executable, &lt;br /&gt;
it examines this bit first. The kernel treats a ZCB executable differently from a normal process: it creates a special memory mapping &lt;br /&gt;
called the big memory region, which is covered by large pages and semi-statically pinned down, and loads all application sections into &lt;br /&gt;
it. The big memory region incurs virtually no TLB misses and allows DMA operations, &lt;br /&gt;
since it is an offset mapping rather than paged memory. Because of big memory, some system calls, such as fork, are not usable from a ZCB. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To see what these scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path.&lt;br /&gt;
* Don't forget to run the zelftool utility,&lt;br /&gt;
** which marks your executable as a Zepto Compute Binary so that the Zepto kernel loads&lt;br /&gt;
all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout in the zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code needed to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour, depending on the file system you are using;&lt;br /&gt;
for example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not update your installation. To install the newly compiled libraries,&lt;br /&gt;
follow these steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that CNK's stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip the explanation of MPICH, since it is a well-known piece of software, &lt;br /&gt;
and briefly describe DCMF and SPI here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture &lt;br /&gt;
** Hardware Initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM. BGP specific codes.&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI is designed essentially only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Strictly speaking, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz), which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=443</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=443"/>
		<updated>2009-04-30T03:09:01Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications,  &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
It is reasonably stable, and the performance of MPI applications on Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
While the port currently has some limitations, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limit on the number of threads&lt;br /&gt;
** 4 or more OpenMP threads per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** Debugging tools such as gdb, strace, etc.&lt;br /&gt;
** Various file systems such as ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
* Shared libraries are not yet provided&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
&lt;br /&gt;
We will support a VN-equivalent mode (one MPI rank per core) and provide shared libraries in a future release.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that  &lt;br /&gt;
provides non-blocking operations. Please refer to the [[http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki]] for details. SPI is the lowest-level user-space API for Torus DMA, the collective network, BGP-specific lock mechanisms and &lt;br /&gt;
other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which depends on SPI. &lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with a Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To see what these scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path.&lt;br /&gt;
* Don't forget to run the zelftool utility,&lt;br /&gt;
** which marks your executable as a Zepto Compute Binary so that the Zepto kernel loads&lt;br /&gt;
all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout under the Zepto install path looks like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
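&lt;br /&gt;
Before building by hand, it can help to confirm that an install path actually contains the files listed above.&lt;br /&gt;
The following is a minimal sketch, assuming a POSIX file system; missing_files is a hypothetical helper and EXPECTED_FILES is only a subset of the listing.&lt;br /&gt;

```python
import os

# A subset of the files from the layout listing above; extend as needed.
EXPECTED_FILES = [
    "bin/zelftool",
    "include/mpi.h",
    "include/dcmf.h",
    "lib/libmpich.zcl.a",
    "lib/libdcmfcoll.zcl.a",
    "lib/libdcmf.zcl.a",
    "lib/libSPI.zcl.a",
    "lib/libzcl.a",
    "lib/libzoid_cn.a",
]


def missing_files(prefix, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under prefix."""
    return [rel for rel in expected
            if not os.path.isfile(os.path.join(prefix, rel))]


if __name__ == "__main__":
    import sys
    # Pass the install path as the only argument; defaults to the current directory.
    prefix = sys.argv[1] if len(sys.argv) == 2 else "."
    for rel in missing_files(prefix):
        print("missing:", rel)
```

An empty result means every file in the subset was found under the given prefix.&lt;br /&gt;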
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code needed to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take from half an hour to an hour, depending on which file system you are using.&lt;br /&gt;
For example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. To install the newly compiled libraries,&lt;br /&gt;
run the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
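&lt;br /&gt;
The rebuild and install steps above can be chained in one script so that a failed step stops the sequence.&lt;br /&gt;
This is a sketch under assumptions: it must be started from the top of the source tree (where comm and install.py live), rebuild_steps and run_steps are hypothetical names, and /path/to/zepto is a placeholder. It only prints the commands unless dry_run is set to False.&lt;br /&gt;

```python
import subprocess


def rebuild_steps(prefix):
    """Return the three commands from this section, in order."""
    return [
        ["make", "-C", "comm", "rebuild-target"],
        ["make", "-C", "comm", "update-prebuilt"],
        ["python", "install.py", prefix],
    ]


def run_steps(prefix, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cmd in rebuild_steps(prefix):
        print("$ " + " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "/path/to/zepto" stands in for your __INST_PREFIX__.
    run_steps("/path/to/zepto", dry_run=True)
```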
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that CNK's stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip the explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
and briefly describe what DCMF and SPI are here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture&lt;br /&gt;
** Hardware initialization and query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM; BGP-specific code&lt;br /&gt;
** Kernel interfaces: DMA control, lockbox, etc.&lt;br /&gt;
** DMA-related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI was designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Technically, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz), which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=442</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=442"/>
		<updated>2009-04-30T03:05:08Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications,  &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
It is reasonably stable, and the performance of MPI applications on the Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
While the port has some limitations right now, it also has some benefits.&lt;br /&gt;
&lt;br /&gt;
Benefits:&lt;br /&gt;
* No limitation on the number of threads&lt;br /&gt;
** 4 or more OpenMP jobs per node&lt;br /&gt;
** Additional threads for I/O or background tasks&lt;br /&gt;
* It's on Linux!&lt;br /&gt;
** debugging tools such as gdb, strace, etc.&lt;br /&gt;
** ramfs&lt;br /&gt;
&lt;br /&gt;
Current limitations:&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
** VN mode will be supported in a future release&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
** there are slight differences in the DCMF and SPI libraries&lt;br /&gt;
* Shared libraries are not provided yet&lt;br /&gt;
** they will be provided in a future release&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that  &lt;br /&gt;
provides non-blocking operations. Please refer to the [http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki] for details. SPI is the lowest-level user-space API for Torus DMA, the collective network, BGP-specific lock mechanisms and &lt;br /&gt;
other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI. &lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with the Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what those scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path.&lt;br /&gt;
* Don't forget to call the zelftool utility.&lt;br /&gt;
** It turns your executable into a Zepto Compute Binary, which lets the Zepto kernel load all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout under the Zepto install path looks like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code needed to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take from half an hour to an hour, depending on which file system you are using.&lt;br /&gt;
For example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. To install the newly compiled libraries,&lt;br /&gt;
run the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that CNK's stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip the explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
and briefly describe what DCMF and SPI are here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture&lt;br /&gt;
** Hardware initialization and query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM; BGP-specific code&lt;br /&gt;
** Kernel interfaces: DMA control, lockbox, etc.&lt;br /&gt;
** DMA-related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI was designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Technically, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz), which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=441</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=441"/>
		<updated>2009-04-30T02:37:59Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications,  &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
While the port has some limitations right now, it is stable, and the performance of MPI applications &lt;br /&gt;
on the Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
** the number of threads is not limited&lt;br /&gt;
** VN mode will be supported in a future release&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
** there are slight differences in the DCMF and SPI libraries&lt;br /&gt;
* Shared libraries are not provided yet&lt;br /&gt;
** they will be provided in a future release&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that  &lt;br /&gt;
provides non-blocking operations. Please refer to the [http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki] for details. SPI is the lowest-level user-space API for Torus DMA, the collective network, BGP-specific lock mechanisms and &lt;br /&gt;
other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI. &lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with the Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what those scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path.&lt;br /&gt;
* Don't forget to call the zelftool utility.&lt;br /&gt;
** It turns your executable into a Zepto Compute Binary, which lets the Zepto kernel load all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout under the Zepto install path looks like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code needed to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take from half an hour to an hour, depending on which file system you are using.&lt;br /&gt;
For example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. To install the newly compiled libraries,&lt;br /&gt;
run the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that CNK's stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip the explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
and briefly describe what DCMF and SPI are here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture&lt;br /&gt;
** Hardware initialization and query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM; BGP-specific code&lt;br /&gt;
** Kernel interfaces: DMA control, lockbox, etc.&lt;br /&gt;
** DMA-related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI was designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Technically, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz), which is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=440</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=440"/>
		<updated>2009-04-30T02:19:37Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications,  &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
While the port has some limitations right now, it is stable, and the performance of MPI applications &lt;br /&gt;
on the Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
&lt;br /&gt;
* Only SMP mode is supported&lt;br /&gt;
** the number of threads is not limited&lt;br /&gt;
** VN mode will be supported in a future release&lt;br /&gt;
* No binary compatibility between CNK and Zepto CN Linux&lt;br /&gt;
** there are slight differences in the DCMF and SPI libraries&lt;br /&gt;
* Shared libraries are not provided yet&lt;br /&gt;
** they will be provided in a future release&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that  &lt;br /&gt;
provides non-blocking operations. Please refer to the [http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki] for details. SPI is the lowest-level user-space API for Torus DMA, the collective network, BGP-specific lock mechanisms and &lt;br /&gt;
other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which in turn depends on SPI. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with the Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to understand what those scripts actually do internally, run a wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path.&lt;br /&gt;
* Don't forget to call the zelftool utility.&lt;br /&gt;
** It turns your executable into a Zepto Compute Binary, which lets the Zepto kernel load all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout under the Zepto install path looks like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All the source code needed to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take from half an hour to an hour, depending on which file system you are using.&lt;br /&gt;
For example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not know anything about your installation. To install the newly compiled libraries,&lt;br /&gt;
run the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as IBM CNK's stack, except that CNK's stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip the explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
and briefly describe what DCMF and SPI are here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture&lt;br /&gt;
** Hardware initialization and query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM; BGP-specific code&lt;br /&gt;
** Kernel interfaces: DMA control, lockbox, etc.&lt;br /&gt;
** DMA-related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI was designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Strictly speaking, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz) that is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=439</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=439"/>
		<updated>2009-04-30T02:08:18Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: /* Rebuilding the libraries */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications, &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
The performance of MPI applications on Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that &lt;br /&gt;
provides non-blocking operations. Please refer to the [[http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki]] for details. SPI is the lowest-level user-space API for the Torus DMA, the collective network, BGP-specific lock mechanisms and &lt;br /&gt;
other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which depends on SPI. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with the Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to see what those scripts actually do internally, run the wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, please make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path&lt;br /&gt;
* Don't forget to call the zelftool utility&lt;br /&gt;
** It marks your executable as a Zepto Compute Binary so that the Zepto kernel loads&lt;br /&gt;
all application segments into the big memory area.&lt;br /&gt;
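&lt;br /&gt;
The manual build above can also be wrapped in a small shell sketch so that the install path is set in one place.&lt;br /&gt;
The names INST_PREFIX (as a shell variable), ZEPTO_CC, ZEPTO_LIBS and zepto_build_cmds are assumptions made for illustration, and /path/to/zepto is only a placeholder;&lt;br /&gt;
the compiler path, options, library list and zelftool call are copied from the example above.&lt;br /&gt;
The function only prints the two commands so that you can review them before running them.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Sketch only: INST_PREFIX, ZEPTO_CC, ZEPTO_LIBS and zepto_build_cmds are
# hypothetical names. Set INST_PREFIX to your actual Zepto install path.
INST_PREFIX="${INST_PREFIX:-/path/to/zepto}"

# Compiler and library list taken from the manual build example.
ZEPTO_CC=/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc
ZEPTO_LIBS="-lmpich.zcl -ldcmfcoll.zcl -ldcmf.zcl -lSPI.zcl -lzcl -lzoid_cn -lrt -lpthread -lm"

# Print the compile/link command and the zelftool command for one C source.
# $1 = source file, $2 = name of the output executable
zepto_build_cmds() {
    src=$1
    out=$2
    printf '%s\n' "$ZEPTO_CC -o $out -Wall -O3 -I$INST_PREFIX/include/ $src -L$INST_PREFIX/lib/ $ZEPTO_LIBS"
    printf '%s\n' "$INST_PREFIX/bin/zelftool -e $out"
}

zepto_build_cmds mpi-test.c mpi-test-linux
```
&lt;br /&gt;
To execute the commands instead of printing them, pipe the output to sh, for example: zepto_build_cmds mpi-test.c mpi-test-linux | sh&lt;br /&gt;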
&lt;br /&gt;
&lt;br /&gt;
The file layout in the Zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Building MPICH, DCMF and SPI libraries==&lt;br /&gt;
&lt;br /&gt;
All source code necessary to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour to complete, depending on which file system you are using.&lt;br /&gt;
For example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not update your installation. To apply the newly compiled libraries,&lt;br /&gt;
run the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
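&lt;br /&gt;
The rebuild and install steps above can be collected into one script.&lt;br /&gt;
This is a sketch under assumptions: the names INST_PREFIX (as a shell variable), DRY_RUN, run and rebuild_and_install are hypothetical, and /path/to/zepto is only a placeholder;&lt;br /&gt;
the three commands themselves are the ones given in this section. With DRY_RUN=1 (the default here) the script only prints the steps.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Sketch only: INST_PREFIX, DRY_RUN, run and rebuild_and_install are
# hypothetical names. Set INST_PREFIX to your actual Zepto install path.
INST_PREFIX="${INST_PREFIX:-/path/to/zepto}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command; execute it only when DRY_RUN is 0.
# A failing step stops the script so later steps do not run.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@" || exit 1
    fi
}

# Run from the top of the Zepto source tree (the directory containing comm).
rebuild_and_install() {
    run make -C comm rebuild-target       # rebuild the MPICH, DCMF and SPI libraries
    run make -C comm update-prebuilt      # first step of applying the new libraries
    run python install.py "$INST_PREFIX"  # second step: install into the Zepto path
}

rebuild_and_install
```
&lt;br /&gt;
Set DRY_RUN=0 in the environment to actually execute the steps.&lt;br /&gt;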
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as the IBM CNK stack, except that the CNK stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip the explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
and briefly describe DCMF and SPI here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture &lt;br /&gt;
** Hardware initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM; BGP-specific code&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc.&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI was designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Strictly speaking, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz) that is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
	<entry>
		<id> /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=438</id>
		<title>MPICH, DCMF, and SPI</title>
		<link rel="alternate" type="text/html" href=" /zeptoos/index.php?title=MPICH,_DCMF,_and_SPI&amp;diff=438"/>
		<updated>2009-04-30T02:06:36Z</updated>

		<summary type="html">&lt;p&gt;Kazutomo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To support high-performance computing (HPC) applications, specifically MPI applications, &lt;br /&gt;
we have ported IBM CNK's communication software stack to the Zepto compute node Linux environment.&lt;br /&gt;
The performance of MPI applications on Zepto compute node Linux is comparable to that on CNK. &lt;br /&gt;
&lt;br /&gt;
As in the IBM CNK environment, the Deep Computing Messaging Framework (DCMF) and the System Programming Interface (SPI) are available. &lt;br /&gt;
You can also write DCMF or SPI code directly if necessary. DCMF is a communication library that &lt;br /&gt;
provides non-blocking operations. Please refer to the [[http://dcmf.anl-external.org/wiki/index.php/Main_Page DCMF wiki]] for details. SPI is the lowest-level user-space API for the Torus DMA, the collective network, BGP-specific lock mechanisms and &lt;br /&gt;
other compute-node-specific implementations. No public documentation is available right now, but almost all header files and source code are available. Internally, MPICH depends on DCMF, which depends on SPI. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Compiling HPC applications==&lt;br /&gt;
&lt;br /&gt;
While you can use the same compilers to compile your code,&lt;br /&gt;
the Zepto compute node environment requires linking with Zepto-modified libraries&lt;br /&gt;
(an MPI application binary built for CNK does not work in the Zepto environment).&lt;br /&gt;
&lt;br /&gt;
===Compilation wrapper scripts===&lt;br /&gt;
&lt;br /&gt;
We provide compilation wrapper scripts (see below) that &lt;br /&gt;
automatically link with the appropriate libraries&lt;br /&gt;
installed in your Zepto installation path.  We provide the same&lt;br /&gt;
set of wrapper scripts that IBM provides. Once you have successfully&lt;br /&gt;
compiled your code, you need to submit it with the Zepto kernel profile&lt;br /&gt;
(see the [[Kernel Profile]] section). Note: only SMP mode is currently supported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
- Wrapper scripts that invoke BGP enhanced GNU compilers &lt;br /&gt;
zmpicc&lt;br /&gt;
zmpicxx&lt;br /&gt;
zmpif77&lt;br /&gt;
zmpif90&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers&lt;br /&gt;
zmpixlc&lt;br /&gt;
zmpixlcxx&lt;br /&gt;
zmpixlf2003&lt;br /&gt;
zmpixlf77&lt;br /&gt;
zmpixlf90&lt;br /&gt;
zmpixlf95&lt;br /&gt;
&lt;br /&gt;
- Wrapper scripts that invoke IBM XL compilers(thread safe compilation)&lt;br /&gt;
zmpixlc_r&lt;br /&gt;
zmpixlcxx_r&lt;br /&gt;
zmpixlf2003_r&lt;br /&gt;
zmpixlf77_r&lt;br /&gt;
zmpixlf90_r&lt;br /&gt;
zmpixlf95_r&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need to see what those scripts actually do internally, run the wrapper script with the -show option.&lt;br /&gt;
&lt;br /&gt;
===Without compiler scripts===&lt;br /&gt;
If you can't use the compilation wrapper scripts, please make sure&lt;br /&gt;
that your makefile or build environment points to the Zepto header files and&lt;br /&gt;
libraries correctly. An example would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc-bgp-linux-gcc  \&lt;br /&gt;
-o mpi-test-linux -Wall -O3  -I__INST_PREFIX__/include/   mpi-test.c \&lt;br /&gt;
-L__INST_PREFIX__/lib/ -lmpich.zcl  -ldcmfcoll.zcl -ldcmf.zcl  -lSPI.zcl -lzcl \&lt;br /&gt;
-lzoid_cn -lrt -lpthread -lm&lt;br /&gt;
__INST_PREFIX__/bin/zelftool -e mpi-test-linux&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''NOTE:''' &lt;br /&gt;
* Replace __INST_PREFIX__ with your actual Zepto install path&lt;br /&gt;
* Don't forget to call the zelftool utility&lt;br /&gt;
** It marks your executable as a Zepto Compute Binary so that the Zepto kernel loads&lt;br /&gt;
all application segments into the big memory area.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The file layout in the Zepto install path would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- bin&lt;br /&gt;
|   |-- zelftool&lt;br /&gt;
|-- include&lt;br /&gt;
|   |-- dcmf.h&lt;br /&gt;
|   |-- dcmf_collectives.h&lt;br /&gt;
|   |-- dcmf_coremath.h&lt;br /&gt;
|   |-- dcmf_globalcollectives.h&lt;br /&gt;
|   |-- dcmf_multisend.h&lt;br /&gt;
|   |-- dcmf_optimath.h&lt;br /&gt;
|   |-- mpe_thread.h&lt;br /&gt;
|   |-- mpi.h&lt;br /&gt;
|   |-- mpi.mod&lt;br /&gt;
|   |-- mpi_base.mod&lt;br /&gt;
|   |-- mpi_constants.mod&lt;br /&gt;
|   |-- mpi_sizeofs.mod&lt;br /&gt;
|   |-- mpicxx.h&lt;br /&gt;
|   |-- mpif.h&lt;br /&gt;
|   |-- mpio.h&lt;br /&gt;
|   |-- mpiof.h&lt;br /&gt;
|   `-- mpix.h&lt;br /&gt;
`-- lib&lt;br /&gt;
    |-- libSPI.zcl.a&lt;br /&gt;
    |-- libcxxmpich.zcl.a&lt;br /&gt;
    |-- libdcmf.zcl.a&lt;br /&gt;
    |-- libdcmfcoll.zcl.a&lt;br /&gt;
    |-- libfmpich.zcl.a&lt;br /&gt;
    |-- libfmpich_.zcl.a&lt;br /&gt;
    |-- libmpich.zcl.a&lt;br /&gt;
    |-- libmpich.zclf90.a&lt;br /&gt;
    |-- libzcl.a&lt;br /&gt;
    `-- libzoid_cn.a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Rebuilding the libraries==&lt;br /&gt;
&lt;br /&gt;
All source code necessary to build MPICH, DCMF and SPI is included.&lt;br /&gt;
To build those libraries, just type:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm rebuild-target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The build may take half an hour to an hour to complete, depending on which file system you are using.&lt;br /&gt;
For example, GPFS is definitely slower than a local scratch file system.&lt;br /&gt;
&lt;br /&gt;
The rebuild-target target does not update your installation. To apply the newly compiled libraries,&lt;br /&gt;
run the following steps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ make -C comm update-prebuilt&lt;br /&gt;
$ python install.py __INST_PREFIX__&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Software stack layout==&lt;br /&gt;
&lt;br /&gt;
[[Image:Zepto-Comm-Stack.png|right|450px]]&lt;br /&gt;
&lt;br /&gt;
The figure on the right depicts the layout of the communication software stack for the Zepto compute node environment.&lt;br /&gt;
It is essentially the same as the IBM CNK stack, except that the CNK stack has no ZEPTO SPI layer and runs on CNK instead of Linux.&lt;br /&gt;
We skip the explanation of MPICH since it is a well-known piece of software, &lt;br /&gt;
and briefly describe DCMF and SPI here. &lt;br /&gt;
&lt;br /&gt;
* DCMF&lt;br /&gt;
** Stands for Deep Computing Messaging Framework&lt;br /&gt;
** Developed by IBM, originally for the Blue Gene architecture &lt;br /&gt;
** Hardware initialization, query functions&lt;br /&gt;
** Supports BGP Torus DMA, collective network&lt;br /&gt;
** Provides timer&lt;br /&gt;
** Supports non-blocking collective operations&lt;br /&gt;
** BGP MPICH uses DCMF internally (IBM provides a glue layer)&lt;br /&gt;
* SPI&lt;br /&gt;
** Stands for System Programming Interface&lt;br /&gt;
** Developed by IBM; BGP-specific code&lt;br /&gt;
** Kernel interfaces - DMA control, lockbox, etc.&lt;br /&gt;
** DMA related definitions &lt;br /&gt;
*** can be used in both user space and kernel space&lt;br /&gt;
** RAS, BGP personality, mapping related functions&lt;br /&gt;
&lt;br /&gt;
BGP SPI was designed only for IBM CNK, so it is not compatible with Linux.&lt;br /&gt;
ZEPTO SPI is a thin software layer that absorbs the differences between CNK and Linux, or drops the requests that Linux cannot handle.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Source code==&lt;br /&gt;
&lt;br /&gt;
The source code and header files of MPICH, DCMF and SPI can be found in the comm directory.&lt;br /&gt;
Strictly speaking, the MPICH source code is in a tarball (comm/DCMF/lib/mpich2/mpich2-1.0.7.tar.gz) that is extracted at build time.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
|-- DCMF&lt;br /&gt;
|   |-- lib&lt;br /&gt;
|   |   |-- dev&lt;br /&gt;
|   |   `-- mpich2&lt;br /&gt;
|   |       `-- make&lt;br /&gt;
|   |-- sys&lt;br /&gt;
|   |   |-- collectives&lt;br /&gt;
|   |   |-- include&lt;br /&gt;
|   |   |-- messaging&lt;br /&gt;
|-- arch-runtime&lt;br /&gt;
|   |-- arch&lt;br /&gt;
|   |   `-- include&lt;br /&gt;
|   |       |-- bpcore&lt;br /&gt;
|   |       |-- cnk&lt;br /&gt;
|   |       |-- common&lt;br /&gt;
|   |       |-- spi&lt;br /&gt;
|   |       `-- zepto&lt;br /&gt;
|   |-- runtime&lt;br /&gt;
|   |-- testcodes&lt;br /&gt;
|   `-- zcl_spi&lt;br /&gt;
`-- testcodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Kazutomo</name></author>
	</entry>
</feed>