| <!--#include virtual="header.txt"--> |
| |
| <h1><a name="top">SLURM MPI Plugin API</a></h1> |
| |
| <h2> Overview</h2> |
| <p> This document describes SLURM MPI selection plugins and the API that defines |
| them. It is intended as a resource to programmers wishing to write their own SLURM |
| node selection plugins. This is version 0 of the API.</p> |
| |
| <p>SLURM mpi selection plugins are SLURM plugins that implement the which version of |
| mpi is used during execution of the new SLURM job. API described herein. They are |
| intended to provide a mechanism for both selecting mpi versions for pending jobs and |
| performing any mpi-specific tasks for job launch or termination. The plugins must |
| conform to the SLURM Plugin API with the following specifications:</p> |
| <p><span class="commandline">const char plugin_type[]</span><br> |
| The major type must be "mpi." The minor type can be any recognizable |
| abbreviation for the type of node selection algorithm. We recommend, for example:</p> |
| <ul> |
| <li><b>lam</b>—For use with LAM MPI and Open MPI.</li> |
| <li><b>mpich-gm</b>—For use with Myrinet.</li> |
| <li><b>mvapich</b>—For use with Infiniband.</li> |
| <li><b>none</b>—For use with most other versions of MPI.</li> |
| </ul> |
| <p>The <span class="commandline">plugin_name</span> and |
| <span class="commandline">plugin_version</span> |
| symbols required by the SLURM Plugin API require no specialization for node selection support. |
| Note carefully, however, the versioning discussion below.</p> |
| |
| <p>A simplified flow of logic follows: |
| <br> |
| srun is able to specify the correct mpi to use. with --mpi=MPITYPE |
| <br> |
| srun calls |
| <br> |
| <i>mpi_p_thr_create((srun_job_t *)job);</i> |
| <br> |
| which will set up the correct enviornment for the specified mpi. |
| <br> |
| slurmd daemon runs |
| <br> |
| <i>mpi_p_init((slurmd_job_t *)job, (int)rank);</i> |
| <br> |
| which will set configure the slurmd to use the correct mpi as well to interact with the srun. |
| <br> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <h2>Data Objects</h2> |
| <p> These functions are expected to read and/or modify data structures directly in |
| the slurmd daemon's and srun memory. Slurmd is a multi-threaded program with independent |
| read and write locks on each data structure type. Thererfore the type of operations |
| permitted on various data structures is identified for each function.</p> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <h2>API Functions</h2> |
| <p>The following functions must appear. Functions which are not implemented should |
| be stubbed.</p> |
| |
| <p class="commandline">int mpi_p_init (slurmd_job_t *job, int rank);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Used by slurmd to configure the slurmd's environment |
| to that of the correct mpi.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br><span class="commandline"> job</span> |
| (input) Pointer to the slurmd_job that is running. Cannot be NULL.<br> |
| <span class="commandline"> rank</span> |
| (input) Primarially there for MVAPICH. Used to send the rank fo the mpirun job. |
| This can be 0 if no rank information is needed for the mpi type.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int mpi_p_thr_create (srun_job_t *job);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Used by srun to spawn the thread for the mpi processes. |
| Most all the real proccessing happens here.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<span class="commandline"> job</span> |
| (input) Pointer to the srun_job that is running. Cannot be NULL.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return -1.</p> |
| |
| <p class="commandline">int mpi_p_single_task ();</p> |
| <p style="margin-left:.2in"><b>Description</b>: Tells the system whether or not multiple tasks |
| can run at the same time </p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> none</span></p> |
| <p style="margin-left:.2in"><b>Returns</b>: false if multiple tasks can run and true if only |
| a single task can run at one time.</p> |
| |
| <p class="commandline">int mpi_p_exit();</p> |
| <p style="margin-left:.2in"><b>Description</b>: Cleans up anything that needs cleaning up after |
| execution.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> none</span></p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR, causing slurmctld to exit.</p> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <h2>Versioning</h2> |
| <p> This document describes version 0 of the SLURM node selection API. Future |
| releases of SLURM may revise this API. A node selection plugin conveys its ability |
| to implement a particular API version using the mechanism outlined for SLURM plugins. |
| In addition, the credential is transmitted along with the version number of the |
| plugin that transmitted it. It is at the discretion of the plugin author whether |
| to maintain data format compatibility across different versions of the plugin.</p> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <p style="text-align:center;">Last modified 11 April 2006</p> |
| |
| <!--#include virtual="footer.txt"--> |