| <!--#include virtual="header.txt"--> |
| |
| <h1>MPI Users Guide</h1> |
| |
<p>How MPI is used with Slurm depends upon the MPI implementation in use.
There are three fundamentally different modes of operation used
by these various MPI implementations:
| <ol> |
| <li>Slurm directly launches the tasks and performs initialization of |
| communications through the PMI-1, PMI-2 or PMIx APIs. (Supported by most |
| modern MPI implementations.)</li> |
| <li>Slurm creates a resource allocation for the job and then |
| mpirun launches tasks using Slurm's infrastructure (srun).</li> |
| <li>Slurm creates a resource allocation for the job and then |
| mpirun launches tasks using some mechanism other than Slurm, |
| such as SSH or RSH. |
| These tasks are initiated outside of Slurm's monitoring |
| or control and require access to the nodes from the batch node (e.g. SSH). |
| Slurm's epilog should be configured to purge |
| these tasks when the job's allocation is relinquished. The |
| use of pam_slurm_adopt is strongly recommended.</li> |
| </ol> |
<p><b>NOTE</b>: In case 3, Slurm does not directly launch the user application,
which may prevent the desired task-to-CPU binding and accounting from working.
This mode of operation is not recommended.</p>
| |
| <p>Two Slurm parameters control which PMI (Process Management Interface) |
| implementation will be supported. Proper configuration is essential for Slurm to |
| establish the proper environment for the MPI job, such as setting the |
| appropriate environment variables. The <i>MpiDefault</i> configuration parameter |
| in <i>slurm.conf</i> establishes the system's default PMI to be used. |
The <i>srun</i> option <i>--mpi=</i> (or the equivalent environment
variable <i>SLURM_MPI_TYPE</i>) can be used to select a different PMI
implementation for an individual job.</p>
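
<p>For example, a site-wide default can be set in slurm.conf and overridden
for a single job. The plugin names below are only illustrative; use one of the
values reported by <i>srun --mpi=list</i> on your system:</p>

<pre>
slurm.conf:
MpiDefault=pmix

$ srun --mpi=pmi2 ./user_app
or
$ export SLURM_MPI_TYPE=pmi2
$ srun ./user_app
</pre>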
| |
| <p>There are parameters that can be set in the |
| <a href="mpi.conf.html">mpi.conf</a> file that allow you to modify |
| the behavior of the PMI plugins.</p> |
| |
| <p><b>NOTE</b>: Use of an MPI implementation without the appropriate Slurm |
| plugin may result in application failure. If multiple MPI implementations |
| are used on a system then some users may be required to explicitly specify |
| a suitable Slurm MPI plugin.</p> |
| |
| <p><b>NOTE</b>: If installing Slurm with RPMs, the <i>slurm-libpmi</i> |
| package will conflict with the <i>pmix-libpmi</i> package if it is |
| installed. If policies at your site allow you to install from source, this |
| will allow you to install these packages to different locations, so you can |
| choose which libraries to use.</p> |
| |
| <p><b>NOTE</b>: If you build any MPI stack component with hwloc, note that |
| versions 2.5.0 through 2.7.0 (inclusive) of hwloc have a bug that pushes an |
| untouchable value into the environ array, causing a segfault when accessing it. |
| It is advisable to build with hwloc version 2.7.1 or later.</p> |
| |
| <p>Links to instructions for using several varieties of MPI/PMI |
| with Slurm are provided below. |
| <ul> |
| <li><a href="#pmix">PMIx</a></li> |
| <li><a href="#open_mpi">Open MPI</a></li> |
| <li><a href="#intel_mpi">Intel-MPI</a></li> |
| <li><a href="#mpich">MPICH</a></li> |
| <li><a href="#mvapich2">MVAPICH2</a></li> |
| <li><a href="#hpe_cray_pmi">HPE Cray PMI Support</a></li> |
| </ul></p> |
| <hr size=4 width="100%"> |
| |
| <h2 id="pmix"><a href="https://openpmix.github.io/"><b>PMIx</b></a> |
| <a class="slurm_link" href="#pmix"></a> |
| </h2> |
| |
| <h3 id="pmix_build">Building PMIx |
| <a class="slurm_link" href="#pmix_build"></a> |
| </h3> |
| |
| <p>Before building PMIx, it is advisable to read these |
| <a href="https://openpmix.github.io/support/how-to/">How-To Guides</a>. They |
| provide some details on |
| <a href="https://openpmix.github.io/code/getting-the-reference-implementation"> |
| building dependencies and installation steps</a> as well as some relevant notes |
| with regards to |
| <a href="https://openpmix.github.io/support/how-to/slurm-support">Slurm Support |
| </a>.</p> |
| |
| <p>This section is intended to complement the PMIx FAQ with some notes on how to |
| prepare Slurm and PMIx to work together. PMIx can be obtained from the official |
| <a href="https://github.com/openpmix/openpmix">PMIx GitHub</a> repository, |
| either by cloning the repository or by downloading a packaged release.</p> |
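
<p>As a quick sketch, a build from a packaged release usually follows the
standard autotools steps shown below. The installation prefix is only an
example, chosen to match the paths used later on this page; a git clone
additionally requires running ./autogen.pl before configure. Refer to the
PMIx guides linked above for the authoritative procedure:</p>

<pre>
user@testbox:~/pmix-4.1.2/build$ ../configure --prefix=/home/user/pmix/4.1.2
user@testbox:~/pmix-4.1.2/build$ make -j install
</pre>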
| |
| <p>Slurm support for PMIx was first included in Slurm 16.05 based on the PMIx |
| v1.2 release. It has since been updated to support up to version 5.x of the |
| PMIx series, as per the following table: |
| <ul> |
| <li>Slurm 20.11+ supports PMIx v1.2+, v2.x and v3.x.</li> |
<li>Slurm 22.05+ supports PMIx v2.x, v3.x, v4.x and v5.x.</li>
| </ul> |
If running PMIx v1, it is recommended to run at least 1.2.5, since older
versions may have compatibility issues with their support of the PMI and PMI-2
APIs.
| |
| Note also that Intel MPI doesn't officially support PMIx. It may work since PMIx |
| offers some compatibility with PMI-2, but there is no guarantee that it will. |
| </p> |
| |
| <h3 id="slurm_pmix">Building Slurm with PMIx support |
| <a class="slurm_link" href="#slurm_pmix"></a> |
| </h3> |
| |
| <p>At configure time, Slurm won't build with PMIx unless <b>--with-pmix</b> is |
| set. Then it will look by default for a PMIx installation under:</p> |
| |
| <pre> |
| /usr |
| /usr/local |
| </pre> |
| |
<p>If PMIx isn't installed in one of those locations, the Slurm configure
script can be pointed to a non-default location. Here's an example
assuming the installation dir is <i>/home/user/pmix/4.1.2/</i>:
| </p> |
| |
| <pre> |
| user@testbox:~/slurm/22.05/build$ ../src/configure \ |
| > --prefix=/home/user/slurm/22.05/inst \ |
| > <b>--with-pmix=/home/user/pmix/4.1.2</b> |
| </pre> |
| |
<p>Or the analogous command for an RPM-based build:</p>
| |
| <pre> |
| user@testbox:~/slurm_rpm$ rpmbuild \ |
| > --define '_prefix /home/user/slurm/22.05/inst' \ |
| > --define '_slurm_sysconfdir /home/user/slurm/22.05/inst/etc' \ |
| > <b>--define '_with_pmix --with-pmix=/home/user/pmix/4.1.2'</b> \ |
| > -ta slurm-22.05.2.1.tar.bz2 |
| </pre> |
| |
| <p><b>NOTE</b>: It is also possible to build against multiple PMIx versions |
| with a ':' separator. For instance to build against 3.2 and 4.1:</p> |
| |
| <pre> |
| ... |
| > <b>--with-pmix=/path/to/pmix/3.2.3:/path/to/pmix/4.1.2</b> \ |
| ... |
| </pre> |
| |
<p>When submitting a job, the desired version can then be selected
using any of the names shown by --mpi=list. The default for pmix will be the
highest version of the library:</p>
| |
| <pre> |
| $ srun --mpi=list |
| MPI plugin types are... |
| cray_shasta |
| none |
| pmi2 |
| pmix |
| specific pmix plugin versions available: pmix_v3,pmix_v4 |
| </pre> |
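
<p>For example, to explicitly select one of the listed versions instead of
relying on the default (the application name is illustrative):</p>

<pre>
$ srun --mpi=pmix_v3 ./user_app
</pre>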
| |
<p>Continuing with the configuration, if Slurm is unable to locate a PMIx
installation, or finds one but considers it unusable, the configure output
should log something like this:</p>
| |
| <pre> |
| checking for pmix installation... |
| configure: WARNING: unable to locate pmix installation |
| </pre> |
| |
| <p>Inspecting the generated config.log in the Slurm build directory might |
| provide more detail for troubleshooting purposes. After configuration, |
| we can proceed to install Slurm (using make or rpm accordingly):</p> |
| |
| <pre> |
| user@testbox:~/slurm/22.05/build$ make -j install |
| user@testbox:~/slurm/22.05/build$ cd /home/user/slurm/22.05/inst/lib/slurm/ |
| user@testbox:~/slurm/22.05/inst/lib/slurm$ ls -l *pmix* |
| lrwxrwxrwx 1 user user 16 jul 6 17:17 mpi_pmix.so -> ./mpi_pmix_v4.so |
| -rw-r--r-- 1 user user 9387254 jul 6 17:17 mpi_pmix_v3.a |
| -rwxr-xr-x 1 user user 1065 jul 6 17:17 mpi_pmix_v3.la |
| -rwxr-xr-x 1 user user 1265840 jul 6 17:17 mpi_pmix_v3.so |
| -rw-r--r-- 1 user user 9935358 jul 6 17:17 mpi_pmix_v4.a |
| -rwxr-xr-x 1 user user 1059 jul 6 17:17 mpi_pmix_v4.la |
| -rwxr-xr-x 1 user user 1286936 jul 6 17:17 mpi_pmix_v4.so |
| </pre> |
| |
<p>If support for PMI-1 or PMI-2 is also needed, it can be installed
from the contribs directory:</p>
| |
| <pre> |
| user@testbox:~/slurm/22.05/build/$ cd contribs/pmi1 |
| user@testbox:~/slurm/22.05/build/contribs/pmi1$ make -j install |
| |
| user@testbox:~/slurm/22.05/build/$ cd contribs/pmi2 |
| user@testbox:~/slurm/22.05/build/contribs/pmi2$ make -j install |
| |
| user@testbox:~/$ ls -l /home/user/slurm/22.05/inst/lib/*pmi* |
| -rw-r--r-- 1 user user 493024 jul 6 17:27 libpmi2.a |
| -rwxr-xr-x 1 user user 987 jul 6 17:27 libpmi2.la |
| lrwxrwxrwx 1 user user 16 jul 6 17:27 libpmi2.so -> libpmi2.so.0.0.0 |
| lrwxrwxrwx 1 user user 16 jul 6 17:27 libpmi2.so.0 -> libpmi2.so.0.0.0 |
| -rwxr-xr-x 1 user user 219712 jul 6 17:27 libpmi2.so.0.0.0 |
| -rw-r--r-- 1 user user 427768 jul 6 17:27 libpmi.a |
| -rwxr-xr-x 1 user user 1039 jul 6 17:27 libpmi.la |
| lrwxrwxrwx 1 user user 15 jul 6 17:27 libpmi.so -> libpmi.so.0.0.0 |
| lrwxrwxrwx 1 user user 15 jul 6 17:27 libpmi.so.0 -> libpmi.so.0.0.0 |
| -rwxr-xr-x 1 user user 241640 jul 6 17:27 libpmi.so.0.0.0 |
| </pre> |
| |
<p><b>NOTE</b>: Since Slurm and PMIx releases older than 4.x both provide
libpmi.so and libpmi2.so libraries, we recommend you install the two pieces of
software in different locations. Otherwise, these same libraries might end up
being installed under standard locations like /usr/lib64 and the
package manager would error out, reporting the conflict.</p>
| |
<p><b>NOTE</b>: Any application compiled against PMIx should use the same PMIx,
or at least a PMIx with the same security domain as the one Slurm is using,
otherwise there could be authentication issues. For example, one PMIx compiled
--with-munge while the other is compiled --without-munge (the default since
PMIx 4.2.4). A workaround which might work is to specify the desired security
method by adding "--mca psec native" to the command line or by exporting the
PMIX_MCA_psec=native environment variable.
</p>
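
<p>For instance, to request the native security method for a single run (the
plugin version and application name are illustrative):</p>

<pre>
$ export PMIX_MCA_psec=native
$ srun --mpi=pmix_v4 ./user_app
</pre>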
| |
| <p><b>NOTE</b>: If you are setting up a test environment using multiple-slurmd, |
| the TmpFS option in your slurm.conf needs to be specified and the number of |
| directory paths created needs to equal the number of nodes. These directories |
are used by the Slurm PMIx plugin to create temporary files and/or UNIX sockets.
| Here is an example setup for two nodes named compute[1-2]:</p> |
| |
| <pre> |
| slurm.conf: |
| TmpFS=/home/user/slurm/22.05/inst/tmp/slurmd-tmpfs-%n |
| |
| $ mkdir /home/user/slurm/22.05/inst/tmp/slurmd-tmpfs-compute1 |
| $ mkdir /home/user/slurm/22.05/inst/tmp/slurmd-tmpfs-compute2 |
| </pre> |
| |
| <h3 id="pmix_testing">Testing Slurm and PMIx |
| <a class="slurm_link" href="#pmix_testing"></a> |
| </h3> |
| |
| <p>It is possible to directly test Slurm and PMIx without needing to have an |
| MPI implementation installed. Here is an example demonstrating that |
| both components work properly:</p> |
| |
| <pre> |
| $ srun --mpi=list |
| MPI plugin types are... |
| cray_shasta |
| none |
| pmi2 |
| pmix |
| specific pmix plugin versions available: pmix_v3,pmix_v4 |
| |
| $ srun --mpi=pmix_v4 -n2 -N2 \ |
| > /home/user/git/pmix/test/pmix_client -n 2 --job-fence -c |
| ==141756== OK |
| ==141774== OK |
| |
| </pre> |
| |
| <h2 id="open_mpi"><a href="https://open-mpi.org"><b>OpenMPI</b></a> |
| <a class="slurm_link" href="#open_mpi"></a> |
| </h2> |
| |
| <p>The current versions of Slurm and Open MPI support task launch using the |
| <span class="commandline">srun</span> command.</p> |
| |
<p>If OpenMPI is configured with <i>--with-pmi=</i> pointing to either Slurm's
PMI-1 libpmi.so or PMI-2 libpmi2.so libraries, OMPI jobs can then be
launched directly using the srun command. This is the preferred mode of
operation, since Slurm's accounting and task affinity features are then
available. If pmi2 support is enabled, the option '--mpi=pmi2' must be
specified on the srun command line.
Alternatively, configure 'MpiDefault=pmi2' in slurm.conf.</p>
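
<p>As a sketch, assuming an OpenMPI 4.x build tree and a Slurm installation
under /home/user/slurm/22.05/inst (both paths are only examples), this could
look like:</p>

<pre>
user@testbox:~/openmpi-4.1.4/build$ ../configure \
> --prefix=/home/user/bin/ompi \
> <b>--with-pmi=/home/user/slurm/22.05/inst</b>

$ srun --mpi=pmi2 ./user_app
</pre>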
| |
| <p>Starting with Open MPI version 3.1, PMIx is natively supported. To launch |
| Open MPI applications using PMIx the '--mpi=pmix' option must be specified on |
| the srun command line or 'MpiDefault=pmix' must be configured in slurm.conf.</p> |
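
<p>A minimal launch example with PMIx then looks like the following (assuming
MpiDefault is not already set to pmix):</p>

<pre>
$ mpicc -o hello_world hello_world.c
$ srun --mpi=pmix ./hello_world
</pre>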
| |
| <p>It is also possible to build OpenMPI using an external PMIx installation. |
| Refer to the OpenMPI documentation for a detailed procedure but it basically |
| consists of specifying <b>--with-pmix=PATH</b> when configuring OpenMPI. |
| Note that if building OpenMPI using an external PMIx installation, both OpenMPI |
| and PMIx need to be built against the same libevent/hwloc installations. |
The OpenMPI configure script provides the options
| <b>--with-libevent=PATH</b> and/or <b>--with-hwloc=PATH</b> to make OpenMPI |
| match what PMIx was built against.</p> |
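
<p>A configure sketch for that scenario is shown below; the PMIx, libevent and
hwloc paths are only examples and must match the installations PMIx was built
against:</p>

<pre>
user@testbox:~/openmpi-4.1.4/build$ ../configure \
> --prefix=/home/user/bin/ompi \
> <b>--with-pmix=/home/user/bin/pmix_4.1.2</b> \
> <b>--with-libevent=/usr</b> \
> <b>--with-hwloc=/usr</b>
</pre>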
| |
<p>A set of parameters is available to control the behavior of the
Slurm PMIx plugin; read <a href="mpi.conf.html">mpi.conf</a> for more
information.</p>
| |
<p><b>NOTE</b>: OpenMPI has a limitation in that it does not support calls to
<i>MPI_Comm_spawn()</i> from within a Slurm allocation. If you need to
use the <i>MPI_Comm_spawn()</i> function you will need to use another MPI
implementation combined with PMI-2, since PMIx doesn't support it either.</p>
| |
<p><b>NOTE</b>: Some kernels and system configurations have resulted in a locked
memory limit that is too small for proper OpenMPI functionality, resulting in
application failure with a segmentation fault. This may be fixed by configuring
the slurmd daemon to execute with a larger limit. For example, add
"LimitMEMLOCK=infinity" to your slurmd.service file.</p>
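
<p>One way to do this with systemd is a drop-in override; the file path shown
is just one common choice:</p>

<pre>
# /etc/systemd/system/slurmd.service.d/memlock.conf
[Service]
LimitMEMLOCK=infinity

$ systemctl daemon-reload
$ systemctl restart slurmd
</pre>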
| |
| <hr size=4 width="100%"> |
| |
| <h2 id="intel_mpi"><b>Intel MPI</b> |
| <a class="slurm_link" href="#intel_mpi"></a> |
| </h2> |
| |
<p>Intel® MPI Library for Linux OS supports the following methods of
launching MPI jobs under the control of the Slurm job manager:
<ul>
<li><a href="#intel_mpirun_hydra">The <i>mpirun</i> command over the Hydra
Process Manager</a></li>
<li><a href="#intel_srun">The <i>srun</i> command (Slurm, recommended)</a></li>
</ul>
</p>
| <p>This description provides detailed information on these two methods.</p> |
| |
| <h3 id="intel_mpirun_hydra">The mpirun Command over the Hydra Process Manager |
| <a class="slurm_link" href="#intel_mpirun_hydra"></a> |
| </h3> |
<p>Slurm is supported by the <i>mpirun</i> command of the Intel® MPI Library
through the Hydra Process Manager by default. When launched within an
allocation, the <i>mpirun</i> command will automatically read the environment
variables set by Slurm (nodes, cpus, tasks, etc.) in order to start the
required hydra daemons on every node. These daemons are started using srun and
will subsequently start the user application. Since Intel® MPI supports
only PMI-1 and PMI-2 (not PMIx), it is highly recommended to configure this MPI
implementation to use Slurm's PMI-2, which offers better scalability than
PMI-1. PMI-1 is not recommended and is expected to be deprecated soon.</p>
| |
| <p>Below is an example of how a user app can be launched within an exclusive |
| allocation of 10 nodes using Slurm's PMI-2 library installed from contribs:</p> |
| <pre> |
| $ salloc -N10 --exclusive |
| $ export I_MPI_PMI_LIBRARY=/path/to/slurm/lib/libpmi2.so |
$ mpirun -np &lt;num_procs&gt; user_app.bin
| </pre> |
| |
<p>Note that by default Slurm will inject two environment variables to ensure
that mpirun or mpiexec uses the Slurm bootstrap mechanism (srun) to launch
Hydra. With these, Hydra will also pass the '--external-launcher' argument to
srun so that these hydra processes are not treated as regular steps. This
automatic variable injection can be disabled by setting
"MpiParams=disable_slurm_hydra_bootstrap" in slurm.conf, except where "slurm"
is already explicitly set as the bootstrap.</p>
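
<p>For reference, disabling the automatic injection is a single slurm.conf
setting:</p>

<pre>
slurm.conf:
MpiParams=disable_slurm_hydra_bootstrap
</pre>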
| |
| <p>It is possible to run Intel MPI using a different bootstrap mechanism. To do |
| so, explicitly set the following environment variable prior to submitting the |
| job with sbatch or salloc:</p> |
| <pre> |
| $ export I_MPI_HYDRA_BOOTSTRAP=ssh |
| $ salloc -N10 |
$ mpirun -np &lt;num_procs&gt; user_app.bin
| </pre> |
| |
| <h3 id="intel_srun">The srun Command (Slurm, recommended) |
| <a class="slurm_link" href="#intel_srun"></a> |
| </h3> |
<p>This method is also supported by the Intel® MPI Library.
It is the method best integrated with Slurm and supports process tracking,
accounting, task affinity, suspend/resume and other features.
As in the previous case, we show an example of how a user app can be
launched within an exclusive allocation of 10 nodes using Slurm's PMI-2 library
installed from contribs, allowing it to take advantage of all the Slurm
features. This can be done with the <i>sbatch</i> or <i>salloc</i> commands:</p>
| |
| <pre> |
| $ salloc -N10 --exclusive |
| $ export I_MPI_PMI_LIBRARY=/path/to/slurm/lib/libpmi2.so |
$ srun user_app.bin
| </pre> |
| |
<p><b>NOTE</b>: We point manually to Slurm's PMI-1 or PMI-2 library for
licensing reasons. IMPI doesn't link directly to any external PMI
implementation so, unlike other stacks (OMPI, MPICH, MVAPICH...), Intel MPI is
not built against Slurm libs. Pointing to this library causes Intel MPI to
dlopen and use it.</p>
| |
<p><b>NOTE</b>: There is no official support provided by Intel for the PMIx
libraries. Since IMPI is based on MPICH, using PMIx with Intel MPI may work
because PMIx maintains compatibility with pmi2 (the libraries used in MPICH),
but it is not guaranteed to run in all cases and PMIx could break this
compatibility in future versions.</p>
| |
| <p> For more information see: |
| <a href="https://software.intel.com/en-us/intel-mpi-library">Intel MPI Library |
| </a>.</p> |
| |
| <h2 id="mpich"> |
| <a href="https://www.mpich.org/"><b>MPICH</b></a> |
| <a class="slurm_link" href="#mpich"></a> |
| </h2> |
| |
| <p>MPICH was formerly known as MPICH2.</p> |
| |
| <p>MPICH jobs can be launched using <b>srun</b> or <b>mpiexec</b>. |
| Both modes of operation are described below. The MPICH implementation supports |
| PMI-1, PMI-2 and PMIx (starting with MPICH v4). |
| </p> |
| |
<p>Note that by default Slurm will inject two environment variables to ensure
that mpirun or mpiexec uses the Slurm bootstrap mechanism (srun) to launch
Hydra. With these, Hydra will also pass the '--external-launcher' argument to
srun so that these hydra processes are not treated as regular steps. This
automatic variable injection can be disabled by setting
"MpiParams=disable_slurm_hydra_bootstrap" in slurm.conf, except where "slurm"
is already explicitly set as the bootstrap.</p>
| |
| <p>It is possible to run MPICH using a different bootstrap mechanism. To do |
| so, explicitly set the following environment variable prior to submitting the |
| job with sbatch or salloc:</p> |
| <pre> |
| $ export HYDRA_BOOTSTRAP=ssh |
| $ salloc -N10 |
$ mpirun -np &lt;num_procs&gt; user_app.bin
| </pre> |
| |
| <h3 id="mpich_pmi_slurm"> |
| MPICH with srun and linked with Slurm's PMI-1 or PMI-2 libs |
| <a class="slurm_link" href="#mpich_pmi_slurm"></a> |
| </h3> |
| |
<p>MPICH can be built specifically for use with Slurm and its PMI-1 or PMI-2
libraries using a configure line similar to those shown below. Building this
way will force the use of the chosen library on every execution. Note that
LD_LIBRARY_PATH may not be necessary depending on your Slurm installation path:
</p>
| <p> For PMI-2:</p> |
| <pre> |
| user@testbox:~/mpich-4.0.2/build$ LD_LIBRARY_PATH=~/slurm/22.05/inst/lib/ \ |
| > ../configure --prefix=/home/user/bin/mpich/ --with-pmilib=slurm \ |
> --with-pmi=pmi2 --with-slurm=/home/user/slurm/22.05/inst
| </pre> |
| <p> |
| or for PMI-1: |
| </p> |
| <pre> |
| user@testbox:~/mpich-4.0.2/build$ LD_LIBRARY_PATH=~/slurm/22.05/inst/lib/ \ |
| > ../configure --prefix=/home/user/bin/mpich/ --with-pmilib=slurm \ |
| > --with-slurm=/home/user/slurm/22.05/inst |
| </pre> |
| |
<p>These configure lines will detect Slurm's installed PMI libraries and
link against them, but <b>will not install</b> the mpiexec commands. Since PMI-1
is old and doesn't scale well, we don't recommend linking against it;
it is preferable to use PMI-2. You can follow this example to run a job with
PMI-2:</p>
| |
| <pre> |
| $ mpicc -o hello_world hello_world.c |
| $ srun --mpi=pmi2 ./hello_world |
| </pre> |
| |
<p>A Slurm upgrade will not affect this MPICH installation. There is only one
unlikely scenario where a recompile of the MPI stack would be needed after an
upgrade: when we forcibly link against Slurm's PMI-1 and/or PMI-2 libraries
and their APIs change. These APIs should not change often, but if they did,
it would be noted in Slurm's RELEASE_NOTES file.</p>
| |
| <h3 id="mpich_slurm_pmix">MPICH with PMIx and integrated with Slurm |
| <a class="slurm_link" href="#mpich_slurm_pmix"></a> |
| </h3> |
| |
| <p> You can also build MPICH using an external PMIx library which should be the |
| same one used when building Slurm:</p> |
| |
| <pre> |
| $ LD_LIBRARY_PATH=~/slurm/22.05/inst/lib/ ../configure \ |
| > --prefix=/home/user/bin/mpich/ \ |
| > --with-pmix=/home/user/bin/pmix_4.1.2/ \ |
| > --with-pmi=pmix \ |
| > --with-slurm=/home/user/slurm/master/inst |
| </pre> |
| |
<p>After building this way, any execution must be launched with Slurm (srun),
since the Hydra process manager is not installed, just as in the previous
examples. Compile and run a process with:
</p>
| |
| <pre> |
| $ mpicc -o hello_world hello_world.c |
| $ srun --mpi=pmix ./hello_world |
| </pre> |
| |
| <h3 id="mpich_slurm_hydra">MPICH with its internal PMI and integrated with Slurm |
| <a class="slurm_link" href="#mpich_slurm_hydra"></a> |
| </h3> |
| |
<p>Another option is to compile MPICH without setting <b>--with-pmilib</b>,
<b>--with-pmix</b> or <b>--with-pmi</b>, keeping only <b>--with-slurm</b>.
In that case, MPICH will not forcibly link against any PMI libraries and it will
install the mpiexec.hydra command by default. This will cause it to use its
internal PMI implementation (based on PMI-1) and Slurm API functions to detect
the job environment and launch processes accordingly:
</p>
| |
| <pre> |
| user@testbox:~/mpich-4.0.2/build$ ../configure \ |
| > --prefix=/home/user/bin/mpich/ \ |
| > --with-slurm=/home/user/slurm/22.05/inst |
| </pre> |
| |
| <p>Then the app can be run with srun or mpiexec:</p> |
| |
| <pre> |
| $ mpicc -o hello_world hello_world.c |
| $ srun ./hello_world |
| </pre> |
| <p>or</p> |
| <pre> |
| $ mpiexec.hydra ./hello_world |
| </pre> |
| |
<p>mpiexec.hydra will spawn its daemons using Slurm steps launched with srun and
will use its internal PMI implementation.</p>
| |
<p><b>NOTE</b>: In this case, compiling with the <b>--with-slurm</b> option
created the Hydra bootstrap commands (mpiexec.hydra and others) and linked them
against Slurm's versioned main public API (libslurm.so.X.0.0), because these
commands use some Slurm functions to detect the job environment.
Be aware that upgrading Slurm may then require recompiling the MPICH stack.
It is usually enough to symlink the old library name to the new library,
but this is not guaranteed to work.
</p>
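
<p>As an illustration of the symlink workaround mentioned above, where X is the
version MPICH was linked against and Y is the newly installed version (both are
placeholders, and as noted this is not guaranteed to work):</p>

<pre>
$ cd /path/to/new/slurm/inst/lib
$ ln -s libslurm.so.Y.0.0 libslurm.so.X.0.0
</pre>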
| |
| <h3 id="mpich_hydra">MPICH without Slurm integration |
| <a class="slurm_link" href="#mpich_hydra"></a> |
| </h3> |
| |
<p>Finally, it is possible to compile MPICH without integrating it with Slurm.
In that case it will not identify the job and will just run the processes as if
they were on a local machine. We recommend reading the MPICH documentation and
the configure scripts for more information on the existing possibilities.</p>
| |
| <hr size=4 width="100%"> |
| |
| <h2 id="mvapich2"> |
| <a href="https://mvapich.cse.ohio-state.edu"> |
| <b>MVAPICH2</b> |
| </a> |
| <a class="slurm_link" href="#mvapich2"></a> |
| </h2> |
| |
| <p>MVAPICH2 has support for Slurm. To enable it you need to build MVAPICH2 |
| with a command similar to this:</p> |
| <pre> |
| $ ./configure --prefix=/home/user/bin/mvapich2 \ |
| > --with-slurm=/home/user/slurm/22.05/inst/ |
| </pre> |
| |
<p><b>NOTE</b>: In certain MVAPICH2 versions, and when building with
GCC &gt; 10.x, these flags may need to be prepended to the configure line:</p>
| <pre> |
| FFLAGS="-std=legacy" FCFLAGS="-std=legacy" ./configure ... |
| </pre> |
| |
| <p>When MVAPICH2 is built with Slurm support it will detect that it is |
| within a Slurm allocation, and will use the 'srun' command to spawn its hydra |
| daemons. It does not link to the Slurm API, which means that during an upgrade |
| of Slurm there is no need to recompile MVAPICH2. By default it will use the |
| internal PMI implementation.</p> |
| |
<p>Note that by default Slurm will inject two environment variables to ensure
that mpirun or mpiexec uses the Slurm bootstrap mechanism (srun) to launch
Hydra. With these, Hydra will also pass the '--external-launcher' argument to
srun so that these hydra processes are not treated as regular steps. This
automatic variable injection can be disabled by setting
"MpiParams=disable_slurm_hydra_bootstrap" in slurm.conf, except where "slurm"
is already explicitly set as the bootstrap.</p>
| |
| <p>It is possible to run MVAPICH2 using a different bootstrap mechanism. To do |
| so, explicitly set the following environment variable prior to submitting the |
| job with sbatch or salloc:</p> |
| <pre> |
| $ export HYDRA_BOOTSTRAP=ssh |
| $ salloc -N10 |
$ mpirun -np &lt;num_procs&gt; user_app.bin
| </pre> |
| |
| <h3 id="mvapich_pmi_slurm"> |
| MVAPICH2 with srun and linked with Slurm's PMI-1 or PMI-2 libs |
| <a class="slurm_link" href="#mvapich_pmi_slurm"></a> |
| </h3> |
| |
<p>It is possible to force MVAPICH2 to use one of Slurm's PMI-1
(libpmi.so.0.0.0) or PMI-2 (libpmi2.so.0.0.0) libraries. Building in this mode
will cause all executions to use Slurm and its PMI libraries.
The Hydra process manager binaries (mpiexec) won't be installed; in fact, the
mpiexec command will exist as a symbolic link to Slurm's srun command. It is
recommended not to use PMI-1, but to use at least the PMI-2 libs.
See below for configuration and usage examples:</p>
| |
| <pre> |
| For PMI-2: |
| |
| ./configure --prefix=/home/user/bin/mvapich2 \ |
| > --with-slurm=/home/user/slurm/master/inst/ \ |
| > --with-pm=slurm --with-pmi=pmi2 |
| |
| and for PMI-1: |
| |
| ./configure --prefix=/home/user/bin/mvapich2 \ |
| > --with-slurm=/home/user/slurm/master/inst/ \ |
| > --with-pm=slurm --with-pmi=pmi1 |
| </pre> |
| |
| <p>To compile and run a user application in Slurm:</p> |
| |
| <pre> |
| $ mpicc -o hello_world hello_world.c |
| $ srun --mpi=pmi2 ./hello_world |
| </pre> |
| |
<p>For more information, please see the MVAPICH2 documentation on their
<a href="https://mvapich.cse.ohio-state.edu/">webpage</a>.</p>
| |
| <h3 id="mvapich_pmix_slurm"> |
| MVAPICH2 with Slurm support and linked with external PMIx |
| <a class="slurm_link" href="#mvapich_pmix_slurm"></a> |
| </h3> |
| |
<p>It is possible to use PMIx within MVAPICH2, integrated with Slurm. In this
mode no Hydra Process Manager will be installed and user apps will need to
run with srun, assuming Slurm has been compiled against the same or a compatible
PMIx version as the one used when building MVAPICH2.</p>
| |
| <p>To build MVAPICH2 to use PMIx and integrated with Slurm, a configuration line |
| similar to this is needed:</p> |
| <pre> |
| ./configure --prefix=/home/user/bin/mvapich2 \ |
| > --with-slurm=/home/user/slurm/master/inst/ \ |
| > --with-pm=slurm \ |
| > --with-pmix=/home/user/bin/pmix_4.1.2/ \ |
| > --with-pmi=pmix |
| </pre> |
| |
| <p>Running a job looks similar to previous examples:</p> |
| |
| <pre> |
| $ mpicc -o hello_world hello_world.c |
| $ srun --mpi=pmix ./hello_world |
| </pre> |
| |
<p><b>NOTE</b>: In the MVAPICH2 case, compiling with Slurm integration
(<b>--with-slurm</b>) doesn't add any dependency on Slurm commands or libraries,
so upgrading Slurm should be safe without recompiling MVAPICH2. There
is only one unlikely scenario where a recompile of the MPI stack would be
needed after an upgrade: when we forcibly link against Slurm's PMI-1
and/or PMI-2 libraries and their APIs change. These should not change
often, but if it were to happen, it would be noted in Slurm's RELEASE_NOTES
file.</p>
| |
| <h2 id="hpe_cray_pmi"> |
| <b>HPE Cray PMI support</b> |
| <a class="slurm_link" href="#hpe_cray_pmi"></a> |
| </h2> |
| |
| <p>Slurm comes by default with a Cray PMI vendor-specific plugin which provides |
| compatibility with the HPE Cray Programming Environment's PMI. It is intended to |
| be used in applications built with this environment on HPE Cray machines.</p> |
<p>The plugin is named <i>cray_shasta</i> (Shasta was the first Cray
architecture this plugin supported) and is built by default in all Slurm
installations. Its availability can be verified by running the following
command:</p>
| |
| <pre> |
| $ srun --mpi=list |
| MPI plugin types are... |
| cray_shasta |
| none |
| </pre> |
| |
<p>The Cray PMI plugin will use some reserved ports for its communication. These
ports are configurable by using the <i>--resv-ports</i> option on the srun
command line, or by setting <i>MpiParams=ports=[port_range]</i>
in your slurm.conf. The first port listed in this option will be used as the
PMI control port, defined by Cray as the <b>PMI_CONTROL_PORT</b> environment
variable. There cannot be more than one application launched on the same node
using the same <b>PMI_CONTROL_PORT</b>.</p>
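
<p>For example, the reserved port range can be defined cluster-wide in
slurm.conf and then used by a job; the range and application shown are only
examples:</p>

<pre>
slurm.conf:
MpiParams=ports=20000-21000

$ srun --resv-ports --mpi=cray_shasta -n4 ./user_app
</pre>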
| |
| <p>This plugin does not support MPMD/heterogeneous jobs and it requires |
| <i>libpals >= 0.2.8</i>.</p> |
| |
| <p style="text-align:center;">Last modified 12 December 2024</p> |
| |
| <!--#include virtual="footer.txt"--> |