| <!--#include virtual="header.txt"--> |
| |
| <h1><a name="top">SLURM Programmer's Guide</a></h1> |
| |
| <h2>Overview</h2> |
| |
| <p>Simple Linux Utility for Resource Management (SLURM) is an open source, fault-tolerant, |
| and highly scalable cluster management and job scheduling system for large and |
| small Linux clusters. Components include machine status, partition management, |
| job management, scheduling, and stream copy modules. SLURM requires no kernel |
modifications for its operation and is relatively self-contained.
<p>An overview of the components and their interactions is available in
a separate document, <a href="slurm_design.pdf">SLURM: Simple Linux Utility for
Resource Management</a> [PDF].
| |
| <p>SLURM is written in the C language and uses a GNU <b>autoconf</b> configuration |
| engine. While initially written for Linux, other UNIX-like operating systems should |
| be easy porting targets. Code should adhere to the <a href="coding_style.pdf"> |
| Linux kernel coding style</a>. <i>(Some components of SLURM have been taken from |
| various sources. Some of these components do not conform to the Linux kernel |
| coding style. However, new code written for SLURM should follow these standards.)</i> |
| |
<p>Many of these modules have been built and tested on a variety of Unix systems,
including Red Hat Linux, IBM's AIX, Sun's Solaris, and Compaq's Tru64. The only
| module at this time that is operating system dependent is <span class="commandline">src/slurmd/read_proc.c</span>. |
| We will be porting and testing on additional platforms in future releases. |
| |
| <h2>Plugins</h2> |
| |
| <p>To make the use of different infrastructures possible, SLURM uses a general |
| purpose plugin mechanism. A SLURM plugin is a dynamically linked code object that |
| is loaded explicitly at run time by the SLURM libraries. It provides a customized |
| implementation of a well-defined API connected to tasks such as authentication, |
interconnect fabric, task scheduling, etc. A common set of functions is defined
for all of the plugins of a particular variety. When a SLURM
| daemon is initiated, it reads the configuration file to determine which of the |
| available plugins should be used. A <a href="plugins.html">plugin developer's |
| guide</a> is available with general information about plugins. Most plugin |
| types also have their own documentation available, such as |
| <a href="authplugins.html">SLURM Authentication Plugin API</a> and |
| <a href="jobcompplugins.html">SLURM Job Completion Logging API</a>.</p> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <h2>Directory Structure</h2> |
| |
| <p>The contents of the SLURM directory structure will be described below in increasing |
| detail as the structure is descended. The top level directory contains the scripts |
| and tools required to build the entire SLURM system. It also contains a variety |
| of subdirectories for each type of file.</p> |
| <p>General build tools/files include: <b>acinclude.m4</b>, <b>autogen.sh</b>, |
| <b>configure.ac</b>, <b>Makefile.am</b>, <b>Make-rpm.mk</b>, <b>META</b>, <b>README</b>, |
| <b>slurm.spec.in</b>, and the contents of the <b>auxdir</b> directory. <span class="commandline">autoconf</span> |
| and <span class="commandline">make</span> commands are used to build and install |
| SLURM in an automated fashion. NOTE: <span class="commandline">autoconf</span> |
| version 2.52 or higher is required to build SLURM. Execute |
| <span class="commandline">autoconf -V</span> to check your version number. |
| The build process is described in the README file. |
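<p>Assuming the standard GNU tool chain is installed, a typical build from a
source checkout might look like the following (the install prefix here is
arbitrary); the README file remains the authoritative reference:</p>

<pre>
$ ./autogen.sh                      # regenerate configure and Makefile.in files
$ ./configure --prefix=/usr/local   # add site-specific options as needed
$ make
$ make install
</pre>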
| |
| <p>Copyright and disclaimer information are in the files COPYING and DISCLAIMER. |
| All of the top-level subdirectories are described below.</p> |
| |
| <p style="margin-left:.2in"><b>auxdir</b>—Used for building SLURM.<br> |
| <b>contribs</b>—Various contributed tools.<br> |
| <b>doc</b>—Documentation including man pages. <br> |
| <b>etc</b>—Sample configuration files.<br> |
| <b>slurm</b>—Header files for API use. These files must be installed. Placing |
| these header files in this location makes for better code portability.<br> |
| <b>src</b>—Contains all source code and header files not in the "slurm" subdirectory |
| described above.<br> |
<b>testsuite</b>&#8212;DejaGnu and Expect are used for testing; all of the test
files are here.</p>
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <h2>Documentation</h2> |
| <p>All of the documentation is in the subdirectory <b>doc</b>. |
| Two directories are of particular interest:</p> |
| |
| <p style="margin-left:.2in"> |
| <b>doc/man</b>— contains the man pages for the APIs, |
| configuration file, commands, and daemons.<br> |
| <b>doc/html</b>— contains the web pages.</p> |
| |
| <h2>Source Code</h2> |
| |
| <p>Functions are divided into several categories, each in its own subdirectory. |
The details of each directory's contents are provided below. The directories are
| as follows: </p> |
| |
| <p style="margin-left:.2in"> |
<b>api</b>&#8212;Application Program Interfaces into
the SLURM code. Used to send and get SLURM information from the central manager.
These are the functions user applications might utilize (see the example
following this listing).<br>
| <b>common</b>—General purpose functions for widespread use throughout |
| SLURM.<br> |
| <b>database</b>—Various database files that support the accounting |
| storage plugin.<br> |
| <b>plugins</b>—Plugin functions for various infrastructures or optional |
| behavior. A separate subdirectory is used for each plugin class:<br> |
| <ul> |
| <li><b>accounting_storage</b> for specifying the type of storage for accounting,<br> |
| <li><b>auth</b> for user authentication,<br> |
| <li><b>checkpoint</b> for system-initiated checkpoint and restart of user jobs,<br> |
| <li><b>crypto</b> for cryptographic functions,<br> |
| <li><b>jobacct_gather</b> for job accounting,<br> |
| <li><b>jobcomp</b> for job completion logging,<br> |
| <li><b>mpi</b> for MPI support,<br> |
<li><b>priority</b> for calculating job priority based on a number of factors
including fair-share,<br>
<li><b>proctrack</b> for process tracking,<br>
<li><b>sched</b> for job scheduling,<br>
<li><b>select</b> for a job's node selection,<br>
<li><b>switch</b> for switch (interconnect) specific functions,<br>
<li><b>task</b> for task affinity to processors,<br>
<li><b>topology</b> for assigning nodes to jobs based on node
topology.<br>
| </ul> |
| <p style="margin-left:.2in"> |
| <b>sacct</b>—User command to view accounting information about jobs.<br> |
| <b>sacctmgr</b>—User and administrator tool to manage accounting.<br> |
| <b>salloc</b>—User command to allocate resources for a job.<br> |
| <b>sattach</b>—User command to attach standard input, output and error |
| files to a running job or job step.<br> |
| <b>sbatch</b>—User command to submit a batch job (script for later execution).<br> |
| <b>sbcast</b>—User command to broadcast a file to all nodes associated |
| with an existing SLURM job.<br> |
| <b>scancel</b>—User command to cancel (or signal) a job or job step.<br> |
| <b>scontrol</b>—Administrator tool to manage SLURM.<br> |
| <b>sinfo</b>—User command to get information on SLURM nodes and partitions.<br> |
| <b>slurmctld</b>—SLURM central manager daemon code.<br> |
| <b>slurmd</b>—SLURM daemon code to manage the compute server nodes including |
| the execution of user applications.<br> |
| <b>slurmdbd</b>—SLURM database daemon managing access to the accounting |
| storage database.<br> |
<b>smap</b>&#8212;User command to view the layout of nodes, partitions, and jobs.
This is particularly valuable on systems like BlueGene, which has a
three-dimensional torus topology.<br>
| <b>sprio</b>—User command to see the breakdown of a job's priority |
| calculation when the Multifactor Job Priority plugin is installed.<br> |
| <b>squeue</b>—User command to get information on SLURM jobs and job steps.<br> |
| <b>sreport</b>—User command to view various reports about past |
| usage across the enterprise.<br> |
<b>srun</b>&#8212;User command to submit a job, get an allocation, and/or
initiate a parallel job step.<br>
| <b>srun_cr</b>—Checkpoint/Restart wrapper for srun.<br> |
| <b>sshare</b>—User command to view shares and usage when the Multifactor |
| Job Priority plugin is installed.<br> |
| <b>sstat</b>—User command to view detailed statistics about running |
| jobs when a Job Accounting Gather plugin is installed.<br> |
| <b>strigger</b>—User and administrator tool to manage event triggers.<br> |
<b>sview</b>&#8212;User command to view and update node, partition, and job
state information.<br>
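<p>As noted in the <b>api</b> entry above, user applications can link against
the SLURM API to query the central manager directly. The following minimal
sketch (the file name is invented for illustration) prints the current job
table using functions declared in &lt;slurm/slurm.h&gt;:</p>

<pre>
/* jobinfo.c - print the current SLURM job table (illustrative).
 * Build with something like: gcc -o jobinfo jobinfo.c -lslurm
 */
#include &lt;stdio.h&gt;
#include &lt;time.h&gt;
#include &lt;slurm/slurm.h&gt;
#include &lt;slurm/slurm_errno.h&gt;

int main(void)
{
	job_info_msg_t *jobs = NULL;

	/* Request all job records from the slurmctld daemon. */
	if (slurm_load_jobs((time_t) 0, &amp;jobs, SHOW_ALL) != SLURM_SUCCESS) {
		slurm_perror("slurm_load_jobs");
		return 1;
	}

	/* Print each job's state information, then free the message. */
	slurm_print_job_info_msg(stdout, jobs, 0);
	slurm_free_job_info_msg(jobs);
	return 0;
}
</pre>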
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <h2>Configuration</h2> |
| <p>Sample configuration files are included in the <b>etc</b> subdirectory. |
The <b>slurm.conf</b> file can be built using a <a href="configurator.html">configuration tool</a>.
| See <b>doc/man/man5/slurm.conf.5</b> and the man pages for other configuration files |
| for more details. |
| <b>init.d.slurm</b> is a script that determines which |
| SLURM daemon(s) should execute on any node based upon the configuration file contents. |
| It will also manage these daemons: starting, signalling, restarting, and stopping them.</p> |
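<p>A minimal <b>slurm.conf</b> might look like the following sketch (the host
names and node counts are invented for illustration; see the slurm.conf man
page for the full set of parameters):</p>

<pre>
# Minimal illustrative slurm.conf
ControlMachine=control_host
AuthType=auth/munge
SlurmUser=slurm
NodeName=tux[1-4] Procs=2 State=UNKNOWN
PartitionName=debug Nodes=tux[1-4] Default=YES MaxTime=30 State=UP
</pre>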
| |
| <h2>Test Suite</h2> |
| <p>The <b>testsuite</b> files use a DejaGnu framework for testing. These tests |
| are very limited in scope.</p> |
| |
| <p>We also have a set of Expect SLURM tests available under the <b>testsuite/expect</b> |
| directory. These tests are executed after SLURM has been installed |
| and the daemons initiated. About 250 test scripts exercise all SLURM commands |
| and options including stress tests. The file <b>testsuite/expect/globals</b> |
| contains default paths and procedures for all of the individual tests. At |
| the very least, you will need to set the <i>slurm_dir</i> variable to the correct |
| value. To avoid conflicts with other developers, you can override variable settings |
| in a separate file named <b>testsuite/expect/globals.local</b>.</p> |
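<p>The globals files are Expect (Tcl) scripts, so an override is a simple
<i>set</i> command. For example, a <b>testsuite/expect/globals.local</b> file
might contain the following (the path shown is invented for illustration):</p>

<pre>
# globals.local - local overrides for the Expect test suite
set slurm_dir "/home/me/slurm/install"
</pre>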
| |
| <p>Set your working directory to <b>testsuite/expect</b> before |
| starting these tests. Tests may be executed individually by name |
| (e.g. <i>test1.1</i>) |
| or the full test suite may be executed with the single command <i>regression</i>. |
| See <b>testsuite/expect/README</b> for more information.</p> |
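<p>For example, assuming the test suite has been configured as described
above:</p>

<pre>
$ cd testsuite/expect
$ ./test1.1      # run a single test by name
$ ./regression   # run the full test suite
</pre>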
| |
| |
| <h2>Adding Files and Directories</h2> |
| <p>If you are adding files and directories to SLURM, it will be necessary to |
| re-build configuration files before executing the <b>configure</b> command. |
| Update <b>Makefile.am</b> files as needed then execute |
| <b>autogen.sh</b> before executing <b>configure</b>. |
| |
| <h2>Tricks of the Trade</h2> |
| <h3>HAVE_FRONT_END</h3> |
| <p>You can make a single node appear to SLURM as a Linux cluster by running |
| <i>configure</i> with the <i>--enable-front-end</i> option. This |
defines <b>HAVE_FRONT_END</b> with a non-zero value in the file <b>config.h</b>.
| All (fake) nodes should be defined in the <b>slurm.conf</b> file. |
These nodes should be configured with a single <b>NodeAddr</b> value
pointing to the node on which the single <span class="commandline">slurmd</span>
daemon executes. Initiate one <span class="commandline">slurmd</span> and one
<span class="commandline">slurmctld</span> daemon. Do not initiate too many
simultaneous job steps; the single <span class="commandline">slurmd</span>
daemon executes them all and can easily be overloaded.</p>
| |
| <h3><a name="multiple_slurmd_support">Multiple slurmd support</a></h3> |
| <p>It is possible to run multiple slurmd daemons on a single node, each using |
| a different port number and NodeName alias. This is very useful for testing |
| networking and protocol changes, or anytime you want to simulate a larger |
| cluster than you really have. The author uses this on his desktop to simulate |
multiple nodes. Note, however, that not all SLURM functions will work with
multiple slurmd support enabled (e.g., many switch plugins will not work;
it is best to use switch/none).</p>
| |
<p>Multiple slurmd support is enabled at configure time with the
"--enable-multiple-slurmd" parameter. This enables a new parameter on the
NodeName line of the slurm.conf file, "Port=&lt;port number&gt;", and adds a new
command line parameter to slurmd, "-N".</p>
| |
| <p>Each slurmd needs to have its own NodeName, and its own TCP port number. Here |
| is an example of the NodeName lines for running three slurmd daemons on each |
| of ten nodes:</p> |
| |
| <pre> |
| NodeName=foo[1-10] NodeHostname=host[1-10] Port=17001 |
| NodeName=foo[11-20] NodeHostname=host[1-10] Port=17002 |
| NodeName=foo[21-30] NodeHostname=host[1-10] Port=17003 |
| </pre> |
| |
| <p> |
It is likely that you will also want to use the "%n" symbol in any
slurmd-related paths in the slurm.conf file, for instance SlurmdLogFile,
| SlurmdPidFile, and especially SlurmdSpoolDir. Each slurmd replaces the "%n" |
| with its own NodeName. Here is an example:</p> |
| |
| <pre> |
| SlurmdLogFile=/var/log/slurm/slurmd.%n.log |
| SlurmdPidFile=/var/run/slurmd.%n.pid |
| SlurmdSpoolDir=/var/spool/slurmd.%n |
| </pre> |
| |
| <p> |
| It is up to you to start each slurmd daemon with the proper NodeName. |
| For example, to start the slurmd daemons for host1 from the |
| above slurm.conf example:</p> |
| |
| <pre> |
| host1> slurmd -N foo1 |
| host1> slurmd -N foo11 |
| host1> slurmd -N foo21 |
| </pre> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <p style="text-align:center;">Last modified 27 March 2009</p> |
| |
| <!--#include virtual="footer.txt"--> |