| This is SLURM, the Simple Linux Utility for Resource Management. SLURM |
| is an open-source cluster resource management and job scheduling system |
| that strives to be simple, scalable, portable, fault-tolerant, and |
| interconnect-agnostic. SLURM has so far been tested only under Linux. |
| |
| As a cluster resource manager, SLURM provides three key functions. First, |
| it allocates exclusive and/or non-exclusive access to resources |
| (compute nodes) to users for some duration of time so they can perform |
| work. Second, it provides a framework for starting, executing, and |
| monitoring work (normally a parallel job) on the set of allocated |
| nodes. Finally, it arbitrates conflicting requests for resources by |
| managing a queue of pending work. |
| |
| SLURM is provided "as is" and with no warranty. This software is |
| distributed under the GNU General Public License; please see the files |
| COPYING, DISCLAIMER, and LICENSE.OpenSSL for details. |
| |
| This README presents an introduction to compiling, installing, and |
| using SLURM. |
| |
| |
| SOURCE DISTRIBUTION HIERARCHY |
| ----------------------------- |
| |
| The top-level distribution directory contains this README as well as |
| other high-level documentation files, and the scripts used to configure |
| and build SLURM (see INSTALL). Subdirectories contain the source code |
| for SLURM as well as a DejaGNU test suite and further documentation. A |
| quick description of the subdirectories of the SLURM distribution follows: |
| |
| src/ [ SLURM source ] |
| SLURM source code is further organized into self-explanatory |
| subdirectories such as src/api, src/slurmctld, etc. |
| |
| doc/ [ SLURM documentation ] |
| The documentation directory contains some LaTeX, HTML, and ASCII |
| text papers, READMEs, and guides. Manual pages for the SLURM |
| commands and configuration files are also under the doc/ directory. |
| |
| etc/ [ SLURM configuration ] |
| The etc/ directory contains a sample config file, as well as |
| some scripts useful for running SLURM. |
| |
| slurm/ [ SLURM include files ] |
| This directory contains installed include files, such as slurm.h |
| and slurm_errno.h, needed for compiling against the SLURM API. |
| |
| testsuite/ [ SLURM test suite ] |
| The testsuite directory contains the framework for a set of |
| DejaGNU and "make check" type tests for SLURM components. |
| There is also an extensive collection of Expect scripts. |
| |
| auxdir/ [ autotools directory ] |
| Directory for autotools scripts and files used to configure and |
| build SLURM. |
| |
| contribs/ [ helpful tools outside of SLURM proper ] |
| Directory for anything that is outside of SLURM proper, such as an |
| alternative API. To build these tools, run |
| "make contrib/install-contrib". |
| |
| COMPILING AND INSTALLING THE DISTRIBUTION |
| ----------------------------------------- |
| |
| Please see the INSTALL file for basic instructions. You will need a |
| working installation of OpenSSL. |
| |
| SLURM does not use reserved ports to authenticate communication |
| between components. You will need to have at least one "auth" |
| plugin. Currently, only three authentication plugins are available: |
| "auth/none," "auth/authd," and "auth/munge." The "auth/none" plugin is |
| built and used by default, but either Brent Chun's authd or Chris |
| Dunlap's Munge should be installed in order to get properly authenticated |
| communications. The configure script in the top-level directory of this |
| distribution will determine which authentication plugins may be built. |
| |
| |
| OpenSSL: |
| http://www.openssl.org |
| |
| AUTHD: |
| http://www.theether.org/authd/ |
| |
| MUNGE: |
| http://www.llnl.gov/linux/munge/ |
| |
| |
| CONFIGURATION |
| ------------- |
| |
| An annotated sample configuration file for SLURM is provided with this |
| distribution as etc/slurm.conf.example. Edit this config file to suit |
| your site and cluster, then copy it to `$sysconfdir/slurm.conf,' where |
| sysconfdir defaults to PREFIX/etc unless explicitly overridden in the |
| `configure' or `make' steps. |
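| |
The copy step above can be sketched as a small shell helper. This is a
minimal sketch, not part of the SLURM distribution: the function name
install_slurm_conf and the /usr/local prefix in the usage comment are
assumptions made for illustration. Substitute the sysconfdir your build
actually uses.

```shell
# install_slurm_conf SRC SYSCONFDIR
# Copy an already-edited sample config to SYSCONFDIR/slurm.conf.
# (Hypothetical helper; not shipped with SLURM.)
install_slurm_conf() {
    isc_src="$1"
    isc_dir="$2"
    if [ ! -f "$isc_src" ]; then
        echo "config file not found: $isc_src" >&2
        return 1
    fi
    # Create the target directory if it does not exist yet.
    mkdir -p "$isc_dir" || return 1
    cp "$isc_src" "$isc_dir/slurm.conf"
}

# Example, assuming PREFIX=/usr/local and the default sysconfdir (PREFIX/etc):
#   install_slurm_conf etc/slurm.conf.example /usr/local/etc
```

The helper refuses to run if the source file is missing, so a mistyped
path does not leave an empty or stale slurm.conf behind.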
| |
| Once the config file is installed in the proper location, you'll need |
| to create the keys for SLURM job credential creation and verification. |
| The following openssl commands should be used: |
| |
| > openssl genrsa -out /path/to/private/key 1024 |
| > openssl rsa -in /path/to/private/key -pubout -out /path/to/public/key |
| |
| The private key and public key locations should be those specified by |
| JobCredentialPrivateKey and JobCredentialPublicCertificate in the SLURM |
| config file. |
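| |
The two openssl commands above can be wrapped as follows. This is a
sketch under stated assumptions: the function name make_slurm_keys, the
file names slurm.key and slurm.cert, and the file permissions are
illustrative choices, not requirements taken from this README. The key
size defaults to the 1024 bits used above and can be overridden with a
second argument.

```shell
# make_slurm_keys KEY_DIR [BITS]
# Generate an RSA key pair with the openssl commands shown above and
# print the two matching slurm.conf lines on stdout.
# (Hypothetical helper; file names and permissions are assumptions.)
make_slurm_keys() {
    msk_dir="$1"
    msk_bits="${2:-1024}"
    mkdir -p "$msk_dir" || return 1
    # Create the private key with a restrictive umask so it is never
    # readable by other users, even briefly.
    ( umask 077 && openssl genrsa -out "$msk_dir/slurm.key" "$msk_bits" ) || return 1
    # Derive the public key from the private key.
    openssl rsa -in "$msk_dir/slurm.key" -pubout -out "$msk_dir/slurm.cert" || return 1
    chmod 644 "$msk_dir/slurm.cert"
    echo "JobCredentialPrivateKey=$msk_dir/slurm.key"
    echo "JobCredentialPublicCertificate=$msk_dir/slurm.cert"
}

# Example (the directory is an assumption):
#   make_slurm_keys /usr/local/etc/slurm-keys
```

The two printed lines can be pasted into the SLURM config file so that
the configured locations match the files just created.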
| |
| |
| RUNNING SLURM |
| ------------- |
| |
| Once a valid configuration has been set up and installed, the SLURM |
| controller, slurmctld, should be started on the primary and backup |
| control machines, and the SLURM compute node daemon, slurmd, should be |
| started on each compute server. |
| |
| The slurmd daemons need to run as root for production use, but may be |
| run as a user for testing purposes (obviously no jobs may be run as |
| any other user in that configuration). The SLURM controller, slurmctld, |
| needs to be run as the configured SlurmUser (see your config file). |
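| |
The mapping described above (which daemon runs where, and as which user)
can be summarized in a short lookup function. This is only a sketch: the
function name slurm_daemon_for and the role labels "controller",
"backup", and "compute" are hypothetical and are not SLURM terminology
for any command-line option.

```shell
# slurm_daemon_for ROLE
# Print the daemon to start on a machine with the given role, followed
# by the user it should run as in production.
# (Hypothetical helper; role names are illustrative labels.)
slurm_daemon_for() {
    case "$1" in
        controller|backup)
            # Primary and backup control machines run the controller.
            echo "slurmctld SlurmUser"
            ;;
        compute)
            # Each compute server runs the compute node daemon.
            echo "slurmd root"
            ;;
        *)
            echo "unknown role: $1" >&2
            return 1
            ;;
    esac
}

# Example:
#   slurm_daemon_for compute
```

See slurmctld(8) and slurmd(8) for the options each daemon accepts.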
| |
| Man pages are the best source of information about SLURM commands and |
| daemons. Please see: slurmctld(8), slurmd(8), scontrol(1), sinfo(1), |
| squeue(1), scancel(1), and srun(1). |
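| |
After installation, a quick way to confirm that the commands and daemons
listed above are reachable is to look each one up on PATH. The function
name slurm_missing_commands is an assumption made for this sketch. Note
that the daemons may be installed in a directory that is not on an
ordinary user's PATH.

```shell
# slurm_missing_commands
# Print, one per line, each SLURM command or daemon named above that
# cannot be found on PATH. Prints nothing if all are found.
# (Hypothetical helper; not shipped with SLURM.)
slurm_missing_commands() {
    for smc_cmd in slurmctld slurmd scontrol sinfo squeue scancel srun; do
        command -v "$smc_cmd" >/dev/null 2>&1 || echo "$smc_cmd"
    done
}

# Example:
#   slurm_missing_commands
```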
| |
| Also, take a look at the Quickstart Guide to get acquainted with |
| running and managing jobs with SLURM: doc/html/quickstart_admin.html |
| or PREFIX/share/doc/quickstart_admin.html. |
| |
| |
| PROBLEMS |
| -------- |
| |
| If you experience problems compiling, installing, or running SLURM, |
| please send e-mail to slurm-dev@lists.llnl.gov. |
| |
| $Id$ |