| <!--#include virtual="header.txt"--> |
| |
| <h1>Multi-Cluster Operation</h1> |
| |
| <p>A cluster is comprised of all the nodes managed by a single slurmctld |
| daemon. SLURM version 2.2 offers the ability to target commands to other |
| clusters instead of, or in addition to, the local cluster on which the |
| command is invoked. When this behavior is enabled, users can submit |
| jobs to one or many clusters and receive status from those remote |
| clusters.</p> |
| |
| <p>For example:</p> |
| |
| <PRE> |
| juser@dawn> squeue -M dawn,dusk |
| CLUSTER: dawn |
| JOBID PARTITION NAME USER ST TIME NODES BP_LIST(REASON) |
| 76897 pdebug myJob juser R 4:10 128 dawn001[8-15] |
| 76898 pdebug myJob juser R 4:10 128 dawn001[16-23] |
| 16899 pdebug myJob juser R 4:10 128 dawn001[24-31] |
| |
| CLUSTER: dusk |
| JOBID PARTITION NAME USER ST TIME NODES BP_LIST(REASON) |
| 11950 pdebug aJob juser R 4:20 128 dusk000[0-15] |
| 11949 pdebug aJob juser R 5:01 128 dusk000[48-63] |
| 11946 pdebug aJob juser R 6:35 128 dusk000[32-47] |
| 11945 pdebug aJob juser R 6:36 128 dusk000[16-31] |
| </PRE> |
| |
| <p>The following SLURM client commands now offer the "-M, --clusters=" |
| option which provides the ability to communicate to and from a comma |
| separated list of clusters: |
| <ol><b>sacct, sbatch, scancel, scontrol, sinfo, smap, sprio, squeue, |
| sshare,</b> and <b>sstrigger</b></ol> |
| |
| <b>salloc, srun,</b> and <b>sstat</b> are cluster specific commands |
| and do not offer the "-M, --clusters=" option.</p> |
| |
| <p>When <b>sbatch</b> is invoked with a cluster list, SLURM will |
| immediately submit the job to the cluster that offers the earliest |
| start time subject its queue of pending and running jobs. SLURM will |
| make no subsequent effort to migrate the job to a different cluster |
| (from the list) whose resources become available when running jobs |
| finish before their scheduled end times.</p> |
| |
| <h2>Multi-Cluster Configuration</h2> |
| <p>The multi-cluster functionality requires the use of the slurmDBD. |
| The AccountingStorageType in the slurm.conf file must be set to the |
| accounting_storage/slurmdbd plugin and the MUNGE or authentication |
| keys must be installed to allow each cluster to communicate with the |
| slurmDBD. Note that MUNGE can be configured to use different keys for |
| communications within a cluster and across clusters if desired. |
| See <a href="accounting.html">accounting</a> for details.</p> |
| |
| <p>Once configured, SLURM commands specifying the "-M, --clusters=" |
| option will become active for all of the clusters listed by the |
| <b>"sacctmgr show clusters"</b> command.</p> |
| |
| <p style="text-align:center;">Last modified 14 February 2011</p> |
| |
| <!--#include virtual="footer.txt"--> |