<!--#include virtual="header.txt"-->
<h1>Moab Cluster Suite Integration Guide</h1>
<h2>Overview</h2>
<p>Moab Cluster Suite configuration is complex and beyond the scope
of the documentation supplied with SLURM.
The best resource for Moab configuration information is the
online documentation at Cluster Resources Inc.:
<a href="http://www.clusterresources.com/products/mwm/docs/slurmintegration.shtml">
http://www.clusterresources.com/products/mwm/docs/slurmintegration.shtml</a>.</p>
<h2>Configuration</h2>
<p>First, download the Moab scheduler kit from the Cluster Resources web site
<a href="http://www.clusterresources.com/pages/products/moab-cluster-suite.php">
http://www.clusterresources.com/pages/products/moab-cluster-suite.php</a>.<br>
<b>Note:</b> Use Moab version 5.0.0 or higher and SLURM version 1.1.28
or higher.</p>
<h3>SLURM configuration</h3>
<h4>slurm.conf</h4>
<p>Set the <i>slurm.conf</i> scheduler parameters as follows:</p>
<pre>
SchedulerType=sched/wiki2
SchedulerPort=7321
</pre>
<p>Running multiple jobs per node can be accomplished in two different
ways.
The <i>SelectType=select/cons_res</i> parameter can be used to let
SLURM allocate the individual processors, memory, and other
consumable resources (in SLURM version 1.2.1 or higher).
Alternatively, <i>SelectType=select/linear</i> or
<i>SelectType=select/bluegene</i> can be used with the
<i>Shared=yes</i> or <i>Shared=force</i> parameter in
partition configuration specifications.</p>
<p>The default value of <i>SchedulerPort</i> is 7321.</p>
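<p>The two approaches to running multiple jobs per node described above
might look like the following in <i>slurm.conf</i>.
This is only a sketch: the partition name <i>batch</i> and the node
names <i>tux[0-16]</i> are assumptions made for illustration, and only
one <i>SelectType</i> line should be active at a time.</p>
<pre>
# Option 1: SLURM allocates individual processors, memory, etc.
# (SLURM version 1.2.1 or higher)
SelectType=select/cons_res
#
# Option 2: whole-node selection with sharing enabled per partition
# (partition and node names below are hypothetical)
#SelectType=select/linear
#PartitionName=batch Nodes=tux[0-16] Shared=yes
</pre>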
<h4>SLURM commands</h4>
<p> Note that the <i>srun --immediate</i> option is not compatible
with Moab.
All jobs must wait for Moab to schedule them rather than being
scheduled immediately by SLURM.</p>
<a name="wiki.conf"><h4>wiki.conf</h4></a>
<p>SLURM's wiki configuration is stored in a file
specific to the wiki-plugin named <i>wiki.conf</i>.
This file should be protected from reading by users.
It only needs to be readable by <i>SlurmUser</i> (as configured
in <i>slurm.conf</i>) and only needs to exist on computers
where the <i>slurmctld</i> daemon executes.
More information about wiki.conf is available in
a man page distributed with SLURM.</p>
<p>The currently supported wiki.conf keywords include:</p>
<p><b>AuthKey</b> is a DES based encryption key used to sign
communications between SLURM and Maui or Moab.
Use of this key is essential to ensure that a user
cannot build a program of his own to cancel other users' jobs in
SLURM.
The value should be an unsigned integer of no more than 32 bits and
must match the encryption key in Maui (<i>--with-key</i> on the
configure line) or Moab (<i>KEY</i> parameter in the
<i>moab-private.cfg</i> file).
Note that SLURM's wiki plugin does not include a mechanism
to submit new jobs, so even without this key nobody could
run jobs as another user.</p>
<p><b>EPort</b> is an event notification port in Moab.
When a job is submitted to or terminates in SLURM,
Moab is sent a message on this port to begin an attempt
to schedule the computer.
This numeric value should match <i>EPORT</i> configured
in the <i>moab.cfg</i> file.</p>
<p><b>EHost</b> is the event notification host for Moab.
This identifies the computer on which the Moab daemon
executes and which should be notified of events.
By default EHost will be identical in value to the
ControlAddr configured in slurm.conf.</p>
<p><b>EHostBackup</b> is the event notification backup host for Moab.
It names the computer on which the backup Moab server executes and
is used to establish a communications path for event notification.
By default EHostBackup will be identical in value to the
BackupAddr configured in slurm.conf.</p>
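<p>As an illustration of the three event notification keywords above,
a <i>wiki.conf</i> fragment that sets them explicitly might look like
this.
The host names <i>tux0</i> and <i>tux1</i> and the port number are
example values only; substitute the hosts where the primary and backup
Moab servers actually run and the port configured as <i>EPORT</i> in
Moab.</p>
<pre>
# Port on which Moab listens for event notifications (example value)
EPort=15017
# Host running the primary Moab server (example name);
# defaults to ControlAddr from slurm.conf if omitted
EHost=tux0
# Host running the backup Moab server (example name);
# defaults to BackupAddr from slurm.conf if omitted
EHostBackup=tux1
</pre>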
<p><b>ExcludePartitions</b> is used to identify partitions
whose jobs are to be scheduled directly by SLURM rather
than Moab.
This only affects jobs which are submitted using SLURM
commands (i.e. srun, salloc or sbatch, NOT msub from Moab).
These jobs will be scheduled on a First-Come-First-Served
basis.
This may provide faster response times than Moab scheduling.
Moab will account for and report the jobs, but their initiation
will be outside of Moab's control.
Note that Moab controls for resource reservation, fair share
scheduling, etc. will not apply to the initiation of these jobs.
If more than one partition is to be scheduled directly by
SLURM, use a comma separator between their names.</p>
<p><b>HidePartitionJobs</b> identifies partitions whose jobs are not
to be reported to Moab.
These jobs will not be accounted for or otherwise visible to Moab.
Any partitions listed here must also be listed in <b>ExcludePartitions</b>.
If more than one partition is to have its jobs hidden, use a comma
separator between their names.</p>
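<p>For example, a <i>wiki.conf</i> fragment along the following lines
would have SLURM schedule two partitions directly while hiding the jobs
of only one of them from Moab.
The partition names <i>debug</i> and <i>test</i> are assumptions made
for illustration.
Note that the hidden partition also appears in
<b>ExcludePartitions</b>, as required.</p>
<pre>
# SLURM schedules jobs in both partitions directly (names are examples)
ExcludePartitions=debug,test
# Jobs in "debug" are not reported to Moab at all;
# jobs in "test" remain visible to Moab for accounting
HidePartitionJobs=debug
</pre>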
<p><b>HostFormat</b> controls the format of job task lists built
by SLURM and reported to Moab.
The default value is "0", for which each host name is listed
individually, once per processor (e.g. "tux0:tux0:tux1:tux1:...").
A value of "1" uses SLURM hostlist expressions with processor
counts (e.g. "tux[0-16]*2").
The value "1" is currently experimental.</p>
<p><b>JobAggregationTime</b> is used to avoid notifying Moab
of large numbers of events occurring about the same time.
If an event occurs within this number of seconds since Moab was
last notified of an event, another notification is not sent.
This should be an integer number of seconds.
The default value is 10 seconds.
The value should match <i>JOBAGGREGATIONTIME</i> configured
in the <i>moab.cfg</i> file.</p>
<p><b>JobPriority</b> controls the scheduling of newly arriving
jobs in SLURM.
SLURM can either place all newly arriving jobs in a HELD state
(priority = 0) and let Moab decide when and where to run the jobs
or SLURM can control when and where to run jobs.
In the latter case, Moab can modify the priorities of pending jobs
to re-order the job queue or just monitor system state.
Possible values are "hold" and "run" with "hold" being the default.</p>
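<p>As a sketch of the second mode described above, the following
<i>wiki.conf</i> fragment leaves the decision of when and where to run
jobs with SLURM, so that Moab only monitors system state or adjusts the
priorities of pending jobs.
Omitting the keyword, or setting it to "hold", gives the default
behavior in which Moab decides when and where each job runs.</p>
<pre>
# New jobs are not held; SLURM controls when and where they run
JobPriority=run
</pre>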
<p>Here is a sample <i>wiki.conf</i> file:
<pre>
# wiki.conf
# SLURM's wiki plugin configuration file
#
# Matches KEY in moab-private.cfg
AuthKey=123456789
#
# SLURM to directly schedule "debug" partition
# and hide the jobs from Moab
ExcludePartitions=debug
HidePartitionJobs=debug
#
# Have Moab control job scheduling
JobPriority=hold
#
# Moab event notification port, matches EPORT in moab.cfg
EPort=15017
# Moab event notification host, where the Moab daemon runs
#EHost=tux0
#
# Moab event notification throttle,
# matches JOBAGGREGATIONTIME in moab.cfg (seconds)
JobAggregationTime=15
</pre>
</p>
<h3>Moab Configuration</h3>
<p>Moab has support for SLURM's WIKI interface by default.
Specify this interface in the <i>moab.cfg</i> file as follows:</p>
<pre>
SCHEDCFG[base] MODE=NORMAL
RMCFG[slurm] TYPE=WIKI:SLURM AUTHTYPE=CHECKSUM
</pre>
<p>In <i>moab-private.cfg</i> specify the private key as follows:</p>
<pre>
CLIENTCFG[RM:slurm] KEY=123456789
</pre>
<p>Ensure that this file is protected from viewing by users.</p>
<p class="footer"><a href="#top">top</a></p>
<p style="text-align:center;">Last modified 17 August 2007</p>
<!--#include virtual="footer.txt"-->