blob: 9edcf7a9f5fd11d8d8873802a58931961051bed8 [file] [log] [blame] [edit]
<!--#include virtual="header.txt"-->
<h1>SLURM Administrator Guide for Sun Constellation systems</h1>
<h2>Overview</h2>
<p>This document describes the unique features of SLURM on
Sun Constellation computers.
You should be familiar with the SLURM's mode of operation on Linux clusters
before studying the relatively few differences in Sun Constellation system
operation described in this document.</p>
<p>SLURM's primary mode of operation is designed for use on clusters with
nodes configured in a one-dimensional space.
A topology plugin was developed to optimize resource allocations in
three dimensions.
Changes were also made for hostlist parsing to support hostnames
of an appropriate format.</p>
<h2>Configuration</h2>
<p>Two variables must be defined in the <i>config.h</i> file:
<i>HAVE_SUN_CONST</i> and <i>SYSTEM_DIMENSIONS=4</i>
(more on that value later).
This can be accomplished in several different ways depending upon how
SLURM is being built.
<ol>
<li>Execute the <i>configure</i> command with the option
<i>--enable-sun-const</i> <b>OR</b></li>
<li>Execute the <i>rpmbuild</i> command with the option
<i>--with sun_const</i> <b>OR</b></li>
<li>Add <i>%with_sun_const 1</i> to your <i>~/.rpmmacros</i> file.</li>
</ol></p>
<p>Node names must have a four-digit suffix describing identifying their
location (this is why SYSTEM_DIMENSIONS is configured to be 4).
The first three digits specify the node's zero-origin position in the
X-, Y- and Z-dimension respectively.
This is followed by one digit specifying the node's sequence number
at that coordinate (e.g. "tux0123" for X=0, Y=1, Z=2, sequence_number=3;
"tux1234" for X=1, Y=2, Z=3, sequence_number=4).
The coordinate location should be zero-origin (starting at X=0, Y=0, Z=0).
The sequence number should also start at zero and can include upper
case letters for higher values, for up to 36 nodes at a specific coordinate
(e.g. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, ... Z).
To avoid confusion, we recommend that the node name prefix consist of
lower case letters.
Numerically sequential node names may specified by in SLURM
commands and configuration files using the system name prefix with the
end-points enclosed in square brackets and separated by an "-".
For example "tux[0000-000B]" is used to represent the twelve nodes
with sequence numbers from 0 to B, all at coordinate X=0, Y=0 and Z=0.
Alternately, rectangular prisms of node names can be specified using the
system name prefix with the end-points enclosed in square brackets and
separated by an "x".
For example "tux[0000x0111]" is used to represent the eight nodes in a
block with endpoints at "tux0000" and "tux0111" (tux0000, tux0001, tux0010,
tux0011, tux0100, tux0101, tux0110 and tux0111).
Viewed another way, these eight nodes have sequence numbers 0 or 1
and have four distinct coordinates (000, 001, 010 and 011).
While node names of this form are required for SLURM's internal use,
it need not be the name returned by the <i>hostlist -s</i> command.
See <i>man slurm.conf</i> for details on how to use the <i>NodeName</i>,
<i>NodeAddr</i> and <i>NodeHostName</i> configuration parameters
for flexibility in this matter.</p>
<p>Next you need to select from two options for the resource selection
plugin (the <i>SelectType</i> option in SLURM's <i>slurm.conf</i> configuration
file):
<ol>
<li><b>select/cons_res</b> - Performs a best-fit algorithm based upon a
one-dimensional space to allocate whole nodes, sockets, or cores to jobs
based upon other configuration parameters.</li>
<li><b>select/linear</b> - Performs a best-fit algorithm based upon a
one-dimensional space to allocate whole nodes to jobs.</li>
</ol>
<p>In order for <i>select/cons_res</i> or <i>select/linear</i> to
allocate resources physically nearby in four-dimensional space, the
nodes be specified in SLURM's <i>slurm.conf</i> configuration file in
such a fashion that those nearby in <i>slurm.conf</i> (managed
internal to SLURM as a one-dimensional space) are also nearby in
the physical four-dimensional space.</p>
<p>SLURM can automatically perform that conversion using a
<a href="http://en.wikipedia.org/wiki/Hilbert_curve">Hilbert curve</a>.
Set <i>TopologyPlugin=topology/3d_torus</i> in SLURM's <i>slurm.conf</i>
configuration file for nodes to be reordered appropriately.
First a three-dimensional Hilbert curve is constructed through all
coordinates in the system such that every coordinate in the order list
physically adjacent.
The node list are then re-ordered following that Hilbert curve while
maintaining the node's sequence number (i.e. not building a Hilbert
curve through that fourth dimension).
If the number of nodes at each coordinate varies, it may be necessary to
put a separate node definition line in the <i>slurm.conf</i> file.
If that is the case, put them in numeric order for the <i>topology/3d_torus</i>
plugin to function properly.<p>
<p>Alternately configure <i>TopologyPlugin=topology/none</i> and
construct your own node ordering sequence as desired in <i>slurm.conf</i>.
Note that each node must be listed exactly once and consecutive
nodes should be nearby in three-dimensional space.
The open source code used by SLURM to generate the Hilbert curve is
included in the distribution at <i>contribs/skilling.c</i> in the event
that you wish to experiment with it to generate your own node ordering.
Two examples of SLURM configuration files are shown below:</p>
<pre>
# slurm.conf for Sun Constellation system of size 4x4x4
# with eight nodes at each coordinate (512 nodes total)
# Configuration parameters removed here
# Automatic orders nodes following a Hilbert curve
NodeName=DEFAULT CPUs=8 RealMemory=2048 State=Unknown
NodeName=tux[0000x3337]
PartitionName=debug Nodes=tux[0000x3337] Default=Yes State=UP
</pre>
<pre>
# slurm.conf for Sun Constellation system of size 1x2x2
# with a different count of nodes at each coordinate
# Configuration parameters removed here
# Manual ordering of nodes following a space-filling curve
NodeName=DEFAULT CPUs=8 RealMemory=2048 State=Unknown
NodeName=tux[0000-0007] # 8 nodes at 0,0,0
NodeName=tux[0010-001B] # 12 nodes at 0,0,1
NodeName=tux[0100-0107] # 8 nodes at 0,1,0
NodeName=tux[0110-0115] # 6 nodes at 0,1,1
PartitionName=DEFAULT Default=Yes State=UP
PartitionName=debug Nodes=tux[0000-0007,0010-001B,0100-0107,0110-0115]
</pre>
<h2>Tools</h2>
<p>The node names output by the <i>scontrol show nodes</i> command
will be ordered as defined (sequentially along the Hilbert curve)
rather than in numeric order (e.g. "tux0010" may follow "tux1010" rather
than "tux0000").
The output of the <i>smap</i> and <i>sview</i> commands will also display
nodes ordered by the Hilbert curve so that nodes appearing adjacent in the
display will be physically adjacent.
This permits the locality of a job, partition or reservation to be easily
determined.
In order to locate specific nodes with the <i>sview</i> command, select
<i>Actions</i>, <i>Search</i> and <i>Node(s) Name</i> then enter the desired
node names.
The output of other SLURM commands (e.g. <i>sinfo</i> and <i>squeue</i>)
will use a SLURM hostlist expression with the node names numerically ordered).
SLURM partitions should contain nodes which are defined sequentially
by that ordering for optimal performance.</p>
<p class="footer"><a href="#top">top</a></p>
<p style="text-align:center;">Last modified 4 August 2009</p></td>
<!--#include virtual="footer.txt"-->