<!--#include virtual="header.txt"-->
<h1>SLURM: A Highly Scalable Resource Manager</h1>
<p>SLURM is an open-source resource manager designed for Linux clusters of all
sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive
access to resources (compute nodes) to users for some duration of time so they
can perform work. Second, it provides a framework for starting, executing, and
monitoring work (typically a parallel job) on a set of allocated nodes. Finally,
it arbitrates conflicting requests for resources by managing a queue of pending
work.</p>
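<p>As a small, purely illustrative sketch of the monitoring side of these functions,
the following C program uses SLURM's programming interface (declared in slurm/slurm.h)
to ask the controller for the current node table and print each node's name, CPU count,
and state. The calls shown (slurm_load_node, node_state_string, slurm_free_node_info_msg)
come from the public header, but exact types and field names may differ between SLURM
releases.</p>
<pre>
#include &lt;stdio.h&gt;
#include &lt;time.h&gt;
#include &lt;slurm/slurm.h&gt;

/* Example only: query slurmctld for the node table and report
 * each node's name, CPU count, and state.
 * Build with: gcc node_report.c -lslurm */
int main(void)
{
    node_info_msg_t *nodes = NULL;
    uint32_t i;

    if (slurm_load_node((time_t) 0, &amp;nodes, SHOW_ALL) != SLURM_SUCCESS) {
        slurm_perror("slurm_load_node");
        return 1;
    }

    for (i = 0; i &lt; nodes-&gt;record_count; i++) {
        node_info_t *node = &amp;nodes-&gt;node_array[i];
        printf("%s: %u CPUs, state %s\n", node-&gt;name,
               (unsigned) node-&gt;cpus,
               node_state_string(node-&gt;node_state));
    }

    slurm_free_node_info_msg(nodes);
    return 0;
}
</pre>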
<p>SLURM is not a sophisticated batch system, but it does provide an Application
Programming Interface (API) for integration with external schedulers such as
<a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php">
The Maui Scheduler</a> and
<a href="http://www.clusterresources.com/pages/products/moab-cluster-suite.php">
Moab Cluster Suite</a> (a brief example of this API appears after the list below).
While other resource managers do exist, SLURM is unique in several respects:
<ul>
<li>Its source code is freely available under the
<a href="http://www.gnu.org/licenses/gpl.html">GNU General Public License</a>.</li>
<li>It is designed to operate in a heterogeneous cluster with up to 65,536 nodes.</li>
<li>It is portable: written in C with a GNU autoconf configuration engine. While
initially written for Linux, other UNIX-like operating systems should be easy
porting targets. A plugin mechanism exists to support various interconnects, authentication
mechanisms, schedulers, etc.</li>
<li>SLURM is highly tolerant of system failures, including failure of the node
executing its control functions.</li>
<li>It is simple enough for the motivated end user to understand its source and
add functionality.</li>
</ul></p>
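<p>As a purely illustrative example of that scheduler API, the short C program below
uses slurm_load_jobs() from slurm/slurm.h to fetch the job queue and list the jobs
still waiting for resources, which is essentially the first step an external scheduler
performs before deciding what to start next. The calls and field names follow the
public header; exact flags and types may vary between SLURM releases.</p>
<pre>
#include &lt;stdio.h&gt;
#include &lt;time.h&gt;
#include &lt;slurm/slurm.h&gt;

/* Example only: fetch the job table from slurmctld and list
 * jobs that are still pending.
 * Build with: gcc pending_jobs.c -lslurm */
int main(void)
{
    job_info_msg_t *jobs = NULL;
    uint32_t i;

    if (slurm_load_jobs((time_t) 0, &amp;jobs, SHOW_ALL) != SLURM_SUCCESS) {
        slurm_perror("slurm_load_jobs");
        return 1;
    }

    for (i = 0; i &lt; jobs-&gt;record_count; i++) {
        job_info_t *job = &amp;jobs-&gt;job_array[i];
        if (job-&gt;job_state == JOB_PENDING)
            printf("job %u (%s) pending in partition %s\n",
                   job-&gt;job_id, job-&gt;name, job-&gt;partition);
    }

    slurm_free_job_info_msg(jobs);
    return 0;
}
</pre>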
<p>SLURM provides resource management on about 1000 computers worldwide,
including many of the most powerful computers in the world:
<ul>
<li><a href="http://www.llnl.gov/asc/computing_resources/bluegenel/bluegene_home.html">BlueGene/L</a>
at LLNL with 65,536 dual-processor compute nodes</li>
<li><a href="http://www.llnl.gov/asc/computing_resources/purple/purple_index.html">ASC Purple</a>
an IBM SP/AIX cluster at LLNL with 12,208 Power5 processors and a Federation switch</li>
<li><a href="http://www.bsc.es/plantillaA.php?cat_id=5">MareNostrum</a>
a Linux cluster at Barcelona Supercomputer Center
with 10,240 PowerPC processors and a Myrinet switch</li>
<li>Peloton with 1,152 nodes each having four sockets with dual-core Opteron processors and an InfiniBand switch</li>
<li>An <a href="http://hpc.uky.edu/">IBM HPC Server</a> at the University of Kentucky.
This is a heterogeneous cluster with 128 Power5+ processors and
340 HS21 Blades each with dual-socket and dual-core Intel Woodcrest processors
for a total of 1,488 cores connected with Infiniband switch</li>
</ul></p>
<p>There are about 200 downloads of SLURM per month from LLNL's FTP server
and <a href="https://sourceforge.net">SourceForge.net</a>.
As of March 2007, SLURM has been downloaded over 5000 times to over 500
distinct sites in 41 countries.
SLURM is also actively being developed, distributed and supported by
<a href="http://www.hp.com">Hewlett-Packard</a> and
<a href="http://www.bull.com">Bull</a>.</p>
<p style="text-align:center;">Last modified 4 June 2007</p>
<!--#include virtual="footer.txt"-->