<!--#include virtual="header.txt"-->

<h1>SLURM: A Highly Scalable Resource Manager</h1>
<p>SLURM is an open-source resource manager designed for Linux clusters of all
sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive
access to resources (compute nodes) to users for some duration of time so they
can perform work. Second, it provides a framework for starting, executing, and
monitoring work (typically a parallel job) on the set of allocated nodes. Finally,
it arbitrates conflicting requests for resources by managing a queue of pending
work.</p>
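<p>As a concrete illustration of the first of these functions, the following C
sketch asks SLURM for a one-node allocation through its C API (libslurm). It is
only an outline, not code taken from SLURM's own documentation: the job name and
limits are arbitrary placeholders, and error handling is abbreviated.</p>
<pre>
/*
 * Sketch: request a one-node allocation through the SLURM API.
 * Assumes libslurm and its headers are installed; build with: gcc alloc.c -lslurm
 */
#include <stdio.h>
#include <unistd.h>
#include <slurm/slurm.h>
#include <slurm/slurm_errno.h>

int main(void)
{
    job_desc_msg_t job_req;
    resource_allocation_response_msg_t *alloc = NULL;

    slurm_init_job_desc_msg(&job_req);  /* start from default values */
    job_req.name       = "example";     /* arbitrary job name */
    job_req.min_nodes  = 1;             /* ask for one node */
    job_req.time_limit = 5;             /* minutes */
    job_req.user_id    = getuid();
    job_req.group_id   = getgid();

    /* The controller grants the request or queues it as pending work. */
    if (slurm_allocate_resources(&job_req, &alloc) != SLURM_SUCCESS) {
        slurm_perror("slurm_allocate_resources");
        return 1;
    }
    printf("job %u: nodes %s\n", alloc->job_id,
           alloc->node_list ? alloc->node_list : "(pending)");
    slurm_free_resource_allocation_response_msg(alloc);
    return 0;
}
</pre>
<p>Work can then be launched and monitored on the granted nodes (the second
function), while requests that cannot be satisfied immediately wait in the queue
(the third).</p>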
<p>SLURM is not a sophisticated batch system, but it does provide an Application
Programming Interface (API) for integration with external schedulers such as
<a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php">
The Maui Scheduler</a> and
<a href="http://www.clusterresources.com/pages/products/moab-cluster-suite.php">
Moab Cluster Suite</a>; a short example of this API appears after the list below.
While other resource managers do exist, SLURM is unique in several respects:
<ul>
<li>Its source code is freely available under the
<a href="http://www.gnu.org/licenses/gpl.html">GNU General Public License</a>.</li>
<li>It is designed to operate in a heterogeneous cluster with up to 65,536 nodes.</li>
<li>It is portable; written in C with a GNU autoconf configuration engine. While
initially written for Linux, other UNIX-like operating systems should be easy
porting targets. A plugin mechanism exists to support various interconnects, authentication
mechanisms, schedulers, etc.</li>
<li>SLURM is highly tolerant of system failures, including failure of the node
executing its control functions.</li>
<li>It is simple enough for the motivated end user to understand its source and
add functionality.</li>
</ul></p>
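<p>As an illustration, an external scheduler (or any other client) can read
SLURM's job queue through the C API mentioned above. The sketch below is
illustrative only; it assumes a running SLURM controller and an installed
libslurm, and it simply lists the jobs that are still pending.</p>
<pre>
/*
 * Sketch: read the job queue, as an external scheduler might.
 * Assumes libslurm and its headers are installed; build with: gcc queue.c -lslurm
 */
#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <slurm/slurm.h>
#include <slurm/slurm_errno.h>

int main(void)
{
    job_info_msg_t *jobs = NULL;
    uint32_t i;

    /* Fetch the current job table from the slurmctld controller. */
    if (slurm_load_jobs((time_t) 0, &jobs, SHOW_ALL) != SLURM_SUCCESS) {
        slurm_perror("slurm_load_jobs");
        return 1;
    }

    for (i = 0; i < jobs->record_count; i++) {
        job_info_t *job = &jobs->job_array[i];
        if (job->job_state == JOB_PENDING)      /* still waiting for resources */
            printf("job %u (%s) is pending\n", job->job_id,
                   job->name ? job->name : "");
    }

    slurm_free_job_info_msg(jobs);
    return 0;
}
</pre>
<p>A full scheduler would go on to order this pending work according to site
policy and tell SLURM which jobs to start; the sketch shows only the first step
of reading the queue.</p>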
<p>SLURM provides resource management on about 1,000 computers worldwide,
including many of the world's most powerful systems:
<ul>
<li><a href="http://www.llnl.gov/asc/computing_resources/bluegenel/bluegene_home.html">BlueGene/L</a>
at LLNL with 65,536 dual-processor compute nodes</li>
<li><a href="http://www.llnl.gov/asc/computing_resources/purple/purple_index.html">ASC Purple</a>,
an IBM SP/AIX cluster at LLNL with 12,208 Power5 processors and a Federation switch</li>
<li><a href="http://www.bsc.es/plantillaA.php?cat_id=5">MareNostrum</a>,
a Linux cluster at the Barcelona Supercomputing Center
with 10,240 PowerPC processors and a Myrinet switch</li>
<li>Peloton, with 1,152 nodes, each having four sockets of dual-core Opteron processors, and an InfiniBand switch</li>
<li>An <a href="http://hpc.uky.edu/">IBM HPC Server</a> at the University of Kentucky.
This is a heterogeneous cluster with 128 Power5+ processors and
340 HS21 blades, each with dual-socket, dual-core Intel Woodcrest processors,
for a total of 1,488 cores, connected by an InfiniBand switch</li>
</ul></p>
<p>There are about 200 downloads of SLURM per month from LLNL's FTP server
and <a href="https://sourceforge.net">SourceForge.net</a>.
As of March 2007, SLURM had been downloaded over 5,000 times by more than 500
distinct sites in 41 countries.
SLURM is also actively being developed, distributed and supported by
<a href="http://www.hp.com">Hewlett-Packard</a> and
<a href="http://www.bull.com">Bull</a>.</p>
<p style="text-align:center;">Last modified 4 June 2007</p>

<!--#include virtual="footer.txt"-->