| \section{Conclusion and Future Plans} |
| |
| We have presented in this paper an overview of SLURM, a simple, highly scalable, robust, |
| and portable cluster resource management system. |
| The contribution of this work is an immediately available, |
| open-source tool that virtually anybody can use to efficiently manage clusters of |
| different sizes and architectures. |
| %We expect SLURM to begin production use on LLNL Linux clusters |
| %starting in March 2003 and be available for distribution shortly |
| %thereafter. |
| |
| Looking ahead, we anticipate adding support for additional |
| operating systems. |
| % (IA64 and x86-64) and interconnects (InfiniBand |
| %and the IBM Blue Gene\cite{BlueGene2002} system\footnote{Blue Gene |
| %has a different interconnect than any supported by SLURM and |
| %a 3-D topography with restrictive allocation constraints.}). |
| We also anticipate adding a checkpoint/restart capability and a job |
| preempt/resume capability; the latter will provide an external scheduler |
| with the infrastructure required to perform gang scheduling. |
| We also plan to use SLURM on IBM's Blue Gene/L platform~\cite{BGL} by incorporating into SLURM a capability |
| to manage jobs on a three-dimensional torus machine. |