| LLNL CHAOS-SPECIFIC RELEASE NOTES FOR SLURM VERSION 2.2 |
| 1 December 2010 |
| |
| This lists only the most significant changes from SLURM v2.1 to v2.2 |
| with respect to Chaos systems. See the file RELEASE_NOTES for a more |
| complete description of changes. |
| |
| Mostly for system administrators: |
| |
| * SLURM version 2.2 is able to read version 2.1 state files and preserve all |
| running and pending state. SLURM version 2.1 is *not* able to use state save |
| files generated by version 2.2, so this is a non-reversible transition. |
| |
| * Added new configuration parameter JobSubmitPlugins which provides a mechanism |
| to set default job parameters or perform other site-configurable actions at |
| job submit time. Site-specific job submission plugins may be written in |
| either C or Lua. |
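As a hedged illustration, enabling a Lua job submission plugin would look roughly like the slurm.conf fragment below. The plugin name "lua" is assumed to correspond to the job_submit/lua plugin distributed with SLURM; the site-specific script itself is not shown here.

```
# slurm.conf (fragment); assumes the job_submit/lua plugin is installed
JobSubmitPlugins=lua
```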
| |
| * We have given Operators, Administrators, and bank account Coordinators (as |
| defined in the SLURM database) the ability to invoke commands that view/modify |
| user jobs and reservations. Previously, one had to be root to invoke |
| "scontrol update JobId" for example. In addition, Administrators have the |
| ability to view/modify node and partition info without having to become root. |
| For more details, see the AUTHORIZATION section of the man pages for the |
| following commands: scontrol, scancel and sbcast. |
| |
| Mostly for users: |
| |
| * Job submission commands (salloc, sbatch and srun) have a new option, |
| --time-min, that sets a minimum acceptable time limit. The job's time limit |
| may be reduced, but never below this minimum, to the extent required to |
| start early through backfill scheduling. |
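A minimal sketch of how --time-min might be used, wrapped in a shell function so the request is explicit. The function name submit_flexible, the 120- and 30-minute values, and the script name in the usage note are illustrative assumptions, not part of these notes.

```shell
# Submit a batch script with a preferred time limit of 120 minutes while
# accepting as little as 30 minutes if that lets the job start earlier
# through backfill scheduling.
# submit_flexible is a hypothetical helper name; $1 is the batch script.
submit_flexible() {
    sbatch --time=120 --time-min=30 "$1"
}
```

For example, "submit_flexible my_job.sh" (a hypothetical script) requests 120 minutes but permits the scheduler to grant any limit down to 30 minutes.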
| |
| * Support has been added for TotalView to attach to a subset of launched tasks |
| instead of requiring that all tasks be attached to. |
| |
| * scontrol now has the ability to shrink a job's size. Use a command of |
| "scontrol update JobId=# NumNodes=#" or |
| "scontrol update JobId=# NodeList=<names>". This command generates a script |
| to be executed in order to reset SLURM environment variables for proper |
| execution of subsequent job steps. |
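The shrink-and-reset sequence described above could be sketched as the following shell function, intended for use inside a batch script. The function name shrink_job is hypothetical, and the generated script name (slurm_job_<jobid>_resize.sh in the current working directory) is an assumption based on the SLURM FAQ; confirm it against what scontrol reports on your system.

```shell
# Shrink the current job to the given node count, then source the script
# that scontrol generates so subsequent job steps see the new allocation.
# shrink_job is a hypothetical helper name; $1 is the new node count.
shrink_job() {
    new_nodes="$1"
    scontrol update JobId="$SLURM_JOB_ID" NumNodes="$new_nodes" || return 1
    # Assumption: the generated script is named slurm_job_<jobid>_resize.sh
    # and is written to the current working directory.
    . "./slurm_job_${SLURM_JOB_ID}_resize.sh"
}
```

The script is sourced rather than executed so that the updated SLURM environment variables take effect in the calling shell.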
| |
| * Users can hold and release their own jobs. Submit a job in the held state |
| using the --hold or -H option of srun or sbatch. Hold a job after submission |
| using the command "scontrol hold <jobid>". Release it with |
| "scontrol release <jobid>". Users cannot release jobs held by a system |
| administrator. |
| |
| * Added support for a default account and wckey per cluster within accounting. |
| |
| * SLURM commands (squeue, sinfo, sbatch, etc.) can now operate between |
| clusters. Jobs can also be submitted with sbatch to other cluster(s) with the |
| job routed to the one cluster expected to initiate the job first. This |
| functionality relies upon the SlurmDBD (SLURM DataBase Daemon) to provide |
| communication information (address and port) for a command to locate the |
| SLURM control daemon (slurmctld) on other clusters. |
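A brief sketch of multi-cluster use follows. These notes do not name the option; the example assumes the --clusters (-M) option documented in the SLURM man pages, and the helper names and cluster names are hypothetical.

```shell
# Submit a batch script to whichever of the listed clusters is expected to
# start it first. $1 is a comma-separated cluster list; the remaining
# arguments are passed to sbatch unchanged.
submit_anywhere() {
    clusters="$1"
    shift
    sbatch --clusters="$clusters" "$@"
}

# Show the job queue of one or more other clusters.
queue_on() {
    squeue --clusters="$1"
}
```

For example, "submit_anywhere alpha,beta my_job.sh" would offer the job to the hypothetical clusters alpha and beta, and "queue_on alpha" would list the jobs on alpha. Both rely on SlurmDBD being reachable.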