Prevent invalid read if task count is more than arbitrary nodelist

If the job is using arbitrary distribution and the number of node names in
the arbitrary node list is less than the task count, there will be an
invalid memory read in _at_tpn_limit() when laying out the tasks. The task
count needs to be equal to the number of node names in the nodelist. If the
task count is not equal, return an ESLURM_BAD_TASK_COUNT error.
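
The check described above can be sketched roughly as follows. This is an
illustration only, not the code from this commit: count_node_names(),
check_arbitrary_task_count(), and the SKETCH_* return values are hypothetical
names, and the node list is assumed to be a plain comma-separated string with
no bracketed ranges.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real ESLURM_BAD_TASK_COUNT is defined in slurm_errno.h. */
#define SKETCH_SUCCESS 0
#define SKETCH_BAD_TASK_COUNT 1

/*
 * Count the names in a comma-separated node list.
 * Assumption: no bracketed host ranges; repeated names count once each time
 * they appear, since arbitrary distribution lists one name per task.
 */
static uint32_t count_node_names(const char *nodelist)
{
	uint32_t count;
	const char *p;

	if (!nodelist || !*nodelist)
		return 0;

	count = 1;
	for (p = nodelist; *p; p++) {
		if (*p == ',')
			count++;
	}
	return count;
}

/* Reject the request unless the task count equals the number of node names. */
static int check_arbitrary_task_count(const char *nodelist, uint32_t num_tasks)
{
	if (count_node_names(nodelist) != num_tasks)
		return SKETCH_BAD_TASK_COUNT;
	return SKETCH_SUCCESS;
}
```

Rejecting both too few and too many names keeps the per-node task array the
same length the layout code expects, which is what avoids the out-of-bounds
read.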

This also has to prevent the loss of all jobs when upgrading to 25.11 if
any job would hit this error. Because the restriction is new, jobs being
loaded from versions 25.05 and lower will still be allowed regardless of
the task count. However, arbitrary_tasks_np will be reallocated to prevent
an invalid read from happening.

Changelog: slurmctld - Prevent an invalid read and a possible crash by
 rejecting any arbitrary distribution jobs that do not specify a task count
 equal to the number of node names in their node list. This does not affect
 srun, salloc, or sbatch if -n is not used since they set the default task
 count.
Ticket: 21444
3 files changed
tree: ca173653d8e2defc0938a95278fb876c60616a20
README.md

Slurm Workload Manager

This is the Slurm Workload Manager. Slurm is an open-source cluster resource management and job scheduling system that strives to be simple, scalable, portable, fault-tolerant, and interconnect agnostic. Slurm has currently been tested only under Linux.

As a cluster resource manager, Slurm provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates conflicting requests for resources by managing a queue of pending work.

NOTES FOR GITHUB DEVELOPERS

The official issue tracker for Slurm is at

: https://support.schedmd.com/

We welcome code contributions and patches, but we do not accept Pull Requests through GitHub at this time. Please submit patches as attachments to new issues under the "C - Contributions" severity level.

SOURCE DISTRIBUTION HIERARCHY

The top-level distribution directory contains this README as well as other high-level documentation files, and the scripts used to configure and build Slurm (see INSTALL). Subdirectories contain the source-code for Slurm as well as a test suite and further documentation. A quick description of the subdirectories of the Slurm distribution follows:

src/ [ Slurm source ]

: Slurm source code is further organized into self-explanatory subdirectories such as src/api, src/slurmctld, etc.

doc/ [ Slurm documentation ]

: The documentation directory contains some LaTeX, HTML, and ASCII text papers, READMEs, and guides. Manual pages for the Slurm commands and configuration files are also under the doc/ directory.

etc/ [ Slurm configuration ]

: The etc/ directory contains a sample config file, as well as some scripts useful for running Slurm.

slurm/ [ Slurm include files ]

: This directory contains installed include files, such as slurm.h and slurm_errno.h, needed for compiling against the Slurm API.

testsuite/ [ Slurm test suite ]

: The testsuite directory contains an extensive collection of tests written for Check, Expect and Pytest.

auxdir/ [ autotools directory ]

: Directory for autotools scripts and files used to configure and build Slurm.

contribs/ [ helpful tools outside of Slurm proper ]

: Directory for anything that is outside of Slurm proper, such as a different API. To have this built, you need to do a make contrib/install-contrib.

COMPILING AND INSTALLING THE DISTRIBUTION

Please see the instructions at

: https://slurm.schedmd.com/quickstart_admin.html

Extensive documentation is available from our home page at

: https://slurm.schedmd.com/slurm.html

LEGAL

Slurm is provided "as is" and with no warranty. This software is distributed under the GNU General Public License. Please see the files COPYING, DISCLAIMER, and LICENSE.OpenSSL for details.