slurmctld - Prevent reserved gres from being used by the wrong jobs

Previously, the gres in a reservation could be used by a job that did not
request the reservation. Reservations reserve specific gres bits, but the
sock_gres->bits_any_sock bitmap still included all gres bits, even those
allocated to reservations. For the same reason, a job requesting the
reservation could be allocated a gres bit that was not reserved.

This modifies _handle_gres_exc_bit_and_not(), now renamed to
_handle_gres_exc_bit_restrict(), so that in the resv_exc_ptr->gres_js_inc
case the bits_by_sock bitmap is restricted to only the gres bits in the
reservation. The function is now also called on sock_gres->bits_any_sock
when no core topology is defined, which can happen for gres with a type
defined.
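
The two cases described above can be illustrated with a minimal sketch.
This is not the code from the commit: gres_bits_t and restrict_gres_bits()
are hypothetical names, and a plain 64-bit mask stands in for Slurm's
bitmaps. It only shows the intended set logic, assuming a job either
requested the reservation or did not.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a gres bitmap: bit i set means gres index i
 * is usable. Slurm's real code operates on its own bitmap type instead.
 */
typedef uint64_t gres_bits_t;

/*
 * Restrict the candidate gres bits for a job (illustrative only).
 * avail:       gres bits present on the node or socket
 * resv:        gres bits held by a reservation
 * job_in_resv: true if the job requested that reservation
 */
static gres_bits_t restrict_gres_bits(gres_bits_t avail, gres_bits_t resv,
				      bool job_in_resv)
{
	if (job_in_resv)
		return avail & resv;	/* keep only the reserved bits */
	return avail & ~resv;		/* drop the reserved bits */
}
```

With avail = 0xF and resv = 0x4, a job in the reservation is left with
0x4, and any other job is left with 0xB.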

Ticket: 21862
Changelog: slurmctld - Fix how gres with cores or a type defined are
 selected to prevent jobs not using reservations from being allocated
 reserved gres and vice versa.
1 file changed
tree: 420ccf65ad0eac1cc92c5ea44250bf785446dc6c
  1. auxdir/
  2. CHANGELOG/
  3. contribs/
  4. debian/
  5. doc/
  6. etc/
  7. slurm/
  8. src/
  9. testsuite/
  10. tools/
  11. .gitignore
  12. .pre-commit-config.yaml
  13. aclocal.m4
  14. AUTHORS
  15. CHANGELOG.md
  16. config.h.in
  17. configure
  18. configure.ac
  19. CONTRIBUTING.md
  20. COPYING
  21. DISCLAIMER
  22. INSTALL
  23. LICENSE.OpenSSL
  24. make_ref.include
  25. Makefile.am
  26. Makefile.in
  27. META
  28. README.md
  29. RELEASE_NOTES.md
  30. SECURITY.md
  31. slurm.spec
README.md

Slurm Workload Manager

This is the Slurm Workload Manager. Slurm is an open-source cluster resource management and job scheduling system that strives to be simple, scalable, portable, fault-tolerant, and interconnect agnostic. Slurm has currently been tested only under Linux.

As a cluster resource manager, Slurm provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates conflicting requests for resources by managing a queue of pending work.

NOTES FOR GITHUB DEVELOPERS

The official issue tracker for Slurm is at

: https://support.schedmd.com/

We welcome code contributions and patches, but we do not accept Pull Requests through GitHub at this time. Please submit patches as attachments to new issues under the "C - Contributions" severity level.

SOURCE DISTRIBUTION HIERARCHY

The top-level distribution directory contains this README as well as other high-level documentation files, and the scripts used to configure and build Slurm (see INSTALL). Subdirectories contain the source code for Slurm as well as a test suite and further documentation. A quick description of the subdirectories of the Slurm distribution follows:

src/ [ Slurm source ]

: Slurm source code is further organized into self-explanatory subdirectories such as src/api, src/slurmctld, etc.

doc/ [ Slurm documentation ]

: The documentation directory contains some LaTeX, HTML, and ASCII text papers, READMEs, and guides. Manual pages for the Slurm commands and configuration files are also under the doc/ directory.

etc/ [ Slurm configuration ]

: The etc/ directory contains a sample config file, as well as some scripts useful for running Slurm.

slurm/ [ Slurm include files ]

: This directory contains installed include files, such as slurm.h and slurm_errno.h, needed for compiling against the Slurm API.

testsuite/ [ Slurm test suite ]

: The testsuite directory contains an extensive collection of tests written for Check, Expect and Pytest.

auxdir/ [ autotools directory ]

: Directory for autotools scripts and files used to configure and build Slurm.

contribs/ [ helpful tools outside of Slurm proper ]

: Directory for anything that is outside of Slurm proper, such as a different API. To have this built, you need to run make contrib/install-contrib.

COMPILING AND INSTALLING THE DISTRIBUTION

Please see the instructions at

: https://slurm.schedmd.com/quickstart_admin.html

Extensive documentation is available from our home page at

: https://slurm.schedmd.com/slurm.html

LEGAL

Slurm is provided "as is" and with no warranty. This software is distributed under the GNU General Public License; please see the files COPYING, DISCLAIMER, and LICENSE.OpenSSL for details.