blob: db7d4bad7aa061349a363db91018fd5b7c08d5c8 [file] [log] [blame]
<!--#include virtual="header.txt"-->
<h1>REST API Details</h1>
<p>Slurm provides a
<a href="https://en.wikipedia.org/wiki/Representational_state_transfer">
REST API
</a> daemon named slurmrestd. This daemon is designed to allow clients to
communicate with Slurm via a REST API (in addition to the command line
interface (CLI) or C API).
</p>
<p>See also:
<ul>
<li><a href="rest_quickstart.html">REST API Quick Start Guide</a>
<ul>
<li><a href="rest_quickstart.html#common_issues">Common Issues</a></li>
</ul></li>
<li><a href="rest_api.html">REST API Methods and Models</a></li>
<li><a href="openapi_release_notes.html">OpenAPI Plugin Release Notes</a></li>
<li><a href="rest_clients.html">REST API Client Guide</a></li>
</ul>
</p>
<h2 id="contents">Contents<a class="slurm_link" href="#contents"></a></h2>
<ul>
<li><a href="#stateless">Stateless</a></li>
<li><a href="#run_modes">Run modes</a>
<ul>
<li><a href="#inet">Inet Service Mode</a>
<li><a href="#listen">Listening Mode</a>
</ul>
</li>
<li><a href="#config">Configuration</a></li>
<li><a href="#plugins">Plugins</a></li>
<li><a href="#high_availability">High Availability</a></li>
<li><a href="#security">Security</a>
<ul>
<li><a href="#jwt">JSON Web Token (JWT) Authentication</a></li>
<li><a href="#local_auth">Local Authentication</a></li>
<li><a href="#auth_proxy">Authenticating Proxy</a></li>
</ul>
</li>
</ul>
<h2 id="stateless">Stateless<a class="slurm_link" href="#stateless"></a></h2>
<p>Slurmrestd is stateless as it does not cache or save any state between
requests. Each request is handled in a thread and then all of that state is
discarded. Any request to slurmrestd is completely synchronous with the
Slurm controller (slurmctld or slurmdbd) and is only considered complete once
the HTTP response code has been sent to the client. Slurmrestd will hold a
client connection open while processing a request. Slurm database commands are
committed at the end of every request, on the success of all API calls in the
request.</p>
<p>Sites are strongly encouraged to setup a caching proxy between slurmrestd
and clients to avoid having clients repeatedly call queries, causing usage to
be higher than needed (and causing lock contention) on the controller.</p>
<h2 id="run_modes">Run modes<a class="slurm_link" href="#run_modes"></a></h2>
<p>Slurmrestd currently supports two run modes: inet service mode and listening
mode.</p>
<h3 id="inet">Inet Service Mode<a class="slurm_link" href="#inet"></a></h3>
<p>The Slurmrestd daemon acts as an
<a href="https://en.wikipedia.org/wiki/Inetd">
Inet Service
</a> treating STDIN and STDOUT as the client. This mode allows clients to use
inetd, xinetd, or systemd socket activated services and avoid the need to run a
daemon on the host at all times. This mode creates an instance for each client
and does not support reusing the same instance for different clients.</p>
<h3 id="listen">Listening Mode<a class="slurm_link" href="#listen"></a></h3>
<p>The Slurmrestd daemon acts as a full UNIX service and continuously listens
for new TCP connections. Each connection and request are independently
authenticated.</p>
<h2 id="config">Configuration<a class="slurm_link" href="#config"></a></h2>
<p>slurmrestd can be configured either by environment variables or command line
arguments. Please see the <b>doc/man/man1/slurmrestd.8</b> man page and
<a href="rest_quickstart.html#customization">REST API Quick Start Guide</a>
for details.</p>
<h2 id="plugins">Plugins<a class="slurm_link" href="#plugins"></a></h2>
<p>As of Slurm 20.11, the REST API uses plugins for authentication and
generating content. As of Slurm-21.08, the OpenAPI plugins are available
outside of slurmrestd daemon and other slurm commands may provide or accept the
latest version of the OpenAPI formatted output. This functionality is provided
on a per command basis. These plugins can be optionally listed or selected via
command line arguments as described in the <a href="slurmrestd.html">
slurmrestd</a> documentation.</p>
<h3 id="lifecycle">Plugin life cycle
<a class="slurm_link" href="#lifecycle"></a>
</h3>
<p>Plugins provided upstream with Slurm in any release are considered supported
by that release. In general, only the latest versioned plugins will receive
minor bug fixes but all will receive security fixes. Due to the nature of the
plugins, one plugin can be supported across multiple Slurm releases to ensure
(limited) backward client compatibility. SchedMD is currently only explicitly
deprecating plugins per the
<a href="https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.1.0.md?plain=1#L861">
OpenAPI standard method of setting "Deprecated: True"</a>
in the method path. Any plugin that is not marked as deprecated will
continue to exist in the next Slurm Major release pending any critical issues
which will be announced separately. Any plugin marked for deprecation will be
removed on the next Slurm major release. In general, the last three plugins
have been retained in each Slurm major release with the oldest getting marked
as deprecated.
</p>
<p>Sites are <b>always</b> encouraged to use the latest stable plugin version
available for code development. There is <b>no</b> guarantee of compatibility
between different versions of the same plugin with clients. Clients should
always make sure to validate their code when migrating to newer versions of
plugins. Plugin versions will always be included in the path for every method
provided by a given plugin to ensure no two plugins will provide the same
path.</p>
<p>As the plugins utilize the Slurm API internally, plugins that existed in
previous versions of Slurm should continue to be able to communicate with the
two previous versions of Slurm, similar to other components of Slurm. Newer
plugins may have required RPC changes which will exclude them from working with
previous Slurm versions. For instance, the openapi/dbv0.0.36 plugin will not be
able to communicate with any slurmdbd servers prior to the slurm-20.11
release.</p>
<p>As with all other plugins in Slurm, sites are more than welcome to write
their own plugins and are suggested to submit them as code contributions via
<a href="https://bugs.schedmd.com/">bugzilla</a>, tagged as a contribution.
The plugin API provided may change between releases which could cause site
specific plugins to break.</p>
<h2 id="high_availability">High Availability
<a class="slurm_link" href="#high_availability"></a></h2>
<p>Slurmrestd is agnostic to its deployment in a highly available cluster.
The daemon may be run on multiple nodes but does not provide any coordination
with other instances for load balancing or failover.
If such functionality is desired, a separate load balancer may be deployed.
The load balancer should be able to forward any required authentication
information on to the slurmrestd machines (see <a href="#security">Security</a>
section).</p>
<p>The number of connections allowed by the slurmrestd system(s) should also be
limited so that the slurmctld is not overwhelmed with requests. Pay attention to
the <code>-t &lt;THREAD COUNT&gt;</code> and
<code>--max-connections &lt;count&gt;</code> options to <b>slurmrestd</b>, the
number of nodes deployed, and the specs of the machine running <b>slurmctld</b>.
</p>
<h2 id="security">Security<a class="slurm_link" href="#security"></a></h2>
<p>The Slurm REST API is written to provide the necessary functionality for
clients to control Slurm using REST commands. It is <b>not</b> designed to be
directly internet facing. Only unencrypted and uncompressed HTTP communications
are supported. Slurmrestd also has no protection against man in the middle or
replay attacks. Slurmrestd should only be placed in a trusted network that will
communicate with a trusted client.</p>
<p>Any site wishing to expose Slurm REST API to the internet or outside of the
cluster should at the very least use a proxy to wrap all communications with
TLS v1.3 (or later). You should also add monitoring to reject any client who
repeatedly attempts invalid logins at either the network perimeter firewall or
at the TLS proxy. Any client filtering that can be done via a proxy is
suggested to avoid common internet crawlers from talking to slurmrestd and
wasting system resource or even causing higher latency for valid clients.
Sites are recommended to use shorter lived JWT tokens for clients and renew
often, possibly via non-Slurm JWT generator to avoid having to enforce JWT
lifespan limits. It is also suggested that sites use an authenticating proxy
to handle all client authentication against the sites preferred Single Sign
On (SSO) provider instead of Slurm <b>scontrol</b> generated tokens. This will
prevent any unauthenticated client from connecting to slurmrestd.</p>
<p>The Slurm REST API is an HTTP server and all general possible precautions
for security of any web server should be applied. As these precautions are site
specific, it is highly recommended that you work with your site's security
group to ensure all policies are enforced at the proxy before connecting to
slurmrestd.</p>
<p>Slurm tries not to give potential attackers any hints when there are
authentication failures. This results in the client getting this rather terse
message: <code>Authentication failure</code>. When this happens, take a look at
the logs for the relevant Slurm daemon (i.e. <b>slurmdbd</b>, <b>slurmctld</b>,
or <b>slurmd</b>) for information about the actual issue.</p>
<h3 id="jwt">JSON Web Token (JWT) Authentication
<a class="slurm_link" href="#jwt"></a>
</h3>
<p>slurmrestd supports using <a href=jwt.html>JWT to authenticate users</a>.
JWT can be used to authenticate user over REST protocol.
<ul>
<li>User Name Header: X-SLURM-USER-NAME</li>
<li>JWT Header: X-SLURM-USER-TOKEN</li>
</ul>
SlurmUser or root can provide alternative user names to act as a proxy for the
given user. While using JWT authentication, slurmrestd should be run as a
unique, <b>unprivileged</b> user and group. Slurmrestd should be provided an
invalid SLURM_JWT environment variable at startup to activate JWT authentication.
This will allow users to provide their own JWT tokens while authenticating to
the proxy and ensuring against any possible accidental authorizations.</p>
<p>When using JWT, it is important that <u>AuthAltTypes=auth/jwt</u> be
configured in both your slurm.conf and slurmdbd.conf for slurmrestd.</p>
<h3 id="local_auth">Local Authentication
<a class="slurm_link" href="#local_auth"></a>
</h3>
<p>slurmrestd supports using UNIX domain sockets to have the kernel
authenticate local users. By default, slurmrestd will not start as root or
SlurmUser or if the user's primary group belongs to root or SlurmUser.
Slurmrestd must be located in the Munge security domain in order to function
and communicate with Slurm in local authentication mode.
</p>
<h3 id="auth_proxy">Authenticating Proxy
<a class="slurm_link" href="#auth_proxy"></a>
</h3>
<p>There is a wide array of authentication systems that a site could choose
from, if using <a href="#jwt">JWT authentication</a> doesn't meet your
requirements. An authenticating proxy is setup with a JWT token assigned to
the SlurmUser that can then be used to proxy for any user on the cluster.
This ability is only allowed for SlurmUser and the root users, all other
tokens will only work with their locally assigned users.</p>
<p>If using a third-party authenticating proxy, it is expected that it will
provide the correct HTTP headers (<b>X-SLURM-USER-NAME</b> and
<b>X-SLURM-USER-TOKEN</b>) to slurmrestd along with the user's request.</p>
<p>Slurm places no requirements on the authenticating proxy beyond its being
HTTP 1.1 compliant and that it provides the correct HTTP headers to allow
client authentication. Slurm will explicitly trust the HTTP headers provided
and has no way to verify them (beyond the proxy's trusted token
<b>X-SLURM-USER-TOKEN</b>). Any authenticating proxy will need to follow
your site's security policies and ensure that the proxied requests come from
the correct user. These requirements are standard to any authenticated
proxy and are not Slurm specific.</p>
<p>A working trivial example can be found in an <a
href="https://gitlab.com/SchedMD/training/docker-scale-out/-/tree/master/proxy">
internal tool</a> used for testing and training. It uses
<a href="https://www.php.net/">PHP</a> and
<a href="https://www.nginx.com/">NGINX</a> to provide the authentication logic.
This example should only be used as a basic starting place as it is not suitable
for deployment in a production environment.</p>
<hr size=4 width="100%">
<p style="text-align:center;">Last modified 16 May 2025</p>
<!--#include virtual="footer.txt"-->