<!--#include virtual="header.txt"-->
<h1>Quality of Service (QOS)</h1>
<P>One can specify a Quality of Service (QOS) for each job submitted
to SLURM. The quality of service associated with a job will affect
the job in the following ways:
<ul>
<li> <a href=#priority>Job Scheduling Priority</a>
<li> <a href=#preemption>Job Preemption</a>
<li> <a href=#limits>Job Limits</a>
<li> <a href=#qos_other>Other QOS Options</a>
</ul>
<P> The QOS's are defined in the SLURM database using the <i>sacctmgr</i>
utility.
<P> Jobs request a QOS using the "--qos=" option to the
<i>sbatch</i>, <i>salloc</i>, and <i>srun</i> commands.</P>
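<P> For example, a new QOS can be created with <i>sacctmgr</i> and then
requested at submission time. The commands below are only a sketch; the
QOS name "premium" and the priority value are illustrative:</P>
<pre>
# define a new QOS in the SLURM database (run by a SLURM administrator)
$ sacctmgr add qos premium
$ sacctmgr modify qos premium set priority=100

# submit a job requesting that QOS
$ sbatch --qos=premium my_script.sh
</pre>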
<!-------------------------------------------------------------------------->
<a name=priority>
<h2>Job Scheduling Priority</h2></a>
<P> Job scheduling priority is made up of a number of factors as
described in the <a
href="priority_multifactor.html">priority/multifactor</a> plugin. One
of the factors is the QOS priority. Each QOS is defined in the SLURM
database and includes an associated priority. Jobs that request and
are permitted a QOS will incorporate the priority associated with that
QOS in the job's <a
href="priority_multifactor.html#general">multi-factor priority
calculation.</a>
<P> To enable the QOS priority component of the multi-factor priority
calculation, the "PriorityWeightQOS" configuration parameter must be
defined in the slurm.conf file and assigned an integer value greater
than zero.
<P> A job's QOS only affects its scheduling priority when the
multi-factor plugin is loaded.</P>
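<P> A minimal slurm.conf excerpt enabling the QOS priority factor might look
like the following; the weight shown is only an example:</P>
<pre>
# slurm.conf: load the multi-factor priority plugin and give the
# QOS factor a non-zero weight
PriorityType=priority/multifactor
PriorityWeightQOS=1000
</pre>
<P> The priority associated with each QOS can then be viewed with
<i>sacctmgr show qos format=name,priority</i>.</P>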
<!-------------------------------------------------------------------------->
<a name=preemption>
<h2>Job Preemption</h2></a>
<P> SLURM offers two ways for a queued job to preempt a running job,
freeing up the running job's resources and allocating them to the queued
job. See the <a href="preempt.html"> Preemption description</a> for
details.
<P> The preemption method is determined by the "PreemptType"
configuration parameter defined in slurm.conf. When the "PreemptType"
is set to "preempt/qos", a queued job's QOS will be used to determine
whether it can preempt a running job.
<P> The QOS can be assigned (using <i>sacctmgr</i>) a list of other
QOS's that it can preempt. When there is a queued job with a QOS that
is allowed to preempt a running job of another QOS, the SLURM
scheduler will preempt the running job.</P>
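<P> A sketch of a QOS-based preemption setup follows. The QOS names
"normal" and "standby" are only examples, and REQUEUE is just one of the
possible PreemptMode values (see the Preemption page):</P>
<pre>
# slurm.conf: select QOS-based preemption and how preempted jobs are handled
PreemptType=preempt/qos
PreemptMode=REQUEUE
</pre>
<pre>
# allow jobs in the "normal" QOS to preempt jobs in the "standby" QOS
$ sacctmgr modify qos normal set preempt=standby
</pre>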
<!-------------------------------------------------------------------------->
<a name=limits>
<h2>Job Limits</h2></a>
<P> Each QOS is assigned a set of limits which will be applied to the
job. The limits mirror the limits imposed by the
user/account/cluster/partition association defined in the SLURM
database and described in the <a href="resource_limits.html"> Resource
Limits section</a>. When limits for a QOS have been defined, they
will take precedence over the association's limits.
<P> Here are the limits that will be imposed on jobs running under a
QOS (an example of setting several of these limits with <i>sacctmgr</i>
follows the list):</P>
<UL>
<LI><b>GrpCpus</b> Maximum number of CPUs all jobs with this QOS can be allocated.
<LI><b>GrpCPUMins</b> A hard limit of CPU minutes to be used by jobs
running from this QOS. If this limit is reached, all jobs running in
this group will be killed, and no new jobs will be allowed to run.
<LI><b>GrpCPURunMins</b> Maximum number of CPU minutes all jobs
running with this QOS can run at the same time. This takes into
consideration the time limits of running jobs. If the limit is reached,
no new jobs are started until other jobs finish and free up time.
<LI><b>GrpJobs</b> Maximum number of jobs that can run with this QOS.
<LI><b>GrpMemory</b> Maximum amount of memory (MB) all jobs with this QOS can be allocated.
<LI><b>GrpNodes</b> Maximum number of nodes that can be allocated to all jobs with this QOS.
<LI><b>GrpSubmitJobs</b> Maximum number of jobs with this QOS that can be in the system (no matter what state).
<LI><b>GrpWall</b> Wall clock limit for all jobs running with this QOS.
<LI><b>MaxCpusPerJob</b> Maximum number of CPUs any job with this QOS can be allocated.
<LI><b>MaxCPUMinsPerJob</b> Maximum number of CPU*minutes any job with this QOS can run.
<LI><b>MaxNodesPerJob</b> Maximum number of nodes that can be allocated to any job with this QOS.
<LI><b>MaxWallDurationPerJob</b> Wall clock limit for any jobs running with this QOS.
<LI><b>MaxCpusPerUser</b> Maximum number of CPUs any user with this QOS can be allocated.
<LI><b>MaxJobsPerUser</b> Maximum number of jobs a user can run with this QOS.
<LI><b>MaxNodesPerUser</b> Maximum number of nodes that can be allocated to any user with this QOS.
<LI><b>MaxSubmitJobsPerUser</b> Maximum number of jobs with this QOS that can be in the system.
</UL>
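<P> Limits are attached to a QOS with <i>sacctmgr</i>. The values below are
only examples, and the exact option spellings accepted by <i>sacctmgr</i>
may differ between SLURM versions:</P>
<pre>
# cap any single job in this QOS at 64 CPUs and a 24 hour wall clock limit
$ sacctmgr modify qos premium set MaxCpusPerJob=64 MaxWallDurationPerJob=24:00:00

# cap the aggregate resources used by all jobs running with this QOS
$ sacctmgr modify qos premium set GrpCpus=512 GrpJobs=100
</pre>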
<a name=qos_other>
<h2>Other QOS Options</h2></a>
<ul>
<li><b>Flags</b> Used by the slurmctld to override or enforce certain
characteristics (an example of setting these and other QOS options with
<i>sacctmgr</i> follows this list). Valid options are:
<ul>
<li><b>EnforceUsageThreshold</b> If set, and the QOS also has a UsageThreshold,
any jobs submitted with this QOS that fall below the UsageThreshold
will be held until their Fairshare Usage goes above the Threshold.
<li><b>NoReserve</b> If this flag is set and backfill scheduling is used,
jobs using this QOS will not reserve resources in the backfill
scheduler's map of resources allocated through time. This flag is
intended for use with a QOS that may be preempted by jobs associated
with all other QOS (e.g. use with a "standby" QOS). If this flag is
used with a QOS which can not be preempted by all other QOS, it could
result in starvation of larger jobs.
<li><b>PartitionMaxNodes</b> If set, jobs using this QOS will be able to
override the requested partition's MaxNodes limit.
<li><b>PartitionMinNodes</b> If set, jobs using this QOS will be able to
override the requested partition's MinNodes limit.
<li><b>PartitionTimeLimit</b> If set, jobs using this QOS will be able to
override the requested partition's TimeLimit.
<li><b>RequiresReservation</b> If set, jobs using this QOS must designate a
reservation when submitting a job. This option can be useful in
restricting usage of a QOS that may have greater preemptive capability
or additional resources to be allowed only within a reservation.
</ul>
<li><b>GraceTime</b> Preemption grace time to be extended to a job
which has been selected for preemption.
<li><b>UsageFactor</b> Usage factor when running with this QOS
(e.g. a value of .5 would cause jobs to be charged only half the normal
time in accounting, while 2 would charge them twice as much).
<li><b>UsageThreshold</b>
A float representing the lowest fairshare of an association allowable
to run a job. If an association falls below this threshold and has
pending jobs or submits new jobs, those jobs will be held until the
usage goes back above the threshold. Use <i>sshare</i> to see current
shares on the system.
</ul>
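<P> These options are also set with <i>sacctmgr</i>. The following sketch
applies illustrative values to a hypothetical "standby" QOS:</P>
<pre>
# do not reserve backfill resources, give preempted jobs a 120 second
# grace period, and charge usage at half the normal rate
$ sacctmgr modify qos standby set Flags=NoReserve GraceTime=120 UsageFactor=0.5
</pre>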
<h2>Configuration</h2>
<P> To summarize the above, the QOS's and their associated limits are
defined in the SLURM database using the <i>sacctmgr</i> utility. The
QOS will only influence job scheduling priority when the multi-factor
priority plugin is loaded and a non-zero "PriorityWeightQOS" has been
defined in the slurm.conf file. The QOS will only determine job
preemption when the "PreemptType" is defined as "preempt/qos" in the
slurm.conf file. Limits defined for a QOS (and described above) will
override the limits of the user/account/cluster/partition
association.</P>
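<P> The resulting QOS definitions can be reviewed at any time, for example
(field names may vary slightly between SLURM versions):</P>
<pre>
$ sacctmgr show qos format=name,priority,preempt,grpcpus,maxwall,usagefactor
</pre>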
<p style="text-align: center;">Last modified 9 October 2009</p>
</body></html>