<!--#include virtual="header.txt"-->
<h1><a name="top">Sharing Consumable Resources</a></h1>
<h2 id="cpu_management">CPU Management
<a class="slurm_link" href="#cpu_management"></a>
</h2>
<P>
(Disclaimer: In this "CPU Management" section, the term "consumable resource"
does not include memory. The management of memory as a consumable resource is
discussed in its own section below.)
</P>
<P>
The per-partition <CODE>OverSubscribe</CODE> setting applies to the <U>entity
being selected for scheduling</U>:
<UL><LI><P>
When the <CODE>select/linear</CODE> plugin is enabled, the
per-partition <CODE>OverSubscribe</CODE> setting controls whether or not the
<B>nodes</B> are shared among jobs.
</P></LI><LI><P>
When the default <CODE>select/cons_tres</CODE> plugin is
enabled, the per-partition <CODE>OverSubscribe</CODE> setting controls
whether or not the <B>configured consumable resources</B> are shared among jobs.
When a consumable resource such as a core,
socket, or CPU is shared, it means that more than one job can be assigned to it.
</P></LI></UL>
</P>
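<P>
As an illustration, a minimal <CODE>slurm.conf</CODE> sketch that selects the
<CODE>select/cons_tres</CODE> plugin and sets a per-partition
<CODE>OverSubscribe</CODE> policy might look like the following (the node and
partition names and the CPU count are examples only):
</P>
<PRE>
# Schedule individual cores as the consumable resource
SelectType=select/cons_tres
SelectTypeParameters=CR_Core

NodeName=node[01-04] CPUs=16 State=UNKNOWN

# Cores in this partition may be shared by up to 4 jobs (the default count)
PartitionName=shared Nodes=node[01-04] OverSubscribe=FORCE Default=YES State=UP
# Cores in this partition are never shared between jobs
PartitionName=noshare Nodes=node[01-04] OverSubscribe=NO State=UP
</PRE>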
<P>
The following table describes this functionality in more detail:
</P>
<TABLE CELLPADDING=3 CELLSPACING=1 BORDER=1>
<TR><TH>Selection Setting</TH>
<TH>Per-partition <CODE>OverSubscribe</CODE> Setting</TH>
<TH>Resulting Behavior</TH>
</TR><TR>
<TD ROWSPAN=3>SelectType=<B>select/linear</B></TD>
<TD>OverSubscribe=NO</TD>
<TD>Whole nodes are allocated to jobs. No node will run more than one job
per partition/queue.</TD>
</TR><TR>
<TD>OverSubscribe=YES</TD>
<TD>By default same as OverSubscribe=NO. Nodes allocated to a job may be shared with
other jobs if each job allows sharing via the <CODE>srun --oversubscribe</CODE>
option.</TD>
</TR><TR>
<TD>OverSubscribe=FORCE</TD>
<TD>Each whole node can be allocated to multiple jobs up to the count specified
per partition/queue (default 4 jobs per node).</TD>
</TR><TR>
<TD ROWSPAN=3>SelectType=<B>select/cons_tres</B><BR>
Plus one of the following:<BR>
SelectTypeParameters=<B>CR_Core</B><BR>
SelectTypeParameters=<B>CR_Core_Memory</B></TD>
<TD>OverSubscribe=NO</TD>
<TD>Cores are allocated to jobs. No core will run more than one job
per partition/queue.</TD>
</TR><TR>
<TD>OverSubscribe=YES</TD>
<TD>By default same as OverSubscribe=NO. Cores allocated to a job may be shared with
other jobs if each job allows sharing via the <CODE>srun --oversubscribe</CODE>
option.</TD>
</TR><TR>
<TD>OverSubscribe=FORCE</TD>
<TD>Each core can be allocated to multiple jobs up to the count specified
per partition/queue (default 4 jobs per core).</TD>
</TR><TR>
<TD ROWSPAN=3>SelectType=<B>select/cons_tres</B><BR>
Plus one of the following:<BR>
SelectTypeParameters=<B>CR_CPU</B><BR>
SelectTypeParameters=<B>CR_CPU_Memory</B></TD>
<TD>OverSubscribe=NO</TD>
<TD>CPUs are allocated to jobs. No CPU will run more than one job
per partition/queue.</TD>
</TR><TR>
<TD>OverSubscribe=YES</TD>
<TD>By default same as OverSubscribe=NO. CPUs allocated to a job may be shared with
other jobs if each job allows sharing via the <CODE>srun --oversubscribe</CODE>
option.</TD>
</TR><TR>
<TD>OverSubscribe=FORCE</TD>
<TD>Each CPU can be allocated to multiple jobs up to the count specified
per partition/queue (default 4 jobs per CPU).</TD>
</TR><TR>
<TD ROWSPAN=3>SelectType=<B>select/cons_tres</B><BR>
Plus one of the following:<BR>
SelectTypeParameters=<B>CR_Socket</B><BR>
SelectTypeParameters=<B>CR_Socket_Memory</B></TD>
<TD>OverSubscribe=NO</TD>
<TD>Sockets are allocated to jobs. No socket will run more than one job
per partition/queue.</TD>
</TR><TR>
<TD>OverSubscribe=YES</TD>
<TD>By default same as OverSubscribe=NO. Sockets allocated to a job may be shared with
other jobs if each job allows sharing via the <CODE>srun --oversubscribe</CODE>
option.</TD>
</TR><TR>
<TD>OverSubscribe=FORCE</TD>
<TD>Each socket can be allocated to multiple jobs up to the count specified
per partition/queue (default 4 jobs per socket).</TD>
</TR>
</TABLE>
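<P>
For example, with <CODE>OverSubscribe=YES</CODE> configured on a partition, jobs
only share resources when each of them opts in with
<CODE>--oversubscribe</CODE> (the partition name and applications shown are
placeholders):
</P>
<PRE>
# Both jobs allow their CPUs to be shared with other jobs that also opt in
srun -p shared -n 4 --oversubscribe ./app_a &amp;
srun -p shared -n 4 --oversubscribe ./app_b &amp;

# This job does not opt in, so its CPUs will not be shared
srun -p shared -n 4 ./app_c
</PRE>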
<P>When <CODE>OverSubscribe=FORCE</CODE> is configured, the consumable resources are
scheduled for jobs using a <B>least-loaded</B> algorithm. Thus, idle
CPUs, cores or sockets will be allocated to a job before busy ones, and
those running one job will be allocated to a job before those
running two or more jobs. This is the same approach that the
<CODE>select/linear</CODE> plugin uses when allocating "shared" nodes.
</P>
<P>
Note that the <B>granularity</B> of the "least-loaded" algorithm is what
distinguishes the consumable resource and linear plugins
when <CODE>OverSubscribe=FORCE</CODE> is configured. With the
<CODE>select/cons_tres</CODE> plugin enabled, the CPUs of a node are not
overcommitted until all of the CPUs on the other nodes are overcommitted.
Thus, if one job allocates half of the CPUs on a node and a second job is then
submitted that requires more than half of the CPUs, the consumable resource
plugin will attempt to place this new job on other busy nodes that have more
than half of their CPUs available for use. The
<CODE>select/linear</CODE> plugin simply counts jobs on nodes and does not
track the CPU usage on each node.
</P><P>
The sharing functionality in the
<CODE>select/cons_tres</CODE> plugin also supports the
<CODE>OverSubscribe=FORCE:&lt;num&gt;</CODE> syntax. If <CODE>OverSubscribe=FORCE:3</CODE>
is configured with a consumable resource plugin and <CODE>CR_Core</CODE> or
<CODE>CR_Core_Memory</CODE>, then the plugin will
run up to 3 jobs on each <U>core</U> of each node in the partition. If
<CODE>CR_Socket</CODE> or <CODE>CR_Socket_Memory</CODE> is configured, then the
plugin will run up to 3 jobs on each <U>socket</U>
of each node in the partition.
</P>
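<P>
For instance, the following fragment (node and partition names are examples
only) would allow up to 3 jobs to run on each core of each node in the
<CODE>batch</CODE> partition:
</P>
<PRE>
SelectType=select/cons_tres
SelectTypeParameters=CR_Core

# At most 3 jobs may be allocated to any one core in this partition
PartitionName=batch Nodes=node[01-04] OverSubscribe=FORCE:3 State=UP
</PRE>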
<h2 id="multiple_partitions">Nodes in Multiple Partitions
<a class="slurm_link" href="#multiple_partitions"></a>
</h2>
<P>
Slurm has supported configuring nodes in more than one partition since version
0.7.0. The following table describes how nodes configured in two partitions with
different <CODE>OverSubscribe</CODE> settings will be allocated to jobs. Note that
"shareable" jobs are jobs submitted either to partitions configured with
<CODE>OverSubscribe=FORCE</CODE>, or to <CODE>OverSubscribe=YES</CODE> partitions
with sharing requested via the <CODE>srun --oversubscribe</CODE> option.
Conversely, "non-shareable" jobs are jobs submitted either to partitions
configured with <CODE>OverSubscribe=NO</CODE>, or to <CODE>OverSubscribe=YES</CODE>
partitions where the job did <U>not</U> request the
<CODE>--oversubscribe</CODE> option.
</P>
<TABLE CELLPADDING=3 CELLSPACING=1 BORDER=1>
<TR><TH>&nbsp;</TH><TH>First job "shareable"</TH><TH>First job not
"shareable"</TH></TR>
<TR><TH>Second job "shareable"</TH><TD>Both jobs can run on the same nodes and
may share resources</TD><TD>Jobs do not run on the same nodes</TD></TR>
<TR><TH>Second job not "shareable"</TH><TD>Jobs do not run on the same nodes</TD>
<TD>Jobs can run on the same nodes but will not share resources</TD></TR>
</TABLE>
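<P>
The <CODE>OverSubscribe</CODE> policy that a partition is actually using can be
checked with <CODE>scontrol</CODE>; the partition name below is an example:
</P>
<PRE>
# The output includes the partition's OverSubscribe setting
scontrol show partition shared
</PRE>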
<P>
The next table contains several scenarios with the <CODE>select/cons_tres</CODE>
plugin enabled to further
clarify how a node is used when it is configured in more than one partition and
the partitions have different <CODE>OverSubscribe</CODE> policies.
</P>
<TABLE CELLPADDING=3 CELLSPACING=1 BORDER=1>
<TR><TH>Slurm configuration</TH>
<TH>Resulting Behavior</TH>
</TR><TR>
<TD>Two <CODE>OverSubscribe=NO</CODE> partitions assigned the same set of nodes</TD>
<TD>Jobs from either partition will be assigned to all available consumable
resources. No consumable resource will be shared. One node could have 2 jobs
running on it, and each job could be from a different partition.</TD>
</TR><TR>
<TD>Two partitions assigned the same set of nodes: one partition is
<CODE>OverSubscribe=FORCE</CODE>, and the other is <CODE>OverSubscribe=NO</CODE></TD>
<TD>A node will only run jobs from one partition at a time. If a node is
running jobs from the <CODE>OverSubscribe=NO</CODE> partition, then none of its
consumable resources will be shared. If a node is running jobs from the
<CODE>OverSubscribe=FORCE</CODE> partition, then its consumable resources can be
shared.</TD>
</TR><TR>
<TD>Two <CODE>OverSubscribe=FORCE</CODE> partitions assigned the same set of nodes</TD>
<TD>Jobs from either partition will be assigned consumable resources. All
consumable resources can be shared. One node could have 2 jobs running on it,
and each job could be from a different partition.</TD>
</TR><TR>
<TD>Two partitions assigned the same set of nodes: one partition is
<CODE>OverSubscribe=FORCE:3</CODE>, and the other is <CODE>OverSubscribe=FORCE:5</CODE></TD>
<TD>Generally the same behavior as above. However no consumable resource will
ever run more than 3 jobs from the first partition, and no consumable resource
will ever run more than 5 jobs from the second partition. A consumable resource
could have up to 8 jobs running on it at one time.</TD>
</TR>
</TABLE>
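<P>
As a sketch of the last scenario above (node and partition names are examples
only), both partitions contain the same nodes but permit different levels of
oversubscription:
</P>
<PRE>
PartitionName=low  Nodes=node[01-04] OverSubscribe=FORCE:3 State=UP
PartitionName=high Nodes=node[01-04] OverSubscribe=FORCE:5 State=UP
</PRE>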
<P>
Note that the "mixed shared setting" configuration (row #2 above) introduces the
possibility of <B>starvation</B> between jobs in each partition. If a set of
nodes is running jobs from the <CODE>OverSubscribe=NO</CODE> partition, then these
nodes will continue to be available only to jobs from that partition, even if
jobs submitted to the <CODE>OverSubscribe=FORCE</CODE> partition have a higher
priority. This works in reverse also, and in fact it's easier for jobs from the
<CODE>OverSubscribe=FORCE</CODE> partition to hold onto the nodes longer because the
consumable resource "sharing" provides more resource availability for new jobs
to begin running "on top of" the existing jobs. This happens with the
<CODE>select/linear</CODE> plugin also, so it's not specific to the
<CODE>select/cons_tres</CODE> plugin.
</P>
<h2 id="memory_management">Memory Management
<a class="slurm_link" href="#memory_management"></a>
</h2>
<P>
The management of memory as a consumable resource can be used to prevent
oversubscription of memory, which would result in memory pages being swapped
out and severely degraded performance.
</P>
<TABLE CELLPADDING=3 CELLSPACING=1 BORDER=1>
<TR><TH>Selection Setting</TH>
<TH>Resulting Behavior</TH>
</TR><TR>
<TD>SelectType=<B>select/linear</B></TD>
<TD>Memory allocation is not tracked. Jobs are allocated to nodes without
considering if there is enough free memory. Swapping could occur!</TD>
</TR><TR>
<TD>SelectType=<B>select/linear</B> plus<BR>
SelectTypeParameters=<B>CR_Memory</B></TD>
<TD>Memory allocation is tracked. Nodes that do not have enough available
memory to meet the job's memory requirement will not be allocated to the job.
</TD>
</TR><TR>
<TD>SelectType=<B>select/cons_tres</B><BR>
Plus one of the following:<BR>
SelectTypeParameters=<B>CR_Core</B><BR>
SelectTypeParameters=<B>CR_CPU</B><BR>
SelectTypeParameters=<B>CR_Socket</B></TD>
<TD>Memory allocation is not tracked. Jobs are allocated to consumable resources
without considering if there is enough free memory. Swapping could occur!</TD>
</TR><TR>
<TD>SelectType=<B>select/cons_tres</B><BR>
Plus one of the following:<BR>
SelectTypeParameters=<B>CR_Core_Memory</B><BR>
SelectTypeParameters=<B>CR_CPU_Memory</B><BR>
SelectTypeParameters=<B>CR_Socket_Memory</B></TD>
<TD>Memory allocation for all jobs is tracked. Nodes that do not have enough
available memory to meet the job's memory requirement will not be allocated to
the job.</TD>
</TR>
</TABLE>
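<P>
For example, the following settings track both cores and memory as consumable
resources:
</P>
<PRE>
# Allocate individual cores and track memory so that it cannot be oversubscribed
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
</PRE>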
<P>Users can specify their job's memory requirements in one of two ways. The
<CODE>srun --mem=&lt;num&gt;</CODE> option can be used to specify the job's
memory requirement on a per-allocated-node basis. This option is recommended
for use with the <CODE>select/linear</CODE> plugin, which allocates
whole nodes to jobs. The
<CODE>srun --mem-per-cpu=&lt;num&gt;</CODE> option can be used to specify the
job's memory requirement on a per-allocated-CPU basis. This is recommended
for use with the <CODE>select/cons_tres</CODE>
plugin, which can allocate individual CPUs to jobs.</P>
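<P>
For example (the executable names and memory sizes are placeholders; the values
are in megabytes):
</P>
<PRE>
# Request 2048 MB of memory on each of the two allocated nodes
srun -N 2 --mem=2048 ./whole_node_app

# Request 1024 MB of memory for each of the job's allocated CPUs
srun -n 8 --mem-per-cpu=1024 ./per_cpu_app
</PRE>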
<P>Default and maximum values for memory on a per node or per CPU basis can
be configured by the system administrator using the following
<CODE>slurm.conf</CODE> options: <CODE>DefMemPerCPU</CODE>,
<CODE>DefMemPerNode</CODE>, <CODE>MaxMemPerCPU</CODE> and
<CODE>MaxMemPerNode</CODE>.
Users can use the <CODE>--mem</CODE> or <CODE>--mem-per-cpu</CODE> option
at job submission time to override the default value, but they cannot exceed
the maximum value.
</P><P>
Enforcement of a job's memory allocation is performed by setting the "maximum
data segment size" and the "maximum virtual memory size" system limits to the
appropriate values before launching the tasks. Enforcement is also managed by
the accounting plugin, which periodically gathers data about running jobs. Set
<CODE>JobAcctGatherType</CODE> and <CODE>JobAcctGatherFrequency</CODE> to
values suitable for your system.</P>
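<P>
A sketch of the corresponding <CODE>slurm.conf</CODE> settings, assuming
per-CPU memory limits and the Linux accounting-gather plugin, might be:
</P>
<PRE>
# Default and maximum memory per allocated CPU, in megabytes
DefMemPerCPU=1024
MaxMemPerCPU=4096

# Gather usage data for running jobs every 30 seconds
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=30
</PRE>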
<P>
<B>NOTE</B>: The <CODE>--oversubscribe</CODE> and <CODE>--exclusive</CODE>
options are mutually exclusive when used at job submission. If both options are
set when submitting a job, the job submission command will exit with a fatal error.
</P>
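<P>
For example, a request such as the following (the application name is a
placeholder) will be rejected because the two options conflict:
</P>
<PRE>
# --oversubscribe and --exclusive cannot be combined
srun -n 4 --oversubscribe --exclusive ./app
</PRE>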
<p style="text-align:center;">Last modified 09 July 2025</p>
<!--#include virtual="footer.txt"-->