.TH sacctmgr "1" "Slurm Commands" "August 2025" "Slurm Commands"
.SH "NAME"
sacctmgr \- Used to view and modify Slurm account information.
.SH "SYNOPSIS"
\fBsacctmgr\fR [\fIOPTIONS\fR...] [\fICOMMAND\fR...]
.SH "DESCRIPTION"
\fBsacctmgr\fR is used to view or modify Slurm account information.
The account information is maintained within a database with the interface
being provided by \fBslurmdbd\fR (Slurm Database daemon).
This database can serve as a central storehouse of user and
computer information for multiple computers at a single site.
Slurm account information is recorded based upon four parameters
that form what is referred to as an \fIassociation\fR.
These parameters are \fIuser\fR, \fIcluster\fR, \fIpartition\fR, and
\fIaccount\fR. \fIuser\fR is the login name.
\fIcluster\fR is the name of a Slurm managed cluster as specified by
the \fIClusterName\fR parameter in the \fIslurm.conf\fR configuration file.
\fIpartition\fR is the name of a Slurm partition on that cluster.
\fIaccount\fR is the bank account for a job.
The intended mode of operation is to initiate the \fBsacctmgr\fR command,
add, delete, modify, and/or list \fIassociation\fR records, then
commit the changes and exit.
\fBNOTE\fR: The contents of Slurm's database are maintained in lower case.
This may result in some \f3sacctmgr\fP output differing from that of other
Slurm commands.
.SH "OPTIONS"
.TP
\fB\-s\fR, \fB\-\-associations\fR
Use with show or list to display associations with the entity.
This is equivalent to the \fBassociations\fR command.
.IP
.TP
\fB\-h\fR, \fB\-\-help\fR
Print a help message describing the usage of \fBsacctmgr\fR.
This is equivalent to the \fBhelp\fR command.
.IP
.TP
\fB\-i\fR, \fB\-\-immediate\fR
Commit changes immediately without asking for confirmation.
.IP
.TP
\f3\-\-json\fP, \f3\-\-json\fP=\fIlist\fR, \f3\-\-json\fP=<\fIdata_parser\fR>
Dump information as JSON using the default data_parser plugin or explicit
data_parser with parameters. Sorting and formatting arguments will be ignored.
This option is not available for every command.
.IP
.TP
\fB\-n\fR, \fB\-\-noheader\fR
No header will be added to the beginning of the output.
.IP
.TP
\fB\-p\fR, \fB\-\-parsable\fR
Output will be '|' delimited with a '|' at the end.
.IP
.TP
\fB\-P\fR, \fB\-\-parsable2\fR
Output will be '|' delimited without a '|' at the end.
.IP
.TP
\fB\-Q\fR, \fB\-\-quiet\fR
Print no messages other than error messages.
This is equivalent to the \fBquiet\fR command.
.IP
.TP
\fB\-r\fR, \fB\-\-readonly\fR
Makes it so the running sacctmgr cannot modify accounting information.
The \fBreadonly\fR option is for use within interactive mode.
.IP
.TP
\f3\-\-yaml\fP, \f3\-\-yaml\fP=\fIlist\fR, \f3\-\-yaml\fP=<\fIdata_parser\fR>
Dump information as YAML using the default data_parser plugin or explicit
data_parser with parameters. Sorting and formatting arguments will be ignored.
This option is not available for every command.
.IP
.TP
\fB\-v\fR, \fB\-\-verbose\fR
Enable detailed logging.
This is equivalent to the \fBverbose\fR command.
.IP
.TP
\fB\-V\fR, \fB\-\-version\fR
Display version number.
This is equivalent to the \fBversion\fR command.
.IP
.SH "COMMANDS"
.TP
\fBadd\fR <\fIENTITY\fR> <\fISPECS\fR>
Add an entity.
Identical to the \fBcreate\fR command.
.IP
.TP
\fBarchive\fR {dump|load} <\fISPECS\fR>
Write database information to a flat file or load information that has
previously been written to a file.
.IP
.TP
\fBclear stats\fR
Clear the server statistics.
.IP
.TP
\fBcreate\fR <\fIENTITY\fR> <\fISPECS\fR>
Add an entity.
Identical to the \fBadd\fR command.
.IP
.TP
\fBdelete\fR <\fIENTITY\fR> \fBwhere\fR <\fISPECS\fR>
Delete the specified entities.
Identical to the \fBremove\fR command.
.IP
.TP
\fBdump\fR <\fIcluster\fR>
Dump cluster data to the specified file. If no filename is specified,
clustername.cfg is used by default.
.IP
.TP
\fBhelp\fP
Display a description of sacctmgr options and commands.
.IP
.TP
\fBlist\fR <\fIENTITY\fR> [<\fISPECS\fR>]
Display information about the specified entity.
By default, all entries are displayed; you can narrow results by
specifying SPECS in your query.
Identical to the \fBshow\fR command.
.IP
.TP
\fBload\fR <\fIFILENAME\fR>
Load cluster data from the specified file. This is a configuration file
generated by running the sacctmgr dump command. This command does
not load archive data; see the sacctmgr archive load option instead.
.IP
.TP
\fBmodify\fR <\fIENTITY\fR> \fBwhere\fR <\fISPECS\fR> \fBset\fR <\fISPECS\fR>
Modify an entity.
.IP
.TP
\fBping\fR
Ping slurmdbd.
.IP
.TP
\fBreconfigure\fR
Reconfigures the SlurmDBD if running with one.
.IP
.TP
\fBremove\fR <\fIENTITY\fR> \fBwhere\fR <\fISPECS\fR>
Delete the specified entities.
Identical to the \fBdelete\fR command.
.IP
.TP
\fBshow\fR <\fIENTITY\fR> [<\fISPECS\fR>]
Display information about the specified entity.
By default, all entries are displayed; you can narrow results by
specifying SPECS in your query.
Identical to the \fBlist\fR command.
.IP
.TP
\fBshutdown\fR
Shut down the server.
.IP
.TP
\fBversion\fP
Display the version number of sacctmgr.
.IP
.SH "INTERACTIVE COMMANDS"
\fBNOTE\fR:
All commands listed below can be used in the interactive mode, but \fINOT\fP
on the initial command line.
.TP
\fBexit\fP
Terminate sacctmgr interactive mode.
Identical to the \fBquit\fR command.
.IP
.TP
\fBquiet\fP
Print no messages other than error messages.
.IP
.TP
\fBquit\fP
Terminate the execution of sacctmgr interactive mode.
Identical to the \fBexit\fR command.
.IP
.TP
\fBverbose\fP
Enable detailed logging.
This includes time\-stamps on data structures, record counts, etc.
This is an independent command with no options meant for use in
interactive mode.
.IP
.TP
\fB!!\fP
Repeat the last command.
.IP
.SH "ENTITIES"
.TP
\fBaccount\fR
A bank account, typically specified at job submit time using the
\fB\-\-account=\fR option.
These may be arranged in a hierarchical fashion, for example
accounts 'chemistry' and 'physics' may be children of
the account 'science'.
The hierarchy may have an arbitrary depth.
.IP
.TP
\fBassociation\fR
The entity used to group information consisting of four parameters:
\fBaccount\fR, \fBcluster\fR, \fBpartition\fR (optional), and \fBuser\fR.
Used only with the \fBlist\fR or \fBshow\fR command. Add, modify, and
delete should be done to a user, account or cluster entity, which will
in turn update the underlying associations. Modification of attributes like
limits is allowed for an association but not a modification of the four
core attributes of an association. You cannot change the partition setting
(or set one if it has not been set) for an existing association. Instead,
you will need to create a new association with the partition included. You
can either keep the previous association with no partition defined, or delete
it. Note that these newly added associations are unique entities and any
existing usage information will not be carried over to the new association.
.IP
.TP
\fBcluster\fR
The \fBClusterName\fR parameter in the \fBslurm.conf\fR configuration
file, used to differentiate accounts on different machines.
.IP
.TP
\fBconfiguration\fR
Used only with the \fBlist\fR or \fBshow\fR command to report current
system configuration.
.IP
.TP
\fBcoordinator\fR
A special privileged user, usually an account manager, that can
add users or sub\-accounts to the account they are coordinator over.
This should be a trusted person since they can change limits on
account and user associations, as well as cancel, requeue or reassign
accounts of jobs inside their realm.
.IP
.TP
\fBevent\fR
Events like downed or drained nodes on clusters. Note that this does not
include transitory states like DRAINING.
.IP
.TP
\fBfederation\fR
A group of clusters that work together to schedule jobs.
.IP
.TP
\fBjob\fR
Used to modify specific fields of a job: Derived Exit Code, Comment,
AdminComment, Extra, SystemComment, TRES, or WCKey.
.IP
.TP
\fBproblem\fR
Use with \fBshow\fR or \fBlist\fR to display entity problems.
.IP
.TP
\fBqos\fR
Quality of Service.
.IP
.TP
\fBreservation\fR
A collection of resources set apart for use by a particular account, user
or group of users for a given period of time.
.IP
.TP
\fBresource\fR
Software resources for the system. These are software licenses shared
among clusters.
.IP
.TP
\fBRunawayJobs\fR
Used only with the \fBlist\fR or \fBshow\fR command to report current
jobs that have been orphaned on the local cluster and are now
runaway. If there are jobs in this state it will also give you an
option to "fix" them.
This sets the end time for each job to the latest of the job's start, eligible,
and submit times, and sets the state to completed by default. Once corrected,
this triggers the SlurmDBD to recalculate the usage from before the earliest
submit time of all the runaway jobs. \fBNOTE\fR: This could take a long time
and sreport may not return data until the recalculation is completed.
\fBNOTE\fR: You must have an \fBAdminLevel\fR of at least \fBOperator\fR to perform
this.
.IP
.TP
\fBstats\fR
Used with \fBlist\fR or \fBshow\fR command to view server statistics.
Accepts optional argument of \fBave_time\fR or \fBtotal_time\fR to sort on those
fields. By default, sorts on increasing RPC count field.
.IP
.TP
\fBtransaction\fR
List of transactions that have occurred during a given time period.
.IP
.TP
\fBtres\fR
Used with \fBlist\fR or \fBshow\fR command to view a list of Trackable
RESources configured on the system.
.IP
.TP
\fBuser\fR
The login name. Usernames are case\-insensitive (forced to lowercase) unless
the \fBPreserveCaseUser\fR option has been set in the SlurmDBD configuration
file.
.IP
.TP
\fBwckeys\fR
Workload Characterization Key. An arbitrary string for grouping orthogonal accounts.
.IP
.SH "GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES"
\fBNOTE\fR: The group limits (GrpJobs, GrpTRES, etc.) are tested when a job is
being considered for being allocated resources.
If starting a job would cause any of its group limits to be exceeded,
that job will not be considered for scheduling even if that job might preempt
other jobs which would release sufficient group resources for the pending
job to be initiated.
.TP
\fBDefaultQOS\fR=<\fIdefault_qos\fR>
The QOS this association and its children will use by default if allowed in the
\fBQosLevel\fR list mentioned below.
This is overridden if set directly on a user.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBFairshare\fR={<\fIfairshare_number\fR>|parent}
.PD 0
.IP \fBShare\fR={<\fIfairshare_number\fR>|parent}
.PD
Allocated shares used for fairshare calculation. Can also be the string
\fIparent\fR, which is interpreted differently if set on a user or on an account.
If set on a user, the parent association is used for fairshare.
If set on an account, that account's children will be effectively re\-parented
for fairshare calculations to the first parent of their parent that is not
Fairshare=parent. Limits remain the same; only its fairshare value is affected.
To clear an existing value, set a new value of \-1.
.IP
.TP
\fBGrpJobs\fR=<\fImax_jobs\fR>
Maximum number of running jobs in aggregate for this association and its
children.
To clear an existing value, set a new value of \-1.
.IP
.TP
\fBGrpJobsAccrue\fR=<\fImax_jobs\fR>
Maximum number of pending jobs in aggregate able to accrue age priority for this
association and its children.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBGrpSubmit\fR=<\fImax_jobs\fR>
.PD 0
.IP \fBGrpSubmitJobs\fR=<\fImax_jobs\fR>
.PD
Maximum number of jobs in a pending or running state at any time in aggregate
for this association and its children.
To clear an existing value, set a new value of \-1.
.IP
.TP
\fBGrpTRES\fR=TRES=<\fImax_TRES\fR>[,TRES=<\fImax_TRES\fR>,...]
Maximum number of TRES able to be allocated by running jobs in aggregate for
this association and its children.
To clear an existing value, set a new value of \-1 for each TRES type/name.
\fBTRES\fR can be one of the Slurm defaults (e.g. \fIcpu\fR, \fImem\fR,
\fInode\fR), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
.IP
.TP
\fBGrpTRESMins\fR=TRES=<\fIminutes\fR>[,TRES=<\fIminutes\fR>,...]
Maximum number of TRES minutes that can possibly be used by past, present, and
future jobs in this association and its children.
To clear an existing value, set a new value of \-1 for each TRES type/name.
\fBTRES\fR can be one of the Slurm defaults (e.g. \fIcpu\fR, \fImem\fR,
\fInode\fR), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
\fBNOTE\fR: This limit is not enforced if set on the root
association of a cluster. So even though it may appear in sacctmgr
output, it will not be enforced.
\fBNOTE\fR: This limit only applies when using the Priority Multifactor
plugin. The time is decayed using the value of PriorityDecayHalfLife
or PriorityUsageResetPeriod as set in the slurm.conf. When this limit
is reached all associated jobs running will be killed and all future
jobs submitted with associations in the group will be delayed until
they are able to run inside the limit.
.IP
.TP
\fBGrpTRESRunMins\fR=TRES=<\fIminutes\fR>[,TRES=<\fIminutes\fR>,...]
Maximum number of TRES minutes able to be allocated by running jobs in this
association and its children. This takes into consideration the time limit of
running jobs and consumes it. If the limit is reached, no new jobs are started
until other jobs finish and free up time.
To clear an existing value, set a new value of \-1 for each TRES type/name.
\fBTRES\fR can be one of the Slurm defaults (e.g. \fIcpu\fR, \fImem\fR,
\fInode\fR), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
.IP
.TP
\fBGrpWall\fR=<\fImax_wall\fR>
Maximum wall clock time able to be allocated by running jobs in aggregate in
this association and its children.
GrpWall format is <min> or <min>:<sec> or <hr>:<min>:<sec> or
<days>\-<hr>:<min>:<sec> or <days>\-<hr>.
The value is recorded in minutes with rounding as needed.
To clear an existing value, set a new value of \-1.
\fBNOTE\fR: Although it may appear in sacctmgr output, this limit will not
be enforced if set on the root association of a cluster.
\fBNOTE\fR: This limit only applies when using the Priority Multifactor
plugin. The time is decayed using the value of PriorityDecayHalfLife
or PriorityUsageResetPeriod as set in the slurm.conf. When this limit
is reached all associated jobs running will be killed and all future
jobs submitted with associations in the group will be delayed until
they are able to run inside the limit.
.IP
.TP
\fBMaxJobs\fR=<\fImax_jobs\fR>
Maximum number of running jobs per user in this association. This is overridden
if set directly on a user. Default is the cluster's limit.
To clear an existing value, set a new value of \-1.
.IP
.TP
\fBMaxJobsAccrue\fR=<\fImax_jobs\fR>
Maximum number of pending jobs able to accrue age priority at any given time in
this association. This is overridden if set directly on a user.
Default is the cluster's limit.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBMaxSubmit\fR=<\fImax_jobs\fR>
.PD 0
.IP \fBMaxSubmitJobs\fR=<\fImax_jobs\fR>
.PD
Maximum number of jobs in a pending or running state at any time in this
association. Default is the cluster's limit.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBMaxTRES\fR=TRES=<\fImax_TRES\fR>[,TRES=<\fImax_TRES\fR>,...]
.PD 0
.IP \fBMaxTRESPJ\fR=TRES=<\fImax_TRES\fR>[,TRES=<\fImax_TRES\fR>,...]
.PD 0
.IP \fBMaxTRESPerJob\fR=TRES=<\fImax_TRES\fR>[,TRES=<\fImax_TRES\fR>,...]
.PD
Maximum number of TRES each job can use in this association.
This is overridden if set directly on a user.
Default is the cluster's limit.
To clear an existing value, set a new value of \-1 for each TRES type/name.
.IP \fBMaxTRESMins\fR=TRES=<\fIminutes\fR>[,TRES=<\fIminutes\fR>,...]
.PD 0
.IP \fBMaxTRESMinsPJ\fR=TRES=<\fIminutes\fR>[,TRES=<\fIminutes\fR>,...]
.PD 0
.IP \fBMaxTRESMinsPerJob\fR=TRES=<\fIminutes\fR>[,TRES=<\fIminutes\fR>,...]
.PD
Maximum number of TRES minutes each job can use in this association.
This is overridden if set directly on a user.
Default is the cluster's limit.
To clear an existing value, set a new value of \-1 for each TRES type/name.
\fBTRES\fR can be one of the Slurm defaults (e.g. \fIcpu\fR, \fImem\fR,
\fInode\fR), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
.IP
.IP \fBMaxTRESPN\fR=TRES=<\fImax_TRES\fR>[,TRES=<\fImax_TRES\fR>,...]
.PD 0
.IP \fBMaxTRESPerNode\fR=TRES=<\fImax_TRES\fR>[,TRES=<\fImax_TRES\fR>,...]
.PD
Maximum number of TRES each node in a job allocation can use in this association.
This is overridden if set directly on a user.
Default is the cluster's limit.
To clear an existing value, set a new value of \-1 for each TRES type/name.
\fBTRES\fR can be one of the Slurm defaults (e.g. \fIcpu\fR, \fImem\fR),
or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
.IP
.IP \fBMaxWall\fR=<\fImax_wall\fR>
.PD 0
.IP \fBMaxWallDurationPerJob\fR=<\fImax_wall\fR>
.PD
Maximum wall clock time each job can use in this association.
This is overridden if set directly on a user.
Default is the cluster's limit.
MaxWall format is <min> or <min>:<sec> or <hr>:<min>:<sec> or
<days>\-<hr>:<min>:<sec> or <days>\-<hr>.
The value is recorded in minutes with rounding as needed.
To clear an existing value, set a new value of \-1.
\fBNOTE\fR: Changing this value will have no effect on any running or
pending job.
.IP
.TP
\fBPriority\fR
Association priority factor to be used by the priority/multifactor plugin.
This is overridden if set directly on a user.
Unset by default, indicating that no extra priority is granted.
To clear an existing value, set a new value of \-1.
.IP
.TP
\fBQosLevel\fR<\fIoperator\fR><\fIcomma_separated_list_of_qos_names\fR>
List of QOS names available to jobs running in this association. To get a list
of valid QOSs use 'sacctmgr list qos'.
This value will override its parent's value and push down to its
children as the new default. Setting a QosLevel to '' (two single
quotes with nothing between them) restores its default setting. You
can also use the operator += and \-= to add or remove certain QOSs
from the QOS list.
Valid <operator> values include:
.IP
.RS
\fB=\fR
.RS 5
Set \fIQosLevel\fP to the specified value. \fBNOTE\fR: the QOS that can be used
at a given account in the hierarchy are inherited by the children of that account.
By assigning QOS with the \fB=\fR sign only the assigned QOS can be used by the
account and its children.
.RE
\fB+=\fR
.RS
Add the specified <qos> value to the current \fIQosLevel\fP. The account will
have access to this QOS and any others previously assigned to it.
.RE
\fB\-=\fR
.RS
Remove the specified <qos> value from the current \fIQosLevel\fP.
.RE
.RE
.TP
See the \fBEXAMPLES\fR section below.
.IP
.SH "SPECIFICATIONS FOR ACCOUNTS"
.LP
Accounts can be created, modified, and deleted with sacctmgr. These options
allow you to set the corresponding attributes or filter on them when
querying for Accounts.
.TP
\fBCluster\fR=<\fIcluster\fR>
Specific cluster to add account to. Default is all in system.
.IP
.TP
\fBDescription\fR=<\fIdescription\fR>
An arbitrary string describing an account.
.IP
.TP
\fBFlags\fR=<\fIflag\fR>[,<\fIflag\fR>,...]
.PD
Valid options are:
.RS
.TP
\fBNoUsersAreCoords\fR
Remove the privilege \fIUsersAreCoords\fR sets.
.IP
.TP
\fBUsersAreCoords\fR
If set, all users in this account will have coordinator status for it and for
any sub\-account in its hierarchy.
.IP
.RE
.IP
.TP
\fBName\fR=<\fIname\fR>
The name of a bank account.
Note the name must be unique and cannot represent different bank
accounts at different points in the account hierarchy.
.IP
.TP
\fBOrganization\fR=<\fIorg\fR>
Organization to which the account belongs.
.IP
.TP
\fBParent\fR=<\fIparent\fR>
Parent account of this account. Default is the root account, a top\-level
account.
.IP
.TP
\fBRawUsage\fR=<\fIvalue\fR>
This allows an administrator to reset the raw usage accrued to an
account. The only value currently supported is 0 (zero). This is a
settable specification only \- it cannot be used as a filter to list
accounts.
.IP
.TP
\fBWithAssoc\fR
Display all associations for this account.
.IP
.TP
\fBWithCoord\fR
Display all coordinators for this account.
.IP
.TP
\fBWithDeleted\fR
Display information with previously deleted data.
Accounts that are deleted within 24 hours of being created and did not have
a job run in the account during that time will be removed from the database.
Otherwise, the account will be marked as deleted and will be viewable with
the \fBWithDeleted\fR flag.
.P
\fBNOTE\fR: If using the WithAssoc option you can also query against
association specific information to view only certain associations
this account may have. These extra options can be found in the
\fISPECIFICATIONS FOR ASSOCIATIONS\fP section. You can also use the
general specifications list above in the \fIGENERAL SPECIFICATIONS FOR
ASSOCIATION BASED ENTITIES\fP section.
.IP
.SH "LIST/SHOW ACCOUNT FORMAT OPTIONS"
.LP
Fields you can display when viewing Account records by using the \fIformat=\fR
option. The default format is:
.br
Account,Description,Organization
.TP
\fBAccount\fR
The name of a bank account.
.IP
.TP
\fBDescription\fR
An arbitrary string describing an account.
.IP
.TP
\fBFlags\fR
Flags set on the account.
.IP
.TP
\fBOrganization\fR
Organization to which the account belongs.
.IP
.TP
\fBCoordinators\fR
List of users that are a coordinator of the account. (Only filled in
when using the WithCoord option.)
.P
\fBNOTE\fR: If using the WithAssoc option you can also view the information
about the various associations the account may have on all the
clusters in the system. The association information can be filtered.
Note that all the accounts in the database will always be shown, because the
filter only takes effect over the association data. The Association format fields are
described in the \fILIST/SHOW ASSOCIATION FORMAT OPTIONS\fP section.
.IP
.SH "SPECIFICATIONS FOR ASSOCIATIONS"
.LP
Associations can be created, modified, and deleted with sacctmgr. These
options allow you to set the corresponding attributes or filter on them
when querying for Associations.
.TP
\fBClusters\fR=<\fIcluster_name\fR>[,<\fIcluster_name\fR>,...]
List the associations of the cluster(s).
.IP
.TP
\fBAccounts\fR=<\fIaccount_name\fR>[,<\fIaccount_name\fR>,...]
List the associations of the account(s).
.IP
.TP
\fBUsers\fR=<\fIuser_name\fR>[,<\fIuser_name\fR>,...]
List the associations of the user(s).
.IP
.TP
\fBPartitions\fR=<\fIpartition_name\fR>[,<\fIpartition_name\fR>,...]
List the associations of the partition(s).
.P
\fBNOTE\fR: Use Partitions="" or Partitions='' with no other names listed
when specifying the case where there is no partition. This can be useful
when using a command with an entity that has associations with and without
partitions. If given in a shell where the quotes will be consumed then
they must be quoted themselves. For example: Partitions=\\"\\".
.P
\fBNOTE\fR: You can also use the general specifications list above in the
\fIGENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES\fP section.
.P
\fBOther options unique for listing associations:\fP
.IP
.TP
\fBOnlyDefaults\fR
Display only associations that are default associations.
.IP
.TP
\fBTree\fR
Display account names in a hierarchical fashion.
.IP
.TP
\fBWithDeleted\fR
Display information with previously deleted data.
Associations that are deleted within 24 hours of being created and did not have
a job run in the association during that time will be removed from the database.
Otherwise, the association will be marked as deleted and will be viewable with
the \fBWithDeleted\fR flag.
.IP
.TP
\fBWithSubAccounts\fR
Display information with subaccounts. Only really valuable when used
with the account= option. This will display all the subaccount
associations along with the accounts listed in the option.
.IP
.TP
\fBWOLimits\fR
Display information without limit information. This is for a smaller
default format of "Cluster,Account,User,Partition".
.IP
.TP
\fBWOPInfo\fR
Display information without parent information (i.e. parent id, and
parent account name). This option also implicitly sets the WOPLimits
option.
.IP
.TP
\fBWOPLimits\fR
Display information without hierarchical parent limits (i.e. will
only display limits where they are set instead of propagating them
from the parent).
.IP
.SH "LIST/SHOW ASSOCIATION FORMAT OPTIONS"
.LP
Fields you can display when viewing Association records by using the
\fIformat=\fR option.
.TP
\fBAccount\fR
The name of a bank account in the association.
.IP
.TP
\fBCluster\fR
The name of a cluster in the association.
.IP
.TP
\fBDefaultQOS\fR
The QOS this association and its children will use by default if allowed in the
\fBQosLevel\fR list mentioned below.
.IP
.IP \fBFairshare\fR
.PD 0
.IP \fBShare\fR
.PD
Allocated shares used for fairshare calculation. Can also be the string
\fIparent\fR, which is interpreted differently if set on a user or on an account.
If set on a user, the parent association is used for fairshare.
If set on an account, that account's children will be effectively re\-parented
for fairshare calculations to the first parent of their parent that is not
Fairshare=parent. Limits remain the same; only its fairshare value is affected.
.IP
.TP
\fBFlags\fR
Flags set on the association.
.IP
.TP
\fBGrpJobs\fR
Maximum number of running jobs in aggregate for this association and its
children.
.IP
.TP
\fBGrpJobsAccrue\fR
Maximum number of pending jobs in aggregate able to accrue age priority for this
association and its children.
.IP
.IP \fBGrpSubmit\fR
.PD 0
.IP \fBGrpSubmitJobs\fR
.PD
Maximum number of jobs in a pending or running state at any time in aggregate
for this association and its children.
.IP
.TP
\fBGrpTRES\fR
Maximum number of TRES able to be allocated by running jobs in aggregate for
this association and its children.
.IP
.TP
\fBGrpTRESMins\fR
Maximum number of TRES minutes that can possibly be used by past, present, and
future jobs in this association and its children.
.IP
.TP
\fBGrpTRESRunMins\fR
Maximum number of TRES minutes able to be allocated by running jobs in this
association and its children. This takes into consideration the time limit of
running jobs and consumes it. If the limit is reached, no new jobs are started
until other jobs finish and free up time.
.IP
.TP
\fBGrpWall\fR
Maximum wall clock time able to be allocated by running jobs in aggregate in
this association and its children.
.IP
.TP
\fBID\fR
The id of the association.
.IP
.TP
\fBLineage\fR
Complete path up the hierarchy to the root association.
.IP
.TP
\fBMaxJobs\fR
Maximum number of running jobs per user.
.IP
.TP
\fBMaxJobsAccrue\fR
Maximum number of pending jobs able to accrue age priority at any given time.
.IP
.IP \fBMaxSubmit\fR
.PD 0
.IP \fBMaxSubmitJobs\fR
.PD
Maximum number of jobs in a pending or running state at any time.
.IP
.IP \fBMaxTRES\fR
.PD 0
.IP \fBMaxTRESPJ\fR
.PD 0
.IP \fBMaxTRESPerJob\fR
.PD
Maximum number of TRES each job can use.
.IP
.IP \fBMaxTRESMins\fR
.PD 0
.IP \fBMaxTRESMinsPJ\fR
.PD 0
.IP \fBMaxTRESMinsPerJob\fR
.PD
Maximum number of TRES minutes each job can use.
.IP
.IP \fBMaxTRESPN\fR
.PD 0
.IP \fBMaxTRESPerNode\fR
.PD
Maximum number of TRES each node in a job allocation can use.
.IP
.IP \fBMaxWall\fR
.PD 0
.IP \fBMaxWallDurationPerJob\fR
.PD
Maximum wall clock time each job can use.
.IP
.TP
\fBParentID\fR
The association id of the parent of this association.
.IP
.TP
\fBParentName\fR
The account name of the parent of this association.
.IP
.TP
\fBPartition\fR
The name of a partition in the association.
.IP
.TP
\fBPriority\fR
Association priority factor to be used by the priority/multifactor plugin.
.IP
.TP
\fBQos\fR
Valid QOSs for this association.
.IP
.TP
\fBQosRaw\fR
Numeric IDs of valid QOSs for this association.
.IP
.TP
\fBUser\fR
The name of a user in the association.
.IP
.TP
\fBWithRawQOSLevel\fR
Display QosLevel in an unevaluated raw format, consisting of a comma\-separated
list of QOS names prepended with '' (nothing), '+' or '\-' for
the association. QOS names without +/\- prepended were assigned (i.e.,
sacctmgr modify ... set QosLevel=qos_name) for the entity listed or
on one of its parents in the hierarchy. QOS names with +/\- prepended
indicate the QOS was added/filtered (i.e., sacctmgr modify ... set
QosLevel=[+\-]qos_name) for the entity listed or on one of its parents
in the hierarchy. Including WOPLimits will show exactly where each QOS
was assigned, added or filtered in the hierarchy.
.IP
.SH "SPECIFICATIONS FOR CLUSTERS"
.LP
Clusters can be created, modified, and deleted with sacctmgr. These
options allow you to set the corresponding attributes or filter on them
when querying for Clusters.
.TP
\fBClassification\fR=<\fIclassification\fR>
Type of machine. Current classifications are capability, capacity and
capapacity.
.IP
.TP
\fBFeatures\fR[+|\-]=<\fIcomma_separated_list_of_feature_names\fR>
Features that are specific to the cluster. Federated jobs can be directed to
clusters that contain the job requested features.
To add or remove individual features, use the += or \-= operators.
To clear all existing features, set a new value of '' (two single quotes with
nothing between them).
.IP
.TP
\fBFederation\fR=<\fIfederation\fR>
The federation that this cluster should be a member of. A cluster can only be a
member of one federation at a time.
.IP
.TP
\fBFedState\fR=<\fIstate\fR>
The state of the cluster in the federation.
.br
Valid states are:
.RS
.TP
\fBACTIVE\fR
Cluster will actively accept and schedule federated jobs.
.IP
.TP
\fBINACTIVE\fR
Cluster will not schedule or accept any jobs.
.IP
.TP
\fBDRAIN\fR
Cluster will not accept any new jobs and will let existing federated jobs
complete.
.IP
.TP
\fBDRAIN+REMOVE\fR
Cluster will not accept any new jobs and will remove itself from the federation
once all federated jobs have completed. When removed from the federation, the
cluster will accept jobs as a non\-federated cluster.
.RE
.IP
.TP
\fBName\fR=<\fIname\fR>
The name of a cluster.
This should be equal to the \fIClusterName\fR parameter in the \fIslurm.conf\fR
configuration file for some Slurm\-managed cluster.
.IP
.TP
\fBRPC\fR=<\fIrpc_list\fR>
Comma\-separated list of numeric RPC values.
.IP
.TP
\fBWithDeleted\fR
Display information with previously deleted data.
Clusters that are deleted within 24 hours of being created and did not have
a job run in the cluster during that time will be removed from the database.
Otherwise, the cluster will be marked as deleted and will be viewable with
the \fBWithDeleted\fR flag.
.IP
.TP
\fBWithFed\fR
Appends federation related columns to default format options
(e.g. Federation,ID,Features,FedState).
.IP
.TP
\fBWOLimits\fR
Display information without limit information. This is for a smaller
default format of Cluster,ControlHost,ControlPort,RPC.
.P
\fBNOTE\fR: You can also use the general specifications list above in the
\fIGENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES\fP section.
.IP
.SH "LIST/SHOW CLUSTER FORMAT OPTIONS"
.LP
Fields you can display when viewing Cluster records by using the \fIformat=\fR
option.
.TP
\fBClassification\fR
Type of machine, i.e. capability, capacity or capapacity.
.IP
.TP
\fBCluster\fR
The name of the cluster.
.IP
.TP
\fBControlHost\fR
When a slurmctld registers with the database, the IP address of the
controller is placed here.
.IP
.TP
\fBControlPort\fR
When a slurmctld registers with the database, the port the controller
is listening on is placed here.
.IP
.TP
\fBFeatures\fR
The list of features on the cluster (if any).
.IP
.TP
\fBFederation\fR
The name of the federation this cluster is a member of (if any).
.IP
.TP
\fBFedState\fR
The state of the cluster in the federation (if a member of one).
.IP
.TP
\fBFedStateRaw\fR
Numeric value of the FedState.
.IP
.TP
\fBFlags\fR
Attributes possessed by the cluster. Current flags include Cray, External and
MultipleSlurmd.
External clusters are registration only clusters. A slurmctld can designate an
external slurmdbd with the \fIAccountingStorageExternalHost\fR slurm.conf
option. This allows a slurmctld to register to an external slurmdbd so that
clusters attached to the external slurmdbd can communicate with the external
cluster with Slurm commands.
.IP
.TP
\fBID\fR
The ID assigned to the cluster when a member of a federation. This ID uniquely
identifies the cluster and its jobs in the federation.
.IP
.TP
\fBNodeCount\fR
The current count of nodes associated with the cluster.
.IP
.TP
\fBNodeNames\fR
The names of the nodes currently associated with the cluster.
.IP
.TP
\fBRPC\fR
When a slurmctld registers with the database, the RPC version the controller
is running is placed here.
.IP
.TP
\fBTRES\fR
Trackable RESources (Billing, BB (Burst buffer), CPU, Energy, GRES, License,
Memory, and Node) this cluster is accounting for.
.P
\fBNOTE\fR: You can also view the information about the root association for
the cluster. The Association format fields are described
in the \fILIST/SHOW ASSOCIATION FORMAT OPTIONS\fP section.
.IP
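.LP
For example, a listing restricted to a few of the fields above could be
requested as follows. This is only a sketch; the selection of fields is
arbitrary.
.nf

$ sacctmgr list cluster format=Cluster,ControlHost,ControlPort,RPC,NodeCount
.fi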
.SH "SPECIFICATIONS FOR COORDINATOR"
.LP
Coordinators can be created, modified, and deleted with sacctmgr. These
options allow you to set the corresponding attributes or filter on them
when querying for Coordinators.
.TP
\fBAccount\fR=<\fIaccount_name\fR>[,<\fIaccount_name\fR>,...]
Account name to add this user as a coordinator to.
.IP
.TP
\fBNames\fR=<\fIuser_name\fR>[,<\fIuser_name\fR>,...]
Names of coordinators.
.P
\fBNOTE\fR: To list coordinators, use the WithCoordinator option with list
account or list user.
.IP
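.LP
A minimal sketch of adding a coordinator and then listing it, assuming an
existing account named \fIphysics\fR and an existing user named \fIalice\fR
(both names are hypothetical):
.nf

# "physics" and "alice" are placeholder names
$ sacctmgr add coordinator account=physics names=alice
$ sacctmgr list user alice withcoordinator
.fi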
.SH "SPECIFICATIONS FOR EVENTS"
.LP
Events are automatically generated and sent to slurmdbd to be stored.
These are options you can specify to filter for specific types of events.
.TP
\fBAll_Clusters\fR
Shortcut to get information on all clusters.
.IP
.TP
\fBAll_Time\fR
Shortcut to query events over all time.
.IP
.TP
\fBClusters\fR=<\fIcluster_name\fR>[,<\fIcluster_name\fR>,...]
List the events of the cluster(s). Default is the cluster where the
command was run.
.IP
.TP
\fBCondFlags\fR=<\fIflag\fR>[,<\fIflag\fR>,...]
Optional list of flags to filter events by.
.br
Valid options are:
.RS
.TP
\fBOpen\fR
If set, only open node events (currently down) will be returned.
.IP
.RE
.IP
.TP
\fBEnd\fR=<\fIOPT\fR>
Period ending of events. Default is now.
.br
Valid time formats are:
.RS
.LP
HH:MM[:SS] [AM|PM]
.br
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
.br
MM/DD[/YY]\-HH:MM[:SS]
.br
YYYY\-MM\-DD[THH:MM[:SS]]
.br
now[{+|\-}\fIcount\fR[seconds(default)|minutes|hours|days|weeks]]
.RE
.IP
.TP
\fBEvent\fR=<\fIOPT\fR>
Specific types of events to look for. Valid options are Cluster or Node.
The default is both.
.IP
.TP
\fBMaxCPUs\fR=<\fIOPT\fR>
Max number of CPUs affected by an event.
.IP
.TP
\fBMinCPUs\fR=<\fIOPT\fR>
Min number of CPUs affected by an event.
.IP
.TP
\fBNodes\fR=<\fInode_name\fR>[,<\fInode_name\fR>,...]
Node names affected by an event.
.IP
.TP
\fBReason\fR=<\fIreason\fR>[,<\fIreason\fR>,...]
Reason associated with a node going down. A reason that contains a space
should be surrounded by quotes.
.IP
.TP
\fBStart\fR=<\fIOPT\fR>
Period start of events. Default is 00:00:00 of the previous day, unless
states are given with the \fBStates\fR option. In that case, the default
behavior is to return events currently in the states specified.
.br
Valid time formats are:
.RS
HH:MM[:SS] [AM|PM]
.br
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
.br
MM/DD[/YY]\-HH:MM[:SS]
.br
YYYY\-MM\-DD[THH:MM[:SS]]
.br
now[{+|\-}\fIcount\fR[seconds(default)|minutes|hours|days|weeks]]
.RE
.IP
.TP
\fBStates\fR=<\fIstate\fR>[,<\fIstate\fR>,...]
State of a node in a node event. If this is set, the event type is
set automatically to Node.
.IP
.TP
\fBUser\fR=<\fIuser_name\fR>[,<\fIuser_name\fR>,...]
Query against users who set the event. If this is set, the event type is
set automatically to Node since only the slurm user can perform a cluster
event.
.IP
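.LP
The filters above can be combined. The following sketch lists events from
the last seven days that affected two hypothetical nodes, and then lists only
the node events that are still open:
.nf

# node names are placeholders
$ sacctmgr list event nodes=tux01,tux02 start=now\-7days end=now
$ sacctmgr list event condflags=open
.fi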
.SH "LIST/SHOW EVENT FORMAT OPTIONS"
.LP
Fields you can display when viewing Event records by using the \fIformat=\fR
option. The default format is:
.br
Cluster,NodeName,TimeStart,TimeEnd,State,Reason,User
.TP
\fBCluster\fR
The name of the cluster the event happened on.
.IP
.TP
\fBClusterNodes\fR
The hostlist of nodes on a cluster in a cluster event.
.IP
.TP
\fBDuration\fR
Length of time the event lasted.
.IP
.TP
\fBEnd\fR
Time when the event ended.
.IP
.TP
\fBEvent\fR
Name of the event.
.IP
.TP
\fBEventRaw\fR
Numeric value of the event type.
.IP
.TP
\fBNodeName\fR
The node affected by the event. In a cluster event, this is blank.
.IP
.TP
\fBReason\fR
The reason an event happened.
.IP
.TP
\fBStart\fR
Time when the event started.
.IP
.TP
\fBState\fR
On a node event this is the formatted state of the node during the event.
.IP
.TP
\fBStateRaw\fR
On a node event this is the numeric value of the state of the node
during the event.
.IP
.TP
\fBTRES\fR
Number of TRES involved with the event.
.IP
.TP
\fBUser\fR
On a node event this is the user who caused the event to happen.
.IP
.SH "SPECIFICATIONS FOR FEDERATION"
.LP
Federations can be created, modified, and deleted with sacctmgr. These
options allow you to set the corresponding attributes or filter on them
when querying for Federations.
.TP
\fBClusters\fR[+|\-]=<\fIcluster_name\fR>[,<\fIcluster_name\fR>,...]
List of clusters to add to or remove from a federation. A blank value
(e.g. clusters=) will remove all clusters from the federation.
\fBNOTE\fR: A cluster can only be a member of one federation.
.IP
.TP
\fBName\fR=<\fIname\fR>
The name of the federation.
.IP
.TP
\fBTree\fR
Display federations in a hierarchical fashion.
.IP
.TP
\fBWithDeleted\fR
Display information with previously deleted data.
Federations that are deleted within 24 hours of being created will be removed
from the database. Federations that were created more than 24 hours prior to
the deletion request are just marked as deleted and will be viewable with
the \fBWithDeleted\fR flag.
.IP
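.LP
A sketch of creating a federation and adjusting its membership with the
\fBClusters\fR operators described above. The names \fIfed1\fR, \fItux1\fR,
\fItux2\fR and \fItux3\fR are hypothetical, and the clusters are assumed to
already exist in the database.
.nf

# all names below are placeholders
$ sacctmgr add federation fed1 clusters=tux1,tux2
$ sacctmgr modify federation fed1 set clusters+=tux3
$ sacctmgr modify federation fed1 set clusters\-=tux1
$ sacctmgr show federation
.fi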
.SH "LIST/SHOW FEDERATION FORMAT OPTIONS"
.LP
Fields you can display when viewing Federation records by using the
\fIformat=\fR option. The default format is:
.br
Federation,Cluster,Features,FedState
.TP
\fBCluster\fR
Name of the cluster that is a member of the federation.
.IP
.TP
\fBFeatures\fR
The list of features on the cluster.
.IP
.TP
\fBFederation\fR
The name of the federation.
.IP
.TP
\fBFedState\fR
The state of the cluster in the federation.
.IP
.TP
\fBFedStateRaw\fR
Numeric value of the FedState.
.IP
.TP
\fBIndex\fR
The index of the cluster in the federation.
.IP
.SH "SPECIFICATIONS FOR INSTANCES"
.LP
Information about cloud node instances is sent to slurmdbd to be stored.
These are options you can specify to filter for specific instances.
.TP
\fBClusters\fR=<\fIcluster_name\fR>[,<\fIcluster_name\fR>,...]
Name of the cluster that the instance ran on. Default is the cluster where the
command was run.
.IP
.TP
\fBEnd\fR=<\fIOPT\fR>
Period ending of instances. Default is now.
Valid time formats are:
.br
HH:MM[:SS] [AM|PM]
.br
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
.br
MM/DD[/YY]\-HH:MM[:SS]
.br
YYYY\-MM\-DD[THH:MM[:SS]]
.br
now[{+|\-}\fIcount\fR[seconds(default)|minutes|hours|days|weeks]]
.IP
.TP
\fBExtra\fR=<\fIOPT\fR>
Arbitrary string associated with the node during the life of the instance.
.IP
.TP
\fBInstanceId\fR=<\fIOPT\fR>
Cloud instance ID.
.IP
.TP
\fBInstanceType\fR=<\fIOPT\fR>
Cloud instance type.
.IP
.TP
\fBNodes\fR=<\fInode_name\fR>[,<\fInode_name\fR>,...]
The node on which the instance ran.
.IP
.TP
\fBStart\fR=<\fIOPT\fR>
Period start of instances. Default is 00:00:00 of previous day.
Valid time formats are:
.br
HH:MM[:SS] [AM|PM]
.br
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
.br
MM/DD[/YY]\-HH:MM[:SS]
.br
YYYY\-MM\-DD[THH:MM[:SS]]
.br
now[{+|\-}\fIcount\fR[seconds(default)|minutes|hours|days|weeks]]
.IP
.SH "LIST/SHOW INSTANCE FORMAT OPTIONS"
.LP
Fields you can display when viewing Instance records by using the \fIformat=\fR
option. The default format is:
.br
Cluster,NodeName,Start,End,InstanceID,InstanceType,Extra
.TP
\fBCluster\fR
Name of the cluster that the instance ran on.
.IP
.TP
\fBEnd\fR
Time when instance ended.
.IP
.TP
\fBExtra\fR
Arbitrary string associated with the node during the life of the instance.
.IP
.TP
\fBInstanceId\fR
Cloud instance ID.
.IP
.TP
\fBInstanceType\fR
Cloud instance type.
.IP
.TP
\fBNodeName\fR
The node on which the instance ran.
.IP
.TP
\fBStart\fR
Time when instance started.
.IP
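.LP
For example, instances recorded for a hypothetical cloud node over the past
24 hours could be listed with a reduced set of columns. The node name is a
placeholder.
.nf

# "cloud01" is a hypothetical node name
$ sacctmgr show instance nodes=cloud01 start=now\-24hours format=Cluster,NodeName,InstanceId,InstanceType
.fi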
.SH "SPECIFICATIONS FOR JOB"
.LP
Job information is automatically sent to slurmdbd to be stored.
These are options you can specify to filter for specific jobs. There are also
some attributes you can modify for a job record.
.TP
\fBAdminComment\fR=<\fIadmin_comment\fR>
Arbitrary descriptive string. Can only be modified by a Slurm administrator.
To clear an existing value, set a new value of '' (two single quotes with
nothing between them).
.IP
.TP
\fBComment\fR=<\fIcomment\fR>
The job's comment string when the AccountingStoreFlags parameter
in the slurm.conf file contains 'job_comment'. The user can only
modify the comment string of their own job.
To clear an existing value, set a new value of '' (two single quotes with
nothing between them).
.IP
.TP
\fBCluster\fR=<\fIcluster_list\fR>
List of clusters to alter jobs on, defaults to local cluster.
.IP
.TP
\fBDerivedExitCode\fR=<\fIderived_exit_code\fR>
The derived exit code can be modified after a job completes based on
the user's judgment of whether the job succeeded or failed. The user
can only modify the derived exit code of their own job.
.IP
.TP
\fBEndTime\fR
Jobs must end before this time to be modified. The format is
YYYY\-MM\-DDTHH:MM:SS, unless changed through the SLURM_TIME_FORMAT environment
variable.
.IP
.TP
\fBExtra\fR=<\fIextra\fR>
The job's extra string when the AccountingStoreFlags parameter in the slurm.conf
file contains 'job_extra'. The user can only modify the extra string of their
own job.
To clear an existing value, set a new value of '' (two single quotes with
nothing between them).
.IP
.TP
\fBJobID\fR=<\fIjobid_list\fR>
The ID of the job to change. Not needed if altering multiple jobs using the
wckey specification.
.IP
.TP
\fBNewWCKey\fR=<\fInew_wckey\fR>
Use to rename a wckey on job(s) in the accounting database.
.IP
.TP
\fBStartTime\fR
Jobs must start at or after this time to be modified. The format is the same
as for \f3EndTime\fP.
.IP
.TP
\fBSystemComment\fR=<\fIsystem_comment\fR>
Arbitrary descriptive string, usually managed by the BurstBufferPlugin.
Can only be modified by a Slurm administrator.
To clear an existing value, set a new value of '' (two single quotes
with nothing between them).
.IP
.TP
\fBTRES\fR=<\fItres_name=value\fR>
Use to set or modify a TRES on job(s) in the accounting database that have
already completed.
\fBWARNING\fR: This is permanent, the original value will be lost afterwards.
.IP
.TP
\fBUser\fR=<\fIuser_list\fR>
Used to specify the users whose jobs are to be altered.
.IP
.TP
\fBWCKey\fR=<\fIwckey_list\fR>
Used to specify the wckeys to alter.
.IP
.P
The \fIAdminComment\fR, \fIComment\fR, \fIDerivedExitCode\fR, \fIExtra\fR,
\fISystemComment\fP, and \fIWCKey\fP fields are the only fields of a job record
in the database that can be modified after job completion.
.IP
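.LP
A sketch of modifying a completed job record. The job ID and the comment
text are placeholders, and the \fBComment\fR change assumes that
AccountingStoreFlags in slurm.conf contains 'job_comment'.
.nf

# 1234 is a hypothetical job ID
$ sacctmgr modify job where jobid=1234 set derivedexitcode=1 comment="failed validation"
.fi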
.SH "LIST/SHOW JOB FORMAT OPTIONS"
The \fBsacct\fR command is the exclusive command to display job
records from the Slurm database.
.SH "SPECIFICATIONS FOR QOS"
.LP
A QOS can be created, modified, and deleted with sacctmgr. These
options allow you to set the corresponding attributes or filter on them
when querying for a QOS.
\fBNOTE\fR: The group limits (GrpJobs, GrpTRES, etc.) are tested when a job is
being considered for being allocated resources.
If starting a job would cause any of its group limits to be exceeded,
that job will not be considered for scheduling even if that job might preempt
other jobs which would release sufficient group resources for the pending
job to be initiated.
.TP
\fBDescription\fR
An arbitrary string describing a QOS. Can only be modified by a Slurm
administrator.
.IP
.TP
\fBFlags\fR
Used by the slurmctld to override or enforce certain characteristics.
To add or remove individual flags, use the += or \-= operators.
To clear all existing flags, set a new value of \-1.
.br
Valid options are:
.RS
.TP
\fBDenyOnLimit\fR
If set, jobs using this QOS will be rejected at submission time if they do
not conform to the QOS 'Max' or 'Min' limits as stand\-alone jobs.
Jobs that exceed these limits when other jobs are considered, but conform
to the limits when considered individually will not be rejected. Instead
they will pend until resources are available.
Group limits (e.g. \fBGrpTRES\fR) will also be treated like 'Max' limits
(e.g. \fBMaxTRESPerNode\fR) and jobs will be denied if they would violate
the limit as stand\-alone jobs.
This currently only applies to QOS and Association limits.
.IP
.TP
\fBEnforceUsageThreshold\fR
If set, and the QOS also has a UsageThreshold,
any jobs submitted with this QOS that fall below the UsageThreshold
will be held until their Fairshare Usage goes above the Threshold.
.IP
.TP
\fBNoDecay\fR
If set, this QOS will not have its GrpTRESMins,
GrpWall and UsageRaw decayed by the slurm.conf PriorityDecayHalfLife or
PriorityUsageResetPeriod settings. This allows a QOS to provide aggregate
limits that, once consumed, will not be replenished automatically. Such a
QOS will act as a time\-limited quota of resources for an association
that has access to it. Account/user usage will still be decayed for
associations using the QOS. The QOS GrpTRESMins and
GrpWall limits can be increased or the QOS RawUsage value reset to 0
(zero) to again allow jobs submitted with this QOS to be queued (if
DenyOnLimit is set) or run (pending with QOSGrp{TRES}MinutesLimit
or QOSGrpWallLimit reasons, where {TRES} is some type of trackable resource).
.IP
.TP
\fBNoReserve\fR
If set and backfill scheduling is used, jobs using this QOS will
not reserve resources in the backfill schedule's map of resources allocated
through time. This flag is intended for use with a QOS that may be preempted
by jobs associated with all other QOS (e.g. use with a "standby" QOS). If this
flag is used with a QOS which cannot be preempted by all other QOS, it could
result in starvation of larger jobs.
.IP
.TP
\fBOverPartQOS\fR
If set, jobs using this QOS will be able to
override any limits set by the requested partition's QOS.
.IP
.TP
\fBPartitionMaxNodes\fR
If set, jobs using this QOS will be able to
override the requested partition's MaxNodes limit.
.IP
.TP
\fBPartitionMinNodes\fR
If set, jobs using this QOS will be able to
override the requested partition's MinNodes limit.
.IP
.TP
\fBPartitionTimeLimit\fR
If set, jobs using this QOS will be able to
override the requested partition's TimeLimit.
.IP
.TP
\fBRelative\fR
If set, the QOS limits will be treated as percentages of the cluster or
partition instead of absolute limits (numbers should be less than 100).
The controller should be restarted or
reconfigured after adding the \fIRelative\fR flag to the QOS.
.br
If this is used as a partition QOS:
.RS
.LP
1. Limits will be calculated relative to the partition's resources.
.br
2. Only one partition may have this QOS as its partition QOS.
.br
3. Jobs will not be allowed to use it as a normal QOS.
.IP
.RE
Additional details are in the QOS documentation at
<https://slurm.schedmd.com/qos.html>.
.IP
.TP
\fBRequiresReservation\fR
If set, jobs using this QOS must designate a reservation when submitting a job.
This option can be useful in restricting usage of a QOS that may have greater
preemptive capability or additional resources to be allowed only within a
reservation.
.IP
.TP
\fBUsageFactorSafe\fR
If set and \fIAccountingStorageEnforce\fR includes \fISafe\fR, jobs will only
be able to run if the job can run to completion with the \fIUsageFactor\fR
applied.
.RE
.IP
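For example, individual flags could be added to or removed from a
hypothetical QOS named \fIstandby\fR, or all of its flags cleared, as
sketched here:
.nf

# "standby" is a placeholder QOS name
$ sacctmgr modify qos standby set flags+=NoReserve
$ sacctmgr modify qos standby set flags\-=NoReserve
$ sacctmgr modify qos standby set flags=\-1
.fi
.IP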
.TP
\fBGraceTime\fR
Preemption grace time in seconds to be extended to a job which has been
selected for preemption. The default value is zero, meaning no preemption grace
time is allowed on this QOS. This value is only applicable for QOS
\fIPreemptMode=CANCEL\fR and \fIPreemptMode=REQUEUE\fR.
.IP
.TP
\fBGrpJobs\fR
Maximum number of running jobs in aggregate for this QOS.
To clear an existing value, set a new value of \-1.
.IP
.TP
\fBGrpJobsAccrue\fR
Maximum number of pending jobs in aggregate able to accrue age priority for this
QOS.
This limit only applies to the job's QOS and not the partition's QOS.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBGrpSubmit\fR
.PD 0
.IP \fBGrpSubmitJobs\fR
.PD
Maximum number of jobs in a pending or running state at any time in aggregate
for this QOS.
To clear an existing value, set a new value of \-1.
.IP
.TP
\fBGrpTRES\fR
Maximum number of TRES able to be allocated by running jobs in aggregate for
this QOS.
To clear an existing value, set a new value of \-1 for each TRES type/name.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
.IP
.TP
\fBGrpTRESMins\fR
Maximum number of TRES minutes that can possibly be used by past, present, and
future jobs with this QOS.
To clear an existing value, set a new value of \-1 for each TRES type/name.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
\fBNOTE\fR: This limit only applies when using the Priority Multifactor
plugin. The time is decayed using the value of PriorityDecayHalfLife
or PriorityUsageResetPeriod as set in the slurm.conf. When this limit
is reached all associated jobs running will be killed and all future jobs
submitted with this QOS will be delayed until they are able to run
inside the limit.
.IP
.TP
\fBGrpTRESRunMins\fR
Maximum number of TRES minutes able to be allocated by running jobs with this
QOS. This takes into consideration the time limit of running jobs and consumes it.
If the limit is reached no new jobs are started until other jobs finish to allow
time to free up.
To clear an existing value, set a new value of \-1 for each TRES type/name.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
.IP
.TP
\fBGrpWall\fR
Maximum wall clock time able to be allocated by running jobs in aggregate for
this QOS. If this limit is reached, job submissions will be
denied and the running jobs will be killed.
GrpWall format is <min> or <min>:<sec> or <hr>:<min>:<sec> or
<days>\-<hr>:<min>:<sec> or <days>\-<hr>.
The value is recorded in minutes with rounding as needed.
To clear an existing value, set a new value of \-1.
\fBNOTE\fR: This limit only applies when using the Priority Multifactor
plugin. The time is decayed using the value of PriorityDecayHalfLife
or PriorityUsageResetPeriod as set in the slurm.conf. When this limit
is reached all associated jobs running will be killed and all future jobs
submitted with this QOS will be delayed until they are able to run
inside the limit.
.IP
.TP
\fBLimitFactor\fR
A float that is factored into an association's [Grp|Max]TRES limits. For
example, if the LimitFactor is 2, then an association with a GrpTRES of
30 CPUs, would be allowed to allocate 60 CPUs when running under this QOS.
To clear an existing value, set a new value of \-1.
\fBNOTE\fR: This factor is only applied to associations running in this QOS
and is not applied to any limits in the QOS itself.
.IP
.IP \fBMaxJobsAccruePA\fR
.PD 0
.IP \fBMaxJobsAccruePerAccount\fR
.PD
Maximum number of pending jobs an account (or subacct) can have accruing age
priority at any given time.
This limit only applies to the job's QOS and not the partition's QOS.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBMaxJobsAccruePU\fR
.PD 0
.IP \fBMaxJobsAccruePerUser\fR
.PD
Maximum number of pending jobs a user can have accruing age priority at any
given time.
This limit only applies to the job's QOS and not the partition's QOS.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBMaxJobsPA\fR
.PD 0
.IP \fBMaxJobsPerAccount\fR
.PD
Maximum number of running jobs per account.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBMaxJobsPU\fR
.PD 0
.IP \fBMaxJobsPerUser\fR
.PD
Maximum number of running jobs per user.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBMaxSubmitJobsPA\fR
.PD 0
.IP \fBMaxSubmitJobsPerAccount\fR
.PD
Maximum number of jobs in a pending or running state per account.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBMaxSubmitJobsPU\fR
.PD 0
.IP \fBMaxSubmitJobsPerUser\fR
.PD
Maximum number of jobs in a pending or running state per user.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBMaxTRES\fR
.PD 0
.IP \fBMaxTRESPJ\fR
.PD 0
.IP \fBMaxTRESPerJob\fR
.PD
Maximum number of TRES each job can use.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
To clear an existing value, set a new value of \-1 for each TRES type/name.
.IP
.IP \fBMaxTRESMins\fR
.PD 0
.IP \fBMaxTRESMinsPJ\fR
.PD 0
.IP \fBMaxTRESMinsPerJob\fR
.PD
Maximum number of TRES minutes each job can use.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
To clear an existing value, set a new value of \-1 for each TRES type/name.
.IP
.IP \fBMaxTRESPA\fR
.PD 0
.IP \fBMaxTRESPerAccount\fR
.PD
Maximum number of TRES each account can use.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
To clear an existing value, set a new value of \-1 for each TRES type/name.
.IP
.IP \fBMaxTRESPN\fR
.PD 0
.IP \fBMaxTRESPerNode\fR
.PD
Maximum number of TRES each node in a job allocation can use.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
To clear an existing value, set a new value of \-1 for each TRES type/name.
.IP
.IP \fBMaxTRESPU\fR
.PD 0
.IP \fBMaxTRESPerUser\fR
.PD
Maximum number of TRES each user can use.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
To clear an existing value, set a new value of \-1 for each TRES type/name.
.IP
.IP \fBMaxTRESRunMinsPA\fR
.PD 0
.IP \fBMaxTRESRunMinsPerAccount\fR
.PD
Maximum number of TRES minutes each account can use. This takes into
consideration the time limit of running jobs. If the limit is reached, no new
jobs are started until other jobs finish to allow time to free up.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
To clear an existing value, set a new value of \-1 for each TRES type/name.
.IP
.IP \fBMaxTRESRunMinsPU\fR
.PD 0
.IP \fBMaxTRESRunMinsPerUser\fR
.PD
Maximum number of TRES minutes each user can use. This takes into
consideration the time limit of running jobs. If the limit is reached, no new
jobs are started until other jobs finish to allow time to free up.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
To clear an existing value, set a new value of \-1 for each TRES type/name.
.IP
.IP \fBMaxWall\fR
.PD 0
.IP \fBMaxWallDurationPerJob\fR
.PD
Maximum wall clock time each job can use. MaxWall format is <min> or
<min>:<sec> or <hr>:<min>:<sec> or <days>\-<hr>:<min>:<sec> or <days>\-<hr>.
The value is recorded in minutes with rounding as needed.
To clear an existing value, set a new value of \-1.
.IP
.TP
\fBMinPrioThreshold\fR
Minimum priority required to reserve resources when scheduling.
To clear an existing value, set a new value of \-1.
.IP
.IP \fBMinTRES\fR
.PD 0
.IP \fBMinTRESPerJob\fR
.PD
Minimum number of TRES each job running under this QOS must request.
Otherwise the job will pend until modified.
\fBTRES\fR can be one of the Slurm defaults (i.e. \fIcpu\fR, \fImem\fR,
\fInode\fR, etc...), or any defined generic resource. You can see the list
of available resources by running \fBsacctmgr show tres\fR.
To clear an existing value, set a new value of \-1 for each TRES type/name.
.IP
.TP
\fBName\fR
Name of the QOS. Needed for creation.
.IP
.TP
\fBPreempt\fR
Other QOSs this QOS can preempt.
To clear an existing value, set a new value of '' (two single quotes with
nothing between them).
\fBNOTE\fP: The \fIPriority\fP of a QOS is NOT related to QOS preemption, only
\fIPreempt\fP is used to define which QOS can preempt others.
.IP
.TP
\fBPreemptExemptTime\fR
Specifies a minimum run time for jobs in this QOS before they are considered for
preemption. This QOS option takes precedence over the global
\fIPreemptExemptTime\fP. This is only honored for \fBPreemptMode=REQUEUE\fR
and \fBPreemptMode=CANCEL\fR.
.br
Setting to \-1 disables the option, allowing another
QOS or the global option to take effect. Setting to 0 indicates no minimum run
time and supersedes the lower priority QOS (see \fIOverPartQOS\fP) and/or the
global option in slurm.conf.
.IP
.TP
\fBPreemptMode\fR
Mechanism used to preempt jobs or enable gang scheduling for this QOS
when the cluster's \fIPreemptType\fR is set to \fIpreempt/qos\fR.
This QOS\-specific \fBPreemptMode\fR will override the cluster\-wide
\fBPreemptMode\fR for this QOS. Unsetting the QOS specific \fBPreemptMode\fR,
by specifying "OFF", "" or "Cluster", makes it use the default cluster\-wide
\fBPreemptMode\fR.
.br
The \fBGANG\fR option is used to enable gang scheduling independent of
whether preemption is enabled (i.e. independent of the \fBPreemptType\fR
setting). It can be specified in addition to a \fBPreemptMode\fR setting with
the two options comma\-separated (e.g. \fBPreemptMode=SUSPEND,GANG\fR).
.br
See <https://slurm.schedmd.com/preempt.html> and
<https://slurm.schedmd.com/gang_scheduling.html> for more details.
\fBNOTE\fR:
For performance reasons, the backfill scheduler reserves whole nodes for jobs,
not partial nodes. If during backfill scheduling a job preempts one or more
other jobs, the whole nodes for those preempted jobs are reserved for the
preemptor job, even if the preemptor job requested fewer resources than that.
These reserved nodes aren't available to other jobs during that backfill
cycle, even if the other jobs could fit on the nodes. Therefore, jobs may
preempt more resources during a single backfill iteration than they requested.
.br
\fBNOTE\fR:
For a heterogeneous job to be considered for preemption, all components
must be eligible for preemption. When a heterogeneous job is to be preempted,
the first identified component of the job with the highest order PreemptMode
(\fBSUSPEND\fR (highest), \fBREQUEUE\fR, \fBCANCEL\fR (lowest)) will be
used to set the PreemptMode for all components. The \fBGraceTime\fR and user
warning signal for each component of the heterogeneous job remain unique.
Heterogeneous jobs are excluded from GANG scheduling operations.
.IP
.RS
.TP 12
\fBOFF\fR
Is the default value and disables job preemption and gang scheduling.
It is only compatible with \fBPreemptType=preempt/none\fR at a global level.
.IP
.TP
\fBCANCEL\fR
The preempted job will be cancelled.
.IP
.TP
\fBGANG\fR
Enables gang scheduling (time slicing) of jobs in the same partition, and
allows the resuming of suspended jobs.
Configure the \fIOverSubscribe\fR setting to \fIFORCE\fR for all partitions
in which time\-slicing is to take place.
Gang scheduling is performed independently for each partition, so
if you only want time\-slicing by \fBOverSubscribe\fR, without any preemption,
then configuring partitions with overlapping nodes is not recommended.
Time\-slicing won't happen between jobs on different partitions.
\fBNOTE\fR:
Heterogeneous jobs are excluded from GANG scheduling operations.
.IP
.TP
\fBREQUEUE\fR
Preempts jobs by requeuing them (if possible) or canceling them.
For jobs to be requeued they must have the \-\-requeue sbatch option set
or the cluster wide JobRequeue parameter in slurm.conf must be set to \fB1\fR.
.IP
.TP
\fBSUSPEND\fR
The preempted jobs will be suspended, and later the Gang scheduler will resume
them. Therefore the \fBSUSPEND\fR preemption mode always needs the \fBGANG\fR
option to be specified at the cluster level. Also, because the suspended jobs
will still use memory on the allocated nodes, Slurm needs to be able to track
memory resources to be able to suspend jobs.
.br
If \fBPreemptType=preempt/qos\fR is configured and if the preempted job(s) and
the preemptor job are on the same partition, then they will share resources with
the Gang scheduler (time\-slicing). If not (i.e. if the preemptees and preemptor
are on different partitions) then the preempted jobs will remain suspended until
the preemptor ends.
\fBNOTE\fR: Suspended jobs will not release GRES. Higher priority jobs will not
be able to preempt to gain access to GRES.
.IP
.TP
\fBWITHIN\fR
Allows for preemption between jobs sharing the same QOS. By default,
\fBPreemptType=preempt/qos\fR will only consider jobs to be eligible for
preemption if they do not share the same QOS value.
.RE
.IP
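As a sketch, the following allows a hypothetical QOS \fIhigh\fR to preempt a
QOS \fIlow\fR, with preempted jobs in \fIlow\fR intended to be requeued. It
assumes that \fIPreemptType=preempt/qos\fR is set in slurm.conf and that both
QOS already exist.
.nf

# "high" and "low" are placeholder QOS names
$ sacctmgr modify qos high set preempt=low
$ sacctmgr modify qos low set preemptmode=requeue
.fi
.IP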
.TP
\fBPriority\fR
QOS priority factor to be used by the priority/multifactor plugin.
Unset by default, indicating that no extra priority is granted.
\fBNOTE\fP: The \fIPriority\fP of a QOS is NOT related to QOS preemption, see
\fIPreempt\fP instead.
.IP
.TP
\fBRawUsage\fR=<\fIvalue\fR>
This allows an administrator to set the raw usage accrued to a QOS. Specifying
a value of 0 (zero) will reset the raw usage. This is a settable specification
only \- it cannot be used as a filter to list QOS records.
.IP
.TP
\fBUsageFactor\fR
A float that is factored into a job's TRES usage (e.g. RawUsage, TRESMins,
TRESRunMins). For example, if the usagefactor was 2, for every TRESBillingUnit
second a job ran it would count for 2. If the usagefactor was .5, every second
would only count for half of the time. A setting of 0 would add no timed usage
from the job.
The usage factor only applies to the job's QOS and not the partition QOS.
If the \fIUsageFactorSafe\fR flag \fIis\fR set and
\fIAccountingStorageEnforce\fR includes \fISafe\fR, jobs will only be started if
they can run to completion with the \fIUsageFactor\fR applied, and won't be
killed due to limits.
If the \fIUsageFactorSafe\fR flag is \fInot\fR set and
\fIAccountingStorageEnforce\fR includes \fISafe\fR, jobs will be started if
they can run to completion without the \fIUsageFactor\fR applied,
and won't be killed due to limits.
If the \fIUsageFactorSafe\fR flag is \fInot\fR set and
\fIAccountingStorageEnforce\fR does not include \fISafe\fR, jobs will be
scheduled as long as the limits are not reached, but could be killed due to
limits.
See \fIAccountingStorageEnforce\fR in slurm.conf man page.
Default is 1. To clear an existing value, set a new value of \-1.
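.IP
The scaling described above can be sketched with a little arithmetic. This
only illustrates the multiplication and is not how Slurm computes usage
internally; the helper name \fIscaled_usage\fR and the figure of 3600 seconds
are assumptions made for the example.
.nf
```shell
# Hypothetical helper: scale billed seconds by a QOS UsageFactor.
# Arguments: billed TRESBillingUnit seconds, UsageFactor.
scaled_usage() {
    awk -v secs="$1" -v factor="$2" 'BEGIN { print secs * factor }'
}

scaled_usage 3600 2      # factor 2: usage counts double
scaled_usage 3600 0.5    # factor .5: usage counts half
scaled_usage 3600 0      # factor 0: no timed usage is added
```
.fi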
.IP
.TP
\fBUsageThreshold\fR
A float representing the lowest fairshare of an association allowed
to run a job. If an association falls below this threshold and has
pending jobs or submits new jobs those jobs will be held until the
usage goes back above the threshold. Use \fIsshare\fP to see current
shares on the system.
To clear an existing value, set a new value of \-1.
.IP
.SH "LIST/SHOW QOS FORMAT OPTIONS"
.LP
Fields you can display when viewing QOS records by using the \fIformat=\fR
option.
.TP
\fBDescription\fR
An arbitrary string describing a QOS.
.IP
.TP
\fBFlags\fR
Used by the slurmctld to override or enforce certain characteristics.
.IP
.TP
\fBGraceTime\fR
Preemption grace time to be extended to a job which has been
selected for preemption in the format of hh:mm:ss.
.IP
.TP
\fBGrpJobs\fR
Maximum number of running jobs in aggregate for this QOS.
.IP
.TP
\fBGrpJobsAccrue\fR
Maximum number of pending jobs in aggregate able to accrue age priority for this
QOS.
This limit only applies to the job's QOS and not the partition's QOS.
.IP
.IP \fBGrpSubmit\fR
.PD 0
.IP \fBGrpSubmitJobs\fR
.PD
Maximum number of jobs in a pending or running state at any time in aggregate
for this QOS.
.IP
.TP
\fBGrpTRES\fR
Maximum number of TRES able to be allocated by running jobs in aggregate for
this QOS.
.IP
.TP
\fBGrpTRESMins\fR
Maximum number of TRES minutes that can possibly be used by past, present, and
future jobs with this QOS.
.IP
.TP
\fBGrpTRESRunMins\fR
Maximum number of TRES minutes able to be allocated by running jobs with this
QOS.
.IP
.TP
\fBGrpWall\fR
Maximum wall clock time able to be allocated by running jobs in aggregate for
this QOS.
.IP
.TP
\fBID\fR
The id of the QOS.
.IP
.TP
\fBLimitFactor\fR
A float that is factored into an association's [Grp|Max]TRES limits.
.IP
.IP \fBMaxJobsAccruePA\fR
.PD 0
.IP \fBMaxJobsAccruePerAccount\fR
.PD
Maximum number of pending jobs an account (or subaccount) can have accruing age
priority at any given time. This limit only applies to the job's QOS and not the
partition's QOS.
.IP
.IP \fBMaxJobsAccruePU\fR
.PD 0
.IP \fBMaxJobsAccruePerUser\fR
.PD
Maximum number of pending jobs a user can have accruing age priority at any
given time. This limit only applies to the job's QOS and not the partition's
QOS.
.IP
.IP \fBMaxJobsPA\fR
.PD 0
.IP \fBMaxJobsPerAccount\fR
.PD
Maximum number of running jobs per account.
.IP
.IP \fBMaxJobsPU\fR
.PD 0
.IP \fBMaxJobsPerUser\fR
.PD
Maximum number of running jobs per user.
.IP
.IP \fBMaxSubmitJobsPA\fR
.PD 0
.IP \fBMaxSubmitJobsPerAccount\fR
.PD
Maximum number of jobs in a pending or running state per account.
.IP
.IP \fBMaxSubmitJobsPU\fR
.PD 0
.IP \fBMaxSubmitJobsPerUser\fR
.PD
Maximum number of jobs in a pending or running state per user.
.IP
.IP \fBMaxTRES\fR
.PD 0
.IP \fBMaxTRESPJ\fR
.PD 0
.IP \fBMaxTRESPerJob\fR
.PD
Maximum number of TRES each job can use.
.IP
.IP \fBMaxTRESMins\fR
.PD 0
.IP \fBMaxTRESMinsPJ\fR
.PD 0
.IP \fBMaxTRESMinsPerJob\fR
.PD
Maximum number of TRES minutes each job can use.
.IP
.IP \fBMaxTRESPA\fR
.PD 0
.IP \fBMaxTRESPerAccount\fR
.PD
Maximum number of TRES each account can use.
.IP
.IP \fBMaxTRESPN\fR
.PD 0
.IP \fBMaxTRESPerNode\fR
.PD
Maximum number of TRES each node in a job allocation can use.
.IP
.IP \fBMaxTRESPU\fR
.PD 0
.IP \fBMaxTRESPerUser\fR
.PD
Maximum number of TRES each user can use.
.IP
.IP \fBMaxTRESRunMinsPA\fR
.PD 0
.IP \fBMaxTRESRunMinsPerAccount\fR
.PD
Maximum number of TRES minutes each account can use.
.IP
.IP \fBMaxTRESRunMinsPU\fR
.PD 0
.IP \fBMaxTRESRunMinsPerUser\fR
.PD
Maximum number of TRES minutes each user can use.
.IP
.IP \fBMaxWall\fR
.PD 0
.IP \fBMaxWallDurationPerJob\fR
.PD
Maximum wall clock time each job can use.
.IP
.TP
\fBMinPrioThreshold\fR
Minimum priority required to reserve resources when scheduling.
.IP
.TP
\fBMinTRES\fR
Minimum number of TRES each job running under this QOS must request.
Otherwise the job will pend until modified.
.IP
.TP
\fBName\fR
Name of the QOS.
.IP
.TP
\fBPreempt\fR
Other QOSs this QOS can preempt.
.IP
.TP
\fBPreemptExemptTime\fR
Specifies a minimum run time for jobs in this QOS before they are considered for
preemption.
.IP
.TP
\fBPreemptMode\fR
Mechanism used to preempt jobs or enable gang scheduling for this QOS
when the cluster's \fIPreemptType\fR is set to \fIpreempt/qos\fR.
The default preemption mechanism is specified by the cluster\-wide
\fIPreemptMode\fP configuration parameter.
.IP
.TP
\fBPriority\fR
QOS priority factor to be used by the priority/multifactor plugin.
.IP
.TP
\fBUsageFactor\fR
A float that is factored into a job's TRES usage (e.g. RawUsage, TRESMins,
TRESRunMins).
.IP
.TP
\fBUsageThreshold\fR
A float representing the lowest fairshare of an association allowed
to run a job.
.IP
.TP
\fBWithDeleted\fR
Display information with previously deleted data.
A QOS that is deleted within 24 hours of being created and did not have
a job run in the QOS during that time will be removed from the database.
Otherwise, the QOS will be marked as deleted and will be viewable with
the \fBWithDeleted\fR flag.
.IP
.SH "SPECIFICATIONS FOR RESERVATIONS"
.LP
Reservations are created with the scontrol command and information about the
reservations is sent to slurmdbd to be stored.
These are options you can specify to filter for specific reservations.
.TP
\fBClusters\fR=<\fIcluster_name\fR>[,<\fIcluster_name\fR>,...]
List the reservations of the cluster(s). Default is the cluster where the
command was run.
.IP
.TP
\fBEnd\fR=<\fIOPT\fR>
Period ending of reservations. Default is now.
Valid time formats are:
.br
HH:MM[:SS] [AM|PM]
.br
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
.br
MM/DD[/YY]\-HH:MM[:SS]
.br
YYYY\-MM\-DD[THH:MM[:SS]]
.br
now[{+|\-}\fIcount\fR[seconds(default)|minutes|hours|days|weeks]]
.IP
.TP
\fBID\fR=<\fIOPT\fR>
Comma\-separated list of reservation ids.
.IP
.TP
\fBNames\fR=<\fIOPT\fR>
Comma\-separated list of reservation names.
.IP
.TP
\fBNodes\fR=<\fInode_name\fR>[,<\fInode_name\fR>,...]
Node names where reservation ran.
.IP
.TP
\fBStart\fR=<\fIOPT\fR>
Period start of reservations. Default is 00:00:00 of previous day.
Valid time formats are:
.br
HH:MM[:SS] [AM|PM]
.br
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
.br
MM/DD[/YY]\-HH:MM[:SS]
.br
YYYY\-MM\-DD[THH:MM[:SS]]
.br
now[{+|\-}\fIcount\fR[seconds(default)|minutes|hours|days|weeks]]
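.IP
As a sketch of the relative time format, the helper below lists reservations
from the last N days. The function name \fIrecent_reservations\fR is
hypothetical, and the field list is just one possible choice from the
reservation format options.
.nf
```shell
# Hypothetical helper: show reservations from the last N days.
# Argument: number of days to look back.
recent_reservations() {
    days="$1"
    sacctmgr show reservation start="now-${days}days" end=now format=Cluster,Name,Start,End,UnusedWall
}

# Example call (requires a reachable slurmdbd):
#   recent_reservations 7
```
.fi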
.IP
.SH "LIST/SHOW RESERVATION FORMAT OPTIONS"
.LP
Fields you can display when viewing Reservation records by using the
\fIformat=\fR option. The default format is:
.br
Cluster,Name,TRES,Start,End,UnusedWall
.TP
\fBAssociations\fR
The IDs of the associations able to run in the reservation.
.IP
.TP
\fBCluster\fR
Name of cluster reservation was on.
.IP
.TP
\fBEnd\fR
End time of reservation.
.IP
.TP
\fBFlags\fR
Flags set on the reservation.
.IP
.TP
\fBID\fR
Reservation ID.
.IP
.TP
\fBName\fR
Name of this reservation.
.IP
.TP
\fBNodeNames\fR
List of nodes in the reservation.
.IP
.TP
\fBStart\fR
Start time of reservation.
.IP
.TP
\fBTRES\fR
List of TRES in the reservation.
.IP
.TP
\fBUnusedWall\fR
Wall clock time in seconds unused by any job. A job's allocated usage is its
run time multiplied by the ratio of its CPUs to the total number of CPUs in the
reservation. For example, a job using all the CPUs in the reservation running
for 1 minute would reduce \fBUnusedWall\fR by 1 minute.
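.IP
The allocated usage formula above can be worked through as follows. This is
an illustrative calculation only; the helper name \fIallocated_usage\fR and
the CPU counts are assumptions.
.nf
```shell
# Hypothetical helper: seconds of reservation wall time a job accounts for.
# Arguments: job run time in seconds, job CPUs, total CPUs in the reservation.
allocated_usage() {
    awk -v run="$1" -v cpus="$2" -v total="$3" 'BEGIN { print run * cpus / total }'
}

allocated_usage 60 32 32     # all CPUs for 1 minute: 60 seconds
allocated_usage 3600 8 32    # a quarter of the CPUs for 1 hour: 900 seconds
```
.fi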
.IP
.SH "SPECIFICATIONS FOR RESOURCE"
.LP
Resources can be created, modified, and deleted with sacctmgr. These
options allow you to set the corresponding attributes or filter on them
when querying for Resources.
.TP
\fBLastConsumed\fR=<\fIOPT\fR>
Number of software resources of a specific name consumed out of \fBCount\fR on
the system being controlled by a resource manager.
.IP
.TP
\fBClusters\fR=<\fIname_list\fR>
Comma\-separated list of cluster names on which specified resources are to be
available. If no names are designated then the clusters already
allowed to use this resource will be altered.
.IP
.TP
\fBCount\fR=<\fIOPT\fR>
Number of software resources of a specific name configured on the system being
controlled by a resource manager.
.IP
.TP
\fBDescriptions\fR=
A brief description of the resource.
.IP
.TP
\fBFlags\fR[-|+]=<\fIOPT\fR>
Flags that identify specific attributes of the system resource.
.br
Valid options are
.RS
.TP
\fBAbsolute\fR
If set the resource will treat the counts for \fBAllowed\fR and \fBAllocated\fR
as absolute counts instead of percentages.
\fBNOTE\fR: If removing this with flags\-=absolute there is no effort to convert
the numbers in the database back to percentages. This must be done by the user.
.IP
.RE
.IP
.TP
\fBNames\fR=<\fIOPT\fR>
Comma\-separated list of the name of a resource configured on the
system being controlled by a resource manager. If this resource is
seen on the slurmctld its name will be name@server to distinguish it
from local resources defined in a slurm.conf.
.IP
.TP
\fBAllowed\fR=<\fIallowed\fR>
Percentage/Count of a specific resource that can be used on specified cluster.
.IP
.TP
\fBServer\fR=<\fIOPT\fR>
Arbitrary string indicating the name of the server serving up the resource.
Default is 'slurmdb' indicating the licenses are being served by the database.
This parameter is only for tagging purposes.
.IP
.TP
\fBServerType\fR=<\fIOPT\fR>
Arbitrary string used to tag the type of the software resource manager
providing the licenses. For example, FlexNet Publisher (FlexLM) license server
or Reprise License Manager (RLM). This does not imply any kind of integration
with license managers.
.IP
.TP
\fBType\fR=<\fIOPT\fR>
The type of the resource represented by this record. Currently the only valid
type is License.
.IP
.TP
\fBWithClusters\fR
Display each cluster's percentage/count of resources. If a resource hasn't
been given to a cluster, the resource will not be displayed with this flag.
.IP
.TP
\fBWithDeleted\fR
Display information with previously deleted data.
Resources that are deleted within 24 hours of being created will be removed
from the database. Resources that were created more than 24 hours prior to
the deletion request are just marked as deleted and will be viewable with
the \fBWithDeleted\fR flag.
.IP
.P
\fBNOTE\fR: Resource is used to define each resource configured on a system
available for usage by Slurm clusters.
.IP
.SH "LIST/SHOW RESOURCE FORMAT OPTIONS"
.LP
Fields you can display when viewing Resource records by using the
\fIformat=\fR option. The default format is:
.br
Name,Server,Type,Count,LastConsumed,Allocated,ServerType,Flags
.TP
\fBAllocated\fR
The percent/count of licenses allocated to a cluster.
.IP
.TP
\fBLastConsumed\fR
The count of a specific resource consumed out of \fBCount\fR on the system
globally.
.IP
.TP
\fBCluster\fR
Name of cluster resource is given to.
.IP
.TP
\fBCount\fR
The count of a specific resource configured on the system globally.
.IP
.TP
\fBDescription\fR
Description of the resource.
.IP
.TP
\fBName\fR
Name of this resource.
.IP
.TP
\fBServer\fR
Server serving up the resource.
.IP
.TP
\fBServerType\fR
The type of the server controlling the licenses.
.IP
.TP
\fBType\fR
Type of resource this record represents.
.IP
.SH "SPECIFICATIONS FOR RUNAWAYJOB"
.LP
Under certain circumstances, jobs can complete without having that completion
recorded by slurmdbd. This results in a "runaway job", where slurmdbd is not
going to record a completion time for that job without intervention.
This command will identify jobs that are in this state and offer to have
slurmdbd clean up the job record(s).
This particular variant of the "show" command also permits the use of the
"set" keyword to define the following specifications:
.TP
\fBEndState\fR=<\fIstate\fR>
Desired state to use as the end state for fixed jobs. Supported states are:
Completed, Failed.
.IP
.SH "LIST/SHOW RUNAWAYJOB FORMAT OPTIONS"
.LP
Fields you can display when viewing runaway job records by using the
\fIformat=\fR option. The default format is:
.br
ID,Name,Partition,Cluster,State,TimeSubmit,TimeStart,TimeEnd
.TP
\fBCluster\fR
Name of cluster job ran on.
.IP
.TP
\fBID\fR
ID of the job.
.IP
.TP
\fBName\fR
Name of the job.
.IP
.TP
\fBPartition\fR
Partition job ran on.
.IP
.TP
\fBState\fR
Current State of the job in the database.
.IP
.TP
\fBTimeEnd\fR
Current recorded time of the end of the job.
.IP
.TP
\fBTimeStart\fR
Time job started running.
.IP
.TP
\fBTimeSubmit\fR
Time job was submitted.
.IP
.SH "SPECIFICATIONS FOR TRANSACTIONS"
.LP
Information about changes to clusters, resources, accounts, associations,
etc., are recorded as transactions by slurmdbd.
These are options you can specify to filter for specific transactions.
.TP
\fBAccounts\fR=<\fIaccount_name\fR>[,<\fIaccount_name\fR>,...]
Only print out the transactions affecting specified accounts.
.IP
.TP
\fBAction\fR=<\fISpecific_action_the_list_will_display\fR>
Only display transactions of the specified action type.
.IP
.TP
\fBActor\fR=<\fISpecific_name_the_list_will_display\fR>
Only display transactions done by a certain person.
.IP
.TP
\fBClusters\fR=<\fIcluster_name\fR>[,<\fIcluster_name\fR>,...]
Only print out the transactions affecting specified clusters.
.IP
.TP
\fBEnd\fR=<\fIDate_and_time_of_last_transaction_to_return\fR>
Return all transactions before this date and time. Default is now.
.IP
.TP
\fBStart\fR=<\fIDate_and_time_of_first_transaction_to_return\fR>
Return all transactions after this date and time. Default is epoch.
Valid time formats for End and Start are:
.br
HH:MM[:SS] [AM|PM]
.br
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
.br
MM/DD[/YY]\-HH:MM[:SS]
.br
YYYY\-MM\-DD[THH:MM[:SS]]
.br
now[{+|\-}\fIcount\fR[seconds(default)|minutes|hours|days|weeks]]
.IP
.TP
\fBUsers\fR=<\fIuser_name\fR>[,<\fIuser_name\fR>,...]
Only print out the transactions affecting specified users.
.IP
.TP
\fBWithAssoc\fR
Get information about which associations were affected by the transactions.
.IP
.SH "LIST/SHOW TRANSACTIONS FORMAT OPTIONS"
.LP
Fields you can display when viewing Transaction records by using the
\fIformat=\fR option. The default format is:
.br
Time,Action,Actor,Where,Info
.TP
\fBAction\fR
Displays the type of Action that took place.
.IP
.TP
\fBActor\fR
Displays the actor who generated the transaction.
.IP
.TP
\fBInfo\fR
Displays details of the transaction.
.IP
.TP
\fBTimeStamp\fR
Displays when the transaction occurred.
.IP
.TP
\fBWhere\fR
Displays details of the constraints for the transaction.
.P
\fBNOTE\fR: If using the WithAssoc option you can also view the information
about the various associations the transaction affected. The
Association format fields are described
in the \fILIST/SHOW ASSOCIATION FORMAT OPTIONS\fP section.
.IP
.SH "SPECIFICATIONS FOR USERS"
.LP
Users can be created, modified, and deleted with sacctmgr. These
options allow you to set the corresponding attributes or filter on them
when querying for Users.
.LP
It is important to recognize the difference between a User and an Association.
There is a User entity that exists for each unique username. However, there
can be multiple User Associations for the same User. The combination of a
Cluster, Account, User, and optionally a Partition constitute a User
Association. When adding an existing User to another Account, you are creating
an additional User Association rather than modifying an existing User.
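.LP
The distinction can be sketched with two calls. The user, cluster, and account
names below are hypothetical, the accounts are assumed to exist already, and
the user is assumed not to exist before the first call.
.nf
```shell
# Hypothetical names: user "adam", cluster "tux", accounts "physics"
# and "chemistry" (both accounts are assumed to exist already).
add_user_to_accounts() {
    # Creates the User entity and its first User Association.
    sacctmgr -i add user name=adam cluster=tux account=physics
    # The User already exists, so this adds only a second User Association.
    sacctmgr -i add user name=adam cluster=tux account=chemistry
}

# List the single User entity together with all of its associations.
show_user_associations() {
    sacctmgr show user name=adam withassoc
}
```
.fi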
.TP
\fBAccount\fR=<\fIaccount\fR>
Account name to add this user to.
.IP
.TP
\fBAdminLevel\fR=<\fIlevel\fR>
Admin level of user. Valid levels are None, Operator, and Admin.
.IP
.TP
\fBCluster\fR=<\fIcluster\fR>
Specific cluster to add user to the account on. Default is all in system.
.IP
.TP
\fBDefaultAccount\fR=<\fIaccount\fR>
Identify the default bank account name to be used for a job if none is
specified at submission time.
.IP
.TP
\fBDefaultWCKey\fR=<\fIdefaultwckey\fR>
Identify the default Workload Characterization Key.
.IP
.TP
\fBName\fR=<\fIname\fR>
Name of user.
.IP
.TP
\fBNewName\fR=<\fInewname\fR>
Use to rename a user in the accounting database.
.IP
.TP
\fBPartition\fR=<\fIname\fR>
Partition name.
.P
\fBNOTE\fR: See also \fBPartitions\fR listed in the \fISPECIFICATIONS FOR
ASSOCIATIONS\fP section.
.P
.IP
.TP
\fBRawUsage\fR=<\fIvalue\fR>
This allows an administrator to reset the raw usage accrued to a user.
The only value currently supported is 0 (zero). This is a settable
specification only \- it cannot be used as a filter to list users.
.IP
.TP
\fBWCKeys\fR=<\fIwckeys\fR>
Workload Characterization Key values.
.IP
.TP
\fBWithAssoc\fR
Display all associations for this user.
.IP
.TP
\fBWithCoord\fR
Display all accounts a user is coordinator for.
.IP
.TP
\fBWithDeleted\fR
Display information with previously deleted data.
Users that are deleted within 24 hours of being created and did not have
a job run by the user during that time will be removed from the database.
Otherwise, the user will be marked as deleted and will be viewable with
the \fBWithDeleted\fR flag.
.P
\fBNOTE\fR: If using the WithAssoc option you can also query against
association specific information to view only certain associations
this user may have. These extra options can be found in the
\fISPECIFICATIONS FOR ASSOCIATIONS\fP section. You can also use the
general specifications list above in the \fIGENERAL SPECIFICATIONS FOR
ASSOCIATION BASED ENTITIES\fP section.
.IP
.SH "LIST/SHOW USER FORMAT OPTIONS"
.LP
Fields you can display when viewing User records by using the
\fIformat=\fR option. The default format is:
.br
User,DefaultAccount,DefaultWCKey,AdminLevel
.TP
\fBAdminLevel\fR
Admin level of user.
.IP
.TP
\fBCoordinators\fR
List of accounts the user is a coordinator of. (Only filled in
when using the WithCoord option.)
.IP
.TP
\fBDefaultAccount\fR
The user's default account.
.IP
.TP
\fBDefaultWCKey\fR
The user's default wckey.
.IP
.TP
\fBUser\fR
The name of a user.
.P
\fBNOTE\fR: If using the WithAssoc option you can also view the information
about the various associations the user may have on all the
clusters in the system. The association information can be filtered.
Note that all the users in the database will always be shown, because the filter
only applies to the association data. The Association format fields are
described in the \fILIST/SHOW ASSOCIATION FORMAT OPTIONS\fP section.
.IP
.SH "LIST/SHOW WCKey"
.LP
Fields you can display when viewing WCKey records by using the
\fIformat=\fR option. The default format is:
.br
WCKey,Cluster,User
.TP
\fBCluster\fR
Specific cluster for the WCKey.
.IP
.TP
\fBID\fR
The ID of the WCKey.
.IP
.TP
\fBUser\fR
The name of a user for the WCKey.
.IP
.TP
\fBWCKey\fR
Workload Characterization Key.
.IP
.TP
\fBWithDeleted\fR
Display information with previously deleted data.
WCKeys that are deleted within 24 hours of being created and did not have
a job run with the WCKey during that time will be removed from the database.
Otherwise, the WCKey will be marked as deleted and will be viewable with
the \fBWithDeleted\fR flag.
.IP
.SH "LIST/SHOW TRES"
.LP
Fields you can display when viewing TRES records by using the
\fIformat=\fR option. The default format is:
.br
Type,Name,ID
.TP
\fBID\fR
The identification number of the trackable resource as it appears
in the database.
.IP
.TP
\fBName\fR
The name of the trackable resource. This option is required for
TRES types BB (Burst buffer), GRES, and License. Types CPU, Energy,
Memory, and Node do not have Names. For example if GRES is the
type then name is the denomination of the GRES itself e.g. GPU.
.IP
.TP
\fBType\fR
The type of the trackable resource. Current types are BB (Burst
buffer), CPU, Energy, GRES, License, Memory, and Node.
.IP
.SH "TRES information"
Trackable RESources (TRES) are used in many QOS or Association limits.
Limits are set as a comma\-separated list in which each TRES has
its own limit. For example, GrpTRESMins=cpu=10,mem=20 creates two
separate limits: one of 10 CPU minutes and one of 20 MB memory minutes.
This is the case for each limit that deals with TRES. To remove a
limit, set it to \-1. For example, GrpTRESMins=cpu=\-1 removes only the cpu
TRES limit.
\fBNOTE\fR: When dealing with Memory as a TRES all limits are in MB.
\fBNOTE\fR: The Billing TRES is calculated from a partition's
TRESBillingWeights. It is temporarily calculated during scheduling for each
partition to enforce billing TRES limits. The final Billing TRES is calculated
after the job has been allocated resources. The final number can be seen in
\fBscontrol show jobs\fP and \fBsacct\fP output.
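.P
A minimal sketch of setting and then partially clearing a TRES limit follows.
The account name \fIphysics\fR and the two function names are hypothetical;
the limit values are the ones used in the text above.
.nf
```shell
# Hypothetical account name; substitute one that exists at your site.
account=physics

# Set two independent limits: 10 CPU minutes and 20 MB memory minutes.
set_tres_limits() {
    sacctmgr -i modify account where name="$account" set GrpTRESMins=cpu=10,mem=20
}

# Remove only the cpu limit; the mem limit is left in place.
clear_cpu_limit() {
    sacctmgr -i modify account where name="$account" set GrpTRESMins=cpu=-1
}
```
.fi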
.SH "GLOBAL FORMAT OPTION"
When using the format option for listing various fields you can append
%NUMBER to a field name to specify how many characters should be printed.
For example, format=name%30 will print 30 characters of field name right
justified, and format=name%\-30 will print 30 characters left justified.
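.P
For instance, the width suffix can be combined across several fields in one
listing. The wrapper name \fIlist_qos_table\fR is hypothetical and the widths
are arbitrary choices for the example.
.nf
```shell
# Hypothetical wrapper: Name right justified in 30 columns,
# Description left justified in 40 columns.
list_qos_table() {
    sacctmgr list qos format=Name%30,Description%-40
}
```
.fi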
.SH "FLAT FILE DUMP AND LOAD"
sacctmgr has the capability to load and dump Slurm association data to and
from a file. This method can easily add a new cluster or copy an
existing cluster's associations into a new cluster with similar
accounts. Each file contains Slurm association data for a single
cluster. Be aware that QOS information is not currently included in the
information that can be dumped to a file. QOS information can be retrieved
and loaded using the REST API or it must be transferred to a new cluster
manually. Comments can be put into the file with the # character.
Each line of information must begin with one of the four titles: \fBCluster\fP,
\fBParent\fP, \fBAccount\fP or \fBUser\fP. Following the title is a space,
dash, space, entity value, then specifications. Specifications are
colon\-separated. If any variable, such as an Organization name, has a space in
it, surround the name with single or double quotes.
.P
sacctmgr dump/load must be run as a Slurm administrator or root. If using
sacctmgr load on a database without any associations, it must be run as root
(because there aren't any users in the database yet).
.SS dump
Dump cluster associations from the database into a file. If no file is given
then one will be generated, using the cluster name for the file name. That
file will be created in the current working directory.
.P
To create a file with the association information you can run:
.nf
sacctmgr dump tux file=tux.cfg
.fi
.RS
.TP
\fBCluster\fR=
Specify the cluster to dump the information for.
.IP
.TP
\fBFile\fR=
Specify a file to save flat file data to.
If the filename is not specified it uses clustername.cfg filename by default.
.IP
.RE
.SS load
Load cluster associations into the database. The imported associations will be
reconciled with existing ones.
.P
To load a previously created file you can run:
.nf
sacctmgr load file=tux.cfg
.fi
.RS
.TP
\fBclean\fR
Delete what was already there and start from scratch with this information.
With no options this will only remove the cluster along with its associations.
No accounts, users, or QOS will be removed.
This also accepts a comma\-separated list of other options to remove. Those
include 'account', 'qos' and 'user'. If you would like to remove accounts, qos,
and users along with the cluster and associations give the input of
\fIclean=account,qos,user\fR.
.IP
.TP
\fBCluster\fR=
Specify a different name for the cluster than that which is in the file.
.IP
.TP
\fBFile\fR=
Specify a flat file to load from.
.IP
.RE
.SH "SPECIFICATIONS FOR FLAT FILE"
Since the associations in the system follow a hierarchy, so does the
file. Anything that is a parent needs to be defined before any
children. The only exception is the understood 'root' account. This
is always a default for any cluster and does not need to be defined.
To edit/create a file start with a cluster line for the new cluster:
.nf
\fBCluster\ \-\ cluster_name:MaxTRESPerJob=node=15\fP
.fi
Anything included on this line will be the default for all
associations on this cluster. The options for the cluster are:
.RS
.TP
\fBFairShare\fR=
Allocated shares used for fairshare calculation.
.IP
.TP
\fBGrpJobs\fR=
Maximum number of running jobs in aggregate for this association and its
children.
.IP
.TP
\fBGrpJobsAccrue\fR=
Maximum number of pending jobs in aggregate able to accrue age priority for this
association and its children.
.IP
.TP
\fBGrpNodes\fR=
This option has been deprecated in favor of the more versatile TRES.
Equivalent limit definition is now \fBGrpTRES=node=#\fR.
.IP
.TP
\fBGrpSubmitJobs\fR=
Maximum number of jobs in a pending or running state at any time in aggregate
for this association and its children.
.IP
.TP
\fBGrpTRES\fR=
Maximum number of TRES able to be allocated by running jobs in aggregate for
this association and its children.
.IP
.TP
\fBGrpTRESMins\fR=
Maximum number of TRES minutes that can possibly be used by past, present and
future jobs in this association and its children.
.IP
.TP
\fBGrpTRESRunMins\fR=
Maximum number of TRES minutes able to be allocated by running jobs in this
association and its children. This takes into consideration time limit of
running jobs and consumes it. If the limit is reached no new jobs are started
until other jobs finish to allow time to free up.
.IP
.TP
\fBGrpWall\fR=
Maximum wall clock time able to be allocated by running jobs in aggregate in
this association and its children.
.IP
.TP
\fBMaxJobs\fR=
Maximum number of running jobs per user in this association.
.IP
.TP
\fBMaxTRESPerJob\fR=
Maximum number of TRES each job can use in this association.
.IP
.TP
\fBMaxWallDurationPerJob\fR=
Maximum wall clock time each job can use in this association.
.IP
.TP
\fBQOS\fR=
Comma\-separated list of Quality of Service names (Defined in sacctmgr).
.IP
.RE
After the entry for the root account you will have entries for the other
accounts on the system. The entries will look similar to this example:
.nf
\fBParent\ \-\ root
Account\ \-\ cs:MaxTRESPerJob=node=5:MaxJobs=4:FairShare=399:MaxWallDurationPerJob=40:Description='Computer Science':Organization='LC'
Parent\ \-\ cs
Account\ \-\ test:MaxTRESPerJob=node=1:MaxJobs=1:FairShare=1:MaxWallDurationPerJob=1:Description='Test Account':Organization='Test'\fP
.fi
Any of the options after a ':' can be left out and they can be in any order.
To add a sub account, precede it with a Parent line naming a parent account
that has already been defined earlier in the file.
Account options are:
.RS
.TP
\fBDescription\fR=
A brief description of the account.
.IP
.TP
\fBFairShare\fR=
Number used in conjunction with other associations to determine job priority.
.IP
.TP
\fBGrpTRES\fR=
Maximum number of TRES able to be allocated by running jobs in aggregate for
this association and its children.
.IP
.TP
\fBGrpTRESMins\fR=
Maximum number of TRES minutes that can possibly be used by past, present, and
future jobs in this association and its children.
.IP
.TP
\fBGrpTRESRunMins\fR=
Maximum number of TRES minutes able to be allocated by running jobs in this
association and its children. This takes into consideration time limit of
running jobs and consumes it. If the limit is reached no new jobs are started
until other jobs finish to allow time to free up.
.IP
.TP
\fBGrpJobs\fR=
Maximum number of running jobs in aggregate for this association and its
children.
.IP
.TP
\fBGrpJobsAccrue\fR=
Maximum number of pending jobs in aggregate able to accrue age priority for this
association and its children.
.IP
.TP
\fBGrpNodes\fR=
This option has been deprecated in favor of the more versatile TRES.
Equivalent limit definition is now \fBGrpTRES=node=#\fR.
.IP
.TP
\fBGrpSubmitJobs\fR=
Maximum number of jobs in a pending or running state at any time in aggregate
for this association and its children.
.IP
.TP
\fBGrpWall\fR=
Maximum wall clock time able to be allocated by running jobs in aggregate in
this association and its children.
.IP
.TP
\fBMaxJobs\fR=
Maximum number of running jobs per user in this association.
.IP
.TP
\fBMaxNodesPerJob\fR=
Maximum number of nodes per job in this association.
.IP
.TP
\fBMaxWallDurationPerJob\fR=
Maximum wall clock time each job can use in this association.
.IP
.TP
\fBOrganization\fR=
Name of organization that owns this account.
.IP
.TP
\fBQOS\fR(=,+=,\-=)
Comma\-separated list of Quality of Service names (Defined in sacctmgr).
.RE
To add users to an account add a line after the Parent line, similar to this:
.nf
\fBParent\ \-\ test
User\ \-\ adam:MaxTRESPerJob=node=2:MaxJobs=3:FairShare=1:MaxWallDurationPerJob=1:AdminLevel=Operator:Coordinator='test'\fP
.fi
User options are:
.RS
.TP
\fBAdminLevel\fR=
Type of admin this user is (Administrator, Operator).
.br
\fBMust be defined on the first occurrence of the user.\fP
.IP
.TP
\fBCoordinator\fR=
Comma\-separated list of accounts this user is coordinator over.
.br
\fBMust be defined on the first occurrence of the user.\fP
.IP
.TP
\fBDefaultAccount\fR=
System wide default account name.
.br
\fBMust be defined on the first occurrence of the user.\fP
.IP
.TP
\fBFairShare\fR=
Number used in conjunction with other associations to determine job priority.
.IP
.TP
\fBMaxJobs\fR=
Maximum number of running jobs from this user.
.IP
.TP
\fBMaxTRESPerJob\fR=
Maximum number of TRES each job from this user can use.
.IP
.TP
\fBMaxWallDurationPerJob\fR=
Maximum wall clock time each job from this user can use.
.IP
.TP
\fBQOS\fR(=,+=,\-=)
Comma\-separated list of Quality of Service names (Defined in sacctmgr).
.IP
.RE
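.P
Putting the pieces above together, a small flat file might be written and
sanity checked as sketched below. The cluster name \fItux\fR, the file
contents, and both function names are assumptions for illustration; the title
check is a local convenience and not the validation that sacctmgr itself
performs.
.nf
```shell
# Hypothetical helper: write a small flat file to the path given as $1.
write_flat_file() {
    cat > "$1" <<'EOF'
# Parents must be defined before their children; root is implicit.
Cluster - tux:MaxTRESPerJob=node=15
Parent - root
Account - cs:MaxJobs=4:FairShare=399:Description='Computer Science':Organization='LC'
Parent - cs
Account - test:MaxJobs=1:FairShare=1:Description='Test Account':Organization='Test'
Parent - test
User - adam:MaxJobs=3:FairShare=1:AdminLevel=Operator:Coordinator='test'
EOF
}

# Succeed only if every non-comment line starts with one of the four titles.
check_titles() {
    ! grep -v '^#' "$1" | grep -Ev '^(Cluster|Parent|Account|User) - '
}

# Example use (the load step requires a reachable slurmdbd):
#   write_flat_file tux.cfg && check_titles tux.cfg && sacctmgr load file=tux.cfg
```
.fi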
.SH "ARCHIVE FUNCTIONALITY"
sacctmgr has the capability to archive data to a flat file and to load that
data later if needed. The archiving is usually done by the slurmdbd
and it is highly recommended you only do it through sacctmgr if you
completely understand what you are doing. For slurmdbd options see
"man slurmdbd".
Loading data into the database can be done from these files to either
view old data or regenerate rolled up data.
For information about configuring an archive server see
<https://slurm.schedmd.com/accounting.html#archive>.
.SS archive dump
Dump accounting data to file. Data will not be archived unless the
corresponding purge option is included in this command or in slurmdbd.conf.
This operation cannot be rolled back
once executed. If one of the following options is not specified when sacctmgr
is called, the value configured in slurmdbd.conf is used.
.RS
.TP
\fBDirectory\fR=
Directory to store the archive data.
.IP
.TP
\fBEvents\fR
Archive Events. If not specified and PurgeEventAfter is set
all event data removed will be lost permanently.
.IP
.TP
\fBJobs\fR
Archive Jobs. If not specified and PurgeJobAfter is set
all job data removed will be lost permanently.
.IP
.TP
\fBPurgeEventAfter\fR=
Purge cluster event records older than the stated time, in months. To
purge on a shorter time period, append hours or days
to the numeric value (e.g. a
value of '12hours' would purge everything older than 12 hours).
.IP
.TP
\fBPurgeJobAfter\fR=
Purge job records older than the stated time, in months. To
purge on a shorter time period, append hours or days
to the numeric value (e.g. a
value of '12hours' would purge everything older than 12 hours).
.IP
.TP
\fBPurgeStepAfter\fR=
Purge step records older than the stated time, in months. To
purge on a shorter time period, append hours or days
to the numeric value (e.g. a
value of '12hours' would purge everything older than 12 hours).
.IP
.TP
\fBPurgeSuspendAfter\fR=
Purge job suspend records older than the stated time, in months. To
purge on a shorter time period, append hours or days
to the numeric value (e.g. a
value of '12hours' would purge everything older than 12 hours).
.IP
.TP
\fBScript\fR=
Run this script instead of the generic form of archive to flat files.
.IP
.TP
\fBSteps\fR
Archive Steps. If not specified and PurgeStepAfter is set
all step data removed will be lost permanently.
.IP
.TP
\fBSuspend\fR
Archive Suspend Data. If not specified and PurgeSuspendAfter is set,
all removed suspend data will be lost permanently.
.IP
.RE
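.PP
The unit rule shared by the Purge*After options can be sketched as a small
shell helper. This only illustrates the rule as described above and is not
how \fBsacctmgr\fR or \fBslurmdbd\fR parse the value. The function name
purge_unit is hypothetical, and any suffix other than the two named above
is simply rejected by the sketch.
.nf
```shell
# purge_unit VALUE
# Print "<number> <unit>" for a Purge*After value, following the rule
# described above: a bare number means months; an "hours" or "days"
# suffix selects a shorter unit. Hypothetical helper, illustration only.
purge_unit() {
    num=${1%%[!0-9]*}      # leading digits of the value
    suffix=${1#"$num"}     # whatever follows the digits
    if [ -z "$num" ]; then
        echo invalid
        return 1
    fi
    case "$suffix" in
        "")    echo "$num months" ;;
        hours) echo "$num hours" ;;
        days)  echo "$num days" ;;
        *)     echo invalid; return 1 ;;   # not covered by this sketch
    esac
}
```
.fi
With this helper, a value of 12hours is reported as 12 hours and a bare 12
as 12 months.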
.SS archive load
Load previously archived data into the database. The archive file will not be
loaded if its records already exist in the database, so trying to load
an archive file more than once will result in an error. When this data is again
archived and purged from the database, a new archive file is created if the
old archive file is still in the directory ArchiveDir (see ArchiveDir in the
slurmdbd.conf man page). The old file is not overwritten, so the two files
will contain duplicate records.
.P
Archive files from the current or any prior Slurm release may be loaded
through \fBarchive load\fR.
.RS
.TP
\fBFile\fR=
File to load into database. The specified file must exist on the slurmdbd host,
which is not necessarily the machine running the command.
.IP
.TP
\fBInsert\fR=
SQL to insert directly into the database. Use this option very
cautiously, since it writes your SQL directly into the database.
.IP
.RE
.SH "PERFORMANCE"
.PP
Executing \fBsacctmgr\fR sends a remote procedure call to \fBslurmdbd\fR. If
enough calls from \fBsacctmgr\fR or other Slurm client commands that send remote
procedure calls to the \fBslurmdbd\fR daemon come in at once, it can result in a
degradation of performance of the \fBslurmdbd\fR daemon, possibly resulting in a
denial of service.
.PP
Do not run \fBsacctmgr\fR or other Slurm client commands that send remote
procedure calls to \fBslurmdbd\fR from loops in shell scripts or other programs.
Ensure that programs limit calls to \fBsacctmgr\fR to the minimum necessary for
the information you are trying to gather.
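.PP
One way to follow this advice is to fetch the data once and do the per\-item
work locally. The sketch below assumes that the output of a single call,
such as the one shown in its closing comment, is piped into a hypothetical
helper named count_users_per_account. It also assumes one user|account line
per association, with an empty user field on account\-level associations.
.nf
```shell
# count_users_per_account
# Read "user|account" lines on stdin and print "account count" pairs.
# Lines with an empty user field are skipped.
count_users_per_account() {
    awk -F'|' '$1 != "" { n[$2]++ } END { for (a in n) print a, n[a] }' | sort
}

# Intended use: one call to slurmdbd instead of one call per account.
#   sacctmgr -n -P list associations format=user,account | count_users_per_account
```
.fi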
.SH "ENVIRONMENT VARIABLES"
.PP
Some \fBsacctmgr\fR options may be set via environment variables. These
environment variables, along with their corresponding options, are listed below.
(Note: Command line options will always override these settings.)
.TP 20
\fBSLURM_CONF\fR
The location of the Slurm configuration file.
.IP
.TP
\fBSLURM_DEBUG_FLAGS\fR
Specify debug flags for \fBsacctmgr\fR to use. See DebugFlags in the
\fBslurm.conf\fR(5) man page for a full list of flags. The environment
variable takes precedence over the setting in slurm.conf.
.IP
.TP
\fBSLURM_JSON\fR
Control JSON serialization:
.IP
.RS
.TP
\fBcompact\fR
Output JSON as compact as possible.
.IP
.TP
\fBpretty\fR
Output JSON in pretty format to make it more readable.
.IP
.RE
.TP
\fBSLURM_YAML\fR
Control YAML serialization:
.IP
.RS
.TP
\fBcompact\fR
Output YAML as compact as possible.
.IP
.TP
\fBpretty\fR
Output YAML in pretty format to make it more readable.
.RE
.IP
.SH "EXAMPLES"
\fBNOTE\fR: Accounting associations must be set up in a specific order.
You must define clusters before you add accounts, and you must add accounts
before you can add users.
.nf
$ sacctmgr create cluster tux
$ sacctmgr create account name=science fairshare=50
$ sacctmgr create account name=chemistry parent=science fairshare=30
$ sacctmgr create account name=physics parent=science fairshare=20
$ sacctmgr create user name=adam cluster=tux account=physics fairshare=10
$ sacctmgr delete user name=adam cluster=tux account=physics
$ sacctmgr delete user name=adam cluster=tux account=science partition=\\"\\"
$ sacctmgr delete account name=physics cluster=tux
$ sacctmgr modify user where name=adam cluster=tux account=physics set maxjobs=2 maxwall=30:00
$ sacctmgr add user brian account=chemistry
$ sacctmgr list associations cluster=tux format=Account,Cluster,User,Fairshare tree withd
$ sacctmgr list transactions Action="Add Users" Start=11/03\-10:30:00 format=Where,Time
$ sacctmgr dump cluster=tux file=tux_data_file
$ sacctmgr load tux_data_file
.fi
A user's account cannot be changed directly. Instead, create a new
association for the user with the new account, then delete the association
with the old account.
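.PP
The two steps just described can be sketched as a helper that prints the
commands rather than running them, so they can be reviewed first. The
function name move_user_cmds and its argument order are hypothetical. The
command forms follow the add and delete examples above.
.nf
```shell
# move_user_cmds USER CLUSTER OLD_ACCOUNT NEW_ACCOUNT
# Print (do not run) the two commands: add the new association first,
# then delete the association with the old account.
move_user_cmds() {
    user=$1 cluster=$2 old=$3 new=$4
    echo "sacctmgr add user name=$user cluster=$cluster account=$new"
    echo "sacctmgr delete user name=$user cluster=$cluster account=$old"
}
```
.fi
.PP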
.PP
When modifying an object, the placement of the keyword 'set' and the
optional keyword 'where' is critical. As a rule of thumb, anything placed
before 'set' is used as a selection condition. To put a selection condition
after 'set', introduce it with the keyword 'where'.
The following is wrong:
.nf
$ sacctmgr modify user name=adam set fairshare=10 cluster=tux
.fi
This produces an error because the line is read as: modify user adam,
set fairshare=10 and set cluster=tux. Either of the following is correct:
.nf
$ sacctmgr modify user name=adam cluster=tux set fairshare=10
$ sacctmgr modify user name=adam set fairshare=10 where cluster=tux
.fi
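.PP
One way to avoid the misplaced\-condition mistake in scripts is to always
build the command in the same order, with the conditions after 'where' and
the new values after 'set'. The helper below is a hypothetical sketch that
only prints the command line. It uses the same form as the maxjobs example
earlier in this section.
.nf
```shell
# modify_user_cmd CONDITIONS ASSIGNMENTS
# Print a modify command with every condition placed before "set",
# so that no condition can be mistaken for a value to assign.
modify_user_cmd() {
    conditions=$1 assignments=$2
    echo "sacctmgr modify user where $conditions set $assignments"
}
```
.fi
.PP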
.PP
When changing the QOS of an entity, use the '=' operator only when you
want to explicitly replace the QOS list. In most cases you will want
to use the '+=' or '\-=' operator to either add to or remove from the
QOS already in place.
For example, if a user already has a QOS of normal,standby, whether
inherited from a parent or explicitly set, use qos+=expedite to add
expedite to that list.
To add the QOS expedite only for a certain account and/or cluster,
specify them on the sacctmgr command line.
.nf
$ sacctmgr modify user name=adam set qos+=expedite
.fi
or
.nf
$ sacctmgr modify user name=adam acct=this cluster=tux set qos+=expedite
.fi
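.PP
The three operators behave like simple operations on a comma\-separated
list. The sketch below models that behavior with a hypothetical shell
function, apply_qos. It is not \fBsacctmgr\fR code and it does not model the
order in which \fBsacctmgr\fR prints the resulting list.
.nf
```shell
# apply_qos CURRENT OP VALUE
# Model of the QOS operators on a comma-separated list (illustration only):
#   =   replace the list with VALUE
#   +=  add VALUE to the list (no duplicate is created)
#   -=  remove VALUE from the list
apply_qos() {
    current=$1 op=$2 value=$3 result=
    case "$op" in
        "=")
            result=$value
            ;;
        "+="|"-=")
            old_ifs=$IFS
            IFS=,
            for q in $current; do      # copy every entry except VALUE
                if [ "$q" != "$value" ]; then
                    result=${result:+$result,}$q
                fi
            done
            IFS=$old_ifs
            if [ "$op" = "+=" ]; then  # append VALUE once
                result=${result:+$result,}$value
            fi
            ;;
    esac
    echo "$result"
}
```
.fi
.PP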
.PP
The following example shows how to add a QOS to an account.
First, list all available QOSs in the cluster.
.nf
$ sacctmgr show qos format=name
      Name
\-\-\-\-\-\-\-\-\-\-
    normal
  expedite
.fi
List all the associations in the cluster.
.nf
$ sacctmgr show assoc format=cluster,account,qos
 Cluster    Account                  QOS
\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
   zebra       root               normal
   zebra       root               normal
   zebra          g               normal
   zebra         g1               normal
.fi
Add the QOS expedite to account G1 and display the result.
With the += operator, expedite is added to the QOS
already assigned to this account.
.nf
$ sacctmgr modify account name=g1 set qos+=expedite
$ sacctmgr show assoc format=cluster,account,qos
 Cluster    Account                  QOS
\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
   zebra       root               normal
   zebra       root               normal
   zebra          g               normal
   zebra         g1      expedite,normal
.fi
Now set the QOS expedite as the only QOS for the account G and display
the result. With the = operator, expedite becomes the only QOS usable
by account G.
.nf
$ sacctmgr modify account name=G set qos=expedite
$ sacctmgr show assoc format=cluster,account,qos
 Cluster    Account                  QOS
\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
   zebra       root               normal
   zebra       root               normal
   zebra          g             expedite
   zebra         g1      expedite,normal
.fi
If a new account is added under the account G, it will inherit the
QOS expedite and will not have access to the QOS normal.
.nf
$ sacctmgr add account banana parent=G
$ sacctmgr show assoc format=cluster,account,qos
 Cluster    Account                  QOS
\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
   zebra       root               normal
   zebra       root               normal
   zebra          g             expedite
   zebra     banana             expedite
   zebra         g1      expedite,normal
.fi
An example of listing trackable resources:
.nf
$ sacctmgr show tres
      Type              Name       ID
\-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-
       cpu                          1
       mem                          2
    energy                          3
      node                          4
   billing                          5
      gres         gpu:tesla     1001
   license               vcs     1002
        bb              cray     1003
.fi
.SH "COPYING"
Copyright (C) 2008\-2010 Lawrence Livermore National Security.
Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
.br
Copyright (C) 2010\-2022 SchedMD LLC.
.LP
This file is part of Slurm, a resource management program.
For details, see <https://slurm.schedmd.com/>.
.LP
Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
.LP
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
details.
.SH "SEE ALSO"
\fBslurm.conf\fR(5),
\fBslurmdbd\fR(8)