| .TH SCONTROL "1" "December 2012" "scontrol 2.5" "Slurm components" |
| |
| .SH "NAME" |
scontrol \- Used to view and modify Slurm configuration and state.
| |
| .SH "SYNOPSIS" |
| \fBscontrol\fR [\fIOPTIONS\fR...] [\fICOMMAND\fR...] |
| |
| .SH "DESCRIPTION" |
| \fBscontrol\fR is used to view or modify Slurm configuration including: job, |
| job step, node, partition, reservation, and overall system configuration. Most |
| of the commands can only be executed by user root. If an attempt to view or modify |
| configuration information is made by an unauthorized user, an error message |
will be printed and the requested action will not occur. If no command is
entered on the command line, \fBscontrol\fR will operate in an interactive
mode and prompt for input. It will continue prompting for input and executing
commands until explicitly terminated. If a command is entered on the command
line, \fBscontrol\fR will execute that command and terminate. All commands
| and options are case\-insensitive, although node names, partition names, and |
| reservation names are case\-sensitive (node names "LX" and "lx" are distinct). |
| All commands and options can be abbreviated to the extent that the |
| specification is unique. |
| |
| .SH "OPTIONS" |
| .TP |
| \fB\-a\fR, \fB\-\-all\fR |
When the \fIshow\fR command is used, display all partitions, their jobs
and job steps. This causes information to be displayed about partitions
that are configured as hidden and partitions that are unavailable to the
user's group.
| .TP |
| \fB\-d\fR, \fB\-\-details\fR |
| Causes the \fIshow\fR command to provide additional details where available. |
| Repeating the option more than once (e.g., "\-dd") will cause the \fIshow job\fR |
| command to also list the batch script, if the job was a batch job. |
| .TP |
| \fB\-h\fR, \fB\-\-help\fR |
| Print a help message describing the usage of scontrol. |
| .TP |
| \fB\-\-hide\fR |
| Do not display information about hidden partitions, their jobs and job steps. |
This is the default behavior: neither partitions that are configured as hidden
nor partitions that are unavailable to the user's group are displayed.
| .TP |
| \fB\-M\fR, \fB\-\-clusters\fR=<\fIstring\fR> |
| The cluster to issue commands to. Only one cluster name may be specified. |
| |
| .TP |
| \fB\-o\fR, \fB\-\-oneliner\fR |
| Print information one line per record. |
| .TP |
| \fB\-Q\fR, \fB\-\-quiet\fR |
| Print no warning or informational messages, only fatal error messages. |
| .TP |
| \fB\-v\fR, \fB\-\-verbose\fR |
| Print detailed event logging. Multiple \fB\-v\fR's will further increase |
| the verbosity of logging. By default only errors will be displayed. |
| |
| .TP |
\fB\-V\fR, \fB\-\-version\fR
| Print version information and exit. |
| .TP |
| \fBCOMMANDS\fR |
| |
| .TP |
| \fBall\fP |
Show all partitions, their jobs and job steps. This causes information to be
displayed about partitions that are configured as hidden and partitions that
are unavailable to the user's group.
| |
| .TP |
| \fBabort\fP |
| Instruct the Slurm controller to terminate immediately and generate a core file. |
| See "man slurmctld" for information about where the core file will be written. |
| |
| .TP |
| \fBcheckpoint\fP \fICKPT_OP\fP \fIID\fP |
| Perform a checkpoint activity on the job step(s) with the specified identification. |
| \fIID\fP can be used to identify a specific job (e.g. "<job_id>", |
| which applies to all of its existing steps) |
| or a specific job step (e.g. "<job_id>.<step_id>"). |
| Acceptable values for \fICKPT_OP\fP include: |
| .RS |
| .TP 12 |
| \fIable\fP |
| Test if presently not disabled, report start time if checkpoint in progress |
| .TP |
| \fIcreate\fP |
| Create a checkpoint and continue the job or job step |
| .TP |
| \fIdisable\fP |
| Disable future checkpoints |
| .TP |
| \fIenable\fP |
| Enable future checkpoints |
| .TP |
| \fIerror\fP |
| Report the result for the last checkpoint request, error code and message |
| .TP |
| \fIrestart\fP |
| Restart execution of the previously checkpointed job or job step |
| .TP |
| \fIrequeue\fP |
| Create a checkpoint and requeue the batch job, combines vacate |
| and restart operations |
| .TP |
| \fIvacate\fP |
| Create a checkpoint and terminate the job or job step |
| .RE |
The following checkpoint options are also accepted:
| .RS |
| .TP 20 |
| \fIMaxWait=<seconds>\fP |
| Maximum time for checkpoint to be written. |
| Default value is 10 seconds. |
| Valid with \fIcreate\fP and \fIvacate\fP options only. |
| .TP |
| \fIImageDir=<directory_name>\fP |
| Location of checkpoint file. |
| Valid with \fIcreate\fP, \fIvacate\fP and \fIrestart\fP options only. |
This value takes precedence over any \-\-checkpoint\-dir value specified
| at job submission time. |
| .TP |
| \fIStickToNodes\fP |
If set, resume the job on the same nodes that were previously used.
| Valid with the \fIrestart\fP option only. |
| .RE |
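For illustration, a sketch of checkpointing and later restarting a hypothetical
batch job (the job id 1234 and the directory are examples only):
.nf

   scontrol checkpoint vacate 1234 ImageDir=/tmp/ckpt MaxWait=60
   scontrol checkpoint restart 1234

.fi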
| |
| .TP |
| \fBcluster\fR \fICLUSTER_NAME\fP |
| The cluster to issue commands to. Only one cluster name may be specified. |
| |
| .TP |
| \fBcreate\fP \fISPECIFICATION\fP |
| Create a new partition or reservation. See the full list of parameters |
| below. Include the tag "res" to create a reservation without specifying |
| a reservation name. |
| |
| .TP |
| \fBcompleting\fP |
| Display all jobs in a COMPLETING state along with associated nodes in either a |
| COMPLETING or DOWN state. |
| |
| .TP |
| \fBdelete\fP \fISPECIFICATION\fP |
| Delete the entry with the specified \fISPECIFICATION\fP. |
| The two \fISPECIFICATION\fP choices are \fIPartitionName=<name>\fP and |
\fIReservation=<name>\fP. On dynamically laid\-out BlueGene systems,
\fIBlockName=<name>\fP also works. Reservations and partitions should have
no associated jobs at the time of their deletion (modify the jobs first).
| If the specified partition is in use, the request is denied. |
| |
| .TP |
| \fBdetails\fP |
| Causes the \fIshow\fP command to provide additional details where available. |
| Job information will include CPUs and NUMA memory allocated on each node. |
| Note that on computers with hyperthreading enabled and SLURM configured to |
| allocate cores, each listed CPU represents one physical core. |
| Each hyperthread on that core can be allocated a separate task, so a job's |
| CPU count and task count may differ. |
| See the \fB\-\-cpu_bind\fR and \fB\-\-mem_bind\fR option descriptions in |
| srun man pages for more information. |
| The \fBdetails\fP option is currently only supported for the \fIshow job\fP |
| command. To also list the batch script for batch jobs, in addition to the |
| details, use the \fBscript\fP option described below instead of this option. |
| |
| .TP |
| \fBexit\fP |
| Terminate the execution of scontrol. |
This command takes no options and is meant for use in interactive mode.
| |
| .TP |
| \fBhelp\fP |
| Display a description of scontrol options and commands. |
| |
| .TP |
| \fBhide\fP |
Do not display partition, job or job step information for partitions that are
| configured as hidden or partitions that are unavailable to the user's group. |
| This is the default behavior. |
| |
| .TP |
| \fBhold\fP \fIjob_id\fP |
Prevent a pending job from being started (sets its priority to 0).
| Use the \fIrelease\fP command to permit the job to be scheduled. |
| Note that when a job is held by a system administrator using the \fBhold\fP |
| command, only a system administrator may release the job for execution (also |
| see the \fBuhold\fP command). When the job is held by its owner, it may also |
| be released by the job's owner. |
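For illustration, assuming a hypothetical pending job 1234:
.nf

   scontrol hold 1234
   scontrol release 1234

.fi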
| |
| .TP |
| \fBnotify\fP \fIjob_id\fP \fImessage\fP |
| Send a message to standard error of the salloc or srun command or batch job |
| associated with the specified \fIjob_id\fP. |
| |
| .TP |
| \fBoneliner\fP |
| Print information one line per record. |
| |
| .TP |
| \fBpidinfo\fP \fIproc_id\fP |
| Print the Slurm job id and scheduled termination time corresponding to the |
supplied process id, \fIproc_id\fP, on the current node. This will work only
with processes on the node on which scontrol is run, and only for those processes
| spawned by SLURM and their descendants. |
| |
| .TP |
| \fBlistpids\fP [\fIjob_id\fP[.\fIstep_id\fP]] [\fINodeName\fP] |
| Print a listing of the process IDs in a job step (if JOBID.STEPID is provided), |
| or all of the job steps in a job (if \fIjob_id\fP is provided), or all of the job |
| steps in all of the jobs on the local node (if \fIjob_id\fP is not provided |
| or \fIjob_id\fP is "*"). This will work only with processes on the node on |
| which scontrol is run, and only for those processes spawned by SLURM and |
| their descendants. Note that some SLURM configurations |
| (\fIProctrackType\fP value of \fIpgid\fP or \fIaix\fP) |
| are unable to identify all processes associated with a job or job step. |
| |
| Note that the NodeName option is only really useful when you have multiple |
| slurmd daemons running on the same host machine. Multiple slurmd daemons on |
| one host are, in general, only used by SLURM developers. |
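For illustration (the job and step ids are hypothetical):
.nf

   scontrol listpids 1234.0
   scontrol listpids 1234

.fi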
| |
| .TP |
| \fBping\fP |
Ping the primary and secondary slurmctld daemons and report whether
they are responding.
| |
| .TP |
| \fBquiet\fP |
| Print no warning or informational messages, only fatal error messages. |
| |
| .TP |
| \fBquit\fP |
| Terminate the execution of scontrol. |
| |
| .TP |
| \fBreboot_nodes\fP [\fINodeList\fP] |
| Reboot all nodes in the system when they become idle using the |
| \fBRebootProgram\fP as configured in SLURM's slurm.conf file. |
Accepts an optional list of nodes to reboot. By default all nodes are rebooted.
| NOTE: This command does not prevent additional jobs from being scheduled on |
| these nodes, so many jobs can be executed on the nodes prior to them being |
| rebooted. You can explicitly drain the nodes in order to reboot nodes as soon |
| as possible, but the nodes must also explicitly be returned to service after |
| being rebooted. You can alternately create an advanced reservation to |
| prevent additional jobs from being initiated on nodes to be rebooted. |
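For illustration, a sketch using a hypothetical node range:
.nf

   scontrol reboot_nodes lx[10\-20]

.fi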
| |
| .TP |
| \fBreconfigure\fP |
| Instruct all Slurm daemons to re\-read the configuration file. |
| This command does not restart the daemons. |
| This mechanism would be used to modify configuration parameters (Epilog, |
| Prolog, SlurmctldLogFile, SlurmdLogFile, etc.). |
The Slurm controller (slurmctld) forwards the request to all other daemons
| (slurmd daemon on each compute node). Running jobs continue execution. |
| Most configuration parameters can be changed by just running this command, |
however, SLURM daemons should be shut down and restarted if any of these
| parameters are to be changed: AuthType, BackupAddr, BackupController, |
| ControlAddr, ControlMach, PluginDir, StateSaveLocation, SlurmctldPort |
| or SlurmdPort. The slurmctld daemon must be restarted if nodes are added to |
| or removed from the cluster. |
| |
| .TP |
| \fBrelease\fP \fIjob_id\fP |
| Release a previously held job to begin execution. Also see \fBhold\fR. |
| |
| .TP |
| \fBrequeue\fP \fIjob_id\fP |
| Requeue a running or pending SLURM batch job. |
| |
| .TP |
| \fBresume\fP \fIjob_id\fP |
| Resume a previously suspended job. Also see \fBsuspend\fR. |
| |
| .TP |
| \fBschedloglevel\fP \fILEVEL\fP |
| Enable or disable scheduler logging. |
| \fILEVEL\fP may be "0", "1", "disable" or "enable". "0" has the same |
| effect as "disable". "1" has the same effect as "enable". |
| This value is temporary and will be overwritten when the slurmctld |
| daemon reads the slurm.conf configuration file (e.g. when the daemon |
| is restarted or \fBscontrol reconfigure\fR is executed) if the |
| SlurmSchedLogLevel parameter is present. |
| |
| .TP |
| \fBscript\fP |
| Causes the \fIshow job\fP command to list the batch script for batch |
| jobs in addition to the detail information described under the |
| \fBdetails\fP option above. |
| |
| .TP |
| \fBsetdebug\fP \fILEVEL\fP |
| Change the debug level of the slurmctld daemon. |
| \fILEVEL\fP may be an integer value between zero and nine (using the |
| same values as \fISlurmctldDebug\fP in the \fIslurm.conf\fP file) or |
| the name of the most detailed message type to be printed: |
| "quiet", "fatal", "error", "info", "verbose", "debug", "debug2", "debug3", |
| "debug4", or "debug5". |
| This value is temporary and will be overwritten whenever the slurmctld |
| daemon reads the slurm.conf configuration file (e.g. when the daemon |
| is restarted or \fBscontrol reconfigure\fR is executed). |
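For illustration:
.nf

   scontrol setdebug debug2
   scontrol setdebug info

.fi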
| |
| .TP |
| \fBsetdebugflags\fP [+|\-]\fIFLAG\fP |
| Add or remove DebugFlags of the slurmctld daemon. |
| See "man slurm.conf" for a list of supported DebugFlags. |
| NOTE: Changing the value of some DebugFlags will have no effect without |
| restarting the slurmctld daemon, which would set DebugFlags based upon the |
| contents of the slurm.conf configuration file. |
| |
| .TP |
| \fBshow\fP \fIENTITY\fP \fIID\fP |
| Display the state of the specified entity with the specified identification. |
| \fIENTITY\fP may be \fIaliases\fP, \fIconfig\fP, \fIdaemons\fP, \fIfrontend\fP, |
| \fIjob\fP, \fInode\fP, \fIpartition\fP, \fIreservation\fP, \fIslurmd\fP, |
| \fIstep\fP, \fItopology\fP, \fIhostlist\fP, \fIhostlistsorted\fP or |
| \fIhostnames\fP |
| (also \fIblock\fP or \fIsubmp\fP on BlueGene systems). |
| \fIID\fP can be used to identify a specific element of the identified |
| entity: the configuration parameter name, job ID, node name, partition name, |
| reservation name, or job step ID for \fIconfig\fP, \fIjob\fP, \fInode\fP, |
\fIpartition\fP, \fIreservation\fP, or \fIstep\fP respectively.
| For an \fIENTITY\fP of \fItopology\fP, the \fIID\fP may be a node or switch name. |
| If one node name is specified, all switches connected to that node (and |
| their parent switches) will be shown. |
| If more than one node name is specified, only switches that connect to all |
| named nodes will be shown. |
| \fIaliases\fP will return all \fINodeName\fP values associated to a given |
| \fINodeHostname\fP (useful to get the list of virtual nodes associated with a |
| real node in a configuration where multiple slurmd daemons execute on a single |
| compute node). |
| \fIconfig\fP displays parameter names from the configuration files in mixed |
case (e.g. SlurmdPort=7003) while derived parameter names are in upper case
| only (e.g. SLURM_VERSION). |
| \fIhostnames\fP takes an optional hostlist expression as input and |
| writes a list of individual host names to standard output (one per |
| line). If no hostlist expression is supplied, the contents of the |
SLURM_NODELIST environment variable are used. For example "tux[1\-3]"
| is mapped to "tux1","tux2" and "tux3" (one hostname per line). |
| \fIhostlist\fP takes a list of host names and prints the hostlist |
| expression for them (the inverse of \fIhostnames\fP). |
| \fIhostlist\fP can also take the absolute pathname of a file |
| (beginning with the character '/') containing a list of hostnames. |
| Multiple node names may be specified using simple node range expressions |
| (e.g. "lx[10\-20]"). All other \fIID\fP values must identify a single |
| element. The job step ID is of the form "job_id.step_id", (e.g. "1234.1"). |
| \fIslurmd\fP reports the current status of the slurmd daemon executing |
| on the same node from which the scontrol command is executed (the |
| local host). It can be useful to diagnose problems. |
| By default \fIhostlist\fP does not sort the node list or make it |
unique (e.g. tux2,tux1,tux2 = tux[2,1-2]). If you want a sorted
list, use \fIhostlistsorted\fP (e.g. tux2,tux1,tux2 = tux[1-2,2]).
| By default, all elements of the entity type specified are printed. |
| For an \fIENTITY\fP of \fIjob\fP, if the job does not specify |
sockets-per-node, cores-per-socket or threads-per-core then it
will display '*' in the ReqS:C:T=*:*:* field.
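Some illustrative invocations (the job id and node names are hypothetical):
.nf

   scontrol show config
   scontrol \-d show job 1234
   scontrol \-o show node lx10
   scontrol show hostnames tux[1\-3]

.fi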
| |
| .TP |
| \fBshutdown\fP \fIOPTION\fP |
| Instruct Slurm daemons to save current state and terminate. |
By default, the Slurm controller (slurmctld) forwards the request to all
| other daemons (slurmd daemon on each compute node). |
| An \fIOPTION\fP of \fIslurmctld\fP or \fIcontroller\fP results in |
only the slurmctld daemon being shut down and the slurmd daemons
| remaining active. |
| |
| .TP |
| \fBsuspend\fP \fIjob_id\fP |
| Suspend a running job. |
| Use the \fIresume\fP command to resume its execution. |
| User processes must stop on receipt of SIGSTOP signal and resume |
| upon receipt of SIGCONT for this operation to be effective. |
| Not all architectures and configurations support job suspension. |
| |
| .TP |
| \fBtakeover\fP |
| Instruct SLURM's backup controller (slurmctld) to take over system control. |
| SLURM's backup controller requests control from the primary and waits for |
| its termination. After that, it switches from backup mode to controller |
| mode. If primary controller can not be contacted, it directly switches to |
| controller mode. This can be used to speed up the SLURM controller |
| fail\-over mechanism when the primary node is down. |
| This can be used to minimize disruption if the computer executing the |
| primary SLURM controller is scheduled down. |
| (Note: SLURM's primary controller will take the control back at startup.) |
| |
| .TP |
| \fBuhold\fP \fIjob_id\fP |
Prevent a pending job from being started (sets its priority to 0).
| Use the \fIrelease\fP command to permit the job to be scheduled. |
| This command is designed for a system administrator to hold a job so that |
| the job owner may release it rather than requiring the intervention of a |
| system administrator (also see the \fBhold\fP command). |
| |
| .TP |
| \fBupdate\fP \fISPECIFICATION\fP |
| Update job, step, node, partition, or reservation configuration per the |
| supplied specification. \fISPECIFICATION\fP is in the same format as the Slurm |
| configuration file and the output of the \fIshow\fP command described above. It |
| may be desirable to execute the \fIshow\fP command (described above) on the |
specific entity you wish to update, then use cut\-and\-paste tools to enter
updated configuration values in the \fIupdate\fP command. Note that while most
| configuration values can be changed using this command, not all can be changed |
| using this mechanism. In particular, the hardware configuration of a node or |
| the physical addition or removal of nodes from the cluster may only be |
| accomplished through editing the Slurm configuration file and executing |
| the \fIreconfigure\fP command (described above). |
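For illustration of the show\-then\-update workflow described above (the job id
and partition name are hypothetical):
.nf

   scontrol show job 1234
   scontrol update JobId=1234 TimeLimit=2:00:00 Partition=debug

.fi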
| |
| .TP |
| \fBverbose\fP |
| Print detailed event logging. |
| This includes time\-stamps on data structures, record counts, etc. |
| |
| .TP |
| \fBversion\fP |
| Display the version number of scontrol being executed. |
| |
| .TP |
| \fBwait_job\fP \fIjob_id\fP |
Wait until a job and all of its nodes are ready for use or the job has entered
| some termination state. This option is particularly useful in the SLURM Prolog |
| or in the batch script itself if nodes are powered down and restarted |
| automatically as needed. |
| |
| .TP |
| \fB!!\fP |
| Repeat the last command executed. |
| |
| .TP |
| \fBSPECIFICATIONS FOR UPDATE COMMAND, JOBS\fR |
| .TP |
| \fIAccount\fP=<account> |
| Account name to be changed for this job's resource use. |
| Value may be cleared with blank data value, "Account=". |
| .TP |
| \fIConn\-Type\fP=<type> |
| Reset the node connection type. |
| Possible values on Blue Gene are "MESH", "TORUS" and "NAV" |
| (mesh else torus). |
| .TP |
| \fIContiguous\fP=<yes|no> |
| Set the job's requirement for contiguous (consecutive) nodes to be allocated. |
| Possible values are "YES" and "NO". |
| .TP |
| \fIDependency\fP=<dependency_list> |
| Defer job's initiation until specified job dependency specification |
| is satisfied. |
| Cancel dependency with an empty dependency_list (e.g. "Dependency="). |
| <\fIdependency_list\fR> is of the form |
| <\fItype:job_id[:job_id][,type:job_id[:job_id]]\fR>. |
| Many jobs can share the same dependency and these jobs may even belong to |
| different users. |
| .PD |
| .RS |
| .TP |
| \fBafter:job_id[:jobid...]\fR |
| This job can begin execution after the specified jobs have begun |
| execution. |
| .TP |
| \fBafterany:job_id[:jobid...]\fR |
| This job can begin execution after the specified jobs have terminated. |
| .TP |
| \fBafternotok:job_id[:jobid...]\fR |
| This job can begin execution after the specified jobs have terminated |
| in some failed state (non-zero exit code, node failure, timed out, etc). |
| .TP |
| \fBafterok:job_id[:jobid...]\fR |
| This job can begin execution after the specified jobs have successfully |
| executed (ran to completion with an exit code of zero). |
| .TP |
| \fBsingleton\fR |
| This job can begin execution after any previously launched jobs |
| sharing the same job name and user have terminated. |
| .RE |
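For example, to make a hypothetical job 1235 begin only after job 1234 has
completed successfully:
.nf

   scontrol update JobId=1235 Dependency=afterok:1234

.fi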
| .TP |
| \fIEligibleTime\fP=<time_spec> |
| See \fIStartTime\fP. |
| .TP |
| \fIExcNodeList\fP=<nodes> |
Set the job's list of excluded nodes. Multiple node names may be
| specified using simple node range expressions (e.g. "lx[10\-20]"). |
| Value may be cleared with blank data value, "ExcNodeList=". |
| .TP |
| \fIFeatures\fP=<features> |
| Set the job's required node features. |
| The list of features may include multiple feature names separated |
| by ampersand (AND) and/or vertical bar (OR) operators. |
| For example: \fBFeatures="opteron&video"\fR or \fBFeatures="fast|faster"\fR. |
| In the first example, only nodes having both the feature "opteron" AND |
| the feature "video" will be used. |
| There is no mechanism to specify that you want one node with feature |
| "opteron" and another node with feature "video" in case no |
| node has both features. |
| If only one of a set of possible options should be used for all allocated |
| nodes, then use the OR operator and enclose the options within square brackets. |
| For example: "\fBFeatures=[rack1|rack2|rack3|rack4]"\fR might |
| be used to specify that all nodes must be allocated on a single rack of |
| the cluster, but any of those four racks can be used. |
| A request can also specify the number of nodes needed with some feature |
| by appending an asterisk and count after the feature name. |
| For example "\fBFeatures=graphics*4"\fR |
| indicates that at least four allocated nodes must have the feature "graphics." |
| Constraints with node counts may only be combined with AND operators. |
| Value may be cleared with blank data value, for example "Features=". |
| |
| .TP |
| \fIGeometry\fP=<geo> |
| Reset the required job geometry. |
| On Blue Gene the value should be three digits separated by |
| "x" or ",". The digits represent the allocation size in |
| X, Y and Z dimensions (e.g. "2x3x4"). |
| |
| .TP |
| \fIGres\fP=<list> |
| Specifies a comma delimited list of generic consumable resources. |
| The format of each entry on the list is "name[:count[*cpu]]". |
| The name is that of the consumable resource. |
| The count is the number of those resources with a default value of 1. |
| The specified resources will be allocated to the job on each node |
| allocated unless "*cpu" is appended, in which case the resources |
| will be allocated on a per cpu basis. |
The available generic consumable resources are configurable by the system
| administrator. |
| A list of available generic consumable resources will be printed and the |
| command will exit if the option argument is "help". |
| Examples of use include "Gres=gpus:2*cpu,disk=40G" and "Gres=help". |
| |
| .TP |
| \fIJobId\fP=<id> |
| Identify the job to be updated. This specification is required. |
| .TP |
| \fILicenses\fP=<name> |
| Specification of licenses (or other resources available on all nodes |
| of the cluster) as described in salloc/sbatch/srun man pages. |
| .TP |
| \fIMinCPUsNode\fP=<count> |
| Set the job's minimum number of CPUs per node to the specified value. |
| .TP |
| \fIMinMemoryCPU\fP=<megabytes> |
| Set the job's minimum real memory required per allocated CPU to the specified |
| value. |
| Either \fIMinMemoryCPU\fP or \fIMinMemoryNode\fP may be set, but not both. |
| .TP |
| \fIMinMemoryNode\fP=<megabytes> |
| Set the job's minimum real memory required per node to the specified value. |
| Either \fIMinMemoryCPU\fP or \fIMinMemoryNode\fP may be set, but not both. |
| .TP |
| \fIMinTmpDiskNode\fP=<megabytes> |
| Set the job's minimum temporary disk space required per node to the specified value. |
| .TP |
| \fIName\fP=<name> |
| Set the job's name to the specified value. |
| .TP |
| \fINice\fP[=delta] |
| Adjust job's priority by the specified value. Default value is 100. |
| The adjustment range is from \-10000 (highest priority) |
| to 10000 (lowest priority). |
| Nice value changes are not additive, but overwrite any prior nice |
| value and are applied to the job's base priority. |
| Only privileged users can specify a negative adjustment. |
| .TP |
| \fINodeList\fP=<nodes> |
Change the nodes allocated to a running job to shrink its size.
| The specified list of nodes must be a subset of the nodes currently |
| allocated to the job. Multiple node names may be specified using |
| simple node range expressions (e.g. "lx[10\-20]"). After a job's allocation |
| is reduced, subsequent \fBsrun\fR commands must explicitly specify node and |
| task counts which are valid for the new allocation. |
| .TP |
| \fINumCPUs\fP=<min_count>[\-<max_count>] |
| Set the job's minimum and optionally maximum count of CPUs to be allocated. |
| .TP |
| \fINumNodes\fP=<min_count>[\-<max_count>] |
| Set the job's minimum and optionally maximum count of nodes to be allocated. |
If the job is already running, this can be used to specify a node count less
than the number currently allocated, in which case resources previously
allocated to the job will be relinquished. After a job's allocation is reduced,
subsequent \fBsrun\fR
| commands must explicitly specify node and task counts which are valid for the |
| new allocation. Also see the \fINodeList\fP parameter above. |
| .TP |
| \fINumTasks\fP=<count> |
| Set the job's count of required tasks to the specified value. |
| .TP |
| \fIPartition\fP=<name> |
| Set the job's partition to the specified value. |
| .TP |
| \fIPriority\fP=<number> |
| Set the job's priority to the specified value. |
| Note that a job priority of zero prevents the job from ever being scheduled. |
Setting a job's priority to zero holds the job.
| Set the priority to a non\-zero value to permit it to run. |
| Explicitly setting a job's priority clears any previously set nice value and |
| removes the priority/multifactor plugin's ability to manage a job's priority. |
| In order to restore the priority/multifactor plugin's ability to manage a |
| job's priority, hold and then release the job. |
| .TP |
| \fIQOS\fP=<name> |
| Set the job's QOS (Quality Of Service) to the specified value. |
| Value may be cleared with blank data value, "QOS=". |
| .TP |
| \fIReqCores\fP=<count> |
| Set the job's count of cores per socket to the specified value. |
| .TP |
| \fIReqNodeList\fP=<nodes> |
Set the job's list of required nodes. Multiple node names may be specified using
| simple node range expressions (e.g. "lx[10\-20]"). |
| Value may be cleared with blank data value, "ReqNodeList=". |
| .TP |
| \fIReqSockets\fP=<count> |
| Set the job's count of sockets per node to the specified value. |
| .TP |
| \fIReqThreads\fP=<count> |
| Set the job's count of threads per core to the specified value. |
| .TP |
| \fIRequeue\fP=<0|1> |
| Stipulates whether a job should be requeued after a node failure: 0 |
| for no, 1 for yes. |
| .TP |
| \fIReservationName\fP=<name> |
| Set the job's reservation to the specified value. |
| Value may be cleared with blank data value, "ReservationName=". |
| .TP |
| \fIRotate\fP=<yes|no> |
| Permit the job's geometry to be rotated. |
| Possible values are "YES" and "NO". |
| .TP |
| \fIShared\fP=<yes|no> |
| Set the job's ability to share nodes with other jobs. Possible values are |
| "YES" and "NO". |
| .TP |
| \fIStartTime\fP=<time_spec> |
| Set the job's earliest initiation time. |
| It accepts times of the form \fIHH:MM:SS\fR to run a job at |
| a specific time of day (seconds are optional). |
| (If that time is already past, the next day is assumed.) |
| You may also specify \fImidnight\fR, \fInoon\fR, or |
| \fIteatime\fR (4pm) and you can have a time\-of\-day suffixed |
| with \fIAM\fR or \fIPM\fR for running in the morning or the evening. |
| You can also say what day the job will be run, by specifying |
| a date of the form \fIMMDDYY\fR or \fIMM/DD/YY\fR or \fIMM.DD.YY\fR, |
| or a date and time as \fIYYYY\-MM\-DD[THH:MM[:SS]]\fR. You can also |
| give times like \fInow + count time\-units\fR, where the time\-units |
| can be \fIminutes\fR, \fIhours\fR, \fIdays\fR, or \fIweeks\fR |
| and you can tell SLURM to run the job today with the keyword |
| \fItoday\fR and to run the job tomorrow with the keyword |
| \fItomorrow\fR. |
| .RS |
| .PP |
| Notes on date/time specifications: |
| \- although the 'seconds' field of the HH:MM:SS time specification is |
| allowed by the code, note that the poll time of the SLURM scheduler |
| is not precise enough to guarantee dispatch of the job on the exact |
| second. The job will be eligible to start on the next poll |
| following the specified time. The exact poll interval depends on the |
| SLURM scheduler (e.g., 60 seconds with the default sched/builtin). |
| \- if no time (HH:MM:SS) is specified, the default is (00:00:00). |
| \- if a date is specified without a year (e.g., MM/DD) then the current |
| year is assumed, unless the combination of MM/DD and HH:MM:SS has |
| already passed for that year, in which case the next year is used. |
| .RE |
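For illustration, two sketches that defer a hypothetical job (the job id and
times are examples only):
.nf

   scontrol update JobId=1234 StartTime=now+2hours
   scontrol update JobId=1234 StartTime=2013\-01\-15T08:00:00

.fi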
| .TP |
| \fISwitches\fP=<count>[@<max\-time\-to\-wait>] |
| When a tree topology is used, this defines the maximum count of switches |
| desired for the job allocation. If SLURM finds an allocation containing more |
switches than the count specified, the job remains pending until it either finds
an allocation with the desired switch count or the time limit expires. By default
there is no switch count limit and no time limit delay. Set the count
to zero in order to clear any previously set count (disabling the limit).
| The job's maximum time delay may be limited by the system administrator using |
| the \fBSchedulerParameters\fR configuration parameter with the |
| \fBmax_switch_wait\fR parameter option. |
| Also see \fIwait\-for\-switch\fP. |
| |
| .TP |
| \fITimeLimit\fP=<time> |
| The job's time limit. |
| Output format is [days\-]hours:minutes:seconds or "UNLIMITED". |
Input format (for the \fBupdate\fR command) is minutes, minutes:seconds,
| hours:minutes:seconds, days\-hours, days\-hours:minutes or |
| days\-hours:minutes:seconds. |
| Time resolution is one minute and second values are rounded up to |
| the next minute. |
| If changing the time limit of a job, either specify a new time limit value or |
| precede the time with a "+" or "\-" to increment or decrement the current |
| time limit (e.g. "TimeLimit=+30"). In order to increment or decrement the |
| current time limit, the \fIJobId\fP specification must precede the |
| \fITimeLimit\fP specification. |
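For example, to add 30 minutes to the time limit of a hypothetical job:
.nf

   scontrol update JobId=1234 TimeLimit=+30

.fi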
| |
| .TP |
| \fIwait\-for\-switch\fP=<max\-time\-to\-wait> |
| When a tree topology is used, this defines the maximum time to wait for the |
| desired count of switches. If SLURM finds an allocation containing more |
switches than the count specified, the job remains pending until it either finds
an allocation with the desired switch count or the time limit expires. By default
there is no switch count limit and no time limit delay. Set the time
to zero in order to clear any previously set time limit (disabling the limit).
| The job's maximum time delay may be limited by the system administrator using |
| the \fBSchedulerParameters\fR configuration parameter with the |
| \fBmax_switch_wait parameter\fR option. |
| Also see \fISwitches\fP. |
| .TP |
| \fIWCKey\fP=<key> |
| Set the job's workload characterization key to the specified value. |
| |
| .TP |
| NOTE: The "show" command, when used with the "job" or "job <jobid>" |
entity, displays detailed information about a job or jobs. Much of
| this information may be modified using the "update job" command as |
| described above. However, the following fields displayed by the show |
| job command are read\-only and cannot be modified: |
| |
| .TP |
| \fIAllocNode:Sid\fP |
| Local node and system id making the resource allocation. |
| .TP |
| \fIEndTime\fP |
| The time the job is expected to terminate based on the job's time |
| limit. When the job ends sooner, this field will be updated with the |
| actual end time. |
| .TP |
| \fIExitCode\fP=<exit>:<sig> |
| Exit status reported for the job by the wait() function. |
| The first number is the exit code, typically as set by the exit() function. |
The second number is the signal that caused the process to terminate,
if it was terminated by a signal.
| .TP |
| \fIJobState\fP |
| The current state of the job. |
| .TP |
| \fINodeList\fP |
| The list of nodes allocated to the job. |
| .TP |
| \fINodeListIndices\fP |
| The NodeIndices expose the internal indices into the node table |
| associated with the node(s) allocated to the job. |
| .TP |
| \fIPreemptTime\fP |
| Time at which job was signaled that it was selected for preemption. |
(Meaningful only when PreemptMode=CANCEL is used and the partition or QOS
with which the job is associated has a GraceTime value designated.)
| .TP |
| \fIPreSusTime\fP |
| Time the job ran prior to last suspend. |
| .TP |
| \fIReason\fP |
The reason the job is not running: e.g., waiting for "Resources".
| .TP |
| \fISubmitTime\fP |
| The time and date stamp (in Universal Time Coordinated, UTC) |
| the job was submitted. The format of the output is identical |
| to that of the EndTime field. |
| |
| NOTE: If a job is requeued, the submit time is reset. |
| To obtain the original submit time it is necessary |
to use the "sacct \-j <job_id>[.<step_id>]" command, also
| designating the \-D or \-\-duplicate option to display all |
| duplicate entries for a job. |
| .TP |
| \fISuspendTime\fP |
| Time the job was last suspended or resumed. |
| .TP |
| \fIUserId\fP \fIGroupId\fP |
| The user and group under which the job was submitted. |
| .TP |
| NOTE on information displayed for various job states: |
| When you submit a request for the "show job" function the scontrol |
| process makes an RPC request call to slurmctld with a REQUEST_JOB_INFO |
| message type. If the state of the job is PENDING, then it returns |
| some detail information such as: min_nodes, min_procs, cpus_per_task, |
| etc. If the state is other than PENDING the code assumes that it is in |
| a further state such as RUNNING, COMPLETE, etc. In these cases the |
| code explicitly returns zero for these values. These values are |
| meaningless once the job resources have been allocated and the job has |
| started. |
| |
| .TP |
| \fBSPECIFICATIONS FOR UPDATE COMMAND, STEPS\fR |
| .TP |
| \fIStepId\fP=<job_id>[.<step_id>] |
| Identify the step to be updated. |
| If the job_id is given, but no step_id is specified then all steps of |
| the identified job will be modified. |
| This specification is required. |
| .TP |
| \fITimeLimit\fP=<time> |
| The job's time limit. |
| Output format is [days\-]hours:minutes:seconds or "UNLIMITED". |
Input format (for the \fBupdate\fR command) is minutes, minutes:seconds,
| hours:minutes:seconds, days\-hours, days\-hours:minutes or |
| days\-hours:minutes:seconds. |
| Time resolution is one minute and second values are rounded up to |
| the next minute. |
| If changing the time limit of a step, either specify a new time limit value or |
| precede the time with a "+" or "\-" to increment or decrement the current |
| time limit (e.g. "TimeLimit=+30"). In order to increment or decrement the |
| current time limit, the \fIStepId\fP specification must precede the |
| \fITimeLimit\fP specification. |
| |
| .TP |
| \fBSPECIFICATIONS FOR UPDATE COMMAND, NODES\fR |
| .TP |
| \fINodeName\fP=<name> |
| Identify the node(s) to be updated. Multiple node names may be specified using |
| simple node range expressions (e.g. "lx[10\-20]"). This specification is required. |
| .TP |
| \fIFeatures\fP=<features> |
| Identify feature(s) to be associated with the specified node. Any |
| previously defined feature(s) will be overwritten with the new value. |
| Features assigned via \fBscontrol\fR will only persist across the restart |
| of the slurmctld daemon with the \fI\-R\fR option and state files |
| preserved or slurmctld's receipt of a SIGHUP. |
| Update slurm.conf with any changes meant to be persistent across normal |
| restarts of slurmctld or the execution of \fBscontrol reconfig\fR. |
| |
| .TP |
| \fIGres\fP=<gres> |
| Identify generic resources to be associated with the specified node. Any |
| previously defined generic resources will be overwritten with the new value. |
| Specifications for multiple generic resources should be comma separated. |
| Each resource specification consists of a name followed by an optional |
| colon with a numeric value (default value is one) |
| (e.g. "Gres=bandwidth:10000,gpus"). |
| Generic resources assigned via \fBscontrol\fR will only persist across the |
| restart of the slurmctld daemon with the \fI\-R\fR option and state files |
| preserved or slurmctld's receipt of a SIGHUP. |
| Update slurm.conf with any changes meant to be persistent across normal |
| restarts of slurmctld or the execution of \fBscontrol reconfig\fR. |
| |
| .TP |
| \fIReason\fP=<reason> |
Identify the reason the node is in a "DOWN", "DRAINED", "DRAINING",
| "FAILING" or "FAIL" state. |
| Use quotes to enclose a reason having more than one word. |
| |
| .TP |
| \fIState\fP=<state> |
| Identify the state to be assigned to the node. Possible values are "NoResp", |
| "ALLOC", "ALLOCATED", "DOWN", "DRAIN", "FAIL", "FAILING", "IDLE", |
| "MIXED", "MAINT", "POWER_DOWN", "POWER_UP", or "RESUME". |
| If a node is in a "MIXED" state it usually means the node is in |
| multiple states. For instance if only part of the node is "ALLOCATED" |
| and the rest of the node is "IDLE" the state will be "MIXED". |
| If you want to remove a node from service, you typically want to set |
its state to "DRAIN".
| "FAILING" is similar to "DRAIN" except that some applications will |
| seek to relinquish those nodes before the job completes. |
| "RESUME" is not an actual node state, but will return a "DRAINED", "DRAINING", |
| or "DOWN" node to service, either "IDLE" or "ALLOCATED" state as appropriate. |
| Setting a node "DOWN" will cause all running and suspended jobs on that |
| node to be terminated. |
| "POWER_DOWN" and "POWER_UP" will use the configured \fISuspendProg\fR and |
| \fIResumeProg\fR programs to explicitly place a node in or out of a power |
| saving mode. |
| The "NoResp" state will only set the "NoResp" flag for a node without |
| changing its underlying state. |
| While all of the above states are valid, some of them are not valid new |
| node states given their prior state. |
| If the node state code printed is followed by "~", this indicates |
| the node is presently in a power saving mode (typically |
| running at reduced frequency). |
| If the node state code is followed by "#", this indicates |
| the node is presently being powered up or configured. |
| Generally only "DRAIN", "FAIL" and "RESUME" should be used. |
| NOTE: The scontrol command should not be used to change node state on Cray |
| systems. Use Cray tools such as \fIxtprocadmin\fR instead. |
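For example, to drain a hypothetical range of nodes and later return them to
service:
.nf

   scontrol update NodeName=lx[10\-20] State=DRAIN Reason="scheduled maintenance"
   scontrol update NodeName=lx[10\-20] State=RESUME

.fi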
| |
| .TP |
| \fIWeight\fP=<weight> |
| Identify weight to be associated with specified nodes. This allows |
| dynamic changes to weight associated with nodes, which will be used |
| for the subsequent node allocation decisions. |
| Weight assigned via \fBscontrol\fR will only persist across the restart |
| of the slurmctld daemon with the \fI\-R\fR option and state files |
| preserved or slurmctld's receipt of a SIGHUP. |
| Update slurm.conf with any changes meant to be persistent across normal |
| restarts of slurmctld or the execution of \fBscontrol reconfig\fR. |
| |
| .TP |
| \fBSPECIFICATIONS FOR UPDATE COMMAND, FRONTEND\fR |
| |
| .TP |
| \fIFrontendName\fP=<name> |
| Identify the front end node to be updated. This specification is required. |
| |
| .TP |
| \fIReason\fP=<reason> |
| Identify the reason the node is in a "DOWN" or "DRAIN" state. |
| Use quotes to enclose a reason having more than one word. |
| |
| .TP |
| \fIState\fP=<state> |
| Identify the state to be assigned to the front end node. Possible values are |
| "DOWN", "DRAIN" or "RESUME". |
| If you want to remove a front end node from service, you typically want to set |
its state to "DRAIN".
| "RESUME" is not an actual node state, but will return a "DRAINED", "DRAINING", |
| or "DOWN" front end node to service, either "IDLE" or "ALLOCATED" state as |
| appropriate. |
| Setting a front end node "DOWN" will cause all running and suspended jobs on |
| that node to be terminated. |
| |
| .TP |
| \fBSPECIFICATIONS FOR CREATE, UPDATE, AND DELETE COMMANDS, PARTITIONS\fR |
| .TP |
| \fIAllowGroups\fP=<name> |
| Identify the user groups which may use this partition. |
| Multiple groups may be specified in a comma separated list. |
| To permit all groups to use the partition specify "AllowGroups=ALL". |
| |
| .TP |
| \fIAllocNodes\fP=<name> |
| Comma separated list of nodes from which users can execute jobs in the |
| partition. |
| Node names may be specified using the node range expression syntax |
| described above. |
| The default value is "ALL". |
| |
| .TP |
| \fIAlternate\fP=<partition name> |
| Alternate partition to be used if the state of this partition is "DRAIN" or |
| "INACTIVE." The value "NONE" will clear a previously set alternate partition. |
| |
| .TP |
| \fIDefault\fP=<yes|no> |
| Specify if this partition is to be used by jobs which do not explicitly |
| identify a partition to use. |
| Possible output values are "YES" and "NO". |
| In order to change the default partition of a running system, |
| use the scontrol update command and set Default=yes for the partition |
| that you want to become the new default. |
| |
| .TP |
| \fIDefaultTime\fP=<time> |
| Run time limit used for jobs that don't specify a value. If not set |
| then MaxTime will be used. |
| Format is the same as for MaxTime. |
| |
| .TP |
| \fIDefMemPerCPU\fP=<MB> |
| Set the default memory to be allocated per CPU for jobs in this partition. |
| The memory size is specified in megabytes. |
| .TP |
\fIDefMemPerNode\fP=<MB>
| Set the default memory to be allocated per node for jobs in this partition. |
| The memory size is specified in megabytes. |
| |
| .TP |
| \fIDisableRootJobs\fP=<yes|no> |
| Specify if jobs can be executed as user root. |
| Possible values are "YES" and "NO". |
| |
| .TP |
| \fIGraceTime\fP=<seconds> |
| Specifies, in units of seconds, the preemption grace time |
| to be extended to a job which has been selected for preemption. |
The default value is zero; no preemption grace time is allowed on
this partition or QOS.
| (Meaningful only for PreemptMode=CANCEL) |
| |
| .TP |
| \fIHidden\fP=<yes|no> |
| Specify if the partition and its jobs should be hidden from view. |
| Hidden partitions will by default not be reported by SLURM APIs |
| or commands. |
| Possible values are "YES" and "NO". |
| |
| .TP |
| \fIMaxMemPerCPU\fP=<MB> |
| Set the maximum memory to be allocated per CPU for jobs in this partition. |
| The memory size is specified in megabytes. |
| .TP |
\fIMaxMemPerNode\fP=<MB>
| Set the maximum memory to be allocated per node for jobs in this partition. |
| The memory size is specified in megabytes. |
| |
| .TP |
| \fIMaxNodes\fP=<count> |
| Set the maximum number of nodes which will be allocated to any single job |
| in the partition. Specify a number, "INFINITE" or "UNLIMITED". (On a |
| Bluegene type system this represents a c\-node count.) |
| Changing the \fIMaxNodes\fP of a partition has no effect upon jobs that |
| have already begun execution. |
| |
| .TP |
| \fIMaxTime\fP=<time> |
| The maximum run time for jobs. |
| Output format is [days\-]hours:minutes:seconds or "UNLIMITED". |
| Input format (for \fBupdate\fR command) is minutes, minutes:seconds, |
| hours:minutes:seconds, days\-hours, days\-hours:minutes or |
| days\-hours:minutes:seconds. |
| Time resolution is one minute and second values are rounded up to |
| the next minute. |
| Changing the \fIMaxTime\fP of a partition has no effect upon jobs that |
| have already begun execution. |
| |
| .TP |
| \fIMinNodes\fP=<count> |
| Set the minimum number of nodes which will be allocated to any single job |
| in the partition. (On a Bluegene type system this represents a c\-node count.) |
| Changing the \fIMinNodes\fP of a partition has no effect upon jobs that |
| have already begun execution. |
| |
| .TP |
| \fINodes\fP=<name> |
| Identify the node(s) to be associated with this partition. Multiple node names |
| may be specified using simple node range expressions (e.g. "lx[10\-20]"). |
| Note that jobs may only be associated with one partition at any time. |
| Specify a blank data value to remove all nodes from a partition: "Nodes=". |
| Changing the \fINodes\fP in a partition has no effect upon jobs that |
| have already begun execution. |
| |
| .TP |
| \fIPartitionName\fP=<name> |
| Identify the partition to be updated. This specification is required. |
| |
| .TP |
| \fIPreemptMode\fP=<mode> |
| Reset the mechanism used to preempt jobs in this partition if \fIPreemptType\fP |
| is configured to \fIpreempt/partition_prio\fP. The default preemption mechanism |
| is specified by the cluster\-wide \fIPreemptMode\fP configuration parameter. |
| Possible values are "OFF", "CANCEL", "CHECKPOINT", "REQUEUE" and "SUSPEND". |
| |
| .TP |
| \fIPriority\fP=<count> |
| Jobs submitted to a higher priority partition will be dispatched |
| before pending jobs in lower priority partitions and if possible |
| they will preempt running jobs from lower priority partitions. |
| Note that a partition's priority takes precedence over a job's |
| priority. |
| The value may not exceed 65533. |
| |
| .TP |
| \fIRootOnly\fP=<yes|no> |
| Specify if only allocation requests initiated by user root will be satisfied. |
| This can be used to restrict control of the partition to some meta\-scheduler. |
| Possible values are "YES" and "NO". |
| |
| .TP |
| \fIReqResv\fP=<yes|no> |
| Specify if only allocation requests designating a reservation will be |
| satisfied. This is used to restrict partition usage to be allowed only |
| within a reservation. |
| Possible values are "YES" and "NO". |
| |
| .TP |
| \fIShared\fP=<yes|no|exclusive|force>[:<job_count>] |
| Specify if nodes in this partition can be shared by multiple jobs. |
| Possible values are "YES", "NO", "EXCLUSIVE" and "FORCE". |
| An optional job count specifies how many jobs can be allocated to use |
| each resource. |
| |
| .TP |
| \fIState\fP=<up|down|drain|inactive> |
| Specify if jobs can be allocated nodes or queued in this partition. |
| Possible values are "UP", "DOWN", "DRAIN" and "INACTIVE". |
| .RS |
| .TP 10 |
| \fIUP\fP |
Designates that new jobs may be queued on the partition, and that
| jobs may be allocated nodes and run from the partition. |
| .TP |
| \fIDOWN\fP |
| Designates that new jobs may be queued on the partition, but |
| queued jobs may not be allocated nodes and run from the partition. Jobs |
| already running on the partition continue to run. The jobs |
| must be explicitly canceled to force their termination. |
| .TP |
| \fIDRAIN\fP |
| Designates that no new jobs may be queued on the partition (job |
| submission requests will be denied with an error message), but jobs |
| already queued on the partition may be allocated nodes and run. |
| See also the "Alternate" partition specification. |
| .TP |
| \fIINACTIVE\fP |
| Designates that no new jobs may be queued on the partition, |
| and jobs already queued may not be allocated nodes and run. |
| See also the "Alternate" partition specification. |
| .RE |
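For illustration, a sketch that creates and later drains a hypothetical
partition (the partition and node names are examples only):
.nf

   scontrol create PartitionName=batch Nodes=lx[10\-20] MaxTime=60 State=UP
   scontrol update PartitionName=batch State=DRAIN

.fi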
| |
| .TP |
| \fBSPECIFICATIONS FOR CREATE, UPDATE, AND DELETE COMMANDS, RESERVATIONS\fR |
.TP
| \fIReservation\fP=<name> |
| Identify the name of the reservation to be created, updated, or deleted. |
| This parameter is required for update and is the only parameter for delete. |
| For create, if you do not want to give a reservation name, use |
| "scontrol create res ..." and a name will be created automatically. |
| |
| .TP |
| \fIAccounts\fP=<account list> |
| List of accounts permitted to use the reserved nodes, for example |
| "Accounts=physcode1,physcode2". |
| A user in any of the accounts may use the reserved nodes. |
| A new reservation must specify Users and/or Accounts. |
| If both Users and Accounts are specified, a job must match both in order to |
| use the reservation. |
| Accounts can also be denied access to reservations by preceding all of the |
| account names with '\-'. Alternately precede the equal sign with '\-'. |
| For example, "Accounts=-physcode1,-physcode2" or "Accounts-=physcode1,physcode2" |
| will permit any account except physcode1 and physcode2 to use the reservation. |
| You can add or remove individual accounts from an existing reservation by |
| using the update command and adding a '+' or '\-' sign before the '=' sign. |
| If accounts are denied access to a reservation (account name preceded by a '\-'), |
| then all other accounts are implicitly allowed to use the reservation and it is |
| not possible to also explicitly specify allowed accounts. |
| |
| .TP |
| \fICoreCnt\fP=<num> |
| Identify number of cores to be reserved. This should only be used for |
| reservations that are less than one node in size. Otherwise use the |
| \fINodeCnt\fP option described below. |
| |
| .TP |
| \fILicenses\fP=<license> |
| Specification of licenses (or other resources available on all |
| nodes of the cluster) which are to be reserved. |
| License names can be followed by a colon and count |
| (the default count is one). |
| Multiple license names should be comma separated (e.g. "Licenses=foo:4,bar"). |
A new reservation must specify one or more resources to be included: NodeCnt,
| Nodes and/or Licenses. |
| If a reservation includes Licenses, but no NodeCnt or Nodes, then the option |
| \fIFlags=LICENSE_ONLY\fP must also be specified. |
| |
| .TP |
| \fINodeCnt\fP=<num>[,num,...] |
| Identify number of nodes to be reserved. The number can include a suffix of |
| "k" or "K", in which case the number specified is multiplied by 1024. |
| On BlueGene systems, this number represents a c\-node (compute node) count and |
| will be rounded up as needed to reserve whole nodes (midplanes). |
| In order to optimize the topology of the resource allocation on a new |
| reservation (not on an updated reservation), specific sizes |
| required for the reservation may be specified. For example, if you want to |
| reserve 4096 c\-nodes on a BlueGene system that can be used to allocate two |
| jobs each with 2048 c\-nodes, specify "NodeCnt=2k,2k". |
A new reservation must specify one or more resources to be included: NodeCnt,
| Nodes and/or Licenses. |
| |
| .TP |
| \fINodes\fP=<name> |
| Identify the node(s) to be reserved. Multiple node names |
| may be specified using simple node range expressions (e.g. "Nodes=lx[10\-20]"). |
| Specify a blank data value to remove all nodes from a reservation: "Nodes=". |
A new reservation must specify one or more resources to be included: NodeCnt,
| Nodes and/or Licenses. A specification of "ALL" will reserve all nodes. Set |
| \fIFlags=PART_NODES\fP and \fIPartitionName=\fP in order for changes in the |
| nodes associated with a partition to also be reflected in the nodes associated |
| with a reservation. |
| |
| .TP |
| \fIStartTime\fP=<time_spec> |
| The start time for the reservation. A new reservation must specify a start |
| time. It accepts times of the form \fIHH:MM:SS\fR for |
| a specific time of day (seconds are optional). |
| (If that time is already past, the next day is assumed.) |
| You may also specify \fImidnight\fR, \fInoon\fR, or |
| \fIteatime\fR (4pm) and you can have a time\-of\-day suffixed |
| with \fIAM\fR or \fIPM\fR for running in the morning or the evening. |
| You can also say what day the job will be run, by specifying |
| a date of the form \fIMMDDYY\fR or \fIMM/DD/YY\fR or \fIMM.DD.YY\fR, |
| or a date and time as \fIYYYY\-MM\-DD[THH:MM[:SS]]\fR. You can also |
| give times like \fInow + count time\-units\fR, where the time\-units |
| can be \fIminutes\fR, \fIhours\fR, \fIdays\fR, or \fIweeks\fR |
| and you can tell SLURM to run the job today with the keyword |
| \fItoday\fR and to run the job tomorrow with the keyword |
| \fItomorrow\fR. |
| |
| .TP |
| \fIEndTime\fP=<time_spec> |
| The end time for the reservation. A new reservation must specify an end |
| time or a duration. Valid formats are the same as for StartTime. |
| |
| .TP |
| \fIDuration\fP=<time> |
| The length of a reservation. A new reservation must specify an end |
| time or a duration. Valid formats are minutes, minutes:seconds, |
| hours:minutes:seconds, days\-hours, days\-hours:minutes, |
| days\-hours:minutes:seconds, or UNLIMITED. Time resolution is one minute and |
| second values are rounded up to the next minute. Output format is always |
| [days\-]hours:minutes:seconds. |
| |
| .TP |
| \fIPartitionName\fP=<name> |
| Identify the partition to be reserved. |
| |
| .TP |
| \fIFlags\fP=<flags> |
| Flags associated with the reservation. |
| You can add or remove individual flags from an existing reservation by |
| adding a '+' or '\-' sign before the '=' sign. For example: |
| Flags\-=DAILY (NOTE: this shortcut is not supported for all flags). |
| Currently supported flags include: |
| .RS |
| .TP 12 |
| \fILICENSE_ONLY\fR |
| This is a reservation for licenses only and not compute nodes. |
| If this flag is set, a job using this reservation may use the associated |
| licenses and any compute nodes. |
| If this flag is not set, a job using this reservation may use only the nodes |
| and licenses associated with the reservation. |
| .TP |
| \fIMAINT\fR |
| Maintenance mode, receives special accounting treatment. |
This reservation is permitted to use resources that are already in another
| reservation. |
| .TP |
| \fIOVERLAP\fR |
| This reservation can be allocated resources that are already in another |
| reservation. |
| .TP |
| \fIIGNORE_JOBS\fR |
| Ignore currently running jobs when creating the reservation. |
| This can be especially useful when reserving all nodes in the system |
| for maintenance. |
| .TP |
| \fIPART_NODES\fR |
| This flag can be used to reserve all nodes within the specified |
| partition. PartitionName and Nodes=ALL must be specified or |
| this option is ignored. |
| .TP |
| \fIDAILY\fR |
| Repeat the reservation at the same time every day |
| .TP |
| \fIWEEKLY\fR |
| Repeat the reservation at the same time every week |
| .TP |
| \fISPEC_NODES\fR |
| Reservation is for specific nodes (output only) |
| .TP |
| \fISTATIC_ALLOC\fR |
Do not change the nodes selected for the reservation after they have been
chosen. Without this option, if a selected node goes down, the reservation
will select a new node to fill the spot.
| .RE |
| |
| .TP |
| \fIFeatures\fP=<features> |
| Set the reservation's required node features. Multiple values |
| may be "&" separated if all features are required (AND operation) or |
| separated by "|" if any of the specified features are required (OR operation). |
| Value may be cleared with blank data value, "Features=". |
| |
| .TP |
| \fIUsers\fP=<user list> |
| List of users permitted to use the reserved nodes, for example |
| "User=jones1,smith2". |
| A new reservation must specify Users and/or Accounts. |
| If both Users and Accounts are specified, a job must match both in order to |
| use the reservation. |
| Users can also be denied access to reservations by preceding all of the |
| user names with '\-'. Alternately precede the equal sign with '\-'. |
| For example, "User=-jones1,-smith2" or "User-=jones1,smith2" |
| will permit any user except jones1 and smith2 to use the reservation. |
| You can add or remove individual users from an existing reservation by |
| using the update command and adding a '+' or '\-' sign before the '=' sign. |
| If users are denied access to a reservation (user name preceded by a '\-'), |
| then all other users are implicitly allowed to use the reservation and it is |
| not possible to also explicitly specify allowed users. |
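For example, to add one user to and remove another from an existing
reservation using this syntax (the reservation and user names are taken from
the text above and are illustrative only):
.br
scontrol update Reservation=dbremer_1 Users+=smith2
.br
scontrol update Reservation=dbremer_1 Users\-=jones1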
| |
| .TP |
\fBSPECIFICATIONS FOR UPDATE BLOCK/SUBMP\fR
| .TP |
| Bluegene systems only! |
| .TP |
| \fIBlockName\fP=<name> |
| Identify the bluegene block to be updated. This specification is required. |
| .TP |
| \fIState\fP=<free|error|recreate|remove|resume> |
This will update the state of a bluegene block
(e.g. update BlockName=RMP0 State=ERROR); see also the example following the
list of states below.
| \fBWARNING!!!!\fR With the exception of the RESUME state, all other |
| state values will cancel any running job on the block! |
| .RS |
| .TP 10 |
| \fIFREE\fP |
| Return the block to a free state. |
| .TP |
| \fIERROR\fP |
Put the block into an error state so that jobs will not be run on it.
| .TP |
| \fIRECREATE\fP |
| Destroy the current block and create a new one to take its place. |
| .TP |
| \fIREMOVE\fP |
Free the block and remove it from the system. If the block is smaller
than a midplane, every block on that midplane will be removed. (Only
available on dynamically laid\-out systems.)
| .TP |
| \fIRESUME\fP |
| If a block is in ERROR state RESUME will return the block to its |
| previous usable state (FREE or READY). |
| .RE |
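For example, the block named in the example above could be taken out of
service and later returned to its previous usable state with:
.br
scontrol update BlockName=RMP0 State=ERROR
.br
scontrol update BlockName=RMP0 State=RESUME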
| |
| .TP |
| \fISubMPName\fP=<name> |
Identify the bluegene ionodes to be updated (e.g. bg000[0\-3]). This
specification is required.
NOTE: Even on BGQ systems, where node names are given in the bg0000[00000]
format, this option takes an ionode name of the form bg0000[0].
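For example, a set of ionodes could be placed into an error state with a
command of the following form (the ionode names are illustrative only):
.br
scontrol update SubMPName=bg000[0\-3] State=ERROR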
| |
| .TP |
| \fBDESCRIPTION FOR SHOW COMMAND, NODES\fR |
| .TP |
| The meaning of the energy information is as follows: |
| |
| .TP |
| \fICurrentWatts\fP |
| The instantaneous power consumption of the node at the time of the last node |
| energy accounting sample, in watts. |
| |
| .TP |
| \fILowestJoules\fP |
| The energy consumed by the node between the last time it was powered on and |
| the last time it was registered by slurmd, in joules. |
| |
| .TP |
| \fIConsumedJoules\fP |
| The energy consumed by the node between the last time it was registered by |
| the slurmd daemon and the last node energy accounting sample, in joules. |
| |
| .PP |
| If the reported value is "n/s" (not supported), the node does not support the |
| configured \fBAcctGatherEnergyType\fR plugin. If the reported value is zero, energy |
| accounting for nodes is disabled. |
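.PP
For example, the energy fields are reported as part of the output of the
show node command (the node name is taken from the EXAMPLES section and is
illustrative only):
.br
scontrol show node snowflake0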
| |
| .SH "ENVIRONMENT VARIABLES" |
| .PP |
| Some \fBscontrol\fR options may |
| be set via environment variables. These environment variables, |
along with their corresponding options, are listed below. (Note:
Command line options will always override these settings.)
| .TP 20 |
| \fBSCONTROL_ALL\fR |
Same as \fB\-a, \-\-all\fR
| .TP |
| \fBSLURM_CLUSTERS\fR |
| Same as \fB\-\-clusters\fR |
| .TP |
| \fBSLURM_CONF\fR |
| The location of the SLURM configuration file. |
| .TP |
| \fBSLURM_TIME_FORMAT\fR |
| Specify the format used to report time stamps. A value of \fIstandard\fR, the |
| default value, generates output in the form "year-month-dateThour:minute:second". |
A value of \fIrelative\fR returns only "hour:minute:second" for time stamps
on the current day.
For other dates in the current year it prints the "hour:minute" preceded by
"Tomorr" (tomorrow), "Ystday" (yesterday), the name of the day for dates in
the coming week (e.g. "Mon", "Tue", etc.), otherwise the date (e.g. "25 Apr").
For other years it returns a date, month, and year without a time (e.g.
"6 Jun 2012").
| Another suggested value is "%a %T" for a day of week and time stamp (e.g. |
| "Mon 12:34:56"). All of the time stamps use a 24 hour format. |
| |
| .SH "AUTHORIZATION" |
| |
When using the SLURM database, users who have an AdminLevel defined (Operator
or Admin) and users who are account coordinators are given the
authority to view and modify jobs, reservations, nodes, etc., as
defined in the following table, regardless of whether a PrivateData
restriction has been defined in the slurm.conf file.
| |
| .br |
| \fBscontrol show job(s): \fR Admin, Operator, Coordinator |
| .br |
| \fBscontrol update job: \fR Admin, Operator, Coordinator |
| .br |
| \fBscontrol requeue: \fR Admin, Operator, Coordinator |
| .br |
| \fBscontrol show step(s): \fR Admin, Operator, Coordinator |
| .br |
| \fBscontrol update step: \fR Admin, Operator, Coordinator |
| .br |
| .sp |
| \fBscontrol show block: \fR Admin, Operator |
| .br |
| \fBscontrol update block: \fR Admin |
| .br |
| .sp |
| \fBscontrol show node: \fR Admin, Operator |
| .br |
| \fBscontrol update node: \fR Admin |
| .br |
| .sp |
| \fBscontrol create partition: \fR Admin |
| .br |
| \fBscontrol show partition: \fR Admin, Operator |
| .br |
| \fBscontrol update partition: \fR Admin |
| .br |
| \fBscontrol delete partition: \fR Admin |
| .br |
| .sp |
| \fBscontrol create reservation:\fR Admin, Operator |
| .br |
| \fBscontrol show reservation: \fR Admin, Operator |
| .br |
| \fBscontrol update reservation:\fR Admin, Operator |
| .br |
| \fBscontrol delete reservation:\fR Admin, Operator |
| .br |
| .sp |
| \fBscontrol reconfig: \fR Admin |
| .br |
| \fBscontrol shutdown: \fR Admin |
| .br |
| \fBscontrol takeover: \fR Admin |
| .br |
| |
| .SH "EXAMPLES" |
| .eo |
| .br |
| # scontrol |
| .br |
| scontrol: show part debug |
| .br |
| PartitionName=debug |
| .br |
| AllocNodes=ALL AllowGroups=ALL Default=YES |
| .br |
| DefaultTime=NONE DisableRootJobs=NO Hidden=NO |
| .br |
| MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 |
| .br |
| Nodes=snowflake[0-48] |
| .br |
| Priority=1 RootOnly=NO Shared=YES:4 |
| .br |
| State=UP TotalCPUs=694 TotalNodes=49 |
| .br |
| scontrol: update PartitionName=debug MaxTime=60:00 MaxNodes=4 |
| .br |
| scontrol: show job 71701 |
| .br |
| JobId=71701 Name=hostname |
| .br |
| UserId=da(1000) GroupId=da(1000) |
| .br |
| Priority=66264 Account=none QOS=normal WCKey=*123 |
| .br |
| JobState=COMPLETED Reason=None Dependency=(null) |
| .br |
| TimeLimit=UNLIMITED Requeue=1 Restarts=0 BatchFlag=0 ExitCode=0:0 |
| .br |
| SubmitTime=2010-01-05T10:58:40 EligibleTime=2010-01-05T10:58:40 |
| .br |
| StartTime=2010-01-05T10:58:40 EndTime=2010-01-05T10:58:40 |
| .br |
| SuspendTime=None SecsPreSuspend=0 |
| .br |
| Partition=debug AllocNode:Sid=snowflake:4702 |
| .br |
| ReqNodeList=(null) ExcNodeList=(null) |
| .br |
| NodeList=snowflake0 |
| .br |
| NumNodes=1 NumCPUs=10 CPUs/Task=2 ReqS:C:T=1:1:1 |
| .br |
| MinCPUsNode=2 MinMemoryNode=0 MinTmpDiskNode=0 |
| .br |
| Features=(null) Reservation=(null) |
| .br |
| Shared=OK Contiguous=0 Licenses=(null) Network=(null) |
| .br |
| scontrol: update JobId=71701 TimeLimit=30:00 Priority=500 |
| .br |
| scontrol: show hostnames tux[1-3] |
| .br |
| tux1 |
| .br |
| tux2 |
| .br |
| tux3 |
| .br |
| scontrol: create res StartTime=2009-04-01T08:00:00 Duration=5:00:00 Users=dbremer NodeCnt=10 |
| .br |
| Reservation created: dbremer_1 |
| .br |
| scontrol: update Reservation=dbremer_1 Flags=Maint NodeCnt=20 |
| .br |
| scontrol: delete Reservation=dbremer_1 |
| .br |
| scontrol: quit |
| .ec |
| |
| .SH "COPYING" |
| Copyright (C) 2002\-2007 The Regents of the University of California. |
| Copyright (C) 2008\-2010 Lawrence Livermore National Security. |
| Portions Copyright (C) 2010 SchedMD <http://www.schedmd.com>. |
Produced at Lawrence Livermore National Laboratory (cf. DISCLAIMER).
| CODE\-OCEC\-09\-009. All rights reserved. |
| .LP |
| This file is part of SLURM, a resource management program. |
| For details, see <http://slurm.schedmd.com/>. |
| .LP |
| SLURM is free software; you can redistribute it and/or modify it under |
| the terms of the GNU General Public License as published by the Free |
| Software Foundation; either version 2 of the License, or (at your option) |
| any later version. |
| .LP |
| SLURM is distributed in the hope that it will be useful, but WITHOUT ANY |
| WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS |
| FOR A PARTICULAR PURPOSE. See the GNU General Public License for more |
| details. |
| .SH "FILES" |
| .LP |
| /etc/slurm.conf |
| .SH "SEE ALSO" |
| \fBscancel\fR(1), \fBsinfo\fR(1), \fBsqueue\fR(1), |
\fBslurm_checkpoint\fR(3),
\fBslurm_create_partition\fR(3),
\fBslurm_delete_partition\fR(3),
\fBslurm_load_ctl_conf\fR(3),
\fBslurm_load_jobs\fR(3), \fBslurm_load_node\fR(3),
\fBslurm_load_partitions\fR(3),
\fBslurm_reconfigure\fR(3), \fBslurm_requeue\fR(3),
\fBslurm_resume\fR(3),
\fBslurm_shutdown\fR(3), \fBslurm_suspend\fR(3),
\fBslurm_takeover\fR(3),
\fBslurm_update_job\fR(3), \fBslurm_update_node\fR(3),
\fBslurm_update_partition\fR(3),
| \fBslurm.conf\fR(5), \fBslurmctld\fR(8) |