| .TH SINFO "1" "May 2008" "sinfo 2.0" "Slurm components" |
| |
| .SH "NAME" |
| sinfo \- view information about SLURM nodes and partitions. |
| |
| .SH "SYNOPSIS" |
| \fBsinfo\fR [\fIOPTIONS\fR...] |
| .SH "DESCRIPTION" |
| \fBsinfo\fR is used to view partition and node information for a |
| system running SLURM. |
| |
| .SH "OPTIONS" |
| |
| .TP |
| \fB\-a\fR, \fB\-\-all\fR |
| Display information about all partitions. This causes information to be |
| displayed about partitions that are configured as hidden and partitions that |
| are unavailable to the user's group. |
| |
| .TP |
| \fB\-b\fR, \fB\-\-bgl\fR |
| Display information about bglblocks (on Blue Gene systems only). |
| |
| .TP |
| \fB\-d\fR, \fB\-\-dead\fR |
| If set, only report state information for non\-responding (dead) nodes. |
| |
| .TP |
| \fB\-e\fR, \fB\-\-exact\fR |
| If set, do not group node information on multiple nodes unless |
| their configurations to be reported are identical. Otherwise, |
| CPU count, memory size, and disk space for nodes will be listed |
| with the minimum value followed by a "+" for nodes with the |
| same partition and state (e.g., "250+"). |
| |
| .TP |
| \fB\-h\fR, \fB\-\-noheader\fR |
| Do not print a header on the output. |
| |
| .TP |
| \fB\-\-help\fR |
| Print a message describing all \fBsinfo\fR options. |
| |
| .TP |
| \fB\-\-hide\fR |
| Do not display information about hidden partitions. This is the default |
| behavior: partitions that are configured as hidden or are not available |
| to the user's group are not displayed. |
| |
| .TP |
| \fB\-i <seconds>\fR, \fB\-\-iterate=<seconds>\fR |
| Print the state on a periodic basis. |
| Sleep for the indicated number of seconds between reports. |
| By default, prints a time stamp with the header. |
| |
| .TP |
| \fB\-l\fR, \fB\-\-long\fR |
| Print more detailed information. |
| This is ignored if the \fB\-\-format\fR option is specified. |
| |
| .TP |
| \fB\-n <nodes>\fR, \fB\-\-nodes=<nodes>\fR |
| Print information only about the specified node(s). |
| Multiple nodes may be comma separated or expressed using a |
| node range expression. For example "linux[00\-07]" would |
| indicate eight nodes, "linux00" through "linux07." |
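| .IP |
| As a rough illustration of how such a range expression expands, the |
| Python sketch below handles a single bracket group holding comma |
| separated numbers or ranges. The helper name \fBexpand_node_range\fR is |
| hypothetical and is not part of SLURM; it only mirrors the |
| "linux[00\-07]" example above and does not split a comma separated list |
| of several expressions. |
| .nf |
```python
def expand_node_range(expr):
    """Expand an expression such as "linux[00-07]" into a list of node names.

    Hypothetical helper: handles one bracket group holding comma-separated
    numbers or low-high ranges, which is all the example in the text needs.
    """
    prefix, bracket, rest = expr.partition("[")
    if not bracket:
        return [expr]  # plain node name, nothing to expand
    body, closing, suffix = rest.partition("]")
    if not closing:
        raise ValueError("unbalanced bracket in " + repr(expr))
    names = []
    for part in body.split(","):
        low, dash, high = part.partition("-")
        if not dash:
            high = low  # a single number is a range of one
        width = len(low)  # keep zero padding, e.g. "00" means two digits
        for number in range(int(low), int(high) + 1):
            names.append(prefix + str(number).zfill(width) + suffix)
    return names


nodes = expand_node_range("linux[00-07]")
print(len(nodes), nodes[0], nodes[-1])
```
| .fi |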
| |
| .TP |
| \fB\-N\fR, \fB\-\-Node\fR |
| Print information in a node\-oriented format. |
| The default is to print information in a partition\-oriented format. |
| This is ignored if the \fB\-\-format\fR option is specified. |
| |
| .TP |
| \fB\-o <output_format>\fR, \fB\-\-format=<output_format>\fR |
| Specify the information to be displayed using an \fBsinfo\fR |
| format string. Format strings transparently used by \fBsinfo\fR |
| when running with various options are |
| .RS |
| .TP 15 |
| .I "default" |
| "%9P %5a %.10l %.5D %6t %N" |
| .TP |
| .I "\-\-summarize" |
| "%9P %5a %.10l %15F %N" |
| .TP |
| .I "\-\-long" |
| "%9P %5a %.10l %.8s %4r %5h %10g %.5D %11T %N" |
| .TP |
| .I "\-\-Node" |
| "%#N %.5D %9P %6t" |
| .TP |
| .I "\-\-long \-\-Node" |
| "%#N %.5D %9P %11T %.4c %.8z %.6m %.8d %.6w %8f %R" |
| .TP |
| .I "\-\-list\-reasons" |
| "%50R %N" |
| .TP |
| .I "\-\-long \-\-list\-reasons" |
| "%50R %6t %N" |
| .RE |
| |
| .IP |
| In the above format strings the use of "#" represents the |
| maximum length of a node list to be printed. |
| .IP |
| The field specifications available include: |
| .RS |
| .TP 4 |
| \fB%a\fR |
| State/availability of a partition |
| .TP |
| \fB%A\fR |
| Number of nodes by state in the format "allocated/idle". |
| Do not use this with a node state option ("%t" or "%T") or |
| the different node states will be placed on separate lines. |
| .TP |
| \fB%c\fR |
| Number of CPUs per node |
| .TP |
| \fB%C\fR |
| Number of CPUs by state in the format |
| "allocated/idle/other/total". Do not use this with a node |
| state option ("%t" or "%T") or the different node states will |
| be placed on separate lines. |
| .TP |
| \fB%d\fR |
| Size of temporary disk space per node in megabytes |
| .TP |
| \fB%D\fR |
| Number of nodes |
| .TP |
| \fB%E\fR |
| The reason a node is unavailable (down, drained, or draining states). |
| This is the same as \fB%R\fR except the entries will be sorted by |
| time rather than the reason string. |
| .TP |
| \fB%f\fR |
| Features associated with the nodes |
| .TP |
| \fB%F\fR |
| Number of nodes by state in the format |
| "allocated/idle/other/total". Do not use this with a node |
| state option ("%t" or "%T") or the different node states will |
| be placed on separate lines. |
| .TP |
| \fB%g\fR |
| Groups which may use the nodes |
| .TP |
| \fB%h\fR |
| Jobs may share nodes, "yes", "no", or "force" |
| .TP |
| \fB%l\fR |
| Maximum time for any job in the format "days\-hours:minutes:seconds" |
| .TP |
| \fB%L\fR |
| Default time for any job in the format "days\-hours:minutes:seconds" |
| .TP |
| \fB%m\fR |
| Size of memory per node in megabytes |
| .TP |
| \fB%N\fR |
| List of node names |
| .TP |
| \fB%P\fR |
| Partition name |
| .TP |
| \fB%r\fR |
| Only user root may initiate jobs, "yes" or "no" |
| .TP |
| \fB%R\fR |
| The reason a node is unavailable (down, drained, draining, |
| fail or failing states) |
| .TP |
| \fB%s\fR |
| Maximum job size in nodes |
| .TP |
| \fB%S\fR |
| Allowed allocating nodes |
| .TP |
| \fB%t\fR |
| State of nodes, compact form |
| .TP |
| \fB%T\fR |
| State of nodes, extended form |
| .TP |
| \fB%w\fR |
| Scheduling weight of the nodes |
| .TP |
| \fB%X\fR |
| Number of sockets per node |
| .TP |
| \fB%Y\fR |
| Number of cores per socket |
| .TP |
| \fB%Z\fR |
| Number of threads per core |
| .TP |
| \fB%z\fR |
| Extended processor information: number of sockets, cores, threads (S:C:T) per node |
| .TP |
| \fB%.<*>\fR |
| right justification of the field |
| .TP |
| \fB%<Number><*>\fR |
| size of field |
| .RE |
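| .IP |
| To make the layout of a field specification concrete, the following |
| Python sketch splits a format string into its parts: the optional "." |
| for right justification, the optional size (or "#"), and the field |
| letter. The helper name \fBparse_format\fR is hypothetical, and the sketch |
| assumes that specifications are separated by white space, as in the |
| format strings listed above. It does not reproduce how \fBsinfo\fR itself |
| parses the string. |
| .nf |
```python
def parse_format(fmt):
    """Split an sinfo-style format string into field descriptions.

    Hypothetical helper: returns one dict per "%" specification with the
    field letter, the requested width (int, "#", or None) and whether the
    field is right justified (a "." directly after the "%").
    """
    fields = []
    for token in fmt.split():
        if not token.startswith("%"):
            raise ValueError("not a field specification: " + token)
        spec = token[1:]
        right = spec.startswith(".")
        if right:
            spec = spec[1:]
        if not spec or not spec[-1].isalpha():
            raise ValueError("missing field letter in: " + token)
        width_text, letter = spec[:-1], spec[-1]
        if width_text == "#":
            width = "#"  # size taken from the longest node list
        elif width_text:
            width = int(width_text)
        else:
            width = None  # no explicit size requested
        fields.append({"field": letter, "width": width, "right": right})
    return fields


for field in parse_format("%9P %5a %.10l %.5D %6t %N"):
    print(field)
```
| .fi |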
| |
| .TP |
| \fB\-p <partition>\fR, \fB\-\-partition=<partition>\fR |
| Print information only about the specified partition. |
| |
| .TP |
| \fB\-r\fR, \fB\-\-responding\fR |
| If set, only report state information for responding nodes. |
| |
| .TP |
| \fB\-R\fR, \fB\-\-list\-reasons\fR |
| List reasons nodes are in the down, drained, fail or failing state. |
| When nodes are in these states SLURM supports optional inclusion |
| of a "reason" string by an administrator. |
| This option will display the first 35 characters of the reason |
| field and list of nodes with that reason for all nodes that are, |
| by default, down, drained, draining or failing. |
| This option may be used with other node filtering options |
| (e.g. \fB\-r\fR, \fB\-d\fR, \fB\-t\fR, \fB\-n\fR), |
| however, combinations of these options that result in a |
| list of nodes that are not down or drained or failing will |
| not produce any output. |
| When used with \fB\-l\fR the output additionally includes |
| the current node state. |
| |
| .TP |
| \fB\-s\fR, \fB\-\-summarize\fR |
| List only a partition state summary with no node state details. |
| This is ignored if the \fB\-\-format\fR option is specified. |
| |
| .TP |
| \fB\-S <sort_list>\fR, \fB\-\-sort=<sort_list>\fR |
| Specification of the order in which records should be reported. |
| This uses the same field specification as the <output_format>. |
| Multiple sorts may be performed by listing multiple sort fields |
| separated by commas. The field specifications may be preceded |
| by "+" or "\-" for ascending (default) and descending order |
| respectively. The partition field specification, "P", may be |
| preceded by a "#" to report partitions in the same order that |
| they appear in SLURM's configuration file, \fBslurm.conf\fR. |
| For example, a sort value of "+P,\-m" requests that records |
| be printed in order of increasing partition name and within a |
| partition by decreasing memory size. The default value of sort |
| is "#P,\-t" (partitions ordered as configured then decreasing |
| node state). If the \fB\-\-Node\fR option is selected, the |
| default sort value is "N" (increasing node name). |
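| .IP |
| The sort syntax described above can be sketched in Python as follows. |
| The helper name \fBparse_sort\fR and the order labels it returns are |
| hypothetical and are not part of SLURM. Treating "#P" as its own |
| "configured" order is an assumption made for illustration. |
| .nf |
```python
def parse_sort(sort_list):
    """Parse a sort value such as "+P,-m" into (field, order) pairs.

    Hypothetical helper. The order is "ascending", "descending", or
    "configured" for "#P" (partitions in slurm.conf order).
    """
    keys = []
    for item in sort_list.split(","):
        order = "ascending"  # "+" is the default direction
        if item[:1] in ("+", "-"):
            if item[0] == "-":
                order = "descending"
            item = item[1:]
        if item.startswith("#"):
            if item != "#P":
                raise ValueError("only the partition field P accepts a leading #: " + item)
            keys.append(("P", "configured"))
        elif len(item) == 1 and item.isalpha():
            keys.append((item, order))
        else:
            raise ValueError("not a sort field: " + repr(item))
    return keys


print(parse_sort("+P,-m"))
print(parse_sort("#P,-t"))
```
| .fi |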
| |
| .TP |
| \fB\-t <states>\fR, \fB\-\-states=<states>\fR |
| List only nodes having the given state(s). Multiple states |
| may be comma separated and the comparison is case insensitive. |
| Possible values include: ALLOC, ALLOCATED, |
| COMP, COMPLETING, DOWN, DRAIN (for node in DRAINING or DRAINED |
| states), DRAINED, DRAINING, FAIL, FAILING, IDLE, MAINT, NO_RESPOND, |
| POWER_SAVE, UNK, and UNKNOWN. |
| By default nodes in the specified state are reported whether |
| they are responding or not. |
| The \fB\-\-dead\fR and \fB\-\-responding\fR options may be |
| used to filter nodes by the responding flag. |
| |
| .TP |
| \fB\-\-usage\fR |
| Print a brief message listing the \fBsinfo\fR options. |
| |
| .TP |
| \fB\-v\fR, \fB\-\-verbose\fR |
| Provide detailed event logging through program execution. |
| |
| .TP |
| \fB\-V\fR, \fB\-\-version\fR |
| Print version information and exit. |
| |
| .SH "OUTPUT FIELD DESCRIPTIONS" |
| .TP |
| \fBAVAIL\fR |
| Partition state: \fBup\fR or \fBdown\fR. |
| .TP |
| \fBCPUS\fR |
| Count of CPUs (processors) on these nodes. |
| .TP |
| \fBS:C:T\fR |
| Count of sockets (S), cores (C), and threads (T) on these nodes. |
| .TP |
| \fBSOCKETS\fR |
| Count of sockets on these nodes. |
| .TP |
| \fBCORES\fR |
| Count of cores on these nodes. |
| .TP |
| \fBTHREADS\fR |
| Count of threads on these nodes. |
| .TP |
| \fBGROUPS\fR |
| Resource allocations in this partition are restricted to the |
| named groups. \fBall\fR indicates that all groups may use |
| this partition. |
| .TP |
| \fBJOB_SIZE\fR |
| Minimum and maximum node count that can be allocated to any |
| user job. A single number indicates the minimum and maximum |
| node count are the same. \fBinfinite\fR is used to identify |
| partitions without a maximum node count. |
| .TP |
| \fBTIMELIMIT\fR |
| Maximum time limit for any user job in |
| days\-hours:minutes:seconds. \fBinfinite\fR is used to identify |
| partitions without a job time limit. |
| .TP |
| \fBMEMORY\fR |
| Size of real memory in megabytes on these nodes. |
| .TP |
| \fBNODELIST\fR or \fBBP_LIST\fR (BlueGene systems only) |
| Names of nodes associated with this configuration/partition. |
| .TP |
| \fBNODES\fR |
| Count of nodes with this particular configuration. |
| .TP |
| \fBNODES(A/I)\fR |
| Count of nodes with this particular configuration by node |
| state in the form "allocated/idle". |
| .TP |
| \fBNODES(A/I/O/T)\fR |
| Count of nodes with this particular configuration by node |
| state in the form "allocated/idle/other/total". |
| .TP |
| \fBPARTITION\fR |
| Name of a partition. Note that the suffix "*" identifies the |
| default partition. |
| .TP |
| \fBROOT\fR |
| Whether the ability to allocate resources in this partition is |
| restricted to user root: \fByes\fR or \fBno\fR. |
| .TP |
| \fBSHARE\fR |
| Whether jobs allocated resources in this partition share those |
| resources. |
| \fBno\fR indicates resources are never shared. |
| \fBexclusive\fR indicates whole nodes are dedicated to jobs |
| (equivalent to srun \-\-exclusive option, may be used even |
| with shared/cons_res managing individual processors). |
| \fBforce\fR indicates resources are always available to be shared. |
| \fByes\fR indicates resources may be shared or not |
| per job's resource allocation. |
| .TP |
| \fBSTATE\fR |
| State of the nodes. |
| Possible states include: allocated, completing, down, |
| drained, draining, fail, failing, idle, and unknown plus |
| their abbreviated forms: alloc, comp, down, drain, drng, |
| fail, failg, idle, and unk respectively. |
| Note that the suffix "*" identifies nodes that are presently |
| not responding. |
| .TP |
| \fBTMP_DISK\fR |
| Size of temporary disk space in megabytes on these nodes. |
| |
| .SH "NODE STATE CODES" |
| .PP |
| Node state codes are shortened as required for the field size. |
| If the node state code is followed by "*", this indicates the |
| node is presently not responding and will not be allocated |
| any new work. If the node remains non\-responsive, it will |
| be placed in the \fBDOWN\fR state (except in the case of |
| \fBCOMPLETING\fR, \fBDRAINED\fR, \fBDRAINING\fR, |
| \fBFAIL\fR, and \fBFAILING\fR nodes). |
| If the node state code is followed by "~", this indicates |
| the node is presently in a power saving mode (typically |
| running at reduced frequency). |
| .TP 12 |
| \fBALLOCATED\fR |
| The node has been allocated to one or more jobs. |
| .TP |
| \fBALLOCATED+\fR |
| The node is allocated to one or more active jobs plus |
| one or more jobs are in the process of COMPLETING. |
| .TP |
| \fBCOMPLETING\fR |
| All jobs associated with this node are in the process of |
| COMPLETING. This node state will be removed when |
| all of the job's processes have terminated and the SLURM |
| epilog program (if any) has terminated. See the \fBEpilog\fR |
| parameter description in the \fBslurm.conf\fR(5) man page for |
| more information. |
| .TP |
| \fBDOWN\fR |
| The node is unavailable for use. SLURM can automatically |
| place nodes in this state if some failure occurs. System |
| administrators may also explicitly place nodes in this state. If |
| a node resumes normal operation, SLURM can automatically |
| return it to service. See the \fBReturnToService\fR |
| and \fBSlurmdTimeout\fR parameter descriptions in the |
| \fBslurm.conf\fR(5) man page for more information. |
| .TP |
| \fBDRAINED\fR |
| The node is unavailable for use per system administrator |
| request. See the \fBupdate node\fR command in the |
| \fBscontrol\fR(1) man page or the \fBslurm.conf\fR(5) man page |
| for more information. |
| .TP |
| \fBDRAINING\fR |
| The node is currently executing a job, but will not be allocated |
| to additional jobs. The node state will be changed to state |
| \fBDRAINED\fR when the last job on it completes. Nodes enter |
| this state per system administrator request. See the \fBupdate |
| node\fR command in the \fBscontrol\fR(1) man page or the |
| \fBslurm.conf\fR(5) man page for more information. |
| .TP |
| \fBFAIL\fR |
| The node is expected to fail soon and is unavailable for |
| use per system administrator request. |
| See the \fBupdate node\fR command in the \fBscontrol\fR(1) |
| man page or the \fBslurm.conf\fR(5) man page for more information. |
| .TP |
| \fBFAILING\fR |
| The node is currently executing a job, but is expected to fail |
| soon and is unavailable for use per system administrator request. |
| See the \fBupdate node\fR command in the \fBscontrol\fR(1) |
| man page or the \fBslurm.conf\fR(5) man page for more information. |
| .TP |
| \fBIDLE\fR |
| The node is not allocated to any jobs and is available for use. |
| .TP |
| \fBMAINT\fR |
| The node is currently in a reservation with a flag value of "maintenance". |
| .TP |
| \fBUNKNOWN\fR |
| The SLURM controller has just started and the node's state |
| has not yet been determined. |
| |
| .SH "ENVIRONMENT VARIABLES" |
| .PP |
| Some \fBsinfo\fR options may |
| be set via environment variables. These environment variables, |
| along with their corresponding options, are listed below. (Note: |
| Command line options will always override these settings.) |
| .TP 20 |
| \fBSINFO_ALL\fR |
| \fB\-a, \-\-all\fR |
| .TP |
| \fBSINFO_FORMAT\fR |
| \fB\-o <output_format>, \-\-format=<output_format>\fR |
| .TP |
| \fBSINFO_PARTITION\fR |
| \fB\-p <partition>, \-\-partition=<partition>\fR |
| .TP |
| \fBSINFO_SORT\fR |
| \fB\-S <sort>, \-\-sort=<sort>\fR |
| .TP |
| \fBSLURM_CONF\fR |
| The location of the SLURM configuration file. |
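| .PP |
| The precedence rule above (a command line option overrides the matching |
| environment variable) can be sketched in Python as it might appear in a |
| wrapper script. The helper name \fBeffective_value\fR and the option keys |
| are hypothetical; only the variable names come from this section. |
| .nf |
```python
import os

# Option name -> environment variable, as listed in this section.
ENV_FOR_OPTION = {
    "all": "SINFO_ALL",
    "format": "SINFO_FORMAT",
    "partition": "SINFO_PARTITION",
    "sort": "SINFO_SORT",
}


def effective_value(option, cli_value=None, environ=None):
    """Return the value a wrapper script would use for one option.

    Hypothetical helper: a command line value wins; otherwise the matching
    environment variable is used; otherwise None (leave sinfo's default).
    """
    if environ is None:
        environ = os.environ
    if cli_value is not None:
        return cli_value
    return environ.get(ENV_FOR_OPTION[option])


env = {"SINFO_FORMAT": "%9P %5a %N"}
print(effective_value("format", environ=env))
print(effective_value("format", cli_value="%N", environ=env))
```
| .fi |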
| |
| .SH "EXAMPLES" |
| .eo |
| Report basic node and partition configurations: |
| |
| .nf |
| |
| > sinfo |
| PARTITION AVAIL TIMELIMIT NODES STATE NODELIST |
| batch up infinite 2 alloc adev[8-9] |
| batch up infinite 6 idle adev[10-15] |
| debug* up 30:00 8 idle adev[0-7] |
| |
| .fi |
| |
| Report partition summary information: |
| .nf |
| |
| > sinfo -s |
| PARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST |
| batch up infinite 2/6/0/8 adev[8-15] |
| debug* up 30:00 0/8/0/8 adev[0-7] |
| |
| .fi |
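| |
| The NODES(A/I/O/T) column shown above packs four counts into one field. |
| A minimal Python sketch of splitting it follows; the helper name |
| parse_node_counts is hypothetical and is not part of SLURM. |
| .nf |
```python
def parse_node_counts(text):
    """Split an "allocated/idle/other/total" count such as "2/6/0/8"."""
    allocated, idle, other, total = (int(part) for part in text.split("/"))
    return {"allocated": allocated, "idle": idle, "other": other, "total": total}


counts = parse_node_counts("2/6/0/8")
print(counts["idle"], "of", counts["total"], "nodes idle")
```
| .fi |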
| |
| Report more complete information about the partition debug: |
| .nf |
| |
| > sinfo --long --partition=debug |
| PARTITION AVAIL TIMELIMIT JOB_SIZE ROOT SHARE GROUPS NODES STATE NODELIST |
| debug* up 30:00 8 no no all 8 idle adev[0-7] |
| .fi |
| |
| Report only those nodes that are in state DRAINED: |
| .nf |
| |
| > sinfo --states=drained |
| PARTITION AVAIL NODES TIMELIMIT STATE NODELIST |
| debug* up 2 30:00 drain adev[6-7] |
| |
| .fi |
| |
| Report node-oriented information with details and exact matches: |
| .nf |
| |
| > sinfo -Nel |
| NODELIST NODES PARTITION STATE CPUS MEMORY TMP_DISK WEIGHT FEATURES REASON |
| adev[0-1] 2 debug* idle 2 3448 38536 16 (null) (null) |
| adev[2,4-7] 5 debug* idle 2 3384 38536 16 (null) (null) |
| adev3 1 debug* idle 2 3394 38536 16 (null) (null) |
| adev[8-9] 2 batch allocated 2 246 82306 16 (null) (null) |
| adev[10-15] 6 batch idle 2 246 82306 16 (null) (null) |
| |
| .fi |
| |
| Report only down, drained and draining nodes and their reason field: |
| .nf |
| |
| > sinfo -R |
| REASON NODELIST |
| Memory errors adev[0,5] |
| Not Responding adev8 |
| |
| .fi |
| .ec |
| |
| .SH "COPYING" |
| Copyright (C) 2002\-2007 The Regents of the University of California. |
| Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER). |
| CODE\-OCEC\-09\-009. All rights reserved. |
| .LP |
| This file is part of SLURM, a resource management program. |
| For details, see <https://computing.llnl.gov/linux/slurm/>. |
| .LP |
| SLURM is free software; you can redistribute it and/or modify it under |
| the terms of the GNU General Public License as published by the Free |
| Software Foundation; either version 2 of the License, or (at your option) |
| any later version. |
| .LP |
| SLURM is distributed in the hope that it will be useful, but WITHOUT ANY |
| WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS |
| FOR A PARTICULAR PURPOSE. See the GNU General Public License for more |
| details. |
| |
| .SH "SEE ALSO" |
| \fBscontrol\fR(1), \fBsmap\fR(1), \fBsqueue\fR(1), |
| \fBslurm_load_ctl_conf\fR(3), \fBslurm_load_jobs\fR(3), \fBslurm_load_node\fR(3), |
| \fBslurm_load_partitions\fR(3), |
| \fBslurm_reconfigure\fR(3), \fBslurm_shutdown\fR(3), |
| \fBslurm_update_job\fR(3), \fBslurm_update_node\fR(3), |
| \fBslurm_update_partition\fR(3), |
| \fBslurm.conf\fR(5) |