| .TH "salloc" "1" "SLURM 2.5" "October 2012" "SLURM Commands" |
| |
| .SH "NAME" |
| salloc \- Obtain a SLURM job allocation (a set of nodes), execute a command, |
| and then release the allocation when the command is finished. |
| |
| .SH "SYNOPSIS" |
| salloc [\fIoptions\fP] [<\fIcommand\fP> [\fIcommand args\fR]] |
| |
| .SH "DESCRIPTION" |
| salloc is used to allocate a SLURM job allocation, which is a set of resources |
| (nodes), possibly with some set of constraints (e.g. number of processors per |
| node). When salloc successfully obtains the requested allocation, it then runs |
| the command specified by the user. Finally, when the user specified command is |
| complete, salloc relinquishes the job allocation. |
| |
| The command may be any program the user wishes. Some typical commands are |
| xterm, a shell script containing srun commands, and srun (see the EXAMPLES |
| section). If no command is specified, then the value of |
| \fBSallocDefaultCommand\fR in slurm.conf is used. If |
| \fBSallocDefaultCommand\fR is not set, then \fBsalloc\fR runs the |
| user's default shell. |
| |
| The following document describes the influence of various options on the |
| allocation of cpus to jobs and tasks. |
| .br |
| http://slurm.schedmd.com/cpu_management.html |
| |
| NOTE: The salloc logic includes support to save and restore the terminal line |
| settings and is designed to be executed in the foreground. If you need to |
| execute salloc in the background, set its standard input to some file, for |
| example: "salloc \-n16 a.out </dev/null &" |
| |
| .SH "OPTIONS" |
| .LP |
| |
| .TP |
| \fB\-A\fR, \fB\-\-account\fR=<\fIaccount\fR> |
| Charge resources used by this job to specified account. |
| The \fIaccount\fR is an arbitrary string. The account name may |
| be changed after job submission using the \fBscontrol\fR |
| command. |
| |
| .TP |
| \fB\-\-acctg\-freq\fR=<\fIseconds\fR> |
| Define the job accounting sampling interval. |
| This can be used to override the \fIJobAcctGatherFrequency\fR parameter in SLURM's |
| configuration file, \fIslurm.conf\fR. |
| A value of zero disables the periodic job sampling and provides accounting |
| information only on job termination (reducing SLURM interference with the job). |
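| For example, to sample accounting data every 30 seconds for an allocation |
| (the node count and script name here are illustrative): |
| .nf |
| salloc \-N2 \-\-acctg\-freq=30 ./my_script.sh |
| .fi |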
| |
| .TP |
| \fB\-B\fR, \fB\-\-extra\-node\-info\fR=<\fIsockets\fR[:\fIcores\fR[:\fIthreads\fR]]> |
| Request a specific allocation of resources with details as to the |
| number and type of computational resources within a cluster: |
| number of sockets (or physical processors) per node, |
| cores per socket, and threads per core. |
| The total amount of resources being requested is the product of all of |
| the terms. |
| Each value specified is considered a minimum. |
| An asterisk (*) can be used as a placeholder indicating that all available |
| resources of that type are to be utilized. |
| As with nodes, the individual levels can also be specified in separate |
| options if desired: |
| .nf |
| \fB\-\-sockets\-per\-node\fR=<\fIsockets\fR> |
| \fB\-\-cores\-per\-socket\fR=<\fIcores\fR> |
| \fB\-\-threads\-per\-core\fR=<\fIthreads\fR> |
| .fi |
| If task/affinity plugin is enabled, then specifying an allocation in this |
| manner also sets a default \fB\-\-cpu_bind\fR option of \fIthreads\fR |
| if the \fB\-B\fR option specifies a thread count, otherwise an option of |
| \fIcores\fR if a core count is specified, otherwise an option of \fIsockets\fR. |
| If SelectType is configured to select/cons_res, it must have a parameter of |
| CR_Core, CR_Core_Memory, CR_Socket, or CR_Socket_Memory for this option |
| to be honored. |
| This option is not supported on BlueGene systems (select/bluegene plugin |
| is configured). |
| If not specified, the scontrol show job will display 'ReqS:C:T=*:*:*'. |
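| For example, to restrict selection to nodes with at least two sockets, four |
| cores per socket and two threads per core (the values and script name are |
| illustrative): |
| .nf |
| salloc \-B 2:4:2 ./my_script.sh |
| .fi |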
| |
| .TP |
| \fB\-\-begin\fR=<\fItime\fR> |
| Submit the job to the SLURM controller immediately, like normal, but |
| tell the controller to defer the allocation of the job until the specified time. |
| |
| Time may be of the form \fIHH:MM:SS\fR to run a job at |
| a specific time of day (seconds are optional). |
| (If that time is already past, the next day is assumed.) |
| You may also specify \fImidnight\fR, \fInoon\fR, or |
| \fIteatime\fR (4pm) and you can have a time\-of\-day suffixed |
| with \fIAM\fR or \fIPM\fR for running in the morning or the evening. |
| You can also say what day the job will be run by specifying |
| a date of the form \fIMMDDYY\fR or \fIMM/DD/YY\fR or |
| \fIYYYY\-MM\-DD\fR. Combine date and time using the following |
| format \fIYYYY\-MM\-DD[THH:MM[:SS]]\fR. You can also |
| give times like \fInow + count time\-units\fR, where the time\-units |
| can be \fIseconds\fR (default), \fIminutes\fR, \fIhours\fR, |
| \fIdays\fR, or \fIweeks\fR and you can tell SLURM to run |
| the job today with the keyword \fItoday\fR and to run the |
| job tomorrow with the keyword \fItomorrow\fR. |
| The value may be changed after job submission using the |
| \fBscontrol\fR command. |
| For example: |
| .nf |
| \-\-begin=16:00 |
| \-\-begin=now+1hour |
| \-\-begin=now+60 (seconds by default) |
| \-\-begin=2010\-01\-20T12:34:00 |
| .fi |
| |
| .RS |
| .PP |
| Notes on date/time specifications: |
| \- Although the 'seconds' field of the HH:MM:SS time specification is |
| allowed by the code, note that the poll time of the SLURM scheduler |
| is not precise enough to guarantee dispatch of the job on the exact |
| second. The job will be eligible to start on the next poll |
| following the specified time. The exact poll interval depends on the |
| SLURM scheduler (e.g., 60 seconds with the default sched/builtin). |
| \- If no time (HH:MM:SS) is specified, the default is (00:00:00). |
| \- If a date is specified without a year (e.g., MM/DD) then the current |
| year is assumed, unless the combination of MM/DD and HH:MM:SS has |
| already passed for that year, in which case the next year is used. |
| .RE |
| |
| .TP |
| \fB\-\-bell\fR |
| Force salloc to ring the terminal bell when the job allocation is granted |
| (and only if stdout is a tty). By default, salloc only rings the bell |
| if the allocation is pending for more than ten seconds (and only if stdout |
| is a tty). Also see the option \fB\-\-no\-bell\fR. |
| |
| .TP |
| \fB\-\-comment\fR=<\fIstring\fR> |
| An arbitrary comment. |
| |
| .TP |
| \fB\-C\fR, \fB\-\-constraint\fR=<\fIlist\fR> |
| Nodes can have \fBfeatures\fR assigned to them by the SLURM administrator. |
| Users can specify which of these \fBfeatures\fR are required by their job |
| using the constraint option. |
| Only nodes having features matching the job constraints will be used to |
| satisfy the request. |
| Multiple constraints may be specified with AND, OR, exclusive OR, |
| resource counts, etc. |
| Supported \fBconstraint\fR options include: |
| .PD 1 |
| .RS |
| .TP |
| \fBSingle Name\fR |
| Only nodes which have the specified feature will be used. |
| For example, \fB\-\-constraint="intel"\fR |
| .TP |
| \fBNode Count\fR |
| A request can specify the number of nodes needed with some feature |
| by appending an asterisk and count after the feature name. |
| For example "\fB\-\-nodes=16 \-\-constraint=graphics*4 ..."\fR |
| indicates that the job requires 16 nodes and that at least four of those |
| nodes must have the feature "graphics." |
| .TP |
| \fBAND\fR |
| Only nodes with all of the specified features will be used. |
| The ampersand is used for an AND operator. |
| For example, \fB\-\-constraint="intel&gpu"\fR |
| .TP |
| \fBOR\fR |
| Only nodes with at least one of the specified features will be used. |
| The vertical bar is used for an OR operator. |
| For example, \fB\-\-constraint="intel|amd"\fR |
| .TP |
| \fBExclusive OR\fR |
| If only one of a set of possible options should be used for all allocated |
| nodes, then use the OR operator and enclose the options within square brackets. |
| For example: "\fB\-\-constraint=[rack1|rack2|rack3|rack4]"\fR might |
| be used to specify that all nodes must be allocated on a single rack of |
| the cluster, but any of those four racks can be used. |
| .TP |
| \fBMultiple Counts\fR |
| Specific counts of multiple resources may be specified by using the AND |
| operator and enclosing the options within square brackets. |
| For example: "\fB\-\-constraint=[rack1*2&rack2*4]"\fR might |
| be used to specify that two nodes must be allocated from nodes with the feature |
| of "rack1" and four nodes must be allocated from nodes with the feature |
| "rack2". |
| .RE |
| |
| .TP |
| \fB\-\-contiguous\fR |
| If set, then the allocated nodes must form a contiguous set. |
| Not honored with the \fBtopology/tree\fR or \fBtopology/3d_torus\fR |
| plugins, both of which can modify the node ordering. |
| |
| .TP |
| \fB\-\-cores\-per\-socket\fR=<\fIcores\fR> |
| Restrict node selection to nodes with at least the specified number of |
| cores per socket. See additional information under \fB\-B\fR option |
| above when task/affinity plugin is enabled. |
| |
| .TP |
| \fB\-\-cpu_bind\fR=[{\fIquiet,verbose\fR},]\fItype\fR |
| Bind tasks to CPUs. |
| Used only when the task/affinity or task/cgroup plugin is enabled. |
| The configuration parameter \fBTaskPluginParam\fR may override these options. |
| For example, if \fBTaskPluginParam\fR is configured to bind to cores, |
| your job will not be able to bind tasks to sockets. |
| NOTE: To have SLURM always report on the selected CPU binding for all |
| commands executed in a shell, you can enable verbose mode by setting |
| the SLURM_CPU_BIND environment variable value to "verbose". |
| |
| The following informational environment variables are set when \fB\-\-cpu_bind\fR |
| is in use: |
| .nf |
| SLURM_CPU_BIND_VERBOSE |
| SLURM_CPU_BIND_TYPE |
| SLURM_CPU_BIND_LIST |
| .fi |
| |
| See the \fBENVIRONMENT VARIABLES\fR section for a more detailed description |
| of the individual SLURM_CPU_BIND* variables. |
| |
| When using \fB\-\-cpus\-per\-task\fR to run multithreaded tasks, be aware that |
| CPU binding is inherited from the parent of the process. This means that |
| the multithreaded task should either specify or clear the CPU binding |
| itself to avoid having all threads of the multithreaded task use the same |
| mask/CPU as the parent. Alternatively, fat masks (masks which specify more |
| than one allowed CPU) could be used for the tasks in order to provide |
| multiple CPUs for the multithreaded tasks. |
| |
| By default, a job step has access to every CPU allocated to the job. |
| To ensure that distinct CPUs are allocated to each job step, use the |
| \fB\-\-exclusive\fR option. |
| |
| If the job step allocation includes an allocation with a number of |
| sockets, cores, or threads equal to the number of tasks to be started |
| then the tasks will by default be bound to the appropriate resources. |
| Disable this mode of operation by explicitly setting "\-\-cpu_bind=none". |
| |
| Note that a job step can be allocated different numbers of CPUs on each node |
| or be allocated CPUs not starting at location zero. Therefore one of the |
| options which automatically generate the task binding is recommended. |
| Explicitly specified masks or bindings are only honored when the job step |
| has been allocated every available CPU on the node. |
| |
| Binding a task to a NUMA locality domain means to bind the task to the set of |
| CPUs that belong to the NUMA locality domain or "NUMA node". |
| If NUMA locality domain options are used on systems with no NUMA support, then |
| each socket is considered a locality domain. |
| |
| Supported options include: |
| .PD 1 |
| .RS |
| .TP |
| .B q[uiet] |
| Quietly bind before task runs (default) |
| .TP |
| .B v[erbose] |
| Verbosely report binding before task runs |
| .TP |
| .B no[ne] |
| Do not bind tasks to CPUs (default) |
| .TP |
| .B rank |
| Automatically bind by task rank. |
| Task zero is bound to socket (or core or thread) zero, etc. |
| Not supported unless the entire node is allocated to the job. |
| .TP |
| .B map_cpu:<list> |
| Bind by mapping CPU IDs to tasks as specified |
| where <list> is <cpuid1>,<cpuid2>,...<cpuidN>. |
| CPU IDs are interpreted as decimal values unless they are preceded |
| with '0x' in which case they are interpreted as hexadecimal values. |
| Not supported unless the entire node is allocated to the job. |
| .TP |
| .B mask_cpu:<list> |
| Bind by setting CPU masks on tasks as specified |
| where <list> is <mask1>,<mask2>,...<maskN>. |
| CPU masks are \fBalways\fR interpreted as hexadecimal values but can be |
| preceded with an optional '0x'. |
| .TP |
| .B sockets |
| Automatically generate masks binding tasks to sockets. |
| Only the CPUs on the socket which have been allocated to the job will be used. |
| If the number of tasks differs from the number of allocated sockets |
| this can result in sub\-optimal binding. |
| .TP |
| .B cores |
| Automatically generate masks binding tasks to cores. |
| If the number of tasks differs from the number of allocated cores |
| this can result in sub\-optimal binding. |
| .TP |
| .B threads |
| Automatically generate masks binding tasks to threads. |
| If the number of tasks differs from the number of allocated threads |
| this can result in sub\-optimal binding. |
| .TP |
| .B ldoms |
| Automatically generate masks binding tasks to NUMA locality domains. |
| If the number of tasks differs from the number of allocated locality domains |
| this can result in sub\-optimal binding. |
| .TP |
| .B help |
| Show this help message |
| .RE |
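| As an illustrative sketch, the following requests an allocation and sets a |
| default verbose core binding to be used by srun commands launched within it |
| (the task count and program name are assumptions): |
| .nf |
| salloc \-N1 \-n4 \-\-cpu_bind=verbose,cores srun ./my_app |
| .fi |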
| |
| .TP |
| \fB\-c\fR, \fB\-\-cpus\-per\-task\fR=<\fIncpus\fR> |
| Advise the SLURM controller that ensuing job steps will require \fIncpus\fR |
| number of processors per task. Without this option, the controller will |
| just try to allocate one processor per task. |
| |
| For instance, |
| consider an application that has 4 tasks, each requiring 3 processors. If our |
| cluster is comprised of quad\-processor nodes and we simply ask for |
| 12 processors, the controller might give us only 3 nodes. However, by using |
| the \-\-cpus\-per\-task=3 option, the controller knows that each task requires |
| 3 processors on the same node, and the controller will grant an allocation |
| of 4 nodes, one for each of the 4 tasks. |
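| For example, the scenario above could be requested as follows (the program |
| name is illustrative): |
| .nf |
| salloc \-n4 \-c3 srun ./my_app |
| .fi |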
| |
| .TP |
| \fB\-d\fR, \fB\-\-dependency\fR=<\fIdependency_list\fR> |
| Defer the start of this job until the specified dependencies have been |
| satisfied. |
| <\fIdependency_list\fR> is of the form |
| <\fItype:job_id[:job_id][,type:job_id[:job_id]]\fR>. |
| Many jobs can share the same dependency and these jobs may even belong to |
| different users. The value may be changed after job submission using the |
| scontrol command. |
| .PD |
| .RS |
| .TP |
| \fBafter:job_id[:jobid...]\fR |
| This job can begin execution after the specified jobs have begun |
| execution. |
| .TP |
| \fBafterany:job_id[:jobid...]\fR |
| This job can begin execution after the specified jobs have terminated. |
| .TP |
| \fBafternotok:job_id[:jobid...]\fR |
| This job can begin execution after the specified jobs have terminated |
| in some failed state (non-zero exit code, node failure, timed out, etc). |
| .TP |
| \fBafterok:job_id[:jobid...]\fR |
| This job can begin execution after the specified jobs have successfully |
| executed (ran to completion with an exit code of zero). |
| .TP |
| \fBexpand:job_id\fR |
| Resources allocated to this job should be used to expand the specified job. |
| The job to expand must share the same QOS (Quality of Service) and partition. |
| Gang scheduling of resources in the partition is also not supported. |
| .TP |
| \fBsingleton\fR |
| This job can begin execution after any previously launched jobs |
| sharing the same job name and user have terminated. |
| .RE |
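| For example, to obtain an allocation only after a previously submitted job |
| has completed successfully (the job id and command are illustrative): |
| .nf |
| salloc \-\-dependency=afterok:65540 \-N1 ./postprocess.sh |
| .fi |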
| |
| .TP |
| \fB\-D\fR, \fB\-\-chdir\fR=<\fIpath\fR> |
| Change directory to \fIpath\fR before beginning execution. |
| |
| .TP |
| \fB\-\-exclusive\fR |
| The job allocation can not share nodes with other running jobs. |
| This is the opposite of \-\-share; whichever option is seen last |
| on the command line will be used. The default shared/exclusive |
| behavior depends on system configuration and the partition's \fBShared\fR |
| option takes precedence over the job's option. |
| |
| .TP |
| \fB\-F\fR, \fB\-\-nodefile\fR=<\fInode file\fR> |
| Much like \-\-nodelist, but the list is contained in a file of name |
| \fInode file\fR. The node names of the list may also span multiple lines |
| in the file. Duplicate node names in the file will be ignored. |
| The order of the node names in the list is not important; the node names |
| will be sorted by SLURM. |
| |
| .TP |
| \fB\-\-get\-user\-env\fR[=\fItimeout\fR][\fImode\fR] |
| This option will load login environment variables for the user specified |
| in the \fB\-\-uid\fR option. |
| The environment variables are retrieved by running something of this sort |
| "su \- <username> \-c /usr/bin/env" and parsing the output. |
| Be aware that any environment variables already set in salloc's environment |
| will take precedence over any environment variables in the user's |
| login environment. |
| The optional \fItimeout\fR value is in seconds. Default value is 3 seconds. |
| The optional \fImode\fR value controls the "su" options. |
| With a \fImode\fR value of "S", "su" is executed without the "\-" option. |
| With a \fImode\fR value of "L", "su" is executed with the "\-" option, |
| replicating the login environment. |
| If \fImode\fR is not specified, the mode established at SLURM build time |
| is used. |
| Examples of use include "\-\-get\-user\-env", "\-\-get\-user\-env=10", |
| "\-\-get\-user\-env=10L", and "\-\-get\-user\-env=S". |
| NOTE: This option only works if the caller has an |
| effective uid of "root". |
| This option was originally created for use by Moab. |
| |
| .TP |
| \fB\-\-gid\fR=<\fIgroup\fR> |
| Submit the job with the specified \fIgroup\fR's group access permissions. |
| \fIgroup\fR may be the group name or the numerical group ID. |
| In the default Slurm configuration, this option is only valid when used |
| by the user root. |
| |
| .TP |
| \fB\-\-gres\fR=<\fIlist\fR> |
| Specifies a comma delimited list of generic consumable resources. |
| The format of each entry on the list is "name[:count[*cpu]]". |
| The name is that of the consumable resource. |
| The count is the number of those resources with a default value of 1. |
| The specified resources will be allocated to the job on each node |
| allocated unless "*cpu" is appended, in which case the resources |
| will be allocated on a per cpu basis. |
| The available generic consumable resources are configurable by the system |
| administrator. |
| A list of available generic consumable resources will be printed and the |
| command will exit if the option argument is "help". |
| Examples of use include "\-\-gres=gpus:2*cpu,disk:40G" and "\-\-gres=help". |
| |
| .TP |
| \fB\-H, \-\-hold\fR |
| Specify the job is to be submitted in a held state (priority of zero). |
| A held job can now be released using scontrol to reset its priority |
| (e.g. "\fIscontrol release <job_id>\fR"). |
| |
| .TP |
| \fB\-h\fR, \fB\-\-help\fR |
| Display help information and exit. |
| |
| .TP |
| \fB\-\-hint\fR=<\fItype\fR> |
| Bind tasks according to application hints |
| .RS |
| .TP |
| .B compute_bound |
| Select settings for compute bound applications: |
| use all cores in each socket, one thread per core |
| .TP |
| .B memory_bound |
| Select settings for memory bound applications: |
| use only one core in each socket, one thread per core |
| .TP |
| .B [no]multithread |
| [don't] use extra threads with in-core multi-threading |
| which can benefit communication intensive applications |
| .TP |
| .B help |
| show this help message |
| .RE |
| |
| .TP |
| \fB\-I\fR, \fB\-\-immediate\fR[=<\fIseconds\fR>] |
| Exit if resources are not available within the |
| time period specified. |
| If no argument is given, resources must be available immediately |
| for the request to succeed. |
| By default, \fB\-\-immediate\fR is off, and the command |
| will block until resources become available. Since this option's |
| argument is optional, for proper parsing the single letter option must |
| be followed immediately with the value and not include a space between |
| them. For example "\-I60" and not "\-I 60". |
| |
| .TP |
| \fB\-J\fR, \fB\-\-job\-name\fR=<\fIjobname\fR> |
| Specify a name for the job allocation. The specified name will appear along with |
| the job id number when querying running jobs on the system. The default job |
| name is the name of the "command" specified on the command line. |
| |
| .TP |
| \fB\-\-jobid\fR=<\fIjobid\fR> |
| Allocate resources as the specified job id. |
| NOTE: Only valid for user root. |
| |
| .TP |
| \fB\-K\fR, \fB\-\-kill\-command\fR[=\fIsignal\fR] |
| salloc always runs a user\-specified command once the allocation is |
| granted. salloc will wait indefinitely for that command to exit. |
| If you specify the \-\-kill\-command option salloc will send a signal to |
| your command any time that the SLURM controller tells salloc that its job |
| allocation has been revoked. The job allocation can be revoked for a |
| couple of reasons: someone used \fBscancel\fR to revoke the allocation, |
| or the allocation reached its time limit. If you do not specify a signal |
| name or number and SLURM is configured to signal the spawned command at job |
| termination, the default signal is SIGHUP for interactive and SIGTERM for |
| non\-interactive sessions. Since this option's argument is optional, |
| for proper parsing the single letter option must be followed |
| immediately with the value and not include a space between them. For |
| example "\-K1" and not "\-K 1". |
| |
| .TP |
| \fB\-k\fR, \fB\-\-no\-kill\fR |
| Do not automatically terminate a job if one of the nodes it has been |
| allocated fails. The user will assume the responsibilities for fault\-tolerance |
| should a node fail. When there is a node failure, any active job steps (usually |
| MPI jobs) on that node will almost certainly suffer a fatal error, but with |
| \-\-no\-kill, the job allocation will not be revoked so the user may launch |
| new job steps on the remaining nodes in their allocation. |
| |
| By default SLURM terminates the entire job allocation if any node fails in its |
| range of allocated nodes. |
| |
| .TP |
| \fB\-L\fR, \fB\-\-licenses\fR=<\fIlicense\fR> |
| Specification of licenses (or other resources available on all |
| nodes of the cluster) which must be allocated to this job. |
| License names can be followed by a colon and count |
| (the default count is one). |
| Multiple license names should be comma separated (e.g. |
| "\-\-licenses=foo:4,bar"). |
| |
| .TP |
| \fB\-m\fR, \fB\-\-distribution\fR= |
| <\fIblock\fR|\fIcyclic\fR|\fIarbitrary\fR|\fIplane=<options>\fR[:\fIblock\fR|\fIcyclic\fR]> |
| |
| Specify alternate distribution methods for remote processes. |
| In salloc, this only sets environment variables that will be used by |
| subsequent srun requests. |
| This option controls the assignment of tasks to the nodes on which |
| resources have been allocated, and the distribution of those resources |
| to tasks for binding (task affinity). The first distribution |
| method (before the ":") controls the distribution of resources across |
| nodes. The optional second distribution method (after the ":") |
| controls the distribution of resources across sockets within a node. |
| Note that with select/cons_res, the number of cpus allocated on each |
| socket and node may be different. Refer to |
| http://slurm.schedmd.com/mc_support.html |
| for more information on resource allocation, assignment of tasks to |
| nodes, and binding of tasks to CPUs. |
| .RS |
| |
| First distribution method: |
| .TP |
| .B block |
| The block distribution method will distribute tasks to a node such |
| that consecutive tasks share a node. For example, consider an |
| allocation of three nodes each with two cpus. A four\-task block |
| distribution request will distribute those tasks to the nodes with |
| tasks one and two on the first node, task three on the second node, |
| and task four on the third node. Block distribution is the default |
| behavior if the number of tasks exceeds the number of allocated nodes. |
| .TP |
| .B cyclic |
| The cyclic distribution method will distribute tasks to a node such |
| that consecutive tasks are distributed over consecutive nodes (in a |
| round\-robin fashion). For example, consider an allocation of three |
| nodes each with two cpus. A four\-task cyclic distribution request |
| will distribute those tasks to the nodes with tasks one and four on |
| the first node, task two on the second node, and task three on the |
| third node. |
| Note that when SelectType is select/cons_res, the same number of CPUs |
| may not be allocated on each node. Task distribution will be |
| round\-robin among all the nodes with CPUs yet to be assigned to tasks. |
| Cyclic distribution is the default behavior if the number |
| of tasks is no larger than the number of allocated nodes. |
| .TP |
| .B plane |
| The tasks are distributed in blocks of a specified size. The options |
| include a number representing the size of the task block. This is |
| followed by an optional specification of the task distribution scheme |
| within a block of tasks and between the blocks of tasks. For more |
| details (including examples and diagrams), please see |
| .br |
| http://slurm.schedmd.com/mc_support.html |
| .br |
| and |
| .br |
| http://slurm.schedmd.com/dist_plane.html |
| .TP |
| .B arbitrary |
| The arbitrary method of distribution will allocate processes in\-order |
| as listed in the file designated by the environment variable |
| SLURM_HOSTFILE. If this variable is set, it will override any |
| other method specified. If not set, the method will default to block. |
| The hostfile must contain at minimum the number of hosts |
| requested, listed one per line or comma separated. If specifying a |
| task count (\fB\-n\fR, \fB\-\-ntasks\fR=<\fInumber\fR>), your tasks |
| will be laid out on the nodes in the order of the file. |
| .br |
| \fBNOTE:\fR The arbitrary distribution option on a job allocation only |
| controls the nodes to be allocated to the job and not the allocation of |
| CPUs on those nodes. This option is meant primarily to control a job step's |
| task layout in an existing job allocation for the srun command. |
| |
| .TP |
| Second distribution method: |
| .TP |
| .B block |
| The block distribution method will distribute tasks to sockets such |
| that consecutive tasks share a socket. |
| .TP |
| .B cyclic |
| The cyclic distribution method will distribute tasks to sockets such |
| that consecutive tasks are distributed over consecutive sockets (in a |
| round\-robin fashion). |
| .RE |
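| For example, to have subsequent srun commands in the allocation use a cyclic |
| distribution across nodes (the node and task counts and program name are |
| illustrative): |
| .nf |
| salloc \-N3 \-n4 \-\-distribution=cyclic srun ./my_app |
| .fi |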
| |
| .TP |
| \fB\-\-mail\-type\fR=<\fItype\fR> |
| Notify user by email when certain event types occur. |
| Valid \fItype\fR values are BEGIN, END, FAIL, REQUEUE, and ALL (any state |
| change). The user to be notified is indicated with \fB\-\-mail\-user\fR. |
| |
| .TP |
| \fB\-\-mail\-user\fR=<\fIuser\fR> |
| User to receive email notification of state changes as defined by |
| \fB\-\-mail\-type\fR. |
| The default value is the submitting user. |
| |
| .TP |
| \fB\-\-mem\fR=<\fIMB\fR> |
| Specify the real memory required per node in MegaBytes. |
| Default value is \fBDefMemPerNode\fR and the maximum value is |
| \fBMaxMemPerNode\fR. If configured, both parameters can be |
| seen using the \fBscontrol show config\fR command. |
| This parameter would generally be used if whole nodes |
| are allocated to jobs (\fBSelectType=select/linear\fR). |
| Also see \fB\-\-mem\-per\-cpu\fR. |
| \fB\-\-mem\fR and \fB\-\-mem\-per\-cpu\fR are mutually exclusive. |
| NOTE: Enforcement of memory limits currently relies upon the task/cgroup plugin |
| or enabling of accounting, which samples memory use on a periodic basis (data |
| need not be stored, just collected). In both cases memory use is based upon |
| the job's Resident Set Size (RSS). A task may exceed the memory limit until |
| the next periodic accounting sample. |
| |
| .TP |
| \fB\-\-mem\-per\-cpu\fR=<\fIMB\fR> |
| Minimum memory required per allocated CPU in MegaBytes. |
| Default value is \fBDefMemPerCPU\fR and the maximum value is \fBMaxMemPerCPU\fR |
| (see exception below). If configured, both parameters can be |
| seen using the \fBscontrol show config\fR command. |
| Note that if the job's \fB\-\-mem\-per\-cpu\fR value exceeds the configured |
| \fBMaxMemPerCPU\fR, then the user's limit will be treated as a memory limit |
| per task; \fB\-\-mem\-per\-cpu\fR will be reduced to a value no larger than |
| \fBMaxMemPerCPU\fR; \fB\-\-cpus\-per\-task\fR will be set and value of |
| \fB\-\-cpus\-per\-task\fR multiplied by the new \fB\-\-mem\-per\-cpu\fR |
| value will equal the original \fB\-\-mem\-per\-cpu\fR value specified by |
| the user. |
| This parameter would generally be used if individual processors |
| are allocated to jobs (\fBSelectType=select/cons_res\fR). |
| Also see \fB\-\-mem\fR. |
| \fB\-\-mem\fR and \fB\-\-mem\-per\-cpu\fR are mutually exclusive. |
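| For example, to request eight tasks with 2048 MB of memory per allocated CPU |
| (the task count and program name are illustrative): |
| .nf |
| salloc \-n8 \-\-mem\-per\-cpu=2048 srun ./my_app |
| .fi |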
| |
| .TP |
| \fB\-\-mem_bind\fR=[{\fIquiet,verbose\fR},]\fItype\fR |
| Bind tasks to memory. Used only when the task/affinity plugin is enabled |
| and the NUMA memory functions are available. |
| \fBNote that the resolution of CPU and memory binding |
| may differ on some architectures.\fR For example, CPU binding may be performed |
| at the level of the cores within a processor while memory binding will |
| be performed at the level of nodes, where the definition of "nodes" |
| may differ from system to system. \fBThe use of any type other than |
| "none" or "local" is not recommended.\fR |
| If you want greater control, try running a simple test code with the |
| options "\-\-cpu_bind=verbose,none \-\-mem_bind=verbose,none" to determine |
| the specific configuration. |
| |
| NOTE: To have SLURM always report on the selected memory binding for |
| all commands executed in a shell, you can enable verbose mode by |
| setting the SLURM_MEM_BIND environment variable value to "verbose". |
| |
| The following informational environment variables are set when |
| \fB\-\-mem_bind\fR is in use: |
| |
| .nf |
| SLURM_MEM_BIND_VERBOSE |
| SLURM_MEM_BIND_TYPE |
| SLURM_MEM_BIND_LIST |
| .fi |
| |
| See the \fBENVIRONMENT VARIABLES\fR section for a more detailed description |
| of the individual SLURM_MEM_BIND* variables. |
| |
| Supported options include: |
| .RS |
| .TP |
| .B q[uiet] |
| quietly bind before task runs (default) |
| .TP |
| .B v[erbose] |
| verbosely report binding before task runs |
| .TP |
| .B no[ne] |
| don't bind tasks to memory (default) |
| .TP |
| .B rank |
| bind by task rank (not recommended) |
| .TP |
| .B local |
| Use memory local to the processor in use |
| .TP |
| .B map_mem:<list> |
| bind by mapping a node's memory to tasks as specified |
| where <list> is <cpuid1>,<cpuid2>,...<cpuidN>. |
| CPU IDs are interpreted as decimal values unless they are preceded |
| with '0x' in which case they are interpreted as hexadecimal values |
| (not recommended) |
| .TP |
| .B mask_mem:<list> |
| bind by setting memory masks on tasks as specified |
| where <list> is <mask1>,<mask2>,...<maskN>. |
| memory masks are \fBalways\fR interpreted as hexadecimal values. |
| Note that masks must be preceded with a '0x' if they don't begin |
| with [0-9] so they are seen as numerical values by srun. |
| .TP |
| .B help |
| show this help message |
| .RE |
| |
| .TP |
| \fB\-\-mincpus\fR=<\fIn\fR> |
| Specify a minimum number of logical cpus/processors per node. |
| |
| .TP |
| \fB\-N\fR, \fB\-\-nodes\fR=<\fIminnodes\fR[\-\fImaxnodes\fR]> |
| Request that a minimum of \fIminnodes\fR nodes be allocated to this job. |
| A maximum node count may also be specified with \fImaxnodes\fR. |
| If only one number is specified, this is used as both the minimum and |
| maximum node count. |
| The partition's node limits supersede those of the job. |
| If a job's node limits are outside of the range permitted for its |
| associated partition, the job will be left in a PENDING state. |
| This permits possible execution at a later time, when the partition |
| limit is changed. |
| If a job node limit exceeds the number of nodes configured in the |
| partition, the job will be rejected. |
| Note that the environment |
| variable \fBSLURM_NNODES\fR will be set to the count of nodes actually |
| allocated to the job. See the \fBENVIRONMENT VARIABLES \fR section |
| for more information. If \fB\-N\fR is not specified, the default |
| behavior is to allocate enough nodes to satisfy the requirements of |
| the \fB\-n\fR and \fB\-c\fR options. |
| The job will be allocated as many nodes as possible within the range specified |
| and without delaying the initiation of the job. |
| The node count specification may include a numeric value followed by a suffix |
| of "k" (multiplies numeric value by 1,024) or "m" (multiplies numeric value by |
| 1,048,576). |
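| For example, to request between two and four nodes (the script name is |
| illustrative): |
| .nf |
| salloc \-N 2\-4 ./my_script.sh |
| .fi |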
| |
| .TP |
| \fB\-n\fR, \fB\-\-ntasks\fR=<\fInumber\fR> |
| salloc does not launch tasks; it requests an allocation of resources and |
| executes a command. This option advises the SLURM controller that job |
| steps run within this allocation will launch a maximum of \fInumber\fR |
| tasks and sufficient resources are allocated to accomplish this. |
| The default is one task per node, but note |
| that the \fB\-\-cpus\-per\-task\fR option will change this default. |
| |
| .TP |
| \fB\-\-network\fR=<\fItype\fR> |
| Specify the communication protocol to be used. |
| The interpretation of \fItype\fR is system dependent. |
| This option is currently supported on systems with IBM's Parallel Environment (PE). |
| See IBM's LoadLeveler job command keyword documentation about the keyword |
| "network" for more information. |
| Multiple values may be specified in a comma separated list. |
| All options are case in\-sensitive. |
| Supported values include: |
| .RS |
| .TP 12 |
| \fBBULK_XFER\fR[=<\fIresources\fR>] |
| Enable bulk transfer of data using Remote Direct-Memory Access (RDMA). |
| The optional \fIresources\fR specification is a numeric value which can have |
| a suffix of "k", "K", "m", "M", "g" or "G" for kilobytes, megabytes or |
| gigabytes. |
| NOTE: The \fIresources\fR specification is not supported by the underlying |
| IBM infrastructure as of Parallel Environment version 2.2 and no value should |
| be specified at this time. |
| .TP |
| \fBCAU\fR=<\fIcount\fR> |
| Number of Collective Acceleration Units (CAU) required. |
| Applies only to IBM Power7-IH processors. |
| Default value is zero. |
| Independent CAU will be allocated for each programming interface (MPI, LAPI, etc.) |
| .TP |
| \fBDEVNAME\fR=<\fIname\fR> |
| Specify the device name to use for communications (e.g. "eth0" or "mlx4_0"). |
| .TP |
| \fBDEVTYPE\fR=<\fItype\fR> |
| Specify the device type to use for communications. |
| The supported values of \fItype\fR are: |
| "IB" (InfiniBand), "HFI" (P7 Host Fabric Interface), |
| "IPONLY" (IP-Only interfaces), "HPCE" (HPC Ethernet), and |
| "KMUX" (Kernel Emulation of HPCE). |
| The devices allocated to a job must all be of the same type. |
| The default value depends upon what hardware is available and, in |
| order of preference, is IPONLY (which is not considered in User Space mode), |
| HFI, IB, HPCE, and KMUX. |
| .TP |
| \fBIMMED\fR=<\fIcount\fR> |
| Number of immediate send slots per window required. |
| Applies only to IBM Power7-IH processors. |
| Default value is zero. |
| .TP |
| \fBINSTANCES\fR=<\fIcount\fR> |
| Specify number of network connections for each task on each network. |
| The default instance count is 1. |
| .TP |
| \fBIPV4\fR |
| Use Internet Protocol (IP) version 4 communications (default). |
| .TP |
| \fBIPV6\fR |
| Use Internet Protocol (IP) version 6 communications. |
| .TP |
| \fBLAPI\fR |
| Use the LAPI programming interface. |
| .TP |
| \fBMPI\fR |
| Use the MPI programming interface. |
| MPI is the default interface. |
| .TP |
| \fBPAMI\fR |
| Use the PAMI programming interface. |
| .TP |
| \fBSHMEM\fR |
| Use the OpenSHMEM programming interface. |
| .TP |
| \fBSN_ALL\fR |
| Use all available switch networks (default). |
| .TP |
| \fBSN_SINGLE\fR |
| Use one available switch network. |
| .TP |
| \fBUPC\fR |
| Use the UPC programming interface. |
| .TP |
| \fBUS\fR |
| Use User Space communications. |
| .TP |
| Some examples of network specifications: |
| .TP |
| \fBInstances=2,US,MPI,SN_ALL\fR |
| Create two user space connections for MPI communications on every switch |
| network for each task. |
| .TP |
| \fBUS,MPI,Instances=3,Devtype=IB\fR |
| Create three user space connections for MPI communications on every InfiniBand |
| network for each task. |
| .TP |
| \fBIPV4,LAPI,SN_Single\fR |
| Create an IP version 4 connection for LAPI communications on one switch network |
| for each task. |
| .TP |
| \fBInstances=2,US,LAPI,MPI\fR |
| Create two user space connections each for LAPI and MPI communications on every |
| switch network for each task. Note that SN_ALL is the default option so every |
| switch network is used. Also note that Instances=2 specifies that two |
| connections are established for each protocol (LAPI and MPI) and each task. |
| If there are two networks and four tasks on the node then a total |
| of 32 connections are established (2 instances x 2 protocols x 2 networks x |
| 4 tasks). |
| .RE |
| |
| .TP |
| \fB\-\-nice\fR[=\fIadjustment\fR] |
| Run the job with an adjusted scheduling priority within SLURM. |
| With no adjustment value the scheduling priority is decreased |
| by 100. The adjustment range is from \-10000 (highest priority) |
| to 10000 (lowest priority). Only privileged users can specify |
| a negative adjustment. NOTE: This option is presently |
| ignored if \fISchedulerType=sched/wiki\fR or |
| \fISchedulerType=sched/wiki2\fR. |
| |
| .TP |
| \fB\-\-ntasks\-per\-core\fR=<\fIntasks\fR> |
| Request the maximum \fIntasks\fR be invoked on each core. |
| Meant to be used with the \fB\-\-ntasks\fR option. |
| Related to \fB\-\-ntasks\-per\-node\fR except at the core level |
| instead of the node level. Masks will automatically be generated |
| to bind the tasks to specific cores unless \fB\-\-cpu_bind=none\fR |
| is specified. |
| NOTE: This option is not supported unless |
| \fISelectTypeParameters=CR_Core\fR or |
| \fISelectTypeParameters=CR_Core_Memory\fR is configured. |
| |
| .TP |
| \fB\-\-ntasks\-per\-socket\fR=<\fIntasks\fR> |
| Request the maximum \fIntasks\fR be invoked on each socket. |
| Meant to be used with the \fB\-\-ntasks\fR option. |
| Related to \fB\-\-ntasks\-per\-node\fR except at the socket level |
| instead of the node level. Masks will automatically be generated |
| to bind the tasks to specific sockets unless \fB\-\-cpu_bind=none\fR |
| is specified. |
| NOTE: This option is not supported unless |
| \fISelectTypeParameters=CR_Socket\fR or |
| \fISelectTypeParameters=CR_Socket_Memory\fR is configured. |
| |
| .TP |
| \fB\-\-ntasks\-per\-node\fR=<\fIntasks\fR> |
| Request the maximum \fIntasks\fR be invoked on each node. |
| Meant to be used with the \fB\-\-nodes\fR option. |
| This is related to \fB\-\-cpus\-per\-task\fR=\fIncpus\fR, |
| but does not require knowledge of the actual number of cpus on |
| each node. In some cases, it is more convenient to be able to |
| request that no more than a specific number of tasks be invoked |
| on each node. Examples of this include submitting |
| a hybrid MPI/OpenMP app where only one MPI "task/rank" should be |
| assigned to each node while allowing the OpenMP portion to utilize |
| all of the parallelism present in the node, or submitting a single |
| setup/cleanup/monitoring job to each node of a pre\-existing |
| allocation as one step in a larger job script. |
| |
| .TP |
| \fB\-\-no\-bell\fR |
| Silence salloc's use of the terminal bell. Also see the option \fB\-\-bell\fR. |
| |
| .TP |
| \fB\-\-no\-shell\fR |
| Immediately exit after allocating resources, without running a |
| command. However, the SLURM job will still be created and will remain |
| active and will own the allocated resources as long as it is active. |
| You will have a SLURM job id with no associated processes or |
| tasks. You can submit \fBsrun\fR commands against this resource allocation, |
| if you specify the \fB\-\-jobid=\fR option with the job id of this SLURM job. |
| Or, this can be used to temporarily reserve a set of resources so that |
| other jobs cannot use them for some period of time. (Note that the |
| SLURM job is subject to the normal constraints on jobs, including time |
| limits, so that eventually the job will terminate and the resources |
| will be freed, or you can terminate the job manually using the |
| \fBscancel\fR command.) |
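| For example, one might reserve resources without running any command, then |
| launch job steps against the allocation later (the job id shown is |
| illustrative): |
| .nf |
| $ salloc \-N2 \-\-no\-shell |
| salloc: Granted job allocation 65542 |
| $ srun \-\-jobid=65542 \-n2 hostname |
| $ scancel 65542 |
| .fi |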
| |
| .TP |
| \fB\-O\fR, \fB\-\-overcommit\fR |
| Overcommit resources. Normally, \fBsalloc\fR will allocate one task |
| per processor. By specifying \fB\-\-overcommit\fR you are explicitly |
| allowing more than one task per processor. However no more than |
| \fBMAX_TASKS_PER_NODE\fR tasks are permitted to execute per node. |
| |
| .TP |
| \fB\-p\fR, \fB\-\-partition\fR=<\fIpartition_names\fR> |
| Request a specific partition for the resource allocation. If not specified, |
| the default behavior is to allow the slurm controller to select the default |
| partition as designated by the system administrator. If the job can use more |
| than one partition, specify their names in a comma separated list and the one |
| offering earliest initiation will be used. |
| |
| .TP |
| \fB\-Q\fR, \fB\-\-quiet\fR |
| Suppress informational messages from salloc. Errors will still be displayed. |
| |
| .TP |
| \fB\-\-qos\fR=<\fIqos\fR> |
| Request a quality of service for the job. QOS values can be defined |
| for each user/cluster/account association in the SLURM database. |
| Users will be limited to their association's defined set of qos's when |
| the SLURM configuration parameter, AccountingStorageEnforce, includes |
| "qos" in its definition. |
| |
| .TP |
| \fB\-\-reservation\fR=<\fIname\fR> |
| Allocate resources for the job from the named reservation. |
| |
| .TP |
| \fB\-s\fR, \fB\-\-share\fR |
| The job allocation can share nodes with other running jobs. |
| This is the opposite of \-\-exclusive; whichever option is seen last |
| on the command line will be used. The default shared/exclusive |
| behavior depends on system configuration and the partition's \fBShared\fR |
| option takes precedence over the job's option. |
| This option may result in the allocation being granted sooner than if the \-\-share |
| option was not set and allow higher system utilization, but application |
| performance will likely suffer due to competition for resources within a node. |
| |
| .TP |
| \fB\-\-signal\fR=<\fIsig_num\fR>[@<\fIsig_time\fR>] |
| When a job is within \fIsig_time\fR seconds of its end time, |
| send it the signal \fIsig_num\fR. |
| Due to the resolution of event handling by SLURM, the signal may |
| be sent up to 60 seconds earlier than specified. |
| \fIsig_num\fR may either be a signal number or name (e.g. "10" or "USR1"). |
| \fIsig_time\fR must have an integer value between zero and 65535. |
| By default, no signal is sent before the job's end time. |
| If a \fIsig_num\fR is specified without any \fIsig_time\fR, |
| the default time will be 60 seconds. |
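| For example, to have SLURM send SIGUSR1 to the command roughly two minutes |
| before a 30 minute time limit expires (the script name is illustrative): |
| .nf |
| salloc \-\-signal=USR1@120 \-t 30 ./my_script.sh |
| .fi |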
| |
| .TP |
| \fB\-\-sockets\-per\-node\fR=<\fIsockets\fR> |
| Restrict node selection to nodes with at least the specified number of |
| sockets. See additional information under \fB\-B\fR option above when |
| task/affinity plugin is enabled. |
| |
| .TP |
| \fB\-\-switches\fR=<\fIcount\fR>[@<\fImax\-time\fR>] |
| When a tree topology is used, this defines the maximum count of switches |
| desired for the job allocation and optionally the maximum time to wait |
| for that number of switches. If SLURM finds an allocation containing more |
| switches than the count specified, the job remains pending until it either finds |
| an allocation with desired switch count or the time limit expires. |
| If there is no switch count limit, there is no delay in starting the job. |
| Acceptable time formats include "minutes", "minutes:seconds", |
| "hours:minutes:seconds", "days\-hours", "days\-hours:minutes" and |
| "days\-hours:minutes:seconds". |
| The job's maximum time delay may be limited by the system administrator using |
| the \fBSchedulerParameters\fR configuration parameter with the |
| \fBmax_switch_wait\fR parameter option. |
| The default max\-time is the max_switch_wait SchedulerParameter. |
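| For example, to prefer an allocation reachable through a single switch but |
| wait no more than five minutes for one (the node count and program name are |
| illustrative): |
| .nf |
| salloc \-N16 \-\-switches=1@5:00 srun ./my_app |
| .fi |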
| |
| .TP |
| \fB\-t\fR, \fB\-\-time\fR=<\fItime\fR> |
| Set a limit on the total run time of the job allocation. If the |
| requested time limit exceeds the partition's time limit, the job will |
| be left in a PENDING state (possibly indefinitely). The default time |
| limit is the partition's default time limit. When the time limit is reached, |
| each task in each job step is sent SIGTERM followed by SIGKILL. The |
| interval between signals is specified by the SLURM configuration |
| parameter \fBKillWait\fR. A time limit of zero requests that no time |
| limit be imposed. Acceptable time formats include "minutes", |
| "minutes:seconds", "hours:minutes:seconds", "days\-hours", |
| "days\-hours:minutes" and "days\-hours:minutes:seconds". |
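| For example, to request a time limit of one day and twelve hours (the node |
| count and script name are illustrative): |
| .nf |
| salloc \-N1 \-t 1\-12:00:00 ./my_script.sh |
| .fi |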
| |
| .TP |
| \fB\-\-threads\-per\-core\fR=<\fIthreads\fR> |
| Restrict node selection to nodes with at least the specified number of |
| threads per core. NOTE: "Threads" refers to the number of processing units on |
| each core rather than the number of application tasks to be launched per core. |
| See additional information under \fB\-B\fR option above when task/affinity |
| plugin is enabled. |
| |
| .TP |
| \fB\-\-time\-min\fR=<\fItime\fR> |
| Set a minimum time limit on the job allocation. |
| If specified, the job may have its \fB\-\-time\fR limit lowered to a value |
| no lower than \fB\-\-time\-min\fR if doing so permits the job to begin |
| execution earlier than otherwise possible. |
| The job's time limit will not be changed after the job is allocated resources. |
| This is performed by a backfill scheduling algorithm to allocate resources |
| otherwise reserved for higher priority jobs. |
| Acceptable time formats include "minutes", "minutes:seconds", |
| "hours:minutes:seconds", "days\-hours", "days\-hours:minutes" and |
| "days\-hours:minutes:seconds". |
| |
| .TP |
| \fB\-\-tmp\fR=<\fIMB\fR> |
| Specify a minimum amount of temporary disk space. |
| |
| .TP |
| \fB\-u\fR, \fB\-\-usage\fR |
| Display brief help message and exit. |
| |
| .TP |
| \fB\-\-uid\fR=<\fIuser\fR> |
| Attempt to submit and/or run a job as \fIuser\fR instead of the |
| invoking user id. The invoking user's credentials will be used |
| to check access permissions for the target partition. This option |
| is only valid for user root. User root may use this option to run |
| jobs as a normal user in a RootOnly |
| partition, for example. If run as root, \fBsalloc\fR will drop |
| its permissions to the uid specified after node allocation is |
| successful. \fIuser\fR may be the user name or numerical user ID. |
| |
| .TP |
| \fB\-V\fR, \fB\-\-version\fR |
| Display version information and exit. |
| |
| .TP |
| \fB\-v\fR, \fB\-\-verbose\fR |
| Increase the verbosity of salloc's informational messages. Multiple |
| \fB\-v\fR's will further increase salloc's verbosity. By default only |
| errors will be displayed. |
| |
| .TP |
| \fB\-W\fR, \fB\-\-wait\fR=<\fIseconds\fR> |
| This option has been replaced by \fB\-\-immediate\fR=<\fIseconds\fR>. |
| |
| .TP |
| \fB\-w\fR, \fB\-\-nodelist\fR=<\fInode name list\fR> |
| Request a specific list of node names. The list may be specified as a |
| comma\-separated list of node names, or a range of node names |
| (e.g. mynode[1\-5,7,...]). Duplicate node names in the list will be ignored. |
| The order of the node names in the list is not important; the node names |
| will be sorted by SLURM. |
| |
| .TP |
| \fB\-\-wait\-all\-nodes\fR=<\fIvalue\fR> |
| Controls when the execution of the command begins. |
| By default the job will begin execution as soon as the allocation is made. |
| .RS |
| .TP 5 |
| 0 |
| Begin execution as soon as allocation can be made. |
| Do not wait for all nodes to be ready for use (i.e. booted). |
| .TP |
| 1 |
| Do not begin execution until all nodes are ready for use. |
| .RE |
| |
| .TP |
| \fB\-\-wckey\fR=<\fIwckey\fR> |
| Specify wckey to be used with job. If TrackWCKey=no (default) in the |
| slurm.conf this value is ignored. |
| |
| .TP |
| \fB\-x\fR, \fB\-\-exclude\fR=<\fInode name list\fR> |
| Explicitly exclude certain nodes from the resources granted to the job. |
| |
| .PP |
| The following options support Blue Gene systems, but may be |
| applicable to other systems as well. |
| |
| .TP |
| \fB\-\-blrts\-image\fR=<\fIpath\fR> |
| Path to blrts image for bluegene block. BGL only. |
| Default from \fIbluegene.conf\fR if not set. |
| |
| .TP |
| \fB\-\-cnload\-image\fR=<\fIpath\fR> |
| Path to compute node image for bluegene block. BGP only. |
| Default from \fIbluegene.conf\fR if not set. |
| |
| .TP |
| \fB\-\-conn\-type\fR=<\fItype\fR> |
| Require the block connection type to be of a certain type. |
| On Blue Gene the acceptable values of \fItype\fR are MESH, TORUS and NAV. |
| If NAV, or if not set, then SLURM will try to fit whatever the |
| DefaultConnType is set to in the bluegene.conf; if that isn't set, the |
| default is TORUS. |
| You should not normally set this option. |
| If running on a BGP system and wanting to run in HTC mode (only for 1 |
| midplane and below), you can use HTC_S for SMP, HTC_D for Dual, HTC_V |
| for virtual node mode, and HTC_L for Linux mode. |
| For systems that allow a different connection type per dimension, a |
| comma separated list of connection types may be specified, one for |
| each dimension (e.g. M,T,T,T will give you a torus connection in all |
| dimensions except the first). |
| |
| .TP |
| \fB\-g\fR, \fB\-\-geometry\fR=<\fIXxYxZ\fR> | <\fIAxXxYxZ\fR> |
| Specify the geometry requirements for the job. On BlueGene/L and BlueGene/P |
| systems there are three numbers giving dimensions in the X, Y and Z directions, |
| while on BlueGene/Q systems there are four numbers giving dimensions in the |
| A, X, Y and Z directions and can not be used to allocate sub-blocks. |
| For example "\-\-geometry=1x2x3x4" specifies a block of nodes having |
| 1 x 2 x 3 x 4 = 24 nodes (actually midplanes on BlueGene). |
| |
| .TP |
| \fB\-\-ioload\-image\fR=<\fIpath\fR> |
| Path to io image for bluegene block. BGP only. |
| Default from \fIbluegene.conf\fR if not set. |
| |
| .TP |
| \fB\-\-linux\-image\fR=<\fIpath\fR> |
| Path to linux image for bluegene block. BGL only. |
| Default from \fIbluegene.conf\fR if not set. |
| |
| .TP |
| \fB\-\-mloader\-image\fR=<\fIpath\fR> |
| Path to mloader image for bluegene block. |
| Default from \fIbluegene.conf\fR if not set. |
| |
| .TP |
| \fB\-R\fR, \fB\-\-no\-rotate\fR |
| Disables rotation of the job's requested geometry in order to fit an |
| appropriate block. |
| By default the specified geometry can rotate in three dimensions. |
| |
| .TP |
| \fB\-\-ramdisk\-image\fR=<\fIpath\fR> |
| Path to ramdisk image for bluegene block. BGL only. |
| Default from \fIbluegene.conf\fR if not set. |
| |
| .TP |
| \fB\-\-reboot\fR |
| Force the allocated nodes to reboot before starting the job. |
| |
| .SH "INPUT ENVIRONMENT VARIABLES" |
| .PP |
| Upon startup, salloc will read and handle the options set in the following |
| environment variables. Note: Command line options always override environment |
| variable settings. |
| |
| .TP 22 |
| \fBSALLOC_ACCOUNT\fR |
| Same as \fB\-A, \-\-account\fR |
| .TP |
| \fBSALLOC_ACCTG_FREQ\fR |
| Same as \fB\-\-acctg\-freq\fR |
| .TP |
| \fBSALLOC_BELL\fR |
| Same as \fB\-\-bell\fR |
| .TP |
| \fBSALLOC_CONN_TYPE\fR |
| Same as \fB\-\-conn\-type\fR |
| .TP |
| \fBSALLOC_CPU_BIND\fR |
| Same as \fB\-\-cpu_bind\fR |
| .TP |
| \fBSALLOC_DEBUG\fR |
| Same as \fB\-v, \-\-verbose\fR |
| .TP |
| \fBSALLOC_EXCLUSIVE\fR |
| Same as \fB\-\-exclusive\fR |
| .TP |
| \fBSLURM_EXIT_ERROR\fR |
| Specifies the exit code generated when a SLURM error occurs |
| (e.g. invalid options). |
| This can be used by a script to distinguish application exit codes from |
| various SLURM error conditions. |
| Also see \fBSLURM_EXIT_IMMEDIATE\fR. |
| .TP |
| \fBSLURM_EXIT_IMMEDIATE\fR |
| Specifies the exit code generated when the \fB\-\-immediate\fR option |
| is used and resources are not currently available. |
| This can be used by a script to distinguish application exit codes from |
| various SLURM error conditions. |
| Also see \fBSLURM_EXIT_ERROR\fR. |
| .TP |
| \fBSALLOC_GEOMETRY\fR |
| Same as \fB\-g, \-\-geometry\fR |
| .TP |
| \fBSALLOC_IMMEDIATE\fR |
| Same as \fB\-I, \-\-immediate\fR |
| .TP |
| \fBSALLOC_JOBID\fR |
| Same as \fB\-\-jobid\fR |
| .TP |
| \fBSALLOC_KILL_CMD\fR |
| Same as \fB\-K\fR, \fB\-\-kill\-command\fR |
| .TP |
| \fBSALLOC_MEM_BIND\fR |
| Same as \fB\-\-mem_bind\fR |
| .TP |
| \fBSALLOC_NETWORK\fR |
| Same as \fB\-\-network\fR |
| .TP |
| \fBSALLOC_NO_BELL\fR |
| Same as \fB\-\-no\-bell\fR |
| .TP |
| \fBSALLOC_NO_ROTATE\fR |
| Same as \fB\-R, \-\-no\-rotate\fR |
| .TP |
| \fBSALLOC_OVERCOMMIT\fR |
| Same as \fB\-O, \-\-overcommit\fR |
| .TP |
| \fBSALLOC_PARTITION\fR |
| Same as \fB\-p, \-\-partition\fR |
| .TP |
| \fBSALLOC_QOS\fR |
| Same as \fB\-\-qos\fR |
| .TP |
| \fBSALLOC_REQ_SWITCH\fR |
| When a tree topology is used, this defines the maximum count of switches |
| desired for the job allocation and optionally the maximum time to wait |
| for that number of switches. See \fB\-\-switches\fR. |
| .TP |
| \fBSALLOC_RESERVATION\fR |
| Same as \fB\-\-reservation\fR |
| .TP |
| \fBSALLOC_SIGNAL\fR |
| Same as \fB\-\-signal\fR |
| .TP |
| \fBSALLOC_TIMELIMIT\fR |
| Same as \fB\-t, \-\-time\fR |
| .TP |
| \fBSALLOC_WAIT\fR |
| Same as \fB\-W, \-\-wait\fR |
| .TP |
| \fBSALLOC_WAIT_ALL_NODES\fR |
| Same as \fB\-\-wait\-all\-nodes\fR |
| .TP |
| \fBSALLOC_WCKEY\fR |
| Same as \fB\-\-wckey\fR |
| .TP |
| \fBSALLOC_WAIT4SWITCH\fR |
| Max time waiting for requested switches. See \fB\-\-switches\fR |
| |
| .SH "OUTPUT ENVIRONMENT VARIABLES" |
| .PP |
| salloc will set the following environment variables in the environment of |
| the executed program: |
| .TP |
| \fBBASIL_RESERVATION_ID\fR |
| The reservation ID on Cray systems running ALPS/BASIL only. |
| .TP |
| \fBSLURM_CPU_BIND\fR |
| Set to value of the \fB\-\-cpu_bind\fR option. |
| .TP |
| \fBSLURM_CPU_BIND_LIST\fR |
| \-\-cpu_bind map or mask list (list of SLURM CPU IDs or masks for this node, |
| CPU_ID = Board_ID x threads_per_board + |
| Socket_ID x threads_per_socket + |
| Core_ID x threads_per_core + Thread_ID). |
| .TP |
| \fBSLURM_JOB_ID\fR (and \fBSLURM_JOBID\fR for backwards compatibility) |
| The ID of the job allocation. |
| .TP |
| \fBSLURM_JOB_CPUS_PER_NODE\fR |
| Count of processors available to the job on this node. |
| Note the select/linear plugin allocates entire nodes to |
| jobs, so the value indicates the total count of CPUs on each node. |
| The select/cons_res plugin allocates individual processors |
| to jobs, so this number indicates the number of processors |
| on each node allocated to the job allocation. |
| .TP |
| \fBSLURM_JOB_NODELIST\fR (and \fBSLURM_NODELIST\fR for backwards compatibility) |
| List of nodes allocated to the job. |
| .TP |
| \fBSLURM_JOB_NUM_NODES\fR (and \fBSLURM_NNODES\fR for backwards compatibility) |
| Total number of nodes in the job allocation. |
| .TP |
| \fBSLURM_MEM_BIND\fR |
| Set to value of the \fB\-\-mem_bind\fR option. |
| .TP |
| \fBSLURM_SUBMIT_DIR\fR |
| The directory from which \fBsalloc\fR was invoked. |
| .TP |
| \fBSLURM_NODE_ALIASES\fR |
| Sets of node name, communication address and hostname for nodes allocated to |
| the job from the cloud. Each element in the set is colon separated and each |
| set is comma separated. For example: |
| SLURM_NODE_ALIASES=ec0:1.2.3.4:foo,ec1:1.2.3.5:bar |
| .TP |
| \fBSLURM_NTASKS\fR |
| Same as \fB\-n, \-\-ntasks\fR |
| .TP |
| \fBSLURM_NTASKS_PER_NODE\fR |
| Set to value of the \fB\-\-ntasks\-per\-node\fR option, if specified. |
| .TP |
| \fBSLURM_TASKS_PER_NODE\fR |
| Number of tasks to be initiated on each node. Values are |
| comma separated and in the same order as SLURM_NODELIST. |
| If two or more consecutive nodes are to have the same task |
| count, that count is followed by "(x#)" where "#" is the |
| repetition count. For example, "SLURM_TASKS_PER_NODE=2(x3),1" |
| indicates that the first three nodes will each execute two |
| tasks and the fourth node will execute one task. |
| .TP |
| \fBMPIRUN_NOALLOCATE\fR |
| Do not allocate a block on Blue Gene L/P systems only. |
| .TP |
| \fBMPIRUN_NOFREE\fR |
| Do not free a block on Blue Gene L/P systems only. |
| .TP |
| \fBMPIRUN_PARTITION\fR |
| The block name on Blue Gene systems only. |
| |
| .SH "SIGNALS" |
| .LP |
| While salloc is waiting for a PENDING job allocation, most signals will cause |
| salloc to revoke the allocation request and exit. |
| |
| However if the allocation has been granted and salloc has already started the |
| specified command, then salloc will ignore most signals. |
| salloc will not exit or release the allocation until the command exits. |
| One notable exception is SIGHUP. A SIGHUP signal will cause salloc to |
| release the allocation and exit without waiting for the command to finish. |
| Another exception is SIGTERM, which will be forwarded to the spawned process. |
| |
| .SH "EXAMPLES" |
| .LP |
| To get an allocation, and open a new xterm in which srun commands may be typed |
| interactively: |
| .IP |
| $ salloc \-N16 xterm |
| .br |
| salloc: Granted job allocation 65537 |
| .br |
| (at this point the xterm appears, and salloc waits for xterm to exit) |
| .br |
| salloc: Relinquishing job allocation 65537 |
| .LP |
| To grab an allocation of nodes and launch a parallel application on one |
| command line: |
| .IP |
| salloc \-N5 srun \-n10 myprogram |
| |
| .SH "COPYING" |
| Copyright (C) 2006\-2007 The Regents of the University of California. |
| Copyright (C) 2008\-2010 Lawrence Livermore National Security. |
| Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER). |
| CODE\-OCEC\-09\-009. All rights reserved. |
| .LP |
| This file is part of SLURM, a resource management program. |
| For details, see <http://slurm.schedmd.com/>. |
| .LP |
| SLURM is free software; you can redistribute it and/or modify it under |
| the terms of the GNU General Public License as published by the Free |
| Software Foundation; either version 2 of the License, or (at your option) |
| any later version. |
| .LP |
| SLURM is distributed in the hope that it will be useful, but WITHOUT ANY |
| WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS |
| FOR A PARTICULAR PURPOSE. See the GNU General Public License for more |
| details. |
| |
| .SH "SEE ALSO" |
| .LP |
| \fBsinfo\fR(1), \fBsattach\fR(1), \fBsbatch\fR(1), \fBsqueue\fR(1), \fBscancel\fR(1), \fBscontrol\fR(1), |
| \fBslurm.conf\fR(5), \fBsched_setaffinity\fR (2), \fBnuma\fR (3) |