| .TH scancel "1" "Slurm Commands" "March 2025" "Slurm Commands" |
| |
| .SH "NAME" |
| scancel \- Used to signal jobs or job steps that are under the control of Slurm. |
| |
| .SH "SYNOPSIS" |
| \fBscancel\fR [\fIOPTIONS\fR...] [\fIjob_id\fR[_\fIarray_id\fR][.\fIstep_id\fR]] [\fIjob_id\fR[_\fIarray_id\fR][.\fIstep_id\fR]...] |
| |
| .SH "DESCRIPTION" |
| \fBscancel\fR is used to signal or cancel jobs, job arrays or job steps. |
| An arbitrary number of jobs or job steps may be signaled using job |
| specification filters or a space separated list of specific job and/or |
| job step IDs. |
| If the job ID of a job array is specified with an array ID value and the job |
| associated with the array ID value has been split from the array, then only that |
| job array element will be cancelled. |
| If the job ID of a job array is specified without an array ID value or the |
| array ID value corresponds to a job that has not been split from the array, |
| then all job array elements will be cancelled. |
| While a heterogeneous job is in a PENDING state, only the entire job can be |
| cancelled rather than its individual components. |
| A request to cancel an individual component of a heterogeneous job while in |
| a PENDING state will return an error. |
| After the job has begun execution, an individual component can be cancelled |
| except for component zero. If component zero is cancelled, the whole het job is |
| cancelled. |
| A job or job step can only be signaled by the owner of that job or user root. |
| If an attempt is made by an unauthorized user to signal a job or job step, an |
| error message will be printed and the job will not be signaled. |
| |
| .SH "OPTIONS" |
| |
| .TP |
| \fB\-A\fR, \fB\-\-account\fR=\fIaccount\fR |
| Restrict the scancel operation to jobs under this charge account. |
| .IP |
| |
| .TP |
| \fB\-b\fR, \fB\-\-batch\fR |
| By default, signals other than SIGKILL are not sent to the batch step (the shell |
| script). With this option \fBscancel\fR signals only the batch step, but not |
| any other steps. |
| This is useful when the shell script has to trap the signal and take some |
| application defined action. |
| Most shells cannot handle signals while a command is running (i.e. is a child |
| process of the batch step), so the shell needs to wait until the command ends to |
| then handle the signal. |
| Children of the batch step are not signaled with this option. If this is |
| desired, use \fB\-f\fR, \fB\-\-full\fR instead. |
| \fBNOTE\fR: If used with \fB\-f\fR, \fB\-\-full\fR, this option is ignored. |
| \fBNOTE\fR: This option is not applicable if \fIstep_id\fR is specified. |
| \fBNOTE\fR: The shell itself may exit upon receipt of many signals. |
| You may avoid this by explicitly trap signals within the shell |
| script (e.g. "trap <arg> <signals>"). See the shell documentation |
| for details. |
| .IP |
| |
| .TP |
| \fB\-M\fR, \fB\-\-clusters\fR=<\fIstring\fR> |
| Cluster to issue commands to. Implies \fB\-\-ctld\fR. |
| Note that the \fBslurmdbd\fR must be up for this option to work properly, unless |
| running in a federation with \fBFederationParameters=fed_display\fR configured. |
| .IP |
| |
| .TP |
| \fB\-\-ctld\fR |
| If this option is not used with \fB\-\-interactive\fR, |
| \fB\-\-sibling\fR, federated job ids, or specific step ids, then this issues a |
| single request to the slurmctld to signal all jobs matching the specified |
| filters. This greatly improves the performance of slurmctld and scancel. |
| Otherwise, this option causes scancel to send each job signal request to the |
| slurmctld daemon rather than directly to the slurmd daemons, which increases |
| overhead, but offers better fault tolerance. \fB\-\-ctld\fR is the default |
| behavior on when the \fB\-\-clusters\fR option is used. |
| .IP |
| |
| .TP |
| \fB\-c\fR, \fB\-\-cron\fR |
| Confirm request to cancel a job submitted by scrontab. This option only has |
| effect with the "explicit_scancel" option is set in \fBScronParameters\fR. |
| .IP |
| |
| .TP |
| \fB\-f\fR, \fB\-\-full\fR |
| By default, signals other than SIGKILL are not sent to the batch step (the shell |
| script). With this option \fBscancel\fR also signals the batch script and its |
| children processes. |
| Most shells cannot handle signals while a command is running (i.e. is a child |
| process of the batch step), so the shell needs to wait until the command ends to |
| then handle the signal. |
| Unlike \fB\-b\fR, \fB\-\-batch\fR, children of the batch step |
| are also signaled with this option. |
| \fBNOTE\fR: srun steps are also children of the batch step, so steps are also |
| signaled with this option. |
| .IP |
| |
| .TP |
| \fB\-\-help\fR |
| Print a help message describing all \fBscancel\fR options. |
| .IP |
| |
| .TP |
| \fB\-H\fR, \fB\-\-hurry\fR |
| Do not stage out any burst buffer data. |
| .IP |
| |
| .TP |
| \fB\-i\fR, \fB\-\-interactive\fR |
| Interactive mode. Confirm each job_id.step_id before performing the cancel operation. |
| .IP |
| |
| .TP |
| \fB\-n\fR, \fB\-\-jobname\fR=\fIjob_name\fR, \fB\-\-name\fR=\fIjob_name\fR |
| Restrict the scancel operation to jobs with this job name. |
| .IP |
| |
| .TP |
| \fB\-\-me\fR |
| Restrict the scancel operation to jobs owned by the current user. |
| |
| .TP |
| \fB\-w\fR, \fB\-\-nodelist=\fIhost1,host2,...\fR |
| Cancel any jobs using any of the given hosts. The list may be specified as |
| a comma\-separated list of hosts, a range of hosts (host[1\-5,7,...] for |
| example), or a filename. The host list will be assumed to be a filename only |
| if it contains a "/" character. |
| .IP |
| |
| .TP |
| \fB\-p\fR, \fB\-\-partition\fR=\fIpartition_name\fR |
| Restrict the scancel operation to jobs in this partition. |
| .IP |
| |
| .TP |
| \fB\-q\fR, \fB\-\-qos\fR=\fIqos\fR |
| Restrict the scancel operation to jobs with this quality of service. |
| .IP |
| |
| .TP |
| \fB\-Q\fR, \fB\-\-quiet\fR |
| Do not report an error if the specified job is already completed. |
| This option is incompatible with the \fB\-\-verbose\fR option. |
| .IP |
| |
| .TP |
| \fB\-R\fR, \fB\-\-reservation\fR=\fIreservation_name\fR |
| Restrict the scancel operation to jobs with this reservation name. |
| .IP |
| |
| .TP |
| \fB\-\-sibling\fR=\fIcluster_name\fR |
| Remove an active sibling job from a federated job. |
| .IP |
| |
| .TP |
| \fB\-s\fR, \fB\-\-signal\fR=\fIsignal_name\fR |
| The name or number of the signal to send. If this option is not used |
| the specified job or step will be terminated. |
| .IP |
| |
| .TP |
| \fB\-t\fR, \fB\-\-state\fR=\fIjob_state_name\fR |
| Restrict the scancel operation to jobs in this |
| state. \fIjob_state_name\fR may have a value of either "PENDING", |
| "RUNNING" or "SUSPENDED". |
| .IP |
| |
| .TP |
| \fB\-\-usage\fR |
| Print a brief help message listing the \fBscancel\fR options. |
| .IP |
| |
| .TP |
| \fB\-u\fR, \fB\-\-user\fR=\fIuser_name\fR |
| Restrict the scancel operation to jobs owned by the given user. |
| .IP |
| |
| .TP |
| \fB\-v\fR, \fB\-\-verbose\fR |
| Print additional logging. Multiple v's increase logging detail. |
| This option is incompatible with the \fB\-\-quiet\fR option. |
| .IP |
| |
| .TP |
| \fB\-V\fR, \fB\-\-version\fR |
| Print the version number of the scancel command. |
| .IP |
| |
| .TP |
| \fB\-\-wckey\fR=\fIwckey\fR |
| Restrict the scancel operation to jobs using this workload |
| characterization key. |
| .IP |
| |
| .SH |
| ARGUMENTS |
| |
| .TP |
| \fIjob_id\fP |
| The Slurm job ID to be signaled. |
| .IP |
| |
| .TP |
| \fIstep_id\fP |
| The step ID of the job step to be signaled. |
| If not specified, the operation is performed at the level of a job. |
| |
| If neither \fB\-\-batch\fR nor \fB\-\-signal\fR are used, |
| the entire job will be terminated. |
| |
| When \fB\-\-batch\fR is used, the batch shell processes will be signaled. |
| The child processes of the shell will not be signaled by Slurm, but |
| the shell may forward the signal. |
| |
| When \fB\-\-batch\fR is not used but \fB\-\-signal\fR is used, |
| then all job steps will be signaled, but the batch script itself |
| will not be signaled. |
| .IP |
| |
| .SH "PERFORMANCE" |
| .PP |
| When executing \fBscancel\fR without the \fB\-\-ctld\fR option; or with the |
| \fB\-\-ctld\fR option and \fB\-\-interactive\fR, \fB\-\-sibling\fR, or specific |
| step ids; a remote procedure call is sent to \fBslurmctld\fR to get all the |
| jobs. \fBscancel\fR then sends a signal job remote procedure call for each job |
| that matches the requested filters. |
| |
| When executing \fBscancel\fR with the \fB\-\-ctld\fR option and without |
| \fB\-\-interactive\fR, \fB\-\-sibling\fR, or specific step ids, a single |
| remote procedure call is sent to \fBslurmctld\fR to signal all jobs matching |
| the requested filters. It is therefore recommended to use the \fB\-\-ctld\fR |
| option in order to reduce the number of remote procedure calls sent to the |
| \fBslurmctld\fR. |
| |
| .PP |
| If enough calls from \fBscancel\fR or other Slurm client commands that send |
| remote procedure calls to the \fBslurmctld\fR daemon come in at once, it can |
| result in a degradation of performance of the \fBslurmctld\fR daemon, possibly |
| resulting in a denial of service. |
| |
| .PP |
| Do not run \fBscancel\fR or other Slurm client commands that send remote |
| procedure calls to \fBslurmctld\fR from loops in shell scripts or other |
| programs. Ensure that programs limit calls to \fBscancel\fR to the minimum |
| necessary for the information you are trying to gather. |
| |
| .SH "ENVIRONMENT VARIABLES" |
| .PP |
| Some \fBscancel\fR options may be set via environment variables. These |
| environment variables, along with their corresponding options, are listed below. |
| (Note: Command line options will always override these settings.) |
| |
| .TP 20 |
| \fBSCANCEL_ACCOUNT\fR |
| \fB\-A\fR, \fB\-\-account\fR=\fIaccount\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_BATCH\fR |
| \fB\-b, \-\-batch\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_CTLD\fR |
| \fB\-\-ctld\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_CRON\fR |
| \fB\-c, \-\-cron\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_FULL\fR |
| \fB\-f, \-\-full\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_HURRY\fR |
| \fB\-H\fR, \fB\-\-hurry\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_INTERACTIVE\fR |
| \fB\-i\fR, \fB\-\-interactive\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_NAME\fR |
| \fB\-n\fR, \fB\-\-name\fR=\fIjob_name\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_PARTITION\fR |
| \fB\-p\fR, \fB\-\-partition\fR=\fIpartition_name\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_QOS\fR |
| \fB\-q\fR, \fB\-\-qos\fR=\fIqos\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_STATE\fR |
| \fB\-t\fR, \fB\-\-state\fR=\fIjob_state_name\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_USER\fR |
| \fB\-u\fR, \fB\-\-user\fR=\fIuser_name\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_VERBOSE\fR |
| \fB\-v\fR, \fB\-\-verbose\fR |
| .IP |
| |
| .TP |
| \fBSCANCEL_WCKEY\fR |
| \fB\-\-wckey\fR=\fIwckey\fR |
| .IP |
| |
| .TP |
| \fBSLURM_CONF\fR |
| The location of the Slurm configuration file. |
| .IP |
| |
| .TP |
| \fBSLURM_CLUSTERS\fR |
| \fB\-M\fR, \fB\-\-clusters\fR |
| .IP |
| |
| .TP |
| \fBSLURM_DEBUG_FLAGS\fR |
| Specify debug flags for scancel to use. See DebugFlags in the |
| \fBslurm.conf\fR(5) man page for a full list of flags. The environment |
| variable takes precedence over the setting in the slurm.conf. |
| .IP |
| |
| .SH "NOTES" |
| .LP |
| If multiple filters are supplied (e.g. \fB\-\-partition\fR and \fB\-\-name\fR) |
| only the jobs satisfying all of the filtering options will be signaled. |
| .LP |
| Cancelling a job step will not result in the job being terminated. |
| The job must be cancelled to release a resource allocation. |
| .LP |
| To cancel a job, invoke \fBscancel\fR without \-\-signal option. This |
| will send first a SIGCONT to all steps to eventually wake them up followed by |
| a SIGTERM, then wait the KillWait duration defined in the slurm.conf file |
| and finally if they have not terminated send a SIGKILL. This gives |
| time for the running job/step(s) to clean up. |
| .LP |
| If a signal value of "KILL" is sent to an entire job, this will cancel |
| the active job steps but not cancel the job itself. |
| |
| .SH "AUTHORIZATION" |
| |
| When using SlurmDBD, users who have an AdminLevel defined (Operator |
| or Admin) and users who are account coordinators are given the |
| authority to invoke scancel on other users jobs. |
| |
| .SH "EXAMPLES" |
| .IP |
| |
| .TP |
| Send SIGTERM to steps 1 and 3 of job 1234: |
| .IP |
| .nf |
| $ scancel \-\-signal=TERM 1234.1 1234.3 |
| .fi |
| |
| .TP |
| Cancel job 1234 along with all of its steps: |
| .IP |
| .nf |
| $ scancel 1234 |
| .fi |
| |
| .TP |
| Send SIGKILL to all steps of job 1235, but do not cancel the job itself: |
| .IP |
| .nf |
| $ scancel \-\-signal=KILL 1235 |
| .fi |
| |
| .TP |
| Send SIGUSR1 to the batch shell processes of job 1236: |
| .IP |
| .nf |
| $ scancel \-\-signal=USR1 \-\-batch 1236 |
| .fi |
| |
| .TP |
| Cancel all pending jobs belonging to user "bob" in partition "debug": |
| .IP |
| .nf |
| $ scancel \-\-state=PENDING \-\-user=bob \-\-partition=debug |
| .fi |
| |
| .TP |
| Cancel only array ID 4 of job array 1237 |
| .IP |
| .nf |
| $ scancel 1237_4 |
| .fi |
| |
| .SH "COPYING" |
| Copyright (C) 2002\-2007 The Regents of the University of California. |
| Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER). |
| .br |
| Copyright (C) 2008\-2011 Lawrence Livermore National Security. |
| .br |
| Copyright (C) 2010\-2022 SchedMD LLC. |
| .LP |
| This file is part of Slurm, a resource management program. |
| For details, see <https://slurm.schedmd.com/>. |
| .LP |
| Slurm is free software; you can redistribute it and/or modify it under |
| the terms of the GNU General Public License as published by the Free |
| Software Foundation; either version 2 of the License, or (at your option) |
| any later version. |
| .LP |
| Slurm is distributed in the hope that it will be useful, but WITHOUT ANY |
| WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS |
| FOR A PARTICULAR PURPOSE. See the GNU General Public License for more |
| details. |
| |
| .SH "SEE ALSO" |
| \fBslurm_kill_job\fR (3), \fBslurm_kill_job_step\fR (3) |