blob: fbb3eb3d7290844cf6635ed4f286887fbc9b521f [file] [log] [blame] [edit]
<!--#include virtual="header.txt"-->
<h1>Prolog and Epilog Guide</h1>
<p>SLURM supports a multitude of prolog and epilog programs.
The first table below identifies what prologs and epilogs are available for job
allocations, when and where they run.</p>
<center>
<table style="page-break-inside: avoid; font-family: Arial,Helvetica,sans-serif;" border="1" bordercolor="#000000" cellpadding="6" cellspacing="0" width="100%">
<colgroup><col width="15%">
<col width="15%">
<col width="15%">
<col width="15%">
<col width="40%">
</colgroup><tbody><tr>
<td bgcolor="#e0e0e0" height="18" width="15%">
<p align="CENTER"><font style="font-size: 8pt" size="1"><b>Parameter
</b></font></p>
</td>
<td bgcolor="#e0e0e0" height="18" width="15%">
<p align="CENTER"><font style="font-size: 8pt" size="1"><b>Location
</b></font></p>
</td>
<td bgcolor="#e0e0e0" width="15%">
<p align="CENTER"><font style="font-size: 8pt" size="1"><b>Invoked by
</b></font></p>
</td>
<td bgcolor="#e0e0e0" width="15%">
<p align="CENTER"><font style="font-size: 8pt" size="1"><b>User
</b></font></p>
</td>
<td bgcolor="#e0e0e0" width="40%">
<p align="CENTER"><font style="font-size: 8pt" size="1"><b>When executed
</b></font></p>
</td>
</tr>
<tr>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Prolog (from slurm.conf)</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Compute or front end node</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
slurmd daemon</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
SlurmdUser (normally user root)</font></p>
</td>
<td width="40%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
First job or job step initaion on that node</font></p>
</td>
</tr>
<tr>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
PrologSlurmctld (from slurm.conf)</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Head node (where slurmctld daemon runs)</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
slurmctld daemon</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
SlurmctldUser</font></p>
</td>
<td width="40%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
At job allocation</font></p>
</td>
</tr>
<tr>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Epilog (from slurm.conf)</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Compute or front end node</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
slurmd daemon</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
SlurmdUser (normally user root)</font></p>
</td>
<td width="40%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
At job termination</font></p>
</td>
</tr>
<tr>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
EpilogSlurmctld (from slurm.conf)</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Head node (where slurmctld daemon runs)</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
slurmctld daemon</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
SlurmctldUser</font></p>
</td>
<td width="40%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
At job termination</font></p>
</td>
</tr>
</tbody></table>
</center>
<p>This second table below identifies what prologs and epilogs are available for job
step allocations, when and where they run.</p>
<center>
<table style="page-break-inside: avoid; font-family: Arial,Helvetica,sans-serif;" border="1" bordercolor="#000000" cellpadding="6" cellspacing="0" width="100%">
<colgroup><col width="15%">
<col width="15%">
<col width="15%">
<col width="15%">
<col width="40%">
</colgroup><tbody><tr>
<td bgcolor="#e0e0e0" height="18" width="15%">
<p align="CENTER"><font style="font-size: 8pt" size="1"><b>Parameter
</b></font></p>
</td>
<td bgcolor="#e0e0e0" height="18" width="15%">
<p align="CENTER"><font style="font-size: 8pt" size="1"><b>Location
</b></font></p>
</td>
<td bgcolor="#e0e0e0" width="15%">
<p align="CENTER"><font style="font-size: 8pt" size="1"><b>Invoked by
</b></font></p>
</td>
<td bgcolor="#e0e0e0" width="15%">
<p align="CENTER"><font style="font-size: 8pt" size="1"><b>User
</b></font></p>
</td>
<td bgcolor="#e0e0e0" width="40%">
<p align="CENTER"><font style="font-size: 8pt" size="1"><b>When executed
</b></font></p>
</td>
</tr>
<tr>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
SrunProlog (from slurm.conf) or srun --prolog</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
srun invocation node</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
srun command</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
User invoking srun command</font></p>
</td>
<td width="40%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Prior to launching job step</font></p>
</td>
</tr>
<tr>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
TaskProlog (from slurm.conf)</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Compute node</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
slurmstepd daemon</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
User invoking srun command</font></p>
</td>
<td width="40%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Prior to launching job step</font></p>
</td>
</tr>
<tr>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
srun --task-prolog</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Compute node</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
slurmstepd daemon</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
User invoking srun command</font></p>
</td>
<td width="40%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Prior to launching job step</font></p>
</td>
</tr>
<tr>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
TaskEpilog (from slurm.conf)</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Compute node</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
slurmstepd daemon</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
User invoking srun command</font></p>
</td>
<td width="40%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Completion job step</font></p>
</td>
</tr>
<tr>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
srun --task-epilog</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Compute node</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
slurmstepd daemon</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
User invoking srun command</font></p>
</td>
<td width="40%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Completion job step</font></p>
</td>
</tr>
<tr>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
SrunEpilog (from slurm.conf) or srun --epilog</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
srun invocation node</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
srun command</font></p>
</td>
<td height="18" width="15%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
User invoking srun command</font></p>
</td>
<td width="40%">
<p align="LEFT"><font style="font-size: 8pt" size="1">
Completion job step</font></p>
</td>
</tr>
</tbody></table>
</center>
<p>Plugins functions are may also be useful to execute logic at various well
defined points.</p>
<p><a href="spank.html">SPANK</a> is another mechanism that may be useful
to invoke logic in the user commands, slurmd daemon, and slurmstepd daemon.</p>
<h3>Failure Handling</h3>
<p>If the Epilog fails (returns a non-zero exit code), this will result in the
node being set to a DOWN state.
If the EpilogSlurmctld fails (returns a non-zero exit code), this will only
be logged.
If the Prolog fails (returns a non-zero exit code), this will result in the
node being set to a DOWN state and the job requeued to executed on another node.
If the PrologSlurmctld fails (returns a non-zero exit code), this will result
in the job requeued to executed on another node if possible. Only batch jobs
can be requeued. Interactive jobs (salloc and srun) will be cancelled if the
PrologSlurmctld fails.</p>
<HR SIZE=4>
<p>Based upon work by Jason Sollom, Cray Inc. and used by permission.</p>
<p style="text-align:center;">Last modified 26 November 2012</p>
<!--#include virtual="footer.txt"-->