<!--#include virtual="header.txt"-->
<h1><a name="top">SLURM Job Accounting Gather Plugin API</a></h1>
<h2> Overview</h2>
<p> This document describes SLURM job accounting gather plugins and the API that
defines them. It is intended as a resource to programmers wishing to write
their own SLURM job accounting gather plugins.
<p>SLURM job accounting gather plugins must conform to the
SLURM Plugin API with the following specifications:
<p><span class="commandline">const char
plugin_name[]="<i>full&nbsp;text&nbsp;name</i>"</span>
<p style="margin-left:.2in">
A free-formatted ASCII text string that identifies the plugin.
<p><span class="commandline">const char
plugin_type[]="<i>major/minor</i>"</span><br>
<p style="margin-left:.2in">
The major type must be &quot;jobacct_gather&quot;.
The minor type can be any suitable name
for the type of accounting package. The following minor types are currently in use:
<ul>
<li><b>aix</b>: Gathers information from AIX /proc table and adds this
information to the standard rusage information also gathered for each job.
<li><b>cgroup</b>: Gathers information from Linux cgroup
infrastructure and adds this information to the standard rusage
information also gathered for each job. (Experimental, not to be used
in production.)
<li><b>linux</b>: Gathers information from Linux /proc table and adds this
information to the standard rusage information also gathered for each job.
<li><b>none</b>: No information gathered.
</ul>
The <b>sacct</b> program can be used to display gathered data from regular
accounting and from these plugins.
<p>The programmer is urged to study
<span class="commandline">src/plugins/jobacct_gather/linux</span> and
<span class="commandline">src/common/slurm_jobacct_gather.[c|h]</span>
for a sample implementation of a SLURM job accounting gather plugin.
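<p>As a minimal sketch of the identification symbols described above, a
plugin might declare them as follows. The name text and the
&quot;example&quot; minor type are placeholders, and
<span class="commandline">sketch_type_is_jobacct_gather()</span> is a
hypothetical helper that is not part of SLURM; it only illustrates the
<i>major/minor</i> convention.</p>

```c
#include <string.h>

/* Identification symbols described above. The name text and the
 * "example" minor type are hypothetical placeholders. */
const char plugin_name[] = "Job accounting gather example plugin";
const char plugin_type[] = "jobacct_gather/example";

/* Hypothetical helper (not part of SLURM): returns 1 if `type` has the
 * form "jobacct_gather/<minor>" with a non-empty minor type, else 0. */
int sketch_type_is_jobacct_gather(const char *type)
{
    static const char major[] = "jobacct_gather/";
    size_t len = sizeof(major) - 1;

    return strncmp(type, major, len) == 0 && type[len] != '\0';
}
```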
<p class="footer"><a href="#top">top</a>
<h2>API Functions</h2>
<p>All of the following functions are required. Functions which are not
implemented must be stubbed.
<p class="commandline">int jobacct_gather_p_poll_data(List task_list, bool pgid_plugin, uint64_t cont_id)
<p style="margin-left:.2in"><b>Description</b>:<br>
Build a table of all current processes.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline"> task_list</span> (in/out) List containing
current processes <br>
<span class="commandline"> pgid_plugin</span> (input) true if we are
running with the pgid plugin<br>
<span class="commandline"> cont_id</span> (input) container id of processes if not running with pgid
<p class="commandline">int jobacct_gather_p_endpoll(void)
<p style="margin-left:.2in"><b>Description</b>:<br>
Called when the process is finished to stop the
polling thread.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline">none</span>
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">SLURM_SUCCESS</span> on success, or<br>
<span class="commandline">SLURM_ERROR</span> on failure.
<p class="commandline">int jobacct_gather_p_add_task(pid_t pid, uint16_t tid)
<p style="margin-left:.2in"><b>Description</b>:<br>
Used to add a task to the poller.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline"> pid</span> (input) Process id <br>
<span class="commandline"> tid</span> (input) slurm global task id
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">SLURM_SUCCESS</span> on success, or<br>
<span class="commandline">SLURM_ERROR</span> on failure.
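<p>Since unimplemented functions must be stubbed, a plugin that gathers
nothing could look roughly like the sketch below. The
<span class="commandline">List</span> typedef and the return-code macros
are stand-ins so the fragment compiles outside the SLURM source tree; a
real plugin would take them from the SLURM headers.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Stand-ins so the sketch compiles outside the SLURM source tree.
 * A real plugin would get these from the SLURM headers instead. */
#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1
typedef struct sketch_list *List;   /* opaque placeholder for SLURM's List */

/* Stub: a plugin that gathers nothing leaves the task list untouched. */
int jobacct_gather_p_poll_data(List task_list, bool pgid_plugin,
                               uint64_t cont_id)
{
    (void) task_list;
    (void) pgid_plugin;
    (void) cont_id;
    return SLURM_SUCCESS;
}

/* Stub: there is no polling state to tear down. */
int jobacct_gather_p_endpoll(void)
{
    return SLURM_SUCCESS;
}

/* Stub: the task is accepted but not tracked. */
int jobacct_gather_p_add_task(pid_t pid, uint16_t tid)
{
    (void) pid;
    (void) tid;
    return SLURM_SUCCESS;
}
```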
<h2>Job Account Gathering</h2>
<p>None of the following functions are required, but they may be used.
<p class="commandline">int jobacct_gather_init(void)
<p style="margin-left:.2in"><b>Description</b>:<br>
Loads the job account gather plugin.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">SLURM_SUCCESS</span> on success, or<br>
<span class="commandline">SLURM_ERROR</span> on failure.
<p class="commandline">int jobacct_gather_fini(void)
<p style="margin-left:.2in"><b>Description</b>:<br>
Unloads the job account gathering plugin.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">SLURM_SUCCESS</span> on success, or<br>
<span class="commandline">SLURM_ERROR</span> on failure.
<p class="commandline">int jobacct_gather_startpoll(uint16_t frequency)
<p style="margin-left:.2in"><b>Description</b>:<br>
Creates and starts the polling thread.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline"> frequency </span> (input) frequency of the polling.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">SLURM_SUCCESS</span> on success, or<br>
<span class="commandline">SLURM_ERROR</span> on failure.
<p class="commandline">void jobacct_gather_change_poll(uint16_t frequency)
<p style="margin-left:.2in"><b>Description</b>:<br>
Changes the polling thread to a new frequency.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline"> frequency </span> (input) frequency of the polling
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">none</span>
<p class="commandline">void jobacct_gather_suspend_poll(void)
<p style="margin-left:.2in"><b>Description</b>:<br>
Temporarily stops the polling thread.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">none</span>
<p class="commandline">void jobacct_gather_resume_poll(void)
<p style="margin-left:.2in"><b>Description</b>:<br>
Resumes the polling thread that was stopped.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">none</span>
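<p>The polling life cycle (start, change frequency, suspend, resume) can
be pictured with the thread-free sketch below. Every
<span class="commandline">sketch_</span> name is hypothetical, and two
behaviours are assumptions of this sketch rather than documented facts:
starting the poller twice is treated as an error, and a frequency of 0
is treated as &quot;do not poll&quot;. The real functions drive a polling
thread instead of plain flags.</p>

```c
#include <stdbool.h>
#include <stdint.h>

#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1

/* Hypothetical poller state; the real implementation runs a thread. */
static bool sketch_running = false;
static bool sketch_suspended = false;
static uint16_t sketch_frequency = 0;

int sketch_startpoll(uint16_t frequency)
{
    if (sketch_running)
        return SLURM_ERROR;   /* assumed: starting twice is an error */
    sketch_running = true;
    sketch_suspended = false;
    sketch_frequency = frequency;
    return SLURM_SUCCESS;
}

/* The next three mirror the void signatures listed above. */
void sketch_change_poll(uint16_t frequency)
{
    sketch_frequency = frequency;
}

void sketch_suspend_poll(void)
{
    sketch_suspended = true;
}

void sketch_resume_poll(void)
{
    sketch_suspended = false;
}

/* True when a polling pass should gather data. Treating a frequency
 * of 0 as "never poll" is an assumption of this sketch. */
bool sketch_should_poll(void)
{
    return sketch_running && !sketch_suspended && sketch_frequency > 0;
}
```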
<p class="commandline">jobacctinfo_t *jobacct_gather_stat_task(pid_t pid)
<p style="margin-left:.2in"><b>Description</b>:<br>
Gets the basic accounting information for the task.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline">pid</span> (input) process id.
<p style="margin-left:.2in"><b>Returns</b>: <br>
A pointer to a <span class="commandline">jobacctinfo_t</span> structure
holding the task's accounting information, or<br>
<span class="commandline">NULL</span> on failure.
<p class="commandline">jobacctinfo_t *jobacct_gather_remove_task(pid_t pid)
<p style="margin-left:.2in"><b>Description</b>:<br>
Removes the task.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline">pid</span> (input) process id.
<p style="margin-left:.2in"><b>Returns</b>: <br>
A pointer to the <span class="commandline">jobacctinfo_t</span> structure
of the removed task, or<br>
<span class="commandline">NULL</span> on failure.
<p class="commandline">int
jobacct_gather_set_proctrack_container_id(uint64_t id)
<p style="margin-left:.2in"><b>Description</b>:<br>
Sets the proctrack container to a given id.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline">id</span> (input) id to set.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">SLURM_SUCCESS</span> on success, or<br>
<span class="commandline">SLURM_ERROR</span> on failure.
<p class="commandline">int jobacct_gather_set_mem_limit(uint32_t job_id,
uint32_t step_id, uint32_t mem_limit)
<p style="margin-left:.2in"><b>Description</b>:<br>
Sets the memory limit of the job account.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline">job_id</span> (input) id of the job.<br>
<span class="commandline">step_id</span> (input) id of the step.<br>
<span class="commandline">mem_limit</span> (input) memory limit in megabytes.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">SLURM_SUCCESS</span> on success, or<br>
<span class="commandline">SLURM_ERROR</span> on failure.
<p class="commandline">void jobacct_gather_handle_mem_limit(uint32_t total_job_mem, uint32_t total_job_vsize)
<p style="margin-left:.2in"><b>Description</b>:<br>
Called to find out how much memory is used.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline"> total_job_mem</span> (input) total
amount of memory used by the job.<br>
<span class="commandline"> total_job_vsize</span> (input) the
total virtual memory size of the job.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">none</span>
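<p>The relationship between setting a limit in megabytes and later
checking usage against it can be sketched as below. The
<span class="commandline">sketch_</span> names are hypothetical, and the
choice of kilobytes for the stored limit and for
<span class="commandline">total_job_mem</span> is an assumption made for
illustration; this document does not state the unit of the usage
totals.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stored limit, in kilobytes; 0 means "no limit set". */
static uint64_t sketch_mem_limit_kb = 0;

/* mem_limit is in megabytes, as documented above. Converting it to
 * kilobytes is an assumption of this sketch. */
void sketch_set_mem_limit(uint32_t mem_limit)
{
    sketch_mem_limit_kb = (uint64_t) mem_limit * 1024;
}

/* Returns true if the (assumed kilobyte) usage total exceeds the limit. */
bool sketch_over_mem_limit(uint32_t total_job_mem)
{
    return sketch_mem_limit_kb != 0 &&
           (uint64_t) total_job_mem > sketch_mem_limit_kb;
}
```

<p>A 64-bit limit is used so that multiplying a large megabyte value by
1024 cannot overflow.</p>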
<h2>Job Account Info</h2>
<p>None of the following functions are required, but they may be used.
<p class="commandline">jobacctinfo_t *jobacctinfo_create(jobacct_id_t *jobacct_id)
<p style="margin-left:.2in"><b>Description</b>:<br>
Creates the job account info.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline"> jobacct_id</span> (input) the job
account id.
<p style="margin-left:.2in"><b>Returns</b>: <br>
A pointer to the newly created
<span class="commandline">jobacctinfo_t</span> structure.
<p class="commandline">void jobacctinfo_destroy(void *object)
<p style="margin-left:.2in"><b>Description</b>:<br>
Destroys the job account info.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline"> object</span> (input) the job account info to destroy.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">none</span>
<p class="commandline">int jobacctinfo_setinfo(jobacctinfo_t *jobacct, enum jobacct_data_type type, void *data)
<p style="margin-left:.2in"><b>Description</b>:<br>
Set the information for the job.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline"> jobacct</span> (input) job account<br>
<span class="commandline"> type</span> (input) enum telling the plugin how to transform the data.<br>
<span class="commandline"> data</span> (input/output) Is a void * and
the actual data type depends upon the second argument to this function (type).
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">SLURM_SUCCESS</span> on success, or<br>
<span class="commandline">SLURM_ERROR</span> on failure.
<p class="commandline">int jobacctinfo_getinfo(jobacctinfo_t *jobacct, enum jobacct_data_type type, void *data)
<p style="margin-left:.2in"><b>Description</b>:<br>
Gets the information about the job.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline"> jobacct</span> (input) job account.<br>
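<span class="commandline">type</span> (input) enum telling the plugin
which data to retrieve.<br>
<span class="commandline">data</span> (output) Is a void * and
the actual data type depends upon the second argument to this function (type).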
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">SLURM_SUCCESS</span> on success, or<br>
<span class="commandline">SLURM_ERROR</span> on failure.
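<p>The <span class="commandline">type</span>/<span class="commandline">void *data</span>
pattern used by the set and get functions can be illustrated with the
sketch below. The enum values, the structure and the
<span class="commandline">sketch_</span> functions are all hypothetical;
the real <span class="commandline">enum jobacct_data_type</span> and
<span class="commandline">jobacctinfo_t</span> are defined in the SLURM
sources and are not reproduced here.</p>

```c
#include <stddef.h>
#include <stdint.h>

#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1

/* Hypothetical selectors and record, for illustration only. */
enum sketch_data_type { SKETCH_DATA_MAX_RSS, SKETCH_DATA_TOT_CPU };

struct sketch_jobacct {
    uint32_t max_rss;
    uint32_t tot_cpu;
};

/* `type` decides how the untyped `data` pointer is interpreted. */
int sketch_setinfo(struct sketch_jobacct *jobacct,
                   enum sketch_data_type type, void *data)
{
    if (!jobacct || !data)
        return SLURM_ERROR;
    switch (type) {
    case SKETCH_DATA_MAX_RSS:
        jobacct->max_rss = *(uint32_t *) data;
        break;
    case SKETCH_DATA_TOT_CPU:
        jobacct->tot_cpu = *(uint32_t *) data;
        break;
    default:
        return SLURM_ERROR;   /* unknown selector */
    }
    return SLURM_SUCCESS;
}

/* Mirror image of sketch_setinfo: copies the selected field out. */
int sketch_getinfo(struct sketch_jobacct *jobacct,
                   enum sketch_data_type type, void *data)
{
    if (!jobacct || !data)
        return SLURM_ERROR;
    switch (type) {
    case SKETCH_DATA_MAX_RSS:
        *(uint32_t *) data = jobacct->max_rss;
        break;
    case SKETCH_DATA_TOT_CPU:
        *(uint32_t *) data = jobacct->tot_cpu;
        break;
    default:
        return SLURM_ERROR;
    }
    return SLURM_SUCCESS;
}
```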
<p class="commandline"> void jobacctinfo_pack(jobacctinfo_t *jobacct,
uint16_t rpc_version, Buf buffer)
<p style="margin-left:.2in"><b>Description</b>:<br>
Packs the job account information.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline">jobacct</span> (input) the job account.<br>
<span class="commandline">rpc_version</span> (input) the
rpc version.<br>
<span class="commandline">buffer</span> (input) the buffer.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">none</span>
<p class="commandline">int jobacctinfo_unpack(jobacctinfo_t **jobacct,
uint16_t rpc_version, Buf buffer)
<p style="margin-left:.2in"><b>Description</b>:<br>
Unpacks the job account information.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline">jobacct</span> (output) the unpacked job account.<br>
<span class="commandline">rpc_version</span> (input) the rpc
version.<br>
<span class="commandline">buffer</span> (input) the buffer.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">SLURM_SUCCESS</span> on success, or<br>
<span class="commandline">SLURM_ERROR</span> on failure.
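<p>A rough picture of the pack/unpack round trip is sketched below. The
buffer type, record layout and <span class="commandline">sketch_</span>
functions are hypothetical stand-ins for SLURM's
<span class="commandline">Buf</span> and
<span class="commandline">jobacctinfo_t</span>; the big-endian byte
layout and the way <span class="commandline">rpc_version</span> is
ignored are assumptions of this sketch. It does mirror the signatures
above in that packing returns nothing while unpacking allocates the
result through a double pointer and reports success or failure.</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1

/* Hypothetical fixed-size buffer standing in for SLURM's Buf. */
struct sketch_buf {
    uint8_t data[64];
    size_t used;   /* bytes written so far */
    size_t read;   /* bytes consumed so far */
};

/* Hypothetical record standing in for jobacctinfo_t. */
struct sketch_acct {
    uint32_t max_rss;
    uint32_t tot_cpu;
};

static void sketch_pack32(uint32_t val, struct sketch_buf *buf)
{
    if (sizeof(buf->data) - buf->used < 4)
        return;   /* buffer full: the value is dropped in this sketch */
    buf->data[buf->used++] = (uint8_t) (val >> 24);
    buf->data[buf->used++] = (uint8_t) (val >> 16);
    buf->data[buf->used++] = (uint8_t) (val >> 8);
    buf->data[buf->used++] = (uint8_t) val;
}

static int sketch_unpack32(uint32_t *val, struct sketch_buf *buf)
{
    const uint8_t *p = buf->data + buf->read;

    if (buf->used - buf->read < 4)
        return SLURM_ERROR;   /* not enough bytes left */
    *val = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
    buf->read += 4;
    return SLURM_SUCCESS;
}

void sketch_pack(const struct sketch_acct *acct, uint16_t rpc_version,
                 struct sketch_buf *buf)
{
    (void) rpc_version;   /* a real packer would branch on the version */
    sketch_pack32(acct->max_rss, buf);
    sketch_pack32(acct->tot_cpu, buf);
}

/* On success *acct points to a malloc'ed record the caller must free;
 * on failure *acct is set to NULL. */
int sketch_unpack(struct sketch_acct **acct, uint16_t rpc_version,
                  struct sketch_buf *buf)
{
    struct sketch_acct *tmp = malloc(sizeof(*tmp));

    (void) rpc_version;
    *acct = NULL;
    if (!tmp)
        return SLURM_ERROR;
    if (sketch_unpack32(&tmp->max_rss, buf) != SLURM_SUCCESS ||
        sketch_unpack32(&tmp->tot_cpu, buf) != SLURM_SUCCESS) {
        free(tmp);
        return SLURM_ERROR;
    }
    *acct = tmp;
    return SLURM_SUCCESS;
}
```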
<p class="commandline">void jobacctinfo_aggregate(jobacctinfo_t *dest, jobacctinfo_t *from)
<p style="margin-left:.2in"><b>Description</b>:<br>
Aggregates the accounting data of one job account into another.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline">dest</span> (input/output) job account that receives the aggregated data.<br>
<span class="commandline">from</span> (input) job account whose data is added to <span class="commandline">dest</span>.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">none</span>
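<p>One plausible reading of aggregation is that totals are summed while
peak values keep the larger of the two, as in the sketch below. The
structure, its fields and
<span class="commandline">sketch_aggregate()</span> are hypothetical;
which fields the real function sums and which it maximises is not
specified in this document.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical usage record, for illustration only. */
struct sketch_usage {
    uint32_t max_rss;   /* peak value: keep the larger one */
    uint32_t tot_rss;   /* running total: add */
    uint32_t tot_cpu;   /* running total: add */
};

/* Folds `from` into `dest`; `from` is left unchanged. */
void sketch_aggregate(struct sketch_usage *dest,
                      const struct sketch_usage *from)
{
    if (!dest || !from)
        return;
    if (from->max_rss > dest->max_rss)
        dest->max_rss = from->max_rss;
    dest->tot_rss += from->tot_rss;
    dest->tot_cpu += from->tot_cpu;
}
```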
<p class="commandline">void jobacctinfo_2_stats(slurmdb_stats_t *stats, jobacctinfo_t *jobacct)
<p style="margin-left:.2in"><b>Description</b>:<br>
Copies the statistics of the job account into a SLURM database statistics structure.
<p style="margin-left:.2in"><b>Arguments</b>: <br>
<span class="commandline">stats</span> (output) SLURM database statistics structure to fill in.<br>
<span class="commandline">jobacct</span> (input) the job account.
<p style="margin-left:.2in"><b>Returns</b>: <br>
<span class="commandline">none</span>
<p class="footer"><a href="#top">top</a>
<h2>Parameters</h2>
<p>These parameters can be used in the slurm.conf to configure the
plugin and the frequency at which to gather information about running jobs.</p>
<dl>
<dt><span class="commandline">JobAcctGatherType</span>
<dd>Specifies which plugin should be used.
<dt><span class="commandline">JobAcctGatherFrequency</span>
<dd>Time interval between pollings in seconds.
</dl>
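<p>For illustration, a slurm.conf fragment that selects the
<b>linux</b> plugin and polls every 30 seconds might look like this.
The interval is only an example value.</p>

```
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=30
```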
<h2>Versioning</h2>
<p> This document describes version 2 of the SLURM Job Accounting Gather API. Future
releases of SLURM may revise this API. A job accounting gather plugin conveys its
ability to implement a particular API version using the mechanism outlined
for SLURM plugins.</p>
<p class="footer"><a href="#top">top</a>
<p style="text-align:center;">Last modified 21 June 2012</p>
<!--#include virtual="footer.txt"-->