| <!--#include virtual="header.txt"--> |
| |
| <h1><a name="top">SLURM Job Accounting Gather Plugin API</a></h1> |
| |
| <h2> Overview</h2> |
| <p> This document describes SLURM job accounting gather plugins and the API that |
| defines them. It is intended as a resource to programmers wishing to write |
| their own SLURM job accounting gather plugins. |
| |
| <p>SLURM job accounting gather plugins must conform to the |
| SLURM Plugin API with the following specifications: |
| |
| <p><span class="commandline">const char |
| plugin_name[]="<i>full text name</i>"</span> |
| <p style="margin-left:.2in"> |
| A free-formatted ASCII text string that identifies the plugin. |
| |
| <p><span class="commandline">const char |
| plugin_type[]="<i>major/minor</i>"</span><br> |
| <p style="margin-left:.2in"> |
| The major type must be "jobacct_gather." |
| The minor type can be any suitable name |
| for the type of accounting package. We currently use |
| <ul> |
| <li><b>aix</b>— Gathers information from AIX /proc table and adds this |
| information to the standard rusage information also gathered for each job. |
| <li><b>cgroup</b>—Gathers information from Linux cgroup |
| infrastructure and adds this information to the standard rusage |
| information also gathered for each job. (Experimental, not to be used |
| in production.) |
| <li><b>linux</b>—Gathers information from Linux /proc table and adds this |
| information to the standard rusage information also gathered for each job. |
| <li><b>none</b>—No information gathered. |
| </ul> |
| The <b>sacct</b> program can be used to display gathered data from regular |
| accounting and from these plugins. |
| <p>The programmer is urged to study |
| <span class="commandline">src/plugins/jobacct_gather/linux</span> and |
| <span class="commandline">src/common/slurm_jobacct_gather.[c|h]</span> |
| for a sample implementation of a SLURM job accounting gather plugin. |
| <p class="footer"><a href="#top">top</a> |
| |
| |
| <h2>API Functions</h2> |
| <p>All of the following functions are required. Functions which are not |
| implemented must be stubbed. |
| |
| <p class="commandline">int jobacct_gather_p_poll_data(List task_list, bool pgid_plugin, uint64_t cont_id) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Build a table of all current processes. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline"> task_list</span> (in/out) List containing |
| current processes <br> |
| <span class="commandline"> pgid_plugin</span> (input) if we are |
| running with the pgid plugin<br> |
| <span class="commandline"> cont_id</span> (input) container id of processes if not running with pgid |
| |
| <p class="commandline">int jobacct_gather_p_endpoll(void) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Called when the process is finished to stop the |
| polling thread. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline">none</span> |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">int jobacct_gather_p_add_task(pid_t pid, uint16_t tid) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Used to add a task to the poller. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline"> pid</span> (input) Process id <br> |
| <span class="commandline"> tid</span> (input) slurm global task id |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| |
| |
| <h2>Job Account Gathering</h2> |
| <p>All of the following functions are not required but may be used. |
| |
| <p class="commandline">int jobacct_gather_init(void) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Loads the job account gather plugin. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">int jobacct_gather_fini(void) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Unloads the job account gathering plugin. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">int jobacct_gather_startpoll(uin16_t frequency) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Creates and starts the polling thread. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline"> frequency </span> (input) frequency of the polling. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">void jobacct_gather_change_poll(uint16_t frequency) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Changes the polling thread to a new frequency. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline"> frequency </span> (input) frequency of the polling |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">void jobacct_gather_suspend_poll(void) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Temporarily stops the polling thread. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">void jobacct_gather_resume_poll(void) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Resumes the polling thread that was stopped. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandine">jobacctinfo_t *jobacct_gather_stat_task(pid_t |
| pid) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Gets the basis of the information of the task. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline">pid</span> (input) process id. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">jobacctinfo_t *jobacct_gather_remove_task(pid_t pid) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Removes the task. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline">pid</span> (input) process id. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">int |
| jobacct_gather_set_proctrack_container_id(uint64_t id) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Sets the proctrack container to a given id. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline">id</span> (input) id to set. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">int jobacct_gather_set_mem_limit(uint32_t job_id, |
| uint32_t step_id,uint32_t mem_limit) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Sets the memory limit of the job account. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline">job_id</span> (input) id of the job.<br> |
| <span class="commandline">sted_id</span> (input) id of the step.<br> |
| <span class="commandline">mem_limit</span> (input) memory limit in megabytes. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">void jobacct_gather_handle_mem_limit(uint32_t total_job_mem, uint32_t total_job_vsize) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Called to find out how much memory is used. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline"> total_job_mem</span> (input) total |
| amount of memory for jobs.<br> |
| <span class="commandline"> total_job_vsize</span> (input) the |
| total job size. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <h2>Job Account Info</h2> |
| <p>All of the following functions are not required but may be used. |
| |
| <p class="commandline">jobacctinfo_t *jobacctinfo_create(jobacct_id_t *jobacct_id) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Creates the job account info. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline"> jobacct_id</span> (input) the job |
| account id. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">void jobacctinfo_destroy(void *object) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Destroys the job account info. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline"> object</span> (input) the job that needs to be destroyed |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">int jobacctinfo_setinfo(jobacctinfo_t *jobacct, enum jobacct_data_type type, void *data) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Set the information for the job. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline"> jobacct</span> (input) job account<br> |
| <span class="commandline"> type</span>(input) enum telling the plugin how to transform the data.<br> |
| <span class="commandline"> data</span> (input/output) Is a void * and |
| the actual data type depends upon the first argument to this function (type). |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">int jobacctinfo_getinfo(jobacctinfo_t *jobacct, enum jobacct_data_type type, void *data) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Gets the information about the job. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline"> jobacct</span> (input) job account.<br> |
| <span class="commandline">type</span> (input) the |
| data type of the job account. |
| <span class="commandline">data</span> |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline"> void jobacctinfo_pack(jobacctinfo_t *jobacct, |
| uint16_t rpc_version, Buf buffer) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Packs the job account information. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline">jobacct</span> (input) the job account.<br> |
| <span class="commandline">rpc_version</span> (input) the |
| rpc version.<br> |
| <span class="commandline">buffer</span> (input) the buffer. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">int jobacctinfo_unpack(jobacctinfo_t **jobacct, |
| uint16_t rpc_version, Buf buffer) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Unpacks the job account information. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline">jobacct</span> (input) the job account.<br> |
| <span class="commandline">rpc_version</span> (input) the rpc |
| version.<br> |
| <span class="commandline">buffer</span> (input) the buffer. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">void jobacctinfo_aggregate(jobacctinfo_t *dest, jobacctinfo_t *from) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Aggregates the jobs. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline">dest</span> (input) New destination of the job.<br> |
| <span class="commandline">from</span> (input) Original location of job. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="commandline">void jobacctinfo_2_stats(slurmdb_stats_t *stats, jobacctinfo_t *jobacct) |
| <p style="margin-left:.2in"><b>Description</b>:<br> |
| Gets the stats of the job in accounting. |
| <p style="margin-left:.2in"><b>Arguments</b>: <br> |
| <span class="commandline">stats</span> (input) slurm data base stat.<br> |
| <span class="commandline">jobacct</span> (input) the job account. |
| <p style="margin-left:.2in"><b>Returns</b>: <br> |
| <span class="commandline">SLURM_SUCCESS</span> on success, or<br> |
| <span class="commandline">SLURM_ERROR</span> on failure. |
| |
| <p class="footer"><a href="#top">top</a> |
| |
| <h2>Parameters</h2> |
| <p>These parameters can be used in the slurm.conf to configure the |
| plugin and the frequency at which to gather information about running jobs.</p> |
| <dl> |
| <dt><span class="commandline">JobAcctGatherType</span> |
| <dd>Specifies which plugin should be used. |
| <dt><span class="commandline">JobAcctGatherFrequency</span> |
| <dd>Time interval between pollings in seconds. |
| </dl> |
| |
| <h2>Versioning</h2> |
| <p> This document describes version 2 of the SLURM Job Accounting Gather API. Future |
| releases of SLURM may revise this API. A job accounting gather plugin conveys its |
| ability to implement a particular API version using the mechanism outlined |
| for SLURM plugins.</p> |
| |
| <p class="footer"><a href="#top">top</a> |
| |
| <p style="text-align:center;">Last modified 21 June 2012</p> |
| |
| <!--#include virtual="footer.txt"--> |