| <!--#include virtual="header.txt"--> |
| |
| <h2><a name="top">SLURM Switch Plugin API</a></h2> |
| |
| <h3>Overview</h3> |
| <p>All of the SLURM commands utilize a collection of Application Progamming |
| Interfaces (APIs). |
| User and system applications can directly use these APIs as desired to |
| achieve tighter integration with SLURM. |
| For example, SLURM data structures and error codes can be directly |
| examined rather than executing SLURM commands and parsing their output. |
| This document describes SLURM APIs. |
| You should see the man pages for individual APIs to get more details.</p> |
| |
| <h3>Get Overall SLURM Information</h3> |
| <ul> |
| |
| <li><b>slurm_api_version</b>—Get SLURM API version number.</li> |
| |
| <li><b>slurm_load_ctl_conf</b>—Load system-wide configuration |
| specifications. Free with <i>slurm_free_ctl_conf</i> to avoid memory |
| leak.</li> |
| |
| <li><b>slurm_print_ctl_conf</b>—Print system-wide configuration |
| specifications.</li> |
| |
| <li><b>slurm_free_ctl_conf</b>—Free storage allocated by |
| <i>slurm_load_ctl_conf</i>.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Get Job Information</h3> |
| <ul> |
| |
| <li><b>slurm_pid2jobid</b>—For a given process ID on a node |
| get the corresponding SLURM job ID.</li> |
| |
| <li><b>slurm_get_end_time</b>—For a given SLURM job ID |
| get the expected termination time.</li> |
| |
| <li><b>slurm_load_jobs</b>—Load job information. |
| Free with <i>slurm_free_job_info_msg</i> to avoid memory leak.</li> |
| |
| <li><b>slurm_print_job_info_msg</b>—Print information about |
| all jobs.</li> |
| |
| <li><b>slurm_print_job_info</b>—Print information about |
| a specific job.</li> |
| |
| <li><b>slurm_get_select_jobinfo</b>—Get <i>select</i> plugin |
| specific information associated with the job. The information |
| available is will vary by select plugin type configured.</li> |
| |
| <li><b>slurm_free_job_info_msg</b>—Free storage allocated by |
| <i>slurm_load_jobs</i>.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Get Job Step Information</h3> |
| <ul> |
| |
| <li><b>slurm_get_job_steps</b>—Load job step information. |
| Free with <i>slurm_free_job_step_info_response_msg</i> to |
| avoid memory leak.</li> |
| |
| <li><b>slurm_print_job_step_info_msg</b>—Print information about |
| all job steps.</li> |
| |
| <li><b>slurm_print_job_step_info</b>—Print information about |
| a specific job step.</li> |
| |
| <li><b>slurm_free_job_step_info_response_msg</b>—Free storage |
| allocated by <i>slurm_get_job_steps</i>.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Get Node Information</h3> |
| <ul> |
| |
| <li><b>slurm_load_node</b>—Load node information. |
| Free with <i>slurm_free_node_info</i> to avoid memory leak.</li> |
| |
| <li><b>slurm_print_node_info_msg</b>—Print information about |
| all nodes.</li> |
| |
| <li><b>slurm_print_node_table</b>—Print information about |
| a specific node.</li> |
| |
| <li><b>slurm_free_node_info</b>—Free storage |
| allocated by <i>slurm_load_node</i>.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Get Partition Information</h3> |
| <ul> |
| |
| <li><b>slurm_load_partitions</b>—Load partition (queue) information. |
| Free with <i>slurm_free_partition_info</i> to avoid memory leak.</li> |
| |
| <li><b>slurm_print_partition_info_msg</b>—Print information about |
| all partitions.</li> |
| |
| <li><b>slurm_print_partition_info</b>—Print information about |
| a specific partition.</li> |
| |
| <li><b>slurm_free_partition_info</b>—Free storage |
| allocated by <i>slurm_load_partitions</i>.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Error Handling</h3> |
| <ul> |
| |
| <li><b>slurm_get_errno</b>—Return the error code set by the |
| last SLURM API function executed.</li> |
| |
| <li><b>slurm_perror</b>—Print SLURM error information to |
| standard output.</li> |
| |
| <li><b>slurm_strerror</b>—Return a string describing a specific |
| SLURM error code.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Resource Allocation</h3> |
| <ul> |
| |
| <li><b>slurm_init_job_desc_msg</b>—Initialize the data structure |
| used in resource allocation requests. You can then just set the fields |
| of particular interest and let the others use default values.</li> |
| |
| <li><b>slurm_job_will_run</b>—Determine if a job would be |
| immediately initiated if submitted now.</li> |
| |
| <li><b>slurm_allocate_resources</b>—Allocate resources for a job. |
| Response message must be freed using |
| <i>slurm_free_resource_allocation_response_msg</i> to avoid a |
| memory leak.</li> |
| |
| <li><b>slurm_free_resource_allocation_response_msg</b>— |
| Frees memory allocated by <i>slurm_allocate_resources</i>.</li> |
| |
| <li><b>slurm_allocate_resources_and_run</b>—Allocate resources for a |
| job and spawn a job step. Response message must be freed using |
| <i>slurm_free_resource_allocation_and_run_response_msg</i> to avoid a |
| memory leak.</li> |
| |
| <li><b>slurm_free_resource_allocation_and_run_response_msg</b>— |
| Frees memory allocated by <i>slurm_allocate_resources_and_run</i>.</li> |
| |
| <li><b>slurm_submit_batch_job</b>—Submit a script for later |
| execution. Response message must be freed using |
| <i>slurm_free_submit_response_response_msg</i> to avoid a |
| memory leak.</li> |
| |
| <li><b>slurm_free_submit_response_response_msg</b>— |
| Frees memory allocated by <i>slurm_submit_batch_job</i>.</li> |
| |
| <li><b>slurm_confirm_allocation</b>—Test if a resource allocation has |
| already been made for a given job id. Response message must be freed using |
| <i>slurm_free_resource_allocation_response_msg</i> to avoid a |
| memory leak. This can be used to confirm that an |
| allocation is still active or for error recovery.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Job Step Creation</h3> |
| <p>SLURM job steps involve numerous interactions with the |
| <i>slurmd</i> daemon. The job step creation is only the |
| first step in the process. We don't advise direct user |
| creation of job steps, but include the information here |
| for completeness.</p> |
| <ul> |
| |
| <li><b>slurm_job_step_create</b>—Initiate a job step. |
| Allocated memory must be freed by |
| <i>slurm_free_job_step_create_response_msg</i> to avoid a |
| memory leak.</li> |
| |
| <li><b>slurm_free_job_step_create_response_msg</b>—Free |
| memory allocated by <i>slurm_job_step_create</i>. |
| |
| <li><b>slurm_step_ctx_create</b>—Create job step context. |
| Destroy using <i>slurm_step_ctx_destroy</i>.</li> |
| |
| <li><b>slurm_step_ctx_destroy</b>—Destroy a job step context |
| created by <i>slurm_step_ctx_create</i>.</li> |
| |
| <li><b>slurm_step_ctx_get</b>—Get values from job step context.</li> |
| |
| <li><b>slurm_step_ctx_set</b>—Set values in job step context.</li> |
| |
| <li><b>slurm_jobinfo_ctx_get</b>—Get values from a <i>jobinfo</i> |
| field as returned by <i>slurm_step_ctx_get</i>.</li> |
| |
| <li><b>slurm_spawn</b>—Spawn tasks and establish communcations.</li> |
| |
| <li><b>slurm_spawn_kill</b>—Signal spawned tasks.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Job and Job Step Signaling and Cancelling</h3> |
| <ul> |
| |
| <li><b>slurm_kill_job</b>—Signal or cancel a job.</li> |
| |
| <li><b>slurm_kill_job_step</b>—Signal or cancel a job step.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Job Completion</h3> |
| <ul> |
| |
| <li><b>slurm_complete_job</b>—Note completion of a job. |
| Releases resource allocation for the job.</li> |
| |
| <li><b>slurm_complete_job_step</b>—Note completion of a |
| job step.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Checkpoint</h3> |
| <ul> |
| |
| <li><b>slurm_checkpoint_able</b>—Note that a specific job or |
| job step is elligible for checkpoint.</li> |
| |
| <li><b>slurm_checkpoint_complete</b>—Note that a requested |
| checkpoint has completed.</li> |
| |
| <li><b>slurm_checkpoint_create</b>—Request a checkpoint for |
| a specific job step. Continue execution upon completion of the |
| checkpoint.</li> |
| |
| <li><b>slurm_checkpoint_vacate</b>—Request a checkpoint for |
| a specific job step. Terminate execution upon completion of the |
| checkpoint.</li> |
| |
| <li><b>slurm_checkpoint_disable</b>—Make the identified job step |
| non-checkpointable.</li> |
| |
| <li><b>slurm_checkpoint_enable</b>—Make the identified job |
| step checkpointable.</li> |
| |
| <li><b>slurm_checkpoint_error</b>—Get error information for |
| the last checkpoint operation on a given job step.</li> |
| |
| <li><b>slurm_checkpoint_restart</b>—Request that a previously |
| checkpointed job resume execution.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Administrative Functions</h3> |
| <p>Most of these functions can only be exected by user <i>root</i>.</p> |
| <ul> |
| |
| <li><b>slurm_reconfigure</b>—Update slurm daemons |
| based upon current <i>slurm.conf</i> configuration file. |
| Use this after updating the configuration file to |
| insure that it takes effect.</li> |
| |
| <li><b>slurm_shutdown</b>—Terminate slurm daemons.</li> |
| |
| <li><b>slurm_update_job</b>—Update state |
| information associated with a given job.</li> |
| |
| <li><b>slurm_update_node</b>—Update state |
| information associated with a given node. NOTE: Most |
| of a node's characteristics can not be modified.</li> |
| |
| <li><b>slurm_init_part_desc_msg</b>—Initialize a |
| partition update descriptor. Used this to initialize |
| the data structure used in <i>slurm_update_partition</i>.</li> |
| |
| <li><b>slurm_update_partition</b>—Update state |
| information associated with a given partition.</li> |
| |
| <li><b>slurm_delete_partition</b>—Destroy a partition.</li> |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>SLURM Host List Support</h3> |
| <p>SLURM uses a condensed format to express node names. |
| For example <i>linux[1-3,6]</i> represents <i>linux1</i>, |
| <i>linux2</i>, <i>linux3</i>, and <i>linux6</i>. These |
| functions permit you to translate the SLURM expression |
| into a list of individual node names.</p> |
| |
| <ul> |
| |
| <li><b>slurm_hostlist_create</b>—Translate a SLURM |
| node name expression into a record used for parsing. |
| Use <i>slurm_hostlist_destroy</i> to free the allocated |
| storage.</li> |
| |
| <li><b>slurm_hostlist_shift</b>—Get the next node |
| name.</li> |
| |
| <li><b>slurm_hostlist_destroy</b>—Release storage |
| allocated by <i>slurm_hostlist_create</i>. |
| |
| </ul> |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <p style="text-align:center;">Last modified 25 October 2005</p> |
| |
| <!--#include virtual="footer.txt"--> |