| <!--#include virtual="header.txt"--> |
| |
| <h1><a name="top">Resource Selection Plugin Programmer Guide</a></h1> |
| |
| <h2>Overview</h2> |
| <p>This document describes SLURM resource selection plugins and the API that defines |
| them. It is intended as a resource to programmers wishing to write their own SLURM |
| node selection plugins. This is version 100 of the API.</p> |
| |
| <p>SLURM node selection plugins are SLURM plugins that implement the SLURM node selection |
| API described herein. They are intended to provide a mechanism for both selecting |
| nodes for pending jobs and performing any system-specific tasks for job launch or |
| termination. The plugins must conform to the SLURM Plugin API with the following |
| specifications:</p> |
| <p><span class="commandline">const char plugin_type[]</span><br> |
| The major type must be "select". The minor type can be any recognizable |
| abbreviation for the type of node selection algorithm. We recommend, for example:</p> |
| <ul> |
| <li><b>bluegene</b>—<a href="http://www.research.ibm.com/bluegene/">IBM Blue Gene</a> |
| node selector. Note that this plugin not only selects the nodes for a job, but performs |
| some initialization and termination functions for the job. Use this plugin for |
| BlueGene/L, BlueGene/P and BlueGene/Q systems.</li> |
| <li><b>cons_res</b>—A plugin that can allocate individual processors, |
| memory, etc. within nodes. This plugin is recommended for systems with |
| many non-parallel programs sharing nodes. For more information see |
| <a href=cons_res.html>Consumable Resources in SLURM</a>.</li> |
| <li><b>cray</b>—Cray XE and XT system node selector. Note that this |
| plugin not only selects the nodes for a job, but performs some initialization |
| and termination functions for the job. This plugin also serves as a wrapper |
| for the <i>select/linear</i> plugin which enforces various limits and |
| provides support for resource selection optimized for the system topology.</li> |
| <li><b>linear</b>—A plugin that selects nodes assuming a one-dimensional |
| array of nodes. The nodes are selected so as to minimize the number of consecutive |
| sets of nodes utilizing a best-fit algorithm. While supporting shared nodes, |
| this plugin does not allocate individual processors, but can allocate memory to jobs. |
| This plugin is recommended for systems without shared nodes.</li> |
| </ul> |
| <p>The <span class="commandline">plugin_name</span> and |
| <span class="commandline">plugin_version</span> |
| symbols required by the SLURM Plugin API require no specialization for node selection support. |
| Note carefully, however, the versioning discussion below.</p> |
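<p>As a minimal sketch of the identification symbols described above, the following declares them for a hypothetical plugin. The minor type <i>example</i>, the descriptive text, the <i>uint32_t</i> type of the version symbol and the helper function are assumptions made for illustration; only the symbol names and the "select" major type come from this guide.</p>

```c
#include <stdint.h>
#include <string.h>

/* Major type "select" plus a minor type chosen by the author (hypothetical). */
const char plugin_type[] = "select/example";

/* Free-form, human-readable description (hypothetical text). */
const char plugin_name[] = "Example node selection plugin";

/* Assumed to carry the API version stated in this guide. */
const uint32_t plugin_version = 100;

/* Illustration only: returns 1 if the type string has the "select" major type. */
int example_has_select_major_type(const char *type)
{
	return strncmp(type, "select/", 7) == 0;
}
```

<p>The helper is not part of the API; it only makes the major/minor naming convention checkable in a unit test.</p>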
| |
| <p>A simplified flow of logic follows: |
| <pre> |
| /* slurmctld daemon starts, recover state */ |
| if ((<i>select_p_node_init</i>() != SLURM_SUCCESS) || |
| (<i>select_p_block_init</i>() != SLURM_SUCCESS) || |
| (<i>select_p_state_restore</i>() != SLURM_SUCCESS) || |
| (<i>select_p_job_init</i>() != SLURM_SUCCESS)) |
| abort |
| |
| /* wait for job arrival */ |
| if (<i>select_p_job_test</i>(all available nodes) != SLURM_SUCCESS) { |
| if (<i>select_p_job_test</i>(all configured nodes) != SLURM_SUCCESS) |
| /* reject the job and tell the user it can never run */ |
| else |
| /* leave the job queued for later execution */ |
| } else { |
| /* update job's node list and node bitmap */ |
| if (<i>select_p_job_begin</i>() != SLURM_SUCCESS) |
| /* leave the job queued for later execution */ |
| else { |
| while (!<i>select_p_job_ready</i>()) |
| wait |
| /* execute the job */ |
| /* wait for job to end or be terminated */ |
| <i>select_p_job_fini</i>() |
| } |
| } |
| |
| /* wait for slurmctld shutdown request */ |
| <i>select_p_state_save</i>() |
| </pre> |
| <p>Depending upon failure modes, it is possible that |
| <span class="commandline">select_p_state_save()</span> |
| will not be called at slurmctld termination. |
| When slurmctld is restarted, other function calls may be replayed. |
| <span class="commandline">select_p_node_init()</span> may be used |
| to synchronize the plugin's state with that of slurmctld.</p> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <h2>Data Objects</h2> |
| <p> These functions are expected to read and/or modify data structures directly in |
| the slurmctld daemon's memory. Slurmctld is a multi-threaded program with independent |
| read and write locks on each data structure type. Therefore the type of operations |
| permitted on various data structures is identified for each function.</p> |
| |
| <p>These functions make use of bitmaps corresponding to the nodes in a table. |
| The function <span class="commandline">select_p_node_init()</span> should |
| be used to establish the initial mapping of bitmap entries to nodes. |
| Functions defined in <i>src/common/bitmap.h</i> should be used for bitmap |
| manipulations (these functions are directly accessible from the plugin).</p> |
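<p>The bit-to-node mapping can be pictured with a small stand-in bitmap in which bit <i>i</i> corresponds to entry <i>i</i> of the node table. This is only a sketch: the type and the <i>example_</i> functions below are hypothetical, and a real plugin would use the functions from <i>src/common/bitmap.h</i> instead.</p>

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in bitmap: bit i corresponds to entry i of the node table. */
typedef struct {
	int      nbits;
	uint8_t *bytes;
} example_bitmap_t;

/* Allocate a bitmap with all bits cleared; returns NULL on failure. */
example_bitmap_t *example_bitmap_alloc(int nbits)
{
	example_bitmap_t *b;

	if (nbits < 0)
		return NULL;
	b = malloc(sizeof(*b));
	if (!b)
		return NULL;
	b->nbits = nbits;
	b->bytes = calloc((size_t)nbits / 8 + 1, 1);
	if (!b->bytes) {
		free(b);
		return NULL;
	}
	return b;
}

void example_bitmap_free(example_bitmap_t *b)
{
	if (b) {
		free(b->bytes);
		free(b);
	}
}

/* Mark the node at index node_inx; out-of-range indexes are ignored. */
void example_bitmap_set(example_bitmap_t *b, int node_inx)
{
	if (b && node_inx >= 0 && node_inx < b->nbits)
		b->bytes[node_inx / 8] |= (uint8_t)(1u << (node_inx % 8));
}

/* Return 1 if the node at index node_inx is marked, 0 otherwise. */
int example_bitmap_test(const example_bitmap_t *b, int node_inx)
{
	if (!b || node_inx < 0 || node_inx >= b->nbits)
		return 0;
	return (b->bytes[node_inx / 8] >> (node_inx % 8)) & 1;
}
```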
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <h2>API Functions</h2> |
| <p>The following functions must appear. Functions which are not implemented should |
| be stubbed.</p> |
| |
| <h3>State Save Functions</h3> |
| |
| <p class="commandline">int select_p_state_save (char *dir_name);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Save any global node selection state |
| information to a file within the specified directory. The actual file name used is plugin specific. |
| It is recommended that the global node selection state contain a magic number for validation purposes. |
| This function is called by the slurmctld daemon on shutdown.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<span class="commandline"> dir_name</span> |
| (input) fully-qualified pathname of a directory into which user SlurmUser (as defined |
| in slurm.conf) can create a file and write state information into that file. Cannot be NULL.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_state_restore (char *dir_name);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Restore any global node selection state |
| information from a file within the specified directory. The actual file name used is plugin specific. |
| It is recommended that any magic number associated with the global node selection state be verified. |
| This function is called by the slurmctld daemon on startup.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<span class="commandline"> dir_name</span> |
| (input) fully-qualified pathname of a directory containing a state information file |
| from which user SlurmUser (as defined in slurm.conf) can read. Cannot be NULL.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR, causing slurmctld to exit.</p> |
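<p>One way to follow the magic number recommendation is sketched below. Everything named <i>example_</i> or <i>EXAMPLE_</i>, including the file name, the magic value and the single-integer state, is hypothetical, and the return codes are local stand-ins for the SLURM definitions. The state is written in host byte order, which assumes it is restored on the same architecture.</p>

```c
#include <stdint.h>
#include <stdio.h>

#define SLURM_SUCCESS 0      /* stand-ins for the SLURM return codes */
#define SLURM_ERROR   (-1)

#define EXAMPLE_MAGIC 0x53454c31u             /* hypothetical magic number */
#define EXAMPLE_FILE  "example_select_state"  /* hypothetical file name */

uint32_t example_state = 0;  /* hypothetical global selection state */

/* Build "<dir_name>/<EXAMPLE_FILE>"; fails on NULL or an over-long path. */
static int example_path(char *out, size_t len, const char *dir_name)
{
	int n;

	if (!dir_name)
		return SLURM_ERROR;
	n = snprintf(out, len, "%s/%s", dir_name, EXAMPLE_FILE);
	return (n < 0 || (size_t)n >= len) ? SLURM_ERROR : SLURM_SUCCESS;
}

/* Write the magic number followed by the state. */
int example_state_save(const char *dir_name)
{
	char path[1024];
	uint32_t rec[2];
	FILE *fp;
	int rc;

	if (example_path(path, sizeof(path), dir_name) != SLURM_SUCCESS)
		return SLURM_ERROR;
	fp = fopen(path, "wb");
	if (!fp)
		return SLURM_ERROR;
	rec[0] = EXAMPLE_MAGIC;
	rec[1] = example_state;
	rc = (fwrite(rec, sizeof(rec[0]), 2, fp) == 2) ? SLURM_SUCCESS : SLURM_ERROR;
	if (fclose(fp) != 0)
		rc = SLURM_ERROR;
	return rc;
}

/* Read the file back and accept it only if the magic number matches. */
int example_state_restore(const char *dir_name)
{
	char path[1024];
	uint32_t rec[2];
	FILE *fp;
	size_t got;

	if (example_path(path, sizeof(path), dir_name) != SLURM_SUCCESS)
		return SLURM_ERROR;
	fp = fopen(path, "rb");
	if (!fp)
		return SLURM_ERROR;
	got = fread(rec, sizeof(rec[0]), 2, fp);
	fclose(fp);
	if (got != 2 || rec[0] != EXAMPLE_MAGIC)
		return SLURM_ERROR;  /* short, corrupt or foreign file */
	example_state = rec[1];
	return SLURM_SUCCESS;
}
```

<p>Checking the magic number before touching <i>example_state</i> means a failed restore leaves the in-memory state unchanged.</p>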
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <h3>State Initialization Functions</h3> |
| |
| <p class="commandline">int select_p_node_init (struct node_record *node_ptr, int node_cnt);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Note the initialization of the |
| node record data structure. This function is called by the slurmctld daemon |
| when the node records are initially established and again when any nodes are |
| added to or removed from the data structure. </p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> node_ptr</span> (input) pointer |
| to the node data records. Data in these records can be read. For nodes deleted after initialization, |
| the <i>name</i> field in the record may be cleared (zero length) instead of the |
| node records and bitmaps being rebuilt.<br><br> |
| <span class="commandline"> node_cnt</span> (input) number |
| of node data records.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR, causing slurmctld to exit.</p> |
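<p>The handling of deleted nodes can be sketched as follows. The record type here is a reduced, hypothetical stand-in that models only the <i>name</i> field, and the counter and function name are likewise assumptions; a real plugin receives slurmctld's own node records.</p>

```c
#include <stddef.h>

#define SLURM_SUCCESS 0      /* stand-ins for the SLURM return codes */
#define SLURM_ERROR   (-1)

/* Reduced stand-in for a node record; only the name field is modeled. */
struct example_node_record {
	const char *name;
};

int example_active_node_cnt = 0;  /* hypothetical plugin state */

/* Count usable nodes, skipping records whose name has been cleared. */
int example_node_init(struct example_node_record *node_ptr, int node_cnt)
{
	int i;

	if (!node_ptr || node_cnt < 0)
		return SLURM_ERROR;

	example_active_node_cnt = 0;
	for (i = 0; i < node_cnt; i++) {
		/* A zero-length name marks a node deleted after initialization. */
		if (!node_ptr[i].name || node_ptr[i].name[0] == '\0')
			continue;
		example_active_node_cnt++;
	}
	return SLURM_SUCCESS;
}
```

<p>Because deleted records keep their slot, the index of each remaining node, and therefore its bit position in any bitmap, stays the same.</p>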
| |
| <p class="commandline">int select_p_block_init (List part_list);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Note the initialization of the |
| partition record data structure. This function is called by the slurmctld |
| daemon when the partition records are initially established and again |
| when any partition configurations change. </p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> part_list</span> (input) list of partition |
| record entries. Note that some of these partitions may have no associated nodes. Also |
| consider that nodes can be removed from one partition and added to a different partition.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR, causing slurmctld to exit.</p> |
| |
| <p class="commandline">int select_p_job_init(List job_list);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Used at slurmctld daemon |
| startup to synchronize plugin (and node) state with that of currently active |
| jobs.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> job_list</span> (input) |
| list of slurm jobs from slurmctld job records.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_reconfigure (void);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Used to notify plugin |
| of change in partition configuration or general configuration change. |
| The plugin will test global variables for changes as appropriate.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful, otherwise SLURM_ERROR</p> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Node-Specific Functions</h3> |
| |
| <p class="commandline">select_nodeinfo_t *select_p_select_nodeinfo_alloc(void);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Allocate a buffer for select |
| plugin specific information about a node. Use select_p_select_nodeinfo_free() |
| to free the returned data structure.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: A buffer for select plugin specific |
| information about a node or NULL on failure. Use select_p_select_nodeinfo_free() |
| to free this data structure.</p> |
| |
| <p class="commandline">int select_p_select_nodeinfo_pack(select_nodeinfo_t *nodeinfo, |
| Buf buffer, uint16_t protocol_version);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Pack select plugin specific |
| information about a node into a buffer for node queries.</p> |
| <p style="margin-left:.2in"><b>Argument</b>:<br> |
| <span class="commandline"> nodeinfo</span> (input) Node information to be packed.<br> |
| <span class="commandline"> buffer</span> (input/output) pointer |
| to buffer into which the node information is packed.<br> |
| <span class="commandline"> protocol_version</span> (input) |
| Version number of the data packing mechanism (needed for backward compatibility).</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful, otherwise SLURM_ERROR</p> |
| |
| <p class="commandline">int select_p_select_nodeinfo_unpack(select_nodeinfo_t **nodeinfo, |
| Buf buffer, uint16_t protocol_version);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Unpack select plugin specific |
| information about a node from a buffer for node queries. Use |
| select_p_select_nodeinfo_free() to free the returned data structure.</p> |
| <p style="margin-left:.2in"><b>Argument</b>:<br> |
| <span class="commandline"> nodeinfo</span> (output) Node |
| information unpacked from the buffer. Use select_p_select_nodeinfo_free() |
| to free the returned data structure.<br> |
| <span class="commandline"> buffer</span> (input/output) pointer |
| to buffer from which the node information is to be unpacked.<br> |
| <span class="commandline"> protocol_version</span> (input) |
| Version number of the data packing mechanism (needed for backward compatibility).</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful, otherwise SLURM_ERROR</p> |
| |
| <p class="commandline">int select_p_select_nodeinfo_free(select_nodeinfo_t *nodeinfo);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Free a buffer which was |
| previously allocated for select plugin specific information about a node.</p> |
| <p style="margin-left:.2in"><b>Argument</b>: |
| <span class="commandline"> nodeinfo</span> (input/output) The buffer to be freed.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful, otherwise SLURM_ERROR</p> |
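<p>The allocate, pack, unpack and free cycle described above might look like the sketch below. The node information layout, the fixed-size buffer standing in for SLURM's <i>Buf</i> type, the single supported protocol version and all <i>example_</i> names are assumptions for illustration, not the actual SLURM definitions.</p>

```c
#include <stdint.h>
#include <stdlib.h>

#define SLURM_SUCCESS 0      /* stand-ins for the SLURM return codes */
#define SLURM_ERROR   (-1)
#define EXAMPLE_PROTOCOL_VERSION 1  /* hypothetical packing version */

/* Hypothetical plugin-private node data. */
typedef struct {
	uint16_t alloc_cpus;
} example_nodeinfo_t;

/* Simplified stand-in for SLURM's Buf type. */
typedef struct {
	uint8_t data[64];
	size_t  used;    /* bytes written so far */
	size_t  offset;  /* bytes read so far */
} example_buf_t;

example_nodeinfo_t *example_nodeinfo_alloc(void)
{
	return calloc(1, sizeof(example_nodeinfo_t));
}

int example_nodeinfo_free(example_nodeinfo_t *nodeinfo)
{
	free(nodeinfo);
	return SLURM_SUCCESS;
}

/* Append alloc_cpus to the buffer in network (big-endian) byte order. */
int example_nodeinfo_pack(example_nodeinfo_t *nodeinfo, example_buf_t *buffer,
			  uint16_t protocol_version)
{
	if (!nodeinfo || !buffer || protocol_version != EXAMPLE_PROTOCOL_VERSION)
		return SLURM_ERROR;
	if (buffer->used + 2 > sizeof(buffer->data))
		return SLURM_ERROR;
	buffer->data[buffer->used++] = (uint8_t)(nodeinfo->alloc_cpus >> 8);
	buffer->data[buffer->used++] = (uint8_t)(nodeinfo->alloc_cpus & 0xff);
	return SLURM_SUCCESS;
}

/* Allocate a new record and fill it from the next two bytes of the buffer. */
int example_nodeinfo_unpack(example_nodeinfo_t **nodeinfo, example_buf_t *buffer,
			    uint16_t protocol_version)
{
	example_nodeinfo_t *info;

	if (!nodeinfo || !buffer || protocol_version != EXAMPLE_PROTOCOL_VERSION)
		return SLURM_ERROR;
	if (buffer->offset + 2 > buffer->used)
		return SLURM_ERROR;
	info = example_nodeinfo_alloc();
	if (!info)
		return SLURM_ERROR;
	info->alloc_cpus = (uint16_t)((buffer->data[buffer->offset] << 8) |
				      buffer->data[buffer->offset + 1]);
	buffer->offset += 2;
	*nodeinfo = info;
	return SLURM_SUCCESS;
}
```

<p>Rejecting an unknown <i>protocol_version</i> is one conservative choice; a plugin that must talk to older peers would instead branch on the version and pack the older layout.</p>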
| |
| <p class="commandline">int select_p_select_nodeinfo_set(struct job_record *job_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Reset select plugin specific |
| information about a job. Called by slurmctld daemon after that job's state has |
| been restored (at startup) or job has been scheduled.</p> |
| <p style="margin-left:.2in"><b>Argument</b>: |
| <span class="commandline"> job_ptr</span> (input) Pointer |
| to the updated job.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_select_nodeinfo_set_all(void);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Update select plugin specific |
| information about every node as needed.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_select_nodeinfo_get(select_nodeinfo_t *nodeinfo, |
| enum select_nodedata_type dinfo, enum node_states state, void *data);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Get information from a |
| select plugin's node specific data structure.</p> |
| <p style="margin-left:.2in"><b>Argument</b>:<br> |
| <span class="commandline"> nodeinfo</span> (input) Node information |
| data structure from which information is to be retrieved.<br> |
| <span class="commandline"> dinfo</span> (input) Data type to |
| be retrieved.<br> |
| <span class="commandline"> state</span> (input) Node state filter |
| to be applied (e.g. only get information about ALLOCATED nodes).<br> |
| <span class="commandline"> data</span> (output) The retrieved data.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_update_node_config (int index);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Note that a node has |
| registered with a different configuration than previously registered. |
| For example, the node was configured with 1GB of memory in slurm.conf, |
| but actually registered with 2GB of memory.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> index</span> (input) zero origin index |
| of the node in reference to the entire system.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful, otherwise SLURM_ERROR</p> |
| |
| <p class="commandline">bool select_p_node_ranking(struct node_record *node_ptr, int node_cnt)</p> |
| <p style="margin-left:.2in"><b>Description</b>: This function is called by the slurmctld |
| daemon at start time to set node rank information for reordering the nodes to |
| optimize application performance. </p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> node_ptr</span> (input/output) pointer |
| to the node data structure. Each node's node rank field may be set.<br> |
| <span class="commandline"> node_cnt</span> (input) number |
| of nodes configured on the system.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: true if node rank information has |
| been set.</p> |
| |
| <p class="commandline">int select_p_update_node_state (struct node_record *node_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Push a node state change |
| into the plugin. The affected node is identified by <i>node_ptr</i>, its record |
| in slurmctld's node table for the entire system.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> node_ptr</span> (input/output) pointer |
| to the node data structure. Each node's node rank field may be set.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful, otherwise SLURM_ERROR</p> |
| |
| <p class="commandline">int select_p_alter_node_cnt (enum |
| select_node_cnt type, void *data);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Used for systems such as IBM |
| BlueGene, where one SLURM node is mapped to many compute nodes. On |
| BlueGene, one SLURM node (a midplane) represents 512 compute nodes, but |
| since 512 is typically the smallest allocatable block, SLURM treats |
| it as one node. This function lets the user specify a count of real |
| compute nodes and converts that count into the equivalent value in |
| SLURM terms.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> type</span> (input) enum |
| telling the plugin how to transform the data.<br> |
| <span class="commandline"> data</span> (input/output) |
| a void * whose actual data type depends upon the first argument to this |
| function (type).</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful, otherwise SLURM_ERROR</p> |
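<p>The arithmetic behind this conversion can be sketched as follows, using the 512 compute nodes per midplane mentioned above. The enum values, the <i>uint32_t</i> payload and the function name are hypothetical; the real plugin defines its own conversion types in <i>enum select_node_cnt</i>.</p>

```c
#include <stdint.h>

#define SLURM_SUCCESS 0      /* stand-ins for the SLURM return codes */
#define SLURM_ERROR   (-1)
#define EXAMPLE_CNODES_PER_NODE 512u  /* compute nodes per SLURM node */

/* Hypothetical conversion types. */
enum example_node_cnt_type {
	EXAMPLE_CNODES_TO_NODES,  /* compute-node count to SLURM node count */
	EXAMPLE_NODES_TO_CNODES   /* SLURM node count to compute-node count */
};

/* Convert the uint32_t count that data points to, in place. */
int example_alter_node_cnt(enum example_node_cnt_type type, void *data)
{
	uint32_t *cnt = data;

	if (!cnt)
		return SLURM_ERROR;

	switch (type) {
	case EXAMPLE_CNODES_TO_NODES:
		/* Round up: a partly used midplane still occupies one node. */
		*cnt = *cnt / EXAMPLE_CNODES_PER_NODE +
		       ((*cnt % EXAMPLE_CNODES_PER_NODE) ? 1 : 0);
		return SLURM_SUCCESS;
	case EXAMPLE_NODES_TO_CNODES:
		*cnt *= EXAMPLE_CNODES_PER_NODE;
		return SLURM_SUCCESS;
	}
	return SLURM_ERROR;  /* unknown conversion type */
}
```

<p>With these assumptions, a request for 1024 compute nodes becomes 2 SLURM nodes, and a request for 513 also becomes 2 because of the rounding.</p>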
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Block-Specific Functions</h3> |
| |
| <p class="commandline">int select_p_update_sub_node (update_block_msg_t *block_desc_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Update the state of a portion of |
| a SLURM node. Currently used on BlueGene systems to place node cards within a |
| midplane into or out of an error state.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> block_desc_ptr</span> (input) pointer |
| to the modified block containing the sub-block name and its new state.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful, otherwise SLURM_ERROR</p> |
| |
| <p class="commandline">int select_p_update_block (update_block_msg_t *block_desc_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: This function is called when the admin needs |
| to manually update the state of a block. </p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> block_desc_ptr</span> (input) block |
| description variable containing the block name and the state to which the block should be set.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Job-Specific Functions</h3> |
| |
| <p class="commandline">select_jobinfo_t *select_p_select_jobinfo_alloc(void);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Allocate a buffer for select |
| plugin specific information about a job. Use select_p_select_jobinfo_free() |
| to free the allocated memory.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: Pointer to a select plugin buffer |
| for a job or NULL on failure. Use select_p_select_jobinfo_free() to free the |
| allocated memory.</p> |
| |
| <p class="commandline">select_jobinfo_t *select_p_select_jobinfo_copy(select_jobinfo_t *jobinfo);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Copy the buffer containing select |
| plugin specific information about a job. Use select_p_select_jobinfo_free() |
| to free the allocated memory.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> jobinfo</span> (input) pointer |
| to the select plugin specific information about a job.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: A copy of jobinfo or NULL on |
| failure. Use select_p_select_jobinfo_free() to free the allocated memory.</p> |
| |
| <p class="commandline">int select_p_select_jobinfo_free(select_jobinfo_t *jobinfo);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Free the buffer containing select |
| plugin specific information about a job.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> jobinfo</span> (input) pointer |
| to the select plugin specific information about a job.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_select_jobinfo_pack(select_jobinfo_t *jobinfo, |
| Buf buffer, uint16_t protocol_version);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Pack into a buffer the contents |
| of the select plugin specific information about a job.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> jobinfo</span> (input) pointer |
| to the select plugin specific information about a job.<br> |
| <span class="commandline"> buffer</span> (input/output) pointer |
| to buffer into which the job information is packed.<br> |
| <span class="commandline"> protocol_version</span> (input) |
| Version number of the data packing mechanism (needed for backward compatibility).</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_select_jobinfo_unpack(select_jobinfo_t **jobinfo_pptr, |
| Buf buffer, uint16_t protocol_version);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Unpack from a buffer the contents |
| of the select plugin specific information about a job. |
| The returned value must be freed using select_p_select_jobinfo_free().</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> jobinfo_pptr</span> (output) pointer |
| to the select plugin specific information about a job. The returned value must |
| be freed using select_p_select_jobinfo_free().<br> |
| <span class="commandline"> buffer</span> (input/output) pointer |
| to buffer from which the job information is unpacked.<br> |
| <span class="commandline"> protocol_version</span> (input) |
| Version number of the data packing mechanism (needed for backward compatibility).</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_select_jobinfo_get(select_jobinfo_t *jobinfo, |
| enum select_jobdata_type data_type, void *data);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Get the contents of a field |
| from the select plugin specific information about a job.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> jobinfo</span> (input) pointer |
| to the select plugin specific information about a job to be read.<br> |
| <span class="commandline"> data_type</span> (input) identification |
| of the field to be retrieved.<br> |
| <span class="commandline"> data</span> (output) data read |
| from the job record.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_select_jobinfo_set(select_jobinfo_t *jobinfo, |
| enum select_jobdata_type data_type, void *data);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Set a field in the select |
| plugin specific information about a job.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> jobinfo</span> (input/output) pointer |
| to the select plugin specific information about a job to be modified.<br> |
| <span class="commandline"> data_type</span> (input) identification |
| of the field to be set.<br> |
| <span class="commandline"> data</span> (input) data to be written |
| into the job record.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">char *select_p_select_jobinfo_sprint(select_jobinfo_t *jobinfo, |
| char *buf, size_t size, int mode);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Print the contents of the select |
| plugin specific information about a job.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> jobinfo</span> (input) pointer |
| to the select plugin specific information about a job.<br> |
| <span class="commandline"> buf</span> (input/output) buffer |
| into which the contents are written.<br> |
| <span class="commandline"> size</span> (input) size of buf in bytes.<br> |
| <span class="commandline"> mode</span> (input) print mode, see enum select_print_mode.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: Pointer to the buf on success or NULL on failure.</p> |
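<p>A print function with this calling convention could be sketched as below. The job information layout, the two print modes and the header text are hypothetical stand-ins for the plugin's own data and for <i>enum select_print_mode</i>. Treating truncated output as a failure is a design choice of this sketch rather than a documented requirement.</p>

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical plugin-private job data. */
typedef struct {
	uint32_t alloc_cpus;
} example_jobinfo_t;

/* Hypothetical print modes. */
enum example_print_mode {
	EXAMPLE_PRINT_HEAD,  /* column heading only */
	EXAMPLE_PRINT_DATA   /* the job's value */
};

/* Format into buf; returns buf on success or NULL on failure. */
char *example_jobinfo_sprint(example_jobinfo_t *jobinfo, char *buf,
			     size_t size, int mode)
{
	int n;

	if (!buf || size == 0)
		return NULL;

	switch (mode) {
	case EXAMPLE_PRINT_HEAD:
		n = snprintf(buf, size, "ALLOC_CPUS");
		break;
	case EXAMPLE_PRINT_DATA:
		if (!jobinfo)
			return NULL;
		n = snprintf(buf, size, "%u", (unsigned int)jobinfo->alloc_cpus);
		break;
	default:
		return NULL;
	}
	/* snprintf reports the length it wanted; reject truncated output. */
	if (n < 0 || (size_t)n >= size)
		return NULL;
	return buf;
}
```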
| |
| <p class="commandline">char *select_p_select_jobinfo_xstrdup(select_jobinfo_t *jobinfo, int mode);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Print the contents of the select |
| plugin specific information about a job. The return value must be released using the xfree() function.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> jobinfo</span> (input) pointer |
| to the select plugin specific information about a job.<br> |
| <span class="commandline"> mode</span> (input) print mode, see enum select_print_mode.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: Pointer to a string on success or NULL on failure. |
| Call xfree() to release the memory allocated for the return value.</p> |
| |
| <p class="commandline">int select_p_job_test (struct job_record *job_ptr, |
| bitstr_t *bitmap, uint32_t min_nodes, uint32_t max_nodes, uint32_t req_nodes, uint32_t mode, |
| List preemption_candidates, List *preempted_jobs, bitstr_t *exc_core_bitmap);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Given a job's scheduling requirement |
| specification and a set of nodes which might be used to satisfy the request, identify |
| the nodes which "best" satisfy the request. Note that nodes being considered for allocation |
| to the job may include nodes already allocated to other jobs, even if node sharing is |
| not permitted. This is done to ascertain whether or not the job may be allocated resources |
| at some later time (when the other jobs complete). This permits SLURM to reject |
| non-runnable jobs at submit time rather than after they have spent hours queued. |
| Informing users of problems at job submission time permits them to quickly resubmit |
| the job with appropriate constraints.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> job_ptr</span> (input) pointer |
| to the job being considered for scheduling. Data in this job record may safely be read. |
| Data of particular interest include <i>details->contiguous</i> (set if allocated nodes |
| should be contiguous), <i>num_procs</i> (minimum processors in allocation) and |
| <i>details->req_node_bitmap</i> (specific required nodes).<br> |
| <span class="commandline"> bitmap</span> (input/output) |
| bits representing nodes which might be allocated to the job are set on input. |
| This function should clear the bits representing nodes not required to satisfy |
| the job's scheduling request. |
| Bits left set will represent nodes to be used for this job. Note that |
| <i>bitmap</i> will be a superset of the job's required nodes |
| (<i>details->req_node_bitmap</i>) when the function is called.<br> |
| <span class="commandline"> min_nodes</span> (input) |
| minimum number of nodes to allocate to this job. Note this reflects both job |
| and partition specifications.<br> |
| <span class="commandline"> max_nodes</span> (input) |
| maximum number of nodes to allocate to this job. Note this reflects both job |
| and partition specifications.<br> |
| <span class="commandline"> req_nodes</span> (input) |
| the requested (desired) number of nodes to allocate to this job. This reflects the job's |
| maximum node specification (if supplied).<br> |
| <span class="commandline"> mode</span> (input) |
| controls the mode of operation. Valid options are:<br> |
| * SELECT_MODE_RUN_NOW: try to schedule job now<br> |
| * SELECT_MODE_TEST_ONLY: test if job can ever run<br> |
| * SELECT_MODE_WILL_RUN: determine when and where job can run<br> |
| <span class="commandline"> preemption_candidates</span> (input) |
| list of pointers to jobs which may be preempted in order to initiate this |
| pending job. May be NULL if there are no preemption candidates.<br> |
| <span class="commandline"> preempted_jobs</span> (input/output) |
| list of jobs which must be preempted in order to initiate the pending job. |
| If the value is NULL, no job list is returned. |
| If the list pointed to has a value of NULL, a new list will be created; |
| otherwise the existing list will be overwritten. |
| Use the <i>list_destroy</i> function to destroy the list when no longer |
| needed.<br> |
| <span class="commandline"> exc_core_bitmap</span> (input) |
| bitmap of cores held for advanced reservations.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
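<p>The bitmap contract of the job test function (candidate bits set on input, only the selected bits left set on output) can be illustrated with a deliberately simple first-fit sketch. It is not the best-fit algorithm of any shipped plugin. The 64-bit integer standing in for a node bitmap, the reduced argument list and the function name are all assumptions.</p>

```c
#include <stdint.h>

#define SLURM_SUCCESS 0      /* stand-ins for the SLURM return codes */
#define SLURM_ERROR   (-1)

/*
 * bitmap:   candidate nodes on input, selected nodes on success
 *           (bit i represents node i; at most 64 nodes in this sketch)
 * required: nodes the job insists on, which must all be candidates
 */
int example_job_test(uint64_t *bitmap, uint64_t required,
		     uint32_t min_nodes, uint32_t max_nodes)
{
	uint64_t picked;
	uint32_t cnt = 0;
	int i;

	if (!bitmap || min_nodes > max_nodes)
		return SLURM_ERROR;
	if ((required & *bitmap) != required)
		return SLURM_ERROR;  /* a required node is not available */

	/* Start from the required nodes and count them. */
	picked = required;
	for (i = 0; i < 64; i++) {
		if ((required >> i) & 1)
			cnt++;
	}
	if (cnt > max_nodes)
		return SLURM_ERROR;

	/* First fit: add the lowest-numbered candidates until min_nodes. */
	for (i = 0; i < 64 && cnt < min_nodes; i++) {
		uint64_t bit = (uint64_t)1 << i;

		if ((*bitmap & bit) && !(picked & bit)) {
			picked |= bit;
			cnt++;
		}
	}
	if (cnt < min_nodes)
		return SLURM_ERROR;  /* not enough candidates; bitmap untouched */

	*bitmap = picked;  /* clears every bit that was not selected */
	return SLURM_SUCCESS;
}
```

<p>The sketch leaves the caller's bitmap unmodified on failure, which keeps a failed test against the available nodes from disturbing a later test against all configured nodes.</p>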
| |
| <p class="commandline">int select_p_job_begin (struct job_record *job_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Note that the specified job |
| is about to begin. This function is called immediately after |
| <span class="commandline">select_p_job_test()</span> successfully completes for this job.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> job_ptr</span> (input) pointer |
| to the job being initialized. Data in this job record may safely be read or written. |
| The <i>nodes</i> and <i>node_bitmap</i> fields of this job record identify the |
| nodes which have already been selected for this job to use.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR, which causes the job to be requeued for |
| later execution.</p> |
| |
| <p class="commandline">int select_p_job_ready (struct job_record *job_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Test if resources are configured |
| and ready for job execution. This function is only used in the job prolog for |
| BlueGene systems to determine if the bgblock has been booted and is ready for use.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> job_ptr</span> (input) pointer |
| to the job being initialized. Data in this job record may safely be read. |
| The <i>nodes</i> and <i>node_bitmap</i> fields of this job record identify the |
| nodes which have already been selected for this job to use. </p> |
| <p style="margin-left:.2in"><b>Returns</b>: 1 if the job may begin execution, |
| 0 otherwise.</p> |
| |
| <p class="commandline">int select_p_job_fini (struct job_record *job_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Note the termination of the |
| specified job. This function is called as the termination process for the |
| job begins (prior to killing the tasks).</p> |
| <p style="margin-left:.2in"><b>Arguments</b>: |
| <span class="commandline"> job_ptr</span> (input) pointer |
| to the job being terminated. Data in this job record may safely be read or written. |
| The <i>nodes</i> and/or <i>node_bitmap</i> fields of this job record identify the |
| nodes which were selected for this job to use.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_job_signal (struct job_record *job_ptr, |
| int signal);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Signal the specified job. |
| This is needed for architectures where the job steps are launched by a |
| mechanism outside of SLURM, for example when ALPS is used on Cray systems.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> job_ptr</span> (input) pointer |
| to the job to be signaled.<br> |
| <span class="commandline"> signal</span> (input) signal to |
| be sent to the job.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On |
| failure, the plugin should return a SLURM error code.</p> |
| |
| <p class="commandline">int select_p_job_suspend (struct job_record *job_ptr, |
| bool indf_susp);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Suspend the specified job. |
| Release resources for use by other jobs.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> job_ptr</span> (input) pointer |
| to the job being suspended. Data in this job record may safely be read or |
| written. The <i>nodes</i> and/or <i>node_bitmap</i> fields of this job record |
| identify the nodes which were selected for this job to use.<br> |
| <span class="commandline"> indf_susp</span> (input) flag |
| which if set indicates the job is being suspended indefinitely by the user or |
| administrator. If not set, the job is being suspended temporarily for gang |
| scheduling.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On |
| failure, the plugin should return a SLURM error code.</p> |
| |
| <p class="commandline">int select_p_job_resume (struct job_record *job_ptr, |
| bool indf_susp);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Resume the specified job |
| which was previously suspended.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> job_ptr</span> (input) pointer |
| to the job being resumed. Data in this job record may safely be read or |
| written. The <i>nodes</i> and/or <i>node_bitmap</i> fields of this job record |
| identify the nodes which were selected for this job to use.<br> |
| <span class="commandline"> indf_susp</span> (input) flag |
| which if set indicates the job is being resumed after being suspended |
| indefinitely by the user or administrator. If not set, the job is being |
| resumed after being temporarily suspended for gang scheduling.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On |
| failure, the plugin should return a SLURM error code.</p> |
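| <p style="margin-left:.2in">The sketch below illustrates one possible way a plugin |
| might act on the <i>indf_susp</i> flag in a suspend/resume pair. The policy shown |
| (returning CPUs to a free pool only for an indefinite suspend) is an assumption made |
| for illustration, and <i>demo_job</i>, <i>demo_free_cpus</i>, DEMO_SUCCESS and |
| DEMO_ERROR are hypothetical stand-ins for the real SLURM types and codes.</p> |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SUCCESS 0    /* stand-in for SLURM_SUCCESS */
#define DEMO_ERROR   (-1) /* stand-in for a SLURM error code */

/* Hypothetical stand-in for struct job_record. */
struct demo_job {
    unsigned int cpus_alloc;  /* CPUs in the job's allocation */
    unsigned int cpus_held;   /* CPUs currently charged to the job */
    bool suspended;
};

static unsigned int demo_free_cpus = 0;  /* hypothetical plugin-wide free pool */

static int demo_job_suspend(struct demo_job *job_ptr, bool indf_susp)
{
    assert(job_ptr != NULL);
    if (job_ptr->suspended)
        return DEMO_ERROR;           /* already suspended */
    job_ptr->suspended = true;
    if (indf_susp) {                 /* assumed policy: release only when indefinite */
        demo_free_cpus += job_ptr->cpus_held;
        job_ptr->cpus_held = 0;
    }
    return DEMO_SUCCESS;
}

static int demo_job_resume(struct demo_job *job_ptr, bool indf_susp)
{
    assert(job_ptr != NULL);
    if (!job_ptr->suspended)
        return DEMO_ERROR;           /* nothing to resume */
    if (indf_susp) {
        if (demo_free_cpus < job_ptr->cpus_alloc)
            return DEMO_ERROR;       /* resources no longer available */
        demo_free_cpus -= job_ptr->cpus_alloc;
        job_ptr->cpus_held = job_ptr->cpus_alloc;
    }
    job_ptr->suspended = false;
    return DEMO_SUCCESS;
}
```

| <p style="margin-left:.2in">Passing the same <i>indf_susp</i> value to both calls |
| keeps the accounting symmetric in this sketch.</p> |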
| |
| <p class="commandline">int select_p_job_expand_allow (void);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Report the ability of this |
| select plugin to expand jobs.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: True if job expansion is |
| supported, otherwise false.</p> |
| |
| <p class="commandline">int select_p_job_expand (struct job_record *from_job_ptr, |
| struct job_record *to_job_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Transfer all resources |
| currently allocated to one job to another job. One job is left with no |
| allocated resources and the other job is left with the resources previously |
| allocated to both jobs.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> from_job_ptr</span> (input) pointer |
| to the job which is to have all of its resources removed.<br> |
| <span class="commandline"> to_job_ptr</span> (input) pointer |
| to the job getting all of the resources previously allocated to either job.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On |
| failure, the plugin should return a SLURM error code.</p> |
| |
| <p class="commandline">int select_p_job_resized (struct job_record *job_ptr, |
| struct node_record *node_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Remove the specified node |
| from the job's allocation.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> job_ptr</span> (input) pointer |
| to the job being decreased in size.<br> |
| <span class="commandline"> node_ptr</span> (input) pointer |
| to the node being removed from a job's allocation.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On |
| failure, the plugin should return a SLURM error code.</p> |
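| <p style="margin-left:.2in">The effect of the expand and resize calls on a job's node |
| set can be sketched with a 64-bit mask standing in for <i>node_bitmap</i>. This is an |
| illustration under stated assumptions, not SLURM code: <i>demo_job</i>, the mask |
| representation, the node index argument and the DEMO_ codes are all hypothetical.</p> |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SUCCESS 0    /* stand-in for SLURM_SUCCESS */
#define DEMO_ERROR   (-1) /* stand-in for a SLURM error code */

/* Hypothetical stand-in for struct job_record. */
struct demo_job {
    uint64_t node_mask;  /* bit i set: node i is allocated to the job */
};

/* Move every node of from_job_ptr into to_job_ptr, leaving the source empty. */
static int demo_job_expand(struct demo_job *from_job_ptr,
                           struct demo_job *to_job_ptr)
{
    assert(from_job_ptr != NULL && to_job_ptr != NULL);
    if (from_job_ptr == to_job_ptr)
        return DEMO_ERROR;           /* a job cannot absorb itself */
    to_job_ptr->node_mask |= from_job_ptr->node_mask;
    from_job_ptr->node_mask = 0;
    return DEMO_SUCCESS;
}

/* Remove one node (identified here by index, an assumption) from the job. */
static int demo_job_resized(struct demo_job *job_ptr, unsigned int node_index)
{
    assert(job_ptr != NULL && node_index < 64);
    uint64_t bit = UINT64_C(1) << node_index;
    if ((job_ptr->node_mask & bit) == 0)
        return DEMO_ERROR;           /* node is not part of the allocation */
    job_ptr->node_mask &= ~bit;
    return DEMO_SUCCESS;
}
```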
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Step-Specific Functions</h3> |
| |
| <p class="commandline">bitstr_t *select_p_step_pick_nodes(struct job_record *job_ptr, |
| select_jobinfo_t *step_jobinfo, uint32_t node_count);</p> |
| <p style="margin-left:.2in"><b>Description</b>: If the select plugin needs to |
| select nodes for a job step, then do so here.<br> |
| <b>NOTE:</b> Only select/bluegene selects the job step resources. The logic |
| within the slurmctld daemon directly selects resources for a job step for all |
| other select plugins.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> job_ptr</span> (input) |
| Pointer to the job which is attempting to allocate a job step.<br> |
| <span class="commandline"> step_jobinfo</span> (input/output) |
| On input, this is a pointer to an empty buffer. On output for a successful |
| job step allocation, this structure is filled in with detailed information |
| about the job step allocation.<br> |
| <span class="commandline"> node_count</span> (input) |
| Number of nodes required by the new job step.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: If successful, then return a |
| bitmap of the nodes allocated to the job step, otherwise return NULL and the |
| logic within the slurmctld daemon will select the nodes to be allocated to |
| the job step.</p> |
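| <p style="margin-left:.2in">The return convention (a bitmap on success, NULL to let |
| slurmctld choose) can be sketched as follows. This is not the select/bluegene |
| logic. A heap-allocated 64-bit mask stands in for <i>bitstr_t</i>, the |
| <i>step_jobinfo</i> argument is omitted, and <i>demo_job</i> with its |
| <i>step_mask</i> field is a hypothetical structure.</p> |

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct job_record. */
struct demo_job {
    uint64_t node_mask;  /* nodes allocated to the job */
    uint64_t step_mask;  /* nodes already in use by the job's steps */
};

/* Return a malloc'd mask of node_count idle nodes from the job's allocation,
 * or NULL if the request cannot be met here (the caller then falls back to
 * its own selection logic). The caller frees the result. */
static uint64_t *demo_step_pick_nodes(struct demo_job *job_ptr,
                                      uint32_t node_count)
{
    assert(job_ptr != NULL);
    uint64_t idle = job_ptr->node_mask & ~job_ptr->step_mask;
    uint64_t picked = 0;
    uint32_t found = 0;

    for (int i = 0; i < 64 && found < node_count; i++) {
        uint64_t bit = UINT64_C(1) << i;
        if (idle & bit) {
            picked |= bit;
            found++;
        }
    }
    if (node_count == 0 || found < node_count)
        return NULL;

    uint64_t *result = malloc(sizeof(*result));
    if (result == NULL)
        return NULL;
    *result = picked;
    job_ptr->step_mask |= picked;    /* record the nodes as in use */
    return result;
}
```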
| |
| <p class="commandline">int select_p_step_finish(struct step_record *step_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Note that a job step has completed execution.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> step_ptr</span> (input) |
| Pointer to the step which has completed execution.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="commandline">int select_p_fail_cnode(struct step_record *step_ptr);</p> |
| <p style="margin-left:.2in"><b>Description</b>: This function fails |
| certain cnodes in a block's midplane.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> step_ptr</span> (input) |
| information on the step that has failed cnodes.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Advanced Reservation Functions</h3> |
| |
| <p class="commandline">bitstr_t * select_p_resv_test(bitstr_t *avail_bitmap, |
| uint32_t node_cnt);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Identify the nodes which best |
| satisfy a reservation request taking system topology into consideration if |
| applicable.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> avail_bitmap</span> (input) |
| a bitmap of the nodes which are available for use in creating the reservation.<br> |
| <span class="commandline"> node_cnt</span> (input) |
| number of nodes required to satisfy the reservation request.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: A bitmap of the nodes which should |
| be used for the advanced reservation or NULL if the selected nodes cannot |
| be used for an advanced reservation.</p> |
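| <p style="margin-left:.2in">As an illustration of topology-aware selection, the sketch |
| below assumes a simple linear topology in which a contiguous run of nodes is preferred. |
| That preference is an assumption made for this example, not a description of any |
| shipped plugin. A 64-bit mask stands in for <i>bitstr_t</i>, and a zero mask stands in |
| for a NULL return.</p> |

```c
#include <assert.h>
#include <stdint.h>

/* Return a mask of node_cnt contiguous available nodes (lowest-numbered run
 * first), or 0 if no such run exists. Zero plays the role of NULL here. */
static uint64_t demo_resv_test(uint64_t avail_mask, uint32_t node_cnt)
{
    if (node_cnt == 0 || node_cnt > 64)
        return 0;

    /* Mask with the low node_cnt bits set; a shift by 64 would be undefined. */
    uint64_t want = (node_cnt == 64) ? UINT64_MAX
                                     : ((UINT64_C(1) << node_cnt) - 1);

    for (uint32_t start = 0; start + node_cnt <= 64; start++) {
        uint64_t window = want << start;
        if ((avail_mask & window) == window) {
            /* Every selected node must come from the available set. */
            assert((window & ~avail_mask) == 0);
            return window;
        }
    }
    return 0;
}
```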
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h3>Get Information Functions</h3> |
| |
| <p class="commandline">int select_p_get_info_from_plugin(enum select_plugindata_info dinfo, |
| struct job_record *job_ptr, void *data);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Get plugin-specific information |
| about a job.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> dinfo</span> (input) identifies |
| the type of data to be retrieved.<br> |
| <span class="commandline"> job_ptr</span> (input) pointer to |
| the job related to the query (if applicable; may be NULL).<br> |
| <span class="commandline"> data</span> (output) the requested data.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
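| <p style="margin-left:.2in">A function of this shape typically dispatches on the |
| request type and writes the answer through the untyped <i>data</i> pointer. The sketch |
| below shows that pattern only. The enum values, <i>demo_job</i>, |
| <i>demo_node_total</i> and the DEMO_ codes are hypothetical and do not correspond to |
| real <i>select_plugindata_info</i> values.</p> |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SUCCESS 0    /* stand-in for SLURM_SUCCESS */
#define DEMO_ERROR   (-1) /* stand-in for SLURM_ERROR */

/* Hypothetical request types; not the real enum select_plugindata_info. */
enum demo_plugindata_info {
    DEMO_INFO_NODE_TOTAL,  /* system-wide value, no job needed */
    DEMO_INFO_JOB_CPUS     /* per-job value, job_ptr required */
};

/* Hypothetical stand-in for struct job_record. */
struct demo_job {
    uint32_t cpu_cnt;
};

static const uint32_t demo_node_total = 128;  /* assumed system size */

static int demo_get_info_from_plugin(enum demo_plugindata_info dinfo,
                                     const struct demo_job *job_ptr,
                                     void *data)
{
    assert(data != NULL);            /* caller must supply an output buffer */
    uint32_t *out = data;            /* both demo requests yield a uint32_t */

    switch (dinfo) {
    case DEMO_INFO_NODE_TOTAL:
        *out = demo_node_total;      /* job_ptr may be NULL for this request */
        return DEMO_SUCCESS;
    case DEMO_INFO_JOB_CPUS:
        if (job_ptr == NULL)
            return DEMO_ERROR;
        *out = job_ptr->cpu_cnt;
        return DEMO_SUCCESS;
    default:
        return DEMO_ERROR;           /* unknown request type */
    }
}
```

| <p style="margin-left:.2in">Because <i>data</i> is untyped, caller and plugin must |
| agree on the type written for each request value.</p> |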
| |
| <p class="commandline">int select_p_pack_select_info(time_t last_query_time, |
| uint16_t show_flags, Buf *buffer_ptr, uint16_t protocol_version);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Pack plugin-specific information |
| about its general state into a buffer. Currently only used by select/bluegene |
| to pack block state information.<br> |
| <b>NOTE:</b> Functions to work with this data may be needed on computers |
| without the plugin which generated the data, so those functions are in |
| src/common modules. The unpack function is performed by |
| slurm_unpack_block_info_members() in src/common/slurm_protocol_pack.c |
| using BlueGene specific data structures. Use destroy_select_ba_request() |
| in src/common/node_select.c to free the data structure's memory.</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> last_query_time</span> (input) |
| Time when the data was previously requested (used so only updated information |
| needs to be sent).<br> |
| <span class="commandline"> show_flags</span> (input) identifies |
| the type of data requested.<br> |
| <span class="commandline"> buffer_ptr</span> (input/output) |
| Pointer to buffer filled in with select plugin state information.<br> |
| <span class="commandline"> protocol_version</span> (input) |
| Version number of the data packing mechanism (needed for backward compatibility).</p> |
| <p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure, |
| the plugin should return SLURM_ERROR.</p> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <h3>Block Allocator interface</h3> |
| |
| <p class="commandline">void select_p_ba_init(node_info_msg_t *node_info_ptr, bool sanity_check);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Construct an internal block allocation |
| table containing information about the nodes on a computer. This allocated memory |
| should be released by calling select_p_ba_fini().</p> |
| <p style="margin-left:.2in"><b>Arguments</b>:<br> |
| <span class="commandline"> node_info_ptr</span> (input) |
| Information about the nodes on a system.<br> |
| <span class="commandline"> sanity_check</span> (input) if set |
| then validate that the node name suffix values represent coordinates which are |
| within the system's dimension size (see function select_p_ba_get_dims).</p> |
| |
| <p class="commandline">void select_p_ba_fini(void);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Free storage allocated by |
| select_p_ba_init().</p> |
| |
| <p class="commandline">int *select_p_ba_get_dims(void);</p> |
| <p style="margin-left:.2in"><b>Description</b>: Return an array containing |
| the number of elements in each dimension of the system size. For example, an IBM |
| BlueGene/P system has a three-dimensional torus topology. If it has eight elements |
| in the X dimension, and four in the Y and Z dimensions, the returned array will |
| contain the values 8, 4, 4.</p> |
| <p style="margin-left:.2in"><b>Returns</b>: An array containing the number of |
| elements in each dimension of the system size.</p> |
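| <p style="margin-left:.2in">Using the 8, 4, 4 example above, the sketch below shows how |
| a caller might use the returned array to count elements and to range-check a |
| coordinate, similar in spirit to the <i>sanity_check</i> option of select_p_ba_init(). |
| The fixed dimension count and all <i>demo_</i> names are assumptions for |
| illustration.</p> |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_DIM_CNT 3  /* assumption: a three-dimensional system */

static int demo_dims[DEMO_DIM_CNT] = { 8, 4, 4 };  /* X, Y, Z sizes */

/* Stand-in for select_p_ba_get_dims(): one entry per dimension. */
static int *demo_ba_get_dims(void)
{
    return demo_dims;
}

/* Total number of elements is the product of the dimension sizes. */
static int demo_element_count(void)
{
    const int *dims = demo_ba_get_dims();
    int total = 1;
    for (int i = 0; i < DEMO_DIM_CNT; i++)
        total *= dims[i];
    return total;
}

/* True if every coordinate lies in [0, dims[i]). */
static bool demo_coord_in_range(const int coord[DEMO_DIM_CNT])
{
    assert(coord != NULL);
    const int *dims = demo_ba_get_dims();
    for (int i = 0; i < DEMO_DIM_CNT; i++) {
        if (coord[i] < 0 || coord[i] >= dims[i])
            return false;
    }
    return true;
}
```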
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| |
| <h2>Versioning</h2> |
| <p> This document describes version 100 of the SLURM node selection API. Future |
| releases of SLURM may revise this API. A node selection plugin conveys its ability |
| to implement a particular API version using the mechanism outlined for SLURM plugins. |
| In addition, the credential is transmitted along with the version number of the |
| plugin that transmitted it. It is at the discretion of the plugin author whether |
| to maintain data format compatibility across different versions of the plugin.</p> |
| |
| <p class="footer"><a href="#top">top</a></p> |
| |
| <p style="text-align:center;">Last modified 18 July 2012</p> |
| |
| <!--#include virtual="footer.txt"--> |