| RELEASE NOTES FOR SLURM VERSION 14.03 |
| 26 February 2014 |
| |
| |
| IMPORTANT NOTE: |
| If using the slurmdbd (Slurm DataBase Daemon) you must update this first. |
| The 14.03 slurmdbd will work with Slurm daemons of version 2.5 and above. |
| You will not need to update all clusters at the same time, but it is very |
| important to update slurmdbd first and having it running before updating |
| any other clusters making use of it. No real harm will come from updating |
| your systems before the slurmdbd, but they will not talk to each other |
| until you do. Also at least the first time running the slurmdbd you need to |
| make sure your my.cnf file has innodb_buffer_pool_size equal to at least 64M. |
| You can accomplish this by adding the line |
| |
| innodb_buffer_pool_size=64M |
| |
| under the [mysqld] reference in the my.cnf file and restarting the mysqld. |
| This is needed when converting large tables over to the new database schema. |
| |
| Slurm can be upgraded from version 2.5 or 2.6 to version 14.03 without loss of |
| jobs or other state information. Upgrading directly from an earlier version of |
| Slurm will result in loss of state information. |
| |
| |
| HIGHLIGHTS |
| ========== |
| -- Added support for native Slurm operation on Cray systems (without ALPS). |
| -- Added partition configuration parameters AllowAccounts, AllowQOS, |
| DenyAccounts and DenyQOS to provide greater control over use. |
| -- Added the ability to perform load based scheduling. Allocating resources to |
| jobs on the nodes with the largest number if idle CPUs. |
| -- Added support for reserving cores on a compute node for system services |
| (core specialization) |
| -- Add mechanism for job_submit plugin to generate error message for srun, |
| salloc or sbatch to stderr. |
| -- Support for Postgres database has long since been out of date and |
| problematic, so it has been removed entirely. If you would like to |
| use it the code still exists in <= 2.6, but will not be included in |
| this and future versions of the code. |
| -- Added new structures and support for both server and cluster resources. |
| -- Significant performance improvements, especially with respect to job |
| array support. |
| |
| RPMBUILD CHANGES |
| ================ |
| -- The rpmbuild option for a Cray system with ALPS has changed from |
| %_with_cray to %_with_cray_alps. |
| -- CRAY - Do not package Slurm's libpmi or libpmi2 libraries. The Cray version |
| of those libraries must be used. |
| |
| |
| CONFIGURATION FILE CHANGES (see man appropriate man page for details) |
| ===================================================================== |
| -- Added AuthInfo configuration parameter to specify port used by |
| authentication plugin. |
| -- Added CoreSpecPlugin configuration parameter and core specialization plugin |
| infrastructure. |
| -- Added JobAcctGatherParams configuration parameter. |
| -- Added JobContainerType configuration parameter and plugin infrastructure. |
| -- Added FairShareDampeningFactor configuration parameter to offer a greater |
| priority range based upon utilization. |
| -- Added LogTimeFormat configuration parameter to control format of the |
| timestamp in log files. |
| -- Added PrologFlags configuration parameter to control when the Prolog is run. |
| -- Added SlurmdPlugstack configuration parameter for generic slurmd daemon |
| plugin mechanism. |
| -- Added partition configuration parameters AllowAccounts, AllowQOS, |
| DenyAccounts and DenyQOS to provide greater control over use. |
| -- SelectType: select/cray has been renamed select/alps. |
| -- Added SchedulingParameters paramter of "CR_LLN" and partition parameter of |
| "LLN=yes|no" to enable loadbased scheduling. |
| -- In SchedulerParameters, change default max_job_bf value from 50 to 100. |
| -- Added SchedulerParameters option of "partition_job_depth" to limit |
| scheduling logic depth by partition. |
| -- Added SchedulerParameters option of "bf_max_job_start". |
| -- In SchedulerParameters, replace "max_job_bf" with "bf_max_job_test" |
| (both will work for now). |
| -- Add SchedulerParameters options of "preempt_reorder_count" and |
| "preempt_strict_order". |
| -- Change MaxArraySize from 16-bit to 32-bit field. |
| -- Added fields to "scontrol show job" output: boards_per_node, |
| sockets_per_board, ntasks_per_node, ntasks_per_board, ntasks_per_socket, |
| ntasks_per_core, and nice. |
| -- gres.conf - Add "NodeName" specification so that a single gres.conf file |
| can be used for a heterogeneous cluster. |
| -- Added DebugFlags value of "License". |
| |
| DBD CONFIGURATION FILE CHANGES (see "man slurmdbd.conf" for details) |
| ==================================================================== |
| |
| |
| COMMAND CHANGES (see man pages for details) |
| =========================================== |
| -- Added sbatch --signal option of "B:" to signal the batch shell rather than |
| only the spawned job steps. |
| -- Added sinfo and squeue format option of "%all" to print all fields available |
| for the data type with a vertical bar separating each field. |
| -- Add StdIn, StdOut, and StdErr paths to job information dumped with |
| "scontrol show job". |
| -- Permit Slurm administrator to submit a batch job as any user. |
| -- Added -I|--item-extract option to sh5util to extract data item from series. |
| -- Add squeue output format options for job command and working directory |
| (%o and %Z respectively). |
| -- Add stdin/out/err to sview job output. |
| -- Added a new option to the scontrol command to view licenses that are configured |
| in use and available. 'scontrol show licenses'. |
| -- Permit jobs in finished state to be requeued. |
| -- Added a new option to scontrol to put a requeued job on hold. A requeued |
| job can be put in a new special state called SPECIAL_EXIT indicating |
| the job has exited with a special value. |
| "scontrol requeuehold state=SpecialExit 123". |
| -- Pending job steps will have step_id of INFINITE rather than NO_VAL and |
| will be reported as "TBD" by scontrol and squeue commands. |
| -- Added sgather tool to gather files from a job's compute nodes into a |
| central location. |
| -- Added -S/--core-spec option to salloc, sbatch and srun commands to reserve |
| specialized cores for system use. |
| -- Added sacctmgr options to create, modify, list & delete Resources. |
| |
| OTHER CHANGES |
| ============= |
| -- Added job_info() and step_info() functions to the gres plugins to extract |
| plugin specific fields from the job's or step's GRES data structure. |
| |
| API CHANGES |
| =========== |
| -- Added "state" argument to slurm_requeue function. |
| -- Added sacctmgr options to create, modify, list & delete Resources. |
| |
| Changed members of the following structs |
| ======================================== |
| -- Struct job_info / slurm_job_info_t: Changed exc_node_inx, node_inx, and |
| req_node_inx from type int to type int32_t; Changed array_task_id from type |
| uint16_t to type uint32_t |
| -- job_step_info_t: Changed node_inx from type int to type int32_t; Changed |
| array_task_id from type uint16_t to type uint32_t; Changed node_inx from |
| int to int32_t |
| -- Struct partition_info / partition_info_t: Changed node_inx from type int |
| to type int32_t |
| -- block_job_info_t: Changed cnode_inx from type int to type int32_t |
| -- block_info_t: Changed ionode_inx and mp_inx from type int to type int32_t |
| -- Struct reserve_info / reserve_info_t: Changed node_inx from type int to |
| type int32_t |
| -- Struct slurm_ctl_conf / slurm_ctl_conf_t: Changed max_array_sz from uint16_t |
| to uint32_t. |
| -- Struct reserve_info / reserve_info_t: Changed flags from uint16_t to |
| uint32_t. |
| -- Struct resv_desc_msg / resv_desc_msg_t: Changed flags from uint16_t to |
| uint32_t. |
| |
| Added the following struct definitions |
| ====================================== |
| -- Struct job_info / slurm_job_info_t: Added std_err, std_in and std_out. |
| -- Struct job_info / slurm_job_info_t: Added core_spec |
| -- struct job_descriptor / job_desc_msg_t: Added core_spec, warn_flags |
| -- Struct partition_info / partition_info_t: Added allow_accounts, allow_qos, |
| deny_accounts, deny_groups and deny_qos. |
| -- struct slurm_step_launch_params_t: Added partition (for Prolog environment |
| variable) |
| -- Struct resource_allocation_response_msg: Added partition (for Prolog |
| environment variable) |
| -- Struct slurm_ctl_conf / slurm_ctl_conf_t: Added acct_gather_conf, authinfo. |
| core_spec_plugin, ext_sensors_conf, fs_dampening_factor, |
| job_acct_gather_params, job_container_plugin, log_fmt, prolog_flags and |
| slurmd_plugstack. |
| -- Added license data structures: slurm_license_info and license_info_msg. |
| |
| Changed the following enums and #defines |
| ======================================== |
| -- Add new job_state of JOB_BOOT_FAIL for job terminations due to failure to |
| boot it's allocated nodes or BlueGene block. |
| -- Added set of job state change flags: JOB_REQUEUE, JOB_REQUEUE_HOLD, |
| JOB_SPECIAL_EXIT. |
| -- Added new job_state_reason values: WAIT_BLOCK_MAX_ERR, WAIT_BLOCK_D_ACTION, |
| WAIT_CLEANING and WAIT_PROLOG. |
| -- Added new select_jobdata_type values: SELECT_JOBDATA_CLEANING and |
| SELECT_JOBDATA_NETWORK. |
| -- Added new node state values: NODE_STATE_NET, NODE_STATE_RES and |
| NODE_STATE_UNDRAIN. |
| -- Added new flags for SelectTypeParameters: CR_OTHER_CONS_RES, CR_NHC_STEP_NO, |
| CR_LLN, and CR_NHC_NO. |
| -- Added new PriorityFlags value of PRIORITY_FLAGS_DEPTH_OBLIVIOUS |
| -- Added new partition flags: PART_FLAG_LLN and PART_FLAG_LLN_CLR |
| -- Added DebugFlags: DEBUG_FLAG_JOB_CONT and DEBUG_FLAG_TASK |
| -- Removed DebugFlags: DEBUG_FLAG_THREADID |
| -- Added PrologFlag: PROLOG_FLAG_ALLOC |
| -- Added LogTimeFormat options: LOG_FMT_ISO8601_MS, LOG_FMT_ISO8601, |
| LOG_FMT_RFC5424_MS, LOG_FMT_RFC5424, LOG_FMT_CLOCK, LOG_FMT_SHORT and |
| LOG_FMT_THREAD_ID |
| |
| Added the following API's |
| ========================= |
| -- Added functions for retrieving license information: slurm_load_licenses and |
| slurm_free_license_info_msg. |
| -- Added functions to get specific job information: slurm_get_job_stderr, |
| slurm_get_job_stdin and slurm_get_job_stdout. |
| |
| Change the following API's |
| =========================== |
| -- Add task pointer to the task_post_term() function in task plugins. The |
| terminating task's PID is available in task->pid. |
| |
| |
| |
| DBD API Changes |
| =============== |
| |
| Changed members of the following structs |
| ======================================== |
| |
| Added the following struct definitions |
| ====================================== |
| -- struct slurmdb_clus_res_rec_t: new cluster resource record. |
| -- struct slurmdb_res_cond_t : new resource condition. |
| -- struct slurmdb_res_rec_t: new resource record. |
| |
| Added the following enums and #defines |
| ======================================== |
| -- enum slurmdb_resource_type_t: server resource record "type" definition. |
| -- Added new slurmdb_update_type_t SLURMDB_ADD_RES, |
| SLURMDB_REMOVE_RES, SLURMDB_MODIFY_RES, SLURMDB_REMOVE_QOS_USAGE |
| -- Added new flags for Resources SLURMDB_RES_FLAG_BASE, |
| SLURMDB_RES_FLAG_NOTSET, SLURMDB_RES_FLAG_ADD, SLURMDB_RES_FLAG_REMOVE |
| -- Added new flags for QOS QOS_FLAG_BASE |
| |
| Added the following API's |
| ========================= |
| -- Added functions for resource management in the database. |