| <!--#include virtual="header.txt"--> |
| |
| <h1>Accounting</h1> |
| |
| <p>NOTE: This documents accounting features available in SLURM version |
| 1.3, which are far more extensive than those available in previous |
| releases.</p> |
| |
| <p>SLURM can be configured to collect accounting information for every |
| job and job step executed. |
| Accounting records can be written to a simple text file or a database. |
| Information is available about both currently executing jobs and |
| jobs which have already terminated. |
| The <b>sacct</b> command can report resource usage for running or terminated |
| jobs including individual tasks, which can be useful to detect load imbalance |
| between the tasks. |
| The <b>sstat</b> command reports status information only for currently running jobs. |
| It can also give you valuable information about imbalance between tasks. |
| The <b>sreport</b> command can be used to generate reports based upon all jobs |
| executed in a particular time interval.</p> |
| |
| <p>There are three distinct plugin types associated with resource accounting. |
| The SLURM configuration parameters (in <i>slurm.conf</i>) associated with |
| these plugins include:</p> |
| <ul> |
| <li><b>AccountingStorageType</b> controls how detailed job and job |
| step information is recorded. You can store this information in a |
| text file, <a href="http://www.mysql.com/">MySQL</a> or |
| <a href="http://www.postgresql.org/">PostgreSQL</a> |
| database, optionally using SlurmDBD for added security.</li> |
| <li><b>JobAcctGatherType</b> is operating system dependent and |
| controls what mechanism is used to collect accounting information. |
| Supported values are <i>jobacct_gather/aix</i>, <i>jobacct_gather/linux</i> |
| and <i>jobacct_gather/none</i> (no information collected).</li> |
| <li><b>JobCompType</b> controls how job completion information is |
| recorded. This can be used to record basic job information such |
| as job name, user name, allocated nodes, start time, completion |
| time, exit status, etc. If the preservation of only basic job |
| information is required, this plugin should satisfy your needs |
| with minimal overhead. You can store this information in a |
| text file, <a href="http://www.mysql.com/">MySQL</a> or |
| <a href="http://www.postgresql.org/">PostgreSQL</a> |
| database.</li> |
| </ul> |
| |
| <p>The use of sacct to view information about jobs |
| is dependent upon AccountingStorageType |
| being configured to collect and store that information. |
| The use of sreport is dependent upon some database being |
| used to store that information.</p> |
| |
| <p>The use of sacct or sstat to view information about resource usage |
| within jobs is dependent upon both JobAcctGatherType and AccountingStorageType |
| being configured to collect and store that information.</p> |
| |
| <p>Storing the accounting information into text files is |
| very simple. Just configure the appropriate plugin (e.g. |
| <i>AccountingStorageType=accounting_storage/filetxt</i> and/or |
| <i>JobCompType=jobcomp/filetxt</i>) and then specify the |
| pathname of the file (e.g. |
| <i>AccountingStorageLoc=/var/log/slurm/accounting</i> and/or |
| <i>JobCompLoc=/var/log/slurm/job_completions</i>). |
| Use <i>logrotate</i> or a similar tool to prevent the |
| log files from getting too large. |
| Send a SIGHUP signal to the <i>slurmctld</i> daemon |
| after moving the files, but before compressing them, so |
| that new log files will be created.</p> |
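| |
| <p>As a rough illustration of the rotation described above, a <i>logrotate</i> |
| stanza might look like the sketch below. The log file paths match the examples |
| above. The rotation schedule and the PID file location |
| (<i>/var/run/slurmctld.pid</i>) are assumptions and should be adjusted to |
| the local installation. The <i>postrotate</i> script runs after the files have |
| been renamed, and <i>delaycompress</i> postpones compression of the |
| just-rotated file until the next cycle, so <i>slurmctld</i> has reopened its |
| files before anything is compressed.</p> |
| <pre> |
| # Sketch only: schedule and PID file path are assumptions |
| /var/log/slurm/accounting /var/log/slurm/job_completions { |
|     weekly |
|     rotate 8 |
|     missingok |
|     compress |
|     delaycompress |
|     sharedscripts |
|     postrotate |
|         kill -HUP `cat /var/run/slurmctld.pid` |
|     endscript |
| } |
| </pre> |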
| |
| <p>Storing the data directly into a database from SLURM may seem |
| attractive, but requires the availability of user name and |
| password data not only for the SLURM control daemon (slurmctld), |
| but also user commands which need to access the data (sacct, sreport, and |
| sacctmgr). |
| Making possibly sensitive information available to all users makes |
| database security more difficult to provide. Sending the data through |
| an intermediate daemon can provide better security and performance |
| (through caching data), and SlurmDBD provides such services. |
| SlurmDBD (SLURM Database Daemon) is written in C, multi-threaded, |
| secure and fast. |
| The configuration required to use SlurmDBD is described below. |
| Storing information directly into a database would be similar.</p> |
| |
| <p>Note that SlurmDBD relies upon existing SLURM plugins |
| for authentication and database use, but the other SLURM |
| commands and daemons are not required on the host where |
| SlurmDBD is installed. Install the <i>slurmdbd</i> and |
| <i>slurm-plugins</i> RPMs on the computer where SlurmDBD |
| is to execute.</p> |
| |
| <p>If SlurmDBD is configured for use but not responding, then <i>slurmctld</i> |
| will utilize an internal cache until SlurmDBD is returned to service. |
| The cached data is written by <i>slurmctld</i> to local storage upon shutdown |
| and recovered at startup. |
| If SlurmDBD is not available when <i>slurmctld</i> starts, a cache of |
| valid bank accounts, user limits, etc. based upon their state when the |
| daemons were last communicating will be used. |
| Note that SlurmDBD must be responding when <i>slurmctld</i> is started for the |
| very first time, since no cache of this critical data will be available yet. |
| Job and step accounting records generated by <i>slurmctld</i> will be |
| written to a cache as needed and transferred to SlurmDBD when it returns to |
| service.</p> |
| |
| <h2>Infrastructure</h2> |
| |
| <p>With the SlurmDBD, we are able to collect data from multiple |
| clusters in a single location. |
| This does impose some constraints on the user naming and IDs. |
| Accounting is maintained by user name (not user ID), but a |
| given user name should refer to the same person across all |
| of the computers. |
| Authentication relies upon user ID numbers, so those must |
| be uniform across all computers communicating with each |
| SlurmDBD, at least for users requiring authentication. |
| In particular, the configured <i>SlurmUser</i> must have the |
| same name and ID across all clusters. |
| If you plan to have administrators of user accounts, limits, |
| etc. they must also have consistent names and IDs across all |
| clusters. |
| If you plan to restrict access to accounting records (e.g. |
| only permit a user to view records of his jobs), then all |
| users should have consistent names and IDs.</p> |
| |
| <p>The best way to ensure security of the data is by authenticating |
| communications to the SlurmDBD, and we recommend |
| <a href="http://home.gna.org/munge/">Munge</a> for that purpose. |
| If you have one cluster managed by SLURM and execute the SlurmDBD |
| on that one cluster, the normal Munge configuration will suffice. |
| Otherwise Munge should be installed on all nodes of all |
| SLURM managed clusters, plus the machine where SlurmDBD executes. |
| You then have a choice of either having a single Munge key for |
| all of these computers or maintaining a unique key for each of the |
| clusters plus a second key for communications between the clusters |
| for better security. |
| Munge enhancements are planned to support two keys within a single |
| configuration file, but presently two different daemons must be |
| started with different configurations to support two different keys |
| (create two key files and start the daemons with the |
| <i>--key-file</i> option to locate the proper key plus the |
| <i>--socket</i> option to specify distinct local domain sockets for each). |
| The pathname of the local domain socket will be needed in the SLURM |
| and SlurmDBD configuration files (slurm.conf and slurmdbd.conf |
| respectively; more details are provided below).</p> |
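| |
| <p>A second Munge daemon for communications between the clusters might be |
| started and referenced roughly as sketched below. The key file and socket |
| paths are hypothetical. Only the <i>--key-file</i> and <i>--socket</i> options |
| named above are used; depending on the Munge version, further options may be |
| needed to keep the two daemons' other files (for example PID and log files) |
| apart, so check the <i>munged</i> documentation before relying on this.</p> |
| <pre> |
| # Second daemon with its own key and socket (paths are assumptions) |
| munged --key-file=/etc/munge/munge_global.key \ |
|        --socket=/var/run/munge/munge_global.socket.2 |
| |
| # slurm.conf on each cluster: socket of the second daemon |
| AccountingStoragePass=/var/run/munge/munge_global.socket.2 |
| |
| # slurmdbd.conf on the SlurmDBD host: same socket |
| AuthInfo=/var/run/munge/munge_global.socket.2 |
| </pre> |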
| |
| <p>Whether or not you use an authentication module, the SlurmDBD needs |
| a way to look up the uids of users and/or administrators. If using |
| Munge, it is ideal for your users to have the same id on all your |
| clusters. If this is the case, you should have a combination of every cluster's |
| /etc/passwd file on the database server to allow the DBD to resolve |
| names for authentication. If using Munge and a user's name is not in |
| the passwd file, the action will fail. If not using Munge, you should |
| add anyone you want to be an administrator or operator to the passwd file. |
| If they plan on running sacctmgr or any of the accounting tools, they |
| should have the same uid, or they will not authenticate correctly. An |
| LDAP server could also serve as a way to gather this information.</p> |
| |
| <h2>Slurm JobComp Configuration</h2> |
| |
| <p>Presently job completion is not supported with the SlurmDBD, but can be |
| written directly to a database, script or flat file. If you are |
| running with the accounting storage, you may not need to run this |
| since it contains much of the same information. If you would like |
| to configure this, some of the more important parameters include:</p> |
| |
| <ul> |
| <li><b>JobCompHost</b>: |
| Only needed if using a database. The name or address of the host where |
| the database server executes.</li> |
| |
| <li><b>JobCompPass</b>: |
| Only needed if using a database. Password for the user connecting to |
| the database. Since the password cannot be securely maintained, |
| storing the information directly in a database is not recommended.</li> |
| |
| <li><b>JobCompPort</b>: |
| Only needed if using a database. The network port that the database |
| accepts communication on.</li> |
| |
| <li><b>JobCompType</b>: |
| Type of jobcomp plugin set to "jobcomp/mysql", "jobcomp/pgsql", or |
| "jobcomp/filetxt".</li> |
| |
| <li><b>JobCompUser</b>: |
| Only needed if using a database. User name to connect to |
| the database with.</li> |
| </ul> |
| |
| <h2>SLURM Accounting Configuration Before Build</h2> |
| |
| <p>While the SlurmDBD will work with a flat text file for recording |
| job completions and such, this configuration will not allow |
| "associations" between a user and account. A database allows such |
| a configuration.</p> |
| |
| <p><b>MySQL is the preferred database; PostgreSQL is |
| supported for job and step accounting only.</b> The infrastructure for |
| using PostgreSQL with associations is not yet supported, meaning |
| sacctmgr will not work correctly. If interested in adding this |
| capability for PostgreSQL, please contact us at slurm-dev@lists.llnl.gov.</p> |
| |
| <p>To enable this database support, |
| one only needs to have the development package for the desired database |
| installed on the system. The slurm configure script uses |
| mysql_config and pg_config to find out the information it needs |
| about installed libraries and headers. You can specify where your |
| mysql_config script is with the |
| <i>--with-mysql_conf=/path/to/mysql_config</i> option when configuring your |
| slurm build. A similar option is also available for PostgreSQL. |
| On a successful configure, output is something like this: </p> |
| <pre> |
| checking for mysql_config... /usr/bin/mysql_config |
| MySQL test program built properly. |
| </pre> |
| |
| <h2>SLURM Accounting Configuration After Build</h2> |
| |
| <p>For simplicity's sake, we are going to reference everything as if you |
| are running with the SlurmDBD. You can communicate with a storage plugin |
| directly, but that offers minimal security. </p> |
| |
| <p>Several SLURM configuration parameters must be set to support |
| archiving information in SlurmDBD. SlurmDBD has a separate configuration |
| file which is documented in a separate section. |
| Note that you can write accounting information to SlurmDBD |
| while job completion records are written to a text file or |
| not maintained at all. |
| If you don't set the configuration parameters that begin |
| with "AccountingStorage" then accounting information will not be |
| referenced or recorded.</p> |
| |
| <ul> |
| <li><b>AccountingStorageEnforce</b>: |
| This option contains a comma separated list of options you may want to |
| enforce. The valid options are any comma separated combination of |
| <ul> |
| <li>associations - This will prevent users from running jobs if |
| their <i>association</i> is not in the database. This option will |
| prevent users from accessing invalid accounts. |
| </li> |
| <li>limits - This will enforce limits set to associations. By setting |
| this option, the 'associations' option is also set. |
| </li> |
| <li>qos - This will require all jobs to specify (either overtly or by |
| default) a valid qos (Quality of Service). QOS values are defined for |
| each association in the database. By setting this option, the |
| 'associations' option is also set. |
| </li> |
| <li>wckeys - This will prevent users from running jobs under a wckey |
| that they don't have access to. By using this option, the |
| 'associations' option is also set. The 'TrackWCKey' option is also |
| set to true. |
| </li> |
| </ul> |
| (NOTE: The association is a combination of cluster, account, |
| user names and optional partition name.) |
| <br> |
| Without AccountingStorageEnforce being set (the default behavior) |
| jobs will be executed based upon policies configured in SLURM on each |
| cluster. |
| <br> |
| It is advisable to run without the option 'limits' set when running a |
| scheduler on top of SLURM, like Moab, that does not update its |
| per-association limits in real time.</li> |
| |
| <li><b>AccountingStorageHost</b>: The name or address of the host where |
| SlurmDBD executes.</li> |
| |
| <li><b>AccountingStoragePass</b>: If using SlurmDBD with a second Munge |
| daemon, store the pathname of the named socket used by Munge to provide |
| enterprise-wide authentication. Otherwise the default Munge daemon will be used.</li> |
| |
| <li><b>AccountingStoragePort</b>: |
| The network port that SlurmDBD accepts communication on.</li> |
| |
| <li><b>AccountingStorageType</b>: |
| Set to "accounting_storage/slurmdbd".</li> |
| |
| <li><b>ClusterName</b>: |
| Set to a unique name for each Slurm-managed cluster so that |
| accounting records from each can be identified.</li> |
| <li><b>TrackWCKey</b>: |
| Boolean. Set this if you want to track the wckeys (Workload Characterization Keys) |
| of users. A wckey is an orthogonal way to do accounting, for example against |
| a group of otherwise unrelated accounts. WCKeys can be defined using |
| <i>sacctmgr add wckey 'name'</i>. When a job is run with <i>srun --wckey</i>, |
| its time will be summed up for this wckey. |
| </li> |
| </ul> |
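| |
| <p>Put together, the parameters above might appear in <i>slurm.conf</i> as in |
| the following sketch. The cluster name reuses the "snowflake" example from |
| this page and the host name <i>dbdhost</i> is hypothetical; the remaining |
| values are taken from the descriptions above.</p> |
| <pre> |
| # slurm.conf fragment (sketch; host and cluster names are assumptions) |
| ClusterName=snowflake |
| AccountingStorageType=accounting_storage/slurmdbd |
| AccountingStorageHost=dbdhost |
| AccountingStoragePort=6819 |
| # 'limits' also sets 'associations', as described above |
| AccountingStorageEnforce=limits |
| # Needed for per-task resource usage in sacct and sstat |
| JobAcctGatherType=jobacct_gather/linux |
| </pre> |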
| |
| <h2>SlurmDBD Configuration</h2> |
| |
| <p>SlurmDBD requires its own configuration file called "slurmdbd.conf". |
| This file should be only on the computer where SlurmDBD executes and |
| should only be readable by the user which executes SlurmDBD (e.g. "slurm"). |
| This file should be protected from unauthorized access since it |
| contains a database login name and password. |
| See "man slurmdbd.conf" for a more complete description of the |
| configuration parameters. |
| Some of the more important parameters include:</p> |
| |
| <ul> |
| <li><b>AuthInfo</b>: |
| If using SlurmDBD with a second Munge daemon, store the pathname of |
| the named socket used by Munge to provide enterprise-wide authentication. |
| Otherwise the default Munge daemon will be used.</li> |
| |
| <li><b>AuthType</b>: |
| Define the authentication method for communications between SLURM |
| components. A value of "auth/munge" is recommended.</li> |
| |
| <li><b>DbdHost</b>: |
| The name of the machine where the Slurm Database Daemon is executed. |
| This should be a node name without the full domain name (e.g. "lx0001"). |
| This defaults to <i>localhost</i> but should be supplied to avoid a |
| warning message.</li> |
| |
| <li><b>DbdPort</b>: |
| The port number that the Slurm Database Daemon (slurmdbd) listens |
| on for work. The default value is SLURMDBD_PORT as established at system |
| build time. If none is explicitly specified, it will be set to 6819. |
| This value must be equal to the <i>AccountingStoragePort</i> parameter in the |
| slurm.conf file.</li> |
| |
| <li><b>LogFile</b>: |
| Fully qualified pathname of a file into which the Slurm Database Daemon's |
| logs are written. |
| The default value is none (performs logging via syslog).</li> |
| |
| <li><b>PluginDir</b>: |
| Identifies the places in which to look for SLURM plugins. |
| This is a colon-separated list of directories, like the PATH |
| environment variable. |
| The default value is the prefix given at configure time + "/lib/slurm".</li> |
| |
| <li><b>SlurmUser</b>: |
| The name of the user that the <i>slurmctld</i> daemon executes as. |
| This user must exist on the machine executing the Slurm Database Daemon |
| and have the same user ID as on the hosts on which <i>slurmctld</i> executes. |
| For security purposes, a user other than "root" is recommended. |
| The default value is "root". This name should also be the same SlurmUser |
| on all clusters reporting to the SlurmDBD.</li> |
| |
| <li><b>StorageHost</b>: |
| Define the name of the host running the database in which the data |
| will be stored. |
| Ideally this should be the host on which SlurmDBD executes, but it could |
| be a different machine.</li> |
| |
| <li><b>StorageLoc</b>: |
| Specifies the name of the database where accounting |
| records are written. The default database name is |
| slurm_acct_db. Note that the name cannot contain a '/'; if it does, the |
| default will be used.</li> |
| |
| <li><b>StoragePass</b>: |
| Define the password used to gain access to the database to store |
| the job accounting data.</li> |
| |
| <li><b>StoragePort</b>: |
| Define the port on which the database is listening.</li> |
| |
| <li><b>StorageType</b>: |
| Define the accounting storage mechanism type. |
| Acceptable values at present include |
| "accounting_storage/mysql" and "accounting_storage/pgsql". |
| The value "accounting_storage/mysql" indicates that accounting records |
| should be written to a MySQL database specified by the |
| <i>StorageLoc</i> parameter. |
| The value "accounting_storage/pgsql" indicates that accounting records |
| should be written to a PostgreSQL database specified by the |
| <i>StorageLoc</i> parameter. |
| This value must be specified.</li> |
| |
| <li><b>StorageUser</b>: |
| Define the name of the user used to connect to the database |
| to store the job accounting data.</li> |
| </ul> |
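| |
| <p>For orientation, a minimal <i>slurmdbd.conf</i> for a MySQL back end might |
| look like the sketch below. The host name, log file path and password are |
| placeholders, and port 3306 is simply MySQL's usual default; none of these |
| values is prescribed by SLURM. As noted above, the file should be readable |
| only by the user that executes SlurmDBD.</p> |
| <pre> |
| # slurmdbd.conf (sketch; host name, log path and password are placeholders) |
| AuthType=auth/munge |
| DbdHost=lx0001 |
| DbdPort=6819 |
| LogFile=/var/log/slurm/slurmdbd.log |
| SlurmUser=slurm |
| StorageType=accounting_storage/mysql |
| StorageHost=localhost |
| StoragePort=3306 |
| StorageUser=slurm |
| StoragePass=some_pass |
| StorageLoc=slurm_acct_db |
| </pre> |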
| |
| <h2>MySQL Configuration</h2> |
| |
| <p>While SLURM will create the database automatically you will need to |
| make sure the StorageUser is given permissions in MySQL to do so. |
| As the <i>mysql</i> user grant privileges to that user using a |
| command such as:</p> |
| |
| <p>GRANT ALL ON StorageLoc.* TO 'StorageUser'@'StorageHost'; |
| (The ticks are needed)</p> |
| |
| <p>(You need to be root to do this. Also, in the password example |
| below there is a line that starts with '->'. This is a continuation |
| prompt, since the previous mysql statement did not end with a ';'. It |
| assumes that you wish to input more info.)</p> |
| |
| <p>Live example:</p> |
| |
| <pre> |
| mysql@snowflake:~$ mysql |
| Welcome to the MySQL monitor.  Commands end with ; or \g. |
| Your MySQL connection id is 538 |
| Server version: 5.0.51a-3ubuntu5.1 (Ubuntu) |
| |
| Type 'help;' or '\h' for help. Type '\c' to clear the buffer. |
| |
| mysql> grant all on slurm_acct_db.* TO 'slurm'@'localhost'; |
| Query OK, 0 rows affected (0.00 sec) |
| |
| or with a password... |
| |
| mysql> grant all on slurm_acct_db.* TO 'slurm'@'localhost' |
| -> identified by 'some_pass' with grant option; |
| Query OK, 0 rows affected (0.00 sec) |
| </pre> |
| |
| <p>This will grant user 'slurm' access to do what it needs to do on |
| the local host. This should be done before the SlurmDBD will work |
| properly. After you grant permission to the Slurm user in mysql then |
| you can start SlurmDBD and Slurm. You start SlurmDBD by typing |
| 'slurmdbd'. You can verify that SlurmDBD is running by typing 'ps aux |
| | grep slurmdbd'. After SlurmDBD and the slurmctld start you can |
| verify that the database was created by using the mysql command 'show |
| databases;'. You can display the tables that slurm created in the |
| database by using the mysql command 'use slurm_acct_db;' and then 'show |
| tables;'.</p> |
| |
| <p>Use the mysql 'show databases;' command</p> |
| |
| <pre> |
| mysql> show databases; |
| |
| +--------------------+ |
| | Database | |
| +--------------------+ |
| | information_schema | |
| | slurm_acct_db | |
| | test | |
| +--------------------+ |
| |
| 3 rows in set (0.00 sec) |
| </pre> |
| |
| <p>Select the database that you created.</p> |
| |
| <pre> |
| mysql> use slurm_acct_db; |
| |
| Reading table information for completion of table and column names |
| You can turn off this feature to get a quicker startup with -A |
| |
| Database changed |
| </pre> |
| |
| <p>Now do a mysql 'show tables;' command.</p> |
| |
| <pre> |
| mysql> show tables; |
| |
| +---------------------------+ |
| | Tables_in_slurm_acct_db | |
| +---------------------------+ |
| | acct_coord_table | |
| | acct_table | |
| | assoc_day_usage_table | |
| | assoc_hour_usage_table | |
| | assoc_month_usage_table | |
| | assoc_table | |
| | cluster_day_usage_table | |
| | cluster_event_table | |
| | cluster_hour_usage_table | |
| | cluster_month_usage_table | |
| | cluster_table | |
| | job_table | |
| | last_ran_table | |
| | qos_table | |
| | resv_table | |
| | step_table | |
| | suspend_table | |
| | table_defs_table | |
| | txn_table | |
| | user_table | |
| | wckey_day_usage_table | |
| | wckey_hour_usage_table | |
| | wckey_month_usage_table | |
| | wckey_table | |
| +---------------------------+ |
| |
| 24 rows in set (0.02 sec) |
| |
| mysql> quit |
| </pre> |
| |
| <p>If the database is not created or SlurmDBD is not running you can |
| use the -v option when you start SlurmDBD to get more detailed |
| information.</p> |
| |
| <h2>Tools</h2> |
| |
| <p>There are a few tools available to work with accounting data: |
| <b>sacct</b>, <b>sacctmgr</b>, and <b>sreport</b>. |
| These tools all get or set data through the SlurmDBD daemon.</p> |
| <ul> |
| <li><b>sacct</b> is used to generate accounting reports for both |
| running and completed jobs.</li> |
| <li><b>sacctmgr</b> is used to manage associations in the database: |
| add or remove clusters, add or remove users, etc.</li> |
| <li><b>sreport</b> is used to generate various reports on usage collected over a |
| given time period.</li> |
| </ul> |
| <p>See the man pages for each command for more information.</p> |
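| |
| <p>A few typical invocations are sketched below. The job ID and the dates are |
| made up for illustration, and the exact option spellings and accepted date |
| formats should be checked against the man pages of the installed version.</p> |
| <pre> |
| # Accounting records for one job and its steps (job ID is hypothetical) |
| sacct -j 1234 |
| |
| # Resource usage of a job that is still running |
| sstat -j 1234 |
| |
| # Cluster utilization over a time interval (dates are examples) |
| sreport cluster utilization start=01/01/10 end=02/01/10 |
| </pre> |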
| |
| <p>Web interfaces with graphical output are currently under |
| development and should be available in the Fall of 2009. |
| A tool to report node state information is also under development.</p> |
| |
| <h2>Database Configuration</h2> |
| |
| <p>Accounting records are maintained based upon what we refer |
| to as an <i>Association</i>, |
| which consists of four elements: cluster, account, user names and |
| an optional partition name. Use the <i>sacctmgr</i> |
| command to create and manage these records. Associations must be set up in |
| a particular order: you must define clusters before you add |
| accounts, and you must add accounts before you can add users. </p> |
| |
| <p>For example, to add a cluster named "snowflake" to the database |
| execute this line:</p> |
| <pre> |
| sacctmgr add cluster snowflake |
| </pre> |
| |
| <p>Add accounts "none" and "test" to cluster "snowflake" with an execute |
| line of this sort:</p> |
| <pre> |
| sacctmgr add account none,test Cluster=snowflake \ |
| Description="none" Organization="none" |
| </pre> |
| |
| <p>If you want to add these accounts to more clusters, you |
| can either not specify a cluster, which will add the accounts to all |
| clusters in the system, or comma separate the cluster names you want |
| to add to in the cluster option. |
| Note that multiple accounts can be added at the same time |
| by comma separating the names. |
| Some <i>description</i> of the account and the <i>organization</i> to which |
| it belongs must be specified. |
| These terms can be used later to generate accounting reports. |
| Accounts may be arranged in a hierarchical fashion, for example accounts |
| <i>chemistry</i> and <i>physics</i> may be children of the account <i>science</i>. |
| The hierarchy may have an arbitrary depth. |
| Just specify the <i>parent=''</i> option in the add account line to construct |
| the hierarchy. |
| For the example above execute</p> |
| <pre> |
| sacctmgr add account science \ |
| Description="science accounts" Organization=science |
| sacctmgr add account chemistry,physics parent=science \ |
| Description="physical sciences" Organization=science |
| </pre> |
| |
| <p>Add users to accounts using similar syntax. |
| For example, to permit user <i>da</i> to execute jobs on all clusters |
| with a default account of <i>test</i> execute:</p> |
| <pre> |
| sacctmgr add user da DefaultAccount=test |
| </pre> |
| |
| <p>If <b>AccountingStorageEnforce=associations</b> is configured in |
| the slurm.conf of the cluster <i>snowflake</i> then user <i>da</i> would be |
| allowed to run in account <i>test</i> and any other accounts to which |
| the user is added in the future. |
| Any attempt to use other accounts will result in the job being |
| aborted. |
| Account <i>test</i> will be the default if the user doesn't specify one in |
| the job submission command.</p> |
| |
| <p>Partition names can also be added to an "add user" command with the |
| Partition='partitionname' option to specify an association specific to |
| a slurm partition.</p> |
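| |
| <p>As a sketch of such a partition-specific association, the lines below give |
| user <i>da</i> an association on cluster <i>snowflake</i> that applies only |
| to one partition, and then list the associations to check the result. The |
| partition name "debug" is hypothetical.</p> |
| <pre> |
| # Partition name "debug" is an assumption for illustration |
| sacctmgr add user da DefaultAccount=test Cluster=snowflake Partition=debug |
| |
| # Review the associations now stored in the database |
| sacctmgr list associations |
| </pre> |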
| |
| <h2>Cluster Options</h2> |
| |
| <p>When either adding or modifying a cluster, these are the options |
| available with sacctmgr:</p> |
| <ul> |
| <li><b>Name=</b> Cluster name</li> |
| |
| </ul> |
| |
| <h2>Account Options</h2> |
| |
| <p>When either adding or modifying an account, the following sacctmgr |
| options are available:</p> |
| <ul> |
| <li><b>Cluster=</b> Only add this account to these clusters. |
| The account is added to all defined clusters by default.</li> |
| |
| <li><b>Description=</b> Description of the account. (Default is |
| account name)</li> |
| |
| <li><b>Name=</b> Name of account</li> |
| |
| <li><b>Organization=</b> Organization of the account. (Default is |
| parent account unless parent account is root then organization is |
| set to the account name.)</li> |
| |
| <li><b>Parent=</b> Make this account a child of this other account |
| (already added).</li> |
| |
| </ul> |
| |
| <h2>User Options</h2> |
| |
| <p>When either adding or modifying a user, the following sacctmgr |
| options are available:</p> |
| |
| <ul> |
| <li><b>Account=</b> Account(s) to add user to</li> |
| |
| <li><b>AdminLevel=</b> This field is used to grant accounting |
| privileges to this user. Valid options are |
| <ul> |
| <li>None</li> |
| <li>Operator: can add, modify, and remove users, and add other operators</li> |
| <li>Admin: In addition to operator privileges these users can add, modify, |
| and remove accounts and clusters</li> |
| </ul> |
| </li> |
| |
| <li><b>Cluster=</b> Only add to accounts on these clusters (default is all clusters)</li> |
| |
| <li><b>DefaultAccount=</b> Default account for the user, used when no account |
| is specified when a job is submitted. (Required on creation)</li> |
| |
| <li><b>DefaultWCKey=</b> Default wckey for the user, used when no wckey |
| is specified when a job is submitted. (Only used when tracking wckeys.)</li> |
| |
| <li><b>Name=</b> User name</li> |
| |
| <li><b>Partition=</b> Name of SLURM partition this association applies to</li> |
| |
| </ul> |
| |
| <h2>Limit enforcement</h2> |
| |
| <p>Limits are resolved in the following order. |
| If a user has a limit set, SLURM will use it; |
| if not, we will refer to the account associated with the job. |
| If the account doesn't have the limit set, we will refer to |
| the cluster's limits. |
| If the cluster doesn't have the limit set, no limit will be enforced.</p> |
| <p>All of the above entities can include limits as described below.</p> |
| |
| <ul> |
| |
| <li><b>Fairshare=</b> Used for determining priority. Essentially |
| this is the amount of claim this association and its children have |
| to the above system. |
| </li> |
| |
| <!-- For future use |
| <li><b>GrpCPUMins=</b> A hard limit of cpu minutes to be used by jobs |
| running from this association and its children. If this limit is |
| reached all jobs running in this group will be killed, and no new |
| jobs will be allowed to run. |
| </li> |
| --> |
| |
| <!-- For future use |
| <li><b>GrpCPUs=</b> The total count of cpus able to be used at any given |
| time from jobs running from this association and its children. If |
| this limit is reached new jobs will be queued but only allowed to |
| run after resources have been relinquished from this group. |
| </li> |
| --> |
| |
| <li><b>GrpJobs=</b> The total number of jobs able to run at any given |
| time from this association and its children. If |
| this limit is reached new jobs will be queued but only allowed to |
| run after previous jobs complete from this group. |
| </li> |
| |
| <li><b>GrpNodes=</b> The total count of nodes able to be used at any given |
| time from jobs running from this association and its children. If |
| this limit is reached new jobs will be queued but only allowed to |
| run after resources have been relinquished from this group. |
| </li> |
| |
| <li><b>GrpSubmitJobs=</b> The total number of jobs able to be submitted |
| to the system at any given time from this association and its children. If |
| this limit is reached new submission requests will be denied until |
| previous jobs complete from this group. |
| </li> |
| |
| <li><b>GrpWall=</b> The maximum wall clock time any job submitted to |
| this group can run for. If this limit is reached submission requests |
| will be denied. |
| </li> |
| |
| <!-- For future use |
| <li><b>MaxCPUMinsPerJob=</b> A limit of cpu minutes to be used by jobs |
| running from this association. If this limit is |
| reached the job will be killed will be allowed to run. |
| </li> |
| --> |
| |
| <!-- For future use |
| <li><b>MaxCPUsPerJob=</b> The maximum size in cpus any given job can |
| have from this association. If this limit is reached the job will |
| be denied at submission. |
| </li> |
| --> |
| |
| <li><b>MaxJobs=</b> The total number of jobs able to run at any given |
| time from this association. If this limit is reached new jobs will |
| be queued but only allowed to run after previous jobs complete from |
| this association. |
| </li> |
| |
| <li><b>MaxNodesPerJob=</b> The maximum size in nodes any given job can |
| have from this association. If this limit is reached the job will |
| be denied at submission. |
| </li> |
| |
| <li><b>MaxSubmitJobs=</b> The maximum number of jobs able to be submitted |
| to the system at any given time from this association. If |
| this limit is reached new submission requests will be denied until |
| previous jobs complete from this association. |
| </li> |
| |
| <li><b>MaxWallDurationPerJob=</b> The maximum wall clock time any job |
| submitted to this association can run for. If this limit is reached |
| the job will be denied at submission. |
| </li> |
| |
| <li><b>QOS=</b> Comma-separated list of the QOS's this association is |
| able to run. |
| </li> |
| </ul> |
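| |
| <p>Limits of this kind are set with <i>sacctmgr</i>. The lines below are a |
| sketch using the user, account and cluster names from the examples on this |
| page; the numeric values are arbitrary. The limits are only enforced if |
| <i>AccountingStorageEnforce</i> includes 'limits' in slurm.conf.</p> |
| <pre> |
| # Let user da run at most 2 jobs at a time in account test on snowflake |
| sacctmgr modify user set MaxJobs=2 where name=da cluster=snowflake account=test |
| |
| # Cap account science and its children at 20 running jobs in total |
| sacctmgr modify account set GrpJobs=20 where name=science |
| </pre> |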
| |
| <h2>Modifying Entities</h2> |
| |
| <p>When modifying entities, you can specify many different options in |
| SQL-like fashion, using key words like <i>where</i> and <i>set</i>. |
| A typical execute line has the following form:</p> |
| <pre> |
| sacctmgr modify &lt;entity&gt; set &lt;options&gt; where &lt;options&gt; |
| </pre> |
| |
| <p>For example:</p> |
| <pre> |
| sacctmgr modify user set default=none where default=test |
| </pre> |
| <p>will change all users with a default account of "test" to account "none". |
| Once an entity has been added, modified or removed, the change is |
| sent to the appropriate SLURM daemons and will be available for use |
| instantly.</p> |
| |
| <h2>Removing Entities</h2> |
| |
| <p>Remove entities using an execute line similar to the modify example above, |
| but without the set options. |
| For example, remove all users with a default account "test" using the following |
| execute line:</p> |
| <pre> |
| sacctmgr remove user where default=test |
| </pre> |
| <p>Note: In most cases, removed entities are preserved, but flagged |
| as deleted. |
| If an entity has existed for less than 1 day, the entity will be removed |
| completely. This is meant to clean up after typographic errors.</p> |
| |
| <p style="text-align: center;">Last modified 25 January 2010</p> |
| |
| <!--#include virtual="footer.txt"--> |
| |