| Simple build/install on Linux: |
| ./autogen.sh |
| ./configure --enable-debug \ |
| --prefix=<install-dir> --sysconfdir=<config-dir> |
| make |
| make install |
| |
| If you make changes to Makefile.am files, then on _MCR_, run |
| ./autogen.sh |
| then check in the new Makefile.am and Makefile.in files |
| |
| Here is a step-by-step HOWTO for creating a new release of SLURM on a |
| Linux cluster (see BlueGene and AIX specific notes below for some differences). |
| 0. svn co https://eris.llnl.gov/svn/slurm/trunk slurm |
| svn co https://eris.llnl.gov/svn/chaos/private/buildfarm/trunk buildfarm |
| put the buildfarm directory in your search path |
| 1. Update NEWS and META files for the new release. In the META file, |
| the API, Major, Minor, Micro, Version, and Release fields must all |
| be up-to-date. **** DON'T UPDATE META UNTIL RIGHT BEFORE THE TAG **** |
| The Release field should always be 1 unless one of |
| the following is true |
| - Changes were made to the spec file, documentation, or example |
| files, but not to code. |
| - this is a prerelease (Release = 0.preX) |
| 2. Tag the repository with the appropriate name for the new version. |
| svn copy https://eris.llnl.gov/svn/slurm/trunk \ |
| https://eris.llnl.gov/svn/slurm/tags/slurm-1-2-0-0-pre3 \ |
| -m "description" |
| 3. Use the rpm make target to create the new RPMs. This requires a .rpmmacros |
| (.rpmrc for newer versions of rpmbuild) file containing: |
| %_slurm_sysconfdir /etc/slurm |
| %_enable_debug "--enable-debug" |
| I usually build using the following syntax: |
| build -s https://eris.llnl.gov/svn/slurm/tags/slurm-1-2-0-0-pre3 |
| NOTE: For v1.0 and earlier add: --pre-exec='./autogen.sh' |
| 4. Move the RPMs to |
| /usr/local/admin/rpms/llnl/RPMS-RHEL4/x86_64 (odevi, or gauss) |
| /usr/local/admin/rpms/llnl/RPMS-RHEL4/i386/ (mdevi) |
| /usr/local/admin/rpms/llnl/RPMS-RHEL4/ia64/ (tdevi) |
| send an announcement email (with the latest entry from the NEWS |
| file) out to linux-admin@lists.llnl.gov. |
| 5. Copy tagged bzip file (e.g. slurm-0.6.0-0.pre3.bz2) to FTP server |
| for external SLURM users. |
| 6. Copy bzip file and rpms (including src.rpm) to sourceforge.net: |
| ncftp upload.sf.net |
| cd upload |
| put filename |
| Use the SourceForge admin tool to add the new release, including the changelog. |
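The tag name used in steps 2 and 3 appears to follow from the Version and Release fields of the META file. The sketch below is inferred only from the examples above (slurm-1-2-0-0-pre3 as a tag, slurm-0.6.0-0.pre3.bz2 as a file name); the function names are hypothetical and the naming rule is an assumption, not a documented convention.

```shell
#!/bin/sh
# Hypothetical helper: build the svn tag name from Version and Release values.
# Assumption: the tag is "slurm-<Version>-<Release>" with every "." replaced by "-".
slurm_tag_name() {
    version=$1
    release=$2
    printf 'slurm-%s-%s\n' "$version" "$release" | tr '.' '-'
}

# Print the svn copy command instead of running it, so it can be reviewed first.
# Arguments: <Version> <Release> <commit message>
slurm_tag_command() {
    repo="https://eris.llnl.gov/svn/slurm"
    tag=$(slurm_tag_name "$1" "$2")
    printf 'svn copy %s/trunk %s/tags/%s -m "%s"\n' "$repo" "$repo" "$tag" "$3"
}

slurm_tag_name 1.2.0 0.pre3    # prints slurm-1-2-0-0-pre3
```

Printing the command rather than executing it keeps the helper safe to run before META has been finalized.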
| |
| BlueGene build notes: |
| 3. Use the rpm make target to create the new RPMs. This requires a .rpmmacros |
| (.rpmrc for newer versions of rpmbuild) file containing: |
| %_slurm_sysconfdir /etc/slurm |
| %_enable_debug "--enable-debug" |
| %with_cflags CFLAGS=-m64 CXX="g++ -m64" |
| Build on the Service Node using the following syntax: |
| build -s https://eris.llnl.gov/svn/slurm/tags/slurm-1-2-0-0-pre3 |
| 4. Copy RPMs to /usr/admin/sles/llnl/RPMS-SLES9 |
| Do _not_ copy the switch-elan, authd-authd, |
| aix-federation or auth-none RPMs |
| |
| To build and run on AIX: |
| 0. svn co https://eris.llnl.gov/svn/slurm/trunk slurm |
| svn co https://eris.llnl.gov/svn/slurm/private/proctrack-aix/trunk proctrack |
| svn co https://eris.llnl.gov/svn/buildfarm/trunk buildfarm |
| put the buildfarm directory in your search path |
| Also, you will need two commands to appear FIRST in your PATH: |
| |
| /usr/local/tools/gnu/aix_5_64_fed/bin/install |
| /usr/local/gnu/bin/tar |
| |
| I do this by making symlinks to those commands in the buildfarm directory, |
| then making the buildfarm directory the first one in my PATH. |
| 1. export OBJECT_MODE=32 |
| 2. Build with: |
| ./configure --enable-debug --prefix=/opt/freeware \ |
| --sysconfdir=/opt/freeware/etc/slurm \ |
| --with-proctrack=<your directory>/proctrack \ |
| --with-ssl=/opt/freeware --with-munge=/opt/freeware |
| make |
| make uninstall # remove old shared libraries, AIX caches them |
| make install |
| 3. To build RPMs (NOTE: Many GNU tools are required): |
| Create a file specifying system specific files: |
| # |
| # RPM Macros for use with SLURM on AIX |
| # The system-wide macros for RPM are in /usr/lib/rpm/macros |
| # and this overrides a few of them |
| # |
| %_prefix /opt/freeware |
| %_slurm_sysconfdir %{_prefix}/etc/slurm |
| %_defaultdocdir %{_prefix}/doc |
| |
| %_enable_debug "--enable-debug" |
| %with_proctrack "--with-proctrack=<your directory>/proctrack" |
| %with_ssl "--with-ssl=/opt/freeware" |
| %with_munge "--with-munge=/opt/freeware" |
| build -s https://eris.llnl.gov/svn/slurm/tags/slurm-1-2-0-0-pre3 |
| 4. export MP_RMLIB=./slurm_ll_api.so |
| export CHECKPOINT=yes |
| 5. poe hostname -rmpool debug |
| 6. To debug, set SLURM_LL_API_DEBUG=3 before running poe - will create a file |
| /tmp/slurm.* |
| It can also be helpful to use poe options "-ilevel 6 -pmdlog yes" |
| A log file named /tmp/mplog.<jobid>.<taskid> will be created. |
| 7. If you update proctrack, be sure to run "slibclean" to clear cached |
| version. |
| 8. Install the rpms slurm-*.ppc.rpm, slurm-aix-federation-*.ppc.rpm, |
| slurm-auth-munge-*.ppc.rpm, slurm-devel-*.ppc.rpm, and |
| slurm-sched-wiki-*.ppc.rpm in /usr/admin/inst.image/slurm/aix5.3 on an |
| OCF AIX machine (pdev is a good choice). |
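The PATH setup described in step 0 of the AIX notes (symlinks in the buildfarm directory, with that directory first in PATH) could be scripted roughly as follows. This is a minimal sketch: the function name prefer_tools and the BUILDFARM variable are hypothetical, and it assumes an ln that accepts -s and -f together.

```shell
#!/bin/sh
# Link each given tool into a directory, then put that directory first in PATH.
# Usage: prefer_tools <buildfarm-dir> <tool-path>...
prefer_tools() {
    dir=$1
    shift
    for tool in "$@"; do
        # -f replaces a stale link of the same name left by an earlier run
        ln -sf "$tool" "$dir/$(basename "$tool")" || return 1
    done
    PATH="$dir:$PATH"
    export PATH
}

# Example (BUILDFARM is a hypothetical variable naming your buildfarm checkout):
# prefer_tools "$BUILDFARM" \
#     /usr/local/tools/gnu/aix_5_64_fed/bin/install \
#     /usr/local/gnu/bin/tar
```

Because the function changes PATH, it has to be sourced into the current shell rather than run as a separate script.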
| |
| AIX/Federation switch window problems |
| To clean switch windows: ntblclean =w 8 -a sni0 |
| To get switch window status: ntblstatus |
| |
| BlueGene bglblock boot problem diagnosis |
| - Log on to the Service Node (bglsn, ubglsn) |
| - Execute /admin/bglscripts/fatalras |
| This will produce a list of failures including Rack and Midplane number |
| <date> R<rack> M<midplane> <failure details> |
| - Translate the Rack and Midplane to SLURM node id: smap -R r<rack><midplane> |
| - Drain only the bad SLURM node, return others to service using scontrol |
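The translation from fatalras output to smap commands in the steps above can be sketched as a small filter. This assumes the rack and midplane appear as separate whitespace-delimited tokens of the form R<digits> and M<digits>, as the line layout above suggests; the exact fatalras output format is not confirmed here, and the function name fatalras_to_smap is hypothetical.

```shell
#!/bin/sh
# Read fatalras-style lines on stdin and print one smap command per distinct
# rack/midplane pair, in order of first appearance.
# Assumption: rack and midplane are separate tokens such as "R00" and "M1".
fatalras_to_smap() {
    awk '
    {
        rack = ""; mid = ""
        for (i = 1; i <= NF; i++) {
            if (rack == "" && $i ~ /^R[0-9]+$/)
                rack = substr($i, 2)      # drop the leading "R"
            else if (mid == "" && $i ~ /^M[0-9]+$/)
                mid = substr($i, 2)       # drop the leading "M"
        }
        # Skip lines without both tokens and pairs already reported
        if (rack != "" && mid != "" && !seen[rack, mid]++)
            print "smap -R r" rack mid
    }'
}

# Usage on the Service Node:
# /admin/bglscripts/fatalras | fatalras_to_smap
```

The filter only prints the commands; running them and draining the resulting node with scontrol remains a manual step.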
| |
| Configuration file update procedures: |
| - cd /usr/bgl/dist/slurm (on bgli) |
| - co -l <filename> |
| - vi <filename> |
| - ci -u <filename> |
| - make install |
| - then run "dist_local slurm" on SN and FENs to update /etc/slurm |
| |
| Some RPM commands: |
| - rpm --query --all | grep slurm |
| - rpm --erase package_name |
| - rpm --install --ignoresize file_name |
| For main SLURM plugin installation on BGL service node: |
| - rpm --install --force --nodeps --ignoresize slurm-#.rpm |
| |
| |
| To clear a wedged job: |
| /bgl/startMMCSconsole |
| > delete bgljob #### |
| > free RMP### |
| |
| Starting and stopping daemons on Linux: |
| /etc/init.d/slurm stop |
| /etc/init.d/slurm start |
| |
| Patches: |
| - cd to the top level src directory |
| - Run the patch command with epilog_complete.patch as stdin: |
| patch -p[path_level_to_filter] [--dry-run] < epilog_complete.patch |
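One way to combine the optional --dry-run with the real run is to apply the patch only when the dry run succeeds. The wrapper below is a sketch under that assumption; the function name apply_patch_checked is hypothetical, and --dry-run requires GNU patch.

```shell
#!/bin/sh
# Apply a patch only if a dry run succeeds first.
# Usage: apply_patch_checked <strip-level> <patch-file>
# Run it from the top level src directory.
apply_patch_checked() {
    level=$1
    patchfile=$2
    if patch -p"$level" --dry-run < "$patchfile" >/dev/null; then
        patch -p"$level" < "$patchfile"
    else
        echo "dry run failed; nothing was changed: $patchfile" >&2
        return 1
    fi
}

# Example:
# apply_patch_checked 1 epilog_complete.patch
```

The strip level depends on how the patch was generated, so it stays an explicit argument rather than a default.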
| |
| CVS and gnats: |
| Include "gnats:<id>" (e.g. "(gnats:123)") as part of the cvs commit message to |
| automatically record that update in gnats database. NOTE: Does |
| not change gnats bug state, but records source files associated |
| with the bug. |
| |
| For memory leaks (for AIX use zerofault, zf; for Linux use valgrind): |
| valgrind --tool=memcheck --leak-check=yes --num-callers=6 --leak-resolution=med ./slurmctld |
| |
| Remember to test on ia64, i386, BGL, and AIX. |