Tuesday, January 24, 2017

Spectrum LSF 10.1 Installation and Job Submission


Load Sharing Facility (LSF) is a workload management platform and job scheduler for distributed HPC environments. It can run batch jobs on networked Unix and Windows systems across many different architectures. LSF is based on the Utopia research project at the University of Toronto.

IBM Platform Computing has been renamed IBM Spectrum Computing to complement IBM’s Spectrum Storage family of software-defined offerings. Accordingly, the IBM Platform LSF product is now IBM Spectrum LSF.
IBM Spectrum LSF 10.1 is available as the following offering packages: 
1) IBM Spectrum LSF Community Edition 10.1, 
2) IBM Spectrum LSF Suite for Workgroups 10.1, and 
3) IBM Spectrum LSF Suite for HPC 10.1.

LSF provides a resource management framework that takes your job requirements, finds the best resources to run the job, and monitors its progress. Jobs always run according to host load and site policies.


LSF daemons and processes

Several LSF processes run on each host in the cluster. Which daemons run, and how many, depends on the host's role in the cluster: master host or compute host.


Daemon     Role
mbatchd    Job requests and dispatch
mbschd     Job scheduling
sbatchd    Job execution
res        Job execution
lim        Host information
pim        Job process information
elim       Dynamic load indexes
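
As a rough illustration of how the daemon set differs by host role, the sketch below prints the daemons one would expect to see on each kind of host. The role labels "master" and "compute" are names chosen for this example, not LSF keywords. elim is left out on the assumption that it only runs where a site has configured external load indexes.

```shell
#!/bin/sh
# Print the LSF daemons expected on a host, one per line, by role.
# "master" and "compute" are labels chosen for this sketch, not LSF keywords.
expected_daemons() {
    case "$1" in
        # The master host runs the batch master and scheduler as well.
        master)  printf '%s\n' lim pim res sbatchd mbatchd mbschd ;;
        # Compute hosts only need the load, process, and execution daemons.
        compute) printf '%s\n' lim pim res sbatchd ;;
        *)       echo "unknown role: $1" >&2; return 1 ;;
    esac
}
```

For example, expected_daemons master lists six daemons, which matches the process list shown later in this post for a single-host cluster.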


Installation Steps :

Step 1 : Create installation directory: 
[root@localhost LSF_installation_files]# pwd
/root/LSF_installation_files
[root@localhost LSF_installation_files]#

Step 2 : Untar the package 
----------------
[root@localhost LSF_installation_files]# ls
lsfce10.1-x86_64  lsfce10.1-x86_64.tar.gz
[root@localhost LSF_installation_files]#


[root@localhost LSF_installation_files]# gunzip -c lsfce10.1-x86_64.tar.gz | tar xvf -
lsfce10.1-x86_64/
lsfce10.1-x86_64/pmpi/
lsfce10.1-x86_64/pmpi/platform_mpi-09.01.02.00u.x64.bin
lsfce10.1-x86_64/lsf/
lsfce10.1-x86_64/lsf/lsf10.1_lsfinstall_linux_x86_64.tar.Z
lsfce10.1-x86_64/lsf/lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z
lsfce10.1-x86_64/pac/
lsfce10.1-x86_64/pac/pac10.1_basic_linux-x64.tar.Z
[root@localhost LSF_installation_files]#

Next, change to /root/LSF_installation_files/lsfce10.1-x86_64/lsf/ and extract the lsfinstall archive:
[root@localhost lsf]# ls
lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z  lsf10.1_lsfinstall_linux_x86_64.tar.Z
[root@localhost lsf]# zcat lsf10.1_lsfinstall_linux_x86_64.tar.Z | tar xvf -

----------------------

[root@localhost lsf10.1_lsfinstall]# pwd
/root/LSF_installation_files/lsfce10.1-x86_64/lsf/lsf10.1_lsfinstall
[root@localhost lsf10.1_lsfinstall]# ls
conf_tmpl  install.config  lap         lsf_unix_install.pdf  patchlib   README      rpm      slave.config
hostsetup  instlib         lsfinstall  patchinstall          pversions  rhostsetup  scripts
[root@localhost lsf10.1_lsfinstall]#
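
The two extraction commands above (gunzip -c for the .tar.gz, zcat for the .tar.Z) can be folded into one helper. This is a minimal sketch, assuming a gzip that also reads compress (.Z) files, which is the usual case on Linux. The function name extract_archive is made up for this example.

```shell
#!/bin/sh
# Unpack a .tar.gz or .tar.Z archive into a target directory.
# gzip -dc reads both gzip and compress (.Z) data, so one code path
# covers the outer lsfce package and the nested *.tar.Z files.
extract_archive() {
    archive=$1
    dest=${2:-.}
    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # Decompress to stdout and let tar unpack from stdin into $dest.
    gzip -dc "$archive" | tar -xf - -C "$dest"
}
```

With it, the steps in this post would read extract_archive lsfce10.1-x86_64.tar.gz followed by extract_archive lsfce10.1-x86_64/lsf/lsf10.1_lsfinstall_linux_x86_64.tar.Z lsfce10.1-x86_64/lsf.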

---------------------------
=========================
2. Use lsfinstall
========================
The installation program for IBM Spectrum LSF Version 10.1 is lsfinstall.
Use the lsfinstall script to install a new LSF Version 10.1 cluster.

------------------------
2.1 Steps
------------------------
1. Edit lsf10.1_lsfinstall/install.config to specify the options
   for your cluster. Uncomment the options you want and replace the
   example values with your own settings.
2. Run lsf10.1_lsfinstall/lsfinstall -f install.config
3. Read the following files generated by lsfinstall:
   o  lsf10.1_lsfinstall/lsf_getting_started.html to find out how
      to set up your LSF hosts, start LSF, and test your new LSF cluster
   o  lsf10.1_lsfinstall/lsf_quick_admin.html to learn more about
      your new LSF cluster

--------------------------------------------------------------------------
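
The install.config used for this run is not shown in the post. Judging from the installer output further down (LSF TOP directory /lsf_home, administrator lsfadmin, cluster CI_cluster1, master host localhost), the uncommented options were probably close to the following. Treat the values as a reconstruction, not the author's actual file. LSF_TARDIR is only a guess based on the "Searching LSF 10.1 distribution tar files" line, so it is left commented out.

```shell
# install.config (excerpt), reconstructed from the lsfinstall output in this post
LSF_TOP="/lsf_home"               # top-level installation directory
LSF_ADMINS="lsfadmin"             # first name listed is the primary administrator
LSF_CLUSTER_NAME="CI_cluster1"    # cluster name reported by the installer
LSF_MASTER_LIST="localhost"       # master candidate host(s)
# LSF_TARDIR="/lsf_home/lsf_distrib"   # assumption: where the distribution tar files live
```

The file uses plain NAME="value" assignments, one option per line, so a typo in a parameter name is silently ignored rather than reported.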
Start install script :


[root@localhost lsf10.1_lsfinstall]# ./lsfinstall -f install.config


Logging installation sequence in /root/LSF_installation_files/lsfce10.1-x86_64/lsf/lsf10.1_lsfinstall/Install.log
International License Agreement for Non-Warranted Programs

Part 1 - General Terms

BY DOWNLOADING, INSTALLING, COPYING, ACCESSING, CLICKING ON
AN "ACCEPT" BUTTON, OR OTHERWISE USING THE PROGRAM,
LICENSEE AGREES TO THE TERMS OF THIS AGREEMENT. IF YOU ARE
ACCEPTING THESE TERMS ON BEHALF OF LICENSEE, YOU REPRESENT
AND WARRANT THAT YOU HAVE FULL AUTHORITY TO BIND LICENSEE
TO THESE TERMS. IF YOU DO NOT AGREE TO THESE TERMS,

* DO NOT DOWNLOAD, INSTALL, COPY, ACCESS, CLICK ON AN
"ACCEPT" BUTTON, OR USE THE PROGRAM; AND

* PROMPTLY RETURN THE UNUSED MEDIA AND DOCUMENTATION TO THE

Press Enter to continue viewing the license agreement, or
enter "1" to accept the agreement, "2" to decline it, "3"
to print it, "4" to read non-IBM terms, or "99" to go back
to the previous screen.
1
LSF pre-installation check ...

Checking the LSF TOP directory /lsf_home ...
... Done checking the LSF TOP directory /lsf_home ...
You are installing IBM Spectrum LSF - 10.1 Community Edition.

Checking LSF Administrators ...
   LSF administrator(s):       "lsfadmin"
   Primary LSF administrator:  "lsfadmin"
Checking the configuration template  ...
    Done checking configuration template ...
    Done checking ENABLE_STREAM ...

[Mon Jan  2 09:50:49 EST 2017:lsfprechk:WARN_2007]
    Hosts defined in LSF_MASTER_LIST must be LSF server hosts. The
    following hosts will be added to server hosts automatically: localhost.

Checking the patch history directory  ...
... Done checking the patch history directory /lsf_home/patch ...

Checking the patch backup directory ...
... Done checking the patch backup directory /lsf_home/patch/backup ...


Searching LSF 10.1 distribution tar files in /lsf_home/lsf_distrib Please wait ...

  1) linux2.6-glibc2.3-x86_64

Press 1 or Enter to install this host type: 1

You have chosen the following tar file(s):
    lsf10.1_linux2.6-glibc2.3-x86_64

Checking selected tar file(s) ...
... Done checking selected tar file(s).


Pre-installation check report saved as text file:
/root/LSF_installation_files/lsfce10.1-x86_64/lsf/lsf10.1_lsfinstall/prechk.rpt.

... Done LSF pre-installation check.

Installing LSF binary files " lsf10.1_linux2.6-glibc2.3-x86_64"...
Creating /lsf_home/10.1 ...

Copying lsfinstall files to /lsf_home/10.1/install
Creating /lsf_home/10.1/install ...
Creating /lsf_home/10.1/install/scripts ...
Creating /lsf_home/10.1/install/instlib ...
Creating /lsf_home/10.1/install/patchlib ...
Creating /lsf_home/10.1/install/lap ...
Creating /lsf_home/10.1/install/conf_tmpl ...
... Done copying lsfinstall files to /lsf_home/10.1/install

Installing linux2.6-glibc2.3-x86_64 ...

Please wait, extracting lsf10.1_linux2.6-glibc2.3-x86_64 may take up to a few minutes ...

... Adding package information to patch history.
... Done adding package information to patch history.
... Done extracting /lsf_home/lsf_distrib/lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z.

Creating links to LSF commands ...
... Done creating links to LSF commands ...

Modifying owner, access mode, setuid flag of LSF binary files ...
... Done modifying owner, access mode, setuid flag of LSF binary files ...

Creating the script file lsf_daemons ...
... Done creating the script file lsf_daemons ...

... linux2.6-glibc2.3-x86_64 installed successfully under /lsf_home/10.1.

... Done installing LSF binary files "linux2.6-glibc2.3-x86_64".

Creating LSF configuration directories and files ...
Creating /lsf_home/work ...
Creating /lsf_home/log ...
Creating /lsf_home/conf ...
Creating /lsf_home/conf/lsbatch ...
... Done creating LSF configuration directories and files ...

Creating a new cluster "CI_cluster1" ...
Adding entry for cluster CI_cluster1 to /lsf_home/conf/lsf.shared.
Installing lsbatch directories and configurations ...
Creating /lsf_home/conf/lsbatch/CI_cluster1 ...
Creating /lsf_home/conf/lsbatch/CI_cluster1/configdir ...
Creating /lsf_home/work/CI_cluster1 ...
Creating /lsf_home/work/CI_cluster1/logdir ...
Creating /lsf_home/work/CI_cluster1/live_confdir ...
Creating /lsf_home/work/CI_cluster1/lsf_indir ...
Creating /lsf_home/work/CI_cluster1/lsf_cmddir ...
Enabling schmod_mc in /lsf_home/conf/lsbatch/CI_cluster1/configdir/lsb.modules ...
schmod_mc has been enabled...

Adding server hosts ...

Host(s) "localhost" has (have) been added to the cluster "CI_cluster1".

Adding LSF_MASTER_LIST in lsf.conf file...

... LSF configuration is done.
... Creating EGO configuration directories and files ...
Creating /lsf_home/conf/ego ...
Creating /lsf_home/conf/ego/CI_cluster1 ...
Creating /lsf_home/conf/ego/CI_cluster1/kernel ...
Creating /lsf_home/work/CI_cluster1/ego ...
... Done creating EGO configuration directories and files.
Configuring EGO components...
... EGO configuration is done.
... Creating resource connector configuration directories and files ...
Creating /lsf_home/conf/resource_connector ...
Creating /lsf_home/conf/resource_connector/ego ...
Creating /lsf_home/conf/resource_connector/openstack ...
... Done creating resource connector configuration directories and files.
... Finished resource connector configuration.

... LSF inventory tag file is installed.
... LSF entitlement file is installed.
Creating lsf_getting_started.html ...
... Done creating lsf_getting_started.html

Creating lsf_quick_admin.html ...
... Done creating lsf_quick_admin.html


lsfinstall is done.

To complete your LSF installation and get your
cluster "CI_cluster1" up and running, follow the steps in
"/root/LSF_installation_files/lsfce10.1-x86_64/lsf/lsf10.1_lsfinstall/lsf_getting_started.html".

After setting up your LSF server hosts and verifying
your cluster "CI_cluster1" is running correctly,
see "/lsf_home/10.1/lsf_quick_admin.html"
to learn more about your new LSF cluster.

After installation, remember to bring your cluster up to date
by applying the latest updates and bug fixes.

[root@localhost lsf10.1_lsfinstall]#

---------------------------------------------------------------------

LSF is installed under the top-level directory set in install.config (here /lsf_home):
[root@localhost lsf10.1_lsfinstall]# cd /lsf_home/
[root@localhost lsf_home]# pwd
/lsf_home
[root@localhost lsf_home]# ls
10.1  conf  log  lsf_distrib  LSF_redist.txt  patch  patch.conf  properties  work
-----------------------------------------------------------
[root@localhost lsf_home]# lsid 
Jan  2 10:00:26 2017 6263 3 9.1.2 _ls_initdebug: initenv_:fopen(/etc/lsf.conf) failed, No such file or directory.
ls_initdebug: Unable to open file lsf.conf
[root@localhost lsf_home]#
[root@localhost conf]# source profile.lsf
[root@localhost conf]# lsid
lsid: ls_getentitlementinfo() failed: LIM is down; try later
[root@localhost conf]#

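The first lsid call fails because the LSF environment is not loaded, so source profile.lsf first. The second call fails because LIM is not running yet, so the LSF daemons must be started next.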
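
The "fopen(/etc/lsf.conf) failed" message in the transcript above is what an unloaded environment looks like: without LSF_ENVDIR the commands fall back to /etc. A small guard can make scripts load the environment only when needed. This is a sketch under assumptions: the function names are invented for the example, and /lsf_home/conf is simply the conf directory from this post's install.

```shell
#!/bin/sh
# Return 0 when the LSF environment looks loaded: LSF_ENVDIR is set and
# points at a directory that contains lsf.conf.
lsf_env_ready() {
    [ -n "${LSF_ENVDIR:-}" ] && [ -f "$LSF_ENVDIR/lsf.conf" ]
}

# Source profile.lsf from a conf directory only when it is needed.
# The /lsf_home/conf default matches this post's install; adjust for yours.
load_lsf_env() {
    conf_dir=${1:-/lsf_home/conf}
    lsf_env_ready && return 0
    [ -f "$conf_dir/profile.lsf" ] || { echo "no profile.lsf in $conf_dir" >&2; return 1; }
    . "$conf_dir/profile.lsf"
    # Confirm that sourcing actually produced a usable environment.
    lsf_env_ready
}
```

Calling load_lsf_env at the top of a job script avoids the first lsid error. It does not help with "LIM is down", which needs the daemons to be started.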

Start Daemons:

--------------------------------------------------------------------------
[root@localhost conf]# lsfstartup
Starting up all LIMs ...
Do you really want to start up LIM on all hosts ? [y/n]y
Start up LIM on <localhost.pok.stglabs.ibm.com> ...... done

Waiting for Master LIM to start up ...  Master LIM is ok
Starting up all RESes ...
Do you really want to start up RES on all hosts ? [y/n]y
Start up RES on <localhost.pok.stglabs.ibm.com> ...... done

Starting all slave daemons on LSBATCH hosts ...
Do you really want to start up slave batch daemon on all hosts ? [y/n] y
Start up slave batch daemon on <localhost.pok.stglabs.ibm.com> ...... done

Done starting up LSF daemons on the local LSF cluster ...
-------------------------------------------------------------------------------
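
LIM can take a little while to answer after lsfstartup returns, so an lsid issued immediately may still report "LIM is down; try later". One way to handle that in a script is a generic retry loop, sketched here. The helper name retry_until_ok and its argument order are made up for this example.

```shell
#!/bin/sh
# Run a command until it succeeds, up to a maximum number of attempts.
# usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Success on any attempt ends the loop immediately.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        n=$((n + 1))
        # Sleep only when another attempt is still to come.
        [ "$n" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}
```

After starting the daemons, retry_until_ok 12 5 lsid would poll for up to about a minute before giving up.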

[root@localhost conf]# ps -ef | grep lsf
root      6428     1  0 10:02 ?        00:00:00 /lsf_home/10.1/linux2.6-glibc2.3-x86_64/etc/lim
root      6431  6428  0 10:02 ?        00:00:00 /lsf_home/10.1/linux2.6-glibc2.3-x86_64/etc/pim
root      6494     1  0 10:03 ?        00:00:00 /lsf_home/10.1/linux2.6-glibc2.3-x86_64/etc/res
root      6560     1  0 10:03 ?        00:00:00 /lsf_home/10.1/linux2.6-glibc2.3-x86_64/etc/sbatchd
root      6564  6560  0 10:03 ?        00:00:00 /lsf_home/10.1/linux2.6-glibc2.3-x86_64/etc/mbatchd -d /lsf_home/conf
lsfadmin  6575  6564  0 10:03 ?        00:00:00 /lsf_home/10.1/linux2.6-glibc2.3-x86_64/etc/mbschd
root      6581 29404  0 10:03 pts/1    00:00:00 grep --color=auto lsf
[root@localhost conf]#  
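
Reading the ps listing by eye works for one host. For a scripted check, the same output can be compared against a list of daemon names. The sketch below assumes the ps -ef column layout shown above, where the command is the eighth field. The function name missing_daemons is hypothetical.

```shell
#!/bin/sh
# Read `ps -ef`-style lines on stdin and print each named daemon that does
# not appear as the basename of a command path. The exit status is 0 only
# when none of the named daemons are missing.
missing_daemons() {
    awk -v want="$*" '
        BEGIN { n = split(want, names, " ") }
        {
            # Field 8 of `ps -ef` is the command; keep only its basename.
            cmd = $8
            sub(/.*\//, "", cmd)
            seen[cmd] = 1
        }
        END {
            missing = 0
            for (i = 1; i <= n; i++)
                if (!(names[i] in seen)) { print names[i]; missing = 1 }
            exit missing
        }
    '
}
```

On the master host of this post, ps -ef | missing_daemons lim pim res sbatchd mbatchd mbschd should print nothing.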
--------------------------------------
With the daemons running, lsid displays the current LSF version number, the cluster name, and the master host name:

[root@localhost conf]# lsid
IBM Spectrum LSF Community Edition 10.1.0.0, Jun 15 2016
Copyright IBM Corp. 1992, 2016. All rights reserved.
My cluster name is CI_cluster1
My master name is localhost.pok.stglabs.ibm.com
[root@localhost conf]#
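
Scripts that need the cluster name or master host can pick them out of this output. The sketch below relies only on the two "My ... name is" lines shown above. The helper name lsid_field is invented for this example.

```shell
#!/bin/sh
# Extract the cluster name or master host from `lsid` output on stdin.
# usage: lsid | lsid_field cluster    (or: master)
lsid_field() {
    case "$1" in
        cluster) prefix='My cluster name is ' ;;
        master)  prefix='My master name is ' ;;
        *)       echo "usage: lsid_field cluster|master" >&2; return 2 ;;
    esac
    # Print the remainder of the first line that starts with the prefix;
    # exit non-zero when no such line exists (for example, LIM is down).
    awk -v p="$prefix" '
        index($0, p) == 1 { print substr($0, length(p) + 1); found = 1; exit }
        END { exit !found }
    '
}
```

For the cluster in this post, lsid | lsid_field cluster prints CI_cluster1.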

---------------------------------------
[root@localhost conf]# su - lsfadmin
Last login: Mon Jan  2 09:25:54 EST 2017 on pts/0
[lsfadmin@localhost ~]$ cd /lsf_home/conf
[lsfadmin@localhost conf]$ source profile.lsf

[lsfadmin@localhost conf]$ lsid
IBM Spectrum LSF Community Edition 10.1.0.0, Jun 15 2016
Copyright IBM Corp. 1992, 2016. All rights reserved.
My cluster name is CI_cluster1
My master name is localhost.pok.stglabs.ibm.com
[lsfadmin@localhost conf]$

-------------------------------------------
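LSF Job Submission :

Running bsub without arguments opens a bsub> prompt. Type the job command, then press Ctrl-D to submit it.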

[lsfadmin@localhost ~]$ bsub
bsub> sleep 100
bsub> Job <1> is submitted to default queue <normal>.
[lsfadmin@localhost ~]$ bjobs
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
1       lsfadmi RUN   normal     localhost.po localhost.po sleep 100  Jan  2 10:05
[lsfadmin@localhost ~]$
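
For scripted use, the job can be submitted in one line (bsub sleep 100) and the job ID and state read back from the command output. The helpers below are a sketch that parses the two output formats shown above. Their names are made up for this example.

```shell
#!/bin/sh
# Pull the numeric job ID out of bsub's confirmation line, e.g.
#   Job <1> is submitted to default queue <normal>.
job_id_from_bsub() {
    sed -n 's/.*Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Print the STAT column (RUN, PEND, DONE, EXIT, ...) of the first job row
# in default `bjobs` output read from stdin. Line 1 is the header.
job_stat_from_bjobs() {
    awk 'NR == 2 { print $3 }'
}
```

A typical use would be jobid=$(bsub -J demo -o demo.%J.out sleep 100 | job_id_from_bsub) followed by bjobs "$jobid" | job_stat_from_bjobs. The job name demo and the output file name are illustrative; %J expands to the job ID.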

--------------------------------------------------
