Sunday, August 22, 2021

Spectrum LSF 10.1 Installation and Applying Patch | FP | interim FIX on Linux Platform

IBM Spectrum LSF (LSF, originally Platform Load Sharing Facility) is a workload management platform and job scheduler for distributed high-performance computing (HPC). IBM acquired Platform Computing in January 2012, and the product is now called IBM® Spectrum LSF.

IBM® Spectrum LSF is a complete workload management solution for demanding HPC environments that takes your job requirements, finds the best resources to run the job, and monitors its progress. Jobs always run according to host load and site policies.

LSF cluster concepts

  • Cluster: a group of computers (hosts) running LSF that work together as a single unit, combining computing power, workload, and resources. A cluster provides a single-system image for a network of computing resources. Hosts can be grouped into a cluster in a number of ways: 1) all the hosts in a single administrative group, or 2) all the hosts on a subnetwork.
  • Job: a unit of work running in the LSF system; that is, a command or set of commands submitted to LSF for execution. LSF schedules, controls, and tracks the job according to configured policies.
  • Queue: a cluster-wide container for jobs. All jobs wait in queues until they are scheduled and dispatched to hosts.
  • Resources: the objects in your cluster that are available to run work.

Spectrum LSF 10.1 base installation and applying an FP/PTF/fix

Plan your installation and install a new production IBM Spectrum LSF cluster on UNIX or Linux hosts. The following diagram illustrates an example directory structure after the LSF installation is complete.


Plan your installation to determine the required parameters for the install.config file.

a) lsf10.1_lsfinstall.tar.Z

The standard installer package. Use this package in a heterogeneous cluster with a mix of systems other than x86-64. Requires approximately 1 GB free space.

b)  lsf10.1_lsfinstall_linux_x86_64.tar.Z 

      lsf10.1_lsfinstall_linux_ppc64le.tar.Z

Use the appropriate smaller installer package in a homogeneous x86-64 or ppc64le cluster.

------------------------

Get the LSF distribution packages for all host types you need and put them in the same directory as the extracted LSF installer script, that is, the LSF_TARDIR path set in Step 3.

For example:

For Linux with kernel 2.6 and glibc 2.3, the distribution package is lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z.

For Linux with kernel 3.x and glibc 2.17 on ppc64le, the distribution package is lsf10.1_lnx310-lib217-ppc64le.tar.Z.

------------------------

LSF uses entitlement files to determine which feature set is enabled or disabled based on the edition of the product. Copy the entitlement configuration file to the LSF_ENTITLEMENT_FILE path mentioned in Step 3.

The following LSF entitlement configuration files are available for each edition:

LSF Standard Edition  ===>  lsf_std_entitlement.dat

LSF Express Edition   ===>  lsf_exp_entitlement.dat

LSF Advanced Edition  ===>  lsf_adv_entitlement.dat

-------------------------

Step 1 : Get the LSF installer script package that you selected and extract it.

# zcat lsf10.1_lsfinstall_linux_x86_64.tar.Z | tar xvf -

Step 2 :  Change to the extracted directory:

 cd lsf10.1_lsfinstall

Step 3 : Configure install.config as per the plan

 cat install.config
  LSF_TOP="/nfs_shared_dir/LSF_HOME"
  LSF_ADMINS="lsfadmin"
  LSF_CLUSTER_NAME="x86-64_cluster2"
  LSF_MASTER_LIST="myhost1"
  LSF_TARDIR="/nfs_shared_dir/conf_lsf/lsf_distrib/"
  LSF_ENTITLEMENT_FILE="/nfs_shared_dir/conf_lsf/lsf_std_entitlement.dat"
  LSF_ADD_SERVERS="myhost1 myhost2 myhost3 myhost4 myhost5 myhost6 myhost7 myhost8"

  ENABLE_DYNAMIC_HOSTS="Y"
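Before running the installer, it can help to sanity-check the paths referenced in install.config. A minimal sketch, assuming the parameter names above; the `preflight` helper is illustrative, not part of LSF:

```shell
#!/bin/sh
# Pre-flight check for the paths referenced in install.config.
# preflight CONFIG_FILE -> non-zero exit if a referenced path is missing.
preflight() {
    conf="$1"
    # Pull the values out of install.config (stripping the surrounding quotes).
    tardir=$(sed -n 's/^LSF_TARDIR="\(.*\)"/\1/p' "$conf")
    entitle=$(sed -n 's/^LSF_ENTITLEMENT_FILE="\(.*\)"/\1/p' "$conf")
    [ -d "$tardir" ]  || { echo "missing LSF_TARDIR: $tardir"; return 1; }
    [ -f "$entitle" ] || { echo "missing entitlement file: $entitle"; return 1; }
    echo "install.config paths look OK"
}
# Example: preflight ./install.config
```

This catches a missing distribution directory or entitlement file before lsfinstall fails partway through.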

Step 4:  Start LSF 10.1 base installation 

          ./lsfinstall -f install.config

Logging installation sequence in /root/LSF_new/lsf10.1_lsfinstall/Install.log
International Program License Agreement
Part 1 - General TermsBY DOWNLOADING, INSTALLING, COPYING, ACCESSING, CLICKING 
 "ACCEPT" BUTTON, OR OTHERWISE USING THE PROGRAM,
LICENSEE AGREES TO THE TERMS OF THIS AGREEMENT. IF YOU ARE
ACCEPTING THESE TERMS ON BEHALF OF LICENSEE, YOU REPRESENT
AND WARRANT THAT YOU HAVE FULL AUTHORITY TO BIND LICENSEE
TO THESE TERMS. IF YOU DO NOT AGREE TO THESE TERMS
* DO NOT DOWNLOAD, INSTALL, COPY, ACCESS, CLICK ON AN
"ACCEPT" BUTTON, OR USE THE PROGRAM; AND
* PROMPTLY RETURN THE UNUSED MEDIA, DOCUMENTATION, AND
Press Enter to continue viewing the license agreement, or
enter "1" to accept the agreement, "2" to decline it, "3"
to print it, "4" to read non-IBM terms, or "99" to go back
to the previous screen.
1
Checking the LSF TOP directory /nfs_shared_dir/LSF_HOME ...
... Done checking the LSF TOP directory /nfs_shared_dir/LSF_HOME ...
You are installing IBM Spectrum LSF - 10.1 Standard Edition
Searching LSF 10.1 distribution tar files in /nfs_shared_dir/conf_lsf/lsf_distrib Please wait ...
  1) linux3.10-glibc2.17-x86_64
Press 1 or Enter to install this host type: 1
Installing linux3.10-glibc2.17-x86_64 ...
Please wait, extracting lsf10.1_lnx310-lib217-x86_64 may take up to a few minutes ...
lsfinstall is done.
After installation, remember to bring your cluster up to date by applying the latest updates and bug fixes.

NOTE: You can also install LSF as a non-root user. The procedure is similar, but there is one extra prompt asking whether this is a multi-node cluster (yes/no).

Step 5 :  This step is required only if the installation was done by root.

 chown -R lsfadmin:lsfadmin $LSF_TOP

Step 6 :  Check the binary files:

cd $LSF_TOP/10.1/linux3.10-glibc2.17-x86_64/bin

Step 7 : By default, only root can start the LSF daemons, and no other user can submit jobs to your cluster. To make the cluster available to other users, you must manually change the ownership of the lsadmin and badmin binary files to root and set the file permission mode to -rwsr-xr-x (4755) so that the setuid bit for the owner is set.

 chown root lsadmin
 chown root badmin
 chmod 4755 lsadmin
 chmod 4755 badmin
 ls -alsrt lsadmin
 ls -alsrt badmin

chown root  $LSF_SERVERDIR/eauth  

chmod u+s $LSF_SERVERDIR/eauth 

OR 

          ./hostsetup --top="$LSF_TOP" --setuid
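Either way, it is worth verifying the result afterward. A small sketch, assuming GNU coreutils `stat`; the `check_setuid` helper name is illustrative:

```shell
#!/bin/sh
# Verify that a binary is root-owned with the setuid bit set (mode 4755).
# Usage: check_setuid /path/to/lsadmin
check_setuid() {
    f="$1"
    mode=$(stat -c '%a' "$f")    # octal permission bits
    owner=$(stat -c '%U' "$f")   # owning user name
    [ "$mode" = "4755" ]  || { echo "$f: mode is $mode, expected 4755"; return 1; }
    [ "$owner" = "root" ] || { echo "$f: owner is $owner, expected root"; return 1; }
    echo "$f: OK"
}
# Example: for b in lsadmin badmin; do check_setuid "$b"; done
```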

Step 8 : Configure  /etc/lsf.sudoers 

[root@myhost1]# cat /etc/lsf.sudoers
LSF_STARTUP_USERS="lsfadmin"
LSF_STARTUP_PATH="/nfs_shared_dir/LSF_HOME/10.1/linux3.10-glibc2.17-ppc64le/etc"
LSF_EAUTH_KEY="testKey1"

NOTE: The lsf.sudoers file is not installed by default; it is located in /etc. It is used to set the parameter LSF_EAUTH_KEY, which configures a key for eauth to encrypt and decrypt user authentication data. All hosts in the cluster should have this file, and LSF_EAUTH_KEY must be configured in /etc/lsf.sudoers on each side of a multi-cluster setup.

Step 9 : Check $LSF_SERVERDIR/eauth and copy lsf.sudoers to all hosts in the cluster.

 ls $LSF_TOP/10.1/linux3.10-glibc2.17-x86_64/etc/


scp /etc/lsf.sudoers myhost02:/etc/lsf.sudoers
scp /etc/lsf.sudoers myhost03:/etc/lsf.sudoers
scp /etc/lsf.sudoers myhost04:/etc/lsf.sudoers
scp /etc/lsf.sudoers myhost05:/etc/lsf.sudoers
scp /etc/lsf.sudoers myhost06:/etc/lsf.sudoers
scp /etc/lsf.sudoers myhost07:/etc/lsf.sudoers
scp /etc/lsf.sudoers myhost08:/etc/lsf.sudoers
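The repeated scp commands above can be collapsed into a loop. A sketch, using the host names from this example:

```shell
#!/bin/sh
# Copy /etc/lsf.sudoers to every other host in the cluster.
# push_sudoers "host1 host2 ..."
push_sudoers() {
    for h in $1; do
        scp /etc/lsf.sudoers "$h:/etc/lsf.sudoers"
    done
}
# Example: push_sudoers "myhost02 myhost03 myhost04 myhost05 myhost06 myhost07 myhost08"
```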

Step 10 : Start LSF as lsfadmin and check the base installation using the lsid command.

Step 11 : Check binary type with  lsid -V

$ lsid -V
IBM Spectrum LSF 10.1.0.0 build 403338, May 27 2016
Copyright International Business Machines Corp. 1992, 2016.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

binary type: linux3.10-glibc2.17-x86_64

NOTE:  Download required FP and interim fixes from https://www.ibm.com/support/fixcentral/ 

Step 12 : Before applying PTF12 and the interim patches, bring down the LSF daemons. Use the following commands to shut down the original LSF daemons:

 badmin hshutdown all
 lsadmin resshutdown all
 lsadmin limshutdown all

Deactivate all queues to make sure that no new jobs can be dispatched during the upgrade:

badmin qinact all 
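The shutdown-and-deactivate sequence above can be wrapped in one helper so that it always runs in the same order. A sketch; the function name is illustrative:

```shell
#!/bin/sh
# Quiesce the cluster before patching: stop the daemons, then close the queues.
lsf_quiesce() {
    badmin hshutdown all     # stop sbatchd on every host
    lsadmin resshutdown all  # stop res on every host
    lsadmin limshutdown all  # stop lim on every host
    badmin qinact all        # no new dispatches during the upgrade
}
# Example: lsf_quiesce
```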

Step 13: Then become root to apply FP12 and the interim patches.

Set the LSF environment:   . $LSF_TOP/conf/profile.lsf

.   /nfs_shared_dir/LSF_HOME/conf/profile.lsf

Step 14: Apply FP12 on the LSF base installation. The patchinstall script is available in the $LSF_TOP/10.1/install directory.

         # cd $LSF_TOP/10.1/install

It is recommended to run a check on the patch before installing it:

$ ./patchinstall -c

 ./patchinstall /root/PTF12_x86_2versions/lsf10.1_lnx310-lib217-x86_64-600488.tar.Z

[root@myhost7 install]# ./patchinstall /root/PTF12_x86_2versions/lsf10.1_lnx310-lib217-x86_64-600488.tar.Z
Logging patch installation sequence in /nfs_shared_dir/LSF_HOME/10.1/install/patch.log
Checking the LSF installation directory /nfs_shared_dir/LSF_HOME ...
Done checking the LSF installation directory /nfs_shared_dir/LSF_HOME.
Checking the patch history directory ...
Done checking the patch history directory /nfs_shared_dir/LSF_HOME/patch.
Checking the backup directory ...
Done checking the backup directory /nfs_shared_dir/LSF_HOME/patch/backup.
Installing package "/root/PTF12_x86_2versions/lsf10.1_lnx310-lib217-x86_64-600488.tar.Z"...
Checking the package definition for /root/PTF12_x86_2versions/lsf10.1_lnx310-lib217-x86_64-600488.tar.Z ...
Done checking the package definition for /root/PTF12_x86_2versions/lsf10.1_lnx310-lib217-x86_64-600488.tar.Z.
.
.
Finished backing up files to "/nfs_shared_dir/LSF_HOME/patch/backup/LSF_linux3.10-glibc2.17-x86_64_600488".
Done installing /root/PTF12_x86_2versions/lsf10.1_lnx310-lib217-x86_64-600488.tar.Z.

Step 15: Apply  interim fix1

./patchinstall /root/LSF_patch1/lsf10.1_lnx310-lib217-x86_64-600505.tar.Z

Logging patch installation sequence in /nfs_shared_dir/LSF_HOME/10.1/install/patch.log 
Installing package "/root/LSF_patch1/lsf10.1_lnx310-lib217-x86_64-600505.tar.Z"...
Checking the package definition for /root/LSF_patch1/lsf10.1_lnx310-lib217-x86_64-600505.tar.Z ...
Are you sure you want to update your cluster with this patch? (y/n) [y] y
Backing up existing files ...
Finished backing up files to "/nfs_shared_dir/LSF_HOME/patch/backup/LSF_linux3.10-glibc2.17-x86_64_600505".
Done installing /root/LSF_patch1/lsf10.1_lnx310-lib217-x86_64-600505.tar.Z.
Exiting...
Step 16: Apply interim fix2

 ./patchinstall /root/LSF_patch2/lsf10.1_lnx310-lib217-x86_64-600625.tar.Z

[root@myhost7 install]# ./patchinstall /root/LSF_patch2/lsf10.1_lnx310-lib217-x86_64-600625.tar.Z
Installing package "/root/LSF_patch2/lsf10.1_lnx310-lib217-x86_64-600625.tar.Z"...
Checking the package definition for /root/LSF_patch2/lsf10.1_lnx310-lib217-x86_64-600625.tar.Z ...
Backing up existing files ...
Finished backing up files to "/nfs_shared_dir/LSF_HOME/patch/backup/LSF_linux3.10-glibc2.17-x86_64_600625".
Done installing /root/LSF_patch2/lsf10.1_lnx310-lib217-x86_64-600625.tar.Z.
Exiting...
 
Step 17: As the root user, set the setuid bit for the new bctrld command:

  cd $LSF_TOP/10.1/linux3.10-glibc2.17-x86_64/bin
  chown root bctrld
  chmod 4755 bctrld
 

Step 18 : Check the lsf.shared file for the multi-cluster setup.

Begin Cluster
ClusterName      Servers
CLUSTER1         (cloudhost)
CLUSTER2         (myhost1)
CLUSTER3         (remotehost2)
End Cluster

Step 19 : Switch back to the lsfadmin user. Use the following commands to start LSF with the new daemons:

  lsadmin limstartup all
  lsadmin resstartup all
  badmin hstartup all

Use the following command to reactivate all LSF queues after upgrading:

  badmin qact all

Step 20 : Modify the configuration files as required (add queues, clusters, and so on). Then run badmin reconfig or lsadmin reconfig as explained in the LSF configuration section below, and restart LSF as the lsfadmin user.

$ lsid
IBM Spectrum LSF Standard 10.1.0.12, Jun 10 2021
Copyright International Business Machines Corp. 1992, 2016.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
My cluster name is CLUSTER2
My master name is myhost1
$ lsclusters -w
CLUSTER_NAME        STATUS   MASTER_HOST             ADMIN    HOSTS  SERVERS
CLUSTER1            ok       cloudhost            lsfadmin        7        7
CLUSTER2            ok       myhost1              lsfadmin        8        8
CLUSTER3            ok       remotehost2          lsfadmin        8        8
$ bhosts
HOST_NAME        STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV
myhost1          ok              -     20      0      0      0      0      0
myhost2          ok              -     20      0      0      0      0      0
myhost3          ok              -     19      0      0      0      0      0
myhost4          ok              -     44      4      4      0      0      0
myhost5          ok              -     44      4      4      0      0      0
myhost6          ok              -     20      0      0      0      0      0
myhost7          ok              -     20      0      0      0      0      0
myhost8          ok              -     19      0      0      0      0      0
The Spectrum LSF cluster installation and FP12 upgrade completed successfully, as the output above shows.

You must run hostsetup as root with the --boot="y" option to modify the system scripts so that the LSF daemons start and stop automatically at system startup and shutdown. The default is --boot="n".

1. Log on to each LSF server host as root. Start with the LSF master host.

2. Run hostsetup on each LSF server host. For example:

# cd $LSF_TOP/10.1/install

# ./hostsetup --top="$LSF_TOP" --boot="y"

NOTE: For more details on hostsetup usage, enter hostsetup -h.
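Steps 1 and 2 can be scripted over ssh. A sketch, assuming passwordless root ssh to each host and a shared $LSF_TOP (both assumptions of this example):

```shell
#!/bin/sh
# Run hostsetup (with boot-time startup enabled) on each LSF server host.
# setup_hosts "host1 host2 ..." /path/to/LSF_TOP
setup_hosts() {
    hosts="$1"; top="$2"
    for h in $hosts; do
        ssh "root@$h" "cd $top/10.1/install && ./hostsetup --top=\"$top\" --boot=\"y\""
    done
}
# Example: setup_hosts "myhost1 myhost2" /nfs_shared_dir/LSF_HOME
```

Start with the LSF master host, as the documentation recommends, by listing it first.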

In a multi-cluster environment, reinstalling the master cluster can leave status=disc in the output of the bclusters command.


[smpici@c656f7n06 ~]$ bclusters
[Job Forwarding Information ]

LOCAL_QUEUE     JOB_FLOW   REMOTE CLUSTER    STATUS
Queue1          send       CLUSTER1          disc
Queue2          send       CLUSTER2          disc
Queue3          send       CLUSTER3          disc

where status=disc means communication between the two clusters is not established. The disc status might occur because no jobs are waiting to be dispatched, or because the remote master cannot be located.

A possible solution is to clean up all the LSF daemons on all clusters. Note: lsfshutdown leaves some daemons running on the master node, so you need to manually kill all remaining LSF daemons on all master nodes.
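A sketch of that manual cleanup, assuming passwordless root ssh to each master host; matching daemons with pkill -f on the standard LSF daemon names is an assumption of this sketch:

```shell
#!/bin/sh
# Kill any LSF daemons that lsfshutdown left behind on the master hosts.
# kill_lsf_daemons "master1 master2 ..."
kill_lsf_daemons() {
    for h in $1; do
        for d in lim res sbatchd mbatchd mbschd; do
            ssh "root@$h" "pkill -f $d" || true  # ignore 'no process found'
        done
    done
}
# Example: kill_lsf_daemons "myhost1 cloudhost remotehost2"
```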

Afterward, bclusters should show the status below:

[smpici@c656f7n06 ~]$ bclusters
[Job Forwarding Information ]

LOCAL_QUEUE     JOB_FLOW   REMOTE CLUSTER    STATUS
Queue1          send       CLUSTER1          ok
Queue2          send       CLUSTER2          ok
Queue3          send       CLUSTER3          ok

 

Singularity is a containerization (OS-level virtualization) solution designed for high-performance computing cluster environments. It allows a user on an HPC resource to run an application under a different operating system than the one provided by the cluster; for example, the application may require Ubuntu while the cluster OS is CentOS. If you use LSF_EAUTH_KEY in a container-based environment, you may hit an eauth setuid issue. The LSF client invokes "eauth -c" inside the container as the job owner; eauth is a setuid program, so it can normally read the lsf.sudoers file to get the key. If eauth loses the setuid permission, it cannot read lsf.sudoers, and it falls back to the default key to encrypt the user information. When the request reaches a server, the server calls "eauth -s", which runs as root on the host; it reads the configured key, tries to decrypt the user information with it, and fails. In other words, only the default key works in a Singularity environment.

This can be resolved by disabling LSF_AUTH_QUERY_COMMANDS in the configuration file, as shown below. Starting with Fix Pack 12, LSF introduced LSF_AUTH_QUERY_COMMANDS in lsf.conf; its default value is Y, which adds extra user authentication for batch queries. By default, each cluster has its own default eauth key. If a user runs "bhosts remote_cluster", the local key is used to encrypt the user data, but the client talks to the remote daemon directly; because the remote daemon decrypts with its own key, the query fails. Job operation information is exchanged through the mbatchd-to-mbatchd communication channel, which does not go through this kind of authentication. Logically, a user should only query the local mbatchd to see a job's status (the remote job status is sent back to the submission cluster).

Modified files:

1) Added LSF_AUTH_QUERY_COMMANDS=N to the lsf.conf file.
2) Removed LSF_EAUTH_KEY="testKey1" from /etc/lsf.sudoers.
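Those two changes can be applied idempotently with a small script. A sketch; the `fix_auth` helper and the exact sed/grep expressions are illustrative:

```shell
#!/bin/sh
# Disable batch-query authentication and drop the custom eauth key.
# fix_auth LSF_CONF_FILE LSF_SUDOERS_FILE
fix_auth() {
    conf="$1"; sudoers="$2"
    # Append LSF_AUTH_QUERY_COMMANDS=N only if the parameter is not already set.
    grep -q '^LSF_AUTH_QUERY_COMMANDS=' "$conf" \
        || echo 'LSF_AUTH_QUERY_COMMANDS=N' >> "$conf"
    # Remove the LSF_EAUTH_KEY line so the default key is used.
    sed -i '/^LSF_EAUTH_KEY=/d' "$sudoers"
}
# Example: fix_auth /nfs_shared_dir/LSF_HOME/conf/lsf.conf /etc/lsf.sudoers
```

Remember that /etc/lsf.sudoers is per-host, so the second change has to be applied on every host in the cluster.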

========================================================================

Step 1 : Check cluster names 

[sachinpb@cluster1_master sachinpb]$ lsclusters  -w
CLUSTER_NAME        STATUS   MASTER_HOST       ADMIN    HOSTS  SERVERS
cluster1   ok    cluster1_master              sachinpb        5        5
cluster2   ok    cluster2_master              sachinpb        8        8
[sachinpb@cluster1_master sachinpb]$ 

Step 2:  Submit job to remote cluster (Job forwarding queue=Forwarding_queue) 

[sachinpb@cluster1_master sachinpb]$ bsub -n 10 -R "span[ptile=2]" -q Forwarding_queue sleep 1000
Job <35298> is submitted to queue <Forwarding_queue>.
[sachinpb@cluster1_master sachinpb]$

Step 3 : Check status of job from submission cluster:

[sachinpb@cluster1_master sachinpb]$ bjobs
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
35298   sachinpb  RUN   x86_c656f8 cluster1_master   cluster2_master sleep 1000 Oct 11 03:30
                                             cluster2_master@cluster2
                                             cluster2_master@cluster2
                                             cluster2_master@cluster2
                                             cluster2_master@cluster2
                                             cluster2_master@cluster2
                                             cluster2_master@cluster2
                                             cluster2_master@cluster2
                                             cluster2_master@cluster2
                                             cluster2_master@cluster2
[sachinpb@cluster1_master sachinpb]$ bjobs -o 'forward_cluster' 35298
FORWARD_CLUSTER
cluster2
[sachinpb@cluster1_master sachinpb]$ bjobs -o 'dstjobid' 35298
DSTJOBID
8589
[sachinpb@cluster1_master sachinpb]$

Step 4 : Get the list of compute nodes on the remote cluster by issuing the bjobs -m command from the submission cluster, as shown below:
[sachinpb@cluster1_master sachinpb]$ bjobs -m cluster2 -o 'EXEC_HOST' 8589
EXEC_HOST
computeNode04:computeNode04:computeNode06:computeNode06:computeNode05:computeNode05:computeNode03:computeNode03:computeNode07:computeNode07
[sachinpb@cluster1_master sachinpb]$

======================= LSF configuration section ===========================

After you change any configuration file, use the lsadmin reconfig and badmin reconfig commands to reconfigure your cluster. Log on to the host as root or as the LSF administrator (in our case, "lsfadmin").

Run lsadmin reconfig to restart LIM and check for configuration errors. If no errors are found, you are prompted either to restart the lim daemon on management host candidates only, or to confirm that you want to restart the lim daemon on all hosts. If unrecoverable errors are found, reconfiguration is canceled. Run the badmin reconfig command to reconfigure the mbatchd daemon and check for configuration errors.

  • lsadmin reconfig to reconfigure the lim daemon
  • badmin reconfig to reconfigure the mbatchd daemon without restarting
  • badmin mbdrestart to restart the mbatchd daemon
  • bctrld restart sbd to restart the sbatchd daemon

More details about the cluster reconfiguration commands are available in the LSF documentation:

https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=cluster-commands-reconfigure-your

How to resolve some known eauth-related issues, where commands such as bhosts and bjobs fail with the error "User permission denied".

Problem/solution 1:

Example 1:

[smpici@host1 ~]$ bhosts
User permission denied

Example 2:

mpirun --timeout 30 hello_world
Jan 11 02:42:52 2022 1221079 3 10.1 lsb_pjob_send_requests: lsb_pjob_getAckReturn failed on host <host1>, lsberrno <0>
[host1:1221079] [[64821,0],0] ORTE_ERROR_LOG: The specified application failed to start in file ../../../../../../opensrc/ompi/orte/mca/plm/lsf/plm_lsf_module.c at line 347
--------------------------------------------------------------------------
The LSF process starter (lsb_launch) failed to start the daemons on
the nodes in the allocation.
Returned : -1
lsberrno : (282) Failed while executing tasks
This may mean that one or more of the nodes in the LSF allocation is
not setup properly.

In this case, check the clocks on the nodes. If the clocks differ, configure chrony on all nodes as shown below:

systemctl enable chronyd.service
systemctl stop chronyd.service
systemctl start chronyd.service
systemctl status chronyd.service

The root cause of the problem was that the system clocks of the compute nodes and the launch nodes were out of sync. After the clocks were synchronized, the LSF commands worked. In the worst case, if the hosts cannot be time-synchronized, configure LSF_EAUTH_TIMEOUT=0 in lsf.conf on each side of the multi-cluster.
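Clock skew across hosts can be spotted quickly by comparing each host's epoch time with the local one. A sketch, assuming passwordless ssh to each host; the 5-second threshold in the example is an arbitrary choice:

```shell
#!/bin/sh
# Report hosts whose clock differs from this host by more than N seconds.
# check_clocks "host1 host2 ..." MAX_SKEW_SECONDS
check_clocks() {
    for h in $1; do
        remote=$(ssh "$h" 'date +%s')
        local_now=$(date +%s)
        skew=$((remote - local_now))
        # ${skew#-} strips a leading minus sign, giving the absolute value.
        [ "${skew#-}" -gt "$2" ] && echo "$h: clock off by ${skew}s"
    done
    return 0
}
# Example: check_clocks "myhost2 myhost3" 5
```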
Problem/solution 2:

When the multi-cluster job forwarding setup shows a cluster in the unavail state, check that the PORT numbers defined in lsf.conf are the same on both clusters; otherwise, the cluster shows the unavail state.

[lsfadmin@ conf]$ lsclusters -w
CLUSTER_NAME          STATUS   MASTER_HOST               ADMIN       HOSTS  SERVERS
cluster1               ok       c1_master              lsfadmin       40       40
cluster2              unavail   unknown                 unknown        -  
Default port numbers :
[root@c1_master ~]# cat $LSF_HOME/conf/lsf.conf | grep PORT
LSF_LIM_PORT=7869
LSF_RES_PORT=6878
LSB_MBD_PORT=6881
LSB_SBD_PORT=6882
LSB_QUERY_PORT=6891

Useful tip: how long jobs remain available after they finish (DONE or EXIT)
CLEAN_PERIOD_DONE=seconds
Controls the amount of time during which successfully finished jobs are kept in mbatchd core memory. 
This applies to DONE and PDONE (post job execution processing) jobs.

If CLEAN_PERIOD_DONE is not defined, the clean period for DONE jobs is defined by CLEAN_PERIOD in lsb.params. 
If CLEAN_PERIOD_DONE is defined, its value must be less than CLEAN_PERIOD; otherwise it is ignored and a warning message appears.
 
# bparams -a | grep CLEAN_PERIOD
        CLEAN_PERIOD = 3600
        CLEAN_PERIOD_DONE = not configured
Problem/solution 3:

Regarding a job forwarding setup between clusters over public/private IP addresses: if you cannot ssh from the cluster1 master (f2n01-10.x.x.x) to the cluster2 master with a public IP (9.x.x.x), you may need to open the relevant ports through firewall rules. From the LSF point of view, the ports for lim and mbd need to be opened. Then issue the lsclusters and bclusters commands; the status reported should be ok in both. Try this from both clusters.
Problem/solution 4:

Test that blaunch works with LSF job submission. The blaunch command works only under LSF: it can be used only to launch tasks on remote hosts that are part of a job allocation, and it cannot be used as a stand-alone command. The call to blaunch is made under the bsub environment; you cannot run blaunch directly from the command line, and it must not be used outside the job execution environment provided by bsub. Most MPI implementations and many distributed applications use the rsh and ssh commands as their task-launching mechanism. The blaunch command provides a drop-in replacement for rsh and ssh as a transparent method for launching parallel applications within LSF. Some examples of blaunch usage:

Submit a parallel job:

bsub -n 4 blaunch myjob

Submit a job to an application profile:

bsub -n 4 -app pjob blaunch myjob
-----------------Example -----------------------
[sachinpb@xyz]$ cat show_ulimits.c
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
int main()
{
  struct rlimit old_lim;
  if(getrlimit(RLIMIT_CORE, &old_lim) == 0)
      printf("current limits -> soft limit= %ld \t"
           " hard limit= %ld \n", old_lim.rlim_cur, old_lim.rlim_max);
  else
      fprintf(stderr, "%s\n", strerror(errno));
//  abort();
  return 0;
}
[sachinpb@xyz]$
-------------------------------------------------

[submission_node]$ gcc -o show_ulimits show_ulimits.c
[submission_node]$ ls
generate_core.c  show_ulimits  show_ulimits.c  test_limit
[submission_node]$ ./show_ulimits
current limits -> soft limit= 0          hard limit= 0
[submission_node]$

--------------------------------------------------------


[submission_node]$ bsub -o /shared-dir/sachin_test_lsf_out_%J -n 4 -R "span[ptile=2]" -q x86_test_q -m "HOSTA HOSTB" blaunch /shared_dir/core_file_test/show_limits
Job <2511> is submitted to queue <x86_test_q>.
[submission_node]$ bjobs 2511
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
2511    sachinpb  DONE  x86_test_q bsub_host HOSTA     show_limits Oct 29 02:17
                                             HOSTA
                                             HOSTB
                                             HOSTB
[submission_node]$


[submission_node]$ cat /nfs_smpi_ci/sachin_test_lsf_out_2511
Sender: LSF System <sachinpb@HOSTA>
Subject: Job 2511: <blaunch /nfs_smpi_ci/core_file_test/test_limit> in cluster <x86-64_pok-cluster2> Done

Job <blaunch /nfs_smpi_ci/core_file_test/test_limit> was submitted from host <bsub_host> by user <sachinpb> in cluster <x86-64_pok-cluster2> at Thu Oct 29 02:17:45 2020
Job was executed on host(s) <2*HOSTA>, in queue <x86_test_q>, as user <sachinpb> in cluster <x86-64_pok-cluster2> at Thu Oct 29 02:23:52 2020
                            <2*HOSTB>

Your job looked like:
------------------------------------------------------------
# LSBATCH: User input
blaunch /nfs_smpi_ci/core_file_test/test_limit
------------------------------------------------------------

Successfully completed.

The output (if any) follows:

current limits -> soft limit= -1         hard limit= -1
current limits -> soft limit= -1         hard limit= -1
current limits -> soft limit= -1         hard limit= -1
current limits -> soft limit= -1         hard limit= -1
[submission_node]$

-----------------------------------------------------------------------------------
Problem/solution 5:

When job submission fails with "User permission denied", check the setuid bits as detailed below.

[smpici@c690f2n01 big-mpi]$ bsub
bsub> sleep 100
bsub> User permission denied. Job not submitted.

cd $LSF_HOME/10.1/linux3.10-glibc2.17-ppc64le/bin
chown root lsadmin
chown root badmin
chmod 4755 lsadmin
chmod 4755 badmin

chown root bctrld
chmod 4755 bctrld

cd $LSF_HOME/10.1/linux3.10-glibc2.17-ppc64le/etc
chown root eauth
chmod u+s eauth

Alternatively, run this script to avoid the manual configuration:

./hostsetup --top="$LSF_TOP" --setuid
--------------------------------------------------
References:
https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=migrate-install-unix-linux
https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=iul-if-you-install-lsf-as-non-root-user

Friday, July 23, 2021

Spectrum scale :High-performance storage GPFS cluster Installation and setup

IBM Spectrum Scale (formerly GPFS) is a scale-out, high-performance, global parallel file system (cluster file system) that provides concurrent access to a single file system or set of file systems from multiple nodes. Enterprises and organizations are creating, analyzing, and keeping more data than ever before. Islands of data are being created all over the organization and in the cloud, creating complexity, difficult-to-manage systems, and increasing costs. Those who can deliver insights faster while managing rapid infrastructure growth are the leaders in their industry. In delivering those insights, an organization's underlying information architecture must support hybrid cloud, big data, and artificial intelligence (AI) workloads along with traditional applications, while ensuring security, reliability, data efficiency, and high performance. IBM Spectrum Scale™ meets these challenges as a parallel high-performance solution with global file and object data access for managing data at scale, with the distinctive ability to perform archive and analytics in place.

Manually installing the IBM Spectrum Scale software packages on POWER nodes myhost1, myhost2 and myhost3

The following packages are required for IBM Spectrum Scale Standard Edition on Red Hat Enterprise Linux:

  1. gpfs.base*.rpm
  2. gpfs.gpl*.noarch.rpm
  3. gpfs.msg.en_US*.noarch.rpm
  4. gpfs.gskit*.rpm
  5. gpfs.license*.rpm

Step 1: Download the Spectrum Scale 5.1.1.1 Standard Edition package from Fix Central and install the RPM packages on all nodes:

 rpm -ivh gpfs.base*.rpm gpfs.gpl*rpm gpfs.license.std*.rpm gpfs.gskit*rpm gpfs.msg*rpm gpfs.docs*rpm
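Installing the same set of RPMs on every node can be scripted, assuming the packages sit in a directory visible to all nodes and passwordless root ssh is available (both assumptions of this sketch; the directory path is illustrative):

```shell
#!/bin/sh
# Install the Spectrum Scale RPMs on each node from a shared directory.
# install_gpfs_rpms "node1 node2 ..." /shared/path/to/rpms
install_gpfs_rpms() {
    nodes="$1"; dir="$2"
    pkgs="gpfs.base*.rpm gpfs.gpl*rpm gpfs.license.std*.rpm gpfs.gskit*rpm gpfs.msg*rpm gpfs.docs*rpm"
    for n in $nodes; do
        ssh "root@$n" "cd $dir && rpm -ivh $pkgs"
    done
}
# Example: install_gpfs_rpms "myhost1 myhost2 myhost3" /nfs_shared_dir/gpfs_5.1.1.1
```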


Step 2 : Verify installed GPFS packages

 [root@myhost1 ]# rpm -qa | grep gpfs
 gpfs.docs-5.1.1-1.noarch
 gpfs.license.std-5.1.1-1.ppc64le
 gpfs.bda-integration-1.0.3-1.noarch
 gpfs.base-5.1.1-1.ppc64le
 gpfs.gplbin-4.18.0-305.el8.ppc64le-5.1.1-1.ppc64le
 gpfs.gskit-8.0.55-19.ppc64le
 gpfs.msg.en_US-5.1.1-1.noarch
 gpfs.gpl-5.1.1-1.noarch

Step 3 : Build the GPL (5.1.1.1) portability layer module by issuing the mmbuildgpl command on all nodes in the cluster.

Step 4 : Verify that the GPFS packages are installed on all nodes and that the GPL module built properly. Then export the path for the GPFS commands:

              export PATH=$PATH:/usr/lpp/mmfs/bin

Step 5 : Use the mmcrcluster command to create a GPFS cluster

mmcrcluster -N NodeFile -C smpi_gpfs_power8

                    where NodeFile has following entries

#cat NodeFile
 myhost2:quorum
 myhost1:quorum-manager
 myhost3:quorum-manager

Step 6: Use the mmchlicense command to designate licenses as needed. This command controls the type of GPFS license associated with the nodes in the cluster. --accept indicates that you accept the applicable licensing terms.

 mmchlicense server --accept -N serverLicense

Step 7: Use the mmgetstate command to display the state of the GPFS™ daemon on one or more nodes.

 mmgetstate -a

Step 8: mmlslicense command displays information about the IBM Spectrum Scale node licensing designation or about disk and cluster capacity.

 mmlslicense -L

Step 9: The mmcrnsd command is used to create cluster-wide names for NSDs used by GPFS. This is the first GPFS step in preparing disks for use by a GPFS file system.

 mmcrnsd -F NSD_Stanza_smpi_gpfs_power -v no

 where NSD_Stanza_smpi_gpfs_power has

#cat NSD_Stanza_smpi_gpfs_power
%nsd:
                        device=/dev/sda
                        nsd=nsd1
                        servers=myhost2
                        usage=dataAndMetadata
                        failureGroup=-1
                        pool=system

%nsd:
                       device=/dev/sdb
                       nsd=nsd2
                       servers=myhost1
                       usage=dataAndMetadata
                       failureGroup=-1
                       pool=system

%nsd:
                       device=/dev/sda
                       nsd=nsd3
                       servers=myhost3
                       usage=dataAndMetadata
                       failureGroup=-1
                       pool=system

Step 10: Use the mmlsnsd command to display the current information for the NSDs belonging to the GPFS cluster.

 mmlsnsd -X

Step 11: Use the mmcrfs command to create a GPFS file system

 mmcrfs smpi_gpfs -F NSD_Stanza_smpi_gpfs_power

Step 12: The mmmount command mounts the specified GPFS file system on one or more nodes in the cluster.

 mmmount smpi_gpfs -a

Step 13 : Use the mmlsfs command to list the attributes of a file system.

 mmlsfs all

Step 14: The mmlsmount command reports if a file system is in use at the time the command is issued.

 mmlsmount all

Step 15: How to change the mount point from /gpfs to /my_gpfs:

 mmchfs gpfs -T /my_gpfs

Step 16: GPFS autostart and automount setup

[root@myhost1 ~]#  systemctl status gpfs.service
● gpfs.service - General Parallel File System
   Loaded: loaded (/usr/lib/systemd/system/gpfs.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-07-20 03:27:04 EDT; 3 days ago
  Process: 96622 ExecStart=/usr/lpp/mmfs/bin/mmremote startSubsys systemd $STARTSUBSYS_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 96656 (runmmfs)
   CGroup: /system.slice/gpfs.service
           ├─96656 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/runmmfs
           └─97093 /usr/lpp/mmfs/bin/mmfsd

[root@myhost1 ~]# systemctl is-active gpfs.service
active
[root@myhost1 ~]#  systemctl is-enabled gpfs.service
disabled
[root@myhost1 ~]# systemctl is-failed gpfs.service
active
[root@myhost1 ~]# systemctl enable  gpfs.service
Created symlink from /etc/systemd/system/multi-user.target.wants/gpfs.service to /usr/lib/systemd/system/gpfs.service.
[root@myhost1 ~]# systemctl is-enabled gpfs.service
enabled
[root@myhost1 ~]# ls -alsrt /etc/systemd/system/multi-user.target.wants/gpfs.service
0 lrwxrwxrwx 1 root root 36 Jul 23 05:43 /etc/systemd/system/multi-user.target.wants/gpfs.service -> /usr/lib/systemd/system/gpfs.service
 

[root@myhost1 ~]# mmgetstate -a
 Node number  Node name        GPFS state
-------------------------------------------
       1      myhost2          active
       2      myhost1          active
       3      myhost3          active
 

[root@myhost1 ~]# mmchfs smpi_gpfs -A yes
mmchfs: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
 

[root@myhost1 ~]# mmlsfs smpi_gpfs  -A
flag                value                    description
------------------- ------------------------ -----------------------------------
 -A                 yes                      Automatic mount option
 

[root@myhost1 ~]# mmchconfig autoload=yes
mmchconfig: Command successfully completed
mmchconfig: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@myhost1 ~]#
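To verify the automatic-mount setting from a script rather than by eye, the flag value can be extracted from mmlsfs output. A small sketch, run here against the sample output above (on a live cluster, pipe `mmlsfs smpi_gpfs -A` in instead):

```shell
# Sketch: pull the value of the -A (automatic mount) flag out of
# mmlsfs output. The variable stands in for `mmlsfs smpi_gpfs -A`.
mmlsfs_out=' -A                 yes                      Automatic mount option'
a_flag=$(printf '%s\n' "$mmlsfs_out" | awk '$1 == "-A" { print $2 }')
echo "$a_flag"   # prints "yes"
```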

Step 17: Troubleshooting when a GPFS node goes inactive or a disk goes down.

[root@myhost1 ~]# mmlscluster
GPFS cluster information
========================
GPFS cluster name:         my_spectrumScale_cluster
GPFS cluster id:           9784093264651231821
GPFS UID domain:          my_spectrumScale_cluster
Remote shell command:      /usr/bin/ssh
Remote file copy command:  /usr/bin/scp
Repository type:           CCR

Node  Daemon node name  IP address     Admin node name  Designation
---------------------------------------------------------------------
  1   myhost2           10.x.y.1       myhost2          quorum
  2   myhost1           10.x.y.2       myhost1          quorum-manager
  3   myhost3           10.x.y.3       myhost3          quorum-manager

[root@myhost1 ~]# mmgetstate -a
Node number  Node name        GPFS state
-------------------------------------------
       1      myhost1          active
       2      myhost2          down
       3      myhost3          active
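On larger clusters it helps to filter the non-active nodes out of `mmgetstate -a` output instead of scanning it by eye. A hedged awk sketch, run here against sample output (on a live cluster, pipe the command output in directly):

```shell
# Sketch: list GPFS nodes whose state is not "active".
# The variable stands in for live `mmgetstate -a` output.
state_out=' Node number  Node name        GPFS state
-------------------------------------------
       1      myhost1          active
       2      myhost2          down
       3      myhost3          active'
# Data rows start with a numeric node number; print the node name
# whenever the state column is anything other than "active".
down_nodes=$(printf '%s\n' "$state_out" \
  | awk '$1 ~ /^[0-9]+$/ && $3 != "active" { print $2 }')
echo "$down_nodes"   # prints "myhost2"
```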

[root@myhost1 ~]# mmstartup -a
Tue Jul 20 03:27:03 EDT 2021: mmstartup: Starting GPFS ...
myhost2:  The GPFS subsystem is already active.
myhost3:  The GPFS subsystem is already active.

[root@myhost1 ~]# mmgetstate -a

Node number  Node name        GPFS state
-------------------------------------------
       1      myhost1          active
       2      myhost2          active
       3      myhost3          active
[root@myhost1 ~]#

[root@myhost1 ~]# mmumount smpi_gpfs -a
Tue Jul 20 04:12:04 EDT 2021: mmumount: Unmounting file systems ...
[root@myhost1 ~]# 

[root@myhost1 ~]#  mmlsdisk smpi_gpfs
disk         driver   sector     failure holds    holds                            storage
name         type       size       group metadata data  status        availability pool
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------------
nsd1         nsd         512          -1 Yes      Yes   ready         up           system
nsd2         nsd         512          -1 Yes      Yes   ready         down         system
nsd3         nsd         512          -1 Yes      Yes   ready         up           system
[root@myhost1 ~]#
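The same filtering idea works for disks: the availability column of `mmlsdisk` output can be scanned by script to find disks that need `mmchdisk ... start`. A sketch, run here against the sample output above:

```shell
# Sketch: list disks whose availability column is "down" in
# `mmlsdisk smpi_gpfs` output. The variable stands in for the command.
mmlsdisk_out='disk         driver   sector     failure holds    holds                            storage
name         type       size       group metadata data  status        availability pool
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------------
nsd1         nsd         512          -1 Yes      Yes   ready         up           system
nsd2         nsd         512          -1 Yes      Yes   ready         down         system
nsd3         nsd         512          -1 Yes      Yes   ready         up           system'
# Availability is the 8th whitespace-separated field on each data row.
down_disks=$(printf '%s\n' "$mmlsdisk_out" | awk '$8 == "down" { print $1 }')
echo "$down_disks"   # prints "nsd2"
```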

[root@myhost1 ~]#  mmchdisk smpi_gpfs start -d nsd2
mmnsddiscover:  Attempting to rediscover the disks.  This may take a while ...
mmnsddiscover:  Finished.
myhost1:  Rediscovered nsd server access to nsd2.
Scanning file system metadata, phase 1 ...
100 % complete on Tue Jul 20 04:24:14 2021
Scan completed successfully.
Scanning file system metadata, phase 2 ...
100 % complete on Tue Jul 20 04:24:14 2021
Scan completed successfully.
Scanning file system metadata, phase 3 ...
Scan completed successfully.
Scanning file system metadata, phase 4 ...
100 % complete on Tue Jul 20 04:24:14 2021
Scan completed successfully.
Scanning file system metadata, phase 5 ...
100 % complete on Tue Jul 20 04:24:14 2021
Scan completed successfully.
Scanning user file metadata ...
100.00 % complete on Tue Jul 20 04:24:25 2021  (    500736 inodes with total      26921 MB data processed)
Scan completed successfully.

[root@myhost1 ~]#  mmmount  smpi_gpfs  -a
Tue Jul 20 04:24:42 EDT 2021: mmmount: Mounting file systems ...
[root@myhost1 ~]#

[root@myhost1 ~]# mmlsdisk smpi_gpfs
disk         driver   sector     failure holds    holds                            storage
name         type       size       group metadata data  status        availability pool
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------------
nsd1         nsd         512          -1 Yes      Yes   ready         up           system
nsd2         nsd         512          -1 Yes      Yes   ready         up           system
nsd3         nsd         512          -1 Yes      Yes   ready         up           system
[root@myhost1 ~]#

[root@myhost1 ~]# mmgetstate -a
Node number  Node name        GPFS state
-------------------------------------------
1                       myhost1        active
2                       myhost2        active
3                       myhost3        active
[root@myhost1 ~]#

Step 18: Steps to permanently uninstall GPFS

- Unmount all GPFS file systems on all nodes by issuing the mmumount all -a command.

- Issue the mmdelfs command for each file system in the cluster to remove GPFS file systems.

- Issue the mmdelnsd command for each NSD in the cluster to remove the NSD volume ID from the device.

mmdelfs smpi_gpfs
mmdelnsd nsd1
mmdelnsd nsd2
mmdelnsd nsd3

- Issue the mmshutdown -a command to shut down GPFS on all nodes.

- Uninstall GPFS from each node:

rpm -qa | grep gpfs | xargs rpm -e --nodeps

Remove the /var/mmfs and /usr/lpp/mmfs directories.

rm -rf  /var/mmfs
rm -rf  /usr/lpp/mmfs
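The uninstall steps above can be collected into a single script. The sketch below is deliberately a dry run: it only prints each command. On a real cluster you would change the `run` definition to actually execute them; the file system and NSD names are the example values from this post.

```shell
#!/bin/sh
# Dry-run sketch of Step 18: prints each uninstall command instead of
# executing it. On a real cluster, redefine: run() { "$@"; }
run() { echo "+ $*"; }

run mmumount all -a          # unmount all GPFS file systems on all nodes
run mmdelfs smpi_gpfs        # remove the GPFS file system
for n in nsd1 nsd2 nsd3; do  # remove the NSD volume ID from each device
  run mmdelnsd "$n"
done
run mmshutdown -a            # shut down GPFS on all nodes
```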

 ------------------------------------------------------------------------------------------

The Quick Start automatically deploys a highly available IBM Spectrum Scale cluster on the Amazon Web Services (AWS) Cloud. This Quick Start deploys IBM Spectrum Scale into a virtual private cloud (VPC) that spans two Availability Zones in your AWS account. You can build a new VPC for IBM Spectrum Scale, or deploy the software into your existing VPC. The deployment and configuration tasks are automated by AWS CloudFormation templates that you can customize during launch.

IBM's container-native storage solution for OpenShift is designed for enterprise customers who need global hybrid cloud data access. These storage services meet the strict requirements for mission critical data. IBM Spectrum® Fusion provides a streamlined way for organizations to discover, secure, protect and manage data from the edge, to the core data center, to the public cloud.

Spectrum Fusion

IBM launched a containerized derivative of its Spectrum Scale parallel file system called Spectrum Fusion. The rationale is that customers need to store and analyze more data at edge sites, while operating in a hybrid, multi-cloud world that requires data availability across all these locations. ESS arrays provide edge storage capacity, and a containerized Spectrum Fusion can run in any of the locations mentioned. Building, deploying, and managing applications requires advanced capabilities that provide rapid access to data across the entire enterprise, from the edge to the data center to the cloud.

Spectrum Fusion combines Spectrum Scale functionality with unspecified IBM data protection software. It will appear first in a hyperconverged infrastructure (HCI) system that integrates compute, storage, and networking, equipped with Red Hat OpenShift to support virtual machine and containerized workloads for cloud, edge, and containerized data centers.

Spectrum Fusion will integrate with Red Hat Advanced Cluster Manager (ACM) for managing multiple Red Hat OpenShift clusters, and it will support tiering. Spectrum Fusion gives customers a streamlined way to discover data across the enterprise because it keeps a global index of the data it stores. It manages a single copy of data only, so there is no need to create duplicate data when moving application workloads across the enterprise. Spectrum Fusion will also integrate with IBM's Cloud Satellite, a managed distributed-cloud offering that deploys and runs apps across on-premises, edge, and cloud environments.

References:
https://www.ibm.com/in-en/products/spectrum-scale
https://aws.amazon.com/quickstart/architecture/ibm-spectrum-scale
https://www.ibm.com/in-en/products/spectrum-fusion
https://www.ibm.com/docs/en/spectrum-scale/5.0.4?topic=installing-spectrum-scale-linux-nodes-deploying-protocols