Thursday, August 15, 2019

Spectrum LSF GPU enhancements & Enabling GPU features

The IBM Spectrum LSF Suites portfolio redefines cluster virtualization and workload management by providing a tightly integrated solution for demanding, mission-critical HPC environments that can increase both user productivity and hardware utilization while decreasing system management costs. The heterogeneous, highly scalable and available architecture provides support for traditional high-performance computing and high throughput workloads, as well as for big data, cognitive, GPU machine learning, and containerized workloads. Clients worldwide are using technical computing environments supported by LSF to run hundreds of genomic workloads, including Burrows-Wheeler Aligner (BWA), SAMtools, Picard, GATK, Isaac, CASAVA, and other frequently used pipelines for genomic analysis.

IBM Spectrum LSF provides support for heterogeneous computing environments, including NVIDIA GPUs. With the ability to detect, monitor, and schedule GPU-enabled workloads to the appropriate resources, IBM Spectrum LSF enables users to easily take advantage of the benefits provided by GPUs.

Solution highlights include: 
  •     Enforcement of GPU allocations via cgroups
  •     Exclusive allocation and round robin shared mode allocation
  •     CPU-GPU affinity
  •     Boost control
  •     Power management
  •     Multi-Process Server (MPS) support
  •     NVIDIA Pascal and DCGM support 
The order of conditions that LSF considers when allocating GPUs is as follows:
  •     The largest GPU compute capability (gpu_factor value).
  •     GPUs with direct NVLink connections.
  •     GPUs with the same model, including the GPU total memory size.
  •     The largest available GPU memory.
  •     The number of concurrent jobs on the same GPU.
  •     The current GPU mode.

Configurations:

1) GPU auto-configuration
Enabling GPU detection for LSF is now available with automatic configuration. To enable automatic GPU configuration, set LSF_GPU_AUTOCONFIG=Y in the lsf.conf file. LSF_GPU_AUTOCONFIG controls whether LSF enables use of GPU resources automatically. If set to Y, LSF automatically configures built-in GPU resources and automatically detects GPUs. If set to N, manual configuration of GPU resources is required to use GPU features in LSF. Whether LSF_GPU_AUTOCONFIG is set to Y or N, LSF always collects GPU metrics from hosts.
When enabled, the lsload -gpu, lsload -gpuload, and lshosts -gpu commands show host-based or GPU-based resource metrics for monitoring.

2) The LSB_GPU_NEW_SYNTAX=extend parameter must be defined in the lsf.conf file to enable the -gpu option and GPU_REQ parameter syntax.

3) Other configurations:

  • To configure GPU resource requirements for an application profile, specify the GPU_REQ parameter in the lsb.applications file, e.g. GPU_REQ="gpu_req"
  • To configure GPU resource requirements for a queue, specify the GPU_REQ parameter in the lsb.queues file, e.g. GPU_REQ="gpu_req" (see the example queue definition after this list)
  • To configure default GPU resource requirements for the cluster, specify the LSB_GPU_REQ parameter in the lsf.conf file, e.g. LSB_GPU_REQ="gpu_req"
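
For the queue-level case, a minimal sketch of a GPU-enabled queue definition in lsb.queues is shown below; the queue name gpu_q and the GPU requirement string are illustrative values, not settings taken from the cluster in this post:

# Example queue definition in lsb.queues (illustrative names and values)
Begin Queue
QUEUE_NAME   = gpu_q
DESCRIPTION  = Example queue with a default GPU requirement
# Every job in this queue gets one shared-mode GPU unless the job overrides the request
GPU_REQ      = "num=1:mode=shared:j_exclusive=no"
End Queue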
---------------------------------------------------------------------------------------------
Configuration change required on clusters: LSF_HOME/conf/lsf.conf

#To enable "-gpu"
LSF_GPU_AUTOCONFIG=Y
LSB_GPU_NEW_SYNTAX=extend
LSB_GPU_REQ="num=4:mode=shared:j_exclusive=yes"
--------------------------------------------------------------------------------------------
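After editing lsf.conf, the LSF daemons must re-read the configuration before the new GPU parameters take effect. Assuming a standard installation where the lsadmin and badmin commands are on the administrator's PATH, a typical sequence is:

# Re-read lsf.conf on the LIM daemons
lsadmin reconfig
# Restart mbatchd so the batch system picks up the LSB_* parameters
badmin mbdrestart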
Specify additional GPU resource requirements
LSF now allows you to request additional GPU resource requirements to allow you to further refine the GPU resources that are allocated to your jobs. The existing bsub -gpu command option, LSB_GPU_REQ parameter in the lsf.conf file, and the GPU_REQ parameter in the lsb.queues and lsb.applications files now have additional GPU options to make the following requests:
  •     The gmodel option requests GPUs with a specific brand name, model number, or total GPU memory.
  •     The gtile option specifies the number of GPUs to use per socket.
  •     The gmem option reserves the specified amount of memory on each GPU that the job requires.
  •     The nvlink option requests GPUs with NVLink connections.
You can also use these options in the bsub -R command option or RES_REQ parameter in the lsb.queues and lsb.applications files for complex GPU resource requirements, such as for compound or alternative resource requirements. Use the gtile option in the span[] string and the other options (gmodel, gmem, and nvlink) in the rusage[] string as constraints on the ngpus_physical resource.
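As a rough sketch of how these options fit together (the model name TeslaV100_SXM2_16GB, the gmem value, and the program name ./gpu_app are placeholders; use the model string reported by lshosts -gpu on your own cluster):

# Request 2 GPUs of a specific model, reserving 4G of GPU memory on each
bsub -q ibm_q -gpu "num=2:gmodel=TeslaV100_SXM2_16GB:gmem=4G" ./gpu_app

# Equivalent complex form: gmodel/gmem as constraints on ngpus_physical in rusage[],
# with gtile in span[] to place one GPU per socket
bsub -q ibm_q -R "rusage[ngpus_physical=2:gmodel=TeslaV100_SXM2_16GB:gmem=4G] span[gtile=1]" ./gpu_app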

Monitor GPU resources with the lsload command
Options within the lsload command show the host-based and GPU-based GPU information for a cluster. The lsload -l command does not show GPU metrics. GPU metrics can be viewed using the lsload -gpu, lsload -gpuload, bhosts -gpu, and lshosts -gpu commands.

lsload -gpu

[root@powerNode2 ~]# lsload -gpu
HOST_NAME       status  ngpus  gpu_shared_avg_mut  gpu_shared_avg_ut  ngpus_physical
powerNode1           ok      4                  0%                 0%               4
powerNode2           ok      4                  0%                 0%               4
powerNode3           ok      4                  0%                 0%               4
powerNode4           ok      4                  0%                 0%               4
powerNode5           ok      4                  0%                 0%               4
[root@powerNode2 ~]#


lsload -gpuload
[root@powerNode2 ~]# lsload -gpuload
HOST_NAME    gpuid  gpu_model    gpu_mode  gpu_temp  gpu_ecc  gpu_ut  gpu_mut  gpu_mtotal  gpu_mused  gpu_pstate  gpu_status  gpu_error
powerNode1   0      TeslaV100_S  0.0       33C       0.0      0%      0%       15.7G       0M         0           ok          -
             1      TeslaV100_S  0.0       36C       0.0      0%      0%       15.7G       0M         0           ok          -
             2      TeslaV100_S  0.0       33C       0.0      0%      0%       15.7G       0M         0           ok          -
             3      TeslaV100_S  0.0       36C       0.0      0%      0%       15.7G       0M         0           ok          -
powerNode2   0      TeslaP100_S  0.0       37C       0.0      0%      0%       15.8G       0M         0           ok          -
             1      TeslaP100_S  0.0       32C       0.0      0%      0%       15.8G       0M         0           ok          -
             2      TeslaP100_S  0.0       36C       0.0      0%      0%       15.8G       0M         0           ok          -
             3      TeslaP100_S  0.0       31C       0.0      0%      0%       15.8G       0M         0           ok          -
powerNode3   0      TeslaP100_S  0.0       33C       0.0      0%      0%       15.8G       0M         0           ok          -
             1      TeslaP100_S  0.0       32C       0.0      0%      0%       15.8G       0M         0           ok          -
             2      TeslaP100_S  0.0       35C       0.0      0%      0%       15.8G       0M         0           ok          -
             3      TeslaP100_S  0.0       37C       0.0      0%      0%       15.8G       0M         0           ok          -
powerNode4   0      TeslaV100_S  0.0       35C       0.0      0%      0%       15.7G       0M         0           ok          -
             1      TeslaV100_S  0.0       35C       0.0      0%      0%       15.7G       0M         0           ok          -
             2      TeslaV100_S  0.0       32C       0.0      0%      0%       15.7G       0M         0           ok          -
             3      TeslaV100_S  0.0       36C       0.0      0%      0%       15.7G       0M         0           ok          -
powerNode5   0      TeslaP100_S  0.0       31C       0.0      0%      0%       15.8G       0M         0           ok          -
             1      TeslaP100_S  0.0       32C       0.0      0%      0%       15.8G       0M         0           ok          -
             2      TeslaP100_S  0.0       34C       0.0      0%      0%       15.8G       0M         0           ok          -
             3      TeslaP100_S  0.0       36C       0.0      0%      0%       15.8G       0M         0           ok          -
[root@powerNode2 ~]#


bhosts -gpu

The -gpu option for bhosts shows GPU usage (memory used and reserved) and job counts for each GPU on every host in the cluster.

[root@powerNode2 ~]# bhosts -gpu
HOST_NAME              ID           MODEL     MUSED      MRSV  NJOBS    RUN   SUSP    RSV
powerNode1               0 TeslaP100_SXM2_        0M        0M      0      0      0      0
                        1 TeslaP100_SXM2_        0M        0M      0      0      0      0
                        2 TeslaP100_SXM2_        0M        0M      0      0      0      0
                        3 TeslaP100_SXM2_        0M        0M      0      0      0      0
powerNode2              0 TeslaP100_SXM2_        0M        0M      0      0      0      0
                        1 TeslaP100_SXM2_        0M        0M      0      0      0      0
                        2 TeslaP100_SXM2_        0M        0M      0      0      0      0
                        3 TeslaP100_SXM2_        0M        0M      0      0      0      0
powerNode3              0 TeslaP100_SXM2_        0M        0M      0      0      0      0
                        1 TeslaP100_SXM2_        0M        0M      0      0      0      0
                        2 TeslaP100_SXM2_        0M        0M      0      0      0      0
                        3 TeslaP100_SXM2_        0M        0M      0      0      0      0
powerNode4              0 TeslaV100_SXM2_        0M        0M      0      0      0      0
                        1 TeslaV100_SXM2_        0M        0M      0      0      0      0
                        2 TeslaV100_SXM2_        0M        0M      0      0      0      0
                        3 TeslaV100_SXM2_        0M        0M      0      0      0      0
powerNode5              0 TeslaV100_SXM2_        0M        0M      0      0      0      0
                        1 TeslaV100_SXM2_        0M        0M      0      0      0      0
                        2 TeslaV100_SXM2_        0M        0M      0      0      0      0
                        3 TeslaV100_SXM2_        0M        0M      0      0      0      0
[root@powerNode2 ~]# 

 The -gpu option for lshosts shows the GPU topology information for a cluster.

[root@powerNode2 ~]# lshosts -gpu
HOST_NAME   gpu_id       gpu_model   gpu_driver   gpu_factor      numa_id
powerNode1       0 TeslaP100_SXM2_       418.67          6.0            0
                 1 TeslaP100_SXM2_       418.67          6.0            0
                 2 TeslaP100_SXM2_       418.67          6.0            1
                 3 TeslaP100_SXM2_       418.67          6.0            1
powerNode2       0 TeslaP100_SXM2_       418.67          6.0            0
                 1 TeslaP100_SXM2_       418.67          6.0            0
                 2 TeslaP100_SXM2_       418.67          6.0            1
                 3 TeslaP100_SXM2_       418.67          6.0            1
powerNode3       0 TeslaP100_SXM2_       418.67          6.0            0
                 1 TeslaP100_SXM2_       418.67          6.0            0
                 2 TeslaP100_SXM2_       418.67          6.0            1
                 3 TeslaP100_SXM2_       418.67          6.0            1
powerNode4       0 TeslaV100_SXM2_       418.67          7.0            0
                 1 TeslaV100_SXM2_       418.67          7.0            0
                 2 TeslaV100_SXM2_       418.67          7.0            8
                 3 TeslaV100_SXM2_       418.67          7.0            8
powerNode5       0 TeslaV100_SXM2_       418.67          7.0            0
                 1 TeslaV100_SXM2_       418.67          7.0            0
                 2 TeslaV100_SXM2_       418.67          7.0            8
                 3 TeslaV100_SXM2_       418.67          7.0            8
[root@powerNode2 ~]# 
Job Submission:
1) Submit a normal job
[sachinpb@powerNode2 ~]$  bsub -q ibm_q -R "select[type==ppc]" sleep 200
Job <24807> is submitted to queue <ibm_q>.
[sachinpb@powerNode2 ~]$

2) Submit a job with GPU requirements:
[sachinpb@powerNode2 ~]$  bsub -q ibm_q -gpu "num=1" -R "select[type==ppc]" sleep 200
Job <24808> is submitted to queue <ibm_q>.
[sachinpb@powerNode2 ~]$

3) List jobs
[sachinpb@powerNode2 ~]$ bjobs
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME     SUBMIT_TIME
24807   sachinpb  RUN   ibm_q    powerNode2   powerNode6 sleep 200     Aug  1 05:34
24808   sachinpb  RUN   ibm_q    powerNode2   powerNode2 sleep 200     Aug  1 05:34
[sachinpb@powerNode2 ~]$

We can see that job <24807> was submitted without the "-gpu" option, so it was dispatched to a non-GPU node [powerNode6]. The other job <24808> ran on powerNode2, which has 4 GPUs as listed in the lshosts -gpu output shown above.
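
To confirm which GPUs a running job was actually allocated, the long form of bjobs can be used; from LSF 10.1 Fix Pack 6 onwards the detailed output of a GPU job should include the GPU requirement and, depending on version, the allocated GPU IDs:

# Detailed job information, including the GPU requirement of the job
bjobs -l 24808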

4) Submit a job with GPU requirements to another cluster (x86-64_cluster2), where the clusters are configured in job forwarding mode:

[sachinpb@powerNode2  ~]$ lsclusters
CLUSTER_NAME      STATUS   MASTER_HOST      ADMIN      HOSTS   SERVERS
power_cluster1    ok       powerNode2       lsfadmin       5         5
x86-64_cluster2   ok       x86-masterNode   lsfadmin       8         8
[sachinpb@powerNode2  ~]$


[sachinpb@powerNode2 ~]$ bsub -q x86_q -gpu "num=1" -R "select[type==X86_64]" sleep 200
Job <46447> is submitted to queue <x86_q>.
[sachinpb@powerNode2 ~]$
[sachinpb@powerNode2 ~]$ bjobs
JOBID   USER    STAT  QUEUE        FROM_HOST   EXEC_HOST                    JOB_NAME   SUBMIT_TIME
46447   sachinpb  RUN   x86_q           powerNode2   x86_intelbox@x86-cluster2    sleep 200      Feb  9 00:55 


I hope this blog helped you understand how to enable GPU support in Spectrum LSF and how to submit GPU jobs.
NOTE: GPU-enabled workloads are supported from IBM Spectrum LSF Version 10.1 Fix Pack 6 onwards. RHEL version 7 or higher is required on LSF hosts to support LSF_GPU_AUTOCONFIG.

References:
https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/lsf_gpu/chap_submit_monitor_gpu_jobs.html
