Thursday, February 21, 2019

Kubernetes [K8s] Architecture and setup on RHEL

Kubernetes (k8s or Kube ) is an open source container management platform designed to run enterprise-class, cloud-enabled and web-scalable IT workloads.In other words,  Kubernetes is a container orchestrator to provision, manage, and scale apps.With the rise of containerization in the world of Devops, the need of a platform to effectively orchestrate these containers also grew.  Since Kubernetes operates at the container level rather than at the hardware level, it provides some generally applicable features common to PaaS offerings.  It is a vendor-agnostic cluster and container management tool, open-sourced by Google in 2014. Kubernetes allows you to manage the life cycle of containerized apps in a cluster.

Kubernetes Architecture : 

Similar to the Apache Mesos and Docker Swarm, Kubernetes is a container orchestrator to provision, manage, and scale apps. The key paradigm of Kubernetes is its declarative model. Kubernetes is designed on the principles of scalability, availability, security and portability. It optimizes the cost of infrastructure by efficiently distributing the workload across available resources. This architecture of Kubernetes provides a flexible, loosely-coupled mechanism for service discovery. Like most distributed computing platforms, a Kubernetes cluster consists of at least one master and multiple compute nodes (also known as worker nodes). The master is responsible for exposing the application program interface (API), scheduling the deployments and managing the overall cluster. Each node runs a container runtime, such as Docker or rkt (container system developed by CoreOS as a light weight and secure alternative to Docker), along with an agent that communicates with the master. The node also runs additional components for logging, monitoring, service discovery and optional add-ons. Nodes are the workhorses of a Kubernetes cluster. They expose compute, networking and storage resources to applications. Nodes can be virtual machines (VMs) running in a cloud or bare metal servers running within the data center.
A pod is a collection of one or more containers. The pod serves as Kubernetes’ core unit of management. Pods act as the logical boundary for containers sharing the same context and resources. The grouping mechanism of pods make up for the differences between containerization and virtualization by making it possible to run multiple dependent processes together. At runtime, pods can be scaled by creating replica sets, which ensure that the deployment always runs the desired number of pods.

Replica sets deliver the required scale and availability by maintaining a pre-defined set of pods at all times. A single pod or a replica set can be exposed to the internal or external consumers via services. Services enable the discovery of pods by associating a set of pods to a specific criterion. Pods are associated to services through key-value pairs called labels and selectors. Any new pod with labels that match the selector will automatically be discovered by the service. This architecture provides a flexible, loosely-coupled mechanism for service discovery.
The definition of Kubernetes objects, such as pods, replica sets and services, are submitted to the master. Based on the defined requirements and availability of resources, the master schedules the pod on a specific node. The node pulls the images from the container image registry and coordinates with the local container runtime to launch the container.

etcd is an open source, distributed key-value database from CoreOS, which acts as the single source of truth (SSOT) for all components of the Kubernetes cluster. The master queries etcd to retrieve various parameters of the state of the nodes, pods and containers.


Kubernetes Components:

 i) Master Components
Master components provide the cluster’s control plane. Master components make global decisions about the cluster (for example, scheduling), and detecting and responding to cluster events (starting up a new pod when a replication controller’s ‘replicas’ field is unsatisfied. The master stores the state and configuration data for the entire cluster in ectd, a persistent and distributed key-value data store. Each node has access to ectd and through it, nodes learn how to maintain the configurations of the containers they’re running. You can run etcd on the Kubernetes master or in standalone configurations.

  • kube-apiserver
  • etcd
  • kube-scheduler
  • kube-controller-manager
  • cloud-controller-manager

ii) Node Components
Node components run on every node, maintaining running pods and providing the Kubernetes runtime environment . All nodes in a Kubernetes cluster must be configured with a container runtime, which is typically Docker. The container runtime starts and manages the containers as they’re deployed to nodes in the cluster by Kubernetes.
  • kubelet
  • kube-proxy
  • Container Runtime


Installation Steps of Kubernetes 1.7 on RHEL 


Lets do the installation on two x86 nodes installed with RHEL 7.5
1) master-node
2) worker-node

Step 1: on master-node, disable SELinux & setup firewall rules
  1. setenforce 0
  2. sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux

Set the following firewall rules and other configuration details.

  1.  firewall-cmd --permanent --add-port=6443/tcp
  2.  firewall-cmd --permanent --add-port=2379-2380/tcp
  3.  firewall-cmd --permanent --add-port=10250/tcp
  4.  firewall-cmd --permanent --add-port=10251/tcp
  5.  firewall-cmd --permanent --add-port=10252/tcp
  6.  firewall-cmd --permanent --add-port=10255/tcp
  7.  firewall-cmd --reload
  8.  modprobe br_netfilter
  9.  echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables

NOTE:You MUST disable swap in order for the kubelet to work properly

Step 2: Configure Kubernetes Repository
cat /etc/yum.repos.d/kubernetes.repo

Step 3: configure  Docker  - validated version(18.06) for kubernetes

cat /etc/yum.repos.d/docker-main.repo
name=Docker main Repository

Step 4: Install docker

[root@master-node ]# yum install docker-ce-18.06*
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
Resolving Dependencies
--> Running transaction check
---> Package docker-ce.x86_64 0:18.06.3.ce-3.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

 Package                      Arch                      Version                              Repository                           Size
 docker-ce                    x86_64                    18.06.3.ce-3.el7                     docker-ce-stable                     41 M

Transaction Summary
Install  1 Package

Total size: 41 M
Installed size: 168 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
  Installing : docker-ce-18.06.3.ce-3.el7.x86_64                                                                                   1/1
  Verifying  : docker-ce-18.06.3.ce-3.el7.x86_64                                                                                   1/1

  docker-ce.x86_64 0:18.06.3.ce-3.el7

[root@master-node ]#

Step 5: Start and enable kubenetes

[root@master-node ~]# systemctl  start kubelet
[root@master-node ~]# systemctl  status kubelet
? kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
   Active: active (running) since Thu 2019-02-21 01:00:56 EST; 1min 50s ago
 Main PID: 9551 (kubelet)
    Tasks: 69
   Memory: 56.3M
   CGroup: /system.slice/kubelet.service
           +-9551 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubele...

Feb 21 01:02:22 master-node kubelet[9551]: W0221 01:02:22.527780    9551 cni.go:203] Unable to update cni config: No networks fo...i/net.d
Feb 21 01:02:22 master-node kubelet[9551]: E0221 01:02:22.528025    9551 kubelet.go:2192] Container runtime network not ready: N...ialized
Feb 21 01:02:27 master-node kubelet[9551]: W0221 01:02:27.528986    9551 cni.go:203] Unable to update cni config: No networks fo...i/net.d
Feb 21 01:02:27 master-node kubelet[9551]: E0221 01:02:27.529196    9551 kubelet.go:2192] Container runtime network not ready: N...ialized
Feb 21 01:02:32 master-node kubelet[9551]: W0221 01:02:32.530265    9551 cni.go:203] Unable to update cni config: No networks fo...i/net.d
Feb 21 01:02:32 master-node kubelet[9551]: E0221 01:02:32.530448    9551 kubelet.go:2192] Container runtime network not ready: N...ialized
Feb 21 01:02:37 master-node kubelet[9551]: W0221 01:02:37.531526    9551 cni.go:203] Unable to update cni config: No networks fo...i/net.d
Feb 21 01:02:37 master-node kubelet[9551]: E0221 01:02:37.531645    9551 kubelet.go:2192] Container runtime network not ready: N...ialized
Feb 21 01:02:42 master-node kubelet[9551]: W0221 01:02:42.532552    9551 cni.go:203] Unable to update cni config: No networks fo...i/net.d
Feb 21 01:02:42 master-node kubelet[9551]: E0221 01:02:42.532683    9551 kubelet.go:2192] Container runtime network not ready: N...ialized
Hint: Some lines were ellipsized, use -l to show in full.
[root@master-node ~]#

[root@master-node ]# systemctl enable kubelet
Created symlink from /etc/systemd/system/ to /etc/systemd/system/kubelet.service.

step 6 : Start and enable docker

[root@master-node]# systemctl restart docker
[root@master-node]# systemctl status docker
? docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-02-21 00:41:14 EST; 7s ago
 Main PID: 30271 (dockerd)
    Tasks: 161
   Memory: 90.8M
   CGroup: /system.slice/docker.service
           +-30271 /usr/bin/dockerd
           +-30284 docker-containerd --config /var/run/docker/containerd/containerd.toml
           +-30490 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/...
           +-30523 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/...
           +-30541 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/...
           +-30639 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/...
           +-30667 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/...
           +-30747 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/...
           +-30764 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/...
           +-30810 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/...
           +-30938 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/...

Feb 21 00:41:15 master-node dockerd[30271]: time="2019-02-21T00:41:15-05:00" level=info msg="shim docker-containerd-shim started...d=30541
Feb 21 00:41:15 master-node dockerd[30271]: time="2019-02-21T00:41:15-05:00" level=info msg="shim docker-containerd-shim started...d=30639
Feb 21 00:41:16 master-node dockerd[30271]: time="2019-02-21T00:41:16-05:00" level=info msg="shim docker-containerd-shim started...d=30667
Feb 21 00:41:16 master-node dockerd[30271]: time="2019-02-21T00:41:16-05:00" level=info msg="shim docker-containerd-shim started...d=30747
Feb 21 00:41:16 master-node dockerd[30271]: time="2019-02-21T00:41:16-05:00" level=info msg="shim docker-containerd-shim started...d=30764
Feb 21 00:41:16 master-node dockerd[30271]: time="2019-02-21T00:41:16-05:00" level=info msg="shim docker-containerd-shim started...d=30810
Feb 21 00:41:16 master-node dockerd[30271]: time="2019-02-21T00:41:16-05:00" level=info msg="shim docker-containerd-shim started...d=30938
Feb 21 00:41:16 master-node dockerd[30271]: time="2019-02-21T00:41:16-05:00" level=info msg="shim docker-containerd-shim started...d=30983
Feb 21 00:41:17 master-node dockerd[30271]: time="2019-02-21T00:41:17-05:00" level=info msg="shim reaped" id=cc95fdfdc1d7d0d6104...62d4d91
Feb 21 00:41:17 master-node dockerd[30271]: time="2019-02-21T00:41:17.122817061-05:00" level=info msg="ignoring event" module=li...Delete"
Hint: Some lines were ellipsized, use -l to show in full.
[root@master-node]# systemctl enable docker
Created symlink from /etc/systemd/system/ to /usr/lib/systemd/system/docker.service.

Step 7: Check the version of kubernetes installed on the master node :
[root@master-node ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:05:53Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
[root@master-node ~]#


Step 8:
Check the version of docker installed :

[root@master-node]# docker version
 Version:           18.06.3-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        d7080c1
 Built:             Wed Feb 20 02:26:51 2019
 OS/Arch:           linux/amd64
 Experimental:      false

  Version:          18.06.3-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       d7080c1
  Built:            Wed Feb 20 02:28:17 2019
  OS/Arch:          linux/amd64
  Experimental:     false


Step 9: Initialize Kubernetes Master with ‘kubeadm init’

[root@master-node ~]# kubeadm init
[init] Using Kubernetes version: v1.13.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master-node localhost] and IPs [IP_ADDRESS_master-node ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master-node localhost] and IPs [IP_ADDRESS_master-node ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master-node kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [ IP_ADDRESS_master-node]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 20.502302 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "master-node" as an annotation
[mark-control-plane] Marking the node master-node as control-plane by adding the label "''"
[mark-control-plane] Marking the node master-node as control-plane by adding the taints []
[bootstrap-token] Using token: od9n1d.rltj6quqmm2kojd7
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:

You can now join any number of machines by running the following on each node
as root:

  kubeadm join IP_ADDRESS_master-node:6443 --token od9n1d.rltj6quqmm2kojd7 --discovery-token-ca-cert-hash sha256:9ea1e1163550080fb9f5f63738fbf094f065de12cd38f493ec4e7c67c735fc7b

[root@master-node ~]#
If you get error -PORT already in use - please run kubectl reset
kubernetes master has been initialized successfully as shown above .

Step 10:Set the cluster for root user.

[root@master-node ~]# mkdir -p $HOME/.kube
[root@master-node ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master-node ~]# chown $(id -u):$(id -g) $HOME/.kube/config
[root@master-node ~]#


Step 11 : Check the status on master node  and Deploy pod network to the cluster

[root@master-node ~]# kubectl get nodes
master-node   NotReady   master   12m   v1.13.3
[root@master-node ~]#

[root@master-node]# kubectl get pods --all-namespaces
NAMESPACE     NAME                               READY   STATUS    RESTARTS   AGE
default       nginx                              0/1     Pending   0          125m
kube-system   coredns-86c58d9df4-d4j4x           1/1     Running   0          3h49m
kube-system   coredns-86c58d9df4-sg8tk           1/1     Running   0          3h49m
kube-system   etcd-master-node                      1/1     Running   0          3h48m
kube-system   kube-apiserver-master-node            1/1     Running   0          3h48m
kube-system   kube-controller-manager-master-node   1/1     Running   0          3h48m
kube-system   kube-proxy-b6wcd                   1/1     Running   0          159m
kube-system   kube-proxy-qfdhq                   1/1     Running   0          3h49m
kube-system   kube-scheduler-master-node            1/1     Running   0          3h48m
kube-system   weave-net-5c46g                    2/2     Running   0          159m
kube-system   weave-net-7qsnj                    2/2     Running   0          3h35m


Step 12:  deploy network

The Weave Net addon for Kubernetes comes with a Network Policy Controller that automatically monitors Kubernetes for any NetworkPolicy annotations on all namespaces and configures iptables rules to allow or block traffic as directed by the policies.

[root@master-node ~]#  export kubever=$(kubectl version | base64 | tr -d '\n')
[root@master-node ~]#  kubectl apply -f "$kubever"
serviceaccount/weave-net created created created created created
daemonset.extensions/weave-net created
[root@master-node ~]#


Step 13:Check the status on master node

[root@master-node ~]# kubectl get nodes
master-node   Ready    master   14m   v1.13.3
[root@master-node ~]#

Perform the following steps on each worker node

Step 14: 
  Disable SELinux and other configuration details

  1. setenforce 0
  2. sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
  3. firewall-cmd --permanent --add-port=10250/tcp
  4. firewall-cmd --permanent --add-port=10255/tcp
  5. firewall-cmd --permanent --add-port=30000-32767/tcp
  6. firewall-cmd --permanent --add-port=6783/tcp
  7. firewall-cmd  --reload
  8. echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables

NOTE: You MUST disable swap in order for the kubelet to work properly
Step 15: Configure Kubernetes  and docker Repositories on worker node  ( same as steps above)

Step 16:
Install docker

Step 17:

Start and enable docker service


Step 18:

Now you can  Join worker node to master node

Whenever kubernetes master initialized , then in the output we get command and token.  Copy that command and run

[root@worker-node ~]#  kubeadm join IP_ADDRESS_master-node:6443 --token od9n1d.rltj6quqmm2kojd7 --discovery-token-ca-cert-hash sha256:9ea1e1163550080fb9f5f63738fbf094f065de12cd38f493ec4e7c67c735fc7b
[preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "IP_ADDRESS_master-node:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://IP_ADDRESS_master-node:6443"
[discovery] Requesting info from "https://IP_ADDRESS_master-node:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "IP_ADDRESS_master-node:6443"
[discovery] Successfully established connection with API Server "IP_ADDRESS_master-node:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "worker-node" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

[root@worker-node ~]#

This will activate the services required -------------------------

Step 19:

Now verify Nodes status from master node using kubectl command

[root@master-node]# kubectl get nodes
master-node   Ready    master   119m   v1.13.3
worker-node   Ready    <none>   49m    v1.13.3
[root@master-node ]#

As we can see master and worker nodes are in ready status. This concludes that kubernetes 1.7 has been installed successfully and also we have successfully joined worker node.  Now we can create pods and services

---------------------------------   oooooooooooooooooo -------------------------------------------


Tuesday, January 8, 2019

OpenMP Accelerator Support for GPUs on POWER architecture

The combination of the IBM® POWER® processors and the NVIDIA GPUs provides a platform for heterogeneous high-performance computing that can run several technical computing workloads efficiently. The computational capability is built on top of massively parallel and multithreaded cores within the NVIDIA GPUs and the IBM POWER processors. You can offload parallel operations within applications, such as data analysis or high-performance computing workloads, to GPUs or other devices. 

       OpenMP API specification provides a set of directives to instruct the compiler and runtime to offload a block of code to the device. The device can be GPU, FPGA etc. In the new generations of POWER architecture, the POWER processor can be attached to the Nvidia GPU via the high speed NVLINK for fast data transfer between CPU and GPU. This hardware configuration is an essential part of the CORAL project with the U.S. national labs and can bring us closer to the Exascale Computing. The IBM XL compilers has a long history of supporting OpenMP API starting from the first version of the specification. The XL compilers continue to support OpenMP specification and exploit the POWER hardware architecture with GPU. The XL compiler team works closely with the IBM Research team to develop the compiler infrastructure for the offloading mechanism. In addition, the team also collaborates with the open source community for the runtime interface on the GPU device runtime library.


The OpenMP program (C, C++ or Fortran) with device constructs is fed into the High-Level Optimizer and partitioned into the CPU and GPU parts. The intermediate code is optimized by High-level Optimizer. Note that such optimization benefits both code for CPU as well as GPU. The CPU part is sent to the POWER Low-level Optimizer for further optimization and code generation. The GPU part of the code is translated to the LLVM IR and then fed into the LLVM optimizer in the CUDA Toolkit for optimization specific for Nvidia GPU and PTX code generation. Finally, the linker is invoked to link the objects to create an executable. From this outline view, one can see that the compiler employs the expertise from the both worlds to ensure that the applications are being optimized accordingly. For the CPU part, the POWER Low-level Optimizer which accumulates many years of optimization knowledge on the POWER architecture generates optimized code. For the GPU part, the GPU expertise from the CUDA Toolkit is used to generate optimized code on the Nvidia device. As a result, the entire applications are optimized in a balanced way.

The XL C/C++ V13.1.5 and XL Fortran V15.1.5 compilers are one of the first compilers that provide support for Nvidia GPU offloading using OpenMP 4.5 programming model. This release has the basic device constructs (i.e. target, target update and target data directives) support to allow users to experiment the offloading mechanism and porting code for GPU. The other important aspect of offloading computation to devices is the data mapping. The map clause is also supported in this release.

Compile the above code using IBM XL compiler  with below mentioned flags

 xlc_r -O3 -fopenmp -qsmp=omp -qoffload sample_offload.c

Observe the offloaded task running on GPU

  OpenMP offloading compilers for NVIDIA GPUs have improved dramatically over the past year and are ready for real use.


Thursday, October 4, 2018

Cloudera to merge with Hortonworks, creating a $5.2 billion company

 Cloudera and Hortonworks, the two leading enterprise Hadoop providers, announce Merger to Create World’s Leading Next Generation Data Platform and Deliver Industry’s First Enterprise Data Cloud. According to the companies, the combined entity has a better chance to be a next-gen data platform across multiple clouds, on-premises and Edge computing. Hortonworks and Cloudera also have complementary approaches, customers and industries. They will combine in a merger of equals in a deal valued at $5.2 billion. Both companies were pioneers in Hadoop, an open-source platform that could analyze data in ways that scaled up easily—a necessity during a time when the availability of data was increasing exponentially each year.


First, remember the history of Apache Hadoop.

Google built an innovative scale-out platform for data storage and analysis in the late 1990s and early 2000s, and published research papers about their work. Doug Cutting and Mike Cafarella were working together on a personal project, a web crawler, and read the Google papers. The two of them started the Hadoop project to build an open-source implementation of Google’s system.

Yahoo quickly recognized the promise of the project. It staffed up a team to drive Hadoop forward, and hired Doug Cutting. That team delivered the first production cluster in 2006 and continued to improve it in the years that followed.

In 2008,
Rob Bearden co-founded Cloudera with folks from Google, Facebook, and Yahoo to deliver a big data platform built on Hadoop to the enterprise market. We believed then, and we still believe today, that the rest of the world would need to capture, store, manage and analyze data at massive scale. We were the first company to commercialize the software, building on the work done at Yahoo and other consumer internet companies.

Three years later, the core team of developers working inside Yahoo on Hadoop spun out to found Hortonworks. They, too, saw the enormous potential for data at scale in the enterprise. They had proven their ability to build and deliver the technology at Yahoo.

Hortonworks, which spun out of Yahoo, went public in 2014, and Cloudera, which is larger than Hortonworks in terms of market capitalization and revenue, went public in 2017. Intel was a major Cloudera investor. Amazon's market-leading cloud unit has a distribution of Hadoop software, and another competitor of the companies, MapR, is privately held. Cloudera, which was founded in 2008, raised over a billion dollars before going public, the vast majority coming in one major $740 million burst from Intel Capital in 2014. Hortonworks, founded three years later, raised $248 million.

Tom Reilly, the long-time CEO at Cloudera, certainly sees the two companies as complementary, offering customers something together that they couldn’t separately. “Our businesses are highly complementary and strategic. By bringing together Hortonworks’ investments in end-to-end data management with Cloudera’s investments in data warehousing and machine learning, we will deliver the industry’s first enterprise data cloud from the Edge to AI,” Reilly said in a statement. The companies commercialize the Hadoop open-source big data software, which companies can use to store, process and analyze lots of different types of data.

 Cloudera stock jumped as much as 25 percent  after it announced an all-stock merger of equals with competitor Hortonworks. Hortonworks stock was halted just prior to the announcement and jumped as much as 29 percent.. The combined equity value of the two companies is $5.2 billion  of their stocks . The deal is subject to U.S. antitrust clearance, and the companies expect it to close in the first quarter of 2019. Under the terms of the transaction agreement, Cloudera stockholders will own approximately 60% of the equity of the combined company and Hortonworks stockholders will own approximately 40% . Hortonworks shareholders will get 1.305 Cloudera shares for each share owned.
The two companies are committed to supporting existing offerings from the two companies for at least three years but will work on a "unity release" of software, drawing on technologies from both companies' portfolios, Reilly said. The unified company wants to honor Hortonworks' commitment around providing all of its software under open-source licenses, but over time there will also be a proprietary option that offer additional features, including in the cloud.

Tom Reilly, will serve as CEO of the combined company. Hortonworks' Chief Operating Officer, Scott Davidson, will serve as Chief Operating Officer; Hortonworks' Chief Product Officer, Arun C. Murthy, will serve as Chief Product Officer; and Cloudera's Chief Financial Officer, Jim Frankola, will serve as Chief Financial Officer, of the combined company.
Rob Bearden will join the board of directors. Current Cloudera board member, Marty Cole, will become Chairman of the board of directors. They plan to merge, creating a single company under the Cloudera banner that will focus on “edge to AI” opportunities.



Sunday, September 30, 2018

Blockchain Technology

Blockchain is one of the biggest buzzwords in technology . Blockchain technologies are at an early stage of adoption, but they are proliferating rapidly.  Blockchain was invented by Satoshi Nakamoto in 2008 to serve as the public transaction ledger of the cryptocurrency bitcoin. And many alternative cryptocurrencies have their own block chain.

A blockchain is a growing list of records, called blocks, which are linked using cryptography.Each block contains a cryptographic hash of the previous block, a timestamp, and transaction data (generally represented as a merkle tree root hash).

A blockchain is a decentralized, distributed and public digital ledger that is used to record transactions across many computers so that the record cannot be altered retroactively without the alteration of all subsequent blocks and the consensus of the network. This allows the participants to verify and audit transactions inexpensively. A blockchain database is managed autonomously using a peer-to-peer network and a distributed timestamping server. They are authenticated by mass collaboration powered by collective self-interests. Some basic Bitcoin terms to help you understand it all better.

  • Decentralized: It isn’t stored in a single location, but on millions of nodes simultaneously
  • Distributed: It’s shared and continually reconciled on the network
  • Blockchain — Bitcoin’s ledger. The Blockchain is a public record and can be stored by anyone. The Blockchain stores all transaction data.
  • Mining — the process by which Bitcoin transactions are validated using special processors. The people who do this are called miners.
  • Node — A server or storage device which stores the entire Blockchain and runs a Bitcoin client software that peruses all transaction data and the Blockchain to check if they conform to Bitcoin protocol.
  • Bitcoin wallet — A software application in which you can view your Bitcoin holdings, and send or receive Bitcoins.
  • Bitcoin wallet address — Your wallet address is equivalent to your bank account number. 
  • Bitcoins are stored against this address/ID. Each wallet address is associated with two unique keys, called public and private keys.
  • Public key — The public key is used to send Bitcoins to you and can be seen by anyone. The private key is your password and you need it to spend your Bitcoins.

 How does cryptocurrencies like bitcoin work?

The bitcoin blockchain is “decentralized,” meaning it is not controlled by one central authority. While traditional currencies are issued by central banks, bitcoin has no central authority. Instead, the bitcoin blockchain is maintained by a network of people known as miners.

These “miners,” sometimes called “nodes” on the network, are people running purpose-built computers that are actually competing to solve complex mathematical problems in order to make a transaction go through. This is know as Proof-of-Work.

For example, say lots of people are making bitcoin transactions. Each transaction originates from a wallet which has a “private key.” This is a digital signature and provides mathematical proof that the transaction has come from the owner of the wallet.

Now imagine lots of transactions are taking place across the world. These individual transactions are grouped together into a block, organized by strict cryptographic rules. The block is sent out to the bitcoin network, which are made up of people running high-powered computers. These computers compete to validate the transactions by trying to solve complex mathematical puzzles. The winner receives an award in bitcoin.This validated block is then added onto previous blocks creating a chain of blocks called a blockchain.

Proof-of-work is a process of producing data that’s hard to get,  but easy to verify. In the context of a blockchain, proof-of-work is about solving mathematical problems. If a problem is successfully solved, then a new block can be added to the blockchain. On average, performing proof-of-work calculations and adding a new block to the chain takes about 10 minutes.

What’s behind the proof-of-work process  or Solving the puzzle?. How do they find this number? By guessing at random. The hash function makes it impossible to predict what the output will be. So, miners guess the mystery number and apply the hash function to the combination of that guessed number and the data in the block. The resulting hash has to start with a pre-established number of zeroes. There's no way of knowing which number will work, because two consecutive integers will give wildly varying results. What's more, there may be several nonces that produce the desired result, or there may be none (in which case the miners keep trying, but with a different block configuration).

The first miner to get a resulting hash within the desired range announces its victory to the rest of the network. All the other miners immediately stop work on that block and start trying to figure out the mystery number for the next one. As a reward for its work, the victorious miner gets some new bitcoin.

 Technically, a blockchain is a chain of blocks ordered in a network of non-trusted peers. Each block references the previous one and contains data, its own hash, and the hash of the previous block.

There are different types of blockchains, which rely on different configurations as well as consensus mechanism depending on the type and size of a network. Bitcoin which is the popular blockchain is permission less, which means anyone, can participate and access the content in the chain.

Every time a person wants to initiate a transaction on a blockchain, a block is created detailing or the details of the transaction which must be broadcast’ to all nodes in the network. The block, in this case, comes with a timestamp that helps establish a sequence of events.

Once all the nodes agree, and the authenticity of the block is established, the new block is linked’ to the previous block which is also linked to the previous block, resulting in a chain of a sought, commonly referred to as blockchain.

The blockchain is normally replicated on an entire network where everyone in the network can see and access it. Cryptography is used to secure the chain which makes it impossible for a single person to manipulate its contents. For any change to take place, all the people which in this case are represented by nodes must agree to the proposed changes which are initiated in the next block without altering the previous block.

Once a piece of information is added on a block and then recorded on the blockchain ledger, nobody can change or remove it. The tamper-proof aspect is what is fuelling suggestion as to why Blockchain CIOs should take a keen interest in the technology at a time when data security and preservation is of utmost importance.                             

 The transactions may be payments on a loan. It may also be a disbursement of funds from a loan or line of credit to a borrower to a third party such as the car dealer who originated the loan. These transactions are assembled into “blocks.” The blocks are loaded into the public record and become the “chain” of events to arrive at a balance. Each block is encrypted and contains control amounts to ensure that blocks cannot be altered.


How the blockchain is tamperproof:

  1. One of the advantages of blockchain is that it can’t be tampered with. Each block that is added onto the chain carries a hard, cryptographic reference to the previous block.
  2. That reference is part of the mathematical problem that needs to be solved in order to bring the following block into the network and the chain. Part of solving the puzzle involves working out random number called the “nonce.” The nonce, combined with the other data such as the transaction size, creates a digital fingerprint called a hash. This is encrypted, thus making it secure.
  3. Each hash is unique and must meet certain cryptographic conditions. Once this happens a block is completed and added to the chain. In order to tamper with this, each earlier block, of which there are over half a million, would require the cryptographic puzzles to be re-mined, which is impossible.

Cryptocurrencies also present a problem for governments to control their economies. Fiscal and monetary policies become harder to enforce since the new currencies are outside the traditional government institutions. The electronic movement of funds in a blockchain environment can bypass government institutions and their controls.

To meet modern business demands, IBM has joined with other companies to collaboratively develop an open source, production-ready, business blockchain framework, called Hyperledger Fabric™, one of the Hyperledger® projects hosted by The Linux Foundation®. Hyperledger Fabric supports distributed ledger solutions on permissioned networks for a wide range of industries. Its modular architecture maximizes the confidentiality, resilience, and flexibility of blockchain solutions.

The IBM Blockchain Platform is a blockchain software-as-a-service offering on the IBM Cloud. It's the only fully integrated, enterprise-ready blockchain platform designed to simplify the development, governance, and operation of a decentralized, multi-institution business network. The IBM Blockchain Platform accelerates collaboration in this decentralized world by leveraging open source technology from the Hyperledger Fabric framework and Hyperledger Composer tooling.

IBM Launches Food Trust Blockchain For Commercial Use: IBM's blockchain-based food traceability platform is now live for global use by retailers, wholesalers and suppliers across the food ecosystem. In September, retail giant Walmart announced that it would begin requiring its suppliers to implement the system to track bags of spinach and heads of lettuce. Other participants include multinational companies Nestle, Kroger, Tyson Foods, Kroger and Unilever.

IBM Code pattern provides an exemplar solution of IoT Asset Tracking via a Blockchain. Develop an IoT asset tracking app using Blockchain. Use an IoT asset tracking device to improve a supply chain by using Blockchain, IoT devices, and Node-RED. This pattern addresses the very real problem of the safe delivery of perishable goods (food, medicine, livestock, etc.) that are sensitive to environmental conditions during shipment. Every shipment of perishable goods has thresholds (refrigeration requirements, avoidance of shocks or vibration, etc. ) to protect the goods from contamination or damage. If the shipment exceeds these thresholds, the goods are damaged and might become a health hazard. By recording the details (Where, What, and When) of a shipment that experienced extreme conditions (thresholds specified in the smart contract) developers can verify that the goods were delivered successfully (or not). Then, payment is predicated on successful delivery. Tracking the conditions of the shipment across multiple participants using a blockchain provides verification and trust in these processes.

The primary difference between a blockchain and a database is centralization. While all records secured on a database are centralized, each participant on a blockchain has a secured copy of all records and all changes so each user can view the provenance of the data. The magic happens when there’s an inconsistency — since each participant maintains a copy of the records, blockchain technology will immediately identify and correct any unreliable information.

Trust the data : An interesting thing happens when competitors can trust the data being shared, it creates opportunities for more participants within the vertical to join the blockchain network and increase the visibility into the data. Expanding on the previous example, if Samsung and Apple are sharing technology and data on a blockchain network, and a transportation company joins the network, that data the transportation company wants to share on the network is immediately accessible to each of the other participants and then replicated to their records. Any time one of the participants makes a change, a new version of the record is validated by all participants. In this case, Apple could track the shipment from Samsung’s factory to Apple’s manufacturing center. Additionally, if a bank is added to the network,  payment to the bank and to each participant after a transaction can be triggered automatically when a condition in the data is met, and because this data is secured and validated by all the participants, no single participant can fraudulently, or accidentally, alter the data to meet the conditional trigger within the data.

Ultimately, common blockchain standards and protocols may play a key role in enabling interoperability among international payment systems. Meanwhile, new technology usually evolves faster than consensus standards development, so these efforts may have little impact on near-term blockchain deployments. Disruptive technologies such as Blockchain and the Internet of Things, will have a profound impact in the way we live and work

 References :