Sunday, May 24, 2020

RHEL8 - Next generation of Linux container Capabilities - podman, buildah, skopeo .....!

Container technology has created a lot of buzz in recent times. As people move from virtualization to containers, many enterprises have adopted containers for deploying cloud applications. Containers depend on key Linux kernel features such as control groups, namespaces, and SELinux to manage resources and isolate the applications running inside them. It is not just containers that generally work best with Linux, but also the tools used to manage their lifecycles. Today, Kubernetes is the leading container orchestration platform; it was built on Linux concepts and uses Linux tooling and application programming interfaces (APIs) to manage containers.

Red Hat OpenShift is a leading hybrid cloud, enterprise Kubernetes application platform, trusted by 1,700+ organizations. It aims to be easier to use than plain Kubernetes and includes a web interface for configuration. Red Hat developed container tools for single hosts and for clusters, standardizing on Kubernetes. Alternatives include managed Kubernetes services such as AWS EKS (Amazon Elastic Kubernetes Service) with Fargate, Azure AKS, and Google Cloud Platform’s GKE, as well as other orchestration and platform tools such as Apache Mesos, Docker Swarm, Nomad, OpenStack, Rancher, and Docker Compose.

For RHEL 8, the Docker package is not included and not supported by Red Hat. The docker package has been replaced by the new suite of tools in the Container Tools module, as listed below:

  •     The podman container engine replaced docker engine
  •     The buildah utility replaced docker build
  •     The skopeo utility replaced docker push

Red Hat Quay is a distributed, highly available container registry for the entire enterprise. Unlike other container tool implementations, the tools described here do not center around the monolithic Docker container engine and docker command. Instead, they provide a set of command-line tools that can operate without a container engine. These include:

  • podman - client tool for directly managing pods and container images (run, stop, start, ps, attach, exec, and so on)
  • buildah - client tool for building, pushing and signing container images
  • skopeo - client tool for copying, inspecting, deleting, and signing images
  • runc - container runtime client that provides container run and build features to podman and buildah with OCI format containers
  • crictl - For troubleshooting and working directly with CRI-O container engines
Because these tools are compatible with the Open Container Initiative (OCI), they can be used to manage the same Linux containers that are produced and managed by Docker and other OCI-compatible container engines. However, they are especially suited to run directly on Red Hat Enterprise Linux, in single-node use cases. Each tool in this scenario can be more lightweight and focused on a subset of features. And because no daemon process is needed to implement a container engine, these tools run without that overhead.

For a multi-node container platform, there is OpenShift. Instead of relying on the single-node, daemonless tools, OpenShift requires a daemon-based container engine such as the CRI-O container engine. Also, podman stores its data in the same directory structure used by Buildah, Skopeo, and CRI-O, which will allow podman to eventually work with containers being actively managed by CRI-O in OpenShift.

In a nutshell, you get Podman with RHEL in a single node use case (orchestrate yourself) and CRI-O as part of the highly automated OpenShift 4 software stack as shown in diagram.

What is CRI-O? 
CRI-O is an implementation of the Kubernetes CRI (Container Runtime Interface) that enables the use of OCI (Open Container Initiative) compatible runtimes. It allows Kubernetes to use any OCI-compliant runtime as the container runtime for running pods. Today it supports runc and Kata Containers as container runtimes, but in principle any OCI-conformant runtime can be plugged in. CRI-O supports OCI container images and can pull from any container registry. It is a lightweight alternative to using Docker, Moby, or rkt as the runtime for Kubernetes.

Why CRI-O ?
CRI-O is an open source, community-driven container engine. Its primary goal is to replace the Docker service as the container engine for Kubernetes implementations, such as OpenShift Container Platform. The CRI-O container engine provides a stable, more secure, and performant platform for running Open Container Initiative (OCI) compatible runtimes. You can use the CRI-O container engine to launch containers and pods by engaging OCI-compliant runtimes such as runc (the default OCI runtime) or Kata Containers.


CRI RUNTIMES

CRI-O is not supported as a stand-alone container engine. You must use CRI-O as a container engine for a Kubernetes installation, such as OpenShift Container Platform. To run containers without Kubernetes or OpenShift Container Platform, use podman.

CRI-O’s purpose is to be the container engine that implements the Kubernetes Container Runtime Interface (CRI) for OpenShift Container Platform and Kubernetes, replacing the Docker service. The scope of CRI-O is tied to the CRI, which extracted and standardized exactly what a Kubernetes service (kubelet) needs from its container engine. There is little need for direct command-line contact with CRI-O; a set of container-related command-line tools (crictl, runc, podman, buildah, skopeo) provides full access to CRI-O for testing and monitoring.

Some Docker features are included in other tools instead of in CRI-O. For example, podman offers command-line compatibility with many docker command features and extends those features to managing pods as well. No container engine is needed to run containers or pods with podman. Features for building, pushing, and signing container images, which are also not required in a container engine, are available in the buildah command.

Kubernetes and CRI-O process
The following are the components of CRI-O :
  • OCI compatible runtime – Default is runc; other OCI-compliant runtimes such as Kata Containers are supported as well.
  • containers/storage – Library used for managing layers and creating root file-systems for the containers in a pod.
  • containers/image – Library used for pulling images from registries.
  • networking (CNI) – Used for setting up networking for the pods. Flannel, Weave and OpenShift-SDN CNI plugins have been tested.
  • container monitoring (conmon) – Utility within CRI-O that is used to monitor the containers.
  • security – Provided by several core Linux features, such as SELinux and namespaces.
Runtime in Kubernetes
where:
  • The OCI runtime works as the low-level runtime.
  • The high-level runtime provides inputs to the OCI runtime as per the OCI specs.

How do Podman, CRI-O and Kata Containers relate to this ecosystem?

An OCI runtime is relatively simple. You give it the root filesystem of the container and a JSON file describing core properties of the container, and the runtime spins up the container and connects it to an existing network using a pre-start hook.
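
To make the "root filesystem plus JSON file" idea concrete, here is a minimal sketch that assembles an OCI bundle by hand. The bundle path and the UBI image are assumptions chosen for illustration (they are not from the original post), and the commands only run when podman and runc are installed.

```shell
# Hypothetical bundle location and image, for illustration only.
BUNDLE="/tmp/oci-bundle"
IMAGE="registry.access.redhat.com/ubi8/ubi"

if command -v podman >/dev/null 2>&1 && command -v runc >/dev/null 2>&1; then
    mkdir -p "$BUNDLE/rootfs"

    # Export the filesystem of a (never started) container into rootfs/.
    ctr=$(podman create "$IMAGE")
    podman export "$ctr" | tar -C "$BUNDLE/rootfs" -xf -
    podman rm "$ctr"

    # Generate a default config.json describing the container.
    (cd "$BUNDLE" && runc spec)
    ls "$BUNDLE"

    # Starting it is then a single call (typically as root, with a terminal):
    #   cd "$BUNDLE" && runc run demo-container
fi
```

The bundle directory then holds exactly the two inputs described above: rootfs/ and config.json.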

The actions listed below are the job of a high-level container runtime. On top of this, the high-level container runtime implements the CRI so that Kubernetes has an easy way to drive the runtime.
  •     Actually creating the network of a container.
  •     Managing container images.
  •     Preparing the environment of a container.
  •     Managing local/persistent storage.
runc is the default for most tools such as Docker and Podman.

What CRI-O isn’t:

Building images, for example, is out of scope for CRI-O and that’s left to tools like Docker’s build command, Buildah, or OpenShift’s Source-to-Image (S2I). Once an image is built, CRI-O will happily consume it, but the building of images is left to other tools.
What is Podman?
Podman is a daemonless container engine, developed by Red Hat, for developing, managing, and running OCI containers on your Linux system. Its engineers have paid special attention to using the same nomenclature as Docker when executing Podman commands. Containers can be run either as root or in rootless mode. It is a replacement for Docker for local development of containerized applications. Most Podman commands map one-to-one to Docker commands, including their arguments, so you could alias docker to podman and hardly notice that a completely different tool is managing your local containers. The Podman approach is simply to interact directly with the image registry, with the container and image storage, and with the Linux kernel through the runc container runtime process (not a daemon). Podman lets you run the familiar Docker commands without the daemon dependency.
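
As a small sketch of the "alias docker with podman" idea: in scripts, shell aliases are not expanded by default, so this example uses a wrapper function instead. The function is an illustration, not something shipped with Podman.

```shell
# Route every `docker ...` invocation in this shell to podman.
docker() {
    podman "$@"
}

# Only exercise the wrapper when podman is actually installed.
if command -v podman >/dev/null 2>&1; then
    docker version      # really runs: podman version
    docker images       # really runs: podman images
fi
```

In an interactive shell, putting alias docker=podman in ~/.bashrc has the same effect.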


Podman workflow

One of the core features of Podman is its focus on security. There is no daemon involved in using Podman. It uses the traditional fork-exec model instead and makes heavy use of user namespaces and network namespaces. As a result, Podman is a bit more isolated and in general more secure to use than Docker. You can even be root in a container without granting the container or Podman any root privileges on the host, and a user in a container won't be able to do any root-level tasks on the host machine. Rootless Podman and Buildah can do most things people want to do with containers, but there are times when root is still required. The nicest feature is running Podman and containers as a non-root user. This means you never have to give a user root privileges on the host, whereas in the client/server model (which Docker employs) you must open a socket to a privileged daemon running as root to launch containers. There you are at the mercy of the security mechanisms implemented in the daemon rather than those implemented in the host operating system, which is a dangerous proposition.

How containers run with container Engine ?
Podman can now ease the transition to Kubernetes and CRI-O :
 
On a basic level, Kubernetes is often viewed as the application that runs your containers, but Kubernetes is really a large bundle of utilities and APIs that describe how a group of microservices running in containers on a group of servers can coordinate, work together, and share services and resources. Kubernetes only supplies the APIs for orchestration, scheduling, and resource management. To have a complete container orchestration platform, you’ll need the OS underneath, a container registry, container networking, container storage, logging and monitoring, and a way to integrate continuous integration/continuous delivery (CI/CD). Red Hat OpenShift is a supported Kubernetes distribution for cloud-native applications, with enterprise security in multi-cloud environments.
A group of seals is called a pod :) Podman manages pods. The pod concept was introduced by Kubernetes, and Podman pods are similar to the Kubernetes definition. Podman can now capture the YAML description of local pods and containers and then help users transition to a more sophisticated orchestration environment like Kubernetes. Consider this developer and user workflow:
  • Create containers/pods locally using Podman on the command line.
  • Verify these containers/pods locally or in a localized container runtime (on a different physical machine).
  • Snapshot the container and pod descriptions using Podman and help users re-create them in Kubernetes.
  • Users add sophistication and orchestration (where Podman cannot) to the snapshot descriptions and leverage advanced functions of Kubernetes.
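
The workflow above can be sketched roughly as follows. The pod name, container name, image, and output file are hypothetical choices for illustration, and the final step is left as a comment because it needs kubectl and access to a cluster.

```shell
# Hypothetical names, for illustration only.
POD_NAME="demo-pod"
IMAGE="registry.access.redhat.com/ubi8/ubi"
KUBE_YAML="demo-pod.yaml"

if command -v podman >/dev/null 2>&1; then
    # 1. Create a pod and a container inside it, locally.
    podman pod create --name "$POD_NAME"
    podman run -d --pod "$POD_NAME" --name demo-app "$IMAGE" sleep 3600

    # 2. Verify the pod and its containers locally.
    podman pod ps
    podman ps --pod

    # 3. Snapshot the pod description as Kubernetes YAML.
    podman generate kube "$POD_NAME" > "$KUBE_YAML"

    # 4. Re-create it in a cluster and refine the YAML from there:
    #    kubectl create -f "$KUBE_YAML"
fi
```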
How containers run in kubernetes cluster?

This container stack within Red Hat Enterprise Linux and Red Hat Enterprise Linux CoreOS serves as part of the foundation for OpenShift. As can be seen in the drawing below, the CRI-O stack in OpenShift shares many of its underlying components with Podman. This allows Red Hat engineers to leverage knowledge gained in experiments conducted in Podman for new capabilities in OpenShift.

Pod-Architecture

Every Podman pod includes an “infra” container. This container does nothing but sleep. Its purpose is to hold the namespaces associated with the pod and allow podman to connect other containers to the pod. This allows you to start and stop containers within the pod while the pod stays running, whereas if the primary container controlled the pod, this would not be possible. Most of the attributes that make up the pod are actually assigned to the “infra” container: port bindings, cgroup-parent values, and kernel namespaces. This is critical to understand, because once the pod is created these attributes are assigned to the “infra” container and cannot be changed.

In the above diagram, notice the box above each container: conmon, the container monitor. It is a small C program whose job is to watch the primary process of the container and, if the container dies, save the exit code. It also holds open the tty of the container so that it can be attached to later. This is what allows podman to run in detached mode (backgrounded): podman can exit while conmon continues to run. Each container has its own instance of conmon.
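
One way to see the infra container and conmon on a live system is sketched below. The pod name, port mapping, and image are assumptions made for the example; since port bindings belong to the infra container, they are given when the pod is created.

```shell
# Hypothetical pod name and image, for illustration only.
INFRA_POD="infra-demo"
IMAGE="registry.access.redhat.com/ubi8/ubi"

if command -v podman >/dev/null 2>&1; then
    # Port bindings are fixed at pod creation, on the infra container.
    podman pod create --name "$INFRA_POD" -p 8080:80

    # The infra container is listed alongside regular containers.
    podman ps -a --pod

    # Add a workload container to the pod.
    podman run -d --pod "$INFRA_POD" "$IMAGE" sleep 600

    # Each running container should have its own conmon process.
    ps -eo pid,args | grep '[c]onmon'
fi
```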


Buildah: The buildah command allows you to build container images either from the command line or using Dockerfiles. These images can then be pushed to any container registry and can be used by any container engine, including Podman, CRI-O, and Docker. Buildah specializes in building OCI images. Buildah’s commands replicate all of the commands that are found in a Dockerfile. Buildah’s goal is also to provide a lower-level coreutils interface for building container images, allowing people to build containers without requiring a Dockerfile. Its other goal is to allow you to use other scripting languages to build container images without requiring a daemon.

The buildah command can be used as a separate command, but it is incorporated into other tools as well. For example, the podman build command uses buildah code to build container images. Buildah is also often used to securely build containers while running inside a locked-down container managed by a tool like Podman, OpenShift/Kubernetes, or Docker. Buildah allows you to have a Kubernetes cluster without any Docker daemon for both runtime and builds.

So, when should you use Buildah and when Podman? With Podman you can run, build (it calls Buildah under the covers for this), modify, and troubleshoot containers in your Kubernetes cluster. With the two projects together, you have a well-rounded solution for your OCI container image and container needs. Buildah and Podman are easily installable via yum install buildah podman.

A quick and easy way to summarize the difference between the two projects is that the buildah run command emulates the RUN command in a Dockerfile, while the podman run command emulates the docker run command in functionality. Buildah is an efficient way to create OCI images, while Podman allows you to manage and maintain those images and containers in a production environment using familiar container CLI commands. Together they form a strong foundation to support your OCI container image and container needs.
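
To illustrate the split, here is a minimal sketch that builds an image with buildah commands instead of a Dockerfile and then runs it with podman. The base image and the image name localhost/hello-buildah are placeholders picked for the example.

```shell
# Hypothetical image names, for illustration only.
BASE_IMAGE="registry.access.redhat.com/ubi8/ubi"
NEW_IMAGE="localhost/hello-buildah"

if command -v buildah >/dev/null 2>&1; then
    # Equivalent of FROM: create a working container from the base image.
    ctr=$(buildah from "$BASE_IMAGE")

    # Equivalent of RUN: execute a command inside the working container.
    buildah run "$ctr" -- sh -c 'echo "hello from buildah" > /hello.txt'

    # Equivalent of CMD: set the default command of the image.
    buildah config --cmd "cat /hello.txt" "$ctr"

    # Commit the working container to an image, then clean up.
    buildah commit "$ctr" "$NEW_IMAGE"
    buildah rm "$ctr"

    # podman run plays the role of docker run for the finished image.
    if command -v podman >/dev/null 2>&1; then
        podman run --rm "$NEW_IMAGE"
    fi
fi
```

Because each step is an ordinary command, the build can be driven from any scripting language, which is the point made above about not needing a Dockerfile.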

skopeo: The skopeo command is a tool for copying container images between different types of container storage. It can copy images from one container registry to another, and to and from a host, as well as to other container environments and registries. Skopeo can inspect images in container image registries, get images and image layers, and use signatures to create and verify images.
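
A short sketch of the two most common skopeo operations follows. The destination directory and the registry.example.com target are placeholders, not real locations from the original post.

```shell
# Source image and a hypothetical local destination directory.
SRC="docker://registry.access.redhat.com/ubi8/ubi:latest"
DEST_DIR="/tmp/ubi8-image"

if command -v skopeo >/dev/null 2>&1; then
    # Inspect image metadata in the registry without pulling the image.
    skopeo inspect "$SRC"

    # Copy the image from the registry into a plain directory on the host.
    mkdir -p "$DEST_DIR"
    skopeo copy "$SRC" "dir:$DEST_DIR"

    # Registry-to-registry copy (placeholder target, needs credentials):
    #   skopeo copy "$SRC" docker://registry.example.com/mirror/ubi:latest
fi
```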

Running containers as root or rootless :

Running the container tools such as podman, skopeo, or buildah as a user with superuser privilege (root user) is the best way to ensure that your containers have full access to any feature available on your system. However, with the feature called "Rootless Containers," generally available as of RHEL 8.1, you can work with containers as a regular user.

Although container engines, such as Docker, let you run docker commands as a regular (non-root) user, the docker daemon that carries out those requests runs as root. So, effectively, regular users can make requests through their containers that harm the system, without there being clarity about who made those requests. By setting up rootless container users, system administrators limit potentially damaging container activities from regular users, while still allowing those users to safely run many container features under their own accounts.
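
As a rough check of a rootless setup, the sketch below looks at the subordinate ID ranges of the current user and at the user-namespace mapping podman applies. It assumes a regular user account with entries in /etc/subuid and /etc/subgid; the UBI image is an arbitrary choice.

```shell
CURRENT_UID=$(id -u)
IMAGE="registry.access.redhat.com/ubi8/ubi"   # assumed image, for illustration

if [ "$CURRENT_UID" -ne 0 ] && command -v podman >/dev/null 2>&1; then
    # Subordinate UID/GID ranges that rootless containers may use.
    grep "^$(id -un):" /etc/subuid /etc/subgid

    # How IDs inside the user namespace map to IDs on the host.
    podman unshare cat /proc/self/uid_map

    # The process reports root inside the container while running
    # under the unprivileged account on the host.
    podman run --rm "$IMAGE" id
fi
```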
Also, note that Docker is a daemon-based container engine that lets us deploy applications inside containers, as shown in the Docker workflow diagram. With the release of RHEL 8 and CentOS 8, the docker package has been removed from the default package repositories and replaced with podman and buildah. If you are comfortable with Docker, deploy most of your applications inside Docker containers, and do not want to switch to podman, you can still install and use the community version of Docker on CentOS 8 and RHEL 8 systems by using the official Docker repository for CentOS 7/RHEL 7, which is compatible.
Docker workflow

NOTE: Technology Preview features provide early access to upcoming product innovations, enabling you to test functionality and provide feedback during the development process. RHEL 8.2 provides access to technology previews of containerized versions of Buildah, a tool for building container images that comply with the Open Container Initiative (OCI) specification, and Skopeo, a tool that facilitates the movement of container images. Red Hat is also adding Udica, a tool that makes it easier to create customized, container-centric SELinux security policies that reduce the risk that a process might “break out” of a container. RHEL 8.2 also introduces enhancements to the Red Hat Universal Base Image, which now supports OpenJDK and .NET 3.0, in addition to making it easier to access the source code associated with a given image via a single command. The release also adds management and monitoring capabilities via updates to Red Hat Insights, which make it easier to define and monitor policies created by the IT organization and to reduce drift from the baselines the IT team initially defined.
----------------------------------------------------------------------------------------------------------------------------------
Podman installation on RHEL and small demo to illustrate with DB application:
Step 1: yum -y install podman
This command installs Podman along with its dependencies: atomic-registries, runc, skopeo-containers, and SELinux policies. Verify the installation as shown below:
[root@IBMPOWER_sachin]# rpm -qa | grep podman
podman-1.6.4-18.el7_8.x86_64
[root@IBMPOWER_sachin]# rpm -qa | grep skopeo
skopeo-0.1.40-7.el7_8.x86_64
[root@IBMPOWER_sachin]# rpm -qa | grep runc
runc-1.0.0-67.rc10.el7_8.x86_64
[root@IBMPOWER_sachin]#

Step 2: Use the command line to create and run a RHEL container
[root@IBMPOWER_sachin script]# podman run -it rhel sh
Trying to pull registry.access.redhat.com/rhel...
Getting image source signatures
Copying blob feaa73091cc9 done
Copying blob e20f387c7bf5 done
Copying config 1a9b6d0a58 done
Writing manifest to image destination
Storing signatures
sh-4.2#

[root@IBMPOWER_sachin ~]# podman images
REPOSITORY                        TAG      IMAGE ID       CREATED       SIZE
registry.access.redhat.com/rhel   latest   1a9b6d0a58f8   2 weeks ago   215 MB
[root@IBMPOWER_sachin ~]#

Step 3 : Install a containerized service for setting up a MariaDB database :
Run a persistent MariaDB 10.2 container with some custom variables, keeping its data on the host so that it persists.
[root@IBMPOWER_sachin~]# podman pull registry.access.redhat.com/rhscl/mariadb-102-rhel7
Trying to pull registry.access.redhat.com/rhscl/mariadb-102-rhel7...
Getting image source signatures
Copying blob 8574a8f8c7e5 done
Copying blob f60299098adf done
Copying blob 82a8f4ea76cb done
Copying blob a3ac36470b00 done
Copying config 66a314da15 done
Writing manifest to image destination
Storing signatures
66a314da15d608d89f7b589f6668f9bc0c2fa814ec9c690481a7a057206338bd
[root@IBMPOWER_sachin ~]#
[root@IBMPOWER_sachin ~]# podman images
REPOSITORY                                           TAG      IMAGE ID       CREATED       SIZE
registry.access.redhat.com/rhscl/mariadb-102-rhel7   latest   66a314da15d6   11 days ago   453 MB
registry.access.redhat.com/rhel                      latest   1a9b6d0a58f8   2 weeks ago   215 MB
[root@IBMPOWER_sachin ~]#

After you pull an image to your local system and before you run it, it is a good idea to investigate that image. Reasons for investigating an image before you run it include:
  •  Understanding what the image does
  •  Checking what software is inside the image
Example: Get information about the user ID running inside the container, the exposed ports, and the persistent volume location to attach, as shown here:
podman inspect registry.access.redhat.com/rhscl/mariadb-102-rhel7  | grep User
podman inspect registry.access.redhat.com/rhscl/mariadb-102-rhel7 | grep -A1 ExposedPorts
podman inspect registry.access.redhat.com/rhscl/mariadb-102-rhel7 | grep -A1 Volume
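
As an alternative to grep, podman inspect accepts a Go template through --format. The sketch below assumes the image metadata exposes these values under .Config (User, ExposedPorts, Volumes); if a field is missing in your Podman version, fall back to the grep form above.

```shell
IMAGE="registry.access.redhat.com/rhscl/mariadb-102-rhel7"

if command -v podman >/dev/null 2>&1; then
    # Field names under .Config are an assumption about the inspect output.
    podman inspect --format 'User:    {{.Config.User}}' "$IMAGE"
    podman inspect --format 'Ports:   {{.Config.ExposedPorts}}' "$IMAGE"
    podman inspect --format 'Volumes: {{.Config.Volumes}}' "$IMAGE"
fi
```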

 

Step 4 : Set up a folder that will handle MariaDB’s data once we start our container:
[root@IBMPOWER_sachin ~]# mkdir /root/mysql-data
[root@IBMPOWER_sachin ~]# chown 27:27 /root/mysql-data
Step 5: Run the container
[root@IBMPOWER_sachin ~]# podman run -d -v /root/mysql-data:/var/lib/mysql/data:Z -e MYSQL_USER=user -e MYSQL_PASSWORD=pass -e MYSQL_DATABASE=db -p 3306:3306 registry.access.redhat.com/rhscl/mariadb-102-rhel7
fd2d30f8ec72734a2eee100f89f35574739c7a6a30281be77998de466635b3b0
[root@IBMPOWER_sachin ~]# podman container list
CONTAINER ID  IMAGE                                                      COMMAND     CREATED        STATUS            PORTS                   NAMES
fd2d30f8ec72  registry.access.redhat.com/rhscl/mariadb-102-rhel7:latest  run-mysqld  9 seconds ago  Up 9 seconds ago  0.0.0.0:3306->3306/tcp  wizardly_jang
[root@IBMPOWER_sachin ~]#
Step 6: Check the logs
[root@ ]# podman logs fd2d30f8ec72 | head
=> sourcing 20-validate-variables.sh ...
=> sourcing 25-validate-replication-variables.sh ...
=> sourcing 30-base-config.sh ...
---> 11:03:27     Processing basic MySQL configuration files ...
=> sourcing 60-replication-config.sh ...
=> sourcing 70-s2i-config.sh ...
---> 11:03:27     Processing additional arbitrary  MySQL configuration provided by s2i ...
=> sourcing 40-paas.cnf ...
=> sourcing 50-my-tuning.cnf ...
---> 11:03:27     Initializing database ...
Step 7: The container started and initialized its database. Let's create a table and check:
[root@IBMPOWER_sachin ~]# podman exec -it fd2d30f8ec72 /bin/bash
bash-4.2$ mysql --user=user --password=pass -h 127.0.0.1 -P 3306 -t
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.2.22-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| db                 |
| information_schema |
| test               |
+--------------------+
3 rows in set (0.00 sec)

MariaDB [(none)]> use test;
Database changed
MariaDB [test]> show tables;
Empty set (0.00 sec)

MariaDB [test]> CREATE TABLE hpc_team (username VARCHAR(20), date DATETIME);
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> INSERT INTO hpc_team (username, date) VALUES ('Aboorva', Now());
Query OK, 1 row affected (0.00 sec)

MariaDB [test]> INSERT INTO hpc_team (username, date) VALUES ('Nysal', Now());
Query OK, 1 row affected (0.00 sec)

MariaDB [test]> INSERT INTO hpc_team (username, date) VALUES ('Sachin', Now());
Query OK, 1 row affected (0.00 sec)

MariaDB [test]> select * from hpc_team;
+----------+---------------------+
| username | date                |
+----------+---------------------+
| Aboorva  | 2020-05-26 11:12:41 |
| Nysal    | 2020-05-26 11:12:55 |
| Sachin   | 2020-05-26 11:13:08 |
+----------+---------------------+
3 rows in set (0.00 sec)

MariaDB [test]> quit
Bye
bash-4.2$
bash-4.2$ ls
aria_log.00000001  db                ib_buffer_pool  ib_logfile1  ibtmp1             mysql               performance_schema  test
aria_log_control   fd2d30f8ec72.pid  ib_logfile0     ibdata1      multi-master.info  mysql_upgrade_info  tc.log
bash-4.2$ cd test/
bash-4.2$ ls -alsrt
total 108
 4 drwxr-xr-x 6 mysql mysql  4096 May 26 11:03 ..
 4 -rw-rw---- 1 mysql mysql   483 May 26 11:12 hpc_team.frm
 4 drwx------ 2 mysql mysql  4096 May 26 11:12 .
96 -rw-rw---- 1 mysql mysql 98304 May 26 11:13 hpc_team.ibd
bash-4.2$
Step 8: Check DB folder from host machine :
[root@IBMPOWER_sachin mysql-data]# cd test/
[root@IBMPOWER_sachin test]# ls -alsrt
total 108
 4 drwxr-xr-x 6 27 27  4096 May 26 07:03 ..
 4 -rw-rw---- 1 27 27   483 May 26 07:12 hpc_team.frm
 4 drwx------ 2 27 27  4096 May 26 07:12 .
96 -rw-rw---- 1 27 27 98304 May 26 07:13 hpc_team.ibd
[root@IBMPOWER_sachin test]#

Step 9: We can set up our systemd unit file for handling the database. We’ll use a unit file as shown below:
cat /etc/systemd/system/mariadb-service.service
[Unit]
Description=Custom MariaDB Podman Container
After=network.target
[Service]
Type=simple
TimeoutStartSec=5m
ExecStartPre=-/usr/bin/podman rm "mariadb-service"
ExecStart=/usr/bin/podman run --name mariadb-service -v /root/mysql-data:/var/lib/mysql/data:Z -e MYSQL_USER=user -e MYSQL_PASSWORD=pass -e MYSQL_DATABASE=db -p 3306:3306 --net host registry.access.redhat.com/rhscl/mariadb-102-rhel7
ExecReload=-/usr/bin/podman stop "mariadb-service"
ExecReload=-/usr/bin/podman rm "mariadb-service"
ExecStop=-/usr/bin/podman stop "mariadb-service"
Restart=always
RestartSec=30
[Install]
WantedBy=multi-user.target
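
Once the unit file is in place, activating it might look like the sketch below. It assumes the file path shown above and root privileges. The container started by hand in Step 5 still holds port 3306 and the data directory, so it is stopped first (fd2d30f8ec72 is the ID from this walkthrough; yours will differ).

```shell
UNIT="mariadb-service.service"
UNIT_FILE="/etc/systemd/system/$UNIT"

if [ "$(id -u)" -eq 0 ] && command -v systemctl >/dev/null 2>&1 && [ -f "$UNIT_FILE" ]; then
    # Stop the manually started container from Step 5, if still running:
    #   podman stop fd2d30f8ec72

    # Make systemd pick up the new unit file.
    systemctl daemon-reload

    # Start the service at boot and start it now.
    systemctl enable "$UNIT"
    systemctl start "$UNIT"

    # Check the service and the container it launched.
    systemctl status "$UNIT" --no-pager
    podman ps --filter name=mariadb-service
fi
```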