Saturday, September 7, 2024

CMA (Contiguous Memory Allocator) - Linux Memory Management Mechanism

CMA (Contiguous Memory Allocator) in Linux is a memory management mechanism designed to provide large contiguous blocks of physical memory for specific use cases, such as DMA (Direct Memory Access) operations or device drivers that require continuous memory regions. When discussing CMA, it's crucial to focus on physical memory rather than virtual memory, because the devices and hardware components that rely on direct access to memory work with physical addresses.

Purpose:

Some hardware, like certain device drivers or subsystems (e.g., graphics or networking devices), need large chunks of physically contiguous memory. However, as Linux uses a virtual memory system, physical memory can become fragmented over time. CMA ensures that these devices get the required memory, even if the system is fragmented.

How it Works:

   - CMA reserves a portion of memory at boot time, which can later be allocated in contiguous blocks when requested.

   - During normal system operation, this reserved memory isn't locked away; the kernel can use it for general-purpose allocations of movable pages. However, when a contiguous allocation request comes in, the pages occupying the requested range are migrated elsewhere (or reclaimed), and the freed range is given to the device or driver that requested it.
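
The sharing behaviour described above can be illustrated with a toy model. This is only a sketch: the class name `ToyCmaRegion` and its methods are hypothetical and do not correspond to kernel functions, and real page migration is far more involved.

```python
class ToyCmaRegion:
    """Toy model of a CMA region: a fixed run of page frames shared with movable pages."""

    def __init__(self, num_pages):
        # None = free, "movable" = borrowed by ordinary allocations,
        # "device" = handed out as part of a contiguous allocation.
        self.pages = [None] * num_pages
        self.migrated = 0  # count of movable pages that had to be moved out

    def borrow(self, index):
        """The rest of the system uses a free page in the region for movable data."""
        if self.pages[index] is None:
            self.pages[index] = "movable"

    def alloc_contiguous(self, count):
        """Claim `count` adjacent pages, migrating any movable pages in the way."""
        for start in range(len(self.pages) - count + 1):
            window = self.pages[start:start + count]
            if "device" in window:
                continue  # part of this window already belongs to a device
            self.migrated += window.count("movable")
            self.pages[start:start + count] = ["device"] * count
            return start
        return None  # no suitable run of pages is left


region = ToyCmaRegion(8)
for i in (1, 3, 6):
    region.borrow(i)            # normal allocations borrow pages from the region

first = region.alloc_contiguous(4)   # pages 0-3, two movable pages migrated
second = region.alloc_contiguous(4)  # pages 4-7, one movable page migrated
third = region.alloc_contiguous(1)   # region exhausted
print(first, second, third, region.migrated)
```

The point of the model is that borrowed pages never block a contiguous request; they only add migration work at allocation time.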

Device and Hardware Requirements:

   - Certain hardware components (like GPUs, network cards, or other peripherals) require contiguous blocks of physical memory for  DMA (Direct Memory Access) operations or other high-performance activities.

   - DMA is a process where devices communicate directly with the physical memory without the CPU's intervention. A device that cannot do scatter-gather I/O and is not behind an IOMMU needs its DMA buffer to be physically contiguous, meaning that the memory addresses are adjacent in physical memory.

   - Virtual memory is designed to make efficient use of available memory for software processes, but virtual memory can be fragmented and non-contiguous in physical memory. This is because virtual memory maps logical addresses to scattered physical memory locations.

CMA Guarantees Physical Contiguity:

 The key feature of CMA is that it reserves a contiguous region of physical memory that can be allocated on demand. Virtual memory, on the other hand, is not necessarily contiguous in physical terms.

   - Even though processes use virtual addresses (which are convenient for applications), the underlying devices or drivers that require DMA or large contiguous blocks of memory need physical addresses, and CMA ensures that this need is met.

Physical vs. Virtual Memory:

   - Physical Memory: Refers to the actual RAM installed in the system. It's where data is physically stored, and a buffer handed to such a device for DMA must occupy contiguous physical addresses.

   - Virtual Memory: Is an abstraction provided by the operating system that allows applications to use more memory than physically available. It is divided into pages, which can be scattered across different locations in physical memory.

For example, a 4 GB virtual address space could be mapped to non-contiguous physical memory chunks. However, if a device needs to access a block of memory directly through DMA, the physical memory it accesses must be contiguous.
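
To make the distinction concrete, here is a small sketch that checks whether a run of virtual pages happens to be backed by adjacent physical frames. The page table is a made-up dictionary and `is_physically_contiguous` is a hypothetical helper; it does not read real kernel page tables.

```python
def is_physically_contiguous(page_table, first_vpage, num_pages):
    """Return True if the given virtual pages map to adjacent physical frames."""
    frames = [page_table[first_vpage + i] for i in range(num_pages)]
    return all(b == a + 1 for a, b in zip(frames, frames[1:]))


# Hypothetical page table: virtual page number -> physical frame number.
page_table = {0: 907, 1: 12, 2: 455, 3: 456, 4: 457}

scattered = is_physically_contiguous(page_table, 0, 3)  # frames 907, 12, 455
adjacent = is_physically_contiguous(page_table, 2, 3)   # frames 455, 456, 457
print(scattered, adjacent)
```

Virtual pages 0 to 2 are consecutive from the application's point of view but scattered in RAM, so only the second range would be usable as a single DMA buffer by a device that needs contiguous memory.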

Virtual Memory Can’t Be Used Directly for DMA or Hardware Access: 

   - Virtual memory is designed for software abstraction and can be fragmented across the physical memory. This is fine for applications but unsuitable for devices that require access to memory in a sequential physical block.

   - When devices perform DMA, they must work with real physical addresses. Therefore, allocating memory in virtual space doesn’t meet the requirement unless the physical memory behind those virtual addresses is also contiguous, which is why CMA allocates from physical memory directly.

How CMA Works with Physical Memory:

   - CMA reserves a chunk of  physical memory at boot time that can later be allocated in contiguous blocks when requested by the device drivers. It does so to ensure that even when the system’s physical memory becomes fragmented, there will still be a large contiguous block of physical memory available for hardware components.

   - Although this memory is allocated from the physical address space, it can be used by virtual memory applications when it's not being actively used by a device.

CMA deals with physical memory because certain devices and hardware require contiguous blocks of physical RAM for tasks like DMA. Virtual memory, which can be fragmented across different physical locations, doesn't meet the needs of these operations. Physical memory contiguity ensures that devices can perform high-speed data transfers efficiently, whereas virtual memory, though beneficial for applications, cannot guarantee this contiguity.

CMA Allocation and Range:

   - CMA typically reserves a contiguous memory range at boot time based on system configuration or kernel parameters. The location and size of the CMA region are either:

     - Automatically determined by the kernel based on memory requirements.

     - Specified manually using kernel boot parameters.

   - CMA memory is reserved in the physical memory address space and is set aside as a separate region from the rest of the memory.

Kernel Boot Parameters:

   - The size and location of the CMA region can be controlled via the `cma=` kernel boot parameter, whose documented syntax is `cma=nn[MG]@[start[MG][-end[MG]]]`:

     - `cma=nn[MG]`: Specifies the size of the CMA region. For example, `cma=512M` would reserve 512 MB for CMA.

     - `@start`: Optionally specifies the starting address of the CMA region in the physical memory.

     - `-end`: Optionally specifies the upper bound of the physical address range in which the region may be placed.

   Example:

   cma=256M@0x20000000-0x30000000

   This reserves 256 MB for CMA within the physical address range `0x20000000` to `0x30000000`.
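
As a rough illustration, the sketch below pulls the CMA size and optional address range out of a kernel command line string, assuming the `cma=nn[MG]@[start[MG][-end[MG]]]` form from the kernel's parameter documentation. The helpers `parse_size` and `parse_cma` are hypothetical names, and the parsing is simplified compared with what the kernel does.

```python
import re

UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert values such as '256M', '1G' or '0x20000000' to a byte count."""
    match = re.fullmatch(r"(0[xX][0-9a-fA-F]+|\d+)([KMGkmg]?)", text)
    if not match:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    base = 16 if number.lower().startswith("0x") else 10
    return int(number, base) * UNITS[unit.upper()]


def parse_cma(cmdline):
    """Return (size, start, end) in bytes, or None if no cma= option is present."""
    for token in cmdline.split():
        if token.startswith("cma="):
            value = token[len("cma="):]
            size_part, _, range_part = value.partition("@")
            start_part, _, end_part = range_part.partition("-")
            size = parse_size(size_part)
            start = parse_size(start_part) if start_part else None
            end = parse_size(end_part) if end_part else None
            return size, start, end
    return None


with_range = parse_cma("root=/dev/mapper/rhel-root cma=256M@0x20000000-0x30000000 quiet")
size_only = parse_cma("cma=512M")
absent = parse_cma("quiet")
print(with_range, size_only, absent)
```

On a running system the input string could come from reading `/proc/cmdline`.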

Where is CMA Allocated?:

   - The CMA region is reserved at boot time. Where it ends up depends on the architecture and configuration; when devices can address only a limited range, the region has to sit in memory they can reach (often the lower end of physical memory) so that DMA or other hardware requests can use it.

   - If no location is specified manually, the kernel chooses where to place the region.

Checking CMA Information in Linux:

   You can get details about CMA configuration by looking at certain files in the `/proc` or `/sys` filesystems:

/proc/meminfo: This file contains general memory information, including CMA. You can find the total size of the CMA reserved region under the entry `CmaTotal` and the CMA memory that is currently free under `CmaFree`.

     Example:

     CmaTotal:  262144 kB

     CmaFree:   131072 kB

/sys/kernel/debug/cma: If CMA debugging is enabled, this directory will provide detailed information about the CMA memory allocations.

Example to View CMA Memory: cat /proc/meminfo | grep Cma

Output might look like this

CmaTotal:         524288 kB

CmaFree:          512000 kB

This tells you the total reserved CMA memory (`CmaTotal`) and the currently available CMA memory (`CmaFree`).
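
The amount of CMA memory in use is not reported directly, but it follows from the two values. A minimal sketch, assuming the `/proc/meminfo` text format shown above (`cma_usage` is a hypothetical helper name):

```python
def cma_usage(meminfo_text):
    """Return CMA total, free and used sizes in kB from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        key = key.strip()
        if key in ("CmaTotal", "CmaFree"):
            values[key] = int(rest.split()[0])  # first field after the colon is the size in kB
    total = values["CmaTotal"]
    free = values["CmaFree"]
    return {"total_kb": total, "free_kb": free, "used_kb": total - free}


sample = """\
MemTotal:       16384000 kB
CmaTotal:         524288 kB
CmaFree:          512000 kB
"""

usage = cma_usage(sample)
print(usage)
```

On a real system the text could be obtained with `open("/proc/meminfo").read()`.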

Summary:

- CMA is allocated at boot and reserves contiguous memory blocks for devices needing such memory.

- It can be specified using boot parameters (size, start, and end address).

- The reserved memory is used by the system when no contiguous memory requests are made and freed when needed for such operations.

- You can check the allocation and usage through system files like `/proc/meminfo`.

========================FADUMP=========================================

Firmware assisted dump (fadump) is a dump capturing mechanism provided as a reliable alternative to kdump on IBM POWER systems. The fadump utility captures the vmcore file from a fully-reset system with PCI and I/O devices. This mechanism uses firmware to preserve memory regions during a crash and then reuses the kdump userspace scripts to save the vmcore file. The memory regions consist of all system memory contents, except the boot memory, system registers, and hardware Page Table Entries (PTEs).

The fadump mechanism offers improved reliability over the traditional dump type, by rebooting the partition and using a new kernel to dump the data from the previous kernel crash. 

README: /usr/share/doc/kexec-tools/fadump-howto.txt

In the Secure Boot environment, the GRUB2 boot loader allocates a boot memory region, known as the Real Mode Area (RMA). The RMA has a size of 512 MB, which is divided among the boot components and, if a component exceeds its size allocation, GRUB2 fails with an out-of-memory (OOM) error.

Options for Using fadump:

fadump=on:  This is the default setting for enabling fadump. It reserves memory from a special area called **CMA (Contiguous Memory Allocator)**. Think of this as a memory-saving technique that allows some of this reserved memory to still be used by other parts of the system during normal operation. The idea is to avoid wasting memory that would otherwise sit idle.

fadump=nocma: This option tells the system not to use the special CMA-backed memory for fadump. Instead, it reserves a portion of memory separately and completely, which might be useful if you want to capture more detailed information, like user-level data, during a crash. By not using CMA, this memory is reserved exclusively for fadump and isn't used for other tasks while the system is running.

fadump=on: Imagine you have a spare room (memory) in your house. Normally, you leave it empty just in case you need to store something later (for fadump). But with this setting, you let guests use the room for sleeping when you don’t need it. When something goes wrong (system crash), you ask them to leave so you can use it to store important things (dump data).

 fadump=nocma: Now, if you set the option to nocma, it's like keeping that spare room off-limits to guests at all times, so it's always ready for storing important stuff whenever you need it.

fadump=on (default): Allows the reserved memory to be used for other tasks when the system is working normally, saving memory.

fadump=nocma : Keeps the reserved memory off-limits to other tasks, ensuring that it's available for storing more detailed data during a crash.

On systems running SLES, the choice between `fadump=nocma` and `fadump=on` is made automatically, depending on the KDUMP_DUMPLEVEL setting. On RHEL-based systems, you can set fadump=on or fadump=nocma using the grubby command, followed by a reboot. Alternatively, you can add the option to the kernel command line in the "/etc/default/grub" file and run "grub2-mkconfig -o /boot/grub2/grub.cfg".

KDUMP_DUMPLEVEL determines how much information is captured in a system crash. If it’s set to exclude user pages, the system will automatically use `fadump=on` (the default behavior). But, if user pages are **included** in the dump, it will switch to `fadump=nocma`.
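
The selection rule described above can be sketched as follows. This is not the actual SLES script; it assumes that KDUMP_DUMPLEVEL is a makedumpfile dump level, in which the bit with value 8 means "exclude user data pages". The function name `fadump_option` is hypothetical.

```python
USER_DATA_BIT = 8  # makedumpfile dump-level bit that excludes user data pages


def fadump_option(dump_level):
    """Pick the fadump kernel option that matches a given dump level."""
    if dump_level & USER_DATA_BIT:
        # User pages are excluded from the dump, so CMA-backed memory is acceptable.
        return "fadump=on"
    # User pages are to be captured, so the reservation must not be shared via CMA.
    return "fadump=nocma"


print(fadump_option(31))  # everything excludable is excluded, including user pages
print(fadump_option(0))   # full dump, user pages included
print(fadump_option(7))   # zero and cache pages excluded, user pages still included
```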


Friday, September 8, 2023

Multipath setup in Linux

A multipath setup in Linux is a configuration that allows multiple physical paths (usually represented by multiple physical storage devices or network connections) to be used to access a single logical device or storage target. The primary goal of multipath is to enhance redundancy and fault tolerance while providing load balancing and improved performance. Multipath is commonly used in storage area networks (SANs) and environments where high availability and reliability are critical.

Here are the key components and concepts of a multipath setup in Linux:

Multipath Devices (Multipath): In a multipath setup, there is a logical device known as a multipath device (often referred to as a multipath or mpath). This logical device represents a single storage target, such as a disk or LUN (Logical Unit Number), even though it is accessible through multiple physical paths.

Physical Paths: Physical paths are the actual connections or channels through which the storage target is accessible. These paths can be physical SCSI buses, Fibre Channel links, iSCSI connections, or any other transport mechanism. Each path is associated with a unique identifier, typically called a World Wide Name (WWN), device name, or other similar identifiers.

Path Management: The multipath software in Linux (such as multipathd and multipath-tools) manages the physical paths and ensures that they are utilized effectively. It monitors the status of the paths and makes decisions about which path to use for I/O operations. It can also detect and respond to path failures or changes in path availability.

Load Balancing: Multipath configurations often include load balancing mechanisms that distribute I/O requests across the available paths. This helps improve performance by distributing the workload and preventing one path from becoming a bottleneck.

Redundancy and Failover: Multipath setups provide redundancy and failover capabilities. If one path fails due to hardware or network issues, the system can automatically switch to an alternate path without interrupting I/O operations. This enhances system reliability and availability.

Device Mapper (DM-Multipath): In Linux, the Device Mapper subsystem is commonly used to manage multipath devices. DM-Multipath is a kernel component that works with the multipath software to create and manage multipath devices. It presents a single device to the operating system, which is actually a combination of the multiple physical paths.

Configuration Files: To set up multipath in Linux, administrators configure multipath settings using configuration files. The main configuration file is typically located at /etc/multipath.conf (or a similar location) and defines the behavior of the multipath devices.

Multipath Tools: The multipath tools package (multipath-tools or similar) includes utilities such as multipath and multipathd that are used to manage and configure multipath devices. These tools help monitor path status, configure load balancing policies, and perform other administrative tasks related to multipathing.
-----------------------------------

For example: this system is using multipath and LVM for storage management to provide redundancy and flexibility in managing storage devices. Both sda and sdb are part of the multipath configuration, and their partitions are managed using LVM. This setup is commonly used in enterprise environments for high availability and fault tolerance.


There are three partitions on the multipath device mpatha (which represents both sda and sdb due to multipathing). 


The lsblk command is used to list block devices on this system, displaying information about disks, partitions, and their relationships. Let's break down the lsblk output:
----------------------------------------------------------------------------------------

#lsblk
NAME                            MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda                               8:0    0   80G  0 disk
└─mpatha                        253:0    0   80G  0 mpath
  ├─mpatha1                     253:1    0    4M  0 part
  ├─mpatha2                     253:2    0    1G  0 part  /boot
  └─mpatha3                     253:3    0   79G  0 part
    ├─rhel_myhost-root 253:4    0 47.7G  0 lvm   /
    ├─rhel_myhost-swap 253:5    0    8G  0 lvm   [SWAP]
    └─rhel_myhost-home 253:6    0 23.3G  0 lvm   /home
sdb                               8:16   0   80G  0 disk
└─mpatha                        253:0    0   80G  0 mpath
  ├─mpatha1                     253:1    0    4M  0 part
  ├─mpatha2                     253:2    0    1G  0 part  /boot
  └─mpatha3                     253:3    0   79G  0 part
    ├─rhel_myhost-root 253:4    0 47.7G  0 lvm   /
    ├─rhel_myhost-swap 253:5    0    8G  0 lvm   [SWAP]
    └─rhel_myhost-home 253:6    0 23.3G  0 lvm   /home
#

Here's a breakdown of the information in each column:

NAME: This column shows the name of the block device.
MAJ:MIN: Major and minor device numbers that uniquely identify the device to the operating system.
RM: This indicates whether the device is removable (1 for yes, 0 for no).
SIZE: The size of the device, often in gigabytes (G) or megabytes (M).
RO: This indicates whether the device is read-only (1 for yes, 0 for no).
TYPE: The type of device, which can be "disk" for physical disks or "part" for partitions. In this case, you also see "mpath" and "lvm," which are related to storage management.
MOUNTPOINT: The mount point where the device is currently mounted. If it's not mounted, this field will be empty.

Now, let's interpret the information based on the provided output:

There are two disk devices: sda and sdb.
Both sda and sdb are paths to the same multipath device, mpatha, as indicated by the "mpath" type. lsblk prints the mpatha subtree once under each path, so the partitions and logical volumes appear twice even though each exists only once.
The multipath device mpatha has three partitions (mpatha1, mpatha2, and mpatha3); only mpatha3 is used in an LVM (Logical Volume Management) setup.
The /boot partition (mpatha2) is mounted once and is reachable through both sda and sdb; it contains the boot files.
The root (/) logical volume (rhel_myhost-root) is the root filesystem.
The swap logical volume (rhel_myhost-swap) is used for swap space; swap is activated rather than mounted, which is why lsblk shows [SWAP].
The /home logical volume (rhel_myhost-home) is mounted at /home and is used for user home directories.
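
To see at a glance which disks are paths to which multipath device, the tree output can be scanned for "disk" and "mpath" rows. The following is a sketch that assumes the default lsblk column layout shown above; `multipath_paths` is a hypothetical helper, and the sample is a shortened copy of the listing.

```python
def multipath_paths(lsblk_text):
    """Map each mpath device name to the list of disks (paths) that lead to it."""
    paths = {}
    current_disk = None
    for line in lsblk_text.splitlines():
        # Drop indentation and tree-drawing characters before splitting into columns.
        fields = line.lstrip(" ├└─│|`-").split()
        if len(fields) < 6 or fields[0] == "NAME":
            continue
        name, dev_type = fields[0], fields[5]
        if dev_type == "disk":
            current_disk = name
        elif dev_type == "mpath" and current_disk is not None:
            disks = paths.setdefault(name, [])
            if current_disk not in disks:
                disks.append(current_disk)
    return paths


sample = """\
NAME                   MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda                      8:0    0   80G  0 disk
└─mpatha               253:0    0   80G  0 mpath
  ├─mpatha2            253:2    0    1G  0 part  /boot
  └─mpatha3            253:3    0   79G  0 part
sdb                      8:16   0   80G  0 disk
└─mpatha               253:0    0   80G  0 mpath
"""

result = multipath_paths(sample)
print(result)
```

For the sample, the result shows a single multipath device, mpatha, reached through sda and sdb.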

Here's what each of these partitions is typically used for:

mpatha1: This partition is very small (only 4MB), and blkid reports no filesystem on it. A partition like this is typically reserved for the bootloader: on IBM POWER systems it is usually the PReP boot partition, which holds the GRUB core image as raw data. It's a common practice to allocate a small partition for the bootloader so that it is easily accessible and less likely to be affected by changes or issues in the rest of the disk.

mpatha2: This partition is mounted as /boot, and it contains the kernel and initial ramdisk files needed for booting the system. /boot typically holds the Linux kernel, GRUB configuration files, and other boot-related data. Having a separate /boot partition is a common practice, especially in systems that use LVM or other complex storage configurations. It ensures that essential boot files are easily accessible and are less prone to issues that might affect other partitions.

mpatha3: This partition appears to be the largest and is not directly mounted as part of the root filesystem (/). Instead, it seems to be part of an LVM (Logical Volume Management) setup. It is divided into multiple logical volumes (rhel_myhost-root, rhel_myhost-swap, and rhel_myhost-home) that are used for various purposes:

rhel_myhost-root: This is the root filesystem (/) where most of the operating system and software are installed.

rhel_myhost-swap: This logical volume is used as swap space, which backs virtual memory: the kernel can move inactive pages here when RAM runs low.

rhel_myhost-home: This logical volume is typically used for user home directories. User data and files are stored in the /home directory, which is often mounted on a separate filesystem to isolate user data from the root filesystem.

So, mpatha1 is likely used for bootloader files, mpatha2 is the /boot partition containing boot-related files, and mpatha3 represents an LVM setup with separate logical volumes for the root filesystem, swap space, and user home directories. This kind of partitioning and storage management allows for flexibility, scalability, and better system maintenance.

You can map the UUID specified in the /etc/fstab file to the corresponding device name, such as /dev/sda1, /dev/sdb1, or /dev/mpatha1. The UUID (Universally Unique Identifier) is a unique identifier assigned to each filesystem or partition and is a more reliable way to identify devices than device names, which can change if hardware configurations are altered.

To map a UUID to the corresponding device name, you can use the blkid command. 


[root@myhost ~]# blkid
/dev/mapper/rhel_myhost-root: UUID="XXXXXXXXXXXXXXX" BLOCK_SIZE="512" TYPE="xfs"
/dev/mapper/mpatha3: UUID="XXXXXXXXX" TYPE="LVM2_member" PARTUUID="XXXXXXXX"
/dev/sda: PTUUID="XXXXXXXX" PTTYPE="dos"
/dev/mapper/mpatha: PTUUID="abc" PTTYPE="dos"
/dev/sdb: PTUUID="XXXXXXX" PTTYPE="dos"
/dev/mapper/mpatha1: PARTUUID="abc-01"
/dev/mapper/mpatha2: UUID="XXXXXXXXXXXXXXXXXXXX" BLOCK_SIZE="512" TYPE="xfs" PARTUUID="XXXXX"
/dev/mapper/rhel_myhost-swap: UUID="abc123c" TYPE="swap"
/dev/mapper/rhel_myhost-home: UUID="xyz123" BLOCK_SIZE="512" TYPE="xfs"
[root@myhost ~]#
-----------------

[root@myhost ~]# cat /boot/grub2/device.map
# this device map was generated by anaconda
(hd0)      /dev/mapper/mpatha
[root@myhost ~]#


--------------


[root@myhost ~]# cat /etc/fstab

#
# /etc/fstab
/dev/mapper/rhel_myhost-root /                       xfs     defaults        0 0
UUID=XXXXXXXXXXXXXXXXXXXXXXX /boot                   xfs     defaults        0 0
/dev/mapper/rhel_myhost-home /home                   xfs     defaults        0 0
/dev/mapper/rhel_myhost-swap none                    swap    defaults        0 0
[root@myhost ~]#
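
Putting the two listings together, the UUID entries in /etc/fstab can be resolved to device names from blkid output. This is a sketch only: the helper names are hypothetical, and the UUID `1111-aaaa` in the sample is a made-up placeholder, not a value from this system.

```python
import shlex


def parse_blkid(blkid_text):
    """Return {device: {TAG: value}} from blkid's default output format."""
    devices = {}
    for line in blkid_text.splitlines():
        device, sep, tags = line.partition(": ")
        if not sep:
            continue
        # shlex removes the quotes, turning UUID="x" into UUID=x.
        devices[device] = dict(tag.split("=", 1) for tag in shlex.split(tags))
    return devices


def resolve_fstab(fstab_text, devices):
    """Return {mount point: device}, replacing UUID= specs using blkid data."""
    by_uuid = {tags["UUID"]: dev for dev, tags in devices.items() if "UUID" in tags}
    resolved = {}
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        spec, mount_point = line.split()[:2]
        if spec.startswith("UUID="):
            spec = by_uuid.get(spec[len("UUID="):], spec)
        resolved[mount_point] = spec
    return resolved


blkid_sample = """\
/dev/mapper/mpatha2: UUID="1111-aaaa" BLOCK_SIZE="512" TYPE="xfs" PARTUUID="abc-02"
/dev/mapper/rhel_myhost-root: UUID="2222-bbbb" BLOCK_SIZE="512" TYPE="xfs"
"""
fstab_sample = """\
# /etc/fstab
/dev/mapper/rhel_myhost-root /      xfs  defaults  0 0
UUID=1111-aaaa               /boot  xfs  defaults  0 0
"""

resolved = resolve_fstab(fstab_sample, parse_blkid(blkid_sample))
print(resolved)
```

For a single lookup on a live system, `blkid -U <uuid>` or `findfs UUID=<uuid>` prints the matching device directly.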

A multipath setup in Linux provides redundancy, load balancing, and fault tolerance for storage devices, ensuring that data remains accessible even if one path or connection fails. This technology is crucial in enterprise environments where continuous access to data is essential for operations.

----------------------------

The grub2-install command is a utility used in Linux to install the GRUB (Grand Unified Bootloader) bootloader onto a device, typically a hard disk or a partition.

The primary purpose of the grub2-install command is to install the GRUB bootloader on a specific device. You specify the target device as an argument to the command. 

For example :  grub2-install /dev/sda

In this example, the GRUB bootloader is installed on the MBR (Master Boot Record) of /dev/sda, which is typically the primary boot device.


Boot Device Configuration:
When GRUB is installed on a device, it configures the bootloader to locate and load the kernel and initial ramdisk (initrd) from the designated boot device or partition. It also stores configuration information, such as the location of the kernel and the root filesystem.

Device Map Configuration:
GRUB maintains a device map that associates BIOS drive numbers (e.g., (hd0), (hd1)) with actual device names (e.g., /dev/sda, /dev/sdb). The grub2-install command updates or creates this device map, ensuring that GRUB can correctly identify the boot device.

Bootloader Configuration File:
GRUB bootloader configurations are specified in the /boot/grub2/grub.cfg (or similar) file. This configuration file is automatically generated by GRUB utilities and scripts based on the system's configuration, such as the kernel and initrd locations, boot options, and menu entries.

Boot Menu:
GRUB provides a boot menu during system startup, allowing users to select from available kernels and boot options. The grub2-install command ensures that the necessary components for this boot menu are installed and configured correctly.

Updating GRUB Configuration:
Installing GRUB with grub2-install does not regenerate the boot menu configuration. After changes to the system's disk layout or partitioning scheme, rerun grub2-install so that the boot code points at the right device, and regenerate grub.cfg with grub2-mkconfig so that device names and paths are up to date.

EFI and UEFI Support:
The behavior of grub2-install can differ depending on whether the system uses BIOS or UEFI (Unified Extensible Firmware Interface) for booting. For UEFI systems, the grub2-install command installs the UEFI version of GRUB and configures it accordingly.

Additional Options:
The grub2-install command supports various options to specify installation details, such as the target architecture, firmware type (BIOS or UEFI), and more. You can use the --target, --boot-directory, and other options to customize the installation.

----------------------------------------------------------

The PReP boot partition is a specialized partition used on PowerPC systems that follow the PReP boot standard to store firmware-specific bootloader and boot-related files. On the other hand, the /boot partition is a common convention on Linux systems, including PowerPC systems, to store kernel, initramfs, and bootloader configuration files, but it is not tied to any specific firmware standard and is used across various hardware architectures.

PReP Boot Partition:
The PReP boot partition is a specific partition type used in the context of the PReP boot standard, which is a firmware standard for booting PowerPC-based systems.
Its primary purpose is to store the bootloader and boot-related information required to initiate the boot process on PowerPC systems adhering to the PReP standard.
It typically contains the bootloader image (for example, the GRUB core image) that the firmware, such as Open Firmware or other IEEE 1275-compliant firmware, loads to start the system.

/boot Partition:
The /boot partition is a common convention used on various Linux distributions, including those running on PowerPC systems.
Its purpose is to store the kernel, initramfs (initial RAM disk), bootloader configuration files, and other files required for the early stages of the boot process.
The /boot partition is part of the Linux filesystem structure and is used by the Linux bootloader (e.g., GRUB) to locate and load the kernel and initramfs during the boot process.

Firmware Dependency:

PReP Boot Partition:
The PReP boot partition's usage is closely tied to the firmware standard it follows, such as Open Firmware or IEEE 1275-compliant firmware. The firmware is responsible for loading the bootloader from this partition.
It may also contain firmware-specific files and configurations.

/boot Partition:
The /boot partition is not firmware-dependent and is part of the Linux filesystem. It is managed by the Linux bootloader (e.g., GRUB) and the operating system itself.
The bootloader reads the kernel and initramfs from the /boot partition during the boot process, and this partition is independent of the system's firmware.

Common Usage:

PReP Boot Partition:
Originally defined for older PowerPC-based systems that adhere to the PReP standard; the same partition type is still used on current IBM POWER systems running Linux.
It's specific to the boot process defined by the firmware standard used on these systems.

/boot Partition:
Widely used on various Linux distributions, including those on PowerPC systems.
It's part of the standard Linux filesystem structure and is used on many different hardware platforms.


Monday, September 4, 2023

Openstack Framework and components

OpenStack is an open-source cloud computing platform that provides a set of software tools and components for building and managing public and private clouds. It enables organizations to create and manage cloud infrastructure services, including compute, storage, networking, and more. OpenStack is designed to be highly flexible, scalable, and customizable, making it a popular choice for building cloud solutions.

OpenStack was initially launched in July 2010 as a joint project by Rackspace Hosting and NASA. Since then, it has grown into a vibrant open-source community with contributions from a wide range of organizations and individuals. Here's a brief history of OpenStack and an overview of its main components:

OpenStack History:

Launch (2010): OpenStack was publicly launched in July 2010 with the release of the first two core projects, Nova (compute) and Swift (object storage). It was created to address the need for an open and flexible cloud computing platform.

Expanding Community (2011-2012): The OpenStack community quickly expanded, with numerous companies joining the project. The community released new versions of OpenStack, including Diablo, Essex, and Folsom, each with additional core and supporting projects.

Foundation Establishment (2012): In September 2012, the OpenStack Foundation was established to oversee the project's development and ensure its long-term governance as an open-source project.

Maturing Ecosystem (2013-2015): OpenStack continued to evolve, with new releases like Grizzly, Havana, Icehouse, and Juno. During this period, more projects were added to the ecosystem, covering areas such as networking (Neutron), block storage (Cinder), and identity (Keystone).

Enterprise Adoption (2016-2017): OpenStack gained significant traction among enterprises and service providers. Projects like Heat (orchestration) and Magnum (containers) were introduced to support cloud automation and container orchestration.

Continued Growth (2018-Present): OpenStack has continued to grow and evolve, with new projects and features being added regularly. The community releases new versions of OpenStack every six months, with each version introducing enhancements and improvements.

OpenStack Releases: The release currently running is "Xena". Austin was the first OpenStack release and is obsolete now. For more details, check the links below:

https://docs.openstack.org/puppet-openstack-guide/latest/install/releases.html

https://releases.openstack.org/

Austin (2010): The first official release of OpenStack, code-named "Austin."
Bexar (2011): The second release, code-named "Bexar."
Cactus (2011): The third release, code-named "Cactus."
Diablo (2011): The fourth release, code-named "Diablo."
Essex (2012): The fifth release, code-named "Essex."
Folsom (2012): The sixth release, code-named "Folsom."
Grizzly (2013): The seventh release, code-named "Grizzly."
Havana (2013): The eighth release, code-named "Havana."
Icehouse (2014): The ninth release, code-named "Icehouse."
Juno (2014): The tenth release, code-named "Juno."
Kilo (2015): The eleventh release, code-named "Kilo."
Liberty (2015): The twelfth release, code-named "Liberty."
Mitaka (2016): The thirteenth release, code-named "Mitaka."
Newton (2016): The fourteenth release, code-named "Newton."
Ocata (2017): The fifteenth release, code-named "Ocata."
Pike (2017): The sixteenth release, code-named "Pike."
Queens (2018): The seventeenth release, code-named "Queens."
Rocky (2018): The eighteenth release, code-named "Rocky."
Stein (2019): The nineteenth release, code-named "Stein."
Train (2019): The twentieth release, code-named "Train."
Ussuri (2020): The twenty-first release, code-named "Ussuri."
Victoria (2020): The twenty-second release, code-named "Victoria."
Wallaby (2021): The twenty-third release, code-named "Wallaby."
Xena (2021): The twenty-fourth release, code-named "Xena."
Yoga (2022): The twenty-fifth release, code-named "Yoga."
Zed (2022): The twenty-sixth release, code-named "Zed."

OpenStack is built using a modular architecture: it is composed of multiple projects, each providing a specific cloud service. Organizations can choose and combine the components that best fit their cloud computing needs, making it a versatile and customizable platform for building private, public, and hybrid clouds. Other key characteristics include:

  1. Multi-Tenancy: OpenStack supports multi-tenancy, allowing organizations to create isolated environments within the cloud infrastructure. This means that multiple users or projects can share the same cloud while maintaining security and resource separation.
  2. Open Source: OpenStack is released under an open-source license, making it freely available for anyone to use, modify, and contribute to. This open nature has led to a vibrant community of developers and users collaborating on its development.
  3. Integration and Compatibility: OpenStack is designed to integrate with various virtualization technologies, hardware vendors, and third-party tools. It can be used with different hypervisors, storage systems, and networking solutions.
  4. Private and Public Clouds: Organizations can use OpenStack to create private clouds within their data centers or deploy public cloud services to offer cloud resources to external customers or users.
  5. Hybrid Clouds: OpenStack can be part of a hybrid cloud strategy, where organizations combine private and public cloud resources to achieve flexibility and scalability.

Here are some of the main components:

  1. Nova (Compute): Manages and orchestrates virtual machines (instances) on hypervisors. It provides features for creating, scheduling, and managing VMs.
  2. Swift (Object Storage): Offers scalable and durable object storage services for storing and retrieving data, including large files and unstructured data.
  3. Cinder (Block Storage): Manages block storage volumes that can be attached to instances. It provides persistent storage for VMs.
  4. Neutron (Networking): Handles networking services, including the creation and management of networks, subnets, routers, and security groups.
  5. Keystone (Identity): Manages identity and authentication services, including user management, role-based access control (RBAC), and token authentication.
  6. Glance (Image Service): Stores and manages virtual machine images (VM snapshots) that can be used to create instances.
  7. Horizon (Dashboard): A web-based user interface that provides a graphical way to manage and monitor OpenStack resources.
  8. Heat (Orchestration): Provides orchestration and automation services for defining and managing cloud application stacks.
  9. Ceilometer (Telemetry): Collects telemetry data, including usage and performance statistics, for billing, monitoring, and auditing.
  10. Trove (Database-as-a-Service): Manages database instances as a service, making it easier to provision and manage databases.
  11. Ironic (Bare Metal): Manages bare-metal servers as a service, allowing users to provision physical machines in the same way as virtual machines.
  12. Zaqar (Messaging and Queuing): Provides messaging and queuing services for distributed applications.
  13. Magnum (Container Orchestration): Orchestrates container platforms like Kubernetes to manage containerized applications.


Postman provides a user-friendly interface for building and sending API requests, inspecting responses, and automating API testing. Internally, it generates HTTP requests from user input, sends them to the API, and presents the responses. It also supports more advanced features such as scripting, automation, and test execution for comprehensive API testing and monitoring. It's widely used by developers to:

  1. Test APIs: Developers can use Postman to send requests to APIs and receive responses, making it easy to test how the API functions.
  2. Automate Tests: Postman allows you to create and automate test scripts to ensure that your APIs are working as expected. You can set up tests to validate the response data, headers, and more.
  3. Document APIs: You can use Postman to generate API documentation, which is useful for sharing information about how to use an API with others.
  4. Monitor APIs: Postman can be used to monitor APIs and receive alerts when issues or errors occur.
  5. Mock Servers: Postman provides the ability to create mock servers, which can simulate an API's behavior without the actual backend being implemented yet.

Here's how Postman works internally when you send a request such as the examples that follow:

1) User Interface (UI): Postman provides a user-friendly graphical interface where users can create, manage, and send API requests. Users interact with this UI to input API details, such as request URLs, headers, parameters, and request bodies.

2) Request Configuration: When you create a request in Postman, you configure various aspects of the request, including the request method (e.g., GET, POST, PUT, DELETE), request URL, headers, query parameters, request body (if applicable), and authentication settings.

3) HTTP Request Generation: Postman internally generates the corresponding HTTP request based on the user's configuration. For example, if you configure a GET request to retrieve user data, Postman generates an HTTP GET request to the specified URL with the provided headers and parameters.

4) Request Sending: When you click the "Send" button within Postman, it sends the generated HTTP request to the target API endpoint using the configured settings (e.g., URL, headers, body). This request is sent via the HTTP protocol to the specified API server.

5) API Server Interaction: The HTTP request sent by Postman is received by the API server. The server processes the request based on the HTTP method, URL, and other request details. For example, in a RESTful API, a GET request may retrieve data, while a POST request may create new data.

6) Response Reception: After the API server processes the request, it sends an HTTP response back to Postman. This response includes data (e.g., JSON or XML) and metadata (e.g., status code, headers) generated by the server.

7) Response Handling: Postman receives the HTTP response and presents it to the user within its UI. The user can inspect the response content, status code, headers, and other details. Postman also provides tools for handling response data, such as extracting values or running tests.

8) Test Execution: Users can define tests and assertions within Postman using scripts (e.g., JavaScript). When a test script is defined, Postman internally executes the script and checks the results against the specified assertions.

9) Results Reporting: Postman provides feedback to the user about the outcome of the API request and any tests that were run. Users can view whether the request was successful, the response met the expected criteria, and any potential errors or issues.

10) Automation: Postman can be integrated into automated testing pipelines, continuous integration (CI) workflows, and monitoring systems. It can be invoked programmatically to run collections of requests, automate tests, and monitor APIs at specified intervals.
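Steps 2 to 4 above can be sketched in Python using only the standard library. This is a minimal illustration of how a client might turn user input into an HTTP request, not Postman's actual implementation; the helper name build_request is hypothetical, and the URL, API key, and body are placeholder values.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(method, url, params=None, headers=None, json_body=None):
    """Assemble an HTTP request object without sending it (hypothetical helper)."""
    # Query parameters are URL-encoded and appended to the request URL.
    if params:
        url = f"{url}?{urlencode(params)}"
    headers = dict(headers or {})
    data = None
    # A JSON body is serialized to bytes and labeled with a Content-Type header.
    if json_body is not None:
        data = json.dumps(json_body).encode("utf-8")
        headers.setdefault("Content-Type", "application/json")
    return Request(url, data=data, headers=headers, method=method)


req = build_request(
    "POST",
    "https://api.example.com/users",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json_body={"name": "Alice", "email": "alice@example.com"},
)
print(req.get_method(), req.full_url)
print(req.header_items())
print(req.data.decode("utf-8"))
```

Nothing is sent until the request object is passed to urllib.request.urlopen, which mirrors the separation between configuring a request and clicking "Send".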

Examples: Make sure you have access to a RESTful API that you want to test. Replace the URLs, endpoints, and parameters with the appropriate values for your specific API.

1) GET Request to Retrieve Data. To retrieve data from an API using a GET request:

  • GET https://api.example.com/users

2) GET Request with Query Parameters. To retrieve data with query parameters:

  • GET https://api.example.com/users?id=123&name=John

3) POST Request to Create Data. To create data using a POST request with a JSON body:

  • POST https://api.example.com/users

Headers:
Content-Type: application/json

Body (JSON):
{
    "name": "Alice",
    "email": "alice@example.com"
}

4) PUT Request to Update Data. To update data using a PUT request with a JSON body:

  • PUT https://api.example.com/users/123

Headers:
Content-Type: application/json

Body (JSON):
{
    "name": "Updated Name",
    "email": "updated@example.com"
}


5) DELETE Request to Remove Data. To delete data using a DELETE request:

  • DELETE https://api.example.com/users/123

6) Headers and Authentication. You can add headers, such as authorization headers, to your requests. For example, to send an API key in the headers:

  • GET https://api.example.com/resource

Headers:
Authorization: Bearer YOUR_API_KEY

7) Handling Response Data. After sending a request, you can inspect the response data. For example, to extract a specific value from the response, you can use JavaScript in Postman's Tests tab:

// Extract the value of the "name" field from the JSON response
var jsonData = pm.response.json();
pm.environment.set("username", jsonData.name);

These are just some basic examples of how to use Postman to interact with RESTful APIs. You can create collections of requests, use variables, and write more complex tests to thoroughly test and validate your APIs.
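Variables are what make a collection reusable across environments. Postman writes them as {{name}} placeholders; the sketch below imitates that substitution in Python. The resolve helper, the environment values, and the choice to leave unknown placeholders untouched are assumptions made for illustration, not a description of Postman's internals.

```python
import re

# Matches Postman-style placeholders such as {{base_url}}.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.-]+)\s*\}\}")


def resolve(template, variables):
    """Replace each {{name}} with its value; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        return str(variables.get(name, match.group(0)))
    return PLACEHOLDER.sub(substitute, template)


# A hypothetical environment and a small collection of request templates.
environment = {"base_url": "https://api.example.com", "user_id": 123}
collection = [
    ("GET", "{{base_url}}/users"),
    ("PUT", "{{base_url}}/users/{{user_id}}"),
    ("DELETE", "{{base_url}}/users/{{user_id}}"),
]
for method, url in collection:
    print(method, resolve(url, environment))
```

Switching the environment dictionary (for example, to a staging base URL) re-targets every request in the collection without editing the requests themselves.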

The following Python example demonstrates how to make an HTTP GET request to a RESTful API using the popular requests library. It uses the JSONPlaceholder API, which provides dummy data for testing and learning purposes:

import requests
# Define the API endpoint URL
api_url = "https://jsonplaceholder.typicode.com/posts/1"
try:
    # Send an HTTP GET request to the API endpoint
    response = requests.get(api_url)
    # Check if the request was successful (status code 200)
    if response.status_code == 200:
        # Parse the JSON response
        data = response.json()
        # Print the response data
        print("Title:", data["title"])
        print("Body:", data["body"])
    else:
        print("HTTP Request Failed with Status Code:", response.status_code)
except requests.exceptions.RequestException as e:
    # Handle any exceptions that may occur during the request
    print("An error occurred:", e)

NOTE: We define the API endpoint URL (api_url) that we want to retrieve data from; in this example, we're fetching a specific post by its ID. Inside a try block, we send an HTTP GET request to the endpoint using requests.get(api_url).

We then check the HTTP response status code. If it's 200, the request was successful, so we parse the JSON response using response.json() and print specific fields from it (in this case, the post's title and body). If the request returns a different status code or raises an exception, we handle it and print an error message.

OpenStack provides a set of RESTful APIs for managing cloud infrastructure resources. These APIs are used to create, manage, and interact with virtualized resources such as instances (virtual machines), volumes, networks, and more. Here are some common API endpoint examples with respect to OpenStack:

1) Identity (Keystone) API:

Authentication and token management.

Example: http://<OpenStack-IP>:5000/v3/

2) Compute (Nova) API:

Management of virtual machines (instances).

Example: http://<OpenStack-IP>:8774/v2.1/

3) Block Storage (Cinder) API:

Management of block storage volumes.

Example: http://<OpenStack-IP>:8776/v2/

4) Object Storage (Swift) API:

Storage and retrieval of objects (files and data).

Example: http://<OpenStack-IP>:8080/v1/

5) Image (Glance) API:

Management of virtual machine images (VM snapshots).

Example: http://<OpenStack-IP>:9292/v2/

6) Network (Neutron) API:

Management of network resources, including routers, subnets, and security groups.

Example: http://<OpenStack-IP>:9696/v2.0/

7) Orchestration (Heat) API:

Orchestration of cloud resources through templates.

Example: http://<OpenStack-IP>:8004/v1/

8) Telemetry (Ceilometer) API:

Collection of usage and performance data.

Example: http://<OpenStack-IP>:8777/v2/

9) Dashboard (Horizon) API:

Web-based user interface for OpenStack services.

Example: http://<OpenStack-IP>/dashboard/

10) Placement (Placement) API:

Management of resource placement and allocation.

Example: http://<OpenStack-IP>:8778/

These are just some examples of the core OpenStack APIs and their respective endpoint URLs.
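As a small convenience, the ports and path prefixes listed above can be kept in one lookup table. The sketch below is an assumption-laden helper, not part of any OpenStack client: the service keys, the endpoint_url function, and the host address are made up for illustration, and real deployments may publish different ports or paths.

```python
# Ports and path prefixes copied from the list above; deployments may differ.
SERVICE_ENDPOINTS = {
    "identity": (5000, "/v3/"),
    "compute": (8774, "/v2.1/"),
    "block-storage": (8776, "/v2/"),
    "object-store": (8080, "/v1/"),
    "image": (9292, "/v2/"),
    "network": (9696, "/v2.0/"),
    "orchestration": (8004, "/v1/"),
    "telemetry": (8777, "/v2/"),
    "placement": (8778, "/"),
}


def endpoint_url(host, service, scheme="http"):
    """Build the base URL for a service from the table above (hypothetical helper)."""
    port, path = SERVICE_ENDPOINTS[service]
    return f"{scheme}://{host}:{port}{path}"


# 192.0.2.10 is a documentation-only address standing in for <OpenStack-IP>.
for service in ("identity", "compute", "network"):
    print(service, endpoint_url("192.0.2.10", service))
```

Hard-coding ports like this is only suitable for quick experiments; the token response from Keystone normally carries a service catalog with the endpoints a deployment actually exposes.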

--------

To check if a user exists in your OpenStack environment, you can use the Identity (Keystone) API, which manages authentication and user-related operations. Specifically, you can make a request to the Keystone API to list users and then check if the desired user is in the list. Here are the general steps to do this:

Step 1: Authenticate with Keystone:

Before making any requests to the Keystone API, you need to authenticate. Typically, this involves sending a POST request with your credentials to the Keystone authentication endpoint. You'll receive a token in response, which you can use to make subsequent API requests.

Step 2: List Users:

Make a GET request to the Keystone API's user listing endpoint to retrieve a list of all users in the OpenStack environment.

Example API endpoint for listing users: http://<OpenStack-IP>:5000/v3/users

Include the authentication token in the request headers.

Step 3: Check User Existence:

After receiving the list of users, you can iterate through the user data and check if the desired user exists by comparing usernames, IDs, or other unique identifiers.

Here's a Python example using the requests library to check if a user exists in Keystone:

import requests
# Keystone authentication endpoint
auth_url = "http://<OpenStack-IP>:5000/v3/auth/tokens"
# Keystone user listing endpoint
users_url = "http://<OpenStack-IP>:5000/v3/users"
# Replace with your OpenStack credentials
auth_data = {
    "auth": {
        "identity": {
            "methods": ["password"],
            "password": {
                "user": {
                    "name": "your_username",
                    "domain": {"name": "your_domain"},
                    "password": "your_password"
                }
            }
        }
    }
}

# Authenticate and get a token
response = requests.post(auth_url, json=auth_data)
if response.status_code == 201:
    token = response.headers["X-Subject-Token"]
    
    # List all users
    headers = {"X-Auth-Token": token}
    response = requests.get(users_url, headers=headers)

    if response.status_code == 200:
        users = response.json()["users"]

        # Check if the user exists
        target_user = "desired_username"
        user_exists = any(user["name"] == target_user for user in users)
        if user_exists:
            print(f"User {target_user} exists.")
        else:
            print(f"User {target_user} does not exist.")
    else:
        print("Failed to list users.")
else:
    print("Authentication failed.")

This example demonstrates how to authenticate with Keystone, list users, and check if a specific user exists by comparing usernames. Replace the placeholders with your OpenStack-specific values and adjust the code as needed for your environment.
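On a cloud with many users, fetching the whole list just to find one name can be wasteful. A possible variation, assuming the deployment's Keystone accepts a name query filter on the user listing call, is to let the server do the filtering and only inspect the result. The helper names, host address, and sample payload below are made up for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode


def users_query_url(base_url, name):
    """Build a user listing URL filtered by name (assumes the filter is supported)."""
    return f"{base_url}/v3/users?{urlencode({'name': name})}"


def user_exists(payload, name):
    """Check a parsed user listing response for an exact name match."""
    return any(user.get("name") == name for user in payload.get("users", []))


# Made-up response body with the same "users" key the example above reads.
sample_payload = {"users": [{"id": "abc123", "name": "alice"}]}

print(users_query_url("http://192.0.2.10:5000", "alice"))
print(user_exists(sample_payload, "alice"))
print(user_exists(sample_payload, "bob"))
```

The exact-match check is kept even with the server-side filter, so the code still behaves correctly if a deployment ignores the query parameter and returns every user.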

-----------------------

OpenStack service overview: 

Together, the OpenStack services Nova, Cinder, Swift, and Neutron provide a comprehensive cloud computing platform. Nova manages compute resources, Cinder offers block storage, Swift provides object storage, and Neutron handles networking, enabling organizations to build and manage private and public clouds tailored to their specific needs.

Nova (OpenStack Compute): Nova is the core compute service in OpenStack. It manages the creation, scheduling, and management of virtual machines (VMs) in a cloud environment. Nova is hypervisor-agnostic, supporting various virtualization technologies, and it provides features for live migration, scaling, and resource management.

Cinder (OpenStack Block Storage): Cinder is the block storage service in OpenStack. It offers block-level storage volumes that can be attached to VMs. Users can create, manage, and snapshot these volumes, making it suitable for data persistence in applications like databases.

Swift (OpenStack Object Storage): Swift is the object storage service in OpenStack. It is designed for the storage of large amounts of unstructured data, such as images, videos, and backups. Swift provides scalable, redundant, and highly available storage with easy-to-use APIs.

Neutron (OpenStack Networking): Neutron is the networking service in OpenStack. It enables users to create and manage networks, subnets, routers, and security groups for VMs. Neutron supports various network configurations, including flat networks, VLANs, and overlay networks, allowing for flexibility in network design.

--------

Key Differences between Cinder and Swift: Object storage and block storage serve different purposes and have distinct access methods. Object storage is well-suited for handling unstructured data and large-scale content distribution, while block storage is preferred for applications requiring direct control over data blocks and high performance. Organizations often choose between these storage types based on their specific use cases and storage needs.

Access Level: Object storage uses a higher-level access method, where data is accessed and managed as whole objects using unique identifiers. Block storage provides lower-level access, treating data as raw blocks.

Use Cases: Object storage is ideal for storing large amounts of unstructured data and content distribution, while block storage is suited for applications requiring direct control over storage blocks.

Scalability: Object storage is known for its horizontal scalability and ease of expansion, whereas block storage scalability may require more planning and management.

Data Management: Object storage systems often manage data redundancy and durability internally, while block storage may rely on external solutions or the application to manage data redundancy.

Data Retrieval: Object storage is optimized for read-heavy workloads and large-scale data distribution, while block storage is designed for high performance and low-latency access.
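The difference in access level is easier to see side by side. The two classes below are toy, in-memory models written for this comparison; they are not the Swift or Cinder APIs, and all names and sizes are illustrative assumptions.

```python
class ObjectStore:
    """Toy object store: whole objects addressed by a unique key."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        # The object is stored and replaced as a unit, along with its metadata.
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        return self._objects[key]


class BlockDevice:
    """Toy block device: fixed-size raw blocks addressed by index."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        self._blocks = [bytes(block_size) for _ in range(num_blocks)]

    def write_block(self, index, data):
        if len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        # Short writes are zero-padded so every block keeps its fixed size.
        self._blocks[index] = bytes(data).ljust(self.block_size, b"\x00")

    def read_block(self, index):
        return self._blocks[index]


store = ObjectStore()
store.put("backups/db.dump", b"full backup bytes", {"content-type": "application/octet-stream"})
print(store.get("backups/db.dump"))

disk = BlockDevice(num_blocks=4, block_size=16)
disk.write_block(2, b"row-42")
print(disk.read_block(2))
```

The object store knows what each item is (a key plus metadata), while the block device only knows positions; giving those positions meaning is left to a filesystem or database layered on top.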

------------

Ceph:

Ceph is an open-source, distributed storage system designed for both object and block storage. It is known for its flexibility, scalability, and ability to provide a unified storage platform. Ceph is often used in cloud computing environments, data centers, and high-performance computing (HPC) clusters.

Key components and features of Ceph include:

Object Storage (RADOS Gateway): Ceph provides object storage capabilities through its RADOS (Reliable Autonomic Distributed Object Store) Gateway. This allows users to store and retrieve objects using a RESTful API compatible with Amazon S3 and Swift.

Block Storage (RBD): Ceph's RADOS Block Device (RBD) allows users to create block storage volumes that can be attached to virtual machines or used as raw block storage. RBD is often integrated with virtualization platforms like KVM.

Scalability: Ceph scales from a few nodes to thousands of nodes by distributing data across OSDs (Object Storage Daemons), while MONs (Monitor Daemons) maintain the cluster map. This scalability makes it suitable for large-scale storage deployments.

Data Redundancy: Ceph replicates data across multiple OSDs to ensure redundancy and high availability. It uses the CRUSH (Controlled Replication Under Scalable Hashing) algorithm to distribute data efficiently.

Self-Healing: Ceph can automatically detect and recover from hardware failures or data inconsistencies. It continuously monitors data integrity.

Unified Storage: Ceph provides a unified storage platform that combines object, block, and file storage, allowing users to access data in various ways, depending on their requirements.

Community and Ecosystem: Ceph has a vibrant open-source community and a wide ecosystem of tools and projects that integrate with it. This includes interfaces for OpenStack integration.
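The idea behind computed data placement can be illustrated with a much simpler technique. The sketch below is not CRUSH; it uses rendezvous (highest random weight) hashing as a stand-in to show how any client can work out which OSDs hold an object's replicas without consulting a central lookup table. The function name, OSD names, and replica count are assumptions for illustration.

```python
import hashlib


def place(object_name, osds, replicas=3):
    """Pick `replicas` distinct OSDs for an object, deterministically."""
    def score(osd):
        digest = hashlib.sha256(f"{object_name}:{osd}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    # Highest scores win; every client computes the same ranking independently.
    return sorted(osds, key=score, reverse=True)[:replicas]


osds = [f"osd.{i}" for i in range(6)]
before = place("volume-1234", osds)

# Simulate losing the first-choice OSD and recomputing the placement.
after = place("volume-1234", [osd for osd in osds if osd != before[0]])
print(before)
print(after)
```

When an OSD disappears, the surviving replicas keep their place and only one new location is chosen, which is the kind of limited data movement a self-healing cluster relies on. Real CRUSH additionally accounts for device weights and failure domains such as hosts and racks.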

-------------------------

Neutron, the networking component of OpenStack, plays a crucial role in creating and managing networking resources within a cloud infrastructure. 

Here are some interesting factors and capabilities related to Neutron:

Network Abstraction: Neutron abstracts network resources, allowing users to create and manage virtual networks, subnets, routers, and security groups through APIs or the dashboard. This abstraction simplifies complex networking tasks and provides a consistent interface.

Multi-Tenancy: Neutron supports multi-tenancy, enabling the isolation of network resources between different projects or tenants. This ensures that one tenant's network activities do not impact another's.

Pluggable Architecture: Neutron follows a pluggable architecture, allowing users to integrate with various networking technologies and solutions. This includes support for different plugins and drivers, enabling compatibility with a wide range of network devices and services.

Software-Defined Networking (SDN): Neutron can be used in conjunction with SDN controllers and solutions to provide advanced network automation, programmability, and flexibility. SDN allows for the dynamic configuration of network services and policies.

Networking Interfaces: Neutron allows the creation of various types of networking interfaces for virtual machines, including:

Port: Neutron manages ports, which represent virtual interfaces connected to a network. VMs attach to ports to access the network.

Router: Routers connect different subnets and provide inter-subnet routing. Neutron manages router interfaces and routing rules.

Floating IPs: Floating IPs provide external network access to VMs. Neutron can assign floating IPs dynamically or statically.

Bonding and Teaming: Neutron can manage bonded network interfaces (NIC bonding) for redundancy and increased network bandwidth. This is especially useful for ensuring high availability and load balancing of VMs.

Security Groups: Neutron's security groups feature allows users to define firewall rules and policies to control incoming and outgoing traffic to VMs. It enhances network security within the cloud environment.

L3 and L2 Services: Neutron supports Layer 3 (routing) and Layer 2 (bridging) services. This flexibility enables complex network topologies and scenarios.

Interoperability: Neutron integrates with various network technologies, including VLANs, VXLANs, GRE tunnels, and more. It provides interoperability with physical network infrastructure and external networks.

Communication Between VMs: Neutron ensures that VMs can communicate with each other within the same network or across networks using routing. It manages the routing tables and connectivity.

Load Balancing as a Service (LBaaS): Neutron has offered LBaaS, allowing users to create and manage load balancers to distribute traffic among multiple VMs or instances. In recent releases this role is filled by the separate Octavia project.

High Availability (HA): Neutron can be configured for high availability, ensuring network services remain operational even in the event of network node failures.
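The security group behavior described above (allow rules, with unmatched incoming traffic dropped) can be modeled in a few lines. This is a simplified sketch for intuition only; the Rule fields and the ingress_allowed function are hypothetical and much smaller than Neutron's real rule model.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    protocol: str      # "tcp", "udp" or "icmp"
    port_min: int      # first port covered by the rule
    port_max: int      # last port covered by the rule
    remote_cidr: str   # source address range the rule applies to


def ingress_allowed(rules, protocol, port, source_ip):
    """Return True if any rule matches; unmatched ingress traffic is dropped."""
    source = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (rule.protocol == protocol
                and rule.port_min <= port <= rule.port_max
                and source in ipaddress.ip_network(rule.remote_cidr)):
            return True
    return False


# SSH only from an internal subnet, HTTPS from anywhere.
web_rules = [
    Rule("tcp", 22, 22, "10.0.0.0/24"),
    Rule("tcp", 443, 443, "0.0.0.0/0"),
]
print(ingress_allowed(web_rules, "tcp", 443, "203.0.113.7"))
print(ingress_allowed(web_rules, "tcp", 22, "203.0.113.7"))
print(ingress_allowed(web_rules, "tcp", 22, "10.0.0.15"))
```

Because rules only ever allow traffic, the order of the list does not matter: a packet is accepted as soon as any one rule matches.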

---------------------------------------------------------

Containerization in OpenStack involves deploying and managing containers within an OpenStack cloud environment. This allows users to run containerized applications and microservices alongside traditional virtual machines (VMs).

Here's a step-by-step explanation of the design and components involved in containerization within OpenStack:

1. Container Orchestration Framework: OpenStack supports various container orchestration frameworks, with Kubernetes being one of the most popular choices. Kubernetes helps manage the deployment, scaling, and operation of application containers. It serves as the foundation for container orchestration in an OpenStack environment.

2. Container Runtime: Containers are run using a container runtime, such as Docker or containerd. This runtime manages the execution of containerized applications and provides isolation between containers. In an OpenStack-based containerization setup, a container runtime is installed on each compute node in the OpenStack cluster.

3. OpenStack Components:

  • Nova (Compute Service): Nova is responsible for managing compute resources, including VMs and, in a containerized environment, bare metal servers. It can provision servers specifically for running containers alongside traditional VMs.
  • Neutron (Networking Service): Neutron handles networking and connectivity for containers. It ensures that containers can communicate with each other, VMs, and external networks.
  • Cinder (Block Storage Service): Cinder provides block-level storage for containers when persistent storage is required. Containers can use Cinder volumes for data storage.

4. Magnum (Container Orchestration Service): OpenStack Magnum is a dedicated service for managing container orchestration clusters, such as Kubernetes, within the OpenStack cloud. It simplifies the deployment and management of container orchestration platforms.

5. Heat (Orchestration Service): Heat is an orchestration service in OpenStack that enables the automated deployment and scaling of infrastructure resources, including containers. It allows users to define templates describing the desired container infrastructure and then deploys and manages the resources accordingly.

6. Glance (Image Service): Glance is responsible for storing and managing container images. Containers are typically built from base images, and Glance helps manage these images within the OpenStack environment.

7. Keystone (Identity Service): Keystone provides authentication and authorization services for containerized applications and services. It ensures that only authorized users and services can access containers and container orchestration platforms.

8. Container Networking and Storage Plugins: In an OpenStack-based containerization environment, specialized networking and storage plugins are often used to integrate container networking and storage with OpenStack services. These plugins enable efficient communication and data storage for containers.

9. User Interface: Users interact with the containerization platform through the OpenStack dashboard (Horizon) or through the command-line interface (CLI). They can deploy and manage containers, container orchestration clusters, and associated resources.

10. Monitoring and Logging: Containerized applications generate logs and require monitoring for performance and resource usage. OpenStack can be integrated with monitoring and logging solutions like Prometheus, Grafana, and ELK (Elasticsearch, Logstash, and Kibana) to provide insights into containerized workloads.

11. External Services Integration: Containers often need to interact with external services and APIs. OpenStack allows for integration with external services through the use of network configurations, load balancers, and other relevant components.

In summary, containerization in OpenStack involves a combination of OpenStack services, container orchestration frameworks like Kubernetes, container runtimes, and specialized plugins to provide a seamless environment for deploying and managing containerized applications alongside traditional VMs within an OpenStack cloud infrastructure. This setup offers flexibility, scalability, and isolation for running containerized workloads in a cloud environment.

Sunday, July 30, 2023

Watsonx AI and data platform with Foundation Models

We are witnessing a fundamental shift in AI driven by self-supervision and by the ability to create foundation models that power generative AI. Several exciting new Foundation Model capabilities have been announced at IBM Think 2023. Watsonx is a new platform for foundation models and generative AI, offering a studio, data store, and governance toolkit. Let’s take a look at what this platform intends to provide.

Why can't we build and reuse AI models? More data, more problems? Learn how AI foundation models change the game for training AI/ML from IBM Research AI VP Sriram Raghavan and Darío Gil, SVP and Director of IBM Research, as they demystify the technology and share a set of principles to guide your generative AI business strategy. Experience watsonx, IBM’s new data and AI platform for generative AI, and learn about the breakthroughs that IBM Research is bringing to this platform and to the world of computing. Foundation models are an emerging approach to machine learning and data representation. Even in the age of big data, when AI/ML is more prevalent, training the next generation of AI tools such as NLP requires enormous amounts of data, and applying AI models to new or different domains can be tricky. A foundation model can consolidate data from several sources so that one model may then be used for various activities. But how will foundation models be used for things beyond natural language processing? Don't miss this episode to explore how foundation models are a paradigm shift in how AI gets done.

You can bring your own data and AI models to watsonx or choose from a library of tools and technologies. You can train a model or influence its training (if you want) and then tune it, so that you have transparency and control over governing data and AI models. You can prompt it too. Instead of only one model, you can have a family of models. Foundation models trained with your own data become a more valuable asset. Watsonx is a new integrated data platform designed to make you a value creator. It consists of three primary parts. First, watsonx.data is a massive curated data repository with a data management system, ready to be tapped to train and fine-tune models. Second, watsonx.ai is an enterprise studio to train, validate, tune, and deploy traditional ML and foundation models that provide generative AI capabilities. Third, watsonx.governance is a powerful set of tools to ensure your AI is executing responsibly. They work together seamlessly throughout the entire lifecycle of foundation models. Watsonx is built on top of Red Hat OpenShift. The lifecycle consists of:

STEP 1: Preparing our data [acquire, filter and pre-process, version and tag]. After each data set is filtered and processed, it receives a data card. The data card has the name and version of the pile and specifies its content and the filters that have been applied to it. We can have multiple data piles. They co-exist in watsonx.data, and access to the different versions of data maintained for different purposes is managed seamlessly.
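To make the data card idea concrete, here is a purely hypothetical sketch of the fields described above. It does not reflect watsonx.data's actual format or API; the class, field names, registry, and sample values are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCard:
    name: str            # name of the data pile
    version: str         # version of the pile
    content: str         # short description of what the pile contains
    filters: tuple = ()  # filters that were applied, in order


# Several piles, and several versions of one pile, can co-exist side by side.
registry = {}


def register(card):
    """Store a card under its (name, version) pair (hypothetical helper)."""
    registry[(card.name, card.version)] = card


register(DataCard("support-tickets", "1.0", "raw customer tickets"))
register(DataCard("support-tickets", "1.1", "customer tickets",
                  filters=("deduplicate", "remove-personal-data")))

print(registry[("support-tickets", "1.1")].filters)
```

Keying the registry by name and version is one simple way to keep older versions available for work that still depends on them.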

STEP 2: Using the data to train, validate, and tune the model, and to deploy applications and solutions. So we move from watsonx.data to watsonx.ai and start by picking a model architecture from the five families that IBM provides. These are the bedrock of the models, and they range from encoder-only, encoder-decoder, and decoder-only to other novel architectures.

What Are Foundation Models? Foundation models are AI neural networks trained on massive unlabeled datasets to handle a wide variety of jobs from translating text to analyzing medical images. We're witnessing a transition in AI. Systems that execute specific tasks in a single domain are giving way to broad AI that learns more generally and works across domains and problems. Foundation models, trained on large, unlabeled datasets and fine-tuned for an array of applications, are driving this shift. The models are pre-trained to support a range of natural language processing (NLP) type tasks including question answering, content generation and summarization, text classification and extraction. Future releases will provide access to a greater variety of IBM-trained proprietary foundation models for efficient domain and task specialization.

Foundation models are trained with massive amounts of data that allow for generative AI capabilities with a broad set of raw data that can be applied to different tasks, such as natural language processing. Instead of one model built solely for one task, foundation models can be adapted across a wide variety of different scenarios, summarizing documents, generating stories, answering questions, writing code, solving math problems, synthesizing audio. A year after the group defined foundation models, other tech watchers coined a related term — generative AI. It’s an umbrella term for transformers, large language models, diffusion models and other neural networks capturing people’s imaginations because they can create text, images, music, software and more.

IBM plans to offer a suite of foundation models: for example, smaller encoder-based models, but also encoder-decoder and decoder-only models.

Recognizing that one size doesn’t fit all, we’re building a family of language and code foundation models of different sizes and architectures. Each model family has a geology-themed code name (Granite, Sandstone, Obsidian, and Slate) and brings together cutting-edge innovations from IBM Research and the open research community. Each model can be customized for a range of enterprise tasks. While foundation models are generally good at performing multiple tasks, they have been trained with generic data. To optimize them, they can be fine-tuned with domain-specific or proprietary data.

Watsonx is our enterprise-ready AI and data platform designed to multiply the impact of AI across your business. The platform comprises three powerful products: the watsonx.ai studio for new foundation models, generative AI and machine learning; the watsonx.data fit-for-purpose data store, built on an open lakehouse architecture; and the watsonx.governance toolkit, to accelerate AI workflows that are built with responsibility, transparency and explainability.



Watsonx.data: An open, hybrid and governed data store
It makes it possible for enterprises to scale analytics and AI with a fit-for-purpose data store, built on an open lakehouse architecture, supported by querying, governance and open data formats to access and share data. With watsonx.data, you can connect to data in minutes, quickly get trusted insights and reduce your data warehouse costs. It is now available as a service on IBM Cloud and AWS and as containerized software.

Watsonx.ai Studio: An AI studio that combines the capabilities of IBM Watson Studio with the latest generative AI capabilities that leverage the power of foundation models. It provides access to high-quality, pre-trained, proprietary IBM foundation models built with a rigorous focus on data acquisition, provenance, and quality. watsonx.ai is user-friendly: it is not just for data scientists and developers, but also for business users, and it provides a simple, natural language interface for different tasks. The studio includes a new playground with easy-to-use Prompt Tuning. With watsonx.ai, you can train, validate, tune and deploy AI models.

Watsonx.governance: IBM describes watsonx.governance as a toolkit for building responsible, transparent and explainable AI workflows. The more AI is embedded into daily workflows, the more you need proactive governance to drive responsible, ethical decisions across the business. Watsonx.governance allows you to direct, manage, and monitor your organization’s AI activities, and employs software automation to strengthen your ability to mitigate risk, manage regulatory requirements and address ethical concerns without the excessive costs of switching your data science platform, even for models developed using third-party tools.



IBM plans to provide foundation models as a service, backed by Vela, IBM’s first AI-optimized, cloud-native supercomputer. The stack uses Red Hat OpenShift, so it can also run on multiple clouds or on-premises. It is based on popular open source frameworks and communities such as PyTorch, Ray and Hugging Face.

 

Why we built an AI supercomputer in the cloud. Introducing Vela, IBM’s first AI-optimized, cloud-native supercomputer.

IBM built the Vela supercomputer specifically for training so-called “foundation” AI models such as GPT-3. According to IBM, this new supercomputer should become the basis for all its own research and development activities for these types of AI models. Vela uses standard x86-based hardware. Each node has a pair of “regular” Intel Xeon Scalable processors, eight 80GB Nvidia A100 GPUs, 1.5TB of DRAM and four 3.2TB NVMe drives for storage, and is connected through several 100 Gbps Ethernet network interfaces. In addition, IBM has built a new workload-scheduling system for Vela, the MultiCluster App Dispatcher (MCAD), which handles cloud-based job scheduling for training foundation AI models.

Multi-Cluster Application Dispatcher:

The multi-cluster-app-dispatcher is a Kubernetes controller providing mechanisms for applications to manage batch jobs in a single or multi-cluster environment. The multi-cluster-app-dispatcher (MCAD) controller is capable of (i) providing an abstraction for wrapping all resources of the job/application and treating them holistically, (ii) queuing job/application creation requests and applying different queuing policies, e.g., First In First Out, Priority, (iii) dispatching the job to one of multiple clusters, where an MCAD queuing agent runs, using configurable dispatch policies, and (iv) auto-scaling pod sets, balancing job demands and cluster availability.
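To make the queuing and dispatching ideas concrete, here is a minimal Python sketch. It is not MCAD's code or API: the `JobQueue` class, its methods, and the "first cluster with enough free GPUs" dispatch policy are all assumptions made up for illustration. It only shows how a Priority policy with First In First Out tie-breaking could feed a simple multi-cluster dispatcher.

```python
import heapq
import itertools


class JobQueue:
    """Toy job queue (hypothetical, not MCAD's API).

    Queuing policy: higher priority first, FIFO among equal priorities.
    Dispatch policy: first cluster that has enough free GPUs.
    """

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # arrival order breaks priority ties

    def submit(self, name, gpus, priority=0):
        # heapq is a min-heap, so negate priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), name, gpus))

    def pending(self):
        return [name for _, _, name, _ in sorted(self._heap)]

    def dispatch(self, free_gpus):
        """Place queued jobs on clusters.

        free_gpus maps cluster name -> free GPU count and is updated in place.
        Jobs that fit nowhere stay queued. Returns a list of (job, cluster).
        """
        placed, waiting = [], []
        while self._heap:
            entry = heapq.heappop(self._heap)
            _, _, name, gpus = entry
            cluster = next((c for c, free in free_gpus.items() if free >= gpus), None)
            if cluster is None:
                waiting.append(entry)
            else:
                free_gpus[cluster] -= gpus
                placed.append((name, cluster))
        for entry in waiting:
            heapq.heappush(self._heap, entry)
        return placed


if __name__ == "__main__":
    queue = JobQueue()
    queue.submit("train-a", gpus=8)
    queue.submit("train-b", gpus=4, priority=5)
    queue.submit("train-c", gpus=8)
    print(queue.dispatch({"cluster-1": 8, "cluster-2": 8}))
    print(queue.pending())
```

In this sketch a job that does not fit is simply requeued and smaller jobs behind it may run first; a real dispatcher would choose that behaviour through its configured policies.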

What is prompt-tuning?

Prompt-tuning is an efficient, low-cost way of adapting an AI foundation model to new downstream tasks without retraining the model and updating its weights. Redeploying an AI model without retraining it can cut computing and energy use by at least 1,000 times, saving thousands of dollars.  With prompt-tuning, you can rapidly spin up a powerful model for your particular needs. It also lets you move faster and experiment.

In prompt-tuning, the best cues, or front-end prompts, are fed to your AI model to give it task-specific context. The prompts can be extra words introduced by a human, or AI-generated numbers introduced into the model's embedding layer. Like crossword puzzle clues, both prompt types guide the model toward a desired decision or prediction. Prompt-tuning allows a company with limited data to tailor a massive model to a narrow task. It also eliminates the need to update the model’s billions (or trillions) of weights, or parameters. Prompt-tuning originated with large language models but has since expanded to other foundation models, like transformers that handle other sequential data types, including audio and video. Prompts may be snippets of text, streams of speech, or blocks of pixels in a still image or video. We don’t touch the model. It’s frozen. 
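The difference between the two prompt types, and the idea of a frozen model, can be sketched in a few lines of plain Python. Everything here is a toy assumption: the four-word vocabulary, the 4-dimensional embeddings, the mean-pooling `frozen_model` stand-in and the squared-error loss are invented for illustration and are far simpler than a real foundation model. The point is only that training changes the prompt vectors while the model's own parameters stay untouched.

```python
import random

EMBED_DIM = 4

# Frozen part of the toy "model": a token-embedding table that is never updated.
random.seed(0)
VOCAB = ["summarize", "the", "report", "please"]
FROZEN_EMBEDDINGS = {
    tok: [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for tok in VOCAB
}


def embed(tokens):
    return [FROZEN_EMBEDDINGS[t] for t in tokens]


def hard_prompt(prompt_tokens, user_tokens):
    """Hard prompt: extra words written by a human, embedded like any other token."""
    return embed(prompt_tokens + user_tokens)


def soft_prompt(prompt_vectors, user_tokens):
    """Soft prompt: learned vectors inserted directly at the embedding layer."""
    return prompt_vectors + embed(user_tokens)


def frozen_model(vectors):
    """Stand-in for a frozen model: mean-pools the input sequence."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(EMBED_DIM)]


def loss(output, target):
    return sum((o - t) ** 2 for o, t in zip(output, target))


def tune_prompt(prompt_vectors, user_tokens, target, steps=200, lr=0.5):
    """Gradient descent on the prompt vectors only; the embeddings stay frozen."""
    prompt = [list(v) for v in prompt_vectors]  # work on a copy
    user = embed(user_tokens)
    n = len(prompt) + len(user)
    for _ in range(steps):
        out = frozen_model(prompt + user)
        # d(loss)/d(prompt[k][i]) = 2 * (out[i] - target[i]) / n for every k
        grad = [2 * (o - t) / n for o, t in zip(out, target)]
        for vec in prompt:
            for i in range(EMBED_DIM):
                vec[i] -= lr * grad[i]
    return prompt


if __name__ == "__main__":
    user = ["summarize", "the", "report"]
    target = [0.5, -0.5, 0.25, 0.0]
    start = [[0.0] * EMBED_DIM for _ in range(2)]
    tuned = tune_prompt(start, user, target)
    print("loss before:", loss(frozen_model(soft_prompt(start, user)), target))
    print("loss after: ", loss(frozen_model(soft_prompt(tuned, user)), target))
```

In a real setup the same pattern applies at scale: the optimizer is given only the prompt parameters, and the model's billions of weights are excluded from updates.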

For Example: How do AI art generators work?

 AI art generators don’t know what an owl looks like in the wild. They don’t know what a sunset looks like in a physical sense. They can only understand details about features, patterns, and relationships within the datasets they’ve been trained on. Prompting for a “beautiful face” is not very helpful. It is more effective to prompt for specific features such as symmetry, big lips, and green eyes. Even if the bot doesn’t understand beauty, it can recognize the features you describe as beautiful and generate something relatively accurate. To get the best results from your AI art generator prompt, you’ll need to give clear and detailed instructions. An effective AI art prompt should include specific descriptions, shapes, colors, textures, patterns, and artistic styles. This allows the neural networks used by the generator to create the best possible visuals.

T5 (Text-to-Text Transfer Transformer) is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks. It uses the complete transformer architecture, with both an encoder and a decoder. T5 provides a simple way to train a single model on a wide variety of text tasks.

FLAN stands for Finetuned Language Net. FLAN models have already been fine-tuned by Google, so you can try your own tasks directly on the pre-tuned model. If you fine-tune it further, you may destroy that fine-tuning by overwriting it.

Flan-UL2 is an encoder-decoder model based on the T5 architecture. It uses the same configuration as the UL2 model released earlier last year. It was fine-tuned using the “Flan” prompt tuning and dataset collection. With 20 billion parameters, Flan-UL2 is a large encoder-decoder model with strong performance.

UL2 20B: An Open Source Unified Language Learner. In “Unifying Language Learning Paradigms”, Google researchers present a novel language pre-training paradigm called Unified Language Learner (UL2) that improves the performance of language models universally across datasets and setups. UL2 frames different objective functions for training language models as denoising tasks, where the model has to recover missing sub-sequences of a given input. During pre-training it uses a novel mixture-of-denoisers that samples from a varied set of such objectives, each with different configurations. The authors demonstrate that models trained using the UL2 framework perform well in a variety of language domains, including prompt-based few-shot learning and models fine-tuned for downstream tasks. They also show that UL2 excels in generation, language understanding, retrieval, long-text understanding and question answering tasks.
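A denoising task of the kind described above, where the model has to recover missing sub-sequences of its input, can be illustrated with a small Python sketch. It builds one T5-style span-corruption example using `<extra_id_N>` sentinel tokens. The `corrupt_spans` helper and the hand-picked spans are assumptions for illustration; real pre-training samples span positions and lengths at random, and UL2's mixture-of-denoisers varies those settings across several objectives.

```python
def corrupt_spans(tokens, spans):
    """Build one denoising example.

    tokens: list of input tokens.
    spans:  sorted, non-overlapping (start, length) pairs to mask out.
    Returns (inputs, targets): each masked span is replaced by a sentinel in
    the inputs, and the targets list each sentinel followed by the tokens it
    hides, closed by one final sentinel.
    """
    inputs, targets = [], []
    cursor = 0
    for sentinel_id, (start, length) in enumerate(spans):
        sentinel = f"<extra_id_{sentinel_id}>"
        inputs.extend(tokens[cursor:start])   # keep the text before the span
        inputs.append(sentinel)               # hide the span itself
        targets.append(sentinel)
        targets.extend(tokens[start:start + length])
        cursor = start + length
    inputs.extend(tokens[cursor:])            # keep the text after the last span
    targets.append(f"<extra_id_{len(spans)}>")
    return inputs, targets


if __name__ == "__main__":
    sentence = "Thank you for inviting me to your party last week."
    inputs, targets = corrupt_spans(sentence.split(), [(2, 2), (8, 1)])
    print(" ".join(inputs))
    print(" ".join(targets))
```

The encoder sees the corrupted inputs and the decoder is trained to produce the targets, which is why an encoder-decoder architecture fits this objective naturally.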

Retrieval Augmented Generation (RAG): 

Foundation models are usually trained offline, making the model agnostic to any data that is created after the model was trained. Additionally, foundation models are trained on very general domain corpora, making them less effective for domain-specific tasks. You can use Retrieval Augmented Generation (RAG) to retrieve data from outside a foundation model and augment your prompts by adding the relevant retrieved data in context. For more information about RAG model architectures, see Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.

With RAG, the external data used to augment your prompts can come from multiple data sources, such as document repositories, databases, or APIs. The first step is to convert your documents and any user queries into a compatible format to perform relevancy search. To make the formats compatible, a document collection, or knowledge library, and user-submitted queries are converted to numerical representations using embedding language models. Embedding is the process by which text is given a numerical representation in a vector space. RAG model architectures compare the embedding of a user query with the embeddings of the documents in the knowledge library. The original user prompt is then appended with relevant context from similar documents within the knowledge library. This augmented prompt is then sent to the foundation model. You can update knowledge libraries and their relevant embeddings asynchronously.
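The retrieval and prompt-augmentation steps can be sketched in plain Python. This is a toy under stated assumptions: `embed` is a bag-of-words counter standing in for a real embedding model, `KnowledgeLibrary` is an in-memory list standing in for a vector database, and the prompt template is invented for illustration. The sample documents reuse facts from this post.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class KnowledgeLibrary:
    """In-memory stand-in for a vector database (hypothetical helper)."""

    def __init__(self, documents):
        # Documents are embedded once, ahead of time, and stored with their vectors.
        self.entries = [(doc, embed(doc)) for doc in documents]

    def search(self, query, top_k=1):
        query_vec = embed(query)
        ranked = sorted(
            self.entries, key=lambda entry: cosine(query_vec, entry[1]), reverse=True
        )
        return [doc for doc, _ in ranked[:top_k]]


def augment_prompt(query, library, top_k=1):
    """Append the retrieved context to the user's question."""
    context = "\n".join(library.search(query, top_k))
    return (
        "Answer the question using the context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    library = KnowledgeLibrary([
        "Vela nodes each have eight A100 GPUs.",
        "watsonx.data is built on an open lakehouse architecture.",
        "MCAD queues batch jobs on Kubernetes.",
    ])
    # The augmented prompt is what would be sent to the foundation model.
    print(augment_prompt("How many GPUs does a Vela node have?", library))
```

Because the library is embedded separately from the queries, documents can be added or re-embedded without touching the foundation model.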

Pre-requisites: data sets

1) Training data set (contains questions and answers)

2) Test data set

Generate embeddings ---> store them in a vector database ---> take a user question ---> convert it into an embedding ---> send it to the vector database ---> get back the most relevant stored entry ---> create a prompt from the question and that entry ---> send the prompt to the foundation model Flan-UL2 (an encoder-decoder model) ---> get the final answer

Be a value creator. You can build foundation models on your own data using the watsonx platform, and they will remain under your control. They will become your most valuable asset. Don’t outsource that and don’t reduce your AI strategy to an API call. One model will not rule them all. Build responsibly and transparently, and put governance at the heart of your AI lifecycle.

Reference:
https://www.youtube.com/watch?v=FrDnPTPgEmk
https://www.ibm.com/products/watsonx-ai
https://www.ibm.com/products/watsonx-governance
https://www.ibm.com/products/watsonx-data