KVM Host Installation

KVM Host Requirements

Hardware Requirements

  • Industry-standard Intel/AMD servers that support virtualization

  • Ethernet Port for IPMI/iDRAC/iLO OOB (required for PAT)

  • Ethernet Port for Management Bus

  • Ethernet Port for Private Bus

  • Operating System Drive (preferably 2 drives in RAID 1)

  • VM Storage Drive (preferably multiple drives in RAID10)

Software Requirements

  • Ubuntu 20.04 LTS (GNU/Linux 5.4 kernel, x86_64)

KVM Host Installation

  1. Server IPMI/iDRAC/iLO configuration:

    • Assign the OOB IP address (10.{pod_number}.0.x/16, where x is not 0-6, 253, or 254, and is not already in use; see the ipmitool sketch after this list)

    • Set the root user password

    • Enable virtualization

    • If the server has an LED screen, set it to display kvm{x}.{regionname}
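
    • Optionally, once the OS is up, the same OOB settings can usually be checked or applied from the host with ipmitool (a sketch only; it assumes the ipmitool package is installed and that the BMC uses LAN channel 1, which varies by vendor):

      # print the current BMC LAN configuration on channel 1
      ipmitool lan print 1
      # set a static OOB address and /16 netmask (use the address chosen above)
      ipmitool lan set 1 ipsrc static
      ipmitool lan set 1 ipaddr 10.{pod_number}.0.{x}
      ipmitool lan set 1 netmask 255.255.0.0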

  2. Server RAID Configuration:

    • Boot server into RAID configuration and create RAID volume(s):

    • RAID1 - OS (at least 200GB)

    • RAID10 (preferably) - VMIMAGES (the remaining available storage)

  3. Operating System Install:

    • Assign the correct management IPv6 address ({p}::30:{x}/64, where x is in the range 1-ffff). Set the DNS server to a public IPv6 DNS server address

    • Set username as: administrator

    • Set the password for administrator to the Network Password supplied by the PAT or Region POD owner.

    • Set hostname to: kvm{x}-{regionname}-{organization url} (where x is a number from 1 to ffff)

    • Set partitioning to: Guided - Use entire disk and set up LVM, and install on the 200GB volume

    • In the software selection tab select: OpenSSH Server

    • Finish installation as per prompts

  4. Update the server:

    • Log in to the host as administrator, become root, and patch the operating system

      sudo su
      apt update && apt upgrade -y
      
  5. Install and run KVM server roles and features

    • Install packages

      apt install qemu qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils virtinst -y
      
    • Start the service

      service libvirtd start
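
    • Optionally verify that virtualization is enabled and that libvirt responds (a quick sanity check, not part of the original procedure):

      # a non-zero count means VT-x/AMD-V is enabled in the BIOS
      grep -Ec '(vmx|svm)' /proc/cpuinfo
      # should print an empty table of domains without errors
      virsh list --all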
      
  6. Configure logical volume on the VMIMAGES RAID 10 volume

    • List block device names

      lsblk
      
    • To clear any existing file system signatures from the drive, use

      wipefs -a /dev/sdb
      
    • Create a physical volume

      pvcreate /dev/sdb
      
    • Create volume group

      vgcreate vmimages /dev/sdb
      
    • Create logical volume

      lvcreate -l 100%FREE -n lvm vmimages
      
    • Format volume

      mkfs.ext4 /dev/vmimages/lvm
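
    • Optionally confirm the layout before mounting (a sanity check, not part of the original procedure):

      # show the vmimages volume group and its logical volume
      vgs vmimages && lvs vmimages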
      
  7. Append to /etc/fstab and mount the logical volume

    echo /dev/mapper/vmimages-lvm  /var/lib/libvirt/images   ext4    defaults 0 0 >> /etc/fstab && sudo mount -a
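
    • To confirm the volume is mounted where libvirt stores images (optional check):

      findmnt /var/lib/libvirt/images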
    
  8. Change the logical name of the “private” interface (usually the second interface) to “cloud0”

    • Find interface MAC address

      lshw -C network | grep -n2 -e serial
      
    • Create new file

      nano /etc/udev/rules.d/70-persistent-net.rules
      
    • Add the following to the file using the MAC address from the first step

      SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="mac:address:goes:here", ATTR{dev_id}=="0x0", ATTR{type}=="1", NAME="cloud0"
      
    • Mask the default link-setup rules so the custom name persists

      ln -sf /dev/null /lib/udev/rules.d/80-net-setup-link.rules
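
    • The rename normally takes effect on the next boot (the host is rebooted in step 20); in the meantime, the current name-to-MAC mapping can be cross-checked with:

      ip -br link show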
      
  9. Install NFS Common

    apt install nfs-common -y
    
  10. Create the ISOs directory and mount the NFS share at /var/lib/libvirt/ISOs

    • Create a directory

      mkdir /var/lib/libvirt/ISOs
      
    • Append to /etc/fstab and mount the NFS share

      echo robot.{regionname}.{organization url}:/etc/cloudcix/robot /var/lib/libvirt/ISOs nfs defaults,user,exec  0  0 >> /etc/fstab && mount -a
      
  11. Create backup directories and mount NAS backup repositories

    • Create directories

      mkdir /mnt/backup-p && mkdir /mnt/backup-s && mkdir /mnt/kvm_backups
      
    • Append to /etc/fstab and mount the NFS shares

      echo truenas{x}.{regionname}.{organization url}:/mnt/pool/kvm_backups /mnt/kvm_backups nfs defaults,user,exec  0  0 >> /etc/fstab  && \
      echo truenas{x}.{regionname}.{organization url}:/mnt/pool/kvm /mnt/backup-p nfs defaults,user,exec  0  0 >> /etc/fstab  && \
      echo truenas{x}.{regionname}.{organization url}:/mnt/pool/kvm-1 /mnt/backup-s nfs defaults,user,exec  0  0 >> /etc/fstab && \
      mount -a
      

    (where x is the number, from 1 to ffff, of the NAS server)
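
    • To confirm all NFS shares from steps 10 and 11 are mounted (optional check):

      findmnt -t nfs,nfs4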

  12. Install and configure the NTP client, pointing it at the common NTP server so that the time on the host is synced across the Region

    • Install the package

      apt install ntp -y
      
    • Edit the ntp.conf file (nano /etc/ntp.conf) and add your preferred NTP server details under the “# Specify one or more NTP servers.” section

      server 2.ie.pool.ntp.org prefer iburst
      
    • Restart the service

      systemctl restart ntp
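
    • To confirm the host is syncing (the configured server should appear in the peer list):

      ntpq -p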
      
  13. Set timezone to UTC

    timedatectl set-timezone UTC
    
  14. Some servers behave abnormally during netplan apply (https://bugs.launchpad.net/ubuntu/+source/netplan.io/+bug/1962095); the following works around this:

    • Enable networkd.socket

      systemctl enable systemd-networkd.socket
      
    • Start networkd.socket (it might fail to start if it is already running)

      systemctl start systemd-networkd.socket
      
  15. Update SSH Server options

    • Open “sshd_config” file

      sudo nano /etc/ssh/sshd_config
      
    • Look for the “MaxStartups” option and set it equal to the number of Robot workers in the Region. The default number of Robot workers is 25.

      MaxStartups  25
      
    • Look for the “MaxSessions” option and set it equal to the number of Robot workers in the Region. The default number of Robot workers is 25.

      MaxSessions  25
      
    • Look for the “UseDNS” option and set it to “no” to disable DNS lookups when Robot tries to ssh into a host

      UseDNS no
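
    • Optionally validate the edited file before restarting; sshd prints nothing if the syntax is valid

      sudo sshd -t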
      
    • Restart SSH service for changes to take effect

      sudo systemctl restart ssh
      
  16. Increase the IPv6 routing cache

    sudo sysctl -w net.ipv6.route.max_size=32768
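
    • Note that sysctl -w only applies until the next reboot, and the host is rebooted in step 20. To persist the setting, a sketch using the common /etc/sysctl.conf approach:

      # append the setting so it is reapplied at boot
      echo 'net.ipv6.route.max_size=32768' | sudo tee -a /etc/sysctl.conf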
    
  17. Add routing to Robot docker containers

    • Open “/etc/netplan/00-installer-config.yml” in nano

    • Add the following route under the management interface, telling the host to route traffic for Robot’s docker network through the Region Appliance:

      {management_interface}:
        routes:
          - to: "{p}:d0c6::/64"
            via: "{p}::6000:1"
      
    • Save the file, exit nano, and run the following command to apply the changes:

      sudo netplan apply
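
    • To confirm the new route is installed (the {p}:d0c6::/64 placeholder is whatever prefix was used above):

      ip -6 route show | grep d0c6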
      
  18. Place Robot’s Public Key on the host into the .ssh directory

    • Connect to the Region appliance. The Region appliance password can be obtained from the PAT owner:

      ssh administrator@{p}::6000:1
      
    • From the appliance, copy the SSH key to the host

      ssh-copy-id -i /home/administrator/.ssh/id_rsa.pub administrator@{kvmhostiPV6}
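
    • Optionally test that key-based login now works; this should print the host’s name without prompting for a password:

      ssh administrator@{kvmhostiPV6} hostname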
      

      NOTE: If the system is running a “legacy” Region:

      • Change user from root to administrator: sudo su administrator

      • Create the .ssh directory under /home/administrator if it doesn’t exist:

        mkdir -p /home/administrator/.ssh
        
      • Copy “authorized_keys” file from /var/lib/libvirt/ISOs/Robot_SSH_Key to /home/administrator/.ssh/:

        cp /var/lib/libvirt/ISOs/Robot_SSH_Key/authorized_keys  /home/administrator/.ssh/
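
      • SSH is strict about permissions on these files; if key-based login still fails, tighten them (a standard fix, not part of the original procedure):

        chmod 700 /home/administrator/.ssh
        chmod 600 /home/administrator/.ssh/authorized_keys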
        
  19. Copy backup script

    cp /var/lib/libvirt/ISOs/KVM_Backup_Script/kvm-backup /etc/cron.weekly
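
    • cron runs the scripts in /etc/cron.weekly via run-parts, which skips files that are not executable; if needed, mark the script executable:

      chmod +x /etc/cron.weekly/kvm-backup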
    
  20. Reboot the server and do an overall final check

    • Become root and authenticate

      sudo su
      
    • Check if the “Private” interface is named “cloud0”

      lshw -c network | grep -n3 cloud0
      
    • Check if the following directories are mounted (a scripted check is sketched after this list)

      • /var/lib/libvirt/images

      • /var/lib/libvirt/ISOs

      • /mnt/backup-p

      • /mnt/backup-s

      • /mnt/kvm_backups
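
    • A scripted version of the mount check above (optional; findmnt exits non-zero when a path is not a mountpoint):

      for d in /var/lib/libvirt/images /var/lib/libvirt/ISOs /mnt/backup-p /mnt/backup-s /mnt/kvm_backups; do
        findmnt "$d" > /dev/null && echo "$d mounted" || echo "$d NOT mounted"
      done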

  21. Add data about the server into IaaS via the Franchisee App following the steps here. This is so Robot can find out where it can build infrastructure.