KVM with GPUs

This procedure is applied in addition to the standard KVM host installation when GPU hardware is also being installed.

  1. Install VFIO drivers: First, the default Nouveau driver needs to be disabled.

    • Check for the presence of the video cards with lspci -nn | grep NVIDIA. The output will look similar to:

      04:00.0 3D controller: NVIDIA Corporation GM200GL [Tesla M40] [10d3:102d] (rev a1)
      05:00.0 3D controller: NVIDIA Corporation GM200GL [Tesla M40] [10d3:102d] (rev a1)
      06:00.0 3D controller: NVIDIA Corporation GM200GL [Tesla M40] [10d3:102d] (rev a1)
      07:00.0 3D controller: NVIDIA Corporation GM200GL [Tesla M40] [10d3:102d] (rev a1)
      
    • Disable the standard Nouveau video driver. Create a new file called blacklist-nouveau.conf:

      sudo nano /etc/modprobe.d/blacklist-nouveau.conf
      
    • Add the following lines:

      blacklist nouveau
      options nouveau modeset=0
      
    • Next, the VFIO drivers need to be enabled. Add the following lines to /etc/modules to make sure the VFIO modules are loaded early at boot:

      vfio-pci
      vfio_iommu_type1
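
      As one option, the same two lines can be appended non-interactively instead of editing the file by hand (a minimal sketch using printf and tee):

      printf 'vfio-pci\nvfio_iommu_type1\n' | sudo tee -a /etc/modules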
      
    • Get the device ids of the GPUs installed in the server. In the output of the lspci -nn command above, the device id is the eight-hexadecimal-digit value in the last set of square brackets, which in the example above is 10d3:102d.
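
      To list just the vendor:device id pairs, a one-liner along these lines can help (a minimal sketch, assuming the lspci -nn output format shown above):

      lspci -nn | grep -i nvidia | sed -E 's/.*\[([0-9a-f]{4}:[0-9a-f]{4})\].*/\1/' | sort -u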

    • Now create a new file called /etc/modprobe.d/vfio.conf with the following lines to bind the devices to the vfio driver:

      options vfio-pci ids=<device_id>
      

      Replace <device_id> with the device ids you found in the previous step. If there are multiple devices, separate their ids with commas.
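
      For example, using the device id from the lspci output above (a filled-in sketch; substitute your own ids):

      options vfio-pci ids=10d3:102d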

    • Update the existing initramfs with sudo update-initramfs -u

    • Note: lspci -k shows a Kernel driver in use field for each device. The GPU will not show vfio-pci as the driver in use until it has been attached to a VM.
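
      One way to inspect the current binding (a minimal sketch, assuming NVIDIA devices):

      lspci -nnk | grep -i -A 3 nvidia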

  2. Enable IOMMU: To allow the host to pass the GPU device to the VM, IOMMU needs to be enabled on the host.

    • Open the grub config with the following command:

      sudo nano /etc/default/grub
      
    • Find the line beginning with GRUB_CMDLINE_LINUX_DEFAULT="" and set it as follows, depending on your system architecture. For an AMD-based server, enable the AMD IOMMU with:

      GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on"
      

      For an Intel-based server, enable Intel VT-d with:

      GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on iommu=pt"
      

      You can check the CPU type with cat /proc/cpuinfo and look at the vendor_id field, which will be GenuineIntel for Intel or AuthenticAMD for AMD.
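
      For example, to print only the vendor string (one possible invocation):

      grep -m1 vendor_id /proc/cpuinfo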

    • Generate a new grub config file:

      sudo grub-mkconfig -o /boot/grub/grub.cfg
      
    • Reboot the system:

      sudo init 6
      
    • Check that the Nouveau driver is not running and that IOMMU is enabled. Running lsmod | grep nouveau should produce no output. To check whether IOMMU is enabled, run the following:

      grep -i -e DMAR -e IOMMU /var/log/dmesg
      

      You should see messages like the following:

      [    0.123456] pci 0000:09:00.0: Adding to iommu group 1
      [    0.234567] pci 0000:0a:00.0: Adding to iommu group 2
      [    0.345678] DMAR: Intel(R) Virtualization Technology for Directed I/O
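
      If /var/log/dmesg is not present on your distribution, the same checks can be run against the live kernel (a minimal sketch):

      cat /proc/cmdline
      sudo dmesg | grep -i -e DMAR -e IOMMU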
      
  3. Check IOMMU groups: To ensure individual GPU devices can be passed to a VM, make sure each device is in its own IOMMU group:

    for dev in $(lspci -D | grep -i nvidia | cut -d ' ' -f 1 | sed -E -e 's/\W/_/g' -e 's/.*/pci_\0/' - ); do virsh nodedev-dumpxml $dev | sed -n '/<iommuGroup/,/iommuGroup>/p' ; echo ; done;
    

    This will output one XML snippet per device, each of which should contain a single address element, like the following:

    <iommuGroup number='180'>
        <address domain='0x0000' bus='0xe1' slot='0x00' function='0x0'/>
    </iommuGroup>
    

    If there are multiple devices in an IOMMU group, you will need to apply the Linux ACS override patch to the system. This should be treated as a last resort, as it involves modifying the kernel; before going that route, note that moving a physical GPU to a different slot can change its assigned IOMMU group. Instructions on how to apply the ACS override patch will be added soon.
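
    The group layout can also be cross-checked without virsh, since each IOMMU group is exposed under /sys/kernel/iommu_groups (a minimal sketch):

    find /sys/kernel/iommu_groups/ -type l | sort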

  4. Attach the GPU to a VM via virsh: To add a GPU device to a VM, you will need to add it to the VM's XML configuration. Create an XML snippet for the device:

    <hostdev mode='subsystem' type='pci' managed='yes'>
        <driver name='vfio'/>
        <source>
            <address domain='0x0000' bus='0x00' slot='0x00' function='0x0'/>
        </source>
    </hostdev>
    

    Replace the domain, bus, slot, and function values with those of the device you want to add. These values can be found with lspci -D, where the first field on each line is the device address, written <domain>:<bus>:<slot>.<function>.
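
    For example, a GPU that lspci -D lists as 0000:e1:00.0 (the address from the IOMMU group output above) would use:

    <address domain='0x0000' bus='0xe1' slot='0x00' function='0x0'/>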

    • The GPU can then be attached to the VM with:

      virsh attach-device [vm_domain_name] [filename] --persistent
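
      For example, assuming the XML snippet above was saved as gpu.xml and the VM's libvirt domain is named vm01 (both placeholder names):

      virsh attach-device vm01 gpu.xml --persistent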
      

    Note: NVIDIA vGPU software supports a maximum of 16 vGPUs per VM on Linux with KVM.