LXD Cluster Installation with Remote Ceph

LXD Cluster Requirements

Hardware Requirements

  • Industry standard Intel/AMD servers that support virtualization

  • Ethernet Port for IPMI/iDRAC/iLO OOB (required for PAT)

  • Ethernet Port for Management Bus

  • Ethernet Port for Private Bus

  • Operating System Drive (preferably 2 drives in RAID 1)

  • VM/Container Storage Drive (preferably multiple drives in RAID10)

  • Access to existing Ceph cluster with RBD configured

Software Requirements

  • Ubuntu 24.04 LTS (GNU/Linux 6.x kernel)

  • snap (for LXD)

  • zfsutils-linux (for ZFS storage pool)

  • ceph-common (on each LXD host)

  • Existing Ceph cluster with MONs and OSDs configured

Prerequisites

1. Set up cloud0 Bonded Interface

Create cloud0 netplan file:

touch /etc/netplan/cloud0.yaml
chmod 600 /etc/netplan/cloud0.yaml

Add the following template to the netplan file you just created.

network:
  version: 2
  bonds:
    cloud0:
      interfaces:
        - <first_interface> #replace with the Management Interface
        - <second_interface> #replace with the Private Interface
      parameters:
        lacp-rate: fast
        mode: 802.3ad
        transmit-hash-policy: layer2
      addresses:
        - <IPV6_address> #IPv6 address (with prefix length) of the LXD host you are configuring
      routes:
        - to: ::/0
          via: <IPV6_address> #IPv6 address of the region's Podnet
      nameservers:
        addresses:
          - 2001:4860:4860::8888 #Google public DNS
  ethernets:
    <first_interface>: {} #replace with the Management Interface
    <second_interface>: {} #replace with the Private Interface

Next, fill in the values for the host you are configuring.

EXAMPLE:

network:
  version: 2
  bonds:
    cloud0:
      interfaces:
        - enp94s0f0np0
        - enp94s0f1np1
      parameters:
        lacp-rate: fast
        mode: 802.3ad
        transmit-hash-policy: layer2
      addresses:
        - 2a02:2078:10::30:0:30/64
      routes:
        - to: ::/0
          via: 2a02:2078:10::10:0:1
      nameservers:
        addresses:
          - 2001:4860:4860::8888
  ethernets:
    enp94s0f0np0: {}
    enp94s0f1np1: {}

Then apply the config:

sudo netplan apply
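
To confirm the bond came up as expected, the following optional checks can be run on the host (standard Linux tools, nothing specific to this deployment):

cat /proc/net/bonding/cloud0 #shows the 802.3ad mode and LACP partner state
ip -6 addr show dev cloud0 #confirms the static IPv6 address is assigned
ping -6 -c 3 2001:4860:4860::8888 #confirms the default route via the Podnet works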

2. Install Ceph Utilities on LXD Host

On each LXD host, install the Ceph client:

sudo apt install ceph-common
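
You can confirm the client tools are present before continuing:

ceph --version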

3. Create LXD Ceph Client Key

This section describes how to give the LXD hosts access to a remote Ceph cluster. The Ceph-backed storage pool is used as secondary storage for containers and VMs.

On the Ceph host, create the client.lxd key:

sudo ceph auth get-or-create client.lxd \
  mon 'profile rbd' \
  osd 'profile rbd' \
  mgr 'profile rbd' \
  -o /etc/ceph/ceph.client.lxd.keyring

Then restrict the key's capabilities to read-only monitor access and read/write access to the lxd pool:

sudo ceph auth caps client.lxd \
  mon 'allow r' \
  osd 'allow rw pool=lxd'
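
To double-check the key and its capabilities before copying it to the LXD host, you can print the entry on the Ceph host (the output includes the secret key, so treat it accordingly):

sudo ceph auth get client.lxd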

4. Copy Ceph Config and Key to LXD Host

On the Ceph host, copy the config and keyring to the LXD host:

scp /etc/ceph/ceph.conf administrator@[<lxd_host_ipv6>]:/home/administrator
scp /etc/ceph/ceph.client.lxd.keyring administrator@[<lxd_host_ipv6>]:/home/administrator

On the LXD host, move them into place:

sudo cp /home/administrator/ceph* /etc/ceph/
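
Since the keyring contains a secret, it is worth tightening ownership and permissions after copying; the values below are a suggested baseline, adjust them to your own policy:

sudo chown root:root /etc/ceph/ceph.conf /etc/ceph/ceph.client.lxd.keyring
sudo chmod 644 /etc/ceph/ceph.conf
sudo chmod 600 /etc/ceph/ceph.client.lxd.keyring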

5. Verify Ceph Access

On the LXD host:

sudo ceph --id lxd -s
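
If the status check succeeds, you can also verify access to the RBD pool itself. The command below assumes the pool is named lxd (matching the caps set earlier) and that it already exists on the Ceph cluster:

sudo rbd --id lxd ls lxd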

Local ZFS Pool Setup

This section describes how to create a local ZFS pool named local on a single disk (e.g., /dev/sda). This pool is used for container and VM storage local to the host.

1. Wipe the target disk

Warning

This will irreversibly destroy all data on the target disk.

sudo wipefs -a /dev/sdX #replace with the target disk, e.g., /dev/sda

2. Install ZFS Utilities

sudo apt install zfsutils-linux -y

3. Create the ZFS Pool

sudo zpool create -f local /dev/sdX #use the same disk you wiped above

4. Verify the ZFS Pool

sudo zpool list

Example output:

NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
local  6.97T   468K  6.97T        -         -     0%     0%  1.00x  ONLINE  -
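
Optionally, check the pool's health and device layout as well:

sudo zpool status local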

LXD Cluster Setup

Leader Node (lxd001)

1. Install LXD

sudo snap install lxd

2. Initialize LXD with Preseed

Create the following lxd-preseed.yaml:

config:
  core.https_address: '[<ipv6_address_this_host>]:8443'

storage_pools:
  - name: local
    driver: zfs
    config:
      source: local

  - name: ceph
    driver: ceph
    config:
      ceph.user.name: lxd
      ceph.cluster_name: ceph
      ceph.osd.pool_name: lxd

cluster:
  server_name: lxd001
  enabled: true
  member_config: []
  cluster_address: ""
  cluster_certificate: ""
  server_address: '[<ipv6_address_this_host>]:8443'

Then run:

lxd init --preseed < /path/to/lxd-preseed.yaml
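
Once the preseed has been applied, a quick sanity check is to list the storage pools with the lxc client and confirm that both local and ceph appear:

lxc storage list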

Joining Nodes (lxd002, lxd003…)

1. Install LXD

sudo snap install lxd

2. Generate Join Token

On lxd001, run:

lxc cluster add lxd002

Copy the full join token that appears in the output.
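
If a token is lost or issued by mistake, recent LXD releases let you review and revoke pending tokens from the leader:

lxc cluster list-tokens
lxc cluster revoke-token lxd002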

3. Create Preseed YAML

On the joining node (e.g., lxd002):

config:
  core.https_address: '[<ipv6_address_this_host>]:8443'

cluster:
  enabled: true
  server_name: <this_servers_node_name> #must match the name used in the join token, e.g., lxd002
  server_address: '[<ipv6_address_this_host>]:8443'
  cluster_address: '[<cluster_leader_ipv6>]:8443'
  cluster_token: <paste_your_token>
  member_config:
    - entity: storage-pool
      name: local
      key: source
      value: local
    - entity: storage-pool
      name: local
      key: zfs.pool_name
      value: local

Then run:

lxd init --preseed < /path/to/preseed.yaml

Repeat for each additional node (e.g., lxd003).

4. Verify Cluster Status

On any cluster node:

lxc cluster list

Example output:

+--------+------------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
|  NAME  |             URL              |      ROLES      | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATE  |      MESSAGE      |
+--------+------------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| lxd001 | https://[<lxd001_ipv6>]:8443 | database-leader | x86_64       | default        |             | ONLINE | Fully operational |
| lxd002 | https://[<lxd002_ipv6>]:8443 | database        | x86_64       | default        |             | ONLINE | Fully operational |
| lxd003 | https://[<lxd003_ipv6>]:8443 | database        | x86_64       | default        |             | ONLINE | Fully operational |
+--------+------------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
5. Configure LXD for Production Environments

Canonical recommends tuning several kernel parameters to ensure optimal performance and scalability for LXD in production environments.

These settings increase limits for file descriptors, inotify watches, asynchronous I/O, and other kernel parameters that can otherwise become bottlenecks when running a large number of containers or VMs.

Since this deployment uses the snap version of LXD, the nofile and memlock limits are already adjusted automatically, so we only need to apply sysctl tunables.

On each LXD host, create the following sysctl drop-in file:

sudo tee /etc/sysctl.d/99-lxd-production.conf >/dev/null <<EOF
# LXD Production Tuning
# Added on $(date +%Y-%m-%d) following Canonical's production recommendations
fs.aio-max-nr=524288
fs.inotify.max_queued_events=1048576
fs.inotify.max_user_instances=1048576
fs.inotify.max_user_watches=1048576
kernel.dmesg_restrict=1
kernel.keys.maxbytes=2000000
kernel.keys.maxkeys=2000
net.core.bpf_jit_limit=1000000000
net.ipv4.neigh.default.gc_thresh3=8192
net.ipv6.neigh.default.gc_thresh3=8192
vm.max_map_count=262144
EOF

Apply the new settings:

sudo sysctl --system
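
To spot-check that the new values are active:

sysctl fs.inotify.max_user_watches vm.max_map_count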

Note

  • These settings persist across reboots.

  • A reboot is recommended after applying them for the first time.