Kubernetes Cluster on Fedora CoreOS with Qemu

Sep 15, 2020 · 14 min read

K8s Cluster on Fedora CoreOS VM

After dabbling with Kubernetes on Minikube, I moved on to set up a three node cluster with VirtualBox and Ubuntu. It was not as difficult as some blogs and books on K8s claim, but I found that an Ubuntu K8s cluster has its own share of drawbacks: time and space. Ubuntu Server installation takes an hour and needs about 20GB of disk space for a three node cluster. On top of that, cluster boot is slow and resource intensive because of the unnecessary services and packages that come with it. I was looking for a minimal Linux distro and came across Fedora CoreOS.

This post explains how to create a K8s cluster on Fedora CoreOS (Fcos) using Qemu/KVM. However, if you prefer VirtualBox, then K8s Cluster with VirtualBox explains the setup for that environment.

Why Fedora CoreOS

Fedora CoreOS (Fcos) is a minimal operating system designed for running containerized workloads at scale. The Fcos image is about 750MB and, unlike Ubuntu Server, comes with Docker pre-installed. I could spin up a VM with the OS installed in 10 minutes and a three node Fcos K8s cluster in 30 minutes. It takes just 12GB of disk space, and the cluster is blazing fast without the overhead of extra services.

However, there is a catch. The design of Fedora CoreOS is quite different from a regular, run-of-the-mill Linux distro. Where else can you find a /usr directory that is not writable; a package install that spins off a new OS image; a system that ships with a single passwordless user whose password you can't set or reset in the usual way? Nothing is the same in Fcos because it is designed for fast and secure rollout of cluster nodes in high-end data centers.

Frustrated, I almost gave up after a day. Then, on the second try, I got the hang of it; in the end, it is much easier to set up a K8s cluster on Fcos than on Ubuntu.

This post lists all the steps to create an Fcos base VM and clone it to set up a three node cluster. This is not a tutorial on how to spin off a VM, how to change its settings, how to create a network adaptor, or on Kubernetes itself, as there are enough tutorials out there on those topics. Again, if you are new to Kubernetes, first familiarize yourself with K8s by working through the Kubernetes and Minikube tutorials found on the official site, and then try to set up a multi node cluster.

Download Fedora CoreOS

We use the Fedora CoreOS Bare Metal ISO, which you can download from Fedora CoreOS Downloads.

Setup Qemu Base VM

The steps to create the base VM are as follows:

  • create the Ignition file on the host
  • create and configure the VM and networks
  • start the VM and boot into the Fedora CoreOS live environment
  • copy the Ignition file from host to guest
  • install Fedora CoreOS

Create Ignition File

On first boot, Fcos uses a JSON file known as an Ignition file to configure and provision the new system. With a text editor, create a YAML file base-config.yaml with the following contents:


variant: fcos
version: 1.1.0
passwd:
  users:
    - name: k
      groups:
        - docker
        - wheel
        - sudo
      password_hash: <replace-this>
      ssh_authorized_keys:
        - <replace-this>
storage:
  files:
    - path: /etc/sysctl.d/20-silence-audit.conf
      contents:
        inline: |
          kernel.printk=4          
    - path: /etc/hostname
      mode: 420
      contents:
        source: "data:,fcos"
    - path: /etc/NetworkManager/system-connections/ens3.nmconnection
      mode: 0600
      overwrite: true
      contents:
        inline: |
          [connection]
          type=ethernet
          id=network1
          interface-name=ens3

          [ipv4]
          method=manual
          addresses=192.168.99.100/24
          gateway=192.168.99.1
          dns=192.168.99.1;8.8.8.8          

On first boot, the Ignition file:

  • creates a new user k
  • adds the user to the docker, wheel and sudo groups
  • sets the hostname to fcos
  • reduces the audit level to warn so that debug messages are not spewed to the console
  • configures a static IP for the interface

In the YAML file you have to replace two fields: password_hash and ssh_authorized_keys.

For password_hash, generate a password hash with mkpasswd --method=md5crypt, enter a password of your choice and copy the generated hash into the password_hash field. Windows users can use the Cygwin mkpasswd utility to generate the hash.

The ssh_authorized_keys field allows passwordless login via ssh; if you don't need this feature, you can remove the field from the file. Otherwise, copy the contents of your ssh public key into the ssh_authorized_keys field. On Linux, you can find it in $HOME/.ssh/id_rsa.pub, or you can create a new one with ssh-keygen. On Windows, use PuTTY to generate and manage ssh keys.
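
As a quick sketch, both values can be generated on a Linux host like this (mkpasswd ships with the whois package on many distros):


# generate the password hash; prompts for a password
mkpasswd --method=md5crypt

# create an ssh key pair if you don't have one, then print the public key
ssh-keygen -t rsa
cat $HOME/.ssh/id_rsa.pub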

Now, compile the YAML definition to Ignition format. If you have Docker, you can compile it with the following command:


docker run -i --rm quay.io/coreos/fcct:release --pretty --strict < base-config.yaml > base-config.ign

If Docker is not installed on your PC, download the appropriate fcct binary from the CoreOS Fcct Releases and compile the YAML:


fcct --pretty --strict < base-config.yaml > base-config.ign

If for some reason you are unable to compile the file, then as a workaround you can download the compiled version from the GitHub repo. As mentioned earlier, replace the password_hash and ssh_authorized_keys fields.

Once you are ready with the file, move on to create the base VM.

Create Base VM

Start Virtual Machine Manager (VMM), open File -> Add Connection, select the Qemu/KVM hypervisor, enable auto-connect and create the connection. Next, open the connection details and go to the Virtual Networks tab. Add a virtual network with the following configuration:

  • Name - network1
  • Mode - NAT
  • Forward to - any physical device
  • IPv4 Configuration
    • Network - 192.168.99.0/24
    • DHCP - disable

This network can reach both the host and the internet. Unlike VirtualBox, Qemu's DHCP allots a random IP to the interface, whereas the K8s master node needs a fixed IP, so we disable DHCP and use static IPs. The network name network1 is hardcoded in base-config.ign, so don't change it.
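
If you prefer the CLI over the VMM GUI, roughly the same network can be defined with virsh; this is a sketch, assuming the qemu:///system connection (omitting the dhcp element is what disables DHCP):


cat <<EOF > network1.xml
<network>
  <name>network1</name>
  <forward mode="nat"/>
  <ip address="192.168.99.1" netmask="255.255.255.0"/>
</network>
EOF

virsh --connect qemu:///system net-define network1.xml
virsh --connect qemu:///system net-start network1
virsh --connect qemu:///system net-autostart network1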

Next, create a new VM: choose Local install media, then browse to and select the downloaded Fedora CoreOS ISO image; for the OS type, select Generic default. Set memory to 2048MB and CPUs to 2. In the storage dialog, choose Create custom storage and create a disk of size 8GB in qcow2 format. In the network selection dropdown, select the network1 we created earlier. Name the VM fcos and click Finish to start Fedora CoreOS. Once the OS boots, it auto-logs-in to the Fedora CoreOS live environment.
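
Likewise, the VM itself can be created from the CLI with virt-install; a sketch, where the ISO path is a placeholder for wherever you saved the download:


virt-install --connect qemu:///system \
  --name fcos --memory 2048 --vcpus 2 \
  --os-variant generic \
  --disk size=8,format=qcow2 \
  --network network=network1 \
  --cdrom /path/to/fedora-coreos-live.x86_64.iso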

Get Ignition File to Guest

Earlier, we created the Ignition file base-config.ign on the host. To install the OS, we have to get it from the host to the guest. The easiest option is the Python HTTP server:


# on host, run from the ignition file's directory
python3 -m http.server

# on guest
# find the interface name, normally ens3
ip a

# temporarily allot static IP in guest
sudo ip addr add 192.168.99.100/24 dev ens3

curl -LO 192.168.99.1:8000/base-config.ign

As we disabled DHCP while creating the virtual network, we have to set a static IP before using curl. The ip addr add command assigns the IP only temporarily; it may get unset before you run curl or scp. If that happens, press up-arrow, set the IP again and retry. Not an elegant solution, I know, but within a couple of tries you should be able to copy the file to the guest.

Alternatively, if you have an ssh server on the host, use scp to pull the file into the guest:


# on host - start open ssh server
sudo systemctl start ssh

# on guest
# find the interface name, normally ens3
ip a

# temporarily allot static IP in guest
sudo ip addr add 192.168.99.100/24 dev ens3

# copy ignition file from host to VM
scp <user_id>@192.168.99.1:base-config.ign .

Troubleshoot:

  • network unreachable: interface ip not set, reset with ip addr add ...
  • connection refused: ssh server not running on host, start it on host

For more info: Creating a Virtual Machine with Virtual Machine Manager and Configuring Virtual Machines with Virtual Machine Manager

Install Fedora CoreOS

The VM boots into the CoreOS live environment, which runs entirely from memory, and we have to install the OS to disk manually. To install Fcos, run the following command:


sudo coreos-installer install /dev/sda -i base-config.ign

The installer extracts the OS (around 3GB) from the image and copies it to the VM storage file. Installation completes within 2 minutes; after it finishes, reboot with sudo init 6. On first boot, Fcos provisions the new system from base-config.ign: it creates user k with sudo privileges and sets the interface IP to 192.168.99.100. Login with user id k and the password you provided when creating the password hash.

It is cumbersome to work in the guest console; it is far more convenient to work from the host over ssh. Make an ssh connection from the host with:


ssh k@192.168.99.100

Troubleshoot:

  • fcos hangs on first boot after OS installation: the cause is usually a syntax error in base-config.ign. Correct the Ignition file, then delete the VM, create a new one and start a fresh installation.

For more info: Fedora CoreOS

Install Packages

Kubeadm depends on the conntrack and ethtool packages, so install them with:


sudo rpm-ostree install conntrack ethtool

sudo systemctl reboot

In case of “error: Transaction in progress: …”, wait for the running rpm-ostree process to finish. This happens because the node automatically upgrades itself when there is a new release of Fcos. As a last resort, you can cancel the transaction with sudo rpm-ostree cancel.

rpm-ostree is the package manager used by Fcos; it installs packages as layers on top of the base OS image. On reboot, the boot menu shows two OS trees - ostree:0 (with conntrack and ethtool layered on) and ostree:1 (the base OS) - and we can boot either of them; the top one is the latest. We can also view the deployments with sudo rpm-ostree status.
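
The layering can also be undone from the command line; for instance, to inspect the deployments or fall back to the base OS (both are standard rpm-ostree subcommands):


# list deployments; the booted one is marked with a dot
sudo rpm-ostree status

# switch the default boot entry back to the previous deployment
sudo rpm-ostree rollback
sudo systemctl reboot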

Setup Docker

Docker, the container runtime, can use either systemd or cgroupfs as its cgroup manager. For a stable K8s cluster, it is advised to use systemd as the cgroup manager. On the guest, run:


sudo systemctl start docker
sudo systemctl enable docker

cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
EOF

sudo mkdir -p /etc/systemd/system/docker.service.d

cat <<EOF | sudo tee /etc/systemd/system/docker.service.d/docker.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd
EOF

sudo systemctl daemon-reload
sudo systemctl restart docker

Check the Docker setup by executing docker run hello-world.
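
You can also confirm that the daemon picked up the systemd cgroup driver; docker info reports it directly:


docker info | grep -i "cgroup driver"
# should report: Cgroup Driver: systemd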

Install K8s Toolbox

The K8s toolbox consists of kubeadm, kubectl and kubelet. Install them with,


CNI_VERSION="v0.8.2"
sudo mkdir -p /opt/cni/bin
curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-Linux-amd64-${CNI_VERSION}.tgz" | sudo tar -C /opt/cni/bin -xz

DOWNLOAD_DIR=/usr/local/bin
sudo mkdir -p $DOWNLOAD_DIR

CRICTL_VERSION="v1.17.0"
curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-Linux-amd64.tar.gz" | sudo tar -C $DOWNLOAD_DIR -xz

RELEASE="$(curl -sSL https://dl.k8s.io/release/stable.txt)"
cd $DOWNLOAD_DIR
sudo curl -L --remote-name-all https://storage.googleapis.com/kubernetes-release/release/${RELEASE}/bin/linux/amd64/{kubeadm,kubelet,kubectl}
sudo chmod +x {kubeadm,kubelet,kubectl}

RELEASE_VERSION="v0.4.0"
curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service
sudo mkdir -p /etc/systemd/system/kubelet.service.d
curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

sudo systemctl enable --now kubelet

Let iptables see bridged traffic:


cat <<EOF | sudo tee /etc/sysctl.d/K8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

sudo sysctl --system
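
If sysctl --system complains that these keys do not exist, the br_netfilter kernel module is probably not loaded yet; it can be loaded now and persisted across reboots like this:


sudo modprobe br_netfilter

# load it automatically on boot
echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf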

Shut the VM down with sudo init 0. We are done with the base VM - no more installs or setup. We are ready to clone it to create the other nodes: master and workers.

For more info: Installing kubeadm and specifically: Fedora CoreOS or Flatcar Container Linux tab

Clone Master Node

Next, we clone the master node from the fcos VM. Each K8s node must have a unique product_uuid, but clones created by the Virtual Machine Manager GUI end up with the same UUID, so use the CLI to clone the nodes. Go ahead and create a clone of fcos named master with:


virt-clone --connect qemu:///system --original fcos --name master --file /var/lib/libvirt/images/master.qcow2

Start master, login with ssh k@192.168.99.100 and execute the following commands:


sudo hostnamectl set-hostname master

# ensure that product_uuid is unique
sudo cat /sys/class/dmi/id/product_uuid

# reset machine id to product uuid
sudo rm /etc/machine-id
sudo systemd-machine-id-setup
sudo systemd-machine-id-setup --commit

# change static ip
sudo nmcli connection   # find the <connection name>

sudo nmcli connection mod network1 \
     ipv4.method manual \
     ipv4.addresses 192.168.99.101/24 \
     ipv4.gateway 192.168.99.1 \
     ipv4.dns 192.168.99.1 \
     +ipv4.dns 8.8.8.8 \
     connection.autoconnect yes

sudo systemctl restart NetworkManager
sudo systemctl reboot

After the VM reboots, login to master with the new IP: ssh k@192.168.99.101.

Setup Master Node with Control Plane

On the master node, we initialize kubeadm so that it serves as the K8s API server and control plane. Normally, a master node is initialized with sudo kubeadm init --apiserver-advertise-address=192.168.99.101 --pod-network-cidr=192.168.0.0/16. On Fcos this is not going to work because the /usr directory is read-only and the kubelet plugins are unable to write to it. To change the kubelet plugins directory, we need a config file to pass initialization settings to kubeadm. Create the config file kubeadm-init.yaml by executing the following command on the master node:


cat << EOF > kubeadm-init.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
localAPIEndpoint:
  advertiseAddress: "192.168.99.101"
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  podSubnet: "192.168.0.0/16"
controllerManager:
  extraArgs:
    flex-volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
apiServer:
  extraArgs:
    advertise-address: 192.168.99.101

EOF

The config specifies that:

  • the controller manager and kubelets should use /opt/libexec as the kubernetes volume plugin directory instead of the default /usr/libexec; /opt is writable in Fcos
  • the API server advertise address is 192.168.99.101, i.e. the primary IP of the master node
  • the pod subnet is 192.168.0.0/16

With the config file in place, run kubeadm init on the master node:


sudo kubeadm init --config kubeadm-init.yaml

The init pulls K8s images and starts various pods. At the end of the kubeadm init output, a join command is displayed; save it somewhere, as we need it to join worker nodes to the cluster. Copy the admin.conf file to your $HOME/.kube directory so that you can run kubectl commands as a normal user:


mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

watch kubectl get pods --all-namespaces

Watch the cluster creation until all pods reach the Running state, except the coredns pods, which stay in ContainerCreating or Pending state until a pod network is installed.

For more info: Creating a cluster with kubeadm and Using kubeadm init with a configuration file

Install Pod Network Add-on

Pods communicate through a Container Network Interface (CNI) based pod network add-on, and Calico is one such add-on. To install Calico, download calico.yaml and apply it on the master node:


curl https://docs.projectcalico.org/manifests/calico.yaml -O

# replace all `/usr/libexec` with `/opt/libexec`
sed -i 's/usr\/libexec/opt\/libexec/g' calico.yaml

kubectl apply -f calico.yaml

watch kubectl get pods --all-namespaces

It pulls the Calico container images. Once the Calico pods are up and running, the coredns pods should change to the Running state.

Now the master node, with the K8s API server and control plane, is ready; kubectl get nodes should show a one node cluster with master in Ready state.

If something goes wrong during init, clean up and revert with sudo kubeadm reset, then try init again.

For more info: Calico Add on

Setup worker node

Create a second clone of fcos and name it worker1. The steps are the same as for the master clone, with the following changes (a consolidated sketch of the commands follows these lists).

In the virt-clone command:

  • clone name: worker1
  • file: worker1.qcow2

In the hostnamectl command:

  • host name: worker1

In the nmcli connection mod command:

  • ipv4.addresses: 192.168.99.102/24

Don’t init the control plane on worker nodes, and don’t install the Calico add-on.
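
For reference, here is the whole worker1 sequence in one place - the same commands from the master section with the worker1 values substituted:


# on host
virt-clone --connect qemu:///system --original fcos --name worker1 --file /var/lib/libvirt/images/worker1.qcow2

# on worker1 (login with ssh k@192.168.99.100 before the ip change)
sudo hostnamectl set-hostname worker1

# reset machine id
sudo rm /etc/machine-id
sudo systemd-machine-id-setup
sudo systemd-machine-id-setup --commit

# change static ip
sudo nmcli connection mod network1 \
     ipv4.method manual \
     ipv4.addresses 192.168.99.102/24 \
     ipv4.gateway 192.168.99.1 \
     ipv4.dns 192.168.99.1 \
     +ipv4.dns 8.8.8.8 \
     connection.autoconnect yes

sudo systemctl reboot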

Reboot worker1 and login with ssh k@192.168.99.102.

Join the K8s Cluster

As explained in the master node section, we can't use the plain kubeadm join CLI method, so we go with the config file method here as well.

To join the cluster, a worker node needs the token and discovery-token-ca-cert-hash that were part of the join command displayed when we set up the master. If you haven't noted down the join command, find the token and cert hash with these commands on the master node:


# run in master node to get token
kubeadm token list

# run in master node to get caCertHash
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
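
If the token has expired (tokens are valid for 24 hours by default), create a fresh one on the master. The printed join command can't be run as-is on Fcos for the reasons above, but it carries the new token and cert hash to plug into the config file:


# run on master
kubeadm token create --print-join-command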

Next, on the worker1 node, assign the token and cert-hash values to variables and create the kubeadm-join.yaml config file:


JOIN_TOKEN=<paste-token-from-master-here>
JOIN_CERT_HASH=<paste-cert-hash-from-master-here>

cat <<EOF > kubeadm-join.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
nodeRegistration:
  kubeletExtraArgs:
    volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
discovery:
  bootstrapToken:
    apiServerEndpoint: 192.168.99.101:6443
    token: ${JOIN_TOKEN}
    caCertHashes:
    - sha256:${JOIN_CERT_HASH}
EOF

# verify whether variables are substituted properly
cat kubeadm-join.yaml

# if variables are not replaced in config file then use envsubst
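# e.g. with envsubst from gettext (it substitutes only *exported* variables;
# the template file name below is illustrative):
#   export JOIN_TOKEN JOIN_CERT_HASH
#   envsubst < kubeadm-join.yaml.template > kubeadm-join.yaml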

The node is now ready to join the cluster; execute the following commands on the worker node:


sudo systemctl enable kubelet.service
sudo kubeadm join --config kubeadm-join.yaml

On the master, run watch kubectl get pods --all-namespaces. It may take 2 to 3 minutes as the worker pulls the Calico and K8s images. Once all pods are up and running, fire kubectl get nodes; both master and worker1 are now part of the K8s cluster and in Ready state.

Clone one more node, worker2, from the base VM and join it to the cluster the same way.

For more info: K8s - Join your nodes

Via systemctl, we have enabled the docker and kubelet services to start on system boot, so once docker and kubelet are up, the K8s cluster starts and synchronizes on its own.

Access Cluster from Host

While it is fine to administer the cluster from master, it is quite convenient to do it from the host. All you have to do is install kubectl on the host and copy the $HOME/.kube/config file from the master node to the host's $HOME/.kube directory, and you are good to go.
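
A minimal sketch of that, run on the host (it assumes kubectl is already installed there and reuses the ssh login set up earlier):


# on host
mkdir -p $HOME/.kube
scp k@192.168.99.101:.kube/config $HOME/.kube/config

# verify that the cluster responds
kubectl get nodes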