I’m thinking about getting into Ceph development, and so I’m thinking about
how to quickly provision, destroy, and reprovision tiny Ceph clusters at home.
For love of blinking lights, I’ll probably build a cluster with
Banana Pi boards in the future, but for
now I’m just using libvirt and KVM.
I spent the last few days learning about the openSUSE Ceph installation process. I ran into some issues, and I’m not
done yet, so these are just my working notes for now. Once complete, I’ll
write up the process on my regular blog.
Prerequisite: build a tool to build and destroy small clusters quickly
I needed a way to quickly provision and destroy
virtual machines that were well suited to run small Ceph clusters. I mostly
run libvirt / kvm
in my home lab, and I didn’t find any solutions tailored to that platform, so
I wrote ceph-libvirt-clusterer.
Ceph-libvirt-clusterer
lets me clone a template virtual machine and attach as many OSD disks
as I’d like in the process. I’m really happy with the tool
considering that I only have a day’s worth of work in it, and I got to
learn some details of the libvirt API and python bindings in the process.
Build a template machine
I built a template machine with
openSUSE’s tumbleweed and
completed the following preliminary configurations (roughly sketched in the commands after the list):
created ceph user
ceph user has an SSH key
ceph user’s public key is in the ceph user’s authorized_keys file
ceph user is configured for passwordless sudo
emacs is installed (not strictly necessary :-) )
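For reference, here is a rough shell sketch of that prep, run as root inside the template VM. The exact commands, key type, and sudoers layout are my assumptions, not a record of what I actually ran:

```
# Hypothetical reconstruction of the template prep, not the exact commands used.
useradd -m ceph
echo 'ceph ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/ceph
chmod 0440 /etc/sudoers.d/ceph
su - ceph -c 'mkdir -p -m 700 ~/.ssh'
su - ceph -c "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
su - ceph -c 'cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'
zypper --non-interactive install emacs   # not strictly necessary :-)
```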
Provision a cluster
I used ceph-libvirt-clusterer to create a four-node cluster, and each node had
two 8 GB OSD drives attached.
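One way to double-check the result from the hypervisor side; the domain name below is hypothetical:

```
virsh list --all              # the four cloned nodes should show up here
virsh domblklist ceph-node1   # system disk plus the two attached 8 GB OSD disks
```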
The ceph packages aren’t yet in the mainline repositories, so I added the OBS
repository to the admin node.
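I didn’t keep the exact command, but it was something along these lines. The repository URL and alias assume the filesystems:ceph OBS project and may not match what I actually typed:

```
# Assumed OBS repository location; adjust the URL if the project layout differs.
sudo zypper addrepo \
    http://download.opensuse.org/repositories/filesystems:/ceph/openSUSE_Tumbleweed/ ceph-obs
sudo zypper --gpg-auto-import-keys refresh
```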
And the ceph packages were visible.
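Roughly this check, though I’m not reproducing the output here:

```
zypper search ceph   # the ceph and ceph-deploy packages should now be listed
```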
First issue: python was missing on the other nodes
When I installed ceph-deploy on the admin node, python was also
installed. The other nodes were still running with a bare minimum
configuration from the tumbleweed install, so python was missing, and
ceph-deploy’s install step failed.
I installed Ansible to correct the problem on all
nodes simultaneously, but Ansible requires python on the remote side, too.
That meant I had to manually install python on the remaining three nodes just
like sysadmins had to do years ago.
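In practice that was just a loop over ssh; the node names are placeholders:

```
for node in ceph-node2 ceph-node3 ceph-node4; do
    ssh ceph@"$node" 'sudo zypper --non-interactive install python'
done
```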
Second issue: all nodes need the OBS repository
I didn’t add the OBS repository to the remaining three nodes because I
wanted to see if ceph-deploy would add it automatically. I didn’t expect
that to be the case, but since this version of ceph-deploy came directly from
SUSE, there was a chance.
It didn’t add the repository on its own, so the repository had to go onto every
node. Fortunately, Ansible worked now that python was in place.
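The two commands looked roughly like this; the inventory file, repository URL, and privilege-escalation flag are reconstructions rather than exact copies (newer Ansible uses --become where older releases used --sudo):

```
# Add the OBS repository on every node, then refresh the package metadata.
ansible all -i hosts -m command --become \
    -a "zypper --non-interactive addrepo http://download.opensuse.org/repositories/filesystems:/ceph/openSUSE_Tumbleweed/ ceph-obs"
ansible all -i hosts -m command --become \
    -a "zypper --non-interactive --gpg-auto-import-keys refresh"
```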
Once both of these commands completed, ceph-deploy install worked as expected.
Third issue: I was using IP addresses
ceph-deploy new complains when provided with IP addresses.
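I didn’t save the exact error text, but the failing and (eventual) working invocations looked roughly like this, with hypothetical addresses and hostnames:

```
# Fails: ceph-deploy new wants resolvable hostnames, not raw IP addresses.
ceph-deploy new 192.168.122.11 192.168.122.12 192.168.122.13
# Works once each name resolves, via DNS or /etc/hosts.
ceph-deploy new ceph-node1 ceph-node2 ceph-node3
```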
In the future, it’d be pretty cool if ceph-libvirt-clusterer supported
updating DNS records so that I didn’t need to resort to the /etc/hosts
Ansible playbook I used today.
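I’m not reproducing the actual playbook here; the sketch below just captures the idea, with hypothetical node names and addresses (and become:, where the Ansible of that era used sudo:):

```
# Sketch only: write a throwaway playbook that pushes /etc/hosts entries, then run it.
cat > hosts-entries.yml <<'EOF'
- hosts: all
  become: yes
  tasks:
    - name: add cluster nodes to /etc/hosts
      lineinfile: dest=/etc/hosts line="{{ item }}"
      with_items:
        - "192.168.122.11 ceph-node1"
        - "192.168.122.12 ceph-node2"
        - "192.168.122.13 ceph-node3"
        - "192.168.122.14 ceph-node4"
EOF
ansible-playbook -i hosts hosts-entries.yml
```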
Fourth issue: tumbleweed uses systemd, but ceph-deploy doesn’t expect that
Sure enough, a little manual inspection revealed no file at /etc/init.d/ceph; the package ships systemd integration instead.
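The checks were along these lines; I’m not reproducing the output, and the exact commands are my reconstruction:

```
ls /etc/init.d/ceph                     # no such file or directory
rpm -ql ceph | grep -i systemd          # should list systemd unit files instead
systemctl list-unit-files | grep ceph   # the same units as seen by systemd
```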
I learned that this is a known bug,
and I’ll try all of this again with an older version of openSUSE.
… and that’s where I’m calling it a night. I’ll be back at it this week.