I spent the last few days learning about the openSUSE Ceph installation process. I ran into some issues, and I’m not
done yet, so these are just my working notes for now. Once complete, I’ll
write up the process on my regular blog.
Prerequisite: build a tool to build and destroy small clusters quickly
I needed a way to quickly provision and destroy
virtual machines that were well suited to run small Ceph clusters. I mostly
run libvirt / kvm
in my home lab, and I didn’t find any solutions tailored to that platform, so
I wrote ceph-libvirt-clusterer.
Ceph-libvirt-clusterer
lets me clone a template virtual machine and attach as many OSD disks
as I’d like along the way. I’m really happy with the tool
considering that it only has a day’s worth of work in it, and it gave me
an excuse to learn some details of the libvirt API and its python bindings.
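Under the hood it’s automating roughly this kind of virt-clone / virsh
sequence (domain names and paths here are illustrative, not the tool’s
actual values):

    # clone the template into a new node
    virt-clone --original ceph-template --name ceph-node1 --auto-clone

    # create a small qcow2 image and attach it as an OSD disk
    qemu-img create -f qcow2 /var/lib/libvirt/images/ceph-node1-osd0.qcow2 8G
    virsh attach-disk ceph-node1 /var/lib/libvirt/images/ceph-node1-osd0.qcow2 vdb \
        --driver qemu --subdriver qcow2 --persistent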
Build a template machine
I built a template machine with
openSUSE’s tumbleweed and
completed the following preliminary configuration (a sketch of the
equivalent commands follows the list):
created a ceph user
gave the ceph user an SSH key
put the ceph user’s public key in its own authorized_keys file
configured the ceph user for passwordless sudo
installed emacs (not strictly necessary :-) )
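Roughly, as root on the template (the key type and paths are my
assumptions, not a record of what I typed):

    # create the ceph user and allow passwordless sudo
    useradd -m ceph
    echo 'ceph ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/ceph

    # generate a key pair for the ceph user and authorize it for itself
    sudo -u ceph mkdir -m 700 /home/ceph/.ssh
    sudo -u ceph ssh-keygen -t rsa -N '' -f /home/ceph/.ssh/id_rsa
    sudo -u ceph cp /home/ceph/.ssh/id_rsa.pub /home/ceph/.ssh/authorized_keys

    # and, of course, emacs
    zypper --non-interactive install emacs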
Provision a cluster
I used ceph-libvirt-clusterer to create a four node cluster, and each node had
two 8 GB OSD drives attached.
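A quick way to sanity-check that the disks actually landed (node name is
hypothetical):

    virsh domblklist ceph-node1    # should list the root disk plus two OSD disks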
The ceph packages aren’t yet in the mainline repositories, so I added the
OBS ceph repository to the admin node:
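(From memory; the filesystems:ceph OBS project and its URL are assumptions.)

    zypper addrepo http://download.opensuse.org/repositories/filesystems:/ceph/openSUSE_Tumbleweed/ ceph
    zypper --gpg-auto-import-keys refresh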
And ceph packages were visible:
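(I didn’t paste the output; the check was roughly this.)

    zypper search ceph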
First issue: python was missing on the other nodes
When I installed ceph-deploy on the admin node, python was also
installed. The other nodes were still running with a bare minimum
configuration from the tumbleweed install, so python was missing, and
ceph-deploy’s install step failed.
I installed Ansible to correct the problem on all
nodes simultaneously, but Ansible requires python on the remote side, too.
That meant I had to manually install python on the remaining three nodes just
like sysadmins had to do years ago.
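“Manually” amounted to a loop roughly like this (hostnames are
illustrative):

    for node in ceph-node2 ceph-node3 ceph-node4; do
        ssh ceph@$node sudo zypper --non-interactive install python
    done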
Second issue: all nodes need the OBS repository
I didn’t add the OBS repository to the remaining three nodes because I
wanted to see if ceph-deploy would add it automatically. I didn’t expect
that to be the case, but since this version of ceph-deploy came directly from
SUSE, there was a chance.
It didn’t add the repository itself, but fortunately, with python now installed everywhere, Ansible works:
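(The repo URL, alias, and inventory pattern are assumptions; it was
roughly these two commands.)

    ansible all -b -m command -a "zypper addrepo http://download.opensuse.org/repositories/filesystems:/ceph/openSUSE_Tumbleweed/ ceph"
    ansible all -b -m command -a "zypper --non-interactive --gpg-auto-import-keys refresh"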
Once both of these commands completed, ceph-deploy install worked as expected.
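For reference, that step is just (hostnames illustrative):

    ceph-deploy install ceph-node1 ceph-node2 ceph-node3 ceph-node4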
Third issue: I was using IP addresses
ceph-deploy new complains when provided with IP addresses:
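(I didn’t paste the exact error; the failing invocation was something
like this, with illustrative addresses.)

    ceph-deploy new 192.168.122.11 192.168.122.12 192.168.122.13   # rejected: it wants resolvable hostnames, not raw IPs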
In the future, it’d be pretty cool if ceph-libvirt-clusterer supported
updating DNS records so I didn’t need to resort to the hosts-file
ansible playbook that I used today:
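(A rough reconstruction; hostnames and addresses are illustrative.)

    ---
    - hosts: all
      become: true
      tasks:
        - name: add cluster entries to /etc/hosts
          lineinfile:
            dest: /etc/hosts
            line: "{{ item }}"
          with_items:
            - "192.168.122.11 ceph-node1"
            - "192.168.122.12 ceph-node2"
            - "192.168.122.13 ceph-node3"
            - "192.168.122.14 ceph-node4"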
Fourth issue: tumbleweed uses systemd, but ceph-deploy doesn’t expect that
Sure enough, a little manual inspection revealed no file at /etc/init.d/ceph, just systemd unit files:
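(I didn’t paste the output; the inspection was along these lines.)

    ls /etc/init.d/ceph                     # No such file or directory
    systemctl list-unit-files | grep ceph   # the units are here instead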
I learned that this is a known bug,
and I’ll try all of this again with an older version of openSUSE.
… and that’s where I’m calling it a night. I’ll be back at it this week.