Dominic Cronin's weblog
System refresh: new architecture for www.dominic.cronin.nl
It's taken a while, and the odd skinned knuckle and a bit of cursing, but I can finally announce that this site is running on...erm.. the other server. Tada! Ta-ta-ta-diddly.... daaahhhh!!!!
Um yeah - I get it. it's not so exciting is it really? The blog's still here, and it's got more or less the same content. It doesn't look any different. Maybe it's a tiny smidgin faster, but even that's more likely to do with the fact that we switched over to an ISP that actually makes use of the glass that runs in to our meter cupboard.
But I'm excited. Just a bit, anyway. Partly because it's taken me months. It needn't have, but it's the usual question of squeezing it into the cracks between all the other things that need to get done in life. That and the fact that I'm an utter cheapskate and I don't want to pay for anything. There's also plenty not to be excited about. As I said, the functionality is exactly as it was. The benefits I get from it are mostly about the ability to do things better going forward.
So what have I done? Well it all started an incredibly long time ago when I started tinkering with docker. I figured that the whole containerisation technology thing had such a lot of potential that I ought at least to run docker on my own server. After all, over the years, I'd always struggled with Plone needing to have a different version of Python than the one available in the current Gentoo ebuilds. I'd attempted a couple of things, including I think an early version of what became LXC, but then along came virtualenv, which made the whole thing moot.
Yeah, well - until I wanted to play with docker for itself. At this point, I just thought I'd install it on my server, and get going, but I immediately discovered, that the old box I was running was 32-bit, and docker is just far too hip to run on anything so old-fashioned. So I needed a new server, and once I'd realised that, that's when the whole thing started. If I was going to have a new server, why didn't I just containerise everything? It's at this point that someone inevitably chips in with a suggestion that if I weren't such a dinosaur, I'd run it on the cloud, wouldn't I? Well yes - sure! But I told you - I'm a cheapskate, and apart from that, I don't want anyone's soul-less reliability messing with my carefully constructed one-nine availability commitment.
Actually I like cloud tech, but frankly, when you look at the micro-budget that supports this site, I'd have spent all my time searching out a super-cheap host, and even then I'd have begrudged it. So my compromise with myself was that I'd build it all very cloudy, and then the world's various public clouds would be my disaster recovery plan. And so it is. If this server dies, I can get it all up in the cloud with a fairly meagre effort. Still not going to two-nines though.
So I went down to my local high street where there's a shop run by these Indian guys. They always have a good choice of "hardly used" ex-business computers. I think I shelled out a couple of hundred Euros, and then I had something with an i5 and enough memory, and a couple of stupidly big disks to make a raid. Anyway - more than enough for a web server - which is just as well, because pretty soon it ends up just being "the server", and it'll get used for all sorts of other things. All the more reason to containerise everything.
I got the thing home, and instead of doing what I've done many times before, and installing Gentoo linux, I poked around a bit on the Internet and found CoreOS. Gentoo is a masochist's delight. I mean - it runs like a sports car, but you have to own a set of spanners. CoreOS, on the other hand, is more or less maintenance free. It's built on Gentoo's build system, so it inherits the sports car mentality of only installing things you are going to use, but then the guys at CoreOS do that, and their idea of "things you are going to use" is basically everything that it takes to get containers up and keep them running, plus exactly nothing else. For the rest, it's designed for cloud use, so you can install it from bare metal to fully working just by writing a configuration file, and it knows how to update itself while running. (It has a separate partition for the new version, and it just switches over.)
So with CoreOS up and running, the next thing was to convert all the moving parts over to Docker containers. As it stands now, I didn't want to change too much of the basics, so I'm running Plone on a Gentoo container. That's way too much masochism though. I'd already been thinking I'd do a fresh one with a more generic out-of-the-box OS, and I've just realised I can pull a pre-built Plone image based on Debian (or Alpine). This gets better and better. And I can run it all up side-by-side in separate containers until I'm ready to flip the switch. Just great! Hmm... maybe my grand master plan was just to get to Plone 5!
The Gentoo container I'm using is based on one created by the Gentoo community, which you can pull from the Docker hub. Once I found this, I thought I was home and dry, but it's not really well-suited to just pulling automatically from a docker file. What they've done is to separate out the portage tree into a separate container. This is smart, because you are unlikely to want the whole of portage in your container for any given purpose that makes you want to run Gentoo. What you do instead is mount the portage data using docker's --volumes-from argument. With it mounted, you can run emerge and install whatever packages you need, and then at runtime you get to run a much slimmer system. Which is great, but it means you have to create and store your own image manually rather than using a dockerfile. (At least, that's how it ended up for a noob like me, once I realised that dockerfile doesn't have an equivalent of --volumes-from.)
My goal was to set up CoreOs to automatically pull the docker images it needed, and run some setup commands. This meant that I'd need to have my personalised Gentoo image available somewhere. Some of the data in there was sensitive, so I went looking for a private Docker registry that I could upload it to. There are plenty of private registries, but most of them aren't free. (If you don't mind the whole world pulling your containers, then free registries abound.) I eventually found https://canister.io/, which suited my needs. That said, my needs aren't much. If I ever need an alternative to canister, I'll probably look at Google Cloud Platform, which isn't free but has a private container registry where you only pay for storage and data egress, at pretty reasonable rates. Or I could just host it myself, but that's maybe too many eggs in the same basket.
Meanwhile, my very next step ought most probably be to get backups sorted out. The "Dockerish" way to do this is to run up yet another dedicated container to deal with just this concern. Then if I want to host it separately, and my backup approach changes, nothing else needs to. Once I have the backups sorted out, it will definitely be worth the while to tidy things up so that I really can just push to the cloud if needs be. The way it's set up now, I could be up and running again very quickly but we're probably talking hours rather than seconds.
I'm really enjoying the flexibility that containerisation gives me, although it's definitely important to get into the right mindset. Being able to build containers that will run on a really generic platform is quite liberating.
Spoofing a MAC address in gentoo linux
I spent a few hours this weekend fiddling with networking things at home. One of the things I ran into was that the DHCP server provided by my ISP was behaving erratically. Specifically, it was being very fussy about giving out a new lease. It would give out a lease to a Windows 7 system I was using for testing, but not to my Gentoo server. At some point, having spent the day with this kind of frustration, I was ready to put up with almost any hack to get things running. Someone on the #gentoo IRC channel suggested that spoofing the MAC address that already had a lease might be a solution. Their solution was to do this:
ifconfig eth0 down ifconfig eth0 hw ether 08:07:99:66:12:01 ifconfig eth0 up
Here, you have to imagine that eth0 is the name of the interface, although on my system it isn't any more. (Another thing I learned this weekend was about predictable interface names.) You should also imagine that 08:07:99:66:12:01 is the mac address of the network interface on my Win7 system.
The trouble with this is that it doesn't integrate very well in the standard init scripts that get things going on a Gentoo system. Network interfaces are started by running /etc/init.d/net.eth0 (although that's just a link to another script). The configuration is to be found in /etc/init.d/net where you can add directives that control the way your network interfaces are configured. The most important of these are the ones that begin with "config_". For example, to set up a static IP for eth0, you might say something like:
config_eth0="192.168.0.99 netmask 255.255.255.0 brd 192.168.0.255"
or for DHCP it's much simpler:
config_eth0="dhcp"
So my obvious first try for setting up a spoofed MAC address was something like this:
config_eth0="dhcp hw ether 08:07:99:66:12:01"
but this didn't work at all. Anyway - after a bit of fiddling and more Googling (sorry - I can't remember where I found this) it turned out that there's a specific directive just for this purpose. I tried this
mac_eth0="08:07:99:66:12:01" config_eth0="dhcp"
It works a treat. Note that the order is important, which is obvious once you know it I suppose, but wasn't obvious to me until I'd got it wrong once.
The good news after that was that for an established lease, everything worked rather better.
One-nine availability
A couple of weeks ago, this site went down. That happens from time to time. It went down just as I left the country to go on holiday, and it could only be fixed via physical access, so it was down for a week. At least one person has commented that maybe I should stop with this silliness of running my own server on an old Gentoo box in the meter cupboard, and get some proper hosting.
The thing is, that when I started this blog, some years ago now, I went through a detailed requirements analysis, and a full MoSCoWMeh matrix. If you aren't familiar with MoSCoWMeh, this is an enhanced variant of the well-known MoSCoW technique, which also accommodates the needs of private and hobby-run systems.
The requirement for reliable hosting and 5-nines up-time was classified as Meh, and has remained so since. So now you know.
What not to do when upgrading to Grub2
I'd been following the Gentoo Wiki guidance on upgrading Grub, and had been taking it very carefully. I'd worried about getting this right, as getting it wrong would leave me with a brick, so I'd been very pleased to see the notes on using the old bootloader to chain load the new one. That way I could check that my configuration was correct before taking the plunge of installing the new version into the Master Boot Record. I didn't want to automatically generate the new config file, as I didn't trust it. (Rightly so as it turned out, because my initrd files didn't follow the strict naming requirements, so weren't picked up by the config generation script) Anyway - the hand-written config was half a dozen lines long, and the generated one was utterly incomprehensible.
So anyway - I managed to create the config file, and get everything set up for chain loading. I rebooted the server, and bingo - there was the chain loader entry in my "old" boot screen, and when I followed it, I got the new menu and could boot the server. Great stuff! Now it should have been a simple question of running grub2-install, and I'd be finished. So I did this, and then.... the computer wouldn't start. Fortunately I had a grub prompt, so grub was "working" - but it obviously couldn't find its config file. I already knew that with the right incantations it might be possible to get the thing to boot without a config file, and after a bit of googling, I got enough clues to attempt it. (For the record, what I think I'd done wrong was to fail to remount /boot after my chain test and before running grub2-install, with the result that grub then didn't know how to correctly find /boot.)
It took a few attempts, but the command line completion in grub helps a lot. This is what I eventually ended up typing at the grub prompt to get a working boot.
grub > set root=(hd0,1) grub > linux /kernel-gen-newudev-3.3.8-gentoo root=/dev/sda3 grub > initrd /initramfs-gen-newudev-3.3.8-gentoo grub > boot
Note that the root for the boot loader is different from the root of the operating system, so you have to specify them separately. Obviously YMMV for the names of the kernel and initrd files, not to mention device identifiers.
But the real advice here is to avoid missing out that crucial mount operation!!
Gentoo emerge dies with 'failed to open /dev/urandom' when wrong default python is configured.
So there I was - just for fun building my new Gentoo system, when all of a sudden, I wasn't. Building, that is. I wasn't building anything. In fact, part of the motivation for a clean build had been that emerging new things was getting tiresomely fragile. Anyway - here's what happened when I tried an emerge. The interesting part is where it says: Fatal Python error: Failed to open /dev/urandom
>>> Emerging (1 of 18) sys-libs/glibc-2.17 * Fetching files in the background. To view fetch progress, run * `tail -f /var/log/emerge-fetch.log` in another terminal. * glibc-2.17.tar.xz SHA256 SHA512 WHIRLPOOL size ;-) ... [ ok ] * glibc-2.17-patches-8.tar.bz2 SHA256 SHA512 WHIRLPOOL size ;-) ... [ ok ] make -j2 -s glibc-test make -j2 -s glibc-test >>> Unpacking source... * Checking gcc for __thread support ... [ ok ] * Checking kernel version (3.3.8 >= 2.6.16) ... [ ok ] * Checking linux-headers version (3.9.0 >= 2.6.16) ... [ ok ] >>> Unpacking glibc-2.17.tar.xz to /var/tmp/portage/sys-libs/glibc-2.17/work >>> Unpacking glibc-2.17-patches-8.tar.bz2 to /var/tmp/portage/sys-libs/glibc-2.17/work * Applying Gentoo Glibc Patchset 2.17-8 ... * 0035_all_glibc-2.16-i386-math-feraiseexcept-overhead.patch ... [ ok ] * 0059_all_glibc-2.19-make-4.0.patch ... [ ok ] * 0065_all_glibc-2.18-qecvt-guards.patch ... [ ok ] * 0070_all_glibc-2.18-localedef-page-align-1.patch ... [ ok ] * 0071_all_glibc-2.18-localedef-page-align-2.patch ... [ ok ] * 0072_all_glibc-2.18-localedef-page-align-3.patch ... [ ok ] * 0085_all_glibc-disable-ldconfig.patch ... [ ok ] * 0090_all_glibc-2.17-arm-ldso.cache.patch ... [ ok ] * 1005_all_glibc-sigaction.patch ... [ ok ] * 1008_all_glibc-2.16-fortify.patch ... [ ok ] * 1040_all_2.3.3-localedef-fix-trampoline.patch ... [ ok ] * 1055_all_glibc-resolv-dynamic.patch ... [ ok ] * 1505_all_glibc-nptl-stack-grows-up.patch ... [ ok ] * 1506_all_glibc-2.17-hppa-fpu.patch ... [ ok ] * 1507_all_glibc-2.17-hppa-ldso-flag.patch ... [ ok ] * 1507_all_hppa-ia64-DL_AUTO_FUNCTION_ADDRESS.patch ... [ ok ] * 1508_all_glibc-2.17-hppa-futex.patch ... [ ok ] * 1508_all_hppa-fanotify_mark.patch ... [ ok ] * 3020_all_glibc-tests-sandbox-libdl-paths.patch ... [ ok ] * 5063_all_glibc-dont-build-timezone.patch ... [ ok ] * 6024_all_alpha-fix-signal-thunk-unwind-info.patch ... [ ok ] * 6230_all_arm-glibc-hardened.patch ... [ ok ] * Done with patching * Using GNU config files from /usr/share/gnuconfig * Updating scripts/config.sub [ ok ] * Updating scripts/config.guess [ ok ] >>> Source unpacked in /var/tmp/portage/sys-libs/glibc-2.17/work Fatal Python error: Failed to open /dev/urandom /usr/lib/portage/bin/phase-functions.sh: line 87: 4204 Aborted "${PORTAGE_PYTHON:-/usr/bin/python}" "${PORTAGE_BIN_PATH}"/ilter-bash-environment.py "${filtered_vars}" * ERROR: sys-libs/glibc-2.17::gentoo failed (unpack phase): * filter-bash-environment.py failed * * Call stack: * ebuild.sh, line 714: Called __ebuild_main 'unpack' * phase-functions.sh, line 993: Called __filter_readonly_variables '--filter-features' * phase-functions.sh, line 137: Called die * The specific snippet of code: * "${PORTAGE_PYTHON:-/usr/bin/python}" "${PORTAGE_BIN_PATH}"/filter-bash-environment.py "${filtered_vars}" || die "filter-bash-enviroment.py failed"
So what was going on here? Well as it turned out, my system has three versions of python loaded, and Gentoo's portage system (of which emerge is part) seems to rely on you not using python 3. After a short bit of fiddling with "eselect python list" and "eselect python set", to get the default python back to 2.7, the build ran like a charm.
So anyway - this has got to count as the most bizarrely mis-reported error in my most recent years. "/dev/urandom" was working fine. I could start it and stop it ("/etc/init.d/urandom stop" and so forth) and I could use it to access randomness. Why then did I get the "failed to open" message with one version of python, and not with another. Answers on a postcard? Whatever - this was a public service announcement.