LVM snapshots for a resettable machine

I have ended up maintaining a few websites which we are hosting on a machine off in Germany somewhere.

I want to get everything automated, so I have less work to do if something goes wrong.

I’m using Ansible, which is wonderful, and have written a nice set of playbooks which take a raw CentOS install and install everything (php-fpm, nginx, etc.), set up the virtual hosts, and install WordPress and Joomla for the sites that need them.

Until today, I’ve been using VirtualBox on my local computer to test on, and it works great.  I haven’t bothered with Vagrant: I tried it for a couple of days, and it crashed my whole computer twice, so I gave up.  With VirtualBox, it’s almost as simple.  I have a virtual machine which I can spin up and install stuff on, and when I want to go back to a fresh machine, it’s a matter of turning it off and clicking ‘restore snapshot’ to return to the snapshot I made when it was freshly installed.

It’s practically instant, and just works.

However, running a virtual machine on my primary work computer all the time does make everything else somewhat sluggish.  So I’ve scrounged an old computer that wasn’t doing anything, and am now using that instead.

Here’s how I got snapshots and restore points working on it:

  1. Install CentOS, leaving a bunch of free space in the LVM volume group.
  2. Make a snapshot when it’s first installed.
  3. Restore (merge to) that snapshot whenever I want it back to original settings.
  4. Reboot.

To make and restore the snapshots, I’ve written the commands as scripts so I don’t have to remember the lv-whatever stuff. (vg_localtest is the name of the volume group I set up for the HD when I installed.)


lvcreate --size 100G -s -n original_snapshot /dev/vg_localtest/root


lvconvert --merge /dev/vg_localtest/original_snapshot && reboot
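
If you’d rather keep both commands in one place, something like the following Python 3 sketch might do. It only builds the two command lines above and prints them unless told otherwise; the function names (make_command, restore_command, run) and the dry-run default are made up for this example and are not part of my actual scripts.

```python
import subprocess

# Names from the install described above; adjust for your own machine.
VG = 'vg_localtest'
LV = 'root'
SNAPSHOT = 'original_snapshot'

def make_command(size='100G'):
    # Equivalent to: lvcreate --size 100G -s -n original_snapshot /dev/vg_localtest/root
    return ['lvcreate', '--size', size, '-s', '-n', SNAPSHOT,
            '/dev/{0}/{1}'.format(VG, LV)]

def restore_command():
    # Equivalent to: lvconvert --merge /dev/vg_localtest/original_snapshot
    return ['lvconvert', '--merge', '/dev/{0}/{1}'.format(VG, SNAPSHOT)]

def run(command, dry_run=True):
    # Hypothetical helper: prints the command by default instead of running it.
    if dry_run:
        print(' '.join(command))
        return 0
    return subprocess.call(command)

run(make_command())
run(restore_command())
```

Passing dry_run=False would actually execute the command (as root), and a reboot is still needed after the merge.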

It works great so far.  I’m making one improvement, since I once forgot to make a snapshot and so couldn’t restore to a blank slate without re-installing the whole thing (which, admittedly, only takes half an hour or so).

I’m adding ‘snapshot_make’ to a boot script, and modifying it to remove itself from the boot script once it has made the snapshot.  That way, as soon as the machine reboots into its original state, it will automatically re-create the snapshot.

This looks like:


lvcreate --size 100G -s -n original_snapshot /dev/vg_localtest/root

sed -i -e '/snapshot/d' /etc/rc.local

and then /etc/rc.local will look like:

#… whatever it has
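
To see what that sed line does to the file, here is a rough Python 3 equivalent working on a string instead of on /etc/rc.local. The function name and the sample contents (including the /path/to/snapshot_make line) are invented for illustration; they are not my real rc.local.

```python
def drop_matching_lines(text, needle='snapshot'):
    # Keep every line that does not contain the needle, like sed '/snapshot/d'.
    kept = [line for line in text.splitlines(True) if needle not in line]
    return ''.join(kept)

# Hypothetical rc.local contents, for demonstration only.
example = "#!/bin/sh\necho booting\n/path/to/snapshot_make\n"
print(drop_matching_lines(example))
```

Note that the match is on the whole line, so any other line in rc.local that happens to mention ‘snapshot’ (a comment, say) would be deleted too.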


Merging directories with the magic of Python.

We finally got the last projects out of that monstrosity ‘Final Cut Server’, but one project at the end was a nightmare to export, and we weren’t sure which of its files were already in a different version of the project that we had.

We essentially needed to merge two different versions of a project’s directories, making sure not to lose any files or the way they were organized.

Here’s a quick Python script I whipped up to make it quicker.

With the 4000-odd files in the project, it took under a second to run, and it turned out we only had about 20 files which hadn’t already been merged.  Much simpler to sort out.

The script took about 10 minutes to write and test. This is why you should learn to program.  Hacking stuff like this up is easy, and saves *so* much time.

(Yes, you probably could do this with a couple of lines of Perl or Bash, but what the heck.)

#!/usr/bin/env python3

from subprocess import Popen, PIPE
from os import stat, fsdecode
from os.path import basename, abspath

def run(*command):
    # Run a command and return everything it wrote to stdout as text.
    found = Popen(command, stdout=PIPE)
    return fsdecode(found.communicate()[0])

def files_in(dirname):
    # NUL-separated output copes with spaces and newlines in file names.
    return [x for x in run('find', abspath(dirname), '-type', 'f', '-print0').split(chr(0)) if x]

if __name__ == '__main__':
    from sys import argv, exit

    try:
        sourcedir = files_in(argv[1])
        destdir = files_in(argv[2])
    except IndexError:
        print('Usage:')
        print(argv[0], '<sourcedir> <destdir>')
        print('Where you want to check if files in <sourcedir> are also in <destdir>')
        print('(but perhaps with a different relative path)')
        exit(1)

    print('---------------------------------------------------')
    print('{0} files in {1}'.format(len(sourcedir), abspath(argv[1])))
    print('{0} files in {1}'.format(len(destdir), abspath(argv[2])))
    print('---------------------------------------------------')

    # Index the destination files by base name, ignoring which directory they are in.
    destnames = {}

    for destfile in destdir:
        destnames[basename(destfile)] = {'size': stat(destfile).st_size,
                                         'path': destfile }

    # Report source files that are missing from, or a different size in, the destination.
    for newfile in sourcedir:
        base = basename(newfile)
        if base not in destnames:
            print(newfile, 'is NOT in the new dir')
        else:
            destfile = destnames[base]
            if stat(newfile).st_size != destfile['size']:
                print('{0}({1}) differs from {2}({3})'.format(
                      newfile, stat(newfile).st_size,
                      destfile['path'], destfile['size']))
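
One limitation of the script above is that it remembers only one destination file per base name, so two files with the same name in different folders shadow each other. Here is a hedged Python 3 sketch of a variant that keeps every candidate and walks the tree with os.walk instead of calling find. The helper names (index_by_name, missing_from) are invented here, and ‘same name and same size’ is only a heuristic for ‘same file’, just as it was in the original.

```python
import os
from collections import defaultdict

def files_in(dirname):
    # Walk the tree in pure Python; skip symlinks, as find -type f does.
    found = []
    for root, _dirs, names in os.walk(os.path.abspath(dirname)):
        for name in names:
            path = os.path.join(root, name)
            if os.path.isfile(path) and not os.path.islink(path):
                found.append(path)
    return found

def index_by_name(paths):
    # Map each base name to the sizes of every file that carries it.
    index = defaultdict(list)
    for path in paths:
        index[os.path.basename(path)].append(os.path.getsize(path))
    return index

def missing_from(sourcedir, destdir):
    # Source files with no same-named, same-sized file anywhere under destdir.
    index = index_by_name(files_in(destdir))
    missing = []
    for path in files_in(sourcedir):
        sizes = index.get(os.path.basename(path), [])
        if os.path.getsize(path) not in sizes:
            missing.append(path)
    return sorted(missing)
```

Calling missing_from('old_export', 'current_project') (both directory names are placeholders) would return the list of files still needing a home. Comparing checksums rather than sizes would be stricter, at the cost of reading every file.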