David Nalley's Blog

A Runbook for CloudStack


Documentation is one of those vital things that any software, but especially open source software, needs to be successful. When anyone can come along and download your software and use it, you suddenly have an incredibly diverse audience for not just your software but also your documentation. My experience with CloudStack has made that all the more evident to me. CloudStack, like many projects, requires an understanding of a number of different areas of practice – namely virtualization, networking, Linux, and storage. The number of people who are experts in all of those areas is pretty small. Someone might be an ‘expert’ in 2 or 3 of those areas, but which ones is always a mystery, and then of course you have folks who aren’t experts at all. So writing for a specific audience is at best tedious.

The existing documentation has one other ‘flaw’: it documents almost everything. This really isn’t a flaw, since you really do want documentation for everything, but if your target is new users of your software, documenting every esoteric feature they could possibly make use of is painful.

In #cloudstack on irc.freenode.net I kept hearing a common refrain: it would take new folks 1-2 weeks to get CloudStack operating successfully, but once it was up, it ran great. The problem is that as a sysadmin, I know that being interrupt-driven means that things that take a ton of time and effort are going to be dropped along the way, and perhaps never picked up again. While we were discussing this problem, Chiradeep called my attention to RightScale’s runbooks; they don’t explore every option, and instead provide a prescriptive path to success for one niche way of doing things.

So I began to work on removing all of the choices – I picked the operating system, type of storage, network model, hypervisor, even the network addresses, and then documented the procedure for assembling those pieces into a CloudStack cloud. I’ve published the first revision here, with lots of help from Joe Brockmeier, Chiradeep, and Watnuss.


A good chunk of this should be completely reusable – if people wanted to alter the network model, the hypervisor, or any of the other choices, the base documentation should still be good. So feel free to fork it or improve what is already there. You can find the source here: