So I was testing my restore-from-backup process for chef and ran into a few problems. The first problem I encountered was that my nginx load balancer config files are dynamically created based on the roles assigned to boxes. After my restore, one of the first boxes I tested was one of the LB boxes, and to my horror, even though the systems were listed when I did a chef node list, it seems that until they have checked into the restored chef server they are not counted. This means that my nginx config server pools were empty … bummer. The easy fix here was to have my servers move over to the restored chef server instance from the bottom up … i.e. sql boxes, then web boxes, then edge LB stuff. Not a huge problem, but it does mean that if you ever have to restore a chef box, stop all clients before you bring it up.
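That bottom-up move is easy to wrap in a small script. This is just a sketch: the `migrate_clients` helper, the host names, and the init-script path are my assumptions, and the remote-run command is injectable so the ordering is easy to dry-run.

```shell
# migrate_clients: restart chef-client one host at a time, data tier
# first and edge load balancers last, so node searches are populated
# before the LB templates render.
# First argument is the remote-run command (ssh in real life);
# making it injectable lets you dry-run the ordering with echo.
migrate_clients() {
    runner="$1"; shift
    for host in "$@"; do
        $runner "$host" "/etc/init.d/chef-client restart"
    done
}

# Real run (host names are made up):
#   migrate_clients ssh sql01 sql02 web01 web02 lb01
```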
The other odd problem I had was that one node with a local variable assigned to it did not pull the variable over. The variable in question had not changed in months, so my daily backups should have contained this info. Even though it was the db password access for that system, I got lucky: I had removed the notify-restart on a lot of services before the restore to minimize the impact of changes, so overall it went pretty well.
My backups … tar zcvf `date +%Y%m%d`.`hostname`.chef.tar.gz /var/lib/couchdb/ /etc/chef
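That one-liner can be wrapped into something cron-friendly. A minimal sketch; the function form, the backup destination, and dropping -v (to keep cron mail quiet) are my additions.

```shell
# chef_backup: tar the chef data directories into a dated archive
# named like the one-liner above, and print the archive path.
chef_backup() {
    backup_dir="$1"; shift              # where archives land
    mkdir -p "$backup_dir"
    archive="$backup_dir/$(date +%Y%m%d).$(hostname).chef.tar.gz"
    tar zcf "$archive" "$@"             # -v dropped so cron stays quiet
    echo "$archive"
}

# Nightly, on the chef server (destination path is an assumption):
#   chef_backup /var/backups/chef /var/lib/couchdb /etc/chef
```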
Restore: build the server, install chef-server, stop chef-server, drop the tar contents into place, and start chef-server.
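As a script, the restore half might look like this. A sketch only: the sysv init-script path is an assumption for this era of chef, and unpacking at / simply mirrors how the backup archive above is built.

```shell
# chef_restore: stop the server, unpack the backup archive back into
# place, and start the server again. The target root is a parameter
# so it is / on a real restore but can be a scratch dir for testing.
chef_restore() {
    archive="$1"
    root="${2:-/}"
    # /etc/init.d/chef-server stop     # assumption: sysv init script
    tar zxf "$archive" -C "$root"      # puts couchdb data + /etc/chef back
    # /etc/init.d/chef-server start
}
```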
One final thought: I had to restore a 0.8.16 system after 0.9.8 was out, which turns out to be a problem, as the latest bootstrap files do not work with 0.8.16. Luckily I had a local copy of the bootstrap that I used for 0.8.x installs and was able to run from there. I suggest you back up any files you use for installs locally, just in case.
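One way to do that is to stash a dated copy of every bootstrap/install file the moment you use it. A sketch; the `stash_installer` name and the default stash directory are my assumptions.

```shell
# stash_installer: keep a dated local copy of a bootstrap/install
# file so you can rebuild a server at the same old version later.
stash_installer() {
    src="$1"
    dest="${2:-/var/backups/chef-installers}"   # assumption: stash dir
    mkdir -p "$dest"
    cp "$src" "$dest/$(date +%Y%m%d).$(basename "$src")"
}

# e.g. stash_installer ./bootstrap-0.8.16.sh
```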