Chef error: marshal data too short

WARN: HTTP Request Returned 500 Internal Server Error: marshal data too short … what to do?

jmiller@srv-101-29:~$ sudo chef-client
[Tue, 10 Aug 2010 12:36:13 -0700] INFO: Starting Chef Run
[Tue, 10 Aug 2010 12:36:28 -0700] WARN: HTTP Request Returned 500 Internal Server Error: marshal data too short
/usr/lib/ruby/1.8/net/http.rb:2097:in `error!’: 500 “Internal Server Error” (Net::HTTPFatalError)
from /usr/lib/ruby/1.8/chef/rest.rb:216:in `api_request’
from /usr/lib/ruby/1.8/chef/rest.rb:267:in `retriable_rest_request’
from /usr/lib/ruby/1.8/chef/rest.rb:197:in `api_request’
from /usr/lib/ruby/1.8/chef/rest.rb:100:in `get_rest’
from /usr/lib/ruby/1.8/chef/client.rb:270:in `sync_cookbooks’
from /usr/lib/ruby/1.8/chef/client.rb:86:in `run’
from /usr/lib/ruby/1.8/chef/application/client.rb:215:in `run_application’
from /usr/lib/ruby/1.8/chef/application/client.rb:207:in `loop’
from /usr/lib/ruby/1.8/chef/application/client.rb:207:in `run_application’
from /usr/lib/ruby/1.8/chef/application.rb:62:in `run’
from /usr/bin/chef-client:25
jmiller@srv-101-29:~$

Looking at this, I thought it was a checksum error on the client, so I deleted the /var/chef/cache directory, without luck. After digging around I found that stopping the chef server, deleting /var/chef/cache/checksums, and then restarting the chef server fixed the problem. An easy fix for an odd problem. This was on Chef 0.8.16.

Chef Performance Tuning — Part 1

It turns out chef-server is a CPU hog. I am not sure why, since all it is really doing is attribute storage and file pushing. I started noticing that chef-client runs against my 66-node chef server farm were taking longer and longer. At first I looked at disk, as I did not think chef-server would have problems with a farm this small. After much investigation that did not seem to be the problem; then I noticed while watching top that chef-server was using 99% of one core 85% of the time. While I do not claim to be an expert, here is the solution that worked for me.

I am reading more of the chef-server code and thinking I overcomplicated this a bunch but am checking with others to confirm … this works but may not be the best solution.


One workaround is to run additional merb workers. To do this on a gems install, edit /etc/service/chef-server/run

I have added -c 8 (-c, --cluster-nodes NUM_MERBS: number of merb daemons to run):

jmiller@srv-101-03:~$ cat /etc/service/chef-server/run
#!/bin/sh
PATH=/usr/local/bin:/usr/local/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/lib/ruby/gems/1.8/bin
exec 2>&1
exec /usr/bin/env chef-server -c 8 -N -p 4000 -e production -P /var/run/chef/server.%s.pid
jmiller@srv-101-03:~$

Then restart the chef server:

sudo /etc/init.d/chef-server restart

This will spawn 8 merb workers on ports 4000 through 4007 (port 4040 is the chef-server-webui):

jmiller@srv-101-03:~$ ps -eaf |grep merb |grep -v grep
root 3559 12380 0 Jun14 ? 00:00:37 merb : worker (port 4040)
root 3623 12342 0 14:08 ? 00:00:07 merb : spawner (ports 4000)
root 3638 3623 12 14:08 ? 00:18:40 merb : worker (port 4004)
root 3639 3623 12 14:08 ? 00:18:47 merb : worker (port 4005)
root 3640 3623 11 14:08 ? 00:17:17 merb : worker (port 4006)
root 3641 3623 12 14:08 ? 00:18:30 merb : worker (port 4007)
root 10890 1 64 Jun14 ? 11:18:07 merb : worker (port 4000)
root 10891 1 5 Jun14 ? 00:54:46 merb : worker (port 4001)
root 10892 1 4 Jun14 ? 00:51:57 merb : worker (port 4002)
root 10893 1 4 Jun14 ? 00:51:03 merb : worker (port 4003)
jmiller@srv-101-03:~$

Apply the correct recipes to the chef server

"recipe[apache2]",
"recipe[apache2::mod_status]",
"recipe[apache2::mod_proxy]",
"recipe[apache2::mod_proxy_http]",
"recipe[apache2::mod_proxy_balancer]",
"recipe[apache2::mod_rewrite]",
"recipe[apache2::mod_headers]",

Now that we have the workers, we need to load balance requests across them. Opscode provides examples for an Apache load balancer, so that is what I chose to use. Port 4080 was a random choice that works in my environment:

jmiller@srv-101-03:~$ cat /etc/apache2/sites-available/chef.example.com
Listen 4080
<VirtualHost *:4080>
  ServerName chef.example.com
  DocumentRoot /usr/share/chef-server/public

  <Proxy balancer://chef_server>
    BalancerMember http://127.0.0.1:4000
    BalancerMember http://127.0.0.1:4001
    BalancerMember http://127.0.0.1:4002
    BalancerMember http://127.0.0.1:4003
    BalancerMember http://127.0.0.1:4004
    BalancerMember http://127.0.0.1:4005
    BalancerMember http://127.0.0.1:4006
    BalancerMember http://127.0.0.1:4007
    Order deny,allow
    Allow from all
  </Proxy>

  LogLevel info
  ErrorLog /var/log/apache2/chef_server-error.log
  CustomLog /var/log/apache2/chef_server-access.log combined

  RewriteEngine On
  RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
  RewriteRule ^/(.*)$ balancer://chef_server%{REQUEST_URI} [P,QSA,L]
</VirtualHost>
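
Typing the BalancerMember lines by hand gets error-prone if the worker count changes. Here is a minimal Ruby sketch that generates them, assuming (as in the config above) that the workers listen on consecutive ports starting at the -p value. The helper name balancer_members is made up for illustration and is not part of chef or merb.

```ruby
# Hypothetical helper: build the BalancerMember lines for `count` merb workers
# listening on consecutive ports starting at `base_port`.
def balancer_members(base_port, count, host = "127.0.0.1")
  (0...count).map { |i| "BalancerMember http://#{host}:#{base_port + i}" }
end

# Matches the setup in this post: -c 8 -p 4000
puts balancer_members(4000, 8)
```

The output could be pasted into the Proxy block or rendered from a template if the site file is managed by chef.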

Enable the new vsite:

jmiller@srv-101-03:~$ sudo a2ensite chef.example.com
Site chef.example.com already enabled
jmiller@srv-101-03:~$

Reload Apache:

jmiller@srv-101-03:~$ sudo /etc/init.d/apache2 reload
* Reloading web server config apache2
Warning: DocumentRoot [/usr/share/chef-server/public] does not exist
[Tue Jun 15 16:48:21 2010] [warn] NameVirtualHost *:443 has no VirtualHosts
…done.
jmiller@srv-101-03:~$

Make sure it works. First update the port in your knife chef_server_url:

jmiller@srv-101-03:~$ cat .chef/knife.rb
log_level :warn
#log_location "/home/jmiller/.chef/knife.log"
node_name 'jmiller'
client_key '/home/jmiller/.chef/jmiller.pem'
validation_client_name 'chef-validator'
validation_key '/home/jmiller/.chef/chef-validator.pem'
chef_server_url 'http://srv-101-03.example.com:4080'
cache_type 'BasicFile'
cache_options( :path => '/home/jmiller/.chef/checksums' )
cookbook_path [ '/home/jmiller/site-cookbooks' ]

jmiller@srv-101-03:~$

Before adding the workers this was taking 2 to 9 seconds. The run below is the slowest I have seen since the change 🙂

jmiller@srv-101-03:~$ time knife role list
[
“APACHE_ROLE”,

“WEBSERVER_ROLE”
]

real 0m0.673s
user 0m0.310s
sys 0m0.100s
jmiller@srv-101-03:~$

Now you will need to update /etc/chef/client.rb on all the systems to point at the new port and restart the chef-client daemon. I suggest you use chef to do it.

Thank you to Josh Timberman, Adam Jacob, and holoway for many pointers.

Automated role updates with knife

In this example we want to update a role. These are the basics; you will need to automate the actual edit of the JSON file in whatever language you like.


List the roles, no sample role

joshua-millers-macbook-pro:chef jmiller$ knife role list
[
“APACHE_ROLE”,
“APPBASE_ROLE”,
“APTREPO_ROLE”,
“WEBSERVER_ROLE”
]
joshua-millers-macbook-pro:chef jmiller$

Dump the BASE_ROLE so we can use it to create a new role

joshua-millers-macbook-pro:chef jmiller$ knife role show BASE_ROLE > SAMPLE_ROLE.json
joshua-millers-macbook-pro:chef jmiller$

Edit the role; going to do it manually here but could be done with perl …

joshua-millers-macbook-pro:chef jmiller$ cat SAMPLE_ROLE.json
{
  "name": "SAMPLE_ROLE",
  "default_attributes": {
  },
  "json_class": "Chef::Role",
  "run_list": [
  ],
  "description": "All nodes wiil get this base",
  "chef_type": "role",
  "override_attributes": {
    "authorization": {
      "sudo": {
        "groups": [
          "dev"
        ],
        "users": [

        ]
      }
    },
    "chef": {
      "client_splay": "20",
      "client_interval": "900",
      "server_fqdn": "chef.example.com"
    },
    "postfix": {
      "myorigin": "mail.example.com",
      "relayhost": "mailrelay.example.com",
      "mydomain": "example.com"
    },
    "ntp": {
      "is_server": false,
      "service": "ntpd",
      "servers": [
        "time01.example.com",
        "time02.example.com"
      ]
    }
  }
}
joshua-millers-macbook-pro:chef jmiller$

I am creating the role, so knife is going to report a "Not Found" error before it creates it:

joshua-millers-macbook-pro:chef jmiller$ knife role from file SAMPLE_ROLE.json
WARN: HTTP Request Returned 404 Not Found: Cannot load role SAMPLE_ROLE
WARN: Updated Role SAMPLE_ROLE!
joshua-millers-macbook-pro:chef jmiller$

Sample role created:

joshua-millers-macbook-pro:chef jmiller$ knife role list | grep SAMPLE
“SAMPLE_ROLE”,
joshua-millers-macbook-pro:chef jmiller$

Here is what we have:

joshua-millers-macbook-pro:chef jmiller$ knife role show SAMPLE_ROLE
{
“name”: “SAMPLE_ROLE”,
“default_attributes”: {
},
“json_class”: “Chef::Role”,
“run_list”: [

],
“description”: “All nodes wiil get this base”,
“chef_type”: “role”,
“override_attributes”: {
“authorization”: {
“sudo”: {
“groups”: [
“dev”
],
“users”: [

]
}
},
“chef”: {
“client_splay”: “20”,
“client_interval”: “900”,
“server_fqdn”: “chef.example.com”
},
“postfix”: {
“myorigin”: “mail.example.com”,
“relayhost”: “mailrelay.example.com”,
“mydomain”: “example.com”
},
“ntp”: {
“is_server”: false,
“service”: “ntpd”,
“servers”: [
“time01.example.com”,
“time02.example.com”
]
}
}
}
joshua-millers-macbook-pro:chef jmiller$

I update the role (this could be automated with a script) and upload it to chef:

joshua-millers-macbook-pro:chef jmiller$ vi SAMPLE_ROLE.json

joshua-millers-macbook-pro:chef jmiller$ cat SAMPLE_ROLE.json
{
  "name": "SAMPLE_ROLE",
  "default_attributes": {
  },
  "json_class": "Chef::Role",
  "run_list": [
  ],
  "description": "All nodes wiil get this base",
  "chef_type": "role",
  "override_attributes": {
    "ntp": {
      "is_server": false,
      "service": "ntpd",
      "servers": [
        "time01.example.com",
        "time02.example.com"
      ]
    }
  }
}
joshua-millers-macbook-pro:chef jmiller$ knife role from file SAMPLE_ROLE.json
WARN: Updated Role SAMPLE_ROLE!
joshua-millers-macbook-pro:chef jmiller$ knife role show SAMPLE_ROLE
{
“name”: “SAMPLE_ROLE”,
“default_attributes”: {
},
“json_class”: “Chef::Role”,
“run_list”: [

],
“description”: “All nodes wiil get this base”,
“chef_type”: “role”,
“override_attributes”: {
“ntp”: {
“is_server”: false,
“service”: “ntpd”,
“servers”: [
“time01.example.com”,
“time02.example.com”
]
}
}
}
joshua-millers-macbook-pro:chef jmiller$
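
The manual vi step is the part worth automating. Here is a minimal sketch in Ruby, assuming the role file has the shape shown above. The helper name prune_overrides and the list of keys to keep are illustrative only and not part of knife; the JSON here is a cut-down stand-in for SAMPLE_ROLE.json.

```ruby
require "json"

# Hypothetical helper: return a copy of the role that keeps only the listed
# override_attributes keys, mirroring the manual edit that left just "ntp".
def prune_overrides(role, keep)
  pruned = role.dup
  overrides = role.fetch("override_attributes", {})
  pruned["override_attributes"] = overrides.select { |key, _| keep.include?(key) }
  pruned
end

# create_additions: false keeps the parser from trying to build a Chef::Role
# object out of the "json_class" field; we only want a plain Hash.
role = JSON.parse(<<~ROLE, create_additions: false)
  {
    "name": "SAMPLE_ROLE",
    "json_class": "Chef::Role",
    "chef_type": "role",
    "override_attributes": {
      "postfix": { "mydomain": "example.com" },
      "ntp": { "service": "ntpd", "is_server": false }
    }
  }
ROLE

updated = prune_overrides(role, ["ntp"])
puts JSON.pretty_generate(updated)
```

In a real script you would read the file with File.read, write the result back to SAMPLE_ROLE.json, and then run knife role from file SAMPLE_ROLE.json as above.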

It looks like we should be able to use the following to do the role edit on the chef server itself with the webui key, or we could create another client pem just for this task:

root@chef:~# knife role show SAMPLE_ROLE -s http://chef.example.com:4000 -u chef-webui -k /etc/chef/webui.pem
{
“name”: “SAMPLE_ROLE”,
“default_attributes”: {

},
“json_class”: “Chef::Role”,
“run_list”: [

],
“description”: “All nodes wiil get this base”,
“chef_type”: “role”,
“override_attributes”: {
“ntp”: {
“is_server”: false,
“service”: “ntpd”,
“servers”: [
“time01.example.com”,
“time02.example.com”
]
}
}
}
root@chef:~# knife role show SAMPLE_ROLE -s http://chef.example.com:4000 -u chef-webui -k /etc/chef/webui.pem > SAMPLE_ROLE.json
root@chef:~# vi SAMPLE_ROLE.json
root@chef:~# knife role from file SAMPLE_ROLE.json -s http://chef.example.com:4000 -u chef-webui -k /etc/chef/webui.pem
WARN: Updated Role SAMPLE_ROLE!
root@chef:~# knife role show SAMPLE_ROLE -s http://chef.example.com:4000 -u chef-webui -k /etc/chef/webui.pem
{
“name”: “SAMPLE_ROLE”,
“default_attributes”: {

},
“json_class”: “Chef::Role”,
“run_list”: [

],
“description”: “A sample role”,
“chef_type”: “role”,
“override_attributes”: {
“ntp”: {
“is_server”: false,
“service”: “ntpd”,
“servers”: [
“time01.example.com”,
“time02.example.com”
]
}
}
}
root@chef:~#

Chef 0.8.x Deb and Upstart

My chef clients have been crashing, and it's always a bummer to ssh in and restart them. I could just have my monitoring system start them, but why bother when Ubuntu has a wonderful built-in way to make sure the service stays up!


First I downloaded the chef recipe from opscode, then I added the following.

joshua-millers-macbook-pro:site-cookbooks jmiller$ cat chef/recipes/client-deb.rb
#
# Author:: Joshua Miller
# Cookbook Name:: chef
# Recipe:: client-deb
#
# Copyright 2008-2010, Fitsnips.net
#
# Licensed under the Apache License, Version 2.0 (the “License”);
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an “AS IS” BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Since the deb is already installed by this point, I don't install it here.
case node[:platform]
when "ubuntu"
  # Upstart is on karmic and above by default ... not sure about lower versions
  if node[:platform_version].to_f >= 9.10

    # My chef server is installed with gems, but for ease of auto install I am using debs
    # in my kickstart build with a local apt-mirror. Because of that I have added a check for
    # chef-server: if it's there we don't make any changes.
    template "/etc/init.d/chef-client" do
      source "chef-client-upstartjob.erb"
      owner "root"
      group "root"
      mode 0774
      backup 0
      not_if do File.symlink?("/etc/init.d/chef-server") end
    end

    service "chef-client" do
      provider Chef::Provider::Service::Upstart
      supports :restart => true, :reload => true
    end

    template "/etc/default/chef-client" do
      source "default-chef-client.erb"
      owner "root"
      group "root"
      mode 0644 # octal; a bare 644 would be read as a decimal number
      backup 0
      not_if do File.symlink?("/etc/init.d/chef-server") end
    end

    template "/etc/init/chef-client.conf" do
      source "upstart-chef-client.conf.erb"
      owner "root"
      group "root"
      mode 0644
      backup 0
      notifies :start, resources(:service => "chef-client")
      not_if do File.symlink?("/etc/init.d/chef-server") end
    end

  end

end

Then we create a few templates:

joshua-millers-macbook-pro:site-cookbooks jmiller$ cat chef/templates/default/upstart-chef-client.conf.erb
start on runlevel [2345]

script
exec /usr/bin/env chef-client -c /etc/chef/client.rb -i <%= @node[:chef][:client_interval] %> -s <%= @node[:chef][:client_splay] %>
end script

# Restart the process if it dies with a signal
# or exit code not given by the ‘normal exit’ stanza.
respawn

# Give up if restart occurs 10 times in 90 seconds.
respawn limit 10 90

Let's make it easy on the other admins who are not used to Upstart:

joshua-millers-macbook-pro:site-cookbooks jmiller$ cat chef/templates/default/chef-client-upstartjob.erb
#!/bin/sh -e
# upstart-job
#
# Symlink target for initscripts that have been converted to Upstart.

set -e

INITSCRIPT="$(basename "$0")"
JOB="${INITSCRIPT%.sh}"

if [ "$JOB" = "upstart-job" ]; then
    if [ -z "$1" ]; then
        echo "Usage: upstart-job JOB COMMAND" 1>&2
        exit 1
    fi

    JOB="$1"
    INITSCRIPT="$1"
    shift
else
    if [ -z "$1" ]; then
        echo "Usage: $0 COMMAND" 1>&2
        exit 1
    fi
fi

COMMAND="$1"
shift

if [ -z "$DPKG_MAINTSCRIPT_PACKAGE" ]; then
    ECHO=echo
else
    ECHO=:
fi

$ECHO "Rather than invoking init scripts through /etc/init.d, use the service(8)"
$ECHO "utility, e.g. service $INITSCRIPT $COMMAND"

case $COMMAND in
status)
    $ECHO
    $ECHO "Since the script you are attempting to invoke has been converted to an"
    $ECHO "Upstart job, you may also use the $COMMAND(8) utility, e.g. $COMMAND $JOB"
    $COMMAND "$JOB"
    ;;
start|stop|restart)
    $ECHO
    $ECHO "Since the script you are attempting to invoke has been converted to an"
    $ECHO "Upstart job, you may also use the $COMMAND(8) utility, e.g. $COMMAND $JOB"
    PID=$(status "$JOB" 2>/dev/null | awk '/[0-9]$/ { print $NF }')
    if [ -z "$PID" ] && [ "$COMMAND" = "stop" ]; then
        exit 0
    elif [ -n "$PID" ] && [ "$COMMAND" = "start" ]; then
        exit 0
    elif [ -z "$PID" ] && [ "$COMMAND" = "restart" ]; then
        start "$JOB"
        exit 0
    fi
    $COMMAND "$JOB"
    ;;
reload|force-reload)
    $ECHO
    $ECHO "Since the script you are attempting to invoke has been converted to an"
    $ECHO "Upstart job, you may also use the $COMMAND(8) utility, e.g. $COMMAND $JOB"
    reload "$JOB"
    ;;
*)
    $ECHO
    $ECHO "The script you are attempting to invoke has been converted to an Upstart" 1>&2
    $ECHO "job, but $COMMAND is not supported for Upstart jobs." 1>&2
    exit 1
esac

Chef search and templates … be aware

Search return order is inconsistent; is there a way to deal with it? I really would not care, except that every time the order changes the rendered config changes too, so chef restarts perlbal on every run, which is an issue.

In my recipe:

# Collect the IPs returned by each search
# (arrays initialised here so the snippet stands alone)
WEBSERVER_ROLE_host = []
APACHE_ROLE_host = []

search(:node, "role:WEBSERVER_ROLE") do |n|
  WEBSERVER_ROLE_host << n['ipaddress']
end

search(:node, "role:APACHE_ROLE") do |n|
  APACHE_ROLE_host << n['ipaddress']
end

template "/etc/perlbal/perlbal.conf" do
  source "perlbal.conf.erb"
  mode 0440
  owner "root"
  group "root"
  variables(
    :WEBSERVER_ROLE_host => WEBSERVER_ROLE_host,
    :APACHE_ROLE_host => APACHE_ROLE_host
  )
  backup 1
  notifies :restart, resources(:service => "perlbal")
end

My Template:

CREATE POOL app_pool
<% @WEBSERVER_ROLE_host.each do |n| -%>
POOL app_pool ADD <%= n %>:80
<% end -%>

CREATE POOL media_pool
<% @APACHE_ROLE_host.each do |n| -%>
POOL media_pool ADD <%= n %>:80
<% end -%>

# run A

CREATE POOL app_pool
POOL app_pool ADD 10.400.441.23:80 <<<< Problem forces restart
POOL app_pool ADD 10.400.441.24:80
POOL app_pool ADD 10.400.441.25:80

CREATE POOL media_pool
POOL media_pool ADD 10.400.441.27:80
POOL media_pool ADD 10.400.441.28:80
POOL media_pool ADD 10.400.441.27:80

# run B 15 minutes later

CREATE POOL app_pool
POOL app_pool ADD 10.400.441.25:80 <<<< Problem forces restart
POOL app_pool ADD 10.400.441.23:80
POOL app_pool ADD 10.400.441.24:80

CREATE POOL media_pool
POOL media_pool ADD 10.400.441.28:80
POOL media_pool ADD 10.400.441.27:80
POOL media_pool ADD 10.400.441.27:80

The fix is actually rather simple: notice the addition of .sort to my vars when they are passed to the template. I don't need a particular order, just a consistent one, so this was quick and easy.

template "/etc/perlbal/perlbal.conf" do
  source "perlbal.conf.erb"
  mode 0440
  owner "root"
  group "root"
  variables(
    :WEBSERVER_ROLE_host => WEBSERVER_ROLE_host.sort,
    :APACHE_ROLE_host => APACHE_ROLE_host.sort
  )
  backup 1
  notifies :restart, resources(:service => "perlbal")
end
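
To see why sorting is enough, here is a small standalone Ruby sketch that renders a cut-down version of the template twice using plain ERB from the standard library, outside of chef. The addresses are made up, and the local name hosts stands in for the template variable; this illustrates the idea rather than showing how chef renders templates internally.

```ruby
require "erb"

# Same shape as the app_pool section of the perlbal template.
TEMPLATE = <<~CONF
  CREATE POOL app_pool
  <% hosts.each do |h| -%>
  POOL app_pool ADD <%= h %>:80
  <% end -%>
CONF

def render(hosts)
  ERB.new(TEMPLATE, trim_mode: "-").result_with_hash(hosts: hosts)
end

# Two search results with the same members in a different order
# (made-up addresses for illustration).
run_a = ["10.0.0.23", "10.0.0.24", "10.0.0.25"]
run_b = ["10.0.0.25", "10.0.0.23", "10.0.0.24"]

puts render(run_a) == render(run_b)            # unsorted: the files differ
puts render(run_a.sort) == render(run_b.sort)  # sorted: the files are identical
```

Note that .sort on strings is a plain string comparison, not a numeric sort of IP addresses. That is fine here, because the goal is only a stable order so the file content stops changing between runs.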

chef, knife, and ssh – loving it!

Opscode added an ssh subcommand to the knife utility which, when used with the search syntax, can be very nice. A few minor examples below.


jmiller@srv-101-03: $ knife ssh role:APACHE_ROLE uptime
srv-101-18.example.com  02:07:24 up 140 days, 23:23,  1 user,  load average: 0.00, 0.00, 0.00
srv-101-17.example.com  02:07:24 up 125 days, 10:53,  1 user,  load average: 0.03, 0.06, 0.02

jmiller@srv-101-03:~/operations/chef/roles$ knife ssh "role:BASE_ROLE" ' grep paranoia /etc/nscd.conf '
srv-101-01.example.com # paranoia
srv-101-01.example.com paranoia no
srv-101-14.example.com # paranoia
srv-101-14.example.com paranoia yes
srv-201-22.example.com # paranoia
srv-201-22.example.com paranoia yes
srv-201-01.example.com # paranoia
srv-201-01.example.com paranoia yes
srv-201-26.example.com # paranoia
srv-201-26.example.com paranoia yes
srv-101-04.example.com # paranoia
….

Backup chef roles

I like to keep my chef roles in git, so I dump them and check them in when I make changes. Very nice if you remove something and cannot recall what it was as you jump around.

#!/bin/bash

####
#
# Must be run from a server that has knife and your key i.e. chef.int.rdio
#
###

# List of all roles:

knife role list | sed 's/"//g' | sed 's/,//' | egrep -v '\]|\[' > ./rolelist.txt

# Dump each role to its own JSON file

for i in $(cat rolelist.txt); do echo "$i"; knife role show "$i" > "$i.json"; done
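
The sed pipeline works, but since knife role list prints a JSON array (as in the listings earlier), the names can also be pulled out with a JSON parser. Here is a sketch in Ruby; role_names and dump_commands are hypothetical helper names, and the commands are only printed here, not executed.

```ruby
require "json"
require "shellwords"

# Parse the JSON array printed by `knife role list` into an array of names.
def role_names(list_output)
  JSON.parse(list_output)
end

# Build one `knife role show` command line per role, shell-escaping the name.
def dump_commands(names)
  names.map do |name|
    escaped = Shellwords.escape(name)
    "knife role show #{escaped} > #{escaped}.json"
  end
end

# Stand-in for the real output of `knife role list`.
sample = <<~LIST
  [
    "APACHE_ROLE",
    "WEBSERVER_ROLE"
  ]
LIST

puts dump_commands(role_names(sample))
```

On a box with knife configured, the sample string would be replaced by the captured output of knife role list.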

Quick and dirty server list from chef

I have always used a simple bash loop to do quick tasks on lots of servers:

Example:

for i in `cat server.list`; do ssh $i 'hostname;uptime';done

We can use chef to build a list of servers by role, and a list of all servers in our farm that are managed by chef 🙂

#!/bin/bash

####
#
# Must be run from a server that has knife and your key i.e. chef.server.com
#
###

# I think I am going to make this a recipe
# but for now…

#Generate a list of all chef controlled servers

knife node list | sed 's/"//g' | sed 's/,//' | egrep -v '\]|\[' > /home/operations/servers/all.txt

# List of all roles:

knife role list | sed 's/"//g' | sed 's/,//' | egrep -v '\]|\[' > /home/operations/servers/roles.txt

# Generate a file for each role containing the servers in that role
# Tetsu likes the files lower case ... works for me 🙂

cd /home/operations/servers || exit 1
for i in $(cat roles.txt); do echo "$i"; z=$(echo "$i" | tr '[:upper:]' '[:lower:]'); knife search node "role:$i" -i > "$z.txt"; done

Joy with Chef 0.8 – and user error!!!

Maybe not really ready for prime time. Chef 0.8 is a major step forward ... but the lack of good docs makes it feel like half a step back.

The install of Chef 0.8.6 on Ubuntu 9.10 karmic was not bad on a clean machine. Then I went and did the dumb thing of updating to 0.8.8, and now it's busted!

root@srv-101-03:~# chef-server
Loading init file from /usr/lib/ruby/gems/1.8/gems/chef-server-0.8.8/config/init.rb
Loading /usr/lib/ruby/gems/1.8/gems/chef-server-0.8.8/config/environments/development.rb
/usr/local/lib/site_ruby/1.8/rubygems.rb:230:in `activate’: can’t activate chef (= 0.8.8, runtime) for [“chef-solr-0.8.8”], already activated chef-0.8.6 for [] (Gem::LoadError)
from /usr/local/lib/site_ruby/1.8/rubygems.rb:246:in `activate’

jmiller@srv-101-03:~$ knife node list
/usr/lib/ruby/1.8/net/http.rb:2097:in `error!’: 500 “Internal Server Error” (Net::HTTPFatalError)
from /usr/lib/ruby/gems/1.8/gems/chef-0.8.8/lib/chef/rest.rb:296:in `run_request’
from /usr/lib/ruby/gems/1.8/gems/chef-0.8.8/lib/chef/rest.rb:106:in `get_rest’
from /usr/lib/ruby/gems/1.8/gems/chef-0.8.8/lib/chef/node.rb:363:in `list’
from /usr/lib/ruby/gems/1.8/gems/chef-0.8.8/lib/chef/knife/node_list.rb:35:in `run’
from /usr/lib/ruby/gems/1.8/gems/chef-0.8.8/lib/chef/application/knife.rb:110:in `run’
from /usr/lib/ruby/gems/1.8/gems/chef-0.8.8/bin/knife:26
from /usr/bin/knife:19:in `load’
from /usr/bin/knife:19
jmiller@srv-101-03:~$

OK, here was the dumb and quick fix. The failure was that I ran gem upgrade chef rather than the command below. Ruby is stupid!

gem install chef -v '=0.8.8'

Or maybe not: that only fixed the "can't activate chef" error.

More progress, thanks to the mailing list. The webui is back and running, but node lists are still messed up:

After looking at your stack trace, you are using Merb 1.1 which is not compatible with Chef 0.8.8, you should downgrade Merb back to 1.0.15 if you want the webui to work at all.

Damm

root@srv-101-03:~# gem list

*** LOCAL GEMS ***

abstract (1.0.0)
amqp (0.6.7)
bundler (0.9.13)
bunny (0.6.0)
chef (0.8.8)
chef-server (0.8.8)
chef-server-api (0.8.8)
chef-server-webui (0.8.8)
chef-solr (0.8.8)

merb-assets (1.1.0)
merb-core (1.1.0)
merb-haml (1.1.0)
merb-helpers (1.1.0)
merb-param-protection (1.1.0)
merb-slices (1.1.0)

gem uninstall -aIx merb-assets merb-core merb-haml merb-helpers merb-param-protection merb-slices

gem install merb-assets merb-core merb-haml merb-helpers merb-param-protection merb-slices -v '~> 1.0.0'

I love the chef mailing list. They pointed me to http://tickets.opscode.com/browse/CHEF-1069

On Tue, Mar 30, 2010 at 5:15 PM, Joshua Miller wrote:
I did a dump of the chef couchdb and am sure this is the problem but do not know enough about couchdb to fix it .. doing research but if anyone just knows the answer.

{“chef_type”: “node”, “name”: null, “_rev”: “1-d40f879d3cbf5d93099b75619d03c8cf”, “defaults”: {}, “run_list”: [], “attributes”: {}, “json_class”: “Chef::Node”, “_id”: “61250eb6-62da-450e-a90a-97856291a2ee”, “overrides”: {}}^M
–==954fdeac87864055bc0716669a22d711==^M
Content-ID: 72b00c98-75c8-4ac8-8aed-723c60686d1c^M
Content-Length: 351^M
Content-MD5: mcsIj4Vf9ssbVNz8jjub2w==^M
Content-Type: application/json;charset=utf-8^M

Joshua,
you probably want to access CouchDB’s webui which is available from a
URL like http://localhost:5984/_utils/

On most installations, CouchDB configured to listen *only* on the
localhost/loopback interface, so you’ll most likely want to set up an
SSH tunnel from port 5984 on your box to localhost:5984. From there,
you can navigate to the “chef” database and then select the nodes >
all_id view. This URL will probably work for that:
http://localhost:5984/_utils/database.html?chef/_design/nodes/_view/all_id

Then find the one with a null/blank id and delete it.

HTH,
Dan DeLeo

Once I deleted the offending node all works again! So happy to have my chef 0.8.8 running again and a big thank you to Dan DeLeo
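
The same check can be scripted instead of clicking through the web UI. Here is a rough Ruby sketch that scans the rows of CouchDB's _all_docs?include_docs=true response for node documents with a null or empty name. It assumes the database is called chef, as in Dan's reply; nameless_node_ids is a made-up helper name, and the sample body is illustrative apart from the one document id taken from the dump above.

```ruby
require "json"

# Hypothetical helper: given the parsed body of
#   GET http://localhost:5984/chef/_all_docs?include_docs=true
# return the ids of node documents whose name is null or empty.
def nameless_node_ids(all_docs)
  rows = all_docs.fetch("rows", [])
  bad = rows.select do |row|
    doc = row["doc"] || {}
    doc["chef_type"] == "node" && doc["name"].to_s.empty?
  end
  bad.map { |row| row["id"] }
end

# Illustrative response body; only the first id comes from the real dump.
sample = JSON.parse(<<~BODY, create_additions: false)
  {
    "total_rows": 2,
    "offset": 0,
    "rows": [
      { "id": "61250eb6-62da-450e-a90a-97856291a2ee",
        "doc": { "chef_type": "node", "name": null, "json_class": "Chef::Node" } },
      { "id": "example-good-node-id",
        "doc": { "chef_type": "node", "name": "srv-101-01.example.com", "json_class": "Chef::Node" } }
    ]
  }
BODY

puts nameless_node_ids(sample)
```

This only finds candidates. Deleting the document is still best done from the CouchDB web UI as Dan describes, after confirming it really is the broken node.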

Opscode adds training but will anyone care?


So, as you can see, I have enjoyed Opscode's chef a lot, but now I see they have added training. While this is a logical next step, I feel they need to focus on getting 0.8 out before they even start to worry about training. I have held off on suggesting chef to a lot of people because of all the change coming in 0.8. As a user of almost 8 months now, I feel the changes are so extensive that it's not worth starting with chef at this point. While the recipes and the basic stuff you're pushing out will carry forward, a lot of logic changes happen in 0.8.

First there are the new data bags, which will allow you to rethink how you use shared data. Then there is the joy of roles in roles, which I love by the way. While these are really minor changes, I hate to think about going over all my roles and recipes and reworking them for 0.8. Not because you have to, but because I like to have a common pattern in execution, and I assure you that I will be using these new features in new additions to my chef tool kit. Oh, then there is knife ... umm, yeah, world changer there. So in summary: hold off on training, Opscode, and get 0.8 out the door.