MegaCLI Raid6 Array creation

I am running Ubuntu Karmic on a Dell R610 to access MD1200 storage shelves. Since (until recently) OpenManage was not an option for the H800 SAS RAID adapters, I had to explore the wonderful MegaCli utility!


# Find unused disks

root@srv-103-27:/opt/MegaRAID/MegaCli# ./MegaCli64 -PDList -a0 | grep -B14 Unconfigured | grep -e '^Enclosure Device ID:' -e '^Slot Number:'
Enclosure Device ID: 41
Slot Number: 11
Enclosure Device ID: 80
Slot Number: 0
Enclosure Device ID: 80
Slot Number: 1
Enclosure Device ID: 80
Slot Number: 2
Enclosure Device ID: 80
Slot Number: 3
Enclosure Device ID: 80
Slot Number: 4
Enclosure Device ID: 80
Slot Number: 5
Enclosure Device ID: 80
Slot Number: 6
Enclosure Device ID: 80
Slot Number: 7
Enclosure Device ID: 80
Slot Number: 8
Enclosure Device ID: 80
Slot Number: 9
Enclosure Device ID: 80
Slot Number: 10
Enclosure Device ID: 80
Slot Number: 11
Enclosure Device ID: 106
Slot Number: 0
Enclosure Device ID: 106
Slot Number: 1
Enclosure Device ID: 106
Slot Number: 2
Enclosure Device ID: 106
Slot Number: 3
Enclosure Device ID: 106
Slot Number: 4
Enclosure Device ID: 106
Slot Number: 5
Enclosure Device ID: 106
Slot Number: 6
Enclosure Device ID: 106
Slot Number: 7
Enclosure Device ID: 106
Slot Number: 8
Enclosure Device ID: 106
Slot Number: 9
Enclosure Device ID: 106
Slot Number: 10
Enclosure Device ID: 106
Slot Number: 11
root@srv-103-27:/opt/MegaRAID/MegaCli#
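
The enclosure and slot pairs listed above have to be retyped as an [E:S,E:S,...] list for -CfgLdAdd. Here is a minimal sketch that builds that list, assuming the PDList output looks exactly like the lines shown above. The name pd_pairs is a hypothetical helper of mine, not part of MegaCli, and the optional argument limits the list to one enclosure.

```shell
# Hypothetical helper (not part of MegaCli): reads "Enclosure Device ID:" and
# "Slot Number:" lines on stdin and prints one comma-separated list of
# enclosure:slot pairs. An optional first argument keeps only that enclosure.
pd_pairs() {
    awk -F': *' -v want="${1:-}" '
        /^Enclosure Device ID:/ { enc = $2 }
        /^Slot Number:/ {
            if (want == "" || enc == want) {
                printf "%s%s:%s", sep, enc, $2
                sep = ","
            }
        }
        END { if (sep != "") print "" }
    '
}

# Example use (needs the controller, so it is left as a comment):
#   ./MegaCli64 -PDList -a0 | grep -B14 Unconfigured | pd_pairs 80
```

Review the list before pasting it between the brackets, since you may want to hold one slot back as a hot spare.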

# Create Raid 6 Volume

root@srv-103-27:/opt/MegaRAID/MegaCli# ./MegaCli64 -CfgLdAdd -r6 [80:0,80:1,80:2,80:3,80:4,80:5,80:6,80:7,80:8,80:9,80:10] -a0

Adapter 0: Created VD 5

Adapter 0: Configured the Adapter!!

Exit Code: 0x00
root@srv-103-27:/opt/MegaRAID/MegaCli#

# add dedicated hot spares, we use dedicated as they stay with the array/shelf
root@srv-103-27:/opt/MegaRAID/MegaCli# ./MegaCli64 -PDHSP -Set -Dedicated -Array5 -PhysDrv [80:11] -a0

Adapter: 0: Set Physical Drive at EnclId-80 SlotId-11 as Hot Spare Success.

Exit Code: 0x00

Chef 0.8.x Deb and Upstart

So my chef clients have been crashing, and it's always a bummer to ssh in and restart them. I could just have my monitoring system start them, but why bother when Ubuntu has a wonderful, built-in way to make sure the service stays up!


First I downloaded the chef recipe from opscode, then I added the following.

joshua-millers-macbook-pro:site-cookbooks jmiller$ cat chef/recipes/client-deb.rb
#
# Author:: Joshua Miller
# Cookbook Name:: chef
# Recipe:: client-deb
#
# Copyright 2008-2010, Fitsnips.net
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Since the deb is already installed by this point, I don't install it here.
case node[:platform]
when "ubuntu"
  # Upstart is on karmic and above by default ... not sure about lower versions
  if node[:platform_version].to_f >= 9.10

    # My chef server is installed with gems, but for ease of auto install I am using debs
    # in my kickstart build with a local apt-mirror. Because of that I have added a check for
    # chef-server, and if it's there we don't make any changes.
    template "/etc/init.d/chef-client" do
      source "chef-client-upstartjob.erb"
      owner "root"
      group "root"
      mode 0774
      backup 0
      not_if do File.symlink?("/etc/init.d/chef-server") end
    end

    service "chef-client" do
      provider Chef::Provider::Service::Upstart
      supports :restart => true, :reload => true
    end

    template "/etc/default/chef-client" do
      source "default-chef-client.erb"
      owner "root"
      group "root"
      # Octal literal; a bare 644 would be read as a decimal number.
      mode 0644
      backup 0
      not_if do File.symlink?("/etc/init.d/chef-server") end
    end

    template "/etc/init/chef-client.conf" do
      source "upstart-chef-client.conf.erb"
      owner "root"
      group "root"
      mode 0644
      backup 0
      notifies :start, resources(:service => "chef-client")
      not_if do File.symlink?("/etc/init.d/chef-server") end
    end

  end

end

Then we create a few templates:

joshua-millers-macbook-pro:site-cookbooks jmiller$ cat chef/templates/default/upstart-chef-client.conf.erb
start on runlevel [2345]

script
exec /usr/bin/env chef-client -c /etc/chef/client.rb -i <%= @node[:chef][:client_interval] %> -s <%= @node[:chef][:client_splay] %>
end script

# Restart the process if it dies with a signal
# or exit code not given by the 'normal exit' stanza.
respawn

# Give up if restart occurs 10 times in 90 seconds.
respawn limit 10 90

Let's make it easy on the other admins who are not used to Upstart:

joshua-millers-macbook-pro:site-cookbooks jmiller$ cat chef/templates/default/chef-client-upstartjob.erb
#!/bin/sh -e
# upstart-job
#
# Symlink target for initscripts that have been converted to Upstart.

set -e

INITSCRIPT="$(basename "$0")"
JOB="${INITSCRIPT%.sh}"

if [ "$JOB" = "upstart-job" ]; then
    if [ -z "$1" ]; then
        echo "Usage: upstart-job JOB COMMAND" 1>&2
        exit 1
    fi

    JOB="$1"
    INITSCRIPT="$1"
    shift
else
    if [ -z "$1" ]; then
        echo "Usage: $0 COMMAND" 1>&2
        exit 1
    fi
fi

COMMAND="$1"
shift

if [ -z "$DPKG_MAINTSCRIPT_PACKAGE" ]; then
    ECHO=echo
else
    ECHO=:
fi

$ECHO "Rather than invoking init scripts through /etc/init.d, use the service(8)"
$ECHO "utility, e.g. service $INITSCRIPT $COMMAND"

case $COMMAND in
status)
    $ECHO
    $ECHO "Since the script you are attempting to invoke has been converted to an"
    $ECHO "Upstart job, you may also use the $COMMAND(8) utility, e.g. $COMMAND $JOB"
    $COMMAND "$JOB"
    ;;
start|stop|restart)
    $ECHO
    $ECHO "Since the script you are attempting to invoke has been converted to an"
    $ECHO "Upstart job, you may also use the $COMMAND(8) utility, e.g. $COMMAND $JOB"
    PID=$(status "$JOB" 2>/dev/null | awk '/[0-9]$/ { print $NF }')
    if [ -z "$PID" ] && [ "$COMMAND" = "stop" ]; then
        exit 0
    elif [ -n "$PID" ] && [ "$COMMAND" = "start" ]; then
        exit 0
    elif [ -z "$PID" ] && [ "$COMMAND" = "restart" ]; then
        start "$JOB"
        exit 0
    fi
    $COMMAND "$JOB"
    ;;
reload|force-reload)
    $ECHO
    $ECHO "Since the script you are attempting to invoke has been converted to an"
    $ECHO "Upstart job, you may also use the $COMMAND(8) utility, e.g. $COMMAND $JOB"
    reload "$JOB"
    ;;
*)
    $ECHO
    $ECHO "The script you are attempting to invoke has been converted to an Upstart" 1>&2
    $ECHO "job, but $COMMAND is not supported for Upstart jobs." 1>&2
    exit 1
esac

Quick and dirty server list from chef

So I have always used a simple bash loop to run quick tasks on lots of servers:

Example:

for i in `cat server.list`; do ssh $i 'hostname;uptime';done

We can use chef to build a list of servers by role, and a list of all servers in our farm that are managed by chef 🙂

#!/bin/bash

####
#
# Must be run from a server that has knife and your key i.e. chef.server.com
#
###

# I think I am going to make this a recipe
# but for now…

#Generate a list of all chef controlled servers

knife node list | sed s/\"//g | sed s/,// | grep -v \] > /home/operations/servers/all.txt

# List of all roles:

knife role list | sed s/\"//g | sed s/,// | egrep -v '\]|\[' > /home/operations/servers/roles.txt

# Generate a file for each role containing the servers in that role
# Tetsu likes the files lower case … works for me 🙂

for i in `cat roles.txt`; do echo $i; z=`echo $i | tr '[:upper:]' '[:lower:]'`; knife search node role:$i -i > $z.txt; done
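
Once the list files exist, the ssh loop from the top of this post can read them directly. This is only a sketch under my own assumptions: run_on_hosts is a made-up function name, the SSH variable is a hypothetical override I added so the loop can be dry-run, and the list files are assumed to hold one hostname per line as generated above.

```shell
# Hypothetical helper: run one command on every host named in a list file.
# $1 is the list file, the remaining arguments are the remote command.
run_on_hosts() {
    list="$1"
    shift
    # read (with the default IFS) trims the leading spaces left by knife's output
    while read -r host; do
        [ -n "$host" ] || continue      # skip blank lines
        # -n stops ssh from swallowing the rest of the host list on stdin
        ${SSH:-ssh} -n "$host" "$@"
    done < "$list"
}

# Example use (left as a comment because it needs real hosts):
#   run_on_hosts /home/operations/servers/all.txt 'hostname;uptime'
```

Setting SSH='echo ssh' first prints the commands instead of running them, which is handy before pointing the loop at the whole farm.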

Chef 0.8.0 almost here?

It's starting to feel like 0.8.0 will never ship, and based on the lack of documentation I just don't feel it's ready to run in production yet, but I have tasted enough to know I want it.

Been busy as heck around here at Rdio, Inc. Still loving chef, but I cannot wait for 0.8.0.

Some features I am looking forward to:

Knife: a command-line utility used to interact with a Chef server directly through the RESTful API.
One of the best parts of this that I have seen is that it will make multiple admins much easier to deal with. My favorite command so far: cookbook upload

OpenID is no longer the only option for logins: in fact the whole login system has changed, and with knife there will be less reason than ever to log in to the UI. This is a major change, as the whole auth system is in flux right now.

Better search: I have not played with this one much, but they say it will be much better, based partially on the data bag addition.

Databags: Data bags are arbitrary stores of JSON data on the server that get indexed for search.
This will help you store data that is used across recipes with less effort.

I am sure there are more, but those are the ones I have played with so far.

Favorite command today – disown

Often I start a process and realize it's going to run longer than I really want to keep the session open. If I had known it would run that long I would have used screen or nohup, but now it's too late for that. What's the fix? Simply background the process and then use the command “disown”. Now when you log out the command will finish running. Oh happy day!
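
Here is roughly what that looks like, as a sketch that assumes bash (disown is a bash builtin, so plain sh may not have it). The script name long_job.sh, the function start_detached, and the variable DETACHED_PID are placeholders I made up for illustration.

```shell
# At an interactive bash prompt (keystrokes shown as comments):
#   $ ./long_job.sh     # hypothetical long-running command
#   ^Z                  # Ctrl-Z suspends it
#   $ bg                # resume it in the background
#   $ disown            # drop it from the job table so logout sends it no SIGHUP

# The same idea as a function for when you know up front that you want
# the command to outlive the shell. Requires bash.
start_detached() {
    "$@" &
    DETACHED_PID=$!           # remember the PID so it can be checked later
    disown "$DETACHED_PID"    # remove the job from this shell's job table
}

# Example use:
#   start_detached ./long_job.sh
```

Unlike nohup, disown does not redirect output, so anything the command writes to the terminal is lost once the session closes.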

Ubuntu 9.10 karmic and Chef


Quick notes on installing chef configuration management on Ubuntu 9.10 Karmic. This is mostly taken directly from the chef wiki pages, but it puts everything together and notes the problems I ran into.

My automated install is a pretty tight server install:

%packages
openssh-server
curl
nfs-common
portmap
libnss-ldap
libpam-ldap
vlan

I want to install the newest version, which is at opscode, and not the version in karmic universe, so I add the apt repo to the system.

echo "deb http://apt.opscode.com/ karmic universe" > /etc/apt/sources.list.d/opscode.list
curl http://apt.opscode.com/packages@opscode.com.gpg.key | sudo apt-key add -
apt-get update
# actually install chef-server
sudo apt-get install rubygems ohai chef chef-server

I have to manually install git on this server, as it's usually installed by chef

sudo apt-get -y install git-core

Now I install apache, and the apache modules

sudo apt-get -y install apache2

# module setup
for a2mod in proxy proxy_http proxy_balancer ssl rewrite headers
do
sudo a2enmod $a2mod
done
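
To confirm the modules really got enabled, here is a small sketch. It assumes the Debian/Ubuntu Apache layout, where a2enmod links each module's .load file into /etc/apache2/mods-enabled. The function name missing_mods is my own invention.

```shell
# Hypothetical helper: print every module from the argument list that has no
# .load file in the given mods-enabled directory.
missing_mods() {
    dir="$1"
    shift
    for mod in "$@"; do
        [ -e "$dir/$mod.load" ] || echo "$mod"
    done
}

# Example use (prints nothing when every module is enabled):
#   missing_mods /etc/apache2/mods-enabled proxy proxy_http proxy_balancer ssl rewrite headers
```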

Now I create the virtual host:

Create /etc/apache2/sites-available/chef_server.repo with the following info, but replace server_fqdn with your chef fully qualified domain name.

<VirtualHost *:443>
ServerName server_fqdn
DocumentRoot /usr/share/chef-server/public

<Proxy balancer://chef_server>
BalancerMember http://127.0.0.1:4000
Order deny,allow
Allow from all
</Proxy>

LogLevel info
ErrorLog /var/log/apache2/chef_server-error.log
CustomLog /var/log/apache2/chef_server-access.log combined

SSLEngine On
SSLCertificateFile /etc/chef/certificates/server_fqdn.pem
SSLCertificateKeyFile /etc/chef/certificates/server_fqdn.pem

RequestHeader set X_FORWARDED_PROTO 'https'

RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteRule ^/(.*)$ balancer://chef_server%{REQUEST_URI} [P,QSA,L]
</VirtualHost>

<VirtualHost *:444>
ServerName server_fqdn
DocumentRoot /usr/share/chef-server/public

<Proxy balancer://chef_server_openid>
BalancerMember http://127.0.0.1:4001
Order deny,allow
Allow from all
</Proxy>

LogLevel info
ErrorLog /var/log/apache2/chef_server-error.log
CustomLog /var/log/apache2/chef_server-access.log combined

SSLEngine On
SSLCertificateFile /etc/chef/certificates/server_fqdn.pem
SSLCertificateKeyFile /etc/chef/certificates/server_fqdn.pem

RequestHeader set X_FORWARDED_PROTO 'https'

RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteRule ^/(.*)$ balancer://chef_server_openid%{REQUEST_URI} [P,QSA,L]
</VirtualHost>

Check out the chef repo:

cd
git clone git://github.com/opscode/chef-repo.git
cd chef-repo

Time to create your ssl cert

rake ssl_cert FQDN=chef.int.domain

I am not sure what I am doing wrong here, but the install I run next does not copy over the certs I just generated, so I copy them manually

rake install
cd /root/chef-repo/certificates
cp -a * /etc/chef/certificates/

Now we should be ready to restart apache and see if everything is working

sudo /etc/init.d/apache2 restart

We need to enable the chef virtual site

sudo a2ensite chef_server.repo
/etc/init.d/apache2 reload

Now you should be able to bring up the Chef web interface in your browser. If you followed the directions in this writeup it will only work with https.

https://chef.int.domain/

Since I already have an OpenID-to-LDAP gateway server configured, I am able to log right in and confirm a running install. For more info on that see:

http://mrmiller.nonesensedomains.com/2009/09/18/chef-openid-to-ldap-gateway/

I always like to do a reboot after configuring a new host as even the best make mistakes from time to time and this way I can confirm that everything is starting/running as expected.

Next time I will document my work on migrating the roles and cookbooks from my existing install on CentOS 5.3.

Additional notes:

I update my servers as part of the install, but found that the couchdb that shipped with Karmic at release would not start (my local mirror was out of date). This was fixed by running

apt-get update
apt-get upgrade -y

I forgot to enable the chef virtual host at first, and when I pulled up the URL in my browser I got the following error: “SSL received a record that exceeded the maximum permissible length.” Enabling the site and restarting apache fixed that right up.
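
A quick way to catch that mistake before opening the browser is sketched below. It assumes the stock Ubuntu layout, where a2ensite links the site file into /etc/apache2/sites-enabled. The function name site_enabled and its optional second argument are my own additions for illustration.

```shell
# Hypothetical helper: succeed only if the named site is linked into
# sites-enabled. $2 optionally overrides the Apache config root.
site_enabled() {
    site="$1"
    root="${2:-/etc/apache2}"
    [ -e "$root/sites-enabled/$site" ]
}

# Example use:
#   site_enabled chef_server.repo || echo "run: sudo a2ensite chef_server.repo"
```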

Ref:

http://wiki.opscode.com/display/chef/Package+Installation+on+Debian+and+Ubuntu

http://wiki.opscode.com/display/chef/How+to+Proxy+Chef+Server+with+Apache

Chef openid to ldap gateway

Setting up an OpenID-to-LDAP gateway for chef authentication.

So one of my main complaints about chef (and, I might mention, the office joke) is the use of openid for authentication of users. This presented two problems for me: first, I would never trust the authentication for my management server to an outside source, and second, my chef server does not have internet access. Chef pointed me in the direction of http://www.openid-ldap.org/ and after a little wrestling I was able to get a working internal openid auth system on top of my already existing ldap auth system.

I am running openid-ldap on a system that I have configured to handle admin web apps, and the install consisted of simply creating the web root and updating ldap.php. One thing I did find useful was to rename the directory to openid. The ldap.php was pretty easy, but one place I did get stuck was that I did not read the directions clearly and tried to create .htaccess files rather than just updating /etc/httpd/conf.d/ssl.conf and /etc/httpd/conf/httpd.conf like they said.

If you follow the directions it should be a 10 minute setup at most.

Untar the file in your webroot, rename directory to openid

append to httpd.conf or virtualhost.conf if you're using one


---
RewriteEngine On

RewriteRule ^/openid$ https://openid.int.mycompany/openid/ [R=permanent,L]
RewriteRule ^/openid/$ https://openid.int.mycompany/openid/ [R=permanent,L]
RewriteRule ^/openid/(.*)$ https://openid.int.mycompany/openid/$1 [R=permanent,L]
---

insert inside the virtualhost of ssl.conf


---
SSLProxyEngine On
RewriteEngine On

RewriteCond %{REQUEST_URI} !^/(.+)\.php(.*)$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /openid/([A-Za-z0-9]+)\?(.*)\ HTTP/
RewriteRule ^/openid/(.*)$ https://openid.int.mycompany/openid/index.php?user=%1&%2 [P]

RewriteCond %{REQUEST_URI} !^/(.+)\.php(.*)$
RewriteRule ^/openid/([A-Za-z0-9]+)$ https://openid.int.mycompany/openid/index.php?user=$1 [P]
---

update the ldap.php in the openid directory you just created, which is pretty clear but I did have to edit the following lines to make sure the name showed up correctly.


---
# SREG names matching to LDAP attribute names
'nickname' => 'uid',
'email' => 'mail',
'fullname' => 'cn',
---

then simply test by going to https://yourhostname/openid/

One thing I have yet to fix is that my chef server straddles two networks: one side that I can access from the office and another that the servers talk on. This creates havoc with my logins. For now I wound up creating an openid.int.mycompany entry pointing to the IP visible to my Mac, which gets me around the problem of int.mycompany not being routable outside the server network.

Chef for fun and maybe some profit


Upon joining my new company last month I came into the perfect env of empty servers and all the freedom I wanted. I had been testing Cobbler https://fedorahosted.org/cobbler/ and Chef http://wiki.opscode.com/display/chef/Home the previous month at Tagged, Inc as a replacement for my home grown build system that had been implemented there and at Pay By Touch. Well, the joy of testing on virtual systems did not truly expose me to the joys of deploying chef in a closed environment. As any security minded person would do, I designed the new environment to not allow reaching outside of the local network for anything, and chef did not like that, but it turned out to be OK after lots and lots of fun.

My cobbler server is providing a local mirror of http://elff.bravenet.com/, and I have pulled down the current bootstrap file to my cobbler system Apache server.

In the package section of my company_base.ks file, I include

rubygem-chef

Based on notes I found for a puppet install, I created a snippet in my cobbler install:

Then, to prep for the install of the chef client, this is run before the %post section in my company_base.ks file
$SNIPPET('company_chef_nochroot')

[jmiller@cobbler ~]$ cat /var/lib/cobbler/snippets/company_chef_nochroot

# Make sure we have network stuff in place so when we register with the server all is well

%post --nochroot
# Copy netinfo, which has our FQDN from DHCP, into the chroot
test -f /tmp/netinfo && cp /tmp/netinfo /mnt/sysimage/tmp/

This snippet in my company_base.ks file installs the chef client, validates it, and runs it for the first time
$SNIPPET('company_chef_client')

[jmiller@cobbler ~]$ cat /var/lib/cobbler/snippets/company_chef_client

# In this script we actually install the client

cat <<EOF > /root/solo.rb
file_cache_path "/tmp/chef-solo"
cookbook_path "/tmp/chef-solo/cookbooks"
EOF

cat <<EOF > /root/chef.json
{
"chef": {
"server_fqdn": "chef.int.company"
},
"packages": {
"dist_only": true
},
"recipes": "chef::client"
}
EOF

cat <<EOF > /root/client.json
{
"run_list": ["role[COMPANY_BASE]"]
}
EOF

# Configure the Env
echo "Installing Chef Bootstrap"
cd /root/
chef-solo -c solo.rb -j chef.json -r http://chef.int.company/bootstrap-0.7.8.tar.gz
cd -

# register with the server
echo "Register with Chef"
chef-client -t "myAuthToken" -j /root/client.json

chef-client
[jmiller@cobbler ~]$

My COMPANY_BASE role in chef was lacking a few recipes and threw me for a huge loop.

recipes in COMPANY_BASE ( chef chef::client sudo screen ntp openssh snmp git )


[root@srv-101-25 ~]# chef-client -l debug -j /root/client.json
/usr/lib/ruby/gems/1.8/gems/chef-0.7.8/lib/chef/recipe.rb:200:in `method_missing': Cannot find Chef::Resource::DistOnly? for dist_only? (NameError)
Original: undefined method `DistOnly?' for Chef::Resource:Class

After a lot of troubleshooting with Joshua Timberman of Opscode we found out that I needed two more recipes ( packages & runit ). It turns out this is a limitation on RPM based systems, and it caused me a lot of hurt.

Sites I owe a lot of thank you to:
http://wiki.opscode.com/display/chef/Home
https://fedorahosted.org/cobbler/
http://reductivelabs.com/trac/puppet/wiki/BootstrappingWithPuppet
http://wiki.opscode.com/display/chef/Installation+on+RHEL+and+CentOS+5+with+RPMs

ganglia for fun

I have been playing with ganglia [ http://ganglia.info ] for the last few days on 1/10 of our 200 server web cluster at work. While I see many useful metrics, I am having trouble really getting solid numbers. Maybe over time it will level out to the point that changes can be seen, but for now the information does not seem more useful than our zenoss [ http://www.zenoss.com ] install.

Also, I had some trouble with the install not showing all data on the member nodes when I compiled it myself, so I wound up using the rpms from the fedora epel.repo, which worked like a charm.

http://idolinux.blogspot.com/2009/03/ganglia-cluster-monitoring-made-easy.html