Please Note: This document is no longer the authoritative source for questions on how to set up SystemImager on the ACL here at Earlham College. Please check the Admin page (access restricted) for the now authoritative source.
The Imageserver: You should only have to go through these steps if you are creating a new Image Server for the ACL (or whatever cluster you are working with). At the time of writing this documentation we maintained 4 separate images: Lovelace (aka ACL0) holds the images for the ACL and the E&I & Robotics Lab (a separate image due to hardware differences), Noether holds the image for the Athena Cluster, and Hopper holds the image for the Bazaar Cluster. If any of these machines are re-built/re-installed, it is most likely that you will have to go through these steps. Otherwise you should be able to skip ahead to the creation of the Golden-Client and the image.
The Golden-Client: This is where you will be creating the machine that you want the entire cluster to look like. If you are reading this doc, you will most likely want to go through these steps.
There are a number of software packages that need to be installed onto the Imageserver. Some of these may or may not be part of the standard installation. Since (with the exception of Quark) we use RedHat or Debian GNU/Linux on all of our machines, it is fairly easy to find pre-compiled packages (RPM or deb packages) for the server and install them using that mechanism. The advantage of this is that upgrading these packages as updates become available is much easier and less complicated. So, where possible, use the packages instead of the source (this "rule" is not one that is widely accepted). Links to sources for all of the required packages can be found at the systemimager.org download page. The packages that you need are: nasm, syslinux, rsync, libappconfig-perl, and SystemConfigurator.
Go ahead and install these now, as you will need them for the next step and from here on out. Don't worry about configuring the packages (e.g. rsync) since SystemImager's install script will do that for you.
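As a rough sketch, the installs look something like the following (the exact package file names vary by version and are hypothetical here; on RedHat the AppConfig module may be packaged under a different name):

    # On a RedHat box (hypothetical package file names):
    rpm -Uvh nasm-*.rpm syslinux-*.rpm rsync-*.rpm libappconfig-perl-*.rpm systemconfigurator-*.rpm

    # On a Debian box:
    apt-get install nasm syslinux rsync libappconfig-perl systemconfigurator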
NOTE: If you are reading this section and attempting to upgrade your Imageserver, it would probably be a good idea to back up your rsync.conf file (the proper one) and your /etc/hosts file, as they may get erased in the process below and you will want the current ones available when you go back to re-create them after the install.
Now that we have all of the supporting applications installed on our image server, we can focus on setting up the file-space that is going to house the actual images, and installing the packages that contain the SystemImager server scripts.
On the Imageserver, our main concern is going to be having the disk space necessary to store all of the images. On our Imageserver (Lovelace), we maintain the images for two clusters: the E&I Robotics Lab and the ACL. Each image takes up anywhere from 1.4G to 2G (or more) of disk space. As of February 2002, we implemented a versioning scheme for these two images which allows us to easily switch between a production image and a beta image while holding onto the most recently retired image (for emergency purposes). I explain this and our reasons in the Maintenance section below. This means that on Lovelace, we need disk space for at least 6 images, with occasional need for an additional 2 images. Obviously, it wouldn't be a good idea to just store the images on the same disk as the OS, since that would mean that an OS failure could jeopardize all of our work. Therefore we put our images on a separate hard drive. This also allows us to move our images around the file system without having to copy them around. This drive needs to be mounted in the proper location in order for rsync and SystemImager to work properly, though. Version 2.0.1 of SystemImager is set up to store the images in the /var/lib/systemimager directory. I recommend that you set up this mount point before installing the systemimager-server and systemimager-common packages, as they will want to put a few files into these directories. Once you have set up that mount point and mounted the drive, go ahead and install the systemimager-server and systemimager-common packages onto your Imageserver. It would probably be a good idea to restart your server to make sure things start working properly.
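A minimal sketch of setting up that mount point (the device name /dev/hdb1 and filesystem type are hypothetical; substitute your actual image drive):

    mkdir -p /var/lib/systemimager
    # Add the drive to /etc/fstab so it is mounted at boot (hypothetical device):
    echo '/dev/hdb1  /var/lib/systemimager  ext2  defaults  1 2' >> /etc/fstab
    mount /var/lib/systemimager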
So, for now, that's it! You've set up your image server and it should be ready to serve any images that it has on it. If you followed this process to upgrade your Imageserver to a newer version of SystemImager, you can/should now go back and make sure that your /etc/hosts and rsync.conf files are the way they should be.
This step is probably the most intricate step in the entire process. For this step, you will need the distribution CDs for the Linux distribution that you will be installing on the cluster. For the ACL this is the most current (and stable) RedHat Linux version (which was 7.2 at the last modification of this document). You will also need one of the "typical" ACL boxes onto which you will install the system. Throughout the remainder of this document, this box will be referred to as your "Golden-Client."
Before we get too far into the installation and package selection, a short side-note. The way that you install a Linux system & choose which packages to install is almost like a religion in some communities. I personally have my own way of doing it. This method sometimes meshes with the way others do it, but sometimes does not... Therefore, do it the way you want. I will be explaining the way that I do it since I understand it and can explain it. If you don't like it, find your own way. The Anaconda/Kickstart file generated by the most recent ACL install that I participated in can be found here. In that file you can find the listing of packages that were installed during the install. Here you can find a text file which contains the packages and changes that we added to the standard install. With that said, back to the task at hand.
Here is how I install RedHat Linux on one of the ACL Boxes:
Once you have that base install complete, there are various things that we do to this install to make it look and work the way we want it to. Start off by setting up the NFS mount point for /clients/. This will allow us to get at all of the files that we have set aside on Quark.
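A minimal sketch of that mount, assuming Quark exports the directory as quark:/clients (the export path is an assumption; check Quark's /etc/exports):

    mkdir -p /clients
    # Hypothetical export path; adjust to match Quark's actual export:
    echo 'quark:/clients  /clients  nfs  defaults  0 0' >> /etc/fstab
    mount /clients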
Now, let's configure GDM for the ACL. First, copy over the GDM images and background that we have created for the ACL:

    cp /clients/acl-stuff/acl-config/acl-ec-logo.xpm /clients/acl-stuff/acl-config/acl-background.png /usr/share/pixmaps/
Next, you will want to overwrite /etc/X11/xsrirc with the one in the acl-config directory:

    cp /clients/acl-stuff/acl-config/xsrirc /etc/X11/xsrirc
Then, edit /etc/X11/gdm/gdm.conf and replace --redhat-login with --acl-login, and then change the LOGO to be acl-ec-logo.xpm. That should take care of the configuration of GDM for the ACL.
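A sketch of making those edits non-interactively (a text editor works just as well; the assumption that the logo is set by a Logo= line in gdm.conf is mine):

    perl -pi -e 's/--redhat-login/--acl-login/' /etc/X11/gdm/gdm.conf
    # Assumes the logo is configured with a Logo= key:
    perl -pi -e 's/^Logo=.*/Logo=acl-ec-logo.xpm/' /etc/X11/gdm/gdm.conf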
In order to send messages as root to the X session, we must gain authority on the X server. To do this, we need to modify /etc/X11/xdm/Xsession to contain the appropriate commands. Modify /etc/X11/xdm/Xsession so that it contains the contents of Xsession.mod (in the acl-config directory) before every exec in the section where it is starting the window manager.
Finally, we need to make modifications to several configuration files. Modify /etc/resolv.conf to include student.earlham.edu, bazzar.cs.earlham.edu and athena.cs.earlham.edu in the search string. Modify sendmail.mc so that the diffs from sendmail-ACL-changes.txt (again in the acl-config directory) match. Then issue m4 /etc/mail/sendmail.mc > /etc/sendmail.cf to have the changes take effect. Next, add %wheel to the end of the /etc/sudoers file (via visudo) with the same permissions as root. Next change /etc/issue to say:

Welcome to the ACL.
Authorized Users Only.
(\n - \l)

Finally you should overwrite the snmpd.conf file with the one found in the acl-config directory. That should do it for the various configuration files.
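As a sketch, the resolv.conf search line and the sudoers entry would look something like this (any domains already in your search line are assumptions here; the sudoers entry mirrors root's standard ALL=(ALL) ALL line):

    # In /etc/resolv.conf (pre-existing domains in the search line are assumed):
    search cs.earlham.edu student.earlham.edu bazzar.cs.earlham.edu athena.cs.earlham.edu

    # Added to the end of /etc/sudoers via visudo:
    %wheel  ALL=(ALL) ALL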
Next you will probably want to upgrade all of the packages (for which package upgrades exist) that are installed. I found it easiest to do this using Ximian's Red Carpet. The RPM for this program can be found in the /clients/acl-stuff/installed-apps directory. Once installed, you add the RedHat channel for the distribution that you are using and then it tells you what it thinks are the packages that you need to install.
Now that you have that done, go ahead and install the additional programs that we use on the ACL. Most of the packages/tarballs can be found in the /clients/acl-stuff/installed-apps directory. The list of apps is:
Once you have those installed, we want to set up printing. For us, this includes upgrading to the most recent stable version of LPRng, lprngtool and ifhp (print filters). Once you've installed those, just copy over the printcap file from any ACL machine (they should be portable).
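For example, grabbing the printcap from an existing machine might look like this (acl1 as the source machine is hypothetical):

    scp acl1:/etc/printcap /etc/printcap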
Next, we want to set up the SSH keys locally so that Lovelace can connect without a password to this (and any other ACL box). To do this we need to copy over Lovelace's root key and put it into the authorized_keys and authorized_keys2 files. We need to do this so that the C3 tools work without a hitch.
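A minimal sketch, assuming Lovelace's root public keys have already been copied into /tmp on the golden client (the file names are hypothetical; protocol 1 keys go in authorized_keys and protocol 2 keys in authorized_keys2):

    mkdir -p /root/.ssh
    # Hypothetical file names for Lovelace's public keys:
    cat /tmp/lovelace-identity.pub >> /root/.ssh/authorized_keys
    cat /tmp/lovelace-id_dsa.pub >> /root/.ssh/authorized_keys2
    chmod 600 /root/.ssh/authorized_keys /root/.ssh/authorized_keys2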
Now, we set up XFree86, the X server. For the most part (unless the file format changes) you should be able to just copy over the config file from the acl-config directory. You should only need to put the XF86Config-4 file into /etc/X11/, but you might as well put the XF86Config file in there too since it won't do any damage. If you find that you need to modify either of these files (changes in hardware or formats), refer to the local documentation for the X server.
Almost done now... Copy over migrate.pl, force-update.sh, force-update-now.sh, syncronize-warning.txt and acl-firewall.sh to /root. Then you will want to add this line to /etc/crontab:

    #00 3 * * 0 root /root/force-update.sh

The hash is there so that the script doesn't run until we want it to. Uncomment it once we are testing the image on another machine.
Refer to this file for my notes on what I installed the last time I built an ACL image. Out of habit, whenever I install a new package onto the ACL, I try to put a copy of the RPM/tarball in /clients/acl-stuff so that I can easily access it from anywhere. The disadvantage to this is that no one actively maintains that collection and therefore the packages get out of date fairly easily. Regardless, looking there will give you a fairly good idea of the stuff that you need to put into the image. On more of a personal note: when you are installing a package/program from source (i.e. via tarball), try to use the common/standard locations for its install location. By this I mean that typically when you install something from source, the program, its src, and anything else it needs are placed in /usr/local. The reason I mention this is that placing things in non-standard places, or moving them around the system between images, can be a pain to the user due to the configuration changes that such differences require. Another thing is that RPM usually takes care of most everything elsewhere, and by putting things under /usr/local you prevent accidental over-writing by RPM during upgrades.
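For a typical source install, that means something like the following (the package name is hypothetical):

    tar xzf some-package-1.0.tar.gz
    cd some-package-1.0
    ./configure --prefix=/usr/local
    make
    make install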
With that being said, get everything that you need/want in the distribution installed and running properly on the "golden-client." Any problems that exist on the "golden-client" will also exist on the machines which receive the image later. Don't worry about getting absolutely everything, since you can add it after you start distributing the image to the rest of the ACL. One thing that we have been doing has been migrating just a few machines to the new image so that limited testing can occur. You too may want to do this for a few weeks so that you can be sure to work out the major bugs before it is put on the entire cluster.
Now that we have built the image and have done some of the preliminary testing, we are ready/able to grab the image from the "golden-client." First we need to prepare the client for having its image pulled. What this does is set up a temporary rsync server on the golden client that will allow it to serve its image to the Imageserver. It also gathers the necessary information about the drive partition geometries. It then makes all of this information available for retrieval through the rsync server. All of this is accomplished by running the command prepareclient as root. You don't want to do this before you are ready, because it leaves your system open to having any of its files pulled from anywhere. prepareclient will set everything up on the client side for you. All that is left in order to pull the image is to run getimage on the server. You need to pass getimage the golden-client's hostname and the imagename (e.g. getimage -golden-client [hostname] -image [imagename]). getimage will put the image in /var/lib/systemimager/images/[imagename]. /var/lib/systemimager can be a symbolic link, but wherever it leads must have room for the image. Once it is done pulling the image, getimage will ask you how you want to configure the network. Network configuration on the ACL is accomplished by static DHCP on Quark. So tell it static DHCP, but DO NOT CONFIGURE A DHCP SERVER ON ACL0. That will mess up the entire 159.28.230 subnet. Since we have already configured DHCP on Quark, you can skip through all of the prompts to run makedhcpstatic (say no to all of them). Once that is complete, go ahead and reboot your "golden client" so that you close the rsync server. And that's it... You now have an image that you can distribute to any number of clients. Now, when you are ready, follow the next step in order to add clients to the server to use the new image.
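For example (the hostname and image name here are hypothetical):

    # As root on the golden client:
    prepareclient

    # Then, on the Imageserver:
    getimage -golden-client acl1 -image acl-image-2.0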
Here is where we tell the Imageserver which client gets synchronized to which image. There are two ways to do this: manually, and automatically with addclients. Depending on the image and which ACL boxes it belongs to, you may want to just do things manually, but if you are doing a generic image that applies to the entire cluster, you may want to just use addclients. When I initially set up the server with the initial image, I set the entire ACL up to use the same image. This way, you get the infrastructure from addclients and you can then tweak it as necessary. For instance, when you want to change the image that a machine receives, all you do is change the link with its name to point to the new master install script in /var/lib/systemimager/scripts/ (see the sketch below). The last thing that you need to do before you can start synchronizing machines is to create an autoinstall boot floppy. This can be done with makeautoinstalldiskette. As a matter of fact, you can use one disk for any image, since it contacts the Imageserver to obtain its initial startup script. Once you have this disk, you can start installing the image onto the ACL.
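Repointing a machine's install script might look like this (the hostname and image name are hypothetical, and this assumes the usual [hostname].sh -> [imagename].master symlink convention in the scripts directory):

    cd /var/lib/systemimager/scripts
    rm acl5.sh
    ln -s acl-image-2.0.master acl5.sh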
Now, all that you need to do is put the autoinstall disk into each machine and reboot it. The disk will boot up, repartition the drives, and grab the image's files from the server. The other option, if you are already operating on an ACL box that has the SystemImager client utilities, is to just run updateclient -autoinstall. That will download all of the necessary things and then run lilo so that the next boot is to the SystemImager kernel. That's it... Your new distribution is on the cluster.
Here is where VA SystemImager really shines... When it comes time to make changes to the image, all you need to do is execute chroot /var/lib/systemimager/images/[imagename] and you will be working as if you were on one of your client machines. You can do anything from installing RPM/DEB packages to mounting NFS partitions. You can even recompile the kernel and change the lilo/grub config... That's its beauty. Now, when you want to do things like change partition sizes, you actually have to do some work. The files that control the partitioning are in the /var/lib/systemimager/scripts directory. You actually have to modify the initrd.gz file and change the files in its /etc directory.
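For example (the image name is hypothetical):

    # Enter the image as if it were a running client:
    chroot /var/lib/systemimager/images/acl-image-2.0 /bin/bash
    # ...install packages, edit configs, etc., then leave the chroot:
    exit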