How to convert a Debian system to bootable Software RAID 1 with a second hard drive, 'mdadm' and a few standard UNIX tools
Version 0.97 (2004-06-03) Lucas Albers -- admin At cs DOT montana dot edu and Roger Chrisman
Home of most recent version:
http://alioth.debian.org/projects/rootraiddoc
Thanks to: Alvin Olga, Era Eriksson, Yazz D. Atlas, James Bromberger, Timothy F Nagy, and alioth.debian.org
WARNING: No warranty of any kind. Proceed at your own risk. A typo, especially in lilo.conf, can leave your system unbootable. Back up your data and make a boot floppy before starting this procedure.
We begin with Debian installed on the Primary Master drive, hda (step 1). We need RAID support in our Kernel (step 2). We add another disk as Secondary Master, hdc, set it up for RAID (step 3), and copy Debian to it (step 4). Now we can reboot to the RAID device (step 5) and declare hda part of the RAID, whereupon it automatically syncs with hdc to complete our RAID 1 device (step 6).
If all goes well
Use this HowTo at your own risk. We are not responsible for what happens!
First things first
Whenever you change your partitions, you need to reboot! (If you know what you are doing, ignore this advice.)
I assume you will mess up a step so wherever possible, we include verification.
I use 'mdadm' because it is easier than 'raidtools' or 'raidtools2'.
We now have both grub and lilo directions; the grub directions are still in beta form.
Please read the grub directions and comment on them.
Do a fresh install the normal way on your first drive, hda (the Primary Master drive in your computer). Or, if you already have a running Debian system that you want to use on hda, skip ahead to step 2. If you need Debian installation instructions, see:
Debian Installation HowTo » http://www.debian.org/releases/stable/installmanual
Sarge Debian Installation HowTo » http://d-i.alioth.debian.org/manual/
RAID must be compiled into the Kernel, not added as a module, for you to boot from the RAID device (unless you use a RAID savvy initrd kernel or boot from a non-RAID boot drive; I now cover initrd methods!). You need RAID 1, but I usually include RAID 5, too. For step by step Kernel compile and install instructions, see:
Creating custom Kernels with Debian's kernel-package system » http://newbiedoc.sourceforge.net/system/kernel-pkg.html
cat /proc/mdstat
(You should see the RAID "personalities" your Kernel supports.)
Something like this:
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md4 : active raid5 hdh4[3] hdg4[2] hdf4[1] hde4[0]
356958720 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices:
YOU MUST VERIFY you have raid support via /proc/mdstat. This is the most important item to verify before going any further. Either the kernel has to support it or you have to load the modules in initrd.
(This will show you if raid is compiled into kernel, or detected as a module from initrd.)
/etc/modules will not list RAID if Kernel has RAID compiled in instead of loaded as modules.
Use lsmod to list currently loaded modules; this will show any raid modules that are loaded.
reiserfs
raid1
ext2
ide-disk
raid5
ext3
cat /etc/modules
(IF YOU SEE ANY RAID LISTED IN /etc/modules, then you probably have your Kernel loading RAID via modules. That will prevent you from booting from your RAID device, unless you use initrd. To boot from your RAID device, unless you use a RAID savvy initrd, you need RAID compiled into Kernel, not added as a module.)
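The /proc/mdstat check above can also be scripted. What follows is only a minimal sketch, not part of the original procedure: the function name has_raid1 is made up for illustration, and it does nothing more than look for [raid1] on the Personalities line of /proc/mdstat (or of a file you pass in).

```shell
# has_raid1 [FILE]: succeed if the "Personalities" line lists raid1.
# FILE defaults to /proc/mdstat. The function name is hypothetical.
has_raid1() {
    grep '^Personalities' "${1:-/proc/mdstat}" 2>/dev/null | grep -q '\[raid1\]'
}

if has_raid1; then
    echo "raid1 personality is available"
else
    echo "no raid1 personality found; check your Kernel or initrd"
fi
```

It does not tell you whether RAID is compiled in or loaded as a module; use lsmod and /etc/modules for that, as described above.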
apt-get install mdadm
2.4 List what IDE devices you have:
ls /proc/ide
Setup RAID 1 and declare disk-one of your RAID to be 'missing' and disk-two of your RAID to be 'hdc'.
Warning: ALWAYS give the device name (e.g. /dev/hdc) when editing with cfdisk. By default cfdisk will select the first disk in the system. I accidentally wiped the wrong partition with cfdisk, once.
Do A or B, either way will work:
A. Create partitions on new disk.
cfdisk /dev/hdc
or
B. copy existing partitions to new disk with sfdisk.
sfdisk -d /dev/hda | sfdisk /dev/hdc
cfdisk /dev/hdc
reboot
(To verify that everything is working ok.)
Create a RAID 1 device that has two members, one of which does not exist yet. md0 is the RAID partition we are creating; /dev/hdc1 is the initial partition. We will be adding /dev/hda1 back into the /dev/md0 RAID set after we boot into /dev/md0.
mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/hdc1
If this gives errors then you need to zero the super block, see useful mdadm commands.
You can use reiserfs or ext3 for this; both work. I use reiserfs for larger devices. Go with what you trust.
mkfs.ext3 /dev/md0
or
mkfs -t reiserfs /dev/md0
Copy your Debian system from hda to /dev/md0 ('missing' + 'hdc'). Then, check to make sure that the new RAID device is still set up right and can be mounted correctly. We do this with an entry in hda's /etc/fstab and a reboot. Note that by editing hda's /etc/fstab after the copy, instead of before, we leave the copy on md0 unaltered and are only editing hda's /etc/fstab.
NB: THIS IS A BRANCH IN OUR SYSTEM CONFIGURATION (i.e. temporary!), but it will be overwritten later by the md0 version of /etc/fstab by the sync in step 6.
mkdir /mnt/md0
mount /dev/md0 /mnt/md0
cp -axu / /mnt/md0
Please refer to the Copying data section to verify you copied the data correctly. You don't need the -u switch; it just tells cp not to copy the files again if they already exist. If you are running the command a second time it will run faster with the -u switch.
This verifies that you have the correct partition signatures on the partition and that your partition is correct. Sample Line in /etc/fstab:
/dev/md0 /mnt/md0 ext3 defaults 0 0
Then
reboot
And see if the RAID partition comes up.
mount
Should show /dev/md0 mounted on /mnt/md0.
For step 5 reboot, we will tell Lilo that
We will, as before, be using hda's MBR (Master Boot Record is the first 512 bytes on a disk and is what the BIOS reads first in determining how to boot up a system) and hda's /boot dir (the kernel-image and some other stuff live here), but instead of mounting root (/) from hda, we will mount md0's root (/) (the root of our RAID device, currently running off of only hdc because we declared the first disk 'missing').
(Later we will configure Lilo to write the boot sector to the RAID boot device also, so we can still boot even if either disk fails.)
Add a stanza labeled 'RAID' to /etc/lilo.conf on hda1 so that we can boot with /dev/md0, our RAID device, as root (/):
#the same boot drive as before.
boot=/dev/hda
image=/vmlinuz
label=RAID
read-only
#our new root partition.
root=/dev/md0
That makes an entry labeled 'RAID' specific to the RAID device, so you can still boot to /dev/hda if /dev/md0 does not work.
sample complete lilo.conf file:
#sample working lilo.conf for raid.
#hda1,hdc1 are boot, hda2,hdc2 are swap
#hda3,hdc3 are the partition used by array
#root partition is /dev/md3 on / type reiserfs (rw)
#I named the raid volumes the same as the partition numbers
#this is the final lilo.conf file of a system completely finished,
#and booted into raid.
lba32
boot=/dev/md1
root=/dev/hda3
install=/boot/boot-menu.b
map=/boot/map
prompt
delay=50
timeout=50
vga=normal
raid-extra-boot=/dev/hda,/dev/hdc
default=RAID
image=/boot/vmlinuz-RAID
label=RAID
read-only
root=/dev/md3
alias=1
image=/vmlinuz
label=Linux
read-only
alias=2
image=/vmlinuz.old
label=LinuxOLD
read-only
optional
lilo -t -v
(With a RAID installation, always run lilo -t first just to have Lilo tell you what it is about to do; use the -v flag, too, for verbose output.)
Configure a one-time Lilo boot via the -R flag, with a reboot on Kernel panic
The -R <boot-parameters-here> tells Lilo to only use the specified image for the next boot. So once you reboot it will revert to your old Kernel.
From 'man lilo':
-R command line
This option sets the default command for the boot loader the next time
it executes. The boot loader will then erase this line: this is a
once-only command. It is typically used in reboot scripts, just before
calling `shutdown -r'. Used without any arguments, it will cancel a
lock-ed or fallback command line.
Before you can do the 'lilo -v -R RAID' command, you must first do a 'lilo' command to update the Lilo boot record with the contents of your new lilo.conf. Otherwise Lilo does not know what you mean by 'RAID' and you just get a 'Fatal: No image "RAID" is defined' error message when you do 'lilo -v -R RAID'. So,
lilo
lilo -v -R RAID
Edit /mnt/md0/etc/fstab to have /dev/md0 mount as root (/), when Lilo boots from our RAID device, /dev/md0.
Previous root (/) in fstab was:
/dev/hda1 / reiserfs defaults 0 0
Edit it to:
/dev/md0 / ext3 defaults 0 0
Note: edit /mnt/md0/etc/fstab, not /etc/fstab, because at the moment we are booted with hda1 as root (/) but we want to change the /etc/fstab that we currently have mounted on /mnt/md0/etc/fstab, our RAID device.
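Before rebooting, it can be worth confirming that you edited the right file. Here is a small sketch, assuming the copy is still mounted on /mnt/md0 as in step 4. fstab_root_device is a hypothetical helper name, not a standard tool.

```shell
# Print the device that an fstab file mounts on root (/), skipping comment lines.
# The function name is made up for this example.
fstab_root_device() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1 }' "$1"
}

# Compare the copy on the RAID device with the fstab we are currently booted from.
for f in /mnt/md0/etc/fstab /etc/fstab; do
    if [ -r "$f" ]; then
        echo "$f: root is $(fstab_root_device "$f")"
    fi
done
```

At this point the first file should report /dev/md0 and the second should still report your old root partition (/dev/hda1 in this HowTo).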
Reboot to check if system boots our RAID device, /dev/md0, as root (/). If it does not, just reboot again and you will come up with your previous boot partition courtesy of the -R flag in step 5.3 above.
reboot
Verify /dev/md0 is mounted as root (/)
mount
should show:
/dev/md0 on / type reiserfs (rw)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
'type reiserfs' is just my example; you will see whatever your file system type is.
Now we are booted into the new RAID device -- md0 as root (/). Our RAID device only has one disk in it at the moment because we earlier declared the other disk as 'missing'. That was because we needed that other disk, hda, to install Debian on or because it was our pre-existing Debian system.
For step 6 reboots, we tell Lilo that
Here we not only use md0's root (/) as in step 5, but also md0's /boot (it contains an identical kernel-image to the one on hda because we copied it here from hda in step 4, but we will be overwriting everything on hda in step 6 and can't continue relying on the stuff on hda) and MBR from either hda or hdc, whichever the BIOS can find (they will be identical MBRs and the BIOS will still find hda's MBR but in case the hda disk were to fail down the road we would want the BIOS to look on hdc as a fail over so that it could still boot up the system).
cfdisk /dev/hda
My two hard disks are from different manufacturers and, as it happens, while both are roughly 40G, they have different geometries in terms of sectors and precise size. So cfdisk was unable to make the partitions precisely the same size, and I had hda1 at 29,997.60MB and hdc1 at 30,000MB. This didn't work when I got to the 'mdadm --add /dev/md0 /dev/hda1' step: I got a "failed: no space left on device!" error. So I ran cfdisk again and made hda1 slightly larger than hdc1, since I could not make them both exactly the same size. Now hda1 is 30,005.83MB and the 'mdadm --add /dev/md0 /dev/hda1' step works :-). (The remaining 10,000MB on each disk I am using for other purposes, including a md1 of 1,000MB composed of hda2 and hdc2.)
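To catch a too-small partition before running mdadm --add, you can compare the block counts the Kernel reports in /proc/partitions. This is only a sketch under assumptions: part_blocks and big_enough are hypothetical helper names, and comparing whole partition sizes is a rough, conservative check rather than the exact test mdadm performs.

```shell
# Print the size in 1K blocks of partition $1, reading /proc/partitions-style
# text on stdin. (Helper names here are made up for illustration.)
part_blocks() {
    awk -v name="$1" '$4 == name { print $3 }'
}

# Succeed if partition NEW is at least as large as partition OLD.
# Usage: big_enough NEW OLD [FILE]   (FILE defaults to /proc/partitions)
big_enough() {
    new=$(part_blocks "$1" < "${3:-/proc/partitions}")
    old=$(part_blocks "$2" < "${3:-/proc/partitions}")
    [ -n "$new" ] && [ -n "$old" ] && [ "$new" -ge "$old" ]
}

if [ -r /proc/partitions ] && big_enough hda1 hdc1; then
    echo "hda1 is at least as large as hdc1"
else
    echo "hda1 is missing or smaller than hdc1; enlarge it before mdadm --add"
fi
```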
And watch the booted RAID system automatically mirror itself onto the new drive. We are currently booted from MBR and /boot device on /dev/hdc1, with /dev/md0 as root (/).
mdadm --add /dev/md0 /dev/hda1
Note: We are adding /dev/hda1 into our existing RAID device. See if it is syncing.
cat /proc/mdstat
should show that it is syncing.
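If you would rather not read /proc/mdstat by eye, the state of one array can be summarised with a little awk. This is a sketch, not an official tool: md_status is a made-up name, and it assumes the usual mdstat layout (a "blocks" line ending in something like [_U] or [UU], and a "recovery =" or "resync =" line while syncing). The mdstat excerpt in the demonstration is invented sample text, not real output from my machine.

```shell
# md_status DEVICE: read mdstat-style text on stdin and print one of
# absent, syncing, degraded or clean for DEVICE (e.g. md0).
# The function name and the four state words are made up for this example.
md_status() {
    awk -v dev="$1" '
        $1 == dev && $2 == ":"            { inside = 1; next }
        inside && $2 == ":"               { exit }   # next array begins
        inside && NF == 0                 { exit }   # blank line ends the block
        inside && /blocks/                { flags = $NF }
        inside && /(recovery|resync) *=/  { syncing = 1 }
        END {
            if (!inside)          print "absent"
            else if (syncing)     print "syncing"
            else if (flags ~ /_/) print "degraded"
            else                  print "clean"
        }
    '
}

# Demonstration on an invented excerpt; on a real system use:
#   md_status md0 < /proc/mdstat
md_status md0 <<'EOF'
Personalities : [raid1]
md0 : active raid1 hda1[2] hdc1[1]
      30000000 blocks [2/1] [_U]
      [=>...................]  recovery =  5.0% (1500000/30000000) finish=20.0min speed=23000K/sec

unused devices: <none>
EOF
```

The same helper can be polled in a loop (with a sleep between checks) to wait for the sync to finish before touching Lilo.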
these are from when we are booted onto RAID.
boot=/dev/md0
root=/dev/md0
#this writes the boot signatures to either disk.
raid-extra-boot=/dev/hda,/dev/hdc
image=/vmlinuz
label=RAID
read-only
YOU NEED THE raid-extra-boot to have it write the boot loader to all the disks.
YOU ARE OVERWRITING THE BOOT LOADER ON BOTH /dev/hda and /dev/hdc.
You can keep your old boot option to boot /dev/hda so you can boot RAID and /dev/hda.
But remember that you don't want to boot a disk that belongs to a RAID device in non-RAID mode: if you make changes on one disk and not the other, it will hurt the synchronization.
(we are currently booted into RAID)
lilo -t -v
lilo -R RAID
The -R option tells Lilo to use the new Lilo setting only for the next reboot, and then revert back to the previous setting.
So I waited for the synchronization, started in Step 6.2, to finish (checking it with 'cat /proc/mdstat'). Once it was done, I did 'lilo -t -v' again. Lilo seems happy now (no "Fatal" message).
Note 1a: The synchronization however took two hours! I checked with 'hdparm' and it seems I have DMA turned off. Perhaps the synchronization would go faster with DMA turned on. Some examination of my system revealed that I did not have my computer's PCI chipset support compiled into my custom kernel. I recompiled the kernel (kernel 2.6.4) and selected the correct PCI chipset support for my computer and now DMA works correctly :-) and by default. For DMA to be default is also configurable in the PCI area of 'make menuconfig' during kernel compile configuration, and I chose it.
So I can now do Lilo with '-R
Note 2: another error, "Fatal: No image "RAID" is defined."
As in Step 5.3 above, I need to do 'lilo' first so that Lilo reads my new /etc/lilo.conf, otherwise Lilo does not know about my stanza labeled "RAID" which is new in my lilo.conf. (Yes, I told Lilo about it on hda1 in step 5.3, but that was after I had copied the hda1 root (/) system to here, md0, which branched my system into two separate system configurations. So it needs to be done here, too. Then I can do 'lilo -R RAID'.)
Note 2a: However, the '-R' switch is pointless here unless the lilo.conf stanza labeled "RAID" is *not* the first kernel-image stanza in my lilo.conf. Because if it *is* the first stanza, then it is the default stanza anyway, with or without the '-R'.
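To see which stanza Lilo would boot by default (and so whether '-R' buys you anything), you can apply the rule from Note 2a by hand: an explicit default= line wins, otherwise the first label= in the file is used. The sketch below does that. lilo_default_label is a hypothetical helper, and it assumes one simple key=value setting per line.

```shell
# Print the label Lilo would boot by default according to a lilo.conf file:
# the value of default= if present, otherwise the first label= in the file.
# The function name is made up; it assumes one "key=value" setting per line.
lilo_default_label() {
    awk -F'=' '
        { key = $1; val = $2; gsub(/[ \t]/, "", key); gsub(/[ \t"]/, "", val) }
        key == "default"              { def = val }
        key == "label" && first == "" { first = val }
        END { print (def != "" ? def : first) }
    ' "$1"
}

if [ -r /etc/lilo.conf ]; then
    echo "default stanza: $(lilo_default_label /etc/lilo.conf)"
fi
```

If this prints RAID already, 'lilo -R RAID' changes nothing for the next boot.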
reboot
and check
cat /proc/mdstat
and check
mount
to be sure all is as expected.
See what Lilo will do.
lilo -t -v
If it looks okay, do it:
lilo
reboot
and check
cat /proc/mdstat
and check
mount
as a final system check.
I used the following procedure with stock Debian 2.6.5, which has an initrd with all the modules ready to boot into RAID. The procedure also covers using grub as the boot loader. I built this from a bare install of Sarge using the new installer with grub as the boot loader, but most of this document is distro independent. My file system throughout is ext3 and it shouldn't take too much to use reiserfs.
These steps reference back to the procedure sections outlined above and indicate where things differ due to initrd or grub, so you will have to read/do/be familiar with the above steps. Also, make sure you currently use grub as your boot loader. If you are using LILO, install grub and make sure it works before proceeding!
When using initrd, the kernel does not need to have RAID compiled in; the RAID drivers will be loaded as modules. Make sure the kernel loads the RAID modules.
Edit /etc/modules and add
md
raid1
Instead of section 5 using LILO, grub is used as the boot loader, and an initrd is used when booting the kernel. A new kernel entry is created in the grub menu; it refers to a newly created initrd that will start the md [raid] device. The original kernel entry will remain and can be reverted to if something goes wrong until RAID is running. This will still use the grub installed on the /dev/hda MBR.
A) Make sure the initrd has the modules it needs, by editing /etc/mkinitrd/modules. Add the following [you can see what modules are available by mounting the initrd and looking in the lib/modules - see section 8.]:
md
raid1
B) Update the initrd so that the root device loaded is the raid device, not probed. Edit the /etc/mkinitrd/mkinitrd.conf, and update the ROOT line
ROOT=/dev/md0
C) Create the new initrd and a link to it.
mkinitrd -o /boot/initrd.img-2.6.5-raid
edit /boot/grub/menu.lst
1. Add the following entry
title Debian GNU/Linux, kernel 2.6.5-1-686 RAID
root (hd0,0)
kernel /boot/vmlinuz-2.6.5-1-686 root=/dev/md0 ro
initrd /boot/initrd.img-2.6.5-1-686-raid
savedefault
boot
2. Update the following kernel root option in the file. Note: see the grub known issues; this option will not be used anyway.
# kopt=root=/dev/md0 ro
[Copied from Part I 5.4 above]
Edit /mnt/md0/etc/fstab to have /dev/md0 mount as root (/), when grub boots from our RAID device, /dev/md0:
Previous root (/) in fstab was:
/dev/hda1 / ext3 defaults 0 0
Edit it to:
/dev/md0 / ext3 defaults 0 0
Note: edit /mnt/md0/etc/fstab, not /etc/fstab, because at the moment we are booted with hda1 as root (/) but we want to change the /etc/fstab that we currently have mounted on /mnt/md0/etc/fstab, our RAID device.
Reboot and choose the RAID kernel to check if system boots our RAID device, /dev/md0, as root (/). If it does not, just reboot again and choose the original pre-RAID kernel image.
reboot
Verify /dev/md0 is mounted as root (/)
mount
should show something similar to:
/dev/md0 on / type ext3 (rw)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
Now we are booted into the new RAID device -- md0 as root (/). Our RAID device only has one disk in it at the moment because we earlier declared the other disk as 'missing'. That was because we needed that other disk, hda, to install Debian on or because it was our pre-existing Debian system.
cat /proc/mdstat shows the [degraded] array is up and running; note the [_U], meaning the second disk is up. Follow steps 6.1 and 6.2. Wait and make sure the drives are fully synced before proceeding.
This is needed to make sure that mkinitrd starts the newly built array with all drives. mkinitrd uses mdadm -D to discover what drives to assemble in the array during startup; this is contained in a script in the initrd image. If this step is not done, the next time you reboot the array will be degraded.
Do the following
mkinitrd -o /boot/initrd.img-2.6.5-raid
reboot
and check the array is fully up, look for the [UU]
cat /proc/mdstat
and check /dev/md0 is mounted
mount
grub refers to the boot(ed) device as hd0, so if the primary hard drive (/dev/hda) fails, the system will look for the next bootable device (/dev/hdc) and load its MBR, which grub will still refer to as hd0. So, the grub configuration can still use hd0 even when the primary device fails.
These steps temporarily tell grub the second device is hd0 and then write grub to its MBR.
Start the grub command line, then run the commands below. Note: grub partition references are offset by 1, so in the following, with a partition of /dev/hdc1, the root is (hd0,0) [the previous line tells grub to set hdc as hd0]. If the partition was /dev/hdc2, the root would be (hd0,1)!
grub
grub> device (hd0) /dev/hdc
grub> root (hd0,0)
grub> setup (hd0)
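The offset-by-1 rule is easy to get wrong, so here is a tiny sketch that does the arithmetic. It assumes, as in the commands above, that the disk has been declared as hd0 and that you are using the grub version this HowTo describes (partitions counted from 0). grub_root is a made-up helper name.

```shell
# Print the grub root for a Linux partition name, assuming its disk has been
# declared as (hd0) and grub counts partitions from 0 as described above.
# The function name is hypothetical.
grub_root() {
    num=${1##*[!0-9]}                 # keep only the trailing digits
    [ -n "$num" ] || return 1         # no partition number given
    echo "(hd0,$((num - 1)))"
}

grub_root /dev/hdc1    # prints (hd0,0)
grub_root /dev/hdc2    # prints (hd0,1)
```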
reboot, verify the /proc/mdstat devices always start. Follow section VIII and verify the system boots with one disk off line.
grub will already be installed on hda, and you will manually force grub to be installed on hdc so the MBRs are ok; however, install-grub and update-grub will fail because grub does not understand the md0 device. This is not a problem with install-grub as it will not be executed again after it has been installed, but update-grub is executed after an updated kernel is apt'd, causing an error to be reported by apt. The update-grub error is ok, the kernel gets installed and the initrd is created with all the md array information, provided the array was not degraded during the kernel upgrade. But you will have to manually update the grub menu.lst and add the new kernel information before you reboot, or the new kernel will not appear in the grub menu.
When using mdadm, mkinitrd will only detect disks in the array that are running at the time of execution. You should not install a new kernel while the array is degraded, otherwise, even if you do an mdadm --add, the next reboot will still be degraded! The array is started at boot time by script. You can see what is in the script of the initrd by mounting it, e.g.
mount /boot/initrd.img-X.X.X /mnt -o loop
cat /mnt/script
And look for the array start line similar to
mdadm -A /devfs/md/0 -R -u 23d8dd00:bc834589:0dab55b1:7bfcc1ec /dev/hda1 /dev/hdc1
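To confirm that the initrd will start the array with both disks, you can pull the member devices out of that line. This is a sketch that assumes the line looks like the sample above (mdadm -A, then the md device, then options, then the members). initrd_array_members is a hypothetical name.

```shell
# List the member devices named on the "mdadm -A" line of an initrd script
# read from stdin. Field 3 is the md device itself, so scanning starts at 4.
# The function name is made up for this example.
initrd_array_members() {
    awk '$1 == "mdadm" && $2 == "-A" {
        for (i = 4; i <= NF; i++)
            if ($i ~ /^\/dev\//) print $i
    }'
}

# Demonstration on the sample line shown above; on a real system use:
#   initrd_array_members < /mnt/script
echo 'mdadm -A /devfs/md/0 -R -u 23d8dd00:bc834589:0dab55b1:7bfcc1ec /dev/hda1 /dev/hdc1' |
    initrd_array_members
```

If only one device is listed, the initrd was built while the array was degraded and should be rebuilt after the sync has finished.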
Redundant Array of Inexpensive Disks (RAID) refers to putting more than one hard disk to work together in various advantageous ways. Hardware RAID relies on special hardware controllers to do this and is not covered in this HowTo. Software RAID, the subject of this HowTo, uses software plus the ordinary controllers on your computer's motherboard and works excellently.
RAID 1 is where you use two hard drives as if they were one by mirroring them onto each other. Advantages of RAID 1 are (a) faster data reads because one part of the data can be read from one of the disks while simultaneously another part of the data is read from the other disk, and (b) a measure of fail over stability -- if one of the disks in the RAID 1 fails, the system will usually stay online using the remaining drive while you find time to replace the failed drive.
To achieve the speed gain, the two disks that comprise your RAID 1 device must be on separate controllers (in other words, on separate drive cables). The first part of the data is read from one disk while simultaneously the second part of data is read from the other disk. Writing data to a RAID 1 device takes twice as long apparently. However, under most system use data is more often read from disk than written to disk. So RAID 1 almost doubles the effective speed of your drives. Nice.
RAID is not a substitute for regular data back ups. Many things can happen that destroy both your drives at the same time.
Drive designators.
Drives on IDE 1 -- Primary Controller
Drives on IDE 2 -- Secondary Controller
Jumpers. When moving drives around in your computer, be sure to set the jumpers on your drives correctly. They are the little clips that connect two of various pins on your drive to set it to Cable Select, Master, or Slave. IDE drives usually have a diagram right on their case that shows where to set the clip for what setting. Different brands sometimes use different pin configurations.
Cables. Use 80 wire 40 pin IDE drive cables, not 40 wire 40 pin, or you will slow down your hard drive access. For best results, cables should be no longer than the standard 18". If your cable has a blue end, that's the end to attach to the motherboard (I don't know why). I don't think it matters which of the two drive connectors on the cable you plug your drive into, the middle or end one, unless you use Cable Select, in which case I believe the cable's end plug is Master and its middle plug is Slave.
You can have a multi-partition RAID system if you prefer. You just need to create multiple RAID devices.
I have found it useful when setting software RAID on multiple partitions to set the RAID device to the same name as the disk partition.
If you have 3 partitions on /dev/hda and want to add /dev/hdc for software RAID, you boot from /dev/hdc and add /dev/hda back into the device, exactly as I did earlier, but with 3 partitions, which are: hda1=/boot, hda2=/, hda3=/var
sfdisk -d /dev/hda | sfdisk /dev/hdc;
reboot
mdadm --zero-superblock /dev/hda1
mdadm --zero-superblock /dev/hda2
mdadm --zero-superblock /dev/hda3
mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/hdc1
mdadm --create /dev/md2 --level=1 --raid-disks=2 missing /dev/hdc2
mdadm --create /dev/md3 --level=1 --raid-disks=2 missing /dev/hdc3
mkfs.reiserfs /dev/md1;mkfs.reiserfs /dev/md2; mkfs /dev/md3;
mkdir /mnt/md1 /mnt/md2 /mnt/md3;
mount /dev/md1 /mnt/md1; mount /dev/md2 /mnt/md2; mount /dev/md3 /mnt/md3;
cp -ax /boot/* /mnt/md1; cp -ax / /mnt/md2; cp -ax /var/* /mnt/md3;
Add an entry in the current fstab for all 3 and REBOOT.
Sync data again, only copying changed stuff.
cp -aux /boot/* /mnt/md1; cp -aux / /mnt/md2; cp -aux /var/* /mnt/md3;
edit lilo.conf entry in this case:
boot=/dev/md1
root=/dev/md2
Edit /mnt/md2/etc/fstab to have / set to /dev/md2.
REBOOT into RAID.
Add devices in:
mdadm --add /dev/md1 /dev/hda1
mdadm --add /dev/md2 /dev/hda2
mdadm --add /dev/md3 /dev/hda3
Wait for sync, write Lilo permanently, and REBOOT into your setup.
It is not harder to include more devices in a software RAID device.
You need special entries to use Lilo as your boot loader. I couldn't get grub to work, but nothing prevents you from using grub. Just standard Lilo/grub entries WILL NOT WORK FOR RAID.
Entries in /etc/lilo.conf:
raid-extra-boot=<option>
That option only has meaning for RAID 1 installations. The <option> may be specified as none, auto, mbr-only, or a comma-separated list of devices; e.g., "/dev/hda,/dev/hdc6".
panic='' line in lilo.conf tells Lilo to automatically boot back to the old install if something goes wrong with the new Kernel.
Use "cp -aux" to copy only updated items. If you are copying a partition that is not root, you need to copy the contents of the mount point and not the mount point itself; otherwise cp will copy the directory over as a subdirectory. To copy /boot, which is a separately mounted partition, to /mnt/md1, which is our new software RAID partition, we copy thus: "cp -aux /boot/* /mnt/md1". NOTE THE DIFFERENCE when copying mount points and not just /. If you just do "cp -aux /boot /mnt/md1" it will copy boot over as a subdirectory of /mnt/md1.
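The difference is easy to try out safely in scratch directories before you touch real partitions. The sketch below uses temporary directories as stand-ins for / and /mnt/md1 (all the variable names are made up), and plain "cp -a" without the -u and -x switches to keep it short. It also shows a third form, copying "boot/.", which, unlike the "*" glob, picks up dot-files as well.

```shell
# Scratch directories stand in for the real / and /mnt/md1 (illustration only).
src=$(mktemp -d); as_dir=$(mktemp -d); as_glob=$(mktemp -d); as_dot=$(mktemp -d)
trap 'rm -rf "$src" "$as_dir" "$as_glob" "$as_dot"' EXIT

mkdir "$src/boot"
touch "$src/boot/vmlinuz" "$src/boot/.config-hidden"

cp -a "$src/boot"    "$as_dir"    # result: $as_dir/boot/vmlinuz (extra level)
cp -a "$src/boot/"*  "$as_glob"   # result: $as_glob/vmlinuz, dot-files skipped
cp -a "$src/boot/."  "$as_dot"    # result: $as_dot/vmlinuz and the dot-file

# Show what ended up at the top level of each destination.
for d in "$as_dir" "$as_glob" "$as_dot"; do
    ls -A "$d"
    echo "--"
done
```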
Or, alternatively, you could copy the root system with 'find' piped to 'cpio', like this:
cd /
find . -xdev -print | cpio -dvpm /mnt/md0
You should always reboot if you have changed your partitions, otherwise the Kernel will not see the new partitions correctly. I have changed partitions and not rebooted, and it caused problems. I would rather have the simpler longer less potentially troublesome approach. Just because it appears to work, does not mean it does work. You really only need to reboot if you are CHANGING or rebooting a new Lilo configuration. Don't email me if you hose yourself because you did not feel the urge to reboot. Trust me.
initrd: Use RAID as initrd modules.
The Kernel that is installed when you first build a system does not use an initrd.img. However the default kernel uses initrd. So you can use a stock kernel with software raid.
The new Kernel by default won't contain the right modules for creating a RAID savvy initrd, but they can be added.
(Per James Bromberger)
Now we need to prepare for running a RAID setup. Our packages need an update.
Use apt, because it rocks, and install the following:
DevFSd
kernel-image-2.4.x (whatever suits you)
reiserfsprogs
less
screen
vim
...Anything else you need and can't live without for the next 10 minutes
You might already have some of these modules in the kernel, eg ext2.
Edit /etc/modules and add the following modules:
reiserfs
md
raid1
ext2
ide-disk (might not need this one.)
raid5
ext3
ide-probe-mod (might not need this one.)
ide-mod (might not need this one.)
Edit /etc/mkinitrd/modules, and add the same modules to this list. Your initrd image needs to be able to read and write to your RAID array before your filesystem is mounted. Initrd is the trick here. You probably also want to see if you need to edit /etc/mkinitrd/mkinitrd.conf and change the variable ROOT=probe to ROOT=/dev/md0, or possibly, if using DevFS, ROOT=/dev/md/0.
Regenerate your initrd image for your new kernel with
mkinitrd -o /tmp/initrd-new /lib/modules/2.4.x-... .
If all is good, move this to /boot/initrd-2.4.x-... and edit your /etc/lilo.conf to add initrd=/boot/initrd against the "Linux" kernel entry. Run lilo, and you should see an asterisk next to the boot image "Linux".
With those modules you should be able to install the new kernel-image package. The install will add those modules to the initrd.img. Now you can do, for example (I actually only tested with kernel-image-2.4.24-1-686-smp on a machine using testing and unstable listed in /etc/apt/sources.list):
apt-get install kernel-image-2.4.24-1-686-smp
You will need to modify /etc/lilo.conf to include the right stuff. Otherwise the post install scripts for the package will likely fail.
image=/vmlinuz
label=Linux
initrd=/initrd.img
(The above is all one line)
Run Lilo and REBOOT.
You should now have the modules loaded. Check with: cat /proc/mdstat
Roger did it this way.
NB: I (Roger) had to disconnect power to my CD-ROM drive (because my CD-ROM was on /dev/hdd -- Secondary Slave) in order to boot with my Secondary Master disconnected. Otherwise my BIOS refused to boot the machine because my CD-ROM was then a Slave on a cable without any Master. Your mileage may vary. :-) So I decided to leave my CD-ROM disconnected, as this is a server and I need it to boot even with a failed drive more than I need the convenience of keeping the CD-ROM connected. I can of course connect the CD-ROM when I need it as long as I have a working Master drive on its cable with it or set it to Master.
I created a swap RAID device as follows:
(I have a 1000MB hda2 and a 1000MB hdc2, both as type 'fd' created with 'cfdisk', that I will use as md1 for swap.)
(Or you can just create the swap partitions on the actual disks and not put swap on raid. Just put a swap partition on each disk in your raid set on an empty partition.)
Add a Swap entry in /etc/fstab, just after root (/) partition line. Example line to add to /etc/fstab:
/dev/md1 none swap sw 0 0
Reboot and the boot sequence should start up the Swap when it reads /etc/fstab.
reboot
You can argue whether swap should be on raid. A large colo admin mentions that he does not use swap on raid. Keep it as simple as possible. You decide.
hdparm -d1 -c3 /dev/hda /dev/hdc
You need to use bonnie++ to measure software raid performance.
You want all your devices to be masters, as you are limited by the total bandwidth on each chain of hard drives.
That said, I just stick as many hard drives in the system as possible; I have not encountered problems where having disks on the same master/slave channel caused a slowdown.
(These directions are untested; I need to adapt them to mdadm instead of raid2 --luke)
So what to do if you can't get your root RAID1 filesystem to boot? Here is a straightforward way to get to your md0:
dpkg-deb -x kernel-image-2.4.yy-bf45.deb temp/
mount /floppy
cp /etc/raid* /sbin
# (Ie: copy to the ramfs /sbin)
mkdir /etc/raid
cp /floppy/raidtab /etc/raid
ln -s /etc/raid/raidtab /etc/raidtab
raidstart /dev/md0
mount -t reiserfs /dev/md0 /target
DON'T JUST LOOK AT THIS QUICK REFERENCE. Understand the rest of the document.
Verify RAID savvy Kernel. (1) You should see the RAID "personalities" your Kernel supports:
cat /proc/mdstat
dmesg | grep -i RAID
(This will show you if raid is compiled into kernel, or detected as a module from initrd.) /etc/modules will not list RAID if Kernel has RAID compiled in instead of loaded as modules. Use lsmod to list currently loaded modules this will show raid modules loaded.
(2) You should NOT see any RAID modules in /etc/modules (If you do, review step 2 of Procedure):
cat /etc/modules
Copy partitions hda to hdc:
sfdisk -d /dev/hda | sfdisk /dev/hdc
Create array:
mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/hdc1
Copy data:
cp -ax / /mnt/md0
Example /etc/lilo.conf entry for 1 disk RAID device:
boot=/dev/hda
image=/vmlinuz
label=RAID
read-only
#our new root partition.
root=/dev/md0
Add second disk to array:
mdadm --add /dev/md0 /dev/hda1
Example final /etc/lilo.conf entry:
boot=/dev/md0
root=/dev/md0
#this writes the boot signatures to either disk.
raid-extra-boot=/dev/hda,/dev/hdc
image=/vmlinuz
label=RAID
read-only
Always zero the superblock of a device before adding it to a RAID device. Why? Because the disks decide what array they are in based on the disk-id information written on them. Zero the superblock first in case the disk was part of a previous RAID device. Also, if a partition was part of a previous RAID device, it appears to store the size of its previous partition in the signature. Zeroing the superblock before adding it to a new RAID device takes care of cleaning that up, too.
Erase the MD superblock from a device:
mdadm --zero-superblock /dev/hdx
Remove disk from array:
mdadm --set-faulty /dev/md1 /dev/hda1
mdadm --remove /dev/md1 /dev/hda1
Replace failed disk or add disk to array:
mdadm --add /dev/md1 /dev/hda1
(that will format the disk and copy the data from the existing disk to the new disk.)
Create mdadm config file:
echo "DEVICE /dev/hda /dev/hdc" > /etc/mdadm/mdadm.conf
mdadm --brief --detail --verbose /dev/md0 >> /etc/mdadm/mdadm.conf
mdadm --brief --detail --verbose /dev/md1 >> /etc/mdadm/mdadm.conf
To stop the array completely:
mdadm -S /dev/md0
Finish directions on smart monitoring and mdadm configuration to monitor disks,and hot spares.
RAID 1 Root HowTo PA-RISC
http://www.pa-RISC-linux.org/faq/RAIDboot-howto.html
Lilo RAID Configuration:
http://lists.debian.org/debian-user/2003/debian-user-200309/msg04821.html
Grub RAID Howto
http://www.linuxsa.org.au/mailing-list/2003-07/1270.html
Building a Software RAID System in Slackware 8.0
http://slacksite.com/slackware/RAID.html
Root-on-LVM-on-RAID HowTo
http://www.midhgard.it/docs/lvm/html/install.disks.html
Software RAID HowTo
http://unthought.net/Software-RAID.HOWTO/Software-RAID.HOWTO.txt
HowTo - Install Debian Onto a Remote Linux System
http://trilldev.sourceforge.net/files/remotedeb.html
Kernel Compilation Information and good getting started info for Debian
http://newbiedoc.sourceforge.net
Initrd information and Raid Disaster Recovery,