A customer in the medical industry was creating a new product line that used an Intel processor running the Linux operating system to support the surrounding hardware. The operating system itself was not one of the product's features: this was an embedded project.
The unit is a high-resolution medical imager, and in addition to the imaging components (which were of course the main purpose of the whole unit), the components that concerned my project were:
At bootup, the processor booted the BIOS from flash, which in turn launched ROM-DOS (an MS-DOS clone) from the "A:" drive in flash. Once ROM-DOS was running, it ran A:\AUTOEXEC.BAT which contained the bootstrap code that launched the device software itself from the hard drive.
The KDU (Keyboard and Display Unit) contains a
10-button keypad and an LC display, and it's a separate daughterboard
with its own processor and firmware. The "main" firmware has very
sophisticated functions for input and output, but also has a "safe mode"
which has only the bare minimum of functions required for KDU and device
functionality. Even if main KDU firmware is corrupted, we can use safe
mode to recover the device.
The device had several sub-components whose flash could be reprogrammed,
and some of them were so vital to operation that a failed reprogramming
would always produce an RTF (Return To Factory) unit. For instance, we
could reflash the KDU's "main" mode at any time and still recover, but if
"safe mode" were corrupted, the unit was dead. Likewise, if the
reprogramming of the A: flash filesystem failed, the unit would be disabled.
The installation system allowed the end customer to insert a
ZIP disk into the device and essentially boot from it, even though
the BIOS did not directly support this. The "hook" for
the ZIP was very early in the process, so even a failed or unformatted
hard drive would not get in the way.
The primary bootstrap probed the hardware and attempted to locate the
secondary bootstrap program on the ZIP drive. If found, it launched that
program before the hard drive was ever considered, which gave a
"hook" to take control of an otherwise dead or unconfigured system.
If no secondary bootstrap was found on the ZIP drive (or if no media were
inserted), then it attempted to launch the secondary bootstrap from the
hard drive. This program was responsible for bringing the unit into
full operational mode.
With this system of delegated responsibility, it was straightforward
to update either of the secondary bootstraps (on the hard drive or the
ZIP disk) without the risk of reflashing the A: drive. I believe
that the primary bootstrap will never need to be changed.
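The search order described above can be sketched as follows. This is only an illustration in Python, not the ABOOT.EXE code, which ran under ROM-DOS. The location labels in SEARCH_ORDER, the file name BOOT2.EXE, and the function pick_secondary_bootstrap are all hypothetical names invented for the example.

```python
from typing import Callable, Iterable, Optional

# Hypothetical labels: removable media is tried first, the hard drive second.
SEARCH_ORDER = ("ZIP:/BOOT2.EXE", "HD:/BOOT2.EXE")


def pick_secondary_bootstrap(
    candidates: Iterable[str],
    is_present: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate that is_present() reports as available."""
    for path in candidates:
        if is_present(path):
            return path
    # Nothing bootable was found; the caller decides what happens next.
    return None
```

Keeping the primary stage down to an ordered search like this is what lets the policy live in the secondary bootstraps, which can be replaced without touching flash.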
The root tarball that was unloaded onto the newly-formatted drive
was created by the software build process, and this required a
substantial effort. I believed that we had to use the standard
RPM mechanism for software installation because it was the only
way to get reliable execution of pre- and post-install scripts.
So the build process had a list of all the RPM files required for
the full system, and it "installed" them all to a root-image
directory. I used the --root=DIR option of the rpm
command, which issues a chroot() to the working directory: in
this way each installation ran as though it were on a "live"
system, but in fact was simply populating a directory. By applying
all of the RPMs in this manner, the result was a complete image of
the full Linux system. This full image was archived into a single
large tarball and placed onto the installation media.
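A minimal sketch of that build step, assuming a Python driver script (the original build tooling is not described, so the function names, the use of gzip compression, and the plain -i install flag are my assumptions). Running build_root_image for real requires the rpm command and normally root privileges, and the root directory should be given as an absolute path.

```python
import subprocess
import tarfile
from pathlib import Path
from typing import List, Sequence


def rpm_install_command(root: Path, packages: Sequence[Path]) -> List[str]:
    """Build an rpm command line that installs packages under root, not /."""
    return ["rpm", "--root", str(root), "-i", *[str(p) for p in packages]]


def archive_tree(root: Path, tarball: Path) -> None:
    """Archive the populated directory with paths relative to its top."""
    with tarfile.open(tarball, "w:gz") as tar:
        tar.add(root, arcname=".")


def build_root_image(root: Path, packages: Sequence[Path], tarball: Path) -> None:
    """Populate root from the package list, then roll it into one tarball."""
    root.mkdir(parents=True, exist_ok=True)
    # check=True stops the build on the first failed install.
    subprocess.run(rpm_install_command(root, packages), check=True)
    archive_tree(root, tarball)
```

Storing the members relative to "." means the tarball can be unpacked into whatever mount point the freshly formatted drive happens to have.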
The 16 megabyte ramdisk image was likewise created by an automated
process. I created a configuration file that described how the ramdisk
filesystem was to be laid out, and the build script populated it
from scratch. While copying system binaries, it automatically
located and installed the required shared libraries, and at no
time was a manual filesystem-crafting process employed. It was
vital that the process be reproducible.
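One way such automatic library discovery can work is to parse the output of ldd for each binary copied into the image. The sketch below assumes that approach; the original build script's method is not stated, and parse_ldd_output is a hypothetical helper written for this example.

```python
import re
from typing import List

# Matches "libfoo.so.1 => /path/libfoo.so.1 (0x...)" as well as a bare
# "/path/ld-linux.so.2 (0x...)" line, capturing the absolute path.
_LDD_LINE = re.compile(r"^\s*(?:\S+\s+=>\s+)?(/\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def parse_ldd_output(text: str) -> List[str]:
    """Return the absolute library paths named in ldd output, in order."""
    libs: List[str] = []
    for line in text.splitlines():
        match = _LDD_LINE.match(line)
        # Lines without an absolute path (virtual objects, "not found")
        # do not match and are skipped.
        if match and match.group(1) not in libs:
            libs.append(match.group(1))
    return libs
```

A build script would run ldd on every binary listed in the configuration file, merge the results, and copy each library into the same path inside the ramdisk tree.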
This ramdisk filesystem contained everything needed for installation,
maintenance, and general debugging of the printer. It included
the usual editors, traditional Linux utilities, and even a DHCP
client and secure shell server. I created a maintenance disk
that booted hands-off and allowed for remote management via secure
shell.
The full Linux installation process took about ten minutes, and
was entirely unattended. The customer has reported that this
entire mechanism has been utterly trouble free.
Design Goals
My task was to create a bootstrap and software installation system that
would be robust even in the face of power interruptions or corruption
that rendered the hard disk unbootable. It was considered a fatal error
if the device became so corrupted that it was rendered RTF
(Return To Factory).
Bootstrap mechanism
The bootstrap mechanism I designed was launched from the AUTOEXEC.BAT
file on the A: flash filesystem. Because reprogramming the flash
was "dangerous", the primary bootstrap (ABOOT.EXE) implemented
"mechanism", not "policy": it knew very little about the printer or
Linux or anything that we might want to change.
Software Installation
I designed the installation system that booted from the ZIP disk,
and the secondary bootstrap launched Linux from a ramdisk. This
Linux system fully crafted the hard drive from start to finish: