Last Friday, the motherboard and CPU of my main Linux system died, and since it was a Pentium III, even the RAM and power supply were obsolete, so I ran off to the computer store to get the parts for a new one. My old system ran Red Hat 6.2 (very old), so upgrading was on my list anyway. This machine had been up for a year before it croaked.
I'd been grousing at Broadband Reports about poor performance of the hardware RAID in some Dell servers at a customer, so I figured I'd give a try at software RAID to learn the technology, as well as getting the reliability benefits on my own system.
So I fired up the ASUS P4P800-VM motherboard with 2.4GHz Celeron Pentium 4 with 1GB of RAM, installed Red Hat 9 and started reconfiguring everything: RAID, nameserver, mailserver, migrating data, etc.
Though it was all working, it just felt slow so I decided to run some tests: building SpamAssassin took 2 minutes on the new box and 30 seconds on my Kenmore system (a 2GHz P4). Something wasn't right, and I figured it was the RAID.
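For anyone who wants to repeat this kind of comparison on two machines, here is a rough sketch (not what I actually ran) that times a command and splits wall-clock time from CPU time. If the wall-clock time is much larger than user plus system time, the process spent most of its run waiting on something rather than computing. The `time_command` name and the example `make` invocation are placeholders, and the `resource` module is Unix-only.

```python
import resource
import subprocess
import sys
import time

def time_command(argv):
    """Run argv and return its exit code plus wall, user and system seconds."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    returncode = subprocess.call(argv)  # blocks until the command exits
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "returncode": returncode,
        "wall": wall,
        # CPU time charged to the command and any children it waited for
        "user": after.ru_utime - before.ru_utime,
        "sys": after.ru_stime - before.ru_stime,
    }

if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python timecmd.py make
    r = time_command(sys.argv[1:])
    print(f"wall {r['wall']:.2f}s  user {r['user']:.2f}s  sys {r['sys']:.2f}s")
```

Running the same command on both boxes and comparing the three numbers gives a first hint about whether the slow machine is short on CPU or stuck waiting.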
After doing some digging, I broke the RAID1 mirrors to run with just the native partitions (though this wasn't easy: I haven't found any way to remove the misnamed "persistent-superblock"), but found that it made no real difference in time. Hmmm.
So maybe it's the Celeron: a "real" Pentium 4 has 512KB of cache, but a Celeron has only 128KB. Could that make a difference? I swapped the CPUs between the new machine and Kenmore: no difference.
When I visited the computer store to get a new motherboard, the guy there, who was generally well informed, said that this board used new Intel chipsets that might not have OS support yet, which could cause very slow disk I/O because DMA wouldn't be running optimally. I was using some good new Maxtor IDE drives with 8MB buffers, so I didn't think the drives themselves were causing the problems. Fair enough.
The ASUS website had no information or drivers, and the BIOS settings all looked more or less optimal. I dug into hdparm and DMA settings: everything looked OK as far as I could tell. hdparm -t /dev/hda showed more than 50 MB/sec of throughput, and running dd to and from the device showed numbers in those ranges. The disk didn't look like the problem, but it sure as hell wasn't obvious what was.
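As a rough stand-in for the `dd` read test, a sequential read can be timed with a few lines of Python. This is only a sketch under stated assumptions: `read_throughput` is a made-up name, reading a raw device such as /dev/hda needs root, and a second run over the same data may be served from the page cache and look far faster than the disk really is.

```python
import sys
import time

def read_throughput(path, block_size=1024 * 1024, max_bytes=None):
    """Read path sequentially; return (bytes_read, MB_per_second).

    Stops at end of file, or once at least max_bytes have been read.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 skips Python's own buffer; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while max_bytes is None or total < max_bytes:
            chunk = f.read(block_size)
            if not chunk:  # end of file or device
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, rate

if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python readspeed.py /dev/hda
    n, mb_s = read_throughput(sys.argv[1], max_bytes=256 * 1024 * 1024)
    print(f"read {n} bytes at {mb_s:.1f} MB/s")
```

A number in the same range as `hdparm -t` would suggest, as it did here, that raw sequential reads are not where the time is going.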
So maybe it's something in the kernel, but I have the distinct disadvantage of having never actually built a Linux kernel. I have studied the kernel for 20 years, read the source plenty, keep up with the news, have built other kernels, but have never had the need to build and run a Linux kernel myself.
At this point it was Sunday afternoon, and on the chance that I'd get no joy, I went to the local Micro Center to get a different motherboard. I'd stopped in previously and passed because all the Pentium motherboards had the same chipset, but this time I wasn't going to take a chance: I bought an Intel D865PERL motherboard (gotta love the model number) but left it unopened. It wasn't a slam-dunk that this would work any better.
Back to the kernel: it took more than two hours to build on this 2G system, and the resultant kernel didn't run any better. I had a feeling this was going to be a very long night, but then Jeremy showed up on Yahoo! IM from Korea, so I was able to ping him about this. It turns out that he had the same Intel motherboard and it worked great for him, so this made the choice easy for me.
So with "ASUS Outside" and "Intel Inside", the machine ran much faster. Whew. But while I'm doing this upgrade-fest it's time to really build my own kernel in a meaningful way: I downloaded 2.4.23-pre8 (with patches) from kernel.org and am now running a real live recent kernel (after 10 revisions to try things out).
I'm quite sure I'd have preferred to have received this education without spending three days, but I guess I'm better off. What a mess. Thanks to Jeremy for saving me at least a day.
Up next: re-install RAID1 and make a go at the 2.6 kernel. But maybe not this week :-)
Update - I've been asked several times if I ever got this resolved, and the answer is "no". I gave the Asus motherboard away and have never looked back.
I'll also note that the Intel motherboard has no onboard video, so it's not entirely a drop-in replacement for the Asus. This caught me by surprise, and at least one other person too. Check your "junk drawer" before doing the Asus-to-Intel swap.
Posted by Steve at October 28, 2003 11:08 AM | TrackBack
You should take a good look at Kasia's weblog. You might have something to say :)
Anyway, you didn't mention how long it took to build SA on the ASUS?
And the Kenmore system... how'd you come up with that name? :)
Anyway, gnawing at the issue... I think your local well informed tech geek at the store might be right about the chipset: it's probably just too new right now, and the OS isn't able to chew it up yet. Or there might be a couple of bugs floating around.
I mean..what other things could be wrong with it? I thought about it for a second, and if you have the perfect HDDs, a good processor, and RAID going, the bottleneck is probably going to be within the chipset somewhere, or a bottleneck somewhere.
Glad to see that you're running smoothly now though.
Posted by: David on October 29, 2003 01:10 AM
Both motherboards use the same 865 chipset, and all indications were that it was using the same UltraDMA5 setting. No specific test on the disk drives showed anything that wasn't right - nothing.
I was about ready to find some single process (not a "build of a whole system") that ran slowly so I could strace it, finding where the time was going, but by this point Jeremy had saved the day and I was really ready to get past this.
A build of SpamAssassin-2.60 on the ASUS took about two minutes. On the Intel it takes 13 seconds. Same CPU, RAM, and disks. Ugh.
"Kenmore" is "an appliance for cleaning email". Sorry Sears :-)
Posted by: Steve Friedl on October 29, 2003 07:25 AM
I don't know if you've figured out the problem already, or if you even care, but check this out:
http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/2887.html
We were having a similar problem at work with our P4P800-VM boards.
Does anyone look here anymore? I'm about to find out but it looks like I've got the same problem as you. P4P800-VM, nice spiffy new hard drive, 2.4 Gig Pentium 4 (no Celeron), and my spiffy new system is slow as shit. Dog slow. 5-10 minutes to boot. Takes like 10 seconds to load something like "konsole" or "terminal". Awful.
So I see that your solution was to replace the motherboard with an "Intel D864PERL". One problem:
Googling this phrase or even just "D864PERL" brings back just this article and one more. The beast does not seem to exist. Are you sure you have the number right?
Oops, it's a D865PERL - a typo, now fixed.
Posted by: Steve Friedl on January 10, 2004 10:16 AM
Thanks for the suggestion! This really got me off the dime.
I got the D865PERL and an AGP video card. One thing you didn't mention - the ASUS has onboard video, the Intel doesn't - but I'm guessing that onboard video is part of the problem!
Now it's all good, although I did have a few hours of hair pulling. I installed the D865, bought an AGP video card, plugged it in, turned the machine on and...
nothing. No video! Aargh!! Tried a different monitor. No dice. Tried unplugging everything but the video card. No dice. Returned the video card and got another AGP card. No dice. Looked at the online documentation. Something about a jumper block for managing BIOS security. Aargh. The BIOS was locked in an unbootable state, and I had to switch the jumper to let me change it. Intel should mention that on their installation docs!
Once I got over that, I now see in front of me the spiffy fast system I thought I was buying. Thanks for the suggestion.
So now, only one problem remains. The sound chip on the D865PERL is not recognized by the RedHat 9.0 sound system. It fails on bootup, and running sndconfig, I get the message that it isn't supported. Did you have this problem, and if so, what was your solution?