I'm not 100% sure, but I think the dual-processor limitation is more a function of the chipset than of the CPU(s). For example, I've personally seen Intel supercomputers containing 256 Pentium Pro CPUs in one box (and there are systems containing 4096, and possibly more); I'm pretty sure the Pentium Pro was not designed with that kind of usage in mind.
If somebody designs a cheap chipset that can support at least 4 processors (be they Celeron, PII/III, Willamette, Merced, PowerPC, Athlon, or Sledgehammer) and scale to more (perhaps up to 16), such systems could become very popular, especially under BeOS. The reason I talked of Celeron 400s is that they are dirt-cheap for their performance; even when you put 8 of them together, the aggregate cost is about the same as that of a single PIII/600.
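To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes only the ratio stated above (eight Celeron 400s cost about as much as one PIII/600), so costs are normalized rather than real prices, and it uses summed clock speed as a very crude stand-in for performance. The function name is just made up for illustration, and real SMP scaling would be well short of linear.

```python
from fractions import Fraction

# Assumption: costs are normalized so that one PIII/600 costs 1 unit and
# eight Celeron 400s together also cost 1 unit (the ratio from the post).
PIII_600_COST = Fraction(1)
CELERON_400_COST = Fraction(1, 8)


def aggregate_mhz_per_cost(clock_mhz, unit_cost, count):
    """Summed clock speed of `count` CPUs divided by their total cost."""
    return (clock_mhz * count) / (unit_cost * count)


celeron_box = aggregate_mhz_per_cost(400, CELERON_400_COST, 8)
piii_box = aggregate_mhz_per_cost(600, PIII_600_COST, 1)

# How many times more aggregate MHz the Celeron box buys per unit of cost.
ratio = celeron_box / piii_box
print(f"Celeron box: {celeron_box} MHz per cost unit")
print(f"PIII box:    {piii_box} MHz per cost unit")
print(f"Ratio:       {float(ratio):.2f}x")
```

Summed MHz ignores bus contention, cache size, and how well the software threads, so treat the ratio as an upper bound on the advantage, not a benchmark.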
The main reason such systems are not common right now is that multi-CPU boxes are mostly designed for high-performance computing, which means they have very fast, wide buses and I/O controllers, support for lots of very fast memory, and so on. As far as I know, nobody has yet attempted to build an <u>affordable</u> multi-CPU system. AMD seems like the more likely company to build one, though, thanks to the point-to-point design of their chipset. It shouldn't be too hard to put several parallel EV6 buses on a single motherboard and connect them in a crossbar-like fashion.
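For a feel of what "crossbar-like" would cost in hardware, here is a small sketch. It assumes a full crossbar where every CPU bus can be switched to every memory port, and (purely as an assumption for illustration) one memory port per CPU; the function name is hypothetical and nothing here describes an actual AMD chipset.

```python
def crossbar_crosspoints(cpu_ports, memory_ports):
    """Switch points in a full crossbar: one per (CPU port, memory port) pair."""
    return cpu_ports * memory_ports


# Assumption: as many memory ports as CPUs, for the 4- to 16-way range above.
for cpus in (4, 8, 16):
    points = crossbar_crosspoints(cpus, cpus)
    print(f"{cpus:2d} CPUs -> {points:3d} crosspoints")
```

The count grows with the square of the CPU count under that assumption, which is one reason a cheap design might stop at a modest number of processors or use only a partial crossbar.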
------------------
I am; therefore I think.