[hackgrid] Fwd: [openMosix-general] Recommendations

Author: brain
Date:  
To: hackgrid
Subject: [hackgrid] Fwd: [openMosix-general] Recommendations


I'm forwarding this very long message that arrived on the openmosix list.
It strikes me as quite interesting because it underlines the importance
of research in the field of parallel computing... (it's one of the many
replies to the announcement of the end of the openMosix project).

----- Original Message -----
> From: "Ian Latter" <ian.latter@???>
> To: "Moshe Bar" <moshix@???>
> Subject: Re: [openMosix-general] Recommendations
> Date: Tue, 24 Jul 2007 18:29:26 +1000
>
>
> Perhaps you're showing your age, and perhaps I'm showing mine ... ;o)
>
>
> From your remarks, I'll break my comments out into Future, Now and
> Effort;
>
> > It is tempting to think that with massive multi-cores, the advantage
> > of an SSI cluster remains because you then have "even more CPUs to
> > play with". That is not so. Migrating from one multi-core node, where
> > multi-threaded applications (as all apps are currently being
> > re-written to be) thrive and perform well, to another such node
> > implies a DSM model capable of relocating threads, either in gangs or
> > singly, across nodes. In other words, multi-cores are changing the
> > application landscape towards a massively threaded one, and by
> > definition these apps will not perform well in a DSM cluster, due to
> > very high true and false sharing.
> >
> > Tests we performed at my company Qlusters, with very low latency
> > inter-connects (< 1 microsecond round-trip latency) and a very
> > efficient implementation of DSM, have shown regular performance
> > impacts of 100x to 1000x.
> >
> > Even the SGI Altix lets threads run in the same SMP quadrant.
> >
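> (As an aside, that sharing cost is easy to demonstrate even on a
> single SMP box. The toy C sketch below - mine, not openMosix or
> Qlusters code - has two threads bumping adjacent counters that
> share one cache line, so every increment bounces the line between
> cores; pad the counters 64 bytes apart and it typically runs several
> times faster. On a page-granular DSM the same access pattern becomes
> page ping-pong across the wire. Build with "cc -O2 -pthread".)
>
>     #include <pthread.h>
>     #include <stdio.h>
>
>     #define ITERS 100000000UL
>
>     /* Both counters live in the same cache line: classic false
>      * sharing. Inserting 64 bytes of padding between them gives
>      * each its own line and the contention disappears. */
>     static struct { volatile unsigned long a, b; } shared;
>
>     static void *bump(void *arg)
>     {
>         volatile unsigned long *c = arg;
>         for (unsigned long i = 0; i < ITERS; i++)
>             (*c)++;
>         return NULL;
>     }
>
>     int main(void)
>     {
>         pthread_t t1, t2;
>         pthread_create(&t1, NULL, bump, (void *)&shared.a);
>         pthread_create(&t2, NULL, bump, (void *)&shared.b);
>         pthread_join(t1, NULL);
>         pthread_join(t2, NULL);
>         printf("%lu %lu\n", shared.a, shared.b);
>         return 0;
>     }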
>
>
> Future;
>
> I wrote a C implementation of the Bootp/DHCP RFCs two weeks ago,
> for libMidnightCode. In those RFC documents, which date from ten to
> twenty years ago, the standards' authors explicitly permitted Bootp
> agents to create UDP/IP packets with the UDP checksum disabled (set
> to zero). The reason they did this was that they felt it would be
> infeasible to calculate a valid checksum given the size and speed of
> the firmware and embedded processors of the day.
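>
> (For the curious, the checksum they were avoiding is the 16-bit
> ones'-complement Internet checksum of RFC 1071; for UDP it also
> covers a pseudo-header of addresses, which I've left out here. A
> rough sketch of the folding sum itself - not the actual
> libMidnightCode code:)
>
>     #include <stdint.h>
>     #include <stddef.h>
>
>     /* RFC 1071: sum the data as 16-bit big-endian words, fold any
>      * carries back in, and return the ones' complement. For UDP a
>      * computed value of 0 goes on the wire as 0xFFFF, since an
>      * all-zero checksum field means "not computed". */
>     uint16_t inet_checksum(const void *buf, size_t len)
>     {
>         const uint8_t *p = buf;
>         uint32_t sum = 0;
>
>         while (len > 1) {
>             sum += ((uint32_t)p[0] << 8) | p[1];
>             p += 2;
>             len -= 2;
>         }
>         if (len)                          /* odd trailing byte */
>             sum += (uint32_t)p[0] << 8;
>         while (sum >> 16)                 /* end-around carry */
>             sum = (sum & 0xFFFF) + (sum >> 16);
>         return (uint16_t)~sum;
>     }
>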
> The same was true of the 25-year-old SMTP protocol - authored to
> allow parsing as characters were received, because memory was
> seen as too rare or too precious to house an entire SMTP message in
> a single buffer before it was parsed.
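>
> (That character-at-a-time style still writes naturally in C: hold
> only the current line - RFC 821 caps a command line at 512 octets
> including the CRLF - and act on each one as it completes. A toy
> sketch of the idea, not a real SMTP parser:)
>
>     #include <stdio.h>
>
>     #define SMTP_LINE_MAX 512   /* RFC 821 line limit, CRLF included */
>
>     /* Consume the stream one character at a time; only the current
>      * line is ever held in memory, never the whole message. */
>     static void parse_stream(FILE *in)
>     {
>         char line[SMTP_LINE_MAX];
>         size_t n = 0;
>         int c;
>
>         while ((c = fgetc(in)) != EOF) {
>             if (n < sizeof(line) - 1)
>                 line[n++] = (char)c;
>             if (c == '\n') {                  /* line is complete */
>                 line[n] = '\0';
>                 printf("command: %s", line);  /* dispatch here */
>                 n = 0;                        /* reuse the small buffer */
>             }
>         }
>     }
>
>     int main(void)
>     {
>         parse_stream(stdin);
>         return 0;
>     }
>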
> Of course, I can now buy a 4Gbyte, USB2.0, 300+kbps, MP3-playing
> wrist-watch for AU$95.
>
> To me, technology and time are an inevitable duet. What's impossible
> today is milspec tomorrow, enterprise next week, and child's play by
> month's end. This is the by-product, the waste, of the information age.
>
> At the desktop, there is sufficient industry demand to drive Terabyte
> disks and beyond. There is equally sufficient industry demand to drive
> multi-core processors, and beyond, even though the industry leader
> tried to drive the industry in other ways;
>
>     http://www.theregister.co.uk/2006/06/08/intel_gelsinger_stanford/
>     June, 2006

>
>     [...]

>
>     "A couple of years ago, I had a discussion with Bill Gates (about
>     the multi-core products)," Gelsinger said. "He was just in disbelief.
>     He said, 'We can't write software to keep up with that.'"

>
>     Gates called for Intel to just keep making its chips faster. "No,
>     Bill, it's not going to work that way," Gelsinger informed him.

>
>     [...]

>
>
>
> Though despite Bill's comments to Intel, Microsoft are well aware of
> this cascading technology phenomenon too;
>
>     http://www.hpcwire.com/hpc/1347210.html
>     April, 2007

>
>     [...]

>
>     David Callahan (Microsoft Research) pointed out that exponentials
>     grow really fast. If we plan on doubling the number of cores on
>     a chip every year or 18 months, it won't be long before we have
>     hundreds or thousands of cores on a chip. Any software solution
>     aimed at 16 or 32 cores will quickly become irrelevant. We'd better
>     be looking at productive ways to use massively parallel systems,
>     since these may well find their way into our workstations, and
>     yes, even our laptops, before the end of the next decade.

>
>     [...]

>
>
>
> The PPoPP (Principles and Practice of Parallel Programming) group
> have been working on this problem;
>
>     http://www.hpcwire.com/hpc/1347210.html
>     April, 2007

>
>     [...]

>
>     One of the PPoPP attendees, Prof. Rudolf Eigenmann (Purdue
>     Univ.) issued an indictment, saying that we in the parallel
>     programming research community should be ashamed of
>     ourselves. Single-processor systems have run out of steam,
>     something the parallel programming community has been
>     predicting since I was a college student. Now is the time to step
>     up and reap the benefits of all our past work. We've had 30
>     years to study this problem and come up with a solution, but
>     what's the end result? Surprise! We still have no well-accepted
>     method to generate parallel applications.

>
>     [...]

>
>
> And despite no solid answer in 30 years, I know this problem will
> be solved - it is inevitable (I'd put 10 bucks on it being built into a
> wrist watch one day - except I doubt either of us will be here to
> cash in on that bet).
>
>
> But that wasn't the only limitation you proposed.
>
> I do realise that your latency comments are directed at those you
> see flogging a dead architecture - but this is an evolving
> architecture, as those of us with incompatible CPUs, RAM,
> expansion cards, disks, power supplies, cases, screens, keyboards
> (and even bloody mice) can testify.
>
> It is inevitable that the bus speed through to the network speed will
> increase ... and that your latency gap will decrease over time.
>
>
>
> Now;
>
> However, I understand that this project is not academic (the future
> is too far away for those with power problems now). And here I
> see openMosix in two lights;
>
>     - it is an alternative High Performacne Cluster solution, for those
>       shopping around to make their out-of-the-box software run
>       faster.

>
>     - it is an alternative High Performance Cluster solution, for those
>       choosing to write a new application, and to make it run faster
>       by using multiple devices.

>
>
> From the tens of thousands (approaching hundreds of thousands)
> of downloads of CHAOS that I've seen over the past couple of
> years, I'd suggest that openMosix fills that non-academic role well,
> in both of those ways.
>
>
>
> Effort;
>
> So, to my mind, there's no need for openMosix to justify its
> existence.
>
> Though there is still cause to see openMosix terminated while there
> is insufficient effort available to drive the project. I am in no way
> bagging out Florian. The problem is that there is only one Florian.
>
> I'm not naive enough to believe that your public comments and press
> release (all suggesting the future shutdown of openMosix) are
> anything less than a witty recruiting call; I very much hope you
> succeed.
>
>
> However, as neither of us is offering to write the code, and as
> you still seem intent on leaving the project, I'd suggest the
> programmers be given the real vote ... as it is their baby ...
>
>
> To whoever does drive this ship: I'm still here, awaiting a current
> kernel version to deploy as an improved CHAOS version.
>
> My reference platform runs much better than the previous CHAOS
> releases did, but does not yet support all of the old features.
>
> I will continue to chip away at libMidnightCode - to incorporate better
> functionality (with higher quality code) into the platform. For example,
> there is a greatly improved multi-threaded peer-based
> communications protocol implemented there, waiting to replace Tyd
> when the time is due.