N. Adiga et al., An Overview of the BlueGene/L Supercomputer, in the
Proceedings of Supercomputing (SC2002) Technical Papers, 2002.
|
Summary:
This paper gives an overview of the approach used by BlueGene/L
to achieve teraFLOPS-scale computing. The approach differs from
the traditional approach of clustering a large number of
nodes in two ways:
- First,
the BlueGene/L system is built out of a very large number of nodes,
each of which has a relatively modest clock rate, rather
than by clustering large, very fast SMPs, an approach that is limited
by power consumption and footprint constraints;
- Second,
the design point of BG/L utilizes system-on-a-chip techniques
that allow all system functions, including the
compute processor, communications processor, 3 cache levels, and
multiple high-speed interconnection networks with sophisticated
routing, to be integrated onto a single ASIC. This allows for latencies
and bandwidths that are significantly better than those of nodes
typically used in ASCI-scale supercomputers. As a result, memory is
close to the processor, and power consumption is reduced
(helped by the modest processor clock rate).
In addition, the integration of the inter-node
communications network functions onto the same ASIC as the
processors reduces cost, since the need for a separate, high-speed
switch is eliminated.
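
The power argument behind the modest clock rate can be illustrated with a rough back-of-the-envelope sketch. It uses the standard CMOS dynamic-power approximation P = C * V^2 * f and assumes, for illustration only, that supply voltage scales linearly with clock frequency. The function names, constants, and clock rates below are hypothetical placeholders, not figures from the paper or BlueGene/L specifications.

```python
# Illustrative only: every number here is a made-up placeholder,
# not a BlueGene/L specification.

def dynamic_power(capacitance, voltage, freq_hz):
    """CMOS dynamic-power approximation: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * freq_hz


def perf_per_watt(freq_hz, volts_per_hz=1e-9, capacitance=1e-9,
                  flops_per_cycle=2):
    """FLOPS per watt of one node, ASSUMING voltage scales linearly with f."""
    voltage = volts_per_hz * freq_hz          # assumed linear V(f) model
    power = dynamic_power(capacitance, voltage, freq_hz)
    return flops_per_cycle * freq_hz / power


slow = perf_per_watt(0.5e9)   # hypothetical modest-clock node
fast = perf_per_watt(2.0e9)   # hypothetical high-clock node
ratio = slow / fast           # efficiency advantage of the slower node

print(f"modest-clock node is about {ratio:.1f}x more efficient per watt")
```

Under these assumptions, efficiency falls with the square of the clock rate, so a node clocked four times slower delivers roughly sixteen times more FLOPS per watt. Real chips deviate from this model (leakage power, voltage floors, memory effects), so the sketch only shows the direction of the trade-off, not its magnitude.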
Discussion notes:
What is the downside of putting the memory and the
processor so close together? What is the advantage of
choosing a modest-clock-rate CPU for a supercomputer design? What
is the main reason for dividing cluster nodes into compute nodes
and I/O nodes as a design choice?
|
Predrag Tosic, A perspective on the
future of massively parallel computing: fine-grain vs. coarse-grain
parallel models comparison & contrast, in the Proceedings of the
First Conference on Computing Frontiers, pages 488-502, April
2004.
Last updated on
01/25/2005