From Turing’s conception of a computing machine in 1937 to the development of the first electronic computers, the Atanasoff-Berry Computer and ENIAC, in the forties, electronic computers have been an essential tool of science. (Gray, 1999) The computer’s role in science advanced further in the 1960s with the seminal work of Seymour Cray on parallel computation and his development of machines capable of parallel processing. (Bell & Gray, 2002) Following Cray’s work, the concept and realization of supercomputers further expanded the role computers play in scientific development.
The supercomputer provides computational resources much greater than those available in a typical personal or business computer. By leveraging the large scale of the supercomputer, scientists are able to explore very complex questions. Through the last decade of the twentieth century and the first decade of the twenty-first century, high-performance computer development has continued to make new advances in science possible.
In 2009, the scientific role of high-performance computers continues to advance; in many fields of study, simulation augments or replaces experimentation. (Peck, 2004) While the traditional supercomputer architectures pioneered by Cray continue to be developed, expanded, and made more powerful, new architectures are being built that depart from silicon-based electronic machines. As Munakata explains, the new computing paradigms “include computing schemes based on nanowires, carbon nanotubes, organic molecules, bio-DNA, and quantum physics [and] special forms of computing including optical, micro/nanofluidic, and amoeba-based chaotic.” (Munakata, 2007)
Traditional Supercomputer Architectures
The Princeton WordNet lexicography database defines the word “supercomputer” as “a mainframe computer that is one of the most powerful available at a given time.” (“WordNet 3.0,” 2006) By this definition, the first electronic computers could be classified as supercomputers. Devices like the electronic numerical integrator and computer (ENIAC) and the Atanasoff-Berry Computer (ABC) were, for their time, singular in their capabilities and performance. However, general credit for the first supercomputer is given to Seymour Cray. (Dongarra, Meuer, Simon, & Strohmeier, 2000)
Origins of silicon-based, electronic supercomputers and the MFLOPS machines
In the 1970s, Seymour Cray’s company, Cray Research, introduced many concepts that dramatically influenced computer and supercomputer development. The idea of dedicating separate processors to system housekeeping comes from Cray’s earlier work at Control Data Corporation (CDC). (Narayan, 2009) Prior systems used the central processing unit for tasks like input/output and memory management. Modern systems continue to use these innovations. Seymour Cray and Jim Thornton worked together to develop both the CDC 6600 and CDC 7600 machines. These were the first computers to achieve computational performance of one million floating-point operations per second (MFLOPS) [unlike memory sizes, which are often counted in powers of two, the FLOPS prefixes are decimal: one megaflops is 1000 x 1000 floating-point operations per second, not 1024 x 1024]. The rate of floating-point operations per second is a common measure of computer performance, as the floating-point operation is crucial for scientific computing.
In the early seventies, Cray left CDC to found Cray Research, while Thornton continued to work at CDC on the development of the CDC STAR-100. The CDC STAR-100 ran at 100 MFLOPS, was introduced in 1974, and was the first vector processor supercomputer. (Narayan, 2009) In 1976, the CDC STAR-100 was displaced by Cray’s Cray-1, which ran at 160 MFLOPS and had several design innovations that improved performance and stability, such as the use of vector registers. (Russell, 1978)
The CDC and Cray machines utilized emitter-coupled logic (ECL) chips. (Russell, 1978) The next innovation to drive the supercomputer industry was the development of massively parallel, multi-processing, shared memory systems, also known as MPPs. MPPs utilized custom-designed processors based on complementary metal oxide semiconductor (CMOS) chips. CMOS was much cheaper to manufacture and enabled larger systems for a given cost. The first CMOS MPP to perform as a world-leading supercomputer was the Cray X-MP. (Bell & Gray, 2002) With a lower manufacturing cost, competition soon emerged in the form of the Connection Machine line from Thinking Machines. (Dongarra, et al., 2000)
As shown in this figure from Dongarra et al.’s 2000 article, Cray-designed machines and MPPs like the Thinking Machines CMs dominated supercomputer performance through the end of the 20th century.
Figure 1 Chart of Supercomputer Performance Through 2000 (Dongarra, et al., 2000)
The advent of CMOS technology, with its lower production costs, and the emergence of microcomputer companies like Sun, SGI, HP, and IBM with product lines addressing general computing presented a predictable threat to the traditional supercomputer market. (Bell & Gray, 2002) The workstation market utilized scalar processors (SP), and as early as the eighties it became apparent that “CMOS-based killer micros would eventually challenge the performance of the vector supers with much better price performance and an ability to scale to thousands of processors and memory banks.” (Bell & Gray, 2002) By the late eighties and early nineties, an MPP machine cost up to $30,000 per microprocessor, with systems housing thousands of processors. (Bell & Gray, 2002)
Beowulfs, supercomputer economics and TFLOPS & PFLOPS machines
In the early nineties, partially in response to the costs of supercomputers, the National Aeronautics and Space Administration (NASA) issued requirements for a 1 GFLOPS (1000 MFLOPS) machine to be built for less than $50,000. According to Bell & Gray, “In 1994, a 16-node $40,000 cluster built from Intel 486 computers achieved that goal.” (Bell & Gray, 2002) The winning cluster utilized a design known as the Beowulf cluster. The Beowulf cluster design “builds on decades of parallel processing research and on many attempts to apply loosely coupled computers to a variety of applications.” (Bell & Gray, 2002)
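The economics described above can be made concrete with a little arithmetic. The following is a minimal sketch that uses only the figures quoted from Bell & Gray (2002); the function and constant names are illustrative, and the comparison is per processing element only, since a 486 node and an MPP microprocessor did not deliver the same performance.

```python
# Figures quoted in the text (Bell & Gray, 2002).
BEOWULF_COST_USD = 40_000            # total cost of the 1994 cluster
BEOWULF_NODES = 16                   # Intel 486 nodes in that cluster
BEOWULF_GFLOPS = 1.0                 # NASA's performance target
MPP_COST_PER_PROCESSOR_USD = 30_000  # upper bound quoted for MPP machines


def cost_per_node(total_cost_usd: float, nodes: int) -> float:
    """Return the average hardware cost of one node."""
    return total_cost_usd / nodes


def cost_per_gflops(total_cost_usd: float, gflops: float) -> float:
    """Return the cost of one GFLOPS of delivered performance."""
    return total_cost_usd / gflops


beowulf_node_cost = cost_per_node(BEOWULF_COST_USD, BEOWULF_NODES)
beowulf_gflops_cost = cost_per_gflops(BEOWULF_COST_USD, BEOWULF_GFLOPS)

# How many Beowulf nodes could be bought for one MPP microprocessor?
nodes_per_mpp_processor = MPP_COST_PER_PROCESSOR_USD / beowulf_node_cost

print(f"Beowulf cost per node:   ${beowulf_node_cost:,.0f}")
print(f"Beowulf cost per GFLOPS: ${beowulf_gflops_cost:,.0f}")
print(f"Nodes per MPP processor: {nodes_per_mpp_processor:.0f}")
```

On these numbers, one MPP microprocessor at the quoted upper bound cost as much as twelve commodity nodes, which is the price gap the Beowulf design exploited.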
The advent of the Beowulf architecture was a turning point for supercomputing. The economics of vector and MPP supercomputers became untenable in the face of cheap commodity clusters. Bell and Gray’s article analyzes data from the Top500 list, which has rated supercomputer performance since 1993, and includes a table that clearly demonstrates the current dominance of Beowulf clusters.
Figure 2 Growth of Beowulf (scalar) clusters (Bell & Gray, 2002)
Over less than a decade, scalar-processor Beowulf clusters have displaced the vector processors and MPP systems. The scalar Beowulf cluster ASCI Red was the first supercomputer to achieve 1 TFLOPS (1000 GFLOPS) performance. (Meuer, 2008) In June 2008, the Top500 list announced that the first 1 PFLOPS (1000 TFLOPS) machine had been measured and that it follows the Beowulf architecture. (“Top 500 List – June 2008,” 2008) In spite of the current dominance of the Beowulf architecture, there are emerging platforms that represent radical departures from past computational models and that may provide the foundation for future supercomputers.
Beowulf clusters will continue to dominate the supercomputer market for the foreseeable future. These machines will continue to provide valuable resources for the solution of important problems of science and engineering. However, there are emerging technologies that may eventually disrupt the role of Beowulfs. These new technologies are more disruptive than the previous transitions from ECL to CMOS chips or from vector to scalar processors. Imagine molecules of DNA performing computation, or circuit pathways that conduct light instead of electricity.
Graphics Processing Units
Originally developed to provide fast three-dimensional graphics processing for the consumer electronic gaming industry, graphics processing units (GPUs) are a CMOS commodity version of single instruction, multiple data (SIMD) processors. (Fan, Qiu, Kaufman, & Yoakum-Stover, 2004) As SP-based clusters have displaced vector and MPP architectures, certain problem sets have been left without a highly efficient platform for computation, particularly image processing. The use of the GPU to service computation instead of rendering video game images is an innovative use of a specialized processor. The GPU is to SIMD processors what Beowulf was to vector and MPP systems. Existing GPUs are not equivalent to SIMD, as indicated in Suda et al.’s 2009 article. (Suda, et al., 2009) However, the low-cost CMOS solution has the potential to be developed into a replacement, much like Beowulfs replaced vector processing machines.
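The SIMD idea mentioned above, one instruction applied to many data elements at once, can be illustrated with a toy image-processing example. This is a minimal sketch in plain Python, which does not itself execute SIMD instructions; it only models how many instruction issues a scalar processor and a hypothetical four-lane SIMD unit would need. The function names and the lane width are assumptions made for illustration.

```python
from typing import List, Tuple


def scalar_brighten(pixels: List[int], delta: int) -> Tuple[List[int], int]:
    """Brighten one pixel per instruction; return the result and issue count."""
    out = []
    instructions = 0
    for p in pixels:
        out.append(min(255, p + delta))  # clamp to the 8-bit maximum
        instructions += 1
    return out, instructions


def simd_brighten(pixels: List[int], delta: int, lanes: int = 4) -> Tuple[List[int], int]:
    """Brighten `lanes` pixels per simulated instruction."""
    out = []
    instructions = 0
    for start in range(0, len(pixels), lanes):
        chunk = pixels[start:start + lanes]
        # A real SIMD unit would perform this whole chunk in one step.
        out.extend(min(255, p + delta) for p in chunk)
        instructions += 1
    return out, instructions


pixels = [10, 200, 250, 30, 90, 255, 0, 128]
scalar_out, scalar_count = scalar_brighten(pixels, 20)
simd_out, simd_count = simd_brighten(pixels, 20, lanes=4)

print(scalar_out == simd_out)    # same result either way
print(scalar_count, simd_count)  # 8 scalar issues versus 2 SIMD issues
```

The results are identical; only the number of instruction issues differs, which is why data-parallel workloads such as image processing map well onto this style of processor.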
Quantum computing
Utilizing quantum mechanical phenomena like superposition, quantum computers promise to provide a dramatic speed-up for the right types of calculation. Where a classical computer must iterate over every possible solution, quantum computers could operate on all states simultaneously. (Hunter, 2009) While difficult to understand, the potential is enormous. The current state of the art, at 28 qubits, is not yet capable of handling large amounts of data (qubits are the quantum computing equivalent of bits). (Hunter, 2009)
In addition, today’s quantum computers are large scientific apparatuses that do not resemble the more familiar electronic computers. In their 2006 article, Meter and Oskin evaluate eight different designs of quantum computers for their capabilities, including scalability. (Meter & Oskin, 2006)
While researchers in the field of quantum computing work to build larger and more scalable devices, some doubts remain as to the feasibility of building quantum computers capable of addressing worthwhile problems. In his 2004 article, Aaronson attempts to address the concerns of scale by presenting a framework for evaluating the potential success of a quantum computer. (Aaronson, 2004)
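To give a feel for what operating on all states simultaneously means, the sketch below simulates a small quantum register on a classical machine. It is an illustration under stated assumptions, not a model of any device discussed above: the function names are hypothetical, amplitudes are kept as real numbers for simplicity, and the classical simulation needs memory that doubles with every added qubit, so it shows the size of the state space rather than any speed-up.

```python
import math
from typing import List


def apply_hadamard(state: List[float], qubit: int) -> List[float]:
    """Apply a Hadamard gate to one qubit of a state vector."""
    new_state = state[:]
    bit = 1 << qubit
    h = 1.0 / math.sqrt(2.0)
    for i in range(len(state)):
        if (i & bit) == 0:
            # Pair each basis state with its partner that has `qubit` set.
            a, b = state[i], state[i | bit]
            new_state[i] = h * (a + b)
            new_state[i | bit] = h * (a - b)
    return new_state


def uniform_superposition(n_qubits: int) -> List[float]:
    """Start in |00...0> and put every qubit into superposition."""
    state = [0.0] * (2 ** n_qubits)
    state[0] = 1.0
    for q in range(n_qubits):
        state = apply_hadamard(state, q)
    return state


def amplitude_count(n_qubits: int) -> int:
    """Number of amplitudes a classical simulator must store."""
    return 2 ** n_qubits


state = uniform_superposition(3)
total_probability = sum(a * a for a in state)

print(len(state))           # 8 basis states for 3 qubits
print(amplitude_count(28))  # amplitudes needed for a 28-qubit register
```

After the gates are applied, all eight basis states carry equal amplitude, so a single subsequent operation on the register acts on every one of them. For the 28 qubits cited above, the same state vector would hold 2^28 (about 268 million) amplitudes.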
Biological and molecular computing
The promise of biological and molecular computing is density. When the computational or data storage unit is of molecular size, there is an order of magnitude improvement to the density of information that can be stored or utilized. By contrast, modern electronic microprocessors operate on scales of tens or hundreds of atoms per bit. Molecular scale computing also offers advantages for miniaturization.
The initial demonstration of molecular and biological computing was Adleman’s 1994 experiment utilizing DNA to find a Hamiltonian path in a graph. (Reif & LaBean, 2007) Leveraging the path shown by Adleman’s experiment, along with advances in self-assembling nanotechnology and in the understanding of complex biological molecules like DNA and proteins, biologists and computer scientists are able to create finite state computers from organic molecules. (Reif & LaBean, 2007) Reif and LaBean highlight the fact that the initial computers, prior to the familiar electronic ones, were mechanical finite state machines that fundamentally operated much like DNA-based molecular computers. (Reif & LaBean, 2007)
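For readers unfamiliar with the Hamiltonian path problem, the sketch below states it in code: find a route through a directed graph that visits every vertex exactly once. This is a brute-force classical search over a small hypothetical graph, written only to illustrate the problem; it is not a description of the DNA procedure, and the graph, vertex labels, and function name are assumptions.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def hamiltonian_paths(vertices: Sequence[int], edges: Sequence[Edge],
                      start: int, end: int) -> List[Tuple[int, ...]]:
    """Return every path from start to end that visits each vertex once."""
    edge_set = set(edges)
    inner = [v for v in vertices if v not in (start, end)]
    paths = []
    # Try every ordering of the intermediate vertices.
    for order in permutations(inner):
        path = (start, *order, end)
        # Keep the ordering only if each consecutive step is a real edge.
        if all((a, b) in edge_set for a, b in zip(path, path[1:])):
            paths.append(path)
    return paths


# Hypothetical five-vertex directed graph.
vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (0, 2), (1, 2), (2, 1), (1, 3), (2, 3), (3, 4)]

paths = hamiltonian_paths(vertices, edges, start=0, end=4)
print(paths)
```

The number of orderings to check grows factorially with the number of vertices, which is why a medium that can explore very many candidate paths at once is attractive for this class of problem.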
Aono et al. report on the use of slime mold amoebas to perform computations. (Aono, Hara, & Aihara, 2007) By utilizing principles of chaotic dynamics and neural network programming, the authors are able to develop a computing system that utilizes the amoebas’ physiology to perform calculations. (Aono, et al., 2007) One major benefit of this type of biological computer is that the amoeba system never experiences a deadlock condition, which can occur in silicon-based electronic processors. (Aono, et al., 2007) The performance is substantially slower than that of electronic systems due to the biology of amoebas. (Aono, et al., 2007) However, one can envision some utility in having a computer that does not crash.
Optical computing
Optical computers utilize transmission and reception of light in place of electronics. The principal advantage is increased processing speed due to faster transmission of data, since light travels at the maximum speed possible while electrical signals in modern computers propagate at about 70% of the speed of light. Fully optical computers are expected to perform 10^5 times faster than all-electronic computers. (Munakata, 2007) However, Munakata highlights that there are substantial technical challenges to the realization of optical computers. (Munakata, 2007)
Conclusion
The field of supercomputing was dominated by the ideas of Seymour Cray from its inception until the end of the 20th century. At the turn of the century, a combination of CMOS and scalar processing technologies dethroned the vector and MPP architectures from their positions as the top supercomputers. Cost was a major factor in these developments. Today, Beowulf clusters are the primary architecture for supercomputing.
However, there are emerging technologies that may dramatically change the traditionally electronic computer field. From GPUs to quantum computing, from DNA-based computers to light-based computers, the future architectures are significantly different from silicon-based electronics.
The potential for further research within the area of emerging supercomputing architectures is very rich. All of the emerging architectures are nascent and have fundamental questions of implementation, usability, and scalability that must be researched and resolved.
References
Aaronson, S. (2004). Multilinear formulas and skepticism of quantum computing. In Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing.
Aono, M., Hara, M., & Aihara, K. (2007). Amoeba-based neurocomputing with chaotic dynamics. Communications of the ACM, 50(9), 69-72.
Bell, G., & Gray, J. (2002). What’s Next in High-Performance Computing? Communications of the ACM, 45(2), 91-95.
Dongarra, J., Meuer, H., Simon, H., & Strohmeier, E. (2000). Biannual Top-500 Computer Lists Track Changing Environments For Scientific Computing. SIAM News, 34.
Fan, Z., Qiu, F., Kaufman, A., & Yoakum-Stover, S. (2004). GPU Cluster for High Performance Computing. In Proceedings of the 2004 ACM/IEEE Conference on Supercomputing.
Gray, P. (1999). Alan Turing. Time, 153(12), 147.
Hunter, P. (2009). Quantum leaps [quantum computing]. Engineering & Technology, 4(1), 64-67.
Meter, R. V., & Oskin, M. (2006). Architectural implications of quantum computing technologies. ACM Journal on Emerging Technologies in Computing Systems, 2(1), 31-63.
Meuer, H. (2008). The TOP500 Project: Looking Back over 15 Years of Supercomputing Experience. Retrieved from http://www.top500.org/files/TOP500_Looking_back_HWM.pdf
Munakata, T. (2007). Introduction. Communications of the ACM, 50(9), 30-34.
Narayan, S. (2009). Supercomputers: past, present and the future. Crossroads, 15(4), 7-10.
Peck, S. L. (2004). Simulation as experiment: a philosophical reassessment for biological modeling. Trends in Ecology & Evolution, 19(10), 530-534. doi:10.1016/j.tree.2004.07.019
Reif, J. H., & LaBean, T. H. (2007). Autonomous programmable biomolecular devices using self-assembled DNA nanostructures. Communications of the ACM, 50(9), 46-53.
Russell, R. M. (1978). The CRAY-1 computer system. Communications of the ACM, 21(1), 63-72.
Suda, R., Aoki, T., Hirasawa, S., Nukada, A., Honda, H., & Matsuoka, S. (2009). Aspects of GPU for general purpose high performance computing. In Proceedings of the 2009 Asia and South Pacific Design Automation Conference.
Top 500 List – June 2008. (2008). Retrieved September 20, 2009, from http://top500.org/list/2008/06/100
WordNet 3.0. (2006). Retrieved from http://wordnet.princeton.edu/perl/webwn?s=supercomputer