
An Introduction to the SciClone Cluster Project

SciClone is a heterogeneous cluster computing system designed to support a variety of activities in Computational Science at the College of William and Mary. Among other things, the system is used for:

  • large-scale computations in disciplines such as physics, chemistry, mathematics, biology, and marine science;
  • research and development of algorithms, methodologies, and tools to support scientific computing on parallel and distributed systems;
  • experimental research on a wide range of topics in computer science and numerical computing;
  • instructional tasks ranging from classroom assignments to dissertation research.

The name "SciClone" is derived from "science" + "clone", indicating that the system is intended for use in the computational sciences, and that most components in the system are identical copies of some other component. The name is also a play on "cyclone", which is one of the most powerful and complex phenomena in nature, and one which has been studied extensively with high performance parallel computing systems.

Major support for SciClone was provided by Sun Microsystems, the National Science Foundation, Virginia's Commonwealth Technology Research Fund, and the College of William and Mary. Additional assistance was provided by Myricom, Inc. and a number of other individuals and organizations.

A key feature of SciClone is its heterogeneous architecture, which provides both flexibility for applications and a controlled environment for studying the complex issues which arise in larger distributed systems. Specifically, SciClone's heterogeneity arises from its use of two different processor architectures (UltraSPARC and Opteron), three different node configurations (single-, dual-, and quad-cpu nodes), four different networking technologies (Fast Ethernet, Gigabit Ethernet, Myrinet, and InfiniBand), and its organization as a "cluster of clusters".

SciClone is presently arranged as eight tightly-coupled subclusters which can be used individually or together. These are:

  • whirlwind - 64 single-cpu Sun Fire V120 servers @ 650 MHz w/ 1 GB memory and 36 GB local disk
  • tornado - 32 dual-cpu Sun Ultra 60 workstations @ 360 MHz w/ 512 MB memory and 18 GB local disk
  • gulfstream - 6 dual-cpu Sun Ultra 60 workstations @ 360 MHz w/ 512 MB memory and 36-64 GB local disk
  • hurricane - 4 quad-cpu Sun Enterprise 420R servers @ 450 MHz w/ 4 GB memory and 18 GB local disk
  • twister - 32 dual-cpu Sun Fire 280R servers @ 900 MHz w/ 2 GB memory and 73 GB local disk
  • vortex - 4 quad-cpu Sun Fire V440 servers @ 1.28 GHz w/ 8-16 GB memory and 292 GB local disk
  • tempest - 42 dual-cpu Sun Fire V20z servers @ 2.4 GHz w/ 4 GB memory and 73 GB local disk
  • typhoon - [pre-production] 72 dual-processor, dual-core Dell SC1435 servers @ 2.6 GHz w/ 8-24 GB memory and 80 GB local disk

Seven additional server nodes act as front-ends and fileservers for the entire system, and two "network server" nodes provide application gateways between SciClone's various internal and external networks. A management node provides non-intrusive performance monitoring and control services for computers, networks, and storage.

In aggregate, SciClone provides:

  • 271 nodes with a total of 623 processing cores
  • 1.1 TB of physical memory
  • 34.1 TB of disk capacity
  • 2.3 TFLOP/S peak floating point performance
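The peak floating-point figure can be roughly reconstructed from the subcluster inventory above. The following sketch assumes each core retires 2 double-precision FLOPs per cycle (typical of both the UltraSPARC III and early Opteron generations); the published figure differs slightly because actual per-core rates vary by processor model:

```python
# Rough peak-FLOPS estimate from the subcluster list above.
# Assumption: 2 double-precision FLOPs per core per cycle (illustrative).
FLOPS_PER_CYCLE = 2

# (nodes, cores per node, clock rate in GHz) for each subcluster
subclusters = {
    "whirlwind":  (64, 1, 0.650),
    "tornado":    (32, 2, 0.360),
    "gulfstream": ( 6, 2, 0.360),
    "hurricane":  ( 4, 4, 0.450),
    "twister":    (32, 2, 0.900),
    "vortex":     ( 4, 4, 1.280),
    "tempest":    (42, 2, 2.400),
    "typhoon":    (72, 4, 2.600),  # 2 processors x 2 cores per node
}

peak = sum(nodes * cores * ghz * 1e9 * FLOPS_PER_CYCLE
           for nodes, cores, ghz in subclusters.values())
print(f"estimated peak: {peak / 1e12:.2f} TFLOP/S")  # ~2.2 TFLOP/S
```

The estimate lands within a few percent of the 2.3 TFLOP/S aggregate quoted above, with typhoon's 288 Opteron cores contributing roughly two-thirds of the total.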

Networking within the cluster is provided by multiple Ethernet switches and routers at speeds varying from 100 Mb/s to 10 Gb/s, along with a 64-port Myrinet-1280 switch, a 48-port Myrinet-2000 switch, and a 120-port Cisco InfiniBand switch. SciClone has a 1 Gb/s connection to the campus backbone and to VIMS, and 10 Gb/s connectivity to the National LambdaRail.

A Fibre Channel Storage Area Network (SAN) connects six Sun StorEdge disk arrays and an Apple XServe RAID to the fileserver nodes, with Sun's QFS filesystem providing high-bandwidth shared access to the contents. Backups are provided by a 20 TB Sun StorEdge L100 tape library.

SciClone's UltraSPARC subclusters run under Sun's Solaris 9 and Solaris 10 operating systems, while the Opteron-based subclusters run Novell's SLES 10 (SuSE Linux) operating system. Both are augmented with a variety of other software components to facilitate their use as parallel computing platforms.
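Because binaries built for the UltraSPARC/Solaris side of the system will not run on the Opteron/Linux side (and vice versa), users typically detect the platform before building or launching jobs. A minimal, hypothetical sketch; the directory naming is illustrative, not SciClone's actual build-environment layout:

```shell
#!/bin/sh
# Select a platform-specific build target based on OS and CPU architecture.
# The PLATFORM names below are illustrative only.
OS=$(uname -s)      # SunOS on the UltraSPARC subclusters, Linux on Opteron
ARCH=$(uname -p)    # e.g. sparc or x86_64

case "$OS" in
    SunOS) PLATFORM="solaris-$ARCH" ;;
    Linux) PLATFORM="linux-$ARCH" ;;
    *)     PLATFORM="unknown-$ARCH" ;;
esac

echo "building for $PLATFORM"
```

A script like this lets a single source tree serve both halves of the cluster, keeping object files and executables for the two architectures in separate directories.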

See also: