bluefire

By Staff
04/29/2008

Bluefire Quick Links

System Information Overview
Access and User Environment
Compiling and Optimization
Running Jobs
File Storage and Data Transfer
Debugging and Performance Tools

For the Media

Contact:
Marijke Unger, External Relations
NCAR Computational and Information Systems Laboratory
303-497-1285

*Bluefire users with technical inquiries, please use the support links above.

News Release | Quick Facts | Tech Facts

  • Overview
  • Hardware
  • Software
  • Usage & Scheduling
  • Queues, charges
  • Examples
  • Documentation

    Overview

    On April 24, 2008, NCAR took delivery of the first IBM Power 575 supercomputer to be shipped anywhere in the world. The system, named bluefire, consists of 11 cabinets weighing 3,200 pounds each. Bluefire is over three times more powerful and three times more energy efficient than the supercomputers it replaces. It is also over a million times more powerful than the first recognized supercomputer, the Cray 1-A, which NCAR used from 1977-1986.

    NCAR will be conducting validation testing during May and June 2008 to ensure the system is fully functional before going into service for the atmospheric sciences community. During the initial production period, NCAR will be providing large computing grants to a select number of special projects, chosen through a priority review process, as part of the Accelerated Scientific Discovery program.

    Bluefire is scheduled to go into full production in late 2008, and will be used to improve climate and weather simulations, study solar processes, gain a deeper understanding of turbulence, and refine oceanic and atmospheric circulation models.

    Hardware

    Processors: With 4,096 POWER6™ processors running at 4.7 GHz, the bluefire system as a whole has a theoretical peak computation rate of 77 trillion floating-point operations per second (TFLOPS). At installation it is estimated to be the 25th most powerful supercomputer in the world. Each processor can deliver four floating-point operations per clock cycle and supports Simultaneous MultiThreading (SMT). The 118 nodes dedicated to batch processing provide a peak computation rate of 71 TFLOPS.
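
    The quoted peak rates follow directly from the processor count, clock speed, and the four floating-point operations per cycle. The short sketch below simply reproduces that arithmetic, using only the figures given on this page:

        /* Back-of-the-envelope peak-FLOPS arithmetic for bluefire, using only
         * the figures quoted above: 4,096 POWER6 processors at 4.7 GHz, four
         * floating-point operations per cycle, and 118 of 128 nodes devoted
         * to batch work. */
        #include <stdio.h>

        int main(void)
        {
            const double clock_hz        = 4.7e9;  /* POWER6 clock rate  */
            const double flops_per_cycle = 4.0;    /* per-processor peak */
            const int    procs_per_node  = 32;
            const int    total_nodes     = 128;
            const int    batch_nodes     = 118;

            double system_peak = total_nodes * procs_per_node * clock_hz * flops_per_cycle;
            double batch_peak  = batch_nodes * procs_per_node * clock_hz * flops_per_cycle;

            printf("Whole-system peak: %.1f TFLOPS\n", system_peak / 1e12);  /* ~77.0 */
            printf("Batch-node peak:   %.1f TFLOPS\n", batch_peak  / 1e12);  /* ~71.0 */
            return 0;
        }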

    Memory: The on-chip L1 cache is 128 KB (64 KB data + 64 KB instruction) per processor. L2 cache is 4 MB per processor on-chip. The off-chip L3 cache is 32 MB per chip, and is shared by the two processors on the chip. L3 cache memory is connected to the chip via an 80-GB-per-second bus. Two main-memory configurations are deployed among the batch-processing nodes: 48 nodes contain 4 GB of shared memory per processor, and 69 nodes contain 2 GB of shared memory per processor.

    Nodes: Each of the 128 nodes contains 32 processors. Two processors (cores) are built on one silicon die (chip), with 16 chips packaged at the heart of each node.

    • 118 nodes are dedicated to batch workload.
    • 10 nodes are used to support interactive sessions, debugging and share queues, GPFS filesystem I/O, MSS connectivity, and other system services.

    High-speed interconnect (switch): Nodes are interconnected with an InfiniBand switch for parallel processing using MPI. Each node has eight 4X InfiniBand DDR links, each capable of 2 GB/sec with 1.3-microsecond MPI latency.
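
    The per-link bandwidth and MPI latency quoted above are the kinds of figures a simple two-task ping-pong test measures. The sketch below is illustrative only (it is not an NCAR benchmark); run it with two MPI tasks placed on different nodes, and use a much smaller message to estimate latency rather than bandwidth:

        #include <mpi.h>
        #include <stdio.h>
        #include <stdlib.h>

        int main(int argc, char **argv)
        {
            const int reps   = 1000;
            const int nbytes = 1 << 20;   /* 1 MB payload for a bandwidth estimate */
            char *buf = malloc(nbytes);
            int rank;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            MPI_Barrier(MPI_COMM_WORLD);
            double t0 = MPI_Wtime();
            for (int i = 0; i < reps; i++) {
                if (rank == 0) {          /* rank 0 sends, then waits for the echo */
                    MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                    MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                } else if (rank == 1) {   /* rank 1 echoes every message back */
                    MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                    MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
                }
            }
            double elapsed = MPI_Wtime() - t0;

            if (rank == 0) {
                double round_trip = elapsed / reps;                     /* seconds      */
                double bandwidth  = nbytes / (round_trip / 2.0) / 1e9;  /* one-way GB/s */
                printf("avg round trip %.2f us, est. bandwidth %.2f GB/s\n",
                       round_trip * 1e6, bandwidth);
            }

            free(buf);
            MPI_Finalize();
            return 0;
        }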

    Disk space: The 150-TB disk subsystem currently attached to blueice will be migrated to bluefire, so all user files on blueice will be available on bluefire. The /home and /ptmp filesystems will be expanded:

    • /home will grow to 17.6 TB (from 4.4 TB on blueice), and default quotas will be increased to 20 GB (from 5 GB on blueice).
    • /ptmp will grow to 109 TB (from 65.3 TB on blueice), and default quotas will be increased to 400 GB (from 250 GB on blueice).
      Scrubbing of old data in /ptmp begins when /ptmp usage reaches 85% and continues until usage drops to 70%. This /ptmp scrubbing policy is identical to that of blueice.
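
    The scrubbing thresholds amount to a simple hysteresis rule, sketched below for illustration only (the function name is invented for the sketch; this is not CISL's actual scrubber):

        #include <stdbool.h>
        #include <stdio.h>

        #define SCRUB_START 0.85   /* scrubbing begins at 85% /ptmp usage     */
        #define SCRUB_STOP  0.70   /* and continues until usage drops to 70%  */

        /* Hypothetical helper: decide whether the scrubber should be running. */
        bool should_scrub(double usage_fraction, bool currently_scrubbing)
        {
            if (currently_scrubbing)
                return usage_fraction > SCRUB_STOP;   /* keep scrubbing until 70% */
            return usage_fraction >= SCRUB_START;     /* otherwise start at 85%   */
        }

        int main(void)
        {
            bool scrubbing = should_scrub(0.90, false);   /* 90% usage: start    */
            scrubbing = should_scrub(0.75, scrubbing);    /* 75%: keep scrubbing */
            scrubbing = should_scrub(0.70, scrubbing);    /* 70%: stop           */
            printf("scrubbing at 70%%: %s\n", scrubbing ? "yes" : "no");
            return 0;
        }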

    Connectivity to the Mass Storage System: There is Gigabit Ethernet connectivity to the MSS Storage Manager, which writes files to the MSS disk cache over Fibre Channel and to MSS cartridges via Fibre Channel tape drives.

    Login connectivity: A Gigabit Ethernet network provides login connectivity. Two nodes are reserved for login and command line interface work only.

    Security considerations: Bluefire resides within the CISL security perimeter and can only be accessed via a CRYPTOCard. User access will be simplified so users can ssh directly to bluefire.

    Software

    Operating System: AIX (IBM-proprietary UNIX).

    Batch system: Load Sharing Facility (LSF).

    Compilers: Fortran (95/90/77), C, and C++. (Note: The compilers produce 64-bit objects by default. To produce 32-bit objects, set the environment variable OBJECT_MODE to 32.)
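
    A quick run-time way to confirm which object mode a build used is to check the pointer size, as in the sketch below. Compiler command names are not taken from this page; only the OBJECT_MODE behavior described above is assumed.

        /* Reports whether this binary was built in 64-bit or 32-bit object mode.
         * Per the note above, builds default to 64-bit; exporting OBJECT_MODE=32
         * before compiling selects 32-bit objects instead. */
        #include <stdio.h>

        int main(void)
        {
            printf("pointer size: %zu bytes -> %s-bit build\n",
                   sizeof(void *), sizeof(void *) == 8 ? "64" : "32");
            return 0;
        }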

    Utilities: These include pmrinfo, spinfo, batchview, and mssview. Please refer to /bin and /usr/local/bin on bluefire for a more complete list of user utilities.

    Software libraries: These include IBM's parallel libraries for OpenMP and MPI usage. Users may also request single-threaded libraries maintained at NCAR, including Spherepack and Mudpack. CISL prefers that users download the source code for these libraries and install them for their own use.

    Debugger: TotalView.

    File System: General Parallel File System (GPFS), a UNIX-style file system that allows applications on multiple nodes to share file data. GPFS supports very large file systems and stripes data across multiple disks for higher performance.
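
    One common way applications on multiple nodes share file data on GPFS is through MPI-IO, where every task writes its own region of a single shared file. The sketch below is illustrative only; the /ptmp path and file name are placeholders, not required locations:

        #include <mpi.h>
        #include <stdio.h>
        #include <string.h>

        int main(int argc, char **argv)
        {
            int rank;
            char block[256];
            MPI_File fh;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            /* Each task fills a fixed-size block tagged with its rank. */
            memset(block, 0, sizeof(block));
            snprintf(block, sizeof(block), "data from rank %d\n", rank);

            /* All tasks open the same file and write at rank-specific offsets
             * (path is a placeholder). */
            MPI_File_open(MPI_COMM_WORLD, "/ptmp/username/shared_output.dat",
                          MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
            MPI_File_write_at(fh, (MPI_Offset)rank * sizeof(block), block,
                              (int)sizeof(block), MPI_CHAR, MPI_STATUS_IGNORE);
            MPI_File_close(&fh);

            MPI_Finalize();
            return 0;
        }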

    System information commands: spinfo for general information; lslpp for information about libraries; batchview for batch jobs; bjall for more detailed information on batch jobs.

    Usage

    If you are a current user of CISL supercomputer resources and want a bluefire account, please request one via the web form at CISL Customer Support.

    All users will receive a bluefire login if they have a blueice login and have logged in to blueice since November 1, 2007. This applies to CSL and Community Computing users.

    Community Computing users who have General Accounting Unit (GAU) allocations are eligible to apply for an account on bluefire.

    Community users may request a bluefire login by contacting CISL Customer Support. Please include the following information with your login request:

    • Your login name
    • Your project number

    Parallel programming on bluefire is done with OpenMP, MPI, or a mixture of both (hybrid); a minimal hybrid example follows the list below.

    • To use more than one processor on a node, use OpenMP threading directives on the node, or use MPI processes on the node, or use a mixture of both.
    • To pass information between nodes, you must use MPI.
    • To take full advantage of parallelism, use OpenMP threads, MPI, or a mixture of both within a node, plus MPI between nodes.
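
    A minimal hybrid sketch, combining MPI between nodes with OpenMP threads within a node as described above. Task and thread counts are set at run time (for example, through LSF and OMP_NUM_THREADS), not in the code:

        #include <mpi.h>
        #include <omp.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            int provided, rank, nranks;

            /* Ask for FUNNELED support: only the main thread makes MPI calls. */
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &nranks);

            if (provided < MPI_THREAD_FUNNELED && rank == 0)
                printf("warning: MPI library provided a lower thread level\n");

            /* Each MPI task spawns a team of OpenMP threads on its own node. */
            #pragma omp parallel
            {
                printf("MPI task %d of %d, OpenMP thread %d of %d\n",
                       rank, nranks, omp_get_thread_num(), omp_get_num_threads());
            }

            MPI_Finalize();
            return 0;
        }

    With the IBM XL compilers, OpenMP is typically enabled with -qsmp=omp and hybrid codes are built with the thread-safe MPI compiler wrappers (for example, mpcc_r); consult the bluefire documentation for the exact build commands.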

    Scheduling

    Batch job scheduling is done via the Load Sharing Facility (LSF) batch system. Please see the documentation section below for pointers to LSF documentation.

    Queues, charges

    The class (queue) structure for bluefire is described in the Bluefire Quick Start Guide at Queues and charging.

    Examples

    Please see the directory /usr/local/examples on bluefire for examples of commonly used batch and interactive jobs.