Common Supercomputer Terms and Definitions

What is a supercomputer?

Very simply, a supercomputer is a very fast computer. Usually the term is reserved for the 500 fastest computers in the world. Because technology moves rapidly, the list of top supercomputers changes constantly, and today's supercomputer is destined to become tomorrow's "regular" computer. Modern supercomputers are made up of many smaller computers – sometimes thousands of them – connected via fast local network connections. Those smaller computers work together like an "army of ants," performing difficult calculations very quickly for the benefit of science and society. A supercomputer is typically used to tackle large scientific or engineering challenges in numerous fields, such as new drug development and medical research, environmental sciences such as global climate change, or helping us respond better to natural or man-made disasters through earthquake simulations or models of the projected flow of oil spills.

Commonly Used Terms

Bits and bytes
In computing, a bit is a binary digit, the smallest unit of data. It can store a value of one or zero. Because bits are so small, they are typically grouped eight at a time to form a byte. A byte contains just enough information to represent one small letter or number. Using binary code, any number, letter, word, or pixel can be represented by a string of bytes and bits.
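
As a small illustration (a Python sketch, not part of the original glossary), the letter "A" can be stored in a single byte whose eight bits encode the number 65:

    # A minimal sketch: one character stored as one byte of eight bits.
    text = "A"
    raw = text.encode("ascii")        # encode the character as bytes
    print(len(raw))                   # 1  -- a single byte
    print(raw[0])                     # 65 -- the numeric value held in that byte
    print(format(raw[0], "08b"))      # 01000001 -- the same byte written as 8 bits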

Cloud Computing
With cloud computing, one can access a great deal of computing power from a desktop or laptop computer, but the actual calculation is done remotely on another, more powerful server or supercomputer. Replace the word 'cloud' with 'Internet' and the meaning becomes clearer: cloud computing means Internet-based resources, such as software and storage, are provided to computers on demand, possibly in a pay-per-use model. It is seen as a way to increase one's computing capacity without investing in new infrastructure or training new personnel.

Cyberinfrastructure 
First used in 1991 according to Merriam-Webster, the prefix 'cyber' means relating to computers or computer networks. So cyberinfrastructure means the combination of computer software, hardware, and other technologies – as well as human expertise – required to support current and future discoveries in science and engineering. Similar to the way a highway infrastructure includes roads, bridges, and tunnels working together to keep vehicles moving, all these components are necessary to keep scientific discovery moving along as well.

Data-intensive computing 
A new area of computational research created by the ever-growing amount of digitally based data (see Data "tsunami"/data "deluge" and HPD). Many of the newest supercomputers are being designed so they can sift through vast amounts of digitally based data at lightning speeds, transforming raw information into meaningful analysis and results. Data-intensive computing will allow researchers to solve problems that were not even attempted previously.

Data mining
One aspect of data-intensive computing is called data mining, or the process of extracting patterns from data. Many newer supercomputers are being used to rapidly sift through huge amounts of data to find similarities, disparities, and anomalies, and transform all that data into meaningful information. Data mining is used across a wide range of scientific research, including drug discovery, as well as to detect fraud or to enhance marketing strategies by more closely matching potential buyers with products.
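
As a toy illustration of the idea (a Python sketch with made-up numbers, not an example from the text), "mining" a small data set for anomalies can be as simple as flagging values that sit far from the average:

    from statistics import mean, stdev

    # Made-up sensor readings; one value clearly doesn't fit the pattern.
    readings = [10.1, 9.8, 10.3, 10.0, 42.7, 9.9, 10.2]
    avg, spread = mean(readings), stdev(readings)

    # Flag anything more than two standard deviations from the average.
    anomalies = [x for x in readings if abs(x - avg) > 2 * spread]
    print(anomalies)   # [42.7]

Real data-mining pipelines use far more sophisticated statistics and machine learning, but the goal is the same: sift the data and surface the patterns and outliers.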

Data "tsunami," data "deluge"
Simply put, as we generate almost all of our information in electronic form, stored on media such as disks and chips, we are running out of storage space for all our data. International Data Corporation (IDC) forecasts that the amount of digitally based information generated worldwide – including text, video, images, etc. – will reach 35 zettabytes by 2020. What's a zettabyte? It's one sextillion bytes of information, or enough data to fill a stack of DVDs reaching halfway to Mars. A zettabyte is 1,000 times more than an exabyte, and today's supercomputers are nowhere near the exascale level (see Exascale/exaflop/exabyte). So the terms "data tsunami" and "data deluge" are becoming increasingly common, as IDC warns that our ability to store valuable data needs to catch up with our output.

HPC
High Performance Computing. HPC commonly refers to large-scale high performance computing resources, including storage and visualization systems.

HPD
High Performance Data. The HPC community is evolving to include HPD, according to SDSC Director Mike Norman, who coined the term as it relates to data-intensive supercomputing. As scientists struggle to keep up with the exponentially growing amount of digitally based data, HPD supercomputers will help researchers who need to access, analyze, and store extremely large data sets in significantly shorter amounts of time. Fusing HPC and HPD together has the potential to create supercomputers that will be at least one order of magnitude faster than any HPC system today, says Norman.

Parallel processing
Today's supercomputers have more than one "brain," or processor. These processors run different parts of the same computer program concurrently, resulting in significantly faster compute times. Parallel processing is used when many complex calculations are required, such as in climate or earthquake modeling. Processing "runs" that used to take months can often be done within days or even hours on larger supercomputers. The largest supercomputers today have 100,000 or more processors.
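
A minimal sketch of the idea in Python, using the standard library's multiprocessing module; the simulate_chunk work function and its inputs are invented purely for illustration:

    from multiprocessing import Pool

    def simulate_chunk(region):
        # Stand-in for one piece of a larger calculation (e.g., one climate region).
        return sum(i * i for i in range(region * 100_000))

    if __name__ == "__main__":
        regions = [1, 2, 3, 4, 5, 6, 7, 8]
        with Pool() as pool:                             # one worker process per core
            results = pool.map(simulate_chunk, regions)  # chunks run at the same time
        print(sum(results))                              # combine the partial results

A supercomputer does the same thing on a vastly larger scale, spreading the chunks across many thousands of processors connected by a fast network.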

Thumb drive
A portable storage device, similar in function to a compact disc (CD) but much smaller (smaller than a disposable cigarette lighter) and with much greater capacity. It is also much more robust, since it has no moving parts. The device, also known as a flash drive, simply plugs into a computer, where information can be transferred to it for storage or for use on another computer. Thumb drives or flash drives are reusable and can be easily carried in a pocket or on a keychain.

More Technical "Supercomputer Speak"

Cache
Usually meaning a hidden storage space for money or other valuables, in computing terms a cache is an area of memory used to hold commonly used data. A cache is simply a working area that the computer can access quickly. Think of your desk as a cache and the public library as a store of all the world's data. Your desk cannot hold the whole library, but it can hold the few books you frequently reference.
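
A minimal sketch of the desk-and-library idea in Python; the slow "library lookup" function is a hypothetical stand-in for any expensive operation:

    cache = {}   # the "desk": a small, fast place to keep recent results

    def slow_library_lookup(title):
        # Placeholder for an expensive trip to the "library"
        # (a disk read, a database query, a network request, ...).
        return f"Summary of {title}"

    def fetch_summary(title):
        if title in cache:                      # already on the desk: fast
            return cache[title]
        result = slow_library_lookup(title)     # slow trip to the library
        cache[title] = result                   # keep a copy on the desk for next time
        return result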

Flash memory
Flash memory relies on a storage chip instead of older, slower, and less reliable spinning disk technology. It can be electronically erased and reprogrammed quickly without being removed from the circuit board. Flash memory is now being used in smaller mobile devices such as cell phones, thumb drives, digital cameras, and laptop computers. In late 2011, the San Diego Supercomputer Center at UC San Diego will introduce Gordon , the first supercomputer system employ flash memory using solid state drives, or SSDs, instead of the spinning disks. Think of Gordon as the world's largest thumb drive! 

GPU or GPUs
GPU stands for Graphics Processing Unit. Similar to a CPU, or Central Processing Unit, a GPU is a single-chip processor. However, GPUs are specialized primarily for computing 3D functions, such as video animations, transforming objects, lighting effects, and other mathematically intensive operations that might strain CPUs and reduce computing speeds. Some newer supercomputers are using GPUs.
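
As a rough illustration (a Python/NumPy sketch run on the CPU, not actual GPU code), this is the kind of arithmetic-heavy, highly repetitive work that GPUs are built to accelerate; GPU array libraries expose very similar operations but spread them across thousands of lightweight cores:

    import numpy as np

    # Two large grids of random numbers -- a million values each.
    a = np.random.rand(1000, 1000)
    b = np.random.rand(1000, 1000)

    # One matrix multiplication: roughly two billion floating point operations.
    c = a @ b
    print(c.shape)   # (1000, 1000)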

Peak speed, FLOPS
A common term within the HPC community, peak speed is the fastest speed at which a supercomputer can operate. It is typically measured in "FLOPS" or "FLOP/s," which stands for FLoating point OPerations per Second. In lay terms, it basically means calculations per second. The world's fastest supercomputers have been ranked using this metric since 1993, and China's Tianhe-2 system currently holds the top spot (as of 11/2014). Many HPC experts now contend that supercomputers should be measured not just by their peak speed but also by other metrics, including their overall ability to help researchers solve real-world science problems.
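
A very rough, back-of-the-envelope sketch in Python of what "calculations per second" means: time a known number of floating point operations and divide. This measures a single core running interpreted Python, so the result will be tiny next to a supercomputer's rating:

    import time

    n = 10_000_000
    start = time.perf_counter()
    total = 0.0
    for i in range(n):
        total += i * 0.5    # one multiply and one add: about 2 floating point operations
    elapsed = time.perf_counter() - start

    print(f"~{2 * n / elapsed:,.0f} floating point operations per second")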

Portals or gateways 
Originally a grand or imposing entrance – or, in science fiction, a doorway or threshold connecting two distant locations. In "computer speak," a portal is a website that serves as a major starting point for users. Also called gateways, portals range from general sites such as Google or Yahoo to specialized portals for specific areas of research or interest.

Processors or cores
The processing part of a computer's central processing unit, or CPU, made up of the control unit and the arithmetic logic unit (ALU). The control unit retrieves instructions from the computer's memory; the ALU performs the mathematical calculations. Supercomputers are made up of thousands of processor cores. A multi-core processor is an integrated circuit to which two or more processors have been attached for increased performance, such as simultaneous processing of multiple tasks, as well as reduced power consumption.
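
A tiny Python sketch of the "many cores" point: ask the operating system how many processor cores the current machine exposes. A laptop typically reports a handful; a supercomputer aggregates hundreds of thousands of such cores across many connected nodes.

    import os

    # Number of processor cores visible to this machine's operating system.
    print(os.cpu_count())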

Petascale/petaflop/petabyte
Today's largest supercomputers are operating at what's called the 'petascale' level. A petaflop is a measure of supercomputer speed (see Peak speed), and means the ability to perform one quadrillion calculations per second – the equivalent of the combined computing power of about 200,000 laptops. Today's fastest supercomputers are rated in the tens of petaflops, and even faster systems are under development. Similarly, a petabyte is one quadrillion bytes of storage capacity – equivalent to having 250 billion pages of text. In digital terms, listening to a petabyte's worth of music on a (very large) MP3 player would take about 1,902 years.
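
A quick arithmetic check of that last comparison (a Python sketch assuming MP3 audio takes roughly one megabyte per minute of music – an assumption, not a figure from the text):

    petabyte = 10**15                        # bytes in a petabyte
    bytes_per_minute = 10**6                 # ~1 MB of MP3 audio per minute (assumed)
    minutes_of_music = petabyte / bytes_per_minute   # one billion minutes
    years = minutes_of_music / (60 * 24 * 365)
    print(round(years))                      # roughly 1,900 years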

Exascale/exaflop/exabyte
The next frontier in supercomputing is computing at the exascale level: one quintillion calculations per second, or about 1,000 times faster than today's fastest systems. A key challenge in breaking the exascale barrier is finding innovative new ways to reduce power consumption, because at the exascale level, today's supercomputer designs would need a nuclear power plant's worth of energy to operate! An exabyte, similarly, is one quintillion bytes of data storage – enough room for 1,000 copies of those 250 billion pages of text!
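
For reference, a short Python sketch of how these metric prefixes stack up, each step a factor of 1,000:

    scales = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta"]
    for power, name in enumerate(scales, start=1):
        print(f"1 {name}byte = 10^{3 * power} bytes")
    # peta = 10^15 (a quadrillion), exa = 10^18 (a quintillion), zetta = 10^21 (a sextillion)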