NVIDIA® Grace CPUs

Designed from the ground up to answer humanity’s most challenging problems.

Accelerate the Largest AI, HPC, Cloud, and Hyperscale Workloads

AI models are exploding in complexity and size as they enhance deep recommender systems containing tens of terabytes of data, improve conversational AI with hundreds of billions of parameters, and enable scientific discoveries. Scaling these massive models requires new architectures with fast access to a large pool of memory and a tight coupling of the CPU and GPU. The NVIDIA Grace CPU delivers high performance, power efficiency, and high-bandwidth connectivity that can be used in diverse configurations for different data center needs.


NVIDIA Grace Hopper Superchip

Higher Performance and Faster Memory—Massive Bandwidth for Compute Efficiency

The NVIDIA GH200 Grace Hopper Superchip is a breakthrough accelerated CPU designed from the ground up for giant-scale AI and high-performance computing (HPC) applications. The superchip delivers up to 10X higher performance for applications running terabytes of data, enabling scientists and researchers to reach unprecedented solutions for the world’s most complex problems.


NVIDIA Grace Hopper Superchip Architecture

The NVIDIA GH200 Grace Hopper Superchip combines the NVIDIA Grace and Hopper architectures using NVIDIA® NVLink®-C2C to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.

  • CPU+GPU designed for giant-scale AI and HPC
  • New 900 gigabytes per second (GB/s) coherent interface, 7X faster than PCIe Gen5
  • Supercharges accelerated computing and generative AI with HBM3 and HBM3e GPU memory
  • Runs all NVIDIA software stacks and platforms, including NVIDIA AI Enterprise, HPC SDK, and Omniverse
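The bandwidth comparison above can be sanity-checked with simple arithmetic. This is a back-of-the-envelope sketch: the ~128 GB/s figure for a PCIe Gen5 x16 link (bidirectional) is an assumption based on published PCIe specifications, not a number from this page.

```python
# Rough check of the "7X faster than PCIe Gen5" claim for NVLink-C2C.
# Assumption: a PCIe Gen5 x16 link moves roughly 128 GB/s bidirectionally.
nvlink_c2c_gbps = 900        # coherent interface bandwidth quoted above
pcie_gen5_x16_gbps = 128     # assumed PCIe Gen5 x16 bidirectional bandwidth

speedup = nvlink_c2c_gbps / pcie_gen5_x16_gbps
print(f"NVLink-C2C is roughly {speedup:.1f}x a PCIe Gen5 x16 link")  # ~7.0x
```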

Speak with One of Our System Engineers Today


NVIDIA Grace CPU Superchip

Designed to Meet the Performance and Efficiency Needs of Today’s AI Data Centers

NVIDIA Grace is designed for a new type of data center—one that processes mountains of data to produce intelligence. These data centers run diverse workloads, from AI to high-performance computing (HPC) to data analytics, digital twins, and hyperscale cloud applications. NVIDIA Grace delivers 2X the performance per watt, 2X the packaging density, and the highest memory bandwidth compared to today’s DIMM-based servers to meet the most demanding needs of the data center.


NVIDIA Grace Superchip Architecture

The NVIDIA Grace CPU Superchip uses NVIDIA® NVLink®-C2C technology to deliver 144 Arm® Neoverse V2 cores and 1 terabyte per second (TB/s) of memory bandwidth.

  • High-performance CPU for HPC and cloud computing
  • Superchip design with up to 144 Arm Neoverse V2 CPU cores with Scalable Vector Extension 2 (SVE2)
  • World’s first LPDDR5X with error-correcting code (ECC) memory, 1TB/s total bandwidth
  • 900 gigabytes per second (GB/s) coherent interface, 7X faster than PCIe Gen5
  • NVIDIA Scalable Coherency Fabric with 3.2TB/s of aggregate bisection bandwidth
  • 2X the packaging density of DIMM-based solutions
  • 2X the performance per watt of today’s leading CPU

Runs all NVIDIA software stacks and platforms, including NVIDIA RTX, NVIDIA HPC SDK, NVIDIA AI, and NVIDIA Omniverse.
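A couple of figures can be derived from the spec list above. This is an illustrative sketch using only the numbers already quoted; the even split of bandwidth across cores is a simplifying assumption, not how real workloads behave.

```python
# Derived figures for the Grace CPU Superchip, using the specs listed above.
cores = 144
total_mem_bw_gbps = 1000     # 1 TB/s total LPDDR5X bandwidth

# Naive even split across cores (a simplification for illustration only).
per_core_bw = total_mem_bw_gbps / cores
print(f"~{per_core_bw:.1f} GB/s of memory bandwidth per core")  # ~6.9 GB/s

# At 1 TB/s, streaming a 1 TB working set once takes about a second.
print(f"{1000 / total_mem_bw_gbps:.1f} s to stream 1 TB")       # 1.0 s
```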

Our Value

Decades of successful HPC deployments

Architected For You

As a leading HPC provider, Aspen Systems offers a standardized build and package selection that follows HPC best practices. Unlike many HPC vendors, however, we also give you the opportunity to customize your cluster hardware and software, with options and capabilities tuned to your specific needs and environment. This is more involved than delivering a “canned” cluster that may or may not fit your needs, and many customers value us for that flexibility and engineering expertise, returning again and again for upgrades to existing clusters or for new clusters that mirror their current optimized solutions. Other customers rely on our standard cluster configuration for their HPC computing needs and purchase that option from us repeatedly. Call an Aspen Systems sales engineer today if you wish to procure a custom cluster built to your specifications.

Solutions Ready To Go

Aspen Systems typically ships clusters to our customers as complete turnkey solutions, including full remote testing by you before the cluster ships. All you need to do is unpack the racks, roll them into place, connect power and networking, and begin computing. Of course, our involvement doesn’t end when the system is delivered.

True Expertise

With decades of experience in the high-performance computing industry, Aspen Systems is uniquely qualified to provide unparalleled systems, infrastructure, and management support tailored to your unique needs. Built to the highest quality, customized to your needs, and fully integrated, our clusters provide many years of trouble-free computing for customers all over the world. We can handle all aspects of your HPC needs, including facility design or upgrades, supplemental cooling, power management, remote access solutions, software optimization, and many additional managed services.

Passionate Support, People Who Care

Aspen Systems offers industry-leading support options. Our Standard Service Package is free of charge to every customer. We offer additional support packages, such as our future-proofing Flex Service or our fully managed Total Service package, along with many additional add-on services. With our on-site services, we can come to you to fully integrate your new cluster into your existing infrastructure or perform other upgrades and changes you require. We also offer standard and custom training packages for your administrators and your end users, or even informal, customized one-on-one assistance.

Speak with One of Our System Engineers Today


NVIDIA HGX AI Supercomputer

The most powerful end-to-end AI supercomputing platform.

AI, complex simulations, and massive datasets require multiple GPUs with extremely fast interconnections and a fully accelerated software stack. The NVIDIA HGX AI supercomputing platform brings together the full power of NVIDIA GPUs, NVLink®, NVIDIA networking, and fully optimized AI and high-performance computing (HPC) software stacks to provide the highest application performance and drive the fastest time to insights.

HGX Platform Features


Unmatched End-to-End Accelerated Computing Platform

NVIDIA HGX H100 combines H100 Tensor Core GPUs with high-speed interconnects to form the world’s most powerful servers. Configurations of up to eight GPUs deliver unprecedented acceleration, with up to 640 gigabytes (GB) of GPU memory and 24 terabytes per second (TB/s) of aggregate memory bandwidth. And a staggering 32 petaFLOPS of performance creates the world’s most powerful accelerated scale-up server platform for AI and HPC.
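The aggregate figures above break down cleanly per GPU. A quick sketch, using only the numbers quoted in the paragraph:

```python
# Per-GPU breakdown of the 8-GPU HGX H100 aggregate figures quoted above.
gpus = 8
total_mem_gb = 640           # aggregate GPU memory
total_bw_tbps = 24           # aggregate memory bandwidth

print(total_mem_gb // gpus)  # 80 GB of HBM per GPU
print(total_bw_tbps / gpus)  # 3.0 TB/s of memory bandwidth per GPU
```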

HGX H100 includes advanced networking options, at speeds up to 400 gigabits per second (Gb/s), utilizing NVIDIA Quantum-2 InfiniBand and Spectrum-X Ethernet for the highest AI performance. HGX H100 also includes NVIDIA® BlueField®-3 data processing units (DPUs) to enable cloud networking, composable storage, zero-trust security, and GPU compute elasticity in hyperscale AI clouds.

Up to 4X Higher AI Training on GPT-3


GPT-3 175B training. A100 cluster: NVIDIA Quantum InfiniBand network; H100 cluster: NVIDIA Quantum-2 InfiniBand network. Mixture of Experts (MoE) training, Transformer Switch-XXL variant with 395B parameters on a 1T-token dataset. A100 cluster: NVIDIA Quantum InfiniBand network; H100 cluster: NVIDIA Quantum-2 InfiniBand network with NVLink Switch System where indicated. (Note: H100 systems with the NVLink Switch System are not currently available.)

Deep Learning Training: Performance and Scalability

NVIDIA H100 GPUs feature the Transformer Engine, with FP8 precision, that provides up to 4X faster training over the prior GPU generation for large language models. The combination of fourth-generation NVIDIA NVLink, which offers 900GB/s of GPU-to-GPU interconnect, NVLink Switch System, which accelerates collective communication by every GPU across nodes, PCIe Gen5, and Magnum IO software delivers efficient scalability, from small enterprises to massive unified GPU clusters. These infrastructure advances, working in tandem with the NVIDIA AI Enterprise software suite, make HGX H100 the most powerful end-to-end AI and HPC data center platform.
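The value of a 900GB/s GPU-to-GPU link is easiest to see with a rough transfer-time estimate. This sketch is illustrative only: the GPT-3-scale parameter count, FP16 gradient size, and PCIe Gen5 x16 bandwidth are assumptions, and real collective-communication schedules are far more sophisticated than a single bulk copy.

```python
# Sketch: time to move one full set of FP16 gradients for a large model
# over a single GPU-to-GPU link. Model size and PCIe bandwidth are
# illustrative assumptions, not measured or vendor-published results.
params = 175e9               # GPT-3-scale parameter count (assumption)
bytes_per_param = 2          # FP16 gradients
grad_bytes = params * bytes_per_param

nvlink4_bps = 900e9          # 4th-gen NVLink GPU-to-GPU bandwidth (from the text)
pcie5_bps = 128e9            # assumed PCIe Gen5 x16 bidirectional bandwidth

print(f"NVLink: {grad_bytes / nvlink4_bps:.2f} s per full gradient transfer")
print(f"PCIe:   {grad_bytes / pcie5_bps:.2f} s per full gradient transfer")
```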

Up to 30X Higher AI Inference Performance on the Largest Models

Megatron chatbot inference with 530 billion parameters.


Inference on a Megatron 530B-parameter chatbot model with input sequence length = 128 and output sequence length = 20. A100 cluster: NVIDIA Quantum InfiniBand network; H100 cluster: NVIDIA Quantum-2 InfiniBand network for 2x HGX H100 configurations. 4x HGX A100 vs. 2x HGX H100 at 1- and 1.5-second latency; 2x HGX A100 vs. 1x HGX H100 at 2-second latency.


Up to 7X Higher Performance for HPC Applications


3D FFT (4K³) throughput: HGX A100 cluster with NVIDIA Quantum InfiniBand network; H100 cluster with NVLink Switch System and NVIDIA Quantum-2 InfiniBand. Genome sequencing (Smith-Waterman): A100 vs. H100.

HPC Performance

HGX H100 triples the floating-point operations per second (FLOPS) of double-precision Tensor Cores, delivering up to 535 teraFLOPS of FP64 computing for HPC in the 8-GPU configuration or 268 teraFLOPS in the 4-GPU configuration. AI-fused HPC applications can also leverage H100’s TF32 precision to achieve nearly 8,000 teraFLOPS of throughput for single-precision matrix-multiply operations with zero code changes.
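The 8-GPU and 4-GPU figures above are consistent with each other, as a quick per-GPU check shows:

```python
# Consistency check of the FP64 Tensor Core figures quoted above.
fp64_8gpu_tflops = 535
fp64_4gpu_tflops = 268

per_gpu_8 = fp64_8gpu_tflops / 8   # ~67 TFLOPS per GPU
per_gpu_4 = fp64_4gpu_tflops / 4   # ~67 TFLOPS per GPU
print(f"{per_gpu_8:.1f} vs {per_gpu_4:.1f} TFLOPS FP64 per GPU")
```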

H100 features DPX instructions that speed up dynamic programming algorithms—such as Smith-Waterman used in DNA sequence alignment and protein alignment for protein structure prediction—by 7X over NVIDIA Ampere architecture-based GPUs. By increasing the throughput of diagnostic functions like gene sequencing, H100 can enable every clinic to offer accurate, real-time disease diagnosis and precision medicine prescriptions.
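Smith-Waterman, the algorithm named above, is a classic dynamic programming recurrence: each cell of a scoring matrix depends on its diagonal, upper, and left neighbors, which is exactly the pattern DPX instructions accelerate. Below is a minimal pure-Python sketch of the scoring recurrence; the match/mismatch/gap parameters are illustrative, and production sequencing tools use heavily optimized implementations rather than anything like this.

```python
def smith_waterman(a: str, b: str, match: int = 2,
                   mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local-alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # scoring matrix, zero-initialized
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in sequence b
            left = H[i][j - 1] + gap    # gap in sequence a
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best

print(smith_waterman("GATTACA", "GATTACA"))  # 7 matches x 2 = 14
```

Every cell update is a small max-of-sums over neighboring cells, which is why hardware support for fused min/max dynamic programming operations (DPX) can speed this class of algorithm up so dramatically.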


NVIDIA OVX Systems

Scalable data center infrastructure for high-performance AI and graphics.

From physically accurate digital twins to generative AI, training and inference, NVIDIA OVX systems deliver industry-leading graphics and compute performance to accelerate the next generation of AI-enabled workloads in the data center.

Built and sold by NVIDIA-Certified partners, each NVIDIA OVX system combines up to eight of the latest NVIDIA Ada Lovelace L40S GPUs with high-performance ConnectX and BlueField® networking technology to deliver accelerated performance at scale.

Accelerate the Most Demanding Workloads

From generative AI to virtualization, NVIDIA OVX systems are purpose-built to tackle your most demanding workloads.

Generative AI

Develop new services, insights, and original content.

LLM Training and Inference

Accelerate AI training and inference workloads.

Industrial Digitalization

Create and operate metaverse applications.

Rendering and 3D Graphics

Power high-fidelity creative workflows with NVIDIA RTX graphics.