NVIDIA® Grace CPUs
Designed from the ground up to answer humanity’s most challenging problems.
AI models are exploding in complexity and size as they enhance deep recommender systems containing tens of terabytes of data, improve conversational AI with hundreds of billions of parameters, and enable scientific discoveries. Scaling these massive models requires new architectures with fast access to a large pool of memory and a tight coupling of the CPU and GPU. The NVIDIA Grace™ CPU delivers high performance, power efficiency, and high-bandwidth connectivity that can be used in diverse configurations for different data center needs.
Higher Performance and Faster Memory—Massive Bandwidth for Compute Efficiency
The NVIDIA GH200 Grace Hopper™ Superchip is a breakthrough accelerated CPU designed from the ground up for giant-scale AI and high-performance computing (HPC) applications. The superchip delivers up to 10X higher performance for applications running on terabytes of data, enabling scientists and researchers to reach unprecedented solutions for the world’s most complex problems.
The NVIDIA GH200 Grace Hopper Superchip combines the NVIDIA Grace™ and Hopper™ architectures using NVIDIA® NVLink®-C2C to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.
Designed to Meet the Performance and Efficiency Needs of Today’s AI Data Centers
NVIDIA Grace™ is designed for a new type of data center—one that processes mountains of data to produce intelligence. These data centers run diverse workloads, from AI to high-performance computing (HPC) to data analytics, digital twins, and hyperscale cloud applications. NVIDIA Grace delivers 2X the performance per watt, 2X the packaging density, and the highest memory bandwidth compared to today’s DIMM-based servers to meet the most demanding needs of the data center.
The NVIDIA Grace CPU Superchip uses the NVIDIA® NVLink®-C2C technology to deliver 144 Arm® Neoverse V2 cores and 1 terabyte per second (TB/s) of memory bandwidth.
It runs all NVIDIA software stacks and platforms, including NVIDIA RTX™, the NVIDIA HPC SDK, NVIDIA AI, and NVIDIA Omniverse™.
Why Aspen Systems
Aspen Systems has a server solution to meet every type of workload and application. Our broad range of servers and solutions can fit the needs of any budget and environment. With our dedicated staff, tight vendor relationships, and commitment to quality, you can rest assured that we will do what it takes to get you the best quality solution at the best price.
At Aspen Systems, we can work with you to architect and build a server solution, no matter what stage your environment is in. If you are starting from square one and need a server to get you going on your projects, we can get you on the right track to get your work done expediently. If you already have a system in place and need to grow, we can tailor your new systems to seamlessly drop into place by working with your existing environment.
Based on your software and configuration requirements, our engineers will do all the work from BIOS and RAID configurations to OS, applications, and library installations. When you receive your systems, they will be ready to go once racked and powered on. Should any final adjustments be needed, we can finish them remotely.
With servers purchased from Aspen Systems, all hardware and software-related issues can be directed through us, and we will work with the component manufacturers, so you won’t need to. Our support team is all in-house and we won’t bounce you from department to department and back again.
The most powerful end-to-end AI supercomputing platform.
AI, complex simulations, and massive datasets require multiple GPUs with extremely fast interconnections and a fully accelerated software stack. The NVIDIA HGX™ AI supercomputing platform brings together the full power of NVIDIA GPUs, NVLink®, NVIDIA networking, and fully optimized AI and high-performance computing (HPC) software stacks to provide the highest application performance and drive the fastest time to insights.
NVIDIA HGX H100 combines H100 Tensor Core GPUs with high-speed interconnects to form the world’s most powerful servers. Configurations of up to eight GPUs deliver unprecedented acceleration, with up to 640 gigabytes (GB) of GPU memory and 24 terabytes per second (TB/s) of aggregate memory bandwidth. And a staggering 32 petaFLOPS of performance creates the world’s most powerful accelerated scale-up server platform for AI and HPC.
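The aggregate figures above follow directly from the per-GPU numbers; a quick sanity check in Python, assuming 80 GB of memory per H100 and roughly 3 TB/s of per-GPU bandwidth (the per-GPU values implied by the quoted totals, not stated in this page):

```python
# Back-of-the-envelope check of the 8-GPU HGX H100 aggregates quoted above.
# Per-GPU figures are assumptions consistent with the cited totals.
GPUS = 8
MEM_PER_GPU_GB = 80      # assumed H100 memory capacity per GPU
BW_PER_GPU_TBS = 3.0     # assumed per-GPU bandwidth implied by the 24 TB/s total

total_memory_gb = GPUS * MEM_PER_GPU_GB      # aggregate GPU memory
aggregate_bw_tbs = GPUS * BW_PER_GPU_TBS     # aggregate memory bandwidth

print(total_memory_gb)    # 640
print(aggregate_bw_tbs)   # 24.0
```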
HGX H100 includes advanced networking options, at speeds up to 400 gigabits per second (Gb/s), utilizing NVIDIA Quantum-2 InfiniBand and Spectrum-X™ Ethernet for the highest AI performance. HGX H100 also includes NVIDIA® BlueField®-3 data processing units (DPUs) to enable cloud networking, composable storage, zero-trust security, and GPU compute elasticity in hyperscale AI clouds.
Benchmark configurations: GPT-3 175B training, A100 cluster: NVIDIA Quantum InfiniBand network; H100 cluster: NVIDIA Quantum-2 InfiniBand network. Mixture of Experts (MoE) training, Transformer Switch-XXL variant with 395B parameters on a 1T-token dataset; A100 cluster: NVIDIA Quantum InfiniBand network; H100 cluster: NVIDIA Quantum-2 InfiniBand network, with NVLink Switch System where indicated. (Note: H100 systems offering the NVLink Switch System are not currently available.)
NVIDIA H100 GPUs feature the Transformer Engine, with FP8 precision, that provides up to 4X faster training over the prior GPU generation for large language models. The combination of fourth-generation NVIDIA NVLink, which offers 900GB/s of GPU-to-GPU interconnect, NVLink Switch System, which accelerates collective communication by every GPU across nodes, PCIe Gen5, and Magnum IO™ software delivers efficient scalability, from small enterprises to massive unified GPU clusters. These infrastructure advances, working in tandem with the NVIDIA AI Enterprise software suite, make HGX H100 the most powerful end-to-end AI and HPC data center platform.
Megatron chatbot inference with 530 billion parameters.
Benchmark configuration: inference on the Megatron 530B-parameter chatbot model, input sequence length = 128, output sequence length = 20. A100 cluster: NVIDIA Quantum InfiniBand network; H100 cluster: NVIDIA Quantum-2 InfiniBand network for 2x HGX H100 configurations. Comparisons: 4x HGX A100 vs. 2x HGX H100 at 1- and 1.5-second latency targets; 2x HGX A100 vs. 1x HGX H100 at 2 seconds.
Benchmark configurations: 3D FFT (4K^3) throughput and genome sequencing (Smith-Waterman), A100 vs. H100. HGX A100 cluster: NVIDIA Quantum InfiniBand network; H100 cluster: NVLink Switch System with NVIDIA Quantum-2 InfiniBand.
HGX H100 triples the floating-point operations per second (FLOPS) of double-precision Tensor Cores, delivering up to 535 teraFLOPS of FP64 computing for HPC in the 8-GPU configuration or 268 teraFLOPS in the 4-GPU configuration. AI-fused HPC applications can also leverage H100’s TF32 precision to achieve nearly 8,000 teraFLOPS of throughput for single-precision matrix-multiply operations with zero code changes.
H100 features DPX instructions that speed up dynamic programming algorithms—such as Smith-Waterman used in DNA sequence alignment and protein alignment for protein structure prediction—by 7X over NVIDIA Ampere architecture-based GPUs. By increasing the throughput of diagnostic functions like gene sequencing, H100 can enable every clinic to offer accurate, real-time disease diagnosis and precision medicine prescriptions.
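Smith-Waterman is a classic dynamic-programming algorithm, and its scoring recurrence is the kind of inner loop that DPX instructions accelerate in hardware. A minimal, CPU-only Python sketch of that recurrence follows; the scoring parameters are illustrative and not tied to any particular genomics pipeline:

```python
# Minimal Smith-Waterman local-alignment scoring sketch in plain Python.
# Illustrates the dynamic-programming recurrence only; real pipelines use
# tuned substitution matrices and affine gap penalties.
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local-alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best alignment score ending at a[i-1] and b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0,                    # start a fresh local alignment
                        diag,                 # extend along both sequences
                        H[i - 1][j] + gap,    # gap in b
                        H[i][j - 1] + gap)    # gap in a
            H[i][j] = score
            best = max(best, score)
    return best

print(smith_waterman("GATTACA", "GCATGCU"))  # 4
```

Every cell depends only on its three already-computed neighbors, which is why the recurrence maps naturally onto wide, fixed-function hardware paths like DPX.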
Scalable data center infrastructure for high-performance AI and graphics.
From physically accurate digital twins to generative AI, training and inference, NVIDIA OVX™ systems deliver industry-leading graphics and compute performance to accelerate the next generation of AI-enabled workloads in the data center.
Built and sold by NVIDIA-Certified partners, each NVIDIA OVX system combines up to eight NVIDIA L40S GPUs, based on the Ada Lovelace architecture, with high-performance ConnectX® and BlueField® networking technology to deliver accelerated performance at scale.
From generative AI to virtualization, NVIDIA’s OVX systems are purpose-built to tackle your most demanding workloads.
Develop new services, insights, and original content.
Accelerate AI training and inference workloads.
Create and operate metaverse applications.
Power high-fidelity creative workflows with NVIDIA RTX™ graphics.