NVIDIA GPUs – Accelerating scientific discovery, visualizing big data for insights, and providing smart services to consumers are everyday challenges for researchers and engineers. Solving these challenges takes increasingly complex and precise simulations, the processing of tremendous amounts of data, and the training of sophisticated deep learning networks. These workloads also require accelerated data centers to meet exponentially growing demand for computing.

NVIDIA’s Hopper GPU architecture represents a monumental leap in computing technology, positioning it as the world’s leading platform for accelerated data centers, high-performance computing (HPC), and AI applications, including both training and inference.

One of the key strengths of the Hopper architecture is its implementation of the innovative Transformer Engine. This specialized hardware is tailor-made for accelerating Transformer-based models, which are fundamental to modern AI applications such as natural language processing, image recognition, and autonomous vehicle technology. The Transformer Engine optimizes performance for these models, dramatically reducing the time and energy required for training and inference phases. This makes Hopper exceptionally well-suited for environments where rapid iteration and deployment of AI models are critical.

By combining these technological advancements with robust software support through NVIDIA’s comprehensive suite of development tools, such as CUDA, cuDNN, and RAPIDS, Hopper empowers developers and organizations to push the boundaries of what’s possible in AI and HPC. This comprehensive approach ensures that NVIDIA’s Hopper GPU architecture remains at the forefront of accelerated computing technology, driving innovation and performance in data centers around the globe.



Architected For You

As a leading HPC provider, Aspen Systems offers a standardized build and package selection that follows HPC best practices. Unlike some other HPC vendors, however, we also give you the opportunity to customize your cluster hardware and software with options and capabilities tuned to your specific needs and environment. This is a more complex process than simply providing a “canned” cluster, which may or may not fit your needs. Many customers value us for our flexibility and engineering expertise, returning again and again for upgrades to existing clusters or for new clusters that mirror their current optimized solutions. Other customers value our standard cluster configuration and purchase that option from us repeatedly. Call an Aspen Systems sales engineer today if you wish to procure a cluster custom-built to your specifications.

Solutions Ready To Go

Aspen Systems typically ships clusters to our customers as complete turn-key solutions, including full remote testing by you before the cluster is shipped. All a customer will need to do is unpack the racks, roll them into place, connect power and networking, and begin computing. Of course, our involvement doesn’t end when the system is delivered.

True Expertise

With decades of experience in the high-performance computing industry, Aspen Systems is uniquely qualified to provide unparalleled systems, infrastructure, and management support tailored to your unique needs. Built to the highest quality, customized to your needs, and fully integrated, our clusters provide many years of trouble-free computing for customers all over the world. We can handle all aspects of your HPC needs, including facility design or upgrades, supplemental cooling, power management, remote access solutions, software optimization, and many additional managed services.

Passionate Support, People Who Care

Aspen Systems offers industry-leading support options. Our Standard Service Package is free of charge to every customer. We offer additional support packages, such as our future-proofing Flex Service or our fully managed Total Service package, along with many additional Add-on services! With our On-site services, we can come to you to fully integrate your new cluster into your existing infrastructure or perform other upgrades and changes you require. We also offer standard and custom Training packages for your administrators and your end-users or even informal customized, one-on-one assistance.

Aspen Products Featuring the NVIDIA GPU

Speak with One of Our System Engineers Today


Select a GPU to Get Started


Data Center Double Precision & Compute GPUs

| Name | H200 | H100 SXM | H100 PCIe | H100 NVL | A800 | A30 | A30X (BlueField DPU included) |
|---|---|---|---|---|---|---|---|
| Architecture | Hopper | Hopper | Hopper | Hopper | Ampere | Ampere | Ampere |
| FP64 | 34 TF | 34 TF | 26 TF | 68 TF | 9.7 TF | 5.2 TF | 5.2 TF |
| FP64 Tensor Core | 67 TF | 67 TF | 51 TF | 134 TF | — | 10.3 TF | 10.3 TF |
| FP32 | 67 TF | 67 TF | 51 TF | 134 TF | 19.5 TF | 10.3 TF | 10.3 TF |
| Tensor Float 32 (TF32) | 989 TF | 989 TF | 756 TF | 1,979 TF | — | 82 TF / 165 TF* | 82 TF / 165 TF* |
| BFLOAT16 / FP16 Tensor Core | 1,979 TF | 1,979 TF | 1,513 TF | 3,958 TF | 624 TF | 165 TF / 330 TF* | 165 TF / 330 TF* |
| INT8 Tensor Core | 3,958 TOPS | 3,958 TOPS | 3,206 TOPS | 7,916 TOPS | — | 330 TOPS / 661 TOPS* | 330 TOPS / 661 TOPS* |
| GPU Memory | 141 GB | 80 GB | 80 GB | 188 GB | 40 GB | 24 GB | 24 GB |
| GPU Memory Bandwidth | 4.8 TB/s | 3.35 TB/s | 2 TB/s | 7.8 TB/s | 1.5 TB/s | 933 GB/s | 1,223 GB/s |
| TDP | 700 W | 700 W | 300–350 W | 2x 350–400 W | 240 W | 165 W | 230 W |
| Interconnect | NVLink: 900 GB/s; PCIe Gen5: 128 GB/s | NVLink: 900 GB/s; PCIe Gen5: 128 GB/s | NVLink: 600 GB/s; PCIe Gen5: 128 GB/s | NVLink: 600 GB/s; PCIe Gen5: 128 GB/s | NVLink: 400 GB/s; PCIe Gen4: 64 GB/s | NVLink: 200 GB/s; PCIe Gen4: 64 GB/s | NVLink: 200 GB/s; PCIe Gen4: 64 GB/s |

* With sparsity.

Data Center Ada Lovelace Single Precision GPUs

| Name | L40S | L40 | L4 |
|---|---|---|---|
| Architecture | Ada Lovelace | Ada Lovelace | Ada Lovelace |
| FP32 | 91.6 TF | 90.5 TF | 30.3 TF |
| Tensor Float 32 (TF32) | 183 TF / 366 TF | 90.5 TF / 181 TF | 120 TF |
| BFLOAT16 / FP16 Tensor Core | 362 TF / 733 TF | 181.05 TF / 362.1 TF | 242 TF |
| INT8 Tensor Core | 733 TOPS / 1,466 TOPS | 362 TOPS / 724 TOPS | 485 TOPS |
| GPU Memory | 48 GB GDDR6 with ECC | 48 GB GDDR6 with ECC | 24 GB GDDR6 with ECC |
| GPU Memory Bandwidth | 864 GB/s | 864 GB/s | 300 GB/s |
| TDP | 350 W | 300 W | 72 W |
| Interconnect | PCIe Gen4: 64 GB/s | PCIe Gen4: 64 GB/s | PCIe Gen4: 64 GB/s |

Data Center Ampere Single Precision GPUs

| Name | A40 | A16 | A10 | A2 |
|---|---|---|---|---|
| Architecture | Ampere | Ampere | Ampere | Ampere |
| FP64 | — | 271.2 GF | — | — |
| FP32 | 37.4 TF | 8.678 TF | 31.2 TF | 4.5 TF |
| Tensor Float 32 (TF32) | 74.8 TF / 299.4 TF | — | 62.5 TF / 125 TF* | 9 TF / 18 TF |
| BFLOAT16 / FP16 Tensor Core | 149.7 TF / 299.4 TF | 8.678 TF | 125 TF / 250 TF* | 18 TF / 36 TF |
| INT8 Tensor Core | 299.3 TOPS / 598.6 TOPS | — | 250 TOPS / 500 TOPS* | — |
| GPU Memory | 48 GB GDDR6 with ECC | 4x 16 GB GDDR6 with ECC | 24 GB GDDR6 | 16 GB GDDR6 |
| GPU Memory Bandwidth | 696 GB/s | 4x 232 GB/s | 600 GB/s | 200 GB/s |
| TDP | 300 W | 250 W | 150 W | 40–60 W |
| Interconnect | NVLink: 112.5 GB/s; PCIe Gen4: 31.5 GB/s | PCIe Gen4 x16 | PCIe Gen4: 64 GB/s | PCIe Gen4 x8 |

* With sparsity.

Professional Ada Lovelace Series GPUs

| Name | RTX 6000 Ada | RTX 5000 Ada | RTX 4500 Ada | RTX 4000 Ada |
|---|---|---|---|---|
| Architecture | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ada Lovelace |
| FP64 | 1,423 GF | 1,020 GF | 619.2 GF | 299.5 GF |
| FP32 | 91 TF | 65.28 TF | 39.63 TF | 19.17 TF |
| BFLOAT16 / FP16 Tensor Core | 91 TF | 65.28 TF | 39.63 TF | 19.17 TF |
| GPU Memory | 48 GB GDDR6 | 32 GB GDDR6 | 24 GB GDDR6 | 20 GB GDDR6 |
| GPU Memory Bandwidth | 960 GB/s | 576 GB/s | 432 GB/s | 280 GB/s |
| TDP | 300 W | 250 W | 130 W | 70 W |
| Interconnect | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x16 |

Professional Ampere Series GPUs

| Name | RTX A6000 | RTX A5500 | RTX A5000 | RTX A4500 | RTX A4000 | RTX A2000 |
|---|---|---|---|---|---|---|
| Architecture | Ampere | Ampere | Ampere | Ampere | Ampere | Ampere |
| FP64 | 1,210 GF | 1,066 GF | 867.8 GF | 739.2 GF | 599 GF | 124.8 GF |
| FP32 | 38.71 TF | 34.10 TF | 27.77 TF | 23.65 TF | 19.17 TF | 8 TF |
| BFLOAT16 / FP16 Tensor Core | 38.71 TF | 34.10 TF | 27.77 TF | 23.65 TF | 19.17 TF | 8 TF |
| GPU Memory | 48 GB GDDR6 | 24 GB GDDR6 | 24 GB GDDR6 | 20 GB GDDR6 | 16 GB GDDR6 | 6 GB GDDR6 |
| GPU Memory Bandwidth | 768 GB/s | 768 GB/s | 768 GB/s | 640 GB/s | 448 GB/s | 288 GB/s |
| TDP | 300 W | 230 W | 230 W | 200 W | 140 W | 70 W |
| Interconnect | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x16 |


Enhanced Performance with Bigger, Quicker Memory

The NVIDIA H200 Tensor Core GPU accelerates generative AI and high-performance computing (HPC) applications, delivering revolutionary performance and memory capabilities.

Built on the NVIDIA Hopper™ architecture, the NVIDIA H200 is the first GPU to feature 141 gigabytes (GB) of HBM3e memory at 4.8 terabytes per second (TB/s), offering nearly twice the memory capacity and 1.4 times the memory bandwidth of the NVIDIA H100 Tensor Core GPU. The H200’s larger, faster memory accelerates generative AI and large language models and advances scientific computing for HPC workloads, with improved energy efficiency and a lower total cost of ownership.

Platform Features

H200 GPU Feature Highlights

Unlock Insights With High-Performance LLM Inference

In the dynamic world of AI, companies depend on large language models to tackle a variety of inference demands. An AI inference accelerator must provide optimal throughput while maintaining the lowest total cost of ownership (TCO) when scaled up for a broad user base. The H200 offers a significant advancement in this area, doubling the inference performance of its predecessor, the H100, when serving large language models like Llama 2 70B. This enhancement boosts efficiency and supports the seamless handling of larger, more complex models, making it a robust solution for businesses aiming to leverage AI at scale.

Supercharge HPC Workloads

For high-performance computing (HPC) applications, memory bandwidth is essential: it enables faster data transfers and alleviates processing bottlenecks. In memory-intensive HPC fields such as simulation, scientific research, and artificial intelligence, the increased memory bandwidth of the H200 ensures data can be accessed and processed efficiently, delivering up to 110X faster time to results compared with CPU-based systems. Such improvements significantly streamline workflows in environments where speed and efficiency are critical.

Lowering Energy Use and Total Cost of Ownership

The launch of the H200 marks a new era in energy efficiency and total cost of ownership (TCO). This advanced technology delivers unmatched performance, maintaining the same power consumption as the H100 Tensor Core GPU. The enhancement in energy efficiency makes AI factories and supercomputing systems not only quicker but also more environmentally friendly. This combination of speed and sustainability provides a significant economic advantage, driving progress in the AI and scientific sectors.


The ultimate workstation development platform for data science and HPC.

Bring the power of a supercomputer to your workstation and accelerate end-to-end data science workflows with the NVIDIA A800 40GB Active GPU. Powered by the NVIDIA Ampere architecture, the A800 40GB Active delivers powerful compute, high-speed memory, and scalability, so data professionals can tackle their most challenging data science, AI, and HPC workloads.


Discover Unmatched Multi-Workload Performance with the NVIDIA L40S GPU.

The NVIDIA L40S GPU offers a groundbreaking combination of powerful AI computing, top-tier graphics, and superior media acceleration, designed to handle the demands of next-generation data center workloads. From generative AI and large language model (LLM) training and inference to 3D graphics, rendering, and video processing, the L40S GPU is engineered to excel across a diverse range of applications, setting a new standard for performance in modern data centers.

Software Tools for GPU Computing

TensorFlow Artificial Intelligence Library

TensorFlow, developed by Google, is an open-source symbolic math library for high-performance computation. It has quickly become an industry standard for artificial intelligence and machine learning applications and is known for its flexibility, with uses across many scientific disciplines. It is built around the concept of a tensor, which, as you may have guessed, is where NVIDIA’s Tensor Cores get their name.

GPU Accelerated Libraries

There are a handful of GPU-accelerated libraries that developers can use to speed up applications on GPUs. Many of them are NVIDIA CUDA libraries (such as cuBLAS and the CUDA Math Library), but there are others, such as the IMSL Fortran libraries and HiPLAR (High Performance Linear Algebra in R). These libraries can be linked in to replace standard libraries commonly used in non-GPU-accelerated computing.
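To make the idea of library substitution concrete, here is a plain-C matrix multiply of the kind these libraries accelerate. The function name and row-major layout are illustrative, not from any particular library; with cuBLAS, this entire routine would be replaced by a single `cublasSgemm` call that runs on the GPU.

```c
#include <stddef.h>

/* Naive single-precision matrix multiply, C = A * B (row-major, n x n).
   This is the kind of hot loop a GPU-accelerated library replaces:
   linking against cuBLAS turns the whole routine into one library
   call executed on the GPU instead of a triple loop on the CPU. */
void sgemm_naive(size_t n, const float *A, const float *B, float *C) {
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; ++k)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}
```

Because accelerated libraries mirror the standard interfaces, swapping them in is typically a link-time change rather than a rewrite.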

CUDA Development Toolkit

NVIDIA has created an entire toolkit devoted to computing on its CUDA-enabled GPUs. The CUDA Toolkit, which includes the CUDA libraries, is the core of many GPU-accelerated programs, and CUDA is one of the most widely used toolkits in the GPGPU world today.
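As a brief illustration of the programming model, the following is a minimal CUDA C++ sketch of the classic vector-add example, assuming a CUDA-capable GPU and the `nvcc` compiler from the toolkit (names are illustrative):

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Each GPU thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    float *a, *b, *c;
    // Unified (managed) memory is accessible from both CPU and GPU.
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch enough 256-thread blocks to cover all n elements.
    vecAdd<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  // 3.0 on a working GPU
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

A sketch like this is compiled with `nvcc vecadd.cu -o vecadd`; the toolkit handles splitting the code between host and device.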

NVIDIA Deep Learning SDK

In today’s world, Deep Learning is becoming essential in many segments of the industry. For instance, Deep Learning is key in voice and image recognition, where the machine must learn as it receives new input. Writing algorithms for machines to learn from data is a difficult task, so NVIDIA provides a Deep Learning SDK with the tools necessary to design and run that code on GPUs.

OpenACC Parallel Programming Model

OpenACC is a user-driven, directive-based, performance-portable parallel programming model. It is designed for scientists and engineers interested in porting their codes to a wide variety of heterogeneous HPC hardware platforms and architectures with significantly less programming effort than a low-level model requires. OpenACC directives can be a powerful tool for porting a user’s application to run on GPU servers. OpenACC has two key strengths: ease of use and portability. Applications that use OpenACC can run not only on NVIDIA GPUs but also on other GPUs, x86 CPUs, and POWER CPUs.
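The directive-based style can be sketched with a simple SAXPY loop (the function name is illustrative). With an OpenACC compiler such as `nvc -acc`, the pragma offloads the loop to a GPU; an ordinary C compiler simply ignores the unknown pragma and runs the same loop serially, which is exactly the portability OpenACC aims for.

```c
#include <stddef.h>

/* SAXPY: y = a*x + y. The single directive below asks an OpenACC
   compiler to copy x in, copy y in and out, and run the loop in
   parallel on an accelerator. Without OpenACC support, the pragma
   is ignored and the code remains correct serial C. */
void saxpy(size_t n, float a, const float *x, float *y) {
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

The same source therefore serves both the accelerated and the CPU-only build, with the choice made at compile time.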