Intel Xeon Scalable Processor Family (Skylake)

Intel Xeon Scalable Processors

The new Intel Xeon Scalable Processor Family (Intel Skylake) is workload-optimized to support hybrid cloud infrastructures and the most demanding applications. You can drive actionable insight, count on hardware-based security, and deploy dynamic service delivery. See how Aspen Systems can apply Intel Xeon Scalable processors to your project.

Advanced Features Are Designed into the Silicon

Synergy among compute, network, and storage is built in. The Intel® Xeon® Processor Scalable Family optimizes interconnectivity with a focus on speed without compromising data security.

The Intel® Xeon® Processor Scalable Family is available in four feature configurations:

Intel Xeon Scalable Processor Bronze

8 Cores
2 Socket CFG
1.5 TB Memory

6 channels DDR4 @ 2133
2 UPI links @ 9.6GT/s
16 DP FLOPs per cycle
AVX-512 (1x 512b FMA)

Intel Xeon Scalable Processor Silver

12 Cores
2 Socket CFG
1.5 TB Memory

6 channels DDR4 @ 2400
2 UPI links @ 9.6GT/s
16 DP FLOPs per cycle
AVX-512 (1x 512b FMA)
Hyper-Threading
Turbo Boost

Intel Xeon Scalable Processor Gold

22 Cores
4 Socket CFG
6 TB Memory

6 channels DDR4 @ 2666
3 UPI links @ 10.4GT/s
32 DP FLOPs per cycle
AVX-512 (2x 512b FMA)
Hyper-Threading
Turbo Boost

Intel Xeon Scalable Processor Platinum

28 Cores
8+ Socket CFG
12 TB Memory

6 channels DDR4 @ 2666
3 UPI links @ 10.4GT/s
32 DP FLOPs per cycle
AVX-512 (2x 512b FMA)
Hyper-Threading
Turbo Boost

Values are the maximums for each configuration.

Broadwell vs Skylake Processors

Much has been written about what the Broadwell and Skylake processors are, and about Intel's Tick-Tock release cadence. For HPC, a few differences really matter when comparing Broadwell with Skylake processors (Intel Xeon Processor Scalable Family).

HPL Performance

First, a lot of people talk about HPL performance, and this benchmark is still what is used for ranking on the Top 500 list. Compared to the 16 double-precision floating-point operations (DP FLOPs) per cycle for Broadwell, the Gold and Platinum Skylake processors can do 32 DP FLOPs per cycle. This means that for every CPU core, you get twice the theoretical performance per clock cycle. It is important to note that the Bronze and Silver CPUs can only do 16 DP FLOPs per cycle. Also, not all of your applications are going to run twice as fast as they did with Broadwell processors. Code has to be optimized to take advantage of all of the features, such as AVX-512.
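The per-cycle figures follow from the FMA configuration listed for each tier: every 512-bit FMA unit works on eight 64-bit doubles and counts as two operations (a multiply and an add) per lane. The sketch below is illustrative only. The function names are made up for this example, the Broadwell line assumes two 256-bit FMA units per core (not stated on this page), and the 28-core, 1.7 GHz inputs are the 8180's core count and AVX-512 base frequency from the SKU table on this page. Real HPL results will come in below this theoretical peak.

```python
def dp_flops_per_cycle(fma_units: int, vector_bits: int = 512) -> int:
    """Theoretical DP FLOPs per cycle for one core."""
    lanes = vector_bits // 64      # 64-bit doubles per vector register
    return fma_units * lanes * 2   # one FMA = one multiply + one add per lane


def peak_gflops(cores: int, ghz: float, fma_units: int) -> float:
    """Theoretical peak DP GFLOPS for one socket at the given clock."""
    return cores * ghz * dp_flops_per_cycle(fma_units)


print(dp_flops_per_cycle(1))        # Bronze / Silver: 16
print(dp_flops_per_cycle(2))        # Gold / Platinum: 32
print(dp_flops_per_cycle(2, 256))   # Broadwell (assumed 2x 256b FMA): 16

# Platinum 8180 at its AVX-512 base frequency
print(f"{peak_gflops(28, 1.7, 2):.1f} GFLOPS")
```

Using the AVX-512 base frequency rather than the nominal base frequency gives a more conservative estimate, since AVX-512 code runs at the lower clock.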


Memory Speed and Capacity

Another important factor when looking at HPC performance is memory speed and capacity. The Broadwell CPUs accept 2133MHz memory and up to 2400MHz memory. For Skylake, the Bronze CPUs still take 2133MHz memory and the Silver CPUs take 2400MHz memory, but the Gold and Platinum CPUs can take 2666MHz memory DIMMs. Not only do you get higher memory speeds, but you also get six (6) memory channels (paths from the processor to memory) per CPU. With one DIMM per channel (DPC), that means twelve memory DIMMs in a dual-socket server. This is an upgrade from Broadwell, which only had four (4) memory channels per CPU. If you populate a dual-socket Skylake server with two (2) DIMMs per channel (so that two DIMMs share each memory-to-CPU path), you get 24 DIMMs, and with 64GB LRDIMMs you can easily get to 1.5TB of memory.
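The DIMM counts and the 1.5TB figure are simple products, and the same channel counts give a rough idea of theoretical memory bandwidth. The following is a minimal sketch with illustrative function names. The 8 bytes per transfer comes from the 64-bit width of a DDR4 channel, which is an assumption added here rather than something stated on this page, and sustained bandwidth in practice is lower than this peak.

```python
def dimm_slots(sockets: int, channels_per_cpu: int, dimms_per_channel: int) -> int:
    """Total DIMMs in the server."""
    return sockets * channels_per_cpu * dimms_per_channel


def capacity_gb(dimms: int, gb_per_dimm: int) -> int:
    """Total memory capacity in GB."""
    return dimms * gb_per_dimm


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak bandwidth per CPU in GB/s (8 bytes per transfer)."""
    return channels * mega_transfers_per_s * 8 / 1000


print(dimm_slots(2, 6, 1))                    # dual socket, 1 DPC: 12
print(dimm_slots(2, 6, 2))                    # dual socket, 2 DPC: 24
print(capacity_gb(dimm_slots(2, 6, 2), 64))   # 1536 GB, i.e. 1.5TB

print(f"Broadwell: {peak_bandwidth_gbs(4, 2400):.1f} GB/s per CPU")
print(f"Skylake Gold/Platinum: {peak_bandwidth_gbs(6, 2666):.1f} GB/s per CPU")
```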

Multi-processor Servers

There are times when researchers want multi-processor servers with large core counts. The Gold processors can be used in a 4-socket server, and the Platinum processors can be used in 8-socket (or larger) servers. This means you can get 88 Gold processor cores in a single four-way server, or 224 Platinum processor cores in a single eight-way server. With Hyper-Threading, you can get 448 CPU threads in a single server!
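The totals are sockets times cores per CPU, doubled for Hyper-Threading. A small sketch, using a hypothetical helper name and the maximum core counts listed for the Gold (22) and Platinum (28) tiers:

```python
def server_totals(sockets: int, cores_per_cpu: int, threads_per_core: int = 2):
    """Return (total cores, total hardware threads) for one server."""
    cores = sockets * cores_per_cpu
    return cores, cores * threads_per_core


print(server_totals(4, 22))   # four-way Gold:      (88, 176)
print(server_totals(8, 28))   # eight-way Platinum: (224, 448)
```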

Parallel Processing

Finally, with parallel processing, networking becomes very important. Intel has put its Omni-Path Architecture (OPA) fabric right onto the CPU chip. These CPUs have an “F” after the CPU model number (for example, 8176F). This means that network traffic no longer has to go through the motherboard and a PCIe slot; instead, a cable comes right out of the CPU for network connectivity. And you get not just one OPA connection, but two per CPU. In a dual-socket system, you can install two “F” CPUs, or both CPUs can share one CPU's OPA connection over the UPI links between them.

The Intel Xeon Processor Scalable Family has been created for performance, and with HPC users in mind. There are a lot of new features available as described above, and in general, most code can take advantage of some of these features without any changes to the code. For best performance, the code would need to be updated and optimized to take advantage of all that these new CPUs offer. Still have questions? An Aspen Engineer would be happy to answer your questions.

Skylake Enhances your Workload

Intel Xeon Scalable Processor Optimization

Performance
New features such as Intel® Advanced Vector Extensions 512 (Intel® AVX-512) deliver workload-optimized performance and throughput increases for advanced analytics, high performance computing (HPC) applications, and data compression.


Intel Xeon Scalable Processor Acceleration

Acceleration
Accelerate applications by adding Intel® QuickAssist Technology (Intel® QAT) to a software-defined infrastructure (SDI) environment. It provides a software-enabled foundation for security, authentication, and compression, and significantly increases the performance and efficiency of standard platform solutions.


Intel Xeon Scalable Processor Networking

Networking
High-speed Integrated Intel® Ethernet (up to 4x10GbE) helps reduce total system cost. It also lowers power consumption and improves transfer latency of large storage blocks and virtual machine migration.


Intel Xeon Scalable Processor Security

Security
As more devices connect to the Internet and workloads move to the cloud, hacking is a greater threat than ever. Traditional schemes against known-bad elements are not enough to keep the enterprise secured. Instead, the entire IT infrastructure must be secure, starting from the root with platform silicon.


Intel® Xeon® Processor Scalable Family SKU Stack

Sorted by #cores in family

*Each processor can address up to 768GB Memory space
SKU | Cores | Last-level Cache | TDP (W) | Base Frequency | Max Turbo Frequency | AVX 2.0 Base Frequency | AVX-512 Base Frequency | Sockets
8180 | 28 | 38.50 MB | 205W | 2.5 GHz | 3.80 GHz | 2.1 GHz | 1.7 GHz | 2S/4S/8S
8176 | 28 | 38.50 MB | 165W | 2.1 GHz | 3.80 GHz | 1.7 GHz | 1.3 GHz | 2S/4S/8S
8170 | 26 | 35.75 MB | 165W | 2.1 GHz | 3.70 GHz | 1.7 GHz | 1.3 GHz | 2S/4S/8S
8164 | 26 | 35.75 MB | 150W | 2.0 GHz | 3.70 GHz | 1.6 GHz | 1.2 GHz | 2S/4S/8S
8168 | 24 | 33.00 MB | 205W | 2.7 GHz | 3.70 GHz | 2.3 GHz | 1.9 GHz | 2S/4S/8S
8160 | 24 | 33.00 MB | 150W | 2.1 GHz | 3.70 GHz | 1.8 GHz | 1.4 GHz | 2S/4S/8S
8153 | 16 | 22.00 MB | 125W | 2.0 GHz | 2.80 GHz | 1.6 GHz | 1.2 GHz | 2S/4S/8S
6152 | 22 | 30.25 MB | 140W | 2.1 GHz | 3.70 GHz | 1.7 GHz | 1.4 GHz | 2S/4S
6148 | 20 | 27.50 MB | 150W | 2.4 GHz | 3.70 GHz | 1.9 GHz | 1.6 GHz | 2S/4S
6138 | 20 | 27.50 MB | 125W | 2.0 GHz | 3.70 GHz | 1.6 GHz | 1.3 GHz | 2S/4S
6154 | 18 | 24.75 MB | 200W | 3.0 GHz | 3.70 GHz | 2.6 GHz | 2.1 GHz | 2S/4S
6150 | 18 | 24.75 MB | 165W | 2.7 GHz | 3.70 GHz | 2.3 GHz | 1.9 GHz | 2S/4S
6142 | 16 | 22.00 MB | 150W | 2.6 GHz | 3.70 GHz | 2.2 GHz | 1.6 GHz | 2S/4S
6130 | 16 | 22.00 MB | 125W | 2.1 GHz | 3.70 GHz | 1.7 GHz | 1.3 GHz | 2S/4S
6132 | 14 | 19.25 MB | 140W | 2.6 GHz | 3.70 GHz | 2.2 GHz | 1.7 GHz | 2S/4S
6146 | 12 | 24.75 MB | TBD | TBD | TBD | TBD | TBD | 2S/4S
6136 | 12 | 24.75 MB | 150W | 3.0 GHz | 3.70 GHz | 2.6 GHz | 2.1 GHz | 2S/4S
6126 | 12 | 19.25 MB | 125W | 2.6 GHz | 3.70 GHz | 2.2 GHz | 1.7 GHz | 2S/4S
6144 | 8 | 24.75 MB | TBD | TBD | TBD | TBD | TBD | 2S/4S
6134 | 8 | 24.75 MB | 130W | 3.2 GHz | 3.70 GHz | 2.7 GHz | 2.1 GHz | 2S/4S
6128 | 6 | 19.25 MB | 115W | 3.4 GHz | 3.70 GHz | 2.9 GHz | 2.3 GHz | 2S/4S
5120 | 14 | 19.25 MB | 105W | 2.2 GHz | 3.20 GHz | 1.8 GHz | 1.2 GHz | 2S/4S
5118 | 12 | 16.50 MB | 105W | 2.3 GHz | 3.20 GHz | 1.9 GHz | 1.2 GHz | 2S/4S
5115 | 10 | 13.75 MB | 85W | 2.4 GHz | 3.20 GHz | 2.0 GHz | 1.2 GHz | 2S/4S
5122 | 4 | 16.50 MB | 105W | 3.6 GHz | 3.70 GHz | 3.3 GHz | 2.7 GHz | 2S/4S
4116 | 12 | 16.50 MB | 85W | 2.1 GHz | 3.00 GHz | 1.7 GHz | 1.1 GHz | 2S
4114 | 10 | 13.75 MB | 85W | 2.2 GHz | 3.00 GHz | 1.8 GHz | 1.1 GHz | 2S
4110 | 8 | 11.00 MB | 85W | 2.1 GHz | 3.00 GHz | 1.7 GHz | 1.0 GHz | 2S
4112 | 4 | 8.25 MB | 85W | 2.6 GHz | 3.00 GHz | 2.2 GHz | 1.1 GHz | 2S
3106 | 8 | 11.00 MB | 85W | 1.7 GHz | - | 1.3 GHz | 0.8 GHz | 2S
3104 | 6 | 8.25 MB | 85W | 1.7 GHz | - | 1.3 GHz | 0.8 GHz | 2S

SKUs Optimized with Increased Memory Capacity

*Each processor can address up to 1.5TB Memory space
SKU | Cores | Last-level Cache | TDP (W) | Base Frequency | Max Turbo Frequency | AVX 2.0 Base Frequency | AVX-512 Base Frequency | Sockets
8180M | 28 | 38.50 MB | 205W | 2.5 GHz | 3.80 GHz | 2.1 GHz | 1.7 GHz | 2S/4S/8S
8176M | 28 | 38.50 MB | 165W | 2.1 GHz | 3.80 GHz | 1.7 GHz | 1.3 GHz | 2S/4S/8S
8170M | 26 | 35.75 MB | 165W | 2.1 GHz | 3.70 GHz | 1.7 GHz | 1.3 GHz | 2S/4S/8S
8160M | 24 | 33.00 MB | 150W | 2.1 GHz | 3.70 GHz | 1.8 GHz | 1.4 GHz | 2S/4S/8S
6142M | 16 | 22.00 MB | 150W | 2.6 GHz | 3.70 GHz | 2.2 GHz | 1.6 GHz | 2S/4S
6140M | 18 | 24.75 MB | 140W | 2.3 GHz | 3.70 GHz | 1.9 GHz | 1.5 GHz | 2S/4S
6134M | 8 | 24.75 MB | 130W | 3.2 GHz | 3.70 GHz | 2.7 GHz | 2.1 GHz | 2S/4S

SKUs with Integrated Intel® Omni-Path Fabric

SKU | Cores | Last-level Cache | TDP (W) | Base Frequency | Max Turbo Frequency | AVX 2.0 Base Frequency | AVX-512 Base Frequency | Sockets
8176F | 28 | 38.50 MB | 173W | 2.1 GHz | 3.80 GHz | 1.7 GHz | 1.3 GHz | 2S/4S/8S
8160F | 24 | 33.00 MB | 160W | 2.1 GHz | 3.70 GHz | 1.8 GHz | 1.4 GHz | 2S/4S/8S
6148F | 20 | 27.50 MB | 160W | 2.4 GHz | 3.70 GHz | 1.9 GHz | 1.6 GHz | 2S/4S
6138F | 20 | 27.50 MB | 135W | 2.0 GHz | 3.70 GHz | 1.6 GHz | 1.3 GHz | 2S/4S
6142F | 16 | 22.00 MB | 160W | 2.6 GHz | 3.70 GHz | 2.2 GHz | 1.6 GHz | 2S/4S
6130F | 16 | 22.00 MB | 135W | 2.1 GHz | 3.70 GHz | 1.7 GHz | 1.3 GHz | 2S/4S
6126F | 12 | 19.25 MB | 135W | 2.6 GHz | 3.70 GHz | 2.2 GHz | 1.7 GHz | 2S/4S