INTEL XEON PHI

Introducing Intel’s Knights Landing

Knights Landing is the codename for Intel’s 2nd generation Xeon Phi Product Family, which delivers massive thread parallelism, data parallelism and memory bandwidth – with improved single-thread performance and Intel Xeon processor binary-compatibility in a standard CPU form factor.



2U Intel Xeon Phi Processor (KNL) Quad Module Server/Omni-Path Host Fabric Interface

This 2U server is designed for parallelized workflows in the HPC market and features four Intel Compute Modules, each with support for the Intel Xeon Phi Processor. The Intel Omni-Path Host Fabric Interface Adapter offers up to 100 Gbps of bandwidth per port, delivering performance that scales with high node and core counts. The hot-swappable compute modules, 3.5″ drive bays, and redundant power supply modules offer easy serviceability.

The most distinguishing feature of the Knights Landing chip is that it is a bootable host CPU, unlike its predecessor Knights Corner, which is a coprocessor that connects over PCIe. Knights Landing is also the first chip to offer an integrated fabric, Intel’s Omni-Path Architecture (OPA), in the package.


Knights Landing also integrates 16 GB of on-package memory in the processor, which benefits memory bandwidth and overall application performance. In addition, a six-channel memory controller supports up to 384 GB of DDR4-2400 memory (~90 GB/s sustained bandwidth). There are 36 PCI Express 3.0 lanes for connecting PCIe SSDs or discrete graphics cards. The MIC (Many Integrated Core) design fits 8 billion transistors on a die, using 14 nm process technology. The Phi product family comes in two variants: a stand-alone CPU, and a stand-alone CPU with integrated Omni-Path fabric technology. The SKU stack that Intel is launching includes four parts with different core counts, frequencies, TDPs and price points, each offered with or without the integrated fabric (the F suffix in the product table).
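To put the ~90 GB/s sustained figure in context, the theoretical peak of a six-channel DDR4-2400 memory system can be worked out from the channel count and transfer rate. The sketch below is illustrative only: the function name `peak_bandwidth_gbs` is hypothetical, and it assumes the standard 64-bit (8-byte) data bus per DDR4 channel, which the text above does not state.

```cpp
// Theoretical peak bandwidth of a DDR memory system, in GB/s.
// ASSUMPTION: each channel has a 64-bit (8-byte) data bus, so one transfer
// moves 8 bytes per channel. The rate is given in mega-transfers per second
// (2400 for DDR4-2400).
constexpr double peak_bandwidth_gbs(int channels,
                                    double mega_transfers_per_s,
                                    int bytes_per_transfer = 8) {
    // channels * MT/s * bytes gives MB/s; divide by 1000 for GB/s.
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000.0;
}
```

Under these assumptions, `peak_bandwidth_gbs(6, 2400.0)` evaluates to 115.2 GB/s, so the quoted ~90 GB/s sustained bandwidth would be roughly 78% of the theoretical peak.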



Product Specifications

Processor Number | Cache (MB) | Clock Speed | # of Cores / # of Threads | Max TDP (W) | OPA on Chip
Xeon Phi Processor 7290F (16GB, 1.50 GHz, 72 core) | 36 | 1.50 GHz | 72/72 | 260 | Yes
Xeon Phi Processor 7290 (16GB, 1.50 GHz, 72 core) | 36 | 1.50 GHz | 72/72 | 245 | No
Xeon Phi Processor 7250F (16GB, 1.40 GHz, 68 core) | 34 | 1.40 GHz | 68/68 | 230 | Yes
Xeon Phi Processor 7250 (16GB, 1.40 GHz, 68 core) | 34 | 1.40 GHz | 68/68 | 215 | No
Xeon Phi Processor 7230F (16GB, 1.30 GHz, 64 core) | 32 | 1.30 GHz | 64/64 | 230 | Yes
Xeon Phi Processor 7230 (16GB, 1.30 GHz, 64 core) | 32 | 1.30 GHz | 64/64 | 215 | No
Xeon Phi Processor 7210F (16GB, 1.30 GHz, 64 core) | 32 | 1.30 GHz | 64/64 | 230 | Yes
Xeon Phi Processor 7210 (16GB, 1.30 GHz, 64 core) | 32 | 1.30 GHz | 64/64 | 215 | No

Many Trailblazing Improvements in Knights Landing

Improvements | What / Why
Self-Boot Processor | No PCIe bottleneck
Binary Compatibility with Xeon | Runs all legacy software. No recompilation.
New Core: SLM based | ~3x higher ST performance over KNC
Improved Vector density | 3+ TFLOPS (DP) peak per chip
AVX 512 ISA | New 512-bit Vector ISA with Masks
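
The "3+ TFLOPS (DP) peak per chip" entry can be sanity-checked with a short back-of-the-envelope calculation. This is only a sketch: the function name `peak_dp_gflops` is hypothetical, and it assumes two 512-bit vector units per core, one fused multiply-add (counted as 2 FLOPs) per lane per cycle, and the listed clock speed held under full vector load. None of these assumptions is stated in the table above.

```cpp
// Theoretical peak double-precision throughput in GFLOP/s.
// ASSUMPTIONS (not from the product table):
//   - vpus_per_core vector units per core (2 assumed here)
//   - each 512-bit register holds 512 / 64 = 8 doubles
//   - every lane retires one fused multiply-add (2 FLOPs) per cycle
//   - the chip sustains the listed clock speed under vector load
constexpr double peak_dp_gflops(int cores, double ghz,
                                int vpus_per_core = 2,
                                int vector_bits = 512) {
    return cores * ghz * vpus_per_core * (vector_bits / 64) * 2;
}
```

With these assumptions, the 72-core 1.50 GHz part works out to 3456 GFLOP/s, which is consistent with the "3+ TFLOPS" figure; actual clocks under heavy vector load may be lower than the listed speed.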



Summary

  • Knights Landing (KNL) is the first self-boot Intel Xeon Phi processor
  • Many improvements for performance and programmability
    • Significant leap in scalar and vector performance
    • Significant increase in memory bandwidth and capacity
    • Binary compatible with Intel Xeon processor
  • Common programming models between Intel Xeon processor and Intel Xeon Phi processor
  • KNL offers an immense amount of parallelism (both data and thread)
    • The future trend is a further increase in parallelism for both the Intel Xeon processor and the Intel Xeon Phi processor
    • Developers need to prepare software to extract full benefits from this trend




Intel Parallel Studio XE – Optimized Tools to Build Fast Code

Boost your application’s performance with the Intel C++ Compiler and Intel Fortran Compiler for Windows, Linux and OS X. The built-in OpenMP and Intel Cilk Plus parallel models, combined with performance libraries, simplify the implementation of fast, parallel code. Available in 3 editions: Cluster, Professional and Composer.

As processors evolve, it is becoming more and more critical to both vectorize software (use SIMD instructions such as AVX) and thread it to realize the full performance potential of the processor. In some cases, code that is both vectorized and threaded can be more than 175X faster than unthreaded, unvectorized code and about 7X faster than code that is only threaded or only vectorized. That gap is growing with every new processor generation.
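
As a minimal sketch of what "both vectorized and threaded" can look like in practice, the loop below uses the OpenMP combined `parallel for simd` construct to split iterations across threads and then across SIMD lanes. The function name `scale_and_add` is hypothetical, and the example assumes a compiler with OpenMP 4.0 or later enabled (for instance via `-fopenmp` or `-qopenmp`); whether and how well the loop is actually vectorized depends on the compiler and target.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Computes y[i] = a * x[i] + y[i] for every index present in both vectors.
// With OpenMP enabled, iterations are divided among threads and each
// thread's chunk may be executed with SIMD instructions. Without OpenMP the
// pragma is ignored and the loop runs serially with the same result.
void scale_and_add(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::ptrdiff_t n =
        static_cast<std::ptrdiff_t>(std::min(x.size(), y.size()));
    const float* xp = x.data();
    float* yp = y.data();

    #pragma omp parallel for simd
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

Each iteration is independent, which is what makes the loop safe to thread and vectorize at the same time; loops with cross-iteration dependencies would need restructuring first.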