Nvidia's new Tesla GV100 whitepaper details the specifications of the upcoming Volta chips, and by the looks of it they'll pack quite a performance punch. First off, the Tesla V100 is physically very big. A while back, Nvidia showcased one of these chips and its die size was around 815mm², making it one of the biggest GPUs of all time.
Nvidia also simultaneously announced the Tesla V100 PCIe card, which has 5120 CUDA cores and around 7 TFLOPS of double-precision compute performance. The Tesla V100 card was reported to have a 1370 MHz boost clock, a 4096-bit memory interface with 16GB of HBM2 VRAM, and a 6 MB L2 cache. The card will be available at retail later this year, 2017.
Alongside that you have 640 Tensor Cores, 900GB/s of memory bandwidth and 112 TFLOPS of deep-learning compute. But that is not the only thing: the whitepaper details that the full GV100 chip houses up to 5,376 CUDA cores, so even the V100 isn't using the chip to its full potential. There are around 21 billion transistors in the V100 chip, and its maximum TDP is around 250W.
This is a massive performance boost over last year's P100, which was based on the Pascal architecture. In comparison to the V100, the P100 has no Tensor Cores and only 3584 CUDA cores, with 4.7 TFLOPS of double-precision compute performance on the PCIe card.
The Tesla V100 is based on the new GV100 chip, which has a bigger die at 815mm² compared to the GP100's 610mm². However, the TDP remains the same (250W for the PCIe cards; the NVLink boards compared in the table below are rated at 300W), thanks to the TSMC 12nm FFN manufacturing process used for the new chip.
| Nvidia Product | Nvidia Tesla P100 | Nvidia Tesla V100 |
| --- | --- | --- |
| GPU | GP100 Pascal | GV100 Volta |
| Release Date | April 2016 | Late 2017 |
| FP32 Cores / SM | 64 | 64 |
| FP32 Cores / GPU | 3584 | 5120 |
| FP64 Cores / SM | 32 | 32 |
| FP64 Cores / GPU | 1792 | 2560 |
| Tensor Cores / SM | NA | 8 |
| Tensor Cores / GPU | NA | 640 |
| GPU Boost Clock | 1480 MHz | 1462 MHz |
| Peak FP32 TFLOPS | 10.6 | 15 |
| Peak FP64 TFLOPS | 5.3 | 7.5 |
| Peak Tensor TFLOPS | NA | 120 |
| Memory Interface | 4096-bit HBM2 | 4096-bit HBM2 |
| Memory Size | 16 GB | 16 GB |
| L2 Cache Size | 4096 KB | 6144 KB |
| Shared Memory Size / SM | 64 KB | Configurable up to 96 KB |
| Register File Size / SM | 256 KB | 256 KB |
| Register File Size / GPU | 14336 KB | 20480 KB |
| TDP | 300 Watts | 300 Watts |
| Transistors | 15.3 billion | 21.1 billion |
| GPU Die Size | 610 mm² | 815 mm² |
| Manufacturing Process | 16 nm FinFET+ | 12 nm FFN |
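If you're wondering where those peak TFLOPS figures come from, here's a quick back-of-the-envelope check (a sketch based on Nvidia's published numbers: each FP32 core does one fused multiply-add, i.e. 2 FLOPs, per clock, and each Tensor Core does 64 FMAs per clock):

```python
# Peak throughput = cores x FLOPs per core per clock x boost clock.
def peak_tflops(cores, boost_mhz, flops_per_core=2):
    return cores * flops_per_core * boost_mhz * 1e6 / 1e12

print(round(peak_tflops(3584, 1480), 1))      # P100 FP32: ~10.6 TFLOPS
print(round(peak_tflops(5120, 1462), 1))      # V100 FP32: ~15.0 TFLOPS
print(round(peak_tflops(640, 1462, 64 * 2)))  # V100 Tensor: ~120 TFLOPS
```

The results line up with the table above, which suggests the quoted figures are simply cores times clock times operations per clock.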
This latest tech from Nvidia will also feature a new Volta Streaming Multiprocessor (SM) that is meant to deliver improvements in energy efficiency, ease of programming and performance, using a new partitioning method to improve utilisation. It also merges the shared memory and L1 cache resources, enabling swifter transactions thanks to a shared memory capacity of up to 96KB per Volta SM, up from the 64KB found in the GP100 Pascal predecessor.
And so to the question on all our lips: what on earth is a Tensor Core? What is this new Nvidia Tensor Core technology they say will make a difference to my gaming hardware performance?
Well, the new Tensor Cores found in the Volta GV100 SMs give the architecture the performance required to train large neural networks. Each Tensor Core performs 64 fused multiply-add operations per clock, and the Volta GV100 has 640 Tensor Cores. With the mention of neural networks and floating-point operations I'm going to go no further, as this is a big area. In simple gaming-performance terms, Tensor Core technology provides an additional layer of processing capability to the GPU every clock cycle that helps specifically with matrix multiplication, which is what neural networks require. Nvidia are saying that "Volta's mixed-precision Tensor Cores boost performance by more than 9 times compared to the P100 Pascal".
Here you go: take a look at this Pascal and Volta 4x4 matrix multiplication picture to help clarify all that extra GPU performance offered by the Tensor Cores. Just look at all those extra green blobs down there. Brilliant.
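For the curious, the per-clock Tensor Core operation (D = A x B + C on 4x4 matrices) can be sketched in plain Python. This is a toy illustration, not real GPU code: in hardware the A and B inputs are FP16 and the accumulation is done in FP32.

```python
# D = A x B + C on 4x4 matrices: the operation one Tensor Core
# completes each clock. The triple loop below performs
# 4 x 4 x 4 = 64 multiply-adds, matching the 64 FMA ops per clock.
def tensor_core_fma(A, B, C):
    return [[C[i][j] + sum(A[i][k] * B[k][j] for k in range(4))
             for j in range(4)] for i in range(4)]

I = [[1 if i == j else 0 for j in range(4)] for i in range(4)]  # identity
Z = [[0] * 4 for _ in range(4)]                                 # zeros
B = [[i + j for j in range(4)] for i in range(4)]
print(tensor_core_fma(I, B, Z) == B)  # identity x B + 0 gives back B
```

With 640 Tensor Cores each doing this every cycle, you can see where the headline deep-learning throughput comes from.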
If you are interested in more detail on the new Volta architecture, take a look at Nvidia's whitepaper for the in-depth Tesla V100 details. In the meantime, what are your thoughts on the new card? Are you impressed by the year-on-year improvements?