27.03.2024 - 15:33 / wccftech.com / Hassan Mujtaba
NVIDIA continues to push the AI envelope with its strong TensorRT-LLM suite, boosting the H200 GPUs to new heights in the latest MLPerf v4.0 results.
Generative AI, or GenAI, is an emerging market, and every hardware manufacturer is trying to grab a slice of the cake. Despite their best efforts, NVIDIA has so far taken the bulk of the share, and there's no stopping the green giant: it has posted strong benchmarks and set new records in the MLPerf v4.0 inference results.
Fine-tuning of TensorRT-LLM has been ongoing ever since the AI software suite was released last year. We saw a major increase in performance with the previous MLPerf v3.1 results, and now with MLPerf v4.0, NVIDIA is supercharging Hopper's performance. Inference matters because it accounted for 40% of the data center revenue generated last year. Inference workloads range from LLMs (Large Language Models) to visual content and recommenders. As these models grow in size, they become more complex and demand both strong hardware and strong software.
That's where TensorRT-LLM comes in: a state-of-the-art inference compiler that is co-designed with the NVIDIA GPU architectures. Some features of TensorRT-LLM include:
Using the latest TensorRT-LLM optimizations, NVIDIA has managed to squeeze out an additional 2.9x performance from its Hopper GPUs (such as the H100) in MLPerf v4.0 versus MLPerf v3.1. In today's benchmark results, NVIDIA has set new performance records in MLPerf Llama 2 (70 Billion), with up to 31,712 tokens generated per second on the H200 (Preview) and 21,806 tokens generated per second on the H100.
It should be mentioned that the H200 GPU was benchmarked about a month ago, which is why it's listed in the preview state, but NVIDIA has stated that it is now shipping these GPUs to customers.
The NVIDIA H200 GPU offers an additional 45% performance gain in Llama 2 versus the H100 GPUs, thanks to its higher memory capacity of 141 GB HBM3E and faster bandwidth of up to 4.8 TB/s. Meanwhile, the H200 is a behemoth against Intel's Gaudi 2, the only other competing solution submitted to the MLPerf v4.0 benchmarks, while the H100 also comes in with a massive 2.7x gain.
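As a quick sanity check, the 45% figure can be reproduced from the two Llama 2 70B throughput numbers quoted above. The short sketch below is only illustrative arithmetic; the constant and function names are our own and are not part of any NVIDIA or MLPerf tooling.

```python
# Throughput figures quoted in this article for MLPerf v4.0 Llama 2 70B
# (tokens generated per second). The names below are hypothetical.
H200_TOKENS_PER_SEC = 31_712
H100_TOKENS_PER_SEC = 21_806


def relative_gain(new: float, baseline: float) -> float:
    """Return the fractional gain of `new` over `baseline` (0.45 means 45%)."""
    return new / baseline - 1.0


gain = relative_gain(H200_TOKENS_PER_SEC, H100_TOKENS_PER_SEC)
print(f"H200 vs H100: {gain:.1%} more tokens per second")
```

The result lands at roughly 45%, in line with the gain NVIDIA quotes for the H200 over the H100.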
In addition to these,
Microcenter's Trade-In system seems to be lowballing users by offering lower value for their GPUs such as the GeForce RTX 4090 with one of our readers receiving less than half the value of the said graphics card.
AMD's Strix Point APUs featuring the RDNA 3+ iGPU should offer comparable performance to entry-level discrete GPUs as indicated in new performance rumors.
Intel has finally revealed its next-gen AI Accelerator, the Gaudi 3, based on a 5nm process node and competing directly against NVIDIA's H100 GPUs.
NVIDIA's next-gen GeForce RTX 5090 & RTX 5080 "Blackwell" GPUs are rumored to launch in the fourth quarter of 2024.
NVIDIA's board partners are reportedly increasing the prices of various GeForce RTX 40 & RTX 30 GPUs in China which is a stark contrast to what's happening in the US markets.
AMD's RDNA 4-based Navi 48 GPU is more or less confirmed now in the latest ROCm commits & will be powering next-gen Radeon RX 8000 series graphics cards.
New rumors surrounding AMD's next-gen RDNA 4 "Radeon RX 8000" GPUs and their performance positioning have been posted by @All_The_Watts.
A few weeks back, PGL announced its hardware of choice for its upcoming CS2 Major Tournament, which included systems with AMD Ryzen 7 7800X3D CPUs and NVIDIA GeForce RTX 4080 GPUs. It looks like things didn't go as planned, though, as a driver crash associated with the GPU ended up costing one team its chance of reaching the playoffs.
Intel has just released its latest MLPerf v4.0 performance figures covering the Gaudi 2 Accelerators & 5th Gen Xeon "Emerald Rapids" CPUs, with the former showcasing strong performance per dollar values against NVIDIA's H100 GPU.
The newest pictures of Intel's next-gen Lunar Lake-MX CPU powering thin & light platforms have been leaked by Igor's Lab.
The much-awaited Snapdragon X Elite CPU platform has been demoed running Baldur's Gate 3 at 30 FPS as a demonstration of its gaming GPU performance.
Intel's next-generation Battlemage "Xe2-HPG" GPUs have potentially been spotted within the SiSoftware Sandra database.