21st October 2024, 13:07 | #1
[M] Reviewer
Join Date: May 2010
Location: Romania
Posts: 153,292
Meta Shows Open-Architecture NVIDIA "Blackwell" GB200 System for Data Center

During the Open Compute Project (OCP) Summit 2024, Meta, one of the principal members of OCP, showed the NVIDIA "Blackwell" GB200 systems built for its massive data centers. We previously covered Microsoft's Azure server rack with GB200 GPUs, which dedicates one-third of the rack space to compute and two-thirds to cooling. A few days later, Google showed off its smaller GB200 system, and today Meta is presenting its own GB200 system, the smallest of the bunch. To train a dense transformer large language model with 405 billion parameters and a context window of up to 128K tokens, such as Llama 3.1 405B, Meta had to redesign its data center infrastructure to run a distributed training job across two clusters of 24,000 GPUs each. That is 48,000 GPUs used to train a single AI model.

https://www.techpowerup.com/327856/m...or-data-center