IoT Worlds

Nvidia Eos | A Big Deal For Cloud Builders and Hyperscalers for AI-IoT

Nvidia’s new supercomputer, Eos, is a big deal for cloud builders and hyperscalers alike. It is built on the Hopper GPU architecture, with each of its DGX H100 nodes delivering 32 petaflops of AI performance, and it ties into Nvidia’s Omniverse Cloud platform.

Nvidia’s new supercomputer

Nvidia’s new supercomputer Eos could make it hard for rivals such as IBM to compete in the supercomputer market. As an all-flash system, it will need hundreds of petabytes of storage capacity, along with a second-tier capacity store, which means candidate storage vendors will have to devote substantial engineering resources to Eos.

The Eos supercomputer is based on the H100 GPU, the first chip built on Nvidia’s new “Hopper” architecture. It is set to debut within the year, according to Ian Buck, Nvidia’s vice president of hyperscale and HPC. It will also power Nvidia’s own research projects.

Eos is designed for exaflop-scale AI throughput. It is expected to surpass Perlmutter, one of the world’s fastest AI supercomputers, which delivers roughly four exaflops of AI performance. The Department of Energy’s Perlmutter uses 6,159 Nvidia A100 GPUs and 1,536 AMD Epyc CPUs. Facebook/Meta is also building a supercomputer based on third-generation Nvidia DGX A100 systems.

The Nvidia Venado supercomputer is expected to arrive in 2023. It is billed as the first supercomputer to feature an all-Nvidia architecture, and the first of several all-Nvidia high-performance computers to come. It will incorporate Grace Hopper Superchips, which combine a Grace CPU and a Hopper GPU in a single module, alongside Grace CPU-only Superchips.

The Eos supercomputer will be capable of up to 18.4 exaflops of AI performance, according to Nvidia, and is expected to be operational around the end of 2022. It will be built on Nvidia’s new Hopper architecture as a DGX SuperPOD comprising 576 DGX H100 systems.


Hopper GPU architecture

The Hopper GPU architecture used in NVIDIA DGX systems allows a single GPU to be partitioned into up to seven independent instances (Multi-Instance GPU, or MIG), enabling efficient resource allocation and multi-tenancy. H100 also moves to PCIe Gen 5 and supports direct GPU data paths that bypass the CPU, making it well suited to high-throughput workloads such as intelligent video analytics.
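On MIG-capable GPUs, this partitioning is driven from the `nvidia-smi` command-line tool. A sketch of the workflow (must be run as root on a MIG-capable host; the device index and the `1g.10gb` profile name are illustrative and vary by GPU model):

```shell
# Enable MIG mode on GPU 0 (a GPU reset may be required afterwards).
nvidia-smi -i 0 -mig 1

# List the GPU-instance profiles this GPU supports.
nvidia-smi mig -lgip

# Carve GPU 0 into seven 1g.10gb instances, creating a compute
# instance in each (-C).
nvidia-smi mig -i 0 -cgi 1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb -C

# Each MIG device now appears with its own UUID, usable by a separate tenant.
nvidia-smi -L
```

Each resulting instance has its own memory, cache, and compute slices, which is what makes the per-tenant isolation mentioned above possible.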

The H100 GPU is based on the next-generation Hopper architecture. It offers a massive boost in AI performance over the A100 and allows for faster training of deep learning models. However, the new GPU is power-hungry and its SXM variant requires a custom HGX motherboard.

Hopper adds support for the new FP8 tensor format, with a peak rate of roughly 4 petaflops per GPU – about six times the FP16 throughput of the Ampere-based A100. It also introduces the Hopper Transformer Engine, which dynamically adjusts precision as it processes the layers of a Transformer network. Finally, Hopper supports second-generation multi-instance operation with multi-tenancy in the cloud, allowing a single GPU to serve up to seven cloud tenants.
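FP8 trades precision for throughput: the E4M3 variant (4 exponent bits, 3 mantissa bits) can represent only a few hundred distinct values, so every tensor element must be rounded to the nearest representable code. A minimal Python sketch of the format (assuming the NVIDIA/OCP-style E4M3 encoding, in which the all-ones exponent still encodes normal numbers and only the all-ones bit pattern is NaN):

```python
def e4m3_values():
    """Enumerate all finite values representable in FP8 E4M3."""
    vals = set()
    for sign in (1.0, -1.0):
        for exp in range(16):          # 4 exponent bits, bias 7
            for man in range(8):       # 3 mantissa bits
                if exp == 15 and man == 7:
                    continue           # S.1111.111 is reserved for NaN
                if exp == 0:           # subnormals: (man/8) * 2^-6
                    v = sign * (man / 8) * 2.0 ** -6
                else:                  # normals: (1 + man/8) * 2^(exp-7)
                    v = sign * (1 + man / 8) * 2.0 ** (exp - 7)
                vals.add(v)
    return sorted(vals)

def quantize_e4m3(x, table=None):
    """Round x to the nearest representable E4M3 value."""
    table = table or e4m3_values()
    return min(table, key=lambda v: abs(v - x))

vals = e4m3_values()
print(len(vals))           # 253 distinct finite values
print(max(vals))           # largest magnitude: 448.0
print(quantize_e4m3(0.3))  # 0.3 rounds to 0.3125
```

The tiny value table is the reason Hopper’s Transformer Engine rescales activations layer by layer: without per-layer scaling, most real-world values would fall outside E4M3’s narrow range.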

Hopper is a major technological leap for accelerated computing. It can scale diverse workloads across data centers, from exascale HPC to trillion-parameter AI, and the H100 packs 80 billion transistors alongside five breakthrough innovations, including the Transformer Engine and second-generation MIG. Nvidia claims up to a 30x inference speed-up over the previous generation, demonstrated on the Megatron 530B chatbot – one of the largest generative language models in the world.

The Hopper GPU architecture is named after Grace Hopper, the pioneering American computer scientist. The Grace Hopper Superchip pairs a Grace CPU with a Hopper GPU, giving the GPU access to 600GB of memory over a 900-gigabyte-per-second coherent interconnect. This combination is aimed at giant-scale HPC and AI systems.

32 petaflops of AI performance

Nvidia is building a new supercomputer whose nodes each deliver 32 petaflops of AI performance. It is the first system built with the new GPUs designed to handle the massive workloads of AI research. Each DGX H100 system contains eight NVIDIA H100 GPUs connected by NVLink and provides up to 32 petaflops of AI performance – more than six times that of the previous generation.

Nvidia has also unveiled the design of Eos itself, which will serve as a blueprint for advanced AI infrastructure, both for Nvidia’s own research and for cloud partners. It will feature 576 DGX H100 systems with 4,608 H100 GPUs, delivering 18.4 exaflops of AI performance, and will support secure multi-tenant isolation.

The NVIDIA H100 Tensor Core GPU delivers unprecedented performance and scalability. It is capable of supporting trillion-parameter language models. The GPU also features NVIDIA Hopper architecture, which helps AI models process massive amounts of unlabeled data. NVIDIA DGX systems also use NVIDIA’s Base Command software suite, which includes enterprise-grade orchestration and libraries to accelerate compute, storage, and network infrastructure. Its system software is optimized for modern cloud native workloads.

The DGX H100 AI infrastructure systems are powered by the NVIDIA H100 chip. Together, the eight H100 GPUs in each system deliver up to 32 petaflops of AI performance using the new FP8 precision. DGX H100 systems can in turn be grouped into “SuperPODs” containing 256 H100 GPUs.
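The headline figures quoted above are easy to sanity-check with a little arithmetic (assuming Nvidia’s quoted figure of roughly 4 petaflops of FP8 throughput per H100, with sparsity):

```python
# Back-of-the-envelope check of the Eos numbers: per-node petaflops,
# total GPU count, and total exaflops.
H100_FP8_PFLOPS = 4        # assumed per-GPU FP8 peak (with sparsity)
GPUS_PER_DGX = 8           # H100 GPUs in one DGX H100 system
DGX_SYSTEMS = 576          # DGX H100 systems in Eos

node_pflops = GPUS_PER_DGX * H100_FP8_PFLOPS      # petaflops per system
total_gpus = DGX_SYSTEMS * GPUS_PER_DGX           # GPUs in the full machine
total_eflops = DGX_SYSTEMS * node_pflops / 1000   # petaflops -> exaflops

print(node_pflops)   # 32 petaflops per DGX H100
print(total_gpus)    # 4608 GPUs
print(total_eflops)  # ~18.4 exaflops
```

The three printed values line up with the 32-petaflop-per-node, 4,608-GPU, and 18.4-exaflop figures Nvidia quotes for Eos.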

Omniverse Cloud platform

The new Omniverse Cloud platform from NVIDIA offers a suite of cloud services for creative professionals who need to create, share, and collaborate on 3D designs. The platform can stream to a wide range of devices, enabling users to create and collaborate from any location. Omniverse Create, for example, lets designers build large scenes quickly and easily, without having to transfer huge datasets across different machines.

Omniverse is being used by a wide variety of companies to enhance their creative workflows and pipelines. Among them are Amazon, DB Netze, DNEG, Kroger, Lowe’s, and PepsiCo. These organizations are using the cloud platform to create photorealistic digital twins and to train AI robots. Another Nvidia customer, Siemens Gamesa Renewable Energy, is using the Omniverse to build a physics-informed digital model of wind farms.

The NVIDIA Omniverse Cloud platform powers both the creation and editing of 3D scenes, and enables collaboration and access from anywhere via the web or mobile devices. Creators can use the Omniverse Create app to build and edit 3D worlds in real time, while the companion “View” app lets non-technical users easily view and inspect scenes.

Omniverse Cloud also enables non-RTX users to stream Omniverse View and Create using GeForce NOW. This means that developers can access the platform from any location without having to upgrade their IT infrastructure. It also enables designers to collaborate with remote collaborators as easily as if they were in the same studio.

The Omniverse Cloud platform builds on more than 20 years of NVIDIA’s rendering technologies and AI and simulation SDKs. It also includes over 300 pre-built extensions that let developers extend the platform’s functionality. The platform supports a variety of 3D workflows through the Universal Scene Description (USD) framework, originally developed by Pixar Animation Studios, which lets teams combine assets, iterate on design concepts in real time, and share high-fidelity models.
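USD scenes can be authored as plain text, which is part of what makes the format easy to compose and diff across tools. A minimal hand-written example of the ASCII `.usda` form (illustrative only; the prim names and values here are made up):

```usda
#usda 1.0
(
    defaultPrim = "World"
)

def Xform "World"
{
    def Sphere "Ball"
    {
        double radius = 0.5
        color3f[] primvars:displayColor = [(0.1, 0.4, 0.8)]
    }
}
```

Tools that speak USD can layer files like this over one another, which is the mechanism behind the real-time, multi-user iteration Omniverse advertises.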

Storage provided by Nvidia

Nvidia is preparing Eos as an in-house system, and its storage demands are substantial: hundreds of petabytes of all-flash capacity, plus a likely secondary capacity store, which means candidate storage vendors will need to invest significant engineering resources to support it.

The Eos will use 360 NVLink switches to transfer data from one GPU to another. It will also feature ConnectX-7 Quantum-2 InfiniBand networking adapters, which can handle 400Gb/sec. Nvidia has partnered with a number of companies to provide storage hardware and software for the Eos system.

The Eos supercomputer is expected to provide 18.4 exaflops of AI computing performance – by Nvidia’s reckoning, four times the AI performance of Japan’s Fugaku, built by Fujitsu and RIKEN and the fastest supercomputer on the Top500 list at the time. Eos is also expected to deliver 275 petaflops of traditional scientific computing.

The Eos supercomputer is expected to become operational later this year. Its 576 DGX H100 systems and 4,608 H100 GPUs will serve as a benchmark and an example for future AI infrastructure, and the design is intended to scale easily to meet the needs of AI applications.
