NVIDIA Introduces RTX Inference Platforms for Large Language Models and Generative AI Workloads

At CES, NVIDIA unveiled the GeForce RTX™ SUPER desktop GPUs designed to elevate generative AI performance, alongside new AI laptops from leading manufacturers. Additionally, NVIDIA introduced RTX™-accelerated AI software and tools catering to both developers and consumers.

Leveraging its extensive experience in PC technology, with over 100 million RTX GPUs powering the AI PC era, NVIDIA is expanding its offerings to enhance PC capabilities with generative AI features. This includes NVIDIA TensorRT™ acceleration for the widely used Stable Diffusion XL model, RTX Remix for AI-driven texture tools, and the introduction of NVIDIA ACE microservices alongside DLSS 3 technology in more games.

AI Workbench, a unified toolkit for AI developers, will enter beta later this month. In addition, NVIDIA TensorRT-LLM (TRT-LLM), an open-source library that optimizes inference performance for large language models (LLMs), now supports more pre-optimized models for PCs. Chat with RTX, an NVIDIA tech demo accelerated by TRT-LLM, lets users interact with their own notes, documents, and other content.

Jensen Huang, NVIDIA’s founder and CEO, highlighted the transformative impact of generative AI across industries, emphasizing the significance of running such AI locally on PCs for privacy, latency, and cost reasons. With a vast installed base of AI-ready systems and comprehensive developer tools, NVIDIA aims to drive new experiences and expand the range of AI-enabled applications and games already accelerated by RTX technology.

RTX AI PCs and Workstations

RTX AI-enabled PCs and workstations powered by NVIDIA RTX GPUs offer unparalleled performance, unlocking the full potential of generative AI applications. These GPUs, equipped with Tensor Cores, significantly accelerate AI tasks across a wide range of demanding workloads.

At CES, NVIDIA introduced the new GeForce RTX 40 SUPER Series graphics cards: the GeForce RTX 4080 SUPER, 4070 Ti SUPER, and 4070 SUPER. The GeForce RTX 4080 SUPER, for instance, generates AI video 1.5x faster and images 1.7x faster than the GeForce RTX 3080 Ti GPU. The Tensor Cores in SUPER GPUs deliver up to 836 trillion operations per second, which benefits gaming, content creation, and everyday productivity tasks.

Leading manufacturers such as Acer, ASUS, Dell, HP, Lenovo, MSI, Razer, and Samsung are launching a new lineup of RTX AI laptops, providing users with comprehensive generative AI capabilities. These systems offer performance gains ranging from 20x to 60x compared to using neural processing units and are set to hit the market this month.

Mobile workstations featuring RTX GPUs can leverage NVIDIA AI Enterprise software, including TensorRT and NVIDIA RAPIDS™, for streamlined and secure generative AI and data science development. Every NVIDIA A800 40GB Active GPU includes a three-year license for NVIDIA AI Enterprise, making it an ideal platform for AI and data science projects.

NVIDIA also unveiled new PC developer tools to facilitate AI model creation and customization. NVIDIA AI Workbench, set to launch in beta later this month, provides developers with access to popular repositories like Hugging Face, GitHub, and NVIDIA NGC™. The platform features a user-friendly interface for project reproduction, collaboration, and migration. Additionally, projects can be scaled across various environments, from data centers to public clouds, and seamlessly transitioned back to local RTX systems for inference and customization.

In collaboration with HP, NVIDIA is integrating NVIDIA AI Foundation Models and Endpoints into HP AI Studio, a centralized platform for data science. This integration simplifies AI model development, enabling users to search, import, and deploy optimized models across PCs and the cloud with ease.

Once developers have constructed AI models tailored for PC applications, they can further enhance their performance by leveraging NVIDIA TensorRT to fully utilize the capabilities of RTX GPUs’ Tensor Cores.

NVIDIA has extended TensorRT to text-based applications through TensorRT-LLM for Windows, an open-source library for accelerating LLMs. The latest TensorRT-LLM update, available now, adds Phi-2 to the growing list of pre-optimized models for PC. These models run up to 5x faster than they do on other inference backends.

RTX-Accelerated Generative AI Elevates PC Experiences

At CES, NVIDIA and its developer collaborators are introducing a range of new generative AI-powered applications and services for PCs, including:

1. NVIDIA RTX Remix: Set to launch in beta later this month, this platform offers generative AI tools that can transform basic textures from classic games into stunning, 4K-resolution, physically based rendering materials.

2. NVIDIA ACE Microservices: These microservices include generative AI-powered speech and animation models that let developers add intelligent, dynamic digital avatars to games.

3. TensorRT Acceleration for Stable Diffusion XL (SDXL) Turbo and Latent Consistency Models (LCM): TensorRT speeds up these two popular Stable Diffusion acceleration methods by up to 60% compared with the previous fastest implementation. The updated Stable Diffusion WebUI TensorRT extension now includes acceleration for SDXL, SDXL Turbo, and LCM-LoRA (LCM with Low-Rank Adaptation), along with improved LoRA support.

4. NVIDIA DLSS 3 with Frame Generation: DLSS uses AI to increase frame rates by up to 4x compared with native rendering. It will be featured in a dozen of the 14 new RTX games announced, including Horizon Forbidden West, Pax Dei, and Dragon’s Dogma 2.

5. Chat with RTX: This NVIDIA tech demo, available later this month, lets AI enthusiasts connect a PC LLM to their own data using retrieval-augmented generation (RAG). Accelerated by TensorRT-LLM, the demo lets users interact with their notes, documents, and other content. It will also be released as an open-source reference project so that developers can implement similar capabilities in their own applications.
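
For readers unfamiliar with retrieval-augmented generation, the idea behind the Chat with RTX item above can be sketched in a few lines of Python. This is only an illustration of the general RAG pattern, not NVIDIA's implementation: the function names (tokenize, cosine_similarity, retrieve, build_prompt) and the sample notes are hypothetical, retrieval here uses simple word-count similarity rather than a learned embedding model, and the final step of sending the prompt to a local LLM is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Return the top_k documents whose word counts best match the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, documents, top_k=1):
    """Place the retrieved text ahead of the question, ready for an LLM."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical personal notes standing in for a user's own files.
notes = [
    "The quarterly budget review is scheduled for Friday at 10 am.",
    "Grocery list: eggs, milk, coffee beans, and bread.",
    "The router password is stored in the blue notebook.",
]

prompt = build_prompt("When is the budget review?", notes)
print(prompt)
```

In a real system the prompt would then be passed to a locally running LLM, which answers using the retrieved note instead of relying only on what it learned during training. That is why RAG lets a model answer questions about private documents it has never seen.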
