Over the last year, generative artificial intelligence (AI) has significantly influenced various aspects of our daily lives, including writing, content creation, gaming, learning, and productivity. This advanced technology is rapidly being adopted by PC enthusiasts and developers who are eager to expand its potential and explore new frontiers. Generative AI refers to a type of artificial intelligence capable of creating content such as text, images, and even music, based on the data it has been trained on.
In the world of technology, many revolutionary breakthroughs have originated from humble beginnings, often in a garage. Reflecting this spirit of innovation, NVIDIA has launched the RTX AI Garage series. This initiative aims to provide regular content for developers and tech enthusiasts who are keen to delve into NVIDIA NIM microservices and AI Blueprints. These resources will guide users on building AI agents, creative workflows, digital humans, and productivity applications on AI-enabled PCs. The RTX AI Garage is set to become a hub for innovation and learning.
The first installment of this series coincides with announcements made at the Consumer Electronics Show (CES), introducing new AI foundation models for NVIDIA RTX AI PCs. These models are designed to enhance the capabilities of digital humans, streamline content creation, and boost productivity and development tasks. They represent a significant step forward in making AI more accessible and functional for everyday users.
The newly announced models are offered as NVIDIA NIM microservices and are accelerated by the new GeForce RTX 50 Series GPUs. Built on the NVIDIA Blackwell architecture, these GPUs deliver up to 3,352 trillion AI operations per second (TOPS) and up to 32GB of VRAM, and they add support for FP4 compute, which can roughly double AI inference performance and allows generative AI models to run locally in a smaller memory footprint.
NVIDIA has also introduced NVIDIA AI Blueprints, which are essentially ready-to-use, preconfigured workflows built on NIM microservices. These blueprints are particularly useful for applications such as creating digital humans and generating content. The integration of NIM microservices and AI Blueprints enables developers and enthusiasts to quickly build, test, and deploy AI-powered experiences on PCs, paving the way for a new wave of practical and innovative capabilities for users.
### Fast-Track AI With NVIDIA NIM
Bringing AI advancements to personal computers presents two primary challenges. Firstly, the pace at which AI research is advancing is incredibly fast, with new models emerging daily on platforms like Hugging Face. This means that what is cutting-edge today can quickly become obsolete. Secondly, adapting these AI models for PC use involves a complex and resource-intensive process. It requires optimizing the models for PC hardware, integrating them with AI software, and connecting them to applications, which demands significant engineering effort.
NVIDIA NIM helps overcome these challenges by providing state-of-the-art AI models that are prepackaged and optimized for PCs. These NIM microservices cover various model domains and can be installed with a single click. They come with application programming interfaces (APIs) that make integration seamless and leverage NVIDIA AI software and RTX GPUs for accelerated performance.
During CES, NVIDIA announced a pipeline of NIM microservices for RTX AI PCs. These microservices support a range of use cases including large language models (LLMs), vision-language models, image generation, speech recognition, retrieval-augmented generation (RAG), PDF extraction, and computer vision.
The newly introduced Llama Nemotron family of open models is designed to deliver high accuracy across a wide range of tasks. The Llama Nemotron Nano model, available as a NIM microservice for RTX AI PCs and workstations, excels in tasks such as instruction following, function calls, chat, coding, and mathematical computations.
Developers will soon be able to download and run these microservices on Windows 11 PCs using the Windows Subsystem for Linux (WSL). This provides a streamlined process for developers to utilize cutting-edge AI models without the need for extensive backend integration and optimization efforts.
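Once a NIM microservice is running locally (under WSL or elsewhere), a few lines of Python can confirm it is reachable. This is only an illustrative sketch: it assumes the container exposes its OpenAI-compatible API on port 8000, the default in NVIDIA's NIM documentation, and that the `requests` package is installed.

```python
# Quick check that a locally running NIM microservice is reachable.
# Assumes an OpenAI-compatible API served on port 8000 (the documented default);
# adjust the host and port to match your own deployment.
import requests

resp = requests.get("http://localhost:8000/v1/models", timeout=5)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])  # each deployed model reports its identifier here
```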
To showcase the potential of NIM in building AI agents and assistants, NVIDIA previewed Project R2X. This is a vision-enabled PC avatar capable of providing users with information at their fingertips, assisting with desktop applications and video conference calls, reading and summarizing documents, and more.
By leveraging NIM microservices, AI enthusiasts can bypass the complexities associated with model curation, optimization, and backend integration. This allows them to focus on creating innovative applications using the latest AI models.
### Understanding APIs
An API, or application programming interface, is the way an application communicates with a software library: it defines the set of “calls” the application can make and the responses it can expect in return. Traditional AI APIs often require extensive setup and configuration, which makes AI capabilities harder to use and slows innovation.
NIM microservices offer intuitive APIs that simplify this process. Applications can easily send requests to these APIs and receive responses, making it easier to integrate AI functionalities. These APIs are designed to handle different model types based on their input and output media. For instance, large language models (LLMs) process text input to generate text output, image generators transform text into images, and speech recognizers convert speech into text.
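To illustrate what that request-and-response pattern looks like for an LLM microservice, the sketch below uses the standard `openai` Python client pointed at a locally hosted, OpenAI-compatible endpoint. The port and model identifier are placeholders, not confirmed values; replace them with whatever the installed microservice reports.

```python
# Minimal sketch: send a text prompt to a locally hosted LLM microservice and
# print the text it returns. Assumes an OpenAI-compatible endpoint on port 8000;
# the model id below is a placeholder, not a confirmed NIM identifier.
from openai import OpenAI  # pip install openai

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama-nemotron-nano",  # placeholder -- use the id reported by /v1/models
    messages=[{"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}],
    max_tokens=150,
)
print(response.choices[0].message.content)  # text in, text out
```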
The microservices are compatible with leading AI development and agent frameworks such as AI Toolkit for VSCode, AnythingLLM, ComfyUI, Flowise AI, LangChain, Langflow, and LM Studio. Developers can conveniently download and deploy them from NVIDIA’s build site (build.nvidia.com).
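Because the microservices speak standard APIs, framework integration is largely a matter of pointing the framework at the local endpoint. The sketch below shows one plausible way to do this with LangChain's OpenAI-compatible chat wrapper; the endpoint, key, and model name are assumptions rather than verified values.

```python
# Illustrative LangChain integration with a locally hosted microservice.
# Assumes `langchain-openai` is installed and an OpenAI-compatible API is
# being served on port 8000; the model name is a placeholder.
from langchain_openai import ChatOpenAI  # pip install langchain-openai

llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
    model="llama-nemotron-nano",  # placeholder model id
)

print(llm.invoke("Summarize what an AI Blueprint is in one sentence.").content)
```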
By bringing these APIs to RTX, NVIDIA NIM is set to accelerate AI innovation on PCs, enabling enthusiasts to explore a wide range of AI applications through an upcoming release of the NVIDIA ChatRTX tech demo.
### A Blueprint for Innovation
With the use of cutting-edge models that are prepackaged and optimized for PCs, developers and enthusiasts can swiftly create AI-powered projects. They can further enhance these projects by combining multiple AI models and functionalities to build complex applications such as digital humans, podcast generators, and application assistants.
NVIDIA AI Blueprints, which are built on NIM microservices, provide reference implementations for complex AI workflows. These blueprints assist developers in connecting various components, including libraries, software development kits, and AI models, into a single cohesive application.
AI Blueprints come with everything a developer needs to build, run, customize, and extend the reference workflow. This includes the reference application and source code, sample data, and documentation for customizing and orchestrating the different components.
During CES, NVIDIA introduced two AI Blueprints for RTX: one for converting PDFs to podcasts, allowing users to generate a podcast from any PDF, and another for 3D-guided generative AI, based on the FLUX.1 [dev] model, which offers artists greater control over text-based image generation.
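To make the idea concrete, here is a deliberately simplified sketch of the same kind of workflow, not the blueprint's actual implementation: extract text from a PDF, ask a locally hosted LLM for a short two-host script, and then hand that script to a text-to-speech model of your choice. The endpoint, model id, and file name are all placeholders.

```python
# Conceptual sketch only -- NOT the NVIDIA blueprint's implementation.
# Pull text from a PDF, ask a locally hosted LLM for a two-host podcast script,
# then pass the script to a speech model. Assumes an OpenAI-compatible endpoint
# on port 8000; the model id and file name are placeholders.
from pypdf import PdfReader   # pip install pypdf
from openai import OpenAI     # pip install openai

reader = PdfReader("paper.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
script = client.chat.completions.create(
    model="llama-nemotron-nano",  # placeholder model id
    messages=[{
        "role": "user",
        "content": "Turn this document into a short two-host podcast script:\n" + text[:8000],
    }],
).choices[0].message.content

print(script)  # a text-to-speech microservice would turn this script into audio
```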
With AI Blueprints, developers can transition from AI experimentation to AI development for advanced workflows on RTX PCs and workstations, fostering innovation and creativity.
### Built for Generative AI
The new GeForce RTX 50 Series GPUs are specifically designed to address the challenges of generative AI, featuring fifth-generation Tensor Cores with FP4 support, faster GDDR7 memory, and an AI-management processor for efficient multitasking between AI and creative workflows.
The addition of FP4 support in the GeForce RTX 50 Series improves performance and allows more models to fit on PCs. FP4 is a lower-precision quantization format that, much like file compression, shrinks model sizes. Compared with FP16, the default precision for most models, FP4 uses less than half the memory, and the 50 Series GPUs deliver over twice the performance of the previous generation. This comes with minimal loss in quality, thanks to the advanced quantization methods in the NVIDIA TensorRT Model Optimizer.
For instance, Black Forest Labs’ FLUX.1 [dev] model, which requires over 23GB of VRAM at FP16, can only be supported by the GeForce RTX 4090 and professional GPUs. With FP4, the model needs less than 10GB, allowing it to run locally on more GeForce RTX GPUs.
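A rough back-of-the-envelope calculation shows why the precision change matters. Assuming FLUX.1 [dev] has on the order of 12 billion parameters, the weights alone take about 22 GiB at FP16 (2 bytes per parameter) but only about 6 GiB at FP4 (0.5 bytes per parameter); text encoders and runtime overhead account for the rest of the figures quoted above.

```python
# Back-of-the-envelope VRAM estimate for model weights at different precisions.
# Assumes roughly 12 billion parameters for FLUX.1 [dev]; real-world usage adds
# text encoders, activations, and framework overhead on top of these numbers.
params = 12e9
bytes_per_param = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

for fmt, nbytes in bytes_per_param.items():
    gib = params * nbytes / 1024**3
    print(f"{fmt}: ~{gib:.1f} GiB for weights alone")
```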
With a GeForce RTX 4090 using FP16, the FLUX.1 [dev] model can generate images in 15 seconds with 30 steps. However, with a GeForce RTX 5090 using FP4, the same images can be generated in just over five seconds.
### Getting Started With New AI APIs for PCs
NVIDIA NIM microservices and AI Blueprints are anticipated to be available starting next month, with initial hardware support for GeForce RTX 50 Series, GeForce RTX 4090 and 4080, and NVIDIA RTX 6000 and 5000 professional GPUs. Additional GPUs are expected to be supported in the future.
NIM-ready RTX AI PCs are projected to be available from manufacturers such as Acer, ASUS, Dell, GIGABYTE, HP, Lenovo, MSI, Razer, and Samsung. Local system builders such as Corsair, Falcon Northwest, LDLC, Maingear, Mifcom, Origin PC, PCS, and Scan are also expected to offer these systems.
The GeForce RTX 50 Series GPUs and laptops are set to deliver game-changing performance, power transformative AI experiences, and enable creators to complete workflows in record time. For more insights into NVIDIA’s AI innovations unveiled at CES, you can rewatch NVIDIA CEO Jensen Huang’s keynote on YouTube. For additional details, you can refer to NVIDIA’s official website.