Microsoft and Nvidia Collaborate to Streamline AI Model Deployment on Windows

Microsoft and Nvidia are collaborating to simplify AI model development on Windows PCs. At its Ignite event, Microsoft introduced Windows AI Studio, a hub where developers can access and customize AI models. It integrates tools and models from Azure AI Studio and services such as Hugging Face, and offers a guided workspace setup for configuring small language models (SLMs) such as Microsoft’s Phi, Meta’s Llama 2, and Mistral. The studio supports performance testing through Prompt Flow and Gradio templates and will ship as a Visual Studio Code extension in the coming weeks.

In parallel, Nvidia updated TensorRT-LLM for Windows, enabling efficient execution of large language models on GeForce RTX 30 and 40 Series GPUs with 8GB of VRAM or more. The update also makes TensorRT-LLM compatible with OpenAI’s Chat API through a new wrapper, allowing LLMs to run locally on the PC for improved privacy.

Microsoft’s overarching goal is to establish a “hybrid loop” development pattern, in which AI workloads can move between the cloud and local devices, giving developers flexibility over where their models run.
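In practice, an OpenAI-compatible wrapper means existing Chat API clients can target a local TensorRT-LLM server simply by pointing requests at a local URL instead of OpenAI’s servers. The sketch below illustrates the idea using only the Python standard library; the host, port, endpoint path, and model name are assumptions for illustration, not values documented by Nvidia.

```python
import json
from urllib.request import Request

# Hypothetical local endpoint exposed by an OpenAI-compatible wrapper;
# host, port, and path are assumptions, not documented values.
LOCAL_BASE_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "llama-2-7b-chat") -> Request:
    """Build an OpenAI Chat API-style request aimed at a local server.

    The payload shape (model + messages list of role/content dicts)
    follows the OpenAI Chat Completions format the wrapper emulates.
    """
    payload = {
        "model": model,  # model identifier is an assumption
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        LOCAL_BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Construct (but do not send) a request; sending would require a
# running local inference server.
req = build_chat_request("Summarize the Ignite announcements.")
```

Because the request format matches OpenAI’s, switching an application between a cloud endpoint and a local one reduces to changing the base URL, which is what enables the privacy benefit of keeping prompts on the PC.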