Microsoft Azure AI: GPT-4.5, Phi-4, Stability AI, o3-mini, GPT-4o-Audio and Realtime

Effective March 2025, various new AI models have been introduced to Microsoft Azure AI Foundry. These include OpenAI GPT-4.5, Microsoft Phi-4-multimodal, Microsoft Phi-4-mini, Stability AI’s latest generative imaging models, and OpenAI o3-mini. Also, Cohere ReRank v3.5, OpenAI GPT-4o-Audio-Preview and GPT-4o-Realtime-Preview have been added. Let’s explore these new LLMs in detail and find out, if they are the right fit for your requirements.

 

Models in Detail

OpenAI GPT-4.5

OpenAI GPT-4.5 (preview) is the latest and most advanced general-purpose model developed by OpenAI. It builds on the success of previous models, offering enhanced capabilities in coding, writing, and problem-solving tasks. GPT-4.5 is designed to provide a more natural interaction experience, with a broader knowledge base and improved emotional intelligence (EQ).

  • Natural Interaction: Offers a more natural interaction experience with a broader knowledge base.
  • Accuracy and Hallucinations: Lower hallucination rate (37.1% vs. 61.8%) and higher accuracy (62.5% vs. 38.2%) compared to GPT-4.
  • Stronger Human Alignment: Enhanced alignment techniques improve the ability to follow instructions, understand nuances, and engage in natural conversations.

It is supposed to be the last non-reasoning model of OpenAI. OpenAI will focus on reasoning models of the o-series with models such as o3-mini.

OpenAI o3-mini

OpenAI o3-mini is a reasoning model that offers significant cost efficiencies compared to previous o-models. It introduces new features like reasoning effort control and tools, providing comparable or better responsiveness. With faster performance and lower latency, o3-mini is designed to handle complex reasoning workloads while maintaining efficiency.

  • Reasoning Effort Control: Allows users to adjust the model’s cognitive load with low, medium, and high reasoning levels.
  • Structured Outputs: Supports JSON Schema constraints for well-defined, structured outputs.
  • Functions and Tools Support: Integrates with functions and external tools for AI-powered automation.

Microsoft Phi-4-multimodal

Microsoft Phi-4-multimodal is an AI model that unifies text, speech, and vision for context-aware interactions. This model is designed to enhance user experiences by enabling more natural and intuitive interactions. For example, retail kiosks can now diagnose product issues via camera and voice inputs, eliminating the need for complex manual descriptions.

  • Context-Aware Interactions: Unifies text, speech, and vision for more intuitive user experiences.
  • Retail Kiosk Integration: Enables product issue diagnosis via camera and voice inputs.

Microsoft Phi-4-mini

Microsoft Phi-4-mini is a compact yet powerful AI model with just 3.8 billion parameters and a 128K-token context window. Despite its smaller size, it outperforms larger models on coding and math tasks while increasing inference speed by 30% compared to previous models.

  • Compact Performance: Packs impressive performance into just 3.8 billion parameters.
  • Efficiency: Outperforms larger models on coding and math tasks with a 30% increase in inference speed.

Stability AI’s Generative Imaging Models

Stability AI continues to advance generative imaging with models that accelerate creative workflows. The latest models include:

  • Stable Diffusion 3.5 Large: Generates high-fidelity marketing assets faster than previous versions, maintaining brand consistency across diverse visual styles.
  • Stable Image Ultra: Achieves photorealism for product imagery, reducing photoshoot costs through accurate material rendering and color fidelity.
  • Stable Image Core: An enhanced version of SDXL (Stable Diffusion XL), providing high-quality output with exceptional speed and efficiency.
Made by Stability AI models

 

Shift to reasoning models

The Large Language Model (LLM) landscape is evolving, with a significant shift towards reasoning models. DeepSeek‘s introduction of cost-efficient reasoning models has accelerated the shift towards reasoning models by demonstrating that advanced cognitive tasks can be handled with fewer computational resources, making sophisticated AI more accessible and scalable for businesses. Unlike traditional AI models like GPT-4.5 that primarily focus on pattern recognition and data processing, reasoning models like OpenAI o3-mini are designed to handle complex cognitive tasks. They offer a deeper understanding and more accurate responses, making them ideal for applications that require logical thinking, problem-solving, and decision-making.

They excel in various scenarios, including coding, mathematical reasoning, scientific research, business analytics, and customer support. They come equipped with advanced capabilities such as reasoning effort control, structured outputs, and integration with external tools. This makes them highly effective in providing precise and context-aware interactions.

The shift to reasoning models also has significant implications for compute power and cost. While DeepSeek has demonstrated that reasoning models can be cost-effective, these models generally require more computational resources than traditional large language models (LLMs). This increased demand for computing power often results in higher operational costs. However, the advanced capabilities and improved performance of reasoning models justify the investment, as they offer greater accuracy, reliability, and efficiency in handling complex tasks. We expect that the cost per 1m tokens will decrease in future reasoning models, and companies like OpenAI offer cheaper alternatives (e.g. o3-mini) to make reasoning models more affordable.

 

Use Cases of these different LLMs

  • Coding Assistance: GPT-4.5 and Phi-4-mini can provide step-by-step guidance and automate repetitive tasks, saving time and reducing errors.
  • Content Creation: Use GPT-4.5 to craft clear and effective emails, messages, and documentation.
  • Complex Tasks: Ask o3-mini to use its reasoning capabilities for complex tasks like generating complicated mathematical formulas or executing multi-step workflows.
  • Retail Kiosk Integration: Phi-4-multimodal enables product issue diagnosis via camera and voice inputs, enhancing customer service.
  • Marketing Assets: Stable Diffusion 3.5 Large generates high-fidelity marketing assets, maintaining brand consistency.
  • Product Imagery: Stable Image Ultra achieves photorealism when creating images, reducing photoshoot costs.

 

How can I explore these new LLMs?

To take advantage of the new AI model capabilities in Azure, you can:

 

More Information

Azure OpenAI Service: https://azure.microsoft.com/en-us/products/ai-services/openai-service/.

Azure AI Foundry dashboard: https://ai.azure.com/.

O3-mini announcement: https://azure.microsoft.com/en-us/blog/announcing-the-availability-of-the-o3-mini-reasoning-model-in-microsoft-azure-openai-service/.

Other models mentioned: https://azure.microsoft.com/en-us/blog/announcing-new-models-customization-tools-and-enterprise-agent-upgrades-in-azure-ai-foundry/.

Contact us to secure advantageous pricing on Azure when shifting your Azure consumption to SCHNEIDER IT MANAGEMENT.

Artikel deelen