Microsoft Azure Blog Archives | Microsoft AI Blogs

One year of Phi: Small language models making big leaps in AI
https://azure.microsoft.com/en-us/blog/one-year-of-phi-small-language-models-making-big-leaps-in-ai/
Thu, 01 May 2025

Microsoft continues to add to the conversation by unveiling its newest models, Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning.

The post One year of Phi: Small language models making big leaps in AI appeared first on Microsoft AI Blogs.

A new era of AI 

One year ago, Microsoft introduced small language models (SLMs) to customers with the release of Phi-3 on Azure AI Foundry, leveraging research on SLMs to expand the range of efficient AI models and tools available to customers. 

Today, we are excited to introduce Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning—marking a new era for small language models and once again redefining what is possible with small and efficient AI. 

Reasoning models, the next step forward

Reasoning models are trained to leverage inference-time scaling to perform complex tasks that demand multi-step decomposition and internal reflection. They excel in mathematical reasoning and are emerging as the backbone of agentic applications with complex, multi-faceted tasks. Such capabilities are typically found only in large frontier models. Phi-reasoning models introduce a new category of small language models. Using distillation, reinforcement learning, and high-quality data, these models balance size and performance. They are small enough for low-latency environments yet maintain strong reasoning capabilities that rival much bigger models. This blend allows even resource-limited devices to perform complex reasoning tasks efficiently.
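One simple form of inference-time scaling is self-consistency: sample several independent reasoning chains and take a majority vote over their final answers. A minimal sketch of the idea, with `sample_answer` standing in for a real model call (the simulated solver and its accuracy are illustrative assumptions, not part of any Phi API):

```python
import random
from collections import Counter

def sample_answer(question: str, rng: random.Random) -> str:
    # Stub for a model call: a real deployment would sample one reasoning
    # chain per call and return its final answer. Here we simulate a noisy
    # solver that is right about 70% of the time.
    return "42" if rng.random() < 0.7 else str(rng.randint(0, 99))

def self_consistency(question: str, n_samples: int, seed: int = 0) -> str:
    # Spending more inference-time compute (more sampled chains) yields a
    # more reliable answer via majority vote over independent chains.
    rng = random.Random(seed)
    answers = [sample_answer(question, rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?", n_samples=15))
```

Increasing `n_samples` is the scaling knob: each extra chain costs more tokens at inference time but makes the vote more robust, which is the trade-off reasoning models exploit.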

Phi-4-reasoning and Phi-4-reasoning-plus 

Phi-4-reasoning is a 14-billion parameter open-weight reasoning model that rivals much larger models on complex reasoning tasks. Trained via supervised fine-tuning of Phi-4 on carefully curated reasoning demonstrations from OpenAI o3-mini, Phi-4-reasoning generates detailed reasoning chains that effectively leverage additional inference-time compute. The model demonstrates that meticulous data curation and high-quality synthetic datasets allow smaller models to compete with larger counterparts.

Phi-4-reasoning-plus builds upon the capabilities of Phi-4-reasoning and is further trained with reinforcement learning to utilize more inference-time compute, using 1.5x more tokens than Phi-4-reasoning to deliver higher accuracy.

Despite their significantly smaller size, both models achieve better performance than OpenAI o1-mini and DeepSeek-R1-Distill-Llama-70B on most benchmarks, including mathematical reasoning and Ph.D.-level science questions. They also outperform the full DeepSeek-R1 model (with 671 billion parameters) on the AIME 2025 test, the 2025 qualifier for the USA Math Olympiad. Both models are available on Azure AI Foundry and Hugging Face.
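As an illustration of how such a model might be called, the sketch below builds a chat-completions request body as plain data; the deployment name `phi-4-reasoning` and the step-by-step system prompt are assumptions for illustration, not part of the announcement:

```python
import json

# Hypothetical deployment name; substitute the name of your own
# Azure AI Foundry deployment of Phi-4-reasoning.
DEPLOYMENT = "phi-4-reasoning"

def build_chat_request(question: str, max_tokens: int = 4096) -> dict:
    # Reasoning models emit long chains of thought before the final answer,
    # so budget far more output tokens than for a plain chat model.
    return {
        "model": DEPLOYMENT,
        "messages": [
            {"role": "system",
             "content": "Think step by step, then state your final answer."},
            {"role": "user", "content": question},
        ],
        "max_tokens": max_tokens,
    }

body = build_chat_request("Prove that the sum of two even integers is even.")
print(json.dumps(body, indent=2))
```

The same body shape works whether the deployment is reached through an SDK or a raw HTTPS call; only the endpoint and credentials differ.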

Figure 1. Phi-4-reasoning performance across representative reasoning benchmarks spanning mathematical and scientific reasoning. We illustrate the performance gains from reasoning-focused post-training of Phi-4 via Phi-4-reasoning (SFT) and Phi-4-reasoning-plus (SFT+RL), alongside a representative set of baselines from two model families: open-weight models from DeepSeek including DeepSeek-R1 (671B Mixture-of-Experts) and its distilled dense variant DeepSeek-R1-Distill-Llama-70B, and OpenAI’s proprietary frontier models o1-mini and o3-mini. Phi-4-reasoning and Phi-4-reasoning-plus consistently outperform the base model Phi-4 by significant margins, exceed DeepSeek-R1-Distill-Llama-70B (5x larger), and demonstrate competitive performance against significantly larger models such as DeepSeek-R1.
Figure 2. Accuracy of models across general-purpose benchmarks for: long input context QA (FlenQA), instruction following (IFEval), Coding (HumanEvalPlus), knowledge & language understanding (MMLUPro), safety detection (ToxiGen), and other general skills (ArenaHard and PhiBench). 

Phi-4-reasoning models deliver a major improvement over Phi-4, surpassing larger models like DeepSeek-R1-Distill-Llama-70B and approaching DeepSeek-R1 across various reasoning and general capabilities, including math, coding, algorithmic problem solving, and planning. The technical report provides extensive quantitative evidence of these improvements across diverse reasoning tasks.

Phi-4-mini-reasoning

Phi-4-mini-reasoning is designed to meet the demand for a compact reasoning model. This transformer-based language model is optimized for mathematical reasoning, providing high-quality, step-by-step problem solving in environments with constrained compute or latency. Fine-tuned with synthetic data generated by the DeepSeek-R1 model, Phi-4-mini-reasoning balances efficiency with advanced reasoning ability. Trained on over one million diverse math problems spanning difficulty levels from middle school to Ph.D., it is ideal for educational applications, embedded tutoring, and lightweight deployment on edge or mobile systems. Try out the model on Azure AI Foundry or Hugging Face today.

Figure 3. The graph compares the performance of various models on popular math benchmarks requiring long-form generation. Phi-4-mini-reasoning outperforms its base model on each evaluation, as well as larger models like OpenThinker-7B, Llama-3.2-3B-instruct, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Llama-8B, and Bespoke-Stratos-7B. Phi-4-mini-reasoning is comparable to OpenAI o1-mini across math benchmarks, surpassing it on the Math-500 and GPQA Diamond evaluations. As seen above, Phi-4-mini-reasoning, with 3.8 billion parameters, outperforms models more than twice its size.

For more information about the model, read the technical report that provides additional quantitative insights.

Phi reasoning models in action 

Phi’s evolution over the last year has continually pushed the envelope of quality versus size, expanding the family with new features to address diverse needs. Across the range of Windows 11 devices, these models can run locally on CPUs and GPUs.

As Windows works toward creating a new type of PC, Phi models have become an integral part of Copilot+ PCs through the NPU-optimized Phi Silica variant. This highly efficient, OS-managed version of Phi is designed to be preloaded in memory, offering blazing-fast time-to-first-token responses and power-efficient token throughput, so it can be invoked concurrently with other applications running on your PC.

It is used in core experiences like Click to Do, providing useful text-intelligence tools for any content on your screen, and is available through developer APIs for ready integration into applications. It is already used in several productivity applications, such as Outlook, where it provides offline Copilot summary features. These small but mighty models have been optimized and integrated for use across the breadth of our PC ecosystem. The Phi-4-reasoning and Phi-4-mini-reasoning models leverage the low-bit optimizations developed for Phi Silica and will soon be able to run on Copilot+ PC NPUs.

Safety and Microsoft’s approach to responsible AI 

At Microsoft, responsible AI is a fundamental principle guiding the development and deployment of AI systems, including our Phi models. Phi models are developed in accordance with Microsoft AI principles: accountability, transparency, fairness, reliability and safety, privacy and security, and inclusiveness. 

The Phi family of models has adopted a robust safety post-training approach, leveraging a combination of Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning from Human Feedback (RLHF) techniques. These methods utilize various datasets, including publicly available datasets focused on helpfulness and harmlessness, as well as various safety-related questions and answers. While the Phi family of models is designed to perform a wide range of tasks effectively, it is important to acknowledge that all AI models may exhibit limitations. To better understand these limitations and the measures in place to address them, please refer to the model cards below, which provide detailed information on responsible AI practices and guidelines.

Learn more here: 

Adaptability by design: Unifying cloud and edge infrastructure trends
https://azure.microsoft.com/en-us/blog/adaptability-by-design-unifying-cloud-and-edge-infrastructure-trends/
Tue, 29 Apr 2025

The post Adaptability by design: Unifying cloud and edge infrastructure trends  appeared first on Microsoft AI Blogs.

Adaptability isn’t an option. It’s the strategy.

Every day, we engage with organizations navigating growing business complexity across industries, geographies, and regulatory environments. From global manufacturers modernizing plants to financial institutions re-architecting for resilience, we are seeing the same pattern: ambitions are moving faster than traditional architectures can respond to. 

Organizations must also contend with geopolitical disruptions, supply chain volatility, and rapid advances in AI that are reshaping how and where companies operate. Competition is intensifying as incumbents and new entrants alike use AI technology to ship faster, scale smarter, and reset benchmarks. Meanwhile, customer expectations continue to rise, with users demanding seamless digital experiences on any device, anywhere, at any time. In this dynamic environment, adaptability is no longer a feature of great architecture; it is the foundation of competitive resilience.

Forrester study: How tech leaders are evolving their infrastructure strategy 

To better understand how infrastructure strategies are evolving, we commissioned Forrester Consulting to conduct a global study of over 600 cloud and edge decision-makers. What emerged from the data closely matches what we hear in our own customer conversations: leaders are increasingly unifying cloud and edge technologies as a strategic accelerator. The 2025 study, Edge Meets Cloud: A New Era of Scalability and Security, reveals that global IT leaders are no longer treating cloud and edge as separate environments, but rather pursuing an integrated approach that enables adaptability for business success.

What tech leaders are prioritizing in 2025 

In the study, respondents were surveyed on their top business priorities for the next two years. The results revealed they prioritized enhancing customer experience, increasing operational resilience, and strengthening data privacy. These emerged as interdependent outcomes—requiring companies to invest across both cloud and edge environments, rather than choosing between them. 

Respondents also said that, in many cases, operational reality has not caught up with the needs of the business. The top blockers they identified were: 

  • Siloed IT teams and disconnected platforms.
  • Legacy applications that were not built to scale.
  • High integration and maintenance costs.
  • Security complexity and gaps in real-time insight.

Given these challenges, it’s no surprise that 65% of respondents said their organizations plan to merge their edge and cloud environments in the next 12 months. Why? Because operating in fragments slows everything down from innovation to insight to response time. 

The case for cloud and edge integration 

When organizations integrate cloud and edge environments, they unlock performance gains across every layer: 

  • Reduced latency for faster customer-facing experiences. 
  • Lower costs through smarter, localized processing.
  • Better insights from mobile, IoT, and on-premises data.
  • More cohesive security and governance.
  • Streamlined collaboration across teams and workloads.

In fact, 84% of surveyed leaders say they would value a solution that helps consolidate edge and cloud operations across systems, sites, and teams. 

How to get there: Four strategic moves 

In the Forrester study commissioned by Microsoft, four key recommendations surfaced, providing clear, actionable guidance for business and technology leaders looking to evolve their cloud-to-edge strategies. What stood out to me is how closely these recommendations reflect what we hear from enterprise customers. These aren’t theoretical ideas. They are grounded in what’s working today at scale, across some of the most complex IT environments in the world.

1. Break down silos 

Forrester’s first recommendation is to step back and ensure alignment between cloud and edge priorities. This goes beyond technology; it’s about cultural alignment across teams, shared goals across domains, and unified decision-making across cloud and edge environments.

I have seen the impact of this alignment firsthand. Dick’s Sporting Goods, operating more than 850 stores alongside a growing e-commerce footprint, adopted a One Store technology strategy to unify how software is written, deployed, and monitored across all locations through Azure Arc. That cohesion didn’t just improve operational efficiency; it created consistent, high-quality customer experiences across every channel. When edge and cloud teams operate from the same playbook, adaptability becomes the default.

2. Run AI where it works best 

Forrester emphasizes the importance of using technologies where they have the greatest impact. Sometimes that means running AI at the edge for real-time responsiveness. Other times, it means leveraging cloud resources for deep analysis and long-term learning. Often, it’s both. 

This flexible mindset is one we see increasingly in data-driven industries. LALIGA, which operates over 40 stadiums and serves more than 230 million fans globally, uses Azure Arc to orchestrate infrastructure and AI across environments. They process match-day insights on-site for immediate responsiveness, while simultaneously leveraging the cloud for strategic analysis. By allowing each environment to do what it does best, LALIGA delivers both agility and scale. 

3. Design for security and scalability 

Security no longer has a static perimeter; instead, it is an integrated, continuous responsibility across every node, site, and service. Forrester recommends designing security and scalability hand-in-hand, particularly in environments that span hybrid and edge deployments.

AVEVA, a global industrial software leader, faced the challenge of integrating dozens of acquired IT systems—many with their own infrastructure, policies, and compliance needs. Through Azure’s adaptive management and policy tooling, they unified governance across more than 50 distinct Active Directory environments. Now, AVEVA can project its full operational landscape into a single control plane, applying consistent security, observability, and compliance across all regions. It is a powerful example of scaling without losing control. 

4. Extend cloud-native to the edge

As organizations move away from legacy monolith applications, Forrester recommends extending cloud-native development patterns and technologies—like containers, orchestration, and modular architecture—closer to the edge. This accelerates delivery, reduces friction, and supports long-term flexibility. 

This is exactly what we’ve seen from Delta Dental, which modernized its core payor system, MetaVance, using containerized services on Azure Kubernetes Service (AKS). Development and test environments live in the cloud, while production workloads run on-prem to meet regulatory requirements. Leveraging AKS, Delta Dental can use one underlying platform to manage all its applications, no matter where they run. The result isn’t just better performance; it’s future optionality. As constraints evolve, the infrastructure is built to adapt. 

What’s next for cloud and edge with Azure

To continue this conversation, join us for an in-depth video series featuring Forrester as we unpack the recently commissioned research and strategies shaping the future. 

Forrester Total Economic Impact study: A 304% ROI within 3 years using Azure Arc
https://azure.microsoft.com/en-us/blog/forrester-total-economic-impact-study-a-304-roi-within-3-years-using-azure-arc/
Mon, 28 Apr 2025

Forrester Consulting interviewed decision-makers from organizations using Azure Arc with cloud-based management services to manage their IT assets.

The post Forrester Total Economic Impact study: A 304% ROI within 3 years using Azure Arc  appeared first on Microsoft AI Blogs.

As businesses continue to adapt to changing market demands, running modern workloads requires the rapid integration of AI across both the digital and physical estate. Managing enterprise-level infrastructure and applications across environments must be simplified through centralized visualization and control of IT assets.

Azure Arc extends the Azure platform to standardize governance, security, and orchestration for applications in multi-cloud, hybrid, and edge environments, giving IT teams a common operations approach and freeing them to focus on strategic priorities such as modernization and AI initiatives.

For the 2025 commissioned study The Total Economic Impact™ Of Microsoft Azure Arc With Cloud-Based Management Services, Forrester Consulting interviewed decision-makers from organizations using Azure Arc with cloud-based management services to manage their IT assets. These organizations span manufacturing, cloud solutions providers, and the public sector. The consolidated data from the interviewees’ experiences shows that using Azure Arc with cloud-based management services delivered the following benefits:

  • 304% return on investment (ROI) over a period of three years with payback in less than six months. 
  • 30% gain in IT operations productivity. 
  • 50% reduction in risk of security breach.
  • 10% licensing savings with on-premises Windows Server and SQL Server pay-as-you-go as well as a 30% reduction in fees for extended security updates.
  • 15% reduction in spending on third-party solutions and tools. 

Azure Arc addresses key challenges organizations face in centrally managing their diverse and vast IT infrastructure: limited visibility across assets, increased administrative costs from juggling multiple tools, difficulty enforcing security standards, and costly extended security update processes.

A unified approach to management and governance enabled by Azure Arc 

According to the Forrester study, organizations realized a 30% gain in IT operations productivity through the unified approach to managing IT infrastructure that Azure Arc enables. IT operations personnel gained a complete view of asset inventory, standardized security and governance using cloud-based management services, automated manual processes such as hot patching, and spent less time context switching between multiple third-party tools. IT operations teams are now empowered to apply best practices consistently and devote more time to value-add strategic initiatives, as Azure Arc has reduced the operational toil of managing diverse environments.

Azure Arc saves us approximately 75% of the time for updates and rollouts for our many on-premises Windows Servers.

—HEAD ARCHITECT OF INFRASTRUCTURE AND SECURITY, MANUFACTURING 

Cost savings and reduction in security risks 

Organizations participating in the Forrester study that invested in Azure Arc and cloud-based management services from Azure not only achieved a more streamlined way to manage day-to-day activities such as updates, management of configurations, policies, and permissions, but also realized savings due to automation and better visibility of licensing. Implementing a pay-as-you-go pricing model for on-premises Windows Server and SQL Server, similar to Azure’s consumption pricing model, resulted in savings of 10% compared to capacity-based pricing. Additionally, the simplified procurement process, which eliminated third-party vendor markups for extended security updates, reduced fees by 30%. 

The on-premises licensing model is now like cloud utilization. For example, we can turn down SQL servers that we are not needing on weekends or after hours, and don’t get charged for that time period.

—SOLUTIONS MANAGER, CLOUD SOLUTIONS PROVIDER 

Organizations strengthened their security posture by using Microsoft’s observability and security services like Azure Monitor, Microsoft Defender for Cloud, and Microsoft Sentinel across their IT estate. This approach helped them to identify assets that did not meet security standards, standardize security policies and threat detection, and lower the response time for security incidents, thereby reducing the risk of security breaches by 50%.

In our security operations center, instead of getting three different workspaces each with all these different signals, now all the telemetry is flowing into one. And so, it allows us to better do threat hunting to recognize threats when they’re occurring on the network.

—SOLUTIONS MANAGER, CLOUD SOLUTIONS PROVIDER 

With standardized use of cloud-based management services, organizations were able to reduce their dependence on third-party tools, cutting associated spending by up to 15% as redundant tools were deprecated.

Azure Arc: The backbone of Azure’s adaptive cloud approach

Azure Arc brings the next level of modularity, integration, and simplicity to IT and OT teams. Several of our customers are leveraging Azure Arc to embrace Azure’s adaptive cloud approach, enabling them to move beyond traditional boundaries, standardize management and operations across distributed sites, and integrate cloud-native and AI technologies that work simultaneously across hybrid, multi-cloud, edge, distributed computing, and IoT to accomplish key business outcomes in industries such as manufacturing, retail, transportation and logistics, sports and live experiences, and government.

How customers are accelerating innovation with the adaptive cloud approach 

Chevron is reimagining monitoring of physical operations by analyzing data locally at remote facilities at the edge while still maintaining a centralized cloud-based management plane with Azure Arc. COCD (SANTÉ QUÉBEC) is managing its growing healthcare network by eliminating silos and taking a holistic approach to managing cybersecurity and other digital services using Azure Arc and security services from Microsoft. Louisiana State Government is optimizing value from operational investments by adopting Azure Arc and the adaptive cloud approach to gain full visibility and control while accommodating legacy infrastructure. LALIGA can now innovate for the ultimate fan experience using Azure Arc to extend Azure to the edge from a single platform, no matter where the workload is located, improving security by centralizing management of its on-premises and cloud environments, and becoming more efficient and strategic in its operations by taking advantage of AI.

Learn more 

Download the full study: The Total Economic Impact™ Of Microsoft Azure Arc With Cloud-Based Management Services

Unveiling GPT-image-1: Rising to new heights with image generation in Azure AI Foundry
https://azure.microsoft.com/en-us/blog/unveiling-gpt-image-1-rising-to-new-heights-with-image-generation-in-azure-ai-foundry/
Wed, 23 Apr 2025

We are thrilled to announce the launch of GPT-image-1, the latest and most advanced image generation model, now available on Microsoft Azure OpenAI Service.

The post Unveiling GPT-image-1: Rising to new heights with image generation in Azure AI Foundry appeared first on Microsoft AI Blogs.

We are thrilled to announce the launch of GPT-image-1, the latest and most advanced image generation model. The API is available now to all gated customers (apply via the limited access model application), and playground support is coming early next week. This groundbreaking model sets a new standard in generating high-quality images, solving complex prompts, and offering zero-shot capabilities across a variety of scenarios.


Key features and improvements

GPT-image-1 builds upon the strengths of its predecessor, DALL-E, with significant enhancements:

  • Granular instruction response: GPT-image-1 excels at understanding and executing detailed instructions, ensuring precise and accurate image generation.
  • Text rendering: The model reliably renders text within images, enhancing its utility in creating educational materials and storybooks.
  • Image input acceptance: Users can upload images and provide text prompts to generate new images or edit existing ones, offering a versatile tool for creative projects.

GPT-image-1 capabilities

GPT-image-1 supports multiple modalities and features:

  • Text-to-image: Generate images from text prompts, similar to text2im in ChatGPT DALL-E.
  • Image-to-image: Create new images from user-uploaded images and text prompts, a feature not available in ChatGPT DALL-E.
  • Text transformation: Edit images using text prompts, akin to the transform feature in ChatGPT DALL-E.
  • Inpainting: Edit images with text prompts and user-drawn bounding boxes, similar to inpainting with DALL-E.
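As a concrete sketch of these modalities, the snippet below assembles request bodies for generation and editing as plain dictionaries. The field names follow the general shape of the OpenAI Images API; the helper names, file path, and size check are illustrative assumptions, and a real edit call sends the image file as multipart form data rather than a path:

```python
# Resolutions stated in the model's technical specifications.
SUPPORTED_SIZES = {"1024x1024", "1024x1536", "1536x1024"}

def generation_request(prompt: str, size: str = "1024x1024", n: int = 1) -> dict:
    # Text-to-image: body for an image generations call.
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"unsupported size {size!r}")
    return {"model": "gpt-image-1", "prompt": prompt, "size": size, "n": n}

def edit_request(prompt: str, image_path: str, size: str = "1024x1024") -> dict:
    # Image-to-image / inpainting: an uploaded image (and optionally a mask)
    # accompanies the text prompt; referenced here by path for illustration.
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"unsupported size {size!r}")
    return {"model": "gpt-image-1", "prompt": prompt,
            "image": image_path, "size": size}

gen = generation_request("A storybook fox reading a map", size="1024x1536")
edit = edit_request("Add a red scarf to the fox", "fox.png")
```

Validating the size up front fails fast on the client instead of burning a round trip on a request the service would reject.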

Use cases

GPT-image-1 is designed to power a wide range of applications, including:

  • Educational material generation: Create visual aids and interactive content for learning.
  • Storybook creation: Generate consistent and engaging illustrations for children’s books.
  • Game production: Develop game assets with consistent style and character design.
  • UI designs: Design user interfaces with photorealistic elements and coherent layouts.

Technical specifications

  • Resolution: Supports images with a minimum width and height of 1024 pixels, including 1024×1024, 1024×1536, and 1536×1024 resolutions.
  • API integration: GPT-image-1 is available via API.

Safety and moderation

GPT-image-1 is built with a robust safety stack from OpenAI, including C2PA content credentials and input/output moderation. Azure AI adds its own protections, including content safety and abuse monitoring.

Get started today

Unleash your creative potential with GPT-image-1, cutting-edge technology designed to elevate your artistic projects. With support for high-resolution images and seamless API integration, you can effortlessly bring your visions to life. Experience the photorealistic elements and coherent layouts that will set your projects apart, and rely on GPT-image-1’s robust moderation systems for ethical and safe image generation.

Discover the transformative power of GPT-image-1 today.

o3 and o4-mini: Unlock enterprise agent workflows with next-level reasoning AI with Azure AI Foundry and GitHub
https://azure.microsoft.com/en-us/blog/o3-and-o4-mini-unlock-enterprise-agent-workflows-with-next-level-reasoning-ai-with-azure-ai-foundry-and-github/
Wed, 16 Apr 2025

We are thrilled to announce the availability of the latest iterations in the o-series of reasoning models: o3 and o4-mini on the Microsoft Azure OpenAI Service.

The post o3 and o4-mini: Unlock enterprise agent workflows with next-level reasoning AI with Azure AI Foundry and GitHub appeared first on Microsoft AI Blogs.

We are thrilled to announce the availability of the latest iterations in the o-series of models: OpenAI o3 and o4-mini models on Microsoft Azure OpenAI Service in Azure AI Foundry and GitHub. These models represent a significant leap forward in AI reasoning, offering enhanced quality, safety, and performance compared to their predecessors.

Key features and enhancements

Both o3 and o4-mini offer significant improvements in quality and safety while supporting the existing features of o1 and o3-mini, and they deliver comparable or better performance through their integration with, and support for, the newest APIs and reasoning features.

In addition, they introduce:

  • Multiple API support: Both models are available in the Responses API and the Chat Completions API, with the Responses API supporting seamless integration with multiple tools and enhanced transparency via a reasoning summary in the model output.
  • Reasoning summary: In the Responses API, both models can include a reasoning summary in their output, providing more insight into their thinking process. This improves explainability and the effectiveness of downstream actions and tools that leverage these insights.
  • Multimodality: With enhanced vision analysis capabilities in o3 and new vision support in o4-mini, both models extend their reasoning to visual data, extracting valuable insights and generating comprehensive text outputs. This is supported in both the Responses API and the Chat Completions API.
  • Full tool support, including parallel tool calling: These are the first reasoning models with full tool support on par with the mainline models, including parallel tool calling. Customers can use these capabilities to build the next generation of agentic solutions. This is supported in both the Responses API and the Chat Completions API.
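As a sketch of how parallel tool calls might be handled, the snippet below dispatches a stubbed assistant turn that requests two tools at once. The toolbox functions are hypothetical, and the message fields (`tool_calls` entries with `id`, `function.name`, `function.arguments`, answered by `tool` messages carrying `tool_call_id`) follow the general Chat Completions tool-calling shape:

```python
import json

# Hypothetical local tools the model may call.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

def get_time(tz: str) -> str:
    return f"12:00 in {tz}"

TOOLS = {"get_weather": get_weather, "get_time": get_time}

def dispatch_tool_calls(tool_calls: list[dict]) -> list[dict]:
    # With parallel tool calling, one model turn can request several tools
    # at once; run each and return one "tool" message per call, matched
    # back to the request by tool_call_id.
    results = []
    for call in tool_calls:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": fn(**args),
        })
    return results

# Stubbed assistant turn requesting two tools in parallel.
stub_calls = [
    {"id": "call_1", "function": {"name": "get_weather",
                                  "arguments": json.dumps({"city": "Paris"})}},
    {"id": "call_2", "function": {"name": "get_time",
                                  "arguments": json.dumps({"tz": "UTC"})}},
]
replies = dispatch_tool_calls(stub_calls)
```

In a real agent loop, the `tool` messages are appended to the conversation and sent back to the model, which then composes its final answer from both results in a single turn.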

New innovations in safety

The o-series reasoning models use deliberative alignment, a training strategy that teaches reasoning models safety specifications and trains them to reason explicitly about those specifications before answering. Both o3 and o4-mini feature the next level of safety improvements within the o-series, so you can use the power of these models knowing they push the frontier on safety as well.

New audio models available 

Azure OpenAI Service has also introduced three powerful new audio models, available for deployment today in East US2 on Azure AI Foundry: GPT-4o-Transcribe and GPT-4o-Mini-Transcribe, speech-to-text models that outperform previous benchmarks, and GPT-4o-Mini-TTS, a customizable text-to-speech model that accepts detailed instructions on speech characteristics. Check out more on the Tech Community blog.

A new era in AI reasoning

Imagine a world where AI reasoning is not just a tool but a partner in innovation. With the launch of o3 and o4-mini models, we are stepping into that world. These models are not just upgrades; they are gateways to new possibilities, enabling you to push the boundaries of what AI can achieve. Whether you’re solving complex problems, creating seamless workflows, or exploring new frontiers in AI, o3 and o4-mini are here to elevate your journey. Embrace the future of AI reasoning with Azure OpenAI Service and let your imagination soar. Sign up to use o3 and o4-mini in Azure AI Foundry today.

The post o3 and o4-mini: Unlock enterprise agent workflows with next-level reasoning AI with Azure AI Foundry and GitHub appeared first on Microsoft AI Blogs.

]]>
Announcing the GPT-4.1 model series for Azure AI Foundry and GitHub developers https://azure.microsoft.com/en-us/blog/announcing-the-gpt-4-1-model-series-for-azure-ai-foundry-developers/ Mon, 14 Apr 2025 17:30:00 +0000 We are excited to share the launch of the next generation of the GPT-4o model series with GPT-4.1, 4.1-mini, and 4.1-nano to Microsoft Azure OpenAI Service.

The post Announcing the GPT-4.1 model series for Azure AI Foundry and GitHub developers appeared first on Microsoft AI Blogs.

]]>
We are excited to share the launch of the next iteration of the GPT model series with GPT-4.1, 4.1-mini, and 4.1-nano in Microsoft Azure OpenAI Service and GitHub. The GPT-4.1 models bring improved capabilities and significant advancements in coding, instruction following, and long-context processing that are critical for developers. We’re also excited to announce fine-tuning support for GPT-4.1 and 4.1-mini, allowing developers to further customize the models for their specific business needs.

What is GPT-4.1? 

GPT-4.1 is the latest iteration of the GPT-4o model, trained to excel at coding and instruction-following tasks. This model will improve the quality of agentic workflows and accelerate the productivity of developers across all scenarios.

Key features of GPT-4.1

GPT-4.1 brings several notable improvements: 

  • Enhanced coding and instruction following: The model is optimized for better handling of complex technical and coding problems. It generates cleaner, simpler front-end code, accurately identifies necessary changes in existing code, and consistently produces outputs that compile and run successfully. 
  • Long context model: GPT-4.1 supports one million token inputs, allowing it to process and understand extensive context in a single interaction. This capability is particularly beneficial for tasks requiring detailed and nuanced understanding as well as multi-step agents that increase context as they operate.
  • Improved instruction following: The model excels at following detailed instructions, especially in agentic workflows that chain multiple requests. It is more intuitive and collaborative, making it easier to work with across a variety of applications. 

Model capabilities

In addition to the post-training improvements and long context support, GPT-4.1 retains the same API capabilities as the GPT-4o model family, including tool calling and structured outputs. 

Model         | Reasoning & Accuracy | Cost & Efficiency | Context Length
GPT-4.1       | Highest              | Higher cost       | 1M tokens
GPT-4.1-mini  | Balanced             | Balanced          | 1M tokens
GPT-4.1-nano  | Lower                | Lowest cost       | 1M tokens
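Because GPT-4.1 keeps the same API surface as the GPT-4o family, a structured-outputs request looks the same as before. The sketch below assembles an illustrative Chat Completions-style request body constrained by a JSON schema; the field names follow the publicly documented chat schema, while the triage scenario and model deployment name are placeholders:

```python
# Illustrative structured-outputs request body (no service call is made here).
schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
    },
    "required": ["summary", "priority"],
    "additionalProperties": False,
}

body = {
    "model": "gpt-4.1",  # placeholder deployment name
    "messages": [
        {"role": "user", "content": "Triage this bug report: app crashes on login."}
    ],
    # Constrain the model's output to the schema above.
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "triage", "strict": True, "schema": schema},
    },
}

print(body["response_format"]["type"])  # json_schema
```

With `strict` enabled, the model's reply is guaranteed to parse against the schema, which makes downstream handling in agentic workflows much simpler.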

Coming soon: Fine-tune GPT-4.1 to your business needs

Later this week, we’ll enable supervised fine-tuning for GPT-4.1 and 4.1-mini, empowering developers to adapt these models to their unique business requirements. Fine-tuning lets you securely customize the base models on your own datasets, helping align responses with your organization’s specific tone, domain terminology, and task workflows. Fine-tuned models are managed and deployed through Azure AI Foundry, giving you full control over versioning, security, and scalability. 
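As a rough sketch of what a supervised fine-tuning dataset looks like, the snippet below builds chat-format JSONL records locally; the support-bot examples are invented, and current file-format requirements should be checked against the service documentation:

```python
import json

# Invented training examples in the chat-format JSONL commonly used for
# supervised fine-tuning of chat models: one JSON object per line, each
# holding a full "messages" conversation.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are Contoso's support assistant."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Open Settings > Security and choose Reset Password."},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "You are Contoso's support assistant."},
            {"role": "user", "content": "What are your support hours?"},
            {"role": "assistant", "content": "Support is available 24/7 via chat."},
        ]
    },
]

def to_jsonl(records):
    """Serialize one JSON object per line, as fine-tuning uploads expect."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

jsonl = to_jsonl(examples)
print(len(jsonl.splitlines()))  # 2 training rows
```

The resulting file is what you would upload before creating a fine-tuning job; aligning the system prompt across rows helps the tuned model internalize a consistent tone.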

Explore GPT-4.1 today 

GPT-4.1 is now available on Azure OpenAI Service, bringing unparalleled advancements in AI capabilities. This release marks a significant leap forward, offering enhanced performance, efficiency, and versatility across a wide array of applications. Whether you’re looking to improve your customer service chatbot, develop cutting-edge data analysis tools, or explore new frontiers in machine learning, GPT-4.1 has something to offer. 

We invite you to delve into the new features and discover how GPT-4.1 can revolutionize your workflows and applications. Explore, deploy, and build applications using these models today in Azure AI Foundry to access this powerful tool and stay ahead in the rapidly evolving world of AI. 

Explore GPT-4.1 today

The post Announcing the GPT-4.1 model series for Azure AI Foundry and GitHub developers appeared first on Microsoft AI Blogs.

]]>
Announcing the Responses API and Computer-Using Agent in Azure AI Foundry https://azure.microsoft.com/en-us/blog/announcing-the-responses-api-and-computer-using-agent-in-azure-ai-foundry/ Tue, 11 Mar 2025 20:30:00 +0000 AI agents are transforming industries by automating workflows, enhancing productivity, and enabling intelligent decision-making. Businesses are leveraging AI agents to process insurance claims, manage IT service desks, optimize supply chain logistics, and even assist healthcare professionals in analyzing medical records. The potential is vast, and we’re excited to introduce two powerful innovations in Azure AI Foundry.

The post Announcing the Responses API and Computer-Using Agent in Azure AI Foundry appeared first on Microsoft AI Blogs.

]]>
AI agents are transforming industries by automating workflows, enhancing productivity, and enabling intelligent decision-making. Businesses are leveraging AI agents to process insurance claims, manage IT service desks, optimize supply chain logistics, and even assist healthcare professionals in analyzing medical records. The potential is vast, and we’re excited to introduce two powerful innovations in Azure AI Foundry:

  • Responses API: A powerful API enabling AI-powered applications to retrieve information, process data, and take action seamlessly.
  • Computer-Using Agent (CUA): A breakthrough AI model that navigates software interfaces, executes tasks, and automates workflows.

Together, these capabilities empower businesses to reimagine AI not just as an assistant—but as an active digital workforce. Enterprise customers will soon gain access to these innovations driving automation, efficiency, and intelligence at scale.

Enhancing AI Agents with the Responses API 

The Responses API is the key to unlocking agentic AI in Azure AI Foundry, transforming how enterprises harness AI for real-world impact. It is the new foundation for leveraging Azure OpenAI Service’s powerful built-in tools, combining the simplicity of the Chat Completions API with the advanced capabilities available through the Assistants API and Azure AI Agent Service. The Responses API enables seamless interaction with tools like CUA, function calling, and file search—all in a single API call—so AI systems can retrieve data, process information, and take action, connecting agentic AI directly to enterprise workflows. 

How the Responses API Works 

The Responses API provides a structured response format that allows AI to interact with multiple tools while maintaining context across interactions. It supports: 

  • Tool calling in one simple API call: Now, developers can seamlessly integrate AI tools, making execution more efficient. 
  • Computer use: Use the computer use tool within the Responses API to drive automation and execute software interactions. 
  • File search: Interact with enterprise data dynamically and extract relevant information. 
  • Function calling: Develop and invoke custom functions to enhance AI capabilities. 
  • Chaining responses into conversations: Keep track of interactions by linking responses together using unique response IDs, ensuring continuity in AI-driven dialogues. 
  • Enterprise-grade data privacy: Built with Azure’s trusted security and compliance standards, ensuring data protection for organizations. 

By consolidating retrieval, reasoning, and action execution into a single API, the Responses API simplifies AI agent development, reducing the complexity of orchestrating multiple AI tools within an automation pipeline.
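As a rough illustration of how a client might chain Responses API turns, the sketch below builds request bodies locally rather than calling the service; the `previous_response_id` and `input` fields follow the API's documented shape, while the model name, response ID, and prompts are placeholders:

```python
def build_request(model, user_input, previous_response_id=None, tools=None):
    """Assemble a Responses API-style request body (illustrative values)."""
    body = {"model": model, "input": user_input}
    if previous_response_id:
        # Linking to the prior response keeps conversational context.
        body["previous_response_id"] = previous_response_id
    if tools:
        body["tools"] = tools
    return body

# Turn 1: a question with the file search tool enabled.
turn1 = build_request("o4-mini", "Summarize our Q4 results.",
                      tools=[{"type": "file_search"}])

# In real code this ID comes from the service's reply (e.g. response.id);
# here it is a placeholder.
fake_response_id = "resp_abc123"

# Turn 2: a follow-up that stays in the same conversation.
turn2 = build_request("o4-mini", "Now compare that to Q3.",
                      previous_response_id=fake_response_id)

print(turn2["previous_response_id"])
```

Chaining by response ID means the client never has to resend the full transcript; the service reconstructs the conversation state from the referenced response.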

This scalability makes it well-suited for enterprise use cases across industries such as customer service, IT operations, finance, and supply chain management, where AI-powered automation can streamline workflows and improve efficiency. For even greater flexibility and control, organizations can explore Azure AI Agent Service, which offers additional tools and models for developing and scaling AI agents. Azure AI Agent Service integrates with Semantic Kernel and AutoGen, enabling seamless multi-agent orchestration for more complex scenarios requiring multiple agents to collaborate on tasks.

Empowering AI Agents with the Computer-Using Agent

The Computer-Using Agent (CUA) is a specialized AI model in Azure OpenAI Service that allows AI to interact with graphical user interfaces (GUIs), navigate applications, and automate multi-step tasks—all through natural language instructions. Unlike traditional automation tools that rely on predefined scripts or API-based integrations, CUA can interpret visual elements, adapt dynamically, and take action based on on-screen content.

What makes the Computer-Using Agent unique?

  • Autonomous UI navigation: Can open applications, click buttons, fill out forms, and navigate multi-page workflows.
  • Dynamic adaptation: Interprets UI changes and adjusts actions accordingly, reducing reliance on rigid automation scripts.
  • Cross-application task execution: Operates across web-based and desktop applications, integrating disparate systems without API dependencies.
  • Natural language command interface: Users can describe a task in plain language, and CUA determines the correct UI interactions to execute.
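Conceptually, a computer-using agent runs a perceive-decide-act loop: capture the screen, ask the model for the next UI action, execute it, and repeat until the task is done. The sketch below stubs the model with a canned plan purely to show the loop's shape; the action names and the expense-report scenario are invented:

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # e.g. "click", "type", "done"
    target: str = ""   # UI element to act on, or text to enter

def fake_cua_model(screenshot: str, goal: str, step: int) -> Action:
    """Stand-in for the CUA model: maps (screen state, goal) to the next action."""
    plan = [Action("click", "Expenses tab"),
            Action("type", "Lunch meeting, $42"),
            Action("click", "Submit"),
            Action("done")]
    return plan[step]

def run_agent(goal: str, max_steps: int = 10):
    """Perceive-decide-act loop: capture screen, ask model, execute, repeat."""
    executed = []
    for step in range(max_steps):
        screenshot = f"<screen after {step} actions>"  # a real agent captures pixels
        action = fake_cua_model(screenshot, goal, step)
        if action.kind == "done":
            break
        executed.append((action.kind, action.target))  # a real agent drives mouse/keyboard
    return executed

print(run_agent("File an expense report"))
```

Because the model re-observes the screen on every iteration, the loop adapts when the UI changes, which is exactly what distinguishes this approach from a rigid scripted macro.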

With today’s announcement, developers can start building additional agentic capabilities right away with CUA. As enterprises look to deploy this technology at scale, we are evaluating integration with Windows 365 and Azure Virtual Desktop to enable CUA automation to run seamlessly in a managed host environment on Cloud PCs or virtual machines (VMs), ensuring consistent performance while maintaining enterprise compliance and security standards.

Ensuring secure and trustworthy AI automation

As AI systems become more autonomous, ensuring security, reliability, and alignment with human intent is critical. The CUA model is one of the first agentic AI models capable of directly interacting with software environments, bringing new challenges in misuse prevention, unintended actions, and adversarial risks. To address these, Microsoft and OpenAI have implemented a multi-layered safety approach spanning the model, system, and deployment levels.

The CUA model is developed with safeguards to refuse harmful tasks, reject unauthorized actions, and prevent misuse. At the system level, Microsoft implements enterprise-grade content filtering and execution monitoring to help detect and prevent policy violations. To minimize unintended actions, CUA is designed to request user confirmations before executing irreversible tasks and to restrict high-risk actions such as financial transactions. 

Microsoft’s Trustworthy AI framework further ensures real-time observability, logging, and compliance auditing for enterprise deployments. Automated and human-in-the-loop detection systems monitor execution patterns, identifying anomalous behaviors and enforcing governance policies. These safeguards are continuously refined based on internal red-teaming, external audits, and real-world testing to strengthen protection against prompt injections, adversarial manipulations, and unauthorized access. Given the current reliability level of the CUA model—particularly in non-browser environments—human oversight remains strongly recommended for sensitive operations.

As AI agents evolve, Microsoft is committed to transparency, security, and ongoing risk mitigation. By combining CUA’s built-in safeguards with Azure’s enterprise compliance and governance tools, organizations can deploy AI-powered automation with confidence, ensuring safe and responsible AI adoption at scale.

Getting started with CUA and Responses API

Azure AI Foundry continues to push the boundaries of AI-powered automation. Enterprise customers will gain access to the Responses API and CUA in Azure OpenAI Service in the coming weeks.

We’re excited to see how developers and businesses innovate with these new capabilities.  

The post Announcing the Responses API and Computer-Using Agent in Azure AI Foundry appeared first on Microsoft AI Blogs.

]]>
What’s new in Azure Elastic SAN https://azure.microsoft.com/en-us/blog/whats-new-in-azure-elastic-san/ Wed, 05 Mar 2025 16:00:00 +0000 I’m excited to share our recent updates to Azure Elastic SAN—our solution for high-scale cost efficiency in the cloud.

The post What’s new in Azure Elastic SAN appeared first on Microsoft AI Blogs.

]]>
I’m excited to share our recent updates to Azure Elastic SAN—our solution for high-scale cost efficiency in the cloud. Whether you’re looking for a seamless migration of your SAN environment or looking to consolidate existing workloads within the cloud, this enterprise-class offering stands out by helping you simplify your storage management experience and giving you the optimal price-performance ratio for your workloads.

With the SAN-like resource hierarchy, you can provision resources at the storage level and dynamically distribute them to meet the demands of diverse workloads across databases like SQL and Oracle, virtual desktop infrastructure (VDI), and business applications. It supports a multitude of compute options, including workloads running on Azure VMware Solution, containerized applications (via Azure Container Storage), and virtual machines. Beyond that, it delivers cloud-native benefits with scale on demand, policy-based service management, and cloud-native security enforcement across encryption and network access. It’s a solution that combines the at-scale efficiency of on-premises SAN systems with the flexibility of the cloud. 

Since the general availability (GA) release of Azure Elastic SAN in early 2024, we have introduced various new capabilities that let you integrate more workloads, a few of which I want to highlight in this blog.

Enhanced resiliency, scalability, and simplicity to empower mission-critical workloads

We have released the public preview of autoscale for capacity. Elastic SAN is the first block storage solution in the cloud to support autoscaling. This will help save you time by simplifying management of the Elastic SAN, as you can set a policy for autoscaling your capacity when you are running out of storage rather than needing to actively track whether your storage is reaching its limits. With autoscaling, you can scale up on demand, so there is less of a need to provision extra storage just in case, helping you lower your monthly bill. Plus, you will be able to set the exact increments by which your SAN(s) will grow, so you stay in control of your total cost. It is a feature that fits in well with our theme of improving the ease of storage management. 

We have also made snapshot support on Elastic SAN generally available. You can now take instant point-in-time backups of the state of your workloads with Elastic SAN volume snapshots. You can export these volume snapshots to managed disk snapshots for hardening purposes. Snapshots can be either full or incremental snapshots of your data, and you can restore your volumes from either of these snapshots in case you need to recover from a disaster.

Additionally, we have enabled CRC protection to help you maintain the integrity of your data through CRC32C checksum verification. If enabled on the client side, Elastic SAN enforces checksum verification at the volume group level: connections that don’t set CRC32C for both header and data digests are rejected, preventing accidental corruption during transmission or storage of data. 
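For reference, CRC-32C is the Castagnoli variant of CRC-32 (reflected polynomial 0x82F63B78), the digest the iSCSI protocol uses for its header and data digests; it is distinct from the CRC-32 found in zlib. A minimal bitwise implementation, checked against the standard test vector:

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; apply the polynomial when the low bit is set.
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF

# The well-known CRC-32C check value for the ASCII string "123456789".
print(hex(crc32c(b"123456789")))  # 0xe3069283
```

Production clients use hardware-accelerated implementations (for example, the SSE4.2 `crc32` instruction), but the check value above is the standard way to validate any of them.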

All of this is backed by our published availability SLA of 99.99%, which covers mission-critical workloads running on Elastic SAN, so you can run those workloads with peace of mind. 

Hosting SQL on Azure Virtual Machines (VMs) with Elastic SAN 

We have dedicated ourselves to ensuring that the SQL experience on Elastic SAN is fully validated and optimized for cost savings. We have verified that you can reliably use Elastic SAN for clustered workloads such as SQL Failover Cluster Instances (FCI). This is especially helpful when paired with zone-redundant storage (ZRS), which gives your failover clusters zonal redundancy. Additionally, to help you consolidate multiple SQL databases onto Elastic SAN, we built in dynamic performance allocation: the performance you provision at the SAN level is shared across all of the SAN’s volumes. Because your workloads won’t all peak at the same time, you can avoid provisioning for the sum of their peak performance targets, reducing your total cost of ownership. 
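To make the dynamic-allocation math concrete, here is a toy comparison (all IOPS figures are invented) of sizing each volume for its own peak versus sizing the SAN for the worst case where one workload peaks while the others sit at their baselines:

```python
# (peak IOPS, baseline IOPS) for three SQL workloads -- invented numbers.
workloads = [(20_000, 5_000), (15_000, 4_000), (25_000, 6_000)]

# Dedicated provisioning: each volume sized for its own peak.
dedicated = sum(peak for peak, _ in workloads)

# Shared SAN-level provisioning: worst single-workload peak while the
# remaining workloads run at baseline (a simplifying assumption).
shared = max(peak + sum(base for _, base in workloads) - base
             for peak, base in workloads)

print(dedicated, shared)  # 60000 34000
```

Under these assumptions the shared pool needs roughly half the IOPS of per-volume provisioning; the actual savings depend on how correlated your workloads' peaks are.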

Reduced total cost of operations (TCO) for Azure VMware Solution with Elastic SAN 

Azure Elastic SAN is now available as a storage option for all Azure VMware Solution SKUs, including the all-new AV64 SKUs. Because Azure VMware Solution supports attaching iSCSI datastores as a persistent storage option, you can attach Elastic SAN volumes to an Azure VMware Solution cluster of your choice and present them as Virtual Machine File System (VMFS) datastores. By using VMFS datastores backed by Azure Elastic SAN, you can expand your storage with an Azure-deployed, fully managed, VMware-certified storage area network instead of scaling out more local storage nodes in the cluster. Additionally, Elastic SAN is the only Azure block storage offering with ZRS capabilities that VMware customers can leverage. ZRS ensures high availability and resiliency by storing three copies of each SAN across three distinct, physically isolated storage clusters in different Azure availability zones.

Customers leveraging Azure VMware Solution for disaster recovery use cases or hosting capacity-intensive workloads can benefit from a more cost-efficient, extensible storage solution. For instance, you could stand up an AVS cluster as the secondary site for an on-premises VMware environment and replicate data into Elastic SAN while keeping a minimal cluster footprint. At $0.06-0.08 per GiB per month (based on current pricing in East US1), Azure Elastic SAN is the most cost-efficient per-GiB storage option for AVS, while still offering scalable performance for a variety of use cases—and it can be deployed and connected straight from the Azure portal, making for an integrated, easy experience.

Enabling cloud-native workloads on Elastic SAN 

For customers building their applications in the cloud, Azure Kubernetes Service (AKS) is the new VM. Azure Container Storage is Azure’s latest solution to simplify and offer comprehensive persistent storage management for stateful containers. Available as an easy add-on to your AKS cluster, one of the backing storage options you can use through Azure Container Storage is Elastic SAN. Azure Container Storage lets you provision persistent volumes (PVs) within a single Elastic SAN, sharing the performance available in the same SAN across multiple PVs. Because Elastic SAN uses the iSCSI protocol to connect to AKS clusters, the traditional VM-based limits on the number of persistent volumes you can attach per node no longer apply, enabling new heights of scale. Azure Container Storage can also be used with ZRS enabled, which offers enhanced resiliency for your workloads. 

Traditional storage management for stateful containers meant provisioning storage resources, deploying PVs within your cluster, and manually managing the connection and scaling of the storage resource with your cluster. With Azure Container Storage, persistent volume orchestration and coordination of your Elastic SAN via your AKS cluster is fully managed for you. With the preview of autoscaling on Elastic SAN, storage management at scale is even simpler through Azure Container Storage. By setting your autoscale policy, you don’t need to worry about running into capacity limitations. 

Getting started with Azure Elastic SAN 

As always, there is more to come. Our roadmap for the year ahead involves expanding Elastic SAN’s existing backup and disaster recovery functionality, expanding our locally redundant storage (LRS) and ZRS footprint into new regions, and helping customers achieve even lower latencies and higher performance. For now, you can deploy an Elastic SAN2 by following our instructions on how to get started or refer to our documentation to learn more. If you have any questions or feedback, feel free to reach out to the PM team at AzElasticSAN-Ex@microsoft.com, and someone from our team will be happy to assist. 


1 You can find the most up-to-date pricing on our pricing page.

2 For a list of available regions please refer to our documentation

The post What’s new in Azure Elastic SAN appeared first on Microsoft AI Blogs.

]]>
New global report: How to stand out in an AI-savvy world https://azure.microsoft.com/en-us/blog/new-global-report-how-to-stand-out-in-an-ai-savvy-world/ Tue, 04 Mar 2025 16:00:00 +0000 I am happy to share a new MIT Technology Review Insights report that delves into how businesses are leveraging AI customization to stay ahead in the competitive market.

The post New global report: How to stand out in an AI-savvy world appeared first on Microsoft AI Blogs.

]]>
What was the last thing you did with a generative AI app? Create a cat-unicorn coloring book for your niece? Summarize that 42-page brief a colleague sent you? For me, it was using Microsoft Copilot to help my 9th grader with a history study session—I know more than you can believe about Mesopotamia.

Whatever it was for you, I bet it was something you wouldn’t have even considered a year ago. As fast as we’ve become comfortable with AI at our fingertips, our expectations for what it can do for us are growing just as fast. Companies are responding to those rising expectations by increasingly customizing AI to create apps and unique experiences that differentiate their brand in the marketplace.

When I say customers are customizing AI to create apps, I mean they are reshaping entire experiences with it. The NBA is redefining fandom with AI-powered personalization, delivering game highlights and stats tailored to each viewer. Meanwhile, the city of Buenos Aires has transformed urban living with ‘Boti,’ an AI chatbot managing over two million monthly queries, providing residents with instant assistance for things like driver’s license renewals, subway schedules, parking regulations, and even personalized tourism plans. These organizations are bending AI to their vision, pushing the limits of what’s possible. That is why I am happy to share a new MIT Technology Review Insights report that delves into how businesses are leveraging AI customization to stay ahead in the competitive market—DIY GenAI: Customizing generative AI for unique value. The report highlights the motivations, methods, and challenges faced by technology leaders as they tailor AI models to create net new value for their businesses. 

While AI customization isn’t new, rapidly advancing AI platforms like Azure AI Foundry can make it easier and offer businesses greater opportunities to create unique value with AI. According to the MIT report, while boosting efficiency is a top motivation for customizing generative AI models, creating unique solutions, better user satisfaction, and greater innovation and creativity are equal motivations.

Improved efficiency is a top motivator here because it is the first clear-cut benefit businesses can realize quickly by customizing AI. As organizations gain experience, the learning curve flattens, and I think we’ll see the other motivators soar as companies focus more on customizing AI for top-line revenue impact than COGS (Cost of Goods Sold) savings. 


Specializing with agents 

When it comes to selecting models, half of the executives surveyed in the MIT report said they are prioritizing agentic and multi-agent capabilities, alongside multimodality (56%), flexible payment options (53%), and performance improvements (63%). AI agents that perform tasks and make decisions without direct human intervention have broad utility. They lend themselves to autonomous problem solving in areas like data entry and retrieval for clinical operations in healthcare, supplier coordination and maintenance tracking in manufacturing, and enhancing inventory and store operations in retail.

Agents have the potential to disrupt the market with something unique beyond automating processes that humans find dull. Take Atomicwork, a newcomer to the service management space dominated by established industry players with decades of experience. Atomicwork stands out with an ITSM (IT Service Management) and ESM (Enterprise Service Management) platform centered around specialized AI agents that integrate into the flow of work, providing seamless, instant support without the need for multiple tools or complex integrations. According to Atomicwork, one of their customers achieved a 65% deflection rate (the percentage of issues resolved without human intervention) within six months. 

Like other areas of AI development, agent-building tools are rapidly evolving to accommodate a wide variety of use cases. From creating simple low-code agents in Microsoft Copilot Studio to developing more complex, autonomous pro-code agents using GitHub and Visual Studio, the process is streamlined. For example, using the intuitive agent orchestration experience built directly into Azure AI Foundry, Azure AI Agent Service allows you to accomplish in just a few lines of code what originally took hundreds of lines. This makes it remarkably easy to customize and safely put agents to work in your operations.
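The function-tool pattern that these agent frameworks share can be sketched locally. This is not the Azure AI Agent Service SDK; the model's tool selection is stubbed and all names and data are invented, purely to show how a registered function gets dispatched from a tool call:

```python
import json

# You register plain functions as tools; the agent (stubbed below) chooses
# one and supplies JSON-encoded arguments, which you execute on its behalf.
def get_order_status(order_id: str) -> str:
    orders = {"A42": "shipped", "B7": "processing"}  # toy data
    return orders.get(order_id, "unknown")

TOOLS = {"get_order_status": get_order_status}

def fake_agent_decision(user_message: str) -> dict:
    """Stand-in for the model's tool call (real agents return this shape)."""
    return {"name": "get_order_status",
            "arguments": json.dumps({"order_id": "A42"})}

def handle(user_message: str) -> str:
    call = fake_agent_decision(user_message)
    fn = TOOLS[call["name"]]                    # look up the registered tool
    return fn(**json.loads(call["arguments"]))  # execute with model-supplied args

print(handle("Where is order A42?"))  # shipped
```

A managed agent service wraps this dispatch loop for you, which is why the hosted version collapses to a few lines: you supply the functions and instructions, and the orchestration is handled server-side.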

Good data equals good AI 

The potential of AI customization is immense but not without its challenges. Ironically, the greatest asset for AI customization often presents the biggest barrier customers run into: data. Specifically, data integrity—the safety, security, and quality of the data they use with AI. Half the participants in the MIT report cited data privacy and security (52%) and data quality and preparation (49%) as AI customization obstacles.

Generative AI is one of the best things to happen to data in a long time. It presents innovative ways for companies to interact with and use their data in solutions unique to them. Data is where the magic happens. AI models know a lot, but a model doesn’t know your company from your competitor until you ground it in your data.

Critical to empowering data-driven AI is an intelligent data platform that unifies sprawling, fragmented data stores, provides controls to govern and secure data, and seamlessly integrates with AI building tools. It’s why Microsoft Fabric is now the fastest-growing analytics product in our history and why we’re seeing AI-driven data growth of raw storage, database services, and app platform services as customers fuel their AI workloads with data. Fabric removes the data integrity obstacle. Together with Azure AI Foundry, data and dev teams are integrated and working in the same environment, removing any time-to-market drag due to data issues. 


RAG is the customization starting point

One of the simplest and most effective methods for customization is retrieval-augmented generation (RAG). Two-thirds of those surveyed in the MIT report are implementing RAG or exploring its use. Grounding an AI model in data specific to an organization or practice makes the model unique and capable of providing a specialized experience.

In practice, RAG isn’t used alone to customize models. The report found it’s often used in combination with fine-tuning (54%) and prompt engineering (46%) to create highly specialized models. Dentsu, a global advertising and PR firm based in Tokyo, initially analyzed media channel contributions to client sales using general-purpose LLMs but found their accuracy lacking at 40-50%. To improve this, they developed custom data controls and structures and tailored models leveraging their expertise in retail and marketing data analysis. By integrating a customized RAG framework and an agentic decision layer, Dentsu reports about 95% accuracy in retrieving relevant data and insights. This AI-powered approach now plays a central role in shaping campaign strategies and optimizing marketing budget allocation for their clients. 
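A minimal RAG loop can be sketched in a few lines. The toy bag-of-words retriever below stands in for a real vector index, and the corpus is invented; the point is the retrieve-then-ground-the-prompt shape:

```python
import math
from collections import Counter

# Invented mini-corpus; a real system would index enterprise documents.
DOCS = {
    "returns": "Contoso accepts returns within 30 days with a receipt.",
    "shipping": "Standard shipping takes 3-5 business days.",
    "warranty": "All hardware carries a one-year limited warranty.",
}

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; real RAG uses dense vector embeddings."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str) -> str:
    """Return the key of the most similar document."""
    q = embed(query)
    return max(DOCS, key=lambda k: cosine(q, embed(DOCS[k])))

def grounded_prompt(query: str) -> str:
    """Splice the retrieved document into the prompt sent to the model."""
    doc = DOCS[retrieve(query)]
    return f"Answer using only this context:\n{doc}\n\nQuestion: {query}"

print(retrieve("how long does shipping take"))  # shipping
```

Fine-tuning and prompt engineering then layer on top of this: the retrieval step supplies fresh facts, while tuning shapes how the model uses them.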

Empowering development teams 

Developing AI brings new dynamics, not the least of which is keeping pace with AI advancements. Model features and capabilities, along with developer tools and methods, are evolving rapidly, which makes empowering teams with the right tools crucial for successful AI customization. 

For example, the pace of new model capabilities calls for automated model-evaluation tooling. According to the MIT report, 54% of companies use manual evaluation methods, and 26% are either beginning to apply automated methods or are doing so consistently. I expect we’ll see these numbers flip soon. The report notes that playgrounds and prompt development features are also widely used to facilitate collaboration between AI engineers and app developers while customizing models.

Evaluation is a critical component not just for customizing an AI but also in managing and monitoring the app once it hits production. We built full lifecycle evaluation into Azure AI Foundry so you can continuously evaluate model capabilities, optimize performance, test safety, and keep pace with advancements.
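A first step toward automating evaluation is a golden-set harness like the toy one below; the stub model and questions are invented, and real evaluation tooling covers far more than exact-match accuracy (safety, groundedness, latency, and so on):

```python
def evaluate(model_fn, golden_set):
    """Score a model against reference answers; returns accuracy in [0, 1]."""
    hits = sum(1 for prompt, expected in golden_set
               if model_fn(prompt).strip().lower() == expected.strip().lower())
    return hits / len(golden_set)

# Stub "model" so the harness runs offline; swap in a real endpoint call.
def toy_model(prompt: str) -> str:
    return {"2+2?": "4", "Capital of France?": "Paris"}.get(prompt, "unsure")

golden = [("2+2?", "4"),
          ("Capital of France?", "paris"),
          ("Largest planet?", "Jupiter")]

print(round(evaluate(toy_model, golden), 3))  # 2 of 3 correct -> 0.667
```

Running a harness like this on every model or prompt revision turns evaluation from a one-off gate into continuous monitoring, which is the practice full-lifecycle tooling automates.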

We also see customization and growing AI portfolios ushering in next-generation AI development. The report reveals that more than half of the surveyed organizations have adopted telemetry tracing and debugging tools. AI tracing enhances the transparency needed to understand the outcomes of AI applications, and debugging helps optimize performance by showing how reasoning flows from the initial prompt to the final output. 


Looking ahead with Azure AI

AI has high utility when it comes to creating services and experiences that can differentiate you in the marketplace. The speed of adoption, exploration, and customization is evidence of the value companies see in that utility. Models are continually advancing and specializing by task and industry. In fact, there are more than 1,800 models in the Azure AI Foundry catalog today – and they are evolving just as quickly as the tools and methods to build with them. We already see agents delivering new customer service experiences—something that might be a differentiator today, but I expect fast-follows will reshape customer service for most companies as consumers learn to expect an AI-powered experience. As that happens, what we see as AI customization today will lose the novelty of being custom and become standard practice for building with AI. What we won’t lose is the novelty of building something unique. It will become an organization’s IP. 

What’s that unique experience for your business? What’s the next special thing you want to do for your customers? How do you want to empower your employees? You’ll find everything you need to bend the curve of innovation with Azure AI Foundry. 

One final note: No matter where you are in retooling your organization to operationalize AI, I encourage you to read the MIT report. In addition to survey findings, the team spent quality time talking with technology leaders about creating value by customizing generative AI. Sprinkled throughout the report are some helpful, real-world examples and insights. Big thanks to the researchers and editors at MIT Technology Review Insights for helping put a focus on this exciting area of opportunity.


About Jessica Hawk

Jessica leads Azure Data, AI, and Digital Applications product marketing at Microsoft. Find Jessica’s blog posts here, and be sure to follow Jessica on LinkedIn.

The post New global report: How to stand out in an AI-savvy world appeared first on Microsoft AI Blogs.

]]>
Announcing new models, customization tools, and enterprise agent upgrades in Azure AI Foundry  https://azure.microsoft.com/en-us/blog/announcing-new-models-customization-tools-and-enterprise-agent-upgrades-in-azure-ai-foundry/ Thu, 27 Feb 2025 20:30:00 +0000 We are excited to announce major updates to Azure AI Foundry, our integrated platform for designing, customizing, and managing enterprise-grade AI applications.

The post Announcing new models, customization tools, and enterprise agent upgrades in Azure AI Foundry  appeared first on Microsoft AI Blogs.

]]>
At Microsoft, we are dedicated to advancing AI innovation to empower organizations, transform industries, and redefine productivity. Today, we are excited to announce major updates to Azure AI Foundry, our integrated platform for designing, customizing, and managing enterprise-grade AI applications. These updates include groundbreaking new models like OpenAI’s GPT-4.5, enhanced fine-tuning and distillation techniques, and the launch of new enterprise tools for agents. These advancements are designed to accelerate the journey from AI experimentation to tangible business impact.

Introducing GPT-4.5 in preview on Azure OpenAI Service 

Building on the success of previous models, GPT-4.5 is the latest and strongest general-purpose model. This research preview demonstrates the achievements from scaling pre- and post-training, a step forward in unsupervised learning techniques.

  • Natural interaction: GPT-4.5 offers a more natural interaction experience. It has a broader knowledge base, and its higher “EQ” can help to improve coding, writing, and problem-solving tasks. 
  • Accuracy and hallucinations: With a lower hallucination rate (37.1% versus 61.8%) and higher accuracy (62.5% versus 38.2%) compared to GPT-4o, developers can rely on more precise and relevant responses.
  • Stronger human alignment: Enhanced alignment techniques improve GPT-4.5’s ability to follow instructions, understand nuances, and engage in natural conversations, making it a more effective tool for coding and project management. 

 Developers can leverage GPT-4.5 in numerous ways to enhance productivity and creativity. In communication, users can rely on GPT-4.5 to craft clear and effective emails, messages, and documentation. It also offers personalized learning and coaching experiences, helping users to acquire new skills or deepen knowledge in specific areas. During brainstorming sessions, GPT-4.5 can generate innovative ideas and solutions, making it a valuable tool for creative thinking. 

For project planning and execution, GPT-4.5 assists users in organizing their tasks, ensuring thorough and efficient approaches. It can also handle complex task automation, simplifying intricate processes and workflows. Developers can streamline their coding workflows by getting step-by-step guidance and automating repetitive tasks, saving time and reducing errors. Overall, GPT-4.5 is a versatile model. Starting today, enterprise customers can access GPT-4.5 in Azure AI Foundry. GPT-4.5 is also available in GitHub Copilot Chat for Copilot Enterprise users.
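As a concrete sketch of what calling the model looks like, the snippet below assembles a chat request and, when Azure OpenAI credentials are present in the environment, sends it to a GPT-4.5 deployment with the `openai` Python SDK. The deployment name, environment-variable names, and API version here are illustrative assumptions, not values from this post; substitute your own resource's settings.

```python
import os

def build_chat_request(user_prompt: str) -> dict:
    """Assemble a chat-completions payload for a GPT-4.5 deployment.

    The deployment name defaults to a placeholder; override it with the
    AZURE_OPENAI_DEPLOYMENT environment variable.
    """
    return {
        "model": os.environ.get("AZURE_OPENAI_DEPLOYMENT", "gpt-4.5-preview"),
        "messages": [
            {"role": "system", "content": "You are a concise writing assistant."},
            {"role": "user", "content": user_prompt},
        ],
    }

# Only call the service when credentials are configured.
if os.environ.get("AZURE_OPENAI_ENDPOINT"):
    # Requires: pip install openai
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-10-21",  # check your resource for the current version
    )
    request = build_chat_request("Draft a status update email for a delayed project.")
    response = client.chat.completions.create(**request)
    print(response.choices[0].message.content)
```

The same payload shape works for the drafting, coaching, and brainstorming scenarios described above; only the system and user messages change.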

New models: Phi-4, Stability AI, and recent releases 

The latest wave of AI models shares a common focus: delivering specialized capabilities with greater efficiency. These releases represent a shift toward purpose-built AI that excels in specific domains while requiring fewer computational resources. Here are some standout launches on Azure AI Foundry. 

Microsoft’s Phi models continue to push the boundaries of what’s possible with smaller, more efficient architectures: 

  • Phi-4-multimodal unifies text, speech, and vision for context-aware interactions. Retail kiosks can now diagnose product issues via camera and voice inputs, eliminating the need for complex manual descriptions. 
  • Phi-4-mini packs impressive performance into just 3.8 billion parameters with a 128K-token context window. It outperforms larger models on coding and math tasks while increasing inference speed by 30% compared to previous models.
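Because Phi-4-mini is built for low-latency, resource-constrained settings, a common pattern is to trim conversation history to a fixed number of recent turns before each call. The sketch below shows that pattern with the `azure-ai-inference` SDK; the endpoint and key variable names, the `max_turns` budget, and the tutor prompt are all illustrative assumptions rather than details from this post.

```python
import os

def trim_context(messages: list, max_turns: int = 8) -> list:
    """Keep the system message plus only the most recent turns, a simple way
    to stay within a small model's context budget."""
    system, rest = messages[:1], messages[1:]
    return system + rest[-max_turns:]

if os.environ.get("AZURE_AI_ENDPOINT"):
    # Requires: pip install azure-ai-inference
    from azure.ai.inference import ChatCompletionsClient
    from azure.core.credentials import AzureKeyCredential

    client = ChatCompletionsClient(
        endpoint=os.environ["AZURE_AI_ENDPOINT"],  # a Phi-4-mini deployment endpoint
        credential=AzureKeyCredential(os.environ["AZURE_AI_KEY"]),
    )
    history = trim_context([
        {"role": "system", "content": "You are a terse math tutor."},
        {"role": "user", "content": "Factor x^2 - 5x + 6."},
    ])
    response = client.complete(messages=history)
    print(response.choices[0].message.content)
```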

Stability AI continues to advance generative imaging with models that accelerate creative workflows: 

  • Stable Diffusion 3.5 Large generates high-fidelity marketing assets faster than previous versions, maintaining brand consistency across diverse visual styles. 
  • Stable Image Ultra achieves photorealism for product imagery, reducing photoshoot costs through accurate material rendering and color fidelity. 
  • Stable Image Core, an enhanced version of SDXL (Stability AI’s text-to-image generative model), provides high-quality output with exceptional speed and efficiency. 

Cohere enhances information retrieval capabilities with its latest ranking technology: 

  • Cohere Rerank v3.5 delivers a powerful semantic boost to the search quality of any keyword or vector search system with just a single line of code. This newest model features enhanced reasoning skills and improved multilingual performance across 100+ languages.
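To illustrate the "single line" pattern, the sketch below reranks a handful of help-center passages against a query and reorders them by relevance score. The sample documents, query, and environment-variable name are illustrative assumptions; the guarded call uses the Cohere Python SDK's `rerank` endpoint.

```python
import os

def apply_rerank(documents: list, results: list) -> list:
    """Reorder documents using (index, relevance_score) pairs from a rerank
    call, highest score first."""
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [documents[r["index"]] for r in ranked]

docs = [
    "Reset your password from the account settings page.",
    "Our office is closed on public holidays.",
    "Passwords must contain at least 12 characters.",
]

if os.environ.get("COHERE_API_KEY"):
    # Requires: pip install cohere
    import cohere

    co = cohere.ClientV2(api_key=os.environ["COHERE_API_KEY"])
    resp = co.rerank(
        model="rerank-v3.5",
        query="How do I change my password?",
        documents=docs,
        top_n=2,
    )
    ordered = apply_rerank(
        docs,
        [{"index": r.index, "relevance_score": r.relevance_score} for r in resp.results],
    )
    print(ordered)
```

The reranker slots in after any existing keyword or vector retrieval step, which is what makes it a near drop-in quality boost.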

The GPT-4o family expands with two specialized variants: 

  • GPT-4o-Audio-Preview handles audio prompts and generates spoken responses with appropriate emotion and emphasis, which is ideal for digital assistants and customer service. 
  • GPT-4o-Realtime-Preview eliminates conversation lag with breakthrough latency reduction, creating genuinely human-like interaction flows. 
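For the audio variant, requests carry base64-encoded audio as an `input_audio` content part. The helper below packages raw WAV bytes into that shape; the guarded call, file name, voice choice, deployment name, and API version are illustrative assumptions about a typical Azure OpenAI setup, not details from this post.

```python
import base64
import os

def audio_content_part(wav_bytes: bytes) -> dict:
    """Package raw WAV bytes as an input_audio content part for a chat request."""
    return {
        "type": "input_audio",
        "input_audio": {
            "data": base64.b64encode(wav_bytes).decode("ascii"),
            "format": "wav",
        },
    }

if os.environ.get("AZURE_OPENAI_ENDPOINT"):
    # Requires: pip install openai
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2025-01-01-preview",  # check your resource for the current version
    )
    with open("caller_question.wav", "rb") as f:
        part = audio_content_part(f.read())
    response = client.chat.completions.create(
        model="gpt-4o-audio-preview",  # your deployment name may differ
        modalities=["text", "audio"],
        audio={"voice": "alloy", "format": "wav"},
        messages=[{"role": "user", "content": [part]}],
    )
    print(response.choices[0].message.content)
```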

Agora, a pioneer in enabling real-time engagement, has been using GPT-4o-Realtime.

“GPT-4o-Realtime has revolutionized voice interaction for our conversational AI product, empowering developers with multilingual human-like voices, stable streaming, and ultra-low latency across customer service, telemedicine, and education.”

Patrick Ferriter, Senior Vice President, Product & Marketing, Agora 

These advances collectively signal AI’s evolution toward more natural, responsive, and efficient interactions across diverse use cases and deployment environments. 

New customization tools 

As our model library surpasses 1,800 offerings, we continue to push the boundaries of experimentation and observability. Our new suite of fine-tuning tools complements the rise of unsupervised learning techniques.  

  • Reinforcement fine-tuning: Now in private preview, this technique teaches models to reason in new ways by rewarding correct logical paths and penalizing incorrect reasoning.
  • Provisioned Deployment for fine-tuning: Azure OpenAI Service now offers Provisioned Deployments for fine-tuned models, ensuring predictable performance and costs through Provisioned Throughput Units (PTUs) in addition to token-based billing.
  • Fine-tuning for Mistral Models: Exclusive to Azure AI Foundry, Mistral Large 2411 and Ministral 3B now support fine-tuning for industry-specific tasks like healthcare document redaction. 
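Supervised fine-tuning on Azure OpenAI starts from a JSONL file of chat-format examples. The sketch below builds two such examples for a redaction-style task and, when credentials are present, uploads the file and starts a job. The redaction examples, file name, environment variables, and base model identifier are illustrative assumptions; real jobs need a larger training set and the base model your resource supports.

```python
import json
import os

def chat_example(prompt: str, ideal: str) -> str:
    """Serialize one supervised fine-tuning example as a chat-format JSONL line."""
    return json.dumps({
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": ideal},
        ]
    })

# A tiny illustrative training set (real jobs need many more examples).
lines = [
    chat_example("Redact the patient name: 'John Doe, age 54'", "[NAME], age 54"),
    chat_example("Redact the patient name: 'Jane Roe, age 31'", "[NAME], age 31"),
]

if os.environ.get("AZURE_OPENAI_ENDPOINT"):
    # Requires: pip install openai
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-10-21",
    )
    with open("train.jsonl", "w") as f:
        f.write("\n".join(lines))
    uploaded = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
    job = client.fine_tuning.jobs.create(
        training_file=uploaded.id,
        model="gpt-4o-mini-2024-07-18",  # substitute the base model you are tuning
    )
    print(job.id)
```

Once the job completes, the resulting model can be deployed with PTUs for the predictable throughput described above.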

Secure automation and scale for enterprise agents 

In today’s enterprise landscape, security and scalability are strategic imperatives. We’re introducing two powerful features designed to help you securely harness AI for mission-critical tasks: 

  • Bring your own VNet: Azure AI Agent Service now enables all AI agent interactions, data processing, and API calls to remain securely within your organization’s own virtual network, eliminating exposure to the public internet. Early adopters like Fujitsu are leveraging this capability to improve sales by 67% with their sales proposal creation agent, saving countless hours that can be redirected toward customer engagement and strategic planning, all while maintaining data integrity. 

To learn more about building multi-agent applications with Azure AI Foundry, check out this informative webinar.

Create your AI solution with Azure AI Foundry 

We are excited about these new developments and look forward to seeing how you will leverage these powerful tools and features to drive innovation. Get going with new models and tools in Azure AI Foundry.

The post Announcing new models, customization tools, and enterprise agent upgrades in Azure AI Foundry  appeared first on Microsoft AI Blogs.

]]>