
Azure AI Foundry: The Enterprise Architecture Layer for Building AI Apps and Agents at Scale

Artificial intelligence is no longer just a proof-of-concept conversation. Enterprises are now asking a much harder question: How do we build, govern, secure, evaluate, and scale AI solutions across the business without creating another disconnected technology stack?

That is where Azure AI Foundry, now positioned within Microsoft Foundry, becomes extremely important.

Microsoft describes Foundry as a unified Azure platform-as-a-service offering for enterprise AI operations, model builders, and application development. Its purpose is to help developers and organizations build AI applications and agents without spending unnecessary effort managing the underlying infrastructure.

Why Azure AI Foundry Matters

The first wave of generative AI was about experimentation. Teams built copilots, chatbots, document search experiences, and prompt-based prototypes. Many of those pilots proved value, but they also exposed enterprise challenges.

Organizations now need answers to questions like:

How do we manage multiple models?
How do we secure enterprise data?
How do we evaluate AI quality?
How do we govern prompts, agents, and tools?
How do we move from prototype to production?
How do we monitor cost, risk, performance, and business value?

Azure AI Foundry helps address this gap by acting as an AI app and agent factory. It brings together models, agents, tools, evaluation, safety, and governance into a unified platform experience for AI development teams. Microsoft’s AI learning hub describes Azure AI Foundry as a platform of models, agents, tools, and safeguards for AI development teams.

The Architectural View

From an architecture perspective, Azure AI Foundry should not be seen as a single service. It should be viewed as an enterprise AI control plane that connects models, data, applications, governance, security, and operational monitoring.

Microsoft’s Foundry architecture is organized around a layered model: a top-level Foundry resource for governance, projects for development isolation, and connected Azure services for capabilities such as storage, search, and secrets management.

A simplified architecture looks like this:

Business Users / Applications
(Copilot, Chat Apps, Agent Interfaces, APIs)
        |
Azure AI Foundry Projects
(Models, Prompts, Agents, Tools, Evaluations)
        |
Enterprise Data Layer
(Azure AI Search, Fabric, Databricks, SQL, Storage, APIs)
        |
Security and Governance
(Microsoft Entra ID, Key Vault, Private Networking, Monitoring, Policy)
        |
Azure Infrastructure
(Compute, Storage, Networking, Observability)

Core Architecture Components

1. Foundry Resource: The Governance Boundary

The Foundry resource acts as the top-level management and governance layer. This is where enterprise teams can organize AI workloads, manage access, and establish common controls across AI development.

For architects, this is critical. Without a centralized governance boundary, AI projects quickly become fragmented across teams, tools, and environments.

2. Projects: The Development and Isolation Layer

Projects provide logical separation for AI workloads. A project can represent a business use case, product team, department, or development environment. This allows teams to manage their own AI assets while still operating under enterprise governance.

For example:

Foundry Resource
|-- HR Knowledge Assistant Project
|-- Finance Forecasting Agent Project
|-- Customer Service Copilot Project
|-- Legal Document Review Project

This project-based architecture supports better separation of data, prompts, evaluations, models, and application components.

3. Model Layer: Choice and Flexibility

One of the biggest strengths of Azure AI Foundry is model choice. Enterprises are not locked into one model pattern. They can use models from the Foundry model catalog and select the right model based on use case, cost, latency, accuracy, and risk profile.

This is important because not every AI workload needs the most powerful model. Some workloads need speed. Some need cost efficiency. Some need domain reasoning. Some need multimodal capabilities.

A mature architecture should define a model selection framework:

Use Case Type            | Model Strategy
Simple Q&A               | Lower-cost language model
Complex reasoning        | Advanced reasoning model
Document extraction      | Specialized document AI model
Image or vision workload | Multimodal model
Enterprise agent         | Model plus tools plus retrieval
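A framework like this can be codified so that applications never hard-code a single model. The sketch below is a minimal router in Python; the model names and use-case keys are illustrative assumptions, not real Foundry catalog identifiers.

```python
# Minimal model-router sketch: map a use-case profile to a model tier.
# Model names below are illustrative placeholders, not real catalog IDs.
from dataclasses import dataclass

@dataclass
class ModelChoice:
    model: str
    rationale: str

MODEL_STRATEGY = {
    "simple_qa": ModelChoice("small-chat-model", "low cost, low latency"),
    "complex_reasoning": ModelChoice("advanced-reasoning-model", "deeper multi-step reasoning"),
    "document_extraction": ModelChoice("document-ai-model", "layout-aware extraction"),
    "vision": ModelChoice("multimodal-model", "image and text input"),
    "enterprise_agent": ModelChoice("reasoning-model-with-tools", "model plus tools plus retrieval"),
}

def select_model(use_case: str) -> ModelChoice:
    """Return a model choice for the use case; default to the cheapest tier."""
    return MODEL_STRATEGY.get(use_case, MODEL_STRATEGY["simple_qa"])

print(select_model("complex_reasoning").model)  # advanced-reasoning-model
```

In practice the lookup key would come from workload metadata or routing rules, and the values would be deployment names managed in your Foundry projects.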

Agent Architecture in Azure AI Foundry

The future of enterprise AI is not just chatbots. It is agents that can reason, use tools, retrieve enterprise data, call APIs, and complete business workflows.

Microsoft Foundry Agent Service is described as a fully managed platform for building, deploying, and scaling AI agents. It supports agent development through the Foundry portal, SDKs, REST APIs, and frameworks such as Agent Framework and LangGraph.

Microsoft currently describes three agent types: prompt agents, workflow agents, and hosted agents. Prompt agents are configuration-based, workflow agents support multi-step automation, and hosted agents allow code-based orchestration in managed containers.

A strong enterprise agent architecture includes:

Agent Interface
        |
Agent Instructions and Policies
        |
Model Selection
        |
Tools and Actions
        |
Enterprise Data Retrieval
        |
Evaluation and Safety Controls
        |
Monitoring and Feedback Loop
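The chain above compresses into a small control loop. The sketch below is pure Python with stub tools and a stub planner standing in for the LLM's tool-calling decision; it illustrates the control flow of instructions, model decision, tool call, and evaluation hook, not a Foundry SDK implementation.

```python
# Minimal agent loop sketch: planner decides -> tool runs -> result checked.
# decide() is a stub standing in for an LLM with tool-calling.
from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_order": lambda arg: f"order {arg}: shipped",
    "lookup_policy": lambda arg: f"policy on {arg}: 30-day returns",
}

def decide(user_input: str) -> tuple[str, str]:
    # Stub planner: a real agent lets the model pick the tool and argument.
    if "order" in user_input:
        return "lookup_order", user_input.split()[-1]
    return "lookup_policy", "returns"

def run_agent(user_input: str, max_steps: int = 3) -> str:
    for _ in range(max_steps):
        tool, arg = decide(user_input)
        observation = TOOLS[tool](arg)
        # An evaluation/safety hook would inspect the observation here
        # before it is returned or fed back into the next step.
        if observation:
            return observation
    return "escalate to human"

print(run_agent("where is my order 12345"))  # order 12345: shipped
```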

Retrieval-Augmented Generation Architecture

For most enterprise use cases, the AI solution should not rely only on the model’s general knowledge. It needs access to trusted business data.

That is where Retrieval-Augmented Generation, commonly known as RAG, becomes important.

A typical Azure AI Foundry RAG architecture includes:

Enterprise Sources
(SharePoint, PDFs, SQL, Fabric, Databricks, APIs)
        |
Data Processing and Chunking
        |
Embeddings and Indexing
        |
Azure AI Search or Vector Store
        |
Azure AI Foundry Application or Agent
        |
Grounded Response with Citations
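These stages can be sketched end to end. The example below is a deliberately tiny, self-contained illustration: bag-of-words "embeddings" stand in for a real embedding model, and an in-memory list stands in for Azure AI Search or a vector store.

```python
# Toy RAG pipeline: chunk -> "embed" -> retrieve -> grounded prompt.
import math
from collections import Counter

def chunk(text: str, size: int = 40) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    # Bag-of-words stand-in; real systems call an embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "hr-policy.pdf": "Parental leave is twelve weeks paid for all full time employees.",
    "expense-policy.pdf": "Travel expenses must be filed within thirty days of the trip.",
}
index = [(src, c, embed(c)) for src, text in docs.items() for c in chunk(text)]

def retrieve(query: str, k: int = 1) -> list[tuple[str, str]]:
    qv = embed(query)
    ranked = sorted(index, key=lambda e: cosine(qv, e[2]), reverse=True)
    return [(src, c) for src, c, _ in ranked[:k]]

# Assemble a grounded prompt with citations from the top hits.
hits = retrieve("how long is parental leave")
prompt = "Answer using only these sources:\n" + "\n".join(
    f"[{src}] {c}" for src, c in hits)
print(prompt)
```

The shape is the point: ingestion and chunking happen offline, retrieval happens per query, and the model only ever sees cited source text.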

This architecture helps organizations create AI experiences that are grounded in internal knowledge, policies, documents, operational data, and business context.

Security and Governance Considerations

AI architecture must be designed with security from day one.

Key considerations include:

Identity and Access Management: Use Microsoft Entra ID to control who can access projects, models, data, and applications.

Secrets Management: Use Azure Key Vault to protect API keys, connection strings, and secrets.

Network Security: Use private endpoints and controlled network access where required for sensitive workloads.

Data Governance: Define which data sources can be used, what data can be indexed, and what data should be excluded.

Responsible AI: Implement safety filters, evaluation processes, human review, and output monitoring.

Operational Monitoring: Track latency, cost, usage, quality, failure rates, and user feedback.

Microsoft’s Azure Architecture Center recommends that AI and machine learning workloads follow Azure Well-Architected Framework guidance across the architecture pillars.

Enterprise Reference Architecture

For a production-grade implementation, I recommend the following architecture pattern:

1. Experience Layer
  • Web app
  • Teams app
  • Copilot extension
  • API endpoint

2. AI Orchestration Layer
  • Azure AI Foundry project
  • Prompt flow or agent workflow
  • Model routing
  • Tool orchestration

3. Knowledge Layer
  • Azure AI Search
  • Vector index
  • Enterprise semantic layer
  • Metadata and citations

4. Data Platform Layer
  • Microsoft Fabric
  • Azure Data Lake
  • Databricks
  • SQL databases
  • Business APIs

5. Governance Layer
  • Entra ID
  • Key Vault
  • Purview
  • Policy
  • Logging and audit

6. Operations Layer
  • Application Insights
  • Cost monitoring
  • Evaluation metrics
  • Feedback loop

This pattern allows enterprises to move beyond isolated AI pilots and create a repeatable foundation for AI delivery.

Best Practices for Architects

The most successful Azure AI Foundry implementations follow a few principles.

First, start with business value, not the model. The model is only one part of the solution. The real value comes from solving a business problem.

Second, design for governance early. AI without governance creates risk, duplication, and loss of trust.

Third, separate experimentation from production. Use projects, environments, access controls, and deployment practices to manage maturity.

Fourth, evaluate continuously. AI quality is not a one-time test. It must be measured through accuracy, groundedness, safety, latency, and business outcomes.

Fifth, build reusable architecture patterns. Every use case should not start from zero. Create repeatable templates for RAG, agents, document intelligence, workflow automation, and enterprise copilots.

Final Thought

Azure AI Foundry is not just another AI tool. It is becoming a strategic platform for building enterprise-grade AI applications and agents with structure, governance, and scalability.

For organizations serious about AI transformation, the goal should not be to build one chatbot. The goal should be to build an AI operating model where business teams, data teams, developers, architects, and governance leaders can collaborate on a secure, scalable, and reusable foundation.

That is the real promise of Azure AI Foundry: helping enterprises move from AI experimentation to AI execution.

Advanced retrieval for your AI Apps and Agents on Azure

Advanced retrieval on Azure lets AI agents move beyond “good-enough RAG” into precise, context-rich answers by combining hybrid search, graph reasoning, and agentic query planning. This blog post walks through what that means in practice, using a concrete retail example you can adapt to your own apps.


Why your agents need better retrieval

Most useful agents are really “finders”:

  • Shopper agents find products and inventory.
  • HR agents find policies and benefits rules.
  • Support agents find troubleshooting steps and logs.

If retrieval is weak, even the best model hallucinates or returns incomplete answers, which is why Retrieval-Augmented Generation (RAG) became the default pattern for enterprise AI apps.


Hybrid search: keywords + vectors + reranking

Different user queries benefit from different retrieval strategies: a precise SKU matches well with keyword search, while fuzzy “garden watering supplies” works better with vectors. Hybrid search runs both in parallel, then fuses them.

On Azure, a strong retrieval stack typically includes:

  • Keyword search using BM25 over an inverted index (great for exact terms and filters).
  • Vector search using embeddings with HNSW or DiskANN (great for semantic similarity).
  • Reciprocal Rank Fusion (RRF) to merge the two ranked lists into a single result set.
  • A semantic or cross-encoder reranker on top to reorder the final set by true relevance.
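Reciprocal Rank Fusion itself is simple enough to show directly. A common formulation scores each document as the sum of 1/(k + rank) over every result list it appears in, with k around 60; a minimal sketch:

```python
# Reciprocal Rank Fusion: merge keyword and vector result lists.
# score(d) = sum over lists of 1 / (k + rank_of_d_in_list), with k ~ 60.
def rrf(result_lists: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc in enumerate(results, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["garden-hose", "watering-can", "hose-reel"]
vector_hits = ["soaker-hose", "garden-hose", "sprinkler"]

print(rrf([keyword_hits, vector_hits]))
```

Notice that "garden-hose" rises to the top because it appears in both lists, while items strong in only one list still survive the merge; a reranker would then reorder the fused set.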

Example: “garden watering supplies”

Imagine a shopper agent backing a hardware store:

  1. User asks: “garden watering supplies”.
  2. Keyword search hits items mentioning “garden”, “hose”, “watering” in name/description.
  3. Vector search finds semantically related items like soaker hoses, planters, and sprinklers, even if the wording differs.
  4. RRF merges both lists so items strong in either keyword or semantic match rise together.
  5. A reranker model (e.g., Azure AI Search semantic ranker) re-scores top candidates using full text and query context.

This hybrid + reranking stack reliably outperforms pure vector or pure keyword across many query types, especially concept-seeking and long queries.


Going beyond hybrid: graph RAG with PostgreSQL

Some questions are not just “find documents” but “reason over relationships,” such as comparing reviews, features, or compliance constraints. A classic example:

“I want a cheap pair of headphones with noise cancellation and great reviews for battery life.”

Answering this well requires understanding relationships between:

  • Products
  • Features (noise cancellation, battery life)
  • Review sentiment about those specific features

Building a graph with Apache AGE

Azure Database for PostgreSQL plus Apache AGE turns relational and unstructured data into a queryable property graph, with nodes like Product, Feature, and Review, and edges such as HAS_FEATURE or positive_sentiment.

A typical flow in a retail scenario:

  1. Use azure_ai.extract() in PostgreSQL to pull product features and sentiments from free-text reviews into structured JSON (e.g., “battery life: positive”).
  2. Load these into an Apache AGE graph so each product connects to features and sentiment-weighted reviews.
  3. Use Cypher-style queries to answer questions like “headphones where noise cancellation and battery life reviews are mostly positive, sorted by review count.”
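The ranking logic behind step 3 can be illustrated without a live database. The sketch below models the graph as plain tuples and ranks products by the share of positive review mentions for the target features; the product names, features, and the Cypher shown in the comment are all made up for illustration.

```python
# Toy property graph (Product -HAS_FEATURE-> Feature, Review mentions a
# Feature with a sentiment label), mirroring what Apache AGE would store.
# Roughly equivalent Cypher (illustrative only):
#   MATCH (p:Product)-[:HAS_FEATURE]->(f:Feature)<-[m:MENTIONS]-(r:Review)
#   WHERE f.name IN ['noise cancellation', 'battery life']
#   RETURN p.name ORDER BY positive_share DESC
reviews = [
    ("QuietBuds", "noise cancellation", "positive"),
    ("QuietBuds", "battery life", "positive"),
    ("QuietBuds", "battery life", "positive"),
    ("BassMax", "noise cancellation", "negative"),
    ("BassMax", "battery life", "positive"),
]

def rank_by_positive_sentiment(target_features: set[str]) -> list[str]:
    """Rank products by share of positive mentions for the target features."""
    totals: dict[str, list[int]] = {}  # product -> [positive, all]
    for product, feature, sentiment in reviews:
        if feature in target_features:
            t = totals.setdefault(product, [0, 0])
            t[0] += sentiment == "positive"
            t[1] += 1
    return sorted(totals, key=lambda p: totals[p][0] / totals[p][1],
                  reverse=True)

print(rank_by_positive_sentiment({"noise cancellation", "battery life"}))
```

In the real pipeline the graph query does this aggregation inside PostgreSQL, and only the ranked product list is handed to the LLM.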

Your agent can then:

  • Use vector/hybrid search to shortlist candidate products.
  • Run a graph query to rank those products by positive feature sentiment.
  • Feed only the top graph results into the LLM for grounded, explainable answers.

Agentic retrieval: planning multi-part queries

Hybrid search and graph RAG still assume a single, well-formed query, but real users often ask multi-part or follow-up questions. Azure AI Search’s agentic retrieval addresses this by letting an LLM plan and execute multiple subqueries over your index.

Example: HR agent multi-part question

Consider an internal HR agent:

“I’m having a baby soon. What’s our parental leave policy, how do I add a baby to benefits, and what’s the open enrollment deadline?”

Agentic retrieval pipeline:

  1. Query planning
    • Decompose into subqueries: parental leave policy, dependent enrollment steps, open enrollment dates.
    • Fix spellings and incorporate chat history (“we talked about my role and region earlier”).
  2. Fan-out search
    • Run parallel searches over policy PDFs, benefits docs, and plan summary pages with hybrid search.
  3. Results merging and reranking
    • Merge results across subqueries, apply rankers, and surface the top snippets from each area.
  4. LLM synthesis
    • LLM draws from all retrieved slices to produce a single, coherent answer, citing relevant docs or links.
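The four stages above can be sketched with a rule-based planner standing in for the LLM (real agentic retrieval has the model generate and refine the subqueries), a toy keyword search standing in for hybrid search, and RRF as the merge step:

```python
# Agentic retrieval sketch: plan subqueries, fan out, merge with RRF.
def plan(question: str) -> list[str]:
    # Rule-based stand-in for the LLM query planner: split on connectives.
    parts = [p.strip(" ?") for p in question.replace(", and", ",").split(",")]
    return [p for p in parts if p]

corpus = {
    "leave.pdf": "parental leave policy twelve weeks paid",
    "benefits.pdf": "add a dependent baby to benefits within thirty days",
    "enrollment.pdf": "open enrollment deadline is november fifteen",
}

def search(subquery: str) -> list[str]:
    # Toy keyword overlap; real systems run hybrid search per subquery.
    q = set(subquery.lower().split())
    scored = {doc: len(q & set(text.split())) for doc, text in corpus.items()}
    return [d for d, s in sorted(scored.items(), key=lambda x: -x[1]) if s]

def rrf_merge(lists: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for results in lists:
        for rank, doc in enumerate(results, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

question = ("What's our parental leave policy, how do I add a baby to "
            "benefits, and what's the open enrollment deadline?")
subqueries = plan(question)
merged = rrf_merge([search(sq) for sq in subqueries])
print(subqueries)
print(merged)  # each policy doc surfaces for its matching subquery
```

The LLM synthesis step would then draw on the merged snippets, citing each source document.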

Microsoft’s evaluation shows agentic retrieval can materially increase answer quality and coverage for complex, multi-document questions compared to plain RAG.


Designing your own advanced retrieval flow

When turning this into a real solution on Azure, a pragmatic pattern looks like this:

  • Start with hybrid search + reranking as the default retrieval layer for most agents.
  • Introduce graph RAG with Apache AGE when:
    • You must reason over relationships (e.g., product–feature–review, user–role–policy).
    • You repeatedly join and aggregate across structured entities and unstructured text.
  • Add agentic retrieval in Azure AI Search for:
    • Multi-part questions.
    • Long-running conversations where context and follow-ups matter.

You can mix these strategies: use Azure AI Search’s agentic retrieval to plan and fan out queries, a PostgreSQL + AGE graph to compute relational insights, and then fuse everything back into a single grounded answer stream for your AI app or agent.

Embracing Responsible AI Practices for Traditional and Generative AI

Introduction: Artificial Intelligence (AI) is reshaping industries and enhancing human capabilities. From traditional AI models like recommendation systems to the transformative potential of generative AI, the need for responsible AI practices has never been more critical. As we navigate these advancements, it becomes imperative to ensure that AI operates ethically, transparently, and inclusively.

1. Understanding Responsibility in Traditional and Generative AI: Traditional AI, which powers applications like fraud detection and predictive analytics, focuses on processing structured data to provide specific outputs. Generative AI, on the other hand, uses advanced models like GPT to create new content, whether it’s text, images, or music. Despite their differences, both require responsible practices to prevent unintended consequences. Responsible AI involves fairness, accountability, and respect for user privacy.

2. Building Ethical AI Systems: For traditional AI, ethics often revolve around eliminating biases in data and ensuring models do not disproportionately harm certain groups. Practices like diverse data sourcing, periodic audits, and transparent algorithms play a critical role. Generative AI, due to its broader creative capabilities, has unique challenges, such as avoiding the generation of harmful or misleading content. Guidelines to include:

  • Training models with diverse and high-quality datasets.
  • Filtering outputs to prevent harmful language or misinformation.
  • Clearly disclosing AI-generated content to distinguish it from human-created work.

3. The Importance of Transparency: Transparency builds trust in both traditional and generative AI applications. Organizations should adopt practices like:

  • Documenting data sources, methodologies, and algorithms.
  • Communicating how AI decisions are made, whether it’s a product recommendation or a generated paragraph.
  • Introducing “explainability” features to demystify black-box algorithms, helping users understand why an AI reached a certain decision.

4. Ensuring Data Privacy and Security: Both traditional and generative AI rely on extensive data. Responsible AI practices prioritize:

  • Adhering to privacy regulations like GDPR or CCPA.
  • Implementing secure protocols to protect data from breaches.
  • Avoiding over-collection of personal data and ensuring users have control over how their data is used.

5. The Role of AI Governance: Strong governance frameworks are the cornerstone of responsible AI deployment. These include:

  • Establishing cross-functional AI ethics committees.
  • Conducting regular audits to identify ethical risks.
  • Embedding responsible AI principles into organizational policies and workflows.

6. The Future of Responsible AI: As AI evolves, so must the practices governing it. Collaboration between governments, tech companies, and academic institutions will be essential in setting global standards. Open-source initiatives and AI research organizations can drive accountability and innovation hand-in-hand.

Conclusion: Responsible AI is not just a regulatory necessity—it is a moral imperative. Traditional and generative AI hold the power to create significant societal impact, and organizations must harness this power thoughtfully. By embedding ethics, transparency, and governance into every stage of the AI lifecycle, we can ensure that AI contributes positively to humanity while mitigating risks.

Navigating the Enterprise LLM Life Cycle with Azure AI

Introduction: The rise of Large Language Models (LLMs) has revolutionized the way enterprises approach artificial intelligence. From customer support to content generation, LLMs are unlocking new possibilities. However, managing the life cycle of these models requires a strategic approach. Azure AI provides a robust framework for enterprises to operationalize, refine, and scale LLMs effectively.

1. Ideation and Exploration: The journey begins with identifying the business use case. Developers explore Azure AI’s model catalog, which includes foundation models from providers like OpenAI and Hugging Face. Using a subset of data, they prototype and evaluate models to validate business hypotheses. For example, in customer support, developers test sample queries to ensure the model generates helpful responses.

2. Experimentation and Refinement: Once a model is selected, the focus shifts to customization. Techniques like Retrieval Augmented Generation (RAG) allow enterprises to integrate local or real-time data into prompts. Developers iterate on prompts, chunking methods, and indexing to enhance model performance. Azure AI’s tools enable bulk testing and automated metrics for efficient refinement.
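Chunking strategy is one of the levers mentioned above. A sliding-window chunker with overlap is a common default, since overlap keeps context that spans a chunk boundary retrievable; a minimal sketch:

```python
# Sliding-window chunker with overlap: a common default for RAG indexing.
def chunk_with_overlap(text: str, size: int = 6, overlap: int = 2) -> list[str]:
    """Split text into word windows of `size`, each overlapping the
    previous window by `overlap` words."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return chunks

text = "one two three four five six seven eight nine ten"
for c in chunk_with_overlap(text):
    print(c)
```

Real pipelines tune `size` and `overlap` (usually measured in tokens, not words) against retrieval quality metrics rather than fixing them up front.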

3. Deployment and Monitoring: Deploying LLMs at scale requires careful planning. Azure AI supports seamless integration with enterprise systems, ensuring models are optimized for real-world applications. Continuous monitoring helps identify bottlenecks and areas for improvement. Azure AI’s Responsible AI Framework ensures ethical and accountable deployment.

4. Scaling and Optimization: As enterprises expand their use of LLMs, scalability becomes crucial. Azure AI offers solutions for managing large-scale deployments, including fine-tuning and real-time data integration. By leveraging Azure AI’s capabilities, businesses can achieve consistent performance across diverse scenarios.

Conclusion: The enterprise LLM life cycle is an iterative process that demands collaboration, innovation, and diligence. Azure AI empowers organizations to navigate this journey with confidence, unlocking the full potential of LLMs while adhering to ethical standards. Whether you’re just starting or scaling up, Azure AI is your partner in building the future of enterprise AI.

Developing LLM Applications Using Prompt Flow in Azure AI Studio

By Deepak Kaaushik, Microsoft MVP

Large Language Models (LLMs) are at the forefront of AI-driven innovation, shaping how organizations extract insights, interact with customers, and automate workflows. At the recent Canadian MVP Show, Rahat Yasir and I had the privilege of presenting a session on developing robust LLM applications using Prompt Flow in Azure AI Studio. Here’s a summary of our presentation, diving into the power and possibilities of Prompt Flow.


What is Prompt Flow?

Prompt Flow is an end-to-end platform for LLM application development, testing, and deployment. It is specifically designed to simplify complex workflows while ensuring high-quality outcomes through iterative testing and evaluation.

Key Features Include:

  • Flow Development: Combine LLMs, custom prompts, and Python scripts to create sophisticated workflows.
  • Prompt Tuning: Test different variants to optimize your application’s performance.
  • Evaluation Metrics: Assess model outputs using pre-defined metrics for quality and consistency.
  • Deployment and Monitoring: Seamlessly deploy your applications and monitor their performance over time.

Agenda of the Session

  1. Overview of Azure AI: Setting the stage with the foundational components of Azure AI Studio.
  2. Preparing the Environment: Ensuring optimal configurations for prompt flow workflows.
  3. Prompt Flow Overview: Exploring its architecture, lifecycle, and use cases.
  4. Capabilities: Highlighting the tools and functionalities that make Prompt Flow indispensable.
  5. Live Demo: Showcasing the evaluation of RAG (Retrieval-Augmented Generation) systems using Prompt Flow.

Prompt Flow Lifecycle

The lifecycle of Prompt Flow mirrors the iterative nature of AI development:

  1. Develop: Create flows with LLM integrations and Python scripting.
  2. Test: Fine-tune prompts to optimize performance for diverse use cases.
  3. Evaluate: Utilize robust metrics to validate outputs against expected standards.
  4. Deploy & Monitor: Transition applications into production and ensure continuous improvement.

RAG System Evaluation

One of the highlights of the session was a live demo on evaluating a Retrieval-Augmented Generation (RAG) system using Prompt Flow. RAG systems combine retrieval mechanisms with generative models, enabling more accurate and contextually relevant outputs.

Why RAG Matters

RAG architecture enhances LLMs by integrating factual retrieval from external sources, making them ideal for applications requiring high precision.

Evaluation in Prompt Flow

We showcased:

  • Custom Metrics: Designing tests to assess output relevance and factual accuracy.
  • Flow Types: Using modular tools in Prompt Flow to streamline evaluation.
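A custom metric of this kind can start very simply, for example a groundedness check measuring what fraction of the answer's content words appear in the retrieved context. The version below is a toy token-overlap metric; production evaluations in Prompt Flow typically use an LLM judge or model-based scorers instead.

```python
# Toy groundedness metric: fraction of answer content words found in
# the retrieved context. Real evaluations typically use an LLM judge.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and"}

def groundedness(answer: str, context: str) -> float:
    answer_terms = {w for w in answer.lower().split() if w not in STOPWORDS}
    context_terms = set(context.lower().split())
    if not answer_terms:
        return 0.0
    return len(answer_terms & context_terms) / len(answer_terms)

ctx = "the warranty covers repairs for two years from purchase"
good = "warranty covers repairs for two years"
bad = "warranty covers accidental damage forever"
print(groundedness(good, ctx), groundedness(bad, ctx))
```

Even a crude metric like this, run in bulk over a test set, separates answers that stick to the retrieved sources from ones that drift.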

Empowering You to Build Smarter Applications

Prompt Flow equips developers and data scientists with the tools to build smarter, scalable, and reliable AI applications. Whether you’re experimenting with LLM prompts or refining a RAG workflow, Prompt Flow makes the process intuitive and effective.


Join the Journey

To learn more, visit the Prompt Flow documentation. Your feedback and questions are always welcome!

Thank you to everyone who joined the session. Together, let’s continue pushing the boundaries of AI innovation.

Deepak Kaaushik
Microsoft MVP | Cloud Solution Architect