# Mission-Critical SaaS: API Gateway and AI Gateway Architecture with Azure API Management

## Executive Summary
This document provides comprehensive architectural guidance for implementing both an API Gateway and a dedicated AI Gateway in a mission-critical SaaS application hosted on Azure. The solution leverages Azure Container Apps for microservices, with Azure API Management (APIM) serving as the foundation for both gateways. This architecture addresses multi-region deployment for high availability and disaster recovery, priority-based AI request handling, and separation of concerns between general API traffic and AI/GenAI workloads.
## Table of Contents
- Architecture Overview
- Architecture Decision: Combined vs. Separate Gateways
- Recommended Architecture: Hybrid Approach
- Multi-Region Deployment Strategy
- AI Gateway Design Patterns
- Priority and Request Handling for AI Services
- Load Balancing and Resilience
- Security Considerations
- Monitoring and Observability
- Implementation Guidance
- References
## Architecture Overview

### High-Level Architecture
```mermaid
flowchart TB
    subgraph External["External Clients"]
        EC1[Web Apps]
        EC2[Mobile Apps]
        EC3[Partner APIs]
        EC4[External AI Consumers]
    end
    subgraph GlobalRouting["Global Traffic Routing"]
        AFD[Azure Front Door]
    end
    subgraph Region1["Primary Region - East US"]
        subgraph APIM1["Azure API Management Premium"]
            AG1[API Gateway<br/>External APIs]
            AIG1[AI Gateway<br/>GenAI Services]
        end
        subgraph ACA1["Azure Container Apps Environment"]
            MS1[Microservice A]
            MS2[Microservice B]
            MS3[Microservice C]
            MS4[AI Orchestrator Service]
        end
        subgraph AI1["Azure OpenAI Services"]
            AOAI1_PTU[Azure OpenAI<br/>PTU Instance]
            AOAI1_PAYG[Azure OpenAI<br/>PAYG Instance]
        end
    end
    subgraph Region2["Secondary Region - West US"]
        subgraph APIM2["Azure API Management Premium"]
            AG2[API Gateway<br/>External APIs]
            AIG2[AI Gateway<br/>GenAI Services]
        end
        subgraph ACA2["Azure Container Apps Environment"]
            MS1B[Microservice A]
            MS2B[Microservice B]
            MS3B[Microservice C]
            MS4B[AI Orchestrator Service]
        end
        subgraph AI2["Azure OpenAI Services"]
            AOAI2_PTU[Azure OpenAI<br/>PTU Instance]
            AOAI2_PAYG[Azure OpenAI<br/>PAYG Instance]
        end
    end
    EC1 --> AFD
    EC2 --> AFD
    EC3 --> AFD
    EC4 --> AFD
    AFD --> AG1
    AFD --> AG2
    AFD --> AIG1
    AFD --> AIG2
    AG1 --> MS1
    AG1 --> MS2
    AG1 --> MS3
    AIG1 --> MS4
    MS4 --> AIG1
    AIG1 --> AOAI1_PTU
    AIG1 --> AOAI1_PAYG
    MS1 --> AIG1
    MS2 --> AIG1
    MS3 --> AIG1
    AG2 --> MS1B
    AG2 --> MS2B
    AG2 --> MS3B
    AIG2 --> MS4B
    MS4B --> AIG2
    AIG2 --> AOAI2_PTU
    AIG2 --> AOAI2_PAYG
    MS1B --> AIG2
    MS2B --> AIG2
    MS3B --> AIG2
```
### Key Components
| Component | Purpose | Azure Service |
|---|---|---|
| Global Traffic Router | Latency-based routing, failover, WAF | Azure Front Door |
| API Gateway | External/internal API management, routing, security | Azure API Management Premium |
| AI Gateway | GenAI request management, load balancing, token limits | Azure API Management Premium |
| Microservices Platform | Containerized workloads | Azure Container Apps |
| AI Services | LLM inference, embeddings | Azure OpenAI Service |
## Architecture Decision: Combined vs. Separate Gateways

### Option 1: Single Combined Gateway
A single APIM instance handles both traditional API traffic and AI/GenAI traffic.
```mermaid
flowchart LR
    subgraph Clients
        C1[External Clients]
        C2[Internal Services]
    end
    subgraph CombinedGateway["Single APIM Instance"]
        APIs[Traditional APIs]
        AIAPIs[AI APIs]
    end
    subgraph Backend
        MS[Microservices]
        AI[Azure OpenAI]
    end
    C1 --> CombinedGateway
    C2 --> CombinedGateway
    APIs --> MS
    AIAPIs --> AI
```
**Pros:**

- Simpler management with a single control plane
- Lower operational overhead
- Unified monitoring and logging
- Cost-effective for smaller deployments

**Cons:**

- Risk of noisy-neighbor issues between AI and regular API traffic
- AI workloads may consume disproportionate resources
- Difficult to apply different SLAs and rate-limiting strategies
- Scaling constraints (AI traffic spikes affect all APIs)
### Option 2: Fully Separate Gateways
Two completely independent APIM instances - one for APIs, one for AI.
```mermaid
flowchart LR
    subgraph Clients
        C1[External Clients]
        C2[Internal Services]
    end
    subgraph APIGateway["APIM Instance 1"]
        APIs[API Gateway]
    end
    subgraph AIGateway["APIM Instance 2"]
        AIAPIs[AI Gateway]
    end
    subgraph Backend
        MS[Microservices]
        AI[Azure OpenAI]
    end
    C1 --> APIGateway
    C1 --> AIGateway
    C2 --> AIGateway
    APIs --> MS
    AIAPIs --> AI
    MS -.-> AIGateway
```
**Pros:**

- Complete isolation between workloads
- Independent scaling for AI-specific demands
- Separate rate limiting and quota management
- Different security policies per gateway
- Easier to implement AI-specific features

**Cons:**

- Higher cost (two Premium APIM instances per region)
- More complex management
- Duplicate configuration for common policies
- Multiple endpoints for clients to manage
### Option 3: Hybrid Approach (Recommended)
Single APIM instance with logical separation using Products, Workspaces, or distinct API versioning, combined with dedicated backend pools.
```mermaid
flowchart TB
    subgraph ExternalZone["External Access Zone"]
        AFD[Azure Front Door<br/>+ WAF]
    end
    subgraph APIM["Azure API Management Premium<br/>Multi-Region Deployment"]
        subgraph Products["Logical Separation via Products/Workspaces"]
            P1[Product: External APIs<br/>Rate Limits: Standard]
            P2[Product: AI Services<br/>Rate Limits: Token-based]
            P3[Product: Internal APIs<br/>Rate Limits: High throughput]
        end
        subgraph Backends["Backend Pools"]
            BP1[Backend Pool: Microservices]
            BP2[Backend Pool: Azure OpenAI<br/>PTU Priority + PAYG Spillover]
        end
    end
    subgraph Internal["Internal Services Zone"]
        ACA[Azure Container Apps<br/>Microservices]
    end
    subgraph AIServices["AI Services Zone"]
        AOAI1[Azure OpenAI PTU]
        AOAI2[Azure OpenAI PAYG]
    end
    AFD --> P1
    AFD --> P2
    P1 --> BP1
    P2 --> BP2
    P3 --> BP1
    P3 --> BP2
    BP1 --> ACA
    BP2 --> AOAI1
    BP2 --> AOAI2
    ACA -.->|Internal AI Requests| P3
```
**Pros:**

- Single control plane with logical isolation
- Cost-effective (one Premium instance per region)
- Flexible product-based access control
- Centralized monitoring with workload segregation
- Ability to apply AI-specific policies per product

**Cons:**

- Requires careful capacity planning
- More complex policy configuration
- Shared infrastructure (though logically separated)
## Recommended Architecture: Hybrid Approach

For mission-critical SaaS applications, the Hybrid Approach provides the best balance of cost, manageability, and separation of concerns.

### Architecture Details
```mermaid
flowchart TB
    subgraph ExternalClients["External Clients"]
        WEB[Web Applications]
        MOBILE[Mobile Apps]
        PARTNER[Partner Systems]
    end
    subgraph GlobalLayer["Global Routing Layer"]
        AFD["Azure Front Door Premium<br/>• WAF Protection<br/>• SSL Termination<br/>• Health Probes<br/>• Latency-based Routing"]
    end
    subgraph PrimaryRegion["Primary Region (East US)"]
        subgraph APIMPrimary["Azure API Management Premium"]
            subgraph ExtAPIs["External API Product"]
                EA1[/orders API/]
                EA2[/products API/]
                EA3[/customers API/]
            end
            subgraph AIProduct["AI Gateway Product"]
                AI1[/chat/completions/]
                AI2[/embeddings/]
                AI3[/assistants/]
            end
            subgraph IntAPIs["Internal API Product"]
                IA1[/internal/ai/]
                IA2[/internal/workflow/]
            end
        end
        subgraph ACAEnv1["Container Apps Environment"]
            SVC1[Order Service]
            SVC2[Product Service]
            SVC3[Customer Service]
            SVC4[AI Orchestrator]
        end
        subgraph AOAIPrimary["Azure OpenAI"]
            PTU1["PTU Deployment<br/>gpt-4 (High Priority)"]
            PAYG1["PAYG Deployment<br/>gpt-4 (Spillover)"]
        end
    end
    subgraph SecondaryRegion["Secondary Region (West US)"]
        subgraph APIMSecondary["Azure API Management Premium<br/>(Same Instance - Multi-Region)"]
            ExtAPIs2[External APIs]
            AIProduct2[AI Gateway]
            IntAPIs2[Internal APIs]
        end
        subgraph ACAEnv2["Container Apps Environment"]
            SVC1B[Order Service]
            SVC2B[Product Service]
            SVC3B[Customer Service]
            SVC4B[AI Orchestrator]
        end
        subgraph AOAISecondary["Azure OpenAI"]
            PTU2["PTU Deployment<br/>gpt-4 (High Priority)"]
            PAYG2["PAYG Deployment<br/>gpt-4 (Spillover)"]
        end
    end
    WEB --> AFD
    MOBILE --> AFD
    PARTNER --> AFD
    AFD -->|"Low Latency"| APIMPrimary
    AFD -->|"Failover"| APIMSecondary
    ExtAPIs --> SVC1
    ExtAPIs --> SVC2
    ExtAPIs --> SVC3
    AIProduct --> PTU1
    AIProduct -.->|"429 Spillover"| PAYG1
    SVC4 -->|"Internal AI Calls"| IntAPIs
    IntAPIs --> PTU1
    ExtAPIs2 --> SVC1B
    ExtAPIs2 --> SVC2B
    ExtAPIs2 --> SVC3B
    AIProduct2 --> PTU2
    AIProduct2 -.->|"429 Spillover"| PAYG2
    SVC4B --> IntAPIs2
    IntAPIs2 --> PTU2
```
### Product Configuration Strategy
| Product | Target Consumers | Rate Limiting | Features |
|---|---|---|---|
| External APIs | External clients, partners | Requests/sec per subscription | OAuth 2.0, API keys, standard throttling |
| AI Gateway (External) | External AI consumers | Token-based (TPM) limits | Semantic caching, content safety, priority queuing |
| Internal APIs | Backend microservices | Higher limits, service identity | Managed identity auth, circuit breaker |
## Multi-Region Deployment Strategy

### Active-Active Multi-Region Configuration
For mission-critical workloads requiring 99.99%+ SLA, deploy APIM Premium with multi-region gateways.
```mermaid
flowchart TB
    subgraph Global["Global Resources"]
        AFD[Azure Front Door]
        DNS[Azure DNS]
    end
    subgraph EastUS["East US (Primary)"]
        APIM_E["APIM Gateway<br/>3 Units + AZ"]
        ACA_E[Container Apps]
        AOAI_E[Azure OpenAI]
        COSMOS_E[(Cosmos DB<br/>Multi-Region Write)]
    end
    subgraph WestUS["West US (Secondary)"]
        APIM_W["APIM Gateway<br/>3 Units + AZ"]
        ACA_W[Container Apps]
        AOAI_W[Azure OpenAI]
    end
    subgraph WestEurope["West Europe (Tertiary)"]
        APIM_EU["APIM Gateway<br/>2 Units + AZ"]
        ACA_EU[Container Apps]
        AOAI_EU[Azure OpenAI]
    end
    AFD --> |"Latency Routing"| APIM_E
    AFD --> |"Latency Routing"| APIM_W
    AFD --> |"Latency Routing"| APIM_EU
    APIM_E --> ACA_E
    APIM_E --> AOAI_E
    ACA_E --> COSMOS_E
    APIM_W --> ACA_W
    APIM_W --> AOAI_W
    ACA_W --> COSMOS_E
    APIM_EU --> ACA_EU
    APIM_EU --> AOAI_EU
    ACA_EU --> COSMOS_E
```
### Region-Aware Backend Routing
Use APIM policies to route requests to regional backend services:
```xml
<policies>
    <inbound>
        <base />
        <!-- Route to the regional Azure OpenAI endpoint matching the gateway region -->
        <choose>
            <when condition="@(&quot;East US&quot;.Equals(context.Deployment.Region, StringComparison.OrdinalIgnoreCase))">
                <set-backend-service base-url="https://aoai-eastus.openai.azure.com/" />
            </when>
            <when condition="@(&quot;West US&quot;.Equals(context.Deployment.Region, StringComparison.OrdinalIgnoreCase))">
                <set-backend-service base-url="https://aoai-westus.openai.azure.com/" />
            </when>
            <when condition="@(&quot;West Europe&quot;.Equals(context.Deployment.Region, StringComparison.OrdinalIgnoreCase))">
                <set-backend-service base-url="https://aoai-westeurope.openai.azure.com/" />
            </when>
            <otherwise>
                <set-backend-service base-url="https://aoai-eastus.openai.azure.com/" />
            </otherwise>
        </choose>
    </inbound>
</policies>
```
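The routing decision above is just a region-to-endpoint lookup with a default, which can be expressed compactly for testing routing assumptions outside APIM. A minimal Python sketch; the endpoint names mirror the policy, and the fall-through matches its `<otherwise>` branch:

```python
# Regional endpoint map mirroring the routing policy (illustrative names).
REGIONAL_BACKENDS = {
    "east us": "https://aoai-eastus.openai.azure.com/",
    "west us": "https://aoai-westus.openai.azure.com/",
    "west europe": "https://aoai-westeurope.openai.azure.com/",
}

def backend_for_region(gateway_region: str) -> str:
    """Case-insensitive lookup; unknown regions fall through to the
    primary (East US) endpoint, like the policy's <otherwise> branch."""
    return REGIONAL_BACKENDS.get(gateway_region.strip().lower(),
                                 REGIONAL_BACKENDS["east us"])
```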
## AI Gateway Design Patterns

### 1. Load Balancing with Circuit Breaker
```mermaid
flowchart LR
    subgraph AIGateway["AI Gateway (APIM)"]
        LB[Backend Load Balancer]
        CB[Circuit Breaker]
    end
    subgraph Backends["Azure OpenAI Backend Pool"]
        PTU1[PTU Instance 1<br/>Priority: 1]
        PTU2[PTU Instance 2<br/>Priority: 1]
        PAYG[PAYG Instance<br/>Priority: 2]
    end
    Request[AI Request] --> LB
    LB --> CB
    CB -->|"Healthy"| PTU1
    CB -->|"Healthy"| PTU2
    CB -.->|"Spillover/429"| PAYG
```
#### Backend Pool Configuration
```json
{
  "backends": [
    {
      "url": "https://aoai-ptu-primary.openai.azure.com",
      "priority": 1,
      "weight": 50
    },
    {
      "url": "https://aoai-ptu-secondary.openai.azure.com",
      "priority": 1,
      "weight": 50
    },
    {
      "url": "https://aoai-payg-spillover.openai.azure.com",
      "priority": 2,
      "weight": 100
    }
  ],
  "circuitBreaker": {
    "rules": [
      {
        "failureCondition": {
          "count": 3,
          "interval": "PT10S",
          "statusCodeRanges": [
            { "min": 429, "max": 429 },
            { "min": 500, "max": 599 }
          ]
        },
        "tripDuration": "PT30S",
        "acceptRetryAfter": true
      }
    ]
  }
}
```
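The pool semantics above (priority groups tried in order, weighted distribution within a group, fall-through when every member of a group is tripped) can be sketched in a few lines. This is a behavioral illustration only, not APIM's actual implementation; `unavailable` stands in for backends whose circuit breaker is currently open:

```python
import random

def pick_backend(backends, unavailable=frozenset()):
    """Select a backend the way the pool config behaves: try the group
    with the lowest priority number first, weighted-random within the
    group, and fall through to the next group when none are healthy."""
    for priority in sorted({b["priority"] for b in backends}):
        group = [b for b in backends
                 if b["priority"] == priority and b["url"] not in unavailable]
        if group:
            weights = [b["weight"] for b in group]
            return random.choices(group, weights=weights, k=1)[0]["url"]
    raise RuntimeError("no healthy backend available")

# Same pool as the JSON configuration above.
backends = [
    {"url": "https://aoai-ptu-primary.openai.azure.com", "priority": 1, "weight": 50},
    {"url": "https://aoai-ptu-secondary.openai.azure.com", "priority": 1, "weight": 50},
    {"url": "https://aoai-payg-spillover.openai.azure.com", "priority": 2, "weight": 100},
]
```

With both PTU instances tripped, every pick lands on the priority-2 PAYG backend, which is the spillover behavior the configuration encodes.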
### 2. PTU to PAYG Spillover Strategy
```mermaid
sequenceDiagram
    participant Client
    participant AIGateway as AI Gateway
    participant PTU as Azure OpenAI (PTU)
    participant PAYG as Azure OpenAI (PAYG)
    Client->>AIGateway: POST /chat/completions
    AIGateway->>PTU: Forward Request
    alt PTU Available
        PTU-->>AIGateway: 200 OK + Response
        AIGateway-->>Client: 200 OK + Response
    else PTU Throttled (429)
        PTU-->>AIGateway: 429 Too Many Requests
        Note over AIGateway: Circuit Breaker Activates<br/>Route to PAYG
        AIGateway->>PAYG: Forward Request
        PAYG-->>AIGateway: 200 OK + Response
        AIGateway-->>Client: 200 OK + Response
    end
```
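The spillover flow in the sequence diagram reduces to a single guard: forward to PTU, and only when it answers 429 re-send to PAYG. A minimal sketch with `call_ptu` and `call_payg` as injected callables returning `(status_code, body)`, so the control flow can be exercised without live endpoints:

```python
def complete_with_spillover(request, call_ptu, call_payg):
    """Mirror the sequence above: prefer the PTU deployment, and spill
    over to the pay-as-you-go deployment only on a 429 from PTU."""
    status, body = call_ptu(request)
    if status == 429:  # PTU throttled: route the same request to PAYG
        status, body = call_payg(request)
    return status, body
```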
### 3. Token Rate Limiting
Apply token-based rate limiting per consumer:
```xml
<policies>
    <inbound>
        <base />
        <!-- Token-based limit per subscription for AI APIs -->
        <llm-token-limit
            counter-key="@(context.Subscription.Id)"
            tokens-per-minute="10000"
            estimate-prompt-tokens="true"
            remaining-tokens-variable-name="remainingTokens" />
    </inbound>
</policies>
```
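Conceptually, `llm-token-limit` maintains a per-key tokens-per-minute counter and rejects requests that would exceed it. A simplified fixed-window analogue in Python; the real policy also estimates prompt tokens before forwarding, which this sketch leaves to the caller:

```python
import time

class TokenRateLimiter:
    """Per-key tokens-per-minute limiter, a rough analogue of the
    llm-token-limit policy above (fixed one-minute windows)."""

    def __init__(self, tokens_per_minute):
        self.limit = tokens_per_minute
        self.windows = {}  # counter_key -> (window_start, tokens_used)

    def try_consume(self, counter_key, tokens, now=None):
        now = time.monotonic() if now is None else now
        start, used = self.windows.get(counter_key, (now, 0))
        if now - start >= 60:  # window elapsed: start a fresh one
            start, used = now, 0
        if used + tokens > self.limit:
            return False       # caller should answer with HTTP 429
        self.windows[counter_key] = (start, used + tokens)
        return True
```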
## Priority and Request Handling for AI Services

### Priority Queue Architecture
```mermaid
flowchart TB
    subgraph Consumers["Request Sources"]
        HC[High Priority<br/>Critical Business Ops]
        MC[Medium Priority<br/>User-Facing Features]
        LC[Low Priority<br/>Batch Processing]
    end
    subgraph AIGateway["AI Gateway"]
        PQ[Priority Queue<br/>Classification]
        RL[Rate Limiter]
        CB[Circuit Breaker]
    end
    subgraph Processing["Backend Processing"]
        PTU[PTU Instances<br/>Reserved Capacity]
        PAYG[PAYG Instances<br/>Burst Capacity]
    end
    HC -->|"Priority: 1"| PQ
    MC -->|"Priority: 2"| PQ
    LC -->|"Priority: 3"| PQ
    PQ --> RL
    RL --> CB
    CB -->|"High Priority First"| PTU
    CB -.->|"Spillover"| PAYG
```
### Priority-Based Routing Policy
```xml
<policies>
    <inbound>
        <base />
        <!-- Extract priority from a request header; default to medium -->
        <set-variable name="requestPriority"
            value="@(context.Request.Headers.GetValueOrDefault(&quot;X-Priority&quot;, &quot;medium&quot;))" />
        <choose>
            <!-- High priority: direct to PTU with no throttling -->
            <when condition="@((string)context.Variables[&quot;requestPriority&quot;] == &quot;high&quot;)">
                <set-backend-service backend-id="aoai-ptu-primary" />
                <set-header name="X-Route" exists-action="override">
                    <value>ptu-priority</value>
                </set-header>
            </when>
            <!-- Medium priority: PTU with spillover to PAYG -->
            <when condition="@((string)context.Variables[&quot;requestPriority&quot;] == &quot;medium&quot;)">
                <set-backend-service backend-id="aoai-backend-pool" />
            </when>
            <!-- Low priority: PAYG only, with aggressive rate limiting -->
            <otherwise>
                <rate-limit-by-key
                    calls="10"
                    renewal-period="60"
                    counter-key="@(context.Subscription.Id)" />
                <set-backend-service backend-id="aoai-payg" />
            </otherwise>
        </choose>
    </inbound>
</policies>
```
### Consumer-Based Quota Allocation
| Consumer Type | TPM Quota | Priority | Backend Pool |
|---|---|---|---|
| Critical Operations | 50,000 | High | PTU Only |
| User-Facing Apps | 20,000 | Medium | PTU + PAYG Spillover |
| Batch Processing | 5,000 | Low | PAYG Only |
| Development/Test | 1,000 | Low | PAYG (Shared) |
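When quota allocation is driven from subscription metadata, the table above becomes a small lookup. A sketch with hypothetical consumer-type tags and backend IDs; the names are illustrative, not APIM built-ins:

```python
# Quota map transcribed from the table above (illustrative tags and IDs).
QUOTAS = {
    "critical": {"tpm": 50_000, "priority": "high",   "pool": "aoai-ptu-primary"},
    "user":     {"tpm": 20_000, "priority": "medium", "pool": "aoai-backend-pool"},
    "batch":    {"tpm": 5_000,  "priority": "low",    "pool": "aoai-payg"},
    "dev":      {"tpm": 1_000,  "priority": "low",    "pool": "aoai-payg"},
}

def resolve_quota(consumer_type):
    """Unknown consumer types default to the most restrictive tier,
    so an untagged subscription can never claim PTU capacity."""
    return QUOTAS.get(consumer_type, QUOTAS["dev"])
```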
## Load Balancing and Resilience

### Multi-Backend Load Balancing
```mermaid
flowchart TB
    subgraph Gateway["AI Gateway"]
        LB["Load Balancer<br/>Round-Robin + Priority"]
    end
    subgraph PTUPool["PTU Backend Pool (Priority 1)"]
        PTU1["PTU East US<br/>Weight: 50%"]
        PTU2["PTU West US<br/>Weight: 50%"]
    end
    subgraph PAYGPool["PAYG Backend Pool (Priority 2)"]
        PAYG1["PAYG East US<br/>Weight: 50%"]
        PAYG2["PAYG West US<br/>Weight: 50%"]
    end
    LB -->|"Active"| PTU1
    LB -->|"Active"| PTU2
    LB -.->|"Spillover"| PAYG1
    LB -.->|"Spillover"| PAYG2
```
### Retry and Circuit Breaker Configuration
```xml
<policies>
    <backend>
        <retry condition="@(context.Response.StatusCode == 429 || context.Response.StatusCode >= 500)"
               count="3"
               interval="1"
               delta="1"
               max-interval="10"
               first-fast-retry="true">
            <forward-request buffer-request-body="true" />
        </retry>
    </backend>
    <on-error>
        <base />
        <choose>
            <when condition="@(context.Response.StatusCode == 429)">
                <!-- Return Retry-After header to client -->
                <return-response>
                    <set-status code="429" reason="Too Many Requests" />
                    <set-header name="Retry-After" exists-action="override">
                        <value>@(context.Response.Headers.GetValueOrDefault("Retry-After", "30"))</value>
                    </set-header>
                </return-response>
            </when>
        </choose>
    </on-error>
</policies>
```
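Clients should complement the gateway-side retry with their own backoff that honors the `Retry-After` header the policy propagates. A minimal client-side sketch; `send` and `sleep` are injected so the loop can be tested offline:

```python
import time

def call_with_retry(send, max_retries=3, base_delay=1.0, max_delay=10.0,
                    sleep=time.sleep):
    """Retry on 429/5xx with capped exponential backoff, preferring the
    server-supplied Retry-After delay when present. `send` returns a
    (status, headers, body) tuple."""
    attempt = 0
    while True:
        status, headers, body = send()
        if status != 429 and status < 500:
            return status, headers, body      # success or client error
        if attempt >= max_retries:
            return status, headers, body      # budget exhausted
        retry_after = headers.get("Retry-After")
        delay = (float(retry_after) if retry_after
                 else min(base_delay * 2 ** attempt, max_delay))
        sleep(delay)
        attempt += 1
```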
## Security Considerations

### Authentication Architecture
```mermaid
flowchart LR
    subgraph ExternalAuth["External Authentication"]
        OAuth[OAuth 2.0 / OIDC]
        APIKey[API Key]
    end
    subgraph Gateway["AI Gateway"]
        Validate[Token Validation]
        Transform[Credential Transform]
    end
    subgraph InternalAuth["Internal Authentication"]
        MI[Managed Identity]
    end
    subgraph Backend["Azure OpenAI"]
        AOAI[Azure OpenAI Service]
    end
    OAuth --> Validate
    APIKey --> Validate
    Validate --> Transform
    Transform --> MI
    MI --> AOAI
```
### Security Best Practices

- **Terminate client credentials at the gateway**: use managed identity for backend connections
- **Apply content safety policies**: integrate Azure AI Content Safety
- **Implement PII detection**: scan prompts before forwarding
- **Network isolation**: deploy APIM and backends in private virtual networks
### Content Safety Integration
```xml
<policies>
    <inbound>
        <base />
        <!-- Screen prompts with Azure AI Content Safety before forwarding -->
        <llm-content-safety backend-id="content-safety-backend">
            <categories output-type="FourSeverityLevels">
                <category name="Hate" threshold="4" />
                <category name="Violence" threshold="4" />
                <category name="Sexual" threshold="4" />
                <category name="SelfHarm" threshold="4" />
            </categories>
            <blocklists>
                <id>custom-blocklist-1</id>
            </blocklists>
        </llm-content-safety>
    </inbound>
</policies>
```
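The category gate reduces to a threshold comparison over severity scores. A toy Python version, assuming the four-severity scale (0/2/4/6) used by Azure AI Content Safety; the `scores` dict stands in for the analysis response, which this sketch does not fetch:

```python
def passes_content_safety(scores, thresholds):
    """Block when any category's severity meets or exceeds its
    configured threshold; categories absent from the scores are
    treated as severity 0 (safe)."""
    return all(scores.get(category, 0) < limit
               for category, limit in thresholds.items())
```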
## Monitoring and Observability

### Observability Architecture
```mermaid
flowchart TB
    subgraph Gateway["AI Gateway"]
        Metrics[Token Metrics Emission]
        Logs[Request/Response Logging]
    end
    subgraph Monitoring["Azure Monitor"]
        AppInsights[Application Insights]
        LogAnalytics[Log Analytics]
        Alerts[Azure Alerts]
    end
    subgraph Dashboards["Visualization"]
        Workbooks[Azure Workbooks]
        Grafana[Azure Managed Grafana]
    end
    Gateway --> AppInsights
    Gateway --> LogAnalytics
    AppInsights --> Alerts
    LogAnalytics --> Alerts
    AppInsights --> Workbooks
    LogAnalytics --> Grafana
```
### Token Metrics Emission Policy
```xml
<policies>
    <inbound>
        <base />
        <!-- Emit token metrics for chargeback and monitoring -->
        <llm-emit-token-metric namespace="genai-metrics">
            <dimension name="Subscription" value="@(context.Subscription.Id)" />
            <dimension name="Product" value="@(context.Product.Name)" />
            <dimension name="API" value="@(context.Api.Name)" />
            <dimension name="Region" value="@(context.Deployment.Region)" />
            <dimension name="Model" value="@(context.Request.Headers.GetValueOrDefault(&quot;X-Model&quot;, &quot;unknown&quot;))" />
        </llm-emit-token-metric>
    </inbound>
</policies>
```
### Key Metrics to Monitor
| Metric | Description | Alert Threshold |
|---|---|---|
| Total Tokens | Total tokens consumed per subscription | 80% of quota |
| Prompt Tokens | Input tokens per request | Anomaly detection |
| Completion Tokens | Output tokens per request | Anomaly detection |
| 429 Rate | Throttling frequency | > 5% of requests |
| Latency P95 | 95th percentile response time | > 5 seconds |
| Circuit Breaker Trips | Backend failures | Any occurrence |
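Two of the alert conditions above, the 429 rate and the latency P95, can be derived directly from raw request records. A sketch using nearest-rank P95, with each record as a `(status_code, latency_seconds)` tuple:

```python
def alert_signals(requests, throttle_threshold=0.05, p95_threshold=5.0):
    """Evaluate the 429-rate and P95-latency alert conditions from the
    table above over a window of (status_code, latency_seconds) records."""
    statuses = [status for status, _ in requests]
    latencies = sorted(latency for _, latency in requests)
    rate_429 = statuses.count(429) / len(statuses)
    # Nearest-rank P95: the latency at or below which 95% of requests fall.
    p95 = latencies[max(0, int(len(latencies) * 0.95) - 1)]
    return {"throttle_alert": rate_429 > throttle_threshold,
            "latency_alert": p95 > p95_threshold}
```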
## Implementation Guidance

### Phase 1: Foundation (Weeks 1-2)
- Deploy Azure API Management Premium in primary region
- Configure availability zones (minimum 3 units)
- Set up virtual network integration
- Create Product structure (External APIs, AI Gateway, Internal APIs)
### Phase 2: AI Gateway Configuration (Weeks 3-4)
- Import Azure OpenAI API definitions
- Configure backend pools with PTU and PAYG instances
- Implement load balancing policies
- Set up circuit breaker rules
### Phase 3: Multi-Region Expansion (Weeks 5-6)
- Add secondary region to APIM instance
- Deploy regional Azure OpenAI instances
- Configure region-aware routing policies
- Set up Azure Front Door with health probes
### Phase 4: Security and Monitoring (Weeks 7-8)
- Implement managed identity authentication
- Configure content safety policies
- Set up Application Insights integration
- Create monitoring dashboards and alerts
### Deployment Checklist
- [ ] APIM Premium tier deployed with availability zones
- [ ] Multi-region gateways configured
- [ ] Backend pools defined for PTU/PAYG spillover
- [ ] Token rate limiting policies applied
- [ ] Circuit breaker configured
- [ ] Managed identity authentication enabled
- [ ] Content safety integration complete
- [ ] Monitoring and alerting configured
- [ ] Disaster recovery runbooks documented
## Summary
For your mission-critical SaaS application, the Hybrid Approach with a single APIM Premium instance provides:
| Requirement | Solution |
|---|---|
| High Availability | Multi-region APIM deployment with active-active configuration |
| Disaster Recovery | Automatic failover via Azure Front Door |
| AI Request Priority | Product-based segregation with priority routing policies |
| Cost Optimization | PTU for predictable workloads, PAYG for spillover |
| Security | Managed identity, content safety, network isolation |
| Observability | Token metrics, request logging, alerting |
The architecture maintains logical separation between API Gateway and AI Gateway functionality while sharing infrastructure for cost efficiency and simplified management.
## References
- AI Gateway in Azure API Management - Microsoft Learn
- Use a Gateway in Front of Multiple Azure OpenAI Deployments - Azure Architecture Center
- GenAI Gateway Reference Architecture using APIM - AI Playbook
- Key Considerations for Designing a GenAI Gateway Solution - AI Playbook
- Deploy Azure API Management to Multiple Regions - Microsoft Learn
- Reliability in Azure API Management - Microsoft Learn
- Azure API Management Landing Zone Architecture - Azure Architecture Center
- Mission-Critical Architecture Pattern - Well-Architected Framework
- Microservices with Azure Container Apps - Microsoft Learn
- API Gateway Pattern for Microservices - Azure Architecture Center
Document Version: 1.0
Last Updated: December 2024
Author: Architecture Team