# Azure Container Apps (ACA) - Comprehensive Architecture Guide

## Table of Contents
- Overview
- Key Concepts
- Environment Architecture
- Single vs Multiple Environments Decision Guide
- Workload Profiles and Plans
- Microservices Architecture Patterns
- Networking and Security
- Scaling and Performance
- Reliability and High Availability
- Cost Optimization
- Operational Excellence
- Deployment Strategies
- References
## Overview
Azure Container Apps is a fully managed serverless container service designed for running microservices and containerized applications at scale. It provides built-in autoscaling (including scale-to-zero), supports multiple programming languages and frameworks, and eliminates the need to manage underlying infrastructure like Kubernetes clusters.
### When to Use Azure Container Apps
| Use Case | Suitability |
|---|---|
| Microservices architectures | ✅ Excellent |
| Event-driven applications | ✅ Excellent |
| HTTP APIs and web apps | ✅ Excellent |
| Background processing jobs | ✅ Excellent |
| Teams without Kubernetes expertise | ✅ Excellent |
| Direct Kubernetes API access needed | ❌ Use AKS instead |
| Complex custom networking requirements | ⚠️ Consider AKS |
## Key Concepts

### Architecture Hierarchy

```mermaid
graph TB
    subgraph "Azure Subscription"
        subgraph "Resource Group"
            subgraph "Container Apps Environment"
                CA1[Container App 1<br/>UI Service]
                CA2[Container App 2<br/>Backend API]
                CA3[Container App 3<br/>Business Logic]
                JOB1[Job 1<br/>Background Processing]
                subgraph "Shared Resources"
                    VNET[Virtual Network]
                    LOG[Log Analytics Workspace]
                    DAPR[Dapr Configuration]
                end
            end
        end
    end
    CA1 --> CA2
    CA2 --> CA3
    CA3 --> JOB1
```
### Core Components
| Component | Description |
|---|---|
| Environment | Secure boundary around container apps; manages networking, logging, and Dapr configuration |
| Container App | An application running one or more containers with HTTP ingress, scaling, and lifecycle management |
| Job | Containerized task that runs for a finite duration and exits (manual, scheduled, or event-driven) |
| Revision | Immutable snapshot of a container app version used for deployments and traffic splitting |
| Replica | Running instance of a container app or job |
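These components map directly onto ARM resource types (`Microsoft.App/managedEnvironments`, `Microsoft.App/containerApps`, `Microsoft.App/jobs`). As a minimal Bicep sketch, assuming an existing environment (names, image, and API version are illustrative):

```bicep
// Hypothetical names throughout; adjust registry, ports, and sizing.
resource env 'Microsoft.App/managedEnvironments@2024-03-01' existing = {
  name: 'my-environment'
}

resource backendApi 'Microsoft.App/containerApps@2024-03-01' = {
  name: 'backend-api'
  location: resourceGroup().location
  properties: {
    managedEnvironmentId: env.id
    configuration: {
      ingress: {
        external: false // internal ingress: reachable only from apps in the environment
        targetPort: 8080
      }
    }
    template: {
      containers: [
        {
          name: 'api'
          image: 'myregistry.azurecr.io/backend-api:1.0.0'
          resources: {
            cpu: json('0.5')
            memory: '1Gi'
          }
        }
      ]
      scale: {
        minReplicas: 1
        maxReplicas: 10
      }
    }
  }
}
```

Each deployment of a changed `template` produces a new immutable revision; changes confined to the `configuration` block apply without creating one.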
## Environment Architecture
A Container Apps Environment is the foundational deployment unit that establishes a secure boundary around your container apps and jobs.
### Environment Features

```mermaid
graph LR
    subgraph "Container Apps Environment"
        direction TB
        subgraph "Compute Layer"
            WP1[Consumption Profile<br/>Serverless]
            WP2[Dedicated Profile<br/>Reserved Compute]
            WP3[GPU Profile<br/>AI/ML Workloads]
        end
        subgraph "Platform Services"
            ING[Ingress Controller<br/>Envoy Proxy]
            SD[Service Discovery]
            LB[Load Balancer]
        end
        subgraph "Integration"
            DAPR[Dapr Sidecar]
            KEDA[KEDA Autoscaler]
        end
        subgraph "Observability"
            LOGS[Log Analytics]
            METRICS[Azure Monitor]
            TRACES[Distributed Tracing]
        end
    end
```
### Environment Types
| Environment Type | Identifier | Description |
|---|---|---|
| Workload Profiles (v2) | Default | Supports both Consumption and Dedicated plans; maximum flexibility |
| Consumption-only (v1) | Legacy | Only supports Consumption plan |
Recommendation: Always create Workload Profiles (v2) environments for new deployments. They provide all consumption functionality plus access to dedicated compute and robust networking features.
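A Workload Profiles (v2) environment is one created with a `workloadProfiles` list; the built-in Consumption profile alone is enough to opt in. A hedged Bicep sketch (workspace name and API versions are illustrative):

```bicep
// Assumes an existing Log Analytics workspace; names are hypothetical.
resource logs 'Microsoft.OperationalInsights/workspaces@2023-09-01' existing = {
  name: 'my-logs'
}

resource env 'Microsoft.App/managedEnvironments@2024-03-01' = {
  name: 'my-environment'
  location: resourceGroup().location
  properties: {
    // The presence of this list makes it a v2 (workload profiles) environment.
    workloadProfiles: [
      {
        name: 'Consumption'
        workloadProfileType: 'Consumption'
      }
    ]
    zoneRedundant: true // in practice requires a VNet-integrated environment
    appLogsConfiguration: {
      destination: 'log-analytics'
      logAnalyticsConfiguration: {
        customerId: logs.properties.customerId
        sharedKey: logs.listKeys().primarySharedKey
      }
    }
  }
}
```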
## Single vs Multiple Environments Decision Guide
This is a critical architectural decision for microservices deployments. Here's a comprehensive decision framework:
### Use a Single Environment When:
✅ Managing related services that form a cohesive workload
✅ Services need to communicate via Dapr service invocation
✅ Deploying to the same virtual network
✅ Sharing the same Dapr configuration
✅ Centralizing logs to a single destination
✅ Services share the same security boundary
### Use Multiple Environments When:
✅ Services must never share compute resources (performance isolation)
✅ Team or environment separation (dev, staging, production)
✅ Different security boundaries are required
✅ Avoiding noisy neighbor problems for critical workloads
✅ Services don't need to communicate via Dapr service invocation
✅ Multi-tenant isolation per customer
### Environment Strategy Decision Tree

```mermaid
flowchart TD
    START[Start: Deploying Microservices] --> Q1{Do services share<br/>the same security boundary?}
    Q1 -->|No| MULTI[Multiple Environments]
    Q1 -->|Yes| Q2{Do services need to<br/>communicate via Dapr?}
    Q2 -->|Yes| SINGLE[Single Environment]
    Q2 -->|No| Q3{Is performance isolation<br/>critical?}
    Q3 -->|Yes| MULTI
    Q3 -->|No| Q4{Different teams<br/>own different services?}
    Q4 -->|Yes| Q5{Do teams need<br/>independent deployments?}
    Q4 -->|No| SINGLE
    Q5 -->|Yes| MULTI
    Q5 -->|No| SINGLE
    SINGLE --> REC1[Recommendation:<br/>Use workload profiles<br/>for resource segmentation]
    MULTI --> REC2[Recommendation:<br/>Use separate environments<br/>per boundary/team]
```
### Recommended Environment Patterns for Microservices

#### Pattern 1: Environment Per Lifecycle Stage (Recommended for Most Teams)

```mermaid
graph TB
    subgraph "Production Subscription"
        subgraph "Prod Environment"
            PROD_UI[UI Service]
            PROD_API[Backend API]
            PROD_BIZ[Business Service]
            PROD_JOB[Background Jobs]
        end
    end
    subgraph "Non-Production Subscription"
        subgraph "Staging Environment"
            STAGE_UI[UI Service]
            STAGE_API[Backend API]
            STAGE_BIZ[Business Service]
        end
        subgraph "Dev Environment"
            DEV_UI[UI Service]
            DEV_API[Backend API]
            DEV_BIZ[Business Service]
        end
    end
    DEV_UI --> STAGE_UI
    STAGE_UI --> PROD_UI
```
#### Pattern 2: Workload Separation Within Single Environment

```mermaid
graph TB
    subgraph "Container Apps Environment"
        subgraph "Consumption Profile"
            UI[UI Services<br/>Scale to Zero]
            API[Public APIs<br/>HTTP Scaling]
        end
        subgraph "Dedicated Profile D4"
            BIZ[Business Logic<br/>Steady State]
            CACHE[Cache Services]
        end
        subgraph "Dedicated Profile D8"
            DATA[Data Processing<br/>High Memory]
        end
        subgraph "GPU Profile"
            ML[ML Inference]
        end
    end
```
#### Pattern 3: Multi-tenant Isolation

```mermaid
graph TB
    subgraph "Tenant A Environment"
        A_UI[UI Service]
        A_API[API Service]
        A_DATA[Data Service]
    end
    subgraph "Tenant B Environment"
        B_UI[UI Service]
        B_API[API Service]
        B_DATA[Data Service]
    end
    subgraph "Tenant C Environment"
        C_UI[UI Service]
        C_API[API Service]
        C_DATA[Data Service]
    end
    SHARED[Shared Management<br/>Monitoring & Logging]
    A_UI --> SHARED
    B_UI --> SHARED
    C_UI --> SHARED
```
### Environment Limits to Consider
| Limit | Value | Notes |
|---|---|---|
| Environments per subscription per region | Limited | Can be increased via support ticket |
| Container apps per environment | 100 (default) | Soft limit, can be increased |
| Replicas per app | 1,000 max | Configure based on workload |
| Cores per environment | Varies by profile | Check quota |
## Workload Profiles and Plans

### Profile Types Overview

```mermaid
graph LR
    subgraph "Plan Types"
        CONS[Consumption Plan<br/>Pay per use]
        DED[Dedicated Plan<br/>Reserved instances]
    end
    subgraph "Workload Profiles"
        CP[Consumption Profile<br/>Serverless, Scale to Zero]
        DP[Dedicated Profiles<br/>D4, D8, D16, D32]
        GP[GPU Profiles<br/>AI/ML Workloads]
    end
    CONS --> CP
    DED --> DP
    DED --> GP
```
### Profile Selection Guide
| Profile | Best For | Billing | Scale to Zero |
|---|---|---|---|
| Consumption | Variable/unpredictable workloads, dev/test | Per-second usage | ✅ Yes |
| Dedicated (General) | Steady-state production workloads | Per-instance hour | ❌ No |
| Dedicated (Memory) | Data-intensive applications | Per-instance hour | ❌ No |
| GPU | AI/ML inference, compute-heavy | Per-instance hour | ❌ No |
### Service Type Recommendations
| Service Type | Recommended Profile | Scaling Strategy |
|---|---|---|
| UI/Frontend | Consumption | HTTP-based, scale to zero |
| Public API | Consumption or Dedicated | HTTP-based scaling |
| Backend Services | Dedicated | Custom metrics, steady state |
| Business Logic | Dedicated | Event-driven or steady |
| Background Jobs | Consumption | Event-driven, scale to zero |
| ML Inference | GPU | Custom metrics |
| Data Processing | Memory-optimized Dedicated | Event-driven |
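In Bicep, dedicated profiles are declared on the environment and an app opts in with `workloadProfileName`. A sketch under illustrative names and sizes:

```bicep
resource env 'Microsoft.App/managedEnvironments@2024-03-01' = {
  name: 'my-environment'
  location: resourceGroup().location
  properties: {
    workloadProfiles: [
      {
        name: 'Consumption'
        workloadProfileType: 'Consumption'
      }
      {
        name: 'dedicated-d4'
        workloadProfileType: 'D4' // 4 vCPU / 16 GiB class
        minimumCount: 1           // at least one node stays warm; no scale to zero
        maximumCount: 3
      }
    ]
  }
}

resource bizService 'Microsoft.App/containerApps@2024-03-01' = {
  name: 'business-service'
  location: resourceGroup().location
  properties: {
    managedEnvironmentId: env.id
    workloadProfileName: 'dedicated-d4' // pin the steady-state service to dedicated compute
    template: {
      containers: [
        {
          name: 'biz'
          image: 'myregistry.azurecr.io/business-service:1.0.0'
          resources: {
            cpu: json('1.0')
            memory: '2Gi'
          }
        }
      ]
    }
  }
}
```

Apps that omit `workloadProfileName` in a v2 environment run on the Consumption profile.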
## Microservices Architecture Patterns

### Reference Architecture: Microservices with Container Apps

```mermaid
graph TB
    subgraph "External"
        USERS[Users/Clients]
        AFD[Azure Front Door<br/>+ WAF]
    end
    subgraph "Container Apps Environment"
        subgraph "Ingress Layer"
            UI[UI Service<br/>External Ingress]
            APIGW[API Gateway<br/>External Ingress]
        end
        subgraph "Application Layer"
            AUTH[Auth Service<br/>Internal]
            ORDER[Order Service<br/>Internal]
            CATALOG[Catalog Service<br/>Internal]
            NOTIFY[Notification Service<br/>Internal]
        end
        subgraph "Background Layer"
            WORKFLOW[Workflow Job<br/>Event-driven]
            REPORT[Report Job<br/>Scheduled]
        end
        DAPR[Dapr Sidecars<br/>Service Mesh]
    end
    subgraph "Data Services"
        COSMOS[(Azure Cosmos DB)]
        REDIS[(Azure Cache for Redis)]
        SB[Azure Service Bus]
        STORAGE[(Azure Storage)]
    end
    subgraph "Security"
        KV[Azure Key Vault]
        MI[Managed Identities]
    end
    USERS --> AFD --> UI
    USERS --> AFD --> APIGW
    UI --> AUTH
    APIGW --> ORDER
    APIGW --> CATALOG
    ORDER --> NOTIFY
    ORDER --> SB
    SB --> WORKFLOW
    ORDER --> COSMOS
    CATALOG --> COSMOS
    AUTH --> REDIS
    WORKFLOW --> STORAGE
    AUTH --> KV
    ORDER --> KV
```
### Service Communication Patterns

#### Internal Service Discovery

```mermaid
sequenceDiagram
    participant Client as UI Service
    participant Envoy as Envoy Proxy
    participant API as Backend API
    participant DB as Database
    Client->>Envoy: HTTP Request to http://backend-api
    Envoy->>API: Route to healthy replica
    API->>DB: Query data
    DB-->>API: Return data
    API-->>Envoy: Response
    Envoy-->>Client: Response with load balancing
```
#### Dapr Service Invocation (Recommended for Microservices)

```mermaid
sequenceDiagram
    participant App1 as Order Service
    participant Dapr1 as Dapr Sidecar
    participant Dapr2 as Dapr Sidecar
    participant App2 as Inventory Service
    App1->>Dapr1: Invoke inventory-service/check
    Note over Dapr1,Dapr2: mTLS + Retries + Circuit Breaker
    Dapr1->>Dapr2: Service discovery + call
    Dapr2->>App2: Forward request
    App2-->>Dapr2: Response
    Dapr2-->>Dapr1: Return with policies
    Dapr1-->>App1: Final response
```
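Enabling the sidecar is part of the app's `configuration` block. A hedged fragment (app ID and port are illustrative):

```bicep
// Fragment of a container app's properties; names are hypothetical.
configuration: {
  dapr: {
    enabled: true
    appId: 'order-service' // the ID other services use to invoke this app
    appPort: 3000          // port the container listens on
    appProtocol: 'http'
  }
}
```

With sidecars enabled, the order service reaches the inventory service through its local Dapr endpoint, e.g. `GET http://localhost:3500/v1.0/invoke/inventory-service/method/check` (3500 is Dapr's default HTTP port), and the sidecars apply mTLS, retries, and resiliency policies in between.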
### Apps vs Jobs: When to Use Each

```mermaid
flowchart TD
    START[New Workload] --> Q1{Runs continuously?}
    Q1 -->|Yes| Q2{Serves HTTP requests?}
    Q1 -->|No| Q3{Triggered by events?}
    Q2 -->|Yes| APP1[Container App<br/>with HTTP Ingress]
    Q2 -->|No| APP2[Container App<br/>Internal/Background]
    Q3 -->|Yes| JOB1[Event-driven Job<br/>KEDA Scaling]
    Q3 -->|No| Q4{Runs on schedule?}
    Q4 -->|Yes| JOB2[Scheduled Job<br/>Cron Expression]
    Q4 -->|No| JOB3[Manual Job<br/>On-demand Execution]
```
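For the scheduled branch, a job is a sibling resource type to a container app. A hedged Bicep sketch for a nightly report job (names and timings are illustrative):

```bicep
resource env 'Microsoft.App/managedEnvironments@2024-03-01' existing = {
  name: 'my-environment'
}

resource reportJob 'Microsoft.App/jobs@2024-03-01' = {
  name: 'report-job'
  location: resourceGroup().location
  properties: {
    environmentId: env.id
    configuration: {
      triggerType: 'Schedule'
      scheduleTriggerConfig: {
        cronExpression: '0 2 * * *' // every day at 02:00 UTC
        parallelism: 1
        replicaCompletionCount: 1
      }
      replicaTimeout: 1800 // seconds allowed per execution
      replicaRetryLimit: 1
    }
    template: {
      containers: [
        {
          name: 'report'
          image: 'myregistry.azurecr.io/report-job:1.0.0'
          resources: {
            cpu: json('0.5')
            memory: '1Gi'
          }
        }
      ]
    }
  }
}
```

Event-driven jobs use `triggerType: 'Event'` with a KEDA scale rule instead of a cron expression.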
## Networking and Security

### Network Architecture

```mermaid
graph TB
    subgraph "Internet"
        USERS[External Users]
    end
    subgraph "Edge Services"
        AFD[Azure Front Door<br/>Global Load Balancer]
        WAF[Web Application Firewall]
    end
    subgraph "Hub VNet"
        FW[Azure Firewall]
        BASTION[Azure Bastion]
    end
    subgraph "Spoke VNet - Container Apps"
        subgraph "Container Apps Subnet /23 or /27"
            ENV[Container Apps Environment]
            subgraph "Apps"
                PUB[Public Apps<br/>External Ingress]
                INT[Internal Apps<br/>Internal Ingress]
            end
        end
        subgraph "Private Endpoints Subnet"
            PE_KV[Key Vault PE]
            PE_ACR[Container Registry PE]
            PE_COSMOS[Cosmos DB PE]
        end
    end
    USERS --> AFD --> WAF
    WAF --> PUB
    PUB --> INT
    INT --> PE_COSMOS
    INT --> PE_KV
    ENV --> FW
    FW --> Internet2[Internet Egress]
```
### Security Best Practices
| Area | Recommendation |
|---|---|
| Identity | Use managed identities; avoid storing credentials |
| Network | Deploy in custom VNet; use internal ingress for backend services |
| Ingress | Enable WAF via Application Gateway or Front Door |
| Egress | Route through Azure Firewall with UDR |
| Secrets | Store in Azure Key Vault; reference via managed identity |
| Images | Scan in ACR with Microsoft Defender; use minimal base images |
| mTLS | Enable for Dapr service-to-service communication |
| HTTPS | Enforce HTTPS-only; configure via Envoy proxy |
### Network Security Checklist
- [ ] Deploy environment into custom virtual network
- [ ] Configure Network Security Groups (NSGs) on subnets
- [ ] Use internal ingress for non-public services
- [ ] Route egress through Azure Firewall or NAT Gateway
- [ ] Enable private endpoints for Azure services (Cosmos DB, Key Vault, ACR)
- [ ] Configure Web Application Firewall for external ingress
- [ ] Enable mTLS for service-to-service communication
- [ ] Disable public network access where not needed
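The secrets recommendation can be wired without any stored credential: the app's managed identity reads the secret directly from Key Vault. A sketch with hypothetical names (the identity also needs a role such as Key Vault Secrets User on the vault):

```bicep
resource env 'Microsoft.App/managedEnvironments@2024-03-01' existing = {
  name: 'my-environment'
}

resource backendApi 'Microsoft.App/containerApps@2024-03-01' = {
  name: 'backend-api'
  location: resourceGroup().location
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    managedEnvironmentId: env.id
    configuration: {
      secrets: [
        {
          name: 'db-connection'
          keyVaultUrl: 'https://my-vault.vault.azure.net/secrets/db-connection'
          identity: 'system' // resolve via the system-assigned identity
        }
      ]
    }
    template: {
      containers: [
        {
          name: 'api'
          image: 'myregistry.azurecr.io/backend-api:1.0.0'
          env: [
            {
              name: 'DB_CONNECTION'
              secretRef: 'db-connection' // injected as an environment variable
            }
          ]
          resources: {
            cpu: json('0.5')
            memory: '1Gi'
          }
        }
      ]
    }
  }
}
```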
## Scaling and Performance

### Autoscaling with KEDA

Azure Container Apps uses KEDA (Kubernetes Event-driven Autoscaling) to drive autoscaling, including event-driven scale rules and scale to zero.
```mermaid
graph LR
    subgraph "Scale Triggers"
        HTTP[HTTP Requests]
        TCP[TCP Connections]
        CPU[CPU/Memory]
        SB[Service Bus Queue]
        EH[Event Hubs]
        KAFKA[Apache Kafka]
        REDIS[Azure Cache for Redis]
    end
    subgraph "KEDA Engine"
        SCALER[KEDA Scaler]
        METRICS[Metrics Collector]
    end
    subgraph "Container App"
        R1[Replica 1]
        R2[Replica 2]
        R3[Replica N...]
    end
    HTTP --> SCALER
    SB --> SCALER
    EH --> SCALER
    SCALER --> METRICS
    METRICS --> R1
    METRICS --> R2
    METRICS --> R3
```
### Scaling Configuration
| Parameter | Default | Min | Max | Description |
|---|---|---|---|---|
| Min replicas | 0 | 0 | 1,000 | Minimum running instances |
| Max replicas | 10 | 1 | 1,000 | Maximum scale-out limit |
| Polling interval | 30s | - | - | How often KEDA checks triggers |
| Cool down period | 300s | - | - | Delay before scaling to zero |
### Scaling Rules by Service Type
| Service Type | Scale Rule | Configuration Example |
|---|---|---|
| Web API | HTTP | concurrentRequests: 100 |
| Queue Processor | Azure Service Bus | queueName: orders, messageCount: 5 |
| Event Processor | Event Hubs | consumerGroup: $Default, threshold: 64 |
| Real-time Service | TCP | concurrentConnections: 100 |
| Custom Metrics | Prometheus/Custom | Based on application metrics |
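The first two rows of the table translate into `scale` blocks like the following (fragments, not a complete file; all metadata values are strings because they pass through to KEDA):

```bicep
// HTTP API: add a replica per ~100 concurrent requests.
scale: {
  minReplicas: 1
  maxReplicas: 30
  rules: [
    {
      name: 'http-rule'
      http: {
        metadata: {
          concurrentRequests: '100'
        }
      }
    }
  ]
}

// Queue processor: target ~5 messages per replica, idle to zero.
scale: {
  minReplicas: 0
  maxReplicas: 20
  rules: [
    {
      name: 'sb-rule'
      custom: {
        type: 'azure-servicebus' // KEDA scaler name
        metadata: {
          queueName: 'orders'
          messageCount: '5'
        }
        auth: [
          {
            secretRef: 'sb-connection' // app secret holding the connection string
            triggerParameter: 'connection'
          }
        ]
      }
    }
  ]
}
```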
### Performance Recommendations
- Configure appropriate min replicas for critical services (at least 1 to avoid cold starts)
- Use availability zones for production workloads, running at least three replicas so instances spread across all zones
- Implement health probes (startup, liveness, readiness) for proper traffic management
- Separate workloads by profile to avoid noisy neighbor issues
- Load test to validate scaling rules before production
## Reliability and High Availability

### Availability Zone Architecture

```mermaid
graph TB
    subgraph "Azure Region"
        subgraph "Zone 1"
            R1A[Replica 1A]
            R1B[Replica 1B]
        end
        subgraph "Zone 2"
            R2A[Replica 2A]
            R2B[Replica 2B]
        end
        subgraph "Zone 3"
            R3A[Replica 3A]
            R3B[Replica 3B]
        end
        LB[Internal Load Balancer<br/>Automatic Distribution]
    end
    LB --> R1A
    LB --> R2A
    LB --> R3A
```
### Multi-Region Deployment

```mermaid
graph TB
    subgraph "Global"
        AFD[Azure Front Door<br/>Global Load Balancer]
        TM[Azure Traffic Manager<br/>DNS-based Routing]
    end
    subgraph "Region 1 - Primary"
        ENV1[Container Apps Env 1]
        COSMOS1[(Cosmos DB<br/>Multi-region Write)]
    end
    subgraph "Region 2 - Secondary"
        ENV2[Container Apps Env 2]
        COSMOS2[(Cosmos DB<br/>Replica)]
    end
    AFD --> ENV1
    AFD --> ENV2
    ENV1 --> COSMOS1
    ENV2 --> COSMOS2
    COSMOS1 <--> COSMOS2
```
### Health Probe Configuration
| Probe Type | Purpose | Recommended Settings |
|---|---|---|
| Startup | Prevents premature restarts for slow-starting apps | failureThreshold: 60, periodSeconds: 1, initialDelaySeconds: 0 |
| Readiness | Ensures only healthy containers receive traffic | failureThreshold: 60, periodSeconds: 1, initialDelaySeconds: 5 |
| Liveness | Detects and restarts failed containers | failureThreshold: 3, periodSeconds: 10, initialDelaySeconds: 10 |
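In Bicep, the probes sit on each container. A fragment matching the settings above, assuming a service that exposes `/healthz` and `/ready` on port 8080 (paths are illustrative):

```bicep
// Fragment of a container app template; endpoints are hypothetical.
containers: [
  {
    name: 'api'
    image: 'myregistry.azurecr.io/backend-api:1.0.0'
    probes: [
      {
        type: 'Startup'
        httpGet: {
          path: '/healthz'
          port: 8080
        }
        periodSeconds: 1
        failureThreshold: 60 // tolerate up to ~60s of startup time
      }
      {
        type: 'Readiness'
        httpGet: {
          path: '/ready'
          port: 8080
        }
        initialDelaySeconds: 5
        periodSeconds: 1
        failureThreshold: 60
      }
      {
        type: 'Liveness'
        httpGet: {
          path: '/healthz'
          port: 8080
        }
        initialDelaySeconds: 10
        periodSeconds: 10
        failureThreshold: 3 // restart after ~30s of consecutive failures
      }
    ]
    resources: {
      cpu: json('0.5')
      memory: '1Gi'
    }
  }
]
```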
### Reliability Checklist
- [ ] Enable availability zone support in environment
- [ ] Configure minimum 3 replicas for production services
- [ ] Implement all three health probe types
- [ ] Configure resiliency policies (retries, timeouts, circuit breakers)
- [ ] Use zone-redundant storage (ZRS) for stateful data
- [ ] Deploy IaC templates for disaster recovery
- [ ] Set up monitoring alerts for availability metrics
- [ ] Test failover scenarios regularly
## Cost Optimization

### Billing Model

```mermaid
graph LR
    subgraph "Consumption Plan"
        CPU_SEC[vCPU-seconds]
        MEM_SEC[GiB-seconds]
        REQ[Requests]
    end
    subgraph "Dedicated Plan"
        INSTANCE[Instance Hours]
        PROFILE[Profile Size]
    end
    TOTAL[Total Cost]
    CPU_SEC --> TOTAL
    MEM_SEC --> TOTAL
    REQ --> TOTAL
    INSTANCE --> TOTAL
```
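The Consumption meters combine linearly; as a rough model (the rates $p$ are region-specific list prices, and the monthly free grant is ignored):

$$
\text{cost} \approx S_{\mathrm{vCPU}}\, p_{\mathrm{vCPU}} + S_{\mathrm{GiB}}\, p_{\mathrm{GiB}} + N_{\mathrm{req}}\, p_{\mathrm{req}}
$$

where $S_{\mathrm{vCPU}}$ is total vCPU-seconds across all replicas, $S_{\mathrm{GiB}}$ is total GiB-seconds, and $N_{\mathrm{req}}$ is the request count. A single always-on replica at 0.5 vCPU / 1 GiB accrues $0.5 \times 86{,}400 = 43{,}200$ vCPU-seconds and $86{,}400$ GiB-seconds per day, which is why scale to zero dominates the savings for idle services.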
### Cost Optimization Strategies
| Strategy | Implementation | Savings Potential |
|---|---|---|
| Scale to zero | Set min replicas to 0 for non-critical services | High |
| Right-size profiles | Monitor and adjust CPU/memory allocations | Medium |
| Azure Savings Plan | Commit to 1 or 3 year plans | Up to 17% |
| Use Consumption plan | For variable/unpredictable workloads | Variable |
| Consolidate environments | Reduce environment count where possible | Medium |
| Optimize container images | Use minimal base images | Low |
### Cost Monitoring

```mermaid
graph TB
    subgraph "Cost Management"
        BUDGET[Azure Budgets<br/>Set Spending Limits]
        ALERTS[Cost Alerts<br/>Threshold Notifications]
        ANALYSIS[Cost Analysis<br/>Usage Breakdown]
        TAGS[Resource Tags<br/>Cost Allocation]
    end
    subgraph "Optimization Actions"
        RIGHTSIZE[Right-size Resources]
        SCALE[Optimize Scaling Rules]
        CLEAN[Remove Unused Resources]
    end
    BUDGET --> ALERTS
    ALERTS --> ANALYSIS
    ANALYSIS --> TAGS
    TAGS --> RIGHTSIZE
    TAGS --> SCALE
    TAGS --> CLEAN
```
## Operational Excellence

### Infrastructure as Code
Deploy Container Apps using Bicep or Terraform for repeatable, traceable deployments.
```mermaid
graph LR
    subgraph "Source Control"
        CODE[Application Code]
        IAC[Infrastructure Code<br/>Bicep/Terraform]
    end
    subgraph "CI/CD Pipeline"
        BUILD[Build & Test]
        SCAN[Security Scan]
        DEPLOY[Deploy]
    end
    subgraph "Azure"
        ENV[Container Apps Env]
        APPS[Container Apps]
        SUPPORT[Supporting Services]
    end
    CODE --> BUILD
    IAC --> BUILD
    BUILD --> SCAN
    SCAN --> DEPLOY
    DEPLOY --> ENV
    DEPLOY --> APPS
    DEPLOY --> SUPPORT
```
### Monitoring and Observability
| Tool | Purpose | Integration |
|---|---|---|
| Azure Monitor | Metrics, logs, alerts | Built-in |
| Log Analytics | Centralized logging | Default destination |
| Application Insights | APM, distributed tracing | SDK integration |
| OpenTelemetry | Vendor-neutral observability | Collector support |
| Dapr Dashboard | Dapr component monitoring | .NET Aspire integration |
### Operational Checklist
- [ ] Implement IaC for all deployments (Bicep/Terraform)
- [ ] Configure CI/CD pipelines with automated testing
- [ ] Set up centralized logging in Log Analytics
- [ ] Enable Application Insights for all services
- [ ] Configure alerting for key metrics
- [ ] Implement consistent resource tagging
- [ ] Document runbooks for common operations
- [ ] Use Azure Policy for governance
## Deployment Strategies

### Blue-Green Deployment

```mermaid
sequenceDiagram
    participant Prod as Production Traffic
    participant Blue as Blue Revision<br/>(Current)
    participant Green as Green Revision<br/>(New)
    participant LB as Load Balancer
    Note over Blue: Serving 100% traffic
    Prod->>LB: User requests
    LB->>Blue: Route traffic
    Note over Green: Deploy new version
    Note over Blue,Green: Test Green revision
    Note over LB: Switch traffic
    Prod->>LB: User requests
    LB->>Green: Route 100% traffic
    Note over Blue: Keep for rollback
```
### Traffic Splitting for Canary Deployments

```mermaid
graph LR
    subgraph "Traffic Distribution"
        USERS[100% Traffic]
    end
    subgraph "Revisions"
        V1[Revision v1<br/>90% Traffic]
        V2[Revision v2<br/>10% Traffic]
    end
    USERS --> V1
    USERS --> V2
```
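In Bicep, a canary split is the `traffic` array on ingress, which requires multiple-revision mode; weights must sum to 100. A sketch with hypothetical revision names:

```bicep
// Fragment of a container app's configuration block.
configuration: {
  activeRevisionsMode: 'Multiple'
  ingress: {
    external: true
    targetPort: 8080
    traffic: [
      {
        revisionName: 'my-app--v1'
        weight: 90
      }
      {
        revisionName: 'my-app--v2'
        label: 'canary' // also exposes a stable label-specific URL
        weight: 10
      }
    ]
  }
}
```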
### Deployment Labels

Use deployment labels for sophisticated deployment strategies:

| Label | Purpose | Use Case |
|---|---|---|
| `production` | Current stable version | Always receives majority traffic |
| `staging` | Pre-release testing | Internal testing, smoke tests |
| `canary` | Early adopter testing | Small % of production traffic |
## Summary: Your Microservices Deployment Strategy
Based on your scenario with multiple microservices (UI, backend, business services), here's the recommended approach:
### Recommended Architecture

```mermaid
graph TB
    subgraph "Production"
        subgraph "Prod Environment"
            direction TB
            subgraph "Consumption Profile"
                PROD_UI[UI Services]
                PROD_JOB[Background Jobs]
            end
            subgraph "Dedicated Profile"
                PROD_API[API Gateway]
                PROD_BIZ[Business Services]
                PROD_BACK[Backend Services]
            end
        end
    end
    subgraph "Non-Production"
        subgraph "Staging Environment"
            STAGE[All Services<br/>Consumption Profile]
        end
        subgraph "Dev Environment"
            DEV[All Services<br/>Consumption Profile]
        end
    end
```
### Key Recommendations

1. **Environment Strategy**: Use separate environments for dev, staging, and production. Within production, use a single environment with workload profiles for resource segmentation.
2. **Workload Profiles**:
   - Use Consumption for UI services, background jobs, and variable workloads
   - Use Dedicated for critical business services and APIs with steady traffic
3. **Communication**: Enable Dapr for service-to-service communication to get built-in retries, circuit breakers, and mTLS.
4. **Security**:
   - Deploy in a custom VNet
   - Use internal ingress for all non-public services
   - Front external services with Azure Front Door + WAF
5. **Scaling**: Configure appropriate scaling rules per service type (HTTP, event-driven, or custom metrics).
6. **Reliability**: Enable availability zones with a minimum of 3 replicas for production services.
## References

### Official Microsoft Documentation
- Azure Container Apps Documentation
- Well-Architected Framework - Azure Container Apps
- Container Apps Environments
- Networking in Azure Container Apps
- Set Scaling Rules in Container Apps
- Workload Profiles Overview
- Microservices with Container Apps
- Dapr Integration
- Jobs in Container Apps
- Blue-Green Deployment
### Architecture Guides
- Microservices with Azure Container Apps - Reference Architecture
- Microservices with Container Apps and Dapr
- Container Apps Landing Zone Accelerator
- Multitenant Solutions with Container Apps
- Choose an Azure Container Service
### Security and Reliability
- Security Baseline for Container Apps
- Reliability in Azure Container Apps
- Azure Policy for Container Apps
---

*Document Version: 1.0*
*Last Updated: December 2024*
*Based on Azure Container Apps documentation as of December 2024*