Understanding Azure Container Apps Ingress Issues
Azure Container Apps provides a serverless container hosting platform with built-in ingress capabilities including automatic HTTPS, traffic splitting, and custom domain support. However, ingress configuration is one of the most common sources of problems — misconfigured target ports, wrong traffic settings, health probe mismatches, and protocol issues can all prevent your container from receiving traffic.
This guide covers every ingress configuration problem you’re likely to encounter, explains why each issue occurs, and provides exact CLI commands and YAML configurations to fix them.
Diagnostic Context
When troubleshooting Azure Container Apps ingress issues, the first step is understanding what changed. In most production environments, errors do not appear spontaneously. They are triggered by a change in configuration, code, traffic patterns, or the platform itself. Review your deployment history, recent configuration changes, and Azure Service Health notifications to identify potential triggers.
Azure maintains detailed activity logs for every resource operation. These logs capture who made a change, what was changed, when it happened, and from which IP address. Cross-reference the timeline of your error reports with the activity log entries to establish a causal relationship. Often, the fix is simply reverting the most recent change that correlates with the error onset.
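The correlation step can be scripted. The sketch below assumes a resource group named `myRG`; the az call is commented out because it needs an authenticated CLI session, and sample epoch timestamps stand in for values you would parse from the activity log and your error reports:

```bash
# List write operations from the last 24 hours (illustrative; requires az login):
#   az monitor activity-log list --resource-group myRG --offset 24h \
#     --query "[?ends_with(operationName.value, 'write')].{op:operationName.value, when:eventTimestamp, who:caller}" \
#     -o table

# Flag a change as a likely trigger when it landed within 10 minutes
# of the first error report (timestamps in epoch seconds).
correlates() {
  local change=$1 error=$2
  [ $(( error - change )) -ge 0 ] && [ $(( error - change )) -le 600 ] \
    && echo yes || echo no
}
correlates 1700000000 1700000300   # a change 5 minutes before the error -> yes
```

The 10-minute window is an arbitrary heuristic; widen it if your deployments take longer to propagate.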
If no recent changes are apparent, consider external factors. Azure platform updates, regional capacity changes, and dependent service modifications can all affect your resources. Check the Azure Status page and your subscription’s Service Health blade for any ongoing incidents or planned maintenance that coincides with your issue timeline.
Common Pitfalls to Avoid
When fixing Azure service errors under pressure, engineers sometimes make the situation worse by applying changes too broadly or too quickly. Here are critical pitfalls to avoid during your remediation process.
First, avoid making multiple changes simultaneously. If you change the firewall rules, the connection string, and the service tier all at once, you cannot determine which change actually resolved the issue. Apply one change at a time, verify the result, and document what worked. This disciplined approach builds reliable operational knowledge for your team.
Second, do not disable security controls to bypass errors. Opening all firewall rules, granting overly broad RBAC permissions, or disabling SSL enforcement might eliminate the error message, but it creates security vulnerabilities that are far more dangerous than the original issue. Always find the targeted fix that resolves the error while maintaining your security posture.
Third, test your fix in a non-production environment first when possible. Azure resource configurations can be exported as ARM or Bicep templates and deployed to a test resource group for validation. This extra step takes minutes but can prevent a failed fix from escalating the production incident.
Fourth, document the error message exactly as it appears, including correlation IDs, timestamps, and request IDs. If you need to open a support case with Microsoft, this information dramatically speeds up the investigation. Azure support engineers can use correlation IDs to trace the exact request through Microsoft’s internal logging systems.
How Container Apps Ingress Works
Container Apps uses an Envoy-based ingress controller that sits in front of your container. When you enable ingress, the platform creates:
- A public or internal FQDN for your app
- TLS termination with auto-provisioned certificates
- HTTP/HTTPS routing with optional path-based rules
- Health probes to determine container readiness
- Traffic splitting across multiple revisions
The ingress controller forwards traffic to your container on the configured target port. If any part of this chain is misconfigured, requests will fail.
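A quick sanity check is to compare the configured target port with the port your container reports at startup. A minimal sketch, with sample values standing in for real az output (the commented command shows where the real value would come from; app and group names are examples):

```bash
# Real value (requires Azure CLI auth):
#   ingress_port=$(az containerapp show -n my-container-app -g myRG \
#     --query "properties.configuration.ingress.targetPort" -o tsv)
ingress_port=8080    # sample standing in for the az output
container_port=3000  # e.g., taken from "listening on port 3000" in console logs

check_port_match() {
  if [ "$1" = "$2" ]; then
    echo "ok: ingress and container agree on port $1"
  else
    echo "mismatch: ingress targets port $1 but the container listens on $2"
  fi
}
check_port_match "$ingress_port" "$container_port"
```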
Ingress Not Enabled
Symptoms
- No FQDN assigned to the container app
- Cannot access the application via HTTP/HTTPS
- Portal shows “Ingress: Disabled”
Fix
```bash
# Enable external ingress
az containerapp ingress enable \
  --name my-container-app \
  --resource-group myRG \
  --target-port 8080 \
  --type external \
  --transport auto

# Enable internal ingress (VNet only)
az containerapp ingress enable \
  --name my-container-app \
  --resource-group myRG \
  --target-port 8080 \
  --type internal \
  --transport auto
```
```yaml
# YAML: Ingress configuration
properties:
  configuration:
    ingress:
      external: true
      targetPort: 8080
      transport: auto
      allowInsecure: false
```
Wrong Target Port — 502/503 Errors
The most common ingress problem is a mismatch between the ingress target port and the port your container actually listens on. If they don’t match, the Envoy proxy cannot forward traffic to your container.
Symptoms
- HTTP 502 Bad Gateway
- HTTP 503 Service Unavailable
- App works locally but fails when deployed
Diagnosis
```bash
# Check current ingress configuration
az containerapp show \
  --name my-container-app \
  --resource-group myRG \
  --query "properties.configuration.ingress" -o json

# Check container logs for the actual listening port
az containerapp logs show \
  --name my-container-app \
  --resource-group myRG \
  --type console \
  --tail 50
```
Fix
```bash
# Update target port to match your container
az containerapp ingress update \
  --name my-container-app \
  --resource-group myRG \
  --target-port 3000
```
Common framework default ports for reference:
| Framework | Default Port |
|---|---|
| Node.js (Express) | 3000 |
| Python (Flask) | 5000 |
| Python (FastAPI/Uvicorn) | 8000 |
| ASP.NET Core | 8080 (container default) |
| Java (Spring Boot) | 8080 |
| Go (net/http) | 8080 |
| Nginx | 80 |
External vs Internal Traffic Setting
Symptoms
- External requests return connection refused or timeout
- App is reachable from within the VNet but not from the internet
Diagnosis
```bash
# Check traffic type
az containerapp show \
  --name my-container-app \
  --resource-group myRG \
  --query "properties.configuration.ingress.external" -o tsv
```
Fix
```bash
# Switch to external (internet-accessible)
az containerapp ingress update \
  --name my-container-app \
  --resource-group myRG \
  --type external

# Switch to internal (VNet only)
az containerapp ingress update \
  --name my-container-app \
  --resource-group myRG \
  --type internal
```
Transport Protocol Mismatch
Container Apps supports four transport modes: auto, http, http2, and tcp. Using the wrong transport for your application causes connection failures.
| Transport | Use Case | Notes |
|---|---|---|
| `auto` | Most HTTP apps | Negotiates HTTP/1.1 or HTTP/2 automatically |
| `http` | HTTP/1.1 only | Forces HTTP/1.1; use when HTTP/2 causes issues |
| `http2` | gRPC services | Required for gRPC and prior-knowledge HTTP/2 |
| `tcp` | Non-HTTP protocols | Raw TCP; requires a VNet-connected environment |
```bash
# Set transport for gRPC service
az containerapp ingress update \
  --name my-grpc-app \
  --resource-group myRG \
  --transport http2 \
  --target-port 50051

# Set transport for TCP service
az containerapp ingress update \
  --name my-tcp-app \
  --resource-group myRG \
  --transport tcp \
  --target-port 9000 \
  --exposed-port 9000
```
Health Probe Failures
Container Apps applies default TCP-based startup and liveness probes to every container; no readiness probe is configured by default. If these probes fail, the platform marks the container as unhealthy and stops routing traffic to it.
Default Health Probe Configuration
| Probe | Protocol | Timeout | Period | Initial Delay | Failure Threshold |
|---|---|---|---|---|---|
| Startup | TCP | 3s | 1s | 1s | 240 |
| Liveness | TCP | 5s | 5s | 3s | 48 (consecutive) |
| Readiness | Not configured | — | — | — | — |
Symptoms
- Revision status shows “Provisioning” or “Degraded”
- Container keeps restarting
- App works locally but pods are unhealthy in Container Apps
Diagnosis
```bash
# Check revision health
az containerapp revision list \
  --name my-container-app \
  --resource-group myRG \
  --query "[].{name:name, healthState:properties.healthState, runningState:properties.runningState}" \
  --output table

# Check system logs for probe failures
az containerapp logs show \
  --name my-container-app \
  --resource-group myRG \
  --type system \
  --tail 100
```
Fix: Configure Custom Health Probes
```yaml
# YAML: Custom health probes for slow-starting app (e.g., Java)
properties:
  template:
    containers:
      - name: my-java-app
        image: myregistry.azurecr.io/my-java-app:latest
        probes:
          - type: startup
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 5
            failureThreshold: 30
            timeoutSeconds: 5
          - type: liveness
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 30
            failureThreshold: 3
            timeoutSeconds: 5
          - type: readiness
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
            failureThreshold: 3
            timeoutSeconds: 3
```
Key point: The health probe port must match the ingress target port. A mismatch is a common cause of persistent “Provisioning” status.
IP Security Restrictions
Symptoms
- HTTP 403 Forbidden from specific IP addresses
- App works from some networks but not others
```bash
# Check IP restrictions
az containerapp show \
  --name my-container-app \
  --resource-group myRG \
  --query "properties.configuration.ingress.ipSecurityRestrictions" -o json

# Add an allow rule
az containerapp ingress access-restriction set \
  --name my-container-app \
  --resource-group myRG \
  --rule-name AllowOffice \
  --ip-address 203.0.113.0/24 \
  --action Allow

# Remove a rule (removing the last remaining rule allows all traffic)
az containerapp ingress access-restriction remove \
  --name my-container-app \
  --resource-group myRG \
  --rule-name AllowOffice
```
Root Cause Analysis Framework
After applying the immediate fix, invest time in a structured root cause analysis. The Five Whys technique is a simple but effective method: start with the error symptom and ask “why” five times to drill down from the surface-level cause to the fundamental issue.
For example, applied to an ingress misconfiguration: Why did the service fail? Because the connection timed out. Why did the connection time out? Because the DNS lookup returned a stale record. Why was the DNS record stale? Because the TTL was set to 24 hours during a migration and never reduced. Why was it not reduced? Because there was no checklist for post-migration cleanup. Why was there no checklist? Because the migration process was ad hoc rather than documented.
This analysis reveals that the root cause is not a technical configuration issue but a process gap that allowed undocumented changes. The preventive action is creating a migration checklist and review process, not just fixing the DNS TTL. Without this depth of analysis, the team will continue to encounter similar issues from different undocumented changes.
Categorize your root causes into buckets: configuration errors, capacity limits, code defects, external dependencies, and process gaps. Track the distribution over time. If most of your incidents fall into the configuration error bucket, invest in infrastructure-as-code validation and policy enforcement. If they fall into capacity limits, improve your monitoring and forecasting. This data-driven approach focuses your improvement efforts where they will have the most impact.
mTLS Client Certificate Issues
Container Apps supports mutual TLS (mTLS) with configurable client certificate validation. If your client doesn’t send a certificate when required, connections will be refused.
| Mode | Behavior |
|---|---|
| `Ignore` (default) | Accepts connections with or without client certificates |
| `Accept` | Accepts client certificates if provided and passes them to the app |
| `Require` | Rejects connections without valid client certificates |
```bash
# Set client certificate mode to not require certificates
az containerapp ingress update \
  --name my-container-app \
  --resource-group myRG \
  --client-certificate-mode ignore
```
Traffic Splitting and Revision Issues
Traffic may be routed to an old or broken revision if traffic splitting percentages are not configured correctly.
```bash
# Check traffic distribution
az containerapp revision list \
  --name my-container-app \
  --resource-group myRG \
  --query "[].{name:name, trafficWeight:properties.trafficWeight, active:properties.active}" \
  --output table

# Route all traffic to latest revision
az containerapp ingress traffic set \
  --name my-container-app \
  --resource-group myRG \
  --revision-weight latest=100

# Split traffic between revisions
az containerapp ingress traffic set \
  --name my-container-app \
  --resource-group myRG \
  --revision-weight my-app--abc1234=80 my-app--def5678=20
```
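A staged rollout wraps the traffic command in a loop. This sketch only prints the planned steps; the commented az call (revision names are hypothetical) is what each step would run, with a verification pause between steps:

```bash
# Each step would run (revision names are hypothetical):
#   az containerapp ingress traffic set -n my-container-app -g myRG \
#     --revision-weight my-app--new=$w my-app--old=$(( 100 - w ))
for w in 10 25 50 100; do
  echo "shift ${w}% of traffic to the new revision (old keeps $(( 100 - w ))%)"
  # check error rates and latency here before proceeding to the next step
  last_weight=$w
done
```

The 10/25/50/100 schedule is a common convention, not a platform requirement; pick steps that match your traffic volume.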
Container Image Pull Failures
If the container image cannot be pulled, the container never starts and ingress has nothing to route traffic to.
Diagnosis
```bash
# Check system logs for image pull errors
az containerapp logs show \
  --name my-container-app \
  --resource-group myRG \
  --type system \
  --tail 50 \
  --query "[?contains(msg, 'pull') || contains(msg, 'image')]"
```
Fix: Configure Registry Authentication
```bash
# Set ACR credentials using managed identity
az containerapp registry set \
  --name my-container-app \
  --resource-group myRG \
  --server myregistry.azurecr.io \
  --identity system

# Or with username/password
az containerapp registry set \
  --name my-container-app \
  --resource-group myRG \
  --server myregistry.azurecr.io \
  --username myregistry \
  --password "registry-password"
```
HTTP Request Timeout
Container Apps has a fixed 240-second HTTP request timeout. Requests that take longer than 4 minutes will be terminated by the ingress controller.
Workaround for Long-Running Operations
```csharp
// Implement an async request pattern.
// Instead of blocking for 5+ minutes:
//   POST /api/jobs      -> returns 202 Accepted with a Location header
//   GET /api/jobs/{id}  -> returns job status (running/completed/failed)

[HttpPost("api/jobs")]
public IActionResult StartJob([FromBody] JobRequest request)
{
    var jobId = Guid.NewGuid().ToString();
    _ = Task.Run(() => ProcessJobAsync(jobId, request));
    return Accepted(
        new Uri($"/api/jobs/{jobId}", UriKind.Relative),
        new { id = jobId, status = "running" });
}

[HttpGet("api/jobs/{id}")]
public IActionResult GetJobStatus(string id)
{
    var status = _jobStore.GetStatus(id);
    return Ok(status);
}
```
Error Classification and Severity Assessment
Not all errors require the same response urgency. Classify errors into severity levels based on their impact on users and business operations. A severity 1 error causes complete service unavailability for all users. A severity 2 error degrades functionality for a subset of users. A severity 3 error causes intermittent issues that affect individual operations. A severity 4 error is a cosmetic or minor issue with a known workaround.
For Azure Container Apps ingress issues, map the specific error codes and messages to these severity levels. Create a classification matrix that your on-call team can reference when triaging incoming alerts. This prevents over-escalation of minor issues and under-escalation of critical ones. Include the expected resolution time for each severity level and the communication protocol (who to notify, how frequently to update stakeholders).
Track your error rates over time using Azure Monitor metrics and Log Analytics queries. Establish baseline error rates for healthy operation so you can distinguish between normal background error levels and genuine incidents. A service that normally experiences 0.1 percent error rate might not need investigation when errors spike to 0.2 percent, but a jump to 5 percent warrants immediate attention. Without this baseline context, every alert becomes equally urgent, leading to alert fatigue.
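The baseline comparison is easy to automate. A sketch using integer basis points (1 bp = 0.01%), since shell arithmetic has no floats; the 5x-baseline threshold is illustrative, not an Azure Monitor default:

```bash
# Flag a spike when the observed error rate exceeds 5x the baseline.
# Args: error count, total requests, baseline rate in basis points (0.1% = 10 bp).
is_spike() {
  local observed_bp=$(( $1 * 10000 / $2 ))
  [ "$observed_bp" -gt $(( $3 * 5 )) ] && echo spike || echo normal
}
is_spike 500 10000 10   # 5% observed vs 0.1% baseline -> spike
is_spike 20 10000 10    # 0.2% observed vs 0.1% baseline -> normal
```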
Implement error budgets as part of your SLO framework. An error budget defines the maximum amount of unreliability your service can tolerate over a measurement window (typically monthly or quarterly). When the error budget is exhausted, the team shifts focus from feature development to reliability improvements. This mechanism creates a structured trade-off between innovation velocity and operational stability.
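To make the budget concrete: a 99.9% availability SLO over a 30-day window allows 0.1% of 2,592,000 seconds, roughly 43 minutes, of unavailability. A sketch (the SLO target and window are illustrative):

```bash
# Remaining error budget in seconds for a 99.9% SLO over 30 days.
# 30 days = 2,592,000 s; 0.1% of that = 2,592 s of allowed downtime.
budget_left() {
  echo $(( 2592000 / 1000 - $1 ))   # total budget minus seconds already consumed
}
budget_left 1800   # after 30 minutes of downtime, 792 s of budget remain
```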
Dependency Management and Service Health
Azure services depend on other Azure services internally, and your application adds additional dependency chains on top. When diagnosing Azure Container Apps ingress issues, map out the complete dependency tree including network dependencies (DNS, load balancers, firewalls), identity dependencies (Azure AD, managed identity endpoints), and data dependencies (storage accounts, databases, key vaults).
Check Azure Service Health for any ongoing incidents or planned maintenance affecting the services in your dependency tree. Azure Service Health provides personalized notifications specific to the services and regions you use. Subscribe to Service Health alerts so your team is notified proactively when Microsoft identifies an issue that might affect your workload.
For each critical dependency, implement a health check endpoint that verifies connectivity and basic functionality. Your application’s readiness probe should verify not just that the application process is running, but that it can successfully reach all of its dependencies. When a dependency health check fails, the application should stop accepting new requests and return a 503 status until the dependency recovers. This prevents requests from queuing up and timing out, which would waste resources and degrade the user experience.
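The aggregation described above can be sketched as a readiness helper. Real checks would use curl against your own dependency endpoints (the URL in the comment is hypothetical); the helper simply folds per-dependency states into a single status code:

```bash
# In a real probe each state would come from something like:
#   curl -fsS --max-time 2 "https://my-dependency.example.com/healthz" \
#     && state=up || state=down
ready_status() {
  for state in "$@"; do
    [ "$state" = "up" ] || { echo 503; return; }
  done
  echo 200
}
ready_status up up up   # all dependencies healthy -> 200
ready_status up down    # any failing dependency -> 503
```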
CORS Configuration
Cross-origin requests from web browsers will be blocked unless CORS is configured on the container app.
```bash
# Enable CORS
az containerapp ingress cors enable \
  --name my-container-app \
  --resource-group myRG \
  --allowed-origins "https://myapp.com" "https://www.myapp.com" \
  --allowed-methods "GET" "POST" "PUT" "DELETE" \
  --allowed-headers "*" \
  --max-age 3600
```
Diagnostic Checklist
```bash
# Comprehensive ingress diagnostic
APP="my-container-app"
RG="myRG"

echo "=== Ingress Config ==="
az containerapp show -n $APP -g $RG \
  --query "properties.configuration.ingress" -o json

echo "=== FQDN ==="
az containerapp show -n $APP -g $RG \
  --query "properties.configuration.ingress.fqdn" -o tsv

echo "=== Revision Health ==="
az containerapp revision list -n $APP -g $RG \
  --query "[].{name:name, health:properties.healthState, running:properties.runningState, traffic:properties.trafficWeight}" \
  --output table

echo "=== System Logs (last 20) ==="
az containerapp logs show -n $APP -g $RG --type system --tail 20

echo "=== Container Logs (last 20) ==="
az containerapp logs show -n $APP -g $RG --type console --tail 20
```
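To close the loop, a smoke test against the FQDN can map the HTTP status onto the failure modes covered in this guide. The curl line is commented out because it needs a live app; the classification mapping is the reusable part:

```bash
# Live check (uses the FQDN obtained in the checklist above):
#   status=$(curl -s -o /dev/null -w '%{http_code}' "https://$FQDN/")
classify_status() {
  case "$1" in
    2[0-9][0-9]) echo "healthy" ;;
    502|503)     echo "check target port and health probes" ;;
    403)         echo "check IP security restrictions" ;;
    *)           echo "investigate: HTTP $1" ;;
  esac
}
classify_status 502   # -> check target port and health probes
```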
Reserved Ports and Limitations
- Port 36985 is reserved for internal health checks — do not use it
- Maximum 5 additional TCP ports per app (beyond the main ingress port)
- TCP ingress requires a VNet-connected Container Apps environment
- Each exposed TCP port must be unique across the entire environment
- Port 80 automatically redirects to 443 by default (unless `allowInsecure` is true)
Prevention Best Practices
- Always verify target port matches your container — This is the #1 ingress issue
- Configure health probes explicitly — Don’t rely on defaults for production workloads
- Use startup probes for slow-starting containers — Java/Spring apps may need 30+ seconds to start
- Test ingress locally with Docker — Verify your container listens on the expected port before deploying
- Monitor revision health — Set alerts on revision health state changes
- Use traffic splitting for safe deployments — Route 10% traffic to new revisions before going to 100%
Post-Resolution Validation and Hardening
After applying the fix, perform a structured validation to confirm the issue is fully resolved. Do not rely solely on the absence of error messages. Actively verify that the service is functioning correctly by running health checks, executing test transactions, and monitoring key metrics for at least 30 minutes after the change.
Validate from multiple perspectives. Check the Azure resource health status, run your application’s integration tests, verify that dependent services are receiving data correctly, and confirm that end users can complete their workflows. A fix that resolves the immediate error but breaks a downstream integration is not a complete resolution.
Implement defensive monitoring to detect if the issue recurs. Create an Azure Monitor alert rule that triggers on the specific error condition you just fixed. Set the alert to fire within minutes of recurrence so you can respond before the issue impacts users. Include the remediation steps in the alert’s action group notification so that any on-call engineer can apply the fix quickly.
Finally, conduct a brief post-incident review. Document the root cause, the fix applied, the time to detect, diagnose, and resolve the issue, and any preventive measures that should be implemented. Share this documentation with the broader engineering team through a blameless post-mortem process. This transparency transforms individual incidents into organizational learning that raises the entire team’s operational capability.
Consider adding the error scenario to your integration test suite. Automated tests that verify the service behaves correctly under the conditions that triggered the original error provide a safety net against regression. If a future change inadvertently reintroduces the problem, the test will catch it before it reaches production.
Summary
Container Apps ingress problems are almost always configuration mismatches: wrong target port, wrong traffic type (external vs internal), wrong transport protocol, or health probe failures. The diagnostic checklist above quickly identifies which configuration is wrong. For production deployments, always configure explicit health probes, verify the target port matches your container’s listening port, and use traffic splitting for safe rollouts.
For more details, refer to the official documentation: Azure Container Apps overview, Ingress in Azure Container Apps.