How to fix Azure Front Door routing and custom domain configuration issues

Understanding Azure Front Door Routing and Custom Domains

Azure Front Door is a global, scalable entry point for web applications that provides CDN, SSL offloading, WAF, and intelligent routing. Routing configuration and custom domain setup are the two most common areas where problems occur — misconfigured origins, missing routing rules, certificate validation failures, and DNS propagation delays can all prevent traffic from reaching your application.

This guide covers the most common routing and custom domain issues, explains the mechanism behind each one, and gives the CLI commands and portal steps to resolve it.

Migration notice: Azure Front Door (classic) is being retired on March 31, 2027. If you’re using classic, plan migration to Front Door Standard or Premium tier.

Diagnostic Context

When you hit a Front Door routing or custom domain problem, the first step is to understand what changed. In most production environments, errors do not appear spontaneously. They are triggered by a change in configuration, code, traffic patterns, or the platform itself. Review your deployment history, recent configuration changes, and Azure Service Health notifications to identify potential triggers.

Azure maintains detailed activity logs for every resource operation. These logs capture who made a change, what was changed, when it happened, and from which IP address. Cross-reference the timeline of your error reports with the activity log entries to establish a causal relationship. Often, the fix is simply reverting the most recent change that correlates with the error onset.

If no recent changes are apparent, consider external factors. Azure platform updates, regional capacity changes, and dependent service modifications can all affect your resources. Check the Azure Status page and your subscription’s Service Health blade for any ongoing incidents or planned maintenance that coincides with your issue timeline.

Common Pitfalls to Avoid

When fixing Azure service errors under pressure, engineers sometimes make the situation worse by applying changes too broadly or too quickly. Here are critical pitfalls to avoid during your remediation process.

First, avoid making multiple changes simultaneously. If you change the firewall rules, the connection string, and the service tier all at once, you cannot determine which change actually resolved the issue. Apply one change at a time, verify the result, and document what worked. This disciplined approach builds reliable operational knowledge for your team.

Second, do not disable security controls to bypass errors. Opening all firewall rules, granting overly broad RBAC permissions, or disabling SSL enforcement might eliminate the error message, but it creates security vulnerabilities that are far more dangerous than the original issue. Always find the targeted fix that resolves the error while maintaining your security posture.

Third, test your fix in a non-production environment first when possible. Azure resource configurations can be exported as ARM or Bicep templates and deployed to a test resource group for validation. This extra step takes minutes but can prevent a failed fix from escalating the production incident.

Fourth, document the error message exactly as it appears, including correlation IDs, timestamps, and request IDs. If you need to open a support case with Microsoft, this information dramatically speeds up the investigation. Azure support engineers can use correlation IDs to trace the exact request through Microsoft’s internal logging systems.

Front Door Routing Architecture

Understanding the routing flow helps diagnose where failures occur:

  1. Client sends request to Front Door endpoint (e.g., contoso.azurefd.net)
  2. Front Door matches the request hostname against custom domains
  3. Matched domain maps to an endpoint
  4. Endpoint applies routes to determine which origin group handles the request
  5. Origin group selects the best origin based on health probes, latency, and weight
  6. Front Door forwards the request to the selected origin

503/504 Errors — Origin Response Timeout

Root Cause

The default origin response timeout is 30 seconds. If your origin takes longer to respond, Front Door returns a 503 or 504 with the error code OriginInvalidResponse.

Diagnosis

# Test origin response time directly (bypassing Front Door)
curl -w "Time: %{time_total}s\n" -o /dev/null -s https://myapp.azurewebsites.net/api/slow-endpoint

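# Check Front Door access logs
# (column suffixes such as _d and _s depend on your workspace schema; adjust if a column is missing)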
az monitor log-analytics query \
  --workspace {workspace-id} \
  --analytics-query "AzureDiagnostics | where Category == 'FrontDoorAccessLog' | where httpStatusCode_d >= 500 | project TimeGenerated, requestUri_s, httpStatusCode_d, originUrl_s, timeTaken_d | order by TimeGenerated desc | take 20"
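
To compare the measured time against the timeout in a script, something like the following could work. It is a minimal sketch: exceeds_timeout is a hypothetical helper name, the 30-second limit is the default quoted above, and the sample value stands in for real curl output.

```shell
# Hypothetical helper: succeeds when the elapsed time (seconds, may be
# fractional) is at or above the timeout.
exceeds_timeout() {
  awk -v elapsed="$1" -v limit="$2" 'BEGIN { exit !(elapsed + 0 >= limit + 0) }'
}

# Fixed sample value; in practice capture curl's time_total, for example:
#   elapsed=$(curl -w '%{time_total}' -o /dev/null -s https://myapp.azurewebsites.net/api/slow-endpoint)
elapsed="31.4"
if exceeds_timeout "$elapsed" 30; then
  echo "origin is slower than the 30s timeout"
else
  echo "origin responds within the timeout"
fi
```

If the origin is slow even when called directly, raising the timeout only hides the symptom; the endpoint itself likely needs attention.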

Fix

# Increase origin response timeout (Standard/Premium)
# Range: 16 to 240 seconds
# In current az CLI versions this is a profile-level setting
az afd profile update \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --origin-response-timeout-seconds 120

In the portal, the same setting is on the Front Door profile’s Overview page under Origin response timeout. For Front Door (classic), the equivalent is the send/receive timeout in the backend pool settings.
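
Because the accepted range is 16 to 240 seconds, a deployment script might validate the value before calling the CLI. The sketch below assumes whole seconds; valid_timeout is a hypothetical helper, not part of the az CLI.

```shell
# Hypothetical pre-flight check: accept only whole seconds in 16..240.
valid_timeout() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or not a plain integer
  esac
  [ "$1" -ge 16 ] && [ "$1" -le 240 ]
}

for t in 10 120 300; do
  if valid_timeout "$t"; then
    echo "$t: within range"
  else
    echo "$t: outside 16-240"
  fi
done
```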

503 on HTTPS Only — Certificate Mismatch

Root Cause

When Front Door connects to the origin over HTTPS, it validates the origin’s SSL certificate. If the certificate’s subject name doesn’t match the origin hostname, or if the certificate chain is incomplete, the connection fails.

Diagnosis

# Check origin certificate
openssl s_client -connect myapp.azurewebsites.net:443 \
  -servername myapp.azurewebsites.net 2>/dev/null | \
  openssl x509 -noout -subject -issuer -dates

# Verify certificate chain
openssl s_client -connect myapp.azurewebsites.net:443 \
  -servername myapp.azurewebsites.net -showcerts
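
Once you have the certificate’s subject or SAN entries, the name comparison can be sketched as below. This is a simplified illustration of the matching rule (exact match, or a wildcard covering exactly one label), not Front Door’s actual validation code; cert_name_matches is a hypothetical helper.

```shell
# Hypothetical helper: does a certificate name (CN or SAN entry) cover a host?
# Handles exact matches and single-label wildcards such as *.azurewebsites.net.
cert_name_matches() {
  cert_name=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]')
  host=$(printf '%s\n' "$2" | tr '[:upper:]' '[:lower:]')
  case "$cert_name" in
    \*.*)
      suffix=${cert_name#\*}      # e.g. ".azurewebsites.net"
      label=${host%"$suffix"}     # what remains left of the suffix
      [ "$label" != "$host" ] || return 1   # host does not end with suffix
      [ -n "$label" ] || return 1           # wildcard needs a non-empty label
      case "$label" in
        *.*) return 1 ;;                    # wildcard covers one label only
      esac
      return 0
      ;;
    *)
      [ "$cert_name" = "$host" ]
      ;;
  esac
}

if cert_name_matches "*.azurewebsites.net" "myapp.azurewebsites.net"; then
  echo "certificate name covers the origin host"
else
  echo "certificate name mismatch"
fi
```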

Fix

# Option 1: Disable certificate name check (not recommended for production)
# Standard/Premium: 
az afd origin update \
  --origin-name myOrigin \
  --origin-group-name myOriginGroup \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --enforce-certificate-name-check false

# Option 2 (recommended): Set the correct origin host header
az afd origin update \
  --origin-name myOrigin \
  --origin-group-name myOriginGroup \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --origin-host-header myapp.azurewebsites.net \
  --host-name myapp.azurewebsites.net

Important: If your origin is configured as an IP address instead of an FQDN, SNI (Server Name Indication) won’t work and the certificate check will fail. Always use FQDNs for origins.
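
To spot origins configured as IP addresses, a rough check along these lines may help. It is a sketch with a deliberately simple heuristic; is_ip_literal is a hypothetical helper, and the host names in the loop are sample values.

```shell
# Hypothetical check: flag origin host names that look like IPv4 or IPv6
# literals. Rough heuristic only; it does not validate address ranges.
is_ip_literal() {
  case "$1" in
    '') return 1 ;;
    *:*) return 0 ;;            # contains a colon: treat as IPv6
    *[!0-9.]*) return 1 ;;      # letters or hyphens present: a host name
    *.*.*.*) return 0 ;;        # digits and dots only, dotted-quad shape
    *) return 1 ;;
  esac
}

# Sample values; real ones could come from:
#   az afd origin list --origin-group-name myOriginGroup \
#     --profile-name myFrontDoor -g myRG --query "[].hostName" -o tsv
for origin in myapp.azurewebsites.net 203.0.113.10 2001:db8::1; do
  if is_ip_literal "$origin"; then
    echo "$origin: IP address, replace with an FQDN"
  else
    echo "$origin: ok"
  fi
done
```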

400 Bad Request on Custom Domain

Root Cause

A custom domain is added to Front Door but no route matches requests for that domain. Every custom domain must have at least one route with a default path pattern (/*).

Fix

# Create a route for the custom domain
az afd route create \
  --route-name defaultRoute \
  --endpoint-name myEndpoint \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --origin-group myOriginGroup \
  --supported-protocols Https Http \
  --link-to-default-domain Enabled \
  --https-redirect Enabled \
  --patterns-to-match "/*" \
  --forwarding-protocol HttpsOnly \
  --custom-domains myCustomDomain
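
To find custom domains that no route references, you could compare the two lists. The sketch below works on plain name lists; domains_without_route is a hypothetical helper and apiContoso is a made-up second domain.

```shell
# Hypothetical helper: print each custom domain name (first argument, one per
# line) that is missing from the route-attached names (second argument).
domains_without_route() {
  printf '%s\n' "$1" | while IFS= read -r domain; do
    [ -n "$domain" ] || continue
    if ! printf '%s\n' "$2" | grep -qxF -- "$domain"; then
      printf '%s\n' "$domain"
    fi
  done
}

# Sample data. In practice the lists would come from az afd custom-domain list
# and az afd route list; route output holds resource IDs, so keep only the
# last path segment before comparing.
all_domains="wwwContoso
apiContoso"
routed_domains="wwwContoso"
domains_without_route "$all_domains" "$routed_domains"   # prints apiContoso
```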

Custom Domain Configuration — Step by Step

Step 1: Create DNS Records

To send traffic for the domain through Front Door, create a CNAME record pointing to your Front Door endpoint:

# Standard CNAME mapping
www.contoso.com    CNAME    contoso-frontend.azurefd.net

# For apex/root domains, use an ALIAS record (if your DNS provider supports it)
contoso.com        ALIAS    contoso-frontend.azurefd.net

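For zero-downtime migration from another CDN/load balancer, validate the domain before moving traffic. Front Door (classic) uses the afdverify subdomain shown below; Standard/Premium instead uses a TXT record named _dnsauth.www.contoso.com that contains the validation token from the custom domain resource: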

# Validation CNAME (does not affect live traffic)
afdverify.www.contoso.com    CNAME    afdverify.contoso-frontend.azurefd.net

Step 2: Add Custom Domain to Front Door

# Add custom domain (Standard/Premium)
az afd custom-domain create \
  --custom-domain-name wwwContoso \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --host-name www.contoso.com \
  --certificate-type ManagedCertificate \
  --minimum-tls-version TLS12

Step 3: Associate Domain with Route

# Update route to include custom domain
az afd route update \
  --route-name defaultRoute \
  --endpoint-name myEndpoint \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --custom-domains wwwContoso

Step 4: Verify Domain Validation Status

# Check domain validation status
az afd custom-domain show \
  --custom-domain-name wwwContoso \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --query "{domainValidationState:domainValidationState, validationToken:validationProperties.validationToken, provisioningState:provisioningState}"
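
Validation can take a while, so a script may want to poll rather than check once. The following is a sketch of a generic polling loop; wait_for_state and fake_state are hypothetical names, and the stub stands in for the real CLI call shown in the comment.

```shell
# Hypothetical polling helper.
#   $1 = expected value, $2 = max attempts, $3 = seconds between attempts,
#   remaining arguments = command whose output is the current state.
wait_for_state() {
  expected=$1; attempts=$2; delay=$3
  shift 3
  i=1
  state=""
  while [ "$i" -le "$attempts" ]; do
    state=$("$@" 2>/dev/null) || state=""
    if [ "$state" = "$expected" ]; then
      echo "reached $expected after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then sleep "$delay"; fi
  done
  echo "gave up; last state: ${state:-unknown}" >&2
  return 1
}

# Stub used here so the example runs without Azure access. A real call might be:
#   wait_for_state Approved 30 60 az afd custom-domain show \
#     --custom-domain-name wwwContoso --profile-name myFrontDoor \
#     --resource-group myRG --query domainValidationState -o tsv
fake_state() { echo "Approved"; }
wait_for_state Approved 3 0 fake_state
```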

Step 5: Remove afdverify Records

After the permanent CNAME is in place and domain validation is complete, remove the afdverify DNS records.

Managed Certificate Provisioning Issues

Certificate Stuck in “Pending”

Managed certificates require DNS validation. If the CNAME record is not properly configured, certificate provisioning will stay in “Pending” state indefinitely.

Diagnosis

# Verify DNS resolution
nslookup www.contoso.com
# Should return contoso-frontend.azurefd.net

# Check certificate status
az afd custom-domain show \
  --custom-domain-name wwwContoso \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --query "tlsSettings" -o json
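
DNS tools often print the target with a trailing dot or different casing, so a scripted comparison should normalize both sides first. A minimal sketch, with normalize_host and cname_matches as hypothetical helper names and a sample value in place of a live lookup:

```shell
# Hypothetical helpers: compare a resolved CNAME target with the expected
# Front Door host name, ignoring case and a trailing dot.
normalize_host() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

cname_matches() {
  [ "$(normalize_host "$1")" = "$(normalize_host "$2")" ]
}

# Sample value; a live lookup could be: actual=$(dig +short CNAME www.contoso.com)
actual="Contoso-Frontend.azurefd.net."
if cname_matches "$actual" "contoso-frontend.azurefd.net"; then
  echo "CNAME points at the Front Door endpoint"
else
  echo "CNAME points elsewhere: $actual"
fi
```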

Fix

  1. Verify the CNAME record resolves correctly
  2. Wait up to 8 hours for DNS propagation and certificate issuance
  3. If using a DNS provider with a proxy (like Cloudflare), disable the proxy temporarily to allow DNS-based validation
  4. Ensure no CAA DNS records block DigiCert (Front Door’s managed certificate CA)
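
The CAA check in step 4 can be sketched as follows. The helper reads records in the format printed by dig +short CAA and only evaluates the records it is given; real CAA processing also walks up to parent domains, which this sketch does not do. caa_allows is a hypothetical name.

```shell
# Hypothetical helper: read CAA records on stdin and succeed if the given CA
# may issue. With no "issue" records present, issuance is not restricted.
caa_allows() {
  awk -v ca="$1" '
    $2 == "issue" {
      seen = 1
      value = $3
      gsub(/"/, "", value)      # strip surrounding quotes
      sub(/;.*/, "", value)     # drop parameters after ";"
      if (tolower(value) == tolower(ca)) ok = 1
    }
    END { if (seen && !ok) exit 1; exit 0 }
  '
}

# Sample records; live data could come from: dig +short CAA contoso.com
if printf '0 issue "letsencrypt.org"\n' | caa_allows digicert.com; then
  echo "DigiCert may issue"
else
  echo "CAA records block DigiCert"
fi
```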

HTTP Not Redirecting to HTTPS

# Enable HTTPS redirect on route
az afd route update \
  --route-name defaultRoute \
  --endpoint-name myEndpoint \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --https-redirect Enabled \
  --supported-protocols Https Http
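
After enabling the redirect, you can confirm it from the response headers. The sketch below parses headers from stdin so it runs offline; is_https_redirect is a hypothetical helper, and the set of accepted status codes (301, 302, 307, 308) is an assumption rather than a statement about which code Front Door returns.

```shell
# Hypothetical helper: read response headers on stdin (as printed by
# `curl -sI http://www.contoso.com/`) and succeed if they describe a
# redirect to an https:// URL.
is_https_redirect() {
  tr -d '\r' | awk '
    NR == 1 { status = $2 }
    tolower($1) == "location:" { location = $2 }
    END {
      redirect = (status ~ /^30[1278]$/)
      secure = (tolower(location) ~ /^https:\/\//)
      exit !(redirect && secure)
    }
  '
}

# Sample headers; replace the printf with the curl command above for a live test.
if printf 'HTTP/1.1 301 Moved Permanently\r\nLocation: https://www.contoso.com/\r\n\r\n' | is_https_redirect; then
  echo "HTTP redirects to HTTPS"
else
  echo "no HTTPS redirect"
fi
```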

Root Cause Analysis Framework

After applying the immediate fix, invest time in a structured root cause analysis. The Five Whys technique is a simple but effective method: start with the error symptom and ask “why” five times to drill down from the surface-level cause to the fundamental issue.

For example, for a Front Door custom domain outage: Why did the service fail? Because the connection timed out. Why did the connection time out? Because the DNS lookup returned a stale record. Why was the DNS record stale? Because the TTL was set to 24 hours during a migration and never reduced. Why was it not reduced? Because there was no checklist for post-migration cleanup. Why was there no checklist? Because the migration process was ad hoc rather than documented.

This analysis reveals that the root cause is not a technical configuration issue but a process gap that allowed undocumented changes. The preventive action is creating a migration checklist and review process, not just fixing the DNS TTL. Without this depth of analysis, the team will continue to encounter similar issues from different undocumented changes.

Categorize your root causes into buckets: configuration errors, capacity limits, code defects, external dependencies, and process gaps. Track the distribution over time. If most of your incidents fall into the configuration error bucket, invest in infrastructure-as-code validation and policy enforcement. If they fall into capacity limits, improve your monitoring and forecasting. This data-driven approach focuses your improvement efforts where they will have the most impact.

411 Length Required on POST Requests

Root Cause

Front Door requires a Content-Length header on POST requests that do not use chunked transfer encoding. Requests that carry neither Content-Length nor Transfer-Encoding: chunked are rejected with HTTP 411.

Fix

Ensure your client includes a Content-Length header on all POST, PUT, and PATCH requests. If using chunked transfer encoding, include the Transfer-Encoding: chunked header.
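
Content-Length counts bytes, not characters, which matters for bodies containing non-ASCII text. A small sketch follows; content_length is a hypothetical helper and the /api/items path is a made-up example. Most HTTP clients, curl included, set this header themselves when given a body, so the helper is mainly useful for hand-built requests.

```shell
# Hypothetical helper: byte length of a request body, the value that
# Content-Length must carry.
content_length() {
  printf '%s' "$1" | wc -c | tr -d ' '
}

body='{"name":"contoso"}'
echo "Content-Length: $(content_length "$body")"

# A request could then be sent along these lines:
#   curl -X POST https://www.contoso.com/api/items \
#     -H "Content-Type: application/json" \
#     -H "Content-Length: $(content_length "$body")" \
#     --data "$body"
```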

429 Too Many Requests — Rate Limiting

Root Cause

Front Door enforces platform-level rate limits. Legitimate traffic that exceeds these limits gets throttled.

Fix

  • Contact Azure support to increase rate limits for legitimate high-traffic scenarios
  • Implement client-side throttling to spread requests
  • Enable Front Door caching on the route to reduce origin requests
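
Client-side throttling is often implemented as exponential backoff between retries. The sketch below computes the delay only; backoff_delay is a hypothetical helper, and the base of 1 second and cap of 60 seconds are illustrative values rather than Front Door recommendations.

```shell
# Hypothetical helper: delay in seconds for a retry attempt (1-based),
# doubling from a base value and capped at a maximum.
backoff_delay() {
  attempt=$1; base=$2; cap=$3
  delay=$base
  n=1
  while [ "$n" -lt "$attempt" ] && [ "$delay" -lt "$cap" ]; do
    delay=$((delay * 2))
    n=$((n + 1))
  done
  if [ "$delay" -gt "$cap" ]; then delay=$cap; fi
  echo "$delay"
}

for attempt in 1 2 3 4 5 6 7; do
  echo "attempt $attempt: wait $(backoff_delay "$attempt" 1 60)s"
done
```

A retry loop would sleep for the computed delay whenever the response status is 429, then try again up to a fixed number of attempts.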

Origin Health Probe Failures

If health probes to your origin fail, Front Door marks the origin as unhealthy and stops routing traffic to it.

# Check origin group health probe configuration
az afd origin-group show \
  --origin-group-name myOriginGroup \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --query "healthProbeSettings"

# Update health probe
az afd origin-group update \
  --origin-group-name myOriginGroup \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --probe-request-type GET \
  --probe-protocol Https \
  --probe-path /health \
  --probe-interval-in-seconds 30
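
The health decision can be modeled roughly as counting successful probes in a recent window. This sketch assumes that only a 200 response counts as success and uses a window of 4 samples with 3 required as illustrative values; origin_is_healthy is a hypothetical helper, not Front Door’s implementation.

```shell
# Hypothetical model: count 200 responses among recent probe results and
# compare with the number of successes required.
#   $1 = successes required, remaining arguments = recent probe status codes
origin_is_healthy() {
  required=$1
  shift
  ok=0
  for status in "$@"; do
    if [ "$status" = "200" ]; then ok=$((ok + 1)); fi
  done
  [ "$ok" -ge "$required" ]
}

if origin_is_healthy 3 200 200 503 200; then
  echo "origin healthy"
else
  echo "origin unhealthy"
fi
```

If the probe path returns a redirect or requires authentication, every sample fails under this model, which is why a dedicated unauthenticated path such as /health is a common choice.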

Compression and Accept-Encoding Issues

When clients send byte-range requests with Accept-Encoding: gzip, Front Door may return unexpected results because compression changes the content length.

# Disable compression on the route
az afd route update \
  --route-name defaultRoute \
  --endpoint-name myEndpoint \
  --profile-name myFrontDoor \
  --resource-group myRG \
  --enable-compression false

Or create a Rules Engine rule to strip the Accept-Encoding header for specific paths.

Error Classification and Severity Assessment

Not all errors require the same response urgency. Classify errors into severity levels based on their impact on users and business operations. A severity 1 error causes complete service unavailability for all users. A severity 2 error degrades functionality for a subset of users. A severity 3 error causes intermittent issues that affect individual operations. A severity 4 error is a cosmetic or minor issue with a known workaround.

For Front Door routing and custom domain problems, map the specific error codes and messages to these severity levels. Create a classification matrix that your on-call team can reference when triaging incoming alerts. This prevents over-escalation of minor issues and under-escalation of critical ones. Include the expected resolution time for each severity level and the communication protocol (who to notify, how frequently to update stakeholders).

Track your error rates over time using Azure Monitor metrics and Log Analytics queries. Establish baseline error rates for healthy operation so you can distinguish between normal background error levels and genuine incidents. A service that normally experiences 0.1 percent error rate might not need investigation when errors spike to 0.2 percent, but a jump to 5 percent warrants immediate attention. Without this baseline context, every alert becomes equally urgent, leading to alert fatigue.
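
The baseline comparison can be expressed as a simple ratio test. In this sketch, triage_error_rate is a hypothetical helper and the factor of 10 is an assumed threshold chosen so that the figures above (0.1 percent baseline, 0.2 and 5 percent observed) fall on opposite sides.

```shell
# Hypothetical triage helper: compare the current error rate (percent) with
# a baseline and report whether it exceeds baseline * factor.
triage_error_rate() {
  awk -v cur="$1" -v base="$2" -v factor="$3" 'BEGIN {
    if (cur + 0 > base * factor) print "investigate"; else print "normal"
  }'
}

triage_error_rate 0.2 0.1 10
triage_error_rate 5 0.1 10
```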

Implement error budgets as part of your SLO framework. An error budget defines the maximum amount of unreliability your service can tolerate over a measurement window (typically monthly or quarterly). When the error budget is exhausted, the team shifts focus from feature development to reliability improvements. This mechanism creates a structured trade-off between innovation velocity and operational stability.
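
As a worked example, the budget for an availability SLO is the complement of the target multiplied by the window length. The helper name below is hypothetical, and the 30-day window is just one common choice.

```shell
# Hypothetical helper: error budget in minutes for an availability SLO
# (percent) over a window of the given number of days.
error_budget_minutes() {
  awk -v slo="$1" -v days="$2" 'BEGIN {
    printf "%.1f\n", (100 - slo) / 100 * days * 24 * 60
  }'
}

echo "99.9% over 30 days: $(error_budget_minutes 99.9 30) minutes"
echo "99%   over 30 days: $(error_budget_minutes 99 30) minutes"
```

A 99.9 percent target over 30 days leaves 43.2 minutes of allowable downtime; at 99 percent the budget grows to 432 minutes.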

Dependency Management and Service Health

Azure services depend on other Azure services internally, and your application adds additional dependency chains on top. When diagnosing Front Door routing and custom domain problems, map out the complete dependency tree, including network dependencies (DNS, load balancers, firewalls), identity dependencies (Azure AD, managed identity endpoints), and data dependencies (storage accounts, databases, key vaults).

Check Azure Service Health for any ongoing incidents or planned maintenance affecting the services in your dependency tree. Azure Service Health provides personalized notifications specific to the services and regions you use. Subscribe to Service Health alerts so your team is notified proactively when Microsoft identifies an issue that might affect your workload.

For each critical dependency, implement a health check endpoint that verifies connectivity and basic functionality. Your application’s readiness probe should verify not just that the application process is running, but that it can successfully reach all of its dependencies. When a dependency health check fails, the application should stop accepting new requests and return a 503 status until the dependency recovers. This prevents requests from queuing up and timing out, which would waste resources and degrade the user experience.

Dangling DNS Prevention

When you delete a Front Door resource, always clean up the CNAME records first. Otherwise, the DNS records become “dangling” — pointing to a resource that no longer exists — which creates a subdomain takeover vulnerability.

# Before deleting Front Door:
# 1. Remove CNAME records from DNS
# 2. Remove custom domains from Front Door
az afd custom-domain delete \
  --custom-domain-name wwwContoso \
  --profile-name myFrontDoor \
  --resource-group myRG

# 3. Then delete the profile
az afd profile delete \
  --profile-name myFrontDoor \
  --resource-group myRG
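
To audit for records that are already dangling, you could compare your zone’s CNAME targets against the endpoint host names that still exist. The sketch below takes both as plain text; find_dangling_cnames is a hypothetical helper, and the records in the sample are made up.

```shell
# Hypothetical audit helper: read "record target" pairs on stdin and print
# records whose target is an azurefd.net host that is not among the live
# endpoint host names passed as arguments.
find_dangling_cnames() {
  live=$(printf '%s\n' "$@" | tr '[:upper:]' '[:lower:]')
  while read -r record target; do
    target=$(printf '%s\n' "$target" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//')
    case "$target" in
      *.azurefd.net)
        if ! printf '%s\n' "$live" | grep -qxF -- "$target"; then
          echo "$record -> $target"
        fi
        ;;
    esac
  done
}

# Live host names could come from:
#   az afd endpoint list --profile-name myFrontDoor -g myRG --query "[].hostName" -o tsv
find_dangling_cnames contoso-frontend.azurefd.net <<'EOF'
www.contoso.com contoso-frontend.azurefd.net.
old.contoso.com retired-app.azurefd.net.
shop.contoso.com shop.example.net.
EOF
```

With the sample input, only old.contoso.com is reported, since its target no longer matches a live endpoint.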

Diagnostic Tools

# Front Door diagnostic script
PROFILE="myFrontDoor"
RG="myRG"

echo "=== Profile Status ==="
az afd profile show -n $PROFILE -g $RG \
  --query "{state:provisioningState, sku:sku.name}" -o json

echo "=== Endpoints ==="
az afd endpoint list --profile-name $PROFILE -g $RG \
  --query "[].{name:name, hostname:hostName, state:enabledState}" -o table

echo "=== Custom Domains ==="
az afd custom-domain list --profile-name $PROFILE -g $RG \
  --query "[].{name:name, hostname:hostName, validation:domainValidationState, tls:tlsSettings.certificateType}" -o table

echo "=== Origin Groups ==="
az afd origin-group list --profile-name $PROFILE -g $RG \
  --query "[].{name:name, probe:healthProbeSettings.probePath}" -o table

echo "=== Origins ==="
for OG in $(az afd origin-group list --profile-name $PROFILE -g $RG --query "[].name" -o tsv); do
  echo "--- Origin Group: $OG ---"
  az afd origin list --origin-group-name $OG --profile-name $PROFILE -g $RG \
    --query "[].{name:name, hostName:hostName, enabled:enabledState}" -o table
done

echo "=== Routes ==="
for EP in $(az afd endpoint list --profile-name $PROFILE -g $RG --query "[].name" -o tsv); do
  echo "--- Endpoint: $EP ---"
  az afd route list --endpoint-name $EP --profile-name $PROFILE -g $RG \
    --query "[].{name:name, patterns:patternsToMatch, httpsRedirect:httpsRedirect, domains:customDomains[].id}" -o table
done

Custom Domain Limitations

  • Custom domains don’t support punycode/internationalized domain names
  • A custom domain can only be associated with one Front Door profile at a time
  • Apex/root domains require DNS ALIAS/ANAME records (not all providers support this)
  • Managed certificates are issued by DigiCert — ensure no CAA records block DigiCert
  • Managed certificate provisioning can take up to 8 hours

Prevention Best Practices

  • Use Front Door Standard or Premium — Classic is being retired
  • Always use FQDNs for origins — Never use IP addresses
  • Set appropriate origin response timeouts — Match to your application’s actual response time
  • Create routes for all custom domains — Every domain must have at least one route
  • Monitor origin health probe status — Set alerts on probe failures
  • Clean up DNS before deleting resources — Prevent dangling DNS vulnerabilities
  • Remove afdverify records after domain mapping is complete
  • Test custom domains with nslookup and curl before considering setup complete

Post-Resolution Validation and Hardening

After applying the fix, perform a structured validation to confirm the issue is fully resolved. Do not rely solely on the absence of error messages. Actively verify that the service is functioning correctly by running health checks, executing test transactions, and monitoring key metrics for at least 30 minutes after the change.

Validate from multiple perspectives. Check the Azure resource health status, run your application’s integration tests, verify that dependent services are receiving data correctly, and confirm that end users can complete their workflows. A fix that resolves the immediate error but breaks a downstream integration is not a complete resolution.

Implement defensive monitoring to detect if the issue recurs. Create an Azure Monitor alert rule that triggers on the specific error condition you just fixed. Set the alert to fire within minutes of recurrence so you can respond before the issue impacts users. Include the remediation steps in the alert’s action group notification so that any on-call engineer can apply the fix quickly.

Finally, conduct a brief post-incident review. Document the root cause, the fix applied, the time to detect, diagnose, and resolve the issue, and any preventive measures that should be implemented. Share this documentation with the broader engineering team through a blameless post-mortem process. This transparency transforms individual incidents into organizational learning that raises the entire team’s operational capability.

Consider adding the error scenario to your integration test suite. Automated tests that verify the service behaves correctly under the conditions that triggered the original error provide a safety net against regression. If a future change inadvertently reintroduces the problem, the test will catch it before it reaches production.

Summary

Azure Front Door routing and custom domain issues fall into three main categories: origin connectivity (503/504 timeouts, certificate mismatches), domain configuration (DNS validation, missing routes, certificate provisioning), and protocol/header issues (411, 429, compression). The diagnostic script above quickly identifies the state of every component in your Front Door configuration. For reliable custom domain setup, always verify DNS resolution, wait for certificate provisioning, and ensure every domain has an associated route.
