Understanding Azure Monitor Alert Rules
Azure Monitor alerts proactively notify you when conditions in your monitoring data indicate potential issues. They are essential for maintaining the health and availability of your Azure workloads. However, alert rules that don’t trigger — or trigger incorrectly — are worse than having no alerts at all, because they create a false sense of security.
This guide covers the most common reasons Azure Monitor alert rules fail to trigger correctly, from misconfigured aggregation settings to auto-disabled rules, with exact diagnostic steps and fixes.
How Alert Rules Evaluate
Understanding the evaluation cycle is essential for diagnosing alert issues:
- Frequency of evaluation — How often the rule checks the condition (e.g., every 5 minutes)
- Aggregation granularity (Period) — The time window over which data is aggregated (e.g., last 15 minutes)
- Aggregation type — How data points within the window are combined (Average, Count, Min, Max, Total)
- Threshold — The value that triggers the alert
A common misconception is that a rule with 5-minute frequency and 15-minute period checks the last 5 minutes. It actually checks the last 15 minutes every 5 minutes.
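The interplay between these settings is easiest to see when creating a rule with the Azure CLI. A minimal sketch, assuming a VM named myVM in resource group myRG; the rule name and scope are illustrative:

```shell
# Evaluates every 5 minutes over a 15-minute window: each evaluation
# aggregates the LAST 15 minutes of data, not just the last 5.
az monitor metrics alert create \
--name cpu-high \
--resource-group myRG \
--scopes /subscriptions/{sub}/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM \
--condition "avg Percentage CPU > 80" \
--window-size 15m \
--evaluation-frequency 5m \
--description "Avg CPU over the last 15 min, checked every 5 min"
```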
Metric Alerts Not Firing
Wrong Aggregation Settings
The most common reason metric alerts don’t fire is a mismatch between the aggregation settings in the alert rule and how you expect the metric to behave.
Example scenario:
- Alert rule: Average CPU > 80% over 5 minutes
- Actual data: CPU spikes to 100% for 30 seconds, rest at 40%
- Average over 5 minutes: ~50%
- Result: Alert never fires despite 100% spikes
Fix: Use "Maximum" aggregation instead of "Average" to catch spikes
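Before changing the rule, you can pull the same metric with both aggregations side by side to confirm the diagnosis. A sketch, assuming the same illustrative myVM resource:

```shell
# Compare Average vs Maximum over the same 5-minute grains; if Maximum crosses
# the threshold but Average does not, the rule needs Maximum aggregation.
az monitor metrics list \
--resource /subscriptions/{sub}/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM \
--metric "Percentage CPU" \
--aggregation Average Maximum \
--interval PT5M \
--output table
```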
Diagnosis: Compare Metrics Chart to Alert Settings
- Navigate to Portal > Monitor > Metrics
- Select the same resource, metric, and aggregation as your alert rule
- Set the time granularity to match the alert rule’s aggregation granularity
- If the chart shows the metric crossing the threshold, the alert should have fired
- If it doesn’t cross, adjust your alert settings
Stateful Alert Behavior
Metric alerts are stateful by default. Once fired, an alert won’t fire again for the same time series until it resolves, which happens only after the condition has not been met for 3 consecutive evaluations. This prevents alert storms but can mislead you into thinking the alert isn’t working.
// Make the alert stateless via its ARM template or the REST API
// (fires on every evaluation that meets the condition)
{
"properties": {
"autoMitigate": false
}
}
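If you manage rules with the Azure CLI rather than ARM templates, the same switch is exposed as a flag. A sketch, reusing an illustrative rule name:

```shell
# Turn off auto-mitigation so the rule fires on every evaluation that meets
# the condition instead of staying silent until the fired alert resolves
az monitor metrics alert update \
--name cpu-high \
--resource-group myRG \
--auto-mitigate false
```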
Dynamic Thresholds Not Active
Dynamic thresholds need historical data to establish a baseline. They require:
- At least 3 days of metric history
- At least 30 metric samples
During the learning period, dynamic threshold alerts will not fire. Use the Ignore data before setting to exclude anomalous historical periods from baseline calculation.
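A dynamic threshold can be declared directly in the CLI condition syntax. A sketch with illustrative names; `2 of 4` means two of the last four evaluated windows must violate the threshold before the alert fires, which also filters transient spikes:

```shell
# Dynamic threshold with medium sensitivity, requiring 2 violations out of
# the last 4 evaluation windows
az monitor metrics alert create \
--name cpu-dynamic \
--resource-group myRG \
--scopes /subscriptions/{sub}/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM \
--condition "avg Percentage CPU > dynamic medium 2 of 4" \
--window-size 5m \
--evaluation-frequency 5m
```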
Missing First Evaluation
When a new resource starts emitting metrics or a new dimension value appears, the first evaluation period may be missed because there isn’t enough data. Set Aggregation granularity greater than Frequency of evaluation to overlap evaluation windows and avoid missing the first data point.
Guest OS Metrics Not Available
VM metrics like CPU percentage and network are host-level metrics collected automatically. Guest OS metrics — memory usage, disk space, process counts — require the Azure Monitor Agent.
# Install Azure Monitor Agent on a Linux VM
az vm extension set \
--resource-group myRG \
--vm-name myVM \
--name AzureMonitorLinuxAgent \
--publisher Microsoft.Azure.Monitor \
--enable-auto-upgrade true
# Or for Windows
az vm extension set \
--resource-group myRG \
--vm-name myVM \
--name AzureMonitorWindowsAgent \
--publisher Microsoft.Azure.Monitor \
--enable-auto-upgrade true
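Installing the agent is not sufficient on its own: AMA only collects what a data collection rule (DCR) tells it to. A sketch of associating an existing DCR, assuming the monitor-control-service CLI extension and an illustrative DCR named myDCR:

```shell
# Associate a DCR that defines performance counters (memory, disk) with the VM;
# without an association, AMA runs but sends no guest OS metrics
az monitor data-collection rule association create \
--name myVM-dcr-link \
--rule-id /subscriptions/{sub}/resourceGroups/myRG/providers/Microsoft.Insights/dataCollectionRules/myDCR \
--resource /subscriptions/{sub}/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM
```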
Log Search Alerts Not Firing
Alert Rule Health
Log search alert rules can be in a degraded or unavailable state. Check the rule’s health:
Portal > Monitor > Alerts > Alert rules > Select rule > Help > Resource health
Log Ingestion Latency
Log data has inherent ingestion latency — data may not be available for querying for several minutes after generation. If your alert rule evaluates before the data arrives, it sees no results and doesn’t fire.
// Check ingestion latency for a table
AzureDiagnostics
| where TimeGenerated > ago(1h)
| extend IngestionTime = ingestion_time()
| extend LatencySeconds = datetime_diff('second', IngestionTime, TimeGenerated)
| summarize avg(LatencySeconds), max(LatencySeconds), percentile(LatencySeconds, 95)
If latency exceeds 4 minutes, consider using metric alerts instead, which have near-real-time evaluation.
Auto-Disabled Rules
Azure Monitor automatically disables log search alert rules if the query fails consistently for one week. Check the Activity Log for the event Microsoft.Insights/ScheduledQueryRules/disable/action.
# Check if rule is disabled
az monitor scheduled-query show \
--name myAlertRule \
--resource-group myRG \
--query "enabled"
Common reasons for auto-disabling:
- Target resource (Log Analytics workspace) was deleted
- Data source stopped sending data for 30+ days, causing table removal
- Query references a column that no longer exists in the schema
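An auto-disabled rule does not re-enable itself once the underlying problem is fixed. A sketch of finding the disable event and re-enabling the rule, reusing the rule name from the earlier example:

```shell
# Look for the auto-disable event in the last week's activity log
az monitor activity-log list \
--resource-group myRG \
--offset 7d \
--query "[?contains(operationName.value, 'ScheduledQueryRules/disable')].{time:eventTimestamp, op:operationName.value}" \
-o table

# Re-enable the rule after fixing the query
az monitor scheduled-query update \
--name myAlertRule \
--resource-group myRG \
--disabled false
```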
Muted Actions
Actions can be suppressed by:
- Mute actions checkbox — Temporarily suppresses notifications when you want the alert to fire but not notify
- Automatically resolve alerts — If enabled, the alert auto-resolves and may not be visible in the fired alerts list
- Action group suppression rules — Maintenance windows can suppress all notifications
Managed Identity Permissions
Log search alert rules created with a system-assigned managed identity require explicit permissions to query the target workspace.
# Get the alert rule's managed identity
ALERT_IDENTITY=$(az monitor scheduled-query show \
--name myAlertRule \
--resource-group myRG \
--query "identity.principalId" -o tsv)
# Grant Reader role on the Log Analytics workspace
az role assignment create \
--assignee-object-id $ALERT_IDENTITY \
--assignee-principal-type ServicePrincipal \
--role "Reader" \
--scope /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.OperationalInsights/workspaces/{workspace}
# Also grant Log Analytics Reader for query permissions
az role assignment create \
--assignee-object-id $ALERT_IDENTITY \
--assignee-principal-type ServicePrincipal \
--role "Log Analytics Reader" \
--scope /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.OperationalInsights/workspaces/{workspace}
Activity Log Alerts Not Firing
Common Issues
- Scope mismatch — Alert is scoped to a resource group but the event occurs at resource level
- Event category mismatch — Alert watches “Administrative” but event is “Service Health”
- Insufficient permissions — User creating the alert doesn’t have Reader on the scope
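Scoping the rule at the subscription level avoids the scope-mismatch problem, because activity log alerts match events at or below their scope. A sketch with illustrative names (vm-delete-alert, myActionGroup):

```shell
# Subscription-scoped rule: fires on VM deletions anywhere in the subscription
az monitor activity-log alert create \
--name vm-delete-alert \
--resource-group myRG \
--scope /subscriptions/{sub} \
--condition "category=Administrative and operationName=Microsoft.Compute/virtualMachines/delete"

# Attach an action group so the alert actually notifies someone
az monitor activity-log alert action-group add \
--name vm-delete-alert \
--resource-group myRG \
--action-group /subscriptions/{sub}/resourceGroups/myRG/providers/Microsoft.Insights/actionGroups/myActionGroup
```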
Alert Processing Rules (formerly Action Rules)
Alert processing rules can add, suppress, or modify action groups for fired alerts. If you’re not receiving notifications, check for suppression rules:
# List alert processing rules
az monitor alert-processing-rule list \
--resource-group myRG \
--output table
Service Limits
| Limit | Value |
|---|---|
| Metric alert rules per subscription | 4,000 |
| Log search alert rules per subscription | 4,096 |
| Activity log alert rules per subscription | 100 |
| Action groups per subscription | 2,000 |
| Email notifications per hour | 100 |
| SMS notifications per 5 minutes | 1 |
Common Error Messages
| Error | Meaning | Fix |
|---|---|---|
| “Alert has been failing consistently for the past week” | Rule auto-disabled | Fix the query, re-enable the rule |
| “The query couldn’t be validated since you need permission for the logs” | Missing query permission | Assign Log Analytics Reader |
| “One-minute frequency isn’t supported for this query” | Query uses operators unsupported at 1-minute frequency | Set the evaluation frequency to 5 minutes or longer |
| “Failed to resolve scalar expression named <>“ | Column doesn’t exist | Fix column name in query |
Testing Alert Rules
# There is no built-in way to force-fire an alert rule; verify the pipeline
# end to end by temporarily lowering the threshold so real data crosses it.
# Example: set the CPU threshold to 1% and confirm the alert fires and notifies.
# Inspect current metric values first to pick a threshold the data will cross.
# For Application Insights:
az monitor app-insights metrics show \
--app myAppInsights \
--resource-group myRG \
--metrics requests/count
Diagnostic Checklist
# Quick diagnostic for alert rules
RG="myRG"
echo "=== Alert Rules ==="
az monitor metrics alert list -g $RG \
--query "[].{name:name, enabled:enabled, severity:severity}" -o table
echo "=== Log Search Alert Rules ==="
az monitor scheduled-query list -g $RG \
--query "[].{name:name, enabled:enabled}" -o table
echo "=== Fired Alerts (last 24h) ==="
# Fired alerts are listed via the Alerts Management REST API ('az rest');
# there is no dedicated command for this in the core Azure CLI
SUB=$(az account show --query id -o tsv)
az rest --method get \
--url "https://management.azure.com/subscriptions/$SUB/providers/Microsoft.AlertsManagement/alerts?api-version=2019-05-05-preview&timeRange=1d" \
--query "value[].{name:properties.essentials.targetResource, severity:properties.essentials.severity, state:properties.essentials.alertState}" \
-o table
echo "=== Alert Processing Rules ==="
az monitor alert-processing-rule list -g $RG \
--query "[].{name:name, enabled:properties.enabled}" -o table
Prevention Best Practices
- Set aggregation granularity greater than frequency — Prevents missing first evaluations
- Use “Number of violations” for Dynamic Thresholds — Filters out transient spikes
- Install Azure Monitor Agent for guest OS metrics — Host metrics alone don’t cover memory and disk
- Monitor alert rule health — Check Resource health periodically for degraded rules
- Assign managed identity permissions immediately — Don’t wait for query failures
- Test alerts in a non-production environment first — Verify they fire as expected before going live
- Use metric alerts over log alerts when possible — Faster evaluation with near-real-time latency
- Document suppression rules — Maintenance window suppressions are easy to forget
Summary
Azure Monitor alert rules fail to trigger for predictable reasons: wrong aggregation settings, stateful behavior suppressing re-fires, log ingestion latency, auto-disabled rules, missing managed identity permissions, or action suppression rules. The most common fix is adjusting the aggregation type and granularity to match how you expect the metric to behave. For log search alerts, check rule health, ingestion latency, and query validity. Always test alert rules by temporarily lowering thresholds to verify the entire notification pipeline works end to end.
For more details, refer to the official documentation: Azure Monitor overview, What are Azure Monitor alerts?, Data collection rules (DCRs) in Azure Monitor.