Understanding Azure Blob Upload Size and Timeout Limits
Azure Blob Storage is designed to handle massive amounts of unstructured data, supporting individual blobs up to 190.7 TiB. However, uploading data to Blob Storage involves specific size limits, timeout constraints, and throttling thresholds that can cause upload failures if not properly understood and handled. Whether you’re uploading small configuration files or multi-terabyte datasets, knowing these limits — and how to work within them — is essential.
This guide covers every upload failure scenario, from single-blob size limits to storage account throttling, with tested solutions and production-ready code examples.
Diagnostic Context
When encountering Azure Blob upload failures due to size or timeout, the first step is understanding what changed. In most production environments, errors do not appear spontaneously. They are triggered by a change in configuration, code, traffic patterns, or the platform itself. Review your deployment history, recent configuration changes, and Azure Service Health notifications to identify potential triggers.
Azure maintains detailed activity logs for every resource operation. These logs capture who made a change, what was changed, when it happened, and from which IP address. Cross-reference the timeline of your error reports with the activity log entries to establish a causal relationship. Often, the fix is simply reverting the most recent change that correlates with the error onset.
If no recent changes are apparent, consider external factors. Azure platform updates, regional capacity changes, and dependent service modifications can all affect your resources. Check the Azure Status page and your subscription’s Service Health blade for any ongoing incidents or planned maintenance that coincides with your issue timeline.
Common Pitfalls to Avoid
When fixing Azure service errors under pressure, engineers sometimes make the situation worse by applying changes too broadly or too quickly. Here are critical pitfalls to avoid during your remediation process.
First, avoid making multiple changes simultaneously. If you change the firewall rules, the connection string, and the service tier all at once, you cannot determine which change actually resolved the issue. Apply one change at a time, verify the result, and document what worked. This disciplined approach builds reliable operational knowledge for your team.
Second, do not disable security controls to bypass errors. Opening all firewall rules, granting overly broad RBAC permissions, or disabling SSL enforcement might eliminate the error message, but it creates security vulnerabilities that are far more dangerous than the original issue. Always find the targeted fix that resolves the error while maintaining your security posture.
Third, test your fix in a non-production environment first when possible. Azure resource configurations can be exported as ARM or Bicep templates and deployed to a test resource group for validation. This extra step takes minutes but can prevent a failed fix from escalating the production incident.
Fourth, document the error message exactly as it appears, including correlation IDs, timestamps, and request IDs. If you need to open a support case with Microsoft, this information dramatically speeds up the investigation. Azure support engineers can use correlation IDs to trace the exact request through Microsoft’s internal logging systems.
Blob Upload Size Limits Reference
Azure Blob Storage supports three blob types, each with different size constraints:
| Blob Type | Max Single Upload (Put Blob) | Max Block/Page Size | Max Total Blob Size |
|---|---|---|---|
| Block Blob (API v2019-12-12+) | 5,000 MiB | 4,000 MiB per block | ~190.7 TiB (50,000 blocks) |
| Block Blob (older API versions) | 256 MiB | 100 MiB per block | ~4.75 TiB |
| Append Blob | N/A | 4 MiB per block | ~195 GiB (50,000 blocks) |
| Page Blob | N/A | 4 MiB per page | 8 TiB |
Storage Account-Level Limits
Beyond individual blob limits, the storage account itself has throughput and request rate constraints:
| Limit | Value (Major Regions) | Value (Other Regions) |
|---|---|---|
| Max account capacity | 5 PiB (default) | 5 PiB (default) |
| Max request rate | 20,000 requests/sec | 20,000 requests/sec |
| Max ingress | 60 Gbps | 25 Gbps |
| Max egress | 200 Gbps | 50 Gbps |
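These ceilings matter for bulk-transfer planning: no amount of client parallelism beats the account ingress cap. A quick arithmetic sketch using the 60 Gbps major-region figure from the table (the helper name is ours, for illustration only):

```python
def min_upload_seconds(total_bytes, ingress_gbps=60):
    """Lower bound on wall-clock upload time imposed by account-level ingress."""
    ingress_bytes_per_sec = ingress_gbps * 1e9 / 8  # Gbps -> bytes/sec
    return total_bytes / ingress_bytes_per_sec

# A 10 TiB dataset needs at least ~24 minutes at 60 Gbps, no matter the client
t = min_upload_seconds(10 * 1024**4)
```

If a planned transfer window is shorter than this bound, the data must be split across multiple storage accounts.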
Single Upload Exceeds Put Blob Limit
Error: 413 Request Entity Too Large
When you attempt to upload a file larger than the Put Blob maximum size (5,000 MiB on modern API versions) in a single request, the server rejects the upload immediately.
Fix: Use Block Upload
For files larger than 256 MiB, always use the block upload pattern: split the file into blocks, upload each block individually with Put Block, then commit all blocks with Put Block List.
# PowerShell: Upload large file using block upload
$storageAccount = "mystorageaccount"
$containerName = "uploads"
$blobName = "largefile.dat"
$filePath = "C:\data\largefile.dat"
$key = "<storage-account-key>"  # retrieve with Get-AzStorageAccountKey
$context = New-AzStorageContext -StorageAccountName $storageAccount -StorageAccountKey $key
# Set-AzStorageBlobContent handles block upload automatically for large files
Set-AzStorageBlobContent `
-File $filePath `
-Container $containerName `
-Blob $blobName `
-Context $context `
-Force
# Python SDK: Upload with automatic chunking
from azure.storage.blob import BlobServiceClient
# max_single_put_size controls when the SDK switches to block upload;
# max_block_size controls individual block size. Both are client options,
# not upload_blob() arguments.
blob_service = BlobServiceClient.from_connection_string(
    conn_str,
    max_single_put_size=64 * 1024 * 1024,  # 64 MB threshold
    max_block_size=100 * 1024 * 1024       # 100 MB blocks
)
blob_client = blob_service.get_blob_client(container="uploads", blob="largefile.dat")
with open("largefile.dat", "rb") as data:
    blob_client.upload_blob(
        data,
        overwrite=True,
        max_concurrency=8  # Parallel block uploads
    )
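Under the hood, these helpers drive the explicit Put Block / Put Block List sequence. A minimal sketch of that pattern using the Python SDK's `stage_block` and `commit_block_list` (the chunking helper and function names are ours; the SDK import is deferred so the chunking logic stands on its own):

```python
import base64
import uuid

def iter_blocks(stream, block_size=100 * 1024 * 1024):
    """Yield (block_id, chunk) pairs. Block IDs must be base64-encoded
    and the same length within a single blob."""
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            break
        # uuid4().hex is always 32 chars, so encoded IDs have uniform length
        yield base64.b64encode(uuid.uuid4().hex.encode()).decode(), chunk

def upload_in_blocks(blob_client, stream, block_size=100 * 1024 * 1024):
    # blob_client: an azure.storage.blob.BlobClient for the target blob
    from azure.storage.blob import BlobBlock
    block_list = []
    for block_id, chunk in iter_blocks(stream, block_size):
        blob_client.stage_block(block_id=block_id, data=chunk)  # Put Block
        block_list.append(BlobBlock(block_id=block_id))
    blob_client.commit_block_list(block_list)  # Put Block List
```

One advantage of the explicit pattern: uncommitted blocks survive for seven days, so a failed transfer can be resumed by re-staging only the missing blocks before committing.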
Using AzCopy for Large Uploads
AzCopy is Microsoft’s recommended tool for large file transfers. It handles block uploads, parallelism, retries, and resume automatically.
# Upload single large file
azcopy copy "largefile.dat" \
"https://mystorageaccount.blob.core.windows.net/uploads/largefile.dat?sv=..." \
--block-size-mb 100 \
--cap-mbps 1000
# Upload entire directory
azcopy copy "/data/uploads/*" \
"https://mystorageaccount.blob.core.windows.net/uploads/?sv=..." \
--recursive \
--put-md5
# Resume a failed transfer
azcopy jobs resume <job-id>
Storage Account Throttling — 503 Server Busy
Error: HTTP 503 Server Busy
When your upload rate exceeds the storage account’s partition server capacity, Azure returns HTTP 503. This is not a hard limit but a dynamic throttling response based on current load.
Diagnosis
# Check storage account metrics
az monitor metrics list \
--resource /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Storage/storageAccounts/{account} \
--metric "Transactions" \
--dimension "ResponseType" \
--interval PT1H \
--start-time 2024-01-01T00:00:00Z
Fix: Implement Exponential Backoff
# Python: Upload with exponential backoff retry
from azure.core.exceptions import HttpResponseError
import random
import time

def upload_with_retry(blob_client, data, max_retries=5):
    for attempt in range(max_retries):
        try:
            blob_client.upload_blob(data, overwrite=True)
            return True
        except HttpResponseError as e:
            if e.status_code in (500, 503):  # Operation Timeout / Server Busy
                wait_time = (2 ** attempt) + (random.random() * 0.5)
                print(f"Throttled. Retrying in {wait_time:.1f}s...")
                time.sleep(wait_time)
                data.seek(0)  # Reset stream position
            else:
                raise
    raise Exception("Max retries exceeded")
// C#: BlobClientOptions with retry policy
var options = new BlobClientOptions();
options.Retry.MaxRetries = 5;
options.Retry.Mode = RetryMode.Exponential;
options.Retry.Delay = TimeSpan.FromSeconds(1);
options.Retry.MaxDelay = TimeSpan.FromSeconds(60);
options.Retry.NetworkTimeout = TimeSpan.FromMinutes(10);
var blobServiceClient = new BlobServiceClient(connectionString, options);
Operation Timeout — 500 Internal Server Error
Error: HTTP 500 Operation Timeout
This error occurs when a single upload operation takes longer than the server-side timeout allows. It’s common with large single-block uploads over slow connections.
Fix
- Reduce block size — Smaller blocks upload faster per-request
- Increase client-side timeout — Match the timeout to your expected upload duration
- Use parallel uploads — Upload multiple blocks simultaneously
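The first two points interact: a block must finish within the server timeout at whatever bandwidth the client actually has. A back-of-the-envelope sketch (the helper name and the 50% headroom factor are our illustrative assumptions):

```python
# Rough sizing: the largest block a link can move within the server timeout.
def max_block_bytes(bandwidth_mbps, timeout_seconds, headroom=0.5):
    bytes_per_sec = bandwidth_mbps * 1e6 / 8  # Mbps -> bytes/sec
    return int(bytes_per_sec * timeout_seconds * headroom)

# On a 10 Mbps link with a 600 s timeout, keep blocks under ~375 MB
limit = max_block_bytes(10, 600)
```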
# Azure CLI: Set longer timeout
az storage blob upload \
--account-name mystorageaccount \
--container-name uploads \
--name largefile.dat \
--file ./largefile.dat \
--timeout 600 \
--max-connections 8
# PowerShell: Configure timeout in storage context
$context = New-AzStorageContext `
-StorageAccountName $accountName `
-StorageAccountKey $key
# ServerTimeoutPerRequest in seconds
Set-AzStorageBlobContent `
-File "largefile.dat" `
-Container "uploads" `
-Blob "largefile.dat" `
-Context $context `
-ServerTimeoutPerRequest 600 `
-ConcurrentTaskCount 8
SAS Token Issues
Error: 403 Forbidden
Upload failures with 403 errors typically indicate SAS token problems rather than true permission issues.
Common SAS Token Issues
- Token expired — check the `se` (expiry) parameter in the SAS URL
- Missing write permission — SAS must include `sp=w` (write) or `sp=c` (create)
- IP restriction — check the `sip` parameter in the SAS URL
- Protocol restriction — SAS may require HTTPS only (`spr=https`)
- Wrong service version — an old `sv` parameter may not support your operation
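These checks can be automated client-side before an upload is attempted. A minimal sketch that inspects the standard SAS query parameters (`se`, `sp`, `spr`) in a URL; the helper name is ours, not an SDK API:

```python
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs

def check_sas(url):
    """Return a list of likely 403 causes detectable from the SAS URL itself."""
    parts = urlparse(url)
    qs = parse_qs(parts.query)
    problems = []
    expiry = qs.get("se", [None])[0]
    if expiry:
        exp = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
        if exp <= datetime.now(timezone.utc):
            problems.append("token expired")
    perms = set(qs.get("sp", [""])[0])
    if not ({"w", "c"} & perms):
        problems.append("missing write/create permission")
    if qs.get("spr", [""])[0] == "https" and parts.scheme != "https":
        problems.append("HTTPS required by token")
    return problems
```

Note that IP restrictions (`sip`) cannot be fully validated client-side, since the client may not know its public egress address.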
# Generate a SAS token with proper permissions for upload
az storage container generate-sas \
--account-name mystorageaccount \
--name uploads \
--permissions cwl \
--expiry $(date -u -d "+1 day" '+%Y-%m-%dT%H:%MZ') \
--https-only
# Generate account-level SAS
az storage account generate-sas \
--account-name mystorageaccount \
--services b \
--resource-types co \
--permissions cwl \
--expiry $(date -u -d "+1 day" '+%Y-%m-%dT%H:%MZ') \
--https-only
Root Cause Analysis Framework
After applying the immediate fix, invest time in a structured root cause analysis. The Five Whys technique is a simple but effective method: start with the error symptom and ask “why” five times to drill down from the surface-level cause to the fundamental issue.
For example, considering Azure Blob upload failures due to size or timeout: Why did the service fail? Because the connection timed out. Why did the connection time out? Because the DNS lookup returned a stale record. Why was the DNS record stale? Because the TTL was set to 24 hours during a migration and never reduced. Why was it not reduced? Because there was no checklist for post-migration cleanup. Why was there no checklist? Because the migration process was ad hoc rather than documented.
This analysis reveals that the root cause is not a technical configuration issue but a process gap that allowed undocumented changes. The preventive action is creating a migration checklist and review process, not just fixing the DNS TTL. Without this depth of analysis, the team will continue to encounter similar issues from different undocumented changes.
Categorize your root causes into buckets: configuration errors, capacity limits, code defects, external dependencies, and process gaps. Track the distribution over time. If most of your incidents fall into the configuration error bucket, invest in infrastructure-as-code validation and policy enforcement. If they fall into capacity limits, improve your monitoring and forecasting. This data-driven approach focuses your improvement efforts where they will have the most impact.
Lease Conflicts — 409 Conflict
Error: HTTP 409 Conflict — There is currently a lease on the blob
Active leases on a blob prevent overwriting. This commonly happens when a previous upload was interrupted or when another process holds a lease.
# Break the lease on a blob
az storage blob lease break \
--account-name mystorageaccount \
--container-name uploads \
--blob-name myfile.dat
# Then retry the upload
az storage blob upload \
--account-name mystorageaccount \
--container-name uploads \
--name myfile.dat \
--file ./myfile.dat \
--overwrite
Hot Partition Throttling
Azure Blob Storage automatically partitions data. When many blobs with similar names are uploaded rapidly, they may land on the same partition server, causing localized throttling even when account-level limits are not reached.
Symptoms
- Intermittent 503 errors during bulk uploads
- Performance degrades as upload batch size increases
- Some uploads succeed while others in the same batch fail
Fix: Distribute Blob Names
# Bad: Sequential names create hot partitions
# logs/2024-01-01.log, logs/2024-01-02.log, logs/2024-01-03.log
# Good: Prefix with hash for distribution
import hashlib

def distributed_blob_name(original_name):
    # Short hash prefix spreads names across partition servers
    hash_prefix = hashlib.md5(original_name.encode()).hexdigest()[:4]
    return f"{hash_prefix}/{original_name}"

# Result (illustrative hashes): a1b2/logs/2024-01-01.log, c3d4/logs/2024-01-02.log
Append Blob Specific Limits
Append blobs have stricter limits than block blobs and are commonly used for logging. Each append operation is limited to 4 MiB, and the total blob size cannot exceed approximately 195 GiB.
Error: Append Block Size Exceeded
# Python: Safely append data with size checking
from azure.storage.blob import BlobServiceClient

MAX_APPEND_BLOCK = 4 * 1024 * 1024            # 4 MiB
MAX_APPEND_BLOB = 195 * 1024 * 1024 * 1024    # ~195 GiB

def safe_append(blob_client, data):
    # Check current blob size (blob_client must point at an append blob)
    props = blob_client.get_blob_properties()
    current_size = props.size
    if current_size + len(data) > MAX_APPEND_BLOB:
        raise ValueError("Append would exceed maximum blob size")
    # Split into chunks if data exceeds max block size
    for i in range(0, len(data), MAX_APPEND_BLOCK):
        chunk = data[i:i + MAX_APPEND_BLOCK]
        blob_client.append_block(chunk)
Network-Level Upload Failures
Connection Reset and Socket Errors
Network instability between the client and Azure Storage can cause upload failures, especially for long-running transfers.
# Python: Robust upload with connection retry
from azure.storage.blob import BlobServiceClient
from azure.core.pipeline.policies import RetryPolicy

retry_policy = RetryPolicy(
    retry_total=10,
    retry_connect=5,
    retry_read=5,
    retry_status=5,
    retry_backoff_factor=0.5
)
blob_service = BlobServiceClient.from_connection_string(
    conn_str,
    retry_policy=retry_policy,
    connection_timeout=300,
    read_timeout=600
)
Error Classification and Severity Assessment
Not all errors require the same response urgency. Classify errors into severity levels based on their impact on users and business operations. A severity 1 error causes complete service unavailability for all users. A severity 2 error degrades functionality for a subset of users. A severity 3 error causes intermittent issues that affect individual operations. A severity 4 error is a cosmetic or minor issue with a known workaround.
For Azure Blob upload failures due to size or timeout, map the specific error codes and messages to these severity levels. Create a classification matrix that your on-call team can reference when triaging incoming alerts. This prevents over-escalation of minor issues and under-escalation of critical ones. Include the expected resolution time for each severity level and the communication protocol (who to notify, how frequently to update stakeholders).
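As a concrete illustration, such a matrix can live directly in triage tooling. The error codes and level assignments below are hypothetical examples, not an official mapping:

```python
# Hypothetical severity matrix for blob upload error codes (levels 1-4 as above)
SEVERITY_MATRIX = {
    "AccountIsDisabled": 1,     # complete unavailability for all users
    "AuthenticationFailed": 2,  # a subset of clients blocked (e.g. expired SAS)
    "ServerBusy": 3,            # intermittent throttling of individual operations
    "OperationTimedOut": 3,
    "RequestBodyTooLarge": 3,
}

def triage(error_code):
    # Unknown codes default to severity 4 pending classification
    return SEVERITY_MATRIX.get(error_code, 4)
```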
Track your error rates over time using Azure Monitor metrics and Log Analytics queries. Establish baseline error rates for healthy operation so you can distinguish between normal background error levels and genuine incidents. A service that normally experiences 0.1 percent error rate might not need investigation when errors spike to 0.2 percent, but a jump to 5 percent warrants immediate attention. Without this baseline context, every alert becomes equally urgent, leading to alert fatigue.
Implement error budgets as part of your SLO framework. An error budget defines the maximum amount of unreliability your service can tolerate over a measurement window (typically monthly or quarterly). When the error budget is exhausted, the team shifts focus from feature development to reliability improvements. This mechanism creates a structured trade-off between innovation velocity and operational stability.
Dependency Management and Service Health
Azure services depend on other Azure services internally, and your application adds additional dependency chains on top. When diagnosing Azure Blob upload failures due to size or timeout, map out the complete dependency tree including network dependencies (DNS, load balancers, firewalls), identity dependencies (Azure AD, managed identity endpoints), and data dependencies (storage accounts, databases, key vaults).
Check Azure Service Health for any ongoing incidents or planned maintenance affecting the services in your dependency tree. Azure Service Health provides personalized notifications specific to the services and regions you use. Subscribe to Service Health alerts so your team is notified proactively when Microsoft identifies an issue that might affect your workload.
For each critical dependency, implement a health check endpoint that verifies connectivity and basic functionality. Your application’s readiness probe should verify not just that the application process is running, but that it can successfully reach all of its dependencies. When a dependency health check fails, the application should stop accepting new requests and return a 503 status until the dependency recovers. This prevents requests from queuing up and timing out, which would waste resources and degrade the user experience.
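A minimal readiness-probe sketch along these lines for the storage dependency (the function name is ours; `get_account_information` is a lightweight authenticated call in the Python SDK):

```python
def storage_ready(conn_str):
    """True only if the storage dependency is reachable and accepts our credentials."""
    try:
        from azure.storage.blob import BlobServiceClient  # deferred import
        client = BlobServiceClient.from_connection_string(conn_str)
        client.get_account_information()  # lightweight authenticated round-trip
        return True
    except Exception:
        # Import failure, malformed connection string, network or auth errors
        # all mean the dependency is not ready; the probe should report 503.
        return False
```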
Monitoring Upload Performance
# Monitor storage account metrics
az monitor metrics list \
--resource /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Storage/storageAccounts/{account} \
--metric "BlobCount,BlobCapacity,Ingress,Egress,SuccessServerLatency,Transactions" \
--interval PT5M \
--start-time $(date -u -d "-1 hour" '+%Y-%m-%dT%H:%M:%SZ')
# Check for throttled transactions
az monitor metrics list \
--resource /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Storage/storageAccounts/{account} \
--metric "Transactions" \
--dimension "ResponseType" \
--filter "ResponseType eq 'ServerBusy' or ResponseType eq 'ServerTimeoutError'" \
--interval PT1H
Premium Block Blob Accounts
For high-transaction workloads, consider Premium Block Blob storage accounts, which offer:
- Consistent low-latency performance
- Higher transaction rates per partition
- SSD-backed storage
# Create premium block blob account
az storage account create \
--name mypremiumaccount \
--resource-group myRG \
--location eastus \
--kind BlockBlobStorage \
--sku Premium_LRS \
--min-tls-version TLS1_2
Prevention Best Practices
- Use AzCopy for large transfers — It handles parallelism, retry, and resume automatically
- Use block uploads for files over 256 MiB — Never attempt single-request uploads for large files
- Implement exponential backoff — All storage SDKs support configurable retry policies
- Distribute blob names — Use hash prefixes to avoid hot partitions during bulk uploads
- Monitor ingress metrics — Set alerts when approaching account throughput limits
- Set appropriate timeouts — Match client-side timeouts to expected upload duration
- Use managed identities instead of SAS tokens where possible to avoid expiration issues
- Choose the right blob type — Block blobs for general uploads, append blobs for logging, page blobs for random access
Post-Resolution Validation and Hardening
After applying the fix, perform a structured validation to confirm the issue is fully resolved. Do not rely solely on the absence of error messages. Actively verify that the service is functioning correctly by running health checks, executing test transactions, and monitoring key metrics for at least 30 minutes after the change.
Validate from multiple perspectives. Check the Azure resource health status, run your application’s integration tests, verify that dependent services are receiving data correctly, and confirm that end users can complete their workflows. A fix that resolves the immediate error but breaks a downstream integration is not a complete resolution.
Implement defensive monitoring to detect if the issue recurs. Create an Azure Monitor alert rule that triggers on the specific error condition you just fixed. Set the alert to fire within minutes of recurrence so you can respond before the issue impacts users. Include the remediation steps in the alert’s action group notification so that any on-call engineer can apply the fix quickly.
Finally, conduct a brief post-incident review. Document the root cause, the fix applied, the time to detect, diagnose, and resolve the issue, and any preventive measures that should be implemented. Share this documentation with the broader engineering team through a blameless post-mortem process. This transparency transforms individual incidents into organizational learning that raises the entire team’s operational capability.
Consider adding the error scenario to your integration test suite. Automated tests that verify the service behaves correctly under the conditions that triggered the original error provide a safety net against regression. If a future change inadvertently reintroduces the problem, the test will catch it before it reaches production.
Summary
Azure Blob Storage upload failures come down to three categories: size limit violations, account-level throttling, and network or authentication issues. For size limits, always use block upload patterns for files over 256 MiB. For throttling, implement exponential backoff retries and distribute blob naming to avoid hot partitions. For authentication failures, validate SAS token permissions and expiry. AzCopy remains the most robust tool for bulk uploads, handling all of these concerns automatically.
For more details, refer to the official documentation: What is Azure Blob storage?, Introduction to Azure Blob Storage, Scalability and performance targets for Blob storage.