How to fix Azure Data Factory linked service authentication issues

Understanding Azure Data Factory Linked Service Authentication

Azure Data Factory (ADF) connects to dozens of data sources — SQL databases, Blob Storage, Cosmos DB, Databricks, HDInsight, REST APIs, and more — through linked services. Each linked service defines the connection information and authentication method for a data store or compute resource. Authentication failures are among the most common classes of errors in Data Factory, and they manifest differently depending on the connector type, authentication method, and network configuration.

This guide covers authentication failures across all major Data Factory linked service types, with exact error codes, diagnostic steps, and tested fixes.

Diagnostic Context

When encountering Azure Data Factory linked service authentication failures, the first step is understanding what changed. In most production environments, errors do not appear spontaneously: they are triggered by a change in configuration, code, traffic patterns, or the platform itself. Review your deployment history, recent configuration changes, and Azure Service Health notifications to identify potential triggers.

Azure maintains detailed activity logs for every resource operation. These logs capture who made a change, what was changed, when it happened, and from which IP address. Cross-reference the timeline of your error reports with the activity log entries to establish a causal relationship. Often, the fix is simply reverting the most recent change that correlates with the error onset.
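Once the activity log is exported, the cross-referencing step can be scripted. A minimal sketch: the helper name and event format below are illustrative, not an Azure SDK API — the timestamps would come from your parsed activity log export.

```python
from datetime import datetime, timedelta

def changes_near_onset(change_events, error_onset, window_hours=24):
    """Return change events that occurred in the window before errors began.

    change_events: list of (timestamp, description) tuples, e.g. parsed from
    an activity log export. Timestamps are datetime objects.
    """
    window_start = error_onset - timedelta(hours=window_hours)
    return [
        (ts, desc) for ts, desc in change_events
        if window_start <= ts <= error_onset
    ]

# Example: errors began at 14:00; a linked service was edited an hour earlier.
onset = datetime(2024, 6, 1, 14, 0)
events = [
    (datetime(2024, 5, 28, 9, 0), "Scaled up SQL tier"),
    (datetime(2024, 6, 1, 13, 0), "Updated linked service connection string"),
]
suspects = changes_near_onset(events, onset)
print(suspects)  # the 13:00 linked-service edit is the prime suspect
```

Narrowing the candidate list this way turns "what changed?" from a guessing game into a short review of one or two log entries.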

If no recent changes are apparent, consider external factors. Azure platform updates, regional capacity changes, and dependent service modifications can all affect your resources. Check the Azure Status page and your subscription’s Service Health blade for any ongoing incidents or planned maintenance that coincides with your issue timeline.

Common Pitfalls to Avoid

When fixing Azure service errors under pressure, engineers sometimes make the situation worse by applying changes too broadly or too quickly. Here are critical pitfalls to avoid during your remediation process.

First, avoid making multiple changes simultaneously. If you change the firewall rules, the connection string, and the service tier all at once, you cannot determine which change actually resolved the issue. Apply one change at a time, verify the result, and document what worked. This disciplined approach builds reliable operational knowledge for your team.

Second, do not disable security controls to bypass errors. Opening all firewall rules, granting overly broad RBAC permissions, or disabling SSL enforcement might eliminate the error message, but it creates security vulnerabilities that are far more dangerous than the original issue. Always find the targeted fix that resolves the error while maintaining your security posture.

Third, test your fix in a non-production environment first when possible. Azure resource configurations can be exported as ARM or Bicep templates and deployed to a test resource group for validation. This extra step takes minutes but can prevent a failed fix from escalating the production incident.

Fourth, document the error message exactly as it appears, including correlation IDs, timestamps, and request IDs. If you need to open a support case with Microsoft, this information dramatically speeds up the investigation. Azure support engineers can use correlation IDs to trace the exact request through Microsoft’s internal logging systems.

How Linked Service Authentication Works

When a Data Factory pipeline runs an activity that accesses a data store, the flow is:

  1. The pipeline activity references a linked service
  2. The linked service specifies the connection string or endpoint and the authentication type
  3. The integration runtime (Azure IR or Self-Hosted IR) executes the connection using the linked service credentials
  4. The data store validates the credentials and either accepts or rejects the connection

Authentication can fail at any step: invalid credentials, expired tokens, wrong tenant configuration, network-level blocks, or integration runtime issues.

Common Error Codes Reference

Error Code | Message | Root Cause
2010 | Self-hosted IR is offline | Integration runtime node is down
2103 | Missing required property | Configuration field left blank
2104 | Property type is incorrect | Wrong data type in configuration
2105 | Invalid JSON for property | Malformed JSON in linked service definition
2106 | Storage connection string is invalid | Malformed or wrong connection string
2110 | Linked service type not supported | Wrong linked service type for the activity
2709 | Access token from wrong tenant | Multi-tenant Entra ID mismatch
3200 | Error 403 on Databricks | Expired personal access token
4121 | Credential expired for Azure ML | Service principal secret expired
4122 | No permission for operation | Insufficient RBAC or data store permissions

Storage Account Authentication Failures

Error 2106: Invalid Connection String

This is the most common linked service error. The connection string format is wrong, or the key has been rotated.

Diagnosis

# Verify the storage account exists and get connection string
az storage account show-connection-string \
  --name mystorageaccount \
  --resource-group myRG \
  --output tsv
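A malformed string can also be caught locally before it ever reaches a linked service. This is a rough sanity check, not an Azure API call — it verifies the format behind error 2106 but cannot tell you whether the key is current:

```python
def validate_storage_connection_string(conn_str):
    """Rough local sanity check for an Azure Storage connection string.

    Catches the malformed-string cases behind error 2106 before the value
    is pasted into a linked service. Does NOT verify the key is valid.
    """
    parts = dict(
        kv.split("=", 1) for kv in conn_str.split(";") if "=" in kv
    )
    problems = []
    for required in ("DefaultEndpointsProtocol", "AccountName", "AccountKey"):
        if not parts.get(required):
            problems.append(f"missing {required}")
    if parts.get("DefaultEndpointsProtocol") not in (None, "https"):
        problems.append("protocol should be https")
    return problems

good = ("DefaultEndpointsProtocol=https;AccountName=mystorageaccount;"
        "AccountKey=abc123==;EndpointSuffix=core.windows.net")
print(validate_storage_connection_string(good))             # []
print(validate_storage_connection_string("AccountName=x"))  # two problems
```

Note the `split("=", 1)`: account keys are base64 and end in `=`, so splitting on the first `=` only is essential.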

Fix

  1. Navigate to Azure Portal > Storage Account > Access Keys
  2. Copy the full connection string (key1 or key2)
  3. In Data Factory Studio > Manage > Linked Services > Select the linked service
  4. Paste the new connection string and click “Test connection”

Recommended: Use Managed Identity

// Linked service definition using managed identity
{
  "name": "AzureBlobStorageLinkedService",
  "type": "Microsoft.DataFactory/factories/linkedservices",
  "properties": {
    "type": "AzureBlobStorage",
    "typeProperties": {
      "serviceEndpoint": "https://mystorageaccount.blob.core.windows.net",
      "accountKind": "StorageV2"
    },
    "connectVia": {
      "referenceName": "AutoResolveIntegrationRuntime",
      "type": "IntegrationRuntimeReference"
    }
  }
}
# Grant Data Factory managed identity access to storage
ADF_IDENTITY=$(az datafactory show \
  --factory-name myDataFactory \
  --resource-group myRG \
  --query identity.principalId -o tsv)

az role assignment create \
  --assignee-object-id $ADF_IDENTITY \
  --role "Storage Blob Data Contributor" \
  --scope /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Storage/storageAccounts/mystorageaccount

SQL Database Authentication Failures

Common Errors

  • Login failed for user — Wrong username/password or user doesn’t exist
  • Cannot open server — Firewall blocking, server doesn’t exist
  • The server was not found or was not accessible — DNS resolution failure or network issue

Fix: SQL Authentication

# Verify SQL server exists and check firewall
az sql server show \
  --name mysqlserver \
  --resource-group myRG \
  --query fullyQualifiedDomainName -o tsv

# Allow Azure services through the SQL firewall. The special
# 0.0.0.0-0.0.0.0 range means "Allow Azure services and resources to
# access this server" -- it is not a literal IP. For the Azure IR this
# is the simplest option; for tighter control, add the Azure IR region's
# published IP ranges instead.
az sql server firewall-rule create \
  --server mysqlserver \
  --resource-group myRG \
  --name AllowAzureServices \
  --start-ip-address 0.0.0.0 \
  --end-ip-address 0.0.0.0

Fix: Managed Identity for SQL

-- Run on the target SQL database
CREATE USER [myDataFactory] FROM EXTERNAL PROVIDER;
ALTER ROLE db_datareader ADD MEMBER [myDataFactory];
ALTER ROLE db_datawriter ADD MEMBER [myDataFactory];
# Set Entra admin on SQL server (required for managed identity)
az sql server ad-admin create \
  --server mysqlserver \
  --resource-group myRG \
  --display-name "ADF Admin" \
  --object-id $(az ad user show --id admin@contoso.com --query id -o tsv)

Databricks Authentication Failures

Error 3200: HTTP 403 on Databricks Cluster

Databricks personal access tokens (PATs) expire after a configurable period (default 90 days). When the token expires, all ADF activities using that linked service fail.
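If you record the creation time and lifetime when each token is issued, impending expiry is easy to catch before pipelines break. A hedged sketch — nothing here calls the Databricks API; the inputs are values you store yourself at token-creation time:

```python
from datetime import datetime, timedelta

def token_days_remaining(created_at, lifetime_seconds, now=None):
    """Days until a PAT created at `created_at` expires."""
    now = now or datetime.utcnow()
    expires_at = created_at + timedelta(seconds=lifetime_seconds)
    return (expires_at - now).days

# A token created 80 days ago with a 90-day (7776000 s) lifetime:
now = datetime(2024, 6, 1)
created = now - timedelta(days=80)
remaining = token_days_remaining(created, 7776000, now=now)
print(remaining)  # 10
if remaining < 14:
    print("Rotate the PAT and update the linked service (or Key Vault secret)")
```

Wiring this into a scheduled job that alerts at 14 days gives you a rotation window instead of a 3 AM outage.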

Fix

  1. Open Databricks workspace
  2. Navigate to Settings > Developer > Access tokens
  3. Generate a new token with appropriate expiration
  4. Update the ADF linked service with the new token
# Generate new Databricks token via CLI
databricks tokens create --lifetime-seconds 7776000 --comment "ADF linked service"

# Update ADF linked service via REST API (the linked services
# createOrUpdate operation uses PUT with the full definition in the body)
az rest --method PUT \
  --url "https://management.azure.com/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.DataFactory/factories/{factory}/linkedservices/DatabricksLinkedService?api-version=2018-06-01" \
  --body '{
    "properties": {
      "type": "AzureDatabricks",
      "typeProperties": {
        "domain": "https://adb-xxxx.azuredatabricks.net",
        "accessToken": {
          "type": "SecureString",
          "value": "NEW_TOKEN_HERE"
        },
        "existingClusterId": "xxxx-xxxxxx-xxxxxxxx"
      }
    }
  }'

Recommended: Use Azure Key Vault for Token Storage

// Linked service referencing Key Vault secret
{
  "name": "DatabricksLinkedService",
  "properties": {
    "type": "AzureDatabricks",
    "typeProperties": {
      "domain": "https://adb-xxxx.azuredatabricks.net",
      "accessToken": {
        "type": "AzureKeyVaultSecret",
        "store": {
          "referenceName": "AzureKeyVaultLinkedService",
          "type": "LinkedServiceReference"
        },
        "secretName": "databricks-pat"
      },
      "existingClusterId": "xxxx-xxxxxx-xxxxxxxx"
    }
  }
}

Self-Hosted Integration Runtime Issues

Error 2010: Self-Hosted IR is Offline

The Self-Hosted Integration Runtime (SHIR) is a Windows agent that runs on your own infrastructure to access on-premises or private network data sources. If it goes offline, all activities using that IR fail.

Diagnosis

# Check IR status
az datafactory integration-runtime show \
  --factory-name myDataFactory \
  --resource-group myRG \
  --name mySelfHostedIR \
  --query "properties.state" -o tsv

# Retrieve the IR authentication keys (used to register or re-register a node)
az datafactory integration-runtime list-auth-key \
  --factory-name myDataFactory \
  --resource-group myRG \
  --name mySelfHostedIR

Fix

# On the SHIR machine: Check service status
Get-Service -Name DIAHostService | Select-Object Status

# Restart the service
Restart-Service -Name DIAHostService -Force

# Check connectivity from SHIR to Azure. Wildcard domains do not resolve,
# so test concrete hostnames: substitute the relay namespace and region
# shown in the SHIR configuration manager's diagnostics tab.
Test-NetConnection -ComputerName login.microsoftonline.com -Port 443
Test-NetConnection -ComputerName <your-namespace>.servicebus.windows.net -Port 443
Test-NetConnection -ComputerName <your-region>.frontend.clouddatahub.net -Port 443

Azure ML Linked Service Failures

Error 4121: Credential Expired

# Check service principal credential status
az ad app credential list \
  --id YOUR_APP_ID \
  --query "[].{endDate:endDateTime, keyId:keyId}" -o table

# Reset credentials
az ad app credential reset \
  --id YOUR_APP_ID \
  --years 2
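The `endDateTime` values returned by the credential list command can be screened in a few lines rather than eyeballed. A sketch assuming you have parsed the CLI's JSON output into the dict shape shown below:

```python
from datetime import datetime, timedelta

def expiring_credentials(credentials, now, within_days=30):
    """Return (keyId, end) pairs that are expired or expire soon.

    credentials: list of dicts like
    {"keyId": "...", "endDateTime": "2024-07-01T00:00:00"}, mirroring the
    fields selected by the `az ad app credential list` query above.
    """
    cutoff = now + timedelta(days=within_days)
    flagged = []
    for cred in credentials:
        end = datetime.fromisoformat(cred["endDateTime"])
        if end <= cutoff:
            flagged.append((cred["keyId"], end))
    return flagged

now = datetime(2024, 6, 1)
creds = [
    {"keyId": "aaa", "endDateTime": "2024-06-10T00:00:00"},  # 9 days out
    {"keyId": "bbb", "endDateTime": "2025-06-01T00:00:00"},  # fine
]
print(expiring_credentials(creds, now))  # flags only "aaa"
```

Run against every service principal your linked services use, this converts error 4121 from a recurring incident into a routine rotation task.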

Error 4122: No Permission for Operation

# Grant the service principal access to ML workspace
az role assignment create \
  --assignee YOUR_APP_ID \
  --role "AzureML Data Scientist" \
  --scope /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.MachineLearningServices/workspaces/{workspace}

Root Cause Analysis Framework

After applying the immediate fix, invest time in a structured root cause analysis. The Five Whys technique is a simple but effective method: start with the error symptom and ask “why” five times to drill down from the surface-level cause to the fundamental issue.

For example, consider an Azure Data Factory linked service authentication failure: Why did the service fail? Because the connection timed out. Why did the connection time out? Because the DNS lookup returned a stale record. Why was the DNS record stale? Because the TTL was set to 24 hours during a migration and never reduced. Why was it not reduced? Because there was no checklist for post-migration cleanup. Why was there no checklist? Because the migration process was ad hoc rather than documented.

This analysis reveals that the root cause is not a technical configuration issue but a process gap that allowed undocumented changes. The preventive action is creating a migration checklist and review process, not just fixing the DNS TTL. Without this depth of analysis, the team will continue to encounter similar issues from different undocumented changes.

Categorize your root causes into buckets: configuration errors, capacity limits, code defects, external dependencies, and process gaps. Track the distribution over time. If most of your incidents fall into the configuration error bucket, invest in infrastructure-as-code validation and policy enforcement. If they fall into capacity limits, improve your monitoring and forecasting. This data-driven approach focuses your improvement efforts where they will have the most impact.

Azure Functions Linked Service

Error 3606: Missing Function Key

# Get the function key
az functionapp keys list \
  --name myFunctionApp \
  --resource-group myRG \
  --query "functionKeys.default" -o tsv

Error 3607: Missing Function Name

Ensure the activity configuration specifies the exact function name that matches the deployed function in the Function App.

Error 3602: Invalid HTTP Method

Verify the activity uses a method (GET, POST, etc.) that the Azure Function accepts. Check the function’s authLevel and methods in function.json.
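For reference, both fields live in the function's function.json binding configuration. A trimmed example — the binding names are illustrative:

```json
{
  "bindings": [
    {
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "authLevel": "function",
      "methods": ["get", "post"]
    },
    { "type": "http", "direction": "out", "name": "res" }
  ]
}
```

If the `methods` array omits the verb the ADF activity uses, the function returns an error before your code ever runs.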

HDInsight Authentication Issues

Error 2304: MSI Not Supported on Storage for HDI

HDInsight activities in Data Factory do not support managed identity authentication for storage accounts. You must use account keys or shared access signatures.

// HDInsight linked service with storage credentials
{
  "name": "HDInsightLinkedService",
  "properties": {
    "type": "HDInsight",
    "typeProperties": {
      "clusterUri": "https://mycluster.azurehdinsight.net",
      "userName": "admin",
      "password": {
        "type": "AzureKeyVaultSecret",
        "store": {
          "referenceName": "AzureKeyVaultLinkedService",
          "type": "LinkedServiceReference"
        },
        "secretName": "hdi-password"
      },
      "linkedServiceName": {
        "referenceName": "StorageLinkedServiceWithKey",
        "type": "LinkedServiceReference"
      }
    }
  }
}

For VNet-deployed HDInsight clusters, use the internal URL with the -int suffix:

https://mycluster-int.azurehdinsight.net/

Wrong Tenant Issues

Error 2709: Access Token from Wrong Tenant

This occurs in multi-tenant scenarios where the linked service authenticates against the wrong Microsoft Entra ID tenant.

# Verify the correct tenant ID
az account show --query tenantId -o tsv

# List all tenants
az account tenant list --query "[].{tenantId:tenantId, displayName:displayName}" -o table

Update the linked service to specify the correct tenant ID in the authentication configuration.

Testing Linked Service Connections

# Test all linked services programmatically
FACTORY="myDataFactory"
RG="myRG"

# List all linked services
az datafactory linked-service list \
  --factory-name $FACTORY \
  --resource-group $RG \
  --query "[].name" -o tsv

# The Test Connection feature is only available via Portal or SDK
# Portal: Manage > Linked Services > Select > Test connection

Error Classification and Severity Assessment

Not all errors require the same response urgency. Classify errors into severity levels based on their impact on users and business operations. A severity 1 error causes complete service unavailability for all users. A severity 2 error degrades functionality for a subset of users. A severity 3 error causes intermittent issues that affect individual operations. A severity 4 error is a cosmetic or minor issue with a known workaround.

For Azure Data Factory linked service authentication failures, map the specific error codes and messages to these severity levels. Create a classification matrix that your on-call team can reference when triaging incoming alerts. This prevents over-escalation of minor issues and under-escalation of critical ones. Include the expected resolution time for each severity level and the communication protocol (who to notify, how frequently to update stakeholders).
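Such a matrix can live in code next to your alert-routing logic. The severity assignments below are hypothetical — adjust them to your own business impact:

```python
# Hypothetical mapping from the error codes in this guide to severity levels.
SEVERITY = {
    "2010": 1,  # SHIR offline: every pipeline on that IR fails
    "2106": 2,  # bad connection string: one data store unreachable
    "2709": 2,  # wrong tenant: affected linked services fail
    "3200": 2,  # expired Databricks PAT
    "4121": 3,  # expired ML credential, scoped to one integration
    "4122": 3,  # missing permission on one operation
}

def triage(error_code):
    """Return the severity for an error code, defaulting to 3 for unknowns."""
    return SEVERITY.get(error_code, 3)

print(triage("2010"))  # 1 -> page the on-call engineer
print(triage("9999"))  # 3 -> ticket during business hours
```

Defaulting unknown codes to a middle severity (rather than the highest) is a judgment call; some teams prefer to page on anything unclassified.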

Track your error rates over time using Azure Monitor metrics and Log Analytics queries. Establish baseline error rates for healthy operation so you can distinguish between normal background error levels and genuine incidents. A service that normally experiences 0.1 percent error rate might not need investigation when errors spike to 0.2 percent, but a jump to 5 percent warrants immediate attention. Without this baseline context, every alert becomes equally urgent, leading to alert fatigue.
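The baseline comparison itself is a one-liner once you have the rates. A sketch with illustrative thresholds (a spike must be both a large multiple of baseline and above an absolute floor before it counts as an incident):

```python
def is_incident(current_rate, baseline_rate, multiplier=10, floor=0.01):
    """Flag an error-rate spike relative to baseline, not in absolute terms.

    multiplier and floor are illustrative: alert only when the rate is both
    `multiplier`x the baseline and above an absolute floor (here 1 percent).
    """
    return current_rate >= baseline_rate * multiplier and current_rate >= floor

baseline = 0.001                      # 0.1 percent background error rate
print(is_incident(0.002, baseline))   # False: within normal variation
print(is_incident(0.05, baseline))    # True: 5 percent, 50x baseline
```

In practice these thresholds would be tuned per pipeline; the point is that the alert condition encodes the baseline rather than a raw count.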

Implement error budgets as part of your SLO framework. An error budget defines the maximum amount of unreliability your service can tolerate over a measurement window (typically monthly or quarterly). When the error budget is exhausted, the team shifts focus from feature development to reliability improvements. This mechanism creates a structured trade-off between innovation velocity and operational stability.
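The budget arithmetic is simple enough to keep alongside your SLO definition. A sketch (the 99.9 percent target and request counts are examples, not values from this guide):

```python
def error_budget_remaining(slo, total_requests, failed_requests):
    """Fraction of the error budget left for the measurement window.

    slo: availability target, e.g. 0.999 allows 0.1 percent of requests
    to fail. Returns a value <= 1; negative means the budget is blown.
    """
    allowed_failures = (1 - slo) * total_requests
    return 1 - failed_requests / allowed_failures

# A 99.9 percent SLO over 1,000,000 requests allows 1,000 failures;
# 250 failures so far leaves roughly three quarters of the budget.
print(error_budget_remaining(0.999, 1_000_000, 250))  # ~0.75
```

When the returned fraction approaches zero, the policy above kicks in: reliability work takes priority over new features until the window resets.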

Dependency Management and Service Health

Azure services depend on other Azure services internally, and your application adds additional dependency chains on top. When diagnosing Azure Data Factory linked service authentication failures, map out the complete dependency tree including network dependencies (DNS, load balancers, firewalls), identity dependencies (Microsoft Entra ID, managed identity endpoints), and data dependencies (storage accounts, databases, key vaults).

Check Azure Service Health for any ongoing incidents or planned maintenance affecting the services in your dependency tree. Azure Service Health provides personalized notifications specific to the services and regions you use. Subscribe to Service Health alerts so your team is notified proactively when Microsoft identifies an issue that might affect your workload.

For each critical dependency, implement a health check endpoint that verifies connectivity and basic functionality. Your application’s readiness probe should verify not just that the application process is running, but that it can successfully reach all of its dependencies. When a dependency health check fails, the application should stop accepting new requests and return a 503 status until the dependency recovers. This prevents requests from queuing up and timing out, which would waste resources and degrade the user experience.
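The aggregation logic for such a readiness probe fits in a few lines. This is a framework-agnostic sketch; the check callables are stand-ins for real connectivity tests (opening a SQL connection, reading a Key Vault secret, and so on):

```python
def readiness_status(dependency_checks):
    """Aggregate dependency health checks into an HTTP status.

    dependency_checks: dict of name -> zero-argument callable returning
    True when the dependency is reachable. Returns (status_code,
    failed_names): 200 when everything passes, 503 if any dependency
    is down, so the load balancer stops routing new requests here.
    """
    failed = [name for name, check in dependency_checks.items() if not check()]
    return (200, []) if not failed else (503, failed)

# Illustrative checks; real ones would open a connection or call a ping API.
checks = {
    "key_vault": lambda: True,
    "sql": lambda: False,   # simulate the database being unreachable
}
print(readiness_status(checks))  # (503, ['sql'])
```

Returning the list of failed dependency names alongside the status code makes the probe output directly useful in incident triage.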

Debugging Pipeline Failures

# View recent pipeline runs
az datafactory pipeline-run query-by-factory \
  --factory-name myDataFactory \
  --resource-group myRG \
  --last-updated-after "2024-01-01T00:00:00Z" \
  --last-updated-before "2024-12-31T00:00:00Z" \
  --filters "[{\"operand\":\"Status\",\"operator\":\"Equals\",\"values\":[\"Failed\"]}]"

# Get activity run details for a specific pipeline run
az datafactory activity-run query-by-pipeline-run \
  --factory-name myDataFactory \
  --resource-group myRG \
  --run-id "xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" \
  --last-updated-after "2024-01-01T00:00:00Z" \
  --last-updated-before "2024-12-31T00:00:00Z"

Key Vault Integration for Secrets

Instead of storing credentials directly in linked services, use Azure Key Vault references:

# Grant the Data Factory managed identity access to Key Vault secrets
# (prerequisite for a Key Vault linked service)
az keyvault set-policy \
  --name myKeyVault \
  --resource-group myRG \
  --object-id $(az datafactory show --factory-name myDataFactory --resource-group myRG --query identity.principalId -o tsv) \
  --secret-permissions get list

# Store connection string in Key Vault
az keyvault secret set \
  --vault-name myKeyVault \
  --name storage-connection-string \
  --value "DefaultEndpointsProtocol=https;AccountName=..."

Prevention Best Practices

  • Use Managed Identity wherever possible — Eliminates credential management and expiration issues
  • Store secrets in Azure Key Vault — Never hardcode credentials in linked service definitions
  • Monitor integration runtime health — Set alerts on Self-Hosted IR offline events
  • Track token and secret expiration — Use Key Vault expiration notifications for PATs and secrets
  • Keep SHIR updated — Run the latest version of the Self-Hosted Integration Runtime
  • Test connections after key rotation — Always click “Test connection” after updating credentials
  • Use parameterized linked services — Reduce the number of linked services to maintain by parameterizing connection strings
  • Payload limit awareness — Activity run payload limit is 896 KB; design accordingly
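On the last point, the limit applies to the serialized activity payload (parameters, expressions, and values passed between activities). A quick pre-flight check, with the 896 KB figure hard-coded as stated in this guide:

```python
import json

PAYLOAD_LIMIT_BYTES = 896 * 1024  # limit cited in this guide

def payload_fits(activity_payload):
    """Check a would-be activity payload against the 896 KB limit.

    Returns (fits, size_in_bytes) for the JSON-serialized payload.
    """
    size = len(json.dumps(activity_payload).encode("utf-8"))
    return size <= PAYLOAD_LIMIT_BYTES, size

ok, size = payload_fits({"rows": ["x" * 100] * 100})
print(ok, size)  # True, well under the limit
```

If a payload fails this check, the usual fix is to pass a reference (a blob path or table name) between activities instead of the data itself.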

Post-Resolution Validation and Hardening

After applying the fix, perform a structured validation to confirm the issue is fully resolved. Do not rely solely on the absence of error messages. Actively verify that the service is functioning correctly by running health checks, executing test transactions, and monitoring key metrics for at least 30 minutes after the change.

Validate from multiple perspectives. Check the Azure resource health status, run your application’s integration tests, verify that dependent services are receiving data correctly, and confirm that end users can complete their workflows. A fix that resolves the immediate error but breaks a downstream integration is not a complete resolution.

Implement defensive monitoring to detect if the issue recurs. Create an Azure Monitor alert rule that triggers on the specific error condition you just fixed. Set the alert to fire within minutes of recurrence so you can respond before the issue impacts users. Include the remediation steps in the alert’s action group notification so that any on-call engineer can apply the fix quickly.

Finally, conduct a brief post-incident review. Document the root cause, the fix applied, the time to detect, diagnose, and resolve the issue, and any preventive measures that should be implemented. Share this documentation with the broader engineering team through a blameless post-mortem process. This transparency transforms individual incidents into organizational learning that raises the entire team’s operational capability.

Consider adding the error scenario to your integration test suite. Automated tests that verify the service behaves correctly under the conditions that triggered the original error provide a safety net against regression. If a future change inadvertently reintroduces the problem, the test will catch it before it reaches production.

Summary

Data Factory linked service authentication failures stem from expired credentials, wrong connection strings, misconfigured tenant IDs, offline integration runtimes, or insufficient permissions. The error codes in this guide map directly to specific root causes, making diagnosis straightforward. For long-term reliability, migrate all linked services to managed identity authentication where supported, store remaining secrets in Azure Key Vault, and monitor integration runtime health proactively. Testing connections after every credential change prevents pipeline failures from reaching production.

For more details, refer to the official "What is Azure Data Factory?" documentation.
