This article helps you diagnose and fix common issues when building applications with the portable Durable Task SDKs. These SDKs connect to the Durable Task Scheduler backend and run on any hosting platform, including Azure Container Apps, Kubernetes, and VMs. For issues specific to the Durable Task Scheduler service, see Troubleshoot the Durable Task Scheduler. For Durable Functions issues, see Durable Functions troubleshooting guide.
Tip
The Durable Task Scheduler monitoring dashboard is useful for inspecting orchestration status, viewing execution history, and identifying failures. Use it alongside this guide to speed up troubleshooting.
Find your issue
| Error message or symptom | Section |
|---|---|
| `connection refused` or `failed to connect` at startup | Emulator isn't running or is unreachable |
| Connection string parse errors or authentication errors at startup | Connection string format is incorrect |
| Worker connects but orchestrations don't start | Task hub doesn't exist |
| `401 Unauthorized` or identity/role errors on Azure | Identity-based authentication failures on Azure |
| Orchestration stuck in "Pending" | Orchestration is stuck in the "Pending" state |
| Orchestration stuck in "Running" | Orchestration is stuck in the "Running" state |
| Replay failures, infinite loops, or unexpected behavior | Nondeterministic orchestrator code |
| Type mismatch or JSON serialization errors | Serialization and deserialization errors |
| `activity not found` | Activity not found |
| `RESOURCE_EXHAUSTED` or `message too large` | gRPC message size limit exceeded |
| `CANCELLED: Cancelled on client` during shutdown | Stream cancellation errors during shutdown |
| CS0419 / VSTHRD105 warnings break build | Source generator warnings break builds (C#) |
| `OrchestratorBlockedException` (Java) | OrchestratorBlockedException (Java) |
| Unhelpful error when using `retry_policy` (Python) | Retry policy requires max_retry_interval (Python) |
Connection and setup issues
Emulator isn't running or is unreachable
If your app fails at startup with a connection error like "connection refused" or "failed to connect," check that the Durable Task Scheduler emulator is running and accessible.
Check that the emulator Docker container is running:

```bash
docker ps | grep durabletask
```

Check the correct port mappings. The emulator exposes two ports:

- `8080`: gRPC endpoint (used by your app)
- `8082`: Dashboard UI

If you're using a custom port mapping, update your connection string to match the host port mapped to container port `8080`.

Test connectivity to the gRPC endpoint:

```bash
curl -v http://localhost:8080
```

A connection refusal indicates that the container isn't running or that the port mapping is incorrect.
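If `curl` isn't available, a plain TCP check works just as well. A minimal Python sketch (a hypothetical helper, not part of any SDK) that reports whether something is listening on the gRPC port:

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused or timed out: nothing is listening there.
        return False

# Example: check the emulator's gRPC endpoint (assumes the default mapping).
# port_is_open("localhost", 8080)
```

A `False` result here means the same thing as a `curl` connection refusal: the container isn't running or the port mapping doesn't match your connection string.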
Connection string format is incorrect
Connection string errors are a common cause of startup failures. Check that your connection string matches the expected format.
Local development (emulator):

```
Endpoint=http://localhost:8080;Authentication=None
```

Azure (managed identity):

```
Endpoint=https://<scheduler-name>.durabletask.io;Authentication=ManagedIdentity
```

Azure (user-assigned managed identity):

```
Endpoint=https://<scheduler-name>.durabletask.io;Authentication=ManagedIdentity;ClientID=<client-id>
```
Common mistakes:

- Using `https` for the local emulator (the emulator uses `http`)
- Using `http` for Azure endpoints (Azure requires `https`)
- Omitting the `Authentication` parameter
- Using the dashboard port (`8082`) instead of the gRPC port (`8080`)
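These mistakes can be caught before the app starts. A minimal Python sketch of a validator for the `Endpoint=...;Authentication=...` format (a hypothetical helper, not part of any SDK):

```python
def parse_connection_string(conn_str: str) -> dict:
    """Parse a 'Key=Value;Key=Value' connection string and flag common mistakes."""
    parts = {}
    for pair in filter(None, conn_str.split(";")):
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"Malformed segment: {pair!r}")
        parts[key.strip()] = value.strip()

    endpoint = parts.get("Endpoint", "")
    if "Authentication" not in parts:
        raise ValueError("Missing required 'Authentication' parameter")
    if endpoint.startswith("https://localhost"):
        raise ValueError("Local emulator endpoints use http, not https")
    if ".durabletask.io" in endpoint and endpoint.startswith("http://"):
        raise ValueError("Azure endpoints require https")
    if ":8082" in endpoint:
        raise ValueError("Port 8082 is the dashboard; the gRPC endpoint is 8080")
    return parts
```

For example, `parse_connection_string("Endpoint=http://localhost:8080;Authentication=None")` succeeds, while omitting `Authentication` raises a descriptive error instead of a cryptic startup failure.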
Client or worker fails to connect
If your client or worker fails to connect, verify the following:

- Connection string matches the expected format shown in Connection string format is incorrect.
- Task hub name is the same for both the client and worker.
- Endpoint URL uses `http` for the local emulator and `https` for Azure.
For full setup examples in each language, see Create an app with Durable Task SDKs.
Task hub doesn't exist
If your orchestrations fail to start or the worker connects but doesn't process work, the task hub might not exist on the scheduler. The emulator typically creates task hubs automatically using the `DTS_TASK_HUB_NAMES` environment variable.
Check that the emulator was started with the correct task hub name:

```bash
docker run -d -p 8080:8080 -p 8082:8082 \
  -e DTS_TASK_HUB_NAMES="my-taskhub" \
  mcr.microsoft.com/dts/dts-emulator:latest
```
For Azure-hosted schedulers, create the task hub using the Azure CLI:

```azurecli
az durabletask taskhub create \
  --resource-group <resource-group> \
  --scheduler-name <scheduler-name> \
  --name <taskhub-name>
```
Identity-based authentication failures on Azure
If your app runs locally but fails when deployed to Azure, the issue is likely related to authentication:
- Check that the managed identity is assigned to your app (system-assigned or user-assigned).
- Check that the identity has the Durable Task Data Contributor role on the scheduler resource or specific task hub.
- Make sure the connection string uses the correct `Authentication` value (`ManagedIdentity`). In Python, pass a `DefaultAzureCredential()` instance as the `token_credential` parameter instead of using a connection string.
- For user-assigned identities, check that the `ClientID` in the connection string matches the identity's client ID.
For detailed instructions, see Identity-based access for Durable Task Scheduler.
Orchestration issues
Orchestration is stuck in the "Pending" state
An orchestration in "Pending" status indicates it was scheduled but a worker hasn't picked it up. Check the following items:
- Worker is running. Ensure your worker process is running and connected to the same task hub where the orchestration was scheduled.
- Task hub name matches. Check that the worker and client both reference the same task hub name. A mismatch causes the worker to poll a different task hub.
- Orchestrator is registered. The orchestrator function or class referenced when scheduling must be registered with the worker.
Check that the orchestrator class is registered with the worker during startup. If you use source generators (the `[DurableTask]` attribute), the registration is automatic. Otherwise, register manually:
```csharp
builder.Services.AddDurableTaskWorker(builder =>
{
    builder.AddTasks(tasks =>
    {
        tasks.AddOrchestrator<MyOrchestrator>();
        tasks.AddActivity<MyActivity>();
    });
});
```
Orchestration is stuck in the "Running" state
An orchestration stuck in "Running" typically means it's waiting for a task that isn't complete. To diagnose, open the Durable Task Scheduler dashboard and inspect the orchestration's execution history. Look for the last completed event — the next event in the sequence is the one that's blocking.
Common causes:

- Activity not registered. The orchestration calls an activity name that isn't registered with the worker. The dashboard shows a `TaskScheduled` event with no corresponding `TaskCompleted`. Check that the activity name matches between your orchestrator code and worker registration (see Activity not found).
- Waiting on an external event. The orchestration calls `waitForExternalEvent` and the event isn't raised yet. The dashboard shows that an expected `EventRaised` event is missing. Verify the event name and that the sender is targeting the correct orchestration instance ID.
- Waiting on a durable timer. The orchestration creates a timer that hasn't expired yet. The dashboard shows a `TimerCreated` event. Wait for the timer to fire, or check whether the timer duration is longer than expected.
- Activity throws an unhandled exception. The dashboard shows a `TaskFailed` event. Check the failure details for the exception message and stack trace.
Nondeterministic orchestrator code
Orchestrator code must be deterministic. Nondeterministic code causes replay failures that result in unexpected behavior, infinite loops, or errors. Don't use current time, random numbers, GUIDs, or I/O (like HTTP calls) directly in orchestrator code. Use the context-provided alternatives or delegate to activities.
```csharp
// ❌ Wrong - nondeterministic
var now = DateTime.UtcNow;
var id = Guid.NewGuid();
var data = await httpClient.GetAsync("https://example.com/api");
```

```csharp
// ✅ Correct - deterministic
var now = context.CurrentUtcDateTime;
var id = context.NewGuid();
var data = await context.CallActivityAsync<string>("FetchData");
```
Serialization and deserialization errors
Serialization errors occur when the types used for orchestration inputs, outputs, or activity results don't match between caller and callee. These errors can appear as unexpected null values, JsonException, or type cast failures in your orchestration history.
How to diagnose:

- Open the Durable Task Scheduler dashboard and inspect the orchestration history. Look at the `Input` and `Result` fields for activities that failed.
- Verify the type expected by the orchestrator matches the type returned by the activity. For example, if the activity returns a `string` but the orchestrator expects an `int`, the deserialization fails.
- Check for non-serializable types. Custom types that can't be serialized to JSON (for example, types with circular references or no default constructor) fail silently or throw exceptions.
Known issue (Java): Passing a `String` directly to an activity can result in double-quoted strings (for example, `"\"hello\""` instead of `"hello"`). As a workaround, cast the result explicitly or use wrapper objects.
Tip
Use simple data types (strings, numbers, arrays, and plain objects or POJOs/POCOs/dataclasses) for orchestration and activity inputs and outputs. Avoid complex types with custom serialization logic.
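The mismatches described above can be reproduced outside any SDK. A minimal Python sketch using plain `json` (no Durable Task dependency) shows how a value serialized as a string deserializes as `str` rather than the `int` a caller might expect, and how double serialization produces the literal-quoted strings seen in the Java known issue:

```python
import json

# An activity computes 42 but serializes it as a string: the wire value is '"42"'.
payload = json.dumps("42")

# The caller gets back the str "42", not the int 42 it expected.
result = json.loads(payload)
assert isinstance(result, str)
assert result != 42

# Serializing an already-serialized string (the double-quoting issue above)
# leaves literal quote characters inside the decoded value.
double = json.loads(json.dumps(json.dumps("hello")))
assert double == '"hello"'   # not 'hello'
```

Keeping inputs and outputs to simple, symmetric types on both sides avoids this entire class of failure.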
Activity issues
Activity not found
If an orchestration fails with an "activity not found" error, the activity name registered with the worker doesn't match the name used in the orchestration code.
In .NET, activities can be registered by class name or by using the [DurableTask] attribute with source generators. Verify that the activity class is included in the worker registration:
```csharp
builder.Services.AddDurableTaskWorker(builder =>
{
    builder.AddTasks(tasks =>
    {
        tasks.AddActivity<SayHello>();
    });
});
```
When calling the activity from an orchestrator, use the class name:

```csharp
string result = await context.CallActivityAsync<string>(nameof(SayHello), "Tokyo");
```
Activity failure handling
When an activity throws an exception, the orchestrator receives a `TaskFailedException` (or the language equivalent). Catch this exception and inspect the inner error details to find the root cause. In C#, use `ex.FailureDetails` to access the error type and message, and `IsCausedBy<T>()` to check for specific exception types.
For detailed error handling and retry policy examples in each language, see Error handling and retries.
gRPC issues
gRPC message size limit exceeded
If you see a `RESOURCE_EXHAUSTED` or `message too large` error, an orchestration or activity input or output exceeds the gRPC default maximum message size of 4 MB.
Mitigations:
- Reduce the size of inputs and outputs. Store large payloads in external storage, like Azure Blob Storage, and pass only references.
- Break large fan-out results into smaller batches processed through sub-orchestrations.
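The first mitigation is the claim-check pattern: persist the large payload in durable storage and pass only a small reference through the orchestration. A minimal sketch, using an in-memory dict as a stand-in for real external storage such as Azure Blob Storage (the helper names are illustrative, not SDK APIs):

```python
import uuid

# Stand-in for external storage; use Azure Blob Storage or similar in practice.
blob_store: dict[str, bytes] = {}

def store_payload(data: bytes) -> str:
    """Upload the payload and return a small reference to pass over gRPC."""
    ref = str(uuid.uuid4())
    blob_store[ref] = data
    return ref

def load_payload(ref: str) -> bytes:
    """Resolve the reference back to the full payload inside the activity."""
    return blob_store[ref]

# The orchestration input stays tiny even when the payload is many megabytes.
large = b"x" * (10 * 1024 * 1024)   # 10 MB, well over the 4 MB gRPC default
ref = store_payload(large)
assert len(ref) < 100               # only this reference crosses the wire
assert load_payload(ref) == large
```

The orchestrator schedules work with `ref` as the input; each activity resolves the reference itself, so no gRPC message ever carries the full payload.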
Stream cancellation errors during shutdown
When stopping a worker, you might see `CANCELLED: Cancelled on client` errors. These errors are typically harmless and occur because the gRPC stream between the worker and the scheduler closes during shutdown. The .NET, Python, and Java SDKs handle these errors internally.

In JavaScript, the SDK might throw `Stream error Error: 1 CANCELLED: Cancelled on client` when calling `worker.stop()`. This error is a known issue. Wrap the stop call in a try-catch if the error affects your shutdown logic:
```javascript
try {
    await worker.stop();
} catch (error) {
    // Ignore stream cancellation errors during shutdown
    if (!error.message.includes("CANCELLED")) {
        throw error;
    }
}
```
Logging and diagnostics
Verbose logging configuration
Increase log verbosity to get more details about SDK operations, including gRPC communication and orchestration replay events.
In your `appsettings.json` or logging configuration file:

```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.DurableTask": "Debug"
    }
  }
}
```
Use replay-safe loggers to avoid duplicate log entries during orchestration replay:
```csharp
public override async Task<string> RunAsync(
    TaskOrchestrationContext context, string input)
{
    ILogger logger = context.CreateReplaySafeLogger<MyOrchestrator>();
    logger.LogInformation("Processing input: {Input}", input);
    // ...
}
```
Application Insights integration
For production applications, configure Application Insights to collect telemetry from your Durable Task SDK application. The integration approach depends on your hosting platform:
| Hosting platform | Setup instructions |
|---|---|
| Azure Container Apps | Monitor logs in Azure Container Apps with Log Analytics |
| Azure App Service | Enable diagnostic logging for apps in Azure App Service |
| Azure Kubernetes Service | Monitor Azure Kubernetes Service |
For more information about diagnostics, see Diagnostics in Durable Task SDKs.
Language-specific issues
C#
Source generator warnings break builds
If you use `<TreatWarningsAsErrors>true</TreatWarningsAsErrors>` in your project, the Durable Task source generators might produce warnings (CS0419, VSTHRD105) that break your build. Suppress these specific warnings:

```xml
<PropertyGroup>
  <NoWarn>$(NoWarn);CS0419;VSTHRD105</NoWarn>
</PropertyGroup>
```
This known issue is being tracked on GitHub and is addressed in an upcoming release.
Roslyn analyzer throws in foreach loops
The Durable Task Roslyn analyzer might throw an ArgumentNullException when orchestrator lambda code is inside a foreach loop. This behavior is a known issue that doesn't affect runtime behavior. Update to the latest analyzer package version to get the fix.
Java
Gradle permission denied error
On macOS or Linux, running ./gradlew might fail with a "permission denied" error. Fix this error by making the file executable:
```bash
chmod +x gradlew
```
OrchestratorBlockedException
The `OrchestratorBlockedException` occurs when orchestrator code performs a blocking operation that the SDK detects as potentially nondeterministic. This exception is a safeguard that prevents orchestrator code from violating determinism constraints.
Common causes:
- Calling a blocking external API in orchestrator code.
- Using `Thread.sleep()` directly instead of `ctx.createTimer()`.
- Performing file or network I/O in orchestrator code.
Move all blocking or I/O operations into activities.
Python
Retry policy requires max_retry_interval
When you configure a `retry_policy` in Python, omitting the `max_retry_interval` parameter produces an error that doesn't clearly indicate the cause. Always specify `max_retry_interval`:
```python
from datetime import timedelta
from durabletask import task

retry_policy = task.RetryPolicy(
    max_number_of_attempts=3,
    first_retry_interval=timedelta(seconds=5),
    max_retry_interval=timedelta(minutes=1),  # Required
)
```
WhenAllTask exception behavior
When you use `when_all` to run multiple tasks in parallel, if one or more tasks fail, the exception behavior might not match expectations. Only the first exception is raised, and the remaining task exceptions might be lost. Inspect individual task results if you need complete error information:

```python
tasks = [ctx.call_activity(process_item, input=item) for item in items]
try:
    results = yield task.when_all(tasks)
except TaskFailedError as e:
    # Only the first failure is raised
    # Check individual tasks for comprehensive error handling
    print(f"At least one task failed: {e}")
```
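This first-failure behavior is easy to see with plain `asyncio`, which behaves analogously: `asyncio.gather` raises only the first exception unless you collect every result with `return_exceptions=True`. A sketch of that collection pattern, independent of the Durable Task SDK:

```python
import asyncio

async def work(i: int) -> int:
    # Odd-numbered tasks fail to simulate partial fan-out failure.
    if i % 2:
        raise ValueError(f"task {i} failed")
    return i

async def main() -> list:
    tasks = [work(i) for i in range(4)]
    # return_exceptions=True keeps every outcome, failures included,
    # so no error information is lost to first-exception semantics.
    return await asyncio.gather(*tasks, return_exceptions=True)

results = asyncio.run(main())
failures = [r for r in results if isinstance(r, Exception)]
assert len(failures) == 2            # tasks 1 and 3 failed
assert results[0] == 0 and results[2] == 2
```

The same idea applies to `when_all`: hold references to the individual tasks and examine each one's outcome rather than relying on the single raised exception.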
Get support
For questions and reporting bugs, open an issue in the GitHub repo for the relevant SDK. When you report a bug, include:
- Affected orchestration instance IDs
- Time range in UTC that shows the problem
- Application name and deployment region (if relevant)
- SDK version and hosting platform
- Relevant logs or error messages
| SDK | GitHub repository |
|---|---|
| .NET | microsoft/durabletask-dotnet |
| Java | microsoft/durabletask-java |
| JavaScript | microsoft/durabletask-js |
| Python | microsoft/durabletask-python |