Exponential retry with Azure Function and Service Bus trigger
Implementing a custom exponential backoff retry strategy
The Problem
When you need an Azure Function to work with Service Bus, it is easy to get confused by how the retry mechanism works in this setup. You often need a retry strategy for your function, be it a simple fixed delay retry or an exponential backoff.
At the moment of writing, the built-in functions retry feature is being deprecated.
How retry works in Service Bus
A very important detail to mention is that the Service Bus trigger retries independently of function app retries. The function retry policy is a layer on top of the trigger retry, so any retry policy you define inside a function app sits on top of the retry behaviour configured on the Service Bus itself. As an example, suppose you have the default max delivery count of 10 on your Service Bus queue.
This means that after 10 attempts to deliver a message, Service Bus will dead-letter it. Now, if you define a function retry policy of 3, the message is first dequeued, incrementing the Service Bus delivery count to 1. When all 3 attempts to process that message fail, the message is abandoned. Service Bus immediately re-queues it, which triggers the function again and increments the delivery count to 2. This results in 30 attempts in total (10 Service Bus deliveries * 3 function retries per delivery), after which the message is finally moved to the dead-letter queue on the Service Bus.
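For reference, this is roughly what such a function-level retry policy looked like with the declarative retry attributes from Microsoft.Azure.WebJobs; a minimal sketch of the feature being deprecated, with placeholder trigger names:

[FunctionName("RetryExample")]
[FixedDelayRetry(3, "00:00:10")] // 3 function-level retries per delivery, 10 seconds apart
public void Run(
    [ServiceBusTrigger("%QueueName%", Connection = "ServiceBusConnectionString")] string message,
    ILogger log)
{
    log.LogInformation("Processing: {Message}", message);
}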
Exponential backoff in the function app
As there is currently no way to specify an exponential backoff retry strategy in the Azure Portal, I will show how to implement it manually in your function app.
To do that, it is important to track a retry counter on the message itself and to compute the delay after which the message is scheduled for the next attempt. While retries remain, the message is scheduled for the next attempt; when all retries are exhausted, the message is moved to the dead-letter queue. I defined the messaging options in local.settings.json
and created a record to work with them:
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "Messaging:RetryCountProperty": "retry-count",
    "Messaging:SequenceProperty": "original-SequenceNumber",
    "Messaging:RetryCount": "5"
  }
}
public record MessagingOptions
{
    public const string Messaging = "Messaging";

    public string? RetryCountProperty { get; set; }
    public string? SequenceProperty { get; set; }
    public int RetryCount { get; set; }
}
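To make these options injectable, they need to be registered with the host. Here is a minimal sketch, assuming the in-process model with the Microsoft.Azure.Functions.Extensions package (the Startup class and namespace names are my own):

using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

[assembly: FunctionsStartup(typeof(MyFunctionApp.Startup))]

namespace MyFunctionApp;

public class Startup : FunctionsStartup
{
    public override void Configure(IFunctionsHostBuilder builder)
    {
        // Bind the "Messaging" configuration section to MessagingOptions
        builder.Services.AddOptions<MessagingOptions>()
            .Configure<IConfiguration>((settings, configuration) =>
                configuration.GetSection(MessagingOptions.Messaging).Bind(settings));
    }
}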
Here is the utility code in .NET 6/C#:
public static async Task ExponentialRetry(
    [NotNull] this ServiceBusReceivedMessage receivedMessage,
    [NotNull] Exception ex,
    [NotNull] ServiceBusMessageActions messageActions,
    [NotNull] ServiceBusSender sender,
    [NotNull] MessagingOptions messagingOptions,
    [NotNull] ILogger log)
{
    // If the message doesn't have a retry-count property yet, initialize it with 0
    // and remember the original sequence number for tracing
    var retryMessage = new ServiceBusMessage(receivedMessage);
    if (!receivedMessage.ApplicationProperties.ContainsKey(messagingOptions.RetryCountProperty))
    {
        retryMessage.ApplicationProperties.Add(messagingOptions.RetryCountProperty, 0);
        retryMessage.ApplicationProperties.Add(messagingOptions.SequenceProperty, receivedMessage.SequenceNumber);
    }

    // If there are more retries available, schedule a copy of the message with an
    // exponentially growing delay and complete the current delivery
    var retryAttempt = (int)retryMessage.ApplicationProperties[messagingOptions.RetryCountProperty];
    if (retryAttempt < messagingOptions.RetryCount)
    {
        retryAttempt += 1;
        var interval = Math.Pow(3, retryAttempt);
        var scheduledTime = DateTimeOffset.Now.AddMinutes(interval);
        retryMessage.ApplicationProperties[messagingOptions.RetryCountProperty] = retryAttempt;
        await sender.ScheduleMessageAsync(retryMessage, scheduledTime).ConfigureAwait(false);
        await messageActions.CompleteMessageAsync(receivedMessage).ConfigureAwait(false);
        log.LogWarning("Scheduling message retry {RetryCount} to wait {Interval} minutes and arrive at {ScheduledTime}", retryAttempt, interval, scheduledTime.UtcDateTime);
    }
    // If there are no more retries, dead-letter the message (note the host.json config that enables this)
    else
    {
        log.LogCritical("Exhausted all retries for message sequence # {SequenceNumber}", receivedMessage.ApplicationProperties[messagingOptions.SequenceProperty]);
        await messageActions.DeadLetterMessageAsync(receivedMessage, ex.Message, ex.ToString()).ConfigureAwait(false);
    }
}
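With RetryCount set to 5, the scheduled delays grow as 3, 9, 27, 81 and 243 minutes (3 to the power of the attempt number), so in the worst case a message is dead-lettered roughly 363 minutes, about six hours, after the first failure, not counting processing time.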
To use it in the function app trigger code:
[FunctionName("ExponentialRetry")]
public async Task Run(
[NotNull] [ServiceBusTrigger(
"%TopicName%",
"%SubscriptionName%",
Connection = "ServiceBusConnectionString")] ServiceBusReceivedMessage receivedMessage,
ServiceBusMessageActions messageActions,
[NotNull][ServiceBus("%TopicName%", Connection = "ServiceBusConnectionString")] ServiceBusSender sender,
[NotNull] ILogger log)
{
try
{
log.LogInformation($"C# ServiceBus queue trigger function processed message sequence #{message.SystemProperties.SequenceNumber}");
throw new Exception("Some exception");
await messageActions.CompleteMessageAsync(receivedMessage).ConfigureAwait(false);
}
catch (Exception ex)
{
await receivedMessage
.ExponentialRetry(ex, messageActions, sender, _messagingOptions, log)
.ConfigureAwait(false);
}
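In the code above, I've omitted where _messagingOptions comes from. Here is a minimal sketch, assuming the options registration shown earlier, of how it can be passed in via the function class constructor with IOptions<MessagingOptions> (the class name is my own):

public class ExponentialRetryFunction
{
    private readonly MessagingOptions _messagingOptions;

    // MessagingOptions is bound from configuration and injected by the Functions host
    public ExponentialRetryFunction(IOptions<MessagingOptions> messagingOptions)
    {
        _messagingOptions = messagingOptions.Value;
    }

    // ... the Run method from above lives here
}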
The important detail to mention here is that after successful processing, the message must be explicitly marked as completed by a call to CompleteMessageAsync
; otherwise the retry logic in the catch block will execute and the message will be retried exponentially. For this to work, you have to set autoCompleteMessages to false in your host.json
file. Here is an example of the host.json
I have:
{
  "version": "2.0",
  "functionTimeout": "00:10:00",
  "logging": {
    "logLevel": {
      "default": "Trace"
    },
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "extensions": {
    "serviceBus": {
      "autoCompleteMessages": false,
      "maxAutoLockRenewalDuration": "00:06:00",
      "maxConcurrentCalls": 16,
      "maxConcurrentSessions": 8
    }
  }
}