
Why Exponential Backoff in RabbitMQ or in Event-Driven Systems

Gorav Singal

August 23, 2022

TL;DR

Without exponential backoff, a failing message re-queues instantly and creates an infinite retry storm that crushes your consumer. Backoff with dead-letter exchanges gives the downstream system time to recover.


Understanding Simple Message Workflow

First, let's understand a simple workflow in an event-driven or messaging system.

  • A message (event) is published to a queue in response to some action
  • A consumer receives the message
  • The consumer processes the message
  • The consumer acknowledges (or deletes) the message on completion
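To make the steps above concrete, here is a minimal in-memory sketch. It is not RabbitMQ code: the `deque` only stands in for a broker queue, and `publish`, `handle`, `consume_one` and the `order_id` field are hypothetical names chosen for illustration.

```python
from collections import deque

queue = deque()      # stands in for a broker queue
processed = []       # records what the consumer handled


def publish(message: dict) -> None:
    """Step 1: some action in the system produces a message."""
    queue.append(message)


def handle(message: dict) -> None:
    """Step 3: the consumer's business logic (hypothetical)."""
    processed.append(message["order_id"])


def consume_one() -> bool:
    """Steps 2 and 4: take one message, process it, then acknowledge it."""
    if not queue:
        return False
    message = queue[0]   # delivered, but the queue still holds it
    handle(message)
    queue.popleft()      # the "ack": the queue may now forget the message
    return True


publish({"order_id": 1})
publish({"order_id": 2})
while consume_one():
    pass
```

Note that the message is only removed after `handle` returns. If `handle` raised an exception, the message would still be sitting in the queue, which is exactly the failure case discussed next.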

Event Driven System Positive Workflow

Let's consider a failure scenario where the format of the message is unknown and the consumer throws an exception without sending either an acknowledgement or a negative acknowledgement.

The message goes back to the original queue and is available to be consumed again.

Event Driven System Negative Workflow

Ways to Handle Failure Messages

There are three ways to handle the failure case:

  1. Reject the message
  2. Re-queue it in the RabbitMQ queue
  3. Publish it to a dead-letter exchange (DLX) queue

Note: Every message is important to the system, so we cannot afford to lose any of them.

Issue with Re-queueing the Message Back to the Queue

When a worker fails because it did not understand the message format, it will most likely fail the next time as well. Now think of rejecting something and having it come straight back to you, again and again, forever! In computing terms, we are wasting resources and effectively running a denial-of-service (DoS) attack against our own system, which is bad.
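A toy simulation shows how quickly this gets out of hand. It is an in-memory model under simplifying assumptions, not real broker behaviour: `parse` is a hypothetical handler that only understands numeric payloads, and `max_deliveries` exists only so the loop stops.

```python
from collections import deque


def parse(message: str) -> int:
    """Hypothetical handler: only understands numeric payloads."""
    return int(message)


def consume_with_requeue(messages, max_deliveries):
    """Deliver messages, putting every failed one straight back on the queue."""
    queue = deque(messages)
    deliveries = 0
    handled = []
    while queue and deliveries < max_deliveries:
        message = queue.popleft()
        deliveries += 1
        try:
            handled.append(parse(message))
        except ValueError:
            queue.append(message)   # immediate re-queue, no delay at all
    return handled, deliveries, list(queue)


handled, deliveries, stuck = consume_with_requeue(
    ["1", "not-a-number", "2"], max_deliveries=1000
)
print(handled, deliveries, stuck)
```

Only two messages are ever handled, yet the consumer burns through all 1000 allowed deliveries, and the bad message is still in the queue at the end. Without the artificial limit the loop would never finish.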

Issue with the Dead-Letter Exchange Queue

The idea behind pushing a message to a dead-letter exchange queue is that someone will handle it manually, and will probably push it back to the original queue after modifying it or after a code change in the workers. Or we may simply delete the message if it is not important or was produced by mistake.

But it is a manual step! On the other hand, it does protect our services from the self-inflicted DoS attack.

The Saviour - Exponential Backoff Strategy

Remember what we do in normal exponential backoff retries: we retry after sleeping for some random time, and we keep increasing this sleep time exponentially.

The idea is the same. We will have a separate retry queue for every such queue in our system.
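As a reminder of what that looks like in code, here is a minimal sketch of a backoff delay calculation. The function name `backoff_delay_ms` and the default values (1 second base, 60 second cap) are assumptions picked for illustration, and the jitter range (upper half of the window) is just one of several reasonable choices.

```python
import random


def backoff_delay_ms(attempt: int, base_ms: int = 1000, cap_ms: int = 60000,
                     rng=random) -> int:
    """Delay before retry number `attempt` (1-based), in milliseconds."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # The window doubles on every attempt, up to a fixed cap.
    ceiling = min(cap_ms, base_ms * 2 ** (attempt - 1))
    # Random jitter keeps many failing messages from retrying in lockstep.
    return rng.randint(ceiling // 2, ceiling)


for attempt in range(1, 6):
    print(attempt, backoff_delay_ms(attempt))
```

The exact numbers vary from run to run because of the jitter, but the window grows as 1s, 2s, 4s, 8s, 16s and so on until it reaches the cap.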

Let me list the steps:

  • Create a separate retry queue for each of your queues
  • On failure, push the message to the retry queue with Expiration (TTL) metadata, using a longer TTL on each successive failure
  • When the TTL elapses, the message expires from the retry queue and is sent back to the original queue, where it is ready to be processed again
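The steps above can be sketched as two small helpers that prepare what we would hand to a RabbitMQ client. The queue arguments `x-dead-letter-exchange` and `x-dead-letter-routing-key` and the per-message `expiration` property (a string holding milliseconds) are real RabbitMQ names. Everything else is an assumption made for illustration: the `x-retry-count` header, the function names, the `work` and `orders` names, and the default limits.

```python
from typing import Optional, Tuple

RETRY_HEADER = "x-retry-count"   # hypothetical header, not set by RabbitMQ


def retry_queue_arguments(work_exchange: str, work_routing_key: str) -> dict:
    """Arguments for declaring the retry queue.

    When a message expires in the retry queue, the broker dead-letters it
    to `work_exchange` with `work_routing_key`, i.e. back to the work queue.
    """
    return {
        "x-dead-letter-exchange": work_exchange,
        "x-dead-letter-routing-key": work_routing_key,
    }


def next_retry(headers: Optional[dict], base_ms: int = 1000,
               cap_ms: int = 60000,
               max_attempts: int = 5) -> Optional[Tuple[str, dict]]:
    """Return (expiration, headers) for republishing to the retry queue.

    Returns None once `max_attempts` retries have been used up, so the
    caller can park the message for manual handling instead.
    """
    attempts = int((headers or {}).get(RETRY_HEADER, 0)) + 1
    if attempts > max_attempts:
        return None
    delay_ms = min(cap_ms, base_ms * 2 ** (attempts - 1))
    new_headers = dict(headers or {})
    new_headers[RETRY_HEADER] = attempts
    return str(delay_ms), new_headers   # expiration must be a string


print(retry_queue_arguments("work", "orders"))
headers = None
while (result := next_retry(headers)) is not None:
    expiration, headers = result
    print(expiration, headers)
```

One caveat with this design: RabbitMQ only expires messages from the head of a queue, so when TTLs differ per message, a short delay can get stuck behind a longer one in the same retry queue. A common workaround is to use one retry queue per delay level.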

Event Driven System Negative with Retry Workflow

Hope it helps.
