fix: duplicate message_created webhooks for WhatsApp messages (#13523)

Some customers using WhatsApp inboxes with account-level webhooks
reported receiving duplicate `message_created` webhook deliveries for
every incoming message. Upon inspection, here's what we found:

- Both payloads are identical.
- No errors appear in the application logs.
- The webhook URL is only configured in one place.

This meant the system itself was sending the webhooks twice. For context,
there's a known related issue: Meta's WhatsApp Business API can deliver
the same webhook notification multiple times for a single message. The
codebase already acknowledges this: there's a comment in
`IncomingMessageBaseService#process_messages` noting that "multiple
webhook events can be received against the same message due to
misconfigurations in the Meta business manager account." A deduplication
guard exists, but it doesn't actually work under concurrency.

### Rationale

The existing dedup was a three-step sequence: check Redis (`GET`), check
the database, then set a Redis flag (`SETEX`). Two Sidekiq workers
processing duplicate Meta webhooks simultaneously would both complete
the `GET` before either executed the `SETEX`, so both would proceed to
create a message. The `source_id` column has a non-unique index, so the
database wouldn't catch the duplicate either. Each message then
independently fires `after_create_commit`, dispatching two
`message_created` webhook events to the customer.

```
             Worker A                          Worker B
                │                                 │
                ▼                                 ▼
        Redis GET key ──► nil               Redis GET key ──► nil
                │                                 │
                │    ◄── both pass guard ──►      │
                │                                 │
                ▼                                 ▼
        Redis SETEX key                    Redis SETEX key
                │                                 │
                ▼                                 ▼
        BEGIN transaction               BEGIN transaction
        INSERT message                   INSERT message
        DELETE Redis key ◄─┐                      │
        COMMIT             │             DELETE Redis key
                           │             COMMIT
                           │                      │
                           └── key gone before ───┘
                              B's commit lands

                ▼                                 ▼
        after_create_commit              after_create_commit
        dispatch MESSAGE_CREATED         dispatch MESSAGE_CREATED
                │                                 │
                ▼                                 ▼
        WebhookJob ──► n8n               WebhookJob ──► n8n
                    (duplicate!)
```

There was a second, subtler problem visible in the diagram: the Redis
key was cleared *inside* the database transaction, before the
transaction committed. This opened a window where neither the Redis
check nor the database check would see the in-flight message.
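
The first race, where both workers pass the `GET` before either writes the key, can be reproduced without Redis or a database. The sketch below is only an illustration: `FakeStore`, `Rendezvous`, and `check_then_set` are hypothetical stand-ins, not the project's `Redis::Alfred` or the service code, and the rendezvous is there purely to force the unlucky interleaving every time.

```ruby
# Hypothetical in-memory stand-in for Redis (not Redis::Alfred).
class FakeStore
  def initialize
    @data = {}
    @mutex = Mutex.new
  end

  def get(key)
    @mutex.synchronize { @data[key] }
  end

  def set(key, value)
    @mutex.synchronize { @data[key] = value }
  end
end

# Minimal two-party rendezvous built on the standard library.
class Rendezvous
  def initialize(parties)
    @parties = parties
    @arrived = 0
    @mutex = Mutex.new
    @cond = ConditionVariable.new
  end

  def wait
    @mutex.synchronize do
      @arrived += 1
      if @arrived >= @parties
        @cond.broadcast
      else
        @cond.wait(@mutex) while @arrived < @parties
      end
    end
  end
end

# Old-style guard: the check and the set are two separate steps.
# The block runs in the gap between them so the race can be forced.
def check_then_set(store, key)
  return false if store.get(key)

  yield if block_given?
  store.set(key, true)
  true
end

def race_check_then_set(key = 'MESSAGE_SOURCE_KEY::wamid.example')
  store = FakeStore.new
  rendezvous = Rendezvous.new(2)
  threads = Array.new(2) do
    Thread.new { check_then_set(store, key) { rendezvous.wait } }
  end
  threads.map(&:value)
end

# Both simulated workers get past the guard.
race_results = race_check_then_set
```

Each individual store operation is thread-safe here, yet both callers still pass the guard. The problem is the gap between the two steps, not the steps themselves.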

The fix collapses the check-and-set into a single `SET NX EX` call,
which is atomic in Redis. The key is no longer eagerly cleared; it
expires naturally after 24 hours. The database lookup
(`find_message_by_source_id`) remains as a fallback for duplicates that
arrive after the lock has expired.

```
             Worker A                          Worker B
                │                                 │
                ▼                                 ▼
        Redis SET NX ──► OK              Redis SET NX ──► nil
                │                                 │
                ▼                                 ▼
        proceeds to create              returns early
        message normally                (lock already held)
```
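
The set-if-absent behaviour in the diagram can be modelled in a few lines. This is a rough sketch under stated assumptions, not the actual `Whatsapp::MessageDedupLock`: `AtomicFakeStore` and `DedupLockSketch` are hypothetical names, and the store is an in-memory model of `SET key value NX EX ttl` so the example runs without a Redis server. With the redis-rb client the equivalent call would be `redis.set(key, value, nx: true, ex: ttl)`.

```ruby
# Hypothetical in-memory model of Redis `SET key value NX EX ttl`.
# The check and the write happen under one mutex, mirroring the
# atomicity Redis provides for a single command.
class AtomicFakeStore
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @expires_at = {}
    @mutex = Mutex.new
    @clock = clock
  end

  # Returns true if the key was absent (or expired) and is now set,
  # false if another caller already holds it.
  def set_nx_ex(key, ttl_seconds)
    @mutex.synchronize do
      now = @clock.call
      expiry = @expires_at[key]
      return false if expiry && expiry > now

      @expires_at[key] = now + ttl_seconds
      true
    end
  end
end

# Illustrative lock wrapper; the key prefix follows the
# MESSAGE_SOURCE_KEY:: pattern used in the specs.
class DedupLockSketch
  TTL_SECONDS = 24 * 60 * 60 # 24 hours, as described above

  def initialize(store, source_id)
    @store = store
    @key = "MESSAGE_SOURCE_KEY::#{source_id}"
  end

  def acquire!
    @store.set_nx_ex(@key, TTL_SECONDS)
  end
end

# Several simulated workers race for the same source_id.
atomic_store = AtomicFakeStore.new
atomic_results = Array.new(8) do
  Thread.new { DedupLockSketch.new(atomic_store, 'wamid.example').acquire! }
end.map(&:value)
```

Because there is no separate read step, no interleaving can let a second caller see the key as absent. Only the caller whose write created the key gets `true`.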

### Implementation Notes

The lock logic is extracted into `Whatsapp::MessageDedupLock`, a small
class that wraps a single `Redis SET NX EX` call. This makes the
concurrency guarantee testable in isolation — the spec uses a
`CyclicBarrier` to race two threads against the same key and asserts
exactly one wins, without needing database writes,
`use_transactional_tests = false`, or monkey-patching.

Because the Redis lock now persists (instead of being cleared
mid-transaction), existing WhatsApp specs needed an `after` hook to
clean up `MESSAGE_SOURCE_KEY::*` keys between examples. Transactional
fixtures only roll back the database, not Redis.
Shivam Mishra
2026-02-17 14:01:10 +05:30
committed by GitHub
parent fb2f5e1d42
commit 39243b9e71
6 changed files with 90 additions and 30 deletions

@@ -6,6 +6,14 @@ describe Whatsapp::IncomingMessageService do
stub_request(:post, 'https://waba.360dialog.io/v1/configs/webhook')
end
after do
# The atomic dedup lock lives in Redis and is not rolled back by
# transactional fixtures. Clean up any keys created during the test.
Redis::Alfred.scan_each(match: 'MESSAGE_SOURCE_KEY::*') do |key|
Redis::Alfred.delete(key)
end
end
let!(:whatsapp_channel) { create(:channel_whatsapp, sync_templates: false) }
let(:wa_id) { '2423423243' }
let!(:params) do
@@ -393,8 +401,8 @@ describe Whatsapp::IncomingMessageService do
end
end
describe 'when message processing is in progress' do
it 'ignores the current message creation request' do
describe 'when another worker already holds the dedup lock' do
it 'skips message creation' do
params = { 'contacts' => [{ 'profile' => { 'name' => 'Kedar' }, 'wa_id' => '919746334593' }],
'messages' => [{ 'from' => '919446284490',
'id' => 'wamid.SDFADSf23sfasdafasdfa',
@@ -409,17 +417,14 @@ describe Whatsapp::IncomingMessageService do
'phones' => [{ 'phone' => '+1 (415) 341-8386' }] }
] }] }.with_indifferent_access
expect(Message.find_by(source_id: 'wamid.SDFADSf23sfasdafasdfa')).not_to be_present
key = format(Redis::RedisKeys::MESSAGE_SOURCE_KEY, id: 'wamid.SDFADSf23sfasdafasdfa')
Redis::Alfred.setex(key, true)
expect(Redis::Alfred.get(key)).to be_truthy
# Simulate another worker holding the lock
lock = Whatsapp::MessageDedupLock.new('wamid.SDFADSf23sfasdafasdfa')
expect(lock.acquire!).to be_truthy
described_class.new(inbox: whatsapp_channel.inbox, params: params).perform
expect(whatsapp_channel.inbox.messages.count).to eq(0)
expect(Message.find_by(source_id: 'wamid.SDFADSf23sfasdafasdfa')).not_to be_present
expect(Redis::Alfred.get(key)).to be_truthy
ensure
key = format(Redis::RedisKeys::MESSAGE_SOURCE_KEY, id: 'wamid.SDFADSf23sfasdafasdfa')
Redis::Alfred.delete(key)
end
end

@@ -2,6 +2,10 @@ require 'rails_helper'
describe Whatsapp::IncomingMessageWhatsappCloudService do
describe '#perform' do
after do
Redis::Alfred.scan_each(match: 'MESSAGE_SOURCE_KEY::*') { |key| Redis::Alfred.delete(key) }
end
let!(:whatsapp_channel) { create(:channel_whatsapp, provider: 'whatsapp_cloud', sync_templates: false, validate_provider_config: false) }
let(:params) do
{

@@ -0,0 +1,43 @@
require 'rails_helper'

describe Whatsapp::MessageDedupLock do
  let(:source_id) { "wamid.test_#{SecureRandom.hex(8)}" }
  let(:lock) { described_class.new(source_id) }
  let(:redis_key) { format(Redis::RedisKeys::MESSAGE_SOURCE_KEY, id: source_id) }

  after { Redis::Alfred.delete(redis_key) }

  describe '#acquire!' do
    it 'returns truthy on first acquire' do
      expect(lock.acquire!).to be_truthy
    end

    it 'returns falsy on second acquire for the same source_id' do
      lock.acquire!
      expect(described_class.new(source_id).acquire!).to be_falsy
    end

    it 'allows different source_ids to acquire independently' do
      lock.acquire!
      other = described_class.new("wamid.other_#{SecureRandom.hex(8)}")
      expect(other.acquire!).to be_truthy
    end

    it 'lets exactly one thread win when two race for the same source_id' do
      results = Concurrent::Array.new
      barrier = Concurrent::CyclicBarrier.new(2)

      threads = Array.new(2) do
        Thread.new do
          barrier.wait
          results << described_class.new(source_id).acquire!
        end
      end

      threads.each(&:join)

      wins = results.count { |r| r }
      expect(wins).to eq(1), "Expected exactly 1 winner but got #{wins}. Results: #{results.inspect}"
    end
  end
end