queue fills up disk but should not


Boris Sagadin

Oct 10, 2023, 9:48:48 PM
to rabbitmq-users
Hi,

We've been on version 3.11.23 for a few days, and today an issue appeared with one of the queues, which normally gets about 20k small jobs from a cron job. The GUI showed the queue's usage as low, at about 30 MB (the jobs are very small files, so this seems correct), but the disk was full.

I checked the mnesia folder, and this queue's size showed as 70 GB (I got the queue name from the "config" file).

(omitted path)/2F_SCR47527NSI5QNN$ ls -al
...
-rw-r--r--   1 rabbitmq root 5269170 Oct 10 23:30 01070183.segment
-rw-r--r--   1 rabbitmq root 5265125 Oct 10 23:30 01070182.segment
-rw-r--r--   1 rabbitmq root 5261113 Oct 10 23:30 01070181.segment
-rw-r--r--   1 rabbitmq root 5262928 Oct 10 23:30 01070180.segment
-rw-rw-r--   1 rabbitmq root    1993 Oct  6 04:36 config

About 13000 segment files (until disk was full).
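
As a rough sanity check on the numbers above, the four file sizes visible in the listing can be averaged and multiplied by the approximate segment count. This is only a back-of-the-envelope sketch: it assumes the remaining files (elided in the listing) are about the same size as the four shown, and it takes "about 13000" at face value.

```python
# Sizes in bytes of the four segment files visible in the listing.
listed_sizes = [5269170, 5265125, 5261113, 5262928]

# "About 13000" per the post; the exact count is not known.
segment_count = 13_000

mean_size = sum(listed_sizes) / len(listed_sizes)
estimated_total = mean_size * segment_count

print(f"mean segment size: {mean_size / 1e6:.2f} MB")    # 5.26 MB
print(f"estimated total:   {estimated_total / 1e9:.1f} GB")  # 68.4 GB
```

Under those assumptions the estimate comes to roughly 68 GB, which is consistent with the 70 GB reported for the queue directory.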

Deleting all messages didn't help. The disk space was freed only when I deleted the queue. The queue type is quorum, with 3 nodes all running the same version.

Any ideas what the issue was?

Boris




Michal Kuratczyk

Oct 11, 2023, 11:11:47 AM
to rabbitm...@googlegroups.com
1. If you still have the segment files, we can investigate. If you deleted them and it happens again, please move them somewhere for further investigation.
2. Anything else you noticed when this happened? Were consumers present? Did you see unexpected unacknowledged messages perhaps?

Best,

--
Michał
RabbitMQ team

Boris Sagadin

Oct 11, 2023, 11:34:50 AM
to rabbitmq-users
Michal,

Later I noticed that we had an active shovel on this queue. It was a leftover from when we moved the jobs from the old server to the new one; we forgot to delete it. The shovel's source and destination now pointed to the same server. I noticed it because the queue's publish and confirm rates were very high all the time, at about 15k req/s, even when the queue was supposedly idle. I guess the shovel was constantly shoveling jobs from and to the same queue. Maybe this was the culprit, or it triggered some bug that eventually filled up the disk. If it happens again, I'll save the files.
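
A leftover shovel like this one can be spotted by comparing its source and destination. The following is a minimal sketch, assuming the shovel is a dynamic shovel whose definition uses the `src-uri`, `src-queue`, `dest-uri` and `dest-queue` keys, each URI is a single string (not a list), and both URIs spell the host the same way. The helper names and the example hosts are hypothetical.

```python
from urllib.parse import urlsplit


def _endpoint(uri: str, queue: str) -> tuple:
    """Reduce a URI plus queue name to a comparable tuple.

    Credentials and query parameters are ignored; host, port,
    vhost path and queue name are compared.
    """
    parts = urlsplit(uri)
    return (parts.hostname, parts.port, parts.path, queue)


def is_self_loop(shovel_value: dict) -> bool:
    """Return True if the shovel reads from and writes to the same queue."""
    src = _endpoint(shovel_value["src-uri"], shovel_value["src-queue"])
    dest = _endpoint(shovel_value["dest-uri"], shovel_value["dest-queue"])
    return src == dest


# Hypothetical definition resembling the situation described above.
leftover = {
    "src-uri": "amqp://user:secret@new-server.example:5672/%2F",
    "src-queue": "jobs",
    "dest-uri": "amqp://new-server.example:5672/%2F",
    "dest-queue": "jobs",
}
print(is_self_loop(leftover))
```

The check is deliberately strict: it will not notice a loop where the two URIs name the same node differently (for example, by hostname in one and IP address in the other).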


Regards,
Boris


On Wednesday, October 11, 2023 at 17:11:47 UTC+2, Michal Kuratczyk wrote:

Lucas Weis Polesello

Jan 11, 2024, 8:45:47 AM
to rabbitmq-users
Guys, for the past three to four months my team has had the same issue: empty queues with disks almost full. The mnesia folder was full of messages that were not being GCed. What seems to have fixed it was splitting our single Rabbit into N separate instances (not clustered).

How it used to happen was:
- One queue would build up to 500k messages (small messages - usually a DL/Failure queue)
- After consuming it, the disk would start to behave abnormally
- It would only go up until it reached a Disk Alarm
- No abnormal load

Our load is usually:
- 1k/s W/R
- Spikes to 5-6K Writes
- Message size: 4 KB to 20-30 MB (depends on the service, the queue, and the task)
- Rabbit 3.12.16 (All running Classic V2 Queues, Persistent Messages and Durable Queues)
- ~2k Connections
- ~4k Channels
- 2TB of Disk

Michal, about your suggestion to move the Mnesia files: since our data contains PHI and PII, is there any tool we can use internally to debug it if it ever happens again? Or any points of concern? We've been down a rabbit hole for quite a while.
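
One thing that can be collected without exposing message contents is a size inventory of the data directory. The sketch below is not a RabbitMQ tool, just a generic script under the assumption that file names and sizes are safe to share: it reads only file metadata via `os.stat` and never opens a file. The function name is hypothetical, and the path to scan is left to the caller.

```python
import os
from collections import defaultdict


def summarize_dir(root: str) -> dict:
    """Return {extension: (file_count, total_bytes)} using only file metadata."""
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.stat(path).st_size  # metadata only; file is never opened
            except OSError:
                continue  # file may have been removed while scanning
            ext = os.path.splitext(name)[1] or "(none)"
            totals[ext][0] += 1
            totals[ext][1] += size
    return {ext: (count, size) for ext, (count, size) in totals.items()}
```

Running it periodically and comparing the per-extension totals over time would show which kind of file is growing, which may help narrow down a report without attaching the files themselves.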

Karl Nilsson

Jan 11, 2024, 8:50:56 AM
to rabbitm...@googlegroups.com
If you're only using classic queues, this is a different issue from the one in this old thread, which was specifically about quorum queues. It may be best to start a new question or GitHub discussion for this.



--
Karl Nilsson

jo...@cloudamqp.com

Jan 11, 2024, 9:15:23 PM
to rabbitmq-users
For the classic queue version of this issue see this new discussion: https://github.com/rabbitmq/rabbitmq-server/discussions/10329