Hi all.
We have been using Axon 2.4.3 for quite some time now in 2 different apps. The first has been running for 4 or 5 months without problems; the other was deployed to production 2 weeks ago and is giving us a ton of problems. To get right to the point: we have some batch jobs that end up creating Axon processes. When they run, connections to the database stall. We see lots of connections like this:
INSERT INTO axon.task_domain_event_entry (event_identifier, aggregate_type, aggregate_identifier, sequence_number, timestamp, payload_type, payload_revision, payload, metaData) VALUES ($1,$2,$3,$4,$5,$6,$7,XML($8),XML($9))
INSERT INTO axon.task_domain_event_entry (event_identifier, aggregate_type, aggregate_identifier, sequence_number, timestamp, payload_type, payload_revision, payload, metaData) VALUES ($1,$2,$3,$4,$5,$6,$7,XML($8),XML($9))
INSERT INTO axon.task_domain_event_entry (event_identifier, aggregate_type, aggregate_identifier, sequence_number, timestamp, payload_type, payload_revision, payload, metaData) VALUES ($1,$2,$3,$4,$5,$6,$7,XML($8),XML($9))
INSERT INTO axon.task_domain_event_entry (event_identifier, aggregate_type, aggregate_identifier, sequence_number, timestamp, payload_type, payload_revision, payload, metaData) VALUES ($1,$2,$3,$4,$5,$6,$7,XML($8),XML($9))
INSERT INTO axon.task_domain_event_entry (event_identifier, aggregate_type, aggregate_identifier, sequence_number, timestamp, payload_type, payload_revision, payload, metaData) VALUES ($1,$2,$3,$4,$5,$6,$7,XML($8),XML($9))
INSERT INTO axon.task_saga_entry(saga_id, revision, saga_type, serialized_saga) VALUES($1,$2,$3,XML($4))
These keep piling up until they stall our database; this morning there were 193 connections from Axon. They had been hanging for more than 1 hour when we decided to shut down the database. 4 of the batch jobs that create the Axon processes ran today.
To reach 192 active connections, some of them must have been open for 3 hours or so... I checked the database for locks, but there were only 2, and those are always there.
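For anyone who wants to look at the same thing, a query along these lines lists the stuck connections and how long each has been open. It is only a sketch: it assumes PostgreSQL 9.2 or later (the `$1` placeholders and `XML()` calls suggest PostgreSQL, but the version is an assumption), and the `LIKE` filter is just an illustration based on the statements above.

```sql
-- Assumption: PostgreSQL 9.2+, where pg_stat_activity exposes
-- pid, state, xact_start, query_start and query.
SELECT pid,
       state,                               -- 'active' vs 'idle in transaction'
       now() - xact_start  AS transaction_age,
       now() - query_start AS query_age,
       query                                -- last statement sent on the connection
FROM pg_stat_activity
WHERE query LIKE 'INSERT INTO axon.%'       -- illustrative filter for the Axon inserts
ORDER BY xact_start;                        -- oldest transactions first
```

The `state` column is the interesting part: `active` means the INSERT itself is still executing or waiting, while `idle in transaction` means the statement finished and the application never committed or rolled back.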
The other application has similar jobs running every hour, with around 50 processes created, and we have never had this problem. We looked for configuration differences in both the database and Axon but couldn't find anything specific.
Many thanks.