springboot application with apache camel saga causes Narayana Long Running Actions (LRA) memory leak - java.lang.OutOfMemoryError: Java heap space

154 views
Skip to first unread message

Junior de Souza Predes

unread,
Aug 18, 2023, 9:27:37 AM8/18/23
to narayana-users
Hello, how r u?

We are having this problem with narayana-lra


any help would be appreciated

Thanks a log

Júnior

0

Changing the saga exemple from camel-spring-boot-examples/saga to execute multiple saga transactions with success causes the narayana-lra server to fail and restart because of a java.lang.OutOfMemoryError: Java heap space.

Using the camel org.apache.camel.springboot.example version 3.20.0

pom.xml

<parent> <groupId>org.apache.camel.springboot</groupId> <artifactId>spring-boot</artifactId> <version>3.20.0</version> </parent> <groupId>org.apache.camel.springboot.example</groupId> <artifactId>examples</artifactId> <name>Camel SB :: Examples</name> <description>Camel Examples</description> <packaging>pom</packaging> <properties> <camel-version>3.20.0</camel-version> <skip.starting.camel.context>false</skip.starting.camel.context> <javax.servlet.api.version>4.0.1</javax.servlet.api.version> <jkube-maven-plugin-version>1.9.1</jkube-maven-plugin-version> <kafka-avro-serializer-version>5.2.2</kafka-avro-serializer-version> <reactor-version>3.2.16.RELEASE</reactor-version> <testcontainers-version>1.16.3</testcontainers-version> <hapi-structures-v24-version>2.3</hapi-structures-v24-version> <narayana-spring-boot-version>2.6.3</narayana-spring-boot-version> </properties>

changed docker-compose

version: "3.9" services: lra-coordinator: image: "quay.io/jbosstm/lra-coordinator:7.0.0.Final-3.2.2.Final" network_mode: "host" deploy: resources: limits: memory: 400M environment: - 'JAVA_TOOL_OPTIONS=-Dquarkus.log.level=DEBUG -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=7091 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false' amq-broker: image: "registry.redhat.io/amq7/amq-broker-rhel8:7.10" environment: - AMQ_USER=admin - AMQ_PASSWORD=admin - AMQ_REQUIRE_LOGIN=true ports: - "8161:8161" - "61616:61616"

changed sagaRoute

public class SagaRoute extends RouteBuilder { private static final String DIRECT_SAGA = "direct:saga"; @Autowired private ProducerTemplate producerTemplate; private final ExecutorService executor = Executors.newFixedThreadPool(10); @Override public void configure() throws Exception { rest().get("/perf") .param().type(RestParamType.query).name("n").dataType("int").required(true).endParam() .to("direct:perf"); from("direct:perf") .process(exchange -> { Integer limit = exchange.getMessage().getHeader("n", Integer.class); for (int i = 0; i < limit; i++) { int finalI = i; executor.submit(()-> producerTemplate.sendBodyAndHeader(DIRECT_SAGA, finalI,"id",finalI)) ; } }); rest().post("/saga") .param().type(RestParamType.query).name("id").dataType("int").required(true).endParam() .to(DIRECT_SAGA); from(DIRECT_SAGA) .saga() .compensation("direct:cancelOrder") .log("Executing saga #${header.id} with LRA ${header.Long-Running-Action}") .setHeader("payFor", constant("train")) .to("activemq:queue:{{example.services.train}}?exchangePattern=InOut" + "&replyTo={{example.services.train}}.reply") .log("train seat reserved for saga #${header.id} with payment transaction: ${body}") .setHeader("payFor", constant("flight")) .to("activemq:queue:{{example.services.flight}}?exchangePattern=InOut" + "&replyTo={{example.services.flight}}.reply") .log("flight booked for saga #${header.id} with payment transaction: ${body}") .setBody(header("Long-Running-Action")) .end(); from("direct:cancelOrder") .log("Transaction ${header.Long-Running-Action} has been cancelled due to flight or train failure"); } }

changed paymentRoute https://github.com/apache/camel-spring-boot-examples/blob/camel-spring-boot-examples-3.20.0/saga/saga-payment-service/src/main/java/org/apache/camel/example/saga/PaymentRoute.java

public class PaymentRoute extends RouteBuilder { @Override public void configure() throws Exception { from("activemq:queue:{{example.services.payment}}") .routeId("payment-service") .saga() .propagation(SagaPropagation.MANDATORY) .option("id", header("id")) .compensation("direct:cancelPayment") .log("Paying ${header.payFor} for order #${header.id}") .setBody(header("JMSCorrelationID")) .log("Payment ${header.payFor} done for order #${header.id} with payment transaction ${body}") .end(); from("direct:cancelPayment") .routeId("payment-cancel") .log("Payment for order #${header.id} has been cancelled"); } }

executing o run-local.sh to start the services

executing the command bellow with the number of transactions to be done

curl http://localhost:8084/api/perf?n=100000

Doing this we get on the flight recorder this problem

The live set on the heap seems to increase with a speed of about 26,2 KiB per second during the recording. An analysis of the reference tree found 1 leak candidates. The main candidate is java.util.concurrent.ConcurrentHashMap$Node[] Referenced by this chain: java.util.concurrent.ConcurrentHashMap.table io.narayana.lra.coordinator.domain.service.LRAService.participants io.narayana.lra.coordinator.internal.LRARecoveryModule.service java.lang.Object[] java.util.Vector.elementData com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery._recoveryModules

Flight Recorder Diagnose Image

and the erros on Narayana lra

2023-08-15 19:16:35,438 DEBUG [org.jbo.res.rea.com.cor.AbstractResteasyReactiveContext] (executor-thread-59) Restarting handler chain for exception exception: java.lang.OutOfMemoryError: Java heap space 2023-08-15 19:16:37,729 DEBUG [org.jbo.res.rea.com.cor.AbstractResteasyReactiveContext] (executor-thread-59) Attempting to handle unrecoverable exception: java.lang.OutOfMemoryError: Java heap space 2023-08-15 19:16:38,327 DEBUG [io.ver.ext.web.RoutingContext] (executor-thread-59) RoutingContext failure (500): java.lang.OutOfMemoryError: Java heap space

The attribute that holds the data causing the memory leak is

io.narayana.lra.coordinator.domain.service.LRAService.participants

public class LRAService { private final Map<String, String> participants = new ConcurrentHashMap<>();

in the LRAService class I could not find where itens are removed from this map.

Is it bug? a miss configuration on narayana lra? a bug on apache camel saga?

thanks a lot

Junior de Souza Predes

unread,
Aug 24, 2023, 10:29:32 AM8/24/23
to narayana-users
Reply all
Reply to author
Forward
0 new messages