Need help on Drools DMN performance issue at scale with 500+ Rules

Chandra

Jun 9, 2025, 12:12:15 PM
to Drools Usage

Hi Drools DMN Community Members and Experts


We work on a high-traffic business rules platform using Drools DMN (7.74.1.Final, Spring Boot, Java 8). Despite advanced parallelization and extensive model refactoring, we are unable to achieve sub-500ms response time for large requests—currently stuck at ~2 seconds. We are seeking community advice on design patterns, hardware requirements, caching/model strategies, and practical best practices to bridge this gap.


Request Structure & Chunked Parallel Processing

  • Request structure:
    • Transaction
      • Orders[] - up to 20
        • Packages[] - up to 10 per Order
          • Items[] - up to 100 per Package
  • Chunking & Parallelism (Java layer):
    • For requests with many Orders, we split (chunk) the Orders into sub-requests (e.g., 2 Orders per sub-request; also tried chunk sizes of 1, 2, 3, 4).
    • Each sub-request is processed in parallel using CompletableFuture with a dedicated thread pool.
    • Within each sub-request, Orders are also processed in parallel—each Order’s DMN invocation runs in its own CompletableFuture (with a separate thread pool).
    • Results for each Order are aggregated for the final API response.
    • Extensive tuning of pool sizes, chunk sizes, and queue sizes (e.g., pool sizes from 7/14 up to 32/28, queue sizes up to 50). A minimal Java sketch of this chunk-and-fan-out pattern follows this list.
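
For clarity, here is an illustrative sketch of the chunking/parallelism described above. The generic evaluator function stands in for the per-Order DMN invocation; the types are placeholders and the pool sizes are just examples from the ranges we tuned.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// I = per-Order input, O = per-Order result; the evaluator wraps the DMN call for one Order.
public class ChunkedOrderProcessor<I, O> {

    private final ExecutorService chunkPool = Executors.newFixedThreadPool(14); // sub-request (chunk) pool
    private final ExecutorService orderPool = Executors.newFixedThreadPool(28); // per-Order pool
    private final Function<I, O> orderEvaluator;

    public ChunkedOrderProcessor(Function<I, O> orderEvaluator) {
        this.orderEvaluator = orderEvaluator;
    }

    public List<O> process(List<I> orders, int chunkSize) {
        List<CompletableFuture<List<O>>> chunkFutures = new ArrayList<>();
        for (int i = 0; i < orders.size(); i += chunkSize) {
            // Each chunk of Orders becomes one sub-request, handled on the chunk pool.
            List<I> chunk = orders.subList(i, Math.min(i + chunkSize, orders.size()));
            chunkFutures.add(CompletableFuture.supplyAsync(() -> evaluateChunk(chunk), chunkPool));
        }
        // Aggregate per-chunk results into the final API response.
        return chunkFutures.stream()
                .map(CompletableFuture::join)
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    private List<O> evaluateChunk(List<I> chunk) {
        // Within a chunk, every Order's DMN invocation runs on the separate Order pool.
        List<CompletableFuture<O>> orderFutures = chunk.stream()
                .map(order -> CompletableFuture.supplyAsync(() -> orderEvaluator.apply(order), orderPool))
                .collect(Collectors.toList());
        return orderFutures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }
}

The two executors mirror the chunk-level and Order-level pools mentioned above; per-chunk results are joined and flattened into the final response.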


DMN Model & DRD Structure

  • Main DMN orchestration:
    • The Main DMN contains 3 top-level decision services (e.g., “Pricing”, “Eligibility”, “Compliance”), which can execute independently for each Order.
    • Each decision service typically contains 5–15 decisions (there are 30 decision services overall across all levels).
    • Each decision is implemented as a Context boxed expression that calls multiple FEEL BKMs and at least one Java BKM (for complex or reusable logic).
    • Each decision service internally orchestrates further decision services, nested up to 3–4 levels deep for business modularity and reuse (e.g., Compliance → Sanctions Check → Country List → Center List).
  • Rule design:
    • ~500 rules (~60% decision tables up to 30 rows; the rest FEEL expressions).
    • Rules depend on multiple attributes from all three object levels (Order, Package, Item); many rules involve complex combinations of attributes across these levels.
    • Complex computation is implemented in Java BKMs, invoked via context boxed expressions/FEEL BKMs in the DMN.
  • Design reasoning:
    • DMN is invoked at the Order level (not the request level) to keep input objects manageable, to isolate business logic around each Order, and to parallelize efficiently across CPU cores from Java (see the sketch just after this list).
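
For reference, a minimal sketch of a per-Order decision service invocation using the public kie-dmn API; the namespace, model name, and class names are illustrative placeholders, not our actual ones.

import java.util.Map;

import org.kie.api.KieServices;
import org.kie.api.runtime.KieContainer;
import org.kie.api.runtime.KieRuntimeFactory;
import org.kie.dmn.api.core.DMNContext;
import org.kie.dmn.api.core.DMNModel;
import org.kie.dmn.api.core.DMNResult;
import org.kie.dmn.api.core.DMNRuntime;

public class OrderDecisionEvaluator {

    // Built once at startup and reused for every Order; evaluation does not mutate the runtime.
    private final DMNRuntime dmnRuntime;
    private final DMNModel mainModel;

    public OrderDecisionEvaluator() {
        KieContainer kieContainer = KieServices.Factory.get().getKieClasspathContainer();
        this.dmnRuntime = KieRuntimeFactory.of(kieContainer.getKieBase()).get(DMNRuntime.class);
        // Namespace and model name below are placeholders for the Main DMN model.
        this.mainModel = dmnRuntime.getModel("https://example.com/dmn/main", "MainModel");
    }

    public DMNResult evaluate(Map<String, Object> order, String decisionServiceName) {
        DMNContext context = dmnRuntime.newContext();
        context.set("Order", order);
        // Evaluate one of the three top-level decision services ("Pricing",
        // "Eligibility", "Compliance") independently for this Order.
        return dmnRuntime.evaluateDecisionService(mainModel, context, decisionServiceName);
    }
}

In this sketch the DMNRuntime and DMNModel are created once and shared across threads; only the DMNContext is created per evaluation.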


Optimization Attempts and Observations

  1. Chunking & parallel execution:
    • Chunked requests into sub-requests (sizes 1–4); each sub-request processed in its own thread, with Orders in the chunk processed in parallel using a separate thread pool.
  2. Splitting/flattening:
    • Pre-flattened Packages/Items in Java to simplify FEEL logic and avoid deep for loops in DMN. This helped code clarity but not performance.
  3. DMN model refactor:
    • Main DMN uses three independent decision services at the top level (each runs in parallel per Order). Each has nested dependencies up to three levels.
    • Attempted to split and invoke these main services in parallel from Java, aggregating results per Order.
  4. Engine/config tweaks:
    • Tried -Dorg.kie.dmn.compiler.execmodel=true; performance degraded.
    • No GC, DB, I/O, or lock-contention issues; the bottleneck appears to be purely CPU-bound.
  5. Scaling hardware:
    • Started with 8 nodes × 4 CPU cores (no node >70% CPU at peak).
    • Increased to 16 nodes; performance worsened and CPU per node dropped below 45%.
    • Not yet tried increasing CPU cores to 8 or 12 per node (hoping for some performance improvement; any thoughts?).
  6. Key Observation:
    • Removing some decision services or rules from the DMN (at the top or nested levels) immediately and noticeably improves performance for large requests. Performance scales non-linearly with rule count/nesting.
  7. Adding rules and deeper DRD nesting directly increases execution time per Order.


Traffic and Goals

  • Hourly load: 160,000 requests.
    • 10%: 10 Orders/request (~2s latency)
    • 10%: 20 Orders/request (~2s latency)
    • 80%: 1–5 Orders/request (500ms–1s latency)
  • Goal: Achieve sub-500ms latency for all request sizes, especially large ones.


Questions for the Community

  1. Are there Drools DMN engine/config limits, practical ceilings, or published benchmarks for large, modular, nested DRDs with 500+ rules?
  2. Is it possible to parallelize independent decision services within a single DMN execution (engine-side)?
  3. Could the combination of FEEL expressions (nested for loops) and heavy BKM usage be causing the performance issue, even when complex logic is offloaded to Java?
  4. Have you observed similar scaling/performance issues when increasing Drools DMN nodes/rules? 
  5. What are the minimum/ideal hardware requirements (CPU cores/node, JVM heap, etc.) to support this scale and complexity in Drools DMN?
  6. Are there performance enhancements in newer Drools DMN versions, or engine settings, that made a difference for large, modular models?
  7. Given that each Order-level DMN execution is independent, is there any caching strategy (at runtime or model level) that could improve performance or resource utilization in Drools DMN? (A rough sketch of what we have in mind follows this list.)
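
To make question 7 concrete, the kind of Order-level memoization we have in mind looks roughly like this. This is purely illustrative: the cache key construction and eviction policy are glossed over, and it only pays off if identical Orders recur often enough.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.kie.dmn.api.core.DMNResult;

public class OrderResultCache {

    // Keyed by decision service name plus a stable representation of the Order's
    // decision-relevant inputs. In production this should be a proper digest of a
    // canonical serialization; hashCode() below is only illustrative.
    private final ConcurrentHashMap<String, DMNResult> cache = new ConcurrentHashMap<>();
    private final OrderDecisionEvaluator evaluator; // the per-Order DMN invoker sketched earlier

    public OrderResultCache(OrderDecisionEvaluator evaluator) {
        this.evaluator = evaluator;
    }

    public DMNResult evaluate(Map<String, Object> order, String decisionServiceName) {
        String key = decisionServiceName + '|' + order.hashCode();
        // Memoization is only safe because DMN evaluation is side-effect-free,
        // so identical inputs always produce the same result.
        return cache.computeIfAbsent(key, k -> evaluator.evaluate(order, decisionServiceName));
    }
}

Whether the hit rate would justify such a cache depends entirely on how often identical Orders repeat in our traffic.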



Despite chunking, aggressive parallelism (at chunk and Order level, with separate thread pools), modular DRD design, and Java preprocessing, Drools DMN remains much slower than our previous system for large requests (~2s vs. <500ms).


Note: Our current system runs on IBM ODM (on nearly identical hardware) and is being migrated to Drools DMN. For the same requests and traffic, we achieve response times under 500ms with IBM ODM, but with Drools we have struggled to match that.


Any guidance on DMN design, configuration, hardware sizing, optimizations, or model/runtime caching to improve the performance of our application at this scale would be greatly appreciated.


Big Thank you in advance


Thanks

Chandra

Alex Porcelli

Jun 9, 2025, 1:47:35 PM
to drools...@googlegroups.com
Chandra,

Thank you for providing such detailed information. Given the
complexity and performance targets you've outlined, I'm not entirely
sure that community-level support will fully meet your needs. You may
require specialized professional services to achieve the desired
performance.

Additionally, it's important to note that directly comparing Drools
DMN with IBM ODM might not provide an accurate picture, as they are
fundamentally different systems. A more accurate comparison would be
ODM versus Drools DRL. While the DMN engine leverages some components
of the Drools rule engine, it operates quite differently; for
instance, DMN is side-effect-free, whereas ODM and Drools DRL rules
can mutate data in place.

I also noticed you're using an older version of Drools DMN with Java
8. I strongly recommend upgrading to Drools 10.0.0 or the upcoming
10.1.0, which support Java 17 and more recent versions of Spring Boot.
This upgrade alone could bring some performance improvements.

-
Alex