Replacement of Nested Set Model with Closure Table Model

228 views
Skip to first unread message

Edgar Rodríguez Silva

unread,
Apr 16, 2026, 6:02:22 AMApr 16
to AtoM Users
Hello;

We have completed a new development that replaces our current nested set model with a closure table model. This change significantly improves performance, particularly for high-volume files.

We would like to include this development in upcoming versions. Could someone please provide us with guidance or instructions on how to proceed with this process?

Thank you for your assistance.

Sarah Mason

unread,
May 21, 2026, 12:42:58 PMMay 21
to AtoM Users
Hi Edgar,

Thank you for getting in touch about sharing code with the AtoM project. Further to our emails, I wonder if you would like to share the design specification for the closure table model here so the community can review it and share their thoughts. As it is a major architectural change, getting input from other AtoM users will help drive this process.

Thank you for coming to the Community Development meeting today -- for anyone who couldn't attend the meeting minutes have an initial discussion of this.

I'd like to put a call out to the AtoM community to review the specification and provide any questions or feedback to this group. We will also be discussing this proposal at the next AtoM Contributors Gathering on Thursday, 18 June 2026. It would be great to hear from members of the community things such as: 

- how often you rebuild the nested set
- how many records you have and how long the rebuild takes
- any other thoughts and questions about the design analysis when shared.

Thank you again, Edgar, for sharing your development work with the project! I hope we can get a robust discussion about this. We will be discussing more internally at Artefactual and I will be sharing our questions and feedback here.

Best wishes,
Sarah Mason
Contributor Success Specialist

Edgar Rodríguez Silva

unread,
May 27, 2026, 4:20:26 AMMay 27
to ica-ato...@googlegroups.com
Hello,

We're pleased to share the design analysis. In the coming weeks, we'll have some performance data comparing Nested Sets and Closure Tables on a catalog of approximately 600,000 records that we're upgrading to version 2.10.

Best regards

--
You received this message because you are subscribed to a topic in the Google Groups "AtoM Users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/ica-atom-users/aW9dDL1TWGc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to ica-atom-user...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/ica-atom-users/cd0860ce-4f28-449b-98a1-763e91c7ecd3n%40googlegroups.com.


--

Edgar Rodríguez Silva
Project Manager

(+34) 666 82 15 88 - 881 97 55 76
Rúa Hedras nº4 - 2A | 15895 | Milladoiro A Coruña, Spain

closure-table-design-analysis_en.pdf

Edgar Rodríguez Silva

unread,
Jun 11, 2026, 10:14:53 AMJun 11
to ica-ato...@googlegroups.com
Hi Sarah and the AtoM community,

Following up on our previous message, we are sharing the performance report comparing the Nested Set Model and the Closure Table Model. This analysis was conducted using a sample catalog of 600,000 records.

We hope this data provides valuable insights for the community's review and the upcoming Contributors Gathering.

Best regards,


AtoM Performance report_Nested set Vs Closure Table.pdf

Daniel Lovegrove

unread,
Jun 11, 2026, 11:50:31 AMJun 11
to AtoM Users
Hi Edgar,

The performance improvements of the closure table model look promising versus the current nested set hierarchy implementation. The most important takeaway in my mind here is the footnote on page 5: that nested set write operations scale with the total number of rows O(n), and closure table write operations scale with depth. I imagine the maximum depth for most archives is not very large; I would consider that to be a fixed O(1) cost which scales with total database size much better than the nested set.

I have recently encountered issues with the nested set re-build process taking a very long time for a large AtoM instance I maintain. I made a pull request that was accepted to improve the nested set rebuild process by about 10x (see here: https://github.com/artefactual/atom/pull/2273). This doesn't change the fact that the nested set re-build scales with the database's size, but it does make re-builds much quicker. Since this change will be available in 2.11, would you be willing to re-test once those changes are available in the newest release? I'm curious to see how much of an improvement the changes I've made make.

Regards,
-Daniel L.

Edgar Rodríguez Silva

unread,
Jun 12, 2026, 3:07:19 AMJun 12
to ica-ato...@googlegroups.com
Hi Daniel,

Great work, that seems like a significant improvement. What volume of records does the AtoM instance where you tested this have?

We will try to run some comparisons with the patch you shared. Although we are about to deploy version 2.10 with the closure table for a client, we might be able to backport your patch to 2.10 and compare the performance.

Best regards,


Daniel Lovegrove

unread,
Jun 12, 2026, 10:02:12 AMJun 12
to AtoM Users
Hi Edgar,

The largest instance I manage holds 3.3 million records. It took about 20 minutes to rebuild the nested set before the performance improvements, and it now takes about 2 minutes.

Sounds great about testing it, looking forward to seeing the results.

Best,
-Daniel

Johan Pieterse

unread,
Jun 23, 2026, 7:06:00 AM (7 days ago) Jun 23
to AtoM Users

Hi Edgar,

Thanks for looping us in. We run several large, heavily-customised AtoM 2.10 deployments, so the Nested Set write-cost and fragility problem is one we feel directly, we're glad to see it being tackled, and the Xercode design is a solid, well-structured starting point. A few comments from a core-adoption and ecosystem-compatibility angle, roughly in priority order:
1. The retained lft is the central design question. The doc keeps lft for sibling ordering while moving ancestry to the closure table. But AtoM's lft is a global preorder index, maintaining it on insert/move is the very O(n) renumber the proposal is trying to eliminate. As written, the write-cost win looks partial (reads/queries improve; writes don't). The strategic fork worth deciding up front:
  - Conservative: 
    keep global lft, accept that writes still renumber, closure mainly buys query simplicity and integrity, not write performance; 
    or ll solution: decouple sibling ordering into a per-parent sort_order (or a sparse/fractional rank) and drop global lft entirely, so inserts/moves become genuinely cheap. This is the bigger change but it's the one at delivers the performance story.
Picking (a) or (b) changes the whole compatibility and migration surface, so it should be resolved before implementation.

2. The backward-compatibility audit is the real project. Section 11 (finding every lft/rgt/rgt-lft>1 consumer in modules, tasks, jobs, and arElasticSearchPlugin) is one page but is where the risk and effort actually live, and for core it extends to the entire plugin/theme ecosystem, not just base code. EAD/finding-aid export, treeview, browse, CSV import/move, reference-code generation and ACL all walk the tree today. I'd suggest a complete inventory of lft/rgt usages published alongside the design, with a migration note for each, before any merge.

3. Transaction integrity. ClosureMysqlProvider writes via raw PDO, outside Propel. The closure update must run in the same transaction as the model save, or you reintroduce partial-state corruption one layer down, ironically the fragility this is meant to remove. Worth making explicit.

4. Coexistence vs replacement. We'd recommend full replacement per model with a clean migration + regen task, not permanent configurable dual-run. Keeping Nested Set and Closure Table in sync indefinitely roughly doubles the hierarchy write paths and the bug surface (the doc acknowledges this). A deploy-time flag plus the regeneration task gives the rollback safety without the permanent dual-write cost.

5. Migration & rollback for existing installs. Tens of thousands of nodes per install, the upgrade needs a one-shot, resumable closure-build that's been tested on real large datasets, plus a documented rollback. The tools:closuretable task covers generation; an explicit upgrade/downgrade story for production sites would help adoption.

6. Elasticsearch. Agreed that 4a (async idempotent reindex reading live closure) is the right first step and 4b is future work, 4b's terms filter hits the 65,536 limit on large fonds and breaks native hierarchical aggregations. One caveat to document: 4a leaves a visible stale-hierarchy window in search/browse for minutes after a large subtree move.

7. DB version floor. Recursive CTEs require MySQL 8.0+ / MariaDB 10.2+. Worth stating as a hard minimum and checking it against AtoM's currently supported versions.

8. Show the numbers. The entire justification is performance, but there are no benchmarks. Before/after timings for insert/move/delete and descendant/ancestor reads on a realistic large tree (say 100k+ nodes) would make the case far more compelling and let core set expectations.

We're happy to help, we can review the lft/rgt consumer inventory, and we have large customised 2.10 datasets we could use to test migration and benchmark the read/write paths if that's useful.

Best wishes,
Johan Pieterse
The Archive and Heritage Group (Pty) Ltd

Edgar Rodríguez Silva

unread,
Jun 23, 2026, 11:12:23 AM (7 days ago) Jun 23
to ica-ato...@googlegroups.com
Hi Johan,

Thank you for your input.

It has been 18 months since we completed the implementation for AtoM version 2.8. That gave us the necessary experience to correct some of the mistakes we made and apply improvements to the refactoring for AtoM 2.10.

As you rightly mention, using lft as a sorting field among sibling nodes was a mistake. We had to implement several workarounds for global sorting use cases (exports, reports, etc.); in particular, the EAD export caused several issues. We have corrected this by creating a new field for sibling sorting and establishing a global sorting in ES (lft and rgt are no longer used and become dead fields when the closure table is enabled).

Regarding coexistence versus replacement: although we initially planned the project as a replacement, we ultimately decided to maintain coexistence via a configuration flag. This allows the migration from one system to another to be planned, given that in a system with several million objects, the locking window for the transition can exceed 24 hours. It also allows instances with few records (< 20k) to skip the transition, or revert to nested set if decided for any reason. In any case, removing coexistence is trivial if that is preferred.

In ES, we opted for option 4a. Initially, both options were drafted with the intention of finding a solution to the problem of reindexing all nodes in a subtree when it is moved, but we did not reach a satisfactory solution. Any ideas to address this case are welcome. In the meantime, option 4a is not as time-consuming as it seemed before running the performance tests.

Thank you for your feedback regarding transactions; it was something we had pending, and we are now working on it.

In our message from June 11, we posted a performance comparison between the two systems.

Thank you for your offer. We already have the refactored code ready for version 2.10. We would love to receive suggestions or improvements, and any collaboration is welcome.

Best regards,


pieters...@gmail.com

unread,
Jun 23, 2026, 12:24:52 PM (7 days ago) Jun 23
to ica-ato...@googlegroups.com

Hi Edgar,

 

Thanks, that's a thorough response, and it's good to hear the sibling-sort field and the transaction work are in hand. And I'll happily concede the coexistence decision: I'd underweighted the migration lock window.

For a multi-million-object instance a 24h+ transition is a real operational constraint, and a config flag that lets large sites stage the migration, small sites (<20k) skip it, and anyone roll back is clearly the more pragmatic call. Agreed.

 

On the subtree-move reindex, here's an idea that may give you the middle ground 4a and 4b didn't, while keeping the denormalised ancestors field (so aggregations and ACL filters that depend on it don't have to change):

 

The key observation is that on a move, the ancestor delta is identical for every descendant in the moved subtree. Moving node N from P1 to P2 only changes the prefix above N: every descendant loses N's old ancestor chain and gains P2's new one; the portion of the chain within the subtree (N downward) is unchanged. So, you don't need to recompute each document from the closure table, you can apply one delta to the whole subtree server-side:

 

POST <index>/_update_by_query?wait_for_completion=false&conflicts=proceed

{

  "query":  { "term": { "ancestors": N } },     // selects exactly N's subtree — every descendant already contains N

  "script": {

    "lang": "painless",

    "source": "ctx._source.ancestors.removeAll(params.oldPrefix); ctx._source.ancestors.addAll(0, params.newPrefix);",

    "params": { "oldPrefix": [<ancestors of N before the move>], "newPrefix": [<P2 + its ancestors>] }

  }

}

 

Selecting the subtree with term: ancestors = N sidesteps the 65k terms limit entirely (no need to pass the descendant id list), and the whole rewrite is a single async request that ES/OpenSearch executes server-side, no per-document roundtrips from PHP and no per-document DB reads. It works on both ES 5.x and OpenSearch.

 

It layers naturally on your 4a: the Gearman job issues this one _update_by_query instead of looping documents. Two caveats: under concurrent moves on overlapping subtrees, you'd want conflicts=proceed plus a re-run (or version_type=external as you already planned), and if you prefer maximum robustness over speed, you can keep the per-document recompute-from-closure path as a fallback. But for the common case it turns an O(S) reindex into one bulk operation.

 

On the performance report (2026-05-27), I've now been through it, and it makes the write case conclusively. Testing two real production datasets at 60k and 600k, with ES isolated via the on/off passes, is exactly the right design, and the 10× volume step is the decisive result: A3 going from 341× to 4,092× as nodes scale demonstrates O(n) vs O(1) write cost directly, not merely "closure is faster." The candour about where Nested Set wins (B1/B2 point reads, sub-millisecond) makes the rest credible. A few constructive observations and one data gap that connects to the ES question above:

 

1. Concurrency would strengthen the case further. The measurements are single-operation latency (10 sequential reps). The real production pain of Nested Set is that an lft/rgt renumber locks the table and blocks all other users, so under concurrent load the write advantage is even larger than the single-thread numbers suggest. A small concurrency pass would understate nothing and likely make the gap more dramatic.

2. The large subtree move at scale is the missing number, and it's the one most relevant to the ES discussion above. A4 (move-subtree) ran only at 60k with ~9 and ~84 descendants; moving a ~10k-node subtree in a 600k tree wasn't measured. That's precisely the scenario that drives the expensive ancestors reindex, so there's currently no closure-side figure for the case the 4a/_update_by_query strategy is meant to address. It would be very useful to have it.

3. B3's 473× is partly the Propel path, not the data structure alone. Since the Nested Set side hydrates full Propel objects with i18n N+1 queries, B3 is really "hydrated objects vs a JOIN returning rows." It's a fair as-AtoM-actually-behaves comparison, but worth framing as an implementation win rather than a purely structural one (a raw WHERE lft BETWEEN returning columns would also be fast). The memory_peak_kb figures for B3 would be a nice complement here, closure should avoid the large allocation of hydrating thousands of objects.

4. Closure-table size isn't reported. Given O(n·depth) rows, the 600k dataset is presumably several million closure rows, and you note A5-delete slows at 600k for exactly that reason. Publishing the row counts and table/buffer-pool footprint would round out the picture and likely explain the B2 cold-cache variance (P95 ≈ 35 ms).

5. The migration build looks like it could be much faster. ~1,000 records/30s (17h for 2M) is the unoptimised per-record path. A set-based build, seed depth-0, then iteratively INSERT depth+1 by joining the closure to the parent edges, is typically orders of magnitude faster and could shrink the maintenance window from hours to minutes, which also softens the coexistence trade-off.

 

None of these change the conclusion, the report already establishes that Nested Set is unviable past ~30–50k nodes, they'd just make it airtight and fill in the one scenario (large-subtree move at scale) that both documents currently leave open.

 

And yes, we'd be glad to collaborate. I'm happy to review the refactored 2.10 code, particularly the lft/rgt consumer inventory and the ES integration. One honest note on testing, our own instances are modest in size, so where we can help most is code review and functional-correctness testing rather than million-row performance runs; if that's useful, send the branch over.

 

Groete / Regards

Johan Pieterse

082 337-1406

--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-user...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/ica-atom-users/CAGWhycqEbVEtuq5LSP60dgAquWTUto9MDugvA94WFf9J8_nO%3DA%40mail.gmail.com.

Reply all
Reply to author
Forward
0 new messages