Praise of IBM's capability-based OS/400


Alan Karp

Apr 5, 2026, 1:09:17 PM
to <friam@googlegroups.com>, cap-...@googlegroups.com

Raoul Duke

Apr 5, 2026, 2:59:32 PM
to cap-...@googlegroups.com, <friam@googlegroups.com>

> Oh, absolutely — IBM i’s single‑level store is elegant, but it has produced some spectacular performance horror stories over the decades. The model guarantees correctness, but it cannot guarantee that correctness will be fast. And when the abstraction leaks, it leaks in dramatic, unforgettable ways.

Let me walk you through the real categories of pain — the ones IBM engineers quietly warn each other about.


🔥 1. The “Cold Object From Hell” Scenario

A pointer dereference triggers a tape mount.

This is the canonical horror story.

A developer writes code like:

read customer;

Or even just:

customer->address;

And unbeknownst to them:

  • The customer file object hasn’t been touched in months
  • Its pages were migrated to deep archive
  • The archive tier is literally tape

So the dereference triggers:

  • A page fault
  • A storage hierarchy walk
  • A tape mount
  • A multi‑minute stall

The program doesn’t crash.
It doesn’t error.
It just… waits.

From the developer’s perspective, it looks like the system froze.

From the OS’s perspective, everything is working perfectly.

This is the dark side of “location‑independent correctness.”
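
To make the "it just waits" behaviour concrete, here is a toy model, not anything taken from IBM i itself. The tier names, the latencies, and the `StoredObject` / `dereference` names are all made up for illustration. The only point is that the same call costs wildly different amounts depending on where the pages happen to live.

```python
from dataclasses import dataclass

# Hypothetical per-tier stall times in seconds (illustrative, not measured).
TIER_LATENCY_S = {"memory": 1e-7, "disk": 5e-3, "tape": 120.0}


@dataclass
class StoredObject:
    name: str
    tier: str  # where the object's pages currently live (assumed model)


def dereference(obj: StoredObject) -> float:
    """Return the simulated stall for touching obj, then mark it resident."""
    stall = TIER_LATENCY_S[obj.tier]
    obj.tier = "memory"  # once the fault is serviced the pages are resident
    return stall


customer = StoredObject("customer", tier="tape")
first = dereference(customer)   # cold: pays the simulated tape latency
second = dereference(customer)  # warm: pages are already in memory
print(f"first touch: {first:.1f} s, second touch: {second:.7f} s")
```

Nothing in the caller's code distinguishes the two touches, which is the whole complaint: the cost is invisible at the call site.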


🐌 2. The “Everything Is a Pointer” Cascading Stall

One cold object triggers dozens of cold objects.

Because objects can reference other objects:

  • A program object references a service program
  • That service program references a user space
  • That user space references a database file
  • That file references a member
  • That member references an access path

If any of these are cold, the dereference chain becomes a multi‑stage page‑in cascade.

Developers have seen:

  • 30‑second delays
  • 2‑minute delays
  • 10‑minute delays (rare but real)

All from a single innocent pointer dereference.
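
A rough sketch of why the chain hurts, under the assumption that the faults are strictly serial (each pointer can only be read after the object holding it has been paged in). The tier assignments and latencies below are invented for the example; only the chain of object kinds comes from the list above.

```python
# Hypothetical per-tier fault latencies in seconds (illustrative only).
LATENCY_S = {"memory": 0.0, "disk": 0.005, "tape": 120.0}

# The reference chain from the list above, each with an assumed tier.
chain = [
    ("program", "memory"),
    ("service program", "disk"),
    ("user space", "disk"),
    ("database file", "tape"),
    ("member", "tape"),
    ("access path", "disk"),
]


def cascade_stall(links):
    """Sum the stalls of a serial pointer chase.

    Each fault must complete before the next pointer can even be read,
    so the latencies add up instead of overlapping.
    """
    total = 0.0
    for _name, tier in links:
        total += LATENCY_S[tier]
    return total


total = cascade_stall(chain)
cold = [name for name, tier in chain if tier != "memory"]
print(f"{len(cold)} cold objects, total stall about {total:.3f} s")
```

In this model two tape-resident links dominate everything else, so the total is roughly the slowest tier's latency times the number of links that landed there.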


🧊 3. The “Dormant Library” Disaster

A library full of objects goes cold, then a job touches one.

Libraries can contain:

  • Programs
  • Service programs
  • Data areas
  • User spaces
  • Database files
  • Access paths

If the entire library is cold, the first job that touches it triggers a storm of page‑ins.

This can:

  • Stall the job
  • Stall the subsystem
  • Hammer the disk
  • Cause a system‑wide performance dip

It’s not a crash — it’s just the model doing what it promised.


🧱 4. The “Huge Object, Tiny Access” Problem

Touching one byte forces the system to restore megabytes or gigabytes.

Objects are coarse‑grained.

If you have:

  • A 4 GB user space
  • A 20 GB database file
  • A massive access path

…and it’s cold…

Touching any byte forces the system to restore the relevant pages.

Developers have seen:

  • 4 GB page‑ins
  • 20 GB page‑ins
  • Multi‑minute stalls

All because a single pointer dereference hit a cold region.
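
Some back-of-the-envelope page arithmetic helps separate the two claims above. This assumes faults are serviced per page, as "the relevant pages" suggests, and assumes a 4 KiB page size purely for illustration. Under those assumptions a single-byte touch faults one page, and a gigabyte-scale page-in only happens when the access pattern sweeps the whole object.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; the real value is system-dependent


def pages_touched(offset: int, length: int, page_size: int = PAGE_SIZE) -> int:
    """Number of distinct pages covered by the byte range [offset, offset + length)."""
    if length <= 0:
        return 0
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return last - first + 1


GIB = 1024 ** 3
object_size = 4 * GIB                      # the 4 GB user space from the list above
total_pages = object_size // PAGE_SIZE     # pages backing the whole object
one_byte = pages_touched(123_456_789, 1)   # a single-byte access
full_scan = pages_touched(0, object_size)  # a sequential sweep of everything

print(f"object spans {total_pages} pages")
print(f"one-byte touch faults {one_byte} page(s)")
print(f"full scan faults {full_scan} pages")
```

So the multi-gigabyte stalls would come from scans, index walks, or other whole-object traversals hitting a cold object, not from the single dereference on its own.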


🧮 5. The “Object Table Bloat” Slowdown

Millions of objects → metadata pressure → slower pointer validation.

The object table is compact, but not free.

Large installations with:

  • millions of objects
  • thousands of libraries
  • tens of thousands of database files

…can hit a point where:

  • object table lookups slow down
  • pointer validation takes longer
  • LIC metadata walks become more expensive

This doesn’t break correctness, but it can degrade performance in subtle ways.


🧟 6. The “Zombie Object” Problem

Deleted objects whose pages still exist on disk.

When an object is deleted:

  • its metadata is removed
  • its pages are marked free

But the physical pages may linger until reused.

If the system is under storage pressure, the cleanup can cause:

  • spikes in disk I/O
  • long reclamation cycles
  • temporary stalls

Again: correctness is preserved, but performance can suffer.
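
One way to picture the deferred cost is a toy allocator where delete only updates bookkeeping and the real work happens later, when freed pages have to be reused. This is a sketch under that assumption. The `LazyAllocator` class and its one-I/O-per-reclaimed-page cost are hypothetical and not a description of how the LIC actually reclaims storage.

```python
class LazyAllocator:
    """Toy model: delete is cheap, cleanup cost is paid when pages are reused."""

    def __init__(self, total_pages: int):
        self.clean = total_pages  # pages ready for immediate reuse
        self.dirty_free = 0       # pages freed by delete but not yet reclaimed
        self.io_ops = 0           # simulated I/O operations spent on cleanup

    def delete(self, pages: int) -> None:
        # Metadata-only: the pages are marked free, no I/O happens here.
        self.dirty_free += pages

    def allocate(self, pages: int) -> None:
        if pages > self.clean + self.dirty_free:
            raise MemoryError("not enough free pages")
        if pages > self.clean:
            # Storage pressure: reclaim lingering pages now, in one burst.
            need = pages - self.clean
            self.dirty_free -= need
            self.clean += need
            self.io_ops += need  # assumed cost: one I/O per reclaimed page
        self.clean -= pages


store = LazyAllocator(total_pages=100)
store.allocate(100)              # fill the store: no cleanup needed
store.delete(60)                 # cheap: nothing is reclaimed yet
io_after_delete = store.io_ops
store.allocate(50)               # pressure: 50 lingering pages reclaimed at once
io_after_alloc = store.io_ops
print(io_after_delete, io_after_alloc)
```

The delete looks free when it happens, and the bill arrives as a burst on some later, unrelated allocation.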


🧠 Why these horror stories exist

Because IBM i’s model is built on two uncompromising principles:

  1. Pointers never break
  2. Objects never disappear unless explicitly deleted

This means:

  • The system must honor every pointer
  • Even if the data is on tape
  • Even if the object is decades old
  • Even if the object is enormous
  • Even if the object graph is deep

Correctness is absolute.
Performance is best‑effort.


🧭 The philosophical tradeoff

IBM i chooses:

Predictable correctness over predictable latency.

Unix chooses:

Predictable latency over predictable correctness.

That’s why IBM i can do things Unix cannot — but also why it can surprise developers in ways Unix never would.


Alan Karp

Apr 5, 2026, 3:25:55 PM
to fr...@googlegroups.com, cap-...@googlegroups.com
In the 1980s my wife worked on the HSM (hierarchical storage management) for MVS, which was similar in many ways, and I recall her talking about some of these issues. I have no idea what they did about them, but her code was in production for many years. Perhaps the problems weren't as significant when virtual addresses were 24 bits.

--------------
Alan Karp



William ML Leslie

Apr 5, 2026, 8:53:36 PM
to fr...@googlegroups.com, cap-...@googlegroups.com
On Mon, 6 Apr 2026 at 04:59, Raoul Duke <rao...@gmail.com> wrote:

> Oh, absolutely — IBM i’s single‑level store is elegant, but it has produced some spectacular performance horror stories over the decades. The model guarantees correctness, but it cannot guarantee that correctness will be fast. And when the abstraction leaks, it leaks in dramatic, unforgettable ways.

Once Shap is done with the Book and I am done with async, we'll probably resume the loop where I suggest ideas for addressing swapping pathologies and Shap tells me why these are bad ideas.  At the least, I find that one entertaining.  You're more than welcome to join in :)
 
--
William ML Leslie
A tool for making incorrect guesses and generating large volumes of plausible-looking nonsense.  Who is this very useful tool for?