Persistent crashes w/ assert fails

Steven Hirsch

Dec 11, 2022, 8:55:53 PM
to uni...@googlegroups.com
Hi, Joerg, et al.

This is becoming chronic on my 'Frankenstein' 11/73 system:

/root/10.01_base/2_src/arm/qunibusadapter.cpp:1011: \
void qunibusadapter_c::worker_device_dma_chunk_complete_event(): \
Assertion `dmareq != __null' failed.

./211BSD_du0_73.sh: line 3: 2147 Aborted \
~/10.03_app_demo/4_deploy/demo --verbose --cmdfile 2.11BSD_du_73.cmd $*

I can trigger it reliably by running fsck on the root volume under single
user mode on 2.11 BSD, but have seen it periodically with RT-11 and
RSX-11.

System detail:

Heath H-11 enclosure and power supply with backplane updated to propagate
BA18-BA21.

DEC M8192-YB CPU w/ FP chip

QBone emulating console, clock, memory and disk storage:

----------- cut here ------------

# inputfile for demo to select a MSCP disk in the "device test" menu.
# Read in with command line option "demo --cmdfile ..."
d # device menu

en kw11

sd dl11
p p ttyS2
p addr 177560
p iv 60
en dl11 # use emulated serial console

pwr # reboot PDP-11
.wait 3000 # wait for PDP-11 to reset
m i # install max QBUS memory

# Deposit bootloader into memory
m ll du.lst

en uda # enable UDA50 controller

# mount 2.11BSD in drive #0 and start
en uda0 # enable drive #0
sd uda0 # select drive #0

p type RD54
p image root.rd54 # mount image file with test pattern

.print MSCP drives ready.
.print UDA50 boot loader installed.
.print Start 10000 to boot from drive 0, 10010 for drive 1, ...
.print Reload with "m ll"
.print
.print Set terminal to 9600 7O1
.print At "73Boot" prompt, just hit RETURN.
.print Login as "root", log out to enter multi user run level.

-----------------------

All CPU and memory diagnostics pass.

Would appreciate any tips for troubleshooting this issue. If there is
anything in particular you want captured, let me know. I have a 32-ch
logic analyzer available.

Steve

Joerg Hoppe

Dec 12, 2022, 12:46:05 AM
to uni...@googlegroups.com
Hi Steven and all,

Looks like something bad in the DMA timing ... anybody else suffering
from this?

Special to Steven is the modified Heath H-11 system.

For debugging it would be nice to see this on a DEC system.

Steven, you say this is "becoming chronic".
This may be an indicator of something going to fail in the M8192,
or other parts of your system. Can you swap the M8192 ... or the QBone?
It may even be a power supply issue or a backplane contact.

For the logic analyzer, you would generate a GPIO trigger signal before
the assertion:
    // set pin associated with LED 1 to catch "good" DMA cycles
    ARM_DEBUG_PIN2(1);
    if (dmareq == NULL)
        ARM_DEBUG_PIN3(1); // set pin associated with LED 3 for error
    assert(dmareq != NULL);
    ARM_DEBUG_PIN2(0);

The trace should show at least the QBUS IRQ and DMA signals.
One would make multiple shots and compare the regular good DMA cycles
with the deadly one.
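
A minimal, self-contained sketch of that trigger logic, in case someone wants to check it off-target first. The two `ARM_DEBUG_PIN` macros here are stand-ins that only record the last level in an array, and `dma_request_stub` and `check_dmareq` are hypothetical names that do not exist in the QBone sources. On the real hardware the same lines would sit next to the existing `assert` in `worker_device_dma_chunk_complete_event()`.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the real debug-pin macros (assumption): they only record
// the last level written, so the logic can be exercised without hardware.
static int debug_pin_level[4] = {0, 0, 0, 0};
#define ARM_DEBUG_PIN2(val) (debug_pin_level[2] = (val))
#define ARM_DEBUG_PIN3(val) (debug_pin_level[3] = (val))

// Hypothetical placeholder for the real DMA request type.
struct dma_request_stub {
    int chunk_words;
};

// Returns true if the request pointer is valid, false in the error case.
bool check_dmareq(const dma_request_stub *dmareq) {
    ARM_DEBUG_PIN2(1);          // raised on every pass, good or bad
    bool ok = (dmareq != NULL);
    if (!ok)
        ARM_DEBUG_PIN3(1);      // error marker: trigger the analyzer on this edge
    // the real code would execute assert(dmareq != NULL) at this point
    ARM_DEBUG_PIN2(0);
    return ok;
}
```

Triggering the analyzer on the rising edge of the error pin, with enough pre-trigger depth, should capture the bus activity leading up to the failure, while the other pin frames each ordinary cycle for comparison.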

kind regards,

Joerg



@Steven: can you add your backplane

Steven Hirsch

Dec 12, 2022, 8:52:30 AM
to Joerg Hoppe, uni...@googlegroups.com
On Mon, 12 Dec 2022, Joerg Hoppe wrote:

> Hi Steven and all,
>
> Looks like something bad in the DMA timing ... anybody else suffering from
> this?
>
> Special to Steven is the modified Heath H-11 system.
>
> For debugging it would be nice to see this on a DEC system.
>
> Steven, you say this is "becoming chronic".
> This may be an indicator of something going to fail in the M8192,
> or other parts of your system. Can you swap the M8192 ... or the QBone?
> It may even be a power supply issue or a backplane contact.

That's possible, but I've had these on and off since the beginning and did
see them with the original LSI-11/03 CPU. I think the reason they are
more prevalent now is that I'm running very disk-intensive environments.

I will check the power supply, but have no means of swapping the QBone.

> For the logic analyzer, you would generate a GPIO trigger signal before the
> assertion:
>     // set pin associated with LED 1 to catch "good" DMA cycles
>     ARM_DEBUG_PIN2(1);
>     if (dmareq == NULL)
>         ARM_DEBUG_PIN3(1); // set pin associated with LED 3 for error
>     assert(dmareq != NULL);
>     ARM_DEBUG_PIN2(0);

So, this new bit of code needs to be added to qunibusadapter.cpp at or
around line 1011?

> The trace should show at least the QBUS IRQ and DMA signals.
> One would make multiple shots and compare the regular good DMA cycles with
> the deadly one.

I'll set up to grab a trace.

If anyone else has thoughts, let me know.

Steve

> @Steven: can you add your backplane

The backplane is a Heath product that's roughly equivalent to an H9270,
but completely based on a double-sided PCB - no wirewrap pins. I brought
through BA18-BA21 using #30 wirewrap wire (following the serpentine path).

I found that Heath presented the LTC signal only on the 4 right-hand
positions and added a jumper to ensure that all slots receive the signal.
Could that possibly be a problem? But, again, I was seeing the assert
fail with an unmodified backplane and M7264 CPU.

The only other mod was to wire-OR AF1 and AH1 to accommodate variation in
where the CPUs present SRUN.

Steven Hirsch

Dec 12, 2022, 10:39:10 AM
to Joerg Hoppe, uni...@googlegroups.com
On Mon, 12 Dec 2022, Joerg Hoppe wrote:

> It maybe even a power supply issue or a backplane contact.

I carefully cleaned the card-edge fingers and hit the backplane connectors
with de-oxidant. The crashes continue.

The power supply voltages look good, with +5 at 5.008 V. Watching on the
scope with AC coupling, I see voltage excursions mostly at +/- 100 mV with
occasional peaks at +/- 300 mV. Should it be quieter than that?
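
To put those numbers in proportion, here is a small sketch that expresses an excursion as a percentage of the measured rail. `ripple_percent` is a hypothetical helper written only for this arithmetic; it says nothing about what the CPU or the QBone actually tolerate.

```cpp
#include <cassert>

// Excursion (in millivolts) as a percentage of the measured rail voltage.
double ripple_percent(double excursion_mv, double rail_v) {
    assert(rail_v > 0.0);   // a zero or negative rail makes no sense here
    return (excursion_mv / 1000.0) / rail_v * 100.0;
}
```

With the 5.008 V rail, the typical 100 mV excursions work out to about 2.0% and the occasional 300 mV peaks to about 6.0%.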
