QBone and memory diags.


Jay Cotton

Mar 21, 2025, 1:43:36 AM
to UniBone
I have a problem with my pdp 11/53.  It seems to have lost the ability
to boot from rqdx3, tk50 or enet.  And of course Qbone.

I can load and run RT11 5.7 via the tu58 emulator (slu1).  It can't
see any of the above devices either.

My working hypothesis is a bad bus driver chip on the CPU board.
I need to figure out which part is bad and replace it.

To that end, is there a memory diag that runs on QBone that will back
test the RAM on the CPU board, and thus reveal the bad driver bit?
It could also be a bad address driver.  I have seen a message about
some address missing the most significant 1 byte.

BTW I could also boot XXDP and run a diagnostic that way, but I can't
seem to find the correct test code.

tnx
jc
 

Jay Cotton

Mar 21, 2025, 1:52:35 AM
to UniBone
Here is the error I am seeing.

KDJ11-D/S   4.57
Error, see troubleshooting section in Owner's manual for assistance
RAM    VPC=025512  PA=17605512  02040000/077776 <> 177776

Jay Jaeger

Mar 21, 2025, 9:33:16 AM
to UniBone
Go to the 11/53 owner's manual supplement to figure out how to interpret the addresses (and data) in the message.  Figure out if the failing address is in the physical memory on the QBus or the QBone.

http://bitsavers.org/pdf/dec/pdp11/microPDP11/AZ-GPTAA-MC_MicroPDP11_53_System_Supplement_Jun86.pdf

VPC is the virtual PC address in the test (not useful).  same with the PA (physical address corresponding to VPC, I think.)

02040000 is the failing memory address - presumably a physical address as anything else would be useless.   You should determine if that is on your memory board or the QBone.

077776 <> 177776 is a DATA value NOT AN ADDRESS.   That is a DATA failure on the RAM.  It is dropping the most significant bit.
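As an illustration only (not part of the original post), the "dropped bit" reading of the error line can be checked by XOR-ing the two octal data words. The function name is made up for this sketch, and it does not matter which of the two values the ROM treats as expected.

```python
def differing_bits(a: int, b: int, width: int = 16) -> list[int]:
    """Return the bit positions (0 = LSB) where two words differ."""
    diff = (a ^ b) & ((1 << width) - 1)
    return [bit for bit in range(width) if diff & (1 << bit)]


# Data values copied from the ROM error line "077776 <> 177776" (octal).
print(differing_bits(0o077776, 0o177776))  # only bit 15, i.e. DATA15, differs
```

A single differing bit at position 15 is what points at one data line rather than at an address problem.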

As a guess, the most likely suspect is the memory board or your QBone.  Try pulling the QBone and see what happens.

YES, you can run memory diagnostics from the QBone, but that only works properly if the QBone itself is working properly.  Put it in an empty QBus backplane and run its driver and memory tests.  Make sure the QBone jumpers are correctly set up.

As for not booting from the other devices, that feels like a request/grant failure - make sure you followed the rules for how you put boards in a PDP-11/53 QBus backplane (again, see the MicroPDP11 owner's manual and the 11/53 supplement.)  Hopefully you didn't stick a board in an incorrect kind of slot and end up with damage.  (Avoid the CD slots - the right hand side of the first three slots for double width boards) (see your Qbone sheet or the MicroPDP11 Technical manual for the diagram).

This is figure-out-able using the resources available to you online:  The QBone documentation, and the 11/53 owners manual.  G00gle is your friend.

JRJ

Joerg Hoppe

Mar 21, 2025, 10:17:14 AM
to uni...@googlegroups.com
Hi,
> Go to the 11/53 owner's manual supplement to figure out how to interpret the addresses (and data) in the message.  Figure out if the failing address is in the physical memory on the QBus or the QBone.

In standard config, QBone's console lists the addresses it is emulating.
So you can see whether 02040000 is CPU onboard memory.

(I assume you don't have a separate QBUS memory beside the on-CPU chips)

You can try to probe memory (CPU and/or QBone) from the PDP-11's ODT console,
doing EXAMs/DEPOSITs from there. Read/write values with the MSB (DATA15, 0100000) set.
If the MSB is wrong on all addresses, it's a bus driver fault on the CPU board.
Or some other QBUS card is holding DATA15 low on the QBUS due to a defect.
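A minimal sketch of that reasoning, assuming you note down each ODT deposit and the value read back afterwards. The probe values below are invented for illustration; they are not measurements from this machine.

```python
def stuck_low_bits(probes, width=16):
    """probes: iterable of (address, written, read_back) tuples.

    Returns the bits that were written as 1 at least once but never
    came back as 1 at any probed address.
    """
    written_high = 0    # bits written as 1 somewhere
    ever_read_high = 0  # bits that read back as 1 when written as 1
    for _addr, written, read_back in probes:
        written_high |= written
        ever_read_high |= read_back & written
    stuck = written_high & ~ever_read_high & ((1 << width) - 1)
    return [bit for bit in range(width) if stuck & (1 << bit)]


# Hypothetical readings: DATA15 is lost at every address tried.
probes = [
    (0o00001000, 0o177777, 0o077777),
    (0o02000000, 0o100000, 0o000000),
    (0o02040000, 0o125252, 0o025252),
]
print(stuck_low_bits(probes))
```

If the same bit shows up for on-board and QBone addresses alike, the fault is common to the bus path rather than to one memory.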

If I remember right, on the 11/53 the CPU onboard memory can be disabled.
In that case you would circumvent a defective chip and use QBone's memory only ...
that's what QBone was made for.

kind regards,

joerg






Jay Cotton

Mar 21, 2025, 7:01:49 PM
to UniBone
Current findings.

I ran OKDDD0 on the 11/53 board with the QBone plugged in.  

1.  it did not see the extra memory.
2.  it ran for 11 passes without error.
3.  I see the fault address on the QBone console.
It says the RAM is 02000000..17757776, and the fault address from ODT
(02040000) falls inside that range.
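For anyone repeating this comparison, a small sketch of the range check in Python. The two range limits and the fault address are the octal values quoted above; the variable names are only for illustration.

```python
QBONE_FIRST = 0o02000000  # first address the QBone console reports
QBONE_LAST = 0o17757776   # last word address the QBone console reports
FAULT_ADDR = 0o02040000   # failing address from the ROM error line

in_qbone = QBONE_FIRST <= FAULT_ADDR <= QBONE_LAST
offset = FAULT_ADDR - QBONE_FIRST

print(in_qbone)          # is the fault inside the emulated range?
print(oct(offset))       # distance from the start of that range
print(offset // 1024)    # the same distance in KiB
```

The fault sits 040000 (octal) bytes, i.e. 16 KiB, above the start of the QBone range, so it is not in the first 512 KB.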
Questions:  

Should I expect OKDDD0 to test any offboard memory or devices ?  I think not.
I suspect I need to have the QBone tested to see if it's faulty, to figure out whether A or B is 
broken.  So, is there a QBone repair service or anything like that, or do I just buy a new one?

I think I need a new/different CPU board to break the logjam.  I have 
M7546, M7555, M7504 all reported as not working.  It kind of looks like the
CPU board is faulty.  


jc

Jay Cotton

Mar 21, 2025, 10:07:02 PM
to UniBone
Plugged in the QBone card and powered up the machine.  
Using the map command I see 4 meg of ram.  So the CPU can find
the ram.  But not the boot device.

KDJ11-D/S   ROM V2.0

4088 K Bytes

00000000 - 01777776     512 KB     CSR = 17772100
02000000 - 17757776     3576 KB    CSR =
Press the RETURN key to continue:

17772100                MCSR
17772150 - 17772152     DU
17772200 - 17772216     SIPDR0-7
17772220 - 17772236     SDPDR0-7
17772240 - 17772256     SIPAR0-7
17772260 - 17772276     SDPAR0-7
17772300 - 17772316     KIPDR0-7
17772320 - 17772336     KDPDR0-7
17772340 - 17772356     KIPAR0-7
17772360 - 17772376     KDPAR0-7
17772516                MMR3
17773000 - 17773776     CPU ROM
17776500 - 17776506     SLU1
17777520                NR
17777546                LTC CSR, BEVENT = 1
17777560 - 17777566     SLU0
17777572 - 17777576     MMR0,1,2

Press the RETURN key to continue:

17777600 - 17777616     UIPDR0-7
17777620 - 17777636     UDPDR0-7
17777640 - 17777656     UIPAR0-7
17777660 - 17777676     UDPAR0-7
17777750 - 17777752     MREG,Hit/Miss
17777766                CPUER
17777772                PIRQ
17777776                PSW

Jay Jaeger

Mar 22, 2025, 8:48:21 AM
to UniBone
If the Qbone has a problem with just the "wrong" bus driver (or maybe even if one of the switches or jumpers is wrong - I haven't looked), it could affect the grant chain after it, making any devices that are after it on the bus disappear.  

As for the question of the diagnostic, yes, it should "see" the QBone memory and try to test it if you tell it the relevant address range to test - unless there is an open slot before the QBone or the QBone has an issue, in which case it will not see anything past the open slot.  (This is not like an S100 bus card with special on board memory.  The QBone emulates memory on the QBus, so the CPU, peripheral devices, diagnostics, etc. should all "see" it.)

If the CPU sees the RAM (given the size, I am assuming that the RAM is not the CPU board RAM), then it should see the other devices as well, so long as you have bus continuity.

Have you tried pulling the QBone, and then, making sure there are no open slots (aside from the CD slots in the first few slots on the bus), seeing how the system behaves?  (You can use ODT to access the CSRs of the various devices to see if the CPU "sees" them.)

Your CPU is probably fine.

JRJ

Joerg Hoppe

Mar 22, 2025, 3:02:29 PM
to uni...@googlegroups.com
Hi,

diag  OKDDD0 tests only 512K memory on the 11/53 CPU, but no other mem.
Try diag VMSA? or ZQMC?
A mem test must not expect the special memory card's control register in the IO page, which most DEC memory has.
The QBone memory has no additional control register, its just a plain array of words.

For a quick QBone memory test:
use the ODT console of the 11/53 (perhaps enter by pressing HALT, prompt is then a "@")
then test for example address 2000000 == start of QBone memory as you listed below:

> KDJ11-D/S   ROM V2.0
> 4088 K Bytes
> 00000000 - 01777776     512 KB     CSR = 17772100
> 02000000 - 17757776     3576 KB    CSR =

@2000000/<existing-memory-word>/<enter-new-value-here>
If you find QBone can not hold the MSB 100000, then
- most likely a solder error
- or a defective bus driver, just move the 8641 chips a bit around and see whether the error moves.
Btw, did you build a kit, or buy it "complete and tested"?

As the CPU responded with
> 02000000 - 17757776     3576 KB    CSR =
QBone is emulating all memory beyond 512KB
("CSR = " means the CPU did not find a memory control register, which is correct, see above).
And that means QBone could probe the memory on start up,
and that means QBone correctly issued DMA cycles,
and that means the GRANT chain is probably closed correctly.
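As a cross-check (my own arithmetic, not part of the ROM output), the sizes in the quoted map follow directly from the octal address ranges. Each range ends on the address of its last word, so 2 bytes are added before subtracting.

```python
# Address ranges copied from the KDJ11-D/S map output (octal).
REGIONS = {
    "CPU on-board": (0o00000000, 0o01777776),
    "QBone": (0o02000000, 0o17757776),
}


def size_kib(first: int, last_word: int) -> int:
    """Size of a range in KiB, given the address of its last 16-bit word."""
    return (last_word + 2 - first) // 1024


sizes = {name: size_kib(*bounds) for name, bounds in REGIONS.items()}
print(sizes)
print(sum(sizes.values()))  # total RAM in KiB
```

The two regions come out as 512 and 3576 KiB, 4088 KiB in total; the remaining 8 KiB of the 4 MiB Q22 address space is the I/O page starting at 17760000.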
Joerg

Jay Jaeger

Mar 22, 2025, 3:35:11 PM
to UniBone
I am pretty sure he has a new QBone dual, so there are no individual 8641 chips.  Instead, on the QBone dual there are two sets of drivers on daughter boards that sit in special DIMM sockets, and he could swap those two sets.  (QBone dual is not offered as a kit.)  So, if the problem is with one of those DIMMs (bad connection or bad chip), then swapping them may fix it or cause the symptom to change.

But he also reported that he could not boot his various devices, and those devices are probably not ALL on the QBone (since he has a real disk drive, I am pretty sure he has a real RQDXn).  So he might be having more than one problem:  both a data line problem (as shown by the results of the startup RAM test) and a grant chain problem (causing subsequent devices to not show up).  If he has open slots after the QBone, or if the QBone driver for the grant signal is bad (then it would not close the grant chain, right?), then the grant chain would not continue *after* his QBone, either because of the QBone itself or because of the slot it is in.  That could only apply if the QBone was, for example, before his RQDXn or there were open slots in between - but one or the other is likely.
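A toy model of that grant-chain argument, purely as a sketch: it only describes the daisy-chained grant signals (which matter for interrupts and DMA), and the slot layouts are hypothetical, not the actual backplane in this machine.

```python
def boards_receiving_grant(slots):
    """slots: list ordered outward from the CPU.

    Each entry is None for an empty slot, or (name, passes_grant) where
    passes_grant says whether the board forwards the grant downstream.
    Returns the names of the boards a grant can actually reach.
    """
    reached = []
    for slot in slots:
        if slot is None:           # an open slot breaks the chain
            break
        name, passes_grant = slot
        reached.append(name)
        if not passes_grant:       # a board with a bad grant driver
            break
    return reached


print(boards_receiving_grant([("QBone", True), ("RQDX3", True)]))
print(boards_receiving_grant([("QBone", False), ("RQDX3", True)]))
print(boards_receiving_grant([("QBone", True), None, ("RQDX3", True)]))
```

In the second and third layouts the RQDX3 never receives a grant, which is the "devices after it disappear" case described above.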

JRJ

Jay Cotton

Mar 22, 2025, 9:39:54 PM
to UniBone
I have the new QBone dual.  

Here is some poking of the RQDX3.

Here is what the 11/53 thinks is the disk controller address range.
                   17772150 - 17772152     DU

@17772150/077777 177777
@17772150/077777 
@17772150/077777 0
@17772150/077777
@17772152/005500 0
@17772152/005500
 
I don't seem to be able to write to this device.
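A rough sketch for checking a transcript like the one above mechanically. It assumes each ODT line has the form "@address/contents [new value]" in octal, as in this session. Note that many device registers are not plain read/write storage, so a deposit that does not read back is not by itself proof of a fault.

```python
import re

ODT_LINE = re.compile(r"^@([0-7]+)/([0-7]+)\s*([0-7]*)\s*$")


def failed_deposits(transcript: str):
    """Return (address, deposited, read_back) for deposits that did not stick."""
    failures = []
    pending = {}  # address -> value most recently deposited there
    for line in transcript.splitlines():
        match = ODT_LINE.match(line.strip())
        if not match:
            continue
        addr = int(match.group(1), 8)
        shown = int(match.group(2), 8)
        if addr in pending and pending[addr] != shown:
            failures.append((addr, pending[addr], shown))
        pending.pop(addr, None)
        if match.group(3):
            pending[addr] = int(match.group(3), 8)
    return failures


session = """\
@17772150/077777 177777
@17772150/077777
@17772150/077777 0
@17772150/077777
@17772152/005500 0
@17772152/005500
"""
for addr, wrote, got in failed_deposits(session):
    print(f"{addr:08o}: wrote {wrote:06o}, read {got:06o}")
```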

I will try a large memory test program next.

jc

Jay Cotton

Mar 22, 2025, 10:51:19 PM
to UniBone
Attempting to boot, used ctl-c and then the map command.
Clearly the CPU knows about the memory.

KDJ11-D/S   ROM V2.0

4088 K Bytes

00000000 - 01777776     512 KB     CSR = 17772100
02000000 - 17757776     3576 KB    CSR =

.R DD1:ZQMCG2
ZQMCG2.BIC

CZQMCG0

SWR = 000000  NEW =

KT11 (MEMORY MANAGEMENT) AVAILABLE
MEMORY MAP:
FROM   000000 TO   757777

PARITY MEMORY MAP:
CORE PARITY
REGISTER AT 172100 CONTROLS
FROM   000000 TO   757777

I get lots of these errors.

PARITY REGISTER DATA ERROR.
V/PC    P/PC    MAUT    REG     S/B     WAS
012374  012374  040000  772100  100400  000000

Here is the test with parity checking turned off.

.RU DD1:ZQMCG2
ZQMCG2.BIC

CZQMCG0

SWR = 000000  NEW = 100

KT11 (MEMORY MANAGEMENT) AVAILABLE
MEMORY MAP:
FROM   000000 TO   757777

PROGRAM RELOCATED TO 720000
PROGRAM RELOCATED TO 000000

END PASS #     1

With parity off, I don't see any memory errors.

I am now running a backup of the QBone code so I can test the RQDX3 controller functions on the
QBone from the CPU.

jc

Jay Cotton

Mar 22, 2025, 10:59:01 PM
to UniBone
Here is a test of the DU0: drive via QBone.
I get the same error pattern testing the RQDX3 board without the QBone installed.

DRSSM-G2                                                                
ZRQA-H-0                                                                
RD/RX EXERCISER                                                          
UNIT IS RQDX or RUX50                                                    
RSTRT ADR 145702                                                        
DR>START                                                                
                                                                         
CHANGE HW (L)  ? Y                                                      
                                                                         
# UNITS (D)  ?                                                          
                                                                         
NO DEFAULT
# UNITS (D)  ? 1

UNIT 0
IP address (O)  172150 ?
Vector (O)  154 ?
BR Level [usually 4-RQDX 5-RUX50] (O)  4 ? 4
Drive number (D)  0 ?
Test entire customer area of this disk (L) Y ?
Write on customer data area of this disk unit (L)  ? Y
** WARNING - CUSTOMER DATA AREA MAY BE OVERWRITTEN! ... CONFIRM (L)  ? Y

CHANGE SW (L)  ? N

FUNCTIONAL TEST STARTED

EXERCISER STARTED

ZRQA DVC FTL ERR  00013 ON UNIT 00 TST 001 SUB 000 PC: 047574
INIT SEQUENCE FAILED
* STEP 2 READ ERROR
* CONTROLLER RAM ERROR (Non-parity)
TIME: 00:01 HOURS

UNIT 0 DROPPED - INIT ERROR


UNT DSK        # OF   # BYTES   # OF    # BYTES  --HARD ERRORS-- --SOFT ERROR-
 #   #  TYPE  READS     READ   WRITES   WRITTEN  SEK DAT DRV HST SEK DAT DRV T
--- --- ----  -----  --------- ------  --------- --- --- --- --- --- --- --- -
 0   0         0000  0,000,000   0000  0,000,000   0   0   0   0   0   0   0 0

TIME: 00:01 HOURS

ZRQA EOP    1
    1 TOTAL ERRS

Jay Jaeger

Mar 23, 2025, 12:20:08 PM
to UniBone
Just to be 100% sure:  
-- You are putting in a grant jumper board when you remove the QBone, yes?   
-- Your RQDX3 is the last board, yes?  
-- You have no empty slots except for the right hand side (CD) slots of the first 3 Quad slots between the CPU and the RQDX3, yes?

(It is OK to have empty slots after the last board)

JRJ

Jay Cotton

Mar 23, 2025, 1:42:02 PM
to UniBone
I pulled all the boards except the CPU and put the QBone in the first empty slot.
No grant jumper.

The first test I did was with CPU and RQDX3 in the first slot after the CPU board.
Again no grant jumper.

I think that should work. 

So, my last round of testing was with CPU and QBone only, in slot 0 and slot 1
on the A/B side of the back plane.

I think my testing is showing that the fault is in the I/O section.  So, now I need to 
learn which lines distinguish I/O device access from memory access.  Should be a small list.

Jay Cotton

Mar 23, 2025, 2:20:38 PM
to UniBone
It seems that BBS7 must be asserted before a device will respond to an address.
I wonder if that signal is not making it to the QBone.  

I did swap the bus driver packs and this makes no difference.  

Jay Cotton

Mar 23, 2025, 2:58:00 PM
to UniBone
I see BBS7 on the backplane.  It seems to be random and not periodic, so it is 
hard to sync on.

Jay Jaeger

Mar 23, 2025, 3:31:51 PM
to UniBone
BBS7 would not be a cyclically recurring signal - it would be asserted when the processor is doing something in the I/O address space - which could be, for example, a memory control register on a memory board, a register on a peripheral board, etc.  It presumably would NOT be asserted when accessing emulated memory on the QBone.
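To make the distinction concrete, here is a small sketch that classifies 22-bit addresses as I/O page (where BBS7 would be expected) or memory, and converts a 16-bit I/O address such as the 172150 the diagnostic asks for into its 22-bit form. The helper names are invented for this example; the I/O page is taken to be the top 8 KiB of the address space.

```python
IO_PAGE_BASE_22 = 0o17760000  # start of the I/O page in 22-bit space
ADDR_MAX_22 = 0o17777777      # highest 22-bit address


def is_io_page(addr22: int) -> bool:
    """True if a 22-bit physical address lies in the I/O page."""
    return IO_PAGE_BASE_22 <= addr22 <= ADDR_MAX_22


def io_addr_16_to_22(addr16: int) -> int:
    """Map a 16-bit I/O page address (160000..177777) to 22 bits."""
    if not 0o160000 <= addr16 <= 0o177777:
        raise ValueError("not a 16-bit I/O page address")
    return (addr16 & 0o17777) | IO_PAGE_BASE_22


print(oct(io_addr_16_to_22(0o172150)))  # the RQDX3 IP register
print(is_io_page(0o17772150))           # device register
print(is_io_page(0o02040000))           # QBone-emulated memory
```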

So, you have no separate memory board, just the memory on the Processor card?

Suggestion: put ONLY the QBone in (NO CPU Card, no RQDX3) in slot *4* and run the full set of QBone/Unibone diagnostics to make sure its driver chips and the QBus backplane are all happy.

You might consider doing a continuity test, end to end, of the QBus backplane signals (and following any daisy-chained signals from slot to slot).

JRJ

Jay Cotton

Mar 23, 2025, 5:06:47 PM
to UniBone
Ready to try diagnostics on QBone.  Where should I be looking for the code?

Joan Touzet

Mar 23, 2025, 5:28:36 PM
to UniBone
Testing QBone Dual is virtually identical to testing QBone, the only change is setting the dip switches correctly.

For a Q22 backplane, with nothing else in the backplane, set SW1-1 thru SW1-4 on, SW2-3 on, SW3-1 and SW3-2 on. All the rest should be off. Ensure your backplane's LTC is off if at all possible, otherwise there will be problems with the driver test (the LTC line will not match the values the software expects during the test).
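If it helps to double-check the switch settings before running the tests, they can be written down as a set and compared against what you see on the board. This is only a bookkeeping sketch; the switch names are the ones listed above, and the "observed" values are made up.

```python
# Switches that should be ON for a standalone test in a Q22 backplane.
EXPECTED_ON = {"SW1-1", "SW1-2", "SW1-3", "SW1-4", "SW2-3", "SW3-1", "SW3-2"}


def check_switches(observed_on):
    """Compare the switches seen ON with the expected set."""
    observed = set(observed_on)
    return {
        "should_be_on": sorted(EXPECTED_ON - observed),
        "should_be_off": sorted(observed - EXPECTED_ON),
    }


# Hypothetical reading from the board: SW1-4 forgotten, SW2-1 left on.
observed = ["SW1-1", "SW1-2", "SW1-3", "SW2-1", "SW2-3", "SW3-1", "SW3-2"]
print(check_switches(observed))
```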

Run ./demo.sh -aw 22 and follow the instructions on Joerg's page.

You can loopback test the edge-DIP switches with the test GPIO mode:

tg
lb

and see if the LEDs match the dip switch positions.

Then use Ctrl-C and q to exit the tg mode.

Enter tl to enter the test latch mode, then enter

* r

this should apply random vectors to all of the lines. The test should appear to hang, and there should be no output. If you have a way to monitor the backplane lines, you should see all of them oscillating randomly. Press Ctrl-C to exit, at which point the total number of applied tests will appear.

The most common problems at this point are:

BEVNT -- your backplane is applying LTC and the QBone is not seeing its own vectors on that line
Q19-Q22 -- you are trying to run Q22 in a Q18 slot, or SW1-1 thru 1-4 are not on.

Anything else may indicate issues with the backplane or driver modules. You can rotate the two modules in the QBone Dual to see if the problem follows the module.

If all of this succeeds, the QBone Dual is not at fault.

Jay Cotton

Mar 30, 2025, 12:29:16 AM
to UniBone
I don't seem to be finding a problem with the QBone board.  
I do see that one of the am2908 chips is very hot (about 140-150 F), so I will 
be replacing that part.  I will get back to the list with results.

jc

Joerg Hoppe

Mar 30, 2025, 1:58:26 AM
to uni...@googlegroups.com
On 30.03.2025 at 06:29, Jay Cotton wrote:
> I don't seem to be finding a problem with the QBone board.
> I do see that one of the am2908 chips is very hot (about 140-150 F),
> so I will
> be replacing that part.  I get back to the list with results.

Maybe that chip is driving against a short-circuited bus signal?

Joerg


Three Jeeps

Mar 30, 2025, 12:50:16 PM
to UniBone
From what I can see, the AM2908 is a brushed DC motor driver IC with a drive current capability of up to 4.5A continuous and 8A peak.

If it is trying to push out 4+ amps it will get hot.  I don't have a schematic... what is this thing driving?

Jay Cotton

Mar 30, 2025, 8:12:07 PM
to UniBone
Thanks for everyone's interest and ideas.

Problem has been solved.

The bug was a broken piece of plastic on the b row of the last connector on the back plane.
The very corner was chipped and this allowed the bottom spring finger to contact the upper 
row and short out.

Was a really interesting bug to find.  I had a clue but did not understand all the leads until I
remembered that the machine started to act up after I pulled the bus board out of the chassis
and reinstalled it.  It never occurred to me that I had created a short.

Lesson learned.

tnx
JC

pbi...@gmail.com

Mar 31, 2025, 1:20:06 AM
to UniBone
The AM2908 is a "Quad Bus Transceivers with Interface Logic".  Example datasheet:

jay....@gmail.com

Mar 31, 2025, 7:45:29 AM
to UniBone
Indeed, this is a lesson all owners of old PDP equipment should learn and be proactive about.  Similar to Jay, I had what I call a "lazy pin" in one of the backplane connectors in my ME-11 chassis.  When a board was inserted, everything was fine.  But when there was no board in the slot, one of the top contacts would sag and touch the opposite bottom contact.  One day I was thinking to check the address selection logic of a MM11-L board set I'd been working on, so I installed just the G231 board, leaving out the G110 and the core board. Very unfortunately, the lazy pin in one of the adjacent slots delivered -15V into a clock line on the G231, frying 8 ICs and lifting a trace.

Now I visually inspect all the slots for "lazy pins" before powering anything on.

--Jay