DSSI Disk and DSSI guru needed...

Holm Tiffe

Sep 4, 2015, 5:42:49 AM
Sorry for pestering all available resources with my problem but I need
help with an RF73-EA DSSI Disk.

First, a copy of my mail to ccl...@classiccmp.org:

>Since the capacity of 2x RF31 and 1x RF71 disks is a little bit low
>for VMS with some compilers (~400MB per disk), I've looked for a
>bigger disk, at least for the system itself. (I've already relocated
>the pagefile to the 2nd disk.)
>OK, there are RF73s available on eBay US for $100, but an additional $50
>or more for shipping is too much, and I have to pay an additional 19%
>import VAT on top of the sum of disk + shipping. Maybe there are
>people who think that these prices are OK, but not me, not for an old
>2GB disk for a computer with that power consumption and that
>computing "power". In case there is someone in Europe who wants to
>give away such a disk (>=800 MB) for an acceptable price, please mail me.
>
>Luckily an old friend of mine found 2 disks in his stock, another RF31
>(not tried yet) and an RF73.
>I've now swapped the working but still almost empty RF71 in my
>VAX4000-300 for that RF73 disk and tried to integrate it into the
>system. It starts with all LEDs on (as the others do), begins to rattle
>a little with the head assembly (as the others do), but then stops and
>begins to reposition somewhere in 0.5s cycles. It never finishes doing
>that and never becomes ready. The ready LED blinks for a short
>time after every 0.5s cycle. I've tried to talk to the disk using
>the KA670 firmware with set host/dup/dssi/bus:0 2, PARAMS is working
>and STATUS is responding. The displayed last failure was 3304(X) and I
>don't know what that could be.
>
>All other commands work, but they are aborted since the disk is busy.
>
>The available RF72DUG8 gives the hint that the error codes are listed
>in the service manuals, but it seems that those manuals aren't
>available anywhere.
>
>What could the error be? Is the disk dead?
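
To put rough numbers on the eBay offer mentioned above: a minimal sketch, assuming the 19% VAT is charged on item plus shipping as described, and ignoring any customs duty or exchange-rate effects. The function name is made up for illustration.

```python
def landed_cost(item_usd: float, shipping_usd: float, vat_rate: float) -> float:
    """Total cost when import VAT is charged on item price plus shipping."""
    return (item_usd + shipping_usd) * (1 + vat_rate)


# $100 disk, $50 shipping (the low end of what was quoted), 19% VAT.
# Because shipping was "$50 and more", this is a lower bound.
total = landed_cost(100.0, 50.0, 0.19)
print(f"{total:.2f}")  # prints 178.50
```

So the disk lands at roughly $178.50 or more before it has even been tested.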

..in the meantime I've discovered that the 0.5-second clicking
ends after some hours. It seems that this is some time-consuming
recalibration of the head positioning mechanism, and the disk isn't busy
anymore after that. Unfortunately it then lights up the error LED and I get
messages that the unit is completely nonfunctional, with that infamous 3304
error.

I think the problem is that the disk can't read at least one track anymore
and therefore the positioning servo is failing .. or something like
that. Please correct me if I'm wrong.

I've read somewhere that an RF73 disk has 11 platters with 22 heads and
that the worst surface is disabled at the factory (transparent to the user).
There must be diagnostic software/utilities that can do more interesting
things than the disk controller's local firmware, which can be reached
through the DUP server. Does such software still exist, and is it available
somewhere? What could I try to bring the disk back to life?
What is the meaning of the 3 DIP switches that sit on the PCB next to
the DSSI connector? I've found nothing about them in the available manuals.

I've tried to update the drive's firmware to the last revision with the
utilities in sys$etc, but this fails too, and it seems it fails
because the disk is unavailable for a long time (and clicking) after every
controller reset. Can that time-consuming clicking be suppressed
in some way? It seems that the recalibration is triggered every time
by the POST..

Tested the "new" RF31 in the meantime; it is OK and seems to contain
VMS 5.4. I still haven't tried to boot it..

I'm searching for some mounting hardware for the RF31: I need the two
brackets with the connector mounting assembly on the back (I have the
electrical parts and the DSSI connector) and I need that "front panel"
for the VAX4000 with the two buttons. If I could get them,
I could replace the TK70 with another RF31..


Thanks in Advance,

Holm


Stephen Hoffman

Sep 6, 2015, 9:52:45 AM
On 2015-09-04 09:42:47 +0000, Holm Tiffe said:

> There must be diagnostic software/utilities that can do more
> interesting things as the local to the disk controller firmware that
> can be reached trough the dup server. Is such a software still existing
> and availabe somewhere? What could I try to get the disk back to life?

Scrounge an HSD05 or HSD10 controller and pieces, and replace the DSSI
drives with some slightly more recent SCSI devices. This'll be the
most expeditious approach for updating the hardware to slightly less
ancient, and still mostly "authentic" — DEC sold HSD widgets and SCSI
on VAX systems, when DSSI went end-of-life. (~1990?)

Alternatives with varying degrees of cost and feasibility...
* Set up a clean room and platter and head reconditioning and/or
refurbishment and related; start remanufacturing these devices.
* Reverse the DSSI protocol and the MSCP traffic, and construct an
emulated storage device. (This if you don't go to HSD and SCSI...)
* Acquire one of the available refurbs — but I'd not expect those to
last very long, given the sheer age of the storage hardware involved.
* Move to VAX emulation. This'll be much more reliable, and you won't
be budgeting cash for spare parts and repairs.

As for the embedded firmware, accessing the ISE itself is feasible
through the console. See the old VMS FAQ.
<http://labs.hoffmanlabs.com/vmsfaq> Search for "/DSSI". Also see
<http://labs.hoffmanlabs.com/node/1414> These VAX console commands and
the OpenVMS-level SET HOST commands (VAX and Alpha) use the
Diagnostics and Utilities Protocol (DUP) to access the DSSI ISE server,
and from there you'll have command access to various tools. That
written, firmware and diagnostics and the inevitable, futile
reformatting attempts won't help with a failing drive. IIRC & AFAIK,
the DSSI ISEs used automatic bad block revectoring and should have a
replacement and caching table (DEV$M_RCT in DEVCHAR), which means that
the factory and user bad block lists and the spares are managed
automatically. Anything within the EDC is repaired transparently, and
any error beyond the EDC is rewritten from another member of the
shadowset if one is available, or the data is flagged bad (q.v. BACKUP
and the FORCEDERROR flag) and the sector revectored when next
rewritten. By the time you're seeing piles of errors and hearing odd
noises, the disk is trashed.
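
The decision path described above can be sketched roughly as follows. This is only an illustration of the three outcomes as described in this post, not DEC's firmware logic; every name in it is hypothetical.

```python
from enum import Enum, auto


class Action(Enum):
    """Hypothetical labels for the three outcomes described in the text."""
    CORRECTED_TRANSPARENTLY = auto()       # error was within the EDC
    REWRITTEN_FROM_SHADOW_MEMBER = auto()  # good copy taken from the shadowset
    FLAGGED_FORCED_ERROR = auto()          # data marked bad; sector is
                                           # revectored when next rewritten


def handle_read_error(within_edc: bool, shadow_copy_available: bool) -> Action:
    """Pick the outcome for one bad block, following the prose description."""
    if within_edc:
        return Action.CORRECTED_TRANSPARENTLY
    if shadow_copy_available:
        return Action.REWRITTEN_FROM_SHADOW_MEMBER
    return Action.FLAGGED_FORCED_ERROR


# A single unshadowed disk with an uncorrectable block ends in the last case.
print(handle_read_error(within_edc=False, shadow_copy_available=False).name)
```

None of these paths help once the drive can no longer position its heads reliably, which is the situation described in the original posting.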

I've never seen one of these MSCP devices come back from piles of
errors, and I've never seen reformatting help — "reformatting" in the
DEC VAX sense, rather than the INITIALIZE-level or FDISK-level
processing that some other systems call "reformatting". Not once the
disk starts tossing the sorts of errors as described in your posting.
Yes, swapping the drive electronics controller can sometimes help,
depending on the exact cause of the error. But a going-bad HDA?
Nope. Related <http://labs.hoffmanlabs.com/node/838>.

Start figuring out your hardware replacement strategy, while you're
waiting for the inevitably-futile reformat to fail. Swap the disk.
Or rebuild it, or get it rebuilt. Or move to emulation.


--
Pure Personal Opinion | HoffmanLabs LLC

Holm Tiffe

Sep 7, 2015, 3:50:34 AM
On 06.09.2015 15:52, Stephen Hoffman wrote:
> On 2015-09-04 09:42:47 +0000, Holm Tiffe said:
>
>> There must be diagnostic software/utilities that can do more
>> interesting things as the local to the disk controller firmware that
>> can be reached trough the dup server. Is such a software still
>> existing and availabe somewhere? What could I try to get the disk back
>> to life?
>
> Scrounge an HSD05 or HSD10 controller and pieces, and replace the DSSI
> drives with some slightly more recent SCSI devices. This'll be the most
> expeditious approach for updating the hardware to slightly less ancient,
> and still mostly "authentic" — DEC sold HSD widgets and SCSI on VAX
> systems, when DSSI went end-of-life. (~1990?)

Yes, I know this. I have my eyes open for an HSD10 or HSD30 on eBay.
...but how much does one want to invest for 8 VUPs?
>
> Alternatives with varying degrees of cost and feasibility...
> * Set up a clean room and platter and head reconditioning and/or
> refurbishment and related; start remanufacturing these devices.
> * Reverse the DSSI protocol and the MSCP traffic, and construct an
> emulated storage device. (This if you don't go to HSD and SCSI...)
> * Acquire one of the available refurbs — but I'd not expect those to
> last very long, given the sheer age of the storage hardware involved.
> * Move to VAX emulation. This'll be much more reliable, and you won't
> be budgeting cash for spare parts and repairs.

"Refurbished" in this case mostly means "dust removed"; in the rest of the
cases it means "pulled".
>
> As for the embedded firmware, accessing the IDE itself is feasible
> through the console. See the old VMS FAQ.
> <http://labs.hoffmanlabs.com/vmsfaq> Search for "/DSSI". Also see
> <http://labs.hoffmanlabs.com/node/1414> These VAX console commands and
> these the OpenVMS-level SET HOST commands (VAX and Alpha) use the
> Diagnostics and Utilities Protocol (DUP) to access the DSSI IDE server,
> and from there you'll have command access to various tools. That
> written, firmware and diagnostics and the inevitable, futile
> reformatting attempts won't help with a failing drive. IIRC & AFAIK,
> the DSSI IDEs used automatic bad block revectoring and should have a
> replacement and caching table (DEV$M_RCT in DEVCHAR), which means that
> the factory and user bad block lists and the spares are managed
> automatically. Anything within the EDC is repaired transparently, and
> any errors beyond the EDC is rewritten from another member of the
> shadowset if one is available, or the data is flagged bad (q.v. BACKUP
> and the FORCEDERROR flag) and the sector revectored when next
> rewritten. By the time you're seeing piles of errors and hearing odd
> noises, the disk is trashed.

I already knew that and have done it, as you can read in my previous post.
>
> I've never seen one of these MSCP devices come back from piles of
> errors,

...MSCP has almost nothing to do with error recovery. I do have QBUS
Emulex and CQD SCSI controllers that use MSCP towards the host, but
the error management and the physical drive format are another matter
entirely..

> and I've never seen reformatting help — "reformatting" in the
> DEC VAX sense, rather than the the INITIALIZE-level or FDISK-level
> processing that some other systems call "reformatting". Not once the
> disk starts tossing the sorts of errors as described in your posting.
> Yes, swapping the drive electronics controller can sometimes helps,
> depending on the exact cause of the error. But a going-bad HDA?
> Nope. Related <http://labs.hoffmanlabs.com/node/838>.
>
> Start figuring out your hardware replacement strategy, while you're
> waiting for the inevitably-futile reformat to fail. Swap the disk. Or
> rebuild it, or get it rebuilt. Or move to emulation.
>
>

Hey, I only want to get a piece of history that was given to me for free
running again, so there is no hardware replacement strategy necessary.
Yesterday evening I've repaired a PDP11 CPU and a memory board, that's
my "replacement strategy". I'm doing this just for fun, no commercial
interest.

I've been a Unix systems administrator for 25 years and I know the difference
between a low-level format, partitioning and filesystems.
I'm not waiting for "the inevitably-futile reformat to fail" since
there is simply no reformatting tool available among the ISE's internal
commands .. or at least it isn't documented at all, which is the core of
my problem.
There is a drive exerciser, a drive test and an erase program; none of
them does a physical drive reformat (which isn't easy, since those
drives use servo information embedded in the data tracks).


Regards,

Holm





Stephen Hoffman

Sep 7, 2015, 10:33:49 AM
On 2015-09-07 07:50:32 +0000, Holm Tiffe said:

>
> Yea, I know this. I have my Eyes open for an HSD10 or HSD30 at Ebay.
> ...but how much one want to invest for 8 VUPS?

I'd be looking to spend far less than you've already expended. Close
to zero, as I can get massively more VUPS and far fewer maintenance
problems with emulation. As hobbies go, you're into much older
hardware, and older hardware fails. Which means budgeting for that, or
finding a cheaper hobby. Working VAX hardware and working peripherals
have been getting (much) more expensive for a decade now, as the
hardware ages out and fails, and as more and more of the widgets get
"skipped."

>> I've never seen one of these MSCP devices come back from piles of errors,
>
> ...MSCP has almost nothing todo with error recovery..I do have QBUS
> Emulex and CQD SCSI Controllers that using MSCP for the host, but
> the Error management and the physical drive format is another pair of shoes..

The on-board MSCP server is what handles recovery out in the RF ISE.
Each DSSI ISE is akin to a scaled-down and updated and per-disk CI
cluster storage controller akin to an HSC50, and running more than a
little microcode — which provides cluster services including MSCP — out
in the ISE controller. DSSI isn't that far off of a CI cluster, for
that matter. There's rather more going on within the TF or RF ISE
controller than with the definitely-not-trivial firmware within a SCSI
disk, for instance. VMS would get involved here via disk shadowing, as
the ISE doesn't have visibility into the other members of the HBVS
RAID. But otherwise, MSCP server — either in the UDA, KDA, KDM, etc,
bus-based MSCP storage controller, or out in the HSC, HSJ, HSG, ISE,
etc — deals with the errors, up until it can't.

> I'm don't waiting for "the inevitably-futile reformat to fail" since
> there is simply no refomating tool available in the ISE's internals
> commands .. or at least it isn't documented at all, which is kernel of
> my problem.
> There is a drive exerciser, a drive test and an erase program, none of
> them does a physical drive reformat (which isn't easy, since those
> drives using servo information embedded in the data tracks).

The VAX DS and MDM diagnostic tools were retired well before this VAX
shipped, and the diagnostics and tools necessary to maintain the
supported components are in the console and sometimes in the controller
or device firmware, or are automatically performed. In the hardware
era you're working with, the drive formatter — where one was even
needed — was usually a TEST command.

As for the typical DEC field service practices, you have a brick. It'd
get swapped. You can either replace it, or disassemble and refurbish
it. DEC field service learned not to reformat failing disks, FS
learned to swap them. The parts were brought back into the depot or
shipped back to the factory. Depending on the fault, diagnostics
might be run or the fault investigated for QC or other purposes, and
the widget was then remanufactured or recycled.

Take it apart and start checking the hardware. No, I don't know of
any wiring schematics, those — like diagnostics — generally weren't
available outside of manufacturing in this era. Field service had
largely stopped doing component-level repairs toward the end of the
VAX-11 series, so there was no need to make schematics available.

Holm Tiffe

Sep 7, 2015, 5:31:45 PM
On 07.09.2015 16:33, Stephen Hoffman wrote:
> On 2015-09-07 07:50:32 +0000, Holm Tiffe said:
>
>>
>> Yea, I know this. I have my Eyes open for an HSD10 or HSD30 at Ebay.
>> ...but how much one want to invest for 8 VUPS?
>
> I'd be looking to spend far less than you've already expended. Close to
> zero, as I can get massively more VUPS and far fewer maintenance
> problems with emulation. As hobbies go, you're into much older
> hardware, and older hardware fails. Which means budgeting for that, or
> finding a cheaper hobby. Working VAX hardware and working peripherals
> have been getting (much) more expensive for a decade now, as the
> hardware ages out and fails, and as more and more of the widgets get
> "skipped."

..that's why I have SIMH running as a PDP-11 and as a VAX. You didn't ask
about this; you knew better from the start.
>
>>> I've never seen one of these MSCP devices come back from piles of
>>> errors,
>>
>> ...MSCP has almost nothing todo with error recovery..I do have QBUS
>> Emulex and CQD SCSI Controllers that using MSCP for the host, but
>> the Error management and the physical drive format is another pair of
>> shoes..
>
> The on-board MSCP server is what handles recovery out in the RF ISE.

No, it is the M68K RTOS that runs the MSCP server.. and the error recovery.


> Each DSSI ISE is akin to a scaled-down and updated and per-disk CI
> cluster storage controller akin to an HSC50, and running more than a
> little microcode — which provides cluster services including MSCP — out
> in the ISE controller. DSSI isn't that far off of a CI cluster, for
> that matter.

Yes, I know.

> There's rather more going on within the TF or RF ISE
> controller than with the definitely-not-trivial firmware within a SCSI
> disk, for instance. VMS would get involved here via disk shadowing, as
> the ISE doesn't have visibility into the other members of the HBVS
> RAID. But otherwise, MSCP server — either in the UDA, KDA, KDM, etc,
> bus-based MSCP storage controller, or out in the HSC, HSJ, HSG, ISE, etc
> — deals with the errors, up until it can't.
>

Yes. But what it does not do is the physical formatting, reading and writing
of the underlying storage media.

>> I'm don't waiting for "the inevitably-futile reformat to fail" since
>> there is simply no refomating tool available in the ISE's internals
>> commands .. or at least it isn't documented at all, which is kernel of
>> my problem.
>> There is a drive exerciser, a drive test and an erase program, none of
>> them does a physical drive reformat (which isn't easy, since those
>> drives using servo information embedded in the data tracks).
>
> The VAX DS and MDM diagnostic tools were retired well before this VAX
> shipped, and the diagnostics and tools necessary to maintain the
> supported components are in the console and sometimes in the controller
> or device firmware, or are automatically performed. In the hardware era
> you're working with, the drive formatter — where one was even needed —
> was usually a TEST command.

..if it ever existed in the world outside DEC.
As far as I know, even the tools to physically format DEC RX diskettes
were unavailable.
>
> As for the typical DEC field service practices, you have a brick. It'd
> get swapped.

Yes, I know, but this isn't of much help.

> You can either replace it, or disassemble and refurbish
> it. DEC field service learned not to reformat failing disks, FS learned
> to swap them. The parts were brought back into the depot or shipped
> back to the factory. Depending on the fault, diagnostics might be run
> or the fault investigated for QC or other purposes, and the widget was
> then remanufactured or recycled.
>
> Take it apart and start checking the hardware. No, I don't know of any
> wiring schematics, those — like diagnostics — generally weren't
> available outside of manufacturing in this era. Field service had
> largely stopped doing component-level repairs toward the end of the
> VAX-11 series, so there was no need to make schematics available.
>

I don't think that any schematics are needed here; the PCB of the ISE is
fine, since it starts the POST and behaves normally for the first
seconds. After that the heads reach a bad cylinder and the drive
goes into a servo calibration procedure, which exits after hours with
a failure. That repeats after every power cycle, independent of
temperature. Since the drive makes several (audible) head steps, it is
unlikely that a complete surface is failing (defective head or read
amplifier), but a previous head crash is possible.

I was looking here for people who may still know what is going
wrong in the drive and what to do with it, besides swapping it for a
factory-repaired or new unit (which is unavailable after all these years),
_before_ I open up the drive mechanics.

Regards,

Holm



al.j....@gmail.com

Jul 20, 2016, 5:24:46 PM
It's possible that I might be able to provide you some additional insight.
Send email to al.m...@me.com