Medical Image Format FAQ, Part 6/8

David A. Clunie

Dec 21, 2003, 9:16:50 AM
Archive-name: medical-image-faq/part6
Posting-Frequency: monthly
Last-modified: Sun Dec 21 09:16:50 EST 2003
Version: 4.26

4. Host Machines

4.1 Data General

4.1.1 Data General Data

4.1.1.1 Data General Integers

Integers are 16 bit two's complement and stored in
big-endian format as on Sun Sparc and opposite to the Dec
VAX.
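
As a minimal sketch (the function name dg_int16 is made up for illustration and does not come from any Data General or GE source), such an integer can be assembled from its two bytes without caring about the byte order of the machine doing the reading:

```cpp
// Hypothetical helper: decode a 16 bit big-endian two's complement
// integer from two bytes, independent of the host's byte order.
inline int dg_int16(const unsigned char *p) {
    // The most significant byte comes first on the Data General.
    unsigned v = (unsigned(p[0]) << 8) | unsigned(p[1]);
    // Values with the top bit set are negative in two's complement.
    return (v & 0x8000u) ? int(v) - 0x10000 : int(v);
}
```

Building the value with shifts rather than casting a pointer means the same code should behave identically on a Vax, a Sparc or a PC.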

4.1.1.2 Data General Floating Point

Single precision real values are 32 bits long, in
big-endian format. The high bit is the sign bit, followed
by a 7 bit excess 64 exponent (the power to which 16 must be
raised), then a 24 bit hexadecimally normalized mantissa
with the radix point to the left of the most significant
bit. Double precision values just have another 32 bits
tacked onto the mantissa and use the same exponent format.


    Sign
    |<-->|<------ Exponent ------>|<--------- Mantissa -------->|
     ______________ ______________ ______________ ______________
    |              |              |              |              |
    |______________|______________|______________|______________|
     31          28 27          24 23          20 19          16
    |<----------------------- Mantissa ------------------------>|
     ______________ ______________ ______________ ______________
    |              |              |              |              |
    |______________|______________|______________|______________|
     15          12 11           8 7            4 3             0

Here is a little piece of C++ code that should run on
anything and convert Data General floats to whatever the
host's floating point format is.


    double value;
    unsigned char buffer[4];

    instream.read((char *)buffer,4);
    if (instream) {
        // DataGeneral is a Big Endian machine, so buffer[0] holds
        // the sign and exponent. The fields are extracted with
        // shifts and masks so that the result does not depend on
        // the host's byte order or bit-field layout.
        unsigned char sign = buffer[0] >> 7;
        Uint16 exponent = buffer[0] & 0x7f;
        Uint32 mantissa = ((Uint32)buffer[1] << 16)
                        | ((Uint32)buffer[2] << 8)
                        |  (Uint32)buffer[3];

        value = (double) mantissa / (1 << 24) *
                pow (16.0, (long)(exponent) - 64);
        value = (sign == 0) ? value : -value;
    } else {
        cerr << "read failed\n" << flush;
        value=0;
    }

4.1.2 Data General Operating System

4.1.2.1 Data General RDOS

Used on the GE CT 9800 family. Severely primitive, but
then it is running on an old machine that can only map 64 KB
of memory at a time, after all. It is apparently
multitasking. Documentation may still be available from
Data General (try DG Direct) but is not supplied with the
scanner by GE. If anyone knows where I can find it at a
reasonable price let me know. Here is a brief command
summary culled from a nifty pocket book from GE for
SunOS/Genesis users that compares commands:


    CHATR  - file attributes
    CRAND  - create randomly organized file
    CDIR   - create directory
    DELETE - files or directories
    DIR    - change directory
    DISK   - free space
    FILCOM - compare files
    GDIR   - show working directory name
    GTOD   - show date and time
    LINK   - files (symbolic)
    LIST   - directory contents
    MOVE   - a file
    RENAME - a file
    SDAT   - set date
    STOD   - set time
    SDUMP  - write files to a device
    SLOAD  - read dumped files
    SPEED  - text editor
    TYPE   - contents of file
    XFER   - copy a file

wildcards: '-' is series, '*' is single character

4.1.2.2 Data General AOS/VS

Used on the GE Signa 3X and 4X family. Quite a nice
operating system with multi-tasking and hierarchical
directories. Here is a brief command summary again culled
from a nifty pocket book from GE for SunOS/Genesis users
that compares commands:


    ACL         - access control list (ownership)
    BYE         - exit command process
    COPY        - a file
    CREATE      - a text file
    CREATE/DIR  - a directory
    CREATE/LINK - link files
    DELETE      - files & directories
    DIR         - display or change working directory
    DUMP        - to peripheral
    F/AS/S      - directory listing with file status
    DATE        - show or set
    HELP
    LOAD        - DUMPed files
    MOVE        - a file
    RENAME      - a file
    PATH        - show pathname of a file
    PAUSE       - the command line interpreter
    SUPERU ON   - enable superuser
    SED         - text editor
    TIME        - show or set
    TYPE        - contents of text file
    ?           - list processes running

wildcards: '+' is series, '*' is single character


Other useful hints include the use of "^" to refer to the
next directory up (like ".." in Unix) in DIR commands.
Command options follow the command name without any spaces
and are indicated by a slash. COPY operations specify the
destination name first and then the source name. Devices
like the mag tape are indicated by "@", for example
"@MTB0" is tape drive zero. Files on the tape can be
referred to as "@MTB0:nn" which is very handy. For
example to read a file off a CT 9800 tape under AOS/VS:


COPY/V/IMTRSIZE=8192 B038040101.YP @MTB0:18


Perhaps most importantly, there is an extensive online
help system ... use the HELP command.

4.1.3 Data General Network

If you have a GE Signa based on a DG then you can get the
so-called "High Speed Network" card and software from GE. From
memory it is pretty pricey, and there used to be a "slower"
network interface that was cheaper, but I don't think this is
available anymore.


If you have a CT 9800 based on the DG S/140 and you need to get it
connected there are a number of solutions:


- Talk to GE about their ID/LINK II product ... I gather this is
a device that hooks into the SCSI cable to the hard drive (you
need one of the Ace drives not the old Zebra drive), monitors disk
activity and snatches pieces of the conversation to make a copy of
the image data, stores it and makes it available via some network
protocol. Sound crazy ? Perhaps, but they tell me it works and
the price is reasonable, at least for something from GE anyway.
Get them to throw one in next time you buy something big.

- The do-it-yourself approach. Talk to John Clayton
(cla...@c-c.com) at Claflin and Clayton. They supply a complete
R-level solution by providing Ethernet hardware and TCP/IP
software for 16 bit DG OS including AOS and RDOS (specifically
including the GE CT version of RDOS). He tells me "you can expect
a file transfer rate of 25 kbytes/s for S/140 systems". The
package consists of:


    $2,850 - EC-10 ethernet controller
    $1,645 - RDOS TCP/IP software (telnet client, ftp client/server)


I have not personally tried either of these approaches, and I am
sure there are others (talk to Merge or DeJarnette), but I am
getting really tired of carrying 9-track tapes around so perhaps I
will bite the bullet soon (and upgrade to a HighSpeed Advantage
!).

4.2 Vax

4.2.1 Vax Data

4.2.1.1 Vax Integers

- little endian
- 8, 16, or 32 bits

4.2.1.2 Vax Floating Point

- little endian

- D_float

    - 8 bytes
    - sign bit 15
    - exponent bits 14-7, excess 128 binary
    - fraction MSB first, bits 6-0, 31-16, 47-32, 63-48
    - normalized bit is not represented (hidden)


- G_float

    - 8 bytes
    - like D, but
    - exponent is bits 14-4, excess 1024
    - fraction 3-0 and 63-16


- F_float

    - 4 bytes
    - like D, smaller fraction


- H_float

    - 16 bytes
    - like G, but
    - exponent is bits 14-0, excess 16384
    - fraction is bits 127-16
    - same weird order
    - bit 112 least significant
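
To make the bit numbering above concrete, here is a minimal sketch of decoding the smallest of these formats, F_float, on any host. The function name vax_f_float is an illustrative assumption, and the treatment of reserved operands (sign bit set with a zero exponent) as zero is a simplification of mine rather than anything the Vax does:

```cpp
#include <cmath>

// Hypothetical helper: decode a 4 byte VAX F_float, given the bytes in
// the order they are stored in the file (little-endian 16 bit words).
// Reserved operands (sign set, exponent zero) are returned as 0 here.
inline double vax_f_float(const unsigned char *p) {
    unsigned word0 = unsigned(p[0]) | (unsigned(p[1]) << 8);   // bits 15-0
    unsigned word1 = unsigned(p[2]) | (unsigned(p[3]) << 8);   // bits 31-16
    unsigned sign     = (word0 >> 15) & 0x1;
    unsigned exponent = (word0 >> 7) & 0xff;                   // excess 128
    // Fraction, MSB first: bits 6-0 of word0, then all of word1.
    unsigned long fraction = ((unsigned long)(word0 & 0x7f) << 16) | word1;
    if (exponent == 0) return 0.0;
    // The hidden bit makes the mantissa 0.1fff... in binary, that is
    // a value in the range [0.5, 1).
    double mantissa = 0.5 + double(fraction) / double(1UL << 24);
    double value = std::ldexp(mantissa, int(exponent) - 128);
    return sign ? -value : value;
}
```

For example the four bytes 80 40 00 00 decode to 1.0: the exponent field is 129 and the fraction is zero, giving 0.5 times 2 to the power 1.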

4.2.1.3 Vax Strings

- 16 bits of length
- byte of type
- byte of class
- 32 bits of pointer

4.2.2 Vax Operating System

4.2.2.1 Vax VMS (See also Vax VMS Tools)

Truly one of the world's most irritating operating
systems to use, especially if you are a unix fan. Still
it works, has a great online help system that saves one's
butt almost often enough to be useful, and if you can
remember the directory where kermit is stored and the
weird command to invoke it one can get by (barely).


If you don't know VMS and the vendor doesn't supply the
manuals, get them from DEC ... you need them bad ...
real bad. If (like me) you throw them out every time you
move then encounter another piece of archaic equipment,
you need the "vaxbook" which is available via ftp from
decoy.uoregon.edu, written by Joseph E St Sauver, which
summarizes commands, files and all sorts of application
specific stuff, though it is no substitute for the real
thing.


Recent VMS update: goddamn file formats ! Why can't VMS
behave like a real operating system and forget this file
format crap ! I have some Philips S5 MR images exported
in ACR/NEMA format and I can't get the things off the
host's Vax using Kermit, because though they have fixed
length 512 byte records, some cretinous program sets the
"carriage return carriage control" record attributes,
which causes kermit to send with all the '0A' characters
scrubbed out amongst other atrocities.


I am getting desperate and about to try using the
Hex/Dehex utility that came with Kermit to get the stuff
off and then decode the hex format ! Or perhaps even use
"dump" to make a textfile, transfer, and decipher that.
(No I don't have a C compiler for the Vax so I guess I
can't use uuencode unless someone wants to mail me a
hex'ed executable). Any hints, or instructions as to how
to use FDL and Convert, to change it to a normal format
would be appreciated. (Why can't they just have a "set
file record attribute xxx" command like all the other
millions of set commands ? Grrrr.).


More recent VMS update: finally had an inspiration while
staring at hex dumps of these files - why not use the VMS
"DUMP" utility which produces hex dumps as a "poor man's
uuencode" by saving the dump to a file, transferring it as
an ascii file, and then decoding it at the destination ?
Of course there are no nifty line checksums or anything,
but a transfer protocol such as kermit takes care of this.


The DUMP output defaults to 8 32 bit long words separated
by a space per line displayed as hex, then an ascii string
(32 bytes) and then a 24 bit word hex address offset from
the start of the fixed length record. All the data
containing lines start with a single space, whereas
descriptions at the start of each record begin in the
first column, hence the data lines can be easily selected
out. By the way, the hex version of the data is listed in
reverse order ! VMS is so bizarre ! For example, here is
a fixed length 512 byte record file from a Philips S5 MRI
(some of the hex words elided to make the line fit on the
page):


    Dump of file SYS$SYSROOT:[GYROSCAN]ABAALKHAIL02010201010001.ANI;1 ...
    File ID (2419,301,0) End of file block 198 / Allocated 200

    Virtual block number 1 (00000001), 512 (0200) bytes

     0000000C 00100008 ... 00000008 .............................. 000000
     00083932 2E36302E ... 2D524341 ACR-NEMA 1.0.. .....1994.06.29.. 000020
     00600008 4D5F4553 ... 00000030 0.......@.........A.....SE_M..`. 000040
     494B0000 00100080 ... 00000002 ....MR..p.....Philips ........KI 000060

     00183148 00000002 ... 32200000 .. 2........63865375........H1.. 0001E0
    ^L
    Dump of file SYS$SYSROOT:[GYROSCAN]ABAALKHAIL02010201010001.ANI;1 ...
    File ID (2419,301,0) End of file block 198 / Allocated 200

    Virtual block number 2 (00000002), 512 (0200) bytes

     40000018 45424F52 ... 00161250 P.....AGACQ_PT_SURFACE_PROBE...@ 000000


And so on ... you get the idea. This ugly little C++
utility written quickly during this moment of inspiration
will take saved DUMP output and make it binary again:


    #include <iostream>

    using namespace std;

    // Convert one hex digit to its value, or -1 if it is not a hex digit
    signed char hextobin(char c) {
        signed char r;
        switch (c) {
            case '0': r=0; break;
            case '1': r=1; break;
            case '2': r=2; break;
            case '3': r=3; break;
            case '4': r=4; break;
            case '5': r=5; break;
            case '6': r=6; break;
            case '7': r=7; break;
            case '8': r=8; break;
            case '9': r=9; break;
            case 'A': case 'a': r=0xa; break;
            case 'B': case 'b': r=0xb; break;
            case 'C': case 'c': r=0xc; break;
            case 'D': case 'd': r=0xd; break;
            case 'E': case 'e': r=0xe; break;
            case 'F': case 'f': r=0xf; break;
            default: r=-1; break;
        }
        return r;
    }

    int main(int argc,char **argv) {
        while (1) {
            const int linemax=132;  // only needs 113
            char line[linemax];
            cin.getline(line,linemax);
            if (!cin || cin.eof()) {
                // cerr << "Bad or eof\n" << flush;
                break;
            }
            // gcount() includes the newline, hence 113 not 112
            unsigned count=cin.gcount();
            // only data lines start with a space
            if (count == 0 || line[0] != ' ') continue;
            if (count != 113) {
                cerr << "Line length " << count << "\n" << flush;
                break;
            }
            unsigned i;
            // start just past the last of the 8 long words ...
            char *ptr = line + 8*(1+8);
            // line is in reverse order ...
            for (i=0; i<8; ++i) {
                unsigned j;
                for (j=0; j<4; ++j) {
                    // 2 hex digits -> 1 byte
                    char bytelo = *--ptr;
                    char bytehi = *--ptr;
                    unsigned char byte
                        = (hextobin(bytehi)<<4)
                        + hextobin(bytelo);
                    cout.put(byte);
                }
                --ptr;  // space between long words
            }
        }
        return 0;
    }


Note that the nature of fixed length records under VMS
means that the last record will be padded out to 512 bytes
without any indication of the "real" end-of-file. This
means you have to cope with trailing garbage gracefully.


Hot VMS/Philips news: nee...@pet.mni.mcgill.ca (Peter
Neelin) tells me there is an extremely useful tool for
fiddling binary files called FILE from DECUS. It allows
you to change a file's header information without
modifying the content of the file. This then permits ftp,
kermit, etc. to do the right thing with Philips .ANI
files. It also permits wildcards and does not make a copy
of the file (so it is fast). He says also that someone
has told him that they succeeded in using convert to fix
these files, but his general experience with it is not
positive (it will often change the content of the file and
it doesn't allow wildcards, in addition to promoting the
use of the horrible fdl editor!). If you are interested,
you can get FILE through gopher from decus.org (look for
the DECUS software library archives, under essential
tools). The binary is provided in case you don't have a
compiler. FILE, and many other useful things are also
available from the sites listed in Vax VMS Tools.


Some other useful hints:

- To log onto a serial terminal without executing the
login command file add "/NOCOM" to the username ... this
way you can use the operator console login which often
won't require a password.

- There is a kermit available for the Vax under VMS (file
prefix "vms" in area or tape b) ... I use the "obsolete"
version written in Bliss, because it comes from the
archives at columbia with a hex encoded executable which
can be uploaded just using an ordinary text capture into a
file, and doing the same with the short Macro hex program
that can then be assembled and used to convert the hex file
into the real executable. Look in places like [SYSEXE]
first though to be sure Kermit is not already there. The
generic C version of kermit runs under VMS (file prefix
"ck" in area or tape f), but not every imaging machine
comes with a VMS C compiler, whereas Macro is always
supposed to be there I gather. There is however also a
hex encoded executable of the C version in the archives
(ckvker.hex) which I haven't tried, and is the one that is
recommended in the kermit documentation.

- There is apparently a zmodem for VMS but I don't know
where it comes from or how to get it.

- Serial ports are almost always defaulted to 9600 baud.

- "SET TERMINAL/ECHO" often isn't set.

- Vax/VMS ftp conventions:


    UNIX FTP server     Vax/VMS FTP server

    cd dir              cd [.dir]
    cd dir/subdir       cd [.dir.subdir]
    cd ..               cd [-]
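
If you find yourself scripting transfers, the mapping can be mechanized. The following is only a sketch under my own assumptions: the name vms_dir is made up, only relative paths are handled (absolute paths need a device and root directory that are site specific), and it follows nothing more than the three conventions listed here:

```cpp
#include <string>

// Hypothetical helper: turn a relative Unix-style directory path into
// the bracketed form a VMS ftp server expects, for example
// "dir/subdir" becomes "[.dir.subdir]" and ".." becomes "[-]".
inline std::string vms_dir(const std::string &unixpath) {
    std::string ups, downs;
    std::string::size_type start = 0;
    while (start <= unixpath.size()) {
        std::string::size_type end = unixpath.find('/', start);
        if (end == std::string::npos) end = unixpath.size();
        std::string part = unixpath.substr(start, end - start);
        if (part == "..") {
            // Cancel a directory already descended into, else go up one.
            if (!downs.empty()) downs.erase(downs.rfind('.'));
            else ups += "-";
        } else if (!part.empty() && part != ".") {
            downs += "." + part;
        }
        start = end + 1;
    }
    return "[" + ups + downs + "]";
}
```

So vms_dir("../other") gives "[-.other]", which moves up one level and then down into OTHER.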

4.2.2.2 ULTRIX

4.2.2.3 OSF

4.3 Sun - Sun3 68000 and Sun4 Sparc

4.3.1 Sun Data

The sun3 and sun4 architectures use much the same formats. Even
though the processors are different both are big-endian and the
float formats are IEEE. See the Sparc Architecture Manual -
Chapter 3 - Data Formats for more details.


One very important difference though, is that the sun3 convention
is not to align 32 bit and 64 bit data types on 4 and 8 byte
boundaries respectively, whereas the sparc (sun4) architecture
usually does, as dictated by a compile time option. Be very careful
when using the same header files on one architecture or the other.
This drove me nuts when trying to figure out why the well
described Genesis (sun3) layout did not match the unknown
Advantage Windows (sun4) data. It was pretty obvious when it was
pointed out though :).
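
The effect is easy to reproduce on a modern host. This sketch is not taken from any GE header; the struct and function names are invented, "#pragma pack" is a compiler extension (GCC, Clang and MSVC accept it), and the 2 byte alignment used to imitate the sun3 convention is my assumption about what "not aligned on 4 byte boundaries" means in practice:

```cpp
#include <cstddef>
#include <cstdint>

// The same two fields laid out under two alignment conventions.
#pragma pack(push, 2)
struct HeaderPacked2 {
    std::uint16_t a;
    std::uint32_t b;   // follows a directly, at offset 2
};
#pragma pack(pop)

struct HeaderNatural {
    std::uint16_t a;
    std::uint32_t b;   // usually padded out to offset 4
};

// Report how far apart the two conventions place field b.
inline std::size_t alignment_shift() {
    return offsetof(HeaderNatural, b) - offsetof(HeaderPacked2, b);
}
```

On a typical host alignment_shift() is 2, so every field after the first 32 bit value in a shared header file ends up at a different offset on the two architectures.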

4.3.1.1 Sun Integers

Integers are 8, 16, 32, or 64 bit unsigned or signed two's
complement and stored in big-endian format as on Data
General and opposite to the Dec VAX. Most C compilers
treat short as 16 bits, and int and long as 32 bits.

4.3.1.2 Sun Floating Point

Formats conform to the IEEE 754-1985 Standard for Binary
Floating-Point Arithmetic. Single precision real values
are 32 bits long, in big-endian format. The high bit is
the sign bit, followed by an 8 bit excess 127 exponent
(the power to which 2 must be raised), then a 23 bit normalized
mantissa with the binary point to the left of the most
significant bit, from which 1.0 has been subtracted.
Double precision values have an 11 bit excess 1023 exponent
and a 52 bit mantissa. Quad precision values have a 15
bit excess 16383 exponent and a 112 bit mantissa.


    Sign
    |<-->|<-------- Exponent -------->|<------- Mantissa ------>|
     ______________ ______________ ______________ ______________
    |              |              |              |              |
    |______________|______________|______________|______________|
     31          28 27          24 23          20 19          16
    |<----------------------- Mantissa ------------------------>|
     ______________ ______________ ______________ ______________
    |              |              |              |              |
    |______________|______________|______________|______________|
     15          12 11           8 7            4 3             0

Here is a little piece of C++ code that should run on
anything and convert Sun IEEE floats to whatever the
host's floating point format is. It probably should take
into account a few special cases to be strictly correct:


    unsigned char buffer[4];

    instream.read((char *)buffer,4);
    if (instream) {
    #ifdef USESUN4NATIVEFLOAT
        // only valid when the host is itself big-endian IEEE
        float fvalue;
        memcpy ((char *)(&fvalue),buffer,4);
        value=fvalue;
    #else /* USESUN4NATIVEFLOAT */
        // Sparc is a Big Endian machine, so buffer[0] is the high
        // byte. Shifts and masks avoid any dependence on the
        // host's byte order or bit-field layout.
        unsigned char sign = buffer[0] >> 7;
        Uint16 exponent = ((buffer[0] & 0x7f) << 1) | (buffer[1] >> 7);
        Uint32 mantissa = ((Uint32)(buffer[1] & 0x7f) << 16)
                        | ((Uint32)buffer[2] << 8)
                        |  (Uint32)buffer[3];

        if (exponent) {
            // normalized (exponent 255, that is infinity and
            // NaN, is one of the special cases not handled)
            value = (1.0 + (double)mantissa / (1 << 23)) *
                    pow (2.0, (long)(exponent) - 127);
        } else {
            if (mantissa) {
                // denormalized
                value = (double)mantissa / (1 << 23) *
                        pow (2.0, (long)(-126));
            } else {
                value=0;
            }
        }
        value = (sign == 0) ? value : -value;
    #endif /* USESUN4NATIVEFLOAT */
    } else {
        cerr << "read failed\n" << flush;
        value=0;
    }

4.3.1.3 Sun Strings

Strings obey the usual C convention of null terminated
strings without a length preamble.


4.3.2 Sun Operating System

5. Compression Schemes

5.1 Reversible Compression

5.2 Irreversible Compression

5.2.1 Perimeter Encoding

5.3 DICOM Compression

In DICOM, compression (both reversible and irreversible) is achieved by
specifying a particular "transfer syntax" either during
negotiation of the network connection (association) or in the
media application profile for files stored on media (and
specified in the meta information header so the reader knows
which transfer syntax to switch to).


The compressed data stream is actually encoded as an "encapsulated" data
stream as defined in Part 5 of DICOM. Uncompressed data
(unencapsulated) is sent in DICOM as a series of raw bytes or
words (little or big endian) in the Value field of the Pixel
Data element (7FE0,0010). Encapsulated data on the other hand
is sent not as raw bytes or words but as Fragments contained in
Items that are the Value field of Pixel Data. The encoding of
these Items follows the same pattern as is used to specify
Sequences in DICOM, though the VR (Value Representation) field of
the Pixel Data is OB not SQ.


The encapsulated compressed data may be a single frame or it may contain
multiple frames for those SOP Classes that allow multiframe
images (such as XA, XRF, US and NM). The rules in Part 5
further specify that the first Item will either be empty or
contain a list of offsets to the beginning of the Item
containing each frame (or the only frame for a single frame
image). Also, though a frame may be split into multiple
fragments, each fragment may contain data for only one frame.
That is a frame may be split into multiple fragments, but a
fragment may not span different frames. The reason for the
fragments in the first place is that each fragment (each item)
must have a fixed, known length, so unless one buffers the
entire compressed frame before encoding it, one doesn't know in
advance how long it will be. In practice, most encoders do send
one frame per fragment but all decoders must be prepared to
handle the case where a frame spans fragments. Furthermore, all
fragments have to be of even length, and there are padding rules
in Part 5 for the last fragment of a frame (that are consistent
with the definition of padding in the JPEG standard).
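
When the offset table is not empty, its entries can be worked out from the fragment lengths alone. The sketch below reflects my reading of Part 5, namely that each entry is the offset of the Item tag of the first fragment of a frame, counted from the first byte of the first Item tag following the offset table; the function name offset_table is made up for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: given, for each frame, the lengths of its
// fragments, compute the offset table entries. Each entry is the byte
// offset of the Item tag of the frame's first fragment, counted from
// the first byte of the first Item tag after the offset table Item.
inline std::vector<std::uint32_t>
offset_table(const std::vector<std::vector<std::uint32_t> > &frames) {
    std::vector<std::uint32_t> offsets;
    std::uint32_t position = 0;
    for (std::size_t f = 0; f < frames.size(); ++f) {
        offsets.push_back(position);
        for (std::size_t i = 0; i < frames[f].size(); ++i) {
            // 4 byte Item tag + 4 byte length + the fragment itself
            position += 8 + frames[f][i];
        }
    }
    return offsets;
}
```

For two frames, the first split into fragments of 4C6 and 24A bytes and the second a single fragment, the entries come out as 0 and 720 (hex), since 8+4C6+8+24A = 720.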


Part 5 contains several examples of how to fill in the various fields in
Items of the encapsulated sequence-like value for Pixel Data, so
these will not be repeated here. However the overall strategy
looks something like this for an image with two frames, the first
split across two fragments, and an empty offset table:


    (7FE0,0010) VR=OB VL=FFFFFFFF Pixel Data
    (FFFE,E000) VR=   VL=00000000 Item (empty offset table, hence zero length)
    (FFFE,E000) VR=   VL=000004C6 Item (first fragment of first frame)
        .... compressed byte stream here (4C6 bytes)
    (FFFE,E000) VR=   VL=0000024A Item (second fragment of first frame)
        .... compressed byte stream here (24A bytes)
    (FFFE,E000) VR=   VL=00000628 Item (first fragment of second frame)
        .... compressed byte stream here (628 bytes)
    (FFFE,E0DD) VR=   VL=00000000 Sequence Delimiter


Note that the Item and Sequence Delimiter tags have no VR, that
the Item Delimiter tag is never used, since Items are required
to be of fixed not undefined length, and that the Sequence
Delimiter tag is always used, since the Pixel Data is always of
undefined length (that is FFFFFFFF) for encapsulated data.
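
Going the other way, writing such a Value field is mostly bookkeeping. This is a minimal sketch and not code from dicom3tools: the names Bytes, put16, put32 and encapsulate are invented, the offset table is always left empty, and odd length fragments are rejected rather than padded because the right padding depends on the compression scheme. The Pixel Data element header itself, (7FE0,0010) with VL FFFFFFFF, still has to be written in front of the result:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Append a 16 or 32 bit value in little-endian byte order.
inline void put16(Bytes &out, std::uint16_t v) {
    out.push_back(v & 0xff);
    out.push_back((v >> 8) & 0xff);
}
inline void put32(Bytes &out, std::uint32_t v) {
    put16(out, v & 0xffff);
    put16(out, (v >> 16) & 0xffff);
}

// Hypothetical helper: wrap already compressed fragments as the Value
// of Pixel Data, with an empty offset table.
inline Bytes encapsulate(const std::vector<Bytes> &fragments) {
    Bytes out;
    put16(out, 0xfffe); put16(out, 0xe000); put32(out, 0);  // empty offset table
    for (std::size_t i = 0; i < fragments.size(); ++i) {
        if (fragments[i].size() % 2 != 0)
            throw std::invalid_argument("fragment length must be even");
        put16(out, 0xfffe); put16(out, 0xe000);              // Item tag
        put32(out, std::uint32_t(fragments[i].size()));      // fixed length
        out.insert(out.end(), fragments[i].begin(), fragments[i].end());
    }
    put16(out, 0xfffe); put16(out, 0xe0dd); put32(out, 0);  // Sequence Delimiter
    return out;
}
```

A single two byte fragment therefore produces 26 bytes: 8 for the empty offset table Item, 8+2 for the fragment Item, and 8 for the Sequence Delimiter.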


If one is trying to decode a DICOM image encoded with an
encapsulated transfer syntax, one therefore has to get to the
Pixel Data tag, and start parsing the sequence-like structure.
One cannot just pass the entire Value field of Pixel Data to a
conventional JPEG decoder for instance. One needs to strip out
the embedded Item tags and the trailing Sequence Delimiter. For
an example of how to do this see the source code from
dicom3tools in "libsrc/include/pixeldat/unencap.h", a simplified
version of which (without the GE bug handling) is reproduced
here.


    size_t read(void)
    {
        // - non-pixel data is always LE, including fragment
        //   delimiters and lengths
        // - 1st item is offset table, may have zero VL
        // - other items are fragments
        // - finally sequence delimitation tag (with zero VL)
        // - each delimiter is 2 byte group, 2 byte element, 4 byte
        //   VL, little endian
        // - Item tag is (0xfffe,0xe000)
        // - Seq delimiter is (0xfffe,0xe0dd)

        length=0;

        while (!lefttoreadthisfragment && !finished && !bad) {
            Uint16 group=read16();
            Uint16 element=read16();
            Uint32 vl=read32();
            if (group == 0xfffe) {
                if (element == 0xe0dd) {
                    // Sequence Delimiter Tag
                    Assert(vl == 0);
                    finished=true;
                }
                else /* if (element == 0xe000) */ {
                    // Item Tag
                    bool vlbyteorderwrong=false;
                    if (++fragmentnumber > 0) {
                        // Zero length fragments thought
                        // not to be legal
                        Assert(vl);
                        lefttoreadthisfragment=vl;
                    }
                    else {
                        // skip the offset table
                        Assert(vl%4 == 0);
                        unsigned i=0;
                        while (vl) {
                            Uint32 offset=read32();
                            vl-=4;
                            ++i;
                        }
                    }
                }
            }
            else {
                // bad tag group in encapsulated data
                bad=true;
            }
        }

        if (lefttoreadthisfragment && !bad) {
            length=unsigned(lefttoreadthisfragment > maxlength
                ? maxlength : lefttoreadthisfragment);
            if (istr->read(buffer,length)) {
                length=istr->gcount();
            }
            else {
                bad=true;
                length=0;
            }
            lefttoreadthisfragment-=length;
        }

        return length;
    }
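
The excerpt above depends on member variables and helpers from the rest of the class. For experimenting, the same idea can be sketched as a free-standing function that works on a Value field already held in memory. This is my own simplification, not the dicom3tools code: the names Bytes, get16, get32 and strip_encapsulation are invented, and all fragments are simply concatenated, so a multiframe image would still need splitting into frames afterwards:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Read little-endian 16 and 32 bit values at a given position.
inline std::uint16_t get16(const Bytes &in, std::size_t pos) {
    return std::uint16_t(in[pos] | (in[pos + 1] << 8));
}
inline std::uint32_t get32(const Bytes &in, std::size_t pos) {
    return std::uint32_t(get16(in, pos))
         | (std::uint32_t(get16(in, pos + 2)) << 16);
}

// Hypothetical helper: strip the Item tags and Sequence Delimiter from
// the Value field of encapsulated Pixel Data, skipping the offset table.
inline Bytes strip_encapsulation(const Bytes &in) {
    Bytes out;
    std::size_t pos = 0;
    bool first = true;                   // first Item is the offset table
    for (;;) {
        if (in.size() - pos < 8) throw std::runtime_error("truncated delimiter");
        std::uint16_t group = get16(in, pos);
        std::uint16_t element = get16(in, pos + 2);
        std::uint32_t vl = get32(in, pos + 4);
        pos += 8;
        if (group != 0xfffe) throw std::runtime_error("bad tag group");
        if (element == 0xe0dd) break;    // Sequence Delimiter
        if (element != 0xe000) throw std::runtime_error("unexpected element");
        if (vl > in.size() - pos) throw std::runtime_error("truncated item");
        if (!first) out.insert(out.end(), in.begin() + pos, in.begin() + pos + vl);
        first = false;
        pos += vl;
    }
    return out;
}
```

Checking each length against what is left in the buffer before copying keeps a corrupt VL from running off the end of the data.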


An application that will take a DICOM dataset and write a pure
byte stream (having stripped off the DICOM encapsulation) is
also in dicom3tools, "dctoraw". One can feed the output of this
utility straight to a JPEG decoder such as the Stanford PVRG
utility "jpeg -d". If any padding is present at the end of each
frame, it should have been encoded in a manner consistent with
JPEG padding defined in ISO 10918-1 so that the JPEG decoder
won't fail if it encounters padding between the image frames.


Note also that the terms "image" and "frame" are used
slightly differently in DICOM than in JPEG, so be careful when
comparing the two standards.


When using images with more than one component (that is a color
image rather than a grayscale image), take care about the color
space. One of the features of the ISO 10918-1 JPEG standard is
that it specifies only a compressed bitstream, and not a file
format. Even if there are three components specified in the
compressed bitstream, that does not mean they are RGB or YBR or
whatever. This has to be signalled outside the bitstream, and
in DICOM this is done in Photometric Interpretation (this is
somewhat controversial however, and one should look at recent
proposed DICOM CPs on the matter, such as CP 143).


In the non-DICOM world, the color space is specified in the file
header such as the commonly used JFIF header, or its superset,
the SPIFF header as defined in ISO 10918-3. Be especially
careful that one does not assume during decoding that a JFIF
header is present in the DICOM compressed bit stream ... it is
not. If one wants to feed the extracted bitstream to a JPEG
decoder that needs a JFIF header (like the IJG code), then you
need to add one. Conversely, never create an encapsulated DICOM
image with a bitstream that contains the JFIF header ... strip
it off first or use an encoder like Stanford PVRG JPEG that
doesn't create JFIF headers.
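
A quick check for whether a bitstream carries a JFIF header can be sketched as follows. The function name has_jfif_header is made up; the test relies only on the JFIF convention that the APP0 marker segment comes immediately after the SOI marker and starts with the identifier "JFIF" and a zero byte:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: report whether a JPEG bitstream begins with a
// JFIF APP0 segment, that is SOI (FF D8), APP0 (FF E0), a 2 byte
// length, then the identifier "JFIF" followed by a zero byte.
inline bool has_jfif_header(const unsigned char *p, std::size_t n) {
    static const unsigned char id[5] = { 'J', 'F', 'I', 'F', 0 };
    return n >= 11
        && p[0] == 0xff && p[1] == 0xd8
        && p[2] == 0xff && p[3] == 0xe0
        && std::memcmp(p + 6, id, 5) == 0;
}
```

Run on the first few bytes of an extracted DICOM fragment this should return false; if it returns true the encoder has left a JFIF header in.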


Here JPEG has been discussed, but the same principle applies to
other encapsulated data sets in DICOM, including the RLE
compression scheme popular in Ultrasound images (which is
equivalent to the TIFF PackBits compression scheme). The
compression scheme to interpret the encapsulated bitstream is
different, but the encapsulation mechanism using Item tags and
fragments is identical.
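
For reference, decoding a PackBits style byte stream takes only a few lines. This sketch covers the run length decoding alone; the names Bytes and packbits_decode are invented, and the DICOM RLE header that locates each segment within a fragment is not handled here:

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Hypothetical helper: decode one PackBits-style run length encoded
// segment. A control byte n in 0..127 copies the next n+1 bytes
// literally; n in 129..255 (that is -127..-1 as a signed byte) repeats
// the next byte 257-n times; n of 128 is a no-op.
inline Bytes packbits_decode(const Bytes &in) {
    Bytes out;
    std::size_t pos = 0;
    while (pos < in.size()) {
        unsigned n = in[pos++];
        if (n < 128) {
            // literal run, stopping early if the input is truncated
            for (unsigned i = 0; i <= n && pos < in.size(); ++i)
                out.push_back(in[pos++]);
        } else if (n > 128) {
            if (pos >= in.size()) break;       // truncated input
            out.insert(out.end(), 257 - n, in[pos++]);
        }
    }
    return out;
}
```

For example the input FE AA 02 80 00 2A expands to AA AA AA 80 00 2A: a replicate run of three followed by a literal run of three.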


This mechanism has been widely used in the cardiac angiography
world on the DICOM CDs that these devices make, on Ultrasound 90
mm MODs, and on GE's more recent CT and MR scanners that
use the CT and MR media application profile on 130 mm MODs.
Note that early implementations of the encapsulation mechanism
and the JPEG lossless encoding contain some bugs which are
described in detail in the section on GE CTI.

6. Getting Connected

6.1 Tapes

Nine-track half-inch tapes were the old medium of choice for archiving
and image exchange and many older pieces of equipment will have these.
Unfortunately most people don't have such a drive on their workstation
or personal computer. There are several possibilities:


- Use another piece of equipment that has a more modern or
networked or serial-ported host and a nine-track drive, and use it to
do the extraction. I used to use a networked Signa 4X to do this to
extract GE 9800 CT tapes.

- Visit your MIS department, which almost certainly has an archaic
mainframe with a tape drive. Sometimes tough to get them to read
formats they aren't expecting though (the hosts not the people I mean
:) ).

- Buy a nine-track for your workstation. This may seem a ridiculous
idea given that the price of a new 6250 bpi drive is around $5,000, but
one can often pick up a bargain primitive non-6250 or refurbished drive
that is adequate for the job.


The Qualstar 1054 is one such drive, that attaches to a SCSI port, and
works with the regular SunOS SCSI tape driver, once a few tables in the
kernel have been updated as follows, and the kernel rebuilt:


    {root}% pwd
    /usr/kvm/sys/scsi/targets

    {root}% diff -c stdef.h.prequalstar stdef.h
    *** stdef.h.prequalstar Tue Aug 30 19:32:24 1994
    --- stdef.h Tue Aug 30 19:32:24 1994
    ***************
    *** 43,48 ****
    --- 43,49 ----
      #define ST_TYPE_FUJI 0x21 /* Fujitsu - (not tested) */
      #define ST_TYPE_KENNEDY 0x22 /* Kennedy */
      #define ST_TYPE_HP 0x23 /* HP */
    + #define ST_TYPE_QUALSTAR 0x24 /* Qualstar */
      #define ST_TYPE_HIC 0x26 /* Generic 1/2" Cartridge */
      #define ST_TYPE_REEL 0x27 /* Generic 1/2" Reel Tape */

    {root}% diff -c st_conf.c.prequalstar st_conf.c
    *** st_conf.c.prequalstar Tue Aug 30 19:32:22 1994
    --- st_conf.c Tue Aug 30 19:32:22 1994
    ***************
    *** 153,158 ****
    --- 153,174 ----
       * so our best guess as to their capabilities is
       * included herein.
       */
    + /* Qualstar 1054 or 1260s scsi 9-track with 64KB buffer */
    + {
    +     "Qualstar 1054/1260s 1/2\" Reel", 7, "NCR ADP-53", ST_TYPE_QUALSTAR, 10240,
    +     (ST_REEL | ST_VARIABLE | ST_BSF | ST_BSR),
    +     300, 300,
    +     { 0x00, 0x02, 0x06, 0x03},
    +     { 0, 0, 0, 0 }
    + },
    + /* Qualstar 1054 scsi 9-track with 256KB buffer */
    + {
    +     "Qualstar 1054 1/2\" Reel", 10, "QUALSTAR10", ST_TYPE_QUALSTAR, 10240,
    +     (ST_REEL | ST_VARIABLE | ST_BSF | ST_BSR),
    +     300, 300,
    +     { 0x00, 0x02, 0x06, 0x06},
    +     { 0, 0, 0, 0 }
    + },
      /* Wangtek QIC-150 1/4" cartridge */
      {
          "Wangtek QIC-150", 14, "WANGTEK 5150ES", ST_TYPE_WANGTEK, 512, (ST_QIC | ST_AUTODEN_OVERRIDE),


I got my Qualstar 1054 from Bill Power at Power Computer Services for
only $750 and have successfully read GE 9800 CT and Philips S15 MR tapes
with it so far. See the "Sources" section for where to get one.


Once you have such a tape drive connected to the SCSI port, you can either
write simple programs to read files (easiest if the tape has variable
length records) or use shell scripts and the "dd" command with whatever
the correct block size is. See dd(1), mt(1), and mtio(3) for more
information. Remember that the read(2) call reads one fixed or variable
length record at a time, and returns 0 bytes read for a tape mark, and
two tape marks in a row indicate the end of the tape (normally). If
you encounter short files with a series of records 80 bytes long chances
are you are dealing with header/end markers. This is what ANSI standard
tapes off VAX VMS seem to look like.
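
The read loop just described can be sketched as below. The names TapeSummary and scan_tape are invented for illustration, the 64 KB maximum record size is an arbitrary assumption, and how a particular tape driver reports tape marks and end of tape varies between systems, so treat this as a starting point only:

```cpp
#include <cstddef>
#include <unistd.h>
#include <vector>

// Summary of what was found on the tape.
struct TapeSummary {
    unsigned files;     // files terminated by a tape mark
    unsigned records;   // total number of records read
};

// Hypothetical helper: read records from an already opened tape device
// until two tape marks in a row. Each read(2) returns one record, and
// a return value of 0 is a tape mark.
inline TapeSummary scan_tape(int fd, std::size_t maxrecord = 65536) {
    TapeSummary s = { 0, 0 };
    std::vector<char> buffer(maxrecord);
    unsigned inthisfile = 0;            // records since the last tape mark
    for (;;) {
        ssize_t n = read(fd, &buffer[0], buffer.size());
        if (n < 0) break;               // I/O error
        if (n == 0) {
            // No records since the last mark: two marks in a row
            // (or an empty tape), so this is the end.
            if (inthisfile == 0) break;
            ++s.files;
            inthisfile = 0;
        } else {
            ++s.records;                // buffer[0..n) holds one record
            ++inthisfile;
        }
    }
    return s;
}
```

The file descriptor would come from opening whichever no-rewind tape device the system provides, and a real extraction program would write each record out where this sketch only counts it.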


Anyone who has any further information about tape formats and handling,
especially references to standard or on-line documents please let me
know.

6.2 Ethernet

6.3 Serial Ports


The next part is part7 - information sources.

