
v12i068: Public domain TAR, Part01/03


Rich Salz

Nov 29, 1987, 6:59:20 PM
Submitted-by: John Gilmore <hoptoad!g...@UUNET.UU.NET>
Posting-number: Volume 12, Issue 68
Archive-name: pdtar/part01

[ See the first two paragraphs of the README for information on
what this fine program does. --r$ ]

: To unbundle, sh this file
echo README
cat >README <<'@@@ Fin de README'
This is the Nov87 release of a public domain tar(1) replacement. It
implements the 'c', 'x', and 't' commands of Unix tar, and many of the
options. It creates P1003 "Unix Standard" [draft 6] tapes by default,
and can read and write both old and new formats. It can compress or
decompress tar archives "on the fly" (using the 'z' option) as well as
accessing remote tape drives or files by specifying
"host:/dev/tapedrive". It lets you set the default tape drive by
setting TAPE in your environment. Its verbose output looks more like
"ls -l" than the Unix tar, the columns line up, and you can get verbose
listings from the 'cvv' option as well as from 'xvv' and 'tv'. It does
shell-globbing (regular expressions) for listing and extraction. It is
a little better at reading damaged tapes than Unix tar. There is a
half-baked "diff" option for comparing a tape against the file system.
And it's free.

It is designed to be a lot more efficient than the standard Unix tar;
it does as little bcopy-ing as possible, and does file I/O in large
blocks. On the other hand, it has not been timed or performance-tuned;
it's just *designed* to be faster.

On SunOS 3.3, the tar archives it creates under the 'old' option are
byte-for-byte the same as those created by /bin/tar, except the trash
at the end of each file and at the end of the archive has been replaced
by zeroes.

It was written and initially debugged on a Sun Workstation running
4.2BSD. It has been run on Xenix, Unisoft, Vax 4.2BSD, utzoonix, USG,
Masscomp, Minix, and MSDOS systems. I'm interested in finding people
who will port it to other types of (Unix and non-Unix) systems, use it,
and send back the changes; and people who will add the obscure tar
options that they happen to use and I don't. In particular, VMS, Mac,
Atari and Amiga versions would be handy.

It still has a number of loose ends, marked by "FIXME" comments in the
source. Fixes to these things are also welcome.

I am the author of all the code in this program, except some of the
subroutines, which are from contributors listed below. I hereby place
it in the public domain. If you modify it, or port it to another
system, please send me back a copy, so I can keep a master source.

This program is much better than it started, due to the effort and care
put in by Henry Spencer, Fred Fish, Ian Darwin, Geoff Collyer, Stan
Barber, Guy Harris, Dave Brower, Richard Todd, Michael Rendell, Stu
Heiss, and Rich $alz. Thank you, one and all.

John Gilmore
Nebula Consultants
PO Box 170608
San Francisco, California, USA 94117-0608
hoptoad!gnu or g...@toad.com
Hoptoad talks to sun, ptsfa, ihnp4, utzoo, ucsfcgl.

@(#)README 1.14 87/11/11
@@@ Fin de README
echo PORTING
cat >PORTING <<'@@@ Fin de PORTING'
Porting hints for public domain tar
John Gilmore, ihnp4!hoptoad!gnu
@(#)PORTING 1.13 87/11/11

The Makefile should be edited to comment out all the undesired
versions, and create the following configuration lines for the system
you are compiling it on:

DEFS = the proper #define's to conditionally compile for your system.
LIBS = the system libraries and/or object modules to link with the program.
LINT = the lint program (or the compiler with extra checking turned on)
LINTFLAGS = a good strong way to invoke 'lint' on your system.
DEF_AR_FILE = the name of the default archive file on your system.
It should be enclosed in quoted quotes, e.g. \"/dev/foo\" .
DEFBLOCKING = the default blocking factor on your system.
O = the suffix for object files ('o', except 'obj' for MSDOS).

A copy of "getopt", the standard argument parser, is required. It's in
libc on Missed'em V systems and 4.3BSD; on most other systems, you'll
need a copy of a public domain getopt, available through the
comp.sources.unix archives, or from the AT&T Toolchest if you can't
find it elsewhere.

A copy of the Berkeley directory access routines is also required.
These are in libc and <sys/dir.h> on Berkeley systems. A public domain
version is available through comp.sources.unix. There is an #include
you have to change in create.c for this, to set the name of the include
file you have. Some systems have the include file in <sys/ndir.h>.
You'll have to find it on your system, or get the public domain one and
place it somewhere. For MSDOS, I have supplied these directory
routines in msd_dir.c and msd_dir.h, since it's likely that your system
doesn't have them. To permanently install these into your MSC 3.0
library, do the following:
copy msd_dir.h c:\c\include\sys\dir.h
cl -A$(MODEL) -c msd_dir.c
lib $(MODEL)dir.lib msd_dir.obj;
Change c:\c\include to wherever your standard include directory is.
You might have to modify this procedure if you aren't using MSC 3.0.

Grep for FIXME to find places that aren't finished or which have
portability problems. Also see the file TODO.

The MSDOS port was done under the Microsoft C 3.0 compiler and
libraries. In the Makefile, COPTS should be changed to -Zi or nothing;
and there is a special link command for making tar.exe, which you will
have to uncomment, since MSDOS can't handle command lines longer than
128 bytes. Also, clean and install will not work unless you change
/ in path names to \.

On Minix, there are a bunch of problems. "V7 compatible" my ass.
* "make" doesn't expand macros in the Makefile properly. You will
probably have to expand them by hand. Better to go in and fix Minix
"make" though...
* The directory access library is nonexistent. It wasn't in V7 but
anybody who writes code without it, even on V7 systems, is a fool.
* Various other library routines are broken or missing, e.g. printf()
doesn't take "%*s" or "%.*s"; there is no <sys/types.h>, which Unix
requires; and there is no ctime() or getopt().
@@@ Fin de PORTING
echo Makefile
cat >Makefile <<'@@@ Fin de Makefile'
# Makefile for public domain tar program.
# @(#)Makefile 1.30 87/11/11

# Berserkeley version
DEFS = -DBSD42
LDFLAGS =
LIBS =
LINT = lint
LINTFLAGS = -abchx
DEF_AR_FILE = \"/dev/rmt8\"
DEFBLOCKING = 20
O = o

# USG version
#DEFS = -DUSG
#LDFLAGS =
#LIBS = -lndir
#LINT = lint
#LINTFLAGS = -p
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20
#O = o

# UniSoft's Uniplus SVR2 with NFS
#DEFS = -DUSG -DUNIPLUS -DNFS -DSVR2
#LDFLAGS =
#LIBS = -lndir
#LINT = lint
#LINTFLAGS = -bx
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20
#O = o

# MASSCOMP version
#CC = ucb cc
#DEFS = -DBSD42
#LDFLAGS =
#LIBS =
#LINT = lint
#LINTFLAGS = -bx
#DEF_AR_FILE = \"/dev/rmt0\"
#DEFBLOCKING = 20
#O = o

# (yuk) MS-DOS (Microsoft C) version
#MODEL = S
#DEFS = -DNONAMES -A$(MODEL) -nologo
#LDFLAGS =
#LIBS = $(MODEL)dir.lib
#LINT = $(CC)
#LINTFLAGS = -W3
#DEF_AR_FILE = \"tar.out\"
#DEFBLOCKING = 20
#O = obj

# V7 version
# Pick open3 emulation or nonexistence. See open3.h, port.c.
##DEFS = -DV7 -DEMUL_OPEN3 -Dvoid=int
##DEFS = -DV7 -DNO_OPEN3 -Dvoid=int
#LDFLAGS =
#LIBS = -lndir
#LINT = lint
#LINTFLAGS = -abchx
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20
#O = o

# Minix version
# No lint, so no lintflags. Default file is stdin/out. (Minix "tar"
# doesn't even take an "f" flag, it assumes argv[2] is the archive name!)
# Minix "make" doesn't expand macros right, so Minix users will have
# to expand CFLAGS, SRCS, O, etc by hand, or fix your make. Not my problem!
# You'll also need to come up with getopt() and ctime(), the directory
# library, and a fixed doprintf() that handles %*s. Put this stuff in
# the "SUBSRC/SUBOBJ" macro below if you didn't put it in your C library.
# Note that Minix "cc" produces ".s" files, not .o's, so O = s has been set.
#
# Pick open3 emulation or nonexistence. See open3.h, port.c.
##DEFS = -DV7 -DMINIX -DEMUL_OPEN3
##DEFS = -DV7 -DMINIX -DNO_OPEN3
#LDFLAGS =
#LIBS =
#DEF_AR_FILE = \"-\"
# (No good reason for this blocking factor, change at will.)
#DEFBLOCKING = 8
#O = s

# Xenix version
#DEFS = -DUSG -DXENIX
#LDFLAGS =
#LIBS = -lx
#LINT = lint
#LINTFLAGS = -p
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20
#O = o


CFLAGS = $(COPTS) $(ALLDEFS)
ALLDEFS = $(DEFS) \
-DDEF_AR_FILE=$(DEF_AR_FILE) \
-DDEFBLOCKING=$(DEFBLOCKING)
# next line for Debugging
COPTS = -g
# next line for Production
#COPTS = -O

# Add things here like getopt, readdir, etc that aren't in your
# standard libraries. (E.g. MSDOS needs getopt, msd_dir.c, msd_dir.obj)
SUBSRC=
SUBOBJ=

# Destination directory and installation program for make install
DESTDIR = /usr/pd
INSTALL = cp
RM = rm -f

SRC1 = tar.c create.c extract.c buffer.c getoldopt.c
SRC2 = list.c names.c diffarch.c port.c wildmat.c $(SUBSRC)
SRCS = $(SRC1) $(SRC2)
OBJ1 = tar.$O create.$O extract.$O buffer.$O getoldopt.$O list.$O
OBJ2 = names.$O diffarch.$O port.$O wildmat.$O $(SUBOBJ)
OBJS = $(OBJ1) $(OBJ2)
AUX = README PORTING Makefile TODO tar.1 tar.5 tar.h port.h open3.h \
msd_dir.h msd_dir.c

all: tar

tar: $(OBJS)
	$(CC) $(LDFLAGS) -o tar $(COPTS) $(OBJS) $(LIBS)
# command is too long for Messy-Dos (128 char line length limit) so
# this kludge is used...
# @echo $(OBJ1) + > command
# @echo $(OBJ2) >> command
# link @command, $@,,$(LIBS) /NOI;
# @$(RM) command

install: all
	$(RM) $(DESTDIR)/tar $(DESTDIR)/.man/tar.[15]
	$(INSTALL) tar $(DESTDIR)/tar
	$(INSTALL) tar.1 $(DESTDIR)/.man/tar.1
	$(INSTALL) tar.5 $(DESTDIR)/.man/tar.5

lint: $(SRCS)
	$(LINT) $(LINTFLAGS) $(ALLDEFS) $(SRCS)

clean:
	$(RM) errs $(OBJS) tar

tar.shar: $(SRCS) $(AUX)
	shar >tar.shar1 $(AUX)
	shar >tar.shar2 $(SRC1)
	shar >tar.shar3 $(SRC2)

tar.tar.Z: $(SRCS) $(AUX)
	/bin/tar cf - $(AUX) $(SRCS) | compress -v >tar.tar.Z

$(OBJS): tar.h port.h
@@@ Fin de Makefile
echo TODO
cat >TODO <<'@@@ Fin de TODO'
@(#) TODO 1.15 87/11/06

Test owner/group on extraction better.

creation of links, symlinks, nodes doesn't follow the -k (f_keep) guidelines;
if the file already exists, it is not replaced, even though no -k.

Check stderr and stdout for errors after writing, and quit if so.

Preliminary design of Multifile option to handle EOFs on input and
output. Multifile can just close the archive when it hits end of
archive, and ask for archive to be changed. It has no choice on some
media, e.g. floppies and cartridge tapes, where there is no room for an
EOF block there. Start off 2nd archive medium with odd header block,
duplicating original, but with offset to start of data spec'd. Reading
such a header causes tar non-'M' to complain while extracting (but to
seek there and do it anyway!) Big win -- this works on cartridge
tapes, should work on floppies, might work on magtape. It would
encourage the *&%#$ systems programmers to fix their drivers, too!

Profile it and see where the time, call counts, etc are going.

Fix directory timestamps after inserting files into them. Wait til next
file that's not in the directory. Need a stack of them.

Option to seek the input file (in skip_file) rather than reading
and tossing it? (Could just jump in buffer if stuff is in core.)
Could misalign archive reads versus filesys and slow it down, who knows?

Add -C option for creating from odd directories a la 4.2BSD?

Break out odd bits of code into separate support modules.

Add the r, u, X, l, F, C, and digit options of Unix tar.

V8 tar does something that is quite handy when reading tapes written on
4.2 system into non-4.2 systems: it reduces file name components to
14 bytes or less and ensures that they are unique (I think it truncates
to 10 bytes and appends "..aa" where aa are two unique letters) and puts
out a file containing the mapping between long names on tape and short
names on disk.

Clean up 'd' (diff) option. Currently it works for regular files
and symlinks, needs work for dirs and links. Ideally, output should
look like "diff -r" or -rl after an extract of the tape and a real diff.
Right now it's very messy. To do the above, we'd need to read the
directories that we touch and check all the file names against what's
on the tape. All we do now is check the file contents and stats.

Check "int" variables to see if they really need to be long (file sizes,
record counts, etc). Sizes of in-core buffers should be int; since
malloc() takes an int argument we can never allocate one any bigger.
Maybe unsigned int would be better, though. Little system people,
help me out here! (E.g. run lint on it on your system and send me
the result if it shows anything fixable.)
@@@ Fin de TODO
echo tar.1
cat >tar.1 <<'@@@ Fin de tar.1'
.TH TAR 1 "5 November 1987"
.\" @(#)tar.1 1.12 11/6/87 Public Domain - gnu
.SH NAME
tar \- tape (or other media) file archiver
.SH SYNOPSIS
\fBtar\fP \-[\fBBcdDhiklmopRstvxzZ\fP]
[\fB\-b\fP \fIN\fP]
[\fB\-f\fP \fIF\fP]
[\fB\-T\fP \fIF\fP]
[ \fIfilename or regexp\fP\| .\|.\|. ]
.SH DESCRIPTION
\fItar\fP provides a way to store many files into a single archive,
which can be kept in another Unix file, stored on an I/O device
such as tape, floppy, cartridge, or disk, sent over a network, or piped to
another program.
It is useful for making backup copies, or for packaging up a set of
files to move them to another system.
.LP
\fItar\fP has existed since Version 7 Unix with very little change.
It has been proposed as the standard format for interchange of files
among systems that conform to the IEEE P1003 ``Portable Operating System''
standard.
.LP
This version of \fItar\fP supports some of the extensions which
were proposed in the P1003 draft standards, including owner and group
names, and support for named pipes, fifos, contiguous files,
and block and character devices.
.LP
When reading an archive, this version of \fItar\fP continues after
finding an error. Previous versions required the `i' option to ignore
checksum errors.
.SH OPTIONS
\fItar\fP options can be specified in either of two ways. The usual
Unix conventions can be used: each option is preceded by `\-'; arguments
directly follow each option; multiple options can be combined behind one `\-'
as long as they take no arguments. For compatibility with the Unix
\fItar\fP program, the options may also be specified as ``keyletters,''
wherein all the option letters occur in the first argument to \fItar\fP,
with no `\-', and their arguments, if any, occur in the second, third, ...
arguments. Examples:
.LP
Normal: tar -f arcname -cv file1 file2
.LP
Old: tar fcv arcname file1 file2
.LP
At least one of the \fB\-c\fP, \fB\-t\fP, \fB-d\fP, or \fB\-x\fP options
must be included. The rest are optional.
.LP
Files to be operated upon are specified by a list of file names, which
follows the option specifications (or can be read from a file by the
\fB\-T\fP option). Specifying a directory name causes that directory
and all the files it contains to be (recursively) processed. If a
full path name is specified when creating an archive, it will be written
to the archive without the initial "/", to allow the files to be later
read into a different place than where they were
dumped from, and a warning will be printed. If
files are extracted from an archive which contains
full path names, they will be extracted relative to the current directory
and a warning message printed.
.LP
When extracting or listing files, the ``file names'' are treated as
regular expressions, using mostly the same syntax as the shell. The
shell actually matches each substring between ``/''s separately, while
\fItar\fP matches the entire string at once, so some anomalies will
occur; e.g. ``*'' or ``?'' can match a ``/''. To specify a regular
expression as an argument to \fItar\fP, quote it so the shell will not
expand it.
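.LP
The whole-string matching described above can be sketched in C.
This is only an illustration under stated assumptions, not the matcher
shipped in wildmat.c: the name glob_match is hypothetical, and only
``*'' and ``?'' are handled.
.nf
.sp .5v
```c
#include <assert.h>

/*
 * Hypothetical helper, not the distribution's wildmat.c: match text
 * against a pattern in which '*' matches any run of characters and
 * '?' matches any single character.  A '/' gets no special treatment,
 * so both wildcards can match it.
 * Returns 1 on a match, 0 otherwise.
 */
int glob_match(const char *pat, const char *text)
{
    assert(pat != 0 && text != 0);
    for (; *pat != 0; pat++, text++) {
        if (*pat == '*') {
            /* Collapse runs of '*', then try every possible tail. */
            while (*pat == '*')
                pat++;
            if (*pat == 0)
                return 1;
            for (; *text != 0; text++)
                if (glob_match(pat, text))
                    return 1;
            return 0;
        }
        if (*text == 0)
            return 0;
        if (*pat != '?' && *pat != *text)
            return 0;
    }
    return *text == 0;
}
```
.fi
.LP
With this sketch, glob_match("src/*.c", "src/lib/a.c") returns 1,
because the ``*'' is allowed to run across the ``/''; the shell would
not expand the same pattern to that file.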
.IP "\fB\-b\fP \fIN\fP"
Specify a blocking factor for the archive. The block size will be
\fIN\fP x 512 bytes. Larger blocks typically run faster and let you
fit more data on a tape. The default blocking factor is set when
\fItar\fP is compiled, and is typically 20. There is no limit to the
maximum block size, as long as enough memory can be allocated for it,
and as long as the device containing the archive can read or write
that block size.
.IP \fB\-B\fP
When reading an archive, reblock it as we read it.
Normally, \fItar\fP reads each
block with a single \fIread(2)\fP system call. This does not work
when reading from a pipe or network socket under Berkeley Unix;
\fIread(2)\fP only gives as much data as has arrived at the moment.
With this option, it
will do multiple \fIread(2)\fPs to fill out to a record boundary,
rather than reporting an error.
This option is default when reading an archive from standard input,
or over a network.
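.IP
The reblocking idea can be sketched as a loop around \fIread(2)\fP.
This is an assumption-laden illustration, not the code in buffer.c:
the name read_full is hypothetical, and interrupted system calls are
not retried.
.nf
.sp .5v
```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling read(2) until "want" bytes have
 * arrived, end of file is reached, or an error occurs.  Returns the
 * number of bytes collected, or -1 if the very first read fails.
 */
long read_full(int fd, char *buf, long want)
{
    long got = 0;

    assert(buf != 0 && want >= 0);
    while (got < want) {
        long n = (long) read(fd, buf + got, (size_t) (want - got));
        if (n < 0)
            return got > 0 ? got : -1;
        if (n == 0)
            break;          /* end of file: return the short count */
        got += n;
    }
    return got;
}
```
.fi
.IP
A caller would ask for one full block at a time and treat a short
count as the end of the archive rather than as an error.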
.IP \fB\-c\fP
Create an archive from a list of files.
.IP \fB\-d\fP
Diff an archive against the files in the file system. Reports
differences in file size, mode, uid, gid, and contents. If a file
exists on the tape, but not in the file system, that is reported.
This option needs further work to be really useful.
.IP \fB\-D\fP
When creating an archive, only dump each directory itself; don't dump
all the files inside the directory. In conjunction with \fIfind\fP(1),
this is useful in creating incremental dumps for archival backups,
similar to those produced by \fIdump\fP(8).
.IP "\fB\-f\fP \fIF\fP"
Specify the filename of the archive. If the specified filename is ``\-'',
the archive is read from the standard input or written to the standard output.
If the \fB-f\fP option is not used, and the environment variable \fBTAPE\fP
exists, its value will be used; otherwise,
a default archive name (which was picked when tar was compiled) is used.
The default is normally set to the ``first'' tape drive or other transportable
I/O medium on the system.
.IP
If the filename contains a colon before a slash, it is interpreted
as a ``hostname:/file/name'' pair. \fItar\fP will invoke the commands
\fIrsh\fP and \fIdd\fP to access the specified file or device on the
system \fIhostname\fP. If you need to do something unusual like rsh with
a different user name, use ``\fB\-f \-\fP'' and pipe it to rsh manually.
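.IP
The ``colon before a slash'' test can be sketched as follows.
The name is_remote_name is hypothetical, and the sketch assumes that a
name with a colon but no slash at all is treated as a local file;
the actual code may decide that case differently.
.nf
.sp .5v
```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if name contains a ':' that comes
 * before its first '/', i.e. it looks like "hostname:/file/name";
 * otherwise return 0.  Assumption: no '/' at all means "local".
 */
int is_remote_name(const char *name)
{
    const char *colon, *slash;

    assert(name != 0);
    colon = strchr(name, ':');
    slash = strchr(name, '/');
    if (colon == 0 || slash == 0)
        return 0;
    return colon < slash;
}
```
.fi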
.IP \fB\-h\fP
When creating an archive, if a symbolic link is encountered, dump
the file or directory to which it points, rather than
dumping it as a symbolic link.
.IP \fB\-i\fP
When reading an archive, ignore blocks of zeros in the archive. Normally
a block of zeros indicates the end of the archive,
but in a damaged archive, or one which was
created by appending several archives, this option allows \fItar\fP to
continue. It is not on by default because there is garbage written after the
zeroed blocks by the Unix \fItar\fP program. Note that with this option
set, \fItar\fP will read all the way to the end of the file, eliminating
problems with multi-file tapes.
.IP \fB\-k\fP
When extracting files from an archive, keep existing files, rather than
overwriting them with the version from the archive.
.IP \fB\-l\fP
When dumping the contents of a directory to an archive, stay within the
local file system of that directory. This option
only affects the files dumped because
they are in a dumped directory; files named on the command line are
always dumped, and they can be from various file systems.
This is useful for making ``full dump'' archival backups of a file system,
as with the \fIdump\fP(8) command. Files which are skipped due to this
option are mentioned on the standard error.
.IP \fB\-m\fP
When extracting files from an archive, set each file's modified timestamp
to the current time, rather than extracting each file's modified
timestamp from the archive.
.IP \fB\-o\fP
When creating an archive, write an old format archive, which does not
include information about directories, pipes, fifos,
contiguous files, or device files, and
specifies file ownership by uid's and gid's rather than by
user names and group names. In most cases, a ``new'' format archive
can be read by an ``old'' tar program without serious trouble, so this
option should seldom be needed.
.IP \fB\-p\fP
When extracting files from an archive, restore them to the same permissions
that they had in the archive. If \fB\-p\fP is not specified, the current
umask limits the permissions of the extracted files. See \fIumask(2)\fP.
.IP \fB\-R\fP
With each message that \fItar\fP produces, print the record number
within the archive where the message occurred. This option is especially
useful when reading damaged archives, since it helps to pinpoint the damaged
section.
.IP \fB\-s\fP
When specifying a list of filenames to be listed
or extracted from an archive,
the \fB\-s\fP flag specifies that the list
is sorted into the same order as the tape. This allows a large list
to be used, even on small machines, because
the entire list need not be read into memory at once. Such a sorted
list can easily be created by running ``tar \-t'' on the archive and
editing its output.
.IP \fB\-t\fP
List a table of contents of an existing archive. If file names are
specified, just list files matching the specified names. The listing
appears on the standard output.
.IP "\fB\-T\fP \fIF\fP"
Rather than specifying file names or regular expressions as arguments to
the \fItar\fP command, this option specifies that they should
be read from the file \fIF\fP, one per line.
If the file name specified is ``\-'',
the list is read from the standard input.
This option, in conjunction with the \fB\-s\fP option,
allows an arbitrarily large list of files to be processed,
and allows the list to be piped to \fItar\fP.
.IP \fB\-v\fP
Be verbose about the files that are being processed or listed. Normally,
archive creation, file extraction, and differencing are silent,
and archive listing just
gives file names. The \fB\-v\fP option causes an ``ls \-l''\-like listing
to be produced. The output from -v appears on the standard output except
when creating an archive (since the new archive might be on standard output),
where it goes to the standard error output.
.IP \fB\-x\fP
Extract files from an existing archive. If file names are
specified, just extract files matching the specified names, otherwise extract
all the files in the archive.
.IP "\fB\-z\fP or \fB\-Z\fP"
The archive should be compressed as it is written, or decompressed as it
is read, using the \fIcompress(1)\fP program. This option works on I/O
devices and over the network, as well as on disk files; data to or from
such devices is reblocked using a ``dd'' command
to enforce the specified (or default) block size. The default compression
parameters are used; if you need to override them, avoid the ``z'' option
and compress it yourself.
.SH "SEE ALSO"
shar(1), tar(5), compress(1), ar(1), arc(1), cpio(1), dump(8), restore(8),
restor(8), rsh(1), dd(1), find(1)
.SH BUGS
The \fBr, u, w, X, l, F, C\fP, and \fIdigit\fP options of Unix \fItar\fP
are not supported.
.LP
Multiple-tape (or floppy) archives should be supported, but so far no
clean way has been implemented.
.LP
A bug in the Bourne Shell usually causes an extra newline to be written
to the standard error when using compressed or remote archives.
.LP
A bug in ``dd'' prevents turning off the ``x+y records in/out'' messages
on the standard error when ``dd'' is used to reblock or transport an archive.
@@@ Fin de tar.1
echo tar.5
cat >tar.5 <<'@@@ Fin de tar.5'
.TH TAR 5 "15 October 1987"
.\" @(#)tar.5 1.4 11/6/87 Public Domain - gnu
.SH NAME
tar \- tape (or other media) archive file format
.SH DESCRIPTION
A ``tar tape'' or file contains a series of records. Each record contains
TRECORDSIZE bytes (see below). Although this format may be thought of as
being on magnetic tape, other media are often used.
Each file archived is represented by a header record
which describes the file, followed by zero or more records which give the
contents of the file. At the end of the archive file there may be a record
filled with binary zeros as an end-of-file indicator. A reasonable
system should write a record of zeros at the end, but must not assume that
an end-of-file record exists when reading an archive.

The records may be blocked for physical I/O operations. Each block of
\fIN\fP records (where \fIN\fP is set by the \fB\-b\fP option to \fItar\fP)
is written with a single write() operation. On
magnetic tapes, the result of such a write is a single tape record.
When writing an archive, the last block of records should be written
at the full size, with records after the zero record containing
all zeroes. When reading an archive, a reasonable system should
properly handle an archive whose last block is shorter than the rest, or
which contains garbage records after a zero record.
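
The record and block arithmetic above can be sketched in C. The
function names are hypothetical, and the sketch assumes exactly one
zero record is written before the last block is padded out.
.nf
.sp .5v
```c
#include <assert.h>

#define RECORDSIZE 512

/* Records used by one file: a header record plus its data records. */
long records_for_file(long size)
{
    assert(size >= 0);
    return 1 + (size + RECORDSIZE - 1) / RECORDSIZE;
}

/*
 * Total records written for an archive holding "used" records of
 * headers and data: add one zero record, then round up to a whole
 * number of blocks of "blocking" records each.
 */
long padded_total(long used, int blocking)
{
    long with_eof = used + 1;

    assert(used >= 0 && blocking > 0);
    return (with_eof + blocking - 1) / blocking * blocking;
}
```
.fi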

The header record is defined in the header file <tar.h> as follows:
.nf
.sp .5v
.DT
/*
* Standard Archive Format - Standard TAR - USTAR
*/
#define RECORDSIZE 512
#define NAMSIZ 100
#define TUNMLEN 32
#define TGNMLEN 32

union record {
char charptr[RECORDSIZE];
struct header {
char name[NAMSIZ];
char mode[8];
char uid[8];
char gid[8];
char size[12];
char mtime[12];
char chksum[8];
char linkflag;
char linkname[NAMSIZ];
char magic[8];
char uname[TUNMLEN];
char gname[TGNMLEN];
char devmajor[8];
char devminor[8];
} header;
};

/* The checksum field is filled with this while the checksum is computed. */
#define CHKBLANKS "        " /* 8 blanks, no null */

/* The magic field is filled with this if uname and gname are valid. */
#define TMAGIC "ustar  " /* 7 chars and a null */

/* The linkflag defines the type of file */
#define LF_OLDNORMAL '\\0' /* Normal disk file, Unix compatible */
#define LF_NORMAL '0' /* Normal disk file */
#define LF_LINK '1' /* Link to previously dumped file */
#define LF_SYMLINK '2' /* Symbolic link */
#define LF_CHR '3' /* Character special file */
#define LF_BLK '4' /* Block special file */
#define LF_DIR '5' /* Directory */
#define LF_FIFO '6' /* FIFO special file */
#define LF_CONTIG '7' /* Contiguous file */
/* Further link types may be defined later. */

/* Bits used in the mode field - values in octal */
#define TSUID 04000 /* Set UID on execution */
#define TSGID 02000 /* Set GID on execution */
#define TSVTX 01000 /* Save text (sticky bit) */

/* File permissions */
#define TUREAD 00400 /* read by owner */
#define TUWRITE 00200 /* write by owner */
#define TUEXEC 00100 /* execute/search by owner */
#define TGREAD 00040 /* read by group */
#define TGWRITE 00020 /* write by group */
#define TGEXEC 00010 /* execute/search by group */
#define TOREAD 00004 /* read by other */
#define TOWRITE 00002 /* write by other */
#define TOEXEC 00001 /* execute/search by other */
.fi
.LP
All characters in header records
are represented using 8-bit characters in the local
variant of ASCII.
Each field within the structure is contiguous; that is, there is
no padding used within the structure. Each character on the archive medium
is stored contiguously.

Bytes representing the contents of files (after the header record
of each file) are not translated in any way and
are not constrained to represent characters or to be in any character set.
The \fItar\fP(5) format does not distinguish text files from binary
files, and no translation of file contents should be performed.

The fields \fIname, linkname, magic, uname\fP, and \fIgname\fP are
null-terminated
character strings. All other fields are zero-filled octal numbers in
ASCII. Each numeric field (of width \fIw\fP) contains \fIw\fP-2 digits, a space, and
a null, except \fIsize\fP and \fImtime\fP,
which do not contain the trailing null.
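
One way to fill such a numeric field is sketched below. The name
put_octal is hypothetical, and the sketch assumes that \fIsize\fP and
\fImtime\fP use the byte freed by the missing null for one more digit
(\fIw\fP-1 digits and a space); the text above does not say so
explicitly.
.nf
.sp .5v
```c
#include <assert.h>

/*
 * Hypothetical helper: store value in field[0..width-1] as zero-filled
 * octal digits followed by a space.  If with_null is nonzero the field
 * ends "digits, space, null" (width-2 digits); otherwise it ends
 * "digits, space" (width-1 digits, an assumption for size and mtime).
 * Returns 0 on success, -1 if the value does not fit.
 */
int put_octal(unsigned long value, char *field, int width, int with_null)
{
    int i = width - 1;

    assert(field != 0 && width >= 3);
    if (with_null)
        field[i--] = 0;
    field[i--] = ' ';
    for (; i >= 0; i--) {
        field[i] = (char) ('0' + (value & 7));
        value >>= 3;
    }
    return value == 0 ? 0 : -1;
}
```
.fi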

The \fIname\fP field is the pathname of the file, with directory names
(if any) preceding the file name, separated by slashes.

The \fImode\fP field provides nine bits specifying file permissions and three
bits to specify the Set UID, Set GID and Save Text (TSVTX) modes. Values
for these bits are defined above. When special permissions are required
to create a file with a given mode, and the user restoring files from the
archive does not hold such permissions, the mode bit(s) specifying those
special permissions are ignored. Modes which are not supported by the
operating system restoring files from the archive will be ignored.
Unsupported modes should be faked up when creating an archive; e.g.
the group permission could be copied from the `other' permission.

The \fIuid\fP and \fIgid\fP fields are the user and group ID of the file owners,
respectively.

The \fIsize\fP field is the size of the file in bytes; linked files are archived
with this field specified as zero.

The \fImtime\fP field is the modification time of the file at the time it was
archived. It is the ASCII representation of the octal value of the
last time the file was modified, represented as an integer number of
seconds since January 1, 1970, 00:00 Coordinated Universal Time.

The \fIchksum\fP field is the ASCII representation of the octal value of the
simple sum of all bytes in the header record. Each 8-bit byte in the
header is treated as an unsigned value. These values are added to an
unsigned integer, initialized to zero, the precision of which shall be no
less than seventeen bits. When calculating the checksum, the \fIchksum\fP
field is treated as if it were all blanks.
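
The checksum rule can be sketched in C as follows. The function name
and the CHKSUM_OFFSET constant are hypothetical; the offset is simply
the sum of the field widths that precede \fIchksum\fP in the structure
above.
.nf
.sp .5v
```c
#include <assert.h>

#define RECORDSIZE 512
#define CHKSUM_OFFSET 148   /* 100 + 8 + 8 + 8 + 12 + 12 */
#define CHKSUM_LEN 8

/*
 * Hypothetical helper: sum every byte of a header record as an
 * unsigned value, counting the chksum field as eight blanks.
 */
unsigned long header_checksum(const unsigned char *rec)
{
    unsigned long sum = 0;
    int i;

    assert(rec != 0);
    for (i = 0; i < RECORDSIZE; i++) {
        if (i >= CHKSUM_OFFSET && i < CHKSUM_OFFSET + CHKSUM_LEN)
            sum += ' ';     /* chksum field counts as blanks */
        else
            sum += rec[i];
    }
    return sum;
}
```
.fi

The largest possible sum is 512 x 255 = 130560, which is why seventeen
bits of precision are enough.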

The \fItypeflag\fP field (named \fIlinkflag\fP in the structure above)
specifies the type of file archived. If a particular
implementation does not recognize or permit the specified type, the file
will be extracted as if it were a regular file. As this action occurs,
\fItar\fP issues a warning to the standard error.
.IP "LF_NORMAL or LF_OLDNORMAL"
represents a regular file.
For backward compatibility, a \fItypeflag\fP value of LF_OLDNORMAL
should be silently recognized as a regular file. New archives should
be created using LF_NORMAL.
Also, for backward
compatibility, \fItar\fP treats a regular file whose name ends
with a slash as a directory.
.IP LF_LINK
represents a file linked to another file, of any type,
previously archived. Such files are identified in Unix by each file
having the same device and inode number. The linked-to
name is specified in the \fIlinkname\fP field with a trailing null.
.IP LF_SYMLINK
represents a symbolic link to another file. The linked-to
name is specified in the \fIlinkname\fP field with a trailing null.
.IP "LF_CHR or LF_BLK"
represent character special files and block
special files respectively.
In this case the \fIdevmajor\fP and \fIdevminor\fP
fields will contain the
major and minor device numbers respectively. Operating
systems may map the device specifications to their own local
specification, or may ignore the entry.
.IP LF_DIR
specifies a directory or sub-directory. The directory name
in the \fIname\fP field should end with a slash.
On systems where
disk allocation is performed on a directory basis the \fIsize\fP
field will contain the maximum number of bytes (which may be
rounded to the nearest disk block allocation unit) which the
directory may hold. A \fIsize\fP field of zero indicates no such
limiting. Systems which do not support limiting in this
manner should ignore the \fIsize\fP field.
.IP LF_FIFO
specifies a FIFO special file. Note that the archiving of
a FIFO file archives the existence of this file and not its
contents.
.IP LF_CONTIG
specifies a contiguous file, which is the same as a normal
file except that, in operating systems which support it,
all its space is allocated contiguously on the disk. Operating
systems which do not allow contiguous allocation should silently treat
this type as a normal file.
.IP "`A' \- `Z'"
are reserved for custom implementations. None are used by this
version of the \fItar\fP program.
.IP \fIother\fP
values are reserved for specification in future revisions of the
P1003 standard, and should not be used by any \fItar\fP program.
.LP
The \fImagic\fP field indicates that this archive was output in the P1003
archive format. If this field contains TMAGIC, then the
\fIuname\fP and \fIgname\fP
fields will contain the ASCII representation of the owner and group of the
file respectively. If found, the user and group ID represented by these
names
will be used rather than the values contained
within the \fIuid\fP and \fIgid\fP fields.
User names longer than TUNMLEN-1 or group
names longer than TGNMLEN-1 characters will be truncated.
.SH "SEE ALSO"
tar(1), ar(5), cpio(5), dump(8), restor(8), restore(8)
.SH BUGS
Names or link names longer than NAMSIZ-1 characters cannot be archived.

This format does not yet address multi-volume archives.
.SH NOTES
This manual page was adapted by John Gilmore
from Draft 6 of the P1003 specification.
@@@ Fin de tar.5
echo tar.h
cat >tar.h <<'@@@ Fin de tar.h'
/*
* Header file for public domain tar (tape archive) program.
*
* @(#)tar.h 1.24 87/11/06 Public Domain.
*
* Created 25 August 1985 by John Gilmore, ihnp4!hoptoad!gnu.
*/

/*
* Kludge for handling systems that can't cope with multiple
* external definitions of a variable. In ONE routine (tar.c),
* we #define TAR_EXTERN to null; here, we set it to "extern" if
* it is not already set.
*/
#ifndef TAR_EXTERN
#define TAR_EXTERN extern
#endif

/*
* Header block on tape.
*
* I'm going to use traditional DP naming conventions here.
* A "block" is a big chunk of stuff that we do I/O on.
* A "record" is a piece of info that we care about.
* Typically many "record"s fit into a "block".
*/
#define RECORDSIZE 512
#define NAMSIZ 100
#define TUNMLEN 32
#define TGNMLEN 32

union record {
    char charptr[RECORDSIZE];
    struct header {
        char name[NAMSIZ];
        char mode[8];
        char uid[8];
        char gid[8];
        char size[12];
        char mtime[12];
        char chksum[8];
        char linkflag;
        char linkname[NAMSIZ];
        char magic[8];
        char uname[TUNMLEN];
        char gname[TGNMLEN];
        char devmajor[8];
        char devminor[8];
    } header;
};

/* The checksum field is filled with this while the checksum is computed. */
#define CHKBLANKS "        " /* 8 blanks, no null */

/* The magic field is filled with this if uname and gname are valid. */
#define TMAGIC "ustar  " /* 7 chars and a null */

/* The linkflag defines the type of file */
#define LF_OLDNORMAL '\0' /* Normal disk file, Unix compat */
#define LF_NORMAL '0' /* Normal disk file */
#define LF_LINK '1' /* Link to previously dumped file */
#define LF_SYMLINK '2' /* Symbolic link */
#define LF_CHR '3' /* Character special file */
#define LF_BLK '4' /* Block special file */
#define LF_DIR '5' /* Directory */
#define LF_FIFO '6' /* FIFO special file */
#define LF_CONTIG '7' /* Contiguous file */
/* Further link types may be defined later. */

/*
* Exit codes from the "tar" program
*/
#define EX_SUCCESS 0 /* success! */
#define EX_ARGSBAD 1 /* invalid args */
#define EX_BADFILE 2 /* invalid filename */
#define EX_BADARCH 3 /* bad archive */
#define EX_SYSTEM 4 /* system gave unexpected error */


/*
* Global variables
*/
TAR_EXTERN union record *ar_block; /* Start of block of archive */
TAR_EXTERN union record *ar_record; /* Current record of archive */
TAR_EXTERN union record *ar_last; /* Last+1 record of archive block */
TAR_EXTERN char ar_reading; /* 0 writing, !0 reading archive */
TAR_EXTERN int blocking; /* Size of each block, in records */
TAR_EXTERN int blocksize; /* Size of each block, in bytes */
TAR_EXTERN char *ar_file; /* File containing archive */
TAR_EXTERN char *name_file; /* File containing names to work on */
TAR_EXTERN char *tar; /* Name of this program */

/*
* Flags from the command line
*/
TAR_EXTERN char f_reblock; /* -B */
TAR_EXTERN char f_create; /* -c */
TAR_EXTERN char f_diff; /* -d */
TAR_EXTERN char f_dironly; /* -D */
TAR_EXTERN char f_follow_links; /* -h */
TAR_EXTERN char f_ignorez; /* -i */
TAR_EXTERN char f_keep; /* -k */
TAR_EXTERN char f_local_filesys; /* -l */
TAR_EXTERN char f_modified; /* -m */
TAR_EXTERN char f_oldarch; /* -o */
TAR_EXTERN char f_use_protection; /* -p */
TAR_EXTERN char f_sayblock; /* -R */
TAR_EXTERN char f_sorted_names; /* -s */
TAR_EXTERN char f_list; /* -t */
TAR_EXTERN char f_namefile; /* -T */
TAR_EXTERN char f_verbose; /* -v */
TAR_EXTERN char f_extract; /* -x */
TAR_EXTERN char f_compress; /* -z */

/*
* We now default to Unix Standard format rather than 4.2BSD tar format.
* The code can actually produce all three:
*     f_standard    ANSI standard
*     f_oldarch     V7
*     neither       4.2BSD
* but we don't bother, since 4.2BSD can read ANSI standard format anyway.
* The only advantage to the "neither" option is that we can cmp(1) our
* output to the output of 4.2BSD tar, for debugging.
*/
#define f_standard (!f_oldarch)

/*
* Structure for keeping track of filenames and lists thereof.
*/
struct name {
    struct name *next;
    short length;        /* cached strlen(name) */
    char found;          /* A matching file has been found */
    char firstch;        /* First char is literally matched */
    char regexp;         /* This name is a regexp, not literal */
    char name[NAMSIZ+1];
};

TAR_EXTERN struct name *namelist; /* Points to first name in list */
TAR_EXTERN struct name *namelast; /* Points to last name in list */

TAR_EXTERN int archive; /* File descriptor for archive file */
TAR_EXTERN int errors; /* # of files in error */

/*
*
* Due to the next struct declaration, each routine that includes
* "tar.h" must also include <sys/types.h>. I tried to make it automatic,
* but System V has no defines in <sys/types.h>, so there is no way of
* knowing when it has been included. In addition, it cannot be included
* twice, but must be included exactly once. Argghh!
*
* Thanks, typedef. Thanks, USG.
*/
struct link {
    struct link *next;
    dev_t dev;
    ino_t ino;
    short linkcount;
    char name[NAMSIZ+1];
};

TAR_EXTERN struct link *linklist; /* Points to first link in list */


/*
* Error recovery stuff
*/
TAR_EXTERN char read_error_flag;


/*
* Declarations of functions available to the world.
*/
union record *findrec();
void userec();
union record *endofrecs();
void anno();
#define annorec(stream, msg) anno(stream, msg, 0) /* Cur rec */
#define annofile(stream, msg) anno(stream, msg, 1) /* Saved rec */
@@@ Fin de tar.h
echo port.h
cat >port.h <<'@@@ Fin de port.h'
/*
* Portability declarations for public domain tar.
*
* @(#)port.h 1.3 87/11/11 Public Domain by John Gilmore, 1986
*/

/*
* Everybody does wait() differently. There seem to be no definitions
* for this in V7 (e.g. you are supposed to shift and mask things out
* using constant shifts and masks.) So fuck 'em all -- my own non
* standard but portable macros. Don't change to a "union wait"
* based approach -- the ordering of the elements of the struct
* depends on the byte-sex of the machine. Foo!
*/
#define TERM_SIGNAL(status) ((status) & 0x7F)
#define TERM_COREDUMP(status) (((status) & 0x80) != 0)
#define TERM_VALUE(status) ((status) >> 8)

#ifdef MSDOS
/* missing things from sys/stat.h */
#define S_ISUID 0
#define S_ISGID 0
#define S_ISVTX 0

/* device stuff */
#define makedev(ma, mi) (((ma) << 8) | (mi))
#define major(dev) (dev)
#define minor(dev) (dev)
#endif /* MSDOS */
@@@ Fin de port.h
echo open3.h
cat >open3.h <<'@@@ Fin de open3.h'
/*
* @(#)open3.h 1.4 87/11/11 Public Domain.
*
* open3.h -- #defines for the various flags for the Sys V style 3-argument
* open() call. On BSD or System 5, the system already has this in an
* include file. This file is needed for V7 and MINIX systems for the
* benefit of open3() in port.c, a routine that emulates the 3-argument
* call using system calls available on V7/MINIX.
*
* This file is needed by PD tar even if we aren't using the
* emulator, since the #defines for O_WRONLY, etc. are used in
* a couple of places besides the open() calls, (e.g. in the assignment
* to openflag in extract.c). We just #include this rather than
* #ifdef them out.
*
* Written 6/10/87 by rmtodd@uokmax (Richard Todd).
*
* The names have been changed by John Gilmore, 31 July 1987, since
* Richard called it "bsdopen", and really this change was introduced in
* AT&T Unix systems before BSD picked it up.
*/

/* Only one of the next three should be specified */
#define O_RDONLY 0 /* only allow read */
#define O_WRONLY 1 /* only allow write */
#define O_RDWR 2 /* both are allowed */

/* The rest of these can be OR-ed in to the above. */
/*
* O_NDELAY isn't implemented by the emulator. It's only useful (to tar) on
* systems that have named pipes anyway; it prevents tar's hanging by
* opening a named pipe. We #ifndef it because some systems already have
* it defined.
*/
#ifndef O_NDELAY
#define O_NDELAY 4 /* don't block on opening devices that would
* block on open -- ignored by emulator. */
#endif
#define O_CREAT 8 /* create file if needed */
#define O_EXCL 16 /* file cannot already exist */
#define O_TRUNC 32 /* truncate file on open */
#define O_APPEND 64 /* always write at end of file -- ignored by emul */

#ifdef EMUL_OPEN3
/*
* make emulation transparent to rest of file -- redirect all open() calls
* to our routine
*/
#define open open3
#endif
@@@ Fin de open3.h
echo msd_dir.h
cat >msd_dir.h <<'@@@ Fin de msd_dir.h'
/*
* @(#)msd_dir.h 1.4 87/11/06 Public Domain.
*
* A public domain implementation of BSD directory routines for
* MS-DOS. Written by Michael Rendell ({uunet,utai}michael@garfield),
* August 1987
*/

#define rewinddir(dirp) seekdir(dirp, 0L)

#define MAXNAMLEN 12

struct direct {
    ino_t d_ino;                  /* a bit of a farce */
    int d_reclen;                 /* more farce */
    int d_namlen;                 /* length of d_name */
    char d_name[MAXNAMLEN + 1];   /* guarantee null termination */
};

struct _dircontents {
    char *_d_entry;
    struct _dircontents *_d_next;
};

typedef struct _dirdesc {
    int dd_id;                          /* uniquely identify each open directory */
    long dd_loc;                        /* index of the current entry in the directory */
    struct _dircontents *dd_contents;   /* pointer to contents of dir */
    struct _dircontents *dd_cp;         /* pointer to current position */
} DIR;

extern DIR *opendir();
extern struct direct *readdir();
extern void seekdir();
extern long telldir();
extern void closedir();
@@@ Fin de msd_dir.h
echo msd_dir.c
cat >msd_dir.c <<'@@@ Fin de msd_dir.c'
/*
* @(#)msd_dir.c 1.4 87/11/06 Public Domain.
*
* A public domain implementation of BSD directory routines for
* MS-DOS. Written by Michael Rendell ({uunet,utai}michael@garfield),
* August 1987
*/

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/dir.h>
#include <malloc.h>
#include <string.h>
#include <dos.h>

#ifndef NULL
# define NULL 0
#endif /* NULL */

#ifndef MAXPATHLEN
# define MAXPATHLEN 255
#endif /* MAXPATHLEN */

/* attribute stuff */
#define A_RONLY 0x01
#define A_HIDDEN 0x02
#define A_SYSTEM 0x04
#define A_LABEL 0x08
#define A_DIR 0x10
#define A_ARCHIVE 0x20

/* dos call values */
#define DOSI_FINDF 0x4e
#define DOSI_FINDN 0x4f
#define DOSI_SDTA 0x1a

#define Newisnull(a, t) ((a = (t *) malloc(sizeof(t))) == (t *) NULL)
#define ATTRIBUTES (A_DIR | A_HIDDEN | A_SYSTEM)

/* what the find first/next calls use */
typedef struct {
    char d_buf[21];
    char d_attribute;
    unsigned short d_time;
    unsigned short d_date;
    long d_size;
    char d_name[13];
} Dta_buf;

static char *getdirent();
static void setdta();
static void free_dircontents();

static Dta_buf dtabuf;
static Dta_buf *dtapnt = &dtabuf;
static union REGS reg, nreg;

#if defined(M_I86LM)
static struct SREGS sreg;
#endif

DIR *
opendir(name)
    char *name;
{
    struct stat statb;
    DIR *dirp;
    char c;
    char *s;
    struct _dircontents *dp;
    char nbuf[MAXPATHLEN + 1];

    if (stat(name, &statb) < 0 || (statb.st_mode & S_IFMT) != S_IFDIR)
        return (DIR *) NULL;
    /* Refuse names that would overflow nbuf once the wildcard is appended. */
    if (strlen(name) + sizeof("\\*.*") > sizeof(nbuf))
        return (DIR *) NULL;
    if (Newisnull(dirp, DIR))
        return (DIR *) NULL;
    if (*name && (c = name[strlen(name) - 1]) != '\\' && c != '/')
        (void) strcat(strcpy(nbuf, name), "\\*.*");
    else
        (void) strcat(strcpy(nbuf, name), "*.*");
    dirp->dd_loc = 0;
    setdta();
    dirp->dd_contents = dirp->dd_cp = (struct _dircontents *) NULL;
    if ((s = getdirent(nbuf)) == (char *) NULL)
        return dirp;    /* empty directory */
    do {
        if (Newisnull(dp, struct _dircontents) || (dp->_d_entry =
            malloc((unsigned) (strlen(s) + 1))) == (char *) NULL)
        {
            if (dp)
                free((char *) dp);
            free_dircontents(dirp->dd_contents);
            free((char *) dirp);    /* don't leak the descriptor */
            return (DIR *) NULL;
        }
        if (dirp->dd_contents)
            dirp->dd_cp = dirp->dd_cp->_d_next = dp;
        else
            dirp->dd_contents = dirp->dd_cp = dp;
        (void) strcpy(dp->_d_entry, s);
        dp->_d_next = (struct _dircontents *) NULL;
    } while ((s = getdirent((char *) NULL)) != (char *) NULL);
    dirp->dd_cp = dirp->dd_contents;

    return dirp;
}

void
closedir(dirp)
    DIR *dirp;
{
    free_dircontents(dirp->dd_contents);
    free((char *) dirp);
}

struct direct *
readdir(dirp)
    DIR *dirp;
{
    static struct direct dp;

    if (dirp->dd_cp == (struct _dircontents *) NULL)
        return (struct direct *) NULL;
    dp.d_namlen = dp.d_reclen =
        strlen(strcpy(dp.d_name, dirp->dd_cp->_d_entry));
    dp.d_ino = 0;
    dirp->dd_cp = dirp->dd_cp->_d_next;
    dirp->dd_loc++;

    return &dp;
}

void
seekdir(dirp, off)
    DIR *dirp;
    long off;
{
    long i = off;
    struct _dircontents *dp;

    if (off < 0)
        return;
    for (dp = dirp->dd_contents ; --i >= 0 && dp ; dp = dp->_d_next)
        ;
    dirp->dd_loc = off - (i + 1);
    dirp->dd_cp = dp;
}

long
telldir(dirp)
    DIR *dirp;
{
    return dirp->dd_loc;
}

static void
free_dircontents(dp)
    struct _dircontents *dp;
{
    struct _dircontents *odp;

    while (dp) {
        if (dp->_d_entry)
            free(dp->_d_entry);
        dp = (odp = dp)->_d_next;
        free((char *) odp);
    }
}

static char *
getdirent(dir)
    char *dir;
{
    if (dir != (char *) NULL) {    /* get first entry */
        reg.h.ah = DOSI_FINDF;
        reg.h.cl = ATTRIBUTES;
#if defined(M_I86LM)
        reg.x.dx = FP_OFF(dir);
        sreg.ds = FP_SEG(dir);
#else
        reg.x.dx = (unsigned) dir;
#endif
    } else {                       /* get next entry */
        reg.h.ah = DOSI_FINDN;
#if defined(M_I86LM)
        reg.x.dx = FP_OFF(dtapnt);
        sreg.ds = FP_SEG(dtapnt);
#else
        reg.x.dx = (unsigned) dtapnt;
#endif
    }
#if defined(M_I86LM)
    intdosx(&reg, &nreg, &sreg);
#else
    intdos(&reg, &nreg);
#endif
    if (nreg.x.cflag)
        return (char *) NULL;

    return dtabuf.d_name;
}

static void
setdta()
{
    reg.h.ah = DOSI_SDTA;
#if defined(M_I86LM)
    reg.x.dx = FP_OFF(dtapnt);
    sreg.ds = FP_SEG(dtapnt);
    intdosx(&reg, &nreg, &sreg);
#else
    reg.x.dx = (int) dtapnt;
    intdos(&reg, &nreg);
#endif
}
@@@ Fin de msd_dir.c
