This mail message is undeliverable.
(Probably to or from system 'caeco')
It was sent to you or by you.
Sorry for the inconvenience.
Sincerely,
apcislc!uucp
#############################################
##### Data File: ############################
From apcihq!ulowell!harvard!bin Mon Jan 1 14:26:09 1990 remote from snowflake
Received: by snowflake.UUCP (smail2.3)
id AA05130; 1 Jan 90 14:26:09 MST (Mon)
Received: by apcihq.UUCP; Mon, 1 Jan 90 17:14:42 EST
Received: by apollo.uucp (smail2.3)
id AA22122; 1 Jan 90 15:26:33 EST (Mon)
Received: by swan
for apcihq!apcislc!caeco!i-core!beezer (from ulowell!harvard!bin)
id <AA03361@swan>; Sat, 30 Dec 89 00:04:23 EST
Received: by harvard.harvard.edu (5.54/a0.25)
(for apollo!apcihq!apcislc!caeco!i-core!beezer) id AA03019; Fri, 29 Dec 89 22:57:39 EST
Received: by BU.EDU (1.97) Thu, 28 Dec 89 23:47:32 EST
Received: from world.std.com by xenna.Xylogics.COM (4.12/4.7_jlv4/13/89)
id AA30600; Thu, 28 Dec 89 10:49:14 est
Received: by world.std.com (4.1/SMI-4.0)
id AA04747; Thu, 28 Dec 89 23:11:23 EST
Return-Path: <bzs>
Received: by world.std.com (4.1/SMI-4.0)
id AA04740; Thu, 28 Dec 89 23:11:20 EST
Date: Thu, 28 Dec 89 23:11:20 EST
From: apcihq!ulowell!harvard!world.std.com!bzs (Barry Shein)
Message-Id: <891229041...@world.std.com>
To: world.std.com!obi
Subject: OBI DIGEST V1.6
Reply-To: apcihq!ulowell!harvard!world.std.com!obi-request
OBI DIGEST V1.6
Today's Topics:
Administrivia
OCR help (offered, not needed :)
Re: Representing accented characters
History of Project Gutenberg
Portion of upcoming article on copyright
Suggested Subjects for Discussion
Re: Suggested Subjects for Discussion
Re: Suggested Subjects for Discussion
Re: Laws
----------------------------------------------------------------------
Subject: Administrivia
From: b...@world.std.com (Barry Shein)
Date: Thu, 28 Dec 89 10:50:00 EST
ADMINISTRIVIA
1. The mailing list now has 556 subscribers, many of them local exploders. The
announce-only list has 64 subscribers. That's 620 subscribers, wow.
2. I should put these out more frequently but the holidays got me as
I'm sure they got you also. Oh, Happy Holidays!
3. The articles below from Michael Hart (who heads Project Gutenberg
and a mailing list (gutn...@vmd.cso.uiuc.edu)) were copied to this
list as well as others. I think they're worthwhile, one overviews
Project Gutenberg.
4. Starting with this digest I have added Volume and Issue numbers and
retroactively assigned volumes and issues to all previous issues. This
will make it easier for people to notice if they are missing any and
to generally refer back to other issues in notes.
5. I hope the New Year brings me some TEXT (hint hint), we can gather
it a lot faster than we can scan it in if you tell us where it is! And
THANKS to those who have made contributions thus far, you will be the
unsung heroes of a new age.
6. There is much interest from organizations who want to help, expect
some announcements in the next very few months (it would be impolite
to specify them now, but I thought I'd try to make you feel as
encouraged as I have been!)
On with the show!
-Barry Shein
Software Tool & Die, Purveyors to the Trade | b...@world.std.com
1330 Beacon St, Brookline, MA 02146, (617) 739-0202 | {xylogics,uunet}world!bzs
------------------------------
Subject: OCR help (offered, not needed :)
From: "Michael J Kovacs - kov...@bknlvms.BITNET" <uunet!CORNELLC.cit.cornell.edu!KOVACS%BKNLVMS.BITNET>
Date: Wed, 13 Dec 89 10:01 EST
From the January 1990 MacWorld, Mac Bulletin column, p.17:
OCR Goes Preppie
Blue Solutions (until recently Orange Solutions) is working on a product,
tentatively called OCR Prep, to clean up TIFF images of scanned text and
prepare them for optical character recognition packages. OCR Prep is
designed to isolate letters from extraneous bits in gray-scale or
black-and-white scans of dirty originals such as newspapers. It will also
correct for skewing and misalignment of the original scan, one of the
primary reasons optical character recognition packages misidentify
letters. OCR Prep is scanner-independent and should work with any
standard TIFF file. At press time, no release date or pricing had been
set. For more information, contact Blue Solutions at 805/371-4521.
*----------------------------------------------------------------------*
| Michael J Kovacs -(o) (o)- |
| System Operator, Bertrand Library U |
| Bucknell University, Lewisburg PA, USA \___/ |
| BITnet: kovacs@bknlvms *** |
| Domain (MX): kov...@bknlvms.bucknell.edu |
| Internet: kovacs%bknlvms...@cornellc.cit.cornell.edu |
| UUCP: {rutgers|gatech}!psuvax1!bknlvms.bitnet!kovacs |
*----------------------------------------------------------------------*
------------------------------
Subject: Re: Representing accented characters
From: Michael Urban <urban%r...@rand.org>
Date: Mon, 20 Nov 89 10:42:04 -0800 (PST)
Not sure whether this is completely apropos, but in case someone wants
it to think upon.... [Note that different languages need not be all natural,
such as English or Esperanto; a similar type of embedding could be used to
format different computer languages within a text.]
Joe Beckenbach
Caltech CS department
------- Forwarded Message
Received: from rand.org by csvax.caltech.edu (5.59/1.2)
id AA24264; Mon, 20 Nov 89 12:04:19 PST
Received: from rcc.rand.org by rand.org; Mon, 20 Nov 89 10:42:03 -0800
Received: from twain by rcc.arpa; Mon, 20 Nov 89 10:42:10 PST
Received: by twain; Mon, 20 Nov 89 10:42:06 PST
Received: from Version.6.23.N.CUILIB.3.44.SNAP.NOT.LINKED.twain.Unknown.Machine.Type
via MS.5.5.twain.sun3_40;
Mon, 20 Nov 89 10:42:04 -0800 (PST)
Message-Id: <AZO4Jwz01EuMBcno4P@twain>
Date: Mon, 20 Nov 89 10:42:04 -0800 (PST)
From: Michael Urban <urban%r...@rand.org>
To: espe...@rand.org
Subject: Re: Representing accented characters
In-Reply-To: <15...@haddock.ima.isc.com>
References: <891110051...@rand.org>, <QZKkkUz01EuMM46GgO@twain>,
<15...@haddock.ima.isc.com>
Part of the problem is that the needs of the machine are different from the
needs of the human. The reason I use `w' instead of `^u' (which I used to use)
is that it is a little easier to type and (to my eye, at least) a lot easier to
read. But a program that is reading indiscriminately mixed English and
Esperanto text can, as Karl points out, become confused by such a typographic
convention. Another problem is that text that is nicely aligned in one form
becomes disaligned when the Esperanto text changes form. Jack had this problem
when he ran some of my bilingual postings through his VIDI program, and
blank-aligned postings become fairly random when viewed in a variable-width
font (which is how I see them running Andrew). If there is enough translation
software around, optimizing for machine readers (which present pleasing things
to human readers) becomes more important.
Perhaps a productive approach would be to use some kind of markup language, a`
la SGML (or Tim Bray's more straightforward SGML-like markup from the Oxford
English Dictionary project) to distinguish Esperanto and English text. With
such a system, a suite of (extremely portable--Karl is an expert at this!) C
programs that did markup->Latin-3 or markup->ASCII translations could be
distributed via the newsgroup to help people read markup messages. For
example, I might mail a message (using the OEDish markup) like
<dulingve>
<angleze>
<p>
This text is an example of bilingual markup <esperante> (malgraw la
Esperanto) </esperante>. It is a little awkward to read in its transmitted
form, but
not entirely unreadable.
</p>
</angleze>
<esperante>
<p>
&C.i tiu teksto estas ekzemplo de dulingva markteksto <angleze> (with
embedded English) </angleze>. &G.i estas iom maloportuna por legi en la
dissenda formo,
sed ne tute nelegebla.
</p>
</esperante>
</dulingve>
Here, I use <p> as the mark for a paragraph, just for completeness. I do not
necessarily propose exactly this form of markup. Indeed, one of the most
bewildering things about SGML is that it prescribes the exact form of almost
nothing...
One filter would produce `nice ASCII':
^Ci tiu teksto estas ekzemplo de       This text is an example of
dulingva markteksto (with embedded     bilingual markup (malgra^u la
English). ^Gi estas iom                Esperanto). It is a little awkward
maloportuna por legi en la dissenda    to read in its transmitted form,
formo, sed ne tute nelegebla.          but not entirely unreadable.
The options of such a filter might even allow the user's favorite form of
accented letter representation. Here, I have presented the Esperanto u-breve
as `^u' (translated from `w' in the input, although I suppose that this is
really an inappropriate use of markup and a special symbol like `&u' would be
better).
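A rough sketch of such a filter, in Python rather than the portable C
suggested above, may make the idea concrete. It assumes only what the sample
message shows: <esperante> and <angleze> mark language spans, `&C.' style
entities mark circumflexed letters, and `w' stands for u-breve inside
Esperanto text. The names render, convert and STYLES are made up for this
illustration, and the side-by-side column layout is left out.

```python
import re

# Hypothetical accent styles: how an accented letter is shown in plain ASCII.
STYLES = {
    "caret": lambda letter: "^" + letter,   # ^Ci, malgra^u
    "x": lambda letter: letter + "x",       # Cxi, malgraux
}

LANGUAGE_TAGS = {"esperante", "angleze"}
TAG = re.compile(r"<(/?)(\w+)>")
ENTITY = re.compile(r"&([A-Za-z])\.")       # &C. and &G. as in the sample


def convert(text, languages, accent):
    """Rewrite accent notation, but only inside Esperanto text."""
    if not languages or languages[-1] != "esperante":
        return text
    text = text.replace("w", accent("u")).replace("W", accent("U"))
    return ENTITY.sub(lambda m: accent(m.group(1)), text)


def render(markup, style="caret"):
    """Strip the markup and return one line of 'nice ASCII'."""
    accent = STYLES[style]
    languages = []                           # stack of open language tags
    pieces = []
    position = 0
    for match in TAG.finditer(markup):
        pieces.append(convert(markup[position:match.start()], languages, accent))
        closing, name = match.groups()
        if name in LANGUAGE_TAGS:
            if not closing:
                languages.append(name)
            elif languages:
                languages.pop()
        position = match.end()
    pieces.append(convert(markup[position:], languages, accent))
    text = " ".join("".join(pieces).split())    # collapse runs of whitespace
    return re.sub(r"\s+([.,;:])", r"\1", text)  # no blank before punctuation


sample = ("<dulingve><esperante><p>&C.i tiu teksto estas ekzemplo <angleze> "
          "(with embedded English) </angleze>. &G.i estas iom maloportuna, "
          "malgraw la formo.</p></esperante></dulingve>")
print(render(sample))
print(render(sample, style="x"))
```

Because the language stack is consulted before any `w' is touched, the
embedded English "with" survives intact, which is the confusion the markup
is meant to avoid.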
Another filter would produce ISO 8859/3 output. Others might put out TeX
(possibly using sophisticated techniques for aligning bilingual text as in
Appendix D of the TeXbook), Troff, or PostScript.
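As an illustration only, the ISO 8859/3 filter could be as small as the
Python sketch below. It assumes the input is already in the `^u' style of
ASCII, and it leans on Python's standard "iso8859_3" codec; the table CARET
and the function to_latin3 are names invented for this example.

```python
# Caret notation -> Esperanto letters (mapping assumed for this sketch).
CARET = {
    "^C": "\u0108", "^c": "\u0109",   # C with circumflex
    "^G": "\u011c", "^g": "\u011d",   # G with circumflex
    "^H": "\u0124", "^h": "\u0125",   # H with circumflex
    "^J": "\u0134", "^j": "\u0135",   # J with circumflex
    "^S": "\u015c", "^s": "\u015d",   # S with circumflex
    "^U": "\u016c", "^u": "\u016d",   # U with breve
}


def to_latin3(text):
    """Replace caret notation and return ISO 8859/3 (Latin-3) bytes."""
    for ascii_form, letter in CARET.items():
        text = text.replace(ascii_form, letter)
    return text.encode("iso8859_3")


data = to_latin3("^Ci tiu teksto, malgra^u la formo")
print(len(data), "bytes")
```

Each accented letter becomes a single byte, so text that was aligned in
columns before conversion needs realigning afterwards.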
The more I think about this, the better I like the idea. It does not seem too
overwhelmingly complicated, and is fairly parsimonious in its assumptions about
the final presentation. For example, if some future machine provides some
useful mechanism for presenting Esperanto (or similar) text, it will almost
certainly be easy to produce a filter or object to take advantage of it. It
also allows for other languages like Swedish, Chinese, and German to be typed
with whatever markup conventions are appropriate, if someone wants to write the
(ever more complex) software.
Finally, I will note that newsgroups are anarchistic. Nobody is going to
dictate a `standard'. But if a lot of people start sending out their postings
in markup form (and/or start using the Accent-Convention: mail header that
Karl proposes), and the software for processing it is widely available, nobody
will have to `dictate' anything.
------- End of Forwarded Message
------------------------------
Subject: History of Project Gutenberg
From: "Michael S. Hart" <uunet!vmd.cso.uiuc.edu!HART>
Date: Wed, 20 Dec 89 14:32:28 CST
A ONE SCREEN INTRODUCTION TO PROJECT GUTENBERG
Project Gutenberg believes that the biggest impact of an age
of computers will be mass storage/retrieval of information in a
home environment which will allow quick and accurate access to a
world of ever increasing information. Already, with 360K floppy
disks, anyone can put the complete works of Shakespeare on only a
few dollars worth of disks when a paper copy costs 10 times that
amount and cannot be easily reindexed or searched for quotes.
We also cannot be unaware that that which liberates some can
undoubtedly threaten others. We have been told many times of an
ever increasing number of librarians and academics who feel quite
threatened by the advancement of electronic libraries, moreso in
approximately half the cases than they feel the benefits.
Whether we profit or not, are recognized pos or neg, we will
A ONE SCREEN HISTORY OF PROJECT GUTENBERG #1 OF A SERIES
Project Gutenberg was begun in early 1971 on a Xerox Sigma V
64K mainframe computer at the University of Illinois. The first
of Project Gutenberg texts was the United States' Declaration of
Independence, which was followed by the Constitution. The texts
were prepared on paper tape at a teletype machine and uploaded a
new time upon each request until hard disk space was sufficient,
which didn't last long and we were demoted to a tape which users
requested to be mounted several hours before they wanted a file.
For the next 15 years very little change occurred in the text
acquisition area, though the computers got smaller and cheaper in
ever increasing increments as the project was moved to an S-100,
then to an OKI CP/M machine, then to an IBM XT, and resides in a
33MHz 386 with a gigabyte of CDC SCSI hard disk storage today.
Contributors now reach from coast to coast in North America,
and the Gutenberg listserver has members across both oceans. In
1989, as per our hopes and public prediction, etexts have become
recognized as something beyond "an idea ahead of its time" after
18 years of "You guys want to put Shakespeare on disk? You must
be crazy!" End of Part 1 Your comments are solicited.
------------------------------
Subject: Portion of upcoming article on copyright
From: "Michael S. Hart" <uunet!vmd.cso.uiuc.edu!HART>
Date: Wed, 20 Dec 89 14:36:45 CST
The following is the first page of an article I am preparing.
I referred to it in passing in a previous posting, and stated
I would be willing to post portions of it with encouragement.
I received said encouragement from members of these listserv-
discussion groups, and even from one of the operators. Even
with this encouragement, I am slightly reluctant to try these
groups as formats for either publication or collaboration. I
encourage comments and suggestions, as well as collaboration,
and will only continue posting portions of the article if the
process does not offend the general membership.
This was written for a 56 line word processor w/ 66 line page
length. If yours is set for 55 lines, please delete a blank.
*********
LEVERS, WHEELS, METALS, AND CHIPS
FROM THE STONE AGE THROUGH BRONZE, IRON, STEEL
TO THE INFORMATION AGE
This series of articles presents a paradigmatic perspective
of the social-economic-cultural (r)evolutionary development our
species has passed through on its way to the present.
Parallels between these developments are sometimes obvious,
apparent, sometimes not so clearly definable. Yet the start of
the personal computer revolution is an obvious place to begin a
consideration of what our (r)evolutionary development will make
us consider now and in the future.
I use the term (r)evolutionary in its historical context as
originated by Copernicus which so rocked the world that a great
change in thought or culture has been called revolutionary ever
since that time. His concept, as you may recall, was that this
world made revolutions around the sun and on its axis. We, who
live in the New World, tend to give
By the way, I have yet to find a dictionary including these
references in its definition of revolutionary. If anyone could
inform me of one, I would appreciate it.
THE DIRECT CAUSE OF THESE ARTICLES
Much of the concern expressed in our discussions of e-texts
(this short generic term is replacing terms such as electronic-
texts, machine-readable-texts, etc. and I am sure the hyphen is
to be dropped shortly in an effort to make it even shorter, and
easier to type in e-mail contexts)
concerns copyright. In addressing this concern I will describe
some of the past (r)evolutionary ideas and intend to include an
interesting discussion of what would be likely to happen in the
case when/if a duplicating machine is invented which duplicates
itself and/or anything else.
The advent of such a machine is worthy of consideration, in
both the context of the current discussion concerning copyright
and the context of value and its representation in general. An
interesting case in point is what would happen to money in such
a scenario or to items such as Van Gogh's Irises which sold for
nearly $50 million. In such a scenario, the value of the intellectual,
artistic or other of the aesthetic humanistic qualities will in
nearly all regards replace the physical labor of production.
The concerns expressed about the easy and inexpensive copy,
which began with the advent of xerography, and which accelerate
today with digital reproductions of audio and video, as well as
with text, provide us with an invaluable set of opportunities.
Let us make the most of these opportunities to practice for
the day when duplicating machines can duplicate everything, the
exception being the human quality.
------------------------------
Subject: Suggested Subjects for Discussion
From: "Michael S. Hart" <uunet!vmd.cso.uiuc.edu!HART>
Date: Tue, 26 Dec 89 12:22:04 CST
I would like to propose the following subjects for discussion:
1. The relationship between copyright royalties and price of media.
i.e. If the price of producing a book is cut in half, what will
happen to the price the copyright holder is paid?
2. What happens when public domain texts hit the electronic market?
3. What will happen when the combined price of media, etexts
and distribution is such that anyone with a computer will easily
afford a private library comparable to the Library of Congress?
Thank you for your interest,
Michael S. Hart, Director, Project Gutenberg
National Clearinghouse for Machine Readable Texts
BITNET: HART@UIUCVMD INTERNET: HA...@VMD.CSO.UIUC.EDU
(*ADDRESS CHANGE FROM *VME* TO *VMD* AS OF DECEMBER 18!!**)
(THE GUTNBERG SERVER IS LOCATED AT GUTN...@UIUCVMD.BITNET)
------------------------------
Subject: Re: Suggested Subjects for Discussion
From: uunet!garcon.cso.uiuc.edu!well!trend (Van der Leun)
Date: Tue, 26 Dec 89 11:00:25 pst
regarding number 1:
The standard practice is to pay a royalty based on the price of the
work. This is a percentage of either the retail price or the
net price paid to the publisher. In the latter case the royalty
is about double the royalty paid on the retail price. In the
case of electronic texts the same sort of arrangement would
prevail, but it would most likely be a different percentage --
and probably a sliding scale based on volume.
------------------------------
Subject: Re: Suggested Subjects for Discussion
From: Van der Leun <uunet!well.sf.ca.us!trend>
Date: Wed, 27 Dec 89 10:32:23 CST
In response to a request for some exemplary figures we received these.
Please note that when books are discounted 50% the wholesale fee for a
$20 retail book becomes $10. Thus, in this case, a royalty of 20% must
be negotiated to equal a 10% royalty based on the retail price. Local
sources indicate not more than 10% of the money you pay for a book can
be expected to be received by the copyright holder. These percentages
are quite likely to be under strong pressure to change when books will
be sold on disks, which cost so much less than paper. We may also see
a change in the prices of new editions which differ only slightly from
older ones, as is the case with many educational texts. mh
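The arithmetic in the note above can be checked with a few lines of Python.
This is only a sketch of the stated example ($20 retail, 50% discount, 10%
retail royalty); the function name equivalent_net_rate is made up here.

```python
def equivalent_net_rate(retail_price, discount, retail_rate):
    """Net-price royalty rate that pays the same per copy as retail_rate."""
    net_price = retail_price * (1 - discount)      # $20 at 50% off -> $10
    royalty_per_copy = retail_price * retail_rate  # 10% of $20 -> $2
    return royalty_per_copy / net_price


rate = equivalent_net_rate(20.00, 0.50, 0.10)
print(f"{rate:.1%}")
```

At smaller discounts the equivalent net rate falls back toward the retail
rate, so "double" only holds when the discount is near 50%.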
----------------------------Original message----------------------------
In trade books (i.e. books in bookstores) the "standard slide of
retail price royalties" can be 10% to 5,000 copies sold, 12.5%
to 10,000 copies sold, 15% thereafter. The quantities vary from
house to house but the percentages are standard.
Net royalties are calculated as a percentage of the net price
and pay according to the discount granted the bookstore by
the publisher which is a factor of the number of copies ordered
(e.g. two copies are discounted 10%, 1000 copies discounted
46%). Since books are still fully returnable by the store
if unsold the publisher will hold some of the royalties earned
in all cases against returns. In general, net royalties may
be thought of as roughly double the retail royalties.
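The sliding scale quoted above can be written out as a short Python sketch.
It assumes each rate applies only to the copies inside its own band (10% on
the first 5,000, 12.5% on the next 5,000, 15% beyond that), which is one
reading of the message; TIERS and sliding_royalty are names invented here,
and returns and discounts are ignored.

```python
# (upper bound of the band in copies, rate); None means "thereafter".
TIERS = [(5_000, 0.10), (10_000, 0.125), (None, 0.15)]


def sliding_royalty(copies, retail_price, tiers=TIERS):
    """Total retail-price royalty, charging each band at its own rate."""
    total = 0.0
    lower = 0                       # copies already accounted for
    for upper, rate in tiers:
        top = copies if upper is None else min(copies, upper)
        if top <= lower:
            break                   # sales never reached this band
        total += (top - lower) * retail_price * rate
        lower = top
    return total


# 12,000 copies of a $20 book: 5,000 at 10%, 5,000 at 12.5%, 2,000 at 15%.
print(round(sliding_royalty(12_000, 20.00), 2))
```

If a house instead applies the higher rate to all copies once a threshold is
passed, only the loop body would need to change.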
------------------------------
Subject: Re: Laws
From: Dan Bernstein <uunet!stealth.acf.nyu.edu!brnstnd>
Date: Wed, 27 Dec 89 22:37:48 CST
> >Barry, I dare you to catch up---and keep up---with the United States Code.
> >If you succeed, you'll make a lot of people happy. If not, OBI will still
> >have learned a lot about the problems an online text database has to deal
> >with.
>
> Lexus does a lot of this, would it be useful to duplicate that just to
> try to lower the price?
Yes. You don't like the idea of giving Lexis a run for the money?
[ laws aren't too useful to most people ]
The right metric here is interest. It isn't convenient to run to the
library every time you want to look up a law; the easier OBI makes it
to retrieve and read laws, the more people will use the service.
> I wouldn't mind some specific laws though, like perhaps The Copyright
> Act of 1978.
Exactly! Most people have to deal with the law at some point; I want 22
USC, someone else wants 15 USC, you want 17 USC. (1978? You mean ... ?)
Nobody cares about the whole thing, but everybody cares about a piece.
---Dan
------------------------------
End of OBI Digest
*****************