Splitting a text file into sentences

basi

Nov 29, 2005, 6:45:26 PM
Looking for ideas on how to split a text file into sentences. I see the
problem of basing the split on [.!?] -- they're also used in ways other
than to end a sentence. If I have to do manual pre-processing of the
text file, what editing might I do? Has anyone had to deal with this
problem and how did you make life easier for you?
Thanks for the help.
basi

Matthew Smillie

Nov 29, 2005, 9:06:12 PM


Doing really, really good sentence boundary detection is an on-going
problem in natural language processing. I'm not aware of any Ruby-
based NLP packages, but if you want better accuracy than just using
[.!?:] there are several free NLP packages around (NLTK in Python,
and Stanford's Java NLP package spring to mind) that might help you.
A googling of "sentence tokenization" may also yield some help.

If that sounds like overkill, then you can get accuracy "good enough
for government work" by making a list of regular expressions to catch
exceptions to the punctuation rule. These will necessarily vary a
little depending on your source text, but typical examples are
catching titles like "Mr.", "Mrs.", "Dr.", and all-caps abbreviations
like "U.S.A." or "M.D." (something like this: /([A-Z]\.([A-Z]\.)+)/)
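
A minimal Ruby sketch of the exception-list idea, offered as an illustration rather than a tested tool. The EXCEPTIONS list and the split_sentences name are assumptions made up for this example; the abbreviation pattern is the one suggested above, anchored to the end of the candidate sentence.

```ruby
# Patterns for tokens whose trailing period should not end a sentence.
# Illustrative only; a real list would need extending for the source text.
EXCEPTIONS = [
  /\b(?:Mr|Mrs|Dr)\.\z/,       # titles
  /\b[A-Z]\.(?:[A-Z]\.)+\z/    # all-caps abbreviations such as U.S.A. or M.D.
].freeze

def split_sentences(text)
  sentences = []
  start = 0
  # Candidate boundary: sentence punctuation followed by whitespace or end of text.
  text.scan(/[.!?](?=\s|\z)/) do
    boundary = Regexp.last_match.end(0)
    candidate = text[start...boundary]
    # Skip the boundary if the candidate ends in a known exception.
    next if EXCEPTIONS.any? { |re| candidate =~ re }
    sentences << candidate.strip
    start = boundary
  end
  rest = text[start..-1].to_s.strip
  sentences << rest unless rest.empty?
  sentences
end

puts split_sentences("Dr. Smith moved to the U.S.A. in 1990. He works as an M.D. now. Is that right? Yes!")
```

One limitation of this sketch: an abbreviation that really does end a sentence is never treated as a boundary, so that sentence gets glued to the next one.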

good luck,
matthew smillie.

----
Matthew Smillie <M.B.S...@sms.ed.ac.uk>
Institute for Communicating and Collaborative Systems
University of Edinburgh


Kevin Olbrich

Nov 29, 2005, 9:38:10 PM
Depending on the text you might be able to search for a period (or other
punctuation) followed by two spaces. It's not robust, but if you know that
convention will be followed by the authors, then it can work.
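
As a rough Ruby sketch of that approach (the method name is made up here, and it assumes the text really does follow the two-space convention):

```ruby
def split_on_double_space(text)
  # Split wherever sentence punctuation is followed by two or more
  # whitespace characters; the lookbehind keeps the punctuation attached
  # to the sentence it ends.
  text.split(/(?<=[.!?])\s{2,}/)
end

puts split_on_double_space("Mr. Smith arrived.  He sat down.  Then he left.")
```

Because "Mr." is followed by only a single space, it is not treated as a boundary here. Any text typed with single spaces between sentences would come back as one long "sentence".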

_Kevin

Nicholas Van Weerdenburg

Nov 29, 2005, 9:44:46 PM
I dimly recall something on this list about 9 months ago or so.

Nick
--
Nicholas Van Weerdenburg

Nicholas Van Weerdenburg

Nov 29, 2005, 9:49:36 PM


http://www.pressure.to/ruby/ is the reference I found in an old email thread
on this list.

Jeffrey Schwab

Nov 29, 2005, 10:36:29 PM

It's a common convention to separate sentences by double spaces. I
started following this convention because Emacs expected it, and now I
use it always.

basi

Nov 29, 2005, 11:26:35 PM
Hi,
I have looked at NLTK in Python (and had been hoping a Rubyist would
rewrite it in Ruby). I will go back to NLTK and see if it has a
split-sentence algorithm of sorts. And thanks for the tip on Stanford's
Java NLP package. Yes, those abbreviations are pesky, and I may have to
resort to an exceptions list containing the most common ones.
Thanks much,
basi

basi

Nov 29, 2005, 11:27:50 PM
Hi,
I will google. Thanks!
basi

basi

Nov 29, 2005, 11:37:02 PM
Yes, I learned this convention when I took a keyboarding (i.e., typing)
lesson in high school. Some time ago, a style manual for word processing
appeared, and one piece of its advice was to use only one space to separate
sentences. The reason given is that in a justified format, those two
spaces can become four spaces, or even more. Anyway, a lot of text now
has one or two spaces between sentences, and this wouldn't be a
reliable indicator of sentence boundary.
Cheers!
basi

basi

Nov 29, 2005, 11:37:56 PM
Hi,
This looks promising. I'm downloading as I write.
Thanks!
basi

Ryan Leavengood

Nov 29, 2005, 11:56:43 PM
On 11/29/05, basi <basi...@hotmail.com> wrote:

I too learned the two space after a period convention years ago and
also recently learned that with modern fonts and word processors it is
not necessary. It was tricky to retrain myself, but I did, and have
been using just one space ever since.

So like you say, that isn't a reliable way to discern sentences.

I would recommend following the advice of first filtering out false
positives (possibly even replacing them with temporary markers, Mr.
becomes $MISTER$ or similar), then splitting on punctuation. If you
then test on various sample texts you should be able to find more
false positives that you might have missed.
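
A small Ruby sketch of that mask, split, and restore sequence. Only "$MISTER$" comes from the suggestion above; the other markers, the MARKERS table, and the method name are hypothetical.

```ruby
# Hypothetical marker table: "$MISTER$" is from the post, the rest are invented.
MARKERS = {
  "Mr."  => "$MISTER$",
  "Mrs." => "$MISSUS$",
  "Dr."  => "$DOCTOR$"
}.freeze

def split_with_markers(text)
  masked = text.dup
  # Step 1: hide known false positives behind temporary markers.
  MARKERS.each { |abbr, marker| masked.gsub!(abbr, marker) }
  # Step 2: split on sentence punctuation followed by whitespace.
  masked.split(/(?<=[.!?])\s+/).map do |sentence|
    # Step 3: put the original abbreviations back.
    MARKERS.each { |abbr, marker| sentence = sentence.gsub(marker, abbr) }
    sentence
  end
end

puts split_with_markers("Mr. Jones met Dr. Lee. They talked! Did Mrs. Jones join?")
```

Running this over sample texts and looking for fragments that are too short is one way to discover abbreviations missing from the table.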

Ryan


basi

Nov 30, 2005, 12:30:15 AM
Hi,
This just might be easier than what I have in mind. I will try this
first.
Thanks!
basi

Damphyr

Nov 30, 2005, 3:43:37 AM
Which will not help you at all with foreign languages. And don't forget
to put i.e., e.g. or etc. in the list.
This is an ongoing problem (think about the auto-correction 'feature' of
capitalizing the first letter of every sentence in Openoffice or Word -
something I always turn off because it is so insistent when it's wrong)
Cheers,
V.-
--
http://www.braveworld.net/riva



Edwin van Leeuwen

Nov 30, 2005, 5:41:29 AM

If you use a regexp like [.!?]\s+[A-Z] you will already capture most
sentence boundaries. Most abbreviations aren't normally followed by a
space and a capital letter.

One exception to this rule that I can think of is Mr. Name, Mrs. Name. But
as you can see, these have an <uppercase> letter followed by only one or
two lowercase letters. Most sentences would have at least five
non-uppercase characters in front of the <.> ->
[A-Z]\w\w?\w?\w?\.
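
A hedged Ruby sketch of how the two patterns might be combined: split on the first one, then glue a piece back onto its successor when it ends in a short capitalized word matching the second. The method and constant names are made up, and the re-joining step is one interpretation of the idea, not something spelled out above.

```ruby
# The second pattern from the post, anchored to the end of a piece.
SHORT_CAPITALIZED = /\b[A-Z]\w\w?\w?\w?\.\z/

def split_before_capital(text)
  # Boundary: sentence punctuation, whitespace, then an uppercase letter.
  # Lookbehind and lookahead keep the punctuation and the capital in place.
  pieces = text.split(/(?<=[.!?])\s+(?=[A-Z])/)
  pieces.each_with_object([]) do |piece, out|
    if out.last && out.last =~ SHORT_CAPITALIZED
      # Previous piece ended in something like "Mr."; treat it as a title
      # and re-join (the original whitespace is replaced by one space).
      out[-1] = "#{out.last} #{piece}"
    else
      out << piece
    end
  end
end

puts split_before_capital("I met Mr. Smith yesterday. He lives in the U.S. and works e.g. at night. Odd!")
```

The re-joining step will also merge sentences that genuinely end in a short capitalized word (for example "York."), so it trades one kind of error for another.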

--
Posted via http://www.ruby-forum.com/.


Austin Ziegler

Nov 30, 2005, 7:51:42 AM
On 11/29/05, Kevin Olbrich <kevin....@duke.edu> wrote:
> Depending on the text you might be able to search for a period (or other
> punctuation) followed by two spaces. It's not robust, but if you know that
> convention will be followed by the authors, then it can work.

That, in fact, is a very *bad* metric to follow, as the proper spacing
after sentence punctuation is a single space. The only reason that two
spaces was used in the past is the space used between sentence endings
in typeset work is a little wider than that used between words (an
em-space vs. an en-space).

-austin
--
Austin Ziegler * halos...@gmail.com
* Alternate: aus...@halostatue.ca


Austin Ziegler

Nov 30, 2005, 7:54:09 AM
On 11/29/05, basi <basi...@hotmail.com> wrote:

Look at Text::Format for some indication on how abbreviations could be handled.

Austin Ziegler

Nov 30, 2005, 7:53:07 AM

As I noted above, this is an improper convention outside of the
typewriter realm. If you are using anything other than a fixed-pitch
font for display or print, you should *never* use two spaces.

Christian Neukirchen

Nov 30, 2005, 10:55:41 AM
Austin Ziegler <halos...@gmail.com> writes:

Alternatively, use text processing systems that do the "right thing";
i.e. transform two spaces into one (e.g. TeX, HTML-based products).
There is no good reason a text processor should show two spaces after
each other in print.

> -austin
--
Christian Neukirchen <chneuk...@gmail.com> http://chneukirchen.org


Jeffrey Schwab

Nov 30, 2005, 11:19:02 AM
Austin Ziegler wrote:
> On 11/29/05, Kevin Olbrich <kevin....@duke.edu> wrote:
>
>>Depending on the text you might be able to search for a period (or other
>>punctuation) followed by two spaces. It's not robust, but if you know that
>>convention will be followed by the authors, then it can work.
>
>
> That, in fact, is a very *bad* metric to follow, as the proper spacing
> after sentence punctuation is a single space. The only reason that two
> spaces was used in the past is the space used between sentence endings
> in typeset work is a little wider than that used between words (an
> em-space vs. an en-space).
>

Not true at all. I was always taught to use double spaces after
sentences in grade-school homework assignments done on plain word
processors or typewriters.

James Edward Gray II

Nov 30, 2005, 12:08:15 PM

Many of us were and I'll admit that I can't shake the habit. I still
know it's wrong though. ;)

James Edward Gray II

Austin Ziegler

Nov 30, 2005, 12:39:33 PM

Then, quite honestly, you were taught wrong. I was taught to use
double spaces with a typewriter or when using fixed-pitch fonts
(although that was later, since most computers and printers didn't
have reliable kerning routines until I was out of university).
Ultimately, the use of double spaces after a period is wrong *even
with fixed-pitch fonts*, but it was done to be clearer since the width
of the em-space and an en-space on a typewriter with a Courier-like
font is exactly the same. The two spaces *simulates* an em-space in a
typeset piece of work. (And that is *fact*, not opinion.)

Mark Ericson

Nov 30, 2005, 12:55:35 PM
I too learned two-spaces in typing class. However, I'm now in the
one-space camp.

Here is a great treatment on the topic,
http://www.webword.com/reports/period.html


Kevin Olbrich

Nov 30, 2005, 1:25:15 PM
Whatever the original reason for double spaces at the end of a sentence,
the practice still continues.
In fact, MS Word has an option in its grammar checker to enforce one or two
spaces at the end of a sentence. For a lot of people (like me), it is
nothing more than an old habit that is hard to break.

The utility of this method for determining the end of a sentence depends
entirely on the purpose of the program. If I were to write a routine to
parse text that I wrote, it would probably work pretty well, and it would
save me several hours of work trying to implement a fancier, more robust
routine.

The same routine would probably fail horribly for other users or a more
generic corpus of text.

As a general rule, I like to use algorithms that are as simple as possible
for the job. That, of course, depends a lot on what the job is.

Funny, I never thought something like spacing between sentences would be so
controversial. I can almost envision _why making an esoteric remark about
the beauty of 'negative space' in text files.

_Kevin


Jeffrey Schwab

Nov 30, 2005, 1:28:07 PM

The Bedford Handbook, which has been my bible for writing conventions
through the past ten years, lists two sets of guidelines: Those
recommended by the Modern Language Association (MLA), and those
recommended by the American Psychological Association (APA). It says
that the MLA style is typically taught in English classes, but that the
APA style is common in the social sciences. Here is the explanation of
the MLA guidelines, from page 633 of the Bedford Handbook for Writers,
(c) 1994:


MLA Guidelines [for essays]:

In typing the text of the essay, leave one space after words, commas,
colons, and semicolons and between the dots in ellipsis marks. Leave
two spaces after periods, question marks, and exclamation points.
To form a dash, type two hyphens with no space between them. Do not
put a space on either side of a dash.


The Handbook goes on to say (p. 635):


Although the APA guidelines call for one space after all punctuation,
most college professors prefer two spaces at the end of a sentence. Use
one space after all other punctuation.
Although two spaces are used after a period that ends a sentence, use
only one space after a period that follows a person's initial (B.F.
Skinner).
To form a dash, type two hyphens with no space between them. Do not
put a space on either side of a dash.


The Handbook itself uses only single spaces at the ends of sentences.
Still, I hardly think there is one conclusively "right" or "wrong"
convention. Until I am convinced otherwise, I will continue to use two
spaces to separate sentences. This makes sentences easier to lex with
regular expressions, and makes them stand out to text editors and human
readers.

Jallan

Nov 30, 2005, 3:43:48 PM
Jeffrey Schwab wrote:
> The Handbook itself uses only single spaces at the ends of sentences.
> Still, I hardly think there is one conclusively "right" or "wrong"
> convention. Until I am convinced otherwise, I will continue to use two
> spaces to separate sentences. This makes sentences easier to lex with
> regular expressions, and makes them stand out to text editors and human
> readers.

"Right" or "wrong" in this kind of styling has to do with whether
something is right or wrong according to a particular convention.

The normal convention for professional typography is to use one space
between sentences, whether you are convinced or not, whether using hard
type, a professional typesetting program, a desktop publishing program,
or a word processing program.

The older typewriter conventions are still often requested for
manuscripts for academic essays and manuscripts for submission to
publishing houses. These conventions also require underlining rather
than italics, use of double-hyphen for a dash rather than the specific
dash character, and so forth. But should this same manuscript be
professionally printed, even if the text is actually to be set by a
word processor, it would almost certainly be edited first to convert it
to typographical standard: changing all double-spaces to single spaces,
all occurrences of double-hyphen to em-dash or en-dash, using fancy
quotation marks instead of possible straight typewriter quotation
marks, italics instead of underlining, and so forth.

Note that HTML has from the beginning automatically changed any
multiple runs of spaces into a single space when displaying text.

Yes, a convention of always using two spaces would make sentences
easier to lex with regular expressions. Similarly, enforcing one single
spelling of English throughout the world would make searches and
matches easier. However, it is philosophically unsound to ask that the
world change to fit particular data-processing routines, rather than
that data-processing routines be built to deal properly with
real-world situations.

If your lexing routine fails because many people don't end
non-paragraph-final sentences with double-spaces, or do so only in
particular plain text files, it is the fault of your lexing routine for
failing to handle common formatting, unless your lexing is intended
to be a limited tool that works only with manuscript formatted text.

The best general sentence lexing algorithm I've seen is the one set
forth by the Unicode Consortium at
http://www.unicode.org/reports/tr29/tr29-4.html#Sentence_Boundaries .
This is designed to work reasonably well in any language and writing
system supported by Unicode, not just in English.

Jallan

Dave Howell

Nov 30, 2005, 5:02:56 PM
I think "right" or "wrong" are a tad strong for most of the cases
cited. But as a professional book designer and typographer, there's
unquestionably "better" and "worse."

For improved legibility, inter-sentence space should generally be a bit
greater than inter-word space.

Typewriters only had one distance they could travel. Either 1/10th of
an inch ("Pica") or 1/12th ("Elite"). So the only way to add extra
space after a sentence was to double it. That's way too much extra
space, but it was generally better than the alternative. The real
problem was that the words were too far apart, not that the sentences
were too close, but again, the fixed spacing was already an abominable
situation.

Proportional type, dating all the way back to Gutenberg, would
generally use 1/3rd or 1/4th of the height of the type as the
inter-word spacing. This would usually work out to about the width of a
lower case "t" or "l".

When setting modern (by which you may also read "all type before
typewriters" as well) proportional type in fully justified form (left
and right margins both even), the spaces must be stretched out on a
line-by-line basis to fit. Really good typesetting programs (and really
good typesetters sticking little bits of lead between their words (and
I've done that, too)) will add more of the space between sentences than
between words, so as the line stretches, the inter-word space to
inter-sentence space ratio actually changes. (Take a look at a narrow
newspaper column sometime.)

More sophisticated approaches to space will ignore a user's attempt to
sprinkle extraneous space in. Less sophisticated ones might allow it,
and even treat them as individual spaces, stretching both of them
during expansion. {shudder}

The fact that both the MLA Guidelines and the Bedford Handbook
encourage poor typography is regrettable. ("If you cannot type
appropriate punctuation, e.g. an em-dash or en-dash, please use
appropriate substitutions. For both dashes, substitute a pair of
hyphens, which, like true dashes, are typed without adjacent spaces."
There's still software out there that will happily wrap a line between
the two hyphens. Ick!) Nevertheless, if you're submitting a paper to an
institution that expects or requires that, then to not follow them is
wrong, even if the legibility of the submission is better.

What it all boils down to is "Putting two spaces after a period at the
end of a sentence is an artifact left over from the days when the
typewriter was the prevalent text-making tool. Unless you have a
specific reason or requirement to do otherwise, it's preferable to put
only one space between sentences."

*****

For breaking text into sentences, sometimes I find it easier to work
backwards. Also, only very colloquial writing will have a one-word
sentence, so you can solve all "Mr./Dr./Ph.D." cases by the fact that
if a word starts with a cap and ends with a period, it's not a
sentence. For a more sophisticated approach that's still not too
complex to program, check the final word of a sentence against a
dictionary. If it's found there without a final dot, then you're almost
certainly looking at the end of a sentence. If it isn't, then is it
found anywhere else in the document without a dot? If not, then you're
probably looking at an abbreviation. (My mail program uses a monospaced
font. If I thought most readers would read it with a proportional font,
I'd have typed "Ph. D." above, since it should have a thin space before
the D.)
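
A loose Ruby sketch of the dictionary check described above, for illustration only. The tiny DICTIONARY set stands in for a real word list, and the sentence_end? helper is a made-up name; it covers just the "final word" test, not the working-backwards part.

```ruby
require "set"

# Tiny stand-in word list; a real run would load a full dictionary file.
DICTIONARY = Set.new(%w[the cat sat on mat dog barked it left]).freeze

# Guess whether a word ending in a dot closes a sentence.
def sentence_end?(word, document)
  bare = word.sub(/\.\z/, "")
  # Found in the dictionary without a dot: almost certainly a sentence end.
  return true if DICTIONARY.include?(bare.downcase)
  # Otherwise, does it appear anywhere in the document without a dot?
  # If not, it is probably an abbreviation.
  !!(document =~ /\b#{Regexp.escape(bare)}\b(?!\.)/)
end

doc = "Dr. Lee saw the cat. Dave left. We like Dave."
puts sentence_end?("cat.", doc)   # dictionary word
puts sentence_end?("Dr.", doc)    # only ever seen with a dot
puts sentence_end?("Dave.", doc)  # seen elsewhere without a dot
```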

Gavin Sinclair

Nov 30, 2005, 6:26:07 PM
Austin Ziegler wrote:
> > Not true at all. I was always taught to use double spaces after
> > sentences in grade-school homework assignments done on plain word
> > processors or typewriters.
>
> Then, quite honestly, you were taught wrong. I was taught to use
> double spaces with a typewriter or when using fixed-pitch fonts
> (although that was later, since most computers and printers didn't
> have reliable kerning routines until I was out of university).
> Ultimately, the use of double spaces after a period is wrong *even
> with fixed-pitch fonts*, but it was done to be clearer since the width
> of the em-space and an en-space on a typewriter with a Courier-like
> font is exactly the same. The two spaces *simulates* an em-space in a
> typeset piece of work. (And that is *fact*, not opinion.)

What rot. How can anything like that be a fact? You're regurgitating
the opinion of a style manual.

Gavin

Matthew Smillie

Nov 30, 2005, 6:35:28 PM
On Nov 30, 2005, at 22:02, Dave Howell wrote:

> you can solve all "Mr./Dr./Ph.D." cases by the fact that if a word
> starts with a cap and ends with a period, it's not a sentence.

I'm not sure that's a very good rule, Dave. There are two sentences
here.

The above rule may catch titular abbreviations, but over-generalises
to produce a false negative in the above example. So in solving one
problem, you introduce another one. It's relatively easy to make
another rule to catch the problem in this case, but it would probably
have been simpler to just make a specific rule to eliminate titular
abbreviations, since there really aren't that many of them.

matthew smillie.


Jeffrey Schwab

Nov 30, 2005, 6:40:31 PM

This is what I love about Usenet. :)

Austin Ziegler

Nov 30, 2005, 7:06:24 PM

Um. No, I'm stating fact. This isn't mere opinion: two spaces were done
to simulate em-spaces in fixed pitch environments. That's a fact. The
reason for that may often be forgotten, but it *remains* a fact. Please
remember that I've done quite a bit of typesetting-style work in the
last year with PDF::Writer and I have to know a bit more about this than
most folks, and it's something of a hobby of mine in any case to know
about printing mechanisms.

The only *opinion* I stated was that the first poster in the chain above
(I think Jeffrey) was taught wrongly. I maintain that as true
regardless, because if he was taught two spaces without the reason why,
then there's a practice being repeated for no good reason.

The practice is nonsense these days in most contexts.

Dave Howell

Nov 30, 2005, 7:36:53 PM

On Nov 30, 2005, at 15:35, Matthew Smillie wrote:

> On Nov 30, 2005, at 22:02, Dave Howell wrote:
>
>> you can solve all "Mr./Dr./Ph.D." cases by the fact that if a word
>> starts with a cap and ends with a period, it's not a sentence.
>
> I'm not sure that's a very good rule, Dave. There are two sentences
> here.
>
> The above rule may catch titular abbreviations, but over-generalises
> to produce a false negative in the above example.

I hadn't intended to provide a single magical rule that was perfect in
isolation, after all. {chuckle}


"Ph. D." is not a sentence. But where do you break
My name is Dave, Ph. D. Pleased to meet you.
vs.
You need my Ph. D. friend Dave to help you.

I don't think having a list of abbreviations and titles will improve
that situation much, although it's a lot more work and almost certain
to be incomplete. Any/every rule will have failures; avoiding them is
what takes you into that whole natural language high-octane engine
situation.

However, if you also use the *other* "rule" I mentioned, then you don't
have a problem. "Dave Howell" appears just a couple lines earlier,
establishing "Dave" as a word that doesn't require a period. Therefore,
it's more likely to be at the end of a sentence. The following word
("There") can be found in a dictionary, and in a non-capitalized form,
which means that its capitalization here following a dot strongly
indicates that it's beginning a sentence.

The capital "P" of "Ph." is not preceded by a period either time, so
it's not starting a sentence. After it, "friend" isn't capitalized, so
it's not ending a sentence. But "Pleased" is, and dictionary says "not
normally capitalized" so that's probably a sentence break.

Shot - Piotr Szotkowski

Nov 30, 2005, 7:07:57 PM
Hello.

Dave Howell:

> For improved legibility, inter-sentence space should
> generally be a bit greater than inter-word space.

It's worth noting that actually turning this theory into reality seems
to apply to 'Western' (American, British, others?) typography (mostly?
only?).

I've yet to see a typical modern Polish book typeset with greater
inter-sentence spaces. Also (and, I guess, as a result of this),
I doubt I ever saw any Polish email or Usenet post with two
inter-sentence spaces, and I remember how happy I was to find
out about the 'joinspaces' vim option that finally let me reflow
paragraphs properly, without doing a s/  / /g on them afterwards. :o)

Cheers,
-- Shot
--
He has never been known to use a word that might send a reader
to the dictionary. -- William Faulkner on Ernest Hemingway
====================== http://shot.pl/hovercraft/ === http://shot.pl/1/125/ ===

Gavin Sinclair

Nov 30, 2005, 7:49:12 PM

Austin Ziegler wrote:
> On 11/30/05, Gavin Sinclair <gsin...@gmail.com> wrote:
> >> [...] The two spaces *simulates* an

> >> em-space in a typeset piece of work. (And that is *fact*, not
> >> opinion.)
>
> > What rot. How can anything like that be a fact? You're regurgitating
> > the opinion of a style manual.
>
> Um. No, I'm stating fact. This isn't mere opinion: two spaces were done
> to simulate em-spaces in fixed pitch environments. That's a fact. [...]

Fair enough.

Gavin

Matthew Smillie

Nov 30, 2005, 8:46:22 PM
On Dec 1, 2005, at 0:36, Dave Howell wrote:

>
> On Nov 30, 2005, at 15:35, Matthew Smillie wrote:
>
>> On Nov 30, 2005, at 22:02, Dave Howell wrote:
>>
>>> you can solve all "Mr./Dr./Ph.D." cases by the fact that if a
>>> word starts with a cap and ends with a period, it's not a sentence.
>>
>> I'm not sure that's a very good rule, Dave. There are two
>> sentences here.
>>
>> The above rule may catch titular abbreviations, but over-
>> generalises to produce a false negative in the above example.
>
> I hadn't intended to provide a single magical rule that was perfect
> in isolation, after all. {chuckle}

Didn't assume you were! It was just a good example to use for a
"this can be harder than it looks" couple of lines of warning, since
it's been my experience that people don't anticipate false negatives
as well as they do false positives.


matthew smillie.


basi

Nov 30, 2005, 9:14:08 PM
All was well with this strategy, until I hit a sentence similar to:

The abbreviation for Mister is Mr.
The head office is in New York, N.Y.

In other words, abbreviations that end a sentence. These sentences
don't end with a double dot, so if we replace Mr. with $MISTER$, the
sentence has no end marker.

Hmmm.
basi

Adam i Agnieszka Gasiorowski FNORD

Dec 3, 2005, 3:36:09 PM
On 2005-12-01 00:36:53 +0000, Dave Howell <gro...@grandfenwick.net> said:

>
> On Nov 30, 2005, at 15:35, Matthew Smillie wrote:
>
>> On Nov 30, 2005, at 22:02, Dave Howell wrote:
>>
>>> you can solve all "Mr./Dr./Ph.D." cases by the fact that if a word
>>> starts with a cap and ends with a period, it's not a sentence.
>>
>> I'm not sure that's a very good rule, Dave. There are two sentences here.
>>
>> The above rule may catch titular abbreviations, but over-generalises to
>> produce a false negative in the above example.
>
> I hadn't intended to provide a single magical rule that was perfect in
> isolation, after all. {chuckle}
>
>

Want some magick? You are stuck in wrong coordinate
system, like Newton. Stop thinking in terms of words and syntax
rules governing how to put them in correct order. Think
links (alinka). Think relations and revelation.
Words (symbols) have no meaning. None. They *are* empty.
If you want to infiltrate enemy organization the most
effective method is not drilling into individual agents,
but monitoring their communications (that is, relations).
If you acquire enough of those relations (and recursively,
but set some boundary unless you are Goddess and can
do anything you fancy) you don't even need to decrypt the
messages, unless you are bored. To destroy enemy
organization, mess with the relations. Agents (symbols,
words, punctuation marks...) are of no importance
whatsowherever. That is why a person, if immersed enough
in an alien language needs no dictionary day-to-day - if one does
need to check, it's not the meaning you are after -
it's definition, that is MORE SYMBOLS, so you can
augment MORE RELATIONS from unfamiliar context (SYMBOL
CLOUD, think quantum mechanics and particles) until
you actually GET the pointer to "meaning" and can call on it
(how to relate that
symbol to some other symbol mesh, you can still have
no idea what the hell fermion "means", but you can use it and
fail to be misunderstood unless you want to).

I have no idea how many "syntax errors" there are
in above paragraph - for the reason sublime, my total
lack of knowledge aboot rules of grammar for the
language used to convey meaning heretofore. HTH.

P.S. It makes me wonder, what 't bony "heretofore" word
"means" right now to you, Reader. Compose witty remarks if
it's a-kind funny miss-take, I enjoy my Self when people
smirk. Yes, I did stick-in a word possessing none of it's
meaning in my poor head. I must be mad? Or contrary-wise.
I'm not sure, to be frank with you a-like Frank
Herbert iff there was such word in usage "then". She
will compensate for that - any dictionary dug
up shall (she can't help it) explain in detail or else - she always does
that when I go at a genuine miracle in open source. It's
the game we play. I need some time, we make
a beautiful team... Prop me up with another
pill! A-musing...

-- 
I am the One. I am A vampire A-calling for your love! A.A!
I am the fire that burns within your blood. I am the One!!
No bars or chains can keep me from your bed! I am the One!
Nothing on earth can get me from your head! I am the One!!

Adam i Agnieszka Gasiorowski FNORD

Dec 3, 2005, 5:20:43 PM
>
> He has never been known to use a word that might send a reader
> to the dictionary. -- William Faulkner on Ernest Hemingway
>
>

Now, that is a wise one - it actually helps
to comprehend my jabber in the other post I
spontaneously generated today...
