[VIM] HEoLL


ling...@gmail.com

Feb 4, 2008, 10:44:30 PM
to v...@vim.org
Hello,

I'm tired of this topic, so I'll be curt:

I think that it's silly that vim sticks trailing
newlines at the end of files and then proceeds to
hide this fact from the user.

Moreover, I think it's ridiculous that a (broken)
script is the only recourse:

http://vim.wikia.com/wiki/VimTip1369

Yes, yes. Perhaps this 'feature' kept idiots from saving
malformed C code back when parser writers were too lazy
to cover the case, but it reeks of big-government,
utilitarian, do-too-much-and-get-in-the-way Windowsism.

I don't want Clippy in my terminal.

Goodness! Just read ":h 'eol" to see the madness.

Fuzzy Logic

Feb 4, 2008, 11:03:32 PM
to vim...@googlegroups.com
Exaggerate much? There's always notepad if you don't like useful
functionality. OR you could write your own patch and make Vim do
whatever the heck you want it to do.

Fuzzy

Ben Schmidt

Feb 5, 2008, 12:36:28 AM
to vim...@googlegroups.com, v...@vim.org

Wow! Someone's passionate!

I agree somewhat, though. It would make sense to me if 'eol' were effective
regardless of the setting of 'binary'. People who want the old functionality could
easily have something like

:au BufWritePre * if ! &binary | set eol | endif

which is a much easier workaround than the current situation.

Apart from the argument that Windows apps are b0rken, is there any reason for
having 'eol' only take effect when 'binary' is set? What might it break if it were
changed so it were always effective? Might you consider making the change, Bram?

Ben.



Georg Dahn

Feb 5, 2008, 1:19:14 AM
to vim...@googlegroups.com
> I think that it's silly that vim sticks trailing
> newlines at the end of files and then proceeds to
> hide this fact from the user.

Obviously every line ends somewhere and where it ends there is the end
of line (EOL). There is nothing hidden here.

Best regards,
Georg Dahn


Georg Dahn

Feb 5, 2008, 1:55:51 AM
to vim...@googlegroups.com
> There's always notepad if you don't like useful
> functionality.

I don't consider this functionality at all: in text files, lines need EOLs
to be complete, and Vim simply produces only complete lines. I prefer to
call this correct behavior rather than a feature.

Unfortunately there are some programs, like Microsoft's Notepad and its
clones, which show an empty line at the end of a correct text file that
is not there. I call this a bug. However, it is such a common bug
that many people have got used to it and mistake it for a feature.

Best regards,
Georg Dahn


ling...@gmail.com

Feb 5, 2008, 9:19:46 AM
to vim...@googlegroups.com

> Obviously every line ends somewhere and where it ends there is the end
> of line (EOL). There is nothing hidden here. ... text files lines need
> EOLs to be complete. Vim just produces complete lines only. I prefer
> to call this the correct behavior instead of a functionality.


While your logic is appealing, I think it is confused by bad naming and
no experience with the technical details of parsers.

EOL is basically \n, which is the 'newline'. It is called the 'newline',
because it signals that a new line should begin after it. For this
reason, EOL is a misnomer, and it is more appropriately called a 'line
separator'.

Moreover, the newline is just a character--extra byte(s)--introduced
by, say, the 'enter' (or 'return') key. If a user does not press this
key at the end of the last line, is it necessary to introduce this
character?

In this sense, there is no reason for the last line to end with a
newline, except for uniformity, which is helpful with algorithmic
processing of text--parsing--but which is sometimes unexpected.

For my case, I'm working with a parser that has a bug with respect to
the last line ending with EOF; the flaw lies with Flex, in my opinion,
but it can be hidden by the introduction of this extraneous newline.

> Unfortunately there are some programs like Microsoft's notepad and its
> clones which show an empty line at the end of a correct text file
> which is not there. I call this a bug. However, this is such a common
> bug, that many people got used to it and mistake this bug with a feature.

If a newline signals that the next line should be displayed with the
following text, how is notepad's behavior incorrect? That is, if a
newline is followed by nothing--the empty string--it is perfectly
reasonable that an empty line (a line containing an empty string) is
displayed.

If notepad is incorrect, then so is its "clone" emacs.
(so that the conversation isn't wrongly swayed: I prefer vim).

Such behavior gives a visual cue to the existence of that new line
character, and provides a means for the cursor to enter that empty
line. **Are you surprised by this behavior when such an empty line
appears in the middle of a text file?**

Take the case of vim again: The last displayed line is given a newline
automatically, so why is it that special keys must be pressed
explicitly to enter the next line? Why can't one just navigate to that
empty line with the movement keys? It seems to me that vim treats the
last line as a special case, and unnecessarily special treatment always
yields such nasty workarounds as the script referenced in my original
post.

Charles E Campbell Jr

Feb 5, 2008, 11:20:57 AM
to vim...@googlegroups.com
ling...@gmail.com wrote:

>On 5 Feb 2008, at 1:19 AM, Georg Dahn wrote:
>
>>> I think that it's silly that vim sticks trailing
>>> newlines at the end of files and then proceeds to
>>> hide this fact from the user.
>>
>>Obviously every line ends somewhere and where it ends there is the end
>>of line (EOL). There is nothing hidden here. ... text files lines need
>>EOLs to be complete. Vim just produces complete lines only. I prefer
>>to call this the correct behavior instead of a functionality.
>
>While your logic is appealing, I think it is confused by bad naming and
>no experience with the technical details of parsers.

I think you're assuming things that you have no right to assume -- ie.
that Georg Dahn has or doesn't have experience with technical details of
parsers. Personally, I think that's irrelevant to the issue of
end-of-line markers. And I have no idea whether or not Georg Dahn has
the mentioned experience.

>EOL is basically \n, which is the 'newline'. It is called the 'newline',
>because it signals that a new line should begin after it. For this
>reason, EOL is a misnomer, and it is more appropriately called a 'line
>separator'.

That's the notepad (mis)interpretation. This is a "standards" issue
and, as such, is somewhat arbitrary. Vim is following vi's behavior,
which is appropriate for Vim, which is that EOL means end-of-line, not
line-separator.

>Moreover, the newline is just a character--extra byte(s)--introduced
>by, say, the 'enter' (or 'return') key. If a user does not press this
>key at the end of the last line, is it necessary to introduce this
>character?
>
>In this sense, there is no reason for the last line to end with a
>newline, except for uniformity, which is helpful with algorithmic
>processing of text--parsing--but which is sometimes unexpected.
>
>For my case, I'm working with a parser that has a bug with respect to
>the last line ending with EOF; the flaw lies with Flex, in my opinion,
>but it can be hidden by the introduction of this extraneous newline.

Flex is assuming that EOL means end-of-line, and apparently expects its
files to adhere to that standard.

>>Unfortunately there are some programs like Microsoft's notepad and its
>>clones which show an empty line at the end of a correct text file
>>which is not there. I call this a bug. However, this is such a common
>>bug, that many people got used to it and mistake this bug with a feature.
>
>If a newline signals that the next line should be displayed with the
>following text, how is notepad's behavior incorrect? That is, if a
>newline is followed by nothing--the empty string--it is perfectly
>reasonable that an empty-line (a line containing an empty string) is
>displayed.

This means that "nothing" is a line delimiter, so now there's an EOL
character/sequence (tnx to Windows for the "sequence", BTW :( ), and
EOF (end-of-file), and now an "empty string" (null byte)? Anyway, the
"empty line displayed" is reasonable only if you assume that EOLs are
line separators. Assuming that EOL means end-of-line, then it would be
reasonable to assume that no empty trailing line is displayed. Again,
this is a standards issue (what does EOL mean?).

>If notepad is incorrect, then so is its "clone" emacs.
>(so that the conversation isn't wrongly swayed: I prefer vim).
>
>Such behavior gives a visual cue to the existence of that new line
>character, and provides a means for the cursor to enter that empty
>line. **Are you surprised by this behavior when such an empty line
>appears in the middle of a text file?**
>
>Take the case of vim again: The last displayed line is given a newline
>automatically, so why is it that special keys must be pressed
>explicitly to enter the next line? Why can't one just navigate to that
>empty line with the movement keys? It seems to me that vim treats the
>last line as a special case, and unnecessarily special treatment always
>yields such nasty workarounds as the script referenced in my original
>post.

Mostly your conclusions remain predicated on the assumption that EOL
means Line Separator. Assuming that lines should end with EOLs means
that notepad is treating the last line as a special case, not Vim.

Regards,
Chip Campbell

Matthew Winn

Feb 5, 2008, 11:56:31 AM
to v...@vim.org
On Tue, 5 Feb 2008 09:19:46 -0500, ling...@gmail.com wrote:

> On 5 Feb 2008, at 1:19 AM, Georg Dahn wrote:
>
> > Obviously every line ends somewhere and where it ends there is the end
> > of line (EOL). There is nothing hidden here. ... text files lines need
> > EOLs to be complete. Vim just produces complete lines only. I prefer
> > to call this the correct behavior instead of a functionality.
>
> While your logic is appealing, I think it is confused by bad naming and
> no experience with the technical details of parsers.
>
> EOL is basically \n, which is the 'newline'. It is called the 'newline',
> because it signals that a new line should begin after it. For this
> reason, EOL is a misnomer, and it is more appropriately called a 'line
> separator'.

You're confusing two subtly different concepts.

(Considering Unix alone for the moment; I'll get on to other systems
in a moment.)

The character 0x0A is called "line feed". When sent to a device that
understands it, that character causes a linefeed. It moves the cursor,
the print head, or whatever the device may have to the same position
on the next line. This is entirely a matter of device control.

When the character 0x0A appears in a file it has a fundamentally
different meaning. It isn't a device control command because it isn't
being sent to a device. When 0x0A appears in a Unix text file its
meaning is "end of line". The character's meaning in a device control
context doesn't affect its meaning in a text file context. The two
meanings are entirely different.

It's even more obvious when you consider a character like 0x04. To a
physical device that means end of transmission. In a WordStar document
file it means toggle double strike. The device control meaning of the
character is irrelevant to the meaning when the character is in a file
because a file is not a device.

The meaning of "\n" is something else entirely. It's a line feed _and_
a line terminator, depending on context. On Unix it is translated to
the line feed character, but it's called "new line" because when
printed to a terminal device in cooked mode it's mapped into the pair
of characters 0x0D 0x0A, which moves the cursor to the start of a new
line. When printed to a file the name sticks, even though the function
in this case is "line terminator", not "new line". Strictly speaking,
files don't even have the concept of lines. Lines are a result of
software abiding by the convention that a line is a sequence of zero
or more non-linefeed characters followed by a line feed.
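
To make the convention just described concrete, here is a minimal Python sketch. It is not code from Vim or Notepad, and the function names are made up for this illustration. It reads the same text under the "terminator" convention and under the "separator" convention; the two differ only in how they treat whatever follows the last line feed.

```python
def lines_terminated(data):
    """Terminator reading: a line is zero or more non-LF characters
    followed by LF. Returns (complete_lines, leftover), where leftover
    is any text after the last LF that has not been terminated."""
    parts = data.split("\n")
    return parts[:-1], parts[-1]


def lines_separated(data):
    """Separator reading: LF sits between lines, so whatever follows
    the last LF (even nothing) counts as one more line."""
    return data.split("\n")


text = "one\ntwo\n"
print(lines_terminated(text))  # (['one', 'two'], '')
print(lines_separated(text))   # ['one', 'two', '']
```

Under the terminator reading the file holds two lines and nothing is left over; under the separator reading the same bytes hold three lines, the last one empty, which is the extra line that separator-style editors display.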

Note that line feed means "move to the same position in the next
line". Even when sent to a device it has a different meaning from the
concept of "new line" that you want to give it.

When you start getting into other systems there are even more problems
because they use different conventions for terminating lines. DOS uses
0x0D 0x0A by default, while Macs use 0x0D. 0x0D is carriage return.
How can you possibly claim that 0x0D means "new line"? Files don't
even have a carriage, so how can the device control function of the
character possibly have meaning in the context of a file?

The term "new line" is ambiguous and inaccurate. Best to forget it.

> Moreover, the newline is just a character--extra byte(s)--introduced
> by, say, the 'enter' (or 'return') key. If a user does not press this key at
> the end of the last line, is it necessary to introduce this character?

Again, you're muddling separate concepts. The carriage return key is
used to move to the next line but it's not actually inserting any sort
of "new line character". It's a command to the editor. When the file
is written the appropriate line endings are supplied.

In Vim you can type "o <Esc>" over and over again and get many lines.
You haven't pressed carriage return for any of them. By your logic
none of those lines should have a "new line" at the end of them.

> If a newline signals that the next line should be displayed with the
> following text, how is notepad's behavior incorrect?

Notepad's behaviour is incorrect because it treats the line terminator
as a separator. I can understand why it was done this way: the easy
way to split a small DOS file into lines is to load the entire file
into memory and then run through replacing the carriage returns with
nulls and setting the start-of-line pointers to the first non-linefeed
after each carriage return. The problem is that this approach doesn't
scale: it falls apart completely if the file doesn't fit into memory,
and once you start reading the file in sections you have to start
dealing with the possibility that one of the lines is unlike all the
others. Far better to use a line terminator than a line separator.

> That is, if a newline is followed
> by nothing--the empty string--it is perfectly reasonable that an empty-
> line (a line containing an empty string) is displayed.

Another way to look at it is that Notepad has every line followed by
an invisible newline character apart from the last line, which for
some reason is treated differently from every other line in the file.

--
Matthew Winn

ling...@gmail.com

Feb 5, 2008, 2:00:04 PM
to vim...@googlegroups.com, v...@vim.org, Charles E Campbell Jr
On 5 Feb 2008, at 11:20 AM, Charles E Campbell Jr wrote:

> ling...@gmail.com wrote:
>
>> On 5 Feb 2008, at 1:19 AM, Georg Dahn wrote:
>>
>>>> I think that it's silly that vim sticks trailing
>>>> newlines at the end of files and then proceeds to
>>>> hide this fact from the user.
>>>>
>>> Obviously every line ends somewhere and where it ends there is the end
>>> of line (EOL). There is nothing hidden here. ... text files lines need
>>> EOLs to be complete. Vim just produces complete lines only. I prefer
>>> to call this the correct behavior instead of a functionality.
>>
>> While your logic is appealing, I think it is confused by bad naming and
>> no experience with the technical details of parsers.
>>
>>
> I think you're assuming things that you have no right to assume -- ie.
> that Georg Dahn has or doesn't have experience with technical details of
> parsers. Personally, I think that's irrelevant to the issue of
> end-of-line markers. And I have no idea whether or not Georg Dahn has
> the mentioned experience.

It wasn't an assumption. It was a deduction. Also, parsing--including
scanning--has everything to do with text files; that's all one ever
does with text.

>> EOL is basically \n, which is the 'newline'. It is called the 'newline',
>> because it signals that a new line should begin after it. For this
>> reason, EOL is a misnomer, and it is more appropriately called a 'line
>> separator'.
>
> That's the notepad (mis)interpretation. This is a "standards" issue
> and, as such, is somewhat arbitrary.

Standards are indeed arbitrary, but so is Mathematics; while the axioms
are arbitrary, the choice is important:

ASCII defines two independent and orthogonal movements of the
print head: Carriage Return (CR) and Line Feed (LF). (IBM's EBCDIC
did not make this mistake; it defined a single New Line (NL)
character.)

(http://www.rfc-editor.org/EOLstory.txt)

Besides, you neglect to comment on how emacs makes the same
(mis)interpretation.

> Vim is following vi's behavior, which is appropriate for Vim, which
> is that EOL means end-of-line, not line-separator.

Fair enough.

>> Moreover, the newline is just a character--extra byte(s)--introduced
>> by, say, the 'enter' (or 'return') key. If a user does not press this
>> key at the end of the last line, is it necessary to introduce this
>> character?
>>
>> In this sense, there is no reason for the last line to end with a
>> newline, except for uniformity, which is helpful with algorithmic
>> processing of text--parsing[/scanning]--but which is sometimes
>> unexpected.
>>
>> For my case, I'm working with a parser that has a bug with respect to
>> the last line ending with EOF; the flaw lies with Flex, in my opinion,
>> but it can be hidden by the introduction of this extraneous newline.
>
> Flex is assuming that EOL means end-of-line, and apparently expects its
> files to adhere to that standard.

Read more carefully. I was discussing EOF and a flaw (in the control
logic) of Flex that is hidden by the fact that vim inserts an EOL of
its own accord.

>>> Unfortunately there are some programs like Microsoft's notepad and its
>>> clones which show an empty line at the end of a correct text file
>>> which is not there. I call this a bug. However, this is such a common
>>> bug, that many people got used to it and mistake this bug with a feature.
>>
>> If a newline signals that the next line should be displayed with the
>> following text, how is notepad's behavior incorrect? That is, if a
>> newline is followed by nothing--the empty string--it is perfectly
>> reasonable that an empty-line (a line containing an empty string) is
>> displayed.
>>
> This means that "nothing" is a line delimiter, so now there's an EOL
> character/sequence (tnx to Windows for the "sequence", BTW :( ), and
> EOF (end-of-file), and now an "empty string" (null byte)?

No, no, no, and No.

I deduce that you're not a programmer.

> and EOF (end-of-file)


This is a signal, not a delimiter.

> and now an "empty string" (null byte)?

No, the null byte is a character. An empty string is literally
nothing, as described in the following.

> This means that "nothing" is a line delimiter

I'm not quite sure what you're trying to say; perhaps you're speaking
of a "vimy" interpretation of the situation. Let's take EOL to be '\n'.
Then, consider:

some text\nsome more text\n\nsome text after a blank line.

That example contains what can be interpreted as 4 lines. The "\n\n"
shows us that there is an EMPTY STRING, usually written "", and this
empty string denotes a blank line.

If I save this in an editor that doesn't append '\n' of its own accord,
the text file will contain 55 characters. If I open this text file in
vim and then save it, it now has 56 characters:

some text\nsome more text\n\nsome text after a blank line.\n

I didn't even make a change. It didn't even ask.
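
The character counts above can be checked with a short Python sketch. It only models the behavior described in this post (a single '\n' appended on write); it does not invoke Vim, and the variable names are chosen for this illustration.

```python
# The example text as saved by an editor that appends nothing.
original = "some text\nsome more text\n\nsome text after a blank line."

# Assumption: the rewrite described in the post adds exactly one '\n'.
rewritten = original + "\n"

print(len(original))              # 55
print(len(rewritten))             # 56
print(len(original.split("\n")))  # 4 lines under the separator reading
```

The one-character difference is the final '\n', and splitting the original on '\n' yields the 4 lines claimed above, the third of which is the empty string.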

> (tnx to Windows for the "sequence", BTW :( )

It would seem that Windows was more or less following a kind of
standard:

During the early ARPAnet research days (~1970-1972), this end-of-line
diversity among operating systems made network communication between
diverse host systems difficult. After some discussion (recorded in
early RFCs), the researchers adopted a single convention:

ASCII text transmitted across the network *must* use the
two-character sequence: CR LF.

This choice was designed to spread the pain equally among all
operating systems of the day; each has to translate to and from the CR
LF convention when text was transferred across the network.

(http://www.rfc-editor.org/EOLstory.txt)

> Anyway, the "empty line displayed" is reasonable only if you assume
> that EOLs are line separators. Assuming that EOL means end-of-line,
> then it would be reasonable to assume that no empty trailing line is
> displayed. Again, this is a standards issue (what does EOL mean?).

I've already stated that EOL is a misnomer.

Also, you clearly did not understand my example (below) of an empty
line in the middle of text.

>> If notepad is incorrect, then so is its "clone" emacs.
>> (so that the conversation isn't wrongly swayed: I prefer vim).

Wow. You totally skipped that point.

>> Such behavior gives a visual cue to the existence of that new line
>> character, and provides a means for the cursor to enter that empty
>> line. **Are you surprised by this behavior when such an empty line
>> appears in the middle of a text file?**

Ah. You just skipped it.

>>
>> Take the case of vim again: The last displayed line is given a newline
>> automatically, so why is it that special keys must be pressed
>> explicitly to enter the next line? Why can't one just navigate to that
>> empty line with the movement keys? It seems to me that vim treats the
>> last line as a special case, and unnecessarily special treatment always
>> yields such nasty workarounds as the script referenced in my original
>> post.
>
> Mostly your conclusions remain predicated on the assumption that EOL
> means Line Separator. Assuming that lines should end with EOLs means
> that notepad is treating the last line as a special case, not Vim.

The user specifies with some key (sequence) when a line separator
should be introduced. This is the simplest, most general, non-specific
approach. The fact that vim does things behind your back (like editing
your own text file autonomously) is clear indication that vim has
chosen the weaker model; the workarounds necessary scream in my favor.

Mostly my conclusions are that vim chose the wrong standard.

ling...@gmail.com

Feb 5, 2008, 2:00:14 PM
to vim...@googlegroups.com, v...@vim.org, Matthew Winn
On 5 Feb 2008, at 11:56 AM, Matthew Winn wrote:

>> On Tue, 5 Feb 2008 09:19:46 -0500, ling...@gmail.com wrote:
>>
>> That is, if a newline is followed by nothing--the empty string--it is
>> perfectly reasonable that an empty-line (a line containing an empty
>> string) is displayed.
>
> Another way to look at it is that Notepad has every line followed by
> an invisible newline character apart from the last line, which for
> some reason is treated differently from every other line in the file.

This illustrates the limitation of your model.

Your model says that my model produces incomplete files.
My model says that your model adds an empty line, which is
still complete, but unnecessary.

The fact that your model can't explain my model and that my model can
explain your model suggests that my model is superior.

>> While your logic is appealing, I think it is confused by bad naming and
>> no experience with the technical details of parsers.
>>
>> EOL is basically \n, which is the 'newline'. It is called the 'newline',
>> because it signals that a new line should begin after it. For this
>> reason, EOL is a misnomer, and it is more appropriately called a 'line
>> separator'.
>
> You're confusing two subtly different concepts.
>
> (Considering Unix alone for the moment; I'll get on to other systems
> in a moment.)
>
> The character 0x0A is called "line feed". When sent to a device that
> understands it, that character causes a linefeed. It moves the cursor,
> the print head, or whatever the device may have to the same position
> on the next line. This is entirely a matter of device control.
>
> When the character 0x0A appears in a file it has a fundamentally
> different meaning. It isn't a device control command because it isn't
> being sent to a device. When 0x0A appears in a Unix text file its
> meaning is "end of line". The character's meaning in a device control
> context doesn't affect its meaning in a text file context. The two
> meanings are entirely different.

I think you fail to see the beauty in a unified model. The parser/scanner
is a device, whether it is composed of physical components or software
commands.

Your model of separate contexts is trash.

> It's even more obvious when you consider a character like 0x04. To a
> physical device that means end of transmission. In a WordStar document
> file it means toggle double strike. The device control meaning of the
> character is irrelevant to the meaning when the character is in a file
> because a file is not a device.

I think everyone realizes that everything is meaningless without meaning
(interpretation).

This is a terrible analogy.

> The meaning of "\n" is something else entirely. It's a line feed _and_
> a line terminator, depending on context. On Unix it is translated to
> the line feed character, but it's called "new line" because when
> printed to a terminal device in cooked mode it's mapped into the pair
> of characters 0x0D 0x0A, which moves the cursor to the start of a new
> line.

What's your point? It's just translating the word from Chinese to Japanese.

> When printed to a file the name sticks, even though the function
> in this case is "line terminator", not "new line".

You've only shown that your model uses different names in different
contexts; incidentally, this contextual model of yours illustrates the
importance of naming concepts. The name "line terminator" is obviously
less general and therefore carries more baggage than "line separator".
You've made an interpretation--based on the name--that is inferior.

> Strictly speaking, files don't even have the concept of lines. Lines
> are a result of software abiding by the convention that a line is a
> sequence of zero or more non-linefeed characters followed by a line feed.

Indeed! (almost)

Text is nothing more than a string of characters.

Therefore, why unnecessarily require a '\n' (at the end of the last line)?
Your model requires unnecessary information.

> Note that line feed means "move to the same position in the next
> line". Even when sent to a device it has a different meaning from the
> concept of "new line" that you want to give it.
>
> When you start getting into other systems there are even more problems
> because they use different conventions for terminating lines. DOS uses
> 0x0D 0x0A by default, while Macs use 0x0D. 0x0D is carriage return.
> How can you possibly claim that 0x0D means "new line"? Files don't
> even have a carriage, so how can the device control function of the
> character possibly have meaning in the context of a file?

This is a historical matter of printing machinery that has no bearing
on the concept of EOL. You are caught up in numbers rather than
concepts.

Indeed you are correct that systems record(ed) \r\n in text files to
signify those two device motions. However, it is quite clear that such
device-specific commands have no business mingling with the data.

When people realized the importance of separating concerns,
they started using just one character. '\n' was basically the
last character of a line of text, so a lot chose that one.

Your comments are more in my favor!

> The term "new line" is ambiguous and inaccurate.

As is EOL and "line terminator". If anything, "line separator" is the
best name for the best concept.

The line separator adds information only where it is necessary. The line
terminator adds information when it is not necessary (at EOF).

>> Moreover, the newline is just a character--extra byte(s)--introduced
>> by, say, the 'enter' (or 'return') key. If a user does not press this
>> key at the end of the last line, is it necessary to introduce this
>> character?
>
> Again, you're muddling separate concepts. The carriage return key is
> used to move to the next line but it's not actually inserting any sort
> of "new line character". It's a command to the editor. When the file
> is written the appropriate line endings are supplied.

You are muddling concerns.

> In Vim you can type "o <Esc>" over and over again and get many lines.
> You haven't pressed carriage return for any of them. By your logic
> none of those lines should have a "new line" at the end of them.

You couldn't have exposed your lack of conceptual basis any more.

>> If a newline signals that the next line should be displayed with the
>> following text, how is notepad's behavior incorrect?
>
> Notepad's behaviour is incorrect because it treats the line terminator
> as a separator. I can understand why it was done this way: the easy
> way to split a small DOS file into lines is to load the entire file
> into memory and then run through replacing the carriage returns with
> nulls and setting the start-of-line pointers to the first non-linefeed
> after each carriage return. The problem is that this approach doesn't
> scale: it falls apart completely if the file doesn't fit into memory,

You're still confusing concept and implementation.

I'm not arguing for any particular sequence of line separator.

> and once you start reading the file in sections you have to start
> dealing with the possibility that one of the lines is unlike all the
> others. Far better to use a line terminator than a line separator.

That was an initial point of mine: your model pushes trouble to user
space rather than leaving it in the implementation space where it belongs.

Matt Wozniski

Feb 5, 2008, 2:31:36 PM
to vim...@googlegroups.com

Heaven help me, but I can't stop myself from fanning the flames...

Contents of notepad1.txt: "a"
Contents of vim1.txt: "a\n"
Contents of notepad2.txt: "b"
Contents of vim2.txt: "b\n"

cat notepad1.txt notepad2.txt >notepad.txt
cat vim1.txt vim2.txt >vim.txt

Contents of notepad.txt: "ab"
Contents of vim.txt: "a\nb\n"

So, the vim method of seeing the files means that concatenating two
files (correctly) leaves the last line of the 1st and the first line
of the 2nd as separate lines, and the notepad way of seeing the files
incorrectly combines two lines. Why? Because the last line of the
1st file was never ended with an EOL. Personally, I much, much prefer
the vim way of handling it, if only because using a command like "cat"
to display the file to a terminal (correctly, IMHO) will result in the
prompt being drawn at the beginning of a new line, rather than
immediately following the text in the last line in the file (since it
was never ended).
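
The same experiment can be sketched in Python without touching the filesystem. This is only a model of what cat does (byte-for-byte concatenation), with the file contents taken from the example above; the helper name is made up for this illustration.

```python
def cat(*contents):
    """Concatenate byte strings in order, adding nothing between them."""
    return b"".join(contents)


notepad = cat(b"a", b"b")    # files saved without a final EOL
vim = cat(b"a\n", b"b\n")    # files saved with a final EOL

print(notepad)               # b'ab'
print(vim)                   # b'a\nb\n'
print(vim.count(b"\n"))      # 2
```

With terminated files the line count of the result is the sum of the line counts of the inputs; with unterminated files the last line of one input fuses with the first line of the next.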

~Matt

Tim Chase

Feb 5, 2008, 2:35:03 PM
to vim...@googlegroups.com
> interpretation of the situation. Let's take EOL to be
> '\n'. Then, consider:
>
> some text\nsome more text\n\nsome text after a blank line.
>
> That example contains what can be interpreted as 4 lines.

This example also contains what can be interpreted as three
lines and a partial line with more data to come.

When you combine two binary files:

file1 = "one\ntwo\nthree"
file2 = "four\nfive\nsix"

the results should be

file1+file2 = "one\ntwo\nthreefour\nfive\nsix"

If your process treats file1 as having an implicit "a line
can end at the end of a file, even if it's not explicitly
put in there", then you end up with results like

file1+file2 = "one\ntwo\nthree\nfour\nfive\nsix"

where an extra newline has appeared out of nowhere, and you
also need to write additional code to handle this behavior.

> Your model says that my model produces incomplete files.
> My model says that your model adds an empty line, which
> is still complete, but unnecessary.
>
> The fact that your model can't explain my model and that
> my model can explain your model suggests that my model is
> superior.

Your model can't accommodate partial lines at the end of a
file. The "a line always ends with an EOL" model can.
Which suggests that your model is inferior :)

> Your model of separate contexts is trash.

so, demonstrably, are your social graces. Vim provides a
means to treat the file as binary (where it is possible for
the last "line" in the file to not be terminated, and for more
data to follow) but does the "correct" thing by default
(a text-editor treating the file as text) yet you reject its
services. If you care, just set up an autocmd to treat all
your files as binary and stop kvetching.

> I've already stated that EOL is a misnomer.

EOL is not a misnomer. EOL means that the end of the line
has been reached. If you haven't reached an EOL character,
then it's possible that there is more data to come.


-tim


Tony Mechelynck

Feb 5, 2008, 3:13:11 PM
to vim...@googlegroups.com, v...@vim.org, Charles E Campbell Jr

Which standard is "right" and which one is "wrong" will be resolved
differently depending on which set of beliefs you're starting from. The whole
thread so far sounds somewhat "theological" to me. So depending on which
dogmas you accept, you'll come to different conclusions. In particular, if you
accept Notepad's behaviour as "the model" you'll find that "Vim adds a
spurious empty line at the end of all files" while if you accept Vim's
behaviour unquestioningly you'll find that "Notepad brokenly fails to properly
terminate the last line of all files".

Saying that "a line separator should only be introduced when the user
specifies one" already assumes that the EOL is a "separator", not a
"terminator". Adherents of the opposite side will say that whenever the user
creates a line, that line should be properly terminated, regardless of whether
the new line is created by typing text into a zero-length file, by splitting a
line in the middle, by adding an empty line before or after any existing line,
by pasting a linewise selection, whatever. (They will also say that a
zero-length file has zero lines, not "one empty line".)

Of course, whichever of these behaviours I decide to prefer, the adherents of
the opposite doctrine will declare me anathema. Let's, however, try to find a
reason to prefer that a file, when stored on disk and not currently being
modified, should or shouldn't have an end-of-line at the very end (not
counting the EOL at the end of the penultimate line, even if the last line is
empty).

The one fact that seems relevant to me is that when you concatenate two or
more text files (using e.g. "cat file1 file2 file3 > file_n" in Unix, or COPY
FILE1.TXT+FILE2.TXT+FILE3.TXT FILE_N.TXT in Dos), if the original files' last
lines end with anything but an end-of-line marker, the last (unterminated)
line of each file will be run in with the first line of the next one. Such
behaviour is of course undesirable. Similarly, displaying a file on the
terminal (with "cat foobar" or "TYPE FOOBAR.TXT") will tend to concatenate the
last line with the command-line prompt, another undesirable (or at least very
ugly) behaviour. Therefore (IMHO) Notepad chose the wrong standard.

I hear you coming: you'll say that when concatenating files, the concatenating
program should add an EOL marker between the last line of each file and the
first line of the next, and that when displaying text on the console, the
displaying program (at end-of-job) or the command shell (before displaying the
prompt) should send a linebreak command to the display... but weren't you the
one who said that for the sake of simplicity, the program should never take
"spontaneous" action behind our backs? In the latter case (before the prompt),
it isn't at all unbelievable to me that both the program and the command shell
will end up "moving the carriage" on the display, with the result that there
will be a spurious empty line before every prompt except the first one after a
bootstrap or a clear-screen command.


Best regards,
Tony.
--
Don't suspect your friends -- turn them in!
-- "Brazil"

Ben Schmidt

Feb 5, 2008, 10:19:04 PM
to vim...@googlegroups.com
> Heaven help me, but I can't stop myself from fanning the flames...

Mmm. And I think that is all it is going to achieve, too.

Just as the OP accuses his opponents of doing, he neglects to comment on the
positive points raised by his opposition on numerous occasions. Nor has he replied
to my post that contained suggestions for a way forward that might work for both
parties. He is evidently not interested in a rational discussion, or in making Vim
better, but simply flaming.

Grumps about flex also belong on the flex mailing list, or such, not here. Though,
to be honest, my suspicion is (without knowing any details, though) that it's not
flex that is at fault, but the flex script (or the parser making use of flex). If
one asks politely on the flex mailing list, the behaviour may be explicable, or
suggestions may be forthcoming for how to modify the script so it works more as
intended.

DOS/Windows has traditionally made use of an EOF marker: ^Z (0x1A), though this is
falling out of use from what I can gather. It was, as well as used in consoles to
signal EOF, actually written in files, too (and indeed, sometimes if one was
encountered in a file, the remainder of the file would be ignored, even if
existent). This also makes a mess of things if you simply concatenate them, as
does a string of characters not terminated with CR/LF at the end of a file.

And indeed, there's another use of text files right there, that doesn't involve
scanning and parsing--concatenation. I do many things with text files apart from
scan and parse them. I often store prose in them, like emails, for instance. And
emails often get concatenated into mbox files, too. So I like my files to end with
CR/LF, and believe that this model has been proven more useful, though less
flexible--which you consider 'better' is your choice, of course. But I also like
the idea that doing :e :w wouldn't change the file, i.e. wouldn't add a CR/LF
where there wasn't one before, so I like the idea of making 'eol' effective
without 'binary'. I think it would also make sense when virtualedit is set to make
the x command join the lines if the cursor is at that point at the end (on the
'line separator' if that model is being adopted by the user) and perhaps reset
'eol' if done on the last line of the file. Vim could then work quite well with
this model if desired, even though this would be different to Vim tradition.

Perhaps these ideas give opportunity for further discussion towards the
improvement of Vim. As I have no desire just to fan the flames though, this will
be the last I will say unless the conversation reaches some level of maturity that
rises above 'my way is better than your way'.

Ben Schmidt

Feb 5, 2008, 10:30:03 PM
to vim...@googlegroups.com
>>> I think you're assuming things that you have no right to assume -- ie.
>>> that Georg Dahn has or doesn't have experience with technical
>>> details of
>>> parsers. Personally, I think that's irrelevant to the issue of
>>> end-of-line markers.

Yes. I think this discussion also needs to gain a maturity level beyond 'you are
obviously not a programmer/parser writer/scanner writer/software person/hardware
person/historian/intelligent member of the human race'. A bit of humility,
open-mindedness, and cooperation has the potential to go a long way in a
discussion like this (and in any discussion, in my view). So far we have seen too
little of it, and I think all would have to agree that the discussion has not yet
achieved anything positive. You can hold your own views on why that is, but I
suspect the lack of politeness and respect shown by a few could be a big
contributing factor.

I forgot to put that in my last post. My comments about not writing further do
stand; they just got a bit interfered with by some faulty memory!

Grins,

DervishD

Feb 6, 2008, 2:10:13 AM
to vim...@googlegroups.com, v...@vim.org, Matthew Winn
Hi all, and don't feed the trolls...

Raúl Núñez de Arenas Coronado
--
Linux Registered User 88736 | http://www.dervishd.net
It's my PC and I'll cry if I want to... RAmen!
We are waiting for 13 Feb 2009 23:31:30 +0000 ...

Matthew Winn

Feb 6, 2008, 4:24:13 AM
to v...@vim.org
On Wed, 06 Feb 2008 14:19:04 +1100, Ben Schmidt
<mail_ben...@yahoo.com.au> wrote:

> > Heaven help me, but I can't stop myself from fanning the flames...
>
> Mmm. And I think that is all it is going to achieve, too.

We know the OP's a troll, but he's a funny troll. I wanted to respond
to his reply to my first post but I kept laughing too much. It's like
the world's funniest joke: his comments can be dealt with only one
word at a time.

> DOS/Windows has traditionally made use of an EOF marker: ^Z (0x1A), though this is
> falling out of use from what I can gather. It was, as well as used in consoles to
> signal EOF, actually written in files, too (and indeed, sometimes if one was
> encountered in a file, the remainder of the file would be ignored, even if
> existent). This also makes a mess of things if you simply concatenate them, as
> does a string of characters not terminated with CR/LF at the end of a file.

^Z was just a compatibility thing hanging over from CP/M. CP/M's file
system didn't have any concept of file size: all you had was a set of
blocks, so all files had a length that was a multiple of the block
size. That wasn't a problem for binary files but for text there needed
to be some way of indicating that the last block finished early, so ^Z
was used.

I don't know why DOS and Windows stuck with the same idea for so long.
Utilities such as "type" continued to acknowledge ^Z long after the
availability of a true file size made the presence of an end of file
marker unnecessary and undesirable. Perhaps it was to make it easier
to read CP/M files on the newer OS. It certainly led to all manner of
weird special cases, with software trying to work out whether a file
was text or binary and discarding ^Z if it was the final character in
the file. Nasty stuff.
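A rough Python sketch of the scheme described above, for anyone who never met it. The 128-byte record size is an assumption added for illustration (the description above only speaks of a "block size"), and pad_to_records() and read_text() are hypothetical helpers, not any real utility's code.

```python
RECORD_SIZE = 128    # assumption for illustration; not stated in the post above
CTRL_Z = b"\x1a"     # the ^Z end-of-text marker


def pad_to_records(text: bytes, record_size: int = RECORD_SIZE) -> bytes:
    """Pad text with ^Z so the stored length is a whole number of records."""
    remainder = len(text) % record_size
    if remainder == 0:
        return text
    return text + CTRL_Z * (record_size - remainder)


def read_text(stored: bytes) -> bytes:
    """Return the text up to the first ^Z, ignoring whatever follows it."""
    end = stored.find(CTRL_Z)
    return stored if end == -1 else stored[:end]


first = pad_to_records(b"hello\r\n")
second = pad_to_records(b"world\r\n")

print(len(first))                  # a whole number of records
print(read_text(first))            # the original text comes back
print(read_text(first + second))   # naive concatenation hides the second file
```

The last line shows the concatenation problem mentioned earlier in the thread: a reader that stops at the first ^Z never sees the second file.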

--
Matthew Winn

Malmberg Emil (Consultant)

Feb 6, 2008, 4:42:51 AM
to vim...@googlegroups.com
I'm not going to comment on the discussion as such, and this post will
be wildly off-topic and have nothing to do with Vim, really. Apologies
in advance.
I finished reading Carl Sagan's admittedly slightly aged book "Dragons
of Eden - speculations on the evolution of human intelligence" not long
ago - in it, evolution and the many different possible pathways it may
take is a key matter. One of the pathways ends up where we are now - man
being the most intelligent of all animals (or, at least, having the
highest brain weight/body weight ratio). It is clear that this is due to
(for us, anyway) lucky circumstances - many other pathways would have
led to us being way further down in the food chain, if not already
extinct. When reading the posts on the subject discussed here, I can't
help but wonder: if we were to reset time to some time before the first
concept of files (or computers, even) was thought out, and let "computer
evolution" have another go at it, would we still end up where we are
today? Or is there conceivably another way of doing it, altogether
different in its approach? I remember having the same thoughts as a
young kid when I first started learning about computers - why is
everything organised in files and directories? Is it just the result of
some decision taken early in the evolution process that's stuck with us,
is it the result of a fairly straightforward mapping from the
pre-computerised world of printed books, or is it simply because there's
no other reasonable way of doing it?

I know this is not alt.philosophics, but something in your discussion
triggered this old question in me. If there's someone on this list with
greater insight into how these concepts came about back in the day, I
would be happy to read your thoughts on the matter.

Cheers,
Emil

Meier, Gerd (Gerd)

Feb 6, 2008, 4:46:16 AM
to vim...@googlegroups.com
unsubscribe

Julio Garvía

Feb 6, 2008, 7:34:58 AM
to vim...@googlegroups.com
unsubscribe
 
Julio





Georg Dahn

Feb 6, 2008, 7:44:10 AM
to vim...@googlegroups.com
Hi!

You can unsubscribe by either sending a mail to

vim-uns...@vim.org

or by visiting

http://groups.google.com/group/vim_use

Best regards,
Georg Dahn


2008/2/6, Julio Garvía <jgarvia...@yahoo.es>:

ling...@gmail.com

Feb 6, 2008, 8:33:55 AM
to vim...@googlegroups.com
On 5 Feb 2008, at 10:19 PM, Ben Schmidt wrote:

> Nor has he replied to my post that contained suggestions for a way
> forward that might work for both parties. He is evidently not
> interested in a rational discussion, or in making Vim better, but
> simply flaming.

I believe that your first post is the obvious solution, so I let it
stand for itself and turned my attention to the ensuing conceptual
discussion in which I believe I commented quite rationally.

Furthermore, of course I'm concerned with improving Vim. I've
obviously taken the time to get to know the program, bothered to
subscribe to the list, and continued a time-consuming argument.

What you call flaming I call conviction, and perhaps it will have an
impact on how others think about not only this issue but the
ramifications of their future design choices. As for the choices I
support, I feel like I've argued with rationality and balance, albeit
with a few personal jabs where emotion got the better of me.

Matthew Winn

Feb 6, 2008, 9:53:27 AM
to v...@vim.org
On Wed, 6 Feb 2008 10:42:51 +0100, "Malmberg Emil (Consultant)"
<emil.m...@saabgroup.com> wrote:

> When reading the posts on the subject discussed here, I can't
> help but wonder: if we were to reset time to some time before the first
> concept of files (or computers, even) was thought out, and let "computer
> evolution" have another go at it, would we still end up where we are
> today? Or is there conceivably another way of doing it, altogether
> different in its approach? I remember having the same thoughts as a
> young kid when I first started learning about computers - why is
> everything organised in files and directories? Is it just the result of
> some decision taken early in the evolution process that's stuck with us,
> is it the result of a fairly straightforward mapping from the
> pre-computerised world of printed books, or is it simply because there's
> no other reasonable way of doing it?

Interesting question. Possible alternative histories of computers.

Suppose the first computers had been built in China instead of the
West. There'd have been a much stronger drive towards wide characters
right from the start. A typical Western architecture had 36-bit words
that were divided into six 6-bit characters. Would we have seen those
words divided into three 12-bit characters or two 18-bit ones? Would
the byte have been twelve bits wide?

I think the concept of directories and files would have been almost
inevitable because it's an easy way to organise things and it can be
implemented on very low-powered equipment. Some early systems such as
CP/M's idea of user numbers[1] were easy to code into the OS but don't
match the way people think. Readable words work better, just as domain
names are better for people than IP addresses. Files and directories
combine a concept that people can understand with an implementation
that is light on space and power.

I think Unix's way of working is probably the best of all, where a
file doesn't have a name. (Most people don't realise this, but it's
true. On Unix a file is a list of blocks on disc; it may have zero or
more filenames that point to it, but none of those are actually the
name of the file. Names have files; files don't have names.) Hard
links allow files to appear anywhere in the directory tree, and even
in several places at once. That makes organisation simple. There
may be other ways of arranging things, but not ways that could be
implemented well early enough to become an accepted standard.

If the Internet had started on Windows, URLs would have \s in them.
Every Unix user in the world would vomit on the keyboard every time
they browsed.

And if Bill Joy et al had been issued with terminals that had no
escape key we'd all be using emacs. Or, at the very least, had
separate cursor keys been more common in the 1970s we may not have
had the concept of an insert command in our favourite editor.

[1] On CP/M there were 16 user areas on each disc. A user number
was simply a four-bit flag on a file. User areas weren't physically
separated: when you typed "user 6" it merely meant that the only files
you could see or access were those with the flag set to 6. As a
special case you could run programs in user 0 whatever user area you
were in. It worked well enough for what it was -- a way of removing
the clutter from the screen -- but before long you invariably forgot
which area was which and ended up having to type "user 1" "dir" "user
2" "dir" "user 3" "dir" until you found the files you were looking
for.

--
Matthew Winn

Richard Hartmann

Feb 6, 2008, 10:27:27 AM
to vim...@googlegroups.com
On Feb 5, 2008 8:31 PM, Matt Wozniski <m...@drexel.edu> wrote:


> Heaven help me, but I can't stop myself from fanning the flames...

Actually, I found your explanation to be one of the most reasonable. I
will try to continue this :)
If you are not interested in lengthy discussions, skip to the end, the
beef is there.


Basically, the old EOL vs newline discussion is one of implicit
assumption to help in most cases vs robustness in all cases.

If you define 0x0A as a carriage return & newline, i.e. map its meaning
directly to what an old typewriter used to do when you pressed the large
key on the right (I am not calling it carriage return on purpose), OP is
right in most of his arguments.

If you define 0x0A as EOL, as I do, VIM is right. The fact that binary
and text are handled differently stems from the fact that they are two
different ways to represent data. In the one scheme, you have a char
with a special meaning. In the other, all 256 possible combinations are
equal.

Now, let's get back to robustness.


> cat notepad1.txt notepad2.txt >notepad.txt
> cat vim1.txt vim2.txt >vim.txt

As you can see, the one scheme is failsafe, no matter what
implementation of cat (or any other program with similar functionality)
you are using. It does not have to do any (expensive) checks. If the
sizes align correctly, you can even attach two inodes to each other and
have a valid end result.

Another example would be:

% cat foo.pl
#!/usr/bin/perl
use strict;
use warnings;

print "I am going to try this thing which is really flaky and/or takes a
long time\n";
print "The result is:\n";
print evil_function();

% ./foo.pl
I am going to try this thing which is really flaky and/or takes a
long time
The result is:

== example ends here ==

Now, please tell me if the program is still running or if it broke with
an error. You could argue that every program needs to return valid
messages, but the simple fact is that you cannot rely on that. If you have
piped STDOUT to a file and STDERR to somewhere else, the problem is even
worse.
Even worse problems ensue when your shell and/or terminal eats the last
line of a program's output if it is not terminated by an EOL.


As a final example, let me paraphrase one of the mantras of programming:
"Be strict in what you send out, but liberal in what you accept."


If I have to choose, I will always prefer a program with safe defaults.


That all being said, I agree with OP that VIM should probably have an
option to not automagically append EOL to a text file that does not
already have it. On the other hand, the TODO is long, Bram's time
limited and a viable solution via tip 1369 available.

lingwitt, would you be willing to write a patch doing this (along with
the docs), Bram, would you accept such a patch?


Best regards,
Richard

ling...@gmail.com

Feb 7, 2008, 6:53:47 PM
to vim...@googlegroups.com

On 6 Feb 2008, at 10:27 AM, Richard Hartmann wrote:

> lingwitt, would you be willing to write a patch doing this (along with
> the docs), Bram, would you accept such a patch?

This seems reasonable, but I've entered a very taxing time of the year,
and I have absolutely no familiarity with the code. Bram would probably
be able to implement the requested feature much more easily.

Why not introduce a new option rather than break the current
functionality of 'eol'?

Ben Schmidt

Feb 7, 2008, 7:32:09 PM
to vim...@googlegroups.com
>> lingwitt, would you be willing to write a patch doing this (along with
>> the docs), Bram, would you accept such a patch?
>
> This seems reasonable, but I've entered a very taxing time of the year,
> and I have absolutely no familiarity with the code. Bram would probably
> be able to implement the requested feature much more easily.

Or I can. It is not a large change. Attached.

> Why not introduce a new option rather than break the current
> functionality of 'eol'?

I personally don't think it really breaks anything, and Vim has so many options
already, I think options should only be added if they are really necessary for new
functionality/user preferences. It's also a little more work to add a new option,
though not much, I guess, so if Bram wants me to do so, I will. But we haven't
heard from Bram yet whether he's even willing to consider this change in any form.

Cheers,

Ben.


eol_without_bin.patch

Ben Schmidt

Feb 7, 2008, 9:50:55 PM
to vim...@googlegroups.com
> Or I can. It is not a large change. Attached.

Hmmm. Except there are some other issues...

I didn't make the change for when writing to shells/filters (os_unix.c). I
probably should. Though it's a bit of an odd scenario. Whether you want the final
EOL often depends more on the filter than on the file being edited, and what
happens if you use a filter that doesn't like final EOL but you're piping only
part of a file through it? But really that is another question entirely. I will
just make it omit the last EOL if and only if the last line of the buffer is
included in the data being filtered.
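To make that rule concrete, here is a minimal Python sketch of it. It is not the code in os_unix.c or in the patch; filter_input() and its parameters are hypothetical names, and it assumes the final EOL is dropped only when 'eol' is off.

```python
def filter_input(lines, first, last, eol_option, eol="\n"):
    """Build the text sent to a filter for buffer lines first..last.

    lines      -- all lines of the buffer, without their EOLs
    first/last -- 1-based, inclusive range being filtered
    eol_option -- stand-in for the 'eol' option (True: the file ends with an EOL)
    """
    selected = lines[first - 1:last]
    text = eol.join(selected)
    includes_last_line = last == len(lines)
    # Omit the final EOL only if the buffer's last line is part of the
    # range and 'eol' is off; otherwise terminate the last selected line.
    if selected and not (includes_last_line and not eol_option):
        text += eol
    return text


buffer_lines = ["a", "b", "c"]
print(repr(filter_input(buffer_lines, 1, 2, eol_option=False)))
print(repr(filter_input(buffer_lines, 2, 3, eol_option=False)))
print(repr(filter_input(buffer_lines, 2, 3, eol_option=True)))
```

A partial range that stops before the last line always gets its final EOL, whatever the option says.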

And it seems there is some old code floating around which is no longer needed, but
which has bugs in it that cause inconsistent/wrong behaviour (even in the current
unpatched version). It will take me a little more time to look into that, but
there's not much point if Bram isn't going to accept it...so...I will wait for a
word from him before spending further time on it.

ling...@gmail.com

Feb 8, 2008, 9:45:15 AM
to vim...@googlegroups.com

On 7 Feb 2008, at 9:50 PM, Ben Schmidt wrote:

> I will just make it omit the last EOL if and only if the
> last line of the buffer is included in the data being filtered.

Why not just let the option decide this? That way it's
configurable by the user.

Bram Moolenaar

Feb 8, 2008, 3:46:10 PM
to Ben Schmidt, vim...@googlegroups.com

Ben Schmidt wrote:

I'm not going to change how the 'eol' option works. It does break
things for people that expect Vim to work like it works now.

Add another option to use 'eol' even when not in binary mode? I doubt
this is useful to more than a few people. And another option is another
thing that the user must be aware of that could be wrong.

--
hundred-and-one symptoms of being an internet addict:
7. You finally do take that vacation, but only after buying a cellular modem
and a laptop.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ download, build and distribute -- http://www.A-A-P.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///
